Enterprise End-Users - MemCon

Memory Con

March 2025

Silicon Valley, CA

Why Should Enterprise End-Users Attend MemCon 2024?

We attract enterprise end-users from across the BFSI, pharma, healthcare, energy, automotive, retail, entertainment and more, as they come together to:

Understand technology offerings by viewing products available from key vendors
Liaise with other end-users and project teams to understand issues better.
Consult hyperscalers to get the leading opinions on systems architecture.

If you'd like to find out more information about attending as an AI vendors, register your interest here

CONFIRM YOUR PLACE HERE

Featured Speakers Include

Author:

Puja Das

Senior Director, Personalization

Warner Bros. Entertainment

Dr. Puja Das, leads the Personalization team at Warner Brothers Discovery (WBD) which includes offerings on Max, HBO, Discovery+ and many more.

Prior to WBD, she led a team of Applied ML researchers at Apple, who focused on building large scale recommendation systems to serve personalized content on the App Store, Arcade and Apple Books. Her areas of expertise include user modeling, content modeling, recommendation systems, multi-task learning, sequential learning and online convex optimization. She also led the Ads prediction team at Twitter (now X), where she focused on relevance modeling to improve App Ads personalization and monetization across all of Twitter surfaces.

She obtained her Ph.D from University of Minnesota in Machine Learning, where the focus of her dissertation was online learning algorithms, which work on streaming data. Her dissertation was the recipient of the prestigious IBM Ph D. Fellowship Award.

She is active in the research community and part of the program committee at ML and recommendation system conferences. Shas mentored several undergrad and grad students and participated in various round table discussions through Grace Hopper Conference, Women in Machine Learning Program colocated with NeurIPS, AAAI and Computing Research Association- Women’s chapter.

Author:

Tom Sheffler

Solution Architect, Next Generation Sequencing

Roche

Tom earned his PhD from Carnegie Mellon in Computer Engineering with a focus on parallel computing architectures and prrogramming models. His interest in high-performance computing took him to NASA Ames, and then to Rambus where he worked on accelerated memory interfaces for providing high bandwidth. Following that, he co-founded the cloud video analytics company, Sensr.net, that applied scalable cloud computing to analyzing large streams of video data. He later joined Roche to work on next-generation sequencing and scalable genomics analysis platforms. Throughout his career, Tom has focused on the application of high performance computer systems to real world problems.

Author:

Galen Shipman

Computer Scientist

Los Alamos National Laboratories

Galen Shipman is a computer scientist at Los Alamos National Laboratory (LANL). His interests include programming models, scalable runtime systems, and I/O. As Chief Architect he leads architecture and technology of Advanced Technology Systems (ATS) at LANL. He has led performance engineering across LANL’s multi-physics integrated codes and the advancement and integration of next-generation programming models such as the Legion programming system as part of LANL's next-generation code project, Ristra. His work in storage systems and I/O is currently focused on composable micro-services as part of the Mochi project. His prior work in scalable software for HPC include major contributions to broadly used technologies including the Lustre parallel file system and Open MPI.

Author:

Tejas Chopra

Senior Engineer of Software

Netflix

Tejas Chopra is a Sr. Engineer at Netflix working on Machine Learning Platform for Netflix Studios and a Founder at GoEB1 which is the world’s first and only thought leadership platform for immigrants.Tejas is a recipient of the prestigious EB1A (Einstein) visa in US. Tejas is a Tech 40 under 40 Award winner, a TEDx speaker, a Senior IEEE Member, an ACM member, and has spoken at conferences and panels on Cloud Computing, Blockchain, Software Development and Engineering Leadership.Tejas has been awarded the ‘International Achievers Award, 2023’ by the Indian Achievers’ Forum. He is an Adjunct Professor for Software Development at University of Advancing Technology, Arizona, an Angel investor and a Startup Advisor to startups like Nillion. He is also a member of the Advisory Board for Flash Memory Summit.Tejas’ experience has been in companies like Box, Apple, Samsung, Cadence, and Datrium. Tejas holds a Masters Degree in ECE from Carnegie Mellon University, Pittsburgh.

Author:

Ping Zhou

Researcher/Architect

Bytedance Ltd.

Ping Zhou is a Senior Researcher/Architect with ByteDance, focusing on next-gen infrastructure innovations with hardware/software co-design. Prior to joining ByteDance, Ping worked with Google, Alibaba and Intel on products including Google Assistant, Optane SSD and Open Channel SSD. Ping earned his PhD in Computer Engineering at University of Pittsburgh, specializing in the field of emerging memory and storage technologies.

Author:

Rahul Gupta

AI Research Scientist

US Army Laboratory

Dr. Rahul Gupta has been working at the Army Research Lab for more than a decade. In his current position he is conducting research and development using Deep Learning Artificial Neural Network and Convolutional Neural Network. He joined ARL as a Distinguished Research Scholar and led several successful programs. He became a Fellow of the American Society of Mechanical Engineers in 2014. He is passionate about mentoring and team building with the goal of providing the Army the best possible technology to dominate today’s complex Multi-Domain Environment (MDE).

Author:

Puja Das

Senior Director, Personalization

Warner Bros. Entertainment

Dr. Puja Das, leads the Personalization team at Warner Brothers Discovery (WBD) which includes offerings on Max, HBO, Discovery+ and many more.

Prior to WBD, she led a team of Applied ML researchers at Apple, who focused on building large scale recommendation systems to serve personalized content on the App Store, Arcade and Apple Books. Her areas of expertise include user modeling, content modeling, recommendation systems, multi-task learning, sequential learning and online convex optimization. She also led the Ads prediction team at Twitter (now X), where she focused on relevance modeling to improve App Ads personalization and monetization across all of Twitter surfaces.

She obtained her Ph.D from University of Minnesota in Machine Learning, where the focus of her dissertation was online learning algorithms, which work on streaming data. Her dissertation was the recipient of the prestigious IBM Ph D. Fellowship Award.

She is active in the research community and part of the program committee at ML and recommendation system conferences. Shas mentored several undergrad and grad students and participated in various round table discussions through Grace Hopper Conference, Women in Machine Learning Program colocated with NeurIPS, AAAI and Computing Research Association- Women’s chapter.

Author:

Tom Sheffler

Solution Architect, Next Generation Sequencing

Roche

Tom earned his PhD from Carnegie Mellon in Computer Engineering with a focus on parallel computing architectures and prrogramming models. His interest in high-performance computing took him to NASA Ames, and then to Rambus where he worked on accelerated memory interfaces for providing high bandwidth. Following that, he co-founded the cloud video analytics company, Sensr.net, that applied scalable cloud computing to analyzing large streams of video data. He later joined Roche to work on next-generation sequencing and scalable genomics analysis platforms. Throughout his career, Tom has focused on the application of high performance computer systems to real world problems.

Author:

Galen Shipman

Computer Scientist

Los Alamos National Laboratories

Galen Shipman is a computer scientist at Los Alamos National Laboratory (LANL). His interests include programming models, scalable runtime systems, and I/O. As Chief Architect he leads architecture and technology of Advanced Technology Systems (ATS) at LANL. He has led performance engineering across LANL’s multi-physics integrated codes and the advancement and integration of next-generation programming models such as the Legion programming system as part of LANL's next-generation code project, Ristra. His work in storage systems and I/O is currently focused on composable micro-services as part of the Mochi project. His prior work in scalable software for HPC include major contributions to broadly used technologies including the Lustre parallel file system and Open MPI.

Author:

Tejas Chopra

Senior Engineer of Software

Netflix

Tejas Chopra is a Sr. Engineer at Netflix working on Machine Learning Platform for Netflix Studios and a Founder at GoEB1 which is the world’s first and only thought leadership platform for immigrants.Tejas is a recipient of the prestigious EB1A (Einstein) visa in US. Tejas is a Tech 40 under 40 Award winner, a TEDx speaker, a Senior IEEE Member, an ACM member, and has spoken at conferences and panels on Cloud Computing, Blockchain, Software Development and Engineering Leadership.Tejas has been awarded the ‘International Achievers Award, 2023’ by the Indian Achievers’ Forum. He is an Adjunct Professor for Software Development at University of Advancing Technology, Arizona, an Angel investor and a Startup Advisor to startups like Nillion. He is also a member of the Advisory Board for Flash Memory Summit.Tejas’ experience has been in companies like Box, Apple, Samsung, Cadence, and Datrium. Tejas holds a Masters Degree in ECE from Carnegie Mellon University, Pittsburgh.

Author:

Ping Zhou

Researcher/Architect

Bytedance Ltd.

Ping Zhou is a Senior Researcher/Architect with ByteDance, focusing on next-gen infrastructure innovations with hardware/software co-design. Prior to joining ByteDance, Ping worked with Google, Alibaba and Intel on products including Google Assistant, Optane SSD and Open Channel SSD. Ping earned his PhD in Computer Engineering at University of Pittsburgh, specializing in the field of emerging memory and storage technologies.

Author:

Rahul Gupta

AI Research Scientist

US Army Laboratory

Dr. Rahul Gupta has been working at the Army Research Lab for more than a decade. In his current position he is conducting research and development using Deep Learning Artificial Neural Network and Convolutional Neural Network. He joined ARL as a Distinguished Research Scholar and led several successful programs. He became a Fellow of the American Society of Mechanical Engineers in 2014. He is passionate about mentoring and team building with the goal of providing the Army the best possible technology to dominate today’s complex Multi-Domain Environment (MDE).

VIEW FULL SPEAKER LINE-UP

Agenda Highlights

Filter by:

All Topics

All Job Focuses

Printable version

Memory Optimizations for Machine Learning

As Machine Learning continues to forge its way into diverse industries and applications, optimizing computational resources, particularly memory, has become a critical aspect of effective model deployment. This session, "Memory Optimizations for Machine Learning," aims to offer an exhaustive look into the specific memory requirements in Machine Learning tasks and the cutting-edge strategies to minimize memory consumption efficiently.
We'll begin by demystifying the memory footprint of typical Machine Learning data structures and algorithms, elucidating the nuances of memory allocation and deallocation during model training phases. The talk will then focus on memory-saving techniques such as data quantization, model pruning, and efficient mini-batch selection. These techniques offer the advantage of conserving memory resources without significant degradation in model performance.
Additional insights into how memory usage can be optimized across various hardware setups, from CPUs and GPUs to custom ML accelerators, will also be presented.

Author:

Tejas Chopra

Senior Engineer of Software

Netflix

Tejas Chopra is a Sr. Engineer at Netflix working on Machine Learning Platform for Netflix Studios and a Founder at GoEB1 which is the world’s first and only thought leadership platform for immigrants.Tejas is a recipient of the prestigious EB1A (Einstein) visa in US. Tejas is a Tech 40 under 40 Award winner, a TEDx speaker, a Senior IEEE Member, an ACM member, and has spoken at conferences and panels on Cloud Computing, Blockchain, Software Development and Engineering Leadership.Tejas has been awarded the ‘International Achievers Award, 2023’ by the Indian Achievers’ Forum. He is an Adjunct Professor for Software Development at University of Advancing Technology, Arizona, an Angel investor and a Startup Advisor to startups like Nillion. He is also a member of the Advisory Board for Flash Memory Summit.Tejas’ experience has been in companies like Box, Apple, Samsung, Cadence, and Datrium. Tejas holds a Masters Degree in ECE from Carnegie Mellon University, Pittsburgh.

Indirect/Irregular Workloads within Large Simulations and How to Improve Access through Co-Design

Los Alamos National Laboratory's (LANL) has a diverse set of High Performance Computing codes. Analysis of many of these codes indicate they are heavily memory bound with sparse memory accesses. High Bandwidth Memory (HBM) has proven a significant advancement in improving the performance of these codes but the roadmap for major (step function) improvements in memory technologies is unclear. Addressing this challenge will require a renewed focus on high performance memory and processor technologies that take a more aggressive and holistic view of advancements in ISA, microarchitecture, and memory controller technologies. Beyond scientific simulations, advancements in performance of sparse memory accesses will benefit graph analysis, DLRM inference, and database workloads.

Author:

Galen Shipman

Computer Scientist

Los Alamos National Laboratories

Galen Shipman is a computer scientist at Los Alamos National Laboratory (LANL). His interests include programming models, scalable runtime systems, and I/O. As Chief Architect he leads architecture and technology of Advanced Technology Systems (ATS) at LANL. He has led performance engineering across LANL’s multi-physics integrated codes and the advancement and integration of next-generation programming models such as the Legion programming system as part of LANL's next-generation code project, Ristra. His work in storage systems and I/O is currently focused on composable micro-services as part of the Mochi project. His prior work in scalable software for HPC include major contributions to broadly used technologies including the Lustre parallel file system and Open MPI.

Recommendation Systems - Data Demands & Infrastructure Requirements

Author:

Puja Das

Senior Director, Personalization

Warner Bros. Entertainment

Dr. Puja Das, leads the Personalization team at Warner Brothers Discovery (WBD) which includes offerings on Max, HBO, Discovery+ and many more.

Prior to WBD, she led a team of Applied ML researchers at Apple, who focused on building large scale recommendation systems to serve personalized content on the App Store, Arcade and Apple Books. Her areas of expertise include user modeling, content modeling, recommendation systems, multi-task learning, sequential learning and online convex optimization. She also led the Ads prediction team at Twitter (now X), where she focused on relevance modeling to improve App Ads personalization and monetization across all of Twitter surfaces.

She obtained her Ph.D from University of Minnesota in Machine Learning, where the focus of her dissertation was online learning algorithms, which work on streaming data. Her dissertation was the recipient of the prestigious IBM Ph D. Fellowship Award.

She is active in the research community and part of the program committee at ML and recommendation system conferences. Shas mentored several undergrad and grad students and participated in various round table discussions through Grace Hopper Conference, Women in Machine Learning Program colocated with NeurIPS, AAAI and Computing Research Association- Women’s chapter.

Data Movement for Enterprise Teams – AI Challenges: Latency, Performance and Failing AI Training Scenarios

There are a set of challenges that emanate from memory issues in GenAI deployments in enterprise

Poor tooling for performance issues related from GPU and memory interconnectedness
Latency issues as a result of data movement and poor memory capacity planning
Failing AI training scenarios in low memory constraints

There is both opacity and immature tooling to manage a foundational infrastructure for GenAI deployment, memory. This is experienced by AI teams who need to double-click on the infrastructure and improve on these foundations to deploy AI at scale.

Author:

Rodrigo Madanes

Global AI Innovation Officer

EY

Rodrigo Madanes is EY’s Global Innovation AI Leader. Rodrigo has a computer science degree from MIT and a PhD from UC Berkeley. Some testament to his technical expertise includes 3 patents and having created novel AI products at both the MIT Media Lab as well as Apple’s Advanced Technologies Group.

Prior to EY, Rodrigo ran the European business incubator at eBay which launched new ventures including eBay Hire. At Skype, he was the C-suite executive leading product design globally during its hyper-growth phase, where the team scaled the userbase, revenue, and profits 100% YoY for 3 consecutive years.

Scaling Genomics Computations and Adapting to New Architectures

As the cost of sequencing drops and the quantity of data produced by sequencing grows, the amount of processing dedicated to genomics is increasing at a rapid pace. [Genomics is evolving in a number of directions simultaneously.] Complex pipelines are written in such a manner that they are portable to either clusters or clouds. Key kernels are also being ported to GPUs in a drop-in replacement for their non-accelerated counterpart. These techniques are helping to address challenges of scaling up genomics computations and porting validated pipelines to new systems. However, all of these computations strain the bandwidth and capacity of available resources. In this talk, Roche´s Tom Sheffler will share an overview of the memory-bound challenges present in genomics and venture some possible solutions.

Author:

Tom Sheffler

Solution Architect, Next Generation Sequencing

Roche

Tom earned his PhD from Carnegie Mellon in Computer Engineering with a focus on parallel computing architectures and prrogramming models. His interest in high-performance computing took him to NASA Ames, and then to Rambus where he worked on accelerated memory interfaces for providing high bandwidth. Following that, he co-founded the cloud video analytics company, Sensr.net, that applied scalable cloud computing to analyzing large streams of video data. He later joined Roche to work on next-generation sequencing and scalable genomics analysis platforms. Throughout his career, Tom has focused on the application of high performance computer systems to real world problems.

Data-Driven Machine Learning: Past, Present and Future

The presentation delves into the evolution, current state, and prospective developments within data-driven machine learning. In an era where data has ascended to the status of a pivotal resource, this presentation emphasizes its indispensable role in shaping the landscape of machine learning and how these changes have significantly influenced systems infrastructure.

Delving into the past, it meticulously examines the historical origins of data-driven modeling, charting its progression from rudimentary concepts to the intricate algorithms that underpin modern machine learning. The presentation illuminates early techniques like perceptrons and decision trees and elucidates their enduring impact on the field.

In the present, this presentation expounds upon the transformative influence of big data and deep learning, illuminating real-world applications while highlighting the associated challenges and opportunities that have engendered profound alterations in systems infrastructure.

As we look towards the future, this presentation provides invaluable insights into emerging trends and technologies such as quantum computing and edge AI, poised to redefine the future of machine learning and further revolutionize systems infrastructure.

By amalgamating theoretical insights, empirical observations, and forward-looking perspectives, this presentation offers a comprehensive overview of the past achievements, current dynamics, and potential future scenarios in the realm of data-driven machine learning, shedding light on how these changes have reshaped systems infrastructure.

Author:

Rahul Gupta

AI Research Scientist

US Army Laboratory

Dr. Rahul Gupta has been working at the Army Research Lab for more than a decade. In his current position he is conducting research and development using Deep Learning Artificial Neural Network and Convolutional Neural Network. He joined ARL as a Distinguished Research Scholar and led several successful programs. He became a Fellow of the American Society of Mechanical Engineers in 2014. He is passionate about mentoring and team building with the goal of providing the Army the best possible technology to dominate today’s complex Multi-Domain Environment (MDE).

VIEW THE FULL AGENDA

Spotlight Series

View Ved's Full Spotlight Here

View Rahul's Full Spotlight Here