MLOps

Lowering Complexity & Improving Efficiency Of Inferencing

Technologist Deep-Dive (Gen AI & Data Science) Track

AI Technologists

Data Science

Digital Infrastructure

MLOps

Author:

Aayush Mudgal

Senior Machine Learning Engineer

Aayush Mudgal is a Senior Machine Learning Engineer at Pinterest, currently leading the efforts around Privacy Aware Conversion Modeling. He has a successful track record of starting and executing 0 to 1 projects, including conversion optimization, video ads ranking, landing page optimization, and evolving the ads ranking from GBDT to DNN stack. His expertise is in large-scale recommendation systems, personalization, and ads marketplaces. Before entering the industry, Aayush conducted research on intelligent tutoring systems, developing data-driven feedback to aid students in learning computer programming. He holds a Master's in Computer Science from Columbia University and a Bachelor of Technology in Computer Science from the Indian Institute of Technology Kanpur.


Transforming Privacy in Language Processing: Opaque Prompts and Beyond

This talk offers a deep dive into data privacy in language model (LM) applications, spotlighting the use of opaque prompts as a key strategy for safeguarding sensitive information. We explore how opaque prompts effectively sanitize user inputs by substituting sensitive data with non-identifiable placeholders, thereby preventing LMs from accessing personally identifiable information (PII). The discussion extends to the intricacies of implementing these prompts, highlighting the technical challenges in reliably masking PII and the need for customizable identification mechanisms. The talk also addresses the privacy concerns in LM training data, focusing on the challenges in anonymizing datasets and the implications for model accuracy and utility. This session aims to provide insights into advancing data protection methodologies within the realm of language models.
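The substitution step described above can be illustrated with a minimal sketch. It assumes a simple regex-based detector for email addresses and phone numbers; the pattern table and the `sanitize` and `desanitize` helpers are hypothetical names chosen for illustration, not the API of any particular opaque-prompt library. Reliable PII masking in practice needs much broader detection than this.

```python
import re

# Hypothetical patterns for illustration only; real PII detection covers far
# more than two regexes (names, addresses, IDs, context-dependent entities).
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def sanitize(prompt):
    """Replace detected PII with placeholders; return the text and the mapping."""
    mapping = {}   # placeholder -> original value
    counters = {}  # label -> number of placeholders issued so far

    for label, pattern in PII_PATTERNS.items():
        def substitute(match, label=label):
            value = match.group(0)
            # Reuse the placeholder if the same value was already seen.
            for placeholder, original in mapping.items():
                if original == value:
                    return placeholder
            counters[label] = counters.get(label, 0) + 1
            placeholder = f"<{label}_{counters[label]}>"
            mapping[placeholder] = value
            return placeholder

        prompt = pattern.sub(substitute, prompt)
    return prompt, mapping


def desanitize(text, mapping):
    """Restore the original values in a model response."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text
```

In this sketch only the sanitized text would be sent to the language model, while the mapping stays on the caller's side and is used to restore the original values in the response.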

Technologist Deep-Dive (Gen AI & Data Science) Track

Search Optimization

LLM Modification

Prompt Engineering

MLOps

Data Optimization

Author:

Zairah Mustahsan

Senior Data Scientist

You.com

Zairah Mustahsan is a Staff Data Scientist at You.com, an AI chatbot for search, where she leverages her expertise in statistical and machine-learning techniques to build analytics and experimentation platforms. Previously, Zairah was a Data Scientist at IBM Research, researching Natural Language Processing (NLP) and AI Fairness topics. Zairah obtained her M.S. in Computer Science from the University of Pennsylvania, where she researched scikit-learn model performance. Her findings have since been used as guidelines for machine learning. Zairah is a regular speaker at AI conferences such as NeurIPS, AI4, AI Hardware & Edge AI Summit, and ODSC. Zairah has published her work in top AI conferences such as AAAI and has over 300 citations. Aside from work, Zairah enjoys adventure sports and poetry.


How To Differentiate Between Proprietary & Public Data Within Foundation Models

This engaging panel discussion delves into the critical differences between proprietary and public data, emphasising the distinct advantages and disadvantages associated with each. Explore how the accessibility and vast quantities of public data facilitate robust generalisation within AI models, contrasting with the nuanced strengths of proprietary data.

Proprietary data, by contrast, offers higher quality, tighter control, and minimal risk of contamination, and it can cover niche topics in detail.

Delve into the advantages of public data, its scalability, and the challenges it poses, juxtaposed against the precise and controlled nature of proprietary data. Gain valuable insights into navigating the trade-offs between the two, understanding their impacts on model performance, ethical and regulatory considerations, and innovation within the realm of AI.

Technologist Deep-Dive (Gen AI & Data Science) Track

AI Technologists

Data Science

Digital Infrastructure

MLOps

Moderator

Author:

Tom Kersten

R&D Engineer

Royal NLR - Netherlands Aerospace Centre

Tom is a distinguished R&D Engineer specialising in AI within the aerospace sector. Armed with a background in computer science and AI, Tom possesses a comprehensive understanding of AI systems. Within his company, he stands out as a leading visionary delving into the integration of generative AI in space, in particular to support the efforts of the Dutch government and its military in this domain. His pioneering work involves exploring and harnessing the potential of GenAI models to revolutionise satellite operations, mission planning, earth observation and space exploration. Tom's dedication to pushing the boundaries of AI in aerospace extends to leveraging generative AI's capabilities, envisaging transformative applications that could redefine the landscape of space technology.
