Hyperight

Embed Recycling to Accelerate ML Inferencing – Jo Kristian Bergum, Yahoo

Session Outline

With self-supervised deep learning, unstructured data from various sources is more useful than ever. However, running deep learning models on the same data for different tasks can be very costly. A new technique called embedding recycling (ER) helps reduce these costs by allowing intermediate data representations, or embeddings, to be reused across different tasks. By caching the outputs of selected layers in a neural network, ER lets a single embedding serve multiple tasks, each with its own specialized layers on top. This talk introduces recyclable embeddings and shows how Vespa, a scalable search and recommendation engine, efficiently performs ML inferences on large data sets using this approach.
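The core idea above can be sketched in a few lines: a shared encoder runs once per input, its output is cached, and several task-specific heads reuse that cached embedding instead of re-running the expensive encoder. The sketch below is purely illustrative; the encoder, cache, and task heads are hypothetical stand-ins, not the actual method or APIs discussed in the talk.

```python
# Minimal sketch of embedding recycling (ER), assuming a frozen shared
# encoder whose output is cached and reused by multiple task heads.
# `encode`, `classify_length`, and `score_relevance` are illustrative names.
from functools import lru_cache

@lru_cache(maxsize=None)
def encode(text):
    """Stand-in for a frozen encoder: runs once per input, then is recycled."""
    # Toy "embedding": simple character statistics instead of a real model.
    return (len(text) / 100.0, sum(map(ord, text)) / (255.0 * max(len(text), 1)))

def classify_length(text):
    """Task head 1: a specialized layer consuming the shared embedding."""
    emb = encode(text)  # cache hit if another task already embedded this text
    return "long" if emb[0] > 0.2 else "short"

def score_relevance(text):
    """Task head 2: reuses the same cached embedding, no re-encoding."""
    emb = encode(text)
    return emb[0] + emb[1]

doc = "Vespa serves ML inference at scale"
classify_length(doc)   # first call: encoder runs and its output is cached
score_relevance(doc)   # second task: recycled embedding, encoder skipped
```

In a real system the cached vectors would be the outputs of the lower layers of a large transformer, stored alongside the documents, so that only the small task heads run at query time.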

Key Takeaways

  • Overview of deep-learned embeddings and the growing importance of embedding models for ML
  • How to recycle embeddings for different tasks
  • Practical strategies and best practices for efficiently managing and operating deep-learned embeddings in real-world applications
