Hyperight

Embed Recycling to Accelerate ML Inferencing – Jo Kristian Bergum, Yahoo

Session Outline

With self-supervised deep learning, unstructured data from various sources is more useful than ever. However, running deep learning models on the same data for different tasks can be very costly. A new technique called embedding recycling (ER) helps reduce these costs by allowing intermediate data representations, or embeddings, to be reused across different tasks. By caching the outputs of selected layers in a neural network, ER lets a single embedding serve multiple tasks, each with its own specialized layers on top. This talk introduces recyclable embeddings and shows how Vespa, a scalable search and recommendation engine, efficiently performs ML inferences on large data sets using this approach.
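The core idea above can be sketched in a few lines: a shared encoder runs once per input, its output is cached, and several task-specific heads reuse that cached embedding instead of re-running the expensive encoder. The sketch below is purely illustrative; the encoder, cache, and task heads are hypothetical stand-ins, not the actual method or APIs discussed in the talk.

```python
# Minimal sketch of embedding recycling (ER), assuming a frozen shared
# encoder whose output is cached and reused by multiple task heads.
# `encode`, `classify_length`, and `score_relevance` are illustrative names.
from functools import lru_cache

@lru_cache(maxsize=None)
def encode(text):
    """Stand-in for a frozen encoder: runs once per input, then is recycled."""
    # Toy "embedding": simple character statistics instead of a real model.
    return (len(text) / 100.0, sum(map(ord, text)) / (255.0 * max(len(text), 1)))

def classify_length(text):
    """Task head 1: a specialized layer consuming the shared embedding."""
    emb = encode(text)  # cache hit if another task already embedded this text
    return "long" if emb[0] > 0.2 else "short"

def score_relevance(text):
    """Task head 2: reuses the same cached embedding, no re-encoding."""
    emb = encode(text)
    return emb[0] + emb[1]

doc = "Vespa serves ML inference at scale"
classify_length(doc)   # first call: encoder runs and its output is cached
score_relevance(doc)   # second task: recycled embedding, encoder skipped
```

In a real system the cached vectors would be the outputs of the lower layers of a large transformer, stored alongside the documents, so that only the small task heads run at query time.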

Key Takeaways

  • Overview of deep-learned embeddings and the growing importance of embedding models for ML
  • How to recycle embeddings for different tasks
  • Practical strategies and best practices for efficiently managing and operating deep-learned embeddings in real-world applications
