Hyperight

Building a Lakehouse: Leverage Open Table Formats for Unified Stream and Batch Architecture – Micha Kunze, Maersk

Session Outline

In this session at the NDSML Summit 2023, Micha Kunze from Maersk dives into how their team builds, tests, quality-checks, and manages thousands of data pipelines. The team uses open-source table formats and processing frameworks to deliver near-real-time operational data and support its ML feature store.

Key Takeaways:

  • Table streaming for cheap nearline (minutes) data processing
  • Leveraging ML to find anomalies on streaming velocity across our datasets
  • Query any dataset/stream via a fully automated metastore
