Building a Lakehouse: Leverage Open Table Formats for Unified Stream and Batch Architecture – Micha Kunze, Maersk

November 6, 2023

Session Outline

In this session at the NDSML Summit 2023, Micha Kunze from Maersk dives into how his team builds, tests, quality-checks, and manages thousands of data pipelines. They use open-source table formats and processing frameworks to deliver near-real-time operational data and to support their ML feature store.

Key Takeaways:

Table streaming for cheap nearline (minute-level latency) data processing
Leveraging ML to find anomalies in streaming velocity across datasets
Query any dataset or stream via a fully automated metastore
