Session Outline
In this session at the NDSML Summit 2023, Micha Kunze from Maersk dives into how their team builds, tests, quality-checks, and manages thousands of data pipelines. They use open-source table formats and processing frameworks to deliver near-real-time operational data and support their ML feature store.
Key Takeaways:
- Table streaming for cheap nearline (minutes) data processing
- Leveraging ML to find anomalies in streaming velocity across our datasets
- Query any dataset/stream via a fully automated metastore