Artificial Intelligence

Deep Data Observability in World of Structured & Semi-structured Data – Patrik Liu-Tran, Validio

August 8, 2023

Session Outline

In this talk, Patrik Liu Tran explains how Validio pioneers Deep Data Observability for nested and semi-structured data, enabling semi-structured and structured data to co-exist across data pipelines. With this technology, data teams can unlock new data use cases without having to flatten nested files. It’s clear that semi-structured data is here to stay: although it has been the de facto standard in event streams and applications for years, only recently has it found a permanent home in the data warehouse. Nested or semi-structured data provides many benefits. It is more flexible, it is more object-oriented and thus easier for backend teams to work with, and it leads to simpler data models, since the structure itself can hold the model. However, these same advantages cause severe data quality problems for users and pipelines that depend on properties of the nested data. This talk at the Data Innovation Summit 2023 explores these pain points and technical ways to solve them, for example by validating nested schemas.
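
To make the idea of validating a nested schema without flattening concrete, here is a minimal sketch in Python. It is an illustration only and not Validio's implementation: the `EXPECTED_SCHEMA` layout, the `validate` helper, and the sample `event` are all hypothetical. The sketch walks a nested record in place and reports each violation with the path at which it occurred.

```python
from typing import Any

# Hypothetical expected schema (an assumption for illustration):
# leaves are Python types, dicts describe nested objects, and a
# one-element list describes the type of every item in an array.
EXPECTED_SCHEMA = {
    "user": {"id": int, "email": str},
    "items": [{"sku": str, "price": float}],
}


def validate(value: Any, schema: Any, path: str = "$") -> list[str]:
    """Return a list of violations, each prefixed with its nested path."""
    errors: list[str] = []
    if isinstance(schema, dict):
        if not isinstance(value, dict):
            return [f"{path}: expected object, got {type(value).__name__}"]
        for key, sub_schema in schema.items():
            if key not in value:
                errors.append(f"{path}.{key}: missing field")
            else:
                errors.extend(validate(value[key], sub_schema, f"{path}.{key}"))
    elif isinstance(schema, list):
        if not isinstance(value, list):
            return [f"{path}: expected array, got {type(value).__name__}"]
        # Check every array element against the single item schema.
        for index, item in enumerate(value):
            errors.extend(validate(item, schema[0], f"{path}[{index}]"))
    elif not isinstance(value, schema):
        errors.append(f"{path}: expected {schema.__name__}, got {type(value).__name__}")
    return errors


# Sample nested event (made up): the second item has a string price.
event = {
    "user": {"id": 42, "email": "a@example.com"},
    "items": [{"sku": "A-1", "price": 9.99}, {"sku": "B-2", "price": "free"}],
}
violations = validate(event, EXPECTED_SCHEMA)
for violation in violations:
    print(violation)
```

Because the check recurses through the structure rather than flattening it, each violation keeps its full path (such as an index into an array), which is the kind of information a downstream user of a nested property would need.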