Beyond Schema Validation: Build Trust in Your Real-Time Data Pipeline

As data engineering evolves, new tech is transforming real-time data processing and scalability. In this interview, we speak with Tulika Bhatt, a Senior Software Engineer at Netflix! Tulika’s career spans finance, entertainment, and data systems. Her work contributes to the real-time personalization for which Netflix is famous.

Tulika shares her expertise on real-time data systems at Netflix, from schema evolution to building trust through inline validation, and reveals how she balances speed and reliability in streaming environments. Get ready for the key lessons and best practices she will share at the Data Innovation Summit 2025!

Hyperight: Tulika, you have an impressive background spanning technology, finance, and entertainment. Can you tell us more about your journey and what currently drives your work at Netflix?

Tulika Bhatt, speaker at the upcoming Data Innovation Summit 2025

Tulika Bhatt: Thank you! My journey has been shaped by curiosity and a passion for exploring the power of data across various industries. I started in finance, where performance and risk modeling taught me the value of clean, reliable data. From there, I moved to Netflix, where data doesn’t just support the product; it is the product and powers everything.

At Netflix, I focus on large-scale data engineering for real-time personalization use cases. What drives me today is the challenge of building trustworthy, high-throughput data systems: systems that can adapt to fast-moving product needs while maintaining integrity. I’m especially passionate about observability and data quality, making sure teams can innovate rapidly without silently breaking things.

More broadly, I’m driven by impact. Helping teams make smarter decisions with data, building infrastructure that scales with creativity, and mentoring the next generation of data practitioners to treat quality and speed as equally non-negotiable.

Hyperight: With nearly a decade of experience in the industry, what are some exciting advancements you’ve seen in data engineering?

Tulika Bhatt: One exciting shift I’ve noticed is that more and more companies are moving from batch-centric workloads towards streaming pipelines. The mainstream adoption of streaming jobs has been one of the defining trends of recent years.

Another is the rise of declarative stream processing. Now you can write streaming jobs using SQL-like business logic, abstracting away the underlying complexity. This has opened up stream processing to a much broader set of practitioners.

We are also seeing increasing unification of batch and stream processing into a single system. This has been a game changer: it has simplified architectures and reduced the need for duplicated logic across real-time and historical views.

Hyperight: Your session at the Data Innovation Summit 2025 will focus on how to build trust in real-time data pipelines. What are some of the key lessons you want attendees to walk away with?

Tulika Bhatt: Some of the key takeaways from my session would be:

  • Schema validation is necessary but not sufficient. Valid schemas can still contain garbage. We’ll explore real examples where pipelines passed validation but silently degraded data quality or user experience due to business logic drift, unexpected distributions, or stale dimensions.
  • Trust is built on observability, not assumptions. You can’t trust what you don’t observe. We’ll dive into how adding lightweight checks, e.g. volume, null ratios, or cross-field consistency, helps catch issues early and build confidence across teams (a minimal sketch of such checks follows this list).
  • Real-time doesn’t mean fragile. Many engineers believe real-time systems are inherently brittle. I’ll challenge that belief by sharing patterns (e.g., sidecar monitors, delayed mirrors, circuit breakers) that make real-time pipelines as trustworthy as batch, if not more.
  • Alerts are the easy part; actionability is hard. It’s not just about detecting anomalies but making sure the right people are notified, with the right context, in time to take action. We’ll talk about designing for “human-in-the-loop” debugging and ops.
  • Trust is cultural, not just technical. Building trust in pipelines also requires investing in shared vocabulary, ownership boundaries, and collaboration between data producers and consumers. We’ll look at what organizational habits separate “data chaos” from “data confidence.”
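
To make the observability point concrete, here is a minimal Python sketch of the kind of lightweight checks described above: volume, null ratio, and cross-field consistency over one window of events. The field names and thresholds are illustrative assumptions for the example, not Netflix’s actual tooling.

```python
# Illustrative thresholds -- real values would come from historical baselines.
EXPECTED_MIN_VOLUME = 10_000      # records per window (assumed)
MAX_NULL_RATIO = 0.02             # tolerated share of null user_ids (assumed)

def check_window(records):
    """Run lightweight checks on one window of events and return any findings."""
    findings = []

    # Volume check: a sudden drop often signals an upstream outage.
    if len(records) < EXPECTED_MIN_VOLUME:
        findings.append(f"volume too low: {len(records)} records")

    # Null-ratio check: schema-valid events can still carry empty key fields.
    nulls = sum(1 for r in records if r.get("user_id") is None)
    if records and nulls / len(records) > MAX_NULL_RATIO:
        findings.append(f"null ratio too high: {nulls / len(records):.2%}")

    # Cross-field consistency: an impression should not predate its session start.
    inconsistent = sum(
        1 for r in records
        if r.get("impression_ts") and r.get("session_start_ts")
        and r["impression_ts"] < r["session_start_ts"]
    )
    if inconsistent:
        findings.append(f"{inconsistent} records with impression before session start")

    return findings
```

Checks like these are cheap enough to run continuously, which is what makes them practical inside a low-latency stream rather than only in offline audits.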

Hyperight: You will also discuss the limitations of traditional batch data quality approaches in real-time streaming environments. What are some differences between batch and stream processing?

Tulika Bhatt: Batch processing handles large volumes of data at once with high latency but full context, making it ideal for historical analysis. On the other hand, streaming processes data in real time as it arrives, requiring low-latency, incremental processing with limited context.

Traditional batch data quality methods don’t really work for real-time streaming environments. These methods assume the data will eventually be complete, even if it arrives with added latency. In contrast, streaming pipelines deal with continuous, unbounded data and require instant, in-flight validation. Batch methods often catch issues too late, after the damage is done, whereas streaming systems demand proactive, low-latency checks that operate without full context. To build trust in real-time data, we need lightweight, intent-driven observability built into the stream itself, not bolted on afterward.

Hyperight: Inline validation and schema evolution management are essential in maintaining data quality. How do you implement these strategies within the Netflix ecosystem when dealing with large-scale content recommendation systems?

Tulika Bhatt: As a Software Engineer working on large-scale impression data systems at Netflix, maintaining data quality starts with inline validation embedded directly into our streaming pipelines. We implement lightweight checks, like null enforcement, field type validation, and domain constraints, at the point of ingestion using tools like Flink. This ensures that bad data is caught early without blocking throughput. For schema evolution, we rely on a central schema registry with compatibility checks and versioning enforcement. This allows producers to evolve safely while giving downstream consumers time to adapt.
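
To illustrate, here is a simplified Python sketch of what per-record inline validation can look like conceptually: null enforcement, field type validation, and a couple of domain constraints. The event fields and rules are hypothetical assumptions for the example; in practice such checks run inside Flink operators at ingestion rather than as standalone functions.

```python
from dataclasses import dataclass

@dataclass
class ValidationResult:
    ok: bool
    reason: str = ""

# Hypothetical rules for an impression-like event; real constraints live in pipeline config.
REQUIRED_FIELDS = {"profile_id": str, "title_id": int, "row_index": int}
ALLOWED_SURFACES = {"home", "search", "detail_page"}

def validate_impression(event: dict) -> ValidationResult:
    # Null enforcement + field type validation.
    for field, expected_type in REQUIRED_FIELDS.items():
        value = event.get(field)
        if value is None:
            return ValidationResult(False, f"missing field: {field}")
        if not isinstance(value, expected_type):
            return ValidationResult(False, f"bad type for {field}: {type(value).__name__}")

    # Domain constraints.
    if event.get("surface") not in ALLOWED_SURFACES:
        return ValidationResult(False, f"unknown surface: {event.get('surface')}")
    if event["row_index"] < 0:
        return ValidationResult(False, "negative row_index")

    return ValidationResult(True)
```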

Since impression data is also used to construct user home pages at Netflix, even small data quality issues can have a significant user impact. That’s why we drop suspicious records or route them to quarantine streams for analysis, and maintain metrics dashboards that track schema drift, null ratios, and volume anomalies in near real time. This approach lets us evolve fast, scale confidently, and maintain trust in the signals powering billions of recommendations daily.

Hyperight: With high throughput being a priority in streaming environments, how do you design and implement data pipelines that balance strict data validation with the need for low-latency performance?

Tulika Bhatt: In streaming environments where low latency and high throughput are critical, data validation must be efficient and strategic. I follow a layered approach: lightweight, high-confidence checks (like nulls, types, and basic range validations) run inline to catch critical issues early without adding latency, while heavier validations (like referential integrity or distribution checks) are deferred to post-processing audit batch jobs. Multiple alerting layers monitor the stream: critical data drops trigger immediate pages, while lower-priority validation failures are either gracefully dropped or routed to a quarantine stream for later inspection. This architecture ensures that pipelines remain fast, resilient, and trustworthy without compromising on observability or data quality.
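
As a rough illustration of that layered routing, the sketch below shows how a validation failure might be dispatched by severity. The severity levels and the sink callables are assumptions made for the example, not the actual pipeline interfaces.

```python
from enum import Enum

class Severity(Enum):
    CRITICAL = "critical"   # page immediately
    WARNING = "warning"     # quarantine for later inspection
    INFO = "info"           # drop gracefully, but still count in metrics

def route_failure(event: dict, severity: Severity, quarantine, page, metrics):
    """Route a validation failure by severity.

    `quarantine`, `page`, and `metrics` stand in for whatever sinks the
    pipeline actually uses (e.g. a Kafka topic, an alerting hook, counters).
    """
    metrics.increment(f"validation_failure.{severity.value}")
    if severity is Severity.CRITICAL:
        page(f"critical data issue on event {event.get('event_id')}")
    elif severity is Severity.WARNING:
        quarantine(event)  # keep the record for offline inspection
    # INFO-level failures are not re-emitted; the metric above still records them.
```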

Hyperight: As a Software Engineer working with Netflix’s vast scale, how do you approach data governance and management to ensure data consistency and integrity, especially when schema changes occur?

Tulika Bhatt: At Netflix’s scale, data governance is not a centralized function. It is a distributed responsibility enabled by strong tooling and cultural alignment. When it comes to ensuring data consistency and integrity across services, especially during schema changes, my approach is rooted in three pillars: ownership, automation, and observability.

Each data-producing team owns its schema, which is registered and versioned through a central schema registry with enforced compatibility rules, typically backward or forward compatibility depending on the consumer contracts. Schema changes go through automated validations, and we integrate these checks into the testing phase so that issues are caught before they reach production.
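
For intuition, here is a heavily simplified illustration of a backward-compatibility rule of the kind a schema registry enforces. The dictionary-based schema representation is invented for the example; a real registry (for Avro or Protobuf schemas, say) applies far richer rules.

```python
def is_backward_compatible(old_schema: dict, new_schema: dict) -> bool:
    """Simplified rule: data written with the old schema must still be readable
    with the new one. Schemas are modeled as {field: {"type": ..., "default": ...}}."""
    for field, spec in new_schema.items():
        if field not in old_schema:
            # New fields must carry a default so old records remain readable.
            if "default" not in spec:
                return False
        elif spec["type"] != old_schema[field]["type"]:
            # Type changes break existing data (type promotions ignored for simplicity).
            return False
    return True

# Example: adding an optional field is compatible, changing a type is not.
v1 = {"title_id": {"type": "long"}}
v2_ok = {"title_id": {"type": "long"}, "rank": {"type": "int", "default": 0}}
v2_bad = {"title_id": {"type": "string"}}
assert is_backward_compatible(v1, v2_ok)
assert not is_backward_compatible(v1, v2_bad)
```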

On the governance side, we use metadata tagging, lineage tracking, and access controls. The goal is to ensure clarity around what data exists, who owns it, and how it’s being used. This becomes especially critical when schema changes ripple across services: we rely on lineage to track consumption and run campaigns to escalate breaking changes.

For real-time systems, I also ensure runtime observability is in place: schema drift metrics, record format mismatches, and dropped field counts are continuously monitored. This allows us to react quickly if something unexpected slips through, preserving the integrity of data powering everything from personalization to analytics.
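
A minimal sketch of such runtime observability counters might look like the following; the metric names, expected-field set, and format check are illustrative assumptions, not Netflix’s actual metrics.

```python
from collections import Counter

class StreamObservability:
    """Tracks a few runtime drift signals per record; fields and names are illustrative."""

    def __init__(self, expected_fields: set):
        self.expected_fields = expected_fields
        self.counters = Counter()

    def observe(self, event: dict) -> None:
        seen = set(event.keys())
        # Schema drift: fields appearing that the consumer does not expect.
        for field in seen - self.expected_fields:
            self.counters[f"unexpected_field.{field}"] += 1
        # Dropped fields: expected fields missing from the record.
        for field in self.expected_fields - seen:
            self.counters[f"dropped_field.{field}"] += 1
        # Record format mismatch example: timestamps that are not integers.
        ts = event.get("event_ts")
        if ts is not None and not isinstance(ts, int):
            self.counters["format_mismatch.event_ts"] += 1
```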

Hyperight: In large-scale real-time systems, there is often a need for optimizing data throughput without sacrificing quality. How do you ensure efficient data processing, and how do you manage trade-offs in system design?

Tulika Bhatt: In large-scale real-time systems, optimizing for throughput without compromising data quality requires some very deliberate trade-offs. I typically apply a few core techniques:

First, I focus on data shaping at the edge: dropping unnecessary fields and applying filtering as close to the source as possible. This reduces network and compute overhead downstream. For transformations, I lean on stateful stream processors like Flink, but design operators to be memory-efficient and parallelizable, using techniques like event-time windowing and state expiration to control resource usage.
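
As a simple illustration of shaping data at the edge, the sketch below filters out events the pipeline never needs and projects each record down to a whitelist of fields. The filter conditions and field names are assumptions made for the example.

```python
KEEP_FIELDS = {"profile_id", "title_id", "row_index", "event_ts"}  # assumed whitelist

def shape_at_edge(raw_event: dict):
    """Filter and trim an event as close to the source as possible.

    Returns None for events downstream consumers never need, and a slimmed-down
    record otherwise, cutting network and compute cost before the stream."""
    # Filter: drop synthetic or test traffic before it ever hits the stream.
    if raw_event.get("is_test_profile") or raw_event.get("client") == "synthetic":
        return None
    # Project: keep only the fields downstream consumers actually use.
    return {k: v for k, v in raw_event.items() if k in KEEP_FIELDS}
```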

To preserve quality, I embed inline validations (e.g., schema checks, null guards) that are lightweight but catch critical issues early. Heavier checks are deferred to post-processing layers, ensuring that the streaming path remains fast.

Trade-offs often come down to latency versus correctness. Where strict accuracy isn’t feasible in real time, I support eventual-correction mechanisms via late-arriving data handling or periodic reprocessing.
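
One common pattern for eventual correction, sketched below under assumed semantics, is to accept events within an allowed-lateness bound and re-emit corrected aggregates, leaving anything later to periodic batch reprocessing. The window model and threshold are illustrative, not a description of the actual system.

```python
ALLOWED_LATENESS_SEC = 300  # accept events up to 5 minutes late (illustrative)

class CorrectableCount:
    """Keeps per-window counts and re-emits a corrected result when late events arrive."""

    def __init__(self, emit):
        self.emit = emit          # downstream sink, e.g. a compacted topic (assumed)
        self.counts = {}          # window_start -> count
        self.watermark = 0

    def process(self, event_ts: int, window_start: int) -> None:
        self.watermark = max(self.watermark, event_ts)
        if window_start < self.watermark - ALLOWED_LATENESS_SEC:
            return  # too late: leave correction to periodic reprocessing
        self.counts[window_start] = self.counts.get(window_start, 0) + 1
        # Re-emitting after every update means late events produce corrected values
        # that overwrite earlier, incomplete results downstream.
        self.emit(window_start, self.counts[window_start])
```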

Ultimately, the goal is to make smart compromises: guarantee trust where it matters most, and design gracefully degradable systems everywhere else.

Hyperight: Data-driven decision-making is critical. How do you ensure data engineers and software engineers collaborate effectively to maintain high-quality data pipelines?

Tulika Bhatt: Ensuring effective collaboration between data engineers and software engineers starts with a shared understanding that data is a product, not just a byproduct. At Netflix, where real-time data powers everything from personalized recommendations to A/B testing, we foster this alignment through clear ownership boundaries and tight feedback loops.

First, we define data contracts between teams: explicit expectations around schema, freshness, quality metrics, and SLAs. These contracts help both sides understand the impact of changes and reduce friction during iteration.
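
To make the idea of a data contract tangible, here is a hypothetical example of how one might be captured as a simple, versioned object. The dataset name, owner, and thresholds are invented for illustration and are not actual Netflix values.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataContract:
    """A hypothetical data contract between a producing and a consuming team."""
    dataset: str
    owner: str
    schema_version: str
    freshness_sla_minutes: int          # max acceptable end-to-end delay
    max_null_ratio: float               # quality threshold on key fields
    breaking_change_notice_days: int    # lead time producers must give consumers

impressions_contract = DataContract(
    dataset="playback_impressions",     # illustrative name
    owner="impressions-platform",       # illustrative team
    schema_version="3.2",
    freshness_sla_minutes=5,
    max_null_ratio=0.01,
    breaking_change_notice_days=14,
)
```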

Second, we rely on self-serve observability tools that make it easy for engineers to track data lineage, quality trends, and anomalies. This democratizes visibility and encourages accountability. When an issue arises, we don’t just escalate; we conduct shared postmortems and build tooling or automation to prevent recurrence.

Finally, we cultivate a culture where experimentation and reliability can coexist. Product innovation moves fast, but we make sure pipelines are resilient enough to support that speed without compromising trust. When both roles see data quality as a foundation for innovation, not a blocker, collaboration becomes second nature.

If you’re looking to explore the latest trends in real-time data engineering, don’t miss Tulika’s (virtual) session at the Data Innovation Summit 2025! She’ll dive into building scalable, high-quality data pipelines and share her expertise on everything from schema evolution to building trust through inline validation.

Whether you’re tackling streaming data challenges or optimizing your real-time systems, Tulika’s insights offer strategies for balancing speed, quality, and reliability. Join us to learn how to design resilient data systems that power innovation without compromise!
