Garbage In, Gold Out: How GenAI Transforms Complex Data

Discover how Aida Kokanovic uses GenAI to turn messy, unstructured data into insight—and why smarter tools, not cleaner data, are the future!

May 5, 2025

More than 80% of data is unstructured. So, how do you separate noise from insight, especially when the stakes are high? To explore this, we spoke with Aida Kokanovic, a speaker at the Data Innovation Summit 2025, whose work sits at the intersection of generative AI, open-source intelligence, and complex data analysis!

With a background in investigative research and a passion for turning messy, raw data into meaningful outcomes, Aida challenges the old rule of “garbage in, garbage out.” Her talk, “Garbage In, Gold Out,” dives into how GenAI reshapes the way researchers process and understand complex, unstructured information.

In this interview, Aida shares how GenAI unlocks new possibilities in data structuring, the challenges of applying it to research, and what it means for the future of decision-making. She also offers advice for those looking to get started with AI-powered investigations.

Read on for Aida’s perspective on why the future of data analysis lies not just in cleaner inputs—but in smarter, more adaptive tools!

Hyperight: Aida, can you briefly introduce yourself? What is your professional background and current working focus?

Aida Kokanovic, speaker at the Data Innovation Summit 2025

Aida Kokanovic: I work at the intersection of generative AI, open-source intelligence (OSINT), and investigative journalism. My background is in data-driven research and digital project management. Over the years, I’ve specialized in building tools that make sense of messy, unstructured data—whether leaked documents, public records, or long PDF archives.

Right now, I’m focused on developing AI-powered systems that help journalists and analysts extract structure, timelines, and entities from complex sources, turning raw information into actionable insight. I also teach OSINT and advise media teams on how to integrate AI meaningfully into their workflows.

Hyperight: In your talk at the Data Innovation Summit 2025, you will speak about ‘garbage in, gold out,’ and the role of genAI in research and complex data. What do you hope attendees take away from it?

Aida Kokanovic: The talk flips the old saying “garbage in, garbage out” on its head. We’re now seeing how generative AI can actually work with messy, noisy, and incomplete data—not just clean datasets—and still produce structured, meaningful results. I’ll walk through real-world investigative cases where GenAI helped surface hidden patterns from unstructured sources in a leaked dataset from Russian officials.

Attendees will get a look under the hood of how these tools work, but more importantly, I want them to walk away understanding that the real innovation isn’t just in data cleaning; it’s in how we build adaptive systems that ask the right questions. Whether you’re in journalism, compliance, or intelligence, the key is creating workflows where GenAI supports human sense-making rather than replacing it.

Hyperight: Aida, your work has focused on open-source intelligence and data structuring. How has genAI changed the analysis of complex, unstructured data compared to traditional methods?

Aida Kokanovic: Great question! GenAI has been a game-changer, really. Traditionally, structuring unstructured data—like extracting timelines, entities, or relationships from raw text—was slow, rule-based, and extremely manual. You had to know exactly what you were looking for, and even then, you’d often miss context or subtle patterns.

With GenAI, we can flip the workflow. Instead of defining all the rules up front, we give the model examples, and it starts to see structure where we might not—even across diverse formats like emails, transcripts, or PDFs. It doesn’t just extract data; it interprets, suggests, and connects dots across sources. That shift from rigid parsing to adaptive reasoning is huge. It’s like moving from a metal detector to ground-penetrating radar—you uncover more, faster, and with better contextual insight.
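
To make the "examples instead of rules" idea concrete, here is a minimal Python sketch of a few-shot extraction prompt. It is an illustration only, not the tooling described in the interview: the example snippets, the field names (`date`, `entities`, `event`), and the function `build_few_shot_prompt` are all hypothetical, and the resulting prompt would still need to be sent to whichever model you use.

```python
import json

# Hypothetical worked examples: each pairs a raw snippet with the structure we want back.
EXAMPLES = [
    {
        "text": "On 3 March 2021, J. Doe emailed the ministry about the contract.",
        "output": {
            "date": "2021-03-03",
            "entities": ["J. Doe", "the ministry"],
            "event": "email about contract",
        },
    },
    {
        "text": "Meeting notes: R. Roe and Acme Ltd signed the lease on 2020-11-15.",
        "output": {
            "date": "2020-11-15",
            "entities": ["R. Roe", "Acme Ltd"],
            "event": "lease signed",
        },
    },
]


def build_few_shot_prompt(examples, document):
    """Assemble an extraction prompt from worked examples plus one new document."""
    parts = ["Extract the date, entities, and event from each text. Answer with JSON only."]
    for ex in examples:
        parts.append(f"Text: {ex['text']}")
        parts.append(f"JSON: {json.dumps(ex['output'], ensure_ascii=False)}")
    # The new, unlabeled document goes last; the model is expected to continue the pattern.
    parts.append(f"Text: {document}")
    parts.append("JSON:")
    return "\n".join(parts)


prompt = build_few_shot_prompt(
    EXAMPLES, "Transcript, 12 May 2022: Acme Ltd confirmed the transfer."
)
print(prompt)
```

The point of the design is that no parsing rules are written down: the two examples carry the pattern, and changing the target structure means editing the examples rather than rewriting extraction logic.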

Hyperight: Over 80% of the world’s data is unstructured. How does genAI make sense of this ‘garbage’ data and turn it into actionable intelligence for research and investigations?

Aida Kokanovic: The magic of GenAI is that it doesn’t need perfectly labeled data to find meaning. Instead of seeing unstructured data as ‘garbage,’ GenAI treats it as raw material. Messy, yes, but full of hidden signals. It can read thousands of pages, extract entities, detect patterns, build timelines, and even highlight contradictions or anomalies that deserve a closer look.

In investigative contexts, this is a superpower. You can feed it interview transcripts, leaked documents, or scraped web content, and it helps you spot the needle in the haystack—or realize the haystack is the story. GenAI accelerates the path from noise to narrative by doing what no human can do alone at that scale or speed: turn context chaos into structured insight.

Hyperight: What are some challenges you’ve faced in applying genAI to investigative research, and how have you overcome them?

Aida Kokanovic: One of the biggest challenges has been trust. GenAI models are powerful, but they can hallucinate—making up facts or inventing sources. In investigative research, that’s unacceptable. You need traceability. So I had to rethink how we integrate GenAI into the workflow: not as a source of truth, but as a sense-making tool that suggests, highlights, and organizes—while keeping humans in the loop.
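
One simple way to approximate the traceability described here is to require every model-generated finding to carry a supporting quote, then check that the quote really occurs in the source. The Python sketch below assumes that setup. The `claim` and `quote` fields and the function `check_traceability` are hypothetical names chosen for illustration, and a verbatim match is only a first filter: it cannot tell whether the quote actually supports the claim, which remains a human judgment.

```python
import re


def normalize(text):
    """Lowercase and collapse whitespace so line breaks and spacing do not block a match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def check_traceability(source_text, findings):
    """Split findings into those whose quote appears in the source and those needing review."""
    haystack = normalize(source_text)
    supported, needs_review = [], []
    for finding in findings:
        quote = normalize(finding.get("quote", ""))
        if quote and quote in haystack:
            supported.append(finding)
        else:
            # Missing or unmatched quotes are never trusted automatically.
            needs_review.append(finding)
    return supported, needs_review


SOURCE = """The committee met on 4 June.
The budget was approved   without amendments."""

# Hypothetical model output: one grounded finding, one invented quote, one with no quote.
FINDINGS = [
    {"claim": "Budget approved unchanged", "quote": "The budget was approved without amendments."},
    {"claim": "Minister resigned", "quote": "The minister resigned the next day."},
    {"claim": "Meeting was secret", "quote": ""},
]

supported, needs_review = check_traceability(SOURCE, FINDINGS)
print(len(supported), "supported;", len(needs_review), "flagged for human review")
```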

Another challenge is data formatting. We often work with documents that aren’t clean—scans, screenshots, mixed languages. Building a pipeline that handles OCR, translation, structuring, and context interpretation takes work. But by combining modular tools and setting clear boundaries for where GenAI adds value, I’ve found we can drastically cut research time while actually increasing depth and precision.
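
The modular approach can be sketched as a chain of small, swappable stages. The Python below shows only the shape of such a pipeline, under loud assumptions: the stage functions are placeholders that record what they would do rather than calling a real OCR engine, translation service, or model, and the names `Doc`, `ocr_stage`, `translate_stage`, `structure_stage`, and `run_pipeline` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Doc:
    text: str
    language: str = "unknown"
    steps: List[str] = field(default_factory=list)  # audit trail of applied stages


Stage = Callable[[Doc], Doc]


def ocr_stage(doc: Doc) -> Doc:
    # Placeholder: a real stage would run OCR on a scan or screenshot.
    doc.steps.append("ocr")
    return doc


def translate_stage(doc: Doc) -> Doc:
    # Placeholder: a real stage would translate non-English text.
    if doc.language != "en":
        doc.steps.append(f"translate:{doc.language}->en")
        doc.language = "en"
    return doc


def structure_stage(doc: Doc) -> Doc:
    # Placeholder for the step where a GenAI model would extract entities and timelines.
    doc.steps.append("structure")
    return doc


def run_pipeline(doc: Doc, stages: List[Stage]) -> Doc:
    """Apply each stage in order, passing the document along."""
    for stage in stages:
        doc = stage(doc)
    return doc


result = run_pipeline(
    Doc(text="scanned page text", language="de"),
    [ocr_stage, translate_stage, structure_stage],
)
print(result.steps)
```

Keeping each stage behind the same signature makes it easy to replace one tool without touching the others, and the `steps` trail records which transformations a document went through before any model saw it.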

Hyperight: For those interested in diving deeper into AI-driven research, where would you suggest they start to learn and experiment with this technology?

Aida Kokanovic: Start small and start messy, just like the data. You don’t need a PhD or a big tech stack to get going. Tools like ChatGPT, Claude, and NotebookLM are great entry points. With NotebookLM, for example, you can run live experiments with structured prompts, chain reasoning steps, and see how GenAI handles unstructured inputs like reports or transcripts in a more guided way.

Try taking a PDF, some interview notes, or a messy Excel sheet and ask the model to extract events, flag contradictions, or cluster themes. It’s all about learning through experimentation. Once you’re comfortable, look into tools like LangChain or LlamaIndex to build modular workflows—and always stress-test the output. See where the model helps, where it breaks, and what you need to stay in control.

If you follow real-world cases—whether in journalism, intelligence, or risk—you’ll learn faster, because the stakes sharpen your thinking.

Hyperight: Looking ahead, how do you envision the role of genAI evolving in the upcoming years, in terms of transforming research, data analysis, and decision-making?

Aida Kokanovic: I think we’re moving from GenAI as a tool to GenAI as a collaborator. In the near future, I see GenAI becoming a sort of thinking partner—not just extracting facts, but helping us explore angles, ask better questions, and even stress-test assumptions in real time.

For research and data analysis, that means faster synthesis across huge volumes of unstructured input. But more importantly, it means more contextual insight: models that can spot contradictions, link evidence across documents, or map narrative arcs in investigative cases.

When it comes to decision-making, the key shift will be interpretability and trust. We’ll need systems that not only generate insights but show their reasoning and sources—so decision-makers can understand the “why” behind recommendations, not just the “what.”

I don’t believe GenAI will replace analysts or researchers—it will amplify them. The future belongs to teams who know how to ask the right questions and use GenAI to illuminate what humans might miss.

If you’re navigating the messy reality of unstructured data and wondering how GenAI can turn that chaos into clarity, don’t miss Aida’s session at the Data Innovation Summit 2025! She’ll challenge conventional thinking with real-world examples that show how even “garbage” data can yield gold.

Whether you’re in journalism, compliance, or research, Aida’s talk will equip you with strategies for using AI as a true investigative partner—one that helps surface patterns, spot inconsistencies, and accelerate insights without sacrificing human judgment.

By Jana

Published May 05, 2025
