Building Conviction and Credibility in ML-Based Solutions - Interview with Armin Catovic, EQT

Through the events that Hyperight organizes, we have had a unique chance to follow the path of many companies during their data and AI-driven transformation. EQT is no exception. In this edition, we’ll have the opportunity to hear another success story from their team at the Data Innovation Summit. Armin Catovic, Senior Machine Learning Engineer, and Anna Martin, Senior Data Scientist and Team Lead at EQT Partners, will share their key findings from applying machine learning (ML) within the venture capital and private equity sector.

Some of those insights are shared in this interview. One key point is that building trust is central to building conviction and credibility in ML solutions. The methods EQT used to build trust benefited the company in several ways. Which ones? Read the interview, which we hope will inspire you to follow Armin’s and Anna’s presentation and EQT’s journey with ML solutions this May in Stockholm.

Hyperight: Can you tell us more about yourself and your organization? What are your professional background and current working focus?

Armin Catovic: Anna and I work with EQT’s proprietary, AI-driven investment platform, Motherbrain. We work every day to build a product that gives our investment professionals superpowers. Part CRM (customer relationship management), part machine learning platform, Motherbrain is a product at the intersection of AI and a great user-friendly interface that our investment professionals use to identify, assess and manage investment opportunities.

As for the two of us, we both share the common thread of being data scientists, but currently have slightly different roles. I am at present working as a machine learning engineer, helping standardize how we build and deploy our ML pipelines, with particular focus on our ranking models used for sourcing startups. Anna on the other hand has stepped up as a product lead to help optimize our investment professionals’ workflows.

Hyperight: During the Data Innovation Summit 2023, you will share more on “Building Conviction and Credibility in ML-Based Solutions”. What can the delegates at the event expect from your presentation?

Armin Catovic: Our presentation distills some of our key findings from applying machine learning within the venture capital and private equity sector. That being said, the takeaways are applicable to any organization that has a set of users/customers with deep expertise, i.e. non-web-scale consumers.

When machine learning well and truly took off in the industry six to seven years ago, there was a preconception that you simply needed to hire data scientists and you were good to go. After a while, companies also realized that data acquisition, transformation, and storage were pivotal, so they started hiring data engineers. Shortly thereafter, they realized that ML systems also require careful lifecycle management, so MLOps was invented. The end result is now a well-established picture of ML as a pipeline – a piece of engineering ingenuity.

However, ultimately, continuous iteration and distillation of business and user understanding is key; while data and ML engineers are still at the core, designers, user researchers, ambassadors, and careful product and relationship management are critical to building conviction among users. We hope that our audience will walk away with a more holistic understanding of what is needed to build impactful ML and data-driven solutions for their business – something beyond the traditional pipeline view.

Hyperight: What made EQT work on increased confidence and credibility in ML models and ML-based solutions?

Armin Catovic: Building conviction and credibility means building trust. Trust is of existential importance to us – if our users (who constitute the investment professionals within the deal teams) don’t trust what our ML/data solutions provide, they will simply replace us with external consultants, who will throw in many resources and undertake long hours of traditional research in order to meet their perceived level of trust.

This is quite different from the ML/data solutions within e-commerce or streaming services. Even though they aim to improve retention and engagement with better recommendations and predictions, they can afford to show irrelevant data (up to a point) – their users will continue using their core services, i.e. listening to music, watching TV series, and purchasing goods. We, on the other hand, have an extremely small margin for error, and any error dents our users’ trust.

Hyperight: What does the journey of instilling confidence and credibility look like? Can you guide us through it?

Armin Catovic: This journey is effectively the scope of our presentation. However, we can say now that the journey starts and ends with understanding our users and their core problems. In other words, we have had to keep re-visiting our users and updating our assumptions based on their evolving needs and behaviors.

Motherbrain began as a platform for sourcing early-stage startups and has now evolved to include an array of use cases such as building conviction for a much smaller set of companies, enabling value-add for our portfolio companies via add-on acquisitions, as well as providing a mapping and a better understanding of the markets to our users. In terms of a data science analogue, our journey started with a focus on sensitivity (recall), and has now also shifted into prioritizing specificity (precision).
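
To make the recall-versus-precision analogy concrete, here is a minimal sketch in Python. The labels and both "models" are invented for illustration and have nothing to do with Motherbrain's actual data or ranking models. A broad sourcing list catches every good company (high recall) at the cost of many false leads, while a short conviction list misses some but is rarely wrong (high precision). Strictly speaking, specificity (TN / (TN + FP)) is a different measure from precision (TP / (TP + FP)), so the sketch computes all three.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = relevant)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def recall(tp, fn):
    """Share of truly relevant items that were flagged (sensitivity)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp, fp):
    """Share of flagged items that are truly relevant."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def specificity(tn, fp):
    """Share of irrelevant items that were correctly left unflagged."""
    return tn / (tn + fp) if (tn + fp) else 0.0


# Hypothetical data: 10 companies, 3 of which are genuinely good opportunities.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
broad_sourcing = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]   # casts a wide net
short_list = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]       # flags only high-confidence picks

for name, y_pred in [("broad sourcing", broad_sourcing), ("short list", short_list)]:
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    print(
        f"{name}: recall={recall(tp, fn):.2f} "
        f"precision={precision(tp, fp):.2f} "
        f"specificity={specificity(tn, fp):.2f}"
    )
```

In this toy example the broad list finds all three good companies but half of its suggestions are wrong, whereas the short list makes no wrong suggestions but misses one good company.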

Hyperight: What methods have you used to improve the trust and credibility of ML solutions?

Armin Catovic: From a data and ML engineering standpoint, we’ve had to double down on improving our data quality. This meant analyzing the coverage and accuracy of various data sources, and becoming smarter about how we prioritize data points. Revisiting our models and features from the ground up helped us find inconsistencies and bugs that had crept in over time. Finally, increasing the amount of first-hand, gold-standard data from our actual users has helped significantly. All of these served to improve the precision of our insights and model predictions.

While engineering forms a centerpiece of these efforts, our UX designers and product management have been absolutely indispensable. Conducting interviews, iterating mock-ups, and analyzing how Motherbrain is being used has helped us create a more detailed breakdown of our users and their behaviors, and was instrumental in creating a concrete product roadmap.

Hyperight: Were there any challenges you faced during the journey of building conviction and credibility in ML-based solutions?

Armin Catovic: This is of course an ongoing journey, but building trust is the one big challenge, especially when expanding the product to new user groups. When working with such large amounts of data, you have to answer several questions: How do we produce insights and predictions that are intelligent enough to bring true value, while still making it transparent to the user what data is used? What is the quality of that data, and what assumptions are being made?

Hyperight: Can you share some of the benefits for the organization with this improved ML solution?

Armin Catovic: We truly believe that Motherbrain has been a game changer for our organization. Today, almost every step of the investment process is supplemented by ML, making us quicker, smarter and ultimately better investors. It’s not that ML will ever replace human investors, but we would argue that an investor with ML is always better than one without.

That makes it hard to narrow it down to just a few benefits. But let us share some tangible examples of companies that we have sourced through Motherbrain. Since its inception we have made fifteen investments using Motherbrain, with a total investment capital of €200M+. Companies like Peakon, Griffin, Anyfin (and more) – these are all companies that EQT Ventures has invested in after initial identification by Motherbrain.

Hyperight: The solution you work for is beneficial within the context of private capital. Is it applicable to organizations in other industries?

Armin Catovic: In principle, lessons learnt here can be universally applied in any industry or organization. However, companies serving their ML output at web-scale – i.e. with millions of users – would probably be better off relying on good old-fashioned A/B testing. For organizations with a small number of expert users (where A/B testing cannot yield credible statistical significance), and where this small group of users has high requirements on the accuracy and credibility of your output, our presentation will be highly relevant.
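
A rough calculation shows why A/B testing struggles with a small expert user base. The sketch below uses the standard normal-approximation formula for comparing two proportions. The success rates (10% versus 12%), the 5% significance level and the 80% power are illustrative assumptions, not figures from EQT.

```python
import math
from statistics import NormalDist


def users_needed_per_group(p_control, p_variant, alpha=0.05, power=0.80):
    """Approximate users per group for a two-sided two-proportion z-test.

    Uses the common normal approximation:
        n = (z_{1-alpha/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p1 - p2)^2
    """
    if p_control == p_variant:
        raise ValueError("The two rates must differ to compute a sample size.")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    variance = p_control * (1 - p_control) + p_variant * (1 - p_variant)
    effect = p_control - p_variant
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical rates: 10% of suggestions acted on today, 12% with a new model.
n = users_needed_per_group(0.10, 0.12)
print(f"Users needed per group: {n}")
```

Under these assumptions the test needs several thousand users in each group, which a web-scale service can supply easily but a team serving a few hundred investment professionals cannot. That gap is what pushes such teams towards interviews, first-hand feedback and other qualitative signals.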

Hyperight: Based on the lessons learned from this journey, what are your recommendations for those interested in improving the credibility of the ML models? What should they pay attention to?

Armin Catovic: You just have to listen to our talk at the Data Innovation Summit in May this year! But we can tease out a few important takeaways:

Continuously re-visit and refine who your users/customers are, their behaviors, and what their core problems are
Build trust in your data quality – both with engineering efforts and with clever UX solutions
Build trust in your models – make your assumptions clear, work on explainability, and find optimal ways for your users/customers to engage with your models

Hyperight: According to you, what AI trends can we expect in the upcoming 12 months?

Armin Catovic: We are currently experiencing unprecedented popularity of large-scale, multimodal, and generative AI, as popularized by Stable Diffusion, ChatGPT, and the like. While the underlying principles powering these models have been around for a while, it’s really the method of delivery, and particularly the focus on human-in-the-loop, that has given them a jolt in popularity.

ChatGPT is on every venture capitalist’s lips right now, and various startups are bubbling up, providing API or interaction layers on top of these models. So over the next 12 months we are likely going to see some risers (and fallers!) of systems leveraging generative AI. There are some concrete use cases, such as leveraging generative AI as part of a creative process, for example, during concept sketching and ideation, or during image and video editing. General design and advertising sectors may see a significant disruption here. These generative systems inherently rely on “prompts” (either textual prompts or image “seeds”), so we are likely to see new methods in prompt engineering and prompt tuning (and perhaps even new tech roles, such as a “senior prompt engineer”!).

There is a question of whether the non-deterministic prompting methods will be accepted by users, especially those who are used to static/fixed interfaces (e.g. icons in a toolbar). At some point, questions surrounding the scalability, and the legal/copyright and ethical uses of these tools will come into play, and we may start edging towards the “trough of disillusionment” (as per Gartner).
