Through the events that Hyperight organizes, we have had a unique chance to follow many companies through their data and AI-driven transformation. EQT is no exception. This year, we’ll have another opportunity to hear a success story from their team at the Data Innovation Summit. There, Armin Catovic, Senior Machine Learning Engineer, and Anna Martin, Senior Data Scientist and Team Lead at EQT Partners, will share their key findings from applying machine learning (ML) within the venture capital and private equity sector.
Some of those insights are shared in this interview. One key point is that building trust is central to building conviction and credibility in ML solutions. The methods EQT used to build trust benefited the company in several ways. Which ways? Read the interview, which we hope will inspire you to follow Armin’s and Anna’s presentation and EQT’s journey with ML solutions this May in Stockholm.
Hyperight: Can you please tell us more about yourself and your organization? What are your professional background and current working focus?
Armin Catovic: Anna and I work with EQT’s proprietary, AI-driven investment platform, Motherbrain. We work every day to build a product that gives our investment professionals superpowers. Part CRM (customer relationship management), part machine learning platform, Motherbrain is a product at the intersection of AI and a great user-friendly interface that our investment professionals use to identify, assess and manage investment opportunities.
As for the two of us, we both share the common thread of being data scientists, but currently have slightly different roles. I am at present working as a machine learning engineer, helping standardize how we build and deploy our ML pipelines, with particular focus on our ranking models used for sourcing startups. Anna on the other hand has stepped up as a product lead to help optimize our investment professionals’ workflows.
Hyperight: During the Data Innovation Summit 2023, you will share more on “Building Conviction and Credibility in ML-Based Solutions”. What can the delegates at the event expect from your presentation?
Armin Catovic: Our presentation distills some of our key findings from applying machine learning within the venture capital and private equity sector. That said, the takeaways are applicable to any organization that has a set of users/customers with deep expertise, i.e. non-web-scale consumers.
When machine learning well and truly took off in the industry six to seven years ago, there was a preconception that you simply needed to hire data scientists and you were good to go. After a while, companies also realized that data acquisition, transformation, and storage were pivotal, so they started hiring data engineers. Shortly thereafter, they realized that ML systems also require careful lifecycle management, so MLOps was invented. The end result is now a well-established picture of ML as a pipeline – a piece of engineering ingenuity.
Ultimately, however, continuous iteration and distillation of business and user understanding is key. While data and ML engineers are still at the core, designers, user researchers, ambassadors, and careful product and relationship management are critical to building conviction among users. We hope our audience will walk away with a more holistic understanding of what is needed to build impactful ML and data-driven solutions for their business – something beyond the traditional pipeline view.
Hyperight: What made EQT work on increased confidence and credibility in ML models and ML-based solutions?
Armin Catovic: Building conviction and credibility means building trust. Trust is of existential importance to us – if our users (who constitute the investment professionals within the deal teams) don’t trust what our ML/data solutions provide, they will simply replace us with external consultants, who will throw in many resources and undertake long hours of traditional research in order to meet their perceived level of trust.
This is quite different from the ML/data solutions within e-commerce or streaming services. Even though they aim to improve retention and engagement with better recommendations and predictions, they can afford to show irrelevant data (up to a point) – their users will continue using their core services, i.e. listening to music, watching TV series, and purchasing goods. We, on the other hand, have an extremely small margin for error, and any error dents our users’ trust.
Hyperight: What does the journey of instilling confidence and credibility look like? Can you guide us through it?
Armin Catovic: This journey is effectively the scope of our presentation. However, we can say now that the journey starts and ends with understanding our users and their core problems. In other words, we have had to keep re-visiting our users and updating our assumptions based on their evolving needs and behaviors.
Motherbrain began as a platform for sourcing early-stage startups and has now evolved to include an array of use cases such as building conviction for a much smaller set of companies, enabling value-add for our portfolio companies via add-on acquisitions, as well as providing a mapping and a better understanding of the markets to our users. In terms of a data science analogue, our journey started with a focus on sensitivity (recall), and has now also shifted into prioritizing specificity (precision).
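To make the recall-versus-precision analogy concrete, here is a minimal sketch using made-up counts (not EQT data). It assumes 1,000 screened companies of which 20 are truly relevant, and compares a hypothetical "broad" sourcing setting with a hypothetical "narrow" conviction setting. Note that specificity and precision are distinct metrics, so the sketch reports both.

```python
def rates(tp, fp, fn, tn):
    """Return recall, precision and specificity from confusion-matrix counts."""
    return {
        "recall": tp / (tp + fn),        # share of relevant companies that were found
        "precision": tp / (tp + fp),     # share of flagged companies that are relevant
        "specificity": tn / (tn + fp),   # share of irrelevant companies correctly ignored
    }

# Hypothetical counts: 1,000 companies screened, 20 of them truly relevant.
# "broad" flags many companies so that few relevant ones are missed.
broad = rates(tp=18, fp=182, fn=2, tn=798)
# "narrow" flags few companies so that most flagged ones are relevant.
narrow = rates(tp=10, fp=5, fn=10, tn=975)

for name, r in (("broad", broad), ("narrow", narrow)):
    print(name, {k: round(v, 3) for k, v in r.items()})
```

In this illustration the broad setting misses almost nothing but buries users in irrelevant results, while the narrow setting misses more but shows results users can act on.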
Hyperight: What methods have you used to improve the trust and credibility of ML solutions?
Armin Catovic: From a data and ML engineering standpoint, we’ve had to double down on improving our data quality. This meant analyzing the coverage and accuracy of various data sources, and becoming smarter about how we prioritize data points. Revisiting our models and features from the ground up helped us find inconsistencies and bugs that had crept in over time. Finally, increasing the amount of first-hand, gold-standard data from our actual users has helped significantly. All of these served to improve the precision of our insights and model predictions.
While engineering forms a centerpiece of these efforts, our UX designers and product management have been absolutely indispensable. Conducting interviews, iterating mock-ups, and analyzing how Motherbrain is being used has helped us create a more detailed breakdown of our users and their behaviors, and was instrumental in creating a concrete product roadmap.
Hyperight: Were there any challenges you faced during the journey of building conviction and credibility in ML-based solutions?
Armin Catovic: This is of course an ongoing journey, but building trust is the one big challenge, especially when expanding the product to new user groups. When working with such large amounts of data, you have to answer several questions: How do we make insights and predictions that are intelligent enough to bring true value, while still making it transparent to the user what data is used? What is the quality of that data, and what assumptions are being made?
Hyperight: Can you share some of the benefits for the organization with this improved ML solution?
Armin Catovic: We truly believe that Motherbrain has been a game changer for our organization. Today, almost every step of the investment process is supplemented by ML, making us quicker, smarter and ultimately better investors. It’s not that ML will ever replace human investors, but we would argue that an investor with ML is always better than one without.
That makes it hard to narrow down to just a few benefits. But let us share some tangible examples of companies that we have sourced through Motherbrain. Since its inception we have made fifteen investments using Motherbrain, with a total investment capital of €200M+. Companies like Peakon, Griffin, Anyfin (and more) – these are all companies that EQT Ventures has invested in after initial identification by Motherbrain.
Hyperight: The solution you work for is beneficial within the context of private capital. Is it applicable to organizations in other industries?
Armin Catovic: In principle, lessons learnt here can be universally applied in any industry or organization. However, companies serving their ML output at web-scale – i.e. with millions of users – would probably be better off relying on good-old-fashioned A/B testing. For organizations with a small number of expert users (where A/B testing cannot yield credible statistical significance), and where this small group of users has high requirements on the accuracy and credibility of your output, our presentation will be highly relevant.
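A rough calculation shows why A/B testing struggles with a small expert user base. The sketch below uses the standard normal-approximation sample-size formula for comparing two proportions. The function name `users_per_arm` and the baseline and improved rates (10% versus 12%) are assumptions chosen purely for illustration, not figures from EQT.

```python
from math import ceil
from statistics import NormalDist

def users_per_arm(p_control, p_variant, alpha=0.05, power=0.80):
    """Approximate users needed in each arm of a two-sided two-proportion test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)           # critical value for the desired power
    variance = p_control * (1 - p_control) + p_variant * (1 - p_variant)
    effect = p_variant - p_control
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)

# Hypothetical rates: 10% of recommendations acted on today, 12% with a new model.
n = users_per_arm(0.10, 0.12)
print(n)
```

With these assumed rates the requirement runs to thousands of users per arm, which is far beyond a user base of a few hundred investment professionals.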
Hyperight: Based on the lessons learned from this journey, what are your final recommendations for those interested in improving the credibility of the ML models? What should they pay attention to?
Armin Catovic: For the final recommendations, you will just have to listen to our talk at the Data Innovation Summit in May this year! But we can tease out a few important takeaways:
- Continuously re-visit and refine who your users/customers are, their behaviors, and what their core problems are
- Build trust in your data quality – both with the engineering efforts, but also with clever UX solutions
- Build trust in your models – make your assumptions clear, work on explainability, and find optimal ways for your users/customers to engage with your models
Hyperight: According to you, what AI trends can we expect in the upcoming 12 months?
Armin Catovic: We are currently experiencing an unprecedented popularity in large-scale, multimodal, and generative AI, as popularized by Stable Diffusion, ChatGPT, and the like. While the underlying principles powering these models have been around for a while, it’s really the method of delivery, and particularly the focus on human-in-the-loop that has given them a jolt in popularity.
ChatGPT is on every venture capitalist’s lips right now, and various startups are bubbling up, providing API or interaction layers on top of these models. So over the next 12 months we are likely going to see some risers (and fallers!) of systems leveraging generative AI. There are some concrete use cases, such as leveraging generative AI as part of a creative process, for example, during concept sketching and ideation, or during image and video editing. General design and advertising sectors may see a significant disruption here. These generative systems inherently rely on “prompts” (either textual prompts or image “seeds”), so we are likely to see new methods in prompt engineering and prompt tuning (and perhaps even new tech roles, such as a “senior prompt engineer”!).
There is a question of whether the non-deterministic prompting methods will be accepted by users, especially those users that are used to static/fixed interfaces (e.g. icons in a toolbar). At some point, questions surrounding the scalability, and legal/copyright and ethical uses of these tools will come into play, and we may start edging towards the “trough of disillusionment” (as per Gartner).