Would You Like to Start with MLOps? - Interview with Stefano Bosisio, Trustpilot

Would you like to start with MLOps? If the answer is yes, then welcome to this interview with Stefano Bosisio, a Machine Learning Engineer from Trustpilot. Stefano is a seasoned expert in Machine Learning Operations (MLOps) and scaling ML into production and is confirmed to present at the eighth edition of the annual Data Innovation Summit in Stockholm.

In this interview, he shares his insights and expertise on how organizations can overcome the challenges of incorporating machine learning into their operations. So, let’s dive into MLOps and scaling ML into production with Stefano.

Hyperight: Can you tell us more about yourself and your organization? What are your professional background and current working focus?

Stefano Bosisio: Sure! I am Stefano and I am a Machine Learning Engineer at Trustpilot. As you can guess, I am Italian, but I live in Edinburgh, Scotland. I moved here in 2014 for a Ph.D. in computational chemistry. In my research, I combined statistical learning, statistical physics, and computer science to tackle the drug discovery problem. Those were 4 amazing years, in which I published several papers and methods aimed at finding new drugs more efficiently and robustly. After that experience, I moved to the data science field and worked for 2.5 years at a FinTech company.

During those years, I had the great opportunity to start creating the very first ML platform within the company (and actually in the FinTech sector), based on AWS. They were fantastic years, in which I learned so much about what data scientists need, what it means to meet business targets, and how to create ML platforms and models and see them improve the company’s revenues. After that, I moved to Trustpilot and, after almost 3 years, here I am! At Trustpilot, we are working hard to be the most trusted and used consumer review brand globally. This strategic focus puts a lot of pressure on our ML capabilities. In less than 3 years, we managed to implement new MLOps practices, delivering great products that help data scientists improve their route to production and their model development journey. Our use of data within Trustpilot is massive, ranging from scam/spam detection and platform misuse to NLP pipelines, topic modelling, sentiment analysis, and relevance ranking.

Hyperight: During the Data Innovation Summit 2023, you will share more on “Would you like to start with MLOps? Lessons learned from Trustpilot”. What can the delegates at the event expect from your presentation?

Stefano Bosisio: Well, I think that nowadays MLOps is becoming more and more of a buzzword. What I would like to give is a practical example of applying the MLOps “philosophy” on a business-wide scale. In particular, I’ll talk about how we integrated MLOps practices within our frameworks, and I will show a practical application and architecture for one of our scam/spam models. Along the way, we learned 3 major lessons about applying MLOps in a business. The process hasn’t been smooth: we had accidents during the integration phase, and even in production.

Hyperight: Trustpilot is a worldwide business review platform that processes more than 4 million reviews monthly. That means the company has large quantities of data. Tell us more about the company and what led you to consider implementing architectural solutions in Trustpilot.

Stefano Bosisio: Trustpilot’s value proposition is to unpack the benefits of reviews on both the consumer and the business side. Consumers can trust us because we host only genuine reviews on the platform, while businesses can use us as a reference point to investigate their customers’ feelings and gain more product insights from their reviews. We started considering implementing more MLOps practices 3 years ago. At that time, Trustpilot was evolving from a startup-like company into a public company. During the startup phase, data scientists were creating models, but the road to deployment was very hard and manual. On average, they needed 9 months to get a model into production. The transition was the turning point for our business: at all levels, it was decided that we needed to scale up our data science setup and ease the model journey from prototyping to deployment. After we learned together what we needed, and what the business needed, we started implementing solutions that decreased the deployment time from 9 months to weeks and improved the entire data life cycle, from data retrieval to model retirement. This is reflected in the number of fake reviews we can filter, the increasing number of customers we can support, and the increasing number of reviews we receive every month.

Hyperight: Can you tell us more about the journey to build MLOps infrastructure?

Stefano Bosisio: Sure thing! Applying MLOps within our framework hasn’t been easy. We started 3 years ago, with a lot of work and study to understand what other companies were doing in this field. At the same time, we began a tighter collaboration with our data scientists and started drafting solutions and POCs. We learned that we needed constant check-ins, on our side and on our data scientists’ side, to stay up-to-date with the latest requests and problems. From there, we started to create the building blocks of the entire MLOps infrastructure, e.g. online and offline feature stores for smooth feature retrieval and real-time streaming systems built on GCP Dataflow. We then evolved further by developing SDKs that allow the creation of tracking, training and deployment pipelines. None of these products came immediately. They required time to carefully tune all the features to meet our business needs, and they went through a lot of simplification so that we could deliver ready-to-use products to data scientists, without learning barriers.

Two laptops with programming language and statistics — Image credits: KaikaTaaK on Envato Elements

Hyperight: When you built the MLOps infrastructure for the platform, the teams of ML engineers, data scientists and data engineers worked closely together, listening to the business needs. How was this helpful, and what are some learning points from this experience of working with diverse teams?

Stefano Bosisio: You’ll see in my talk that this is one of the most important lessons we have learned so far. Working together and listening to each other is extremely helpful for delivering efficient MLOps tools and helping your entire business to grow. Daily meetings and weekly updates were key to building up the infrastructure and testing it out. Demo sessions and tutorials were pivotal in allowing data scientists to start building pipelines on their own and deploying models. We learned how to simplify our products, and we learned the importance of delivering tutorials and keeping documentation up-to-date so that everyone was on the same page. Again, this is not an easy thing to do, and it requires time, patience, and a lot of meetings to achieve good coordination across all the teams.

Hyperight: What challenges with data and infrastructure did you experience?

Stefano Bosisio: I think that, being a wide business, we had data scattered all over the place. This was one of the major challenges to address, so we helped data scientists get fresh data directly from ground truth sources and onboarded them using feature stores. A second challenge was to find the best way to serve real-time predictions. This is not just one problem, namely finding a real-time streaming platform. It also involves serving data efficiently to our models, so that latency stays within a millisecond-level threshold. This challenge required a great engineering effort, from building an online feature store to coordinating multiple processes at the same time and tuning our platform to run processes in real time. The final challenge was communication. Having clear communication, regular meetings, and catch-ups, as well as translating complicated concepts into simple terms, was a challenge we had to tackle to improve collaboration across all the various data science teams.
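To make the real-time serving problem described above more concrete, here is a minimal Python sketch of an online feature lookup followed by a prediction, with the end-to-end latency measured in milliseconds. Everything in it is an assumption made for illustration: the `OnlineFeatureStore` class, the feature names, and the `toy_scam_score` rule are hypothetical and are not Trustpilot's implementation, and a real online store would be a networked low-latency database rather than a Python dictionary.

```python
import time
from typing import Callable, Dict, List, Tuple


class OnlineFeatureStore:
    """Hypothetical in-memory stand-in for an online feature store."""

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, float]] = {}

    def put(self, entity_id: str, features: Dict[str, float]) -> None:
        # Overwrite with the freshest values, as a streaming job might do.
        self._rows[entity_id] = dict(features)

    def get(self, entity_id: str, names: List[str], default: float = 0.0) -> List[float]:
        # Return the features in a fixed order; unknown entities get defaults.
        row = self._rows.get(entity_id, {})
        return [row.get(name, default) for name in names]


def predict_with_latency(
    store: OnlineFeatureStore,
    model: Callable[[List[float]], float],
    entity_id: str,
    names: List[str],
) -> Tuple[float, float]:
    """Fetch features, score them, and report the elapsed time in ms."""
    start = time.perf_counter()
    features = store.get(entity_id, names)
    score = model(features)
    latency_ms = (time.perf_counter() - start) * 1000.0
    return score, latency_ms


def toy_scam_score(features: List[float]) -> float:
    # Made-up rule for illustration only: flag very new, very active accounts.
    reviews_last_hour, account_age_days = features
    return 1.0 if reviews_last_hour > 20 and account_age_days < 1 else 0.0


if __name__ == "__main__":
    store = OnlineFeatureStore()
    store.put("reviewer-1", {"reviews_last_hour": 30.0, "account_age_days": 0.5})
    names = ["reviews_last_hour", "account_age_days"]
    score, latency_ms = predict_with_latency(store, toy_scam_score, "reviewer-1", names)
    print(f"score={score} latency_ms={latency_ms:.4f}")
```

The point of the sketch is that the latency budget covers both the feature lookup and the model call, which is why the feature store and the model serving have to be tuned together.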

Hyperight: Based on the lessons learned from building the MLOps infrastructure for Trustpilot, what solutions do you propose for organizations to tackle MLOps issues?

Stefano Bosisio: This is a tough question! I don’t think there’s an easy answer to tackling MLOps issues within an organization. I think that, first of all, you have to understand whether you need ML and MLOps. Sometimes businesses just need a good pipeline to process data and turn that data into a dashboard. Then, if you need ML and you want to tackle MLOps, start listening to your stakeholders and data scientists. What do they need? What are they struggling with? What part of the model development/deployment journey is the most painful? Highlight and isolate the problems and start tackling those pain points. Start simple with a proof of concept. Then, read a lot of docs and don’t immediately trust solutions from the major cloud providers. Do your research and find the tools that may alleviate your pain. Finally, translate these tools into simple products (e.g. SDKs) that data scientists can easily and immediately use. You’ll see that everything gets easier from there, and further problems will be solved along the way.

Hyperight: According to you, what AI trends can we expect in the upcoming 12 months?

Stefano Bosisio: One trend is evident: large language models (LLMs). I can see that there’ll be a lot of new models next year with impressive capabilities (just think of a possible GPT-4). From there, I think LLMs could be used to create realistic reviews, which could in turn be used to train new models for different purposes, from scam/spam detection to topic modelling and NLP analyses. Generative models are another trend in 2024. We’ll see new companies based on these models, especially marketing and advertising companies, which could get a lot of help in tailoring their graphics and campaigns. I am even expecting some synthetic voices and music! Another theme is ethical AI. I’m not certain whether it will be a trend in the next 12 months, but I hope so. We always need something to grow awareness of the ethical use of AI. I guess some packages like LIME and SHAP may be further refined next year, giving more insight into complicated neural networks and helping to check for possible bias and unethical behaviour.
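As a rough illustration of the kind of bias check mentioned above, the sketch below computes exact Shapley values, the attribution idea that the SHAP package builds on, for a tiny toy model. It does not use the SHAP or LIME libraries or their APIs. It is a brute-force enumeration that is only practical for a handful of features, and the `toy_model` weights and the meaning given to the third feature are assumptions made for illustration.

```python
from itertools import combinations
from math import factorial
from typing import Callable, FrozenSet, List, Sequence


def shapley_values(
    model: Callable[[List[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley attribution of model(x) - model(baseline) to each feature."""
    n = len(x)

    def value(subset: FrozenSet[int]) -> float:
        # Features in `subset` take their real value; the rest stay at baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(features: List[float]) -> float:
    # Hypothetical scoring rule. Assume the third feature is an attribute
    # the model should NOT rely on, yet it carries the largest weight.
    return 0.5 * features[0] + 1.0 * features[1] + 2.0 * features[2]


if __name__ == "__main__":
    attributions = shapley_values(toy_model, [1.0, 1.0, 1.0], [0.0, 0.0, 0.0])
    for index, contribution in enumerate(attributions):
        print(f"feature {index}: {contribution:.3f}")
```

A large attribution on a feature that should be irrelevant, like the third one in this toy setup, is the sort of signal such tools can surface for further review.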

Featured image credits: coffeekai on Envato Elements