Scaling the Role of AI and ML for a Fintech Company: Interview with Andrew Wu and Eliisabet Hein

The financial sector and fintech companies were probably among the earliest adopters of machine learning (ML) and artificial intelligence (AI), deploying them to improve performance and enhance the quality of the services and products offered to consumers. Today, as technology and regulation develop, there is an opportunity for new improvements and innovations.

One of the new sources of innovation shaping the financial sector, especially the banking industry, is open banking. Because of the data and services exchanged between financial institutions and third-party providers, there are challenges that every fintech company faces.

What are those challenges? How can they be met and overcome? What is the importance of the team in this process? We talked with the Tink data experts – Andrew Wu, Machine Learning Tech Lead and Eliisabet Hein, Data Scientist. They are among the speakers at the 7th edition of the Data Innovation Summit.

Hyperight: Can you introduce yourself, your professional background and current working focus?

Andrew Wu: I was born and raised in China and spent my first eighteen years there before attending college in Finland and later completing a master’s degree in Sweden. During my thirteen years in the industry, I started as a software developer, gradually transitioned to data engineering, and am now a machine learning engineer. I see myself as a backend generalist, as I have worked with different technologies, from building websites to infrastructure as code. I did my master’s thesis on the topic of Machine Learning and contributed to several ML-related projects at Eniro, US Bank, Swedbank, and H&M. Currently, I am working with a team of brilliant engineers to support and simplify the go-to-production process for data scientists at Tink.

Eliisabet Hein: I’m originally from Tallinn, Estonia, but I moved to Scotland to study Computer Science & AI at the University of Edinburgh for my bachelor’s. The machine learning courses I took as part of this degree really captivated me, so I continued my studies in the Machine Learning master’s program at KTH. I started at Tink as a thesis student, and continued as a full-time Data Scientist in the Enrichment Categorization team, where our focus is on identifying different spending and income categories to empower people to better manage their finances. I’ve also always been interested in languages and linguistics, so I’m very lucky to be working with NLP in a professional capacity.

Hyperight: During the Data Innovation Summit 2022, you will share more on the topic “Scaling ML in one of Europe’s hottest fintech companies”. Can you tell us what the delegates can expect from the presentation?

Andrew Wu: As the first dedicated machine learning professional Tink hired, I have had to make a lot of decisions, from tooling and architecture to who to hire. All of those decisions have long-term effects when scaling with Tink, a rapidly growing fintech company. In this talk, we want to share our journey at Tink through those decisions: the reasons behind them, whether they turned out to be good decisions, and how we evolved from there. A lot of companies are now investing in data-driven decisions and data products. I hope this talk can be useful for people who are about to start, or are already on, the same journey.

Hyperight: To start with, can you tell us about Tink, what open banking is and what’s the role of AI and ML in the banking industry and the fintech companies?

Andrew Wu: There is an open banking definition out there for both the EU and other parts of the world. In simple terms, open banking is the exchange of data and services between financial institutions and third-party providers, allowing companies and developers to build services and applications for the banks and end-users.

Tink is a fintech startup and a pioneer in open banking, now part of VISA. Tink is the most robust open banking platform, with the broadest, deepest connectivity and powerful services that create value out of financial data.

When it comes to AI and ML, the banking industry is probably one of the early adopters. Banks were doing financial modeling, fraud detection, and anti-money laundering with mathematics and data before the term “data science” existed. However, there is still a lot of room for improvements and innovations, especially for end-users. At Tink, ML powers data products directly and indirectly. For example, the transaction categorization model and the recurring transaction model help us understand our purchase patterns, predicted balances, savings goals, and ability to borrow. With ML/AI, those features of private banking previously reserved for wealthy individuals are becoming available to everyone.

Hyperight: In the summary of your presentation, you mentioned the growth of the company you work for, starting with no data scientists and only one handcrafted data model, to becoming a company that offers an ML product. What are the reasons behind this success?

Andrew Wu: A tiny correction there: Tink today offers a portfolio of data products that are directly or indirectly powered by ML, including Income Check, Risk Insights, and Money Management. It is the collective effort of everyone involved that made us what we are today. We have been good at creating sustainable MVPs that are based on data, but not necessarily powered by machine learning. We do use the opportunity to design flexible APIs and feedback systems that will help us evolve into a machine learning solution in the end.

Eliisabet Hein: I agree. I think to be successful in a commercial environment, you have to be pragmatic, and design a simple MVP for a new product while keeping in mind a more complex ML solution down the road. The categorization product is a good example of that, from its beginnings as simple rule-based systems, to the automated pipeline that trains new neural network models we have now. Having a feedback loop from end-users has been crucial to know where our models make mistakes and be able to adjust for them, as well as collect new training and test data over time.
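To make the pattern described above concrete, here is a minimal sketch of a rule-based MVP sitting behind a stable interface, with a feedback log that could later serve as training and test data. It is an illustration only, not Tink's implementation: the names `Categorizer`, `RuleBasedCategorizer`, `CategorizationService`, the keyword rules, and the category labels are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Categorizer(Protocol):
    """Hypothetical interface: any model, rule-based or ML, exposes predict()."""

    def predict(self, description: str) -> str: ...


@dataclass
class RuleBasedCategorizer:
    """Simple MVP: the first keyword found in the description decides the category."""

    rules: dict[str, str]  # keyword -> category (illustrative values only)
    default: str = "uncategorized"

    def predict(self, description: str) -> str:
        text = description.lower()
        for keyword, category in self.rules.items():
            if keyword in text:
                return category
        return self.default


@dataclass
class CategorizationService:
    """Wraps a model and logs end-user corrections for later retraining."""

    model: Categorizer
    feedback: list[tuple[str, str, str]] = field(default_factory=list)

    def categorize(self, description: str) -> str:
        return self.model.predict(description)

    def record_correction(self, description: str, corrected: str) -> None:
        # Store (description, predicted, corrected) only when the model was wrong.
        predicted = self.model.predict(description)
        if predicted != corrected:
            self.feedback.append((description, predicted, corrected))


service = CategorizationService(
    RuleBasedCategorizer({"grocery": "food", "rent": "housing"})
)
print(service.categorize("GROCERY STORE 123"))
service.record_correction("Corner cafe", "food")
print(service.feedback)
```

Because callers depend only on `predict()`, the rule-based model could in principle be replaced by a trained one without changing the service, and the logged corrections show where the current model makes mistakes.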

Hyperight: What are the key challenges a fintech company can face regarding data?

Andrew Wu: This can differ from company to company. At Tink, the complexity comes from data ownership. Our partners and their end users are the owners of their data, which means what we can do with it is very limited. Another challenge, probably shared across all companies, is GDPR, especially the “right to be forgotten”. This means it is impossible to save training data in order to repeat the training process.

A team member of a Fintech company doing analysis — Photo by Scott Graham on Unsplash

Hyperight: You emphasize the importance of scaling the team to overcome the barriers a fintech company can have. Can you tell us how to scale a team and how you’ve scaled your data team?

Andrew Wu: At Tink, all teams constantly have their hands full, and this applies to my team even more so. When I was asked what kind of profile we wanted to hire as an ML Engineer, I wrote down the things I thought mattered for this team now and in the future, for example, data engineering, machine learning, and infrastructure as code. This proved to be the wrong strategy. Instead of hunting for superstars, we broke this profile down into three: infrastructure engineer/SRE, data engineer, and software engineer. This split made recruitment much easier and left room for future development into the field of ML.

Hyperight: What would be your recommendations to those who are just starting to look into this topic, where should they start, and what should they pay attention to?

Andrew Wu: If your job, like mine, is to support ML products and data scientists, I would suggest talking to your data scientists as step one, and understanding where they are situated on the engineer-to-researcher spectrum. This decides what tools and environments they prefer, and what level of engineering support is needed. Another piece of advice is to find “good enough” solutions. Most ML products do not need Apache Spark or distributed training from day one, and renting a “NASA” machine from a cloud provider for a short period is more cost-effective than maintaining a cluster.

Eliisabet Hein: For a data scientist coming into a commercial setup from academia, at least in my experience, it can be a big change to take scalability and speed requirements into account when designing ML solutions. State-of-the-art models can often have millions of parameters and require heavy computational resources even to predict on, let alone train the models, so they might not work well in a complex production system processing millions of transactions, where real-time speed is an important factor. However, there are great light(er)-weight models, as well as tricks that many practitioners use out there, that I would recommend as extremely useful.

Hyperight: What’s the best advice you’ve received during your career, and what would be your advice for new data enthusiasts?

Eliisabet Hein: I think the best advice I’ve ever gotten is to start with the simplest possible approach and build up from that. It can be tempting to jump straight into a complicated state-of-the-art deep neural network model, but it’s much easier to find issues with your data and understand the constraints of the problem if your initial model is simple. And you’ve probably heard this many times before, but I would repeat that making sure you understand your dataset is probably the single most important thing, especially in the industry where the real-world data you’re working with can often be noisy, biased, or incomplete.

Featured image credits: Towfiqu barbhuiya on Unsplash

