Machine learning has proven highly effective at detecting fraud. Fraudulent transactions typically have certain characteristics that differentiate them from legitimate ones. With their ability to process large amounts of data, machine learning algorithms can find patterns in transaction data and detect fraud. However, fraudsters are catching up and are making extra efforts to cover up their activities and make them look like legitimate transactions. Graph machine learning helps organisations go a step further in discerning suspicious transactions.
Dan Saattrup Nielsen, Former Machine Learning Consultant at the Danish Business Authority, currently Research Associate in Machine Learning at the University of Bristol, will talk about how they use graph analytics and machine learning to find tax fraudsters, money launderers and other illegal activity during his session at the NDSML Summit 2021. We invited Dan beforehand to share more about how graph machine learning can help detect and prevent fraud, his views on fighting bias and narrowing the talent gap, as well as some future trends in machine learning.
Hyperight: Hi Dan, we are super excited to have you as a speaker at the NDSML Summit 2021. As a start, please tell us a bit more about yourself, your background and your role at the Danish Business Authority.
Dan Saattrup Nielsen: Thank you Ivana, I am also really excited to talk at the summit.
I come from a maths background, I did both my undergraduate and master’s degrees in Mathematics at the University of Copenhagen, and then moved on to do a PhD at the University of Bristol, also in Mathematics. Since then I switched gears a bit, to data science and machine learning, which included working for a couple of startups, and eventually ending up at my current job at the Danish Business Authority.
The Danish Business Authority is the government body that deals with all businesses in Denmark. This includes making it as easy as possible for a company to get set up and receive the help it might need, as well as making sure that companies are not involved in fraudulent activities.
At the authority, I am part of the Machine Learning Lab, which consists of a group of 15 data scientists. The lab is responsible for testing out ideas and building proof-of-concept models, which are then handed over to an ML ops team responsible for putting the models into production.
Graph Machine Learning
Hyperight: At the NDSML Summit 2021, you will be talking about Case Study on Graph Machine Learning for Fraud Detection. Why does graph structure make it easier to detect tax fraud or money laundering, compared to other machine learning methods?
Dan Saattrup Nielsen: When it comes to areas such as tax fraud and money laundering, the fraudsters are often working hard to make it seem like they are not doing anything out of the ordinary. For this reason, isolated company-specific features might not give a strong signal to a model trying to predict fraud. Instead, we have to try to work with the data that they are not able to manipulate as easily: the networks.
If we zoom out and look at the big picture, patterns might emerge which can be indicative of certain types of fraud, and these are precisely the patterns we would like our models to detect. So the focus is less on the individual company and more on the relations between companies (and other entities as well). In a graph database, relations are first-class citizens, unlike in traditional relational databases. This makes it a lot easier for us (and our models) to spot these patterns.
I would also like to emphasise that there is no real competition between the graph approach and “normal” machine learning approaches. In every machine learning pipeline, we have a feature engineering stage, where we try to build features that can hopefully be good predictors for what we are interested in, which in our case is usually fraud. The graph structure thus simply allows us to extract relevant relational features from the network of the companies that we are interested in. With these features at hand, the regular machine learning tools can then be applied, as usual.
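To make the idea of relational features concrete, here is a minimal sketch in plain Python. The edge list, the set of known fraud cases and the feature names are all hypothetical and chosen for illustration; the interview does not say which features the lab actually uses.

```python
from collections import defaultdict

# Hypothetical toy network: each pair links two companies, e.g. through
# a shared director or owner.
edges = [
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("C", "D"), ("D", "E"),
]
# Hypothetical labels: companies already known to be fraudulent.
known_fraud = {"B"}


def build_adjacency(edge_list):
    """Turn an undirected edge list into a node -> set-of-neighbours mapping."""
    adjacency = defaultdict(set)
    for u, v in edge_list:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def relational_features(node, adjacency, flagged):
    """Compute a few simple network features for a single node."""
    neighbours = adjacency[node]
    degree = len(neighbours)
    flagged_neighbours = len(neighbours & flagged)
    # Count pairs of neighbours that are themselves linked,
    # i.e. triangles passing through this node.
    triangles = sum(
        1
        for u in neighbours
        for v in neighbours
        if u < v and v in adjacency[u]
    )
    return {
        "degree": degree,
        "flagged_neighbours": flagged_neighbours,
        "flagged_share": flagged_neighbours / degree if degree else 0.0,
        "triangles": triangles,
    }


adjacency = build_adjacency(edges)
feature_table = {
    node: relational_features(node, adjacency, known_fraud)
    for node in sorted(adjacency)
}
for node, feats in feature_table.items():
    print(node, feats)
```

Each row of `feature_table` could then be joined with ordinary company-level features and passed to whatever classifier is already in use.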
I should note that feature engineering and model training can also be combined in what are called graph neural networks, but the idea is still the same: the first part of the neural network extracts the relational features, and the latter part is the traditional machine learning model being trained on those features.
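As a rough illustration of that first, feature-extracting part, the sketch below implements a single message-passing layer in plain Python: every node averages its neighbours' features, combines that average with its own features, and applies a ReLU. The graph, the input features and the weight matrices are hypothetical and fixed by hand; in a real graph neural network the weights would be learned together with the prediction layers that follow.

```python
# Hypothetical toy graph: node -> list of neighbours (every node has at least one).
neighbours = {
    "A": ["B", "C"],
    "B": ["A"],
    "C": ["A"],
}
# Hypothetical two-dimensional input features per node.
features = {
    "A": [1.0, 0.0],
    "B": [0.0, 2.0],
    "C": [4.0, 2.0],
}
# Illustrative fixed weights; a trained network would learn these.
W_SELF = [[1.0, 0.0], [0.0, 1.0]]
W_NEIGH = [[0.5, 0.0], [0.0, -1.0]]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def message_passing_layer(node_features, graph):
    """One round of mean aggregation followed by a linear map and ReLU."""
    updated = {}
    for node, own in node_features.items():
        neigh_mean = mean_vector([node_features[n] for n in graph[node]])
        combined = [
            s + m
            for s, m in zip(matvec(W_SELF, own), matvec(W_NEIGH, neigh_mean))
        ]
        updated[node] = [max(0.0, value) for value in combined]  # ReLU
    return updated


embeddings = message_passing_layer(features, neighbours)
for node, vector in embeddings.items():
    print(node, vector)
```

The resulting `embeddings` play the role of the relational features; stacking more such layers lets information travel further across the network before the final prediction layers see it.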
Hyperight: AI and machine learning models can often be biased. How do you tackle bias in your models?
Dan Saattrup Nielsen: That’s a great question and an aspect that is often neglected in machine learning practice. I can think of three ways in which we are ensuring that our models are bias-free.
Firstly, we would never include features in our models which we would not want our models to be biased against, such as gender or name. This would at least ensure that we don’t explicitly model biases into the model. This is of course not sufficient, as there might be other features that are strongly correlated with these unwanted features, but it is a step that catches the worst.
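One possible way to look for such correlated proxy features is sketched below. It assumes the protected attribute is held out of training but kept for auditing, and it uses a plain Pearson correlation with an arbitrary cut-off. The data, the feature names and the threshold are hypothetical, and the interview does not say whether the team runs a check like this.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical audit data: the protected attribute is never fed to the model.
protected = [0, 0, 0, 1, 1, 1]
candidate_features = {
    "company_age_years": [3, 7, 5, 4, 6, 5],
    "industry_code": [1, 1, 2, 8, 9, 9],
}
THRESHOLD = 0.8  # hypothetical cut-off for "strongly correlated"

proxy_report = {
    name: pearson(values, protected)
    for name, values in candidate_features.items()
}
suspected_proxies = [
    name for name, r in proxy_report.items() if abs(r) >= THRESHOLD
]
print(proxy_report)
print("Suspected proxies:", suspected_proxies)
```

A linear correlation only catches the simplest proxies, so a flagged feature is a prompt for closer inspection rather than a verdict.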
Secondly, we always check which features our models are relying upon, using SHAP values, to see whether they are putting undesired emphasis on certain features. If this were the case, we would dig down into the individual predictions to see in what way these features are being used, and whether this is reasonable or not.
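SHAP values are built on Shapley values from game theory. The sketch below computes exact Shapley values by brute force for a hypothetical three-feature scoring function, replacing "absent" features with a baseline value. It does not use the SHAP library, which relies on faster approximations, and the model, instance and baseline are invented for illustration.

```python
from itertools import combinations
from math import factorial


def model(x):
    """Hypothetical toy scoring function with an interaction term."""
    return 2.0 * x["a"] + 3.0 * x["b"] * x["c"]


def coalition_value(subset, instance, baseline):
    """Model output when only the features in `subset` take the instance's values."""
    mixed = {
        name: (instance[name] if name in subset else baseline[name])
        for name in instance
    }
    return model(mixed)


def shapley_values(instance, baseline):
    """Exact Shapley values; the cost grows exponentially with the feature count."""
    names = list(instance)
    n = len(names)
    result = {}
    for name in names:
        others = [m for m in names if m != name]
        total = 0.0
        for size in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_feature = coalition_value(set(subset) | {name}, instance, baseline)
                without_feature = coalition_value(set(subset), instance, baseline)
                total += weight * (with_feature - without_feature)
        result[name] = total
    return result


instance = {"a": 1.0, "b": 1.0, "c": 1.0}
baseline = {"a": 0.0, "b": 0.0, "c": 0.0}
attributions = shapley_values(instance, baseline)
print(attributions)
# The attributions add up to the gap between the prediction and the baseline output.
print(sum(attributions.values()), model(instance) - model(baseline))
```

Because the attributions sum to the difference between the prediction and the baseline output, an unexpectedly large share landing on one feature is easy to spot and then investigate in individual predictions.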
Lastly, in case biases still occur which were not caught during the first two steps, we log our entire data pipeline from raw data to the model predictions, so that we can reliably say what led a model to produce a given output. So, should anything come up which was not already spotted within the team, we would know exactly how to deal with the problem and correct the bias.
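One simple way to get this kind of traceability is to record a fingerprint of the input and output of every pipeline step. The sketch below is an assumption about how such logging could look, not a description of the authority's system; the `LineageLog` class, the step functions and the data are all hypothetical.

```python
import hashlib
import json


def fingerprint(payload):
    """Stable SHA-256 hash of any JSON-serialisable object."""
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class LineageLog:
    """Records which input produced which output at every pipeline step."""

    def __init__(self):
        self.records = []

    def run(self, step_name, func, data):
        result = func(data)
        self.records.append({
            "step": step_name,
            "input_hash": fingerprint(data),
            "output_hash": fingerprint(result),
        })
        return result


def drop_missing(rows):
    return [row for row in rows if row["revenue"] is not None]


def add_ratio(rows):
    return [dict(row, debt_ratio=row["debt"] / row["revenue"]) for row in rows]


def score(rows):
    # Hypothetical stand-in for a trained model: flag high debt ratios.
    return [{"id": row["id"], "flagged": row["debt_ratio"] > 1.0} for row in rows]


raw = [
    {"id": 1, "revenue": 100.0, "debt": 250.0},
    {"id": 2, "revenue": None, "debt": 10.0},
    {"id": 3, "revenue": 80.0, "debt": 20.0},
]

log = LineageLog()
data = raw
for name, step in [("drop_missing", drop_missing), ("add_ratio", add_ratio), ("score", score)]:
    data = log.run(name, step, data)
predictions = data

# Each step's input must be exactly the previous step's output.
chain_intact = all(
    earlier["output_hash"] == later["input_hash"]
    for earlier, later in zip(log.records, log.records[1:])
)
print(predictions)
print("Lineage chain intact:", chain_intact)
```

With such a log, a questionable prediction can be traced back step by step to the exact raw data that produced it.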
Hyperight: On a more industry-related topic, one of the biggest struggles in the AI industry is the AI and ML talent shortage. As a Machine Learning professional, how do you think this challenge can be solved?
Dan Saattrup Nielsen: When I started my machine learning journey, not more than a few years ago, the main resources out there to get started were a couple of books and a handful of online courses. We’re now almost drowning in high-quality learning material, and dozens of data science degrees are becoming available at universities. So my first point would be that the “supply” of talented data scientists will grow a lot over the next couple of years.
I also think that in recent times machine learning and data science have gone through a “hype peak”, at the top of which every company wanted a slice of the data cake and started searching for data scientists. By now it is my impression that the hype has calmed down a bit: companies and people are being more realistic about what machine learning can and cannot do presently, which has probably dampened the increase in demand for data scientists.
Another trend I have noticed, which has come along with the gradual maturing of the field, is that companies are becoming more aware of what they need. Previously data scientists were expected to be “full-stack”, the concept of which is a myth more than anything, but now these skills have been broken up into more specialised positions.
One popular such position is a Data Engineer, which I understand as being the person in charge of setting up the data pipelines for the company, making sure that the databases are maintained and data is readily available. The Data Analyst will analyse this data and supply the company with valuable insights, accompanied by data visualisations. The Machine Learning Engineer would build and train machine learning models based on this data to be able to predict valuable information to the company, the MLOps Engineer would be in charge of setting up the pipelines to put these models into production, and the Machine Learning Researcher would develop new machine learning algorithms that might be useful to the company. Of course, in many cases, people are wearing multiple hats, especially in smaller companies.
But this splitting up of the Data Scientist role also makes it easier for people to specialise in a particular area, rather than having to know absolutely everything, and the companies are able to seek the specialised knowledge that they need. I think this helps bridge the talent shortage gap as well.
Hyperight: What are the machine learning trends that will mark 2021 and beyond?
Dan Saattrup Nielsen: I’m sure anything I forecast will be terribly wrong, haha. But if I were to make a guess, I would mention two things: graphs and causality.
Graph machine learning is really in its infancy still, almost exclusively using methods that were developed in NLP, so I would be really surprised if the field did not find its own graph-specific algorithms in the near future. But even algorithms aside, the mere use of graphs as feature extractors is something that can be highly useful in many areas of industry, and I am sure that the interest in graph data analytics will only continue to rise in the coming years.
As for causality, this is even more in its infancy, at least when it comes to the connection between causality and machine learning. All machine learning models today are good at detecting correlations. If you always bring your umbrella with you when it rains, a machine learning model would simply learn that umbrellas are correlated with rain. It would have no idea whether bringing an umbrella causes it to rain, or whether it is the other way around. This sort of knowledge would help give more transparency to the predictions of machine learning models, as well as help things like the reinforcement learning algorithms often used in robotics.
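The umbrella example can be played out in a small simulation. In the hypothetical world sketched below, rain causes umbrellas and never the reverse; all probabilities and the `simulate` helper are made up for illustration. The correlation is identical in both directions, so it cannot reveal which variable causes which, whereas forcing one variable (an intervention) does.

```python
import random
from math import sqrt

N_DAYS = 10_000


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate(seed, force_rain=None, force_umbrella=None):
    """Toy world where rain causes umbrellas; forcing a variable overrides it."""
    rng = random.Random(seed)
    rain = [1 if rng.random() < 0.3 else 0 for _ in range(N_DAYS)]
    if force_rain is not None:
        rain = [force_rain] * N_DAYS
    # Umbrellas are carried on 90% of rainy days and 5% of dry days.
    umbrella = [1 if rng.random() < (0.9 if r else 0.05) else 0 for r in rain]
    if force_umbrella is not None:
        umbrella = [force_umbrella] * N_DAYS
    return rain, umbrella


def rate(values):
    return sum(values) / len(values)


rain, umbrella = simulate(seed=0)
corr_rain_umbrella = pearson(rain, umbrella)
corr_umbrella_rain = pearson(umbrella, rain)
print("corr(rain, umbrella):", corr_rain_umbrella)
print("corr(umbrella, rain):", corr_umbrella_rain)

# Intervention 1: everybody carries an umbrella. The rain does not care.
rain_forced_umbrella, _ = simulate(seed=0, force_umbrella=1)
# Intervention 2: make it rain every day. Umbrella use jumps.
_, umbrella_forced_rain = simulate(seed=0, force_rain=1)

print("rain rate, observed:", rate(rain))
print("rain rate, umbrellas forced:", rate(rain_forced_umbrella))
print("umbrella rate, observed:", rate(umbrella))
print("umbrella rate, rain forced:", rate(umbrella_forced_rain))
```

A model trained only on the observed `rain` and `umbrella` columns sees the same symmetric correlation either way; only the interventions expose the direction of the effect.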