Although we are far from having real conversations with self-aware, conscious robots as portrayed in the movies, NLP has nevertheless made some important advances in recent years. Not long ago, things like smart assistant devices, smart speakers, translation applications and voice recognition would have been considered the stuff of sci-fi movies.
But thanks to research in the field, NLP finds more and more useful applications not only in our personal lives but also in business.
At the Data Innovation Summit on August 20th, on the Data Engineering Stage, Philipp Eisen, AI Research Engineer at Peltarion, will talk about how the huge advances in the latest AI models for NLP, based on Google’s BERT, give organisations many opportunities to efficiently find and reuse information from a large corpus of internal documents.
In his talk on Building a robust and intelligent text similarity service with BERT, he will discuss engineering challenges and architectural choices for building a textual similarity service based on Sentence-BERT in a scalable and robust way. We talked to Philipp about the state of NLP today, its applications and what we can expect from it in the future.
Hyperight: Hello Philipp, we are glad to have you with us at the 5th edition of the Data Innovation Summit. To start, could you tell us a bit about yourself and your role at Peltarion?
Philipp Eisen: Thanks for having me. I am excited to be here. I am working in Peltarion’s AI Research team as an AI Research Engineer. Our team works on both academic research and applied research. By academic research, I mean research that tries to advance the state of the art in some field of AI or propose new approaches. When we work on applied research, we usually have a client that wants to solve a hard problem. We then work together with them to find out how we can use the most recent advances in AI to tackle that problem and create business value.
Hyperight: As you also mention in your presentation intro, there have been some major advancements in the field of AI, machine learning and deep learning particularly. Could you summarise what have been the most significant developments in these AI fields in the last five years?
Philipp Eisen: Five years in our field is a very long time, so this is going to be a bit of a tough one to answer. But for me, some of the biggest ones are the advances we made in computer vision through the combination of big datasets (often underappreciated), a lot of compute, and inducing knowledge about the problem domain with convolutional neural networks.
What I found interesting in the field of computer vision is that in the beginning, the trend was towards bigger models that could beat the state of the art over and over again. More recently, we have seen more and more models that are on par with those massive models, but use far fewer parameters. I think that is exciting because ultimately the complexity of a model determines the cost of running a prediction. That, in turn, defines the business cases in which using machine learning makes sense. Cheaper models open the door to more business cases.
For NLP, with the transformer architecture, we seem to have found an architecture that induces knowledge about the problem domain while at the same time being able to make use of the computing platforms we have today. So in a very oversimplified sense, you could look at the transformer architecture as being the convolutional neural network for language.
We are currently still in the phase where the state-of-the-art is pushed forward with bigger and more complex models. This phase is super important as it shows us what is possible with machine learning. In the future, similar to computer vision though, I expect to see more efficient models that are on par with the massive models of today.
Another very interesting development in machine learning is self-supervised learning. This can basically be summarised as using the structure inherent in the data we have to learn about that data in general before we optimise for our specific goal. As an example, in NLP, we pre-train language models by using the structure of natural language (i.e. words in sentences) to learn to model language. Today this is most commonly done by having the model do a fill-in-the-blanks task: hiding one or several words in a sentence and asking the model to predict which words were there before. We then take that model and fine-tune it on a task like finding the answer to a question in a provided paragraph of text.
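To make the fill-in-the-blanks idea concrete, here is a minimal sketch of how such training examples can be built from plain text without any human labels. It is an illustration only, not Peltarion's or BERT's actual pipeline: the function name, the whitespace tokenisation and the default 15% masking rate are assumptions chosen for simplicity, and real models work on subword tokens with a more elaborate replacement scheme.

```python
import random

MASK = "[MASK]"  # placeholder token shown to the model instead of the hidden word


def make_fill_in_the_blank_example(sentence, mask_prob=0.15, seed=0):
    """Turn a plain sentence into a (masked_tokens, targets) training pair.

    The labels come from the sentence itself, which is what makes the
    task self-supervised. Hypothetical helper, simplified for illustration.
    """
    rng = random.Random(seed)          # seeded so the example is reproducible
    tokens = sentence.split()          # assumption: naive whitespace tokenisation
    if not tokens:
        return [], {}

    masked = list(tokens)
    targets = {}                       # position -> original word to predict
    for i, token in enumerate(tokens):
        if rng.random() < mask_prob:
            masked[i] = MASK
            targets[i] = token

    # Make sure at least one word is hidden, so every sentence yields a task.
    if not targets:
        i = rng.randrange(len(tokens))
        masked[i] = MASK
        targets[i] = tokens[i]

    return masked, targets


if __name__ == "__main__":
    masked, targets = make_fill_in_the_blank_example(
        "How often do you drink beer with your friends"
    )
    print(" ".join(masked))   # model input with blanks
    print(targets)            # what the model is asked to predict
```

During pre-training, the model only sees the masked tokens and is scored on how well it recovers the entries in `targets`. The fine-tuning step mentioned above then starts from the weights learned this way.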
Hyperight: NLP is a subfield of AI that enables machines to read, understand and derive meaning from human language. Where is NLP used today?
Philipp Eisen: Some of the most prominent and widely used applications that make heavy use of NLP are found in the Google ecosystem. The most obvious one is Google Translate, which uses Neural Machine Translation technology for its service. While Google Search itself, of course, uses very advanced applications of the more classic NLP technologies, Google recently announced that Search is now also using a Deep Learning model (BERT) to improve their service.
Then there are many more specific applications of NLP. One example is monitoring and analysing social media to inform marketing and product departments about what customers think about their products.
Hyperight: Peltarion is dedicated to helping companies get usable and affordable AI technology and operationalise the latest AI techniques. What projects have you been involved in personally?
Philipp Eisen: One project that I am very excited about is a project that we have with the market research firm IPSOS. They have a great vision for a product around their survey creation process. Basically, they want to make it possible for end-users to create self-service surveys that are similar in quality to what you would get by having a market research company design them for you. To get there, we are making use of their huge library of high-quality surveys and questions in combination with NLP techniques that help clients to build their custom surveys.
In addition to that, I have been involved in a project that aims to make the advances in NLP for English available for the Swedish language as well.
I also stepped in to add some features that I wanted to see to the product that we are developing at Peltarion, the Peltarion Platform.
Hyperight: Your presentation at the Data Innovation Summit will be focused on building a textual similarity service based on Sentence-BERT. What are some challenges you encountered during the process, and how did you resolve them?
Philipp Eisen: One challenge that arises when working with models that encode text in a semantically meaningful way is that the understanding of the semantics can differ depending on the domain. To illustrate that, let’s consider these two questions: “How often do you drink beer?” and “How often do you tweet on Twitter?”. In most contexts, those questions are very different in that they ask about completely different things. However, if we are in the context of market research, those questions are very similar in that they both ask how often a person performs a common action with a product.
This is a challenge since most language models are trained in more general contexts, and therefore if the understanding of similarity differs in a specific context, we need to adapt the model to that specific context. That, in turn, requires either a significant amount of training data to adapt to the domain or some other way of introducing domain knowledge.
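As a rough sketch of what the comparison step in a similarity service looks like: each text is encoded as a vector, and two texts count as similar when their vectors point in a similar direction (cosine similarity). Everything named here is an assumption for illustration. In particular, `embed` is a stand-in that only counts words; a real service would call a Sentence-BERT style encoder at that point, and adapting to a domain such as market research would change that encoder, not the comparison logic.

```python
import math
from collections import Counter


def embed(sentence):
    """Placeholder encoder: bag-of-words counts.

    Assumption: stands in for a sentence-embedding model, which would
    return a dense vector instead of word counts.
    """
    words = sentence.lower().replace("?", "").split()
    return Counter(words)


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors (dict-like)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0                     # empty text is similar to nothing
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, corpus):
    """Return (score, text) pairs from the corpus, most similar first."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(doc)), doc) for doc in corpus]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    questions = [
        "How often do you tweet on Twitter?",
        "Which brand of beer do you prefer?",
    ]
    for score, text in rank_by_similarity("How often do you drink beer?", questions):
        print(f"{score:.2f}  {text}")
```

With the word-count placeholder, the two example questions look alike only because they share surface words. Whether they should count as similar is exactly the domain question discussed above, and it has to be answered by the encoder.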
Hyperight: And lastly, what do you predict for NLP in upcoming years, where do you see it heading?
Philipp Eisen: I think what we will be seeing is more and more advanced self-supervised training approaches that lead to even better language models. Some recent research suggests that having better language models allows for models to perform tasks with less labelled data and in the extreme case, even without any labelled data.
At the same time, we should see more and more efficient models. I expect that the combination of better and more efficient language models will drastically increase the adoption of deep learning-based NLP models. If we need less data and less compute, this will significantly lower the barrier to getting value from NLP models, so that using them makes sense, from a business perspective, in more situations.