Session Outline
Our guards document everything they encounter, from criminal incidents such as burglary or threats to issues like broken freezers, water leaks, and parking violations. They categorize events using preset options and write detailed notes in their local language. While these free-text notes are rich in information, extracting data from them automatically is challenging, especially across multiple languages. To add value for our clients, we use a Transformer model (XLM-R), pre-trained on 100+ languages, and fine-tune it to understand guard-specific terms.
By adding a second neural network on top as a classification head, we train the model to classify reports into key categories. Even though it is fine-tuned in only a few languages, the model classifies accurately in many others, thanks to its multilingual pre-training. To make this process efficient, we created a collaborative method for labeling and training the model. We'll share our insights from this project, which will help us in other work with pre-trained language models. These models are now in use across several countries, handling multiple tasks.
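A minimal sketch of what this setup can look like with the Hugging Face transformers library: XLM-R is loaded with a sequence-classification head and fine-tuned on labelled guard reports. The checkpoint name, category list, and example reports below are illustrative placeholders, not our actual label set or data.

```python
# Sketch: fine-tuning XLM-R with a classification head for report categories.
# Category names and example texts are illustrative only.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

CATEGORIES = ["burglary", "threat", "equipment_failure", "water_leak", "parking_violation"]

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=len(CATEGORIES)
)

# A couple of labelled guard reports (free text, written in the guard's local language).
texts = [
    "Broken window at the rear entrance, signs of forced entry.",   # English
    "Vattenläcka i källaren, golvet är översvämmat.",                # Swedish: water leak in the basement
]
labels = torch.tensor([CATEGORIES.index("burglary"),
                       CATEGORIES.index("water_leak")])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# One gradient step shown here; in practice this loops over the full labelled set for several epochs.
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
outputs = model(**batch, labels=labels)
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()

# Inference works the same way for any of XLM-R's pre-training languages,
# including ones that never appeared in the fine-tuning data.
model.eval()
with torch.no_grad():
    logits = model(**tokenizer(["Einbruch durch das Seitenfenster gemeldet."],  # German
                               return_tensors="pt")).logits
print(CATEGORIES[logits.argmax(dim=-1).item()])
```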
Key Takeaways
- Employing a pre-trained Transformer language model to solve a multitude of real problems, outperforming classical methods in performance and scalability, and LLMs in speed and cost
- A Transformer language model can be fine-tuned to understand domain-specific data, and the multilingual capabilities from pre-training carry through to the task despite single-language fine-tuning
- Only a relatively small number of labels is needed to reach high classification performance, thanks to an efficient, collaborative way of generating labels
- Iterating between label generation and model analysis ensures the most time-efficient model development, with a minimum number of labels and maximum generalization performance (see the sketch after this list)
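One way such an iteration can be realized, sketched below, is an uncertainty-based selection step between labelling rounds: score the unlabelled reports with the current model and surface the least confident ones to the labellers. This illustrates the idea rather than the exact procedure used in the project; the function and parameter names are hypothetical.

```python
# Hypothetical helper for one label-generation / model-analysis iteration.
import torch

def select_for_labelling(model, tokenizer, unlabelled_texts, k=50):
    """Rank unlabelled reports by the current model's confidence and
    return the k least confident ones as candidates for human labelling."""
    model.eval()
    confidences = []
    with torch.no_grad():
        for text in unlabelled_texts:
            batch = tokenizer(text, truncation=True, return_tensors="pt")
            probs = model(**batch).logits.softmax(dim=-1)
            confidences.append(probs.max().item())
    # The lowest-confidence reports are the most informative to label in the next round.
    ranked = sorted(range(len(unlabelled_texts)), key=lambda i: confidences[i])
    return [unlabelled_texts[i] for i in ranked[:k]]
```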