Power of Transfer Learning in NLP: Information Extraction Using the DistilBERT Transformer

Comprehending natural language text, with its inherent challenges of ambiguity, synonymy, and coreference, has been a long-standing problem in Natural Language Processing (NLP). The field has seen a tremendous amount of research and innovation in the past couple of years aimed at this problem, including libraries that abstract away the underlying workings of the algorithms so that high-quality machine learning and AI solutions can be built on natural text. This has made it possible to apply pre-trained models quickly and integrate them into real-world industry use cases.

Question answering (QA) is one such area. It is valuable in sectors such as finance and media, and in applications such as chatbots, for exploring large text datasets and finding insights quickly. You can either build a closed-domain QA system for a specific use case or work with open-domain systems built on open-source language models that have been pre-trained on very large general-purpose text corpora. Fine-tuning such a model on the problem at hand, so that it picks up domain-specific information, is an efficient way to implement a machine learning solution. The general idea is to identify the K most relevant sentences in the corpus for a given question, and then to find the span of text within those sentences that answers it.

This talk will highlight the general concepts behind the DistilBERT language model, ways of implementing it, and how to use transfer learning on the base model to build an efficient question-answering model. Reusing openly available pre-trained models in this way can deliver better business outcomes and is also better for the environment: by one widely cited estimate, training a single large AI model from scratch can emit as much carbon as five cars do over their lifetimes. A basic understanding of Python is desirable. Code can be made available via GitHub for everyone to examine after the talk.
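As a rough illustration of the first step of the general idea (picking the K most relevant sentences for a question), here is a minimal sketch in plain Python. It scores sentences by TF-IDF cosine similarity, which is an assumption made for illustration and not necessarily the retrieval method used in the talk. The names tokenize and top_k_sentences, and the three-sentence corpus, are hypothetical. The second step, in which a fine-tuned DistilBERT model predicts the answer span inside the retrieved sentences, is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def top_k_sentences(question, sentences, k=2):
    """Return the k sentences most similar to the question (TF-IDF cosine).

    Hypothetical helper for illustration; not taken from the talk's code.
    """
    docs = [tokenize(s) for s in sentences]
    n = len(docs)
    # Document frequency: number of sentences that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed inverse document frequency, always positive.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

    def vectorize(tokens):
        tf = Counter(tokens)
        # Terms that never occur in the corpus are ignored.
        return {t: c * idf[t] for t, c in tf.items() if t in idf}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    q_vec = vectorize(tokenize(question))
    scored = [(cosine(q_vec, vectorize(d)), s) for d, s in zip(docs, sentences)]
    # Highest score first; the sort is stable, so ties keep corpus order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:k]]


# Toy corpus, made up for this example.
corpus = [
    "DistilBERT is a smaller, faster version of BERT.",
    "The stock market closed higher on Friday.",
    "BERT was pre-trained on a large text corpus.",
]
candidates = top_k_sentences("What is DistilBERT compared to BERT?", corpus, k=2)
print(candidates)
```

In a full pipeline, the sentences in candidates would be passed, together with the question, to the span-prediction model. Keeping K small limits how much text the more expensive model has to read.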

Key Takeaways

Phase 1: Understand the NLP-based concepts

Familiarize yourself with the NLP terminology and process flow needed to retrieve information from an unstructured text corpus.

Phase 2: Deep dive into BERT/DistilBERT

Understand the BERT architecture, how it works, and why it is a major improvement over previous language models. Explore the problem statement and the steps to solve it.