NLP: Big Models Aren’t Enough – Ashish Bansal, Twitch


Recent conversation in NLP has been dominated by model size, with GPT-3 weighing in at 175B parameters. While these models deliver impressive results, many factors beyond size contribute to their performance. Training a 175B-parameter model may not be feasible for everyone, so this talk discusses key factors that can significantly improve the performance of your NLP models, as they have for BERT and GPT.

Key Takeaways

  • How text is represented is critical to the performance of NLP models
  • Identify the key pre-processing steps that have a significant impact on model performance
  • Learn the many tricks used to extract maximum performance from models like BERT and GPT, and apply them to your own models
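The takeaways above single out pre-processing as a major performance lever. As a minimal sketch (the function name and the specific steps are illustrative choices, not taken from the talk), a typical text-normalization pass might combine Unicode normalization, lowercasing, and whitespace collapsing before tokenization:

```python
import re
import unicodedata

def preprocess(text: str) -> str:
    """Illustrative pre-processing pass: Unicode NFKC normalization,
    lowercasing, and whitespace collapsing. Real pipelines tune these
    steps to the downstream model and tokenizer."""
    # NFKC folds compatibility characters, e.g. a no-break space
    # (U+00A0) becomes a plain space
    text = unicodedata.normalize("NFKC", text)
    # Case-folding shrinks the vocabulary the model must represent
    text = text.lower()
    # Collapse runs of whitespace and trim the ends
    text = re.sub(r"\s+", " ", text).strip()
    return text

print(preprocess("  Café\u00A0au   LAIT "))  # -> "café au lait"
```

Even small choices here (whether to lowercase, how to treat accents and special spaces) change the token sequence the model sees, which is why representation and pre-processing matter independently of model size.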


