NLP for Online Conversations – Katie Bauer, Reddit

As people live more of their lives online, there is a growing need for high-quality natural language processing on social media posts, chat logs, and forum replies. Unfortunately, many common preprocessing routines discard information that is useful in conversational data. This talk describes tools and techniques for handling this distinctive type of language.


  • Common sets of stop words should be revised to maintain vocabulary that is extremely useful for conversational data
  • Text normalization techniques (such as lowercasing) often smooth over people’s ways of expressing emotion or tone of voice online
  • Document vectors don’t need to be composed solely of word embeddings
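
The three points above can be sketched together in a few lines of Python. This is an illustrative sketch only, not the method presented in the talk: the stop-word sets, the choice of style features, and the names `style_features`, `content_tokens`, and `document_vector` are all assumptions made for this example, and the toy `embeddings` dictionary stands in for any pretrained word-embedding lookup.

```python
import re

# Hypothetical generic stop-word list (a tiny stand-in for a library list).
GENERIC_STOP_WORDS = {"the", "a", "is", "to", "and", "i", "you", "not", "no", "so", "very"}
# Assumption: pronouns, negations, and intensifiers carry signal in conversation,
# so they are removed from the stop-word list rather than from the text.
KEEP_FOR_CONVERSATION = {"i", "you", "not", "no", "so", "very"}
STOP_WORDS = GENERIC_STOP_WORDS - KEEP_FOR_CONVERSATION

TOKEN_RE = re.compile(r"[A-Za-z][A-Za-z']*|[!?]+")
ELONGATED_RE = re.compile(r"(.)\1{2,}")  # same character three or more times in a row


def style_features(text):
    """Measure tone cues on the raw text, before lowercasing discards them."""
    words = [t for t in TOKEN_RE.findall(text) if t[0].isalpha()]
    n = max(len(words), 1)
    return [
        sum(w.isupper() and len(w) > 1 for w in words) / n,    # share of ALL-CAPS words
        sum(bool(ELONGATED_RE.search(w)) for w in words) / n,  # share of elongated words
        float(text.count("!")),                                # exclamation marks
        float(text.count("?")),                                # question marks
    ]


def content_tokens(text):
    """Lowercase the words and drop stop words, using the revised list."""
    tokens = [t.lower() for t in TOKEN_RE.findall(text) if t[0].isalpha()]
    return [t for t in tokens if t not in STOP_WORDS]


def document_vector(text, embeddings, dim):
    """Mean word embedding concatenated with the style features."""
    vectors = [embeddings[t] for t in content_tokens(text) if t in embeddings]
    if vectors:
        mean = [sum(col) / len(vectors) for col in zip(*vectors)]
    else:
        mean = [0.0] * dim  # no known words: fall back to a zero vector
    return mean + style_features(text)


# Toy two-dimensional embeddings, invented for this example.
toy_embeddings = {"not": [1.0, 0.0], "good": [0.0, 1.0]}
print(content_tokens("The food is NOT good"))
print(document_vector("this is NOT good!!", toy_embeddings, dim=2))
```

Computing the style features on the raw text and only then lowercasing keeps both views of the message: the embedding part sees normalized vocabulary, while the appended features preserve capitalization, elongation, and punctuation as separate dimensions.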

➡ View/Download the PDF presentation file at https://datainnovationsummit.com/wp-content/uploads/2019/04/M3_10.30_Katie_Bauer_Reddit.pdf


Ivana Kotorchevikj
