- We present a novel embedding that is resilient to misspellings
- We modify the popular fastText algorithm with a novel, task-specific loss term
- We present experimental evidence that the method works in practice
Key Takeaways
• A new model to learn word embeddings (words or phrases mapped to dense vectors of numbers that represent their meaning) that are resilient to misspellings.
• We propose Misspelling Oblivious Embeddings (MOE), a new model that combines our open source library fastText with a supervised task that embeds misspellings close to their correct variants.
• The loss function of fastText aims to embed words that occur in the same context close to each other; we call this the semantic loss. In addition to the semantic loss, MOE considers a supervised loss that we call the spell correction loss, which aims to embed misspellings close to their correct versions. MOE is trained by minimizing a weighted sum of the semantic loss and the spell correction loss, as sketched below.
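
To make the weighted-sum objective concrete, here is a minimal sketch in Python/NumPy. The function names, the dot-product scoring, and the `alpha` weighting parameter are illustrative assumptions for exposition, not the actual fastText/MOE implementation, which scores words via subword n-grams and trains with stochastic gradient descent.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def semantic_loss(word_vec, context_vec, negative_vecs):
    """Skip-gram-style loss with negative sampling: pull a word
    toward a context word it co-occurs with and push it away
    from randomly sampled negative words."""
    loss = -np.log(sigmoid(word_vec @ context_vec))
    for neg in negative_vecs:
        loss -= np.log(sigmoid(-word_vec @ neg))
    return loss

def spell_correction_loss(misspelling_vec, correct_vec, negative_vecs):
    """Supervised loss: pull a misspelling's vector toward the
    vector of its correct spelling and away from negatives."""
    loss = -np.log(sigmoid(misspelling_vec @ correct_vec))
    for neg in negative_vecs:
        loss -= np.log(sigmoid(-misspelling_vec @ neg))
    return loss

def moe_loss(sem, spell, alpha=0.5):
    """Weighted sum of the two objectives; alpha trades off
    semantic quality against robustness to misspellings."""
    return (1.0 - alpha) * sem + alpha * spell

# Toy usage with random vectors (dimensions and alpha are arbitrary).
rng = np.random.default_rng(0)
dim = 5
word, context = rng.normal(size=dim), rng.normal(size=dim)
missp, correct = rng.normal(size=dim), rng.normal(size=dim)
negs = [rng.normal(size=dim) for _ in range(3)]

sem = semantic_loss(word, context, negs)
spell = spell_correction_loss(missp, correct, negs)
print(moe_loss(sem, spell, alpha=0.3))
```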