Need For Distributed Speed - Anders Arpteg

Working as a data scientist at a data-first company brings many interesting challenges. It is not only about building music recommendations, but also about being able to perform advanced analytics and machine learning at petabyte scale.

Key Questions

  • What does Spotify use all its petabytes of data for?
  • Isn't it sufficient to take a sample and train models on a single machine?
  • Is Apache Spark a silver bullet for distributed computing?
