Need For Distributed Speed - Anders Arpteg

For a data scientist, working at a data-first company brings many interesting challenges. It is not only about building music recommendations, but also about being able to perform advanced analytics and machine learning at petabyte scale.

Key Questions

  • What does Spotify use all those petabytes of data for?
  • Isn't it sufficient to take a sample and train models on a single machine?
  • Is Apache Spark a silver bullet for distributed computing?
