Audio based similarity search – Johan Bjurgert & Oscar Utterbäck, Epidemic Sound


We discuss how to use parts of musical audio to search for clips with a similar feel and timbre. The solution combines self-supervised learning with a triplet loss function. We also share practical insights from the work.
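To make the combination of self-supervision and a triplet loss concrete, here is a minimal sketch. The abstract does not specify the exact formulation, so everything below is an assumption for illustration: the standard margin-based triplet loss with squared Euclidean distance, a margin of 0.2, and a sampling rule where the anchor and positive clips come from the same track and the negative from a different one. The names `squared_distance`, `triplet_loss`, and `sample_triplet` are hypothetical and not taken from the talk.

```python
import random


def squared_distance(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Margin-based triplet loss for one (anchor, positive, negative) triple.

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is. The margin value is an assumption.
    """
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(d_pos - d_neg + margin, 0.0)


def sample_triplet(tracks, clip_len, rng):
    """Draw a triple of clips without any labels (assumed sampling rule).

    Anchor and positive are random crops of the same track; the negative
    is a random crop of a different track. `tracks` is a list of sample
    sequences, each at least `clip_len` long.
    """
    i, j = rng.sample(range(len(tracks)), 2)  # two distinct track indices

    def crop(track):
        start = rng.randrange(len(track) - clip_len + 1)
        return track[start:start + clip_len]

    return crop(tracks[i]), crop(tracks[i]), crop(tracks[j])
```

In a real system the clips would first pass through an embedding network and the loss would be computed on the resulting vectors; that network is omitted here because the abstract gives no details about it.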

Key Takeaways

  • Self-supervision circumvents the need for huge labeled data sets
  • Musical perception is subjective, so user testing is necessary
  • In practice, deep learning is error-prone, and it pays to have a solid setup

