Deep Learning is now the standard approach to object detection, but analyzing large numbers of images remains difficult, especially interactively. There has long been a gap between Deep Learning frameworks, which excel at image processing, and conventional ETL and data science tools, which are usually not designed to handle huge batches of complex data types such as images.
In this talk, we show how recent developments in Apache Spark make it possible to manipulate large corpora of images in a few lines of code. Using Spark’s unique ability to blend different libraries, we show how to start from satellite images and rapidly build complex queries on high-level information such as houses or buildings. This relies on Magellan, a geospatial package, and Deep Learning Pipelines, a library that streamlines the integration of Deep Learning frameworks in Spark. At the end of this session, you will walk away with the confidence that you can solve your own object detection problems at any scale with Spark.