Open Source Data Science With Python, Spark And Jupyter – Daniel Tidström

One of the major advantages of Hadoop is its schema-on-read, data lake architecture, which simplifies storing huge amounts of data in raw form. Solving storage is a good start, but accessing, transforming and analyzing the data is an even more important step towards achieving tangible business value from it. This session will show how Svenska Spel uses Python and Spark to process and analyze large amounts of data, with Jupyter notebooks as a unified interface.
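To make the schema-on-read idea concrete, here is a minimal sketch in plain Python rather than Hadoop or Spark. The sample records, the field names and the read_with_schema helper are all hypothetical and only illustrate the principle: data is stored as it arrives, and the reader decides which fields and types it needs.

```python
import json

# Raw events kept exactly as they arrived (hypothetical sample data).
RAW_LINES = [
    '{"user": "a1", "amount": "120.50", "game": "lotto"}',
    '{"user": "b2", "amount": "80", "extra_field": true}',
    'not valid json',
]

# The schema is chosen by the reader; nothing was enforced at write time.
SCHEMA = {"user": str, "amount": float}


def read_with_schema(lines, schema):
    """Parse raw lines and apply the schema at read time, skipping bad records."""
    records = []
    for line in lines:
        try:
            raw = json.loads(line)
            # Keep only the fields named in the schema, cast to the wanted types.
            records.append({name: cast(raw[name]) for name, cast in schema.items()})
        except (ValueError, KeyError, TypeError):
            continue  # malformed or incomplete record: ignored in this sketch
    return records


records = read_with_schema(RAW_LINES, SCHEMA)
total = sum(r["amount"] for r in records)
```

A different analysis could read the same raw lines with a different schema, which is what makes the raw store flexible. A real pipeline would typically log or quarantine rejected records instead of silently dropping them.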

Key Questions

  • How do you establish an agile and powerful data science environment using the latest open source tools?
  • How do DataFrames act as the glue between distributed processing on Spark and in-memory analytics with Python?
  • What are Jupyter Notebooks and why are they the perfect interface for data science?
