Accelerating Data Analytics with Computational Storage – Javier González, Samsung Electronics


The vast amount of data put through modern Machine Learning and AI algorithms is challenging current computer architectures, where data needs to be moved from secondary storage to main memory in order to be processed by the CPU. Not only is this model inefficient in terms of performance, resource utilization and power consumption, but it is also non-scalable: processing power and interconnect speeds cannot keep up with the I/O bandwidth unleashed by current NVMe SSDs. One way of reducing this data movement is to preserve data locality and move computation close to where the data resides. This form of near-storage compute is nowadays best known as computational storage. In this talk, we will (i) provide an overview of the efforts the community is making to bring computational storage to market, (ii) show the experimental benefits that we have obtained through real-life workloads on a Samsung computational SSD (i.e., SmartSSD), and (iii) explain the challenges that remain in terms of technology and ecosystem. We hope to encourage the audience to take action and rethink their infrastructure to take advantage of this new type of SSD.

Key Takeaways

  • Present the audience with the limitations of current architectures in data-intensive workloads such as ML / AI
  • Provide the audience with an overview of the efforts made by the industry to bridge this gap
  • Provide the audience with a detailed overview of the architecture and use cases that computational storage targets
  • Encourage the audience to think about their use cases and find similarities in the problems and solutions presented. Help them challenge the infrastructure they use for their ML / AI applications and give them an idea of the benefits that they can expect.
