Hands-on analytics and data science stories
The previous piece in The Data Innovation Summit: 5 years of data and analytics journey series was dedicated to practical case studies and real examples of putting data innovation into practice at the Data Innovation Summit 2017. Now we’ll see how pioneering companies utilised analytics and data science to innovate through their data, and the best analytics practices they shared.
Peter Wallin, then in the capacity of Head of Group Analytics at Hoist Finance, demonstrated how Hoist Finance leverages data analytics to improve their performance and efficiency in the day-to-day business. Peter described the process of setting the stage for data and BI-driven business process engineering and value creation which powered the Hoist transformation. He also provided advice on how to manage and run self-service analytics in a global organisation with increasing user demands and regulations.
In 2017, Big Data, digital, and the IoT were already forcing massive disruptive changes. Companies needed to be innovative with the use of data to stay ahead of the competition. In his session, Zunnoor Tarique, Principal Data Scientist, EMEA at Teradata, gave a few examples to demonstrate how analytical innovators operate and drive value from data. Additionally, Zunnoor presented how to drive value from data science projects, how to strategise analytical initiatives, and the importance of culture and organisation in becoming “Data-Driven”.
Having access to more data doesn’t bring value unless we know what to do with it. The amount of data companies were collecting accelerated rapidly. Styrbjörn Torbacke, former Managing Director at Zoined Oy, touched upon the issue of how we can harness the power of data when most analytics packages tend to make the complex even more complex. He explained that analytics didn’t have to be cumbersome, costly and difficult, and that it could be pretty simple to do.
Markus Rytkölä also talked about companies gathering more and more data which provided useful insight into the business. But Markus proposed that without the final piece of external information that is not created within the organisation, the data puzzle wouldn’t be complete. He demonstrated how external data could enrich and give meaning to internal data. Markus rounded off his talk with the essential component of good ROI: high user adoption, which demands ease of use when infusing external insights into your own data.
Carl Svärd offered insight into the journey from a single centralised data science team under the strategy department to a whole data and analytics organisation, told from an operational perspective with an emphasis on data science and engineering. Carl also presented how to enable data science teams and their members to be impactful and efficient during periods of high growth and uncertainty.
Giovanni Leoni, formerly Business Analyst and Project Leader at IKEA, presented IKEA’s Business Information journey of increasing the business value of their data and its 2017 transition into a phase of creating the highest possible business value for their decision-makers. Giovanni described IKEA’s Business Analytics Competence Centre as a natural part of their performance culture, creating a clear focus of moving from data to fact-based decisions. As Giovanni stated, great business becomes a reality when values, culture, experience and business analytics come together.
Kurt Muehmel, Chief Customer Officer at Dataiku, talked about the transformation of data science teams. As early as 2017, the times when data science teams could focus only on building and optimising models were long past. Across sectors, data science had become a critical component of corporate strategy, and its fruits were informing an increasing number of business decisions. Kurt related that the major challenge was thus no longer simply building the right models but rather companies’ ability to scale their infrastructure, their data prototypes, the number of prototypes deployed to production, and, above all else, their people, both in terms of numbers and in terms of capabilities.
Mikhail Zhilkin, Data Scientist at Arsenal F.C., shared a real-life example of how data scientists can make themselves redundant by automating the entire chain from receiving data from an external source all the way to a nice visualisation. Mikhail shared his personal experience of how he solved his repetitive tasks by automating them, so he could concentrate on doing more meaningful data science. The wins of automation are multiple, stated Mikhail: it saves time and effort, makes others less dependent on the data scientist, and prevents human errors.
Lisa Neddam gave insight into the challenges that Data Science faced in 2017: finding well-skilled Data Scientists, deciding where Data Scientists fit in the organisation, and keeping up to date with new business challenges and technologies. She introduced the new generation of IBM’s data science platform, which fits into a Data Science community that was being shaped by open-source technology.
Lars Albertsson tackled the buzz around stream data processing common three years ago: the rapidly appearing technologies and the huge number of tutorials on how to count words with each of them. In his presentation, Lars described how to go beyond a “Hello world” stream application and build a real-time data-driven product. He also included architectural patterns, tradeoffs and considerations when deciding on technology and implementation strategy, and described how to put the pieces together.
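For context, the word-count tutorial is the canonical “Hello world” of stream processing: maintain a running count of words as text events arrive. A minimal Python sketch of that idea (our illustration, not from the talk; the event list stands in for a real source such as a message queue):

```python
from collections import Counter

def word_count(stream):
    """The 'Hello world' of stream processing: emit updated
    word counts after each incoming text event."""
    counts = Counter()
    for event in stream:
        counts.update(event.lower().split())
        yield dict(counts)  # snapshot of the running state

# Simulated stream of text events
events = ["hello world", "hello stream"]
final_state = None
for final_state in word_count(events):
    pass
# final_state == {"hello": 2, "world": 1, "stream": 1}
```

A real-time product, as Lars pointed out, needs far more than this: durable sources, state management, fault tolerance, and delivery guarantees.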
Andrea Burbank, Data Scientist at Pinterest, talked about how they scaled data science at Pinterest by building a culture of experimentation. As she stated, successfully running experiments required more than just a working experiment framework. Andrea gave a step-by-step outline of how Pinterest went from merely having the ability to run experiments to having dozens of individuals advocating for experimentation and A/B testing best practices.
Shirish Tatikonda, Principal Data Engineer at Target, opened his presentation by talking about how the rising need for custom machine learning algorithms that scale to large data sizes on distributed runtime platforms poses significant productivity challenges to data scientists.
As a solution, he presented the philosophy and the architecture behind Apache SystemML. It enables declarative machine learning by letting data scientists specify their analytics using higher-level linear algebraic and mathematical constructs, explained Shirish. These analytics scripts are transparently compiled into efficient execution plans running on modern distributed data-parallel frameworks, such as Apache Spark, Apache Hadoop, and Apache Flink.
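To give a flavour of the declarative style, here is a sketch of ordinary least squares written in SystemML’s R-like DML syntax — an illustration of the idea, not an excerpt from the talk, and the `$X`, `$y` and `$w` script arguments are hypothetical:

```
# Read feature matrix X and target vector y, solve the normal
# equations, and let the SystemML compiler decide how to
# parallelise and distribute the work.
X = read($X)
y = read($y)
w = solve(t(X) %*% X, t(X) %*% y)
write(w, $w)
```

The appeal of this style is that the same script runs unchanged whether the compiler targets a single node or a Spark, Hadoop, or Flink cluster.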
In the next piece wrapping up the Data Innovation Summit 2017, the focus will be on the Data Management stage and the technical case studies on Data Management and Engineering as a foundation for solid, secure and quality-based data-driven innovation.