Schibsted Media Group is an international publishing company and the largest media group in Scandinavia. It owns some of the largest newspapers in Sweden and Norway and operates classified ads businesses and media sites in over 20 countries, which means it collects a lot of data. But data is only valuable if you know how to analyse it and act on it.
The road from data collection to data action is long and winding, but Ludwig Krokstedt, Head of Media Insights at Schibsted Media Group, gave a great presentation at the Data Innovation Summit 2019 on how they make sure all data users in the media group get actionable data so that they can make data-informed decisions.
As Ludwig points out, good-quality journalism relies on data, and data is part of every healthy business. Schibsted Media Group has no lack of it: it houses more than 2 PB of data and handles more than 1.4 billion events per day.
Unavoidably, working with large amounts of data across different departments and business cases comes with an explosion of data tools, which brings certain challenges, admits Ludwig. Some of them are:
- High threshold for getting data
- Increased complexity
- Increased fragmentation
- Higher costs
- Duplicate tracking
- Unclear ROI
- Difficulty ensuring compliance
This is where the Media Insights team steps in. Their job is to provide employees with the right data in the right form so that they can do their work, Ludwig explained.
How Media Insights provides data for Schibsted Media Group
The Media Insights team is guided by three principles for getting data to all the users who need it:
- Standardise and ensure data quality
- Deliver the data to the right people, with tools and in a form they can consume, matching the needs of their role. Data scientists, analysts, product managers, UX designers and journalists have completely different needs. But standardising on a small group of tools brings benefits to all in terms of scale, cost and quality.
- Secure and ensure user privacy.
Strategic initiatives for ensuring high data quality
Ludwig maintains that data scientists and analysts spend 80% of their time getting access to data and cleaning it. This implies that only 20% of their time goes into producing insights.
Improving data quality required scaling up their data insights five-fold, but with that split of time the prospects for doing so were not bright. So they came up with four strategic initiatives for ensuring high-quality data across the media group.
Replacing the current trackers with proprietary ones
To ensure the highest data quality across the media group, Schibsted Media Group built its own online tracker instead of using a pre-packaged solution. The tracker allows them to track all page events and gives them full control over how the events are sent and stored.
Providing one common schema across all of Schibsted Media Group
The events sent to the central tracker contained the same information but had different naming conventions, so cross-site comparisons were a tedious task. To solve this problem, they implemented a common tracking schema across the whole group.
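The idea of a common tracking schema can be illustrated with a minimal sketch. The site names, field names and the `to_common_schema` helper below are hypothetical and are not taken from the presentation; the sketch only assumes that each site's events carry the same information under different names.

```python
# Hypothetical per-site field names mapped onto one shared schema.
# None of these names come from Schibsted; they only illustrate the idea.
FIELD_MAPS = {
    "site_a": {"uid": "user_id", "evt": "event_type", "ts": "timestamp"},
    "site_b": {"userId": "user_id", "eventName": "event_type", "time": "timestamp"},
}


def to_common_schema(site: str, raw_event: dict) -> dict:
    """Rename a site's fields to the common names and record the source site."""
    mapping = FIELD_MAPS[site]
    event = {common: raw_event[local] for local, common in mapping.items()}
    event["site"] = site
    return event


events = [
    to_common_schema("site_a", {"uid": "u1", "evt": "pageview", "ts": 1}),
    to_common_schema("site_b", {"userId": "u2", "eventName": "pageview", "time": 2}),
]
# Both events now share the same keys, so they can be compared directly.
```

Once every event uses the same keys, a cross-site comparison becomes a plain filter or group-by on the shared field names rather than a per-site translation exercise.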
Defining a standardisation and data modelling forum with dedicated focus groups
To carry out the schema standardisation, they set up groups working on tracking and warehouse standardisation. The focus groups bring together different roles and functions from across the whole media group. This puts the input of all stakeholders on the table when data issues are discussed.
Real-time data quality testing
To make sure that data tracking is done properly, they provided tools for real-time data quality checks, which allow immediate detection of poor-quality data.
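What such a check might look like can be sketched in a few lines. This is an illustration under stated assumptions, not a description of Schibsted's tooling: the required fields, their types and the function names are all hypothetical, and a Python list stands in for a live event stream.

```python
from typing import Iterable, Iterator

# Assumed required fields and types, for the illustration only.
REQUIRED_FIELDS = {"user_id": str, "event_type": str, "timestamp": int}


def find_problems(event: dict) -> list[str]:
    """Return the problems found in one event; an empty list means it passes."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in event:
            problems.append(f"missing field: {field}")
        elif not isinstance(event[field], expected_type):
            problems.append(f"wrong type for {field}")
    return problems


def check_stream(events: Iterable[dict]) -> Iterator[tuple[dict, list[str]]]:
    """Yield each failing event with its problems as soon as it is seen."""
    for event in events:
        problems = find_problems(event)
        if problems:
            yield event, problems


stream = [
    {"user_id": "u1", "event_type": "pageview", "timestamp": 1},
    {"user_id": "u2", "event_type": "click"},  # timestamp is missing
]
bad_events = list(check_stream(stream))
```

Because `check_stream` is a generator, a failing event is reported when it arrives rather than after a batch completes, which is the property that makes immediate detection possible.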
The right tools for the right people
The various data consumers have different needs in terms of data and insights. To match the roles with their data needs, the media group set up three initiatives:
- Rebuilding different data pipelines with all the data consumers in mind
- Reducing data silos and the skill gap by implementing a new data warehouse that gives control of the data back to the consumers
- Introducing a new real-time behavioural analysis system.
Data consumers gaining control over data
Initially, they were faced with several requirements for the new common data warehouse:
- ETL infrastructure couldn’t be cumbersome
- Data is constantly evolving, so the warehouse should be flexible
- Data should be accessible by most technical users
- Centralised or specialised teams shouldn’t be bottlenecks for local initiatives
- Easy transferability of data and insights
Matching the need with the tool
After reshuffling their data solution structure and implementing a new data warehouse, Schibsted Media Group can easily match the specific needs of each role with the right data tool to gain insights.
Depending on the skill level and on the required granularity and richness of the data, they can offer users different tools and data that provide real-time or non-real-time insights.
Privacy always first
Schibsted Media Group makes sure data is protected with access control to maintain privacy at the highest level.
At all times, Schibsted Media Group makes sure that it is fully GDPR compliant, secures privacy for identified and anonymous users, and manages consent and opt-outs. To make sure that all consents and opt-outs are handled correctly, it has set up a distributed privacy broker system for handling take-outs and deletions. The privacy broker notifies all the systems in Schibsted Media Group and tracks their compliance, provides an upload location for take-outs, alerts if systems are not compliant, and handles third-party systems by means of a delegate system.
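The notify, track and alert behaviour described above can be sketched as a toy in-process broker. Everything here is an assumption made for illustration: the `PrivacyBroker` class, its method names and the system names are hypothetical, and the upload location for take-outs and the delegate for third-party systems are not modelled.

```python
class PrivacyBroker:
    """Toy broker: notifies registered systems of a deletion and tracks confirmations."""

    def __init__(self):
        self.handlers = {}  # system name -> callable(user_id) returning True when done
        self.pending = {}   # user_id -> set of system names that have not confirmed

    def register(self, name, handler):
        """Register a system that must act on deletion requests."""
        self.handlers[name] = handler

    def request_deletion(self, user_id):
        """Notify every registered system and record which ones confirmed."""
        self.pending[user_id] = set(self.handlers)
        for name, handler in self.handlers.items():
            if handler(user_id):
                self.pending[user_id].discard(name)

    def non_compliant(self, user_id):
        """Return the systems that still owe a confirmation, sorted for stable alerts."""
        return sorted(self.pending.get(user_id, set()))


broker = PrivacyBroker()
broker.register("warehouse", lambda user_id: True)    # confirms the deletion
broker.register("legacy_crm", lambda user_id: False)  # fails to confirm
broker.request_deletion("u1")
alerts = broker.non_compliant("u1")
```

In a distributed setting the handlers would be asynchronous messages rather than direct calls, but the bookkeeping stays the same: a request is complete only when the pending set for that user is empty.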
The key to actionable data
By standardising the collection and quality of the data, democratising access through a select set of tools and systems fitted to each data consumer's role, and at the same time ensuring privacy and security, Schibsted Media Group enables people to take action and drive innovation, concludes Ludwig.