Is Data Mesh right for your organisation?

In a previous article, we discussed the new enterprise data architecture spreading like wildfire among the data community: Data Mesh. We noted that it represents a paradigm shift in data architecture, one in which the data industry moves away from massive data teams prioritising centralised, monolithic data lakes and databases, towards an approach that treats data domains and data products as first-class citizens.

In a nutshell, it presents a convergence of distributed domain-driven architecture, self-serve platform design, and product thinking with data. To better understand the concept, we talked to Daniel Tidström, Partner & Management Consultant at Data Edge, who has been working with at least parts of Data Mesh for quite some time.

Daniel explained that Data Mesh becomes crucial when a company scales quickly. “With the proliferation of data sources and data consumers, having one central team to manage and own data ingestion, data transformation and serving data to all potential stakeholders will inevitably lead to scaling issues. Given the increasing importance of data in our organisations, designing for scalable teams and scalable platforms is really crucial,” explained Daniel.

An important point he made is that Data Mesh starts a discussion about the distribution of data, because data creation is inherently distributed in all companies. With the number of data sources growing every day, many organisations should probably at least consider what their options for scaling are.

For companies wondering whether Data Mesh is a good fit for them, Daniel suggested that a good time to consider it is when you have adopted domain-driven development, have started working with microservices, or are doing a cloud migration.

Barr Moses, Co-Founder and CEO of Monte Carlo, states that domain-oriented data architectures, like Data Meshes, give teams the best of both worlds: a centralised database (or a distributed data lake) with domains (or business areas) responsible for handling their own pipelines. This way, Data Mesh makes data architectures easier to scale by breaking them down into smaller, domain-oriented components.

Photo by fauxels from Pexels

Does Data Mesh make sense for all types of organisations?

In his interview, Lars Albertsson, Founder of Scling, described Data Mesh as one way to scale out large data organisations where data management and governance have become challenging due to the number of teams working in the data platform.

“A Data Mesh can allow companies to scale further by federating data management and governance to teams that own data sources and data pipelines,” explains Lars.

Companies adopting Big Data and DataOps go through several phases of organisational structures around data, over a period of many years. In early phases, there is one team or a few tightly collaborating teams that use a single data platform, where the core component is typically a data lake combined with batch processing pipelines, potentially complemented with stream processing capabilities.

Most companies are either in these early stages or have not yet built their first incarnation of a data platform. More data-mature companies have managed to spread data innovation beyond pioneering teams and have democratised data processing capabilities. To facilitate data democratisation and make governance manageable, the data platform technology and development processes are kept homogeneous. There are usually small variations, but if entropy is not kept under control, excessive friction will prevent data democratisation, adds Lars.

A centralised data platform can scale to large organisations. For companies whose culture makes it difficult to scale centralised services, centralised data governance can be perceived as a bottleneck. In this case, a Data Mesh can be a way to scale further by distributing governance responsibilities. In practice, a Data Mesh incarnation is a data platform and lake that has been split up into multiple ponds and multiple processing environments, controlled and operated by different teams. The size at which a Data Mesh makes sense depends on a company’s capabilities to coordinate a centralised data platform, clarifies Lars.
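The split into ponds described above can be pictured with a small sketch. It is only an illustration under assumptions: the names `Pond` and `PondRegistry`, and the example domains, teams and datasets, are hypothetical and do not come from the interview. The one idea it tries to capture is that every dataset ends up governed by exactly one owning team.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Pond:
    """One team's slice of the former central lake (hypothetical model)."""
    domain: str
    owner_team: str
    datasets: tuple[str, ...]


class PondRegistry:
    """Maps every dataset to the single pond, and team, that governs it."""

    def __init__(self) -> None:
        self._pond_by_dataset: dict[str, Pond] = {}

    def register(self, pond: Pond) -> None:
        # Refuse a second owner for a dataset: governance is federated, not shared.
        for dataset in pond.datasets:
            if dataset in self._pond_by_dataset:
                raise ValueError(f"{dataset} already has an owner")
            self._pond_by_dataset[dataset] = pond

    def owner_of(self, dataset: str) -> str:
        return self._pond_by_dataset[dataset].owner_team


# Hypothetical example data.
registry = PondRegistry()
registry.register(Pond("payments", "payments-team", ("transactions", "refunds")))
registry.register(Pond("marketing", "growth-team", ("campaigns",)))
print(registry.owner_of("refunds"))
```

In a centralised platform the same lookup would always return the one platform team, which is the coordination burden Lars describes a Data Mesh as distributing.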

In the end, Lars states that Data Mesh is not the only option for scaling to a large number of teams; companies with sufficient capabilities to coordinate data management can keep a centralised data platform and avoid the overhead of decentralisation.

Read the whole interview with Lars Albertsson

But how do you know if your organisation is really ready to dive into Data Mesh? To help companies make the decision, Barr Moses and her team have created a calculation, in the form of a survey, that determines whether it makes sense for an organisation to invest in a Data Mesh. By answering questions about their data sources, data team, data domains, data engineering bottlenecks, data governance and data observability, companies get a score that helps them decide whether to go for Data Mesh. You can find the calculation in the Guide on implementing Data Mesh by Barr Moses, CEO of Monte Carlo, and Lior Gavish, CTO of Monte Carlo.

Future outlooks for Data Mesh and DataOps

Both Data Mesh and DataOps are set to disrupt the data and analytics industry in the next decade. But will they progress and transform organisations?

Regarding the above, Lars says that organisations that have fully adopted DataOps are 10-100 times more efficient at working with data than traditional companies. Although these figures are subjective estimates based on observing many companies across the maturity spectrum, and there are no scientific measurements for DataOps, Lars states that they match scientifically measured operational metrics for companies at different levels of DevOps maturity, as presented in the State of DevOps Report. DataOps seems to have similar effects in terms of lead time for new ideas and time to recovery in the event of failures.

“This efficiency gap is so significant that it is disruptive.” DataOps is in practice a requirement for getting sustainable value out of machine learning technology. Building and operating machine learning applications, keeping them healthy, and iterating to continuously improve them is complex and expensive. Companies that have not achieved a high level of data maturity may strike it lucky once or twice, but cannot deliver AI innovation in a sustainable and repeatable manner, says Lars.

By contrast, Lars says that Data Mesh is not a disruptive concept, but a way for very data-mature, large companies to scale even further. “These companies have already obtained disruptive value from data and fully adopted machine learning technology.”

What Lars sees as concerning is that most of the buzz around Data Mesh is among companies that are not yet at this level of maturity.

“Early adoption of a Data Mesh is likely to be harmful; if you have not yet established strong, homogeneous conventions and processes in your data platform, decentralising it will introduce heterogeneity, which slows down innovation and prevents effective data democratisation.”

Lars believes that for most organisations, adopting a Data Mesh is in practice likely to be a step backwards and reintroduce the data silos that we had before the age of big data. “Descriptions of the Data Mesh tend to emphasise the responsibility of teams that own data to publish cleaned, high-quality data as data products, a pattern also known as a Data Hub.”

Shifting responsibilities of data quality improvement to teams that have domain expertise is generally a good idea, and an evolutionary step towards data maturity, but the underlying raw data must also be made available, he relates.

To clarify, Lars gives the example of invalid financial transactions: cleansing them away may be desirable for financial reporting, but they might be a gold mine of signals for fraud detection. “Hiding raw data into silos is a drawback of the data hub, and a significant risk when adopting a Data Mesh unless the company first establishes strong data democracy practices in a centralised platform. Hence, I am concerned that the buzz around Data Meshes will be harmful, and lure less data mature companies to build data silos.”
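The transaction example can be made concrete with a minimal sketch. Everything in it is an assumption for illustration: the `Transaction` fields, the `valid` flag, and the sample records are hypothetical, not taken from any real system. It shows a cleaned view serving reporting while the raw records stay available for a fraud signal.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """Hypothetical raw transaction record."""
    tx_id: str
    account: str
    amount: float
    valid: bool


def reporting_view(raw: list[Transaction]) -> list[Transaction]:
    """Cleaned data product: invalid transactions are cleansed away."""
    return [tx for tx in raw if tx.valid]


def invalid_attempts_by_account(raw: list[Transaction]) -> dict[str, int]:
    """Fraud signal that only exists in the raw data: failed attempts per account."""
    counts: dict[str, int] = {}
    for tx in raw:
        if not tx.valid:
            counts[tx.account] = counts.get(tx.account, 0) + 1
    return counts


# Hypothetical sample data.
RAW = [
    Transaction("t1", "acc-1", 120.0, True),
    Transaction("t2", "acc-2", 9999.0, False),
    Transaction("t3", "acc-2", 9999.0, False),
    Transaction("t4", "acc-3", 35.5, True),
]

reported_total = sum(tx.amount for tx in reporting_view(RAW))
fraud_signals = invalid_attempts_by_account(RAW)
print(reported_total, fraud_signals)
```

If only the output of `reporting_view` were published, the repeated failed attempts on `acc-2` would be invisible to a fraud team, which is the silo risk Lars describes.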

By contrast, there is no such risk with DataOps. “If your company can make people with different skill sets work well together, there are only benefits and [it] will accelerate your journey to data maturity.”

…

Have you had any experience with implementing Data Mesh? Share your thoughts in the comments below.
