
Is Data Mesh Going to Replace Centralised Repositories?

Insights Through Data | Adriana Calomfirescu | 26 July 2022

Data Mesh seems to herald a paradigm shift in data storage and processing. Instead of central data repositories, such as data warehouses or data lakes, companies could in the future rely on a distributed data architecture to finally exploit the full potential of their data. Let’s take a look at the principles of this new data architecture concept, its advantages, and what needs to be considered when deciding whether it is a good fit for a company.


“Data is the new oil” – this quote by British mathematician Clive Humby is over 15 years old, and most companies have since recognised the meaning of his words: they are trying to exploit the potential of their data. To do this, they are collecting ever larger amounts of data in central data stores, where it is cleaned and processed so that it can then be consumed as high-quality data.

The data originates from internal operational and transactional systems and domains that are essential for business operations. Furthermore, data from external sources that offer companies additional information is also fed into the data warehouse or data lake.


However, companies are slowly reaching their limits with this monolithic data platform architecture – and they often do not even achieve the desired results. They face the challenge of controlling their ever-growing data volumes and harmonising them to reach their full potential. Moreover, this process costs money and takes time. Thus, their ability to react flexibly and quickly to the increasing number of internal and external data sources and to connect them to their existing data is limited.

Furthermore, the origin of the data in these repositories often cannot be fully traced: which system did it originally come from? Through which other systems did it migrate? When was it changed, how, and by whom? This information is important to ensure a high level of data quality. However, due to the large amount of data that ends up in the repository – as well as the speed at which data changes – it is sometimes neglected and not fully tracked and recorded. This usually makes the subject matter experts who are supposed to work with the data reluctant to use it.
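The provenance questions above – where did the data come from, which systems did it pass through, and who changed it – can be captured in a per-data-set lineage record. The following is a minimal sketch; the field names are illustrative and not taken from any specific lineage tool:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ChangeEvent:
    """One modification applied to a data set on its way into the repository."""
    timestamp: datetime
    changed_by: str    # user or service that made the change
    description: str   # what was changed, and how

@dataclass
class LineageRecord:
    """Traces a data set from its source system into the central repository."""
    dataset: str
    source_system: str          # where the data originally came from
    migration_path: list[str]   # systems it migrated through, in order
    changes: list[ChangeEvent] = field(default_factory=list)

record = LineageRecord(
    dataset="customer_orders",
    source_system="erp",
    migration_path=["erp", "staging", "data_lake"],
)
record.changes.append(ChangeEvent(
    timestamp=datetime.now(timezone.utc),
    changed_by="etl_service",
    description="normalised currency codes to ISO 4217",
))
```

Recording this alongside the data itself is what allows consumers to answer the quality questions above without asking the producing team.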

As a result, companies struggle to generate meaningful insights from their data and to identify new use cases – such as new products or services for their customers. In addition, it takes time to transform the data and make it ready for its consumers. This is especially the case if a company does not employ enough data specialists who know exactly how the data should be processed to serve its purpose.


The data mesh concept attempts to address these challenges by managing data as a product. This means that the data is structured as data domains, has data owners, and is properly catalogued so everyone interested in certain data within the company can easily access the metadata. The team generating the data is considered the data owner and must prepare its data in such a way that other data consumers in the company can use it easily via self-service options. To do this, they need to satisfy several principles when building and managing their data products, such as Data Integrity, Discoverability, Self-Description, and Interoperability. This increases consumer confidence in the products.
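As an illustration of these principles, a producing team might publish a small metadata document alongside its data product – the self-description that makes it discoverable and usable without hand-holding. The fields below are hypothetical, not a standard:

```python
# Hypothetical metadata a data-producing team publishes with its product.
data_product = {
    "name": "customer-orders",
    "domain": "sales",                       # the owning business domain
    "owner": "sales-data-team@example.com",  # accountable team (Data Integrity)
    "description": "Daily snapshot of confirmed customer orders.",
    "schema": {                              # Self-Description
        "order_id": "string",
        "customer_id": "string",
        "order_total": "decimal",
        "currency": "ISO 4217 code",
        "confirmed_at": "timestamp (UTC)",
    },
    "format": "parquet",                     # Interoperability: a common format
    "access": "s3://example-bucket/sales/customer-orders/",  # self-service endpoint
    "tags": ["sales", "orders", "daily"],    # Discoverability in the catalogue
}

def validate(product: dict) -> list[str]:
    """Return the required metadata fields a product is still missing."""
    required = {"name", "domain", "owner", "description",
                "schema", "format", "access"}
    return sorted(required - product.keys())
```

A product that passes such a check is self-describing enough for consumers to find it, understand its contents, and pull it via self-service.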

The biggest advantage here is that the data-producing departments naturally know their data best. Accordingly, it is easier for them to derive benefits from it and to develop new possible use cases.

In this new data architecture, the role of the data scientists and engineers also changes: they no longer act as go-betweens for the data-producing and data-consuming teams, but become part of the data-producing team. In this way, they learn the domain knowledge necessary to support their team colleagues in the best possible way when preparing the data products. This simplifies and speeds up the entire process, which at the same time leads to lower overall costs.


The data mesh approach is particularly suitable for larger companies that work with very large data sets and a variety of data sources. Smaller companies, on the other hand, can usually get by with a central data repository. When implementing a data mesh approach, companies should consider two key things to set up the necessary processes:

A central data governance model: data mesh only works if all data products in a company adhere to consistent standards and guidelines. Only then are they interoperable, and data consumers can merge multiple data products and work with them according to their individual needs. Therefore, companies must first define standards and policies that determine how data products are categorised, managed, and accessed.
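Such standards can be made enforceable rather than aspirational. A minimal sketch of an automated governance check – the naming convention, categories, and access levels below are invented for illustration:

```python
import re

# Hypothetical central governance rules every data product must satisfy.
NAMING_PATTERN = re.compile(r"^[a-z][a-z0-9-]*$")   # e.g. "customer-orders"
ALLOWED_CATEGORIES = {"operational", "analytical", "reference"}
ALLOWED_ACCESS_LEVELS = {"public", "internal", "restricted"}

def check_policy(name: str, category: str, access_level: str) -> list[str]:
    """Return the list of governance violations (empty means compliant)."""
    violations = []
    if not NAMING_PATTERN.match(name):
        violations.append(f"name {name!r} does not follow the naming convention")
    if category not in ALLOWED_CATEGORIES:
        violations.append(f"unknown category {category!r}")
    if access_level not in ALLOWED_ACCESS_LEVELS:
        violations.append(f"unknown access level {access_level!r}")
    return violations
```

Running a check like this when a product is registered keeps the products interoperable without a central team reviewing each one by hand.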

A central data catalogue: for data consumers to be able to find data products, companies need a central data catalogue. All existing data products are listed in this catalogue, including additional information such as the origin of the data. Furthermore, data owners can add sample data sets that data consumers can use to try out the product before using their own data sets.
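The catalogue itself can be very simple in principle: owners register products with their origin and optional sample data, and consumers search by tag. A toy sketch under those assumptions (not a real catalogue product's API):

```python
class DataCatalogue:
    """A toy central catalogue: data owners register products, consumers search them."""

    def __init__(self):
        self._products = {}  # product name -> metadata

    def register(self, name, domain, origin, tags, sample=None):
        """Register a data product, recording where its data comes from."""
        self._products[name] = {
            "domain": domain,
            "origin": origin,        # origin of the data, for traceability
            "tags": set(tags),
            "sample": sample or [],  # small data set consumers can try out
        }

    def search(self, tag):
        """Find the names of all products carrying a given tag."""
        return sorted(n for n, p in self._products.items() if tag in p["tags"])

catalogue = DataCatalogue()
catalogue.register(
    "customer-orders", domain="sales", origin="erp",
    tags=["sales", "orders"],
    sample=[{"order_id": "A-1", "order_total": "19.99"}],
)
catalogue.register("web-clicks", domain="marketing", origin="web", tags=["traffic"])

print(catalogue.search("sales"))  # → ['customer-orders']
```

The sample data attached at registration is what lets a consumer try a product out before committing to it, as described above.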


Data mesh is a new, decentralised approach to storing and processing data that might well see widespread adoption: the more companies realise that the central data repositories established in recent years no longer meet their requirements, the more they will look for alternatives. Data mesh offers them the opportunity to get more out of their existing data, while at the same time deploying their staff more efficiently and making internal processes more effective and flexible.

Adriana Calomfirescu

Global Head of Data Delivery

Adriana has 25+ years of progressive leadership experience across the analysis, design, and implementation of information technology and data systems. She is responsible for identifying technology trends in the data world and ensuring constant growth of the technical competencies in the data discipline, while also providing governance for data projects at Endava. Starting with a small, dedicated team of data engineers in 2015, under Adriana’s leadership the Data Delivery discipline has grown to include over 400 associates in 17 locations across the globe.

