Abstract
Data mesh is an emerging domain-driven decentralized data architecture that aims to minimize or avoid operational bottlenecks associated with centralized, monolithic data architectures in enterprises. The topic has piqued the practitioners' interest, and considerable gray literature exists. At the same time, we observe a lack of academic attempts at defining and building upon the concept. Hence, in this article, we aim to start from the foundations and characterize the data mesh architecture regarding its design principles, architectural components, capabilities, and organizational roles. We systematically collected, analyzed, and synthesized 114 industrial gray literature articles. The resulting review provides insights into practitioners' perspectives on the four key principles of data mesh: data as a product, domain ownership of data, self-serve data platform, and federated computational governance. Moreover, due to the comparability of data mesh and SOA (service-oriented architecture), we mapped the findings from the gray literature into the reference architectures from the SOA academic literature to create the reference architectures for describing three key dimensions of data mesh: organization of capabilities and roles, development, and runtime. Finally, we discuss open research issues in data mesh, partially based on the findings from the gray literature.
Original language | English |
---|---|
Article number | 11 |
Number of pages | 36 |
Journal | ACM Computing Surveys |
Volume | 57 |
Issue number | 1 |
Early online date | 7 Oct 2024 |
DOIs | |
Publication status | Published - Jan 2025 |
Bibliographical note
Publisher Copyright:© 2024 Copyright held by the owner/author(s).
Keywords
- data architecture
- data management
- Data mesh
- gray literature review
- principles
- reference architectures
- research challenges