Monolithic, legacy architectures and centralized data platforms thwart business agility, making it difficult to adjust quickly to a constantly changing data landscape that demands new views, new aggregations, and new projections of the data (aka data products). Both data fabric and data mesh architectures aim to abstract data management complexity and deliver data to the business with agility and scale.
Organizations relying on legacy architectures struggle to scale data and analytics. The resulting challenges hinder organizations from becoming truly data-driven and from responding quickly to business demands. Although these challenges are not entirely new to the data landscape, they have assumed greater importance as organizations strive to accelerate digital transformation.
Previous approaches to overcome these challenges include semantic layers, data virtualization, and data as a service (a data management strategy aiming to leverage data as a business asset for greater business agility). The objectives of these logical architectures are to scale the delivery of data to satisfy diverse use cases and, most importantly, provide the agility to respond quickly to changing business needs. The data fabric architecture addresses the rising complexity of data management by intelligently integrating and connecting an organization’s data and making reusable data assets available for consumption. Data mesh is an emerging architecture that furthers data fabric objectives. Let’s dive a bit deeper.
Data fabric is a design concept and architecture geared toward addressing the complexity of data management and minimizing disruption to data consumers while ensuring that any data on any platform from any location can be successfully combined, accessed, shared, and governed efficiently and effectively. A data fabric architecture is enabled by AI/ML-driven augmentation and automation, an intelligent metadata foundation, and a strong technology backbone (i.e., cloud-native, microservices-based, API-driven, interoperable, and elastic).
Data mesh focuses on organizational change – enabling domain teams to own the delivery of data products with the understanding that the domain teams are closer to their data and thus understand their data better.
Data mesh, a term coined by Zhamak Dehghani while a consultant at ThoughtWorks, is a type of data platform architecture that embraces the ubiquity of data in the enterprise by leveraging a domain-oriented, self-serve design. It enables data consumers to discover, understand, trust, and use data and data products (distributed across different domains) to drive data-driven decisions and initiatives.
Just as engineering teams transitioned from monolithic applications to microservice architectures, data teams view data mesh as a prime opportunity to transition from monolithic data platforms to data microservices (business contextual services) architecture.
Core to data mesh is the concept of breaking apart both the monolithic architecture and the monolithic custodianship of data, reorganizing ownership around domains in the organization. Data warehouses and data lakes can still exist in a mesh architecture, but they become just another node in the mesh rather than a centralized monolith.
Data mesh advocates distributed, domain-based ownership and custodianship of data and building data products that are self-described and atomic, more easily managed and delivered at the domain level. These data products are sharable with other domains and interoperable with other data products that form the data mesh. A data mesh manages data as a distributed network of self-describing data products.
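To make the idea of a "self-describing" data product concrete, here is a minimal sketch in Python. The field names (owner_domain, output_port, sla) are invented for illustration and do not follow any standard or vendor schema; the point is that the product carries its own metadata so other domains can discover and understand it without asking the owning team.

```python
from dataclasses import dataclass, field

@dataclass
class DataProduct:
    """Hypothetical descriptor for a self-describing, atomic data product."""
    name: str
    owner_domain: str        # the domain team accountable for this product
    description: str         # human-readable documentation for discovery
    schema: dict             # column name -> type, published with the data
    output_port: str         # how consumers access it (API, table, topic)
    sla: dict = field(default_factory=dict)  # freshness/quality guarantees

    def describe(self) -> dict:
        """Expose the product's own metadata so any domain in the mesh
        can discover, understand, and interoperate with it."""
        return {
            "name": self.name,
            "domain": self.owner_domain,
            "description": self.description,
            "schema": self.schema,
            "output_port": self.output_port,
            "sla": self.sla,
        }

orders = DataProduct(
    name="orders_daily",
    owner_domain="sales",
    description="Daily order aggregates per region",
    schema={"region": "string", "order_count": "int", "revenue": "decimal"},
    output_port="api://sales/orders_daily",
    sla={"freshness": "24h"},
)
print(orders.describe()["domain"])  # prints "sales"
```

A mesh, in this framing, is simply the network of such descriptors: each one is managed at the domain level but shareable and interoperable across domains.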
Data fabric and data mesh each provide a data architecture that enables an integrated, connected data experience across a distributed, complex data landscape.
How to realize data mesh benefits: Data mesh conceptual architecture
Data mesh architecture introduces a shift in how data analytics is enabled in the enterprise, built upon four design principles:

Domain-oriented ownership and architecture – domain teams, being closest to their data, own and deliver it.

Data as a product – data is treated as a product, built for discoverability, quality, and its consumers' needs.

Self-serve data infrastructure – a platform that lets domain teams build and operate data products without specialized infrastructure skills.

Federated data governance – global standards and policies applied consistently across autonomous domains.
Publicly available case studies of data mesh implementations are still few, as the architecture is in the early stages of adoption. Its effectiveness has not yet been widely demonstrated in terms of tangible business benefits, and best practices are still evolving.
Questions to ask when evaluating data mesh architecture
Organizations considering a data mesh architecture need to closely evaluate whether data mesh's domain-oriented design approach is suitable for their needs. They need to ensure that the data products will be reused and fulfill the needs of both local and cross-domain/enterprise users. If you're considering a data mesh architecture, ask whether each domain team can realistically own and deliver its data products, and whether those products will serve cross-domain and enterprise users as well as local ones.
Before implementing a data mesh architecture, enterprises need to consider multiple factors across several dimensions.
To mesh or not to mesh?
Since data mesh is an emerging architecture compared to data fabric, companies need to evaluate whether data mesh is the appropriate architecture for their needs; not every organization will benefit significantly from a data mesh architecture.
Here is how the Intelligent Data Management Cloud supports data mesh architecture

Domain-oriented ownership and architecture: Intelligent Data Management Cloud enables a metadata-driven approach to building and scaling data pipelines for any data consumer or producer. The Intelligent Data Management Cloud platform includes enterprise data catalog, data governance, and data privacy capabilities. This makes it easy for data producers and consumers to register or discover domain-specific, trusted datasets to use within their data pipelines.
Data product: The Intelligent Data Management Cloud enables enterprises to visualize, analyze, prepare, and collaborate on their data regardless of location, type, format, or the underlying source. It is a comprehensive cloud-native data management platform with data and app integration, data preparation, data quality, and API management capabilities, all in a single platform. This accelerates building data pipelines and delivering data products at scale. The Intelligent Data Management Cloud can expose data access via APIs or other modes of data access. It provides a data marketplace for both data suppliers and consumers to share, discover, and consume data products.
Self-serve data infrastructure: The Intelligent Data Management Cloud offers cloud-native, elastic self-serve data infrastructure with a low-code or no-code experience, allowing your team to go directly from ideation to implementation, responding to dynamic business requirements and changes in real time without the overhead of developing and maintaining code. The Intelligent Data Management Cloud platform includes purpose-built wizards and user experience for every type of user.
Federated data governance: The Intelligent Data Management Cloud has security and trust as a design principle rather than an afterthought. It offers the highest industry standards for data security. It offers an enterprise-scale catalog of catalogs and data privacy. It automates data governance capabilities such as data-asset classification based on domains, data curation, policy linking and enforcement. These ensure that appropriate teams (producers/consumers) can quickly access and understand data and other artifacts like AI models and pipelines. The Intelligent Data Management Cloud ensures trust with consistent enterprise-wide data quality, protects data to minimize privacy risks, and facilitates regulatory compliance.
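The federated pattern described above can be illustrated with a small sketch: policies are defined once globally, and enforcement happens per domain at the point of access. This is a toy model with invented policy labels and domain names, not Informatica's implementation.

```python
# Global (federated) policy set: defined centrally, enforced per domain.
GLOBAL_POLICIES = {
    # classification -> domains allowed to consume data with that label
    "pii": {"customer-care"},
    "financial": {"finance", "treasury"},
    "public": None,  # None means any domain may consume
}

def can_access(consumer_domain: str, classification: str) -> bool:
    """Return True if a consumer domain may use data with the given
    classification under the global policy set. Unknown classifications
    are denied by default."""
    if classification not in GLOBAL_POLICIES:
        return False
    allowed = GLOBAL_POLICIES[classification]
    if allowed is None:
        return True  # public data is open to all domains
    return consumer_domain in allowed

print(can_access("finance", "financial"))  # True
print(can_access("marketing", "pii"))      # False
```

The design choice worth noting is the deny-by-default branch: in a mesh, an unclassified data asset should not be consumable until governance has labeled it.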
Here is how the Intelligent Data Management Cloud supports data fabric architecture
The Intelligent Data Management Cloud enables an intelligent data fabric across a distributed data landscape with CLAIRE, an active metadata-based AI and ML engine that utilizes the breadth of metadata to automate data integration and management tasks.
Augmented data catalog – AI-powered intelligent data catalog enables you to find, understand, and prepare all your data with AI-driven metadata discovery and data cataloging.
Knowledge graph enriched with semantics – Enterprise knowledge graph puts data in context by linking and enriching semantic metadata and inferencing to deliver intelligence to data management functions.
Metadata activation and recommendation engine – AI-powered CLAIRE engine learns your data landscape to automate thousands of manual tasks and augment human activity with recommendations and insights.
Data preparation and data delivery – Enterprise data preparation enables you to simplify and speed up data preparation with advanced ML-based automation and data cataloging.
Orchestration and DataOps – Enterprise orchestration and XOps enable automatic orchestration of all data delivery flows by employing DataOps, MLOps, and InfosecOps in support of continuous analysis and monitoring.
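The active-metadata idea behind the capabilities above can be sketched in miniature: datasets are linked to business terms in a metadata graph, and assets that share terms are recommended to the user. The dataset names and terms below are invented for the example; a real engine like CLAIRE draws on far richer metadata (lineage, usage, profiling), but the mechanism of scoring shared metadata is the same in spirit.

```python
# Toy active-metadata graph: dataset -> business terms harvested from
# catalogs, lineage, and usage (all names here are illustrative).
metadata = {
    "sales.orders": {"order", "revenue", "customer"},
    "finance.invoices": {"revenue", "payment"},
    "support.tickets": {"customer", "issue"},
}

def recommend(dataset: str) -> list[str]:
    """Rank other datasets by how many business terms they share
    with the given dataset; drop those with no overlap."""
    terms = metadata[dataset]
    scores = {
        other: len(terms & other_terms)
        for other, other_terms in metadata.items()
        if other != dataset
    }
    return [d for d, s in sorted(scores.items(), key=lambda kv: -kv[1]) if s > 0]

print(recommend("finance.invoices"))  # ['sales.orders'] (shares "revenue")
```

Scoring on shared semantic terms rather than on raw table names is what lets such an engine surface related assets that no individual domain team would have thought to link.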
Customer Story: HelloFresh Dials Up Data Analytics with a Data Mesh to Meet Demand
HelloFresh is the world’s leading meal kit company. In 2020, the company more than doubled its annual revenue and met an unprecedented surge in demand for its meal kits during the pandemic. Key to this was scaling up data analytics and making it easy for employees to find key data to accelerate insights such as planning and forecasting cycles.
The company transitioned to a data mesh architecture to decentralize data ownership, enabling data consumers across the organization to easily find and understand relevant data. Data mesh architecture helps to scale data analytics as the company grows to keep customers happy, manage costs, and stay ahead of competitors.
With Informatica’s AI-powered Intelligent Data Management Cloud platform, HelloFresh democratized data, making it easier for more than 11,000 employees in 14 countries to find and use data. By automating the documentation of key data, the company created a single source of truth that is easy for users to discover, understand, and leverage for decision making.
Thanks to the data mesh architecture, HelloFresh was able to meet the unprecedented surge in demand for meal kits during COVID-19, more than doubling annual revenue.
Customer Story: BMC Transforms Complex Technology into Extraordinary Business Performance with a Data Fabric
BMC Software (BMC) helps companies harness technology to improve the delivery and consumption of digital services. The company's accounts payable and general ledger operations were handled by decentralized regional service centers using manual processes. This caused a lack of standardization across countries, which impaired the BMC treasury team's ability to view current account balances and forced it to maintain excessive cash reserves to cover unpredicted cash needs.
With Informatica, BMC built a functional system in a very short period of time and then layered on more sophisticated capabilities. The company dramatically improved visibility into actual and projected cash flows, enabling it to better manage cash positions and optimize the use of its working capital.
BMC saved hundreds of thousands of dollars and has much better reporting and control across the hundreds of bank accounts. It now has accurate and timely visibility into its cash holdings and has been able to elevate the rigor behind its risk management and mitigation strategies.
Both data fabric and data mesh are revolutionary architectures that enable organizations to connect and deliver data across a distributed data landscape by abstracting the underlying complexity. Since data mesh is an emerging architecture, enterprises embarking on this path should carefully assess whether this is the right architecture for their organization.
Informatica is uniquely positioned to support both your data fabric and data mesh or any other architectures that might emerge in the future via our Intelligent Data Management Cloud, thus future-proofing your investments in data and analytics. Explore our enterprise architecture center to take the next step in your modernization journey.