Building Data Trust with Democratization and Governance
Businesses are generating more data than ever. To house this massive volume of data, enterprises have adopted a mix of on-premises and cloud data stores. This complex data landscape makes it challenging to access the data, which in turn makes it difficult to leverage that data for critical business decisions.
In a Data-driven Organization, the Data Community Must Have Trust in Their Data
What does it mean to have trust in the data? “To have trust in the data” actually holds multiple meanings for users. One is that the data meets the required standards for data quality and that it is fit for the business need. Another critical element is that it is clear who owns the data. Users should be able to confirm where the data is coming from. A final consideration for data trust is the comprehensiveness of data sets — that is, whether you can access all the data you need for your decision making.
To illustrate what this means, let's examine what can happen when a business bases a decision on incomplete customer information. In this example, an e-commerce enterprise wanted to drive its revenue through customer acquisition. To do this, they needed insights from behavioral data gathered at multiple customer touchpoints across the whole customer journey. Seems logical enough: the more you know about how your customers behave and what they buy, the better your chances of acquisition or retention. The challenge is that these diverse datasets are usually located in different source systems. Some data may be in spreadsheets, while other data resides in traditional databases. Still other data sits in data lakes or data warehouses. Some enterprises even leverage real-time streaming data and IoT device data from several vendors.
At one end of the spectrum, you have the insights gained from complete datasets from all sources. You're able to apply what you know about your customer to create experiences that will engage and delight them. Compare that to the other side of the spectrum. Decisions made with incomplete or missing information create experiences that miss the mark. In other words, your ability to build a holistic view of customers depends on how well you can integrate related data sets from different source systems and provide access for decision makers.
How Do We Enable Data Democratization the Right Way? Here Are 5 Best Practices for an Effective Data-Sharing Program
One: Act on Metadata with a Data Fabric Approach
Data consumers now need access to data that is distributed across the enterprise, which requires organizations to focus on integrated data delivery and governance. A data fabric design approach can help integrate distributed data assets and simplify data access. A data fabric also provides end-to-end data management capabilities, from data discovery to data delivery.
Gartner® says, “By 2024, organizations that utilize active metadata to enrich and deliver a dynamic data fabric will reduce time to integrated data delivery by 50% and improve the productivity of data teams by 20%.”1
Because data fabrics act on metadata rather than the data itself, they are agnostic to data platform, type and location. And by linking the active metadata of datasets, the data fabric provides a unified view of the data without any need to move it from its original location.
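To make the metadata-driven idea concrete, here is a minimal Python sketch, not any vendor's API: the catalog stores and links only metadata entries (name, platform, location), so a unified view can span platforms while the underlying data stays put. All class, dataset and platform names are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class DatasetMetadata:
    name: str
    platform: str        # e.g. "snowflake", "s3", "on_prem_oracle"
    location: str        # connection URI or path; the data is never copied
    linked_to: list = field(default_factory=list)

class MetadataCatalog:
    """Catalogs and links metadata only; agnostic to where data lives."""

    def __init__(self):
        self._entries = {}

    def register(self, meta: DatasetMetadata):
        self._entries[meta.name] = meta

    def link(self, a: str, b: str):
        # Linking happens purely at the metadata level.
        self._entries[a].linked_to.append(b)
        self._entries[b].linked_to.append(a)

    def unified_view(self, name: str):
        # Resolve a dataset plus everything linked to it, across platforms.
        root = self._entries[name]
        return [root] + [self._entries[n] for n in root.linked_to]

catalog = MetadataCatalog()
catalog.register(DatasetMetadata("orders", "snowflake", "sf://sales/orders"))
catalog.register(DatasetMetadata("clickstream", "s3", "s3://events/clicks/"))
catalog.link("orders", "clickstream")

view = catalog.unified_view("orders")
print([m.platform for m in view])  # the view spans two platforms, no data moved
```

Note that `unified_view` never opens a connection to Snowflake or S3; it only walks metadata links, which is what makes the fabric design platform-agnostic.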
Two: Understand the Data Context
Companies are constantly innovating and moving into new lines of business to generate multiple revenue streams. This diversification of business initiatives has led to an increased number of citizen data consumers. The sole job of many of these non-technical data users? To analyze the data and find ways to drive business outcomes.
These users must understand what data is currently available and what data they need, based on their current domain. A semantic layer and business glossary deliver a business representation of data and help these users understand the data context and search for data accordingly. Capturing additional information as part of the metadata, such as who else is using these datasets and for what purpose, can give data consumers more visibility into which data is best suited for them.
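As a rough illustration of what a business glossary entry might capture, the sketch below records a business definition, the datasets behind it, and who else uses them. The term, dataset names and teams are made up for the example; a real glossary would be richer and managed, not a dictionary literal.

```python
# Hypothetical glossary: maps a business term to its definition,
# backing datasets, and current consumers (all values illustrative).
glossary = {
    "customer_churn": {
        "definition": "Customers who stopped purchasing in the last 90 days",
        "datasets": ["crm.accounts", "billing.invoices"],
        "used_by": [("marketing", "retention campaigns"),
                    ("finance", "revenue forecasting")],
    },
}

def describe(term: str) -> str:
    # Render a business-friendly summary so a non-technical consumer
    # can judge whether this data fits their domain.
    entry = glossary[term]
    users = ", ".join(f"{team} ({purpose})" for team, purpose in entry["used_by"])
    return (f"{term}: {entry['definition']}\n"
            f"  datasets: {', '.join(entry['datasets'])}\n"
            f"  also used by: {users}")

print(describe("customer_churn"))
```

The "also used by" line is the visibility point from the paragraph above: seeing that marketing already uses these datasets for retention campaigns helps a new consumer decide whether the data suits their own need.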
Three: Enable Self-Service Data Access
IT has traditionally played a key role in delivering the required data to all data community users within an enterprise. This approach has its challenges, such as one-off requests, resources wasted on duplicate requests and delays in fulfillment.
To address these costly challenges, enterprises need a shared service: an approach that does not require IT resources but provides data consumers with self-service data access. A self-service data marketplace with roles set by the organization can give consumers a one-stop shop for all their data needs. With a data marketplace, the enterprise can empower data consumers with self-service access to relevant data while providing appropriate guardrails for responsible, governed use of data.
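The "guardrails" idea can be sketched as a simple policy check at request time. The roles, sensitivity levels and routing below are hypothetical assumptions for illustration, not a description of any specific marketplace product.

```python
# Illustrative role-based guardrails: a request is auto-fulfilled only
# when the consumer's role is entitled to the dataset's sensitivity
# level; otherwise it is routed to a data steward for review.
ROLE_ENTITLEMENTS = {
    "analyst": {"public", "internal"},
    "data_scientist": {"public", "internal", "confidential"},
}
DATASET_SENSITIVITY = {
    "web_traffic": "public",
    "customer_pii": "confidential",
}

def request_access(role: str, dataset: str) -> str:
    allowed = DATASET_SENSITIVITY[dataset] in ROLE_ENTITLEMENTS.get(role, set())
    return "granted" if allowed else "routed_to_steward_for_review"

print(request_access("analyst", "web_traffic"))   # granted without IT involvement
print(request_access("analyst", "customer_pii"))  # escalated, not silently denied
```

The point of the sketch is the shape of the workflow: most requests are fulfilled instantly without an IT ticket, and only the sensitive ones consume a human reviewer's time.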
Four: Automate Data Management Tasks
Data fabric architecture design provides comprehensive data management capabilities. But there remains the question of how to manage and fulfill each data request. Intelligent automation embedded in the data fabric design can help automate various data management tasks, allowing data consumers to focus their efforts on creating value from data instead of waiting for the right data.
As we deal with data in a distributed landscape, automated data governance and delivery has become ever more important. While AI-enabled governance can spur enterprise-wide scalable governance, a data marketplace can provide automated data delivery for data requests. Organizations need to augment these automation capabilities in the data fabric design itself, to support delivery of data at enterprise scale through the marketplace.
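As one small example of a governance task that lends itself to automation, a rule-based classifier can tag sensitive columns so that delivery policies act on the tags instead of requiring manual review of every request. The patterns and column names below are simplified assumptions; production systems typically use richer detection than keyword matching.

```python
import re

# Illustrative automation: scan column names and auto-apply governance
# tags ("pii" vs "general") that downstream delivery policies can use.
PII_PATTERNS = [r"email", r"ssn", r"phone", r"birth"]

def auto_tag(columns):
    tags = {}
    for col in columns:
        sensitive = any(re.search(p, col, re.IGNORECASE) for p in PII_PATTERNS)
        tags[col] = "pii" if sensitive else "general"
    return tags

print(auto_tag(["customer_email", "order_total", "date_of_birth"]))
```

Embedding checks like this in the fabric itself is what lets governance keep pace with enterprise-scale delivery: tags are applied as datasets are cataloged, not after a consumer has already requested them.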
Five: Data Community Collaboration
To continuously curate and enhance the datasets and customize them for data consumers, data producers and consumers need to have more contextual discussions on the data assets, categories and collections.
And access to enterprise data — for the organization’s ecosystem of distributors, suppliers and partners — is key to scaling the business. For example, your suppliers could need self-service access to the data pertaining to a current bid. Or a partner might need to access data relevant to your joint business. Or perhaps one of your distributors has identified that they would be able to operate more effectively if they had access to your current inventory.
Each of these opportunities for collaboration and access would benefit from a shared service with an intuitive interface. One that could incorporate multi-marketplace capability (with the ability to provision for multiple self-service entry points for relevant stakeholders). It would enable real-time messaging among data stakeholders, provide the ability to review and rate datasets and share data knowledge with the community.
Data democratization is driving various initiatives for enterprises, and building data trust is at the top of the list. This trust fuels analytics, data science and ML initiatives across the organization. A data fabric approach to democratizing data, facilitated by a data marketplace, will build data trust and drive a data-driven culture.
Informatica Cloud Data Marketplace provides a seamless shopping experience to the data community and helps transform businesses by enabling data democratization across the enterprise.
Read about Cloud Data Marketplace in this data sheet.
Join us for a live demo.
1 Source: Gartner, “Cool Vendors™ in Data Management: Creating Operational Efficiencies,” Nina Showell, et al., October 21, 2021
GARTNER is a registered trademark and service mark, and COOL VENDORS is a registered trademark of Gartner, Inc. and/or its affiliates and are used herein with permission. All rights reserved.