Big Data Governance: 4 Steps to Scaling an Enterprise Data Governance Program

Last Published: Aug 05, 2021
Thomas Brence

Senior Director, Product Marketing: Data Quality and Governance

Big data projects demand a big data governance strategy. You need to unite disconnected teams, shatter data silos, and integrate all manner of applications and systems.

This is a world away from traditional data governance projects, which are often highly focused initiatives.

The challenge is bridging this gap. Just how do you turn a small-scale data governance pilot into a long-term, enterprise-wide big data governance program?

Enterprise data governance programs that are focused on big data need to:

  • Encourage collaboration among teams
  • Be adaptive and flexible in processes
  • Democratize data so that users can actively analyze data

But what does this mean in practice? Here are four steps if you plan on taking on this challenge and scaling up to a big data governance program:

1. Catalog your data

Every data governance initiative, big or small, depends on visibility into data. Data stewards need to see where it resides, where it’s coming from, how it’s used, and who uses it. Retaining this visibility can be a huge challenge as you scale your data governance program. It’s simple—the more data you manage, the harder it is to track.

Cataloging your data is critical to solving this problem, and the best data catalogs rely on metadata. You need to be able to identify the metadata associated with all of your data assets, wherever they currently reside. This will provide you with crucial insight into the processes, people, and platforms that touch this data.

One word of warning here: if you’re using multiple legacy systems to manage and hold data, it’s likely you’ll have to deal with different schemas and types of metadata. This can transform the task of cataloging data into a long and costly process. Conducting a full audit of these systems is an essential starting point. It’s also worth asking your data catalog vendor how they’ll tackle the challenge of cataloging different types of metadata quickly and cost-effectively.
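To make the schema problem concrete, here is a minimal sketch of mapping metadata from two legacy systems into one common catalog record. Everything in it is an assumption for illustration: the system names (`legacy_crm`, `legacy_erp`), the field names, and the `CatalogEntry` shape are hypothetical and do not come from any particular data catalog product.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CatalogEntry:
    """A common metadata record for one data asset (hypothetical schema)."""
    name: str
    location: str   # where the data resides
    source: str     # where it comes from
    owners: List[str] = field(default_factory=list)  # who uses or stewards it


# Hypothetical mappings: each legacy system names the same facts differently.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "legacy_crm": {"name": "tbl", "location": "db_path",
                   "source": "feed", "owners": "stewards"},
    "legacy_erp": {"name": "TABLE_NAME", "location": "SCHEMA",
                   "source": "ORIGIN", "owners": "OWNER_LIST"},
}


def normalize(system: str, raw: dict) -> CatalogEntry:
    """Translate one system-specific metadata record into the common schema."""
    mapping = FIELD_MAPS[system]
    return CatalogEntry(
        name=raw[mapping["name"]],
        location=raw[mapping["location"]],
        source=raw[mapping["source"]],
        # Owner information is often missing in older systems, so default to empty.
        owners=list(raw.get(mapping["owners"], [])),
    )
```

Calling `normalize("legacy_erp", {"TABLE_NAME": "orders", "SCHEMA": "sales", "ORIGIN": "pos"})` would yield an entry with an empty owner list, which is exactly the kind of gap an audit should surface.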

2. Enable collaboration across the business

Data governance is a team sport, and collaboration will become increasingly important as you scale your data governance program. You and your team will be dealing with many different departments and won’t have insight into every process that touches data. This can make it difficult to develop policies that work for the business.


[Image: a group of people working around a table, illustrating how collaboration brings together workflows, policies, definitions, and rules.]


The solution is to get business users involved. You need to empower people on the frontline to share their knowledge of processes and provide input on your policies.

The most important thing is getting everyone onto the same page. Look for a collaboration tool that brings together workflows, policies, definitions, and rules so you can create a single source of truth about the value, reliability, and lineage of your data. That way you can see the impact of your decisions on policy.

Involving business users will also help secure buy-in for your program. Everyone likes to have input on policies and processes that affect them, and empowering business users is a great way to bring them aboard.

3. Gain data governance agility by aligning your people

No matter how big your data governance program becomes, it must retain its agility. If you can’t adapt quickly, then you’ll lose momentum and your initiative will start to deliver diminishing returns.

A big challenge here is aligning the huge number of people involved in your initiative. We’ve already discussed the need for collaboration, but crowdsourcing solutions to big decisions can soon lead to analysis paralysis.

The solution is to develop an efficient decision-making system that allows for everyone’s voice to be heard. A best-practice decision-making framework, such as a DACI approach (Driver, Approver, Contributor, and Informed), can help here. These frameworks establish a continuous cycle of listening and acting, where everyone has a chance to feed into the discussion, but a small group of clearly identified people retain control over decision-making.
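One way to picture the DACI split is as a small record that accepts input from many people but lets only one named person sign off. The sketch below is an illustration under stated assumptions, not a standard implementation: the class name `DaciDecision`, its fields, and its methods are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DaciDecision:
    """One governance decision with DACI roles (hypothetical structure)."""
    topic: str
    driver: str                 # runs the process and gathers input
    approver: str               # the one person who makes the final call
    contributors: List[str] = field(default_factory=list)  # give input
    informed: List[str] = field(default_factory=list)      # told the outcome
    inputs: List[str] = field(default_factory=list)
    outcome: str = ""

    def add_input(self, person: str, comment: str) -> None:
        """Record input; only the driver and contributors may feed in."""
        if person != self.driver and person not in self.contributors:
            raise PermissionError(f"{person} is not a contributor on '{self.topic}'")
        self.inputs.append(f"{person}: {comment}")

    def approve(self, person: str, outcome: str) -> None:
        """Record the final decision; only the approver may decide."""
        if person != self.approver:
            raise PermissionError(f"{person} cannot approve '{self.topic}'")
        self.outcome = outcome
```

The point of the design is that `add_input` is open to a wide group while `approve` is restricted to a single role, which mirrors the listen-then-act cycle described above.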

That way, everyone’s happy and you make steady progress.

4. Automate data governance with artificial intelligence

The other big issue with scaling a data governance program is coping with an ever-growing volume of data. You’ll be dealing with oceans of data assets, of varying types, flowing in from myriad sources.

The good news is that machine learning tools can now help you automate many of the functions needed for critical data-element discovery, quality, rule-based verification, and reporting. They can also help you improve productivity by providing users with intelligent recommendations.

Here are just a few examples of how AI-enabled data governance tools work in practice:

  • Data users working with one data set can be presented with similar data sets that provide context
  • New unstructured data can be onboarded, structured, and categorized automatically
  • Business terms and definitions can be automatically associated with physical datasets
  • Data can be tagged automatically based on the logic learned from previous tagging practices
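The last example in the list above, tagging based on previous tagging practice, can be sketched in a few lines. This is a deliberately simple stand-in that counts which tags earlier column names received, word by word; it is not how any specific AI-enabled governance tool works, and the function names and sample tags are hypothetical.

```python
import re
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Optional, Tuple


def tokens(column_name: str) -> List[str]:
    """Split a column name such as 'customer_email' into lowercase words."""
    return [t for t in re.split(r"[^a-z0-9]+", column_name.lower()) if t]


def learn(tagged: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Count, for each word, how often past columns containing it got each tag."""
    votes: Dict[str, Counter] = defaultdict(Counter)
    for name, tag in tagged:
        for word in tokens(name):
            votes[word][tag] += 1
    return votes


def suggest(votes: Dict[str, Counter], column_name: str) -> Optional[str]:
    """Suggest the tag most often seen with this column's words, if any."""
    totals: Counter = Counter()
    for word in tokens(column_name):
        totals.update(votes.get(word, {}))
    if not totals:
        return None  # nothing learned yet; leave the column for a data steward
    return totals.most_common(1)[0][0]
```

Given a history in which `customer_email` and `contact_email` were tagged `PII`, `suggest` would propose `PII` for a new `billing_email` column and return `None` for a name it has never seen any part of. A production tool would use far richer signals, but the feedback loop is the same: past steward decisions inform the next suggestion.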

All of these capabilities may seem relatively minor, but, put together, they add up to significant savings in terms of time and resources, especially when they’re operationalized at the scale needed to process hundreds of millions of records—or more.

Time to go big with data governance

There’s no question about it: scaling data governance is tough. But if you’re taking on a big data project, it’s also necessary.

The good news is that you can now automate the time-consuming manual tasks that held data governance initiatives back in the past. At the same time, innovations in metadata management have made it far easier to track data as it flows across the enterprise.

And the benefits of implementing an enterprise-wide data governance program go far beyond enabling big analytics projects. Total visibility and control over your data assets (how they’re used, who uses them, where they’re stored) is priceless. You can improve data privacy, accelerate digital transformation, and unite disconnected departments. We’re talking about making huge changes to the way you work, and each of them will have an impact on your business’s bottom line.

If you’re a data governance leader, it’s time to go big.

For more practical guidance on scaling data governance, read our eBook, “How to Scale Data Governance for Digital Transformation”.

Also, I am discussing "The 5 Foundations of a Successful Enterprise Data Governance Program" at an upcoming Aug 28th webinar. Please don't miss it! Register here.


First Published: Jun 30, 2019