Welcome back to part two of my blog series on Data Quality for Everyone, Everywhere—let’s jump straight in. (If you missed Part 1, you can find it here.)
If your organization is like most, you know you have data quality issues everywhere. It’s also very likely that you’re tackling them in an ad hoc way because it’s hard to pinpoint where the problems are and where they originate. You don’t know where to start. But you do know it’s difficult to fix them.
You don’t have the right tools to engage all the stakeholders who need to be involved. You aren’t able to cleanse multiple data domains or data in all country locales and languages. And you can’t access all the data sources you need to cleanse or prevent bad data from entering applications every day.
Let’s take a closer look at these obstacles.
Traditionally, only IT developers have been involved in data quality projects, using code or tools to build rules. But data quality is not just an IT problem. It is also a business problem, and solving it requires business ownership and knowledge. Yet the business isn’t equipped with the tools it needs to do anything about it, so key stakeholders can’t take part in data quality processes.
Data stewards and business analysts, the people responsible for data within applications or processes, typically have generic or custom-built tools that are ill-equipped to manage multiple data types or to support the wide range of projects where data quality is a key part of the solution. They rely heavily on IT to access data, make changes to rules, update reference data, and pull reports. All of these activities take time and introduce delays. For example, a data steward typically runs some macros or code in a spreadsheet or database to test the data for data quality errors. If the steward finds issues, they make notes and email them to IT to specify what needs to change. The next month, the steward does the same thing again. Nobody is completely happy with or confident in the process. It takes too long, is too inefficient, and does not scale.
Line-of-business managers, who are most directly affected by poor data quality, also lack the tools they need to participate in improving data quality. They aren’t aware of the business impact of poor-quality data on their processes and applications and can’t accelerate resolution. While they may be willing to take responsibility for data quality, unless they have the tools they need, the business will remain frustrated and on the sidelines.
Traditionally, data quality deployments have been focused on customer data-related processes in marketing, sales, and billing. Data quality products were limited to name and address data.
However, the business impact of poor data quality in other domains, such as product, financial, IoT, and asset data, is significant. Retrofitting these traditional data quality tools to address customer, product, finance, asset, location, IoT, and partner data is difficult.
And given the global nature of today’s business environment, data quality tools need to provide global coverage in data matching, cleansing, verification, and data enrichment for all countries and locales. If data quality tools can only address customer data within specific geographies, they will not be able to deliver full return on investment. Their limited scope of capabilities will hinder global customer service and operational efficiency initiatives, such as single view of customer and master data management, which will continue to have a negative impact on the business.
Applications are driven by multiple data sources across flat files, unstructured and semi-structured data, data warehouses, ERP systems, CRM applications, and legacy mainframes, to name a few. Poor data quality enters the organization in multiple ways and flows from one application to another.
The major source of corruption is the point of data entry or data capture, where a user can enter incomplete, inconsistent, nonstandard, or duplicate data. Similarly, as applications are modernized, the data being migrated to the new application typically does not go through a data profiling process to understand the structure and quality of the data; a data quality process to standardize and cleanse the data; or a data enrichment process to increase the usability and value of the data. While some applications have adequate controls in place, most do not prevent bad data from entering their systems.
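To make the profiling and standardization steps a little more concrete, here is a minimal sketch in plain Python. It is not how any particular data quality product works; the sample records, the `COUNTRY_CODES` reference data, and the `profile` and `standardize` helpers are all hypothetical names made up for illustration.

```python
from collections import Counter


def profile(records, fields):
    """Summarize completeness and value spread for each field."""
    total = len(records)
    report = {}
    for field in fields:
        # Treat missing keys, None, and whitespace-only strings as empty.
        values = [(rec.get(field) or "").strip() for rec in records]
        filled = [v for v in values if v]
        report[field] = {
            "completeness": len(filled) / total if total else 0.0,
            "distinct": len(set(filled)),
            "top_values": Counter(filled).most_common(3),
        }
    return report


def standardize(value, mapping):
    """Map known variants to one standard form; unknown values are only trimmed."""
    cleaned = (value or "").strip()
    return mapping.get(cleaned.lower(), cleaned)


# Hypothetical sample records, as they might arrive from a migration extract.
records = [
    {"name": "Ada Lopez", "country": "USA"},
    {"name": "Ada Lopez", "country": "United States"},
    {"name": "", "country": " usa "},
    {"name": "Ben Okoro", "country": None},
]

# Hypothetical reference data: lowercase variant -> standard code.
COUNTRY_CODES = {"usa": "US", "united states": "US", "us": "US"}

before = profile(records, ["name", "country"])
for rec in records:
    rec["country"] = standardize(rec["country"], COUNTRY_CODES)
after = profile(records, ["name", "country"])

print(before["country"]["distinct"], after["country"]["distinct"])
```

Profiling first shows three different spellings of the same country and a quarter of the values missing; after standardization the spellings collapse to one code, while the missing values remain visible for someone to follow up on.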
There is no process for implementing common data quality standards across all applications. Data quality rules may be implemented for departmental applications, but the rules can’t be reused or scaled across the enterprise. Without a way to reuse data quality rules across multiple applications and multiple projects, there is no way of protecting all your applications from being polluted by poor data quality.
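As a rough illustration of what reusable rules could look like, the sketch below defines a rule set once and applies it to records from two different applications. Everything here is an assumption for the sake of the example: the rule names, the simplified email pattern, and the `crm_records` and `billing_records` samples are hypothetical.

```python
import re

# Deliberately simple pattern for illustration; not a full email validator.
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def not_blank(field):
    """Build a rule that passes when the field has a non-empty value."""
    def rule(rec):
        return bool((rec.get(field) or "").strip())
    return rule


def matches(field, pattern):
    """Build a rule that passes when the whole field value matches the pattern."""
    def rule(rec):
        return pattern.fullmatch((rec.get(field) or "").strip()) is not None
    return rule


# One shared rule set, defined once (rule names are hypothetical).
SHARED_RULES = {
    "email_present": not_blank("email"),
    "email_well_formed": matches("email", EMAIL_PATTERN),
}


def count_failures(records, rules):
    """Count how many records fail each named rule."""
    failures = {name: 0 for name in rules}
    for rec in records:
        for name, rule in rules.items():
            if not rule(rec):
                failures[name] += 1
    return failures


# Hypothetical records from two separate applications.
crm_records = [{"email": "ada@example.com"}, {"email": "not-an-email"}]
billing_records = [{"email": ""}, {"email": "ben@example.org"}]

crm_report = count_failures(crm_records, SHARED_RULES)
billing_report = count_failures(billing_records, SHARED_RULES)
print(crm_report, billing_report)
```

The point of the design is that the rules live in one place: when the definition of a valid email changes, every application that checks against `SHARED_RULES` picks up the change, rather than each department maintaining its own copy.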
Developing and launching a data governance program to support compliance initiatives and accelerate data-driven digital transformation is no small feat. The scale and complexity of today’s data environments make it difficult or impossible to achieve with manual processes. Data professionals need a solution that automatically discovers what data the organization has, where and how that data is used, and whether it can be trusted. However, knowing where to start—which projects to focus on, what budget to go after, and who to work with—is perhaps the greatest challenge of all.
A study recently commissioned by Accenture and Qlik revealed that only 32% of companies reported being able to realize tangible and measurable value from data. One of the causes cited was that individuals do not truly understand the data or its potential.
Turning data into value requires the ability to understand data in context: where it came from, whether it is fit for purpose, whether it can be trusted, and how it flows through your data pipelines. It also requires the skills to shape the data into a usable format and communicate actionable insights. The symbiotic relationship between business and IT requires business leaders to enhance their data literacy skills and data leaders to improve their business acumen.
In the final blog of this series, I will discuss “Delivering Data Quality for Everyone, Everywhere.” If you want to understand the structure and quality of your data, why not try profiling it with a Free 30-day Trial of Cloud Data Quality?