Third-Party Data: An Overlooked Enterprise Data Strategy
As Linux kernel creator Linus Torvalds put it, “Intelligence is the ability to avoid doing work, yet getting the work done.” In that spirit, this article discusses the benefits of leveraging data that already exists rather than trying to create everything internally or making decisions in the absence of that information. Building a strategy around the third-party data you already use, or could use in the future, across your broader ecosystem can amplify your overall data strategy. Informatica has designed a Data Strategy Framework that includes elements of a third-party data strategy, which line up in the Strategic perspective as well as in the Data Capabilities areas.
When companies look at their data assets, they often think only of the data that exists within their transactional systems (internal or customer-facing) and their analytic systems. What is often overlooked, and rarely centralized, is the investment made in third-party data assets.
Many of these are free datasets (e.g., ISO codes, Census data, CIA factbook, Data.gov, EU Open Data Portal), and cover a vast number of potential scenarios that can complement or replace internal data assets. There are also a great variety of companies that have developed sophisticated business models embracing data as a service (DaaS), such as DNB, LexisNexis, Bloomberg, etc.
Many companies do leverage third-party data sources, although the intent is typically to solve a particular point problem (e.g., trying to find consumer white space for a marketing campaign) rather than to make a broader investment in leveraging the data across all facets of the company as a critical data asset. While creating data lakes that include this data can be helpful, without a centralized plan for data management, a centralized catalog to allow searchability, and a strategy around integration or inclusion, these assets tend to fall short of realizing true enterprise value.
“It is much easier to put existing resources to better use, than to develop resources where they do not exist.” – George Soros
As you see in Figure 1, third-party data can be leveraged in multiple ways:
- Master data (all or some attributes)
- An extension to existing transactional system data (e.g., credit check information, company size, location)
- In the analytic space, as a horizontal or vertical enrichment to existing assets or even completely new data assets which can be synthesized to produce new analytics (as seen increasingly in the data science realm depicted as a subset in the overall analytical domain).
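As a rough illustration of the enrichment ideas in the list above, the sketch below assumes that "horizontal" enrichment means adding new attributes to records you already hold and "vertical" enrichment means adding new records. That reading, the `company_id` key, and all sample values are assumptions made for the example, not part of any vendor's data model.

```python
from typing import Dict, List

# Hypothetical internal customer records, keyed by an assumed shared identifier.
internal_customers: List[Dict[str, object]] = [
    {"company_id": "C001", "name": "Acme Corp"},
    {"company_id": "C002", "name": "Globex"},
]

# Hypothetical third-party firmographic feed keyed by the same identifier.
third_party: Dict[str, Dict[str, object]] = {
    "C001": {"employee_count": 1200, "country": "US"},
    "C003": {"name": "Initech", "employee_count": 85, "country": "CA"},
}


def enrich_horizontally(records, external):
    """Add third-party attributes (new columns) to records that match on company_id."""
    enriched = []
    for record in records:
        extra = external.get(record["company_id"], {})
        # Internal values win on conflict; external data only fills in new attributes.
        enriched.append({**extra, **record})
    return enriched


def enrich_vertically(records, external):
    """Append third-party entities (new rows) that are not present internally."""
    known_ids = {record["company_id"] for record in records}
    new_rows = [
        {"company_id": key, **attrs}
        for key, attrs in external.items()
        if key not in known_ids
    ]
    return records + new_rows


wide = enrich_horizontally(internal_customers, third_party)
tall = enrich_vertically(internal_customers, third_party)
```

In this sketch internal values take precedence when both sources carry the same attribute. Whether the vendor or the internal system should win is a governance decision that will vary by attribute.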
Benefits
So why try a holistic third-party data strategy rather than continuing point solutions?
Figure 2 shows a more complete view of the ecosystem, in which third-party data could improve almost every aspect of a modern data environment. Ultimately, value is realized both in discrete ways, where data solves a particular problem, and at the enterprise level, where others in the organization (sales, marketing, operations, finance, and others) gain awareness of these data assets and how they might leverage them. The Axon data catalog is strong in this area, offering data and metadata discoverability as well as data lineage.
Below is a list of potential use cases in which external datasets can be helpful in calculating value, including but not limited to:
Many companies find that third-party data is purchased separately by different parts of the business to address these types of use cases. Consolidating vendor relationships and renegotiating data usage rights at the enterprise level can be a potential source of savings. Creating a centralized governance group or model that is relentlessly looking for synergistic benefits of these datasets across the organization can greatly reduce costs while also introducing proactive reusability.
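One way a governance group might start looking for consolidation candidates is to inventory existing contracts and flag datasets bought by more than one business unit. The following is a minimal sketch under that assumption. The contract fields, vendor names, and costs are hypothetical, and the combined cost is only a starting point for negotiation, not an estimate of savings.

```python
from collections import defaultdict

# Hypothetical contract inventory; vendors, datasets, and costs are illustrative only.
contracts = [
    {"business_unit": "Sales", "vendor": "VendorA",
     "dataset": "company_firmographics", "annual_cost": 50000},
    {"business_unit": "Marketing", "vendor": "VendorA",
     "dataset": "company_firmographics", "annual_cost": 40000},
    {"business_unit": "Finance", "vendor": "VendorB",
     "dataset": "credit_scores", "annual_cost": 30000},
]


def find_overlapping_purchases(rows):
    """Return vendor/dataset pairs that more than one business unit pays for."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[(row["vendor"], row["dataset"])].append(row)

    overlaps = {}
    for key, group in grouped.items():
        units = sorted({row["business_unit"] for row in group})
        if len(units) > 1:
            overlaps[key] = {
                "business_units": units,
                "combined_annual_cost": sum(row["annual_cost"] for row in group),
            }
    return overlaps


overlaps = find_overlapping_purchases(contracts)
```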
“The reality is that we are all economists. We all deal with scarcity as we make choices and calculate how to ration various items and resources that we consume, produce and utilize.” – Kurt Bills
Other key elements of enterprise value that come from a centralized strategy and governance model for third-party data include:
- Economies of scale in purchase from vendors and negotiating leverage
- Utilization of standard reference data or enterprise keys leading to the ability to synthesize disparate data sources more easily
- Reduction in overall data creation internally and less overall data maintenance
- Improvement in overall data accuracy and quality scores
- Redeployment of resources to more value-added activities (e.g., do you want your sales team to spend their time selling or entering customer master data into Salesforce?)
Challenges
If it were easy, it would already have been done! Two primary challenges are:
- Political
In many organizations, asking groups to surrender some level of control over contracts, or even to align on a specific vendor or source for a particular piece of data, is not trivial. A particular group (such as sales, marketing, or finance) may have a historical bias or a prior relationship, or may simply want to maintain full control over all elements of a choice.
Do not discount this challenge as being an easy one to solve. As one of my first mentors advised me years ago, “the technical stuff is much easier because it’s logical, the people are far tougher because they are not always logical and also tend to change their minds often.” Understanding who is behind or against specific data domains and vendors is critical in appreciating their motivations and helping to build a partnership for a more holistic alignment.
If there is a chief process officer or overall change agent within the organization, it might be extremely valuable to enlist their support in the mission. They will likely recognize the value proposition of this change and can help you estimate it, and because of their close ties to the CFO or CEO, they may have the political capital to push against inertia.
- Technical
As mentioned, the technical side is logical, but that by no means makes it a slam dunk. Humans are still involved, and architectural or data debt might place many obstacles in the path to success. Additionally, complicated legacy systems and extensive backlogs may push this effort lower in priority, especially if it is wrongly viewed as re-architecting.
Assuming you can make this a high priority, there are impacts across both the transactional and analytic systems landscape. In cases where you are bringing in a new domain that never existed, this effort might be easier. You will still have to figure out ways to integrate it with existing data to be able to realize greater benefits.
For substitution/replacement into existing data domains, such as customer organization data, you potentially will have to wrestle with data conversions, system logic and analytical artifacts, public and private key harmonization, replacement of existing attributes with standardized ones, and even enriching or augmenting data with new attributes or entities.
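To make the substitution scenario more concrete, here is a small sketch of key harmonization and attribute replacement. It assumes that "private" keys are internal system identifiers and "public" keys are identifiers issued by a data vendor, and that a crosswalk table between the two already exists. The key formats, attribute names, and mapping tables are all hypothetical.

```python
# Hypothetical crosswalk from internal (private) customer keys to a vendor's public identifier.
crosswalk = {"CUST-17": "EXT-9001", "CUST-42": "EXT-9002"}

# Hypothetical vendor reference records keyed by the public identifier.
vendor_records = {
    "EXT-9001": {
        "legal_name": "Acme Corporation",
        "country_code": "US",
        "industry_code": "3599",
    },
}

# Hypothetical internal records with free-form, non-standardized attributes.
internal_records = [
    {"customer_key": "CUST-17", "name": "ACME corp.", "country": "usa"},
    {"customer_key": "CUST-42", "name": "Globex", "country": "Germany"},
    {"customer_key": "CUST-99", "name": "Initech", "country": "US"},
]

# Internal attribute -> standardized vendor attribute that replaces it.
REPLACED_ATTRIBUTES = {"name": "legal_name", "country": "country_code"}
# Vendor attributes added as new fields (augmentation).
AUGMENTED_ATTRIBUTES = ["industry_code"]


def harmonize(record):
    """Attach the public key, then replace and augment attributes when a vendor match exists."""
    external_key = crosswalk.get(record["customer_key"])
    vendor = vendor_records.get(external_key) if external_key else None
    result = dict(record, external_key=external_key)
    if vendor is None:
        # Keep the internal values untouched and flag the record for stewardship review.
        result["match_status"] = "unmatched"
        return result
    for internal_attr, vendor_attr in REPLACED_ATTRIBUTES.items():
        result[internal_attr] = vendor[vendor_attr]
    for attr in AUGMENTED_ATTRIBUTES:
        result[attr] = vendor[attr]
    result["match_status"] = "matched"
    return result


harmonized = [harmonize(record) for record in internal_records]
```

Records without a crosswalk entry or a vendor match keep their internal values and are flagged rather than dropped, since downstream system logic and analytical artifacts may still depend on them.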
Conclusion
There are no absolutes around the value of data. Poor data choices, poor vendor quality, or a third-party data strategy that is improperly aligned across systems and processes can jeopardize agreement and corporate buy-in. Still, there is quantifiable value in bringing third-party data into existing ecosystems to get more than an insular internal view of the world.
Rather than tackle the ocean of possibilities, pick a particular data domain, understand the relevant use cases and needs around that domain, and explore what is already in place. Work with your key business stakeholders to look for opportunities to standardize and consolidate vendors. Avoid making this a cost-savings exercise. If nothing exists, explore potential sources, interview vendors, and engage your stakeholders (business and technical) to explore the value and use cases that already exist, and look for the aforementioned benefits to create a compelling value proposition.
Finally, it cannot be overemphasized that a well-run and managed data governance program can play a critical role in making this next evolutionary step in your organization’s data journey. At Informatica Advisory Services, we offer strength and depth in data strategy, data management, data governance, MDM, cloud data transformation, and business adoption and would be happy to engage and help you drive value.