You cannot anticipate tomorrow's data needs. The best you can do is avoid paying tomorrow for the mistakes you leave uncorrected today.
Organizations today wrangle data from a huge variety of sources through an equally varied set of interfaces. To make sense of that data, IT departments must extract it, transform it into a usable format, and load it into the right data warehousing systems. Data from multiple applications is often integrated to inform a single business process.
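To make the extract-transform-load pattern concrete, here is a minimal sketch in Python. The databases, table names, and cleanup rules are hypothetical stand-ins rather than a prescription for any particular tool; a real pipeline would use its own drivers, credentials, and business rules.

```python
import sqlite3

# Hypothetical databases: a line-of-business application (source) and a
# reporting warehouse (target). In-memory SQLite keeps the sketch runnable.
source = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")

# Illustrative setup only: seed a source table so the example runs end to end.
source.execute("CREATE TABLE customers (id INTEGER, name TEXT, country TEXT)")
source.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, " Ada Lovelace ", "gb"), (2, "Edgar Codd", "us")],
)
warehouse.execute(
    "CREATE TABLE dim_customer (id INTEGER PRIMARY KEY, name TEXT, country TEXT)"
)

# Extract: pull raw rows from the application database.
rows = source.execute("SELECT id, name, country FROM customers").fetchall()

# Transform: normalize into the shape the warehouse expects
# (trimmed names, upper-cased country codes).
cleaned = [(cid, name.strip(), country.strip().upper()) for cid, name, country in rows]

# Load: write the usable rows into the warehouse table.
warehouse.executemany("INSERT OR REPLACE INTO dim_customer VALUES (?, ?, ?)", cleaned)
warehouse.commit()

print(warehouse.execute("SELECT * FROM dim_customer").fetchall())
```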
This is no easy task. It is made even more challenging by the fact that a change to data in one place ripples into data in any number of other places.
Paying the price for dirty data
In most organizations, integration projects are reactionary. They are launched only when there is a visible need. Responding to requests from business stakeholders quickly and efficiently can seem like an imperative. But if you run integration projects in an ad hoc fashion, you risk proliferating point-to-point interfaces that do not fit into your organization's larger data-management strategy.
The consequences of these one-off deployments are far-reaching:
- Because the work is not replicable, the cost of ongoing support and enhancements will stay high, driven by each project's inherent complexity.
- The lineage of your data cannot be guaranteed, so you risk creating processes inconsistent with existing business rules governing data transformation and quality.
- These individual systems will be difficult to adapt and update because each will need special attention due to unique rules, custom code, and inconsistent data.
Investing in clean data
You cannot predict the future when you launch data integration projects, and the business side, however confidently it forecasts its needs, cannot either. The best you can do is follow best practices that break point-to-point interface dependencies, such as the following (a brief sketch illustrating several of them appears after the list):
- Decouple applications to reduce dependencies.
- Implement data and integration standards for information exchanges between systems.
- Use staging to facilitate the consumption of large data sets at various latencies.
- Maintain records of changes you make to data in order to provide end-to-end visibility into the data itself as well as into metadata.
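As a rough illustration of the last three practices, the sketch below pushes a batch through a staging table, applies a shared validation standard, and writes a lineage record for the run. The schema, rule name, and batch identifiers are all hypothetical; they stand in for whatever standards your organization adopts.

```python
import sqlite3
from datetime import datetime, timezone

db = sqlite3.connect(":memory:")

# Hypothetical schema: a staging table that absorbs raw batches, a target
# table holding validated rows, and a lineage log recording every change.
db.executescript("""
    CREATE TABLE stg_orders (order_id INTEGER, amount REAL);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE lineage_log (
        batch_id TEXT, loaded_at TEXT, source TEXT,
        rows_in INTEGER, rows_out INTEGER, rule TEXT
    );
""")

def load_batch(batch_id: str, source_name: str, raw_rows: list[tuple]) -> None:
    # Stage first: large or late-arriving batches land here untouched,
    # so consumers can pull them at whatever latency suits them.
    db.executemany("INSERT INTO stg_orders VALUES (?, ?)", raw_rows)

    # Apply a shared data standard (here, an illustrative rule: amounts
    # must be positive) before anything reaches the target table.
    valid = [row for row in raw_rows if row[1] > 0]
    db.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?)", valid)

    # Record the change: who sent it, when, which rule ran, and the row
    # counts, giving end-to-end visibility into data and metadata alike.
    db.execute(
        "INSERT INTO lineage_log VALUES (?, ?, ?, ?, ?, ?)",
        (batch_id, datetime.now(timezone.utc).isoformat(), source_name,
         len(raw_rows), len(valid), "amount_must_be_positive"),
    )
    db.execute("DELETE FROM stg_orders")  # staging is transient
    db.commit()

load_batch("2024-06-01-001", "order_entry_app", [(101, 25.0), (102, -3.0)])
print(db.execute("SELECT * FROM lineage_log").fetchall())
```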
You may not feel you have the time or resources to devote to these steps. You may also lack a firm commitment from stakeholders to undertake the work necessary to retain data fidelity. Your problems will only become more pronounced, however, the longer you resist a centralized data management solution.
If these challenges seem insurmountable, you may want to consider a solution that automates much, if not all, of these processes. To learn more, see The Next-Generation of Data Integration: Transforming Data Chaos into Breakthrough Results, which provides recommendations on eliminating compromises stemming from traditional approaches to data integration.