Successful data discovery aligns business goals with IT priorities

Use appropriate software tools to prevent problems before, during, and after data integration.


Remember that the best insurance against risk and project delays is to uncover problems before detailed design begins. Do not let shifting business imperatives and accelerated schedules distract you from the importance of data discovery.

Data discovery is a precise and potentially time-consuming process. Ignoring best practices can put sensitive data at risk and create compliance problems. It can even result in the failure of a project. The best approach is a cooperative effort between a developer with an IT point of view and a data analyst with a business perspective. It is important to use tools designed for data discovery before you begin integration as well as on an ongoing basis.

Profile upfront

The amount and types of data used to inform business decisions are growing, and IT departments are increasingly required to meet business needs. To encourage cooperation, use tools that let business owners understand and analyze data. That will help eliminate miscommunication and misunderstandings before integration begins.
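As a rough illustration of what an upfront profile can report, here is a minimal sketch in Python. It is not taken from any particular data discovery product. The function names, the sample records, and the rule that blank strings count as missing are all assumptions made for the example.

```python
from collections import Counter


def profile_column(values):
    """Summarize one column: row count, missing values, distinct values, top value."""
    # Assumption for this sketch: None and blank strings both count as missing.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0] if counts else None,
    }


def profile_table(rows):
    """Profile every column found in a list of dict records."""
    columns = sorted({key for row in rows for key in row})
    return {col: profile_column([row.get(col) for row in rows]) for col in columns}


# Hypothetical sample records, for illustration only.
customers = [
    {"id": 1, "country": "US", "email": "a@example.com"},
    {"id": 2, "country": "US", "email": ""},
    {"id": 3, "country": "DE", "email": None},
]
report = profile_table(customers)
```

A summary like this gives a business owner and a developer the same starting facts, such as how many email values are missing, before anyone designs an integration around that field.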

Debug on the fly

Identify anomalies and track data quality continually during integration processes. If you are complacent, you risk issues such as compliance violations and security threats.
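One simple way to track data quality during integration is to check each incoming batch against a threshold and flag the fields that exceed it. The sketch below assumes a hypothetical `check_batch` helper, made-up field names, and an arbitrary missing-value threshold. A real pipeline would likely track more than missing values.

```python
def check_batch(rows, required_fields, max_missing_rate=0.05):
    """Return (field, missing_rate) pairs whose missing rate exceeds the threshold."""
    if not rows:
        return []
    problems = []
    for field in required_fields:
        # Assumption for this sketch: None and empty strings count as missing.
        missing = sum(1 for row in rows if row.get(field) in (None, ""))
        rate = missing / len(rows)
        if rate > max_missing_rate:
            problems.append((field, rate))
    return problems


# Hypothetical batch of records arriving during an integration run.
batch = [
    {"order_id": "A1", "amount": 10.0},
    {"order_id": "A2", "amount": None},
    {"order_id": "", "amount": 5.0},
    {"order_id": "A4", "amount": 7.5},
]
issues = check_batch(batch, ["order_id", "amount"], max_missing_rate=0.2)
```

Running a check like this on every batch, rather than once at the start, is what makes it possible to notice a quality problem while the integration is still in progress.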

“Data discovery tools are becoming increasingly necessary for getting a handle on where sensitive data resides,” said Securosis Security Strategist Adrian Lane. “When you have a production database schema with 40,000 tables, most of which are undocumented by the developers who created them, finding information within a single database is cumbersome. Now multiply that problem across financial, HR, business processing, testing, and decision support databases—and you have a big mess.”

Choose the right tools

Because big data is big business right now, some software vendors are positioning their products as data discovery solutions. Uncovering data quality issues is complicated even with the best tools, so make sure you do your research.

Business intelligence (BI) software, for instance, provides nice charts and graphs without the automation that true data discovery tools provide. You need to visually assess data quality, determine join efficacy, and understand how data relates to business processes. Software with specialized, advanced capabilities is needed to uncover the underlying structure of data in order to relate data in one system to data in another.
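To make the idea of relating data in one system to data in another more concrete, the sketch below measures how many key values in one source find a match in a second source. It is an illustration only. The `key_match_rate` function and the CRM and billing identifiers are hypothetical, and a dedicated tool would do far more than compare a single key.

```python
def key_match_rate(left_keys, right_keys):
    """Compare key values from two systems and report how well they line up."""
    # Ignore None so that missing keys are not counted as a match or a mismatch.
    left = {k for k in left_keys if k is not None}
    right = {k for k in right_keys if k is not None}
    matched = left & right
    return {
        "matched": len(matched),
        "left_only": sorted(left - right),
        "right_only": sorted(right - left),
        "left_match_rate": len(matched) / len(left) if left else 0.0,
    }


# Hypothetical customer identifiers from two separate systems.
crm_ids = ["C1", "C2", "C3", "C4"]
billing_ids = ["C2", "C3", "C5"]
overlap = key_match_rate(crm_ids, billing_ids)
```

A low match rate, or a long list of keys found on only one side, is an early sign that the two systems cannot be related as cleanly as a design document might assume.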

The business is under pressure to speed up projects to remain competitive. But you decrease efficiency in the long run if you do not profile your data as often as the process warrants.

For more information, read “Take agility to the next level by using data virtualization for prototyping.”
