Recently, we had the pleasure of hosting the Intelligent Data Summit for AI-Powered Innovation on Informatica Live, Informatica’s award-winning virtual event series. The event drew a who’s who list of speakers from Global 2000 companies and showcased presentations by Informatica customers, partners, industry analysts, and thought leaders. As part of the summit on AI-Powered Innovation, speakers shared some great insights and best practices on their use of Informatica Enterprise Data Catalog. One particular session that stood out for me was the joint presentation given by Nader Anaizi, Senior Manager of Data Governance Center of Excellence from Honeywell, and Sunil Soares, CEO and Founder of Information Asset, an Informatica consulting partner that specializes in data governance.
Honeywell has been innovating with technology and data for over 100 years. As the company states, “the future is hiding in your data.” Building data-driven business models and democratizing the use of fully governed data are at the heart of Honeywell’s growth strategy. With five core businesses that include aerospace, manufacturing, and connected industrial software, the company operates in a highly complex data landscape. Its petabyte-scale data is spread across thousands of data sources, on-premises and in the cloud, including complex enterprise systems, databases, data warehouses, ETL tools, Hadoop clusters, and business intelligence tools, among others.
To accelerate its digital transformation initiatives and to enable the democratization of fully governed data throughout the enterprise, Honeywell selected Informatica Enterprise Data Catalog. The company wanted to build a central and holistic data catalog as a foundational component for enabling enterprise data governance and self-service analytics. Moreover, it wanted to empower IT and business users to easily catalog, discover, validate, and collaborate on data.
While there were several features in Enterprise Data Catalog that the Honeywell team (including IT, business analysts, data stewards, data analysts, and data scientists) found highly beneficial, four capabilities in particular stood out for them as they began ramping up their use of Enterprise Data Catalog:
- Google-like semantic search – The search capability in Enterprise Data Catalog turned the concept of “self-service” into a reality for the users. For the first time, data analysts, data stewards, and data scientists felt empowered to easily and rapidly search for the data they needed without relying on IT. Users could find relevant data with something as simple as a single business term, navigating in an environment that they were familiar with.
- Data profiling – Identifying and creating certified, high-quality data assets was a key objective for the Honeywell team in order to accelerate democratization of data throughout the enterprise. As such, comprehensive data profiling was another feature of Enterprise Data Catalog that the users found highly beneficial. Profiling statistics such as value distributions, data type and data domain inference, data quality rules, scorecards, and metric groups, together with technical metadata, helped them understand the quality of data assets.
- End-to-end data lineage – In addition to discovering data rapidly, equally important to the users was their ability to easily visualize and understand the flow of data from source to destination, with in-depth data lineage detail at a granular level. They could obtain details on all the transformations each data asset underwent across the data pipeline and throughout the data lifecycle. For instance, a user who had just begun using Enterprise Data Catalog was able to search for a business term, find the several locations where it was represented, visualize how the data flowed, see where it encountered transformations, and see where that data asset eventually landed.
- Automated notifications – The ability to follow data assets and receive automated notifications was another key capability that stood out for the business users. For instance, they could track the data assets that were of interest to them and get automatically generated notifications whenever those data assets underwent any changes.
Implementation and key findings
It took Honeywell six weeks from start to finish to deploy the solution. This included less than one day to install Enterprise Data Catalog. The rest of the implementation included configuring the environment, building a custom metadata model (which they needed to meet their unique requirements), scanning data from several data sources (including SAP HANA, an Oracle relational database, Informatica PowerCenter, and Tableau), conducting a pilot to demonstrate value to business users, and training users.
During the pilot, the Honeywell team demonstrated that, in 45 minutes, they could perform a complex yet detailed analysis of data assets that reside across several systems and are used by multiple functions. This analysis would otherwise have taken the team several weeks to complete.
To learn more, I invite you to watch this exciting presentation. The speakers also share several best practices that may serve you on your data cataloging journey. “Honeywell Speeds Data-Driven Industrial Information with Informatica” is available on demand as part of the Intelligent Data Summit for AI-Powered Innovation on Informatica Live.