Gain Visibility into Your Data with Comprehensive, Automated Data Lineage
More than ever, organizations rely on data to make business-critical decisions. Unfortunately, this data is not always trustworthy. Organizations that want to improve trust and confidence in their data — especially as data moves away from its origin and into the hands of data consumers — have driven the demand for increased visibility into the data.
But it can be extremely complicated and time-consuming to obtain useful details and metadata such as:
- where the data originated
- which transformations have occurred throughout its journey
- the downstream impact of changes made to the data
Fragmentation — caused primarily by the large volume and variety of data sources — is usually to blame for this complication and delay. And it can be even more overwhelming to gather and manage these details when you are using manual methods.
Comprehensive data lineage, especially when automated, empowers data professionals and facilitates a variety of initiatives. It provides essential details regarding where data originates, its movement throughout the organization, its transformations and its usage. This information provides the visibility and transparency that organizations need to help data professionals understand and trust their data. More specifically, data lineage can help organizations to deliver the data that builds confidence in their analytics and AI models, improves customer experience programs, helps ensure regulatory compliance with industry policies, accelerates cloud modernization initiatives and much more.
Accelerate Insights with Data Lineage in the Cloud
Automated, end-to-end data lineage is a key component of Informatica® Cloud Data Governance and Catalog, a service of Informatica's Intelligent Data Management Cloud™ (IDMC) that combines the capabilities of data governance, data catalog and data quality into a single tool for automating data intelligence insights. Cloud Data Governance and Catalog provides deep connectivity into a broad range of data sources across cloud, multi-cloud and on-premises data environments and applications. It allows users to track and view data lineage from its origin to consumption across even the most fragmented and complex data landscapes. The IDMC service enables organizations to leverage predictive data intelligence: automated, recommendation-driven data classification, data curation, relationship building and sensitive data discovery, powered by Informatica's CLAIRE® AI and ML engine. This helps ensure that users can efficiently gather data intelligence insights and quickly drive business value from their data.
Automated Metadata Extraction for End-to-End Data Lineage
Cloud Data Governance and Catalog derives end-to-end data lineage using advanced techniques to automatically scan and extract metadata from a variety of data sources, including those that fuel data analytics and business-critical applications. The IDMC service enables a deeper understanding of data by allowing you to visualize data movement from source to target, across disparate sources. This provides insight into the transformations your data has undergone throughout its life cycle.
Comprehensive Code Parsing
Automatically acquire details from the code used to modify, transform and join data. Visually inspect scripts, procedures and processes to fully understand logic and internal data flow. Cloud Data Governance and Catalog scans elements including SQL scripts, stored procedures, business intelligence reports and ETL jobs to capture detailed data transformation information, helping to provide comprehensive visibility of data during its journey.
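To make the idea of deriving lineage from code concrete, here is a minimal sketch that pulls table-level source-to-target edges out of a simple SQL script. It is an illustration only and does not represent how Cloud Data Governance and Catalog parses code. The function name, the regular expressions and the sample SQL are all assumptions, and the patterns handle only plain INSERT INTO ... SELECT statements. A production parser would need full dialect-aware parsing.

```python
import re

# Toy patterns (assumptions): they only recognize plain
# "INSERT INTO <target> ... FROM/JOIN <source>" statements.
TARGET_RE = re.compile(r"\binsert\s+into\s+([\w.]+)", re.IGNORECASE)
SOURCE_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def extract_table_lineage(sql: str) -> list[tuple[str, str]]:
    """Return (source_table, target_table) edges found in a SQL script."""
    edges = []
    for statement in sql.split(";"):
        target = TARGET_RE.search(statement)
        if target is None:
            continue  # statement does not write to a table we can detect
        for source in SOURCE_RE.findall(statement):
            edge = (source.lower(), target.group(1).lower())
            if edge not in edges:
                edges.append(edge)
    return edges


# Hypothetical script used only for this example.
SAMPLE_SQL = """
INSERT INTO mart.customer_orders
SELECT c.customer_id, o.order_total
FROM staging.customers c
JOIN staging.orders o ON o.customer_id = c.customer_id;
"""

lineage_edges = extract_table_lineage(SAMPLE_SQL)
for source, target in lineage_edges:
    print(f"{source} -> {target}")
```

Each edge records that data in the target table was derived from the source table, which is the raw material a lineage view is built from.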
Business and Technical Data Lineage Views
Interactively trace your data’s flow through data lineage views at any level, from business-friendly, system-level views that highlight the endpoints, to granular, column-level views that include all the intricate details in between. Easily move between the summary and detailed views with the ability to progressively drill down, allowing you to examine how the data has been modified at each step, from source to destination.
Figure 1: Explore data element-level lineage from source to target.
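The relationship between summary and detailed lineage views can be sketched as a roll-up over the same set of edges. The example below is a simplified illustration, not the product's data model. The edge list, the "system.table.column" naming scheme and the roll_up function are assumptions made for this sketch.

```python
# Hypothetical column-level lineage edges: (source_column, target_column),
# each written as "system.table.column".
COLUMN_EDGES = [
    ("crm.customers.email", "warehouse.dim_customer.email"),
    ("crm.customers.id", "warehouse.dim_customer.customer_key"),
    ("erp.orders.total", "warehouse.fact_orders.order_total"),
    ("warehouse.dim_customer.customer_key", "bi.sales_report.customer"),
    ("warehouse.fact_orders.order_total", "bi.sales_report.revenue"),
]


def roll_up(edges, depth):
    """Collapse edges to the first `depth` name parts (1 = system, 2 = table)."""
    def truncate(name):
        return ".".join(name.split(".")[:depth])

    rolled = set()
    for source, target in edges:
        src, dst = truncate(source), truncate(target)
        if src != dst:  # drop edges that become internal at this level
            rolled.add((src, dst))
    return sorted(rolled)


system_view = roll_up(COLUMN_EDGES, depth=1)  # business-friendly summary
table_view = roll_up(COLUMN_EDGES, depth=2)   # one level of drill-down

print(system_view)
print(table_view)
```

Drilling down corresponds to increasing the depth: the same underlying column-level edges are shown with progressively less aggregation.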
Graphic overlays provide additional context by displaying business and technical information alongside data lineage. Conveniently visualize essential details associated with your data such as business glossary terms, domains, policies and processes directly within data lineage views. Data quality overlays allow you to monitor quality scores and how they change throughout the data flow across your data estate.
Figure 2: Gain additional understanding of your data by displaying graphic overlays, including data quality scores, and view relevant information alongside data lineage.
Deep and Broad Metadata Connectivity
Automate the extraction of metadata that is deeply buried in your most complex data sources. Cloud Data Governance and Catalog has broad and deep metadata connectivity that spans multi-cloud and on-premises environments. You can gather metadata across:
- Cloud platforms
- BI tools
- Multi-vendor ETL and data science tools
- Various enterprise applications and file formats
- SQL dialects
- Stored procedures
You can also obtain complete column-level data lineage, including a full inventory of all potential data lineage sources with rich details. Scan both static and dynamic code and perform language parsing for automated data lineage. With the Cloud Data Governance and Catalog Custom Metadata Framework, use simple Excel files to ingest custom metadata and derive data lineage and relationship links from key systems where automated scanners are not available. Model virtually any data source or data lineage across systems.
Contact Informatica for the most current list of supported data sources.
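The custom metadata approach described above, where a simple tabular file supplies lineage links for systems without an automated scanner, can be illustrated with a short sketch. This is not the Custom Metadata Framework's actual file layout or API. The column names are invented for the example, and CSV text stands in for a spreadsheet so that the code runs with the Python standard library alone.

```python
import csv
import io

# Stand-in for a spreadsheet export; the column names are illustrative assumptions.
CUSTOM_METADATA = """source_system,source_object,target_system,target_object
legacy_billing,INVOICES,warehouse,fact_invoices
legacy_billing,CUSTOMERS,warehouse,dim_customer
"""

REQUIRED_COLUMNS = ("source_system", "source_object", "target_system", "target_object")


def load_lineage_links(text):
    """Read rows and return (source, target) links, skipping incomplete rows."""
    links = []
    for row in csv.DictReader(io.StringIO(text)):
        values = [(row.get(key) or "").strip() for key in REQUIRED_COLUMNS]
        if not all(values):
            continue  # a link needs all four fields
        src_system, src_object, dst_system, dst_object = values
        links.append((f"{src_system}.{src_object}", f"{dst_system}.{dst_object}"))
    return links


custom_links = load_lineage_links(CUSTOM_METADATA)
for source, target in custom_links:
    print(f"{source} -> {target}")
```

Links loaded this way have the same shape as edges produced by an automated scan, so both can feed a single lineage graph.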
Understand Your Data and Increase Confidence in the Data Fueling Your Analytics and AI
Comprehensive data lineage can provide the transparency your organization needs to support self-service analytics and AI initiatives. Cloud Data Governance and Catalog houses end-to-end data lineage along with rapid data discovery, detailed data quality information and collaboration capabilities. This empowers data consumers, data stewards and IT users to automate data intelligence insights with confidence and ease.
Improve Operational Efficiency and Reduce Costs Across the Organization
Provide faster access to trusted data and improve operational efficiency by using automated data lineage to reduce the time required to research data provenance and transformations for data-driven efforts. Data lineage can also simplify your efforts to optimize your data footprint by helping you identify duplicate data, data silos and underutilized data and systems. Reduce costs by eliminating unnecessary data and systems and minimizing silos.
Support Regulatory Compliance Reporting and Identify Risk Mitigation Opportunities
Leverage end-to-end data lineage to comprehend the flow of data throughout your organization. Understand where sensitive data, such as personally identifiable information (PII) and intellectual property (IP), resides to help mitigate risk exposure and avoid fines and remediation penalties. Additionally, utilize automated, granular data lineage to support transparency and reporting for regulatory compliance mandates, such as BCBS 239, GDPR and CCPA. Extract deep metadata from complex enterprise systems and parse code in stored procedures to create comprehensive audit trails quickly for faster reporting during audits and inquiries.
Cloud Data Governance and Catalog allows users to perform detailed impact analysis on upstream and downstream data assets. This information enables organizations to improve data transparency for various initiatives including data modernization and migration projects. Data lineage provides users with information regarding relevant data sources and dependencies to help understand the impact of changes and who is affected by those changes, supporting successful change management efforts and helping improve operational efficiency.
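Upstream and downstream impact analysis is, at its core, a graph traversal over lineage edges. The following minimal sketch shows the general technique using a breadth-first search. It is not the product's implementation, and the asset names and helper functions are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical table-level lineage edges: (source_asset, target_asset).
EDGES = [
    ("crm.customers", "warehouse.dim_customer"),
    ("erp.orders", "warehouse.fact_orders"),
    ("warehouse.dim_customer", "bi.sales_report"),
    ("warehouse.fact_orders", "bi.sales_report"),
    ("warehouse.fact_orders", "ml.churn_features"),
]


def build_adjacency(edges, reverse=False):
    """Map each asset to its direct successors (or predecessors if reverse)."""
    graph = defaultdict(set)
    for source, target in edges:
        if reverse:
            source, target = target, source
        graph[source].add(target)
    return graph


def reachable(graph, start):
    """Breadth-first search returning every asset reachable from `start`."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in seen and neighbor != start:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Which assets are affected if erp.orders changes?
downstream = reachable(build_adjacency(EDGES), "erp.orders")
# Which assets feed bi.sales_report?
upstream = reachable(build_adjacency(EDGES, reverse=True), "bi.sales_report")

print(sorted(downstream))
print(sorted(upstream))
```

Pairing each affected asset with its owner or consumers would then indicate who needs to be notified before a change is made.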