Data and AI Summit 2023: Drive Innovation with Informatica and Databricks
Announcing Integration with Databricks Unity Catalog, INFACore Support for Databricks Connect and Self-Service Migration to Databricks
According to the Databricks 2023 State of Data and AI Report, Informatica was named one of the top three fastest-growing services in the Databricks ecosystem, with 174% year-over-year growth.¹ In fact, for over five years, Databricks and Informatica have been working together to provide the latest innovations in cloud data management and analytics. These innovations include cloud-native, low-code/no-code data integration that natively transforms data with Databricks SQL, enabling users beyond IT to leverage the performance and scale of Databricks.
The partnership has witnessed stellar results with joint customers. For instance, Takeda, one of the world's top 10 pharmaceutical companies, is leveraging Informatica Intelligent Data Management Cloud (IDMC) and Databricks to deliver breakthrough therapies with faster analytics while significantly lowering costs. Powered by our CLAIRE AI engine, IDMC is the first and only AI-driven cloud dedicated to data management, with a microservices-based architecture and over 200 intelligent cloud services in a single platform.
Today I’m excited to share the next big step in the Informatica and Databricks partnership: Informatica INFACore integration with Databricks Connect and IDMC support for the Databricks Unity Catalog. Informatica also now supports Databricks SQL as part of our highly successful cloud data warehouse modernization program. This program automatically migrates more than 90% of on-premises Informatica PowerCenter workloads to cloud-native data integration on IDMC and cloud data warehousing with Databricks SQL and Databricks Delta.
Leverage Databricks Unity Catalog Across the Full Range of IDMC Services
Databricks Unity Catalog helps you organize and discover your Databricks data assets. It also enables secure and efficient access to Databricks SQL and underlying data in the Databricks lakehouse.
With Informatica IDMC support for Databricks Unity Catalog, you can now leverage the latest capabilities from Databricks across the full range of IDMC services, giving you the power of comprehensive, AI-driven data management across virtually all your Databricks data assets. These services include:
- Data Integration: Informatica low-code/no-code ELT/ETL data pipelines provide the ability to rapidly load data from enterprise and cloud sources to Databricks Delta with full namespace support for Unity Catalog.
- Data Quality: Informatica Cloud Data Quality supports identification and remediation of data quality issues within Databricks Delta, leveraging hundreds of out-of-the-box data quality rules.
- Data Access, Cataloging and Governance: Informatica Cloud Data Governance and Catalog services (CDGC) integration with Unity Catalog provides holistic data governance within Databricks SQL, as well as across the entire data landscape. This provides complete data lineage, tracking how data flows through Informatica pipelines and Databricks. Informatica data governance tools can also ensure that data is compliant with policies and permissions both within Databricks and across other data platforms.
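To make the namespace point concrete: Unity Catalog addresses a table through three levels, catalog.schema.table. The short Python sketch below shows one way a pipeline script might build such a fully qualified name. The function name and the example identifiers are hypothetical and are not part of any IDMC or Databricks API.

```python
def qualify_table(catalog: str, schema: str, table: str) -> str:
    """Return a backtick-quoted, three-level name: catalog.schema.table.

    Hypothetical helper for illustration only.
    """
    parts = (catalog, schema, table)
    for part in parts:
        if not part:
            raise ValueError("catalog, schema and table must all be non-empty")
    # Quote each level with backticks, doubling any backtick inside a name,
    # so names containing dots or spaces stay unambiguous in SQL.
    return ".".join("`" + part.replace("`", "``") + "`" for part in parts)


# Example identifiers are assumptions, not names from the announcement.
full_name = qualify_table("main", "sales", "orders")
```

Quoting each level separately keeps a dot inside a schema or table name from being read as a namespace separator.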
Accelerate Development with INFACore and Databricks Connect
New Informatica INFACore integration with Databricks Connect brings the power of INFACore connectors, transformations and data quality rules to Databricks data engineers and data scientists in the integrated development environment (IDE) of their choice. The extensibility of Python for both Databricks Connect and Informatica INFACore allows data engineering teams to accelerate development through integrated DevOps and enables advanced analytics development within their preferred environment.
Databricks Connect enables you to write jobs using Spark APIs and then run them remotely on a Databricks cluster, instead of in the local Spark session. It also allows you to debug code in multiple IDEs and iterate quickly when building libraries. Databricks Connect is based on Spark Connect, an open-source project that introduces a decoupled client-server architecture for Apache Spark.
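As a rough illustration of that workflow, the Python sketch below separates session creation from the job logic. It assumes the databricks-connect package is installed and that connection details (workspace, credentials, cluster) are already configured in the environment or a Databricks configuration profile. The helper names and the table name in the usage comment are hypothetical.

```python
def build_session():
    """Create a Spark session whose work runs on a remote Databricks cluster.

    Assumes databricks-connect is installed and connection settings are
    already configured; nothing here is specific to Informatica.
    """
    # Imported lazily so this module still loads where the package is absent.
    from databricks.connect import DatabricksSession

    return DatabricksSession.builder.getOrCreate()


def count_rows(spark, table_name: str) -> int:
    """Count the rows of a table using ordinary Spark DataFrame APIs.

    The code is the same as for a local session; it executes wherever
    the `spark` session passed in is bound.
    """
    return spark.read.table(table_name).count()


# Usage from an IDE (table name is a placeholder, not from the announcement):
#   spark = build_session()
#   print(count_rows(spark, "main.sales.orders"))
```

Keeping the job logic in a function that takes the session as an argument means the same code can be exercised in a debugger or a unit test with a stand-in session before it is pointed at a real cluster.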
INFACore provides you with Python libraries containing proven, pre-built IDMC data management functions that help you dramatically accelerate data pipeline, data engineering and data analytics development. INFACore enables you to use Informatica data integration, quality, governance and security capabilities within your favorite programming language and environment.
Modernize to Databricks Up to 6x Faster
The Informatica cloud data warehouse modernization program provides you with self-service capabilities to modernize your existing data estate to Databricks up to 6x faster than virtually any other approach. By automatically migrating more than 90% of PowerCenter workloads to the cloud, the program provides you with anticipated cost savings of up to 20x. The automated conversion now supports Databricks SQL and Databricks Delta as targets for modernized data integration pipelines within IDMC.
Next Steps
With these additional capabilities, Informatica continues to innovate with Databricks, providing our joint customers with the latest analytics and data management capabilities to ensure the success of their data-driven business transformation. No matter where you are on your Databricks journey, from modernizing legacy workloads to working with large language models (LLMs) and generative AI, Informatica IDMC will help you accelerate and build game-changing outcomes on a foundation of trusted, high-quality data.
Join us at booth #37 at the Data + AI Summit 2023 to learn more about how to drive your business forward with Informatica and Databricks. You can also reach us at infafordatabricks@informatica.com or visit www.informatica.com/databricks.