Achieve Greater Scale, Speed and Efficiency: Latest innovations in data ingestion and replication

Last Published: Jan 29, 2025
Faisal Ishaq

Sr. Principal Product Manager

What Is Informatica Cloud Data Ingestion and Replication?

The Informatica Cloud Data Ingestion and Replication (CDIR) service provides organizations with a single solution for data ingestion and data replication across operational and analytical use cases. CDIR simplifies ingestion and replication from a variety of sources, including databases, applications, files and streaming sources, through a simple wizard-based user experience. By automating and optimizing data delivery, it ensures that data consumers have timely access to the data they need.

CDIR is the only no-code unified solution for ingesting and replicating terabytes of data with enterprise-grade scale, performance and reliability. CDIR supports hundreds of connectors out-of-the-box and provides monitoring, scheduling and logging capabilities in one location. Typical use cases include:

  • Ingestion of terabytes of data from on-premises databases, data warehouses, mainframes, Software-as-a-Service (SaaS) and other applications into a cloud data warehouse, data lake or lakehouse.
  • Synchronization of ingested data through efficient, automated change data capture (CDC), schema drift handling and data validation.
  • Real-time ingestion of log files, clickstreams and IoT device data for real-time analytics and monitoring.
  • Ingestion of terabytes of files from filesystems into cloud data warehouse or data lake targets for critical data backups and analytics, with built-in encryption/decryption, virus scans and compression/decompression.

CDIR is a key component of the Informatica Intelligent Data Management Cloud (IDMC) platform and plays an essential role in Informatica's Data Integration and Engineering solutions. This blog explores the latest innovations in CDIR to achieve significant improvements in productivity, cost and performance. 

Informatica Cloud Data Ingestion and Replication November Release: Feature Highlights

Row-Filtering

CDIR now enables users to filter which source records are replicated to the target. It offers both simple and advanced filtering options for defining the criteria that a record must meet to be eligible for replication. Row filtering is the latest addition to the growing list of CDIR capabilities and provides the following benefits to users:

  • Enhanced security and regulatory compliance by filtering out sensitive records
  • Cost savings from persisting only relevant records on the target, which reduces compute and storage costs
  • Improved productivity for downstream consumers by removing the need to sift through unnecessary records
  • Increased data availability by ensuring that only the most relevant data is accessible for analysis and integration with downstream applications
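
To make the idea concrete, here is a minimal conceptual sketch of row filtering in Python. It is not CDIR code (CDIR is configured through its no-code wizard), and the record layout, column names and filter conditions are all hypothetical. It only illustrates the difference between a simple single-condition filter and a more advanced combined filter applied before records reach the target.

```python
from typing import Any, Callable, Iterable, Iterator

Record = dict[str, Any]


def filter_rows(
    records: Iterable[Record], predicate: Callable[[Record], bool]
) -> Iterator[Record]:
    """Yield only the records that satisfy the filter predicate."""
    for record in records:
        if predicate(record):
            yield record


# Hypothetical source rows; the column names are illustrative only.
source_rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 35.5},
    {"order_id": 3, "region": "EU", "amount": 15.0},
]


def simple_filter(row: Record) -> bool:
    """A 'simple' filter: one column compared against one value."""
    return row["region"] == "EU"


def advanced_filter(row: Record) -> bool:
    """An 'advanced' filter: several conditions combined."""
    return row["region"] == "EU" and row["amount"] >= 100


# Only rows that pass the filter would be written to the target.
replicated_simple = list(filter_rows(source_rows, simple_filter))
replicated_advanced = list(filter_rows(source_rows, advanced_filter))
```

In this sketch, rows that fail the predicate never reach the target, which is where the storage and compute savings described above would come from.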

CDC Data Staging

CDIR introduces a new ability to persist change data capture (CDC) events from the source database logs onto an intermediate persistence layer, enabling multiple CDIR tasks to consume these CDC events. This impressive capability reduces the overhead on the source database caused by multiple long-running consumer tasks that each read the same set of CDC events. CDC Staging provides the following noteworthy benefits to users:

  • Reduced operational overhead on the source database by persisting cleansed CDC events on cloud storage for multiple consumer tasks to access
  • Improved throughput and replication performance across multiple consumer tasks by removing the overhead of parsing CDC data for every individual task
  • More granular control over access to the source database and log tables for database administrators, who can select an alternate connection for reading logs
  • Improved observability and greater control for managing the CDC staging task that runs under the covers to constantly replicate source CDC events to the cloud
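
The staging pattern described above can be sketched conceptually in Python. This is an illustration under stated assumptions, not Informatica's implementation: the `StagingArea` and `ConsumerTask` classes, the event fields and the in-memory list standing in for cloud storage are all hypothetical. The point it shows is that the source log is read and parsed once, while each consumer task reads the staged events at its own pace.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ChangeEvent:
    sequence: int          # position of the change in the source log
    table: str
    operation: str         # "INSERT", "UPDATE" or "DELETE"
    row: dict[str, Any]


@dataclass
class StagingArea:
    """Hypothetical stand-in for the intermediate persistence layer."""
    events: list[ChangeEvent] = field(default_factory=list)
    source_log_reads: int = 0

    def capture(self, source_log: list[tuple]) -> None:
        """Read and parse the source log once, then persist the events."""
        self.source_log_reads += 1
        for entry in source_log:
            self.events.append(ChangeEvent(*entry))


class ConsumerTask:
    """Hypothetical replication task that reads from staging, not the source."""

    def __init__(self, name: str, table: str) -> None:
        self.name = name
        self.table = table
        self.offset = 0    # each task tracks its own position in staging
        self.applied: list[ChangeEvent] = []

    def poll(self, staging: StagingArea) -> None:
        """Consume staged events that arrived since the last poll."""
        new_events = staging.events[self.offset:]
        self.offset = len(staging.events)
        self.applied.extend(e for e in new_events if e.table == self.table)


# Illustrative source log entries: (sequence, table, operation, row).
source_log = [
    (1, "orders", "INSERT", {"id": 1}),
    (2, "customers", "INSERT", {"id": 7}),
    (3, "orders", "UPDATE", {"id": 1}),
]

staging = StagingArea()
staging.capture(source_log)            # the source is read a single time

orders_task = ConsumerTask("orders_to_warehouse", "orders")
customers_task = ConsumerTask("customers_to_lake", "customers")
for task in (orders_task, customers_task):
    task.poll(staging)                 # consumers never touch the source log
```

Adding a third consumer in this sketch would increase reads against the staging area only; `source_log_reads` would stay at 1, which mirrors the reduced source overhead described above.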

New Task Wizard with a Simplified User Experience

CDIR now has a new, user-friendly wizard for configuring application and database ingestion and replication tasks. As shown in Figure 1, this simplified interface is more intuitive and comes with remarkable new capabilities that allow users to:

  • Define a primary cloud destination once, which will then be the default option when creating a new task
  • Define new source and target connections directly within the wizard flow, with built-in assistance to guide you through the setup of connection properties
  • Experience a cleaner and simpler interface that hides the advanced configuration attributes by default, which can be accessed with a simple toggle switch

Figure 1:  Application and Database Ingestion and Replication Interface.

To get more details about the new user interface, please refer to this article within our Knowledge Center.

How to Access the Exciting New CDIR Capabilities 

Row filtering and CDC data staging capabilities are available only in the new CDIR task wizard, starting with the November 2024 release. To activate the new UI, create a support ticket with Informatica Global Customer Support or reach out to your customer success manager. Once the request is made, one of our representatives will provide you with step-by-step instructions to access the new interface.

Your Next Steps

We recommend enabling the new task wizard in your lower environments first so that you can evaluate it thoroughly within your specific setup and ensure a smooth transition. This will allow you to experience the benefits of the new task wizard while minimizing any impact on your production environment.

To learn more about our latest innovations, visit this page.

First Published: Jan 29, 2025