Data Integration Architecture – Why There is a Shift in Common Patterns

Sep 02, 2021
Sudipta Datta

Product Manager, Cloud Integration Hub and B2B Gateway

Data integration in a multi-cloud world is a real challenge for enterprises today. Think about how you feel as a consumer when you’re juggling multiple clouds. I went through this recently when my credit card expired and the auto-pay transactions were declined! My phone stopped saving my photos in iCloud, I couldn’t access the shared folder of my son’s school assignments in Dropbox, and Google Drive started giving me warnings every time I opened it.

At the individual level as well as the enterprise level, there is a constant tug of war between picking best-of-breed products and settling on a one-bill-to-pay solution.

The data integration landscape is no different. In this era of perishable, high-volume data on multi-cloud, force-fitting old processes and tools to meet new business needs can lead to inefficiency and cost overruns. For long-term benefits, your data architecture should be flexible and agile enough to serve new use cases, and to do so with the most promising technology available at that point in time.

Common Architectural Patterns in Data Integration

Your approach to data integration should accommodate several common architectural patterns, along with some emerging ones. The most time-tested pattern is ETL (extract, transform and load), where a dedicated data processing server sits in the middle and handles all your data transformation and enrichment needs. This helps to integrate data without taxing the source and target applications. With modern technologies and processes, we have seen applications share the responsibility for data transformation and support new data integration patterns.
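To make the ETL pattern concrete, here is a minimal sketch in Python. It is an illustration only, not a description of any particular product: the sample records, the function names and the in-memory SQLite database standing in for the target are all assumptions. The point it shows is that the transformation runs in the middle tier, so the target only ever receives cleaned rows.

```python
import sqlite3

# Hypothetical source records; in practice these would come from a source application.
SOURCE_ROWS = [
    {"order_id": 1, "amount": "19.99", "country": "us"},
    {"order_id": 2, "amount": "5.00", "country": "de"},
]


def extract():
    """Read rows from the (hypothetical) source."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Runs on the integration server: cast amounts and normalize country codes."""
    return [(r["order_id"], float(r["amount"]), r["country"].upper()) for r in rows]


def load(conn, rows):
    """Write already-transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


# An in-memory SQLite database stands in for the target system.
conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))
result = conn.execute(
    "SELECT order_id, amount, country FROM orders ORDER BY order_id"
).fetchall()
```

Because the cleaning happens before the load step, neither the source nor the target spends any compute on transformation, which is the main appeal of the pattern.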

4 Factors Influencing Data Integration Architecture

Data warehouse and lake: The evolution of data warehouses and data lakes has been one of the biggest influences on data integration strategy. Modern data warehouses decouple storage from processing, which enables cheaper storage and faster processing on cloud platforms. You can control storage cost by tiering data according to how often it is accessed. To speed up processing of any amount or kind of data, you can make use of massively distributed, parallel processing.

With high volumes of high-speed data pouring in, you don’t have to process all the data before collecting it in a data warehouse. Instead, you can transform data on demand. That gives rise to the ELT framework, where you extract and load the data first and then process it only when needed. On the one hand, this gives data scientists and analysts a lot of control to enrich and modify data as needed. On the other hand, it allows architects to be cost- and time-efficient by reducing unwanted data processing and data egress charges.
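The ELT idea can be sketched the same way. This is a hedged illustration, not a reference implementation: the raw rows, the table and function names, and the in-memory SQLite database standing in for a cloud data warehouse are all assumptions. Raw data is loaded untouched, and the transformation is expressed as a query that runs inside the warehouse only when someone asks for the result.

```python
import sqlite3

# Hypothetical raw records, loaded exactly as they arrive (everything is text).
RAW_ROWS = [
    ("1", "19.99", "us"),
    ("2", "5.00", "de"),
    ("3", "10.00", "us"),
]


def extract_and_load(conn, rows):
    """E and L: land the raw rows in the warehouse without transforming them."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_orders (order_id TEXT, amount TEXT, country TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)
    conn.commit()


def revenue_by_country(conn):
    """T: cast, normalize and aggregate inside the warehouse, only when requested."""
    query = """
        SELECT UPPER(country) AS country_code,
               SUM(CAST(amount AS REAL)) AS revenue
        FROM raw_orders
        GROUP BY UPPER(country)
        ORDER BY country_code
    """
    return conn.execute(query).fetchall()


# An in-memory SQLite database stands in for the warehouse.
conn = sqlite3.connect(":memory:")
extract_and_load(conn, RAW_ROWS)
totals = revenue_by_country(conn)  # one (country_code, revenue) row per country
```

Nothing is cleaned or aggregated until revenue_by_country is called, so data that nobody queries never incurs transformation cost.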

Real-time integration: Today you need to champion real-time data and application integration to meet your customers’ expectations. Time is money. Be it fraud detection in the banking sector or dynamic pricing and offers for retail customers, a tiny delay can cause a huge loss of money and reputation. To meet your SLAs, it’s very important to understand when you need to use data integration, data ingestion, application integration, messaging, integration hubs and/or API gateways.

Multi-cloud: A multi-cloud environment is inevitable, if not today then tomorrow. Part of it is by design, and part of it comes from pursuing best-of-breed solutions. Your data integration architecture needs to be cloud agnostic and scalable. Scalability doesn’t just mean how many instances you can spin up and down. It also means how many diverse and future use cases you can support. You can opt for an architectural pattern like data fabric or data mesh that provides a unified view and experience irrespective of the cloud that is sourcing the data. For complete governance and cataloging of your data, multi-cloud data management is a must.

Autonomous data integration: There is no denying that infusing artificial intelligence into your overall data and integration management system will make your team more productive and your processes more efficient. But what if you could level up and achieve autonomous data integration with machine learning and predictive analysis? Imagine having an optimization engine that runs on autopilot and significantly brings down your operational costs over time.

We understand that when you architect your data integration landscape, the objective is not just to minimize short-term cost and optimize performance, but also to enable digital transformation. When you adopt a comprehensive and modern data platform like the Intelligent Data Management Cloud (IDMC), you have the flexibility to automate the old repeatable processes while staying ready to absorb the new normal in this space.

I highly encourage you to join the DBTA webinar “Architectural Patterns in Data Integration: Choices and Strategies” to hear John O'Brien, Principal Analyst and CEO of Radiant Advisors, and Makesh Renganathan, Principal Product Manager, R&D Cloud Informatica, discuss the shift in data integration architecture.