Delivering Cloud-Native, Enterprise-Grade Data Lakehouses on Amazon Web Services

Last Published: Jun 29, 2022



“Water, water everywhere, nor any drop to drink.” This famous line from Samuel Taylor Coleridge’s poem The Rime of the Ancient Mariner is spoken by a sailor on a stranded ship who is surrounded by saltwater that he cannot drink. What does this have to do with a cloud data lakehouse? Simply put, it’s all about the quality of the water or, in today’s case, the data, and how it can help or hinder your organization’s ability to drive better business outcomes.

Ensuring Data Management Doesn’t Become Your Albatross

In a recent survey by analyst firm TDWI, a majority (64%) of participating organizations stated that data quality and data management issues are the leading barriers to successfully delivering cloud data warehouses and data lakes. In the same survey, a clear majority (86%) responded that a systematic approach to cloud data management is important to the success of their data strategy.

With the proper tools, migrating and modernizing your legacy on-premises enterprise data warehouse (EDW) or Hadoop environment to Amazon Web Services (AWS) can help your organization become truly data-driven. Informatica’s Enterprise Data Catalog (EDC), powered by our intelligent CLAIRE engine, enables you to quickly discover, inventory, and organize data assets and provides a unified view of enterprise metadata to add context to your data.

Once you identify the data you want to move, along with its dependencies and linkages, you can simplify integration and rapidly move large data volumes to AWS using Informatica Intelligent Cloud Services (IICS) for ETL/ELT patterns and leverage cloud-native codeless connectors. You can then utilize IICS, our cloud-native integration platform as a service (iPaaS), to ingest, integrate, and ensure the quality of data you want to move.

This approach allows you to maximize the value of your data and take advantage of the benefits of moving to a cloud lakehouse data management strategy. A cloud lakehouse is a relatively new term for a system that combines the best of a cloud data warehouse and a data lake. It makes it easier to store and analyze data because it allows companies to manage multiple types of data from a wide variety of sources—including business applications, IoT, telematics, social media, machine learning, and more traditional files and databases—and store this data, both structured and unstructured, in a centralized repository.

A cloud lakehouse enables you to augment cleansed and transformed data in your cloud data warehouse with additional data from your data lake, helping you reduce expenditures due to a data lake’s typically lower storage costs. When analyzing data, Informatica’s cloud lakehouse data management capabilities allow you to blur the boundary between your cloud data warehouse and your data lake.
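To make the idea of augmenting warehouse data with data lake data concrete, here is a minimal Python sketch. It is an illustration only: the customer rows, the JSON-lines events, and the `augment_with_lake_events` helper are all hypothetical stand-ins, and the sketch does not use any Informatica or AWS API. It simply shows cleansed, structured rows being enriched with a value derived from raw, semi-structured records.

```python
import json
from collections import defaultdict

# Hypothetical cleansed rows, standing in for a cloud data warehouse table.
warehouse_customers = [
    {"customer_id": 1, "name": "Acme Corp", "region": "West"},
    {"customer_id": 2, "name": "Globex", "region": "East"},
]

# Hypothetical raw records, standing in for JSON-lines files in a data lake.
# Fields vary from record to record, as raw lake data often does.
lake_events_jsonl = """\
{"customer_id": 1, "event": "login", "device": "mobile"}
{"customer_id": 1, "event": "purchase", "amount": 120.5}
{"customer_id": 2, "event": "login"}
{"customer_id": 3, "event": "login"}
"""


def augment_with_lake_events(customers, events_jsonl):
    """Return warehouse rows with an event count taken from raw lake records."""
    counts = defaultdict(int)
    for line in events_jsonl.splitlines():
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        counts[record["customer_id"]] += 1
    # Customers with no lake events get a count of 0; lake records for
    # customers missing from the warehouse are ignored.
    return [
        {**row, "event_count": counts.get(row["customer_id"], 0)}
        for row in customers
    ]


augmented = augment_with_lake_events(warehouse_customers, lake_events_jsonl)
for row in augmented:
    print(row)
```

In a real deployment the same pattern would typically be expressed as a query or an integration mapping rather than in-memory Python; the point is only that the curated data and the raw data are analyzed together.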

Leveraging an Intelligent Cloud Lakehouse Strategy

Now’s the time to take a closer look at your cloud lakehouse data management strategy to get the most value from your cloud data warehouse and/or data lake. You need a modern cloud lakehouse data management partner to help you successfully deliver business value from your investment, with intelligence and automation.

With the industry’s leading, metadata-driven cloud lakehouse data management solution from Informatica, you unleash the full potential of your AWS cloud data warehouse and data lake across multi-cloud and hybrid environments. You gain efficiencies, cost savings, and scale with best-of-breed data integration, data quality and governance, and metadata management, all built for the cloud on an AI-powered, intelligent data platform.

Cloud Data Lake Architecture: AWS Ecosystem Support

Here are two success stories showing how organizations today are controlling costs and increasing agility and flexibility with Informatica’s intelligent cloud lakehouse data management and Amazon Redshift cloud data warehouses and/or Amazon S3 cloud data lakes.

More Data, More Grants

A flagship state university sought to give its analysts faster access to research data stored in its transactional systems and enable data architects to save time. The university also wanted to identify grant funding opportunities more quickly to achieve a competitive edge. In addition, the institution had a goal to modernize and consolidate core university systems and transition to cloud-based solutions while keeping research data consistent and up to date.

Using Informatica Intelligent Cloud Services, the university brought data from Oracle and SQL Server into Amazon Redshift and Salesforce with Informatica Cloud Data Integration. This allowed the institution to move toward real-time automation and data integration, resulting in a 75 percent reduction in daily database transfer time. The university was also able to accelerate the application process for competitive research grants, putting it in a better position to receive funding.

Informatica Intelligent Cloud Services saves us an incredible amount of time. Without it, modernizing our systems would take much longer.

Data Architect, Flagship State University

Using Data to Help the Most Vulnerable

Community Technology Alliance (CTA) is a California-based nonprofit with a mission to help the homeless. CTA had a goal to collect and integrate data from multiple government and nonprofit agencies to match people in need with available housing and human services. CTA used Informatica Intelligent Cloud Services to connect silos of data, enabling agencies to access and enter data via CTA’s HOME app. Data about clients, housing, and services was fed into a data lake hosted on AWS (using Amazon EC2, S3, and RDS), where it could be used for analytics and reporting.

This setup provided trusted data to serve as the basis for coordinated, centralized assessment and placement systems to prioritize access to housing and services. CTA has helped communities reduce the rate of return to homelessness by up to 75 percent over three years and expedited access to housing and human services for people who might otherwise slip through the cracks. Watch the video to learn more about CTA’s mission to help the homeless in Santa Clara County.

Behind every statistic is a human story. We’re using Informatica Intelligent Cloud Services to merge systems and help people faster.

Bob Russell, CEO, Community Technology Alliance


First Published: May 07, 2020