At the Strata Conference + Hadoop World 2013, Informatica Corporation (Nasdaq:INFA), the world’s number one independent provider of data integration software, and Cloudera, the leader in enterprise analytic data management, support services and training powered by Apache Hadoop™, today announced a jointly designed reference architecture for optimizing data warehouses for today’s data-driven business world.
The new Data Warehouse Optimization (DWO) reference architecture, designed specifically for Enterprise Data Hub deployments, addresses the challenges facing traditional data warehouse infrastructures, where capacity is too quickly consumed by increasing data volumes, leading to performance bottlenecks and costly upgrades. The DWO architecture empowers companies to optimally deploy an Enterprise Data Hub, a central system to land and work with all data in a variety of ways, together with the tools, security and governance customers require. An Enterprise Data Hub is a complementary technology to data warehouse implementations, enabling organizations to store and process data at any scale, to dramatically reduce data warehouse costs, and to boost developer productivity by up to a factor of five.
The proven core building blocks for implementing the DWO architecture are Cloudera Enterprise and Informatica PowerCenter Big Data Edition. Cloudera Enterprise is a subscription offering that combines CDH (Cloudera’s 100 percent open source distribution of Apache Hadoop), Cloudera Manager and Cloudera Navigator. Informatica PowerCenter Big Data Edition is powered by Informatica Vibe, the world’s first and only embeddable virtual data machine (VDM), with “map once, deploy anywhere” data integration.
“Legacy environments are not going away, but they need to be augmented by Hadoop-based solutions to meet the demands of big data,” said Todd Goldman, vice president and general manager, Enterprise Data Integration, Informatica. “The Cloudera and Informatica Data Warehouse Optimization reference architecture helps companies leverage their existing environment with emerging technologies using readily available skills, so organizations can more affordably and efficiently unlock the massive potential of big data.”
Fast-growing data volumes and new types of data sources, ranging from cloud and mobile apps to social media and machine data, are placing substantial demands on current data warehouse infrastructures. To optimize their data warehouse environments, organizations are seeking ways to support unlimited data volumes while using industry-standard hardware and software to reduce infrastructure costs and existing skills to minimize operational costs. They are also seeking ways to support all types of data and to easily integrate new and existing types of infrastructure.
“One of the best ways to introduce Cloudera into an organization’s data management infrastructure is to start by optimizing the data warehouse environment,” said Charles Zedlewski, vice president, Products, Cloudera. “The Cloudera and Informatica DWO reference architecture has the dual benefit of dramatically lowering costs and providing an enterprise-ready data platform that cost-effectively scales to meet the data storage and processing requirements for big data projects.”
The DWO reference architecture addresses all these requirements through the combination of Informatica and Cloudera technologies. Informatica delivers a broad and mature set of data integration and data management capabilities around Hadoop. Cloudera Enterprise enables cost-effective, scalable storage and processing on commodity infrastructure, along with enterprise-grade security, high availability, cluster management, and low-latency querying. The joint reference architecture includes technologies and solutions that:
- Lower infrastructure and operational costs – Delivers data warehouse optimization as the killer app on Cloudera, so organizations can cost-effectively scale data storage and processing on industry-standard hardware and open-source software using readily available skills.
- Use existing skills to staff projects – Many data warehouse organizations already have ETL developers and consultants on staff who are trained on Informatica. With the Informatica PowerCenter Big Data Edition, every Informatica developer is now a Hadoop developer without having to become a Hadoop expert. With Informatica’s and Cloudera’s world-class support and training organizations, users can staff the development and administration of data warehouse projects on Cloudera with readily available skills.
- Future-proof the data warehouse and drive productivity – Informatica Vibe enables data integration and ETL processes to be written just once and deployed anywhere. This means that existing ETL processes created using Informatica’s codeless visual development paradigm can be redeployed on Cloudera Enterprise with minimal effort, resulting in a more resilient data warehouse infrastructure and an up-to-5x productivity gain for developers. Rapid development is further enhanced with Informatica Vibe for rapid ETL prototyping and Cloudera Impala for real-time interactive queries to discover insights faster.
- Optimize data warehouse performance – Informatica PowerCenter Big Data Edition deploys on Cloudera Enterprise to load, profile, parse and transform data for analysis in a high-performance, cost-effective fashion. Optimal processing flows can be defined quickly using Informatica’s visual design interface and extensive library of pre-built transforms.
- Handle virtually all types of data and sources – With Informatica, nearly all types of data – including legacy, ERP, CRM, social and machine – can be accessed and integrated through a variety of methods ranging from batch to replication, change data capture (CDC) and real-time streaming. The newly released Informatica Vibe Data Stream for Machine Data technology, for example, collects and streams high-volume, real-time machine data into Hadoop to drive new levels of operational intelligence.
- Ensure data quality – Informatica Data Quality Big Data Edition executes data quality and matching rules on Cloudera Enterprise to ensure trust in the data.
- Ensure enterprise-ready deployments that meet business SLAs – With Informatica Vibe’s “Map Once, Deploy Anywhere” virtual data machine technology, users can immediately deploy ETL jobs from development into production. The combination of Informatica’s unified administration and Cloudera Manager makes it easy to manage ETL workloads on Cloudera for data warehouse projects.
The Data Warehouse Optimization reference architecture from Cloudera and Informatica is available now for implementation. A Solution Brief, “Cloudera & Informatica Unleash the Power of Hadoop,” is also available.
Visit Informatica at Kiosk 63 and Cloudera at Booth 403 at the Strata Conference + Hadoop World 2013, Oct. 28-30 at the New York Hilton Midtown.
Informatica Corporation (Nasdaq:INFA) is the world’s number one independent provider of data integration software. Organizations around the world rely on Informatica to realize their information potential and drive top business imperatives. Informatica Vibe, the industry’s first and only embeddable virtual data machine (VDM), powers the unique “Map Once. Deploy Anywhere.” capabilities of the Informatica Platform. Worldwide, over 5,000 enterprises depend on Informatica to fully leverage their information assets from devices to mobile to social to big data residing on-premises, in the cloud and across social networks. For more information, call +1 650-385-5000 (1-800-653-3871 in the U.S.), or visit www.informatica.com. Connect with Informatica at http://www.facebook.com/InformaticaCorporation, http://www.linkedin.com/company/informatica and http://twitter.com/InformaticaCorp.
Founded in 2008, Cloudera pioneered the business case for Hadoop with CDH, the world’s most comprehensive, thoroughly tested and widely deployed 100% open source distribution of Apache Hadoop in both commercial and non-commercial environments. Now, the company is redefining data management with its Platform for Big Data, Cloudera Enterprise, empowering enterprises to Ask Bigger Questions™ and gain rich, actionable insights from all their data, to quickly and easily derive real business value that translates into competitive advantage. As the top contributor to the Apache open source community and leading educator of data professionals with the broadest array of Hadoop training and certification programs, Cloudera also offers comprehensive consulting services. Over 700 partners across hardware, software and services have teamed with Cloudera to help meet organizations’ big data goals. With tens of thousands of nodes under management and hundreds of customers across diverse markets, Cloudera is the category leader that has set the standard for Hadoop in the enterprise. For more information, visit www.cloudera.com.