Big Data Management Q&A. Recently, I had the fortune of presenting a deep dive webinar on Informatica Big Data Management, and we had nearly 1,000 attendees! It was awesome to see such huge interest in Big Data Management. However, I wasn't able to answer all the questions in the time we had. So, I've gone through the list of questions and answered the most common ones here on this blog. Here's the link to the replay, Informatica Big Data Management Deep Dive and Demo, as well as the top questions and answers from the session. Feel free to reach out to your local Informatica account manager if you'd like clarification or more details about the new features of Big Data Management.
Informatica's Blaze and Smart Executor were released in November 2015 with Informatica Big Data Management (v10). The product is available in an Enterprise and an Advanced edition.
Informatica's Smart Executor determines the best processing engine using built-in optimization and your cluster configuration. This layer of abstraction eliminates the complexity of specialized development for specific engines, which ultimately enables speed, flexibility, and repeatability.
The Smart Executor is the "polyglot engine": it understands the different processing engines and their programming frameworks. It is designed to figure out the best execution engine, whether that is an engine on Hadoop, database pushdown, or even the native DTM engine. All of this enables speed, flexibility, and repeatability for organizations.
Informatica Blaze is a purpose-built engine for Hadoop that overcomes the functionality and processing gaps of other engines in the Hadoop ecosystem to consistently deliver maximum performance for customers. No single engine will ever be optimal for all complex batch ETL processing use cases; each engine (MapReduce, Hive on Tez, and Spark) has its benefits in the ecosystem. Blaze complements these engines by optimizing data processing across 100+ pre-built advanced data integration, data quality, parsing, and masking transformations.
An execution plan is generated for each of the engines integrated with the Smart Executor.
Informatica Big Data Management is used today across many complex Big Data use cases in every industry. Whether the data domain involves customers, products, or other entities, and whether the data sources are social media, sensors, weblogs, relational data, or others, Informatica Big Data Management can help turn raw data into trusted data assets quickly and flexibly.
Informatica Big Data Management (v10) has been certified for on-premises Hadoop clusters and for cloud-based Hadoop clusters on platforms like Azure and Amazon Web Services.
There isn't an independent benchmark available, but we did publish internal findings on performance; you can read more here.
Metadata Manager is a data governance solution for use cases like impact analysis. Live Data Map, on the other hand, is designed as a universal metadata catalog service that catalogs data from Hadoop, the EDW, BI tools, and ETL tools (such as PowerCenter). It makes it easy for data analysts and data scientists to explore all the data assets in the enterprise by providing a 360-degree relationship view of those assets along with a detailed data lineage view.
Data Warehouse Optimization for Hadoop is typically the first step of any company's journey into Big Data, but dynamic templates are relevant for enhancing developer productivity across a variety of use cases. The best way to use dynamic mappings is to identify the design patterns you see in your data for any use case (Hadoop or not) and at any stage of the data (ETL) processing requirement.