Informatica Big Data Edition provides a safe, efficient way to integrate all types of data on Hadoop without having to learn Hadoop.
Early adopters of Hadoop had no choice but to hand-code in Java or in scripting languages such as Pig or Hive. Informatica Big Data Edition saves developers time with a visual development environment, reusable business rules, efficient collaboration tools, and flexible deployment models that allow them to build production-ready data pipelines up to five times faster than hand coding allows. Access all types of data including transactions, applications, databases, log files, social, machine, and sensor data.
Transaction data stored in databases and flat file formats is only a fraction of the ever-growing volume of data that today's businesses need to integrate. Informatica provides hundreds of high-speed connectors and pre-built transformations so you can integrate and cleanse data on Hadoop from cloud and mobile applications, social media, sensor devices, and more.
Hiring and retaining Hadoop experts can be a time-consuming and costly challenge, but it doesn’t have to be. With Informatica Big Data Edition, no specialized coding is required to scale performance on distributed computing platforms like Hadoop. There are more than 100,000 trained Informatica developers worldwide who can staff your big data projects today.
Hadoop is evolving rapidly, but you shouldn't have to rebuild everything each time a change or innovation emerges. Informatica Big Data Edition is powered by Informatica Vibe™ so you can build your transformation logic once and adopt new technologies fast without having to rebuild your data pipelines and destabilize your production environments.
Faster to value, faster to staff, faster to integrate, faster to trust, faster to innovate, faster to deploy.
Access all types of data including transactions, applications, databases, log files, social, machine, and sensor data
Move data between source systems, Hadoop, and target applications using high-performance connectivity
Collect log files and machine and sensor data in real time and reliably stream data at scale directly into Hadoop
Access an extensive library of prebuilt transformation capabilities on Hadoop via a visual development environment
Profile data on Hadoop to understand the data, identify data quality issues, and collaborate on data pipelines
Automate the discovery of data domains and relationships on Hadoop such as sensitive data that needs to be protected
Scrub, standardize, and enrich data on Hadoop with an extensive set of data quality rules including address validation
Use natural language processing to identify and classify entities in social media and text files
Parse complex, multi-structured, unstructured, and industry-standard data on Hadoop using pre-built parsers, or easily create your own
Gain complete transparency with end-to-end data lineage of all data movement from source data, through Hadoop, to target applications
Preserve transformation logic so you can build data pipelines once and speed deployments as Hadoop continues to change
| Feature List | Standard | Governance |
| --- | --- | --- |
| Data Integration Transforms on Hadoop | | |
| Data Quality Transforms on Hadoop | | |
| Data Profiling on Hadoop | Column, Rule, Join Validation, Mapping Generation from Profile, Midstream, Comparative Profiling and Scorecarding | |
| Complex Data Parsing (Big Data Parser) | Restricted to logs, XML, JSON, custom/proprietary data formats | |
| End-to-End Data Lineage | | Restricted to supporting Big Data Edition |
| Business Glossary | | Restricted to supporting Big Data Edition |
| Natural Language Processing (NLP) Transforms on Hadoop | | |
| Address Validation Transforms on Hadoop | | |
| Data Domain Discovery on Hadoop | | |
| Data Masking Transforms on Hadoop (Limited) | | |
| Real-Time Data Collection and Streaming (Vibe Data Stream) | Restricted to HDFS targets and 100 GB daily source data volume | Restricted to HDFS targets and 100 GB daily source data volume |
| High-Speed Data Ingestion | | |
| Database Connectivity | | |
| Hadoop Connectivity | | |
| HBase Connectivity | | |
| Social Media Connectivity | Unlimited Data Types | Unlimited Data Types |
| Ten (10) Informatica Data Analyst Named Users | | |
| Support (included with subscription license only) | 8 x 5 | 24 x 7 |
Provides pre-built parsers on Hadoop for a variety of industry standards, documents, log files, and complex file formats.
Discovers relationships among parties and groups them to create a 360-degree view.
The industry's only fully integrated, end-to-end, agile data integration platform.
Built on fast brokerless messaging technology that helps you manage many small pieces of incoming streaming data.
UPMC used a collection of Informatica products to improve research outcomes in the quest to cure various diseases
BNY Mellon accelerated a successful merger using Informatica’s real-time integration products