Amazon Elastic MapReduce (EMR) is based on Hadoop and offers a proven technology for storing files and processing data in a highly distributed manner. When you face several different types of data from a multitude of sources, building a Hadoop-based data lake to analyze that data makes great sense. Loading data from multiple sources into Amazon EMR is the first step in forming a data lake; the next step is analyzing that data. Because most Hadoop clusters hold several terabytes of data, Amazon Redshift’s compression capabilities can help make sense of the enormous volume of data these clusters contain. Informatica Intelligent Cloud Services pushdown optimization technology is well suited to both of these use cases.