Serverless is a cloud architecture that frees you from managing servers, virtual machines (VMs), or containers. Serverless does not mean that no servers are involved; servers still run your applications. It simply means that you do not have to interact with or control those servers. Serverless lets you focus on the design and objectives of your application.
These four characteristics help classify what is true serverless:
There are many serverless computing providers, but three stand out: Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform. Offerings from these three vendors share similar advantages, but each has qualities that set it apart.
AWS Lambda: AWS was one of the first vendors to offer serverless computing. AWS Lambda is a service that runs your code on Amazon EC2. With Lambda, you’re not hosting the code, and you’re not charged when the code is not used. You pay only for computation time. Your code can sit idle for months, and if you don’t run it, Amazon will not charge you. (See Informatica solutions for AWS.)
Microsoft Azure: Microsoft’s serverless offering is Azure Functions. Using Azure Functions, a user can create and upload code and then define triggers or events that will execute the code. Triggers can come from a wide range of sources, including another user's application or other cloud services, such as databases, events, and notification hubs. Azure Functions has a usage-based billing policy. (See Informatica solutions for Microsoft Azure.)
Google Cloud Platform: Google Cloud Functions is Google’s serverless offering. Using Cloud Functions, a user writes simple functions that are attached to events triggered from Google Cloud Platform infrastructure and services. The Cloud Function is triggered when an event being watched is fired, and the code executes in a fully managed environment. Google Cloud Function services are priced per-function. (See Informatica solutions for Google Cloud Platform.)
The serverless options from Amazon, Microsoft, and Google share a common model: you create and upload code, and the platform executes it automatically, so you no longer need to worry about managing servers, services, and infrastructure. All of that is handled for you. Serverless is not free, because servers are still spun up to run your code, but you save money because you have no long-running servers and no dedicated operations personnel running your pipeline.
So, how do ETL (extract, transform, load) and data management fit into this concept? Most of the providers have various data storage options that can be linked with your code. For instance, AWS Lambda can trigger your code each time a file is uploaded to Amazon S3, an event is streamed to Amazon Kinesis, or a record is written to Amazon DynamoDB. All you have to do is supply the code to process that data. As you can see, serverless is a powerful shift in data management and how ETL is performed.
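To make the S3 trigger described above concrete, here is a minimal sketch of a Python Lambda handler, assuming the function is subscribed to S3 object-created notifications. The `transform` helper, the bucket name, and the object key are hypothetical placeholders; a real pipeline would read the object and apply its own logic at that point.

```python
import json
from urllib.parse import unquote_plus


def transform(bucket: str, key: str) -> dict:
    """Hypothetical transform step; a real job would read and reshape the object here."""
    return {"source": f"s3://{bucket}/{key}", "status": "processed"}


def handler(event, context):
    """Entry point that Lambda invokes with the S3 notification payload."""
    results = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications (a space becomes '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        results.append(transform(bucket, key))
    return {"processed": len(results), "items": results}


if __name__ == "__main__":
    # Hand-built sample event (illustrative names) for a local dry run.
    sample_event = {
        "Records": [
            {"s3": {"bucket": {"name": "raw-data"},
                    "object": {"key": "incoming/sales+2020.csv"}}}
        ]
    }
    print(json.dumps(handler(sample_event, None), indent=2))
```

Because the handler only depends on the event payload, you can exercise it locally with a hand-built event before wiring it to a real bucket.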
Informatica supports serverless deployments using Amazon EMR, Microsoft Azure HDInsight, and Databricks clusters with data engineering products. Once a developer builds mappings using Informatica Data Engineering Integration, customers have an option to run mappings in an existing cluster for on-premises deployment or serverless using the cluster auto-deployment option.
The cluster auto-deployment option can
The figure below shows what serverless execution means in our data engineering workflow.
A data management system is serverless if you can ingest, cleanse, and enrich data without ever having to think about servers. A serverless data management system should satisfy the four characteristics described above.
Informatica’s Cloud Data Integration Elastic (CDI Elastic) service satisfies all four characteristics of serverless.
As you can see, serverless is a powerful shift in how data engineering jobs are performed. However, other concerns such as exception handling, deserialization, transformations, retries, and monitoring still need to be implemented. Informatica solves this problem through the Cloud Data Integration Elastic service in Informatica Intelligent Cloud Services (IICS).
Informatica CDI Elastic virtualizes the runtime environment, enabling developers to focus on mapping development rather than on infrastructure provisioning and management. From the perspective of an Informatica developer, CDI Elastic lets you build and run mappings and task flows without thinking about servers.
CDI Elastic also lets customers deploy their next-gen analytics solutions, giving them the ability to ingest, cleanse, and process big data in the cloud using serverless technology.
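To give a sense of the plumbing a hand-written serverless job has to carry, here is a minimal sketch of just one of those concerns: retrying a failed step with exponential backoff. This illustrates the general technique only; it is not how CDI Elastic implements retries, and `TransientError`, `run_with_retries`, and their parameters are hypothetical names.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, throttling)."""


def run_with_retries(task, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call task() until it succeeds or max_attempts is exhausted.

    The wait doubles after each failed attempt: base_delay, 2x, 4x, and so on.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except TransientError as exc:
            last_error = exc
            if attempt == max_attempts:
                break  # no point sleeping after the final attempt
            sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"task failed after {max_attempts} attempts") from last_error
```

A load step would be wrapped as `run_with_retries(lambda: load_batch(rows))`, where `load_batch` stands in for your own function. Exception handling, monitoring, and the other concerns listed above each need similar code when you build the pipeline yourself.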
When deploying cloud applications, you should consider serverless deployments first and only consider the alternatives if serverless does not meet your demands. Serverless offers consumption-based pricing, auto-tuning and auto-scaling, and high availability, all without requiring a dedicated administrator to manage the environment.
Informatica is the Enterprise Cloud Data Management leader, and as data management evolves toward serverless, Informatica helps you future-proof your solutions using IICS and data engineering solutions. We have many enhancements planned for CDIE, including integration with the intelligent CLAIRE engine. Please stay tuned for more blogs on this topic.
To find out if Informatica’s CDIE service (formerly Integration at Scale) is right for you, try CDIE free for 30 days.
To learn more about serverless, read How to go Hadoop-less with Informatica Data Engineering and Databricks.