First, what does latency mean in the context of data? Today, every industry – healthcare, retail, telco, banking, etc. – generates a massive amount of data. According to Forbes, the volume of data grew by 5000% in the last decade, from 1.2 trillion gigabytes to 59 trillion gigabytes. This data is also time-bound: it is created in real time, and its value diminishes over time. Businesses need to act on this high volume of data while it is fresh, or they lose business opportunities. The industry term for this is "perishable insights." According to Forrester Research, perishable insights can be defined as "Insights that can provide exponentially more value than traditional analytics, but the value expires and evaporates once the moment is gone."
This is where the term latency comes into the picture. Latency is the time that elapses between the moment data is generated and the moment it is processed or queried. If the data is processed quickly, for example within microseconds or milliseconds, the system is said to have low latency.
A low latency system is a computer system optimized to process high volumes of fast-moving data generated from sources such as IoT devices, machine logs, weather data, geospatial data, applications and social media in near real time, with minimal delay (or latency). These systems are designed to support operations that require real-time access to rapidly changing data or events, and they process that data quickly while the events are still happening.
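To make the definition concrete, the following minimal Python sketch computes latency as the difference between a generation timestamp and a processing timestamp. The function names and the 100-millisecond cutoff are assumptions chosen for illustration; what counts as "low" depends on the use case.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical cutoff for "low latency"; the right value depends on the use case.
LOW_LATENCY_THRESHOLD = timedelta(milliseconds=100)


def latency(generated_at: datetime, processed_at: datetime) -> timedelta:
    """Return the time elapsed between data generation and processing."""
    return processed_at - generated_at


def is_low_latency(generated_at, processed_at, threshold=LOW_LATENCY_THRESHOLD) -> bool:
    """Return True when the measured latency is within the chosen threshold."""
    return latency(generated_at, processed_at) <= threshold


# Example: an event generated at noon UTC and processed 12 milliseconds later.
generated = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
processed = generated + timedelta(milliseconds=12)
print(latency(generated, processed), is_low_latency(generated, processed))
```

In practice both timestamps would come from the data itself (the event time) and from the processing system's clock, so clock synchronization between the two affects the measurement.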
What Is Low Latency Data Management?
Low latency data management follows a streaming-first architecture that ingests data from real-time sources (streaming, IoT, change data capture, real-time applications, etc.) into a message hub like Apache Kafka. A stream processing engine (like Apache Spark, Apache Flink, etc.) reads data from the messaging system, transforms it, and publishes the enriched data back to the messaging system, making it available for real-time analytics. Additionally, the data is distributed to the serving layer, such as a cloud data lake, cloud data warehouse, operational intelligence or alerting systems for self-service analytics and machine learning (ML), reporting, dashboarding, predictive and preventive maintenance, and alerting use cases.
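The ingest, process and serve flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: two in-memory queues stand in for topics on a message hub such as Apache Kafka, and the function names, event fields and the Celsius-to-Fahrenheit enrichment are all hypothetical.

```python
import queue

# In-memory stand-ins for two topics on a message hub (assumption: a real
# deployment would use a system such as Apache Kafka instead).
raw_topic = queue.Queue()
enriched_topic = queue.Queue()


def ingest(events):
    """Ingestion step: publish raw events from a source to the hub."""
    for event in events:
        raw_topic.put(event)


def process():
    """Stream processing step: read raw events, enrich them, publish them back."""
    while not raw_topic.empty():
        event = raw_topic.get()
        # Hypothetical enrichment: add the temperature in Fahrenheit.
        enriched = dict(event, temp_f=event["temp_c"] * 9 / 5 + 32)
        enriched_topic.put(enriched)


def serve():
    """Serving step: drain enriched events for analytics or alerting."""
    results = []
    while not enriched_topic.empty():
        results.append(enriched_topic.get())
    return results


ingest([{"device": "sensor-1", "temp_c": 20}, {"device": "sensor-2", "temp_c": 100}])
process()
served = serve()
print(served)
```

A real stream processing engine runs continuously and in parallel rather than draining a queue once, but the separation into three stages connected by topics is the same.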
Technologies Supporting Low Latency Data Management
To ensure organizations don't miss the opportunity to capture, store, analyze and operationalize these perishable insights, it is essential to leverage the following modern technologies that support low latency use cases:
Streaming analytics platforms can ingest, analyze, and act on real-time streaming data coming from various sources, so you can take immediate action while the events are still occurring. These platforms can gather and analyze large volumes of data arriving in "streams" from always-on sources such as sensor data, telematics data, machine logs, social media feeds, change data capture data from traditional and relational databases, location data, etc.
Change data capture (CDC) is a design pattern that allows users to detect changes at the data source and then apply them throughout the enterprise. In relational transactional databases, CDC technology helps customers capture the changes in a system as they happen and propagate the changes onto analytical systems for real-time processing and analytics.
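As a rough illustration of the CDC pattern, the sketch below detects inserts, updates and deletes by comparing two snapshots of a table and then applies them to a downstream copy. The function names and sample rows are hypothetical, and snapshot comparison is a simplification: production CDC tools commonly read the database's transaction log instead of diffing snapshots.

```python
def capture_changes(before: dict, after: dict) -> list:
    """Compare two snapshots keyed by primary key and return change events."""
    changes = []
    for key, row in after.items():
        if key not in before:
            changes.append(("insert", key, row))
        elif before[key] != row:
            changes.append(("update", key, row))
    for key, row in before.items():
        if key not in after:
            changes.append(("delete", key, row))
    return changes


def apply_changes(target: dict, changes: list) -> None:
    """Propagate captured change events to a downstream copy of the table."""
    for op, key, row in changes:
        if op == "delete":
            target.pop(key, None)
        else:
            target[key] = row


# Hypothetical source table before and after a few transactions.
source_before = {1: {"name": "Ana", "city": "Lima"}, 2: {"name": "Bo", "city": "Oslo"}}
source_after = {1: {"name": "Ana", "city": "Quito"}, 3: {"name": "Cy", "city": "Rome"}}

analytics_copy = dict(source_before)
change_events = capture_changes(source_before, source_after)
apply_changes(analytics_copy, change_events)
print(change_events)
```

Only the changed rows travel downstream, which is what keeps the analytical copy current without reloading the whole table.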
Real-time application integration is the merging and optimizing of data and workflows between two disparate software applications, often a new cloud application with a legacy on-premises application.
API integration is the act of connecting disparate databases, devices, and applications. To do pretty much anything in this digital world – from placing an order on an ecommerce site to making an airline reservation – you must leverage application programming interfaces (APIs) and the APIs need to function at a specified latency in real time.
Messaging systems enable communications between producers and consumers using message-based topics. Apache Kafka is the most widely used fast, scalable, fault-tolerant, publish-subscribe open-source messaging system. Using Kafka, you can process large amounts of data quickly. It can serve as an interim staging area for data that various downstream consumer applications will consume.
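The topic-based publish-subscribe idea can be shown with a toy in-memory broker. This is an assumption-laden sketch, not Kafka's API: the `TopicBroker` class and topic names are invented for illustration, and the broker pushes messages to handlers immediately, whereas Kafka retains messages and lets consumers pull them.

```python
from collections import defaultdict


class TopicBroker:
    """Toy in-memory publish-subscribe broker (illustrative; not Kafka's API)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        """Register a consumer callback for a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        """Deliver a producer's message to every consumer of the topic."""
        for handler in self._subscribers[topic]:
            handler(message)


broker = TopicBroker()
dashboard, audit_log = [], []
broker.subscribe("orders", dashboard.append)
broker.subscribe("orders", audit_log.append)

broker.publish("orders", {"order_id": 1, "total": 25.0})
broker.publish("payments", {"order_id": 1})  # no subscribers, so nothing is delivered
print(dashboard, audit_log)
```

The producer never references the consumers directly, which is what lets new downstream applications be added without changing the systems that generate the data.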
Low Latency Industry Use Cases
Manufacturing: Preventive maintenance is the most common low latency use case in the manufacturing sector. In the preventive maintenance use case, the low latency system captures and ingests sensor data generated from IoT devices and machines in real time to predict if the machine or appliance will fail and proactively repair it.
For example, a low latency system lets you set up a rule so that if the temperature of a turbine goes beyond 100 degrees centigrade, a real-time alert is sent to the operations team to take remedial action. This helps the operations team proactively repair the turbine before it stops working. Thus, the low latency system helps manufacturing companies move from time-based maintenance to condition-based maintenance to save time and cost. Discover Informatica’s Manufacturing Industry Solutions.
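The turbine rule above can be expressed as a small function applied to each reading as it arrives. The field names, turbine IDs and the list-based notifier are hypothetical; in a real deployment the notifier would page or message the operations team.

```python
TEMPERATURE_LIMIT_C = 100  # threshold from the turbine example above


def check_reading(reading: dict, notify) -> bool:
    """Send an alert through `notify` when a reading exceeds the limit."""
    if reading["temperature_c"] > TEMPERATURE_LIMIT_C:
        notify(
            f"Turbine {reading['turbine_id']} at {reading['temperature_c']} C "
            f"exceeds {TEMPERATURE_LIMIT_C} C"
        )
        return True
    return False


alerts = []  # stand-in for a real alerting channel
stream = [
    {"turbine_id": "T-1", "temperature_c": 92},
    {"turbine_id": "T-1", "temperature_c": 104},
    {"turbine_id": "T-2", "temperature_c": 100},
]
for reading in stream:
    check_reading(reading, alerts.append)
print(alerts)
```

Because the rule fires only when the temperature goes beyond the limit, the reading of exactly 100 degrees does not raise an alert in this sketch.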
Public sector: Cybersecurity in the public sector requires low latency systems to instantly identify anomalous behavior and suspicious activities and flag them for immediate investigation. So, rather than remediating after a problem occurs, the attack is stopped before it can do any damage. Discover Informatica’s Public Sector Industry Solutions.
Hospitality: Hotels can leverage low latency solutions to monitor reservations in real time. For example, if a hotel chain notices that one of its properties has low occupancy in the late afternoon, it could text or email special promotions to frequent guests in that area to fill those empty rooms that night.
Retail: Retailers or ecommerce vendors can capture point-of-sale or clickstream data in real time using a low latency messaging system like Kafka and process it in a streaming platform. Low latency systems enable retailers to improve customer experience and drive targeted marketing campaigns by providing the right offer to the right customer at the right time. Also, retail firms use multiple applications to manage their ecommerce operations. By integrating all their applications (e.g., CRM, ERP, ecommerce portal, mobile application, supplier portal, payment application), retailers can run their business operations effectively. Discover Informatica’s Retail Industry Solutions.
Healthcare: Hospitals and other healthcare providers can capture patient data in real time and combine it with historical medical records to improve patient care. They can also leverage low latency systems for ICU monitoring and diabetes management use cases by detecting anomalies in the patient's heartbeat or blood sugar level. Low latency systems can also support drug discovery and cancer care by analyzing medical records. Discover Informatica’s Healthcare Industry Solutions.
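One simple way to flag anomalies in a stream of heartbeat readings is to compare each new value with a rolling baseline of recent values. The sketch below is illustrative only and not a clinical method: the window size, the three-standard-deviation threshold and the sample readings are all assumptions.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent rolling average."""
    recent = deque(maxlen=window)
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu, sigma = mean(recent), stdev(recent)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield i, value
                continue  # keep the outlier out of the baseline
        recent.append(value)


# Hypothetical heart rate readings in beats per minute.
heart_rate_bpm = [72, 74, 71, 73, 72, 75, 73, 140, 74, 72]
anomalies = list(detect_anomalies(heart_rate_bpm))
print(anomalies)
```

Each reading is evaluated as it arrives using only the previous few values, so the check can run on live data without waiting for a batch to complete.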
Banking: Banks can use low latency solutions to track customer activity in their banking apps and build an omnichannel view of each customer, which helps them provide attractive insurance, loan, and credit card offers. Also, banks can save money and time by orchestrating and automating loan and trade finance application processes with low latency solutions and integrating all internal and external applications. Discover Informatica’s Financial Services Industry Solutions.
Telecommunications: Telecom companies are facing increasingly tough times as digitalization has severely impacted them. Customers are no longer willing to pay for voice and text services, and OTT platforms are forcing telecom operators to adapt to new business models. Reducing churn and retaining customers is their top priority. With low latency solutions like streaming analytics and the Internet of Things, telcos can capture customer location data, network information from towers, and weather data in real time to predict service disruptions and proactively notify their customers, via SMS or email, of the issue and the SLA for resolving it. This can help improve customer experience because customers are already aware of the issue and the timeline to resolve it. Discover Informatica’s Telecommunication Industry Solutions.
Key Capabilities of Low Latency Data Management
- Unified experience for data ingestion and edge processing: Given that data within enterprises is spread across a variety of disparate sources, a single unified low latency solution is needed to ingest data from various sources. As data is ingested from remote systems, it is important that the ingestion solution can apply simple transformations on the data (e.g., filtering bad records) at the edge before it is ingested into the lake.
- Versatile out-of-the-box connectivity: The low latency solution needs to offer out-of-the-box connectivity to various sources like files, databases, mainframes, IoT, and other streaming sources. Also, it needs to have the capability to persist the enriched data onto various cloud data lakes, data warehouses, and messaging systems.
- Scalable stream processing with complex transformations: The solution should be able to apply complex transformations such as stream merging, windowing, aggregation, and data quality checks on streaming data. Additionally, as streaming data volume increases, it is important that the solution scales to keep meeting the low latency requirements.
- Operationalized business rules and ML models: The end-to-end solution should be able to operationalize pre-created business rules and/or ML models on the data.
- Ability to handle unstructured data and schema drift: Given that many of the sources emit data in an unstructured form, it is important to parse the unstructured data to discover and understand the structure for downstream use. Changes in the structure at the source, often referred to as schema drift, are a key pain point for many organizations. Users expect the solution to intelligently handle schema drift and automatically propagate changes to the target systems.
- Reusability of processing logic: It is important to reuse the business logic (e.g., transformations and data quality rules) applied for one source to other sources, to reduce the manual effort in re-creating the logic and avoid manual, error-prone processes.
- Governance and lineage: Given that data comes from a variety of sources and endpoints, organizations need to be able to catalog the data for searching and viewing the metadata. With this, users can understand the lineage of the data and ensure that it is governed.
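Of the capabilities listed above, windowing is the easiest to show in a few lines. The sketch below groups timestamped events into fixed, non-overlapping (tumbling) windows and averages each one. The function name, the 60-second window and the sample events are assumptions, and a real stream processing engine would compute this incrementally over an unbounded stream rather than over a finite list.

```python
from collections import defaultdict


def tumbling_window_average(events, window_seconds=60):
    """Group (timestamp_seconds, value) events into fixed windows and average each."""
    buckets = defaultdict(list)
    for timestamp, value in events:
        # Align each event to the start of the window that contains it.
        window_start = (timestamp // window_seconds) * window_seconds
        buckets[window_start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


# Hypothetical sensor events: (seconds since start, reading).
sensor_events = [(0, 10.0), (30, 20.0), (61, 30.0), (119, 50.0), (120, 5.0)]
averages = tumbling_window_average(sensor_events)
print(averages)
```

Aggregating per window turns a high-volume stream into a much smaller series of summaries, which is often what dashboards and alerting rules consume.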
Informatica Solutions for Low Latency
Cloud Mass Ingestion is a unified data ingestion service that can efficiently ingest databases, files, streaming, CDC data, and applications into a cloud data warehouse, cloud data lake, or a message hub to drive database synchronization, data warehouse modernization, and real-time analytics use cases. Cloud Mass Ingestion uses an intuitive, four-step, wizard-based approach that requires no hand-coding.
Data Engineering Streaming is a code-free or low-code stream processing solution that helps businesses implement real-time analytics use cases both on-premises and in hybrid multi-cloud environments. It uses the sense-reason-act framework that enables customers to continuously ingest and process data from various streaming sources by using open-source technologies like Apache Spark and Apache Kafka.
Cloud Application Integration helps organizations modernize their approach to application integration. It automates business processes, accelerates transactions, fuels real-time analytics, drives innovation, and creates efficiencies by seamlessly connecting applications and data, regardless of their location or latency. It also streamlines workflows with interactive data access and automates user processes and business functions that span on-premises and cloud applications.
API Manager helps organizations connect their lines of business, customers, and partners to applications, processes, and data, anywhere, at any latency, with intelligent APIs. Organizations can develop, publish, manage, monitor, deprecate, and consume APIs to orchestrate their business processes, even if they span multiple clouds and on-premises systems within and outside their firewalls. Not only does API Manager allow you to develop and consume APIs, but it also provides differentiated capabilities to manage the entire API life cycle.
Cloud Integration Hub is built using a publish-subscribe pattern and hub-and-spoke architecture to connect and share data from virtually any data source and at any latency. With support ranging from traditional batch processing to streaming data exchange, the subscribing applications can pull data at a pace and interval that meets their business needs. Informatica's Cloud Integration Hub is designed to reduce integration costs, increase integration resiliency, improve IT efficiency, and free up resources to drive innovation.
Low Latency Customer Success Stories
University of New Orleans uses Informatica Cloud Mass Ingestion and Snowflake to democratize data among university employees. By creating a centralized cloud data warehouse, the university enables more effective analytics and accelerates cloud migration at low latency, with the goal of improving student recruitment, admission, and retention.
Ovo, Indonesia's largest payment, rewards, and financial services platform, uses Informatica Data Engineering Streaming to stream data from millions of customers and process it at low latency to deliver personalized campaigns with maximum impact.
Metropolitan Thames Valley Housing orchestrated data flows through Informatica Cloud Integration Hub, giving housing officers and other employees fast, self-service access to the data they need to be effective. Using Informatica Intelligent Cloud Services, Metropolitan Thames Valley Housing improved timeliness of response and service to residents during the COVID-19 pandemic.
Carbonite was struggling to manage a three-fold growth in Salesforce data. Manual data entry was slowing client interactions and negatively affecting Salesforce adoption. With Informatica, Carbonite could accelerate decision-making in sales, marketing, and services and support rapid business growth at low latency. “Informatica Cloud Application Integration shaves at least 20 seconds off the time needed to create each case. Our 220 agents are creating hundreds of cases every day, so the savings in time, money, and headcount is dramatic,” says Robert Frost, Vice President of Customer Support at Carbonite.
nCino wanted to integrate loan data from an assortment of cloud and on-premises core banking systems and deliver real-time integrations to streamline bank operations at low latency. After just one year of using Informatica Cloud Application Integration, nCino reduced the time it takes to onboard a new customer's data from weeks to days, enabling faster growth. nCino has now integrated more than 120 banking customer systems. Informatica Cloud Application Integration processes push data back to banking customers' core systems to automatically book loans and deposits, pull credit bureau reports and account documents, and more.
How to Get Started with Low Latency Data Management
Low latency systems are essential for intelligent data management, allowing organizations to ingest, process and analyze real-time data from various sources for data-driven decision-making. Fast-track your low latency use cases with free Cloud Data Integration services on AWS and Azure.