
IoT Data Management: The Rise of Industrial IoT and Machine Learning

Learn about the difference between IoT and Industrial IoT, including IoT use cases and five must-have capabilities for IoT data management.

IoT vs. Industrial IoT: What’s the difference?

The Internet of Things (IoT) is a network of objects or devices that are connected to the Internet, usually via sensors, and can communicate with each other and share the data they generate. These connected “things” (ranging from smartphones and cars to refrigerators, thermostats, and mirrors) are slowly entering every aspect of our lives. With 41.6 billion connected devices expected by 2025 [1], IoT’s presence is only going to increase.

IoT data management makes streaming data available for analytics.

IoT adoption has increased significantly in the last five years due to the availability of massive computing power, innovations in data-processing technology, and the advent of machine learning and natural-language processing algorithms. IoT gives customers a new way to address their long-standing problem of connecting devices and using the resulting data to improve decision making. It also opens a new spectrum of use cases in which customers can operationalize actions on IoT devices in real time, something that was not possible a few years ago.

The Industrial Internet of Things (IIoT), also called “Industry 4.0,” refers to the combination of IoT technology and data with manufacturing and other industrial processes, often with the goal of increasing automation, efficiency, and productivity. This is where IoT is applied in practice across industries, with data sources such as:

  • Factory equipment, machines, and devices used in manufacturing

  • Health monitoring devices in healthcare

  • Sensors and Supervisory Control and Data Acquisition (SCADA) systems in oil and gas production

  • Telemetry data from autonomous vehicles

IIoT helps organizations leverage the data that their machines have generated over several years and use it for real-time analytics to drive faster, more accurate business decisions.

Common IoT and Industrial IoT use cases

IIoT use cases in manufacturing include factory automation for operational efficiency; location tracking for locating tools, parts, and inventory; and predictive maintenance for maximizing uptime and disaster tolerance.

IoT use cases in retail extend to both online and offline experiences, including real-time offer management based on what, when, and where customers buy; improved behavior analytics; smart shelves that proactively alert when items are running low or in the wrong place; and automated checkout systems.

IoT use cases in healthcare include feeding data from medical devices into clinical research processes and treatment efficiency studies to improve patient outcomes, and tracking room/bed occupancy and staff proximity to improve the hospital experience and care delivery.

Why is IoT data management important?

When customers embark on the journey to address IoT and IIoT use cases, the first hurdle they face is how to retrieve the data from the IoT systems and make it available for the analytics systems and for decision making.

The ability to ingest the data from IoT systems into the data lake or into messaging systems like Apache Kafka is a key first step. In most scenarios, organizations also want to cleanse and enrich the data so that bad data doesn’t get into the lake and analysts have enriched data to work with. In some cases, customers want to operationalize actions in real time on IoT-enabled devices. For example, they may want to automatically stop a painting machine if factory conditions get too warm for optimal paint adhesion, a situation that could cause major quality and warranty issues if not corrected during manufacturing.
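The painting-machine example can be sketched as a simple threshold rule. This is only an illustration, not a description of any product: the `Reading` type, the `handle_reading` function, the `MAX_TEMP_C` value, and the `stop_machine` callback are all hypothetical names chosen for this sketch, and a real temperature limit would depend on the paint and the process.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical limit for illustration; the real value depends on the paint.
MAX_TEMP_C = 30.0


@dataclass
class Reading:
    sensor_id: str
    temperature_c: float


def handle_reading(reading: Reading, stop_machine: Callable[[str], None]) -> bool:
    """Call stop_machine with a reason if the reading is too warm.

    Returns True if the stop action was triggered, False otherwise.
    """
    if reading.temperature_c > MAX_TEMP_C:
        stop_machine(
            f"{reading.sensor_id}: {reading.temperature_c:.1f} C "
            f"exceeds {MAX_TEMP_C:.1f} C"
        )
        return True
    return False


# Usage: collect stop reasons in a list instead of driving real hardware.
stop_reasons = []
for r in [Reading("booth-1", 24.5), Reading("booth-1", 31.2)]:
    handle_reading(r, stop_reasons.append)
```

Passing the stop action in as a callback keeps the rule separate from whatever mechanism actually controls the device.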

5 must-have capabilities for IoT data management

Managing data from IoT devices is an important aspect of a real-time analytics journey. To be sure your data management solution can handle IoT data demands, look for these five key capabilities:

  • Versatile connectivity and ability to handle data variety: IoT systems have a variety of standards and IoT data adheres to a wide range of protocols (MQTT, OPC, AMQP, and so on). Also, most IoT data exists in semi-structured or unstructured formats. Therefore, your data management system must be able to connect to all of those systems and adhere to the various protocols so you can ingest data from those systems. It is equally important that the solution support both structured and unstructured data.

  • Edge processing and enrichments: A good data management solution will be able to filter out erroneous records coming from the IoT systems (such as negative temperature readings) before ingesting them into the data lake. It should also be able to enrich the data with metadata (such as a timestamp or static text) to support better analytics.

  • Big data processing and machine learning: Because IoT data comes in very large volumes, performing real-time analytics requires the ability to run enrichments and ingestion with sub-second latency so that the data is ready to be consumed in real time. Also, many customers want to operationalize ML models such as anomaly detection in real time so that they can take preventive steps before it is too late.

  • Address data drift: Data coming from IoT systems can change over time due to events such as firmware upgrades. This is called data drift or schema drift. It is important that your data management solution can automatically address data drift without interrupting the data management process.

  • Real-time monitoring and alerting: IoT data ingestion and processing never stops. Therefore, your data management solution should provide real-time monitoring with flow visualizations to show the status of the process at any time with respect to performance and throughput. The data management solution should also provide alerts in case any issues arise during the process.
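Two of the capabilities above, edge filtering with enrichment and tolerance of data drift, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the field names (`device_id`, `temperature_c`, `humidity_pct`), the `KNOWN_FIELDS` schema, the `site` label, and every function name are hypothetical, and a production pipeline would do considerably more.

```python
from datetime import datetime, timezone

# Hypothetical expected schema for this sketch.
KNOWN_FIELDS = {"device_id", "temperature_c"}


def is_valid(record):
    """Reject records whose temperature is missing, non-numeric, or negative."""
    temp = record.get("temperature_c")
    if isinstance(temp, bool) or not isinstance(temp, (int, float)):
        return False
    return temp >= 0


def enrich(record, site):
    """Return a copy with an ingestion timestamp and a static site label."""
    enriched = dict(record)
    enriched["ingested_at"] = datetime.now(timezone.utc).isoformat()
    enriched["site"] = site
    return enriched


def process(records, site="plant-a"):
    """Filter and enrich records; report unknown fields without stopping."""
    clean, drifted = [], set()
    for record in records:
        if not is_valid(record):
            continue
        # Fields outside the known schema (for example, added by a firmware
        # upgrade) are passed through and reported, not treated as errors.
        drifted.update(set(record) - KNOWN_FIELDS)
        clean.append(enrich(record, site))
    return clean, sorted(drifted)


raw = [
    {"device_id": "s1", "temperature_c": 21.5},
    {"device_id": "s2", "temperature_c": -4.0},  # filtered out
    {"device_id": "s3", "temperature_c": 22.1, "humidity_pct": 40},  # new field
]
clean, drifted = process(raw)
```

Here the negative reading is dropped, while the record carrying the unexpected `humidity_pct` field still flows through and the new field name is reported so the schema can be updated later.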

Machine learning and IoT: Game changer?

Machine learning (ML) aims to minimize human intervention in tasks that can be automated, and it applies fully to IoT, where it opens many opportunities for automation and optimization. Using machine learning algorithms, organizations can discover patterns in historical IoT data and build models, then operationalize those models by scoring incoming IoT data against them in real time.
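The build-then-score pattern can be illustrated with a deliberately simple stand-in for a trained model: a z-score anomaly detector that learns the mean and standard deviation of historical readings and flags new values that fall far outside them. This is a sketch, not one of the algorithms named in this article; the function names, the sample readings, and the threshold of 3.0 are assumptions made for the example.

```python
from statistics import mean, stdev


def fit(history):
    """Build a minimal 'model': mean and sample standard deviation."""
    spread = stdev(history)
    if spread == 0:
        raise ValueError("history has no variation; cannot score against it")
    return {"mean": mean(history), "stdev": spread}


def score(model, value):
    """Return how many standard deviations the value is from the mean."""
    return abs(value - model["mean"]) / model["stdev"]


def is_anomaly(model, value, threshold=3.0):
    """Flag values whose score exceeds the (assumed) threshold."""
    return score(model, value) > threshold


# Build the model once on historical data...
history = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2]
model = fit(history)

# ...then score each new reading as it arrives.
flags = [is_anomaly(model, v) for v in [20.1, 25.0]]
```

With these sample values, the reading of 20.1 is within normal variation and the reading of 25.0 is flagged.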

Common use cases with ML algorithms in IoT are:

  • Smart traffic prediction using classification, anomaly detection, and clustering techniques

  • Energy usage prediction using linear regression, classification, and regression trees

  • Food safety prediction using the Naive Bayes algorithm

  • Smart city and smart citizen initiatives using the K-means clustering algorithm

To learn more about getting started with ML and artificial intelligence, read our white paper.

The Informatica approach to IoT data management

Informatica offers a Big Data Streaming solution that provides AI-driven, end-to-end management for IoT and streaming data. The solution leverages the Sense-Reason-Act framework for IoT data management, which enables customers to ingest data from IoT sources (sense), apply business logic on the IoT data (reason), and operationalize actions on the IoT device (act)—all on a single platform with the power of CLAIRE™, Informatica’s intelligence engine.


The Informatica solution helps customers take advantage of open source technologies like Apache Kafka and Apache Spark for scalable, high-performance streaming and IoT analytics, while abstracting away the complexity of those technologies. The Informatica Big Data Streaming solution also supports cloud ecosystems such as AWS, Azure, and Google Cloud.

Informatica’s cloud-native, schema-agnostic ingestion solution collects structured and unstructured IoT data and ingests it into cloud and on-premises systems using an easy-to-use graphical UI. The solution also offers the ability to cleanse and enrich the data before ingestion.

The Informatica IoT data processing solution parses complex unstructured data using AI/ML algorithms and handles schema drift. The solution processes millions of messages per second using the power of Apache Spark Streaming. This enables customers to apply their enrichment logic in real time as the data is moved through the pipeline. The solution also helps customers operationalize the AI/ML models as part of the data flow, so that they can take action in real time.

Learn more about Informatica streaming data solutions.

Customer success story: Indian Oil and IoT data management

Indian Oil Corporation, a large oil retailer in India, successfully implemented an IoT use case: automating their retail fuel stations.

Indian Oil needed to implement complete, centralized, and agile controls on the retail side of both the petrol and liquefied petroleum gas (LPG) businesses. It deployed Informatica's Enterprise Streaming Data Management solution consisting of Edge Data Streaming and other real-time data integration components. This enabled better decisions on dynamic pricing of petrol/gas and effective monitoring of stock and dispensing units at the retail outlets. Read more about Indian Oil’s project and its benefits.

   


1: IDC, “The Growth in Connected IoT Devices Is Expected to Generate 79.4ZB of Data in 2025, According to a New IDC Forecast,” June 18, 2019, https://www.idc.com/getdoc.jsp?containerId=prUS45213219