A steady stream of real-time data is pouring into your organization: clickstreams from web servers, application and infrastructure log data, and information from sensors or agents placed on devices and machines comprising the "Internet of Things." You can transform this torrent of small messages and events into unprecedented business agility and responsiveness, but only if you can gather and analyze it immediately.

Informatica Vibe Data Stream for Machine Data (VDS) helps you manage many small pieces of data as they flow in at high rates and accumulate quickly into large volumes. Vibe Data Stream is purpose-built for efficiently collecting all forms of streaming data and delivering it directly to both real-time and batch processing technologies — so you can analyze and act on it while it's still fresh and relevant.

Traditional file-based, batch-oriented collection architectures are not well suited to streaming data: among other things, they don't support real-time transmission, require careful management, don't scale easily, and create multiple copies of data. Open-source alternatives are equally inadequate. They often require high levels of software development expertise and are limited in their ability to scale to enterprise-grade volumes, speeds, and varieties of real-time data.

A distributed, scalable system, Vibe Data Stream uses Informatica's proven high-performance brokerless messaging technology to greatly simplify streaming data collection. Features include:

  • Lightweight agents for an ecosystem of sources and targets
  • Brokerless messaging transport using a publish/subscribe model
  • Flexibility to connect sources and targets in numerous patterns
  • High-performance delivery direct to targets over LAN/WAN
  • Simplified configuration, deployment, administration, and monitoring

Out-of-the-box source and target agents collect and distribute streaming data through the high-performance message bus. The embeddable agents on sources collect data in real time and stream millions of records per second into big data platform targets such as Hadoop and Cassandra. Vibe Data Stream also streams data directly into Informatica PowerCenter Real Time Edition, Informatica RulePoint (CEP), and Storm to enable event processing and operational intelligence in real time.

Informatica Vibe Data Stream provides streaming data collection for real-time Big Data analytics, operational intelligence, and traditional enterprise data warehousing. Based on Informatica's established high-performance messaging technology, Vibe Data Stream is enterprise-class software that makes once resource-intensive technology affordable and easy to configure and use.

High Performance Streaming Data Collection with Reliable Qualities of Service

Built on the industry's fastest and highest-performing brokerless messaging technology, Vibe Data Stream uses a publish/subscribe model that delivers data directly from source to target based on message topics of interest, with no intermediate data staging. It leverages the underlying network to perform functions normally assigned to a broker, and can collect and deliver streaming data locally or globally over LAN and WAN. Vibe Data Stream streams data with a reliable quality of service that ensures delivery as long as both source and target are up and running. For a guaranteed level of service, Vibe Data Stream includes the option of adding persistent data stores in parallel to the data stream. These stores save data and deliver it to failed targets when they come back online, with no degradation of performance.
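To make the delivery model above concrete, here is a minimal in-process sketch of topic-based publish/subscribe with an optional store that replays messages to a target that was down. It is an illustration only, not the Vibe Data Stream API: the `Target` and `Bus` classes and all of their methods are hypothetical names, and the store is simplified to keep only the messages a target missed.

```python
from collections import defaultdict


class Target:
    """Hypothetical target that can be taken offline to simulate a failure."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.received = []

    def deliver(self, topic, message):
        # Delivery succeeds only while the target is up and running.
        if not self.online:
            return False
        self.received.append((topic, message))
        return True


class Bus:
    """Toy topic-based publish/subscribe bus (assumed design, no broker process)."""

    def __init__(self, persist=False):
        self.subscribers = defaultdict(list)  # topic -> subscribed targets
        self.persist = persist                # enable the optional persistent store
        self.pending = defaultdict(list)      # target name -> missed messages

    def subscribe(self, topic, target):
        self.subscribers[topic].append(target)

    def publish(self, topic, message):
        # Messages go straight to every target subscribed to the topic.
        for target in self.subscribers[topic]:
            delivered = target.deliver(topic, message)
            if not delivered and self.persist:
                self.pending[target.name].append((topic, message))

    def recover(self, target):
        # Bring the target back online and replay what it missed, in order.
        target.online = True
        for topic, message in self.pending.pop(target.name, []):
            target.deliver(topic, message)
```

With `persist=False`, a message published while a target is offline is simply missed, which mirrors the reliable level of service described above. With `persist=True`, calling `recover()` replays the missed messages, which mirrors the guaranteed level.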

Informatica Vibe Data Stream gathers streaming data from sources and delivers it to targets across the enterprise.

Wide Variety of Supported Sources and Targets

Vibe Data Stream includes lightweight agents that provide out-of-the-box support for a wide and growing number of streaming data sources and targets. These agents minimize the need to develop source and target adapters internally, speeding the process of integrating streaming data into processing environments. The agents are automatically deployed based on a user-defined topology configuration and directly connected via the high-performance message bus. This allows Vibe Data Stream to avoid data staging, moving data instead with simple one-hop flows from sources to targets. At the source, agents read and dispatch data even as it is logged to a file. At the target, agents receive and write that data to the appropriate processing environments.
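The idea of a source agent reading data "even as it is logged to a file" can be sketched as a polling reader that remembers its byte offset and dispatches only complete lines. This is an assumed approach for illustration; the function name `read_new_records` is hypothetical, and the actual agents may work differently.

```python
def read_new_records(path, offset):
    """Return (records, new_offset) for complete lines appended since offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read()
    # Dispatch only complete lines; a partial last line waits for the next poll.
    end = chunk.rfind(b"\n") + 1
    records = chunk[:end].decode("utf-8").splitlines()
    return records, offset + end
```

A caller would invoke this in a loop, passing the returned offset back in on each poll and publishing each record to the topic assigned to that source.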

Informatica also provides an open SDK for customers who want to develop their own agents.

Centralized GUI for Simplified Configuration, Deployment, Administration and Monitoring

A centralized GUI simplifies setup, deployment and monitoring, allowing users to specify the entire system topology configuration and auto-deploy agents to begin operations.

Vibe Data Stream uses the Informatica Administrator Console as a centralized interface for simplified configuration, deployment, administration and monitoring. A single console allows users to create flexible configurations and visual mappings for multiple source-to-target patterns; auto-generate configurations for messaging, source agents, and target agents; automate deployment; and load balance message delivery across targets to minimize impact on performance. It also provides a central point for monitoring ongoing operations and performing administrative functions like controlling agents or managing logs. This single interface facilitates ease of use, provides a complete view of the Vibe Data Stream environment, and makes it easier to add new streaming data sources or processing targets quickly.
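As a rough illustration of generating agent configurations from one central topology and balancing delivery across targets, consider the sketch below. The topology layout, host names, and function names are all hypothetical; the real product is configured through the Administrator Console, and round-robin is only one assumed load-balancing policy.

```python
from itertools import cycle

# Hypothetical topology: each flow maps a topic's sources to its targets.
TOPOLOGY = [
    {"topic": "weblogs", "sources": ["web01", "web02"], "targets": ["hadoop01", "hadoop02"]},
    {"topic": "sensors", "sources": ["gateway01"], "targets": ["cassandra01"]},
]


def generate_agent_configs(topology):
    """Derive one config per agent host from the central topology definition."""
    configs = {}
    for flow in topology:
        for host in flow["sources"]:
            cfg = configs.setdefault(host, {"publish": [], "subscribe": []})
            cfg["publish"].append(flow["topic"])
        for host in flow["targets"]:
            cfg = configs.setdefault(host, {"publish": [], "subscribe": []})
            cfg["subscribe"].append(flow["topic"])
    return configs


def round_robin(targets):
    """Return a function that picks the next target in rotation."""
    rotation = cycle(targets)
    return lambda: next(rotation)
```

Defining the topology once and deriving every agent's settings from it is what lets a single change, such as adding a target to a flow, propagate without editing each agent by hand.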

High Availability, Scalability and Architectural Flexibility

Vibe Data Stream is designed for high availability, with a brokerless messaging architecture for fewer potential points of failure, automated failover configuration on commodity hardware with no need for a shared file system, and guaranteed data delivery using the solution's parallel persistence capability. Scaling horizontally or vertically is as easy as deploying more data collection or data delivery agents and assigning them to the appropriate data topics. And because sources and targets can be connected in any pattern that suits business and analytical needs (one-to-one, one-to-many, many-to-one, and many-to-many), the architecture can flex easily with changing business requirements.
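The four connection patterns follow directly from how many sources and targets share a topic. The helper below is a hypothetical illustration of that relationship, not part of the product.

```python
def connection_pattern(sources, targets):
    """Name the pattern formed by the sources and targets sharing one topic."""
    if not sources or not targets:
        raise ValueError("a flow needs at least one source and one target")
    left = "one" if len(sources) == 1 else "many"
    right = "one" if len(targets) == 1 else "many"
    return f"{left}-to-{right}"
```

Under this view, scaling out is a matter of adding an agent to a topic: a flow with one source and one target becomes one-to-many as soon as a second target subscribes to the same topic.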