Speed of change simplifies
machine data collection

Streaming technology moves data quickly, removing complexity and reducing potential for failure.

“Developers don’t have to deal with as much code. When there’s less code, there’s less complexity, fewer moving parts, and less potential for failure.”

—Todd Montgomery, chief architect at Kaazing

To garner real-time insights from big data and drive business value, organizations need to collect data at high velocity. But existing architectures slow the flow of data to such a degree that streaming is beyond their reach.

There’s a solution, though, that both speeds up and simplifies the process of collecting machine data. We spoke with Todd Montgomery, chief architect at Kaazing, about what it takes to stream machine data.

How can organizations create a flow of streaming, real-time data?

Todd Montgomery: Getting that sheer amount of data, especially machine-generated data, is a challenge of scale—for example, web farms where you may have clickstream data. Being able to get that data off of web servers and onto storage very quickly is a critically needed capability. You need to get logs from thousands of machines as the logs are being generated. And, then, without putting the data on a collection tier in the middle, get it to a place where you can perform analytics.
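The idea of shipping log lines straight off the web servers, rather than staging them on a collection tier, can be pictured with a minimal Python sketch. This is an illustration only, not a Kaazing API; the `follow` function and the simulated clickstream log are hypothetical.

```python
import io

def follow(stream):
    """Yield each new line from a log stream as it is written.

    In production this would tail an actual log file on a web server
    and forward each line directly toward the analytics destination,
    instead of landing it on an intermediate collection tier.
    """
    for line in stream:
        line = line.rstrip("\n")
        if line:  # skip blank lines
            yield line

# Simulated clickstream log entries from one web server
log = io.StringIO("GET /home 200\nGET /cart 200\nPOST /buy 201\n")

# Each entry streams out as soon as it appears in the log
shipped = list(follow(log))
```

In a real deployment the consumer of `follow` would be a network sender (for example, a WebSocket or message-queue client) rather than a list, but the shape is the same: lines leave the server as they are generated.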

What is the advantage of not using a middle collection tier to land the data?

Montgomery: In the past, the data would have been stored there temporarily because there’s limited storage where the data is actually being generated. Compounding the problem is the fact that the data is being generated at high rates and there usually isn’t the infrastructure to get the data all the way to the target in an efficient manner. However, you can transfer smaller bits of data very efficiently. You have the luxury of sending it all the way to its eventual destination in a continuous stream and storing it incrementally.
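Sending small pieces of data all the way to the destination and storing them incrementally, as Montgomery describes, might look like the following sketch. The function name and chunk size are illustrative assumptions, not part of any product API.

```python
import io

def stream_to_store(source, sink, chunk_size=4):
    """Forward small chunks from source to sink as they arrive.

    Each chunk is written to the eventual destination immediately,
    so nothing accumulates on a temporary middle tier.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        sink.write(chunk)  # store this small piece incrementally
        sink.flush()
        total += len(chunk)
    return total

# Simulated generator output and destination store
src = io.BytesIO(b"machine-generated-data")
dst = io.BytesIO()
n = stream_to_store(src, dst)
```

The design point is that memory and disk usage at the source stay bounded by the chunk size, however large the overall stream grows.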

What infrastructure do you need to have in place to stream data efficiently?

Montgomery: With most of the use cases that are out there, the infrastructure is already in place. Companies have the ability to gather data close to where it’s generated, and they have the ability to store the data. Most of the time, if they're doing this type of data collection, they have that middle tier. With streaming data, you get rid of that middle tier. So the infrastructure is there, but it just needs to be used more efficiently. In fact, there’s too much infrastructure—you can get rid of some of it.

From a developer point of view, what is the advantage of streaming machine data?

Montgomery: One of the advantages is that developers don’t have to deal with as much code. When there’s less code, there’s less complexity, fewer moving parts, and less potential for failure. If you look at what a developer or data scientist does, they have to get at the data, and then they have to analyze it. Streaming simplifies the collection of data—you’re keeping up with the speed of change.

Developers can use an infrastructure that will deliver the data to wherever they designate. That’s what we’re aiming for with this: Make it so the developer doesn't have to think about gathering the data. They just put it together, sort of like interlocking blocks. And then they’ve got the data, and it’s coming in real time.
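The "interlocking blocks" idea can be sketched as composable generator stages in Python; each block consumes the previous one's output, and the developer just snaps them together. The stage names and sample events here are hypothetical.

```python
def source(events):
    """Source block: emits raw events as they arrive."""
    yield from events

def parse(stream):
    """Parsing block: splits each event into fields."""
    for event in stream:
        yield event.split(",")

def filter_errors(stream):
    """Filter block: keeps only server-error records (status >= 500)."""
    for fields in stream:
        if int(fields[-1]) >= 500:
            yield fields

# Simulated machine events: host, path, HTTP status
events = ["host1,/api,200", "host2,/api,503", "host3,/buy,500"]

# Interlock the blocks; data flows through them as a stream
pipeline = filter_errors(parse(source(events)))
errors = list(pipeline)
```

Because each stage is lazy, the pipeline processes events one at a time, which is what lets the developer stop thinking about how the data is gathered and buffered.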

Read this white paper for more information and the answer to the question, “Why big data?”
