Wikibon’s George Gilbert defines the new, machine learning-based analytics pipeline for IoT
Capturing and managing the huge volumes of data generated by the Internet of Things (IoT), and deriving value from that data, requires a new data analytics architecture, writes Wikibon Big Data Analyst George Gilbert. In his latest Professional Alert, “Recipe For An IoT-Ready Analytic Pipeline,” Gilbert provides a road map for building that new pipeline, first at the IT director level and then for IT architects.
The best way to understand this emerging IoT analytic pipeline, he says, is to find elements in the traditional approach that are changing and extrapolate based on the new requirements. The cost of capturing traditional data manually has stayed roughly constant at $1 billion per terabyte for several decades. But the new IoT data is generated and captured at a marginal cost approaching zero. The new data pipeline must leverage elastic clusters of commodity hardware and software using automated management, bringing the cost of capture and management to as close to zero as possible.
The data pipeline needs to support a much higher data velocity and provide near real-time responsiveness between capturing data and driving action, while still leveraging historical data to improve the context of analytics. It needs to provide converged analytics, supporting both batch and real-time as well as both business intelligence and machine learning on any data type.
An example application is General Electric Co.’s Predix, a software-as-a-service (SaaS) application for predictive maintenance of industrial equipment. It analyzes continual data streams from instrumented machinery to monitor and anticipate maintenance needs for smart, connected products operated by a manufacturer’s customers.
This “messy” data is semi-structured and often originates in analog form from sensors. The structure evolves over time, requiring flexible management. The sources are highly decentralized and in some cases (such as airplanes, automobiles and train engines) in motion. The system needs edge processing capability to separate normal readings from abnormal ones that might indicate a developing issue and send only the latter over the network, which may have low bandwidth and intermittent service.
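The edge filtering described above can be illustrated with a minimal Python sketch. It is not taken from Gilbert's alert or from Predix; the class name `EdgeFilter`, the rolling z-score test, and the window and threshold values are all assumptions chosen for illustration. The idea is simply that a device keeps a short history of recent readings and forwards only those that deviate sharply from it.

```python
from collections import deque
from statistics import mean, stdev


class EdgeFilter:
    """Hypothetical edge-side filter: flags readings far from the recent norm."""

    def __init__(self, window=20, threshold=3.0, min_samples=5):
        # Rolling history of readings judged normal (assumed window size).
        self.history = deque(maxlen=window)
        self.threshold = threshold      # allowed deviation, in standard deviations
        self.min_samples = min_samples  # readings needed before judging anything

    def is_abnormal(self, value):
        abnormal = False
        if len(self.history) >= self.min_samples:
            mu = mean(self.history)
            sigma = stdev(self.history)
            if sigma > 0:
                abnormal = abs(value - mu) > self.threshold * sigma
            else:
                # Perfectly flat history: any change counts as abnormal.
                abnormal = value != mu
        if not abnormal:
            # Only normal readings update the baseline, so an outlier
            # does not distort later comparisons.
            self.history.append(value)
        return abnormal


def filter_readings(readings, edge_filter):
    """Return only the readings that would be sent over the network."""
    return [r for r in readings if edge_filter.is_abnormal(r)]


if __name__ == "__main__":
    temperatures = [20.0, 20.1, 19.9, 20.2, 20.0, 19.8, 20.1, 35.0, 20.0]
    print(filter_readings(temperatures, EdgeFilter()))
```

In a real deployment the forwarded readings would be queued locally until the intermittent link is available. That buffering is omitted here to keep the sketch short.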
The full alert discusses the new architecture in more detail.