UPDATED 10:25 EDT / OCTOBER 13 2014

Apache Spark heats up with support from 0xdata’s machine-learning platform

Hot on the heels of Hadoop distributor Hortonworks Inc. throwing its weight behind Apache Spark, the ultra-fast analytics engine has found another backer in 0xdata Inc., an emerging provider of machine learning software founded by industry veteran SriSatish Ambati. The Silicon Valley startup is launching a new addition to its flagship platform specifically optimized to tap into the vast processing capacity of Spark.

“In-memory is becoming a hot area to create more real-time value in the big data space,” said SiliconANGLE founder John Furrier. “Big Data is becoming about getting low latency data in the hands of apps and users for innovations that combine speed and machine learning.”

Born out of the celebrated AMPLab at UC Berkeley, Spark is an in-memory execution framework for Hadoop that runs up to 100 times faster than the default MapReduce engine included in the batch analytics platform. The open source platform is also much better equipped to handle operations that involve looping over the same data in quick succession, which is the underpinning of machine learning.

Google created MapReduce in 2004 to simplify the deployment of parallel applications on distributed clusters, a momentous feat that requires not only spreading a load effectively across individual servers but also enabling rapid inter-node communication and fault tolerance. The software is perfectly suited for performing that task, hiding most of the complexity and freeing up users to focus on their applications.

As a result of that narrow focus, however, there is no straightforward way to implement an iterative algorithm in MapReduce. Data scientists are left to split loop cycles across disjointed operations that not only take extra effort to input but also run independently of each other, requiring information to be written to disk and reloaded with every iteration.

Spark does away with that hassle by keeping everything in memory as a continuous workflow and thereby killing two birds with one stone: It simplifies life for users while eliminating the massive delays associated with shuffling data around.
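The difference can be illustrated with a toy sketch in plain Python. It uses neither Hadoop nor Spark; the function names, the JSON file and the one-parameter model are all assumptions made up for illustration. The first loop reloads its input from disk on every pass, the way a chain of independent MapReduce jobs would, while the second loads the data once and iterates over it in memory.

```python
import json
import os
import tempfile


def load_points(path, counter):
    """Read the dataset from disk and record that a disk read happened."""
    counter["disk_reads"] += 1
    with open(path) as f:
        return json.load(f)


def step(points, w, lr=0.1):
    """One gradient-descent step fitting y = w * x by least squares."""
    grad = sum(2 * (w * x - y) * x for x, y in points) / len(points)
    return w - lr * grad


def fit_reloading(path, iterations):
    """MapReduce-style (simplified): each iteration is a separate job
    that reloads its input from disk before doing any work."""
    counter = {"disk_reads": 0}
    w = 0.0
    for _ in range(iterations):
        points = load_points(path, counter)
        w = step(points, w)
    return w, counter["disk_reads"]


def fit_cached(path, iterations):
    """Spark-style (simplified): load once, then loop over data in memory."""
    counter = {"disk_reads": 0}
    points = load_points(path, counter)
    w = 0.0
    for _ in range(iterations):
        w = step(points, w)
    return w, counter["disk_reads"]


def demo(iterations=50):
    """Write a tiny hypothetical dataset (y = 3x) and fit it both ways."""
    points = [[x, 3.0 * x] for x in (0.5, 1.0, 1.5, 2.0)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "points.json")
        with open(path, "w") as f:
            json.dump(points, f)
        return fit_reloading(path, iterations), fit_cached(path, iterations)


if __name__ == "__main__":
    (w_reload, reads_reload), (w_cached, reads_cached) = demo()
    print("reloading:", w_reload, "disk reads:", reads_reload)
    print("cached:   ", w_cached, "disk reads:", reads_cached)
```

Both loops arrive at the same answer, but the reloading version touches the disk once per iteration and the cached version only once. The sketch leaves out the cost of writing intermediate results back to disk, which a real chain of MapReduce jobs would also incur.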

That makes it a perfect fit with 0xdata’s H2O, which is built from the ground up for performing machine learning calculations in memory. 0xdata says its open source platform provides an environment for data scientists to implement machine learning use cases ranging from pricing optimization to predictive analytics using tools they’re already familiar with.

Sparkling Water is the culmination of a four-month effort to integrate H2O with the analytics engine. The technology makes it possible to seamlessly move information back and forth between the two platforms, making the combination much more accessible. Users can now query Spark for a particular dataset, feed it into H2O to create a machine learning model and push the results back to Spark for rapid execution, which significantly increases the usefulness of both projects.

