UPDATED 13:57 EDT / SEPTEMBER 27 2017


Open-source community pushing big data into AI realm

What’s the surest way to advance a technology in a short time? Give it away — to an open-source community. Seminal big data software library Apache Hadoop gained momentum in open source, and today, most disruptive big data development is springing from open source as well.

“If people have the community traction, that is the new benchmark,” said John Furrier (@furrier) (pictured, left), co-host of theCUBE, SiliconANGLE Media’s mobile livestreaming studio. This is evident at SiliconANGLE’s and theCUBE’s BigData NYC 2017 event, where Furrier and co-host James Kobielus (@jameskobielus) (pictured, right) discussed the community edge.

Yahoo Inc. just open-sourced its big data search and recommendation software Vespa, following its hugely popular 2006 Hadoop contribution. It clearly believes Vespa can evolve via open-source developer brains just as Hadoop did.

“As the community model grows up, you’re starting to see a renaissance of real creative developers,” Furrier said.

These developers are not just working out implementation kinks; they’re innovating at a level that makes a difference for applications. “Real creative competition — in a renaissance, that’s really the key,” Furrier stated.

The renaissance will be automated

Much new development branches out from big data per se, into artificial intelligence, machine learning and internet of things. “Data professionals and developers are moving toward new frameworks like TensorFlow,” Kobielus said. TensorFlow is Google’s open-source deep learning framework. Caffe and Theano are additional open-source deep learning technologies with bustling communities around them.

Some of the most exciting work happening in open source (and at Stanford University) revolves around automating the acquisition of data needed to train machine learning models. Many would like to see deep learning tools and methods operationalized, enabling what some call DataOps or InsightOps (IBM’s term), Kobielus pointed out.

“I think what are coming into being are DevOps frameworks to span the entire life cycle of the creation and the training and deployment and iteration of AI,” he said.


Photo: SiliconANGLE
