Watch #SparkSummitEast on theCUBE for real-time coverage of real-time Hadoop

UPDATED 10:03 EST / FEBRUARY 17 2016



This week theCUBE goes to Apache Spark Summit East 2016 (#SparkSummitEast) for two days of continuous coverage: today from 10:00 a.m. to 5:00 p.m. ET and tomorrow from 10:00 a.m. to 3:00 p.m. Coming just 44 days after the January 4 release of Apache Spark v. 1.6.0, billed as a stable version, the conference marks the start of a shift in the discussion of this core part of the Apache Hadoop big data stack, away from technological development and theories of business impact and toward production real-time big data business analysis systems running on Hadoop.

The May 31, 2014 release of Apache Spark was a seminal moment in the development of big data and arguably of the IT industry as a whole. Before Spark, Apache Hadoop’s main data analysis engine was MapReduce, a disk-based batch-processing platform that limited big data analysis to deep insight applications.

Spark, developed initially by the AMPLab at the University of California, Berkeley, and donated to the Apache Software Foundation, changed that dynamic by opening possibilities for near-real-time analysis of unstructured and semi-structured data. Spark allows users to load data into a Hadoop cluster’s memory and query it repeatedly, making it well-suited to machine learning applications. It supports Hadoop YARN and Apache Mesos cluster management and a variety of distributed storage systems, including Hadoop Distributed File System (HDFS), Cassandra, OpenStack Swift, Amazon S3 and Kudu.

Last year Spark was the most active project in the Apache Software Foundation and one of the most active in the entire open source big data ecosystem, with more than 1,000 contributors, including IBM, which has made a major commitment to it.

Interviewees on theCUBE are scheduled to include Databricks CTO and creator of Apache Spark Matei Zaharia (@matei_zaharia), Databricks Inc. co-founder Reynold Xin (@rxin), Hortonworks Inc.’s Arun Murthy (@acmurphy), and IBM VP of Engineering Anjul Bhambhri (@AnjulBhambhri). Watch live to see what the key players in Apache Spark are saying, and join the conversation in the #SparkSummit CrowdChat, already live, where you can post questions and comments for Wikibon co-founder David Vellante (@dvellante) and SiliconANGLE founder John Furrier (@furrier) to ask on air.

Image courtesy Spark Summit

