UPDATED 17:22 EDT / OCTOBER 23 2014


eBay joins open-source community with ultra-fast OLAP engine for Hadoop

ebay CEO John Donahoe


Like arch-rival Amazon.com, the soon-to-split eBay Inc. is something of an oddity in that it hasn’t historically been a big contributor to the open-source community. But the e-commerce pioneer hopes to change that with the release of the source code for a homegrown online analytical processing (OLAP) engine that promises to speed up Hadoop while also making it more accessible to everyday enterprise users.

Dubbed Kylin, the platform was developed after eBay failed to find a solution that could effectively address the rapid growth in the volume and diversity of data generated by its customers, a story familiar to other contributors to the Hadoop community. Kylin optimizes the storage of information by leveraging existing technologies from the upstream Hadoop ecosystem wherever possible.

By default, data is stored in Apache Hive, which layers a familiar SQL interface on top of Hadoop that allows business workers to harness the distributed analytics capabilities of the system without having to learn the nuances of the native MapReduce execution paradigm. When Kylin comes across certain repetitions in the rows and columns inside Hive – such as a particular product appearing multiple times with different prices – it maps that data into key-value pairs, which are then whisked off to Apache HBase, another component designed with that specific type of workload in mind.

Specifically, HBase provides random access to information, which Kylin exploits to avoid having to sequentially scan tens or hundreds of billions of rows in Hive whenever an eBay employee looks up a certain business detail. That has helped to significantly improve response times at the company, with eBay claiming that the technology handles certain queries in less than a second, allowing truly interactive analytics.
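
To make the idea concrete, here is a minimal Python sketch of the general technique described above: pre-aggregating repeated rows into key-value pairs so that a question can be answered with a single keyed lookup instead of a full scan. It is an illustration only and not Kylin’s actual code or storage format. The sample rows, the `build_cube`, `scan_total` and `lookup_total` helpers, and the use of an in-memory dictionary in place of a real key-value store are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows standing in for a large table: (product, region, price).
ROWS = [
    ("phone", "US", 300.0),
    ("phone", "US", 320.0),
    ("phone", "EU", 310.0),
    ("laptop", "US", 900.0),
]

# Dimensions we want to be able to filter on, in a fixed order.
DIMENSIONS = ("product", "region")


def build_cube(rows):
    """Pre-aggregate total price for every combination of dimension values.

    Each key is a tuple of (dimension, value) pairs; each value is a sum.
    A plain dict stands in for the key-value store here (an assumption).
    """
    cube = defaultdict(float)
    for product, region, price in rows:
        values = {"product": product, "region": region}
        for size in range(len(DIMENSIONS) + 1):
            for dims in combinations(DIMENSIONS, size):
                key = tuple((d, values[d]) for d in dims)
                cube[key] += price
    return dict(cube)


def scan_total(rows, product):
    """Baseline: walk every row to answer one question."""
    return sum(price for p, _, price in rows if p == product)


def lookup_total(cube, **filters):
    """Answer the same kind of question with one keyed lookup."""
    key = tuple((d, filters[d]) for d in DIMENSIONS if d in filters)
    return cube.get(key, 0.0)


CUBE = build_cube(ROWS)

if __name__ == "__main__":
    print(scan_total(ROWS, "phone"))            # full scan
    print(lookup_total(CUBE, product="phone"))  # single lookup, same answer
```

The trade-off in this sketch is extra storage and build time up front in exchange for lookups whose cost no longer depends on the number of underlying rows.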

Topping off that performance advantage are a number of complementary features, including storage compression, monitoring and integration with business intelligence tools such as Tableau Inc.’s wildly popular data visualization platform. Future versions of Kylin will also add better support for more processing paradigms, eBay promises, including multidimensional and hybrid OLAP.

Kylin is not as groundbreaking as some of the other emerging Hadoop add-ons that have been making headlines recently, but it does address an important pain point currently holding back traditional enterprises from taking advantage of the batch processing framework. It’s these kinds of relatively mundane but vital wrinkles that the upstream community must smooth out as more and more corporate deployments of the project move from pilot to production, a mission that eBay’s first major contribution moves an important step forward.

