Hadoop Alternative HPCC Now Available on Amazon Web Services
If you’ve been waiting for an easy way to test HPCC, the alternative to Apache Hadoop built and open sourced by LexisNexis Risk Solutions, then you’re in luck. Starting today, HPCC Thor clusters will be available on Amazon Web Services. At the moment this is just a beta, and it’s offered by HPCC, not as an official AWS service. There are some limitations, but it’s suitable for giving HPCC a test drive. You can find the documentation here.
“AWS is not the best environment for a cluster like this, but it works,” says HPCC CTO Armando Escalante. “AWS is made for clusters of about two or three servers,” Escalante says. “Dealing with a 100+ node cluster would be a nightmare.”
HPCC has developed its own tools for managing clusters, but for now AWS users will be limited to 20 nodes.
Escalante says more features are coming in the future, including a one-button deployment option. Another limitation is that Roxie, HPCC’s querying system and data warehouse, isn’t available on AWS. Escalante describes Roxie as one of HPCC’s core differentiators. But Roxie requires additional infrastructure, such as a load balancer, to be present. Escalante says he’s working with AWS on this, and since AWS already supports similar infrastructure for its hosted Oracle services, it should be feasible.
Escalante says that eventually Amazon will include HPCC as part of the Elastic MapReduce service, which currently lets users spin up Hadoop clusters on AWS infrastructure. Escalante emphasizes that this will be a bit of a misnomer, since HPCC doesn’t use MapReduce, but says that Amazon is planning on changing the name next year anyway. AWS is developing some sort of hosted data stream processing tool, possibly to be based on Hadoop, so it’s possible that service will be included in this newly renamed Elastic MapReduce stable as well.
HPCC is currently the main apples-to-apples alternative to Hadoop. Microsoft decided to sunset LINQ to HPC (formerly called Dryad) in favor of using Hadoop with its partner Hortonworks. That decision doesn’t bode well for the Microsoft Labs project Daytona. The University of California, Berkeley has its Spark project, but it doesn’t seem to have any enterprise traction yet (though it is used in production at Conviva). There are some indirect competitors, such as data warehousing solutions, complex event processing solutions and Storm, but otherwise Hadoop and HPCC stand alone at the moment.