Automation and Easier Aggregation in Hadoop Clusters Signals Data as a Service Trend
Yesterday I wrote about Cascading 2.0, an alternative to MapReduce. The application framework, managed by Concurrent, lets developers build “Cascading” big data apps using high-level scripting languages. The apps then get scheduled to run across a Hadoop cluster.
Also yesterday, HP executives presented their case for integrating Hadoop with Autonomy and HP Vertica, its impressive analytics technology.
In the news from both HP and Concurrent, executives often referred to “aggregation” as a priority in developing big data systems. It’s becoming clear why. Aggregation represents the next phase on the road to data as a service.
HP executives described how customers now talk about “data lakes,” where all data flows for analysis. With Autonomy, the data feeds into its analysis for filtering and is then distributed to a Hadoop cluster.
I asked Autonomy Promote’s chief executive Rafiq Mohammadi how the integration might fit with Cascading 2.0. He said it’s not an either-or situation. It’s simply an aggregation that could be executed through a REST-based API.
“Our entire strategy is to aggregate logic,” he said.
AWS: The Mega Aggregator
The integration of the Autonomy Intelligent Data Operating Layer (IDOL) into Hadoop is similar to the way Amazon Web Services (AWS) aggregates data for customers to shape into apps. That aggregated data serves as the value behind any number of data services.
That aggregation accounts for AWS’s success with customers in the business of data. Customers can program apps through platform-as-a-service (PaaS) and run them on AWS Hadoop clusters. Flightcaster did this and made its name with accurate flight forecasting. Today, Cascading 2.0 makes it easier to develop apps with aggregated data. Thousands more data services will emerge as automation speeds up access to aggregated data.
Advances in automation and app development for deployment on Hadoop clusters signal the coming trend in data-as-a-service. PaaS environments and big data frameworks will serve as the foundation for automating the way applications access aggregated data resources.
It’s inevitable. The analytics tools are getting better and the frameworks are far simpler to set up.
But the next step is aggregation. Once that is achieved, data can be shaped and used for competitive advantage.