UPDATED 17:12 EDT / OCTOBER 10 2017

BIG DATA

IBM wants to give data scientists a lot more free time

When it comes to managing data in the enterprise, words like “cleansing,” “wrangling” or “preparation” are frequently used to describe the work necessary to place information in the right shape and form so it can be effectively used. If this sounds like a lot of work, it is. So IBM has introduced its Integrated Analytics System, based on an SQL engine, to let data scientists work across multiple data stores and save significant time.

“Most enterprises struggle with complexity. That’s the number one problem when it comes to analytics. We are trying to make data really simple to use,” said Rob Thomas, general manager of IBM Analytics at IBM Corp.

Thomas stopped by theCUBE, SiliconANGLE’s mobile livestreaming studio, and spoke with host John Furrier (@furrier) during the recent BigData NYC event in New York City. They discussed the technology behind the new analytics offering, how users can obtain and run it, and the future direction of the multicloud world. (* Disclosure below.)

The Integrated Analytics System is designed for deployment across private, public or hybrid clouds, with machine learning via Apache Spark (an open-source in-memory data processing engine) embedded in the enterprise offering. The concept is to integrate time-consuming functions like combining and cleaning the data, building a warehouse and selecting data science tools into a single system.

“If you move to this model, suddenly what was a bunch of disparate tools are now microservices against a common architecture,” Thomas explained. “So it totally changes the nature of a data platform in the enterprise.”

Eliminates the data wrangling

IBM has also simplified access to the analytics solution. Users can bring the Spark-loaded box into the data center, download a containerized version available on the Web or run it directly on the IBM cloud. “We’ve eliminated that need for all of that data movement, for all of the data wrangling,” Thomas said. “We’ve made it really simple.”

The release of IBM’s analytics tool, which can be used across a variety of cloud environments, follows its announcement of the Hortonworks Inc. DataPlane Service, a cloud offering designed to collect data in multiple locations. These recent announcements from IBM appear to be geared toward meeting the increasing demands in a multicloud world, although this evolution remains a work in progress.

“I don’t think any enterprise will go ‘all in’ on one cloud; it’s delusional for people to think that,” said Thomas, though he also cautioned that it remains to be seen what a multiple cloud world may actually look like. “Let’s be honest, the multicloud world is still pretty early,” he added.

(* Disclosure: IBM Corp. sponsored this segment of theCUBE. Neither IBM nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

