IBM’s New Storage Architecture Incorporates Hadoop
IBM has unveiled a new storage design built by scientists at IBM Research-Almaden. The company claims it doubles analytics processing speed for big data and the cloud through advanced clustering technologies, dynamic file system management and advanced data replication techniques. The new General Parallel File System-Shared Nothing Cluster (GPFS-SNC) architecture incorporates the Hadoop Distributed File System (HDFS).
This storage architecture won the Supercomputing 2010 Storage Challenge based on performance, scalability and storage subsystem utilization.
GPFS-SNC uses a shared-nothing design: tasks are divided among independent nodes, and each node is self-sufficient. This lets GPFS-SNC “convert terabytes of pure information into actionable insights twice as fast as previously possible.” Additionally, it supports POSIX for backward compatibility, along with caching, replication, backup and recovery, and wide area replication for disaster recovery. The project's inventor is Prasenjit Sarkar, a Master Inventor in Storage Analytics and Resiliency at IBM Research-Almaden.
“The world is overflowing with petabytes to exabytes of data and the challenge is to store this data efficiently so that it can be accessed quickly at any point in time. This new way of storage partitioning is another step forward on this path as it gives businesses faster time-to-insight without concern for traditional storage limitations,” Sarkar said.
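The shared-nothing idea described above can be illustrated with a small sketch. This is not IBM's implementation; the `Node` class, the round-robin placement in `place_blocks`, and the word-count job are all hypothetical stand-ins chosen only to show independent nodes working on their own local data, with partial results combined at the end.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """Hypothetical self-sufficient node: it owns its data and shares nothing."""
    name: str
    local_blocks: List[str] = field(default_factory=list)

    def count_words(self) -> int:
        # Each node processes only the blocks stored locally.
        return sum(len(block.split()) for block in self.local_blocks)


def place_blocks(blocks: List[str], nodes: List[Node]) -> None:
    # Round-robin placement is an assumption made for illustration only.
    for i, block in enumerate(blocks):
        nodes[i % len(nodes)].local_blocks.append(block)


def run_job(nodes: List[Node]) -> Tuple[Dict[str, int], int]:
    # Nodes compute independently; only the small partial results are combined.
    partials = {node.name: node.count_words() for node in nodes}
    return partials, sum(partials.values())


nodes = [Node("node-a"), Node("node-b"), Node("node-c")]
blocks = [
    "big data in the cloud",
    "shared nothing cluster",
    "parallel file system",
    "analytics",
]
place_blocks(blocks, nodes)
partials, total = run_job(nodes)
print(partials, total)
```

Because no node reads another node's blocks, adding nodes adds both storage and compute without a shared bottleneck, which is the property a shared-nothing design aims for.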
Though Sarkar declined to comment on how IBM might commercialize GPFS, the file system already serves as the basis for the IBM Scale Out Network Attached Storage (SONAS) platform, used in the IBM Information Archive and the IBM Smart Business Compute Cloud. The platform scales capacity and performance while providing parallel access to data and a global name space that can manage billions of files and up to 14.4PB of capacity.
GPFS-SNC will also be used in the VISION Cloud initiative, a group involving 15 European countries that is developing a new approach to cloud storage. In this approach, which the group calls a “smart cloud storage architecture,” data is represented by smart objects that include information describing the content of the data and how the object should be handled, replicated, or preserved. The architecture combines a rich object data model, execution of computations close to the stored content, content-centric access, and full data interoperability.
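As a rough illustration of the smart-object idea, the sketch below pairs data with metadata describing its content and how it should be handled. The `SmartObject` fields, `find_by_content`, and `placement_plan` are hypothetical names invented for this example; they do not reflect the actual VISION Cloud data model or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SmartObject:
    """Hypothetical smart object: data plus content and handling metadata."""
    data: bytes
    content_metadata: Dict[str, str] = field(default_factory=dict)
    replicas: int = 1          # how many copies should be kept (assumed field)
    retention_days: int = 0    # how long to preserve the object (assumed field)


def find_by_content(objects: List[SmartObject], key: str, value: str) -> List[SmartObject]:
    # Content-centric access: select objects by what they contain,
    # not by where they are stored.
    return [obj for obj in objects if obj.content_metadata.get(key) == value]


def placement_plan(obj: SmartObject, sites: List[str]) -> List[str]:
    # The object itself states its replication requirement.
    if obj.replicas > len(sites):
        raise ValueError("not enough sites for the requested replica count")
    return sites[:obj.replicas]


video = SmartObject(b"...", {"type": "video", "language": "de"}, replicas=2, retention_days=365)
report = SmartObject(b"...", {"type": "document", "language": "en"})

matches = find_by_content([video, report], "type", "video")
plan = placement_plan(video, ["site-1", "site-2", "site-3"])
print(len(matches), plan)
```

Keeping the handling policy on the object, rather than in a separate system, is what lets storage services decide on replication or preservation without outside configuration.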
For Hadoop, the partnership with IBM is a big step in its development and adoption. The open-source project has gained a number of partners this past year, with huge developments in the social networking space through Twitter and the recently launched Facebook Mail.
IBM’s partners in the VISION Cloud initiative include SAP AG, Siemens Corporate Technology, Engineering and ITRicity; Telefónica Investigación y Desarrollo, Orange Labs and Telenor; RAI and Deutsche Welle; the SNIA Europe standards organization; and the National Technical University of Athens, Umeå University, the Swedish Institute of Computer Science and the University of Messina.