LinkedIn open-sources Ambry, its ultra-scalable object store for media files
Though the profile pictures, company logos and other images that litter LinkedIn often blend into the background without receiving much attention, they take up a significant portion of the user’s view. And added up across the many billions of pages that the social network hosts, the amount of media involved far exceeds the capacity of traditional databases. So in typical Silicon Valley fashion, its engineers decided to build their own custom system a few years ago to better deal with the load.
The effort culminated in 2014 with the creation of Ambry, an ultra-scalable object store that has been made available to the public today under a free license. LinkedIn’s internal deployment of the platform is able to efficiently store trillions of user-submitted images by keeping metadata like the identity of an uploader together with the media content it describes. The arrangement avoids the fragmentation that occurs when a large number of files are kept in a regular NoSQL database, which reduces operational complexity and thereby slashes the amount of processing power needed to complete access requests.
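To make the idea of keeping metadata next to the content it describes more concrete, here is a minimal sketch in Python. It is not Ambry's actual storage format or API: the `TinyBlobLog` class, its record layout and the metadata field names are all assumptions made for illustration. Each record in a single append-only log holds a small header, the metadata and the raw bytes, so one lookup returns both.

```python
import io
import json
import struct
import uuid


class TinyBlobLog:
    """Toy append-only store (hypothetical, not Ambry's real on-disk format).

    Each record is laid out as: header | metadata (JSON) | content bytes.
    """

    # Header: two unsigned 32-bit big-endian ints (metadata length, content length).
    _HEADER = struct.Struct(">II")

    def __init__(self):
        self._log = io.BytesIO()  # stands in for one large file on disk
        self._index = {}          # blob id -> byte offset of its record

    def put(self, content: bytes, **metadata) -> str:
        """Append content and its metadata as one record; return a new blob id."""
        blob_id = uuid.uuid4().hex
        meta = json.dumps(metadata).encode("utf-8")
        offset = self._log.seek(0, io.SEEK_END)  # always write at the end
        self._log.write(self._HEADER.pack(len(meta), len(content)))
        self._log.write(meta)
        self._log.write(content)
        self._index[blob_id] = offset
        return blob_id

    def get(self, blob_id: str):
        """Return (metadata, content) from a single sequential read of the record."""
        self._log.seek(self._index[blob_id])
        meta_len, content_len = self._HEADER.unpack(self._log.read(self._HEADER.size))
        metadata = json.loads(self._log.read(meta_len))
        content = self._log.read(content_len)
        return metadata, content


store = TinyBlobLog()
photo_id = store.put(b"\x89PNG...", uploader="member-42", content_type="image/png")
metadata, content = store.get(photo_id)
print(metadata["uploader"], len(content))
```

Because many small images share one large log rather than each living in its own file or database row, a read is one seek plus one sequential scan of the record, which is the general property the paragraph above describes.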
As a result, Ambry can provide fast response times even in an implementation as large as LinkedIn’s. At the same time, it’s able to maintain an impressive level of availability: The company says that the platform was developed with a monthly uptime goal of over 99.5 percent, which allows for at most about 3.6 hours of outages in a 30-day month. The object store’s reliability is owed to an active-active replication service running in every node that keeps the images inside constantly synchronized with the rest of the environment. As a result, work can continue as usual when a disk drive or server fails and the amount of information lost in the process is kept to a minimum.
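The downtime allowance implied by an uptime target is easy to check with a little arithmetic. The short Python sketch below assumes a 30-day month and treats a quarter as three such months; the function name is made up for this example, and the 99.95 percent row is only a comparison point, not a figure from LinkedIn.

```python
def downtime_budget_hours(uptime_percent: float, period_hours: float) -> float:
    """Hours of outage permitted in a period at the given uptime percentage."""
    return period_hours * (100.0 - uptime_percent) / 100.0


MONTH_HOURS = 30 * 24  # assumption: a 30-day month

for target in (99.5, 99.95):
    monthly = downtime_budget_hours(target, MONTH_HOURS)
    print(f"{target}% uptime -> {monthly:.2f} h/month, {3 * monthly:.2f} h/quarter")
```

Each additional "nine" in the target shrinks the allowance by a factor of ten, which is why seemingly small differences in the percentage matter a great deal operationally.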
Because Ambry takes care of the logistics on its own, all the user querying the system has to worry about is how fast they want to access their information. Images can be fetched through a standard REST API that easily interfaces with external applications, or streamed directly in their raw byte form for improved performance. The latter option is made possible by a special library that LinkedIn plans to speed up even more in future releases of the object store by enabling parallel queries.
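For readers who have not worked with an object store over REST, the following Python sketch shows the general shape of such a fetch. It does not use Ambry's real endpoints or headers: the URL scheme (`/<blob id>`), the `BLOBS` table and the helper names are all hypothetical, and a tiny local server from the standard library stands in for the storage front end so that the example is self-contained.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in data: blob id -> (content bytes, content type).
BLOBS = {"abc123": (b"\x89PNG fake image bytes", "image/png")}


class BlobHandler(BaseHTTPRequestHandler):
    """Serves GET /<blob id> from the in-memory table above."""

    def do_GET(self):
        blob = BLOBS.get(self.path.lstrip("/"))
        if blob is None:
            self.send_error(404, "unknown blob id")
            return
        content, content_type = blob
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(content)))
        self.end_headers()
        self.wfile.write(content)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_blob(base_url: str, blob_id: str):
    """Fetch one blob over HTTP and return (content type, raw bytes)."""
    # An empty ProxyHandler makes sure the local request bypasses any proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"{base_url}/{blob_id}") as response:
        return response.headers["Content-Type"], response.read()


def demo():
    """Start the stand-in server on a free port, fetch one blob, shut down."""
    server = HTTPServer(("127.0.0.1", 0), BlobHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        base_url = f"http://127.0.0.1:{server.server_address[1]}"
        return fetch_blob(base_url, "abc123")
    finally:
        server.shutdown()
        server.server_close()


content_type, data = demo()
print(content_type, len(data))
```

The client side is only a few lines, which is the appeal of a REST interface: any application that can issue an HTTP GET can retrieve an image by its identifier.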
Though not a lot of companies deal with as much media content as LinkedIn, Ambry could still come in handy in a wide range of use cases. A major online retailer, for instance, could deploy the system to manage product images, while a publishing portal like Medium could use it to store the pictures and videos that users embed in their articles.