LinkedIn launches new data features for Samza stream processing engine
LinkedIn today announced a major new release of the Samza stream processing engine, one of its marquee open-source projects, that will give enterprises more options in how they analyze their real-time data.
Samza is built to process high-volume data streams quickly with a high degree of reliability. This feature combination makes it handy for a number of important enterprise use cases. Among them are infrastructure monitoring, fraud detection and analyzing data from connected devices such as sensors embedded in factory equipment.
Building an analytics application that can harness Samza to ingest such information is a complex undertaking. That’s why a significant part of the software’s installed base is made up of tech firms such as Netflix Inc., Uber Technologies Inc. and VMware Inc. To ease developers’ work, Samza 1.0 introduces two new ways of plugging workloads into the engine besides the native application programming interface.
The first is a tool called Samza SQL. As the name implies, it enables applications to interact with data processed by Samza using the industry-standard Structured Query Language. LinkedIn said that the tool is more accessible than the native API and removes the need for developers to sort out low-level details such as provisioning hardware resources manually.
The other new interface alternative takes the form of an integration with the open-source Beam project. Beam provides a unified API for popular analytics engines such as Spark and Flink that spares software teams the trouble of familiarizing themselves with each individual platform. According to LinkedIn, the integration will make Samza-powered applications more portable while enabling developers to use a wider selection of programming languages.
The Microsoft Corp.-owned company also took the opportunity to revamp the native API itself. Samza 1.0 adds built-in commands for performing tasks such as filtering data that previously required developers to build custom workflows from scratch.
“Developers had to implement complex operations such as windows and joins by themselves on top of this API,” LinkedIn engineer Jagadish Venkatraman wrote in a blog post. “This made building applications time consuming and error-prone. To address this in Samza 1.0, we built a high-level API with built-in operators like map, filter, join, window, etc. This allows you to express complex data pipelines easily by combining multiple operators.”
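To make the idea of combining operators concrete, here is a minimal sketch in plain Python. It is not the Samza API: the `Stream` class, its `tumbling_window` method and the sample events are all hypothetical, and the "stream" is just an in-memory list. It only illustrates how chaining filter, map and window operators can express a pipeline that would otherwise need hand-written bookkeeping.

```python
from collections import defaultdict


class Stream:
    """Toy stand-in for a message stream (hypothetical, not the Samza API)."""

    def __init__(self, items):
        self._items = list(items)

    def map(self, fn):
        # Transform every message.
        return Stream(fn(x) for x in self._items)

    def filter(self, pred):
        # Keep only messages for which the predicate holds.
        return Stream(x for x in self._items if pred(x))

    def tumbling_window(self, size_ms, ts_fn, key_fn):
        # Count messages per key in fixed, non-overlapping time windows.
        counts = defaultdict(int)
        for x in self._items:
            window_start = (ts_fn(x) // size_ms) * size_ms
            counts[(window_start, key_fn(x))] += 1
        return Stream(sorted(counts.items()))

    def collect(self):
        return list(self._items)


# Made-up sample events; timestamps are in milliseconds.
events = [
    {"ts": 100, "user": "a", "type": "view"},
    {"ts": 900, "user": "a", "type": "view"},
    {"ts": 950, "user": "b", "type": "click"},
    {"ts": 1200, "user": "a", "type": "view"},
    {"ts": 1300, "user": "b", "type": "view"},
]

# Count page views per user in one-second windows.
page_views_per_window = (
    Stream(events)
    .filter(lambda e: e["type"] == "view")
    .map(lambda e: (e["ts"], e["user"]))
    .tumbling_window(1000, ts_fn=lambda e: e[0], key_fn=lambda e: e[1])
    .collect()
)
print(page_views_per_window)
```

A real stream processor applies the same operators to unbounded input and keeps the window state fault-tolerant, which is the part developers previously had to build themselves.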
By simplifying development, these improvements could make the engine accessible to a broader range of enterprises. The same is true of the new “standalone mode” rolling out alongside them.
Until now, Samza had to be deployed with YARN, an open-source system for managing hardware resources and application workflows. The software is fairly popular, but it’s just one of several tools that enterprises use for the task. The standalone mode gives companies the flexibility to build Samza directly into an analytics service and then use their YARN alternative of choice to manage that service.
“As Samza gained momentum, our users desired the flexibility to run stream processing in any environment —Kubernetes, Mesos, or on the cloud,” Venkatraman wrote. “This mode allows Samza to be embedded as a lightweight library within an application and run on any resource manager of your choice. You can increase parallelism by simply spinning up more instances of your application.”
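The point about adding instances can be illustrated with a small Python sketch. The article does not describe how Samza divides work among instances, so the round-robin scheme and the `assign_partitions` helper below are assumptions made purely for illustration: a fixed set of input partitions is spread over however many application instances are running, so each new instance reduces the share handled by the others.

```python
def assign_partitions(num_partitions, num_instances):
    """Spread partition IDs over instances round-robin (illustrative only)."""
    assignment = {i: [] for i in range(num_instances)}
    for partition in range(num_partitions):
        assignment[partition % num_instances].append(partition)
    return assignment


# Eight input partitions shared by two instances, then by four.
two_instances = assign_partitions(8, 2)
four_instances = assign_partitions(8, 4)

for label, assignment in (("2 instances", two_instances),
                          ("4 instances", four_instances)):
    print(label, assignment)
```

Under this assumed scheme, doubling the instance count halves the number of partitions each instance processes, which is the sense in which more instances means more parallelism.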