LinkedIn open-sources its internal tools for dealing with outages
LinkedIn’s tradition of sharing internal technologies with the developer community is going as strong as ever under Microsoft Corp.’s wing.
Its latest contribution is a pair of incident management tools designed to ease the task of responding to technical issues. In a blog post published this morning, LinkedIn engineer Daniel Wang wrote that they were originally created to help internal operations personnel deal with the technical challenges caused by the social network’s growth. Most important, staffers needed an efficient way to notify a colleague when something went wrong with a system under that colleague’s responsibility.
Before, LinkedIn’s engineers had to identify the relevant contact manually and then figure out the best communications channel for alerting them to the problem. The first of the two tools released today, Iris, which LinkedIn said was named for the Greek goddess of messages (pictured), automates that workflow and takes the hassle out of the task.
The controls are fairly straightforward. Users can have Iris send their incident report to the engineer on support duty, generate a predetermined number of follow-ups and then try reaching another contact if no response is received within a certain time frame. The software also provides the option to alert multiple colleagues at once, which is useful if an issue is particularly urgent.
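To make the escalation behavior described above concrete, here is a minimal Python sketch of such a workflow. It is not Iris's actual API or configuration format: the `Step` class, the `run_plan` function and the `send` and `wait_for_ack` callbacks are all hypothetical names invented for illustration, and the sketch assumes each step notifies its contacts a fixed number of times before moving on to the next step.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    """One stage of a hypothetical escalation plan (not Iris's real schema)."""
    contacts: List[str]   # everyone listed here is alerted at once
    attempts: int         # initial notification plus follow-ups
    wait_seconds: int     # how long to wait for a reply after each attempt


def run_plan(
    steps: List[Step],
    send: Callable[[str, int], None],
    wait_for_ack: Callable[[List[str], int], Optional[str]],
) -> Optional[str]:
    """Walk the plan in order and return whoever acknowledges first.

    `send(contact, attempt)` delivers one notification.
    `wait_for_ack(contacts, timeout)` waits up to `timeout` seconds and
    returns the name of a contact who responded, or None if nobody did.
    Both are assumed to be supplied by the caller.
    """
    for step in steps:
        for attempt in range(step.attempts):
            for contact in step.contacts:
                send(contact, attempt)
            responder = wait_for_ack(step.contacts, step.wait_seconds)
            if responder is not None:
                return responder
    # Every step was exhausted without a response.
    return None
```

In this sketch, a plan such as `[Step(["primary"], 3, 300), Step(["secondary", "manager"], 1, 300)]` would notify the engineer on duty three times, five minutes apart, and then alert two other people simultaneously if nobody answers.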
Over on the receiving end, Iris lets operations staff customize how they wish to be contacted. An engineer could, for example, configure the software to send messages labeled “Urgent” to their phone so that they can respond outside office hours if need be.
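The per-recipient preferences can be pictured as a simple lookup from a message's priority label to a delivery channel. The Python sketch below is an assumption-laden illustration rather than Iris's real data model: the channel names, the `DEFAULT_CHANNEL` fallback and the `pick_channel` helper are all hypothetical.

```python
from typing import Dict

# Hypothetical fallback used when a recipient has set no preference
# for a given priority label.
DEFAULT_CHANNEL = "email"


def pick_channel(preferences: Dict[str, str], priority: str) -> str:
    """Return the channel a recipient chose for this priority label.

    Labels are compared case-insensitively, so "Urgent" and "urgent"
    resolve to the same preference.
    """
    return preferences.get(priority.lower(), DEFAULT_CHANNEL)


# Example: an engineer who wants urgent messages on their phone.
engineer_prefs = {"urgent": "phone_call", "high": "sms"}
```

With these preferences, a message labeled “Urgent” would be routed to a phone call, while a label with no stored preference would fall back to email.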
Iris could be an appealing free alternative to paid incident management services such as VictorOps for companies that don’t require too many bells and whistles. The software’s simple user interface can be an outright advantage for smaller firms, while enterprises might be drawn in by the fact that LinkedIn has included several reliability features. The social network claims to have experienced just one major Iris outage in nearly two years of use.
LinkedIn’s engineers use the platform together with Oncall, the other homegrown tool that was open-sourced today. It’s a scheduling application that lets managers create support shifts for the engineering team in a dashboard designed with ease of use in mind.
Iris and Oncall are available on LinkedIn’s GitHub page.
Image: Wikipedia