Who owns this outage? Building intelligent, automated escalation chains

The Stack Overflow Podcast - A podcast by The Stack Overflow Podcast

If your organization runs code on a production server 24/7, you need a process for handling failures in that code or in the infrastructure it runs on. No code is bug-free, so failures will happen. That means your SREs and developers will have to spend some time on call, ready to respond when the application breaks down. On this sponsored episode of the podcast, we talk to Eric Maxwell, a solution architect at xMatters, about building intelligent, automated escalation chains.
