Two CtrlStack Integrations For Faster Incident Response

June 21, 2023

At CtrlStack, we’re always thinking about ways to make it seamless for customers to work across their suite of tools for effective troubleshooting. Today I’ll focus on two of those tools: PagerDuty and FireHydrant. Both are great for incident response and on-call management, but most businesses aren’t making the most of them. Ask yourself:

  • When alerted to an incident, can you immediately see the root cause without a lot of context switching?
  • Can you capture knowledge about how to resolve common incidents and surface that knowledge during incident response?
  • Can you automatically generate a postmortem without adding stress and time for your response team?

CtrlStack’s integrations with PagerDuty and FireHydrant help make DevOps teams more efficient when responding to incidents and during on-call. By providing a link to an automated root cause diagnosis of any event, CtrlStack gives teams the context they need to kick off their investigation. CtrlStack also automatically captures all the actions taken by team members, so anyone joining the response has the right context and can see what has been done, when, and by whom. More importantly, teams don’t need to spend extra time writing a postmortem; CtrlStack generates it automatically.

PagerDuty Integration

PagerDuty provides a broad range of incident response and on-call management capabilities to alert teams of issues and automate tasks and processes for resolution. It’s a common tool used by SREs and operations teams for scheduling on-call coverage and routing alerts and events from monitoring tools to the right people. Its incident response capabilities include runbooks that use machine learning to identify duplicate events and suggest actions for resolution while keeping everyone informed of what’s happening.

Through its integration with PagerDuty, CtrlStack helps teams track change impact and find the root causes of production issues faster. CtrlStack monitors PagerDuty events, correlates the events with relevant changes in the system, and automatically performs a root cause analysis (RCA) on the impacted service. CtrlStack also provides all the context needed to continue the investigation: links to an automatic timeline of the incident, metrics charts, relationship mapping, and event mapping. With the assistance of ChatGPT, CtrlStack generates an automatic diagnosis and explains it in detail.
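To make the idea of correlating an incident with relevant changes more concrete, here is a minimal Python sketch. It is not CtrlStack’s implementation or PagerDuty’s API; the `Incident` and `ChangeEvent` types, the `correlate_changes` function, and the 30-minute lookback window are all hypothetical names and values chosen for illustration. It simply picks out changes made to the affected service shortly before the incident fired.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ChangeEvent:
    # Hypothetical record of a change (deploy, config edit, etc.)
    service: str
    description: str
    timestamp: datetime


@dataclass(frozen=True)
class Incident:
    # Hypothetical record of an incident raised by an alerting tool
    service: str
    title: str
    triggered_at: datetime


def correlate_changes(incident, changes, lookback=timedelta(minutes=30)):
    """Return changes to the incident's service made inside the
    lookback window before the incident, most recent first."""
    window_start = incident.triggered_at - lookback
    candidates = [
        c for c in changes
        if c.service == incident.service
        and window_start <= c.timestamp <= incident.triggered_at
    ]
    return sorted(candidates, key=lambda c: c.timestamp, reverse=True)


# Made-up sample data, for illustration only
incident = Incident("checkout", "High error rate", datetime(2023, 6, 21, 12, 0))
changes = [
    ChangeEvent("checkout", "Deploy v2", datetime(2023, 6, 21, 11, 50)),
    ChangeEvent("checkout", "Config flag flipped", datetime(2023, 6, 21, 11, 58)),
    ChangeEvent("search", "Deploy v3", datetime(2023, 6, 21, 11, 55)),
    ChangeEvent("checkout", "Deploy v1", datetime(2023, 6, 21, 9, 0)),
]
suspects = correlate_changes(incident, changes)
for change in suspects:
    print(change.timestamp.isoformat(), change.description)
```

A real system would likely weigh far more signals than service name and time proximity (topology, metrics, and so on); the sketch only shows why having change events and incidents on one timeline shortens the search for a cause.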

Here’s a demo of the CtrlStack and PagerDuty integration:

FireHydrant Integration

FireHydrant focuses on defining and automating incident response processes, including automatically updating end-user-facing status pages when services are disrupted. It allows incidents to be declared and managed directly from Slack. Through integrations, it also supports on-call notifications and ticketing. To streamline the retrospective process, FireHydrant automatically captures a timeline of actions taken during an incident.

CtrlStack helps FireHydrant users troubleshoot production issues faster by automatically generating the root cause analysis within 30 seconds. Having the ability to trigger a root cause analysis from any entity in the topology view streamlines the investigative process, taking you from the incident to the RCA flow in just one click. From there, you can observe the events that led to the incident, tracing all the way back to the root cause. In the same dashboard, you can drill down to the event details, the backend infrastructure impacted, and relevant logs to get you troubleshooting and resolving the issue quickly.
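As a rough illustration of what tracing back from an incident to its root cause can look like, the Python sketch below walks a chain of cause links. The `trace_to_root` function, the `caused_by` mapping, and the event names are hypothetical and invented for this example; they are not part of CtrlStack or FireHydrant.

```python
def trace_to_root(event_id, caused_by):
    """Follow caused_by links from event_id back to the earliest
    event that has no recorded cause. Returns the full path."""
    path = [event_id]
    seen = {event_id}
    while path[-1] in caused_by:
        parent = caused_by[path[-1]]
        if parent in seen:  # stop if the links loop back on themselves
            break
        path.append(parent)
        seen.add(parent)
    return path


# Made-up sample data: each event maps to the event believed to have caused it
caused_by = {
    "incident: checkout errors": "alert: high latency",
    "alert: high latency": "event: pod restarts",
    "event: pod restarts": "change: config update",
}
chain = trace_to_root("incident: checkout errors", caused_by)
for step in chain:
    print(step)
```

The last entry in `chain` is the candidate root cause. In practice an event can have several possible causes, so a real tool would traverse a graph rather than a single chain.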

Here’s a demo of the CtrlStack and FireHydrant integration:

Whether you’re using PagerDuty or FireHydrant, CtrlStack can greatly improve your incident response and team efficiency. If you would like to try it out for yourself, sign up for beta access now. It’s free.

About the Author
Mary Chen
Sr. Director, Product Marketing