Over the last few years, teams have realized the benefits of sharing and distributing knowledge in chat applications such as Slack. Today, teams are extending the use of these applications beyond collaboration by embracing ChatOps. ChatOps empowers teams by bringing complex day-to-day operational work into shared chat channels. If done correctly, it drastically reduces context switching and increases the speed at which teams can tackle tough challenges.
Companies are realizing significant benefits by incorporating ChatOps into their incident management workflows. Chat channels provide a highly effective means of alerting teams to issues, taking immediate action and collaborating with clarity and urgency. For this reason, companies including OpsGenie, PagerDuty and VictorOps have invested in integrating their incident management applications into chat applications such as Slack, HipChat and Microsoft Teams.
ChatOps to the Rescue
Ideally, users of a modern incident management solution need to have access to a multitude of features so that they can address specific challenges directly within the chat tool. During an incident, switching between applications to access and collaborate on the actionable data is a waste of time. Keeping the conversation and action in one place, where team members already are, is the key to a fast and successful incident resolution process. That is why ChatOps has emerged as an alternative to the war room concept.
To get the most out of the technology and resolve problems more quickly and easily, here are six tips to consider when using ChatOps for incident management:
1. Post Messages with Clarity
The basic requirement for integrating any application with a chat tool is simply to have messages posted in chat channels. However, when the goal is greater efficiency, there are three additional needs to consider when integrating an incident management tool into chat.
- The integration must be able to post chat messages regardless of where the incident originates or where additional information is added (monitoring, ticketing, or collaboration tools).
- Messages need to be customized based on the specifics or severity of the incident. Often the basic message needs to be appended with tags or additional information fields.
- If possible, include a button in the message that allows responders to take instant, prescriptive actions.
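The three needs above can be sketched in a few lines of Python. This is a minimal illustration, not the payload format of any particular chat tool: the `Incident` fields, the severity prefixes, and the `text`/`actions` dictionary layout are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical incident record; field names are assumptions for this sketch.
@dataclass
class Incident:
    id: str
    title: str
    severity: str          # e.g. "critical", "warning", "info"
    source: str            # e.g. "monitoring", "ticketing", "collaboration"
    tags: List[str] = field(default_factory=list)

SEVERITY_PREFIX = {"critical": "[CRITICAL]", "warning": "[WARNING]", "info": "[INFO]"}

def build_chat_message(incident: Incident) -> dict:
    """Build a generic chat message, customized by severity, tags and actions."""
    prefix = SEVERITY_PREFIX.get(incident.severity, "[UNKNOWN]")
    text = f"{prefix} {incident.title} (source: {incident.source})"
    if incident.tags:
        # Append tags as additional information on the basic message.
        text += " tags: " + ", ".join(incident.tags)
    # Every message gets an acknowledge button; critical ones also get escalate.
    actions = [{"label": "Acknowledge", "command": f"ack {incident.id}"}]
    if incident.severity == "critical":
        actions.append({"label": "Escalate", "command": f"escalate {incident.id}"})
    return {"text": text, "actions": actions}
```

The same function handles incidents from any source, so a monitoring alert and a ticket update end up in chat with a consistent shape.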
2. Control How Information is Shared
Posting every alert and comment into a shared channel quickly becomes overwhelming and is considered bad practice. A strong integration lets teams create separate channels for specific responders and gives them direct control over which types of alerts and follow-up messages are shown.
A seamless integration also provides the flexibility to receive only the incident actions that are decidedly relevant. Low-priority issues that don’t require immediate action can be filtered and not sent to the chat stream, ensuring the team stays focused on what really matters.
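As a rough sketch of this kind of control, the routing function below sends alerts to team-specific channels and drops anything below a priority threshold. The priority labels (`P1` to `P5`), team names, and channel names are hypothetical values chosen for illustration.

```python
from typing import Optional

# Assumed priority scale: P1 is the most urgent, P5 the least.
PRIORITY_RANK = {"P1": 1, "P2": 2, "P3": 3, "P4": 4, "P5": 5}

# Hypothetical mapping of responder teams to their dedicated channels.
TEAM_CHANNELS = {"database": "#db-incidents", "network": "#net-incidents"}

def route_alert(alert: dict, min_priority: str = "P3",
                default_channel: str = "#incidents") -> Optional[str]:
    """Return the channel an alert should be posted to, or None to filter it out."""
    if PRIORITY_RANK[alert["priority"]] > PRIORITY_RANK[min_priority]:
        # Low-priority alert: keep it out of the chat stream.
        return None
    return TEAM_CHANNELS.get(alert["team"], default_channel)
```

With the default threshold, a `P4` alert is never posted, while a `P1` database alert lands in the database team's own channel.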
3. Accelerate Actions
ChatOps transforms chat into a vehicle through which users take actions and deploy solutions. This virtually eliminates the need for operations teams to switch between applications when responding to an incident.
Most integrations enable users to execute incident actions using text commands. This method is valid and often preferred. However, deeper integrations simplify tasks by providing interactive buttons that enable users to begin working on incidents with just a click.
The value of responding to an alert by clicking a button extends beyond simple alert acknowledgement. Enabling all common responses (e.g. assigning an alert, taking ownership, adding notes, or muting incidents) to be executed in this manner accelerates resolution and unlocks the real power of ChatOps.
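One way to support both styles is to have buttons carry the same command a user would otherwise type, so a single parser serves both paths. The sketch below assumes a hypothetical command grammar (`<action> <incident id> [note]`) and a hypothetical button payload with a `command` field; neither comes from a specific product.

```python
# Hypothetical set of actions the integration accepts.
SUPPORTED_ACTIONS = {"ack", "assign", "own", "note", "mute", "close"}

def parse_command(text: str) -> dict:
    """Parse a text command such as 'ack 42' or 'note 42 restarting the replica'."""
    parts = text.strip().split(maxsplit=2)
    if len(parts) < 2:
        raise ValueError("expected '<action> <incident id> [note]'")
    action, incident_id = parts[0].lower(), parts[1]
    if action not in SUPPORTED_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    note = parts[2] if len(parts) > 2 else ""
    return {"action": action, "incident_id": incident_id, "note": note}

def from_button(payload: dict) -> dict:
    """Translate a button click into the same structure as a typed command."""
    return parse_command(payload["command"])
```

Because a click and a typed command resolve to the same structure, every action added to the text interface is available as a button without extra handling.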
4. Instantly Retrieve Supporting Information
A challenge of a long chat thread is that critical information can get lost in the conversation, or additional information may be needed. A user might have to leave the channel to look up important details and related information, such as other open incidents.
The integration of an incident management system into the chat environment needs to allow users to access supporting information at any time. The ability to query not only additional details about a specific incident but also general information regarding on-call schedules and alert policies is a requirement. Team members who can access this data directly within the chat interface stay focused and are able to respond as efficiently as possible.
5. Create New Alerts
A common task when managing an incident is to create manual alerts. Again, requiring people to switch applications to complete this task is a distraction and slows progress.
Creating new alerts for a team or service right within the chat channel is a necessity. The incident management system then notifies the person on call automatically. Having the ability to create an alert directly within chat empowers individuals who may not have direct access to the incident management system to report problems efficiently.
6. Control Access
When an incident occurs, it is important that corrective action is taken by only the people who are authorized to respond. Mapping (or linking) the chat users to the users within the incident management system enables this level of control.
Linking the user profiles between the chat tool and the incident management system has additional benefits. As alerts are acknowledged and responded to within the chat environment, the events are recorded within the incident management solution. A clear history describing who took each action is recorded for future reporting and analysis.
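The mapping and the resulting history could look roughly like the following. The user IDs, e-mail addresses, permission sets, and the in-memory `audit_log` are all hypothetical; a real integration would rely on the incident management system's own user store and event history.

```python
from datetime import datetime, timezone

# Hypothetical link between chat user IDs and incident management accounts.
CHAT_TO_IM_USER = {"U123": "alice@example.com", "U456": "bob@example.com"}

# Hypothetical permissions held in the incident management system.
PERMISSIONS = {
    "alice@example.com": {"ack", "assign", "close"},
    "bob@example.com": {"ack"},
}

audit_log = []  # stand-in for the incident management system's event history

def execute_action(chat_user_id: str, action: str, incident_id: str) -> bool:
    """Run an action only for linked, authorized users and record who did it."""
    im_user = CHAT_TO_IM_USER.get(chat_user_id)
    if im_user is None:
        return False  # chat user is not linked to an incident management account
    if action not in PERMISSIONS.get(im_user, set()):
        return False  # linked, but not authorized for this action
    audit_log.append({
        "user": im_user,
        "action": action,
        "incident_id": incident_id,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return True
```

Only successful actions reach the log, so the recorded history reflects exactly who changed each incident and when.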
While there are many things to consider when integrating incident management into a chat application, the benefits are clear. Response time, ability to collaborate, and team access to incident details are crucial factors for incident resolution.
About the Author / Serhat Can
