Continuing our five-part series on “Continuous Monitoring: The Role of DevOps and APM,” this blog post will talk about how alert notification can help DevOps teams be more productive and agile.
Amid the constant, rapid change and fluidity typical in DevOps organizations, there are some undeniable absolutes. These absolutes, present at each stage of the DevOps life cycle, demand your teams’ attention: A build or test succeeds or it fails; an outage of an app in production is restored to service or it lingers unavailable; users applaud or they blast out complaints. And with each of these states throughout the DevOps life cycle there are absolutes common at every step: Your team is either communicating or they are isolated, and, subsequently, either in sync or operating in confusion and chaos.
Such confusion is still common, but avoidable. The means of communication are more abundant than ever. We can text, email, use collaboration apps or even resort to the obscure method of speaking into a telephone to another human in real time. Yet, despite all these methods, DevOps teams still find themselves taking too long to become aware of issues that need immediate attention or action. Or, we realize too late that multiple people are responding to or chasing down the same issue. Ironically, the easier it is to communicate and even automate communication across all of the different methods, the more likely we are to be so inundated with messages that we lose sight of what we really need to be paying attention to.
What’s the best way to cut through all the noise? DevOps teams can:
- Filter the junk and cut down on the noise to begin with—without missing something important.
- Direct the relevant info only to those people that need to take action, and only when it’s the appropriate time.
- Require an acknowledgment that the recipient is taking ownership of the issue.
- Or, of course, all of the above.
There have been tools available over the years to tackle these noise-slaying objectives. Some do a great job getting pertinent DevOps information to our teams, but offer no way to filter out the irrelevant or redundant stuff. Others can do some of everything, but are complex to set up and maintain and expensive to use.
DevOps teams I’ve talked to want to be able to:
- Immediately and automatically send critical, actionable alerts.
- Route alerts to the right people immediately, but only when it is the appropriate time to do so (don’t wake up the wrong teammate!).
- Speed alert response with automated acknowledgment and escalation.
- Automatically communicate through text, email, voice mail and mobile app, all custom-designed for DevOps use cases.
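To make the "right people, right time" idea concrete, here is a minimal Python sketch of time-based routing. It is an illustration only, not the behavior of any particular product: the `Shift` class, the `on_call_contacts` function, the contact names and the shift hours are all hypothetical, and times are assumed to be in a single time zone.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class Shift:
    """Hypothetical on-call shift: one contact and one daily time window."""
    contact: str
    start: time
    end: time  # exclusive; the window may wrap past midnight

    def covers(self, moment: datetime) -> bool:
        now = moment.time()
        if self.start <= self.end:
            return self.start <= now < self.end
        # Window wraps midnight, e.g. 22:00 to 06:00.
        return now >= self.start or now < self.end


def on_call_contacts(shifts: list[Shift], moment: datetime) -> list[str]:
    """Return only the contacts whose shift covers the given moment."""
    return [s.contact for s in shifts if s.covers(moment)]


# Illustrative rota; names and hours are made up for this example.
ROTA = [
    Shift("alice", time(6, 0), time(14, 0)),
    Shift("bob", time(14, 0), time(22, 0)),
    Shift("carol", time(22, 0), time(6, 0)),
]
```

With this rota, an alert arriving at 03:00 would go only to the contact on the overnight shift, so nobody else gets woken up.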
The prevalent expectation goes beyond solutions that are easy to use and simple to start and maintain. Effective tools also must meet the increasing demand for agile and efficient collaboration among DevOps team members who often use multiple communication tools. Capabilities to look for include:
- Filter out the noise, but don’t miss the important stuff:
  - Filter alerts (e.g., by problem severity) so you can focus on real problems and disregard IT operations noise. Create custom groups of contacts to make sure the right notifications reach the right team member, and streamline the process of determining whom to contact when specific problems occur and how to contact them.
- Notification channels:
  - Define experts who can respond to problems and store their contact details so they can be notified automatically of problems in their area of responsibility. Users should be organized into groups so you can send notifications to several users at once. It should be possible to send notifications by email, SMS, mobile app, voice message and chat apps.
- Alert management and notifications:
  - Tools with policy-based notifications can let you define which type of subject matter you want each person or group to be notified about. You should be able to create filters based on the alerts that occur, customize the filters and assign users and groups who are notified when matching problems occur. Unacknowledged alerts should be escalated automatically after a set time period to the appropriate contact. Filters on attributes such as alert severity and status can be created to suit your IT monitoring requirements.
- Scheduling:
  - You’ll want to be able to easily set up on-duty and on-call shifts. The process of scheduling is considered to be one of the most time-consuming tasks that prevent users from utilizing a notification tool. A good scheduling capability should be intuitive and require minimal steps to use. Users should have the ability to customize or use out-of-the-box templates. Users also should be able to manually create shifts or build in shift patterns, such as follow the sun, to get their scheduling done faster. Scheduling capabilities also should be smart enough to schedule users or groups for weeks in advance.
- Alert viewer:
  - You’ll also need to be able to monitor alert and notification status in management views, perform actions on alerts, and track them in real time. Seeing key alert information such as alert severity can help you easily prioritize your to-do list. You should be able to track alerts from their receipt by the system through acknowledgment to resolution, review changes in an alert history, and use predefined and real-time filters to select the alerts you want to see.
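The filtering, acknowledgment and escalation ideas in the list above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any specific notification tool works: the three-level severity scale, the `Alert` and `Policy` classes and the `recipients` function are all hypothetical names invented for this example.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumed severity scale; real tools define their own levels.
SEVERITY_RANK = {"info": 0, "warning": 1, "critical": 2}


@dataclass
class Alert:
    summary: str
    severity: str
    received_at: datetime
    acknowledged_by: Optional[str] = None


@dataclass
class Policy:
    """Hypothetical notification policy with a severity filter and escalation."""
    min_severity: str
    notify: list[str]         # first responders (users or groups)
    escalate_to: str          # contact added if nobody acknowledges in time
    escalate_after: timedelta

    def matches(self, alert: Alert) -> bool:
        return SEVERITY_RANK[alert.severity] >= SEVERITY_RANK[self.min_severity]


def recipients(alert: Alert, policy: Policy, now: datetime) -> list[str]:
    """Decide who should be notified about this alert right now."""
    if not policy.matches(alert):
        return []  # below the severity threshold: treated as noise
    if alert.acknowledged_by is not None:
        return []  # someone has already taken ownership
    if now - alert.received_at >= policy.escalate_after:
        return policy.notify + [policy.escalate_to]  # unacknowledged too long
    return list(policy.notify)
```

In this sketch, acknowledgment is what stops further notifications, which mirrors the goal of having exactly one person take ownership instead of several people chasing the same issue.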
Using capabilities such as these can empower you to improve responsiveness, shorten resolution time and focus on actionable notifications requiring immediate attention.
About the Author / James Moore
James Moore is responsible for offering management and strategy for IBM IT Service Management hybrid cloud-based solutions including Runbook Automation and Alert Notification. James joined IBM from the Candle acquisition in 2004, where he was product manager for Candle’s Application Response Time product lines. James has more than 15 years’ experience in business service management, application performance management and event management. Connect with James on LinkedIn and Twitter.