In a recent blog post, “Synthetic Monitoring – the Start of the DevOps Monitoring Journey,” my colleague Payal Chakravarty discussed how developers, testers and operations staff all need to ensure that their internet and intranet mobile and web applications are tested and operate successfully from different points of presence around the world.
Continuing our five-part series on “Continuous Monitoring: The Role of DevOps and APM,” this blog post looks at how analytics can help developers think about the performance of their applications earlier in the DevOps life cycle.
Continuous monitoring is not just about detecting operational problems in production; it is about getting feedback and reassurance that code changes you’ve made have resulted in the intended operational behavior before you deploy to production.
My career DNA is development, so I know full well how code changes can have unintended consequences.
Remember the bad old days when development methodologies relied on a series of quality checks (code reviews, various types of tests and release validation) to catch unintended consequences? If caught at all, most operational degradations surfaced long after the offending code was submitted, because the types of validation designed to catch these problems occurred at the very end of the development life cycle, when a “stability test” was possible. These “banana skin” issues were caught late or not caught at all. Either way, clients were impacted (through release delays, increased costs or service degradation) and you were not happy.
Continuous development was great for me, personally. It forced me to look at development in a whole new light and solve the “little and often” delivery problems. It was fun, but despite best efforts, one “banana skin” remained: the stability tests in the staging environment. Just like before, they could cause a late-breaking wobble in the delivery plan to production or, worse still, miss issues altogether.
I didn’t solve this problem well because I was still thinking like a developer; I simply don’t have the skills, the time or the interest to think like Operations.
If cognitive analytics are applied to operations problems, you don’t need to think like Operations to benefit from Operations thinking. Your solution will automatically learn what is normal for your application; it will set dynamic thresholds, and it will proactively notify you when behavior becomes significantly anomalous. With cognitive solutions, you can enable operations (continuous monitoring) in staging and detect anomalies long before the stability test fails or before the application hits production! All you need to do is feed it the application metrics. The good news is, your buddies in operations can show you how to do it.
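To make that idea concrete, here is a minimal sketch in Python. It is not tied to any particular product or API; it simply learns “normal” from a rolling window of metric samples, derives a dynamic threshold from that window and flags values that fall well outside it. The window size, sigma multiplier and response-time values are all illustrative.

```python
from collections import deque
from statistics import mean, stdev

class DynamicThreshold:
    """Learns what is 'normal' for a metric and flags significant anomalies.

    A minimal illustration of the idea: keep a rolling window of recent
    samples, derive thresholds from it, and alert only when a new sample
    falls well outside that learned range.
    """

    def __init__(self, window_size=500, sigmas=3.0):
        self.samples = deque(maxlen=window_size)  # rolling "normal" behavior
        self.sigmas = sigmas                      # how far out counts as anomalous

    def observe(self, value):
        """Return True if `value` is anomalous relative to learned behavior."""
        anomalous = False
        if len(self.samples) >= 30:               # wait for a minimal baseline
            mu, sd = mean(self.samples), stdev(self.samples)
            if sd > 0 and abs(value - mu) > self.sigmas * sd:
                anomalous = True
        self.samples.append(value)                # new values feed the evolving "normal"
        return anomalous

# Feed it application metrics (response times in ms, for example):
monitor = DynamicThreshold()
for response_time_ms in [120, 118, 125, 122, 119]:
    if monitor.observe(response_time_ms):
        print(f"Anomaly detected: {response_time_ms} ms")
```

Because each new sample joins the window, a sustained change eventually becomes the new “normal” and stops alerting on its own, which is exactly the behavior described in the scenario below.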
I am a positive person, so I am going to use a positive development scenario:
You have made a code change to improve the performance of your application. When the code is deployed into staging, the expected operational improvement should occur (for example, a response time metric should change for the better). If you are using cognitive solutions, this anomaly (new and unusual behavior) will be detected automatically and you will be informed.
- You get the reassurance that your code changes are having the intended positive impact on the operation of the service in staging.
- You can inform your Operations team that new, anomalous behavior will be seen and why this is actually a good thing.
- Furthermore, since these changes are intentional and the solution is fully cognitive, it will learn the new “normal” over time and the anomaly will simply go away. You do not need to take any “Operations action,” such as setting a new threshold level.
The reverse is true, too. If you did not expect a change in a response time metric, or in any of the other operational metrics, it would surface quickly, in staging, well before a test (if it exists) catches it. You can rapidly investigate and take the necessary action before the “banana skin” moment.
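For a feel of what that check looks like in its simplest form, here is a toy before-and-after comparison of a response-time metric around a staging deployment. The fixed 10 percent threshold and the sample values are assumptions for illustration only; a cognitive solution would derive its thresholds from learned variance rather than a hard-coded number.

```python
from statistics import mean

def metric_shift(baseline, staging, threshold_pct=10.0):
    """Compare a metric between the current baseline and the new staging build.

    Returns a human-readable verdict: improved, degraded, or unchanged.
    The 10% threshold is illustrative only.
    """
    before, after = mean(baseline), mean(staging)
    change_pct = (after - before) / before * 100
    if change_pct <= -threshold_pct:
        return f"improved by {abs(change_pct):.1f}% (expected if this was a performance fix)"
    if change_pct >= threshold_pct:
        return f"degraded by {change_pct:.1f}% (investigate before production)"
    return "no significant change"

# Response times (ms) sampled before and after deploying the change to staging:
print(metric_shift(baseline=[120, 125, 118, 122], staging=[95, 99, 93, 97]))
```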
Another type of “Operations thinking” that is easily enabled in staging is the ability to alert on patterns in log files.
This feature looks for patterns in logs, in real time, as files are ingested. You create the alerts. You can include alerts for the type of symptoms Operations looks for (again, talk to your buddies; they are the experts) or, better still, the symptoms you know signal your application starting to fail (you have investigated and fixed enough bugs!). What if you could use cognitive solutions to look for these patterns continuously and have them proactively send you an e-mail if one emerges? Wouldn’t that be good?
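As a rough illustration of the mechanism (not any product’s actual configuration), the sketch below scans log lines for a handful of failure patterns and sends an e-mail when one appears. The patterns, SMTP host, addresses and file name are placeholders you would replace with your own.

```python
import re
import smtplib
from email.message import EmailMessage

# Patterns you know signal trouble in YOUR application. The examples here
# are purely illustrative; supply your own from the bugs you have fixed.
ALERT_PATTERNS = [
    re.compile(r"OutOfMemoryError"),
    re.compile(r"connection pool exhausted", re.IGNORECASE),
    re.compile(r"retry limit exceeded", re.IGNORECASE),
]

def notify(line, smtp_host="smtp.example.com", to_addr="dev-team@example.com"):
    """Send a simple e-mail when a known failure pattern appears in the logs."""
    msg = EmailMessage()
    msg["Subject"] = "Log pattern alert (staging)"
    msg["From"] = "staging-monitor@example.com"
    msg["To"] = to_addr
    msg.set_content(f"Matched log line:\n{line}")
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)

def watch(log_lines):
    """Scan log lines as they are ingested and alert on each match."""
    for line in log_lines:
        if any(p.search(line) for p in ALERT_PATTERNS):
            notify(line)

# Example: scan an application log (simplified: reads what is currently in
# the file rather than following it live).
with open("app.log") as log_file:
    watch(log_file)
```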
Monitoring solutions have turned a corner. Before cognitive, monitoring solutions relied on Operations SMEs to carefully manage the environment. With cognitive solutions, the SME is built in, allowing developers to shift “Operations thinking” left into staging and focus on what they want to do and do best: code, code, code …
Watch a replay of the recently hosted webinar, “Learn Why We Must Shift APM Left in the DevOps Lifecycle.”
About the Author / Sinead Glynn
Sinead Glynn is an offering manager for IBM’s IT Service Management portfolio within DevOps.
Within the IT Service Management portfolio, she specializes in Operations Analytics offerings and the value they bring to both IT Operations and DevOps teams. In her 10+ years at IBM, Sinead has worked in both Development Management and Offering Management, covering both Network Performance Management and Operations Analytics-type solutions. Connect with Sinead on Twitter / LinkedIn.