A majority of today’s businesses follow a cloud-first strategy. DevOps engineers at these businesses prefer using infrastructure as a service (IaaS), platform as a service (PaaS) or other cloud services from leading cloud service providers (CSPs). Given the rate at which IT is evolving, from on-prem to XaaS, the future looks promising for both the Dev and Ops sides of an application.
But while IT has evolved, have the monitoring and management tools evolved along with it? Many DevOps engineers still use good old application performance management (APM) tools to measure app performance and debug issues.
This article makes the case for shifting the focus from application performance management (APM) to cloud performance management (CPM).
APM Is Good, Infra-Centric APM Is Better, CPM Is the Best
Use of cloud services is the norm in DevOps today. On the one hand, these services act as building blocks for developing or running applications and contribute heavily to their performance. On the other hand, they bring with them several idiosyncratic layers of abstraction. On top of that, new cloud services or SaaS offerings occasionally introduce new trends.
The monitoring and management market has yet to catch up with the pace at which IT is evolving. There are only a handful of cloud monitoring tools available in the market, and they provide siloed reports and dashboards. The result: DevOps engineers are still stuck with good old APM tools.
APM tools are good at offering insights. However, these insights are just enough to understand what’s happening in the application tier.
In the APM dashboard above, there is a summary of server status, app throughput, response time and more, neatly packaged with visualization graphs. If needed, one can even get details on CPU, memory and disk usage.
Then there are infra-centric APM tools that go one level beneath the app tier to provide information on network statistics, such as bandwidth or open ports. However, none of these tools provides complete contextual information. Day in and day out, a DevOps engineer must juggle several dashboards.
Let’s take a look at a very simple scenario. Assume there is an image processing web application that uses AWS cloud services: EC2 for compute, S3 for storage and SQS for queuing.
From the APM tool, you can continuously monitor the application’s response time, as well as app server metrics such as throughput (with <=10 images), CPU utilization, etc. Everything looks fine on the surface while the pipeline is moving smoothly.
Then one day, the number of users on the app at a particular time increases, jobs start piling up quickly on SQS, and the application’s response time suddenly degrades. However, all the servers still look healthy in the APM tool.
In such a scenario, how do you debug the root cause that is affecting the response time without having a monitoring tool that is watching the AWS SQS?
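One way to close that gap is to sample the queue's depth directly. The following is a minimal sketch, not a definitive implementation. It assumes a boto3 SQS client is passed in and relies on the `ApproximateNumberOfMessages` and `ApproximateNumberOfMessagesNotVisible` queue attributes. The function names `queue_backlog` and `is_backing_up`, and the threshold of 100 messages, are hypothetical choices made for illustration.

```python
from typing import Any, Dict


def queue_backlog(sqs_client: Any, queue_url: str) -> Dict[str, int]:
    """Return approximate visible and in-flight message counts for one queue.

    `sqs_client` is assumed to behave like boto3.client("sqs").
    """
    response = sqs_client.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=[
            "ApproximateNumberOfMessages",
            "ApproximateNumberOfMessagesNotVisible",
        ],
    )
    attrs = response["Attributes"]  # SQS returns the counts as strings
    return {
        "visible": int(attrs["ApproximateNumberOfMessages"]),
        "in_flight": int(attrs["ApproximateNumberOfMessagesNotVisible"]),
    }


def is_backing_up(backlog: Dict[str, int], max_visible: int = 100) -> bool:
    """Flag the queue when more jobs are waiting than the chosen threshold.

    The default of 100 is an arbitrary placeholder, not a recommendation.
    """
    return backlog["visible"] > max_visible
```

With real credentials, this could be called as `queue_backlog(boto3.client("sqs"), queue_url)` on a schedule and plotted next to the application's response time, so a growing backlog shows up even while the servers look healthy.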
Of course, AWS provides its own console to monitor its services. If the DevOps engineer already knows the root cause, debugging is child’s play. However, when the underlying issue is unknown, the engineer ends up depending on several tools to unearth the facts.
When a business’s applications depend on several cloud services that do not yet have a monitoring tool, how can an APM tool, which can pull only specific data, provide complete visibility?
In this cloud-driven world, a developer needs to be aware of how their application interacts with different tech stacks, and how that interaction drives performance, availability and cost.
Unified CPM Complements APM, Bringing the Complete Context
The industry has traditionally separated the performance management of applications and infrastructure, resulting in multiple toolsets in each of these domains. Neither APM nor infrastructure performance management (IPM) alone will successfully address shift-left monitoring requirements.
APM focuses only on the application tier, providing visibility just part of the way down the tech stack. If you are a DevOps engineer with an enterprise or an SMB looking after different customer applications, you know it is difficult to make sense of all the resource usage across availability zones and to make a case for each application’s cost/performance trade-off, ability to scale, time-to-market, etc. While cloud services make development easier, each service has its own learning curve and abstraction.
A unified CPM tool carries the context of the underlying cloud services up to the application tier that APM already covers. Developers need such tools so they can monitor all the services and get end-to-end visibility from a single pane of glass.
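To make the idea of a single view concrete, here is a rough sketch that puts application-tier and cloud-service metrics in one record and asks which tier most likely explains a slow response. It is an illustration under stated assumptions, not how any particular CPM product works. The `StackSnapshot` fields, the `likely_bottleneck` function and every threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StackSnapshot:
    """One point-in-time reading across tiers (all fields are illustrative)."""
    response_time_ms: float    # application tier, what APM already reports
    cpu_percent: float         # app servers
    queue_depth: int           # e.g. visible messages waiting in a queue
    storage_error_rate: float  # e.g. fraction of failed storage requests


def likely_bottleneck(
    snap: StackSnapshot,
    *,
    max_response_ms: float = 500.0,
    max_cpu: float = 80.0,
    max_queue: int = 100,
    max_storage_errors: float = 0.01,
) -> str:
    """Name the first tier whose metric exceeds its (placeholder) threshold."""
    if snap.response_time_ms <= max_response_ms:
        return "healthy"
    if snap.cpu_percent > max_cpu:
        return "app servers"
    if snap.queue_depth > max_queue:
        return "queue"
    if snap.storage_error_rate > max_storage_errors:
        return "storage"
    return "unknown"


# The scenario from earlier: slow responses, healthy servers, deep queue.
snapshot = StackSnapshot(
    response_time_ms=2400.0,
    cpu_percent=35.0,
    queue_depth=1800,
    storage_error_rate=0.0,
)
print(likely_bottleneck(snapshot))  # prints "queue"
```

An APM-only view corresponds to seeing just the first two fields. In that view the same snapshot would show a slow application with no visible cause.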
Conclusion
Analyzing the performance of each dynamic service in context with the other dynamic players (services) in the tech stack quickly provides the right context and makes it easier to detect an underlying problem, especially in cases where the problem lies with the cloud resources rather than the application code itself.