As we move into the second half of 2023, it’s only natural to engage in the annual ‘state of the industry’ reviews and take a general inventory of the past year while looking toward the immediate future. Observability has gained traction among DevOps and platform engineering teams over the past year and is poised to maintain that momentum. But my observation is that the current observability model is broken.
Too Much Data
Long story short: Many organizations (and engineers specifically) agree that they are sending too much data to their monitoring tools, and because those systems are massively overloaded, they can’t garner valuable insights from them. Sending this high volume of data also costs too much and requires an inordinate amount of time for analysis, which many organizations don’t have. Put simply, more data doesn’t mean more clarity; in fact, we see the opposite.
Massive adoption of cloud-native technologies such as Kubernetes, which make environments harder to monitor efficiently, is also accelerating this problem. Beyond increasing costs and complexity, many teams feel that their current approach doesn’t deliver the desired level of observability for Kubernetes, full stop.
Another major issue we see on the observability landscape relates to the overall maturity of monitoring processes and teams. In many instances, even when organizations have effective tools, they struggle to distill the data those tools produce into targeted conclusions that their teams can translate into effective action. This is further amplified by the ongoing engineering talent shortage, as good people are harder to find and retain than ever before.
We also see pervasive challenges related to organizations’ desire to use popular open source monitoring tools, especially when they attempt to do so in an integrated manner or to support rapidly growing deployments at scale.
Overwhelmed by Observability Data?
In summary, if you’re overwhelmed by observability data, if complexity and costs are growing out of control, and if you’re struggling to turn the available information into improvements to your cloud applications, then we think the current observability model really is broken.
To that end, gaining a deeper perspective on end-user opinions about these topics, among others, is the precise goal of the annual DevOps Pulse Report, now in its sixth year and currently in the survey stage. We expect this year’s research to underscore the challenges above and help frame how teams might address them.
What types of findings do we hope to unearth? Last year, respondents revealed that despite a growing perception of DevOps maturity, it was actually taking them more time to respond to emerging problems. Roughly 64% of respondents reported that mean time to recovery (MTTR) during production incidents was over an hour, compared to only 47% the previous year.
Last year’s report also highlighted the reality that monitoring complex environments remains difficult, with over 52% of respondents citing Kubernetes, microservices and serverless as their primary challenges in gaining improved observability.
Both of these earlier findings seemed to underline the growing issues of data overload and spiraling complexity. We’ve also seen previous evidence of a lack of maturity around distributed tracing, a critical observability practice aimed directly at increasing visibility into complex architectures, despite the fact that 75% of respondents listed it as a priority in last year’s research.
Is your team struggling to make sense of its available monitoring data, or to do so in a way that aligns with your current processes and budgets? Hopefully, the 2023 DevOps Pulse Report, set to publish later this year, will put some numbers behind these assertions and help us all understand how to move forward.