Threading the Needle on Kubernetes Complexity with AI-Powered Observability

If you run down the list of major industries–transportation, retail, healthcare, automotive, financial services–you’ll see that the leading players in each have essentially become software companies. What they do and how they serve their customers rests on their software working reliably. Underpinning this is a massive digital transformation that has shifted organizations away from the old, static ways of enterprise IT monoliths. Today’s businesses operate in more dynamic IT environments–agile, multi-cloud and, especially for new digital services, built on containers and microservices. All of this means embracing more IT complexity than ever before.

Kubernetes Brings More Data to the Table, but That Impacts Observability

A key part of this more complex environment is the soaring adoption rate of Kubernetes over the last few years. In a survey of 5,000 enterprises, 40% said they were currently running Kubernetes in production, and over 80% reported that they’d adopted Kubernetes over any other container management system. Another recent survey revealed that 68% of CIOs are already using containers and a total of 86% expect to deploy them within the next year. Enterprise IT is increasingly gravitating to containers, and especially to Kubernetes for orchestrating their containerized applications.

What drives organizations toward containers and Kubernetes is the flexibility they give development teams to release services, features or apps faster. But faster doesn’t necessarily mean better or higher quality–unless you also apply cloud-native architectural patterns, build quality gates into your delivery pipeline, embrace new deployment models such as canary releases or feature toggling and provide full visibility and short feedback loops from production back to engineering.
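A quality gate for a canary release can be as simple as an automated comparison of the canary’s error rate against the baseline’s before promotion. A minimal sketch–the threshold and metric values here are illustrative assumptions, not taken from any particular tool:

```python
def canary_gate(baseline_error_rate: float,
                canary_error_rate: float,
                max_relative_increase: float = 0.10) -> bool:
    """Pass the gate only if the canary's error rate is no more than
    10% worse (relatively) than the baseline's. True means promote."""
    allowed = baseline_error_rate * (1 + max_relative_increase)
    return canary_error_rate <= allowed

# A 2.1% canary error rate against a 2.0% baseline stays within budget;
# a 3.0% canary error rate does not.
print(canary_gate(0.020, 0.021))  # True
print(canary_gate(0.020, 0.030))  # False
```

In a pipeline, a failing gate like this would halt the rollout and shift traffic back to the baseline automatically, which is exactly the kind of short feedback loop described above.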

Kubernetes itself provides great visibility into node and pod health. Kubernetes even ensures the health of your pods through its built-in health check mechanisms–liveness and readiness probes. Thanks to components such as service meshes, you also get live information about dependencies between the services running in your pods. This data can, in theory, give organizations the insight into their environments they need to understand whether they are releasing better value faster to their end users and unlocking new business value.
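Those built-in health checks follow simple, deterministic rules: a liveness probe, for example, triggers a container restart only after a configurable number of consecutive probe failures (Kubernetes calls this `failureThreshold`). A rough sketch of that decision logic–simplified, since the real kubelet also handles initial delays, timeouts and probe periods:

```python
def should_restart(probe_results: list[bool], failure_threshold: int = 3) -> bool:
    """Mimic Kubernetes liveness semantics: restart once the probe
    history contains failure_threshold consecutive failures."""
    consecutive_failures = 0
    for ok in probe_results:
        consecutive_failures = 0 if ok else consecutive_failures + 1
        if consecutive_failures >= failure_threshold:
            return True
    return False

print(should_restart([True, False, False, True, False]))  # False: no 3-in-a-row
print(should_restart([True, False, False, False]))        # True: 3 consecutive failures
```

The point is that these checks are rule-based and local to a single container; they say nothing about how a failure ripples through the dependencies between services.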

The reason I call these benefits theoretical is that, in practice, all that extra data doesn’t necessarily mean more visibility. Often it can mean less visibility–and with that, less insight into how the different elements of your environment may be affecting the speed of production deployments, overall system performance and end-user experience. I’ve seen Kubernetes environments with literally billions of interdependencies between applications, containers, microservices and clouds. Asking human beings to keep track of that much data manually is an impossible task. And relying on traditional, manual approaches to manage and monitor IT environments shortchanges the value organizations should be getting out of Kubernetes.

Kubernetes Environments Need Full Stack Observability–and That Means Deterministic AI

Getting the full potential out of Kubernetes means having observability across the full technology stack and those billions of interdependencies. But as more organizations are discovering, that can’t happen without AI–specifically, a deterministic AI engine. In fact, in a 2019 survey, 88% of CIOs said they were looking to AI as a critical solution for overcoming IT complexity. As long as Kubernetes is fueling complexity, overcoming that complexity means making deterministic AI an integral part of any Kubernetes environment.

A deterministic AI engine applies specific algorithms to detect system anomalies and applies its collected dependency information to pinpoint the root causes behind those anomalies. When that engine ingests critical Kubernetes events, such as state changes and workload changes, it provides visibility into the Kubernetes stack’s interdependencies. That AI-powered, full stack observability matters because it gives IT teams new capabilities for their environments, including instrumenting and mapping containers, managing and allocating container resources and drilling down into container runtimes at work. All of that is possible because of a level of observability you can only get with deterministic AI; traditional metrics and dashboards just can’t compete.
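The core idea behind such an engine can be illustrated with a deliberately simple baseline check: learn a metric’s normal range from history, then flag any reading outside a fixed number of standard deviations. This is an illustrative sketch only–not any vendor’s actual algorithm–but it shows the “deterministic” part: the same inputs always produce the same verdict, with no probabilistic guesswork.

```python
import statistics

def detect_anomalies(history: list[float], recent: list[float],
                     k: float = 3.0) -> list[int]:
    """Flag indices in `recent` that deviate more than k standard
    deviations from the mean of `history`. Purely rule-based: identical
    inputs always yield identical output."""
    mean = statistics.fmean(history)
    stdev = statistics.pstdev(history)
    return [i for i, x in enumerate(recent) if abs(x - mean) > k * stdev]

baseline = [100, 102, 98, 101, 99, 100, 103, 97]  # e.g. p95 latency in ms
print(detect_anomalies(baseline, [101, 99, 250, 100]))  # [2]: only the 250 ms spike
```

A production engine would run checks like this per metric and then walk the dependency graph from the anomalous service to find the component where the degradation originated.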

By providing full stack visibility and interdependency mapping across a Kubernetes environment, deterministic AI is empowering IT teams to identify, in real time, the root cause of degradations in business service availability–from reliability and resource usage to system performance and user experience–faster than they ever could manually. That level of automated speed and visibility also makes it possible to immediately resolve those issues at the source before they can impact the end user. The automation provided by deterministic AI also works its way into DevOps’ production pipelines, accelerating the release and feedback cycles of new deployments and allowing teams to more quickly incorporate user feedback into the next iteration for even better experiences.

In short, a deterministic AI engine makes life better for both IT and the customers they’re aiming to serve. Deterministic AI takes work that used to be time-consuming, tedious and error-prone when done manually, and performs it quickly and automatically. That frees the teams that used to do that work to focus on work that is more satisfying and drives new business value.

AI-Powered Observability Relies on Quality Data and Quality Monitoring

All that said, a deterministic AI’s ability to provide reliable, precise full stack observability in a Kubernetes environment depends on being fed consistently high-quality data it can process. The algorithms are only as good as the data ingested by the AI engine. If there’s no standard monitoring capability used across the board, you invite a scenario in which different teams bring their own preferred monitoring tools into the fold. The bigger a Kubernetes cluster gets, the more diverse the set of monitoring tools can become. And the greater the diversity of monitoring tools, the more inconsistent the data being fed into the deterministic AI engine.
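A concrete example of that inconsistency: one agent reports container memory in bytes, another in MiB. Unless readings are normalized to a single convention before the AI engine sees them, the same workload looks wildly different depending on which tool reported it. A small sketch of the kind of normalization layer a standard monitoring pipeline provides (the unit labels here are illustrative):

```python
# Conversion factors to one canonical unit (bytes), using the binary
# suffixes Kubernetes itself uses for memory quantities.
UNIT_TO_BYTES = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}

def normalize_memory(value: float, unit: str) -> float:
    """Convert a memory reading from any supported unit to bytes."""
    try:
        return value * UNIT_TO_BYTES[unit]
    except KeyError:
        raise ValueError(f"unknown memory unit: {unit!r}")

# Two tools reporting the same 512 MiB container in different units
# agree once normalized:
print(normalize_memory(512, "MiB") == normalize_memory(0.5, "GiB"))  # True
```

Trivial as it looks, this is exactly the class of discrepancy that silently degrades an AI engine’s baselines when every team ships its own agent.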

The way out of this predicament is to offer a single monitoring tool as a self-service capability for Kubernetes clusters–one that can support the diversity of technologies introduced into the stack and provide a baseline standard of system and event data that developers, IT administrators and the deterministic AI can all lean on. In other words, when you feed consistent, quality data into the AI, you make it smarter. And a smarter deterministic AI engine provides the full stack Kubernetes observability that organizations need.

Building a Positive Feedback Loop Between Kubernetes, Deterministic AI and Observability

Kubernetes and deterministic AI are a natural fit. Kubernetes provides a platform for managing and instrumenting containerized applications. The deterministic AI provides a tool for mapping out the Kubernetes stack, offering insights into interdependencies and degradations where they arise and resolving them before they can affect end users. The data from Kubernetes makes AI better, and a smarter deterministic AI engine means more precise observability across your technology stack.

It’s a self-reinforcing feedback loop that eliminates the usual pitfalls of container-induced IT complexity. Instead, it creates new standards of visible, consistent and quality data that organizations can leverage for faster and better deployments–and in turn, better user experiences and greater business value.

To learn more about containerized infrastructure and cloud-native technologies, consider attending KubeCon + CloudNativeCon EU in Amsterdam. The CNCF has decided to postpone the event (originally set for March 30 to April 2, 2020) to July or August 2020.

Andreas Grabner

Andreas is a DevOps activist at Dynatrace. He has over 20 years of experience as a software developer, tester and architect, and is an advocate for high-performing cloud operations. As a champion of DevOps initiatives, Andreas is dedicated to helping developers, testers and operations teams become more efficient in their jobs with Dynatrace’s software intelligence platform.
