Identifying and Abstracting Business Intelligence from Kubernetes Workloads

In the age of big data, businesses are inundated with data points. While they know there’s an abundance of valuable resources available to them, making sense of this data to derive actionable insights is often still a challenge.

CNCF industry survey data shows that workload complexity and monitoring remain top challenges for enterprises using and deploying Kubernetes. Many understand that these environments hold valuable information, but struggle to identify and extract meaning from machine data. Most of this data is readily available; it just takes the right tools to gather it and surface useful insights.

The Benefits of Open-Source Data Collection

The CNCF website lists a wealth of tools built around Kubernetes that enable not only monitoring but also networking, storage and security. There is even a Kubernetes-specific package manager, Helm, to make the deployment and management of these resources easy and consistent.

There are a couple of key benefits to taking advantage of these open-source collectors:

They stay up to date. Each of these tools benefits from deep community support. As new versions of Kubernetes are released, the extensive use of each of these tools ensures they are quickly updated.
They integrate with everything. Prometheus, for example, has an impressive list of integrations and exporters. Regardless of your unique stack, it is likely that there is support for what you might want to export data from. The importance of these integrations cannot be overstated, as they enable the flexibility needed to grow and evolve a Kubernetes deployment over time.

Ensuring Complete and Organized Metrics Collection

Prometheus, endorsed by the CNCF, is the de facto tool of choice for metrics monitoring, with extensive support for anything you might use to collect metrics data.

Prometheus works by pulling data from all of the components and jobs running in Kubernetes since every component of Kubernetes exposes its metrics in a Prometheus format. The processes running behind those components then serve up the metrics on an HTTP URL. For example, the Kubernetes API Server serves its metrics on https://$API_HOST:443/metrics.
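To make the "Prometheus format" concrete, here is a minimal sketch of reading the plain-text payload such an endpoint returns. The sample payload, metric names and the parse_metrics helper are all made up for illustration; this is not Prometheus' own parser, and it skips details such as timestamps and escape handling in label values.

```python
import re

# Sample payload in the Prometheus text exposition format.
# Metric names and values are invented for illustration.
SAMPLE = """\
# HELP http_requests_total Total HTTP requests handled.
# TYPE http_requests_total counter
http_requests_total{code="200",method="get"} 1027
http_requests_total{code="500",method="get"} 3
process_open_fds 42
"""

# name, optional {labels}, then the sample value
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# label="value" pairs; escaped characters are matched but not unescaped
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Blank lines and HELP/TYPE comments carry no sample values.
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, raw_value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(raw_value)))
    return samples


for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels, value)
```

In a real cluster the text would come from an HTTP request to the component's metrics URL rather than a string constant, and an endpoint such as the API server's would typically also require TLS and credentials.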

Prometheus is particularly good at auto-discovering the jobs and services currently running in a Kubernetes cluster. As pods are added, removed or restarted, the Kubernetes Service construct keeps track of what pods exist for a given service. This auto-discovery capability is one of the primary reasons for Prometheus’ popularity, ensuring that all new and existing components are monitored.
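The bookkeeping behind auto-discovery can be pictured as a comparison between the targets already being scraped and the pods that currently exist. The sketch below is a toy model under that assumption, with hypothetical pod names and addresses and a hypothetical reconcile_targets helper; it is not how Prometheus implements service discovery.

```python
def reconcile_targets(known, discovered):
    """Compare the targets being scraped with the pods currently discovered.

    Both arguments map a pod name to its metrics address. Returns the
    updated target map plus the sorted names that were added and removed.
    """
    added = sorted(set(discovered) - set(known))
    removed = sorted(set(known) - set(discovered))
    return dict(discovered), added, removed


# Hypothetical state: web-1 was removed and web-3 was started.
known = {"web-1": "10.0.0.4:8080", "web-2": "10.0.0.5:8080"}
discovered = {"web-2": "10.0.0.5:8080", "web-3": "10.0.0.9:8080"}

targets, added, removed = reconcile_targets(known, discovered)
print("start scraping:", added)
print("stop scraping:", removed)
```

Running a comparison like this on every change to the cluster is what keeps the list of monitored components current without anyone editing it by hand.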

The Importance of Cluster Level Logging and Event Collection

Kubernetes does not define a single standard approach to log collection, but the most common method is called cluster level logging. Cluster level logging deploys a node level logging agent to each node, which then funnels data to a separate backend for storage and analysis of logs. The primary benefit of this solution is that if a pod dies, the logs detailing what happened are retained. Implementing node level logging without funneling data to a logging backend will not retain log data if pods die or are evicted, while cluster level logging ensures that data is captured and retained. A common tool for implementing cluster level logging is Fluentd (or Fluent Bit, a lightweight alternative to Fluentd), which acts as the node level logging agent funneling data to a logging backend.
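The difference between the two approaches can be shown with a small in-memory model. Everything here is a stand-in: LogBackend and NodeAgent are hypothetical classes, the pod and node names are invented, and a real agent such as Fluentd tails log files on disk and ships them over the network rather than holding lists in memory.

```python
class LogBackend:
    """Stand-in for a central log store that outlives any pod or node."""

    def __init__(self):
        self.records = []

    def ingest(self, batch):
        self.records.extend(batch)


class NodeAgent:
    """Stand-in for a node level logging agent."""

    def __init__(self, node, backend):
        self.node = node
        self.backend = backend
        self.local_logs = {}  # pod name -> log lines held on the node
        self.offsets = {}     # pod name -> number of lines already shipped

    def write(self, pod, line):
        """Simulate a container writing one log line on this node."""
        self.local_logs.setdefault(pod, []).append(line)

    def flush(self):
        """Ship every line not yet forwarded to the backend."""
        for pod, lines in self.local_logs.items():
            start = self.offsets.get(pod, 0)
            batch = [
                {"node": self.node, "pod": pod, "line": line}
                for line in lines[start:]
            ]
            self.backend.ingest(batch)
            self.offsets[pod] = len(lines)

    def evict(self, pod):
        """Simulate pod removal: node-local log data is discarded."""
        self.local_logs.pop(pod, None)
        self.offsets.pop(pod, None)


backend = LogBackend()
agent = NodeAgent("node-a", backend)
agent.write("checkout-7f9", "starting")
agent.write("checkout-7f9", "out of memory")
agent.flush()
agent.evict("checkout-7f9")

print(agent.local_logs)      # nothing left on the node
print(len(backend.records))  # both lines survive in the backend
```

After the eviction the node holds no trace of the pod, but the line explaining why it died is still available in the backend, which is the retention benefit described above.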

Events provide insight into decisions being made by the cluster and unexpected events that occur in Kubernetes. Events are stored by the API server on the master node and collected using the same method as log collection—via a node level logging agent like Fluentd.

Establishing a Continuous Intelligence Dashboard

Finally, collectors for logs, metrics, events and security can be easily deployed using Helm, an open-source Kubernetes package manager. Helm can significantly simplify the setup process, reducing hundreds of lines of configuration to one. These collection plugins can be used on any Kubernetes cluster, whether one from a managed service such as Amazon Elastic Kubernetes Service (EKS) or a cluster you are running entirely on your own.

According to Sumo Logic’s recent Continuous Intelligence Report, enterprise adoption and deployments of multi-cloud grew 50% year-over-year, and 80% of users across multi-cloud environments are now utilizing Kubernetes architectures. As Kubernetes continues to become mainstream, it’s essential that businesses understand how these workloads operate and how to best extract valuable insights to inform business decisions.

— Katie Lane