Decoding the Self-Healing Kubernetes

A business application that fails to operate 24/7 is considered unreliable in the market. Applications are expected to run uninterrupted, irrespective of a technical glitch, a feature update or a natural disaster. In today’s heterogeneous environments, where infrastructure is intricately layered, self-healing is what makes continuous operation of an application possible.

Kubernetes, a container orchestration tool, keeps the application working smoothly by abstracting away the physical machines. Moreover, the pods and containers in Kubernetes can self-heal.

In the Avengers movie, Captain America asked Bruce Banner to get angry so he could transform into The Hulk. Bruce replied, “That’s my secret Captain. I’m always angry.” You must have understood the analogy here. Let’s simplify: Kubernetes will self-heal organically, whenever the system is affected.

Kubernetes’s self-healing property keeps the cluster converging on its desired state. Kubernetes tracks two kinds of status: pod status and container status. Its orchestration capabilities monitor unhealthy containers and replace them as per the desired configuration. Likewise, Kubernetes can fix pods, which are the smallest deployable units and hold one or more containers.

The Three Container States Include

Waiting: created but not running. A container in the Waiting state is still running the operations it needs in order to start, such as pulling the image or applying secrets. To check the state of a pod’s containers, run kubectl describe pod followed by the pod name. Along with the state, a message and a reason are displayed to provide more information.
Running: the container is executing without issues. If a postStart hook is configured, it has already executed before the container enters the Running state. A running container displays the time at which it entered the Running state.
Terminated: the container has failed or completed its execution. If a preStop hook is configured, it runs before the container enters the Terminated state. A terminated container displays a reason, an exit code, and the start and finish times of its execution.
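Kubernetes lets a container declare a postStart hook, which runs right after the container is created, and a preStop hook, which runs before the container is terminated. Here is a minimal sketch of how the two hooks could be declared. The pod name lifecycle-demo, the file name and the hook commands are assumptions made for illustration only.

```shell
# Write a pod manifest that declares both lifecycle hooks.
# The pod name, file name and hook commands are illustrative assumptions.
cat > lifecycle-demo.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-demo
spec:
  containers:
  - name: web
    image: nginx
    lifecycle:
      postStart:            # runs right after the container is created
        exec:
          command: ["/bin/sh", "-c", "echo started > /tmp/poststart"]
      preStop:              # runs before the container is terminated
        exec:
          command: ["/bin/sh", "-c", "nginx -s quit"]
EOF
```

After applying the file with kubectl apply -f lifecycle-demo.yaml, running kubectl describe pod lifecycle-demo should show the container’s current state together with its reason and timestamps.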

Kubernetes’ Self-Healing Concepts – Pod’s Phase, Probes and Restart Policy

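The pod phase in Kubernetes summarizes where a pod is in its lifecycle. A pod can be in one of these phases:

Pending Pods–accepted by the cluster, but one or more containers are not yet running.
Running Pods–bound to a node with all containers created, and at least one container running, starting or restarting.
Succeeded Pods–all containers terminated successfully and will not be restarted.
Failed Pods–all containers terminated, and at least one of them terminated in failure.
Unknown Pods–the state of the pod could not be obtained, typically because of a communication error with its node.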
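The phase is stored in the pod’s status, so it can be read directly with kubectl. The helper below is a small sketch: the function name pod_phase is hypothetical, and it assumes kubectl is configured against a reachable cluster.

```shell
# Print the phase (Pending, Running, Succeeded, Failed or Unknown) of one pod.
# pod_phase is a hypothetical helper name; the pod name is the first argument.
pod_phase() {
  kubectl get pod "$1" -o jsonpath='{.status.phase}'
}
```

For example, pod_phase followed by a pod name prints that pod’s current phase.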

Kubernetes executes liveness and readiness probes against the containers in a pod to check whether they function as per the desired state. The liveness probe checks whether a container is running. If a container fails the probe, Kubernetes kills it, and the container is then restarted in accordance with the pod’s restart policy. The readiness probe checks whether a container is ready to serve requests. If a container fails the probe, Kubernetes removes the pod’s IP address from the endpoints of every Service that matches the pod, so no traffic is routed to it.

Liveness probe example:
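A minimal sketch of a liveness probe, assuming a hypothetical pod named liveness-demo that runs the busybox image. The container creates a file, removes it after 30 seconds, and the probe checks for that file, so the probe is expected to start failing once the file is gone. The names and timings are illustrative assumptions.

```shell
# Write a pod manifest with an exec-based liveness probe.
# Pod name, file name and timings are illustrative assumptions.
cat > liveness-demo.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: liveness-demo
spec:
  restartPolicy: Always
  containers:
  - name: app
    image: busybox
    # The file exists for 30 seconds, then is removed so the probe starts failing.
    args: ["/bin/sh", "-c", "touch /tmp/healthy; sleep 30; rm -f /tmp/healthy; sleep 600"]
    livenessProbe:
      exec:
        command: ["cat", "/tmp/healthy"]   # succeeds only while the file exists
      initialDelaySeconds: 5
      periodSeconds: 5
EOF
```

Once the probe fails, the container should be killed and restarted, and the restart count reported by kubectl get pod liveness-demo should increase.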

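A probe uses one of the following handlers:

ExecAction–executes a command inside the container; the check succeeds if the command exits with status 0.
TCPSocketAction–performs a TCP check against the container’s IP address on a specified port; the check succeeds if the port is open.
HTTPGetAction–performs an HTTP GET request against the container’s IP address on a specified port and path; the check succeeds if the response status code is at least 200 and below 400.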
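In a pod manifest, these handlers appear as the exec, tcpSocket and httpGet fields of a probe. Here is a sketch that uses the two network-based handlers, assuming a hypothetical pod named probe-types-demo with an nginx container listening on port 80.

```shell
# Write a pod manifest with an HTTP readiness probe and a TCP liveness probe.
# Pod name, file name and intervals are illustrative assumptions.
cat > probe-types-demo.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: probe-types-demo
spec:
  containers:
  - name: web
    image: nginx
    ports:
    - containerPort: 80
    readinessProbe:
      httpGet:              # HTTP GET check against the container
        path: /
        port: 80
      periodSeconds: 5
    livenessProbe:
      tcpSocket:            # TCP check: passes if the port accepts a connection
        port: 80
      periodSeconds: 10
EOF
```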

Each probe gives one of three results:

Success: The container passed the diagnostic.
Failure: The container failed the diagnostic.
Unknown: The diagnostic failed, so no action should be taken.

Demo Description of Self-Healing Kubernetes

To see Kubernetes’ self-healing capability in action, we need to set a replica count for the application.

Let’s see an example of the Nginx file:
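A minimal sketch of such a file, assuming the workload is defined as a Deployment. The file name nginx-deployment.yaml, the Deployment name and the app=nginx label are assumptions made for illustration.

```shell
# Write an Nginx Deployment manifest that asks for four replicas.
# File name, Deployment name and label are illustrative assumptions.
cat > nginx-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 4               # desired number of pods
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
EOF
```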

The manifest sets the replica count to four, so the total number of pods across the cluster must be four.

Next, let’s deploy the file with kubectl apply -f, passing the manifest’s file name.

Let’s list the pods using kubectl get pods. The output lists four pods, confirming that the four replicas were created.

Now, let’s delete one of the pods with kubectl delete pod, passing the name of one of the listed pods. Kubectl confirms that the pod is deleted.

Let’s list the pods again with kubectl get pods.

We have four pods again, despite deleting one. Kubernetes has self-healed by creating a new pod to keep the count at four.
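The whole walkthrough can be sketched as one small script. The function name self_heal_demo is hypothetical, and the sketch assumes that kubectl points at a test cluster and that the manifest creates pods labelled app=nginx.

```shell
# Hypothetical helper that reproduces the walkthrough.
# Needs kubectl and a reachable cluster; assumes the pods carry the label app=nginx.
self_heal_demo() {
  manifest="$1"
  kubectl apply -f "$manifest"
  # Wait until the pods report Ready before listing them.
  kubectl wait --for=condition=Ready pod -l app=nginx --timeout=120s
  kubectl get pods -l app=nginx
  # Pick the first pod in the list and delete it.
  victim="$(kubectl get pods -l app=nginx -o jsonpath='{.items[0].metadata.name}')"
  kubectl delete pod "$victim"
  # List again: a replacement pod should have been created.
  kubectl get pods -l app=nginx
}
```

Calling self_heal_demo with the path to the manifest should print the pod list twice, with the second listing showing a replacement pod in place of the deleted one.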

Conclusion

Kubernetes can self-heal applications and containers, but what about healing itself when the nodes are down? For Kubernetes to continue self-healing, it needs a dedicated set of infrastructure, with access to self-healing nodes all the time. The infrastructure must be driven by automation and powered by predictive analytics to preempt and fix issues beforehand. The bottom line is that at any given point in time, the infrastructure nodes should maintain the required count for uninterrupted services.