Cloud-Native Security and Performance: Two Sides of the Same Coin

You’re running Kubernetes in a production environment, and you need to apply a patch — perhaps to a commercial application, an open source component or even a container image. How long should it take to implement that patch in production? Thirty days? One day? One hour?

Remember, cloud-native environments are supposed to respond to change in real time. Such a response isn’t simply scaling up or down as needed. It’s also essential to respond to security threats and performance issues as close to real time as possible.

Containers, and the microservices they support, are also ephemeral. They can live for five minutes or even less, and yet unpatched microservices are just as dangerous as more permanent code.

Patching must also fit within your application lifecycle. Teams with a mature DevOps process typically release code weekly, daily or even hourly. Their ability to apply a patch and roll it out to production quickly significantly reduces their security risk. Are your operators up to the task?

Poor Security Management Becomes a Performance Problem

The connections between security and performance go well beyond cloud-native environments. Denial-of-service (DoS) attacks obviously target a site’s performance, as do cryptojacking and, to a lesser extent, ransomware attacks.

In many cases, performance monitoring gives an organization its first indication that such an attack is underway. However, monitoring isn’t just for recognizing attacks in progress. It also plays an important role in preventing attacks, especially in cloud-native environments.

To understand this point, think about how organizations have traditionally handled patch management, especially to software infrastructure: A vendor (or open source project) releases a patch. The enterprise applies the patch in a test environment that resembles production somewhat. After running a range of integration tests over the course of days or even weeks, the ops team may be ready to deploy the patch into production.

Another likely scenario: The ops team may collect several patches to different pieces of software, hoping to test and deploy them all at once. In the meantime, more time goes by, giving attackers even more opportunity for mischief.

Applying patches in an enterprise production environment takes so long because IT leadership perceives patch management as a high-risk activity. Patches may make a piece of software behave poorly or not at all, and the complex interdependencies among applications and infrastructure components compound the risk of failure unpredictably.

Cloud-Native Computing Changes the Equation

Cloud-native computing extends cloud best practices such as scalability and resilience to the entire enterprise IT landscape, providing a new paradigm for computing across hybrid IT environments that leverage virtualization, containers and serverless computing as appropriate.

At the center of the cloud-native movement is the open source container orchestration platform, Kubernetes. Kubernetes requires and reinforces a cloud-native architectural approach that decomposes applications into containerized microservices.

Deploying, managing and operating microservice-based workloads at scale becomes the central challenge for Kubernetes, and for cloud-native infrastructure in general.

To manage patches, organizations must follow cloud-native principles. Monolithic, “test everything for weeks” approaches simply do not align with the dynamic, ephemeral nature of cloud-native software.

Just as application developers use canary testing for features they are rolling out, canary testing of infrastructure updates largely supplants testing in a full-blown test environment.

Testing patches (or updates) becomes an ongoing, largely automated process of trying new configurations in limited production environments, analyzing the resulting performance and rolling back changes if necessary.

Operators can now test patches (or new releases) more quickly in production, in a way that ideally minimizes the impact on the applications that end users interact with.

How, then, should operators manage such canary environments? By monitoring the performance of the software under test.
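To make the idea concrete, here is a minimal sketch of the kind of automated check that could sit behind such a canary: it compares performance samples from the patched canary against the unpatched baseline and decides whether to promote the patch, roll it back or keep waiting. Everything in it is an assumption for illustration, including the `Metrics` shape, the `canary_decision` function and the threshold values; a real pipeline would pull these numbers from its own monitoring tooling and tune the thresholds to the workload.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class Metrics:
    """Performance samples from one group of pods (hypothetical shape)."""
    latencies_ms: List[float]  # per-request latencies in milliseconds
    errors: int                # number of failed requests
    requests: int              # total number of requests

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0

    @property
    def mean_latency_ms(self) -> float:
        return mean(self.latencies_ms) if self.latencies_ms else 0.0


def canary_decision(baseline: Metrics, canary: Metrics,
                    max_latency_ratio: float = 1.2,
                    max_error_rate_increase: float = 0.01,
                    min_requests: int = 100) -> str:
    """Return 'promote', 'rollback' or 'wait' for a patched canary.

    The thresholds are illustrative assumptions, not recommendations.
    """
    if canary.requests < min_requests:
        return "wait"  # not enough traffic to judge the patch yet
    if canary.error_rate - baseline.error_rate > max_error_rate_increase:
        return "rollback"  # patched pods fail noticeably more often
    if (baseline.mean_latency_ms > 0 and
            canary.mean_latency_ms > baseline.mean_latency_ms * max_latency_ratio):
        return "rollback"  # patched pods are noticeably slower
    return "promote"


# Example: the unpatched baseline versus two hypothetical patched canaries.
baseline = Metrics(latencies_ms=[100.0] * 200, errors=2, requests=200)
healthy = Metrics(latencies_ms=[105.0] * 150, errors=1, requests=150)
slow = Metrics(latencies_ms=[150.0] * 150, errors=1, requests=150)

print(canary_decision(baseline, healthy))  # promote
print(canary_decision(baseline, slow))     # rollback
```

The point of the sketch is that the rollback decision rests entirely on performance data, which is why patch management in this model depends on monitoring.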

The Convergence of Security and Performance Monitoring

Unpatched software is the leading source of vulnerabilities in the enterprise IT landscape, and managing patches in dynamic environments requires performance monitoring.

Add patch management to the performance-related cyberthreats discussed earlier, such as denial of service, ransomware and cryptojacking, and it’s clear that security and performance monitoring are two parts of a broad spectrum of IT operations management (ITOM) capabilities.

In cloud-native infrastructure, ITOM doesn’t stop at monitoring. Instead, operators require observability tools that collect logs, traces and metrics across the entire infrastructure and also give operators the means to mitigate and prevent problems with that infrastructure.

In other words, where monitoring is a passive, “watch the dashboard” activity, observability goes the extra mile, promoting an active, “understand and fix the problem” mentality.

It doesn’t really matter whether those problems are performance- or security-related; in many cases, there’s no distinction. Today’s operators must work with tools that bring the two together into a full-fledged security, compliance and monitoring solution.

The Intellyx Take

Some analysts argue that IT infrastructure has followed a natural progression from on-premises servers to virtualization to cloud computing to containers. However, there is a fatal flaw in this argument.

The first three environments in this sequence follow a clear pattern, as virtual machines work much the same as physical servers, be they on-premises or in the cloud. As a result, coding applications remained largely unchanged as enterprises worked their way along this progression to the cloud.

Containers, however, are fundamentally different — especially once you place them into Kubernetes and the cloud-native context overall. With cloud-native computing, it’s essential to rethink the nature of the application from the ground up. Cloud-native computing is, in fact, an entirely new paradigm for enterprise IT.

It’s no wonder, therefore, that core operational activities like security and performance monitoring have a new role in the cloud-native world. Indeed, security and performance are every bit as important as before, but today, we must deal with both at once in the context of a dynamic, ephemeral application landscape.

Only within this context does it become clear what cloud-native vendors like Sysdig are offering. The world of IT infrastructure is changing, and with it the world of enterprise application development and deployment. Security and performance are as important as ever. Don’t let them fall through the cracks.