Hooked on Service Metrics

We live in an increasingly “as-a-service” world. From software as a service (SaaS) and platform as a service (PaaS) to functions as a service (FaaS), service delivery has become paramount to business goals and practices.

In today’s DevOps landscape, microservices (the cloud-native approach to designing scalable, independently delivered services) allow teams to focus on one service at a time rather than on many services simultaneously, enabling agility and continuous delivery. However, even with all their benefits, microservices come with their fair share of challenges, specifically around observability, monitoring and security. Service metrics are therefore more important than ever: mitigating the challenges and seizing the opportunities introduced by microservices helps DevOps teams create a high-quality end user experience.

With service delivery so integrated into the overall business and customer experience, it’s crucial to keep service performance at its peak. How can tech pros ensure that service-level agreements are honored, services run smoothly and end user expectations are met or exceeded? The answer is simple: become hooked on service metrics.

Getting Hooked

In this piece, we’ll explore service metrics with respect to microservices. Because of the observability, monitoring and security challenges that microservices can introduce, managing and controlling microservices and their traffic requires an additional layer of tooling: a service mesh. Implementing a service mesh architecture helps with traffic management, service identity and security, policy enforcement and more, which ultimately brings reliability, security and manageability to container and microservice management. Important metrics such as request volume, request duration, latency and request size can be gathered automatically across a service mesh, which supports successful microservice deployment.
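To make those metrics concrete, here is a minimal Python sketch of how per-request records might be rolled up into request volume, request duration and request size for each pair of services. It is an illustration only: the RequestRecord type, its field names and the summarize function are hypothetical, and a real service mesh collects and aggregates this data in its proxies and telemetry backend rather than in application code.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class RequestRecord:
    """One observed request between two services (hypothetical shape)."""
    source: str
    destination: str
    duration_ms: float
    request_bytes: int


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(records):
    """Group records by (source, destination) and compute basic metrics."""
    grouped = defaultdict(list)
    for record in records:
        grouped[(record.source, record.destination)].append(record)

    summary = {}
    for edge, requests in grouped.items():
        durations = [r.duration_ms for r in requests]
        summary[edge] = {
            "request_volume": len(requests),
            "mean_duration_ms": mean(durations),
            "p95_duration_ms": percentile(durations, 95),
            "mean_request_bytes": mean(r.request_bytes for r in requests),
        }
    return summary


if __name__ == "__main__":
    sample = [
        RequestRecord("web", "cart", 10.0, 100),
        RequestRecord("web", "cart", 30.0, 300),
        RequestRecord("web", "auth", 5.0, 50),
    ]
    for edge, metrics in summarize(sample).items():
        print(edge, metrics)
```

Grouping by source and destination mirrors the point that the interesting measurements sit on the interactions between services, not inside any single one.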

However, even though service mesh tooling helps to mitigate risks, failures such as RPC timeouts and exception propagation can still occur. Failures in the microservices space often occur during the interactions between services, so a clear view into those transactions enables DevOps teams to better manage architectures and avoid such failures. A service mesh provides observability, which lets DevOps teams easily see what is happening when services interact with each other, making it easier to build a more resilient, secure and efficient microservice system. Additionally, service meshes provide a developer-driven, services-first network: a network that runs without application developers having to build network concerns into their application code. Service meshes create a network that gives operators the ability to define network behavior, node identity and traffic flow through policy.

Tips for Success: What You Need to Know

So, how does a DevOps team implement a service mesh architecture for service metrics? Following a few best practices can help ensure success:

Know When to Implement: If a DevOps team does not plan to run a large number of microservices, a service mesh may not be necessary; fewer than 10 microservices can likely be managed with existing monitoring and security tools. As a rule of thumb, once a team starts running more than 10, a service mesh becomes essential.
Understand Your Use Case: After confirming the necessity of a service mesh architecture, you must understand how it will interact with the other types of infrastructure you’re running, as this will help determine which service mesh to use. For example, some work well with Kubernetes or Docker Swarm, but if you’re not running services inside containers, you would only benefit from the security or observability pieces of a service mesh. Depending on your use case, you might want to implement a highly capable service mesh such as Istio or a simpler tool such as Linkerd 2.0.
Focus on the Network: With a service mesh architecture, you have several microservices that are communicating over a new network. As such, it’s important to understand the network’s criticality: What would you want out of a network that connects your microservices? You want your network to be as intelligent and resilient as possible; to route traffic away from failures to increase the aggregate reliability of your cluster; and to avoid unwanted overhead such as high-latency routes or servers with cold caches. You want your network to ensure that the traffic flowing between services is secure against trivial attack. And you want your network to provide insight by highlighting unexpected dependencies and root causes of service communication failure.
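
As a rough illustration of what “routing traffic away from failures” can mean in practice, the Python sketch below keeps a count of consecutive failures per endpoint and skips endpoints that exceed a threshold. The FailureAwareBalancer class, its method names and the default threshold are all hypothetical; this is not how any particular service mesh implements the behavior, and it leaves out pieces a real mesh would need, such as re-admitting an ejected endpoint after a cool-down period.

```python
class FailureAwareBalancer:
    """Round-robin picker that skips endpoints with repeated failures.

    Hypothetical sketch: names and the threshold are illustrative only.
    """

    def __init__(self, endpoints, max_consecutive_failures=3):
        self.endpoints = list(endpoints)
        self.max_failures = max_consecutive_failures
        self.failures = {endpoint: 0 for endpoint in self.endpoints}
        self._next = 0

    def record(self, endpoint, ok):
        """Reset the failure streak on success, extend it on failure."""
        self.failures[endpoint] = 0 if ok else self.failures[endpoint] + 1

    def healthy(self):
        """Endpoints whose failure streak is below the threshold."""
        return [e for e in self.endpoints
                if self.failures[e] < self.max_failures]

    def pick(self):
        """Choose the next endpoint, preferring healthy ones.

        If every endpoint looks unhealthy, fall back to the full list
        so that traffic is not dropped entirely.
        """
        candidates = self.healthy() or self.endpoints
        choice = candidates[self._next % len(candidates)]
        self._next += 1
        return choice


if __name__ == "__main__":
    balancer = FailureAwareBalancer(["cart-1", "cart-2"],
                                    max_consecutive_failures=2)
    balancer.record("cart-1", ok=False)
    balancer.record("cart-1", ok=False)
    print(balancer.healthy())
    print([balancer.pick() for _ in range(3)])
```

The fallback to the full endpoint list when nothing looks healthy is a deliberate choice in this sketch: sending requests to a possibly failing endpoint is treated as better than sending them nowhere.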

Conclusion

Becoming hooked on service metrics is key to delivering services successfully and meeting end user expectations. Given the value that service delivery brings to a business, it’s crucial to ensure that services are always running at their prime. For microservices, implementing a service mesh architecture and following the corresponding best practices helps ensure successful service delivery.

— Adam Hert