Observability is growing in importance as organizations work to meet their business objectives. Despite the benefits of performance data, organizations have struggled to realize the full potential of observability because of implementation challenges, distributed software tools, and other technological hurdles. But with new innovation, the era of true observability is upon us.
Innovative changes
The open-source observability landscape, in particular, is going through a phase of innovation. With distributed tracing “coming of age” through OpenTelemetry, and eBPF taking center stage as the next frontier of observability and monitoring on Linux, innovation is at an all-time high.
Kernel technology
Extended Berkeley Packet Filter (eBPF) is a fascinating technology native to the Linux kernel. Born as a programmable way of dealing with networking, eBPF has evolved into a rather generic and performant facility for safe, dynamic access to the Linux kernel. Among many other things, it makes it possible to extract telemetry for observability and security purposes. eBPF is powerful and growing in popularity, and Microsoft has been working on a port to Windows.
Versatility
In terms of observability, eBPF proved to be remarkably versatile in 2021. For example, there are tools that use eBPF to perform distributed tracing, like the Pixie project, which New Relic open-sourced in mid-2021. But the most promising aspect of eBPF for 2022 is that it makes continuous profiling of applications in production possible. Examples are the aforementioned Pixie project and Parca. As tools like Google’s Cloud Profiler and various commercial observability tools have shown, a low-overhead, always-on production profiler is a powerful way to troubleshoot latency and memory issues, which are notoriously hard and time-consuming to reproduce in test labs.
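To make the idea of an always-on sampling profiler more concrete, here is a minimal sketch in Python. It is not eBPF and not how Pixie, Parca, or Cloud Profiler work internally; it only illustrates the general sampling technique in user space, assuming a background thread that periodically snapshots the stacks of all other threads. The names sample_once, run_sampler, and busy_work are hypothetical.

```python
import collections
import sys
import threading
import time
import traceback


def sample_once(counts, skip_ident=None):
    """Record one stack sample per live thread into `counts`."""
    for ident, frame in sys._current_frames().items():
        if ident == skip_ident:
            continue  # do not profile the sampler thread itself
        # Root-first function names joined with ";", the "folded" style
        # commonly fed to flame graph tools.
        stack = ";".join(f.name for f in traceback.extract_stack(frame))
        counts[stack] += 1


def run_sampler(counts, stop_event, interval_s=0.01):
    """Sample all other threads every `interval_s` seconds until stopped."""
    me = threading.get_ident()
    while not stop_event.is_set():
        sample_once(counts, skip_ident=me)
        time.sleep(interval_s)


def busy_work(deadline):
    # Hypothetical workload that burns CPU until `deadline`.
    total = 0
    while time.monotonic() < deadline:
        total += sum(range(1000))
    return total


if __name__ == "__main__":
    counts = collections.Counter()
    stop = threading.Event()
    sampler = threading.Thread(
        target=run_sampler, args=(counts, stop), daemon=True
    )
    sampler.start()
    busy_work(time.monotonic() + 0.5)
    stop.set()
    sampler.join()
    # Stacks that appear in the most samples are where time was spent.
    for stack, n in counts.most_common(3):
        print(n, stack)
```

The overhead of a profiler like this depends mostly on the sampling interval, which is why production profilers sample at a low, fixed frequency rather than tracing every call.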
Clear differences
While production profiling with eBPF is happening now, it is not yet a general capability that works with most applications. Profiling works surprisingly differently across runtimes and programming languages. While pretty much all languages behave similarly in how they use the CPU and allocate memory (and even in such foundational computing matters, there are differences), the way they handle concurrency differs notably, pretty much across the board.
For example, Java uses threads inside the Java Virtual Machine that are mapped to operating system threads (to be fair, there are also other options, like NIO and libraries such as RxJava, Reactor, and Vert.x, and Project Loom will eventually ship). In Node.js, on the other hand, concurrency is mostly handled by the event loop, and “real” operating system threads are abstracted away from the developer. Differences like these matter for a profiler, because the data shown to the person troubleshooting an issue must be “translated” into the concurrency model used by the programming language, and this requires explicit support in the profiler for specific runtimes.
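The “translation” step can be sketched as follows. This is a hypothetical illustration in Python, not the approach of any particular profiler: it assumes each raw sample carries a runtime label, the OS thread id, and, for event-loop runtimes, some identifier of the logical task that was running. The Sample type, its field names, and the "jvm" and "nodejs" labels are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Sample:
    runtime: str                      # hypothetical label, e.g. "jvm" or "nodejs"
    os_thread_id: int                 # thread the kernel saw on-CPU
    frames: Tuple[str, ...]           # root-first function names
    async_task: Optional[str] = None  # logical task, if the runtime exposes one


def concurrency_unit(sample):
    """Pick the unit a developer reasons about in this runtime."""
    if sample.runtime == "nodejs":
        # One event-loop thread serves many logical tasks, so the OS
        # thread id alone says little to the person troubleshooting.
        return f"task:{sample.async_task or 'unknown'}"
    # Default: runtime threads map onto OS threads, so group by thread.
    return f"thread:{sample.os_thread_id}"


def fold(samples):
    """Count samples per (concurrency unit, stack) in folded-stack form."""
    counts = Counter()
    for s in samples:
        key = concurrency_unit(s) + ";" + ";".join(s.frames)
        counts[key] += 1
    return counts


if __name__ == "__main__":
    samples = [
        Sample("jvm", 101, ("main", "handleRequest", "parseJson")),
        Sample("jvm", 102, ("main", "handleRequest", "queryDb")),
        Sample("jvm", 101, ("main", "handleRequest", "parseJson")),
        Sample("nodejs", 7, ("run", "onRequest", "parseJson"), "GET /a"),
        Sample("nodejs", 7, ("run", "onRequest", "render"), "GET /b"),
    ]
    for key, n in fold(samples).items():
        print(n, key)
```

In this sketch the two Node.js samples share OS thread 7 but are reported under separate tasks, while the Java samples are grouped by thread. Recovering something like async_task from a running process is exactly the part that needs runtime-specific support.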
Open source observability comes of age in 2022
Fortunately, 2022 is going to be the year that brings us full-featured, open-source, eBPF-based production profiling tools that understand the specifics of the many runtimes prevalent in today’s cloud-native applications, like Java, Node.js, Python and .NET, as well as compiled languages like Go and Rust. That will be a game-changer for improving software and solving issues faster, for DevOps engineers, SREs, and operators alike.
As we look to the year ahead, revolutionary technology like eBPF means the rate of innovation at the operating system level is only picking up, setting the stage for the year of true observability.