Nobl9 this week revealed it has integrated its reliability center platform for managing service level objectives (SLOs) as code with the Microsoft Azure Monitor service to make it simpler to both create and track service levels.
The Nobl9 Reliability Center provides access to dashboards for a single source of truth about the state of reliability in an application environment based on the metrics collected from the SLOs that have been defined.
Nobl9 Chief Growth Officer Kit Merker said the integration with Microsoft Azure Monitor is the first in a series that will ultimately make it easier to define SLOs using historical data collected by third-party tools.
The Nobl9 Reliability Center is already integrated with more than 50 DevOps, observability and incident management tools, through which it collects metrics, events, logs, traces and alerts to track incidents, releases, rollbacks, runbooks and other documentation.
Those integrations will ultimately make it simpler for IT teams to use historical data to create SLOs, which today are still challenging to define from scratch, noted Merker. The data tracked by Nobl9 can also be used to automate remediations via those integrations, he added.
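To illustrate how historical data can inform an SLO, the following is a minimal Python sketch that derives a candidate latency threshold from past measurements using a nearest-rank percentile. It does not use the Nobl9 or Azure Monitor APIs. The function name, the sample values and the 90% target are all hypothetical and chosen only for illustration.

```python
import math


def suggest_latency_threshold_ms(samples_ms, target_ratio=0.99):
    """Return the smallest observed latency that at least `target_ratio`
    of the historical samples met (nearest-rank percentile).

    Hypothetical helper for illustration; not part of any vendor API.
    """
    if not samples_ms:
        raise ValueError("need at least one historical sample")
    if not 0 < target_ratio <= 1:
        raise ValueError("target_ratio must be in (0, 1]")
    ordered = sorted(samples_ms)
    # Nearest-rank method: the rank is 1-based, so subtract 1 to index.
    rank = math.ceil(target_ratio * len(ordered))
    return ordered[rank - 1]


# Made-up request latencies in milliseconds, standing in for exported metrics.
history = [120, 95, 110, 480, 105, 130, 99, 101, 115, 125]
print(suggest_latency_threshold_ms(history, target_ratio=0.9))
```

A team might start from a threshold like this and then tighten or loosen it, rather than guessing a number from scratch.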
As more microservices-based applications are built and deployed, it’s becoming more challenging to maintain SLOs across applications that have many more dependencies than legacy monolithic applications. Each feature and application programming interface (API) added over time conspires to adversely impact performance. It’s not so much that any given service will fail outright; the harder problem is determining where the degradations that cause SLOs to be missed are occurring.
Ultimately, each IT team needs some sort of objective benchmark for assessing its overall effectiveness at delivering application services, which requires making it simpler to track whether SLOs are achieved and maintained. Not every service necessarily needs to meet a stringent requirement. Instead, IT teams need confidence that, on average, certain levels of performance are being consistently maintained.
That’s crucial because as more organizations depend on software to deliver digital services, any disruption is likely to have a direct impact on profit and revenue. As such, the reliability of application environments has become a major concern in an era where IT teams are now trying to manage distinct services rather than individual applications.
The challenge and the opportunity now is to find a way to proactively reduce the stress on DevOps teams by programmatically managing SLOs as code rather than overprovisioning infrastructure resources to ensure maximum availability at the highest possible cost. At a time when more organizations are encountering economic headwinds, overprovisioning is a luxury they can no longer afford.
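As a rough illustration of what managing an SLO as code can mean, the sketch below expresses an objective as a small Python data structure and checks how much of its error budget remains. This is not Nobl9's own SLO format. The `SLO` class, its fields and the example numbers are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLO:
    """Hypothetical, simplified SLO definition (not a vendor schema)."""
    name: str
    target: float  # required fraction of good events, e.g. 0.99

    def error_budget(self, total_events: int) -> float:
        """Number of bad events the target tolerates for this volume."""
        return (1 - self.target) * total_events

    def budget_remaining(self, good_events: int, total_events: int) -> float:
        """Fraction of the error budget still unspent.

        1.0 means no bad events; 0.0 means the budget is exactly used up;
        a negative value means the objective has been missed.
        """
        allowed = self.error_budget(total_events)
        bad = total_events - good_events
        if allowed == 0:
            # A 100% target (or zero traffic) leaves no budget to spend.
            return 0.0 if bad == 0 else float("-inf")
        return 1 - bad / allowed


# Example values are invented for illustration.
checkout = SLO(name="checkout-availability", target=0.99)
remaining = checkout.budget_remaining(good_events=9_950, total_events=10_000)
print(f"{checkout.name}: {remaining:.0%} of error budget remaining")
```

Because the definition lives in code, it can be reviewed, versioned and evaluated automatically, which is the alternative to overprovisioning that the article describes.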
Each organization will need to determine for itself how aggressively it needs to manage SLOs. Of course, SLOs as a concept have been around for decades, but with the rise of digital services that span multiple organizations, the level of reliability DevOps teams are expected to provide is now being written into business contracts. Not surprisingly, the overall level of associated stress DevOps teams are experiencing has correspondingly increased.