DevOps.com

The Key Benefits of Observability for SREs

By: Jayne Groll on September 3, 2021

In today’s technology landscape, organizations strive to champion innovative ideas, techniques and technologies to achieve success and outshine their competitors. For this reason, site reliability engineering (SRE) has become one of the fastest-growing enterprise roles and a set of organizational practices for fast and reliable software delivery. 

SREs have various tools and practices at their disposal for managing services at scale, and observability is one of them. Observability is the ability to infer a system’s internal state from its external outputs, such as logs, metrics and traces. It provides actionable insights into when errors occur within a system and, more importantly, why they occur. For SREs, this actionable data can be essential to delivering secure and reliable applications.

To gain more insights into the benefits of observability for SREs, I asked SKILup Day participants and DevOps Institute Ambassadors to weigh in. Here’s what they shared: 

Sponsor, Priya Satheesh, CEO, Instana 

“As their name implies, site reliability engineers are tasked with keeping the applications, architectures and websites up and available. Observability solutions put alerts in front of them at the first indications of trouble, with proper context and information so they can take action before issues become incidents and start affecting their customers. In addition, AIOps and machine learning help them to predict events and standardize responses for common occurrences to reduce response and repair times.”

Helen Beal, chief ambassador, DevOps Institute 

“Embracing observability supports SREs’ goals by:

  • Reducing the toil associated with incident management—particularly around cause analysis—improving uptime and MTTR.
  • Providing a platform for inspecting and adapting according to SLOs and ultimately improving teams’ ability to meet them.
  • Offering a potential solution to improve when SLOs are not met and error budgets are overspent.
  • Relieving team cognitive load when dealing with vast amounts of data–reducing burnout.
  • Releasing humans and teams from toil while improving productivity, innovation and the flow and delivery of value.
  • Supporting multifunctional, autonomous teams and the ‘We build it, we own it,’ DevOps mantra.
  • Completing the value stream cycle by providing insights around value outcomes that can be fed back into the innovation phase.”
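
To make the SLO and error budget points above concrete, here is a minimal Python sketch of how an error budget might be derived from a request-based SLO. The class name, field names and sample numbers are illustrative assumptions, not something the contributors described.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorBudget:
    """Error budget for one SLO window (all names here are illustrative)."""

    slo_target: float     # assumed SLO, e.g. 0.999 = 99.9% of requests succeed
    total_requests: int   # requests served during the window
    failed_requests: int  # requests that missed the SLI threshold

    def __post_init__(self) -> None:
        if not 0.0 < self.slo_target < 1.0:
            raise ValueError("slo_target must be strictly between 0 and 1")

    @property
    def allowed_failures(self) -> float:
        # The budget is the share of requests the SLO permits to fail.
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def consumed_fraction(self) -> float:
        # Share of the budget already spent; above 1.0 means it is overspent.
        if self.total_requests == 0:
            return 0.0
        return self.failed_requests / self.allowed_failures

    @property
    def is_overspent(self) -> bool:
        return self.consumed_fraction > 1.0


# Hypothetical numbers: one million requests against a 99.9% SLO.
budget = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=1_200)
print(f"allowed failures: {budget.allowed_failures:.0f}")
print(f"budget consumed: {budget.consumed_fraction:.0%}")
print(f"overspent: {budget.is_overspent}")
```

In practice the request counts would come from whatever observability backend a team already runs. The sketch only shows the arithmetic that turns those counts into a decision about whether the budget is overspent.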

Mark Peters, technical lead, Novetta

“True system observability means any process or event within any operational or development phase can be monitored by SREs. Systems with built-in observability provide SREs, as subject matter experts, the opportunity to examine events from different perspectives. The more observable the system, the more different places can be examined to help improve the overall flow. SREs can then create the feedback and design experiments to observe from different perspectives. If the event is showing functional difficulties and the only monitoring point is the input/output of the function and there is no process observability, SREs have to go back to the beginning to find events. If they can observe the entire process chain from commit to deployment, one can find the different areas. Think about the A/B cycle: Limited deployments are planned, and the load shifted gradually to ensure the new system can handle the increased functions. If the SRE cannot observe the function, then they cannot help improve reliability.”
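
The gradual load shift mentioned above depends on being able to observe the new version at each step. The following Python sketch shows one way that gating could look. The function name, the traffic steps, the error-rate threshold and the fake metrics callback are all hypothetical and stand in for a real deployment tool and metrics source.

```python
from typing import Callable, Sequence, Tuple


def progressive_rollout(
    steps: Sequence[int],
    observe_error_rate: Callable[[int], float],
    max_error_rate: float,
) -> Tuple[bool, int]:
    """Shift traffic step by step and stop at the first unhealthy step.

    Returns (completed, last_safe_percent). `observe_error_rate` is a
    hypothetical hook that reports the new version's error rate once it
    receives the given percentage of traffic.
    """
    last_safe = 0
    for percent in steps:
        error_rate = observe_error_rate(percent)
        if error_rate > max_error_rate:
            # Without this observation there would be no signal to stop on.
            return False, last_safe
        last_safe = percent
    return True, last_safe


def fake_error_rate(percent: int) -> float:
    # Made-up behavior: the new version degrades above 25% of traffic.
    return 0.001 if percent <= 25 else 0.05


completed, safe_percent = progressive_rollout(
    steps=[5, 25, 50, 100],
    observe_error_rate=fake_error_rate,
    max_error_rate=0.01,
)
print(completed, safe_percent)
```

With the made-up metrics above, the rollout stops before the 50% step and reports 25% as the last safe level, which is the point at which an SRE would investigate or revert.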

Ryan Sheldrake, field CTO, Lacework 

“If you can ‘see’ or observe the complex and ever-changing environments that SREs are tasked with managing, creating and perhaps destroying, you can manage the risk posture associated with those environments and corresponding workload.”

Vishnu Vasudevan, head of product engineering and development, Opsera

“The top benefit for site reliability engineering (SRE) teams is the opportunity to work collaboratively with the business stakeholders and technology stakeholders to set goals and continuously improve. Based on these goals, SREs can create an approval process, which will help them understand where they are with respect to their own planning, security policies, quality metrics and operations practices. The ability to measure every release and verify the project helps improve speed to market and other factors once a robust observability practice is achieved. With DevOps orchestration across all the DevOps tools, applications being released and the different teams involved, SRE teams can then turn observability practices into a competitive advantage.”

Sushant Mehta, senior manager, application development, Diyar United Company

“Using observability, SREs can achieve a number of objectives in their day-to-day work, like identifying the root cause of production issues, faster resolution of issues and striving for self-healing infrastructure setup with no-code.”

Tiffany Jachja, engineering manager, Vox Media

“Every team that delivers software should be taking responsibility for their contributions to a software service. For SREs, having observability into an application means having the systems in place to help developers gain insights into how software applications function. Ultimately, it’s about building better software.”

Parveen Arora, co-founder and director, VVnT SeQuor

“SRE teams can leverage observability to get the following benefits:

  • To keep SLOs intact by detecting customer-affecting issues faster and rolling back before an issue affects the SLOs.
  • To have real-time health updates and transparency of information with regard to a service’s status. 
  • To create better workflows for debugging, optimizing workflows and resolving issues rapidly.
  • To simplify root cause analysis and investigation of hypotheses.”
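
Detecting an issue and rolling back before it affects the SLO needs some rule for deciding when to act. One common approach, which the quote above does not name, is a burn-rate check: compare the observed error rate with the error rate the SLO allows. The Python sketch below assumes a request-based SLO. The function names and the default threshold of 14.4 are illustrative choices, not values from the article.

```python
def burn_rate(errors: int, requests: int, slo_target: float) -> float:
    """How fast the error budget is burning; 1.0 means exactly on budget."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if requests == 0:
        return 0.0
    observed_error_rate = errors / requests
    allowed_error_rate = 1.0 - slo_target
    return observed_error_rate / allowed_error_rate


def should_roll_back(
    errors: int,
    requests: int,
    slo_target: float,
    threshold: float = 14.4,  # assumed threshold; tune it per service
) -> bool:
    # Roll back when the recent window burns budget faster than the threshold.
    return burn_rate(errors, requests, slo_target) > threshold


# Hypothetical window of 1,000 requests after a release, with a 99.9% SLO.
print(should_roll_back(errors=20, requests=1000, slo_target=0.999))
print(should_roll_back(errors=5, requests=1000, slo_target=0.999))
```

The first call represents a release that is failing roughly 20 times faster than the SLO allows, so it crosses the assumed threshold. The second is elevated but stays below it.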

Jose Adan Ortiz, solutions engineer, Akamai Technologies

“SREs are mainly concerned with the reliability of systems and usually monitor toil, SLOs, error budgets and stability to keep systems working as expected.

You could ask, ‘How can an SRE achieve such an integrated view of all system components?’ The answer is observability. Without observability tools, it is impossible for an SRE to provide an effective response, fast RCA and analysis of data to improve SLOs.”

Supratip Banerjee, solutions architect, Principal Global Services

“It can provide several advantages, including:

  • Detecting customer-impacting errors faster and reverting before SLOs are broken.
  • Fostering transparency and delivering real-time updates on the status of a service. This allows SREs to be more productive and saves a lot of time.
  • Developing better methods for debugging, optimizing and quickly addressing issues.
  • Allowing for feedback loops, which are crucial in the SRE role.
  • To avoid downtime, the relevance of observability rises in real-world production systems. There should be proper notification processes in place.”

Maciek Jarosz, DevOps and process expert

“The benefits are quite great, I’d say. Unless one is living under a rock with one local system, then it’s rather inevitable that there will be dependencies here and there. And if you’re an SRE, then I’d say that you’d rather know a bit more about dependencies from other branches of your system, or other systems altogether, than less.”

Learn more about observability and similar topics by registering for an upcoming SKILup Day. 

