DevOps.com

Catchpoint SRE Study Reveals a Global Drop in Toil, Warns of Looming Scalability Ceiling, and Highlights the Need for New Operational Capabilities

By: Veronica Haggar on June 15, 2021

New York, NY, June 15, 2021 - Catchpoint, the leader in Digital Experience Management, conducted a study of nearly 300 site reliability engineers (SREs) with VMware Tanzu and DevOps Institute. The SRE Report is one of the most data-backed studies of its kind and has played a critical role in defining what it means to be an SRE since it launched four years ago. This year’s report underscores the challenges of multi-cloud, calls out the underutilization of AIOps, and shows a systemic shift in core baselining data. The report concludes by offering an actionable path for SREs to consistently deliver customer value.

Download the report here.

“SREs deal with a very broad set of challenges that span across transformational and operational activities,” says Mehdi Daoudi, CEO of Catchpoint. “This report arms them with the insights they need to help address these challenges – to balance the need for agility against the need for stability when building and operating massive, distributed, and reliable systems.”

Levels of Toil Fall Around the World

Toil is the work tied to a production service that tends to be manual, repetitive, automatable, and devoid of enduring value. Google suggests that SREs should spend no more than 50% of their time on ops work (including toil), leaving at least 50% for dev work. This year, the SRE Report notes an average year-over-year drop in toil of 15%.
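
As a rough illustration of the 50% guideline above, the following sketch computes the share of logged hours spent on ops work and checks it against the cap. The category names, the `ops_share` and `within_cap` helpers, and the sample hours are hypothetical; none of them come from the SRE Report.

```python
# Hypothetical sketch: check a week's logged hours against the 50% ops cap.
OPS_CAP = 0.50  # guideline cited above: at most 50% ops work (including toil)

# Assumed category labels; a real team would use its own time-tracking tags.
OPS_CATEGORIES = ("toil", "on_call", "tickets")


def ops_share(hours_by_category, ops_categories=OPS_CATEGORIES):
    """Return the fraction of total logged hours spent on ops work."""
    total = sum(hours_by_category.values())
    if total == 0:
        return 0.0
    ops = sum(h for cat, h in hours_by_category.items() if cat in ops_categories)
    return ops / total


def within_cap(hours_by_category, cap=OPS_CAP):
    """Return True when the ops share does not exceed the cap."""
    return ops_share(hours_by_category) <= cap


# Made-up sample week, for illustration only.
week = {"toil": 12, "on_call": 6, "tickets": 4, "dev": 18}
print(f"ops share: {ops_share(week):.0%}, within cap: {within_cap(week)}")
```

In this made-up week, 22 of 40 logged hours are ops work, so the team would be over the cap.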

“The reason this is such an impactful insight is that the drop in toil was across all geographies,” says Tony Ferrelli, Vice President of Technical Operations at Catchpoint. “If this drop in toil was because work felt more meaningful since COVID-19 led to SREs working-from-home, then will reported toil levels rise next year as people return to the office or a hybrid work environment?”

The Accelerating Use of Multiple Providers Warns of a Looming Scalability Ceiling

If the cloud is your new datacenter, then third-party services like DNS and CDN are your new racks and cabinets. Given the rising use of multiple same-service platforms (e.g., multi-cloud) and the increase in the volume, velocity, and variety of monitoring data, it is little wonder that lack of visibility across the stack (53%) was the most cited cloud-app monitoring challenge or that SREs continually refine service level objectives (50%). The survey responses raise a critical question: how can companies most effectively scale SRE implementations?

“Spanning the gaps between the interfaces and the data that each provider offers increases the difficulty for SRE teams to automate across those multiple providers. These integrations are rarely simple except for the most superficial aspects. Effectively mapping disparate data models together may be the next frontier for SRE in a multi-vendor environment,” says Kurt Andersen, SRE Architect, Blameless.

The Shift Toward AIOps Is Slow

AIOps has been widely touted as a way to reduce laborious ops work and to intelligently sift through the ever-increasing volumes of data that organizations face. However, the report shows that many SREs have never used AIOps, and ratings of its received value were spread evenly across the 1-9 scale.

According to J. Bobby Dorlus, Staff SRE at Twitter, “Most SREs working at this scale are already leveraging machine learning, especially when it comes to efficiencies around data centers (locations, cooling, and all the things that happen inside it) for networks and building out infrastructure … Evolving that into AIOps is the next logical step.”

Observability Should Include Digital Experience Metrics and Business KPIs

SREs that fail to deliver customer value run the risk of being stuck in an operational toil rut. Conversely, businesses that fail to recognize the importance of SRE activities risk losing talented employees and their competitive edge.

The highest-ranked driver for successful SRE implementations was incident resolution (60%), while expanding the business was fifth lowest (33%). These findings show that SREs are still focused inward on IT operations rather than outward on the business results that deliver customer value. To close this IT-to-business gap, SRE teams must expand observability boundaries to include digital experience metrics and business KPIs.

“The balancing work of innovation while providing operational excellence has forced many IT teams to put heavy emphasis on improving reliability and stability of services and applications,” says Eveline Oehrlich, Chief Research and Content Officer at the DevOps Institute. “What SREs now need to do is make sure the value of these reliable services and applications are understood by the customer.”

Recommendations

  • Businesses and SREs need to establish a baselining program around core SRE tenets and business level metrics to know whether things are getting better or worse.
  • Platform Operations teams should be implemented to achieve higher levels of scale and efficiency. Platform Ops should develop normalized capabilities for SREs across the organization to draw on (even though underlying platforms will have different interfaces) and treat those capabilities as a product to sell and market to other teams within the business.
  • To achieve the promise of AIOps, SREs and managers must break down AIOps into smaller components and incrementally develop from there, in addition to investing in training in AI and ML for SRE teams.
  • It is crucial to find ways to bridge the gap between SRE and business goals. For instance, start conversations around capabilities instead of focusing on low-level monitoring metrics and high-level business outcomes.
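
The baselining recommendation above can be sketched as a comparison of current values against a recorded baseline. This is a minimal sketch under stated assumptions: the `Metric` type, the `trend` helper, the 5% tolerance, and all metric names and values are hypothetical and are not taken from the SRE Report.

```python
# Hypothetical sketch: classify metrics as better, worse, or flat versus a baseline.
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str
    baseline: float
    current: float
    higher_is_better: bool


def trend(metric, tolerance=0.05):
    """Classify a metric relative to its baseline.

    Relative changes smaller than `tolerance` count as "flat".
    """
    if metric.baseline == 0:
        raise ValueError(f"{metric.name}: baseline must be non-zero")
    change = (metric.current - metric.baseline) / abs(metric.baseline)
    if abs(change) < tolerance:
        return "flat"
    improved = change > 0 if metric.higher_is_better else change < 0
    return "better" if improved else "worse"


# Made-up values mixing an SRE tenet, a business metric, and an ops metric.
metrics = [
    Metric("toil_share", baseline=0.40, current=0.34, higher_is_better=False),
    Metric("checkout_conversion", baseline=0.031, current=0.030, higher_is_better=True),
    Metric("median_minutes_to_resolve", baseline=45.0, current=52.0, higher_is_better=False),
]
report = {m.name: trend(m) for m in metrics}
print(report)
```

Tracking SRE-level and business-level metrics in the same structure is one way to keep both in view when reviewing whether things are improving.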

Resources

SRE Report

Connect with Catchpoint

Twitter: @Catchpoint

LinkedIn: https://www.linkedin.com/company/catchpoint/

About Catchpoint

Catchpoint, the global leader in Digital Experience Monitoring (DEM), empowers business and IT leaders to protect and advance the experience of their customers and employees. In a digital economy enabled by cloud, SaaS, and IoT, applications and users are everywhere. Catchpoint offers the largest and most geographically distributed monitoring network in the industry – it’s the only DEM platform that can scale and support today’s customer and employee location diversity and application distribution. It helps enterprises proactively detect, identify, and validate user and application reachability, availability, performance, and reliability across an increasingly complex digital delivery chain. Industry leaders like Google, L’Oréal, Verizon, Oracle, LinkedIn, Honeywell, and Priceline trust Catchpoint’s out-of-the-box monitoring platform to proactively detect, repair, and optimize customer and employee experiences. Learn more at www.catchpoint.com.
