Site Reliability Engineering

Building an Open Source Observability Platform

By investing in open source frameworks and LGTM tools, SRE teams can effectively monitor their apps and gain insights into…

3 months ago

Harnessing AI for Automated and Toil-Free SRE

AI not only reduces toil but also contributes to improving system reliability, efficiency and scalability, forming a critical part of…

11 months ago

Revolutionizing the Nine Pillars of SRE With AI-Engineered Tools

In my blog post “Rapid Strategic SRE Assessments Accelerate IT Transformations,” published last year, I classified site reliability engineering (SRE) into…

12 months ago

Why SREs Are Critical to DevOps

Although a relatively new concept, site reliability engineers (SREs) have become crucial for DevOps teams, helping to solve an array…

1 year ago

Best of 2022: Day in the Life of a Site Reliability Engineer (SRE)

As we close out 2022, we at DevOps.com wanted to highlight the most popular articles of the year. Following is…

1 year ago

SRE Survey Reveals Major Technical and Cultural Challenges

Catchpoint, in partnership with Blameless, today published an annual survey of 559 site reliability engineers (SREs) that found 59% of…

2 years ago

Scaling Predictive Analytics With AIOps to Drive Next-Gen SRE

Enterprise systems are only as valuable as they are reliable, in the sense that they don’t suffer excessive breakdowns. Otherwise,…

2 years ago

5 Ways to Prevent an Outage

In today’s always-on, ever-connected world, we all expect 100% availability. What gets in the way of this? The devil is…

2 years ago

Why More Incidents Are Better

Ask most SREs how many incidents they’d have to respond to in a perfect world, and their answer would probably…

2 years ago

How to Adopt an SRE Practice (When You’re not Google)

Site reliability engineering (SRE) isn’t a new term or practice. The practice of applying software engineering skills and principles to…

2 years ago