uptime

SREs Say There’s Plenty of Room to Improve Incident Management

A global survey of site reliability engineers (SREs) found diagnosing issues is the most difficult aspect of incident management.

2 months ago

Unlocking Accountability: How Real-Time App Monitoring Empowers Engineering Teams

Real-time app monitoring is about fundamentally shifting your mindset toward a culture of accountability and continuous improvement.

2 months ago

Managing Risk

We have built some beautiful toolchains that crank out a finished product on the fly without needing anything close to…

1 year ago

Fire at Data Center Causes Chaos | 20% Costlier Cloud

In this week’s The Long View: A S. Korean conflagration leads to a ridiculously long outage, and the price of…

1 year ago

How to Improve Your Uptime Strategy

Outages happen; it’s inevitable. But unplanned downtime often comes with substantial costs, not only in terms of recovery and revenue loss,…

4 years ago

What It Really Takes to Build a Reliable App

What makes an app reliable? If you ask most IT professionals that question, their minds immediately go to uptime. That’s…

4 years ago

The 3 Secrets to Overcoming the Agility-Stability Paradox

Regardless of whether an organization subscribes to a bimodal IT approach, one thing is clear: It’s more important, and more…

8 years ago

System Redundancy at the Distributed Cafe

Here are some other ROELBOBs you might like: A Unit Testing Dilemma, The Pink Slip, Another Pink Slip…

8 years ago

DevOps culture meets the SLA

The goal of every SaaS provider should be 100% availability; it should be ingrained in their culture. But it's…

10 years ago