outage

Microsoft Outage Outrage: Was it BGP or DNS?

All of Microsoft’s cloud services go down, everywhere. Redmond’s IaaS, PaaS and SaaS—including GitHub—were dead for several hours, and are…

1 year ago

5 Ways to Prevent an Outage

In today’s always-on, ever-connected world, we all expect 100% availability. What gets in the way of this? The devil is…

2 years ago

Cloudflare Outage Outrage | Yet More FAA 5G Stupidity

In this week’s The Long View: Cloudflare suffers another huge outage while the FAA and FCC still disagree over 5G/NR…

2 years ago

What SREs Can Learn From the Atlassian Outage of 2022

What happens when the tools and services you depend on to drive site reliability engineering turn out to be susceptible…

2 years ago

Apple Outage Outrage | Linux Random Redo | Okta Hacked (or Not)

In this week’s The Long View: Why Apple services were down, Linux gets a huge RNG overhaul, and we wonder…

2 years ago

AWS Outage Exposes Weaknesses of DevOps Resilience

The December 7, 2021 Amazon Web Services (AWS) outage severely disrupted services from a wide range of businesses for more…

2 years ago

AWS Outage Outrage | Rusty Linux | ARM Latest

In this week’s The Long View: Amazon Web Services falls on its face, Linux’s move to Rust takes the next…

2 years ago

Nvidia/ARM Wavering | Google Outage Outrage | Backblaze IPO on Fire

In this week’s The Long View: Nvidia’s faltering attempt to buy Arm, Google’s load balancers go offline, and Backblaze’s newly-IPO’ed…

2 years ago

When IT Disaster Strikes, Part 3: Conducting a Blameless Post-Mortem

In the first and second parts of this three-part series, we looked at how organizations can effectively resolve incidents and…

6 years ago

An outage war room primer

One aspect of the DevOps movement I've seen adopted at numerous companies is the idea that everyone supports their products…

10 years ago