Tag: outage

Delta CEO Puts Price Tag on CrowdStrike Damage: $500 Million
Delta Air Lines Inc. Chief Executive Ed Bastian has put a price tag of $500 million on the damage CrowdStrike Inc.'s debilitating outage did to his airline, leaving the company "no choice" but to seek damages ...

Microsoft Outage Outrage: Was it BGP or DNS?
All of Microsoft’s cloud services go down, everywhere. Redmond’s IaaS, PaaS and SaaS—including GitHub—were dead for several hours, and are still running unreliably—despite Microsoft saying it’s fixed ...

5 Ways to Prevent an Outage
In today’s always-on, ever-connected world, we all expect 100% availability. What gets in the way of this? The devil is in the details. Over time, everything breaks: Disks, nodes, containers, networks, DNS ...

Cloudflare Outage Outrage | Yet More FAA 5G Stupidity
In this week’s The Long View: Cloudflare suffers another huge outage while the FAA and FCC still disagree over 5G/NR near airports ...

What SREs Can Learn From the Atlassian Outage of 2022
What happens when the tools and services you depend on to drive site reliability engineering turn out to be susceptible to reliability failures of their own? That’s the question teams at about ...

Apple Outage Outrage | Linux Random Redo | Okta Hacked (or Not)
In this week’s The Long View: Why Apple services were down, Linux gets a huge RNG overhaul, and we wonder if Okta was hacked again ...

AWS Outage Exposes Weaknesses of DevOps Resilience
The December 7, 2021 Amazon Web Services (AWS) outage severely disrupted services from a wide range of businesses for more than five hours and highlighted just how reliant businesses have become on ...

AWS Outage Outrage | Rusty Linux | ARM Latest
In this week’s The Long View: Amazon Web Services falls on its face, Linux’s move to Rust takes the next step, and the FTC stabs another fatal wound in the horrible Arm/Nvidia ...

Nvidia/ARM Wavering | Google Outage Outrage | Backblaze IPO on Fire
In this week’s The Long View: Nvidia’s faltering attempt to buy Arm, Google’s load balancers go offline, and Backblaze’s newly-IPO’ed stock jumps 60% ...

When IT Disaster Strikes, Part 3: Conducting a Blameless Post-Mortem
In the first and second parts of this three-part series, we looked at how organizations can effectively resolve incidents and the role of each person involved in the resolution process. In this ...

An outage war room primer
One aspect of the DevOps movement I've seen adopted at numerous companies is the idea that everyone supports their products by being on-call for any incidents that occur in the production environment ...