Welcome to The Long View—where we peruse the news of the week and strip it to the essentials. Let’s work out what really matters.
This week: A South Korean conflagration leads to a ridiculously long outage, and the price of public cloud is skyrocketing.
1. Ouch: Outage Outrage
First up this week: A fire at a data center used by Kakao—“the Korean Google”—has caused the loss of 32,000 servers, an outage lasting several days and even the resignation of a co-CEO. It turns out that putting all your eggs in one basket isn’t a good idea. Who knew?
Analysis: How’s your DR plan?
In 2022, downtime is cringe. Your app might not be as vital to the nation as Kakao is to South Korea, but regional failover and other elements of disaster-recovery planning are just as important if you’re to meet users’ uptime expectations.
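To make "regional failover" concrete, here is a minimal Python sketch of the core decision: try regions in priority order and route to the first one that passes a health check. The region names and the `is_healthy` callback are hypothetical placeholders, not anything Kakao or a particular cloud provider uses; a real setup would typically lean on DNS or load-balancer health checks rather than application code.

```python
from typing import Callable, Optional, Sequence


def pick_region(
    regions: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first healthy region in priority order, or None if all are down."""
    for region in regions:
        try:
            if is_healthy(region):
                return region
        except Exception:
            # A health check that raises is treated the same as an unhealthy region.
            continue
    return None


# Hypothetical region names; the first entry is the preferred (primary) region.
REGIONS = ["primary", "secondary", "tertiary"]

if __name__ == "__main__":
    # Simulate the primary data center being offline.
    down = {"primary"}
    print(pick_region(REGIONS, lambda region: region not in down))
```

The point of the sketch is the `None` case: if every region sits in the same building, there is nothing left to fail over to.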
Chang Che and Jin Yu Young: South Korean Super App Goes Down
“The result of ‘negligence’ and inadequate preparations”
Life in South Korea went askew this past weekend: … The Kakao suite of apps was down because of a fire at a data center [and] led to the resignation of a co-CEO.
Started as a messaging app more than a decade ago, KakaoTalk has become a universe unto itself, with services covering banking and payments, ride-hailing, maps and games. It is found on more than 90 percent of phones in South Korea. … Questions remained about why the company seemed to lack a contingency plan to restore services more quickly.
Multiple groups of customers are preparing a class-action suit against Kakao. … Lawyers have claimed that the accident was the result of “negligence” and inadequate preparations.
Did somebody say resignation? Manish Singh and Kate Park add: Company to invest over $300M to build data center
Whon Namkoong, the co-chief executive of Kakao, has resigned in a remarkable demonstration of corporate accountability after a fire incident at an SK C&C data center in Pangyo, south of Seoul, caused a mass outage. [He] apologized for the mass outage “for such an extended period” and said that he feels “the heavy burden of responsibility” over the incident.
KakaoTalk is the most popular messaging app in South Korea, reaching over 47 million of the nation’s 51.7 million population each month. The app is also used by government officials and businesses, including banks, ride-hailing services and payment players. … South Korean President Yoon Suk-yeol said [it] is practically a national communications infrastructure.
Unheard of. Except in Korea (not North Korea—South Korea), says The Evil Atheist (Marilyn Monroe):
[There’s a] longstanding Confucian influence on Korean and Japanese culture. Resigning due to failure, regardless of how immediately culpable you are, is considered an honorable thing to do. Honor isn’t something westerners understand any more.
How did it happen? lifthrasiir didn’t start the fire:
Not yet fully determined, but the current circumstance indicates that it’s likely an electric spark … ignited emergency fuel in a nearby generator room.
How the hell can you offer banking services on that scale without georedundancy? Did they lie to the banks and credit-card providers or what?
2. IaaS Inflation
Twenty percent: That’s the estimate of how much more you’ll be paying for public cloud in a year’s time. Infrastructure-as-a-service costs are being hit by a double-whammy of energy inflation and spiking interest rates.
Analysis: DevOps must save cash
So find savings. But you can’t manage what you can’t measure. Do you know what you’re spending on cloud—and why?
Paul Kunert: Public cloud prices to surge in US and Europe next year
“Relentless rise in inflation”
Challenges loom, according to Steve Brazier, CEO at channel analyst Canalys, who pointed out that the cost of the ever expanding infrastructure for public cloud is “incredibly expensive.” … For the US, the expectation is for … public cloud prices are forecast to jump … by a fifth.
The cost of borrowing is rising. … A seemingly relentless rise in inflation is also a problem: … This includes buildings, networking gear, and other IT equipment.
Is DevOps helpless in the face of inflation? No, says buro9:
An engineer should be as aware of their $ costs as they are their CPU load, filesystem I/O, or memory usage. An engineer should know how many instances they scaled to, and when they’re going to scale down.
Not being aware of the costs feels irresponsible: … Now we’re in an age of your application being able to scale beyond reasonable bounds of your company’s purse strings, a solid understanding of what your application costs is an essential and core skill. … Even during hyper growth you should be aware of your costs.
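As a rough illustration of that advice, the Python sketch below turns "how many instances, for how long" into a dollar figure and then applies the 20 percent rise forecast above. The hourly rate and the scaling periods are made-up numbers chosen for the example, and `ScalingPeriod`, `compute_cost` and `with_price_rise` are hypothetical names, not part of any provider's SDK.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ScalingPeriod:
    """A stretch of time during which a fixed number of instances was running."""
    instances: int
    hours: float


def compute_cost(periods: Iterable[ScalingPeriod], hourly_rate: float) -> float:
    """Total cost: instance-hours across all periods times the per-instance hourly rate."""
    instance_hours = sum(p.instances * p.hours for p in periods)
    return instance_hours * hourly_rate


def with_price_rise(cost: float, rise: float = 0.20) -> float:
    """Cost after a fractional price increase (default: the 20 percent forecast)."""
    return cost * (1 + rise)


if __name__ == "__main__":
    # Assumed day: 4 instances for 20 hours, scaled up to 12 for a 4-hour peak.
    day = [ScalingPeriod(instances=4, hours=20), ScalingPeriod(instances=12, hours=4)]
    today = compute_cost(day, hourly_rate=0.10)  # assumed rate in $/instance-hour
    print(f"today: ${today:.2f}, after rise: ${with_price_rise(today):.2f}")
```

Forgetting to scale down shows up directly in the `hours` field, which is the number buro9 is asking engineers to watch.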
Time to migrate back to on-prem? UK_Bedders has:
We can consolidate VMs/containers onto fewer physical machines, turn up the air conditioning by a degree, get rid of the last older devices that are less power efficient than newer ones. If everything was hosted by Azure or AWS we’d be beholden to them and any price increase they put our way, with no quick way to move away to save costs.
Any tips to reduce the IaaS bill? pid-1 has one or two:
AWS in particular is handsomely engineered so you can’t answer … where the money is going … as well as making people spend more without need.
– Per resource cost is turned off by default. Turning it on costs money. How much? ¯\_(ツ)_/¯
– Services like S3 will bill you per API call. How do you know who is calling your APIs? Using CloudTrail, which is turned off by default. Turning it on for this sort of analysis might cost a lot $$$ for a reasonably used bucket.
– CloudWatch will bill you per custom metric and adds up quickly. Why do I have so many custom metrics? Good luck finding out.
– Many costs are tied with stuff that’s really hard to control e.g. egress traffic.
– Using many accounts is by far the best way to segregate unrelated projects. But then, you will end up paying for many criminally underutilized NAT GWs, VPN connections, Load Balancers, etc…
The list goes on. I know a fair number of consulting companies making the money of their lives reducing AWS bills.
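One way to start answering "where is the money going" is to group whatever billing line items you can export by service and by account. The Python sketch below does that over a handful of invented records; the field names (`account`, `service`, `cost`), the sample figures and the `spend_by` helper are all assumptions for illustration, and the code does not call any AWS API.

```python
from collections import defaultdict
from typing import Any, Iterable, List, Mapping, Tuple


def spend_by(items: Iterable[Mapping[str, Any]], field: str) -> List[Tuple[str, float]]:
    """Sum the 'cost' of each line item per value of `field`, largest total first."""
    totals = defaultdict(float)
    for item in items:
        # Items missing the field are lumped under "unknown" rather than dropped.
        totals[str(item.get(field, "unknown"))] += float(item["cost"])
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)


# Invented monthly line items: three project accounts, each paying for its own NAT gateway.
SAMPLE = [
    {"account": "proj-a", "service": "nat-gateway", "cost": 32.0},
    {"account": "proj-b", "service": "nat-gateway", "cost": 32.0},
    {"account": "proj-c", "service": "nat-gateway", "cost": 32.0},
    {"account": "proj-a", "service": "s3-requests", "cost": 18.5},
    {"account": "proj-b", "service": "custom-metrics", "cost": 9.0},
]

if __name__ == "__main__":
    for field in ("service", "account"):
        print(field, spend_by(SAMPLE, field))
```

In this made-up data the per-account NAT gateways dominate the bill, which is the multi-account trade-off pid-1 describes.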