Welcome to The Long View—where we peruse the news of the week and strip it to the essentials. Let’s work out what really matters.
This week: Birmingham looks like the Detroit of the UK—is it Oracle’s fault?
Plus: Was Toyota’s factory failure caused by running out of disk space?
1. Larry Laughs All the Way to the Bank
First up this week: Another huge local government Oracle failure. England’s Birmingham City Council (BCC) is facing a bill of $125 million for its much-delayed SAP replacement project—five times the initial budget.
Analysis: Why does anyone use Oracle?
This failed Oracle Fusion project is a big contributory factor in BCC’s catastrophic budget shortfall. Local taxpayers are likely unhappy funding yet another of Ellison’s Potemkin-esque yachts. I can’t help wondering if custom cloud code could be cheaper and more effective.
Nick Farrell: Oracle project close to bankrupting Birmingham
“Delays, cost over-runs”
The largest local authority in Europe has declared itself in financial distress after an Oracle project’s costs ballooned from $25 million to around $125.5 million. (Contributing to … the $4.3 billion revenue organisation [being] unable to balance the books is a bill … to settle equal pay claims.)
…
The Oracle Fusion cloud-based ERP system project was to replace SAP for core HR and finance functions. Since 2018 it suffered from delays, cost over-runs, and a lack of controls. The council reviewed the plan in 2019, 2020, and again in 2021, when the implementation cost for the project almost doubled.
How did it come to this? Sophie Madden: What is happening in ‘bankrupt’ Birmingham?
A new cloud-based IT system by Oracle … was supposed to cost £19m. But three years of delays in getting it in place and problems once it was installed mean it is now expected to cost £100m.
…
On Tuesday it issued … a declaration that it doesn’t have the means to meet its financial liability and cannot commit to any new spending. [It] was described as the council going “essentially bankrupt” [but] councils cannot officially go bankrupt due to services they have to provide by law. … Council leader John Cotton has insisted vital services would be protected, but warned “tough and robust decisions” were needed. … The council will hold an emergency meeting later this month.
…
Prime Minister Rishi Sunak … said it was “not the government’s job to bail out the council for its financial mismanagement” and it should “do a better job of managing the figures properly and delivering good quality services.”
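For the record, the dollar and sterling figures quoted above work out to roughly the same multiple. Here is a quick sketch of the arithmetic, using only the rounded numbers from the quotes (the function name is mine, purely for illustration):

```python
def overrun_multiple(initial: float, current: float) -> float:
    """Return the current estimate as a multiple of the initial budget."""
    if initial <= 0:
        raise ValueError("initial budget must be positive")
    return current / initial


# Figures in millions, as quoted above (rounded by the sources).
usd_multiple = overrun_multiple(25.0, 125.5)  # dollars
gbp_multiple = overrun_multiple(19.0, 100.0)  # pounds

print(f"USD: {usd_multiple:.2f}x, GBP: {gbp_multiple:.2f}x")
```

Either way, the bill comes to a little over five times the original budget.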
What’s the problem? Lindsay Clark: Oracle ERP project disaster
One insider [said] Oracle Fusion, the cloud-based ERP system [BCC] is moving to, “is not a product that is suitable for local authorities, because it’s very much geared towards a manufacturing/trading organization. … BCC customized SAP to get it working really well and apart from some minor annoyances, SAP was a good product that should never have been ditched.”
…
We asked Oracle to comment but it has yet to reply.
Hindsight is 20/20. But Szerial1 saw it all coming:
ERP solutions have a 75% failure rate: They are set up in a way that you will always spend more moving off the ERP system rather than maintaining the existing one. … This is common knowledge in IT.
As did mikhailfranco:
Laughing Larry buys another flotilla of carbon fiber hydrofoil catamaran yachts. This is Oracle’s MO.
Still, a 5x budget bloat is “close enough for government work.” But this Anonymous Coward says that’s not entirely fair:
At least 80% of listed companies seem to have **** IT strategy. … Technology strategy is really critical, and is really tough.
…
A great CIO is at least as rare as a great CEO — maybe rarer. [And] there is no excuse for the **** the big consultancy firms provide in exchange for their ridiculous fees.
Um, where’s the DevOps angle? sg_oneill brings modernized sanity:
I struggle to think of any Oracle deployments that succeeded … on budget. I don’t “get” why the enterprise guys keep falling for this ****. Hire a gang of 5-6 Django … coders (or Rails or whatever) … a front end guy and a project manager with a proven history of getting big jobs done fast—and it’ll [be] done in 5–6 months, for well under a mil.
2. Toyota Runs Out of Disk Space
Around a third of Toyota’s global manufacturing stopped for more than 48 hours last month. Now it appears the reason was that a database server filled up its allocated storage space.
Analysis: Common mode failure
And the redundant backup system also failed, because it was identical—so it suffered an identical problem. The lesson should be obvious.
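To make the lesson concrete, here is a minimal sketch of a free-space check that treats the standby as its own thing to watch, rather than assuming it is healthy because it is a copy. It is an illustration only, not anything Toyota is known to run: the function name, the 20% threshold and the example path are all my assumptions. The one real API call is Python’s `shutil.disk_usage`.

```python
import shutil
from typing import Callable, Iterable, List


def low_space_volumes(
    paths: Iterable[str],
    min_free_fraction: float = 0.20,  # assumed threshold, tune to taste
    usage: Callable = shutil.disk_usage,
) -> List[str]:
    """Return the paths whose free space is below min_free_fraction."""
    low = []
    for path in paths:
        stats = usage(path)  # has .total and .free, in bytes
        if stats.total > 0 and stats.free / stats.total < min_free_fraction:
            low.append(path)
    return low


# Hypothetical usage: in real life, list the data volumes of BOTH the
# primary and the standby. "." is just a stand-in so the sketch runs anywhere.
for path in low_space_volumes(["."]):
    print(f"WARNING: {path} is below 20% free space")
```

A check like this only helps if the alert fires early enough for someone to act, and if the standby is sized or configured differently enough that it does not fill up at the same moment.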
Andrew E. Freedman: Toyota Shut Down 14 Factories Due to ‘Insufficient Disk Space’
“One third of Toyota’s global car production”
In late August, Toyota had to shut down 28 assembly lines … due to computer issues. Today, Toyota [said] “an error occurred due to insufficient disk space,” [and that] the servers were running on the same system as its backup, causing the same issue there, so the company couldn’t make a switch.
…
Toyota … is known for its mastery of the … “just-in-time” system of managing parts—keeping only what is needed to build cars at their exact points of assembly. … But, clearly, everything falls apart when you can’t order the parts you need.
…
These Japan-based plants account for roughly one third of Toyota’s global car production. In Feb. 2022, Toyota shut down the same 14 plants due to a cyberattack.
Sorry, they did what? partomniscient thinks Toyota wasn’t prescient:
It seemed like they had a redundant system to take over in case of hardware failure. Unfortunately due to the nature of the underlying issue, the hot spare suffered an identical problem (ran out of drive space) given it was identically spec’d. Good example lesson to show that simple redundancy doesn’t make all problems go away in the case of failure.
Wait. Pause. Weren’t the volumes virtualized? person of no interest desires you to exit their grassed area:
Virtualization isn’t magic: It can’t create space that doesn’t physically exist. … The most plausible explanation here is their processes are intended to be very, very protective of data in their process control environment to the extent that a lot of copies are made and flow through the system. If you get unlucky and suffer multiple successive process failures, these copies can stack up and consume all the space. … I have seen that happen in factory environments.
…
A hardware fault happens, then a process fault (or two or three) … and suddenly data is building up while free space is dropping. At some point, a choice has to be made between downtime and absolute data integrity. Lost data is generally more expensive than downtime, so downtime becomes the choice.
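The “data is building up while free space is dropping” scenario is the kind of thing a simple trend check can flag before the hard choice arrives. Below is a rough sketch, assuming you already collect periodic free-space readings. The function name and sample values are invented for illustration, and the straight-line projection is a deliberate simplification.

```python
from typing import List, Optional, Tuple


def hours_until_full(samples: List[Tuple[float, float]]) -> Optional[float]:
    """Estimate hours until free space hits zero.

    samples: (hours_elapsed, free_gb) readings, oldest first.
    Returns None if there are too few samples or free space is not shrinking.
    """
    if len(samples) < 2:
        return None
    (t0, free0), (t1, free1) = samples[0], samples[-1]
    if t1 <= t0:
        return None
    burn_rate = (free0 - free1) / (t1 - t0)  # GB consumed per hour
    if burn_rate <= 0:
        return None
    return free1 / burn_rate


# Made-up readings: 100 GB free, then 80 GB free two hours later.
print(hours_until_full([(0.0, 100.0), (2.0, 80.0)]))
```

With those made-up readings, the volume is burning 10 GB an hour and has 80 GB left, so the estimate is eight hours. That is eight hours to fix the stuck process, rather than finding out when production stops.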
The Moral of the Story:
Comparison is the thief of joy
—Theodore Roosevelt
You have been reading The Long View by Richi Jennings. You can contact him at @RiCHi, @richij or tlv@richi.uk.
Image: Leila de Haan (via Unsplash; leveled and cropped)