TL;DR: Highly controlled change is sometimes the best answer.
For much of my career, I’ve worked in mainframe shops. I have never worked on a mainframe team, but I have worked closely with them. I’m in Northeast Wisconsin, and at one point we had the highest per-capita number of mainframes in the world, so pretty much anyone from this area can say the same. They were the “stable core” of the system for which I was the “new shiny”—and at times, that was painful.
If you’re an established enterprise, even after years of trying to eliminate the mainframe, you probably still have one. Mainframes don’t go away easily because, despite frustrations with their rate of change, they host core systems in established organizations and are reliable. I have seen the same pattern in client/server systems that are core to the business.
We are reaching a point where there are some great products available to help modernize the mainframe and/or split up client/server applications. Some, like Linux on the mainframe, have been around a while and are coming into wider adoption; some build on that more familiar foundation; and many simply make the UI more familiar to developers like me, who only ever developed in a mainframe terminal in college, if at all.
One of the main reasons that mainframes, and the applications of each later iteration of computing, are “sticky” is that they, and the systems they host, are generally reliable. No matter what you think of the timeline for updates, in most organizations the data coming out of the mainframe and other tightly controlled distributed applications is considered accurate, over most other sources.
To be clear, each successive wave of computing has produced groups like this. The team behind the client/server app that replaced your mainframe customer information system, for example, has learned over time that a fault in its system impacts the core of the business.
The goal is to increase the rate of change, without undermining the reliability. You hear a lot about throw-away instances and “fail fast,” but dropping your billing system isn’t really an option, so these need to be tempered with the knowledge of what failure implies in core systems.
Make the change, do the modernization, but listen to the lessons learned on core systems. Take into account that your informational website being down is a serious problem for marketing, but the billing system being down is a disaster for the company. There is a degree of difference, and the approach should be less about continuous deployment and more about stable deployment. Yes, it is a scale, and you can move slowly from one end toward the other to make sure you’re covering your bases.
In short, assess the risk levels of these systems before deciding how to proceed. This is pretty much common-sense advice—but better to read it and not need it, than to find out when corporate revenues are impacted.
I worked at one shop where a certain core application was using technology that was a couple of generations old and no longer supported. The pool of available developers was shrinking, but it was an important application. This type of thing is where modernization is critical; having someone say, “We don’t touch that,” is no longer acceptable. That story ends in disaster when the system fails, and no one knows how to troubleshoot—let alone fix—it.
As with so many things, the balancing act is up to you; determine your organization’s level of risk acceptance, then determine the impacts of this application being unavailable and proceed from there.
Each step of mainframe development is a little different from what developers coming out of college expect, but there are new tools and mechanisms available that these developers can take advantage of.