Over the last 20 years, software has slowly taken over the world. Businesses that historically operated on 8-hour business days now run 24×7, navigating a global economy that never sleeps. The Internet enablement of everything has turned what was originally an important support role into the lifeline of the business.
A decade ago, the Agile software development methodology put more pressure on IT by delivering software at an ever-increasing rate, to the point where innovative companies today can deploy new software multiple times a day. DevOps is at the forefront of addressing how IT deals with this pressure, but like Agile before it, there’s a learning curve.
In this two-part series, I will discuss how the IT world worked before DevOps and how development and operations teams united to adopt more Agile practices.
How things were
Before this sea change, a company would hire IT professionals who would purchase and configure servers, place them in a data center, copy mission-critical software onto them and, with some help from software engineering, get things running. After some quality checks, the IT team would lock down the data center, controlling both physical and virtual access, and “let things run.” The common way to provide stability was simply to not let anyone touch anything, as most problems were (and still are) self-inflicted wounds. Keeping the system “static” kept a lot of problems from surfacing. The software was probably not more stable; you just didn’t find the problems as fast.
In this world, IT professionals had a feast-or-famine existence. A new service deployment (usually every three to six months) was an all-hands-on-deck activity reminiscent of a rocket launch. The same manual deployment would happen again, this time with software engineering, database engineering and other teams involved. Data would be migrated, new software would be installed, and the system would be brought back up by hand. In the event of a software or data migration problem, the team could elect to roll back and bring the old software back up. If things went well, some basic end-to-end quality assurance would again take place and the new release would be unleashed on the world. More progressive companies would do this for one part of the system at a time, diverting traffic to avoid perceived downtime. A new service release would often take an entire night, and companies generally scheduled releases at night because downtime was not unusual.
Then software development changed
The competitive landscape that the Internet created, along with the now-ubiquitous SaaS business model, made companies recognize that their ability to innovate the product was a competitive advantage. A lot of companies had adopted Agile software development practices in response to the increasingly high failure rate of larger efforts. Part of Agile is breaking work into smaller development efforts that get code functional faster, reducing the size of each endeavor. Such an effort is commonly referred to as a “software sprint.” This promise of innovating the product faster was very alluring, especially to Web 1.0 companies that were now in feature battles with more nimble Web 2.0 companies that had adopted Agile more quickly. It was becoming more and more evident that the problem was shifting from creating software fast enough to deploying it fast enough.
The big problem was the deployment stage of the process. IT teams were accustomed to changing production systems every three to six months. This, combined with a trust model that largely kept software developers out of production systems, effectively nullified any Agile gains in SaaS businesses. In other words, half the product development cycle had been fixed with Agile, but the other half, deploying the software, had not yet caught up.
Operations as a changing landscape
Major changes were also occurring for operations inside companies, many of which were driven by the development side of the house.
- Virtual server technology started to get major traction within production environments. This allowed teams to treat servers as software. You no longer had to run the operating system and application on a physical server; you could run them on a virtual server (that ran on physical servers). Server configuration, which took an incredible amount of time by hand, could essentially be defined once and then cloned, the same way you would clone a file. Deploying a new server was much easier, and a corrupt server could simply be deleted and reborn in minutes. All these virtual servers could be managed by a single piece of software called a hypervisor, with which creating a new “server” took a minute when it used to take weeks.
- Puppet and Chef were born. At the same time, development organizations, recognizing this major problem with deployment, started addressing the production configuration problem with tools such as Puppet and Chef, which essentially allow software and IT teams to codify the configurations of virtual and physical servers and the deployment of new code. This allowed for repeatable processes that could be rapidly done and redone. No longer did your IT team have to install software by hand; instead, they could code it.
- Continuous integration and delivery arrived. Software teams could use newly developed tools like Jenkins to do nightly software builds, run unit tests on parts of the software and automatically deploy software into production. Historically, all of these tasks had been highly manual operations fraught with mistakes and inconsistencies. Codifying these steps made it possible to deploy software within minutes or seconds rather than hours or days.
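To make "codifying a configuration" concrete, here is a minimal Python sketch of the underlying idea: declare the desired state once as data, then run a routine that changes only what differs. This is not Puppet or Chef syntax. The `Server` class is a toy in-memory stand-in for a real machine, and the package names, file path, and `converge` function are hypothetical names chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Server:
    """Toy in-memory stand-in for a real (virtual or physical) server."""
    packages: set[str] = field(default_factory=set)
    files: dict[str, str] = field(default_factory=dict)


# Desired state, declared once as data (hypothetical package and path names).
DESIRED = {
    "packages": {"nginx", "openssl"},
    "files": {"/etc/app/app.conf": "port=8080\n"},
}


def converge(server: Server, desired: dict) -> list[str]:
    """Bring `server` to the desired state and return the changes made."""
    changes = []
    # Install only the packages that are missing.
    for pkg in sorted(desired["packages"] - server.packages):
        server.packages.add(pkg)
        changes.append(f"install {pkg}")
    # Rewrite only the files whose content differs from the declaration.
    for path, content in sorted(desired["files"].items()):
        if server.files.get(path) != content:
            server.files[path] = content
            changes.append(f"write {path}")
    return changes


if __name__ == "__main__":
    web1 = Server()
    print(converge(web1, DESIRED))  # first run applies every change
    print(converge(web1, DESIRED))  # second run finds nothing left to do
```

Because the routine compares actual state against declared state, running it a second time makes no changes, which is what lets a process like this be "rapidly done and redone" across many servers.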
Most importantly, all of these efforts were largely driven by the development side of more progressive companies. The operations side of most companies was never conceived as a “maker” culture; the handwriting was on the wall: operations had to change or it would become a victim of this revolution. I’ll take a look at some of the resulting changes in the second part of this series.