Version your infrastructure

Infrastructure is code, and like any code, you need to version it. Notice how that works: If it’s infrastructure, then it’s a fundamental part of the system; nothing’s going to work without it, which makes it valuable. So valuable that in full-stack deployments, it is part of the application. And if it’s code, then it should be under version control.

But infrastructure as code does not mean some batch file that you casually throw together to take care of a one-time task. The days when the develop-and-release cycle was littered with that kind of provisioning are long gone. In DevOps, orchestration scripts are complex, and you need to take the maintenance seriously.

Versioning? Who Needs Versioning?

First, let’s take a look at un-versioned DevOps:

Ad-hoc testing environments. An intern installs the latest binaries (with a few manual configuration adjustments) on a snowflake test machine, then tests for fixes to the last reported set of bugs. The config adjustments may or may not be saved in a temp file somewhere. Maybe your testing is no longer that primitive, but if there is no system for controlling the configuration of your test environment, and no way to keep strict track of the test results for each build, or even of the master test case documents, then your testing is still ad-hoc.

If you maintain strict manual control over test environment configuration and store master test cases and test results in a standard network location, you’re doing better. But unless the test environment and the test case for a given build can be reproduced exactly at some point in the future, your testing isn’t fully versioned.

Versioned application code only. Makefiles, compiler instructions, and batch files are wherever individual developers store them, and they’re organized, named, backed up, casually overwritten, or deleted based on whatever each developer thinks is important at the moment. You can recompile the source code for an earlier build, but the chances that you can actually duplicate the build decrease drastically with time.

With full version control, all of the files necessary to exactly duplicate the build would be stored together. For libraries and open-source resources, the versions used in the build would be recorded.
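One way to picture "stored together, with versions recorded" is a build manifest that is committed alongside the build files. The following is only a minimal sketch: the function names, the manifest layout, and the example library names and versions are all hypothetical, and a real project would more likely rely on its package manager's lock file for the dependency part.

```python
from __future__ import annotations

import hashlib
import json


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(
    build_id: str,
    files: dict[str, bytes],
    dependencies: dict[str, str],
) -> str:
    """Describe a build as deterministic JSON text.

    files:        hypothetical mapping of build-file path -> file contents
    dependencies: hypothetical mapping of library name -> exact version used
    """
    manifest = {
        "build_id": build_id,
        # A content hash identifies the exact revision of each build file.
        "files": {path: sha256_of(content) for path, content in files.items()},
        "dependencies": dependencies,
    }
    # sort_keys makes the output independent of insertion order, so the
    # manifest diffs cleanly when it is committed to version control.
    return json.dumps(manifest, sort_keys=True, indent=2)


if __name__ == "__main__":
    # Illustrative data only; none of these names come from a real project.
    print(
        build_manifest(
            "example-build-1",
            {"Makefile": b"all:\n\techo build\n", "build.sh": b"#!/bin/sh\nmake all\n"},
            {"examplelib": "1.4.2", "otherlib": "0.9.0"},
        )
    )
```

Because the output is deterministic, two manifests for the same inputs are byte-identical, and any difference between builds shows up as an ordinary diff.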

Seat-of-the-pants deployment. In un-versioned DevOps, you have scripts to cover part of your deployment, but your often-unacknowledged artifacts are still manual files that have to be moved around. If you do update your scripts, you don’t keep strict track of the updates, so you wind up grabbing what you hope is the latest version of your main deployment script, running it, and then trying to figure out what manual adjustments you still need to make. And many key steps aren’t scripted; they’re manual, and not at all obvious. They may be documented in some text file, or they may exist only as arcane lore known to a select group of initiates. Either way, they are prone to human error.

Under full versioning, every step of the deployment process is scripted, and the scripts are under formal version control. If something in a script needs to be changed, it’s saved as a new version. Data and artifacts that are likely to change frequently or with each build (program version numbers, dates, updated file names, etc.) should be pulled from the appropriate location (source code, manifests, XML datasheets, etc.) rather than being hard-coded into the scripts. This means that script version changes are likely to represent substantial changes, rather than changes in transient data.
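To make the "pull it, don’t hard-code it" point concrete, here is a small sketch in which the script holds only stable templates and reads the per-build values from a JSON manifest. The manifest keys (app, version, target) and the deploy command line are assumptions made up for illustration; they do not refer to any real tool.

```python
import json
from string import Template

# These templates are the versioned part of the script. They change only
# when the deployment process itself changes, not with every build.
ARTIFACT_NAME = Template("${app}-${version}.tar.gz")
DEPLOY_COMMAND = Template("deploy --artifact ${artifact} --target ${target}")  # hypothetical CLI


def render_deploy_command(manifest_json: str) -> str:
    """Build the deployment command line from per-build manifest data.

    Raises KeyError if the manifest lacks a required key, so a stale or
    incomplete manifest fails loudly instead of deploying the wrong file.
    """
    data = json.loads(manifest_json)
    artifact = ARTIFACT_NAME.substitute(app=data["app"], version=data["version"])
    return DEPLOY_COMMAND.substitute(artifact=artifact, target=data["target"])


if __name__ == "__main__":
    # Illustrative manifest; in practice this would be read from a file
    # produced by the build.
    manifest = '{"app": "shop", "version": "2.3.1", "target": "staging"}'
    print(render_deploy_command(manifest))
```

With this split, releasing version 2.3.2 changes the manifest but not the script, so the script’s history records only real changes to the process.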

Fully-Versioned DevOps

So, pulling all of this together, what should you expect from a fully-versioned DevOps infrastructure?

First, everything is scripted. And “everything” means “everything from compiling application source code to deployment and day-to-day operation.” All of it is controlled by scripts, and all of the scripts are available in text format (not binary-only). Many of the key DevOps tools make heavy use of scripts, of course, so to the degree that you are employing these tools, most of your main processes should already be scripted. That can still leave a variety of intermediate steps and support processes unscripted. There are script-based tools to cover all of these tasks as well, and if you are serious about versioning infrastructure, you will need to place such operations under script control.

Why would you do this if a task consists of no more than a few simple manual steps that everybody knows? Even if everybody on your team knows these steps now, there’s no guarantee that someone who needs to recreate your build-and-release process later will know about them. In fact, anyone who has spent much time working in development, QA, or operations can tell a few stories about those crucial “everybody knows” steps that were missing from a set of instructions.

So you script everything, including the simple, obvious steps. You’ll know that you’ve scripted everything when you can precisely and reliably recreate the build, the deployment, and the state of the infrastructure immediately after deployment strictly by running scripts (or better yet, one master script).
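A master script can be as plain as an ordered list of steps that stops at the first failure. The sketch below assumes each step is a command line; the step names and the placeholder commands (which just invoke the Python interpreter) are hypothetical stand-ins for your real build, test, and deployment scripts.

```python
import subprocess
import sys


def run_pipeline(steps):
    """Run (name, argv) steps in order and return the names that completed.

    Stops at the first step that exits non-zero and raises RuntimeError,
    so a half-finished deployment is never reported as a success.
    """
    completed = []
    for name, argv in steps:
        result = subprocess.run(argv)
        if result.returncode != 0:
            raise RuntimeError(
                f"step {name!r} failed with exit code {result.returncode}"
            )
        completed.append(name)
    return completed


if __name__ == "__main__":
    # Placeholder commands; replace each argv with the real versioned script.
    pipeline = [
        ("compile", [sys.executable, "-c", "print('compiling')"]),
        ("test", [sys.executable, "-c", "print('testing')"]),
        ("deploy", [sys.executable, "-c", "print('deploying')"]),
    ]
    print("completed:", run_pipeline(pipeline))
```

Even the "everybody knows" steps get an entry in the list, so the order and the exact commands are recorded in one place that can itself be versioned.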

Once everything is scripted, you need to store the scripts in a standard version-control repository. The repository that you use is up to you (or your organization), but it should have all of the features needed for full version control, because you are going to treat every script, no matter how seemingly trivial its function, as if it were an important piece of source code. The goal is to be able to pull all of the infrastructure code in the same way that you would pull the application source code.

At some point in the not-at-all-distant future, questions like “Should I version my infrastructure?” will no longer even make sense, because versioning will be built into the tools that you use to manage infrastructure, and it will be hard to imagine how anyone ever got along without infrastructure version control. Until then, however, we will all need to take those few extra steps required to place DevOps infrastructure under version control.