A high-functioning, enterprise-class DevOps practice has full automation as its fundamental goal. To be honest, having the right DevOps strategist (or visionary) leading the effort is as important to achieving this goal as all of the challenges and topics that must be addressed combined. So for the sake of argument, we will assume you have one (if not, feel free to contact me; I am looking at the moment 🙂 ).
With key leadership in place, the first target is always the automation of the build and deploy process. This could easily be considered the heart of DevOps. But once that automation has been achieved, the trickier topic of release orchestration begins to emerge. Strategy is to release orchestration what taste is to food: having the right one makes all the difference.
Before our code can progress to production, the inherent assumption is that all testing has been completed successfully. Sounds like common sense. But how much of that testing occurs with human hands on it? The user acceptance testing (UAT) discipline inherently requires a human at the wheel. The reality is that every time a human must interact with the progression of code from the development-class environment through all the testing environments (integration, performance, user acceptance and QA) on its way to production, the momentum of innovation slows, all the way down to human speed.
The Complexity of Testing Types
The first automated testing occurs in development (DEV) class environments, where the programmer is under strict instructions to practice test-driven development. This requires new feature/functional testing as part of the build or deployment automation into DEV. Neither the build nor the deployment can be considered successful if the accompanying tests do not pass. This also represents our first “gate.” Sophisticated release orchestration (RO) software can examine “flags” set during the build or deployment process to see whether the testing has completed successfully. If so, the code can progress to the next testing-level environment within the software development life cycle (SDLC).
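To make the idea of a gate concrete, here is a minimal sketch in Python, assuming the build or deployment process writes its testing flags to a small JSON file. The flag names and file layout here are hypothetical; real RO products have their own mechanisms for this.

```python
# Minimal sketch of a progression gate. Assumes the build/deploy
# automation wrote a flags file such as:
#   {"unit_tests": "pass", "feature_tests": "pass"}
# The flag names and file layout are hypothetical.
import json
import sys

def gate_passed(flags_path: str, required: list[str]) -> bool:
    """Return True only if every required flag is set to 'pass'."""
    with open(flags_path) as f:
        flags = json.load(f)
    return all(flags.get(name) == "pass" for name in required)

if __name__ == "__main__":
    # Progress the version only when all DEV-level gates are green.
    if gate_passed("build_flags.json", ["unit_tests", "feature_tests"]):
        print("Gate passed: promoting to the next environment.")
    else:
        sys.exit("Gate failed: version stays in DEV.")
```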
Another useful type of testing completed early in the SDLC (most often in DEV) is the examination of code coverage. Development organizations can measure how much of their code is actually put to the test, and set a threshold percentage as a minimum standard the target application must meet before it is allowed to progress. Adding alternate types of testing in DEV-class environments does improve the quality of the innovation effort. However, it also brings increased labor to automate both the progression and the audit of the version under test.
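As a quick illustration of such a minimum standard, the sketch below checks a measured coverage percentage against a threshold; the 70 percent floor is just an example of the arbitrary standard described above.

```python
# Sketch of a coverage gate, assuming the coverage tooling reports
# covered and total line counts for the build. The 70% floor is an
# example of an organization's chosen minimum standard.
MIN_COVERAGE = 70.0

def coverage_gate(covered_lines: int, total_lines: int) -> bool:
    """Pass only when measured coverage meets the minimum standard."""
    pct = 100.0 * covered_lines / total_lines
    print(f"Coverage: {pct:.1f}% (minimum {MIN_COVERAGE}%)")
    return pct >= MIN_COVERAGE

# Example: 812 of 1,093 lines exercised -> 74.3%, so the gate passes.
assert coverage_gate(812, 1093)
```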
For instance, feature/function testing may use a single repository to store the results for a given version of the code. I go to my testing repository and look up the results for version 1.4 of my application, or version 1.5, or any other version I choose. Should a regulatory official or internal auditor ever ask me to prove the quality of version 1.5 of my app, I need only consult my testing repository and show them how many tests were executed, and how many passed or failed, before that version of my app was ever deployed into production.
When you add code coverage testing to the process, I now need to consult two different testing repositories and produce two separate reports to assess the quality of that same version of my app. There is nothing wrong with this, but it creates more work than a single testing repository would. The problem only grows as we keep adding tools to prove quality. Security testing, for example (ensuring code does not have inherent security holes), tends to use different tools or, worse, processes that require human peer observation, which again adds effort to create the automation and the traceability for audit. These are critical types of testing (and/or processes), so they must be done, but they come with an inherent cost in automation effort.
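The sketch below illustrates that audit burden: proving the quality of one version now means querying each repository and merging the answers into a single report. The three in-memory dictionaries are stand-ins for whatever tools or databases actually hold the results.

```python
# Sketch of the multi-repository audit problem. Each dict stands in
# for a separate testing repository keyed by application version.
feature_repo = {"1.5": {"executed": 142, "passed": 140, "failed": 2}}
coverage_repo = {"1.5": {"coverage_pct": 81.4}}
security_repo = {"1.5": {"findings": 0, "reviewed_by": "peer review"}}

def audit_report(version: str) -> dict:
    """Merge per-repository results into one report for an auditor."""
    return {
        "version": version,
        "feature_tests": feature_repo.get(version),
        "code_coverage": coverage_repo.get(version),
        "security": security_repo.get(version),
    }

# One call per version under scrutiny, e.g. version 1.5 of my app.
print(audit_report("1.5"))
```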
Small shops may not worry about these kinds of concerns, but large organizations will, particularly if they are still transitioning from legacy waterfall methods (where controls were high) to new agile methods (which can feel to auditors like the wild, wild west). Compiling a test log that matches the version of code under scrutiny, such that it can be reproduced automatically, will restore the faith of both internal auditors and regulatory officials who come asking. But each new type of testing tool requires more effort to maintain the ability to compile those test logs.
The Complexity of Testing Responses
Next, consider the impact of non-binary responses on automation efforts for gated code progression. Does my feature work (yes/no)? This is a binary evaluation, a true/false that makes it easy to set my progression flag from the testing response. It is another matter to evaluate the results of my code coverage and ensure the app is above 70 percent, or between 70 percent and 90 percent. These kinds of questions are not binary (in other words, there is a real result to compare against, not just a pass/fail), and they require further effort to reduce to a binary flag for the gated pass/fail. And when the evaluation question does not intend a binary answer at all, such as a grading of 1 to 5 where 1 equals horrible and 5 equals outstanding, I am most often forcing a human to make an intelligent decision as to whether the code should progress.
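Here is a sketch of what reducing these responses to a gate flag looks like, mirroring the three cases above: a yes/no feature check, a numeric result compared against a band, and a 1-to-5 grade that has no defensible automated cut-off and so escalates to a human. The function names are mine, not any particular tool's.

```python
# Sketch: turning different response types into the flag a gate needs.

def feature_flag(passed: bool) -> str:
    """Already binary: yes/no maps straight to pass/fail."""
    return "pass" if passed else "fail"

def range_flag(coverage_pct: float, low: float = 70.0,
               high: float = 90.0) -> str:
    """A real numeric result compared to a band, then forced to binary."""
    return "pass" if low <= coverage_pct <= high else "fail"

def grade_flag(grade: int) -> str:
    """1 = horrible ... 5 = outstanding. No cut-off is defensible
    without a policy decision, so escalate to a human reviewer."""
    return "needs-human-review"

print(feature_flag(True), range_flag(81.4), grade_flag(3))
```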
Multiple testing repositories require additional automation effort. Multiple kinds of responses require additional automation effort. And the standards by which the gates pass or fail may change over time, requiring the goals in force at the time the tests were taken to be tracked as well. Creating independent test automation that can be called from either the build process or the deployment process is wise. Associating that test automation (whether scripts, simulations, etc.) with the version of the application code is wise. And working to reduce the number of repositories to examine is wise. Decoupling test automation from the build also allows it to be re-executed in more advanced testing environments when the need arises.
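One way to track the goals at the time the tests were taken is to record the standard alongside the result, as in this sketch (the field names are hypothetical):

```python
# Sketch: store the goal in force on the day of the test next to the
# result, so an auditor sees both the score and the standard it was
# judged against, even if the standard has since changed.
from datetime import datetime, timezone

def record_result(app_version: str, metric: str,
                  value: float, goal_at_test_time: float) -> dict:
    return {
        "app_version": app_version,
        "metric": metric,
        "value": value,
        "goal": goal_at_test_time,   # the standard in force that day
        "passed": value >= goal_at_test_time,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }

# Version 1.5 was tested while the coverage floor was still 70%.
print(record_result("1.5", "code_coverage_pct", 81.4, 70.0))
```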
Charting the Course to Nirvana
RO benefits from as full a level of automation as possible. I will discuss the RO topic more fully in another article, but the foundation of automated change progression is a gated history of proven quality. As long as the code under review continues to pass testing, setting progression flags to green or yes or pass, the RO software can continue to move it along its way to production. Failing code must be sent back to the DEV environments, where development teams can address the issues before it makes its way through the automated process once again from beginning to end.
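Put together, the progression logic is essentially a loop over the gated environments, with any failure routing the version back to DEV. A stripped-down sketch, with stand-in gate functions and the environment names from earlier:

```python
# Sketch of gated progression through the SDLC environments named
# above. Each environment's gate stands in for its automated test
# suite; the first failure sends the version back to DEV.
ENVIRONMENTS = ["DEV", "INTEGRATION", "PERFORMANCE", "UAT", "QA"]

def run_gate(env: str, version: str) -> bool:
    """Stand-in for the environment's automated test suite."""
    print(f"Running {env} gates for version {version}...")
    return True  # assume green for the sake of the walk-through

def progress_to_production(version: str) -> None:
    for env in ENVIRONMENTS:
        if not run_gate(env, version):
            print(f"Version {version} failed in {env}: back to DEV.")
            return
    print(f"Version {version} passed every gate: releasing to production.")

progress_to_production("1.5")
```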
If the organization has made the intentional decision to keep humans substantially involved in the testing and evaluation cycle, it must realize a few impacts of that decision. Daily movement to production becomes less likely—most often it takes a week or two for changes to undergo significant human scrutiny (depending on the level of testing automation). The results of human intelligent decisions also must be chronicled in the testing repository to preserve the history for audit. This can lead to a “blame game” kind of thinking when something goes wrong, as it is immediately attached to “who” approved it, instead of to “why” they did so. And obviously the cost of labor is always the highest cost element in any innovation cycle, so buyer beware in this scenario.
On the other hand, “Nirvana” is the state when all testing (except UAT) is fully automated and gated, and provides the foundation for RO software to move code into production as often as it can prove quality. This could mean fully automated releases instead of release weekends. This could mean multiple releases per day into production. This could mean that releases themselves are no longer considered “events,” but by default are now “non-events.” Nirvana does not endanger the lives of change and release managers—it actually gives them one. For a “change” (pardon the pun), they get to discover something they vaguely remember … a life outside of work.
To continue the conversation, feel free to contact me.