Relevant Continuous Testing: The Primary Key to DevOps

My recent blog, “7 Pillars of DevOps,” identified seven essential best-practice areas that make up successful DevOps and explained that the pillars are not silos, but rather foundations that need to be kept in balance. One of the pillars is continuous testing (CT).

For well-balanced DevOps, CT performs tests that are fast enough to fit within the time budget, yet thorough enough to cover the quality needed for deployment to production. Despite the need for balance, CT can be considered the true backbone of DevOps because testing is the one pillar that proves changes are valid and verifies that the other pillars operate correctly from end to end across the pipeline.

Compared to older testing strategies in which large software changes were tested as a complete product after “release to QA,” DevOps CT strategically tests changes incrementally over a series of DevOps stages. CT begins the moment a change is made in development and continues through all DevOps pipeline stages. Release readiness is judged on analysis of test results accumulated prior to deployment. With Blue-Green, A/B and Canary test strategies, some tests may continue into the live environment after deployment, and the results may influence future CT activities.

Like continuous integration (CI), frequent, small, incremental testing has many merits. Quick feedback on problems enables a release consisting of many individual changes to be tested incrementally with confidence. The root cause of problems can be isolated much faster when the changes are tested incrementally. The challenge is determining which tests are most relevant to run at each stage of a DevOps pipeline, because the content and risk of changes vary widely from one change to the next.

Given that testing is a stochastic sampling process, there is no such thing as 100 percent testing in any absolute sense. Left unchecked, the content of tests will grow exponentially beyond the content of the product that is being tested. No amount of test resources, even if elastic on-demand cloud infrastructures are used, will keep up with the growing demands to satisfy test requirements. The key is to create and choose tests that are most relevant to the changes that are being processed in the pipeline, rather than attempting to run all tests with no regard to what the changes actually are. A strategy that chooses the most relevant tests at each stage in the pipeline is optimal.
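To make the idea of fitting relevant tests into a stage's time budget concrete, here is a minimal Python sketch. It assumes each test already carries a relevance score and an expected run time; the names `CandidateTest` and `select_within_budget`, the scores and the durations are all hypothetical and only for illustration. The greedy ranking by relevance per second is one simple heuristic, not a prescribed CT algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateTest:
    name: str
    relevance: float  # hypothetical relevance score for the current change, 0..1
    seconds: float    # expected run time in seconds, assumed to be > 0


def select_within_budget(tests, budget_seconds):
    """Greedily pick tests with the best relevance per second of run time,
    skipping any test that would push the stage past its time budget."""
    ranked = sorted(tests, key=lambda t: t.relevance / t.seconds, reverse=True)
    chosen, used = [], 0.0
    for test in ranked:
        if used + test.seconds <= budget_seconds:
            chosen.append(test)
            used += test.seconds
    return chosen


# Illustrative data only; a real pipeline would supply its own scores.
tests = [
    CandidateTest("login_smoke", relevance=0.9, seconds=30),
    CandidateTest("checkout_e2e", relevance=0.8, seconds=300),
    CandidateTest("report_export", relevance=0.1, seconds=200),
    CandidateTest("cart_unit", relevance=0.6, seconds=5),
]

stage_plan = select_within_budget(tests, budget_seconds=180)
print([t.name for t in stage_plan])  # ['cart_unit', 'login_smoke']
```

With a 180-second budget, the long end-to-end test is deferred to a later stage with a larger budget, which mirrors the incremental, stage-by-stage testing described above.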

Several strategies for creating and selecting relevant tests are described below. These relevant test strategies may be used standalone for specific DevOps stages or a plurality of these relevant test strategies may be used end to end across the pipeline.

Choose tests that focus on source code deficiencies. Metrics based on static analysis scans can determine which areas of source code have the highest number of defects. Test creation and selection algorithms can provide higher weight to tests written for those areas. This seems like a good idea; however, there are several limitations. Scan tools are limited to specific languages, may miss critical defects and do not have intelligence regarding the systemwide relevance of a specific defect.
Create and choose tests for areas of the code with low test coverage. This also sounds like a good idea, but it is not. The most obvious reason is that tests probably are not available to select, or the coverage would have been higher in the first place. Also, code coverage algorithms, especially at the system level, are limited to simple metrics such as statement coverage, which is not a good indicator of system-level performance.
Create and choose tests that match the artifacts of build changes including build dependencies. In this approach, tests are mapped to build changes and tests are selected that match the changes. This sounds logical, but unfortunately, a static mapping between test and build artifacts may not reflect system impacts of changes, as critical failures may occur due to interactions with modules outside the build dependency chain.
Create and choose tests per risk analysis. This approach keeps a running record of code changes correlated to test results. Tests that tend to fail the most when a code module is changed are selected. This approach has several advantages over the other approaches. Test selection is not limited to implementation language or build dependency maps. Empirical experience correlates to risk better than code structures in the stochastic context of testing. The same algorithms can be applied to every stage in the pipeline. And the pipeline metrics can be adjusted dynamically for each pipeline stage run.
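The risk-analysis approach can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the class `RiskHistory`, its methods, and the module and test names are hypothetical, and "risk" is reduced here to the observed failure rate of a test when a given module has changed.

```python
from collections import defaultdict


class RiskHistory:
    """Running record of how often each test failed after a module changed."""

    def __init__(self):
        # (module, test) -> [runs, failures]
        self._stats = defaultdict(lambda: [0, 0])

    def record(self, changed_modules, test, failed):
        """Store one test result against every module changed in that run."""
        for module in changed_modules:
            stats = self._stats[(module, test)]
            stats[0] += 1
            stats[1] += int(failed)

    def failure_rate(self, module, test):
        """Fraction of recorded runs in which the test failed for this module."""
        runs, failures = self._stats.get((module, test), (0, 0))
        return failures / runs if runs else 0.0

    def select(self, changed_modules, tests, top_n):
        """Return the top_n tests with the highest failure rate across the
        changed modules; ties keep the order in which tests were given."""
        def risk(test):
            return max(
                (self.failure_rate(m, test) for m in changed_modules),
                default=0.0,
            )
        return sorted(tests, key=risk, reverse=True)[:top_n]


# Illustrative history only.
history = RiskHistory()
history.record(["billing"], "test_invoice", failed=True)
history.record(["billing"], "test_invoice", failed=False)
history.record(["billing"], "test_login", failed=False)
history.record(["auth"], "test_login", failed=True)

picked = history.select(
    ["billing"], ["test_login", "test_invoice", "test_search"], top_n=2
)
print(picked)  # ['test_invoice', 'test_login']
```

Because the record is keyed only by module and test names, the same selection logic can run at any pipeline stage and does not depend on the implementation language or a build dependency map. The `top_n` cutoff is one knob that could be tuned per stage.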

The above CT practices indicate that relevant testing strategies reduce both the time and the infrastructure cost of CT execution, without sacrificing the quality delivered to production.

Summary

What do you think? Do you agree with the above relevant continuous testing practices? Are there others you recommend?

While DevOps implemented with all seven pillars provides a strong foundation for long-term enterprise business improvements, it is important to understand DevOps is not an island. Enterprises implementing DevOps should be aware that DevOps interoperates with other IT systems and practices. Enterprises are well-advised to put in place governance policies that encourage the selection of tool-agnostic IT partners with solutions that best suit the needs of each unique enterprise and can integrate and evolve DevOps together with all their IT systems.

— Marc Hornbeek