Isn't Continuous Testing a Longevity Test Anti-pattern?

According to Jim Coplien, “an anti-pattern is something that looks like a good idea, but which backfires badly when applied.” In a DevOps context, continuous testing is an attractive solution for the narrow purpose of testing specific software components quickly (in cycles measured in minutes or hours). It may be “bad,” however, if the system that the software belongs to has failure modes (e.g., the effects of memory leaks) that are only observed during long-term testing over several days, and if the time between releases is longer than that. In other words, the pattern of quick test cycles prescribed by continuous testing appears to be an anti-pattern with respect to the fundamental concept of long-duration “longevity” testing. So what well-formulated positive patterns are applicable in its stead?

In my prior blog post, “From QA to Continuous Testing,” I wrote: “QA test experts that are most skilled in the best practices of test design are valued because they will produce the best quality products.” QA test experts assert that it is unwise to release a software product that has not been tested for durations comparable to the lifecycle of a release. Only tests that run for a long time relative to the release cadence are deemed sufficient. Just how long is long enough? That question is beyond the scope of this short article, but my own observation is that best-practice test environments run accelerated level load and impulse load traffic patterns on a build candidate for at least 48 hours before the release is considered sufficiently tested for even limited deployment. Even 48 hours is much longer than a typical “continuous testing” cycle. So what positive continuous longevity testing patterns are applicable in its stead?

“Continuous Longevity Testing Patterns” may sound like an oxymoron, but in fact longevity testing is compatible with continuous testing principles. Below I describe three Continuous Longevity Testing Patterns that are sometimes used.

Pattern 1: Time windows, such as weekends, are reserved for longevity testing. In this pattern the regular Continuous Integration/Test cycles are interrupted during the weekend, and longevity tests are run only on the last build created before the weekend window. The advantage is that no test resources are needed beyond those already used for regular CT tests, and no sophisticated scheduling is needed to orchestrate the test environment. The disadvantages are that longevity test results are only available weekly, that it is hard to diagnose which build caused a longevity failure (a week of changes is tested at once), and that a problem with a release candidate delays the release by a week, until the next longevity run validates the fix.
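As a rough illustration of Pattern 1, the sketch below decides whether a given moment falls inside the reserved window and picks the last build created before that window opened. The Friday 18:00 to Monday 06:00 window, the function names, and the build IDs are all assumptions made for the example, not part of any particular CI tool.

```python
from datetime import datetime, timedelta

# Assumed window for illustration: Friday 18:00 to Monday 06:00 (60 hours).
WINDOW_START_WEEKDAY = 4            # Friday (Monday is 0)
WINDOW_START_HOUR = 18
WINDOW_LENGTH = timedelta(hours=60)


def window_start_for(now):
    """Return the start of the most recent weekend window at or before `now`."""
    days_back = (now.weekday() - WINDOW_START_WEEKDAY) % 7
    start = (now - timedelta(days=days_back)).replace(
        hour=WINDOW_START_HOUR, minute=0, second=0, microsecond=0
    )
    if start > now:                 # Friday, but before 18:00: use last week's window
        start -= timedelta(days=7)
    return start


def in_longevity_window(now):
    """True while regular CT cycles should be paused for the longevity run."""
    return now - window_start_for(now) < WINDOW_LENGTH


def build_for_longevity(builds, now):
    """Pick the last build created before the window opened.

    `builds` is a list of (build_id, created_at) pairs; returns None if empty.
    """
    start = window_start_for(now)
    candidates = [(created, build_id) for build_id, created in builds if created <= start]
    return max(candidates)[1] if candidates else None
```

Because only one build per week is selected, every change that landed since the previous window is tested together, which is the diagnostic weakness described above.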

Pattern 2: Longevity tests are scheduled frequently and run in parallel with regular testing, perhaps as often as every few builds, or perhaps daily. This pattern has two advantages. The regular Continuous Integration/Test cycles are never interrupted (they run 24/7), and longevity test results are available for more builds than in Pattern 1, which reduces diagnostic complexity and may reduce release delays. The disadvantage is that separate test resources must be reserved for each longevity test running in parallel with the regular CT tests. For complex systems under test, with many test targets and lots of builds, this can be very expensive, so there is a tradeoff between the cost of test resources and the cost of delays if longevity tests are not run frequently enough.

Pattern 3: Intelligent longevity testing is started only when certain conditions trigger it, according to smart rules that correlate software changes with the risk of longevity problems. This pattern has the same advantages as Pattern 2, with the further advantage that the cost of test resources is usually much lower, provided the trigger rules don’t fire too frequently. The disadvantage is that the entire scheme depends on building a rules engine, and rules that capture how risky a change is for long-duration behavior. If the rules are too lenient, this pattern may allow too many longevity failures to escape; if the rules are too tight, it is no better than Pattern 2 and yet more complex to implement.
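To make the idea of trigger rules concrete, here is a minimal sketch of a weighted rule set. Everything in it is hypothetical: the `Change` fields, the source paths, the weights, and the threshold are invented for illustration, and a real rules engine would need rules derived from the history of the system under test.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Change:
    """Hypothetical summary of what a build changed."""
    paths: List[str]
    lines_changed: int
    dependency_bump: bool = False


@dataclass
class Rule:
    name: str
    weight: int
    applies: Callable[[Change], bool]


# Illustrative rules only; paths and weights are assumptions.
RULES = [
    Rule("touches resource-management code", 3,
         lambda c: any(p.startswith(("src/memory/", "src/pool/")) for p in c.paths)),
    Rule("third-party dependency changed", 2, lambda c: c.dependency_bump),
    Rule("large change", 1, lambda c: c.lines_changed > 500),
]


def longevity_decision(change, threshold=3):
    """Return (should_run, names_of_rules_that_fired) for one build's change."""
    fired = [rule for rule in RULES if rule.applies(change)]
    score = sum(rule.weight for rule in fired)
    return score >= threshold, [rule.name for rule in fired]


docs_only = Change(paths=["docs/readme.txt"], lines_changed=10)
allocator_fix = Change(paths=["src/memory/alloc.c"], lines_changed=20)
print(longevity_decision(docs_only))       # (False, [])
print(longevity_decision(allocator_fix))   # (True, ['touches resource-management code'])
```

The `threshold` parameter is where the lenient-versus-tight tradeoff shows up: raising it lets more risky builds skip longevity testing, while lowering it toward zero makes nearly every build trigger a run, which converges on Pattern 2.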

So which Continuous Longevity Testing Pattern is best? The best answer is the one that best matches the organization’s goals for release cadence, cost of testing, and risk.

So indeed, Continuous Testing is a Longevity Test Anti-pattern, but fortunately DevOps infrastructures implemented according to best practices have a variety of complementary Continuous Longevity Testing Patterns to choose from.

At Spirent we think testing has a bright future in DevOps. You can read more about our views at Spirent.com/solutions/devops

What do you think of these patterns? Do you have others that should be mentioned?