I am always amazed at how many practitioners (or, rather, their managers) tout the benefits of test-driven development (TDD). TDD inherently changes the role of the traditional developer into that of developer and tester. The implication seems to be that as long as I can construct a test for the innovation I wish to create, all is well in the universe: I can proclaim the full success of automated test integration with my DevOps processes and sleep quite well at night from then on.
But in the world before yesterday, testers tended to think differently than developers do. They evolved differently, bearing the scars of applications that tested just fine on the developer’s laptop but somehow died under the myriad circumstances of the world beyond that original safe harbor. In point of fact, human testers armed with a variety of automated tools very often “proved” to developers that their sanctified code was not so pious after all, but rather more “holy” (that is, full of holes) than they thought. Thus the cycle of code/test/code again remained, until the gauntlet to production was navigated successfully.
There is also a certain reality about testing that developers cannot easily overcome. The only category of testing that can easily accompany a new feature or function is functional testing. “Did the function work?” is the kind of test a developer can reasonably construct to accompany his or her code. Notwithstanding the fox-in-the-henhouse analogy, there are quite a few other kinds of tests that are equally important as that code moves forward to production, and most of them are not easily constructed by the original developers.
One might consider performance testing a bit important. In that case, I am concerned not only with the validity of my new feature but also with its performance (timing) as a user would experience it, under the conditions of my production environment. For that to be measured properly, I may need to ensure that all integration testing has also been completed successfully. And assuming all of this completes as expected, I may still want users to test my app, just to see whether it is actually what they wanted in reality (after decrypting the requirements from the ancient Egyptian notes of the stories they submitted). Beyond this, some functions appear largely in a batch context, or perhaps only when proximity is achieved in a mobile app, so context is equally important to testing.
The Nuances of Integrated Testing
But getting back to our original question, let us assume I can test something, and let us assume I have tested it. Where do we go from here? Once the validity of my new feature has been tested, should I decouple the test execution from the build automation and move it to a list of regression tests I only execute periodically? That would seem logical, as this test might need to be run again during post-integration testing (particularly if it manipulates data with upstream dependencies). What is the “value” of my test past that first successful occurrence? If it has value, then I must begin to look at all of my testing in the same way I look at code: It needs to be version-controlled.
Pairing functional tests to the versions of the code they validate allows me to roll test automation forward or backward to match the version of code in the target testing environment. For example, if build 1.5 of my code is in the integration environment and it fails, I can roll back to build 1.4 and concurrently roll back to the tests that were valid for that older version as well. That is better than using tests designed for version 1.5 of my code, because testing new-feature validity against old code is sure to generate errors.
But treating test scripts (or test entities) with the same version control discipline you apply to code is something most organizations have yet to get their heads around. Deploying testing as part of the post-build deployment automation is equally daunting.
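To make the idea concrete, here is a minimal sketch in Python. It assumes a test repository tagged with the same version numbers as the application builds; the repository path, tag scheme, TARGET_URL variable and pytest invocation are illustrative rather than any particular product’s conventions. The post-build step checks out the test tag that matches the deployed code and runs it, so rolling an environment back from 1.5 to 1.4 rolls the tests back as well.

```python
import os
import subprocess

def checkout_tests_for_build(build_version: str, test_repo_dir: str) -> None:
    """Check out the test-suite tag matching the deployed code version.

    Assumes the test repo is tagged per build (e.g. "tests-1.4", "tests-1.5"),
    a hypothetical convention used only for this sketch.
    """
    subprocess.run(["git", "-C", test_repo_dir, "fetch", "--tags"], check=True)
    subprocess.run(["git", "-C", test_repo_dir, "checkout", f"tests-{build_version}"], check=True)

def run_post_deploy_tests(build_version: str, env_url: str, test_repo_dir: str) -> bool:
    """Run the version-matched suite against the environment that was just deployed."""
    checkout_tests_for_build(build_version, test_repo_dir)
    env = dict(os.environ, TARGET_URL=env_url)   # tests read their target from an env var
    result = subprocess.run(["pytest", test_repo_dir], env=env)
    return result.returncode == 0                # gate promotion on the versioned suite passing

# Rolling integration back from build 1.5 to 1.4 brings the 1.4 tests back with it:
# run_post_deploy_tests("1.4", "https://integration.example.com", "/srv/regression-tests")
```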
And, of course, none of this addresses the more fundamental question: Does my test have value at all? Just because I can test a feature or function, should I? Is this feature or function critical to the user experience? Am I testing it in such a way that the user experience will be evaluated? Testing whether a date field allows only valid data to be entered into it seems important the first time we do it. But once the characteristics of a date field have been so ubiquitously defined by software libraries, modern programming languages and database structures, continuing to validate that input is arguably excessive and provides diminishing value. Testing the right thing matters as much as testing at all.
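For a hypothetical illustration of that diminishing value, consider a test like the one below: it re-verifies date validation that the language’s standard library already guarantees, so every run after the first tells us almost nothing about our application or its users.

```python
# A hypothetical low-value test: it re-checks behavior Python's standard
# datetime library already guarantees, rather than anything specific to our app.
from datetime import date

import pytest

def test_date_field_rejects_invalid_day():
    with pytest.raises(ValueError):
        date(2023, 2, 30)   # the standard library already refuses February 30
```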
Then let us assume we have found the Rosetta stone of items to test and we have created an ability to test this feature with scripts passed down from Olympus. If the continued validity or performance characteristics of this feature remain important to the user experience, perhaps we have found not only something to test but also something to monitor. Here is where we begin to think two years into the future. If part of my DevOps continuum is an ability to integrate version-controlled testing as a capability, why not integrate version-controlled monitoring as well? (Here’s where the Ops guys of DevOps finally get interested.)
Instead of examining testing only from the perspective of the developer chained to the dictates of TDD, why not examine the life cycle of “evaluation” more holistically, spanning coding, integration, performance, users and, ultimately, monitoring? If my monitoring tools are capable of configuration adjustment, parameter adjustment and ceiling- and basement-level adjustment, whether by reading external files or by exploiting APIs, shouldn’t we be looking to pair our monitoring profiles, discretely by version, with the applications we deploy into production? Perhaps a less useful test simply gets retired, while a more useful test evolves into a program characteristic we monitor until the feature itself is retired.
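As a rough sketch of what pairing monitoring profiles with application versions might look like, assume the monitoring tool exposes an HTTP API for threshold adjustment; the endpoint, the JSON profile layout and the file paths below are invented for illustration, not any vendor’s actual interface.

```python
# Illustrative sketch: keep monitoring profiles under version control alongside
# the application, and apply the profile that matches the deployed version.
import json
import urllib.request

def apply_monitoring_profile(app_version: str, monitor_api: str) -> None:
    """Load the version-matched profile (ceilings, floors, parameters) and push it."""
    with open(f"monitoring/profiles/{app_version}.json") as f:
        profile = json.load(f)                  # e.g. {"response_ms_ceiling": 800, ...}
    req = urllib.request.Request(
        f"{monitor_api}/thresholds",            # assumed endpoint, for illustration only
        data=json.dumps(profile).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    urllib.request.urlopen(req)

# Deploying application version 1.5 would then also "deploy" its monitoring profile:
# apply_monitoring_profile("1.5", "https://monitoring.example.com/api")
```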
Integrating the Ops team into discussions about automated monitoring is sure to pique their interest. If nothing else, the ability to turn off monitoring tools during deployments and turn them on again afterward will cut down on “noise” and help focus resources only where they are truly needed when alerts emerge. Understanding the difference between how production is monitored and the unique needs of the test environments, with the ability to incorporate both into DevOps automation, is equally appealing.
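Here is a minimal sketch of that deploy-time muting, again assuming a generic maintenance-window API rather than any specific monitoring product.

```python
# Sketch of muting alerts for the duration of a deployment and restoring them
# afterward. The /maintenance endpoint stands in for whatever silencing or
# maintenance-window API your monitoring platform actually provides.
import json
import urllib.request
from contextlib import contextmanager

def _set_maintenance(monitor_api: str, enabled: bool) -> None:
    req = urllib.request.Request(
        f"{monitor_api}/maintenance",
        data=json.dumps({"maintenance": enabled}).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    urllib.request.urlopen(req)

@contextmanager
def alerts_muted(monitor_api: str):
    """Turn monitoring noise off on entry and back on at exit, even if the deploy fails."""
    _set_maintenance(monitor_api, True)
    try:
        yield
    finally:
        _set_maintenance(monitor_api, False)

# Usage (the deployment function is hypothetical):
# with alerts_muted("https://monitoring.example.com/api"):
#     run_deployment()
```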
In any case, the evaluation of an application includes not only the original tests that ensure validity, but also the ongoing monitoring that ensures it continues to perform as expected. Thinking about where tests end and where monitoring is more effective or needed can help keep both disciplines modest and focused only on what is critical, rather than each attempting to take on the weight of the world. The goal of DevOps is not to increase the frequency and duration of headaches; it is to eliminate them.
To continue the conversation, feel free to contact me.