My first article in this series focused on the need for continuous improvement across three key dimensions of software development: velocity, quality and efficiency. While these are all equally important, in this piece we’re going to focus more specifically on quality.
The past few months—what you might call the summer of outages—have been troublesome for many of the internet’s biggest players. Google, Apple and Cloudflare, as well as Facebook, Twitter and LinkedIn, have all experienced bouts of unplanned downtime, sometimes even multiple bouts in quick succession. Is this all just a matter of coincidence and bad luck? Perhaps, but peeling back the layers reveals an increasingly painful reality of modern DevOps life.
While the root causes of these outages were varied, several were blamed on software glitches. This suggests software rollouts may be happening before they are comprehensively battle-tested and ready for production-level systems. This is never a good idea, especially when you’re dealing with high-profile services supporting millions of users worldwide. In these cases, software-related stumbles are that much more damaging.
When software is rolled out prematurely, a cardinal rule of development is broken: we, the end-users, become unwilling guinea pigs for poor-quality applications. No matter the pace of rollouts, an unwavering focus on quality needs to be maintained, and automation, integration and measurement can be the keys to ensuring this.
Automation
Automation refers to a pre-programmed process that executes tasks with minimal human intervention. The more you automate, the higher your quality should be. Reducing manual intervention decreases the likelihood of suboptimal techniques or human error, both of which can lead to poorer application quality and lower customer satisfaction. Essentially, automation enables the best techniques and practices to be quickly and easily replicated.
Consider the example of automated unit and functional testing. If you can automate the execution of the best testing scripts incrementally across code units and functions, you can test more accurately, not to mention more quickly. The benefits will increase exponentially with the additional layers of testing you can automate. We recently worked with a customer who realized an overall ROI of 467% from their investment in automated unit testing.
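To make this concrete, here is a minimal sketch of what an automated unit test can look like. The `calculate_order_total` function and its pricing rules are hypothetical stand-ins for real application code; the point is that once a test like this exists, a CI server can run it on every commit with no manual effort.

```python
# test_pricing.py -- a minimal automated unit test (hypothetical example).
# In a typical setup, a CI server runs this on every commit, so no one
# has to remember to execute it by hand.
import unittest


def calculate_order_total(unit_price, quantity, discount=0.0):
    """Hypothetical pricing rule: apply a fractional discount to the subtotal."""
    if quantity < 0 or not (0.0 <= discount <= 1.0):
        raise ValueError("invalid quantity or discount")
    return round(unit_price * quantity * (1.0 - discount), 2)


class TestCalculateOrderTotal(unittest.TestCase):
    def test_no_discount(self):
        self.assertEqual(calculate_order_total(9.99, 3), 29.97)

    def test_with_discount(self):
        self.assertEqual(calculate_order_total(100.0, 2, discount=0.25), 150.0)

    def test_rejects_invalid_input(self):
        with self.assertRaises(ValueError):
            calculate_order_total(10.0, -1)


if __name__ == "__main__":
    unittest.main()
```

Once tests like this are wired into the pipeline, adding further layers (functional, integration, regression) follows the same pattern: write the check once, then let the toolchain replay it automatically on every change.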
Integration
Modern applications are increasingly componentized, spanning multiple platforms. Consider an e-commerce application, where the front-end may reside on a system of engagement (such as a web server) while the back-end transaction ultimately occurs on a system of record, such as a mainframe.
High-quality, end-to-end digital services ultimately depend on all critical components being universally and automatically included and prioritized in DevOps pipelines. As with automation, the reason is that any time manual effort enters the integration process, you increase the chance of error and, consequently, the risk of a particular code segment lagging behind the others.
The end result? More defects making it into production in the sprint to the finish line. It’s like the famous I Love Lucy chocolate assembly line episode, which offers a humorous picture of the chaos that erupts once human intervention is introduced into a process. Ideally, seamless toolchain integrations across every platform the application touches can be used to maximize quality, building on DevOps technologies you’re already using, such as JIRA, Jenkins and SonarQube.
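As one hedged illustration of what such an integration can look like, the sketch below queries SonarQube’s quality-gate status for a project and stops the pipeline step if the gate is not passing. The server URL, token and project key are placeholders, and the endpoint and response fields shown reflect typical SonarQube versions, so verify them against your own instance before relying on this.

```python
# quality_gate_check.py -- hedged sketch of a pipeline step that blocks a
# release when SonarQube reports a failing quality gate.
# SONAR_URL, SONAR_TOKEN and the project key are placeholders; the endpoint
# and response shape below follow typical SonarQube versions, so verify
# against your own instance.
import os
import sys

import requests

SONAR_URL = os.environ.get("SONAR_URL", "https://sonarqube.example.com")
SONAR_TOKEN = os.environ["SONAR_TOKEN"]          # API token for authentication
PROJECT_KEY = os.environ.get("SONAR_PROJECT_KEY", "ecommerce-frontend")


def quality_gate_passed(project_key: str) -> bool:
    """Return True if the project's quality gate status is OK."""
    response = requests.get(
        f"{SONAR_URL}/api/qualitygates/project_status",
        params={"projectKey": project_key},
        auth=(SONAR_TOKEN, ""),   # token-as-username basic auth
        timeout=30,
    )
    response.raise_for_status()
    status = response.json()["projectStatus"]["status"]
    return status == "OK"


if __name__ == "__main__":
    if quality_gate_passed(PROJECT_KEY):
        print("Quality gate passed; promoting build.")
    else:
        print("Quality gate failed; stopping the pipeline.")
        sys.exit(1)   # non-zero exit fails the Jenkins stage automatically
```

Because the check runs as an ordinary pipeline step, a Jenkins stage can call it after analysis and the build fails automatically when the gate fails, with no one having to remember to look at a dashboard.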
Measurement
Improving quality depends on development teams constantly improving their methods. This can be achieved through ongoing measurement against key performance indicators (KPIs).
The challenge is establishing relevant KPIs, and we have seen many development teams fail at this. The most effective KPIs are those that translate to real business impact. For example, “number of bugs released into production” is a commonly used metric, but it does nothing to convey the bottom-line cost of those bugs. A more effective metric would be the cost of resolving defects, with success being declared when the organization reduces the amount of money spent on fixing bugs.
The cost of fixing a bug is much lower when the fix happens earlier in the software development lifecycle (for example, in the coding and unit testing phases) than when the software is near or already in production. When a development team can consistently show it is reducing costs through earlier identification of bugs, it is in a stronger position to justify investments in better testing and analysis tools.
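As a rough sketch of how such a KPI might be tracked, the snippet below totals defect-resolution cost by the phase in which each bug was caught. The per-phase cost figures and the release data are purely illustrative placeholders, not industry benchmarks; an organization would substitute its own numbers from time-tracking or finance data.

```python
# defect_cost_kpi.py -- illustrative sketch of a "cost of resolving defects" KPI.
# The per-phase costs below are placeholder numbers, not benchmarks; plug in
# your organization's own figures.
from collections import Counter

# Hypothetical average cost (in dollars) to resolve one defect, by the phase
# in which it was found. Later phases are assumed to be more expensive.
COST_PER_DEFECT = {
    "unit_test": 100,
    "integration_test": 500,
    "staging": 2_000,
    "production": 10_000,
}


def defect_cost_report(defects_by_phase: Counter) -> dict:
    """Return total and per-phase defect-resolution cost for one release."""
    per_phase = {
        phase: count * COST_PER_DEFECT[phase]
        for phase, count in defects_by_phase.items()
    }
    return {"per_phase": per_phase, "total": sum(per_phase.values())}


if __name__ == "__main__":
    # Two hypothetical releases: the second catches more bugs before production.
    release_1 = Counter({"unit_test": 20, "integration_test": 8, "production": 5})
    release_2 = Counter({"unit_test": 30, "integration_test": 6, "production": 1})

    for name, defects in [("release 1", release_1), ("release 2", release_2)]:
        report = defect_cost_report(defects)
        print(f"{name}: total defect cost = ${report['total']:,}")
```

In this toy example the second release actually finds more bugs, yet costs far less to fix them, which is exactly the kind of bottom-line story a raw “bugs in production” count cannot tell.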
Conclusion
We’ve long known about the challenge DevOps teams face in balancing velocity and efficiency with a high degree of software quality. It’s like asking a circus performer to sprint across a tightrope. It may seem unfair, but the need to release high-quality software faster and faster, while staying within budget, is not going to abate any time soon.
The recent spate of outages has taught us that even the biggest and the strongest in the industry are not immune to immense speed pressures, and the quality stumbles that often result. Automation, integration and measurement can help development teams in their ongoing quest to meet the seemingly conflicting requirements of improving quality, velocity and efficiency—at the same time.