DevOps teams are experiencing a chasm between traditionalists and innovators, a rift that grows as pressure mounts for teams to leverage the latest advancements. Traditionalists are fighting to preserve the basics, to carry forward an appreciation for the early days of DevOps, and to avoid cutting corners (look no further than CrowdStrike). Innovators, on the other hand, are ready to seize their hero moments with the latest tools and techniques.
For a high-performing DevOps function, you must give equal credence to both. The rocket ship of DevOps will always launch, but the success of its journey largely depends on the work that is done before the countdown begins. How can teams adopt the NASA mindset? Let’s dive in…
Back to the Basics: Reliable, Repeatable Builds
- Processes in Control: To ensure the rocket launches accurately every time, the ‘standard’ needs to be as standard as possible. Processes typically lose standardization during assembly and build deployment. Teams often use different techniques for different environments instead of creating a single promotion pathway that builds each environment, regardless of purpose, the same way. Each build to a lower environment is essentially a ‘rehearsal’, so that by the time you get to production, you have likely already run the same scripts with the same approach at least twice. In DevOps, routine is everything, but you must treat everything like a potential CrowdStrike moment.
- End-to-end Capture: Those responsible for the build pick up new approaches along the way, which often live only in their minds. Unfortunately, this undocumented knowledge may not keep pace with the environments themselves, which change over time as their code and hardware are continually refreshed. Though it might feel tedious, DevOps teams need a realistic and verified end-to-end picture of their environments. That can take the form of storytelling, visuals, recorded videos, and so forth; the goal is to capture it. According to the 2023 State of DevOps Report, documentation quality drives the successful implementation of every single technical capability studied, from continuous integration to trunk-based development to reliability practices.
- Reduction of Manual Steps: Wherever possible, human intervention should be reserved for decision points, not used to complete or correct a ‘toil’ task. This concept is sometimes misunderstood. When discussing CI/CD, ‘continuous delivery’ ideally means that with the push of one button, you could promote code up through every environment in the promotion path and land it in production. It is a capability: if you had to, you could click one button and the change would show up in production at the cycle speed of your automated processes. ‘Continuous deployment’, on the other hand, is the decision to remove even that button, so that every change that passes the automated checks goes to production without human approval. It is relatively rare that a build process will be so routine, so low-risk or so urgent that you would choose to eliminate human intervention. Generally, the best practice is to have highly automated builds with automated notifications and alerts for when humans should intervene.
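The single promotion pathway and the human decision point described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real pipeline: the environment names, the `promote` and `build_and_deploy` functions, and the `gated` parameter are all hypothetical and chosen only for the example. The idea it shows is that every environment is built by the same function, and that a person is asked only to decide, never to perform a build step.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical environment names; the article does not prescribe specific ones.
PROMOTION_PATH = ["dev", "staging", "production"]


@dataclass
class PromotionResult:
    deployed_to: List[str] = field(default_factory=list)
    stopped_at: str = ""  # environment where promotion halted, if any
    reason: str = ""


def build_and_deploy(env: str, version: str) -> bool:
    """Stand-in for the one shared build script.

    Every environment goes through this same function, so each lower
    environment is a rehearsal for production. Only `env` differs.
    """
    print(f"building {version} for {env} with the shared script")
    return True


def promote(
    version: str,
    approve: Callable[[str, str], bool],
    gated: frozenset = frozenset({"production"}),
    deploy: Callable[[str, str], bool] = build_and_deploy,
) -> PromotionResult:
    """Walk the promotion path in order, pausing only at gated environments."""
    result = PromotionResult()
    for env in PROMOTION_PATH:
        # Human involvement is a decision point, not a manual build step.
        if env in gated and not approve(env, version):
            result.stopped_at, result.reason = env, "awaiting human approval"
            return result
        if not deploy(env, version):
            result.stopped_at, result.reason = env, "build failed"
            return result
        result.deployed_to.append(env)
    return result


# With the default gate, promotion stops before production until approved.
held = promote("1.4.2", approve=lambda env, version: False)

# With no gated environments, the same change flows straight through.
automatic = promote("1.4.2", approve=lambda env, version: False, gated=frozenset())
```

In this sketch the only difference between the two runs is the `gated` set. The build steps themselves are identical, which is the property that keeps the ‘standard’ standard.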
Innovation: An Eye Toward Opportunities
- Infrastructure as Code: A challenge in builds is the need to resolve multiple things at the same time: a feature promotion coming up through the dev team, defects that have to be fixed based on experience in production or lower environments, and patches and refreshes on underlying systems. The possibility of collision or omission is relatively high in a dynamic environment. Infrastructure as code (IaC) removes this frustrating barrier to first-build success by building the entire stack from the ground up every time. IaC presupposes that you can build and deploy things virtually, for example with virtual machines or containers, but it is an incredibly powerful innovation in DevOps because of its ability to ensure clean and complete builds the first time, every time.
- Create Adaptive Builds via MLOps: An important part of each build is to determine the validation and verification ‘package’ that is going to accompany the deployed source code. It is imperative to be able to pinpoint risks, slowdowns and points of failure. But in the absence of ways to assess these on a release-specific basis, the default decision is to include the entire verification suite in each release, which extends the build and release cycle time. A great place to start is to apply ML/AI to create adaptive builds that automate these checks and confirm that each release meets its criteria, is secure, and performs as expected. Feeding this build optimization infrastructure from the ongoing data pipeline keeps it relevant and optimized in real time. While a lot of this can be done manually by skilled humans, these are tasks for which ML/AI is a natural fit.
- Incorporate SRE: Site reliability engineering (SRE) represents the next step of DevOps as it breaks down silos and creates continuity between those who create code and those who interact with customers firsthand. One way to incorporate SRE into DevOps is to create a rotational roster in which each dev periodically serves a ‘tour of duty’ on the help desk, for example six weeks at the second level of support. A reciprocal ‘tour of duty’ that places help desk personnel in the dev teams as business analysts or developers (depending on their technical background) is equally beneficial. The collaboration and insights created by an SRE model have the potential to transform the end-user experience.
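To make the adaptive build idea more concrete, here is a rough sketch of release-specific verification selection. It is an assumption-laden illustration, not an ML system: a simple hand-written risk score stands in for the trained model, and the `Suite` class, the `risk_score` and `select_suites` functions, the threshold, the paths and the failure rates are all hypothetical. What it shows is the shape of the decision: run only the suites a given change puts at risk, and never skip the checks marked as mandatory.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Suite:
    name: str
    covers: frozenset          # path prefixes this suite exercises
    minutes: float             # typical run time
    always_run: bool = False   # e.g. smoke or security checks that are never skipped


def risk_score(suite: Suite, changed: List[str], failure_rate: Dict[str, float]) -> float:
    """Share of changed files the suite covers, weighted up by its past failure rate.

    A trained model would replace this heuristic; it is only a placeholder.
    """
    if not changed:
        return 0.0
    touched = sum(1 for path in changed if any(path.startswith(p) for p in suite.covers))
    overlap = touched / len(changed)
    return overlap * (1.0 + failure_rate.get(suite.name, 0.0))


def select_suites(
    suites: List[Suite],
    changed: List[str],
    failure_rate: Dict[str, float],
    threshold: float = 0.2,
) -> List[str]:
    """Return the names of the suites to run for this release, in input order."""
    return [
        s.name
        for s in suites
        if s.always_run or risk_score(s, changed, failure_rate) >= threshold
    ]


# Hypothetical data for one release.
suites = [
    Suite("smoke", frozenset(), 2, always_run=True),
    Suite("billing", frozenset({"services/billing/"}), 25),
    Suite("search", frozenset({"services/search/"}), 40),
]
changed = ["services/billing/invoice.py", "docs/readme.md"]
failure_rate = {"billing": 0.1, "search": 0.3}

print(select_suites(suites, changed, failure_rate))  # prints ['smoke', 'billing']
```

Here the 40-minute search suite is skipped because nothing it covers changed, while the smoke suite runs regardless. In practice the failure rates would come from the ongoing data pipeline, and the threshold would need tuning against how much missed-defect risk a team is willing to accept.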
Conclusion
The ‘operational tension’ between DevOps traditionalists and innovators might be growing at present; after all, the secret of DevOps has been to make the difficult routine.
But the great news is that teams need both. It is simply about finding ways to bring these mindsets together, explaining the impact both bring to a successful DevOps function and identifying ways to supercharge DevOps efforts by granting equal credence to both.
Walter McAdams is the Chief Engineer, Digital Solutions, at SQA Group, a technology and advisory services firm helping companies pinpoint and solve the most critical use cases around accelerated software engineering, actionable data, and tech-powered innovation. From working for the Department of Defense as an intelligence officer to leading full-scale development and test automation frameworks at large enterprises like Microsoft and Boeing, Walter brings more than three decades of hyper-specialization across disciplines such as software and quality engineering, robotics, embedded systems and emerging technology.