Ideally, DevOps should retain a lean footprint, but avoiding technical debt is easier said than done. In fact, over half of IT leaders report that technical debt is a big or critical problem. Without routinely addressing technical debt, DevOps teams can easily face inconsistencies during deployments. Versioning can get out of hand without consistent upgrades, and aging legacy components can be a roadblock to development fluidity.
Simply put, letting too much technical debt pile up makes deployments harder to trust and development slower.
I recently met with Kurt Andersen, SRE architect at Blameless, and Matt Davis, SRE advocate, to gather insights on the sources of technical debt in the field of DevOps. According to Davis, “Tech debt is something that you really want to keep chipping away at.” Below, we’ll highlight some common causes of technical debt and suggest ongoing practices to help tame it.
1. UI-Based Deployment Patterns
UIs are great for quickly configuring DevOps, but if your deploys aren’t codified, this reliance on UIs could come back to haunt you in the form of technical debt. Often, explained Davis, infrastructure teams simply use the GUI of their cloud provider to initiate deployments, perhaps because the team is wrapped up in feature velocity or a release deadline.
For example, you could use a UI to quickly create a database, but this is likely not a repeatable and robust process. As an organization scales its cloud adoption and uses multiple clouds, it starts to produce “snowflake” deployment patterns across teams, says Andersen. This can lead to inconsistent deployment options from region to region.
It’s not easy to catalog a configuration if it’s not written in code, and the more undocumented UI-based deployment patterns are used, the more silos of knowledge you create. “If not codified with Terraform, you have to document what you did—code gives you that documentation,” said Davis.
Solution: Rely less on the GUI and codify your deployment patterns with infrastructure-as-code (IaC).
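To make the idea of a codified pattern concrete, here is a minimal Python sketch rather than real Terraform. It assumes a hypothetical DatabaseSpec record, a made-up BASELINE and invented region names, and it only illustrates the principle: once the expected configuration lives in code, "snowflake" deployments can be detected mechanically instead of remembered by whoever clicked through the UI.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class DatabaseSpec:
    """Hypothetical declarative description of a managed database."""
    engine: str
    version: str
    storage_gb: int
    backups_enabled: bool


# The codified pattern every region is expected to follow (illustrative values).
BASELINE = DatabaseSpec(engine="postgres", version="15", storage_gb=100, backups_enabled=True)


def find_drift(baseline, deployed):
    """Return {region: {field: (expected, actual)}} for regions that differ from the baseline."""
    expected = asdict(baseline)
    drift = {}
    for region, spec in deployed.items():
        diffs = {
            field: (expected[field], actual)
            for field, actual in asdict(spec).items()
            if actual != expected[field]
        }
        if diffs:
            drift[region] = diffs
    return drift


# Made-up inventory: one region matches the pattern, one was configured by hand.
deployed = {
    "us-east": DatabaseSpec("postgres", "15", 100, True),
    "eu-west": DatabaseSpec("postgres", "13", 100, False),
}

print(find_drift(BASELINE, deployed))
```

In practice an IaC tool such as Terraform plays both roles: the code is the documentation Davis describes, and its plan step surfaces this kind of drift.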
2. No End-of-Service Life Cycle
Development is typically more concerned with feature velocity than planning for service deprecation. Yet not envisioning the entire life cycle or setting an off-ramp from the start can produce more technical debt later on. “No one pays attention to planning for retirement, and at the end-of-life, services are hard to retire,” described Andersen.
What you end up with are these “senile services” that aren’t doing that much but remain critical to the business’s operations. These services may be challenging to migrate. Or, they may be the product of unknown shadow or zombie APIs. For example, a marquee customer may depend upon a certain legacy endpoint, making the service difficult to evolve.
Not planning for the end of the service life cycle can keep your development teams from adopting more efficient ways to work. For example, if one “senile service” has never used a particular commit strategy before, it could stall agility when revamping the deployment pipelines, says Andersen. Thus, it’s best practice to set clear deprecation timelines and plan for the future.
Solution: Plan the end-of-service life cycle from the outset.
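One lightweight way to keep deprecation timelines visible is to record a sunset date next to each service and review the catalog regularly. The Python sketch below is only an illustration under assumptions: the Service record, the service names, the owners and the dates are all hypothetical, and a real catalog would likely live in a service registry rather than in a script.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Service:
    """Hypothetical catalog entry for a deployed service."""
    name: str
    owner: str
    sunset_date: Optional[date]  # None means no retirement plan was ever recorded


def review_lifecycle(services, today):
    """Return (names with no sunset date, names whose sunset date has already passed)."""
    missing_plan = [s.name for s in services if s.sunset_date is None]
    overdue = [
        s.name
        for s in services
        if s.sunset_date is not None and s.sunset_date < today
    ]
    return missing_plan, overdue


# Made-up catalog for illustration.
catalog = [
    Service("billing-v1", "payments", date(2023, 6, 30)),
    Service("legacy-export", "data", None),
    Service("search-v2", "discovery", date(2030, 1, 1)),
]

missing_plan, overdue = review_lifecycle(catalog, today=date(2024, 1, 1))
print("No retirement plan:", missing_plan)
print("Past sunset date:", overdue)
```

Running a check like this on a schedule turns "no one pays attention to planning for retirement" into a visible list that someone owns.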
3. Lengthy Dual-Track IT
Another area that can produce technical debt in DevOps is dual-track IT. This is when engineering teams maintain one core stack in addition to an experimental structure and shift toward adopting the experimental structure over time. Development teams are constantly overseeing migrations to new technologies, yet they rarely plan for the coexistence of both stacks.
While splitting an old and new stack into distinct units sounds like a temporary measure, some of these large transitions simply take longer than you think. For example, SoundCloud’s journey to refactor its monolith into microservices was an eight-year process.
Solution: Realize dual-track IT might take longer than you think and plan accordingly.
4. Intermittent Version Upgrades
Third-party software is constantly upgrading to new versions. Especially if you’re integrating many open source dependencies into your stack, you’ll find the cadence of version upgrades to be quite fast. These upgrades are often imperative to plug new vulnerabilities as they emerge in the software supply chain.
But upgrading versions and base images, and maintaining a patch management scheme that works from environment to environment, requires a lot of work, Davis cautioned. Kubernetes version upgrades, for example, can take a lot of time. Yet, if your upgrades are not matching the pace of change, added Andersen, they can quickly pile up and leave you behind with a mountain of work. “If you’re not upgrading monthly, you’re in danger.”
Solution: Have a continual upgrade process in place.
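A continual upgrade process usually starts with knowing how far behind each component is. The following Python sketch shows one possible check, under stated assumptions: the component names and version numbers are invented for illustration, versions are assumed to be plain dotted numbers, and the threshold of one minor release is an arbitrary example rather than a recommendation from Davis or Andersen.

```python
def parse_version(version):
    """Turn '1.27.3' into (1, 27, 3); assumes plain dotted numeric versions."""
    return tuple(int(part) for part in version.split("."))


def minor_versions_behind(current, latest):
    """Return the gap in minor version numbers, or None if the major versions differ."""
    cur, new = parse_version(current), parse_version(latest)
    if cur[0] != new[0]:
        return None  # a major-version jump needs a human decision
    return max(new[1] - cur[1], 0)


def upgrade_report(inventory, max_lag=1):
    """Return components more than max_lag minor versions behind, or a major version behind."""
    stale = {}
    for name, (current, latest) in inventory.items():
        lag = minor_versions_behind(current, latest)
        if lag is None or lag > max_lag:
            stale[name] = (current, latest)
    return stale


# Hypothetical inventory: component -> (deployed version, latest available version).
inventory = {
    "kubernetes": ("1.25.4", "1.29.0"),
    "base-image": ("12.4", "12.5"),
    "http-lib": ("2.31.0", "3.0.1"),
}

print(upgrade_report(inventory))
```

Wiring a report like this into a monthly review, or into CI, keeps the backlog of upgrades small enough to tackle one step at a time.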
Technical Debt: Keep Paying, Cleaning, and Pruning
Technical debt is something you need to grapple with before it gets out of control. But it’s not always apparent where it lives. Looking toward the future, automated tech debt scanning could help discover where technical debt is and provide remediation points as part of ongoing software stack auditing.
You don’t want to get to the point where you accumulate and accumulate technical debt, only to burn it all to the ground in one disastrous fell swoop, said Davis. Similar to the way you slowly pay off a mortgage each month, technical debt is something you must continually keep an eye on, and reducing it should be part of your team’s continuous workflow.
Or, consider the old camping mantra of always leaving the campground cleaner than you found it. You pick up a little bit of extra trash and take it out with you, making the environment better for everyone. Or, consider how a gardener must prune the branches of a fruit tree to help it grow. Pick the analogy that works for you and see if you can apply it to continually reduce tech debt in your DevOps processes!