Last week, we ran the story Paying Down Technical Debt. This week, we’re discussing that very topic with Gene Kim (@realgenekim). As a follower of DevOps trends, you probably are already aware of his most recent book project, The DevOps Cookbook, and most certainly his previous works, The Visible Ops Handbook and The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win.
Gene also founded the security firm Tripwire and was the CTO there for 13 years. Gene has made a career of studying high-performance companies, and we’re pleased he took the time to share his thoughts on technical debt with us.
DevOps.com: Tell us what has been keeping Gene Kim busy lately?
Gene: There is a lot going on. Right now, I’m very focused on the DevOps Enterprise Summit taking place in San Francisco October 21 through October 23. The summit will focus on Lean IT, DevOps, and Continuous Delivery. We’re pretty excited about it. We have 50 speakers, including people from large enterprises such as GE, Disney, and Macy’s.
DevOps.com: Sounds like a great event, Gene. I’m curious to get your thoughts on technical debt. Some people argue that it doesn’t exist. Others contend that it does. Does technical debt exist, and if so how do you define it?
Gene: There is a great definition of technical debt that was given by Ward Cunningham: “Shipping first time code is like going into debt. A little debt speeds development so long as it is paid back promptly with a rewrite. … The danger occurs when the debt is not repaid. Every minute spent on not-quite-right code counts as interest on that debt. Entire engineering organizations can be brought to a standstill under the debt load of an unconsolidated implementation, object-oriented or otherwise.”
My simple take on that quote is that technical debt is what you feel the next time you want to make a change. It accrues not only from all the shortcuts you take when a project is rushed, but also every time the developers don’t write an automated test and every time you don’t do static code analysis. When you skip these things, the debt builds up every day. It’s basically all the variances between the right way to do things and the way we actually do things. The more of those accumulate, the larger this ever-increasing amount of technical debt becomes.
DevOps.com: What does technical debt commonly look like in an enterprise?
Gene: Randy Shoup gave a great talk on this, The Virtuous Cycle of Velocity: What I Learned About Going Fast at eBay and Google, at FlowCon San Francisco 2013, when he was CTO at KIXEYE. In that talk, Shoup provided the countermeasure to technical debt. He stated that anything that affected performance and quality, as defined by reliability or scaling, was treated as a priority zero defect.
Essentially, what he’s saying is that whenever there’s a reliability or a scaling issue, it’s treated as a priority zero defect. In other words, it is as bad as a site-down issue. It’s like Toyota: when they have a serious issue, everyone stops work in order to swarm the problem until it is resolved. If they can contribute to solving the problem, they do. This is the opposite of what happens in most organizations, where it’s “never my problem” and technical debt is allowed to accrue. There, you put problems onto the defect backlog or into the security remediation queue, but they never get done.
They go there to die. It’s the landfill of good ideas that never get implemented. What Randy Shoup tells us is that the investment pays off. I would say that’s exactly the culture that enables high performance. High performance is where you’re getting 30 times more code deploys, very short lead times, high success rates when you deploy, and short mean time to repair. That’s as true for operations as it is for information security.
So, technical debt is bad for ops; it’s bad for security. It’s also bad for development, because it slows down future flow.
Shoup nails the difference between the virtuous and the vicious cycle. The virtuous cycle is the one with solid testing. You have good testing, so you have a solid foundation, which lets you be more confident. This lets you go faster and faster. In the vicious cycle, you don’t have testing. You have a very shaky foundation. You’re afraid to make changes. And you go slower and slower.
DevOps.com: In your experience of watching high performance organizations, what is the challenge of getting from the slide that discussed the vicious cycle and the technical debt world to one with the virtuous cycle? How do you implement what’s detailed on slide 13, I think it is?
Gene: It really is policy. We have to have a policy that elevates paying down technical debt and enables teams to focus on the nonfunctional requirements – a policy that one’s definition of done is not “whenever the developer says it’s done, then it’s done.” It’s not done until there’s an automated test and an automated deployment process, so that anyone can push the button and it goes through the test cycle and gets deployed into production.
Deliverables also have to operate in production as designed. That’s a good definition of done. That prevents us from going from sprint to sprint and accumulating technical debt behind us. I would say the countermeasure to technical debt is the right definition of done. It’s like in the agile philosophy quote: The definition of done is at the end of each sprint interval we have working and potentially shippable code.
I would add that it is integrated into trunk. It’s checked into the repository. There’s automated testing that anybody can run before the code can be deployed. I think these are good countermeasures.
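The definition of done described above could be expressed as an explicit checklist that a team evaluates before closing a work item. The following is only a minimal sketch under stated assumptions: the `WorkItem` record, its field names, and the helper functions are hypothetical illustrations, not something taken from the interview or from any particular tool.

```python
from dataclasses import dataclass


@dataclass
class WorkItem:
    """Hypothetical record of a work item's delivery status (field names are illustrative)."""
    name: str
    merged_to_trunk: bool        # integrated and checked into the repository
    has_automated_tests: bool    # tests that anybody can run
    has_automated_deploy: bool   # push-button deployment exists
    works_in_production: bool    # operates in production as designed


def unmet_criteria(item: WorkItem) -> list[str]:
    """Return the definition-of-done criteria this item does not yet satisfy."""
    checks = {
        "integrated into trunk": item.merged_to_trunk,
        "automated tests exist": item.has_automated_tests,
        "automated deployment exists": item.has_automated_deploy,
        "operates in production as designed": item.works_in_production,
    }
    return [label for label, satisfied in checks.items() if not satisfied]


def is_done(item: WorkItem) -> bool:
    """An item is done only when every criterion is met, not when someone says so."""
    return not unmet_criteria(item)


if __name__ == "__main__":
    story = WorkItem("checkout-redesign", True, False, True, False)
    print(is_done(story))
    print(unmet_criteria(story))
```

Listing the unmet criteria, rather than returning only a yes or no, makes the remaining debt on each item visible at the end of a sprint.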
DevOps.com: That’s excellent. Now, to get from the vicious to the virtuous cycle, I imagine it takes significant investment upfront before the payoff?
Gene: Unfortunately, this is one of those things where you go slower before you can go faster. But the benefits are so compelling. We know that high performers are doing 30 times more frequent code deployments, and they can do them 8,000 times faster. That’s 8,000 times faster. So the payoff is huge.
Think about it: If you’re not doing a deploy a day, that means that you are 100 to 1,000 times slower than high performers. High performers can execute a deployment in minutes or hours, whereas lower performers need weeks, months, or quarters. Of course, it takes investment to go from weeks, months, and quarters to on-demand deployments that execute within minutes. But the value of doing that is too high to ignore.