Everyone thought AI would free us from toil. Unfortunately, that’s not true. What it will do, however, is lead to an explosion of technical debt. Some of the most alluring uses of LLMs have the potential to lead to some of the worst outcomes for enterprises and their DevOps and platform engineering teams.
SK Ventures and Adrian Cockcroft raised alarm bells in their posts warning about the tipping points LLMs have introduced into work and society. And in my own Gluecon 2023 presentation, I challenged just how powerful and disruptive the new generation of AI/ML will be for ITOps and DevOps efforts.
Working Faster Is Not Working Smarter
There is no doubt that LLMs are powerful and will change how we think about expertise and knowledge work. While they are not truly innovating, their ability to draw on such a vast pool of human expertise makes their output difficult to distinguish from genuinely creative work. And since they are unbounded, they will cheerfully use that knowledge to churn out terabytes of functionally correct but bespoke code.
It’s easy to imagine a future where LLMs crank out DevOps scripts 10x faster. We will be supercharging our ability to produce complex, untested automation at a pace never seen before! On the surface, this seems like a huge productivity boost because we (mistakenly) see our job as focused on producing scripts instead of working systems.
But we already have an overabundance of duplicated and difficult-to-support automation. This ever-expanding surface of technical debt is one of the major reasons ITOps teams are mired in complexity and forever underwater. Generating even more scripts, even if they are better written, does not reverse the trend.
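To make that duplication concrete, here is a hypothetical sketch of the kind of near-duplicate automation that piles up when every team writes, or now generates, its own version of the same task. The hosts, package names and function names are invented for illustration, not taken from any real environment:

```python
# Hypothetical near-duplicate automation: two teams automate the same task
# with scripts that differ only in hard-coded details, so neither can be
# reused or maintained in one place.

import subprocess


def provision_web_node_team_a(host: str) -> None:
    """Team A's bespoke bootstrap with its own package and service user."""
    subprocess.run(["ssh", host, "sudo apt-get install -y nginx"], check=True)
    subprocess.run(["ssh", host, "sudo useradd --system svc-web-a"], check=True)


def provision_web_node_team_b(host: str) -> None:
    """Team B's copy of the same idea, drifted just enough to be incompatible."""
    subprocess.run(["ssh", host, "sudo apt install -y nginx-full"], check=True)
    subprocess.run(["ssh", host, "sudo adduser --system web-svc-b"], check=True)
```

Neither script is wrong; the debt comes from maintaining both forever.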
Can AI/ML Avoid This Technical Debt Trap?
It depends on how we use the expertise in the models. Instead of asking them to generate new code, we could ask them to interpret and modify existing code. For the first time, we have tools that can take down the “not invented here” barriers we’ve created because of the high cognitive load of understanding code. If we can help people work more effectively with existing code, then we can actually converge and reuse our systems. By helping us expand and operate within our base of working systems, LLMs could help us maintain less code.
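As a minimal sketch of that shift, the snippet below contrasts a generation prompt with an interpretation prompt. The `ask_llm` function is a hypothetical stand-in for whatever LLM client a team already uses, and the provisioning script is invented for illustration:

```python
# A minimal sketch of the shift from generation to interpretation.
# `ask_llm` is a hypothetical stand-in for a real LLM client; the stub
# below just lets the sketch run offline.

def ask_llm(prompt: str) -> str:
    """Hypothetical LLM wrapper; replace with a call to your provider."""
    return f"[LLM response to a {len(prompt)}-character prompt]"


# An existing (invented) provisioning script that the team already runs.
existing_script = """#!/usr/bin/env bash
apt-get install -y nginx
useradd --system svc-web
"""

# Independence: ask for brand-new, bespoke automation.
new_code = ask_llm("Write a script that provisions an nginx web node.")

# Collaboration: ask the model to explain and extend what already exists.
explanation = ask_llm(
    "Explain what this provisioning script does, flag risky steps, and "
    "suggest the smallest change needed to add a health check:\n"
    + existing_script
)

print(new_code)
print(explanation)
```

The difference is intent, not tooling: the second prompt keeps the existing script as the center of gravity instead of spawning yet another bespoke one.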
Imagine if the teams in your organization were invested in collaborating around shared systems! We haven’t done this well because, historically, it has taken significantly more time and effort. LLMs have now upended that calculation.
Taking this one step further, we can see how improved reuse paves the way for reducing the number of architectural patterns. If we improve our collaboration and invest in sharing code, then there is increased ROI in making shared patterns and platforms work. I see that as a tremendous opportunity for LLMs to improve operations in a meaningful way.
It comes down to whether we ignore complexity by replacing it or embrace complexity by understanding it. I see this as a choice between independence and collaboration, as illustrated by the table below.
| LLM Driving Independence | LLM Driving Collaboration |
|---|---|
| Create new code | Interpret existing code |
| Replace existing code | Maintain existing code |
| Design new architectures | Be a guide and mentor |
| Build new stacks | Evaluate alternatives |
| Bypass documentation | Improve documentation |
Can AI Help Us Collaborate?
We keep taking the same hardware and software components and building unique systems from them. That variation makes it nearly impossible to leverage each other’s work. At its heart, the reusability problem lies in the difficulty of understanding systems, and that difficulty keeps us from collaborating, reusing and accelerating.
Breaking this cycle has been our mission at RackN for nearly a decade. We’ve seen tremendous ROI in availability, reliability and speed when customers are able to leverage standardized infrastructure pipelines. We’ve seen firsthand that it is possible to bring teams into collaboration with sufficient guidance, mentoring and training. Collaboration creates 10x outcomes even without AI, but it takes built-up expertise, deliberate effort and leadership discipline to implement.
Why does collaboration create 10x ROI? Because true innovation comes from understanding and building on top of systems rather than rebuilding them from scratch over and over again. The key is leveraging LLMs to better understand existing automation and platforms instead of creating yet another bespoke stack.
Enterprises Should Use LLMs to Drive Collaboration
I believe that the most effective enterprises will use these tools to elevate their IT leaders and architects to focus on collaboration and reuse. That means directing teams to produce less bespoke automation and, instead, using the expertise in these tools to find and maintain existing systems. LLMs present options, but collaboration requires new thinking and strong leadership.