Terraform is widely adopted among both enterprises and service providers that embrace infrastructure-as-code (IaC) as their model for infrastructure provisioning automation in DevOps environments. And while the technology has limitations, innovations seeking to unseat Terraform are highly fragmented and have seen tepid adoption. This points to a market that’s settling on a de facto standard, so rather than trying to replace Terraform with the next great innovation, infrastructure operators should instead find ways to extend and augment Terraform to suit their unique needs. Let’s explore several of the better-known shortcomings of Terraform, some drivers for the rise in new alternatives to the technology, and what users need to consider when contemplating a different path.
Without a doubt, Terraform is big. Across the big three public clouds, the number of Terraform provider installs tells the story: At the end of August, there were 457 million provider installs for AWS, and around 72 million each for Azure and GCP. Yes, compared to longer-in-the-tooth infrastructure configuration management tools like Ansible, Chef and Puppet, IaC as a category is still early. But the trend is clear: IaC is becoming widely used at scale, and Terraform is the IaC platform most enterprises go with.
It’s also no secret that there are shortcomings to Terraform. A quick search on Reddit and Hacker News unearths a litany of complaints, some more valid than others:
- It’s tricky creating a Kubernetes cluster and then adding a resource to it
- Challenges passing values through a config or dumping a pure representation of an evaluated config
- Cluster deletion is a pain
- An overall lack of visibility into what’s happening during the provisioning process
- The need for glue code to drive everything is a drain on developer productivity
- Lingering challenges with backward compatibility
- Continuous update challenges and the need to refactor
- Scaling across multiple workspaces and multiple cloud accounts
- Rollbacks can be tricky, especially when dealing with immutable infrastructure
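To make the glue-code and multi-workspace complaints concrete, here is a minimal sketch of the kind of wrapper teams often end up writing. It only builds the command lines and does not run them. The workspace names and the `plan_commands` helper are hypothetical; the `terraform workspace select` and `terraform plan -input=false -detailed-exitcode` invocations are standard Terraform CLI usage.

```python
from typing import Dict, List


def plan_commands(workspaces: List[str]) -> Dict[str, List[List[str]]]:
    """Return, per workspace, the CLI invocations a wrapper script would run."""
    commands: Dict[str, List[List[str]]] = {}
    for name in workspaces:
        commands[name] = [
            # Switch to the workspace before planning against it.
            ["terraform", "workspace", "select", name],
            # -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes pending.
            ["terraform", "plan", "-input=false", "-detailed-exitcode"],
        ]
    return commands


if __name__ == "__main__":
    # Hypothetical workspace names, for illustration only.
    for workspace, steps in plan_commands(["dev", "staging", "prod"]).items():
        for step in steps:
            print(workspace, "->", " ".join(step))
```

Even this toy version has to be repeated, credentialed and error-handled for every workspace and cloud account, which is where the productivity drain comes from.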
Terraform’s parent company, HashiCorp, of course, steadily works to address these and other pain points. But the whole thing highlights an interesting question: Why is Terraform so popular?
Terraform is the only IaC solution that meets the needs of developers who want an IaC tool that preserves state, fits well into their CI/CD pipelines, is intent-based and gives them the right level of granularity. Terraform, even with its challenges, is better than the alternatives in these regards.
Here’s why:
Terraform was, from the start, designed for and targeted to developers. It maintains state and supports iteration, making IaC a powerful model for automating infrastructure. PaaS solutions, by contrast, are not robust enough for most developers, who do not need or want all of the infrastructure decisions abstracted away. Most cloud management platforms suffer from the least-common-denominator problem, and even though the technology has improved in this regard, developers generally abhor it for its lack of granularity; it’s geared more to the needs of IT management. Configuration management tools, as an alternative, are not intent-based, so the flow becomes too complicated.
It’s interesting to note that state (maintaining the mapping between the resources that you’ve represented in your definition files and the actual physical resources that are in the cloud provider) was added into Terraform, helping resolve the issue of infrastructure drift. Terraform also realized that developers don’t necessarily need a UI. These tangible examples of the developer-centric design ethos at HashiCorp point to why it is so widely adopted.
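The idea of state as a mapping from declared resources to real ones can be illustrated with a small conceptual sketch. This is not how Terraform implements state or refresh internally; the resource addresses, attribute names and the `detect_drift` helper are all hypothetical, and the "actual" dictionary stands in for whatever a cloud provider API would return.

```python
from typing import Any, Dict, List

Resource = Dict[str, Any]


def detect_drift(state: Dict[str, Resource], actual: Dict[str, Resource]) -> List[str]:
    """Compare recorded attributes against live ones and list the differences."""
    findings: List[str] = []
    for address, recorded in state.items():
        live = actual.get(address)
        if live is None:
            # The resource is tracked in state but no longer exists.
            findings.append(f"{address}: missing from provider")
            continue
        for key, expected in recorded.items():
            if live.get(key) != expected:
                findings.append(
                    f"{address}.{key}: state={expected!r} actual={live.get(key)!r}"
                )
    return findings


if __name__ == "__main__":
    # Hypothetical recorded state and live infrastructure.
    state = {
        "aws_instance.web": {"instance_type": "t3.micro", "ami": "ami-123"},
        "aws_s3_bucket.logs": {"acl": "private"},
    }
    actual = {
        "aws_instance.web": {"instance_type": "t3.large", "ami": "ami-123"},
    }
    for finding in detect_drift(state, actual):
        print(finding)
```

Without a recorded mapping like this, there is nothing to compare the live infrastructure against, which is why state is central to catching drift.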
So, what’s driving the interest in new approaches to infrastructure automation? The proliferation of domain-specific use cases and an ever-present need to reduce time-to-value are the leading reasons. Domain-specific automation options are gaining ground over out-of-the-box (OOB) solutions because OOB solutions take too long to deliver automation; one tool does not fit all. Domain-specific automation addresses areas such as security automation, AIOps, DevSecOps, security compliance and integration with container scanners, not just generic orchestration. People are looking for value very fast, which they can find only with domain-specific automation, and so fragmentation happens as different solutions emerge to address those cases.
There are several interesting alternatives and complements to Terraform. Terragrunt is used along with Terraform to expand its capabilities to multiple workspaces. It allows you to use the same template across many environments. Pulumi is another alternative that has achieved a measure of adoption and visibility. It lets you define infrastructure in general-purpose programming languages, which addresses the complexity of developing large configurations in a declarative configuration language such as Terraform’s HCL. Another more ambitious undertaking is the Crossplane open source project. It takes a Kubernetes-centric approach to creating a universal, cross-cloud control plane. It looks good conceptually, but it’s a fundamentally different approach, and adoption is nascent.
Turning to one of the Terraform alternatives isn’t necessarily a bad idea, but it is fraught with risk. In the end, this choice is less about technology and more about adoption and ecosystem momentum. Vendor and professional investments in tools, training and skill sets surrounding Terraform form a formidable moat that cannot be easily replicated. Tools that would win over Terraform would have to win adoption, and achieving adoption means delivering all the enterprise-grade features and support that Terraform users currently enjoy. Chef and Puppet show how an incumbent can be displaced. Both were successful technologies, but they were slow to react to the challenge presented by Terraform. Ultimately, these projects suffered from a legacy mindset and were overtaken by Terraform’s more focused, simple solution.
Most of the issues with Terraform lie in managing it at scale. Many workspaces and templates become hard to manage, and because templates cannot be generalized across environments, the result is a lot of template redundancy. Terraform also still needs help handling Day 2 operations and providing visibility into drift.
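One common way to cut template redundancy is to keep a single shared set of inputs and layer small per-environment overrides on top. The sketch below illustrates that idea only; the variable names, environment names and the `render_tfvars` helper are hypothetical. It produces JSON variable files, a format Terraform can read when passed with `-var-file`.

```python
import json
from typing import Any, Dict


def render_tfvars(
    base: Dict[str, Any], overrides: Dict[str, Dict[str, Any]]
) -> Dict[str, str]:
    """Merge shared defaults with per-environment overrides into tfvars JSON."""
    rendered: Dict[str, str] = {}
    for env, env_values in overrides.items():
        # Environment-specific values win over the shared defaults.
        merged = {**base, **env_values}
        rendered[f"{env}.tfvars.json"] = json.dumps(merged, indent=2, sort_keys=True)
    return rendered


if __name__ == "__main__":
    # Hypothetical shared defaults and per-environment differences.
    base = {"region": "us-east-1", "instance_type": "t3.micro", "replicas": 1}
    overrides = {
        "dev": {},
        "prod": {"instance_type": "m5.large", "replicas": 3},
    }
    for filename, body in render_tfvars(base, overrides).items():
        print(f"# {filename}\n{body}\n")
```

The template itself stays identical across environments; only the small override maps differ, which is the property wrappers and meta-orchestration tools aim to provide at scale.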
When considering ways to extend Terraform to address a broader set of use cases, you should first consider how well things will interoperate across your environment. If you have multiple clouds, multiple service layers and multiple cloud vendors, this is probably the most important factor. Also, investigate any new technology’s ability to support the things about Terraform that are critical to your Day 2 operations. Things like secrets and state come to mind. You’ll want enough management and monitoring in the new solution without it being a new burden to carry. You should be able to use your existing Terraform files, and you should be able to support all the versions of Terraform that you run. Compliance with governance and audit requirements is key, and the new solution should support Terraform for both your on-premises/air-gapped environments and your cloud environments. Ideally, anything you add to Terraform would be open source, eliminating new complications from cross-vendor lock-in.
[Figure: Architecture for how 8451 evolved its Terraform deployment]
For all its shortcomings, Terraform remains the best option on the market for DevOps organizations, and there are ways to manage it with fewer headaches and without developers wasting so much time writing glue code or reinventing the wheel over and over again. One way this happens is by integrating Terraform with meta-orchestration tools that allow Terraform to become more flexible in managing environments and workspaces at scale. The open source Cloudify project is one example.
Yevgeniy Brikman gave a talk two years ago at HashiConf that has aged well. At the time, DevOps was still in the Stone Age. Today, it’s probably in the Bronze Age. Those of us building IaC today are being asked to deploy infrastructure that looks like modern cities, but the tools we have to work with remain relatively primitive. What we’re building takes too long to provision and it looks bespoke and temperamental at best or kludgy and fragile at worst.
It’s a story repeated over and over: Teams try to build their own, homegrown solution first as a wrapper on top of Terraform. Ultimately, they realize that it’s unsustainable as a lack of engineering resources becomes a bottleneck. These teams need solutions that augment, not replace, what they’ve already built and what works. That’s almost always Terraform. With Terraform, DevOps organizations get a proven IaC option that’s targeted to the needs of developers, with a substantial ecosystem and market uptake. When considering its shortcomings, it’s best to embrace an evolution strategy rather than start from scratch.