Without question, golden templates (or golden images) simplify the process of deploying resources in on-premises and cloud infrastructures. These templates can save time. They ensure consistency by setting up a computing environment that matches the specifications. The templates also encourage component reuse, rather than requiring individual users to manually compose infrastructure as code (IaC) from scratch for each application. For common types of projects, the infrastructure is frequently the same, so why force busy developers to rewrite it for everything they want to deploy?
Golden templates were a great first step in making IaC more accessible for end users, including people who might otherwise find it difficult to compose IaC. Users can plug their applications and projects into preset standards.
That’s far better than the alternatives. Nobody has to ask those users to learn new tools, such as Terraform, AWS CloudFormation, or Ansible, each of which has its own syntax and nuances. Nor are those end users expected to understand the underlying infrastructure, such as servers, storage, networks, and operating systems. A golden template is akin to a PowerPoint template that is pre-equipped with a brand’s colors, fonts, icons, and predefined layouts.
Golden templates make it easier to get started with IaC and deliver something that looks right, with all the right pieces.
Or, at least, that was the goal. Despite the strengths of golden templates, there are still issues to resolve.
The Problems Golden Templates Address Well
Making IaC repeatable makes sense. It builds in infrastructure security, consistency, and performance by removing the likelihood of manual errors, ensuring best practices are followed, and enforcing the use of approved modules, providers, and resources.
Golden templates work fairly well for simple things you need to do all the time, such as deployments where the infrastructure is the same every time and the underlying codebase rarely changes. There’s no need to reinvent the wheel for every application deployment, so it makes sense to use a template that describes which modules to use and in which order. As with that PowerPoint template, the approach is relatively simple and it doesn’t require a platform engineer’s time and attention.
Saving time and improving efficiency are clear wins in themselves, and golden templates theoretically offer both of those, alongside security. For common use cases, templates can make it easy for end users to deploy their work without sending tons of requests to the platform team or exposing their organization to unnecessary risk.
Ultimately, though, like any template, golden templates are fragile.
Where Golden Templates Fail, and Why
Golden templates work great on Day One, but two things happen fairly quickly that throw a wrench into the works: updates to components and updates to the templates themselves.
Templates often include versioned components, such as modules and dependencies on different (often open source) projects.
What happens when you want to use the latest version of a module in your template? You have two options: update the whole template, or try to design your templates to always use the latest version, which has its own risks and pitfalls. Most companies want to test the latest versions before rolling them out to production. And you don’t want to update your template one week only to do it again the next week when another component gets a new version.
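The trade-off between the two options can be sketched in a few lines. This is a minimal illustration, not any particular tool's behavior: the `Component` class, the `registry` dictionary, and the module names are all hypothetical, and versions are assumed to be plain dotted numbers.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


@dataclass(frozen=True)
class Component:
    name: str
    pinned: Optional[str] = None  # None means "float to whatever is latest"


def resolve(component: Component, released: List[str]) -> str:
    """Return the version a deployment would actually get."""
    if component.pinned is not None:
        return component.pinned
    return max(released, key=parse_version)


def stale_pins(template: List[Component], registry: Dict[str, List[str]]) -> List[str]:
    """List pinned components that have a newer release available."""
    stale = []
    for component in template:
        latest = max(registry[component.name], key=parse_version)
        if component.pinned is not None and parse_version(component.pinned) < parse_version(latest):
            stale.append(component.name)
    return stale


# Hypothetical registry of released versions and a two-component template.
registry = {"network": ["1.2.0", "1.10.0"], "database": ["3.0.1"]}
template = [Component("network", pinned="1.2.0"), Component("database")]

print(resolve(template[0], registry["network"]))   # pinned: stays on 1.2.0
print(resolve(template[1], registry["database"]))  # floating: picks up 3.0.1
print(stale_pins(template, registry))              # ['network']
```

The pinned component is predictable but drifts behind until someone republishes the template. The floating component is always current but reaches production without the testing step most companies want.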
That brings us to the process of rolling out new versions of the template, whether that happens frequently or once a year. Updating golden templates means they need to be in a version control system or some other distribution channel that your development teams actively use. If you’re lucky, you’ve built those templates into your build/deployment pipelines.
But what happens when not every team is cleared for updates? You want to make sure that your developers use the version that captures all the latest rules, best practices, and institutional knowledge from your platform engineering team. You also need to ensure that you’re not disrupting their existing infrastructure. And, if you’re not building templates into pipelines, you can only hope that your end users choose the updated version of the template rather than grabbing an outdated one.
That makes the process more complicated, when the entire point was to simplify. You now need to keep track of who uses which version of the template. You also need to track the versions of the various components that you must support, maintain, or eventually upgrade by hand. And, as with any other template, if you make a mistake in the golden template everyone uses, the problems trickle down. Everything built from that template has the same problems baked in, creating a sprawling remediation nightmare.
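To make that bookkeeping concrete, here is a small sketch of the inventory a platform team ends up maintaining. The team names, version numbers, and the `deployments` mapping are invented for illustration. In practice this data would have to be collected from pipelines or repositories.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical inventory: the template version each team last deployed with.
deployments: Dict[str, str] = {
    "payments": "2.3.0",
    "search": "2.1.0",
    "checkout": "2.3.0",
    "reporting": "1.9.0",
}
CURRENT_TEMPLATE = "2.3.0"  # assumed latest published template version


def teams_by_version(inventory: Dict[str, str]) -> Dict[str, List[str]]:
    """Group teams by the template version they are running."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for team, version in sorted(inventory.items()):
        grouped[version].append(team)
    return dict(grouped)


def outdated_teams(inventory: Dict[str, str], current: str) -> List[str]:
    """Teams whose deployments were built from anything but the current template."""
    return sorted(team for team, version in inventory.items() if version != current)


print(teams_by_version(deployments))
print(outdated_teams(deployments, CURRENT_TEMPLATE))  # ['reporting', 'search']
```

Every version that appears in this inventory is one the platform team still has to support, and a defect in any of them affects every team listed under it.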
The Service Catalog Alternative… Maybe
Of course, a single template can’t fit all deployment needs, so many companies create service catalogs of templates. These catalogs allow developers to make template selections through multiple-choice questions or by filling in the blanks. This solves the challenges of relying on a single template for many use cases.
However, the service catalog model often requires end users to select the right infrastructure for their use case, which in turn requires that you write the questions or directions expertly enough to guide them without demanding additional infrastructure knowledge. As anyone who has tried to write instructions or guides quickly learns, that’s hard to do well! Everyone thinks about things differently, and a direction that seems straightforward to me might be really confusing for someone else. Templates and service catalogs don’t escape this issue.
All of this assumes that every template is equal and that every developer who uses a template has the same baseline knowledge. When you share a template with developers who know about infrastructure, they can fill out their part quite easily. For developers who don’t understand infrastructure, however, it’s much harder to fill in the blanks. They may need to turn to platform engineering for help (or make inaccurate guesses, which is arguably worse), and that undercuts the benefits of the template.
Golden templates never solved the problem of changing application needs, either. Templates are, by design, one size fits all. What happens when an application developer uses a template that just barely fits the needs of their application to get it into production, and then updates that application so that it has even higher infrastructure requirements? Is that developer going to assess whether the updated application has new requirements? Or are they just going to keep using the same template? The developer may not pull down a new golden template version, or consider whether they need an extra database, how expanded capacity may change requirements, or how the scale at which the application runs has changed. The developer is even less likely to do so when they are not connected to the day-to-day maintenance of the application’s infrastructure. The result is an application that isn’t set up to handle the load and therefore fails.
The reverse is just as likely. It’s too easy to deploy applications that are over-provisioned and therefore take up valuable space and resources. Over-provisioning also creates a wider surface for potential attacks.
Often, developers need to revisit an application’s infrastructure requirements, and templates don’t require or encourage that thought process.
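The kind of check that templates do not prompt for can be sketched as a comparison of provisioned capacity against observed peak usage. The `Capacity` class and the 30% and 80% thresholds below are illustrative assumptions, not recommended values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capacity:
    cpu_cores: float
    memory_gb: float


def assess(provisioned: Capacity, peak_usage: Capacity,
           low: float = 0.3, high: float = 0.8) -> str:
    """Classify a deployment by its most constrained resource.

    `low` and `high` are assumed utilization thresholds for illustration.
    """
    utilization = max(
        peak_usage.cpu_cores / provisioned.cpu_cores,
        peak_usage.memory_gb / provisioned.memory_gb,
    )
    if utilization > high:
        return "under-provisioned"
    if utilization < low:
        return "over-provisioned"
    return "ok"


# Hypothetical template size and three different applications using it.
template_size = Capacity(cpu_cores=4, memory_gb=16)
print(assess(template_size, Capacity(cpu_cores=3.6, memory_gb=8)))  # under-provisioned
print(assess(template_size, Capacity(cpu_cores=0.5, memory_gb=2)))  # over-provisioned
print(assess(template_size, Capacity(cpu_cores=2, memory_gb=8)))    # ok
```

The same template produces all three outcomes, because the result depends on the application rather than on the template.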
How to Ensure Secure, Efficient IaC
We do not want to give up the positives of golden templates: automating repetitive tasks, providing knowledgeable guidelines for users with less infrastructure knowledge, and building in known best practices and secure patterns.
But if templates are too fragile to reliably deliver secure and efficient IaC, how can organizations solve these problems while still gaining the benefits? Shift your thinking from a template where developers fill in the blanks to asking how you and your organization can generate IaC from standards.
Standards are guidelines your organization defines for how to build and structure infrastructure. Standards establish the requirements to ensure your applications and projects are deployed such that they meet the needs of your business, the industry, and your customers.
Standards change significantly less often than templates, which implicitly combine standards with module and other component versions. Instead of placing the burden on developers to remember and align to these standards, apply your policies and standards automatically when IaC is generated, eliminating the need to scan and fix it after creation. This approach enables developers to deploy applications faster and platform engineers to provision infrastructure more easily because there’s no need to worry about whether the IaC is secure and efficient.
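As a rough sketch of what applying standards at generation time could look like, the example below produces a resource definition from a developer's short request plus organization-wide standards. Everything here is hypothetical: the `STANDARDS` keys, the `generate_storage` function, and the output shape are tool-agnostic placeholders, not the syntax of Terraform or any other IaC tool.

```python
import json
from typing import Any, Dict

# Hypothetical organization-wide standards, maintained by the platform team.
STANDARDS: Dict[str, Any] = {
    "allowed_regions": ["eu-west-1", "us-east-1"],
    "encrypt_at_rest": True,
    "required_tags": ["owner", "cost_center"],
}


def generate_storage(request: Dict[str, Any],
                     standards: Dict[str, Any] = STANDARDS) -> Dict[str, Any]:
    """Build a storage definition, enforcing standards while generating it."""
    if request["region"] not in standards["allowed_regions"]:
        raise ValueError(f"region {request['region']!r} is not approved")

    tags = request.get("tags", {})
    missing = [tag for tag in standards["required_tags"] if tag not in tags]
    if missing:
        raise ValueError(f"missing required tags: {missing}")

    # The developer never sets encryption; the standard decides it.
    return {
        "kind": "object_storage",
        "name": request["name"],
        "region": request["region"],
        "encrypt_at_rest": standards["encrypt_at_rest"],
        "tags": dict(tags),
    }


request = {
    "name": "invoice-archive",
    "region": "eu-west-1",
    "tags": {"owner": "billing", "cost_center": "cc-104"},
}
print(json.dumps(generate_storage(request), indent=2))
```

Because the standards live in one place, tightening a rule changes what is generated next time without republishing a template or tracking which copy each team holds.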