At RackN, we’ve been working for years to make DevOps portable across platforms. For us, that means the ability to move between clouds and from cloud to metal. Of the several challenges to portability—composability, API heterogeneity, sequence, cross-system dependencies—making configuration management tools work with services has been particularly important, as significant parts of building systems rely on services that are not directly configurable.
Let’s look at DNS as an example. Whenever we configure services on a node, there are DNS implications. If we bring up a web server, we want to use a logical name for the node. If it’s a secure web server, that name should match our certificate. If we want to make that service discoverable, we likely want to add a service-oriented name for that node, such as “clustermanager6.example.com” or similar. If you include names for BMC and rack location (not to mention IPv6 naming), it’s easy to imagine nodes with 10 names. Even cloud nodes have similar issues. That’s a lot of entries to manage!
The problem with all those DNS names, and similar services, is that they are NOT configurable from the node. They are services with APIs that require restricted authorization.
Services and Automation
DNS is just one simple example. System-level DevOps has many places where the configuration workflow needs to include external service actions. In cloud environments, that generally means either writing pre- or post-configuration steps that talk to the specific cloud API or using one of the cloud-specific template files. In physical environments, it generally means a lot of added manual steps or tightly constraining automation to inventory map files. No matter what, nodes don’t own DNS configuration: it’s a node-external service whose API details are site-specific.
All of these approaches make DevOps brittle and site-locked. Five years ago that was okay, when choices were more limited and scripts were hand-crafted by DevOps artisans. Today, we need our automation to be robust, portable and transparent.
We realized system automation needed to treat service interactions as equal to configuration. That means when we’re working on provisioning, we make sure service steps can be woven into the sequence of configuration steps. They must not be assumed correct or delegated to pre- or post-conditions because that makes it impossible to test and re-create working environments.
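To make the idea of weaving service steps into the configuration sequence concrete, here is a minimal Python sketch. It is not how Digital Rebar is implemented; the Step and FakeDNS classes, the step names, and the addresses are all hypothetical and exist only to show a service call sitting in the middle of an ordered workflow rather than before or after it.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Step:
    name: str
    kind: str                    # "config" (runs on the node) or "service" (calls an external API)
    run: Callable[[dict], None]  # receives and may update the shared workflow state


@dataclass
class FakeDNS:
    """Hypothetical stand-in for a site-specific DNS API."""
    records: dict = field(default_factory=dict)

    def register(self, name: str, address: str) -> None:
        self.records[name] = address


def run_workflow(steps: list, state: dict) -> list:
    """Run every step in order and return a log of what ran."""
    log = []
    for step in steps:
        step.run(state)
        log.append(f"{step.kind}:{step.name}")
    return log


dns = FakeDNS()
state = {"name": "clustermanager6.example.com", "address": "192.0.2.10"}

steps = [
    Step("install-web-server", "config", lambda s: s.update(web_installed=True)),
    # The service step sits between two configuration steps, not outside them.
    Step("register-dns-name", "service", lambda s: dns.register(s["name"], s["address"])),
    # This configuration step depends on the service step having succeeded.
    Step("enable-tls", "config", lambda s: s.update(tls_enabled=s["name"] in dns.records)),
]

log = run_workflow(steps, state)
```

Because the DNS registration is an ordinary step, the whole sequence can be replayed against a test double like FakeDNS, which is what makes the environment testable and re-creatable.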
What can you do? We’ve found service registration tools (we like Consul.io) are essential and relatively easy to implement to help track services that we manage or proxy. More broadly, we’ve been working in the open on Digital Rebar to blend configuration and services into a single workflow.
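As a rough illustration of how little is needed to start tracking a service, the sketch below builds a registration request for a local Consul agent using its HTTP endpoint /v1/agent/service/register. The service name, address, tags, and the helper function itself are assumptions made up for this example, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def consul_register_request(name, address, port, tags=(), agent_url="http://127.0.0.1:8500"):
    """Build (but do not send) a service registration request for a Consul agent."""
    payload = {
        "ID": f"{name}-{address}",  # assumed ID convention, unique per instance
        "Name": name,
        "Address": address,
        "Port": port,
        "Tags": list(tags),
    }
    return urllib.request.Request(
        url=f"{agent_url}/v1/agent/service/register",
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical example: track an external DNS server that we proxy.
req = consul_register_request("dns", "192.0.2.53", 53, tags=["external", "proxied"])

# To register for real, send the request to a running agent:
# urllib.request.urlopen(req)
```

Registering external or proxied services this way gives the rest of the automation one place to look them up, instead of hard-coding site-specific endpoints in each script.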
The first step is to re-examine your configuration management scripts (and their prerequisites) for hidden service API interactions. Exposing those external dependencies will help you improve your automation repeatability and portability.
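One low-effort way to start that review is to scan scripts for commands that usually signal an external service call. The sketch below is only a starting point: the pattern list is a guess at common tools, and the sample script and its URL are invented for illustration, so expect to tune both for your own site.

```python
import re

# Hypothetical starter patterns; extend them for the tools your site actually uses.
SERVICE_CALL_PATTERNS = {
    "http-call": re.compile(r"\b(curl|wget)\b"),
    "dns-update": re.compile(r"\bnsupdate\b"),
    "cloud-cli": re.compile(r"\b(aws|gcloud|az)\s+\w+"),
}


def find_service_calls(script_text):
    """Return (line_number, label, line) for each non-comment line matching a pattern."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip comments and the shebang line
        for label, pattern in SERVICE_CALL_PATTERNS.items():
            if pattern.search(stripped):
                hits.append((number, label, stripped))
    return hits


# Invented sample script with two hidden service interactions.
sample = """\
#!/bin/sh
apt-get install -y nginx
curl -X POST https://dns.example.com/api/records -d name=web1
nsupdate -k /etc/dns.key update.txt
systemctl restart nginx
"""

for number, label, line in find_service_calls(sample):
    print(f"line {number}: {label}: {line}")
```

A text scan like this will miss calls buried inside libraries or modules, so treat its output as a list of places to look first rather than a complete inventory.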