Does your cloud work for you, or do you work for your cloud? Given the enormous dependency and costs associated with cloud computing, that is a question more and more DevOps teams are asking these days. After more than a decade of pressure from higher-ups to move workloads to the public cloud, DevOps teams are now facing the reality that there is such a thing as too much cloud.
Don’t take that the wrong way. I’m not here to tell you the cloud is inherently bad or we should return to the days when all our critical workloads ran on bare-metal servers housed in a closet in the back of the office. Public cloud services offer a number of critical benefits and DevOps teams should take advantage of them.
However, to use the cloud responsibly, it’s crucial to control for two risks. The first is cloud computing costs, which can easily get out of hand if not managed properly. The second is a rigid dependency on public cloud services that undercuts agility and makes it more difficult to deploy applications at scale, which is exactly the opposite of what the cloud is supposed to do.
In this article, we’ll take a look at what causes each of these risks and how DevOps teams can respond to ensure their cloud strategy actually delivers value.
Risk 1: Cloud Computing Cost
You’ve probably heard a lot of talk about how the cloud saves money. If you use the cloud effectively, that is true. However, because of the complexity of cloud computing cost structures and the difficulty of comparing cloud computing costs to on-premises costs, it’s easy to end up with a cloud hangover after paying a much higher cloud computing bill than you bargained for.
There are several distinct challenges at play in predicting and understanding the true costs of your cloud. Let’s take a look at each one.
Never-Ending Cloud Computing Costs
When you move workloads to the cloud, your costs vary month by month and are spread out over time. At first, moving to the cloud feels like a great money-saver because it frees you from having to make large, upfront purchases of on-premises hardware. However, just as leasing a car ultimately costs more in the long run for most people than owning, paying monthly for cloud services hosted on someone else’s infrastructure adds up. If you ignore this fact, you can end up failing to appreciate the true cost of your cloud computing bill over the long term.
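To make the lease-versus-own comparison concrete, here is a minimal sketch of a break-even calculation. The function name and every dollar figure are hypothetical, chosen only for illustration, and the model deliberately ignores real-world factors such as hardware refresh cycles, staffing and discounts.

```python
import math

def break_even_month(upfront_cost, onprem_monthly, cloud_monthly):
    """Return the first month in which cumulative cloud spend exceeds
    cumulative on-premises spend, or None if it never does.

    Simplified model: on-prem = upfront_cost + onprem_monthly * months,
    cloud = cloud_monthly * months.
    """
    if cloud_monthly <= onprem_monthly:
        return None  # the cloud never becomes the more expensive option
    # Solve cloud_monthly * m > upfront_cost + onprem_monthly * m for m.
    return math.floor(upfront_cost / (cloud_monthly - onprem_monthly)) + 1

# Hypothetical figures, for illustration only.
month = break_even_month(upfront_cost=24_000, onprem_monthly=300, cloud_monthly=1_100)
print(f"Cloud becomes more expensive in month {month}")
```

With these made-up inputs, the two options cost the same after 30 months, and the cloud is the more expensive choice from month 31 onward. Plugging in your own numbers is the point; the sketch only shows the shape of the comparison.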
Unpredictable Cloud Costs
Another fundamental challenge of cloud computing cost compared to on-premises is that, in the cloud, cost structures are opaque and unpredictable. Public cloud vendors can change the prices of services with little notice. The same service might cost more in one cloud region than another, while some services have multiple fees attached. For example, if you store data in the public cloud, you pay not just a per-gigabyte cost to store it but also fees whenever you move the data. This adds up to tremendous difficulty in predicting your true cloud computing bill.
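The storage example can be sketched as a two-part bill. This is only an illustration: the function name and both per-gigabyte rates are assumptions, not any vendor's actual pricing, and real bills often include further line items such as request fees.

```python
def monthly_storage_bill(stored_gb, transferred_gb,
                         storage_rate_per_gb, transfer_rate_per_gb):
    """Return (storage_fee, transfer_fee, total) for one month.

    Both rates are inputs because they differ by vendor and region.
    """
    storage_fee = stored_gb * storage_rate_per_gb
    transfer_fee = transferred_gb * transfer_rate_per_gb
    return storage_fee, transfer_fee, storage_fee + transfer_fee

# Hypothetical rates and volumes, for illustration only.
storage_fee, transfer_fee, total = monthly_storage_bill(
    stored_gb=5_000,
    transferred_gb=2_000,
    storage_rate_per_gb=0.02,
    transfer_rate_per_gb=0.09,
)
print(f"storage: {storage_fee:.2f}  transfer: {transfer_fee:.2f}  total: {total:.2f}")
```

In this made-up case, moving 2,000 GB costs more than storing 5,000 GB, which is exactly the kind of surprise that a per-gigabyte storage price alone does not reveal.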
On-premises costs are not entirely straightforward to predict either, of course. There are some fluctuating expenses, such as the cost of electricity. However, the lion’s share of on-premises computing costs is the hardware itself, which is a very simple and tangible expense. It’s also a one-time expenditure that you can amortize over the life of your servers. Once you have purchased a server, you don’t have to worry that its effective cost will change month to month, as you would with a virtual server in the public cloud.
Overlooked Workloads
In the cloud, it’s trivially easy to spin up a new workload. It’s also easy to leave a workload running after you no longer need it—and public cloud vendors don’t go out of their way to notify you when you have resources that are activated but sitting idle. You can’t blame them; that would be like a restaurant telling you you’re ordering too much food or a builder telling you your house is bigger than you need.
If you don’t carefully manage your cloud workloads, they add up to extraneous and unnecessary costs.
To be sure, you can have unnecessary workloads running in an on-premises environment, too. But the difference is there is no substantial cost increase on-premises that accompanies each extraneous workload. You’ve already paid for the servers and you don’t get a bill for every service you have running on your own infrastructure.
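A periodic review for forgotten workloads can be sketched in a few lines. Everything here is hypothetical: the `Workload` record, the 5 percent CPU threshold and the sample inventory are assumptions for illustration, and in practice the utilization and cost figures would come from your own monitoring and billing exports.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    avg_cpu_percent: float  # average utilization over the review window
    monthly_cost: float     # what this workload adds to the bill

def find_idle(workloads, cpu_threshold=5.0):
    """Return workloads whose average CPU use is below the threshold."""
    return [w for w in workloads if w.avg_cpu_percent < cpu_threshold]

def wasted_spend(workloads, cpu_threshold=5.0):
    """Sum the monthly cost of workloads that look idle."""
    return sum(w.monthly_cost for w in find_idle(workloads, cpu_threshold))

# Hypothetical inventory, for illustration only.
inventory = [
    Workload("web-frontend", avg_cpu_percent=42.0, monthly_cost=310.0),
    Workload("old-staging", avg_cpu_percent=0.8, monthly_cost=145.0),
    Workload("demo-2019", avg_cpu_percent=0.1, monthly_cost=90.0),
]

for w in find_idle(inventory):
    print(f"{w.name} looks idle and costs {w.monthly_cost:.2f} per month")
print(f"Potential monthly savings: {wasted_spend(inventory):.2f}")
```

Low CPU use is only a rough signal. A workload can be quiet and still necessary, so treat the output as a list of candidates to investigate rather than a list of things to shut down.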
Risk 2: Lack of Cloud Agility
Beyond unnecessary costs or inefficient cost structures, relying too heavily on the public cloud also undercuts the agility of DevOps teams.
When you run a service or workload in the public cloud, moving it to different infrastructure can be challenging. Migrating from one cloud to another is usually difficult in itself, and doing so also entails adapting to a new cost structure and possibly paying more for the same set of services.
There is an irony here: Conventional wisdom suggests that one of the main reasons to migrate to the cloud is that the cloud is inherently more scalable and agile than on-premises infrastructure. That may be true in some cases, but when you factor in the cost and the difficulty of migration from one cloud to another, the cloud quickly starts to look less agile and scalable.
On-premises infrastructure doesn’t always scale painlessly either, of course. I don’t dispute that. But the cloud doesn’t always live up to the hype when it comes to scalability and agility. On-premises and cloud-based infrastructures both have limitations in this regard.
Cutting the Cord: Making the Cloud Work For You
Now that we’ve identified the inherent (and underappreciated) cost and agility challenges of cloud computing, let’s consider solutions.
Optimize Cloud Computing Costs
Cost optimization is something you should always do on any infrastructure, and the cloud is no exception.
Plenty of guides cover cloud cost optimization. Some vendors also offer tools designed to monitor your cloud workloads and identify underused resources or help you right-size cloud service instances. These are all good ways to help reduce your cloud computing bill.
Be Cloud-Agnostic
Another useful strategy is to architect your cloud workloads in a way that makes them as cloud-agnostic as possible. Instead of deeply integrating with a particular cloud vendor’s services, rely on third-party tooling whenever you can.
For example, if you run containerized apps in the cloud, run your containers in standard virtual server instances and use a third-party registry instead of using the container registry and managed container service of one cloud vendor. This will require a bit more work on your end, but it will give you the freedom to move your workload to another cloud easily. It also enables cost-efficient scalability, because you are not beholden to the cost structures tied to the vendor’s managed container service. Rather, you are using only virtual server instances (and perhaps some storage to go along with them), which have much more flexible pricing on public clouds.
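One way to picture the container example is as a deployment description that contains nothing vendor-specific: a registry host and a list of plain virtual servers. The sketch below is an assumption-laden illustration, not a deployment tool. The `Target` record, the example.com host names and the image name are all hypothetical, and the code only builds command strings rather than running anything.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Target:
    name: str           # label for the environment
    registry_host: str  # third-party registry, independent of any cloud
    ssh_hosts: tuple    # plain virtual servers that run the containers

def image_ref(target, repository, tag):
    """Build a fully qualified image reference for the target's registry."""
    return f"{target.registry_host}/{repository}:{tag}"

def run_commands(target, repository, tag, port):
    """Return one 'docker run' command per server; nothing is executed."""
    ref = image_ref(target, repository, tag)
    return [
        f"ssh {host} docker run -d -p {port}:{port} {ref}"
        for host in target.ssh_hosts
    ]

# Hypothetical environments: same registry, different virtual servers.
cloud_a = Target("cloud-a", "registry.example.com",
                 ("vm1.cloud-a.example.com", "vm2.cloud-a.example.com"))
cloud_b = Target("cloud-b", "registry.example.com",
                 ("vm1.cloud-b.example.com",))

for command in run_commands(cloud_a, "shop/api", "1.4.2", 8080):
    print(command)
```

Because the image reference and the run command are identical across targets, moving from `cloud_a` to `cloud_b` in this sketch means changing a list of host names and nothing else.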
Don’t Be Afraid of On-Prem
My final recommendation, and my most important one, is to embrace on-premises infrastructure.
Again, I say this not because I want to move backwards and return to a world where everything runs on-premises. I’m not anti-cloud.
However, from a cost and agility perspective, the public cloud is not the best fit for every type of workload. Workloads that have very convoluted cost structures in the public cloud—or that would be difficult to run in a public cloud without sacrificing a great deal of mobility and scalability—are often a better fit for on-premises environments.
You may be thinking that embracing on-premises means sacrificing all of the advantages of the cloud or having to forgo cloud-native workloads. Fortunately, that’s not the case. Some software vendors offer public-cloud-style services that run on your own enterprise infrastructure and expose the same APIs as the public cloud.
Thus, you can obtain the cost and agility benefits of on-premises infrastructure (not to mention the security advantages) while still running cloud-native applications. In fact, you don’t even need to overhaul your CI/CD pipeline; you can simply deliver your cloud-native apps onto your own servers, which run services designed to be compatible with those of the public cloud.
Conclusion
Don’t move workloads to the cloud just because it seems like the thing to do or because someone told you it’s necessarily better. Some workloads work better in the cloud while others are best hosted on-premises. An all-in cloud strategy is not typically the most cost-efficient or agile architecture you can adopt.
— Ariel Maislos