The Difference Between Capacity and Scalability Planning

Not only does the tech industry love terms, we also love to make them super nuanced. Consider SecOps versus DevSecOps, continuous delivery versus continuous deployment, or hybrid cloud versus multi-cloud. These terms may seem similar on the surface, but when you dig in they are quite different.

Here are two more: capacity planning versus scalability planning. Both are critical to keeping applications up and running, but they are two different activities.

The reason we invest time in capacity and scalability planning is simple: We want to make sure that system resources such as compute, storage and memory are not the cause of an application outage. This planning is a response to expected and unexpected increases in application usage, as well as the steady growth of application adoption.

Here are a few differences between capacity planning and scalability planning.

Long Term Versus Short Term

Capacity planning focuses on base resources as they relate to long-term growth in application traffic. The goal is to have enough resources to handle the steady state of the application without breaking the bank by overestimating usage. Capacity planning carries a big assumption: that the application always needs a pool of resources available. Scalability planning, by contrast, deals with short-term swings in demand above that baseline.
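To make the long-term view concrete, here is a minimal sketch of a baseline estimate. It assumes traffic grows at a steady compound monthly rate and that each instance handles a known number of requests per second. The function name, the parameters and the 20% headroom figure are all illustrative assumptions, not values from any particular tool.

```python
import math


def baseline_instances(current_peak_rps, monthly_growth, months_ahead,
                       rps_per_instance, headroom=0.2):
    """Estimate the instance count needed for projected steady-state load.

    All inputs are hypothetical planning figures:
      current_peak_rps  - today's steady-state peak, in requests per second
      monthly_growth    - assumed compound growth rate (0.05 means 5% a month)
      months_ahead      - how far out the plan looks
      rps_per_instance  - measured throughput of a single instance
      headroom          - extra margin kept on top of the projection
    """
    # Compound the current load forward over the planning horizon.
    projected_rps = current_peak_rps * (1 + monthly_growth) ** months_ahead
    # Add the safety margin, then convert load into whole instances.
    required = projected_rps * (1 + headroom) / rps_per_instance
    return math.ceil(required)


if __name__ == "__main__":
    # 1,000 rps today, 5% monthly growth, 12-month horizon, 150 rps per instance.
    print(baseline_instances(1000, 0.05, 12, 150))
```

In practice the growth rate and per-instance throughput would come from historic data and load tests rather than guesses, which is why this kind of planning takes research.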

Automated Versus Static

Scalability could be a manual task: You see utilization creeping up and you respond with more instances. This is probably more common than not, but I don’t believe it’s the overall objective of most teams. Auto-scaling is the goal, where the system knows on its own when it’s time to spin up new resources. The public cloud providers, Kubernetes instrumentation and similar tooling help support this. Capacity planning, on the other hand, is a more deliberate identification of needs, followed by statically setting those resource levels within the cloud.
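As an illustration of how a system can decide on its own when to add resources, here is a rough sketch of a proportional scaling rule. It is loosely modeled on the formula documented for the Kubernetes Horizontal Pod Autoscaler (desired replicas = ceil(current replicas x current metric / target metric)), but it is a simplification: the function name, default limits and tolerance handling are assumptions for this example only.

```python
import math


def desired_replicas(current_replicas, current_util, target_util,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Return the replica count a simple proportional autoscaler would pick.

    current_util and target_util are utilization percentages (for example,
    90 and 60). min_replicas, max_replicas and tolerance are hypothetical
    defaults chosen for illustration.
    """
    ratio = current_util / target_util
    # Skip scaling when utilization is already close to the target,
    # which avoids flapping on small fluctuations.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Scale the replica count in proportion to how far off target we are.
    desired = math.ceil(current_replicas * ratio)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 4 replicas at 90% utilization with a 60% target.
    print(desired_replicas(4, 90, 60))
```

The static counterpart in capacity planning would be fixing the replica count by hand and revisiting it on a planning cycle instead of on every metric sample.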

Strategic Versus Responsive

Similar to static versus automated, capacity planning is a more strategic activity, whereas scalability planning is a more responsive one. Usually, scalability comes up during peak usage events such as Black Friday sales or school registrations.

Capacity planning generally calls for more research, leveraging tools such as load balancers along with historic data to understand the best path forward. It should also involve input from the broader team, especially development, to discuss the resource usage of new functionality and application architectures.

There is a good possibility that scalability planning could be all that is left for some modern applications. This might be true because resource allocation can theoretically be 100% application driven, with resources always one step ahead of current utilization. To embrace this model, resources need to be provisioned 100% automatically, which means that all infrastructure needs to be scripted. For companies adopting serverless components in their applications, this is already inherently true.

Capacity planning is where most operations teams spend their time, but resource utilization is shifting left and scalability planning is becoming more common. Scalability planning has become the responsibility of DevOps engineers, SREs and even developers. The main reason capacity planning will survive the long haul relates to budgets. While the application itself can fully control the infrastructure laid out before it, the IT organization and the business as a whole will want to impose an upper limit, so there is an indicator of when the cost of running the application is becoming an issue. Capacity planning in this case will be purely a cost control mechanism, negotiated just as any budget is.
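One way to picture capacity planning as a cost control is to turn a negotiated budget into a hard ceiling on what automated scaling is allowed to request. The sketch below assumes a flat hourly price per instance and roughly 730 hours in a month. The function names and all figures are hypothetical.

```python
def max_instances_for_budget(monthly_budget, hourly_cost_per_instance,
                             hours_per_month=730):
    """Convert a monthly budget into the most instances it can fund.

    Assumes every instance runs all month at a flat hourly price, which
    is a deliberate simplification.
    """
    monthly_cost_per_instance = hourly_cost_per_instance * hours_per_month
    return int(monthly_budget // monthly_cost_per_instance)


def apply_budget_cap(requested_instances, cap):
    """Clamp a scaling request to the cap and report whether the cap was hit.

    The second return value is the indicator that cost is becoming an
    issue and the budget may need to be renegotiated.
    """
    granted = min(requested_instances, cap)
    return granted, requested_instances > cap


if __name__ == "__main__":
    cap = max_instances_for_budget(5000, 0.10)  # assumed $5,000 a month at $0.10 an hour
    print(cap)
    print(apply_budget_cap(80, cap))
```

Here the application still decides how much it wants, while the cap records what the organization has agreed to pay for.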

— Chris Riley