How to Calculate ROI on Infrastructure Automation

Infrastructure and cloud automation. You know you want it. Infrastructure as code and programmable infrastructure are hot topics these days. The possibilities opened by applying software development practices to infrastructure bring the vision of transparent infrastructure closer than ever.

But before you start automating, you may want to look at the value you get from your investment and make sure your plan makes sense. After working in automation for over a decade, I’ve seen time and again how important it is to have a clear understanding of the value of automation. It’s about time I wrote about calculating the return on investment (ROI) on infrastructure automation. But before getting to the specifics of infrastructure and cloud automation, it may be wise to start with the value of automation in general.

The Value of Automation

The equation is simple. You have a long, manual process. You figure out a way to automate it. Ta-da! What once took two hours now takes two minutes, and you save a sweet 118 minutes. If you run this lovely piece of automation very frequently, the value is multiplied. Saving 118 minutes 10 times a day is very significant. Like magic.

The Value Formula

The formula for the value from automation can be very simple: the time or money saved by automating a manual task, weighed against the investment in automation.

Time or Money Saved by Automating a Manual Task

Time or money saved by automating a manual task should include any benefit you can attribute to the automation. Here are some examples:

  1. The time saved by automation to accomplish a complex task (this process took eight hours. Automation does it in eight minutes. Now we can do it more often).
  2. The time saved for the person who used to do it manually (the person who used to do it manually now has eight free hours, while automation does the work).
  3. The time saved for a person who was waiting for the task to be completed (the person who used to wait eight hours for the task to be done, now waits eight minutes; they can do more).
  4. The time saved by not making human errors (the person who used to spend 10 hours troubleshooting after the eight-hour person finished saves 10 hours).
  5. The time saved by not having to document a manual process (the two people who spend five hours a month teaching other people how to do the eight-hour thing …).
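The savings categories above can be added up in a few lines. This is a minimal sketch rather than a definitive calculation: the hour figures come from the examples in the list, while `RUNS_PER_MONTH`, the reading of item 5 as five hours per person, and the choice to fold item 1 into the waiting-time figure are assumptions made for illustration.

```python
# Hours saved per run, taken from the examples in the list above.
MANUAL_HOURS = 8.0
AUTOMATED_HOURS = 8 / 60  # eight minutes

savings_per_run = {
    "operator time freed": MANUAL_HOURS,                     # item 2
    "waiting time avoided": MANUAL_HOURS - AUTOMATED_HOURS,  # items 1 and 3
    "troubleshooting avoided": 10.0,                         # item 4
}

# Savings that recur monthly regardless of how often the task runs.
savings_per_month = {
    # Item 5. Assumption: two people, five hours each.
    "training avoided": 2 * 5.0,
}

RUNS_PER_MONTH = 4  # assumption: the task runs about once a week


def monthly_hours_saved(per_run, per_month, runs_per_month):
    """Total hours saved per month across all savings categories."""
    return sum(per_run.values()) * runs_per_month + sum(per_month.values())


total = monthly_hours_saved(savings_per_run, savings_per_month, RUNS_PER_MONTH)
print(f"Hours saved per month: {total:.1f}")
```

Keeping each category as a named entry makes it easy to see which benefit dominates and to drop any category that would double count in your own situation.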

Investment in Automation

Investment in automation is often naively calculated as the investment in building automation, but the real investment in automation also includes maintaining and supporting what you have built. So, if someone tells you it took five minutes to whip up this little Bash script, ask yourself what it would look like to update it and support it forever (and when there are 300 more of these scripts). Your calculation of the investment in automation must include building, maintaining and supporting your automation.
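To see how much maintenance and support change the picture, here is a minimal sketch. It assumes the value is expressed as a ratio of hours saved to hours invested; the `Investment` class, the `value_ratio` function and every number below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Investment:
    """Hours invested in one piece of automation (all figures hypothetical)."""
    build_hours: float
    maintain_hours_per_year: float
    support_hours_per_year: float

    def total_hours(self, years: float) -> float:
        # Building is a one-off cost; maintenance and support recur every year.
        recurring = self.maintain_hours_per_year + self.support_hours_per_year
        return self.build_hours + years * recurring


def value_ratio(hours_saved_per_year: float, investment_hours: float, years: float) -> float:
    """Hours saved per hour invested over the period (assumed ratio form)."""
    if investment_hours <= 0:
        raise ValueError("investment must be positive")
    return hours_saved_per_year * years / investment_hours


# A "five-minute" Bash script that saves 20 hours a year, viewed over three years.
script = Investment(build_hours=5 / 60, maintain_hours_per_year=6, support_hours_per_year=4)
YEARS = 3

naive = value_ratio(20, script.build_hours, YEARS)             # counts build time only
realistic = value_ratio(20, script.total_hours(YEARS), YEARS)  # adds upkeep and support

print(f"Naive ratio: {naive:.0f}")
print(f"Ratio including maintenance and support: {realistic:.1f}")
```

With these made-up numbers the script still pays off, but the ratio drops by more than two orders of magnitude once upkeep is counted.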

What Should I Automate?

The formula makes it easy to decide what your automation targets should be—or, in other words, where you would find the most value in automation.

A good start would be to look for tasks that take a long time to do manually or tasks that are done very often, as automating them will save more time and money. But you need to make sure the investment in automating them would justify the gain.

I wish I had a dollar for every time I talked to someone who wanted to automate the “most difficult task” to prove the value of automation. This follows the logic, “If you can do the hardest thing, you can also do the simplest thing,” a bit like making it in New York City. But this is not how automation works. Why? Because automation is very much like software development.

In the software development world, one of the most common answers product teams get from developers when they ask, “Is it possible to implement feature X?” is: “Everything is possible. It’s just a matter of effort.”

Well, automation works the same (and that’s not the only similarity to software product development).

Pretty much everything can be automated. The question is whether it’s worth it.

If you automate something that takes a whole day to do manually, but you only need to do it once a year, it may not be worth automating.

You could be automating the stupidest thing, and only save 20 minutes, but this thing happens 2,000 times every day, so that may very well be worth it. And the more stupid it is to automate, the better.
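The two cases above can be compared directly. This is a rough sketch, assuming a whole day means an eight-hour working day, that automation removes nearly all of the manual time, and that the frequent task really does run every day of the year; the function name is hypothetical.

```python
MINUTES_PER_HOUR = 60


def annual_hours_saved(minutes_saved_per_run: float, runs_per_year: float) -> float:
    """Hours saved per year for a task automated at the given frequency."""
    return minutes_saved_per_run * runs_per_year / MINUTES_PER_HOUR


# A whole-day task done once a year (assumption: an eight-hour working day).
rare_big_task = annual_hours_saved(minutes_saved_per_run=8 * 60, runs_per_year=1)

# A trivial task that saves 20 minutes, 2,000 times a day, every day.
frequent_small_task = annual_hours_saved(minutes_saved_per_run=20, runs_per_year=2000 * 365)

print(f"Rare, day-long task:  {rare_big_task:,.0f} hours saved per year")
print(f"Frequent, small task: {frequent_small_task:,.0f} hours saved per year")
```

Frequency dominates: the small task saves tens of thousands of times more hours per year, which leaves far more room to justify the investment.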

Real Life Infrastructure Automation

Back to the value formula. In real life, there are more facets to this formula. One of the factors that affect the value you get from automation is how many people have access to it.

You can automate something that can potentially run 2,000 times a day, every day; this could be a game-changer in terms of value. But if this is something that 2,000 different people need to do, there is also the question of how accessible your automation is.

Getting other people to run your automation smoothly is not always a piece of cake (“What’s your problem?! It’s in git! Yes, you just get it from there. I’ll send you the link. You don’t have a user? Get a user! You can’t run it? Of course, you can’t, you need a runtime. Just get the runtime. It’s all in the readme! Oh, wait, the version is not in the readme. Get 3.0, it only works with 3.0. Oh, and you edited the config file, right?”).

Your calculation of the time or money saved by automating a manual task needs to consider the value derived from automation accessibility: the savings only materialize for the people who can actually run the automation.

The more people involved, and the more distributed they are, the more difficult it is to achieve accessibility. Your calculation of the investment in automation should therefore also include the investment in self-service.

Adding Self-Service

The self-service part of the equation is often neglected when looking at the value of automation, especially when the focus is very technical. It’s very easy to show a promise of value if you can automate a task, and by the time the question of self-service surfaces, it may be too late.

This is very common with grassroots automation. In my field, I see this time after time, with teams that adopt free infrastructure automation tools (Ansible, Terraform, CloudFormation, ARM, you name it). In 2020, most people understand that there are no free lunches and that it takes precious time to develop and maintain this automation. Most already get that supporting it over time results in an automation team that is often too swamped to do much else.

But I dare say that few properly estimate how difficult it is to provide this automation as a service to other teams. I don’t blame them. For an engineer, the thought “This is too slow! I will automate it” comes very naturally. As a developer, it’s much more difficult to imagine a user who doesn’t understand the beautiful capabilities of Terraform; someone who freezes when they hear, “It’s in GitHub”; someone who isn’t as fascinated by the elegance of the code and has no #$$%$@ idea what to do with code.

In a company with multiple teams, the gap between the people who can automate and the people who need the automation can be huge. And if you ignore it, your automation value formula can be very skewed.

Can Everyone Run This Automation?

When you consider automation, the technical question, “Can it be automated?” is not enough. How efficient the code will be, or how much time it would take to build the automation, is also not enough.

The question, “Can all relevant users run it as a seamless part of their workflow?” has to be asked and weighed. You may be surprised, but highly accessible yet technically mediocre automation can actually win in some cases.
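One way to weigh that question is to scale the savings by the share of users who can actually run the automation. The sketch below is illustrative only: the `AutomationOption` class, the `reachable_fraction` parameter and all of the numbers are hypothetical assumptions, not measurements.

```python
from dataclasses import dataclass


@dataclass
class AutomationOption:
    name: str
    hours_saved_per_user_per_year: float  # how much each user gains
    potential_users: int                  # people who need the automation
    reachable_fraction: float             # share who can actually run it (0 to 1)
    investment_hours: float               # build + maintain + support + self-service

    def realized_hours_saved(self) -> float:
        # Savings only count for users who can run the automation themselves.
        return (self.hours_saved_per_user_per_year
                * self.potential_users
                * self.reachable_fraction)

    def value_ratio(self) -> float:
        return self.realized_hours_saved() / self.investment_hours


# Hypothetical comparison: elegant code few can run vs. plainer self-service automation.
elegant = AutomationOption("elegant, code-only", 50, 2000, 0.05, 1000)
accessible = AutomationOption("mediocre, self-service", 30, 2000, 0.90, 1500)

for option in (elegant, accessible):
    print(f"{option.name}: {option.value_ratio():.1f} hours saved per hour invested")
```

Under these assumptions the technically weaker option comes out well ahead, even though it saves less per user and costs more to provide, because far more people can use it.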

How the Value Equation Works in RPA

Robotic process automation (RPA) is a great example. RPA systems typically provide users with lists of automated actions by recording tasks that the user performs on a graphical user interface.

For many coders and automation professionals, this approach is almost sacrilegious. You can automate fast, but the technology’s ability to scale is extremely limited. It gives a wide audience access to automating tasks without enforcing coding best practices (we already know the masses can’t be automation experts). But RPA can offer high ROI in some cases. If we return to the value formula, this quick-and-dirty type of automation is highly accessible, and therefore the investment in self-service is low. If using RPA means 100 or 1,000 times as many people can benefit from automation, compared to high-code tools that only a few can use, we may be able to get impressive ROI.

Does this mean we should use RPA for DevOps automation? Not really. RPA is not always the right tool for the job, and it won’t always provide the best value in automation. RPA works well when you want to eliminate toil in small tasks (e.g. it is very popular for accounting tasks that a person triggers manually), but the value is lower when you have strict scale, efficiency and reliability requirements, which is typical in DevOps.

In the case of DevOps, with its heavily machine-to-machine interfaces, the investment you will have to make in scaling and maintaining RPA can be very high, and the ROI would eventually be very low. RPA may therefore not provide the greatest value in this field, and it makes sense to opt for more scalable options. But this doesn’t mean you should focus solely on good engineering; the question of accessibility and self-service must still be addressed in your ROI calculation.

Infrastructure Automation in DevOps

In recent years, there has been a growing awareness of self-service in traditional infrastructure automation. Often led by IT and Ops, some of it started with simple means, such as connecting ticketing systems (Jira, ServiceNow) to automation tools.

However, in DevOps infrastructure automation, accessibility of automation is often an afterthought. DevOps automation initiatives usually start with developers. The natural tendency is to invest in high-code automation and to focus on the value derived from the time saved on the execution of a task rather than on multiplying that value by enabling a broader user base. Existing self-service mechanisms are thrown aside, and for good reasons: they are too slow and they don’t seamlessly fit into the DevOps process.

Ignoring self-service is one of the reasons why DevOps initiatives show great promise at small scale and fail at larger scales.

Moreover, since DevOps spans so many areas of subject matter expertise, the chance of leaving critical expertise outside the coders’ silo is very high, especially expertise in operations, security and finance. Ignoring these experts’ accessibility to automation leaves them outside the DevOps loop.

I was very happy to see Gartner’s “Market Guide for Infrastructure Automation Tools,” published in April. If you don’t have access to the gated content, I recommend taking a look at Manju Bhat’s post on the topic. Gartner finally highlights self-service as a critical capability for infrastructure automation:

“Infrastructure automation tools provide DevOps teams with on-demand, self-service access to standardized environments. Adopting these tools enhances their ability to deliver customer-focused agility and velocity improvements while consuming new technology platforms with limited disruption.”

I see this acknowledgment of the importance of customer-facing accessibility as a crucial milestone on the way to realizing value in larger-scale DevOps initiatives.

Additional Value Factors in Infrastructure Automation

Unlike general automation, infrastructure automation is a field in which there are additional factors in the value formula, which fall under the Infrastructure Governance category:

  1. Cloud security.
  2. Cloud cost.
  3. Compliance with industry standards.

These elements need to be taken into consideration when addressing the value derived from automation. Infrastructure automation, especially in public cloud, involves sensitive assets: compute, network, storage, data and security elements. These elements can be expensive, and they could be very vulnerable to attacks.

If ignored, these three factors can diminish value. But handling them properly in automation can also generate value.

The “time/money saved by automating a manual task” could, therefore, include an additional value segment: the value generated by governance that is embedded in the automation.

And the investment calculation should also include the investment in embedded governance.
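Putting the pieces together, the extended formula might be sketched as follows. This assumes the value is a ratio of total savings to total investment, with every term in the same unit (for example, dollars per year); the function, its parameter names and the figures are all hypothetical.

```python
def automation_value(
    task_savings: float,
    accessibility_savings: float,
    governance_savings: float,
    build_maintain_support: float,
    self_service_investment: float,
    governance_investment: float,
) -> float:
    """Total savings divided by total investment (assumed ratio form)."""
    total_savings = task_savings + accessibility_savings + governance_savings
    total_investment = (build_maintain_support
                        + self_service_investment
                        + governance_investment)
    if total_investment <= 0:
        raise ValueError("total investment must be positive")
    return total_savings / total_investment


# Hypothetical yearly figures in dollars.
without_governance = automation_value(100_000, 150_000, 0, 60_000, 30_000, 0)
with_governance = automation_value(100_000, 150_000, 50_000, 60_000, 30_000, 10_000)

print(f"Without embedded governance: {without_governance:.2f}")
print(f"With embedded governance:    {with_governance:.2f}")
```

In this made-up example, governance adds to both sides of the formula and still raises the overall ratio, because the avoided security, cost and compliance problems outweigh the extra investment.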

Build Your Value Formula

If you are working on an automation initiative, don’t forget your value formula. It will help you build a business case, get management buy-in and achieve a better understanding of value factors you should focus on.

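The TLDR version: weigh the time or money saved by automating a manual task, including the value of accessibility and of embedded governance, against the investment in building, maintaining and supporting the automation, including self-service and embedded governance.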

Maya Ber Lerner

Maya is vice-president of product management at Quali. Originally a biomedical engineer, she somehow got into software and automation, and immediately felt at home. When not traveling she is the IT person at home, and the only one allowed to change router configurations. Outside of work, she likes drawing, carving and reading.
