I was doing some research over the weekend, and it became abundantly clear that you can find an Ansible script to do pretty much any repetitive task in your infrastructure. This is good news for Ansible users, and the pool of available scripts will, no doubt, continue to grow as the critical-mass effect kicks in.
My concern is for those who find a script that does what they want/need and drop it straight into their infrastructure. This approach may be a good short-term fix, but it is full of potential peril, and even a cursory read of a complex script is not enough to tell you what it will actually do.
We all know the watchwords of the day are “speed” and “agility,” but take some time to test and get to know those scripts. When (not if) they fail, you’re going to want a clue about what is going on in there so you can track it down, and you certainly want to understand what is being pulled into your infrastructure. While the community is likely to catch outright malicious scripts relatively quickly, scripts that include questionable sources, or that hit issues unique to your infrastructure, are far more likely to land on your desk.
The easiest validation cycle is to:
- Read it for understanding. Know what it is doing, why it is doing it and what it is including.
- Run it in an isolated environment. A pod, a VPN, whatever; just somewhere you can control ingress and egress.
- Scan the results as you would anything headed for production. Make certain the sources are valid for your organization, the inclusions are safe, and any vulnerabilities in the resulting systems are known and acceptable.
- If you can, do runtime testing on it to catch any real-time changes. Most software that makes “changes after install” is either well known or not included, so this step is more peace of mind than mandatory. (A command-level sketch of the whole cycle follows below.)
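To make that concrete for the Ansible case, here is a minimal sketch of that cycle as shell commands. The file names (`site.yml`, `inventory.lab`, `requirements.yml`) are placeholders, and `ansible-lint` is a separate tool you would need to install; adjust to whatever the script you downloaded actually ships with.

```bash
# Hypothetical vetting pass for a downloaded playbook; site.yml and
# inventory.lab are placeholder names for the playbook and an isolated lab inventory.

# 1. Read it for understanding: see what it will run and what it pulls in.
ansible-playbook site.yml --list-tasks
cat requirements.yml                      # roles/collections it depends on, if shipped

# 2. Static checks before anything executes.
ansible-playbook site.yml --syntax-check
ansible-lint site.yml                     # assumes ansible-lint is installed

# 3. Dry run against the isolated environment only, showing intended changes.
ansible-playbook site.yml -i inventory.lab --check --diff

# 4. Real run, still isolated, then scan the results as you would production.
ansible-playbook site.yml -i inventory.lab
```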
If your thought was “takes too long, not necessary,” the risk is on you. But on the list of weaknesses in DevOps, that mentality is definitely in the top three. “Takes too long” is an organizational concern that is different everywhere. “Not necessary” is only true until it isn’t. As an IT professional, having at least a passing familiarity with the environments you are creating is part of the job. Skip it at your own risk.
Much like there are trusted sources for the various types of repos we access, you can curate your own list of trusted playbook developers. I know of a few whose scripts, in a pinch, I’d be willing to drop into my architecture with only a cursory review, then give a more thorough going-over later when there is more time. But I only know that because I’ve read their scripts, and I know that they are careful.
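If it helps to make that curation concrete, one approach (a sketch, not a prescription) is to pull roles and collections only through a pinned requirements file, so the version you reviewed is the version that gets installed. The role and collection names and versions below are purely illustrative.

```bash
# Illustrative only: pin the exact role/collection versions you have actually reviewed.
cat > requirements.yml <<'EOF'
roles:
  - name: geerlingguy.nginx      # example role name, not an endorsement
    version: "3.1.4"             # the release you read and tested
collections:
  - name: community.general
    version: "8.6.0"             # pin rather than floating to latest
EOF

# Install from the pinned file instead of ad hoc "latest" pulls
# (recent Ansible versions install both the roles and collections listed).
ansible-galaxy install -r requirements.yml
```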
While I have Ansible scripts on my mind because that’s what I was looking into, this advice applies to any and all automation scripts that pull from outside sources. Python and Node do a ton of this too, though in those ecosystems it is generally only highly complex projects that pull from a lot of disparate sources. Ansible and other infrastructure automation tools pull from a variety of sources almost by definition (because infrastructure is complex), so they deserve more attention.
In the end, automating all the things changes nothing about responsibility. So, take the time to make certain you know what is going on in your environment, and why. If you are really motivated, take the time to sweep through and remove any references you are certain you don’t need — but nested dependencies make the ROI on this type of work minimal, in my experience.
Meanwhile, you are keeping a ton of servers running and users happy. Keep rocking it, and keep looking for solutions like automation scripts to make you more responsive. Just don’t relinquish your authority over the datacenter to a random submitter on the internet, because responsibility will still be yours.