I’m a bad programmer.
There, I got that out of the way.
The truth is, I’m a sysadmin who writes code infrequently, on an island, and outside of any disciplined requirements, release, or QA practices. I need to run an ETL that kicks off a file transfer. I’ll write some code. I’ve got a workflow that needs a crappy but functional web UI. Now I’m a developer. But in between these kinds of projects are days, and sometimes weeks, of meetings and planning and dozens of things that are not code, which means by the time I come back to actual programming, I’m starting fresh every time.
I don’t think I’m alone. I think there are a lot of sysadmins in the world with functional Perl chops, but because they don’t code daily, they don’t instinctively reach for code when they solve problems. Systems are still running, the internet is still on, so this way of solving problems must be OK. It is, but there are drawbacks.
Rolling changes out to 4 dozen of your 8 dozen systems and verifying each received a 4 line change to that logship script with scp? Good luck with that. The old SA just left, and now you need to find all his stuff? GLHF. Some report hasn’t been sending for 4 days, what changed?! Yeah… yeah.
Where software development got it right
Great systems need three things: consistent configuration, maintainability, auditability. Good dev & deploy practices get you all three. Version control systems like git provide a great mechanism to manage code, track versions, and collaborate in reliable and predictable ways. Issue trackers like Jira or Rally that integrate with version control let you easily tie specific changes back to the original reason for the change. Deployment systems like Jenkins and Capistrano give you one-stop interfaces to track what got deployed where, when, and by whom.
Take it as given that few SA teams are using all of these approaches for their work today. Probably you’ve got a ticket system, and you definitely get yelled at if you don’t comment your scripts. Maybe those comments link back to the work order number? Migrating what you do today to a more DevOps-focused approach isn’t going to happen overnight. So how do you start?
Version control or GTFO – Yeah. I know. It’s on the list! No. Stop what you are doing, and gather all of your nginx.conf’s, your limits.conf, pg_hba.conf, and commit them to a git repo. I can’t emphasize this enough – if you are not enforcing version control in your systems administration team you’ll reap the whirlwind.
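If "gather everything and commit it" sounds like a chore, it can be scripted. Here is a minimal sketch in Python, assuming git is installed and the target directory is already a git repo. The file paths in CONFIG_FILES and the function names are my own placeholders, not anything standard, so swap in whatever your hosts actually have.

```python
import shutil
import subprocess
from pathlib import Path

# Hypothetical list of configs to track; adjust the paths for your systems.
CONFIG_FILES = [
    "/etc/nginx/nginx.conf",
    "/etc/security/limits.conf",
    "/var/lib/pgsql/data/pg_hba.conf",
]


def collect_configs(paths, repo_dir):
    """Copy each existing file into repo_dir, mirroring its absolute path.

    Returns the repo-relative paths that were copied. Missing files are skipped.
    """
    repo = Path(repo_dir)
    copied = []
    for name in paths:
        src = Path(name)
        if not src.is_file():
            continue
        rel = src.relative_to(src.anchor)  # /etc/foo.conf -> etc/foo.conf
        dest = repo / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
        copied.append(str(rel))
    return copied


def commit_if_changed(repo_dir, message):
    """Stage everything and commit, but only if git reports changes."""
    subprocess.run(["git", "add", "-A"], cwd=repo_dir, check=True)
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    if not status.stdout.strip():
        return False  # nothing new to commit
    subprocess.run(["git", "commit", "-m", message], cwd=repo_dir, check=True)
    return True


# Example use (the repo path is a placeholder):
#   collect_configs(CONFIG_FILES, "/srv/config-repo")
#   commit_if_changed("/srv/config-repo", "Snapshot configs from web01")
```

Run it once per host into a per-host directory and you have a baseline to diff against, which is most of the battle.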
Change, Commit, Test. Welcome to the new DevOps discipline. You’re going to hate it at first, but 7 months from now when the webservers are flopping because somebody changed something, you’ll be grateful for a quick and reliable rollback.
Puppet or Chef or Ansible – get a basic setup. One master, three nodes. My starting point here was user management: I set up a basic user-add class, and let it manage ssh keys. Let that soak for a couple weeks and slowly roll it out to the rest of your infrastructure. Ten bucks says you find a dozen hosts with inconsistent UIDs that you’ve been meaning to fix for years.
Once that feels comfortable, move some basic scripts into your setup, and roll those out slowly. Take the db configurations, commit them to the repo, and move ’em over to configuration management too. The point is, moving to configuration management systems is not an atomic event across your entire infrastructure. Iterate against it. Make small weekly changes that roll up to quarterly progress.
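If you want to collect on that bet before rolling anything out, a rough sketch like the one below will do it. It assumes you have already pulled the contents of /etc/passwd from each host by whatever means you like. The function names and the host names in the example are made up for illustration.

```python
from collections import defaultdict


def parse_passwd(text):
    """Return {username: uid} from /etc/passwd-format text."""
    users = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        # Expect name:password:uid:...; skip anything malformed.
        if len(fields) < 3 or not fields[2].isdigit():
            continue
        users[fields[0]] = int(fields[2])
    return users


def find_uid_conflicts(passwd_by_host):
    """Return {username: {uid: [hosts]}} for users whose UID differs by host."""
    seen = defaultdict(lambda: defaultdict(list))
    for host, text in passwd_by_host.items():
        for user, uid in parse_passwd(text).items():
            seen[user][uid].append(host)
    return {
        user: {uid: sorted(hosts) for uid, hosts in uids.items()}
        for user, uids in seen.items()
        if len(uids) > 1
    }


# Example with two hypothetical hosts:
#   find_uid_conflicts({
#       "web01": "alice:x:1001:1001::/home/alice:/bin/bash",
#       "db01": "alice:x:1002:1002::/home/alice:/bin/bash",
#   })
```

Anything this reports is a user you want pinned to a single UID in your user-management class before it touches those hosts.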
Here’s where the rubber meets the road:
“Git is going to suck because deploying is so much easier when you just vim a file on a server” – Every Sysadmin Ever
If you’ve got a repo, and you’ve got say, puppet… it’s a pretty easy step to have commits to that repo pulled and auto-deployed to your puppet managed nodes. Which means rollbacks can be too.
Pay off your tech debt – I don’t have to tell you about the 4 dozen scripts you’ve got out there, unmanaged, brittle, and needing constant tweaks to keep them lined up with system changes. These are the bread and butter of system administrators, and can provide a great opportunity to implement change. Next time you have to touch that script for a new SCP server, stop. Stop! Do this instead:
- check it in to your git repo, with some meaningful comments about what it does
- add it to your configuration management system (puppet/chef/ansible)
- if it’s bash or perl, rewrite it in Python. It’s going to take a bit more time to rewrite all your scripts, but spending this time gets you more comfortable writing code regularly, and gets you more familiar with a modern scripting language.
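To give a feel for that last step, here is a sketch of what a one-line bash log shipper might look like rewritten in Python. It assumes OpenSSH's scp is on the PATH and key-based auth is already working. The host, user, and path names are placeholders, and your real script will have its own quirks.

```python
import subprocess
from pathlib import Path


def build_scp_command(files, user, host, dest_dir):
    """Build the scp argument list without running anything."""
    if not files:
        raise ValueError("no files to ship")
    target = f"{user}@{host}:{dest_dir}"
    # -q keeps scp quiet, which suits cron jobs.
    return ["scp", "-q", *[str(f) for f in files], target]


def ship_logs(log_dir, user, host, dest_dir, pattern="*.log"):
    """Copy every file matching pattern in log_dir to the remote host."""
    files = sorted(Path(log_dir).glob(pattern))
    if not files:
        return 0  # nothing to do
    # check=True raises CalledProcessError if scp exits non-zero,
    # so a failed transfer cannot pass silently.
    subprocess.run(build_scp_command(files, user, host, dest_dir), check=True)
    return len(files)


# Example use (all names are placeholders):
#   ship_logs("/var/log/myapp", "logship", "scp01.example.com", "/incoming/logs")
```

Splitting the command construction from the part that runs it is deliberate: the next time the SCP server changes, you can check the new command without actually firing off a transfer.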
Walk the path
Whatever your system or your goals, the hardest part of adopting DevOps approaches is going to come down to time. Reading through this myself, I’m struck by what an insensitive jerk I am. “Hey guys I know you totally have tons of free time, just start doing everything you regularly do but add a ton of extra work too. Kthxbai”. The reality is, I probably am a jerk, but you don’t have to think about moving to DevOps as a shining city on a hill that you can reach with enough late nights. It’s a journey, a path, a way, not a place to arrive.
Find a way to start making small changes to your existing processes that line up with the best of the development world. Decide to add 10% more time to a proposed project to allow for it to be handled with these methods. If you begin to turn your steps onto the DevOps path, you’ll quickly realize you’ve already arrived at your destination.