One of the great things about our industry is that anyone can get started with development using a minimal set of tools: a computer of some sort, a text editor, a tutorial and their brain.
In fact, learning the basics of how to think through a development problem doesn’t require a computer at all, which reduces the cost of getting started even further. Such a low bar for entry enables more people to join development teams and contribute new ideas and perspectives, which enriches the entire industry.
Getting into containerization and building modern microservices architectures is a higher bar, but only slightly. There are a number of wonderful tutorials out there for getting started with Kubernetes, Docker, OpenShift Origin and other free and open-source tools.
Getting started with microservices architectures differs from getting started with general software development in only two ways: you need a computer with enough processing power and memory to run containers, and enough of a grasp of a development language to have a basic application to play with. Even the second requirement isn’t crucial, because so many people have built sample applications for the express purpose of teaching new folks to deploy microservices onto various systems.
However, when it comes to learning the DevOps, SRE or ops side of managing Kubernetes deployments or other modern architectures, I could not find a solid free tutorial that dumps a bunch of broken sandboxes in front of the student with the express purpose of building their troubleshooting knowledge.
Learning how to bring microservices architectures back from failures or fall-overs requires access to larger applications and clusters and to more experienced mentors, unless you are lucky enough to be able to build large interconnected applications and send traffic to them regularly enough to cause failures.
This lack of strong tutorials, combined with the need for more resources, makes breaking into DevOps or SRE work on microservices architectures particularly difficult if no one is willing to take a chance on a less experienced candidate. I want to see that change. We are missing out on hearing from new voices, people who may look at a problem from a different perspective or who may have unique solutions to problems we all face.
I decided to create a repository full of deliberately broken Kubernetes builds for new practitioners to build progressively more advanced troubleshooting skills. The basic idea is to build a pyramid of environments. The base level should be easy to understand and figure out. These environments will be bare-bones simple, with only one cause for failure.
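To make "only one cause for failure" concrete, here is a minimal Python sketch of what a base-level environment could look like. It is not taken from the repository: the name `broken-web` and the image tag `nginx:no-such-tag-0.0.0` are hypothetical, and the single injected fault is assumed to be an image tag that does not exist. The script prints a Deployment manifest as JSON, which `kubectl apply -f` accepts as well as YAML.

```python
import json

# Hypothetical names for illustration; none of these come from the repository.
APP_NAME = "broken-web"
BROKEN_IMAGE = "nginx:no-such-tag-0.0.0"  # assumed not to exist in the registry


def build_broken_deployment(name=APP_NAME, image=BROKEN_IMAGE):
    """Return a Deployment manifest whose only fault is an unpullable image tag."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": 1,
            # The selector matches the pod template labels, so the only
            # thing wrong with this Deployment is the image reference.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": 80}],
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    # Print the manifest so it can be redirected to a file and applied.
    print(json.dumps(build_broken_deployment(), indent=2))
```

Saving the output to a file and applying it to a local cluster should leave the pod unable to start, with the pull failure visible in the events shown by `kubectl describe pod`.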
We need to build baseline recognition of common causes in isolation first. Many of us who have debugged production systems have an almost unconscious grasp of when a common cause has appeared, simply because we recognize that faint signature in the pattern of the data we’re analyzing. If you’ve never seen that signature, though, and you don’t have a mentor with you on these debugging challenges, learning to find that signal in the noise of your data is much harder. These initial environments should make it easier to learn what those signals look like.
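As one way to picture those signatures, the sketch below scans the parsed output of `kubectl get pods -o json` for a handful of well-known container status reasons. The reason strings are ones Kubernetes reports; the hint text next to each and the function name `find_signatures` are my own illustration, not part of the repository, and the list is far from complete.

```python
# Well-known container status reasons and a first guess at what each suggests.
# The hints are illustrative starting points, not a complete diagnosis.
SIGNATURES = {
    "ImagePullBackOff": "image name or tag is wrong, or registry credentials are missing",
    "ErrImagePull": "image name or tag is wrong, or registry credentials are missing",
    "CrashLoopBackOff": "container starts and then exits; check its logs and command",
    "CreateContainerConfigError": "a referenced ConfigMap or Secret is probably missing",
    "OOMKilled": "container exceeded its memory limit",
}


def find_signatures(pod_list):
    """Return (pod, container, reason) for each recognized failure signature.

    `pod_list` is the parsed output of `kubectl get pods -o json`.
    """
    found = []
    for pod in pod_list.get("items", []):
        pod_name = pod["metadata"]["name"]
        for status in pod.get("status", {}).get("containerStatuses", []):
            # A reason can appear on the current state or on the previous one.
            reasons = [
                status.get("state", {}).get("waiting", {}).get("reason"),
                status.get("state", {}).get("terminated", {}).get("reason"),
                status.get("lastState", {}).get("terminated", {}).get("reason"),
            ]
            for reason in reasons:
                if reason in SIGNATURES:
                    found.append((pod_name, status["name"], reason))
    return found


if __name__ == "__main__":
    # Hand-written sample shaped like kubectl's JSON output (hypothetical pod).
    sample = {
        "items": [
            {
                "metadata": {"name": "broken-web-abc"},
                "status": {
                    "containerStatuses": [
                        {
                            "name": "web",
                            "state": {"waiting": {"reason": "ImagePullBackOff"}},
                        }
                    ]
                },
            }
        ]
    }
    for pod, container, reason in find_signatures(sample):
        print(f"{pod}/{container}: {reason} -> {SIGNATURES[reason]}")
```

Against a real cluster, the same function could be fed `json.load(sys.stdin)` with the `kubectl` output piped in. The point is not the script itself but the habit it encodes: a short list of reasons you learn to spot on sight.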
The next level of environments would add factors unrelated to the actual problem that generate extra logs, extra spikes in traffic or extra activity. In short, these environments would have noise. We can add another layer of complexity for some of these environments by adding another service or application that runs just fine on the same cluster. This level would train the ability to rule out sections of a system that aren’t broken, to identify systems that aren’t interconnected and to work on one system without disturbing the rest of the environment.
Further environments up the pyramid would have multiple contributing factors, red herrings or complex structures. At this point, some of the environments will need to be hosted outside of the local system. I’m still debating how to host these environments so the tutorial can remain free and easy for new folks to work with, and I’m open to suggestions.
Ideally, there would also be a written guide with answers and tips. The intent is for students to try solving the problem on their own first, but they can certainly look things up and learn whether or not they solved it. After all, troubleshooting isn’t a closed-book exam by any stretch of the imagination.
So how do you start? Find the repo on GitHub. It’s free (open source with an MIT license) and open to contributions. There’s a contribution section in the repo’s README that explains how to get started if you want to send commits. If you want to try using these environments yourself and practice or learn troubleshooting skills, the initial environments are built on Minikube, a tool to run small Kubernetes clusters on your local machine. Follow one of the many excellent tutorials for getting Minikube up and running on your local machine, then follow GitHub’s tutorial on cloning a repo.
Alternatively, you can download the repo directly. Once you’ve gotten Minikube up and running and you have a copy of the repo on your local machine, follow the instructions in each environment’s directory to deploy the environment to your local cluster, and start trying to figure out exactly what went wrong.
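The steps above can be sketched as a short command plan. This is a rough outline under stated assumptions, not the repository's documented workflow: the repository URL and directory names are placeholders to be replaced, and using `kubectl apply -f` to deploy an environment is my assumption. The instructions in each environment's directory take precedence. The script only prints the commands; it does not run them.

```python
import shlex

# Placeholders: the article does not give the repository URL or directory
# layout, so these values are hypothetical and must be replaced.
REPO_URL = "<repository-url>"
CLONE_DIR = "broken-environments"
ENVIRONMENT_DIR = "<environment-directory>"


def setup_commands(repo_url, clone_dir, environment_dir):
    """Return the commands for a typical local setup, in order."""
    return [
        # Start a local single-node cluster.
        ["minikube", "start"],
        # Get a copy of the repository.
        ["git", "clone", repo_url, clone_dir],
        # Assumption: the environment is a set of manifests applied with
        # kubectl. The instructions in each environment's directory are
        # the authority on how to deploy it.
        ["kubectl", "apply", "-f", f"{clone_dir}/{environment_dir}"],
        # First look at what is (not) running.
        ["kubectl", "get", "pods"],
    ]


if __name__ == "__main__":
    # Dry run: print each command, shell-quoted, without executing anything.
    for command in setup_commands(REPO_URL, CLONE_DIR, ENVIRONMENT_DIR):
        print(shlex.join(command))
```

From there, `kubectl describe pod` and `kubectl logs` are the usual next places to look.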
Want to help make this project a reality? I would love to have more contributors! In addition to being available online for help, I’m going to be at KubeCon North America 2019 in San Diego. I’ll have a Hackers’ Corner set up in LogDNA’s booth for most of the conference, and I’ll be sitting there working on this project, among others, waiting for you. Bring your ideas, your stories, your broken environments, your laptops, your willingness to write docs, your Kubernetes knowledge, yourself. Anything that can help enrich this project is welcome.
To learn more about containerized infrastructure and cloud native technologies, consider coming to KubeCon + CloudNativeCon NA, November 18-21 in San Diego.