Back in 2008 I was part of a startup that succeeded and was sold to a large company three years later. During the startup years, there were three of us in IT plus an offshore team that did our web development. The three of us owned everything from the API down, while the offshore team owned the front end. Since we were small, we each owned our own part of the architecture and managed the deployments and operations of the system with few issues. Once we were purchased, our world changed drastically.
Our team quickly ramped up to over 20 people, while my original two architects soon left for greener pastures. Now we had a lot more people involved in the process, and none were domain experts. Things quickly started spiraling out of control as server sprawl crept in and deployments became increasingly challenging and error prone. We hired our first “DevOps” person to help us rein all of this back in. Then things got really ugly. For those who have followed my DevOps writings and my rant called “No you are not a DevOps Engineer”, what I am about to describe is the basis for my position on DevOps.
DevOps as a Role: Fail Pattern
We never intended to create a DevOps silo or position. We just wanted someone with expertise in cloud and automation, since everyone else on the team was a developer and many of them were new to both the cloud and the platform. Unintentionally, this person became labelled as DevOps, and he was soon joined by three other people, who together became an official DevOps team. We were now the proud owners of a new silo, which became the new bottleneck.
The DevOps team had their hands full regaining control over the environment. Permissions had to be pulled back to prevent server sprawl from continuing. Development teams now had to request environments from the DevOps team. When we were a startup, this was easy and manageable. As the team got larger and more projects ran concurrently, it became challenging. The DevOps guys were so busy keeping up with the demand that they were not always in tune with the goals and deliverables of the different projects. The app dev teams were so focused on “agile” that they were not including the DevOps guys early enough in the projects.
The DevOps team became drastically overworked. They were monitoring and operating the systems, they were being pulled from their deliverables to provision environments, and they were trying to put controls around a platform that had suddenly grown from a three-person team of founders to a 20+ person team of new people. To make matters worse, we knew what we wanted to do to fix these problems, but there were not enough hours in the day to get there. We had mapped out an entire continuous integration and delivery strategy, but we just could not get enough time to focus on it.
DevOps started to get a bad rap from the app dev team because there were often delays in getting environments provisioned in time. We experienced environmental issues between dev, QA, and stage because the app dev teams did not communicate their environment requirements in a timely manner and because the different app dev teams were not being consistent (e.g. different versions of Python and different libraries).
All of this led to frustration from the business, especially from the business people who came from the startup and were used to frequent deployments with high quality.
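The environment inconsistency described above (different Python versions and libraries across dev, QA, and stage) is the kind of drift a small script can surface before it bites. What follows is a minimal sketch, not what our team actually ran: the `find_drift` function, the manifest layout, and every version number in it are hypothetical and exist only to illustrate the idea of comparing what each environment has installed.

```python
from typing import Dict

# A manifest maps a component name (the interpreter or a library) to the
# version installed in one environment. The layout is an assumption made
# for this sketch, not a format any particular tool produces.
Manifest = Dict[str, str]


def find_drift(envs: Dict[str, Manifest]) -> Dict[str, Dict[str, str]]:
    """Return {component: {environment: version}} for every component whose
    version differs, or is missing, in at least one environment."""
    # Collect every component mentioned in any environment.
    components = set()
    for manifest in envs.values():
        components.update(manifest)

    drift = {}
    for component in sorted(components):
        # Record what each environment has, marking absent components.
        versions = {
            name: manifest.get(component, "missing")
            for name, manifest in envs.items()
        }
        # More than one distinct value means the environments disagree.
        if len(set(versions.values())) > 1:
            drift[component] = versions
    return drift


if __name__ == "__main__":
    # Made-up example data for three environments.
    environments = {
        "dev": {"python": "2.7.3", "requests": "1.2.0"},
        "qa": {"python": "2.7.3", "requests": "1.1.0"},
        "stage": {"python": "2.6.8", "requests": "1.1.0"},
    }
    for component, versions in find_drift(environments).items():
        print(component, versions)
```

A check like this only helps if someone owns the result. Run as part of every project's build, it puts the disagreement in front of the app dev team and the DevOps team at the same time, rather than leaving it for whoever provisions the next environment.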
Lesson Learned
Eventually the team turned this around. Although they still have an official DevOps team, the team works much more closely with the app dev teams. A high level of standardization is now in place for the environments. The teams share operational requirements early in the project lifecycle, and developers assist in the environment creation. The lesson learned here is that creating a DevOps role or organization is not the correct approach. Creating a new silo just creates new bottlenecks. DevOps is not a role; it is a way of getting quality, reliable software to market faster. The sooner an organization realizes this, the sooner it can get to a successful state.