A very well-written paragraph about DevOps appears in Mike Loukides’ book, “What is DevOps”:
… Modern applications, running in the cloud, still need to be resilient and fault tolerant, still need monitoring, still need to adapt to huge swings in load, etc. But those features, formerly provided by the IT/operations infrastructures, now need to be part of the application, particularly in “platform as a service” environments. Operations doesn’t go away, a good part of it becomes part of the development. And rather than envision some sort of Uber developer, who understands big data, web performance optimization, application middleware, and fault tolerance in a massively distributed environment, we need operations specialists on the development teams. The infrastructure doesn’t go away – it moves into the code; and the people responsible for the infrastructure, the system administrators and corporate IT groups, evolve so that they can write the code that maintains the infrastructure. Rather than being isolated, they need to cooperate and collaborate with the developers who create the applications. This is the movement informally known as DevOps.
These days, everyone from recruiters and managers to engineers, analysts and, of course, CXOs is interested in DevOps. They ask questions such as, “What is DevOps?” “What are the DevOps requirements?” “What are the challenges of implementing DevOps?” But the key question is, “Why does DevOps remain difficult to implement?”
Adopting DevOps does not simply mean adopting new processes or toolsets. “Why DevOps” can be understood better by focusing on the benefits it will give you. So let’s dive deeper into one of the key best practices we discovered in our journey into DevOps.
Agile to DevOps
Business challenges, complexity and competition have pushed companies to innovate and to solve their existing business problems in better ways, adding more features and integrated solutions. Ways of implementing solutions vary from organization to organization, and the adoption of tools and technologies differs between small and large organizations depending on the complexity of their problems. But it’s clear that many have adopted agile practices in the past two decades. Agile was made concrete by a few well-defined practices such as iteration planning, daily scrum meetings, scrum of scrums (in larger enterprises), iteration demos, retrospective meetings and small, empowered teams.
One of the key mindset changes most of the IT world went through while moving to agile execution was to rely more on verbal communication and daily scrum meetings. Daily scrum meetings required the scrum master to ask three questions:
- What was completed yesterday?
- What will be completed by the end of today?
- Any blockers?
These three questions kept things simple and quick.
Now that agility is being extended to Ops, there is a need to revisit whether these three questions are enough, and who attends the scrum meetings. The scrum master’s role also needs to take on more responsibility, with a wider communication radius.
Before we look at the fourth question, let’s look at the key reasons to adopt DevOps in organizations:
- Achieve faster time to market without compromising Ops stability
- Better resource utilization across Dev, Test and Ops
- Be more responsive to customers
Why would speed of delivery hamper stability in Ops? In our firsthand experience, communication between Dev and Ops is not that rich in most organizations today. If one studies the root cause analyses of the problems Ops faces in trying to keep the environment stable and healthy, one will find some, if not all, of the following issues:
- Developers are not deployment-aware
- Developers do not think of exposing or exporting the right controls to let Ops configure the app dynamically (this is unintentional; they simply don’t know)
- Many configurations are embedded deep inside the app
- The addition of specific third-party components, or changes to their versions, is sometimes missed in communications with Ops
- Ops is unable to clearly articulate what they require from Dev, even when everyone is co-located
- Continuous firefighting in Ops results in fast fixes and better debugging, but not in proactive processes
- Even if weekly meetings between Dev and Ops happen, they do not have a clear agenda or metrics
While there are many more symptoms and causes, and multiple ways to fix processes and problems, we have found one solution that seems particularly effective.
We added a fourth question to the daily scrum meeting. That fourth question is: “Are you blocking Ops?”
This question is put directly to each developer in the daily scrum meeting. It essentially asks:
- Have you written any code that requires a new configuration?
- Have you written any code that requires a change to an existing configuration?
- Have you written any code that creates a new binary or file that needs to be installed in production?
- Have you recompiled existing code with new third-party libraries because older ones are unsupported, even though there may be no other change in functionality?
- Have you hard-coded any IP address or other configuration into the code that cannot be changed without rebuilding?
- And so on…
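As an illustration of the hard-coding question above, here is a minimal sketch of one common way to keep an address out of the code: read it from environment variables so that Ops can change it without a rebuild. The names `AppConfig`, `load_config`, `APP_DB_HOST` and `APP_DB_PORT` are hypothetical and chosen only for this example; a real team might use a configuration file or a configuration service instead.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AppConfig:
    # Hypothetical settings, used only for illustration.
    db_host: str
    db_port: int


def load_config(env=None):
    """Build the config from environment variables.

    The variable names APP_DB_HOST and APP_DB_PORT are assumptions made
    for this sketch. The fallback values are development defaults only.
    """
    env = os.environ if env is None else env
    return AppConfig(
        db_host=env.get("APP_DB_HOST", "127.0.0.1"),
        db_port=int(env.get("APP_DB_PORT", "5432")),
    )


if __name__ == "__main__":
    # Ops can override the values at deploy time, for example:
    #   APP_DB_HOST=10.0.0.5 APP_DB_PORT=6543 python app.py
    print(load_config())
```

With a change like this, a developer can answer the hard-coding question with “no”, and the new variable names become exactly the kind of item the scrum master would pass on to Ops.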
Ask the Fourth Question
The above list of questions was developed from root cause analyses of app stability in Ops. The list will differ between developers and teams. The scrum master communicates the appropriate actions to developers based on the answers to these questions. The scrum master also alerts the Ops team, giving them a heads-up on what’s coming their way, and provides timely feedback to the developers.
Is this question important enough to be the fourth question in scrum meetings?
Or is this question already covered by the “Any blockers?” question? One can debate either way. The blocker question in scrum meetings is thought of in a limited Dev/QA scope, and that scope has not included the Ops team until now. Given that the Ops team faces customers, and that the DevOps movement is about improving synchronization between teams, this Ops question needs to be a first-class citizen in daily scrum meetings.
Lastly, it’s critical to look at Ops problems identified through root cause analysis as early as possible. The scrum master becomes more effective by asking the team this question while also interacting with Ops and hearing them out. If teams have enough funding to bring someone from Ops into the scrum meetings, that person can be the primary representative for Ops and can steer the delivery to be Ops-friendly.
Irrespective of how one achieves it, taking care of configuration issues proactively in daily meetings and thinking about Ops on a daily basis in Dev meetings are key to successful DevOps adoption. Configuration issues are not the only focus, but they do form a large chunk of Ops issues.
From our metrics, customers have seen the following benefits after adding the fourth question to their scrum meetings:
- Shipping code 7x more frequently
- 45 percent fewer failures
- 12x faster ability to recover earlier in the tool chain
Please feel free to share your findings with us @ [email protected].
Happy DevOps!
About the Author/Sanju Burkule
Sanju Burkule is a pioneer, visionary and creator of products in the DevOps space at @OpexSoftware. He writes about automation and strong, connected, integrated IT, and tweets often about these topics.