Digital Transformation and the DevOps Chaos Theory

With agile startups disrupting markets across the globe, large enterprises are increasingly focused on moving to a more agile approach to business. Most organizations have already begun the digital transformation (DX) journey, but for many the transformation is still in its infancy, which demonstrates the fluid and evolving nature of DX. Multiple IT technologies, processes, applications, systems and protocols need to be adopted and updated on a regular basis for businesses to keep abreast of changes. This does, of course, result in significant disruption for all involved.

With this in mind, DevOps principles are beginning to have a much greater impact, and a new way of looking at the methodology helps identify its true value. As development velocity and the scale of the enterprise both increase, businesses will rely much more heavily on DevOps principles to rein in the chaos associated with continuous development, which is being spurred on by the pace of digital service development and increased automation.

The key to managing and minimizing the resulting chaos is close collaboration and communication between the IT team members responsible for service development and delivery. Without this, a continuous delivery pipeline operating at speed and scale will quickly get out of control.

The DevOps Chaos Theory

To understand the challenges in this new business environment, it is useful to look at them at a more granular level, through the framework of the DevOps Chaos Theory.

The pace of innovation is measured as the Velocity (V): the number of new software releases deployed in a production environment in a defined time period. The Scale (S) factor is measured as the overall number of IT staff involved in service delivery and management in production environments, such as DevOps, SecOps, QA, system architects, DBAs, NetOps and help desk. Interaction between these team members brings the potential for miscommunication, which will increase the overall chaos. The maximum number of pairwise interactions between these IT team members is S * (S - 1) / 2, which for high-scale organizations approaches S²/2. Based on these considerations, a logical hypothesis would identify the system-level Chaos (C) in production environments as C = K * V * S², with the constant factor of 1/2 absorbed into K. K is the normalization factor, which may change based on the overall adoption of DX in a specific industry and the effectiveness of collaboration and communication between the IT team members.
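To make the scaling behavior concrete, the hypothesis can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and K is set to a placeholder value of 1.0 because the article does not assign it a number.

```python
def max_interactions(staff: int) -> int:
    """Maximum number of pairwise interactions among S staff: S * (S - 1) / 2."""
    if staff < 0:
        raise ValueError("staff must be non-negative")
    return staff * (staff - 1) // 2


def chaos(velocity: float, staff: int, k: float = 1.0) -> float:
    """System-level chaos under the hypothesis C = K * V * S^2.

    k is the normalization factor K; 1.0 is a placeholder assumption,
    not a value taken from the article.
    """
    return k * velocity * staff ** 2


if __name__ == "__main__":
    # For a large team, S * (S - 1) / 2 is close to S^2 / 2.
    s = 100
    print(max_interactions(s), s ** 2 / 2)  # 4950 5000.0

    # Same release velocity, twice the staff: chaos grows fourfold.
    print(chaos(velocity=4, staff=10))  # 400.0
    print(chaos(velocity=4, staff=20))  # 1600.0
```

Under this model, doubling the release velocity doubles the chaos, while doubling the number of people involved quadruples it. That is why collaboration and communication, which the article folds into K, matter more as the organization scales.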

Minimizing Disruption

Collaboration between departments within the business is important to ensure the chaos is controlled. There is no doubt that automation tools play an important role in continuous delivery and in allowing organizations to operate at the speed and scale they need. However, enterprises must identify the level of constraint placed upon the IT operations team. This is crucial to establish what changes need to be made and what service performance management technology must be introduced to prevent operations from becoming a bottleneck to the continuous service delivery cycle inherent to DX.

Key performance indicators (KPIs) are critical when it comes to service delivery, particularly when you consider that service delivery combines the entire stack into a single system. This stack comprises physical, virtual, wired and wireless layers of compute, storage and networking infrastructure, plus the applications that run at the top of the stack to deliver services consumed by end users. Using a variety of separate network, infrastructure and application performance management tools will obscure the system-level view across all the layers of the service stack and their interdependencies. Monitoring system-level KPIs requires access to reliable data sources, such as network traffic. Effective instrumentation of these data sources will play a key role in proactively identifying the root cause of service issues and thus reining in chaos.

Enterprises must be able to effectively analyze the monitored data to gain insight into all the infrastructure subsystems and application interdependencies and so establish a comprehensive view of their services, accessing both real-time and historic information. In addition, effective management at a human level should also form an important part of a company’s DX strategy if chaos is to be mitigated and crisis averted.

More and more companies are realizing that DX offers a wealth of opportunity, but it’s crucial to understand that harnessing these opportunities requires strategy and forethought. Monitoring and managing systems will be vital, as will DevOps principles, as more processes are automated and data plays an increasingly central role in every company.

About the Author / Michael Segal

Michael Segal is vice president of strategy at NetScout. His product management experience spans 10 years at Cisco Systems, where he managed all aspects of product line life cycles for several successful product lines. Michael’s technical areas of expertise include SaaS/cloud, virtualization, mobile IP, security, IP networking, Wi-Fi/wireless, VoIP and remote access. Michael holds patents in areas of networking and wireless mobility. Connect with him on LinkedIn and Twitter.