This new question in the DevOps.com Enterprise DevOps Q&A series comes from a large business, though not what many would recognize as a traditional or legacy enterprise. This organization, like many high-growth businesses, is in a relatively new market segment that only exists due to the rise of cloud computing and digital media. In their business, the network (especially access to and from their public cloud service) is critical to the value they provide their customers.
So it is no surprise then that, as they contemplate their transformation to an agile environment built on DevOps principles, they are asking me about the role of the Network Operations Center (NOC).
- What Role Does a NOC Play in the DevOps World?
Despite coming from a ‘new tech’ business, this is another question that is typical of a large enterprise. While smaller startups and even many web-scale businesses may have an Application Support team, not many run their own NOC, and even fewer separate out these roles from the rest of the operations team.
First, some things do not change in a DevOps world. Fundamentally, the NOC is still tasked with maintaining high quality production networks and systems, detecting potential problems as soon as possible, triaging those problems to find root cause, and remediating problems as fast as they can.
However, the NOC team will gain additional activities and support a broader interlock beyond just the production environment, especially as networking becomes more embedded in application delivery through capabilities like software-defined networks and cloud computing. They will work more closely with teams like app dev, SQA, pre-prod and release; they will provide input to design and development; and they will contribute earlier in the software cycle. This is the logical outcome of a DevOps approach that emphasizes the need to work more collaboratively across the entire end-to-end SDLC.
So you should consider bringing these network specialists into:
- development planning, to help architects and engineers to build applications with operable networking by design, including software-defined networks that help make a truly idempotent application
- testing and QA, so they can provide engineers and SQA teams with tests and analysis that accurately reflect the reality of the prod network, rather than leaving network testing until after production deployment
- dev/test support (where they may already be acting), to ensure that the environments that dev/test rely on to create new capabilities are just as stable and reliable as the systems they will eventually be deployed to
- release and deployment, to ensure they have input into determining and defining ‘known good values’ for production monitoring and for problem diagnosis, triage, and escalation – before release, not just after
- expanding interoperability and self-service, by exposing ‘known-good’ deployment, triage, and (re-)configuration capabilities as automated software-defined processes (or APIs) that dev/test can use without opening a new service ticket
- a broader role in ensuring the end-to-end quality of the network, alongside traditional operations/infrastructure support as part of a new role commonly referred to as a ‘Site Reliability Engineer’ (SRE)
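To make the idea of ‘known good values’ from the list above more concrete, here is a minimal sketch of how agreed ranges could be written down before release and then used for first-pass triage. The metric names, the thresholds, and the `triage` helper are all hypothetical and chosen only for illustration; a real NOC would draw these from its own monitoring tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Baseline:
    """Known-good range for one metric, agreed before release."""
    low: float
    high: float


# Hypothetical known-good values, defined jointly by the NOC and dev/test.
KNOWN_GOOD = {
    "latency_ms": Baseline(0.0, 50.0),
    "packet_loss_pct": Baseline(0.0, 0.5),
}


def triage(sample: dict[str, float]) -> list[str]:
    """Return the sorted names of metrics that are missing from the
    sample or fall outside their known-good range."""
    out_of_range = []
    for name, baseline in KNOWN_GOOD.items():
        value = sample.get(name)
        if value is None or not (baseline.low <= value <= baseline.high):
            out_of_range.append(name)
    return sorted(out_of_range)
```

Because the same table can be read by pre-production tests and by production monitoring, both sides of the release work from one definition of ‘healthy’.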
Because DevOps is about improving collaboration, remember that you should also bring dev, test, QA, etc. into the NOC, and even rotate them onto support, so they gain an understanding of the core operations of a production network, how their code behaves in a real environment, and how their work affects other teams. This will help to build the empathy between teams that is a core tenet of DevOps.
You may be concerned that asking your NOC to work more upstream is adding a load of work to those teams that they didn’t have before. That may be true, but a significant outcome of Enterprise DevOps in practice is to reduce the ‘busy work’ of such teams, freeing them up to do more ‘smart work’ and to do it earlier in the software lifecycle.
For example, recent research has shown that DevOps is correlated with improving the quality of applications (especially before they hit production), reducing the amount of time and effort the NOC team spends on problem detection, triage, and remediation. In fact, in this research, ‘reduction in time spent fixing and maintaining applications’ is one of the strongest outcomes, with an average 21% reduction in this metric.
Similarly, standardizing on known-good processes, APIs, configurations, etc., and enabling other teams to employ self-service and software-defined networking capabilities, will allow the NOC team to spend less time on routine work such as servicing provisioning and configuration tickets.
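As a rough illustration of what such self-service might look like, the sketch below lets a dev/test team apply a NOC-approved network profile to an environment without raising a ticket. Everything in it is an assumption made for the example: the profile names, their fields, the `apply_profile` function, and the in-memory `state` dictionary that stands in for a real software-defined networking controller.

```python
# Hypothetical catalogue of configurations the NOC has approved in advance.
APPROVED_PROFILES = {
    "web-test": {"vlan": 120, "mtu": 1500, "allowed_ports": (80, 443)},
    "db-test": {"vlan": 130, "mtu": 9000, "allowed_ports": (5432,)},
}


def apply_profile(state: dict, env: str, profile: str) -> bool:
    """Set environment `env` to an approved profile.

    Returns True if the state changed and False if it already matched,
    so repeating the same request is harmless (idempotent).
    Raises ValueError for any profile the NOC has not approved.
    """
    if profile not in APPROVED_PROFILES:
        raise ValueError(f"profile {profile!r} is not NOC-approved")
    desired = dict(APPROVED_PROFILES[profile])
    if state.get(env) == desired:
        return False  # already in the desired state, nothing to do
    state[env] = desired
    return True
```

Calling `apply_profile(state, "qa-1", "web-test")` twice changes the state only the first time, and a request for an unapproved profile is rejected rather than queued for manual review.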
I cannot guarantee this will not be a zero-sum game, especially in the short term, but over time a DevOps approach should actually reduce the NOC team’s workload, freeing them up to ‘shift left,’ moving their talents earlier in the SDLC, as they communicate, collaborate, and integrate more closely with upstream teams like dev, test and QA.