Four key vectors for closing the DevOps feedback loop

A question I am asked frequently about DevOps – and a perennial challenge to all DevOps slide artists – is what exactly we mean by the idea of ‘closing the feedback loop’.

Q. What do you mean when you mention ‘closing the feedback loop’? Ops monitor our systems today, and alert devs when they are needed (next step, devs carrying pagers!). Is this the same thing?

Yes, this is definitely part of the equation. Feedback from operations (and more) to development (and more) – through alerts, monitoring, log entries, service tickets, or verbally – is essential. Good fast feedback helps, for example, to:

connect dev and ops more closely to user experiences
establish shared insight to replace fragile tribal knowledge
foster ownership from both teams in the entire service
provide rapid response to problems or new requirements
encourage faster experimentation and a tolerance for failure

Feedback loops in smaller organizations tend to be relatively simple – just lean over the cubicle and tell your counterpart what is going on, or program the monitoring system to send you an alert. However, for larger organizations, the multiplicity of teams, volume of events, separation of duties, and variety of feedback can make this harder to handle.

In larger organizations then, I see four key vectors used in closing the DevOps feedback loop:

Machine to Person (M2P)
Person to Person (P2P)
Machine to Machine (M2M)
Person to Machine (P2M)

I have tried to illustrate these 4 vectors below (and not being any kind of graphic artist, I would love any feedback on this image):

Machine to Person (M2P)

This is more or less the alert cycle the original question posed – a machine process (performance monitoring, for example) picks up an error situation that it cannot identify as a ‘known resolvable’ issue, so it forwards the problem to a person to take charge. Machines can resolve many problems automatically with intelligent orchestration – roll back a release, increase capacity, modify a configuration, migrate a workload, restart a service – but sometimes a person needs to fix hard and/or infrequent problems.
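To make the M2P hand-off concrete, here is a minimal Python sketch of the triage step described above. Everything in it is an assumption made for illustration: the `Alert` type, the alert kinds, the `RUNBOOK` table and the two remediation functions are hypothetical stand-ins for whatever your monitoring and orchestration tools actually expose.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Alert:
    service: str
    kind: str  # hypothetical labels, e.g. "service_down" or "high_load"


def restart_service(alert: Alert) -> str:
    # Placeholder for a real orchestration call.
    return f"restarted {alert.service}"


def add_capacity(alert: Alert) -> str:
    # Placeholder for a real provisioning call.
    return f"added capacity to {alert.service}"


# 'Known resolvable' issues mapped to automated fixes (hypothetical table).
RUNBOOK: Dict[str, Callable[[Alert], str]] = {
    "service_down": restart_service,
    "high_load": add_capacity,
}


def handle_alert(alert: Alert, pager: List[str]) -> str:
    """Resolve known issues automatically; otherwise forward to a person."""
    remediation = RUNBOOK.get(alert.kind)
    if remediation is not None:
        return remediation(alert)
    # Unknown problem: this is the machine-to-person step.
    pager.append(f"{alert.service}: {alert.kind}")
    return "escalated"
```

The point of the sketch is the split: the machine handles what is in the runbook, and only the hard or infrequent problems reach the pager.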

Person to Person (P2P)

This is just the classic ‘hot line’ feedback, where application users (internal or external) talk directly to someone in IT (e.g. help desk staff). This can be the most powerful feedback of all. Not only can it help identify defects, but it can also identify opportunities for improvement – from new features to entirely new services. However, to be useful, this personal feedback must be transferred into development or operations processes – e.g. to update configuration scripts, or as input to future release planning.

Machine to Machine (M2M)

This includes automated problem resolution (e.g. monitoring detects a performance bottleneck and triggers a provisioning engine to add capacity), but also includes data mining that feeds back ‘real-world’ production data to build more realistic test and QA environments, or to inform more accurate capacity plans. This feedback makes prod more stable, makes QA and test more realistic, reduces human and process errors, and helps to build scalability into applications from the start of the SDLC.
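As a rough illustration of the first kind of M2M feedback, the Python sketch below turns monitoring samples into a capacity decision that a provisioning engine could act on. The function name, the 50% utilisation target and the instance cap are all assumptions chosen for the example, not a recommendation or any particular product's behavior.

```python
import math
from typing import List


def desired_instances(cpu_samples: List[float], current: int,
                      target: float = 0.5, max_instances: int = 20) -> int:
    """Suggest an instance count that moves average CPU toward the target.

    cpu_samples are utilisation readings between 0.0 and 1.0, as a
    monitoring system might report them (hypothetical input format).
    """
    if not cpu_samples:
        return current  # no data, so leave capacity unchanged
    average = sum(cpu_samples) / len(cpu_samples)
    # Round before taking the ceiling to avoid floating point noise.
    needed = math.ceil(round(current * average / target, 6))
    return max(1, min(max_instances, needed))
```

For example, four instances averaging 75% CPU would be scaled to six, with no person in the loop. The same samples could also be stored and fed into capacity planning or load tests, which is the second kind of M2M feedback described above.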

Person to Machine (P2M)

This can take many forms, such as end users logging problems into a self-service portal or Service Desk, or rating a mobile app, or complaining online. Automation can trigger resolution for known problems, so end users can address problems automatically; or feedback can reach product owners to resolve common user complaints. However, P2M feedback may waste resources and increase technical debt if solvable problems are fixed repeatedly rather than permanently, or if user complaints are ignored.
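One simple guard against fixing the same problem over and over is to count how often each kind of ticket comes back. The Python sketch below assumes tickets have already been tagged with a category; the function name, the category labels and the threshold of three are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def recurring_issues(ticket_categories: Iterable[str],
                     threshold: int = 3) -> List[str]:
    """Return categories reported at least `threshold` times, sorted by name.

    These are candidates for a permanent fix rather than another
    one-off (or automated) resolution.
    """
    counts = Counter(ticket_categories)
    return sorted(cat for cat, n in counts.items() if n >= threshold)
```

A list like this could go to the product owner at release planning, so that the repeat offenders are fixed at the source.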

Final Tips

These four vectors of active feedback are just a starting point, but they describe a framework that enterprises can work to when looking to close and accelerate their feedback loops. Remember too that in a DevOps mode, feedback needs to be faster, it needs to be more accurate, and especially in a large organization it needs to connect not just with dev and ops, but with other stakeholders too – business owners, test and QA teams, security teams, and of course end users.

What are your feedback loops? Do you cover all four vectors? Do you have more that I am missing? Please let me know in the comments below – as always, I would love to hear from you.