Last week Puppet Labs released the analysis behind its annual study of DevOps and continuous delivery practices in the 2015 State of DevOps Report. Now in its fourth year, the report offers an in-depth look at the team's findings on what it means to be a high-performance IT organization: the cultural characteristics these organizations share, the lean engineering practices they engage in and the technical architectures they depend upon. Like last year, however, the report tends to be a bit of a black box when it comes to actual statistical findings. Some figures do make it through (the oft-cited statistic that high-performing organizations deploy 30 times more frequently, with 200 times shorter lead times, than low performers still stands), but the report itself is somewhat oblique about the statistical basis on which the low-, medium- and high-performance designations were determined.
DevOps.com spoke with Nigel Kersten, CIO of Puppet Labs, and Alanna Brown, senior product marketing manager at Puppet Labs, to get a little more detail on the report, to have them explain the process behind the analysis, and to hear some of the most important facets of their findings over the last few years.
DevOps.com: Can you tell our readers a little bit about the high points from the report?
Kersten: The big finding from 2013 was that the orgs using DevOps practices deployed code 30 times more often and had 50 percent fewer failures. In 2014 what was quite fascinating was that we actually found a link between these practices and overall organizational performance, not just IT performance. Higher-performing IT organizations are twice as likely to exceed their own profitability, market share and productivity goals.
If we fast-forward to today, the high-performing organizations using DevOps practices are more stable and reliable than they've ever been. They're recovering 168 times faster from their failures and have 60 times fewer failures due to their changes. So we're not actually seeing throughput increase from the year before, but what we are seeing is that quality and speed have definitely increased. People are producing changes that are of higher quality and that require fewer rollbacks.
And so what we’re concluding from the data that we’ve seen this year in the report is that there’s been a really huge focus to the left of the assembly line from production. So if we envisage the developer work flow application delivery, essentially we start with the developer laptop like on the far left, and on the far right is the application actually existing in production. So from the point between the developer’s laptop and the actual release, we’re seeing organizations focus much more on that stage, putting more testing in the hands of the developers, and all of these practices not just being focused around the release but the pre-production stage of the application life cycle having a really huge impact upon quality.
DevOps.com: Why doesn’t Puppet publish the actual data from the survey? You have great analysis here, but there’s very little statistical proof behind it—why don’t you provide that?
Brown: Well, we have a ton of data, and part of the reason we don't release the raw data is that you wouldn't be able to get to the same results without doing a ton of data cleansing and statistical analysis. Our analysis is actually fairly rigorous, and some of it doesn't lend itself very well to graphs, charts and visuals because it's all based on correlations and predictive regressions. So that's part of the reason.
Kersten: I think in many ways it's analogous to releasing software to the world. We have all of these tools in-house. We have all of these Excel spreadsheets we've put together, all of these pivot tables and different bits of cluster analysis, but they're not particularly usable for anyone else right now.
As much as we would love to work out how to polish all of that up and share it as one big data set along with all of the tools, our biggest worry, honestly, is that people would draw misleading and incorrect conclusions from the data because of the level of analysis that was required.
DevOps.com: Can you tell us, then, how you came to the determination of what's a high-, medium- or low-performance organization? Because if you look at some of these numbers, you say the high performers have 30 times the deployment frequency. But of course they do, because they're high performers. So how does that work out statistically?
Brown: No, no, you’re absolutely right, and I will agree that IT performance metric is a bit circular. However, those are the actual metrics that are important to an IT organization, so they care about deploy frequency, deployment lead time, their mean time to recover and change failure rate. So we feel like that’s a pretty good approximation or a proxy for IT performance, and it’s better you add at doing those things that is correlated to higher performance overall.
Kersten: One thing, too, is that even though there's a whole spectrum in terms of how fast people are able to deliver and ship code, what came out really strongly from the survey this year and in previous years is that the high-performing orgs definitely cluster in terms of data points. It's not just a smooth spectrum; there are these really obvious clusters of high performers that leap out, all with a very high correlation in the kinds of practices they use. So it is a little bit circular, but it's also not that we're seeing this smooth spectrum from zero to 100. We're seeing the masses and then the high performers, and they divide into quite clear clusters.
Brown: Yeah, and they’re not arbitrary clusters; they’re actually statistical. So the low IT performers share all the same characteristics and the high IT performers share all the same characteristics, so they’re very clear.
DevOps.com: OK, so let’s talk about some of those common characteristics between these groups if you’ve got statistical bases for these. Tell me a little bit about the low performers and we’ll kind of go on up the spectrum and we can talk about the characteristics they share.
Brown: So for IT performance, deployment pain was actually the most highly correlated construct. Deployment pain consists of a few questions.
Like how painful are your deployments? Are they very disruptive events? There are three questions we ask around deployment pain.
The next thing that’s highly correlated with IT performance is version control. We use the version control for all production artifacts. Following that is culture, and the model we use for culture comes from a guy named Ron Westrom who is a sociologist. And he created a typology of cultures, and you can see the table in the report itself. So he has a pathological culture, bureaucratic, and then generative. So the more generative your culture is, the more likely you are to have higher IT performance.
Kersten: And I think this is what was so fascinating for us when we first adopted Westrum as part of the model last year: this pathological-versus-generative distinction is the language practitioners in this space use to talk about what's so great about working somewhere: "My ideas are valued. I get to create ideas. I can create impact." And similarly about what's horrible about working at some other place. It maps to the pathological and generative descriptions of organizations perfectly, really.
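(Editor's note: a survey construct like deployment pain, or a Westrum culture score, is typically scored by averaging its Likert-scale items and then correlating that score with an outcome measure. The sketch below illustrates that general pattern in Python; the questions, scores and numbers are invented for illustration and are not from Puppet's survey.)

```python
# Hypothetical sketch only: invented data, not the survey's.
# Shows the usual pattern: average a construct's Likert items into one
# score, then correlate that score with an outcome measure.
import numpy as np
from scipy.stats import pearsonr

# Three deployment-pain items per respondent, on a 1-7 Likert scale
# (1 = strongly disagree ... 7 = strongly agree that deploys are painful).
pain_items = np.array([
    [6, 7, 6],
    [2, 1, 2],
    [4, 5, 4],
    [7, 6, 7],
    [1, 2, 1],
])
pain_score = pain_items.mean(axis=1)  # one construct score per respondent

# Composite IT performance score per respondent (invented numbers).
it_performance = np.array([1.5, 6.0, 3.5, 1.0, 6.5])

r, p = pearsonr(pain_score, it_performance)
print(f"r = {r:.2f}, p = {p:.3f}")  # strong negative r: more pain, lower performance
```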
DevOps.com: So what would you say is the key characteristic in that first leap from low performance to medium performance, based on your data?
Brown: So the lower performers in terms of IT performance will be doing less automation, less testing. They haven’t automated their deployments, for example.
Kersten: I think one way I at least think about it is that there are these characteristics that are necessary, but none of them is sufficient on its own. So you see people who have, say, just adopted version control; they're definitely ahead of the pack of people who haven't adopted any of these tools.
But there’s this transformative leap once you actually adopt version control and continuous delivery and the highly-automated conflict management system, that those things together are greater than the sum of their parts. That is at least how the data seems to us, ’cause we don’t actually have this smooth little spectrum between low performers and high performers who just adopted one of those technologies.
We definitely see improvement from those people, but it's when they adopt all of them together that they suddenly take this sort of order-of-magnitude leap.