ITIL is well-established as the process, certification and information resource library for service management across many IT organizations globally. With the recent updates to ITIL 4, influences from Agile and DevOps are helping ITIL to expand and adapt to changes in ways to create software and build operations into the delivery process.
In this DevOps Chat, Akshay Anand, product ambassador with AXELOS Global Best Practice, and Jon Stevens-Hall, principal product manager at BMC Software, join Accelerated Strategies Group CEO Mitch Ashley to discuss the new innovations in ITIL 4. As major contributors to ITIL 4, Akshay and Jon share the approach they took with ITIL 4, incorporating ideas from the world of Agile and DevOps.
As usual, the streaming audio is immediately below, followed by the transcript of our conversation.
Mitch Ashley: Hi, everyone, this is Mitch Ashley with DevOps.com and you’re listening to another DevOps Chat podcast. I have the great pleasure of being joined by a couple of really cool gentlemen here with a great topic I’m excited to talk with you about.
The first is Akshay Anand, who is Product Ambassador to AXELOS. Also, we have with us Jon Stevens-Hall, a Principal Product Manager with BMC Software. Welcome, gentlemen.
Akshay Anand: Thanks for having us.
Ashley: Yeah, I’m excited to talk with you about this. We’re gonna jump into ITIL, ITIL 4, and some of the work that you’ve been contributing to that, but first, why don’t we start out—Jon, would you introduce yourself, tell us a little bit about what you do and a little bit about BMC Software? I’m sure many people know that, but just for the folks who don’t?
Jon Stevens-Hall: Sure. So, I’m the Principal Product Manager for BMC’s enterprise ITSM suite, which, you know, it comes from the Remedy heritage family of products, now titled BMC Helix ITSM, and I’ve been around service management for over 20 years now. I sort of walked out of university into my first job in the late ‘90s, implementing systems for ITIL 2 over at what became Yell Group.
And so, I’ve been around service management and ITSM for all that time, and obviously, my primary role at BMC is producing and creating software and tools, but also, we always try to be active members of the ITSM community in general, which is what led me on a personal basis to being one of the contributors to ITIL 4.
Ashley: Mm-hmm. Awesome, excellent. And Akshay, how about you?
Anand: Oh, so, I joined AXELOS about four years ago. AXELOS, for those who don’t know, is a joint venture between Her Majesty’s Government and a private sector company called Capita. We like to see ourselves as the custodians of the IP, of not only ITIL, but other best practice arrangements like PRINCE2, which is about project management or MSP which is about program management. I’m Product Ambassador, so like the product evangelist for ITSM and ITIL in general. My background is mostly around consulting in the IT Service Management, consulting and advisory. I’ve also been Head of Service Management for Macmillan Publishing for a couple of years. And yeah, like I said, I joined AXELOS four years ago and it’s been a blast.
Ashley: Excellent, excellent. Well, I’m sure that consulting, and also Jon, your experience with BMC gives you some real-world perspective on ITIL and ITIL 4.
Well, talk about the publications that you’ve contributed to the IP. I think the Create, Deliver, and Support around ITIL 4 is one of the areas that you both contributed. Do you wanna jump in first, Akshay?
Anand: Sure. So, Create, Deliver, and Support is one of four books in a learning stream that we call Managing Professional. All the books in that sort of learning stream are oriented toward managers or professionals who are either managing or designing systems, service management systems. Create, Deliver, and Support in particular talks about things like employee engagement and professionalism, a lot of the commonly used tools needed to create, deliver, and support services, things like value streams, value stream mapping, how to apply value streams within the context of ITIL 4, how to manage suppliers.
So, it’s a very wide-ranging book, but it goes into a lot of detail about very, very specific techniques. Two of those techniques were actually authored by Jon, and one was around tickets and the use of tickets or the misuse of tickets and what to avoid, and the other was actually around swarming, and how swarming can be a good model to use within service management organizations, a technique that can be used to foster collaboration between development and engineering teams and frontline support teams.
Ashley: Interesting. Jon, love to hear about that. I know, just to share with you a bias about tickets, I think everybody’s impression of IT is, you can’t talk to IT unless you complete a ticket. Well, maybe that’s taking it a little bit far, right? [Laughter]
Stevens-Hall: No, that’s definitely a true impression. I mean, one of the things I’ve spent much of the last six years doing is really, I’m a bit of a method actor, I’ve been very immersed in the DevOps community. And almost, I joke that I went to my first DevOps conference by accident because I signed up for something called Configuration Management Camp thinking, you know, ITSM, CMDB, configuration management. I soon learned very quickly we were talking more Puppet and Chef and Ansible.
Anyway, and it was fantastic, because it’s opened up a door into a world that has evolved somewhat differently. And there were many aspects of culture that I think IT Service Management was not understanding very well, which was one of the reasons I was very keen to bring my kind of cross-domain perspective to ITIL 4, because I think a lot of us felt ITIL—bear in mind that version 3 originated before DevOps, it needed a big refresh. And that example of tickets absolutely is true. I mean, people sort of see this negative perception that, you know, to get infrastructure, we used to have to raise a ticket and wait three weeks—now, we can log onto a software-defined infrastructure tool and we can create it right away and it’s all delivered on cloud.
And absolutely, I mean, one of the—you know, we sometimes, I think the heritage of IT has created these impressions not in a vacuum. You know, they’re not false impressions sometimes. But I mean, ultimately, what is it? You know, a ticket is just a data record that represents a piece of work, and a lot of the work we’ve done around developing systems at BMC has been to try and sort of take away this old way of very sort of skeuomorphically presenting a form to somebody with all the fields on that piece of work, you know, all the things about that.
And I wrote a blog on this a year or two ago where I likened it to an old taxi booking service I used to use in California if I flew to work, where you had to kind of fill in about three pages—box, box, box, box, box. All those things are important pieces, probably, for the taxi operator, but you know, this is the world where I can open up my smartphone and press one button and get a taxi, you know? I don’t have to care about all that stuff.
Likewise, I think where we’ve had to move on is to sort of—you know, what we’ve had to learn as a community on the service management side and maybe, you know, have to convey more to the DevOps world is that, you know, we all record units of work, we just don’t have to necessarily present it in such a complicated way. And we can, you know, we should focus on the actual flows and the people and not this sort of big-ticket data problem. You know, we probably still want to know that you’re deploying infrastructure, but we don’t have to make you fill out a great big form to do it.
Ashley: Well, and software engineers kinda think in terms of Kanban cards and stories and epics, you know, applying agile kind of things. Have some of those concepts, then, creeped into the idea around tickets within ITIL 4? Is there some intersection of those ideas at all?
Stevens-Hall: Yeah, I mean, [Cross talk]—go ahead, Akshay.
Anand: Oh, I was gonna say, in general, I think we definitely had a lot of inspiration across the entire ITIL 4 contributing team, whether architects, authors, reviewers, et cetera. There were definitely a lot of people who pulled on a lot of lessons learned from the world of agile software development, of DevOps, and of lean, and we tried to incorporate a lot of that thinking into ITIL 4.
It may have been there in ITIL version 3, but it possibly wasn’t at the forefront, it wasn’t embedded at the heart of the framework. But what we did with ITIL 4 was to say, “Look, these are important, you know? We need to be able to reinforce continual improvement, we need to be able to reinforce elimination of wasteful work. We need to reinforce, you know, that we’re working in complex environments.” Especially as we’re working in environments which, at this level of service management, are a mix of human and non-human interactions, and that is inherently complex and we need to reinforce things like iterative progression and the role of feedback in such a system and so on.
So, we brought a lot of those lessons in. Whether that was sort of very foundational models that we used in subsequent publications to present and frame certain arguments or whether that was highlighting specific techniques like Kanban or—I’m drawing a blank, sorry, I haven’t had enough coffee. But highlighting specific techniques like Kanban and others in other publications.
So, there’s another publication called High Velocity IT, and coincidentally, Jon was a contributor to that as well. And in that, we talk about how certain techniques can be used to meet certain objectives in high-velocity service organizations like fast development or creating faster feedback loops at the service level. Sure, we have it at a technical level, but we also need to be able to mirror that or replicate that in some shape or fashion at the service level. So, we definitely try to bring in as much of that foundational thinking from lean and agile and other best practices or emerging practices and bake that into ITIL 4.
But for the specifics on how we did that is, with queueing and swarming and so on, that’s a lot of the stuff that Jon helped author.
Stevens-Hall: Yeah, and I think one of the big—I mean, taking a step back from just the concept, you know, the rethinking about things like tickets, one of the things I’m very happy, one of the reasons I was happy to get involved with ITIL 4 was that it’s shifted a lot of thinking, and again, learning from DevOps and lean in particular and making us think much more about value than process and control. And, of course, that’s been a challenge for us to position that we’re thinking that way, but ultimately, I want to sort of talk about the positive things.
You know, I talked to developers in the DevOps world who see huge value in people taking some of the leg work off their workload, you know? So, we all know that those developers are delivering the most value if they’re innovating and developing. If we can bring service management techniques, better align it with DevOps so that any of the support tickets that are coming the developer’s way, you know, the issues that inevitably come up when the—and especially as DevOps in enterprises goes from sort of small little startup exercises to big and normal and widespread, suddenly, those developers often find themselves doing half of their work resolving issues that have come in, you know, come in from the monitoring or from people. And one thing that service management has done extremely well for a long time is help to kind of industrialize and produce high quality support channels.
So, what we are in a great position to do is actually, you know, with that focus on value, you know, in a lean culture, if it’s not valuable, why the hell are we doing it, by kind of moving on from some of the old concepts, but taking all the things we’ve done very well and working with the DevOps community with a much better understanding of the way they’ve learned, you know, drawing on their lessons, but at the same time, pushing the fact that there is huge value that we can deliver to help them do what they do, then, you know, suddenly, this IT service management and the ITIL framework are in a much better position to help those things. And I’m seeing people who were very skeptical before now saying, “Actually, this is how we unlock the organization’s confidence and give them that ability to go faster with DevOps.”
Ashley: You know, it’s really heartening to hear that’s the approach that you both have taken with this, because it seems, you know, there’s sort of the impression of process for process’ sake where, no matter what we’re talking about, and of course, you know, many folks that are in this highly iterative—right, there’s process in the way we create software that way.
But if the service management and the software creation processes can better mesh together, right, so that it’s not viewed as an impedance when someone gets a ticket, right, and has to conform to an old process—but it flows with how we develop software and release the change that might go through that. I think maybe that’s that continuity that we start to see between ITIL and DevOps and lean and agile to help everybody see how it can benefit what we’re doing.
Anand: I remember a conversation with a gentleman called Greg Sanka, who is the CIO of Oregon State Administrative Services. This is a conversation maybe last summer, and we were talking about the guiding principles that we introduced in ITIL 4. In particular, we talked about focus on value. As a principle, we need to be able to focus on value—Jon has just referred to that, as well.
But what we are talking about is, nobody—very few teams are consistently sure on what value means and how value changes from person to person. So, of course, there’s value for customers who are using the software and the applications that we’re deploying. But there’s also value for your risk officer, there’s value for your supplier, there’s value for your CFO.
Now, a lot of these people aren’t involved in the actual use of the software day to day, they’re not involved in the development of the software, et cetera, but we’ve got to be able to architect the solutions, the service management wraparound, et cetera, et cetera to be able to create that value.
Now, part of the reason why sometimes IT service management and ticketing that you were mentioning has a bad reputation is, that’s how that assurance was created at a certain point in time.
But we now have to go back and say—look, because technology has changed, we’ve removed certain risks that we were trying to mitigate 10 years ago, but we’ve introduced new risks that we didn’t have 10 years ago.
So, what does that assurance now need to look like? How do we re-architect our systems to accomplish that same assurance? How do we introduce more automation, remove the toil, et cetera, while providing—in order to provide that assurance?
Stevens-Hall: Yeah, it’s almost like the ticket was almost the kind of, the manifestation of the internal memo in a digital system, you know, the old paper memo. I mean, I’m old enough to remember those envelopes where you crossed out the old address and put the next one on and it went around the internal mail system.
But, you know, it is important to remember that these processes actually helped us to move that kind of customer support in the technical world from adhocracy and Post-it notes to a proper, robust, scalable solution. That moved us on to self-service, to knowledge management, to properly established professional ways of handling and resolving issues that are now extremely valuable when DevOps goes big in organizations and it hits the mainstream and it hits a large volume of users and starts having to deal with support issues.
The more we can use these established service management practices (albeit modernized, heavily rethought, you know, focused much more on value than process for process’ sake), the more we can do that, the better—and this is proving itself over and over again with our customers—the better it actually helps us to enable DevOps to go, you know, to accelerate, to deliver that huge value that we know it can create.
Ashley: Well, it seems like the, some of the process and the auditability, if you want to think about the benefits of why you have processes, right, so consistency and reaction to issues and resolving them—those are some things that can actually benefit people that are working in a high-velocity environment, because sometimes that can be a bit chaotic, or it’s just extremely fast. Obviously, things are happening very quickly.
And then when you need to start interfacing with other parts of the organization, whether it’s, you know, do we do vulnerability scanning for the security group, you know, how are we handling resolution time for production issues? That’s where that intersection can really be helpful, and some of those well-defined processes and ways of measuring value from an operational standpoint, development teams don’t have to do that, because you’ve already done that through, or are in the process, right?
Stevens-Hall: Yeah, and also one of the good things in ITIL 4 that we’ve been able to do is actually learn very much hands-on from DevOps. So, one of the big subject areas that’s really growing in interest on both sides, as it were—and I kept on using the word “sides,” but you know where I’m coming from—is this idea of using a more swarming technique than kind of the traditional structures.
And so, this is a really good example because, you know, very traditionally, service management has kind of operated a three-tier support structure where you’ll kind of have a front line at tier one, which will try and probably mostly successfully fix most minor things, you know, then if they can’t, they’ll escalate to a more expert level, tier two. You know, back in the day, that would’ve been the service desk agents who then did their Microsoft certificates, for example, and became a slightly more specialist team.
And then things that can’t be solved there, you know, 5, 10 percent or whatever would end up with tier three teams who are more likely to be developers, specialists. There’s nothing wrong with having those teams, but what is wrong is a couple of things. You know, the first is that issues can often take a long time to get to that team, because what we’re effectively doing is putting queues in the way, you know, so it goes into the tier two queue. Someone then has X hours or days to pick it up, so it sits in that queue. It then goes to tier three and sits in a queue, and by the time it’s got there, we’ve already got an angry customer, even though everyone might have met their operational agreements. The other thing that happens is the sort of ping pong effect where, you know, if it goes to a tier—these things that get, especially, to the tier-three level, they might need input from several development teams. They might need a database team to have input to it, they might need a network team or whatever.
And with these kind of traditional silos, although, if the ticket’s not gonna leave that silo, it’s absolutely the right place for it to be, things ping around and you end up with problems like champions emerging, the one person everybody knows can fix it every time, and so they pay their mortgage off in overtime, but they always look a bit unhealthy, and then they leave the company and everybody’s in trouble.
So, swarming has helped us, actually, it really is taking much leaner techniques. It’s about sort of being more dynamic in the way things are brought together as a team, you know, tearing down some of these silo structures, and in particular, removing queues of unfulfilled work in progress.
Ashley: Describe for us—
Stevens-Hall: And that way, we’re really learning directly from lean, we’re really learning directly from the DevOps focus on having much better flow and reducing handoffs and, in particular, not letting things build up in queues.
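To make the queueing point above concrete, here is a minimal sketch of the arithmetic, not anything taken from ITIL 4 or from BMC's practice. The `Stage` type, the function names, and every number are hypothetical and chosen only for illustration. It shows how a ticket can spend most of its elapsed time waiting even when each tier picks it up within its own target, and how collapsing tiers removes queues rather than work.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """One step in a ticket's journey (hypothetical model)."""
    name: str
    queue_wait_hours: float  # time the ticket sits before anyone picks it up
    work_hours: float        # hands-on time actually spent on the ticket


def elapsed_hours(stages):
    """Customer-visible time: every queue wait plus every work step."""
    return sum(s.queue_wait_hours + s.work_hours for s in stages)


def waiting_share(stages):
    """Fraction of the elapsed time the ticket spent sitting in a queue."""
    total = elapsed_hours(stages)
    waited = sum(s.queue_wait_hours for s in stages)
    return waited / total if total else 0.0


# Assumed figures: a ticket escalated through three separate queues.
tiered = [
    Stage("tier 1", queue_wait_hours=1.0, work_hours=0.5),
    Stage("tier 2", queue_wait_hours=8.0, work_hours=1.0),
    Stage("tier 3", queue_wait_hours=24.0, work_hours=2.0),
]

# Assumed figures: tiers 1 and 2 paired up, tier 3 handled as a swarm.
swarmed = [
    Stage("tier 1+2 pair", queue_wait_hours=1.0, work_hours=1.0),
    Stage("tier 3 swarm", queue_wait_hours=2.0, work_hours=2.0),
]

for label, stages in (("tiered", tiered), ("swarmed", swarmed)):
    print(f"{label}: {elapsed_hours(stages):.1f} h elapsed, "
          f"{waiting_share(stages):.0%} of it waiting")
```

With these made-up inputs the hands-on work is similar in both cases; what changes is the time spent in queues, which is the effect described in the conversation.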
Ashley: For someone who’s not familiar with swarming, describe how that would work in this context.
Stevens-Hall: Okay, well, I mean, at BMC, we’ve done this in our own customer service team for a while, and what it effectively means is a couple of things. You know, you’ll always have kinda, the most severe issues have always pretty much been swarmed everywhere. And that’s, you know, everything’s on fire, so we’ll get everybody we need into a room and we’ll sort it out. And so, that’s always kind of been the most basic form of swarming, even though we haven’t really called it that. And that’s not really where the novelty is.
For us, the novelty is, we’ve collapsed that tier one and tier two, so instead of having that kinda chain where somebody fairly qualified takes a look and then if they can’t fix it, somebody more qualified takes a look, we kind of put those two people together. And instead of waiting for things to trickle through queues, we look at the things as they’re coming in, but the support people will, maybe every hour, just look at what’s coming in that hour and be very reactive—okay, we can fix those three things, we’ll just get it done, we’ll qualify the rest very well. And you get this lovely sort of information sharing between the two levels as well. So, they learn from each other. I think pair programming is an interesting analogy from the modern software development world.
So, that actually increases the flow very significantly through those first—you know, it gets things fixed quicker and it increases the throughput. Then when things do go to sort of the tier three product specialist groups, they’re not allowed to start assigning it around other tier-three groups. So, if they’ve got one of these more complicated issues, instead of doing that, they’ll pull together a group of people into a swarm and work it that way.
And it’s all—they’ve got a lot of leeway to be very self-organizing which, again, probably doesn’t sound like the impression that service management has always given. So, it’s not that rigid, you know? They will figure out what works best for their team. So, we have some teams hold three sessions a day at fixed times and people can come in with issues. We have others that spin things up on a more ad hoc basis, but the interesting point is there, again, things are not going from bucket to bucket, you know, queue to queue, they’re being owned by the person who takes command of them and we’ve found there’s really great engagement, people coming together in a Microsoft Teams session or on a call or face to face way back when. And again, it looks a lot more like the kind of practices that have evolved in the DevOps community. So, again, we’re learning to sort of focus on delivering the value rather than focus on adhering to sort of a pre-defined, rigid process.
Ashley: Mm-hmm. I’ve been there, actually. I know you were trying to—I think I may have stepped on what you were gonna say, so jump in.
Anand: Oh, no. I think it works. As Jon was saying, you know, back in the day or probably even still to this day, you know, every time there’s a priority one incident or a severity one incident or one of those really big things, you know, everything’s on fire type of thing, you know, we used to call it a war room back in the day, we used to have conference bridges back in the day.
You know, so that technique was perhaps limited in its application, but we’re now starting to see how that same technique is being, if you will, scaled-down or right scaled-down to the appropriate size and being used across the support organization and not just limited for use to, you know, the building’s falling down type of issues.
Stevens-Hall: And one of the great things that’s happened as we’ve sort of distributed this messaging and talked more about it is, actually, it’s got huge engagement in the DevOps community. I’ve presented on this at a number of DevOps events, you know, right up to DevOps Enterprise Summit.
Because actually, what I believe it’s doing is, it’s sort of showing the DevOps community that there is, not only is the support, the world of kind of technical support and service management, you know, that sort of structured technical customer support and service management, we’re looking to try and offer value to the DevOps community, but we’re also doing it in a way that is much better aligned with the way the DevOps community has learned to work. And, you know, so rather than those developers ending up just being treated as a third line support team and having things thrown over the fence to them, it enables us to kind of bring a much more collaborative system together and reduce those kind of pockets of those silos where knowledge doesn’t escape.
So, the value—again, thinking about the value—the value, again, to those teams working in the DevOps world is that we can actually be embedding support people much closer to what they do and those support people learn how to fix the issues. And then when there’s more of those issues, those support people are there as professional support, you know, as support professionals who are good at this kind of thing and can be figuring out ways to proactively resolve those things, automate those processes, move towards customer self-service as an ultimate goal, perhaps, or self-healing and get those developers developing.
Ashley: Yeah, it also gets them more—I’d say more closely aligned to knowledge about what the application is, how it’s structured, who to contact, you know, help them maybe even diagnose more of the issues themselves.
Stevens-Hall: Yeah, so again, it perhaps creates some parallels with what, again, companies like Google have innovated with site reliability engineering. It’s not quite the same, you know, these are not necessarily developers and I often caution people from kind of just saying you can just do SRE, you can get service management kind of spaces like the service desk and do SRE. You’re not really doing SRE in the strict sense the way it’s known in the DevOps community where it is kind of developers doing proactive things with infrastructure.
You know, if you look at an SRE job spec, it’s gonna be a bunch of programming languages and related experience, but we can still take, again, those learning principles, you know, the idea, for example, of thinking in terms of error budgets, you know, so rather than every failure being a major—you know, every time we lose a bit of our remaining availability budget against that target, you know, that could be, that’s always been seen as negative, I think.
But what SRE has introduced is the concept of actually using some of that spare availability capacity to be proactive and make things work better. I like—an analogy I do is, “Why don’t we take somebody off the service desk calls for an hour?” It means our call pickup percentage is 98 percent instead of 99 percent, but they could spend that time doing something that really, really drives improvement.
So, again, these techniques that have evolved in DevOps really sort of cross-pollinate well in the service management space and enable us to better connect to the way DevOps works. And, again, help them develop.
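The budget idea applied to Jon's service desk analogy can be sketched as simple arithmetic. This is an illustrative sketch only: the function names, the monthly call volume, and the choice of 98 percent as the agreed pickup target are all assumptions, not figures from the conversation or from any SRE tooling.

```python
def allowed_misses(total_calls: int, target_percent: int) -> int:
    """Missed calls the team can absorb and still meet the pickup target."""
    return total_calls * (100 - target_percent) // 100


def remaining_budget(total_calls: int, target_percent: int, missed_calls: int) -> int:
    """Budget left to spend on improvement work (negative means overspent)."""
    return allowed_misses(total_calls, target_percent) - missed_calls


# Assumed figures: 1,000 calls a month against a 98 percent pickup target,
# with the desk currently picking up 99 percent (10 calls missed).
total_calls = 1000
target_percent = 98
missed_so_far = 10

budget = allowed_misses(total_calls, target_percent)
left = remaining_budget(total_calls, target_percent, missed_so_far)
print(f"budget: {budget} missed calls, used: {missed_so_far}, left: {left}")
```

Under these assumptions the desk has headroom of ten missed calls, which is the spare capacity that could be traded for an hour of improvement work instead of being treated as a failure.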
Ashley: Mm-hmm. Well, excellent. This has been fascinating. I’ve really enjoyed talking, and I’m sure the audience does, too, that’s listening in. It’s great to talk to the contributors, to the authors of this, because you get the benefit of both, your writing, the knowledge that’s shared in that medium, but also—what was the thinking behind it? What were the influences and how did you get to those conclusions and sorta that context is also super helpful. So, this has been really great. I’ve enjoyed talking with both of you.
Anand: Same here.
Ashley: Great. Well, my thanks to Akshay Anand who is Product Ambassador with AXELOS and Jon Stevens-Hall, Principal Product Manager at BMC Software. Thank you, gentlemen.
Stevens-Hall: Thank you.
Anand: Thanks, Mitch, and thank you for listening to us today.