As the infosec community catches up with the spirit of the DevOps movement, many CISOs and IT security leaders still need guidance on how to make the paradigm shift. Not only are security teams scratching their heads over the additional risks posed by continuous delivery, but they’re also learning how DevOps can actually help them improve their effectiveness within their IT ecosystems.
Gartner recently took a deep dive into this topic with a report entitled “Security In a DevOps World”. DevOps.com caught up with one of the lead authors of the report, Ben Tomhave, research director for Gartner, to discuss some of his findings and to learn more about how he thinks security staff need to adjust philosophies and practices to peacefully co-exist with DevOps.
DevOps.com: How do you think security people need to change their philosophy in order to get the most value out of DevOps practices?
Tomhave: Getting away from that adversarial ‘party-of-no’ attitude is key. We have to get away from “Come to us, we’ll give you things” and start thinking of partnering, collaborating and mentoring. On the development side, security needs to be pulled in earlier in the process, providing faster feedback after code commits. And then it needs to work with operations to provide secure foundations: inherently secure environments that are secure by default, so that nothing changes after the code is written.
Because that’s been a huge problem, too, right? Where developers go write code in who knows what environment and then it propagates to QA, which looks a little closer and sends it to production. Then it launches and suddenly there are all these surprises, things are breaking left and right and there was no communication along the way. Everybody was working on different platforms, and ops maybe wasn’t even managing or maintaining the development platforms. We’ve got to get away from that and security is no different—it should be a part of the process. All the monitoring tools, all the hardening tools, all that stuff needs to be applied from dev through QA to production.
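To make that concrete, here is a minimal sketch of what “the same checks from dev through production” can look like in practice: one security smoke test, parameterized only by the target URL, that every environment must pass before code moves forward. The URLs and the header list are illustrative assumptions, not anything Tomhave prescribes.

```python
# Minimal sketch: one security smoke test, run unchanged in every
# environment (dev, QA, production) so there are no surprises at launch.
# The BASE_URL default and the required headers are illustrative
# assumptions, not a prescribed baseline.
import os
import sys
import urllib.request

# Same test, different target: the environment supplies the URL.
BASE_URL = os.environ.get("APP_BASE_URL", "https://dev.example.com")

REQUIRED_HEADERS = [
    "Strict-Transport-Security",  # force TLS on every visit
    "X-Content-Type-Options",     # block MIME-type sniffing
    "Content-Security-Policy",    # restrict script/resource origins
]

def check_security_headers(url: str) -> list[str]:
    """Return the required security headers missing from `url`."""
    with urllib.request.urlopen(url) as response:
        return [h for h in REQUIRED_HEADERS if h not in response.headers]

if __name__ == "__main__":
    missing = check_security_headers(BASE_URL)
    if missing:
        print(f"{BASE_URL} is missing security headers: {missing}")
        sys.exit(1)  # non-zero exit fails the pipeline stage
    print(f"{BASE_URL} passed the baseline header check")
```

Because the script is identical everywhere and only the target changes, a configuration drift between QA and production shows up as a failed check rather than a post-launch surprise.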
DevOps.com: What would you say are the biggest security concerns or risks that crop up in DevOps environments? What kinds of things could go wrong if security teams aren’t embedded in the DevOps patterns?
Tomhave: I think it’s the same old stuff: pushing out code that may be functionally correct but woefully insecure, with all of the usual gotchas, whether it be bad crypto, bad handling of identity management or insecure management of data. DevOps shops are not immune to these problems, either. One need only look at the Spotify disclosure that they had password resets because of a breach. That’s a big DevOps shop, and clearly some error was made.
The good news is that a DevOps shop that’s doing lots of code pushes should be able to resolve issues relatively quickly. But that makes it imperative that they work closely together, so that when there’s a break it is quickly triaged and rapidly remediated, with good, clear communication between the incident responders, whether they be security incident responders or operations incident responders, and the developers, who need that feedback in a timely fashion.
DevOps.com: It’s clear from your report that you think monitoring is more important than ever to support the rapidity of the DevOps cycle. Why is that?
Tomhave: You know, we expanded that DevOps mantra of fail fast, learn fast to add recover fast in there, too. If you want to fail fast, that’s cool. And if you want to learn fast, that’s also cool. But you also need to have rapid recovery as well, because the sooner you can detect a breach or an incident of some sort or a security event, the sooner you can start interdicting and remediating and taking positive actions to clean it up.
If you do DevOps with lots of operational transparency into process, but you don’t have adequate transparency into what the project is doing with the data involved, or overall security monitoring capabilities, you could have a lot of problems. Suddenly we’re rapidly iterating through features and code pushes that expand capabilities, and we never know whether that environment is being attacked, whether it is being adequately secured, or whether it is breaking down. So that transparency meme needs to expand to security monitoring as well, so that you have security operational transparency as well as process transparency.
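As one small illustration of that security operational transparency, here is a minimal monitoring sketch: scan an auth log for repeated failed logins and surface likely brute-force sources. The log path, line format and alert threshold are illustrative assumptions, and a real shop would wire the output into its alerting or ticketing system.

```python
# Minimal sketch of "security operational transparency": watch an auth
# log for repeated failed logins and flag likely brute-force sources.
# The log path, line format and threshold are illustrative assumptions.
import re
from collections import Counter

AUTH_LOG = "/var/log/auth.log"  # assumed syslog-style location
FAILED = re.compile(r"Failed password .* from (\d+\.\d+\.\d+\.\d+)")
THRESHOLD = 10                  # failed attempts per source IP

def scan_for_bruteforce(log_path: str) -> dict[str, int]:
    """Count failed-login attempts per source IP in the given log."""
    attempts = Counter()
    with open(log_path) as log:
        for line in log:
            match = FAILED.search(line)
            if match:
                attempts[match.group(1)] += 1
    # Only surface sources above the alerting threshold.
    return {ip: n for ip, n in attempts.items() if n >= THRESHOLD}

if __name__ == "__main__":
    for ip, count in scan_for_bruteforce(AUTH_LOG).items():
        # In a real pipeline this would page an on-call responder or
        # open a ticket; printing stands in for that here.
        print(f"ALERT: {count} failed logins from {ip}")
```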
DevOps.com: I was really interested in your advocacy for a security architect to help facilitate security meshing into DevOps. What does an organization need from this role and what are the challenges of finding the right person to do the job?
Tomhave: The challenge here is that what you really need is someone who is genuinely competent with development and preferably has some actual experience that allows them to establish credibility with the development team.
But they also must be really strong security people as well. They understand static analysis and dynamic analysis; they understand what the tools are doing and what they’re looking for. They must also understand design principles from a security perspective: why you don’t roll your own crypto, why you integrate with existing federated identity systems, why you do transport encryption and what an appropriate level of protection is for a given system.
Finding those two sets of qualities in a single person can be really challenging, and, more importantly, once you do find that person, they tend to be very expensive, especially for smaller shops. Think of all these mobile start-ups, for example. Their feet are being held to the fire; they’re being subjected to regulatory actions. We’ve seen the FTC go after some of these mobile app companies for violating privacy policies and things like that. So how do you get those security bodies in there when you may not be able to afford to hire someone?
Alternative staffing models really speak to that. They may need to find consultants through specialty shops or something along those lines to get someone to come in and be engaged either through a staff augmentation type of role or just on a per-project basis.
I think the big key is being able to do that around the design review up front and catch some of these problems early. Because no matter how much you invest in tools and how many times you scan the code base, either statically, dynamically or whatever, that’s only going to check maybe 40 percent of your code. And a lot of these functional weaknesses, you aren’t going to be able to detect with tools. You have to have somebody look at the design and say, ‘Oh, you’ve completely forgotten to do encryption,’ or, ‘Oh, I see you’re hashing, but you haven’t salted your hash, so how are your passwords secure?’ or ‘Why are you dealing with credit card data here? This shouldn’t be done this way; you should just be interconnecting with our payment system over there. Quit reinventing the wheel.’
Those types of things need to be caught by architects early on, rather than trying to catch them later with tools in the code.
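The salted-hash problem Tomhave cites is easy to show in code. Here is a minimal sketch using only Python’s standard library; the iteration count and salt size are illustrative, and a real design review would choose them deliberately.

```python
# Minimal sketch of the salted-hash point above, using only the Python
# standard library. The iteration count and salt size are illustrative
# choices, not recommendations from the interview.
import hashlib
import hmac
import os

ITERATIONS = 600_000  # deliberately slow the hash to resist brute force

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt per password means
    identical passwords produce different digests, defeating
    precomputed rainbow tables -- the flaw an unsalted hash leaves
    wide open."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)  # constant-time compare

if __name__ == "__main__":
    salt, digest = hash_password("correct horse battery staple")
    assert verify_password("correct horse battery staple", salt, digest)
    assert not verify_password("wrong guess", salt, digest)
```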
DevOps.com: Speaking of tools, you dedicated time in your report discussing the importance of tool chains and automation. Automation has always been important to security, but what’s changed now with DevOps?
Tomhave: Well, I think the big thing is that almost all of the testing has to be automated up front. Static code analysis should be automatically triggered nightly against repository check-ins, as much as possible. Same with dynamic analysis: dynamic app sec testing should be automated, with findings fed directly into the issue management systems. There shouldn’t be much human touch there at all, aside from monitoring for, addressing and resolving false positives in those tools, and possibly adding some customized checks looking for specific libraries or frameworks that should be in place or should be used within the tools.
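As a sketch of that automated loop, the snippet below runs Bandit, a static analysis tool for Python code, against a repository and pushes each finding toward issue tracking with no human in the middle. The file_issue() helper is hypothetical; a real pipeline would call your tracker’s actual API.

```python
# Minimal sketch of automated static analysis feeding issue tracking.
# Bandit is used as a stand-in SAST tool for a Python code base; the
# file_issue() hook is hypothetical and would be replaced by a real
# tracker integration (Jira, GitHub Issues, etc.).
import json
import subprocess

def run_static_analysis(repo_path: str) -> list[dict]:
    """Run Bandit over the repo and return its findings as dicts."""
    proc = subprocess.run(
        ["bandit", "-r", repo_path, "-f", "json"],
        capture_output=True, text=True,
    )
    return json.loads(proc.stdout).get("results", [])

def file_issue(finding: dict) -> None:
    # Hypothetical hook: printing stands in for posting to a tracker,
    # keeping the sketch self-contained.
    print(f"[{finding['issue_severity']}] {finding['filename']}:"
          f"{finding['line_number']} {finding['issue_text']}")

if __name__ == "__main__":
    for finding in run_static_analysis("."):
        file_issue(finding)  # no human touch until triage
```

Run nightly from a scheduler or CI job, this keeps the feedback loop Tomhave describes closed without anyone manually kicking off scans.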
The only place where I think we want any sort of manual intervention is the first up-front quick risk analysis: deciding whether this is a high-risk project that’s dealing with sensitive data or has uniquely high uptime requirements, or a lower-risk project where we’re not really concerned about losses. I think that, along with some of the design review considerations, would need some manual hands and eyes involved.
And then everything should be automated until you get to the end, where you say, ‘OK, this is a high-risk project, we need to put extra testing into it, so now we’re going to kick off pen testing, where we need to have some manual validation in place.’ That validation not only deals with false positives but really does a deeper dive into the application. So you fuzz the interface, it looks like that broke some stuff, and you say, ‘Yeah, we validated and proved it, and here’s the exploit. This is how it broke.’ Then feed that back in a credible fashion to developers to show here’s the spot in the code with the error, or this is the logic error that led to whatever the condition may be.
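A bare-bones version of that fuzzing step might look like the sketch below: throw malformed input at an interface, record exactly which payloads produced server errors, and hand those payloads back to developers as reproducible evidence. The target URL and the query parameter name are illustrative assumptions.

```python
# Minimal sketch of the fuzzing step described above: send malformed
# input at an interface and keep the exact payloads that broke it, so
# the failure can be handed back to developers credibly. The target
# URL and the "q" parameter are illustrative assumptions.
import os
import random
import urllib.error
import urllib.parse
import urllib.request

TARGET = "https://qa.example.com/search"  # assumed endpoint under test

def random_payload() -> str:
    """Mix raw bytes, oversized runs and classic injection characters."""
    choices = [
        os.urandom(32).decode("latin-1"),     # arbitrary byte values
        "A" * random.randint(1_000, 50_000),  # oversized input
        "'\" <script>--;%00",                 # classic injection chars
    ]
    return random.choice(choices)

def fuzz(url: str, rounds: int = 100) -> list[str]:
    """Return the payloads that produced a server error (HTTP 5xx)."""
    crashes = []
    for _ in range(rounds):
        payload = random_payload()
        query = urllib.parse.urlencode({"q": payload})
        try:
            urllib.request.urlopen(f"{url}?{query}", timeout=5)
        except urllib.error.HTTPError as err:
            if err.code >= 500:
                crashes.append(payload)  # evidence for the bug report
        except urllib.error.URLError:
            pass  # network hiccups are not application crashes
    return crashes

if __name__ == "__main__":
    for payload in fuzz(TARGET):
        print(f"Server error reproduced with payload: {payload!r}")
```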