Few topics within DevOps discussions elicit more controversy than the relationship between DevOps and information security. Do DevOps practices help to improve security, or do most organizations simply automate their existing bad internal processes and carry them along into DevOps? With this topic in mind, I recently had an engaging conversation with Adam Muntner on this and more, and thought it would also be of interest to DevOps.com readers.
Adam Muntner is a security engineer at Mozilla and the creator of FuzzDB, an open-content licensed attack and discovery pattern database of test cases for software security testing and research. He has 20 years of experience in security governance and management, technical assessment, research, and consulting, with a focus on secure software development. The opinions expressed are his, not his employer’s.
Here is our recent conversation on DevOps and security, edited for brevity and clarity.
DevOps.com: You’ve taken issue with some of the claims that DevOps can help to improve security. One example was my writing, “several years into enterprise DevOps deployments, what we are actually learning is that unifying development and operations teams through DevOps efforts does not create the wild west environments feared,” in my post DevOps Security Comes of Age. Another was from this Q&A with Gene Kim, where he said that high performers are able to conduct more frequent code deployments, complete their deployments from code committed to running in production more quickly, that they enjoy twice the change success rate, and that their outages are fixed 12 times more quickly.
I had used these findings to support the argument that DevOps helps to improve security outcomes. You took issue with these statements. Why?
Muntner: I’m not opposed to DevOps or packaged deployment, rather I’m for taking a holistic view of benefit and risk from an informed perspective. The thing to keep in mind is that the security of an application environment is inherited – it’s the aggregate result of all its component pieces.
When system architectures are so complex that using opaque, componentized, pre-built pieces in an application environment becomes the norm, how do you know what’s in them? How much attackable surface area is being inherited?
Gene Kim is a smart guy and chose his words carefully with regard to security. Gene said,
“The first one is the high performers were massively outperforming their non-high performing peers. Thirty times more frequent code deployments. They were able to complete the deployments from code committed to running in production 8,000 times faster. In other words they could do a deployment measured in minutes or hours. Whereas low performers it took them weeks, months, or quarters. Furthermore, they actually had better outcomes. Twice the change success rate and when they caused outages they could fix it 12 times faster. There were 4,200 that completely responded and it just showed that you could be more agile and reliable at the same time. I think if we just connect some dots, they’re also going to be more secure, they’re probably going to have a faster find fix cycle time. We didn’t actually test that but, if that were true it would mirror what we found in the previous benchmarking work I had done.”
The old engineering maxim of “good, cheap, fast: pick two” has not been rendered obsolete by DevOps. Deploying code 8,000 times faster is not a measure of risk reduction. It might help get fixes out faster, but that doesn’t tell the whole story.
Gene said, “when they caused outages they could fix it 12 times faster,” but how many of those outages are the result of faster deployment cycles? What does it mean to fix an outage 12 times faster, compared with the outage never occurring at all under a different development practice? Fixing outages quickly is good; that’s what is seen. It doesn’t take into account the unseen: newly introduced weaknesses that aren’t being tested or measured for.
How much of what was measured resulted from the security practices those organizations already had outside of DevOps, and the outcomes those practices produced? What does “better outcome” mean, and is that even objectively quantifiable across different organizations, or even across situations in the same organization? Are we confusing correlation and causation? Every organization is different. The metrics are being held up as quantitative proof, but it’s not certain that they measure what they purport to measure.
Consider this scenario: Operations staff are incentivized through the salary review/bonus process to maintain uptime. The latest code push worked in the test environment, but it broke your application in production. The team redeploys the Docker container and the problem is still there. What now? Go back to the old version that ran without crashing but (possibly unknown to your operations team) contains a known vulnerability in some library that is statically compiled into a binary?
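To make that unseen risk concrete, here is a minimal sketch in Python, assuming a toy advisory list and a toy rollback manifest (both illustrative, not a real vulnerability feed or package format), of gating a rollback on known-vulnerable library versions:

```python
# A toy advisory set; (library, version) pairs known to be vulnerable.
# The data is illustrative, not a real feed.
KNOWN_VULNERABLE = {
    ("openssl", "1.0.1f"),  # a Heartbleed-era build, for example
}

# Library versions pinned into the "last known good" artifact.
rollback_manifest = {"openssl": "1.0.1f", "zlib": "1.2.11"}

def check_rollback(manifest):
    """Warn when a rollback would reintroduce a known-vulnerable library."""
    findings = [
        (lib, ver) for lib, ver in manifest.items()
        if (lib, ver) in KNOWN_VULNERABLE
    ]
    for lib, ver in findings:
        print(f"WARNING: rollback reintroduces vulnerable {lib} {ver}")
    return not findings

check_rollback(rollback_manifest)
```

The point is not the tooling, which any team could assemble, but that without some check like this the rollback decision is made purely on the visible uptime signal.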
DevOps.com: What about the argument that DevOps, by helping to reduce code bloat from legacy environments, goes a long way toward reducing both technical and security debt?
Muntner: Reducing bloat might be one of a number of goals of a DevOps team, but DevOps practices are just as likely to increase code bloat, opacity, and attackable surface. Security is not an inherent property of DevOps practices. For example, software that can only be installed through prebuilt packages may be perceived as simpler, but that perception creates only a dangerous illusion of simplicity.
Relying on these opaque components increases code bloat in unpredictable and unknown ways, potentially degrading security posture. If there even is a threat model (usually there isn’t; they’re hard and time-consuming to produce, thus expensive), then a popular software package like Hadoop probably won’t be more than a single box in it, when in reality it’s an aggregation of many components and nth-order subcomponents, each with its own attackable surface area, depending on configuration. If it’s being installed as part of a prebuilt environment, there are many possible security consequences, and none of them are addressed or measured by any of the types of tests or metrics that have been described.
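As a rough illustration of how one “box” fans out, here is a minimal sketch that expands an installed Python package into its transitive dependencies using only the standard library; the package name “requests” is just an example, and the requirement parsing is deliberately simplistic:

```python
# A minimal sketch: expand one installed Python package into its
# transitive dependencies, showing how a single "box" in a threat model
# fans out into many inherited components. Standard library only.
import re
from importlib.metadata import PackageNotFoundError, requires

def dependency_tree(package, seen=None, depth=0):
    """Print the transitive dependency tree of an installed package."""
    seen = set() if seen is None else seen
    print("  " * depth + package)
    if package in seen:
        return  # already expanded elsewhere in the tree
    seen.add(package)
    try:
        deps = requires(package) or []
    except PackageNotFoundError:
        print("  " * (depth + 1) + "(not installed, cannot expand)")
        return
    for req in deps:
        if ";" in req:  # skip conditional deps (extras, platform markers)
            continue
        # Crude parse: keep only the bare distribution name.
        name = re.split(r"[ \[<>=!~]", req, maxsplit=1)[0]
        dependency_tree(name, seen, depth + 1)

dependency_tree("requests")
```

Even this shallow view usually reveals components a threat model never mentions, and it says nothing about what is statically compiled inside native binaries.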
Performing thousands of tests sounds good, but what if tens of thousands of tests are necessary? Or a completely different testing methodology and toolset? Where will the false negatives be, given the limitations of the existing set of tests? And most important, do the metrics support the claim?
There are many categories of attacks that these tests aren’t able to identify. Logic flaws are one. Different components of a system interpreting the same piece of (malicious or spurious) data to mean different things is another. Sometimes the problem isn’t a bug, it’s an architectural deficiency. If it’s deeply layered inside a component of your system that uses its own nonstandard build system, you probably won’t find it. Software tests cover known, expected code execution paths and interactions; it’s much harder to identify all possible execution paths and orders of operations.
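As a minimal sketch of that second category, consider two components parsing the same JSON payload containing duplicate keys: Python’s json module keeps the last occurrence, while a component that keeps the first (as some lenient parsers do) sees a different value. The “role” payload here is purely illustrative:

```python
# A minimal sketch of a parser differential: the same bytes mean
# different things to different components. The payload is illustrative.
import json

payload = '{"role": "admin", "role": "guest"}'

# Component A: Python's json module -- the LAST duplicate key wins.
print(json.loads(payload)["role"])  # -> guest

# Component B: a parser that keeps the FIRST occurrence, as some
# lenient parsers do; simulated here with object_pairs_hook.
def first_key_wins(pairs):
    result = {}
    for key, value in pairs:
        result.setdefault(key, value)  # ignore later duplicates
    return result

print(json.loads(payload, object_pairs_hook=first_key_wins)["role"])  # -> admin
```

If one component performs the authorization check and the other acts on the data, the attacker controls the disagreement, and no functional test of either component alone will flag it.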
DevOps.com: But all of those flaws can be found later in production, when they exist. It is not as if those manual tests are being done away with.
Muntner: Typically, security audits and assessments are time-boxed. It’s often hard to assess an entire large application inside the time box, let alone all the many opaque components. And then there is the question of pen testers relying on commercial automated tools where even the tests the tools perform and the included test cases are opaque. Ultimately, such assessments are a snapshot of the security posture at a point in time, given the limitations of the tester(s), tools, methodology, time, and the organization’s ability to efficiently consume and triage testing results and lessons learned. Given the rapid release cycles implied by DevOps and agile development, the results can go stale fast.
Adding unknown layers of dependencies and libraries stretches those already-thin review hours even thinner. Security resources are scarce and expensive. Organizations with mature application security assurance programs only apply those resources to the most critical applications. Outside consultants reviewing one application could cost $20,000 or more.
Automation is critical, but automated tests have big limitations and often miss configuration management details. Most important, not all vulnerabilities get found. If the automated tools don’t test for a vulnerability and the pen tester didn’t find it within the testing window, it’s still there. Good penetration testers are always finding bugs that weren’t found during previous time-boxed assessments. And so are malicious actors.
DevOps.com: All of these possible conditions exist in enterprises now. And certainly, automating bad practices could more rapidly propagate a mess, while thinking it through and automating the right things could improve real outcomes. So why can’t this approach produce more resilient code and architecture than current methods?
Muntner: Thinking security testing through and automating as much as possible will yield results, but that can happen with or without DevOps. I’m not saying DevOps is invalid, rather that it alone is not responsible for good outcomes. Believing that an approach delivers more than it really does creates only a false sense of security, which is arguably worse than awareness of insufficient security.
Secure systems and software development practices like command-safe APIs, network-layer features in TLS, HTTP-layer features like CSP, improvements in application- and protocol-layer firewalls, developers learning to do proper encoding for the appropriate output context, and automated testing with tools like OWASP ZAP or commercial equivalents as appropriate for the type of application are all high-impact, but they have nothing to do with DevOps.
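To ground two of those practices, here is a minimal sketch in Python, assuming an in-memory SQLite table and illustrative inputs: a command-safe API (parameterized queries) and proper output encoding for an HTML context:

```python
# A minimal sketch of two of the practices named above: a command-safe
# API (parameterized SQL) and output encoding for an HTML context.
import html
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

user_input = "alice' OR '1'='1"

# Unsafe (for contrast): string concatenation lets input rewrite the query.
#   conn.execute("SELECT * FROM users WHERE name = '" + user_input + "'")

# Command-safe: the driver treats user_input strictly as data, not SQL.
rows = conn.execute("SELECT * FROM users WHERE name = ?", (user_input,))
print(rows.fetchall())  # [] -- the injection attempt matches nothing

# Proper encoding for an HTML output context neutralizes markup.
comment = "<script>alert(1)</script>"
print(html.escape(comment))  # &lt;script&gt;alert(1)&lt;/script&gt;
```

Notice that nothing here depends on the deployment pipeline; these are properties of how the code handles untrusted data.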
DevOps.com: Security should be part of the flow, an integral part of QA alongside functional testing.
Muntner: Security isn’t a state, it’s a process. It’s a verb, not a noun. Security ‘what’ should be part of the workflow? Security activities and tests, personnel, all of the above? Should a security organization report to the business management and governance side of management, or the technical side? And why is DevOps better for security maturity than separation of duties?
DevOps is the approach we take at Mozilla, but the IT and product organizations at Mozilla are full of exceptionally talented people who would do a good job with security using any development process.
DevOps.com: Depends on the organization, I’m sure, but if an organization is going to move to DevOps, then security needs to be part of those processes.
Muntner: DevOps can be good or bad but DevOps doesn’t have much to do with security in and of itself, except for some of its popularly misunderstood practices, and then maybe not always in the ways people expect. Standardizing is good unless you standardize on terrible things.
DevOps.com: That’s certainly a risk. Many companies are implementing DevOps regardless, so it’s a matter of security teams working out the best way to integrate themselves and their processes.
Muntner: IT is a business support function. Security is a business risk analysis function. If you standardize and integrate things without understanding threat, risk and security posture, what have you done? Ultimately the decisions are business decisions. Unfortunately, they are frequently made from the perspective of insufficient knowledge.
There is nothing inherent to DevOps that solves these problems, and DevOps comes with its own potential pitfalls. The problem is external to what DevOps can deliver. For organizations that want to better understand the maturity of security in their software development lifecycle, OWASP OpenSAMM is a good place to start. It helps organizations understand what security maturity means for them specifically and incrementally advance toward their desired posture.