Moving to the cloud creates a tremendous opportunity to get security right and reduce the risk of a data breach. Most organizations start their cloud security efforts by ‘shifting security left’, addressing issues in the CI/CD pipeline. Fixing known vulnerabilities and risky configurations before pushing images to production makes sense. But there is an opportunity to reduce risk by shifting security even further left, to the point where you set up your cloud infrastructure. Infrastructure-as-code (IaC) security allows you to define security controls as you configure your cloud infrastructure.
Before we dive into IaC security, let’s talk about why platform teams want to deploy infrastructure-as-code as they move through their cloud journey. Most organizations have not yet tackled IaC. They start their journey by experimenting with cloud configurations until they have defined a stack that works for their teams. In this early stage, application teams may be defining their own cloud stacks. They are manually configuring infrastructure or using scripts. Configurations may not be standardized across teams.
Although experimentation is reasonable early on in the journey, standardization of infrastructure saves time. Cloud services are no different than applications that run on premises. When your teams are using different versions of tools in their stack, getting an application up and running can be a nightmare. Interoperability between services is challenging, and it can feel like you are fixing a different flavor of the same problem over and over.
Mature cloud platform teams typically prefer to create a ‘golden image’ with a baseline set of services for development teams. They may create a menu of a few standard options to meet different teams’ needs. Because these images are standardized, the platform team can take time to validate that the services in them interoperate and are secure. For example, golden images are regularly scanned for vulnerabilities and updated to ensure they meet internal security standards and any relevant regulatory requirements.
Manually configuring the environment in the cloud for each team is tedious. Once these configurations have been standardized, the cloud infrastructure can be provisioned as code using templates. Templates can follow one of two approaches: a declarative approach defines the desired end state and leaves the tooling to work out how to reach it, whereas an imperative approach defines the detailed steps to achieve the end state. The templates are normally stored in a Git repository and version controlled. Automating the configuration of infrastructure not only saves time but also reduces errors that can disrupt operations.
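The difference between the two approaches can be sketched in a few lines of Python. This is a minimal illustration, not how any particular IaC tool works: the in-memory `cloud` dictionary, the resource names and the `reconcile` function are all hypothetical stand-ins.

```python
# Illustrative only: the "cloud" is a plain dict and every name is hypothetical.

def provision_imperative(cloud: dict) -> None:
    """Imperative: spell out each step, in order."""
    cloud["network"] = {"cidr": "10.0.0.0/16"}
    cloud["web-server"] = {"size": "small", "network": "network"}


# Declarative: describe only the end state you want.
DESIRED_STATE = {
    "network": {"cidr": "10.0.0.0/16"},
    "web-server": {"size": "small", "network": "network"},
}


def reconcile(cloud: dict, desired: dict) -> list:
    """Compare the current state to the desired state and apply only the
    differences. Returns a list describing the actions taken."""
    actions = []
    for name, spec in desired.items():
        if name not in cloud:
            cloud[name] = dict(spec)
            actions.append(f"create {name}")
        elif cloud[name] != spec:
            cloud[name] = dict(spec)
            actions.append(f"update {name}")
    # Anything present but not declared is removed.
    for name in list(cloud):
        if name not in desired:
            del cloud[name]
            actions.append(f"delete {name}")
    return actions
```

Running `reconcile` a second time against the same desired state takes no action, which is one reason declarative templates tend to be easier to re-apply safely than a script of steps.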
Once you have templates that define your infrastructure, standardizing the permissions and access rights for services as code is a logical next step. The security team typically defines the guardrails or policies for accessing cloud services. These policies can then be defined as code along with the infrastructure configurations. This IaC security is a further ‘shift left’, allowing security to be embedded into infrastructure even before an application is deployed. Implementing policy-as-code minimizes the opportunity for human error in configuring security services. By defining these policies and enforcement guidelines together, the DevOps and security teams can establish a trusted partnership.
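As a rough sketch of what policy-as-code can look like, the Python below evaluates a template against two guardrails before anything is provisioned. The template format, resource fields and policy names are assumptions made for this example and do not correspond to a specific policy engine.

```python
# Hypothetical template format and policy names, for illustration only.

def no_public_storage(resource: dict) -> bool:
    """Storage buckets must not be publicly accessible."""
    return not (resource["type"] == "storage_bucket" and resource.get("public", False))


def encryption_at_rest(resource: dict) -> bool:
    """Storage buckets must have encryption enabled."""
    if resource["type"] != "storage_bucket":
        return True
    return resource.get("encrypted", False)


POLICIES = {
    "no-public-storage": no_public_storage,
    "encryption-at-rest": encryption_at_rest,
}


def evaluate(template: list) -> list:
    """Return one 'resource: policy' entry for every violated guardrail."""
    violations = []
    for resource in template:
        for policy_name, check in POLICIES.items():
            if not check(resource):
                violations.append(f"{resource['name']}: {policy_name}")
    return violations
```

Because the policies live in the same repository as the templates, a change to either one can go through the same review and version control process.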
Although some organizations may start their journey by implementing IaC security, for most organizations it happens later in the cloud journey. The order is less important than ensuring you take a DevSecOps approach and embed security at each stage to manage risk. Be sure to address the following areas as part of your cloud and container security program:
Even if you have IaC templates, you need a program to regularly take inventory of all cloud assets and confirm permissions and configurations. The Center for Internet Security (CIS) benchmarks are commonly used to check settings such as password policies and access rights. Beyond enforcing these security best practices on cloud services, continuously checking for changes to permissions and for suspicious activity is critical. The cloud vendors capture logs across all of their services, and your security tool should continually monitor those logs and alert on what it finds.
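A periodic inventory check might be sketched as follows, assuming you can export the assets and permissions your templates expect and the ones actually present in the account. The data shapes and the asset and permission names are hypothetical.

```python
# Hypothetical data shapes: asset name -> set of granted permissions.

def audit_inventory(expected: dict, inventory: dict) -> dict:
    """Compare what the templates expect with what actually exists."""
    findings = {"unmanaged": [], "missing": [], "excess_permissions": []}
    for asset in sorted(inventory):
        if asset not in expected:
            # Asset exists in the cloud but is not defined in any template.
            findings["unmanaged"].append(asset)
            continue
        extra = inventory[asset] - expected[asset]
        if extra:
            findings["excess_permissions"].append(
                f"{asset}: {', '.join(sorted(extra))}"
            )
    # Assets the templates define but the inventory did not find.
    findings["missing"] = sorted(set(expected) - set(inventory))
    return findings
```

In practice the output of a check like this would feed the same alerting path as the log monitoring described above.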
Scanning for vulnerabilities and ensuring configurations use best practices allows you to shift security left into the development pipeline. Scan everywhere you have images, including registries and pipelines, and keep checking images running in production for newly identified vulnerabilities. Be sure to scan hosts as well as containers. Defining scanning policies and enforcing them throughout the development process creates guardrails for the development team and clarifies expectations. Most regulatory compliance standards have image scanning requirements as well.
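A scanning policy that acts as a guardrail can be as simple as a severity threshold applied to scanner output. The sketch below assumes a hypothetical findings format (`id`, `severity`, `fix_available`); real scanners report more fields and use their own schemas.

```python
# Hypothetical findings format; severity names and ranks are assumptions.
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "critical": 4}


def blocking_findings(findings, fail_at="high", require_fix=True):
    """Return the IDs of findings that should stop an image from being promoted.

    fail_at:     lowest severity that blocks the pipeline.
    require_fix: if True, only block on findings that already have a fix,
                 so developers are not blocked on issues they cannot resolve.
    """
    threshold = SEVERITY_RANK[fail_at]
    blocked = []
    for finding in findings:
        severe = SEVERITY_RANK[finding["severity"]] >= threshold
        fixable = finding.get("fix_available", False)
        if severe and (fixable or not require_fix):
            blocked.append(finding["id"])
    return blocked
```

A pipeline step would fail the build when the returned list is non-empty. Whether to block only on fixable findings is a policy decision for the security and development teams to agree on.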
Prevention is fundamental but must be supplemented with threat detection and response. Alerting on unexpected behavior can identify the presence of a threat. Initially, teams may use out-of-the-box detection rules and then customize them over time. Checking configurations against Dockerfile best practices is another good starting point. Overly permissive access rights for containers are common in Kubernetes environments; elevated privileges should be granted only when required.
As your team develops more expertise with containers and Kubernetes, you can also implement zero-trust network security policies to block unexpected communication between pods, services and applications.
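The core idea of a zero-trust network policy is default deny: a connection is blocked unless it has been explicitly allowed. The Python below is a conceptual sketch of that rule only; the service names and ports are hypothetical, and a real cluster would express this in its network policy configuration rather than in application code.

```python
# Hypothetical allow list of (source, destination, port) flows.
ALLOWED_FLOWS = {
    ("frontend", "api", 8080),
    ("api", "database", 5432),
}


def is_allowed(source: str, destination: str, port: int) -> bool:
    """Default deny: only flows listed explicitly are permitted."""
    return (source, destination, port) in ALLOWED_FLOWS
```

Under this allow list the frontend can reach the API, but a direct connection from the frontend to the database is rejected even though both workloads are legitimate.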
Some DevOps teams handle triage of alerts, as they can quickly determine whether an alert reflects a configuration issue or an indicator of compromise. Teams often map detection rules to the MITRE ATT&CK framework, allowing their security teams to manage alerts consistently across their environment. For any team to respond effectively to a security alert, it is important to capture a detailed record of exactly what happened before, during and after the event. This visibility is both particularly challenging and particularly critical with containers, because containers have short lives. The 2021 Sysdig Container Security and Usage Report revealed that about half of containers live less than five minutes.
Complying with regulatory standards can be painful. Some standards were written with cloud computing in mind, like NIST SP 800-53 and SOC 2, and others were not. With the broader standards that were not, it is important to understand which controls are relevant to containers and the cloud.
As you consider your compliance program, be proactive—don’t wait until you are ready to roll into production. Start early by engaging your compliance team and helping them get up to speed. Your role is to provide evidence of compliance to pass an audit at any time using the documentation from your security platform.
The move to the cloud creates a tremendous opportunity to get security right and reduce risk. The fact that physical security and a chunk of operations are delegated to cloud vendors frees up security and development teams to focus on their narrower scope within the shared responsibility model. By taking a secure DevOps approach and inserting security into every stage of the application life cycle, your team can run cloud and container workloads confidently and securely.