
Best Practices for Cloud Incident Response

Cloud computing is now mainstream, with almost all organizations running at least some resources in the public cloud—whether software-as-a-service (SaaS), platform-as-a-service (PaaS) or infrastructure-as-a-service (IaaS). Security teams have been scrambling to adapt to cloud environments, and with the growing adoption of DevSecOps, they are working together with DevOps teams to secure cloud systems from the ground up.

As your organization discovers the best way to secure its cloud investments, you also need to develop an incident response strategy for the cloud. Even if your cloud security controls are perfect (and they aren’t), attacks will happen. Knowing what to do during an attack and preparing teams for incident response can be the difference between an incident that is quickly contained and resolved and a multimillion-dollar disaster.

What Is Incident Response?

Incident response enables organizations to make sure they are aware of security incidents and can respond in time to limit the damage to their systems. The objective is to block attacks and prevent similar attacks in the future.

The SANS Institute’s six-step incident response process provides a structured framework for handling security incidents. These steps are:

Prepare—establish security policies, carry out risk assessments, determine which assets are sensitive and establish an incident response team.
Identify—monitor your systems to detect anomalous activity, identify real security incidents and investigate the severity and type of threats.
Contain—conduct short-term containment procedures to stop the spread of the threat, followed by long-term containment, such as applying temporary fixes and rebuilding clean systems.
Eradicate—identify the root cause of the incident, remove malware and implement measures to prevent future attacks.
Recover—restore your production systems and apply measures for preventing further attacks. Test and monitor recovered systems.
Learn—within two weeks of the incident, perform a retrospective analysis, complete the incident documentation, evaluate the containment efforts and determine how you can improve the incident response process.
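
As a rough illustration of how the six steps can be tracked in tooling, the sketch below models them as ordered phases that an incident record moves through. The `Phase` and `Incident` names are hypothetical and not part of any SANS material; the sketch assumes preparation happens before any incident exists, so a new incident starts at the Identify phase.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The six steps listed above, in order."""
    PREPARE = 1
    IDENTIFY = 2
    CONTAIN = 3
    ERADICATE = 4
    RECOVER = 5
    LEARN = 6


class Incident:
    """Hypothetical incident record that only moves forward through the phases."""

    def __init__(self, title):
        self.title = title
        # Assumption: preparation is done ahead of time, so tracking starts at IDENTIFY.
        self.phase = Phase.IDENTIFY
        self.history = [Phase.IDENTIFY]

    def advance(self):
        """Move to the next phase, refusing to go past the final one."""
        if self.phase is Phase.LEARN:
            raise ValueError("incident is already in the final phase")
        self.phase = Phase(self.phase + 1)
        self.history.append(self.phase)
        return self.phase


incident = Incident("suspicious console login")
incident.advance()  # IDENTIFY -> CONTAIN
print(incident.phase.name)
```

Keeping the phase history on the record gives the retrospective in the Learn step a simple timeline to start from.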

How to Prepare Your Cloud for Incident Response

There are several ways you can prepare your incident response team and your cloud environment for more effective incident response:

Establish response goals—determine incident response objectives in consultation with stakeholders, legal advisors and organizational leaders. Common goals include containing and mitigating the problem, restoring affected resources, and preserving data for evidence and attribution.
Use the cloud to respond—ensure your cloud resources include the tools and resources you need to respond to an incident. For example, ensure you have robust cloud-based logging and monitoring systems, and set up cloud-based backup and disaster recovery so you can rapidly restore affected systems.
Determine your requirements—keep copies of logs, snapshots and other evidence in a centralized cloud account. Apply mechanisms to enforce retention policies. Use tags and metadata to maintain visibility and connect logs and cloud resources to organizational units, projects or corporate systems.
Use a redeployment mechanism—if a security anomaly is caused by a misconfiguration, you should be able to remediate it easily by redeploying the resource with the appropriate configuration. Ensure the response mechanisms can be executed multiple times if necessary.
Leverage automation—after identifying recurring problems and incidents, codify the response steps and automate as much as possible to build response mechanisms for common situations. For example, use auto scaling services on AWS or infrastructure-as-code (IaC) capabilities on Microsoft Azure. This is much easier in the cloud than it was in an on-premises data center. Reserve human response for unique, new or critical incidents.
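
The redeployment and automation points above can be sketched together as a small dispatcher. This is a minimal sketch under stated assumptions, not a real cloud integration: the finding type `bucket_misconfigured`, the `DESIRED_BUCKET_CONFIG` settings and the dictionary-based resource are all hypothetical stand-ins for whatever your provider's tooling exposes. The remediation is written to be idempotent, so running it several times leaves the resource in the same state.

```python
# Hypothetical desired configuration for a storage bucket.
DESIRED_BUCKET_CONFIG = {"public_access": False, "encryption": "enabled"}


def redeploy_bucket_config(resource):
    """Idempotent remediation: overwrite drifted settings with the desired ones.

    Returns the names of the settings that actually changed.
    """
    changed = [
        key for key, value in DESIRED_BUCKET_CONFIG.items()
        if resource.get(key) != value
    ]
    resource.update(DESIRED_BUCKET_CONFIG)
    return sorted(changed)


# Map recurring finding types to automated playbooks (names are illustrative).
PLAYBOOKS = {"bucket_misconfigured": redeploy_bucket_config}


def respond(finding_type, resource):
    """Run the automated playbook if one exists, otherwise hand off to a person."""
    playbook = PLAYBOOKS.get(finding_type)
    if playbook is None:
        return {"action": "escalate_to_human", "changed": []}
    return {"action": "automated", "changed": playbook(resource)}


bucket = {"name": "reports", "public_access": True, "encryption": "enabled"}
print(respond("bucket_misconfigured", bucket))  # fixes public_access
print(respond("bucket_misconfigured", bucket))  # second run changes nothing
print(respond("never_seen_before", {}))         # unknown finding goes to a human
```

Returning the list of changed settings makes each automated run easy to log, which helps when the same playbook fires repeatedly.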

Effective Incident Response in the Cloud

Use the following tips to improve your ability to respond to security incidents in a public cloud environment.

Shift Your Focus

Cloud environments require you to monitor different elements than you would in traditional on-premises environments. In the cloud, you should focus on applications, APIs and user roles. Consider how incident responders can operate successfully in a cloud environment, and what tasks they may need to perform. The incident response team must have proper access and visibility into your systems so they can detect, remediate and prevent attacks.
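
To make the focus on APIs and user roles concrete, here is a minimal sketch that scans audit events for sensitive API actions performed by roles that are not expected to perform them. The event fields (`role`, `action`), the action names and the allow-list are hypothetical; real audit logs from your provider will have their own schema.

```python
# Hypothetical set of API actions that deserve a closer look.
SENSITIVE_ACTIONS = {"DeleteLogs", "CreateAccessKey", "AttachAdminPolicy"}

# Hypothetical allow-list: roles that may legitimately perform sensitive actions.
ALLOWED_BY_ROLE = {"security-admin": {"CreateAccessKey", "AttachAdminPolicy"}}


def flag_events(events):
    """Return events where a role performed a sensitive action it is not allowed."""
    flagged = []
    for event in events:
        action = event["action"]
        allowed = ALLOWED_BY_ROLE.get(event["role"], set())
        if action in SENSITIVE_ACTIONS and action not in allowed:
            flagged.append(event)
    return flagged


events = [
    {"role": "security-admin", "action": "CreateAccessKey"},
    {"role": "ci-deployer", "action": "ListBuckets"},
    {"role": "ci-deployer", "action": "DeleteLogs"},
]
for event in flag_events(events):
    print(event["role"], event["action"])
```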

Integrate Alerting and Incident Management Tools

The security team must have direct access to supporting data to triage alerts and classify incidents. To this end, integrate your security alerting tools with the incident management and collaboration tools you already use, such as PagerDuty and Slack. This lets security alerts flow directly into your teams' existing tools and workflows, so responders won’t have to alternate between tools to see what is happening.

Build an audit trail to capture the response to each alert, which will provide visibility and accountability and help you refine your response processes. All actions taken in security tools must be visible in the relevant collaboration tool, so you can see who dismissed a specific alert and when, and what annotations they made.
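
One way to picture the audit trail described above is a small router that records every action on an alert and mirrors it to the collaboration tools. This is a sketch under assumptions: `AlertRouter`, `AuditEntry` and the message format are hypothetical, and the "sinks" are plain callables standing in for whatever delivery mechanism your tools provide, not actual PagerDuty or Slack API calls.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditEntry:
    alert_id: str
    actor: str
    action: str
    note: str
    at: datetime


@dataclass
class AlertRouter:
    """Records each action on an alert and forwards it to every sink."""
    sinks: list  # callables that deliver a message to a tool (placeholders here)
    audit_trail: list = field(default_factory=list)

    def record(self, alert_id, actor, action, note=""):
        entry = AuditEntry(alert_id, actor, action, note, datetime.now(timezone.utc))
        self.audit_trail.append(entry)
        message = f"[{alert_id}] {actor} {action}" + (f": {note}" if note else "")
        for sink in self.sinks:
            sink(message)
        return entry

    def history(self, alert_id):
        """Who did what to this alert, in order."""
        return [e for e in self.audit_trail if e.alert_id == alert_id]


router = AlertRouter(sinks=[print])  # print stands in for a chat or paging tool
router.record("A-1", "scanner", "raised", "bucket is publicly readable")
router.record("A-1", "dana", "dismissed", "test bucket, expected")
```

Because every action passes through `record`, the question of who dismissed an alert, when, and with what annotation can be answered from `history` alone.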

Work with Your Cloud Provider

Cloud providers usually have an incident response team, but you cannot assume the vendor will handle everything during an event. Be aware of the shared responsibility model, where cloud providers are responsible for securing the infrastructure, while the customer is responsible for data and workloads.

Make sure you understand the service agreement for your cloud provider and who is responsible for exactly which element of the response. Find out exactly what alerts you can expect from the vendor’s team and how it can support your own team. Having a clear relationship and establishing points of contact can save critical time during an incident.

Protect Your Logs

The major cloud vendors provide logging capabilities for their environments, including log files or operational metrics to provide insight into service operations. Logging services may be free or paid, ranging from basic access logs to complete audit and configuration logs. Most cloud logging services will allow you to store logs outside the cloud environment, for example on-premises, and it is critical to do so.

Logs are an essential resource for incident response investigations, and you must ensure they remain inaccessible to attackers. If logs are stored separately from the systems they describe, an attacker who compromises your cloud systems or services will not be able to modify or delete them. Protected logs can help you identify the attack timeline, the targeted systems and the attacker’s IP address, which provides a reliable starting point for investigations.
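
Storing logs out of reach is the main defense, but it can also help to make tampering detectable. The sketch below shows one common technique, a hash chain, in which each entry's hash covers the previous entry's hash. It is not something the cloud vendors' logging services are claimed to do here, and the function names and record fields are hypothetical. Note the assumption that matters: the chain only proves integrity if the most recent hash is kept somewhere the attacker cannot rewrite, since otherwise they could recompute the whole chain.

```python
import hashlib
import json

GENESIS = "0" * 64  # starting value for the first entry's "previous hash"


def _digest(prev_hash, record):
    """Hash the previous entry's hash together with this record."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append_entry(chain, record):
    """Append a record, linking it to the entry before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "hash": _digest(prev_hash, record)})


def verify_chain(chain):
    """Return the index of the first entry that fails verification, or None."""
    prev_hash = GENESIS
    for index, entry in enumerate(chain):
        if entry["hash"] != _digest(prev_hash, entry["record"]):
            return index
        prev_hash = entry["hash"]
    return None


log = []
append_entry(log, {"user": "alice", "action": "login", "ip": "198.51.100.7"})
append_entry(log, {"user": "alice", "action": "read", "ip": "198.51.100.7"})
print(verify_chain(log))           # None: chain is intact

log[0]["record"]["ip"] = "203.0.113.9"  # simulate an attacker editing an entry
print(verify_chain(log))           # 0: first entry no longer matches its hash
```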

Conduct Cyber Range Training

Organizations often rely on exercises to train or test their security and incident response capabilities. Today, cloud environments provide an opportunity to simulate your production network in a protected environment, allowing your security team to practice their response to real attacks on your network within a safe setting.

Tools such as AWS CloudFormation allow you to quickly design and deploy training networks that are identical to your actual network. You can keep costs down by limiting the duration of exercises. These exercises may be the best way to prepare your team to respond to attacks in the real world.
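
As a toy illustration of the idea, the sketch below builds a very small CloudFormation template in Python: one VPC and one subnet, both tagged with an expiry time to support time-limited exercises. The resource names, CIDR ranges and the `ExpiresAt` tag are assumptions made for this example. CloudFormation does not act on such a tag by itself, so a separate cleanup job would have to delete the stack. A real cyber range would more likely reuse the templates that define your production network.

```python
import json
from datetime import datetime, timedelta, timezone


def build_range_template(exercise_name, hours, now=None):
    """Return a minimal CloudFormation template (as a dict) for a training network."""
    now = now or datetime.now(timezone.utc)
    expires = (now + timedelta(hours=hours)).strftime("%Y-%m-%dT%H:%M:%SZ")
    tags = [
        {"Key": "Purpose", "Value": "cyber-range"},
        {"Key": "Exercise", "Value": exercise_name},
        # Hypothetical convention: a cleanup job deletes stacks past this time.
        {"Key": "ExpiresAt", "Value": expires},
    ]
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"Training network for {exercise_name}",
        "Resources": {
            "RangeVpc": {
                "Type": "AWS::EC2::VPC",
                "Properties": {"CidrBlock": "10.42.0.0/16", "Tags": tags},
            },
            "RangeSubnet": {
                "Type": "AWS::EC2::Subnet",
                "Properties": {
                    "VpcId": {"Ref": "RangeVpc"},
                    "CidrBlock": "10.42.1.0/24",
                    "Tags": tags,
                },
            },
        },
    }


template = build_range_template("ransomware-drill", hours=4)
print(json.dumps(template, indent=2))
```

The printed JSON can be saved to a file and reviewed before anything is deployed, which keeps the exercise environment under version control like any other infrastructure code.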

This article covered the basics of security incident response, how to prepare a cloud environment for incident response and how teams can react effectively when the inevitable attack occurs. In short, it’s important to:

Focus on what matters – In the cloud, this is APIs, applications, and identity and access management (IAM) systems.
Integrate alerting and incident management tools – Send security alerts directly into the tools and workflows your teams already use, and keep an audit trail of how each alert was handled.
Work with your cloud provider – You are not alone in the cloud, and teams need to understand exactly which part cloud providers will take in responding to an incident.
Protect your logs – If logs are exposed to tampering, you will have no way to detect, investigate and respond to attacks. Protect them at all costs.
Conduct cyber range training – You’ll never really know what it’s like to respond to an incident until one actually happens. Instead of waiting for a real attack, run a cyber range exercise or security drill and see how everyone works together in an attack scenario.

Now, you’ll be better prepared as you move towards a cohesive cloud security strategy for developers, operations, and security teams.

Gilad David Maayan

Gilad David Maayan is a technology writer who has worked with over 150 technology companies including SAP, Samsung NEXT, NetApp and Imperva, producing technical and thought leadership content that elucidates technical solutions for developers and IT leadership.
