Top 5 AWS Security Mistakes: Leaky S3 Buckets

As we get ready to discuss our list of the Top 5 AWS Security Mistakes in the upcoming DevOps.com webinar, we wanted to provide a preview of the type and depth of information we’ll be covering. Since the most talked-about, and likely the most vulnerable, aspect of AWS security is the leaky S3 bucket, it’s a given we’ll tackle S3 first.

In this article, we’ll dig into S3 buckets, explain how they become vulnerable, cover eight different ways they can inadvertently become open to the public, and show how to find and protect any public-facing S3 data.

The S3 Bucket

S3 is one of the oldest services in AWS, so old that parts of it still support XML-based policies instead of the JSON you see everywhere else. S3 also has a lot of features that aren’t as commonly known or used anymore, such as allowing someone else to put objects in your bucket yet still maintain ownership of them.

Think of an S3 bucket as a server that can have subdirectories and files. Because each S3 bucket can have its own root URL (e.g. “dops.s3.amazonaws.com”), life cycle and settings, we tend to think of buckets as servers, not directories. Directories in S3 let you organize objects, but they don’t have any distinct settings of their own, other than a name.

AWS has a wide range of use cases it needs to support, which is why S3 has so many access mechanisms. Sharing publicly, within an account, between accounts, hosting websites and so on all create complexity. The good news is S3 always defaults to secure and private. The bad news is that AWS allows human beings to use it (and therefore weaken security), and it can be confusing to manage. For example, S3 supports both read and write permissions (and list, for buckets). It’s possible to have no public read but public write permissions, which could lead to people placing bad files in your directories.

8 Ways AWS Data Becomes Public

There are at least eight different ways an AWS S3 bucket can inadvertently become open to the public and be exposed to a data breach.

Bucket ACLs (Access Control Lists): This is an XML document that defines the first layer of access. By default, only the account owner has access, but this can be opened up to other AWS accounts or the public at large. Amazon recommends against using these at all, but it’s also the easiest way to make something public quickly, so we see it all the time.
Bucket Policies: These are super-flexible JSON policies that allow you to set things such as IP-based and other conditional permissions on a bucket. While bucket policies should be the primary way to manage public access, ACLs are the first tab and take only about three clicks to make something public. The flexibility of policies also leads to mistakes, such as opening up to a wider range of IP addresses than intended, or using a negative to block access to some IPs but inadvertently granting access to everyone else.
IAM Policies: These are the normal IAM permissions you use to control access throughout your AWS account. You can’t make a bucket public with them, since they only govern AWS users, but you can open up access to the bucket by authorizing another AWS service to access it, and then that service exposes the content.
Object ACLs: These are the primary object-level controls. They work just like bucket ACLs and use XML to allow access from other accounts or the public. Objects do not necessarily inherit bucket ACLs. You can totally make an object public even if the bucket is private.
Explicit IAM or Bucket Policy Statements: IAM policies and bucket policies can have explicit statements referencing the object, which will override the object ACL (if you own the object), since they are evaluated first.
Pre-Signed URLs: These object-level policies must be created using code (not the console) and provide temporary access for anyone with the URL. They’re used, for example, to share a file for a few minutes or an hour so that someone using your app can download a media file.
CloudFront Origin Access Identity: CloudFront is the AWS content delivery network and can serve as the front end to S3. You can create something called a CloudFront origin access identity and use it to write an IAM policy that allows CloudFront to access the S3 content. If the CloudFront distribution allows public access, it can access the S3 content, and that won’t show in a bucket policy or ACL.
Cross Origin Resource Sharing (CORS) Policies: CORS is required if you use S3 in a website and don’t want the browser to break due to same-origin security restrictions. CORS in S3 won’t override an ACL or bucket policy, but it could mask public access in limited situations where the data is exposed in the web code through the authorized site.
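
To make the two most common cases above concrete, here is a minimal Python sketch that inspects a bucket ACL and a bucket policy for public exposure. It assumes the ACL is a dictionary shaped like the response of boto3’s get_bucket_acl call and the policy is a JSON string like the one returned by get_bucket_policy. The function names, the sample ACL and the sample policy (including “example-bucket” and the IP range) are hypothetical and exist only for illustration. The policy check does not evaluate conditions; it only flags wildcard Allow statements for a human to review.

```python
import json

# Group URIs that S3 ACLs use for "anyone" and "any authenticated AWS user".
PUBLIC_GROUPS = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}


def public_acl_permissions(acl):
    """Return the sorted permissions that an ACL grants to a public group."""
    perms = set()
    for grant in acl.get("Grants", []):
        uri = grant.get("Grantee", {}).get("URI")
        if uri in PUBLIC_GROUPS and "Permission" in grant:
            perms.add(grant["Permission"])
    return sorted(perms)


def wildcard_allow_statements(policy_json):
    """Return Allow statements whose principal is the wildcard "*".

    Conditions are NOT evaluated; flagged statements need a manual review.
    """
    statements = json.loads(policy_json).get("Statement", [])
    if isinstance(statements, dict):  # a lone statement may not be in a list
        statements = [statements]
    flagged = []
    for stmt in statements:
        principal = stmt.get("Principal")
        aws = principal.get("AWS") if isinstance(principal, dict) else None
        wildcard = principal == "*" or aws == "*" or (
            isinstance(aws, list) and "*" in aws
        )
        if stmt.get("Effect") == "Allow" and wildcard:
            flagged.append(stmt)
    return flagged


# Hypothetical ACL: no public read, but public write.
sample_acl = {
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "owner-id"},
         "Permission": "FULL_CONTROL"},
        {"Grantee": {"Type": "Group",
                     "URI": "http://acs.amazonaws.com/groups/global/AllUsers"},
         "Permission": "WRITE"},
    ]
}

# Hypothetical policy that meant to block one IP range with a negative
# condition, and so allows every other address on the internet.
sample_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-bucket/*",
        "Condition": {"NotIpAddress": {"aws:SourceIp": "203.0.113.0/24"}},
    }],
})

print(public_acl_permissions(sample_acl))
print(len(wildcard_allow_statements(sample_policy)))
```

The sample policy is the “negative” mistake described under Bucket Policies: an Allow combined with NotIpAddress grants access to every address outside the listed range, which is almost never the intent.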

All put together, AWS evaluates IAM permissions first. Within those, the only one to make a bucket public over the web is the CloudFront Origin Access Identity. AWS then turns to the bucket ACLs and policies for any explicit deny statements. Then it looks at the object ACLs for public access. If you make an object public and the bucket policy doesn’t have an explicit deny, it is still public, but otherwise a good bucket policy will block everything. Besides, a lot of the problems are open bucket ACLs, not object ACLs.
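
The interaction between a bucket policy and a public object ACL can be sketched as a toy model. This is a deliberate simplification and not AWS’s actual evaluation engine: it assumes you have already worked out which bucket policy statements match an anonymous request, and it only captures the rule that an explicit deny wins over any grant. The function and parameter names are hypothetical.

```python
def anonymous_can_read(matching_policy_effects, object_acl_is_public):
    """Toy model of whether an anonymous request can read an object.

    matching_policy_effects: "Allow"/"Deny" effects of the bucket policy
        statements that match the request (assumed to be precomputed).
    object_acl_is_public: True if the object ACL grants READ to everyone.
    """
    # An explicit deny in the bucket policy overrides everything else.
    if "Deny" in matching_policy_effects:
        return False
    # Otherwise a policy allow or a public object ACL is enough.
    return "Allow" in matching_policy_effects or object_acl_is_public


# A public object in a bucket with no matching policy statement stays public.
print(anonymous_can_read([], True))
# The same object behind an explicit deny is blocked.
print(anonymous_can_read(["Deny"], True))
```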

4 Ways to Find Your Public S3 Buckets and Protect Your Data

What’s the best way to find public buckets? It’s relatively easy.

Log into the console, click on S3 and look for the Public tag. AWS uses some advanced back-end math to evaluate all the bucket policies to figure out if something is public, which catches most of the fringe cases, but it does not show if the bucket is private and objects in it are public.
Check CloudFront to see if you use any origin access identities. It’s easier to look here than in all your bucket policies (usually). This saves you from looking through object ACLs as well.
Check any CORS policies, if you use them. This is the least common situation and I even hesitated to complicate this post by mentioning them.
The bad news … there is no easy way to find public objects. Even finding them programmatically can be difficult if you have a large number. Also, changing a bucket ACL doesn’t cascade to the object ACLs, so you need to run through and fix them one by one. However, if the bucket is locked down, the attacker would need to know the full URL/name of the object to find it. It’s not perfect, but it’s better than nothing.
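
For the public-object problem, one possible approach is to walk a bucket and check every object ACL. The sketch below assumes a client that behaves like boto3.client("s3"), using its list_objects_v2 paginator and get_object_acl call. The function name, bucket name and the FakeS3 stand-in are hypothetical; the stand-in is there only so the sketch runs without AWS credentials. On a large bucket this makes one API call per object, so expect it to be slow.

```python
# Group URIs that S3 ACLs use for "anyone" and "any authenticated AWS user".
PUBLIC_GROUPS = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}


def find_public_objects(s3, bucket):
    """Yield (key, permission) for each object ACL grant to a public group.

    `s3` is assumed to behave like boto3.client("s3").
    """
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket):
        for obj in page.get("Contents", []):
            acl = s3.get_object_acl(Bucket=bucket, Key=obj["Key"])
            for grant in acl.get("Grants", []):
                if grant.get("Grantee", {}).get("URI") in PUBLIC_GROUPS:
                    yield obj["Key"], grant["Permission"]


class FakeS3:
    """Stand-in for boto3.client("s3") so the sketch runs without AWS access."""

    acls = {
        "private.txt": [],
        "leak.csv": [{
            "Grantee": {"Type": "Group",
                        "URI": "http://acs.amazonaws.com/groups/global/AllUsers"},
            "Permission": "READ",
        }],
    }

    def get_paginator(self, operation_name):
        return self  # the fake acts as its own paginator

    def paginate(self, Bucket):
        yield {"Contents": [{"Key": key} for key in self.acls]}

    def get_object_acl(self, Bucket, Key):
        return {"Grants": self.acls[Key]}


print(list(find_public_objects(FakeS3(), "example-bucket")))
```

With a real client, each flagged key could then be reset one at a time, for example with put_object_acl(Bucket=..., Key=..., ACL="private"), after confirming the object is not meant to be public.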

That’s it. Although, if you know of any other fringe cases, please don’t hesitate to let us know. While conducting assessments, we find well over 90% of the problems are just basic bucket ACL and policy mistakes. Object ACLs are also an issue, but slightly lower risk since you need the exact URL to find them if your bucket is otherwise secured properly.

If you are curious about the other 4 Top AWS Security Mistakes, sign up for the Aug 14 webinar.

— Mike Rothman