Amazon S3, one of the leading cloud storage solutions, is used by companies all over the world to power their IT operations. Over four years, UpGuard has detected thousands of S3-related data breaches caused by the incorrect configuration of S3 security settings. Jeff Barr, Chief Evangelist for Amazon Web Services, recently announced public access settings for S3 buckets, a new feature designed to help AWS customers stop the epidemic of data breaches caused by incorrect S3 security settings.
AWS account owners can now select between four new options to set a default access setting for their account's S3 buckets. The settings are global, meaning they override any new or existing bucket-level ACLs (access control lists) and policies. The new settings can also be applied retrospectively to secure existing S3 buckets.
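To make the four options concrete, here is a minimal sketch of how they might be switched on account-wide from Python. To our knowledge the four options correspond to the four boolean flags of the S3 public access block configuration; the helper names `build_public_access_block` and `apply_to_account` are our own hypothetical names, and the second function assumes boto3 is installed and credentials are configured.

```python
# The four flag names are those of the S3 PublicAccessBlockConfiguration
# structure. The helper functions themselves are hypothetical.
FLAGS = (
    "BlockPublicAcls",        # reject new public ACLs
    "IgnorePublicAcls",       # ignore public ACLs that already exist
    "BlockPublicPolicy",      # reject new public bucket policies
    "RestrictPublicBuckets",  # restrict access to buckets with public policies
)


def build_public_access_block(**overrides):
    """Return a configuration with every flag on unless explicitly overridden."""
    unknown = set(overrides) - set(FLAGS)
    if unknown:
        raise ValueError(f"unknown flag(s): {sorted(unknown)}")
    config = {flag: True for flag in FLAGS}
    config.update({name: bool(value) for name, value in overrides.items()})
    return config


def apply_to_account(account_id, config):
    """Apply the configuration account-wide (assumes boto3 and credentials)."""
    import boto3  # imported lazily so the rest of the sketch runs without it

    boto3.client("s3control").put_public_access_block(
        AccountId=account_id,
        PublicAccessBlockConfiguration=config,
    )


if __name__ == "__main__":
    print(build_public_access_block())
```

Defaulting every flag to on and requiring an explicit override to turn one off mirrors the intent of the feature: public access becomes the exception that someone has to ask for.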
The ongoing S3 security problem has resulted in tens of millions of breached records. So this is welcome news and a step in the right direction for AWS, but we don't think it's enough.
The S3 Security Problem
Security researchers, including UpGuard, are constantly discovering open, unprotected S3 buckets containing sensitive data. For perspective, UpGuard's researchers have disclosed the following data breaches that were directly attributed to leaky S3 buckets:
- A data exposure containing 540m records from multiple third-party developed Facebook apps.
- Leak of GoDaddy's trade secrets and detailed infrastructure information.
- Pocket iNet, who exposed over 73 GB of data, including plain text passwords and AWS secret keys for Pocket iNet employees.
- A breach by LocalBlox, a private intelligence platform that exposed 48 million records of detailed personal information on tens of millions of individuals, gathered and scraped from multiple sources.
- The exposure of 14 million customer records by telecommunications carrier Verizon, an example of how third-party risk contributes to this problem.
- Viacom, who left a trove of system credentials and critical application data exposed in an open S3 bucket.
- The breach of a Chicago voters database, which leaked 1.8m personal records around the time of the 2016 general election.
- The Tea Party Patriots Citizens Fund, who leaked the personal details of 527,000 individuals in a misconfigured S3 bucket.
We've been uncovering S3 breaches for over four years, and the problem doesn't seem to be going away.
Who is responsible for the S3 security problem?
It's tempting to blame you, the users, for being too lazy or stupid to use S3 properly. We've all read about "solutions" to the S3 security problem, including (but not limited to):
- Monitoring your S3 buckets using products like AWS Config or UpGuard Core
- Building your own S3 monitoring solution using AWS CloudTrail and Lambda
- Command-line testing with tools like S3 Inspector
These solutions do work, and we recommend using them to monitor your S3 security posture. To tell you the truth though, it feels a bit unfair. Why should S3 users be forced to spend more money on alternative solutions to resolve a fundamental issue?
Our opinion is that the security problem with S3 is one of product design.
Yes, AWS ensures that S3 buckets are private by default. Yet we continue to see thousands of open buckets, and regular breaches.
Our view is that AWS has made it far too easy for S3 users to misconfigure buckets to make them totally publicly accessible over the Internet. There are two key product features we've highlighted below that can easily trip you up if you're not careful.
#1: Any Authenticated Users
The concept of "any authenticated AWS users" is a poorly understood feature of S3. This level of security allows anybody with an AWS account to see inside your buckets.
Not just anyone at your company. Anyone in the world with an AWS account, which takes 5 minutes to set up.
It’s as if your internet banking credentials let you log in to someone else’s bank account. This unusual security model continues to cause a significant number of breaches and in our view is a crucial problem with the S3 security model.
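One way to spot this setting is to look for grants to the two predefined S3 groups in a bucket's ACL. The sketch below assumes the ACL is a dictionary in the shape boto3's `get_bucket_acl` returns (a `Grants` list of grantee and permission pairs). The function name `risky_grants` and the sample data are hypothetical.

```python
# Predefined S3 group URIs: "everyone" and "anyone with an AWS account".
ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"
AUTHENTICATED_USERS = "http://acs.amazonaws.com/groups/global/AuthenticatedUsers"


def risky_grants(acl):
    """Return (group, permission) pairs granted to the world or to any AWS account."""
    labels = {ALL_USERS: "AllUsers", AUTHENTICATED_USERS: "AuthenticatedUsers"}
    found = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if grantee.get("Type") == "Group" and grantee.get("URI") in labels:
            found.append((labels[grantee["URI"]], grant.get("Permission")))
    return found


# Hypothetical ACL: the owner has full control, and "any authenticated
# AWS user" has been granted read access.
sample_acl = {
    "Grants": [
        {
            "Grantee": {"Type": "CanonicalUser", "ID": "example-owner-id"},
            "Permission": "FULL_CONTROL",
        },
        {
            "Grantee": {"Type": "Group", "URI": AUTHENTICATED_USERS},
            "Permission": "READ",
        },
    ]
}

print(risky_grants(sample_acl))  # [('AuthenticatedUsers', 'READ')]
```

Any result that names `AuthenticatedUsers` deserves the same urgency as one that names `AllUsers`, since both are effectively public.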
#2: Inconsistent ACLs and Bucket Policies
Another easily misconfigured feature of S3’s security model is the interplay between ACLs and policies governing buckets and the objects inside them.
Some of the most catastrophic breaches we've found were caused by people misunderstanding how these settings work together. You can lock down the ACLs on an S3 bucket, but if the bucket policy is misconfigured, you can still leave your data wide open to the Internet. Unhelpfully, bucket policies are relatively hidden away, and written using fairly obscure JSON syntax.
But understanding them is super important.
Otherwise you might look at your ACL that says “this bucket is not readable”, but the objects inside could still be accessible and readable by virtue of different bucket policies.
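To illustrate how a clean ACL can coexist with an open policy, the sketch below scans a bucket policy document for `Allow` statements whose principal is a wildcard. It is a rough heuristic under stated assumptions, not a full policy evaluator: it treats any statement with a `Condition` as not public, and it ignores `Deny` statements. The function names and the bucket name `example-bucket` are hypothetical.

```python
import json


def is_wildcard_principal(principal):
    """True if the principal is "*" or {"AWS": "*"} (or a list containing "*")."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        aws = principal.get("AWS")
        values = aws if isinstance(aws, list) else [aws]
        return "*" in values
    return False


def public_allow_statements(policy_json):
    """Return unconditional Allow statements that apply to every principal."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be in a list
        statements = [statements]
    return [
        s
        for s in statements
        if s.get("Effect") == "Allow"
        and is_wildcard_principal(s.get("Principal"))
        and "Condition" not in s  # simplification: conditions are not evaluated
    ]


# Hypothetical policy: the bucket ACL could be fully locked down, yet this
# statement would still let anyone read every object in the bucket.
sample_policy = json.dumps(
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": "arn:aws:s3:::example-bucket/*",
            }
        ],
    }
)

for statement in public_allow_statements(sample_policy):
    print("Public statement:", statement["Action"], "on", statement["Resource"])
```

Checking both the ACL and the policy for the same bucket is the point: neither one alone tells you whether the data is exposed.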
What has AWS done to secure S3?
So if you agree that features of the S3 security model are at least partially responsible for leaky buckets, what has AWS been doing to resolve the problem?
Through 2017, AWS announced multiple changes that promised to help:
- Providing a "public" flag for open buckets, and an email outreach campaign to owners of those buckets.
- The launch of Amazon Macie, which "is a security service that uses machine learning to automatically discover, classify, and protect sensitive data in AWS".
After the launch of these features, we saw many exposed buckets disappear. But we also saw many more buckets with sensitive information persist, and new ones created since then with sensitive, publicly accessible data.
Why aren't we seeing more decisive changes?
S3 has been around since 2006. It is one of the first three AWS products. A victim of its own success, Amazon can only gradually make changes to S3 without breaking existing applications for tens of thousands of customers. Moving to a "private by default" security model for S3 too quickly would hurt AWS and its existing customers.
Nonetheless, over time, we believe AWS should split S3 into two, distinct products:
- Amazon Web Hosting - designed to host public websites, this storage solution would always be public.
- Amazon Private Storage - designed to hold any data you wouldn't want posted on the Internet, this storage is always private and cannot be accessed directly over the Internet.
Separating the products would clearly highlight the differences between public and private storage, and help you prevent the easy mistake of exposing your data through S3.
If you really had to expose data from private storage, you'd do it through an API wrapped with sensible security controls.
And what of the new S3 security features?
Amazon’s new S3 security features will likely have the same effect as their previous efforts: they will secure more buckets, but not all. After the "public" flag and the email campaign to owners of open buckets launched in November 2017, many exposed buckets disappeared. Yet many others holding sensitive information persisted, and new ones with sensitive, publicly accessible data have been created since.
Why? Because as long as it is possible to misconfigure a system, people will do so. Adding new capabilities that make it easy to configure S3 storage to be private is not the same as removing the possibility of configuring it to be public.
As long as S3 buckets can be configured for public access, there will be data exposures through S3 buckets. Addressing this requires fundamental changes that we are yet to see.