Identity and Access Misstep: How an Amazon Engineer Exposed Credentials and More

UpGuard can now disclose that a GitHub repository containing data from an Amazon Web Services engineer has been secured from public access. The repository held personal identity documents and system credentials, including passwords, AWS key pairs, and private keys. The data was committed to a public repository on the morning of 13 January 2020. It was detected by UpGuard within half an hour, reported to AWS Security, and secured that same day.

Discovery

On 13 January at approximately 11am, the UpGuard Data Leaks detection engine identified a GitHub repository with potentially sensitive data that had been uploaded half an hour earlier. Shortly after noon an analyst began reviewing the contents of the repository. After assessing the contents to establish the scope of the data, its degree of sensitivity, and the identity of the owner, the analyst notified AWS Security at 1:18pm. By 4pm, the repository was no longer publicly accessible, and at 4:45pm AWS Security replied to the initial notification email saying that they had taken action.

System Data and Credentials

When downloaded from GitHub as a compressed .zip file, the storage size of the repository totaled 954 MB. The repository was structured as general storage rather than application code, with many files in the top level directory and no clear convention for the subdirectories. Consistent with the engineer’s role, there were many AWS resource templates and log files, some of which included enough mentions of hostnames to identify likely AWS customers being assisted by the engineer. Timestamps in the logs indicate they were generated throughout the second half of 2019.

Of greater concern, however, were the many credentials found in the repository. Several documents contained access keys for various cloud services. There were multiple AWS key pairs, including one named “rootkey.csv,” suggesting it provided root access to the user’s AWS account. Other files contained collections of auth tokens and API keys for third-party providers. One such file for an insurance company included keys for messaging and email providers. The risk of committing these credentials is mitigated over time by GitHub’s token scanning feature, which identifies tokens matching certain patterns, but how quickly the tokens are revoked is unknown. What we do know is that third parties can detect such credentials on GitHub within minutes.
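To illustrate what pattern-based token detection can look like, here is a minimal Python sketch. It is not GitHub's or UpGuard's implementation. It assumes only the publicly documented shape of an AWS access key ID (a four-letter prefix such as "AKIA" followed by 16 uppercase letters or digits), and the function name and sample text are hypothetical. The key shown is the placeholder value used in AWS documentation, not a real credential.

```python
import re
from typing import List

# Assumption: AWS access key IDs are 20 characters long, starting with a
# prefix such as "AKIA" (long-term keys) or "ASIA" (temporary keys),
# followed by 16 uppercase letters or digits.
AWS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[A-Z0-9]{16}\b")


def find_aws_key_ids(text: str) -> List[str]:
    """Return every substring of `text` that looks like an AWS access key ID."""
    return AWS_KEY_ID.findall(text)


# Hypothetical file contents using the AWS documentation placeholder key.
sample = "AWSAccessKeyId=AKIAIOSFODNN7EXAMPLE\nregion=us-east-1"
print(find_aws_key_ids(sample))
```

Because the format is fixed, anyone watching public commits can run a check like this, which is consistent with the observation that such credentials can be detected within minutes.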

Other credential types that would not be revoked by token scanning included private keys and passwords. Unlike AWS key pairs or other credentials subject to GitHub token scanning, these cannot be deterministically mapped to an issuer for automatic revocation. While some of the private keys were clearly labeled as “mock” or “test,” others were not, and included words like “kube,” “admin,” and “cloud” that could indicate association with more privileged systems. The passwords were associated with databases hosted in AWS and mail servers. UpGuard never attempts to use credentials, even when stored on the public internet, and cannot determine what data they may have been able to access.
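The contrast with private keys can be sketched the same way. A PEM header shows that a file contains a private key, but nothing in it identifies an issuer who could revoke it, so the most a scanner can do is triage. The Python below is an illustrative assumption, not UpGuard's method: the function name, the result labels, and the use of the file name as the "label" are all hypothetical, and the word lists simply reuse the examples mentioned above.

```python
import re

# Matches PEM headers such as "-----BEGIN PRIVATE KEY-----",
# "-----BEGIN RSA PRIVATE KEY-----", or "-----BEGIN OPENSSH PRIVATE KEY-----".
PEM_PRIVATE_KEY = re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----")

# Hypothetical word lists taken from the examples in the text above.
LOW_RISK_WORDS = ("mock", "test")
HIGH_RISK_WORDS = ("kube", "admin", "cloud")


def triage_key_file(filename: str, contents: str) -> str:
    """Classify a file by whether it holds a private key and how it is named."""
    if not PEM_PRIVATE_KEY.search(contents):
        return "no-private-key"
    name = filename.lower()
    # Low-risk words are checked first, so "test-admin.pem" is "likely-test".
    if any(word in name for word in LOW_RISK_WORDS):
        return "likely-test"
    if any(word in name for word in HIGH_RISK_WORDS):
        return "review-urgently"
    return "review"


print(triage_key_file("kube-admin.pem", "-----BEGIN RSA PRIVATE KEY-----\n..."))
```

A name-based label is only a hint. Establishing what a key actually unlocks would require using it, which is exactly the step that is never taken here.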

Attribution

In addition to data related to computer systems, such as credentials, logs, and code, the repository also contained assorted documents that established the identity of the owner and their relationship to AWS. These documents included bank statements, correspondence with AWS customers, and identity documents including a driver’s license. Multiple documents included the owner’s full name. A LinkedIn profile matching the exact full name identified one person who listed AWS as their employer in a role that matched the kinds of data found in the repository. Other documents in the repository included training for AWS personnel and documents marked as “Amazon Confidential.” Based on this evidence, UpGuard is confident the data originated from an AWS engineer.

Conclusion

Amazon Web Services is the largest provider of public cloud services, laying claim to about half the market share. In 2019, a former Amazon employee allegedly stole over a hundred million credit applications from Capital One, illustrating the scale of potential data loss associated with insider threats at such a large and central data processor. In this case, there is no evidence that the user acted maliciously or that any personal data for end users was affected, in part because the exposure was detected by UpGuard and remediated by AWS so quickly. Rather, this case illustrates the value of rapid data leak detection in preventing small accidents from becoming larger incidents.



