Data Warehouse: How a Vendor for Half the Fortune 100 Exposed a Terabyte of Backups

The UpGuard Data Breach Research team can now disclose that a set of cloud storage buckets used by data management company Attunity has been secured against any future malicious action. Attunity, recently acquired by business intelligence platform Qlik, provides solutions for data integration. An UpGuard researcher discovered three publicly accessible Amazon S3 buckets related to Attunity. Of those, one contained a large collection of internal business documents. The total size of the exposed data is uncertain, but the researcher downloaded a sample of about a terabyte, including 750 gigabytes of compressed email backups. Backups of employees’ OneDrive accounts were also present and spanned the wide range of information that employees need to perform their jobs: email correspondence, system passwords, sales and marketing contact information, project specifications, and more.

The Discovery

On May 13th, 2019 an UpGuard researcher discovered publicly accessible Amazon S3 buckets named “attunity-it,” “attunity-patch” and “attunity-support.” The oldest files in “attunity-it,” where the bulk of the sensitive data was stored, were uploaded in September of 2014, though that does not necessarily mean they were publicly available at that time. The most recent files had been uploaded just days prior to the researcher discovering the bucket. The researcher notified Attunity on May 16, 2019. As a result of time zone complications, and due to Attunity having been recently acquired by Qlik, the researcher ultimately wound up speaking on the phone with Qlik support. By the next day, public access to the buckets had been removed.

The Significance

In previous cases, we have discussed the complexity of fully analyzing and describing data stores that contain copies of users’ workspaces and mailboxes. Work lives are spread across countless files, each one with some importance, but difficult to reduce to any single metric of significance. While various tools can help search across large data stores, gaining those quantitative measures comes at the expense of analyzing the qualitative nature of the data. There may be a large number of email addresses, for example, but whether they are enterprise customers or merely marketing targets changes the significance of their inclusion. A few key examples from the Attunity data set can illustrate the kinds of data that users have access to and which can be exposed by misconfigured storage of those users’ file collections, as in this exposure of Attunity data.

Customer Data

According to Attunity’s website, more than two thousand enterprises and half the Fortune 100 use Attunity. A client list found in the repository named a number of companies commensurate with that description. Every business must have customers, and commercial relations entail some exchange of information, so many data exposures involve third-party data belonging to customers or other involved entities. In this case, Attunity’s business in cloud migration and data integration also involves supplying and managing the software that processes customer data. Exhaustively documenting the files associated with each of thousands of companies is neither feasible nor necessary for the research team’s purpose of raising awareness of the risk of data leaks. As in past reports, a few examples can highlight the kinds of information that can be exposed through misconfigured storage, and were exposed in the case of Attunity.

netflix-attunity — *Netflix database authentication strings.*

td-bank-attunity — *TD Bank software upgrade invoice.*

ford-attunity — *Ford project preparation slide.*

As with documents pertaining to the responsible entity, third parties are at risk of data leaks for business documents, credentials, communications, and architecture descriptions.

System Credentials

Among the classes of data most obviously significant for an information security program are credentials for systems that could feasibly allow further compromise of the integrity, confidentiality, or availability of data. UpGuard researchers do not attempt to use credentials, and so cannot report on what access these could have provided, but the exposure of credentials certainly removes one layer of protection for accessing those systems. If they are administrative credentials, the exposure level would be high.

System credentials can be found in a number of places in the Attunity data set and serve as a useful reminder of how that information might be stored in many places across an organization’s digital assets. Credentials such as private keys were stored, and exposed, in directories for configuring those types of systems.

But system configuration directories are not the only place credentials exist in an organization. Users may save and transmit their own passwords across communication channels like email that leave a digital paper trail and can be exposed when backups of those systems are stored insecurely.

In this example, a data breach prompted the user to reset the password for the Attunity corporate Twitter account, which is a good thing to do. But because the password was shared in plaintext via email, a copy of that message was ultimately also stored in the public S3 bucket.

Likewise, IT support often responds to questions from personnel who cannot access their accounts. Having multiple layers of security is good, but makes it more difficult for personnel who must authenticate in multiple ways. Supporting those personnel requires some communication about credentials, and exposing those communications in turn creates a way through that layer of security.

Even code and system credentials can be leaked when developer computers are backed up and stored insecurely. This example shows information from a local git history that may be protected at other points in the software development lifecycle, but exposed due to being saved in an open S3 bucket.
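The point that credentials surface in configuration directories, email, and developer backups alike can be made concrete with a small scan. The following is a minimal sketch, not the method UpGuard used: the pattern set, the `PATTERNS` name, and the `find_secrets` helper are all illustrative assumptions, and real secret scanners rely on much larger rule sets.

```python
import re

# Illustrative patterns only (an assumption, not an exhaustive rule set).
PATTERNS = {
    # PEM headers such as "-----BEGIN RSA PRIVATE KEY-----"
    "private_key": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    # AWS access key IDs: a 4-letter prefix followed by 16 uppercase
    # letters or digits
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    # Plaintext "password: value" or "password=value" in mail or config text
    "password_assignment": re.compile(r"(?i)\bpassword\s*[:=]\s*\S+"),
}


def find_secrets(text):
    """Return a sorted list of (line_number, pattern_name) for each match."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return sorted(hits)
```

Running a check like this over backups before they are written to shared storage would flag the kinds of files described above, though it only reports where a match occurs and says nothing about whether the credential is still valid.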

System Information

Credentials are like a key, and using them requires knowing what lock they fit. For that, information about systems and their architecture provides the context. Such documents are necessary for planning and implementing technology of any complexity, and particularly so when communicating about requirements with customers for large scale data processing. As a data integrator, Attunity naturally appears to produce documents describing how they will process customer data.

Documentation also covers Attunity’s own systems, like this spreadsheet named “Production VLAN.”

Personal Information

Even businesses that do not have a consumer user base must store personal information about their own employees to perform core human resources tasks like verifying identities and running payroll. Because data shared through digital communication (for example, emailing your employer your home address) is replicated rather than actually transferred like a physical object, there are many points at which it persists and can be stored insecurely. Furthermore, data sent to the department that uses it (like human resources) is on systems that are maintained by an information technology group, and may be replicated again for disaster recovery or other long-term storage.

In the case of Attunity, the exposure included spreadsheets with employee data. The example below had 354 rows and included columns for ID, Employee, Actual / Forecast/Commit, Benefit Code, G/l account, Entity, Department, Location, Operation, Role, Active, Full Name, First name, Last name, Employee ID, Payroll ID, Date of hire, Job title, Direct manager, %, Local Currency, Salary 2015, Salary 2016, Company car value /Allowance, On target commission, Pro rated commission 2016, On target bonus, vacation days, Options Grant, RSUs Grant, Prior Notice, Recruitment fee, License Quota 2016, Key employee, Date of birth, Senior management, Zviran Code, OB VAC 1#1#15, Salary 2014, Date of termination, Travel budget 2016, updated salary 2016, Recruitment booked, and Attachments.

An additional risk is that the employee ID numbers tied to US Attunity employees follow the same numbering scheme as social security numbers, which leads us to believe they may be one and the same. The Attunity Employee IDs in this spreadsheet for US employees are nine digits, the same length as SSNs. Using a US government SSN validation site, we could confirm that these numbers are valid SSNs issued to someone, and that they were issued at approximately the same time as these individuals’ dates of birth (which we could cross-reference with the data in the Attunity spreadsheet). The US government site does not return the name of the person with the SSN for obvious security reasons, and so we cannot absolutely verify that these ID numbers are also the employees’ social security numbers.
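The nine-digit observation above can be tested structurally before any lookup. The following is a minimal sketch under stated assumptions: it applies only the publicly documented format rules for SSNs (area numbers 000, 666, and 900 to 999 are never assigned, nor are group 00 or serial 0000) and cannot confirm that a number was actually issued, which is what the government validation site was used for. The function name `is_plausible_ssn` is hypothetical.

```python
import re


def is_plausible_ssn(value):
    """Return True if `value` has the structure of an assignable SSN.

    This is a format check only; it does not confirm issuance.
    """
    # Accept "123-45-6789", "123 45 6789", or "123456789".
    digits = re.sub(r"[\s-]", "", str(value))
    if not re.fullmatch(r"[0-9]{9}", digits):
        return False
    area, group, serial = digits[:3], digits[3:5], digits[5:]
    # Area numbers 000, 666, and 900-999 are never assigned.
    if area in ("000", "666") or area.startswith("9"):
        return False
    # Group 00 and serial 0000 are never assigned.
    if group == "00" or serial == "0000":
        return False
    return True
```

A spreadsheet column in which every US employee ID passes a check like this is a reason to treat the column as sensitive. Note that IDs stored as numbers may lose leading zeros, so such values would need to be zero-padded to nine digits before checking.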

Conclusion

Attunity’s business is to replicate and migrate data into data lakes for centralized analytics. The risks to Attunity posed by exposed credentials, information, and communications, then, are risks to the security of the data they process. While many of the files are years old, the bucket was still in use at the time it was detected and reported by UpGuard, with the most recent files having been modified within days of discovery.

The chain of events leading to the exposure of that data provides a useful lesson in the ecology of a data leak scenario. Users’ workstations may be secured against attackers breaking in, but other IT processes can copy and expose the same data valued by attackers. When such backups are exposed, they can contain a variety of data from system credentials to personally identifiable information (PII). Data is not safe if misconfigurations and process errors expose that data to the public internet.
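One common form of the misconfiguration described above is a bucket access control list that grants permissions to a public group. The report does not say whether the Attunity buckets were exposed through an ACL, a bucket policy, or another setting, so the following is a sketch of just one such check. It assumes the ACL has already been fetched as a dictionary shaped like an S3 GetBucketAcl response; the `public_grants` helper and `PUBLIC_GROUPS` table are hypothetical names.

```python
# Predefined S3 group URIs that make a grant effectively public.
PUBLIC_GROUPS = {
    "http://acs.amazonaws.com/groups/global/AllUsers":
        "anyone on the internet",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers":
        "any authenticated AWS account",
}


def public_grants(acl):
    """Return (audience, permission) for each grant to a public group.

    `acl` is assumed to look like:
    {"Grants": [{"Grantee": {"Type": "Group", "URI": "..."},
                 "Permission": "READ"}]}
    """
    findings = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        uri = grantee.get("URI")
        if grantee.get("Type") == "Group" and uri in PUBLIC_GROUPS:
            findings.append((PUBLIC_GROUPS[uri], grant.get("Permission")))
    return findings
```

An empty result from a check like this does not prove a bucket is private, since bucket policies and account-level settings also control access, but a non-empty result is worth investigating immediately.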

How UpGuard can help detect and prevent data breaches and data leaks

UpGuard helps security teams proactively detect and shut down data breach risks that impact their internal security posture and the security postures of all third-party relationships.

UpGuard can also continuously monitor the open, deep, and dark web, discovering stolen credentials and leaked data before they're weaponized. Its AI Threat Analyst acts as a virtual Tier 1 analyst, filtering out noise and elevating only high-confidence threats from sources like malware logs, ransomware leak sites, and encrypted messaging platforms.

The resulting significant reduction in false positives equips security teams to execute fast and targeted responses on risks that actually matter.

UpGuard Team
