Data Warehouse: How a Vendor for Half the Fortune 100 Exposed a Terabyte of Backups

Last updated by UpGuard on August 29, 2019

The UpGuard Data Breach Research team can now disclose that a set of cloud storage buckets used by data management company Attunity has been secured against any future malicious action. Attunity, recently acquired by business intelligence platform Qlik, provides solutions for data integration. An UpGuard researcher discovered three publicly accessible Amazon S3 buckets related to Attunity. Of those, one contained a large collection of internal business documents. The total size is uncertain, but the researcher downloaded a sample of about a terabyte, including 750 gigabytes of compressed email backups. Backups of employees’ OneDrive accounts were also present and spanned the wide range of information that employees need to perform their jobs: email correspondence, system passwords, sales and marketing contact information, project specifications, and more.

The Discovery

On May 13, 2019, an UpGuard researcher discovered publicly accessible Amazon S3 buckets named “attunity-it,” “attunity-patch” and “attunity-support.” The oldest files in “attunity-it,” where the bulk of the sensitive data was stored, were uploaded in September of 2014, though that does not necessarily mean they were publicly available at that time. The most recent files had been uploaded just days before the researcher discovered the bucket. The researcher notified Attunity on May 16, 2019. Because of time zone complications, and because Attunity had recently been acquired by Qlik, the researcher ultimately spoke on the phone with Qlik support. By the next day, public access to the buckets had been removed.

The Significance

In previous cases, we have discussed the complexity of fully analyzing and describing data stores that contain copies of users’ workspaces and mailboxes. Work lives are spread across countless files, each one with some importance, but difficult to reduce to any single metric of significance. While various tools can help search across large data stores, gaining those quantitative measures comes at the expense of analyzing the qualitative nature of the data. There may be a large number of email addresses, for example, but whether they belong to enterprise customers or merely marketing targets changes the significance of their inclusion. A few key examples from the Attunity data set illustrate the kinds of data that users have access to and that can be exposed when collections of those users’ files are stored in misconfigured storage.

Customer Data

According to Attunity’s website, more than two thousand enterprises and half the Fortune 100 use Attunity. A client list found in the repository named a number of companies commensurate with that description. Every business must have customers, and having commercial relations entails some exchange of information, which is why many data exposures involve third-party data belonging to customers or other involved entities. In this case, Attunity’s business in cloud migration and data integration also involves supplying and managing the software that processes customer data. Exhaustively documenting the files associated with each of thousands of companies is neither feasible nor necessary for the research team’s purpose of raising awareness of the risk of data leaks. As in past reports, a few examples can highlight the kinds of information that can be exposed through misconfigured storage, and that were exposed in the case of Attunity.

Netflix database authentication strings.

TD Bank software upgrade invoice.

Ford project preparation slide.

As with documents pertaining to the responsible entity, third parties face the risk that their business documents, credentials, communications, and architecture descriptions will leak.

System Credentials

One class of data, among the most obviously significant for an information security program, is credentials for systems, which could feasibly allow further compromise of the integrity, confidentiality, or availability of data. UpGuard researchers do not attempt to use credentials, and so cannot report on what access these could have provided, but the exposure of credentials certainly removes one layer of protection for accessing those systems. If they are administrative credentials, then the exposure level would be high.

System credentials can be found in a number of places in the Attunity data set and serve as a useful reminder of how that information might be stored in many places across an organization’s digital assets. Credentials such as private keys were stored, and exposed, in directories for configuring those types of systems. 

But system configuration directories are not the only place credentials exist in an organization. Users may save and transmit their own passwords across communication channels like email, which leave a digital paper trail that can be exposed when backups of those systems are stored insecurely.

In this example, a data breach prompted the user to reset the password for the Attunity corporate Twitter account, which is a good thing to do, but because the new password was shared in plaintext by email, a copy of that message ultimately ended up in the public S3 bucket.

Likewise, IT support often responds to questions from personnel who cannot access their accounts. Having multiple layers of security is good, but makes it more difficult for personnel who must authenticate in multiple ways. Supporting those personnel requires some communication about credentials, and exposing those communications in turn creates a way through that layer of security.

Even code and system credentials can be leaked when developer computers are backed up and stored insecurely. This example shows information from a local git history that may be protected at other points in the software development lifecycle, but was exposed because it was saved in an open S3 bucket.
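The kinds of credentials described in this section (private keys, passwords written into emails, and connection strings) can sometimes be flagged by a simple text scan before a backup is stored anywhere. The following is a minimal sketch of that idea, not a description of how UpGuard analyzed the data. The pattern set, the function name `find_credential_hints`, and the sample text are all hypothetical, and a real secret scanner would use a far larger rule set.

```python
import re

# Hypothetical, deliberately small pattern set for illustration only.
PATTERNS = {
    # PEM headers such as "-----BEGIN RSA PRIVATE KEY-----"
    "private_key": re.compile(r"-----BEGIN (?:[A-Z]+ )*PRIVATE KEY-----"),
    # Phrases such as "password: hunter2" or "pwd=abc123"
    "password_assignment": re.compile(
        r"(?i)\b(?:password|passwd|pwd)\s*[:=]\s*\S+"
    ),
    # URLs that embed user:password before the host
    "url_with_credentials": re.compile(
        r"\b[a-z][a-z0-9+.-]*://[^\s/:@]+:[^\s/@]+@[^\s/]+"
    ),
}


def find_credential_hints(text):
    """Return (line_number, label) pairs for lines that look like credentials."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings


# Made-up sample text; none of it comes from the exposed data.
sample = "\n".join([
    "Hi team, the new password: hunter2",
    "conn = postgres://app:s3cret@db.example.internal:5432/prod",
    "-----BEGIN RSA PRIVATE KEY-----",
    "See https://example.com/docs for details",
])

for lineno, label in find_credential_hints(sample):
    print(f"line {lineno}: possible {label}")
```

A scan like this only produces hints for a person to review. It will miss credentials that do not match a pattern and will flag some harmless text.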

System Information

Credentials are like a key, and using them requires knowing what lock they fit. For that, information about systems and their architecture provides the context. Such documents are necessary for planning and implementing technology of any complexity, and particularly so when communicating about requirements with customers for large-scale data processing. As a data integrator, Attunity appears to produce documents describing how it will process customer data.

Documentation also covers Attunity’s own systems, like this spreadsheet named “Production VLAN.”

Personal Information

Even businesses that do not have a consumer user base must store personal information about their own employees to perform core human resources tasks like verifying identities and running payroll. Because data shared through digital communication (for example, emailing your employer your home address) is replicated rather than actually transferred like a physical object, there are many points at which it persists and can be stored insecurely. Furthermore, data sent to the department that uses it (like human resources) is on systems that are maintained by an information technology group, and may be replicated again for disaster recovery or other long-term storage.

In the case of Attunity, the exposure included spreadsheets with employee data. The example below had 354 rows and included columns for ID, Employee, Actual / Forecast/Commit, Benefit Code, G/l account, Entity, Department, Location, Operation, Role, Active, Full Name, First name, Last name, Employee ID, Payroll ID, Date of hire, Job title, Direct manager, %, Local Currency, Salary 2015, Salary 2016, Company car value /Allowance, On target commission, Pro rated commission 2016, On target bonus, vacation days, Options Grant, RSUs Grant, Prior Notice, Recruitment fee, License Quota 2016, Key employee, Date of birth, Senior management, Zviran Code, OB  VAC 1#1#15, Salary 2014, Date of termination, Travel budget 2016, updated salary 2016, Recruitment booked, and Attachments. 

An additional risk is that the employee ID numbers tied to US Attunity employees follow the same numbering scheme as Social Security numbers, which leads us to believe they may be one and the same. The Attunity Employee IDs in this spreadsheet for US employees are nine digits, the same length as SSNs. Using a US government SSN validation site, we could confirm that these numbers are valid SSNs issued to someone, and that they were issued at approximately the same time as these individuals’ dates of birth (which we could cross-reference with the data in the Attunity spreadsheet). The US government site does not return the name of the person with the SSN, for obvious security reasons, and so we cannot absolutely verify that these ID numbers are also the employees’ Social Security numbers.
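Before any lookup of the kind described above, a nine-digit identifier can be checked against the publicly documented structural rules for SSNs: the first three digits cannot be 000, 666, or in the 900 range, the middle two cannot be 00, and the last four cannot be 0000. The sketch below applies only those rules. It is not the government validation service mentioned above, it cannot tell whether a number was ever issued or to whom, and the function name is hypothetical.

```python
def is_structurally_plausible_ssn(value):
    """Check a string against the documented structural rules for SSNs.

    This says nothing about whether the number was actually issued.
    """
    digits = value.replace("-", "")
    if len(digits) != 9 or not (digits.isascii() and digits.isdigit()):
        return False
    area, group, serial = digits[:3], digits[3:5], digits[5:]
    # Area numbers 000, 666, and 900-999 are never assigned.
    if area in ("000", "666") or area.startswith("9"):
        return False
    # Group 00 and serial 0000 are never assigned.
    if group == "00" or serial == "0000":
        return False
    return True


# Made-up identifiers for illustration; none come from the exposed data.
for candidate in ["123-45-6789", "900-12-3456", "123-00-4567", "12345678"]:
    print(candidate, is_structurally_plausible_ssn(candidate))
```

A check like this can help decide whether a column labeled as an employee ID deserves the handling normally reserved for government identifiers, without sending any value to an outside service.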

Conclusion

Attunity’s business is to replicate and migrate data into data lakes for centralized analytics. The risks to Attunity posed by exposed credentials, information, and communications are therefore risks to the security of the data it processes. While many of the files are years old, the bucket was still in use when UpGuard detected and reported it, with the most recent files having been modified within days of discovery.

The chain of events leading to the exposure of that data provides a useful lesson in the ecology of a data leak scenario. Users’ workstations may be secured against attackers breaking in, but other IT processes can copy and expose the same data valued by attackers. When such backups are exposed, they can contain a variety of data from system credentials to personally identifiable information (PII). Data is not safe if misconfigurations and process errors expose that data to the public internet. 
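This report does not say which setting made the Attunity buckets public. In general, though, an S3 bucket can become publicly readable through a bucket policy statement that allows a wildcard principal, or through an access control list (ACL) grant to the "AllUsers" group. The sketch below inspects a policy document and an ACL that have already been retrieved and flags both cases. The function names, bucket name, and sample documents are hypothetical, and the check ignores `Condition` blocks and account-level public access settings, so it can over-report.

```python
ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"


def _principal_is_everyone(principal):
    """True if a policy Principal is the wildcard, in string or dict form."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        aws = principal.get("AWS")
        values = aws if isinstance(aws, list) else [aws]
        return "*" in values
    return False


def public_statements(policy):
    """Return Allow statements whose principal is everyone (conditions ignored)."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be in a list
        statements = [statements]
    return [
        s for s in statements
        if s.get("Effect") == "Allow"
        and _principal_is_everyone(s.get("Principal"))
    ]


def public_acl_grants(acl):
    """Return ACL grants made to the global AllUsers group."""
    return [
        g for g in acl.get("Grants", [])
        if g.get("Grantee", {}).get("URI") == ALL_USERS_URI
    ]


# Hypothetical documents for a made-up bucket named "example-backups".
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "PublicRead", "Effect": "Allow", "Principal": "*",
         "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-backups/*"},
        {"Sid": "BackupWriter", "Effect": "Allow",
         "Principal": {"AWS": "arn:aws:iam::111122223333:role/backup"},
         "Action": "s3:PutObject",
         "Resource": "arn:aws:s3:::example-backups/*"},
    ],
}
acl = {
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "owner-id"},
         "Permission": "FULL_CONTROL"},
        {"Grantee": {"Type": "Group", "URI": ALL_USERS_URI},
         "Permission": "READ"},
    ],
}

for s in public_statements(policy):
    print("public policy statement:", s["Sid"], s["Action"])
for g in public_acl_grants(acl):
    print("public ACL grant:", g["Permission"])
```

Running a review like this on a schedule, rather than once at bucket creation, fits the scenario described above, in which a bucket stays in active use for years after it is first set up.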
