Updated on April 19, 2018 by UpGuard
When we examined the differences between breaches, attacks, hacks, and leaks, it wasn’t just an academic exercise. The way we think about this phenomenon affects the way we react to it. Put plainly: cloud leaks are an operational problem, not a security problem. Cloud leaks are not caused by external actors, but by operational gaps in the day-to-day work of the data handler. The processes by which companies create and maintain cloud storage must account for the risk of public exposure.
Cloud storage provides speed, scalability, and automation for IT operations. Companies move production datasets in and out of cloud storage as needed, often reusing the same bucket for multiple tasks. Without proper care, it’s easy for a sensitive dataset to be moved into an unsecured bucket. This is why cloud storage configurations should be validated at deployment and throughout their time hosting production data. Continuous validation keeps the risk visible and can even proactively notify administrators if public access becomes allowed.
An example of why process validation is the key to preventing cloud leaks is the fact that Amazon’s S3 storage is private by default. This means that a change to the permission set must occur at some point for the bucket to be exposed to the world. That change– adding access to the All Users or Authenticated Users groups– could only happen inadvertently if there is no control in place to validate that the permissions are accurate. Likewise, if the sensitive data is moved into a bucket that’s already public, no process control around data handling exists to check the permissions as part of the move.
It is not human error that leads to cloud leaks. It’s process error. People make mistakes in everything. Enterprise IT is extremely complicated, exacerbating our natural tendency to mess up occasionally. Which is exactly why controls at the process level– structural, automated validation– must be in place to check the work being done. Operations that must be done repeatedly, and that when done incorrectly risk jeopardizing the company, must be controlled to limit that risk as much as possible.
At the enterprise scale, validation can only be achieved if it can fit inside a high speed workflow. When configuration validation becomes a bottleneck to a process, it’s far less likely to be dutifully enforced. If it relies on someone manually checking each configuration, not only can it not be accomplished quickly enough, but it suffers from the same capacity for human error as the original set up. Computers are far better at maintaining uniformity among a series than we are. Automated process controls should act as executable documentation, where important standards such as ensuring all cloud storage is private can be detailed and then checked against the actual state of any cloud storage instances to be sure that they comply.
For example, if we are provisioning S3 buckets in the enterprise, rather than manually creating a bucket in the AWS console and walking through a checklist, we should automate the programmatic creation of buckets using Amazon’s API, and roll a validation step into the process after the bucket is created to check for critical settings like internet exposure. This way, when a cloud storage instance needs to be created, an admin can just kick off a script and be sure that the newly created bucket is up to snuff.
Automation also allows configuration validation to be performed continuously, throughout the asset’s lifetime. This ensures visibility into assets at all times. Change within an enterprise data center is constant; a good process validates that changes do not violate basic standards, and alerts people immediately when they do.
Further obscuring the problem is the distance at which cloud leaks often occur: a third-party vendor doing information processing accidentally exposes the information in the cloud. As the dataset is associated with the primary company in the minds of their customers, they will be held just as accountable for the leak as if it had been their own servers. This makes assessing and optimizing third-party cyber risk just as important as in-house resilience.
Partnering with another company to handle sensitive information should always entail an assessment of that company’s practices, so the risk they pose by handling that data can be understood. Spending millions on internal cybersecurity only to outsource the same data to someone who leaves it exposed in the cloud doesn’t make sense. Vendors should be selected and appraised with the same care a company takes in protecting their in-house assets and information.
UpGuard tackles cloud leaks by automating cloud storage validation. Public access is the most dangerous, but like any digital surface, the total configuration state of cloud storage determines its resilience. UpGuard not only scans storage instances for public exposure, but also checks cloud platforms and servers themselves for misconfigurations that can lead to data exposure. UpGuard validates other internet-facing sources of leaks as well, like misconfigured GitHub repositories, and vulnerable rsync servers.
With UpGuard’s visual policies, admins can know in a glance which of their cloud storage instances are public and which are private. New buckets can be validated during the provisioning process automatically with UpGuard’s API and integration with tools like Puppet, Chef, and Ansible.
With UpGuard Procedures, cloud provisioning and maintenance processes can be automated and validated from end to end, reducing operational risk. Executable documentation works best when arranged by process, so that procedural steps can be chained together logically and validated in turn. This produces trustworthy assets and drastically reduces the risk of misconfigurations, such as accidental public exposure.
For example, a procedure automating the creation of a new Linux web server on AWS could:
Cloud servers and storage deployed in this manner has a significantly lower risk of data exposure than those lacking these controls.
UpGuard also provides external vendor assessment, analyzing and visualizing the relative risk posed by third parties charged with handling your data. Compare vendors and partners to similar companies to see how they measure up within their field. Our external assessment aggregates every relevant security practice visible from the internet into a single risk score.
This includes website details for all of a vendor’s URLs; email and domain safety, such as protocols against phishing; open ports, like Microsoft’s SMB which has been exploited by ransomware attacks like WannaCry and Petya, and business details including employee satisfaction and CEO approval ratings.
Cloud leaks are the result of operational error– not human error. A process is missing the necessary controls to reliably produce good results over time. The way to prevent cloud leaks is to shore up those operational gaps by instituting automated validation across all critical assets. The reason cloud leaks happen is because nobody knows that sensitive data is exposed to the internet. Process controls, like those outlined above, guarantee that such knowledge is surfaced immediately, so that it can be fixed before it becomes a bigger problem. UpGuard helps prevent data breaches in the cloud and on-premises by automating and visualizing these process controls.
Misconfigurations are an internal problem that emanate from within the IT infrastructure of any enterprise; no hacker is necessary for massive damage to occur to digital systems and stored data. And the problem is pervasive, with Gartner estimating anywhere from 70% to 99% of data breaches result not from external, concerted attacks, but from internal misconfiguration of the affected IT systems.