Updated on May 1, 2018 by UpGuard
Modern enterprise data centers are a complex mix of different technologies geared towards accomplishing business goals. Some of these technologies are pricy, big-name business solutions, but some are simple tools and utilities, facilitating processes. Linux sysadmins have been using rsync (remote synchronization) to move and mirror files for two decades, though versions of it now run on nearly every platform. Its lightweight build, small footprint, and usability make it a good choice for simple file copy operations. But this same asset is also a liability for many utilities: designed purely for functionality, they may not automatically account for potential risks to enterprise data. To successfully use rsync in the enterprise means protecting the data being transferred through it from accidental exposure.
One of the great advantages of rsync over other similar utilities is that it is able to easily transfer only the delta between systems. For example, if you set up rsync on a file server and connect a backup server as the mirror, the initial sync will move every file in the specified path. After that first sync, rsync will only move the changes, keeping the mirror identical to the primary server and minimizing network traffic. This type of file copy procedure is extremely common for most organizations, and without process guidelines, techniques and utilities vary widely among individual admins.
Despite its compact build, rsync does have security options that can protect the data it transfers. But like many pared-down tools, it does not invoke them by default, and the burden therefore rests on the person setting it up to configure it securely.
Utilities are data agnostic. They don’t know sensitive from not sensitive, and they don’t separate out dangerous from harmless. They only do what you tell them, and in the case of most Linux-based utilities, only exactly what you tell them. When rsync is used by businesses, the government, and other large organizations, the files being transferred may contain extremely sensitive information. Although rsync can move these files the same as it could if they contained gibberish, the risk to the business can be severe if that information leaks.
Data exposure has become a prominent business risk, and organizations that have experienced such a leak have also had to endure the associated financial and reputational damage. Rsync can be a powerful utility for simple file mirroring or transfer, but the level of care taken to configure it should be commensurate to the sensitivity of the data being transmitted.
Before we dive into the configurations themselves, it’s important to note that there are two different ways to use rsync. One is as a command line utility, where all of the details are passed as arguments; this is rsync. The other is the daemonized version, known as rsyncd, which listens on a designated port as a service. Rsyncd relies on rsyncd.conf for its configuration, where each sync path has its own block of options. Rsyncd is the vector for data exposure involving rsync, as it can be accessed by an anonymous third party if it is not properly protected. For our purposes, we will focus on rsyncd, which is the most common way rsync is utilized at scale.
The rsync daemon depends on rsyncd.conf for its authentication, access, logging, and available modules. In service mode, rsync can provide details for many different synchronization paths. Organizations that rely on rsync may find these paths accumulating over time. Because each path requires separate configuration, it’s easy for one to fall through the cracks and omit an important directive.
pid file = /var/run/rsyncd.pid
lock file = /var/run/rsync.lock
log file = /var/log/rsync.log
port = 12000

[files]
path = /var/storage/
comment = Primary file server
timeout = 300
In its simplest form, the rsyncd.conf file declares the following global parameters: pid file, lock file, log file, and port (the TCP port the daemon listens on).
Other global options exist, such as specifying the IP address to listen on, advanced socket options, and the ability to send a message of the day (MOTD), essentially a service banner, to users of the rsync service.
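Additionally, it can include any number of “modules,” or file paths to synchronize. In the example above, our [files] block denotes the sole module for this system. Under that module, many directives can be set. The example uses three of them: path (the directory being shared), comment (a description of the module), and timeout (how long, in seconds, the daemon waits on a stalled transfer).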
However, the directives that set security are even more important. Because rsync is a lean utility, none of them are engaged by default. This requires administrators to understand and validate their rsync module configurations in order to properly limit access to the information they handle. Let’s look at each of them in detail.
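One way to make that validation repeatable is to script it. The following is a minimal Python sketch, not an official rsync tool. It assumes the simple "name = value" and "[module]" layout shown in the example above, ignores line continuations, and treats the three directives in REQUIRED as a hypothetical house policy. The function names are illustrative.

```python
# Hypothetical policy: directives every module is expected to set.
REQUIRED = ("hosts allow", "auth users", "secrets file")


def parse_rsyncd_conf(text):
    """Split rsyncd.conf text into (global parameters, {module: parameters})."""
    global_params, modules = {}, {}
    current = global_params
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            # A [name] header starts a new module block.
            current = modules.setdefault(line[1:-1].strip(), {})
            continue
        name, sep, value = line.partition("=")
        if sep:
            # Simplification: collapse whitespace and lowercase the name.
            current[" ".join(name.split()).lower()] = value.strip()
    return global_params, modules


def missing_security_directives(text):
    """Map each module name to the REQUIRED directives it does not set."""
    global_params, modules = parse_rsyncd_conf(text)
    report = {}
    for name, params in modules.items():
        # Values set in the global section act as defaults for every module.
        effective = {**global_params, **params}
        missing = [d for d in REQUIRED if not effective.get(d)]
        if missing:
            report[name] = missing
    return report
```

Run against the example configuration above, this would report the [files] module as missing all three directives.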
The list option allows rsync to “hide” a module from anyone who doesn’t know what they’re looking for. When the rsync daemon is queried for available modules, those set to list = false will be omitted from the results. While this kind of security through obscurity is not enough on its own, it is one additional layer you can add to protect particularly sensitive file paths. By default, modules are listable, so this parameter must be explicitly set to false for hidden modules.
List = true
The module is visible when the rsync daemon is queried for available paths.
List = false
The module is not visible in the daemon’s list and must be accessed directly.
Default: True. The module will be visible.
The most basic way to protect rsync modules from accidental exposure is to restrict which external machines can talk to them. By using the hosts allow and hosts deny directives, rsync can build a policy of least privilege by permitting only those clients necessary for business goals. When hosts allow is used on its own, all unspecified source IPs are rejected automatically. This drastically narrows the attack surface of the rsync server and should always be established for even somewhat sensitive information. Hosts deny blocks specific IP addresses, ranges, or hostnames. Note that it cannot carve exceptions out of an allowed range, because a match in hosts allow takes precedence.
Hosts allow [IP address, IP range, hostname]
Specified clients will be allowed, even if they also match the hosts deny list. When hosts allow is the only directive set, all others will be blocked.
Hosts deny [IP address, IP range, hostname]
The specified clients will be blocked, unless they also match hosts allow. All others will be allowed.
Default: All hosts are allowed.
When used in conjunction, the hosts allow directive is read first. If the client matches there, it is granted access immediately. Otherwise the hosts deny directive is read, and a client that matches there is denied access. A client that matches neither list is allowed, so a setup that combines both directives needs a broad hosts deny entry to keep the allow list exclusive.
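The evaluation order is easier to reason about when written out. The sketch below models it in Python, following the order described in the rsyncd.conf documentation: hosts allow is checked first and a match admits the client, hosts deny is checked next, and a client matching neither is admitted unless hosts allow is the only list in use. It is an illustration rather than rsync's own code, and as a simplifying assumption it only understands IP addresses and CIDR ranges, not hostname patterns. The function names are hypothetical.

```python
import ipaddress


def matches(client_ip, patterns):
    """True if client_ip falls inside any address or CIDR range in patterns."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(p, strict=False) for p in patterns)


def is_allowed(client_ip, hosts_allow=None, hosts_deny=None):
    """Model the hosts allow / hosts deny decision for one client address."""
    if hosts_allow:
        # A match in the allow list always grants access.
        if matches(client_ip, hosts_allow):
            return True
        # An allow list with no deny list rejects everything else.
        if not hosts_deny:
            return False
    # A match in the deny list rejects the client.
    if hosts_deny and matches(client_ip, hosts_deny):
        return False
    # Clients that match neither list are allowed.
    return True
```

For example, with hosts_allow set to ["10.0.0.0/24"] and hosts_deny set to ["192.168.0.0/16"], a client at 172.16.0.1 matches neither list and is let in, which is why combined setups need a broad deny entry.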
IP and hostname restrictions narrow the attack surface by device, but any user on those allowed devices will be able to access the rsync module. The auth users directive narrows the attack surface by user, limiting access to only specified accounts, regardless of device. When auth users is enabled and given a list of usernames, only those users can connect to the rsync daemon.
The auth users directive relies on a “secrets” file, for example, /etc/rsyncd/rsyncd.secrets. This file contains the username and password combinations for rsync accounts. It’s critical to note that the secrets file is stored in plain text, including passwords. This means the file should be heavily restricted.
If the auth users directive is absent, the default is to allow all users. And just like that, if your rsync server is available from the internet, you have a data leak. The most important takeaway to remember when building a secure rsync setup is that by default, anyone can access the path. Failure to correctly configure the auth users and hosts allow/deny settings turns whatever data is being synchronized into a public facing webpage. Anybody who finds the rsync server can pull the contents anonymously, without needing a password. Incidentally, finding internet exposed rsync hosts is trivial when the default port is being used. It is always recommended to limit access to rsync by user and device. Every layer reduces the risk of data exposure.
Auth users admin1,support,serviceadmin
The specified users will be allowed to authenticate to rsync. Default: All users are allowed.
Secrets file /etc/rsyncd/rsyncd.secrets
This specifies the location of the username and password combinations used by the auth users directive.
Default: None. Must be used in conjunction with auth users.
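Because the secrets file holds plain text passwords, it helps to create it with restrictive permissions from the start. The following is a minimal Python sketch, assuming a Unix-like host and one user:password entry per line. The function name is illustrative, and any path passed to it is the caller's choice.

```python
import os


def write_secrets_file(path, credentials):
    """Write user:password lines to path, accessible by the owner only."""
    for user, password in credentials.items():
        # A colon separates the fields, so it cannot appear in a username.
        if not user or ":" in user or "\n" in user or "\n" in password:
            raise ValueError(f"unsupported characters in entry for {user!r}")
    # Create the file with mode 0600 so it is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        # The mode given to os.open only applies to new files, so reset it.
        os.fchmod(handle.fileno(), 0o600)
        for user, password in credentials.items():
            handle.write(f"{user}:{password}\n")
```

The file should also be owned by the account the daemon runs under, which this sketch leaves to whoever runs it.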
Having a plain text file with usernames and passwords, like the rsync “secrets” file, is not a great idea. It illustrates one of the risks of using rsync in the enterprise, a risk companies must be willing to accept in order to employ its functionality. However, there is another directive, called strict modes, that can offset the risk of the secrets file being compromised to some degree. Strict modes checks that the secrets file can only be read by the account under which the rsync daemon is running. For instance, if rsyncd is running under our dedicated rsync user (as it should, with minimal privileges) then only the rsync user should have access to read the secrets file. The daemon checks the file permissions and will refuse to authenticate users unless they are correct. This is some nice additional validation that the plain text passwords in the secrets file won’t be accessed by unauthorized users.
That said, most enterprise class technology would never store passwords unencrypted in a text file. This is a qualitative difference between tools geared towards maximum functionality and platforms designed with business risks in mind. However, with the proper care, even rsync can be fairly well protected against accidental and malicious access.
Strict modes = true
The secrets file’s permissions will be checked, and authentication will fail if accounts other than the daemon’s can read it.
Strict modes = false
The secrets file will not be checked for the proper permissions.
Default: True. The check applies whenever auth users and a secrets file are in use.
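The same kind of permission check can be run ahead of time, for instance in a deployment script. The sketch below is an approximation written for this article, not rsync's own logic: it is deliberately stricter in that it flags any group or other access, and the expected_uid parameter and function name are assumptions.

```python
import os
import stat


def secrets_file_problems(path, expected_uid=None):
    """Return a list of permission problems found on the secrets file."""
    info = os.stat(path)
    problems = []
    # Any group or other permission bit means another account may reach the file.
    if info.st_mode & (stat.S_IRWXG | stat.S_IRWXO):
        mode = stat.S_IMODE(info.st_mode)
        problems.append(f"mode {mode:o} grants access beyond the owner")
    # Optionally confirm the file belongs to the account running the daemon.
    if expected_uid is not None and info.st_uid != expected_uid:
        problems.append(f"owned by uid {info.st_uid}, expected {expected_uid}")
    return problems
```

An empty list means the file passed both checks. Anything else is worth fixing before the daemon is exposed to clients.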
Encryption is one area where rsync and rsyncd differ greatly. When rsync is run from the command line against a remote host, the transfer is carried over a separate protocol, usually SSH, which encrypts it. However, the rsync daemon does not encrypt traffic. This means that rsync daemon traffic can potentially be sniffed in transit by a third party, granting them access to whatever information is being transferred. Therefore, rsync operations happening openly across the internet are extremely vulnerable to data exposure.
All rsyncd traffic should occur within a protected intranet or inside of an encrypted tunnel or VPN. At the enterprise level, there is no excuse for passing unencrypted data across the net. Alternative simple file copy solutions such as SCP and SFTP also support built-in encryption.
Default: Unencrypted on rsyncd.
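One common way to wrap daemon traffic in an encrypted tunnel is an SSH port forward. The sketch below only builds the two command lines and does not run them. The host, module name, destination, and local port are hypothetical placeholders, and it assumes the daemon listens on the default port on the remote machine.

```python
def tunnelled_rsync_commands(ssh_target, module, dest,
                             local_port=8873, daemon_port=873):
    """Build argv lists for an SSH port forward and an rsync pull through it."""
    # Forward a local port to the daemon port on the remote host; -N runs no remote command.
    ssh_cmd = ["ssh", "-N", "-L",
               f"{local_port}:127.0.0.1:{daemon_port}", ssh_target]
    # Point rsync at the local end of the tunnel instead of the remote host.
    rsync_cmd = ["rsync", "-av",
                 f"rsync://127.0.0.1:{local_port}/{module}/", dest]
    return ssh_cmd, rsync_cmd
```

With this arrangement the daemon sees the connection as coming from its own loopback address, so hosts allow would need to permit 127.0.0.1.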
If rsync is open to the net, anyone who scans the server will find an open port. Changing the port from 873 in the rsyncd.conf file can help obfuscate this, but ultimately if the rsync port is exposed, someone will eventually find it and see what they can do. Like any enterprise service, access to the rsync port should be limited in scope. Firewall ACLs can block unauthorized source IPs, much like the hosts allow and hosts deny directives in rsync itself.
Consider the operations being carried out by rsync. Is the data being copied important? If so, internet facing rsync is a massive vector of risk, and even with careful configuration can prove dangerous over time.
Default: Port 873.
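A quick way to confirm that firewall rules are doing their job is to test the port from a machine that should not have access. The following is a minimal Python sketch that only attempts a TCP connection. The function name is illustrative, and it should only be pointed at hosts you are authorized to test.

```python
import socket


def port_is_open(host, port=873, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered, timed out, or unresolvable: treat as not reachable.
        return False
```

A True result from an untrusted network means the daemon is reachable there, and the module-level protections described above are all that stand between it and the data.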
Building a secure rsync setup for enterprise operations requires applying multiple layers of protection, each helping to minimize the surface area of the daemon and limit the remote connections that will be allowed access.
By following these three rules on every rsync module, you can reduce the chances of rsync-based data exposure significantly, allowing you to take advantage of the functionality of rsync without succumbing to its risks. But whether it’s an enterprise platform or a simple utility, misconfigurations will be the number one risk. People make mistakes all the time, and without the right process controls, those mistakes can come back around as a data breach or major outage. It’s fun to talk about 0-day exploits and fancy hacking methods, but an unprotected rsync server is far more likely and every bit as dangerous.
Misconfigurations are an internal problem that emanates from within the IT infrastructure of any enterprise; no hacker is necessary for massive damage to occur to digital systems and stored data. And the problem is pervasive, with Gartner estimating that anywhere from 70% to 99% of data breaches result not from external, concerted attacks, but from internal misconfiguration of the affected IT systems.