Tonight I gave a talk on comparing containers and generating Dockerfiles. Instead of providing the slides, which are pretty lame by themselves, I thought I'd write up the talk in a proper context. UpGuard has a number of use cases; the one highlighted in the talk was migrating the configuration of environments from one location to another. Traditionally we have helped customers scan their configuration state, generate executable tests from those configuration items, and compare scanned configuration across multiple machines.
Lately, we have been investigating migrations to Linux containers, both for customers and for our own internal infrastructure. One common (and possibly a little inconvenient) situation is that the base ubuntu images are very stripped-down versions of the operating systems they represent. We knew we could use UpGuard to discover and list configuration items, but we had a look around online to see if anyone else was doing similar analyses.
The "docker diff" command showed up in many results, but it turns out this command only highlights the filesystem differences relative to the base image. The "docker history" command showed a little more than just the filesystem changes, but only went back to the base image. During the search we found this quote from Solomon Hykes in a recent interview:
"The beauty of docker images being "just files" means that the difference between two docker images is just a diff of the files they contain."
Everything we came across seemed to centre around files, but it is not always clear what is going on just from looking at file differences. As an example, this is the output of "docker diff" on a container:
$ docker diff 10eeeb8151d0
C /etc
C /etc/group
C /etc/passwd
…
A /home/UpGuard
A /home/UpGuard/.bash_history
…
A /run/ssdh.pid
…
Sometimes looking at the differences on a filesystem doesn't show the whole picture.
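To make that concrete, here is a minimal Python sketch (not part of UpGuard) that groups "docker diff" output by change type. The function name parse_docker_diff is made up for illustration; the only thing it relies on is that "docker diff" prefixes each path with A (added), C (changed) or D (deleted). Even neatly grouped, the result is still only a list of paths: nothing in it says that a package was installed or that a user was created.

```python
from collections import defaultdict

# Change-type letters printed by `docker diff`.
KINDS = {"A": "added", "C": "changed", "D": "deleted"}


def parse_docker_diff(output):
    """Group the paths in `docker diff` output by change type.

    Hypothetical helper for illustration; lines that do not start with
    a known change letter (such as an elided "…") are skipped.
    """
    grouped = defaultdict(list)
    for line in output.splitlines():
        kind, _, path = line.strip().partition(" ")
        if kind in KINDS and path:
            grouped[KINDS[kind]].append(path)
    return dict(grouped)


sample = """\
C /etc
C /etc/group
C /etc/passwd
A /home/UpGuard
A /home/UpGuard/.bash_history
"""

changes = parse_docker_diff(sample)
for kind, paths in changes.items():
    print(kind, paths)
```

In practice you would feed this the captured output of "docker diff" for your own container ID rather than the inline sample.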
We set up a docker container based on the official "ubuntu" precise (Ubuntu 12.04) image in the registry. The only additional changes we made were to install the SSH daemon (apt-get install openssh-server) and add a non-root user. The Dockerfile used is given below.
FROM ubuntu
MAINTAINER UpGuard
RUN apt-get update
RUN apt-get install openssh-server -y
RUN mkdir -p /var/run/sshd
EXPOSE 22
Since the docker daemon was running on a remote server that we had SSH'd into, we ran the container (after building the image) with the following command:
$ sudo docker run -p 2201:22 -i -t sshd-ubuntu /bin/bash
Here we mapped the container's exposed SSH port to an alternate port (2201) on the host so that UpGuard could SSH into the container to perform its scan. We then ran a few commands in the running container to finish the setup, as it was easier to perform these actions in a live terminal than in a Dockerfile:
$ adduser UpGuard
…
$ /usr/sbin/sshd
A node scan of the ubuntu Docker container from the registry.
More than just displaying filesystem differences, the scan picks up some rudimentary configuration items of the system, like APT packages, users, groups and environment variables. While it is difficult to insert an interactive demonstration above, you can interact with a similar public page we set up comparing operating systems between hosting providers. We then set up a fresh Ubuntu Precise (12.04) Server VM, installed only "openssh-server" and added the "UpGuard" user, identical to the "ubuntu" container setup. After a scan, these were the differences we detected in just the installed APT packages. They are shown below:
Here red shows packages that are installed on a fresh Ubuntu 12.04 server but are missing from the Ubuntu 12.04 container. Orange shows packages installed on both nodes but with a small difference between them, such as the version number. Of particular annoyance to me when setting up containers based on the "ubuntu" image is that "vim" isn't even installed.
One of the other commonly used registry images is the one called "base", which also claims to be an Ubuntu 12.04 container. We set up this container using the same process as for the "ubuntu" docker container above, then compared the "ubuntu" image to the "base" image. Below is a comparison of just the APT packages between these containers:
Again, here red and green represent packages that are detected in "ubuntu" and "base" respectively, but not in the other. Orange represents packages found in both, but with differing version numbers. Even for two docker images that both claim to be based on Ubuntu 12.04, there are many differences in installed packages alone.
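The colour coding boils down to a three-way comparison of two package lists. The sketch below is my own approximation in Python, not how UpGuard's scan works internally. It assumes each node's packages were dumped as "name<TAB>version" lines (the default output of "dpkg-query -W"); the function names and the version numbers in the sample are placeholders, not real scan results.

```python
def parse_dpkg_list(text):
    """Parse "name<TAB>version" lines into a {name: version} dict."""
    packages = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _, version = line.partition("\t")
        packages[name.strip()] = version.strip()
    return packages


def compare_packages(left, right):
    """Return (only_left, only_right, differing) for two package dicts.

    only_left  -> "red":    installed on the left node only
    only_right -> "green":  installed on the right node only
    differing  -> "orange": installed on both, versions differ
    """
    only_left = sorted(set(left) - set(right))
    only_right = sorted(set(right) - set(left))
    differing = {
        name: (left[name], right[name])
        for name in sorted(set(left) & set(right))
        if left[name] != right[name]
    }
    return only_left, only_right, differing


# Placeholder data: versions are invented for the example.
server_vm = parse_dpkg_list("bash\t1.0\nopenssh-server\t2.0\nvim\t3.0\n")
container = parse_dpkg_list("bash\t1.1\nopenssh-server\t2.0\n")

only_vm, only_container, differing = compare_packages(server_vm, container)
print("red:", only_vm)
print("green:", only_container)
print("orange:", differing)
```

Comparing on exact version strings is the simplest choice; it flags any difference without trying to decide which version is newer.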
Scanning and comparing container configurations is useful for basic forensic activities, but you also want to be able to take action on this information. Configuration information scanned with UpGuard can also be filtered into packages which contain executable tests for these configuration items. A package can also be exported to common configuration automation tools such as Puppet, Chef and Ansible. Of particular interest is our experimental export to Dockerfile.
We created a package for a Rails app that sets up the basic configuration for nginx. Below is a list of the tests contained in a package designed to validate that nginx is installed on a system.
By exporting the package to Dockerfile, the following file was downloaded:
#
# Dockerfile template generated by UpGuard
# To build:
# 1) Place this file in a folder (e.g. ~/docker/railsapp/) and rename to just 'Dockerfile'
# 2) Run:
# $ docker build -t="railsapp" . (<--- don't forget the dot)
# For details on running this container, see:
# http://docs.docker.io/en/latest/commandline/cli
# ---------------------------------
# Account: UpGuard
# Package: RailsApp
# Downloaded: 2013-11-21 01:47:53 +0000
#
FROM ubuntu
MAINTAINER Generated by UpGuard for Steve Cossell "steve@UpGuard.com"
# Just once before doing one or more apt-get install commands...
# (If your package is not in the main Ubuntu repo, uncomment the following line)
# RUN echo 'deb http://archive.ubuntu.com/ubuntu precise main restricted universe' >> /etc/apt/sources.list
RUN apt-get update
# The package nginx should be installed. Version 1.1.19-1ubuntu0.4 in particular, should be installed.
# RUN apt-get install nginx=1.1.19-1ubuntu0.4 -y
RUN apt-get install nginx -y
# The package nginx-full should be installed. Version 1.1.19-1ubuntu0.4 in particular, should be installed.
# RUN apt-get install nginx-full=1.1.19-1ubuntu0.4 -y
RUN apt-get install nginx-full -y
# The package nginx-common should be installed. Version 1.1.19-1ubuntu0.4 in particular, should be installed.
# RUN apt-get install nginx-common=1.1.19-1ubuntu0.4 -y
RUN apt-get install nginx-common -y
# The file /etc/nginx/nginx.conf should have the defined properties
# -- Please symlink the following file to the location of your Dockerfile
# -- $ cd /dir/your/Dockerfile/lives/in/
# -- $ ln -s /etc/nginx/nginx.conf
# -- Or, change the file's source to a URL
# -- (For more information, see: http://docs.docker.io/en/latest/use/builder/#add)
ADD nginx.conf /etc/nginx/nginx.conf
RUN chown 0 /etc/nginx/nginx.conf
RUN chgrp 0 /etc/nginx/nginx.conf
RUN chmod 644 /etc/nginx/nginx.conf
# The local TCP port 80 should be awaiting connections or able to receive packets.
EXPOSE 80
One of the easiest ways to build applications programmatically into containers through Docker is to use a Dockerfile.
This Dockerfile is a good starting point and can be modified to meet the custom requirements of the real deployment. One immediate improvement is to look up the dependency tree of the APT packages and install only the packages that really need to be installed explicitly. For example, installing the "nginx" package alone probably also installs the "nginx-full" and "nginx-common" packages as dependencies. In addition, the "chown" and "chgrp" commands could be omitted from the Dockerfile when the user and group being set are root (uid=0, gid=0).
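Those two improvements are easy to sketch. The Python below is a rough illustration, not UpGuard's actual exporter: prune_dependencies, render_dockerfile, FileItem and the ASSUMED_DEPENDS map are all names I made up, and the dependency map is hand-written to mirror the guess above rather than looked up from APT. It also assumes the map has no cycles.

```python
from typing import NamedTuple


class FileItem(NamedTuple):
    source: str   # file name placed next to the Dockerfile
    path: str     # destination path inside the image
    mode: str     # permission bits, e.g. "644"
    uid: int = 0
    gid: int = 0


def prune_dependencies(packages, depends_on):
    """Drop packages that another requested package already pulls in.

    Assumes `depends_on` is acyclic; with a cycle, every package in
    the cycle would be dropped.
    """
    implied = set()
    for name in packages:
        stack = list(depends_on.get(name, ()))
        while stack:
            dep = stack.pop()
            if dep not in implied:
                implied.add(dep)
                stack.extend(depends_on.get(dep, ()))
    return [name for name in packages if name not in implied]


def render_dockerfile(base, packages, files, ports):
    """Render a Dockerfile, skipping chown/chgrp for root-owned files."""
    lines = [f"FROM {base}", "RUN apt-get update"]
    for name in packages:
        lines.append(f"RUN apt-get install {name} -y")
    for item in files:
        lines.append(f"ADD {item.source} {item.path}")
        if item.uid != 0:  # files added by ADD are already owned by uid 0
            lines.append(f"RUN chown {item.uid} {item.path}")
        if item.gid != 0:  # and by gid 0
            lines.append(f"RUN chgrp {item.gid} {item.path}")
        lines.append(f"RUN chmod {item.mode} {item.path}")
    for port in ports:
        lines.append(f"EXPOSE {port}")
    return "\n".join(lines) + "\n"


# Assumption: mirrors the "probably" in the text, not verified against APT.
ASSUMED_DEPENDS = {"nginx": ["nginx-full"], "nginx-full": ["nginx-common"]}

packages = prune_dependencies(
    ["nginx", "nginx-full", "nginx-common"], ASSUMED_DEPENDS
)
dockerfile = render_dockerfile(
    "ubuntu",
    packages,
    [FileItem("nginx.conf", "/etc/nginx/nginx.conf", "644")],
    [80],
)
print(dockerfile)
```

With the assumed map, the three install lines collapse to a single "nginx" install and the chown/chgrp lines disappear, since the file is meant to be owned by root anyway.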
After the talk I came across a tweet from the Docker account:
Imagine the situation where you have run a container and then run a few more commands in bash. You now want to reproduce those commands, but can't remember what they were. It was brought up a few times during the meetup that all you will see in "docker history" is the bash command itself, not the actual command(s) run inside it. If you have run a number of commands, you can scan that container using UpGuard and compare it against the base "ubuntu" container scan (or a scan of the image you based your container on) to see how your container has changed. If you select out the changes, you can export them to a Dockerfile. How do you currently solve this problem?
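For packages alone, the idea can be approximated in a few lines of Python. This is a sketch under stated assumptions, not UpGuard's export: the function name is hypothetical, the two dicts stand in for scans of the base image and of your modified container, and the versions are placeholders. It pins each package with "name=version", the same form used in the commented-out lines of the generated Dockerfile above, and it ignores packages that were removed.

```python
def dockerfile_from_package_delta(base_image, base_packages, container_packages):
    """Emit Dockerfile lines that install whatever the container has
    added or changed relative to the base image.

    Hypothetical helper: packages removed from the base are ignored.
    """
    lines = [f"FROM {base_image}", "RUN apt-get update"]
    for name in sorted(container_packages):
        version = container_packages[name]
        if base_packages.get(name) != version:
            # Pin the version so the rebuild matches the scanned container.
            lines.append(f"RUN apt-get install {name}={version} -y")
    return lines


# Placeholder scan results; versions are invented for the example.
base_scan = {"bash": "1.0"}
container_scan = {"bash": "1.0", "openssh-server": "2.0"}

delta_lines = dockerfile_from_package_delta("ubuntu", base_scan, container_scan)
print("\n".join(delta_lines))
```

Users, files and environment variables would need the same treatment before the result could stand in for the forgotten shell session.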
Our interface with Docker and Dockerfiles is evolving. We are planning on scanning and comparing a range of public docker images from the registry to discover the additions people have made. If you are curious, please sign up for a free account and contact us for more information.