07/15/2014: Sharing Files Between Docker Images Using Volumes
Here is the environment script that we'll be sharing between containers:
$ cat bridge-env.sh
export BRIDGENAME=brbob
export IMAGENAME=bob
export IPADDR=10.0.10.1/24
Before any explanations, let's look at the files we'll be using:
./configuration/build_image.sh - wrapper for _docker build_.
./configuration/run_image.sh - wrapper for _docker run_.
./configuration/Dockerfile - control file for Docker image.
./configuration/files/bridge-env.sh - environment setting script.
All of the files are fairly small. Since our main topic today is Docker, let's look at the Docker configuration file first.
$ cat Dockerfile
FROM stackbrew/busybox:latest
MAINTAINER David Medinets <david.medinets@gmail.com>
RUN mkdir /configuration
VOLUME /configuration
ADD files /configuration
And you can build this image.
$ cat build_image.sh
sudo DOCKER_HOST=$DOCKER_HOST docker build --rm=true -t medined/shared-configuration .
I set up my Docker daemon to listen on a TCP port instead of a UNIX socket, so my DOCKER_HOST is "tcp://0.0.0.0:4243". Since _sudo_ is being used, the environment variable needs to be passed into the sudo environment explicitly. If you want to use the default UNIX socket, leave DOCKER_HOST empty. The command will still work.
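By default, sudo resets most of the caller's environment, which is why DOCKER_HOST is re-assigned on the sudo command line. A quick sketch to convince yourself the variable survives the trip (the TCP address is just the one from my setup):

```shell
export DOCKER_HOST=tcp://0.0.0.0:4243
# sudo strips the caller's environment; VAR=value placed after sudo
# sets that variable for just this one command
sudo DOCKER_HOST=$DOCKER_HOST sh -c 'echo "DOCKER_HOST inside sudo: $DOCKER_HOST"'
```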
Then run it.
$ cat run_image.sh
sudo DOCKER_HOST=$DOCKER_HOST docker run --name shared-configuration -t medined/shared-configuration true
This command runs a Docker container called shared-configuration. You'll notice that the _true_ command is run, which exits immediately. Since this container only holds files, it's fine that no processes are running in it. However, be very careful not to delete this container. Here is the output from docker ps showing the container.
$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d4a2aa46b5d9 medined/shared-configuration:latest true 7 seconds ago Exited (0) 7 seconds ago shared-configuration
Now it's time to spin up two plain Ubuntu containers that can access the shared file.
$ sudo DOCKER_HOST=$DOCKER_HOST docker run --name A --volumes-from=shared-configuration -d -t ubuntu /bin/bash
94638de8b615f356f1240bbe602c0b7862e0589f1711fbff242b6d6f74c7de7d
$ sudo DOCKER_HOST=$DOCKER_HOST docker run --name B --volumes-from=shared-configuration -d -t ubuntu /bin/bash
How can we see the shared file? Let's turn to a very useful tool called nsenter (namespace enter). The following command installs nsenter if it isn't already installed.
hash nsenter 2>/dev/null \
|| { echo >&2 "Installing nsenter"; \
sudo DOCKER_HOST=$DOCKER_HOST \
docker run -v /usr/local/bin:/target jpetazzo/nsenter; }
I use a little script file to make nsenter easier to use:
$ cat enter_image.sh
#!/bin/bash
IMAGENAME=$1
usage() {
echo "Usage: $0 [image name]"
exit 1
}
if [ -z "$IMAGENAME" ]
then
echo "Error: missing image name parameter."
usage
fi
PID=$(sudo DOCKER_HOST=$DOCKER_HOST docker inspect --format '{{.State.Pid}}' "$IMAGENAME")
sudo nsenter --target $PID --mount --uts --ipc --net --pid
Run the script with the name of the container you want to enter. For example,
$ ./enter_image.sh A
root@94638de8b615:/# cat /configuration/bridge-env.sh
export BRIDGENAME=brbob
export IMAGENAME=bob
export IPADDR=10.0.10.1/24
root@94638de8b615:/# exit
logout
$ ./enter_image.sh B
root@925365faded2:/# cat /configuration/bridge-env.sh
export BRIDGENAME=brbob
export IMAGENAME=bob
export IPADDR=10.0.10.1/24
root@925365faded2:/# exit
logout
We see the same information in both containers. Now let's prove that bridge-env.sh is a single shared file rather than two separate copies.
$ ./enter_image.sh A
root@94638de8b615:/# echo "export NEW_VARIABLE=VALUE" >> /configuration/bridge-env.sh
root@94638de8b615:/# exit
logout
$ ./enter_image.sh B
root@925365faded2:/# cat /configuration/bridge-env.sh
export BRIDGENAME=brbob
export IMAGENAME=bob
export IPADDR=10.0.10.1/24
export NEW_VARIABLE=VALUE
We changed the file in the first container and saw the changes in the second container. As an alternative to using nsenter, you can simply run a container to list the files.
$ docker run --volumes-from shared-configuration busybox ls -al /configuration
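The same trick works for writing: instead of entering a running container with nsenter, you can mount the volume into a short-lived container and modify the file there. A sketch (the variable name is just an illustration; --rm cleans up the throwaway container afterward):

```shell
# Append a hypothetical variable to the shared file from a throwaway
# busybox container; every container using --volumes-from sees the change.
docker run --rm --volumes-from shared-configuration busybox \
  sh -c 'echo "export ANOTHER_VARIABLE=VALUE" >> /configuration/bridge-env.sh'
```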
07/12/2014: Running a Single-Node Accumulo Docker container
Based on the work by sroegner, I have a GitHub project at https://github.com/medined/docker-accumulo which lets you run multiple single-node Accumulo instances using Docker.
First, create the image.
git clone https://github.com/medined/docker-accumulo.git
cd docker-accumulo/single_node
./make_image.sh
Now start your first container.
export HOSTNAME=bellatrix
export IMAGENAME=bellatrix
export BRIDGENAME=brbellatrix
export SUBNET=10.0.10
export NODEID=1
export HADOOPHOST=10.0.10.1
./make_container.sh $HOSTNAME $IMAGENAME $BRIDGENAME $SUBNET $NODEID $HADOOPHOST yes
And then you can start a second one:
export HOSTNAME=rigel
export IMAGENAME=rigel
export BRIDGENAME=brrigel
export SUBNET=10.0.11
export NODEID=1
export HADOOPHOST=10.0.11.1
./make_container.sh $HOSTNAME $IMAGENAME $BRIDGENAME $SUBNET $NODEID $HADOOPHOST no
And a third!
export HOSTNAME=saiph
export IMAGENAME=saiph
export BRIDGENAME=brsaiph
export SUBNET=10.0.12
export NODEID=1
export HADOOPHOST=10.0.12.1
./make_container.sh $HOSTNAME $IMAGENAME $BRIDGENAME $SUBNET $NODEID $HADOOPHOST no
The SUBNET is different for each container, which isolates the Accumulo instances from each other.
Look at the running containers:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
41da6f17261f medined/accumulo:latest /docker/run.sh saiph 4 seconds ago Up 2 seconds 0.0.0.0:49179->19888/tcp, 0.0.0.0:49180->2181/tcp, 0.0.0.0:49181->50070/tcp, 0.0.0.0:49182->50090/tcp, 0.0.0.0:49183->8141/tcp, 0.0.0.0:49184->10020/tcp, 0.0.0.0:49185->22/tcp, 0.0.0.0:49186->50095/tcp, 0.0.0.0:49187->8020/tcp, 0.0.0.0:49188->8025/tcp, 0.0.0.0:49189->8030/tcp, 0.0.0.0:49190->8050/tcp, 0.0.0.0:49191->8088/tcp saiph
23692dfe3f1e medined/accumulo:latest /docker/run.sh rigel 10 seconds ago Up 9 seconds 0.0.0.0:49166->19888/tcp, 0.0.0.0:49167->2181/tcp, 0.0.0.0:49168->50070/tcp, 0.0.0.0:49169->8025/tcp, 0.0.0.0:49170->8088/tcp, 0.0.0.0:49171->10020/tcp, 0.0.0.0:49172->22/tcp, 0.0.0.0:49173->50090/tcp, 0.0.0.0:49174->50095/tcp, 0.0.0.0:49175->8020/tcp, 0.0.0.0:49176->8030/tcp, 0.0.0.0:49177->8050/tcp, 0.0.0.0:49178->8141/tcp rigel
63f8f1a7141f medined/accumulo:latest /docker/run.sh bella 21 seconds ago Up 20 seconds 0.0.0.0:49153->19888/tcp, 0.0.0.0:49154->50070/tcp, 0.0.0.0:49155->8020/tcp, 0.0.0.0:49156->8025/tcp, 0.0.0.0:49157->8030/tcp, 0.0.0.0:49158->8050/tcp, 0.0.0.0:49159->8088/tcp, 0.0.0.0:49160->8141/tcp, 0.0.0.0:49161->10020/tcp, 0.0.0.0:49162->2181/tcp, 0.0.0.0:49163->22/tcp, 0.0.0.0:49164->50090/tcp, 0.0.0.0:49165->50095/tcp bellatrix
You can connect to running instances using the public ports. Especially useful is the public zookeeper port. Rather than searching through the ports listed above, here is an easier way.
$ docker port saiph 2181
0.0.0.0:49180
$ docker port rigel 2181
0.0.0.0:49167
$ docker port bellatrix 2181
0.0.0.0:49162
The '0.0.0.0' in the response means the port is bound on all of the host's interfaces, so any IP address can connect.
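With the mapped port in hand, you can talk to ZooKeeper directly from the host. A sketch using ZooKeeper's four-letter 'ruok' health probe (the port value shown is the one from the rigel example above; a live server answers "imok"):

```shell
# Split the host:port string that `docker port` prints, then send
# ZooKeeper's "ruok" probe with netcat.
ZK=$(docker port rigel 2181)    # e.g. 0.0.0.0:49167
echo ruok | nc "${ZK%:*}" "${ZK#*:}"
```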
You can enter the namespace of a container (i.e., access a bash shell) this way.
$ ./enter_image.sh rigel
-bash-4.1# hdfs dfs -ls /
Found 2 items
drwxr-xr-x   - accumulo accumulo            0 2014-07-12 09:13 /accumulo
drwxr-xr-x   - hdfs     supergroup          0 2014-07-11 21:06 /user
-bash-4.1# accumulo shell -u root -p secret
Shell - Apache Accumulo Interactive Shell
-
- version: 1.5.1
- instance name: accumulo
- instance id: bb713243-3546-487f-b6d6-cfaa272efb30
-
- type 'help' for a list of available commands
-
root@accumulo> tables
!METADATA
Now let's start an edge node. For my purposes, an edge node can connect to Hadoop, Zookeeper and Accumulo without running any of those processes. All of the edge node's resources are dedicated to client work.
export HOSTNAME=rigeledge
export IMAGENAME=rigeledge
export BRIDGENAME=brrigel
export SUBNET=10.0.11
export NODEID=2
export HADOOPHOST=10.0.11.1
./make_container.sh $HOSTNAME $IMAGENAME $BRIDGENAME $SUBNET $NODEID $HADOOPHOST no
As this container is started, the final 'no' argument means the supervisor configuration files will be deleted. So while supervisord will be running, it won't be managing any processes. This is not a best practice; it's just the approach I chose for this prototype.
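I don't show the make_container.sh internals here, but the effect of that 'no' argument could be approximated inside the container like this. The conf.d path is an assumption based on a typical supervisord layout, not taken from the repository:

```shell
# Hypothetical sketch: remove supervisord program definitions so the
# daemon still starts but manages nothing. Path is assumed.
rm -f /etc/supervisor/conf.d/*.conf
# Tell a running supervisord to drop the now-removed programs.
supervisorctl reread && supervisorctl update
```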