This post is a continuation of my previous post on accessing virtual machines on RedCloud. In this post, you will learn how to run a Docker container on a VM Ubuntu image, but you can also do this tutorial with Ubuntu on a local machine.
Why Containerize Code?
Let’s use an example: Say that you have three Python applications that you want to run on your computer, but they all use different version of Python or its packages. We cannot host these applications at the same time using Python on our computer…so what do we do? Think of another case- you want to share neural network code with a collaborator, but you don’t want to have to deal with the fact that it’s incredibly difficult for them to get TensorFlow and all its dependencies downloaded on their machine. Containerization allows us to isolate the installation and running of our application without having to worry about the setup of the host machine. In other words, everything someone would need to run your code will be provided in the container. Shown in Figure 1, each container shares the host OS’s kernel but has a virtual copy of some of the components of the OS. This allows the containers to be isolated from each other while running on the same host. Therefore, containers are exceptionally light and take only seconds to start .
Docker started as a project to build single-application LXC (linux) containers that were more portable and flexible, but is now its own container runtime environment. It is written in GO and developed by DotCloud (a PaaS company). A Docker engine is used to build and manage a Docker image, which is just a template that contains the application and all of its dependencies that are required for it to run. A Docker container is a running instance of a Docker image. A Docker engine is composed of three main parts: a server known as the Docker daemon, a rest API, and a client. The docker daemon creates and manages images and containers, the rest API helps to link the server and applications, and the client (user) interacts with the docker daemon through the command line (Figure 2).
Running a Docker Container on Ubuntu
- We will be setting up and running a Docker container that contains code to train a rainfall-runoff LSTM model built in Python using Tensorflow. The Github repo with the LSTM code and data can be downloaded here.
- Spin up a VM instance (Ubuntu 18.04) or use your own machine if you have a partition.
- I use MobaXterm to SSH into the VM instance and drag the HEC_HMS_LSTM folder into my home directory.
- Within the directory, you will see 2 additional files that are not part of the main neural network code: requirements.txt and jupyter.dockerfile. A requirements.txt file contains a list of only the packages that are necessary to run your application. You can make this file with just two lines of code: pip install pipreqs and then pipreqs /path/to/project. The created file will look like this:
The jupyter.dockerfile is our Dockerfile. It is a text file that contains all the commands a user could call on the command line to assemble an image.
Here in our Dockerfile, we are using the latest Ubuntu image and setting our working directory to the HEC_HMS_LSTM folder that has the neural network code, Hourly_LSTM.py, that we would like to execute. We start by looking for updates and then install Python, pip, and jupyter. Then we need to take our working directory contents and copy them into the container. We copy the requirements.txt file into the container and then add the whole HEC_HMS_LSTM folder into the container. We then use pip to install all of the packages listed in requirements.txt. Finally, we instruct the docker daemon to run the python script, Hourly_LSTM.py.
5. Next we have to download docker onto Ubuntu using the command line. This is the only thing that anyone will have to download to run your container. The steps to download Docker for Ubuntu through the command line are pretty easy using this link. You may be asked to allow automatic restarts during the installation, and you should choose this option. When Docker is done downloading, you can check to see that the installation was successful by seeing if the service is active.
6. Now that Docker is downloaded, we can build our Docker image and then run the container. We build the image using:
sudo docker build -f jupyter.dockerfile -t jupyter:latest .
In this command, the -f flag denotes the name of the dockerflile and -t is the name that we would like to tag our image with. If you’re running multiple containers, tagging is helpful to distinguish between the containers. The build will go through each step of the dockerfile. Be cognizant of how much space you have on your disk to store the downloaded Docker, the python libraries and the data you will generate; you may have to go back and resize your VM instance. We can then run our image using:
sudo docker run jupyter:latest
You’ll see the neural network training and the results of the prediction.