Containerizing Rhodium with Docker

Ever installed a new library only for it to throw version depreciation errors up on your terminal? Or have warnings print in your output line instead of the figure you so painstakingly coded? Fear not – containerization is here to save the day! But before we get too excited, there are a few things (and terms) to learn about containerization using Docker.

In this post, we will be walking through a brief description of containerization and explain a few of its key terms. At the end, we will perform an exercise by containerizing the Rhodium robust decision-making library. More information about this library and how it can be used for exploratory modeling can be found in this post by Andrew Dircks. For a specific application of this library to the Lake Problem, please refer to this post by Antonia Hadjimichael.

Explaining containerization (and what is a base image?)

Visualizing containerization as a portable, app-specific virtual environment (Source: https://www.armor.com/resources/blog/containerization-the-need-to-know/)

Using the image above, picture your hardware (laptop, desktop, supercomputer) as a large cargo ship, with its engines being its operating system. In the absence of containerization, an application (app) is developed in a specific computing environment, akin to placing cargo in a permanent storage hold under the deck of a ship. Methods for cargo loading and removal are strongly dictated by the shape and size of the ship. Similarly, a non-containerized app can only be reliably executed given that it is installed in a computing environment that is nearly or almost completely identical to that in which is was developed in.

On the contrary, containerization bundles everything an app might need to run in a ‘container’ – the code, its required libraries, and their associated dependencies – therefore enabling an app to be run consistently on any infrastructure. By extension, this renders a containerized application version- and operating system (OS)-independent. These ‘containers’ are thus easily loaded and installed onto any ‘cargo ship’. The piece of software that enables the efficient execution of containerized apps is the container engine. This nifty tool is responsible for handling app user input and ensuring the correct installation, startup and running of the containerized app. The engine also pulls, loads, and builds the container image, which is a (misleadingly-named) file, or repository of files, that contains all the information that the engine will need to build the app on a new machine.

In this post, we will be walking through the containerization of the Rhodium library using Docker, which is a container hub that let’s you develop, store and build your container images. It is the first and most commonly-used container hub (at the moment).

Let’s containerize!

Setup

If you use either a Windows or Mac machine, please install Docker Desktop from this site. Linux machines should first install Docker and then Docker Compose. Make sure to create an account and login.

Next, clone the PRIM, Platypus, and Rhodium repositories onto your local machine. You can directly download a .zip file of the repository here or you can clone the repository via your command line/terminal into a folder within a directory of your choice:

C:\Users\username> cd directory-of-choice
C:\Users\username\directory-of-choice>git clone https://github.com/lbl59/Rhodium.git
C:\Users\username\directory-of-choice>git clone https://github.com/Project-Platypus/Platypus.git
C:\Users\username\directory-of-choice>git clone https://github.com/Project-Platypus/PRIM.git

Great, your repositories are set and ready to go! These should result in three new folders: Rhodium, Platypus, and PRIM. Now, in the same terminal window, navigate to the PRIM folder and run the following:

C:\Users\username\directory-of-choice\PRIM>python setup.py develop

Repeat for the Platypus folder. This is to make sure that you have both PRIM and Project Platypus installed and setup on your local machine.

Updating the requirements.txt file

Now, navigate back to the original directory-of-choice. Open the Rhodium folder, locate and open the requirements.txt file. Modify it so it looks like this:

matplotlib==3.5.1
numpy==1.22.1
pandas==1.4.0
mpldatacursor==0.7.1
six==1.16.0
scipy==1.7.3
prim
platypus-opt
sklearn==1.0.2

This file tells Docker that these are the required versions of libraries to install when building and installing your app.

Creating a Dockerfile

To begin building the container image for Docker to pack and build as Rhodium’s container, first create a new text file and name is Dockerfile within the Rhodium folder. Make sure to remove the .txt extension and save it as “All types” to avoid appending an extension. Open it using whichever text file you are comfortable with. The contents of this file should look like as follows. Note that the comments are for explanatory purposes only.

# state the base version of Python you are working with
# for my machine, it is Python 3.9.1
FROM python:3.9.1

# set the Rhodium repository as the container
WORKDIR /rhodium_app

# copy the requirements file into the new working directory
COPY requirements.txt .

# install all libraries and dependencies declared in the requirements file
RUN pip install -r requirements.txt

# copy the rhodium subfolder into the new working directory
# find this subfolder within the main Rhodium folder
COPY rhodium/ .

# this is the command the run when the container starts
CMD ["python", "./setup.py"]

The “.” indicates that you will be copying the file from your present directory into your working directory.

Build the Docker image

Once again in your terminal, check that you are in the same directory as before. Then, type in the following:

$ docker build -t rhodium_image .

Hit enter. If the containerization succeeded, you should see the following in your terminal (or something similar to it):

Containerization successful!

Congratulations, you have successfully containerized Rhodium! You are now ready for world domination!

References

2022. [online] Available at: <https://www.redhat.com/en/topics/cloud-native-apps/what-is-containerization&gt; [Accessed 1 February 2022].

Education, I., 2022. containerization. [online] Ibm.com. Available at: <https://www.ibm.com/cloud/learn/containerization&gt; [Accessed 1 February 2022].

Iordache, A., Scott, D., Turner, S. and Dalleau, F., 2022. Containerized Python Development – Part 1 – Docker Blog. [online] Docker Blog. Available at: <https://www.docker.com/blog/containerized-python-development-part-1/&gt; [Accessed 1 February 2022].

Runnable Docker Guides. 2022. Dockerize your Python Application. [online] Available at: <https://runnable.com/docker/python/dockerize-your-python-application&gt; [Accessed 1 February 2022].