Simple profiling checks for running jobs on clusters

The goal of this short blog post is to share some simple tips on profiling jobs you have submitted (or are about to submit) to high performance computing resources. Profiling your jobs can give you information about how efficiently you are using your computational resources, i.e., your CPUs and your allocated memory. Typically you would perform these checks on your experiment at a smaller scale, ensuring that everything is working as it should, before expanding to more tasks and CPUs.

Your first check is squeue, typically paired with your username on the cluster. Here’s an example:

(base) [ah986@login02 project_dir]$ squeue -u ah986
             JOBID PARTITION     NAME      USER  ST       TIME  NODES NODELIST(REASON) 
           5688212    shared <job_name>    ah986  R       0:05      1 exp-4-55 

This tells me that my submitted job is utilizing 1 node in the shared partition of this cluster. If your cluster uses the SLURM scheduler, you can also use sacct, which displays accounting data for all jobs you are currently running or have run in the past. There are many pieces of information available with sacct, which you can select using the --format flag. Here’s an example for the same job:

(base) [ah986@login02 project_dir]$ sacct --format=JobID,partition,state,time,start,end,elapsed,nnodes,ncpus,nodelist,AllocTRES%32 -j 5688212
       JobID  Partition      State  Timelimit               Start                 End    Elapsed   NNodes      NCPUS        NodeList                        AllocTRES 
------------ ---------- ---------- ---------- ------------------- ------------------- ---------- -------- ---------- --------------- -------------------------------- 
5688212          shared    RUNNING   20:00:00 2021-09-08T10:55:40             Unknown   00:19:47        1        100        exp-4-55 billing=360000,cpu=100,mem=200G+ 
5688212.bat+               RUNNING            2021-09-08T10:55:40             Unknown   00:19:47        1        100        exp-4-55          cpu=100,mem=200G,node=1 
5688212.0                  RUNNING            2021-09-08T10:55:40             Unknown   00:19:47        1        100        exp-4-55          cpu=100,mem=200G,node=1 

In this case I can see the number of nodes (1) and the number of cores (100) utilized by my job as well as the resources allocated to it (100 CPUs and 200G of memory on 1 node). This information is useful in cases where a task launches other tasks and you’d like to diagnose whether the correct number of cores is being used.

Another useful tool is seff, which is actually a wrapper around sacct and summarizes your job’s overall performance. It is a little unreliable while the job is still running, but after the job is finished you can run:

(base) [ah986@login02 project_dir]$ seff 5688212
Job ID: 5688212
Cluster: expanse
User/Group: ah986/pen110
State: COMPLETED (exit code 0)
Nodes: 1
Cores per node: 100
CPU Utilized: 1-01:59:46
CPU Efficiency: 68.16% of 1-14:08:20 core-walltime
Job Wall-clock time: 00:22:53
Memory Utilized: 38.25 GB
Memory Efficiency: 19.13% of 200.00 GB

The information here is very useful if you want to find out how efficiently you’re using your resources. For this example I had 100 separate tasks I needed to perform, and I requested 100 cores on 1 node and 200 GB of memory. These results tell me that my job completed in 23 minutes or so, the total time using the CPUs (CPU Utilized) was 1-01:59:46 (about 26 hours of core time), and most importantly, the efficiency of my CPU use. CPU Efficiency is calculated “as the ratio of the actual core time from all cores divided by the number of cores requested divided by the run time”, in this case 68.16%. What this means is that I could be utilizing my cores more efficiently by allocating fewer cores to the same number of tasks, especially if scaling up to a larger number of nodes/cores. Additionally, my allocated memory is underutilized, and I could be requesting a smaller memory allocation without inhibiting my runs.
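
As a quick sanity check of that number, you can reproduce the CPU Efficiency from the values seff reports, converting the times to seconds by hand:

# rough sanity check of the CPU Efficiency seff reported above
CPU_UTILIZED=$(( 1*86400 + 1*3600 + 59*60 + 46 ))     # 1-01:59:46 in seconds
CORE_WALLTIME=$(( 100 * (22*60 + 53) ))               # 100 cores x 00:22:53 wall clock
echo "scale=4; $CPU_UTILIZED / $CORE_WALLTIME" | bc   # ~.6816, i.e. 68.16%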

Finally, while your job is still running you can log in to the node(s) executing it to look at live data. To do so, simply ssh to one of the nodes listed under NODELIST (not all clusters allow this). From there, you can run the top command as shown below (with your own username), which will start the live task manager:

(base) [ah986@r143 ~]$ top -u ah986

top - 15:17:34 up 25 days, 19:55,  1 user,  load average: 0.09, 12.62, 40.64
Tasks: 1727 total,   2 running, 1725 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.3 us,  0.1 sy,  0.0 ni, 99.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem : 257662.9 total, 249783.4 free,   5561.6 used,   2317.9 buff/cache
MiB Swap: 716287.0 total, 716005.8 free,    281.2 used. 250321.1 avail Mem 

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                                                              
 78985 ah986     20   0  276212   7068   4320 R   0.3   0.0   0:00.62 top                                                                                                                                  
 78229 ah986     20   0  222624   3352   2936 S   0.0   0.0   0:00.00 slurm_script                                                                                                                         
 78467 ah986     20   0  259464   8128   4712 S   0.0   0.0   0:00.00 srun                                                                                                                                 
 78468 ah986     20   0   54520    836      0 S   0.0   0.0   0:00.00 srun                                                                                                                                 
 78481 ah986     20   0  266404  19112   4704 S   0.0   0.0   0:00.24 parallel                                                                                                                             
 78592 ah986     20   0  217052    792    720 S   0.0   0.0   0:00.00 sleep                                                                                                                                
 78593 ah986     20   0  217052    732    660 S   0.0   0.0   0:00.00 sleep                                                                                                                                
 78594 ah986     20   0  217052    764    692 S   0.0   0.0   0:00.00 sleep                                                                                                                                
 78595 ah986     20   0  217052    708    636 S   0.0   0.0   0:00.00 sleep                                                                                                                                
 78596 ah986     20   0  217052    708    636 S   0.0   0.0   0:00.00 sleep                                                                                                                                
 78597 ah986     20   0  217052    796    728 S   0.0   0.0   0:00.00 sleep                                                                                                                                
 78598 ah986     20   0  217052    732    660 S   0.0   0.0   0:00.00 sleep       

Memory and CPU usage can be tracked in the RES and %CPU columns, respectively. In this case, for the sake of an example, I just assigned all my cores to sleep a certain number of minutes each (using no CPU or memory). Similar information can also be obtained using the ps command, with memory being tracked under the RSS column.

 (base) [ah986@r143 ~]$ ps -u$USER -o %cpu,rss,args
%CPU   RSS COMMAND
 0.0  3352 /bin/bash /var/spool/slurm/d/job3509431/slurm_script
 0.0  8128 srun --export=all --exclusive -N1 -n1 parallel -j 100 sleep {}m ::: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
 0.0   836 srun --export=all --exclusive -N1 -n1 parallel -j 100 sleep {}m ::: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45
 0.1 19112 /usr/bin/perl /usr/bin/parallel -j 100 sleep {}m ::: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
 0.0   792 sleep 3m
 0.0   732 sleep 4m
 0.0   764 sleep 5m
 0.0   708 sleep 6m
 0.0   708 sleep 7m
 0.0   796 sleep 8m
 0.0   732 sleep 9m
 0.0   712 sleep 10m
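
If your cluster does not allow ssh to compute nodes, sstat offers a rough alternative for a running job. This is only a sketch: the available fields and whether you need to name a specific job step (e.g., the .batch step as below) vary with the SLURM version and site configuration.

sstat --format=JobID,AveCPU,AveRSS,MaxRSS -j 5688212.batch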

Performing Experiments on HPC Systems

A lot of the work we do in the Reed lab involves running computational experiments on High Performance Computing (HPC) systems. These experiments often consist of performing multi-objective optimization to aid decision making in complex systems, a task that requires hundreds of thousands of simulation runs and may not be possible without the use of thousands of computing cores. This post will outline some best practices for performing experiments on HPC systems.

1. Have a Plan

By nature, experiments run on HPC systems consume a large amount of computational resources and generate large amounts of data. In order to stay organized, it’s important to have a plan for both how the computational resources will be used and how the data will be managed.

Estimating your computational resources

Estimating the scale of your experiment is the first step to running on an HPC system. To make a reasonable estimate, you’ll need to gather the following pieces of information:

  • How long (in wall clock time) does a single model run take on your local machine?
  • How many function evaluations (for an MOEA run) or ensemble model runs will you need to perform?
  • How long do you have in wall clock time to run the experiment?

Using this information you can estimate the number of parallel processes that you will need to successfully run the experiment. Applications such as running the Borg MOEA are known as “embarrassingly parallel” and scale quite well with an increase in processors, especially for problems with long function evaluation times (see Hadka and Reed, 2013 for more info). However, many other applications scale poorly, so it’s important to be aware of the parallel potential of your code. A helpful tip is to identify any inherently serial sections of the code which create bottlenecks to parallelization. Parallelizing tasks such as Monte Carlo runs and MOEA function evaluations will often result in higher efficiency than parallelizing the simulation model itself. For more resources on how to parallelize your code, see Bernardo’s post from last year.
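
As a back-of-the-envelope sketch with made-up numbers, the core count for an embarrassingly parallel ensemble is roughly the total serial compute time divided by the wall clock time you have available:

# hypothetical experiment: 50,000 runs at 2 minutes each, 24 hours of wall clock available
RUNS=50000
MINUTES_PER_RUN=2
WALL_HOURS=24
echo $(( (RUNS * MINUTES_PER_RUN) / (WALL_HOURS * 60) ))   # ~70 cores, before any parallel overhead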

Once you have an idea of the scale of your experiment, you’ll need to estimate its computational expense. Each HPC resource has its own charging policy to track resource consumption. For example, XSEDE tracks charges in “service units”, which are defined differently for each cluster. On the Stampede2 cluster, a service unit is defined as one node-hour of computing time, so if you run on 100 nodes for 10 hours, you spend 1,000 service units regardless of how many cores per node you utilize. On the Comet cluster, a service unit is charged by the core-hour, so if you run 100 nodes for 10 hours and each utilizes 24 cores, you’ll be charged 24,000 service units. Usually, the allocations you receive on each resource will be scaled accordingly, so even though Comet looks more expensive, you likely have a much larger allocation to work with. I usually make an estimate of the service units I need for an experiment and add another 20% as a factor of safety.
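
To make the arithmetic concrete (a sketch only; check your own cluster’s charging policy), here is the 100-node, 10-hour example above under both accounting schemes, with the 20% safety factor added at the end:

NODES=100; HOURS=10; CORES_PER_NODE=24
NODE_HOUR_SU=$(( NODES * HOURS ))                   # Stampede2-style node-hours: 1,000 SUs
CORE_HOUR_SU=$(( NODES * CORES_PER_NODE * HOURS ))  # Comet-style core-hours: 24,000 SUs
echo "request $(( NODE_HOUR_SU * 12 / 10 )) or $(( CORE_HOUR_SU * 12 / 10 )) SUs with a 20% buffer"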

Data management:

Large experiments often create proportionately large amounts of data. Before you start, it’s important to think about where this data will be stored and how it will be transferred to and from the remote system. Many clusters have limits on how much you can store on different drives, and breaking these limits can cause performance issues for the system. System administrators often don’t take kindly to these performance issues, and in extreme cases breaking the rules may result in suspension or removal from a cluster. It helps to create an informal data management plan for yourself that specifies:

  1. How will you transfer large amounts of data to and from the cluster? Tools such as Globus are helpful here (a minimal rsync sketch follows this list).
  2. Where will you upload your code and how will your files be structured?
  3. Where will you store data during your experimental runs? Clusters often have “scratch drives” with large or unlimited storage allocations, but these drives may be cleaned periodically, so they are not suitable for long-term storage.
  4. Where will you store data during post processing? This may still be on the cluster if your post processing is computationally intensive or your local machine can’t handle the data size.
  5. Where will you store your experimental results and model data for publication and replication?
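
For item 1, here is a minimal sketch using rsync; the hostnames and paths are placeholders, and for very large transfers a service like Globus (covered later in this post) is usually the better choice:

# hostnames and paths below are placeholders
# push input data up to the cluster's scratch space
rsync -avz --progress ./input_data/ <username>@login.cluster.edu:/scratch/<username>/project/input_data/
# pull results back down once the experiment finishes
rsync -avz --progress <username>@login.cluster.edu:/scratch/<username>/project/results/ ./results/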

2. Test on your local machine

To make the most of your time on a cluster, it’s essential that you do everything you can to ensure your code is functioning properly and efficiently before you launch your experiment. The biggest favor you can do for yourself is to thoroughly test your code on a local machine before porting it to the HPC cluster. Before porting to a cluster, I always run the following four checks:

  1. Unit testing: the worst-case scenario after an HPC run is finding out there was a major error in your code that invalidates your results. To mitigate this risk as much as possible, it’s important to have careful quality control. One helpful tool for quality control is unit testing, which examines every function or module in your code and ensures it is working as expected. For an introduction to unit testing, see Bernardo’s post on Python and C++.
  2. Memory checking: in low-level code (think C, C++), memory leaks can be a silent problem that throws off your results or crashes your model runs. Sometimes memory leaks go undetected during small runs but add up and crash your system in large-scale experiments. To test for memory leaks, make sure to use tools such as Valgrind before uploading your code to any cluster. This post features a nice introduction to Valgrind with C++.
  3. Timing and profiling: profile your code to see which parts take longest and eliminate unnecessary sections. If you can, optimize your code to avoid computationally intensive pieces such as nested loops. There are numerous tools for profiling low-level code, such as Callgrind and gprof (a short sketch of checks 2 and 3 follows this list). See this post for some tips on making your C/C++ code faster.
  4. Small MOEA tests: running small test runs (on the order of 1,000 NFE) will give you an opportunity to ensure that the model is properly interacting with the MOEA (i.e., objectives are connected properly, decision variables are being changed, etc.). Make sure you are collecting runtime information so you can evaluate algorithmic performance.
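
Here is a minimal sketch of what checks 2 and 3 might look like for a small C model; the file names and arguments are placeholders:

# my_model.c and test_input.txt are placeholders for your own code and inputs
# check 2: memory checking with Valgrind; compile with -g so the report includes line numbers
gcc -g -O0 -o my_model my_model.c
valgrind --leak-check=full ./my_model test_input.txt

# check 3: profiling with gprof; compile with -pg, run once, then read the report
gcc -pg -O2 -o my_model my_model.c
./my_model test_input.txt
gprof ./my_model gmon.out | less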

3. Stay organized on the cluster

After you’ve fully tested your code, it’s time to upload it and port to the cluster. After transferring your code and data, you’ll likely need to use the command line to navigate files and run your code. A familiarity with Linux command line tools, bash scripting, and command line editors such as vim can make your life much easier at this point. I’ve found some basic Linux training modules online to be very useful; in particular, “Learning Linux Command Line” from LinkedIn Learning (formerly Lynda.com).

Once you’ve got your code properly organized and compiled, validate your timing estimate by running a small ensemble of simulation runs. Compare the timing on the cluster to your estimates from local tests and reconfigure your model runs if needed. If performing an experiment with an MOEA, run a small number of NFE on a development or debug node to confirm that the algorithm is running and properly parallelizing. Then run a single seed of the MOEA and perform runtime diagnostics to ensure things are working more or less as you expect. Finally, you’re ready to run the full experiment.
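
A minimal sketch of the timing validation mentioned above, assuming a hypothetical serial executable my_model and GNU time; compare the reported times against what you measured locally:

# my_model and its inputs are placeholders; run a handful of cases on a development/debug node
for i in 1 2 3 4 5; do
    /usr/bin/time -f "run $i took %e seconds" ./my_model test_input_${i}.txt
done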

I’d love to hear your thoughts and suggestions

These tips have been derived from my experiences running on HPC systems, if anyone else has tips or best practices that you find useful, I’d love to hear about them in the comments.

Running jobs on the supercomputer: JANUS

The power of supercomputing is undeniable. However, there is often a hurdle in syntax to get jobs to run on them. What I’m including below are ways to submit jobs to run on the CU-Boulder supercomputer, JANUS, which I hope will be helpful.

To log on, open up a terminal window (e.g. Terminal on a Mac or CygWin on a PC): ssh <username>@login.rc.colorado.edu

To copy items to JANUS from a shell, simply use the following:

scp <path and filename on local machine>   <username>@login.rc.colorado.edu:<destination path on JANUS>/

The purpose of the job script is to tell JANUS where to run the job. I will cover two types of job scripts: (1) to submit a job to an entire node, and (2) to submit to a single processor. Note that nodes on JANUS contain multiple processors, usually more than 12, so if you have a memory-intensive job you may wish to use the former. Also, the jobs that occupy entire nodes offer the user a larger number of total processors to work with (several thousand cores versus several hundred). Here are the examples:

1. An example script to submit to an entire node is below. The body of text should be saved to a text file with a “.sh” suffix (i.e., a shell script). Notice that lines beginning with “#” are shell comments, although lines beginning with “#SBATCH” are read by the scheduler as directives. To submit the script, first be sure you’ve loaded the slurm module, then use sbatch:

module load slurm

sbatch <path and filename of script>

#!/bin/bash
# Lines starting with #SBATCH are interpreted by slurm as arguments.
#

# Set the name of the job, e.g. MyJob
#SBATCH -J MyJob

#
# Set a walltime for the job. The time format is HH:MM:SS - In this case we run for 12 hours. **Important, this length should be commensurate with the type of node
# you're submitting to, debug is less than 1 hour, but others can be much longer, check the online documentation for assistance

#SBATCH --time=12:00:00
#
# Select one node
#
#SBATCH -N 1

# Request 12 tasks on the node (roughly one task per processor on a 12-core node)
#SBATCH --ntasks-per-node 12
# Set output file name with job number

#SBATCH -o MyJob-%j.out

# Use the standard 'janus' queue. This is confusing as the online documentation is incorrect, use the below to get a simple 12 core node

#SBATCH --qos janus

# The following commands will be executed when this script is run.

# **Important, in order to get 12 commands to run at the same time on your node, enclose them in parentheses "()" and follow them with an ampersand "&"

# to get all jobs to run in the background. The last thing is to be sure to include a "wait" command at the end, so that the job script waits to terminate until these

# jobs complete. Theoretically you could have more than 12 commands below.

# ** Note replace the XCMDX commands below with the full path to your executable as well as any command line options exactly how you'd run them from the

# command line.

echo The job has begun

(XCMD1X) &

(XCMD2X) &

(XCMD3X) &

(XCMD4X) &

(XCMD5X) &

(XCMD6X) &

(XCMD7X) &

(XCMD8X) &

(XCMD9X) &

(XCMD10X) &

(XCMD11X) &

(XCMD12X) &

# wait ensures that job doesn't exit until all background jobs have completed

wait


2. Example script to submit to a single processor is below. The process is almost identical to above, except for 4 things: (i) the queue that we’ll submit to is called ‘serial’, (ii) number of tasks per node is 1, (iii) the number of executable lines is 1, and (iv) we do not need the “wait” command.

#!/bin/bash

# Lines starting with #SBATCH are interpreted by slurm as arguments.

#

# Set the name of the job, e.g. MyJob

#SBATCH -J MyJob

#

# Set a walltime for the job. The time format is HH:MM:SS - In this case we run for 6 hours. **Important, this length should be commensurate with the type of node

# you're submitting to, debug is less than 1 hour, but others can be much longer, check the online documentation for assistance

#SBATCH --time=6:00:00

#

# Select one node

#

#SBATCH -N 1

# Select one task per node (similar to one processor per node)

#SBATCH --ntasks-per-node 1

# Set output file name with job number

#SBATCH -o MyJob-%j.out

# Use the standard 'serial' queue. This is confusing as the online documentation is incorrect, use the below to get a single processor

#SBATCH --qos serial

# The following commands will be executed when this script is run.

# ** Note replace the XCMDX commands below with the full path to your executable as well as any command line options exactly how you'd run them from the

# command line.

echo The job has begun

XCMDX


Globus Connect for Transferring Files Between Clusters and Your Computer

I recently learned about a service called Globus Online that allows you to easily transfer files to the cluster.  It’s similar to WinSCP or SSH, but the transfers can happen in the background and get resumed if they are interrupted.  It is supported by the University of Colorado: https://www.rc.colorado.edu/filetransfer and the NSF XSEDE machines: https://www.xsede.org/globus-online. Also, courtesy of Jon Herman, a note about Blue Waters: There is an endpoint called ncsa#NearLine where you can push your data for long-term storage (to avoid scratch purges). However on NearLine there is a per-user quota of 5TB. So if you find yourself mysteriously unable to transfer any more files, you’ll know why.

To get started, first create a Globus Online account.  Then, you’ll need to create “endpoints” on your account.  The obvious endpoint is, say, the cluster.  The University of Colorado, for example, has instructions on how to add their cluster to your account.  Then, you need to make your own computers endpoints!  To do this, click Manage Endpoints and then click “Add Globus Connect.”  Give your computer a name, and it will generate a unique key that you can then use in the desktop application for the service.  Download the program for Mac, Unix, or Windows.  The cool thing is you can do this on all your computers.  For example, I have a computer called MacBookAir that uses OSX, and another one called MyWindows8 or something like that, which uses Windows.

File transfers are then initiated as usual, only you’re using a web interface instead of a standalone program.

As usual feel free to comment in the comments below.

Running tmux on the cluster

tmux is a terminal multiplexer — a program that lets you do more than one thing at a time in a terminal window. For example, tmux lets you switch between running an editor to modify a script and running the script itself without exiting the editor. As an added bonus, should you lose your ssh connection, your programs are still running inside tmux and you can bring them up again when you re-connect. If you’ve ever used GNU Screen, it’s the same idea.

Here’s what it looks like in the MinTTY terminal that comes with Cygwin.   In this example I’ve split the window across the middle, with an editor open in the bottom and a command prompt at the top.  (You can do vertical splits too.)

(Figure: tmux terminal multiplexer showing a split window)
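
If you have never used it, here is a minimal set of commands to get going (the session name is arbitrary):

tmux new -s work        # start a new session named "work"
# inside tmux: Ctrl-b " splits the window top/bottom, Ctrl-b % splits it side by side,
# Ctrl-b o moves between panes, and Ctrl-b d detaches while leaving everything running
tmux attach -t work     # re-attach after detaching or after losing your ssh connection
tmux ls                 # list the sessions currently running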

Building tmux

I recently built tmux again. It’s pretty easy to do, and it takes up less than 7 megabytes of disk space. Here’s how:

Make a directory for your own installs. Mine is called ~/local

mkdir ~/local

You’ll probably have to build libevent 2.0 first if, like me, you don’t already have it on your system.

Make a directory for building things, if you haven’t already got one.

mkdir build
cd build

Download libevent. Be sure to replace 2.0.21 with the latest stable version.
wget https://github.com/downloads/libevent/libevent/libevent-2.0.21-stable.tar.gz

Extract libevent

tar xzvf libevent-2.0.21-stable.tar.gz
cd libevent-2.0.21-stable

Build libevent. I used an absolute path with the prefix to be on the safe side.

./configure --prefix=/path/to/my/home/directory/local
make
make install

Download and extract tmux. Get the latest version, which is 1.8 at the time of this writing.

wget http://downloads.sourceforge.net/tmux/tmux-1.8.tar.gz
tar xzvf tmux-1.8.tar.gz
cd tmux-1.8

If you rolled your own libevent, you’ll need to set appropriate CFLAGS and LDFLAGS when you run configure. Otherwise you can skip export CFLAGS and export LDFLAGS.

export CFLAGS=-I/path/to/my/home/directory/local/include
export LDFLAGS="-Wl,-rpath=/path/to/my/home/directory/local/lib -L/path/to/my/home/directory/local/lib"
./configure --prefix=/path/to/my/home/directory/local
make
make install

Then add tmux to your path, and you’re done. Do this by putting the following lines in your .bashrc:

export MYPACKAGES=/path/to/my/home/directory/local/
export PATH=${MYPACKAGES}/bin:${PATH}
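
A quick way to confirm the install worked (assuming your login shell reads .bashrc):

source ~/.bashrc
which tmux    # should point into ~/local/bin
tmux -V       # prints the installed version, e.g. tmux 1.8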

Python for automating cluster tasks: Part 2, More advanced commands

This is part 2 in a series about using Python for automating cluster tasks.  Part 1 is here. (For more on Python, check out: another tutorial part one and two, tips on setting up Python and Eclipse, and some specific examples including a cluster submission guide and a script that re-evaluates solutions using a different simulation model)

Edit: Added another example in the “copy” section below!

Welcome back!  Let’s continue our discussion of basic Python commands.  Let’s start by modifying our last code sample to facilitate random seed analysis.  Now, instead of writing one file we will write 50 new files.  This isn’t exactly how we’ll do the final product, but it will be helpful to introduce loops and some other string processing.

Loops and String Processing

import re
import os
import sys
import time
from subprocess import Popen
from subprocess import PIPE

def main():
    #the input filename and filestream are handled outside of the loop.
    #but the output filename and filestream have to occur inside the loop now.
    inFilename = "borg.c"
    inStream = open(inFilename, 'rb')

    for mySeed in range(1,51):
        outFilename = "borgNew.seed" + str(mySeed) + ".c"
        outStream = open(outFilename, 'w')

        print "Working on seed %s" % str(mySeed)

        for line in inStream:
            if "int mySeed" in line:
                newString = " int mySeed = " + str(mySeed) + ";\n"
                outStream.write(newString)
            else:
                outStream.write(line)
        outStream.close()
        inStream.seek(0) #reset the input file so you can read it again

if __name__ == "__main__":
    main()

Above, the range function allows us to iterate through a range of numbers. Note that the last member of the range is never included, so range(1,51) goes from 1 to 50. Also, now we have to be concerned with making sure our files are closed properly and that the input stream gets ‘reset’ every time. There may be a more efficient way to write this code, but sometimes it’s better to be more explicit to be sure that the code is doing exactly what you want it to. Also, if you had to rewrite multiple lines, it would be helpful to structure your loops the way I have them here.

By the way, after you run the sample program, you may want to do something like “rm borgNew*” to remove all the files you just created.

Calling System Commands

Ok great, so now you can use Python to modify text files. What if you have to do something else in your workflow, such as copy files? Move them? Rename them? Call programs? Basically, you want your script to be able to do anything that you would do on the command line, or call system commands. For some background, check out this post on Stack Overflow, talking about the four or five different ways to call external commands in Python.

The code sample is below. Note that there are two different ways to use the call command. Using “shell=True” allows you to have access to certain features of the shell such as the wildcard operator. But be careful with this! Accessing the shell directly can lead to problems, as discussed here.

import re
import os
import sys
import time
from subprocess import Popen
from subprocess import PIPE
from subprocess import call

def main():

    print "Listing files..."
    call(["ls", "-l"])

    print "Showing the current working directory..."
    call(["pwd"])

    print "Now making ten copies of borg.c"
    for i in range(1,11):
        print "Working on file %s" % str(i)
        newFilename = "borgCopy." + str(i) + ".c"
        call(["cp", "borg.c", newFilename])
    print "All done copying!"

    print "Here's proof we did it.  Listing the directory..."
    call(["ls", "-l"])

    print "What a mess.  Let's clean up:"
    call("rm borgCopy*", shell=True)
    #the above is needed if you want to use a wildcard, see:
    #http://stackoverflow.com/questions/11025784/calling-rm-from-subprocess-using-wildcards-does-not-remove-the-files

    print "All done removing!"

if __name__ == "__main__":
    main()

You may also remember that there are multiple ways to call the system. You can use subprocess to, in a sense, open a shell and call your favorite Linux commands… or you can use Python’s os library to do some of the tasks directly. Here’s an example of how to create some directories and then copy the files into the directory. Thanks to Amy for helping to write some of this code:

import os
import shutil
import subprocess

print os.getcwd()
global src
src = "myFile.txt" #or whatever your file is called

for i in range(51, 53): #remember this will only do it for 51 and 52
   newFoldername = 'seed'+str(i)
   if not os.path.exists(newFoldername):
      os.mkdir(newFoldername)
   print "Listing files..."
   subprocess.call(["ls", "-l"])
   shutil.copy(src, newFoldername)
   #now, we should change to the new directory to see if the
   #copy worked correctly
   os.chdir(newFoldername)
   subprocess.call(["ls", "-l"])
   #make sure to change back
   os.chdir("..")

Conclusion

These two pieces of the puzzle should open up a lot of possibilities to you, as you’re setting up your jobs. Let us know if you want more by posting in the comments below!

Python for automating cluster tasks: Part 1, Getting started

Yet another post in our discussion of Python.  (To see more, check out: a tutorial part one and two, tips on setting up Python and Eclipse, and some specific examples including a cluster submission guide and a script that re-evaluates solutions using a different simulation model)

If you’re just getting into using MOEAs and simulation models together, you may have spent some time getting familiar with how to get the two technologies to “talk” to one another, and you may have solved a problem or two.  But now you may realize, well there’s more to the process than just running a MOEA once.  The following are some reasons why you may need to set up a “batch” submission of multiple MOEA runs:

1. Random Seed Analysis: A MOEA is affected by the random number generator used to generate initial random solutions, generate random or synthetic input data, and perform variation operations to develop new solutions. Typically, a random seed analysis is used to test how robust the algorithm is to different sets of random numbers.

2. Diagnostic Analysis of Algorithm Performance: MOEAs have parameters (population size, run duration, crossover probability, etc.).  We’ve discussed methods to evaluate how well the algorithm does across a wide array of different parameter values (see, for example, this post).

3. Running Multiple Problem Formulations: Perhaps you want to compare a 2-objective problem with a 10-objective problem.  Or, you use different simulation models (a screening-level model, a metamodel, all the way to a physics-based process model).  All of these changes would require running the MOEA more than once.

4. Learning More about your Problem: Even if you’re not trying 1-3, you may just need to run the MOEA for a problem that you’re still developing.  You make an educated guess about the best set of objectives, decisions, and constraints, but by analyzing the results you could see that this needs to change.  We use the term a posteriori to describe the process, because you don’t specify preferences between objectives, etc., until after you generate alternatives.  So it’s an interactive process after you start running the algorithm.

The manual approach

For this illustration, here are some assumptions on what you’re up to.  First, you are using the Borg MOEA or MOEAframework, with a simulation model (either the source code, or you’ve written a wrapper to call it).  You’ve set up one trial of the simulation-optimization process, maybe using some of the resources on this blog!  You are using cluster computing and you have a terminal connection to the cluster all set up.  And, python is installed on the cluster.

A typical workflow might look something like this.  (Thanks to my student Amy Piscopo for being the “guinea pig” here).  Some of these steps are kind of complicated, you may or may not have them.

1. Change the simulation or optimization source code.  Typically we set up programming so you don’t need to actually change the code and re-compile if you’re making a simple change.  For example, if there’s a really important parameter in your simulation code, you can pass it in as a command line argument.  But sometimes you can’t avoid this, so step one is that you might have to go into a text editor and modify code.  (Example: on line 2268, change “int seed = 1” to “int seed = 2”.)

2. Compile the simulation or optimization source code. Any change to the source code requires recompiling before you run the program.   (Example: write a command that compiles the files directly, such as in the Borg README, or use a makefile and the command “make”.)

3. Modify a submission script. You thought your experiment was going to take 8 hours, and it really takes 24.  Oops.  Typically, when you create a “submission script” for the cluster, you need to tell it a few things about your job: what queue you want, how long to run, how many processors to request, and then the specific commands you need to run.  Another thing to consider with multiple runs is that the command that you are specifying may actually change. (Example: Changing the submission script to say “borg.exe -s 2” instead of “borg.exe -s 1”)

4. Make multiple run folders, and copy all files into each folder. Ok, so you’ve made the necessary modifications for, say, “Seed 1” and “Seed 2”.  Or, “Problem 1” and “Problem 2”.  Now, you need to gather up all the files for your particular run and put them into their own folder.  I usually recommend that you use a different folder for different runs, even if there are only a few files in the folder.  It just helps keep things organized. (Example: Each seed has its own folder)

5. Submit the jobs! Whew.  Too much work.  After clicking and typing and hitting enter and dragging and dropping, you’re finally ready to hit go.

If you’re exhausted by that list, I don’t blame you — so am I!  This is a lot of manual steps, and the worst part is that the process hardly changes at all between seed 1 and seed 50.  Plus, if you do all this by hand, you are more likely to make a mistake (“Did I actually change seed 2 to seed 3?  Oh gosh I don’t know”).  The other thing that’s annoying about it is that you may need to make yet another change in the process later on.  Instead of changing the seed, now you have to change some parameter!

Well, there’s another way…

Instead, use python!

That’s the point of this whole post.  Let’s get started.

Learning Python syntax and starting your first script

How do you quickly get started learning this new language?  First, log into the cluster and type “python”.  You may need to load a module, consult the documentation for your own cluster.  You’ll see a prompt that looks like this: “>>>”  This is the python interpreter.  You can type any command you want, and it will get executed right there, kind of like Matlab.  So now you’re running Python!  Saying it’s too hard to get started is no excuse. 🙂

Then, open the official Python tutorial, or a third party tutorial on the internet.  I noticed the third party one has a Python interpreter right in the browser.  Anyway, any time you see a new  command, look it up in the tutorial!  After a while you’ll be a Python expert.

One more comment in the “getting started” section.  In a previous post the beginning of the script looked like this.  The import commands load packages that have commands that you need.  Any time you learn a new command, make sure you don’t need to also include a new import command at the beginning of the script.

import re
import os
import sys
import time
from subprocess import Popen
from subprocess import PIPE

And then, the main function is defined a little differently than you may be used to in Fortran or C.  Something like this:

def main():

    #a bunch of code goes here

if __name__ == "__main__":
    main()

The default function, (“if __name__”) is a convention in Python.  So just set up your script in a similar way and you can use it like you were programming in C or another language.

Make sure to pay attention to what the tutorial says about code indenting.  Spaces/tabs/indents are very important in Python and they have to be done correctly in order for the code to work.

Ok now that you’re a verified Python expert let’s talk about how to do some of the basic functions that you need to know:

Modifying text files (or, changing one little number in 3000 lines of code)

Here’s your first Python program. It takes a file called borg.c, which, somewhere inside of which, contains the line “int mySeed = 1”. It then changes it to a number you specify in your Python program (in this case, 2). It does this by reading in every line of borg.c and spitting it into a new file, but when it gets to the mySeed line, it rewrites it. Note! Sometimes the spacing comes out weird on the blog. Be sure to follow correct Python spacing convention when writing your own code.

import re
import os
import sys
import time
from subprocess import Popen
from subprocess import PIPE
def main():
    inFilename = "borg.c"
    outFilename = "borgNew.c"
    seed = 2
    inStream = open(inFilename, 'rb')
    outStream = open(outFilename, 'w')
    for line in inStream:
        if "int mySeed" in line:
            newString = " int mySeed = " + str(seed) + ";\n"
            outStream.write(newString)
        else:
            outStream.write(line)
if __name__ == "__main__":
    main()

We’ve hit the ground running!  The above code sample shows how to write a file, how to write a loop, and then how to actually modify lines of text.  It also shows that ‘#’ marks a line as a comment.  Remember that the indentation tells Python whether you’re inside one loop, or an if statement, or what have you.  The indenting scheme makes the code easy to read!

If you save this code in a file (say, myPython.py), all you have to do to run the program is type “python myPython.py” at the command line.  No news is good news, if the program runs, it worked!

Hopefully this has given you a taste for what you can do easily, with scripting.  Yes, you could’ve opened a text editor and changed one line pretty easily.  But could you do it easily 1000 times?  Not without some effort.  One easy modification here is that you could assign the ‘seed’ variable in a loop, and change the file multiple times, or create multiple files.

Next time… We’ll talk about how to call system commands within Python.  So, in addition to changing text, we’ll be able to copy files, delete files, even submit jobs to the cluster.  All within our script! Feel free to adapt any code samples here for your own purposes, and provide comments below!

Re-evaluating solutions using Python subprocesses

Our series of Python examples continues!  If you’ve missed our previous posts, we have some tutorials in parts one and two, tips on setting up Python and Eclipse, cluster submission guides, and so forth.

Matt has been helping me get up to speed using Python.  He always told me that some processes are a lot easier in Python than in C++, and I suppose I didn’t believe him until recently.  Here is my first shot at a Python program that’s all my own, and I wanted to share it with you here.  The code is below, and some comments follow.  The group has recently begun several GitHub code repositories!  You can link to this code sample at my repository.

import re
import os
import sys
import time
from subprocess import Popen
from subprocess import PIPE

LINE_BUFFERED = 1
        
def main():

    #Define the command that you would like to run through the pipe.  This will typically be your
    #executable set up to work with MOEAframework, and associated arguments.  Specifically here
    #we are working with the LRGV problem.
    cmd = ['./lrgvForMOEAFramework',  '-m', 'std-io', '-c', 'combined', '-b', 'AllDecAll']

    #Verify the command
    print "The command to run is: %s" % cmd

    #Use popen to open a child process.  The standard in, out, and error are sent through a pipe.
    child = Popen(cmd, cwd='//home//joka0958//re-evaluator_2013-04-05//', bufsize=LINE_BUFFERED, stdin=PIPE, stdout=PIPE, stderr=PIPE)

    #The current version of the model spits out some lines to the console when it initializes.
    #When using this python child process, we need to intentionally send and receive all output (i.e. it doesn't
    #automatically do it for us). Here there are 3 initialization lines to catch:
    print "Reading initializer lines."
    for i in range(0,3):
       line = child.stdout.readline()
       if line:
         print line
       else:
         raise Exception("Evaluator died!")

    #Now we want to step through an existing Borg output file, which already contains decision variables and objectives.
    #We are going to step through each line, read in the decision variables from the file, and then evaluate those decision
    #variables in our external program.
    myFilename = "AllDecAllExperimentData.txt"
    fp = open(myFilename, 'rb')
    for line in fp:
        if "#" in line:
            #This "if" statement is helpful if you want to preserve the generation separators and header lines that appeared
            #in the original file.
            print line
        else:
            #Read in all the variables on the line (this is both decisions and objectives)
            allVariables = [float(xx) for xx in re.split("[ ,\t]", line.strip())]

            #Only keep what you want
            variables = allVariables[0:8]

            #We want to send the decision variables, separated by a space, and terminated by a newline character
            decvarsAsString = '%f %f %f %f %f %f %f %f\n' % (variables[0], variables[1], variables[2], variables[3], variables[4], variables[5], variables[6], variables[7])

            #We send that string to the child process and catch the result.
            print "Sending to process"
            child.stdin.write(decvarsAsString)
            child.stdin.flush() #you flush, so that the program knows the line was sent

            #Now obtain the program's result
            print "Result:"
            outputLine = child.stdout.readline()
            print outputLine

            #Since this is in a loop, it will operate for every line in the input file.

if __name__ == "__main__":
    main()

Basically, your child process gets created in the beginning of this script. It is hanging out, waiting for input. The cool thing about the way input and output is handled here, is that you have complete control over what gets sent to the child program and what gets read in from it. Also, you can send multiple solutions to the program, and read their output sequentially.

Python also gives you lots of great control over input and output formatting in a relatively simple fashion. Notice how we are moving back and forth from strings and floats effortlessly. I also like how the control sequences (for loops, if else statements) are very straightforward and easy to read.

Feel free to adapt this code for your own purposes, and provide comments below!

Some ideas for your Bash submission scripts

I’ve been playing around with some design options for PBS submission scripts that may help people doing cluster work.  Some things to look for in the source code:

  • You can use a list in bash that contains multiple text entries, and then access those text entries to create strings for your submissions.  Note that you can actually display the text first (see the ‘echo ${PBS}’) before you do anything; that way you aren’t requesting thousands of jobs that have a typo in them!
  • Using “read” allows the bash programmer to interact with the user.  Well, in reality you are usually both the programmer and the user.  But lots of times, I want to write a script and try it out first, before I submit hundreds of hours of time on the cluster.  The flags below can help with that process.
  • I added commands to compile the source code before actually submitting the jobs.  Plus, by using flags and pauses intelligently, you can bail out of the script if there’s a problem with compilation.
#!/bin/bash
NODES=32
WALLHOURS=5

PROBLEMS=("ProblemA" "ProblemB")
NSEEDS=10
SEEDS=$(seq 1 ${NSEEDS}) #note there are multiple ways to declare lists and sequences in bash

NFES=1000000
echo "NFEs is ${NFES}" #echo statements can improve usability of the script, especially if you're modifying it a lot for various trials

ASSUMEPERMISSIONFLAG=No #This is for pausing the submission script later

echo "Compile? Y or N."

read COMPILEFLAG
if [ "$COMPILEFLAG" = "Y" ]; then
    echo "Cleaning.."
    make clean -f MakefileParallel
    echo "Compiling.."
    make -f MakefileParallel
else
        echo "Not compiling."
fi

for PROBINDEX in ${!PROBLEMS[*]}
do
    PROBLEM=${PROBLEMS[$PROBINDEX]} #note the syntax to pull a list member out here
    echo "Problem is ${PROBLEM}"

    for SEED in ${SEEDS}
    do
        NAME=${PROBLEM}_${SEED} #Bash is really nice for manipulating strings like this
        echo "Submitting: ${NAME}"

        #Here is the actual PBS command, with bash variables used in place of different experimental parameters.  Note the use of getopt-style command line parsing to pass different arguments into the myProgram executable.  This implementation is also designed for parallel processing, but it can also be used for serial jobs too.

        PBS="#PBS -l nodes=32\n\
        #PBS -N ${NAME}\n\
        #PBS -l walltime=05:00:00\n\
        #PBS -j oe\n\
        #PBS -o ${NAME}.out\n\
        cd \$PBS_O_WORKDIR\n\
        module load openmpi/intel\n\
        mpirun ./myProgram -b ${PROBLEM} -c combined -f ${NFES} -s ${SEED}"

        #The first echo shows the user what is about to be passed to PBS.  The second echo then pipes it to the command qsub, and actually submits the job.

        echo ${PBS}

        if [ "$ASSUMEPERMISSIONFLAG" = "No" ]; then

            echo "Continue submitting? Y or N."

            read SUBMITFLAG

            #Here, the code is designed to just keep going after the user says Y once.  You can redesign this for your own purposes.  Also note that this code is fairly brittle in that the user MUST say Y, not y or yes.  You can build that functionality into the if statements if you'd like it.

            if [ "$SUBMITFLAG" = "Y" ]; then
                 ASSUMEPERMISSIONFLAG=Yes #this way, the user won't be asked again
                 echo -e ${PBS} | qsub
                 sleep 0.5
                 echo "done."
            fi
        else
            echo -e ${PBS} | qsub
            sleep 0.5
            echo "done."
         fi
    done
done

Using gdb, and notes from the book “Beginning Linux Programming”

I just started thumbing through Beginning Linux Programming by Matthew and Stones.  It covers in great detail a lot of the issues we often talk about on this blog — how to debug code, how to program using the BASH shell, file input/output, different development environments, and making makefiles.

Using gdb

One tool I haven’t talked about much is the debugger, gdb.  It is a text-based debugging tool, but it gives you some powerful ways to step through a program.  You can set a breakpoint, and then make rules for what variables are being displayed in each iteration.

Let’s say you have a file, myCode.c that you compile into an executable, myCode.  Compile using the -g flag, and then start gdb on your code by entering “gdb ./myCode”.  If your code has command line arguments, you need to specify an argument to gdb like this:

gdb --args ./myCode -a myArgument1 -b myArgument2

The important phrase here is “--args”, two dashes and the word args, that appears after gdb.  That lets gdb know that your ./myCode program itself has arguments.

You can also set a breakpoint inside gdb (you’d need to do that before you actually run the code).  To do this, say at line 10, simply type “break 10”.  This will be breakpoint 1.  To create rules to display data at each breakpoint type “display”.  It will ask what commands you’d like… for example, to display 5 values of an array, the command is “display array[0]@5”, then “cont” to continue, and “end” to end.

After setting up your breakpoints, simply type “run” to run the code.

If your program has a segmentation fault, it will let you know what line the segmentation fault occurred at, and using “backtrace” you can see what functions called that line.

If you have a segfault and the program is halted, the nice thing is that all the memory is still valid and you can see the value of certain variables.  To see the value of variables say “print myVariableName”.  It is quite informative.  For example, if a variable has a “NAN” next to it, you know there may be something wrong with that variable, that could cause an error somewhere else.
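
Putting those pieces together, here is a minimal sketch of a session; myCode.c, its arguments, and the variable names are placeholders from the examples above:

gcc -g -O0 -o myCode myCode.c                     # -g keeps the debug symbols gdb needs
gdb --args ./myCode -a myArgument1 -b myArgument2
# inside gdb:
#   break 10             set a breakpoint at line 10
#   run                  start the program with the arguments given above
#   display array[0]@5   show 5 values of "array" every time execution stops
#   backtrace            after a crash, show the chain of calls that led to it
#   print levelA         inspect the current value of an individual variable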

Here’s one example of a possible problem in pseudocode:

levelA = 0;

levelB = 0;

myLevel = 0.5;

myFrac = myLevel / (levelA + levelB);

The fourth line there looks innocuous enough, but it will cause a divide-by-zero error given the levelA and levelB values.  In gdb, the program may halt on the fourth line, but a simple “print levelA” and “print levelB” will help you solve the problem.

Here’s a short link that explains the basics of gdb with more detail.

Other notes

Also interesting are several C preprocessor macros that can tell you what line, file, date, and time the code was compiled at.   Predictably, these are __LINE__ __FILE__ __DATE__ and __TIME__ (that’s two underscores for each).

I also like the bash scripting examples that are contained in the book.  They taught me about some Linux utilities like “cut” that are very helpful and are covered elsewhere on this blog.

Any additional tips and tricks are welcome in the comments!