Running jobs on the supercomputer: JANUS

The power of supercomputing is undeniable. However, there is often a hurdle in syntax to get jobs to run on them. What I’m including below are ways to submit jobs to run on the CU-Boulder supercomputer, JANUS, which I hope will be helpful.

To log on, open up a terminal window (e.g. Terminal on a Mac or CygWin on a PC): ssh <username>@login.rc.colorado.edu

To copy items to JANUS from a shell, simply use the following:

scp <path and filename on local machine>   <username>@login.rc.colorado.edu:<destination path on JANUS>/

The purpose of the job script is to tell JANUS where to run the job. I will cover two types of job scripts, (1) to submit a job to an entire node, and (2) to submit to a single processor. Note, nodes on JANUS contain multiple processors, usually more than 12, so that if you have a memory intensive job you may wish to submit the former. Also, the jobs that occupy entire nodes offer the user a larger number of total processors to work with (several thousand cores versus several hundred). Nevertheless, here are the examples:

1. Example script to submit to a node is below: The body of text should be saved to a text file with a “.sh” suffix (i.e. shell script). Also notice that lines that begin with “#” are not read by the program, but rather are for comments/documentation. To submit the script, first be sure you’ve loaded the slurm module:

module load slurm

sbatch <path and filename of script>

#!/bin/bash
# Lines starting with #SBATCH are interpreted by slurm as arguments.
#

# Set the name of the job, e.g. MyJob
#SBATCH -J MyJob

#
# Set a walltime for the job. The time format is HH:MM:SS - In this case we run for 12 hours. **Important, this length should be commensurate with the type of node
# you're submitting to, debug is less than 1 hour, but others can be much longer, check the online documentation for assistance

#SBATCH --time=12:00:00
#
# Select one node
#
#SBATCH -N 1

# Select one task per node (similar to one processor per node)
#SBATCH --ntasks-per-node 12
# Set output file name with job number

#SBATCH -o MyJob-%j.out

# Use the standard 'janus' queue. This is confusing as the online documentation is incorrect, use the below to get a simple 12 core node

#SBATCH --qos janus

# The following commands will be executed when this script is run.

# **Important, in order to get 12 commands to run at the same time on your node, enclose them in parentheses "()" and follow them with an ampersand "&"

# to get all jobs to run in the background. The last thing is be sure to include a "wait" command at the end, so that the job script waits to terminate until these

# jobs complete. Theoretically you could have more than 12 command below.

# ** Note replace the XCMDX commands below with the full path to your executable as well as any command line options exactly how you'd run them from the

# command line.

echo The job has begun

(XCMD1X) &

(XCMD2X) &

(XCMD3X) &

(XCMD4X) &

(XCMD5X) &

(XCMD6X) &

(XCMD7X) &

(XCMD8X) &

(XCMD9X) &

(XCMD10X) &

(XCMD11X) &

(XCMD12X) &

# wait ensures that job doesn't exit until all background jobs have completed

wait

EOF

2. Example script to submit to a single processor is below. The process is almost identical to above, except for 4 things: (i) the queue that we’ll submit to is called ‘serial’, (ii) number of tasks per node is 1, (iii) the number of executable lines is 1, and (iv) we do not need the “wait” command.

#!/bin/bash

# Lines starting with #SBATCH are interpreted by slurm as arguments.

#

# Set the name of the job, e.g. MyJob

#SBATCH -J MyJob

#

# Set a walltime for the job. The time format is HH:MM:SS - In this case we run for 6 hours. **Important, this length should be commensurate with the type of node

# you're submitting to, debug is less than 1 hour, but others can be much longer, check the online documentation for assistance

#SBATCH --time=6:00:00

#

# Select one node

#

#SBATCH -N 1

# Select one task per node (similar to one processor per node)

#SBATCH --ntasks-per-node 1

# Set output file name with job number

#SBATCH -o MyJob-%j.out

# Use the standard 'serial' queue. This is confusing as the online documentation is incorrect, use the below to get a single processor

#SBATCH --qos serial

# The following commands will be executed when this script is run.

# ** Note replace the XCMDX commands below with the full path to your executable as well as any command line options exactly how you'd run them from the

# command line.

echo The job has begun

XCMDX

EOF

Advertisements

3 thoughts on “Running jobs on the supercomputer: JANUS

  1. Pingback: Compiling, running, and linking a simulation model to Borg: LRGV Example | Water Programming: A Collaborative Research Blog

  2. Pingback: How to Use Opensees on the JANUS Supercomuter | Travis Marcilla

  3. Pingback: How to Use Opensees on the JANUS Supercomputer | Travis Marcilla

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s