The situation: You have an “embarrassingly parallel” set of tasks that do not need to communicate with each other. They are currently configured to run in serial, but their combined runtime would exceed the allowed queue time (or your patience).
Goal: Submit a batch of serial jobs to a PBS queue in a sensible way.
Your old job script, my_job.sh:

    #!/bin/bash
    #PBS -l walltime=1:00:00
    #PBS -l nodes=1:ppn=1
    cd "$PBS_O_WORKDIR"
    # Run all 1000 tasks one after another in a single job
    for number in {1..1000}
    do
        ./my_program -n ${number}
    done

As usual, you would submit this with qsub my_job.sh.
New idea:
Most Torque-style PBS systems support job arrays with the -t flag (PBS Professional uses -J and $PBS_ARRAY_INDEX instead):

    qsub -t 1-1000 my_job.sh

The index of each job is passed into my_job.sh via the $PBS_ARRAYID environment variable, so the job script can be revised:

    #!/bin/bash
    #PBS -l walltime=1:00:00
    #PBS -l nodes=1:ppn=1
    cd $PBS_O_WORKDIR
    ./my_program -n $PBS_ARRAYID

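If each task needs more than a single integer, one common pattern is to use the array index to select a line from a parameter file. The sketch below assumes a hypothetical file params.txt with one set of arguments per line; the helper name get_task_args is likewise made up for illustration and is not part of PBS.

```shell
#!/bin/bash
# Hypothetical helper: print line number $1 (1-based) of the file $2.
get_task_args() {
    local index="$1" file="$2"
    sed -n "${index}p" "$file"
}

# Fall back to index 1 when PBS_ARRAYID is unset, so the script
# can be tried on a login node before it is submitted.
task_index="${PBS_ARRAYID:-1}"

# In the job script the call might look like this
# (params.txt is an assumed file name):
#   ./my_program $(get_task_args "$task_index" params.txt)
```

With this layout, line 1 of params.txt feeds array index 1, line 2 feeds index 2, and so on, so the array range should start at 1.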
Now you should have 1000 jobs running. They are “parallel” in the sense that they will run simultaneously, but are not using MPI. The jobs will appear as an array to save you the hassle of scrolling through all of them:

    jdh366@thecube ~/demos-meeting/serial-batch $ qstat
    Job id                    Name             User            Time Use S Queue
    ------------------------- ---------------- --------------- -------- - -----
    117046[].thecube          my-job-script.sh jdh366                 0 R default

The stdout from each job will be saved in files named my-job-script.sh.o117046-<i>, where <i> is the index of the job.
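Because the output files follow a predictable naming pattern, a short loop can report which indices have not produced one yet. This is a minimal sketch: the function name missing_outputs is made up for illustration, and a missing file only suggests that a task has not finished, not why.

```shell
#!/bin/bash
# Print every index in [first, last] that has no output file
# named <prefix>-<i> in the current directory.
missing_outputs() {
    local prefix="$1" first="$2" last="$3" i
    for (( i = first; i <= last; i++ )); do
        [ -e "${prefix}-${i}" ] || echo "$i"
    done
}

# Example call, using the job name and ID from the qstat listing above:
#   missing_outputs my-job-script.sh.o117046 1 1000
```

The printed indices could then be resubmitted individually or as a smaller array.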
What if I have more jobs than processors?
You have two choices. First, you could submit all of them at the same time. PBS will run as many jobs as it can until they are finished. However, this is likely to occupy the entire system with many short serial jobs, which people won’t like.
The second option is to specify a “slot limit” using the % character in the qsub command like so:

    qsub -t 1-1000%50 my_job.sh

This command will run a maximum of 50 serial jobs at a time. As running jobs finish, the scheduler starts more from the array, so no more than 50 are active at once until all 1000 are complete. This avoids using the entire system at once.
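To avoid typos in the range and slot limit, the array specification can be assembled by a small helper and printed before anything is submitted. This is only a sketch: array_spec is a hypothetical function name, and the command is echoed rather than run so it can be checked first.

```shell
#!/bin/bash
# Build a Torque-style array specification such as "1-1000%50".
# The third argument (slot limit) is optional.
array_spec() {
    local first="$1" last="$2" limit="${3:-}"
    if [ -n "$limit" ]; then
        echo "${first}-${last}%${limit}"
    else
        echo "${first}-${last}"
    fi
}

# Dry run: print the submission command instead of executing it.
echo "qsub -t $(array_spec 1 1000 50) my_job.sh"
```

Once the printed command looks right, it can be pasted into the shell or the echo can be dropped.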
For more HPC submission script examples, see here:
https://github.com/jdherman/hpc-submission-scripts