Debug in Real-time on SLURM

Debugging a code by submitting jobs to a supercomputer is an inefficient process. It goes something like this:

  1. Submit job and wait in queue
  2. Check for errors/change code
  3. (repeat endlessly until your code works)

Debugging in Real-Time:

There’s a better way to debug that doesn’t require waiting for the queue every time you want to check your code. On SLURM, you can debug in real-time like so:
  1. Request a debugging or interactive node and wait in queue
  2. Check for errors/change code continuously until code is fixed or node has timed out

Example (using Summit supercomputer at University of Colorado Boulder):

  1. Log into terminal (PuTTY, Cygwin, etc.)
  2. Navigate to directory where the file to be debugged is located using ‘cd’ command
  3. Load SLURM
    • $module load slurm
  4. Enter the ‘sinteractive’ command
    • $sinteractive
  5. Wait in line for permission to use the node (you will have a high priority with a debugging QOS so it shouldn’t take long)
  6. Once you are granted permission, the node is yours! Now you can debug to your hearts content (or until you run out of time).
I’m usually debugging shell scripts on Unix. If you want advice on that topic check out this link. I prefer the ‘-x’ command (shown below) but there are many options available.
Debugging shell scripts in Unix using ‘-x’ command: 
 $bash -x mybashscript.bash
Hopefully this was helpful! Please feel free to edit/comment/improve as you see fit.
Advertisements

One thought on “Debug in Real-time on SLURM

  1. Pingback: Water Programming Blog Guide (Part I) – Water Programming: A Collaborative Research Blog

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s