Debugging MPI By Dave Hadka

Dave recently wrote the following instructions on how to debug MPI in an email, and I thought I’d post them here as a private post on the blog.

In case this isn’t already known, here are the instructions I came up with for running gdb and valgrind on MPI programs:

Debugging MPI with GDB
----------------------

1) Run an interactive PBS job:

qsub -I -l walltime=16:00:00 -l nodes=1:ppn=4

The interactive job starts in your home folder, so cd to your working directory.

2) Load the OpenMPI module with GNU GCC support:

module load openmpi/gnu

3) Compile your code with the -ggdb flag to include GDB debugging info in the executable.

4) Create the GDB script, gdbscript.txt, to run when GDB is launched.
This is needed since the program will not start running until the
GDB ‘run’ command is called, and we need to automatically run all
jobs on remote nodes. This will also enable logging to gdb.txt.

set logging on
run

5) Run the MPI program with GDB:

mpirun gdb -x gdbscript.txt ./mpiprog.exe

6) When the program exits or an error is detected, you will be left in
GDB. You can now use any GDB commands, or quit by typing ‘quit’.
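
Steps 4 and 5 can also be scripted. The following is only a sketch, assuming you are already inside the interactive job with the module loaded and that the executable is named ./mpiprog.exe as in the example above. The guard around the launch is my addition, so the snippet does nothing harmful if run elsewhere.

```shell
# Step 4: write the GDB command file (logging goes to gdb.txt by default).
cat > gdbscript.txt <<'EOF'
set logging on
run
EOF

# Step 5: launch every MPI process under GDB.
# The checks are an added safety net: they skip the launch when mpirun, gdb,
# or the example executable ./mpiprog.exe is not available.
if command -v mpirun >/dev/null 2>&1 \
   && command -v gdb >/dev/null 2>&1 \
   && [ -x ./mpiprog.exe ]; then
    mpirun gdb -x gdbscript.txt ./mpiprog.exe
else
    echo "mpirun, gdb, or ./mpiprog.exe not found; skipping launch" >&2
fi
```

The heredoc delimiter is quoted ('EOF') so the shell copies the two GDB commands into the file verbatim.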

Memory Checking MPI Programs
----------------------------

First, follow steps 1-3 above.

4) When the interactive PBS job starts, run the MPI program with Valgrind:

mpirun valgrind --tool=memcheck --log-file=valgrind_%p.txt ./mpiprog.exe

5) Look at the valgrind_NNNN.txt files that were created, one for each process
(NNNN is the process ID), to determine if any memory errors or leaks occurred.
Valgrind often reports uninitialized values inside the Open MPI library itself;
these do not come from your code and can be ignored.
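
With many processes, opening each log by hand gets tedious. Here is a rough sketch of a helper that prints only the summary lines from each log. The function name summarize_valgrind_logs is made up for this example, and it assumes the logs use the valgrind_%p.txt naming from step 4 and contain Memcheck's usual "ERROR SUMMARY" and "definitely lost" lines.

```shell
# Print the error summary and "definitely lost" lines from each per-process log.
# Hypothetical helper; the optional argument is the directory holding the logs.
summarize_valgrind_logs() {
    dir="${1:-.}"
    for log in "$dir"/valgrind_*.txt; do
        [ -e "$log" ] || continue   # pattern matched nothing
        echo "== $log"
        grep -E 'ERROR SUMMARY|definitely lost' "$log" \
            || echo "   (no summary lines found)"
    done
}

# Summarize the logs in the current directory.
summarize_valgrind_logs .
```

To quiet the Open MPI noise rather than skipping over it, Valgrind's --suppressions=FILE option can be added to the command in step 4. Open MPI installations usually ship a file named openmpi-valgrind.supp under share/openmpi, but I have not checked where it lives on this cluster.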
