Making Valgrind Easy

Some of this blog’s readers and authors (most notably, Joe Kasprzik) read the title of this post and thought “wait, there already is a post about Valgrind in this blog.” And you are right, so in this blog post I will build on the legacy Joe has left us in his post about Valgrind and get into the details of how to use its basic functionalities to get your code right.

Common mistakes when coding in C++

Suppose we have the following code:

#include <stdio.h>

int main() {
    int *var = new int[5]; // you wouldn't do this if the size was always 5, but this makes the argument clear.
    int n = 5;
    int m;

    if (m > n) {
        printf("Got into if statement.\n");
        for (int i = 0; i < 6; ++i) {
            var[i] = i;
        }
    }

    printf("var[5] equals %d\n", var[n]);
}

Saving the code above in a file called test.cpp, compiling it with g++ to create an executable called "test," and running it with "./test" will return the following output:

bernardoct@DESKTOP-J6145HK ~
$ g++ test.cpp -o test

bernardoct@DESKTOP-J6145HK ~
$ ./test
Got into if statement.
var[5] equals 5

Great, it ran and did not crash (for such simple code, gcc's flag -Wall would have issued a warning saying m was not initialized, but in more complex code such a warning may not be issued). However, it would be great if this code had crashed, because this would make us look into it and figure out that it actually has 3 problems:

  1. We did not assign a value to variable m (it was created but not initialized), so how did the code determine that m was greater than n to get into the code inside the if statement?
  2. The pointer array var was created as having length 5, meaning its elements are numbered 0 to 4. If the for-loop runs from 0 to 5 but element 5 does not exist, how did the code fill it in with the value of variable i when i was 5 in the loop? From the printf statement that returned 5 we know var[5] equals 5.
  3. The pointer array var was not destroyed after the code no longer needed it. This is not necessarily a problem in this case, but if this were a function that is supposed to be called over and over within a model, there is a chance the RAM would be filled with these seemingly inoffensive pointer arrays and the computer would freeze (or the node, if running on a cluster, would possibly crash and have to be rebooted).

Given that C++ will not necessarily crash even in the presence of such errors, one way of making sure your code is clean is by running it through Valgrind. However, most people who have used Valgrind on a model with a few hundred or a few thousand lines of code have been discouraged by its possibly long and cryptic-looking output. Do not let this intimidate you, because the output is actually fairly easy to read once you either learn what to look for or use Valkyrie, a graphical user interface for Valgrind.

Generating and interpreting Valgrind’s output

The first thing that needs to be done for Valgrind to give you meaningful output is to re-compile your code with the -O0 and -g flags, the former to prevent the compiler from modifying your code to make it more efficient but unintelligible to Valgrind (or to debuggers), and the latter for Valgrind (and debuggers) to be able to pinpoint the lines of code where issues happen and originate. Therefore, the code should be compiled as shown below:

bernardoct@DESKTOP-J6145HK ~
$ g++ -O0 -g test.cpp -o test

Now it is time to run your code with Valgrind to perform some memory checking. Valgrind itself takes flags that dictate the type of analysis to be performed. Here we are interested in checking memory misuse (instead of profiling, checking for thread safety, etc.), so the first flag (not required, since Memcheck is the default tool, but good for keeping things clear for yourself) should be --tool=memcheck. Now that we have specified that we want Valgrind to run a memory check, we should specify that we want it to look in detail for memory leaks and to tell us where the errors are happening and originating, which can be done by passing the flags --leak-check=full and --track-origins=yes. This way, the complete command to run Valgrind on our test program is:

bernardoct@DESKTOP-J6145HK ~
$ valgrind --tool=memcheck --leak-check=full --track-origins=yes ./test

Important: Beware that your code will take orders of magnitude longer to run with Valgrind than it would otherwise. This means that you should run something as small as possible but still representative; e.g., instead of running your stochastic model with 1,000 realizations and a simulation time of 50 years, consider running 2 realizations simulating 2 years each, so that Valgrind analyzes a year-long simulation as well as the transitions between years and between realizations. Also, if running your code on a cluster, load the Valgrind module with module load valgrind-xyz in your submission script and replace the call to your model in the submission script with the valgrind call above. You can find the exact name of the Valgrind module by running module avail on the terminal. If running Valgrind on code that uses MPI, use mpirun valgrind ./mycode -flags.

When called, Valgrind will instrument our test executable and, based on the information it collects, will print the following on the screen:

==385== Memcheck, a memory error detector
==385== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==385== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==385== Command: ./test
==385==
==385== Conditional jump or move depends on uninitialised value(s)
==385==    at 0x4006A9: main (test.cpp:9)
==385==  Uninitialised value was created by a stack allocation
==385==    at 0x400686: main (test.cpp:3)
==385==
Got into if statement.
==385== Invalid write of size 4
==385==    at 0x4006D9: main (test.cpp:12)
==385==  Address 0x5ab4c94 is 0 bytes after a block of size 20 alloc'd
==385==    at 0x4C2E8BB: operator new[](unsigned long) (vg_replace_malloc.c:423)
==385==    by 0x400697: main (test.cpp:5)
==385==
==385== Invalid read of size 4
==385==    at 0x4006F5: main (test.cpp:16)
==385==  Address 0x5ab4c94 is 0 bytes after a block of size 20 alloc'd
==385==    at 0x4C2E8BB: operator new[](unsigned long) (vg_replace_malloc.c:423)
==385==    by 0x400697: main (test.cpp:5)
==385==
var[5] equals 5
==385==
==385== HEAP SUMMARY:
==385==     in use at exit: 20 bytes in 1 blocks
==385==   total heap usage: 3 allocs, 2 frees, 73,236 bytes allocated
==385==
==385== 20 bytes in 1 blocks are definitely lost in loss record 1 of 1
==385==    at 0x4C2E8BB: operator new[](unsigned long) (vg_replace_malloc.c:423)
==385==    by 0x400697: main (test.cpp:5)
==385==
==385== LEAK SUMMARY:
==385==    definitely lost: 20 bytes in 1 blocks
==385==    indirectly lost: 0 bytes in 0 blocks
==385==      possibly lost: 0 bytes in 0 blocks
==385==    still reachable: 0 bytes in 0 blocks
==385==         suppressed: 0 bytes in 0 blocks
==385==
==385== For counts of detected and suppressed errors, rerun with: -v
==385== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 0 from 0)

Seeing Valgrind’s output being 5 times as long as the test code itself can be somewhat disheartening, but the information contained in the output is really useful. The first block of the output is the header. It is always printed so that you know the version of Valgrind you have been using, the command it used to call your own code, and so on. In our example, the header is:

==385== Memcheck, a memory error detector
==385== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==385== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==385== Command: ./test

After that, Valgrind reports the errors it found during the execution of your code. Errors are always reported as a description of the error in good old English, followed by where it happens in your code. Let’s look at the first error found by Valgrind:

==385== Conditional jump or move depends on uninitialised value(s)
==385==    at 0x4006A9: main (test.cpp:9)
==385==  Uninitialised value was created by a stack allocation
==385==    at 0x400686: main (test.cpp:3)

This tells us that there is an if statement (conditional statement) on line 9 of test.cpp in which at least one of the sides of the logical test has at least one uninitialized variable. As pointed out by Valgrind, line 9 of test.cpp has our problematic if statement which compares initialized variable n to uninitialized variable m, which will have whatever was put last in that memory address by the computer.

The second error block is the following:

==385== Invalid write of size 4
==385==    at 0x4006D9: main (test.cpp:12)
==385==  Address 0x5ab4c94 is 0 bytes after a block of size 20 alloc'd
==385==    at 0x4C2E8BB: operator new[](unsigned long) (vg_replace_malloc.c:423)
==385==    by 0x400697: main (test.cpp:5)

This means that your code is writing to a location in memory that it did not allocate for its use. This block says that the illegal write, so to speak, happened in line 12 of test.cpp through a variable created in line 5 of test.cpp using the new[] operator. These lines correspond to var[i] = i; and to int *var = new int[5];. With this, we learned that either var was created too short on line 5 of test.cpp or that the for loop that assigns values to var goes one or more steps too far.

Similarly, the next block tells us that our printf statement used to print the value of var[5] on the screen has read past the amount of memory that was allocated to var in its declaration on line 5 of test.cpp, as shown below:

==385== Invalid read of size 4
==385==    at 0x4006F5: main (test.cpp:16)
==385==  Address 0x5ab4c94 is 0 bytes after a block of size 20 alloc'd
==385==    at 0x4C2E8BB: operator new[](unsigned long) (vg_replace_malloc.c:423)
==385==    by 0x400697: main (test.cpp:5)

The last thing Valgrind will report is the information about memory leaks, which are accounted for when the program is done running. The output about memory leaks for our example is:

==409== HEAP SUMMARY:
==409==     in use at exit: 20 bytes in 1 blocks
==409==   total heap usage: 3 allocs, 2 frees, 73,236 bytes allocated
==409==
==409== 20 bytes in 1 blocks are definitely lost in loss record 1 of 1
==409==    at 0x4C2E8BB: operator new[](unsigned long) (vg_replace_malloc.c:423)
==409==    by 0x400697: main (test.cpp:5)
==409==
==409== LEAK SUMMARY:
==409==    definitely lost: 20 bytes in 1 blocks
==409==    indirectly lost: 0 bytes in 0 blocks
==409==      possibly lost: 0 bytes in 0 blocks
==409==    still reachable: 0 bytes in 0 blocks
==409==         suppressed: 0 bytes in 0 blocks

The important points to take away from this last block are that:

  1. there were 20 bytes of memory leaks, meaning that if this were a function in your code every time it was run it would leave 20 bytes of garbage sitting in the RAM. This may not sound like a big deal but imagine if your code leaves 1 MB of garbage in the RAM for each of the 100,000 times a function is called. With this, there went 100 GB of RAM and everything else you were doing in your computer at that time because the computer will likely freeze and have to go through a hard-reset.
  2. the memory you allocated and did not free was allocated in line 5 of test.cpp when you used the operator new[] to allocate the integer pointer array.

It is important to notice here that if we increase the amount of allocated memory by the new[] operator on line 5 to that corresponding to 6 instead of 5 integers, the last two errors (invalid read and invalid write) would disappear. This means that if you run your code with Valgrind and see hundreds of errors, chances are that it will take modifying a few lines of code to get rid of most of these errors.

Valkyrie — a graphical user interface for Valgrind

Another way of going through Valgrind’s output is by using Valkyrie (now installed in the login node of Reed’s cluster, The Cube). If you are analyzing your code from your own computer with a Linux terminal (does not work with Cygwin, but you can install a native Ubuntu terminal on Windows 10 by following instructions posted here) and do not have Valkyrie installed yet, you can install it by running the following on your terminal:

bernardoct@DESKTOP-J6145HK ~
$ sudo apt-get install valkyrie

Valkyrie works by reading an xml file exported by Valgrind containing the information about the errors it found. To export this file, you need to pass the flags --xml=yes and --xml-file=valgrind_output.xml (or whatever name you want to give the file) to Valgrind, which would make the call to Valgrind become:

bernardoct@DESKTOP-J6145HK ~
$ valgrind --tool=memcheck --leak-check=full --track-origins=yes --xml=yes --xml-file=valgrind_output.xml ./test

Now, you should have a file called “valgrind_output.xml” in the directory you are calling Valgrind from. To open it with Valkyrie, first open Valkyrie by typing valkyrie on your terminal. If on Windows 10, you need to have Xming installed and running, which can be done by following the instructions at the end of this post. If on a cluster, besides having Xming open you also have to have ssh’ed into the cluster with the -X flag (e.g. by running ssh -X username@my.cluster.here), either with Cygwin or from a native Linux terminal. After opening Valkyrie, click on the green folder button and select the xml file, as in the screenshot below.

[Screenshot: valkyrie_screenshot.png]

After opening the xml file generated by Valgrind, Valkyrie should look like the screenshot below:

[Screenshot: valkyrie_screenshot2]

Now you can begin from a collapsed list of errors and unfold each error to see its details. Keep in mind that Valkyrie is not your only option for a Valgrind GUI, as IDEs like JetBrains’ CLion and Qt Creator come integrated with Valgrind. Now go check your code!

PS: Thanks to folks on Reddit for the comments which helped improve this post.

Dynamic memory allocation in C++

To have success creating C++ programs, it’s essential to have a good understanding of the workings and implementation of dynamic memory,  which is the “art” of performing manual memory management.  When developing code in C++, you may not always know how much memory you will need before your program runs.  For instance, the size of an array may be unknown until you execute the program.

Introduction to pointers

In order to understand dynamic memory allocation, we must first talk about pointers. A pointer is essentially a variable that stores the address in memory of another variable. The value a pointer holds is a memory address, which is usually printed as a long hexadecimal number. The address of a variable can be obtained using an ampersand (&), as exemplified in the following script:

//Accessing the address of a variable:
int age=43;
cout << &age << endl;
// The output would be something like: 0x7fff567ecb68

Since a pointer is also a variable, it requires declaration before being used. The declaration consists of giving it a name, ideally following good code style conventions, and declaring the data type that it “points to” using an asterisk (*), like so:

//Declaring pointers of different data types:
int *integerPointer;
double *doublePointer;
float *floatPointer;
char *charPointer;

The pointer’s operators

In summary, the two pointer operators are the address-of operator (&), which returns the memory address of a variable, and the contents-of operator (*), which returns the value of the variable located at that address.

// Example of pointer operators:
float variable=25.6;
float *pointer;
pointer= &variable;
cout << variable << endl; //outputs 25.6, the variable’s value
cout << pointer << endl; //outputs 0x7fff5a774b68, the variable’s location in memory
cout << *pointer << endl; // outputs 25.6, value of the variable stored in that location

This last operator is also called the dereference operator. It lets you access the variable the pointer points to directly, which you can then use in regular operations:

float width = 5.0;
float length = 10.0;
float area;
float *pWidth = &width;
float *pLength = &length;

//Both of the following operations are equivalent
area = *pWidth * *pLength;
area = width * length;
//the output for both would be 50.

The dereferenced pointers *pWidth and *pLength refer to exactly the same objects as the variables width and length, respectively.

Memory allocation in C++

Now that you have some basic understanding of pointers, we can talk about memory allocation in C++. Memory in C++ is divided into two parts: the stack and the heap. All variables declared inside a function use memory from the stack, whereas the heap is unused memory that the program can allocate dynamically while it runs.

You may not know in advance how much memory you need to store information in a variable. You can allocate memory within the heap of a variable of a determined data type using the new operator like so:

new float;

This operator allocates the memory necessary for storing a float on the heap and returns that address. This address can also be stored in a pointer, which can then be dereferenced to access the variable:

float *pointer = new float; //requesting memory
*pointer = 12.0; //store value
cout << *pointer << endl; //use value
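delete pointer; // free up memory
// pointer is now a dangling pointer: it still holds the old address
pointer = new float; // reuse the pointer for a new address
delete pointer; // free the second allocation as well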

Here, the pointer is stored on the stack as a local variable and holds the allocated heap address as its value. The value 12.0 is then stored at that address on the heap, and you can use it for other operations. Finally, the delete statement releases the memory when it is no longer needed; however, it does not delete the pointer itself, since the pointer lives on the stack. A pointer that still holds the address of memory that has already been released is called a dangling pointer. It must not be dereferenced, but it can be reused by assigning it a new address.

Dynamic memory and arrays

Perhaps the most common use of dynamic memory allocation is for arrays. Here’s a brief example of the syntax:

int *pointer = NULL; // initialized pointer
pointer = new int[10]; // request memory for 10 integers
delete[] pointer; // delete the array pointed to by pointer

The NULL pointer has a value of zero. You can declare a null pointer when you do not yet have an address to assign.

Finally, to allocate and release memory for multi-dimensional arrays, you basically use an array of pointers to arrays. It sounds confusing, but you can do this using the following sample method:

int row = 3;
int col = 4;
double **p  = new double* [row]; // Allocate memory for rows

// Then allocate memory for the columns of each row
for(int i = 0; i < row; i++) {
    p[i] = new double[col];
}

//Release memory
for(int i = 0; i < row; i++) {
   delete[] p[i];
}
delete [] p;

I hope this quick overview provides a starting point on tackling your  C++ memory allocation challenges.

Debugging MPI By Dave Hadka

Dave wrote the following instructions on how to debug MPI in an email recently, and I thought I’d post it here as a private post on the blog.

In case this isn’t already known, here are the instructions I came up with for running gdb and valgrind on MPI programs:

Debugging MPI with GDB
----------------------

1) Run an interactive PBS job:

qsub -I -l walltime=16:00:00 -l nodes=1:ppn=4

The interactive job will start you in your home folder. CD to your working directory.

2) Load the OpenMPI module with GNU GCC support:

module load openmpi/gnu

3) Compile your code with the -ggdb flag to include GDB debugging info in the executable.

4) Create the GDB script, gdbscript.txt, to run when GDB is launched.
This is needed since the program will not start running until the
GDB ‘run’ command is called, and we need to automatically run all
jobs on remote nodes. This will also enable logging to gdb.txt.

set logging on
run

5) Run the MPI program with GDB:

mpirun gdb -x gdbscript.txt ./mpiprog.exe

6) When the program exits or an error is detected, you will be left in
GDB. You can now use any GDB commands, or quit by typing ‘quit’.

Memory Checking MPI Programs
----------------------------

First, follow steps 1-3 above.

4) When the interactive PBS job starts, run the MPI program with Valgrind:

mpirun valgrind --tool=memcheck --log-file=valgrind_%p.txt ./mpiprog.exe

5) Look at the valgrind_NNNN.txt files that were created, one for each process,
to determine if any memory leaks occurred. Valgrind often detects
uninitialized values in the Open MPI code, which should be ignored.

C++ Training: Valgrind

Valgrind is a tool for “memory debugging” of programs. It allows you to find places where memory is not properly allocated, which are difficult to find using traditional debuggers. Valgrind should be one of the first steps that you take when testing a program — even a simple one! It is available on Penn State’s clusters, or available on Linux.

Here are some “fun” programming errors you should avoid that Valgrind will help catch:

Trying to write outside the bounds

double *a;
int length = 5;
a = new double[length]; //a is now an array of size 5
for (int i = 0; i < 6; i++)
{
   a[i] = i; //error at i=5
}

The program will let you write a value at a[5], even though the only legal places to write are a[0] through a[4]. This is a corruption, since a[5] refers to a place where memory could be used by another variable. Valgrind will yell at you for this, and rightly so. Also be careful when allocating 2d and 3d arrays, since it is easy to confuse rows and columns, which will cause you major headaches.

Memory Leaks Inside Functions

void fun()
{
   double *a;
   a = new double[5];
   return;
}

Although the variable a gets destroyed as you leave the function, the memory that you allocated here doesn’t get destroyed at all! This is dangerous, especially if you call fun() many times.

Conditional Jump Depends on Uninitialized Variables

Even if you allocate memory properly, you may unfortunately perform this horrible error:

int y; //uninitialized at first
int x = 6;
int *a;
a = new int[5];

if (x < 5)
{
   y = 2;
}

a[y] = 7; //error, since y is not initialized yet
delete[] a; //a is deallocated properly

Here we managed memory correctly, but we used y before it had a value assigned to it. The danger here is that y really holds some garbage value (e.g., -58342671), so the program may “sometimes” execute the line without trouble, and sometimes it won’t. Valgrind to the rescue!

This tutorial explains how to start using it.