Dynamic memory allocation in C++

To write C++ programs successfully, you need a good understanding of how dynamic memory works, which is the “art” of performing manual memory management. When developing code in C++, you may not always know how much memory you will need before your program runs. For instance, the size of an array may be unknown until you execute the program.

Introduction to pointers

In order to understand dynamic memory allocation, we must first talk about pointers. A pointer is a variable that stores the memory address of another variable. That address is a number, usually printed in hexadecimal. You can obtain the address of a variable with an ampersand (&), as in the following snippet:

//Printing the address of a variable:
int age=43;
cout << &age << endl;
// The output would be something like: 0x7fff567ecb68

Since a pointer is also a variable, it must be declared before it is used. The declaration consists of giving it a name, ideally following good code style conventions, and declaring the data type that it “points to” using an asterisk (*), like so:

//Declaring pointers of different data types:
int *integerPointer;
double *doublePointer;
float *floatPointer;
char *charPointer;

The pointer’s operators

In summary, the two pointer operators are the address-of operator (&), which returns the memory address of a variable, and the contents-of operator (*), which returns the value of the variable located at that address.

// Example of pointer operators:
float variable=25.6;
float *pointer;
pointer= &variable;
cout << variable << endl; //outputs 25.6, the variable’s value
cout << pointer << endl; //outputs 0x7fff5a774b68, the variable’s location in memory
cout << *pointer << endl; // outputs 25.6, value of the variable stored in that location

This last operator is also called the dereference operator. It gives you direct access to the variable the pointer points to, which you can then use in regular operations:

float width = 5.0;
float length = 10.0;
float area;
float *pWidth = &width;
float *pLength = &length;

//Both of the following operations are equivalent
area = *pWidth * *pLength;
area = width * length;
//the output for both would be 50.

Dereferencing the pointers, *pWidth and *pLength, gives exactly the same values as the variables width and length, respectively.
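Dereferencing also works in the other direction: assigning to *pointer changes the variable it points to. Here is a minimal sketch of that idea; the function names doubleInPlace and scaledArea are made up for illustration.

```cpp
// Hypothetical helper: doubles the value stored at the address it receives.
void doubleInPlace(float *value) {
    *value = *value * 2.0f; // write through the pointer
}

// Hypothetical demo: changes width through a pointer, then computes the area.
float scaledArea() {
    float width = 5.0f;
    float length = 10.0f;
    float *pWidth = &width;

    doubleInPlace(pWidth);  // width itself is now 10.0
    return width * length;  // 10.0 * 10.0
}
```

Calling scaledArea() returns 100, because the write through pWidth modified width itself and not a copy.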

Memory allocation in C++

Now that you have a basic understanding of pointers, we can talk about memory allocation in C++. Memory in C++ is divided into two parts: the stack and the heap. All variables declared inside a function use memory from the stack, whereas the heap is unused memory that the program can allocate dynamically while it runs.

You may not know in advance how much memory you need to store information in a variable. You can allocate memory on the heap for a variable of a given data type using the new operator, like so:

new float;

This operator allocates the memory necessary for storing a float on the heap and returns its address. This address can be stored in a pointer, which can then be dereferenced to access the variable:

float *pointer = new float; //requesting memory
*pointer = 12.0; //store value
cout << *pointer << endl; //use value
delete pointer; // free up memory
// this is now a dangling pointer
pointer = new float; // reuse for new address
delete pointer; // free the second allocation as well

Here, the pointer is stored on the stack as a local variable, and holds the allocated heap address as its value. The value 12.0 is then stored at that address on the heap. You can then use this value for other operations. Finally, the delete statement releases the memory when it’s no longer needed; however, it does not delete the pointer itself, since that is stored on the stack. Pointers that still hold the address of released memory are called dangling pointers. They must not be dereferenced, but they can be reused by assigning them a new address.
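One common way to avoid using a dangling pointer by accident is to reset it right after the delete. The sketch below assumes a C++11 or later compiler (for nullptr); the names release and demoRelease are invented for this example.

```cpp
// Hypothetical helper: frees a single float and clears the caller's pointer.
// The pointer is taken by reference so the caller's copy is the one reset.
void release(float *&pointer) {
    delete pointer;     // deleting a null pointer is a safe no-op
    pointer = nullptr;  // the pointer no longer dangles
}

// Hypothetical demo: returns true if the pointer ends up null.
bool demoRelease() {
    float *pointer = new float; // requesting memory
    *pointer = 12.0f;           // store value
    release(pointer);           // memory freed, pointer reset
    release(pointer);           // harmless: nothing left to free
    return pointer == nullptr;
}
```

A null pointer can be tested before use, whereas a dangling pointer looks just like a valid one.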

Dynamic memory and arrays

Perhaps the most common use of dynamic memory allocation is arrays. Here’s a brief example of the syntax:

int *pointer = NULL; // initialized pointer
pointer = new int[10]; // request memory
delete[] pointer; //delete array pointed to by pointer

The NULL pointer has a value of zero. You can initialize a pointer to NULL when you do not yet have an address to assign to it.
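The real benefit shows up when the array size is only known while the program runs. The following sketch assumes a made-up function, sumOfFirst, that receives the size as an argument, allocates the array, uses it, and releases it.

```cpp
#include <cstddef>

// Hypothetical helper: allocates an array whose size is only known at run time,
// fills it with 0, 1, ..., count - 1, and returns the sum of its elements.
int sumOfFirst(std::size_t count) {
    int *values = new int[count]; // request memory for count ints

    for (std::size_t i = 0; i < count; i++) {
        values[i] = static_cast<int>(i);
    }

    int total = 0;
    for (std::size_t i = 0; i < count; i++) {
        total += values[i];
    }

    delete[] values; // release the whole array
    return total;
}
```

For instance, sumOfFirst(10) returns 45, and the memory is released before the function returns.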

Finally, to allocate and release memory for multi-dimensional arrays, you use an array of pointers to arrays. It sounds confusing, but you can do it with the following sample method:

int row = 3;
int col = 4;
double **p  = new double* [row]; // Allocate memory for rows

// Then allocate memory for the columns of each row
for(int i = 0; i < row; i++) {
    p[i] = new double[col];
}

//Release memory
for(int i = 0; i < row; i++) {
   delete[] p[i];
}
delete [] p;
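
The allocate-then-release pattern can also be wrapped in small helper functions so that the two loops always use the same row count. This is only a sketch; allocateGrid, freeGrid and fillAndSum are hypothetical names.

```cpp
// Hypothetical helper: allocates a rows x cols grid of doubles set to 0.0.
double **allocateGrid(int rows, int cols) {
    double **grid = new double*[rows];  // one pointer per row
    for (int i = 0; i < rows; i++) {
        grid[i] = new double[cols]();   // the () zero-initializes each row
    }
    return grid;
}

// Hypothetical helper: releases every row, then the array of row pointers.
void freeGrid(double **grid, int rows) {
    for (int i = 0; i < rows; i++) {
        delete[] grid[i];
    }
    delete[] grid;
}

// Hypothetical demo: stores 0, 1, 2, ... in the grid and returns the total.
double fillAndSum(int rows, int cols) {
    double **grid = allocateGrid(rows, cols);
    double total = 0.0;
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            grid[i][j] = i * cols + j;
            total += grid[i][j];
        }
    }
    freeGrid(grid, rows);
    return total;
}
```

With 3 rows and 4 columns, fillAndSum(3, 4) stores the values 0 through 11 and returns 66.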

I hope this quick overview provides a starting point for tackling your C++ memory allocation challenges.

Debugging MPI By Dave Hadka

Dave wrote the following instructions on how to debug MPI in an email recently, and I thought I’d post it here as a private post on the blog.

In case this isn’t already known, here’s instructions I came up with for running gdb and valgrind on MPI programs:

Debugging MPI with GDB
----------------------

1) Run an interactive PBS job:

qsub -I -l walltime=16:00:00 -l nodes=1:ppn=4

The interactive job will start you in your home folder. Use cd to change to your working directory.

2) Load the OpenMPI module with GNU GCC support:

module load openmpi/gnu

3) Compile your code with the -ggdb flag to include GDB debugging info in the executable.

4) Create the GDB script, gdbscript.txt, to run when GDB is launched.
This is needed since the program will not start running until the
GDB ‘run’ command is called, and we need to automatically run all
jobs on remote nodes. This will also enable logging to gdb.txt.

set logging on
run

5) Run the MPI program with GDB:

mpirun gdb -x gdbscript.txt ./mpiprog.exe

6) When the program exits or an error is detected, you will be left in
GDB. You can now use any GDB commands, or quit by typing ‘quit’.

Memory Checking MPI Programs
----------------------------

First, follow steps 1-3 above.

4) When the interactive PBS job starts, run the MPI program with Valgrind:

mpirun valgrind --tool=memcheck --log-file=valgrind_%p.txt ./mpiprog.exe

5) Look at the valgrind_NNNN.txt files that were created, one for each process,
to determine if any memory leaks occurred. Valgrind often detects
uninitialized values in the Open MPI code, which should be ignored.

C++ Training: Valgrind

Valgrind is a tool for “memory debugging” of programs. It allows you to find places where memory is not properly allocated, which are difficult to find using traditional debuggers. Valgrind should be one of the first steps that you take when testing a program, even a simple one! It is available on Penn State’s clusters and on Linux in general.

Here are some “fun” programming errors you should avoid that Valgrind will help catch:

Trying to write outside the bounds

double *a;
int length = 5;
a = new double[length]; //a is now an array of size 5
for (int i = 0; i < 6; i++)
{
   a[i] = i; //error at i=5
}

The program will let you write a value at a[5], even though the only legal places to write are a[0] through a[4]. This is memory corruption, since a[5] refers to a place in memory that could be used by another variable. Valgrind will yell at you for this, and rightly so. Also be careful when allocating 2d and 3d arrays, since it is easy to confuse rows and columns, which will cause you major headaches.
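
A simple way to avoid this mistake is to use the same variable for the allocation size and the loop bound. The sketch below uses a hypothetical function, fillAndGetLast, and assumes length is at least 1.

```cpp
// Hypothetical helper: fills an array of the given length with 0, 1, 2, ...
// and returns its last element. Assumes length >= 1.
double fillAndGetLast(int length) {
    double *a = new double[length];

    // The loop bound is the allocation size, so i never reaches a[length].
    for (int i = 0; i < length; i++) {
        a[i] = i;
    }

    double last = a[length - 1]; // highest legal index
    delete[] a;
    return last;
}
```

Here fillAndGetLast(5) writes a[0] through a[4] only and returns 4.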

Memory Leaks Inside Functions

void fun()
{
   double *a;
   a = new double[5];
   return;
}

Although the variable a gets destroyed as you leave the function, the memory that you allocated here doesn’t get destroyed at all! This is dangerous, especially if you call fun() many times.
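
One possible fix is to release the array before every return from the function. In this sketch the name funWithoutLeak is made up, and the array is filled and summed only to give the function something to do.

```cpp
// Hypothetical variant of fun() that releases its memory before returning.
double funWithoutLeak() {
    double *a = new double[5];

    double total = 0.0;
    for (int i = 0; i < 5; i++) {
        a[i] = i + 1;   // store 1, 2, 3, 4, 5
        total += a[i];
    }

    delete[] a; // without this line, every call would leak 5 doubles
    return total;
}
```

The function returns 15, and calling it many times no longer grows the program's memory use.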

Conditional Jump Depends on Uninitialized Variables

Even if you allocate memory properly, you may unfortunately perform this horrible error:

int y; //uninitialized at first
int x = 6;
int *a;
a = new int[5];

if (x < 5)
{
   y = 2;
}

a[y] = 7; //error, since y is not initialized yet
delete[] a; //a is deallocated properly

Here we managed memory correctly, but we accessed y before it had a value assigned to it. The danger here is that y really has some garbage value (e.g., -58342671), so the program may “sometimes” execute the line correctly, and sometimes it won’t. Valgrind to the rescue!
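
The usual remedy is to give y a value when it is declared, so that every path through the code uses a defined index. The sketch below is one way to do that; the function name indexWritten and the default index 0 are assumptions made for this example.

```cpp
// Hypothetical demo: returns the index that was written, or -1 on failure.
int indexWritten(int x) {
    int y = 0; // initialized up front, so it never holds garbage
    if (x < 5) {
        y = 2;
    }

    int *a = new int[5](); // the () zero-initializes the array
    a[y] = 7;              // y is always 0 or 2, both inside the array

    int result = (a[y] == 7) ? y : -1;
    delete[] a; // array form of delete matches new int[5]
    return result;
}
```

With x = 6 the condition is false and the write goes to index 0; with x = 3 it goes to index 2.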

This tutorial explains how to start using it.