Simple tricks to make your C/C++ code run faster

Us, engineers with no formal computer science training, for a myriad of good reasons tend to favor friendly programming languages such as Python over evil C/C++. However, when our application is performance sensitive and we cannot wait for results sitting in our chairs for as long as a Python code would require us to, we are sometimes forced to us C/C++. Why then not making the most of it when it this is the case?

Here are some simple tips to make your C/C++ code run even faster, and how to get some advice about further performance improvements. The last idea (data locality) is transferable to Python and other languages.

Most important trick

Improve your algorithm. Thinking if there is a simpler way of doing what you coded may reduce your algorithm’s complexity (say, from say n3 to n*log(n)), which would:

  • yield great speed-up when you have lots of data or needs to run a computation several times in a row, and
  • make you look smarter.

Compiler flags

First and foremost, the those who know what they are doing — compiler developers — do the work for you by calling the compiler with the appropriate flags. There are an incredible amount of flags you can call for that, but I would say that the ones you should have on whenever possible are -O3 and –march=native.

The optimization flags (-O1 to -O3, the latter more aggressive than the former) will perform a series of modification on your code behind the scenes to speed it up, some times by more than an order of magnitude. The issue is that this modifications may eventually make your code behave differently than you expected, so it’s always good to do a few smaller runs with -O0 and -O3 and compare their results before getting into production mode.

The –march=native flag will make the compiler fine tune your code to the processor it is being compiled on (conversely, –march=haswell would fine tune it to haswell architectures, for example). This is great if you are only going to run your code on your own machine or in another machine known to have a similar architecture, but if you try to run the binary on a machine with a different architecture, specially if it is an older one, you may end up with illegal instruction errors.

Restricted pointer array

When declaring a pointer array which you are sure will not be subject to pointer aliasing — namely there will be no other pointer pointing to the same memory address –, you can declare that pointer as a restrict pointer, as below:

  • GCC: double* __restrict__ my_pointer_array
  • Intel compiler: double* restrict my_pointer_array

This will let the compiler know that it can change order of certain operations involving my_pointer_array to make your code faster without changing some read/write order that may change your results. If you want to use the restricted qualifier with the intel compiler the flag -restrict must be passed as an argument to the compiler.

Aligned pointer array

By aligning an array, you are making sure the data is going to lie in the ideal location in memory for the processor to fetch it and perform the calculations it needs based on that array. To help your compiler optimize your data alignment, you need to (1) align your array when it is declared by a specific number of bytes and (2) tell your the compiler the array is aligned when using it to perform calculations — the compiler has no idea whether or not arrays received as arguments in function are aligned. Below are examples in C, although the same ideas apply to C++ as well.


#include <stdio.h>
#include <omp.h>

#define SIZE_ARRAY 100000
#define ALIGN 64

void sum(double *__restrict__ a, double *__restrict__ b, double *__restrict__ c, int n) {
    a = (double*) __builtin_assume_aligned(a, ALIGN);
    b = (double*) __builtin_assume_aligned(b, ALIGN);
    c = (double*) __builtin_assume_aligned(c, ALIGN);
    for (int i = 0; i < n; ++i)
        c[i] = a[i] + b[i];

int main(void) {

    double a[SIZE_ARRAY] __attribute__((aligned(ALIGN )));
    double b[SIZE_ARRAY] __attribute__((aligned(ALIGN )));
    double c[SIZE_ARRAY] __attribute__((aligned(ALIGN )));

    for (int i = 0; i < SIZE_ARRAY; ++i) {
        a[i] = 5.;
        b[i] = 2.;
    double start_time = omp_get_wtime();
    sum(a, b, c, SIZE_ARRAY);
    double time = omp_get_wtime() - start_time;
    printf("%0.6fs", time);

Intel compiler

#include <stdio.h>
#include <omp.h>

#define SIZE_ARRAY 100000
#define ALIGN 64

void sum(double* restrict a, double* restrict b, double* restrict c, int n) {
    __assume_aligned(a, ALIGN);
    __assume_aligned(b, ALIGN);
    __assume_aligned(c, ALIGN);
    for (int i = 0; i < n; ++i)
        c[i] = a[i] + b[i];

int main(void) {

    __declspec(align(ALIGN )) float a[SIZE_ARRAY];
    __declspec(align(ALIGN )) float b[SIZE_ARRAY];
    __declspec(align(ALIGN )) float c[SIZE_ARRAY];

    for (int i = 0; i < SIZE_ARRAY; ++i) {
        a[i] = 5.;
        b[i] = 2.;
    double start_time = omp_get_wtime();
    sum(a, b, c, SIZE_ARRAY);
    double time = omp_get_wtime() - start_time;

    printf("%0.6fs", time);

Edit: In a comment to this post, Krister Walfridsson not only caught an issue with my GCC code, for which mention I thank him, but he also shows the differences in machine code generated with and without alignment.

Data Locality

Computers are physical things, which means that data is physically stored and needs to be physically moved around in memory and between cache and processor in order to be used in calculations. This means that, if your data is stored all over the place in memory — e.g. in multiple pointer arrays in different parts of memory –, the processor will have to reach out to several parts of memory to fetch all your data before performing any computations. By having the data intelligently laid out in memory you ensure all data required for each computation is stored close to each other and in cache at the same time, which becomes even more important if your code uses too much data to fit in the cache at once.

In order to making your processor’s life easier, it is a good idea to ensure that all data required for a calculation step is close together. For example, if a given computation required three arrays of fixed sizes, it is always a good idea to merge them into one long array, as in the example below for the Intel compiler.

#include <stdio.h>
#include <omp.h>

#define SIZE_ARRAY 100000
#define ALIGN 64

void sum(double* restrict a, double* restrict b, double* restrict c, int n) {
    __assume_aligned(a, ALIGN);
    __assume_aligned(b, ALIGN);
    __assume_aligned(c, ALIGN);
    for (int i = 0; i < n; ++i)
        c[i] = a[i] + b[i];

int main(void) {

    __declspec(align(ALIGN )) float abc[3 * SIZE_ARRAY];

    for (int i = 0; i < 2 * SIZE_ARRAY; ++i) {
        a[i] = 5.;
        b[i] = 2.;
    double start_time = omp_get_wtime();
    sum(abc, abc + SIZE_ARRAY, abc + 2 * ARRAY, SIZE_ARRAY);
    double time = omp_get_wtime() - start_time;

    printf("%0.6fs", time);

or even, since c[i] depends only on b[i] and a[i], we can have the values of a, b and c intercalated to assure that all computations will be performed on data that is right next to each other in memory:

#include <stdio.h>
#include <omp.h>

#define SIZE_ARRAY 100000
#define ALIGN 64
#define STRIDE 3

void sum(double* restrict abc, int n, int stride) {
    __assume_aligned(abc, ALIGN);

    for (int i = 0; i < n; i += stride)
        abc[i+2] = abc[i] + abc[i+1];

int main(void) {
    __declspec(align(ALIGN )) double abc[3 * SIZE_ARRAY];

    for (int i = 0; i < 3 * SIZE_ARRAY; i += STRIDE) {
        abc[i] = 5.;
        abc[i+1] = 2.;
    double start_time = omp_get_wtime();
    sum(abc, SIZE_ARRAY, STRIDE );
    double time = omp_get_wtime() - start_time;

    printf("%0.6fs", time);


According a class project in which we had to write C code to perform matrix multiplication, the improvements suggested should may improve the performance of your code by 5 or 10 times. Also, the idea of data locality can be transferred to other languages, such as Python.


Reading CSV files in C++

If you are an engineer used to coding in Python or Matlab who is transitioning to C++, you will soon find out that even the most innocent task will now require several lines of code. A previous post has already shown how to export data to a CSV file. In order to facilitate your transition to C++, see below for an example of how to read your new CSV file.


#include <string>
#include <vector>
#include <sstream> //istringstream
#include <iostream> // cout
#include <fstream> // ifstream

using namespace std;

 * Reads csv file into table, exported as a vector of vector of doubles.
 * @param inputFileName input file name (full path).
 * @return data as vector of vector of doubles.
vector<vector<double>> parse2DCsvFile(string inputFileName) {

    vector<vector<double> > data;
    ifstream inputFile(inputFileName);
    int l = 0;

    while (inputFile) {
        string s;
        if (!getline(inputFile, s)) break;
        if (s[0] != '#') {
            istringstream ss(s);
            vector<double> record;

            while (ss) {
                string line;
                if (!getline(ss, line, ','))
                try {
                catch (const std::invalid_argument e) {
                    cout << "NaN found in file " << inputFileName << " line " << l
                         << endl;


    if (!inputFile.eof()) {
        cerr << "Could not read file " << inputFileName << "\n";
        __throw_invalid_argument("File not found.");

    return data;

int main()
    vector<vector<double>> data = parse2DCsvFile("test.csv");

    for (auto l : data) {
    	for (auto x : l)
    		cout << x << " ";
    	cout << endl;

    return 0;

Dynamic memory allocation in C++

To have success creating C++ programs, it’s essential to have a good understanding of the workings and implementation of dynamic memory,  which is the “art” of performing manual memory management.  When developing code in C++, you may not always know how much memory you will need before your program runs.  For instance, the size of an array may be unknown until you execute the program.

Introduction to pointers

In order to understand dynamic memory allocation, we must first talk about pointers. A pointer is essentially a variable that stores the address in memory of another variable. The pointer data type is a long hexadecimal number representing the memory address.   This address can be accessed using an ampersand (&) exemplified in the following script:

//Accessing a pointer:
int age=43;
cout << &age << endl;
// The output would be something like: 0x7fff567ecb68

Since a pointer is also a variable, it requires declaration before being used .   The declaration consists on giving it a name, hopefully following the good code style convention and declaring the data type that it “points to” using an asterisk (*) like so:

//Declaring pointers of different data types:
int *integerPointer;
double *doublePointer;
float *floatPointer;
char *charPointer;

The pointer’s operators

In summary, the two pointer operators are: address-of operator(&) which returns the memory address and contents-of operator (*) which returns the value of the variable located at that address.

// Example of pointer operators:
float variable=25.6;
float *pointer;
pointer= &variable;
cout << variable << endl; //outputs 25.6, the variable’s value
cout << pointer << endl; //outputs 0x7fff5a774b68, the variable’s location in memory
cout << *pointer << endl; // outputs 25.6, value of the variable stored in that location

This last operator is also called deference operator which enables you to access directly the variable the pointer points to, which you can then use for regular operations:

float width = 5.0;
float length = 10.0;
float area;
float *pWidth = &width;
float *pLength = &length;

//Both of the following operations are equivalent
area = *pWidth * *pLength;
area = width * length;
//the output for both would be 50.

Deferencing the pointers *pWidth and *pLength represents exactly the same as the variables width and length, respectively.

Memory allocation in C++

Now that you have some basic understanding of pointers, we can talk about memory allocation in C++.  Memory in C++ is divided in tow parts: the stack and the heap.  All variables declared inside the function use memory from the stack whereas unused memory that can be used to allocate memory dynamically is stored in the heap.

You may not know in advance how much memory you need to store information in a variable. You can allocate memory within the heap of a variable of a determined data type using the new operator like so:

new float;

This operator allocates the memory necessary for storing a float on the heap and returns that address. This address can also be stored in a pointer, which can then be deferenced to access the variable:

float *pointer = new float; //requesting memory
*pointer = 12.0; //store value
cout << *pointer << endl; //use value
delete pointer;// free up memory
// this is now a dangling pointer
pointer= new float // reuse for new address

Here, the pointer is stored in the stack as a local variable, and holds the allocated address in the heap as its value. The value of 12.0 is then stored at the address in the heap. You can then use this value for other operations. Finally, the delete statement releases the memory when it’s no longer needed; however, it does not delete the pointer since it was stored in the stack. These type of pointers that point to non-existent memory are called dangling pointers and can be reused.

Dynamic memory and arrays

Perhaps the most common use of dynamic memory allocation are arrays. Here’s a brief example of the syntax:

int *pointer= NULL; // initialized pointer
pointer= new int[10] // request memory
delete[]pointer; //delete array pointed to by pointer

The NULL pointer has a value of zero, you can declare a null pointer when you do not have the address to be assigned.

Finally, to allocate and release memory for multi-dimensional arrays, you basically use an array of pointers to arrays, it sounds confusing but you can do this using the following  sample method:

int row = 3;
int col = 4;
double **p  = new double* [row]; // Allocate memory for rows

// Then allocate memory for columns
for(int i = 0; i < col; i++) {
    p[i] = new double[col];

//Release memory
for(int i = 0; i < row; i++) {
   delete[] p[i];
delete [] p;

I hope this quick overview provides a starting point on tackling your  C++ memory allocation challenges.

Reed Group’s basic C++ code style conventions

It is always good practice for programmers to adopt some sort of style convention when developing new code. This helps keeping the code readable for both authors and collaborators, as well as for people that read your code on online repositories. Here I will set a precedent for a minimal C++ code style for Reed’s group encompassing C++ features we normally use based on the most common practices out there, so that we can more easily help each other with out codes and keep consistency when publishing them. This post may be updated if somebody sets precedents for C++ features didn’t account for (e.g. namespaces).

Naming conventions

  • Classes: Uppercase first letter. If the class name is comprised of more than one word, all words should be written together (no dashes, underscores, etc.) and the first letter of each word should be capitalized. E.g.: MyAwesomeClass.
  • Functions: Lowercase first letter. If the function name is comprised of more than one word, all words should be written together (no dashes, underscores, etc.) and the first letter of each word except the first should be capitalized. E.g.: myFantasticFunction. If you are creating a getter or a setter, be sure to follow the this standard. E.g. the getter for variable “thisVariable” would be “getThisVariable.”
  • Variables: Same as Functions. Acronyms should also follow this rule — e.g. a variable containing a short-term ROF (for risk-of-failure) value for a utility should be called something like “shortTermRof.”
  • Constants: All letters capitalized and words separated by underscores. E.g. MY_GREAT_CONSTANT.

Other naming rules

Besides naming conventions, there are other good practices when it comes to coming up with names in your code:

  • Do not assign one letter names, unless it is a temporary variable such as i, j, k used as indexes.
  • Assign informative names to your classes, functions, variables and constants. If you have a variable called “length,” another called “thisLength” and a third one called “realLength” your code will be really hard to follow.
  • Being concise is great (nobody reads code for its poetic variable names) but avoid shortening your names too much. Calling a variable “catchmentFallCreekIthaca” makes it much easier for someone else to know the information contained in that variable than calling it “catfacreith.”
  • We all get really frustrated with our codes at times, and want to curse it really bad. It’s fine to do it in your office when nobody is hearing, but be sure to not let that leak into your code and to keep some decency: e.g. avoid having in your code “this&%$*%DoesNot&$%*#@Work = true” or anything of the sort.

Other rules

  • Avoid magic numbers (hard coded numbers). Codes like the one below not only are hard to understand but also make the reader question if the results of the code are actually right:
    if (312 * evaporation + inflow / 52 - 7 * demand) {
        // Do something here

    Now imagine if the value 212 is the value of an area and is used in 83 different parts of your code: that’s a problem. Instead, declaring those numbers as constants would be preferred:

    const double DRYVILLE_RESERVOIR_AREA = 312.0;
    const double NUMBER_OF_WEEKS_IN_YEAR = 52.14;
    const double NUMBER_OF_DAYS_IN_WEEK = 7.0;
    // Lots of code here, since constants are normally declared at the top of the code.
    if (DRYVILLE_RESERVOIR_AREA * evaporation + inflow / NUMBER_OF_WEEKS_IN_YEAR - NUMBER_OF_DAYS_IN_WEEK * demand) {
        // Do something here
  • Keep your cpp files shorter than 500 lines. If you start approaching 500 lines, it may be the case that your class can be broken into parent and multiple children classes, or into two completely different classes.
  • Have only the main.cpp file in the root directory. All other files, if any, should be in directories so that the code is easy to navigate through.
  • If there is an issue or simplification to be fixed at some point in the future, use the “//FIXME:” comment to indicate it, as in the code below:
    //FIXME: replace constant area below by storage vs. area curve.
    if (DRYVILLE_RESERVOIR_AREA * evaporation + inflow / NUMBER_OF_WEEKS_IN_YEAR - NUMBER_OF_DAYS_IN_WEEK * demand) {
    // Do something here

Note that different languages have different standards. If coding in Python or Matlab, for example, be sure to follow the best practices for these languages. Also, if developing code in collaboration with another research group, be sure to negotiate a convention.

Profiling C++ code with Callgrind

Often times, we have to write code to perform tasks whose complexity vary from mundane, such as retrieving and organizing data, to highly complex, such as simulations CFD simulations comprising the spine of a project. In either case, depending on the complexity of the task and amount of data to be processed, it may happen for the newborn code to leave us staring at an underscore marker blinking gracefully for hours on a command prompt during its execution until the results are ready, leading to project schedule delays and shortages of patience. Two standard and preferred approaches to the problem of time intensive codes are to simplify the algorithm and to make the code more efficient. In order to better select the parts of the code to work on, it is often useful to first find the parts of the code in which more time by profiling the code.

In this post, I will show how to use Callgrind, part of Valgrind, and KCachegrind to profile C/C++ codes on Linux — unfortunately, Valgrind is not available for Windows or Mac, although it can be ran on cluster from which results can be downloaded and visualized on Windows with QCachegrind. The first step is to install Valgrind and KCachegrind by typing the following commands in the terminal of a Debian based distribution, such as Ubuntu (equivalent yum commands area available for Red Hat based distributions):

$ sudo apt-get install valgrind
$ sudo apt-get install kcachegrind

Now that the required tools are installed, the next step is to compile your code with GCC/G++ (with a make file, cmake, IDE or by running the compiler directly from the terminal) and then run the following command in a terminal (type ctrl+shift+T to open the terminal):

$ valgrind --tool=callgrind path/to/your/compiled/program program_arguments

Callgrind will then run your program with some instrumentation added to its execution to measure time expenditures and cache use by each function in your code. Because of the instrumentation, Your code will take considerably longer to run under Callgrind than it typically would, so be sure to run a representative task that is as small as possible when profiling your code. During its execution, Callgrind will output a report similar to the one below on terminal itself:

==12345== Callgrind, a call-graph generating cache profiler
==12345== Copyright (C) 2002-2015, and GNU GPL'd, by Josef Weidendorfer et al.
==12345== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==12345== Command: path/to/your/compiled/program program_arguments
==12345== For interactive control, run 'callgrind_control -h'.
==12345== Events    : Ir
==12345== Collected : 4171789731
==12345== I   refs:      4,171,789,731

The report above shows that it collected 4 billion events in order to generate the comprehensive report saved in the file callgrind.out.12345 — 12345 is here your process id, shown in the report above. Instead of submerging your soul into a sea of despair by trying to read the output file in a text editor, you should load the file into KCachegrind by typing:

$ kcachegrind calgrind.out.12345

You should now see a screen like the one below:


The screenshot above shows the profiling results for my code. The left panel shows the functions called by my code sorted by total time spent inside each function. Because functions call each other, callgrind shows two cost metrics as proxies for time spent in each function: Incl., showing the total cost of a function, and self, showing the time spent in each function itself discounting the callees. By clicking on “Self” to order to functions by the cost of the function itself, we sort the functions by the costs of their own codes, as shown below:


Callgrind includes functions that are native to C/C++ in its analysis. If one of them appears in the highest positions of the left panel, it may be the case to try to use a different function or data structure that performs a similar task in a more efficient way. Most of the time, however, our functions are the ones in most of the top positions in the list. In the example above, we can see that a possible first step I can take to improve the time performance of my code is to make function “ContinuityModelROF::shiftStorage” more efficient. A few weeks ago, however, the function “ContinuityModel::continuityStep” was ranked first with over 30% of the cost, followed by a C++ map related function. I then replaced a map inside that function by a pointer vector, resulting in the drop of my function’s cost to less than 5% of the total cost of the code.

In case KCachegrind shows that a given function that is called from multiple places in the code is costly, you may want to know which function is the main culprit behind the costly calls. To do this, click on the function of interest (in this case, “_memcpy_sse2_unalight”) in the left panel, and then click on “Callers” in the right upper panel and on “Call Graph” in the lower right panel. This will show in list and graph forms the calls made to the function by other functions, and the asociated percent costs. Unfortunately, I have only the function “ContinuityModelROF::calculateROF” calling “_memcpy_sse2_unalight,” hence the simple graph, but the graph would be more complex if multiple functions made calls to “_memcpy_sse2_unalight.”

I hope this saves you at least the time spend reading this post!

Root finding in MATLAB, R, Python and C++

In dynamical systems, we are often interested in finding stable points, or equilibria. Some systems have multiple equilibria. As an example, take the lake problem, which is modeled by the equation below where Xt is the lake P concentration, at are the anthropogenic P inputs, Yt~LN(μ,σ2)  are random natural P inputs, b is the P loss rate, and q is a shape parameter controlling the rate of P recycling from the sediment. The first three terms on the right hand side make up the “Inputs” in the figure, while the last term represents the “Outputs.” A lake is in equilibrium when the inputs are equal to the outputs and the lake P concentration therefore is not changing over time.


For irreversible lakes this occurs at three locations, even in the absence of anthropogenic and natural inputs: an oligotrophic equilibrium, an unstable equilibrium (called the critical P threshold) and a eutrophic equilibrium (see figure below).


The unstable equilibrium in this case is called the critical P threshold because once it is crossed, it is impossible to return to an oligotrophic equilibrium by reducing anthropogenic and natural P inputs alone. In irreversible lakes like this, we would therefore like to keep the lake P concentration below the critical P threshold. How do we find the critical P threshold? With a root finding algorithm!

As stated earlier, the system above will be in equilibrium when the inputs are equal to the outputs and the P concentration is not changing over time, i.e. when

X_{t+1} - X_t = \frac{X^q_t}{1+X^q_t} - bX_t = 0

Therefore we simply need to find the zero, or “root” of the above equation.  Most of the methods for this require either an initial estimate or upper and lower bounds on the location of the root. These are important, since an irreversible lake will have three roots. If we are only interested in the critical P threshold, we have to make sure that we provide an estimate which leads to the unstable equilibrium, not either of the stable equilibria. If possible, you should plot the function whose root you are finding to make sure you are giving a good initial estimate or bounds, and check afterward to ensure the root that was found is the one you want! Here are several examples of root-finding methods in different programming languages.

In MATLAB, roots can be found with the function fzero(fun,x0) where ‘fun’ is the function whose root you want to find, and x0 is an initial estimate. This function uses Brent’s method, which combines several root-finding methods: bisection, secant, and inverse quadratic interpolation. Below is an example using the lake problem.

myfun = @(x,b,q) x^q/(1+x^q)-b*x;
b = 0.42;
q = 2.0;
fun = @(x) myfun(x,b,q);
pcrit = fzero(fun,0.75);

This returns pcrit = 0.5445, which is correct. If we had provided an initial estimate of 0.25 instead of 0.75, we would get pcrit = 2.6617E-19, basically 0, which is the oligotrophic equilibrium in the absence of anthropogenic and natural P inputs. If we had used 1.5 as an initial estimate, we would get pcrit = 1.8364, the eutrophic equilibrium.


In R, roots can be found with the function uniroot, which also uses Brent’s method. Dave uses this on line 10 of the function lake.eval in his OpenMORDM example. Instead of taking in an initial estimate of the root, this function takes in a lower and upper bound. This is safer, as you at least know that the root estimate will lie within these bounds. Providing an initial estimate that is close to the true value should do well, but is less predictable; the root finding algorithm may head in the opposite direction from what is desired.

b <- 0.42
q <- 2.0
pcrit <- uniroot(function(x) x^q/(1+x^q) - b*x, c(0.01, 1.5))$root

This returns pcrit = 0.5445145. Good, we got the same answer as we did with MATLAB! If we had used bounds of c(0.75, 2.0) we would have gotten 1.836426, the eutrophic equilibrium.

What if we had given bounds that included both of these equilibria, say c(0.5, 2.0)? In that case, R returns an error: ‘f() values at end points not of opposite sign’. That is, if the value returned by f(x) is greater than 0 for the lower bound, it must be less than 0 for the upper bound and vice versa. In this case both f(0.5) and f(2.0) are greater than 0, so the algorithm fails. What if we gave bounds for which one is greater than 0 and another less, but within which there are multiple roots, say c(-0.5,2.0)? Then R just reports the first one it finds, in this case pcrit = 0.836437, the eutrophic equilibrium. So it’s important to make sure you pick narrow enough bounds that include the root you want, but not roots you don’t!


In Python, you can use either scipy.optimize.root or scipy.optimize.brentq, which is what Jon uses on line 14 here. scipy.optimize.root can be used with several different algorithms, but the default is Powell’s hybrid method, also called Powell’s dogleg method. This function only requires an initial estimate of the root.

from scipy.optimize import root
b = 0.42
q = 2.0
pcrit = root(lambda x: x**(1+x**q) - b*x, 0.75)

scipy.optimize.root returns an object with several attributes. The attribute of interest to us is the root, represented by x, so we want pcrit.x. In this case, we get the correct value of 0.54454. You can play around with initial estimates to see how pcrit.x changes.


Not surprisingly, scipy.optimize.brentq uses Brent’s method and requires bounds as an input.

from scipy.optimize import brentq as root
b = 0.42
q = 2.0
pcrit = root(lambda x: x**(1+x**q) - b*x, 0.01, 1.5)

This just returns the root itself, pcrit = 0.5445. Again, you can play around with the bounds to see how this estimate changes.


In C++, Dave again shows how this can be done in the function ‘main-lake.cpp’ provided in the Supplementary Material to OpenMORDM linked from this page under the “Publications” section. On lines 165-168 he uses the bisect tool to find the root of the function given on lines 112-114. I’ve copied the relevant sections of his code into the function ‘find_Pcrit.cpp’ below.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <iostream>
#include <boost/math/tools/roots.hpp>

namespace tools = boost::math::tools;
using namespace std;

double b, q, pcrit;

double root_function(double x) {
  return pow(x,q)/(1+pow(x,q)) - b*x;

bool root_termination(double min, double max) {
  return abs(max - min) <= 0.000001;

int main(int argc, char* argv[])
  b = 0.42;
  q = 2.0;

  std::pair<double, double> result = tools::bisect(root_function, 0.01, 1.0, root_termination);
  pcrit = (result.first + result.second)/2;
  cout << pcrit << endl;

This yields the desired root of pcrit = 0.54454, but of course, changing the bounds may result in different estimates. In case you missed it, the take home message is to be careful about your initial estimate and bounds ;).



Getting started with C and C++

I’ve been learning C and C++ recently and I thought I’d share my experience learning these languages through a post. Prior to learning C and C++, I had experience in Python and Matlab, but this was my first foray into lower level languages. In my attempts to learn each language I made my way through several courses and books available online; some I found very helpful, others not so much. In this post I’ll detail my experiences with each resource and provide some tips that may help those planning on learning these languages.

Main takeaways

To learn the languages, I used four main resources. Online courses from, a book titled Learn C the Hard Way, a book commonly known as K&R 2 and tutorials from For those who do not have the time or desire to read this whole post, I found the following resources to be the most useful:

For C:

Learning C the Hard Way

K&R 2

For C++:

Foundations of Programming: Object – Oriented Design (you need a login             to access)

Up and Running with C++ (again, you need a login to access)

Everyone’s learning style is different, but I found that I learned the languages much faster by coding examples myself, rather than watching someone walk through a script. I also found that courses that taught straight from the command line were more effective than courses that taught through an IDE. When using an IDE, I often found myself spending more time working through glitches or nuances within the IDE than learning the languages themselves.

I’ll detail my experiences with each resource below. I’ll start with C resources, then discuss C++.

Resources for Learning C

C Essential Training – From

I started my training by taking course on titled “C Essential Training”. is an online educational website with thousands of videos, many of which focus on programming. The service is free to Cornell students and graduate students (though I checked and unfortunately neither PSU, UC Davis nor CU Boulder have agreements with the site). I found the course to be well structured and I felt that the instructor presented the material clearly and understandably. Despite this, I do not feel that the course did an effective job teaching me C. The main problem I had with the course was its reliance on the Eclipse IDE.  Eclipse seems like a fine IDE, but I don’t plan to use an IDE when writing code and I didn’t want to spend the time taking a separate course to learn its intricacies (though does have a full course devoted to Eclipse). Throughout the course, I kept finding myself having small Eclipse problems (e.g. not being able to change the project I was working on or having compiler errors) that were not hard to solve, but were never discussed in the lectures. I was able to solve each problem by doing some research online, but each little problem took me time to resolve and was mentally taxing. After spending 30 minutes looking up an Eclipse fix, I was not in the mood to go troubleshooting interesting C questions . Another problem with using Eclipse is that the user is never forced to write their own makefiles, an omission that seems like it could really hurt someone who plans to run C programs through the command line. In summary, I would not recommend taking this course unless you are either thoroughly versed in Eclipse or plan to write all of your code through Eclipse.

Learning C the Hard Way

The next resource I used to learn C was a book that Jazmin pointed me to called Learning C the Hard Way by Zed A. Shaw (after some poking around I found this had been mentioned previously on this blog). The book is laid out as a tutorial, where each chapter teaches a new C concept (it’s really a C course in book form).The author takes a slightly nontraditional teaching approach in that he makes you write the code first, then explains in detail what you just wrote. I found this very hands on teaching method extremely helpful. When I wrote the code myself, I was forced to focus on every detail of the code (something that is very important in a language like C). I also was able to learn which concepts were genuinely challenging for me and which concepts I needed more work on. When I watched the lectures, I’d often believe I understood a concept, only to find out later that I had misinterpreted the instructors lesson.

The book does not use an IDE, but rather writes code in a text editor (I used Sublime Text) and runs them on the Unix command line.  The author provides a succinct introduction to makefiles and how to use them, which was refreshing after the Eclipse based course that never mention makefiles or compilers.

Overall I found the teaching method employed by the book to be very effective, and I would highly recommend it to new C users. I should note however, that there seems to be some controversy surrounding the book. If you google “Learning C the hard way” you’ll find some very heated exchanges between the author and a blogger who criticized the book’s teaching methodology. The blogger had two main criticisms of the book; first that it over simplified and inaccurately presented key C concepts, and second, that the author failed to teach accepted programming standards for the C language. Mr. Shaw’s rebuttal was that the book’s purpose was to teach people get people comfortable with C and begin actually coding with it, then  once they are literate, have them go back and learn more about the language’s nuances. I personally agree with Mr. Shaw on this point, though I don’t have a background in computer science so my opinion is only that of an beginner. Many of the criticisms of the book seemed to come from the perspective of an advanced coder who is unable to see the language through the eyes of a beginner. Mr. Shaw’s explanations might be over simplified, but they do a good job demystifying many of the most foreign  aspects of C. I think that use of this book should be supplemented with other sources, especially documents on accepted C coding standards, but if you’re looking for a quick way to get on your feet with C and gain some confidence, then the book is a great resource.

I used a free beta version of the book which can be found here: but you can also purchase the book from the author here:

I found the beta version to be just fine, but there were some minor errors and some sections were clearly under construction.

The blog post criticizing the book can be found here:

K&R 2

A resource that I discovered through reading the exchanges between the Shaw and his critics was “The C Programming Language” by Brian W. Kernighan and Dennis M. Ritchie (commonly referred to as K&R 2 which is what I’ll call it for the rest of the post). One of the Authors of this book, Dennis Ritchie, actually coauthored the C language and this book is talked of as the go to authority of all matters C. Mr. Shaw devoted a whole chapter of “Learning C the Hard way” to bashing this book, but I found its layout and explanations quite accessible and useful. I did not find the tutorials as direct as “Learning C the Hard Way”, but I found it to be a helpful supplement.


Resources for Learning C++

Foundations of Programming: Object-Oriented Design – From

A main difference between C and C++ is that C++ is an object oriented language. I had some very basic experience in using object oriented programming, but was looking for a refresher before learning C++. “Foundations of Programming: Object-Oriented Design” was an excellent course that taught me all I wanted to know about object-oriented programming and more. The course is purely conceptual and does not teach any actual code or contain any practice problems. It presents the concepts in a simple yet comprehensive manner that I found very helpful. I would highly recommend this course to anyone hoping to learn or brush up their knowledge of how object-oriented programming works.

Up and Running with C++ – From

This course was very similar in layout to the C course from, and I have the same criticisms. The entire course used Eclipse, and I kept having minor problems that were never addressed by the lectures but prevented me from executing my code. I did feel like I was able to learn the basic tools I needed from the lectures, but I would have gotten much more out of the class if it had been taught through the command line. I also felt that the course was sparse on exercises and heavy on lectures. I found that I got much less out of watching the instructor write code than being forced to write out the code myself (as Learning C the Hard Way forces you to do).

This resource is linked often in older posts on this blog, and I found it helpful in answering C++ questions I had after finishing the courses. I did not find that tutorial had the most helpful narration of how one may learn C++ from scratch, but it has very succinct definitions of many C++ components and was helpful as a reference. I think this site is the one I will look to most when troubleshooting future C++ code.

Final thoughts

I’d like to note that I found WRASEMAN’s post on makefiles a few weeks back  to be quite helpful. From my limited experience, ensuring that your code compiles correctly can be one of the most challenging parts of using a lower level language and the post has some excellent resources that explain makefiles are and how they can be used.

I know there are a lot of contributors and readers of this blog who are much more versed in C and C++ than I am, so if you’d like to chime in on useful resources, please do so in the comments.