A non-intimidating introduction to parallel computing with Numba

This blog post is adapted from material I learned during the 2021 San Diego Supercomputer Center (SDSC) Summer Institute. This was an introductory boot camp to high-performance computing (HPC), and one of the modules taught the application of Numba for in-line parallelization and speeding up of Python code.

What is Numba?

According to its official web page, Numba is a just-in-time (JIT) compiler that translates subsets of Python and NumPy code into fast machine code, enabling it to run at speeds approaching that of C or Fortran. This is becuase JIT compilation enables specific lines of code to be compiled or activated only when necessary. Numba also makes use of cache memory to generate and store the compiled version of all data types entered to a specific function, which eliminates the need for recompilation every time the same data type is called when a function is run.

This blog post will demonstrate a simple examples of using Numba and its most commonly-used decorator, @jit, via Jupyter Notebook. The Binder file containing all the executable code can be found here.

Note: The ‘@‘ flag is used to indicate the use of a decorator

Installing Numba and Setting up the Jupyter Notebook

First, in your command prompt, enter:

pip install numba

Alternatively, you can also use:

conda install numba

Next, import Numba:

import numpy as np
import numba
from numba import jit
from numba import vectorize

Great! Now let’s move onto using the @jit decorator.

Using @jit for executing functions on the CPU

The @jit decorator works best on numerical functions that use NumPy. It has two modes: nopython mode and object mode. Setting nopython=True tell the compiler to overlook the involvement of the Python interpreter when running the entire decorated function. This setting leads to the best performance. However, in the case when:

  1. nopython=True fails
  2. nopython=False, or
  3. nopython is not set at all

the compiler defaults to object mode. Then, Numba will manually identify loops that it can compile into functions to be run in machine code, and will run the remaining code in the interpreter.

Here, @jit is demonstrated on a simple matrix multiplication function:

# a function that does multiple matrix multiplication
def matrix_multiplication(A, x):
    b = np.empty(shape=(x.shape[0],1), dtype=np.float64)
    for i in range(x.shape[0]):
        b[i] = np.dot(A[i,:], x)
    return b

Remember – the use of @jit means that this function has not been compiled yet! Compilation only happens when you call the function:

A = np.random.rand(10, 10)
x = np.random.rand(10, 1)

But how much faster is Numba really? To find out, some benchmarking is in order. Jupyter Notebook has a handy function called %timeit that runs simple functions many times in a loop to get their average execution time, that can be used as follows:

%timeit matrix_multiplication(A,x)

# 11.4 µs ± 7.34 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

Numba has a special .py_func attribute that effectively allows the decorated function to run as the original uncompiled Python function. Using this to compare its runtime to that of the decorated version,

%timeit matrix_multiplication.py_func(A,x)

# 35.5 µs ± 3.5 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

From here, you can see that the Numba version runs about 3 times faster than using only NumPy arrays. In addition to this, Numba also supports tuples, integers, floats, and Python lists. All other Python features supported by Numba can be found here.

Besides explicitly declaring @jit at the start of a function, Numba makes it simple to turn a NumPy function into a Numba function by attaching jit(nopython=True) to the original function. This essentially uses the @jit decorator as a function. The function to calculate absolute percentage relative error demonstrates how this is done:

# Calculate percentage relative error
def numpy_re(x, true):
    return np.abs(((x - true)/true))*100

numba_re = jit(nopython=True)(numpy_re)

And we can see how the Number version is faster:

%timeit numpy_re(x, 0.66)
%timeit numba_re(x, 0.66)

where the NumPy version takes approximately 2.61 microseconds to run, while the Numba version takes 687 nanoseconds.

Inline parallelization with Numba

The @jit decorator can also be used to enable inline parallelization by setting its parallelization pass parallel=True. Parallelization in Numba is done via multi-threading, which essentially creates threads of code that are distributed over all the available CPU cores. An example of this can be seen in the code snippet below, describing a function that calculates the normal distribution of a set of data with a given mean and standard deviation:

SQRT_2PI = np.sqrt(2 * np.pi)

@jit(nopython=True, parallel=True)
def normals(x, means, sds):
    n = means.shape[0]
    result = np.exp(-0.5*((x - means)/sds)**2)
    return (1 / (sds * np.sqrt(2*np.pi))) * result

As usual, the function must be compiled:

means = np.random.uniform(-1,1, size=10**8)
sds = np.random.uniform(0.1, 0.2, size=10**8)

normals(0.6, means, sds)

To appreciate the speed-up that Numba’s multi-threading provides, compare the runtime for this with:

  1. A decorated version of the function with a disabled parallel pass
  2. The uncompiled, original NumPy function

The first example can be timed by:

normals_deco_nothread = jit(nopython=True)(normals.py_func)
%timeit normals_deco_nothread(0.6, means, sds)

# 3.24 s ± 757 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

The first line of the code snippet first makes an uncompiled copy of the normals function, and then applies the @jit decorator to it. This effectively creates a version of normals that uses @jit, but is not multi-threaded. This run of the function took approximately 3.3 seconds.

For the second example, simply:

%timeit normals.py_func(0.6, means, sds)

# 7.38 s ± 759 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Now, compare both these examples to the runtime of the decorated and multi-threaded normals function:

%timeit normals(0.6, means, sds)

# 933 ms ± 155 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

The decorated, multi-threaded function is significantly faster (933 ms) than the decorated function without multi-threading (3.24 s), which in turn is faster than the uncompiled original NumPy function (7.38 s). However, the degree of speed-up may vary depending on the number of CPUs that the machine has available.


In general, the improvements achieved by using Numba on top of NumPy functions are marginal for simple, few-loop functions. Nevertheless, Numba is particularly useful for large datasets or high-dimensional arrays that require a large number of loops, and would benefit from the one-and-done compilation that it enables. For more information on using Numba, please refer to its official web page.

Introduction to Unit Testing


Here is something that has happened to most or all of us engineers who code. We write a function or a class and make sure it works by running it and printing some internal state output on the console. The following week, after adding or making improvements on seemingly unrelated parts of the code, the results start to look weird. After a few hours of investigation (a.k.a. wasted time) we find out the reason for the weird results is that that first function or class is either not giving us the right numbers anymore or not behaving as expected with different inputs. I wonder how many papers would have been written, how many deadlines would have been met, and how many “this $@#%& does not @#$% work!!!” thoughts and comments to friends and on our codes themselves would have been avoided if we managed to get rid of such frustrating and lengthy situations.

In this post I will introduce a solution to this rather pervasive issue: unit testing, which amounts to ways of creating standard tests throughout the development of a code to avoid the mentioned setbacks and simplify debugging. I will also get into some of the specifics of unit testing for Python and demonstrate it with a reservoir routing example. Lastly, I will briefly mention a unit testing framework for C++.

The General Idea

Unit test frameworks are libraries with functions (or macros) that allow for the creation of standard tests for individual parts of code, such as function and classes, that can be run separately from the rest of code. The advantages of using Unit Testing are mainly that (1) it eliminates the need of running the entire code with a full set of inputs simply to see if a specific function or class is working, which in some cases can get convoluted take a really long time, (2) it allows for easy and fast debugging, given that if a certain test fails you know where the problem is, (3) it prevents you or someone else from breaking the code down the road by running the tests frequently as the code is developed, and (4) it prevents an error in a function or class to be made noticible in another, which complicates the debugging process. Of course, the more modular a code is the easier it is to separately test its parts — a code comprised of just one 1,000-lines long function cannot be properly unit-tested, or rather can only have one unit test for the entire function.

Example Code to be Tested

The first thing we need to perform unit testing is a code to be tested. Let’s then consider the reservoir routing code below, comprised of a reservoir class, a synthetic stream flow generation function, a function to run the simulation over time, and the main.:

import numpy as np
import matplotlib.pyplot as plt

class Reservoir:
    __capacity = -1
    __storage_area_curve = np.array([[], []])
    __evaporation_series = np.array([])
    __inflow_series = np.array([])
    __demand_series = np.array([])
    __stored_volume = np.array([])

    def __init__(self, storage_area_curve, evaporations, inflows, demands):
        """ Constructor for reservoir class

            storage_area_curve {2D numpy matrix} -- Matrix with a
            row of storages and a row of areas
            evaporations {array/list} -- array of evaporations
            inflows {array/list} -- array of inflows
            demands {array/list} -- array of demands

        self.__storage_area_curve = storage_area_curve
        assert(storage_area_curve.shape[0] == 2)
        assert(len(storage_area_curve[0]) == len(storage_area_curve[1]))

        self.__capacity = storage_area_curve[0, -1]

        n_weeks = len(demands)
        self.__stored_volume = np.ones(n_weeks, dtype=float) * self.__capacity
        self.__evaporation_series = evaporations
        self.__inflow_series = inflows
        self.__demand_series = demands

    def calculate_area(self, stored_volume):
        """ Calculates reservoir area based on its storave vs. area curve

            stored_volume {float} -- current stored volume

            float -- reservoir area

        storage_area_curve_T = self.__storage_area_curve.T .astype(float)

        if stored_volume  self.__capacity:
            print "Storage volume {} graeter than capacity {}.".format(
                stored_volume, self.__capacity)
            raise ValueError

        for i in range(1, len(storage_area_curve_T)):
            s, a = storage_area_curve_T[i]
            # the &st; below needs to be replace with "smaller than" symbol. WordPress code highlighter has a bug that was distorting the code because of this "smaller than."
            if stored_volume &st; s:
                sm, am = storage_area_curve_T[i - 1]
                return am + (stored_volume - sm)/ (s - sm) * (a - am)

        return a

    def mass_balance(self, upstream_flow, week):
        """ Performs mass balance on reservoir

            upstream_flow {float} -- release from upstream reservoir
            week {int} -- week

            double, double -- reservoir release and unfulfilled
            demand (in case reservoir gets empty)

        evaporation = self.__evaporation_series[week] *\
            self.calculate_area(self.__stored_volume[week - 1])
        new_stored_volume = self.__stored_volume[week - 1] + upstream_flow +\
            self.__inflow_series[week] - evaporation - self.__demand_series[week]

        release = 0
        unfulfiled_demand = 0

        if (new_stored_volume > self.__capacity):
            release = new_stored_volume - self.__capacity
            new_stored_volume = self.__capacity
        elif (new_stored_volume < 0.):
            unfulfiled_demand = -new_stored_volume
            new_stored_volume = 0.

        self.__stored_volume[week] = new_stored_volume

        return release, unfulfiled_demand

    def get_stored_volume_series(self):
        return self.__stored_volume

def run_mass_balance(reservoirs, n_weeks):
    upstream_flow = 0
    release = 0
    total_unfulfilled_demand = 0
    for week in range(1, n_weeks):
        for reservoir in reservoirs:
            release, unfulfilled_demand = reservoir.mass_balance(release, week)

def generate_streamflow(n_weeks, sin_amplitude, log_mu, log_sigma):
    """Log-normally distributed stream flow generator.

        n_weeks {int} -- number of weeks of stream flows
        sin_amplitude {double} -- amplitude of log-mean sinusoid fluctuation
        log_mu {double} -- mean log-mean
        log_sigma {double} -- log-sigma

        {Numpy array} -- stream flow series

    streamflows = np.zeros(n_weeks)

    for i in range(n_weeks):
        streamflow = np.random.randn() * log_sigma + log_mu *\
            (1. + sin_amplitude * np.sin(2. * np.pi / 52 * (i % 52)))
        streamflows[i] = np.exp(streamflow)

    return streamflows

if __name__ == "__main__":
    n_weeks = 522
    sin_amplitude, log_mean, log_std = 1., 2.0, 1.8
    streamflows = generate_streamflow(n_weeks, sin_amplitude, log_mean, log_std)
    storage_area_curve = np.array([[0, 1000, 3000, 4000], [0, 400, 600, 900]])

    reseroir1 = Reservoir(storage_area_curve, np.random.rand(n_weeks) / 10,\
        streamflows, np.random.rand(n_weeks) * 60)

    run_mass_balance([reseroir1], n_weeks)

    plt.xlabel("Weeks [-]")
    plt.ylabel("Stored Volume [MG]")
    plt.title("Storage Over Time")
    plt.ylim(0, 4100)

We have quite a few functions to be tested in the code above, but lets concentrate on Reservoir.calculate_area and generate_streamflow. The first is a function within a class (Reservoir) while the second is a standalone function. Another difference between both functions is that we know exactly what output to expect from Reservoir.calculate_area (area given storage), while given generate_stream flow is setup to return random numbers we can only test their estimated statistical properties, which will change slightly every time we run the function. We will use the PyTest framework to perform unit testing in the reservoir routing code above.


PyTest is one among many available unit test frameworks for Python. The reason for its choice is that PyTest is commonly used by Python developers. You can find the API (list of commands) here and examples more here.

Basic Work Flow

Using PyTest is pretty simple. All you have to do it:

  • Install PyTest with pip or conda (pip install pytest or conda install pytest)
  • Create a file (preferably in the same directory as the file with the code you want to test) whose name either begins or end with “test_” or “_test.py,” respectively — this is a “must.” In this example, the main code is called “reservoir_routing.py,” so I created the PyTest file in the same directory and named it “test_reservoir_routing.py.”
  • In the unit test file (“test_reservoir_routing.py”), import from the main code file the functions and classes you want to test, the PyTest module pytest and any other packages you may use to write the tests, such as Numpy.
  • Write the tests. Tests in PyTest are simply regular Python functions whose names begin with “test_” and that have assertions (“assert,” will get to that below) in them.
  • On the Command Prompt, PowerShell, or Linux terminal, run the command “pytest.” PyTest will then look for all your files that begin with “test_” or end in “_test.py,” scan them for functions beginning with “test_,” which indicate a test function, run them and give you the results.


Assertions are the actual checks, the cores of the tests. Assertions are performed with the assert command and will return “pass” if the condition being asserting is true and “fail” otherwise. Below is an example of an assertion, which will return “pass” if the function being tested returns the value on the right-hand-side for the given arguments or “false” otherwise:

assert function_calculate_something(1, 'a', 5) == 100

Sometimes we are not sure of the value a function will return but have an approximate idea. This is the case of functions that perform stochastic calculations or calculate numerically approximate solutions — e.g. sampling from distributions, Newton-Rhapson minimum finder, E-M algorithm, finite-differences PDE solvers, etc. In this case, we should perform our assertion against an approximate value with:

# assert if function returns 100 +- 2
assert function_calculate_something(1, 'a', 5) == pytest.approx(100., abs=2.)

or with:

# assert if function returns 100 +- 0.1 (or 100 +- 100 * 1e-3)
assert function_calculate_something(1, 'a', 5) == pytest.approx(100., rel=1e-3)

Sometimes (although, unfortunately, rarely) we add code to a function to check the validity of the arguments, intermediate calculations, or to handle exceptions during the function’s execution. We can also test all that with:

# Test if an exception (such as Exception) are properly raised
with pytest.raises(Exception):
    function_calculate_something(1345336, 'agfhdhdh', 54784574)

Test File Example

Putting all the above together, below is a complete PyTest file with two tests (two test functions, each with whatever number of assertions):

from reservoir_routing import Reservoir, generate_streamflow
import numpy as np
import pytest

def test_calculate_area():
    n_weeks = 522
    storage_area_curve = np.array([[0, 500, 800, 1000], [0, 400, 600, 900]])

    reseroir1 = Reservoir(storage_area_curve, np.random.rand(n_weeks),
                            np.zeros(n_weeks), np.random.rand(n_weeks) * 20)

    # Test specific values of storages and areas
    assert reseroir1.calculate_area(500) == 400
    assert reseroir1.calculate_area(650) == 500
    assert reseroir1.calculate_area(1000) == 900

    # Test if exceptions are properly raised
    with pytest.raises(ValueError):

def test_generate_streamflow():
    n_weeks = 100000
    log_mu = 7.8
    log_sigma = 0.5
    sin_amplitude = 1.
    streamflows = generate_streamflow(n_weeks, sin_amplitude, log_mu, log_sigma)

    whitened_log_mu = (np.log(streamflows[range(0, n_weeks, 52)]) - log_mu) / log_sigma

    # Test if whitened mean in log space is 0 +- 0.5
    assert np.mean(whitened_log_mu) == pytest.approx(0., abs=0.05)

If I open my Command Prompt, navigate to the folder where both the main code and the test Python files are located and run the command pytest I will get the output below:

F:\Dropbox\Bernardo\Blog posts>pytest
============================= test session starts =============================
platform win32 -- Python 2.7.13, pytest-4.2.0, py-1.7.0, pluggy-0.8.1
rootdir: F:\Dropbox\Bernardo\Blog posts, inifile:
collected 2 items

test_reservoir_routing.py F.                                             [100%]

================================== FAILURES ===================================
_____________________________ test_calculate_area _____________________________

    def test_calculate_area():
        n_weeks = 522
        storage_area_curve = np.array([[0, 500, 800, 1000], [0, 400, 600, 900]])

        reseroir1 = Reservoir(storage_area_curve, np.random.rand(n_weeks),
                                np.zeros(n_weeks), np.random.rand(n_weeks) * 20)

        # Test specific values of storages and areas
        assert reseroir1.calculate_area(500) == 400
>       assert reseroir1.calculate_area(650) == 500
E       assert 400 == 500
E        +  where 400 = (650)
E        +    where  = .calculate_area

test_reservoir_routing.py:14: AssertionError
========================== deprecated python version ==========================
You are using Python 2.7.13, which will no longer be supported in pytest 5.0
For more information, please read:
===================== 1 failed, 1 passed in 1.07 seconds ======================

F:\Dropbox\Bernardo\Blog posts>

This seems like a lot just to say that one test passed and the other failed, but there is actually some quite useful information in this output. The lines below (20 and 21 of the output above) indicate which assertion failed, so you can go ahead and debug your the corresponding specific part of code with a debugger:

>       assert reseroir1.calculate_area(650) == 500
E       assert 400 == 500

I actually found this error with PyTest as I was writing the reservoir routing code for this post and used VS Code, which has a debugger integrated with PyTest, to debug and fix the error. The error is happening because of an int to float conversion that is not being done correctly in line 45 of the original code, which can be replaced by the version below to fix the problem:

storage_area_curve_T = self.__storage_area_curve.T<strong>.astype(float)</strong>

The last piece of important information contained in the last few lines of the output is that support for Python 2.7 will be discontinued, as it is happening with other Python packages as well. It may be time to finally switch to Python 3.

Practice Exercise

As a practice exercise, try writing a test function for the Reservoir.mass_balance function including cases in which the reservoir spills, gets empty, and is partly full.

Unit Testing in C++

A framework for unit testing in C++ that is quite popular is the Catch2 framework. The idea is exactly the same: you have a code you want to test and use the framework to write test functions. An example of Catch2 being used to check a factorial function is shown below, with both the main and the test codes in the same file:

#define CATCH_CONFIG_MAIN  // This tells Catch to provide a main() - only do this in one cpp file
#include "catch.hpp"

unsigned int Factorial( unsigned int number ) {
    return number <= 1 ? number : Factorial(number-1)*number;

TEST_CASE( "Factorials are computed", "[factorial]" ) {
    REQUIRE( Factorial(1) == 1 );
    REQUIRE( Factorial(2) == 2 );
    REQUIRE( Factorial(3) == 6 );
    REQUIRE( Factorial(10) == 3628800 );


Setting up and running unit testing is not rocket science. Also, with my test functions I do not have to use my actual reservoir system test case in order to check if the surface area calculations are correct. If I find they are not, I can just re-run the test function with a debugger (VS Code is great for that) and fix my function without having to deal with the rest of the code. You would be surprised by how many hours of debugging can be saved by taking the unit test approach. Lastly, be sure to run your tests after any change to your code you consider minimally significant (I know, this sounds vague). You never know when you mess something up by trying to fix or improve something else.

Time series forecasting in Python for beginners

Time series forecasting in Python for beginners

This semester I am teaching Engineering Management Methods here at Cornell University. The course is aimed at introducing engineering students to systems thinking and a variety of tools and analyses they can use to analyze data. The first chapter has been on time series forecasting, where we discussed some of the simpler models one can use and apply for forecasting purposes, including Simple and Weighted Moving Average, Single and Double Exponential Smoothing, Additive and Multiplicative Seasonal Models, and Holt Winter’s Method.

The class applications as well as the homework are primarily performed in Excel, but I have been trying, with limited success, to encourage the use of programming languages for the assignments. One comment I’ve received by a student has been that it takes significantly more time to perform the calculations by coding; they feel that it’s a waste of time. I initially attributed the comment to the fact that the student was new to coding and it takes time in the beginning, but on later reflection I realized that, in fact, the student was probably simply manually repeating the same Excel operations by using code: take a set of 30 observations, create an array to store forecasts, loop through every value and calculate forecast using model formula, calculate error metrics, print results, repeat steps for next set of data. It occurred to me that of course they think it’s a waste of time, because doing it that way completely negates what programming is all about: designing and building an executable program or function to accomplish a specific computing task. In this instance, the task is to forecast using each of the models we learn in class and the advantage of coding comes with the development of some sort of program or function that performs these operations for us, given a set of data as input. Simply going through the steps of performing a set of calculations for a problem using code is not much different than doing so manually or in Excel. What is different (and beneficial) is designing a code so that it can then be effortlessly applied to all similar problems without having to re-perform all calculations. I realize this is obvious to the coding virtuosos frequenting this blog, but it’s not immediately obvious to the uninitiated who are rather confused on why Dr. Hadjimichael is asking them to waste so much time for a meager bonus on the homework.

So this blog post, is aimed at demonstrating to coding beginners how one can transition from one way of thinking to the other, and providing a small time-series-forecasting toolkit for users that simply want to apply the models to their data.

The code and data for this example can be found on my GitHub page and I will discuss it below. I will be using a wine sales dataset that lists Australian wine sales (in kiloliters) from January 1980 to October 1991. The data looks like this:

Date Sales
1/1/80 464
2/1/80 675
3/1/80 703
4/1/80 887
5/1/80 1139
6/1/80 1077
7/1/80 1318
8/1/80 1260
9/1/80 1120
10/1/80 963
11/1/80 996
12/1/80 960
view raw wine_sales.csv hosted with ❤ by GitHub

And this is what the time series looks like:


We first need to import the packages we’ll be using and load the data. I will be using Pandas in this example (but there’s other ways). I’m also defining the number of seasonal periods in a cycle, in this case 12.

import numpy as np #Package we'll use for numerical calculations
import matplotlib.pyplot as plt #From matplotlib package we import pyplot for plots
import pandas #Package to data manipulation
import scipy.optimize #Package we'll use to optimize
plt.style.use('seaborn-colorblind') #This is a pyplot style (optional)

'''Load the data into a pandas series with the name wine_sales'''
time_series = pandas.Series.from_csv("wine_sales.csv", header=0)

P=12 #number of seasonal periods in a cycle
In class, I’ve always mentioned that one should use a training and a validation set for model development, primarily to avoid overfitting our model to the specific training set. In this example, the functions are written as they apply to the training set. Should you choose to apply the functions listed here, you should apply the functions for the training set, extract forecasts and then use those to initialize your validation period. To divide the observations, you would do something like this:
training = time_series[0:108] # Up to December '88
validation = time_series[108:] # From January '89 until end

Now, if say, we wanted to apply the Naive model of the next steps forecast being equal to the current observation, i.e., \hat{y}_{t+1}=y_t, we’d do something like:

y_hat=pandas.Series().reindex_like(time_series) # Create an array to store forecasts
y_hat[0]= time_series[0] # Initialize forecasting array with first observation
''' Loop through every month using the model to forecast y_hat'''
for t in range(len(y_hat)-1): # Set a range for the index to loop through
    y_hat[t+1]= time_series[t] # Apply model to forecast time i+1

Now if we’d like to use this for any time series, so we don’t have to perform our calculations every time, we need to reformat this a bit so it’s a function:

def naive(time_series):
    y_hat[0]= time_series[0] # Initialize forecasting array with first observation
    ''' Loop through every month using the model to forecast y'''
    #This sets a range for the index to loop through
    for t in range(len(y_hat)-1): 
        y_hat[t+1]= time_series[t] # Apply model to forecast time i+1
    return y_hat

Now we can just call define this function at the top of our code and just call it with any time series as an input. The function as I’ve defined it returns a pandas.Series with all our forecasts. We can then do the same for all the other modeling methods (below). Some things to note:

  • The data we read in the top, outside the functions, as well as any parameters defined (P in this case) are global variables and do not need to be defined as an input to the function. The functions below only need a list of parameter values as inputs.
  • For the models with seasonality and/or trend we need to create separate series to store those estimates for E, S, and T.
  • Each model has its own initialization formulas and if we wanted to apply them to the validation set that follows our training set, we’d need to initialize with the last values of our training.
Using this model, y_hat(t+1)=(y(t)+y(t-1)...+y(t-k+1))/k (i.e., the predicted 
next value is equal to the average of the last k observed values).'''
def SMA(params):
    ''' Loop through every month using the model to forecast y.
    Be careful with Python indexing!'''
    for t in range(k-1,len(y_hat)-1): #This sets a range for the index to loop through
        y_hat[t+1]= np.sum(time_series[t-k+1:t+1])/k # Apply model to forecast time i+1
    return y_hat

Using this model, y_hat(t+1)=w(1)*y(t)+w(2)*y(t-1)...+w(k)*y(t-k+1) (i.e., the 
predicted next value is equal to the weighted average of the last k observed 
def WMA(params):
    weights = np.array(params)
    y_hat[0:k]=time_series[0:k] # Initialize values
    ''' Loop through every month using the model to forecast y.
    Be careful with Python indexing!'''
    for t in range(k-1,len(y_hat)-1): #This sets a range for the index to loop through
        y_hat[t+1]= np.sum(time_series[t-k+1:t+1].multiply(weights)) # Apply model to forecast time i+1
    return y_hat
'''This model includes the constraint that all our weights should sum to one. 
To include this in our optimization later, we need to define it as a function of our 
def WMAcon(params):
    weights = np.array(params)
    return np.sum(weights)-1

Using this model, y_hat(t+1)=y_hat(t)+a*(y(t)-y_hat(t))(i.e., the 
predicted next value is equal to the weighted average of the last forecasted value and its
difference from the observed).'''
def SES(params):
    a = np.array(params)
    y_hat[0]=time_series[0] # Initialize values
    ''' Loop through every month using the model to forecast y.
    Be careful with Python indexing!'''
    for t in range(len(y_hat)-1): #This sets a range for the index to loop through
        y_hat[t+1]=  y_hat[t]+a*(time_series[t]-y_hat[t])# Apply model to forecast time i+1
    return y_hat

Using this model, y_hat(t+1)=E(t)+T(t) (i.e., the 
predicted next value is equal to the expected level of the time series plus the 
def DES(params):
    a,b = np.array(params)
    '''We need to create series to store our E and T values.'''
    E = pandas.Series().reindex_like(time_series)
    T = pandas.Series().reindex_like(time_series)
    y_hat[0]=E[0]=time_series[0] # Initialize values
    ''' Loop through every month using the model to forecast y.
    Be careful with Python indexing!'''
    for t in range(len(y_hat)-1): #This sets a range for the index to loop through
        E[t+1] = a*time_series[t]+(1-a)*(E[t]+T[t])
        T[t+1] = b*(E[t+1]-E[t])+(1-b)*T[t]
        y_hat[t+1] = E[t] + T[t] # Apply model to forecast time i+1
    return y_hat

Using this model, y_hat(t+1)=E(t)+S(t-p) (i.e., the 
predicted next value is equal to the expected level of the time series plus the 
appropriate seasonal factor). We first need to create an array to store our 
forecast values.'''
def ASM(params):
    a,b = np.array(params)
    p = P
    '''We need to create series to store our E and S values.'''
    E = pandas.Series().reindex_like(time_series)
    S = pandas.Series().reindex_like(time_series)
    y_hat[:p]=time_series[0] # Initialize values
    '''We need to initialize the first p number of E and S values'''
    E[:p] = np.sum(time_series[:p])/p
    S[:p] = time_series[:p]-E[:p]
    ''' Loop through every month using the model to forecast y.
    Be careful with Python indexing!'''
    for t in range(p-1, len(y_hat)-1): #This sets a range for the index to loop through
        E[t+1] = a*(time_series[t]-S[t+1-p])+(1-a)*E[t]
        S[t+1] = b*(time_series[t]-E[t])+(1-b)*S[t+1-p]
        y_hat[t+1] = E[t] + S[t+1-p] # Apply model to forecast time i+1
    return y_hat

Using this model, y_hat(t+1)=E(t)*S(t-p) (i.e., the 
predicted next value is equal to the expected level of the time series times 
the appropriate seasonal factor). We first need to create an array to store our 
forecast values.'''
def MSM(params):
    a,b = np.array(params)
    p = P
    '''We need to create series to store our E and S values.'''
    E = pandas.Series().reindex_like(time_series)
    S = pandas.Series().reindex_like(time_series)
    y_hat[:p]=time_series[0] # Initialize values
    '''We need to initialize the first p number of E and S values'''
    E[:p] = np.sum(time_series[:p])/p
    S[:p] = time_series[:p]/E[:p]
    ''' Loop through every month using the model to forecast y.
    Be careful with Python indexing!'''
    for t in range(p-1, len(y_hat)-1): #This sets a range for the index to loop through
        E[t+1] = a*(time_series[t]/S[t+1-p])+(1-a)*E[t]
        S[t+1] = b*(time_series[t]/E[t])+(1-b)*S[t+1-p]
        y_hat[t+1] = E[t]*S[t+1-p] # Apply model to forecast time i+1
    return y_hat

Using this model, y_hat(t+1)=(E(t)+T(t))*S(t-p) (i.e., the 
predicted next value is equal to the expected level of the time series plus the 
trend, times the appropriate seasonal factor). We first need to create an array 
to store our forecast values.'''
def AHW(params):
    a, b, g = np.array(params)
    p = P
    '''We need to create series to store our E and S values.'''
    E = pandas.Series().reindex_like(time_series)
    S = pandas.Series().reindex_like(time_series)
    T = pandas.Series().reindex_like(time_series)
    y_hat[:p]=time_series[0] # Initialize values
    '''We need to initialize the first p number of E and S values'''
    E[:p] = np.sum(time_series[:p])/p
    S[:p] = time_series[:p]-E[:p]
    T[:p] = 0
    ''' Loop through every month using the model to forecast y.
    Be careful with Python indexing!'''
    for t in range(p-1, len(y_hat)-1): #This sets a range for the index to loop through
        E[t+1] = a*(time_series[t]-S[t+1-p])+(1-a)*(E[t]+T[t])
        T[t+1] = b*(E[t+1]-E[t])+(1-b)*T[t]
        S[t+1] = g*(time_series[t]-E[t])+(1-g)*S[t+1-p]
        y_hat[t+1] = E[t]+T[t]+S[t+1-p] # Apply model to forecast time i+1
    return y_hat

Using this model, y_hat(t+1)=(E(t)+T(t))*S(t-p) (i.e., the 
predicted next value is equal to the expected level of the time series plus the 
trend, times the appropriate seasonal factor). We first need to create an array 
to store our forecast values.'''
def MHW(params):
    a, b, g = np.array(params)
    p = P
    '''We need to create series to store our E and S values.'''
    E = pandas.Series().reindex_like(time_series)
    S = pandas.Series().reindex_like(time_series)
    T = pandas.Series().reindex_like(time_series)
    y_hat[:p]=time_series[0] # Initialize values
    '''We need to initialize the first p number of E and S values'''
    S[:p] = time_series[:p]/(np.sum(time_series[:p])/p)
    E[:p] = time_series[:p]/S[:p]
    T[:p] = 0
    ''' Loop through every month using the model to forecast y.
    Be careful with Python indexing!'''
    for t in range(p-1, len(y_hat)-1): #This sets a range for the index to loop through
        E[t+1] = a*(time_series[t]/S[t+1-p])+(1-a)*(E[t]+T[t])
        T[t+1] = b*(E[t+1]-E[t])+(1-b)*T[t]
        S[t+1] = g*(time_series[t]/E[t])+(1-g)*S[t+1-p]
        y_hat[t+1] = (E[t]+T[t])*S[t+1-p] # Apply model to forecast time i+1
    return y_hat

Having defined this, I can then, for example, call the Multiplicative Holt Winters method by simply typing:


This will produce a forecast using the Multiplicative Holt Winters method with those default parameters, but we would like to calibrate them to get the “best” forecasts from our model. To do so, we need to define what we mean by “best”, and in this example I’m choosing to use Mean Square Error as my performance metric. I define it below as a function that receives the parameters and some additional arguments as inputs. I only need to set it up this way because my optimization function is trying to minimize the MSE function by use of those parameters. I’m using the “args” array to simply tell the function which model it’s using to forecast.

def MSE(params, args):
    model, = args
    t_error = np.zeros(len(time_series))
    forecast = model(params)
    for t in range(len(time_series)):
        t_error[t] = time_series[t]-forecast[t]
    MSE = np.mean(np.square(t_error))
    return MSE

To perform the optimization in Excel, we’d use Solver, but in Python we have other options. SciPy is a Python package that allows us, among many other things, to optimize such single-objective problems. What I’m doing here is that I define a list of all the models I want to optimize, their default parameters, and the parameters’ bounds. I then use a loop to go through my list of models and run the optimization. To store the minimized MSE values as well as the parameter values that produce them, we can create an array to store the MSEs and a list to store the parameter values for each model. The optimization function produces a “dictionary” item that contains the minimized MSE value (under ‘fun’), the parameters that produce it (under ‘x’) and other information.

''' List of all the models we will be optimizing'''
models = [SES, DES, ASM, MSM, AHW, MHW]
''' This is a list of all the default parameters for the models we will be 
optimizing. '''
                      #SES,  DES,     ASM
default_parameters = [[0.5],[0.5,0.5],[0.5,0.5],
                      #MSM,        AHW,            MHW
''' This is a list of all the bounds for the default parameters we will be 
optimizing. All the a,b,g's are weights between 0 and 1. '''
bounds = [[(0,1)],[(0,1)]*2, [(0,1)]*2,
min_MSEs = np.zeros(len(models)) # Array to store minimized MSEs
opt_params = [None]*len(models) # Empty list to store optim. parameters
for i in range(len(models)):
    res = scipy.optimize.minimize(MSE, # Function we're minimizing (MSE in this case)
                                  default_parameters[i], # Default parameters to use
                                  # Additional arguments that the optimizer 
                                  # won't be changing (model in this case)
                                  method='L-BFGS-B', # Optimization method to use
                                  bounds=bounds[i]) # Parameter bounds
    min_MSEs[i] = res['fun'] #Store minimized MSE value
    opt_params[i] = res['x'] #Store parameter values identified by optimizer

Note: For the WMA model, the weights should sum to 1 and this should be input to our optimization as a constraint. To do so, we need to define the constraint function as a dictionary and include the following in our minimization call: constraints=[{‘type’:’eq’,’fun’: WMAcon}]. The number of periods to consider cannot be optimized by this type of optimizer.

Finally, we’d like to present our results. I’ll do so by plotting the observations and all my models as well as their minimized MSE values:

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1) # Create figure
ax.set_title("Australian wine sales (kilolitres)") # Set figure title
l1 = ax.plot(time_series, color='black', linewidth=3.0, label='Observations') # Plot observations
for i in range(len(models)):
    ax.plot(time_series.index,models[i](opt_params[i]), label = models[i].__name__)
ax.legend() # Activate figure legend
print('The estimated MSEs for all the models are:')
for i in range(len(models)):
    print(models[i].__name__ +': '+str(min_MSEs[i]))

This snippet of code should produce this figure of all our forecasts, as well as a report of all MSEs:Figure_1

The estimated MSEs for all the models are:
SES: 133348.78
DES: 245436.67
ASM: 80684.00
MSM: 64084.48
AHW: 72422.34
MHW: 64031.19

The Multiplicative Holt Winters method appears to give the smallest MSE when applied to these data.

A completely non-exhaustive list of tutorial resources for scientific computing

This is a short blog post to put together a list of resources with tutorials (similar to what’s usually found on this blog) for various programming languages. It is by no means exhaustive, so please comment if you feel there’s an important one I left out.



Kinda new blog with Matlab tutorials on numerical methods


One of the MathWorks blogs, great tutorials.


Matlab tutorials from MathWorks


More Matlab tutorials, a lot of material on many topics



Not updated very frequently, but good data analysis and visualization tutorials are posted


Updated regularly, some great data visualization and analysis tutorials. The tutorials come with videos. I really like this site.


Updated regularly (about once a month) with python tutorials and general python coding insights. I really like the writing style of this author. 


Tutorials on using SciPy

C and C++:


C and C++ programming tutorials, tips and tricks


Not really updated anymore but some good basic tutorials are listed


Hadn’t been updated in a while, but it looks like it’s been picked up again. Good for general C++ programming and good practice.


C++ tutorials



This is a great general resource not devoted to a particular language. They cover topics in data science, machine learning and programming in general. Some great tutorials for Python and R. 


Mathematical programming problems to help with learning any language


Free programming books repository

Reddits (some of the bigger ones):

/r/matlab (General on matlab, community provides help with coding)

/r/programming (General on programming)

/r/learnprogramming (Community provides help with debugging questions)

/r/python (General on python, community provides help with coding)

/r/learnpython (Community provides help with python questions, smaller than /r/python)

/r/cpp (General on C++)

/r/cpp_questions (Community provides help with C++ questions)

I’ve also recently made /r/sci_comp which has very little activity for the moment, but the aim is to create a community with general resources on coding for scientific applications.


Types of Errors in Numerical Methods

What are numerical methods?

Many engineering problems are too time consuming to solve or may not be able to be solved analytically. In these situations, numerical methods are usually employed. Numerical methods are techniques designed to solve a problem using numerical approximations. An example of an application of numerical methods is trying to determine the velocity of a falling object. If you know the exact function that determines the position of your object, then you could potentially differentiate the function to obtain an expression for the velocity. More often, you will use a machine to record readings of times and positions that you can then use to numerically solve for velocity:


where f is your function, t is the time of the reading, and h is the distance to the next time step.

Because your answer is an approximation of the analytical solution, there is an inherent error between the approximated answer and the exact solution. Errors can result prior to computation in the form of measurement errors or assumptions in modeling. The focus of this blog post will be on understanding two types of errors that can occur during computation: roundoff errors and truncation errors.

Roundoff Error

Roundoff errors occur because computers have a limited ability to represent numbers. For example, π has infinite digits, but due to precision limitations, only 16 digits may be stored in MATLAB. While this roundoff error may seem insignificant, if your process involves multiple iterations that are dependent on one another, these small errors may accumulate over time and result in a significant deviation from the expected value. Furthermore, if a manipulation involves adding a large and small number, the effect of the smaller number may be lost if rounding is utilized. Thus, it is advised to sum numbers of similar magnitudes first so that smaller numbers are not “lost” in the calculation.

One interesting example that we covered in my Engineering Computation class, that can be used to illustrate this point, involves the quadratic formula. The quadratic formula is represented as follows:


Using a = 0.2, b = – 47.91, c = 6 and if we carry out rounding to two decimal places at every intermediate step:


The error between our approximations and true values can be found as follows:


As can be seen, the smaller root has a larger error associated with it because deviations will be more apparent with smaller numbers than larger numbers.

If you have the insight to see that your computation will involve operations with numbers of differing magnitudes, the equations can sometimes be cleverly manipulated to reduce roundoff error. In our example, if the quadratic formula equation is rationalized, the resulting absolute error is much smaller because fewer operations are required and numbers of similar magnitudes are being multiplied and added together:


Truncation Error

Truncation errors are introduced when exact mathematical formulas are represented by approximations. An effective way to understand truncation error is through a Taylor Series approximation. Let’s say that we want to approximate some function, f(x) at the point xi+1, which is some distance, h, away from the basepoint xi, whose true value is shown in black in Figure 1. The Taylor series approximation starts with a single zero order term and as additional terms are added to the series, the approximation begins to approach the true value. However, an infinite number of terms would be needed to reach this true value.

Figure 1: Graphical representation of a Taylor Series approximation (Chapra, 2017)

The Taylor Series can be written as follows:


where Rn is a remainder term used to account for all of the terms that were not included in the series and is therefore a representation of the truncation error. The remainder term is generally expressed as Rn=O(hn+1) which shows that truncation error is proportional to the step size, h, raised to the n+1 where n is the number of terms included in the expansion. It is clear that as the step size decreases, so does the truncation error.

The Tradeoff in Errors

The total error of an approximation is the summation of roundoff error and truncation error. As seen from the previous sections, truncation error decreases as step size decreases. However, when step size decreases, this usually results in the necessity for more precise computations which consequently results in an increase in roundoff error. Therefore, the errors are in direct conflict with one another: as we decrease one, the other increases.

However, the optimal step size to minimize error can be determined. Using an iterative method of trying different step sizes and recording the error between the approximation and the true value, the following graph shown in Figure 2 will result. The minimum of the curve corresponds to the minimum error achievable and corresponds to the optimal step size. Any error to the right of this point (larger step sizes) is primarily due to truncation error and the increase in error to the left of this point corresponds to where roundoff error begins to dominate. While this graph is specific to a certain function and type of approximation, the general rule and shape will still hold for other cases.

Figure 2: Plot of Error vs. Step Size (Chapra, 2017)

Hopefully this blog post was helpful to increase awareness of the types of errors that you may come across when using numerical methods! Internalize these golden rules to help avoid loss of significance:

  • Avoid subtracting two nearly equal numbers
  • If your equation has large and small numbers, work with smaller numbers first
  • Consider rearranging your equation so that numbers of a similar magnitude are being used in an operation


Chapra, Steven C. Applied Numerical Methods with MATLAB for Engineers and Scientists. McGraw-Hill, 2017.

Class Notes from ENGRD 3200: Engineering Computation taught by Professor Peter Diamessis at Cornell University

Python example: Hardy Cross method for pipe networks

I teach a class called Water Resources Engineering at the University of Colorado Boulder, and I have often added in examples where students use Python to perform the calculations.  For those of you who are new to programming, you might get a kick out of this example since it goes through an entire procedure of how to solve an engineering problem using Python, and the numpy library.  The example is two YouTube videos embedded below! The best way to watch them, though, is in full screen, which you can get to by clicking the name of the video, opening up a new YouTube tab in your browser.

Let us know if you have any questions in the comments section below. For more on Python from this blog, search “Python” in the search box.

Power Spectral Density Estimation in Matlab

Power Spectral Density Estimation in Matlab

Hi all,

This post is about power spectral density estimation (PSDE) using Matlab.  PSDE is often used in signal processing and fluid mechanics, so you may or may not be familiar with it.  The idea is to determine the frequency content of sampled time series data.  This means that we can determine the amount of the variance in the time series that is contained at different frequencies (or periodicities).  Let’s take a look at a simple example.

Suppose we are sampling a signal (in this case a sign wave) for about 3 minutes with a frequency of 100 hertz, meaning we take 100 measurements every second.  Let’s generate our signal and plot our observations using Matlab.

x = (0:0.01:60*pi);
y = sin(x);

title('Signal with no Noise')
ylabel('Observed Signal')


Sine wave, with no noise, sampled at 100 Hertz

From observing the data directly it’s pretty clear what the frequency and period of the data are here (1/(2*pi) and pi respectively): no need for PSDE.  However, real data is rarely so well behaved.  To simulate this, let’s add random noise to our original signal.  To make it interesting, let’s add a lot of noise: normal distributed noise with standard deviation 10 times larger than the magnitude of the signal!

for i = 1:max(size(x))
 y(i) = y(i)+10*norminv(rand);

title('Signal with Noise')
ylabel('Observed Signal')

Sine wave, with random normal noise, sampled at 100 Hertz

Sine wave, with random normal noise, sampled at 100 Hertz

Now the noise completely washes out the signal, and it is not at all clear what the period or frequency of the underlying data are.  So let’s compute and plot the power spectrum of the data.  I leave it to the reader to review all of the math underlying computing the spectrum, but I’ll provide a very brief description of what we are doing here.  PSDE involves taking the discrete Fourier transform of the data.  This transforms our data from the temporal or spatial domain to the frequency domain.  The power spectral density (PSD) function is simply the magnitude of the squared Fourier transformed data (often scaled). It tells us how much of the variance in our signal is explained in each discrete frequency band.  A spike in the PSD function tells us that there’s a lot of variance explained in that frequency band.  The frequency resolution of our analysis (width of the frequency bands) is dictated by the length of our recorded observations.  Interestingly, if we integrate the PSD function over the frequency domain, we obtain the sample variance of our data.  Now, let’s look at the code.

fs = (0.01)^-1;
N =size(y,2);

Y = fft(y);
G = abs(Y.^2)*(1/(fs*N));

title('Amplitude of Spectrum of y(t)')
xlabel('Frequency (Hz)')

The first two lines are defining the sampling frequency (fs), the number of observations (N).  Next we use Matlab’s fft command to take the discrete Fourier transformation of the data.  We then compute the frequency bins we are considering, then we compute and plot the PSD function.  Let’s take a look at the resultamplitude_of_spectrum


There is a lot of noise, but there is a distinct peak at 0.1592, which is 1/(2*pi), indicates the frequency of the underlying signal.  This is neat, even though the signal contains a huge amount of noise, we’ve distinguished the correct frequency (and inversely the period) of the signal!

So how might we as systems analysts use this?

In my PhD thesis we used PSDE to determine the critical time-dynamics of a hydropower system’s operation.  This turned out to be a critical diagnostic tool.  In the hydropower system I considered, both inflows and energy prices are stochastic, so it can be difficult to determine if  the derived ‘optimal’ policy makes sense by only observing simulation results: the release in any one time step (here 6-hrs) can appear noisy and erratic in response to price or hydrologic shocks.  The power of PSDE is the ability to cut through the noise and reveal the underlying signal.

As an example, I’ve derived an implicit ‘optimal’ policy using sampling stochastic dynamic programming.  To evaluate the quality of the derived policy, I’ve simulated the system operation with this policy over the range of historical inflow series.  For simplicity, we’ll assume that the energy price profile (ranging from high on-peak to low off-peak prices) is identical on every day.  Now, let’s look at the PSD function for the reservoir release.


We can immediately pick out elements of the PSD function that make sense.  First, there is a lot of variability at very low frequencies, corresponding to periodicity on the order of months or seasons.  This is likely due to seasonal hydrologic variability between summer, spring, and winter inflows: releases are higher in the spring and early summer because more water is available, and generally lower in the winter and late summer when inflows to the reservoir are lower.  Similarly the peak at frequency 1/day makes sense, because we still have a daily price cycle that the system is responding to, put we are harder pressed to explain the other peaks between frequencies 0.1 and 1 (roughly corresponding to weekly and within week periodicities).  We have fixed the price profile, but hydrology shouldn’t be cycling during the week, so what is going on?

As it turns out, I had a bug in my algorithm (bad ending conditions for the SSDP) which was causing the ‘optimal’ policy to behave a bit myopically on a weekly basis.  This error was propagating through the model, but was not easy to see by simply observing the simulated results (see chapter 6 of the thesis).  Correcting the problem, I got the following spectrum.  The unexplainable spikes have gone away, and we now see that daily price cycling and seasonal hydrology explain the majority of the systems variability (as we expect).


For our Cornell-based readers our new CEE faculty member Gregory McLaskey is offering a new class on time series analysis (including PSDE) which is highly recommended.  PSDE is also used extensively (with real data) in Todd Cowen‘s experimental fluid mechanics class which also recommend because you learn a lot and get to play with lasers which, let’s be honest, is why we all became engineers.

Algorithm Diagnostics Walkthrough using the Lake Problem as an example (Part 1 of 3: Generate Pareto approximate fronts)

This three part series is an overview of the algorithm diagnostics I performed in my Lake Problem study with the hope that readers may apply the steps to any problem of interest. All of the source code for my study, including the scripts used for the diagnostics can be found at https://github.com/VictoriaLynn/Lake-Problem-Diagnostics.

The first step to using the MOEAFramework for comparative algorithm diagnostics was to create the simulation model on which I would be assessing algorithm performance. The Lake Problem was written in C++. The executable alone could be used for optimization with Borg and I created a java stub to connect the problem to the MOEAFramework. (https://github.com/VictoriaLynn/Lake-Problem-Diagnostics/blob/master/Diagnostic-Source/myLake4ObjStoch.java).  Additional information on this aspect of a comparative study can be found in examples 4 and 5 for the MOEAFramework (http://moeaframework.org/examples.html) and in Chapter 5 of the manual. I completed the study using version 2.1, which was the newest at the time. I used the all in one executable instead of the source code although I compiled my simulation code within the examples subfolder of the source code.

Once I had developed an appropriate simulation model to represent my problem, I could begin the diagnostic component of my study. I first chose algorithms of interest and determined the range of parameters from which I would like to sample. To determine parameter ranges, I consulted Table 1 of the 2013 AWR article by Reed et al.

Reed, P., et al. Evolutionary Multiobjective Optimization in Water Resources: The Past, Present, and Future. (Editor Invited Submission to the 35th Anniversary Special Issue), Advances in Water Resources, 51:438-456, 2013.

Example parameter files and the ones I used for my study can be found at https://github.com/VictoriaLynn/Lake-Problem-Diagnostics/tree/master/Diagnostic-Source/params. Once I had established parameter files for sampling, I found chapter 8 of the MOEAFramework manual to be incredibly useful.  Below I walk through the steps I took in generating approximations of the Pareto optimal front for my problem across multiple seeds, algorithms, and parameterizations.   All of the commands have been consolidated into the file Lake_Problem_Comparative_Study.sh on Github, but I had many separate files during my study, which will be separated into steps here. It may have been possible to automate the whole process, but I liked breaking it up into separate scripts to make sure I checked that the output made sense after each step.

Step 1: Generate Parameter Samples To generate parameter samples for each algorithm, I used the following code, which I kept in a file called sample_parameters.sh. I ran all .sh scripts using the general command sh script_name.sh.

JAVA_ARGS="-cp MOEAFramework-2.1-Demo.jar"

# Generate the parameter samples
echo -n "Generating parameter samples..."
java ${JAVA_ARGS} \
org.moeaframework.analysis.sensitivity.SampleGenerator \
--method ${METHOD} --n ${NSAMPLES} --p ${ALGORITHM}_params.txt \

Step 2: Optimize the problem using algorithms of interest This step had two parts: optimization with Borg and optimization with the MOEAFramework algorithms. To optimize using Borg, one needs to request Borg at http://borgmoea.org/. This is the only step that needs to be completed outside of the MOEAFramework. I then used the following script to generate approximations to the Pareto front for all 500 samples and 50 random seeds. The –l and –u flags indicate upper and lower bounds for decision variable values. Fortunately, it should soon be possible to type one value and specify the number of variables with that bound instead of typing all 100 values as shown here.

#50 random seeds


SEEDS=$(seq 1 ${NSEEDS})

for SEED in ${SEEDS}
#PBS -N ${NAME}\n\
#PBS -l nodes=1\n\
#PBS -l walltime=96:00:00\n\
#PBS -o output/${NAME}\n\
#PBS -e error/${NAME}\n\
./BorgExec -v 100 -o 4 -c 1 \
-l 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 \
-u 0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1 \
-e 0.01,0.01,0.0001,0.0001 -p Borg_params.txt -i Borg_Latin -s ${SEED} -f ./sets/${ALGORITHM}_${PROBLEM}_${SEED}.set -- ./LakeProblem4obj_control "
echo -e $PBS | qsub

Optimization with the MOEAFramework allowed me to submit jobs for all remaining algorithms and seeds with one script as shown below. In my study, I actually submitted epsilon dominance algorithms (included –e flag) and point dominance algorithms (did not include –e flag) separately; however, it is my understanding that it would have been fine to submit jobs for all algorithms with the epsilon flag, especially since I converted all point dominance approximations to the Pareto front to epsilon dominance when generating reference sets.



SEEDS=$(seq 1 ${NSEEDS})
JAVA_ARGS="-cp MOEAFramework-2.1-Demo.jar"
set -e

for SEED in ${SEEDS}
#PBS -N ${NAME}\n\
#PBS -l nodes=1\n\
#PBS -l walltime=96:00:00\n\
#PBS -o output/${NAME}\n\
#PBS -e error/${NAME}\n\
java ${JAVA_ARGS}
org.moeaframework.analysis.sensitivity.Evaluator -p
${ALGORITHM}_params.txt -i ${ALGORITHM}_Latin -b ${PROBLEM}
-a ${ALGORITHM} -e 0.01,0.01,0.0001,0.0001 -s ${SEED} -o ./sets/${NAME}.set"
echo -e $PBS | qsub


Step 3: Generate combined approximation set for each algorithm and Global reference set Next, I generated a reference set for each algorithm’s performance. This was useful as it made it easier to generate the global reference set for all algorithms across all seeds and parameterizations and it allowed me to calculate a percent contribution for each algorithm to the global reference set. Below is the script for the algorithm reference sets:



JAVA_ARGS="-cp MOEAFramework-2.1-Demo.jar"
set -e

# Generate the combined approximation sets for each algorithm
echo -n "Generating combined approximation set for
java ${JAVA_ARGS} \
org.moeaframework.analysis.sensitivity.ResultFileMerger \
-b ${PROBLEM} -e 0.01,0.01,0.0001,0.0001 -o ./SOW4/reference/${ALGORITHM}_${PROBLEM}.combined \
echo "done."

In the same file, I added the following lines to generate the global reference set while running the same script.
# Generate the reference set from all combined approximation sets
echo -n "Generating reference set..."
java ${JAVA_ARGS} org.moeaframework.util.ReferenceSetMerger \
-e 0.01,0.01,0.0001,0.0001 -o ./SOW4/reference/${PROBLEM}.reference ./SOW4/reference/*_${PROBLEM}.combined > /dev/null
echo "done."

If one wants to keep the decision variables associated with the reference set solutions, it is possible to use org.moeaframework.analysis.sensitivity.ResultFileMerger on all of the pertinent .set files.

A final option for reference sets is to generate local reference sets for each parameterization of each algorithm. This was done with the following script:


SEEDS=$(seq 1 ${NSEEDS})

# Evaluate all algorithms for all seeds
java -cp MOEAFramework-2.1-Demo.jar org.moeaframework.analysis.sensitivity.ResultFileSeedMerger -d 4 -e 0.01,0.01,0.0001,0.0001 \
--output ./SOW4/${ALGORITHM}_${PROBLEM}.reference ./SOW4/objs/${ALGORITHM}_${PROBLEM}*.obj

Part 2 of this post walks through my calculation of metrics.

Determining Whether Differences in Diagnostic Results are Statistically Significant (Part 1)

While working on my diagnostic assessment of the difficulty of the “Lake Problem”, I decided that I would like to check that the differences in performance I observed were statistically significant. This part of the blog post is a broad overview of the steps I took before the statistical analysis and that analysis. Part 2 includes a tutorial. Looking through prior group papers, I drew initial inspiration from the following articles by Dave although I think my final method differs slightly.

Hadka, D. and P. Reed. Diagnostic Assessment of Search Controls and Failure Modes in Many-Objective Evolutionary Optimization. Evolutionary Computation, 20(3):423-452, 2012.

Hadka, D. and P. Reed. Borg: An Auto-Adaptive Many-Objective Evolutionary Computing Framework. Evolutionary Computation, 21(2):231-259, 2013.

Below is a short summary of the steps preceding my statistical analysis with links to group resources for those who may need more background.  For everyone else, feel free to skip ahead.

Before I could determine the statistical significance of my results, I needed to perform a comparative study of algorithm performance. I got help from sections 8.1 to 8.8 and section 8.10 of the MOEAFramework manual, which can be found at the MOEAFramework website to generate approximate fronts for the problem for multiple random seeds (I used 50) using algorithms found within the MOEAFramework, calculate metrics for each seed across all parameterizations, and determine the best value for each seed across all parameterizations and the probability of attaining a specified threshold (I used 75%) for each seed.  I skipped section 8.9 and performed the analysis described in 8.10 on metrics for each seed’s metrics as many samples are necessary to perform statistics on algorithm performance.  Please note, I also included the Borg evolutionary algorithm in this study, which requires optimization outside of the framework. All other steps for Borg can be performed the same as for the MOEAFramework algorithms.

In this case, I used Latin Hypercube Sampling to generate 500 samples of the feasible parameter space for each algorithm. I chose parameter ranges from Table 1 of the following paper.
Reed, P., et al. Evolutionary Multiobjective Optimization in Water Resources: The Past, Present, and Future. (Editor Invited Submission to the 35th Anniversary Special Issue), Advances in Water Resources, 51:438-456, 2013.

Here is the shell script I used for calculating the best attained values and probability of attaining the 75% threshold for all algorithms and all seeds.


SEEDS=$(seq 1 ${NSEEDS})
JAVA_ARGS="-cp MOEAFramework-2.1-Demo.jar"
set -e

# Perform the analysis
echo ""
echo "Hypervolume Analysis:"
	for SEED in ${SEEDS}

    #PBS -N ${NAME}\n\
    #PBS -l nodes=1\n\
    #PBS -l walltime=96:00:00\n\
    #PBS -o output/${NAME}\n\
    #PBS -e error/${NAME}\n\
    cd \$PBS_O_WORKDIR\n\
        java ${JAVA_ARGS} org.moeaframework.analysis.sensitivity.Analysis \
        -p ${ALGORITHM}_params.txt -i ${ALGORITHM}_${METHOD} -t 0.75 -o ./SOW6/Hypervolume/${ALGORITHM}_${PROBLEM}_${SEED}_Hypervolume.analysis \
        -c -e -m 0 --hypervolume 0.7986 ./SOW6/metrics/${ALGORITHM}_${PROBLEM}_${SEED}.metrics"
    echo -e $PBS | qsub


The above shell script is written for Hypervolume, but other methods can be specified easily by indicating a different number after the -m flag and removing the hypervolume flag.  If one wants to analyze Generational Distance, type 1 after the -m flag.  For the Additive Epsilon Indicator, type 4.  Typically, these are the three metrics of interest as explained in this blog post https://waterprogramming.wordpress.com/2013/06/25/moea-performance-metrics/.   The –hypervolume flag allows you to specify the hypervolume of your reference set. That way the individual reference set hypervolume values are normalized to that of the best known approximation to the Pareto front.

Hypervolume can be calculated for a given approximation set using the following command line argument.

java -cp MOEAFramework-2.1-Demo.jar org.moeaframework.analysis.sensitivity.SetHypervolume name_of_reference_set_file 

Finally, I calculated statistics on these values.  I used the Kruskal Wallis and Mann Whitney U (Also called the Wilcoxon Rank Sum) tests that Dave used.  One advantage of these tests is that they are non-parameteric, meaning that assumptions are not made about the distributions from which the data come.  Another advantage I found is that they are built into Matlab as the kruskalwallis and ranksum functions.  I began with the Kruskal Wallis test as it tests the null hypothesis that independent samples from two or more groups come from distributions with equal medians and returns the p-value that this is true given the data.  This test is a useful screen, as there would not have been much point in continuing if this test indicated that the values for all algorithms had been statistically similar.

I received very low p-values using Kruskal Wallis so I continued to the Mann Whitney U test in every case.  The Mann Whitney U test provides a pair-wise comparison.   Matlab has options to test one of three alternative hypotheses

  1. medians are not equal (default, two-tailed test)
  2. median of X is greater than median of Y (right-tailed test)
  3. median of X is less than median of Y (left-tailed test)

I used one tailed tests in order to determine if one algorithm’s performance was superior to another algorithm’s performance.  I then was able to rank algorithm performance on any given metric by determining the number of algorithms each algorithm outperformed according to this test. A detailed walk through is provided in Part 2.

Runtime dynamics with serial Borg

written by Riddhi Singh


The purpose of this write up is to provide step-by-step instructions for generating a runtime dynamics file using the serial version of Borg.


Familiarity with running the serial version of Borg is NOT needed. We will give a brief overview of setting up a Borg problem. The user should preferably have a pre-defined problem to test this method. Otherwise, the dtlz2 test function that is provided with the serial release of Borg (dtlz2.py) can be used.   Serial Borg can be requested here: http://borgmoea.org/

Step-by-step instructions for obtaining a runtime dynamics file when calling Borg from the command line

1.    Create a Borg executable – use the make file that comes with the Borg-1.6 version.

Use the following commands in a makefile and then run the ‘make’ command- 

CC  = gcc


            ${CC}  -o borgExec frontend.c borg.c mt19937ar.c -lm

Alternatively run the following on command line:

gcc -o borgExec frontend.c borg.c mt19937ar.c -lm

2.    Create a problem executable – This step can be skipped while using the dtlz2.py executable that comes with the Borg package. If compiling your own problem, you need essentially create a bridge between Borg and your problem. For each function evaluation, Borg needs a way to pass variables to your problem, and after evaluation your problem needs to return the associated objective function values back to Borg. You can use the template in dtlz2.py to parse Borg’s input stream. Sample code in C++ is provided below. Note that you cannot use this code directly, this is just the interface with your test function that should be defined above.

void test_function(double * vars, double *objs)


/*your test function here

test function takes inputs in vars and returns objective function values in objs*/

/*Once the objectives are evaluated, pass them to Borg using output stream functions*/

      for (int cols=0;cols<nobjs;cols++)


              printf(“%lg “, objs[cols]);





void main( int argc, char * argv[])


//Initialize number of variables and objective functions

  int nvars = 10;

  int nobjs = 1;

/*Declare the pointers to the arrays that will store the incoming Borg variables and outgoing objective function values*/

  double * vars = new double [nvars];

  double * objs = new double [nobjs];

//Read the input stream and output the objective function values

  int maxSize = 10000;

  char buffer [maxSize];

/*Read the input stream from Borg until the end of stream is reached and

assign the numbers to a testbuffer*/

Borg terminates each stream with a endline character*/

while ( *fgets(buffer, maxSize, stdin)!=’\n’)


      int UseSize = strlen(buffer);

      char *pEnd;

      char *testbuffer = new char [UseSize];

      for (int i=0; i <UseSize; i++)


              testbuffer[i] = buffer[i];


/*Now, parse through the testbuffer and assign variables.

Borg only passes variables followed by objective functions but since we are reading these as strings, they need to be converted to double before sending out for evaluation*/

      for (int cols =0;cols<nvars;cols++)


              vars[cols] = strtod(buffer, &pEnd);

              testbuffer  = pEnd;


      //Evaluate the test problem with this vector

     test_problem(vars, objs);

//This test problem is declared as:

void test_problem(double vars, double* objs)

//The function returns the objective function values in objs


Now compile this executable. Use the following commands in a makefile and then run the ‘make’ command-

CC2 = g++


            ${CC2} -o TestExec TestFunction.cpp

Alternatively run the following on command line:

g++ -o TestExec TestFunction.cpp

3.    Get the runtime dynamics file – Once both Borg and problem executables are ready, obtaining runtime dynamics from the command line is straightforward. Use the following command to generate a runtime dynamics file –

./BorgExec –v 10 –o 1 –n 1000 –e 0.01 –R RuntimeDynamicsFile.txt –F 10 ./TestExec > BorgOutput.txt

The options are elaborated in the README file that is provided with the Borg-1.6 package. The ‘RuntimeDynamicsFile.txt’ contains the runtime dynamics and the ‘BorgOutput.txt’ file contains the final result.

Alternatively, if using the dtlz2.py function, execute the following command –

./BorgExec –v 11 –o 2 –n 1000 –e 0.01,0.01 –R RuntimeDynamicsFile.txt –F 10 python dtlz2.py > BorgOutput.txt