Compiling the Borg Matlab Wrapper (OSX/Linux)

The Borg MOEA is written in C, but thanks to Dave, a Matlab wrapper is now available. This post will describe how to compile and use it on OSX (and, to a lesser extent, Linux); thanks to Joe, who figured all of this out. Windows users should refer to this post instead.

Step 0: Get Files

You will need the full source code (borg.c, borg.h, mt19937.c, mt19937.h) for the serial version of Borg, as well as three additional files: borg.m, DTLZ2.m,and nativeborg.cpp. Students in CEE 6200 will have received these files in a zip folder (be sure to grab the OSX folder rather than the Windows one).

Step 1: Compile shared library

The Windows version comes with pre-compiled shared libraries, but the OSX version does not. This is ok, because we need a compiler for the rest of the steps anyway. If you don’t already have it, go install XCode, which is Apple’s software development kit. We really only need the gcc compiler, but it won’t hurt to install the whole package.

After installing XCode, we need to compile Borg as a shared library. Open a terminal and navigate to the directory where you put your Borg files. Then, type the following commands:

gcc -c borg.c
gcc -c mt19937ar.c
gcc -shared -o libborg.so borg.o mt19937ar.o

The file libborg.so is the shared library that we’re looking for. Take this file and copy it to where your Matlab files are located (CEE 6200 students, it should already be in the right directory). Note if you’ve never used a terminal before, it’s just the application called “Terminal” on your Mac. Use the command cd /to/some/directory to change directory, and pwd to print working directory (that is, to see where you are).

Step 2: Check Compiler in Matlab

Matlab needs to compile Borg into a mex file to call its functions directly. The default compiler that it looks for on OSX is XCode—which, fortunately, we installed in the previous step. You can double-check that Matlab recognizes your compiler by running mex -setup at the Matlab command line. (Note: NOT in your system terminal). The mex -setup command should locate your XCode installation, with the gcc compiler.

Step 3: Compile Borg

In the Matlab command window, navigate to the directory where you stored all of the Borg files from Step 0. Run the command mex nativeborg.cpp libborg.so. If it works, you’ll see a new file called nativeborg.mex* in your directory. If it doesn’t work, double-check that you have all of the files copied into the same directory, and that Matlab’s working directory is set properly.

A few caveats. If you’re running OSX 10.9 (Mavericks), this will most likely need a small tweak to work—read troubleshooting step 3b below. Linux users will need to add a library path to the compilation like so:

mex LDFLAGS="\$LDFLAGS -Wl,-rpath,\." nativeborg.cpp libborg.so

Again: these commands are being run in Matlab’s command windownot the system terminal.

Step 3b: Compilation Troubleshooting

If you’re running Mavericks (OSX 10.9) and thus have XCode 5 or higher, you will most likely get this error:

xcodebuild: error: SDK macosx10.7 cannot be located

The SDK for OSX 10.7 is no longer included in XCode, but Matlab still searches for it by default. Try following these instructions from Mathworks to change the SDK that Matlab searches for from 10.7 to 10.8. As per the instructions, even if you’re running Mavericks you should still try 10.8, because it is likely better-supported by Matlab. Both the 10.8 and 10.9 SDKs should be included with the latest version of XCode.

Step 3c: Troubleshooting Continued

The compilers included in the latest version of XCode may have further compatibility issues with mex. Specifically, when you try to run mex, you may get a cryptic error message about undefined type char16_t. Following these instructions, you will need to add -std=C++11 to the CXXFLAGS variable in your options file (the same mexopts.sh file that you edited in Step 3b above).

This forces the compiler to abide by a certain standard, in which the undefined type is now (apparently) defined. I have confirmed that this fix works on OSX 10.9.2 with Matlab 2013a.

Step 4: Run a test problem

The file DTLZ2.m shows an example of how an objective function should be formatted for use with the Borg Matlab wrapper. It accepts a vector of decision variables as input, and outputs a vector of objectives (and optionally, constraints). From the command line, you can optimize this function as follows:

[vars, objs] = borg(11, 2, 0, @DTLZ2, 100000, zeros(1,11), ones(1,11), 0.01*ones(1,2));

The function returns the decision variables and objectives associated with non-dominated points (the Pareto-approximate set). The first three inputs to the function are: the number of decision variables, number of objectives, and number of constraints. After that comes the handle for your objective function, the number of function evaluations to run in the optimization, the lower and upper bounds of the decision variables, and finally, the epsilon precision values for the objectives.

To see the set of nondominated solutions, try scatter plotting the columns of the objs matrix:

scatter(objs(:,1), objs(:,2));

And you should see a semicircular-looking Pareto front. Now all you need to do is swap out DTLZ2 with an objective function or model of your choice, and you’ll be ready to go!

Compile and run Borg MATLAB on JANUS

Some of the steps/information below are very similar to the information above; however, this is to clarify the exact steps necessary to compile and run Borg MATLAB on the CU Boulder supercomputer (JANUS).

Step 1: Transfer Borg source code to JANUS

Open your terminal and copy Borg source code from you computer to the appropriate folder on JANUS. Below is an example of a command you can use to copy these files. The *.c copies over all files ending in .c. If you only have one file of a specific type, you can also just use the filename.

scp /Users/elizabethhoule/Desktop/*.c elho9743@login.rc.colorado.edu:/lustre/janus_scratch/elho9743/borg

Step 2: Compile shared library

Log onto JANUS, and navigate to the JANUS directory that contains the Borg source code. For example:

ssh elho9743@login.rc.colorado.edu
cd /lustre/janus_scratch/elho9743/borg

Use the following code to compile Borg as a shared library. This should produce a file called libborg.so.

gcc -c -fPIC borg.c
gcc -c -fPIC mt19937ar.c
gcc -shared -o libborg.so borg.o mt19937ar.o

Check your folder via the ls command to ensure libborg.so now exists. Then copy libborg.so to your JANUS directory that contains your MATLAB code. For example:

cp /lustre/janus_scratch/elho9743/borg/libborg.so /lustre/janus_scratch/elho9743/matlabfiles/

Also ensure that your nativeborg.cpp and borg.m files are in this same directory. Navigate to the directory that now contains your MATLAB code and libborg.so.

cd /lustre/janus_scratch/elho9743/matlabfiles/

Step 3: Compile Borg

Use the following commands to compile Borg on JANUS.

LD_LIBRARY_PATH=$LD_LIBRARY_PATH:.
module load matlab/matlab-2013a
matlab
mex nativeborg.cpp libborg.so

The first command, LD_LIBRARY_PATH=$LD_LIBRARY_PATH:., tells MATLAB where the .so is located. Even after you compile Borg, if you close out of your terminal, you will need to type the first command again in your new terminal before Borg will run. When you type this command ensure you are in your JANUS directory that contains the MATLAB files.

The second command, module load matlab/matlab-2013a, loads MATLAB on JANUS. The third command, matlab, opens MATLAB on JANUS in your terminal. The fourth command, mex nativeborg.cpp libborg.so, compiles Borg in MATLAB on JANUS.

Lastly, you can run a test problem to make sure Borg is properly compiled. Make sure all of the necessary files are in the directory in which you are running the test problem. Use the following command to run the test problem:

[vars, objs] = borg(11, 2, 0, @DTLZ2, 10000, [0.01,0.01], zeros(1,11), ones(1,11));

If this works, you are good to go. Navigate out of MATLAB on JANUS and load slurm to start running your Borg jobs.

Advertisements

Extensions of SALib for more complex sensitivity analyses

Over the past few weeks, I’ve had some helpful discussions with users of SALib that I thought would be worth sharing. These questions mostly deal with using the existing library in clever ways for more complicated modeling scenarios, but there is some extra information about library updates at the end.

1. How to sample parameters in log space

All three methods in the library (Sobol, Morris, and extended FAST) currently assume an independent uniform sampling of the parameters to be analyzed. (This is described in the documentation). However, lots of models have parameters that should be sampled in log space. This is especially true of environmental parameters, like hydraulic conductivity for groundwater models. In this case, uniform sampling over several orders of magnitude will introduce bias away from the smaller values.

One approach is instead to uniformly sample the exponent of the parameter. For example, if your parameter value ranges from [0.001, 1000], sample from [-3, 3]. Then transform the value back into real space after you read it into your model (and of course, before you do any calculations!) This way you can still use uniform sampling while ensuring fair representation in your parameter space.

2. How to sample discrete scenarios

In some sensitivity analysis applications, the uncertain factor you’re sampling isn’t a single value, but an entire scenario! This could be, for example, a realization of future streamflow or climate conditions—we would like to compare the sensitivity of some model output to streamflow and climate scenarios, without reducing the latter to a single value.

This can be done in SALib as follows. Say that you have an ensemble of 1,000 possible streamflow scenarios. Sample a uniform parameter on the range [0, 999]. Then, in your model, round it down to the nearest integer, and use it as an array index to access a particular scenario. This is the approach used in the “General Probabilistic Framework” described by Baroni and Tarantola (2014). Discretizing the input factor should not affect the Sobol and FAST methods. It will affect the Morris method, which uses the differences between input factors to determine elementary effects, so use with caution.

This approach was recently used by Matt Perry to analyze the impact of climate change scenarios on forest growth.

3. Dealing with model-specific configuration files

In Matt’s blog post (linked above), he mentioned an important issue: the space-separated columns of parameter samples generated by SALib may not be directly compatible with model input files. Many models, particularly those written in compiled languages, will have external configuration files (plaintext/XML/etc.) to specify their parameters. Currently SALib doesn’t have a solution for this—you’ll have to roll your own script to convert the parameter samples to the format of your model’s configuration file. (Update 11/16/14: here is an example of using Python to replace template variables with parameter samples from SALib).

One idea for how to do this in the future would be to have the user specify a “template file”, which is a configuration file where the parameter values are replaced with tags (for example, “{my_parameter}”. The location of this file could be specified as a command line parameter. Then, while generating parameter samples, SALib could make a copy of the template for each model run, overwriting the tags with parameter values along the way. The downside of this approach is that you would have thousands of input files instead of one. I’m going to hold off on this for now, but feel free to submit a pull request.

4. Confidence intervals for Morris and FAST

Previously, only the Sobol method printed confidence intervals for the sensitivity indices. These are generated by bootstrapping with subsets of the sample matrix. I updated the Morris method with a similar technique, where confidence intervals are bootstrapped by sampling subsets of the distribution of elementary effects.

For FAST (and extended FAST), there does not appear to be a clear way to get confidence intervals by bootstrapping. The original extended FAST paper by Saltelli et al. displayed confidence intervals on sensitivity indices, but these were developed by replicating the experiment, adding a random phase shift to generate alternate sequences of points as given in Section 2.2 of the linked paper. I added this random phase shift to SALib such that a different random seed will produce a different sampling sequence for FAST (previously this was not the case).

However, my attempts to bootstrap the FAST results were unsuccessful. The sequence of model outputs are FFT‘d to develop the sensitivity indices, which means that they cannot be sub-sampled or taken out of order. So for now, FAST does not provide confidence intervals. You can generate your own confidence intervals by replicating the full sensitivity analysis with different random seeds. This is usually very difficult for environmental models, given the computational expense, but not for test functions.

Thanks for reading. Email me at jdh366-at-cornell-dot-edu if you have any questions, or want to share a successful (or unsuccessful) application of SALib!