Running Sobol Sensitivity Analysis using MOEAFramework

Note: this post describes a version of Sobol analysis included in MOEAFramework (http://moeaframework.org), written in Java. If you prefer, there is a new Python library available called SALib (http://jdherman.github.io/SALib/), which contains open-source implementations of Sobol, Morris, and FAST sensitivity methods.

This post was originally written by Jon Herman.  A small number of edits were later added by Joseph Kasprzyk.

Sobol sensitivity analysis is used to attribute the variance in a model’s output to its parameters and their interactions. In a previous post, I explained how to run Sobol sensitivity analysis using our C/C++ code. We have decided as a group to move away from this implementation due to maintainability concerns. Instead, we will use the Sobol implementation included in the MOEAFramework. MOEAFramework is a multi-objective optimization library written in Java, but it also contains Sobol sensitivity analysis as an added feature. All you need is a single *.jar file, which can be downloaded here (we want the “all-in-one executable” download, since we will mostly be treating the program as a black box).  Because MOEAframework uses Java, it is platform independent, so the software will work on Mac, Linux, and PC!

Once you have the *.jar file, complete the following steps to calculate your sensitivity indices.

Step 1: Choose sampling bounds for your parameters

First, you will need to create a simple text file with the form [parameter] [lower bound] [upper bound]. For example, such a file for the Hymod model might look like this:

Cmax 0.0 1000.0
B 0.0 3.0
Alpha 0.0 1.0
Kq 0.15 1.0
Ks 0.0 0.15

The bounds are used to sample parameter values. The names of the parameters themselves are only used to print the final sensitivity indices (i.e., you can name them whatever you want). For this example, I will refer to this file as params.txt.  Use a plain-text editor for this task, such as Notepad (on Windows), TextMate (on Mac), or emacs (on Linux).  If you use a “rich text” editor like Word, the file will not be in the correct format!

Step 2: Generate parameter sets using the Sobol Sequence

Put your params.txt file in the same directory as your MOEAFramework *.jar file.

In your operating system, find the Terminal program and open it.  In Windows you can do this by typing ‘cmd’ in the Search menu of Windows 7/8.  In Linux and Mac the program is called Terminal.  Note that the commands to navigate around the Terminal are going to be basically the same in Mac and Linux, and we cover them here.  In Windows the commands are different, and here’s a good external summary.

Move to this directory (e.g. use the ‘cd’ command) and type the following command:

java -classpath MOEAFramework-2.0-Executable.jar org.moeaframework.analysis.sensitivity.SampleGenerator
-m saltelli -n 1000 -p params.txt > sobolParameterSets.txt

(This is only one command. It’s being wrapped to multiple lines.)

The Java class being called is SampleGenerator. The -m flag specifies the mode, and the -n flag specifies the number of initial samples to generate from the pseudo-random Sobol sequence. The -p flag specifies the parameter bounds file that you created in the first step. Be sure to replace “2.0” with the version of MOEAFramework that you downloaded.

In this example, 1000 parameter sets are generated from the Sobol sequence. After that, the Saltelli method of cross-sampling is applied (for more information, see: Saltelli 2008, “Global Sensitivity Analysis: The Primer“). The cross-sampling scheme creates a total of 2N(p+1) total parameter sets to be run in your model; for the Hymod example, we would have 1000*(5+1) = 6000 total model runs. You probably want to pipe the output from this command into a text file using the “>” operator as shown above.

If you just copied the *.jar file to your computer, you may get a warning that you do not have Java installed!  Follow the prompts to make sure that Java works, and then try again.  On Windows, you need the Java Development Kit (JDK), if your system doesn’t prompt you you can download it for free at the Java website.

Note that the Sobol method can be computationally intensive depending on the model being analyzed. Even for a simple model like Hymod, from personal experience I would recommend a sample size of at least N = 10,000 (which translates to 60,000 model runs). More complex models will be slower to run and will also require more samples to calculate accurate estimates of Sobol indices. Once you complete this process, pay attention to the confidence bounds on your sensitivity indices to see whether you need to run more samples.

Step 3: Run the parameter sets through your model

The parameter sets are now saved in sobolParams.txt—the analysis so far has no idea what model you’re running. You need to set up your model to read in parameter sets from the file sobolParams.txt, run them through the model, and save your desired outputs in a file called objectiveValues.txt. You can use multiple objective values here—each row in the objectives file will correspond to a model run. Here is an example of reading in parameter sets for the Hymod model in C/C++ in the main() function (if you already know how to do this, you can skip this part):

//begin main()
string sampleFile = "sobolParams.txt"  //Input Sobol samples file to read

int numSamples = 60000; //make sure your program knows how many samples it has to read
int nParams = 5; //how many parameters do you have?
int nObj = 4; //how many objectives do you have?

ifstream in; //Input file stream
ofstream of; //output file stream for objective values

//obj is an array holding the objective values, and xreal holds the parameter values for each run
double *obj = new double [nObj];
double *xreal = new double [nParams];

//Intialize the model only once
init_HyMod(argc, argv);
//Open the output file and set some properties for it
of.open("objectiveValues.txt", ios::app);
of.setf(ios_base::fixed);
of.precision(8);
of.width(20);
in.open(sampleFile.c_str(), ios_base::in);
if (!in)
{
     cout << "Error: sampleFile could not be opened!" << endl;
     exit(1);
}
//Loop through parameter samples in sobolParams file
for (int s = 0; s < numSamples; s++)
{
    //Read in the parameter values from Sobol
     for (int i=0; i < nParams; i++)
     {
          in >> xreal[i]; //read in each parameter individually for this set
     }
     in.ignore(1000,'\n'); //ignore the rest of the line
     //Run this sample once all parameters have been read in
     calc_HyMod(indPtr);
    //Print the objective value outputs
    for(int i = 0; i < nObj; i++)
    {
         of << setw(15) << obj[i];
    }
    of << endl;
} //close the samples loop

//Close the output file
 of.close();
//Remember to clean up any memory you allocated!
deleteAllStuff(&hymodPtr->data, &hymodPtr->model, &hymodPtr->snowModel);
//end main()

As you can see, even this fairly simple task becomes complicated in C/C++. It might only take you a handful of lines to do the same thing in Matlab. But if you want speed, C/C++ is the way to go. Be careful to read in your parameters in the same order in which you sampled them!

Step 4: Calculate Sobol Indices

You now have the objective values for all of your model runs, arranged in columns in the file objectiveValues.txt. We need to send this information back to MOEAFramework to compute the sensitivity indices, using following command:

java -Xmx256m -classpath MOEAFramework-2.0-Executable.jar org.moeaframework.analysis.sensitivity.SobolAnalysis -m 0 -i objectiveValues.txt -p params.txt -o sobolIndices.txt -r 1000

As always, make sure your version of MOEAFramework is properly listed after the classpath tag! The class being called is SobolAnalysis. The flag -Xmx256m tells Java how much memory to reserve. (If you ever get an error from Java saying something like “cannot allocate heap: not enough memory”, you may need to increase the amount of memory reserved … 128m, 256m, 512m, etc.).

As shown here, the SobolAnalysis class requires several flags to be passed in: the parameter file (-p), the file containing calculated objective values (-i), the output file for the sensitivity indices (-o), and the column of the objective values file to read (-m). The columns are assumed to be zero-indexed; if you have calculated multiple objective values, you would continue on to -m 1, etc., repeating the same command as above.

Optional parameters for the sobol program include the (-r) flag, which specifies the number of boostrap resamples, and the (-s) flag, which specifies the random seed for bootstrapping. These parameters are used to calculate the confidence intervals of the sensitivity indices. If these flags are excluded, the program defaults to use the system time as the random seed, with 1000 bootstrap resamples.

Step 5: Interpret your results

Open the file sobolIndices.txt in a regular text editor. You will see something like this:

Parameter	Sensitivity [Confidence]
First-Order Effects
Ks 0.0017259632775069277 [0.009004221878377168]
Kq 0.006978812896485629 [0.007912025856580397]

...(etc.)
Total-Order Effects
Ks 0.00851970745914532 [0.010850538623282687]
Kq 6.892431800058496E-4 [0.010346174868875227]

...(etc.)
Second-Order Effects
Ks * Kq -8.715871377997618E-4 [0.01149935524352781]
Ks * DDF -0.00178083062205274 [0.01210766544038049]

...(etc.)

The parameter names will match the ones you specified in params.txt. The first order, total order, and second order sensitivities are specified as indicated. Confidence intervals for each indice are shown in brackets. Most of the indices are omitted here for the sake of brevity. Typically we use the total order indices to get a broad picture of model behavior, since they estimate all of the interactions between parameters. If the confidence intervals of your dominant indices are larger than ~10% of the value itself, you may want to consider increasing your sample size as computation permits. The parameters shown here have negligible sensitivity indices—the precision included in the output is not necessarily meaningful. For total-order indices to be important, they will usually need to be above 0.05 at the very least (the most dominant parameters will have values upward of 0.8).

Those are the basics for running Sobol using MOEAFramework. In general, the logic behind the process is exactly the same as in our old C/C++ code, and in my opinion it’s even simpler to run the Java version. Feel free to email jdh366 (at) cornell (dot) edu with questions. Thanks for reading.

Advertisements

How to specify constraints in MOEAframework (External Problem)

Let’s say you have an external problem (in C/C++) that plugs in to the MOEAframework.  This post covers how to calculate constraints and pass them back and forth from the C/C++ program to the Java code.  First we’ll show the two code examples and then we’ll explain some of the details of the calculations.

Class definition in the java script:


public static class myProblem extends ExternalProblem {
  public myProblem() throws IOException{
      super("./myPath/myCPPExecutable");
  }

  @Override
  public String getName() {
      return "myProblem";
  }

  @Override
  public int getNumberOfVariables(){
      return 5;
  }

  @Override
  public int getNumberOfObjectives(){
      return 6;
  }

  @Override
  public int getNumberOfConstraints(){
      return 3;
  }

  @Override
  public Solution newSolution(){
      Solution solution = new Solution( getNumberOfVariables(),
          getNumberOfObjectives(), getNumberOfConstraints() );

      for (int i = 0; i < getNumberOfVariables(); i++) {
          solution.setVariable(i, new RealVariable(0.0, 1.0));
      }

      return solution;
  }

}

As you can see, the Java excerpt is not really any different than the traditional example, other than that the number of constraints is nonzero.  In the C++ code, one must actually give some values to the MOEA, which represents the constraint violations of each solution.  The constraints are such that any nonzero value is considered a constraint violation, whereas feasible solutions have constraint violation = 0. The below example shows how to calculate constraints for three objectives, named objectiveA, objectiveB, and objectiveC.  The constraints are of type “greater than”, “less than”, and “equal”, respectively.  The example treats all constraint violations as negative, but it is also OK to have the constraint violations positive, as long as they are nonzero.

Note: The above paragraph was edited 14 August 2012, with a small fix made. The way it was written originally was not wrong, we just clarified it a bit.


#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include "moeaframework.h"

int nvars = 5;
int nobjs = 6;
int nconstrs = 3; //define the number of constraints here

void evaluate (double *vars, double *objs, double *constrs)
{
 //a generic function to calculate objective functions
 objs = myObjFun(vars);

 //let's say you want to constrain the first three objectives to be above, below
 //or equal to a given level.  Pull out the values of the objectives:
 double objA = objs[0]; double objB = objs[1]; double objC = objs[2];

 //Set the desired level of the objectives:
 double levelA = 0.98; double levelB = 1.1; double levelC = 0.0;

 //Now calculate the constraint violations.
 //The 0th constraint is a greater than or equal to constraint on objA
 constrs[0] = objA / levelA - 1.0;

 //The 1st constraint is less than or equal to on objB
 constrs[1] = 1.0 - objB / levelB;

 //The 2nd constraint is an equality constraint on level C
 if (objC == levelC) constrs[2] = 0.0;

 else constrs[2] = -1.0*abs( (levelC - objC) / levelC );

 for (int i = 0; i < nconstrs; i++)
 {
    if (constrs[i] > 0.0) constrs[i] = 0.0;
 }

 return;

}

int main (int argc, char* argv[])
{
 double vars[nvars];
 double objs[nobjs];
 double constrs[nconstrs];

 MOEA_Init(nobjs, nconstrs);
 //the second argument here is the number of constraints

 while (MOEA_Next_solution() == MOEA_SUCCESS) {
    MOEA_Read_doubles(nvars, vars);
    evaluate(vars, objs, constrs);
   MOEA_Write(objs, constrs);
 }

 MOEA_Terminate();

 return EXIT_SUCCESS;

}

So the constraint array is very similar to the objectives, but it has negative values when solutions are infeasible, and zero otherwise.  Let us know if you have any questions!

Training video: Java file for external problems

In the next few weeks, we’ll be adding blog posts relating to our MOEAframework training.  They are a little bit out of order now, but we may rearrange them as things move forward.

Here we continue our discussion of the “external problem” example in MOEAframework.  A nice feature of the software is the ability to link a problem in a different programming language to the Java MOEA software.  In this video I walk through the different parts of the java file and run a quick example.

Training video: MOEAFramework GUI

In the next few weeks, we’ll be adding blog posts relating to our MOEAframework training.  They are a little bit out of order now, but we may rearrange them as things move forward.

The following video deals with using the GUI for MOEAFramework.  The GUI is best suited for those beginning their MOEA training.  The following video shows how to download and operate the GUI as well as two simple problems:

  1. Running two different problems with the same algorithm (DTLZ1-2D and DTLZ1-2D-Rotated with eNSGAII)
  2. Running two different algorithms on the same problem. (DTLZ1-2D-Rotated with eNSGAII and eMOEA)

WordPress: How to post a Screenr/YouTube video

Update: As of October 2015, Screenr has been discontinued.

As a guide for other folks trying to post videos on this blog (and, I suppose, on WordPress in general), here is the workflow:

  1. Create an account on screenr. In order to make a video, simply press Record and follow the instructions.
  2. After your video is done, it will give you a link to a screenr video. You can simply post a link to this video and have the users navigate to it on their site. The cooler thing to do, though, is to embed the video on YouTube. So…
  3. Create an acccount on YouTube. Nowadays you can link your Google account so it’s very seemless.
  4. Back in Screenr, click Publish to YouTube. It will ask for your YouTube name and password. The video is automatically sent to YouTube, and you have to navigate back to the YouTube page to manage it. As it explains on the screen, it may take a few minutes for the transfer to complete.
  5. In YouTube, navigate to My Videos to manage your new video. One suggestion is to make the video Unlisted, which means that you need a direct link in order to watch it. In the Advanced tools, make sure Enable Embedding is clicked.
  6. Click Watch on Video Page to see what the video will look like inside YouTube. Then, click Share, and then Embed to get the Embed code. Copy it to the clipboard.
  7. Now we’re ready to make our WordPress post. Log in to WordPress. Navigate to the WordPress “dashboard” to create a new post (you want to be in the Dashboard to get all the advanced settings for a full post, not just the quick post editor). Type a description of your video. When you’re ready to put the YouTube embed code, open the “Text” tab in the editor and paste the embed code.
  8. Publish the WordPress post and you’re done!

Training Video: External problems in MOEAFramework

In the next few weeks, we’ll be adding blog posts relating to our MOEAframework training.  They are a little bit out of order now, but we may rearrange them as things move forward.

Today’s post is a video overview of a simple “external” problem for MOEAframework.  External means that the objective function is coded in another language, compiled in an executable, and it communicates with the MOEAframework algorithms via the command line.  I walk through different components of a simple code file, rosenbrock.c, compile it, and show how to interact with the program via the command line.

Using Linux input redirection in a C++ code test

I’m testing a function that does a numerical calculation, namely the Cholesky matrix factorization.  I found a reference to algorithm that does this, but how do I test my code quickly, to input some data, perform the calculation, and then output some results?

It turns out Linux gives us a simple way to do this!  Using “input redirection”, we can quickly input some data from a text file, and process it as if we were inputting it directly through the keyboard.  First let’s look at some pseudocode.


#include <stdio.h>

#include <math.h>

#include <iostream>

#include <fstream>

#include <string>

using namespace std;

ofstream out; //for debugging

double myFun (double input1, double input2) //insert the function you want to test here

int main()

{

//Let's say you need to enter an n by n matrix with a text file, and you know n in advance. Then:

double input[52][52]; //you can use a vector of vectors, or a 2d C-style array if needed

int i = 0; int j = 0;

for (int k = 0; k < 52*52; k++)

{

cin >> input[i][j];

j++;

if (j == n)

{

j = 0; i++;

}

}

//call myFun to test.  Then, use cout or the output stream to output your results.

//to open the output stream, simply write out.open("myOutput.txt") and then use out << to output whatever you need.

return 0;

}

Now, if you have data in space or tab delimited format in a textfile, all you have to do is compile your program:

g++ myFile.cpp -o myExecutable

And then run it with input redirection:

./myExecutable < myData.txt

where the < operator feeds in the contents of myData.txt into the standard input of your program.  For more on input redirection read here, and feel free to comment!