Running Sobol Sensitivity Analysis using MOEAFramework

Note: this post describes a version of Sobol analysis included in MOEAFramework (http://moeaframework.org), written in Java. If you prefer, there is a new Python library available called SALib (http://jdherman.github.io/SALib/), which contains open-source implementations of Sobol, Morris, and FAST sensitivity methods.

This post was originally written by Jon Herman.  A small number of edits were later added by Joseph Kasprzyk.

Sobol sensitivity analysis is used to attribute the variance in a model’s output to its parameters and their interactions. In a previous post, I explained how to run Sobol sensitivity analysis using our C/C++ code. We have decided as a group to move away from this implementation due to maintainability concerns. Instead, we will use the Sobol implementation included in the MOEAFramework. MOEAFramework is a multi-objective optimization library written in Java, but it also contains Sobol sensitivity analysis as an added feature. All you need is a single *.jar file, which can be downloaded here (we want the “all-in-one executable” download, since we will mostly be treating the program as a black box).  Because MOEAframework uses Java, it is platform independent, so the software will work on Mac, Linux, and PC!

Once you have the *.jar file, complete the following steps to calculate your sensitivity indices.

Step 1: Choose sampling bounds for your parameters

First, you will need to create a simple text file with the form [parameter] [lower bound] [upper bound]. For example, such a file for the Hymod model might look like this:

Cmax 0.0 1000.0
B 0.0 3.0
Alpha 0.0 1.0
Kq 0.15 1.0
Ks 0.0 0.15

The bounds are used to sample parameter values. The names of the parameters themselves are only used to print the final sensitivity indices (i.e., you can name them whatever you want). For this example, I will refer to this file as params.txt.  Use a plain-text editor for this task, such as Notepad (on Windows), TextMate (on Mac), or emacs (on Linux).  If you use a “rich text” editor like Word, the file will not be in the correct format!

Step 2: Generate parameter sets using the Sobol Sequence

Put your params.txt file in the same directory as your MOEAFramework *.jar file.

In your operating system, find the Terminal program and open it.  In Windows you can do this by typing ‘cmd’ in the Search menu of Windows 7/8.  In Linux and Mac the program is called Terminal.  Note that the commands to navigate around the Terminal are going to be basically the same in Mac and Linux, and we cover them here.  In Windows the commands are different, and here’s a good external summary.

Move to this directory (e.g. use the ‘cd’ command) and type the following command:

java -classpath MOEAFramework-2.0-Executable.jar org.moeaframework.analysis.sensitivity.SampleGenerator
-m saltelli -n 1000 -p params.txt > sobolParameterSets.txt

(This is only one command. It’s being wrapped to multiple lines.)

The Java class being called is SampleGenerator. The -m flag specifies the mode, and the -n flag specifies the number of initial samples to generate from the pseudo-random Sobol sequence. The -p flag specifies the parameter bounds file that you created in the first step. Be sure to replace “2.0” with the version of MOEAFramework that you downloaded.

In this example, 1000 parameter sets are generated from the Sobol sequence. After that, the Saltelli method of cross-sampling is applied (for more information, see: Saltelli 2008, “Global Sensitivity Analysis: The Primer“). The cross-sampling scheme creates a total of 2N(p+1) total parameter sets to be run in your model; for the Hymod example, we would have 1000*(5+1) = 6000 total model runs. You probably want to pipe the output from this command into a text file using the “>” operator as shown above.

If you just copied the *.jar file to your computer, you may get a warning that you do not have Java installed!  Follow the prompts to make sure that Java works, and then try again.  On Windows, you need the Java Development Kit (JDK), if your system doesn’t prompt you you can download it for free at the Java website.

Note that the Sobol method can be computationally intensive depending on the model being analyzed. Even for a simple model like Hymod, from personal experience I would recommend a sample size of at least N = 10,000 (which translates to 60,000 model runs). More complex models will be slower to run and will also require more samples to calculate accurate estimates of Sobol indices. Once you complete this process, pay attention to the confidence bounds on your sensitivity indices to see whether you need to run more samples.

Step 3: Run the parameter sets through your model

The parameter sets are now saved in sobolParams.txt—the analysis so far has no idea what model you’re running. You need to set up your model to read in parameter sets from the file sobolParams.txt, run them through the model, and save your desired outputs in a file called objectiveValues.txt. You can use multiple objective values here—each row in the objectives file will correspond to a model run. Here is an example of reading in parameter sets for the Hymod model in C/C++ in the main() function (if you already know how to do this, you can skip this part):

//begin main()
string sampleFile = "sobolParams.txt"  //Input Sobol samples file to read

int numSamples = 60000; //make sure your program knows how many samples it has to read
int nParams = 5; //how many parameters do you have?
int nObj = 4; //how many objectives do you have?

ifstream in; //Input file stream
ofstream of; //output file stream for objective values

//obj is an array holding the objective values, and xreal holds the parameter values for each run
double *obj = new double [nObj];
double *xreal = new double [nParams];

//Intialize the model only once
init_HyMod(argc, argv);
//Open the output file and set some properties for it
of.open("objectiveValues.txt", ios::app);
of.setf(ios_base::fixed);
of.precision(8);
of.width(20);
in.open(sampleFile.c_str(), ios_base::in);
if (!in)
{
     cout << "Error: sampleFile could not be opened!" << endl;
     exit(1);
}
//Loop through parameter samples in sobolParams file
for (int s = 0; s < numSamples; s++)
{
    //Read in the parameter values from Sobol
     for (int i=0; i < nParams; i++)
     {
          in >> xreal[i]; //read in each parameter individually for this set
     }
     in.ignore(1000,'\n'); //ignore the rest of the line
     //Run this sample once all parameters have been read in
     calc_HyMod(indPtr);
    //Print the objective value outputs
    for(int i = 0; i < nObj; i++)
    {
         of << setw(15) << obj[i];
    }
    of << endl;
} //close the samples loop

//Close the output file
 of.close();
//Remember to clean up any memory you allocated!
deleteAllStuff(&hymodPtr->data, &hymodPtr->model, &hymodPtr->snowModel);
//end main()

As you can see, even this fairly simple task becomes complicated in C/C++. It might only take you a handful of lines to do the same thing in Matlab. But if you want speed, C/C++ is the way to go. Be careful to read in your parameters in the same order in which you sampled them!

Step 4: Calculate Sobol Indices

You now have the objective values for all of your model runs, arranged in columns in the file objectiveValues.txt. We need to send this information back to MOEAFramework to compute the sensitivity indices, using following command:

java -Xmx256m -classpath MOEAFramework-2.0-Executable.jar org.moeaframework.analysis.sensitivity.SobolAnalysis -m 0 -i objectiveValues.txt -p params.txt -o sobolIndices.txt -r 1000

As always, make sure your version of MOEAFramework is properly listed after the classpath tag! The class being called is SobolAnalysis. The flag -Xmx256m tells Java how much memory to reserve. (If you ever get an error from Java saying something like “cannot allocate heap: not enough memory”, you may need to increase the amount of memory reserved … 128m, 256m, 512m, etc.).

As shown here, the SobolAnalysis class requires several flags to be passed in: the parameter file (-p), the file containing calculated objective values (-i), the output file for the sensitivity indices (-o), and the column of the objective values file to read (-m). The columns are assumed to be zero-indexed; if you have calculated multiple objective values, you would continue on to -m 1, etc., repeating the same command as above.

Optional parameters for the sobol program include the (-r) flag, which specifies the number of boostrap resamples, and the (-s) flag, which specifies the random seed for bootstrapping. These parameters are used to calculate the confidence intervals of the sensitivity indices. If these flags are excluded, the program defaults to use the system time as the random seed, with 1000 bootstrap resamples.

Step 5: Interpret your results

Open the file sobolIndices.txt in a regular text editor. You will see something like this:

Parameter	Sensitivity [Confidence]
First-Order Effects
Ks 0.0017259632775069277 [0.009004221878377168]
Kq 0.006978812896485629 [0.007912025856580397]

...(etc.)
Total-Order Effects
Ks 0.00851970745914532 [0.010850538623282687]
Kq 6.892431800058496E-4 [0.010346174868875227]

...(etc.)
Second-Order Effects
Ks * Kq -8.715871377997618E-4 [0.01149935524352781]
Ks * DDF -0.00178083062205274 [0.01210766544038049]

...(etc.)

The parameter names will match the ones you specified in params.txt. The first order, total order, and second order sensitivities are specified as indicated. Confidence intervals for each indice are shown in brackets. Most of the indices are omitted here for the sake of brevity. Typically we use the total order indices to get a broad picture of model behavior, since they estimate all of the interactions between parameters. If the confidence intervals of your dominant indices are larger than ~10% of the value itself, you may want to consider increasing your sample size as computation permits. The parameters shown here have negligible sensitivity indices—the precision included in the output is not necessarily meaningful. For total-order indices to be important, they will usually need to be above 0.05 at the very least (the most dominant parameters will have values upward of 0.8).

Those are the basics for running Sobol using MOEAFramework. In general, the logic behind the process is exactly the same as in our old C/C++ code, and in my opinion it’s even simpler to run the Java version. Feel free to email jdh366 (at) cornell (dot) edu with questions. Thanks for reading.

6 thoughts on “Running Sobol Sensitivity Analysis using MOEAFramework

  1. This is a bash script to perform all the operations described above:

    #!/bin/bash

    PARAMS=”params.txt” # parameters sampling bounds
    SOBOLPAR=”sobolParams.txt” # parameter sets generated from Sobol Sequence
    MODEL_EXE=”test_Sobol” # name of the executable
    OUTPUT_FILE=”sobolIndices” # output file containing Sobol indices

    INIT_SAMPLE=1000 # number of initial samples to generate from the pseudo-random Sobol Sequence
    NUMOB=2 # number of objectives
    OBJS=$(seq 1 ${NUMOB})

    # arguments for java input
    JAVA_ARGS=”-Xmx256m -classpath MOEAFramework-1.13-Executable.jar:.”
    SAMPLEGEN_ARGS=”org.moeaframework.analysis.sensitivity.SampleGenerator”
    SOBOL_ARGS=”org.moeaframework.analysis.sensitivity.SobolAnalysis”

    # GENERATION OF PARAMETER SETS USING SOBOL SEQUENCE
    echo “…generating parameter sets using Sobol Sequence”
    java ${JAVA_ARGS} ${SAMPLEGEN_ARGS} -m saltelli -n ${INIT_SAMPLE} -p ${PARAMS} > ${SOBOLPAR}

    # RUN THE MODEL WITH THE PARAMETER SETS
    echo “…running the model”
    ./${MODEL_EXE} ${INIT_SAMPLE}

    # COMPUTATION OF SOBOL INDICES FOR EACH OBJECTIVE
    echo “…computation of Sobol Indices”
    for N in ${OBJS}
    do
    NAME=${OUTPUT_FILE}_${N}
    let “OB=${N}-1”
    java ${JAVA_ARGS} ${SOBOL_ARGS} -m ${OB} -r 1000 -i objectiveValues.txt -p ${PARAMS} -o ${NAME}.txt
    done
    echo “finished”

  2. Pingback: Running Sobol Sensitivity Analysis « Pat Reed Group Research Tips Blog

  3. Pingback: Water Programming Blog Guide (Part I) – Water Programming: A Collaborative Research Blog

Leave a comment