MOEAFramework Training Part 4: Processing Metrics and Creating Visualizations

Part 4 wraps up the MOEAFramework training by taking the metrics generated in Part 3 and visualizing them to gain general insight about algorithm behavior and assess strengths and weaknesses of the algorithms.

The .metrics files stored in the data_metrics folder look like the following:

Picture1

Metrics are reported every 1000 NFE and a .metrics file will be created for each seed of each parameterization of each algorithm. There are different ways to proceed with merging/processing metrics depending on choice of visualization. Relevant scripts that aren’t in the repo can be found in this zipped folder along with example data.

Creating Control Maps

When creating control maps, one can average metrics across seeds for each parameterization or use best/worst metrics to try to understand the best/worse performance of the algorithm. If averaging metrics, it isn’t unusual to find that all metrics files may not have the same number of rows, therefore rendering it impossible to average across them. Sometimes the output is not reported as specified. This is not unusual and rather just requires you to cut down all of your metric files to the greatest number of common rows. These scripts are found in ./MOEA_Framework_Group/metrics

1.Drag your metrics files from data_metrics into ./MOEA_Framework_Group/metrics and change all extensions to .txt (use ren .metrics .txt in command prompt to do so)

2.Use Cutting_Script.R to find the maximum number of rows that are common among all seeds. This will create new metric files in the folder, Cut_Files

Now these files can be averaged and grouped with their corresponding parameter values.

1.Seed_Merge.R: Creates a text file with the average hypervolume for each parameterization for each algorithm (i.e. hypervolume_Borg.txt)

2.Add Borg_Samples.txt and NSGAII_Samples.txt to the folder

3.Make_Final_Table.R: takes the population values from the sample file and the hypervolume values for each parameterization and puts them into a final matrix in a form to be accepted by the control map code.

In order to create control maps, you need to first find the reference set hypervolume, because all metrics are normalized to this overall hypervolume. This can be done using the following command and the HypervolumeEval java class written by Dave Hadka.

$ java -cp MOEAFramework-2.12-Demo.jar HypervolumeEval ./data_ref/lake.ref >> lake_ref.hypervolume

Finally, use Control_Map_Borg.py and Control_Map_NSGAII.py make your control maps.

Picture1

Control Maps for Borg and NSGAII

Initial population size on the x-axis can be regarded as a proxy for different parameterizations and the y-axis shows the number of NFE. The color represents the percentage of the overall reference hypervolume that is achieved. Control maps highlight a variety of information about an algorithm: Controllability (sensitivity to parameterization), Effectiveness (quality of approximation sets), and Efficiency (how many NFE it takes to achieve high quality solutions)

Ideally, we would want to see a completely dark blue map that would indicate that the algorithm is able to find high quality solutions very quickly for any parameterization. We can see that this is not the case for either of the algorithms above. Any light streaks indicate that for that particular parameterization, the algorithm had a harder time achieving high quality solutions. Borg is generally robust to parameterization and as seen, if allowed more NFE, it will likely result in a more even blue plot.

Creating Attainment Plots

To create attainment plots:

1.Drag metrics back from Cut_Files back into the data_metrics_new directory on the Cube.

2.Use the average_metrics.sh script to average out the metrics to obtain a set of average metrics across seeds for each algorithm.

3.Concatenate parameter files using: cat NSGAII_lake_*.average >> NSGAII_Concatenate_Average_Metrics.txt

4.Use Example_Attain.m to find the best metric values and to calculate the probability of attainment of the best metrics.

5.Create attainment vectors with build_attainment_matrix.py

6.Plot with color_mesh.py

Picture1

Attainment plots for each metric and algorithm

Attainment plots highlight the following:

Reliability of the algorithm: In general, we would like to see minimal attainment variability which would suggest that our algorithm reliably produces solutions of high quality across seeds. The white circles show the algorithm’s single best run for each metric. The gradient shows the probability of obtaining the best metric value. You can see here that the algorithms are able to reliably obtain high generational distance metrics. However, remember that generational distance is an easy metric to meet. For the other two metrics, one can see that while NSGAII obtains the best metric values, Borg has a slightly higher reliability of obtaining high hypervolume values which arguably is a more important quality that demonstrates robustness in the algorithm.

There are some extra visualizations that can be made to demonstrate algorithmic performance.

Reference Set Contribution

How much of the reference set is contributed by each algorithm?

1.Copy MOEAFramework jar file into data_ref folder

2.Add a # at the end of the individual algorithm sets

3.java -cp MOEAFramework-2.12-Demo.jar org.moeaframework.analysis.sensitivity.SetContribution -e 0.01,0.01,0.0001,0.001 -r lake.ref Borg_lake.set NSGAII_lake.set > lake_set_contribution.txt

ReferenceSet.PNG

lake_set_contribution.txt, as seen above, reports the percentage (as a decimal) of the reference set that is contributed by each algorithm and includes unique and non-unique solutions that could have been found by both algorithms. Typically, these percentages are shown in a bar chart and would effectively display the stark difference between the contribution made by Borg and NSGAII in this case.

Random Seed Analysis

A random seed analysis is a bit of a different experiment and requires just one parameterization, the default parameterization for each algorithm. Usually around 50 seeds of the defaults are run and the corresponding average hypervolume is shown as solid line while the 5th and 95th percentile confidence interval across seeds is shown as shading. Below is an example of such a plot for a different test case:

Picture3.jpg

Random seed analysis plot of hypervolume as a function of NFE

This style of plot is particularly effective at showcasing default behavior, as most users are likely to use the algorithms “straight out of the box”. Ideally, the algorithms have monotonically increasing hypervolume that reaches close to 1 in a small number of functional evaluations and also thin shading to indicate low variability among seeds. Any fluctuations in hypervolume indicates deterioration in the algorithm, which is a result of losing non-dominated solutions and is an undesirable quality of an algorithm.

MOEAFramework Training Part 3: Calculating Metrics

Part 3 of the MOEAFramework Training covers calculation of metrics that can be used to assess the performance of your algorithms. The relevant files and folders for this part of the training have been added to the Github repo. These are:

  • generate_refset.sh
  • evaluate.sh 

Generation of a Reference Set

In order to assess the performance of an algorithm, you have to have something to compare its approximation sets to. If your optimization problem has a theoretical solution, then the approximation sets produced by the run.sh script in Part 2 can be compared against a theoretical reference set. However, most real world optimization problems do not have a solution. Therefore, a best known reference set must be used for comparison. The best known reference set is usually generated by combining the best solutions from all algorithms that are tested.

generate_refset.sh uses MOEAFramework’s ResultFileMerger class to first merge all seeds of one parameterization into a merged set (i.e. NSGAII_lake_P2.set). Then, the script implements the ReferenceSetMerger class to combine all parameterizations into one overall algorithm reference set (i.e. Borg_lake.set) based on the epsilons provided in the settings.sh script. Finally, the ReferenceSetMerger is once again used to determine an overall best reference set across all algorithms (i.e. lake.ref). After running ./generate_refset.sh, the corresponding files will be stored in the folder data_ref. At this point, one can then make a 3D plot of the overall reference set and the algorithm-specific reference sets to start visualizing differences in algorithms. The figure below shows the best-known reference set for the DPS version of the lake problem found by plotting the solutions in lake.ref as well as the best reference sets for Borg and NSGA-II found after running 10 seeds and 10 parameterizations of the algorithms for 25,000 NFE. For a more thorough diagnostic experiment, usually 20-50 seeds of 100 parameterizations of each algorithm are run for around 100,000 NFE.

Reference_Set

Calculation of Performance Metrics

From the figures, differences in each algorithm’s reference set can be seen. Notably, Borg is able to produce solutions with a much higher reliability than NSGA-II and ultimately it seems as though mostly Borg’s solutions contribute to the overall reference set. Depending on the complexity of the problem, these differences will become more apparent. To more formally quantify algorithm performance, we use the ResultFileEvaluator in the evaluate.sh script to calculate performance metrics including hypervolume, generational distance, and epsilon indicator using the raw optimization results and the overall reference set. More information on these metrics and how to calculate them can be found in Evolutionary Algorithms for Solving Multi-Objective Problems by Coello Coello (2007). Note that the metrics are reported every 100 NFE, as defined in the run.sh file. After running ./evaluate.sh, metric files will be stored in the data_metrics folder. A metric file will be created for each seed of each parameterization. Ultimately, as will be shown in the next post, these files can be averaged across seeds and/or parameterizations to determine average performance of an algorithm.

In the next post, we will show how to merge these metrics and effectively visualize them to assess the performance of the algorithm over time and over different parameterizations.

MOEAFramework Training Part 1: Connecting an External Problem

The goal of this training is to step a user through becoming familiar with the capabilities of MOEAFramework, a free and open source Java library created by Dave Hadka, that allows the user to design, execute, and test out a variety of popular multi-objective evolutionary algorithms (MOEAs). In this series, we will demonstrate the capabilities of MOEAFramework in 4 parts. Part 1 demonstrates how to hook up an external optimization problem to MOEAFramework. Part 2 will cover the optimization of the problem using a variety of algorithms. Part 3 will illustrate how to calculate metrics to assess the performance of the algorithms. Finally, Part 4 will step through generation of relevant figures from the metrics that convey the effectiveness, efficiency, reliability, and controllability of the algorithms.

The example test case that will be used throughout this tutorial is the DPS version of the Lake Problem. The code is written and adapted by Dr. Julianne Quinn and can be found here. However, the tutorial will be set up so that the user can easily swap in their own problem formulation.

Finally, this guide is specifically built to connect an external problem that is written in C++ but Dave Hadka’s Beginner Guide to the MOEA Framework, has examples of how to create the Java executable for problems written in a language other than C++.

Connecting Your Problem

Before starting this training, it is recommended to set up a directory in a Linux environment with Java installed. If you are a student in Reed Research Group, create a folder in your Cube directory called “MOEA_Diagnostics_Tutorial”. Throughout the training, we will be adding various subdirectories and files to this folder. For this part of the tutorial, you will need the following in your directory:

  1. MOEAFramework Demo .jar file which can be found by clicking on “Demo Application” on the far right.
  2. The C++ version of the Lake Problem
  3. SOWs_Type6.txt (natural inflows read in by the .cpp file)
  4. moeaframework.c and moeaframework.h found here

Once you have the .cpp file, the next step is to get it set up to be recognized by MOEAFramework. I have made these changes in this file in my GitHub repo. If you compare with the file in 2, you will see some changes.

First, I have commented out all parts of the code related to Borg, which the problem was originally set up to be optimized with. For this tutorial, we will be optimizing with multiple algorithms and comparing their performance. Secondly, in lines 55-57, I have use the #define directive to declare the number of variables, objectives, and constraints to be constants. Lines 300-309 is where the direct connection to MOEAFramework comes in. MOEA_Init initializes the communication between C++ and MOEAFramework and takes the number of objectives and constraints as arguments. Then we can start reading and evaluating solutions. MOEA_Read_doubles extracts the decision variables and stores them in an array, vars. Line 304 calls the lake_problem function that we would like to evaluate, along with the variables, and constraints. This results in the objs array being filled. Finally, MOEA_Write writes the objectives back into the framework. When all solutions are done being read and written, MOEA_Terminate closes the connection. The important thing here is just to make sure that you are passing the names of the arguments of your lake_problem function correctly.

Next, we must make an executable of the C++ file that Java can read. The makefile is located here and takes lake.cpp, moeaframework.c, moeaframework.h, and some relevant libraries and compiles them into an executable called “lake”. In order to run this file, simply type “make”.

Finally, we must turn our executable into a java class. The relevant java file can be found here. This file can be found in the MOEAFramework documentation, but must be tailored to an external problem. Lines 21-25 import in the relevant tools from MOEAFramework that help to configure and solve the problem. Lines 40, 42, 43, and 48 show the first change from the original file. Here we insert the name of our executable that was generated in the last step, “lake”. In lines 53, 58, and 63, we state the number of decision variables, objectives, and constraints. Finally, in lines 67-75, we create a newSolution ( ) method to specify that our solution should have 6 real-valued decision variables between 0 and 1.

In order to create a class file, simply type:

 

javac -classpath MOEAFramework-2.12-Demo.jar lake.java 

 

A file called lake.class will be created.

The first part of training is done! Next time, we will set up some scripts, call these executables, and optimize the lake problem with a variety of different algorithms.

 

 

 

Algorithm Diagnostics Walkthrough using the Lake Problem as an example (Part 3 of 3: Metrics-based analysis of algorithm performance)

Now that you have your desired metrics based on part 2 of this series, it is possible to gain more insight into your algorithm performance. When I performed this analysis for the actual study, I used the AWRAnalysis.java, Analysis_Attainment_LakeProblem.sh and HypervolumeEval.java files found in the Github repository as explained in the README. I later discovered it was possible to do this within the framework, so that option will be discussed here.

It is possible to calculate the hypervolume of a Pareto Approximate Front within the framework using the SetHypervolume class. For more information on the MOEAFramework classes, please see the following link (http://moeaframework.org/javadoc/index.html).

I used the following command: (Note the change to version 2.3 because I reran this command today to check I remembered it correctly although it seems there is now a version 2.4. It is always best to use the newest version.)


java –cp MOEAFramework-2.3-Demo.jar org.moeaframework.analysis.sensitivity.SetHypervolume myLake4ObjStoch.reference –e 0.01,0.01,0.0001,0.0001 myLake4ObjStoch.reference

This returns a hypervolume value between 0 and 1 that is useful for threshold calculations as shown below.

To calculate threshold attainments, use the Analysis class. Below is an example of performing attainment analysis within the framework instead of using AWRAnalysis.java.  This approach generates a huge number of files, which are best understood when plotted, a subject for a future post.


#!/bin/bash
#source setup_LTM.sh

dim=4
problem=myLake4ObjStoch
epsilon="0.01,0.01,0.0001,0.0001"

algorithms="Borg eMOEA eNSGAII GDE3 MOEAD NSGAII"
seeds="1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50"
percentiles="`seq 1 1 100`"
thresholds=(`seq 0.01 0.01 1.0`)

#compute averages across metrics
#echo "Computing averages across metrics..."
#for algorithm in ${algorithms}
#do
# echo "Working on: " ${algorithm}
# java -classpath `cygpath -wp $CLASSPATH` org.moeaframework.analysis.sensitivity.MetricFileStatistics --mode average --output $WORK/metrics/${algorithm}_${problem}.average $WORK/metrics/${algorithm}_${problem}_*.metrics
#done
#echo "Done!"

#compute search control metrics (for best and attainment)
echo "Computing hypervolume search control metrics..."
for algorithm in ${algorithms}
do
 echo "Working on: " ${algorithm}
 counter=$1
 for percentile in ${percentiles}
 do
 java -classpath MOEAFramework-2.3-Demo.jar org.moeaframework.analysis.sensitivity.Analysis --parameterFile ./${algorithm}_params.txt --parameters ./${algorithm}_Latin --metric 0 --threshold ${thresholds[$counter]} --hypervolume 0.7986 ./SOW6/metrics/average_replace_NaNs/${algorithm}_${problem}.average > ./test/Hypervolume_${percentile}_${algorithm}.txt
 counter=$((counter+1))
 done
 done
echo "Done!"

echo "Computing generational distance search control metrics..."
for algorithm in ${algorithms}
do
 echo "Working on: " ${algorithm}
 counter=$1
 for percentile in ${percentiles}
 do
 java -classpath MOEAFramework-2.3-Demo.jar org.moeaframework.analysis.sensitivity.Analysis --parameterFile ./${algorithm}_params.txt --parameters ./${algorithm}_Latin --metric 1 --threshold ${thresholds[$counter]} ./SOW6/metrics/average_replace_NaNs/${algorithm}_${problem}.average > ./test/GenDist_${percentile}_${algorithm}.txt
 counter=$((counter+1))
 done
done
echo "Done!"

echo "Computing additive epsilon indicator search control metrics..."
for algorithm in ${algorithms}
do
 echo "Working on: " ${algorithm}
 counter=$1
 for percentile in ${percentiles}
 do
 java -classpath MOEAFramework-2.3-Demo.jar org.moeaframework.analysis.sensitivity.Analysis --parameterFile ./${algorithm}_params.txt --parameters ./${algorithm}_Latin --metric 4 --threshold ${thresholds[$counter]} ./SOW6/metrics/average_replace_NaNs/${algorithm}_${problem}.average > ./test/EpsInd_${percentile}_${algorithm}.txt
 counter=$((counter+1))
 done
done
echo "Done!"

I did encounter some caveats while working through this process. Scripts for handling them and instructions are provided in the Diagnostic-Source README on Github. One caveat that is not covered there is increasing the speed of the hypervolume calculation. Please see Dave Hadka’s Hypervolume repository for more information (https://github.com/dhadka/Hypervolume).

Algorithm Diagnostics Walkthrough using the Lake Problem as an example (Part 2 of 3: Calculate metrics for Analysis) Tori Ward

This post continues from Part 1, which provided examples of using the MOEAFramework to generate Pareto approximate fronts for a comparative diagnostic study.

Once one has finished generating all of the approximate fronts and respective reference sets one hopes to analyze, metrics may be calculated within the MOEAFramework. I calculated metrics for both my local reference sets and all of my individual approximations of the Pareto front. The metrics for the individual approximations were averaged for each parameterization across all seeds to determine the expected performance for a single seed.

Calculate Metrics

Local Reference Set Metrics

#!/bin/bash

NSAMPLES=50
NSEEDS=50
METHOD=Latin
PROBLEM=myLake4ObjStoch
ALGORITHMS=( NSGAII GDE3 eNSGAII MOEAD eMOEA Borg)

SEEDS=$(seq 1 ${NSEEDS})
JAVA_ARGS="-cp MOEAFramework-2.1-Demo.jar"
set -e

for ALGORITHM in ${ALGORITHMS[@]}
do
NAME=${ALGORITHM}_${PROBLEM}
PBS="\
#PBS -N ${NAME}\n\
#PBS -l nodes=1\n\
#PBS -l walltime=96:00:00\n\
#PBS -o output/${NAME}\n\
#PBS -e error/${NAME}\n\
cd \$PBS_O_WORKDIR\n\
java ${JAVA_ARGS} \
org.moeaframework.analysis.sensitivity.ResultFileEvaluator \
--b ${PROBLEM} --i ./SOW4/${ALGORITHM}_${PROBLEM}.reference \
--r ./SOW4/reference/${PROBLEM}.reference --o ./SOW4/${ALGORITHM}_${PROBLEM}.localref.metrics"
echo -e $PBS | qsub
done

Individual Set Metrics

#!/bin/bash

NSAMPLES=50
NSEEDS=50
METHOD=Latin
PROBLEM=myLake4ObjStoch
ALGORITHMS=( NSGAII GDE3 eNSGAII MOEAD eMOEA Borg)

SEEDS=$(seq 1 ${NSEEDS})
JAVA_ARGS="-cp MOEAFramework-2.1-Demo.jar"
set -e

for ALGORITHM in ${ALGORITHMS[@]}
do
for SEED in ${SEEDS}
do
NAME=${ALGORITHM}_${PROBLEM}_${SEED}
PBS="\
#PBS -N ${NAME}\n\
#PBS -l nodes=1\n\
#PBS -l walltime=96:00:00\n\
#PBS -o output/${NAME}\n\
#PBS -e error/${NAME}\n\
cd \$PBS_O_WORKDIR\n\
java ${JAVA_ARGS} \
org.moeaframework.analysis.sensitivity.ResultFileEvaluator \
--b ${PROBLEM} --i ./SOW4/sets/${ALGORITHM}_${PROBLEM}_${SEED}.set \
--r ./SOW4/reference/${PROBLEM}.reference --o ./SOW4/metrics/${ALGORITHM}_${PROBLEM}_${SEED}.metrics"
echo -e $PBS | qsub
done
done

Average Individual Set Metrics across seeds for each parameterization

#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -N moeaevaluations
#PBS -j oe
#PBS -l walltime=96:00:00

cd "$PBS_O_WORKDIR"

NSAMPLES=50
NSEEDS=50
METHOD=Latin
PROBLEM=myLake4ObjStoch
ALGORITHMS=( NSGAII GDE3 eNSGAII MOEAD eMOEA Borg)

SEEDS=$(seq 1 ${NSEEDS})
JAVA_ARGS="-cp MOEAFramework-2.1-Demo.jar"
set -e

# Average the performance metrics across all seeds
for ALGORITHM in ${ALGORITHMS[@]}
do
echo -n "Averaging performance metrics for ${ALGORITHM}..."
java ${JAVA_ARGS} \
org.moeaframework.analysis.sensitivity.SimpleStatistics \
-m average --ignore -o ./metrics/${ALGORITHM}_${PROBLEM}.average ./metrics/${ALGORITHM}_${PROBLEM}_*.metrics
echo "done."
done

At the end of this script, I also calculated the set contribution I mentioned earlier by including the following lines.

# Calculate set contribution
echo ""
echo "Set contribution:"
java ${JAVA_ARGS} org.moeaframework.analysis.sensitivity.SetContribution \
-e 0.01,0.01,0.001,0.01 -r ./reference/${PROBLEM}.reference ./reference/*_${PROBLEM}.combined

Part 3 covers using the MOEAFramework for further analysis of these metrics.

Algorithm Diagnostics Walkthrough using the Lake Problem as an example (Part 1 of 3: Generate Pareto approximate fronts)

This three part series is an overview of the algorithm diagnostics I performed in my Lake Problem study with the hope that readers may apply the steps to any problem of interest. All of the source code for my study, including the scripts used for the diagnostics can be found at https://github.com/VictoriaLynn/Lake-Problem-Diagnostics.

The first step to using the MOEAFramework for comparative algorithm diagnostics was to create the simulation model on which I would be assessing algorithm performance. The Lake Problem was written in C++. The executable alone could be used for optimization with Borg and I created a java stub to connect the problem to the MOEAFramework. (https://github.com/VictoriaLynn/Lake-Problem-Diagnostics/blob/master/Diagnostic-Source/myLake4ObjStoch.java).  Additional information on this aspect of a comparative study can be found in examples 4 and 5 for the MOEAFramework (http://moeaframework.org/examples.html) and in Chapter 5 of the manual. I completed the study using version 2.1, which was the newest at the time. I used the all in one executable instead of the source code although I compiled my simulation code within the examples subfolder of the source code.

Once I had developed an appropriate simulation model to represent my problem, I could begin the diagnostic component of my study. I first chose algorithms of interest and determined the range of parameters from which I would like to sample. To determine parameter ranges, I consulted Table 1 of the 2013 AWR article by Reed et al.

Reed, P., et al. Evolutionary Multiobjective Optimization in Water Resources: The Past, Present, and Future. (Editor Invited Submission to the 35th Anniversary Special Issue), Advances in Water Resources, 51:438-456, 2013.

Example parameter files and the ones I used for my study can be found at https://github.com/VictoriaLynn/Lake-Problem-Diagnostics/tree/master/Diagnostic-Source/params. Once I had established parameter files for sampling, I found chapter 8 of the MOEAFramework manual to be incredibly useful.  Below I walk through the steps I took in generating approximations of the Pareto optimal front for my problem across multiple seeds, algorithms, and parameterizations.   All of the commands have been consolidated into the file Lake_Problem_Comparative_Study.sh on Github, but I had many separate files during my study, which will be separated into steps here. It may have been possible to automate the whole process, but I liked breaking it up into separate scripts to make sure I checked that the output made sense after each step.

Step 1: Generate Parameter Samples To generate parameter samples for each algorithm, I used the following code, which I kept in a file called sample_parameters.sh. I ran all .sh scripts using the general command sh script_name.sh.

NSAMPLES=500
METHOD=Latin
PROBLEM=myLake4ObjStoch
ALGORITHMS=(Borg MOEAD eMOEA NSGAII eNSGAII GDE3)
JAVA_ARGS="-cp MOEAFramework-2.1-Demo.jar"

# Generate the parameter samples
echo -n "Generating parameter samples..."
for ALGORITHM in ${ALGORITHMS[@]}
do
java ${JAVA_ARGS} \
org.moeaframework.analysis.sensitivity.SampleGenerator \
--method ${METHOD} --n ${NSAMPLES} --p ${ALGORITHM}_params.txt \
--o ${ALGORITHM}_${METHOD}
done

Step 2: Optimize the problem using algorithms of interest This step had two parts: optimization with Borg and optimization with the MOEAFramework algorithms. To optimize using Borg, one needs to request Borg at http://borgmoea.org/. This is the only step that needs to be completed outside of the MOEAFramework. I then used the following script to generate approximations to the Pareto front for all 500 samples and 50 random seeds. The –l and –u flags indicate upper and lower bounds for decision variable values. Fortunately, it should soon be possible to type one value and specify the number of variables with that bound instead of typing all 100 values as shown here.

#!/bin/bash
#50 random seeds

NSEEDS=50
PROBLEM=myLake4ObjStoch
ALGORITHM=Borg

SEEDS=$(seq 1 ${NSEEDS})

for SEED in ${SEEDS}
do
NAME=${ALGORITHM}_${PROBLEM}_${SEED}
PBS="\
#PBS -N ${NAME}\n\
#PBS -l nodes=1\n\
#PBS -l walltime=96:00:00\n\
#PBS -o output/${NAME}\n\
#PBS -e error/${NAME}\n\
cd \$PBS_O_WORKDIR\n\
./BorgExec -v 100 -o 4 -c 1 \
-l 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 \
-u 0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1,0.1 \
-e 0.01,0.01,0.0001,0.0001 -p Borg_params.txt -i Borg_Latin -s ${SEED} -f ./sets/${ALGORITHM}_${PROBLEM}_${SEED}.set -- ./LakeProblem4obj_control "
echo -e $PBS | qsub
done

Optimization with the MOEAFramework allowed me to submit jobs for all remaining algorithms and seeds with one script as shown below. In my study, I actually submitted epsilon dominance algorithms (included –e flag) and point dominance algorithms (did not include –e flag) separately; however, it is my understanding that it would have been fine to submit jobs for all algorithms with the epsilon flag, especially since I converted all point dominance approximations to the Pareto front to epsilon dominance when generating reference sets.


#!/bin/bash

NSEEDS=50
PROBLEM=myLake4ObjStoch
ALGORITHMS=(MOEAD GDE3 NSGAII eNSGAII eMOEA)

SEEDS=$(seq 1 ${NSEEDS})
JAVA_ARGS="-cp MOEAFramework-2.1-Demo.jar"
set -e

for ALGORITHM in ${ALGORITHMS[@]}
do
for SEED in ${SEEDS}
do
NAME=${ALGORITHM}_${PROBLEM}_${SEED}
PBS="\
#PBS -N ${NAME}\n\
#PBS -l nodes=1\n\
#PBS -l walltime=96:00:00\n\
#PBS -o output/${NAME}\n\
#PBS -e error/${NAME}\n\
cd \$PBS_O_WORKDIR\n\
java ${JAVA_ARGS}
org.moeaframework.analysis.sensitivity.Evaluator -p
${ALGORITHM}_params.txt -i ${ALGORITHM}_Latin -b ${PROBLEM}
-a ${ALGORITHM} -e 0.01,0.01,0.0001,0.0001 -s ${SEED} -o ./sets/${NAME}.set"
echo -e $PBS | qsub
done

done

Step 3: Generate combined approximation set for each algorithm and Global reference set Next, I generated a reference set for each algorithm’s performance. This was useful as it made it easier to generate the global reference set for all algorithms across all seeds and parameterizations and it allowed me to calculate a percent contribution for each algorithm to the global reference set. Below is the script for the algorithm reference sets:

#!/bin/bash

NSAMPLES=50
NSEEDS=50
METHOD=Latin
PROBLEM=myLake4ObjStoch
ALGORITHMS=( NSGAII GDE3 eNSGAII MOEAD eMOEA Borg)

JAVA_ARGS="-cp MOEAFramework-2.1-Demo.jar"
set -e

# Generate the combined approximation sets for each algorithm
for ALGORITHM in ${ALGORITHMS[@]}
do
echo -n "Generating combined approximation set for
${ALGORITHM}..."
java ${JAVA_ARGS} \
org.moeaframework.analysis.sensitivity.ResultFileMerger \
-b ${PROBLEM} -e 0.01,0.01,0.0001,0.0001 -o ./SOW4/reference/${ALGORITHM}_${PROBLEM}.combined \
./SOW4/sets/${ALGORITHM}_${PROBLEM}_*.set
echo "done."
done

In the same file, I added the following lines to generate the global reference set while running the same script.
# Generate the reference set from all combined approximation sets
echo -n "Generating reference set..."
java ${JAVA_ARGS} org.moeaframework.util.ReferenceSetMerger \
-e 0.01,0.01,0.0001,0.0001 -o ./SOW4/reference/${PROBLEM}.reference ./SOW4/reference/*_${PROBLEM}.combined > /dev/null
echo "done."

If one wants to keep the decision variables associated with the reference set solutions, it is possible to use org.moeaframework.analysis.sensitivity.ResultFileMerger on all of the pertinent .set files.

A final option for reference sets is to generate local reference sets for each parameterization of each algorithm. This was done with the following script:

#!/bin/bash
NSEEDS=50
ALGORITHMS=( GDE3 eMOEA Borg NSGAII eNSGAII MOEAD)
PROBLEM=myLake4ObjStoch

SEEDS=$(seq 1 ${NSEEDS})

# Evaluate all algorithms for all seeds
for ALGORITHM in ${ALGORITHMS[@]}
do
java -cp MOEAFramework-2.1-Demo.jar org.moeaframework.analysis.sensitivity.ResultFileSeedMerger -d 4 -e 0.01,0.01,0.0001,0.0001 \
--output ./SOW4/${ALGORITHM}_${PROBLEM}.reference ./SOW4/objs/${ALGORITHM}_${PROBLEM}*.obj
done

Part 2 of this post walks through my calculation of metrics.

Determining Whether Differences in Diagnostic Results are Statistically Significant (Part 1)

While working on my diagnostic assessment of the difficulty of the “Lake Problem”, I decided that I would like to check that the differences in performance I observed were statistically significant. This part of the blog post is a broad overview of the steps I took before the statistical analysis and that analysis. Part 2 includes a tutorial. Looking through prior group papers, I drew initial inspiration from the following articles by Dave although I think my final method differs slightly.

Hadka, D. and P. Reed. Diagnostic Assessment of Search Controls and Failure Modes in Many-Objective Evolutionary Optimization. Evolutionary Computation, 20(3):423-452, 2012.

Hadka, D. and P. Reed. Borg: An Auto-Adaptive Many-Objective Evolutionary Computing Framework. Evolutionary Computation, 21(2):231-259, 2013.

Below is a short summary of the steps preceding my statistical analysis with links to group resources for those who may need more background.  For everyone else, feel free to skip ahead.

Before I could determine the statistical significance of my results, I needed to perform a comparative study of algorithm performance. I got help from sections 8.1 to 8.8 and section 8.10 of the MOEAFramework manual, which can be found at the MOEAFramework website to generate approximate fronts for the problem for multiple random seeds (I used 50) using algorithms found within the MOEAFramework, calculate metrics for each seed across all parameterizations, and determine the best value for each seed across all parameterizations and the probability of attaining a specified threshold (I used 75%) for each seed.  I skipped section 8.9 and performed the analysis described in 8.10 on metrics for each seed’s metrics as many samples are necessary to perform statistics on algorithm performance.  Please note, I also included the Borg evolutionary algorithm in this study, which requires optimization outside of the framework. All other steps for Borg can be performed the same as for the MOEAFramework algorithms.

In this case, I used Latin Hypercube Sampling to generate 500 samples of the feasible parameter space for each algorithm. I chose parameter ranges from Table 1 of the following paper.
Reed, P., et al. Evolutionary Multiobjective Optimization in Water Resources: The Past, Present, and Future. (Editor Invited Submission to the 35th Anniversary Special Issue), Advances in Water Resources, 51:438-456, 2013.

Here is the shell script I used for calculating the best attained values and probability of attaining the 75% threshold for all algorithms and all seeds.

NSEEDS=50
METHOD=Latin
PROBLEM=myLake4ObjStoch
ALGORITHMS=( NSGAII GDE3 eNSGAII MOEAD eMOEA Borg)

SEEDS=$(seq 1 ${NSEEDS})
JAVA_ARGS="-cp MOEAFramework-2.1-Demo.jar"
set -e

# Perform the analysis
echo ""
echo "Hypervolume Analysis:"
for ALGORITHM in ${ALGORITHMS[@]}
do
	for SEED in ${SEEDS}
	do
    NAME=${ALGORITHM}_${PROBLEM}_${SEED}_Hypervolume

    PBS="\
    #PBS -N ${NAME}\n\
    #PBS -l nodes=1\n\
    #PBS -l walltime=96:00:00\n\
    #PBS -o output/${NAME}\n\
    #PBS -e error/${NAME}\n\
    cd \$PBS_O_WORKDIR\n\
        java ${JAVA_ARGS} org.moeaframework.analysis.sensitivity.Analysis \
        -p ${ALGORITHM}_params.txt -i ${ALGORITHM}_${METHOD} -t 0.75 -o ./SOW6/Hypervolume/${ALGORITHM}_${PROBLEM}_${SEED}_Hypervolume.analysis \
        -c -e -m 0 --hypervolume 0.7986 ./SOW6/metrics/${ALGORITHM}_${PROBLEM}_${SEED}.metrics"
    echo -e $PBS | qsub
    done

done

The above shell script is written for Hypervolume, but other methods can be specified easily by indicating a different number after the -m flag and removing the hypervolume flag.  If one wants to analyze Generational Distance, type 1 after the -m flag.  For the Additive Epsilon Indicator, type 4.  Typically, these are the three metrics of interest as explained in this blog post https://waterprogramming.wordpress.com/2013/06/25/moea-performance-metrics/.   The –hypervolume flag allows you to specify the hypervolume of your reference set. That way the individual reference set hypervolume values are normalized to that of the best known approximation to the Pareto front.

Hypervolume can be calculated for a given approximation set using the following command line argument.

java -cp MOEAFramework-2.1-Demo.jar org.moeaframework.analysis.sensitivity.SetHypervolume name_of_reference_set_file 

Finally, I calculated statistics on these values.  I used the Kruskal Wallis and Mann Whitney U (Also called the Wilcoxon Rank Sum) tests that Dave used.  One advantage of these tests is that they are non-parameteric, meaning that assumptions are not made about the distributions from which the data come.  Another advantage I found is that they are built into Matlab as the kruskalwallis and ranksum functions.  I began with the Kruskal Wallis test as it tests the null hypothesis that independent samples from two or more groups come from distributions with equal medians and returns the p-value that this is true given the data.  This test is a useful screen, as there would not have been much point in continuing if this test indicated that the values for all algorithms had been statistically similar.

I received very low p-values using Kruskal Wallis so I continued to the Mann Whitney U test in every case.  The Mann Whitney U test provides a pair-wise comparison.   Matlab has options to test one of three alternative hypotheses

  1. medians are not equal (default, two-tailed test)
  2. median of X is greater than median of Y (right-tailed test)
  3. median of X is less than median of Y (left-tailed test)

I used one tailed tests in order to determine if one algorithm’s performance was superior to another algorithm’s performance.  I then was able to rank algorithm performance on any given metric by determining the number of algorithms each algorithm outperformed according to this test. A detailed walk through is provided in Part 2.

Default parameters for MOEAFramework algorithms and Borg

I recently ran into an interesting case when performing runtime dynamics and control map diagnostics on my 2 objective formulation of the lake problem with 2 constraints.  Although Borg yielded a much better reference set for the control map with a small LHS sample for each algorithm, the other algorithms did better on runtime when limited to 25,000 NFE.  In part this resulted from sbx being a dominant operator, allowing those algorithms with it to charge forward while Borg’s adaptive operators took longer to choose it, but it still made me wonder what the default parameters were.

Borg’s were easily found in Table 3 of the paper
Hadka, David, Patrick M. Reed, and Timothy W. Simpson. “Diagnostic assessment of the borg MOEA for many-objective product family design problems.” Evolutionary Computation (CEC), 2012 IEEE Congress on. IEEE, 2012.
found here http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6256466&tag=1.

Ranges of Borg parameters and default

Ranges of Borg parameters and default parameters. L is the number of decision variables

You can find the default parameters for the MOEAFramework algorithms by inspecting the source code.
Below is a table showing the default parameters that one would specify when control mapping for MOEAFramework Native algorithms followed by where to look in the source code for those, Jmetal, and PISA algorithms and operator default parameters should you want more information. Please note, this table does not provide the actual value of the parameter in all cases as the framework seems to adjust some of the values by certain properties of the problem to attain a final value. These are merely user input values.

Default parameters for native MOEAFramework algorithms.  L is the number of decision variables.

Default parameters for native MOEAFramework algorithms. L is the number of decision variables.

To find the default parameters for the algorithms native to the framework (eNSGAII, NSGAII, eMOEA, MOEA/D, and GDE3) you can check the following java class in the source code:

org.moeaframework.algorithm.StandardAlgorithm

For JMetal Algorithms see:

org.moeaframework.algorithm.jmetal.JMetalAlgorithms

and for Pisa Algorithms see:

org.moeaframework.algorithm.pisa.PISAAlgorithms

Finally, if you want the default parameters for the operators, you can look in this class:

org.moeaframework.core.spi.OperatorFactory

Thanks to Dave for pointing out where to look in the source code!

Runtime metrics for MOEAFramework algorithms, extracting metadata from Borg runtime, and handling infinities

I have been working with runtime metrics for a variety of algorithms on the “Lake Problem” for max NFE of only 25,000 at an output frequency of 1,000. Here are a few things I learned along the way for which the group does not seem to have consolidated resources.

First, most of our recent runtime resources have been focused on Borg, so although it is extremely easy to get runtime metrics for MOEAFramework algorithms (in fact it’s one of the site’s examples), I was initially at a loss for where to look. What I ended up doing was copying the code from the 3rd example found here: http://moeaframework.org/Example3.java and making the following changes.

At the top of the code, I added the following line

import java.io.File;

just below the line that read:

import java.io.IOException;

First, clearly change the name in .withProblem to the name of the problem you defined (See examples 4 and 5 and the manual for more info). You can specify your desired output frequency in .withFrequency and include runtime info beyond that shown in the example. It only collects Elapsed Time and Generational Distance.  Because my problem was not built into the framework, I had to specify a reference set I had already made.  That is why I had to include the import java.io.File line.  Until I added it, I received compilation errors.

My section for setting up the instrumenter looks like this:

// setup the instrumenter to record metrics
Instrumenter instrumenter = new Instrumenter()
.withReferenceSet(new File("./5obj_2const_stoch/myLake5Obj2ConstStoch.reference"))
.withFrequency(1000)
.attachElapsedTimeCollector()
.attachGenerationalDistanceCollector()
.attachHypervolumeCollector()
.attachAdditiveEpsilonIndicatorCollector();

You will also need to specify the desired problem, algorithm and max NFE for the Executor. Further if you are using an epsilon dominance algorithm, you can specify the epsilons with the following line:

.withEpsilon(0.01,0.01)

I had a two objective problem with epsilons of 0.01 for both objectives.

Finally, I became frustrated with the runtime printing format included in the example as it didn’t have a consistent field separator when I ran it. This may not have actually been important, but below is what I used:

System.out.println("NFE"+"\t"+"Elapsed Time"+"\t"+
"Generational Distance"+"\t"+"Hypervolume"+"\t"+
"Additive Epsilon Indicator");

for(int ii=0; ii<accumulator.size("NFE"); ii++){
System.out.println(accumulator.get("NFE",ii)+"\t"+
accumulator.get("Elapsed Time",ii)+"\t"+
accumulator.get("GenerationalDistance",ii)+"\t"+
accumulator.get("Hypervolume",ii)+"\t"+
accumulator.get("AdditiveEpsilonIndicator",ii));
}

You could also use the code included in Matt’s blog post here: https://waterprogramming.wordpress.com/2013/02/05/matplotlib-part-i-borg-runtime-metrics-plots/ although I haven’t actually tried it yet.

Once this was done, I was excited to plot the results, but that is where I ran into another glitch.  At the extremely small intervals, the values for Additive Epsilon Indicator and Generational Distance were Infinity as the algorithms had yet to find any feasible solutions.  This was easily solved by the following command.  Thank you Jon for recommending it!

sed 's/Infinity/-9999.0/g' all_of_my_files.txt >> write_to_file.txt

It goes through the first file specified and replaces all of the “Infinity” values with -9999.0 before writing the output to the file after the sideways carrots.  This made it much easier to import the data to Matlab where I set the -9999.0 values to NaNs.  Since I had 300 files, I put this in a bash script that looped through my runtime files for all algorithms and seeds.

Finally, I wanted to plot Borg operator probabilities, but now that Borg is not in the framework, its initial output files are not very pretty.  There is a way to extract the data if one consults section 10.2.2 of the MOEAFramework manual.  Then, you get nice files that look kind of like the one’s in the blog post I referenced by Matt earlier.  This made it much easier to plot the data.  It also let me repeat the Python exercise in that post with my own files, which was fun.

March 27, 2015 edit: I just found this video, which provides an overview of calculating runtime metrics with the MOEAFramework.  It also includes an example of reading a seed from the command line at the beginning of the java code.

Send all optimization algorithms the same model when comparing performance!

While performing diagnostics on Riddhi’s formulations for the environmental economics “Lake Problem”, we had had slightly different models for communication with Borg and the MOEAFramework algorithms. The only difference in the code was the part where the Borg/MOEA connection was made, so we did not think it would impact performance in any meaningful way, especially since command line calls outside of optimization with identical decision variables showed essentially identical results. The only difference was the truncation of a few digits way past the specified epsilon for the Borg model.  In spite of these minute differences, our preliminary results were incredibly disconcerting. Borg had a reference set contribution of zero, and when the combined set found by Borg was compared to that found by a Random search, the front found by the Random search dominated that found by Borg, which is not possible especially on such a simple problem as the 2 objective deterministic formulation with which we were working!  After many attempts to determine the reason for our bizarre results, Jon pointed out that we could probably call the model we had written for the MOEAFramework in the command line when we optimized with Borg. When I tried this, the results made sense!  Borg had a reference set contribution of 1.0 for the 2 objective deterministic formulation and its Pareto approximate front dominated that of the Random search.  Further investigation with decision variables returned by optimization is leading us to believe the format in which the two models read variables may differ slightly, and we are looking into that hypothesis.

For anyone else comparing performance of multi-objective evolutionary algorithms, make sure you send the exact same model to all algorithms!  Otherwise, the comparison will not be fair.