Part 2 of our MOEAFramework Training covers the optimization of the lake problem using your choice of algorithms. The relevant files and folders for this part of the training have been added to the GitHub repo. These are:
- lib folder
- global.properties
- generate_samples.sh
- settings.sh
- run.sh
- algorithm parameter files
In general, the goal of diagnostics is to conduct the optimization of a test problem (the lake problem in our case) using different evolutionary algorithms and to evaluate the performance of these algorithms.
Steps for Optimization
- New files and folders:
- lib folder: The lib folder contains the Java libraries with source code for the evolutionary algorithms, libraries that may be called from within the algorithms, and a Java version of Borg. Unzip this folder before starting the training.
- global.properties: This file is necessary for MOEAFramework to recognize the external problem. Lines 1 and 2 indicate the name of the problem and the class that was defined in the lake.java file and compiled into the resulting lake.class file.
- settings.sh: This file defines the relevant parameters for the optimization.
- Line 1 contains the names of the algorithms to be tested.
- Line 2 indicates the number of samples (parameterizations) of each algorithm that the user wants to test. Each algorithm has a set of parameters that characterize it, which can be anything from population size to crossover rate, and each parameter has an acceptable range of values. In a diagnostics study, it is typical to take multiple Latin Hypercube samples within these ranges to obtain different instances of the algorithm to test.
- Line 3 indicates the number of seeds (number of replicate trials)
- Line 4 indicates the name of the problem (name of the .class file)
- Line 9 shows that the relevant Java files are all in the lib folder
- Line 45 is where the user states the epsilons for each objective
- Line 50 is where the user specifies the number of function evaluations (NFE)
Run this script by typing ./settings.sh
- generate_samples.sh: The next step is to generate NSAMPLES parameterizations of each MOEA specified in the settings.sh file. To do this, you must provide a text file with the relevant parameter ranges for each algorithm. I have added these parameter files for the 5 MOEAs that I have chosen to use. They represent a wide range of styles and generations of MOEAs, from the older but most commonly downloaded algorithm, NSGA-II, to some of the newer reference point and reference vector algorithms, NSGA-III and RVEA. Each file contains the names of the parameters and their ranges of values. Note that these parameter ranges might not be appropriate for every problem, because some parameter values depend on the number of objectives and/or decision variables. General rules and defaults for algorithm and operator parameters can be found here and here.
Run this script by typing ./generate_samples.sh
This script utilizes the SampleGenerator from MOEAFramework to produce a corresponding sample file, with each row corresponding to a different parameterization of that algorithm.
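To make the idea of a sample file more concrete, here is a minimal Python sketch of Latin Hypercube sampling over a set of parameter ranges. It is not the SampleGenerator's actual implementation, and the parameter names and ranges below are only illustrative assumptions, not the contents of the parameter files in the repo. Each range is split into as many equal-width strata as there are samples, one value is drawn from each stratum, and the columns are shuffled independently so that every printed row is one parameterization.

```python
import random


def latin_hypercube(ranges, n_samples, seed=None):
    """Return n_samples rows, each holding one value per parameter."""
    rng = random.Random(seed)
    columns = []
    for low, high in ranges.values():
        width = (high - low) / n_samples
        # Draw exactly one value inside each of the n_samples equal-width strata.
        column = [low + (i + rng.random()) * width for i in range(n_samples)]
        # Shuffle so the strata are paired at random across parameters.
        rng.shuffle(column)
        columns.append(column)
    # Transpose: one row per parameterization, one column per parameter.
    return [list(row) for row in zip(*columns)]


# Hypothetical parameter ranges, for illustration only.
ranges = {
    "populationSize": (10.0, 1000.0),
    "crossoverRate": (0.0, 1.0),
    "mutationRate": (0.0, 1.0),
}

samples = latin_hypercube(ranges, n_samples=5, seed=1)
for row in samples:
    print(" ".join(f"{value:.4f}" for value in row))
```

With 5 samples, each parameter ends up with exactly one value in each fifth of its range, which is what lets a small number of samples still cover the whole range.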
- run.sh: This bash script is where the meat of the optimization takes place. The script reads in the parameter/sample files and information from the settings.sh file to set up the problem. Then, through a for-loop, the script uses the DetailedEvaluator from MOEAFramework to perform the optimization for all algorithms, seeds, and samples. The arguments for this Java command are intuitive; the only one not yet explained is the f flag, which states, in this case, that the output be reported every 100 function evaluations.
Run this script by typing ./run.sh
This script will submit a job for each seed of each algorithm and create a directory called data_raw to store the optimization results. Each job takes up 1 processor, and all parameterizations of the algorithm in that job run on the same processor. Depending on the complexity of the problem, the number of function evaluations (NFE), and the number of parameterizations, the optimization could take a very long time. It is important to start with small trials to understand how the run time scales with increased NFE, parameterizations, and Monte Carlo (MC) sampling, if that is relevant to your problem. Below is a table outlining some examples of timing trials that one can do to gauge problem complexity and to estimate the approximate wall clock time of larger trials.
No. of Seeds | No. of Parameterizations | No. of MC Samples | NFE | Wall Clock Time
1 | 1 | 1,000 | 25,000 | 16 min
1 | 1 | 5,000 | 25,000 | 1 hr 23 min
10 | 1 | 1,000 | 25,000 | 1 hr 24 min
1 | 2 | 1,000 | 25,000 | 32 min
10 | 1 | 10,000 | 25,000 | 3 hr
1 | 100 | 1,000 | 25,000 | 25 hr
1 | 2 | 1,000 | 200,000 | 4 hr
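One rough way to use these timing trials is to extrapolate from the smallest one. The Python sketch below assumes that the time for a single job grows linearly with the number of parameterizations, the number of MC samples, and NFE, starting from the first row of the table (16 minutes). That linear model is my assumption, not something MOEAFramework reports, and it does not try to explain the 10-seed rows. The sketch also counts the jobs that would be submitted, one per seed of each algorithm; the algorithm names are placeholders.

```python
from itertools import product

# Baseline taken from the first row of the timing table above.
BASELINE_MINUTES = 16.0
BASELINE_MC_SAMPLES = 1000
BASELINE_NFE = 25_000


def estimate_job_minutes(parameterizations, mc_samples, nfe):
    """Linear-scaling guess at the wall clock time of one job, in minutes."""
    return (
        BASELINE_MINUTES
        * parameterizations
        * (mc_samples / BASELINE_MC_SAMPLES)
        * (nfe / BASELINE_NFE)
    )


def list_jobs(algorithms, n_seeds):
    """One job per (algorithm, seed) pair."""
    return list(product(algorithms, range(1, n_seeds + 1)))


# Placeholder algorithm names, for illustration only.
algorithms = ["NSGAII", "NSGAIII", "RVEA"]
jobs = list_jobs(algorithms, n_seeds=10)
print(f"jobs submitted: {len(jobs)}")

for params, mc, nfe in [(2, 1000, 25_000), (100, 1000, 25_000), (2, 1000, 200_000)]:
    hours = estimate_job_minutes(params, mc, nfe) / 60
    print(f"{params} parameterizations, {mc} MC samples, {nfe} NFE: ~{hours:.1f} hr")
```

For the single-seed rows the guess lands close to the measured values: about 26.7 hours against the 25 hours observed for 100 parameterizations, and about 4.3 hours against the 4 hours observed for 200,000 NFE. Treat it as a planning aid only and rerun a small timing trial whenever the problem itself changes.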
This finishes up Part 2 of the MOEAFramework training. In Part 3, we will go over how to evaluate the performance of our algorithms by generating metrics.
Credits: All of the bash scripts in the training repo are written by Dave Hadka, the creator of MOEAFramework.