# MORDM VII: Optimality, robustness, and reevaluation under deep uncertainty

In the previous MORDM post, we visualized the reference set of performance objectives for the North Carolina Research Triangle and conducted a preliminary multi-criterion robustness analysis using two criteria: (1) regional reliability should be at least 98%, and (2) regional restriction frequency should be no more than 20%. Using these metrics, we found that Pareto-optimality does not guarantee satisfactory robustness: not all portfolios within the reference set satisfied both robustness criteria.

In this post, we will explain the differences between optimality and robustness, and justify the importance of robust optimization over sole reliance on a set of optimal solutions (aka an optimal portfolio). To better demonstrate these differences, we will also re-evaluate the Pareto-optimal set of solutions under a more challenging set of states of the world (SOWs), a method first used in Herman et al. (2013, 2014) and Zeff et al. (2014). The formal term for this method, Deeply Uncertain (DU) Re-evaluation, was coined in a 2019 paper by Trindade et al.

### Optimality vs robustness

There are many descriptions of optimality. From a purely technical perspective, a Pareto-optimal set is a set of decision variables or solutions that maps to a Pareto front: a set of performance objectives in which no one objective can be improved without degrading performance in another. For the purposes of this blog post, we shall use the definition of optimality laid out by Beyer and Sendhoff in their 2007 paper:

> The global optimal design…depends on the…(objective) functions and constraints…however, these functions always represent models and/or approximations of the real world.

Beyer and Sendhoff (2007)

In other words, a Pareto reference set is only optimal within the bounds of the model it was generated from. This makes sense; models are only approximations of the real world. Due to uncertainties driven by human action, natural variability, and incomplete knowledge, it is difficult and computationally expensive to bound the degree of certainty with which the model optimum maps to the true optimum. Optimization is also static in relation to reality: the set of solutions found does not change with time and only accounts for the conditions within the model itself. Any deviation from this set of solutions, or any unaccounted-for differences between the actual system and the model, may result in failure (Herman et al., 2015; Read et al., 2014).

This is why searching the set of optimal solutions for robust solutions is important. Herman et al. (2015) quote an earlier work by Matalas and Fiering (1977) that defines robustness as the insensitivity of a system’s portfolio to uncertainty. Within the MORDM context, robustness was found to be best defined using the multi-criterion satisficing robustness measure (Herman et al., 2015), which refers to the ability of a solution to meet one or more requirements (or criteria) set by the decision-makers when evaluated under a set of challenging scenarios. More information on alternative robustness measures can be found here.

In this blog post, we will begin to explore this concept of robustness by conducting DU Re-evaluation, where we will perform the following steps:

### Generate a set of ROF tables from a more challenging set of SOWs

Recall that we previously stored our Pareto-optimal solution set in a .csv file named `NC_refset.csv` (find the original Git Repository here). Now, we will write a quick Python script (called `rof_tables_reeval.py` in the Git Repository) that uses mpi4py to parallelize and speed up ROF table generation, along with a bash script to submit the job. More information on parallelization using mpi4py can be found in this handy blog post by Dave Gold.

First, create a Python virtual environment within the folder where all your source code is kept, and activate it. I called mine `python_venv`:

```
python3 -m venv python_venv
source python_venv/bin/activate
```

Next, install the `numpy` and `mpi4py` libraries:

```
pip install numpy mpi4py
```

Then write the Python script as follows:

```
# -*- coding: utf-8 -*-
"""
Created on Tues March 1 2022 16:16
@author: Lillian Bei Jia Lau
"""

from mpi4py import MPI
import os

# 5 nodes, 10 tasks per node, 20 RDMs per node
comm = MPI.COMM_WORLD
rank = comm.Get_rank()  # ranks range from 0 to 49
print('rank = ', rank)

N_RDMs_needed = 100
N_REALIZATIONS = 100
N_RDM_PER_NODE = 20
N_TASKS = 50
N_RDMS_PER_TASK = int(N_RDMs_needed / N_TASKS)  # each task handles 2 RDMs

DATA_DIR = "/scratch/lbl59/blog/WaterPaths/"
SOLS_FILE_NAME = "NC_refset.csv"  # the Pareto-optimal reference set from the previous post
N_SOLS = 1
OMP_NUM_THREADS = 5  # threads per WaterPaths run; adjust to your cluster's cores per task

for i in range(N_RDMS_PER_TASK):
    current_RDM = rank + (N_TASKS * i)

    command_gen_tables = "./waterpaths -T {} -t 2344 -r {} -d {} -C 1 -O rof_tables_reeval/rdm_{} -e 0 \
        -U TestFiles/rdm_utilities_test_problem_reeval.csv \
        -W TestFiles/rdm_water_sources_test_problem_reeval.csv \
        -P TestFiles/rdm_dmp_test_problem_reeval.csv \
        -s {} -f 0 -l {} -R {} \
        -p false".format(OMP_NUM_THREADS, N_REALIZATIONS, DATA_DIR, current_RDM,
                         SOLS_FILE_NAME, N_SOLS, current_RDM)

    print(command_gen_tables)
    os.system(command_gen_tables)

comm.Barrier()
```

Before proceeding, a quick explanation of what all this means:

• This job is parallelized across 5 nodes on Cornell’s THECUBE (The Cube) computing cluster, with 10 tasks on each node (50 tasks in total).
• Each task handles 2 robust decision-making (RDM) multiplier files that scale a hydroclimatic realization up or down to make a state of the world more (or less) challenging. In this submission, we are creating 100 different variations of each hydroclimatic scenario using the 100 RDM files, and running them across only one solution.
• The `rank` is the index of a task in the order it is submitted. Since there are 10 tasks on each of the 5 nodes, ranks range from 0 to 49. Note and understand how `current_RDM` is calculated: each rank handles RDMs `rank` and `rank + 50` (e.g., rank 3 handles RDMs 3 and 53).
• `command_gen_tables` is the command submitted to The Cube. Note the `-C` and `-O` flags: a value of 1 for the `-C` flag tells WaterPaths to generate ROF tables, and the `-O` flag tells WaterPaths to output each ROF table file into a folder named `rof_tables_reeval/rdm_{}` for each RDM. Feel free to change the filenames as you see fit.

To accompany this script, first create the following folders: `output`, `out_reeval`, and `rof_tables_reeval`. The `output` folder will contain the simulation results from running the 1 solution across the 100 hydroclimatic realizations. The `out_reeval` folder will store any output or error messages, such as script runtime, and the `rof_tables_reeval` folder will store the generated ROF tables.
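
From within your WaterPaths directory, all three folders can be created in one line:

```
mkdir output out_reeval rof_tables_reeval
```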

Then, write the following bash submission script:

```
#!/bin/bash
#SBATCH -n 50 -N 5 -p normal
#SBATCH --job-name=rof_tables_reeval
#SBATCH --output=out_reeval/rof_tables_reeval.out
#SBATCH --error=out_reeval/rof_tables_reeval.err
#SBATCH --time=200:00:00
#SBATCH --mail-user=lbl59@cornell.edu
#SBATCH --mail-type=all

# load the mpi4py and numpy modules ('module spider' only searches for them)
module load py3-mpi4py
module load py3-numpy/1.15.3

START="$(date +%s)"

mpirun python3 rof_tables_reeval.py

DURATION=$(( $(date +%s) - START ))

echo ${DURATION}
```

You can find the bash script under the filename `rof_table_gen_reeval.sh`. Finally, submit the script using the following line:

```
sbatch ./rof_table_gen_reeval.sh
```

The run should take roughly 5 hours. We’re good for some time!
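
While you wait, you can keep an eye on the job with standard SLURM and Unix commands (replace `lbl59` with your own username):

```
squeue -u lbl59                            # is the job still queued or running?
tail -f out_reeval/rof_tables_reeval.out   # follow the job's output log as it grows
```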

### Re-evaluate your solutions (and possibly your life choices, while you’re at it)

Once the ROF tables are generated, it’s time to get a little hands-on with the underlying WaterPaths code. Navigate to the directory containing the `PaperTestProblem.cpp` file:

```
cd /yourfilepath/WaterPaths/src/Problem/
```

1. Delete `PaperTestProblem.cpp` and replace it with the file `PaperTestProblem-reeval.cpp`, which can be found in the main Git Repository.
2. Rename the latter file `PaperTestProblem.cpp`; it will be the new PaperTestProblem file, able to individually read each RDM scenario’s ROF tables.
3. Re-make WaterPaths by calling `make clean` and then `make gcc` in the command line, as shown below. This ensures that WaterPaths has no problems running the new `PaperTestProblem.cpp` file.
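
The rebuild looks like this, run from the main WaterPaths directory (the `/yourfilepath/` placeholder is wherever your copy of WaterPaths lives):

```
cd /yourfilepath/WaterPaths/
make clean   # remove the old object files and executable
make gcc     # rebuild WaterPaths with the new PaperTestProblem.cpp
```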

Next, write the following Python script (called `run_du_reeval.py` in the Git repository):

```
# -*- coding: utf-8 -*-
"""
Created on Tues March 1 2022 16:16

@author: Lillian Bei Jia Lau
"""

from mpi4py import MPI
import os

# 5 nodes, 10 tasks per node, 20 RDMs per node
comm = MPI.COMM_WORLD
rank = comm.Get_rank()  # ranks range from 0 to 49
print('rank = ', rank)

N_RDMs_needed = 100
N_REALIZATIONS = 100
N_RDM_PER_NODE = 20
N_TASKS = 50
N_RDMS_PER_TASK = int(N_RDMs_needed / N_TASKS)  # each task handles 2 RDMs

DATA_DIR = "/scratch/lbl59/blog/WaterPaths/"
SOLS_FILE_NAME = "NC_refset.csv"  # the Pareto-optimal reference set from the previous post
N_SOLS = 69
OMP_NUM_THREADS = 5  # threads per WaterPaths run; adjust to your cluster's cores per task

for i in range(N_RDMS_PER_TASK):
    current_RDM = rank + (N_TASKS * i)

    command_run_rdm = "./waterpaths -T {} -t 2344 -r {} -d {} -C -1 -O rof_tables_reeval/rdm_{} -e 0 \
        -U TestFiles/rdm_utilities_test_problem_reeval.csv \
        -W TestFiles/rdm_water_sources_test_problem_reeval.csv \
        -P TestFiles/rdm_dmp_test_problem_reeval.csv \
        -s {} -R {} -f 0 -l {} \
        -p false".format(OMP_NUM_THREADS, N_REALIZATIONS, DATA_DIR, current_RDM,
                         SOLS_FILE_NAME, current_RDM, N_SOLS)

    print(command_run_rdm)
    os.system(command_run_rdm)

comm.Barrier()
```

Note the change in the `-C` flag: its value is now -1, telling WaterPaths to import the ROF table values from the folder indicated by the `-O` flag. The resulting objective values for each RDM will be saved in the `output` folder we previously made.

The accompanying bash script, named `du_reeval.sh`, is as follows:

```
#!/bin/bash
#SBATCH -n 50 -N 5 -p normal
#SBATCH --job-name=mordm_training_du_reeval
#SBATCH --output=out_reeval/mordm_training_du_reeval.out
#SBATCH --error=out_reeval/mordm_training_du_reeval.err
#SBATCH --time=200:00:00
#SBATCH --mail-user=lbl59@cornell.edu
#SBATCH --mail-type=all

# load the mpi4py and numpy modules ('module spider' only searches for them)
module load py3-mpi4py
module load py3-numpy/1.15.3

START="$(date +%s)"

mpirun python3 run_du_reeval.py

DURATION=$(( $(date +%s) - START ))

echo ${DURATION}
```

This run should take approximately three to four days. Once it completes, you will have 100 files, each containing the 69 sets of objective values that result from running the 69 solutions across the 100 deeply uncertain states of the world.
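
As a quick sanity check, you can count the files written to the `output` folder (a rough check only; the exact file naming depends on your WaterPaths output settings):

```
ls output | wc -l    # should report 100 entries, one per RDM
```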

### Summary

In this post, we defined optimality and robustness. We demonstrated how to run a DU re-evaluation across 100 challenging SOWs to observe how these ‘optimal’ solutions perform in more extreme scenarios. This shows that optimality holds only within the bounds of the model it was derived from, and any deviation from the expected circumstances defined by the model may lead to degraded performance.

In the next blog post, we will be visualizing these changes in performance using a combination of sensitivity analysis, scenario discovery, and tradeoff analysis.

## References

Beyer, H. and Sendhoff, B., 2007. Robust optimization – A comprehensive survey. Computer Methods in Applied Mechanics and Engineering, 196(33-34), pp.3190-3218.

Herman, J., Reed, P., Zeff, H. and Characklis, G., 2015. How Should Robustness Be Defined for Water Systems Planning under Change?. Journal of Water Resources Planning and Management, 141(10), p.04015012.

Herman, J., Zeff, H., Reed, P. and Characklis, G., 2014. Beyond optimality: Multistakeholder robustness tradeoffs for regional water portfolio planning under deep uncertainty. Water Resources Research, 50(10), pp.7692-7713.

Matalas, N. and Fiering, M., 1977. Water-resource systems planning. In: Climate, Climatic Change, and Water Supply, Studies in Geophysics. Washington, DC: National Academy of Sciences, pp.99-110.

Read, L., Madani, K. and Inanloo, B., 2014. Optimality versus stability in water resource allocation. Journal of Environmental Management, 133, pp.343-354.

Trindade, B., Reed, P. and Characklis, G., 2019. Deeply uncertain pathways: Integrated multi-city regional water supply risk, robustness and adaptation. Advances in Water Resources, 134, p.103442.

Zeff, H., Kasprzyk, J., Herman, J., Reed, P. and Characklis, G., 2014. Navigating financial and supply reliability tradeoffs in regional drought management portfolios. Water Resources Research, 50(6), pp.4906-4923.

# Debug in Real-time on SLURM

Debugging code by submitting jobs to a supercomputer is an inefficient process. It goes something like this:

1. Submit job and wait in queue
2. Check for errors/change code
3. (repeat endlessly until your code works)

Debugging in Real-Time:

There’s a better way to debug that doesn’t require waiting for the queue every time you want to check your code. On SLURM, you can debug in real-time like so:
1. Request a debugging or interactive node and wait in queue
2. Check for errors/change code continuously until code is fixed or node has timed out

Example (using the Summit supercomputer at the University of Colorado Boulder):

1. Navigate to the directory where the file to be debugged is located using the ‘cd’ command
2. Enter the ‘sinteractive’ command
• `$ sinteractive`
3. Wait in line for permission to use the node (you will have a high priority with a debugging QOS so it shouldn’t take long)
4. Once you are granted permission, the node is yours! Now you can debug to your heart’s content (or until you run out of time).
I’m usually debugging shell scripts on Unix. If you want advice on that topic, check out this link. I prefer the ‘-x’ option (shown below), but there are many options available.

Debugging shell scripts in Unix using the ‘-x’ option:

```
bash -x mybashscript.bash
```
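
For instance, here is a hypothetical two-line script, `mybashscript.bash`:

```
#!/bin/bash
# mybashscript.bash: a tiny example script to demonstrate tracing
NAME="world"
echo "Hello, ${NAME}"
```

Running `bash -x mybashscript.bash` prints each command (prefixed with `+`) as it executes, followed by its output, so you can pinpoint exactly where a script misbehaves:

```
+ NAME=world
+ echo 'Hello, world'
Hello, world
```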
Hopefully this was helpful! Please feel free to edit/comment/improve as you see fit.

# Some ideas for your Bash submission scripts

I’ve been playing around with some design options for PBS submission scripts that may help people doing cluster work.  Some things to look for in the source code:

• You can use a list in bash that contains multiple text entries, and then access those entries to create strings for your submissions. Note that you can actually display the text first (see the `echo ${PBS}` line) before you submit anything; that way you aren’t requesting thousands of jobs that have a typo in them!
• Using `read` allows the bash programmer to interact with the user. Well, in reality you are usually both the programmer and the user. But lots of times, I want to write a script and try it out first, before I commit hundreds of hours of time on the cluster. The flags below can help with that process.
• I added commands to compile the source code before actually submitting the jobs. Plus, by using flags and pauses intelligently, you can bail out of the script if there’s a problem with compilation.
```
#!/bin/bash
NODES=32
WALLHOURS=5

PROBLEMS=("ProblemA" "ProblemB")
NSEEDS=10
SEEDS=$(seq 1 ${NSEEDS}) # note there are multiple ways to declare lists and sequences in bash

NFES=1000000
echo "NFEs is ${NFES}" # echo statements can improve usability of the script, especially if you're modifying it a lot for various trials

ASSUMEPERMISSIONFLAG=No # this is for pausing the submission script later

echo "Compile? Y or N."
read COMPILEFLAG # read the user's answer into COMPILEFLAG

if [ "$COMPILEFLAG" = "Y" ]; then
    echo "Cleaning.."
    make clean -f MakefileParallel
    echo "Compiling.."
    make -f MakefileParallel
else
    echo "Not compiling."
fi

for PROBINDEX in ${!PROBLEMS[*]}
do
    PROBLEM=${PROBLEMS[$PROBINDEX]} # note the syntax to pull a list member out here
    echo "Problem is ${PROBLEM}"

    for SEED in ${SEEDS}
    do
        NAME=${PROBLEM}_${SEED} # bash is really nice for manipulating strings like this
        echo "Submitting: ${NAME}"

        # Here is the actual PBS command, with bash variables used in place of different
        # experimental parameters. Note the use of getopt-style command line parsing to
        # pass different arguments into the myProgram executable. This implementation is
        # designed for parallel processing, but it can be used for serial jobs too.

        PBS="#PBS -l nodes=${NODES}\n\
#PBS -N ${NAME}\n\
#PBS -l walltime=${WALLHOURS}:00:00\n\
#PBS -j oe\n\
#PBS -o ${NAME}.out\n\
cd \$PBS_O_WORKDIR\n\
mpirun ./myProgram -b ${PROBLEM} -c combined -f ${NFES} -s ${SEED}"

        # The first echo shows the user what is about to be passed to PBS. The second
        # echo (below) pipes it to the qsub command, and actually submits the job.

        echo ${PBS}

        if [ "$ASSUMEPERMISSIONFLAG" = "No" ]; then

            echo "Continue submitting? Y or N."
            read SUBMITFLAG # read the user's answer into SUBMITFLAG

            # Here, the code is designed to just keep going after the user says Y once.
            # You can redesign this for your own purposes. Also note that this code is
            # fairly brittle in that the user MUST say Y, not y or yes. You can build
            # that functionality into the if statements if you'd like it.

            if [ "$SUBMITFLAG" = "Y" ]; then
                ASSUMEPERMISSIONFLAG=Yes # this way, the user won't be asked again
                echo -e ${PBS} | qsub
                sleep 0.5
                echo "done."
            fi
        else
            echo -e ${PBS} | qsub
            sleep 0.5
            echo "done."
        fi
    done
done
```

# Using Linux “cut”

The following code takes a file that has 16 columns and outputs a file with 5 of those columns.  Some notes:

• Don’t use PATH as a variable name. The script won’t work, because PATH is a system variable!
• Note the C++-style syntax of the loop. Versions of bash greater than 3.0 allow you to use curly brackets, like: `for i in {1..50}`. But when you want to use variables inside the range, you have to do something else, such as in my example. Others are discussed here.
• The star of this script is the ‘cut’ command (see the one-line demonstration after this list). `-d` specifies the delimiter you’d like, and `-f` specifies the fields you want to cut.
• Then there are some simple commands around cut. ‘cat’ displays the contents of the file. The `|` operator then pipes the output of cat into the next command, which is cut. Finally, the `>` operator directs the output of this command into a new file.
• Save this file on the cluster or a Linux system as `myFileNameHere.sh`. Then, to run the code, simply type `sh myFileNameHere.sh`.
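
A quick standalone demonstration of `cut`, assuming a space-delimited line of four fields:

```
$ echo "one two three four" | cut -d ' ' -f 2-3
two three
```

And now the full script: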
```
#!/bin/bash

# Cut out only the objective function values from the CBorg output files.

MYPATH=./output/
INPUT_NAME_BASE=CBorg_LRGV_
OUTPUT_NAME_ADDENDUM=_objs # appended to the output filename so the input file isn't overwritten
EXTENSION=.out
START_COLUMN=9
FINISH_COLUMN=13
NUM_SEEDS=50

echo "Beginning..."
for ((I=1; I<=$NUM_SEEDS; I++));
do
    echo "Processing $I"
    cat ${MYPATH}${INPUT_NAME_BASE}${I}${EXTENSION} | cut -d ' ' -f ${START_COLUMN}-${FINISH_COLUMN} > ${MYPATH}${INPUT_NAME_BASE}${I}${OUTPUT_NAME_ADDENDUM}${EXTENSION}
done
echo "Totally done."
```