Introduction to PyBorg – basic setup and running

PyBorg is a new secondary implementation of Borg, written entirely in Python using the Platypus optimization library. PyBorg was developed by Andrew Dircks based on the original implementation in C and it is intended primarily as a learning tool as it is less efficient than the original C version (which you can still use with Python but through the use of the plugin “wrapper” also found in the package). PyBorg can be found in the same repository where the original Borg can be downloaded, for which you can request access here: http://borgmoea.org/#contact

This blogpost is intended to demonstrate this new implementation. To follow along, first you need to either clone or download the BitBucket repository after you gain access.

Setting up the required packages is easy. In your terminal, navigate to the Python directory in the repository and install all prerequisites using python setup.py install. This will install all requirements (i.e. the Platypus library, numpy, scipy and six) for you in your current environment.

You can test that everything works fine by running the optimization on the DTLZ2 test function, found in dtlz2.py. The script creates an instance of the problem (as it is already defined in the Platypus library), sets it up as a ploblem for Borg to optimize and runs the algorithm for 10,000 function evaluations:

    # define a DTLZ2 problem instance from the Platypus library
    nobjs = 3
    problem = DTLZ2(nobjs)

    # define and run the Borg algorithm for 10000 evaluations
    algorithm = BorgMOEA(problem, epsilons=0.1)
    algorithm.run(10000)

A handy 3D scatter plot is also generated to show the optimization results.

The repository also comes with two other scripts dtlz2_runtime.py and dtlz2_advanced.py.
The first demonstrates how to use the Platypus hypervolume indicator at a specified runtime frequency to get learn about its progress as the algorithm goes through function evaluations:

The latter provides more advanced functionality that allows you define custom parameters for Borg. It also includes a function to generate runtime data from the run. Both scripts are useful to diagnose how your algorithm is performing on any given problem.

The rest of this post is a demo of how you can use PyBorg with your own Python model and all of the above. I’ll be using a model I’ve used before, which can be found here, and I’ll formulate it so it only uses the first three objectives for the purposes of demonstration.

The first thing you need to do to optimize your problem is to define it. This is done very simply in the exact same way you’d do it on Project Platypus, using the Problem class:

from fishery import fish_game
from platypus import Problem, Real
from pyborg import BorgMOEA

# define a problem
nVars = 6
nObjs = 3 

problem = Problem(nVars, nObjs) # first input is no of decision variables, second input is no of objectives
problem.types[:] = Real(0, 1) #defines the type and bounds of each decision variable
problem.function = fish_game #defines the model function

This assumes that all decision variables are of the same type and range, but you can also define them individually using, e.g., problem.types[0].

Then you define the problem for the algorithm and set the number of function evaluations:

algorithm = BorgMOEA(problem, epsilons=0.001) #epsilons for each objective
algorithm.run(10000) # number of function evaluations

If you’d like to also produce a runtime file you can use the detailed_run function included in the demo (in the files referenced above), which wraps the algorithm and runs it in intervals so the progress can be monitored. You can combine it with runtime_hypervolume to also track your hypervolume indicator. To use it you need to define the total number of function evaluations, the frequency with which you’d like the progress to be monitored and the name of the output file. If you’d like to calculate the Hypervolume (you first need to import it from platypus) you also need to either provide a known reference set or define maximum and minimum values for your solutions.

maxevals = 10000
frequency = 100
output = "fishery.data"
hv = Hypervolume(minimum=[-6000, 0, 0], maximum=[0, 1, 100])

nfe, hyp = detailed_run(algorithm, maxevals, frequency, output, hv)

My full script can be found below. The detailed_run function is an edited version of the default that comes in the demo to also include the hypervolume calculation.

from fishery import fish_game
from platypus import Problem, Real, Hypervolume
from pyborg import BorgMOEA
from runtime_diagnostics import detailed_run

# define a problem
nVars = 6 # no. of decision variables to be optimized
nObjs = 3

problem = Problem(nVars, nObjs) # first input is no of decision variables, second input is no of objectives
problem.types[:] = Real(0, 1)
problem.function = fish_game

# define and run the Borg algorithm for 10000 evaluations
algorithm = BorgMOEA(problem, epsilons=0.001)
#algorithm.run(10000)

# define detailed_run parameters
maxevals = 10000
frequency = 100
output = "fishery.data"
hv = Hypervolume(minimum=[-6000, 0, 0], maximum=[0, 1, 100])

nfe, hyp = detailed_run(algorithm, maxevals, frequency, output, hv)

# plot the results using matplotlib
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.scatter([s.objectives[0] for s in algorithm.result],
           [s.objectives[1] for s in algorithm.result],
           [s.objectives[2] for s in algorithm.result])
ax.set_xlabel('Objective 1')
ax.set_ylabel('Objective 2')
ax.set_zlabel('Objective 3')
ax.scatter(-6000, 0, 0, marker="*", c='orange', s=50)
plt.show()

plt.plot(nfe, hyp)
plt.title('PyBorg Runtime Hypervolume Fish game')
plt.xlabel('Number of Function Evaluations')
plt.ylabel('Hypervolume')
plt.show()

It produces the following two figures:

A video training on Rhodium

A few weeks ago I filmed a video training guide to the Rhodium framework for the annual meeting of the society for Decision Making Under Deep Uncertainty. Rhodium is a Python library that facilitates Many Objective Robust Decision making. The training walks through a demonstration of Rhodium using the Lake Problem. The training introduces a live Jupyter notebook Antonia and I created using Binder.

To follow the training:

  1. Watch the demo video below
  2. Access the Binder Hub this link: https://mybinder.org/v2/gh/dgoldri25/Rhodium/7982d8fcb1de9a84f074cc
  3. Click on the file called “DMDU_Rhodium_Demo.ipynb” to open the live demo
  4. Begin using Rhodium!

Helpful Links

Using Rhodium for exploratory modeling

Rhodium is a powerful, simple, open source Python library for multiobjective robust decision making. As part of Project Platypus, Rhodium is compatible with Platypus (a MOEA optimization library) and PRIM (the Patent Rule Induction Method for Python), making it a valuable tool for bridging optimization and analysis. 

In the Rhodium documentation, a simple example of optimization and analysis uses the Lake Problem (DPS formulation). The actual optimization is performed in the line:

optimize(model, "NSGAII", 10000)

This optimize function uses the Platypus library directly for optimization; here the NSGAII algorithm is used for 10,000 function evaluations on the defined Lake Problem (model). This optimization call is concise and simple, but there are a few reasons why it may not be ideal.

  1. Speed. Python, an interpreted language, is inherently slower than compiled languages (Java, C/C++, etc.) The Platypus library is built entirely in Python, making optimization slow.
  2. Scalability. Platypus has support for parallelizing optimization, but this method is not ideal for large-scale computational experiments on computing clusters. 
  3. MOEA Suite. State of the art MOEAs such as the Borg MOEA are not implemented in Platypus for licensing reasons, so it is not usable directly by Rhodium.

Thus, external optimization is necessary for computationally demanding Borg runs. Luckily, Rhodium is easily compatible with external data files, so analysis with Rhodium of independent optimizations is simple. In this post, I’ll use a sample dataset obtained from a parallel Borg run of the Lake Problem, using the Borg wrapper.

The code and data used in this post can be found in this GitHub repository. lakeset.csv contains a Pareto approximate Lake Problem set. Each line is a solution, where the first six values are the decision variables and the last four are the corresponding objectives values. 

We’ll use Pandas for data manipulation. The script below reads the sample .csv file with Pandas, converts it to a list of Python dictionaries, and creates a Rhodium DataSet. There are a few important elements to note. First, the Pandas to_dict function takes in an optional argument ‘records’ to specify the format of the output. This specific format creates a list of Python dictionaries, where each element of the list is an individual solution (i.e. a line from the .csv file) with dictionary keys corresponding to the decision / objective value names and dictionary values as each line’s data. This is the format necessary for making a Rhodium DataSetwhich we create by calling the constructor with the dictionary as input.

import pandas as pd
from rhodium import *

# use pandas to read the csv file
frame = pd.read_csv("lakeset.csv")

# convert the pandas data frame to a Python dict in record format
dictionary = frame.to_dict('records')

# create a Rhodium DataSet instance from the Python dictionary
dataset = DataSet(dictionary)

Printing the Rhodium DataSet with print(dataset) yields:

...
...
Index 204:
   c1: 0.286373779
   r1: 0.126801547
   w1: 0.6265428129999999
   c2: -0.133307575
   r2: 1.3584425430000002
   w2: 0.10987546599999999
   benefit: -0.412053431
   concentration: 0.359441661
   inertia: -0.98979798
   reliability: -0.9563

Once we have a Rhodium DataSet instantiated, we access many of the library’s functionalities, without performing direct optimization with Platypus. For example, if we want the policy with the lowest Phosphorus concentration (denoted by the ‘concentration’ field), the following code outputs:

policy = dataset.find_min('concentration')
print(policy)
{'c1': 0.44744488600000004, 'r1': 0.9600368159999999, 'w1': 0.260339899, 'c2': 0.283860122, 'r2': 1.246763577, 'w2': 0.5300663529999999, 'benefit': -0.213267399, 'concentration': 0.149320863, 'inertia': -1.0, 'reliability': -1.0}

Rhodium also offers powerful plotting functionalities. For example, we can easily create a Parallel Axis plot of our data to visualize the trade-offs between objectives. The following script uses the parallel_coordinates function in Rhodium on our external dataset. Here, since parallel_coordinates takes a Rhodium model as input, we can: 1) define the external optimization problem as a Rhodium model, or 2) define a ‘dummy’ model that gives us just enough information to create plots. For the sake of simplicity, we will use the latter, but the first option is simple to set up if there exists a Python translation of your problem/model. Note, to access the scenario discovery and sensitivity analysis functionalities of Rhodium, it is necessary to create a real Rhodium Model.

# define a trivial "dummy" model in Rhodium with an arbitrary function
model = Model(lambda x: x)

# set up the model's objective responses to match the keys in your dataset
# here, all objectives are minimized
# this is the only information needed to create a parallel coordinate plot
model.responses = [Response("benefit", Response.MINIMIZE),
                   Response("concentration", Response.MINIMIZE),
                   Response("inertia", Response.MINIMIZE),
                   Response("reliability", Response.MINIMIZE)]

# create the parallel coordinate plot from the results of our external optimization
fig = parallel_coordinates(model, dataset, target="bottom",
                           brush=[Brush("reliability < -0.95"), Brush("reliability >= -0.95")])

Make your Git repository interactive with Binder

Have you ever tried to demo a piece of software you wrote only to have the majority of participants get stuck when trying to configure their computational environment? Difficulty replicating computational environments can prevent effective demonstration or distribution of even simple codes. Luckily, new tools are emerging that automate this process for us. This post will focus on Binder, a tool for creating custom computing environments that can be distributed and used by many remote users simultaneously. Binder is language agnostic tool, and can be used to create custom environments for R, Python and Julia. Binder is powered by BinderHub, an open source service in the cloud. At the bottom of this post, I’ll provide an example of an interactive Python Jupyter Notebook that I created using BinderHub.

BinderHub

BinderHub combines two useful libraries: repo2docker and JupyterHub. repo2docker is a tool to build, run and push Docker images from source code repositories. This allows you to create copies of custom environments that users can replicate on any machine. These copies are can be stored and distributed along with the remote repository. JuptyerHub is a scalable system that can be used to spawn multiple Jupyter Notebook servers. JuptyerHub takes the Docker image created by repo2docker and uses it to spawn a Jupyter Notebook server on the cloud. This server can be accessed and run by multiple users at once. By combining repo2docker and JupyterHub, BinderHub allows users to both replicate complex environments and easily distribute code to large numbers of users.

Creating your own BinderHub deployment

Creating your own BinderHub deployment is incredibly easy. To start, you need a remote repository containing two things: (1) a Jupyter notebook with supporting code and (2) configuration files for your environment. Configuration files can either be an environment.yml file (a standard configuration file that can be generated with conda, see example here) or a requirements.txt file (a simple text file that lists dependencies, see example here).

To create an interactive BinderHub deployment:

  1. Push your code to a remote repository (for example Github)
  2. Go to mybinder.org and paste the repository’s URL into the dialoge box (make sure to select the proper hosting service)
  3. Specify the branch if you are not on the Master
  4. Click “Launch”

The website will generate a URL that you can copy and share with users. I’ve created an example for our Rhodium tutorial, which you can find here:

https://mybinder.org/v2/gh/dgoldri25/Rhodium/master?filepath=DMDU_Rhodium_Demo.ipynb

To run the interactive Jupyter Notebook, click on the file titled “Rhodium_Demo.ipynb”. Happy sharing!

A Python Implementation of grouped Radial Convergence Plots to visualize Sobol Sensitivity Analysis results

TDLR; A Python implementation of grouped radial convergence plots based on code from the Rhodium library. This script is will be added to Antonia’s repository for Radial Convergence Plots.

Radial convergence plots are a useful tool for visualizing results of Sobol Sensitivities analyses. These plots array the model parameters in a circle and plot the first order, total order and second order Sobol sensitivity indices for each parameter. The first order sensitivity is shown as the size of a closed circle, the total order as the size of a larger open circle and the second order as the thickness of a line connecting two parameters.

In May, Antonia created a new Python library to generate Radial Convergence plots in Python, her post can be found here and the Github repository here. I’ve been working with the Rhodium Library a lot recently and found that it contained a Radial Convergence Plotting function with the ability to plot grouped output, a functionality that is not present in Antonia’s repository. This function produces the same plots as Calvin’s R package. Adding a grouping functionality allows the user to color code the visualization to improve the interpretability of the results. In the code below I’ve adapted the Rhodium function to be a standalone Python code that can create visualizations from raw output of the SALib library. When used on a policy for the Lake Problem, the code generates the following plot shown in Figure 1.

Figure 1: Example Radial Convergence Plot for the Lake Problem reliability objective. Each of the points on the plot represents a sampled uncertain parameter in the model. The size of the filled circle represents the first order Sobol Sensitivity Index, the size of the open circle represents the total order Sobol Sensitivty Index and the thickness of lines between points represents the second order Sobol Sensitivity Index.

import numpy as np
import itertools
import matplotlib.pyplot as plt
import seaborn as sns
import math
sns.set_style('whitegrid', {'axes_linewidth': 0, 'axes.edgecolor': 'white'})

def is_significant(value, confidence_interval, threshold="conf"):
    if threshold == "conf":
        return value - abs(confidence_interval) > 0
    else:
        return value - abs(float(threshold)) > 0

def grouped_radial(SAresults, parameters, radSc=2.0, scaling=1, widthSc=0.5, STthick=1, varNameMult=1.3, colors=None, groups=None, gpNameMult=1.5, threshold="conf"):
    # Derived from https://github.com/calvinwhealton/SensitivityAnalysisPlots
    fig, ax = plt.subplots(1, 1)
    color_map = {}
    
    # initialize parameters and colors
    if groups is None:
        
        if colors is None:
            colors = ["k"]
        
        for i, parameter in enumerate(parameters):
            color_map[parameter] = colors[i % len(colors)]
    else:        
        if colors is None:
            colors = sns.color_palette("deep", max(3, len(groups)))
        
        for i, key in enumerate(groups.keys()):
            #parameters.extend(groups[key])
            
            for parameter in groups[key]:
                color_map[parameter] = colors[i % len(colors)]
    
    n = len(parameters)
    angles = radSc*math.pi*np.arange(0, n)/n
    x = radSc*np.cos(angles)
    y = radSc*np.sin(angles)
    
    # plot second-order indices
    for i, j in itertools.combinations(range(n), 2):
        #key1 = parameters[i]
        #key2 = parameters[j]
        
        if is_significant(SAresults["S2"][i][j], SAresults["S2_conf"][i][j], threshold):
            angle = math.atan((y[j]-y[i])/(x[j]-x[i]))
                
            if y[j]-y[i] < 0:
                angle += math.pi
                
            line_hw = scaling*(max(0, SAresults["S2"][i][j])**widthSc)/2
                
            coords = np.empty((4, 2))
            coords[0, 0] = x[i] - line_hw*math.sin(angle)
            coords[1, 0] = x[i] + line_hw*math.sin(angle)
            coords[2, 0] = x[j] + line_hw*math.sin(angle)
            coords[3, 0] = x[j] - line_hw*math.sin(angle)
            coords[0, 1] = y[i] + line_hw*math.cos(angle)
            coords[1, 1] = y[i] - line_hw*math.cos(angle)
            coords[2, 1] = y[j] - line_hw*math.cos(angle)
            coords[3, 1] = y[j] + line_hw*math.cos(angle)

            ax.add_artist(plt.Polygon(coords, color="0.75"))
        
    # plot total order indices
    for i, key in enumerate(parameters):
        if is_significant(SAresults["ST"][i], SAresults["ST_conf"][i], threshold):
            ax.add_artist(plt.Circle((x[i], y[i]), scaling*(SAresults["ST"][i]**widthSc)/2, color='w'))
            ax.add_artist(plt.Circle((x[i], y[i]), scaling*(SAresults["ST"][i]**widthSc)/2, lw=STthick, color='0.4', fill=False))
    
    # plot first-order indices
    for i, key in enumerate(parameters):
        if is_significant(SAresults["S1"][i], SAresults["S1_conf"][i], threshold):
            ax.add_artist(plt.Circle((x[i], y[i]), scaling*(SAresults["S1"][i]**widthSc)/2, color='0.4'))
           
    # add labels
    for i, key in enumerate(parameters):                
        ax.text(varNameMult*x[i], varNameMult*y[i], key, ha='center', va='center',
                rotation=angles[i]*360/(2*math.pi) - 90,
                color=color_map[key])
        
    if groups is not None:
        for i, group in enumerate(groups.keys()):
            print(group)
            group_angle = np.mean([angles[j] for j in range(n) if parameters[j] in groups[group]])
            
            ax.text(gpNameMult*radSc*math.cos(group_angle), gpNameMult*radSc*math.sin(group_angle), group, ha='center', va='center',
                rotation=group_angle*360/(2*math.pi) - 90,
                color=colors[i % len(colors)])
            
    ax.set_facecolor('white')
    ax.set_xticks([])
    ax.set_yticks([])
    plt.axis('equal')
    plt.axis([-2*radSc, 2*radSc, -2*radSc, 2*radSc])
    #plt.show()

    
    return fig

The code below implements this function using the SALib to conduct a Sobol Sensitivity Analysis on the Lake Problem to produce Figure 1.

import numpy as np
import itertools
import matplotlib.pyplot as plt
import math
from SALib.sample import saltelli
from SALib.analyze import sobol
from lake_problem import lake_problem
from grouped_radial import grouped_radial

# Define the problem for SALib
problem = {
	'num_vars': 5,
	'names': ['b', 'q', 'mean', 'stdev', 'delta'],
	'bounds': [[0.1, 0.45],
			   [2.0, 4.5],
			   [0.01, 0.05],
			   [0.001, 0.005],
			   [0.93, 0.99]]
}

# generate Sobol samples
param_samples = saltelli.sample(problem, 1000)

# extract each parameter for input into the lake problem
b_samples = param_samples[:,0]
q_samples = param_samples[:,1]
mean_samples = param_samples[:,2]
stdev_samples = param_samples[:,3]
delta_samples = param_samples[:,4]


# run samples through the lake problem using a constant policy of .02 emissions
pollution_limit = np.ones(100)*0.02

# initialize arrays to store responses
max_P = np.zeros(len(param_samples))
utility = np.zeros(len(param_samples))
inertia = np.zeros(len(param_samples))
reliability = np.zeros(len(param_samples))

# run model across Sobol samples
for i in range(0, len(param_samples)):
	print("Running sample " + str(i) + ' of ' + str(len(param_samples)))
	max_P[i], utility[i], inertia[i], reliability[i] = lake_problem(pollution_limit,
																	b=b_samples[i],
																	q=q_samples[i],
																	mean=mean_samples[i],
																	stdev=stdev_samples[i],
																	delta=delta_samples[i])

#Get sobol indicies for each response
SA_max_P = sobol.analyze(problem, max_P, print_to_console=False)
SA_reliability = sobol.analyze(problem, reliability, print_to_console=True)
SA_inertia = sobol.analyze(problem, inertia, print_to_console=False)
SA_utility = sobol.analyze(problem, utility, print_to_console=False)

# define groups for parameter uncertainties
groups={"Lake Parameters" : ["b", "q"],
        "Natural Pollution" : ["mean", "stdev"],
        "Discounting" : ["delta"]}


fig = grouped_radial(SA_reliability, ['b', 'q', 'mean', 'stdev', 'delta'], groups=groups, threshold=0.025)
plt.show()

Examining Robustness Metrics using Rhodium

Introduction

Rhodium is a python library built to facilitate the Many Objective Robust Decision Making (MORM) framework. The MORDM framework couples many objective search with robust decision making (RDM) to facilitate decision support for complex, many-objective problems under deep uncertainty (Kasprzyk et al., 2013). A core component of MORDM is the quantification of robustness, which can be defined as “the insensitivity of system design to errors, random or otherwise, in the estimates of those parameters effecting design choice” (Matalas and Fiering, 1977; quote via Herman et al., 2015). While robustness as a concept may sound straightforward, quantifying robustness in mathematical terms is more challenging. As we’ll see later in this post, our choice on how to quantify robustness may have large implications on the decision making process. In this post I’ll demonstrate how to use Rhodium to examine the implications of the choice of robustness measure on the shallow lake problem from environmental systems literature. I’ll first walk through problem formulation and multi-objective optimization steps of MORDM, then implement four robustness metrics presented in Herman et al., (2015) on a set of candidate solutions.

Problem Formulation

The classical shallow lake problem (Carptenter et al,. 1999) presents a hypothetical town that seeks to balance economic benefits of phosphorus (P) emissions with the ecological benefits of a healthy lake. The lake naturally receives inflows of phosphorus from the surrounding area and recycles phosphorus in its system. Under natural conditions, the lake’s phosphorus levels will always return to a healthy (oligotrophic) equilibrium. However, if emissions from the town are too high, the lake will cross an irreversible tipping point into an unhealthy (eutrophic) state. The town seeks to find a policy to regulate phosphorus that will keep its economy prosperous while preserving a healthy ecosystem.

System Model

To facilitate the decision making process, decision makers employ a dimensionless model to abstract phosphorus dynamics in the lake:

X_{t+1} = X_{t}+a_t+Y_t+\frac{X_t^q}{1+X^q_t}-bX_t

Where X is the normalized concentration of P in the lake, a represents the anthropogenic phosphorus inputs from the town, Y ~ LN(mu, sigma) are natural P inputs to the lake, q is a parameter controlling the recycling rate of P from sediment, b is a parameter controlling the rate at which P is removed from the system and t is the time index in years. For details on the lake problem model, see Quinn et al., (2017).

Uncertainties

Decision makers have estimates of model parameters that govern the flux of phosphorus, however, these parameters are uncertain and their probability distributions are unknown, therefore they are treated as deep uncertainties. The assumed values and plausible ranges of these uncertainties are shown in the table below.

UncertaintyBase ValueLower boundUpper Bound
b0.420.10.45
q2.02.04.5
delta0.980.930.99
mu0.030.010.05
sigma(10^-5)^.50.0010.005

Objectives

Local stakeholders have identified four objectives they would like to optimize:

  1. Maximize the average economic benefit of phosphorus emissions
  2. Minimize the worst case average phosphorus concentration in the lake
  3. Maximize the inertia of the phosphorus control policy (ie. do not create a policy with large year to year swings in phosphorus emissions)
  4. Maximize the reliability of a policy staying below the lake’s critical phosphorus threshold

For details on the objective formulations, refer to Quinn et al., (2017).

Decisions

Following Quinn et al., (2017), phosphorus emissions are controlled by a state dependent rule system which uses cubic radial basis functions to determine annual phosphorus emissions based on the phosphorus in the lake.

-2 \leq c_j \leq 2

a_{t,i} = min \bigg(max \bigg(\sum_{j=1}^n w_j {\mid\frac{X_{t,i}-c{j}}{r_j}\mid}^3, 0.01 \bigg),0.1\bigg)

0 \leq r_j \leq 2

0 \leq w_j \leq 1

\sum_{j=1}^n w_j =1

Where c_j, r_j and w_j are the centers, radii and wights of n cubic radial basis functions. These parameters will be the decision variables for our multi-objective optimization (searching for parameters to a rule system like this is known as direct policy search (DPS)). For this example we’ll use n=2 cubic radial basis functions and have 6 decision variables.

Building the model in Rhodium

Conducting MORDM on your own can be an onerous process, often your model is written in a different language than you post-processing software and each individual script requires data in a different format. Rhodium can make the MORDM process much easier. It has custom built data structures which are tailored for MORDM analysis and a declarative structure which makes plugging in an external model simple. For this demonstration I’ll use a python implementation of the Lake problem which I’ve copied to this Git repository (to save space I’m omitting it from the text of this post). The first step in conducting MORDM with Rhodium is to declare a Rhodium model. This file must be in the same repository as the python file running rhodium, I’ll also need the file RBFpolicy.py, found here.

model = Model(LakeProblemDPS)

To properly declare the model I need to specify model parameters, levers, uncertainties and model responses. Model parameters include any input that will change across model evaluations. I also need to let Rhodium know which parameters are decision variables or “levers” and which are uncertainties. I can do this by explicitly declaring levers, uncertainties and their ranges as shown below.

model.parameters = [Parameter("vars"),
                    Parameter("b"),
                    Parameter("q"),
                    Parameter("mu"),
                    Parameter("sigma"),
                    Parameter("delta")]
model.levers = [RealLever("vars", 0.0, 1, length=6)]

model.uncertainties = [UniformUncertainty("b", 0.1, 0.45),
                       UniformUncertainty("q", 2.0, 4.5),
                       UniformUncertainty("mu", 0.01, 0.05),
                       UniformUncertainty("sigma", 0.001, 0.005),
                       UniformUncertainty("delta", 0.93, 0.99)]

Next, I need to specify model responses. Model responses include any optimization objective and any information needed to assess constraints or other information. When specifying output I also need to tell Rhodium what kind of response each variable represents (ie minimize or maximize etc.).

model.responses = [Response("net_benefits", Response.MAXIMIZE),
                   Response("max_P", Response.MINIMIZE),
                   Response("intertia", Response.MAXIMIZE),
                   Response("reliability", Response.MAXIMIZE)]

Many Objective Search

Now that I’ve defined our model, I’m ready to perform search with an MOEA. In this example I’ll use NSGAII and search over 10,000 function evaluations. This optimization takes a long time when running in serial, so I’ll exploit Rhodium’s parallelization capabilities to utilize 4 core on my desktop computer.

# Use a Process Pool evaluator, which will work on Python 3+
if __name__ == "__main__":   
    with ProcessPoolEvaluator(4) as evaluator:
        RhodiumConfig.default_evaluator = evaluator
        paretoSet = optimize(model, "NSGAII", 10000)

The variable paretoSet, is an instantiation of Rhodium’s DataSet class which now contains the Pareto approximate set found by NSGAII. We can visualize the approximate Pareto set using Rhodium’s built in visualization tools. Here I’ll use a Parallel Coordinate plot. Following Quinn et al., 2017, I’ll assume the stakeholders have set performance criteria of Reliability > 0.95 and Net Benefits > 0.2. Using Rhodium’s brushing functionality we can examine which solutions in our Pareto approximate set meet these criteria.

parallel_coordinates(model, paretoSet, target="top",
                           brush=[Brush("reliability > 0.95 and net_benefits > 0.2"), Brush("reliability <= 0.95 or net_benefits <= 0.2")])

Below, I’ve defined a function to evaluate a solution’s satisficing criteria. We can extract solutions that meet the criteria using the DataSet class’s find function.

criteria = lambda x : x["reliability"] > 0.95 and x["net_benefits"] > 0.2
acceptableSols = paretoSet.find(criteria)

DU Reevaluation

The solutions defined above meet stakeholder criteria if our assumptions about the state of the world (SOW) are correct, but what if the future deviates from our assumptions? To answer this question, I’ll create 1,000 new SOWs using latin hypercube sampling of our model uncertainties. I’ll then use Rhodium to reevaluate the set of acceptable solutions under each SOW (from here out I will refer to this as DU reevaluation). Rhodium has built in tools for both sampling uncertainties and reevaluating a set of solutions.

SOWs = sample_lhs(model, 1000)

reevaluation = [evaluate(model, update(SOWs, policy)) for policy in acceptableSols]

Measuring Robustness

For this exercise, I’ll examine four robustness measures reviewed in Herman et al., (2015). The four metrics include two variants of regret and two variants of satisficing, categories of robustness measures originally defined by Lempert and Collins, (2007). I’ll first show how each measure is calculated and how it can be coded in Rhodium, then I’ll illustrate how and why the choice of robustness metric changes the decision making process. All python implementations of robustness metrics in this post are adaptations of original functions written by Dave Hadka in the Rhodium source code.

Regret Type 1

The first metric, regret type 1 (R1), measures a solution’s deviation from its performance in the baseline SOW. Examples of this metric’s use can be found in Kasprzyk et al., (2013) and Lempert and Collins, (2007). As defined by Herman et al., (2015), R1 measures robustness as the 90th percentile deviation in objective performance across SOWs, maximized across all objectives. The 90th percentile is used to capture tail performance while reducing susceptibility to outliers.

R1 = \max_i[D_{i,90}:P(D_i\leq D_{i,90}=0.90]

Where:

D_{i,j} = \frac{|F(x)_{i,j}-F(x)^*_i|}{F(x)^*_i}

Where F(x)^*_i represents the value of objective i in the baseline SOW and F(x)_{i,j} represents the value of objective i calculated in SOW j.

Below I’ve coded a function which utilizes Rhodium’s built in data structures to calculate R1 for a given solution. The variable results is a Rhodium DataSet object containing the full set of DU reevaluations for one member of the Pareto approximate set. baseline is a DataSet containing the solution’s performance in the baseline state of the world.

def regret_type1(model, results, baseline, percentile=90):
    quantiles = []
    for response in model.responses:
        if response.dir == Response.MINIMIZE or response.dir == Response.MAXIMIZE:
            values = [abs((result[response.name] - baseline[response.name]) / baseline[response.name]) for result in results]
            quantiles.append(np.percentile(values, percentile))
            
    return max(quantiles)

Regret Type 2

Regret Type 2 (R2), is a variant of Savage regret introduced by Savage, 1951. R2 measures a decision maker’s regret for choosing a given solution over other available solutions by comparing the chosen solution’s performance in each SOW against the best performing option for that SOW. Like R1, R2 utilizes the 90th percentile deviation to capture tail behavior while reducing susceptibility to outliers:

R2 = \max_i[D_{i,90}:P(D_i\leq D_{i,90}=0.90]

Where:

D_{i,j} = \frac{|F(x)_{i,j}-min_s F(x_s)_{i,j}|}{F(x)_{i,j}}

if objective i is to be minimized and

D_{i,j} = \frac{|F(x){i,j}-max_s F(x_s){i,j}|}{F(x)_{i,j}}

if objective i is to be maximized.

The best value of each objective i is taken across the set of all solutions s. The deviation in R2 is normalized by the objective value itself rather than the best value, since this objective often approaches zero for minimization problems.

The function below again utilizes Rhodium’s data structures to efficiently implement a function for R2 in python. all_results is a list of DataSet objects, each containing the full set of DU reevaluations for one member of the Pareto approximate set. results is a DataSet containing the full set of DU reevaluations for the member of the Pareto approximate set being evaluated by this function.

def regret_type2(model, all_results, results, percentile=90):
    # for each uncertainty sampling, find the best value
    best = []
    
    for i in range(len(all_results[0])):
        entry = {}
        
        for response in model.responses:
            if response.dir == Response.MINIMIZE:
                entry[response.name] = min([result[response.name] for result in all_results])
            elif response.dir == Response.MAXIMIZE:
                entry[response.name] = max([result[response.name] for result in all_results])
                
        best.append(entry)
    
    # then compute the regret from the best value
    quantiles = []
    
    for response in model.responses:
        if response.dir == Response.MINIMIZE or response.dir == Response.MAXIMIZE:
            values = []
            
            for i in range(len(all_results[0])):
                values.append(abs((results[i][response.name] - best[i][response.name]) / results[i][response.name]))
            
            quantiles.append(np.percentile(values, percentile))
        
    return max(quantiles)

Satisficing Type 1

Satisficing metrics quantify a solution’s ability to meet prespecified performance criteria across deeply uncertain SOWs. Satisficing type 1, an approximation of Starr’s domain criterion (Starr 1962; Schneller and Sphicas, 1983) , represents the fraction of states of the world that a solution meets a set of performance criteria.

S1 = \frac{1}{N} \sum^{N}_{j=1} \Lambda_{s,j}

Where N is the number of sampled SOWs and \Lambda_{s,j} = 1 if solution s meets the performance criteria in SOW j and \Lambda_{s,j} = 0 otherwise. For this metric, we’ll utilize the performance criteria stated above: Relibility > 95% and Net Benefit > 0.2.

In python, we can express the performance criteria in a lambda function, as shown in the criteria function defined above. The function below calculates S1 for a solution whose DU reevaluation results have been stored in a DataSet objected called results.

def satisficing_type1(model, results, expr=None):
    # if no criteria are defined, check the feasibility across SOWs
    if expr is None:
        return mean(check_feasibility(model, results))
    # otherwise, return the number of SOWs that meet the specified criteria
    else:
        satisfactory = [expr(result) for result in results]
        return sum(satisfactory)/len(results)       

Satisficing Type 2

Satisficing type 2 (S2) is a measure of robustness derived from Info-Gap literature (Ben-Haim, 2004). S2 represents the uncertainty horizon that can be withstood before a solution fails to meet a set of performance criteria.

S2 = \hat{\alpha} = max \big[\alpha:min_{j \in U(\alpha)} F(x)_j \geq r^* \big]

Where \hat{\alpha} = maximum level of uncertainty, measured outward from the base SOW that can be tolerated before performance drops below threshold r*. In this example, I’ll normalize all uncertainties to their sampling range. In the function below, results is once again a Rhodium DataSet object containing the full set of DU reevaluations for one member of the Pareto approximate set. baseline is a DataSet containing the solution’s performance in the baseline state of the world. uncertainties_min is a list containing the lower bound for each uncertainty and uncertainties_max is a list containing upper bounds.

def satisficing_type2(model, results, baseline, uncertainties_max, uncertainties_min, expr=None):
    distances = []
    
    # ensure all default parameters are defined in baseline
    baseline = baseline.copy()
    populate_defaults(model, baseline)
    normalized_baseline = [baseline[u.name] for u in model.uncertainties]
    normalized_baseline = (np.array(uncertainties_max)-np.array(normalized_baseline))/(np.array(uncertainties_max)-np.array(uncertainties_min))
    
    for i, result in enumerate(results):
        if (expr is None and _is_feasible(model, result)) or (expr is not None and expr(result)==False):
            normalized_point = (np.array(uncertainties_max)- np.array([result[u.name] for u in model.uncertainties]))/(np.array(uncertainties_max)-np.array(uncertainties_min))
            distances.append(sp.distance.euclidean(
                    normalized_point,
                    normalized_baseline))
                      
    return 0.0 if len(distances) == 0 else min(distances)

Examining the results

Figure 2: The top ranked solution according to each robustness measure along with each solution’s corresponding ranking across all other measures. Crossing lines indicate contrasting ordinal rankings of solution robustness between two measures. The preferablility of a solution is highly dependent on the choice of robustness metric.

Figure 2 shows the top ranked solution according to each robustness measure along with its corresponding rank across the other three measures. The four measures each provide different rankings of solution robustness. The top ranked solutions for the two regret based measures both preform poorly when evaluated under any of the other three measures, while the best solution according to S1 is tied for the best solution in S2 and ranks in the middle according to the two regret based solution. The second solution selected by S2 ranks at the bottom according to the other three metrics.

The disparity in ranking underscores the differences between the robustness measures. The wide difference in ranking between S2’s best solution and it’s ranking by the other three metrics indicates that while the uncertainties can deviate “far” from the base SOW before this solution fails, its performance is likely to be quite poor once this horizon is crossed. It’s poor ranking in S1 indicates that it will likely fail in many SOWs and it’s R1 ranking suggests that failures in the tail its performance are likely to be quite severe. Finally it’s low ranking in R2 means that there are other solutions that perform better across SOWs.

A solution that performs well by measure S2 may be preferable if decision makers have confidence that their assumptions about the base SOW are correct, however, this is rarely the case in conditions of deep uncertainty, so the solution recommended by S2 alone is likely undesirable. Furthermore, while S2 provides an aggregated measure of “distance to failure” this measure does not indicate which uncertainties drive failure or how far each individual uncertainty must deviate from the base SOW before the solution fails. A better way to understand a solution’s sensitivity to deeply uncertainties is through scenario discovery, which seeks to define vulnerable regions of the uncertainty space for a given solution. Rhodium has a set of built in scenario discovery tools, for details see this post by Julie.

The difference in robustness ranking between measures R1, R2 and S1 leave the decision makers with a choice regarding how they would like a solution to perform. Solutions that perform well in S1 maintain performance across the widest range of potential SOWs, but the metric provides no information on the severity of failure when the criteria is not met. Solutions that perform well in R1 are likely to have the least severe failures in the tails of their performance, but the metric does not measure the fraction of states of the world that result in poor performance. Measure R2 yields information about a solution’s performance relative to other options, but does not provide information about performance criteria or failure severity. As is often the case in decision making problems, the choice of measure should reflect a decision maker’s particular risk tolerance and preferences and must be tailored to each problem individually.

Final thoughts

This example has demonstrated how to use Rhodium to perform the first two steps of MORDM on a four objective formulation of the shallow lake problem. The results indicate that the choice of robustness metric changes which solutions are favored by decision makers. While I’ll wrap up this post here, this should not be the end of an MORDM analysis. After using the robustness metrics to select candidate policies, decision makers should perform scenario discovery to examine which uncertainties control solution performance and how these vulnerabilities differ between candidate solutions. Next, they should visualize each candidate policy to understand how it responds to various system states. Finally, they should think about whether these results necessitate any changes to the original problem formulation. If they choose to reformulate the problem, then the whole process starts back at the beginning. Luckily, Rhodium makes this process easy, allowing decision makers to examine problem formulations quickly and easily.

References

Ben-Haim, Y. (2004). “Uncertainty, probability and information-gaps.” Reliab. Eng. Syst. Saf., 85(1), 249–266.

Carpenter, S.R., Ludwig, D., Brock, W.A., 1999. Management of eutrophication for lakes subject to potentially irreversible change. Ecol. Appl. 9, 751e771.

Groves, D. G., and Lempert, R. J. (2007). “A new analytic method for finding policy-relevant scenarios.” Global Environ. Change, 17(1), 73–85.

Hall, J. W., Lempert, R. J., Keller, K., Hackbarth, A., Mijere, C., and
McInerney, D. J. (2012). “Robust climate policies under uncertainty:
A comparison of robust decision making and info-gap methods.”
Risk Anal., 32(10), 1657–1672

Herman, J.D., Reed, P.M., Zeff, H.B., Characklis, G.W., 2015. How should robustness be defined for water systems planning under change? J. Water Resour. Plan. Manag. 141, 04015012.

Kasprzyk, J. R., Nataraj, S., Reed, P. M., and Lempert, R. J. (2013). “Many objective robust decision making for complex environmental systems
undergoing change.” Environ. Modell. Softw., 42, 55–71.

Lempert, R. J., and Collins, M. (2007). “Managing the risk of an uncertain
threshold response: Comparison of robust, optimimum, and precaution-
ary approaches.” Risk Anal., 27(4), 1009–1026.

Matalas, N. C., and Fiering, M. B. (1977). “Water-resource systems
planning.” Climate, climatic change, and water supply, studies in geo-
physics, National Academy of Sciences, Washington, DC, 99–110.

Quinn, J. D., Reed, P. M., & Keller, K. (2017). Direct policy search for robust multi-objective management of deeply uncertain socio-ecological tipping points. Environmental Modelling & Software, 92, 125-141.

Savage, L. J. (1951). “The theory of statistical decision.” J. Am. Stat. Assoc., 46(253), 55–67.

Schneller, G., and Sphicas, G. (1983). “Decision making under uncertainty: Starr’s domain criterion.” Theory Decis., 15(4), 321–336.

Starr, M. K. (1962). Product design and decision theory, Prentice-Hall, Englewood Cliffs, NJ.

Radial convergence diagram (aka chord diagram) for sensitivity analysis results and other inter-relationships between data

TLDR; Python script for radial convergence plots that can be found here.

You might have encountered this type of graph before, they’re usually used to present relationships between different entities/parameters/factors and they typically look like this:

From https://datavizcatalogue.com/methods/non_ribbon_chord_diagram.html

In the context of our work, I have seen them used to present sensitivity analysis results, where we are interested in both the individual significance of a model parameter, but also the extent of its interaction with others. For example, in Butler et al. (2014) they were used to present First, Second, and Total order parameter sensitivities as produced by a Sobol’ Sensitivity Analysis.

From Butler et al. (2014)

I set out to write a Python script to replicate them. Calvin Whealton has written a similar script in R, and the same functionality also exists within Rhodium. I just wanted something with a bit more flexibility, so I wrote this script that produces two types of these graphs, one with straight lines and one with curved (which are prettier IMO). The script takes dictionary items as inputs, either directly from SALib and Rhodium (if you are using it to display Sobol results), or by importing them (to display anything else). You’ll need one package to get this to run: NetworkX. It facilitates the organization of the nodes in a circle and it’s generally a very stable and useful package to have.

Graph with straight lines
Graph with curved lines

I made these graphs to display results the results of a Sobol analysis I performed on the model parameters of a system I am studying (a, b, c, d, h, K, m, sigmax, and sigmay). The node size indicates the first order index (S1) per parameter, the node border thickness indicates the total order index (ST) per parameter, and the thickness of the line between two nodes indicates the secord order index (S2). The colors, thicknesses, and sizes can be easily changed to fit your needs. The script for these can be found here, and I will briefly discuss what it does below.

After loading the necessary packages (networkx, numpy, itertools, and matplotlib) and data, there is some setting parameters that can be adapted for the figure generation. First, we can define a significance value for the indices (here set to 0.01). To keep all values just set it to 0. Then we have some stylistic variables that basically define the thicknesses and sizes for the lines and nodes. They can be changed to get the look of the graph to your liking.

# Set min index value, for the effects to be considered significant
index_significance_value = 0.01
node_size_min = 15 # Max and min node size
node_size_max = 30
border_size_min = 1 # Max and min node border thickness
border_size_max = 8
edge_width_min = 1 # Max and min edge thickness
edge_width_max = 10
edge_distance_min = 0.1 # Max and min distance of the edge from the center of the circle
edge_distance_max = 0.6 # Only applicable to the curved edges

The rest of the code should just do the work for you. It basically does the following:

  • Define basic variables and functions that help draw circles and curves, get angles and distances between points
  • Set up graph with all parameters as nodes and draw all second order (S2) indices as lines (edges in the network) connecting the nodes. For every S2 index, we need a Source parameter, a Target parameter, and the Weight of the line, given by the S2 index itself. If you’re using this script for other data, different information can fit into the line thickness, or they could all be the same.
  • Draw nodes and lines in a circular shape and adjust node sizes, borders, and line thicknesses to show the relative importance/weight. Also, annotate text labels on each node and adjust their location accordingly. This produces the graph with the straight lines.
  • For the graph with the curved lines, define function that will generate the x and y coordinates for them, and then plot using matplotlib.

I would like to mention this script by Enrico Ubaldi, based on which I developed mine.

Interactive visualizations of high-dimensional data using J3

Project Platypus is a repository that supports multiple Python libraries for multi-objective optimization, scenario discovery, and data analysis. Past blogposts have already demonstrated the Rhodium [1, 2] and Platypus [3] libraries. The aim of this post is to demonstrate the capabilities of J3 and its implementation within Project Platypus, through the Python module J3Py. J3 is an open source, cross-platform Java application for producing and sharing high-dimensional, interactive scientific visualizations. It can be used within Project Platypus, through J3Py, which is a Python module that allows us to call J3 within Python scripts. This blogpost will look into a simple system I’ve been working on and use the Rhodium library to generate management alternatives. I’ll then show how J3 can be used to explore the tradeoffs in the alternatives generated and aid in the negotiated selection of alternatives.

First thing to do is load the necessary libraries:

import numpy as np # This is a library required by the model
import itertools # This is a library required by the model
from rhodium import * # This is the library needed to use Rhodium
from j3 import J3 # This is the library we'll be using to visualize solutions

We then need to define the model function, it’s a bit long and not immediately pertinent to the blogpost so I’ll put it at the bottom so readers don’t have to scroll through it. The optimization will be performed using Rhodium and it’s set up like so:

model = Model(fish_game)

model.parameters = [Parameter("vars"),
                    Parameter("a"),
                    Parameter("b"),
                    Parameter("c"),
                    Parameter("d"),
                    Parameter("h"),
                    Parameter("K"),
                    Parameter("m"),
                    Parameter("sigmaX"),
                    Parameter("sigmaY")]

model.responses = [Response("NPV", Response.MAXIMIZE),
                   Response("PreyDeficit", Response.MINIMIZE),
                   Response("ConsLowHarvest", Response.MINIMIZE),
                   Response("WorstHarvest", Response.MAXIMIZE),
                   Response("PredatorExtinction", Response.INFO)]

model.constraints = [Constraint("PredatorExtinction < 1")]

model.levers = [RealLever("vars", 0.0, 1.0, length = 6)]

output = optimize(model, "NSGAII", 1000)

As Julie has covered Rhodium already, I won’t go into the details here, but it’s pretty intuitive that we first declare what the model is, input parameters, responses (i.e. objectives and constraints), and decision variables. Instead, I’ll focus on analyzing the output (the candidate solutions found) using J3. “output” here is a “DataSet” object of the Rhodium module and contains the decision variables, and objective performance of the solutions identified. There is also a constraint (“PredatorExtinction”) which is always zero in all the solutions, and I will not be visualizing here. I will not edit or change anything on my screen before screen-grabbing, to demonstrate how truly simple and easy it is to use. To call the J3 environment run:

J3(output.as_dataframe(['NPV', 'PreyDeficit', 'ConsLowHarvest', 'WorstHarvest']))

This produces a window with a 3D scatterplot of three of our objectives in the x, y, and z axes and the fourth used as the color.

Full resolution here

I’d like to start examining what my results look like so I’ll make this larger and rotate it a bit.

Full resolution here

I’d like to also change how my objectives are displayed. So I’ll change the orientation of the axes and the objective used for the color.

Full resolution here

The rainbow color-scheme is not really my aesthetic, so let’s change that also.

Full resolution here

A couple things we can see from this plot: there is a strong tradeoff between the NPV objective and the prey deficit, as well as between the prey deficit and the Worst Harvest. We can examine these pairs of tradeoffs more explicitly, by pulling the axes’ planes out and projecting the values on the 2D surfaces:

Full resolution here

We can also examine the tradeoffs using a parallel axis plot:

Full resolution here

We can also move the axes in the parallel axis plot:

Full resolution here

Having these multiple views, we can highlight and examine particular solutions and see how they compare with others, as well as get more detailed information:

Full resolution here and here

The final feature I’d like to showcase is solution brushing, which can facilitate in the negotiation of solutions process. Brushing allows decision makers to set limits on what the believe is acceptable or unacceptable performance (e.g. “I cannot accept costs above X amount”). It also allows decision makers to more closely examine where potential tensions might arise. If, for example, one negotiating party sets their bar too high, all remaining solutions might be unacceptable by the other decision making parties. Tools like brushing make this process more transparent and straightforward.

Full resolution here

The model function used in the example is posted below. I would also like to mention ScreenToGif, which is the tool I used to produce these GIFs and it’s been super easy to download and start using. Great product.

nRBF = 2 # no. of RBFs to use
nIn = 1 # no. of inputs (depending on selected strategy)
nOut = 1 # no. of outputs (depending on selected strategy)

N = 100 # Number of realizations of environmental stochasticity

tSteps = 100 # no. of timesteps to run the fish game on

# Define problem to be solved
def fish_game(vars, # contains all C, R, W for RBF policy
              a = 0.005, # rate at which the prey is available to the predator
              b = 0.5, # prey growth rate
              c = 0.5, # rate with which consumed prey is converted to predator abundance
              d = 0.1, # predator death rate
              h = 0.1, # handling time (time each predator needs to consume the caught prey)
              K = 2000, # prey carrying capacity given its environmental conditions
              m = 0.7, # predator interference parameter
              sigmaX = 0.004, # variance of stochastic noise in prey population
              sigmaY = 0.004): # variance of stochastic noise of predator population

    x = np.zeros(tSteps+1) # Create prey population array
    y = np.zeros(tSteps+1) # Create predator population array
    z = np.zeros(tSteps+1) # Create harvest array

    # Create array to store harvest for all realizations
    harvest = np.zeros([N,tSteps+1])
    # Create array to store effort for all realizations
    effort = np.zeros([N,tSteps+1])
    # Create array to store prey for all realizations
    prey = np.zeros([N,tSteps+1])
    # Create array to store predator for all realizations
    predator = np.zeros([N,tSteps+1])
    
    # Create array to store metrics per realization
    NPV = np.zeros(N)
    cons_low_harv = np.zeros(N)
    harv_1st_pc = np.zeros(N)    
    
    # Create array with environmental stochasticity for prey
    epsilon_prey = np.random.normal(0.0, sigmaX, N)
    
    # Create array with environmental stochasticity for predator
    epsilon_predator = np.random.normal(0.0, sigmaY, N)
    
    #Set policy input and output ranges
    input_ranges = [[0, K]] # Prey pop. range to use for normalization
    output_ranges = [[0, 1]] # Range to de-normalize harvest to

    # Go through N possible realizations
    for i in range(N):
        # Initialize populations and values
        x[0] = prey[i,0] = K
        y[0] = predator[i,0] = 250
        z[0] = effort[i,0] = hrvSTR([x[0]], vars, input_ranges, output_ranges)
        NPVharvest = harvest[i,0] = effort[i,0]*x[0]        
        # Go through all timesteps for prey, predator, and harvest
        for t in range(tSteps):
            if x[t] > 0 and y[t] > 0:
                x[t+1] = (x[t] + b*x[t]*(1-x[t]/K) - (a*x[t]*y[t])/(np.power(y[t],m)+a*h*x[t]) - z[t]*x[t])* np.exp(epsilon_prey[i]) # Prey growth equation
                y[t+1] = (y[t] + c*a*x[t]*y[t]/(np.power(y[t],m)+a*h*x[t]) - d*y[t]) *np.exp(epsilon_predator[i]) # Predator growth equation
                if t <= tSteps-1:
                    z[t+1] = hrvSTR([x[t]], vars, input_ranges, output_ranges)
            prey[i,t+1] = x[t+1]
            predator[i,t+1] = y[t+1]
            effort[i,t+1] = z[t+1]
            harvest[i,t+1] = z[t+1]*x[t+1]
            NPVharvest = NPVharvest + harvest[i,t+1]*(1+0.05)**(-(t+1))
        NPV[i] = NPVharvest
        low_hrv = [harvest[i,j]<prey[i,j]/20 for j in range(len(harvest[i,:]))] # Returns a list of True values when there's harvest below 5%
        count = [ sum( 1 for _ in group ) for key, group in itertools.groupby( low_hrv ) if key ] # Counts groups of True values in a row
        if count: # Checks if theres at least one count (if not, np.max won't work on empty list)
            cons_low_harv[i] = np.max(count)  # Finds the largest number of consecutive low harvests
        else:
            cons_low_harv[i] = 0
        harv_1st_pc[i] = np.percentile(harvest[i,:],1)
    
    return (np.mean(NPV), # Mean NPV for all realizations
            np.mean((K-prey)/K), # Mean prey deficit
            np.mean(cons_low_harv), # Mean worst case of consecutive low harvest across realizations
            np.mean(harv_1st_pc), # 5th percentile of all harvests
            np.mean((predator < 1).sum(axis=1))) # Mean number of predator extinction days per realization

# Calculate outputs (u) corresponding to each sample of inputs
# u is a 2-D matrix with nOut columns (1 for each output)
# and as many rows as there are samples of inputs
def hrvSTR(Inputs, vars, input_ranges, output_ranges):
    # Rearrange decision variables into C, R, and W arrays
    # C and R are nIn x nRBF and W is nOut x nRBF
    # Decision variables are arranged in 'vars' as nRBF consecutive
    # sets of {nIn pairs of {C, R} followed by nOut Ws}
    # E.g. for nRBF = 2, nIn = 3 and nOut = 4:
    # C, R, C, R, C, R, W, W, W, W, C, R, C, R, C, R, W, W, W, W
    C = np.zeros([nIn,nRBF])
    R = np.zeros([nIn,nRBF])
    W = np.zeros([nOut,nRBF])
    for n in range(nRBF):
        for m in range(nIn):
            C[m,n] = vars[(2*nIn+nOut)*n + 2*m]
            R[m,n] = vars[(2*nIn+nOut)*n + 2*m + 1]
        for k in range(nOut):
            W[k,n] = vars[(2*nIn+nOut)*n + 2*nIn + k]

    # Normalize weights to sum to 1 across the RBFs (each row of W should sum to 1)
    totals = np.sum(W,1)
    for k in range(nOut):
        if totals[k] > 0:
            W[k,:] = W[k,:]/totals[k]
    # Normalize inputs
    norm_in = np.zeros(nIn)
    for m in range (nIn):
        norm_in[m] = (Inputs[m]-input_ranges[m][0])/(input_ranges[m][1]-input_ranges[m][0])
    # Create array to store outputs
    u = np.zeros(nOut)
    # Calculate RBFs
    for k in range(nOut):
        for n in range(nRBF):
            BF = 0
            for m in range(nIn):
                if R[m,n] > 10**-6: # set so as to avoid division by 0
                    BF = BF + ((norm_in[m]-C[m,n])/R[m,n])**2
                else:
                    BF = BF + ((norm_in[m]-C[m,n])/(10**-6))**2
            u[k] = u[k] + W[k,n]*np.exp(-BF)
    # De-normalize outputs
    norm_u = np.zeros(nOut)
    for k in range(nOut):
        norm_u[k] = output_ranges[k][0] + u[k]*(output_ranges[k][1]-output_ranges[k][0])
    return norm_u