# Introduction to PyBorg – basic setup and running

PyBorg is a new secondary implementation of Borg, written entirely in Python using the Platypus optimization library. PyBorg was developed by Andrew Dircks based on the original C implementation, and it is intended primarily as a learning tool, as it is less efficient than the original C version (which you can still use from Python through the plugin “wrapper” also found in the package). PyBorg can be found in the same repository as the original Borg, for which you can request access here: http://borgmoea.org/#contact

This blog post demonstrates this new implementation. To follow along, first clone or download the BitBucket repository after you gain access.

Setting up the required packages is easy. In your terminal, navigate to the Python directory in the repository and install all prerequisites using python setup.py install. This will install all requirements (i.e. the Platypus library, numpy, scipy and six) for you in your current environment.

You can test that everything works by running the optimization on the DTLZ2 test function, found in dtlz2.py. The script creates an instance of the problem (as it is already defined in the Platypus library), sets it up as a problem for Borg to optimize, and runs the algorithm for 10,000 function evaluations:

# define a DTLZ2 problem instance from the Platypus library
nobjs = 3
problem = DTLZ2(nobjs)

# define and run the Borg algorithm for 10000 evaluations
algorithm = BorgMOEA(problem, epsilons=0.1)
algorithm.run(10000)


A handy 3D scatter plot is also generated to show the optimization results.

The repository also comes with two other scripts dtlz2_runtime.py and dtlz2_advanced.py.
The first demonstrates how to use the Platypus hypervolume indicator at a specified runtime frequency to learn about the algorithm's progress as it goes through function evaluations. The second provides more advanced functionality that allows you to define custom parameters for Borg, and includes a function to generate runtime data from the run. Both scripts are useful for diagnosing how the algorithm is performing on a given problem.

The rest of this post is a demo of how you can use PyBorg with your own Python model and all of the above. I’ll be using a model I’ve used before, which can be found here, and I’ll formulate it so it only uses the first three objectives for the purposes of demonstration.

The first thing you need to do to optimize your problem is to define it. This is done in exactly the same way you would with Project Platypus, using the Problem class:

from fishery import fish_game
from platypus import Problem, Real
from pyborg import BorgMOEA

# define a problem
nVars = 6
nObjs = 3

problem = Problem(nVars, nObjs) # first input is no of decision variables, second input is no of objectives
problem.types[:] = Real(0, 1) #defines the type and bounds of each decision variable
problem.function = fish_game #defines the model function


This assumes that all decision variables are of the same type and range, but you can also define them individually using, e.g., problem.types[0].

Then you define the problem for the algorithm and set the number of function evaluations:

algorithm = BorgMOEA(problem, epsilons=0.001) #epsilons for each objective
algorithm.run(10000) # number of function evaluations


If you’d like to also produce a runtime file, you can use the detailed_run function included in the demo (in the files referenced above), which wraps the algorithm and runs it in intervals so its progress can be monitored. You can combine it with runtime_hypervolume to also track the hypervolume indicator. To use it, you need to define the total number of function evaluations, the frequency with which you’d like progress to be monitored, and the name of the output file. If you’d also like to calculate the hypervolume (you first need to import Hypervolume from platypus), you need to either provide a known reference set or define maximum and minimum values for your solutions.

maxevals = 10000
frequency = 100
output = "fishery.data"
hv = Hypervolume(minimum=[-6000, 0, 0], maximum=[0, 1, 100])

nfe, hyp = detailed_run(algorithm, maxevals, frequency, output, hv)


My full script can be found below. The detailed_run function is an edited version of the default that comes in the demo to also include the hypervolume calculation.

from fishery import fish_game
from platypus import Problem, Real, Hypervolume
from pyborg import BorgMOEA
from runtime_diagnostics import detailed_run

# define a problem
nVars = 6 # no. of decision variables to be optimized
nObjs = 3

problem = Problem(nVars, nObjs) # first input is no of decision variables, second input is no of objectives
problem.types[:] = Real(0, 1)
problem.function = fish_game

# define and run the Borg algorithm for 10000 evaluations
algorithm = BorgMOEA(problem, epsilons=0.001)
#algorithm.run(10000)

# define detailed_run parameters
maxevals = 10000
frequency = 100
output = "fishery.data"
hv = Hypervolume(minimum=[-6000, 0, 0], maximum=[0, 1, 100])

nfe, hyp = detailed_run(algorithm, maxevals, frequency, output, hv)

# plot the results using matplotlib
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d') # create the 3D axes before plotting
ax.scatter([s.objectives[0] for s in algorithm.result],
           [s.objectives[1] for s in algorithm.result],
           [s.objectives[2] for s in algorithm.result])
ax.set_xlabel('Objective 1')
ax.set_ylabel('Objective 2')
ax.set_zlabel('Objective 3')
ax.scatter(-6000, 0, 0, marker="*", c='orange', s=50)
plt.show()

plt.plot(nfe, hyp)
plt.title('PyBorg Runtime Hypervolume Fish game')
plt.xlabel('Number of Function Evaluations')
plt.ylabel('Hypervolume')
plt.show()


It produces the following two figures:

# Measuring the parallel performance of the Borg MOEA

In most applications, parallel computing is used to improve code efficiency and expand the scope of problems that can be addressed computationally (for background on parallelization, see the references listed at the bottom of this post). For the Borg Many Objective Evolutionary Algorithm (MOEA), however, parallelization can also improve the quality and reliability of many objective search by enabling a multi-population search. The multi-population implementation of Borg is known as Multi-Master Borg; details can be found here. To measure the performance of Multi-Master Borg, we need to go beyond basic parallel efficiency (discussed in my last post, here), which measures the efficiency of computation but not the quality of the many objective search. In this post, I’ll discuss how we measure the performance of Multi-Master Borg using two metrics: hypervolume speedup and reliability.

## Hypervolume speedup

In my last post, I discussed traditional parallel efficiency, which measures the improvement in speed and efficiency that can be achieved through parallelization. For many objective search, speed and efficiency of computation are important, but we are more interested in the speed and efficiency with which the algorithm produces high quality solutions. We often use the hypervolume metric to measure the quality of an approximation set as it captures both convergence and diversity (for a thorough explanation of hypervolume, see this post). Using hypervolume as a measure of search quality, we can then evaluate hypervolume speedup, defined as:

Hypervolume speedup = $\frac{T_S^H}{T_P^H}$

where $T_S^H$ is the time it takes the serial version of the MOEA to achieve a given hypervolume threshold, and $T_P^H$ is the time it takes the parallel implementation to reach the same threshold. Figure 1 below, adapted from Hadka and Reed (2015), shows the hypervolume speedup across different parallel implementations of the Borg MOEA for a five-objective test problem run on 16,384 processors (in this work, the parallel ε-NSGA II algorithm is used as the baseline rather than a serial implementation). Results from Figure 1 reveal that the Multi-Master implementations of Borg reach each hypervolume threshold faster than the baseline algorithm and the master-worker implementation. For high hypervolume thresholds, the 16- and 32-master implementations achieve the thresholds 10 times faster than the baseline.
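To make the definition concrete, here is a small pure-Python sketch of how hypervolume speedup could be computed from two runtime trajectories. The timing data below is made up for illustration, not taken from the paper:

```python
def time_to_threshold(times, hypervolumes, threshold):
    """Return the first recorded time at which hypervolume meets the threshold."""
    for t, hv in zip(times, hypervolumes):
        if hv >= threshold:
            return t
    return None  # threshold never reached within the recorded runtime

# hypothetical (time in hours, hypervolume) trajectories
serial_t, serial_hv = [1, 2, 3, 4, 5], [0.2, 0.4, 0.6, 0.75, 0.9]
parallel_t, parallel_hv = [1, 2, 3, 4, 5], [0.5, 0.8, 0.9, 0.95, 0.97]

threshold = 0.9
T_S = time_to_threshold(serial_t, serial_hv, threshold)      # serial time to threshold
T_P = time_to_threshold(parallel_t, parallel_hv, threshold)  # parallel time to threshold
speedup = T_S / T_P
print(speedup)
```

With these made-up trajectories, the parallel run reaches the 0.9 threshold at hour 3 while the serial run needs 5 hours, giving a hypervolume speedup of about 1.67.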

## Reliability

MOEAs are inherently stochastic algorithms: they are initialized with random seeds, which may speed up or slow down the search process. To ensure high quality Pareto approximate sets, it’s standard practice to run an MOEA across many random seeds and extract the best solutions across all seeds as the final solution set. Reliability is a measure of the probability that each seed will achieve a high quality set of solutions. Algorithms that have higher reliability allow users to run fewer random seeds, which saves computational resources and speeds up the search process. Salazar et al. (2017) examined the performance of 17 configurations of Borg on the Lower Susquehanna River Basin (LSRB) for a fixed 10 hour runtime. Figure 2 shows the performance of each configuration across 50 random seeds. A configuration that is able to achieve the best hypervolume across all seeds would be represented as a blue bar that extends to the top of the plot. The algorithmic configurations are shown in the plot to the right. These results show that though configuration D, which has a high core count and low master count, achieves the best overall hypervolume, it does not do so reliably. Configuration H, which has two masters, is able to achieve nearly the same hypervolume, but with much higher reliability. Configuration L, which has four masters, achieves a lower maximum hypervolume, but has very little variance across random seeds.
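The reliability idea can be sketched in a few lines of Python. Here the per-seed hypervolume values and the 95%-of-best attainment threshold are hypothetical choices for illustration, not the metric used in the paper:

```python
def reliability(seed_hypervolumes, fraction=0.95):
    """Fraction of seeds whose final hypervolume reaches `fraction` of the best seed's."""
    best = max(seed_hypervolumes)
    attained = [hv for hv in seed_hypervolumes if hv >= fraction * best]
    return len(attained) / len(seed_hypervolumes)

# hypothetical final hypervolume, one value per random seed
seed_hvs = [0.91, 0.88, 0.93, 0.60, 0.92]
print(reliability(seed_hvs))
```

Three of the five hypothetical seeds come within 95% of the best seed's hypervolume, so the estimated reliability is 0.6; an algorithm with reliability near 1.0 would let you run far fewer seeds.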

These results can be further examined by looking at the quality of search across the runtime. In Figure 3, Salazar et al. (2017) compare the performance of the three algorithmic configurations highlighted above (D, H and L). The hypervolume achieved by the maximum and minimum seeds is shown by the shaded areas, and the median hypervolume is shown by each line. Figure 3 clearly demonstrates how improved reliability can enhance search. Though the Multi-Master implementation performs fewer function evaluations in the 10 hour runtime, it has very low variance across seeds. The Master-Worker implementation, on the other hand, achieves better performance with its best seed (it gets lucky), but its median performance never achieves the hypervolume of the two- or four-master configurations.

## Concluding thoughts

The two measures introduced above allow us to quantify the benefits of parallelizing the Multi-Master Borg MOEA. The improvements to search quality not only allow us to reduce the time and resources that we need to expend on many objective search, but may also allow us to discover solutions that would be missed by the serial or Master-Worker implementations of the algorithm. In many objective optimization contexts, this improvement may fundamentally alter our understanding of what is possible in challenging environmental optimization problems.

## References

Hadka, D., & Reed, P. (2015). Large-scale parallelization of the Borg multiobjective evolutionary algorithm to enhance the management of complex environmental systems. Environmental Modelling & Software, 69, 353-369.

Salazar, J. Z., Reed, P. M., Quinn, J. D., Giuliani, M., & Castelletti, A. (2017). Balancing exploration, uncertainty and computational demands in many objective reservoir optimization. Advances in water resources, 109, 196-210.

# The ABCs of MOEAs

We have recently begun introducing multi-objective evolutionary algorithms (MOEAs) to a few new additions to our research group, and it reminded me of when I began learning the relevant terminology myself barely a year ago. I recalled using Antonia’s glossary of commonly-used terms as I was getting acquainted with the group’s research in general, and figured that it might be helpful to do something similar for MOEAs in particular.

This glossary provides definitions and examples in plain language for terms commonly used to explain and describe MOEAs, and is intended for those who are just being introduced to this optimization tool. It also has a specific focus on the Borg MOEA, which is a class of algorithms used in our group. It is by no means exhaustive, and since the definitions are in plain language, please do leave a comment if I have left anything out or if the definitions and examples could be better written.

## Greek symbols

ε-box

Divides up the objective space into n-dimensional boxes with side length ε. Used as a “filter” to prevent too many solutions from being “kept” by the archive. The smaller the size of the ε-box, the finer the “mesh” of the filter, and the more solutions are kept by the archive. Manipulating the value of ε affects convergence and diversity.

Each ε-box can only hold one solution at a time. If two solutions are found that reside in the same ε-box, the solution closest to the lower left corner of the box will be kept, while the other will be eliminated.
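A minimal sketch of the mechanics described above, assuming minimization and a single ε shared across objectives (this is an illustration, not the Borg source code):

```python
import numpy as np

def box_index(objectives, eps):
    """Identify the ε-box a solution falls into (minimization assumed)."""
    return tuple(np.floor(np.asarray(objectives) / eps).astype(int))

eps = 0.1
a = [0.23, 0.41]
b = [0.27, 0.49]
print(box_index(a, eps), box_index(b, eps))  # both land in box (2, 4)

# both solutions share one ε-box, so keep the one closer to its lower-left corner
corner = np.array(box_index(a, eps)) * eps  # lower-left corner of the shared box
keep = a if np.linalg.norm(a - corner) <= np.linalg.norm(np.array(b) - corner) else b
print(keep)
```

Here both solutions map to the same box, so only `a`, the solution nearer the box's lower-left corner, survives; a smaller ε would have placed them in different boxes and kept both.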

ε-dominance

A derivative of Pareto dominance. A solution x is said to ε-dominate solution y if it lies in the lower left corner of an ε-box for at least one objective, and is not ε-dominated by solution y for all other objectives.

ε-progress

ε-progress occurs when the current solution x lies in a different ε-box that dominates the previous solution’s box. This enforces a minimum threshold (ε) that an MOEA’s solutions must exceed to avoid search stagnation.

ε-value

The “resolution” of the problem. Can also be interpreted as a measure of the degree of error tolerance of the decision-maker. The ε-values can be set at the discretion of the decision-maker.

## A

A posteriori

Derived from Latin for “from the latter”. Typically used in multi-objective optimization to indicate that the search for solutions precedes the decision-making process. Exploration of the trade-offs resulting from different potential solutions generated by the MOEA is used to identify preferred solutions. Used when uncertainties and preferences are not known beforehand.

A priori

Derived from Latin for “from the former”. Typically used in multi-objective optimization to indicate that a set of potential solutions has already been decided beforehand, and that the function of the search is to identify the best solution(s). Used when uncertainties and preferences are known (well-characterized).

Additive ε-indicator

The distance that the known Pareto front must “move” to dominate the true Pareto front. In other words, the gap between the current set of solutions and the true (optimal) solutions. A performance measure of MOEAs that captures convergence. Further explanation can be found here.

Archive

A “secondary population” that stores the non-dominated solutions. Borg utilizes ε-values to bound the size of the archive (an ε-box dominance archive). That is, solutions that are ε-dominated are eliminated. This helps to avoid deterioration.

## C

Conditional dependencies

Decision variables are conditionally dependent on each other if the value of one decision variable affects one or more of its counterparts.

Control maps

Figures that show the hypervolume achieved in relation to the number of objective function evaluations (NFEs) against the population size for a particular problem. Demonstrates the ability of an MOEA to achieve convergence and maintain diversity for a given NFE and population size. An ideal MOEA will be able to achieve a high hypervolume for any given NFE and population size.

Controllability

An MOEA with a high degree of controllability is one that results in fast convergence rates, high levels of diversity, and a large hypervolume regardless of the parameterization of the MOEA itself. That is, a controllable MOEA is insensitive to its parameters.

Convergence

Two definitions:

1. An MOEA is said to have “converged” at a solution when the subsequent solutions are no better than the previous set of solutions. That is, you have arrived at the best set of solutions that can possibly be attained.
2. The known Pareto front of the MOEA is approximately the same as the true Pareto front. This definition requires that the true Pareto front be known.

## D

Decision variables

Variables that can be adjusted and set by the decision-maker.

Deterioration

Occurs when elements of the current solution are dominated by a previous set of solutions within the archive. This indicates that the MOEA’s ability to find new solutions is diminishing.

Diversity

The “spread” of solutions throughout the objective space. An MOEA is said to be able to maintain diversity if it is able to find many solutions that are evenly spread throughout the objective space.

Dominance resistance

A phenomenon in which an MOEA struggles to produce offspring that dominate non-dominated members of the population. That is, the current set of solutions are no better than the worst-performing solutions of the previous set. An indicator of stagnation.

## E

Elitist selection

Refers to the retention of a select number of ‘best’ solutions in the previous population, and filling in the slots of the current generation with a random selection of solutions from the archive. For example, the Borg MOEA utilizes elitist selection during the randomized restarts when the best k-solutions from the previous generation are maintained in the population.

Epistasis

Describes the interactions between the different operators used in Borg MOEA. Refers to the phenomenon in which the heavier applications of one operator suppresses the use of other operators, but does not entirely eliminate the use of the lesser-used operators. Helps with finding new solutions. Encourages diversity and prevents pre-convergence.

## G

Generation

A set of solutions generated from one iteration of the MOEA. Consists of both dominated and non-dominated solutions.

Generational

Generational MOEAs apply the selection, crossover and mutation operators all at once to an entire generation of the population. The result is a complete replacement of the entire generation at the next time-step.

Generational distance

The average distance between the known Pareto front and the true Pareto front. The easiest performance metric to meet, and captures convergence of the solutions. Further explanation can be found here.
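Using the "average distance" definition above, a small sketch with numpy (the two fronts below are made up for illustration; some papers normalize this metric differently):

```python
import numpy as np

def generational_distance(known, true):
    """Average Euclidean distance from each known-front point to its
    nearest neighbor on the true front."""
    known, true = np.asarray(known), np.asarray(true)
    # pairwise distances: rows = known points, columns = true points
    dists = np.linalg.norm(known[:, None, :] - true[None, :, :], axis=2)
    return dists.min(axis=1).mean()

known_front = [[0.1, 1.0], [0.5, 0.6], [1.0, 0.2]]  # hypothetical approximation set
true_front = [[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]   # hypothetical true front
print(generational_distance(known_front, true_front))
```

A generational distance of zero means every approximate solution sits exactly on the true front, which is why the metric captures convergence but says nothing about diversity.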

Genetic algorithm

An algorithm that uses the principles of evolution – selection, mutation and crossover – to search for solutions to a problem given a starting point, or “seed”.

## H

Hypervolume

The n-dimensional “volume” covered by the known Pareto front with respect to the total n-dimensional volume of all the objectives of a problem, bounded by a reference point. Captures both convergence and diversity. One of the performance measures of an MOEA. Further explanation can be found here.

## I

Injection

The act of “refilling” the population with members of the archive after a restart. Injection can also include filling the remaining slots in the current population with new, randomly-generated solutions or mutated solutions. This helps to maintain diversity and prevent pre-convergence.

## L

Latin hypercube sampling (LHS)

A statistical method of sampling random numbers in a way that reflects the true underlying probability distribution of the data. Useful for high-dimensional problems such as those faced in many-objective optimization. More information on this topic can be found here.

## M

Many-objective problem

An optimization problem that involves more than three objectives.

Mutation

One of the three operators used in MOEAs. Mutation occurs when a solution from the previous population is slightly perturbed before being injected into the next generation’s population. Helps with maintaining diversity of solutions.

Multi-objective

An optimization problem that traditionally involves two to three objectives.

## N

NFE

Number of function evaluations. The maximum number of times an MOEA evaluates the objective function of a multi- (or many-) objective problem.

## O

Objective space

The n-dimensional space defined by the number, n, of objectives as stated by the decision-maker. Can be thought of as the number of axes on an n-dimensional graph.

Offspring

The result of selection, mutation, or crossover in the current generation. The new solutions that, if non-dominated, will be used to replace existing members in the current generation’s population.

Operator

Genetic algorithms typically use the following operators – selection, crossover, and mutation operators. These operators introduce variation in the current generation to produce new, evolved offspring. These operators are what enable MOEAs to search for solutions using random initial solutions with little to no information.

## P

Parameters

Initial conditions for a given MOEA. Examples of parameters include population-to-archive ratio, initial population size, and selection ratio.

Parameterization

An MOEA with a high degree of parameterization implies that it requires highly-specific parameter values to generate highly-diverse solutions at a fast convergence rate.

Parents

Members of the current generation’s population that will undergo selection, mutation, and/or crossover to generate offspring.

Pareto-dominance

A solution x is said to Pareto-dominate another solution y if x performs better than y in at least one objective, and performs at least as well as y in all other objectives.
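This definition translates almost directly into code (minimization assumed in this sketch):

```python
def pareto_dominates(x, y):
    """True if x Pareto-dominates y: at least as good in every objective,
    strictly better in at least one (all objectives minimized)."""
    return all(a <= b for a, b in zip(x, y)) and any(a < b for a, b in zip(x, y))

print(pareto_dominates([1, 2, 3], [2, 2, 4]))  # True: better in two, equal in one
print(pareto_dominates([1, 5], [2, 2]))        # False: each is better somewhere
```

The second call returning False for both orderings of the arguments is exactly the Pareto-nondominance case defined next.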

Pareto-nondominance

Both solutions x and y are said to be non-dominating if neither Pareto-dominates the other. That is, there is at least one objective in which solution x is dominated by solution y, and vice versa.

Pareto front

A set of solutions (the Pareto-optimal set) that are non-dominant to each other, but dominate other solutions in the objective space. Also known as the tradeoff surface.

Pareto-optimality

A set of solutions is said to have achieved Pareto-optimality when all the solutions within the same set non-dominate each other, but are dominant to other solutions within the same objective space. Not to be confused with the final, “optimal” set of solutions.

Population

A current set of solutions generated by one evaluation of the problem by an MOEA. Populated by both inferior and Pareto-optimal solutions; can be static or adaptive. The Borg MOEA utilizes adaptive population sizing, in which the size of the population is adjusted to remain proportional to the size of the archive. This prevents search stagnation and the potential elimination of useful solutions.

Pre-convergence

The phenomenon in which an MOEA mistakenly converges to a local optimum and stagnates. This may lead the decision-maker to falsely conclude that the “best” solution set has been found.

## R

Recombination

One of the ways that a mutation operator acts upon a given solution. Can be thought of as ‘shuffling’ the current solution to produce a new solution.

Rotation

Applying a transformation to change the orientation of the matrix (or vector) of decision variables. Taking the transpose of a vector can be thought of as a form of rotation.

Rotationally invariant

MOEAs that utilize rotationally invariant operators can generate solutions to a problem without requiring that the problem’s decision variables be independent.

## S

Search stagnation

Search stagnation is said to have occurred if the set of current solutions do not ε-dominate the previous set of solutions. Detected by the ε-progress indicator (ref).

Selection

One of the three operators used in MOEAs. The selection operator chooses the ‘best’ solutions from the current generation of the population to be maintained and used in the next generation. Helps with convergence to a set of optimal solutions.

Selection pressure

A measure of how ‘competitive’ the current population is. The larger the population and the larger the tournament size, the higher the selection pressure.
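A toy tournament-selection sketch makes the idea concrete: the larger the tournament size k, the more likely a high-fitness individual wins, i.e. the higher the selection pressure (function and variable names here are hypothetical, not from Borg):

```python
import random

def tournament_select(population, fitness, k, rng=random):
    """Sample k individuals at random and return the fittest
    (lower fitness value = better, for this illustration)."""
    contenders = rng.sample(population, k)
    return min(contenders, key=fitness)

random.seed(42)
population = list(range(100))  # individuals labeled 0..99

def fitness(ind):
    return ind  # lower label = fitter, purely for illustration

winner_small = tournament_select(population, fitness, k=2)   # weak pressure
winner_large = tournament_select(population, fitness, k=20)  # strong pressure
print(winner_small, winner_large)
```

Averaged over many draws, the k=20 tournament returns much fitter winners than the k=2 tournament, which is what "higher selection pressure" means in practice.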

Steady-state

A steady-state MOEA applies its operators to single members of its population at a time. That is, at each step a single individual (solution) is selected as a parent to be mutated or crossed over to generate an offspring. Each generation is changed one solution at a time.

## T

Time continuation

A method in which the population is periodically ’emptied’ and repopulated with the best solutions retained in the archive. For example, Borg employs time continuation during its randomized restarts when it generates a new population with the best solutions stored in the archive and fills the remaining slots with randomly-generated or mutated solutions.

Tournament size

The number of solutions to be ‘pitted against each other’ for crossover or mutation. The higher the tournament size, the more solutions are forced to compete to be selected as parents to generate new offspring for the next generation.

## References

Coello Coello, C. A., Lamont, G. B., & Van Veldhuizen, D. A. (2007). Evolutionary Algorithms for Solving Multi-Objective Problems (2nd ed.). Springer.

Hadjimichael, A. (2017, August 18). Glossary of commonly used terms. Water Programming: A Collaborative Research Blog. https://waterprogramming.wordpress.com/2017/08/11/glossary-of-commonly-used-terms/.

Hadka, D., & Reed, P. (2013). Borg: An Auto-Adaptive Many-Objective Evolutionary Computing Framework. Evolutionary Computation, 21(2), 231–259. https://doi.org/10.1162/evco_a_00075

Kasprzyk, J. R. (2013, June 25). MOEA Performance Metrics. Water Programming: A Collaborative Research Blog. https://waterprogramming.wordpress.com/2013/06/25/moea-performance-metrics/.

Li, M. (n.d.). Many-Objective Optimisation. https://www.cs.bham.ac.uk/~limx/MaOP.html.

What is Latin Hypercube Sampling? Statology. (2021, May 10). https://www.statology.org/latin-hypercube-sampling/.

# Using the Python Borg Wrapper – Lake Problem Example

Python compatibility with the Borg MOEA is highly useful for practical development and optimization of Python models. The Borg algorithm is implemented in C, a strongly-typed, compiled language that offers high efficiency and scalability for complex optimization problems. Often, however, our models are not written in C/C++, but rather in simpler scripting languages like Python, MATLAB, and R. The Borg Python wrapper allows problems written in Python to be optimized by the underlying C algorithm, maintaining efficiency and ease of use. Use of the Python wrapper varies slightly across operating systems and computing architectures. This post will focus on Linux systems, like “The Cube”, our computing cluster here at Cornell, and will work with the most current implementation of the Borg MOEA, which can be accessed with permission from this site.

The underlying communications between a Python problem and the Borg C source code are handled by the wrapper with ctypes, a library that provides Python compatibility with C types and functions in shared libraries. Shared libraries (conventionally .so files in Linux/Unix) provide dynamic linking, a systems tool that allows for compiled code to be linked and reused with different objects. For our purposes, we can think of the Borg shared library as a way to compile the C algorithm and reuse it with different Python optimization runs, without having to re-compile any code. The shared library gives the wrapper access to the underlying Borg functions needed for optimization, so we need to create this file first. In the directory with the Borg source code, use the following command to create the (serial) Borg shared library.
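Before compiling libborg.so, it may help to see the ctypes mechanism in isolation. The sketch below loads a shared library and calls a C function from Python, using the standard C math library as a stand-in for the Borg library (the declared signature pattern is the same one the wrapper uses for Borg's functions):

```python
import ctypes
import ctypes.util

# locate and load the C math library (e.g. "libm.so.6" on Linux);
# fall back to the symbols already loaded into the process if lookup fails
libm_path = ctypes.util.find_library("m")
libm = ctypes.CDLL(libm_path or None)

# declare the C signature: double cos(double)
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]

print(libm.cos(0.0))  # 1.0
```

The wrapper does the analogous thing with `ctypes.CDLL("./libborg.so")`, declaring argument and return types for the Borg C functions so Python values cross the boundary correctly.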

gcc -shared -fPIC -O3 -o libborg.so borg.c mt19937ar.c -lm

Next, we need to move our shared library into the directory containing the Python wrapper (borg.py) and whatever problem we are optimizing.

In this post, we’ll be using the Lake Problem DPS formulation to demonstrate the wrapper. Here’s the source code for the problem:

"""
@author: Rohini

#DPS Formulation

#Objectives:
#1) Maximize expected economic benefit
#2) Minimize worst case average P concentration
#3) Maximize average inertia of P control policy
#4) Maximize average reliability

#Constraints:
#Reliability must be at least 85%

#Decision Variables
#vars: vector of size 3n
#n is the number of radial basis functions needed to define the policy
#Each has a weight, center, and radius (w1, c1, r1..wm,cn,rn)

#Time Horizon for Planning, T: 100 Years
#Simulations, N: 100

"""

import numpy as np
from sys import *
from math import *
from scipy.optimize import root
import scipy.stats as ss

# Lake Parameters
b = 0.42
q = 2.0

# Natural Inflow Parameters
mu = 0.03
sigma = np.sqrt(10**-5)

# Economic Benefit Parameters
alpha = 0.4
delta = 0.98

# Set the number of RBFs (n), decision variables, objectives and constraints
n = 2
nvars = 3 * n
nobjs = 4
nYears = 100
nSamples = 100
nSeeds = 2
nconstrs = 1

# Set Thresholds
reliability_threshold = 0.85
inertia_threshold = -0.02

###### RBF Policy ######

#Define the RBF Policy
def RBFpolicy(lake_state, C, R, W):
    # Determine pollution emission decision, Y
    Y = 0
    for i in range(len(C)):
        if R[i] != 0:
            Y = Y + W[i] * ((np.absolute(lake_state - C[i]) / R[i])**3)

    Y = min(0.1, max(Y, 0.01))

    return Y

###### Main Lake Problem Model ######

def LakeProblemDPS(*vars):

    seed = 1234

    #Solve for the critical phosphorus level
    def pCrit(x):
        return [(x[0]**q) / (1 + x[0]**q) - b * x[0]]

    sol = root(pCrit, 0.5)
    critical_threshold = sol.x

    #Initialize arrays
    average_annual_P = np.zeros([nYears])
    discounted_benefit = np.zeros([nSamples])
    yrs_inertia_met = np.zeros([nSamples])
    yrs_Pcrit_met = np.zeros([nSamples])
    lake_state = np.zeros([nYears + 1])
    objs = [0.0] * nobjs
    constrs = [0.0] * nconstrs

    #Generate nSamples of nYears of natural phosphorus inflows
    natFlow = np.zeros([nSamples, nYears])
    for i in range(nSamples):
        np.random.seed(seed + i)
        natFlow[i, :] = np.random.lognormal(
            mean=log(mu**2 / np.sqrt(mu**2 + sigma**2)),
            sigma=np.sqrt(log((sigma**2 + mu**2) / mu**2)),
            size=nYears)

    # Determine centers, radii and weights of RBFs
    C = vars[0::3]
    R = vars[1::3]
    W = vars[2::3]
    newW = np.zeros(len(W))

    #Normalize weights to sum to 1
    total = sum(W)
    if total != 0.0:
        for i in range(len(W)):
            newW[i] = W[i] / total
    else:
        for i in range(len(W)):
            newW[i] = 1 / n

    #Run model simulation
    for s in range(nSamples):
        lake_state[0] = 0
        Y = np.zeros([nYears])

        #find policy-derived emission
        Y[0] = RBFpolicy(lake_state[0], C, R, newW)

        for i in range(nYears):
            lake_state[i + 1] = lake_state[i] * (1 - b) + (
                lake_state[i]**q) / (1 + (lake_state[i]**q)) + Y[i] + natFlow[s, i]
            average_annual_P[i] = average_annual_P[i] + lake_state[i + 1] / nSamples
            discounted_benefit[s] = discounted_benefit[s] + alpha * Y[i] * delta**i

            if i >= 1 and ((Y[i] - Y[i - 1]) > inertia_threshold):
                yrs_inertia_met[s] = yrs_inertia_met[s] + 1

            if lake_state[i + 1] < critical_threshold:
                yrs_Pcrit_met[s] = yrs_Pcrit_met[s] + 1

            if i < (nYears - 1):
                #find policy-derived emission
                Y[i + 1] = RBFpolicy(lake_state[i + 1], C, R, newW)

    # Calculate minimization objectives (defined in comments at beginning of file)
    objs[0] = -1 * np.mean(discounted_benefit)  #average economic benefit
    objs[1] = np.max(average_annual_P)  #minimize the max average annual P concentration
    objs[2] = -1 * np.sum(yrs_inertia_met) / (
        (nYears - 1) * nSamples)  #average percent of transitions meeting inertia threshold
    objs[3] = -1 * np.sum(yrs_Pcrit_met) / (nYears * nSamples)  #average reliability

    constrs[0] = max(0.0, reliability_threshold - (-1 * objs[3]))

    return (objs, constrs)


The important function for this blog post is LakeProblemDPS, which demonstrates how to configure your own problem with the wrapper. Your function must take in *vars, the decision variables, and return objs, a list of objective values (or a tuple of objective values and constraints). Within the problem, refer to vars[i] as the i-th decision variable, for i in [0,nvars-1]. Set the list of objective values in the same manner.
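As a stripped-down illustration of that signature, here is a hypothetical two-variable, two-objective, one-constraint problem (the function and its contents are invented for this example, not part of the Lake Problem):

```python
def toy_problem(*vars):
    """Toy problem in the wrapper's expected shape: *vars in, (objs, constrs) out."""
    x, y = vars[0], vars[1]
    objs = [x**2 + y**2,           # objective 1: distance from the origin (minimize)
            (x - 1)**2 + y**2]     # objective 2: distance from (1, 0) (minimize)
    constrs = [max(0.0, x + y - 1.5)]  # zero when feasible, i.e. when x + y <= 1.5
    return (objs, constrs)

print(toy_problem(0.5, 0.25))
```

A problem without constraints would simply return the objective list alone, and Borg would be constructed with nconstrs set to 0.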

Once our problem is defined and compatible with the wrapper, we can optimize with Borg. The following code runs the Lake problem optimization for 10,000 function evaluations.

# Serial Borg run with Python wrapper
# ensure libborg.so is compiled and in this directory
from borg import *
from lake import *

maxevals = 10000

# create an instance of the serial Borg MOEA
borg = Borg(nvars, nobjs, nconstrs, LakeProblemDPS)

# set the decision variable bounds and objective epsilons
borg.setBounds(*[[-2, 2], [0, 2], [0, 1]] * int((nvars / 3)))
borg.setEpsilons(0.01, 0.01, 0.0001, 0.0001)

# perform the optimization
# pass in a dictionary of arguments, as defined in borg.py
result = borg.solve({"maxEvaluations": maxevals})

# print the resulting objectives
for sol in result:
    print(sol.getObjectives())

Note that the constructor Borg() creates an instance of the Borg algorithm with a specified number of variables, objectives, and constraints. The LakeProblemDPS argument is the objective function to be optimized by this instance of Borg. The setBounds and setEpsilons methods are required. solve() performs the optimization and takes in a dictionary of Borg parameters; see borg.py for a comprehensive list.
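The bounds expression in the snippet above repeats the per-RBF triple of ranges — center, radius, weight — once for each RBF encoded in the decision vector. A quick plain-Python check of that list construction (nvars = 9, i.e. three RBFs, is an arbitrary choice for illustration):

```python
nvars = 9  # three RBFs, three decision variables each (illustrative)
bounds = [[-2, 2], [0, 2], [0, 1]] * int(nvars / 3)
assert len(bounds) == nvars        # one [lo, hi] pair per decision variable
assert bounds[0::3] == [[-2, 2]] * 3   # the center bounds repeat at positions 0, 3, 6
```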

### Using the Python wrapper to run the Parallel Borg MOEA

The previous example uses the serial Borg MOEA, but the wrapper also supports the master-worker and multi-master parallelizations. Configuring a parallel version of the algorithm requires a few additional steps. First, you must compile a shared library of the parallel implementation and move it to the wrapper directory.

For the master-worker version, use:

mpicc -shared -fPIC -O3 -o libborgms.so borgms.c mt19937ar.c -lm

For the multi-master version, use:

mpicc -shared -fPIC -O3 -o libborgmm.so borgmm.c mt19937ar.c -lm

To call the master-worker version, you must explicitly start up and shut down MPI using the Configuration class provided in borg.py. The following code performs a parallel master-worker optimization of the Lake problem:

# Master-worker Borg run with Python wrapper
# ensure libborgms.so is compiled and in this directory
from borg import *
from lake import *

# set max time in hours
maxtime = 0.1

# need to start up MPI first
Configuration.startMPI()

# create an instance of Borg with the Lake problem
borg = Borg(nvars, nobjs, nconstrs, LakeProblemDPS)

# set bounds and epsilons for the Lake problem
borg.setBounds(*[[-2, 2], [0, 2], [0, 1]] * int((nvars / 3)))
borg.setEpsilons(0.01, 0.01, 0.0001, 0.0001)

# perform the optimization
result = borg.solveMPI(maxTime=maxtime)

# shut down MPI
Configuration.stopMPI()

# only the master node returns a result
# print the objectives to output
if result:
    for solution in result:
        print(solution.getObjectives())

This script must be called as a parallel process – here’s a SLURM submission script that can be used to run the optimization on 16 processors (compatible for The Cube):

#!/bin/bash
#SBATCH -J py-wrapper
#SBATCH -o normal.out
#SBATCH -e normal.err
#SBATCH --nodes 1
#SBATCH --ntasks-per-node 16

mpirun python3 mslake.py

Submitting this script with sbatch submission.sbatch will allocate one node with 16 processors for the optimization run.

## Troubleshooting

Depending on your machine’s MPI version and your shell’s LD_LIBRARY_PATH environment variable, the Borg wrapper may try to access an unavailable mpi shared library. This issue happens on our cluster, the Cube, and causes the following error:

OSError: libmpi.so.0: cannot open shared object file: No such file or directory

In borg.py, the startMPI method attempts to access the nonexistent libmpi.so.0 shared library. To fix this, find the location of your mpi files with:
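As an illustration only — the search command and the Open MPI path below are hypothetical and vary by system — you might locate the MPI shared libraries and point LD_LIBRARY_PATH at their directory like this:

```shell
# list any MPI shared libraries the dynamic linker already knows about
ldconfig -p | grep libmpi || true

# then prepend the directory that holds them (hypothetical path shown)
export LD_LIBRARY_PATH="/usr/lib/openmpi/lib:$LD_LIBRARY_PATH"
echo "$LD_LIBRARY_PATH"
```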

do
    java ${JAVA_ARGS} org.moeaframework.analysis.sensitivity.ResultFileEvaluator \
        -d 3 -i ./Runtime/runtime_S${SEED}.runtime \
        -r Borg_DTLZ2_3.reference \
        -o ./metrics/Borg_DTLZ2_3_S${SEED}.metrics
done

To execute the script in your terminal enter:

./calc_runtime_metrics.sh

The above command will generate 10 .metrics files inside the metrics folder. Each .metrics file contains MOEA performance metrics for one random seed: hypervolume is in the first column, and each row represents a different runtime snapshot. It's important to note that the hypervolume calculated by MOEAFramework here is the absolute hypervolume, but what we really want in this situation is the hypervolume relative to the reference set (i.e., the hypervolume achieved at each runtime snapshot divided by the hypervolume of the reference set). To calculate the hypervolume of the reference set, follow the steps presented in step 2 of Jazmin's blog post (linked here), and divide all runtime hypervolumes by this value.

To examine runtime performance across random seeds, we can load each .metrics file into Matlab and plot hypervolume against NFE. The runtime hypervolume for the DTLZ2 3-objective test case I ran is shown in Figure 3 below.

Figure 3: Runtime results for the DTLZ2 3 objective test case

Figure 3 shows that while there is some variance across the seeds, they all approach the hypervolume of the reference set after about 10,000 NFE. This leveling off of our search across many initial parameterizations indicates that our algorithm has likely converged to a final approximation of our Pareto set. If this plot had yielded hypervolumes that were still increasing after 30,000 NFE, it would indicate that we need to extend our search to a higher number of NFE.

## References

Deb, K., Thiele, L., Laumanns, M., Zitzler, E., 2002. Scalable multi-objective optimization test problems. Proceedings of the 2002 Congress on Evolutionary Computation, CEC'02, (1), 825-830.

Hadka, D., Reed, P., 2013. Borg: an auto-adaptive many-objective evolutionary computing framework. Evol. Comput. 21 (2), 231–259.

Knowles, J., Corne, D., 2002. On metrics for comparing nondominated sets. Proceedings of the 2002 Congress on Evolutionary Computation, CEC'02, (1). IEEE, pp. 711–716.

Zatarain Salazar, J., Reed, P.M., Herman, J.D., Giuliani, M., Castelletti, A., 2016. A diagnostic assessment of evolutionary algorithms for multi-objective surface water reservoir control. Adv. Water Resour. 92, 172–185.

Zatarain Salazar, J.J., Reed, P.M., Quinn, J.D., Giuliani, M., Castelletti, A., 2017. Balancing exploration, uncertainty and computational demands in many objective reservoir optimization. Adv. Water Resour. 109, 196-210.

Zitzler, E., Thiele, L., Laumanns, M., Fonseca, C.M., Da Fonseca, V.G., 2003. Performance assessment of multiobjective optimizers: an analysis and review. IEEE Trans. Evol. Comput. 7 (2), 117–132.

# Introduction to Borg Operators Part 1: Simplex Crossover (SPX)

I have always found the connection between genetic algorithms and biological evolution very interesting. In many ways, genetic algorithms like Borg emulate biological evolution: they start with a random population of potential solutions, and the best fit of these solutions are selected to produce the future generation. During the production of offspring, crossover and mutation occur, which leads to more variation in the future generation. Strong (non-dominated) offspring join the larger population and replace some of the previous, weaker solutions (which are now dominated), and the process is repeated until some final condition is met. In this post, I will briefly discuss Borg's operators, and then dive deeper into Simplex Crossover (SPX) by Tsutsui et al. (1999). This is the first of a series of posts in which I will discuss the Borg operators.

## General Overview of Borg Operators

Borg uses the six operators shown in Figure 1 below to achieve variation in its solutions.
In the figure, the larger dots represent the parent solutions, while the smaller dots represent the offspring solutions (Hadka & Reed, 2013). The question is: how does Borg know which operators are best suited for a new problem? Borg auto-adaptively selects operators based on their performance to aid the search. Based on section 3.4 in Hadka and Reed, 2013: in the first iteration, Borg assigns an equal probability to each operator. For K operators, this probability is 1/K. With each iteration, these probabilities are updated, so that operators that produced more solutions in the epsilon box dominance archive are assigned higher probabilities, while those with fewer solutions in the archive are assigned lower probabilities. Hadka and Reed (2013) use a weighted average equation to obtain this behavior, where:

• K is the number of operators used.
• Qi ∈ {Q1, …, QK} is the probability that operator i is used to produce offspring in the next generation, initialized at 1/K.
• Ci ∈ {C1, …, CK} is the number of solutions in the epsilon box dominance archive produced by each operator i.
• Sigma is a positive number that prevents any of the operators from reaching a probability of zero.

Figure 1 (Hadka & Reed, 2013)

Notice that the bottom three operators in Figure 1 show some similarity: the parents form a simplex (which is a triangle in 2 dimensions). However, the way the offspring are distributed around the parents is different. In Parent-Centric Crossover, the offspring form around the parent solutions; in Unimodal Normal Distribution Crossover, the offspring are normally distributed around the center of mass of the three parents; and in Simplex Crossover, the solutions are uniformly distributed inside a simplex that is larger than the simplex created by the three parents (Deb et al., 2002).
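The weighted-average update described above can be sketched numerically. The form below, Qi = (Ci + σ) / Σj (Cj + σ), is my reading of the equation referenced in the overview (check it against Hadka & Reed, 2013), and the archive counts are invented for illustration:

```python
def operator_probabilities(archive_counts, sigma=0.01):
    # Operators that contributed more solutions to the epsilon box dominance
    # archive get proportionally higher selection probability; sigma keeps
    # every operator's probability strictly positive.
    denom = sum(c + sigma for c in archive_counts)
    return [(c + sigma) / denom for c in archive_counts]

# six operators, with invented archive contributions
probs = operator_probabilities([8, 2, 3, 0, 1, 0])
```

With these counts the first operator is selected most often, while the operators with zero archive members keep a small nonzero probability, so they are never ruled out entirely.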
## Simplex Crossover (SPX)

Discussion based on Tsutsui et al., 1999.

Simplex Crossover (SPX) is a recombination operator that can use more than two parent solutions to generate offspring. The method involves expanding the simplex formed by the parent solutions by a factor of epsilon, and then sampling offspring from the resulting expanded simplex. The expansion of the simplex is centered around the center of mass of the parent solution vectors, and is therefore independent of the coordinate system used. The number of parents, m, must be at least 2 and less than or equal to the number of parameters, n, plus one; that is, 2 ≤ m ≤ n + 1. SPX is denoted as SPX-n-m-ε. Tsutsui et al., 1999 state that for low-dimensional functions SPX works best with 3 parent solution vectors, and for high-dimensional functions SPX works best with 4 parent solution vectors.

### The 3 Parent, 2 Parameter Case

It is easiest to see this in the two-dimensional, three parent, two parameter case, i.e. SPX-2-3-ε. This can be done in three steps, referring to Figure 2:

Figure 2 (Tsutsui et al., 1999)

1. Let X1, X2, and X3 be the three parent solution vectors. Each of these vectors has two parameters (its x and y coordinates). Calculate the center of mass, O, of the parents by computing the average of their two parameters.
2. Expand the simplex by epsilon at every vertex.
3. Produce offspring by using a uniform distribution to sample inside the new, expanded simplex defined in step 2.

You can see this in Python using the code below:

import numpy as np
import random

def SPX_2D(X1, X2, X3, epsilon, n_offspring=1, m=2):
    # m is the number of parameters in each parent vector
    # X1, X2, X3 are the parent vectors as np.arrays, each with m parameters (elements), i.e. len(Xi)=m
    # epsilon is a value greater than zero by which the simplex is expanded
    # n_offspring is the number of offspring you would like to produce

    # Step 0: apply checks and inform the user if the input is wrong
    if m < 2:
        print("error: at least two parameters needed for 2D SPX")
    elif len(X1) != len(X2) or len(X1) != len(X3):
        print("error: the number of parameters in the parent vectors don't match")
    elif len(X1) != m:
        print("error: the number of parameters in the parent vectors is not equal to m")
    # if the checks pass, the function can proceed
    else:
        # Step 1: find the center of mass (CoM) of the three parents
        CoM = 1/3 * (X1 + X2 + X3)

        # Step 2: Define the vertices (Vi) of the simplex by expanding the simplex
        # in the direction of Xi-CoM by (1+epsilon).
        # Note that the equation here is slightly different from the one in the paper,
        # since the one in the paper produces the vector that translates the center of
        # mass to the new vertices, and so the vectors need to be added to the center
        # of mass coordinates to produce the new vertices.
        V1 = CoM + (X1 - CoM) * (1 + epsilon)
        V2 = CoM + (X2 - CoM) * (1 + epsilon)
        V3 = CoM + (X3 - CoM) * (1 + epsilon)

        # Step 3: Produce offspring by sampling n_offspring points from the new
        # simplex defined in step 2 using a uniform distribution
        offspring = [point_on_triangle(V1, V2, V3) for _ in range(n_offspring)]

        return offspring, V1, V2, V3, CoM

#########################################################################
# point_on_triangle function
# source: https://stackoverflow.com/questions/47410054/generate-random-locations-within-a-triangular-domain
# Solution by Mark Dickinson (Nov 21, 2017)
# Comments added by Sara
def point_on_triangle(pt1, pt2, pt3):
    # pt1, pt2, pt3 are the vertices of a triangle
    # draw two random numbers s and t; note that random() produces a float
    # between 0 and 1 using a uniform distribution
    s, t = sorted([random.random(), random.random()])
    # use s & t to calculate a weighted average of each coordinate;
    # this produces the offspring vector
    return (s * pt1[0] + (t - s) * pt2[0] + (1 - t) * pt3[0],
            s * pt1[1] + (t - s) * pt2[1] + (1 - t) * pt3[1])
#########################################################################

# Let's try an example
X1 = np.array([-2, 2])
X2 = np.array([4, 2])
X3 = np.array([1, 6])
epsilon = 0.3
m = 2
n_offspring = 1000

offspring, V1, V2, V3, CoM = SPX_2D(X1, X2, X3, epsilon, n_offspring, m)

# Finally, you can plot the parents and offspring to produce Figure 3
import matplotlib.pyplot as plt

# Plot the parents
x1, y1 = zip(X1, X2, X3)
plt.scatter(x1, y1, s=50, label='Parents')

# Plot the center of mass
x2, y2 = zip(CoM)
plt.scatter(x2, y2, s=150, label='Center of Mass')

# Plot the expanded simplex vertices
x3, y3 = zip(V1, V2, V3)
plt.scatter(x3, y3, s=100, label='Simplex Vertices')

# Plot the offspring
x4, y4 = zip(*offspring)
plt.scatter(x4, y4, s=0.05, label='Offspring')

plt.legend()
plt.show()

Figure 3

### General Case

The general case is denoted as SPX-n-m-ε, with n parameters and m
parents. Each parent solution vector lies in R^n. Thinking in terms of biology, a parent vector mimics a chromosome, with its n parameters as its different traits. The general case can be outlined in four steps:

1. m parent vectors are selected from the population pool.
2. R^n is then divided into h sets of R^(m-1) spaces and one R^q space. I found this easier when I thought of it in the context of a chromosome, i.e. a large parent vector of length n being divided into smaller sub-vectors of length m-1, and a remainder vector of length q. See Figure 4 below:

Figure 4

3. In each R^(m-1) space, the m parent vectors are used to generate offspring using the expanded simplex, as described in the two-dimensional, three parent, two parameter case. Again using the chromosome analogy, Figure 5 shows how this can be depicted for m=3 parents and, for example, n=9 parameters: R^9 is divided into four (h = integer(n/(m-1)) = 4) R^2 (m-1 = 2) spaces and one R^1 space (q = remainder(n/(m-1)) = 1).

Figure 5

4. The sets of offspring in R^(m-1) produced in step 3, together with the vector in R^q (which remains unchanged), are then combined in their correct positions to form an offspring vector in R^n.

The code for the general SPX case can be found in David Hadka's github.

### References

# Using Borg in Parallel and Serial with a Python Wrapper – Part 2

This blog post is Part 2 of a two-part series that demonstrates how I have coupled a pure Python simulation model with the Borg multi-objective evolutionary algorithm (MOEA). I recommend reading Part 1 of this series before you read Part 2. In Part 1, I explain how to get Borg and provide sample code showing how you can access Borg's serial and/or parallelized (master-slave) implementations through a Python wrapper (borg.py).
In Part 2, I provide details for more advanced simulation-optimization setups that require you to pass additional information from the borg wrapper into the simulation model (the "function evaluation") other than just decision variable values.

In Part 1, the example simulation model I use (PySedSim) is called through a function handle "Simulation_Caller" in the example_sim_opt.py file. Borg needs only this function handle to properly call the simulation model in each "function evaluation". Borg's only interaction with the simulation model is to pass the simulation model's function handle (e.g., "Simulation_Caller") the decision variables, and nothing else. In many circumstances, this is all you need. However, as your simulation-optimization setup becomes more complex, you may need to pass additional arguments from Borg to the simulation model (i.e., the function evaluation) for it to execute properly. For example, in my own use of Borg in a simulation-optimization setting, I first import a variety of assumptions and preferences to set up a Borg-PySedSim run. Some of those assumptions and preferences are helpful to the simulation model (PySedSim) in determining how to make use of the decision variable values Borg feeds it. So, I would like to pass those relevant assumptions and preferences directly into the Borg wrapper (borg.py), so the wrapper can in turn pass them directly into the simulation model along with the decision variable values.

Before I show how to do this, let me provide a more concrete example of how/why I am doing this in my own research. In my current work, decision variable values represent parameters for a reservoir operating policy that is being optimized. The simulation model needs to know how to take the decision variable values and turn them into a useful operating policy that can be simulated.
Some of this information gets imported in order to run Borg, so I might as well pass that information directly into the simulation model while I have it on hand, rather than importing it once again in the simulation model.

To do what I describe above, we just need to modify the two functions in the example_sim_opt.py module so that a new argument "additional_inputs" is passed from borg to the simulation handle. Using my python code from blog post 1, I provide code below with modifications in the Simulation_Caller() function (the new additional_inputs argument and the unpacking of its contents) and in the Optimization() function (the two borg_dict definitions and the add_sim_inputs argument to the Borg constructor). After that code, I then indicate how I modify the borg.py wrapper so it can accept this information.

import numpy as np
import pysedsim  # This is your simulation model
import platform  # helps identify directory locations on different types of OS

def Simulation_Caller(vars, additional_inputs):
    '''
    Purpose: Borg calls this function to run the simulation model and return multi-objective performance.

    Note: You could also just put your simulation/function evaluation code here.

    Args:
        vars: A list of decision variable values from Borg
        additional_inputs: A list of python data structures you want to pass from Borg into the simulation model.

    Returns:
        performance: policy's simulated objective values. A list of objective values, one value for each of the objectives.
    '''

    borg_vars = vars  # Decision variable values from Borg

    # Unpack lists of additional inputs from Borg (example assumes additional inputs is a python list with two items)
    borg_dict_1 = additional_inputs[0]
    borg_dict_2 = additional_inputs[1]

    # Reformat decision variable values as necessary (e.g., cast borg output parameters as array for use in simulation)
    op_policy_params = np.asarray(borg_vars)

    # Call/run simulation model with decision vars and additional relevant inputs, return multi-objective performance:
    performance = pysedsim.PySedSim(decision_vars=op_policy_params, sim_addl_input_1=borg_dict_1,
                                    sim_addl_input_2=borg_dict_2)
    return performance

def Optimization():
    '''
    Purpose: Call this method from command line to initiate simulation-optimization experiment

    Returns:
        --pareto approximate set file (.set) for each random seed
        --Borg runtime file (.runtime) for each random seed
    '''

    import borg as bg  # Import borg wrapper

    parallel = 1  # 1 = master-slave (parallel), 0 = serial

    # The following are just examples of relevant MOEA specifications. Select your own values.
    nSeeds = 25  # Number of random seeds (Borg MOEA)
    num_dec_vars = 10  # Number of decision variables
    n_objs = 6  # Number of objectives
    n_constrs = 0  # Number of constraints
    num_func_evals = 30000  # Number of total simulations to run per random seed. Each simulation may be a monte carlo.
    runtime_freq = 1000  # Interval at which to print runtime details for each random seed
    decision_var_range = [[0, 1], [4, 6], [-1, 4], [1, 2], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]]
    epsilon_list = [50000, 1000, 0.025, 10, 13, 4]  # Borg epsilon values for each objective
    borg_dict_1 = {'simulation_preferences_1': [1, 2]}  # reflects data you want Borg to pass to simulation model
    borg_dict_2 = {'simulation_preferences_2': [3, 4]}  # reflects data you want Borg to pass to simulation model

    # Where to save seed and runtime files
    main_output_file_dir = 'E:\output_directory'  # Specify location of output files for different seeds
    os_fold = Op_Sys_Folder_Operator()  # Folder operator for operating system
    output_location = main_output_file_dir + os_fold + 'sets'

    # If using master-slave, start MPI. Only do once.
    if parallel == 1:
        bg.Configuration.startMPI()  # start parallelization with MPI

    # Loop through seeds, calling borg.solve (serial) or borg.solveMPI (parallel) each time
    for j in range(nSeeds):
        # Instantiate borg class, then set bounds, epsilon values, and file output locations
        borg = bg.Borg(num_dec_vars, n_objs, n_constrs, Simulation_Caller,
                       add_sim_inputs=[borg_dict_1, borg_dict_2])
        borg.setBounds(*decision_var_range)  # Set decision variable bounds
        borg.setEpsilons(*epsilon_list)  # Set epsilon values
        # Runtime file path for each seed:
        runtime_filename = main_output_file_dir + os_fold + 'runtime_file_seed_' + str(j + 1) + '.runtime'
        if parallel == 1:
            # Run parallel Borg
            result = borg.solveMPI(maxEvaluations=num_func_evals, runtime=runtime_filename,
                                   frequency=runtime_freq)
        if parallel == 0:
            # Run serial Borg
            result = borg.solve({"maxEvaluations": num_func_evals, "runtimeformat": 'borg',
                                 "frequency": runtime_freq, "runtimefile": runtime_filename})

        if result:
            # This particular seed is now finished being run in parallel. The result will only be
            # returned from one node in case running Master-Slave Borg.
            result.display()

            # Create/write objective values and decision variable values to files in folder "sets", 1 file per seed.
            f = open(output_location + os_fold + 'Borg_DPS_PySedSim' + str(j + 1) + '.set', 'w')
            f.write('#Borg Optimization Results\n')
            f.write('#First ' + str(num_dec_vars) + ' are the decision variables, ' + 'last ' +
                    str(n_objs) + ' are the ' + 'objective values\n')
            for solution in result:
                line = ''
                for i in range(len(solution.getVariables())):
                    line = line + (str(solution.getVariables()[i])) + ' '
                for i in range(len(solution.getObjectives())):
                    line = line + (str(solution.getObjectives()[i])) + ' '
                f.write(line[0:-1] + '\n')
            f.write("#")
            f.close()

            # Create/write only objective values to files in folder "sets", 1 file per seed. Purpose is so that
            # the file can be processed in MOEAFramework, where performance metrics may be evaluated across seeds.
            f2 = open(output_location + os_fold + 'Borg_DPS_PySedSim_no_vars' + str(j + 1) + '.set', 'w')
            for solution in result:
                line = ''
                for i in range(len(solution.getObjectives())):
                    line = line + (str(solution.getObjectives()[i])) + ' '
                f2.write(line[0:-1] + '\n')
            f2.write("#")
            f2.close()

            print("Seed %s complete" % j)

    if parallel == 1:
        bg.Configuration.stopMPI()  # stop parallel function evaluation process

def Op_Sys_Folder_Operator():
    '''
    Function to determine whether operating system is (1) Windows, or (2) Linux

    Returns folder operator for use in specifying directories (file locations) for reading/writing
    data pre- and post-simulation.
    '''
    if platform.system() == 'Windows':
        os_fold_op = '\\'
    elif platform.system() == 'Linux':
        os_fold_op = '/'
    else:
        os_fold_op = '/'  # Assume unix OS if it can't be identified
    return os_fold_op

Next, you will need to acquire the Borg wrapper using the instructions I specified in my previous blog post.
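The pattern above — the wrapper handing your function both the decision variables and a bundle of extra inputs — can be sketched without Borg or PySedSim (every name below is invented for illustration):

```python
def fake_simulation(decision_vars, additional_inputs):
    # stand-in for the simulation model: two "objectives" computed from the
    # decision variables and a preferences dict passed through the wrapper
    prefs = additional_inputs[0]
    scale = prefs['scale']
    return [scale * sum(decision_vars), scale * max(decision_vars)]

# stand-in for the wrapper's inner call: Borg supplies only the decision
# variables; the additional inputs were captured when the problem was set up
addl_inputs = [{'scale': 2.0}]
objs = fake_simulation([1.0, 2.0, 3.0], addl_inputs)
```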
You will need to make only two modifications: (1) modify the Borg class in borg.py so it accepts the inputs you want to pass to the simulation; and (2) some additional internal accounting in borg.py to ensure those inputs are passed to the borg.py methods that deal with your function handle. I will address these two in order.

First, modify the Borg class in borg.py so it now accepts an additional input (I only show some of the borg.py code here, just to indicate where changes are being made):

class Borg:
    def __init__(self, numberOfVariables, numberOfObjectives, numberOfConstraints, function,
                 epsilons=None, bounds=None, directions=None, add_sim_inputs=None):
        # add_sim_inputs is the new input you will pass to borg

Then, modify the portion of the borg.py wrapper where self.function is called, so it can accommodate any simulation inputs you have specified:

if add_sim_inputs is None:
    self.function = _functionWrapper(function, numberOfVariables, numberOfObjectives,
                                     numberOfConstraints, directions)
else:
    # More simulation inputs are specified and can be passed to the simulation handle
    self.function = _functionWrapper(function, numberOfVariables, numberOfObjectives,
                                     numberOfConstraints, directions, addl_inputs=add_sim_inputs)

After the above, the last step is to modify the _functionWrapper method in borg.py:

def _functionWrapper(function, numberOfVariables, numberOfObjectives, numberOfConstraints,
                     directions=None, addl_inputs=None):
    # addl_inputs will be passed into the simulation model
    def innerFunction(v, o, c):
        global terminate
        try:
            if addl_inputs is None:
                result = function(*[v[i] for i in range(numberOfVariables)])
            else:
                result = function([v[i] for i in range(numberOfVariables)], addl_inputs)

# Using Borg in Parallel and Serial with a Python Wrapper – Part 1

Simulation and optimization are frequently used to solve complex water resources and environmental systems problems.
By itself, a simulation model begs the question "what to simulate?" Similarly, by itself, an optimization model begs the question "is the solution really best?" For this reason, simulation and optimization models are frequently coupled.

This blog post is part 1 of a multi-part series that will demonstrate how I have coupled a pure Python simulation model with the multi-objective evolutionary optimization algorithm Borg. In this post, I will show how you can access Borg's serial and/or parallelized (master-slave) implementations through a Python wrapper (borg.py). Please see this previous blog post for some background about Borg, and how to obtain it. My instructions below assume you have access to the Borg files.

In the setup I will describe below, Borg parameterizes and iteratively refines solutions (e.g., reservoir operating policies) to a problem, optimizing them in response to their simulated performance with respect to multiple objectives.

You will need the following Borg files (see link above for how to download these):

• Serial (i.e., borg.c, libborg.so, etc.) and/or master-slave (i.e., borgms.c, libborgms.so, etc.) implementations of Borg, depending upon your ability to parallelize.
• Python wrapper for Borg (borg.py), which will allow you to access Borg easily in Python.

You will need to create the following files yourself (I provide sample code below for these files):

• example_sim_opt.py — a python module that should contain two main functions:
  1. A simulation caller, which takes decision variables and returns multi-objective performance. This function is called "Simulation_Caller" in the example_sim_opt.py code below.
  2. An optimization function, which calls the Borg MOEA through its python wrapper borg.py. This borg.py wrapper provides extensive docstring documentation regarding required arguments, returned values, etc., so I do suggest reading through the wrapper if you have questions (e.g., about the python data types of arguments and returns).
Note that the file and function names above are just example names; you can name these files whatever you want, as long as you modify the code below to reflect the new names. A sample of code for example_sim_opt.py is as follows:

```python
import numpy as np
import pysedsim  # This is your simulation model
import platform  # helps identify directory locations on different types of OS

def Simulation_Caller(vars):
    '''
    Purpose: Borg calls this function to run the simulation model and return multi-objective performance.

    Note: You could also just put your simulation/function evaluation code here.

    Args:
        vars: A list of decision variable values from Borg

    Returns:
        performance: policy's simulated objective values. A list of objective values, one value for each of the
        objectives.
    '''

    borg_vars = vars  # Decision variable values from Borg

    # Reformat decision variable values as necessary (e.g., cast borg output parameters as array for use in simulation)
    op_policy_params = np.asarray(borg_vars)
    # Call/run simulation model, return multi-objective performance:
    performance = pysedsim.PySedSim(decision_vars = op_policy_params)
    return performance

def Optimization():
    '''
    Purpose: Call this method from command line to initiate simulation-optimization experiment

    Returns:
        --pareto approximate set file (.set) for each random seed
        --Borg runtime file (.runtime) for each random seed
    '''

    import borg as bg  # Import borg wrapper

    parallel = 1  # 1 = master-slave (parallel), 0 = serial

    # The following are just examples of relevant MOEA specifications. Select your own values.
    nSeeds = 25  # Number of random seeds (Borg MOEA)
    num_dec_vars = 10  # Number of decision variables
    n_objs = 6  # Number of objectives
    n_constrs = 0  # Number of constraints
    num_func_evals = 30000  # Number of total simulations to run per random seed. Each simulation may be a Monte Carlo.
    runtime_freq = 1000  # Interval at which to print runtime details for each random seed
    decision_var_range = [[0, 1], [4, 6], [-1, 4], [1, 2], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1], [0, 1]]
    epsilon_list = [50000, 1000, 0.025, 10, 13, 4]  # Borg epsilon values for each objective

    # Where to save seed and runtime files
    main_output_file_dir = r'E:\output_directory'  # Specify location of output files for different seeds
    os_fold = Op_Sys_Folder_Operator()  # Folder operator for operating system
    output_location = main_output_file_dir + os_fold + 'sets'

    # If using master-slave, start MPI. Only do once.
    if parallel == 1:
        bg.Configuration.startMPI()  # start parallelization with MPI

    # Loop through seeds, calling borg.solve (serial) or borg.solveMPI (parallel) each time
    for j in range(nSeeds):
        # Instantiate borg class, then set bounds, epsilon values, and file output locations
        borg = bg.Borg(num_dec_vars, n_objs, n_constrs, Simulation_Caller)
        borg.setBounds(*decision_var_range)  # Set decision variable bounds
        borg.setEpsilons(*epsilon_list)  # Set epsilon values
        # Runtime file path for each seed:
        runtime_filename = main_output_file_dir + os_fold + 'runtime_file_seed_' + str(j+1) + '.runtime'
        if parallel == 1:
            # Run parallel Borg
            result = borg.solveMPI(maxEvaluations=num_func_evals, runtime=runtime_filename,
                                   frequency=runtime_freq)
        if parallel == 0:
            # Run serial Borg
            result = borg.solve({"maxEvaluations": num_func_evals, "runtimeformat": 'borg',
                                 "frequency": runtime_freq, "runtimefile": runtime_filename})
        if result:
            # This particular seed is now finished being run in parallel. The result will only be returned from
            # one node in case running Master-Slave Borg.
            result.display()

            # Create/write objective values and decision variable values to files in folder "sets", 1 file per seed.
            f = open(output_location + os_fold + 'Borg_DPS_PySedSim' + str(j+1) + '.set', 'w')
            f.write('#Borg Optimization Results\n')
            f.write('#First ' + str(num_dec_vars) + ' are the decision variables, last ' + str(n_objs) +
                    ' are the objective values\n')
            for solution in result:
                line = ''
                for i in range(len(solution.getVariables())):
                    line = line + str(solution.getVariables()[i]) + ' '
                for i in range(len(solution.getObjectives())):
                    line = line + str(solution.getObjectives()[i]) + ' '
                f.write(line[0:-1] + '\n')
            f.write("#")
            f.close()

            # Create/write only objective values to files in folder "sets", 1 file per seed. Purpose is so that
            # the file can be processed in MOEAFramework, where performance metrics may be evaluated across seeds.
            f2 = open(output_location + os_fold + 'Borg_DPS_PySedSim_no_vars' + str(j+1) + '.set', 'w')
            for solution in result:
                line = ''
                for i in range(len(solution.getObjectives())):
                    line = line + str(solution.getObjectives()[i]) + ' '
                f2.write(line[0:-1] + '\n')
            f2.write("#")
            f2.close()

            print("Seed %s complete" % str(j+1))

    if parallel == 1:
        bg.Configuration.stopMPI()  # stop parallel function evaluation process

def Op_Sys_Folder_Operator():
    '''
    Function to determine whether operating system is (1) Windows, or (2) Linux.

    Returns the folder operator for use in specifying directories (file locations) for reading/writing data
    pre- and post-simulation.
    '''

    if platform.system() == 'Windows':
        os_fold_op = '\\'
    elif platform.system() == 'Linux':
        os_fold_op = '/'
    else:
        os_fold_op = '/'  # Assume unix OS if it can't be identified
    return os_fold_op
```

The following is an example of how you would submit a batch script on a Linux cluster to run a parallelized simulation-optimization experiment using the example_sim_opt.py and borg.py files. Note that in the parallelized version, simulations (i.e., "function evaluations") are run in parallel by separate processors. You would need the following two files: 1. example_sim_opt_caller.py.
This is a Python file used to call example_sim_opt.py. 2. example_sim_opt_batch_script.pbs. This is a batch script that runs example_sim_opt_caller.py in parallel on a cluster using Open MPI. Example code for example_sim_opt_caller.py:

```python
'''
Purpose: To initiate the optimization process, which will iteratively call the simulation model.
'''

import example_sim_opt  # Import main optimization module (within this example) that uses the borg python wrapper

example_sim_opt.Optimization()
```

Example code for example_sim_opt_batch_script.pbs:

```shell
#PBS -l nodes=8:ppn=16
#PBS -l walltime=24:00:00
#PBS -j oe
#PBS -o pysedsim_output.out

cd $PBS_O_WORKDIR
source /etc/profile.d/modules.sh
```
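As a quick sanity check on the optimization output, the .set files written above can be read back into Python for plotting or metric calculation. Below is a minimal sketch: the file layout it assumes ('#'-prefixed header/footer lines around whitespace-separated rows of decision variables followed by objectives) matches the writing code above, but the helper name and its signature are my own, not part of the Borg wrapper.

```python
import numpy as np

def load_set_file(path, n_objs, n_dec_vars=0):
    '''Read a Borg .set file, skipping '#' comment lines, and return
    (decision variables, objectives) as two NumPy arrays.'''
    rows = []
    with open(path) as f:
        for raw in f:
            line = raw.strip()
            if not line or line.startswith('#'):
                continue  # skip header and footer comment lines
            rows.append([float(v) for v in line.split()])
    data = np.array(rows)
    # First n_dec_vars columns are decision variables, next n_objs are objectives
    return data[:, :n_dec_vars], data[:, n_dec_vars:n_dec_vars + n_objs]
```

For the "no_vars" files, call it with n_dec_vars=0 so every column is treated as an objective.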