Runtime Visualization of MOEA with Platypus in Python

The Platypus framework provides us with a Python library to solve and analyze multi-objective problems conveniently. In this notebook, we use Platypus to solve a multi-objective optimization problem – the 3 objective version of DTLZ2 (Deb et al., 2002a) – using the evolutionary algorithm NSGA-II (Deb et al., 2002b). We generate runtime visualizations for snapshots of the algorithm to help build an understanding of how the evolutionary algorithm works. Specifically, we are going to look at the population at each generation, their ranks (a key parameter NSGA-II uses to select offspring), the parallel coordinates plot of the current population, and runtime indicators such as hypervolume and generational distance.

Setting up the problem

I created a Google Colab for this post. Since this post is equivalent to the Google Colab file, you may choose to look at either one. This project is intended to be used in a Jupyter Notebook environment with Python.

First, we import the Python libraries we need for the visualization.

import numpy as np
import math
import pandas as pd
from pandas.plotting import parallel_coordinates
import random
from tqdm.notebook import tqdm

import matplotlib.pyplot as plt
from matplotlib import animation, rc, rcParams
rc('animation', html='jshtml')

from mpl_toolkits.mplot3d import Axes3D

!pip install platypus-opt
from platypus import (NSGAII, NSGAIII, DTLZ2, Hypervolume, EpsilonBoxArchive,
                      Solution, GenerationalDistance, InvertedGenerationalDistance,
                      Hypervolume, EpsilonIndicator, Spacing)

Our goal is to visualize the population at each generation of the evolutionary algorithm. To do so, we utilize an interface provided by the Platypus library: the callback function.

At each iteration of the algorithm, the callback function (if initialized) is called. We define our callback function to store the current number of function evaluations (algorithm.nfe) and all the data points in the current population (algorithm.result). Each population has the type Solution defined in Platypus, which not only contains the values of the variables but also attributes specific to the solver, such as ranks and crowding distances. We will access some of these attributes during visualization.

We also define the frequency of NFEs that we want to store the values. Saving a snapshot of the algorithm at every single iteration may be too expensive or unnecessary. Hence, we set a lapse for some number of iterations, in which the callback function does nothing unless the last NFE is more than the frequency amount apart from this NFE.

In the following example, we set up the problem DTLZ2 and use the callback function to solve it using NSGAII.

#define the frequency
frequency = 200

# define the problem definition
problem = DTLZ2(3)

# instantiate the optimization algorithm
algorithm = NSGAII(problem, divisions_outer=12)

# define callback function
solutions_list = []
hyp = []
nfe = []
last_calc = 0

def DTLZ2_callback(algorithm):
  global last_calc
  if algorithm.nfe == last_calc + frequency:
    last_calc = algorithm.nfe
    nfe.append(algorithm.nfe)
    solutions_list.append(algorithm.result);

# optimize the problem using 10,000 function evaluations
algorithm.run(10000,DTLZ2_callback)

In order to calculate the metrics, we first need to define our reference set. For DTLZ2(3), our points lie on the unit sphere in the first octant. Let’s randomly select 1000 such points and plot them.

# generate the reference set for 3D DTLZ2
reference_set = EpsilonBoxArchive([0.02, 0.02, 0.02])

for _ in range(1000):
    solution = Solution(problem)
    solution.variables = [random.uniform(0,1) if i < problem.nobjs-1 else 0.5 for i in range(problem.nvars)]
    solution.evaluate()
    reference_set.add(solution)

fig_ref = plt.figure()
ax_ref = fig_ref.add_subplot(projection="3d")
ax_ref.scatter(
              [s.objectives[0] for s in reference_set],
              [s.objectives[1] for s in reference_set],
              [s.objectives[2] for s in reference_set],
              )

Given the reference set, we can now calculate the indicators for each population across iterations. The Platypus library provides us with all the functions we need to calculate the following indicators: generational distance, hypervolume, epsilon indicator, and spacing.

We initialize them in a dictionary and iterate over all saved populations to calculate the indicators.

# calculate the indicators
indicators = {"gd" : GenerationalDistance(reference_set),
              "hyp" : Hypervolume(reference_set),
              "ei" : EpsilonIndicator(reference_set),
              "sp" : Spacing()}

indicator_results = {index : [] for index in indicators}

for indicator in tqdm(indicator_results):
  for solution in tqdm(solutions_list):
    indicator_results[indicator] += [indicators[indicator].calculate(solution)]

Setting up the visualization

At this point, we have the data we need to perform runtime visualizations. We will utilize the animation.FuncAnimation function in matplotlib to create interactive animations in Jupyter Notebook (or Google Colab). The idea behind creating such animations is to first initialize a static figure, and then define an update function to let the FuncAnimation know how to visualize new data for each iteration.

We define drawframe() that does the following: (1) clear the axis, so that previous data points are wiped out; (2) draw the new data points; (3) reset the limits of data axes so that the axes are consistent across frames; (4) update new data for indicator axes.

def drawframe(n):

    # clear axes
    ax.cla()
    ax_parallel.cla()

    # save results
    result = solutions_list[n]
    crowding_distances = [s.crowding_distance if s.crowding_distance != math.inf else 0 for s in result]
    ranks = [s.rank for s in result]

    result = solutions_list[n]
    points = {
              'X': [s.objectives[0] for s in result],
              'Y': [s.objectives[1] for s in result],
              'Z': [s.objectives[2] for s in result],
              'rank': [s.rank for s in result],
              'tag' : ['tag' for s in result]
    }
    df = pd.DataFrame(points)

    # update new data points
    ax.scatter(points['X'], points['Y'], points['Z'],
              c = ranks,
              alpha = 0.5,
              linestyle="", marker="o",
              cmap=cmap,
              vmax=max_rank,
              vmin=0)
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)
    ax.set_zlim(zlim)
    ax.set_title('Solve DTLZ2, NFE = ' + str(nfe[n]))

    # update the parallel coordinates plot
    parallel_coordinates(df[['X','Y','Z','tag']], 'tag', ax=ax_parallel)

    # update indicator plots
    for indicator in indicator_axes:
      indicator_ax = indicator_axes[indicator]
      indicator_ax.plot(nfe[:n],indicator_results[indicator][:n], c = 'r')
      indicator_ax.set_xlim(left = min(nfe), right=max(nfe))
      indicator_ax.set_ylim(bottom = min(indicator_results[indicator]), top = max(indicator_results[indicator]))

With the drawframe() function, we create a static figure that initializes the axes we will feed into the drawframe() function. The initialization does the following: (1) set up the subplots of the figure; (2) calculate the maximum ranks of all points in all populations to determine the color mapping; (3) load the points from the first iteration; (4) initialize the scatter plots, parallel coordinates plot, and indicator plots.

fig = plt.figure(figsize=(20,10), dpi = 70)
fig.suptitle("Runtime Visualization", fontsize=20)
fig.subplots_adjust(wspace=0.3, hspace=0.3)
ax = fig.add_subplot(2, 3, 1, projection="3d")
ax_parallel = fig.add_subplot(2,3,4)

indicator_axes = {"gd" : fig.add_subplot(2, 3, 2),
                  "hyp" : fig.add_subplot(2, 3, 3),
                  "ei" : fig.add_subplot(2, 3, 5),
                  "sp" : fig.add_subplot(2, 3, 6)}

indicator_names = {"gd" : "Generational Distance",
                  "hyp" : "Hypervolume",
                  "ei" : "Epsilon Indicator",
                  "sp" : "Spacing"}

# load the ranks of all points
all_rank = [s.rank for result in solutions_list for s in result]
max_rank = max(all_rank)

# define the colormap
cmap = plt.get_cmap('Accent', max_rank)

# load the points from the first iteration
result = solutions_list[0]
points = {
    'X': [s.objectives[0] for s in result],
    'Y': [s.objectives[1] for s in result],
    'Z': [s.objectives[2] for s in result],
    'rank': [s.rank for s in result],
    'tag' : ['tag' for s in result]
}

df = pd.DataFrame(points)

# create the scatter plot
graph = ax.scatter(
              points['X'], points['Y'], points['Z'],
              c = points['rank'],
              alpha = 0.5,
              cmap=cmap,
              linestyle="", marker="o",
              vmax=max_rank,
              vmin=0
          )

# create the parallel coordinates plot
parallel_coordinates(df[['X','Y','Z','tag']], 'tag', ax=ax_parallel)

plt.colorbar(graph, label='Rank', pad = 0.2)

# save the dimensions for later use
xlim = ax.get_xlim()
ylim = ax.get_ylim()
zlim = ax.get_zlim()
title = ax.set_title("DTLZ2 Static Figure")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("z")

# initialize subplots for each indicator
for indicator in indicator_axes:
  indicator_axes[indicator].plot(nfe[:0],indicator_results[indicator][:0])
  indicator_axes[indicator].set_title(indicator_names[indicator] + " vs NFE")
  indicator_axes[indicator].set_xlabel("NFE")
  indicator_axes[indicator].set_ylabel(indicator_names[indicator])

Now we are ready to create an animation. We use the FuncAnimation() function, the initialized figure, and the drawframe() function to create the animation.

In the Jupyter Notebook or Google Colab, you will be able to play the animation using the play button at the bottom of the image. By default, the animation will play in a loop. You may select “once” so that the animation freezes in the last frame. Important: Before re-generating the animation, be sure to re-run the previous initialization so that the figure is reset. Otherwise, you may see overlapping points/lines.

ani = animation.FuncAnimation(fig, drawframe, frames=len(nfe), interval=20, blit=False)
ani

This visualization shows how the algorithm progresses as NFE grows. For the set of solutions, we clearly see how they converge to the reference set. Moreover, more and more points have lower ranks, indicating they are getting closer to the Pareto Front (points on the Pareto Front have rank = 0). The parallel coordinates plot shows how our solutions get narrowed down and the tradeoffs we could make. Finally, the four indicator plots track the performance of our algorithm as NFE increases. The trajectory of generational distance, hypervolume, and epsilon indicator suggests convergence.

In conclusion, the project highlights the potential of the Platypus library in Python in providing valuable insights into the progress of evolutionary algorithms, not just their final outcomes. Through the use of NSGA-II as an illustrative example, we have demonstrated the ability to monitor the ranks of points across generations. In Dave’s post, the runtime visualizations revealed the changing probabilities of variators across iterations. These findings emphasize the power of incorporating dynamic techniques to gain a comprehensive understanding of the runtime behavior of MOEA algorithms. I hope this project opens doors to further explore, visualize, and analyze the dynamics of evolutionary algorithms.

References

[1] Deb, K., Thiele, L., Laumanns, M., & Zitzler, E. (2002a). Scalable multi-objective optimization test problems. In Proceedings of the 2002 Congress on Evolutionary Computation. CEC’02 (Cat. No. 02TH8600) (Vol. 1, pp. 825-830). IEEE.

[2] Deb, K., Pratap, A., Agarwal, S., & Meyarivan, T. A. M. T. (2002b). A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE transactions on evolutionary computation, 6(2), 182-197.

[3] Gold, D. (2020, May 6). Beyond Hypervolume: Dynamic visualization of MOEA runtime. Water Programming: A Collaborative Research Blog. https://waterprogramming.wordpress.com/2020/05/06/beyond-hypervolume-dynamic-visualization-of-moea-runtime/

[4] Python – How to create 3D scatter animations – Stack Overflow https://stackoverflow.com/questions/41602588/how-to-create-3d-scatter-animations

[5] GitHub – Project-Platypus/Platypus: A Free and Open Source Python Library for Multiobjective Optimization https://github.com/Project-Platypus/Platypus

Infrastructure Investment Selection as a Two-Stage Stochastic Programming Problem (Part 1)

About Stochastic Programming

Stochastic programming (SP) is an approach to decision-making under uncertainty (alongside robust optimization and chance constrained optimization). It is an extension of deterministic optimization techniques such as linear (or non-linear) programming that incorporates stochastic elements (hence the name) in the form of multiple potential future scenarios. Overall, the aim of stochastic programming is to identify optimal solutions to a given problem that perform well under a finite range of possible scenarios. This is done by optimizing the expected value of the objective function over all scenarios. For clarification, a “scenario” in the context of SP is defined as the combined realization (expected outcome) of all future uncertainties.

There are two versions of an SP problem: the two-stage SP problem consists of two decision-making stages, where the first stage decisions represent the decisions taken before the uncertainty (the future scenario, in this case) unfolds and becomes known. The second stage decisions are taken once the future has unfolded and corrective action needs to be taken. Two-stage SP problems are typically used for scenario planning, where a decision-maker’s goal is to identify a solution that will result in good performance across all potential scenarios. Next, the multi-stage SP problem with s-stages is an extension of the two-stage SP problem, where the decision at stage s is dependent on all decisions made up until s. Due to the problem’s high dimensionality, the multi-stage SP problem is typically represented using a scenario tree with each branch associated with a probable outcome and its probability and each level associated with a stage s.

In this post, I will provide a two-stage stochastic programming example discuss the pros and cons of its use. In the following post, I will and demonstrate how to approach and solve such a problem using the Python cvxpy library.

The infrastructure selection example

Disclaimer: All values are arbitrary .

A water utility serving the residents of Helm’s Keep (57,000 households) are deciding on a water supply storage expansion project to meet projected increase in demand and wants to identify a suitable yet affordable project to invest in:

  • Option 1: Water treatment capacity expansion on the distant (but yet untapped) Lake Heleborn
  • Option 2: The construction of small local reservoir on the outskirts of Helm’s Keep
  • Option 3: Construction of a new groundwater pumping station

Each of these infrastructure options are associated with the following capital investment costs. The variable capital cost and variable operating cost are functions of the infrastructure options and expected demand (both in MGD).

Fixed capital cost ($mil)Variable capital cost ($mil/MGD capacity)Variable operating cost ($mil/MGD demand)
Option 115.67.50.2
Option 211.94.70.5
Option 313.95.10.9

Helm’s Keep’s priority in the selection of its new water supply infrastructure is to minimize its total infrastructure net present cost:

f_{INPC}=\sum_{y=1}^{T}\frac{PMT}{(1+d)^y}

where INPC is the infrastructure net present cost, T=30 is bond term for an infrastructure option, and  is the discount rate for a given year y of debt service payment d for an infrastructure option since its bond was issued. The value of PMT is calculated as follows:

\frac{P[BR(1+BR)^T]}{(1+BR)^{T}-1}

where P is the bond principal (the total value of the bond), BR is the bond interest rate to be paid to the creditor throughout the bond term T. All payments are discounted to their present value. The value of the bond principal is dependent on the capacity of the infrastructure option and is calculated as a function of the fixed and variable capital costs:

P_i = C_iy_i + VCC_ix_i

such that C_i is the capital cost of infrastructure option i and y_i is a binary {0,1} variable indicating if option i is chosen. VCC_i is the variable capital cost of infrastructure option i, and x_i is the final built capacity of i.

In addition, there is some water loss in transporting water from either location due to aging infrastructure, which will result in losses caused by non-revenue water (NRW). The losses associated with option are listed below:

  • Option 1: 20% water loss
  • Option 2: 10% water loss
  • Option 3: 12% water loss

Helm’s Keep must also consider the debt is undertaking relative to its potential annual net revenue R_{s,i}, calculated as follows:

R_{s,i}=[(D_{s,i}\times UR \times \text{households})(1-\rho_{i})] - OC_{s,i}

where D_{s,i} is per-capita daily water delivered, UR=\$0.06/MGD is the per-household uniform water rate, and \rho_{i} is the percentage of water lost during transmission given that infrastructure i was chosen. OC_{s,i} is Helm’s Keep’s daily operating costs per capita where

OC_{s,i}=VOC_i \frac{D_{s,i}}{\rho_i}

such that D_{s,i} is the demand in scenario s for infrastructure option i, VOC_{s,i} is the variable operating cost of i, and \rho_i is the percentage of NRW.

However, it must also meet the following constraints:

  • The infrastructure option decision is exclusive (only one option can be selected).
  • The total capacity of the chosen infrastructure option cannot be less than 1.2 times that of daily demand increase (after accounting for NRW).
  • Daily deliveries be at least equal to demand (after accounting for NRW).
  • All water produced (after accounting for NRW) are considered for-revenue water and result in gross revenue.
  • Annual payments PMT_{s,i} should not exceed 1.25 times that of Helm’s Keep’s annual net revenue R_{s,i}.

There are three possible scenarios s \in S = {1,2,3} of water demand increase \delta_s, Federal Reserve bond interest rate, and discount rate. The probability of each scenario is listed in the table below:

ScenariosDemand increase (MGD)Bond rate (%)Discount rate (%)Probability (%)
Scenario 14.44.33.10.35
Scenario 25.22.62.60.41
Scenario 33.95.22.50.24

Helm’s Keep faces the following decisions:

  • Stage 1: Which infrastructure option i to build and to what capacity x_i for i \forall {1,2,3}?
  • Stage 2: How much will Helm’s Keep’s corresponding bond principal be how much water is delivered in each scenario?

Formulating the problem

Let the first-stage decision variables be:

  • Infrastructure option selection y_i where y_1 + y_2 + y_3 = 1 and y_i \in {0,1} \forall i=[1,2,3]
  • Capacity of each infrastructure option x_i \forall i = {1,2,3}

Next, we set up the second stage decision variables. Let D_s, BR_s and d_s be the amount of water delivered, bond rate, and discount rate, in scenario s \in S = {1,2,3}. This will result in the following problem setup:

min  f_{INPC} = \sum_{i=1}^{N}\sum_{s=1}^{S}p_s\sum_{y=1}^{30}\frac{PMT_{s,i}}{(1+d_{s,i})^y}

where

PMT_{s,i}=(C_i y_i + VCC_i x_i) \frac{[BR_{s,i}(1+BR_{s,i})^{30}]}{(1+BR_{s,i})^{30} -1}

with constraints

y_1 + y_2 + y_3 = 1

D_{s,i} \geq \frac{\delta_{s}}{1-MAX(\rho)}

D_{s,i} \geq \sum_{i=1}^{3}\frac{x_i}{1-\rho_i}

D_{s,i} \leq 8

PMT_{s,i} \leq 1.25R_{s,1}

x_i \leq My_i

such that

R_{s,i} = [(3420D_{s,i})(1-\rho_i)] - (VOC_i\frac{D_{s,i}}{\rho_s})

and p_s is the probability of scenario s and UR \times \text{households}=3420.

Some caveats

As we walked through the formulation of this two-stage stochastic programming problem, it quickly becomes apparent that the computational complexity of such an approach can quickly become intractable when the number of uncertain scenarios and the complexity of the model (large number of uncertain parameters) is high. In addition, this approach assumes a priori knowledge of future uncertainty and sufficient underlying expert knowledge to generate 6uncertainty estimates, which cannot guarantee that the theoretical robustness of the identified solution to real world outcomes if the estimated uncertainty does not match the estimated uncertainty. We can see that this method may therefore be best suited for problems where uncertainty can be accurately characterized and modeled as random variables or scenarios. It is nonetheless a useful approach for addressing problems in manufacturing, production line optimization, financial portfolio design, etc, where data is abundant and uncertainty is well-characterized, and then identifying solutions that perform well across a finite number of scenarios.

In this post, we covered an introduction to stochastic programming and provided a 2-stage infrastructure investment selection example to demonstrate the method’s key concepts. In the next post, we will walk through the installation and use of the Python cvxpy library and use it to solve this problem to identify the most suitable infrastructure option to invest in, its final built capacity, and its total demand fulfilled.