A quick and straightforward introduction to LIME

In this blog post, we will be discussing the many household uses of Citrus aurantiifolia, or the common lime.

Just kidding, we’ll be talking about a completely different type of LIME, namely Local Interpretable Model-Agnostic Explanations (at this point you may be thinking that the former’s scientific name becomes the easier of the two to say). After all, this is the WaterProgramming blog.

On a more serious note though, LIME is one of the two widely-known model agnostic explainable AI (xAI) methods, alongside Shapley Additive Explanations (SHAP). This post is intended to be an introduction to LIME in particular, and we will be setting up the motivation for using xAI methods as well as a brief example application using the North Carolina Research Triangle system.

Before we proceed, here’s a quick graphic to get oriented:

The figure above demonstrates three main concepts: Artificial Intelligence (AI), the methods used to achieve AI (one of which is Machine Learning, or ML), and the methods used to explain how those methods arrived at their version of AI (explainable AI, or more catchily known as xAI). For more explanation of the different categories of AI and their methods, please refer to these posts by IBM’s Data Science and AI team and this SpringerLink book by Sarker (2022), respectively.

Model-agnostic vs model-specific

Model-specific methods

As shown in the figure, model-specific xAI methods are techniques that can only be used on the specific type of model they were designed for. Here’s a quick rundown of model types and their available selection of xAI methods:

  • Decision trees
    • Decision tree visualization (e.g. Classification and Regression Tree (CART) diagrams)
    • Feature importance rankings
  • Neural networks (NN), including deep neural networks
    • Coefficient (feature weight) analysis
    • Neural network neuron/layer activation visualization
    • Attention (input sequence) mechanisms
    • Integrated gradients
    • DeepLIFT
    • GradCAM
  • Reinforcement learning
    • Causal models

Further information on the mathematical formulation of these methods can be found in Holzinger et al. (2022). Such methods account for the underlying structure of the model that they are used for, and therefore require some understanding of the model formulation to fully comprehend (e.g. interpreting NN activation visualizations). While less flexible than their agnostic counterparts, model-specific xAI methods provide more granular information on how specific types of information are processed by the model, and therefore how changing input sequences, or model structure, can affect model predictions.

Model-agnostic methods

Model-agnostic xAI (as its name suggests) relies solely on analyzing the input-output sets of a model, and can therefore be applied to a wide range of machine learning models regardless of model structure or type. It can be thought of (very loosely) as sensitivity analysis applied to AI methods (for more on this connection, please refer to Scholbeck et al. (2023) and Razavi et al. (2021)). SHAP and LIME both fall under this set of methods, and approximately abide by the following process: perturb the input, then identify how the output changes. Note that this set of methods provides little insight into the specifics of model formulation and how it affects model predictions. Nonetheless, it affords a higher degree of flexibility and does not bind you to one specific model.
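To make the “perturb the input, observe the output” idea concrete, here is a minimal, purely illustrative sketch of a LIME-style local surrogate. Everything here (the black-box predict_fn, the Gaussian sampling, the kernel width) is a hypothetical simplification; the actual lime package additionally discretizes features and performs sparse feature selection.

import numpy as np
from sklearn.linear_model import Ridge

def local_surrogate(predict_fn, x0, n_samples=5000, kernel_width=0.75):
    """Illustrative LIME-style local explanation of a black-box model around the point x0."""
    rng = np.random.default_rng(42)
    # 1. perturb: sample points in a neighborhood of the instance x0
    X_pert = x0 + rng.normal(scale=1.0, size=(n_samples, x0.shape[0]))
    # 2. query the black-box model at the perturbed points
    y_pert = predict_fn(X_pert)
    # 3. weight each sample by its proximity to x0 (exponential kernel)
    dists = np.linalg.norm(X_pert - x0, axis=1)
    weights = np.exp(-(dists ** 2) / kernel_width ** 2)
    # 4. fit a weighted linear surrogate; its coefficients act as the local explanation
    surrogate = Ridge(alpha=1.0).fit(X_pert, y_pert, sample_weight=weights)
    return surrogate.coef_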

Why does this matter?

Let’s think about this in the context of water resources systems management and planning. Assume you are a water utility responsible for reliably delivering water to 280,000 residents on a daily basis. In addition, you are responsible for planning the next major water supply infrastructure project. Using a machine learning model to inform your short- and long-term management and planning decisions without interrogating how it arrived at its recommendations implicitly assumes that the model will make sensible decisions that balance all stakeholders’ needs while remaining equitable. More often than not, this assumption is incorrect and may lead to (sometimes funny, but mostly) unintentional, detrimental cascading implications for the most vulnerable (for some well-narrated examples of how ML went awry, please refer to Brian Christian’s “The Alignment Problem”).

Having said that, using xAI as a next step in the general framework of adopting AI into our decision-making processes can help us better understand why a model makes its predictions, how those predictions came to be, and their degree of usefulness to a decision maker. In this post, I will be demonstrating how LIME can be used to answer such questions.

The next section will establish the three components we will need to apply LIME to our problem:

  1. An input (feature) set and the training dataset
  2. The model predictions
  3. The LIME explainer

A quick example using the North Carolina Research Triangle

The input (feature) set and training dataset

The Research Triangle region in North Carolina consists of six main water utilities that deliver water to their respective service areas: OWASA (Orange County), Durham, Cary, Raleigh, Chatham, and Pittsboro (Gold et al., 2023). All utilities have collaboratively decided to measure their system robustness using a satisficing metric (Starr, 1963; Herman et al., 2015), where a utility is considered robust if the following criteria are met:

  1. Their reliability meets or exceeds 98%
  2. Their worst-case cost of drought mitigation actions amounts to no more than 10% of their annual volumetric revenue
  3. They do not restrict demand more than 20% of the time

If all three criteria are met, then they are considered “successful” (represented as a 1). Otherwise, they have “failed” (represented as a 0). We have 1,000 training data points of success or failure, each representing a state of the world (SOW) in which a utility fails or succeeds to meet its satisficing criteria. This is our training dataset. Each SOW consists of a unique combination of features that include inflow, evaporation and water demand Fourier series coefficients, as well as socioeconomic, policy and infrastructure construction factors. This is our feature set.
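As a concrete illustration, the success/failure label for one utility could be computed along these lines (the performance arrays below are randomly generated placeholders, not the actual Research Triangle simulation output):

import numpy as np

rng = np.random.default_rng(1)
n_sows = 1000

# hypothetical per-SOW performance metrics for one utility
reliability = rng.uniform(0.90, 1.00, n_sows)       # fraction of time demand is fully met
worst_case_cost = rng.uniform(0.00, 0.30, n_sows)   # fraction of annual volumetric revenue
restriction_freq = rng.uniform(0.00, 0.40, n_sows)  # fraction of time restrictions are imposed

# 1 = success (all three criteria met), 0 = failure
satisficing_label = ((reliability >= 0.98) &
                     (worst_case_cost <= 0.10) &
                     (restriction_freq <= 0.20)).astype(int)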

Feel free to follow along with this portion of the post using the code available in this Google Colab Notebook.

The model prediction

To generate our model prediction, let’s first code up our model. In this example, we will be using Extreme Gradient Boosting, otherwise known as XGBoost (xgboost), as our prediction model, together with the LIME package. Let’s first install both of them:

pip install lime
pip install xgboost

We will also need to import all the needed libraries:

import numpy as np
import pandas as pd 
import matplotlib.pyplot as plt
import seaborn as sns
import lime
import lime.lime_tabular
import xgboost as xgb
from copy import deepcopy

Now let’s set up and run XGBoost! We will first need to load our feature and training datasets:

satisficing = pd.read_csv('satisficing_all_utilites.csv')
du_factors = pd.read_csv('RDM_inputs_final_combined_headers.csv')  

# get all utility and feature names
utility_names = satisficing.columns[1:]
du_factor_names = du_factors.columns

There should be seven utility names (six, plus one that represents the entire region) and thirteen DU factors (or feature names). In this example, we will be focusing only on Pittsboro.
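If you’d like to verify this before moving on, a quick check of the loaded data might look like this:

# quick sanity check on the loaded data
print(list(utility_names))       # should list seven utilities (six plus one regional)
print(len(du_factor_names))      # should print 13
print(du_factors.shape)          # (number of SOWs, 13)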

# select the utility 
utility = 'Pittsboro'

# convert to numpy arrays to be passed into xgboost function
du_factors_arr = du_factors.to_numpy()
satisficing_utility_arr = satisficing[utility].values

# initialize the figure object
fig, ax = plt.subplots(1, 1, figsize=(5, 5))  
perform_and_plot_xgb(utility, ax, du_factors_arr, satisficing_utility_arr, du_factor_names)

Note the perform_and_plot_xgb function being used – this function is not shown here for brevity, but you can view the full version of this function in this Google Colab Notebook.

The figure above is called a factor map. It shows mid-term (Demand2) and long-term (Demand3) demand growth rates on its x- and y-axes respectively. The green region denotes the range of demand growth rates where XGBoost has predicted that Pittsboro will successfully meet its satisficing criteria, and the brown region is otherwise. Each point is a sample from the original training dataset, where the color (white is 1 – success, and red is 0 – failure) denotes whether Pittsboro actually meets its satisficing criteria. In this case we can see that Pittsboro quickly transitions into failure when its mid- and long-term demand growth are both higher than expected (indicated by the 1.0 value on both axes).
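For context, the model-fitting step inside the perform_and_plot_xgb helper boils down to training an XGBoost classifier on the full DU factor set. The snippet below is only a minimal sketch of that step with illustrative hyperparameters, not the notebook’s exact implementation (which also draws the factor map); the LIME code in the next section assumes a trained classifier named xgb_model like this one.

# minimal sketch of the model-fitting step (hyperparameters are illustrative)
xgb_model = xgb.XGBClassifier(n_estimators=100, max_depth=3, learning_rate=0.1)
xgb_model.fit(du_factors_arr, satisficing_utility_arr)

# predicted probability that Pittsboro "succeeds" in each training SOW
pred_success_prob = xgb_model.predict_proba(du_factors_arr)[:, 1]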

The LIME explainer

Before we perform LIME, let’s first select an interesting point using the figure above.

In the previous section, we can see that the XGBoost algorithm predicts that Pittsboro’s robustness is affected most by mid-term (Demand2) and long-term (Demand3) demand growth. However, there is a point (indicated using the arrow and the brown circle below) where this prediction did not match the actual data point.

To better understand why this specific point was predicted to be a “successful” SOW when the true data point was labeled a “failure” SOW, let’s take a look at how the XGBoost algorithm made its decision.

First, let’s identify the index of this point:

# select an interesting point 
interesting_point_range_du = du_factors[(du_factors['Demand2'] < 0) & (du_factors['Demand3'] < 0)].index
interesting_point_satisficing = satisficing_utility_arr[interesting_point_range_du]
interesting_point_range_sat = np.where(satisficing_utility_arr == 0)
interesting_point = np.intersect1d(interesting_point_range_du, interesting_point_range_sat)[0]

This will return an index of 704. Next, we’ll run LIME to break down how this (mis)classification was made:

# instantiate the lime explainer
explainer = lime.lime_tabular.LimeTabularExplainer(du_factors_arr, 
                                                   mode='classification', 
                                                   feature_names=du_factor_names)

# explain the interesting instance
exp = explainer.explain_instance(du_factors_arr[interesting_point], 
                                 xgb_model.predict_proba, 
                                 num_features=len(du_factor_names))

exp.show_in_notebook(show_table=True)

The code above, if run successfully, should produce the figure below.

Here’s how to interpret it:

  • The prediction probability bars on the far left show the model’s prediction. In this case, the XGBoost model classifies this point as a “failure” SOW with 94% confidence.
  • The tornado plot in the middle shows the feature contributions, i.e. the degree to which each SOW feature influenced the decision. In this case, the data point was misclassified as a “success” (although it was actually a failure) because our trained model uses only the top two overall features that influence the entire dataset to plot the factor map, and did not account for short-term demand growth rates (Demand1) or the permitting time required for constructing the water treatment plant (JLWTP permit).
  • The table on the far right lists the values of all the features of this specific SOW.
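If you are working outside a notebook, the same explanation can also be retrieved programmatically: as_list() returns the (feature, weight) pairs behind the tornado plot, and save_to_file() writes the full visualization to an HTML file (the filename below is arbitrary).

# retrieve the explanation outside a notebook
for feature, weight in exp.as_list():
    print(feature, round(weight, 3))

exp.save_to_file('lime_explanation_pittsboro.html')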

Using LIME has therefore enabled us to identify the cause of XGBoost’s misclassification, allowing us to understand that the model needed information on short-term demand and permitting time to make the correct prediction. From here, it is possible to further dive into the types of SOWs and their specific characteristics that would cause them to be more vulnerable to short-term demand growth and infrastructure permitting time as opposed to mid- and long-term demand growth.

Summary

Okay, so I lied a bit – it wasn’t quite so “brief” after all. Nonetheless, I hope you learned a little about explainable AI, how to use LIME, and how to interpret its outcomes. We also walked through a quick example using the good ol’ Research Triangle case study. Do check out the Google Colab Notebook if you’re interested in how this problem was coded.

With that, thank you for sticking with me – happy learning!

References

Amazon Web Services. (n.d.). What’s the Difference Between AI and Machine Learning? Machine Learning & AI. https://aws.amazon.com/compare/the-difference-between-artificial-intelligence-and-machine-learning/

Btd. (2024, January 7). Explainable AI (XAI): Model-specific interpretability methods. Medium. https://baotramduong.medium.com/explainable-ai-model-specific-interpretability-methods-02e23ebceac1

Christian, B. (2020). The alignment problem: Machine Learning and human values. Norton & Company.

Gold, D. F., Reed, P. M., Gorelick, D. E., & Characklis, G. W. (2023). Advancing Regional Water Supply Management and infrastructure investment pathways that are equitable, robust, adaptive, and cooperatively stable. Water Resources Research, 59(9). https://doi.org/10.1029/2022wr033671

Herman, J. D., Reed, P. M., Zeff, H. B., & Characklis, G. W. (2015). How should robustness be defined for water systems planning under change? Journal of Water Resources Planning and Management, 141(10). https://doi.org/10.1061/(asce)wr.1943-5452.0000509

Holzinger, A., Saranti, A., Molnar, C., Biecek, P., & Samek, W. (2022). Explainable AI methods – A brief overview. xxAI – Beyond Explainable AI, 13–38. https://doi.org/10.1007/978-3-031-04083-2_2

IBM Data and AI Team. (2023, October 16). Understanding the different types of artificial intelligence. IBM Blog. https://www.ibm.com/blog/understanding-the-different-types-of-artificial-intelligence/

Sarker, I. H. (2022, February 10). AI-based modeling: Techniques, applications and research issues towards automation, intelligent and Smart Systems – SN Computer Science. SpringerLink. https://link.springer.com/article/10.1007/s42979-022-01043-x#Sec6

Scholbeck, C. A., Moosbauer, J., Casalicchio, G., Gupta, H., Bischl, B., & Heumann, C. (2023, December 20). Position paper: Bridging the gap between machine learning and sensitivity analysis. arXiv.org. http://arxiv.org/abs/2312.13234

Starr, M. K. (1963). Product design and decision theory. Prentice-Hall, Inc.

Variance-based local time varying sensitivity analysis: A tutorial

In this post, we will be performing time-varying sensitivity analysis (TVSA) to assess how the use of information from reservoir storage and downstream city water levels changes across time when attempting to meet competing objectives in the Red River basin. This exercise is derived from the Quinn et al. (2019) paper listed in the references below. You will be able to find all the files needed for this training in this GitHub repository. You can also follow along with this tutorial using this Binder.

Before beginning, here are some preliminary suggested readings to help you better understand TVSA and Radial Basis Functions (RBFs), which will be used in this training:

  • Herman, J. D., Reed, P. M., & Wagener, T. (2013). Time-varying sensitivity analysis clarifies the effects of watershed model formulation on model behavior. Water Resources Research, 49(3), 1400–1414. https://doi.org/10.1002/wrcr.20124
  • Giuliani, M., Zaniolo, M., Castelletti, A., Davoli, G., & Block, P. (2019). Detecting the state of the climate system via artificial intelligence to improve seasonal forecasts and inform reservoir operations. Water Resources Research, 55(11), 9133–9147. https://doi.org/10.1029/2019wr025035
  • Quinn, J. D., Reed, P. M., Giuliani, M., & Castelletti, A. (2019). What is controlling our control rules? opening the black box of multireservoir operating policies using time‐varying sensitivity analysis. Water Resources Research, 55(7), 5962–5984. https://doi.org/10.1029/2018wr024177
  • Reed, P.M., Hadjimichael, A., Malek, K., Karimi, T., Vernon, C.R., Srikrishnan, V., Gupta, R.S., Gold, D.F., Lee, B., Keller, K., Thurber, T.B, & Rice, J.S. (2022). Addressing Uncertainty in Multisector Dynamics Research [Book]. Zenodo. https://doi.org/10.5281/zenodo.6110623

The Red River System

The Red River basin consists of four reservoirs: Son La (SL), Thac Ba (TB), Tuyen Quang (TQ), and Hoa Binh (HB), shown in the figure below. The reservoirs’ operators must make daily decisions on the volume of water to release from each reservoir to achieve the following competing objectives:

  1. Protect the city of Hanoi (which lies downstream of the four reservoirs) from flooding by managing city water levels.
  2. Achieve maximum possible hydropower production from each reservoir to serve the energy demands of Hanoi.
  3. Minimize agricultural water supply deficit to meet irrigation needs for the region.
Figure 1: (a) The Red River Basin relative to Vietnam, Laos, and China, and (b) the model abstracting key components of the reservoir system that manages it.

The reservoir operators use information from the projected storage levels at each reservoir and water level forecasts to make their daily release decisions. These decisions in turn determine whether they are able to meet their three objectives. It is therefore important to understand how different types of information are used and coordinated throughout the year to make these daily decisions.

This exercise

To complete this exercise, you should have the following subfolders and files within your Binder.

  1. HydroInfo_100_thinned.resultfile: contains the decision variables and objective values for all sets of optimal decisions. If this file is missing from the main folder structure, you can download the original version here.
  2. daily/SolnXX/: the subfolder that contains text files of each reservoir’s daily decision variables (the columns) across 1,000 years (the rows) for Solution XX.

Most of the daily data folders are large and cannot be stored on the linked GitHub repository. Note that only Soln33 can be found in the daily/ folder. If you would like to perform TVSA on other solutions, please download them separately here.

All in all, there are four selected solutions, named according to the objectives that they prioritize and their tradeoff characteristics:

  1. The “Best Flood” policy: Solution 33
  2. The “Best Hydropower” policy: Solution 37
  3. The “Best Deficit” policy: Solution 52
  4. The “Compromise” policy: Solution 35

Learning goals

In this exercise, we will:

  1. Briefly explain RBFs (Giuliani et al., 2016; Quinn et al., 2017; Giuliani et al., 2019).
  2. Unpack and understand the underlying mechanisms for calculating analytical variance-based first- and second-order sensitivity indices.
  3. Perform time-varying sensitivity analysis for each of the four listed solutions.
  4. Analyze the resulting figures to understand how information use changes throughout the year.

Understanding TVSA

Let’s walk through some key equations before we proceed. In this section, you will also find the Equation numbers that link these equations to their origins in the Quinn et al. (2019) paper listed above.

First, let’s look at some of the notation we will be using:

  • $M$ is the number of policy inputs
  • $K$ is the number of policy outputs
  • $N$ is the number of radial basis functions (RBFs) used
  • $t \in T=13\times365$ is a day within the entire record, while $t\%365$ is the corresponding day of the calendar year (DOY)
  • $k \in K$ is the index of a reservoir
  • $u_{t}^{k}$ is the policy-prescribed release at reservoir $k$ at time $t$
  • $a, b \in A, B$ are indices of system states, i.e. the storages at the reservoirs
  • $(x_t)_a$ is the system variable (state) $a$ at time $t$
  • $c_{n,m}$ and $b_{n,m}$ are the center and radius of policy input $m \in M$ for RBF $n \in N$
  • $w_n$ is the weight of RBF $n \in N$

Calculating the first-order sensitivity analysis

The following equation is drawn from Eqn.10 of Quinn et al. (2019):

\Bigl(\frac{\delta u_{t}^{k}}{\delta(x_t)_a}\Bigr)^{2}\text{Var }((x_{t\%365})_a)

This equation describes the annual first-order contribution of variable $(x_t)_a$ (the system state at $a$) to the variance of the release $u_{t}^{k}$.

Next, from Eqn.11 of Quinn et al. (2019):

\frac{\delta u_{t}^{k}}{\delta(x_t)_a}\frac{\delta u_{t}^{k}}{\delta(x_t)_b}\text{Cov }((x_{t\%365})_a, (x_{t\%365})_b)

This equation describes the annual second-order (interaction) contribution of the pair of system states at $a$ and $b$ to the variance of the release $u_{t}^{k}$. But how do we calculate the derivatives of the policy $u_{t}^{k}$ with respect to its inputs?

Calculating the derivatives

We can calculate the derivatives by taking the first partial derivative of the policy with respect to each of its inputs. This approach is generalizable to all forms of policy formulation; in this case, we differentiate the RBF policy shown in Eqn. 8 of Quinn et al. (2019).

\frac{\delta u_{t}^{k}}{\delta(x_t)_a} = \sum_{n=1}^{N}\Biggl\{\frac{-2w_{n}[(x_{t})_{a}-c_{n,a}]}{b_{n,a}^2}\text{ exp}\Biggl[-\sum_{m=1}^{M}\Bigl(\frac{(x_{t})_{m}-c_{n,m}}{b_{n,m}}\Bigr)^2\Biggr]\Biggr\}

For reference, here is the original RBF function (Eqn. 8 of Quinn et al. (2019)):

u_{t}^{k} = \sum_{n=1}^{N}w_{n}^{k}\text{ exp}\Biggl[-\sum_{m=1}^{M}\Biggl(\frac{(x_t)_m - c_{n,m}}{b_{n,m}}\Biggr)^2\Biggr]

A quick aside on RBFs

As shown in the equation above, an RBF is characterized by three sets of parameters: centers $c \in C$ and radii $b \in B$ for each input variable, and a weight $w$. It is a type of Gaussian process kernel that informs the model (in our case, the Red River model) of the degree of similarity between two input points (Natsume, 2022). Each input is shifted by its distance from its respective “optimal” center point, and scaled by its “optimal” radius. The overall influence of an RBF on the final release decision is then weighted by an “optimal” constant $w$, where the sum of all the weights should be equal to one.

Herein lies the key advantage of using RBFs: instead of identifying the optimal output (the releases) at each timestep, optimizing RBFs finds the optimal values of $c$, $b$, and $w$ associated with each input. This limits the number of computations and the storage space required to estimate the best-release policy, since there is no need to compute future outcomes $u_{t+1}, u_{t+2}, \dots, u_{T}$ across all following timesteps, nor to store information on prior releases $u_{t-1}$.
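To make this concrete, here is a minimal NumPy sketch of Eqn. 8 and its analytical derivative for a single reservoir. The function names and the parameter values are placeholders (randomly generated, not the optimized Red River policy):

import numpy as np

def rbf_release(x, centers, radii, weights):
    """Eqn. 8: policy-prescribed release for one reservoir, given the M policy inputs x."""
    z = ((x - centers) / radii) ** 2              # shape (N, M)
    return np.sum(weights * np.exp(-z.sum(axis=1)))

def rbf_release_gradient(x, centers, radii, weights):
    """Analytical partial derivatives of the release with respect to each input (length M)."""
    z = ((x - centers) / radii) ** 2
    phi = np.exp(-z.sum(axis=1))                  # RBF activations, shape (N,)
    # du/dx_a = sum_n -2 w_n [(x)_a - c_{n,a}] / b_{n,a}^2 * phi_n
    return np.sum(-2.0 * weights[:, None] * (x - centers) / radii ** 2 * phi[:, None], axis=0)

# placeholder policy parameters: N = 12 RBFs, M = 5 inputs
rng = np.random.default_rng(0)
centers = rng.uniform(-1, 1, (12, 5))
radii = rng.uniform(0.1, 1.0, (12, 5))
weights = rng.dirichlet(np.ones(12))              # weights sum to one
x = rng.uniform(-1, 1, 5)                         # normalized storages and Hanoi forecast

print(rbf_release(x, centers, radii, weights))
print(rbf_release_gradient(x, centers, radii, weights))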

General steps

Time-varying sensitivity analysis in general is conducted as follows:

  1. Calculate the variance and covariance of each input variable for each timestep t over all years. In this case, the timestep is in daily increments.
  2. Calculate the partial first and second derivatives of each release policy for each day with respect to the input variables at each location.
  3. Compute the first- and second-order contributions of each input variable across all timesteps by:
    • Multiplying the variances with the first-order derivatives
    • Multiplying the covariances with their respective second-order derivatives
  4. Plot the contribution of each input variable to daily release policies (i.e. what kind of information is being used to make a decision on releases from reservoir $k$). A stripped-down sketch of steps 1–3 is given below.
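The snippet below is a stripped-down sketch of steps 1–3 for a single reservoir, using randomly generated placeholder arrays; the exercise’s full implementation (in calc_SI.py, introduced below) differs in its details, such as how contributions are aggregated across years.

import numpy as np

# placeholder dimensions and data: 1,000 years x 365 days x M = 5 inputs
n_years, n_days, M = 1000, 365, 5
rng = np.random.default_rng(0)
x = rng.normal(size=(n_years, n_days, M))      # system states (storages, Hanoi forecast)
dudx = rng.normal(size=(n_years, n_days, M))   # policy derivatives du/dx (e.g. from the RBF gradient above)

first_order = np.zeros((n_days, M))
interactions = np.zeros(n_days)

for d in range(n_days):
    cov = np.cov(x[:, d, :], rowvar=False)     # (M, M) covariance across years for this day of year
    # per-year first-order contributions: (du/dx_a)^2 * Var(x_a)
    fo = dudx[:, d, :] ** 2 * np.diag(cov)
    # per-year interaction contributions: du/dx_a * du/dx_b * Cov(x_a, x_b), a != b
    outer = dudx[:, d, :, None] * dudx[:, d, None, :] * cov
    inter = outer.sum(axis=(1, 2)) - np.trace(outer, axis1=1, axis2=2)
    # average across years, as in the plotted TVSA figures
    first_order[d, :] = fo.mean(axis=0)
    interactions[d] = inter.mean()

# normalize to relative contributions for plotting
total = first_order.sum(axis=1) + interactions
rel_first = first_order / total[:, None]       # relative first-order contributions per input
rel_interactions = interactions / total        # relative contribution of interactions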

Setting up the problem

To follow along with this tutorial, ensure that all files are located in their appropriate directories, then open up the Jupyter notebook named tvsa_exercise.ipynb. Now, let’s import all the required libraries and set some constant parameters. We will be setting the values for the number of input parameters, M, the number of RBFs, N, and the number of outputs, K.

# import the libraries
import numpy as np
import math 
import matplotlib.pyplot as plt
import seaborn as sns
from calc_SI import *
from plot_SI import *

sns.set_style("dark")
# set the parameters 
M = 5 # Storage at the four reservoirs and the forecasted water levels in Hanoi 
N = 12 # Number of RBFs to parameterize. Set a minimum of M+K
K = 4 # Number of outputs (release decisions at the four reservoirs)
num_years = 1000 # Number of years to simulate

We will also need to define our input and output names. These are the column names that will appear in the output .txt files needed to plot our figures later in this exercise. Let’s also set the solution that we would like to analyze.

inputNames = ['sSL', 'sHB', 'sTQ', 'sTB', 'HNwater']  # the storage at Son La, Hoa Binh, Tuyen Quang, Thac Ba, and the water level at Hanoi
outputNames = ['rSL', 'rHB', 'rTQ', 'rTB']  # the releases at Son La, Hoa Binh, Tuyen Quang, Thac Ba
policy_file = 'HydroInfo_100_thinned.resultfile'
soln_num = 33

Performing TVSA

The main calculations that output the sensitivity indices for each day of the year across 1,000 years are performed by the function called calc_analytical_SI. This function can be found in the calc_SI.py file within this folder. It is highly recommended that you take the time to go through it and understand the mechanisms of time-varying sensitivity analysis after completing this exercise. This step can take anywhere from 30 to 50 minutes, depending on your computing hardware. Feel free to comment out the line of code below if you would only like to plot the TVSA figures.

calc_analytical_SI(M, N, K, policy_file, soln_num, num_years, inputNames, outputNames)

If you are successful in running this function, you should see two new folders in daily/Soln_XX/ called release_decisions and Figures. The latter will become important in a while, when we begin to generate our TVSA figures. For now, open release_decisions to check if the files containing the sensitivity of each reservoir’s release decisions (e.g. rHB.txt) have populated the folder.

Are they there? Great! Let’s move on to generating the figures.

Plot the results

Now that we have the sensitivity indices of each reservoir’s release decision over time, let’s visualize how they use storage and water level forecast information. You can do this by using the plot_tvsa_policy function found in the plot_SI.py file. As before, you’re advised to take the time to understand the script once you have completed this exercise.

To begin, let’s specify the names of the inputs that you would like to see in the legend of the figure. For this exercise, we will re-use the inputNames list defined earlier. We will also need to add one more entry to this list, called interactions, which quantifies the influence of interactions between input variables on release decisions. In general, the list of input names and their associated colors should have length $M+1$.

Feel free to replace this with a list of other names that are more intuitive for you. We will also specify the colors they will be shown in. Since there are six entries (five inputs plus their interactions), we will have to specify six colors.

# specify the legend labels for each input and their respective colors
input_list = inputNames + ['interactions']

# for Son La, Hoa Binh, Tuyen Quang, Thac Ba, Hanoi, and their interactions respectively
input_colors = ['#FFBF00', '#E83F6F', '#2274A5', '#E3D7FF', '#92BFB1', '#F7A072']

For now, let’s see how this information affects Son La’s release decisions over time for the “Best Flood” solution. This is the solution that minimizes the “worst first-percentile of the amount by which the maximum water level in Hanoi exceeds 11.25 m” (Quinn et al., 2019).

plot_tvsa_policy(num_years, 33, input_list, input_colors, 'rSL', 'Sensitivity of Son La releases')

If all the steps have been completed, you should be able to view the following image:

Figure 2: Average contributions of each input across all simulation years and their interactions to Son La’s release decision variance. The vertical axis represents the relative contributions of each input variable to variance. The horizontal axis shows the months in a year.

If you’d like to see and automatically save the plots for all four reservoirs, you can opt for a for loop:

res_names = ['Son La', 'Hoa Binh', 'Tuyen Quang', 'Thac Ba']

for r in range(len(outputNames)):
    plot_name = 'Sensitivity of ' + res_names[r] + ' releases'
    plot_tvsa_policy(num_years, 33, input_list, input_colors, 
                     outputNames[r], plot_name)

At the end of this, you should be able to locate all the figures in the daily/Soln_XX/Figures folder, where XX is the solution number of your choice.

Analyze the results

The figure below shows the time-varying sensitivity of the release decisions at all four reservoirs to the input variables. As a quick reminder, we are looking at Solution 33, which is the “Best Flood” solution.

Figure 3: Average contributions of each input across all simulation years and their interactions to (clockwise) Son La, Hoa Binh, Tuyen Quang, and Thac Ba. The vertical axis represents the relative contributions of each input variable to variance. The horizontal axis shows the months in a year.

In Figure 3 above, note the visual similarity between the input contributions to Hoa Binh’s and Son La’s release decision variance. The similarity between this pair of reservoirs could potentially be attributed to the geographic characteristics of the Red River basin (although further analysis is required here). In Figure 1, both Son La and Hoa Binh belong to the same Red River tributary, with the former preceding the latter. Their release decisions predominantly use information from Hanoi’s water level forecasts, particularly during the five months that correspond to Vietnam’s May–September monsoon season.

During the same period, the variance in Thac Ba’s and Tuyen Quang’s release decisions has relatively even contributions from Son La’s storage levels, Hanoi’s water level forecast, and interactions between different types of information. This indicates that it is crucial for the reservoir managers at Thac Ba and Tuyen Quang to carefully monitor these three main information sources during the monsoon period. Higher-resolution data and more frequent monitoring may be valuable during this time.

Interestingly, storage information from Hoa Binh appears to contribute most to the release decision variances of each reservoir before and at the beginning of the monsoon. Hoa Binh’s storage information is particularly important during the first half of the year at all four reservoirs, which happens to be immediately before and during the monsoon season. During the same period of time, interactions between information from all other reservoirs also play an important role in regulating release decisions.

Post-monsoon, the influence of interactions on release decisions becomes negligible, and each reservoir’s releases become predominantly driven by storage levels at Hoa Binh and Hanoi’s water levels once more, approximately mirroring the distribution of information contributions during the pre-monsoon period. However, storage levels at Thac Ba and Tuyen Quang play a more important role in determining release decisions during this time, and their contributions to release decisions are greater for their own reservoirs. Notably, information on storage at Thac Ba has the least influence on all reservoirs’ release decisions (including its own) throughout the entire year. This is likely due to it being the smallest reservoir in the system (Table 1; Quinn et al., 2019).

These observations can lead us to hypothesize that:

  1. For a policy that prioritizes minimizing flooding in Hanoi, information on storage levels at the Hoa Binh reservoir is vital.
  2. Storage levels at Hoa Binh should therefore be closely monitored and recorded at high resolution.
  3. During the monsoon, water levels in Hanoi are also key to release decisions and should be recorded at regular intervals.
  4. Interactions between different sources of information are also a significant contributor to the variance across all reservoirs’ release decisions; treating decisions and conditions at each reservoir as independent of the others can result in underestimating the risk of flooding.

Summary

Overall, TVSA is a useful method to help decision-makers identify what information is useful, and when it is best used. This information is valuable when determining the resolution at which observations should be taken, or whether downscaling is appropriate. Note that this method relies on a variance-based global sensitivity analysis approach, where the sensitivity of an output to a given input (or combination of inputs) is quantified as that input’s contribution to the total variance of the output. Such methods may be biased by heavy-tailed distributions or outliers and therefore should be used only where appropriate (e.g. when the output distributions have been analyzed, or when outputs are not multimodal; Reed et al., 2022). As you may have noticed, TVSA is also a computationally expensive process that requires a significant amount of time, and this requirement grows quickly with the number of simulations/years across which variance is calculated, and with the number of inputs and outputs.

In this exercise, we performed time-varying sensitivity analysis to understand how each reservoir within the Red River system in Vietnam uses storage and water level information on a day-to-day basis, and how this use of information changes over time. Similar methods have also been used to better understand how the formulation of hydrological processes affects watershed model behavior (Herman et al., 2013; Medina and Muñoz, 2020; Basijokaite and Kelleher, 2021), and to assess information use in microgrid energy management (Liu et al., 2023). Post-exercise, do take some time to peruse these papers (linked below in the References section).

That said, thank you for sticking around and hope you learned a little something here!

References

  1. Basijokaite, R., & Kelleher, C. (2021). Time‐varying sensitivity analysis reveals relationships between watershed climate and variations in annual parameter importance in regions with strong interannual variability. Water Resources Research, 57(1). https://doi.org/10.1029/2020wr028544
  2. Giuliani, M., Castelletti, A., Pianosi, F., Mason, E., & Reed, P. M. (2016). Curses, tradeoffs, and scalable management: Advancing Evolutionary Multiobjective Direct Policy search to improve water reservoir operations. Journal of Water Resources Planning and Management, 142(2). https://doi.org/10.1061/(asce)wr.1943-5452.0000570
  3. Herman, J. D., Reed, P. M., & Wagener, T. (2013a). Time-varying sensitivity analysis clarifies the effects of watershed model formulation on model behavior. Water Resources Research, 49(3), 1400–1414. https://doi.org/10.1002/wrcr.20124
  4. Liu, M.V., Reed, P.M., Gold, D.F., Quist, G., Anderson, C.L. (2023). A Multiobjective Reinforcement Learning Framework for Microgrid Energy Management. arXiv.
    https://doi.org/10.48550/arXiv.2307.08692
  5. Medina, Y., & Muñoz, E. (2020). A simple time-varying sensitivity analysis (TVSA) for assessment of temporal variability of hydrological processes. Water, 12(9), 2463. https://doi.org/10.3390/w12092463
  6. Natsume, Y. (2022, August 23). Gaussian process kernels. Medium. https://towardsdatascience.com/gaussian-process-kernels-96bafb4dd63e.
  7. Quinn, J. D., Reed, P. M., Giuliani, M., & Castelletti, A. (2019). What is controlling our control rules? opening the black box of multireservoir operating policies using time‐varying sensitivity analysis. Water Resources Research, 55(7), 5962–5984. https://doi.org/10.1029/2018wr024177
  8. Reed, P.M., Hadjimichael, A., Malek, K., Karimi, T., Vernon, C.R., Srikrishnan, V., Gupta, R.S., Gold, D.F., Lee, B., Keller, K., Thurber, T.B, & Rice, J.S. (2022). Addressing Uncertainty in Multisector Dynamics Research [Book]. Zenodo. https://doi.org/10.5281/zenodo.6110623

MORDM VIII: Characterizing the effects of deep uncertainty

In the previous post, we defined robustness using the satisficing metric, where (1) reliability should be at least 98%, (2) restriction frequency should be no more than 10%, and (3) the worst-case cost of drought mitigation actions should be no more than 10% of annual net volumetric income. To calculate the robustness of these sets of actions (portfolios) against the effects of challenging states of the world (SOWs), we once again re-simulated them to discover how they fail.

In this penultimate post, we will perform simple sensitivity analysis across the average performance of all sixty-nine portfolios of actions to understand which uncertainties control the performance of each utility (Raleigh, Durham and Cary) and the region across all uncertain SOWs.

Calculating average performance across 100 DU SOWs

First, we create a new folder to hold the output of the next few post-processing steps. Navigate to the WaterPaths/ folder and create a folder called post_processing. Now, let’s calculate the average performance of each of the sixty-nine initial portfolios across the 100 DU SOWs that we previously simulated them over. The code for this can be found in the gather_objs.py file within the post_processing_code folder, and should look like this:

# -*- coding: utf-8 -*-
"""
Created on Mon April 26 2022 11:12

@author: Lillian Bei Jia Lau
Organizes output objectives by mean across RDMs and across solutions

"""
import numpy as np

obj_names = ['REL_C', 'RF_C', 'INF_NPC_C', 'PFC_C', 'WCC_C', \
        'REL_D', 'RF_D', 'INF_NPC_D', 'PFC_D', 'WCC_D', \
        'REL_R', 'RF_R', 'INF_NPC_R', 'PFC_R', 'WCC_R', \
        'REL_reg', 'RF_reg', 'INF_NPC_reg', 'PFC_reg', 'WCC_reg']

'''
Performs regional minimax
'''
def minimax(N_SOLNS, objs):
    for i in range(N_SOLNS):
        for j in range(5):
            if j == 0:
                objs[i,15] = np.min([objs[i,0],objs[i,5], objs[i,10]])
            else:
                objs[i, (j+15)] = np.max([objs[i,j],objs[i,j+5], objs[i,j+10]])
    return objs

'''
Calculates the mean performance across all SOWs
'''
def mean_performance_across_rdms(objs_by_rdm_dir, N_RDMS, N_SOLNS):
    objs_matrix = np.zeros((N_SOLNS,20,N_RDMS), dtype='float')
    objs_means = np.zeros((N_SOLNS,20), dtype='float')

    for i in range(N_RDMS):
        filepathname = objs_by_rdm_dir + str(i) + '_sols0_to_' + str(N_SOLNS) + '.csv'
        objs_file = np.loadtxt(filepathname, delimiter=",")
        objs_matrix[:,:15,i] = objs_file

        objs_file_wRegional = minimax(N_SOLNS, objs_matrix[:,:,i])

        objs_matrix[:,:,i] = objs_file_wRegional

        array_has_nan = np.isnan(np.mean(objs_matrix[:,3,i]))
        if(array_has_nan == True):
            print('NaN found at RDM ', str(i))

    for n in range(N_SOLNS):
        for n_objs in range(20):
            objs_means[n,n_objs] = np.mean(objs_matrix[n,n_objs,:])

    return objs_means

'''
Calculates the mean performance across all solutions
'''
def mean_performance_across_solns(objs_by_rdm_dir, N_RDMS, N_SOLNS):
    objs_matrix = np.zeros((N_SOLNS,20,N_RDMS), dtype='float')
    objs_means = np.zeros((N_RDMS,20), dtype='float')

    for i in range(N_RDMS):
        filepathname = objs_by_rdm_dir + str(i) + '_sols0_to_' + str(N_SOLNS) + '.csv'
        objs_file = np.loadtxt(filepathname, delimiter=",")
        objs_matrix[:,:15,i] = objs_file
        objs_file_wRegional = minimax(N_SOLNS, objs_matrix[:,:,i])

        objs_matrix[:,:,i] = objs_file_wRegional

        array_has_nan = np.isnan(np.mean(objs_matrix[:,3,i]))
        if(array_has_nan == True):
            print('NaN found at RDM ', str(i))

    for n in range(N_RDMS):
        for n_objs in range(20):
            objs_means[n,n_objs] = np.mean(objs_matrix[:,n_objs,n])

    return objs_means

# change number of solutions available depending on the number of solutions
# that you identified
N_SOLNS = 69
N_RDMS = 100

# change the filepaths
objs_by_rdm_dir = '/yourFilePath/WaterPaths/output/Objectives_RDM'
objs_og_dir = '/yourFilePath/WaterPaths/'

fileoutpath = '/yourFilePath/WaterPaths/post_processing/'

fileoutpath_byRDMs = fileoutpath + 'meanObjs_acrossRDM.csv'
fileoutpath_bySoln = fileoutpath + 'meanObjs_acrossSoln.csv'

# should have shape (N_SOLNS, 20)
objs_byRDM = mean_performance_across_rdms(objs_by_rdm_dir, N_RDMS, N_SOLNS)
# should have shape (N_RDMS, 20)
objs_bySoln = mean_performance_across_solns(objs_by_rdm_dir, N_RDMS, N_SOLNS)

np.savetxt(fileoutpath_byRDMs, objs_byRDM, delimiter=",")
np.savetxt(fileoutpath_bySoln, objs_bySoln, delimiter=",")

This will output two .csv files: meanObjs_acrossRDM.csv contains the average performance of each of the sixty-nine portfolios evaluated over the 100 DU SOWs, while meanObjs_acrossSoln.csv contains the average performance of all portfolios within each SOW. Take some time to understand this difference, as it will be important when performing sensitivity analysis and scenario discovery.
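A quick way to internalize this difference is to load both files and compare their shapes (using the same placeholder /yourFilePath/ convention as above):

import numpy as np

mean_across_rdms = np.loadtxt('/yourFilePath/WaterPaths/post_processing/meanObjs_acrossRDM.csv', delimiter=",")
mean_across_solns = np.loadtxt('/yourFilePath/WaterPaths/post_processing/meanObjs_acrossSoln.csv', delimiter=",")

print(mean_across_rdms.shape)    # (69, 20): one row per portfolio, averaged over the 100 DU SOWs
print(mean_across_solns.shape)   # (100, 20): one row per DU SOW, averaged over the 69 portfolios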

Calculate the robustness of each portfolio to deep uncertainty

Now, let’s identify how each of these solutions perform under a set of more challenging SOWs. Within post_processing_code/, identify the file called calc_robustness_across_rdms.py. It should look like this:

# -*- coding: utf-8 -*-
"""
Created on Mon April 26 2022 11:12

@author: Lillian Bei Jia Lau

Calculates the fraction of RDMs over which each perturbed version of the solution meets all four satisficing criteria
"""
import numpy as np
import pandas as pd

obj_names = ['REL_C', 'RF_C', 'INF_NPC_C', 'PFC_C', 'WCC_C', \
        'REL_D', 'RF_D', 'INF_NPC_D', 'PFC_D', 'WCC_D', \
        'REL_R', 'RF_R', 'INF_NPC_R', 'PFC_R', 'WCC_R', \
        'REL_reg', 'RF_reg', 'INF_NPC_reg', 'PFC_reg', 'WCC_reg']

utilities = ['Cary', 'Durham', 'Raleigh', 'Regional']

'''
Performs regional minimax
'''
def minimax(N_SOLNS, objs):
    for i in range(N_SOLNS):
        for j in range(5):
            if j == 0:
                objs[i,15] = np.min([objs[i,0],objs[i,5], objs[i,10]])
            else:
                objs[i, (j+15)] = np.max([objs[i,j],objs[i,j+5], objs[i,j+10]])
    return objs

'''
For each rdm, identify if the perturbed solution version x satisfies the satisficing criteria
'''
def satisficing(df_objs):
    # flag whether each solution meets all three satisficing criteria for each utility
    df_objs['satisficing_C'] = (df_objs['REL_C'] >= 0.98).astype(int) *\
                               (df_objs['WCC_C'] <= 0.10).astype(int) *\
                               (df_objs['RF_C'] <= 0.10).astype(int)

    df_objs['satisficing_D'] = (df_objs['REL_D'] >= 0.98).astype(int) *\
                               (df_objs['WCC_D'] <= 0.10).astype(int) *\
                               (df_objs['RF_D'] <= 0.10).astype(int)

    df_objs['satisficing_R'] = (df_objs['REL_R'] >= 0.98).astype(int) *\
                               (df_objs['WCC_R'] <= 0.10).astype(int) *\
                               (df_objs['RF_R'] <= 0.10).astype(int)

    # the region satisfices only if all three utilities satisfice
    df_objs['satisficing_reg'] = df_objs[['satisficing_C', 'satisficing_D',
                                          'satisficing_R']].min(axis=1)
    return df_objs

def calc_robustness(objs_by_rdm_dir, N_RDMS, N_SOLNS):

    # matrix structure: (N_SOLNS, N_OBJS, N_RDMS)
    objs_matrix = np.zeros((N_SOLNS,20,N_RDMS), dtype='float')

    satisficing_matrix = np.zeros((N_SOLNS,4,N_RDMS), dtype='float')
    solution_robustness = np.zeros((N_SOLNS,4), dtype='float')

    for i in range(N_RDMS):
        # get one perturbed instance's behavior over all RDMs
        filepathname = objs_by_rdm_dir + str(i) + '_sols0_to_' + str(N_SOLNS) + '.csv'

        objs_file = np.loadtxt(filepathname, delimiter=",")

        objs_matrix[:,:15,i] = objs_file

        objs_file_wRegional = minimax(N_SOLNS, objs_matrix[:,:,i])

        objs_matrix[:,:,i] = objs_file_wRegional

        # NaN check
        array_has_nan = np.isnan(np.mean(objs_matrix[:,3,i]))
        if(array_has_nan == True):
            print('NaN found at RDM ', str(i))

    # for the perturbed instances
    for r in range(N_RDMS):

        df_curr_rdm = pd.DataFrame(objs_matrix[:, :, r], columns = obj_names)

        df_curr_rdm = satisficing(df_curr_rdm)
        satisficing_matrix[:N_SOLNS,:,r] = df_curr_rdm.iloc[:,20:24]

    for n in range(N_SOLNS):
        solution_robustness[n,0] = np.sum(satisficing_matrix[n,0,:])/N_RDMS
        solution_robustness[n,1] = np.sum(satisficing_matrix[n,1,:])/N_RDMS
        solution_robustness[n,2] = np.sum(satisficing_matrix[n,2,:])/N_RDMS

    solution_robustness[:,3] = np.min(solution_robustness[:,:3], axis=1)

    return solution_robustness

'''
Change number of solutions available depending on the number of solutions
that you identified and the number of SOWs that you are evaluating them over.
'''
N_RDMS = 100
N_SOLNS = 69

objs_by_rdm_dir = '/scratch/lbl59/blog/WaterPaths/output/Objectives_RDM'

fileoutpath_robustness = '/scratch/lbl59/blog/WaterPaths/post_processing/' + \
    'robustness_' + str(N_RDMS) + '_SOWs.csv'

robustness = calc_robustness(objs_by_rdm_dir, N_RDMS, N_SOLNS)

np.savetxt(fileoutpath_robustness, robustness, delimiter=",")

When you run this script from the terminal, you should see a .csv file called robustness_100_SOWs.csv appear in your post_processing/ folder. Now, let’s get on to some sensitivity analysis!

Delta moment-independent sensitivity analysis

The Delta moment-independent (DMI) method (Borgonovo, 2007) is a sensitivity analysis method that compares the entire probability distributions of the input and output parameters to estimate the sensitivity of the output to a specific input parameter. It is one of many global sensitivity analysis methods, which form one of the two main categories of sensitivity analysis and enable assessment of the degree to which uncertainty in model inputs maps to uncertainty in model output. For the purposes of this test case, the DMI method is preferable as it does not rely on any one statistical moment (e.g. variance, mean, or skew) to describe the dependence of the model output on its input parameters. It is also time-sensitive, reflecting the current state of knowledge within the system, which philosophically pairs well with our use of the ROF triggers. More information on alternative global sensitivity methods can be found here.
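For reference, Borgonovo’s delta index for an input $x_i$ is commonly written as

\delta_i = \frac{1}{2}E_{x_i}\Bigl[\int \bigl|f_Y(y) - f_{Y|x_i}(y)\bigr|\,dy\Bigr]

where $f_Y$ is the unconditional probability density of the output and $f_{Y|x_i}$ is the output density conditioned on a fixed value of $x_i$. A value of zero implies that the output is independent of $x_i$, while larger values indicate stronger influence.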

Within the context of our test case, we will be using the DMI method to identify the decision variables and DU factors that most strongly influence performance over the 100 DU SOWs. To perform DMI sensitivity analysis, first navigate to the post_processing/ folder. Within the folder, create two sub-folders called delta_output_DV/ and delta_output_DUF/. This is where all your DMI output will be stored. Next, locate the delta_sensitivity.py file within the post_processing_code/ folder. The code should look similar to the script provided below:

import sys
from SALib.analyze import delta
from SALib.util import read_param_file
import numpy as np
import pandas as pd

'''
Finds the upper and lower bounds of input parameters
'''
def find_bounds(input_file):
    bounds = np.zeros((input_file.shape[1],2), dtype=float)
    for i in range(input_file.shape[1]):
        bounds[i,0] = min(input_file[:,i])
        bounds[i,1] = max(input_file[:,i])

    return bounds
'''
Performs delta moment-independent sensitivity analysis
Source: https://github.com/SALib/SALib/tree/main/examples/delta
'''
def delta_sensitivity(dec_vars, measured_outcomes, names, mo_names, bounds, rdm, mode):
    X = dec_vars
    Y = measured_outcomes

    problem = {
        'num_vars': int(dec_vars.shape[1]),
        'names': names,
        'bounds': bounds
    }

    for i in range(measured_outcomes.shape[1]):
        mo_label = mo_names[i]
        if i == 2 and mode == 'objs':
            break
        else:
            filename = '../post_processing/delta_output_' + rdm + '/S1_' + mo_label + '.csv'
            S1 = delta.analyze(problem, X, Y[mo_label].values, num_resamples=10, conf_level=0.95, print_to_console=False)
        numpy_S1 = np.array(S1["S1"])
        fileout = pd.DataFrame([names, numpy_S1], index=None, columns=None)
        fileout.to_csv(filename, sep=",")

'''
0 - Name all file headers and compSol to be analyzed
'''
obj_names = ['REL_C', 'RF_C', 'INF_NPC_C', 'PFC_C', 'WCC_C', \
        'REL_D', 'RF_D', 'INF_NPC_D', 'PFC_D', 'WCC_D', \
        'REL_R', 'RF_R', 'INF_NPC_R', 'PFC_R', 'WCC_R', \
        'REL_reg', 'RF_reg', 'INF_NPC_reg', 'PFC_reg', 'WCC_reg']

dv_names = ['RT_C', 'RT_D', 'RT_R', 'TT_D', 'TT_R', 'LMA_C', 'LMA_D', 'LMA_R',\
            'RC_C', 'RC_D', 'RC_R', 'IT_C', 'IT_D', 'IT_R', 'IP_C', 'IP_D', \
            'IP_R', 'INF_C', 'INF_D', 'INF_R']

rdm_headers_dmp = ['Cary restr. eff', 'Durham restr. eff', 'Raleigh restr. eff']
rdm_headers_utilities = ['Demand growth\nmultiplier', 'Bond term\nmultiplier', \
                        'Bond interest\nrate multiplier', 'Infrastructure interest\nrate multiplier']
rdm_headers_ws = ['Streamflow amp', 'SCR PT', 'SCR CT', 'NRR PT', 'NRR CT', 'CR Low PT', 'CR Low CT',\
                  'CR High PT', 'CR High CT', 'WR1 PT', 'WR1 CT', 'WR2 PT', 'WR2 CT',\
                  'DR PT', 'DR CT', 'FR PT', 'FR CT']

duf_names = ['Cary restr. eff', 'Durham restr. eff', 'Raleigh restr. eff', 'Demand growth\nmultiplier',\
             'Bond term\nmultiplier', 'Bond interest\nrate multiplier', 'Infrastructure interest\nrate multiplier',\
             'Streamflow amp\nmultiplier', 'SCR PT\nmultiplier', 'SCR CT\nmultiplier', 'NRR PT\nmultiplier',\
             'NRR CT\nmultiplier', 'CR Low PT\nmultiplier', 'CR Low CT\nmultiplier', 'CR High PT\nmultiplier',\
             'CR High CT\nmultiplier', 'WR1 PT\nmultiplier', 'WR1 CT\nmultiplier', 'WR2 PT\nmultiplier',\
             'WR2 CT\nmultiplier', 'DR PT\nmultiplier', 'DR CT\nmultiplier', 'FR PT\nmultiplier', 'FR CT\nmultiplier',\
             'DR PT\nmultiplier', 'DR CT\nmultiplier', 'FR PT\nmultiplier', 'FR CT\nmultiplier']

utilities = ['Cary', 'Durham', 'Raleigh', 'Regional']

N_RDMS = 100
N_SOLNS = 69

'''
1 - Load DU factor files and DV files
'''
# change to your own filepath
rdm_factors_directory = '/yourFilePath/WaterPaths/TestFiles/'
rdm_dmp_filename = rdm_factors_directory + 'rdm_dmp_test_problem_reeval.csv'
rdm_utilities_filename = rdm_factors_directory + 'rdm_utilities_test_problem_reeval.csv'
rdm_watersources_filename = rdm_factors_directory + 'rdm_water_sources_test_problem_reeval.csv'

rdm_dmp = pd.read_csv(rdm_dmp_filename, sep=",", names=rdm_headers_dmp)
rdm_utilities = pd.read_csv(rdm_utilities_filename, sep=",", names=rdm_headers_utilities)
rdm_ws_all = np.loadtxt(rdm_watersources_filename, delimiter=",")
rdm_ws = pd.DataFrame(rdm_ws_all[:,:17], columns=rdm_headers_ws)

dufs = pd.concat([rdm_dmp, rdm_utilities, rdm_ws], axis=1, ignore_index=True)
duf_numpy = dufs.to_numpy()

# change to your own filepath
dv_directory = '/yourFilePath/WaterPaths/'
dv_filename = dv_directory + 'NC_dvs_all_noheader.csv'
dvs = np.loadtxt(dv_filename, delimiter=",")

'''
2 - Get bounds for DU factors and DVs
'''
duf_bounds = find_bounds(duf_numpy)
dv_bounds = find_bounds(dvs)

'''
3 - Load robustness file and objectives file
'''
# change to your own filepath
main_dir = '/yourFilePath/WaterPaths/post_processing/'

robustness_filename = main_dir + 'robustness_100_SOWs.csv'
robustness_arr = np.loadtxt(robustness_filename, delimiter=",")
robustness_df = pd.DataFrame(robustness_arr, columns=utilities)

objs_mean_rdm_filename = main_dir + 'meanObjs_acrossRDM.csv'
objs_mean_rdm_arr = np.loadtxt(objs_mean_rdm_filename, delimiter=",")
objs_mean_rdm_df = pd.DataFrame(objs_mean_rdm_arr, columns=obj_names)

objs_mean_soln_filename = main_dir + 'meanObjs_acrossSoln.csv'
objs_mean_soln_arr = np.loadtxt(objs_mean_soln_filename, delimiter=",")
objs_mean_soln_df = pd.DataFrame(objs_mean_soln_arr, columns=obj_names)

# to change  depending on whether DV or DUF is being analyzed
dec_vars = dvs
measured_outcomes = objs_mean_rdm_df
names = dv_names
mo_names = obj_names
bounds = dv_bounds
rdm = 'DV'
mode = 'objs'
###

delta_sensitivity(dec_vars, measured_outcomes, names, mo_names, bounds, rdm, mode)

The code above identifies the sensitivity of the average values of all sixty-nine sets of performance objectives over all 100 deeply uncertain SOWs to the decision variables. This is why we use the meanObjs_acrossRDM.csv file – this file contains sixty-nine sets of mean performance objective values, each of which maps directly to its corresponding portfolio of actions.

To identify the sensitivity of performance objectives to the DU factors, change lines 121 to 127 of the script (the block flagged with the ‘# to change depending on whether DV or DUF is being analyzed’ comment) to the following:

# to change  depending on whether DV or DUF is being analyzed
dec_vars = duf_numpy[:100,:]
measured_outcomes = objs_mean_soln_df
names = duf_names
mo_names = obj_names
bounds = duf_bounds[:100,:]
rdm = 'DUF'
mode = 'objs'
###

The code above identifies the sensitivity of the average values of all twenty performance objectives over each of the sixty-nine different portfolios to the set of deeply uncertain hydroclimatic and demand scenarios. This is why we use the meanObjs_acrossSoln.csv file – this file contains one hundred sets of mean performance objective values, each of which maps directly to its corresponding DU SOW.

Great job so far! Now let’s visualize the sensitivity of our output to our input parameters using heatmaps. Before being able to visualize each utility’s performance sensitivity, we must first organize the sensitivity indices of the decision variables into a file containing the indices over all objectives for each utility. The gather_delta.py script does this. Simply change the value of mode on line 11 (the mode = 'DV' assignment near the top of the script) to ‘DUF’ to gather the indices for the DU factors.

"""
Created on Tue April 26 2022 16:12

@author: Lillian Bei Jia Lau

Gathers the delta sensitivity indices into files per utility
"""
import numpy as np
import pandas as pd

mode = 'DV'
main_dir = '/yourFilePath/WaterPaths/post_processing/delta_output_' + mode + '/'
utilities = ['_C', '_D', '_R', '_reg']
objs = ['REL', 'RF', 'INF_NPC', 'PFC', 'WCC']
utilities_full = ['Cary', 'Durham', 'Raleigh', 'Regional']

dv_names = ['RT_C', 'RT_D', 'RT_R', 'TT_D', 'TT_R', 'LMA_C', 'LMA_D', 'LMA_R',\
            'RC_C', 'RC_D', 'RC_R', 'IT_C', 'IT_D', 'IT_R', 'IP_C', 'IP_D', \
            'IP_R', 'INF_C', 'INF_D', 'INF_R']

duf_names = ['Cary restr. eff', 'Durham restr. eff', 'Raleigh restr. eff', 'Demand growth\nmultiplier',\
             'Bond term\nmultiplier', 'Bond interest\nrate multiplier', 'Infrastructure interest\nrate multiplier',\
             'Streamflow amp\nmultiplier', 'SCR PT\nmultiplier', 'SCR CT\nmultiplier', 'NRR PT\nmultiplier',\
             'NRR CT\nmultiplier', 'CR Low PT\nmultiplier', 'CR Low CT\nmultiplier', 'CR High PT\nmultiplier',\
             'CR High CT\nmultiplier', 'WR1 PT\nmultiplier', 'WR1 CT\nmultiplier', 'WR2 PT\nmultiplier',\
             'WR2 CT\nmultiplier', 'DR PT\nmultiplier', 'DR CT\nmultiplier', 'FR PT\nmultiplier', 'FR CT\nmultiplier',\
             'DR PT\nmultiplier', 'DR CT\nmultiplier', 'FR PT\nmultiplier', 'FR CT\nmultiplier']

s1_dv_cary = np.zeros((len(objs), len(dv_names)), dtype=float)
s1_dv_durham = np.zeros((len(objs), len(dv_names)), dtype=float)
s1_dv_raleigh = np.zeros((len(objs), len(dv_names)), dtype=float)
s1_dv_regional = np.zeros((len(objs), len(dv_names)), dtype=float)

s1_dv_dict = {
    '_C': s1_dv_cary,
    '_D': s1_dv_durham,
    '_R': s1_dv_raleigh,
    '_reg': s1_dv_regional
}

s1_duf_cary = np.zeros((len(objs), len(duf_names)), dtype=float)
s1_duf_durham = np.zeros((len(objs), len(duf_names)), dtype=float)
s1_duf_raleigh = np.zeros((len(objs), len(duf_names)), dtype=float)
s1_duf_regional = np.zeros((len(objs), len(duf_names)), dtype=float)

s1_duf_dict = {
    '_C': s1_duf_cary,
    '_D': s1_duf_durham,
    '_R': s1_duf_raleigh,
    '_reg': s1_duf_regional
}

for i in range(len(utilities)):
    s1_util = []
    hdrs = []
    if mode == 'DV':
        s1_util = s1_dv_dict[utilities[i]]
        hdrs = dv_names
    elif mode == 'DUF':
        s1_util = s1_duf_dict[utilities[i]]
        hdrs = duf_names

    for j in range(len(objs)):
        curr_file = main_dir + 'S1_' + objs[j] + utilities[i] + '.csv'
        s1_util[j, :] = pd.read_csv(curr_file, sep=',', skiprows=2, header=None).iloc[0,1:]

    s1_util_df = pd.DataFrame(s1_util, columns=hdrs)
    out_filepath = main_dir + utilities_full[i] + '.csv'

    s1_util_df.to_csv(out_filepath, sep=',', index=False)

Now, let’s plot our heatmaps! The code to do so can be found in sensitivity_heatmap.py, and should look similar to the code provided below:

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import AxesGrid

sns.set_theme()

# change depending on compromise solution and whether its sensitivity to DUF or DVs
mode = 'DUF'
rot = 90    # if DV use 0; if DUF use 45
main_dir = '/YourFilePath/WaterPaths/post_processing/delta_output_' + mode + '/'
c_filepath = main_dir + 'Cary.csv'
d_filepath = main_dir + 'Durham.csv'
r_filepath = main_dir + 'Raleigh.csv'
reg_filepath = main_dir + 'Regional.csv'

cary = pd.read_csv(c_filepath, index_col=False, header=0)
durham = pd.read_csv(d_filepath, index_col=False, header=0)
raleigh = pd.read_csv(r_filepath, index_col=False, header=0)
regional = pd.read_csv(reg_filepath, index_col=False, header=0)

savefig_dir = '/YourFilePath/WaterPaths/post_processing/'
savefig_name = savefig_dir + 'dmi_heatmap_' + mode + '.svg'

grid_kws = {"height_ratios": (0.20, 0.20, 0.20, 0.20, .02), "hspace": 0.5}
f, (ax1, ax2, ax3, ax4, cbar_ax) = plt.subplots(5, figsize=(15, 20), gridspec_kw=grid_kws)
plt.subplots_adjust(top = 0.95, bottom = 0.05,
            hspace = 0, wspace = 0.05)

y_objs=['REL', 'RF', 'INPC', 'PFC', 'WCC']

x_dvs=['$RT_{C}$', '$RT_{D}$', '$RT_{R}$', '$TT_{D}$', '$TT_{R}$', '$LM_{C}$', '$LM_{D}$', '$LM_{R}$',\
                '$RC_{C}$', '$RC_{D}$', '$RC_{R}$', '$IT_{C}$', '$IT_{D}$', '$IT_{R}$', '$IP_{C}$', \
                '$IP_{D}$', '$IP_{R}$','$INF_{C}$', '$INF_{D}$', '$INF_{R}$']
x_dufs = ['Cary\nrestr. eff', 'Durham\nrestr. eff', 'Raleigh\nrestr. eff', 'Dem. growth\nmultiplier',\
             'Bond term\nmultiplier', 'Bond interest\nrate multiplier', 'Infra. interest\nrate multiplier',\
             'Streamflow amp\nmultiplier', 'SCR PT\nmultiplier', 'SCR CT\nmultiplier', 'NRR PT\nmultiplier',\
             'NRR CT\nmultiplier', 'CR Low PT\nmultiplier', 'CR Low CT\nmultiplier', 'CR High PT\nmultiplier',\
             'CR High CT\nmultiplier', 'WR1 PT\nmultiplier', 'WR1 CT\nmultiplier', 'WR2 PT\nmultiplier',\
             'WR2 CT\nmultiplier', 'DR PT\nmultiplier', 'DR CT\nmultiplier', 'FR PT\nmultiplier', 'FR CT\nmultiplier',\
             'DR PT\nmultiplier', 'DR CT\nmultiplier', 'FR PT\nmultiplier', 'FR CT\nmultiplier']

x_labs = []
if mode == 'DV':
    x_labs = x_dvs
elif mode == 'DUF':
    x_labs = x_dufs

plt.rc('xtick', labelsize=1)
plt.rc('ytick', labelsize=3)
plt.rc('axes', labelsize=5)
plt.rc('axes', titlesize=14)

#ax1 = fig.add_subplot(411)
ax1.set_title("Cary")
sns.heatmap(cary, linewidths=.05, cmap="YlOrBr", xticklabels=[],
    yticklabels=y_objs, ax=ax1, cbar=False)
ax1.set_yticklabels(y_objs, rotation=0)

#ax2 = fig.add_subplot(412)
ax2.set_title("Durham")
sns.heatmap(durham, linewidths=.05, cmap="YlOrBr", xticklabels=[],
    yticklabels=y_objs, ax=ax2, cbar=False)
ax2.set_yticklabels(y_objs, rotation=0)

#ax3 = fig.add_subplot(413)
ax3.set_title("Raleigh")
sns.heatmap(raleigh, linewidths=.05, cmap="YlOrBr", xticklabels=[],
    yticklabels=y_objs, ax=ax3, cbar=False)
ax3.set_yticklabels(y_objs, rotation=0)

#ax4 = fig.add_subplot(414, fontsize=10)
ax4.set_title("Regional")
ax4 = sns.heatmap(regional, linewidths=.05, cmap="YlOrBr", xticklabels=x_labs,
    yticklabels=y_objs, ax=ax4, cbar=True, cbar_ax=cbar_ax,
    cbar_kws={'orientation': 'horizontal'})     # change depending on whether analyzing DUF or DV
ax4.set_xticklabels(x_labs, rotation=rot, fontsize=10)
ax4.set_yticklabels(y_objs, rotation=0)

plt.savefig(savefig_name)

Running this for the sensitivity to decision variables and DU factors will generate the following images:

Sensitivity of performance objectives to decision variables.

In the figure above, the color of each box represents the sensitivity of a performance objective (y-axis) to a specific decision variable (x-axis). It is interesting to note that the restriction triggers (RT) of all utilities strongly influence each utility’s individual and regional reliability and restriction frequency. This indicates the potential for regional conflict, as errors in operating one utility’s restriction trigger may adversely affect other utilities’ reliabilities and their ability to maintain full control over their own use of water-use restrictions. Furthermore, Raleigh’s performance is sensitive to more decision variables than its two counterparts, with its worst-case cost (WCC) being affected most by Cary’s infrastructure investments. This observation highlights the importance of careful cooperation between a region’s member utilities to ensure that all partners play their part in maintaining both their own and their counterparts’ performance.

Sensitivity of performance objectives to DU factors.

In this next figure, we observe that uncertainty in demand growth is the only DU factor that significantly drives changes in individual and regional performance. This finding can thus help utilities to focus on demand management programs, or formulate operations and management policies that enable them to more quickly adapt to changes in consumer and industrial demand growth.

Overall, in this post, we have performed a simple sensitivity analysis to identify the decision variables and DU factors that most strongly control regional and individual performance. All the code for processing the output data can be found in this GitHub repository here. In the next post, we will end the MORDM blog post series by performing scenario discovery to map regions of success and failure as defined by our robustness metrics.

References

Borgonovo, E. (2007). A new uncertainty importance measure. Reliability Engineering & System Safety, 92(6), 771-784. doi: 10.1016/j.ress.2006.04.015

Herman, J. D., Reed, P. M., Zeff, H. B., & Characklis, G. W. (2015). How should robustness be defined for water systems planning under change? Journal of Water Resources Planning and Management, 141(10), 04015012. doi:10.1061/(asce)wr.1943-5452.0000509

Reed, P.M., Hadjimichael, A., Malek, K., Karimi, T., Vernon, C.R., Srikrishnan, V., Gupta, R.S., Gold, D.F., Lee, B., Keller, K., Thurber, T.B, & Rice, J.S. (2022). Addressing Uncertainty in Multisector Dynamics Research [Book]. Zenodo. https://doi.org/10.5281/zenodo.6110623

A step-by-step tutorial for scenario discovery with gradient boosted trees

Our recently published eBook, Addressing Uncertainty in Multisector Dynamics Research, provides several interactive tutorials for hands-on training in model diagnostics and uncertainty characterization. The purpose of this post is to expand upon these trainings by providing a tutorial demonstrating gradient boosted trees for scenario discovery. I’ll first provide some brief background on scenario discovery and gradient boosted trees, then demonstrate a Python implementation on a water supply planning problem. All code here is written in Python, but the workflow is model-agnostic and can be paired with simulation models in any language. I’ve included my code within the text below, but all code and data for this post can also be found in this git repository.

Scenario discovery and gradient boosted trees

In water resources planning and management, decision makers are often faced with uncertainty about how their system will change in the future. Traditionally, planners have confronted this uncertainty by developing prespecified narrative scenarios, which reduce the multitude of possible future conditions into a small subset of important future states of the world (a prominent example is the ‘scenario matrix framework’ used to evaluate climate change (O’Neill et al., 2014)). While this approach has intuitive appeal, it may increase system vulnerability if future conditions do not evolve as decision makers expect (for a detailed critique of scenario based planning see Reed et al., 2022). This vulnerability is especially apparent for systems facing deep uncertainty, where decision makers do not know or cannot agree upon the probability density functions of key system inputs (Kwakkel et al., 2016).

Scenario discovery (Groves and Lempert, 2007) is an exploratory modeling centered approach that seeks to discover consequential future scenarios using computational experiments rather than relying on prespecified information. To perform scenario discovery, decision makers first identify a set of relevant uncertainties and their plausible ranges. Next, an ensemble of these uncertainties is developed by sampling across parameter ranges. Candidate policies are then evaluated across this ensemble and machine learning or data mining algorithms are used to examine which combinations of uncertainties generate vulnerability in the system. These combinations can then be used to develop narrative scenarios to inform implementation and monitoring efforts or new policy development.

A core element of the scenario discovery process is the algorithm used to classify future states of the world. Popular algorithms include PRIM, CART, and logistic regression. Recently, gradient boosted trees have been applied as an alternative classification algorithm. Gradient boosted trees have advantages over other scenario discovery algorithms because they can easily capture nonlinear and non-differentiable boundaries in the uncertainty space (which are particularly prevalent in water supply planning problems that have discrete capacity expansion options), are highly resistant to overfitting, and provide a clear means of ranking the importance of uncertain factors (Trindade et al., 2020). For a comprehensive overview of gradient boosted trees, see Bernardo’s post here.

Test case: the Sedento Valley

To demonstrate gradient boosted trees for scenario discovery we’ll use the Sedento Valley water supply planning test case (Trindade et al., 2020). In the Sedento Valley, three water utilities seek to discover cooperative water supply management and infrastructure investment portfolios to meet several conflicting objectives in a system facing deep uncertainty. In this post, we’ll investigate how these deep uncertainties (which include demand growth, the efficacy of water use restrictions, financial variables, and parameters governing infrastructure permitting and construction time) impact a utility’s ability to maintain three performance criteria: keeping reliability > 98%, restriction frequency < 20%, and worst-case cost less than 10% of annual revenue. For simplicity, we’ll focus on one regional water utility, Watertown.

Step 1: create a sample of deeply uncertain states of the world

To start the scenario discovery process, we generate an ensemble of deep uncertainties that represent future states of the world (SOWs). Here, we’ll use Latin Hypercube Sampling with an implementation I found in the Surrogate Modeling Toolbox.

import numpy as np
from smt.sampling_methods import LHS

'''
This script will generate 1000 Latin Hypercube Samples (LHS)
of deeply uncertain system parameters for the Sedento Valley
'''


# create an array storing the ranges of deeply uncertain parameters
DU_factor_limits = np.array([
    [0.9, 1.1], # Watertown restriction efficacy 
    [0.9, 1.1], # Dryville restriction efficacy
    [0.9, 1.1], # Fallsland restriction efficacy
    [0.5, 2.0], # Demand growth rate multiplier
    [1.0, 1.2], # Bond term
    [0.6, 1.0], # Bond interest rate
    [0.6, 1.4], # Discount rate
    [0.75, 1.5], # New River Reservoir permitting time
    [1.0, 1.2], # New River Reservoir construction time
    [0.75, 1.5], # College Rock Reservoir (low) permitting time
    [1.0, 1.2], # College Rock Reservoir (low) construction time
    [0.75, 1.5], # College Rock Reservoir (high) permitting time
    [1.0, 1.2], # College Rock Reservoir (high) construction time
    [0.75, 1.5], # Water Reuse permitting time
    [1.0, 1.2], # Water Reuse construction time
    [0.8, 1.2], # Inflow amplitude
    [0.2, 0.5], # Inflow frequency
    [-1.57, 1.57]]) # Inflow phase

# Use the smt package to set up the LHS sampling
sampling = LHS(xlimits=DU_factor_limits)

# We will create 1000 samples
num = 1000

# Create the actual sample
x = sampling(num)

# save to a csv file
np.savetxt('DU_factors.csv', x, delimiter=',')

Step 2: Evaluate performance across SOWs

Next, we’ll evaluate the performance of our policy across the LHS sample of DU factors. For the Sedento Valley test case, we use WaterPaths, an open-source simulation system for integrated water supply portfolio management and infrastructure investment planning (for more see Trindade et al., 2020). This step is not included in the git repository as it requires high-performance computing for this system, but results can be found in the “Model_output.csv” file. For simulation details, see Gold et al., 2022.

Step 3: Convert model output into a boolean array for classification

To perform classification, we need to convert the results of our simulations to a binary array classifying each SOW as a “success” or “failure” based on whether the policy met the performance criteria under the SOW. First, we define a small function to determine if an SOW meets a set of criteria, then we apply this function to our results. We also load the DU factor LHS sample.

# First, define a function to check whether performance criteria are met
def check_criteria(objectives, crit_objs, crit_vals):
    """
    Determines if an objective meets a given set of criteria for a set of SOWs

    inputs:
        objectives: np array of all objectives across a set of SOWs
        crit_objs: the column index of the objective in question
        crit_vals: an array containing [min, max] of the values 
    
    returns:
        meets_criteria: a numpy boolean array indicating which SOWs meet both the min and max criteria

    """
    
    # check max and min criteria for each objective
    meet_low = objectives[:, crit_objs] >= crit_vals[0]
    meet_high = objectives[:, crit_objs] <= crit_vals[1]

    # check if max and min criteria are met at the same time
    meets_criteria = np.hstack((meet_low, meet_high)).all(axis=1)

    return meets_criteria


##### Load data and pre-process #####

# load objectives and create input array of boolean values for SD input
Reeval_objectives = np.loadtxt('Model_output.csv', skiprows=1, delimiter=',')
REL = check_criteria(Reeval_objectives, [0], [.979, 1])
RF = check_criteria(Reeval_objectives, [1], [0, 0.10])
WCC = check_criteria(Reeval_objectives, [2], [0, 0.10])
SD_input = np.vstack((REL, RF, WCC)).all(axis=0)


# load DU factors
DU_factors = np.loadtxt('DU_factors.csv', skiprows=1, delimiter=',')
DU_names = ['Watertown Rest. Eff.', 'Dryville Rest. Eff.', 'Fallsland Rest. Eff.',
            'Demand Growth Rate', 'Bond Term', 'Bond Interest',
            'Discount Rate', 'NRR Perm', 'NRR Const', 'CRR L Perm',
            'CRR L Const.',	'CRR H Perm.', 'CRR H Const.', 'WR1 Perm.',
             'WR1 Const.', 'Inflows A', 'Inflows m','Inflows p']

Step 4: Fit a boosted trees classifier

After we’ve formatted the data, we’re ready to perform boosted trees classification. There are several packages for boosted trees in Python; here we’ll use the implementation from scikit-learn. We’ll use an ensemble of 200 trees with depth 3 and a learning rate of 0.1. These parameters need to be tuned for the individual problem; I found this nice post that goes into detail on parameter tuning.

##### Boosted Tree Classification #####

from sklearn.ensemble import GradientBoostingClassifier

# create a gradient boosted classifier object
gbc = GradientBoostingClassifier(n_estimators=200,
                                 learning_rate=0.1,
                                 max_depth=3)

# fit the classifier
gbc.fit(DU_factors, SD_input)

Step 5: Examine which DU factors have the most impact on performance criteria

Now we’re ready to examine the results of our classification. First, we’ll examine how important each DU factor is to the classification results generated by boosted trees. To rank the importance of each DU factor, we examine the percentage decrease in impurity of the ensemble of trees that is associated with each factor. In plain English, this is a measure of how helpful each DU factor is to correctly classifying SOWs. This information is generated during the fit of the classifier above and is easily accessible as an attribute of our scikit-learn classifier.

For our example, one deep uncertainty, the demand growth rate, clearly stands out as the most influential, as shown in the figure below. A second, the restriction efficacy for Watertown (the utility we’re focusing on), also stands out with a higher level of importance. All other DU factors have little impact on the classification in this case.

##### Factor Ranking #####

from copy import deepcopy
import matplotlib.pyplot as plt

# Extract the feature importances
feature_importances = deepcopy(gbc.feature_importances_)

# rank the feature importances and plot
importances_sorted_idx = np.argsort(feature_importances)
sorted_names = [DU_names[i] for i in importances_sorted_idx]

fig = plt.figure(figsize=(8,8))
ax = fig.gca()
ax.barh(np.arange(len(feature_importances)), feature_importances[importances_sorted_idx])
ax.set_yticks(np.arange(len(feature_importances)))
ax.set_yticklabels(sorted_names)
ax.set_xlim([0,1])
ax.set_xlabel('Feature Importance')
plt.tight_layout()

Step 6: Create factor maps

Finally, we visualize the results of our classification through factor mapping. In the plot below, we show the uncertainty space projected onto the two most influential factors, demand growth rate and restriction efficacy. Each point represents a sampled SOW: red points represent SOWs that resulted in failure, while white points represent SOWs that resulted in success. The color in the background shows the predicted regions of success and failure from the boosted trees classification.

Here we observe that high levels of demand growth are the primary source of vulnerability for the water utility. When restriction efficacy is lower than our estimate (multiplier < 1), the utility faces failures at demand growth levels of about 1.7 times the estimated values. When restriction effectiveness is at or above estimates, the acceptable scaling of demand growth rises to about 1.8.

Taken as a whole, these results provide valuable insights for decision makers. From our original 18 deep uncertainties, we have discovered that two are critical for the success of our water supply management policy. Further, we have identified thresholds within the uncertainty space that delineate the scenarios that lead to failure. We can use this information to inform monitoring efforts for the water supply policy, or to inform a new problem formulation that tailors actions to mitigate these vulnerabilities.

##### Factor Mapping #####

# Select the top two factors discovered above
selected_factors = DU_factors[:, [3,0]]

# Fit a classifier using only these two factors
gbc_2_factors = GradientBoostingClassifier(n_estimators=200,
                                 learning_rate=0.1,
                                 max_depth=3)
gbc_2_factors.fit(selected_factors, SD_input)

# plot prediction contours
x_data = selected_factors[:,0]
y_data = selected_factors[:,1]

x_min, x_max = (x_data.min(), x_data.max())
y_min, y_max = (y_data.min(), y_data.max())

# create a grid to makes predictions on
xx, yy = np.meshgrid(np.arange(x_min, x_max * 1.001, (x_max - x_min) / 100),
                        np.arange(y_min, y_max * 1.001, (y_max - y_min) / 100))
                        
dummy_points = list(zip(xx.ravel(), yy.ravel()))

z = gbc_2_factors.predict_proba(dummy_points)[:, 1]
z[z < 0] = 0.
z = z.reshape(xx.shape)

# plot the factor map        
fig = plt.figure(figsize=(10,8))
ax = fig.gca()
ax.contourf(xx, yy, z, [0, 0.5, 1.], cmap='RdBu',
                alpha=.6, vmin=0.0, vmax=1)
ax.scatter(selected_factors[:,0], selected_factors[:,1],\
            c=SD_input, cmap='Reds_r', edgecolor='grey', 
            alpha=.6, s= 100, linewidth=.5)
ax.set_xlim([.5, 2])
ax.set_ylim([.9,1.1])
ax.set_xlabel('Demand Growth Multiplier')
ax.set_ylabel('Restriction Eff. Multiplier')

References

Gold, D. F., Reed, P. M., Gorelick, D. E., & Characklis, G. W. (2022). Power and Pathways: Exploring Robustness, Cooperative Stability, and Power Relationships in Regional Infrastructure Investment and Water Supply Management Portfolio Pathways. Earth’s Future, 10(2), e2021EF002472.

Groves, D. G., & Lempert, R. J. (2007). A new analytic method for finding policy-relevant scenarios. Global Environmental Change, 17(1), 73-85.

Kwakkel, J. H., Walker, W. E., & Haasnoot, M. (2016). Coping with the wickedness of public policy problems: approaches for decision making under deep uncertainty. Journal of Water Resources Planning and Management, 142(3), 01816001.

O’Neill, B. C., Kriegler, E., Riahi, K., Ebi, K. L., Hallegatte, S., Carter, T. R., … & van Vuuren, D. P. (2014). A new scenario framework for climate change research: the concept of shared socioeconomic pathways. Climatic change, 122(3), 387-400.

Reed, P.M., Hadjimichael, A., Malek, K., Karimi, T., Vernon, C.R., Srikrishnan, V., Gupta, R.S., Gold, D.F., Lee, B., Keller, K., Thurber, T.B., & Rice, J.S. (2022). Addressing Uncertainty in Multisector Dynamics Research [Book]. Zenodo. https://doi.org/10.5281/zenodo.6110623

Trindade, B. C., Gold, D. F., Reed, P. M., Zeff, H. B., & Characklis, G. W. (2020). Water pathways: An open source stochastic simulation system for integrated water supply portfolio management and infrastructure investment planning. Environmental Modelling & Software, 132, 104772.

Milton Friedman’s thermostat and sensitivity analysis of control policies

If you’re reading this blog, you probably already know that correlation does not necessarily imply causation. However, does a lack of correlation necessarily imply a lack of causation? Despite widespread misconception, even by Nobel Laureates, it does not. This is especially true for controlled systems, as explained in Milton Friedman’s thermostat example. The act of controlling a system for stability (e.g., a thermostat that expends energy to stabilize indoor air temperature) will tend to remove correlations between variables that are, in a certain sense, causally related (e.g., between energy expenditure and indoor temperature). More generally, the control of a dynamical system can introduce complex, path-dependent behavior that complicates the analysis of observed data using standard statistical techniques. This blog post will introduce these issues in the context of reservoir operations. I will then connect this to recent developments in using sensitivity analysis to improve understanding of how complex control policies respond to changing information over time, and demonstrate the benefits of using simulated state trajectories rather than random sampling approaches when analyzing controlled systems.

A Jupyter Notebook containing the Python code used for this analysis can be found in this GitHub repository.

Reservoir storage stabilization policy

First, consider a reservoir with a capacity of 20 MG, and daily inflows that average 10 MGD, with moderate persistence. Let’s assume the reservoir operator follows a “storage stabilization” policy that attempts to keep the storage constant at 10 MG. This simply requires the operator to release (S – S’ + I*) each day, where S is the current storage, S’ is the storage target, and I* is the projected inflow over the course of the next time step. How will this system evolve over time? With a perfect forecast, we have a situation analogous to Milton Friedman’s thermostat: the control policy keeps storage constant over time, leading it to become uncorrelated with both inflows and releases. Meanwhile, because storage is unvarying, the release depends only on inflows in a perfect linear relationship.
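To make this concrete, here is a minimal sketch of the stabilization policy with a perfect forecast, in the spirit of the notebook linked above but not its exact code; the AR(1) inflow parameters are my own assumptions.

import numpy as np

# Minimal sketch of the storage stabilization policy with a perfect forecast.
# The AR(1) inflow parameters (rho, sigma) are assumptions, not the notebook's values.
rng = np.random.default_rng(1)
T, capacity, S_target = 1000, 20.0, 10.0   # days, MG, MG
rho, mu, sigma = 0.5, 10.0, 2.0            # inflow persistence, mean (MGD), noise

inflow, storage, release = np.zeros(T), np.zeros(T), np.zeros(T)
S = S_target
for t in range(T):
    I = mu if t == 0 else max(mu + rho * (inflow[t-1] - mu) + rng.normal(0, sigma), 0.0)
    R = S - S_target + I                   # stabilization release, with I* equal to the true inflow
    R = min(max(R, 0.0), S + I)            # cannot release more water than is available
    S = min(S + I - R, capacity)           # spill anything above capacity
    inflow[t], release[t], storage[t] = I, R, S

With these rules, storage sits at the target every day and therefore carries no information, while the release tracks the inflow exactly, reproducing the pattern in Figures 1 and 2.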

Figure 1: Time series of state variables for storage stabilization policy with perfect forecast.
Figure 2: Pairwise relationships between state variables for storage stabilization policy with perfect forecast.

The fact that inflow is uncorrelated with storage may appear paradoxical at first, but it makes sense once we recognize that the storage variable contributes no useful information to the system, since it is unchanging over time.

With imperfect forecasts, the operator sometimes needs to increase the release to avoid overtopping the reservoir, or decrease the release to avoid negative storage. This adds noise to the relationships above, but the reservoir releases remain only minimally correlated with storage.

Figure 3: Pairwise relationships between state variables for storage stabilization policy with imperfect forecasts.

Power production policy

The reservoir storage stabilization policy and Milton Friedman’s thermostat example demonstrate just one particular aspect of a much broader point: the act of controlling a dynamical system will fundamentally alter the trajectories of the system through state-space. This has important implications for how observed data from these systems should be understood.

But first, consider an alternative reservoir control policy that attempts to take advantage of power price information when making releases for the purpose of hydropower production. The power price has a mean of 10 and lower persistence than the inflow distribution. We assume the following simple set of rules: (1) when price is below 8, release S/4 MG; (2) when price is between 8 and 12, release S/2; (3) when price is above 12, release (S + I*/2). Again, we also adjust the release to avoid overtopping the reservoir or incurring negative storages. Despite the simplicity of this policy, it produces surprisingly complex dynamics.
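Translating those three rules into code might look like the sketch below (again, not the notebook's exact implementation), using the same 20 MG capacity as before.

def power_release(S, I_proj, price, capacity=20.0):
    """Sketch of the price-driven release rule described above."""
    if price < 8:
        R = S / 4
    elif price <= 12:
        R = S / 2
    else:
        R = S + I_proj / 2
    # adjust to avoid negative storage or overtopping the reservoir
    R = min(max(R, 0.0), S + I_proj)       # cannot release more water than is available
    R = max(R, S + I_proj - capacity)      # must release enough to avoid exceeding capacity
    return R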

Figure 4: Time series of state variables for power production policy with perfect forecast.
Figure 5: Pairwise relationships between state variables for power production policy with imperfect forecasts.

In the bottom row of Figure 5, we see interesting and highly non-linear relationships driving the release policy as a function of the projected inflow, storage, and power price. Additionally, the relationships between the state variables themselves (i.e., power price and storage) can display non-linear and thresholding-type behavior. Comparing the storage stabilization policy and the power production policy underscores the importance of the control process in dictating the dynamics of the system and the regions of state-space that are explored.

Figure 6: Comparison of state-space inhabited by storage stabilization policy and power production policy.

Implications for sensitivity analysis of control policies

So what does any of this have to do with sensitivity analysis? Recent advances in adaptive control (e.g. Direct Policy Search, Reinforcement Learning) have allowed for the development of complex operating policies for managing water resource systems under uncertainty by adapting to evolving conditions over time (see recent reviews by Giuliani et al., 2021, and Herman et al., 2020). These operating policies can be built from Artificial Neural Networks, Radial Basis Functions, Decision Trees, or other functional forms, and can exhibit highly non-linear behavior. This complicates efforts to understand how they work “under the hood,” leading to skepticism from some researchers and decision-makers who prefer simpler and more transparent operating policies. Recent work has attempted to “open the black box” by combining sensitivity analysis and visualization in order to illuminate how different operating policies monitor and respond to different sources of information (e.g., Quinn et al., 2019; Hamilton et al., 2022).

In most applications of global sensitivity analysis, researchers want to understand the impacts of uncertain model parameters (e.g., soil permeability) or inputs (e.g., precipitation) on model outputs (e.g., runoff) (see Pianosi et al., 2016, for a review of sensitivity analysis in environmental models). Importantly, in general the inputs to the sensitivity analysis are considered to be exogenous factors and forcings that do not respond to the dynamics of the rest of the system model. For this reason, (quasi)random sampling approaches are often used to generate samples from across the feasible range of each parameter.

However, when sensitivity analysis is applied to the problem of better understanding adaptive control policies (e.g., reservoir releases), then the policy inputs may be either exogenous (e.g., inflows, power prices) or endogenous (e.g., storage). The latter, as we have seen, can cause highly non-linear and path-dependent dynamics within the system, resulting in simulated trajectories that are not uniformly distributed across state space. Any non-uniformity in the distribution of states is potentially valuable “information” about the dynamics of the control policy and the broader system, which we would like to capture with our sensitivity analysis. For this reason, it is preferable to use simulated system trajectories as inputs to the sensitivity analysis, rather than randomly generated inputs, in order to ensure that our sensitivity analysis reflects the actual data generating process of the system.

Figure 7: Comparison of state-space inhabited by storage stabilization policy, using randomly generated input data vs. simulated state trajectories.
Figure 8: Comparison of state-space inhabited by power production policy, using randomly generated input data vs. simulated state trajectories.

Due to the non-linear, non-independent, non-normally distributed nature of these simulated data, many common variance-based global sensitivity analysis methods may not be appropriate. However, moment-independent methods such as information theoretic sensitivity analysis and the delta moment-independent method can help overcome some of these challenges. See Hamilton et al., 2022, Hadjimichael et al., 2020, and references therein, as well as this blog post by Keyvan Malek, for more discussion of these issues.

For the sake of simplicity, I use R-squared as a simple indicator of variance-based sensitivity rather than more sophisticated measures such as Sobol sensitivity. I also apply a discretized version of information theoretic sensitivity analysis (ITSA). Both indices range between 0 and 1, but they cannot be directly compared to each other (e.g., R2 is often higher than ITSA given the same data). Tables 1 and 2 show the sensitivity indices for the two operating policies.

                  ITSA_sim   R2_sim   ITSA_unif   R2_unif
Projected inflow    0.52      0.95       0.26       0.69
Storage             0.06      0.04       0.14       0.29
Power price         0.05      0.02       0.07       0.00
Table 1: Comparison of sensitivity indices for policy inputs governing reservoir releases, under the storage stabilization policy.

                  ITSA_sim   R2_sim   ITSA_unif   R2_unif
Projected inflow    0.29      0.48       0.14       0.30
Storage             0.41      0.45       0.17       0.31
Power price         0.09      0.01       0.12       0.21
Table 2: Comparison of sensitivity indices for policy inputs governing reservoir releases, under the power production policy.

The most obvious contrast is between the uniformly sampled data and the simulated data; the former tends to estimate significantly lower sensitivity. This is due to the missing information on the path-dependent relationships between states within the system, as a result of using randomly sampled input data rather than actual simulated trajectories. Another interesting comparison is ITSA_sim vs R2_sim for the power production policy. While R2 classifies projected inflow and storage as having roughly equivalent influence on release decisions, ITSA finds substantially more influence from storage. This finding makes sense when comparing the two relationships in Figure 5; both have similar shapes and variances if you squint, but the storage-release relationship has more well-defined fine structure that cannot be discerned by variance-based approaches.
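For readers who want to compute something like these indices themselves, here is a rough sketch of an R-squared index and a discretized, histogram-based ITSA index computed from simulated trajectories. The normalization of the mutual information by the output entropy and the bin count are my own choices, not necessarily those used for the tables above; the arrays could be, for example, the inflow, storage, and release series simulated by the policies sketched earlier.

import numpy as np

def r2_index(x, y):
    # variance-based proxy: squared Pearson correlation between a policy input and the release
    return np.corrcoef(x, y)[0, 1] ** 2

def itsa_index(x, y, bins=20):
    # discretized information-theoretic index: mutual information I(X; Y),
    # normalized here by the entropy of the output, H(Y)
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    mi = np.sum(pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz]))
    hy = -np.sum(py[py > 0] * np.log(py[py > 0]))
    return mi / hy

# e.g., r2_index(inflow, release) and itsa_index(storage, release) on simulated trajectories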

References

Giuliani, M., Lamontagne, J. R., Reed, P. M., & A. Castelletti. (2021). A State-of-the-Art Review of Optimal Reservoir Control for Managing Conflicting Demands in a Changing World. Water Resources Research, 57, e2021WR029927. https://doi.org/10.1029/2021WR029927

Hadjimichael, A., Quinn, J., & P. Reed. (2020). Advancing diagnostic model evaluation to better understand water shortage mechanisms in institutionally complex river basins. Water Resources Research, 56, e2020WR028079. https://doi.org/10.1029/2020WR028079

Hamilton, A. L., Characklis, G. W., & P. M. Reed. (2022). From stream flows to cash flows: Leveraging Evolutionary Multi-Objective Direct Policy Search to manage hydrologic financial risks. Water Resources Research, 58, e2021WR029747. https://doi.org/10.1029/2021WR029747

Herman, J. D., Quinn, J. D., Steinschneider, S., Giuliani, M., & S. Fletcher (2020). Climate adaptation as a control problem: Review and perspectives on dynamic water resources planning under uncertainty. Water Resources Research, 56, e24389. https://doi.org/10.1029/2019WR025502

Pianosi, F., Beven, K., Freer, J., Hall, J. W., Rougier, J., Stephenson, D. B., & T. Wagener. (2016). Sensitivity analysis of environmental models: A systematic review with practical workflow. Environmental Modelling & Software, 79, 214-232. http://dx.doi.org/10.1016/j.envsoft.2016.02.008

Quinn, J. D., Reed, P. M., Giuliani, M., & A. Castelletti (2019). What is controlling our control rules? Opening the black box of multireservoir operating policies using time-varying sensitivity analysis. Water Resources Research, 55, 5962–5984. https://doi.org/10.1029/2018WR024177

ggplot (Part 3) – Animating Sensitivity Analysis Indices

For part three of this introduction to ggplot, I will go over an example of a user-friendly library that can easily animate your plots: “gganimate”. It works with different types of graphs; I will apply it to a bar plot in order to visualize variations in sensitivity indices. This is helpful for time-varying sensitivity analysis, which, compared to analyzing only aggregated sensitivity indices, prevents information loss and provides more insight into system behavior. You can download the test-case data from here. These are first and total order sensitivity indices corresponding to annual crop yield for several parameters, categorized into different groups (such as parameters related to estimating crop phenology and biomass, or parameters related to capturing the effect of temperature or soil hydrological variables on the yield). We can create a subset of just one year of data to explore, using the general commands to make a bar plot based on different groups:

library(ggplot2)
library(gganimate)

annual_S1T<- read.csv("(your directory)/testCase_ST_S1.csv",sep="\t")
head(annual_S1T)
#   S1_sobol   ST_sobol Year Parameters Group_orig Group
#1 0.007892755 0.01102304 1990         a1  Hydrology     A
#2 0.019468069 0.02543356 1998         a1  Hydrology     A
#3 0.004058817 0.00962275 1999         a1  Hydrology     A

one_year<- subset(annual_S1T,annual_S1T$Year==2000)
g1<- ggplot(one_year, aes(x=Parameters, y=ST_sobol,fill=Group))+
  geom_bar(stat = "identity", position=position_dodge()) +
  theme(panel.background = element_blank(), axis.line = element_line(size = 0.5, linetype = "solid",colour = "black"),
        panel.grid.major.y = element_blank(),
        axis.title=element_text(size=12,face="bold"),
        plot.title = element_text(size=16),plot.subtitle=element_text(size=18, hjust=0.5,  color="black"),
        axis.text.y=element_text(size = 12,colour = "black"),
        axis.text.x=element_text(size=12,colour = "black"),
    legend.text=element_text(size=18),legend.title=element_text(size=20),legend.key.height=unit(3,"line"))+
  labs(x ="Parameters",y="Total Order Indices",fill="Group")

We can flip Cartesian coordinates so that the horizontal becomes vertical and the vertical becomes horizontal, by adding coord_flip(). Next, we will use the whole dataset to visualize the annual variations in total order indices. By adding transition_time() and specifying the column corresponding to the variations (in our case, “Year”), we can animate our graph. We can also add a label for each time frame in the “labs/title” section.

g2<- ggplot(annual_S1T, aes(x=Parameters, y=ST_sobol,fill=Group)) + 
  geom_bar(stat = "identity", position=position_dodge())  + 
  theme(panel.background = element_blank(), axis.line = element_line(size = 0.5, linetype = "solid",colour = "black"),
        panel.grid.major.y = element_blank(),
        axis.title=element_text(size=12,face="bold"),
        plot.title = element_text(size=16),plot.subtitle=element_text(size=18, hjust=0.5,  color="black"),
        axis.text.y=element_text(size = 12,colour = "black"),
        axis.text.x=element_text(size=12,colour = "black",angle = 90),
        legend.text=element_text(size=18),legend.title=element_text(size=20),legend.key.height=unit(2,"line"))+
  labs(x ="Parameters",y="Total Order Indices",fill="Group",title = 'Year: {frame_time}')+
  transition_time(Year) 

With the animate() function, we can specify how long to wait before changing frames in the animated graph, and how long to pause between repetitions of the time series. The result is a gif file that can then be written to disk (for example, with anim_save()).

It is also possible to compare the three different sensitivity indices. Total order and first order indices are provided in the dataset; by subtracting the first order indices from the total order, we can calculate the total interactions each parameter has with the rest.

annual_S1T$Total_Interaction<- annual_S1T$ST_sobol-annual_S1T$S1_sobol

One of the best ways to provide data to ggplot functions is in the long format produced by the melt function; this is, in fact, the format that ggplot prefers. In these cases, we can reshape the data frame (using the reshape2 library) so that a single ggplot command can produce several layout panels or different types of graphs. The melt() function reshapes the data frame so that each index value appears in its own row instead of in separate columns, along with the corresponding year, group, and parameter label that the value belongs to. To use the melt function, we need to specify the columns that identify the data values, which we do with the id.vars argument. More information about the reshape2 package can be found here.

library(reshape2)
annual_S1T_melted<- melt(annual_S1T, id.vars = c("Parameters","Group","Year"), measure.vars = c("Total_Interaction", "S1_sobol","ST_sobol"))
head(annual_S1T_melted)
#  Parameters Group Year          variable       value
#1         a1     A 1990 Total_Interaction 0.003130285
#2         a1     A 1998 Total_Interaction 0.005965492
#3         a1     A 1999 Total_Interaction 0.005563933
#4         a1     A 1991 Total_Interaction 0.007473186
#5         a1     A 1995 Total_Interaction 0.006950200
#6         a1     A 1992 Total_Interaction 0.001914845

Now, we can create the same graph that we saw above for three variables. In this case, we can use facet_grid().

ggplot(annual_S1T_melted, aes(x=Parameters, y=value,fill=Group)) + facet_grid(~variable)+
geom_bar(stat = "identity", position=position_dodge())  +
  coord_flip() +
  theme(panel.background = element_blank(), axis.line = element_line(size = 0.5, linetype = "solid",colour = "black"),
        panel.grid.major.y = element_blank(),
        axis.title=element_text(size=12,face="bold"),
        plot.title = element_text(size=16),plot.subtitle=element_text(size=18, hjust=0.5,  color="black"),
        axis.text.y=element_text(size = 12,colour = "black"),
        axis.text.x=element_text(size=12,colour = "black",angle = 90),
        legend.text=element_text(size=18),legend.title=element_text(size=20),legend.key.height=unit(2,"line"))+
  labs(x ="Parameters",y="Total Order Indices",fill="Group",title = 'Year: {frame_time}')+
  transition_time(Year) 

How to make horizon plots in Python

Horizon plots were invented about a decade ago to facilitate visual comparison between two time series. They are not intuitive to read right away, but they are great for comparing and presenting many sets of time series together. They can take advantage of a minimal design by avoiding titles and ticks on every axis and packing them close together to convey a bigger picture. The example below shows percent changes in the price of various food items over 25 years.

The way they are produced and read is by dividing the values along the y-axis into bands based on ranges. The color of each band is given by a divergent color map. By collapsing the bands to the zero axis and layering the higher bands on top, one can create a time-varying heatmap of sorts.

Source: http://idl.cs.washington.edu/papers/horizon

I wasn’t able to find a script that could produce this in Python, besides some code in this github repository, which is about a decade old and cannot really run in Python 3. I cleaned it up and updated the scripts with some additional features. I also added example data comparing USGS streamflow data with model simulation data for the same locations over 38 years. The code can be found here and can be used with any two datasets that one would like to compare, with as many points of comparison as needed (I used eight below, but the script can accept larger csv files with more or fewer comparison points, which will be detected automatically). The script handles the transformation of the data to uniform bands and produces the following figure, with every subplot comparing model output with observations at one of eight gauges, i.e., model prediction error. When the model is overpredicting, the area is colored blue; when the model is underpredicting, the area is colored red. Darker shades indicate further divergence from the zero axis. The script automatically uses three bands for both positive and negative divergence, but more can be added, as long as the user defines additional colors to be used.
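To illustrate the band-collapsing idea on its own, here is a toy sketch (not the repository’s script); the series, band count, and colors are all arbitrary choices.

import numpy as np
import matplotlib.pyplot as plt

# Toy horizon plot: fold one error series into three positive and three negative bands.
rng = np.random.default_rng(0)
error = rng.normal(0, 1, 500).cumsum() / 10        # stand-in for simulated minus observed flow

nbands = 3
band = np.abs(error).max() / nbands                # uniform band width
blues = ['#c6dbef', '#6baed6', '#2171b5']          # over-prediction shades
reds = ['#fcbba1', '#fb6a4a', '#cb181d']           # under-prediction shades

fig, ax = plt.subplots(figsize=(8, 1.5))
t = np.arange(len(error))
for i in range(nbands):
    lo, hi = i * band, (i + 1) * band
    # clip the series to this band and collapse it onto the zero axis;
    # higher bands are drawn last, so they layer on top of the lower ones
    ax.fill_between(t, 0, np.clip(error, lo, hi) - lo, color=blues[i])
    ax.fill_between(t, 0, np.clip(-error, lo, hi) - lo, color=reds[i])
ax.set_yticks([])
plt.show()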

Using this type of visualization for these data allows for time-varying comparisons of multiple locations in the same basin. Its benefit is best exploited with many subplots that together make up a bigger picture.

Future extensions in this repository will include code to accept more file types than csv, more flexibility in how the data is presented and options to select different colormaps when executing.

Displaying Interactions with chordDiagram in R

Sensitivity analysis is a powerful tool for finding out which parameters in a model have the largest effect on the results. Besides the impact of the main parameters, exploring the interactions between parameters is also important, because complex systems usually have many active interactions. Learning from them improves our understanding of how a model behaves under different underlying conditions, which can in turn improve our inferences from the model’s results and our model evaluation. Previously, Antonio showed how to create a Radial convergence plot for visualizing Sobol indices in Python. In this blog post, I am going to use a library in R to create a Chord diagram for Sobol interaction indices for annual baseflow. Then, I will create an animated GIF file that shows the interactions for a few years. The dataset (download from here) that I am using came from the simulation of an agro-biophysical model that has 33 parameters.

install.packages("circlize")
library(circlize)
baseflow_S2<- read.csv(paste("...(your path).../sobol_Baseflow_S2_1989.csv",sep=""))
baseflow_S2$X<- NULL

To work with this library, the format of the data should be a matrix or list.

baseflow_S2<- as.matrix(baseflow_S2)

As I mentioned above, the interactions are between 33 parameters, so we have 33 rows and columns in this dataset. I am going to assign a name to each row and column based on the actual order in the dataset:

rownames(baseflow_S2) = c("H_1","H_2","H_3","H_4","B_1","B_2","B_3","B_4","B_5","W_4","W_5","T_4","W_1","W_2","W_6","W_7","Phy_1","Phy_2","T_1","T_2","W_3","Phy_3","Phy_4","T_3","Phy_5","Phy_6","H_5","Ph_1","Ph_2","Ph_3","Ph_4","Ph_5","Ph_6")                   
colnames(baseflow_S2)= c("H_1","H_2","H_3","H_4","B_1","B_2","B_3","B_4","B_5","W_4","W_5","T_4","W_1","W_2","W_6","W_7","Phy_1","Phy_2","T_1","T_2","W_3","Phy_3","Phy_4","T_3","Phy_5","Phy_6","H_5","Ph_1","Ph_2","Ph_3","Ph_4","Ph_5","Ph_6") 	       

Now, we are going to assign a color for each parameter. For better visualization and interpretability of the results, these parameters are classified into six main groups (H, Ph, B, T, W, and Phy) based on how they are related to the model. I am going to assign the same color to all of the parameters within each group.

grid.col=c(H_1="deepskyblue4",H_2="deepskyblue4",H_3="deepskyblue4",H_4="deepskyblue4",H_5="deepskyblue4",Ph_1="darkorange3",Ph_2="darkorange3",Ph_3="darkorange3",Ph_4="darkorange3",Ph_5="darkorange3",Ph_6="darkorange3",B_1="brown4",B_2="brown4",B_3="brown4",B_4="brown4",B_5="brown4",T_1="coral2",T_2="coral2",T_3="coral2",T_4="coral2",W_1="cyan3",W_2="cyan3",W_3="cyan3",W_4="cyan3",W_5="cyan3",W_6="cyan3",W_7="cyan3",Phy_1="darkgreen",Phy_2="darkgreen",Phy_3="darkgreen",Phy_4="darkgreen",Phy_5="darkgreen",Phy_6="darkgreen")

Now we can look at the plot:

interactions_plot<- chordDiagram(baseflow_S2,grid.col = grid.col )

As we see, there are many small interactions between parameters. Therefore, you may want to set a threshold so that only the larger interactions are shown, or you may just want to clean up the interactions that are smaller than a specific value. To do this, we can use the object returned by the function above, which already contains the final, formatted dataset created within the function, and then modify it based on what we want to show. Although “interactions_plot” draws a plot, we can also look at its data frame; its “col” column holds the color code assigned to each interaction, and we can extract that column.

 col2<- interactions_plot$col

Then we get the indices of the values that are less than, for example, 0.005:

idx <- which(interactions_plot$value2 < 0.005, arr.ind = TRUE)

And we can replace the actual color in “col2” with “transparent” for the rows whose indices were selected above.

col2[idx] <- 'transparent'

Now, we can create a final graph with some more adjustments. With “order,” we can manually sort the grid (sectors). “link.sort” and “link.decreasing” are used to sort the links based on the value of the interaction (shown by the width of the link), which might help in detecting interesting interactions. Setting “annotationTrack” to “grid” prints the sectors, and “preAllocateTracks” pre-allocates the track. “circos.track” then creates plotting regions for a track: track.index is the index of the track that is going to be updated, and “panel.fun” is used to add graphics and customize sector labels (x and y correspond to the data points in each cell).

chordDiagram(baseflow_S2,order = c("H_1","H_2","H_3","H_4","H_5","Ph_1","Ph_2","Ph_3","Ph_4","Ph_5","Ph_6","B_1","B_2","B_3","B_4","B_5","T_1","T_2","T_3","T_4","W_1","W_2","W_3","W_4","W_5","W_6","W_7","Phy_1","Phy_2","Phy_3","Phy_4","Phy_5","Phy_6"),grid.col = grid.col ,col=col2, link.sort = TRUE, link.decreasing = TRUE,annotationTrack = "grid", preAllocateTracks = list(1))
title(main=paste(1989),adj=0)
    circos.track (track.index = 1, panel.fun = function(x, y) {
      xlim = get.cell.meta.data("xlim")
      ylim = get.cell.meta.data("ylim")
      sector.name = get.cell.meta.data("sector.index")
      circos.text(mean(xlim), ylim[1]+.1, sector.name, facing = "clockwise", niceFacing = TRUE, adj = c(-0.5,0.5),cex = 0.6)
      circos.axis(h = "top", labels.cex = 0.6, major.tick.percentage = 0.2, track.index = 2)}, bg.border = NA) 

We may want to add another layer to highlight a specific sector and add a group name. In this case, we can use “highlight.sector” to specify the color, label, and components of this new layer. With “padding,” we can also change the width of this layer.

highlight.sector(rownames(baseflow_S2[c("H_1","H_2","H_3","H_4","H_5"),]), track.index = 1, col = "deepskyblue4",text = "H", cex = 0.8, text.col = "black", niceFacing = TRUE,padding = c(-0.9, 0, 0.27, 0),font=2) 

highlight.sector(rownames(baseflow_S2[c("Ph_1","Ph_2","Ph_3","Ph_4","Ph_5","Ph_6"),]), track.index = 1, col = "darkorange3", text = "PH", cex = 0.8, text.col = "black", niceFacing = TRUE,padding = c(-0.9, 0, 0.27, 0),font=2)

highlight.sector(rownames(baseflow_S2[c("T_1","T_2","T_3","T_4"),]), track.index = 1, col = "coral2", text = "T", cex = 0.8, text.col = "black", niceFacing = TRUE,padding = c(-0.9, 0, 0.27, 0),font=2)   

highlight.sector(rownames(baseflow_S2[c("Phy_1","Phy_2","Phy_3","Phy_4","Phy_5","Phy_6"),]), track.index = 1, col = "darkgreen",text = "PHY", cex = 0.8, text.col = "black", niceFacing = TRUE,padding = c(-0.9, 0, 0.27, 0),font=2)

highlight.sector(rownames(baseflow_S2[c("B_1","B_2","B_3","B_4","B_5"),]), track.index = 1, col = "brown4", text = "B", cex = 0.8, text.col = "black", niceFacing = TRUE,padding =c(-0.9, 0, 0.27, 0),font=2) 

highlight.sector(rownames(baseflow_S2[c("W_1","W_2","W_3",	"W_4","W_5","W_6","W_7"),]), track.index = 1, col = "cyan3",text = "W", cex = 0.8, text.col = "black", niceFacing = TRUE,padding = c(-0.9, 0, 0.27, 0),font=2)

And here is the final plot:

To save the graph, use this line of code before calling chordDiagram():

png(file = paste("…(your path)…./sobol_Baseflow_S2_1989.png",sep="") ,height = 7, width = 7, units = "in", res = 500) 

And at the end, run dev.off(). Now, I am going to run this code for all of the other years and save the plots to make a GIF file that shows the interactions for all years. To make an animation from these plots, we first need to install the “magick” package.

This simple code reads the first image and then adds the rest of the images to that in a loop. You can also add a message for each image. (Here, I added the year.) The “delay” option is used to adjust the delay after each frame.   

install.packages("magick")
library(magick)
img <- image_read(path = paste0("...(your path).../sobol_Baseflow_S2_1989.png"))
years<- as.data.frame(1990:1997)

for(Y in 1:nrow(years)) {
img0 <- image_read(path = paste0("...(your path).../sobol_Baseflow_S2_",years[Y,],".png"))
img <- c(img, img0) 
}  
img1 <- image_scale(image = img, geometry = "720x720")
ani0 <- image_animate(image = img1, delay = 200)
image_write(image = ani0, path = paste0("...(your path).../sobol_Baseflow_S2.gif"))

Information Theory and Moment-Independent Sensitivity Indices

In this post, I will go over the concept of information theory and its relevance to sensitivity analysis. Then, I will talk about two other methods that are closely related to the concept of information theory. These methods are also often categorized as moment-independent sensitivity analysis methods.

What are the advantages of moment-independent sensitivity analysis indices?

Many methods exist for sensitivity analysis: for example, variance-based methods, such as the one proposed by Sobol (1990), clarify how much of the variance of the outputs can be explained by each input variable (Saltelli et al., 2008). They have been broadly implemented, and numerous studies show their capabilities. However, as Auder and Iooss (2008) and Zadeh et al. (2017) explain, because these approaches are not moment-independent, variance-based methods might be inefficient for heavy-tailed distributions or when the outputs tend to have outliers. They also become less reliable when they encounter multi-modal distributions. To respond to these issues, moment-independent methods have been developed that are based on PDFs and CDFs. In this blog post, I go over some of them.

But before I introduce these methods, I would like to mention that there are studies showing that these moment-independent methods can be outperformed by other sensitivity analysis methods (e.g., variance-based Sobol). For example, Puy et al. (2020) show that a moment-independent method (i.e., PAWN, which is explained below) does not necessarily produce better answers. Also, Auder and Iooss (2008) discuss how, at least in some cases, moment-independent methods can significantly increase computational costs.

Entropy-Based Methods

Shannon, the founder of information theory, first introduced the concept of entropy in his famous 1948 paper, A Mathematical Theory of Communication. To put it simply, entropy is the opposite of information. Higher entropy indicates higher uncertainty. In the last several decades, the concept of entropy has found its way into many scientific areas, including economics, statistics, engineering, environmental science, and machine learning.

A famous example that can help to explain the concept of entropy is coin tossing. Let’s say that you have two coins. One of them is a fair coin, meaning that the probabilities of getting heads or tails are equal (P=0.5). The other coin is not fair, as the probability of getting heads (P=0.8) is higher than that of getting tails (P=0.2). Now, you are going to flip one of these coins; which one has a higher uncertainty? The answer is that the fair coin has a higher entropy. Entropy has also been described as a measure of surprise. In our coin example, a coin with higher entropy has a higher chance to surprise you. The following explains some of the basic concepts related to information theory.

Entropy

The following formula can be used to calculate entropy:
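In standard notation, for a discrete random variable X with probability mass function p(x):

H(X) = -\sum_{x} p(x) \log p(x)

Applied to the coin example above (using base-2 logarithms), the fair coin gives H = -(0.5\log_2 0.5 + 0.5\log_2 0.5) = 1 bit, while the unfair coin gives H = -(0.8\log_2 0.8 + 0.2\log_2 0.2) \approx 0.72 bits, so the fair coin is indeed the more uncertain of the two.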

Conditional Entropy

Another important concept in information theory is conditional entropy. Conditional entropy is the amount of information needed to estimate a random variable Y when a random variable X is known. Intuitively, low-conditional entropy implies higher dependence of random variable Y on random variable X. The following formula can be used to calculate conditional entropy:
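In the discrete case, with joint distribution p(x, y) and conditional distribution p(y | x), this takes the standard form:

H(Y \mid X) = -\sum_{x}\sum_{y} p(x, y) \log p(y \mid x)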

Mutual Information

Mutual information is another concept that is closely related to entropy and explains how much of the information in Y can be estimated when X is known. The following shows how mutual information is related to entropy and conditional entropy:
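In standard notation:

I(X; Y) = H(Y) - H(Y \mid X) = \sum_{x}\sum_{y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)}

In other words, mutual information is the reduction in the entropy of Y once X is known.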

Entropy-Based Sensitivity Analysis

The concept of entropy has been used in sensitivity analysis. There are two popular entropy-based indices that have been used for sensitivity analysis: i) Krzykacz-Hausmann (2001) and ii) Liu et al. (2006). Both of them are based on analysis of the PDFs of inputs and outputs. In this blog post, I will only explain the first one. Liu et al. adapted the Kullback-Leibler (K-L) relative entropy concept, which is conceptually similar but has a more complex mathematical formulation that can be solved using Monte Carlo sampling. More information on K-L sensitivity methods can be found in Liu et al. (2006) and Auder and Iooss (2008).

Krzykacz-Hausmann Sensitivity Index

This index uses the direct definitions from Shannon’s information theory and is calculated with the following formula:

 Higher values of η indicate higher sensitivity of RV Y to RV X.

Moment-Independent Methods

There are some other moment-independent methods that have been widely used in recent years, and I am including two of them here: i) PAWN and ii) δ-moment independent.

PAWN

PAWN is another moment-independent sensitivity analysis metric, developed by Pianosi and Wagener (2015). The main difference between PAWN and other moment-independent approaches is that PAWN calculates the difference between Cumulative Distribution Functions (CDFs) instead of PDFs. The main advantage is that CDFs are generally easier and faster to calculate. PAWN can also be used to focus on specific parts of the distribution.

To calculate the index, they use Kolmogorov–Smirnov statistics, which measure the distance between the conditional CDF and the unconditional CDF.
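In LaTeX form, the Kolmogorov–Smirnov statistic and the resulting PAWN index are commonly written as:

KS(x_i) = \max_{y} \left| F_y(y) - F_{y \mid x_i}(y) \right|, \qquad T_i = \underset{x_i}{\mathrm{stat}} \left[ KS(x_i) \right]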

where KS(xi) is the Kolmogorov–Smirnov index for factor xi, Fy is the unconditional CDF, Fy|x is the conditional CDF, and stat is either the maximum or the median of the values calculated for the different values that xi was conditioned on. The PAWN index Ti varies between 0 and 1, with higher values indicating a more important factor. PAWN can also be used to zoom into a specific range of the output space, as explained in Pianosi and Wagener (2015).

δ-Moment Independent Method

The delta (δ) moment-independent method is conceptually similar to the PAWN method. The main difference is that, instead of the CDF curve, it calculates the difference between unconditional and conditional PDFs. The method was first introduced by Borgonovo (2006) and has become very popular ever since. The following equation is used to calculate the δ index.
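In Borgonovo’s notation, with f_Y the unconditional output PDF and f_{Y|X_i} the conditional one, the index is half the expected area between the two densities:

\delta_i = \frac{1}{2}\, E_{X_i}\left[ \int \left| f_Y(y) - f_{Y \mid X_i}(y) \right| \, dy \right]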

The δ index has all of the advantages of the PAWN index, but in many cases, it can be more computationally expensive.

In my next two blog posts, I will introduce some open-source moment-independent and entropy-based software packages and will give you some practical examples. I will also go over the application of information theory in causal analysis and inference.

Determining the appropriate number of samples for a sensitivity analysis

Sensitivity analysis aims at assessing the relative contributions of the different sources of uncertainty to the variability in the output of a model. There are several mathematical techniques available in the literature, with variance-based approaches being the most popular, and variance-based indices the most widely used, especially “total effects” indices. Literature also provides several estimators/approximators for these indices (reviews can be found here [1]), which typically need N = n × (M + 2) model evaluations (and resulting outputs), where M is the number of uncertain inputs and n is some factor of proportionality that is selected ad hoc, depending on the complexity of the model (e.g. linear or not, monotonic or not, continuous or not). [Note: Literature typically refers to n as the number of samples and to N as the number of model evaluations, and this is how I’ll be using them also.]

The generation of this set of model runs of size N is by far the most computationally demanding step in the calculation of variance-based indices, and a major limitation to their applicability, as n is typically on the order of thousands [2] (n typically > 1000). For example, consider a model with 5 uncertain inputs that takes 1 minute to run. To appropriately estimate the sensitivity indices for such a model, we would potentially need about N = 7000 model evaluations, which would take almost 5 days on a personal computer, excluding the time for the estimator calculation itself (which is typically very short).
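As a quick back-of-the-envelope check of those numbers, here is a minimal sketch in Python (the values of M, n, and the runtime per evaluation are just the assumptions from the example above):

# Rough cost estimate for the example above
M = 5                      # number of uncertain inputs
n = 1000                   # base sample size (factor of proportionality)
minutes_per_run = 1        # assumed runtime of a single model evaluation

N = n * (M + 2)            # total model evaluations needed: 7000
days = N * minutes_per_run / 60 / 24
print(N, round(days, 1))   # 7000 evaluations, ~4.9 days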

The aim is therefore to pick the minimum n needed to ensure that our index estimates are reliable. Unfortunately, there is no hard and fast rule on how to do that, but the aim of this post is to shed some light on that question and provide some guidance. I am going off the work presented here [3] and elsewhere: the idea is to perform the estimation of the sensitivity indices repeatedly, using an increasing n, until the index values converge.

I will be doing this using a fishery model, which is a nonlinear and nonmonotonic model with 9 parameters. Based on previous results suggesting that 3 of these parameters are largely inconsequential to the variability in the output, I’ll be fixing them to their nominal values. I’ll be using SALib to perform the analysis. My full script can be found here, but I’ll only be posting the most important snippets of code in this post.

First, I set up my SALib ‘problem’ and create arrays to store my results:

# Import the necessary libraries (fish_game, the model function, is defined in the full script linked above)
import numpy as np
from SALib.sample import saltelli
from SALib.analyze import sobol

# Set up dictionary with system parameters
problem = {
  'num_vars': 6,
  'names': ['a', 'b', 'c', 'h', 'K', 'm'],
  'bounds': [[0.002, 2],
             [0.005, 1],
             [0.2, 1],
             [0.001, 1],
             [100, 5000],
             [0.1, 1.5]]
}

# Array with n's to use
nsamples = np.arange(50, 4050, 50)

# Arrays to store the index estimates
S1_estimates = np.zeros([problem['num_vars'], len(nsamples)])
ST_estimates = np.zeros([problem['num_vars'], len(nsamples)])

I then loop through all possible n values and perform the sensitivity analysis:

# Loop through all n values, create sample, evaluate model and estimate S1 & ST
for i in range(len(nsamples)):
    print('n = ' + str(nsamples[i]))
    # Generate samples with the Saltelli sampling scheme
    sampleset = saltelli.sample(problem, nsamples[i], calc_second_order=False)
    # Run the model for all samples
    output = [fish_game(*sampleset[j, :]) for j in range(len(sampleset))]
    # Perform the Sobol' analysis
    results = sobol.analyze(problem, np.asarray(output), calc_second_order=False, print_to_console=False)
    # Store estimates
    ST_estimates[:, i] = results['ST']
    S1_estimates[:, i] = results['S1']

I can then plot the evolution of these estimates as n increases:

Evolution of first order and total order indices with increasing number of samples (n)
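For reference, here is a minimal matplotlib sketch of how such a convergence plot could be produced from the arrays above (this is not the exact plotting code from the full script):

import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 2, figsize=(12, 4), sharey=True)
for k, name in enumerate(problem['names']):
    axes[0].plot(nsamples, S1_estimates[k, :], label=name)
    axes[1].plot(nsamples, ST_estimates[k, :], label=name)
axes[0].set_title('First order indices (S1)')
axes[1].set_title('Total order indices (ST)')
for ax in axes:
    ax.set_xlabel('Number of samples (n)')
axes[0].set_ylabel('Index estimate')
axes[0].legend()
fig.savefig('convergence.png')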

What these figures tell us is that choosing an n below 1000 for this model could lead to misestimating our indices, especially the first order ones (S1). As n increases, the estimates become less noisy and converge to their final values. For more complex models, say, ones with more interactive effects, the minimum n before convergence could be a lot higher. A similar experiment by Nossent et al. [3] found that convergence was only reached after n = 12,000.

An observation here is that the values of the total order indices (ST) are higher than those of the first order indices (S1), which makes sense, as ST includes both the first order effects (captured by S1) and the interaction effects between factors. Another observation is that the parameters with the most significant effects (m and K) converge much faster than the parameters with less impact on the output (a and b). This was also observed by Nossent et al. [3].

Finally, sensitivity analysis is often performed for the purposes of factor prioritization, i.e. determining (and often rank-ordering) the most important parameters, for example to decide where to devote most of the calibration effort in the model, or which parameter's uncertainty is worth reducing through further analysis. The figures below show the evolution of that rank-ordering as we increase n.

Evolution of parameter ranking based on first order and total order indices with increasing number of samples (n)
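One simple way to derive such a ranking from the stored estimates is to rank the parameters at each n; here is a sketch using scipy's rankdata (assuming the arrays defined earlier):

from scipy.stats import rankdata

# Rank parameters at each n based on the total order index (rank 1 = most important)
ST_ranks = np.array([rankdata(-ST_estimates[:, i], method='min')
                     for i in range(len(nsamples))]).T
# ST_ranks[k, i] is the rank of parameter k at sample size nsamples[i]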

These figures show that with too small a number of samples, we could potentially misclassify a factor as important when it is not, or as unimportant when it actually is.

Now, one might ask: how is this useful? I’m trying to minimize my n, but you’ve just made me run way too many model evaluations, multiple times, just to determine how I could have done it faster? Isn’t that backwards?

Well, yes and no. I devoted this time to running this bigger experiment to gain insight into the behavior of my model, and I have established confidence in my index values and factor prioritization. Further, I now know that n > 1500 would probably be unnecessary for this system, even if the model itself or my parameter ranges change somewhat. As long as the parameter interactions and overall model complexity remain relatively similar, I can leverage this information to perform future sensitivity analyses with a known minimum n.

References:
[1] A. Saltelli, P. Annoni, I. Azzini, F. Campolongo, M. Ratto, and S. Tarantola, “Variance based sensitivity analysis of model output. Design and estimator for the total sensitivity index,” Computer Physics Communications, vol. 181, no. 2, pp. 259–270, Feb. 2010, doi: 10.1016/j.cpc.2009.09.018.
[2] F. Pianosi and T. Wagener, “A simple and efficient method for global sensitivity analysis based on cumulative distribution functions,” Environmental Modelling & Software, vol. 67, pp. 1–11, May 2015, doi: 10.1016/j.envsoft.2015.01.004.
[3] J. Nossent, P. Elsen, and W. Bauwens, “Sobol’ sensitivity analysis of a complex environmental model,” Environmental Modelling & Software, vol. 26, no. 12, pp. 1515–1525, Dec. 2011, doi: 10.1016/j.envsoft.2011.08.010.