Various posts have discussed sensitivity analysis and techniques in this blog before. The purpose of this post is to show an application of the methods and demonstrate how they can be used in an exploratory manner, for the purposes of *robust decision making* (RDM). RDM aims to evaluate the performance of a policy/strategy/management plan over an ensemble of deeply uncertain parameter combinations – commonly referred to as *“states of the world”* (SOWs) – and then identify the policies that are most robust to those uncertainties. Most importantly, this process allows the decision maker to examine the implications of their assumptions about the world (or how it will unfold) on their candidate strategies [1].

This is Part 1 of a two part post. In this first post, I’ll introduce the types of figures I’ll be talking about, and some visualization code. In the second post (up in a couple days), I’ll discuss sensitivity analysis for the system as well as some visuals. All the code and data to produce the figures below can be found in this repository.

Now assume the performance of a system is described by a time-series, produced by our model as an output. This might be a streamflow we care about, reservoir releases, nutrient loading, or any type of time-series produced by a run of our model. For the purposes of this example, I’ll use a time-series from the system I’ve been working on, which represents historical shortages for an agricultural user.

We can sort and rank these data, in the style of a flow duration curve, which would allow us to easily see, levels for median shortage (50th percentile), worst (99th), etc. The reasons one might care about these things (instead of, say, just looking at the mean, or at the time series as presented in Fig. 1) are multiple : we are often concerned with the levels and frequencies of our extremes when making decisions about systems (e.g. *“how bad is the worst case?”, “how rare is the worst case?”*), we might like to know how often we exceed a certain threshold (e.g. “*how many years exceed an annual shortage of 1000 af?*“), or, simply, maintain the distributional information of the series we care about in an easily interpretable format.

For the purposes of an exploratory experiment, we would like to see how this time-series of model output might change under different conditions (or SOWs). There are multiple ways one might go about this [2], and in this study we sampled a broad range of parameters that we thought would potentially affect the system using Latin Hypercube Sampling [3], producing 1000 parameter combinations. We then re-simulated the system and saved all equivalent outputs for this time-series. We would like to see how this output changes under all the sampled runs.

Another way of visualizing this information, if we’re not interested in seeing all the individual lines, is to look at the range of outputs. To produce Fig. 4, I used the *fill_between *function in matplotlib, filling between the max and min values at each percentile level.

By looking at the individual lines or the range, there’s one piece of potentially valuable information we just missed. We have little to no idea of what the density of outputs is within our experiment. We can see the max and min range, the lines thinning out at the edges, but it’s very difficult to infer any density of output within our samples. To address this, I’ve written a little function that loops through 10 frequency levels (you can also think of them as percentiles) and uses the *fill_between *function again. The only tricky thing to figure out was how to appropriately represent each layer of increasing opacity in the legend – they are all the same color and transparency, but become darker as they’re overlaid. I pulled two tricks for this. First, I needed a function that calculates the custom *alpha*, or the transparency, as it is not cumulative in matplotlib (e.g., two objects with transparency 0.2 together will appear as a single object with transparency 0.36).

def alpha(i, base=0.2): | |

l = lambda x: x+base-x*base | |

ar = [l(0)] | |

for j in range(i): | |

ar.append(l(ar[-1])) | |

return ar[-1] |

Second, I needed proxy artists representing the color at each layer. These are the *handles *in the code below, produced with every loop iteration.

handles = [] | |

labels=[] | |

fig = plt.figure() | |

ax=fig.add_subplot(1,1,1) | |

for i in range(len(p)): | |

ax.fill_between(P, np.min(expData_sort[:,:],1), np.percentile(expData_sort[:,:], p[i], axis=1), color='#4286f4', alpha = 0.1) | |

ax.plot(P, np.percentile(expData_sort[:,:], p[i], axis=1), linewidth=0.5, color='#4286f4', alpha = 0.3) | |

handle = matplotlib.patches.Rectangle((0,0),1,1, color='#4286f4', alpha=alpha(i, base=0.1)) | |

handles.append(handle) | |

label = "{:.0f} %".format(100-p[i]) | |

labels.append(label) | |

ax.plot(P,hist_sort, c='black', linewidth=2, label='Historical record') | |

ax.set_xlim(0,100) | |

ax.legend(handles=handles, labels=labels, framealpha=1, fontsize=8, loc='upper left', title='Frequency in experiment',ncol=2) | |

ax.set_xlabel('Shortage magnitude percentile', fontsize=12) | |

plt.savefig('experiment_data_density.png') |

This allows us to draw some conclusions about how events of different magnitudes/frequencies shift under the SOWs we evaluated. For this particular case, it seems that high frequency, small shortages (left hand side) are becoming smaller and/or less frequent, whereas low frequency, large shortages (right hand side) are becoming larger and/or more frequent. Of course, the probabilistic inference here depends on the samples we chose, but it serves the exploratory purposes of this analysis.

References:

[1]: Bryant, Benjamin P., and Robert J. Lempert. “Thinking inside the Box: A Participatory, Computer-Assisted Approach to Scenario Discovery.” *Technological Forecasting and Social Change* 77, no. 1 (January 1, 2010): 34–49. https://doi.org/10.1016/j.techfore.2009.08.002.

[2]: Herman, Jonathan D., Patrick M. Reed, Harrison B. Zeff, and Gregory W. Characklis. “How Should Robustness Be Defined for Water Systems Planning under Change?” *Journal of Water Resources Planning and Management* 141, no. 10 (2015): 4015012. https://doi.org/10.1061/(ASCE)WR.1943-5452.0000509.

[3]: McKay, M. D., R. J. Beckman, and W. J. Conover. “A Comparison of Three Methods for Selecting Values of Input Variables in the Analysis of Output from a Computer Code.” *Technometrics* 21, no. 2 (1979): 239–45. https://doi.org/10.2307/1268522.

Pingback: Open Source Sensitivity Analysis Tools – Water Programming: A Collaborative Research Blog