Weather Regime-Based Stochastic Weather Generation (Part 2/2)

In this post on the Water Programming Blog, we continue to explore the application of the stochastic weather generator (available on GitHub) for climate change scenario development. This is the second installment of a two-part series, and readers are strongly encouraged to familiarize themselves with the different components of the stochastic weather generator, as explained in Part 1 by Rohini Gupta (The Reed Research Group).

Here, we begin with a concise overview of climate change scenario development and how these scenarios are integrated into the weather generation model. We then interpret the impact of these climate change conditions on key statistics concerning the occurrence of floods and droughts. Through these examples, the implications of these climatic shifts for water resources management and flood risk analysis will become evident.

Climate Change Perturbations

In this stochastic weather generator, we focus on two aspects of climate change scenario development: 1) thermodynamic and 2) dynamic perturbations.

1) Thermodynamic Change

Thermodynamic climate change, often referred to as temperature-driven change, is primarily driven by changes in the Earth’s energy balance and temperature distribution. Its primary driver is the increase in regional-to-global average temperatures due to the enhanced greenhouse effect. This warming affects various aspects of the climate system, including intensified precipitation extremes, melting snowpacks and ice sheets, rising sea levels, altered weather patterns, and shifts in ecosystems. As temperatures rise due to natural and anthropogenic forcings, they trigger a cascade of interconnected impacts throughout the climate system.

In the stochastic weather generator, scenarios of temperature change are treated simply by adding trends to simulated temperature data for each location across the spatial domain. Scenarios of thermodynamic precipitation intensification, however, are modeled using a quantile mapping technique that scales the distribution of daily precipitation to replicate the effect of warming temperatures on precipitation as the moisture holding capacity of the atmosphere increases. In the context of California, previous studies have demonstrated that as temperatures rise, the most severe precipitation events (often associated with landfalling atmospheric rivers) are projected to increase in frequency, while the intensity of smaller precipitation events is expected to decrease (Gershunov et al., 2019). This alteration effectively stretches the distribution of daily precipitation, causing extreme events to become more pronounced while reducing the occurrence and strength of lighter precipitation events. We replicate this phenomenon by adjusting the statistical characteristics and distribution percentiles of precipitation (e.g., Pendergrass and Hartmann, 2014). Specifically, we select a scaling factor for the 99th percentile of nonzero precipitation and then modify the gamma-GPD mixed distribution to enforce this chosen scaling factor. For instance, in a scenario with 3°C of warming and a 7% increase in extreme precipitation per °C (matching the theoretical Clausius-Clapeyron rate of increase in atmospheric water holding capacity due to warming; Najibi and Steinschneider, 2023), the most extreme precipitation events are projected to increase by approximately 22.5% (1.07^3 ≈ 1.225). We adjust the gamma-GPD models fitted to all locations to ensure this percentage increase in the upper tail of the distribution. Assuming that mean precipitation remains constant at baseline levels, this adjustment causes smaller precipitation values in the gamma-GPD model to decrease, compensating for the increase in extreme events by stretching the distribution of nonzero precipitation.
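
To make this concrete, here is a minimal Python sketch of a mean-preserving “stretch” of nonzero daily precipitation. It is a simplified stand-in for the gamma-GPD quantile mapping described above, not the repository’s implementation; the linear-in-quantile multiplier and function names are my own assumptions.

import numpy as np
from scipy.optimize import brentq

def stretch_precip(p, tail_scale=1.07**3, q_tail=0.99):
    """Scale the q_tail quantile of nonzero precipitation p by tail_scale
    while holding the mean constant, via a quantile-dependent multiplier."""
    q = np.argsort(np.argsort(p)) / (len(p) - 1)  # empirical quantile of each value

    def scaled(c):
        # multiplier ramps linearly from c at q=0 up to tail_scale at q=q_tail
        m = c + (tail_scale - c) * q / q_tail
        return m * p

    # solve for the low-quantile multiplier c that leaves the mean unchanged
    c = brentq(lambda c: scaled(c).mean() - p.mean(), 1e-6, tail_scale)
    return scaled(c)

Applied to a skewed sample of daily precipitation, this raises the 99th percentile by ~22.5% while shrinking smaller events to compensate, mirroring the “stretch” behavior described above.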

Lines 33-40 from ‘config.simulations.R’ show the user-defined inputs that implement the thermodynamic scenarios: temperature change in Celsius (tc: e.g., 1°C), percent change in the extreme precipitation quantile (pccc: e.g., 7% per °C), and percent change in average precipitation (pmuc: e.g., a 12.5% decrease). Note that the stochastic weather generator runs in baseline mode if tc = 0 and pmuc = 0.

    ##-------------Define perturbations-------------##
    ##climate changes and jitter to apply:
    change.list <- data.frame("tc"=  c(0), # {e.g., 0, 1, 2, ...} (changes in temperature)
                              "jitter"= c(TRUE),
                              "pccc"= c( 0.00), # {e.g., 0, 0.07, 0.14, ...} (changes for precipitation extreme quantile -- CC)
                              "pmuc"= c( 0.00)# {e.g., 0, -.125, .125, ...} (changes in precipitation mean)
    )
    ##----------------------------------------------##

2) Dynamic Change

Dynamic climate change, also known as circulation-driven change, is driven by shifts in atmospheric and oceanic circulation patterns. These circulation patterns are influenced by a variety of factors, including temperature gradients, differences in air pressure, and Earth’s rotation. Changes in these patterns can lead to alterations in weather patterns, precipitation distribution, and regional climate characteristics. One well-known example of dynamic climate change is the phenomenon of El Niño and La Niña, which involve changes in ocean temperatures and atmospheric pressure in the Pacific Ocean. These events can significantly impact local-to-global weather patterns, causing droughts, heavy rainfall, and other extreme weather events (Pfahl et al., 2017).

Dynamic changes impact the evolution of weather patterns and can modify the occurrence frequency of these patterns. This influence can occur through direct adjustments to the transition probabilities between different weather regimes, or indirectly by modifying the covariates that govern the progression of these weather regimes. In Steinschneider et al. (2019), a Niño 3.4 index is used to force weather regime evolution and is systematically adjusted to create more frequent El Niño and La Niña events. In Gupta et al. (in review), a 600-year-long sequence of tree-ring-reconstructed principal components of weather regime occurrence is used as an alternative covariate to better capture the natural variability inherent in the weather regimes.

In the most recent version of the stochastic weather generator, we developed a novel non-parametric approach to simulating weather regimes, allowing for future dynamic change scenarios with altered (customized) weather regime probabilities. Assume that the historical time series of weather regimes is divided into distinct, consecutive, non-overlapping segments, each with a duration of D years, for a total of ND segments (here, D = 4 and ND = 18). In the non-parametric method, each segment (indexed n = 1 to ND) is assigned a sampling probability denoted pn. To generate a new sequence of daily weather regimes spanning any desired number of years, the procedure resamples (with replacement) the nth D-year segment of daily weather regimes with probability pn, repeating until the required number of years of simulated weather regimes has been attained. If needed, the last segment is trimmed to ensure the precise desired duration of simulated weather regimes.
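
A minimal Python sketch of this segment-resampling procedure is below (an illustration of the method as just described; the variable names are my own and the calendar is simplified to 365-day years):

import numpy as np

def resample_wrs(wr_daily, D=4, p=None, n_years=1008, dpy=365):
    """Build a synthetic daily weather regime sequence by resampling
    (with replacement) D-year segments of the historical record,
    drawing segment n with probability p[n]."""
    seg_len = D * dpy
    ND = len(wr_daily) // seg_len                  # number of complete segments
    if p is None:
        p = np.full(ND, 1.0 / ND)                  # baseline: all segments equally likely
    segments = wr_daily[:ND * seg_len].reshape(ND, seg_len)

    out = []
    while len(out) < n_years * dpy:
        out.extend(segments[np.random.choice(ND, p=p)])
    return np.array(out[:n_years * dpy])           # trim last segment to exact length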

In the baseline scenario for the weather generator with no dynamic climate change (only thermodynamic change), each segment is considered equally likely (i.e., no changes to large-scale circulation patterns).

However, the probabilities pn can be adjusted to alter the frequencies of each of the identified weather regimes in the final simulation, enabling the generation of dynamic climate change scenarios (i.e., scenarios in which the frequencies of different atmospheric flow patterns change compared to their historical frequencies). This is achieved using a linear program. The goal of the linear program (not shown) is to identify new sampling probabilities pn that, when used in the non-parametric simulation approach above, produce a sequence of weather regimes whose long-term average frequencies approach a vector of target probabilities for the identified weather regimes.
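
The repository’s exact formulation is not shown here, but the sketch below illustrates one plausible version of such a linear program in scipy: it minimizes the total absolute deviation between the implied long-run weather regime frequencies and a target vector (the penalty on per-segment probability changes, controlled by piecewise_limit in the configuration below, is omitted for brevity).

import numpy as np
from scipy.optimize import linprog

def solve_segment_probs(F, target):
    """F: (ND, K) array, F[n, k] = frequency of WR k within segment n.
    target: length-K array of desired long-run WR frequencies.
    Returns sampling probabilities p (length ND) with F.T @ p close to target."""
    ND, K = F.shape
    # decision variables: [p (ND), e (K)], where e_k bounds |(F.T @ p)_k - target_k|
    c = np.concatenate([np.zeros(ND), np.ones(K)])     # minimize sum of deviations e

    # F.T @ p - target <= e   and   target - F.T @ p <= e
    A_ub = np.block([[F.T, -np.eye(K)],
                     [-F.T, -np.eye(K)]])
    b_ub = np.concatenate([target, -target])

    A_eq = np.concatenate([np.ones(ND), np.zeros(K)]).reshape(1, -1)  # sum(p) = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, 1)] * ND + [(0, None)] * K)
    return res.x[:ND]

The resulting probabilities can then be fed to the segment resampler sketched earlier.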

Lines 91-126 from ‘config.simulations.R’ show the user-defined settings that implement the non-parametric scenarios: one with zero target change to the probabilities of all ten weather regimes (i.e., no change to the historical frequencies of weather regimes; dynamic scenario #0), and one with a 30% increase in the frequency of weather regime number three (a dry weather regime in California; dynamic scenario #1). A third option, dynamic scenario #2, continues the current trends of all weather regimes into the future.

    ## Choose whether to create the simulated WRs parametrically or non-parametrically ##
    use.non_param.WRs <- TRUE # {TRUE, FALSE}: TRUE for non-parametric, FALSE for parametric simulated WRs

    dynamic.scenario  <- 0 # {0, 1, 2}: 0: no dynamic change; 1: dynamic scenario #1 (30% increase in WR3); or 2: dynamic scenario #2 (linear trend)

    if (use.non_param.WRs){      #----------- 1+2 dynamic scenarios ----------#
      if (dynamic.scenario==0){
        ##===> Attempt #0 (thermodynamic only; no change to freq of WRs) ===##
        # #specify target change (as a percent) for WR probabilities
        WR_prob_change <- c(0,0,0,0,0,0,0,0,0,0) # between 0 and 1
        # #how close (in % points) do the WR frequencies (probabilities) need to be to the target
        lp.threshold <- 0.00001
        # #how much change do we allow in a sub-period sampling probability before incurring a larger penalty in the optimization
        piecewise_limit <- .02
        
      }else if(dynamic.scenario==1){
        ##===> Attempt #1 (dynamic scenario #1) ===##
        # #specify target change (as a percent) for WR probabilities (if, increasing WR3 in future)
        WR_prob_change <- c(0,0,.3,0,0,0,0,0,0,0) # between 0 and 1
        # #how close (in % points) do the WR frequencies (probabilities) need to be to the target
        lp.threshold <- 0.007
        # #how much change do we allow in a sub-period sampling probability before incurring a larger penalty in the optimization
        piecewise_limit <- .02
        
      }else if(dynamic.scenario==2){
        ##===> Attempt #2 (dynamic scenario #2) ===##
        # specify target change (as a percent) for WR probabilities (if, continuing their current trends in future)
        WR_prob_change <- c(-0.09969436,  0.27467048,  0.33848792,
                            -0.28431861, -0.23549986,  0.03889970,
                            -0.05628958, 0.38059153, -0.16636739, -0.17995965) # between -1 and 1
        # how close (in % points) do the WR frequencies (probabilities) need to be to the target
        lp.threshold <- 0.008
        # how much change do we allow in a sub-period sampling probability before incurring a larger penalty in the optimization
        piecewise_limit <- .02
      }
    }

Stochastic Weather Generation for Climate Change Scenarios

The stochastic weather generator is used to generate the two climate change scenarios defined above, with the following configurations:

  • Thermodynamic Scenario: 3°C increase in temperature, 7% per °C increase in precipitation extremes, no change in average precipitation.
  • Dynamic Scenario: 30% increase in the occurrence frequency of a single weather regime, labeled ‘WR3’, which exhibits a ridge directly over the northwest US that blocks moisture delivery to California, resulting in dry conditions there during the cold season.

Thus, we generated 1008 years of simulated precipitation and temperature for each of the 12 sites in the Tuolumne River Basin in California (as in Part 1) under these two scenarios. The figures below illustrate the impact of each scenario on the statistical distributions of precipitation and temperature and on important climate extremes at the basin scale.

The two figures below present the cumulative distribution functions (CDFs) of the generated precipitation (left) and temperature (right) under the thermodynamic and dynamic changes, respectively. The observed time series of precipitation and temperature across these 12 sites are also illustrated.

As seen above, although the 3°C warming is clearly manifested in the alteration of the simulated temperature’s CDF, it is hard to notice any drastic shift in the overall CDF of the precipitation time series, as the bulk of the distribution has not been altered (recall that average precipitation remained constant while its extreme quantile was scaled up by ~22.5%).

Similarly, while the CDF of precipitation demonstrates a slight shift toward drier conditions, we notice a large shift in the tail of the temperature distribution.

Now we examine a set of important indices for climate risk assessment of water systems. The figure below presents the 1-day precipitation maxima derived from the generated precipitation under the thermodynamic (left) and dynamic (right) changes.

In the thermodynamic climate change plot above, the median of the basin-scale 1-day precipitation extremes across the entire synthetic weather generation period demonstrates a 25.5% increase in magnitude relative to the historical period, consistent with the event-based 22.5% precipitation scaling at each site. In the dynamic climate change scenario, however, this metric remains almost unchanged.
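
For reference, this metric is straightforward to compute from any simulated daily series; below is a small sketch assuming a hypothetical (years × days) array of basin-average daily precipitation:

import numpy as np

# hypothetical basin-average daily precipitation, one row per simulated year
prcp = np.random.gamma(shape=0.5, scale=8.0, size=(1008, 365))

annual_max_1day = prcp.max(axis=1)   # 1-day precipitation maximum in each year
print(np.median(annual_max_1day))    # median 1-day extreme across the simulation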

Finally, the figure below shows the 5-year drought magnitudes derived from the generated water-year precipitation totals under the thermodynamic (left) and dynamic (right) changes.

The boxplots above for the thermodynamic scenario reveal a consistent median 5-year drought magnitude, as anticipated (there is no shift in the bulk of the average precipitation distribution). In contrast, the dynamic climate change scenario exhibits a substantial exacerbation, with the 5-year drought magnitude worsening by around 9% compared to the historical record.

There is plenty more to explore! The stochastic weather generator can quickly simulate long sequences of weather variables that reflect any climate change of interest. Keep an eye out for upcoming additions to the repository in the coming months, and do not hesitate to contact us or create a GitHub issue if you need clarification.

References

Gershunov, A., Shulgina, T., Clemesha, R.E.S. et al. (2019). Precipitation regime change in Western North America: The role of Atmospheric Rivers. Scientific Reports, 9, 9944. https://doi.org/10.1038/s41598-019-46169-w.

Gupta, R.S., Steinschneider, S., & Reed, P.M. (2023). Understanding Contributions of Paleo-Informed Natural Variability and Climate Changes on Hydroclimate Extremes in the Central Valley Region of California. Authorea preprint, March 13, 2023. https://doi.org/10.22541/essoar.167870424.46495295/v1

Najibi, N., and Steinschneider, S. (2023). Extreme precipitation-temperature scaling in California: The role of Atmospheric Rivers, Geophysical Research Letters, 50(14), 1–11, e2023GL104606. https://doi.org/10.1029/2023GL104606.

Pendergrass, A.G., and Hartmann, D.L. (2014). Two modes of change of the distribution of rain, Journal of Climate, 27(22), 8357-8371. https://doi.org/10.1175/JCLI-D-14-00182.1

Pfahl, S., O’Gorman, P.A., Fischer, E.M. (2017). Understanding the regional pattern of projected future changes in extreme precipitation, Nature Climate Change, 7 (6), 423-427. http://dx.doi.org/10.1038/nclimate3287

Steinschneider, S., Ray, P., Rahat, S. H., & Kucharski, J. (2019). A weather‐regime‐based stochastic weather generator for climate vulnerability assessments of water systems in the western United States. Water Resources Research, 55(8), 6923-6945. https://doi.org/10.1029/2018WR024446

Introducing the GRRIEn Analysis Framework: Defining standards for reproducible and robust supervised learning of earth surface processes at large spatial scales

I’m very excited to write this blog post highlighting the GRRIEn framework, developed in part by Dr. Elizabeth Carter at Syracuse University along with collaborators Carolynne Hultquist and Tao Wen. I first saw Liz present the GRRIEn framework at the Frontiers in Hydrology conference last year, and I knew it was a framework that I wanted to eventually demo on the blog. The associated manuscript (now published here) introduces the GRRIEn (Generalizable, Reproducible, Robust, and Interpreted Environmental) framework: a cheat sheet of best practices for extracting generalizable insight on environmental systems using supervised learning of global environmental observations (EOs; satellites and coupled earth systems models), including standards for reproducible data engineering, robust model training, and domain-specialist-interpreted model diagnostics (Carter et al., 2023).

In short, this is a paper that I wish existed when I first started diving into data science and machine learning in graduate school. If you are a student/have a student who is new to statistics, data science, and machine learning for environmental applications, I highly recommend this paper as the single most useful one-stop shop read. Below, I will introduce a bit about Liz and her group, the steps of the framework, and include some of the awesome ways that Liz has incorporated GRRIEn into her classes.

The CHaRTS Lab

Dr. Carter is based at Syracuse University and runs the Climate Hazards Research Team Syracuse (CHaRTS). CHaRTS uses proxy observations of the hydrologic cycle from satellites, photographs, and earth systems models to support fair and stable management of water resources and hydroclimatic risk in the twenty-first century. I talked to Liz about creating the GRRIEn framework and the results that she has seen in the classroom.

Why did you decide to create GRRIEn?

“GRRIEn was created as a teaching tool for my graduate class, Environmental Data Science (EDS). EDS was intended to equip students with the computational tools they would need to use global earth observations for thesis/dissertation research. Some of my students have an interest in spatial statistics and machine learning, but most of them are domain specialists who want to use global earth observations to learn more about very specific environmental processes. The GRRIEn method was created for these students, the domain specialists. The goal is to reduce barriers to accessing insight from global earth observations associated with computational methods and advanced statistical techniques. For example, many students struggle to find and process spatial data. In EDS, they adopt a standard reproducible computational pipeline in the form of a GitHub repository for accessing raw data and creation of an analysis-ready dataset. For students who have limited statistical skills, I teach a few simple model-agnostic tools to diagnose and control for feature and observation dependence in multivariate observational spatiotemporal datasets. Most importantly, I teach methods in model interpretation. We need domain specialists to evaluate whether data-driven models are a physically plausible representation of environmental processes if we want to use global earth observations for knowledge discovery.”

What kind of results have you seen as you incorporate GRRIEn in the classroom?

“My favorite part of EDS is seeing students going from never having written a line of code to processing massive volumes of data. Because so much of the process has been constrained by the GRRIEn method, students can really spread their wings in terms of application. For the past three years, I’ve had several students take their EDS project to conferences like AGU, or submit it for publication.”

Overview of the GRRIEn Framework

As the volume of earth observations from satellites, global models, and the environmental IoT continues to grow, so does the potential of these observations to help scientists discover trends and patterns in environmental systems at large spatial scales. However, many emerging global datasets are too large to store and process on personal computers, and scale-dependent qualities of spatial processes can reduce the robustness of traditional statistical methods. Domain specialists therefore need targeted and accessible exposure to skills in reproducible scientific computing and spatial statistics if they are to use global observations from satellites, earth systems models, and the environmental IoT to generalize insights from in-situ field observations across unsampled times and locations. The GRRIEn (Generalizable, Reproducible, Robust, and Interpreted Environmental) framework was developed for this purpose (Carter et al., 2023). Below are the four key components of the framework, but please refer to the manuscript for full descriptions and examples.

Generalizable: How well do your experimental results from a sample extend to the population as a whole?

Figure: Three common sources of global gridded earth observations (EOs): active satellite remote sensing data, such as synthetic aperture radar imagery (left); passive satellite remote sensing data, including optical and passive microwave imagery (center); and gridded outputs from global coupled earth system models (right).

A primary motivation for generating global EOs is to allow scientists to translate insight from limited field measurements to unsampled times and locations, improving the generalizability of our earth systems theories. Since global EOs are (approximately) spatiotemporally continuous observations, the overall objective of a GRRIEn analysis is to train a supervised learning algorithm that predicts an environmental process using globally available EOs as input data. This algorithm can then generalize insights from direct observations of an environmental process, sampled experimentally or in-situ, across unlabeled times or locations where global EOs are available (i.e., through interpolation, extrapolation, and diagnostic modeling).

Robust: Do your statistics show good performance on data drawn from a wide range of probability and joint probability distributions?

In order for your model to generalize well, you must quantify and account for scale-dependent multicollinearity and spatial/temporal dependence in your data. Multicollinearity occurs when one or more predictor variables are linearly related to each other. It is a problem for prediction and inference because predictions remain accurate only if the structure of multicollinearity present in the training dataset is also present in the prediction dataset. Since multicollinearity is associated with model instability in most machine learning and some deep learning applications, it also impacts diagnostic modeling: the model is less likely to interpret the system in a physically plausible way, and small changes in training data can lead to big changes in model weights and parameters.
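
One simple, model-agnostic diagnostic for this is the variance inflation factor (VIF), which flags predictors that are nearly linear combinations of the others. Below is a small numpy sketch (my own illustration, not a method prescribed by the paper):

import numpy as np

def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R^2_j), where R^2_j is from regressing column j of X
    on the remaining columns; values above ~5-10 flag strong multicollinearity."""
    X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize predictors
    vifs = []
    for j in range(X.shape[1]):
        y, others = X[:, j], np.delete(X, j, axis=1)
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        r2 = 1.0 - ((y - others @ beta)**2).mean() / y.var()
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)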

Furthermore, correlation structure tends to be dynamic in space and time. The figure below shows an example of correlation structure between summertime air temperature and precipitation from 1950-2000 (left) and projected from 2050-2100 (right). One can see that the correlation between air temperature and precipitation will change for many locations in the US under climate change. This suggests that statistical models trained using temperature and precipitation under 20th century conditions will not make robust estimates in the 21st century.

Figure: Pearson’s correlation coefficient [scaled between -1 and 1, color bar (Benesty et al., 2009)] between bias-corrected, statistically downscaled Climate Model Intercomparison Project 5 ensemble mean monthly precipitation and daily max temperature. Historical observations 1950-1999 (left) and a moderate emissions forecast (RCP 4.5) for 2050-2099 (right) both indicate spatiotemporally variable collinearity between summertime maximum temperature and precipitation. Covariance of meteorological variables is a signature of local climate; as local climates shift due to global warming, so will the local covariability of meteorological variables (right). This generates complexity for predicting environmental process response to meteorological variables under climate change (Taylor et al., 2012).

We can’t expect to collect a sample of future covariability structure because these conditions haven’t happened yet. So how do we address these issues? The manuscript summarizes a great checklist of steps to make sure your model development is robust to both dependent features and observations.  

Figure: Checklist for robust data engineering and model development with dependent features (left) and observations (right) in spatiotemporal systems.

Reproducible: Can other scientists understand and replicate your analysis and yield the same results?

The best way to facilitate this is to create a clear repository structure containing a README, a container image specifying the versions of all packages used, code to download the larger geospatial datasets that can’t be stored in the repository, and code to reproduce all analyses and figures.
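
As an illustration, one possible skeleton for such a repository (my assumption of a layout consistent with this description, not a structure prescribed by the paper) can be scaffolded in a few lines:

from pathlib import Path

# one possible GRRIEn-style project layout (illustrative assumption)
for d in ["data/raw",        # scripted downloads land here; too large to commit
          "data/processed",  # analysis-ready dataset built by the pipeline
          "code",            # download, preprocessing, analysis, figure scripts
          "figures"]:
    Path(d).mkdir(parents=True, exist_ok=True)

Path("README.md").touch()    # project overview and reproduction instructions
Path("Dockerfile").touch()   # container image pinning package versions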

In Liz’s Environmental Data Science class at Syracuse (CEE 609), she has her students complete a semester-long project in which every piece of the project is documented in a GRRIEn repository structure (see here for a great example). By the end of the semester, students have a written paper, a fully documented workflow, and a populated repository, which for students in her research group has often served as the basis for a journal publication.

Interpretable: Do your model parameters reflect a physically plausible diagnosis of the system?

The most important step to ensuring that your model makes the right predictions in different contexts is to make sure those predictions happen for the right reasons: have your model weights and parameters diagnosed the importance of your predictors, and their relationship to the environmental process that serves as your predictand, in a physically plausible way? The authors suggest forming a set of interpretable hypotheses before modeling, denoting the data you are using and the relationships you expect to find, and then using local and global interpretation methods to confirm or reject those original hypotheses.

Conclusion/Resources

I think the GRRIEn framework is a really great tool for young researchers embarking on data science and machine learning projects. If you are a newer faculty member interested in incorporating some of these concepts into your class, Liz has made “Code Sprints” from her class available here. There are Jupyter Notebooks on working with Python, raster data, vector data, regressions, autocorrelation, and multicollinearity. Be sure to keep up with the work coming out of the CHaRTS lab on Twitter here!

Find the paper here: https://doi.org/10.1175/AIES-D-22-0065.1

A time-saving tip for simulating multiple scenarios

Last week I was reading a paper by Tomlinson, Arnott, and Harou (2020), A water resource simulator in Python. In it, they describe a method of evaluating many simulation scenarios (or hydrologic realizations) simultaneously, and they claim that this approach results in significant improvements in computational efficiency.

I was struck because I have been constructing similar multi-scenario simulations using a different approach, by evaluating one scenario at a time.

After having thought I had perfected my code, was it possible that a slight modification to the implementation could significantly improve its efficiency? (Spoiler: Yes! By many hours, in fact.)

I decided to post this here (A) as a simple demonstration in comparing runtime metrics of two different algorithm approaches, and (B) as a reminder to remain open to the possibility of simple yet significant improvements in code performance.

A GitHub repository containing all of the relevant Python scripts used below can be accessed here.

Introduction

There are many reasons why an analyst might want to perform simulation of many different scenarios. For example, when simulating a water resource policy which is affected by stochastic exogenous forces, it is preferable to evaluate the policy under many different realizations of the stochastic variable. The performance of that policy can then be measured as an aggregation of the performance under the different realizations.

How one decides to simulate the different realizations can have significant impacts on the efficiency of the program.

Here, I compare two different implementations for many-scenario evaluation, and demonstrate a comparison of the methods using the shallow lake problem (Carpenter et al., 1999).

The two implementation techniques are shown graphically in the figure below.

Figure: Visual representation of the two different multi-scenario simulation implementations discussed in this post. (This figure was inspired by a similar figure contained in a Pywr presentation made available by James Tomlinson.)

In the past, I (and others who came before me) have used what I refer to as a serial implementation, where the realizations are looped through one by one, evaluating the model over the entire simulation time period before moving on to the next realization. I contrast that here with what I am calling the block implementation, in which the simulation is evaluated for all realizations before progressing to the subsequent time step.

When viewed this way, it may seem obvious that the block simulation will perform more efficiently. The model is applied to the columns rather than the individual elements of a matrix of realizations or scenarios. However, hindsight is 20-20, and if you’re a water resource modeler with an engineering background (such as myself) this may not be apparent to you when constructing the model.

Now, onto the demonstration.

The Lake Problem

The shallow lake problem, as introduced by Carpenter et al. (1999), was used as a case study for design of multi-objective pollution policies by Professor Julie Quinn and has been explored in depth on this blog. While a comprehensive understanding of the lake problem, and the corresponding model, is not necessary for understanding this post, I will provide a brief summary here.

The problem centers around a hypothetical town located near a shallow lake. The town would like to discharge nutrient pollution (e.g., waste water treatment plant effluent) into the lake, while balancing the economic benefits (i.e., cost savings associated with the discharge) and environmental objectives (i.e., preserving the ecological health of the lake).

The pollution dynamics of the lake are described by a non-linear model representing pollution fluxes into and out of the lake:

X_{t+1} = X_t + a_t + Y_t + \frac{X_t^q}{1+X_t^q} - bX_t

where X_t is the pollution concentration in the lake, a_t is the anthropogenic pollution discharged by the town, Y_t \sim \text{LogNormal}(\mu, \sigma) is the natural pollution inflow into the lake, q controls the rate at which pollution is recycled from the sediment into the water column, and b is the rate at which nutrient pollution is removed from the lake due to natural processes.

The non-linearity of the nutrient dynamics shown above results in the presence of an unstable equilibrium, or ecological tipping point, beyond which the lake becomes permanently eutrophic (ecologically unhealthy).

The problem then becomes identifying pollution policies that prescribe a preferable amount of pollution, a_t, for each year without crossing the ecological tipping point.

The lake model

A model can be constructed which simulates a pollution policy and measures the corresponding environmental and economic performance objectives over a 100-year period. In her 2017 publication, Prof. Quinn shows how an adaptive pollution policy, which determines pollution discharge rates conditional upon the current lake pollution level, can result in greater system performance and robustness. However, for the sake of simplicity in this demonstration, I model an intertemporal pollution policy consisting of a set of annual pollution discharge quantities (release_decision in the following code). I selected one pollution policy from a set of Pareto-approximate policies generated previously. To learn more about how optimal pollution discharge policies can be generated for this model, see the blog post here.

Since natural nutrient pollution inflow into the lake is stochastic, it is best to evaluate a single policy for many different realizations of natural inflow. The objective performance can then be measured as the summary (e.g., mean, sum, or max) of the performance across the different realizations.

Below, I am going to show two different ways of implementing multi-realization simulations corresponding to the two approaches described in the Introduction.

I’ve separated the common and unique features of the two implementations to highlight the differences. Both models begin by defining a set of parametric constants, initializing variables for later use, and calculating the ecological tipping point (critical pollution threshold):

import numpy as np
import scipy.optimize as sp

def lake_model(release_decision):

    # Choose the number of stochastic realizations
    n_realizations = 100

    #Initialize conditions
    n_years = 100
    n_objs = 4
    n_time = 100

    # Constants
    reliab_threshold = 0.85
    inertia_threshold = 0.02
    b = 0.42
    q = 2.0
    delta = 0.98
    X0 = 0
    alpha = 0.4
    mu = 0.5
    sigma = np.sqrt(10**-0.5)

    # Initialize variables
    years_inertia_met = np.zeros(n_realizations)
    years_reliability_met = np.zeros(n_realizations)
    expected_benefits = np.zeros(n_realizations)
    average_annual_concentration = np.zeros(n_realizations)
    objs = [0.0]*n_objs

    # Identify the critical pollution threshold
    # Function defining x-critical; x-critical exists where flux = 0.
    def flux(x):
        f = pow(x, q)/(1 + pow(x, q)) - b*x
        return f

    # solve for the critical threshold with unique q and b
    Xcrit = sp.bisect(flux, 0.01, 1.0)
    
    ...

Serial simulation

For the serial model evaluation, lake quality over a 100-year period is evaluated for one realization at a time. A single realization of stochastic nutrient pollution inflow, Y_t, is generated and the policy is evaluated under these conditions. This is repeated 100 times for 100 unique realizations of natural inflow.


def serial_lake_model(release_decision):

    ...

    # Begin evaluation; evaluate one realization at a time
    for s in range(n_realizations):

        ### Generate a single inflow_realization
        inflow_realizations = np.random.lognormal(mean = mu, sigma = sigma, size = n_time)

        # Initialize a new matrix of lake-state variables
        lake_state = np.zeros(n_years)

        # Set the initial lake pollution level
        lake_state[0] = X0

        for t in range(n_years-1):

            # lake dynamics from the equation above (including the -b*X_t removal term)
            lake_state[t+1] = lake_state[t] + release_decision[t] + inflow_realizations[t] + (lake_state[t]**q)/(1 + lake_state[t]**q) - b*lake_state[t]

            expected_benefits[s] += alpha * release_decision[t] * delta**(t+1)

            # Check if policy meets inertia threshold
            if t>= 1:
                years_inertia_met[s] += (abs(release_decision[t-1] - release_decision[t]) < inertia_threshold) * 1

            # Check if policy meets reliability threshold
            years_reliability_met[s] += (lake_state[t] < Xcrit) * 1

        average_annual_concentration[s] = np.mean(lake_state)

    # Calculate objective scores
    objs[0] = -np.mean(expected_benefits)
    objs[1] = max(average_annual_concentration)
    objs[2] = -np.sum(years_inertia_met / (n_years - 1))/n_realizations
    objs[3] = -np.sum(years_reliability_met)/(n_realizations * n_years)

    return objs

This is the way I have written this program in the past, and the way it has been presented previously in this blog.

Block Simulation

I’ve now re-written the model using the block simulation method, as shown in the figure above. Rather than taking a single time series of natural inflow at a time, I produce a matrix of 100 realizations and perform calculations one time step (or column) at a time.

I recommend you pause to compare the above and below code segments, to make sure that you have a solid understanding of the differences in implementation before looking at the results.

def matrix_lake_model(release_decision):

    ...
    
    ### Create a matrix containing all inflow realizations
    inflow_realizations = np.zeros((n_realizations, n_time))
    for s in range(n_realizations):
        inflow_realizations[s, :] = np.random.lognormal(mean = mu, sigma = sigma, size = n_time)

    # Initialize the matrix of lake-state variables and set initial conditions
    lake_state = np.zeros((n_realizations, n_years))
    lake_state[:, 0] = X0

    # Begin evaluation; evaluate all realizations one time step at a time
    for t in range(n_years-1):

        # vectorized lake dynamics across all realizations (including the -b*X_t removal term)
        lake_state[:,t+1] = lake_state[:,t] + release_decision[t] + inflow_realizations[:,t] + (lake_state[:,t]**q)/(1 + lake_state[:,t]**q) - b*lake_state[:,t]

        expected_benefits[:] += alpha * release_decision[t] * delta**(t+1)

        # Check if policy meets inertia threshold
        if t>= 1:
            years_inertia_met += (abs(release_decision[t-1] - release_decision[t]) < inertia_threshold) * 1

        # Check if policy meets reliability threshold
        years_reliability_met += (lake_state[:,t] < Xcrit) * 1

    # Calculate objective scores
    objs[0] = -np.mean(expected_benefits)
    objs[1] = max(np.mean(lake_state, axis = 1))
    objs[2] = -np.sum(years_inertia_met / (n_years - 1))/n_realizations
    objs[3] = -np.sum(years_reliability_met)/(n_realizations * n_years)

    return objs

Running the same pollution policy in both of the above models, I first made sure that the output objective scores were identical. Having verified that, I set up some simple timing tests.

Comparison of evaluation time efficiency

I used the built-in time module in Python to perform runtime tests to compare the two implementation methods. The time module contains the function perf_counter() which makes a precise recording of the time at which it is called.

Calling perf_counter() before and after model evaluation provides a simple way of measuring the evaluation time. An example of this time measurement is shown below:

import time
from lake_model import lake_model  # assuming the model function is saved in lake_model.py

start = time.perf_counter()
model_output = lake_model(release_decision)  # n_realizations is hard-coded inside
end = time.perf_counter()

model_runtime = end - start

When performing an optimization of the pollution release policy using the Borg multi-objective evolutionary algorithm (MOEA), it is standard for our group to simulate each policy for 100 realizations of hydrologic stochasticity. The objective scores are then calculated as some aggregation of performance across those realizations.

With this in mind, I first measured the difference in runtime between the two simulation methods for 100 realizations of stochasticity.

I found that the model using the block implementation, evaluating all realizations one time step at a time, was 0.048 seconds faster than evaluating the 100 realizations in series. That may seem marginal at first, but how does this difference in speed scale when performing many thousands of model evaluations throughout an optimization?

Previously, when optimizing the release decisions for the lake problem, I allowed the Borg MOEA to run for a maximum of 50,000 model evaluations per seed, and repeated this process for 20 random seeds… at a savings of 0.048 seconds per evaluation, I would have saved 13.3 hours of computation time if I had used the block implementation technique!

This difference grows rapidly with the number of realizations performed in each evaluation. In the figure below, I show the runtime of the block implementation relative to the serial implementation (i.e., \frac{T_{block}}{T_{serial}}) for an increasingly large number of realizations:

Figure: Runtime of the matrix implementation technique relative to the serial technique, for a range of realization numbers.
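
A sketch of how such a comparison could be generated is below, assuming both models were refactored to accept the number of realizations as an argument (it is hard-coded in the snippets above) and saved in a hypothetical lake_model.py module:

import time
from lake_model import serial_lake_model, matrix_lake_model  # hypothetical module

release_decision = [0.02] * 100   # an arbitrary fixed annual release policy

for n_real in (10, 100, 1000, 10000):
    t0 = time.perf_counter()
    serial_lake_model(release_decision, n_real)
    t1 = time.perf_counter()
    matrix_lake_model(release_decision, n_real)
    t2 = time.perf_counter()
    print(n_real, (t2 - t1) / (t1 - t0))   # block runtime relative to serial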

Conclusions

In a stochastic world, evaluating water resource policies under a broad ensemble of scenarios allows for the selection of policies that are more robust to uncertain factors. Code that is computationally burdensome can limit the size of that ensemble, and ultimately limit the analysis.

When working with high performance computing resources, researchers are often limited in allotted compute-hours, or the time spent processing a program on the machine. Given this context, it is highly valuable to identify time-saving measures such as the block implementation presented here.

Ultimately, this experimentation was pleasantly humbling. I hope that this serves as motivation to approach one of your existing models with a fresh perspective. At the least, I hope to keep you from making the same costly mistake as me.

I want to shout out Tomlinson, Arnott, and Harou again for writing the paper that sparked this re-assessment. Check out their water resource simulator for solving resource allocation problems, Pywr, here.

Taylor Diagram

The evaluation of model performance is a widely discussed and yet extremely controversial topic in hydrology, atmospheric science, and environmental studies. Generally, it is not super straightforward to quantify the accuracy of tools that simulate complex systems. One of the reasons is that these systems usually have various sub-systems. For example, a model of a complex river system will simulate regulated streamflow, water allocation, dam operations, etc. This implies that there are multiple goodness-of-fit objectives that cannot all be fully optimized simultaneously; there will always be tradeoffs in the accuracy of the system representation.

The other reason is that these complex systems are often state-dependent, and their behavior non-linearly changes as the system evolves. Finally, there are variables in the system that have several properties, and it is usually impossible to improve all of their properties at the same time (Gupta et al., 1998). Take streamflow as an example. In calculating model accuracy, we can focus on the daily fluctuation of streamflow or look into the changes weekly, monthly, and annually. We can also concentrate on the seasonality of streamflow, extreme low and high cases, and the persistence of these extreme events. Therefore, our results are not fully defensible if we only focus on one performance metric and make inferences that ignore many other components of the system. In this blog post, I am going to explain a visualization technique that allows us to draw three or more metrics of the system on the same plot. The plot is called a Taylor Diagram. The Taylor Diagram is not really a solution to the problem that I explained, but it does provide a systematic and mathematically sound way of demonstrating a few popular goodness-of-fit measures. The original version of the Taylor Diagram includes three performance metrics: i) standard deviation of simulated vs. observed; ii) correlation coefficient between observed and simulated; and iii) centered root mean square error (CRMSE). However, there are other versions of the Taylor Diagram that include more than these three metrics (e.g., here). The Taylor Diagram was developed by Karl E. Taylor (original paper) and has been frequently used by meteorologists and atmospheric scientists. In this blog post, I focus on streamflow.

Underlying Mathematical Relationships

As mentioned above, the Taylor Diagram draws the following three statistical metrics on a single plot: standard deviation, CRMSE, and correlation. The equation below relates these three:

CRMSE^2 = \sigma_o^2 + \sigma_s^2 - 2\sigma_o \sigma_s R \qquad (Equation 1)

where \sigma_o and \sigma_s are the standard deviations of the observed and simulated time series, respectively, and R is the correlation coefficient between them.

This relationship can be easily derived using the definition of CRMSE, which is:

CRMSE = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left[(s_t - \bar{s}) - (o_t - \bar{o})\right]^2} \qquad (Equation 2)

where s_t and o_t are the simulated and observed values at time step t, \bar{s} and \bar{o} are their respective means, and N is the number of time steps.

We can expand this equation to the following:

CRMSE^2 = \sigma_s^2 + \sigma_o^2 - \frac{2}{N}\sum_{t=1}^{N}(s_t - \bar{s})(o_t - \bar{o}) \qquad (Equation 3)

The first two components of the equation are the variances (squared standard deviations) of the simulated and observed time series. To understand the third component, we need to recall the mathematical definition of correlation.

R = \frac{\frac{1}{N}\sum_{t=1}^{N}(s_t - \bar{s})(o_t - \bar{o})}{\sigma_s \sigma_o} \qquad (Equation 4)

If we multiply both sides of Equation 4 by the standard deviations of the observed and simulated series (\sigma_s \sigma_o), the result is exactly the covariance term that appears as the third component of Equation 3; substituting it in yields Equation 1. The Taylor Diagram uses polar coordinates to visualize all of these components. Readers can refer to the original paper for more discussion and the trigonometric proofs behind this form of plot.
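
A quick numerical check of Equation 1 with an arbitrary synthetic pair of series (a Python sketch) confirms the identity:

import numpy as np

rng = np.random.default_rng(1)
obs = rng.lognormal(mean=3.0, sigma=0.5, size=1000)   # synthetic "observed" flows
sim = obs + rng.normal(scale=5.0, size=1000)          # synthetic "simulated" flows

sigma_o, sigma_s = obs.std(), sim.std()
R = np.corrcoef(obs, sim)[0, 1]

# Equation 2: centered root mean square error
crmse = np.sqrt(np.mean(((sim - sim.mean()) - (obs - obs.mean()))**2))

# Equation 1: CRMSE^2 = sigma_o^2 + sigma_s^2 - 2*sigma_o*sigma_s*R
print(crmse**2, sigma_o**2 + sigma_s**2 - 2*sigma_o*sigma_s*R)  # these match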

Code Example

In this blog post, I am presenting a simple R package that can be used to create Taylor Diagrams. In this example, I use the same streamflow datasets that I explained in the previous blog post. First, you need to download the time series from GitHub and use the following commands to read these two time series into R.

Observed_Arrow<-read.table("~/Taylor Diagram/Arrow_observed.txt", header = T)
Simulated_Arrow<-read.table("~/Taylor Diagram/Arrow_simulated.txt", header = T)

In this post, I use the “openair” R library, which was originally developed for atmospheric chemistry data analysis. In addition to the Taylor Diagram, the package provides many helpful visualization options that can be accessed here. Note that I was able to find at least one more R package that can be used to create Taylor Diagrams (i.e., plotrix). There are also Python and MATLAB libraries for creating Taylor Diagrams.

You can use the following to install the “openair” package and activate the library.

install.packages("openair")
library(openair)

The following command can be used to create a Taylor Diagram.

combined_dataset<-cbind(Observed_Arrow[ 18263:length(Observed_Arrow[,1]),], Arrow_sim=Simulated_Arrow[1:10592,4])

TaylorDiagram(combined_dataset, obs = "ARROW_obs", mod = "Arrow_sim")

Interpretation of the Taylor Diagram

The Taylor Diagram indicates the baseline observed point, where correlation is 1 and CRMSE is 0 (purple). If a simulation point is close to the observed point, the two series are similar in standard deviation, their correlation is high, and their CRMSE is close to zero. The black dashed line represents the standard deviation of the observed time series; if the red dot is above the line, the simulated dataset has higher variability. The other helpful information that we gain from the Taylor Diagram is the correlation between observed and simulated values: higher correlation shows a higher level of agreement, and correlation decreases as a point moves toward higher sectors of the graph. The centered root mean square error also shows the quality of the simulation, though it puts more weight on outliers; the golden contour lines on the polar plot show the value of CRMSE. In this figure, the red point has a higher standard deviation than observed, the correlation is around 0.9, and the CRMSE is close to 20,000.

The package also allows us to look at the performance of the model during different months or years, and we can compare different simulation scenarios with the observed data. Here, I use the function to draw a point for each month. The “group” argument can be used to do that.

TaylorDiagram(combined_dataset, obs = "ARROW_obs", mod = "Arrow_sim", group = "Month")

The function also provides a simple way to break down the data into different plots for other attributes. For example, I created four panels for four different annual periods:

TaylorDiagram(combined_dataset, obs = "ARROW_obs", mod = "Arrow_sim", group = "Month", type = "Year")