Ensemble forecasting – application

In contrast to my theoretically oriented previous post on ensemble forecasting, I will attempt, in this post, to provide some practically oriented operational context for ensemble forecasting from the perspective of water management. This and the previous post will serve as a useful background for the planned third post in this series that will focus on the details of ensemble forecast verification. This post is largely a distillation of my experience in working with the formal Forecast Informed Reservoir Operations (FIRO) efforts that are ongoing, primarily in the western US. I use the term ‘formal’ here deliberately to differentiate from the more general notion of FIRO, which has been in the research literature since at least the 90’s. Importantly, ‘formal’ FIRO, as I choose to label it, is an effort to formalize the use of forecast information into the operations of our nation’s portfolio of federally owned or controlled dams, which contain the largest and most important reservoirs in CONUS. This is in stark contrast to the largely informal use of FIRO that has been applied somewhat ‘under the radar’ and scattershot in various ways and for various purposes by dam operators up to this point. While there is a very interesting discussion to be had about the institutional and operational complexities of this effort (perhaps a future post), I’ll leave this brief once-over-the-FIRO-world contextualization at that for now.

For the purposes of this post, I mention the operational context to narrow the window of forecast applications to a tractable space that is a) pertinent to water systems operations and b) a space that I am intimately familiar with. The world of hydrometeorological forecasting is large and growing, containing both advances in legacy dynamical forecasting techniques as well as emergent machine learning approaches. When it comes to federal agencies, tried/tested/verified is incredibly important, so I anticipate the uptake of the latter ML approaches to be some ways off. Baby steps…the US Army Corps of Engineers (USACE) is still trying to move past the ‘water on the ground’ legacy of water management.

Meteorological forecasting

In discussing the current state of ensemble forecasting, I will progress hierarchically from the meteorological to the hydrologic forecasting space. I will first discuss some aspects of dynamical meteorological forecasting that form the backbone of any hydrological forecasting effort. It is, after all, the weather gods casting their hydrometeors (rain/snow/ice) upon the earth that cause the majority of the hydrologic responses of interest to practical FIRO applications. This is an important point that I’ll note here and come back to later. Meteorological forcings dominate the hydrologic response and by consequence, dominate the hydrologic uncertainties when we start thinking in terms of future predictions (forecasts) where meteorologic uncertainties become large. This is wholly in contrast to efforts to understand hydrologic uncertainty with observational data as forcings (for instance, stochastic watershed modeling, see my post here), where the hydrologic model’s structural inadequacy is the primary driver of hydrologic predictive uncertainty. The observational (input) uncertainties are not inconsequential in these sorts of applications, but they form a less substantial portion of the overall uncertainty and can largely be considered aleatory.

Advances in meteorological forecasting skill

With some sense of the importance of meteorological uncertainty to the hydrologic forecasting problem, I want to turn now to a brief discussion of the advances in dynamical meteorological forecasting and areas for future research and improvement. Most folks have some practical sense of forecast skill (or lack thereof) from their daily lives; you check the smartphone weather before you head out for the day, put on shorts and a t-shirt, then get soaked walking to work due to some unforecast rain event. Despite events like this and the general challenge of dynamical uncertainty in the atmospheric system (see previous post), the reality is that forecast skill has steadily improved over time, to the tune of roughly one additional day of ‘useful’ skill per decade (see Figure 1). Although dependent on the particular skill metric being used, it is common to consider skill above 60% as ‘useful’ and skill above 80% as a ‘high degree of accuracy’. By some visual extrapolation, one might surmise that we should be approaching ‘useful’ skill out to 10 days by now in the northern hemisphere (NH) for the atmospheric variable in question.

Figure 1. Improvements in forecast skill of 500 hPa geopotential height anomaly over time (Bauer et al., 2015)
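For reference, the skill metric behind plots like Figure 1 is typically the anomaly correlation coefficient (ACC) between forecast and observed fields. Below is a minimal sketch of that calculation for simple 1-D arrays; the operational version is computed over gridded, area-weighted fields, so treat this as illustrative only.

import numpy as np

def anomaly_correlation(forecast, observed, climatology):
    """Anomaly correlation coefficient: correlation of forecast and observed
    anomalies relative to a reference climatology (0.6 ~ 'useful', 0.8 ~ 'high accuracy')."""
    f_anom = np.asarray(forecast) - np.asarray(climatology)
    o_anom = np.asarray(observed) - np.asarray(climatology)
    return np.sum(f_anom * o_anom) / np.sqrt(np.sum(f_anom ** 2) * np.sum(o_anom ** 2))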

Notably, Figure 1 shows the skill improvements for a large, synoptic scale variable at a moderate altitude in the atmosphere (500 hPa ~ 15,000’ above sea level). Skill varies greatly across atmospheric variables and is particularly challenging for variables that are highly dependent on sub-grid scale processes (i.e. processes that occur below the native resolution of the forecast model); one of these, as luck would have it, is the aforementioned hydrometeor (precipitation) generating process of such interest to hydrologic applications. To loosely paraphrase something I heard from F. Martin Ralph, the current director of the Center for Western Weather and Water Extremes (CW3E) at Scripps: ‘our skill in predicting upper atmospheric variables at large scales has increased enormously, whereas our skill in predicting local scale precipitation has barely budged’. This basic notion is a Pandora’s box of sorts in terms of thinking about how we might use the information that is available in forecast models (i.e. the highly accurate synoptic scale variables) differently to more directly forecast key variables of interest at the local scale. Many of these emerging techniques rely on ML methods applied across wide spatial scales that can more precisely target key hydrometeorological attributes of interest to water management, like the probability of exceeding some rainfall threshold (Zhang et al., 2022), or that map directly to streamflow itself, skipping the whole precipitation part (Nearing et al., 2021). These are certainly highly promising avenues for research, but we’ll return to some practical advancements in dynamical forecasting for now.

Areas for improving meteorological forecasting

One could spend an entire blog post (or many posts really) on this subject. I want to touch on just a few key areas where much research effort is currently dedicated to advancing meteorological forecasting. The first of these is improving the spatiotemporal resolution of the forecast models. To take a quick step back, it is useful to compare a ‘forecast’ model to a Global Circulation/Climate Model (GCM) or Earth System Model (ESM). In a broad sense, these models are all constructed in the same basic way. They start with a dynamical representation of the atmosphere based on the Navier-Stokes fluid dynamics equations that are discretized and solved across ‘grids’ of some horizontal and vertical dimensions across the globe. In climate science speak, this part of the model is often referred to as the ‘dynamical core’ (or ‘dy-core’ if you want to sound cool). For the most part, things that happen below this native grid resolution are parameterized (e.g. cloud formation), which means that they are modeled based on fitted statistical relationships between grid scale atmospheric states and sub-grid scale process outcomes. These parameterizations are often referred to as the model’s ‘physics’. Intuitively, the smaller the scale at which we can represent atmospheric processes, the better we might be able to capture these local behaviors. Advances in computational power have enabled much higher resolutions for forecast models/GCMs/ESMs over time and will likely continue to do so.
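To make the idea of a parameterization a bit more concrete, here is a toy, made-up example in Python: sub-grid cloud fraction is diagnosed from grid-mean relative humidity (a resolved state) through a simple fitted functional form. The threshold and exponent are purely illustrative and not taken from any operational model’s physics.

import numpy as np

def toy_cloud_fraction(rh_grid, rh_crit=0.75):
    """Toy parameterization: map a resolved, grid-scale state (relative humidity)
    to an unresolved, sub-grid outcome (cloud fraction). Illustrative values only."""
    excess = (np.asarray(rh_grid) - rh_crit) / (1.0 - rh_crit)
    return np.clip(excess, 0.0, 1.0) ** 2

print(toy_cloud_fraction([0.5, 0.8, 0.95, 1.0]))   # ~[0.  0.04 0.64 1. ]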

Where these models primarily differ is in all the other stuff needed to run a simulation of earth’s atmosphere, namely the land/ocean/ice surfaces and their couplings with the atmosphere. Forecast models must be computationally tractable at operationally relevant timescales to be useful. In the space we are discussing, this means that the model must be able to reinitialize and produce a forecast at least daily. Moreover, the forecast temporal resolution must be much higher to capture the evolution of the climate system at operationally relevant scales, typically 6-hourly with current technology. To overcome this complexity tradeoff (see Figure 2), forecast models have traditionally relied on extremely simplified representations of the land/ocean/ice surface boundary conditions that are assumed to be constant across the forecast period. Conversely, the development of GCMs/ESMs has continually increased the complexity of the land/ocean/ice modeling and their atmospheric coupling while retaining sufficient atmospheric spatiotemporal resolution for long term simulations. GCMs/ESMs must also provide provable closure of things like energy and carbon fluxes across long time scales; this is not really a primary concern for forecast models, which can be ‘leaky’. Increasingly, however, computational advances have enabled forecast models to look more like GCMs/ESMs, with state-of-the-science forecast models now modeling and coupling land/ocean interactions along with the atmospheric evolution.

Figure 2. Tradeoff between model complexity and resolution in modeling the earth system (Bauer et al., 2015)

The second key area that I’ll highlight is data assimilation. Data assimilation is the process of constraining forecast trajectories based on some intermediate assessment of the actual evolution of the climate system (i.e. an observation). In meteorological forecasting, this is typically done in a relatively short window after the forecast has been generated, for instance the 12-hour assimilation window in Figure 3 below. Importantly, for the issuance of forecasts, this assimilation period is happening ‘under the hood’, so the user of forecasts would only see the forecast as being issued at 21:00, not the 09:00 time when the actual forecast model was initialized. While the details of data assimilation are complex and well outside the scope of this post, it is an incredibly important area of continued research for dynamical forecasting. To put this into context, another loose paraphrase I heard from Vijay Tallapragada, senior scientist for NOAA’s Environmental Modeling Center, is ‘if you want long-term job security in weather forecasting, get into data assimilation’.

Figure 3. Data assimilation example in ensemble forecasting (Bauer et al., 2015)
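To make ‘constraining the forecast with an observation’ a bit more tangible, here is a minimal, single-variable sketch of the kind of analysis update that sits at the heart of most assimilation schemes: blend a model ‘background’ with an observation, weighted by their respective error variances. Operational systems do this for millions of correlated state variables at once, which is where the real complexity (and the job security) comes in; the numbers below are arbitrary.

def analysis_update(x_background, y_obs, var_background, var_obs):
    """Scalar Kalman-style analysis: the gain K weights the observation more
    heavily when the model background is more uncertain."""
    K = var_background / (var_background + var_obs)
    return x_background + K * (y_obs - x_background)

# an uncertain model state of 10.0 nudged toward a more accurate observation of 12.0
print(analysis_update(10.0, 12.0, var_background=4.0, var_obs=1.0))   # -> 11.6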

Lastly, as alluded to in my previous post, the methods to sample initial condition uncertainty and propagate it through the dynamical forecast model are also a very important area of advancement in meteorological ensemble forecasting. The mathematical challenge of adequately sampling initial condition uncertainty at global scales across highly correlated initialization variables is huge, as is the computational burden of running multiple perturbed members of the forecasts to produce an ensemble. The latter challenge has particular significance to the hydrological forecasting enterprise and may, at least partially, explain certain design choices in the hydrologic forecast systems discussed in the following section.

Hydrologic ensemble forecasting

To turn the meteorologic outputs of precipitation and temperature into streamflow, we need a streamflow generating process model. Moreover, if we want an ensemble of streamflow predictions, we need a way to generate some uncertainty in the streamflows to reflect the dominant meteorological uncertainties in the forecasts. With this modeling chain of meteorologic forecast model → streamflow process model, one could envision many possible ways to attempt to capture this uncertainty. Perhaps the most intuitive would be to take each ensemble output from the meteorologic forecast model and run it through the streamflow process model; voila! ensemble streamflow forecasts!

The Hydrologic Ensemble Forecast Service (HEFS)

This is, however, not how it’s done in the Hydrologic Ensemble Forecast Service (HEFS) that is the current operational model used by NOAA/NWS river forecast centers (RFCs) and the model primarily being used for formal FIRO efforts (see Figure 4 below). Instead, HEFS has its own internal Meteorological Ensemble Forecast Processor (MEFP) that ingests an ensemble mean of temperature and precipitation inputs from the meteorological ensemble forecast (the Global Ensemble Forecast System, GEFS) and then creates its own internal ensemble of meteorologic forecasts. These internally generated traces of forecasted temperature/precipitation are each used to force a single, watershed-calibrated version of the SAC-SMA hydrologic model to produce an ensemble of streamflow forecasts (Demargne et al., 2014). Why use this implementation instead of the seemingly more straightforward approach with the meteorological ensemble? I don’t have a definitive answer for that, but my understanding is that, from a statistical verification perspective, the current HEFS approach is more reliable.


Figure 4. Conceptual diagram of HEFS model (Demargne et al., 2014)

Another possible reason stems from the computational complexity challenge of meteorological ensemble forecasting mentioned in the previous section, particularly as it relates to hindcasting. Hindcasts are forecasts that are generated with a modern day forecast model (and its modern day skill) against historical observations of the appropriate quality (post satellite era generally speaking, ~1979 – present). In meteorological spheres, hindcasts are important in documenting the performance of the forecast model across a much larger dataset. For the purposes of advancing operational water management with forecasts, hindcasts are more than important…they are indispensable. Hindcasts form the basis for the design and testing of forecast informed water management procedures; there is no workaround for this. To generate a meteorological hindcast, the NWS has to carve out time to dedicate its warehouse-sized forecast model computational platform to generating 20-30 years of forecasts and archiving the data. This time must be balanced with the operational necessity of producing real-time forecasts for things like: not getting wet going to work or maybe, I don’t know, aviation? This challenge means that hindcasts must typically be produced at reduced resolution compared to the operational forecasts. For example, the current NOAA Global Ensemble Forecast System (GEFSv12) produces operational forecasts with 31 ensemble members, but only 5 ensemble members in the hindcasts (Guan et al., 2019).

This limitation is significant. 5 ensemble members is sufficient to produce a relatively good approximation of the ensemble mean, but is grossly undersized for producing an adequate probabilistic streamflow ensemble. In this hindcast context, then, the construction of HEFS makes sense and enables more flexibility for generating historical forecast ensembles for the development of forecast informed water management policies.

Practical HEFS limitations

The current implementations of HEFS typically generate ~40 ensemble members. The MEFP is conditioned on a historical period of reference using a Schaake shuffle type procedure, so this ensemble size is actually the length (in years) of that historical conditioning period. As with meteorological forecasting, the production of HEFS hindcasts is also challenging, in that it requires the same carving out of time on an operational computational architecture to produce the hindcasts. This limitation motivates one of the core interests of folks within the formal FIRO implementation efforts: synthetic forecasting methods that can expand the length and availability of simulated hindcasts for operational policy development. In general, HEFS is reliable, but it suffers from systemic ‘Type-II conditional bias’, which in layman’s terms is the tendency to underpredict extreme events. I will dig into some of these attributes of HEFS forecasts in a future post on ensemble verification.
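For readers unfamiliar with the Schaake shuffle, the basic idea is to reorder ensemble values so that their rank structure follows that of a set of historical traces, thereby restoring realistic dependence structure. Below is a minimal single-site, single-time-step sketch; operational implementations are considerably more involved, applying the idea jointly across sites, variables, and lead times.

import numpy as np

def schaake_shuffle_1d(ensemble_values, historical_values):
    """Reorder ensemble values so member i takes the value whose rank matches
    the rank of historical trace i (ensemble and historical sets are the same size)."""
    sorted_ens = np.sort(ensemble_values)                    # ensemble values, ascending
    hist_ranks = np.argsort(np.argsort(historical_values))   # rank of each historical trace
    return sorted_ens[hist_ranks]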

Outside of questions of statistical reliability, there are other, harder to quantify concerns about the implementation of HEFS. The ensemble-mean forecasts from the NOAA GEFS model that force HEFS are bound to be more ‘skillful’ than, for instance, the ensemble control forecast. (Note: the control forecast is the forecast trajectory initialized from the ‘best guess’ at the initial condition state of the atmosphere.) This is a well-documented property of ensemble-mean predictions. However, ensemble-mean forecasts are not dynamically consistent representations of the atmosphere (Wilks, 2019). As I noted in my previous post, an ideal forecast ensemble would include all possible trajectories of the forecasts, which might diverge substantially at some future time. Where forecast trajectories have diverged, an average of those trajectories will produce a value that is not in the space of plausible dynamic trajectories. In practical terms, this leads to ensemble-mean forecasts collapsing toward climatology at longer leads and to systematic reductions in the range of magnitudes of key variables as lead time increases. In short, this reliance on the meteorological ensemble-mean forecast does incur some risk (in my opinion) of missing high impact extremes due to, for example, even relatively small spatial errors in the forecasted landfall locations of atmospheric river storms.
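A toy numerical illustration of that last point: if, at a long lead time, half of the ensemble members put an atmospheric river over the watershed and half miss it entirely, the ensemble mean lands in a middle ground that essentially no individual member supports. The numbers below are made up purely for illustration.

import numpy as np

rng = np.random.default_rng(0)
# hypothetical long-lead forecast: 20 members with an AR landfall, 20 members with a miss
wet_members = rng.normal(loc=90.0, scale=10.0, size=20)   # storm-total precipitation (mm)
dry_members = rng.normal(loc=5.0, scale=2.0, size=20)
members = np.concatenate([wet_members, dry_members])

print(members.mean())   # roughly 45-50 mm: a value almost no individual member produces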

Final thoughts

The intent of this post was to provide some current operational context for the types of forecasts being considered for formal implementation of forecast informed water management operations. The ensemble forecasting space is large and exciting from both research and operational perspectives, but it’s challenging to navigate and understand what is actually being considered for use, and what is simply novel and interesting. In my next post, I’ll discuss some of the primary verification techniques used for ensemble forecasts with a focus on the performance of HEFS in an example watershed and some associated practical implementation.

References

Bauer, P., Thorpe, A., & Brunet, G. (2015). The quiet revolution of numerical weather prediction. Nature, 525(7567), 47–55. https://doi.org/10.1038/nature14956

Demargne, J., Wu, L., Regonda, S. K., Brown, J. D., Lee, H., He, M., … Zhu, Y. (2014). The science of NOAA’s operational hydrologic ensemble forecast service. Bulletin of the American Meteorological Society, 95(1), 79–98. https://doi.org/10.1175/BAMS-D-12-00081.1

Guan, H., Zhu, Y., Sinsky, E., Fu, B., Zhou, X., Li, W., et al. (2019). The NCEP GEFS-v12 Reforecasts to Support Subseasonal and Hydrometeorological Applications. 44th NOAA Annual Climate Diagnostics and Prediction Workshop, (October), 78–81.

Nearing, G. S., Kratzert, F., Sampson, A. K., Pelissier, C. S., Klotz, D., Frame, J. M., … Gupta, H. V. (2021). What Role Does Hydrological Science Play in the Age of Machine Learning? Water Resources Research, 57(3). https://doi.org/10.1029/2020WR028091

Wilks, D. S., (2019). Statistical Methods in the Atmospheric Sciences, 4th ed. Cambridge, MA: Elsevier.

Zhang, C., Brodeur, Z. P., Steinschneider, S., & Herman, J. D. (2022). Leveraging Spatial Patterns in Precipitation Forecasts Using Deep Learning to Support Regional Water Management. Water Resources Research, 58(9), 1–18. https://doi.org/10.1029/2021WR031910

Sankey diagrams for USGS gauge data in python(?)

This post was inspired by the Sankey diagram in Figure 1 of this pre-print led by Dave Gold: “Exploring the Spatially Compounding Multi-sectoral Drought Vulnerabilities in Colorado’s West Slope River Basins” (Gold, Reed & Gupta, In Review), which features a Sankey diagram of flow contributions to Lake Powell. I like the figure, and thought I’d make an effort to produce similar diagrams using USGS gauge data.

Sankey diagrams show data flows between different source and target destinations. Lots of people use them to visualize their personal/business cashflows. It’s an obvious visualization option for streamflows.

To explain the “(?)” in my title: When I started this, I quickly realized that I had to choose one of two popular plotting packages: matplotlib or plotly.

I am a frequent matplotlib user and definitely appreciate the level of control in the figure generation process. However, it can sometimes be more time- and line-intensive designing highly customized figures using matplotlib. On the other hand, in my experience, plotly tools can often produce appealing graphics with less code. I am also drawn to the fact that the plotly graphics are interactive objects rather than static figures.

I decided to go with plotly to try something new. If you want to hear my complaints and thoughts on use context, you can skip to the conclusions below.

In the sections below, I provide some code which will:

  • Define a network of USGS gauge stations to include in the plot
  • Retrieve data from USGS gauge stations
  • Create a Sankey diagram using plotly showing streamflows across the network

Here, I focus on the Rio Grande river upstream of Albuquerque, NM. However, you can plot a different streamflow network by modifying the dictionary of upstream nodes defining the network.


Plotting a Sankey streamflow network with plotly

The code used here requires both plotly and the pygeohydro package (for USGS data retrieval).

from pygeohydro import NWIS
import plotly.graph_objects as go

With that out of the way, we can get started.

Defining the flow network & data retrieval

I start by defining a dictionary called upstream_stations which defines the relationships between different gauges of interest.

This dictionary contains pairs of the form: {"GAUGE_ID" : ["LIST_OF", "UPSTREAM", "GAUGE_IDs"]}

If there is no upstream site, then include an empty list. For the Rio Grande network, this looks like:

# define relationships between each gauge and upstream sites
upstream_stations = {
    '08329918' : ['08319000', '08328950'], 
    '08319000' : ['08317400', '08317200'],
    '08328950' : [],
    '08317200' : [],
    '08317400' : ['08313000'],
    '08313000' : ['08290000', '08279500'],
    '08287000' : [],
    '08279500' : [],
    '08290000' : ['08287000', '08289000'],
    '08289000' : [],
}

# Get list of all stations from upstream_stations
all_stations = list(upstream_stations.keys())
for station, upstream in upstream_stations.items():
    all_stations += upstream
all_stations = list(set(all_stations))

Notice that I also made a list containing all the station IDs. I use the pygeohydro package from the HyRiver suite of tools to retrieve the gauge station data (Chegini, Li, & Leung, 2021). I often cite this package, and have written about it in a past post (“Efficient hydroclimatic data accessing with HyRiver for Python”).

Using the list of all_stations, I use the following code to pull daily streamflow data for each site from 2015-2020 (or some other specified dates):

def get_usgs_gauge_data(stations, dates):
    """
    Get streamflow data from USGS gauge stations using NWIS.
    """
    nwis = NWIS()
    df = nwis.get_streamflow(stations, dates, mmd=False)
    
    # get rid of USGS- in columns
    df.columns = df.columns.str.replace('USGS-', '')
    return df

# Get USGS flows
flows = get_usgs_gauge_data(all_stations, ('2015-01-01', '2020-12-31'))

For the Sankey diagram, we need a single flow value for each station. In this case I calculate an average of the annual total flows at each station:

# Average of annual total flows at each station
agg_flows = flows.resample('Y').sum().agg('mean')

Creating the Sankey figure

At its core, a Sankey diagram is a visualization of a weighted network (also referred to as a graph) defined by:

  • Nodes
  • Links (aka Edges)
  • Weights

In our case, the nodes are the USGS gauge stations, the links are the connections between upstream and downstream gauges, and the weights are the average volumes of water flowing from one gauge to the next.

Each link is defined by a source and target node and a value. This is where the upstream_stations dictionary comes in. In the code block below, I set up the nodes and links, looping through upstream_stations to define all of the source-target relationships:

## Define nodes and links
# Nodes are station IDs
nodes = all_stations
node_indices = {node: i for i, node in enumerate(nodes)}

# make links based on upstream-downstream relationships
links = {
    'source': [],
    'target': [],
    'value': [],
}

# loop through upstream_stations dict
for station, upstream_list in upstream_stations.items():
    for stn in upstream_list:
        if stn in agg_flows and station in agg_flows:
            links['source'].append(node_indices[stn])
            links['target'].append(node_indices[station])
            links['value'].append(agg_flows[stn])

Lastly, I define some node labels and assign colors to each node. In this case, I want to make the nodes black if they represent reservoir releases (gauges at reservoir outlets) or blue if they are simple gauge stations.

labels = {
    '08329918' : 'Rio Grande at Alameda', 
    '08319000' : 'San Felipe Gauge',
    '08328950' : 'Jemez Canyon Reservoir',
    '08317200' : 'Santa Fe River',
    '08317400' : 'Cochiti Reservoir',
    '08313000' : 'Rio Grande at Otowi Bridge',
    '08287000' : 'Abiquiu Reservoir',
    '08279500' : 'Rio Grande',
    '08290000' : 'Rio Chama',
    '08289000' : 'Rio Ojo Caliente',
}

# Create nodes labels and colors lists
node_labels = [labels[node] for node in nodes]
node_colors = ['black' if 'Reservoir' in label else 'dodgerblue' for label in node_labels]


Finally, the function to generate the figure:

def create_sankey_diagram(node_labels, links, node_colors, 
                          orientation='h',
                          size=(2000, 700)):
    """
    Create a Sankey diagram using Plotly.
    
    Parameters
    ----------
    node_labels : list
        List of node labels.
    links : dict
        Dictionary with keys 'source', 'target', and 'value'.
    node_colors : list
        List of node colors.
    orientation : str
        Orientation of the diagram, 'h' for horizontal and 'v' for vertical.
        
    Returns
    -------
    sankey_fig : plotly.graph_objects.Figure
        Plotly figure object.
    """
    sankey_fig = go.Figure(go.Sankey(
        orientation=orientation,
        node=dict(
            pad=70,
            thickness=45,
            line=dict(color='dodgerblue', width=0.5),
            label=node_labels,
            color=node_colors
        ),
        link=dict(
            source=links['source'],
            target=links['target'],
            value=links['value'],
            color='cornflowerblue'
        )
    ))
    
    sankey_fig.update_layout(
        title_text="Rio Grande Streamflow ",
        font=dict(size=23),
        width=size[0],
        height=size[1]
    )
    return sankey_fig

There are some options for manipulating this figure script to better suit your needs. Specifically you may want to modify:

  • pad=70 : this is the padding (in pixels) between nodes
  • thickness=45 : this is the thickness of the node elements

With our pre-prepped data from above, we can use the function like so:

sankey_fig = create_sankey_diagram(node_labels, 
								   links, 
								   node_colors, 
								   orientation='v', size=(1000, 1200))
sankey_fig

And here we have it:

I’d say it looks… okay. And admittedly this is the appearance after manipulating the node placement using the interactive interface.

It’s a bit squished vertically (which can be improved by making it a much taller figure). However, my biggest issue is with the text being difficult to read.

Changing the orientation to horizontal (orientation='h') results in a slightly better looking figure, which makes sense, since Sankey diagrams are often shown horizontally. However, this does not preserve the relationship to the actual north-south flow direction of the Rio Grande, so I don’t like it as much.

Conclusions

To answer the question posed by the title, “Sankey diagrams for USGS gauge data in python(?)”: Yes, sometimes. And sometimes something else.

Plotly complaints: While working on this post, I developed a few complaints with the plotly Sankey tools. Specifically:

  • It appears that the label text coloring cannot be modified. I don’t like the white edgecolor/blur effect, but could not get rid of this.
  • The font is very difficult to read… I had to make the text size very large for it to be reasonably legible.
  • You can only assign a single node thickness. I had wanted to make the reservoirs thick, and shrink the size of the gauge station nodes. However, it seems this cannot be done.
  • The diagrams appear low-resolution and I don’t see a way to save a high res version.

Ultimately, the plotly tools are very restrictive in the design of the graphic. However, this is a tradeoff in order to get the benefit of interactive graphics.

Plotly praise: The plotly Sankey tools have some advantages, specifically:

  • The plots are interactive
  • Plots can be saved as HTML and embedded in websites

These advantages make the plotly tools good for anyone who might want to have a dynamic and maybe frequently updated dashboard on a site.
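For example, assuming the sankey_fig object from the code above, exporting the interactive figure to a standalone HTML file (which can then be embedded or shared) is a one-liner; the filename here is arbitrary.

# export the interactive Sankey diagram as a standalone HTML file
sankey_fig.write_html("rio_grande_sankey.html")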

On the other hand, if I wanted to prepare a publication-quality figure, where I had absolute control of the design elements, I’d likely turn to matplotlib. That way it could be saved as an SVG and further manipulated in a vector art program like Inkscape or Illustrator.

Thanks for reading!

References

Chegini, T., Li, H. Y., & Leung, L. R. (2021). HyRiver: Hydroclimate data retriever. Journal of Open Source Software, 6(66), 3175.


Colorado State Decision Tools

This blog post details the available information from Colorado state’s decision tools that can be used in different research areas. These tools have extensive information on water administration, climate data, reservoir details, and groundwater data. Snapshots of each tool are provided to highlight the features.

Water administration

Prior Appropriation Doctrine is the priority system followed. Water source, location structure, call priority, and priority date details are accessible. Administrative call data is also available (active and historical). The following is a snapshot of viewing active call data. The additional resources tab at the bottom has more information linking to call analysis for each structure and water source. Acquisition and new appropriation data for lakes and streams is also available. Decree details, court documents, and net amounts for different structure types (aquifer reservation, augmentation plan, basin determination, ditch system, entity, erosion control dam, exchange plan, livestock water tank, measuring point, mine, minimum flow, pipeline, power plant, pump, reach and recharge area) can be accessed at https://dwr.state.co.us/Tools/WaterRights/NetAmounts

Climate Stations

Climate data is available for 12,157 stations from different sources such as NRCS, NOAA, CoCoRaHS, and CoAgMet. The data can be viewed as a table or graph and downloaded as a CSV file for each site. Climate parameters include evaporation, temperature, precipitation, snow, vapor pressure, and wind speed.

Dams

Information on the structural details of the dam is provided under the dam safety tool. The data can be downloaded from this link. Data includes 53 columns for each dam with details on dam type, length, height, crest elevation, storage, federal regulations, surface area, drainage area, spillway capacity, outlet description, last inspection date, safe storage and others.

Groundwater

This page has well-documented groundwater level reports for the years 1989-2023 for aquifers in different basins of Colorado, along with rules for water well construction and inspection details. The groundwater – water levels tool has data for around 10,000 wells with details on well name, water level depth, water level elevation, period of record count, measuring agency, aquifer name, county, basin, and others. Well permit data is available on this page.

Surface water data

Current and historical gage data is available for 2,554 stations, with tabulated data on station name, station type, data source, river source, and record start and end dates. This page has the details on stream gage stations, diversion structures, and storage structures. Along with flow data, data on different environmental variables is available, such as water temperature, air temperature, conductivity, and pH.

GIS and Mapping

GIS data is available for download for all river basins and includes different layers such as diversions, wells, streams/rivers, canals, irrigated acreage, and climate stations. Other tools include a location converter and a distance calculator (https://dwr.state.co.us/Tools/Home). They also provide an extensive tool called Map Viewer that can be used to view GIS data online. The different map viewers are as follows:

  • DWR Map Viewer: All-inclusive map viewer with over 170 map layers of DWR and CWCB data. (Use the Layer Catalog to add layers that are not pre-populated.)
  • DWR Well Permit Viewer: A map viewer with data used to research well permits. (Use the Layer Catalog to add layers that are not pre-populated.)
  • CWCB Data Viewer: View data specific to CWCB instream flows, Construction Fund loans, recreational in-channel diversions (RICDs), WSRF grants and more.
  • CDSS SNODAS Tools: The SNODAS tools page provides interactive analysis tools to explore historic snowpack from 2003 forward.
Table taken from https://cdss.colorado.gov/map-viewers

Environmental Flow Tool

This flow tool is built for ecological risk assessment in the basin. The tool is developed in Excel and has embedded historical and future datasets for the Colorado River Basin. Future data is available for five planning scenarios, defined as business as usual, weak economy, cooperative growth, adaptive innovation, and hot growth. Environmental flow risk categories are low, moderate, less moderate, high, and very high ecological risk. Ecological risk maps are generated for each basin.

Cost estimating tool

The structure of the cost estimating tool is shown in the following figure.

Technical Update to the Water Plan

CWCB released an update to the water plan for use by researchers, decision makers, water users, and stakeholders, with data on historical and future availability covering different sectors, utilizing the above-described tools. In my opinion, this report is worth reading to understand Colorado’s water assessment.

All the headings are linked to web pages where the tool data is available, which are accessible as of May 6, 2024.

Ensemble forecasting – theory

‘Does the flap of a butterfly’s wing in Brazil stir up a tornado in Texas?’ These words reflect the so-called ‘butterfly effect’ typically ascribed to Edward Lorenz (MIT), who was a pioneer in the field of numerical weather prediction (NWP). (Interestingly, Lorenz himself first referred to the ‘flap of a seagull’s wings’; the butterfly swap-out was the work of a creative conference session convener. Somewhat coincidentally, the shape of the ‘Lorenz attractor’ is also reminiscent of a butterfly, see Figure 1). Motivated by the very salient challenge in the post-WWII era to improve weather prediction capabilities for strategic purposes, he developed a theoretical framework for the predictability of non-linear dynamical systems in a seminal paper, ‘Deterministic nonperiodic flow’ (Lorenz, 1963), that would come to be known as ‘chaos theory’ or ‘chaotic dynamics’. This theory was foundational to the development of ensemble forecasting (and a whole host of other things), and still underpins our current theoretical understanding of the inherent predictability of complex dynamical systems like the atmosphere (Lorenz, 2006).


Figure 1. Book cover (left panel) and the characteristic ‘butterfly wing’ shape of the Lorenz attractor (right panel) from his seminal 1963 work. (Palmer, 1993)

In actuality, the whole ‘flap of a butterfly’s wing creating a tornado in Texas’ relates specifically to only one aspect of the theoretical framework developed by Lorenz. In this post, I will attempt to give a slightly fuller treatment of this theoretical framework and its development. My hope is that this post will be a theoretical primer for two follow-on practically oriented posts: 1) a discussion of the state-of-the-science of hydrometeorological ensemble forecasting and its application in emerging water resources management practices like forecast informed reservoir operations, and 2) a deeper dive into ensemble verification techniques. While the techniques for (2) are grounded in the theoretical statistical properties of stochastic-dynamic predictive ensembles, they have broad applications to any scenario where one needs to diagnose the performance of an ensemble.

The Lorenz model and ‘chaos’ theory

When most people hear the word ‘chaos’, they tend to think of the dictionary definition: ‘complete disorder and confusion’ (Oxford), which in a scientific context might aptly describe some sort of white noise process. As you will see, this is somewhat of a far cry from the ‘chaos’ described in Lorenz’s theory. The ‘chaos’ terminology was actually coined by a later paper building on Lorenz’s work (Li & Yorke, 1975) and as noted by Wilks (2019): ‘…this label is somewhat unfortunate in that it is not really descriptive of the sensitive-dependence phenomenon’. The sensitive-dependence phenomenon is one of a set of 4 key properties (to be discussed later) that characterize the behaviors of the sort of non-linear, non-periodic deterministic systems that Lorenz argued were most representative of the atmosphere. In contrast to ‘disorder’ and ‘confusion’, these properties, in fact, lend some sense of order to the evolution of these dynamical systems, although the nature of that order is highly dependent on the system state.

A deterministic, non-periodic systems model

First, let’s dive into a few details of Lorenz’s 1963 work, using some figures and descriptions from a later paper (Palmer, 1993) that are easier to follow. While the 1963 paper is quite dense, much of the mathematical logic is dedicated to justifying the use of a particular system of equations that forms the basis of the study. This system of 3 equations and 3 variables (X, Y, Z) describes a non-linear, dissipative model of convection in a fluid of uniform depth, where there is a temperature difference between the upper and lower surfaces. Lorenz derived the set of 3 equations shown in the upper left panel of Figure 2 from earlier work by Rayleigh (1916) on this particular problem. In short, X relates to the intensity of convective motion, Y relates to the temperature difference between ascending and descending currents, and Z relates to the distortion of the vertical temperature profile from linearity; the details of these variables are not actually all that important to the study. What is important is that this system of equations has no general solutions (aside from the steady state solution) and must be numerically integrated in the time dimension to determine the convective flow dynamics. The ‘trajectories’ of these integrations shown in the phase space diagrams in Figure 2 exhibit the sorts of unstable, non-periodic behaviors that Lorenz thought were the most useful analog to atmospheric dynamics, albeit in a much simpler system. (‘Much’ is an understatement here; modern NWP models have a phase space on the order of 10^9 in contrast to the 3-dimensional phase space of this problem, Wilks, 2019. Of note, the use of phase-space diagrams (i.e. plots where each axis corresponds to a dependent variable in the system of equations) preceded Lorenz, but his use of them is perhaps one of the best-known early instances of this kind of visualization. Other uses of phase space relationships can be found in Rohini’s post or Dave’s post)

Figure 2. a) Lorenz equations and the XY phase space diagram of their integrations. b-c) X variable timeseries of two separate integrations of the Lorenz equations from very similar initial conditions. (Palmer, 1993)
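If you’d like to generate these trajectories yourself, the sketch below numerically integrates the Lorenz equations with scipy using the standard parameter values from the 1963 paper (sigma = 10, r = 28, b = 8/3), for a pair of very slightly different initial conditions (mirroring Figure 2b-c); the initial states and integration length are arbitrary choices.

import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Lorenz (1963) convection model: X ~ convective intensity, Y ~ temperature
    difference between ascending/descending currents, Z ~ vertical profile distortion."""
    x, y, z = state
    return [sigma * (y - x), x * (r - z) - y, x * y - b * z]

t_eval = np.linspace(0, 40, 4000)
sol_a = solve_ivp(lorenz, (0, 40), [1.0, 1.0, 1.0], t_eval=t_eval)
sol_b = solve_ivp(lorenz, (0, 40), [1.0 + 1e-6, 1.0, 1.0], t_eval=t_eval)

x_a, y_a = sol_a.y[0], sol_a.y[1]   # plot x_a vs y_a to trace the 'butterfly' of Figure 2a
x_b = sol_b.y[0]                    # compare x_a and x_b to see the eventual divergence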

Regime structure and multiple distinct timescales

What ‘behaviors’ then, are we talking about? Referring again to Figure 2a, we see the projection of a long series of model integration trajectories onto the XY-plane of the phase space diagram, where each dot is a timestep in the integration. What is immediately apparent in the form of these integrations is the two-lobed ‘butterfly’ shape of the diagram. Each lobe has a central attractor where the trajectories often reside for multiple revolutions, but can then transition to the other attractor when passing near the center of the diagram. These so-called ‘Lorenz attractors’ comprise one of the key properties of chaotic dynamics, namely regime structure, which is the tendency of the trajectories to reside around phase space attractors for some period of time. This residence time in a regime is generally quite a bit longer than the timescales of the trajectories’ individual revolutions around an attractor. This attribute is referred to as multiple distinct timescales and is evidenced in Figure 2b-c, where smaller amplitude sections of the timeseries show residence in one or the other regime and large amplitude sections describe transitions between regimes. Often, but not always, the residence in these regimes occurs for multiple revolutions, suggesting that there are shorter timescale evolutions of the system that take place within these regimes, while infrequent, random shifts to the other regime occur at longer timescales.

Sensitive-dependence and state-dependent variation in predictability

Figure 3. a-c) Different trajectories through the XY phase space dependent on the initial condition state-space (black circle). (Palmer, 1993)

Returning now to the ‘butterfly effect’; what, then, is this whole sensitive-dependence thing mentioned earlier? Figure 3a-c provides a nice graphical representation of this phenomenon. In each panel, a set of nearby initial states is chosen at a different location in the phase space diagram and then followed through their set of trajectories. In 3a, these trajectories neatly transition from one regime to the other, remaining relatively close to each other at the trajectories’ end. In 3b, a set of initial states not so far from 3a is chosen, and instead of neatly transitioning to the other regime, the trajectories diverge strongly near the center of the diagram, with some remaining in the original regime and some transitioning. However, for about half of the timespan, the trajectories remain very close to one another. Finally, an initial state chosen near the center of the diagram (3c) diverges quickly into both regimes, with trajectories ending up at nearly opposite ends of the phase space (black tails at upper right and left of diagram). Figures 3b-c, in particular, showcase the sensitive dependence on initial conditions of the system. In other words, from a set of very close-by initial states, the final trajectories in phase space can yield strongly divergent results. Importantly, this divergence in trajectories can occur right away (3c), at some intermediate time (3b), or not at all (3a).

This is the basic idea behind the last core property of these systems, state-dependent variation in predictability. Clearly, a forecast initialized from the starting point in 3a could be a little bit uncertain about the exact starting state and still end up in about the right spot for an accurate future prediction at the end of the forecast horizon. At medium ranges, this is also the case for 3b, but the longer range forecast is highly uncertain. For 3c, all forecast ranges are highly uncertain; in other words, the flap of a butterfly’s wing can mean the difference between one or the other trajectory! Importantly, one can imagine in this representation that an average value of 3c’s trajectories (i.e. the ensemble mean) would fall somewhere in the middle of the phase space and be representative of neither of the two physically plausible trajectories into the right or left regimes. This is an important point that we’ll return to at the end of this post.

From the Lorenz model to ensemble forecasting

The previous sections have highlighted this idealized dynamical system (the Lorenz model) that theoretically has properties of the actual atmosphere. Those 4 properties (the big 4!) were: 1) sensitive-dependence on initial conditions, 2) regime structure, 3) multiple distinct timescales, and 4) state-dependent variation in predictability. In this last section, I will tie these concepts into a theoretical discussion of ensemble forecasting. Notably, in the previous sections, I discussed ‘timescales’ without any reference to the real atmosphere. The dynamics of the Lorenz system prove to be quite illuminating about the actual atmosphere when timescales are attached to the system that are roughly on the order of the evolution of synoptic scale weather systems. If one assumes that a lap around the Lorenz attractor equates to a synoptic timescale of ~5 days, then the Lorenz model can be conceptualized in terms of relatively homogenous synoptic weather progressions occurring within two distinct weather regimes that typically persist on the order of weeks to months (see Figure 3b-c). This theoretical foundation jibes nicely with the empirically derived weather-regime based approach (see Rohini’s post or Nasser’s post), where, incidentally, 4 or more weather regimes are commonly found (the real atmosphere is quite a bit more complex than the Lorenz model, after all). This discussion of the Lorenz model has hopefully given you some intuition that conceptualizing real world weather progressions outside the context of an appropriate regime structure could lead to some very unphysical representations of the climate system.

Ensemble forecasting as a discrete approximation of forecast uncertainty

In terms of weather prediction, though, the big 4 make things really tough. While there are certainly conditions in the atmosphere that can lead to excellent long range predictability (i.e. ‘forecasts of opportunity’, see Figure 3a), the ‘typical’ dynamics of the atmosphere yield the potential for multiple regimes and associated transitions within the short-to-medium range timeframe (1-15 days) where synoptic-scale hydro-meteorological forecasting is theoretically possible. (Note, by ‘synoptic-scale’ here, I am referring to the capability to predict actual weather events, not prevailing conditions. Current science puts the theoretical dynamical limit to predictability at ~2 weeks, with current NWP technology achieving ‘usable’ skill out to ~7 days, give or take a few days depending on the region.)

Early efforts to bring the Lorenz model into weather prediction sought to develop analytical methods to propagate initial condition uncertainty into a useful probabilistic approximation of the forecast uncertainty at various lead times. This can work for simplified representations of the atmosphere like the Lorenz model, but quickly becomes intractable as the complexity and scale of the governing equations and boundary conditions increase.

Figure 4. Conceptual depiction of ensemble forecasting including sampling of initial condition uncertainty, forecasts at different lead times, regime structure, and ensemble mean comparison to individual ensemble members. Solid arrow represents a ‘control’ member of the ensemble. Histograms represent an approximate empirical distribution of the ensemble forecast. (Wilks, 2019)

Thankfully, rapid advances in computing power led to an alternate approach, ensemble forecasting! Ensemble forecasting is a stochastic-dynamic approach that couples a discrete, probabilistic sampling of the initial condition uncertainty (Figure 4 ‘Initial time’) and propagates each of those initial condition state vectors through a dynamical NWP model. Each of these NWP integrations is a unique trajectory through state space of the dynamical model that yields a discrete approximation of the forecast uncertainty (Figure 4 ‘Intermediate/Final forecast lead time’). This discrete forecast uncertainty distribution theoretically encompasses the full space of potential hydro-meteorologic trajectories and allows a probabilistic representation of forecast outcomes through analysis of the empirical forecast ensemble distribution. These forecast trajectories highlight many of the big 4 properties discussed in previous sections, including regime structure and state-dependent predictability (Figure 4 trajectories are analogous to the Figure 3b trajectories for the Lorenz model). The ensemble mean prediction is an accurate and dynamically consistent prediction at the intermediate lead time, but at the final lead time where distinct regime partitioning has occurred, it is no longer dynamically consistent and delineates a region of low probability in the full ensemble distribution. I will explore properties of ensemble averaging, both good and bad, in a future post.

Lastly, I will note that the ensemble forecasting approach is a type of Monte Carlo procedure. Like other Monte Carlo approaches with complex systems, the methodology for sampling the initial condition uncertainty has a profound effect on the uncertainty quantification contained within the final NWP ensemble output, especially when considering the high degree of spatiotemporal relationships within the observed climatic variables that form the initial state vector. This is a key area of continued research and improvement in ensemble forecasting models.
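As a toy analog of this Monte Carlo procedure, the sketch below perturbs an initial state of the Lorenz model (redefined here so the snippet stands alone) with small Gaussian noise, propagates each perturbed member through the dynamical model, and checks how the empirical spread of the X variable grows with lead time. The perturbation scale, member count, and lead times are arbitrary choices for illustration.

import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    x, y, z = state
    return [sigma * (y - x), x * (r - z) - y, x * y - b * z]

rng = np.random.default_rng(1)
base_state = np.array([1.0, 1.0, 1.05])
t_eval = np.linspace(0, 25, 2500)

# sample initial condition uncertainty and propagate each member through the model
members = np.array([
    solve_ivp(lorenz, (0, 25), base_state + rng.normal(scale=1e-3, size=3), t_eval=t_eval).y[0]
    for _ in range(50)
])  # shape: (member, lead time)

# empirical ensemble spread at a short and a long lead time
print(members[:, 100].std(), members[:, -1].std())   # spread grows dramatically with lead time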

Final thoughts

I hope that you find this discussion of the theoretical underpinnings of chaotic dynamics and ensemble forecasting to be useful. I have always found these foundational concepts to be particularly fascinating. Moreover, the basic theory has strong connections outside of ensemble forecasting, including the ties to weather regime theory mentioned in this post. I also think these foundational concepts are necessary to understand how much actual ensemble forecasting techniques can diverge from the theoretical stochastic-dynamic framework. This will be the subject of some future posts that will delve into more practical and applied aspects of ensemble forecasting with a focus on water resources management.

References

Lorenz, E. N. (1963). Deterministic nonperiodic flow. Journal of the Atmospheric Sciences, 20, 130–141.

Lorenz, E. N. (2006). Reflections on the Conception , Birth , and Childhood of Numerical Weather Prediction. Annual Review of Earth and Planetary Science, 34, 37–45. https://doi.org/10.1146/annurev.earth.34.083105.102317

Palmer, T. N. (1993). Extended-range atmospheric prediction and the Lorenz model. Bulletin – American Meteorological Society, 74(1), 49–65. https://doi.org/10.1175/1520-0477(1993)074<0049:ERAPAT>2.0.CO;2

Rayleigh, L. (1916). On convective currents in a horizontal layer when the higher temperature is on the underside. Phil. Mag., 32, 529-546.

Wilks, D. S., (2019). Statistical Methods in the Atmospheric Sciences, 4th ed. Cambridge, MA: Elsevier.

Python Profiling with line_profiler

The line_profiler package can be used to see the amount of time taken to execute each line in a function of your code. I think this is an important tool that can be used to reduce the runtime of code. The simple command “pip install line_profiler” installs the package, or use “conda install line_profiler” to install it into an existing conda environment.

I will present the usage of this line_profiler tool on randomly generated data to calculate the supply to demand ratio for releases from a reservoir. Demand, or target supply, is defined for each day of the water year. The following code first defines the calculation of the day of the water year, generates random data for demand and supply, and then defines two functions that calculate the ratio of supply to demand in two different ways. Include the line @profile before a function definition to get the profile of execution of each line in that function.

import pandas as pd
import numpy as np
from line_profiler import profile

# function to calculate day of water year
def get_dowy(date):
    water_year_start = pd.Timestamp(year=date.year, month=10, day=1)
    if date < water_year_start:
        water_year_start = pd.Timestamp(year=date.year - 1, month=10, day=1)
    return (date - water_year_start).days + 1

# Generate random data for demand for each day of water year
np.random.seed(0)
data = {
    'Median_Demand': np.random.randint(0, 1000, 367),
}

# Create dataframe
df_demand = pd.DataFrame(data)

## Generate random data for supply from 2001 to 2091 and also define corresponding day of water year
date_range = pd.date_range(start='2001-10-01', end='2091-09-30', freq='D')
data = {
    'dowy': [get_dowy(date) for date in date_range],
    'Supply': np.random.uniform(0, 2500, len(date_range))
}
# Create dataframe
df_supply = pd.DataFrame(data, index=date_range)

@profile  # define before the function for profiling
def calc_supply_demand_1(df,df_median):
    ratio = pd.DataFrame()
    medians_dict = df_demand['Median_Demand'].to_dict()
    demand = df_supply['dowy'].map(medians_dict)
    supply = df_supply['Supply']
    ratio = supply.resample('AS-OCT').sum() / demand.resample('AS-OCT').sum()
    return ratio

@profile
def calc_supply_demand_2(df,df_median):
    ratio = pd.DataFrame()
    medians_dict = df_demand['Median_Demand'].to_dict()
    demand = pd.Series([df_demand['Median_Demand'][i] for i in df.dowy], index=df.index)
    supply = df_supply['Supply']
    ratio = supply.resample('AS-OCT').sum() / demand.resample('AS-OCT').sum()
    return ratio

ratio1 = calc_supply_demand_1(df_supply, df_demand)
ratio2 = calc_supply_demand_2(df_supply, df_demand)

Running just the code wouldn’t output anything related to line_profiler. To enable profiling, run the script as follows (this sets the environment variable LINE_PROFILE=1)

LINE_PROFILE=1 python Blog_Post.py

The above command generates three output files, profile_output.txt, profile_output_<timestamp>.txt, and profile_output.lprof, and the stdout is as follows:

Timer unit: 1e-09 s

0.04 seconds - /directory/Blog_Post.py:30 - calc_supply_demand_1
2.43 seconds - /directory/Blog_Post.py:39 - calc_supply_demand_2
Wrote profile results to profile_output.txt
Wrote profile results to profile_output_2024-03-29T192919.txt
Wrote profile results to profile_output.lprof
To view details run:
python -m line_profiler -rtmz profile_output.lprof

On executing the line “python -m line_profiler -rtmz profile_output.lprof”, the following is printed.

Timer unit: 1e-06 s

Total time: 0.0393394 s
File: /directory/Blog_Post.py
Function: calc_supply_demand_1 at line 30

Line # Hits Time Per Hit % Time Line Contents
==============================================================
30 @profile
31 def calc_supply_demand_1(df,df_median):
32 1 2716.4 2716.4 6.9 ratio = pd.DataFrame()
33 1 1365.2 1365.2 3.5 medians_dict = df_demand['Median_Demand'].to_dict()
34 1 3795.6 3795.6 9.6 demand = df_supply['dowy'].map(medians_dict)
35 1 209.7 209.7 0.5 supply = df_supply['Supply']
36 1 31252.0 31252.0 79.4 ratio = supply.resample('AS-OCT').sum() / demand.resample('AS-OCT').sum()
37 1 0.5 0.5 0.0 return ratio

Total time: 2.43446 s
File: /directory/Blog_Post.py
Function: calc_supply_demand_2 at line 39

Line # Hits Time Per Hit % Time Line Contents
==============================================================
39 @profile
40 def calc_supply_demand_2(df,df_median):
41 1 1365.1 1365.1 0.1 ratio = pd.DataFrame()
42 1 697.5 697.5 0.0 medians_dict = df_demand['Median_Demand'].to_dict()
43 1 2411800.5 2e+06 99.1 demand = pd.Series([df_demand['Median_Demand'][i] for i in df.dowy], index=df.index)
44 1 53.9 53.9 0.0 supply = df_supply['Supply']
45 1 20547.0 20547.0 0.8 ratio = supply.resample('AS-OCT').sum() / demand.resample('AS-OCT').sum()
46 1 0.6 0.6 0.0 return ratio

0.04 seconds - /directory/Blog_Post.py:30 - calc_supply_demand_1
2.43 seconds - /directory/Blog_Post.py:39 - calc_supply_demand_2

The result shows the line number, the number of hits (the number of times the line is executed; hits increase when the line is executed in a for loop), the total time, time per hit, percentage of time, and the line contents. The above result implies that for the first function, 79.4% of the time was used to execute the ratio calculation, whereas for the second function, 99.1% was used in the execution of demand. ratio1 and ratio2 are exactly the same outputs; demand is simply defined in different ways in the two functions. We also see that the time taken to execute the calc_supply_demand_1 function is 0.04 seconds and calc_supply_demand_2 is 2.43 seconds. By using line_profiler to identify that the demand calculation takes 99.1% of the time in calc_supply_demand_2, I could reduce the runtime by a factor of about 61 (2.43/0.04). Another method is using cProfile (details are in this blog post); cProfile gives more detailed information.

References:

https://kernprof.readthedocs.io/en/latest

https://researchcomputing.princeton.edu/python-profiling

Nonstationary stochastic watershed modeling

In this post, I will describe the motivation for and implementation of a nonstationary stochastic watershed modeling (SWM) approach that we developed in the Steinschneider group during the course of my PhD. This work is in final revision and should be published in the next month or so. This post will attempt to distill key components of the model and their motivation, saving the full methodological development for those who’d like to read the forthcoming paper.

SWMs vs SSM/SSG

Before diving into the construction of the model, some preliminaries are necessary. First, what are SWMs, what do they do, and why use them? SWMs are a framework that combines deterministic, process-based watershed models (think HYMOD, SAC-SMA, etc.; we'll refer to these as DWMs from here forward) with a stochastic model that captures their uncertainty. The stochastic part of this framework can be used to generate ensembles of SWM simulations that both represent the hydrologic uncertainty and are less biased estimators of the streamflow observations (Vogel, 2017).

Figure 1: SWM conceptual diagram

SWMs were developed to address challenges to earlier stochastic streamflow modeling/generation techniques (SSM/SSG; see for instance Trevor's post on the Thomas-Fiering SSG, and Julie's post and Lillian's post on other SSG techniques), arguably the most important of which is the question of how to formulate them under non-stationarity. Since SSMs are statistical models fitted directly to historical data, any attempt to implement them in a non-stationary setting requires strong assumptions about what the streamflow response might look like under an alternate forcing scenario. This is not to say that such an approach is not useful or valid for exploratory analyses (for instance, Rohini's post on synthetic streamflow generation to explore extreme droughts). SWMs attempt to address this issue of non-stationarity by using DWMs in their stochastic formulation, which lend some 'physics-based' cred to their response under alternate meteorological forcings.

Construction of an SWM

Over the years, there have been many SWM or SWM-esque approaches devised, ranging from simple autoregressive models to complex Bayesian approaches. In this work, we focus on a relatively straightforward SWM approach that models the hydrologic predictive uncertainty directly and simply adds random samples of it to the DWM simulations. The assumption here is that predictive uncertainty is an integrator of all traditional component modeling uncertainties (input, parameter, model/structural), so adding it back in can inject all these uncertainties into the SWM simulations at once (Shabestanipour et al., 2023).

Figure 2: Uncertainty components

By this straightforward approach, the fitting and parameter estimation of the DWM is accomplished first (and separately) via 'standard' fitting procedures; for instance, parameter optimization to maximize Nash-Sutcliffe Efficiency (NSE). Subsequently, we develop the stochastic part of the model on the predictive uncertainty that remains, which in this case is defined simply by differencing the DWM predictions and the target observations. This distribution of differenced errors is the 'predictive uncertainty distribution' or 'predictive errors' that forms the target of our stochastic model.
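
As a minimal sketch of that differencing step (the sign convention, predictions minus observations here, is an arbitrary choice for illustration; the paper's convention may differ):

import pandas as pd

def predictive_errors(q_dwm: pd.Series, q_obs: pd.Series) -> pd.Series:
    # Raw predictive errors: DWM simulation minus observed flow,
    # aligned on the common time index
    return (q_dwm - q_obs).dropna()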

Challenges in modeling predictive uncertainty

Easy, right? Not so fast. There is a rather dense and somewhat unpalatable literature (except for the masochists out there) on the subject of hydrologic uncertainty that details the challenges in modeling these sorts of errors. Suffice it to say that they aren't well behaved (heteroscedastic, skewed, non-Gaussian, and autocorrelated, among other things). Any model we devise for these errors must be able to manage these bad behaviors.

So, what if we decide that we want to try to use this SWM thing for planning under future climates? Certainly the DWM part can hack it. We all know that lumped, conceptual DWMs are top-notch predictors of natural streamflow… At the least, they can produce physically plausible simulations under alternate forcings (we think). What of the hydrologic predictive uncertainty then? Is it fair or sensible to presume that some model we constructed to emulate historical uncertainty is appropriate for future hydrologic scenarios with drastically different forcings? My line of rhetorical questioning should clue you in on my feelings on the subject. YES!, of course. ‘Stationarity is immortal!’ (Montanari & Koutsoyiannis, 2014).

Towards a hybrid, state-variable dependent SWM

No, actually, there are a number of good reasons why this statement might not hold for hydrologic predictive uncertainty under non-stationarity. You can read the paper for the laundry list. In short, hydrologic predictive uncertainty of a DWM is largely a reflection of its structural misrepresentation of the true process. Thus, the historical predictive uncertainty that we fit our model to is a reflection of that structural uncertainty propagated through historical model states under historical, ‘stationary’ forcings. If we fundamentally alter those forcings, we should expect to see model states that do not exist under historical conditions. The predictive errors that result from these fundamentally new model states are thus likely to not fall neatly into the box carved out by the historical scenarios.

Figure 3: Structural uncertainty

To bring this back to the proposition of a nonstationary SWM approach: the intrinsic link between model structure and its predictive uncertainty raises an interesting prospect. Could there be a way to leverage a DWM's structure to understand its predictive uncertainty? Well, I hope so, because that's the premise of this work! What I'll describe in the ensuing sections is the implementation of a hybrid, state-variable dependent SWM approach. 'Hybrid' because it couples both machine learning (ML) and traditional statistical techniques. 'State-variable dependent' because it uses the timeseries of model states (described later) as the means to infer the hydrologic predictive uncertainty. I'll refer to this as the 'hybrid SWM' for brevity.

Implementation of the hybrid SWM

So, with backstory in hand, let’s talk details. The remainder of this post will describe the implementation of this hybrid SWM. This high-level discussion of the approach supports a practical training exercise I put together for the Steinschneider group at the following public GitHub repo: https://github.com/zpb4/hybrid-SWM_training. This training also introduces a standard implementation of a GRRIEN repository (see Rohini’s post). Details of implementing the code are contained in the ‘README.md’ and ‘training_exercise.md’ files in the repository. My intent in this post is to describe the model implementation at a conceptual level.

Model-as-truth experimental design

First, in order to address the problem of non-stationary hydrologic predictive uncertainty, we need an experimental design that can produce it. There is a very real challenge here of not having observational data from significantly altered climates to compare our hydrologic model against. We address this problem by using a ‘model-as-truth’ experimental design, where we fit one hydrologic model (‘truth’ model) to observations, and a second hydrologic model (‘process’ model) to the first truth model. The truth model becomes a proxy for the true, target flow of the SWM modeling procedure, while the process model serves as our proposed model, or hypothesis, about that true process. Under this design, we can force both models with any plausible forcing scenario to try to understand how the predictive uncertainty between ‘truth’ and ‘process’ models might change.

Figure 4: Conceptual diagram of ‘model-as-truth’ experimental design

For the actual work, we consider a very simple non-stationary scenario where we implement a 4°C shift to the temperature forcing data, which we refer to as the 'Test+4C' scenario. We choose this simple approach to confine non-stationarity to a high-confidence result of anthropogenic climate change, namely, thermodynamic warming. We compare this Test+4C scenario to a 'Test' scenario, which is the same out-of-sample temporal period (WY2005-2018) of meteorological inputs under historical values. SAC-SMA and HYMOD are the truth model and process model for this experiment, respectively. Other models could have been chosen; we chose these because they are conceptually similar and commonly used.

Figure 5: Errors between truth and process models in 5 wettest years of Test/Test+4C scenarios.

Hybrid SWM construction

The core feature of the hybrid SWM is a model for the predictive errors (truth model – process model) that uses the hydrologic model state-variables as predictors. We implement this model in two steps that have differing objectives but use the same state-variable predictor information. An implicit assumption in using state-variable dependencies in both steps is that such dependencies persist beyond the first step; in other words, we do not expect the error-correction step to produce independent and identically distributed residuals. We call the first step an 'error-correction model' and the second step a 'dynamic residual model'. Since we use HYMOD as our process model, we use its state-variables (Table 1) as the predictors for these two steps.

Table 1: HYMOD state variables

  • sim (Simulation): HYMOD predicted streamflow in mm
  • runoff (Runoff): Upper reservoir flow of HYMOD in mm
  • baseflow (Baseflow): Lower reservoir flow of HYMOD in mm
  • precip (Precipitation): Basin averaged precipitation in mm
  • tavg (Average temperature): Basin averaged temperature in °C
  • et (Evapotranspiration): Modeled evapotranspiration (Hamon approach) in mm
  • upr_sm (Upper soil moisture): Basin averaged soil moisture content (mm) in upper reservoir
  • lwr_sm (Lower soil moisture): Basin averaged soil moisture (mm) in lower reservoir
  • swe (Snow water equivalent): Basin averaged snow water equivalent simulated by degree day snow module (mm)

Hybrid SWM: Error correction

The error-correction model is simply a predictive model between the hydrologic model (HYMOD) state-variables and the raw predictive errors. The error-correction model also uses lag-1 to 3 errors as covariates to account for autocorrelation. The objective of this step is to infer state-dependent biases in the errors, which are the result of the predictive errors subsuming the structural deficiencies of the hydrologic model. This ‘deterministic’ behavior in the predictive errors can also be conceived as the ‘predictive errors doing what the model should be doing’ (Vogel, 2017). Once this error-correction model is fit to its training data, it can be implemented against any new timeseries of state-variables to predict and debias the errors. We use a Random Forest (RF) algorithm for this step because they are robust to overfitting, even with limited training data. This is certainly the case here, as we consider only individual basins and a ~15 year training period (WY1989-2004). Moreover, we partition the training period into a calibration and validation subset and fit the RF error-correction model only to the calibration data (WY1989-1998), reducing available RF algorithm training data to 9 years.
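
To make this concrete, here is a rough sketch of what such an error-correction fit could look like using scikit-learn's RandomForestRegressor (an illustrative sketch, not the implementation in the training repo; column and function names are hypothetical):

import pandas as pd
from sklearn.ensemble import RandomForestRegressor

def fit_error_correction(states: pd.DataFrame, errors: pd.Series, n_lags: int = 3):
    # Assemble predictors: HYMOD state-variables plus lag-1 to lag-3 errors
    X = states.copy()
    for lag in range(1, n_lags + 1):
        X[f'err_lag{lag}'] = errors.shift(lag)
    X = X.dropna()              # drop rows lost to lagging
    y = errors.loc[X.index]     # align the target with the predictors

    # Random Forest regression of raw predictive errors on the predictors
    rf = RandomForestRegressor(n_estimators=500, random_state=0)
    rf.fit(X, y)
    return rf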

Hybrid SWM: Dynamic residual model

The dynamic residual model (DRM) is fit to the residuals of the error correction result in the validation subset. We predict the hydrologic model errors for the validation subset from the fitted RF model and subtract them from the empirical errors to yield the residual timeseries. By fitting the DRM to this separate validation subset (which the RF error-correction model has not seen), we ensure that the residuals adequately represent the out-of-sample uncertainty of the error-correction model.

A full mathematical treatment of the DRM is outside the scope of this post. In high-level terms, the DRM is built around a flexible distributional form particularly suited to hydrologic errors, called the skew exponential power (SEP) distribution. This distribution has 4 parameters (mean: mu, stdev: sigma, kurtosis: beta, skew: xi) and we assume a mean of zero (due to error-correction debiasing), while setting the other 3 parameters as time-varying predictands of the DRM model (i.e. sigma_t, beta_t, xi_t). We also include a lag-1 autocorrelation term (phi_t) to account for any leftover autocorrelation from the error-correction procedure. We formulate a linear model for each of these parameters with the state-variables as predictors. These linear models are embedded in a log-likelihood function that is maximized (i.e. MLE) against the residuals to yield the optimal set of coefficients for each of the linear models.

With a fitted model, the generation of a new residual at each timestep t is therefore a random draw from the SEP with parameters (mu=0, sigma_t, beta_t, xi_t) modified by the residual at t-1 (epsilon_(t-1)) via the lag-1 coefficient (phi_t).
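
Written out under the assumption of a standard lag-1 autoregressive form (my shorthand, not necessarily the paper's exact formulation), this generation step looks like:

epsilon_t = phi_t * epsilon_(t-1) + eta_t,  with  eta_t ~ SEP(mu = 0, sigma_t, beta_t, xi_t)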

Figure 6: Conceptual diagram of hybrid SWM construction.

Hybrid SWM: Simulation

The DRM is the core uncertainty modeling component of the hybrid SWM. Given a timeseries of state-variables from the hydrologic model for any scenario, the DRM simulation is implemented first, as described in the previous section. Subsequently, the error-correction model is implemented in 'predict' mode with the timeseries of random residuals from the DRM step. Because the error-correction model includes lag-1:3 terms, it must be implemented sequentially using errors generated at the previous 3 timesteps. The conclusion of these two simulation steps yields a timeseries of randomly generated, state-variable dependent errors that can be added to the hydrologic model simulation to produce a single SWM simulation. Repeating this procedure many times will produce an ensemble of SWM simulations.
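
To make the sequencing concrete, here is a heavily simplified sketch of generating one SWM realization; drm_draw and rf_predict_error are placeholders standing in for the fitted DRM and RF error-correction models, not actual functions from the training repo:

import numpy as np

def simulate_swm_realization(states, q_dwm, drm_draw, rf_predict_error, n_lags=3):
    # states[t]: state-variable vector at timestep t
    # q_dwm[t]: DWM-simulated streamflow at timestep t
    # drm_draw(state, prev_residual): random, state-dependent DRM residual
    # rf_predict_error(state, lagged_errors): error-correction prediction
    T = len(q_dwm)
    errors = np.zeros(T)
    residual = 0.0
    for t in range(T):
        # Step 1: draw a random residual from the dynamic residual model
        residual = drm_draw(states[t], residual)
        # Step 2: error-correction model in 'predict' mode, run sequentially
        # so it can use the errors generated at the previous n_lags timesteps
        lagged = errors[max(0, t - n_lags):t]
        errors[t] = rf_predict_error(states[t], lagged) + residual
    # Add the generated errors to the DWM simulation to get one SWM realization
    return np.asarray(q_dwm) + errors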

Final thoughts

Hopefully this discussion of the hybrid SWM approach has given you some appreciation for the nuanced differences between SWMs and SSM/SSGs, the challenges in constructing an adequate uncertainty model for an SWM, and the novel approach developed here in utilizing state-variable information to infer properties of the predictive uncertainty. The hybrid SWM approach shows a lot of potential for extracting key attributes of the predictive errors, even under unprecedented forcing scenarios. It decouples the task of inferring predictive uncertainty from features of the data like temporal seasonality (e.g. day of year) that may be poor predictors under climate change. When linked with stochastic weather generation (see Rohini’s post and Nasser’s post), SWMs can be part of a powerful bottom-up framework to understand the implications of climate change on water resources systems. Keep an eye out for the forthcoming paper and check out the training noted above on implementation of the model.

References:

Brodeur, Z., Wi, S., Shabestanipour, G., Lamontagne, J., & Steinschneider, S. (2024). A Hybrid, Non‐Stationary Stochastic Watershed Model (SWM) for Uncertain Hydrologic Simulations Under Climate Change. Water Resources Research, 60(5), e2023WR035042. https://doi.org/10.1029/2023WR035042

Montanari, A., & Koutsoyiannis, D. (2014). Modeling and mitigating natural hazards: Stationarity is immortal! Water Resources Research, 50, 9748–9756. https://doi.org/10.1002/2014WR016092

Shabestanipour, G., Brodeur, Z., Farmer, W. H., Steinschneider, S., Vogel, R. M., & Lamontagne, J. R. (2023). Stochastic Watershed Model Ensembles for Long-Range Planning: Verification and Validation. Water Resources Research, 59. https://doi.org/10.1029/2022WR032201

Vogel, R. M. (2017). Stochastic watershed models for hydrologic risk management. Water Security, 1, 28–35. https://doi.org/10.1016/j.wasec.2017.06.001

Tipping Points: Harnessing Insights for Resilient Futures

Introduction

Through various posts on this blog, we’ve delved into the intricacies of the lake problem, unraveling its multifaceted nature and shedding light on the complexities of socio-ecological systems. Amidst our analyses, one concept stands out: tipping points. These pivotal moments, where seemingly minor changes can have significant impacts, go beyond theoretical concepts. They embody critical thresholds within socio-ecological systems, capable of triggering disproportionate and often irreversible shifts. As we embark on this exploration, drawing from a recent deep dive into tipping points I conducted last semester, my hope is that this post enriches your understanding of this essential topic, which frequently emerges in our research discussions.

Socio-Ecological Systems: Understanding the Dynamics

At the interface between human societies and the natural environment lies the phenomenon of tipping points: critical thresholds within complex socio-ecological systems where incremental changes can trigger disproportionate and often irreversible shifts. Understanding these pivotal moments transcends mere academic intrigue; it holds profound implications for global sustainability, governance strategies, and our collective ability to navigate an era rife with uncertainties and challenges (Lauerburg et al. 2020). At their core, socio-ecological systems encapsulate the interdependencies between human societies and the surrounding environment. These systems are characterized by dynamic interactions among social, economic, and ecological components, fostering a complex web of relationships that shape the resilience and vulnerability of the system as a whole.

To comprehend tipping points within these systems, resilience theory offers invaluable insights. Resilience theory serves as a cornerstone in understanding the stability, adaptability, and transformative potential of socio-ecological systems. Central to this theory is the notion of resilience as the system's capacity to absorb disturbances, reorganize, and persist in the face of change. Tipping points, within the resilience framework, mark critical thresholds where the system's resilience is severely tested, and small disturbances may provoke abrupt and disproportionate shifts, potentially leading to regime changes or alternate system states (Sterk, van de Leemput, and Peeters 2017).

(Figure source: Lauerburg et al. 2020)

Real-world examples of socio-ecological tipping points abound across various ecosystems, showcasing the profound implications of critical thresholds in shaping ecological trajectories.

Coral Reefs

One prominent instance is the coral reef ecosystems, where rising sea temperatures and ocean acidification can trigger sudden and extensive coral bleaching events (Ravindran 2016; Moore 2018). Beyond a certain threshold, these events can lead to mass mortality of corals, fundamentally altering the reef’s structure and compromising its ability to support diverse marine life.

Amazon Rainforest

Another compelling example lies in the Amazon rainforest. Deforestation, exacerbated by human activities such as logging and agriculture, can push the rainforest past a tipping point where it transforms from a lush, biodiverse ecosystem into a drier savanna-like landscape (Nobre and Borma 2009; Amigo 2020). This transition can be irreversible, leading to a loss of biodiversity, disruption of regional climates, and further exacerbation of climate change.

Eutrophication of Lakes

Similarly, in freshwater ecosystems like shallow lakes and ponds, excessive nutrient input from agricultural runoff or urban sewage can drive eutrophication. This increased nutrient loading can push these ecosystems towards a tipping point where algal blooms become persistent, leading to oxygen depletion, fish kills, and ultimately a shift from a clear-water state to a turbid, algae-dominated state (Quinn, Reed, and Keller 2017).

These real-world examples underscore the vulnerability of various ecosystems to tipping points and emphasize the need for proactive and adaptive management strategies to prevent or mitigate such shifts. Ethical and equity considerations also play a pivotal role in addressing tipping points within socio-ecological systems: viewing tipping points through the lens of social justice highlights the disproportionate impact exerted on marginalized and vulnerable groups, which exacerbates existing social inequities and emphasizes the imperative to preserve the resilience and functionality of ecosystems vital for supporting biodiversity, regulating climate, and sustaining human livelihoods.

Understanding Tipping Points in Water Resource Allocation Systems

In water resource allocation systems, tipping points are triggered by various factors, from human-induced stresses to natural and climate-related dynamics. Longitudinal studies and historical analyses are crucial for identifying the precursors and patterns that precede these tipping points; they offer insights into the temporal evolution of system dynamics, enabling the identification of early warning signals and indicators heralding tipping events (Grimm and Schneider 2011; Kradin 2012). Detecting and predicting tipping points within water resource management nonetheless remains difficult: data constraints and methodological challenges significantly impede their identification and prediction, and strategies aimed at bolstering resilience and adapting to these complexities are needed. Various modeling paradigms and approaches offer lenses through which tipping points within socio-ecological systems can be analyzed and understood. For instance, methodologies for detecting tipping points, such as network analysis and complex-systems approaches, are often incorporated into modeling frameworks like agent-based models (ABMs) and simulation techniques (Peng and Lu 2012; Moore 2018). This integration allows the identification of key nodes or connections within the system that are particularly sensitive to changes, potentially indicating locations of tipping points. Similarly, complex-systems approaches inform the structure and dynamics of the model, aiding in capturing emergent behaviors and potential tipping phenomena.

Moving forward, there are several instances where tipping points have been observed in water resource allocation, shedding light on critical junctures that significantly impact water availability and ecosystem stability. For example, one study delves into aquifer depletion and groundwater tipping points, highlighting how unsustainable groundwater extraction practices can lead to aquifer depletion, affecting water availability for agricultural and domestic purposes (Castilla-Rho et al. 2017). This research emphasizes the importance of understanding social norms and compliance with conservation policies to mitigate unsustainable groundwater development. Another investigation explores tipping points in river basin management, where water allocation practices exceed ecological capacities, resulting in altered flow regimes and ecosystem collapse (Yletyinen et al. 2019). This study underscores the interconnectedness between human activities and ecological resilience in river basins, emphasizing the need for adaptive management strategies to address these complex challenges.

Furthermore, recent studies highlight the significance of tipping points in the broader context of climate change and ecological research. Dietz et al. (2021) emphasize the importance of climate tipping points in economic models, demonstrating their potential to significantly impact the social cost of carbon and increase global economic risk. Similarly, Goodenough and Webb (2022) discuss the opportunities for integrating paleoecology into contemporary ecological research and policy, emphasizing the potential for paleoecological evidence to inform our understanding of ecological tipping points and natural processes. These insights underscore the interconnectedness of research domains and the importance of interdisciplinary collaboration in addressing complex environmental challenges.

Future Research Directions and Needs

Future research in the realm of tipping points within socio-ecological systems demands a strategic focus on specific areas to advance understanding and ensure policy relevance. Adaptive management emerges as a foundational strategy, emphasizing the implementation of flexible approaches capable of accommodating shifting conditions and uncertainties. Delving into emerging areas of study and innovation stands as a cornerstone, exploring novel research frontiers and innovative methodologies pivotal to advancing our comprehension of tipping points. Moreover, forecasting future tipping points encounters significant hurdles due to uncertainties in modeling future scenarios. Model limitations coupled with the unpredictable nature of complex systems amplify these uncertainties, making it challenging to project the timing, magnitude, and precise locations of tipping events accurately. Addressing these challenges necessitates innovative strategies that surmount data constraints, enhance modeling capabilities, and navigate uncertainties to fortify resilience against potential tipping points. Additionally, bridging knowledge gaps for policy relevance emerges as a crucial necessity. While scientific knowledge continues to evolve, translating these findings into actionable policies remains a persistent hurdle. To bridge this gap effectively, robust science-policy interfaces are imperative, enhancing communication channels and knowledge transfer mechanisms between researchers, policymakers, and practitioners (Goodenough and Webb 2022). By prioritizing interdisciplinary innovation and strengthening the connections between science and policy, future research endeavors can make substantial strides in addressing tipping points and bolstering the resilience of socio-ecological systems.

Conclusion

As we conclude our exploration of tipping points within socio-ecological systems, the profound implications of these critical thresholds become increasingly apparent. The intricate interplay of human activities and environmental changes underscores the complexity of these tipping phenomena. Embracing emerging areas of study and innovation is essential as we navigate uncertainties in modeling future scenarios. Leveraging the insights gained, we can enhance our ability to anticipate and respond to tipping events effectively.

By harnessing the insights gleaned from these endeavors, we can better equip ourselves to navigate the uncertainties ahead. From adaptive governance to community engagement, the tools at our disposal offer avenues for addressing the challenges posed by tipping points. I hope this exploration has provided valuable insights into socio-ecological tipping points and has expanded your understanding of these critical phenomena!

Lastly, HAPPY VALENTINE’S DAY!

References 

Amigo, Ignacio. 2020. “When Will the Amazon Hit a Tipping Point?” Nature 578 (7796): 505–8. https://doi.org/10.1038/d41586-020-00508-4.

Castilla-Rho, Juan Carlos, Rodrigo Rojas, Martin S. Andersen, Cameron Holley, and Gregoire Mariethoz. 2017. “Social Tipping Points in Global Groundwater Management.” Nature Human Behaviour 1 (9): 640–49. https://doi.org/10.1038/s41562-017-0181-7.

Dietz, Simon, James Rising, Thomas Stoerk, and Gernot Wagner. 2021. “Economic Impacts of Tipping Points in the Climate System.” Proceedings of the National Academy of Sciences 118 (34): e2103081118. https://doi.org/10.1073/pnas.2103081118.

Goodenough, Anne E., and Julia C. Webb. 2022. “Learning from the Past: Opportunities for Advancing Ecological Research and Practice Using Palaeoecological Data.” Oecologia 199 (2): 275–87. https://doi.org/10.1007/s00442-022-05190-z.

Grimm, Sonja, and Gerald Schneider. 2011. Predicting Social Tipping Points: Current Research and the Way Forward. Discussion Paper / Deutsches Institut Für Entwicklungspolitik 8/2011. Bonn: Dt. Inst. für Entwicklungspolitik.

Kradin, N. N., ed. 2012. Politicheskai︠a︡ antropologii︠a︡ tradit︠s︡ionnykh i sovremennykh obshchestv: Materialy mezhdunarodnoĭ konferent︠s︡ii. Vladivostok: Izdatelʹskiĭ dom Dalʹnevostochnogo federalʹnogo universiteta.

Lauerburg, R. A. M., R. Diekmann, B. Blanz, K. Gee, H. Held, A. Kannen, C. Möllmann, et al. 2020. “Socio-Ecological Vulnerability to Tipping Points: A Review of Empirical Approaches and Their Use for Marine Management.” Science of The Total Environment 705 (February): 135838. https://doi.org/10.1016/j.scitotenv.2019.135838.

Moore, John C. 2018. “Predicting Tipping Points in Complex Environmental Systems.” Proceedings of the National Academy of Sciences 115 (4): 635–36. https://doi.org/10.1073/pnas.1721206115.

Nobre, Carlos Afonso, and Laura De Simone Borma. 2009. “‘Tipping Points’ for the Amazon Forest.” Current Opinion in Environmental Sustainability 1 (1): 28–36. https://doi.org/10.1016/j.cosust.2009.07.003.

Peng, Heng, and Ying Lu. 2012. “Model Selection in Linear Mixed Effect Models.” J. Multivar. Anal. 109 (August): 109–29.

Quinn, Julianne D, Patrick M Reed, and Klaus Keller. 2017. “Direct Policy Search for Robust Multi-Objective Management of Deeply Uncertain Socio-Ecological Tipping Points.” Environmental Modelling & Software 92 (June): 125–41.

Ravindran, Sandeep. 2016. “Coral Reefs at a Tipping Point.” Proceedings of the National Academy of Sciences 113 (19): 5140–41. https://doi.org/10.1073/pnas.1605690113.

Sterk, Marjolein, Ingrid A van de Leemput, and Edwin THM Peeters. 2017. “How to Conceptualize and Operationalize Resilience in Socio-Ecological Systems?” Current Opinion in Environmental Sustainability, Sustainability governance, 28 (October): 108–13. https://doi.org/10.1016/j.cosust.2017.09.003.

Yletyinen, Johanna, Philip Brown, Roger Pech, Dave Hodges, Philip E Hulme, Thomas F Malcolm, Fleur J F Maseyk, et al. 2019. “Understanding and Managing Social–Ecological Tipping Points in Primary Industries.” BioScience 69 (5): 335–47. https://doi.org/10.1093/biosci/biz031.

The Thomas-Fiering Model for Synthetic Streamflow Generation with a Python Implementation

In 1962, a group of economists, engineers, and political scientists involved in the Harvard Water Program published "Design of Water Resource Systems". In chapter 12 of the book, Thomas and Fiering present the following statistical model, which was one of the first, if not the first, formal applications of stochastic modelling for synthetic streamflow generation and water resource systems evaluation.

It is an autoregressive model which can simulate monthly streamflow values based on the mean, variance, and correlation of historic observations.

In this blog post, I present the model in its original form along with a modified form presented by Stedinger and Taylor (1982). Then, I share a Python implementation of the model, which is used to generate an ensemble of synthetic flows. I use plotting tools from the Synthetic Generation Figure Library to plot the results.

All of the code used for this post is available here: ThomasFieringModelDemo

Let’s get into it!

The Thomas-Fiering Model

The model that Thomas and Fiering proposed took the form:
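
Writing it out in the notation described just below (and up to indexing conventions, which vary across textbooks), the recursion is:

Q_{m+1} = \bar{Q}_{m+1} + b_{m+1}\left(Q_m - \bar{Q}_m\right) + \sigma_{m+1}\sqrt{1 - r_m^2}\,\epsilon , \quad b_{m+1} = r_m \frac{\sigma_{m+1}}{\sigma_m}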

Where, for each month m, Q_m is the generated flow, \bar{Q}_m is the mean historic flow, b_m is an autoregression coefficient for predicting that month's flow from the prior month's flow, \sigma is the standard deviation, r is the correlation coefficient, and \epsilon is a random standard normal variable.

A modification to this model was proposed by Stedinger and Taylor (1982), which transforms the streamflow values before fitting the model. I refer to this as the "Stedinger transformation" below and in the code.

Given Q_{m} as the observed flows in month m, the Stedinger transformation of the observed flows is then:
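
In the notation used here, this is a shifted log transform:

x_m = \ln\left(Q_m - \hat{\tau}_m\right)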

where \hat{\tau}_m is the estimated “lower bound” for each month, calculated as:
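
A commonly used quantile-based estimator for this monthly lower bound, stated here as one plausible choice rather than necessarily the exact formula in the demo code, is:

\hat{\tau}_m = \frac{Q_{max,m}\,Q_{min,m} - Q_{median,m}^2}{Q_{max,m} + Q_{min,m} - 2\,Q_{median,m}}

where Q_{max,m}, Q_{min,m}, and Q_{median,m} are the largest, smallest, and median observed flows in month m.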

The modeled flows are generated from the recursive relationship:
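
Up to indexing conventions, my shorthand for this recursion in the transformed space is:

x_{m+1} = \mu_{m+1} + \rho_m \frac{\sigma_{m+1}}{\sigma_m}\left(x_m - \mu_m\right) + \sigma_{m+1}\sqrt{1 - \rho_m^2}\,\epsilon_m

with synthetic flows recovered by inverting the Stedinger transformation, Q_{m+1} = \hat{\tau}_{m+1} + \exp\left(x_{m+1}\right).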

Where:

  • \mu_{m} is the observed mean of the historic monthly x series
  • \sigma_{m}^2 is the observed variance of the historic monthly x series
  • \epsilon_{m} are independent standard-normal random variables
  • \rho_m are the observed between-month correlations of the historic x series

The above steps are performed for each month, and the synthetic streamflow sequence is generated by iteratively applying the stochastic process for the desired duration.

Python Implementation

I built this version of the Thomas Fiering model as a Python class with the following structure:

class ThomasFieringGenerator():
    def __init__(self, Q, **kwargs):
        # Store the observed monthly streamflow series
        ...

    def preprocessing(self, **kwargs):
        # Stedinger normalization
        ...

    def fit(self, **kwargs):
        # Calculate mu, sigma, and rho for each month
        ...

    def generate(self, n_years, **kwargs):
        # Iteratively generate a single timeseries
        # Inverse Stedinger normalization
        return Q_synthetic

    def generate_ensemble(self, n_years,
                          n_realizations=1,
                          **kwargs):
        # Loop and generate multiple timeseries
        return

Rather than posting the entire code here, which would clutter the page, I will refer you to and encourage you to check out the full implementation which is in the linked repository here: ThomasFieringModelDemo/model.py

To see how this is used and replicate the results below using some example data, see the Jupyter Notebook: ThomasFieringModelDemo/ThomasFiering_demo.ipynb
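
For a flavor of how the pieces fit together, here is a hedged usage sketch (the exact call sequence and the CSV layout assumed here may differ slightly from the demo notebook):

import pandas as pd
from model import ThomasFieringGenerator  # from the ThomasFieringModelDemo repo

# Monthly flows at USGS 01434000; column layout assumed for illustration
Q = pd.read_csv('usgs_monthly_streamflow_cms.csv',
                index_col=0, parse_dates=True).squeeze('columns')

tf = ThomasFieringGenerator(Q)
tf.preprocessing()   # Stedinger transformation
tf.fit()             # monthly mu, sigma, rho
ensemble = tf.generate_ensemble(n_years=50, n_realizations=100)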

Synthetic ensemble results

I used the ThomasFieringGenerator to produce 100 samples of 50-year monthly streamflows at USGS gauge site 01434000 on the Delaware River which has data going back to 1904.

This data is available in the repo and is stored in the file usgs_monthly_streamflow_cms.csv

The plotting functions are taken from the Synthetic Generation Figure Library which was shared previously on the blog.

First we consider the range of historic and synthetic streamflow timeseries:

Generally, when working with synthetic ensembles, it is good for the synthetic ensemble to 'envelop' the historic range while maintaining a similar median. The Thomas-Fiering model does a good job at this!

The next figure shows the range of flow-quantile values for both historic and synthetic flows. Again, we see a nice overlapping of the synthetic ensemble:

Conclusions

I personally think it is fun and helpful to look back at the foundational work in a field. Since Thomas and Fiering’s work in the early 1960s, there has been a significant amount of work focused on synthetic hydrology.

The Thomas-Fiering model is appealingly simple while still performing well (with the help of the Stedinger normalization). Sure, there are some limitations to the model (e.g., the estimation of distribution and correlation parameters may be inaccurate for short records, and the method does not explicitly prevent the generation of negative streamflows), but the model, and the Harvard Water Program more broadly, was successful in ushering in new approaches for water resource systems analysis.

References

Maass, A., Hufschmidt, M. M., Dorfman, R., Thomas, Jr, H. A., Marglin, S. A., & Fair, G. M. (1962). Design of water-resource systems: New techniques for relating economic objectives, engineering analysis, and governmental planning. Harvard University Press.

Stedinger, J. R., & Taylor, M. R. (1982). Synthetic streamflow generation: 1. Model verification and validation. Water Resources Research, 18(4), 909-918.

A quick and straightforward introduction to LIME

In this blog post, we will be discussing the many household uses of Citrus aurantiifolia, or the common lime.

Just kidding, we’ll be talking about a completely different type of LIME, namely Local Interpretable Model-Agnostic Explanations (at this point you may be thinking that the former’s scientific name becomes the easier of the two to say). After all, this is the WaterProgramming blog.

On a more serious note though, LIME is one of the two most widely known model-agnostic explainable AI (xAI) methods, alongside Shapley Additive Explanations (SHAP). This post is intended to be an introduction to LIME in particular, and we will be setting up the motivation for using xAI methods as well as a brief example application using the North Carolina Research Triangle system.

Before we proceed, here’s a quick graphic to get oriented:

The figure above mainly demonstrates three main concepts: Artificial Intelligence (AI), the methods used to achieve AI (one of which includes Machine Learning, or ML), and the methods to explain how such methods achieved their version of AI (explainable AI, or more catchily known as xAI). For more explanation on the different categories of AI, and their methods, please refer to these posts by IBM’s Data Science and AI team and this SpringerLink book by Sarker (2022) respectively.

Model-agnostic vs model-specific

Model-specific methods

As shown in the figure, model-specific xAI methods are techniques that can only be used on the specific model that it was designed for. Here’s a quick rundown of the type of model and their available selection of xAI methods:

  • Decision trees
    • Decision tree visualization (e.g. Classification and Regression Tree (CART) diagrams)
    • Feature importance rankings
  • Neural networks (NN), including deep neural networks
    • Coefficient (feature weight) analysis
    • Neural network neuron/layer activation visualization
    • Attention (input sequence) mechanisms
    • Integrated gradients
    • DeepLIFT
    • GradCAM
  • Reinforcement learning
    • Causal models

Further information on the mathematical formulation for these methods can be found in Holzinger et al (2022). Such methods account for the underlying structure of the model that they are used for, and therefore require some understanding of the model formulation to fully comprehend (e.g. interpreting NN activation visualizations). While less flexible than their agnostic counterparts, model-specific xAI methods provide more granular information on how specific types of information are processed by the model, and therefore how changing input sequences, or model structure, can affect model predictions.

Model-agnostic methods

Model-agnostic xAI (as its name suggests) relies solely on analyzing the input-output sets of a model, and therefore can be applied to a wide range of machine learning models regardless of model structure or type. It can be thought of (very loosely) as sensitivity analysis applied to AI methods (for more information on this discussion, please refer to Scholbeck et al. (2023) and Razavi et al. (2021)). SHAP and LIME both fall under this set of methods, and approximately abide by the following process: perturb the input, then identify how the output changes. Note that this set of methods provides little insight into the specifics of model formulation and how it affects model predictions. Nonetheless, it affords a higher degree of flexibility and does not bind you to one specific model.

Why does this matter?

Let’s think about this in the context of water resources systems management and planning. Assume you are a water utility responsible for ensuring that you reliably deliver water to 280,000 residents on a daily basis. In addition, you are responsible for planning the next major water supply infrastructure project. Using a machine learning model to inform your short- and long-term management and planning decisions without interrogating how it arrived at its recommendations implicitly assumes that the model will make sensible decisions that balance all stakeholders’ needs while remaining equitable. More often than not, this assumption can be incorrect and may lead to (sometimes funny but mostly) unintentional detrimental cascading implications on the most vulnerable (for some well-narrated examples of how ML went awry, please refer to Brian Christian’s “The Alignment Problem”).

Having said that, using xAI as a next step in the general framework of adopting AI into our decision-making processes can help us better understand why a model makes its predictions, how those predictions came to be, and their degree of usefulness to a decision maker. In this post, I will be demonstrating how LIME can be used to answer these kinds of questions.

The next section will establish the three components we will need to apply LIME to our problem:

  1. An input (feature) set and the training dataset
  2. The model predictions
  3. The LIME explainer

A quick example using the North Carolina Research Triangle

The input (feature) set and training dataset

The Research Triangle region in North Carolina consists of six main water utilities that deliver water to their namesake cities: OWASA (Orange County), Durham, Cary, Raleigh, Chatham, and Pittsboro (Gold et al., 2023). All utilities have collaboratively decided that they will each measure their system robustness using a satisficing metric (Starr 1962; Herman et al. 2015), where they satisfy their criteria for robustness if the following criteria are met:

  1. Their reliability meets or exceeds 98%
  2. Their worst-case cost of drought mitigation actions amount to no more than 10% of their annual volumetric revenue
  3. They do not restrict demand more than 20% of the time

If all three criteria are met, then they are considered "successful" (represented as a 1). Otherwise, they have "failed" (represented as a 0). We have 1,000 training data points of success or failure, each representing a state of the world (SOW) in which a utility fails or succeeds to meet its satisficing criteria. This is our training dataset. Each SOW consists of a unique combination of features that include inflow, evaporation and water demand Fourier series coefficients, as well as socioeconomic, policy and infrastructure construction factors. This is our feature set.
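
As a small illustration of that satisficing logic (variable names here are hypothetical, not the actual column names in the training data):

import numpy as np

def satisficing_label(reliability, worst_case_cost_frac, restriction_freq):
    # Return 1 ('success') if all three robustness criteria are met, else 0
    meets_all = (
        (reliability >= 0.98) &            # criterion 1: reliability >= 98%
        (worst_case_cost_frac <= 0.10) &   # criterion 2: mitigation cost <= 10% of annual revenue
        (restriction_freq <= 0.20)         # criterion 3: restrictions <= 20% of the time
    )
    return np.where(meets_all, 1, 0)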

Feel free to follow along this portion of the post using the code available in this Google Colab Notebook.

The model prediction

To generate our model prediction, let's first code up our model. In this example, we will be using Extreme Gradient Boosting, otherwise known as XGBoost (xgboost), as our prediction model, along with the LIME package. Let's first install both of them:

pip install lime
pip install xgboost

We will also need to import all the needed libraries:

import numpy as np
import pandas as pd 
import matplotlib.pyplot as plt
import seaborn as sns
import lime
import lime.lime_tabular
import xgboost as xgb
from copy import deepcopy

Now let's set up and run XGBoost! We will first need to load our feature and training datasets:

satisficing = pd.read_csv('satisficing_all_utilites.csv')
du_factors = pd.read_csv('RDM_inputs_final_combined_headers.csv')  

# get all utility and feature names
utility_names = satisficing.columns[1:]
du_factor_names = du_factors.columns

There should be seven utility names (six, plus one that represents the entire region) and thirteen DU factors (or feature names). In this example, we will be focusing only on Pittsboro.

# select the utility 
utility = 'Pittsboro'

# convert to numpy arrays to be passed into xgboost function
du_factors_arr = du_factors.to_numpy()
satisficing_utility_arr = satisficing[utility].values

# initialize the figure object
fig, ax = plt.subplots(1, 1, figsize=(5, 5))  
perform_and_plot_xgb(utility, ax, du_factors_arr, satisficing_utility_arr, du_factor_names)

Note the perform_and_plot_xgb function being used – this function is not shown here for brevity, but you can view the full version of this function in this Google Colab Notebook.

The figure above is called a factor map; it shows mid-term (Demand2) and long-term (Demand3) demand growth rates on its x- and y-axes, respectively. The green denotes the range of demand growth rates where XGBoost has predicted that Pittsboro will successfully meet its satisficing criteria, and brown is otherwise. Each point is a sample from the original training dataset, where the color (white is 1 – success, and red is 0 – failure) denotes whether Pittsboro actually meets its satisficing criteria. In this case we can see that Pittsboro quickly transitions into failure when its mid- and long-term demand growth are both higher than expected (indicated by the 1.0 values on both axes).

The LIME explainer

Before we perform LIME, let’s first select an interesting point using the figure above.

In the previous section, we can see that the XGBoost algorithm predicts that Pittsboro’s robustness is affected most by mid-term (Demand2) and long-term (Demand3) demand growth. However, there is a point (indicated using the arrow and the brown circle below) where this prediction did not match the actual data point.

To better understand why then, this specific point was predicted to be a “successful” SOW where the true datapoint had it labeled as a “failure” SOW, let’s take a look at how the XGBoost algorithm made its decision.

First, let’s identify the index of this point:

# select an interesting point 
interesting_point_range_du = du_factors[(du_factors['Demand2'] < 0) & (du_factors['Demand3'] < 0)].index
interesting_point_satisficing = satisficing_utility_arr[interesting_point_range_du]
interesting_point_range_sat = np.where(satisficing_utility_arr == 0)
interesting_point = np.intersect1d(interesting_point_range_du, interesting_point_range_sat)[0]

This will return an index of 704. Next, we'll run LIME to break down how this (mis)classification was made:

# instantiate the lime explainer
explainer = lime.lime_tabular.LimeTabularExplainer(du_factors_arr, 
                                                   mode='classification', 
                                                   feature_names=du_factor_names)

# explain the interesting instance
# xgb_model is the fitted XGBoost classifier (see the full code in the Colab notebook)
exp = explainer.explain_instance(du_factors_arr[interesting_point], 
                                 xgb_model.predict_proba, 
                                 num_features=len(du_factor_names))

exp.show_in_notebook(show_table=True)

If run successfully, the code above should produce the following figure.

Here’s how to interpret it:

  • The prediction probability bars on the far left show the model's prediction. In this case, the full XGBoost model (which uses all thirteen DU factors) classifies this point as a "failure" SOW with 94% confidence.
  • The tornado plot in the middle shows the feature contributions, i.e., the degree to which each SOW feature influenced this particular decision. The factor map, by contrast, classified this point as a "success" because it was built using only the top two features across the entire dataset (Demand2 and Demand3), and did not account for short-term demand growth rates (Demand1) or the permitting time required for constructing the water treatment plant (JLWTP permit), which LIME flags as influential here.
  • The table on the far right lists the values of all the features for this specific SOW.

Using LIME has therefore enabled us to identify the cause of the misclassification, allowing us to understand that the model needed information on short-term demand and permitting time to make the correct prediction. From here, it is possible to further dive into the types of SOWs and their specific characteristics that would cause them to be more vulnerable to short-term demand growth and infrastructure permitting time as opposed to mid- and long-term demand growth.

Summary

Okay, so I lied a bit – it wasn’t quite so “brief” after all. Nonetheless, I hope you learned a little about explainable AI, how to use LIME, and how to interpret its outcomes. We also walked through a quick example using the good ol’ Research Triangle case study. Do check out the Google Colab Notebook if you’re interested in how this problem was coded.

With that, thank you for sticking with me – happy learning!

References

Amazon Web Services. (n.d.). What's the Difference Between AI and Machine Learning? Machine Learning & AI. https://aws.amazon.com/compare/the-difference-between-artificial-intelligence-and-machine-learning/

Btd. (2024, January 7). Explainable AI (XAI): Model-specific interpretability methods. Medium. https://baotramduong.medium.com/explainable-ai-model-specific-interpretability-methods-02e23ebceac1

Christian, B. (2020). The alignment problem: Machine Learning and human values. Norton & Company.

Gold, D. F., Reed, P. M., Gorelick, D. E., & Characklis, G. W. (2023). Advancing Regional Water Supply Management and infrastructure investment pathways that are equitable, robust, adaptive, and cooperatively stable. Water Resources Research, 59(9). https://doi.org/10.1029/2022wr033671

Herman, J. D., Reed, P. M., Zeff, H. B., & Characklis, G. W. (2015). How should robustness be defined for water systems planning under change? Journal of Water Resources Planning and Management, 141(10). https://doi.org/10.1061/(asce)wr.1943-5452.0000509

Holzinger, A., Saranti, A., Molnar, C., Biecek, P., & Samek, W. (2022). Explainable AI methods – A brief overview. xxAI – Beyond Explainable AI, 13–38. https://doi.org/10.1007/978-3-031-04083-2_2

IBM Data and AI Team. (2023, October 16). Understanding the different types of artificial intelligence. IBM Blog. https://www.ibm.com/blog/understanding-the-different-types-of-artificial-intelligence/

Sarker, I. H. (2022, February 10). AI-based modeling: Techniques, applications and research issues towards automation, intelligent and Smart Systems – SN Computer Science. SpringerLink. https://link.springer.com/article/10.1007/s42979-022-01043-x#Sec6

Scholbeck, C. A., Moosbauer, J., Casalicchio, G., Gupta, H., Bischl, B., & Heumann, C. (2023, December 20). Position paper: Bridging the gap between machine learning and sensitivity analysis. arXiv.org. http://arxiv.org/abs/2312.13234

Starr, M. K. (1963). Product design and decision theory. Prentice-Hall, Inc.

Geocoding Using Google API Key in Python

Introduction

In this post, I will delve into the transformative realm of geocoding, addressing the challenges posed by historical data laden with addresses. Throughout the discussion, I’ll guide you through the intricacies of leveraging the Google API key to seamlessly integrate location information.

Key points include:

  - Preparing your digital workspace for geocoding with essential tools.

  - Obtaining a Google API key to unlock the power of precise coordinates.

  - Applying practical steps to effortlessly transform addresses into valuable spatial insights using Pandas DataFrame.

ArcGIS Developers. (n.d.). ArcGIS API for Python.

As we study the realm of data-driven exploration, a significant challenge surfaces—one that personally resonated with me during my recent project. Historical data, often a treasure trove of insights, can be a double-edged sword. The challenge arises when this wealth of information arrives with addresses instead of coordinates, adding a layer of complexity to the analysis. In my own experience, this meant that a substantial portion of the received data was, unfortunately, unusable for my intended analysis without a transformative solution.

This is where geocoding steps in as a crucial ally, reshaping the landscape of historical datasets and making them analysis-ready. The inconsistency in reporting formats and the absence of standardized coordinates pose hurdles that can impede meaningful analysis. Geocoding becomes an indispensable tool, allowing me to bridge this gap, harmonize the data, and unlock its full potential for analysis.

This post focuses on the intricacies of geocoding—a transformative process that transcends mere addresses, providing the geographic insights necessary for a comprehensive understanding of your dataset. Equipped with the Google API key, we’ll delve into the practical steps needed to seamlessly integrate location information. The following sections outline the key stages of this geocoding journey, ensuring your dataset is prepared for advanced analysis.

Preparing your environment

Prepare your digital workspace for this impactful data transformation by ensuring you have all the necessary tools. Seamlessly install GoogleMaps and Pandas using the commands provided below in your terminal.

# Install necessary packages
pip install googlemaps
pip install pandas

Formatting your data

In the realm of geocoding, the importance of formatting data before initiating the geocoding process cannot be overstated, especially when leveraging the capabilities of the Google API key. The formatting step is a vital prerequisite for several reasons. Firstly, it ensures consistency and standardization in the structure of addresses, aligning them with the expectations of geocoding services. This consistency allows for more accurate interpretation and processing of addresses. Additionally, formatted addresses provide a clear separation of components such as city, state, and ZIP code, facilitating efficient matching during the geocoding algorithm’s execution. The optimization of API usage is another benefit, as well-formatted addresses contribute to accurate results, minimizing unnecessary costs associated with incorrect or ambiguous requests. By addressing these considerations in the formatting stage, one sets the stage for a more reliable and cost-effective geocoding journey, unlocking the full potential of spatial insights. Now, let’s delve into the practical implementation of this formatting process with the following code, which creates a new column containing the consolidated and formatted addresses within your dataset.

# Create a new column with the consolidated, formatted address
df['ADDRESS'] = df['CITY'] + ',' + \
                df['STATE'] + ' ' + \
                df['ZIP'].astype(str)

Obtain a Google API Key

  1. Navigate to the Google Cloud console
  2. Under the Project drop-down menu, select the project that you want to work on. If one does not already exist, create a new project 
  3. Under the Library tab on the side menu, search “geocoding api” and ensure that it is enabled
  4. Under the Credentials tab on the side menu, select “Create Credentials” and create a new API key to generate a new key
  5. Your Google API key will then appear on the screen, save this key for future use

Geocoding with Pandas DataFrame

With your environment set up, your data formatted, and your Google API key in hand, it’s time to wield this powerful tool. Dive into the practical realm of geocoding by seamlessly applying the sample code below to your dataset, effortlessly transforming addresses into precise coordinates. Follow the code through each practical step, and witness your location data evolve into a valuable asset, ready for advanced analysis. 

# Import necessary libraries
import googlemaps
import pandas as pd

# Define your API key
# Replace "YOUR_GOOGLE_API_KEY" with your actual Google Maps API key that you saved previously.
gmaps_key = googlemaps.Client(key="YOUR_GOOGLE_API_KEY")

# Extract the last column (ADDRESS) to create a new dataframe (df_address)
df_address = df.iloc[:, -1:].copy()  # copy to avoid modifying a view of df

# Geocoding with Pandas DataFrame
# Define a function to geocode addresses and extract latitude and longitude
def geocode_and_extract(df_address):
    # Add empty columns for latitude and longitude
    df_address['LATITUDE'] = ""
    df_address['LONGITUDE'] = ""
    
    # Loop through each row in the df_address dataframe
    for idx in df_address.index:
        address = df_address.at[idx, 'ADDRESS']
        # Geocode the address using the Google Maps API
        geocode_result = gmaps_key.geocode(address)
        
        # Check if geocoding was successful
        if geocode_result:
            # Extract latitude and longitude from the geocoding result
            # (.at avoids pandas chained-assignment warnings)
            df_address.at[idx, 'LATITUDE'] = geocode_result[0]['geometry']['location']['lat']
            df_address.at[idx, 'LONGITUDE'] = geocode_result[0]['geometry']['location']['lng']

# Apply the geocoding function to your dataframe
geocode_and_extract(df_address)

# Print an example to verify correctness
example_address = df_address['ADDRESS'][0]
example_result = gmaps_key.geocode(example_address)
example_lat = example_result[0]["geometry"]["location"]["lat"]
example_long = example_result[0]["geometry"]["location"]["lng"]
print(f'Example Geocoded Address: {example_address}, Latitude: {example_lat}, Longitude: {example_long}')

# Let's join the coordinates with the original dataset
df['LATITUDE'] = df_address['LATITUDE']
df['LONGITUDE'] = df_address['LONGITUDE']

Replace "YOUR_GOOGLE_API_KEY" with your actual Google Maps API key that you saved previously.

Dealing with incomplete data

In the midst of my geocoding journey, I encountered a common challenge: incomplete data. Some addresses in my dataset were detailed with full street information, while others provided only city and state. The beauty of the Google API key revealed itself as it effortlessly transformed both types of data, allowing for a seamless integration of geographic insights.

This flexibility became especially crucial as I navigated through points where only city and state information existed. While this may seem like a potential hurdle, the Google API key's ability to interpret varying address formats ensured that no data was left behind. For my specific analysis, the geocoded location only needed to fall within a designated radius of interest, and the results proved accurate enough for my purposes.

In handling diverse data formats, the Google API key stands as your ally, harmonizing and geocoding all pieces of the puzzle, ensuring that each contributes to the rich narrative your dataset is meant to tell.

Conclusion

Your geocoding journey has now equipped you to handle diverse datasets, turning a potential challenge into a transformative opportunity. As you venture forth, may your analyses unveil fresh perspectives and illuminate the hidden stories within your spatial data. The Google API key has been the linchpin throughout this exploration, ensuring that every piece of information contributes to the bigger picture. From handling varying addresses to pinpointing detailed locations, the key’s flexibility has proven invaluable, providing accurate insights.

Congratulations on mastering the art of geocoding—may your data-driven adventures persist in revealing the intricate layers of time, geography, and the untold stories that lie within.