Interactive Visualization of Scientific Data and Results with Panel

This is a guest post by David Lafferty from the University of Illinois Urbana-Champaign. David is a fifth-year PhD student in the Department of Climate, Meteorology & Atmospheric Sciences working with Dr. Ryan Sriver. David’s research contributes to advances in climate change risk assessment. Dr. Casey Burleyson, a research scientist at the Pacific Northwest National Laboratory, also contributed text about the MSD-LIVE platform.

Visualization plays a crucial role in the scientific process. From an initial exploratory data analysis to the careful presentation of hard-earned results, data visualization is a powerful tool to generate hypotheses and insights, communicate findings to both scientists and stakeholders, and aid decision-making. Interactive data visualization can enhance each of these use cases and is particularly useful for facilitating collaborations and deepening user engagement. Effective data visualization has been a recurring theme of the Water Programming blog, with several past posts directly addressing the topic, including some that highlight various tools for interactive visualization [1, 2].

In this post, I’d like to highlight Panel, a data exploration and web app framework in Python.  I’ll also showcase how we used Panel to construct an interactive dashboard to accompany a recent paper, allowing readers to explore our results in more detail.

An overview of the HoloViz ecosystem, taken from the hvPlot website. Panel provides a high-level interface that allows users to build linked apps across all of these visualization libraries.

Panel is an open-source Python library that allows developers to create interactive web apps and dashboards in Python. It is part of the HoloViz ecosystem, which includes other visualization packages like HoloViews and GeoViews. Panel provides a high-level API for creating rich, interactive visualizations and user interfaces directly from Python code. The library includes an impressive number of features, such as different types of panes for displaying and arranging plots, media, text, or other external objects; a wide range of widgets; and flexible, customizable dashboard layouts. Panel itself does not provide visualization tools but instead relies on (and is designed to work well with) other popular plotting libraries such as Bokeh, Datashader, Folium, Plotly, and Matplotlib. Panel provides several deployment options for your polished apps and dashboards, but also works well within development environments such as Jupyter Notebooks, VS Code, PyCharm, or Spyder.

A simple Panel dashboard

To get a sense of Panel’s capabilities, let’s build a simple dashboard that analyzes global temperature anomalies from different observational datasets. Global mean temperature rise above pre-industrial levels is one of the most widely used indicators of climate change, and 2023 was a record year in this respect – all major monitoring agencies recorded 2023 as the warmest year on record [3] and the exceptional warmth of 2023 fell outside the prediction intervals of many forecasting groups [4].

To visualize this temperature change, we’re going to look at four observational datasets: Berkeley Earth, HadCRUT5, NOAA, and GISTEMPv4. First, we need to import the required Python libraries:

!pip install hvplot

from functools import reduce
import pandas as pd
import hvplot.pandas

import panel as pn
pn.extension(comms='colab')

This notebook is available on Google Colab, which is why I’ve set comms='colab' in the Panel configuration settings. Next, we download and merge all observational datasets:

# HadCRUT5 anomaly relative to 1961-1990
# Link: https://www.metoffice.gov.uk/hadobs/hadcrut5/
hadcrut = pd.read_csv('https://www.metoffice.gov.uk/hadobs/hadcrut5/data/HadCRUT.5.0.2.0/analysis/diagnostics/HadCRUT.5.0.2.0.analysis.summary_series.global.annual.csv',
                      usecols=[0,1], skiprows=1, names=['Year','HadCRUT5'], index_col=0)

# GISTEMP v4 anomaly relative to 1951-1980
# Link: https://data.giss.nasa.gov/gistemp/
gistemp = pd.read_csv('https://data.giss.nasa.gov/gistemp/tabledata_v4/GLB.Ts+dSST.csv',
                      skiprows=2, usecols=[0,13], names=['Year', 'GISTEMPv4'], index_col=0, skipfooter=1, engine='python')

# NOAA anomaly relative to 1901-2000
# Link: https://www.ncei.noaa.gov/access/monitoring/climate-at-a-glance/global/time-series/globe/land_ocean/ytd/12
noaa = pd.read_csv('https://www.ncdc.noaa.gov/cag/global/time-series/globe/land_ocean/ytd/12/1880-2023.csv',
                   skiprows=5, names=['Year', 'NOAA'], index_col=0)

# Berkeley Earth anomaly relative to 1951-1980
# Link: https://berkeleyearth.org/data/
berkearth = pd.read_csv('https://berkeley-earth-temperature.s3.us-west-1.amazonaws.com/Global/Land_and_Ocean_summary.txt',
                        skiprows=58, usecols=[0,1], sep=r'\s+', names=['Year', 'BerkeleyEarth'], index_col=0)

# Merge
df = reduce(lambda x, y: pd.merge(x, y, left_index=True, right_index=True, how='outer'), [hadcrut, gistemp, noaa, berkearth])

As you can see, each of these datasets measures the global mean temperature anomaly relative to a fixed baseline, and the baselines are not constant across datasets. Climate scientists often define 1850-1900 as the canonical pre-industrial baseline, but choosing this period is a non-trivial exercise [5] and can impact the uncertainty around our estimates of climate change. Let’s explore this phenomenon interactively using Panel and the datasets downloaded above:

def adjust_baseline(obs_name, new_baseline):
  # Adjusts given observational dataset to selected new baseline
  ...

def plot_timeseries(obs_to_include, new_baseline_start, new_baseline_end):
  # Plots timeseries of anomalies for selected datasets and new baseline
  ...

def get_2023_anomaly(obs_to_include, new_baseline_start, new_baseline_end):
  # Calculates and shows 2023 anomaly for selected datasets and new baseline
  ...

The full code for each of these functions is available in the notebook. Briefly, adjust_baseline calculates a new timeseries of global temperature anomalies for a given observational dataset, now relative to a new user-defined baseline. plot_timeseries then plots these new timeseries, and the user can choose which datasets are included via the obs_to_include parameter. As I mentioned above, Panel interfaces nicely with several other plotting libraries – here I elected to use hvPlot with the Bokeh backend to add zooming and panning capabilities to the timeseries plot. Finally, get_2023_anomaly reports the 2023 temperature anomalies relative to the new baseline (where I’ve used Panel’s Number indicator to color-code the anomaly report for some added glitz).
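To make the re-baselining step concrete, here is a minimal sketch of what a function like adjust_baseline might do. The signature below (passing the DataFrame plus separate start and end years) and the toy data are assumptions for illustration only; the notebook's own implementation may well differ.

```python
import pandas as pd


def adjust_baseline(df, obs_name, baseline_start, baseline_end):
    """Re-express one anomaly series relative to a new baseline period.

    Hypothetical signature for illustration; not the notebook's function.
    """
    series = df[obs_name]
    # Mean anomaly over the new baseline years (label slicing is inclusive)
    baseline_mean = series.loc[baseline_start:baseline_end].mean()
    # Subtracting it shifts the series so the new baseline averages to zero
    return series - baseline_mean


# Toy data: anomalies that rise by 0.1 per year
toy = pd.DataFrame(
    {"ToyObs": [0.0, 0.1, 0.2, 0.3, 0.4]},
    index=pd.Index(range(2000, 2005), name="Year"),
)
adjusted = adjust_baseline(toy, "ToyObs", 2000, 2002)
```

Because every dataset is shifted by its own baseline mean, the adjusted series become directly comparable over the chosen period, which is why the choice of baseline changes how closely the products appear to agree.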

Now, let’s add interactivity to these functions via Panel. First, we define a set of ‘widgets’ that allow the user to enter their chosen baseline, and which observational datasets to include in the plots:

# Widgets
new_baseline_start = pn.widgets.IntInput(name='Baseline start year', value=1981, step=5, start=1850, end=2022, width=150)
new_baseline_end = pn.widgets.IntInput(name='Baseline end year', value=2010, step=5, start=1851, end=2023, width=150)

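# obs_names holds the dataset names; assumed here to be the columns of df
obs_names = list(df.columns)
obs_to_include = pn.widgets.CheckButtonGroup(value=obs_names, options=obs_names, orientation='vertical', width=150)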

Panel has a large number of widgets available for different use cases, so I encourage you to check those out. Then, we use Panel’s bind functionality to define reactive or bound versions of the functions above:

# Interactivity is generated by pn.bind
interactive_anomaly = pn.bind(get_2023_anomaly, obs_to_include, new_baseline_start, new_baseline_end)
interactive_plot = pn.bind(plot_timeseries, obs_to_include, new_baseline_start, new_baseline_end)

These functions will now be automatically re-run upon a user interaction. Finally, we can construct the dashboard by using a combination of Panel’s row and column layouts:

# Construct the dashboard
obs_to_include_heading = pn.pane.HTML('<label>Observational datasets to show:</label>')
anomaly_heading = pn.pane.Markdown('## 2023 anomaly:')

pn.Row(
    pn.Column(pn.WidgetBox(new_baseline_start, new_baseline_end, width=175, margin=10),
              pn.WidgetBox(pn.Column(obs_to_include_heading, obs_to_include), width=175, margin=10)),
    interactive_plot,
    pn.Column(anomaly_heading, interactive_anomaly),
    width=1000,
)

Et voilà! You’ve just built your first Panel dashboard. If you run the code in the Google Colab notebook, it should look something like this:

You can try changing the baseline, zooming and panning around the plot, and toggling the datasets on/off. Notice that using an earlier baseline period typically results in a larger disagreement among the different products, reflecting the large uncertainties in early temperature measurements.  

Facilitating a deeper exploration of published results

We relied on Panel to produce a more elaborate dashboard to accompany a recent paper, Downscaling and bias-correction contribute considerable uncertainty to local climate projections in CMIP6, published in npj Climate & Atmospheric Science last year [6]. In brief, the goal of this paper was to understand the relative importance of various sources of uncertainty in local climate projections. In addition to the traditional sources of climate uncertainty, such as model uncertainty (different climate models can simulate different future climates) and internal variability (related to the chaotic nature of the Earth system), we also looked at downscaling and bias-correction uncertainty, which arises from the variety of available post-processing methods that aim to make climate model outputs more actionable (i.e., higher spatial resolution and reduced biases relative to observations). Our results were highly heterogeneous across space, time, and indicators of climate change, so it was challenging to condense everything into a few figures for the paper. As such, we developed a dashboard to allow users to explore our results in more detail by focusing in on specific locations or time periods that might be of more interest to them:

We relied on Panel for this dashboard mainly for its flexibility. Panel allowed us to include several different plot types and user interaction features, along with text boxes and figures to provide additional information. For example, users can explore the uncertainty breakdown at different locations by zooming and panning around a map plot, entering specific latitude and longitude coordinates, or selecting from a drop-down list of almost 900 world cities. As someone not well-versed in HTML or JavaScript, a major advantage provided by Panel was the ability to construct everything in Python.

Our dashboard is publicly available (https://lafferty-sriver-2023-downscaling-uncertainty.msdlive.org/) and we’d love for you to check it out! For deploying the dashboard, we worked with Carina Lansing and Casey Burleyson from PNNL to make it available on MSD-LIVE. MSD-LIVE (the MultiSector Dynamics Living, Intuitive, Value-adding, Environment) is a cloud-based flexible and scalable data and code management system combined with an advanced computing platform that will enable MSD researchers to document and archive their data, run their models and analysis tools, and share their data, software, and multi-model workflows within a robust Community of Practice. For the dashboard, we leveraged MSD-LIVE’s cloud computing capabilities to support the on-demand computing required to generate the plots. When a user clicks on the dashboard link, MSD-LIVE spins up a dedicated computing instance on the Amazon Web Services (AWS) cloud. That computing instance has the required Python packages pre-installed and is directly connected to the dataset that underpins the dashboard. Once the dashboard is closed or after a period of inactivity the dedicated computing instance is spun down. The underlying cloud computing capabilities in MSD-LIVE have also been used to support Jupyter-based interactive tutorials used to teach new people to use MSD models.

Conclusion

I hope this blog post has served as a useful introduction to Panel’s capabilities, and perhaps sparked some new ideas about how to visualize and communicate your scientific data and results. Although the temperature anomaly dashboard we created earlier was very simple, I think it demonstrates the key steps towards constructing more complicated apps: write functions to produce your desired visualizations, add interactivity via pn.bind, then arrange your dashboard using Panel’s plethora of panes, widgets, and layout features. Like any coding project, you will invariably run into unforeseen difficulties – there was definitely some trial-and-error involved in building the more complicated dashboard associated with our paper – but following this ordered approach should allow you to start creating interactive (and hopefully more effective) scientific visualizations. I would also encourage you to check out the Panel component and app galleries for yet more inspiration and demonstrations of what Panel can do.

References

  1. Trevor Amestoy, Creating Interactive Geospatial Maps in Python with Folium, Water Programming blog (2023). https://waterprogramming.wordpress.com/2023/04/05/creating-interactive-geospatial-maps-in-python-with-folium/
  2. Trevor Amestoy, Constructing interactive Ipywidgets: demonstration using the HYMOD model, Water Programming blog (2022). https://waterprogramming.wordpress.com/2022/07/20/constructing-interactive-ipywidgets-demonstration-using-the-hymod-model/
  3. Robert Rohde, Global Temperature Report for 2023, Berkeley Earth (2024). https://berkeleyearth.org/global-temperature-report-for-2023/
  4. Zeke Hausfather, 2023’s unexpected and unexplained warming, The Climate Brink (2024). https://www.theclimatebrink.com/p/2023s-unexpected-and-unexplained
  5. Hawkins, E., et al., Estimating Changes in Global Temperature since the Preindustrial Period, Bull. Amer. Meteor. Soc., 98, 1841–1856, https://doi.org/10.1175/BAMS-D-16-0007.1
  6. Lafferty, D.C., Sriver, R.L. Downscaling and bias-correction contribute considerable uncertainty to local climate projections in CMIP6. npj Clim Atmos Sci 6, 158 (2023). https://doi.org/10.1038/s41612-023-00486-0

Tipping Points: Harnessing Insights for Resilient Futures

Introduction

Through various posts on this blog, we’ve delved into the intricacies of the lake problem, unraveling its multifaceted nature and shedding light on the complexities of socio-ecological systems. Amidst our analyses, one concept stands out: tipping points. These pivotal moments, where seemingly minor changes can have significant impacts, go beyond theoretical concepts. They embody critical thresholds within socio-ecological systems, capable of triggering disproportionate and often irreversible shifts. As we embark on this exploration, drawing from a recent deep dive into tipping points I conducted last semester, my hope is that this post enriches your understanding of this essential topic, which frequently emerges in our research discussions.

Socio-Ecological Systems: Understanding the Dynamics

In the intricate interplay between human societies and the natural environment lies the phenomenon of tipping points—critical thresholds within complex socio-ecological systems where incremental changes can trigger disproportionate and often irreversible shifts. Understanding these pivotal moments transcends mere academic intrigue; it holds profound implications for global sustainability, governance strategies, and our collective ability to navigate an era rife with uncertainties and challenges (Lauerburg et al. 2020). At their core, socio-ecological systems encapsulate the intricate interdependencies between human societies and the surrounding environment. These systems are characterized by dynamic interactions between social, economic, and ecological components, fostering a complex web of relationships that shape the resilience and vulnerability of the system as a whole. Within these systems exists the complex phenomenon of tipping points. To comprehend these tipping points, resilience theory offers invaluable insights. Resilience theory serves as a cornerstone in understanding the stability, adaptability, and transformative potential of socio-ecological systems. Central to this theory is the notion of resilience as the system’s capacity to absorb disturbances, reorganize, and persist in the face of change. Tipping points, within the resilience framework, mark critical thresholds where the system’s resilience is severely tested, and small disturbances may provoke abrupt and disproportionate shifts, potentially leading to regime changes or alternate system states (Sterk, van de Leemput, and Peeters 2017).

(Lauerburg et al. 2020)

Real-world examples of socio-ecological tipping points abound across various ecosystems, showcasing the profound implications of critical thresholds in shaping ecological trajectories.

Coral Reefs

One prominent instance is the coral reef ecosystems, where rising sea temperatures and ocean acidification can trigger sudden and extensive coral bleaching events (Ravindran 2016; Moore 2018). Beyond a certain threshold, these events can lead to mass mortality of corals, fundamentally altering the reef’s structure and compromising its ability to support diverse marine life.

Amazon Rainforest

Another compelling example lies in the Amazon rainforest. Deforestation, exacerbated by human activities such as logging and agriculture, can push the rainforest past a tipping point where it transforms from a lush, biodiverse ecosystem into a drier savanna-like landscape (Nobre and Borma 2009; Amigo 2020). This transition can be irreversible, leading to a loss of biodiversity, disruption of regional climates, and further exacerbation of climate change.

Eutrophication of Lakes

Similarly, in freshwater ecosystems like shallow lakes and ponds, excessive nutrient input from agricultural runoff or urban sewage can drive eutrophication. This increased nutrient loading can push these ecosystems towards a tipping point where algal blooms become persistent, leading to oxygen depletion, fish kills, and ultimately a shift from a clear-water state to a turbid, algae-dominated state (Quinn, Reed, and Keller 2017).
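The clear-water versus turbid behavior described above can be illustrated with a small simulation. The sketch below uses a commonly used form of the shallow-lake phosphorus model from the lake problem literature; the parameter values (b = 0.42, q = 2), the zero-inflow setting, and the function name are assumptions chosen for illustration, not results from the studies cited here.

```python
import math


def simulate_lake(x0, inflow=0.0, b=0.42, q=2.0, n_steps=200):
    """Iterate X_{t+1} = X_t + a + X_t^q / (1 + X_t^q) - b * X_t."""
    x = x0
    for _ in range(n_steps):
        recycling = x**q / (1.0 + x**q)     # sediment recycling, sigmoidal in X
        x = x + inflow + recycling - b * x  # add inputs, subtract natural loss
    return x


# With q = 2 and no inflow, the nonzero equilibria solve b*x^2 - x + b = 0
b = 0.42
disc = math.sqrt(1.0 - 4.0 * b**2)
critical = (1.0 - disc) / (2.0 * b)   # unstable threshold, roughly 0.54
eutrophic = (1.0 + disc) / (2.0 * b)  # stable high-phosphorus state, roughly 1.84

below = simulate_lake(0.9 * critical)  # starts just under the threshold
above = simulate_lake(1.1 * critical)  # starts just over the threshold
```

Two starting concentrations that differ only slightly end up in very different places: the lake that begins below the threshold recovers toward a clear state, while the one that begins above it settles into the eutrophic state even though no further nutrients are added.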

These real-world examples underscore the vulnerability of various ecosystems to tipping points and the need for proactive, adaptive management strategies to prevent or mitigate such shifts. Ethical and equity concerns also play a pivotal role in addressing tipping points within socio-ecological systems. Viewed through the lens of social justice, tipping points have a disproportionate impact on marginalized and vulnerable groups and exacerbate existing social inequities, which reinforces the imperative to preserve the resilience and functionality of ecosystems vital for supporting biodiversity, regulating climate, and sustaining human livelihoods.

Understanding Tipping Points in Water Resource Allocation Systems

In water resource allocation systems, tipping points are triggered by various factors, from human-induced stresses to natural and climate-related dynamics. Longitudinal studies and historical analyses are crucial for identifying the precursors and patterns that precede these tipping points. They offer insights into the temporal evolution of system dynamics, enabling the identification of early warning signals and indicators heralding tipping events (Grimm and Schneider 2011; Kradin 2012). However, detecting and predicting tipping points within water resource management remains difficult: data constraints and methodological challenges significantly impede both, so strategies aimed at bolstering resilience and adapting to the complexities of water resource management are needed. Various modeling paradigms and approaches offer lenses through which tipping points within socio-ecological systems can be analyzed and understood. For instance, methodologies for detecting tipping points, such as Network Analysis and Complex System Approaches, are often incorporated into modeling frameworks like agent-based models (ABMs) and simulation techniques (Peng and Lu 2012; Moore 2018). This integration allows the identification of key nodes or connections within the system that are particularly sensitive to changes, potentially indicating locations of tipping points. Similarly, complex system approaches inform the structure and dynamics of the model, aiding in capturing emergent behaviors and potential tipping phenomena.

Moving forward, there are several instances where tipping points have been observed in water resource allocation, shedding light on critical junctures that significantly impact water availability and ecosystem stability. For example, one study delves into aquifer depletion and groundwater tipping points, highlighting how unsustainable groundwater extraction practices can lead to aquifer depletion, affecting water availability for agricultural and domestic purposes (Castilla-Rho et al. 2017). This research emphasizes the importance of understanding social norms and compliance with conservation policies to mitigate unsustainable groundwater development. Another investigation explores tipping points in river basin management, where water allocation practices exceed ecological capacities, resulting in altered flow regimes and ecosystem collapse (Yletyinen et al. 2019). This study underscores the interconnectedness between human activities and ecological resilience in river basins, emphasizing the need for adaptive management strategies to address these complex challenges.

Furthermore, recent studies highlight the significance of tipping points in the broader context of climate change and ecological research. Dietz et al. (2021) emphasize the importance of climate tipping points in economic models, demonstrating their potential to significantly impact the social cost of carbon and increase global economic risk. Similarly, Goodenough and Webb (2022) discuss the opportunities for integrating paleoecology into contemporary ecological research and policy, emphasizing the potential for paleoecological evidence to inform our understanding of ecological tipping points and natural processes. These insights underscore the interconnectedness of research domains and the importance of interdisciplinary collaboration in addressing complex environmental challenges.

Future Research Directions and Needs

Future research in the realm of tipping points within socio-ecological systems demands a strategic focus on specific areas to advance understanding and ensure policy relevance. Adaptive management emerges as a foundational strategy, emphasizing the implementation of flexible approaches capable of accommodating shifting conditions and uncertainties. Delving into emerging areas of study and innovation stands as a cornerstone, exploring novel research frontiers and innovative methodologies pivotal to advancing our comprehension of tipping points. Moreover, forecasting future tipping points encounters significant hurdles due to uncertainties in modeling future scenarios. Model limitations coupled with the unpredictable nature of complex systems amplify these uncertainties, making it challenging to project the timing, magnitude, and precise locations of tipping events accurately. Addressing these challenges necessitates innovative strategies that surmount data constraints, enhance modeling capabilities, and navigate uncertainties to fortify resilience against potential tipping points. Additionally, bridging knowledge gaps for policy relevance emerges as a crucial necessity. While scientific knowledge continues to evolve, translating these findings into actionable policies remains a persistent hurdle. To bridge this gap effectively, robust science-policy interfaces are imperative, enhancing communication channels and knowledge transfer mechanisms between researchers, policymakers, and practitioners (Goodenough and Webb 2022). By prioritizing interdisciplinary innovation and strengthening the connections between science and policy, future research endeavors can make substantial strides in addressing tipping points and bolstering the resilience of socio-ecological systems.

Conclusion

As we conclude our exploration of tipping points within socio-ecological systems, the profound implications of these critical thresholds become increasingly apparent. The intricate interplay of human activities and environmental changes underscores the complexity of these tipping phenomena. Embracing emerging areas of study and innovation is essential as we navigate uncertainties in modeling future scenarios. Leveraging the insights gained, we can enhance our ability to anticipate and respond to tipping events effectively.

By harnessing the insights gleaned from these endeavors, we can better equip ourselves to navigate the uncertainties ahead. From adaptive governance to community engagement, the tools at our disposal offer avenues for addressing the challenges posed by tipping points. I hope this exploration has provided valuable insights into socio-ecological tipping points and has expanded your understanding of these critical phenomena!

Lastly, HAPPY VALENTINE’S DAY!

References 

Amigo, Ignacio. 2020. “When Will the Amazon Hit a Tipping Point?” Nature 578 (7796): 505–8. https://doi.org/10.1038/d41586-020-00508-4.

Castilla-Rho, Juan Carlos, Rodrigo Rojas, Martin S. Andersen, Cameron Holley, and Gregoire Mariethoz. 2017. “Social Tipping Points in Global Groundwater Management.” Nature Human Behaviour 1 (9): 640–49. https://doi.org/10.1038/s41562-017-0181-7.

Dietz, Simon, James Rising, Thomas Stoerk, and Gernot Wagner. 2021. “Economic Impacts of Tipping Points in the Climate System.” Proceedings of the National Academy of Sciences 118 (34): e2103081118. https://doi.org/10.1073/pnas.2103081118.

Goodenough, Anne E., and Julia C. Webb. 2022. “Learning from the Past: Opportunities for Advancing Ecological Research and Practice Using Palaeoecological Data.” Oecologia 199 (2): 275–87. https://doi.org/10.1007/s00442-022-05190-z.

Grimm, Sonja, and Gerald Schneider. 2011. Predicting Social Tipping Points: Current Research and the Way Forward. Discussion Paper / Deutsches Institut Für Entwicklungspolitik 8/2011. Bonn: Dt. Inst. für Entwicklungspolitik.

Kradin, N. N., ed. 2012. Politicheskai︠a︡ antropologii︠a︡ tradit︠s︡ionnykh i sovremennykh obshchestv: Materialy mezhdunarodnoĭ konferent︠s︡ii. Vladivostok: Izdatelʹskiĭ dom Dalʹnevostochnogo federalʹnogo universiteta.

Lauerburg, R. A. M., R. Diekmann, B. Blanz, K. Gee, H. Held, A. Kannen, C. Möllmann, et al. 2020. “Socio-Ecological Vulnerability to Tipping Points: A Review of Empirical Approaches and Their Use for Marine Management.” Science of The Total Environment 705 (February): 135838. https://doi.org/10.1016/j.scitotenv.2019.135838.

Moore, John C. 2018. “Predicting Tipping Points in Complex Environmental Systems.” Proceedings of the National Academy of Sciences 115 (4): 635–36. https://doi.org/10.1073/pnas.1721206115.

Nobre, Carlos Afonso, and Laura De Simone Borma. 2009. “‘Tipping Points’ for the Amazon Forest.” Current Opinion in Environmental Sustainability 1 (1): 28–36. https://doi.org/10.1016/j.cosust.2009.07.003.

Peng, Heng, and Ying Lu. 2012. “Model Selection in Linear Mixed Effect Models.” J. Multivar. Anal. 109 (August): 109–29.

Quinn, Julianne D, Patrick M Reed, and Klaus Keller. 2017. “Direct Policy Search for Robust Multi-Objective Management of Deeply Uncertain Socio-Ecological Tipping Points.” Environmental Modelling & Software 92 (June): 125–41.

Ravindran, Sandeep. 2016. “Coral Reefs at a Tipping Point.” Proceedings of the National Academy of Sciences 113 (19): 5140–41. https://doi.org/10.1073/pnas.1605690113.

Sterk, Marjolein, Ingrid A van de Leemput, and Edwin THM Peeters. 2017. “How to Conceptualize and Operationalize Resilience in Socio-Ecological Systems?” Current Opinion in Environmental Sustainability, Sustainability governance, 28 (October): 108–13. https://doi.org/10.1016/j.cosust.2017.09.003.

Yletyinen, Johanna, Philip Brown, Roger Pech, Dave Hodges, Philip E Hulme, Thomas F Malcolm, Fleur J F Maseyk, et al. 2019. “Understanding and Managing Social–Ecological Tipping Points in Primary Industries.” BioScience 69 (5): 335–47. https://doi.org/10.1093/biosci/biz031.

The Thomas-Fiering Model for Synthetic Streamflow Generation with a Python Implementation

In 1962, a group of economists, engineers and political scientists who were involved in the Harvard Water Program published “Design of Water Resource Systems”. In chapter 12 of the book, Thomas and Fiering present the following statistical model, which was one of the first, if not the first, formal applications of stochastic modelling for synthetic streamflow generation and water resource systems evaluation.

It is an autoregressive model which can simulate monthly streamflow values based on the mean, variance, and correlation of historic observations.

In this blog post, I present the model in its original form along with a modified form presented by Stedinger and Taylor (1982). Then, I share a Python implementation of the model which is used to generate an ensemble of synthetic flows. I use plotting tools from the Synthetic Generation Figure Library to plot the results.

All of the code used for this post is available here: ThomasFieringModelDemo

Let’s get into it!

The Thomas-Fiering Model

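The model that Thomas and Fiering proposed took the form:

Q_{m+1} = \bar{Q}_{m+1} + b_{m+1} (Q_m - \bar{Q}_m) + \epsilon \sigma_{m+1} \sqrt{1 - r_m^2}

Where, for each month m, Q_m is the generated flow, \bar{Q}_m is the mean historic flow, b_m is an autoregression coefficient for predicting that month’s flow from the prior month’s flow (so that b_{m+1} = r_m \sigma_{m+1} / \sigma_m), \sigma_m is the standard deviation, r_m is the correlation coefficient between flows in months m and m+1, and \epsilon is a random standard normal variable.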

A modification to this model was proposed by Stedinger and Taylor (1982), which transforms the streamflow values before fitting the model. I refer to this as the “Stedinger transformation” below and in the code.

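Given Q_{m} as the observed flows in month m, the Stedinger transformation of the observed flows is then:

x_{m} = \ln(Q_{m} - \hat{\tau}_m)

where \hat{\tau}_m is the estimated “lower bound” for each month, calculated as:

\hat{\tau}_m = \frac{Q_{m,\max} Q_{m,\min} - Q_{m,\text{median}}^2}{Q_{m,\max} + Q_{m,\min} - 2 Q_{m,\text{median}}}

with Q_{m,\max}, Q_{m,\min}, and Q_{m,\text{median}} the maximum, minimum, and median observed flows in month m.

The modeled flows are generated from the recursive relationship:

x_{m+1} = \mu_{m+1} + \rho_m \frac{\sigma_{m+1}}{\sigma_m} (x_m - \mu_m) + \sigma_{m+1} \epsilon_m \sqrt{1 - \rho_m^2}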

Where:

  • \mu_{m} is the observed mean of the historic monthly x series
  • \sigma_{m}^2 is the observed variance of the historic monthly x series
  • \epsilon_{m} are independent standard-normal random variables
  • \rho_m are the observed between-month correlations of the historic x series

The above steps are performed for each month, and the synthetic streamflow sequence is generated by iteratively applying the stochastic process for the desired duration.

Python Implementation

I built this version of the Thomas Fiering model as a Python class with the following structure:

class ThomasFieringGenerator():
    def __init__(self, Q, **kwargs):
        ...

    def preprocessing(self, **kwargs):
        # Stedinger normalization
        ...

    def fit(self, **kwargs):
        # Calculate mu, sigma, and rho for each month
        ...

    def generate(self, n_years, **kwargs):
        # Iteratively generate a single timeseries
        # Inverse Stedinger normalization
        return Q_synthetic

    def generate_ensemble(self, n_years,
                          n_realizations=1,
                          **kwargs):
        # Loop and generate multiple timeseries
        return

Rather than posting the entire code here, which would clutter the page, I encourage you to check out the full implementation in the linked repository: ThomasFieringModelDemo/model.py

To see how this is used and replicate the results below using some example data, see the Jupyter Notebook: ThomasFieringModelDemo/ThomasFiering_demo.ipynb
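For readers who want a feel for the core steps without opening the repository, here is a minimal, standalone sketch: subtract a per-month lower bound, take logs, fit monthly statistics, and apply the lag-one recursion. It is not the code from the repository. The function names, the toy data, the fallback to a zero lower bound, and the choice to start from the December mean are all assumptions of this sketch, and the lower-bound formula is the max/min/median estimator that I am assuming the Stedinger transformation refers to.

```python
import numpy as np


def stedinger_lower_bound(Q):
    """Per-month lower bound from the max, min, and median of each column."""
    q_max, q_min, q_med = Q.max(axis=0), Q.min(axis=0), np.median(Q, axis=0)
    denom = q_max + q_min - 2.0 * q_med
    tau = (q_max * q_min - q_med**2) / np.where(denom != 0, denom, np.nan)
    # Assumption of this sketch: fall back to zero where the estimate is
    # undefined or not below the monthly minimum, so the log stays defined
    return np.where(np.isfinite(tau) & (tau < q_min), tau, 0.0)


def fit(x):
    """Monthly means, standard deviations, and month-to-next-month correlations."""
    mu, sigma = x.mean(axis=0), x.std(axis=0, ddof=1)
    rho = np.empty(12)
    for m in range(12):
        if m < 11:
            a, b = x[:, m], x[:, m + 1]   # pairs within the same year
        else:
            a, b = x[:-1, 11], x[1:, 0]   # December to the following January
        rho[m] = np.corrcoef(a, b)[0, 1]
    return mu, sigma, rho


def generate(mu, sigma, rho, tau, n_years, rng):
    """Apply the recursion month by month, then undo the log transform."""
    x = np.empty(n_years * 12)
    prev_m = 11
    x_prev = mu[prev_m]                   # start from the December mean
    for t in range(n_years * 12):
        m = t % 12
        eps = rng.standard_normal()
        x[t] = (mu[m]
                + rho[prev_m] * (sigma[m] / sigma[prev_m]) * (x_prev - mu[prev_m])
                + sigma[m] * eps * np.sqrt(1.0 - rho[prev_m] ** 2))
        x_prev, prev_m = x[t], m
    # Inverse transform: exponentiate and add the monthly lower bound back
    return (np.exp(x) + np.tile(tau, n_years)).reshape(n_years, 12)


# Toy "historic" record: 40 years of seasonal, lognormally perturbed flows
rng = np.random.default_rng(0)
seasonal = 50.0 + 30.0 * np.sin(2 * np.pi * np.arange(12) / 12)
Q_hist = seasonal * rng.lognormal(mean=0.0, sigma=0.3, size=(40, 12))

tau = stedinger_lower_bound(Q_hist)
mu, sigma, rho = fit(np.log(Q_hist - tau))
Q_syn = generate(mu, sigma, rho, tau, n_years=50, rng=rng)
```

Calling generate repeatedly with the same fitted parameters gives the independent realizations that make up an ensemble. Note that each synthetic flow is bounded below by the monthly lower bound, which can itself be negative.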

Synthetic ensemble results

I used the ThomasFieringGenerator to produce 100 samples of 50-year monthly streamflows at USGS gauge site 01434000 on the Delaware River which has data going back to 1904.

This data is available in the repo and is stored in the file usgs_monthly_streamflow_cms.csv

The plotting functions are taken from the Synthetic Generation Figure Library which was shared previously on the blog.

First we consider the range of historic and synthetic streamflow timeseries:

Generally, when working with synthetic ensembles, it is good for the distribution of the synthetic ensemble to “envelope” the historic range while maintaining a similar median. The Thomas Fiering model does a good job at this!
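The “envelope” idea can also be checked numerically rather than only by eye. The sketch below is a hypothetical helper (not part of the Synthetic Generation Figure Library) that reports the fraction of calendar months in which the ensemble's minimum and maximum bracket the historic minimum and maximum; the array shapes and toy data are assumptions.

```python
import numpy as np


def envelope_fraction(Q_hist, Q_ens):
    """Fraction of months whose historic range lies inside the ensemble range.

    Q_hist: (n_hist_years, 12); Q_ens: (n_realizations, n_years, 12).
    """
    hist_min, hist_max = Q_hist.min(axis=0), Q_hist.max(axis=0)
    # Pool all realizations and years for each calendar month
    ens_min, ens_max = Q_ens.min(axis=(0, 1)), Q_ens.max(axis=(0, 1))
    covered = (ens_min <= hist_min) & (ens_max >= hist_max)
    return covered.mean()


# Toy example: two "historic" years and two ensembles built from them
Q_hist = np.arange(1.0, 25.0).reshape(2, 12)
wide = np.stack([0.5 * Q_hist, 1.5 * Q_hist])  # brackets the historic range
narrow = Q_hist[None, :1, :]                   # only the first historic year

frac_wide = envelope_fraction(Q_hist, wide)
frac_narrow = envelope_fraction(Q_hist, narrow)
```

A value near 1 suggests the ensemble spans the observed extremes in most months; a low value would flag an ensemble that is too narrow to represent the historic variability.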

The next figure shows the range of flow-quantile values for both historic and synthetic flows. Again, we see a nice overlapping of the synthetic ensemble:

Conclusions

I personally think it is fun and helpful to look back at the foundational work in a field. Since Thomas and Fiering’s work in the early 1960s, there has been a significant amount of work focused on synthetic hydrology.

The Thomas Fiering model has a nice simplicity while still performing very well (with the help of the Stedinger normalization). Sure, there are some limitations to the model (e.g., the estimation of distribution and correlation parameters may be inaccurate for short records, and the method does not explicitly prevent the generation of negative streamflows), but the model, and the Harvard Water Program more broadly, was successful in ushering in new approaches for water resource systems analysis.

References

Maass, A., Hufschmidt, M. M., Dorfman, R., Thomas, Jr, H. A., Marglin, S. A., & Fair, G. M. (1962). Design of water-resource systems: New techniques for relating economic objectives, engineering analysis, and governmental planning. Harvard University Press.

Stedinger, J. R., & Taylor, M. R. (1982). Synthetic streamflow generation: 1. Model verification and validation. Water resources research, 18(4), 909-918.