Ensemble forecasting – application

In contrast to my theoretically oriented previous post on ensemble forecasting, I will attempt, in this post, to provide some practically oriented operational context for ensemble forecasting from the perspective of water management. This and the previous post will serve as a useful background for the planned third post in this series that will focus on the details of ensemble forecast verification. This post is largely a distillation of my experience in working with the formal Forecast Informed Reservoir Operations (FIRO) efforts that are ongoing, primarily in the western US. I use the term ‘formal’ here deliberately to differentiate from the more general notion of FIRO, which has been in the research literature since at least the 90’s. Importantly, ‘formal’ FIRO, as I choose to label it, is an effort to formalize the use of forecast information into the operations of our nation’s portfolio of federally owned or controlled dams, which contain the largest and most important reservoirs in CONUS. This is in stark contrast to the largely informal use of FIRO that has been applied somewhat ‘under the radar’ and scattershot in various ways and for various purposes by dam operators up to this point. While there is a very interesting discussion to be had about the institutional and operational complexities of this effort (perhaps a future post), I’ll leave this brief once-over-the-FIRO-world contextualization at that for now.

For the purposes of this post, I mention the operational context to narrow the window of forecast applications to a tractable space that is a) pertinent to water systems operations and b) is a space that I am intimately familiar with. The world of hydrometeorological forecasting is large and growing, containing both advances in legacy dynamical forecasting techniques as well as emergent machine learning approaches. When it comes to federal agencies, tried/tested/verified is incredibly important, so I anticipate the uptake of the latter ML approaches to be some ways off. Baby steps…the US Army Corp of Engineers (USACE) is still trying to move past the ‘water on the ground’ legacy of water management.

Meteorological forecasting

In discussing the current state of ensemble forecasting, I will progress hierarchically from the meteorological to the hydrologic forecasting space. I will first discuss some aspects of dynamical meteorological forecasting that form the backbone of any hydrological forecasting effort. It is, after all, the weather gods casting their hydrometeors (rain/snow/ice) upon the earth that cause the majority of the hydrologic responses of interest to practical FIRO applications. This is an important point that I’ll note here and come back to later. Meteorological forcings dominate the hydrologic response and by consequence, dominate the hydrologic uncertainties when we start thinking in terms of future predictions (forecasts) where meteorologic uncertainties become large. This is wholly in contrast to efforts to understand hydrologic uncertainty with observational data as forcings (for instance, stochastic watershed modeling, see my post here), where the hydrologic model’s structural inadequacy is the primary driver of hydrologic predictive uncertainty. The observational (input) uncertainties are not inconsequential in these sorts of applications, but they form a less substantial portion of the overall uncertainty and can largely be considered aleatory.

Advances in meteorological forecasting skill

With some sense of the importance of meteorological uncertainty to the hydrologic forecasting problem, I want to turn now to a brief discussion on the advances in dynamical meteorological forecasting and areas for future research and improvement. Most folks have some practical sense of forecast skill (or lack thereof) from their daily lives; you check the smartphone weather before you head out for the day, put on shorts and a t-shirt, then get soaked walking to work due to some unforecast rain event. Despite events like this and the general challenge of dynamical uncertainty in the atmospheric system (see previous post), the reality is that forecast skill has steadily improved over time, to the tune of 1-day of ‘useful’ skill improvement per decade (see Figure 1). Although dependent on the particular skill metric being used, it is common to assess ‘useful’ skill above 60% skill and a ‘high degree of accuracy’ above 80% skill. By some visual extrapolation, one might surmise that we should be approaching ‘useful’ skill out to 10-days by now in the northern hemisphere (NH) for the atmospheric variable in question.

Figure 1. Improvements in forecast skill of 500 hPa geopotential height anomaly over time (Bauer et al., 2015)

Notably, Figure 1 shows the skill improvements for a large, synoptic scale variable at a moderate altitude in the atmosphere (500 hPa ~ 15,000’ above sea level). Skill varies greatly across atmospheric variables and is particularly challenging for variables that are highly dependent on sub-grid scale processes (i.e. processes that occur below the native resolution of the forecast model); one of these, as luck would have it, is the aforementioned hydrometeor (precipitation) generating process of such interest to hydrologic applications. As I will loosely paraphrase, something I heard from F. Martin Ralph, the current director of the Center for Western Weather and Water extremes (CW3E) at Scripps, is that ‘our skill in predicting upper atmospheric variables at large scales has increased enormously, whereas our skill in predicting local scale precipitation has barely budged’. This basic notion is a Pandora’s box of sorts in terms of thinking how we might use the information that is available in forecast models (i.e. the highly accurate synoptic scale variables) differently to more directly forecast key variables of interests at the local scale; many of these emerging techniques rely on ML techniques applied across wide spatial scales that can more precisely target key hydrometeorological attributes of interest to water management like the probability of exceeding some rainfall threshold (Zhang et al., 2022) or mapping directly to streamflow itself and skipping the whole precipitation part (Nearing et al., 2021). These are certainly highly promising avenues for research, but we’ll return to some practical advancements in dynamical forecasting for now.

Areas for improving meteorological forecasting

One could spend an entire blog post (or many posts really) on this subject. I want to touch on just a few key areas where much research effort is currently dedicated to advancing meteorological forecasting. The first of these is improving the spatiotemporal resolution of the forecast models. To take a quick step back, it is useful to compare a ‘forecast’ model to a Global Circulation/Climate Model (GCM) or Earth System Model (ESM). In a broad sense, these models are all constructed in the same basic way. They start with a dynamical representation of the atmosphere based on the Navier-Stokes fluid dynamics equations that are discretized and solved across ‘grids’ of some horizontal and vertical dimensions across the globe. In climate science speak, this part of the model is often referred to as the ‘dynamical core’ (or ‘dy-core’ if you want to sound cool). For the most part, things that happen below this native grid resolution are parameterized (e.g. cloud formation), which means that they are modeled based on fitted statistical relationships between grid scale atmospheric states and sub-grid scale process outcomes. These parameterizations are often referred to as the model’s ‘physics’. Intuitively, the smaller the scale we can represent the atmospheric processes, then the better we might be able to capture these local behaviors. Advances in computational power have enable much higher resolutions for forecast models/GCM/ESM over time and will likely continue to do so.

Where these models primarily differ is in all the other stuff needed to run a simulation of earth’s atmosphere, namely the land/ocean/ice surfaces and their couplings with the atmosphere. Forecast models must be computationally tractable at operational relevant timescales to be useful. In the space we are discussing, this means that the model must be able to reinitialize and produce a forecast at least daily. Moreover, the forecast temporal resolution must be much higher to capture evolution of the climate system at operationally relevant scales, typically 6-hourly with current technology. To overcome this complexity tradeoff (see Figure 2), forecast models have traditionally relied on extremely simplified representations of the land/ocean/ice surface boundary conditions that are assumed to be constant across the forecast period. Conversely, the development of GCM/ESMs has continually increased the complexity of the land/ocean/ice modeling and their atmospheric coupling while retaining sufficient atmospheric spatiotemporal resolution for long term simulations. GCM/ESM must also provide provable closure of things like energy and carbon fluxes across long time scales; this is not really a primary concern for forecast models, they can be ‘leaky’. Increasingly, however, computational advances have enabled forecast models to look more like GCM/ESM, with state-of-the-science forecast models now modeling and coupling land/ocean interactions along with the atmospheric evolution.

Figure 2. Tradeoff between model complexity and resolution in modeling the earth system (Bauer et al., 2015)

The second key area that I’ll highlight is data assimilation. Data assimilation is the process of constraining forecast trajectories based on some intermediate assessment of the actual evolution of the climate system (i.e. an observation). In meteorological forecasting, this is typically done in a relatively short window after the forecast has been generated, for instance the 12-hour assimilation window in Figure 3 below. Importantly, for the issuance of forecasts, this assimilation period is happening ‘under the hood’, so that the user of forecasts would only see the forecast as being issued at 21:00, not the 09:00 time when the actual forecast model was initialized. While the details of data assimilation are complex and well outside the scope of this post, it is an incredibly important area of continued research for dynamical forecasting. To put this into context, another loose paraphrase I heard from Vijay Tallapragada, senior scientist for NOAA’s Environmental Modeling Center, is ‘if you want long-term job security in the weather forecasting, get into data assimilation’.

Figure 3. Data assimilation example in ensemble forecasting (Bauer et al., 2015)

Lastly, as alluded to in my previous post, the methods to sample initial condition uncertainty and propagate it through the dynamical forecast model are also a very important area of advancement in meteorological ensemble forecasting. The mathematical challenge of adequately sampling initial condition uncertainty at global scales across highly correlated initialization variables is huge, as is the computational burden of running multiple perturbed members of the forecasts to produce an ensemble. The latter challenge has particular significance to the hydrological forecasting enterprise and may, at least partially, explain certain design choices in the hydrologic forecast systems discussed in the following section.

Hydrologic ensemble forecasting

To turn the meteorologic outputs of precipitation and temperature to streamflow, we need a streamflow generating process model. Moreover, if we want an ensemble of streamflow predictions, we need a way to generate some uncertainty in the streamflows to reflect the dominant meteorological uncertainties in the forecasts. With this modeling chain of meteorologic forecast model à streamflow process model, one could envision many possible ways to attempt to capture this uncertainty. Perhaps the most intuitive would be to take each ensemble output from the meteorologic forecast model and run it through the streamflow process model; voila! ensemble streamflow forecasts!

The Hydrologic Ensemble Forecast Service (HEFS)

This is, however, not how it’s done in the Hydrologic Ensemble Forecast Service (HEFS) that is the current operational model used by NOAA/NWS river forecast centers (RFC) and the model being primarily used for formal FIRO efforts (see Figure 4 below). Instead, HEFS has its own internal Meteorological Ensemble Forecast Processor (MEFP) that ingests an ensemble mean of temperature and precipitation inputs from the meteorological ensemble forecast (global ensemble forecast system, GEFS) and then creates its own internal ensemble of meteorologic forecasts. These internally generated traces of forecasted temperature/precipitation are each used to force a single, watershed calibrated version of the SAC-SMA hydrologic model to produce an ensemble of streamflow forecasts (Demargne et al., 2014). Why use this implementation instead of the seemingly more straightforward approach with the meteorological ensemble? I don’t have a definitive answer for that, but my understanding is that, from a statistical verification perspective, the current HEFS approach is more reliable.


Figure 4. Conceptual diagram of HEFS model (Demargne et al., 2014)

Another possible reason stems from the computational complexity challenge of meteorological ensemble forecasting mentioned in the previous section, particularly as it relates to hindcasting. Hindcasts are forecasts that are generated with a modern day forecast model (and their modern day skill) against historical observations of the appropriate quality (post satellite era generally speaking, ~1979 – present). In meteorological spheres, hindcasts are important in documenting the performance of forecast model across a much larger dataset. For the purposes of advancing operational water management with forecasts, hindcasts are more than important…they are indispensable. Hindcasts form the basis for the design and testing of forecast informed water management procedures; there is no workaround for this. To generate a meteorological hindcast, the NWS has to carve out time to dedicate its warehouse sized forecast model computational platform to generating 20 -30 years of forecasts and archiving the data. This time must be balanced with the operational necessity of producing real time forecasts for things like: not getting wet going to work or maybe, I don’t know, aviation? This challenge means that hindcasts must typically be produced at reduced resolution compared to the operational forecasts. For example, the current NOAA Global Ensemble Forecast System (GEFSv12) produces operational forecasts with 31 ensemble members, but only 5 ensemble members in the hindcasts (Guan et al., 2019).

This limitation is significant. 5 ensemble members is sufficient to produce a relatively good approximation of the ensemble-mean, but grossly undersized for producing an adequate probabilistic streamflow ensemble. In this hindcast context, then, the construction of HEFS makes sense and enables more flexibility for generating historical forecast ensembles for development of forecast informed water management policies.

Practical HEFS limitations

The current implementations of HEFS typically generate ~40 ensemble members. The MEFP is conditioned on a historical period of reference using a Schaake shuffle type procedure, so this ensemble size is actually the length (in years) of that historical conditioning period. As with meteorological forecasting, the production of HEFS hindcasts is also challenging, in that it requires the same carving out of time on an operational computational architecture to produce the hindcasts. This limitation substantiates one of the core interests from folks within the formal FIRO implementation efforts into synthetic forecasting methods to expand the length and availability of simulated hindcasts for operational policy development. In general, HEFS is reliable, but it suffers from systemic ‘Type-II conditional bias’, which in layman’s terms is the tendency to underpredict extreme events. I will dig into some of these attributes of HEFS forecasts in a future post on ensemble verification.

Outside of questions of statistical reliability, there are other more challenging to quantify concerns about the implementation of HEFS. The ensemble-mean forecasts from the NOAA GEFS model that force HEFS are bound to be more ‘skillful’ than, for instance, the ensemble control forecast. (Note: the control forecast is the forecast trajectory initialized from the ‘best guess’ at the initial condition state of the atmosphere). This is a well-documented property of ensemble-mean predictions. However, ensemble-mean forecasts are not dynamically consistent representations of the atmosphere (Wilks, 2019). As I noted in my previous post, an ideal forecast ensemble would include all possible trajectories of the forecasts, which might diverge substantially at some future time. Where forecast trajectories have diverged, an average of those trajectories will produce a value that is not in the space of plausible dynamic trajectories. In practical terms, this leads to ensemble-mean forecasts collapsing to climatology at longer leads and systematic reductions in the range of magnitudes of key variables as lead time increases. In short, this reliance on the meteorological ensemble-mean forecast does incur some risk (in my opinion) of missing high impact extremes due to, for example, even relatively small spatial errors in the landfall locations of atmospheric river storms.

Final thoughts

The intent of this post was to provide some current operational context for the types of forecasts being considered for formal implementation of forecast informed water management operations. The ensemble forecasting space is large and exciting from both research and operational perspectives, but it’s challenging to navigate and understand what is actually being considered for use, and what is simply novel and interesting. In my next post, I’ll discuss some of the primary verification techniques used for ensemble forecasts with a focus on the performance of HEFS in an example watershed and some associated practical implementation.

Reference

Bauer, P., Thorpe, A., & Brunet, G. (2015). The quiet revolution of numerical weather prediction. Nature, 525(7567), 47–55. https://doi.org/10.1038/nature14956

Demargne, J., Wu, L., Regonda, S. K., Brown, J. D., Lee, H., He, M., … Zhu, Y. (2014). The science of NOAA’s operational hydrologic ensemble forecast service. Bulletin of the American Meteorological Society, 95(1), 79–98. https://doi.org/10.1175/BAMS-D-12-00081.1

Guan, H., Zhu, Y., Sinsky, E., Fu, B., Zhou, X., Li, W., et al. (2019). The NCEP GEFS-v12 Reforecasts to Support Subseasonal and Hydrometeorological Applications. 44th NOAA Annual Climate Diagnostics and Prediction Workshop, (October), 78–81.

Nearing, G. S., Kratzert, F., Sampson, A. K., Pelissier, C. S., Klotz, D., Frame, J. M., … Gupta, H. V. (2021). What Role Does Hydrological Science Play in the Age of Machine Learning? Water Resources Research, 57(3). https://doi.org/10.1029/2020WR028091

Wilks, D. S., (2019). Statistical Methods in the Atmospheric Sciences, 4th ed. Cambridge, MA: Elsevier.

Zhang, C., Brodeur, Z. P., Steinschneider, S., & Herman, J. D. (2022). Leveraging Spatial Patterns in Precipitation Forecasts Using Deep Learning to Support Regional Water Management. Water Resources Research, 58(9), 1–18. https://doi.org/10.1029/2021WR031910

Ensemble forecasting – theory

‘Does the flap of a butterfly’s wing in Brazil stir up a tornado in Texas?’ These words reflect the so-called ‘butterfly effect’ typically ascribed to Edward Lorenz (MIT), who was a pioneer in the field of numerical weather prediction (NWP). (Interestingly, Lorenz himself first referred to the ‘flap of a seagulls’ wing’; the butterfly swap out was the work of a creative conference session convener. Somewhat coincidentally, the shape of the ‘Lorenz attractor’ is also reminiscent of a butterfly, see Figure 1). Motivated by the very salient challenge in the post-WWII era to improve weather prediction capabilities for strategic purposes, he developed a theoretical framework for the predictability of non-linear dynamical systems in a seminal paper, ‘Deterministic nonperiodic flow’ (Lorenz, 1963), that would come to be known as ‘chaos theory’ or ‘chaotic dynamics’. This theory was foundational to the development of ensemble forecasting (and a whole host of other things), and still underpins our current theoretical understanding of the inherent predictability of complex dynamical systems like the atmosphere (Lorenz, 2006).


Figure 1. Book cover (left panel) and the characteristic ‘butterfly wing’ shape of the Lorenz attractor (right panel) from his seminal 1963 work. (Palmer, 1993)

In actuality, the whole ‘flap of a butterfly’s wing creating a tornado in Texas’ relates specifically to only one aspect of the theoretical framework developed by Lorenz. In this post, I will attempt to give a slightly fuller treatment of this theoretical framework and its development. My hope is that this post will be a theoretical primer for two follow-on practically oriented posts: 1) a discussion of the state-of-the-science of hydrometeorological ensemble forecasting and its application in emerging water resources management practices like forecast informed reservoir operations, and 2) a deeper dive into ensemble verification techniques. While the techniques for (2) are grounded in the theoretical statistical properties of stochastic-dynamic predictive ensembles, they have broad applications to any scenario where one needs to diagnose the performance of an ensemble.

The Lorenz model and ‘chaos’ theory

When most people hear the word ‘chaos’, they tend to think of the dictionary definition: ‘complete disorder and confusion’ (Oxford), which in a scientific context might aptly describe some sort of white noise process. As you will see, this is somewhat of a far cry from the ‘chaos’ described in Lorenz’s theory. The ‘chaos’ terminology was actually coined by a later paper building on Lorenz’s work (Li & Yorke, 1975) and as noted by Wilks (2019): ‘…this label is somewhat unfortunate in that it is not really descriptive of the sensitive-dependence phenomenon’. The sensitive-dependence phenomenon is one of a set of 4 key properties (to be discussed later) that characterize the behaviors of the sort of non-linear, non-periodic deterministic systems that Lorenz argued were most representative of the atmosphere. In contrast to ‘disorder’ and ‘confusion’, these properties, in fact, lend some sense of order to the evolution of these dynamical systems, although the nature of that order is highly dependent on the system state.

A deterministic, non-periodic systems model

First, let’s dive into a few details of Lorenz’s 1963 work, using some figures and descriptions from a later paper (Palmer, 1993) that are easier to follow. While the 1963 paper is quite dense, much of the mathematical logic is dedicated to justifying the use of a particular system of equations that forms the basis of the study. This system of 3 equations and 3 variables (X, Y, Z) describes a non-linear, dissipative model of convection in a fluid of uniform depth, where there is a temperature difference between the upper and lower surfaces. Lorenz derived the set of 3 equations shown in the upper left panel of Figure 2 from earlier work by Rayleigh (1916) on this particular problem. In short, X relates to the intensity of convective motion, Y relates to the temperature difference between ascending and descending currents, and Z relates to the distortion of the vertical temperature profile from linearity; the details of these variables are not actually all that important to the study. What is important is that this system of equations has no general solutions (aside from the steady state solution) and must be numerically integrated in the time dimension to determine the convective flow dynamics. The ‘trajectories’ of these integrations shown in the phase space diagrams in Figure 2 exhibit the sorts of unstable, non-periodic behaviors that Lorenz thought were the most useful analog to atmospheric dynamics, albeit in a much simpler system. (‘Much’ is an understatement here; modern NWP models have a phase space on the order of 109 in contrast to the 3-dimensional phase space of this problem, Wilks, 2019. Of note, the use of phase-space diagrams (i.e. plots where each axis corresponds to a dependent variable in the system of equations) preceded Lorenz, but his use of them is perhaps one of the best-known early instances of this kind of visualization. Other uses of phase space relationships can be found in Rohini’s post or Dave’s post)

Figure 2. a) Lorenz equations and the XY phase space diagram of their integrations. b-c) X variable timeseries of two separate integrations of the Lorenz equations from very similar initial conditions. (Palmer, 1993)

Regime structure and multiple distinct timescales

What ‘behaviors’ then, are we talking about? Referring again to Figure 2a, we see the projection of a long series of model integration trajectories onto the XY-plane of the phase space diagram, where each dot is a timestep in the integration. What is immediately apparent in the form of these integrations is the two lobed ‘butterfly’ shape of the diagram. Each lobe has a central attractor where the trajectories often reside for multiple revolutions, but can then transition to the other attractor when passing near the center of the diagram. These so-called ‘Lorenz attractors’ comprise one of the key properties of chaotic dynamics, namely regime structure, which is tendency of the trajectories to reside around phase space attractors for some period of time. This residence time in a regime is generally quite a bit longer than the timescales of the trajectories’ individual revolutions around an attractor. This attribute is referred to as multiple distinct timescales and is evidenced in Figure 2b-c, where smaller amplitude sections of the timeseries show residence in one or the other regime and large amplitude sections describe transitions between regimes. Often, but not always, the residence in these regimes occurs for multiple revolutions, suggesting that there are shorter timescale evolutions of the system that take place within these regimes, while infrequent, random shifts to the other regimes occur at longer timescales.

Sensitive-dependence and state-dependent variation in predictability

Figure 3. a-c) Different trajectories through the XY phase space dependent on the initial condition state-space (black circle). (Palmer, 1993)

Returning now to the ‘butterfly effect’; what, then, is this whole sensitive-dependence thing mentioned earlier? Figure 3a-c provide a nice graphical representation of this phenomenon. In each panel, a set of nearby initial states are chosen at different location in the phase space diagram and then are followed through their set of trajectories. In 3a, these trajectories neatly transition from one regime to the other, remaining relatively close to each other at the trajectories’ end. In 3b, a set of initial states not so far from 3a are chosen and instead of neatly transitioning to the other regime, they diverge strongly near the center of the diagram, with some trajectories remaining in the original regime, and some transitioning. However, for about half of the timespan, the trajectories remain very close to one another. Finally, an initial state chosen near the center of the diagram (3c) diverges quickly into both regimes, with trajectories ending up at nearly opposite ends of the phase space (black tails at upper right and left of diagram). Figures b-c, in particular, showcase the sensitive-dependence on initial conditions attributes of the system. In other words, from a set of very close-by initial states, the final trajectories in phase space can yield strongly divergent results. Importantly, this divergence in trajectories over some period of time can occur right away (3c), at some intermediate time (3b), or not at all (3a).

This is the basic idea behind the last core property of these systems, state-dependent variation in predictability. Clearly, a forecast initialized from the starting point in 3a could be a little bit uncertain about the exact starting state and still end up in about the right spot for an accurate future prediction at the end of the forecast horizon. At medium ranges, this is also the case for 3b, but the longer range forecast is highly uncertain. For 3c, all forecast ranges are highly uncertain; in other words, the flap of a butterfly’s wing can mean the difference between one or the other trajectory! Importantly, one can imagine in this representation that an average value of 3c’s trajectories (i.e. the ensemble mean) would fall somewhere in the middle of the phase space and be representative of none of the two physically plausible trajectories into the right or left regimes. This is an important point that we’ll return to at the end of this post.

From the Lorenz model to ensemble forecasting

The previous sections have highlighted this idealized dynamical system (Lorenz model) that theoretically has properties of the actual atmosphere. Those 4 properties (the big 4!) were: 1) sensitive-dependence on initial conditions, 2) regime structure, 3) multiple distinct timescales, and 4) state-dependent variation in predictability. In this last section, I will tie these concepts into a theoretical discussion of ensemble forecasting. Notably, in the previous sections, I discussed ‘timescales’ without any reference to the real atmosphere. The dynamics of the Lorenz system prove to be quite illuminating about the actual atmosphere when timescales are attached to the system that are roughly on the order of the evolution of synoptic scale weather systems. If one assumes that a lap around the Lorenz attractor equates to a synoptic timescale of ~5 days, then the Lorenz model can be conceptualized in terms of relatively homogenous synoptic weather progressions occurring within two distinct weather regimes that typically persist on the order of weeks to months (see Figure 3b-c). This theoretical foundation jives nicely with the empirically derived weather-regime based approach (see Rohini’s post or Nasser’s post) where incidentally, 4 or more weather regimes are commonly found (The real atmosphere is quite a bit more complex than the Lorenz model after all). This discussion of the Lorenz model has hopefully given you some intuition that conceptualizing real world weather progressions outside the context of an appropriate regime structure could lead to some very unphysical representations of the climate system.

Ensemble forecasting as a discrete approximation of forecast uncertainty

In terms of weather prediction, though, the big 4 make things really tough. While there are certainly conditions in the atmosphere that can lead to excellent long range predictability (i.e. ‘forecasts of opportunity’, see Figure 3a), the ‘typical’ dynamics of the atmosphere yield the potential for multiple regimes and associated transitions within the short-to-medium range timeframe (1-15 days) where synoptic-scale hydro-meteorological forecasting is theoretically possible. (Note, by ‘synoptic-scale’ here, I am referring to the capability to predict actual weather events, not prevailing conditions. Current science puts the theoretical dynamical limit to predictability at ~2 weeks with current NWP technology achieving ‘usable’ skill out to ~7 days, give or take a few dependent on the region)

Early efforts to bring the Lorenz model into weather prediction sought to develop analytical methods to propagate initial condition uncertainty into a useful probabilistic approximation of the forecast uncertainty at various lead times. This can work for simplified representation of the atmosphere like the Lorenz model, but quickly becomes intractable as the complexity and scale of the governing equations and boundary conditions increases.

Figure 4. Conceptual depiction of ensemble forecasting including sampling of initial condition uncertainty, forecasts at different lead times, regime structure, and ensemble mean comparison to individual ensemble members. Solid arrow represents a ‘control’ member of the ensemble. Histograms represent an approximate empirical distribution of the ensemble forecast. (Wilks, 2019)

Thankfully, rapid advances in computing power led to an alternate approach, ensemble forecasting! Ensemble forecasting is a stochastic-dynamic approach that couples a discrete, probabilistic sampling of the initial condition uncertainty (Figure 4 ‘Initial time’) and propagates each of those initial condition state vectors through a dynamical NWP model. Each of these NWP integrations is a unique trajectory through state space of the dynamical model that yields a discrete approximation of the forecast uncertainty (Figure 4 ‘Intermediate/Final forecast lead time’). This discrete forecast uncertainty distribution theoretically encompasses the full space of potential hydro-meteorologic trajectories and allows a probabilistic representation of forecast outcomes through analysis of the empirical forecast ensemble distribution. These forecast trajectories highlight many of the big 4 properties discussed in previous sections, including regime structure and state-dependent predictability (Figure 4 trajectories are analogous to the Figure 3b trajectories for the Lorenz model). The ensemble mean prediction is an accurate and dynamically consistent prediction at the intermediate lead time, but at the final lead time where distinct regime partitioning has occurred, it is no longer dynamically consistent and delineates a region of low probability in the full ensemble distribution. I will explore properties of ensemble averaging, both good and bad, in a future post.

Lastly, I will note that the ensemble forecasting approach is a type of Monte Carlo procedure. Like other Monte Carlo approaches with complex systems, the methodology for sampling the initial condition uncertainty has a profound effect on the uncertainty quantification contained within the final NWP ensemble output, especially when considering the high degree of spatiotemporal relationships within the observed climatic variables that form the initial state vector. This is a key area of continued research and improvement in ensemble forecasting models.

Final thoughts

I hope that you find this discussion of the theoretical underpinnings of chaotic dynamics and ensemble forecasting to be useful. I have always found these foundational concepts to be particularly fascinating. Moreover, the basic theory has strong connections outside of ensemble forecasting, including the ties to weather regime theory mentioned in this post. I also think these foundational concepts are necessary to understand how much actual ensemble forecasting techniques can diverge from the theoretical stochastic-dynamic framework. This will be the subject of some future posts that will delve into more practical and applied aspects of ensemble forecasting with a focus on water resources management.

Reference

Lorenz, E. N. (1963). Deterministic nonperiodic flow. Journal of the Atmospheric Sciences, 20, 130–141.

Lorenz, E. N. (2006). Reflections on the Conception , Birth , and Childhood of Numerical Weather Prediction. Annual Review of Earth and Planetary Science, 34, 37–45. https://doi.org/10.1146/annurev.earth.34.083105.102317

Palmer, T. N. (1993). Extended-range atmospheric prediction and the Lorenz model. Bulletin – American Meteorological Society, 74(1), 49–65. https://doi.org/10.1175/1520-0477(1993)074<0049:ERAPAT>2.0.CO;2

Rayleigh, L. (1916). On convective currents in a horizontal layer when the higher temperature is on the underside. Phil. Mag., 32, 529-546.

Wilks, D. S., (2019). Statistical Methods in the Atmospheric Sciences, 4th ed. Cambridge, MA: Elsevier.

Nonstationary stochastic watershed modeling

In this post, I will describe the motivation for and implementation of a nonstationary stochastic watershed modeling (SWM) approach that we developed in the Steinschneider group during the course of my PhD. This work is in final revision and should be published in the next month or so. This post will attempt to distill key components of the model and their motivation, saving the full methodological development for those who’d like to read the forthcoming paper.

SWMs vs SSM/SSG

Before diving into the construction of the model, some preliminaries are necessary. First, what are SWMs, what do they do, and why use them? SWMs are a framework that combine deterministic, process-based watershed models (think HYMOD, SAC-SMA, etc.; we’ll refer to these as DWMs from here forward) with a stochastic model that capture their uncertainty. The stochastic part of this framework can be used to generate ensembles of SWM simulations that both represent the hydrologic uncertainty and are less biased estimators of the streamflow observations (Vogel, 2017).

Figure 1: SWM conceptual diagram

SWMs were developed to address challenges to earlier stochastic streamflow modeling/generation techniques (SSM/SSG; see for instance Trevor’s post on the Thomas-Fiering SSG; Julie’s post and Lillian’s post on other SSG techniques), the most important of which (arguably) being the question of how to formulate them under non-stationarity. Since SSMs are statistical models fitted directly to historical data, any attempt to implement them in a non-stationary setting requires strong assumptions about what the streamflow response might look like under an alternate forcing scenario. This is not to say that such an approach is not useful or valid for exploratory analyses (for instance Rohini’s post on synthetic streamflow generation to explore extreme droughts). SWMs attempt to address this issue of non-stationarity by using DWMs in their stochastic formulation, which lend some ‘physics-based’ cred to their response under alternate meteorological forcings.

Construction of an SWM

Over the years, there have been many SWM or SWM-esque approaches devised, ranging from simple autoregressive models to complex Bayesian approaches. In this work, we focus on a relatively straightforward SWM approach that models the hydrologic predictive uncertainty directly and simply adds random samples of it to the DWM simulations. The assumption here being that predictive uncertainty is an integrator of all traditional component modeling uncertainties (input, parameter, model/structural), so adding it back in can inject all these uncertainties into the SWM simulations at once (Shabestanipour et al., 2023).

Figure 2: Uncertainty components

By this straightforward approach, the fitting and parameter estimation of the DWM is accomplished first (and separately) via ‘standard’ fitting procedures; for instance, parameter optimization to minimize Nash-Sutcliffe Efficiency (NSE). Subsequently, we develop our stochastic part of the model on the predictive uncertainty that remains, which in this case, is defined simply by subtracting the target observations from the DWM predictions. This distribution of differenced errors is the ‘predictive uncertainty distribution’ or ‘predictive errors’ that form the target of our stochastic model.

Challenges in modeling predictive uncertainty

Easy, right? Not so fast. There is a rather dense and somewhat unpalatable literature (except for the masochists out there) on the subject of hydrologic uncertainty that details the challenges in modeling these sorts of errors. Suffice it to say that they aren’t well behaved. Any model we devise for these errors must be able to manage these bad behaviors.

So, what if we decide that we want to try to use this SWM thing for planning under future climates? Certainly the DWM part can hack it. We all know that lumped, conceptual DWMs are top-notch predictors of natural streamflow… At the least, they can produce physically plausible simulations under alternate forcings (we think). What of the hydrologic predictive uncertainty then? Is it fair or sensible to presume that some model we constructed to emulate historical uncertainty is appropriate for future hydrologic scenarios with drastically different forcings? My line of rhetorical questioning should clue you in on my feelings on the subject. YES!, of course. ‘Stationarity is immortal!’ (Montanari & Koutsoyiannis, 2014).

Towards a hybrid, state-variable dependent SWM

No, actually, there are a number of good reasons why this statement might not hold for hydrologic predictive uncertainty under non-stationarity. You can read the paper for the laundry list. In short, hydrologic predictive uncertainty of a DWM is largely a reflection of its structural misrepresentation of the true process. Thus, the historical predictive uncertainty that we fit our model to is a reflection of that structural uncertainty propagated through historical model states under historical, ‘stationary’ forcings. If we fundamentally alter those forcings, we should expect to see model states that do not exist under historical conditions. The predictive errors that result from these fundamentally new model states are thus likely to not fall neatly into the box carved out by the historical scenarios.

Figure 3: Structural uncertainty

To bring this back to the proposition for a nonstationary SWM approach. The intrinsic link between model structure and its predictive uncertainty raises an interesting prospect. Could there be a way to leverage a DWM’s structure to understand its predictive uncertainty? Well, I hope so, because that’s the premise of this work! What I’ll describe in the ensuing sections is the implementation of a hybrid, state-variable dependent SWM approach. ‘Hybrid’ because it couples both machine learning (ML) and traditional statistical techniques. ‘State-variable dependent’ because it uses the timeseries of model states (described later) as the means to infer the hydrologic predictive uncertainty. I’ll refer to this as the ‘hybrid SWM’ for brevity.

Implementation of the hybrid SWM

So, with backstory in hand, let’s talk details. The remainder of this post will describe the implementation of this hybrid SWM. This high-level discussion of the approach supports a practical training exercise I put together for the Steinschneider group at the following public GitHub repo: https://github.com/zpb4/hybrid-SWM_training. This training also introduces a standard implementation of a GRRIEN repository (see Rohini’s post). Details of implementing the code are contained in the ‘README.md’ and ‘training_exercise.md’ files in the repository. My intent in this post is to describe the model implementation at a conceptual level.

Model-as-truth experimental design

First, in order to address the problem of non-stationary hydrologic predictive uncertainty, we need an experimental design that can produce it. There is a very real challenge here of not having observational data from significantly altered climates to compare our hydrologic model against. We address this problem by using a ‘model-as-truth’ experimental design, where we fit one hydrologic model (‘truth’ model) to observations, and a second hydrologic model (‘process’ model) to the first truth model. The truth model becomes a proxy for the true, target flow of the SWM modeling procedure, while the process model serves as our proposed model, or hypothesis, about that true process. Under this design, we can force both models with any plausible forcing scenario to try to understand how the predictive uncertainty between ‘truth’ and ‘process’ models might change.

Figure 4: Conceptual diagram of ‘model-as-truth’ experimental design

For the actual work, we consider a very simple non-stationary scenario where we implement a 4oC temperature shift to the temperature forcing data, which we refer to as the ‘Test+4C’ scenario. We choose this simple approach to confine non-stationarity to a high-confidence result of anthropogenic climate change, namely, thermodynamic warming. We compare this Test+4C scenario to a ‘Test’ scenario, which is the same out-of-sample temporal period (WY2005-2018) of meteorological inputs under historical values. SAC-SMA and HYMOD are the truth model and process model for this experiment, respectively. Other models could have been chosen. We chose these because they are conceptually similar and commonly used.

Figure 5: Errors between truth and process models in 5 wettest years of Test/Test+4C scenarios.

Hybrid SWM construction

The core feature of the hybrid SWM is a model for the predictive errors (truth model – process model) that uses the hydrologic model state-variables as predictors. We implement this model in two steps that have differing objectives, but use the same state-variable predictor information. An implicit assumption in using state-variable dependencies in both steps is that these dependencies can exist in both stages. In other words, we do not expect the error-correction step to produce independent and identically distributed residuals. We call the first step an ‘error-correction model’ and the second step a ‘dynamic residual model’. Since we use HYMOD as our process model, we use its state-variables (Table 1) as the predictors for these two steps.

Table 1: HYMOD state variables

Short NameLong NameDescription
simSimulationHYMOD predicted streamflow in mm
runoffRunoffUpper reservoir flow of HYMOD in mm
baseflowBaseflowLower reservoir flow of HYMOD in mm
precipPrecipitationBasin averaged precipitation in mm
tavgAverage temperatureBasin averaged temperature in oC
etEvapotranspirationModeled evapotranspiration (Hamon approach) in mm
upr_smUpper soil moistureBasin averaged soil moisture content (mm) in upper reservoir
lwr_smLower soil moistureBasin averaged soil moisture (mm) in lower reservoir
sweSnow water equivalentBasin averaged snow water equivalent simulated by degree day snow module (mm)

Hybrid SWM: Error correction

The error-correction model is simply a predictive model between the hydrologic model (HYMOD) state-variables and the raw predictive errors. The error-correction model also uses lag-1 to 3 errors as covariates to account for autocorrelation. The objective of this step is to infer state-dependent biases in the errors, which are the result of the predictive errors subsuming the structural deficiencies of the hydrologic model. This ‘deterministic’ behavior in the predictive errors can also be conceived as the ‘predictive errors doing what the model should be doing’ (Vogel, 2017). Once this error-correction model is fit to its training data, it can be implemented against any new timeseries of state-variables to predict and debias the errors. We use a Random Forest (RF) algorithm for this step because they are robust to overfitting, even with limited training data. This is certainly the case here, as we consider only individual basins and a ~15 year training period (WY1989-2004). Moreover, we partition the training period into a calibration and validation subset and fit the RF error-correction model only to the calibration data (WY1989-1998), reducing available RF algorithm training data to 9 years.

Hybrid SWM: Dynamic residual model

The dynamic residual model (DRM) is fit to the residuals of the error correction result in the validation subset. We predict the hydrologic model errors for the validation subset from the fitted RF model and subtract them from the empirical errors to yield the residual timeseries. By fitting the DRM to this separate validation subset (which the RF error-correction model has not seen), we ensure that the residuals adequately represent the out-of-sample uncertainty of the error-correction model.

A full mathematical treatment of the DRM is outside the scope of this post. In high-level terms, the DRM is built around a flexible distributional form particularly suited to hydrologic errors, called the skew exponential power (SEP) distribution. This distribution has 4 parameters (mean: mu, stdev: sigma, kurtosis: beta, skew: xi) and we assume a mean of zero (due to error-correction debiasing), while setting the other 3 parameters as time-varying predictands of the DRM model (i.e. sigmat, betat, xit). We also include a lag-1 autocorrelation term (phit) to account for any leftover autocorrelation from the error-correction procedure. We formulate a linear model for each of these parameters with the state-variables as predictors. These linear models are embedded in a log-likelihood function that is maximized (i.e. MLE) against the residuals to yield the optimal set of coefficients for each of the linear models.

With a fitted model, the generation of a new residual at each timestep t is therefore a random draw from the SEP with parameters (mu=0, sigmat, betat, xit) modified by the residual at t-1 (epsilont-1) via the lag-1 coefficient (phit).

Figure 6: Conceptual diagram of hybrid SWM construction.

Hybrid SWM: Simulation

The DRM is the core uncertainty modeling component of the hybrid SWM.  Given a timeseries of state-variables from the hydrologic model for any scenario, the DRM simulation is implemented first, as described in the previous section. Subsequently, the error-correction model is implemented in ‘predict’ mode with the timeseries of random residuals from the DRM step. Because the error-correction model includes lag-1:3 terms, it must be implemented sequentially using errors generated at the previous 3 timesteps. The conclusion of these two simulation steps yields a timeseries of randomly generated, state-variable dependent errors that can be added to the hydrologic model simulation to produce a single SWM simulations. Repeating this procedure many times will produce an ensemble of SWM simulations.

Final thoughts

Hopefully this discussion of the hybrid SWM approach has given you some appreciation for the nuanced differences between SWMs and SSM/SSGs, the challenges in constructing an adequate uncertainty model for an SWM, and the novel approach developed here in utilizing state-variable information to infer properties of the predictive uncertainty. The hybrid SWM approach shows a lot of potential for extracting key attributes of the predictive errors, even under unprecedented forcing scenarios. It decouples the task of inferring predictive uncertainty from features of the data like temporal seasonality (e.g. day of year) that may be poor predictors under climate change. When linked with stochastic weather generation (see Rohini’s post and Nasser’s post), SWMs can be part of a powerful bottom-up framework to understand the implications of climate change on water resources systems. Keep an eye out for the forthcoming paper and check out the training noted above on implementation of the model.

References:

Brodeur, Z., Wi, S., Shabestanipour, G., Lamontagne, J., & Steinschneider, S. (2024). A Hybrid, Non‐Stationary Stochastic Watershed Model (SWM) for Uncertain Hydrologic Simulations Under Climate Change. Water Resources Research, 60(5), e2023WR035042. https://doi.org/10.1029/2023WR035042

Montanari, A., & Koutsoyiannis, D. (2014). Modeling and mitigating natural hazards: Stationarity is immortal! Water Resources Research, 50, 9748–9756. https://doi.org/10.1002/ 2014WR016092

Shabestanipour, G., Brodeur, Z., Farmer, W. H., Steinschneider, S., Vogel, R. M., & Lamontagne, J. R. (2023). Stochastic Watershed Model Ensembles for Long-Range Planning : Verification and Validation. Water Resources Research, 59. https://doi.org/10.1029/2022WR032201

Vogel, R. M. (2017). Stochastic watershed models for hydrologic risk management. Water Security, 1, 28–35. https://doi.org/10.1016/j.wasec.2017.06.001

The Thomas-Fiering Model for Synthetic Streamflow Generation with a Python Implementation

In 1962 a group of economists, engineers and political scientists who were involved in the Harvard Water Program published “Design of Water Resource Systems“. In chapter 12 of the book, Thomas and Fiering present the following statistical model which was one of the first, if not the first, formal application of stochastic modelling for synthetic streamflow generation and water resource systems evaluation.

It is an autoregressive model which can simulate monthly streamflow values based on the mean, variance, and correlation of historic observations.

In this blog post, I present the model in it’s original form along with a modified form presented by Stedinger and Taylor (1982). Then, I share a Python implementation of the model which is used to generate an ensemble of synthetic flows. I use plotting tools from the Synthetic Generation Figure Library to plot the results.

All of the code used for this post is available here: ThomasFieringModelDemo

Let’s get into it!

The Thomas-Fiering Model

The model that Thomas and Fiering proposed took the form:

Where, for each month m, Q_m is the generated flow, \bar{Q}_m is the mean historic flow, b_m is an autoregression coefficient for predicting that months flow from the prior months flow, \sigma is the standard deviation, r is the correlation coefficient and \epsilon is a random standard normal variable.

A modification to this model was proposed by Stedinger and Taylor (1982), which transforms transforms the streamflow values before fitting the model. I refer to this as the “Stedinger transformation” below and in the code.

Given Q_{m} as the observed flows in month m, the Stedinger transformation of the observed flows is then:

where \hat{\tau}_m is the estimated “lower bound” for each month, calculated as:

The modeled flows are generated from the recursive relationship:

Where:

  • \mu_{m} is the observed average historic monthly x series
  • \sigma_{m}^2 is the observed variance of the historic monthly x series
  • \epsilon_{m} independent standard-normal random variables
  • \rho_m observed between-month correlations of the historic x series

The above steps are performed for each month, and the synthetic streamflow sequence is generated by iteratively applying the stochastic process for the desired duration.

Python Implementation

I built this version of the Thomas Fiering model as a Python class with the following structure:

class ThomasFieringGenerator():
    def __init__(self, Q, **kwargs):
        
    def preprocessing(self, **kwargs):
	    # Stedinger normalization
	    
    def fit(self, **kwargs):
	    # Calculate mu, sigma, and rho for each month
	    
    def generate(self, n_years, **kwargs):
	    # Iteratively generate a single timeseries
	    # Inverse stedinger normalization
        return Q_synthetic
    
    def generate_ensemble(self, n_years, 
                          n_realizations = 1, 
                          **kwargs):
        # Loop and generate multiple timeseries
        return 

Rather than posting the entire code here, which would clutter the page, I will refer you to and encourage you to check out the full implementation which is in the linked repository here: ThomasFieringModelDemo/model.py

To see how this is used and replicate the results below using some example data, see the Jupyter Notebook: ThomasFieringModelDemo/ThomasFiering_demo.ipynb

Synthetic ensemble results

I used the ThomasFieringGenerator to produce 100 samples of 50-year monthly streamflows at USGS gauge site 01434000 on the Delaware River which has data going back to 1904.

This data is available in the repo and is stored in the file usgs_monthly_streamflow_cms.csv

The plotting functions are taken from the Synthetic Generation Figure Library which was shared previously on the blog.

First we consider the range of historic and synthetic streamflow timeseries:

Generally when working with synthetic ensembles it is good for the distribution of synthetic ensembles “envelope” the historic range while maintaining a similar median. The Thomas Fiering model does a good job at this!

The next figure shows the range of flow-quantile values for both historic and synthetic flows. Again, we see a nice overlapping of the synthetic ensemble:

Conclusions

I personally think it is fun and helpful to look back at the foundational work in a field. Since Thomas and Fiering’s work in the early 1960s, there has been a significant amount of work focused on synthetic hydrology.

The Thomas Fiering model has a nice simplicity while still performing very nicely (with the help of the Stedinger normalization). Sure there are some limitations to the model (e.g., the estimation of distribution and correlation parameters may be inaccurate for short records, and the method does not explicitly prevent the generation of negative streamflows), but the model, and the Harvard Water Program more broadly, was successful in ushering in new approaches for water resource systems analysis.

References

Maass, A., Hufschmidt, M. M., Dorfman, R., Thomas, Jr, H. A., Marglin, S. A., & Fair, G. M. (1962). Design of water-resource systems: New techniques for relating economic objectives, engineering analysis, and governmental planning. Harvard University Press.

Stedinger, J. R., & Taylor, M. R. (1982). Synthetic streamflow generation: 1. Model verification and validation. Water resources research, 18(4), 909-918.

Weather Regime-Based Stochastic Weather Generation (Part 2/2)

In this post on the Water Programming Blog, we continue to explore the application of the stochastic weather generator (available on GitHub) for climate-change scenario developments. This is the second installment of a two-part series of blog posts, and readers are strongly encouraged to familiarize themselves with different components of the stochastic weather generator, as explained in Part 1 by Rohini Gupta (The Reed Research Group).

Here, we will begin by offering a concise overview of developing climate change scenarios and how these scenarios are integrated into the weather generation model. Following this, we will proceed to interpret the impact of these climatic change conditions on key statistical insights concerning occurrences of floods and droughts. Through these examples, the implications of these climatic shifts on water resources management and flood risk analysis will become evident.

Climate Change Perturbations

In this stochastic weather generator, we specifically focus on two aspects of climate change scenario developments, including 1) thermodynamic and 2) dynamic perturbations.

1) Thermodynamic Change

Thermodynamic climate change, often referred to as temperature-driven change, is primarily driven by changes in the Earth’s energy balance and temperature distribution. This warming affects various aspects of the climate system, such as intensifying precipitation extremes, melting snowpacks and ice sheets, rising sea levels, altered weather patterns, and shifts in ecosystems. The primary driver of temperature-driven climate change is the increase in regional-to-global average temperatures due to the enhanced greenhouse effect. As temperatures rise due to natural and anthropogenic forcings, they trigger a cascade of interconnected impacts throughout the climate system.

In the stochastic weather generator, scenarios of temperature change are treated simply by adding trends to simulated temperature data for each location across the spatial domain. However, scenarios of thermodynamic precipitation intensification are modeled using a quantile mapping technique through scaling the distribution of daily precipitation in a way that replicates the effects of warming temperatures on precipitation as the moisture holding capacity of the atmosphere increases. In the context of California, previous studies have demonstrated that as temperatures rise, the most severe precipitation events (often associated with Atmospheric Rivers landfalling) are projected to increase in frequency, while the intensity of smaller precipitation events is expected to decrease (Gershunov et al., 2019). This alteration effectively stretches the distribution of daily precipitation, causing extreme events to become more pronounced while reducing the occurrence and strength of lighter precipitation events. We replicate this phenomenon by making adjustments to the statistical characteristics and distribution percentiles of precipitation (e.g., Pendergrass and Hartmann, 2014). To further elaborate on this, we select a scaling factor for the 99th percentile of nonzero precipitation and then modify the gamma-GPD mixed distribution to enforce this chosen scaling factor. For instance, in a scenario with a 3°C temperature warming and a 7% increase in extreme precipitation per °C (matching the theoretical Clausius-Clapeyron rate of increase in atmospheric water holding capacity due to warming; Najibi and Steinschneider (2023)), the most extreme precipitation events are projected to increase by approximately 22.5% (1.073). We adjust the gamma-GPD models fitted to all locations to ensure this percentage increase in the upper tail of the distribution. Assuming that mean precipitation remains constant at baseline levels, this adjustment will cause smaller precipitation values in the gamma-GPD model to decrease, compensating for the increase in extreme events through stretching the distribution of nonzero precipitation.

Lines 33-40 from ‘config.simulations.R’ show the user-defined changes to implement the thermodynamic scenarios based on temperature in Celsius (tc: e.g. 1 °C), percent change in extreme precipitation quantile (pccc: e.g. 7% per °C), and percent change in average precipitation (pmuc: e.g. 12.5% decrease) inputs. Needless to say that the stochastic weather generator runs at baseline mode if tc=0 and pmuc=0.

    ##-------------Define perturbations-------------##
    ##climate changes and jitter to apply:
    change.list <- data.frame("tc"=  c(0), # {e.g., 0, 1, 2, ...} (changes in temperature)
                              "jitter"= c(TRUE),
                              "pccc"= c( 0.00), # {e.g., 0, 0.07, 0.14, ...} (changes for precipitation extreme quantile -- CC)
                              "pmuc"= c( 0.00)# {e.g., 0, -.125, .125, ...} (changes in precipitation mean)
    )
    ##----------------------------------------------##

2) Dynamic Change

Dynamic climate change, also known as circulation-driven change, is driven by shifts in atmospheric and oceanic circulation patterns. These circulation patterns are influenced by a variety of factors, including temperature gradients, differences in air pressure, and Earth’s rotation. Changes in these patterns can lead to alterations in weather patterns, precipitation distribution, and regional climate characteristics. One well-known example of dynamic climate change is the phenomenon of El Niño and La Niña, which involve changes in ocean temperatures and atmospheric pressure in the Pacific Ocean. These events can significantly impact local-to-global weather patterns, causing droughts, heavy rainfall, and other extreme weather events (Pfahl et al., 2017).

Dynamic changes impact the evolution of weather patterns and can modify the occurrence frequency of these patterns. This influence can occur through direct adjustments to the transition probabilities between different weather regimes, or indirectly by modifying the covariates that govern the progression of these weather regimes. In Steinschneider et al. (2019), a Niño 3.4 index is used to force weather regime evolution and is systematically adjusted to create more frequent El Niño and La Niña events. In Gupta et al. (in review), a 600-year long sequence of tree-ring reconstructed principal components of weather regime occurrence are used as an alternative covariate to better capture natural variability inherent in the weather regimes.

In the most recent version of the stochastic weather generator, we developed a novel non-parametric approach to simulation of weather regimes, allowing for future dynamic change scenarios with altered (customized) weather regime probabilities. Assuming that the historical time series of water regimes is divided into distinct, consecutive segments without overlaps, each segment has a duration of D years, and there is a total of ND segments considered there (here, D=4 and ND=18). In the non-parametric method, each segment (indexed as n=1 to ND) is assigned a sampling probability denoted as pn. To generate a new sequence of daily weather regimes spanning any desired number of years, the procedure involves resampling (with replacement) the nth D-year segment of daily weather regimes using the corresponding probability pn. This process is repeated until the required number of years of simulated weather regimes has been attained. If needed, the last segment can be trimmed to ensure the precise desired duration of simulated weather regimes.

In the baseline scenario for the weather generator with no dynamic climate change (only thermodynamic change), each segment is considered equally likely (i.e., no changes to large-scale circulation patterns).

However, the probabilities pn can be adjusted to alter the frequencies of each of the identified weather regimes in the final simulation, enabling the generation of dynamic climate change scenarios (i.e., scenarios in which the frequencies of different atmospheric flow patterns change compared to their historical frequencies). This is achieved using a linear program. The goal of the linear model (not shown) is to identify new sampling probabilities pn that, when used in the non-parametric simulation approach above, create a sequence of weather regimes with long-term average frequencies that approach some vector of target probabilities for those identified weather regimes.

Lines 91-126 from ‘config.simulations.R’ show the user-defined changes to implement a non-parametric scenario with equal probabilities (0: no change to the historical sequence of weather regimes) to ten weather regimes, i.e., dynamic scenario #0; and a 30% increase in weather regime number three (a dry weather condition) in California, i.e., dynamic scenario #1.

##Choose below whether through parametric or non-parametric way to create the simulated WRs ##
    use.non_param.WRs <- TRUE #{TRUE, FALSE}: TRUE for non-parametric, FALSE for parametric simulated WRs

    dynamic.scenario  <- 0 # {0, 1, 2}: 0: no dynamic change; 1: dynamic scenario #1 (30% increase in WR3); or 2: dynamic scenario #2 (linear trend)

    if (use.non_param.WRs){      #----------- 1+2 dynamic scenarios ----------#
      if (dynamic.scenario==0){
        ##===> Attempt #0 (thermodynamic only; no change to freq of WRs) ===##
        # #specify target change (as a percent) for WR probabilities
        WR_prob_change <- c(0,0,0,0,0,0,0,0,0,0) # between 0 and 1
        # #how close (in % points) do the WR frequencies (probabilities) need to be to the target
        lp.threshold <- 0.00001
        # #how much change do we allow in a sub-period sampling probability before incurring a larger penalty in the optimization
        piecewise_limit <- .02
        
      }else if(dynamic.scenario==1){
        ##===> Attempt #1 (dynamic scenario #1) ===##
        # #specify target change (as a percent) for WR probabilities (if, increasing WR3 in future)
        WR_prob_change <- c(0,0,.3,0,0,0,0,0,0,0) # between 0 and 1
        # #how close (in % points) do the WR frequencies (probabilities) need to be to the target
        lp.threshold <- 0.007
        # #how much change do we allow in a sub-period sampling probability before incurring a larger penalty in the optimization
        piecewise_limit <- .02
        
      }else if(dynamic.scenario==2){
        ##===> Attempt #2 (dynamic scenario #2) ===##
        # specify target change (as a percent) for WR probabilities (if, continuing their current trends in future)
        WR_prob_change <- c(-0.09969436,  0.27467048,  0.33848792,
                            -0.28431861, -0.23549986,  0.03889970,
                            -0.05628958, 0.38059153, -0.16636739, -0.17995965) # between 0 and 1
        # how close (in % points) do the WR frequencies (probabilities) need to be to the target
        lp.threshold <- 0.008
        # how much change do we allow in a sub-period sampling probability before incurring a larger penalty in the optimization
        piecewise_limit <- .02
      }
    }

Stochastic Weather Generation for Climate Change Scenarios

The stochastic weather generator is utilized to generate two climate change scenarios as defined above. The description of specific configurations for each scenario is listed as follows:

  • Thermodynamic Scenario: 3°C increase in temperature, 7% per °C increase in precipitation extremes, no change in average precipitation.
  • Dynamic Scenario: 30% increase in occurrence frequency of one weather regime only, labeled as ‘WR3’, which exhibits a ridge directly over the northwest US, i.e., blocking moisture flow over California, and resulting in dry conditions during the cold season there.

Thus, we generated 1008 years of simulated precipitation and temperature for each 12 sites in the Tuolumne River Basin in California (similar to Part 1) following these two scenarios. Below is a list of figures to understand better the impact of each scenario on precipitation and temperature statistical distributions and important climate extremes at basin scale.

The two Figures below present the cumulative distribution function (CDF) of the generated scenario for precipitation (left) and temperature (right) based on the thermodynamic and dynamic change, respectively. The observed time-series of precipitation and temperature across these 12 sites is also illustrated.

As seen above, although the 3°C warming is clearly manifested in the alteration of simulated temperature’s CDF, it is hard to notice any drastic shifts in the overall CDF of precipitation time series, as the bulk of distribution has not been altered (remember the average precipitation remained constant although its extreme quantile scaled up by ~ 22.5%).

Similarly, while the CDF of precipitation demonstrates a slight shift towards a drier condition, we notice a large shift in tail of temperature distribution.

Now, we go ahead and examine a set of important indexes for climate risk assessment of water systems. The Figure below presents the 1-day precipitation maxima derived from the generated scenario for precipitation based on the thermodynamic (left) and dynamic (right) change.

In the plot above depicted for thermodynamic climate change, the median 1-day precipitation extremes at the basin scale throughout the entire synthetic weather generation vs. historical period demonstrates a 25.5% increase in its magnitude, which is consistent with the event-based precipitation scaling by 22.5% at each site. However, such metric has almost remained unchanged in the dynamic climate change scenario.

Finally, the Figure below shows the 5-year drought derived from the generated scenario for water-year precipitation total, based on the thermodynamic (left) and dynamic (right) change.

The boxplots presented above related to the thermodynamic scenario, revealing a consistent median 5-year drought magnitude, as anticipated (no shift in the distribution of average precipitation bulk). In contrast, the dynamic climate change scenario exhibits a substantial exacerbation, with the 5-year drought magnitude worsening by around 9% compared to the historical records.

There is plenty more to explore! The stochastic weather generator is suitable to quickly simulate a long sequence of weather variables that reflect any climate change of interest. Keep an eye out for upcoming additions to the repository in the coming months, and do not hesitate to contact us or create a GitHub issue if you need clarification.

References

Gershunov, A., Shulgina, T., Clemesha, R.E.S. et al. (2019). Precipitation regime change in Western North America: The role of Atmospheric Rivers. Scientific Reports, 9, 9944. https://doi.org/10.1038/s41598-019-46169-w.

Gupta, R.S., Steinschneider S., Reed, P.M. Understanding Contributions of Paleo-Informed Natural Variability and Climate Changes on Hydroclimate Extremes in the Central Valley Region of California. Authorea. March 13, 2023. https://doi.org/10.22541/essoar.167870424.46495295/v1

Najibi, N., and Steinschneider, S. (2023). Extreme precipitation-temperature scaling in California: The role of Atmospheric Rivers, Geophysical Research Letters, 50(14), 1–11, e2023GL104606. https://doi.org/10.1029/2023GL104606.

Pendergrass, A.G., and Hartmann, D.L. (2014). Two modes of change of the distribution of rain, Journal of Climate, 27(22), 8357-8371. https://doi.org/10.1175/JCLI-D-14-00182.1

Pfahl, S., O’Gorman, P.A., Fischer, E.M. (2017). Understanding the regional pattern of projected future changes in extreme precipitation, Nature Climate Change, 7 (6), 423-427. http://dx.doi.org/10.1038/nclimate3287

Steinschneider, S., Ray, P., Rahat, S. H., & Kucharski, J. (2019). A weather‐regime‐based stochastic weather generator for climate vulnerability assessments of water systems in the western United States. Water Resources Research, 55(8), 6923-6945. https://doi.org/10.1029/2018WR024446

Weather Regime-Based Stochastic Weather Generation (Part 1/2)

*This post is written in collaboration with Nasser Najibi from Steinschneider Research Group*

In this blog post, we debut the weather-regime-based stochastic weather generator on the Water Programming Blog! This weather generator was first published in the following article:

Steinschneider, S., Ray, P., Rahat, S. H., & Kucharski, J. (2019). A weather‐regime‐based stochastic weather generator for climate vulnerability assessments of water systems in the western United States. Water Resources Research, 55(8), 6923-6945.

Since its inception, the generator has been applied in studies led by researchers primarily affiliated with Cornell or the Army Corp of Engineers. In this blog post, we’ll step through the key theory and components of this weather generator and show how to run a simple simulation and diagnostics with code that is now provided in an official GitHub repo! Future blog posts in this series will step through more sophisticated examples and additions to the generator. Readers can refer to the paper above for more details on the specific components or methods used in this generator.   

Why use this generator?

If you are looking to develop alternative local weather scenarios informed by climate changes, you can go down two avenues: (1) using projections from General Circulation Models (GCMs) or (2) using statistical weather generation approaches. GCMs have a rich physical representation of ocean and atmospheric processes and are useful to generate internally consistent scenarios that can ultimately be downscaled to the region of interest. However, due to coarse model resolution, GCMs exhibit significant variability and biases in processes that are important for regional planning and vulnerability assessments like precipitation, atmospheric river flux and droughts. These models are also computationally expensive to run and so you are limited to a few storylines of warming scenarios to explore. 

Stochastic weather generators, however, are a complementary way to develop local weather scenarios. They are directly conditioned on the weather of the region of interest and therefore can better capture regional processes and be more applicable to regional water systems planning. This generator can be used to create many scenarios that preserve regional and temporal statistics but are not bounded by the historical record; notably these scenarios can also be generated in a computationally efficient manner. The output from these generators can also be systematically adjusted to create basic climate change scenarios. What is often missing from these statistical models, however, is process information, and this is where the weather-regime-based stochastic generator can be useful.

What are the key components of this generator?

Figure 1: Generator flow chart (Steinschneider et al., 2019)

Figure 1 shows the main components of the first version of the generator, some of which have been extended or altered in newer studies. There are three primary levels to this generator listed below.

Simulation of Weather Regimes: The first step of using this generator is identifying weather regimes that bring regional weather to the application area. Weather regimes are large scale atmospheric flow patterns that tend to persist and organize local weather systems and can effectively be modeled by a Markovian process. Weather regimes are often identified by either clustering or fitting a Hidden Markov Model (HMM) to a selected number of principal components of geopotential height (GPH) anomalies. In many of the studies that use this generator in the Western U.S. (Steinschneider et al., 2019, Najibi et al., 2021, Rahat et al., 2022, Gupta et al., (in review), 500 hPa GPH anomalies are used to identify cool season weather regimes because this GPH effectively capture the weather systems organized by the jet stream. The same GPH anomalies also have been used to identify North Atlantic weather regimes that bring weather systems to Europe (Charlton-Perez et al., 2018). A combination of 500 and 250 hPa GPH anomalies have been used to identify weather regimes in Chile (Meseguer-Ruiz et al., 2020), for example. The choice of number of weather regimes is also region specific. In the case of Western US weather regimes, prior work has developed a framework to identify the optimal number of weather regimes that produce the best results in reproducing regional weather statistics over a large geographic area in California (Najibi et al., 2021). In the case of the Western U.S., 4-5 metastable regimes are defined (both in Guirguis et al., 2020 and Najibi et al., 2021), but other regions might be defined by more. Once weather regimes are identified, then they can be simulated as Markov chains, effectively dictating the large-scale atmospheric flow processes that are taking place on a given day.

Simulation of Local Weather: The next step is a bootstrapping procedure to create local weather. Each day of the historical record is classified to be in a specific weather regime and has a certain value of precipitation and temperature associated with each grid cell in the basin of interest. Thus, when chains of weather regimes are simulated, precipitation and temperature can be bootstrapped from the historical record from time periods where simulated blocks of weather regimes match with historical blocks. The block bootstrapping procedure helps maintain the joint distribution of the weather variables, but an additional copula-based jittering technique helps to broaden the range of simulated precipitation outside of the historical distribution while still maintaining the shape of the underlying distribution.

Dynamic and Thermodynamic Perturbations: Dynamic and thermodynamic climate changes are imposed separately in this generator. This allows a way to systematically understand how specific climate changes lead to changes to local climate. Dynamical changes influence how weather regimes evolve and can change the frequency of weather regimes. This can be done by directly changing the transition probabilities of the weather regimes or indirectly by imposing covariates that dictate the evolution of the weather regimes. In Steinschneider et al. (2019), a Niño 3.4 index is used to force weather regime evolution and is systematically adjusted to create more frequent El Niño and La Niña events. In Gupta et al. (in review), a 600-year long sequence of tree-ring reconstructed principal components of weather regime occurrence are used as an alternative covariate to better capture natural variability inherent in the weather regimes. Thermodynamic trends are applied post generation of local weather and consist of manipulations to temperature or precipitation. Stepwise temperature trends can be added to each grid cell, or elevation-dependent warming can be imposed. Based on an imposed temperature, the mean or an upper quantile of the precipitation distribution can be scaled to reflect an increased capacity for the atmosphere to hold moisture due to increased temperature.

Where can I use this generator?

A weather-regime based stochastic weather generator is not applicable everywhere. It can be utilized for any region in which weather regimes dictate regional weather, which are the midlatitude regions of the Earth. Examples of locations where weather regimes are active and have been identified in studies include locations like the Western U.S, the Great lakes region of the U.S., parts of Western Europe, Morocco, Chile, and New Zealand, among others. If a midlatitude location is not applicable to your case study, consider using other synthetic weather generation techniques extensively covered in this series of posts by Julie Quinn. It is worth noting that the quality of the generator is also going to be dependent on having a sufficient historical precipitation and temperature record to bootstrap off of.

Where can I find the code?

Now, for the first time ever, demo code is publicly available in an official GitHub repo. In this demo, created by Nasser, the user can simulate precipitation and temperature for 12 randomly selected grid cells in the Tuolumne River Basin in California. In its most complex form, the generator can be used to generate weather across 12 basins simultaneously in the Central Valley region of California. Download or clone the GitHub repository and then download the necessary data from the Zenodo archive here. Then, drag these data to the Data folder in the repo. Then, the user can proceed with stepping through the rest of the demo.

  1. You will first simulate weather regimes for a certain number of years using either a parametric (config.WRs.param.NHMM.R) or non-parametric method (config.WRs.non_param.R).
    • Parametric Method: In this method, NHMMs are fitted using the following covariates: the first four PCs of the SPI index over California and two harmonics for the cold season. The two harmonics are the only covariates used during the warm season. From this fitted NHMM, one can simulate an arbitrary sequence of WRs. The parametric method is particularly useful if the user is interested in using specific covariates to force the evolution of the weather regime sequence.
    • Non-parametric Method: Randomly resample a block of m-years (here, m=4 years) of observed WRs to create a sequence of 1000-yr simulated WRs. Because the non-parametric method doesn’t requires covariates, it is particularly effective for flexibly generating very long sequences of weather regimes that can more appropriate capture natural variability for example.
  2. Once the WRs are simulated, you can simulate local precipitation and temperature. This is done by running the config.simulations.R script.
#Demo code- simulates precipitation and temperature for 12 grid cells in the Tuolumne Basin
rm(list=ls())

library(MASS)
library(fExtremes)
library(evmix)
library(zoo)
library(abind)
library(parallel)
library(mvtnorm)
library(tictoc)
library(lubridate)

#adjust main directory and directory for simulation files
mainDir <- "D:/Projects/Tuolumne_River_Basin/GitHub_WGENv2.0"
setwd(mainDir)

dir.to.sim.files <- "./Data/simulated.data.files/WGEN.out"
dir.create(file.path(mainDir, dir.to.sim.files), showWarnings = FALSE)

num.iter <- 1 # A single long trace (e.g., thousand years) is sufficient although we create like 5 ensembles in the simulated WRs
basin.cnt <- 'myPilot' # for a set of 12 randomly selected Livneh grids in the Tuolumne River Basin 
number.years.long <- 3050 # e.g., 500, 1000, 2000, 3000, 5000 years, etc [note: current NHMM output (parametric) is for 1036 years; current non-parametric is for 3050 years]

##############################Define perturbations#######################################
## Climate changes and jitter to apply:
change.list <- data.frame("tc"=  c(0), # e.g., 1, 2, ...
                          "jitter"=  c(TRUE),
                          "pccc"=c( 0), # e.g., 0.07, 0.14, ...
                          "pmuc"=c( 0)# e.g., -.125
)

# For simulating the WRs (i.e., 'config.WRs.non_param.R'; 'config.WRs.param.NHMM.R'), do you use non-parametric or parametric method
use.non_param.WRs <- TRUE #TRUE for non-parametric, FALSE for parametric simulated WRs
####### Choose A (FALSE) or B (TRUE) below for the simulated WRs #################
#-- A) load in NHMM with WRs (parametric)
tmp.list <- readRDS(paste0("./Data/simulated.data.files/WRs.out/final.NHMM.output.rds"))
weather.state.assignments <- tmp.list$WR.historical # this is for BOTH the parametric and non-parametric approach 
sim.weather.state.assignments <- tmp.list$WR.simulation[,1:num.iter] # this is only for the parametric approach of simulating WRs
num.states <- length(unique(as.vector(weather.state.assignments)))    #number of WRs in the model
rm(tmp.list) # for memory

#-- B) load in non-parametric simulation with WRs
np.list.sim.weather.state.assignments <- readRDS(paste0("./Data/simulated.data.files/WRs.out/final.non_param.WRs.output.rds")) #this is only for the non-parametric approach of simulating WRs
num.states <- length(unique(np.list.sim.weather.state.assignments$markov.chain.sim[[num.iter]]))    #number of WRs in the model

# load in supporting functions
files.sources = list.files("./Programs/functions",full.names = TRUE)
my.functions <- sapply(files.sources, source)
##################################

#dates for the historical WRs
start_date_synoptic="1948-01-01"; end_date_synoptic="2021-12-31"
dates.synoptic <- seq(as.Date(start_date_synoptic),as.Date(end_date_synoptic), by="days")
#create dates for the simulated WRs
my.num.sim = ceiling(number.years.long/length(unique(format(dates.synoptic,'%Y')))) # number of chunks of historical periods; e.g., 1 is one set of simulation equal to the historical
long.dates.sim <- rep(dates.synoptic,times=my.num.sim)


  • The overarching parameters of the simulation are listed in Lines 14-23 of the code. Here you can adjust paths to directories, how many iterations you want to run (set to “1” currently) and the length of the simulation.
  • Lines 26-31: These are the thermodynamic climate change parameters. Here you can impose a temperature increase (tc), a precip scaling factor for the upper quantile of the distribution (pccc), and a mean shift to the simulated precipitation distribution (pmuc). Currently, the simulation is set to no imposed climate changes but copula-based jittering is imposed and so is kept as TRUE.  
  • Lines 33-45: Depending on which method was used to simulate the WRs, load in the WRs accordingly, using the blocks of code under either A or B. Be sure to run Line 38 no matter which method was used.  
  1. Lines 62-98 contain more details on precipitation characteristics.
#########Precipitation characteristics#########

#location of obs weather data (RData format): weather data (e.g., precip and temp) as matrices (time x lat|lon: t-by-number of grids); dates vector for time; basin average precip (see the example meteohydro file)
processed.data.meteohydro <- paste0("./Data/processed.data.files/processed.meteohydro/processed.meteohydro.myPilot.RData")
load(processed.data.meteohydro) #load in weather data


qq <- .99              # percentile threshold to separate Gamma and GPD distributions
thshd.prcp <- apply(prcp.site,2,function(x) {quantile(x[x!=0],qq,na.rm=T)})


#Bootstrapping choices###
window.size <- rep(3,length(months))   #the size of the window (in days) from which runs can be bootstrapped around the current day of simulation, by month: Jan -- Dec

trace <- 0.25     #trace prcp threshold. 0.25 mm (for Livneh dataset); or 0.01 inches


#The spearman correlation between basin and site precipitation, used in the copula-based jitters
prcp.basin.site <- cbind(prcp.basin,prcp.site)
S <- cor(prcp.basin.site,method="spearman")
n.sites <- dim(prcp.site)[2] # Number of gridded points for precipitation


#fit emission distributions to each site by month. sites along the columns, parameters for month down the rows
emission.fits.site <- fit.emission(prcp.site=prcp.site,
                                   months=months,
                                   months.weather=months.weather,
                                   n.sites=n.sites,
                                   thshd.prcp=thshd.prcp)


#how often is prcp under threshold by month and site
qq.month <- sapply(1:n.sites,function(i,x=prcp.site,m,mm) {
  sapply(m,function(m) {
    length(which(as.numeric(x[x[,i]!=0 & mm==m & !is.na(x[,i]),i])<=thshd.prcp[i]))/length(as.numeric(x[x[,i]!=0 & mm==m & !is.na(x[,i]),i]))
  })},
  m=months, mm=months.weather)

  • thshd.prcp: the threshold precipitation quantile that dictates the separation between extreme precipitation and the rest of the distribution (set at .99). Precipitation scaling is applied to this upper quantile.
  • trace: the minimum precipitation for which a day is classified as “dry”
  • window.size: The window around which data is bootstrapped from the historical datasetS: The Spearman rank correlation between individual basin sites and the basin average (for the copula-based jittering)
  • The fit.emission.R function which fits gamma and GPD distributions to the prcp.site data. A GPD is fit to the precipitation that lies above the thshd.prcp at each site, while the gamma distribution is fit to the rest of the precipitation data by month. 
  1. Line 122 is where the bootstrapping process occurs (wgen.simulator.R)
##########################Simulate model with perturbations#######################################

#decide whether or not to use historic or simulated WRs
if(use.non_param.WRs) {
  
  dates.sim <- np.list.sim.weather.state.assignments$dates.sim
  markov.chain.sim <- np.list.sim.weather.state.assignments$markov.chain.sim
  identical.dates.idx <- dates.synoptic%in%dates.weather
  weather.state.assignments <- weather.state.assignments[identical.dates.idx]
  
} else {
  dates.sim <- long.dates.sim
  markov.chain.sim <- as.list(data.frame(sim.weather.state.assignments))
}

#run the daily weather generate num.iter times using the num.iter Markov chains
mc.sim <- resampled.date.sim <- resampled.date.loc.sim <- prcp.site.sim <- tmin.site.sim <- tmax.site.sim <-  list()
start_time <- Sys.time()
for (k in 1:num.iter) {
  my.itertime <- Sys.time()
  my.sim <- wgen.simulator(weather.state.assignments=weather.state.assignments,mc.sim=markov.chain.sim[[k]],
                           prcp.basin=prcp.basin,dates.weather=dates.weather,
                           first.month=first.month,last.month=last.month,dates.sim=dates.sim,
                           months=months,window.size=window.size,min.thresh=trace)
  print(paste(k,":", Sys.time()-my.itertime))
  
  #each of these is a list of length iter
  mc.sim[[k]] <- my.sim[[1]]
  resampled.date.sim[[k]] <- my.sim[[2]]
  resampled.date.loc.sim[[k]] <- my.sim[[3]]
  prcp.site.sim[[k]] <- prcp.site[resampled.date.loc.sim[[k]],]
  tmin.site.sim[[k]] <- tmin.site[resampled.date.loc.sim[[k]],]
  tmax.site.sim[[k]] <- tmax.site[resampled.date.loc.sim[[k]],]
}
end_time <- Sys.time(); run.time.sim <- end_time - start_time
print(paste("SIMULATION started at:",start_time,", ended at:",end_time)); print(round(run.time.sim,2))
#remove for memory
rm(my.sim)
  1. Once baseline weather is generated, then perturbations can be applied in Line 156 using the perturb.climate.R function. In this function, temperature trends and precipitation scaling is applied if applicable. If not, just the copula-based jittering is applied. This output is then saved for analysis.
#once the simulations are created, we now apply post-process climate changes (and jitters)
for (change in 1:nrow(change.list)) {
  start_time <- Sys.time()
  cur.tc <- change.list$tc[change]
  cur.jitter <- change.list$jitter[change]
  cur.pccc <- change.list$pccc[change]
  cur.pmuc <- change.list$pmuc[change]
  
  #precipitation scaling (temperature change dependent)
  perc.q <- (1 + cur.pccc)^cur.tc    #scaling in the upper tail for each month of non-zero prcp
  perc.mu <- (1 + cur.pmuc)          #scaling in the mean for each month of non-zero prcp
  
  #perturb the climate from the simulations above (the longest procedure in this function is saving the output files)
  set.seed(1)   #this ensures the copula-based jitterrs are always performed in the same way for each climate change
  perturbed.sim <-  perturb.climate(prcp.site.sim=prcp.site.sim,
                                    tmin.site.sim=tmin.site.sim,
                                    tmax.site.sim=tmax.site.sim,
                                    emission.fits.site=emission.fits.site,
                                    months=months,dates.sim=dates.sim,n.sites=n.sites,
                                    qq=qq,perc.mu=perc.mu,perc.q=perc.q,S=S,cur.jitter=cur.jitter,cur.tc=cur.tc,
                                    num.iter=num.iter,thshd.prcp=thshd.prcp,qq.month=qq.month)
  set.seed(NULL)
  prcp.site.sim.perturbed <- perturbed.sim[[1]]
  tmin.site.sim.perturbed <- perturbed.sim[[2]]
  tmax.site.sim.perturbed <- perturbed.sim[[3]]
  
  #remove for memory
  rm(perturbed.sim)
  end_time <- Sys.time(); run.time.qmap <- end_time - start_time
  print(paste("QMAPPING started at:",start_time,", ended at:",end_time)); print(round(run.time.qmap,2))
  
  #how to name each file name to track perturbations in each set of simulations
  file.suffix <- paste0(".temp.",cur.tc,"_p.CC.scale.",cur.pccc,"_p.mu.scale.",cur.pmuc,"_hist.state.",use.non_param.WRs,"_jitter.",cur.jitter,"_s",num.states,"_with_",num.iter,".",basin.cnt)
  
  print(paste("|---start saving---|"))
  write.output.large(dir.to.sim.files,
                     prcp.site.sim=prcp.site.sim.perturbed,
                     tmin.site.sim=tmin.site.sim.perturbed,
                     tmax.site.sim=tmax.site.sim.perturbed,
                     mc.sim=mc.sim,resampled.date.sim=resampled.date.sim,
                     dates.sim=dates.sim,file.suffix=file.suffix)
  print(paste("|---finished saving---|"))
  
}

#remove for memory
rm(resampled.date.sim,resampled.date.loc.sim,markov.chain.sim,mc.sim)
##################################################################################################
print(paste0("--- done.  state= ", num.states," --- ensemble member:",num.iter))
gc()

# The End
#####################################################################################
  1. The R script plot.statistics.full.diagnostics.long.R can then be used to generate a variety of figures that demonstrate the quality of the generator in preserving and/or expanding around the observed statistics of each site. Options include:
Exceedance probability curves for all 12 sites
Simulated vs. observed maximum precipitation for all 12 sites
Simulated vs. observed 1-20 year droughts consolidated across all locations

And much more! Code is provided for 11 different diagnostic assessments of the quality of the generator. Look out for additions to the repository over the next months and feel free to open a GitHub issue if you have questions!

References

Charlton‐Perez, A. J., Ferranti, L., & Lee, R. W. (2018). The influence of the stratospheric state on North Atlantic weather regimes. Quarterly Journal of the Royal Meteorological Society, 144(713), 1140-1151.

Guirguis, K., Gershunov, A., DeFlorio, M.J., Shulgina, T., Delle Monache, L., Subramanian, A.C., Corringham, T.W. and Ralph, F.M., 2020. Four atmospheric circulation regimes over the North Pacific and their relationship to California precipitation on daily to seasonal timescales. Geophysical Research Letters, 47(16), p.e2020GL087609.

Gupta, R.S., Steinschneider S., Reed, P.M. Understanding Contributions of Paleo-Informed Natural Variability and Climate Changes on Hydroclimate Extremes in the Central Valley Region of California. Authorea. March 13, 2023. DOI: 10.22541/essoar.167870424.46495295/v1

Meseguer-Ruiz, O., Cortesi, N., Guijarro, J. A., & Sarricolea, P. (2020). Weather regimes linked to daily precipitation anomalies in Northern Chile. Atmospheric Research, 236, 104802.

Najibi, N., Mukhopadhyay, S., & Steinschneider, S. (2021). Identifying weather regimes for regional‐scale stochastic weather generators. International Journal of Climatology41(4), 2456-2479.

Rahat, S. H., Steinschneider, S., Kucharski, J., Arnold, W., Olzewski, J., Walker, W., … & Ray, P. (2022). Characterizing hydrologic vulnerability under nonstationary climate and antecedent conditions using a process-informed stochastic weather generator. Journal of Water Resources Planning and Management148(6), 04022028.

Steinschneider, S., Ray, P., Rahat, S. H., & Kucharski, J. (2019). A weather‐regime‐based stochastic weather generator for climate vulnerability assessments of water systems in the western United States. Water Resources Research, 55(8), 6923-6945.

Figure Library Part 2 – Visualizations Supporting Synthetic Streamflow Diagnostics

Motivation

This is the second post in a series which was started by Dave last week and seeks to make a library of plotting functions for common visualization tasks. In this post, I share some plotting functions I have developed for synthetic streamflow diagnostic visualizations.

A thorough review of synthetic streamflow verification is outside the scope of this post, but in short we want to verify that the synthetic timeseries have statistical properties similar to the historic streamflow. Additionally, we want to see that the synthetic timeseries reflect the range of internal variability present in the basin by showing that a large synthetic ensemble envelopes the limited historic observations.

The following functions can be used to compare the following properties of historic and synthetic streamflow:

  • Flow magnitude
  • Non-exceedance probabilities (flow duration curves)
  • Autocorrelation
  • Spatial correlation across locations

Diagnostic plotting functions

  • plot_flow_ranges()
    • Range and median of flow values in historic and synthetic flow timeseries.
  • plot_fdc_ranges()
    • The range of annual flow duration curves (FDCs) for historic and synthetic streamflow timeseries.
  • plot_autocorrelation()
    • Comparison of historic and synthetic autocorrelation across a range of lag values.
  • plot_spatial_correlation()
    • Heatmap visualizations of Pearson correlation between different flow locations in the historic data and a single synthetic realization.

These functions are designed to accept historic (Qh) and synthetic (Qs) data with the following properties:

  • Qh : A pd.Series containing a single historic streamflow timeseries. The index of this data must be type pd.DatetimeIndex.
  • Qs : A pd.DataFrame containing many synthetic streamflow timeseries. The columns should each contain a single synthetic realization and the index must be type pd.DatetimeIndex. The column names do not matter.

Additionally, each function (excluding plot_fdc_ranges()) has a timestep keyword argument which changes the temporal scale of the flow being considered. Options are ‘daily’, ‘weekly’, or ‘monthly’, and the pandas.groupby() function is used to aggregate data accordingly to allow for comparison of statistics at different temporal scales. The default is ‘daily’.

Streamflow data

The historic data used in this example is taken from the Delaware River Basin. Specifically, streamflow data from the USGS gauge at Lordville, NY, on the upper portion of the Delaware River is used.

The synthetic streamflow was generated using the Kirsch-Nowak [1, 2] synthetic streamflow generator. For more detail on this generator, see Julie’s post here: Open Source Streamflow Generator Part 1.

I created 100 synthetic realizations of 30-year, daily streamflow timeseries at the Lordville gauge.

The historic and synthetic streamflow timeseries are assumed to start at the same date. Most of the plotting functions below do not explicitly consider the date associated with the timeseries, however comparisons between flow statistics across some range of time assume that the timeseries are aligned for the same dates.


Flow magnitude range

This first plot simply shows the range (max, min) and median of flows for both historic and synthetic datasets.

def plot_flow_ranges(Qh, Qs, 
                     timestep = 'daily',
                     units = 'cms', y_scale = 'log',
                     savefig = False, fig_dir = '.',
                     figsize = (7,5), colors = ['black', 'orange'],
                     title_addon = ""):
    """Plots the range of flow for historic and syntehtic streamflows for a specific timestep scale.
    
    Args:
        Qh (pd.Series): Historic daily streamflow timeseries. Index must be pd.DatetimeIndex. 
        Qs (pd.DataFrame): Synthetic daily streamflow timeseries realizations. Each column is a unique realization. Index must be pd.DatetimeIndex.
        timestep (str, optional): The timestep which data should be aggregated over. Defaults to 'daily'. Options are 'daily', 'weekly', or 'monthly'.
        units (str, optional): Streamflow units, for axis label. Defaults to 'cms'.
        y_scale (str, optional): Scale of the y-axis. Defaults to 'log'.
        savefig (bool, optional): Allows for png to be saved to fig_dir. Defaults to False.
        fig_dir (str, optional): Location of saved figure output. Defaults to '.' (working directory).
        figsize (tuple, optional): The figure size. Defaults to (4,4).
        colors (list, optional): List of two colors for historic and synthetic data respectively. Defaults to ['black', 'orange'].
        title_addon (str, optional): Text to be added to the end of the title. Defaults to "".
    """

    # Assert formatting matches expected
    assert(type(Qh.index) == pd.DatetimeIndex), 'Historic streamflow (Qh) should have pd.DatatimeIndex.'
    assert(type(Qs.index) == pd.DatetimeIndex), 'Synthetic streamflow (Qh) should have pd.DatatimeIndex.'

    # Handle conditional datetime formatting
    if timestep == 'daily':
        h_grouper = Qh.index.dayofyear
        s_grouper = Qs.index.dayofyear
        x_lab = 'Day of the Year (Jan-Dec)'
    elif timestep == 'monthly':
        h_grouper = Qh.index.month
        s_grouper = Qs.index.month
        x_lab = 'Month of the Year (Jan-Dec)'
    elif timestep == 'weekly':
        h_grouper = pd.Index(Qh.index.isocalendar().week, dtype = int)
        s_grouper = pd.Index(Qs.index.isocalendar().week, dtype = int)
        x_lab = 'Week of the Year (Jan-Dec)'
    else:
        print('Invalid timestep input. Options: "daily", "monthly", "weekly".')
        return

    # Find flow ranges
    s_max = Qs.groupby(s_grouper).max().max(axis=1)
    s_min = Qs.groupby(s_grouper).min().min(axis=1)
    s_median = Qs.groupby(s_grouper).median().median(axis=1)
    h_max = Qh.groupby(h_grouper).max()
    h_min = Qh.groupby(h_grouper).min()
    h_median = Qh.groupby(h_grouper).median()
  
    ## Plotting  
    fig, ax = plt.subplots(figsize = figsize, dpi=150)
    xs = h_max.index
    ax.fill_between(xs, s_min, s_max, color = colors[1], label = 'Synthetic Range', alpha = 0.5)
    ax.plot(xs, s_median, color = colors[1], label = 'Synthetic Median')
    ax.fill_between(xs, h_min, h_max, color = colors[0], label = 'Historic Range', alpha = 0.3)
    ax.plot(xs, h_median, color = colors[0], label = 'Historic Median')
    
    ax.set_yscale(y_scale)
    ax.set_ylabel(f'{timestep.capitalize()} Flow ({units})', fontsize=12)
    ax.set_xlabel(x_lab, fontsize=12)
    ax.legend(ncols = 2, fontsize = 10, bbox_to_anchor = (0, -.5, 1.0, 0.2), loc = 'upper center')    
    ax.grid(True, linestyle='--', linewidth=0.5)
    ax.set_title(f'{timestep.capitalize()} Streamflow Ranges\nHistoric & Synthetic Timeseries at One Location\n{title_addon}')
    plt.tight_layout()
    
    if savefig:
        plt.savefig(f'{fig_dir}/flow_ranges_{timestep}.png', dpi = 150)
    return

Assuming I have loaded a pd.DataFrame named Qs which contains synthetic timeseries in each column, and a single historic timeseries Qh, I can quickly plot daily (or weekly or monthly) flow ranges using:

# Useage
plot_flow_ranges(Qh, Qs, timestep='daily')

This figure provides a lot of useful information about our synthetic ensemble:

  • The synthetic ensembles envelope the limited historic values, and include more extreme high and low flows
  • The seasonal trends in the synthetics align with the historic
  • The median of the synthetic ensemble aligns very closely with the median of the historic data.

Flow duration curve ranges

The plot_fdc_ranges() function will calculate an FDC for each year of the historic and synthetic data, and plot the ranges that these FDCs take. It will also "total" FDCs calculated for the entirety of the historic or synthetic records.

def plot_fdc_ranges(Qh, Qs, 
                    units = 'cms', y_scale = 'log',
                    savefig = False, fig_dir = '.',
                    figsize = (5,5), colors = ['black', 'orange'],                   
                    title_addon = ""):
    """Plots the range and aggregate flow duration curves for historic and synthetic streamflows.
    
    Args:
        Qh (pd.Series): Historic daily streamflow timeseries. Index must be pd.DatetimeIndex. 
        Qs (pd.DataFrame): Synthetic daily streamflow timeseries realizations. Each column is a unique realization. Index must be pd.DatetimeIndex.
        units (str, optional): Streamflow units, for axis label. Defaults to 'cms'.
        y_scale (str, optional): Scale of the y-axis. Defaults to 'log'.
        savefig (bool, optional): Allows for png to be saved to fig_dir. Defaults to False.
        fig_dir (str, optional): Location of saved figure output. Defaults to '.' (working directory).
        figsize (tuple, optional): The figure size. Defaults to (4,4).
        colors (list, optional): List of two colors for historic and synthetic data respectively. Defaults to ['black', 'orange'].
        title_addon (str, optional): Text to be added to the end of the title. Defaults to "".
    """
    
    ## Assertions
    assert(type(Qs) == pd.DataFrame), 'Synthetic streamflow should be type pd.DataFrame.'
    assert(type(Qh.index) == pd.DatetimeIndex), 'Historic streamflow (Qh) should have pd.DatatimeIndex.'
    assert(type(Qs.index) == pd.DatetimeIndex), 'Synthetic streamflow (Qh) should have pd.DatatimeIndex.'

    
    # Calculate FDCs for total period and each realization
    nonexceedance = np.linspace(0.0001, 0.9999, 50)
    s_total_fdc = np.quantile(Qs.values.flatten(), nonexceedance)
    h_total_fdc = np.quantile(Qh.values.flatten(), nonexceedance) 
    
    s_fdc_max = np.zeros_like(nonexceedance)
    s_fdc_min = np.zeros_like(nonexceedance)
    h_fdc_max = np.zeros_like(nonexceedance)
    h_fdc_min = np.zeros_like(nonexceedance)

    annual_synthetics = Qs.groupby(Qs.index.year)
    annual_historic = Qh.groupby(Qh.index.year)

    for i, quant in enumerate(nonexceedance):
            s_fdc_max[i] = annual_synthetics.quantile(quant).max().max()
            s_fdc_min[i] = annual_synthetics.quantile(quant).min().min()
            h_fdc_max[i] = annual_historic.quantile(quant).max()
            h_fdc_min[i] = annual_historic.quantile(quant).min()
    
    ## Plotting
    fig, ax = plt.subplots(figsize=figsize, dpi=200)
    
    #for quant in syn_fdc_quants:
    ax.fill_between(nonexceedance, s_fdc_min, s_fdc_max, color = colors[1], label = 'Synthetic Annual FDC Range', alpha = 0.5)
    ax.fill_between(nonexceedance, h_fdc_min, h_fdc_max, color = colors[0], label = 'Historic Annual FDC Range', alpha = 0.3)

    ax.plot(nonexceedance, s_total_fdc, color = colors[1], label = 'Synthetic Total FDC', alpha = 1)
    ax.plot(nonexceedance, h_total_fdc, color = colors[0], label = 'Historic Total FDC', alpha = 1, linewidth = 2)

    ax.grid(True, linestyle='--', linewidth=0.5)
    ax.set_yscale(y_scale)
    ax.set_ylabel(f'Flow ({units})')
    ax.set_xlabel('Non-Exceedance Probability')
    ax.legend(fontsize= 10)
    ax.grid(True, linestyle='--', linewidth=0.5)
    
    plt.title(f'Flow Duration Curve Ranges\nHistoric & Synthetic Streamflow\n{title_addon}')
    if savefig:
        plt.savefig(f'{fig_dir}/flow_duration_curves_{title_addon}.png', dpi=200)
    plt.show()
    return

## Usage
plot_fdc_ranges(Qh, Qs)

Similar to the flow range plot above, we see that the synthetic ensemble is nicely enveloping the historic data; there are synthetic instances where the extreme flows in a particular year are more extreme than historic observations. However, the FDC of the total synthetic ensemble (if all synthetics were combined) is almost exactly equal to the total historic FDC.

Autocorrelation

Hydrologic systems are known to have memory, or correlation between streamflow at different points in time. We want to make sure that our synthetic streamflow timeseries have similar memory or autocorrelation.

The plot_autocorrelation() function will calculate autocorrelation for the historic record, and each synthetic realization for a range of different lag values. In addition to the required Qh and Qs arguments this function requires lag_range which should be a list or array of lag values to be considered.

def plot_autocorrelation(Qh, Qs, lag_range, 
                         timestep = 'daily',
                         savefig = False, fig_dir = '.',
                         figsize=(7, 5), colors = ['black', 'orange'],
                         alpha = 0.3):
    """Plot autocorrelation of historic and synthetic flow over some range of lags.

    Args:
        Qh (pd.Series): Historic daily streamflow timeseries. Index must be pd.DatetimeIndex. 
        Qs (pd.DataFrame): Synthetic daily streamflow timeseries realizations. Each column is a unique realization. Index must be pd.DatetimeIndex.
        lag_range (iterable): A list or range of lag values to be used for autocorrelation calculation.
        
        timestep (str, optional): The timestep which data should be aggregated over. Defaults to 'daily'. Options are 'daily', 'weekly', or 'monthly'.
        savefig (bool, optional): Allows for png to be saved to fig_dir. Defaults to False.
        fig_dir (str, optional): Location of saved figure output. Defaults to '.' (working directory).
        figsize (tuple, optional): The figure size. Defaults to (4,4).
        colors (list, optional): List of two colors for historic and synthetic data respectively. Defaults to ['black', 'orange'].
        alpha (float, optional): The opacity of synthetic data. Defaults to 0.3.
    """
    
    ## Assertions
    assert(type(Qs) == pd.DataFrame), 'Synthetic streamflow should be type pd.DataFrame.'
    assert(type(Qh.index) == pd.DatetimeIndex), 'Historic streamflow (Qh) should have pd.DatatimeIndex.'
    assert(type(Qs.index) == pd.DatetimeIndex), 'Synthetic streamflow (Qh) should have pd.DatatimeIndex.'

    if timestep == 'monthly':
        Qh = Qh.resample('MS').sum()
        Qs = Qs.resample('MS').sum()
        time_label = 'months'
    elif timestep == 'weekly':
        Qh = Qh.resample('W-SUN').sum()
        Qs = Qs.resample('W-SUN').sum()
        time_label = f'weeks'
    elif timestep == 'daily':
        time_label = f'days'
        
    # Calculate autocorrelations
    autocorr_h = np.zeros(len(lag_range))
    confidence_autocorr_h = np.zeros((2, len(lag_range)))
    for i, lag in enumerate(lag_range):
        h_corr = pearsonr(Qh.values[:-lag], Qh.values[lag:])
        autocorr_h[i] = h_corr[0]
        confidence_autocorr_h[0,i] = h_corr[0] - h_corr.confidence_interval().low
        confidence_autocorr_h[1,i] = h_corr.confidence_interval().high - h_corr[0]
    
    autocorr_s = np.zeros((Qs.shape[1], len(lag_range)))
    for i, realization in enumerate(Qs.columns):
        autocorr_s[i] = [pearsonr(Qs[realization].values[:-lag], Qs[realization].values[lag:])[0] for lag in lag_range]


    ## Plotting
    fig, ax = plt.subplots(figsize=figsize, dpi=200)

    # Plot autocorrelation for each synthetic timeseries
    for i, realization in enumerate(Qs.columns):
        if i == 0:
            ax.plot(lag_range, autocorr_s[i], alpha=1, color = colors[1], label=f'Synthetic Realization')
        else:
            ax.plot(lag_range, autocorr_s[i], color = colors[1], alpha=alpha, zorder = 1)
        ax.scatter(lag_range, autocorr_s[i], alpha=alpha, color = 'orange', zorder = 2)

    # Plot autocorrelation for the historic timeseries
    ax.plot(lag_range, autocorr_h, color=colors[0], linewidth=2, label='Historic', zorder = 3)
    ax.scatter(lag_range, autocorr_h, color=colors[0], zorder = 4)
    ax.errorbar(lag_range, autocorr_h, yerr=confidence_autocorr_h, 
                color='black', capsize=4, alpha=0.75, label ='Historic 95% CI')

    # Set labels and title
    ax.set_xlabel(f'Lag ({time_label})', fontsize=12)
    ax.set_ylabel('Autocorrelation (Pearson)', fontsize=12)
    ax.set_title('Autocorrelation Comparison\nHistoric Timeseries vs. Synthetic Ensemble', fontsize=14)

    ax.legend(fontsize=10)
    ax.grid(True, linestyle='--', linewidth=0.5)
    plt.subplots_adjust(left=0.12, right=0.95, bottom=0.12, top=0.92)

    if savefig:
        plt.savefig(f'{fig_dir}/autocorrelation_plot_{timestep}.png', dpi=200, bbox_inches='tight')
    plt.show()
    return

Let’s visualize the autocorrelation of daily streamflow for a maximum of 100-day lag.

## Usage
plot_autocorrelation(Qh, Qs, lag_range=range(1,100, 5), timestep='daily') 

Spatial Correlation

The plot_spatial_correlation() function generates a heatmap visualization of the Pearson correlation matrix for streamflow at different locations in both the historic and synthetic datasets.

The plot_spatial_correlation() function is different from the other three functions shown above, since it requires a set of timeseries data for multiple different locations.

Rather than providing a set of many different realizations for a single site we need to provide a single realization of synthetics across all sites of interest, with a timeseries for each site in each column. This new input is Qs_i, and must the same columns as Qh (they both must contain a timeseries for each location.

def plot_correlation(Qh, Qs_i,
                        timestep = 'daily',
                        savefig = False, fig_dir = '.',
                        figsize = (8,4), color_map = 'BuPu',
                        cbar_on = True):
    """Plots the side-by-side heatmaps of flow correlation between sites for historic and synthetic multi-site data.
    
    Args:
        Qh (pd.DataFrame): Historic daily streamflow timeseries at many different locations. The columns of Qh and Qs_i must match, and index must be pd.DatetimeIndex. 
        Qs (pd.DataFrame): Synthetic daily streamflow timeseries realizations. Each column is a unique realization. The columns of Qh and Qs_i must match, and index must be pd.DatetimeIndex.
        timestep (str, optional): The timestep which data should be aggregated over. Defaults to 'daily'. Options are 'daily', 'weekly', or 'monthly'.
        savefig (bool, optional): Allows for png to be saved to fig_dir. Defaults to False.
        fig_dir (str, optional): Location of saved figure output. Defaults to '.' (working directory).
        figsize (tuple, optional): The figure size. Defaults to (4,4).
        color_map (str, optional): The colormap used for the heatmaps. Defaults to 'BuPu.
        cbar_on (bool, optional): Indictor if the colorbar should be shown or not. Defaults to True.
    """
    
    ## Assertions
    assert(type(Qh) == type(Qs_i) and (type(Qh) == pd.DataFrame)), 'Both inputs Qh and Qs_i should be pd.DataFrames.'
    assert(np.mean(Qh.columns == Qs_i.columns) == 1.0), 'Historic and synthetic data should have same columns.'
    
    if timestep == 'monthly':
        Qh = Qh.resample('MS').sum()
        Qs_i = Qs_i.resample('MS').sum()
    elif timestep == 'weekly':
        Qh = Qh.resample('W-SUN').sum()
        Qs_i = Qs_i.resample('W-SUN').sum()
    
    ## Plotting
    fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=figsize, dpi = 250)
    
    # Heatmap of historic correlation
    h_correlation = np.corrcoef(Qh.values.transpose())
    sns.heatmap(h_correlation, ax = ax1, 
                square = True, annot = False,
                xticklabels = 5, yticklabels = 5, 
                cmap = color_map, cbar = False, vmin = 0, vmax = 1)

    # Synthetic correlation
    s_correlation = np.corrcoef(Qs_i.values.transpose())
    sns.heatmap(s_correlation, ax = ax2,
                square = True, annot = False,
                xticklabels = 5, yticklabels = 5, 
                cmap = color_map, cbar = False, vmin = 0, vmax = 1)
    
    for axi in (ax1, ax2):
        axi.set_xticklabels([])
        axi.set_yticklabels([])
    ax1.set(xlabel = 'Site i', ylabel = 'Site j', title = 'Historic Streamflow')
    ax2.set(xlabel = 'Site i', ylabel = 'Site j', title = 'Synthetic Realization')
    title_string = f'Pearson correlation across different streamflow locations\n{timestep.capitalize()} timescale\n'
        
    # Adjust the layout to make room for the colorbar
    if cbar_on:
        plt.tight_layout(rect=[0, 0, 0.9, 0.85])
        cbar_ax = fig.add_axes([0.92, 0.1, 0.02, 0.6])  # [left, bottom, width, height]
        cbar = fig.colorbar(ax1.collections[0], cax=cbar_ax)
        cbar.set_label("Pearson Correlation")
    plt.suptitle(title_string, fontsize = 12)
    
    if savefig:
        plt.savefig(f'{fig_dir}/spatial_correlation_comparison_{timestep}.png', dpi = 250)
    plt.show()
    return

To use this function, I need to load a different pd.DataFrame containing synthetic timeseries at all of my study locations.

## Usage
single_synthetic_all_sites = pd.read_csv('synthetic_streamflow_single_realization_all_sites.csv', sep=',', index_col=0, parse_dates=True)

plot_correlation(Qh_all_sites, single_synthetic_all_sites, timestep='weekly')

Across the different streamflow sites, we see that the synthetic timeseries do a decent job at preserving spatial correlation, although correlation is decreased for sites which are not strongly correlated in the historic record.

Conclusions

I hope these functions save you some time in your own work, and allow you expand your synthetic diagnostic considerations.

References

[1] Kirsch, B. R., Characklis, G. W., & Zeff, H. B. (2013). Evaluating the impact of alternative hydro-climate scenarios on transfer agreements: Practical improvement for generating synthetic streamflows. Journal of Water Resources Planning and Management139(4), 396-406.

[2] Nowak, K., Prairie, J., Rajagopalan, B., & Lall, U. (2010). A nonparametric stochastic approach for multisite disaggregation of annual to daily streamflow. Water Resources Research46(8).