Plotting trajectories and direction fields for a system of ODEs in Python

The aim of this post is to guide the reader through plotting trajectories and direction fields for a system of equations in Python. This is useful when investigating the equilibria and stability of the system, and to facilitate in understanding the general behavior of a system under study. I will use a system of predator-prey equations, that my most devoted online readers are already familiar with from my previous posts on identifying equilibria and stability, and on nondimensionalization. Specifically, I’ll be using the Lotka-Volterra set of equations with Holling’s Type II functional response:

\frac{\mathrm{d} x}{\mathrm{d} t}=bx\left ( 1-\frac{x}{K} \right )-\frac{axy}{1+ahx}

\frac{\mathrm{d} y}{\mathrm{d} t}=\frac{caxy}{1+ahx}-dy

where:

x: prey abundance

y: predator abundance

b: prey growth rate

d: predator death rate

c: rate with which consumed prey is converted to predator

a: rate with which prey is killed by a predator per unit of time

K: prey carrying capacity given the prey’s environmental conditions

h: handling time

This system has 3 equilibria: when both species are dead (0,0), when predators are dead and the prey grows to its carrying capacity (K,0) and a non-trivial equilibrium where both species coexist and is generally more interesting, given by:

y^*=\frac{b}{a}(1+ahx^*)\left(1-\frac{x^*}{K} \right)

x^*=\frac{d}{a(c-dh)}

The following code should produce both trajectories and direction fields for this system of ODEs (python virtuosos please excuse the extensive commenting, I try to comment as much as possible for people new to python):

import numpy as np
from matplotlib import pyplot as plt
from scipy import integrate

# I'm using this style for a pretier plot, but it's not actually necessary
plt.style.use('ggplot')

"""
This is to ignore RuntimeWarning: invalid value encountered in true_divide
I know that when my populations are zero there's some division by zero and
the resulting error terminates my function, which I want to avoid in this case.
"""
np.seterr(divide='ignore', invalid='ignore')

# These are the parameter values we'll be using
a = 0.005
b = 0.5
c = 0.5
d = 0.1
h = 0.1
K = 2000

# Define the system of ODEs
# P[0] is prey, P[1] is predator
def fish(P, t=0):
    return ([b*P[0]*(1-P[0]/K) - (a*P[0]*P[1])/(1+a*h*P[0]),
            c*(a*P[0]*P[1])/(1+a*h*P[0]) - d*P[1] ])

# Define equilibrium point
EQ = ([d/(a*(c-d*h)),b*(1+a*h*(d/(a*(c-d*h))))*(1-(d/(a*(c-d*h)))/K)/a])

"""
I need to define the possible values my initial points will take as they
relate to the equilibrium point. In this case I chose to plot 10 trajectories
ranging from 0.1 to 5
"""
values = np.linspace(0.1, 5, 10)
# I want each trajectory to have a different color
vcolors = plt.cm.autumn_r(np.linspace(0.1, 1, len(values)))

# Open figure
f = plt.figure()
"""
I need to define a range of time over which to integrate the system of ODEs
The values don't really matter in this case because our system doesn't have t
on the right hand side of dx/dt and dy/dt, but it is a necessary input for
integrate.odeint.
"""
t = np.linspace(0, 150, 1000)

# Plot trajectories by looping through the possible values
for v, col in zip(values, vcolors):
    # Starting point of each trajectory
    P0 = [E*v for E in EQ]
    # Integrate system of ODEs to get x and y values
    P = integrate.odeint(fish, P0, t)
    # Plot each trajectory
    plt.plot( P[:,0], P[:,1],
            # Different line width for different trajectories (optional)
            lw=0.5*v,
            # Different color for each trajectory
            color=col,
            # Assign starting point to trajectory label
            label='P0=(%.f, %.f)' % ( P0[0], P0[1]) )
"""
To plot the direction fields we first need to define a grid in order to
compute the direction at each point
"""
# Get limits of trajectory plot
ymax = plt.ylim(ymin=0)[1]
xmax = plt.xlim(xmin=0)[1]
# Define number of points
nb_points = 20
# Define x and y ranges
x = np.linspace(0, xmax, nb_points)
y = np.linspace(0, ymax, nb_points)
# Create meshgrid
X1 , Y1 = np.meshgrid(x,y)
# Calculate growth rate at each grid point
DX1, DY1 = fish([X1, Y1])
# Direction at each grid point is the hypotenuse of the prey direction and the
# predator direction.
M = (np.hypot(DX1, DY1))
# This is to avoid any divisions when normalizing
M[ M == 0] = 1.
# Normalize the length of each arrow (optional)
DX1 /= M
DY1 /= M

plt.title('Trajectories and direction fields')
"""
This is using the quiver function to plot the field of arrows using DX1 and
DY1 for direction and M for speed
"""
Q = plt.quiver(X1, Y1, DX1, DY1, M, pivot='mid', cmap=plt.cm.plasma)
plt.xlabel('Prey abundance')
plt.ylabel('Predator abundance')
plt.legend(bbox_to_anchor=(1.05, 1.0))
plt.grid()
plt.xlim(0, xmax)
plt.ylim(0, ymax)
plt.show()

This should produce the following plot. All P0s are the initial conditions we defined.
trajectories

We can also see that this parameter combination produces limit cycles in our system. If we change the parameter values to:

a = 0.005
b = 0.5
c = 0.5
d = 0.1
h = 0.1
K = 200

i.e. reduce the available resources to the prey, our trajectories look like this:

trajectories1

The equilibrium becomes stable, attracting the trajectories to it.

The same can be seen if we increase the predator death rate:

a = 0.005
b = 0.5
c = 0.5
d = 1.5
h = 0.1
K = 2000

trajectories2

The implication of this observation is that an initially stable system, can become unstable given more resources for the prey or less efficient predators. This has been referred to as the Paradox of Enrichment and other predator-prey models have tried to address it (more on this in future posts).

P.S: I would also like to link to this scipy tutorial, that I found very helpful and that contains more plotting tips.

Advertisements

Transitioning from R to Python with Spyder

I am currently in the midst of transitioning from R, my preferred programming language, over to Python. While the syntax of the languages is similar, I was quickly overwhelmed with the choices of Integrated Development Environments (IDEs) and text editors for Python. Choosing an IDE is a deeply personal choice and the one you choose depends on your skill level and programming needs. I have tried out many different IDEs, and, for the purpose of making a smooth, intuitive, transition from R to Python, the best one for me was Spyder (Scientific Python Development Environment). In this blog post, I will give an overview of the Spyder environment and step through some of its functionalities.

Installation

The easiest way to install Spyder is through a Python Scientific Distribution found here. There are three options, but I chose to install Anaconda which gives you the core Python language, over 100 main Python libraries, and Spyder. It is an incredibly efficient way to get everything you need in just one download and works for both Windows and Mac. Once this is installed, you can open Spyder immediately.

 Environment

Figure 1: Spyder Environment

The first aspect that I like about Spyder is how similar it looks to RStudio and Matlab, as shown in Figure 1. This made the transition very easy for me. As shown in Figure 2, the Spyder environment is comprised of a collection of panes which can be repositioned by dragging if a different format is more intuitive to the user. To see which panes are open, click View->Panes. The most useful panes will already be open by default. You can choose to keep either the console or the IPython console. This is a matter of preference and I chose to use the regular console.

Figure 2: Choosing Panes

At the top of the screen is your directory, which, by default, is set to the folder which contains Anaconda. You can change it to your preferred location on your computer by clicking the folder icon next to the drop down arrow.

Editor

The leftmost pane is the editor which is where code can be written. The Spyder editor has features such as syntax coloring and real-time code analysis. By default, a temporary script, temp.py, will be open. Go ahead and save this in your current directory. Make sure that the file shown in the gray bar matches your directory (shown in Figure 3).

Figure 3: Setting a Directory

Let’s write a simple script to test out the environment (shown in Figure 4).

Figure 4: Sample Script and Run Settings

Click the green arrow at the top of the screen to run the script. A box will pop up with Run Settings. Make sure the working directory is correct and click “Run.” If you just want to run a certain section of the script, you can highlight that section and click the second green arrow with the blue and orange box.

Console

The results from the script will appear in the console, which is my bottom right pane. The user can also execute a command directly in this console.

Figure 5: Console

Object Inspector/Variable Explorer/File Explorer

The last major aspect of the environment is the top right pane, which is a comprised of three tabs. The first tab is the object inspector, which is analogous to RStudio’s “help” tab. You can search for information on libraries, functions, modules, and classes.

Figure 6: Object Inspector

The second tab is the variable explorer, which is the same as RStudio’s “Environment” tab. This tab conveniently shows the type, size, and value of your variables. The results from our test script are shown in Figure 7.

Figure 7: Variable Explorer

Finally, the last tab is a file explorer which lists out all of the files and folder in your the current directory.

Debugging with Spyder

The Python debugger, pdb, is partly integrated into Spyder. The debugging tools are located in blue, adjacent to the green “run” buttons. By double-clicking specific lines in the code, the user can set breakpoints where the debugger will stop and results from the debugger are displayed in the console.

Figure 8: Debugging Tools

 

Those are the main components of Spyder! As you can see, it is a fairly uncomplicated and intuitive IDE. Hopefully this overview will make the transition from R or Matlab to Python much easier. Go forth and conquer!

 

Jupyter Notebook: A “Hello World” Overview

Jupyter Notebook: A “Hello World” Overview

Jupyter Notebook: Overview

When first learning Python, I was introduced to Jupyter Notebook as an extremely effective IDE for group-learning situations. I’ve since used this browser-based interactive shell for homework assignments, data exploration and visualization, and data processing. The functionality of Jupyter Notebook extends well past simple development and showcasing of code as it can be used with almost any Python library (except for animated figures right before a deadline). Jupyter Notebook is my go-to tool when I am writing code on the go.

As a Jupyter Notebook martyr, I must point out that Jupyter Notebooks can be used for almost anything imaginable. It is great for code-oriented presentations that allow for running live code, timing of lines of code and other magic functions, or even just sifting through data for processing and visualization. Furthermore, if documented properly, Jupyter Notebook can be used as an easy guide for stepping people through lessons. For example, check out the structure of this standalone tutorial for NumPy—download and open it in Jupyter Notebook for the full experience. In a classroom setting, Jupyter Notebook can utilize nbgrader to create quizzes and assignments that can be automatically graded. Alas, I am still trying to figure out how to make it iron my shirt.

1_XMB2FXE3sN4FTwaO8dMMeA

A Sample Jupyter Notebook Presentation (credit: Matthew Speck)

One feature of Jupyter Notebook is that it can be used for a web application on a server-client structure to allow for users to interact remotely via ssh or http. In an example is shown here, you can run Julia on this website even if it is not installed locally. Furthermore, you can use the Jupyter Notebook Viewer to share notebooks online.  However, I have not yet delved into these areas as of yet.

For folks familiar with Python libraries through the years, Jupyter Notebook evolved from IPython and has overtaken its niche. Notably, it can be used for over 40 languages—the original intent was to create an interface for Julia, Python and R, hence Ju-Pyt-R— including Python, R, C++, and more. However, I have only used it for Python and each notebook kernel will run in a single native language (although untested workaround exist).

Installing and Opening the Jupyter Notebook Dashboard

While Jupyter Notebook comes standard with Anaconda, you can easily install it via pip or by checking out this link.

As for opening and running Jupyter Notebook, navigate to the directory (in this case, I created a directory in my username folder titled ‘Example’) you want to work out of in your terminal (e.g. Command Prompt in Windows, Terminal in MacOS) and run the command ‘jupyter notebook’.

command_prompt_start

Opening the Command Prompt

Once run, the following lines appear in your terminal but are relatively unimportant. The most important part is being patient and waiting for it to open in your default web browser—all mainstream web browsers are supported, but I personally use Chrome.

 

 

If at any time you want to exit Jupyter Notebook, press Ctrl + C twice in your terminal to immediately shut down all running kernels (Windows and MacOS). Note that more than one instance of Jupyter Notebook can be running by utilizing multiple terminals.

Creating a Notebook

Once Jupyter Notebook opens in your browser, you will encounter the dashboard. All files and subdirectories will be visible on this page and can generally be opened or examined.

jupyter_start

Initial Notebook Dashboard Without Any Files

If you want to create a shiny new Notebook to work in, click on ‘New’ and select a new Notebook in the language of your choice (shown below). In this case, only Python 3 has been installed and is the only option available. Find other language kernels here.

jupyter_open

Opening a New Notebook

Basic Operations in Jupyter Notebook

Once opened, you will find an untitled workbook without a title or text. To edit the title, simply left-click on ‘Untitled’ and enter your name of choice.

jupyter_new

Blank Jupyter Notebook

To write code, it is the same as writing a regular Python script in any given text editor. You can divide your code into separate sections that are run independently instead of running the entire script again. However, when importing libraries and later using them, you must run the corresponding lines to import them prior to using the aforementioned libraries.

To run code, simply press Shift + Enter while the carat—the blinking text cursor—is in the cell.

jupyter_hello_world

Jupyter Notebook with Basic Operations

After running any code through a notebook, the file is automatically backed up in a hidden folder in your working directory. Note that you cannot directly open the notebook (IPYNB File) by double-clicking on the file. Rather, you must reopen Jupyter Notebook and access it through the dashboard.

jupyter_folder_windows

Directory where Sample Jupyter Notebook Has Been Running

As shown below, you can easily generate and graph data in line. This is very useful when wanting to visualize data in addition to modifying a graphic (e.g. changing labels or colors). These graphics are not rendered at the same DPI as a saved image or GUI window by default but can be changed by modifying matplotlib’s rcParams.

jupyter_graphic

Example Histogram in Jupyter Notebook

Conclusion

At this point, there are plenty of directions you can proceed. I would highly suggest exploring some of the widgets available which include interesting interactive visualizations. I plan to explore further applications in future posts, so please feel free to give me a yell if you have any ideas.

Preparing Data for a Time Series Analysis

This semester, I had the opportunity to take Dr. Scott Steinschneider’s new class, Hydrologic Engineering in a Changing Climate, that is offered here at Cornell. In this class, we covered time series analysis, extreme value modeling, and trend tests. I chose to do a final project which focused on using a time series approach to forecast electricity demand in California. As I worked through my project, it became apparent to me that data are rarely in the form where a time series model can be applied directly. Consequently, multiple transformations are usually necessary before a model can be fit to the data set. In this post, I outline some of the inherent characteristics of data that might warrant a transformation as well as the steps that can be taken to address these problems.

Given a data set that you would like to fit any Box-Jenkins model to, you should ask yourself the following two questions?

  1. Is normality a reasonable assumption for the residuals?
  2. Are the data stationary?

Normality

Normality can be checked before fitting the model because if the original data are not normal, then there is a good chance that the residuals won’t be as well. If you fit a histogram to the data and it looks like Figure 1, you probably need to apply some form of normalizing transformation.

Rplot03

Figure 1: Histogram of Monthly Flow

Two traditional transformations that you can try are a log transform or a Box-Cox transform, shown in the following two equations, where xt is the original data point.

Log Transform:

Picture1

Box-Cox Transform:

Picture3

Sometimes a log transformation can be too drastic and skew the data the opposite way. The Box-Cox transform is effectively a less intense transformation that one can try if the log transform is not suitable. Note that when λ=0, the Box-Cox transform reduces to a simple log transform.

The powerTransform function in the R package, car, can be used to find a lambda that will maximize normality.

Stationarity

For a data set to exhibit stationarity, the following three principles must be true for us to be confident that our model will represent our data well:

For some lag term, s,

  1. E[xt]=E[xt+s] (The mean of the data set does not change with time)
  2. Var[xt]=Var[xt+s] (The variance of the data set does not change with time)
  3. Cov[xt,xt+s]= γ (The covariance between data points is some constant value,γ)

Outlined below are some of the characteristics of a data set that can cause a violation of one or more of these principles.

Seasonality

Seasonality in data can exist if a time series pattern repeats over a fixed and known period. Figure 2 shows monthly inflow into the Schoharie Creek Reservoir. Periodicity is apparent, but it isn’t until we look at the autocorrelation function (ACF) of the data, shown in Figure 3, that we see that there is a clear repetition occurring every 12 months.

Inflow

Figure 2: Monthly Inflow for the Schoharie Creek

ACF

Figure 3: ACF of Monthly Inflow

One effective way to get rid of this monthly seasonality is to use the following de-seasonalizing equation:

Picture4

The seasonality is removed from each data point by subtracting the corresponding monthly mean (xmt) and dividing by the month’s standard deviation ( smt). This equation can also be used to account for daily or yearly seasonality as well.

Differencing is another way to address seasonality in data. A seasonal difference is the difference between an observation and the corresponding observation from the previous year.

Picture7

Where m=12 for monthly data, m=4 for quarterly data, and so on 1.

Trend

A trend, shown in the first panel of Figure 4, is a clear violation of the first requirement for stationarity. There are a couple options that one can implement to deal with trends: differencing and model fitting.

Pic8

Figure 4: De-trending process1

From the above figures, it is clear that differencing can be used to account for seasonality but can also be used to dampen a trend. A first difference is performed by subtracting the value of the current observation from the one in the time step before. It can be applied as follows:

Picture8

 If the transformed data is plotted and still has a trend, a second difference can further be applied.

Picture9

It is important to note the distinction between seasonal and first differences. Seasonal differencing is the difference from one year to the next, while first differencing is the difference between one observation and the next. Seasonal and trend differencing can both be applied, but sometimes, if seasonal differencing is performed first, it will remove the need for further differencing1.

In Figure 4, note how a log transform, seasonal differencing, and second differencing is necessary to ultimately remove the trend.

 

Pic10

Figure 5: Modeling Fitting with Ordinary Least Squares2

If a monotonic trend is observed, such as the one in Figure 5, a model fitting can be performed. In this example, a linear model is fit to the trend by choosing coefficients that minimize the sum of squares. This model is then subtracted from the original data to give residuals. The goal is for the resulting residuals to be stationary. Note that a polynomial model can also be fit to the trend if appropriate2 .

Heteroscedasticity

Heteroscedasticity describes the phenomenon when the data do not exhibit a constant variance. This is a violation of the second principle. Heteroscedasticity tends to appear in financial time series (i.e. prices of stocks and bonds) which can be very volatile, but it appears less so in hydrological data3. I did not have to address heteroscedasticity in the electricity load data for my project, and some statisticians suggest that one doesn’t have to deal with it unless it is very severe as weak heteroscedasticity tends be taken care of with normalization and de-seasonalization.

One way to check for heteroscedasticity in a time series is with the McLeod-Li test for conditional heteroscedasticity. If heteroscedasticity is present, consider using an ARCH/GARCH model, if an AR or ARMA model can be fit to the data, respectively, or a hybrid ARCH-ARIMA model if the latter models are not appropriate.

Choosing a Time Series Model

Once the necessary transformations have been performed, you are ready to fit a time series model to your data. R has a some useful packages for this: forecast and stats. Some helpful functions in these packages include:

auto.arima (forecast) – This function tells you what model is the best fit for your data, the coefficients for the lag terms, and variance of errors (along with other useful information).

arima.sim (stats) – This function allows you to simulate a set of data from your time series model.

predict (stats) – This function will provide a prediction for n time steps into the future based on the chosen time series model. Keep in mind it is best when used to predict just the next few time step.

Finally, remember that back-transformations must be performed on all simulations or predictions to get them into back into the original space.

*For a really helpful explanation of different time series notation, check this previous post.

References

*All information or figures not specifically cited came from class notes and homework from Dr. Scott Steinschneider’s class

(1) Stationarity and Differencing: https://www.otexts.org/fpp/8/1

(2) Removal of Trend and Seasonality, UC Berkeley: https://www.stat.berkeley.edu/~gido/Removal%20of%20Trend%20and%20Seasonality.pdf

(3) Heteroscedasticity: http://www.math.canterbury.ac.nz/~m.reale/econ324/Topic2.pdf

 

Directed search with the Exploratory Modeling workbench

This is the third blog in a series showcasing the functionality of the Exploratory Modeling workbench. Exploratory modeling entails investigating the way in which uncertainty and/or policy levers map to outcomes. To investigate these mappings, we can either use sampling based strategies (open exploration) or optimization based strategies (directed search) In the first blog, I gave a general overview of the workbench and showed briefly how both investigation strategies can be done. In the second blog, I demonstrated the use of the workbench for open exploration in substantial more detail. In this third blog, I will demonstrate in more detail how to use the workbench for directed search. Like in the previous two blog post, I will use the DPS version of the lake problem.

For optimization, the workbench relies on platypus. You can easily install the latest version of platypus from github using pip

pip install git+https://github.com/Project-Platypus/Platypus.git

By default, the workbench will use epsilon NSGA2, but all the other algorithms available within platypus can be used as well.

Within the workbench, optimization can be used in three ways:
* Search over decision levers for a reference scenario
* Robust search: search over decision levers for a set of scenarios
* worst case discovery: search over uncertainties for a reference policy

The search over decision levers or over uncertainties relies on the specification of the direction for each outcome of interest defined on the model. It is only possible to use ScalarOutcome objects for optimization.

Search over levers

Directed search is most often used to search over the decision levers in order to find good candidate strategies. This is for example the first step in the Many Objective Robust Decision Making process. This is straightforward to do with the workbench using the optimize method.

from ema_workbench import MultiprocessingEvaluator, ema_logging

ema_logging.log_to_stderr(ema_logging.INFO)

with MultiprocessingEvaluator(model) as evaluator:
    results = evaluator.optimize(nfe=10000, searchover='levers', 
                                 epsilons=[0.1,]*len(model.outcomes),
                                 population_size=50)

the result from optimize is a DataFrame with the decision variables and outcomes of interest. The latest version of the workbench comes with a pure python implementation of parallel coordinates plot built on top of matplotlib. It has been designed with the matplotlib and seaborn api in mind. We can use this to quickly visualize the optimization results.

from ema_workbench.analysis import parcoords

paraxes = parcoords.ParallelAxes(parcoords.get_limits(results), rot=0)
paraxes.plot(results, color=sns.color_palette()[0])
paraxes.invert_axis('max_P')
plt.show()

Note how we can flip an axis using the invert_axis method. This eases interpretation of the figure because the ideal solution in this case would be a straight line for the four outcomes of interest at the top of the figure.

output_8_1

Specifying constraints

In the previous example, we showed the most basic way for using the workbench to perform many-objective optimization. However, the workbench also offers support for constraints and tracking convergence. Constrains are an attribute of the optimization problem, rather than an attribute of the model as in Rhodium. Thus, we can pass a list of constraints to the optimize method. A constraint can be applied to the model input parameters (both uncertainties and levers), and/or outcomes. A constraint is essentially a function that should return the distance from the feasibility threshold. The distance should be 0 if the constraint is met.

As a quick demonstration, let’s add a constraint on the maximum pollution. This constraint applies to the max_P outcome. The example below specifies that the maximum pollution should be below 1.

from ema_workbench import MultiprocessingEvaluator, ema_logging, Constraint

ema_logging.log_to_stderr(ema_logging.INFO)

constraints = [Constraint("max pollution", outcome_names="max_P",
                          function=lambda x:max(0, x-1))]

with MultiprocessingEvaluator(model) as evaluator:
    results = evaluator.optimize(nfe=1000, searchover='levers', 
                                 epsilons=[0.1,]*len(model.outcomes),
                                 population_size=25, constraints=constraints)

tracking convergence

To track convergence, we need to specify which metric(s) we want to use and pass these to the optimize method. At present the workbench comes with 3 options: Hyper volume, Epsilon progress, and a class that will write the archive at each iteration to a separate text file enabling later processing. If convergence metrics are specified, optimize will return both the results as well as the convergence information.

from ema_workbench import MultiprocessingEvaluator, ema_logging
from ema_workbench.em_framework.optimization import (HyperVolume,
                                                     EpsilonProgress, )
from ema_workbench.em_framework.outcomes import Constraint

ema_logging.log_to_stderr(ema_logging.INFO)

# because of the constraint on pollution, we can specify the 
# maximum easily
convergence_metrics = [HyperVolume(minimum=[0,0,0,0], maximum=[1,1,1,1]),
                       EpsilonProgress()]
constraints = [Constraint("max pollution", outcome_names="max_P",
                          function=lambda x:max(0, x-1))]

with MultiprocessingEvaluator(model) as evaluator:
    results_ref1, convergence1 = evaluator.optimize(nfe=25000, searchover='levers', 
                                    epsilons=[0.05,]*len(model.outcomes),
                                    convergence=convergence_metrics,
                                    constraints=constraints,
                                    population_size=100)

We can visualize the results using parcoords as before, while the convergence information is in a DataFrame making it also easy to plot.

fig, (ax1, ax2) = plt.subplots(ncols=2, sharex=True)
ax1.plot(convergence1.epsilon_progress)
ax1.set_xlabel('nr. of generations')
ax1.set_ylabel('$\epsilon$ progress')
ax2.plot(convergence1.hypervolume)
ax2.set_ylabel('hypervolume')
sns.despine()
plt.show()

output_16_0

Changing the reference scenario

Up till now, we have performed the optimization for an unspecified reference scenario. Since the lake model function takes default values for each of the deeply uncertain factors, these values have been implicitly assumed. It is however possible to explicitly pass a reference scenario that should be used instead. In this way, it is easy to apply the extended MORDM approach suggested by Watson and Kasprzyk (2017).

To see the effects of changing the reference scenario on the values for the decision levers found through the optimization, as well as ensuring a fair comparison with the previous results, we use the same convergence metrics and constraints from the previous optimization. Note that the constraints are in essence only a function, and don’t retain optimization specific state, we can simply reuse them. The convergence metrics, in contrast retain state and we thus need to re-instantiate them.

from ema_workbench import Scenario

reference = Scenario('reference', **dict(b=.43, q=3,mean=0.02, 
                                         stdev=0.004, delta=.94))
convergence_metrics = [HyperVolume(minimum=[0,0,0,0], maximum=[1,1,1,1]),
                       EpsilonProgress()]

with MultiprocessingEvaluator(model) as evaluator:
    results_ref2, convergence2 = evaluator.optimize(nfe=25000, searchover='levers', 
                                  epsilons=[0.05,]*len(model.outcomes),
                                  convergence=convergence_metrics,
                                  constraints=constraints,
                                  population_size=100, reference=reference)

comparing results for different reference scenarios

To demonstrate the parcoords plotting functionality in some more detail, let’s combine the results from the optimizations for the two different reference scenarios and visualize them in the same plot. To do this, we need to first figure out the limits across both optimizations. Moreover, to get a better sense of which part of the decision space is being used, let’s set the limits for the decision levers on the basis of their specified ranges instead of inferring the limits from the optimization results.

columns = [lever.name for lever in model.levers]
columns += [outcome.name for outcome in model.outcomes]
limits = {lever.name: (lever.lower_bound, lever.upper_bound) for lever in 
           model.levers}
limits = dict(**limits, **{outcome.name:(0,1) for outcome in model.outcomes})
limits = pd.DataFrame.from_dict(limits)
# we resort the limits in the order produced by the optimization
limits = limits[columns] 

paraxes = parcoords.ParallelAxes(limits, rot=0)
paraxes.plot(results_ref1, color=sns.color_palette()[0], label='ref1')
paraxes.plot(results_ref2, color=sns.color_palette()[1], label='ref2')
paraxes.legend()
paraxes.invert_axis('max_P')
plt.show()

output_22_0.png

Robust Search

The workbench also comes with support for many objective robust optimization. In this case, each candidate solution is evaluated over a set of scenarios, and the robustness of the performance over this set is calculated. This requires specifying 2 new pieces of information:
* the robustness metrics
* the scenarios over which to evaluate the candidate solutions

The robustness metrics are simply a collection of ScalarOutcome objects. For each one, we have to specify which model outcome(s) it uses, as well as the actual robustness function. For demonstrative purposes, let’s assume we are use a robustness function using descriptive statistics: we want to maximize the 10th percentile performance for reliability, inertia, and utility, while minimizing the 90th percentile performance for max_P.

We can specify our scenarios in various ways. The simplest would be to pass the number of scenarios to the robust_optimize method. In this case for each generation a new set of scenarios is used. This can create noise and instability in the optimization. A better option is to explicitly generate the scenarios first, and pass these to the method. In this way, the same set of scenarios is used for each generation.

If we want to specify a constraint, this can easily be done. Note however, that in case of robust optimization, the constrains will apply to the robustness metrics instead of the model outcomes. They can of course still apply to the decision variables as well.

import functools
from ema_workbench import Constraint, MultiprocessingEvaluator
from ema_workbench import Constraint, ema_logging
from ema_workbench.em_framework.optimization import (HyperVolume,
                                                     EpsilonProgress)
from ema_workbench.em_framework.samplers import sample_uncertainties

ema_logging.log_to_stderr(ema_logging.INFO)

percentile10 = functools.partial(np.percentile, q=10)
percentile90 = functools.partial(np.percentile, q=90)

MAXIMIZE = ScalarOutcome.MAXIMIZE
MINIMIZE = ScalarOutcome.MINIMIZE
robustnes_functions = [ScalarOutcome('90th percentile max_p', kind=MINIMIZE, 
                             variable_name='max_P', function=percentile90),
                       ScalarOutcome('10th percentile reliability', kind=MAXIMIZE, 
                             variable_name='reliability', function=percentile10),
                       ScalarOutcome('10th percentile inertia', kind=MAXIMIZE, 
                             variable_name='inertia', function=percentile10),
                       ScalarOutcome('10th percentile utility', kind=MAXIMIZE, 
                             variable_name='utility', function=percentile10)]

def constraint(x):
    return max(0, percentile90(x)-10)

constraints = [Constraint("max pollution", 
                          outcome_names=['90th percentile max_p'],
                          function=constraint)]
convergence_metrics = [HyperVolume(minimum=[0,0,0,0], maximum=[10,1,1,1]),
                       EpsilonProgress()]
n_scenarios = 10
scenarios = sample_uncertainties(model, n_scenarios)

nfe = 10000

with MultiprocessingEvaluator(model) as evaluator:
    robust_results, convergence = evaluator.robust_optimize(robustnes_functions, 
                            scenarios, nfe=nfe, constraints=constraints,
                            epsilons=[0.05,]*len(robustnes_functions),
                            convergence=convergence_metrics,)
fig, (ax1, ax2) = plt.subplots(ncols=2)
ax1.plot(convergence.epsilon_progress.values)
ax1.set_xlabel('nr. of generations')
ax1.set_ylabel('$\epsilon$ progress')
ax2.plot(convergence.hypervolume)
ax2.set_ylabel('hypervolume')
sns.despine()
plt.show()

output_25_0.png

paraxes = parcoords.ParallelAxes(parcoords.get_limits(robust_results), rot=45)
paraxes.plot(robust_results)
paraxes.invert_axis('90th percentile max_p')
plt.show()

output_26_0.png

Search over uncertainties: worst case discovery

Up till now, we have focused on optimizing over the decision levers. The workbench however can also be used for worst case discovery (Halim et al, 2016). In essence, the only change is to specify that we want to search over uncertainties instead of over levers. Constraints and convergence works just as in the previous examples.

Reusing the foregoing, however, we should change the direction of optimization of the outcomes. We are no longer interested in finding the best possible outcomes, but instead we want to find the worst possible outcomes.

# change outcomes so direction is undesirable
minimize = ScalarOutcome.MINIMIZE
maximize = ScalarOutcome.MAXIMIZE

for outcome in model.outcomes:
    if outcome.kind == minimize:
        outcome.kind = maximize
    else:
        outcome.kind = minimize

We can reuse the reference keyword argument to perform worst case discovery for one of the policies found before. So, below we select solution number 9 from the pareto approximate set. We can turn this into a dict and instantiate a Policy objecti.

from ema_workbench import Policy

policy = Policy('9', **{k:v for k, v in results_ref1.loc[9].items()
                        if k in model.levers})

with MultiprocessingEvaluator(model) as evaluator:
    results = evaluator.optimize(nfe=1000, searchover='uncertainties', 
                                 epsilons=[0.1,]*len(model.outcomes),
                                 reference=policy)

Visualizing the results is straightforward using parcoords.

paraxes = parcoords.ParallelAxes(parcoords.get_limits(results), rot=0)
paraxes.plot(results)
paraxes.invert_axis('max_P')
plt.show()

output_30_0.png

Closing remarks

This blog showcased the functionality of the workbench for applying search based approaches to exploratory modelling. We specifically looked at the use of many-objective optimization for searching over the levers or uncertainties, as well as the use of many-objective robust optimization. This completes the overview of the functionality available in the workbench. In the next blog, I will put it all together to show how the workbench can be used to perform Many Objective Robust Decision Making.

Open exploration with the Exploratory Modelling Workbench

In this blog, I will continue to showcase the functionality of the exploratory modelling workbench. In the previous blog, I have given a general introduction to the workbench, and showed how the Direct Policy Search example that comes with Rhodium can be adapted for use with the workbench. In this blog post, I will showcase how the workbench can be used for open exploration.

first a short background

In exploratory modeling, we are interested in understanding how regions in the uncertainty space and/or the decision space map to the whole outcome space, or partitions thereof. There are two general approaches for investigating this mapping. The first one is through systematic sampling of the uncertainty or decision space. This is sometimes also known as open exploration. The second one is to search through the space in a directed manner using some type of optimization approach. This is sometimes also known as directed search.

The workbench support both open exploration and directed search. Both can be applied to investigate the mapping of the uncertainty space and/or the decision space to the outcome space. In most applications, search is used for finding promising mappings from the decision space to the outcome space, while exploration is used to stress test these mappings under a whole range of possible resolutions to the various uncertainties. This need not be the case however. Optimization can be used to discover the worst possible scenario, while sampling can be used to get insight into the sensitivity of outcomes to the various decision levers.

open exploration

To showcase the open exploration functionality, let’s start with a basic example using the DPS lake problem also used in the previous blog post. We are going to simultaneously sample over uncertainties and decision levers. We are going to generate 1000 scenarios and 5 policies, and see how they jointly affect the outcomes. A scenario is understood as a point in the uncertainty space, while a policy is a point in the decision space. The combination of a scenario and a policy is called experiment. The uncertainty space is spanned by uncertainties, while the decision space is spanned by levers. Both uncertainties and levers are instances of RealParameter (a continuous range), IntegerParameter (a range of integers), or CategoricalParameter (an unorder set of things). By default, the workbench will use Latin Hypercube sampling for generating both the scenarios and the policies. Each policy will be always evaluated over all scenarios (i.e. a full factorial over scenarios and policies).

from ema_workbench import (RealParameter, ScalarOutcome, Constant,
                           ReplicatorModel)
model = ReplicatorModel('lakeproblem', function=lake_model)
model.replications = 150

#specify uncertainties
model.uncertainties = [RealParameter('b', 0.1, 0.45),
                       RealParameter('q', 2.0, 4.5),
                       RealParameter('mean', 0.01, 0.05),
                       RealParameter('stdev', 0.001, 0.005),
                       RealParameter('delta', 0.93, 0.99)]

# set levers
model.levers = [RealParameter("c1", -2, 2),
                RealParameter("c2", -2, 2),
                RealParameter("r1", 0, 2),
                RealParameter("r2", 0, 2),
                RealParameter("w1", 0, 1)]

def process_p(values):
    values = np.asarray(values)
    values = np.mean(values, axis=0)
    return np.max(values)

#specify outcomes
model.outcomes = [ScalarOutcome('max_P', kind=ScalarOutcome.MINIMIZE,
                                function=process_p),
                  ScalarOutcome('utility', kind=ScalarOutcome.MAXIMIZE,
                                function=np.mean),
                  ScalarOutcome('inertia', kind=ScalarOutcome.MINIMIZE,
                                function=np.mean),
                  ScalarOutcome('reliability', kind=ScalarOutcome.MAXIMIZE,
                                function=np.mean)]

# override some of the defaults of the model
model.constants = [Constant('alpha', 0.41),
                   Constant('steps', 100)]

Next, we can perform experiments with this model.

from ema_workbench import (MultiprocessingEvaluator, ema_logging,
                           perform_experiments)
ema_logging.log_to_stderr(ema_logging.INFO)

with MultiprocessingEvaluator(model) as evaluator:
    results = evaluator.perform_experiments(scenarios=1000, policies=5)

Visual analysis

Having generated these results, the next step is to analyze them and see what we can learn from the results. The workbench comes with a variety of techniques for this analysis. A simple first step is to make a few quick visualizations of the results. The workbench has convenience functions for this, but it also possible to create your own visualizations using the scientific Python stack.

from ema_workbench.analysis import pairs_plotting
fig, axes = pairs_plotting.pairs_scatter(results, group_by='policy',
                                         legend=False)
plt.show()

output_6_0

Writing your own visualizations requires a more in-depth understanding of how the results from the workbench are structured. perform_experiments returns a tuple. The first item is a numpy structured array where each row is a single experiment. The second item contains the outcomes, structured in a dict with the name of the outcome as key and a numpy array as value. Experiments and outcomes are aligned based on index.

import seaborn as sns

experiments, outcomes = results

df = pd.DataFrame.from_dict(outcomes)
df = df.assign(policy=experiments['policy'])

# rename the policies using numbers
df['policy'] = df['policy'].map({p:i for i, p in
                                enumerate(set(experiments['policy']))})

# use seaborn to plot the dataframe
grid = sns.pairplot(df, hue='policy', vars=outcomes.keys())
ax = plt.gca()
plt.show()

output_8_0

Often, it is convenient to separate the process of performing the experiments from the analysis. To make this possible, the workbench offers convenience functions for storing results to disc and loading them from disc. The workbench will store the results in a tarbal with .csv files and separate metadata files. This is a convenient format that has proven sufficient over the years.

from ema_workbench import save_results

save_results(results, '1000 scenarios 5 policies.tar.gz')

from ema_workbench import load_results

results = load_results('1000 scenarios 5 policies.tar.gz')

advanced analysis

In addition to visual analysis, the workbench comes with a variety of techniques to perform a more in-depth analysis of the results. In addition, other analyses can simply be performed by utilizing the scientific python stack. The workbench comes with

  • Scenario Discovery, a model driven approach to scenario development.
  • Dimensional stacking, a quick visual approach drawing on feature scoring to enable scenario discovery. This approach has received limited attention in the literature (Suzuki et al., 2015). The implementation in the workbench replaces the rule mining approach with a feature scoring approach.
  • Feature Scoring, a poor man’s alternative to global sensitivity analysis
  • Regional sensitivity analysis

Scenario Discovery

A detailed discussion on scenario discovery can be found in an earlier blogpost. For completeness, I provide a code snippet here. Compared to the previous blog post, there is one small change. The library mpld3 is currently not being maintained and broken on Python 3.5 and higher. To still utilize the interactive exploration of the trade offs within the notebook, use the interactive back-end as shown below.

from ema_workbench.analysis import prim

experiments, outcomes = results

x = experiments
y = outcomes['max_P'] <0.8

prim_alg = prim.Prim(x, y, threshold=0.8)
box1 = prim_alg.find_box()

%matplotlib notebook

box1.show_tradeoff()
plt.show()

tradeoff

%matplotlib inline
# we go back to default not interactive

box1.inspect(43)
box1.inspect(43, style='graph')
plt.show()

output_13_1

dimensional stacking

Dimensional stacking was suggested as a more visual approach to scenario discovery. It involves two steps: identifying the most important uncertainties that affect system behavior, and creating a pivot table using the most influential uncertainties. Creating the pivot table involves binning the uncertainties. More details can be found in Suzuki et al. (2015) or by looking through the code in the workbench. Compared to the original paper, I use feature scoring for determining the most influential uncertainties. The code is set up in a modular way so other approaches to global sensitivity analysis can easily be used as well if so desired.

from ema_workbench.analysis import dimensional_stacking

x = experiments
y = outcomes['max_P'] <0.8

dimensional_stacking.create_pivot_plot(x,y, 2, nbins=3)
plt.show()

output_15_1

We can see from this visual that if B is low, while Q is high, we have a high concentration of cases where pollution stays below 0.8. The mean and delta have some limited additional influence. By playing around with an alternative number of bins, or different number of layers, patterns can be coarsened or refined.

regional sensitivity analysis

A third approach for supporting scenario discovery is to perform a regional sensitivity analysis. The workbench implements a visual approach based on plotting the empirical CDF given a classification vector. Please look at section 3.4 in Pianosi et al (2016) for more details.

from ema_workbench.analysis import regional_sa
from numpy.lib import recfunctions as rf

x = rf.drop_fields(experiments, 'model', asrecarray=True)
y = outcomes['max_P'] < 0.8

regional_sa.plot_cdfs(x,y)
plt.show()

output_17_0

feature scoring

Feature scoring is a family of techniques often used in machine learning to identify the most relevant features to include in a model. This is similar to one of the use cases for global sensitivity analysis, namely factor prioritisation. In some of the work ongoing in Delft, we are comparing feature scoring with Sobol and Morris and the results are quite positive. The main advantage of feature scoring techniques is that they impose virtually no constraints on the experimental design, while they can handle real valued, integer valued, and categorical valued parameters. The workbench supports multiple techniques, the most useful of which generally is extra trees (Geurts et al. 2006).

For this example, we run feature scoring for each outcome of interest. We can also run it for a specific outcome if desired. Similarly, we can choose if we want to run in regression mode or classification mode. The later is applicable if the outcome is a categorical variable and the results should be interpreted similar to regional sensitivity analysis results. For more details, see the documentation.

from ema_workbench.analysis import feature_scoring

x = experiments
y = outcomes

fs = feature_scoring.get_feature_scores_all(x, y)
sns.heatmap(fs, cmap='viridis', annot=True)
plt.show()

output_19_0

From the results, we see that max_P is primarily influenced by b, while utility is driven by delta, for inertia and reliability the situation is a little bit less clear cut.

linear regression

In addition to the prepackaged analyses that come with the workbench, it is also easy to rig up something quickly using the ever expanding scientific Python stack. Below is a quick example of performing a basic regression analysis on the results.

experiments, outcomes = results

for key, value in outcomes.items():
    params = model.uncertainties #+ model.levers[:]

    fig, axes = plt.subplots(ncols=len(params), sharey=True)

    y = value

    for i, param in enumerate(params):
        ax = axes[i]
        ax.set_xlabel(param.name)

        pearson = sp.stats.pearsonr(experiments[param.name], y)

        ax.annotate("r: {:6.3f}".format(pearson[0]), xy=(0.15, 0.85),
                    xycoords='axes fraction',fontsize=13)

        x = experiments[param.name]
        sns.regplot(x, y, ax=ax, ci=None, color='k',
        scatter_kws={'alpha':0.2, 's':8, 'color':'gray'})

        ax.set_xlim(param.lower_bound, param.upper_bound)

    axes[0].set_ylabel(key)

plt.show()

output_22_0

More advanced sampling techniques

The workbench can also be used for more advanced sampling techniques. To achieve this, it relies on SALib. On the workbench side, the only change is to specify the sampler we want to use. Next, we can use SALib directly to perform the analysis. To help with this, the workbench provides a convenience function for generating the problem dict which SALib provides. The example below focusses on performing SOBOL on the uncertainties, but we could do the exact same thing with the levers instead. The only changes required would be to set lever_sampling instead of uncertainty_sampling, and get the SALib problem dict based on the levers.

from SALib.analyze import sobol
from ema_workbench.em_framework.salib_samplers import get_SALib_problem

with MultiprocessingEvaluator(model) as evaluator:
    sa_results = evaluator.perform_experiments(scenarios=1000,
                                               uncertainty_sampling='sobol')

experiments, outcomes = sa_results
problem = get_SALib_problem(model.uncertainties)

Si = sobol.analyze(problem, outcomes['max_P'],
                   calc_second_order=True, print_to_console=False)

Si_filter = {k:Si[k] for k in ['ST','ST_conf','S1','S1_conf']}
Si_df = pd.DataFrame(Si_filter, index=problem['names'])
Animations (2/2)

Animations (2/2)

In the second part of this two-part post we’ll learn to use different tools and techniques to visualize and save animations (first part here). All the code discussed here is available on a GitHub repository, here.

This part focuses on the moviepy Python library, and all the neat things one can do with it. There actually are some nice tutorials for when we have a continuous function t -> f(t) to work with (see here). Instead, we are often working with data structures that are indexed on time in a discrete way.

Moviepy could be used from any data source dependent on time, including netCDF data such as the one manipulated by VisIt in the first part of this post. But in this second part, we are instead going to focus on how to draw time -dependent trajectories to make sense of nonlinear dynamical systems, then animate them in GIF. I will use the well-known shallow lake problem, and go through a first example with detailed explanation of the code. Then I’ll finish with a second example showing trajectories.

Part I: using state trajectories to understand the concept of stable equilibria

The shallow lake problem is a classic problem in the management of coupled human and natural system. Some human (e.g. agriculture) produce phosphorus that eventually end up in water bodies such as lakes. Too much phosphorus in lake causes a processus called eutrophication which usually destroys lakes’ diverse ecosystems (no more fish) and lower water quality. A major problem with that is that eutrophication is difficult or even sometimes impossible to reverse: lowering phosphorus inputs to what they were pre-eutrophication simply won’t work. Simple nonlinear dynamics, first proposed by Carpenter et al. in 1999 (see here) describe the relationship between phosphorus inputs (L) and concentration (P). The first part of the code (uploaded to GitHub as movie1.py) reads:

 import attractors
import numpy as np
from moviepy.video.io.bindings import mplfig_to_npimage
from moviepy.video.VideoClip import DataVideoClip
import matplotlib.pyplot as plt
import matplotlib.lines as mlines

# Lake parameters
b = 0.65
q = 4

# One step dynamic (P increment rate)
# arguments are current state x and lake parameters b,q and input l
def Dynamics(x, b, q, l):
    dp = (x ** q) / (1 + x ** q) - b * x + l
    return dp 

Where the first 6 lines contain the usual library imports. Note that I am importing an auxiliary Python function “attractors” to enable me to plot the attractors (see attractors.pyon the GitHub repository). The function “Dynamics” correspond to the evolution of P given L and lake parameters b and q, also given in this bit of code. Then we introduce the time parameters:

# Time parameters
dt = 0.01 # time step
T = 40 # final horizon
nt = int(T/dt+1E-6) # number of time steps

To illustrate that lake phosphorus dynamics depend not only on the phosphorus inputs L but also on initial phosphorus levels, we are going to plot P trajectories for different constant values of L, and three cases regarding the initial P. We first introduce these initial phosphorus levels, and the different input levels, then declare the arrays in which we’ll store the different trajectories

# Initial phosphorus levels
pmin = 0
pmed = 1
pmax = 2.5

# Inputs levels
l = np.arange(0.001,0.401,0.005)

# Store trajectories
low_p = np.zeros([len(l),nt+1]) # Correspond to pmin
med_p = np.zeros([len(l),nt+1]) # Correspond to pmed
high_p = np.zeros([len(l),nt+1]) # Correspond to pmax 

Once that is done, we can use the attractor import to plot the equilibria of the lake problem. This is a bit of code that is the GitHub repository associated to this post, but that I am not going to comment on further here.

After that we can generate the trajectories for P with constant L, and store them to the appropriate arrays:

# Generating the data: trajectories
def trajectory(b,q,p0,l,dt,T):
# Declare outputs
time = np.arange(0,T+dt,dt)
traj = np.zeros(len(time))
# Initialize traj
traj[0] = p0
# Fill traj with values
for i in range(1,len(traj)):
traj[i] = traj[i-1] + dt * Dynamics(traj[i-1],b,q,l)
    return traj
# Get them!
for i in range(len(l)):
    low_p[i,:] = trajectory(b,q,pmin,l[i],dt,T)
    med_p[i, :] = trajectory(b, q, pmed, l[i], dt, T)
    high_p[i,:] = trajectory(b,q,pmax,l[i],dt,T)

Now we are getting to the interesting part of making the plots for the animation. We need to declare a figure that all the frames in our animation will use (we don’t want the axes to wobble around). For that we use matplotlib / pyplot libraries:

# Draw animated figure
fig, ax = plt.subplots(1)
ax.set_xlabel('Phosphorus inputs L')
ax.set_ylabel('Phosphorus concentration P')
ax.set_xlim(0,l[-1])
ax.set_ylim(0,pmax)
line_low, = ax.plot(l,low_p[:,0],'.', label='State, P(0)=0')
line_med, = ax.plot(l,med_p[:,0],'.', label='State, P(0)=1')
line_high, = ax.plot(l,high_p[:, 0], '.', label='State, P(0)=2.5')

Once that is done, the last things we need to do before calling the core moviepy functions are to 1) define the parameters that manage time, and 2) have a function that makes frames for the instant that is being called.

For 1), we need to be careful because we are juggling with different notions of time, a) time in the dynamics, b) the index of each instant in the dynamics (i.e., in the data, the arrays where we stored the trajectories), and c) time in the animation. We may also want to have a pause at the beginning or at the end of the GIF, rather than watch with tired eyes as the animation is ruthlessly starting again before we realized what the hell happened. So here is how I declared all of this:

# Parameters of the animation
initial_delay = 0.5 # in seconds, delay where image is fixed before the animation
final_delay = 0.5 # in seconds, time interval where image is fixed at end of animation
time_interval = 0.25 # interval of time between two snapshots in the dynamics (time unit or non-dimensional)
fps = 20 # number of frames per second on the GIF
# Translation in the data structure
data_interval = int(time_interval/dt) # interval between two snapshots in the data structure
t_initial = -initial_delay*fps*data_interval
t_final = final_delay*fps*data_interval
time = np.arange(t_initial,low_p.shape[1]+t_final,data_interval) # time in the data structure

Now for 2), the function to make the frames resets the parts of the plot that change for different time indexes(“t” below is the index in the data). If we don’t do that, the plot will keep the previous plotted elements, and will grow messier at the animation goes on.

# Making frames
def make_frame(t):
    t = int(t)
    if t<0:
        return make_frame(0)
    elif t>nt:
        return make_frame(nt)
    else:
        line_low.set_ydata(low_p[:,t])
        line_med.set_ydata(med_p[:,t])
        line_high.set_ydata(high_p[:, t])
        ax.set_title(' Lake attractors, and dynamics at t=' + str(int(t*dt)), loc='left', x=0.2)
       if t > 0.25*nt:
            alpha = (t-0.25*nt) / (1.5*nt)
            lakeAttBase(eqList, 0.001, alpha=alpha)
            plt.legend(handles=[stable, unstable], loc=2)
        return mplfig_to_npimage(fig) 

In the above mplfig_to_npimage(fig) is a moviepy function that turns a figure into a frame of our GIF. Now we just have to call the function to do frames using the data, and to turn it into a GIF:

# Animating
animation = DataVideoClip(time,make_frame,fps=fps)
animation.write_gif("lake_attractors.gif",fps=fps)

Where the moviepy function DataVideoClip takes as arguments the sequences of indexes defined by the vector “time” defined in the parameters of the animation, the “make_frame” routine we defined, and the number of frame per second we want to output. The last lines integrates each frame to the GIF that is plotted below:

lake_attractors

Each point on the plot represent a different world (different constant input level, different initial phosphorus concentration), and the animation shows how these states converge towards an stable equilibriun point. The nonlinear lake dynamics make the initial concentration important towards knowing if the final concentration is low (lower set of stable equilibria), or if the lake is in a eutrophic state (upper set of stable equilibria).

Part II: plotting trajectories in the 2-D plane

Many trajectories can be plotted at the same time to understand the behavior of attractors, and visualize system dynamics for fixed human-controlled parameters (here, the phosphorus inputs L). Alternatively, if one changes the policy, trajectories evolve depending on both L and P. This redefines how trajectories are defined.

I did a similar bit of code to show how one could plot trajectories in the 2D plane. It is also uploaded on the GitHub repository (under movie2.py), and is similar in its structure to the code above. The definition of the trajectories and where to store them change. We define trajectories where inputs are lowered at a constant rate, with a minimum input of 0.08. For three different initial states, that gives us the following animation that illustrates how the system’s nonlinearity leads to very different trajectories even though the starting positions are close and the management policy, identical:

lake_trajectories

 

This could easily be extended to trajectories in higher dimensional planes, with and without sets of equilibria to guide our eyes.