Runtime Visualization of MOEA with Platypus in Python

The Platypus framework provides us with a Python library to solve and analyze multi-objective problems conveniently. In this notebook, we use Platypus to solve a multi-objective optimization problem – the 3 objective version of DTLZ2 (Deb et al., 2002a) – using the evolutionary algorithm NSGA-II (Deb et al., 2002b). We generate runtime visualizations for snapshots of the algorithm to help build an understanding of how the evolutionary algorithm works. Specifically, we are going to look at the population at each generation, their ranks (a key parameter NSGA-II uses to select offspring), the parallel coordinates plot of the current population, and runtime indicators such as hypervolume and generational distance.

Setting up the problem

I created a Google Colab for this post. Since this post is equivalent to the Google Colab file, you may choose to look at either one. This project is intended to be used in a Jupyter Notebook environment with Python.

First, we import the Python libraries we need for the visualization.

import numpy as np
import math
import pandas as pd
from pandas.plotting import parallel_coordinates
import random
from tqdm.notebook import tqdm

import matplotlib.pyplot as plt
from matplotlib import animation, rc, rcParams
rc('animation', html='jshtml')

from mpl_toolkits.mplot3d import Axes3D

!pip install platypus-opt
from platypus import (NSGAII, NSGAIII, DTLZ2, Hypervolume, EpsilonBoxArchive,
                      Solution, GenerationalDistance, InvertedGenerationalDistance,
                      Hypervolume, EpsilonIndicator, Spacing)

Our goal is to visualize the population at each generation of the evolutionary algorithm. To do so, we utilize an interface provided by the Platypus library: the callback function.

At each iteration of the algorithm, the callback function (if initialized) is called. We define our callback function to store the current number of function evaluations (algorithm.nfe) and all the data points in the current population (algorithm.result). Each population has the type Solution defined in Platypus, which not only contains the values of the variables but also attributes specific to the solver, such as ranks and crowding distances. We will access some of these attributes during visualization.

We also define the frequency of NFEs that we want to store the values. Saving a snapshot of the algorithm at every single iteration may be too expensive or unnecessary. Hence, we set a lapse for some number of iterations, in which the callback function does nothing unless the last NFE is more than the frequency amount apart from this NFE.

In the following example, we set up the problem DTLZ2 and use the callback function to solve it using NSGAII.

#define the frequency
frequency = 200

# define the problem definition
problem = DTLZ2(3)

# instantiate the optimization algorithm
algorithm = NSGAII(problem, divisions_outer=12)

# define callback function
solutions_list = []
hyp = []
nfe = []
last_calc = 0

def DTLZ2_callback(algorithm):
  global last_calc
  if algorithm.nfe == last_calc + frequency:
    last_calc = algorithm.nfe
    nfe.append(algorithm.nfe)
    solutions_list.append(algorithm.result);

# optimize the problem using 10,000 function evaluations
algorithm.run(10000,DTLZ2_callback)

In order to calculate the metrics, we first need to define our reference set. For DTLZ2(3), our points lie on the unit sphere in the first octant. Let’s randomly select 1000 such points and plot them.

# generate the reference set for 3D DTLZ2
reference_set = EpsilonBoxArchive([0.02, 0.02, 0.02])

for _ in range(1000):
    solution = Solution(problem)
    solution.variables = [random.uniform(0,1) if i < problem.nobjs-1 else 0.5 for i in range(problem.nvars)]
    solution.evaluate()
    reference_set.add(solution)

fig_ref = plt.figure()
ax_ref = fig_ref.add_subplot(projection="3d")
ax_ref.scatter(
              [s.objectives[0] for s in reference_set],
              [s.objectives[1] for s in reference_set],
              [s.objectives[2] for s in reference_set],
              )

Given the reference set, we can now calculate the indicators for each population across iterations. The Platypus library provides us with all the functions we need to calculate the following indicators: generational distance, hypervolume, epsilon indicator, and spacing.

We initialize them in a dictionary and iterate over all saved populations to calculate the indicators.

# calculate the indicators
indicators = {"gd" : GenerationalDistance(reference_set),
              "hyp" : Hypervolume(reference_set),
              "ei" : EpsilonIndicator(reference_set),
              "sp" : Spacing()}

indicator_results = {index : [] for index in indicators}

for indicator in tqdm(indicator_results):
  for solution in tqdm(solutions_list):
    indicator_results[indicator] += [indicators[indicator].calculate(solution)]

Setting up the visualization

At this point, we have the data we need to perform runtime visualizations. We will utilize the animation.FuncAnimation function in matplotlib to create interactive animations in Jupyter Notebook (or Google Colab). The idea behind creating such animations is to first initialize a static figure, and then define an update function to let the FuncAnimation know how to visualize new data for each iteration.

We define drawframe() that does the following: (1) clear the axis, so that previous data points are wiped out; (2) draw the new data points; (3) reset the limits of data axes so that the axes are consistent across frames; (4) update new data for indicator axes.

def drawframe(n):

    # clear axes
    ax.cla()
    ax_parallel.cla()

    # save results
    result = solutions_list[n]
    crowding_distances = [s.crowding_distance if s.crowding_distance != math.inf else 0 for s in result]
    ranks = [s.rank for s in result]

    result = solutions_list[n]
    points = {
              'X': [s.objectives[0] for s in result],
              'Y': [s.objectives[1] for s in result],
              'Z': [s.objectives[2] for s in result],
              'rank': [s.rank for s in result],
              'tag' : ['tag' for s in result]
    }
    df = pd.DataFrame(points)

    # update new data points
    ax.scatter(points['X'], points['Y'], points['Z'],
              c = ranks,
              alpha = 0.5,
              linestyle="", marker="o",
              cmap=cmap,
              vmax=max_rank,
              vmin=0)
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)
    ax.set_zlim(zlim)
    ax.set_title('Solve DTLZ2, NFE = ' + str(nfe[n]))

    # update the parallel coordinates plot
    parallel_coordinates(df[['X','Y','Z','tag']], 'tag', ax=ax_parallel)

    # update indicator plots
    for indicator in indicator_axes:
      indicator_ax = indicator_axes[indicator]
      indicator_ax.plot(nfe[:n],indicator_results[indicator][:n], c = 'r')
      indicator_ax.set_xlim(left = min(nfe), right=max(nfe))
      indicator_ax.set_ylim(bottom = min(indicator_results[indicator]), top = max(indicator_results[indicator]))

With the drawframe() function, we create a static figure that initializes the axes we will feed into the drawframe() function. The initialization does the following: (1) set up the subplots of the figure; (2) calculate the maximum ranks of all points in all populations to determine the color mapping; (3) load the points from the first iteration; (4) initialize the scatter plots, parallel coordinates plot, and indicator plots.

fig = plt.figure(figsize=(20,10), dpi = 70)
fig.suptitle("Runtime Visualization", fontsize=20)
fig.subplots_adjust(wspace=0.3, hspace=0.3)
ax = fig.add_subplot(2, 3, 1, projection="3d")
ax_parallel = fig.add_subplot(2,3,4)

indicator_axes = {"gd" : fig.add_subplot(2, 3, 2),
                  "hyp" : fig.add_subplot(2, 3, 3),
                  "ei" : fig.add_subplot(2, 3, 5),
                  "sp" : fig.add_subplot(2, 3, 6)}

indicator_names = {"gd" : "Generational Distance",
                  "hyp" : "Hypervolume",
                  "ei" : "Epsilon Indicator",
                  "sp" : "Spacing"}

# load the ranks of all points
all_rank = [s.rank for result in solutions_list for s in result]
max_rank = max(all_rank)

# define the colormap
cmap = plt.get_cmap('Accent', max_rank)

# load the points from the first iteration
result = solutions_list[0]
points = {
    'X': [s.objectives[0] for s in result],
    'Y': [s.objectives[1] for s in result],
    'Z': [s.objectives[2] for s in result],
    'rank': [s.rank for s in result],
    'tag' : ['tag' for s in result]
}

df = pd.DataFrame(points)

# create the scatter plot
graph = ax.scatter(
              points['X'], points['Y'], points['Z'],
              c = points['rank'],
              alpha = 0.5,
              cmap=cmap,
              linestyle="", marker="o",
              vmax=max_rank,
              vmin=0
          )

# create the parallel coordinates plot
parallel_coordinates(df[['X','Y','Z','tag']], 'tag', ax=ax_parallel)

plt.colorbar(graph, label='Rank', pad = 0.2)

# save the dimensions for later use
xlim = ax.get_xlim()
ylim = ax.get_ylim()
zlim = ax.get_zlim()
title = ax.set_title("DTLZ2 Static Figure")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("z")

# initialize subplots for each indicator
for indicator in indicator_axes:
  indicator_axes[indicator].plot(nfe[:0],indicator_results[indicator][:0])
  indicator_axes[indicator].set_title(indicator_names[indicator] + " vs NFE")
  indicator_axes[indicator].set_xlabel("NFE")
  indicator_axes[indicator].set_ylabel(indicator_names[indicator])

Now we are ready to create an animation. We use the FuncAnimation() function, the initialized figure, and the drawframe() function to create the animation.

In the Jupyter Notebook or Google Colab, you will be able to play the animation using the play button at the bottom of the image. By default, the animation will play in a loop. You may select “once” so that the animation freezes in the last frame. Important: Before re-generating the animation, be sure to re-run the previous initialization so that the figure is reset. Otherwise, you may see overlapping points/lines.

ani = animation.FuncAnimation(fig, drawframe, frames=len(nfe), interval=20, blit=False)
ani

This visualization shows how the algorithm progresses as NFE grows. For the set of solutions, we clearly see how they converge to the reference set. Moreover, more and more points have lower ranks, indicating they are getting closer to the Pareto Front (points on the Pareto Front have rank = 0). The parallel coordinates plot shows how our solutions get narrowed down and the tradeoffs we could make. Finally, the four indicator plots track the performance of our algorithm as NFE increases. The trajectory of generational distance, hypervolume, and epsilon indicator suggests convergence.

In conclusion, the project highlights the potential of the Platypus library in Python in providing valuable insights into the progress of evolutionary algorithms, not just their final outcomes. Through the use of NSGA-II as an illustrative example, we have demonstrated the ability to monitor the ranks of points across generations. In Dave’s post, the runtime visualizations revealed the changing probabilities of variators across iterations. These findings emphasize the power of incorporating dynamic techniques to gain a comprehensive understanding of the runtime behavior of MOEA algorithms. I hope this project opens doors to further explore, visualize, and analyze the dynamics of evolutionary algorithms.

References

[1] Deb, K., Thiele, L., Laumanns, M., & Zitzler, E. (2002a). Scalable multi-objective optimization test problems. In Proceedings of the 2002 Congress on Evolutionary Computation. CEC’02 (Cat. No. 02TH8600) (Vol. 1, pp. 825-830). IEEE.

[2] Deb, K., Pratap, A., Agarwal, S., & Meyarivan, T. A. M. T. (2002b). A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE transactions on evolutionary computation, 6(2), 182-197.

[3] Gold, D. (2020, May 6). Beyond Hypervolume: Dynamic visualization of MOEA runtime. Water Programming: A Collaborative Research Blog. https://waterprogramming.wordpress.com/2020/05/06/beyond-hypervolume-dynamic-visualization-of-moea-runtime/

[4] Python – How to create 3D scatter animations – Stack Overflow https://stackoverflow.com/questions/41602588/how-to-create-3d-scatter-animations

[5] GitHub – Project-Platypus/Platypus: A Free and Open Source Python Library for Multiobjective Optimization https://github.com/Project-Platypus/Platypus

Parameter estimation of Hidden Markov Models using a Markov Chain Monte Carlo method in Julia

I recently completed a course Final Project in which I attempted to implement a Hidden Markov Model (HMM) parameterized using the Markov Chain Monte Carlo method. In this post, I will briefly introduce HMM models as well as the MCMC method. I will then walk you through an exercise implementing a HMM in the Julia programming language and share some preliminary results from my exploration in combining both these methods. All the code found in this exercise can be located in this GitHub repository.

Note: The exercise in this post will require some familiarity with Julia. For a quick introduction to this relatively recent newcomer to the programming languages scene, please refer to Peter Storm’s blog post here.

Hidden Markov Models

A Hidden Markov Model (HMM) is a statistical tool used to model different types of time series data ranging from signal processing, hydrology, financial statistics, speech recognition, and econometrics (Rabiner, 1990; Rydén, 2008), where the underlying process is assumed to be Markovian. Specifically, it has been found to be effectively at modeling systems with long-term persistence, where past events have effects on current and future events at a multi-year scale (Akintug and Rasmussen, 2005; Bracken et al., 2014). Recent studies have also found that it is possible to use HMMs to model the dynamics of streamflow, where the HMM state space represents the different hydrological conditions (e.g. dry vs. wet) and the observation space represents the streamflow data (Hadjimichael et al., 2020; 2020a). These synthetic streamflows are computationally generated with the aid of packages such as hmmlearn in Python and HMMBase in Julia. An example implementation of hmmlearn can be found in Julie Quinn’s blog post here. She also wrote the prequel to the implementation post, where she dives into the background of HMMs and applying Expectation-Maximization (EM) to parameterize the model.

Markov Chain Monte Carlo

The EM approach of parameterizing HMMs reflects a frequentist approach that assumes the existence of one “true” value for each HMM parameter. In this exercise, I will instead use a Bayesian approach via the Markov Chain Monte Carlo method to generate and randomly sample potential parameter (model prior) values from a Markov chain, which are statistical models that approximate Markovian time series. In these time series, the probability of being in any given state st at current time t is dependent only on states prior to st. Using the MCMC approach, a Markov Chain is constructed for each HMM parameter such that its stationary distribution (distribution of the parameters at equilibrium) is the distribution of its associated model posterior (the likelihood of a model structure after, or a posteriori, obtaining system observed data). The model priors (assumed model structure prior to having any knowledge about the system) are randomly sampled from their respective chains — hence the term “Monte Carlo”. These samples are then used to estimate the statistical properties of the distribution of each parameter.

Overall, MCMC is a powerful method to estimate the statistical properties of high-dimensional problems with many parameters. Nonetheless, it is beholden to its own set of limitations that include high computational expense and the inability to guarantee convergence. Figure 1 below summarize some of the benefits and drawbacks of using MCMC parameterization:

Figure 1: The pros and cons of using MCMC parameterization.

Setting up the parameters

In this exercise, I will be using the HMM to approximate the statistical properties of streamflow data found in the annualQ.csv file previously used also by Rohini in her post on creating HMMs to generate synthetic streamflow series and by the MSD UC Ebook HMM training Jupyter Notebook. This streamflow time series originated from a USGS outlet gauge in the Upper Colorado River Basin. It records 105 years’ worth of streamflow from 1909 and 2013 (inclusive). It was previously found that it was best described using two states (wet and dry; Bracken et al, 2014). In this exercise, I also assume that this time series is best approximated by a first-order HMM where the state at current time t (st) is only affected by the state at the timestep immediately prior to it (st-1).

In this exercise, I will be using MCMC to parameterize the following HMM parameters:

ParameterDescription
The initial state distribution a for wet (1)
and dry (2) states
The transition matrix A from a wet (1) to
a dry (2) state
The probability distributions for the vector
of observations x for a given state where
Wet: 1 and Dry: 2
Table 1: Summary of all HMM parameters to be adjusted using the MCMC approach.

From Table 1 above:

  • P(ai) is the initial state probability for state i
  • P(Aij) is the transition probability from state i to state j
  • μi and σi are the mean and standard deviation for the vector of streamflow observations x for a given state i

Now that the dataset and model parameters have been introduced, we will proceed with demonstrating the code required to implement the HMM model with MCMC parameterization.

Implementing the streamflow HMM with using MCMC parameterization

Before begin, I recommend downloading and installing Julia 1.8.5. Please find the Windows and Mac 64-bit installer here. Once Julia is set up, we can download and install all required packages. In Julia, this can be done as follows:

# make sure all required packages are installed
import Pkg
Pkg.activate(@__DIR__);
Pkg.instantiate();

Once this is done, we can load all the required packages. In Julia, this is done using the using key word, which can be thought of as the Python import parallel. Note that I placed a semicolon after the using CSV line. This is to prevent the automatic printing of output messages that Julia automatically displays when loading packages.

# load all packages
using Random
using CSVFiles # load CSV data
using DataFrames # data storage and presentation
using Plots # plotting library
using StatsPlots # statistical plotting
using Distributions # statistical distribution interface
using Turing # probabilistic programming and MCMC
using Optim # optimization library
using StatsBase
using LaTeXStrings
using HMMBase
using DelimitedFiles
using CSV;

Now, let’s load the data and identify the number of years and number of states inherent to time streamflow time series.

# load data
# source: https://github.com/IMMM-SFA/msd_uncertainty_ebook/tree/main/notebooks/data
Q = CSV.File("annualQ.csv", header=[:Inflow]) |> DataFrame
Q = Q[2:end,:]
Q[!,:Inflow] = parse.(Float64, Q[!, :Inflow])
Q[!, :Year] = 1909:2013;
Q[!, :logQ] = log.(Q[!, :Inflow]);

# Assume first-order HMM where only the immediate prior step influences the next step
n_states = 2
logQ = Q[!,:logQ];
n_years = size(logQ)[1];

Next, we should set a random seed to ensure that our following results are consistent. This is because the package that we are using in this exercise called “Distributions” performs random sampling. Without setting the seed (shown below), the outcomes of the exercise may be wildly different each time the MCMC parameterization algorithm is run.

# Set the random seed
Random.seed!(1)

Next, we define the HMM model with MCMC sampling. The sampling is performed using methods and syntax built into the Turing.jl package. In the code below, also note lines 32 to 38. Within those lines, the observations’ states are enforced where observations with log-streamflows below the input mean are assigned to a dry state, while the opposite is implemented for log-streamflows above the wet state. This step is important to prevent mode collapse, where the HMM model cannot distinguish between the wet and dry state distributions due to the random sampling of the MCMC parameterization.

@model function HMM_mcmc(logQ, n_states, n_years)
    # Source: https://chat.openai.com/c/d81ef577-9a2f-40b2-be9d-c19bf89f9980
    # Prompt example: Please suggest how you would parameterize Hidden Markov Model with the Markov Chain Monte Carlo method 
    # Uses the Turing.jl package in Julia. The input to this model will be an inflow timeseries with two hidden states.
    mean_logQ = mean(logQ)

    # Define the prior distributions for the parameters
    μ₁ ~ Normal(15.7, 0.25)   # wet state   
    μ₂ ~ Normal(15.2, 0.25)   # dry state
    σ₁ ~ truncated(Normal(0.24, 0.1), 0, Inf)
    σ₂ ~ truncated(Normal(0.24, 0.1), 0, Inf)
    a11 ~ Dirichlet(ones(n_states))
    A ~ filldist(Dirichlet(ones(n_states)), n_states)
    s = tzeros(Int, n_years)   # Define the vector of hidden states

    # Define intial state probability distribution
    a = [a11[1], 1-a11[1]]   # [wet state, dry state]

    # Define the observation distributions 
    B = [Normal(μ₁, σ₁), Normal(μ₂, σ₂)]

    # Sample the initial hidden state and observation variables
    s[1] ~ Categorical(a)

    # Loop over time steps and sample the hidden+observed variables 
    for t in 2:n_years
        s[t] ~ Categorical(A[:,s[t-1]])
    end
    for t in 1:n_years
        logQ[t] ~ B[s[t]]

        # if the logQ is greater than the input mean, force the state to be a wet state
        if logQ[t] > mean_logQ
            s[t] ~ Categorical(A[:,1])
        # if the logQ is smaller than the input mean, force the state to be a dry state
        elseif logQ[t] < mean_logQ
            s[t] ~ Categorical(A[:,2])
        end

    end
end

We then have to run the model for at least 3,000 iterations before the sample values for each parameter converge to stable values. The code to perform this is as follows:

# run for 3000 iterations; usually enough
# run using >3 chains to double check convergence if desired
hmm_model = HMM_mcmc(logQ, n_states, n_years)
g = Gibbs(HMC(0.05, 10, :μ₁, :μ₂, :σ₁, :σ₂, :a11, :A), PG(10, :s))
chains = sample(hmm_model, g, MCMCThreads(), 3000, 1, drop_warmup=true)

To check for convergence, you can attempt to run plot(chains) to check for convergence, and corner(chains) to examine the parameters for dependencies. In this post, we will not discuss MCMC convergence in detail, but some things to look out for would be r_hat values in the output of the code shown above that are close to 1, as well as chains that are similar to each other. Refer to Table 2 and Figure 2 for some quick statistics that can be used as a baseline to determine if your model has converged.

Table 2: The mean converged values for some of the HMM parameters.
Figure 2: The mean converged values of the transition matrix components.

Now, let’s use the model to generate 1,000 samples of streamflow. First, we define the streamflow prediction function:

function predict_inflow(chain, logQ)
    μ₁ = Array(group(chain, :μ₁))
    μ₂ = Array(group(chain, :μ₂))
    σ₁ = Array(group(chain, :σ₁))
    σ₂ = Array(group(chain, :σ₂))
    a11 = Array(group(chain, "a11[1]"))
    a22 = Array(group(chain, "a11[2]"))
    A11 = Array(group(chain, "A[1,1]"))
    A12 = Array(group(chain, "A[1,2]"))
    A21 = Array(group(chain, "A[2,1]"))
    A22 = Array(group(chain, "A[2,2]"))

    n_samples = 1000
    
    μ1_sample = sample(μ₁, n_samples, replace=true);
    μ2_sample = sample(μ₂, n_samples, replace=true);
    σ1_sample = sample(σ₁, n_samples, replace=true);
    σ2_sample = sample(σ₂, n_samples, replace=true);
    a11_sample = sample(a11, n_samples, replace=true);
    a22_sample = sample(a22, n_samples, replace=true);
    A11_sample = sample(A11, n_samples, replace=true);
    A12_sample = sample(A12, n_samples, replace=true);
    A21_sample = sample(A21, n_samples, replace=true);
    A22_sample = sample(A22, n_samples, replace=true);
    
    Q_predict = zeros(length(logQ), n_samples)    
    #residuals_ar = zeros(length(temp_data), n_samples)
    
    s_matrix = zeros(length(logQ), n_samples)
    logQ_matrix = zeros(length(logQ), n_samples)
    meanQ = mean(logQ)
    
    for n = 1:n_samples 
        s_sample = tzeros(Int, n_years)
        Q_sample = zeros(n_years)

        μ1 = μ1_sample[n]
        μ2 = μ2_sample[n]
        σ1 = σ1_sample[n]
        σ2 = σ2_sample[n]
        a11 = a11_sample[n]
        a22 = a22_sample[n]
        A11 = A11_sample[n]
        A21 = 1 - A11_sample[n]
        A12 = 1 - A22_sample[n]
        A22 = A22_sample[n]
        #println(A)
        A = [A11 A12; A21 A22]
        A = transpose(A)
        #println(A)
        #a =  [1-a22, a22]
        a = [a11, 1-a11]
        print(a)
        B = [Normal(μ1, σ1), Normal(μ2, σ2)]

        # Sample the initial hidden state and observation variables
        s_sample[1] = rand(Categorical(a))

        # Loop over time steps and sample the hidden+observed variables 
        for t in 2:n_years
            s_sample[t] = rand(Categorical(A[s_sample[t-1], :]))
        end
        for t in 1:n_years
            Q_sample[t] = rand(B[s_sample[t]])
            if Q_sample[t] < meanQ
                s_sample[t] = 1
            elseif Q_sample[t] > meanQ
                s_sample[t] = 2
            end
        end

        s_matrix[:, n] = s_sample
        Q_predict[:, n] = Q_sample
        
    end

    return s_matrix, Q_predict

end

From this function, we obtain the state and log-inflow matrix with dimensions 105 × 1000 each.

s_matrix, logQ_predict = predict_inflow(chains, logQ) 

We randomly select a timeseries out of the 1,000 timeseries output by the prediction function and plot the wet and dry states identified:

# create the dry and wet masks
q = Q[!,:Inflow]
y = Q[!,:Year]

# select a random sample timeseries to plot
idx = 7
Q_wet_mcmc = q[s_matrix[:,idx] .== 1]
Q_dry_mcmc = q[s_matrix[:,idx] .== 2]

y_wet_mcmc = y[s_matrix[:,idx] .== 1]
y_dry_mcmc = y[s_matrix[:,idx] .== 2];

# plot the figure
plot(y, Q.Inflow, c=:grey, label="Inflow", xlabel="Year", ylabel="Annual inflow (cubic ft/yr)")
scatter!(y_dry_mcmc, Q_dry_mcmc, c=:salmon, label="Dry state (MCMC)", xlabel="Year", ylabel="Annual inflow (cubic ft/yr)")
scatter!(y_wet_mcmc, Q_wet_mcmc, c=:lightblue, label="Wet state (MCMC)", xlabel="Year", ylabel="Annual inflow (cubic ft/yr)")

The output figure should look similar to this:

Figure 2: Wet (pink) and dry (blue) states identified within the streamflow timeseries.

From this figure, we can see that the HMM can identify streamflow extreme as wet and dry states (albeit with some inconsistencies) where historically high streamflows are identified as a “wet state” and low streamflows are correspondingly identified as “dry”.

To make sure that the model can distinguish the observation distributions for wet and dry states, we plot their probability distribution functions. Before we plot anything, though, let’s define a function that will help us plot the distributions themselves. For this function, we first need to load the LinearAlgebra package.

# load the LinearAlgebra package 
using LinearAlgebra

# function to plot the distribution of the wet and dry states 
function plot_dist(Q, μ, σ, A)
    evals, evecs = eigen(A).values, eigen(A).vectors
    one_eval = argmin(abs.(evals.-1))
    π = evecs[:, one_eval] / sum(evecs[:, one_eval])

    x_0 = LinRange(μ[1] - 4*σ[1], μ[1] + 4*σ[1], 10000)
    norm0 = Normal(μ[1], σ[1])
    fx_0 = π[1].*pdf.(norm0, x_0)
 
    x_1 = LinRange(μ[2] - 4*σ[2], μ[2] + 4*σ[2], 10000)
    norm1 = Normal(μ[2], σ[2])
    fx_1 = π[2].*pdf.(norm1, x_1)
 
    x = LinRange(μ[1] - 4*σ[1], μ[2] + 4*σ[2], 10000)
    fx = π[1].*pdf.(norm0, x) .+ π[2].*pdf.(norm1, x)
    
    #fig, ax = plt.subplots(1, 1, figsize=(12, 8))

    b = range(μ[1] - 4*σ[1], μ[2] + 4*σ[2], length=100)
    
    histogram(log.(Q), c=:lightgrey, normalize=:pdf, label="Log-Inflow")
    plot!(x_0, fx_0, c=:red, linewidth=3, label="Dry State Distr", xlabel="x", ylabel="P(x)")
    plot!(x_1, fx_1, c=:blue, linewidth=2, label="Wet State Distr")
    plot!(x, fx, c=:black, linewidth=2, label="Combined State Distr")
end

Great! Now let’s use this function to plot the distributions themselves.

# get the model parameter value means
μ₁_mean = mean(chains, :μ₁)
μ₂_mean = mean(chains, :μ₂)
σ₁_mean = mean(chains, :σ₁)
σ₂_mean = mean(chains, :σ₂)
a11_mean = mean(chains, "a11[2]")
A11_mean = mean(chains, "A[1,1]")
A12_mean = mean(chains, "A[1,2]")
A21_mean = mean(chains, "A[2,1]")
A22_mean = mean(chains, "A[2,2]")

# obtained from model above
μ = [μ₂_mean, μ₁_mean]
σ = [σ₂_mean, σ₁_mean]
A_mcmc = [A11_mean 1-A11_mean; 1-A22_mean A22_mean]

# plot the distributions
plot_dist(Q[!,:Inflow], μ, σ, A_mcmc)

This code block will output the following figure:

Figure 3: The overall (black), wet (blue) and dry (red) streamflow probability distributions.

From Figure 3, we can observe that the dry state streamflow distribution is different from that of the wet state. The HMM model correctly estimates a lower mean for the dry state, and a higher one for the dry state. Based on visual analysis, the states have similar standard deviations, but this is likely an artifact of the prior parameterization of the initial HMM model that we set up earlier. The step we took earlier to force the wet/dry state distributions given a sampled log-streamflow value also likely prevented mode collapse and enabled the identification of two distinct distributions.

Now, let’s see if our HMM can be used to generate synthetic streamflow. We do this using the following lines of code:

# plot the FDC of the sampled inflow around the historical 
sampled_obs_mcmc = zeros(1000, n_years)
Q_predict_mcmc = transpose(logQ_predict)
for i in 1:1000
    sampled_obs_mcmc[i,:] = exp.(Q_predict_mcmc[i, :])
    sampled_obs_mcmc[i,:] = sort!(sampled_obs_mcmc[i, :], rev=true)
end

hist_obs = copy(Q[!,:Inflow])
hist_obs = sort!(hist_obs, rev=true)
probs = (1:n_years)/n_years

plot(probs, hist_obs, c=:grey, linewidth=2, label="Historical inflow",
    xlabel="Exceedance prob.", ylabel="Annual inflow (cubic ft/yr)")
for i in 1:1000
    if i == 1
        plot!(probs, sampled_obs_mcmc[i,:], c=:lightgreen, linewidth=2,
            label="Sampled inflow (MCMC)", xlabel="Exceedance prob.", ylabel="Annual inflow (cubic ft/yr)")
    end

    plot!(probs, sampled_obs_mcmc[i,:], c=:lightgreen, linewidth=2, legend=false,
        xlabel="Exceedance prob.", ylabel="Annual inflow (cubic ft/yr)")
end

plot!(probs, hist_obs, c=:grey, linewidth=2, legend=false,
    xlabel="Exceedance prob.", ylabel="Annual inflow (cubic ft/yr)")

Once implemented, the following figure should be output:

Figure 4: The original historical streamflow timeseries (grey) and the synthetic streamflow generated by the HMM (green).

This demonstrates that the HMM model is capable of generating synthetic streamflow that approximates the statistical properties of the historical timeseries, while expanding upon the range of uncertainty in streamflow scenarios. This has implications for applications such as exploratory modeling of different streamflow scenarios, where such approaches can be used to generate a wetter- or dryer-than-historical streamflows.

Summary

While using MCMC to parameterize a HMM is a useful method to generate synthetic streamflow, one significant drawback is the long runtime and the large number of iterations that the MCMC algorithm takes to converge (Table 3).

Table 3: MCMC parameterization and runtime.

This is likely due to the HMM sampling setup. Notice that in lines 27 and 33-27 in the HMM_mcmc function: we are essentially fitting an additional probability distribution at each st essentially turning them into tunable parameters themselves. Since there are 105 elements in the streamflow timeseries, this introduces an additional 105 parameters for the MCMC to fit distributions to. This is the likely cause for the lengthy runtime. While we will not explore alternative HMM-MCMC parameterization approaches, more efficient alternatives definitely exist, and you are highly encouraged to check it out!

Overall, we began this post with a quick introduction to HMM and MCMC sampling. We then compared the pros and cons of using the MCMC approach to parameterize a HMM. Once we understood its benefits and drawbacks, we formulated a HMM for a 105-year long timeseries of annual streamflow and examined the performance of the model by analyzing its converged parameter values, observation probability distributions, as well as its ability to detect wet/dry states in the historical time series. We also used the HMM to generate 1,000 synthetic streamflow scenarios. Finally, we diagnosed the cause for the long time to convergence of the MCMC sampling approach.

This brings us to the end of the post! I hope you now have a slightly better understanding of HMMs, how they can be parameterized using the MCMC approach, as well as issues you might run into when you implement the latter.

Happy coding!

References

Akintug, B., & Rasmussen, P. F. (2005). A Markov switching model for annual hydrologic time series. Water Resources Research, 41(9). https://doi.org/10.1029/2004wr003605

Bracken, C., Rajagopalan, B., & Zagona, E. (2014). A hidden markov model combined with climate indices for Multidecadal Streamflow Simulation. Water Resources Research, 50(10), 7836–7846. https://doi.org/10.1002/2014wr015567

Hadjimichael, A., Quinn, J., & Reed, P. (2020). Advancing diagnostic model evaluation to better understand water shortage mechanisms in institutionally complex river basins. Water Resources Research, 56(10), e2020WR028079.

Hadjimichael, A., Quinn, J., Wilson, E., Reed, P., Basdekas, L., Yates, D., & Garrison, M. (2020a). Defining robustness, vulnerabilities, and consequential scenarios for diverse stakeholder interests in institutionally complex river basins. Earth’s Future, 8(7), e2020EF001503.

Reed, P.M., Hadjimichael, A., Malek, K., Karimi, T., Vernon, C.R., Srikrishnan, V., Gupta, R.S., Gold, D.F., Lee, B., Keller, K., Thurber, T.B, & Rice, J.S. (2022). Addressing Uncertainty in Multisector Dynamics Research [Book]. Zenodo. https://doi.org/10.5281/zenodo.6110623

Rabiner, L. R. (1990). A tutorial on Hidden Markov models and selected applications in speech recognition. Readings in Speech Recognition, 267–296. https://doi.org/10.1016/b978-0-08-051584-7.50027-9

Rydén, T. (2008). Em versus markov chain Monte Carlo for estimation of Hidden Markov models: A computational perspective. Bayesian Analysis, 3(4). https://doi.org/10.1214/08-ba326

Turek, D., de Valpine, P. and Paciorek, C.J. (2016) Efficient Markov Chain Monte Carlo Sampling for Hierarchical Hidden Markov Models, arXiv.org. Available at: https://arxiv.org/abs/1601.02698.

Deep Reinforcement Learning with PPO (the ML kind, not the health insurance kind)

As the name *may* have implied, today’s blog post will be about proximal policy optimization (PPO), which is a deep reinforcement learning (DRL) algorithm introduced by OpenAI in 2017. Before we proceed, though, let’s set a few terms straight:

  • State: An abstraction of the current environment that the agent inhabits. An agent observes the current state of the environment, and makes a decision on the next action to take.
  • Action: The mechanism that the agent uses to transition between one state to another.
  • Agent: The decision-maker and learner that takes actions based on its consequent rewards or penalties.
  • Environment: The dynamic system defined by a set of rules that the agent interacts with. The environment changes based on the decisions made and actions taken by the agent.
  • Reward/Penalty: The feedback received by the agent. Often in reinforcement learning, the agent’s objective is to maximize its reward or minimize its penalty. In this post, we will proceed under the assumption of a reward-maximization objective.
  • Policy: A model that maps the states to the probability distribution of actions. The policy can then be used to tell the agent to select actions that are most likely to result in the lowest penalty/highest reward for a given state.
  • Continuous control: The states, actions, and rewards take on analog continuous values (e.g. move the cart forward by 1.774 inches).
  • Discrete control: The states, actions, and rewards take on binary values (e.g. true/false, 1/0, left/right).

As per the last definition, the PPO is policy-based DRL algorithm that consists of two main steps:

  1. The agent interacts with the environment by taking a limited number of actions, and samples data from the reward it receives.
  2. The agent then makes multiple optimizations (policy updates) for an estimate (or “surrogate” as Schulman et al. 2017 calls it) of the reward-maximizing objective function using stochastic gradient ascent (SGA). This is where the weights of the loss function (the difference between actual and observed reward) are incrementally tuned as more data is obtained to result in the highest possible reward.

Note: If you are wondering what SGA is, look up Stochastic Gradient Descent — it’s the same thing, but reversed.

These steps address a couple of issues that other policy-based methods such as policy gradient optimization (PGO) and trust region policy optimization (TRPO) face. Standard PGO requires that the objective function be updated only once per data sample, which is computationally expensive given the number of updates that are typically required of such problems. PGO is also susceptible to potentially destructive policy updates where one round of optimization could result in the policy’s premature convergence, or failure to converge to the true maximum reward. On the other hand, the TRPO is complicated to implement and requires prior computations to optimize (and re-optimize) a secondary constraint function that defines the policy’s trust region (and hence the name). It is therefore difficult to implement, and lacks explainability.

PPO Plug

Unlike both standard PGO and TRPO, PPO serves to carefully balance the tradeoffs between ease of implementation, stable and reliable reward trajectories, and speed. It is particularly useful for training agents in continuous control environments, and achieves this in one of two ways:

  1. PPO with adaptive penalty: The penalty coefficient used to optimize the function defining the trust region is updated every time the policy changes to better to adapt the penalty coefficient so that we achieve an update that is both significant but does not overshoot from the true maximum reward.
  2. PPO with a clipped surrogate objective: This method is currently the more widely used version of PPO as it has been found to perform better than the former (Schulman et al, 2017; van Heeswijk, 2022). This PPO variant restricts – clips – the possible range within which the policy can be updated by penalizing any update where the ratio of the probability of a new action being taken given a state to that of the prior action exceeds a threshold.

The latest version of PPO, called PPO2 (screams creativity, no?) is GPU- and multiprocessor-enabled. In other words, its a more efficiently-parallelized version of PPO that enables the training over multiple environments at the same time. This is the algorithm we will be demonstrating next.

PPO Demo

As always, we first need to load some libraries. Assuming that you already have the usual suspects (NumPy, MatPlotlib, Pandas, etc), you will require the Gym and Stable-Baselines3 libraries. The former is a collection of environments that reference general RL problems, while the latter contains stable implementations of most RL algorithms written in PyTorch. To install both Gym and Stable-Baselines3 on your local machine, enter the following command into your command line:

pip install gym stable-baselines3

Once that it completed, create a Python file and follow along the code. This was adapted from a similar PPO demo that can be found here.

In the Python file, we first import the necessary libraries to run and record the training process. We will directly import the PPO algorithm from Stable-Baselines3. Note that this version of the PPO algorithm is in fact the more recent PPO2 algorithm.

# import libraries to train the mountain car
import gym
from stable_baselines3 import PPO

# import libraries to record the training
import numpy as np
import imageio

Next, let’s create the gym environment. For the purpose of this post, we will use the Mountain Car environment from the Gym library. The Mountain Car problem describes a deterministic Markov Decision Process (MDP) with a known reward function (and hence the name). In this problem, a car is placed at the bottom of a sinusoidal valley (Figure 1) and can take three discrete deterministic actions – accelerate to the right, accelerate to the left, or don’t accelerate – to gain enough momentum to push the car up to the flag on top of the mountain. In this environment, the goal of the agent is to learn a policy that will consistently accelerate the mountain car to reach the flag at the top of the hill in less than 200 episodes. This is where the agent completes a full training sequence within the pre-allocated number of time steps, and is otherwise known as an epoch in general machine learning.

Figure 1: Episode 1 (untrained) of the Mountain Cart.

Aite, so let’s create this environment!

# make the gym environment
env_mtcar = gym.make('MountainCar-v0')

Next, we setup an actor-critic multi-layer perceptron model and apply the PPO2 algorithm to train the mountain cart. Here, we want to view the information output by the model, we we will set verbose = 1. We will then allow the model learn for over 100,000 timesteps per episode.

# setup the model 
# Implement actor critic, using a multi-layer perceptron (2 layers of 64) in the pre-specified environment
model = PPO("MlpPolicy", env_cartpole, verbose=1)

# return a trained model that is trained over 10,000 timesteps
model.learn(total_timesteps=10_000)

Let’s take a look at the starting point of the environment. In general, it’s good practice to use the .reset() function to return the environment to it’s starting state after every episode. Here, we also initialize an array of images that we will later combine using the imageIO library to form a GIF.

# get the environment
vec_env = model.get_env()

# reset the environment and return an array of observations whenever a new episode is started
obs = vec_env.reset()

# initialize the GIF to record
images = []
img = model.env.render(mode='rgb_array')

For the Mountain Car environment, the obs variable is a 2-element array where the first element describes the position of the car along the x-axis, and the second element describes the velocity of the car. After a reset, the obs variable should print to look something like [[-0.52558255 0. ]] where the velocity is zero (stationary).

Next, we take 1,000 random actions just to see how things look like. Before each action is taken, we take a snapshot of the prior action and append it to the list of images we initialized earlier. Next, we predict what the next action and hidden state (denoted by the “_” at the beginning of the variable name) is given the current state, provided that its actions are deterministic. We then use this new action to take a step to return the resulting observation and reward. The reward variable will return a value of -1 if an additional action was taken which did not result in reaching the flag. The done variable is a boolean that indicates if the car successfully reached the flag. If done=TRUE, reset the environment to its starting state. Otherwise, continue learning from the environment. We also create a new snapshot of this current action and render it as an image to be later added to the image list.

# train the car
for i in range(1000):
    # append a snapshot of the current episode to the array
    images.append(img)
    
    # get the policy action from an observation and the optional hidden state
    # return the deterministic actions 
    action, _states = model.predict(obs, deterministic=True)
    
   # step the environment with the given action
    obs, reward, done, info = vec_env.step(action)
    
   
   if (i%500) == 0:
        print("i=", i)
        print("Observation=", obs)
        print("Reward=", reward)   
        print("Done?", done)
        print("Info=", info)
  
    if done:
       obs = env.reset()
    
    img = model.env.render(mode='rgb_array')
    vec_env.render()

Finally, we convert the list of images to a GIF and close the environment:

imageio.mimsave('mtcart.gif', [np.array(img) for i, img in enumerate(images) if i%2 == 0], fps=29)

You should see the mtcart.gif file saved in the same directory that you have your code file in. This GIF should look similar to Figure 2:

Figure 2: The mountain car environment progress for a 10,000 timestep training environment.

Conclusion

Overall, PPO is relatively simple but powerful reinforcement learning algorithm to implement, with recent applications in video games, autonomous driving, and continuous control problems in general. This post provided you with a brief but thorough overview of the algorithm and a simple example application to the Mountain Cars environment, and I hope that it motivates you to further check it out and explore other environments as well!

References

King, J. (2020) Driving up a mountain, A Random Walk. Available at: https://jfking50.github.io/mountaincar/ (Accessed: April 12, 2023).

Proximal policy optimization (no date) Proximal Policy Optimization. Available at: https://openai.com/research/openai-baselines-ppo (Accessed: April 12, 2023).

Schulman, J. et al. (2017) Proximal policy optimization algorithms, arXiv.org. Available at: https://arxiv.org/abs/1707.06347 (Accessed: April 11, 2023).

Srinivasan, A.V. (2019) Stochastic gradient descent - clearly explained !!, Medium. Towards Data Science. Available at: https://towardsdatascience.com/stochastic-gradient-descent-clearly-explained-53d239905d31 (Accessed: April 11, 2023).

Wouter van Heeswijk, P.D. (2023) Proximal Policy Optimization (PPO) explained, Medium. Towards Data Science. Available at: https://towardsdatascience.com/proximal-policy-optimization-ppo-explained-abed1952457b (Accessed: April 11, 2023).

Creating Interactive Geospatial Maps in Python with Folium

Interactive mapping and data visualization provide data scientists and researchers with a unique opportunity to explore and analyze geospatial data, and to share their work with stakeholders in a more engaging and accessible way.

Personally, I’ve found that constructing an interactive map for my own research, and iteratively updating it as the work progresses, has been great for improving my own understanding of my project. My project map (not the one in this demo) has been a helpful narrative aid when introducing someone to the project.

I’ve had a lot of success using the folium Python package to create interactive maps, and want to share the tool with you all here.

In this post, I provide a demonstration on how to use the folium package to create an HTML based interactive map of a hydrologic watershed.

Folium: a quick introduction

Folium is a Python package that is used to create interactive maps.

It is built on top of the Leaflet JavaScript library and can be used to visualize geospatial data in unique ways. Using folium, users construct maps by adding various features including lines, markers, pop-up text boxes, and different base-maps (aka, tiles), and can export maps as interactive HTML documents.

The full Folium documentation is here, however I think it is best to learn through an example.

Demo: Mapping a hydrologic basin data with Folium

All of the code and data used in this post is located in a GitHub repository: trevorja/Folium_Interactive_Map_Demo.

To interact with the demo map directly, download the file here: basin_map.html

The folium_map_demo.ipynb Jupyter Notebook steps through the process shown below and will re-create the map shown above. You can create a similar plot for a different basin by changing the station_id number inside ‘retrieve_basin_data.ipynb’ to a specific USGS gauge of interest.

Here is a brief video showcasing the interaction with the final map:

Map Data

For this demo, I am plotting various features within a hydrologic basin in northern New Mexico.

All of the data in the map is retrieved using the pynhd package from the HyRiver suite for python. For more information about these packages, see my previous post, Efficient hydroclimatic data accessing with HyRiver for Python.

The script ‘retrieve_basin_data.ipynb’ is set up to retrieve several features from the basin, including:

  • Basin boundary
  • Mainstem river
  • Tributary rivers
  • USGS gauge locations
  • New Mexico Water Data Initiative (NMWDI) Sites
  • HUC12 pour points

The geospatial data (longitude and latitudes) for each of these features are exported to data/basin_data.csv and used later to generate the folium map.

Constructing a folium hydrologic map

Like many other data visualization programs, folium maps are constructed by iteratively adding features or map elements to the main map object. It is easy to add a map feature, however it takes more care to ensure that the features are well-organized and respond to user interaction the way you want them to.

In this demo, I deconstruct the process into five parts:

  1. Initializing the map
  2. Initializing feature groups (layers)
  3. Adding points to feature layers
  4. Adding polygons & lines to feature layers
  5. Adding layers onto the map

Before going any further, we need to import the necessary packages, load our basin data, and convert the geospatial data to geopandas.GeoDataFrame() geometry objects:

# Import packages
import pandas as pd
import folium
from folium.plugins import MarkerCluster
import geopandas as gpd
from shapely.geometry import Point, LineString
from shapely import wkt

# Specify the coordinate reference system (CRS)
crs = 4386

# Load the basin data
basin_data = pd.read_csv('./data/basin_data.csv', sep = ',', index_col=0)

# Convert to a geoDataFrame and geometry objects
basin_data['geometry'] = basin_data['geometry'].apply(wkt.loads)
geodata = gpd.GeoDataFrame(basin_data, crs = crs)

Additionally, I start by specifying a couple design options before getting started (colors, line widths, etc.). I do this at the start so that I can easily test different colors, and I store all of these specifications in a dict. This step is not necessary, but just helps to stay organized. The following code block shows some of these specifications, but some are left out just to make this post shorter:

## Plot options
# Line widths
basin_linewidth = 2
mainstem_linewidth = 3
tributary_linewidth = 1

# ...
# More color and design specs here...
# ...

# Combine options into a dictionary
plot_options = {
    'station':{'color': usgs_color,
               'size': usgs_size},
    'pourpoint': {'color': pourpoint_color,
                  'size': pourpoint_size},
    'nmwdi': {'color': nmwdi_color,
              'size': nmwdi_size}
              }

With that out of the way, we will start mapping.

Initializing the map

We start to construct our map using the folium.Map() function:

# Initialize the map
geomap = folium.Map(location = [36.5, -106.5],
                    zoom_start = 9.2,
                    tiles = 'cartodbpositron',
                    control_scale = True)

The location and zoom_start arguments set the default view; the user will be able to pan and zoom around the map, but this will be the starting location.

The tile argument in the initial folium.Map() calls sets the default base-map style, but different styles can be added using the folium.TileLayer() function. Each TileLayer() call adds a different base-map style that is then available from the drop-down menu in the final figure.

## Add different tiles (base maps)
folium.TileLayer('openstreetmap').add_to(geomap)
folium.TileLayer('stamenwatercolor').add_to(geomap)
folium.TileLayer('stamenterrain').add_to(geomap)

Here is a side-by-side comparison of the four different tiles used in this demo:

Initializing feature groups (layers)

Before we add any lines or makers to the map, it is best to set up different layers using folium.FeatureGroup(). When the map is complete, the different feature groups can be toggled on-and-off using the drop-down menu.

Below, I initialize a folium.FeatureGroup for each of the six feature types in the map.

# Start feature groups for toggle functionality
basin_layer = folium.FeatureGroup(name = 'Basin Boundary', show=True)
usgs_layer = folium.FeatureGroup(name = 'USGS Gauges', show=True)
mainstem_layer = folium.FeatureGroup(name = 'Mainstem River', show=True)
tributary_layer = folium.FeatureGroup(name='Tributary Rivers', show=False)
pourpoint_layer = folium.FeatureGroup(name= 'HUC12 Pour Points', show=False)
nmwdi_layer = folium.FeatureGroup(name='NM Water Data Initiative Gauge', show=True)

The name argument is what will be displayed on the drop-down menu. The show argument indicates whether that layer is visible by default (when the map is first opened), or whether it first needs to be selected using the drop-down menu.

Now that the layers are initialized, we can begin adding the features (polygons, lines, and point markers) to each layer.

Adding points to feature layers

The folium.CircleMarker() is used to add circle points using a specific coordinate location.

The following code shows how I am iteratively adding different points to the different layers of the map based upon the feature type.

For each point, I extract the latitude and longitude (coords = [point.geometry.y, point.geometry.x]) and pass it to the folium.CircleMarker() function with colors and sizes specific to each of the three different point features (USGS gauge stations (station), NMWDI (nmwdi), and HUC12 pourpoints (pourpoint)):

plot_types = ['station', 'nmwdi', 'pourpoint']
plot_layers = [usgs_layer, nmwdi_layer, pourpoint_layer]

# Loop through the different feature point types
for i, feature_type in enumerate(plot_types):

	# Specify the correct feature layer to add the point too
	map_layer = plot_layers[i]

	# Add each point
    for _, point in geodata.loc[geodata['type'] == feature_type].iterrows():    
        coords = [point.geometry.y, point.geometry.x]
        
        # Add the popup box with description
        textbox = folium.Popup(point.description,
                               min_width= popup_width,
                               max_width= popup_width)

		# Add the marker at the coordinates with color-coordination
        folium.CircleMarker(coords,
                            popup= textbox,
                            fill_color = plot_options[feature_type]['color'],
                            fill = True,
                            fill_opacity = fill_opacity,
                            radius= plot_options[feature_type]['size'],
                            color = plot_options[feature_type]['color']).add_to(map_layer)

Adding polygons & lines to feature layers

I’ve found that it can be difficult to add Polygons or Lines to a folium map if the coordinate geometry are not formatted correctly. For me, the best method has been to convert the polygon or line data to a geopandas.GeoSeries and then converting this to JSON format data using the .to_json() method.

Once in JSON format, the data can be added to the map using the folium.GeoJson() method similar to other features. Although, rather than adding it to the map directly, we add it to the specific feature layer so that we can toggle things later.

Below, I show how I add the basin boundary polygon to the map. This needs to be repeated for the mainstem and tributary river lines, and the full code is included in folium_map_demo.ipynb.

## Plot basin border
for i,r in geodata.loc[geodata['type'] == ].iterrows():
	# Convert the Polygon or LineString to geoJSON format
    geo_json = gpd.GeoSeries(r['geometry']).simplify(tolerance = 0.000001).to_json()
    geo_json = folium.GeoJson(data= geo_json,
                              style_function=lambda x: {'fillcolor': 'none',
                                                        'weight': basin_linewidth,
                                                        'color': basin_color,
                                                        'opacity': 1,
                                                        'fill_opacity': 1,
                                                        'fill': False})
	# Add popup with line description
    folium.Popup(r.description,
                 min_width = popup_width,
                 max_width= popup_width).add_to(geo_json)
    
    # Add the feature to the appropriate layer
    geo_json.add_to(basin_layer)

And with that, the hard part is done.

Last step: adding layers onto map

Now, if you try to visualize the map it will be empty! This is because we have not added the feature layers to the map. In this last step, we add each of the six feature layers to the map and also add the folium.LayerControl() which will allow for us to toggle the different layers on-and-off:

# Add all feature layers to the map
basin_layer.add_to(geomap)
usgs_layer.add_to(geomap)
mainstem_layer.add_to(geomap)
tributary_layer.add_to(geomap)
pourpoint_layer.add_to(geomap)
nmwdi_layer.add_to(geomap)

# Add the toggle option for layers
folium.LayerControl().add_to(geomap)

Ready for the grand reveal?

Viewing, saving, and sharing the map

Viewing your map is as easy as calling the map name at any point in the script (i.e., geomap), and folium makes it easy to save the map as an HTML using the map.save() function as shown here:

# Save and view the map
geomap.save("basin_map.html")
geomap

Once you have your HTML saved, and you’ve taken a moment to open it on your computer and made sure that the features are situated nicely, then it comes time to share. Other users can view the maps simply by opening the HTML file on their local machine, or you can add the HTML to a website.

Concluding thoughts

I hope you’ve found some inspiration here, and find a way to incorporate interactive geospatial mapping in your project. I don’t think it can be overstated how much an interactive visual such as a folium map can serve to broaden the access to your dataset or model.

Thanks for reading!

Detecting Causality using Convergent Cross Mapping: A Python Demo using the Fisheries Game

This post demonstrates the use of Convergent Cross Mapping (CCM) on the Fisheries Game, a dynamic predator-prey system. To conduct CCM, we’ll use the causal_ccm Python package by Prince Joseph Erneszer Javier, which you can find here. This demo follows the basic procedure used in the tutorial for the causal_ccm package, which you can find here.

All code in this blog post can be found in a Jupyter notebook in this Github repo.

Convergent Cross Mapping

CCM is a technique for understanding causality in dynamical systems where the effects of causal variables cannot be separated or uncoupled from the variables they influence (Sugihara et al., 2012). Rohini wrote a great blog post about CCM in 2021, so I’ll just provide a quick summary here and refer readers to Rohini’s post for more details.

CM harnesses the idea that the dynamics of a system can be represented by an underlying manifold, which can be approximated using lagged information from the time series of each variable of interest. If variable X has a causal effect on variable Y, then information about X should be encoded in variable Y, and we can “recover” historical states of X from the historical time series of Y. Using this concept, CCM develops “shadow manifolds” for system variables, and examines the relationships between shadow manifolds using cross mapping, which involves sampling nearest-neighbor points in one manifold, and determining if they correspond to neighboring points in the other. For a more detailed explanation of CCM, I highly recommend reading Sugihara et al., 2012. I also found the series of videos created by the authors to be extremely helpful. You can find them here.

Simulating the fisheries predator-prey system

We’ll start by simulating the predator-prey system used in Hadjimichael et al., (2020). This system models the population dynamics of two species, a predator and a prey. The simulation uses two differential equations to model the dynamics of the predator-prey system:

\frac{dx}{dt} = bx(1-\frac{x}{K}) - \frac{\alpha x y}{y^{m }+ \alpha h x} - zx


\frac{dy}{dt} = \frac{c \alpha x y}{\alpha h x} -dy

Where x and y represent the prey and predator population densities, t represents the time in years, a is the prey availability, b is the prey growth rate, c is the rate at which prey is converted to new predators, d is the predator death rate, h is the time needed to consume prey (called the handling time), K is the carrying capacity for prey, m is the level of predator interaction, and z is the harvesting rate. In the cell below, we’ll simulate this system for a given set of environmental parameters (a, b etc.), and plot the dynamics over 120 time periods. For more details on the Fisheries system, see the training blog posts by Lillian and Trevor (part 0, part 1, part 2). There is also an interactive Jupyter notebook in our recently published eBook on Uncertainty Characterization for Multisector Dynamics Research.

import numpy as np
from matplotlib import pyplot as plt

# assign default parameters
tsteps = 120 # number of timesteps to simulate
a = 0.01 # prey availability
b = 0.25 # prey growth rate
c = 0.30 # rate that prey is converted to predator
d = 0.1 # predator death rate
h = 0.61 # handling time
K = 1900 # prey carrying capacity
m = .20 # predator interference rate

# create arrays to store predator and prey populations
prey = np.zeros(tsteps+1)
pred = np.zeros(tsteps+1)

# initial population levels
prey[0] = 28
pred[0] = 28

# harvesting, which we will keep at 0 for this example
z = np.zeros(len(prey))

# simulate the system
for t in range(tsteps):
    if prey[t] > 0 and pred[t] > 0:
        prey[t + 1] = (prey[t] + b * prey[t] * (1 - prey[t] / K) - (a * prey[t] * pred[t]) / (np.power(pred[t], m) +
                                                            a * h * prey[t]) - z[t] * prey[t])   # Prey growth equation
        pred[t + 1] = (pred[t] + c * a * prey[t] * pred[t] / (np.power(pred[t], m) + a * h *
                                                            prey[t]) - d * pred[t]) # Predator growth equation

# plot the polulation dynamics
fig = plt.figure(figsize=(6,6))
plt.plot(np.arange(tsteps+1), prey, label = 'Prey')
plt.plot(np.arange(tsteps+1), pred, label = 'Predator')
plt.legend(prop={'size': 12})
plt.xlabel('Timestep', size=12)
plt.ylabel('Population Size', size=12)
plt.title('Population Dynamics', size=15)
Figure 1: predator and prey population dynamics

With these parameters, we can visualize the trajectory and direction field of the system of equations, which is shown below (I’m not including ploting code here for brevity, but see this tutorial if you’re interested in making these plots). In CCM terms, this is a visualization of the underlying manifold of the dynamical system.

Figure 2: Trajectories and direction fields of the predator-prey system

Causal detection with CCM

From the system of equations above, we know there is a causal relationship between the predator and prey, and we have visualized the common manifold. Below, we’ll test whether CCM can detect this relationship. If the algorithm works as it should, we will see a clear indication that the two populations are causally linked.

To conduct CCM, we need to specify the number of dimensions to use for shadow manifolds, E, the lag time, tau, and the library size, L.

The CCM algorithm should converge to a stable approximation of causality as the library size increases. Below we’ll test library sizes from 10 to 100 to see if we achieve convergence for the Fisheries system. We’ll start by assuming shadow manifolds have two dimensions (E=2), and a lag time of one time step (tau=1). To test convergence, we’ll plot the correlation (rho) between the shadow manifold predictions and the historical states of the two variables.

from causal_ccm.causal_ccm import ccm
E = 2 # dimensions of the shadow manifold
tau = 1 # lag time
L_range = range(10, 100, 5) # test a range of library sizes
preyhat_Mpred, predhat_Mprey = [], [] # correlation list
for L in L_range:
    ccm_prey_pred = ccm(prey, pred, tau, E, L) # define new ccm object # Testing for X -> Y
    ccm_pred_prey = ccm(pred, prey, tau, E, L) # define new ccm object # Testing for Y -> X
    preyhat_Mpred.append(ccm_prey_pred.causality()[0])
    predhat_Mprey.append(ccm_pred_prey.causality()[0])

# Plot Cross Mapping Convergence
plt.figure(figsize=(6,6))
plt.plot(L_range, preyhat_Mpred, label='$\hat{Prey}(t)|M_{Prey}$')
plt.plot(L_range, predhat_Mprey, label='$\hat{Pred}(t)|M_{Pred}$')
plt.ylim([0,1])
plt.xlabel('Library Size', size=12)
plt.ylabel(r'$\rho$', size=12)
plt.legend(prop={'size': 12})
Figure 3: Convergence in cross mapping estimates

Figure 3 shows that CCM does seem to detect a causal relationship between the predator and prey (rho is far above 0), and the estimated strength of this relationship starts to stabilize (converge) with a library size of around 60.

Next, we’ll examine how our lag time (tau) used to construct the shadow manifolds impacts our findings.

from causal_ccm.causal_ccm import ccm
E = 2 # dimensions of the shadow manifold
L = 60 # length of library (we'll return to this later)

# first, test different lags for construction of the shadow manifolds
# we'll test lags from 0 to 30 time steps
preyhat_Mpred, predhat_Mprey = [], []  # lists to store correlation
for tau in range(1, 20):
    ccm_prey_pred = ccm(prey, pred, tau, E, L)  # define new ccm object # Testing for prey -> pred
    ccm_pred_prey = ccm(pred, prey, tau, E, L)  # define new ccm object # Testing for pred -> prey
    preyhat_Mpred.append(ccm_prey_pred.causality()[0]) # stores prey -> pred
    predhat_Mprey.append(ccm_pred_prey.causality()[0]) # stores pred -> prey

# plot the correlation for different lag times
plt.figure(figsize=(6,6))
plt.plot(np.arange(1,20), preyhat_Mpred, label='$\hat{Prey}(t)|M_{Prey}$')
plt.plot(np.arange(1,20), predhat_Mprey, label='$\hat{Pred}(t)|M_{Pred}$')
plt.ylim([0,1.01])
plt.xlim([0,20])
plt.xticks(np.arange(20))
plt.xlabel('Lag', size=12)
plt.ylabel(r'$\rho$', size=12)
plt.legend(prop={'size': 12})
Figure 4: The impact of lag time, tau, on CCM estimates

The results in Figure 4 indicate that a lag time of 1 (which we initially assumed) does not adequately capture the causal relationship between the two variables. When lag times are set between 5 and 10, CMM shows a much stronger relationship between the two variables. Using this information, we can again test the convergence across different library sizes.

E = 2 # dimensions of the shadow manifold
tau = 5
L_range = range(10, 100, 5)
preyhat_Mpred, predhat_Mprey = [], [] # correlation list
for L in L_range:
    ccm_prey_pred = ccm(prey, pred, tau, E, L) # define new ccm object # Testing for X -> Y
    ccm_pred_prey = ccm(pred, prey, tau, E, L) # define new ccm object # Testing for Y -> X
    preyhat_Mpred.append(ccm_prey_pred.causality()[0])
    predhat_Mprey.append(ccm_pred_prey.causality()[0])

# Plot Cross Mapping Convergence
plt.figure(figsize=(6,6))
plt.plot(L_range, preyhat_Mpred, label='$\hat{Prey}(t)|M_{Prey}$')
plt.plot(L_range, predhat_Mprey, label='$\hat{Pred}(t)|M_{Pred}$')
plt.ylim([0,1])
plt.xlabel('Library Size', size=12)
plt.ylabel(r'$\rho$', size=12)
plt.legend(prop={'size': 12})
Figure 5: With a lag of 5, CCM converges much faster, and detects a much stronger relationship between predator and prey

In the Figure 5, we observe that CCM with a lag size of 5 converges much faster and generates a stronger correlation than our original estimate using tau = 1. In fact, CCM can reconstruct the historical system states almost perfectly. To see why we can visualize the underlying shadow manifolds and cross mapping conducted for this analysis (this is conveniently available in the causal_ccm package with the visualize_cross_mapping function).

# set lag (tau) to 7 and examine results
tau = 5

# prey -> predator
ccm1 = ccm(prey,pred, tau, E, L) # prey -> predator
print("prey -> predator: " + str(ccm1.causality()[0]))

# visualize the cross mapping from the two shadow manifolds
ccm1.visualize_cross_mapping()
Figure 6: Shadow manifolds and cross mapping between prey (shown as X) and predator (shown as Y).

Figure 6 shows the two shadow manifolds for prey (X in these plots) and predator (Y in these plots). We observe that the shapes of the shadow manifolds preserve the general characteristics of the original manifold, as shown by the trajectories plotted in Figure 2. Figure 6 also shows the the nearest neighbor points sampled in each manifold (blue boxes) and their mapping to the other variable’s shadow manifold (red stars). We can see how similar points on one manifold correspond to similar points on the other in both directions.

We can also use the causal_ccm package to visualize the correlation between the prey and the CCM prey estimates, with very impressive results (Figure 7).

Figure 7: The relationship between observed populations and CCM estimates.

Concluding thoughts

This example demonstrates that CCM can indeed detect the causal relationship between predator and prey in this system and in fact, provides extremely accurate reconstructions of both population sets. This shouldn’t come as a surprise since we knew from the start that a strong causal relationship does exist within this system. Still, I find it almost unnerving how well a job CCM manages to do here. For more info on CCM and coding CCM, see the links below:

A great tutorial of CCM by the author of the causal_ccm Python package

Rohini’s blog post (with a demo of CCM in R)

Video links teaching CCM core concepts

References:

Hadjimichael, A., Reed, P., & Quinn, J. (2020). Navigating Deeply Uncertain Tradeoffs in Harvested Predator-Prey Systems. Complexity, 2020, 1-18. https://doi.org/10.1155/2020/4170453

Sugihara, G., May, R., Ye, H., Hsieh, C. H., Deyle, E., Fogarty, M., & Munch, S. (2012). Detecting causality in complex ecosystems. science338(6106), 496-500.

Using drawdata and Mercury in Jupyter Notebooks

I wrote a post last year on Enhancing Jupyter Notebooks for Teaching. Through my time at Cornell as a TA or mentoring our undergrad researchers, I’ve learned how important it is to find fun and interesting ways for students to play with data and to feel more comfortable diving into traditionally difficult topics like machine learning. This post features two cool new libraries that were brought to my attention recently that can help jazz up your tutorials:

(1) drawdata: allows a student to interactively draw a dataset (lines, histograms, scatterplots) with up to four different labels. The dataset and labels can be saved as JSONs and CSVs and also directly copied into Pandas dataframes. This can be useful to facilitate interactive machine learning tutorials.

(2) Mercury: converts a Jupyter notebook to web app. This is especially useful for classroom tutorials because it doesn’t require that students have Jupyter installed or even Python.

I’ve combined both of these functionalities into a notebook that is focused on classification of a dataset with 2 labels using support vector machines (SVMs) that can be found here.

drawdata

First let’s install drawdata by typing pip install drawdata into our command line. Next, let’s follow through the steps of the Jupyter notebook. We’ll import the rest of our libraries and draw out a simple linearly-separable dataset.

By clicking “copy csv” we can copy this exact dataset to a Pandas dataframe. Upon inspection, we see that each datapoint has 2 features (an x and y coordinate) and a label that is either “a” or “b”. Let’s also adjust the labels to be [-1,1] for the purposes of using some classifiers from scikit- learn. We then fit a very basic support vector classifier and plot the decision boundary.

data=pd.read_clipboard(sep=",")

#Rename the labels to integers

for i in range(0, len(data)):
    # checking if the character at char index is equivalent to 'a'
    if(data.iloc[i,2] == 'a'):
        # append $ to modified string
        data.iloc[i,2] = -1
    else:
        # append original string character
        data.iloc[i,2] = 1
data.iloc[:,2]=data.iloc[:,2].astype('int')


#Create our datasets

X=np.array(data.iloc[:,0:2])
y=np.array(data.iloc[:,2])


#Create a 60/40 training and test split 

X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.4, random_state=42)
    
#Fit classifier 

clf=svm.SVC(kernel='linear',C=0.025)
clf.fit(X_train, y_train)
score = clf.score(X_test, y_test)
print("The classification accuracy is", score)
#The classification accuracy is 1.0


#Plot original dataset that you drew 

cm = plt.cm.get_cmap('cool_r')
cm_bright = ListedColormap(["purple", "cyan"])
figure = plt.figure(figsize=(8, 6))
ax = plt.subplot(1,1,1)
#Plot decision boundary
ax.set_title("Classification Boundary")
DecisionBoundaryDisplay.from_estimator(clf, X, cmap=cm, alpha=0.8, ax=ax, eps=0.5)
# Plot the training points
ax.scatter(X_train[:, 0], X_train[:, 1], c=y_train,cmap=cm_bright , edgecolors="k")
# Plot the testing points
ax.scatter(X_test[:, 0], X_test[:, 1], c=y_test, cmap=cm_bright, alpha=0.6, edgecolors="k")

We can also plot the margin and support vectors.

This is a straightforward classification example, but we can also look at non-linearly separable examples.

Ruh roh! Our linear classifier is never going to work for this dataset! We’ll have to use the kernel trick- that is, we need to map our data to a higher dimensional space with an additional feature that may be able to better help us separate out the two classes. We can create an additional feature that captures the distance from a central point (300, 300). This allows us to find a hyperplane that can separate the points in a higher dimensional space.

#Create an additional dimension
r = np.sqrt((X[:, 0]-300)**2+(X[:, 1]-300)**2)

def plot_3D(elev=30, azim=30, X=X, y=y):
    ax = plt.subplot(projection='3d')
    ax.scatter3D(X[:, 0], X[:, 1], r, c=y, s=50, cmap=cm_bright,edgecolors="k")
    ax.view_init(elev=elev, azim=azim)
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    ax.set_zlabel('r')
    

interact(plot_3D, elev=(-90, 90), azip=(-180, 180),
         X=fixed(X), y=fixed(y));

Thus, we can implement a similar radial basis function kernel in our SVC rather than using a linear kernel to define our non-linear classification boundary.

Mercury

Once you have a notebook that you are satisfied with, you can make it into an interactive web app by adding a YAML header. The YAML header also facilitates the ability interact with certain variables in the code and then run it. For example, users can select a classifier from a drop down menu or slide through different values of C. Finally they click the run button to execute the notebook.

A YAML header is added to the first RAW cell in the notebook. Here I want the user to be able to slide through the hardness of the margin (between 0 and 100).

---
title: SVM Classifier
description: Implement linear and non-linear classifiers
show-code: True
params:
    C_margin:
        input: slider 
        label: Value of Margin 
        value: 0.025
        min: 0
        max: 100
---

Then we install mercury (pip install mljar-mercury) and go to the command line and type mercury watch Classification_Tutorial.ipynb. The prompt will create a local address that you can put in your browser. The users can then draw their own dataset, adjust the hardness of the margin, and run the notebook.

Resources

Some of the SVM plotting code was borrowed from: https://jakevdp.github.io/PythonDataScienceHandbook/05.07-support-vector-machines.html

More info on Mercury can be found in this post: https://medium.com/@MLJARofficial/mercury-convert-jupyter-notebook-to-a-web-app-by-adding-yaml-header-872ce4e53676

Markdown-Based Scientific and Computational Note Taking with Obsidian

Motivation

Over the last year, being in the Reed Research Group, I have been exposed to new ideas more rapidly than I can manage. During this period, I was filling my laptop storage with countless .docx, .pdf, .txt, .html files semi-sporadically stored across different cloud and local storages.

Finding a good method for organizing my notes and documentation has been very beneficial for my productivity, processing of new ideas, and ultimately staying sane. For me, this has been done through Obsidian.

Obsidian is an open-source markdown (.md) based note taking and organization tool, with strengths being:

  • Connected and relational note organization
  • Rendering code and mathematical (LaTeX) syntax
  • Community-developed plugins
  • Light-weight
  • A nice aesthetic

The fact that all notes are simply .md files is very appealing to me. I don’t need to waste time searching through different local and cloud directories for my notes, and can avoid having OneNote, Notepad, and 6 Word windows open at the same time.

Also, I like having localized storage of the notes and documentation that I am accumulating. I store my Obsidian directory on GitHub, and push-pull copies between my laptop and desktop, giving me the security of having three copies of my precious notes at all times.

I’ve been using Obsidian as my primary notetaking app for a few months now, and it become central to my workflow. In this post, I show how Obsidian extends Markdown feature utility, helps to organize topical .md libraries, and generally makes the documentation process more enjoyable. I also try to explain why I believe Obsidian is so effective for researchers or code developers.

Note Taking in Markdown

Markdown (.md) text is an efficient and effective way of not just documenting code (e.g., README.md files), but is great for writing tutorials (e.g., in JupyterNotebook), taking notes, and even designing a Website (as demonstrated in Andrew’s Blog Post last week).

In my opinion, the beauty is that only a few syntax tricks are necessary for producing a very clean .md document. This is particularly relevant as a programmer, when notes often require mathematical and code syntax.

I am not going to delve into Markdown syntax. For those who have not written in Markdown, I suggest you see the Markdown Documentation here. Instead, I focus on how Obsidian enables a pleasant environment in which to write .md documentation.

Obsidian Key Features

Below, I hit on what I view as the key strengths of Obsidian:

  • Connected and relational note organization
  • Rendering code and mathematical (LaTeX) syntax
  • Community-developed plugins

Relational Note Organization

If you take a quick look at any of the Obsidian forums, you will come across this word: Zettelkasten.

Zettelkasten, meaning "slip-box" in German and also referred to as card file, is a note taking strategy which encourages the use of many highly-specific notes which are connected to one another via references.

Obsidian is designed to help facilitate this style of note taking, with the goal of facilitating what they refer to as a "second brain". The goal of this relational note taking is to store not just information about a single topic but rather a network of connected information.

To this end, Obsidian helps to easily navigate these connections and visualize relationships stored within a Vault (which is just a fancy word for one large folder directory). Below is a screen cap of my note network developed over the last ~4 months. On the left is a visualization of my entire note directory, with a zoomed-in view on the right.

Notice the scenario discovery node, which has direct connections to methodological notes on Logistic Regression, Boosted Trees, PRIM, and literature review notes on Bryant & Lempert 2010, a paper which was influential in advocating for participatory, computer-assisted scenario discovery.

Each of these nodes is then connected to the broader network of notes and documentation.

These relations are easily constructed by linking other pages within the directory, or subheadings within those pages. When writing the notes, you can then click through the links to flip between files in the directory.

Links to Internal Pages and Subheadings

Referencing other pages (.md files) in your library is done with a double square bracket on either side: [[Markdown cheatsheet]]

You can link get down to finer resolution and specifically reference various levels of sub-headings within a page by adding a hashtag to the internal link, such as: [[Markdown cheatsheet#Basics#Bold]]

Tags and Tag Searches

Another tool that helps facilitate relational note taking is the use of #tags. Attach a # to any word within any of your notes, and that word becomes connected to other instances of the word throughout the directory.

Once tags have been created, they can be searched. The following Obsidian syntax generates a live list of occurrences of that tag across your entire vault:

```query
tag: #scenarios 

Which produces the window:

Rending code and math syntax

Language-Specific Code Snippets

Obsidian will beautifully stylized code snippets using language-specific formatting, and if you don’t agree then you change change your style settings.

A block of code is specified, for some specific language using the tripple-tic syntax as such:

```langage
<Enter code here>

The classic HelloWorld() function can be stylistically rendered in Python or C++:

LaTeX

As per usual, the $ characters are used to render LaTeX equations. Use of single-$ characters will results in in-line equations ($<Enter LaTeX>$) with double-$$ used for centered equations.

Obsidian is not limited to short LaTeX equations, and has plugins designed to allow for inclusion other LaTeX packages or HTML syntaxes.

latex $$ \phi_{t,i} = \left\{ \begin{array}\\ \phi_{t-1,i} + 1 & \text{if } \ z_t < \text{limit}\\ 0 & \text{otherwise.} \end{array} \right. $$

will produce:

Plugins

Obsidian boasts an impressive 667 community-developed plugins with a wide variety of functionality. A glimpse at the webpage shows plugins that give more control over the visual interface, allow for alternative LaTeX environments, or allow for pages to be exported to various file formats.

Realistically, I have not spent a lot of time working with the plugins. But, if you are someone who likes the idea of a continuously evolving and modifiable documentation environment then you may want to check them out in greater depth.

Conclusion: Why did I write this?

This is not a sponsered post in any way, I just like the app.

When diving into a new project or starting a research program, it is critical to find a way of taking notes, documenting resources, and storing ideas that works for you. Obsidian is one tool which has proven to be effective for me. It may help you.

Best of luck with your learning.

Enhancing Jupyter Notebooks for Teaching

In our research group, we often use Jupyter Notebooks as teaching tools in the classroom and to train new students. Jupyter Notebooks are useful because they integrate code and its output into a single document that also supports visualizations, narrative text, mathematical equations, and other forms of media that often allows students to better contextualize what they are learning.

There are some aspects of Jupyter Notebooks that students sometimes struggle with as they shift from IDEs to this new environment. Many have noted difficulty tracking the value of variables through the notebook and getting stuck in a mindset of just pressing “Shift+Enter”. Since the advent of Project Jupyter, there have been a variety of “unofficial” extensions and widgets that have been put out by the community that I have found can help enhance the learning experience and make Jupyter Notebooks less frustrating. It’s just not apparent that these extensions exist until you go searching for them. I will demonstrate some below that have been particularly helpful when working with our undergrads and hopefully will be helpful for you too!

Variable Inspector

One very easy way to remedy the situation of not having a variable display is to enable the Variable Inspector Extension which allows you to see the value of all of the variables in an external floating window. In your command prompt type:

#Installation
pip install jupyter_contrib_nbextensions
jupyter contrib nbextension install --user

#Enable extension
jupyter nbextension enable varInspector/main

Here we are installing a massive library of extensions called Nbextensions that we will enable in this post. When you open your notebooks, you will now see a small icon at the top that you can click to create a variable inspector window that will reside in the margins if the notebook.

Variable Inspector Icon

As you step through the notebook, your variable inspector will become populated.

Variable Inspector Floating Window

Snippets of Code

The Snippets menu is pretty fantastic for students who are new to coding or for those of us who go back and forth between coding languages and keep having to quickly search how to make arrays or need code for example plots.

We can just enable this extension as follows right in Jupyter Notebook under the Nbextensions tab:

Enabling the Snippets Menu

This gif provides an example of some of the options that are available for NumPy, Matplotlib, and Pandas.

Jupyter Snippets Menu

If for instance, you want to create a histogram but can’t remember the form of the code, a snippet for this exists under the Matplotlib submenu.

Making a histogram with Matplotlib Snippets

Rubberband (Highlight Multiple Cells)

If you enable the Rubberband extension in the Nbextensions tab, you can highlight and execute multiple cells, which just gives a bit more precision in selecting a subset of cells that you want to run at once. You hold down “Ctrl+Shift” while dragging the mouse. Then you can execute the highlighted cells using “Shift+Enter”.

Execution Time

You can wrap your code in a variety of timing functions or just enable the Execution Time extension in the Nbextensions tab. For every cell that you execute, the elapsed time will now be recorded under the cell. For me, this is particularly useful when you have some cells that take a longer time to execute. If you are trying to create training, it’s helpful for a user to get a breakdown of exactly how long the code should be taking to run (so that they know if the timing is normal or if there is an issue with the code).

Elapsed time is shown below each cell

Creating a Jupyter Notebook Slideshow with RISE (a reveal.js extension)

The last teaching tool that I want to demonstrate is how to turn your Jupyter Notebook into an interactive slideshow, which has been infinitely more useful to use as a teaching tool than just scrolling through the Jupyter Notebook at a meeting.

First, installation is easy:

pip install RISE

Now, we have to prepare our notebook for a slideshow. Go to View -> Cell Toolbar -> Slideshow. We need to select the Slide Type for each of our cells.

Selecting the slide type for the slideshow

At the top of the screen, we now have a new icon that allows you to enter into slideshow mode when you are ready. We can now step through the slideshow that we have made and interactively execute the coding boxes as well.

I hope these tips help you to have a more pleasant experience while using Jupyter Notebooks. If there are other extensions that you enjoy using regularly, please post below!