Constructing interactive Ipywidgets: demonstration using the HYMOD model

Constructing interactive Ipywidgets: demonstration using the HYMOD model

Last week, Lillian and I published the first post in a series of training post studying the “Fisheries Game, which is a decision making problem within a complex, non-linear, and uncertain ecological context.

In preparing for that post, I learned about the Ipywidgets python library for widget construction. It stood out to me as a tool for highlighting the influence of parametric uncertainty on model performance. More broadly, I think it has great as an educational or data-narrative device.

This blogpost is designed to highlight this potential, and provide a basic introduction to the library. A tutorial demonstration of how an interactive widget is constructed is provided, this time using the HYMOD rainfall-runoff model.

This post is intended to be viewed through a Jupyter Notebook for interaction, which can be accessed through a Binder at this link!

The Binder was built with an internal environment specification, so it should not be necessary to install any packages on your local machine! Because of this, it may take a minute to load the page.

Alternatively, you can pull the source code and run the Jupyter Notebook from your local machine. All of the source code is available in a GitHub repository: Ipywidget_Demo_Interactive_HYMOD.

If using your local machine, you will first need to install the Ipywidget library:

pip install ipywidgets

Let’s begin!

HYMOD Introduction

HYMOD is a conceptual rainfall-runoff model. Given some observed precipitation and evaporation, a parameterized HYMOD model simulates the resulting down-basin runoff.

This post does not focus on specific properties or performance of the HYMOD model, but rather uses the model as a demonstration of the utility of the Ipywidget library.

I chose to use the HYMOD model for this, because the HYMOD model is commonly taught in introductory hydrologic modeling courses. This demonstration shows how an Ipywidget can be used in an educational context. The resulting widget can allow students to interact in real-time with the model behavior, by adjusting parameter values and visualizing the changes in the resulted streamflow.

If you are interested in the technical details of implementing the HYMOD model, you can dig into the source code, available (and throughly commented/descriptive) in the repository for this post: Ipywidget_Demo_Interactive_HYMOD.

HYMOD represents surface flow as a series of several quick-flow reservoirs. Groundwater flow is represented as a single slow-flow reservoir. The reservoirs have constant flow rates, with the quick-flow reservoir rate, Kq, being greater than the slow-flow reservoir rate, Ks.

Image source: Sun, Wenchao & Ishidaira, Hiroshi & Bastola, Satish. (2010)

HYMOD Parameters:

Like any hydrologic model, the performance of HYMOD will be dependent upon the specified parameter values. There are several parameters that can be adjusted:

  • Cmax: Max soil moisture storage (mm) [10-2000]
  • B: Distribution of soil stores [0.0 – 7.0]
  • Alpha: Division between quick/slow routing [0.0 – 1.0]
  • Kq: Quick flow reservoir rate constant (day^-1) [0.15 – 1.0]
  • Ks: Slow flow reservoir rate constant. (day^-1) [0.0 – 0.15]
  • N: The number of quick-flow reservoirs.

Interactive widget demonstration

I’ve constructed an Ipywidets object which allows a user to visualize the impact of the HYMOD model parameters on the resulting simulation timeseries. The user also has the option to select from three different error metrics, which display in the plot, and toggle the observed timeseries plot on and off.

Later in this post, I will give detail on how the widget was created.

Before provided the detail, I want to show the widget in action so that you know the expectation for the final product.

The gif below shows the widget in-use:

Demonstration of the HYMOD widget.

Ipywidgets Introduction

The Ipywdiget library allows for highly customized widgets, like the one above. As with any new tool, I’d recommend you check out the documentation here.

Below, I walk through the process of generating the widget shown above.

Lets begin!

Import the library

# Import the library
import ipywidgets as widgets

Basic widget components

Consider an Ipywidget as being an arrangement of modular components.

The tutorial walks through the construction of five key widget components:

  1. Variable slider
  2. Drop-down selectors
  3. Toggle buttons
  4. Label objects
  5. Interactive outputs (used to connect the plot to the other three components)

In the last section, I show how all of these components can be arranged together to construct the unified widget.

Sliders

Sliders are one of the most common ipywidet tools. They allow for manual manipulation of a variable value. The slider is an object that can be passed to the interactive widget (more on this further down).

For my HYMOD widget, I would like to be able to manipulate each of the model parameters listed above. I begin by constructing a slider object for each of the variables.

Here is an example, for the C_max variable:

# Construct the slider
Cmax_slider = widgets.FloatSlider(value = 500, min = 10, max = 2000, step = 1.0, description = "C_max",
                                  disabled = False, continuous_update = False, readout = True, readout_format = '0.2f')


# Display the slider
display(Cmax_slider)

Notice that each slider recieves a specified minmax, and step corresponding to the possible values. For the HYMOD demo, I am using the parameter ranges specified in Herman, J.D., P.M. Reed, and T. Wagener (2013), Time-varying sensitivity analysis clarifies the effects of watershed model formulation on model behavior.

I will construct the sliders for the remaining parameters below. Notice that I don’t assign the description parameter in any of these sliders… this is intentional. Later in this tutorial I will show how to arrange the sliders with Label() objects for a cleaner widget design.

# Construct remaining sliders
Cmax_slider = widgets.FloatSlider(value = 100, min = 10, max = 2000, step = 1.0, disabled = False, continuous_update = False, readout = True, readout_format = '0.2f')
B_slider = widgets.FloatSlider(value = 2.0, min = 0.0, max = 7.0, step = 0.1, disabled = False, continuous_update = False, readout = True, readout_format = '0.2f')
Alpha_slider = widgets.FloatSlider(value = 0.30, min = 0.00, max = 1.00, step = 0.01, disabled = False, continuous_update = False, readout = True, readout_format = '0.2f')
Kq_slider = widgets.FloatSlider(value = 0.33, min = 0.15, max = 1.00, step = 0.01, disabled = False, continuous_update = False, readout = True, readout_format = '0.2f')
Ks_slider = widgets.FloatSlider(value = 0.07, min = 0.00, max = 0.15, step = 0.01, disabled = False, continuous_update = False, readout = True, readout_format = '0.2f')
N_slider = widgets.IntSlider(value = 3, min = 2, max = 7, disabled = False, continuous_update = False, readout = True)

# Place all sliders in a list
list_of_sliders = [Kq_slider, Ks_slider, Cmax_slider, B_slider, Alpha_slider, N_slider]

The Dropdown() allows the user to select from a set of discrete variable options. Here, I want to give the user options on which error metric to use when comparing simulated and observed timeseries.

I provide three options:

  1. RMSE: Root mean square error
  2. NSE: Nash Sutcliffe efficiency
  3. ROCE: Runoff coefficient error

See the calculate_error_by_type inside the HYMOD_components.py script to see how these are calculated.

To provide this functionality, I define the Dropdown() object, as below, with a list of options and the initial value:

# Construct the drop-down to select from different error metrics
drop_down = widgets.Dropdown(options=['RMSE','NSE','ROCE'], description='',
                                value = 'RMSE', disabled = False)

# Display the drop-down
display(drop_down)

ToggleButton

The ToggleButton() allows for a bool variable to be toggled between True and False. For my streamflow plot function, I have an option plot_observed = False which determines if the observed streamflow timeseries is shown in the figure.

# Construct the button to toggle observed data On/Off
plot_button = widgets.ToggleButton(value = False, description = 'Toggle', disabled=False, button_style='', tooltip='Description')

# Display the button
display(plot_button)

Labels

As mentioned above, I choose to not include the description argument within the slider, drop-down, or toggle objects. This is because it is common for these labels to get cut-off when displaying the widget object.

For example, take a look at this slider below, with a long description argument:

# Make a slider with a long label
long_title_slider = widgets.FloatSlider(value = 2.0, min = 0.0, max = 7.0, step = 0.1, description = 'This slider has a long label!', readout = True)

# Display: Notice how the label is cut-off!
display(long_title_slider)

The ipywidgets.Label() function provides a way of avoiding this while allowing for long descriptions. Using Label() will ultimately provide you with a lot more control over your widget layout (last section of the tutorial).

The Label() function generates a separate object. Below, I create a unique Label() object for each HYMOD parameter.

# Import the Label() function
from ipywidgets import Label

# Make a list of label strings
param_labs = ['Kq : Quick flow reservoir rate constant (1/day)',
            'Ks : Slow flow reservoir rate constant (1/day)',
            'C_max : Maximum soil moisture storage (mm)',
            'B : Distribution of soil stores',
            'Alpha : Division between quick/slow routing',
            'N : Number of quick-flow reservoirs']

# Make a list of Label() objects
list_of_labels = [Label(i) for i in param_labs]

# Display the first label, for example.
list_of_labels[0]

Interactive_output

Now that we have constructed interactive

The interactive_output function takes two inputs, the function to interact with, and a dictionary of variable assignments:

interactive_output( function, {‘variable_name’ : variable_widget, …} )

I have created a custome function plot_HYMOD_results which:

  1. Loads 1-year of precipitation and evaporation data for the Leaf River catchment.
  2. Runs the HYMOD simulation using the provided parameter values.
  3. Calculates the error of the simulated vs. observed data.
  4. Plots the timeseries of runoff.

The source code for this function can be found in the GitHub repository for this post, or specifically here.

The function receives parameter values for each of the HYMOD parameters discussed above, a bool indicator if observed data should be plotted, and a specified error metric.

plot_HYMOD_results(C_max, B, Alpha, Ks, Kq, N_reservoirs, plot_observed = False, error_type = ‘RMSE’):

I have already generated widget components corresponding to each of these variables! (If you are on the Jupyter Notebook version of this post, make sure to have Run every cell before this, or else the following code wont work.

I can now use the interactive_output function to link the widget components generated earlier with the function inputs:

# Import the interactive_output function
from ipywidgets import interactive_output

# Import my custom plotting function
from HYMOD_plots import plot_HYMOD_results

result_comparison_plot = interactive_output(plot_HYMOD_results, {'C_max' : Cmax_slider, 'B': B_slider, 'Alpha':Alpha_slider, 
                                                                 'Ks':Ks_slider, 'Kq':Kq_slider,'N_reservoirs':N_slider, 
                                                                 'plot_observed' : plot_button,'error_type': drop_down})

# Show the output
result_comparison_plot
Output generated by the interactive_output().

Displaying the interactive_output reveals only the plot, but does not include any of the widget functionality…

Despite this, the plot is still linked to the widget components generated earlier. If you don’t believe me (and are reading the Jupyter Notebook version of this post), scroll up and click the ToggleButton a few cells up, then come back and look at the plot again.

Using the interactive_output() function, rather than other variations of the interact() functions available, allows for cleaner widgets to be produced, because now the arrangment of the widget components can be entirely customizable.

Keep reading for more detail on this!

Arranging widget components

Rather than using widget features one-at-a-time, Ipywidgets allow for several widgets to be arranged in a unified layout. Think of everything that has been generated previously as being a cell within the a gridded widget; the best part is that each cell is linked with one another.

Once the individual widget features (e.g., sliders, buttons, drop-downs, and output plots) are defined, they can be grouped using the VBox() (vertical box) and HBox() (horizontal box) functions.

I’ve constructed a visual representation of my intended widget layout, shown below. The dashed orange boxes show those components grouped by the HBox() function, and the blue boxes show those grouped by the VBox() function.

Visual representation of the final widget layout.

Before getting started, import some of the basic layout functions:

# Import the various 
from ipywidgets import HBox, VBox, Layout

Before constructing the entire widget, it is good to get familiar with the basic HBox() and VBox() functionality.

Remember the list of sliders and list of labels that we created earlier?

# Stack the list of label objects vertically:
VBox(list_of_labels)

# Try the same thing with the sliders (remove comment #):
#VBox(list_of_sliders)

In the final widget, I want the column of labels to be located on the left of the column of sliders. HBox() allows for these two columns to be arrange next to one another:

# Putting the columns side-by-side
HBox([VBox(list_of_labels), VBox(list_of_sliders)])

Generating the final widget

Using the basic HBox() and VBox() functions shown above, I arrange all of the widget components I’ve defined previously. I first define each row of the widget using HBox(), and finally stack the rows using VBox().

The script below will complete the arrangement, and call the final widget!

# Define secifications for the widgets: center and wrap 
box_layout = widgets.Layout(display='flex', flex_flow = 'row', align_items ='center', justify_content = 'center')

# Create the rows of the widets
title_row = Label('Select parameter values for the HYMOD model:')
slider_row = HBox([VBox(list_of_labels), VBox(list_of_sliders)], layout = box_layout)
error_menu_row = HBox([Label('Choose error metric:'), drop_down], layout = box_layout)
observed_toggle_row = HBox([Label('Click to show observed flow'), plot_button], layout = box_layout)
plot_row = HBox([result_comparison_plot], layout = box_layout)


# Combine label and slider box (row_one) with plot for the final widget
HYMOD_widget = VBox([title_row, slider_row, plot_row, error_menu_row, observed_toggle_row])


# Call the widget and have fun!
HYMOD_widget

Concluding remarks

If you’ve made it this far, thank you for reading!

I hope that you are able to find some fun/interesting/educational use for the Ipywidget skills learned in this post.

Fisheries Training 0: Exploring Predator-Prey Dynamics

Fisheries Training 0: Exploring Predator-Prey Dynamics

Hello there, welcome to a new training series!

In a series of five posts, we will be analyzing and exploring the Fisheries Game, which is a rich, predator-prey system with complex and nonlinear dynamics arising from the interactions between two fish species. This game will be used to walk through the steps of performing a broad a posteriori decision making process, including the exploration and characterization of impacts of system uncertain on the performance outcomes. It also serves as a conceptual tool to demonstrate the importance of accounting for deep uncertainty, and the significance of its effects on system response to management actions.

A GitHub repository containing all of the necessary code for this series is available here, and will be updated as the series develops.

We will begin here (Post 0) with an introduction to the Fisheries Game, using a modified form of the Lotka-Volterra equations for predator-prey dynamics.

Toward the end of this post, we will provide an overview of where this series is going and the tools that will be used along the way.

A very quick introduction to the Lotka-Volterra system of equations

In this game, we are stakeholders for a managed fishery. Our goal is to determine a harvesting strategy which balances tradeoffs between ecological and economic objectives. Throughout this process we will consider ecological stability, harvest yield, harvest predictability, and profits while attempting to maintain the robustness of both the ecological system and economic performance under the presence of uncertainty.

The first step in our decision making process is to model population dynamics of the fish species.

Equation 1: The base Lotka-Volterra SOE.

Equation 1 is the original Lotka-Volterra system of equations (SOE) as developed independently by Alfred Lotka in 1910, and by Vito Volterra in 1928. In this equation, x is the population density of the prey, and y is the population density of the predator. This SOE characterizes a prey-dependent predator-prey functional response, which assumes a linear relationship between prey consumption and predator growth.

Arditi and Akçakaya (1990) constructed an alternative predator-prey SOE, which assume a non-linear functional response (i.e., predator population growth is not linear with consumption). It accounts for predation efficiency parameters such as interference between predators, time needed to consume prey after capture, and a measure of how long it takes to convert consumed prey into new predators. This more complex SOE takes the form:

Equation 2: The full predator-dependent predator-prey SOE, including predator interference (m) and harvesting (z).

The full description of the variables are as follows: x and y are the prey and predator population densities at time t respectively (where t is in years); α is the rate at which the predator encounters the prey; b is the prey growth rate; c is the rate at which the predator converts prey to new predators; d is predator death rate; h is the time it needs to consume the prey (handling time); K is the environmental carrying capacity; m is the level of predator interaction; and z is the fraction of the prey population that is harvested. In this post, we will spend some time exploring the potential behavior of these equations.

Before proceeding further, here are some key terms used throughout this post:

  • Zero isoclines: Lines on the plot indicating prey or predator population levels that result in constant population size over time (zero growth rate).
  • Global attractor: The specific value that the system tends to evolve toward, independent of initial conditions.
  • Equilibrium point: The intersection of two (or more) zero isoclines; a point or line of trajectory convergence.
    • Stable (nontrivial) equilibrium: Equilibrium at which both prey and predator exist
    • Unstable equilibrium: Global attractor is a stable limit cycle
  • Stable limit cycle: A closed (circular) trajectory
  • Trajectory: The path taken by the system given a specific set of initial conditions.
  • Functional response: The rate of prey consumption by your average predator.

Population equilibrium and stability

When beginning to consider a harvesting policy, it is important to first understand the natural stability of the system.

Population equilibrium is a necessary condition for stability. The predator and prey populations are in equilibrium if the rate of change of the populations is zero:

\frac{dx}{dt} = \frac{dy}{dt} = 0

Predator stability

Let’s first consider the equilibria of the predator population. In this case, we are only interested in the non-trivial (or co-existence) equilibria where the predator and prey populations are non-zero (y \neq 0, and x \neq 0).

\frac{dy}{dt} = \frac{c\alpha xy}{y^m + \alpha hx} -dy = 0

Here, we are interested in solving for the predator zero-isocline (x^*) which satisfies this equation. Re-arrangement of the above equation yields:

c\alpha xy = d(y^m + \alpha hx)

\alpha x(c-dh) - dy^m = 0

Solving for x yields the predator zero-isocline:

x^* = \frac{dy^m}{\alpha (c - dh)}

In the context of the fisheries, we are interested in the co-existence conditions in which x^* > 0. From the zero-isocline stability equation above, it is clear that this is only true if:

c > hd

Simply put, the rate at which predators convert prey into new predators (c) must be greater than the death rate (d) scaled by the time it needs to consume prey (h).

Prey stability

A similar process can be followed for the derivation of the zero-isocline stability condition for the prey population. The stability condition can be determined by solving for:

\frac{dx}{dt} = bx(1-\frac{x}{K}) - \frac{\alpha xy}{y^m + \alpha hx} = 0

As the derivation of the prey isocline is slightly more convoluted than the predator isocline, the details are not presented here, but are available in the Supplementary Materials of Hadjimichael et al. (2020), which is included in the GitHub repository for this post here.

Resolution of this stability condition yields the second necessary co-existence equilibrium condition:

\alpha (hK)^{1-m} < (b - z)^m

Uncertainty

As indicated by the two conditions above, the stability of the predator and prey populations depends upon ecosystem properties (e.g., the availability of prey, \alpha, predator interference, m, and the carrying capacity, K), unique species characteristics (e.g., the prey growth rate, b, and predation efficiency parameters c, h).

Specification of different values for these parameters results in wildly different system dynamics. Below are examples of three different population trajectories, each generated using the same predator-dependent system of equations, with different parameter values.

Figure 1: Three trajectory fields, resulting from predator-prey models with different parameter values. (a) A stable equilibrium with a single global attractor (values: a = 0.01, b = 0.35, c = 0.60, d = 0.20, h = 0.44, K = 1900, m = 0.30, z = 0.0); (b) an unstable system with limit cycles as global attractors (values: a = 0.32, b = 0.48, c = 0.40, d = 0.16, h = 0.23, K = 2400, m = 0.30, z = 0.0); (c) an unstable system with deterministic extinction of both species (values: a = 0.61, b = 0.11, c = 0.80, = 0.18, h = 0.18, K = 3100, m = 0.70, z = 0.0).

Exploring system dynamics (interactive!)

To emphasize the significance of parameter uncertainty on the behavior of predator-prey system behavior, we have prepared an interactive Jupyter Notebook. Here, you have ability to interactively change system parameter values and observe the subsequent change in system behavior.

The Binder link to the interactive Jupyter Notebook prepared by Trevor Amestoy can be found here. Before the code can be run, open the ‘Terminal’, as shown in Figure 2 below.

Figure 2: Starting page of the Jupyter Notebook binder.

Install the necessary libraries by entering the following line into the terminal:

pip install numpy scipy matplotlib pandas

You’re good to go once the libraries have been loaded. Open the Part_0_Interactive_fisheries_ODEs.ipynb file in the Part_0_ODE_Dynamics/ folder.

Play around with the sliders to see what system trajectories you end up with! Notice how abruptly the population dynamics change, even with minor changes in parameter values.

The following GIFs show some potential trajectories you might observe as you vary the ranges of the variables:

Starting at a stable equilibrium

In Figure 3 below, both prey and predator eventually coexist at constant population sizes with no human harvesting.

Figure 3: The Fisheries system at a stable equilibrium quickly transitioning to an unstable equilibrium, and back again to a stable equilibrium.

However, this is a fishery with very small initial initial prey population size. Here, note how quickly the system changes from one of stable equilibrium into that of unstable equilibrium with a small decrease in the value of α. Without this information, stakeholders might unintentionally over-harvest the system, causing the extinction of the prey population.

Starting at an unstable equilibrium

Next, Figure 4 below shows an unstable equilibrium with limit cycles and no human harvesting.

Figure 4: An unstable equilibrium where the both prey and predator populations oscillate in a stable limit cycle.

Figure 4 shows that a system will be in a state of unstable equilibrium when it takes very little time for the predator to consume prey, given moderately low prey population sizes and prey population growth rate. Although predators die at a relatively high rate, the prey population still experiences extinction as it is unable to replace population members lost through consumption by the predator species. This is a system in which stakeholders might instead choose to harvest the predator species.

Introducing human harvesting

Finally, a stable equilibrium that changes when human harvesting of the prey population is introduced in Figure 5 below.

Figure 5: An unstable equilibrium where the both prey and predator populations oscillate in a stable limit cycle.

This Figure demonstrates a combination of system parameters that might represent a system that can be harvested in moderation. It’s low prey availability is abated by a relatively high predator growth rate and relatively low conversion rates. Be that as it may, exceeding harvesting rates may suddenly cause the system to transition into an unstable equilibrium, or in the worst case lead to an unexpected collapse of the population toward extinction.

Given that the ecosystem parameters are uncertain, and that small changes in the assumed parameter values can result in wildly different population dynamics, it begs the question: how can fisheries managers decide how much to harvest from the system, while maintaining sustainable population levels and avoiding system collapse?

This is the motivating question for the upcoming series!

The MSD UC eBook

To discover economic and environmental tradeoffs within the system, reveal the variables shaping the trajectories shown in Figure 3 to 5, and map regions of system vulnerability, we will need suitable tools.

In this training series, we will be primarily be using the MSD UC eBook and its companion GitHub repository as the main resources for such tools. Titled ‘Addressing uncertainty in MultiSector Dynamics Research‘, the eBook is part of an effort towards providing an open, ‘living’ toolkit of uncertainty characterization methods developed by and for the MultiSector Dynamics (MSD) community. The eBook also provides a hands-on Jupyter Notebook-based tutorial for performing a preliminary exploration of the dynamics within the Fisheries Game, which we will be expanding on in this tutorial series. We will primarily be using the eBook to introduce you to sensitivity analysis (SA) methods such as Sobol SA, sampling techniques such as Latin Hypercube Sampling, scenario discovery approaches like Logistic Regression, and their applications to complex, nonlinear problems within unpredictable system trajectories. We will also be making frequent use of the functions and software packages available on the MSD GitHub repository.

To make the most out of this training series, we highly recommend familiarizing yourself with the eBook prior to proceeding. In each of the following posts, we will also be making frequent references to sections of the eBook that are relevant to the current post.

Training sequence

In this post, we have provided an overview of the Fisheries Game. First, we introduced the basic Lotka-Volterra equations and its predator-dependent predator-prey variation (Arditi and Akcakaya, 1990) used in Hadjimichael et al. (2020). We also visualized the system and explored how the system trajectories change with different parameter specifications. Next, we introduced the MSD UC eBook as a toolkit for exploring the Fisheries’ system dynamics.

Overall, this post is meant to be a gateway into a deep dive into the system dynamics of the Fisheries Game as formulated in Hadjimichael et al. (2020), using methods from the UC eBook as exploratory tools. The next few posts will set up the Fisheries Game and its objectives for optimization, explore the implications of varying significant decision variables, map parameter ranges that result in system vulnerability, and visualize these outcomes. The order for these posts are as follows:

Post 1 Problem formulation and optimization of the Fisheries Game in Rhodium

Post 2 Visualizing and exploring the Fisheries’ Pareto set and Pareto front using Rhodium and J3

Post 3 Identifying important system variables using different sensitivity analysis methods

Post 4 Mapping consequential scenarios that drive the Fisheries’ vulnerability to deep uncertainty

That’s all from us – see you in Post 1!

References

Abrams, P., & Ginzburg, L. (2000). The nature of predation: prey dependent, ratio dependent or neither?. Trends In Ecology &Amp; Evolution, 15(8), 337-341. https://doi.org/10.1016/s0169-5347(00)01908-x

Arditi, R., & Akçakaya, H. R. (1990). Underestimation of Mutual Interference of Predators. Oecologia, 83(3), 358–361. http://www.jstor.org/stable/4219345

Hadjimichael, A., Reed, P., & Quinn, J. (2020). Navigating Deeply Uncertain Tradeoffs in Harvested Predator-Prey Systems. Complexity, 2020, 1-18. https://doi.org/10.1155/2020/4170453

Lotka, A. (1910). Contribution to the Theory of Periodic Reactions. The Journal Of Physical Chemistry, 14(3), 271-274. https://doi.org/10.1021/j150111a004

Reed, P.M., Hadjimichael, A., Malek, K., Karimi, T., Vernon, C.R., Srikrishnan, V., Gupta, R.S., Gold, D.F., Lee, B., Keller, K., Thurber, T.B, & Rice, J.S. (2022). Addressing Uncertainty in Multisector Dynamics Research [Book]. Zenodo. https://doi.org/10.5281/zenodo.6110623

Volterra, V. (1928). Variations and Fluctuations of the Number of Individuals in Animal Species living together. ICES Journal Of Marine Science, 3(1), 3-51. https://doi.org/10.1093/icesjms/3.1.3

Viewing your Scientific Landscape with VOSviewer

In this post, we’re taking a look at a cool tool for visualizing bibliometric networks called VOSviewer. In our group, we have been thinking more about our scientific landscape (i.e. what fields are we reaching with our work and who are we collaborating with most). VOSviewer provides a great way to interactively engage with this network in maps like this:

https://app.vosviewer.com/?json=https://drive.google.com/uc?id=19qfFyRLmgG9Zh7S9oss2Xo8vcqj-GjG7

(The web app cannot always accommodate high traffic, so if the link appears broken, please check back in a couple hours!)

In this blog post, I’m going to highlight this tool by looking at the reach of the Borg Multi-Objective Evolutionary Algorithm (Borg MOEA) developed by Dave Hadka and Patrick Reed and first published in 2013. I also really like that these conceptual networks can be embedded into a website. Slight caveat: if you use WordPress, you must have a Business account with plugins enabled to embed these maps directly into your website. You’ll see that in order to view the interactive maps, I’ll have links to VosViewer’s online application, but unfortunately you will not be able to interact with the maps directly in this post.

Step 1: Download bibliographic data

The first step to creating these maps is to get our bibliographic network information. I’m using the Dimensions database which has a free web app here. It should be noted that Vosviewer also supports exports from Scopus, Web of Science, Lens, and PubMed. Once I get to the Dimensions web app, I type in “Borg MOEA” to find the original paper and download the citations associated with that paper.

You can choose the Borg paper as shown above and then scroll down and click on “show all” next to “Publication Citations”.

Then we export this by clicking the save/export button at the top of the page.

Make sure to choose “bibliographic mapping” so that the data are compatible with VosViewer. Dimensions will send an email with the link to download a zipped file which contains a .csv file with all the meta data needed to create the network.

Step 2: Create the map in VOSviewer

The user interface for the desktop version of VosViewer is fairly intuitive but also quite powerful. First we choose “Create” on the left  and then specify that we want to create a map based on bibliographic data.

Next, we specify that we want to read our data from a database file. Note that VosViewer also supports RIS files and has an API that you can directly query with. On the right, we choose the Dimensions tab and upload the metadata .csv file.

Now comes the fun part! We can create maps based on various components of the metadata that we downloaded. For example, let’s do the co-authorship analysis. This will show the strength of co-authorship among all the authors that have cited the Borg paper.

Click “Next>” and then fill out some narrowing thresholds. I decrease the number of documents that the author must have from 5 to 1 and then choose to show only connected authors to create a clean map. The result is this:

Here the colors represent the different clusters of authors and the size represents the number of documents that have been published by that author that have cited Borg. If we switch to the “Overlay Visualization” tab, we get more information on the years that the authors have published most together.

Step 3: Clean-Up and Customization

One thing you might have noticed is that there are duplicates of some names. For example, Patrick M. Reed and Patrick Reed. We want to merge these publications so that we only have one bubble to represent Pat. For this, we need to create thesaurus file. In the folder that contains the VOSviewer application, navigate to the “data” folder and look for “thesaurus_authors.txt”. Here we can specify which names to merge.

Now, I can recreate my map, but this time, I supply a thesaurus file when creating the map.

Below, we have our final merged map.

We can create another map that instead shows the journals that the citations come from. Now instead we create a map and choose “Citation” and “Sources”.

So we are seeing strong usage of Borg in journals pertaining to water resources, evolutionary computation, and some miscellaneous IEEE journals.

Note that you can also create maps associated with countries from where the authors’ institutions are located.

We see that the US is the strongest hub, followed by China, Italy, and the UK. We can finally show the institutions that the authors are affiliated with. Here we see lots of publications from Cornell, Penn State, and Politecnico di Milano.

Step 4: Creating a shared link or embedding in a website

Screenshots are not an ideal way to view these maps, so let’s see how we can go about sharing them as interactive applications that others can use. We can go to the right hand side of the application and choose “share” as a google drive link. After going through authentication, you now have a shareable link as follows:

Institution: https://app.vosviewer.com/?json=https://drive.google.com/uc?id=18q9f8LvLNbATl5rmP28te5MB0R8bn3xx

Co-Author: https://app.vosviewer.com/?json=https://drive.google.com/uc?id=19qfFyRLmgG9Zh7S9oss2Xo8vcqj-GjG7

Country: https://app.vosviewer.com/?json=https://drive.google.com/uc?id=1pCgv2JVpqXBxEjDIIckBni9HUihQrwWA

Journal: https://app.vosviewer.com/?json=https://drive.google.com/uc?id=174togoYCbxZMm7fcU459p8mYbG3sAxje

If your website supports it, you can also embed these applications directly into a webpage using the following html code. You would just need to swap out the source location, src, with your own links. When swapping out, be sure to include &simple_ui=true after your link as shown below.

<iframe
  allowfullscreen="true"
  src="https://app.vosviewer.com/?json=https://drive.google.com/uc?id=1mYU4RNuhUQP__sKZWRnHzn_IkzrfvJMV&simple_ui=true"
  width="100%"
  height="75%"
  style="border: 1px solid #ddd; max-width: 1200px; min-height: 500px"
>
</iframe>

This is just scraping the surface of what VOSviewer is capable of, so definitely take time to explore the website, the examples, and the manual. If you make any cool maps, please link them in the comments below!