Parallel axis plots for the absolute beginner

A parallel axis plot is a simple way to convey a lot of information in a meaningful and easy-to-understand way. Also known as parallel coordinate plots (PCP), it is a visualization technique used to analyze multivariate numerical data (Weitz, 2020), or in the case of multi-objective optimization, to analyze tradeoffs between multiple conflicting objectives. As someone new to the field of multi-objective optimization, I found them particularly helpful as I tried to wrap my head around the multi-dimensional aspects of this field.

There are multiple tools in Python that you can use to generate PCPs. There are several different posts by Bernardo and Jazmin that utilize the Pandas and Plotting libraries to do so. In this post, I would like to explain a little about how you can generate a decent PCP using only Numpy and Matplotlib.

For context, I used a PCP to contrast the non-dominated solutions from the entire reference set of the optimized GAA problem reference set.

For the beginner, the figure above demonstrates three important visualization techniques in generating PCPs: color, brushing, and axis ordering. Firstly, it is important to consider using colors that stand on opposite sides on the color wheel to contrast the different types of information you are presenting. Next, brushing should be used to divert the viewer’s attention away from any information deemed unnecessary, highlight vital information, or to prove a point using juxtaposition. Finally, the ordering of the axes is important, particularly when presenting conflicting objectives. It is best for all axes to be oriented in one “direction of preference”, so that the lines between each adjacent axis can represent the magnitude of the tradeoff between two objectives. Thus, the order in which these axes are placed will significantly affect the way the viewer perceives the tradeoffs, and should be well-considered.

To help with understanding the how to generate a PCP, here is a step-by-step walk-through of the process.

1. Import all necessary libraries, load data and initialize the Matplotlib figure

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import ticker

# load data
all_soln = np.loadtxt(open("GAA-reference-set.csv"), delimiter=",")
nd_indices = np.loadtxt(open("non-dominated-index.csv"), delimiter=",")

# identify and list the objectives of the reference set
objs = ['NOISE', 'WEMP', 'DOC', 'ROUGH', 'WFUEL', 'PURCH', 'RANGE', 'LDMAX', 'VCMAX', 'PFPF']

# create an array of integers ranging from 0 to the number of objectives                    
x = [i for i, _ in enumerate(objs)]

# sharey=False indicates that all the subplot y-axes will be set to different values
fig, ax  = plt.subplots(1,len(x)-1, sharey=False, figsize=(15,5))

Two sets of data are loaded:

  • all_soln: the entire reference set
  • nd_indices: an array of row indices of the non-dominated solutions from all_soln

In Line 16, we are initializing a figure fig and an array of axis objects ax. I find that having an array of axes helps me better control tick locations and labeling, since I can iterate over them in a loop.

Bear in mind that this is simply an example. It is also possible to obtain the non-dominated set directly from the the reference set by performing a Pareto sort.

2. Normalize the objective values in all_soln

min_max_range = {}

for i in range(len(objs)):
    all_soln[:,i] = np.true_divide(all_soln[:,i] - min(all_soln[:,i]), np.ptp(all_soln[:,i]))
    min_max_range[objs[i]] = [min(all_soln[:,i]), max(all_soln[:,i]), np.ptp(all_soln[:,i])]

All values in all_soln are normalized by subtracting the minimum value from each objective, then dividing it by the range of values for that objective. The min_max_range dictionary contains the minimum, maximum and range of values for each objective. This will come in handy later on.

3. Iterate through all the axes in the figure and plot each point

I used the enumerate function here. It may seem somewhat confusing at first, but it basically keeps count of your iterations as your are iterating through an object (ie: a list, an array). More information on how it works can be found here.

for i, ax_i in enumerate(ax):
    for d in range(len(all_soln)):
        if ((d in nd_indices)== False):
            if (d == 0):
                ax_i.plot(objs, all_soln[d, :], color='lightgrey', alpha=0.3, label='Dominated', linewidth=3)
            else:
                ax_i.plot(objs, all_soln[d, :], color='lightgrey', alpha=0.3, linewidth=3)
    ax_i.set_xlim([x[i], x[i+1]])

for i, ax_i in enumerate(ax):
    for d in range(len(all_soln)):
        if (d in nd_indices):
            if (d == nd_indices[0]):
                ax_i.plot(objs, all_soln[d, :], color='c', alpha=0.7, label='Nondominated', linewidth=3)
            else:
                ax_i.plot(objs, all_soln[d, :], color='c', alpha=0.7, linewidth=3)
    ax_i.set_xlim([x[i], x[i+1]])

All solutions from the non-dominated set are colored cyan, while the rest of the data is greyed-out. This is an example of brushing. Note that only the first line plotted for both sets are labeled, and that the grey-out data is plotted first. This is so the non-dominated lines are shown clearly over the brushed lines.

4. Write a function to position y-axis tick locations and labels

The set_ticks_for_axis() function is key to this process as it grants you full control over the labeling and tick positioning of your y-axes. It has three inputs:

  • dim: the index of a value from the objs array
  • ax_i: the current axis
  • ticks: the desired number of ticks
def set_ticks_for_axis(dim, ax_i, ticks):
    min_val, max_val, v_range = min_max_range[objs[dim]]
    step = v_range/float(ticks)
    tick_labels = [round(min_val + step*i, 2) for i in range(ticks)]
    norm_min = min(all_soln[:,dim])
    norm_range = np.ptp(all_soln[:,dim])
    norm_step =(norm_range/float(ticks-1))
    ticks = [round(norm_min + norm_step*i, 2) for i in range(ticks)]
    ax_i.yaxis.set_ticks(ticks)
    ax_i.set_yticklabels(tick_labels)

Hello min_max_range! This dictionary essentially makes accessing the extreme values and range of each objective easier and less mistake-prone. It is optional, but I do recommend it.

Overall, this function does two things:

  1. Creates ticks-evenly spaced tick-marks along ax_i.
  2. Labels ax_i with tick labels of size ticks. The tick labels are evenly-spaced values generated by adding step*i to min_val for each iteration i.

A nice thing about this function is that is also preserves the order that the objective values should be placed along the axis, which makes showing a direction of preference easier. It will be used to label each y-axis in our next step.

5. Iterate over and label axes

for dim, ax_i in enumerate(ax):
    ax_i.xaxis.set_major_locator(ticker.FixedLocator([dim]))
    set_ticks_for_axis(dim, ax_i, ticks=10)

FixedLocator() is a subclass of Matplotlib’s ticker class. As it’s name suggests, it fixes the tick locations and prevents changes to the tick label or location that may possibly occur during the iteration. More information about the subclass can be found here.

Here, you only need to label the x-axis with one label and one tick per iteration (hence Line 2). On the other hand, you are labeling the entire y-axis of ax_i, which is where you need to use set_ticks_for_axis().

6. Create a twin axis on the last axis in ax

ax2 = plt.twinx(ax[-1])
dim = len(ax)
ax2.xaxis.set_major_locator(ticker.FixedLocator([x[-2], x [-1]]))
set_ticks_for_axis(dim, ax2, ticks=10)
ax2.set_xticklabels([objs[-2], objs[-1]])

Creating a twin axis using plt.twinx() enables you to label the last axis with y-ticks. Line 3 and 5 ensure that the x-axis is correctly labeled with the last objective name.

7. Finally, plot the figure

plt.subplots_adjust(wspace=0, hspace=0.2, left=0.1, right=0.85, bottom=0.1, top=0.9)
ax[8].legend(bbox_to_anchor=(1.35, 1), loc='upper left', prop={'size': 14})
ax[0].set_ylabel("$\leftarrow$ Direction of preference", fontsize=12)
plt.title("PCP Example", fontsize=12)
plt.savefig("PCP_example.png")
plt.show()

Be sure to remember to label the direction of preference, and one you’ve saved your plot, you’re done!

The source code to generate the following plot can be found here. I hope this makes parallel axis plots a little more understandable and less intimidating.

References

Weitz, D. (2020, July 27). Parallel Coordinates Plots. Retrieved November 09, 2020, from https://towardsdatascience.com/parallel-coordinates-plots-6fcfa066dcb3

Keen, B.A., Parallel Coordinates in Matplotlib. (2017, May 17). Retrieved November 09, 2020, from https://benalexkeen.com/parallel-coordinates-in-matplotlib/

Parasol Resources

In my last post, I introduced Parasol: an open source, interactive parallel coordinates library for multi-objective decision making. For additional background on Parasol, check out the library’s webpage to view interactive example applications, the GitHub repo for documentation, and the paper [Raseman et al. (2019)] for a more rigorous overview of Parasol.

parasol-tutorial

A demonstration of Parasol features like linked plots and data tables, highlighting individual polylines, and brushing and marking data.

In this post, I will describe how to start making your own interactive visualizations with Parasol.

Parasol Wiki

The best place to look for Parasol resources is the wiki page on the GitHub repo. Here you will find the API documentation, an introduction to web development for Parasol, and step-by-step tutorials taking you through the app development process. Because the wiki is likely to grow over time, I won’t provide links to these resources individually. Instead, I’ll give an overview of these resources here but I encourage you to visit the wiki for the most up-to-date tutorials and documentation.

API

The Application Programming Interface (API) is a list of all the features of the Parasol library. If you want to know everything that Parasol can do, the API is the perfect reference. If you are just getting started, however, this can be overwhelming. If that is the case, go through the tutorials and come back to the API page when you want to take your apps to the next level.

Web Development Basics

Creating Parasol apps requires some level of web development knowledge, but to build simple apps, you only need to understand the basics of HTML, JavaScript, and CSS. Check out the web development tutorial on the wiki for an overview of those concepts.

Tutorials

To start developing right away, go through the tutorials listed on the wiki page. These posts take the user through how to create a minimal Parasol app, they touch on the primary features of the API, and more advanced topics like how to use HTML buttons and sliders to create more polished apps.

Creating shareable apps with JSFiddle

jsfiddle-tutorial-3-completed

Using JSFiddle to create shareable Parasol apps (without having to host them on a website)

I recently discovered JSFiddle–a sandbox tool for learning web development. Not only has JSFiddle made web development more accessible and easier for me to learn, but I realized its a great way to share Parasol apps in an efficient, informal way. Check out the tutorial on the wiki for more details and the following links to my own Parasol “fiddles”:

Give it a try!

As I’ve said before, check out Parasol and give us your feedback. If you think this library is valuable, submit an Issue or Star the GitHub repo, or write a comment below. We are open to new ideas and features about how Parasol can better suit the needs of developers.

References

Raseman, William J., Joshuah Jacobson, and Joseph R. Kasprzyk. “Parasol: An Open Source, Interactive Parallel Coordinates Library for Multi-Objective Decision Making.” Environmental Modelling & Software 116 (June 1, 2019): 153–63. https://doi.org/10.1016/j.envsoft.2019.03.005.