The Linked Parallel Coordinate Plot with Linked Heatmap: A tool for simultaneous visualization of the objective and decision spaces

Introduction

When making real-world decisions, the individual decisions that comprise a solution are often as important to decision makers as the objective values that the solution generates. When evaluating MOEA search results for such problems, analysts need a tool that lets them visualize solutions in the objective and decision spaces simultaneously. The Parallel Coordinate Plot with Linked Heatmap tool was created to fill this need. The tool was built off of the D3 Interactive Parallel Coordinate tool that Bernardo introduced in his post a few weeks ago. As its name suggests, the Parallel Coordinate Plot with Linked Heatmap tool links a heatmap containing decision space values with an interactive parallel coordinate plot containing objective space data. The tool can be found here.

Pcoord_full

Heatmap_full

Figure 1: The Parallel Coordinate Plot with Linked Heatmap tool demonstrated with a six objective problem with 31 decision variables.

Each axis on the parallel coordinate plot represents an objective function, while each line across the axes represents an individual Pareto optimal solution. Each row of the heatmap also represents an individual Pareto optimal solution, while each column corresponds to a decision variable. The shading of each heatmap cell represents how much of a decision variable is being used in that solution.

The tool allows analysts to view the decision space alongside the objective space in two different ways. First, the grouping functionality, carried over from the original parallel coordinate plot tool, allows the analyst to group her data based on decision space categories. These categories are then plotted as different colors on the parallel coordinate plot. The fraction of solutions contained within each group can be observed in the ring plot in the top left corner of the screen. This ring plot updates dynamically as data is brushed, reflecting how many of the brushed solutions come from each group.
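The bookkeeping behind the ring plot can be sketched as a simple count of group labels. The snippet below is a minimal Python illustration, not the tool's D3 code; the function name `group_fractions` and the sample solutions are hypothetical.

```python
from collections import Counter

def group_fractions(rows):
    """Return the fraction of solutions that fall in each group."""
    counts = Counter(row["group"] for row in rows)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()} if total else {}

# Hypothetical solutions; "group" mirrors the second column of the input file.
solutions = [
    {"name": "s1", "group": "1", "reliability": 0.995},
    {"name": "s2", "group": "2", "reliability": 0.970},
    {"name": "s3", "group": "2", "reliability": 0.999},
    {"name": "s4", "group": "1", "reliability": 0.992},
]

full = group_fractions(solutions)
# Recompute on a brushed subset, as the ring plot does when the data is brushed.
brushed = group_fractions([s for s in solutions if s["reliability"] >= 0.99])
```

Here the two groups are evenly split in the full set, but group 1 makes up two thirds of the brushed subset.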

The second way the tool can be used to visualize the decision space along with the objective space is through the linked heatmap. The heatmap shows the relative usage of each decision variable for each Pareto optimal solution. As the parallel coordinate plot is brushed, the heatmap updates dynamically, showing only the solutions within the brushed dataset.

Uploading Data

To upload data, follow the procedure outlined in Bernardo’s post. For this tool, the input is a .csv file containing both decision variable and objective function values for each Pareto optimal solution. Like the input for Bernardo’s tool, this file should have “name” and “group” as the headings of the first two columns. Unlike the input for Bernardo’s tool, the next columns should hold the decision variables, with the name of each variable as the heading. The final columns should hold the objective values. The tool will automatically recognize the last six columns as objective values, no matter how many decision variables are present. (The tool is only built to handle six-objective problem formulations at the moment. I’m working on making it flexible to any number of objectives, and I’ll post an update once that has been completed.) For best use of the tool, decision variable values should be normalized to the highest value for each decision variable. An example input can be seen in Figure 2.

example_data

Figure 2: Example input data format. Note that the objective values are in the last six columns.
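For readers preparing their own input, here is a rough Python sketch of the column layout and normalization described above. It is only an illustration under stated assumptions: the helper names `split_columns` and `normalize_decisions`, the column names `dv1`, `dv2`, `obj1` through `obj6`, and the sample values are all made up, and the tool itself does not require this script.

```python
import csv
import io

N_OBJECTIVES = 6  # the tool currently treats the last six columns as objectives

def split_columns(header):
    """Split a header row into (decision variable names, objective names)."""
    if header[:2] != ["name", "group"]:
        raise ValueError("first two columns must be 'name' and 'group'")
    body = header[2:]
    if len(body) <= N_OBJECTIVES:
        raise ValueError("need at least one decision variable before the objectives")
    return body[:-N_OBJECTIVES], body[-N_OBJECTIVES:]

def normalize_decisions(rows, decision_names):
    """Scale each decision variable by its largest value across all solutions."""
    for name in decision_names:
        peak = max(float(r[name]) for r in rows)
        for r in rows:
            r[name] = float(r[name]) / peak if peak else 0.0
    return rows

# Hypothetical two-solution file with two decision variables (dv1, dv2).
raw = io.StringIO(
    "name,group,dv1,dv2,obj1,obj2,obj3,obj4,obj5,obj6\n"
    "s1,1,10,0,0.99,25,4,1,2,3\n"
    "s2,2,5,8,0.97,40,6,1,2,3\n"
)
reader = csv.DictReader(raw)
rows = list(reader)
decisions, objectives = split_columns(reader.fieldnames)
rows = normalize_decisions(rows, decisions)
```

After normalization every decision variable column has a maximum of 1, so the heatmap shading is comparable across columns.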

 

Using the Tool

For purposes of illustration, I’ll demonstrate how the tool can be used on a six-objective water supply problem that yielded a Pareto optimal set of 2600 solutions. This demonstration will follow Daniel Keim’s procedure for analyzing complex data, broadly outlined as: analyze and show the important, brush/filter, analyze further, show the important, and provide details on demand.

Step 1: Analyze and show the important

The full Pareto set can be analyzed for broad patterns within the decision and objective spaces. The distribution of colors (representing grouping) on the parallel coordinate plot can be used to determine any correlation between certain groups of solutions and resulting objective values. The ring plot in the upper right corner can also be used to determine which groups of solutions are most prevalent within the Pareto set.

Pcoord_full

Figure 3: The parallel coordinate plot of the unbrushed Pareto optimal solutions to the six-objective problem. Note that groups four and five, colored green and purple, are most prevalent in the full Pareto set.

The heatmap can also provide some useful insight into the nature of the full Pareto set. Though the heatmap cannot provide helpful insight into single solutions when 2600 points are plotted simultaneously, it can reveal broad trends within the data. For example, in Figure 4 below, certain decision variables appear as columns of nearly solid yellow or blue. Solid yellow columns indicate decision variables that are not frequently employed in Pareto optimal solutions, while solid blue columns indicate decision variables that are very frequently employed over the entire set. In the example data set, decision variables 11 and 23 are rarely used, while decision variables 1 and 4 are used in the vast majority of solutions.

Heatmap_full

Figure 4: Heatmap of unbrushed Pareto optimal solutions and their corresponding decision variables.

Step 2: Brush/Filter the data

Once broad trends in the Pareto optimal set have been analyzed, the analyst can brush the data to meet satisficing criteria. Objectives of the highest importance can be brushed first. For this example, I first brushed the data to .99 reliability and above. After brushing, I hit the “keep” button in the upper left to rescale the axes to the brushed data.

 

brushed_no_keep

brushed_keep

Figure 5: Brushed Parallel Coordinate plot. The data is the same in both (a) and (b), but (b) shows the plot after the “keep” button, circled in red in (a), has been pressed, which rescales the axes.
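Conceptually, brushing is a range filter on the objective values, and “keep” recomputes each axis range from the rows that survive. The Python below is a minimal sketch of that idea, not the tool's implementation; the function names `brush` and `axis_extents`, the objective names, and the sample values are hypothetical.

```python
def brush(rows, limits):
    """Keep rows whose objectives fall inside every (low, high) range in limits."""
    return [
        r for r in rows
        if all(lo <= r[obj] <= hi for obj, (lo, hi) in limits.items())
    ]

def axis_extents(rows, objectives):
    """Min and max of each objective, i.e. the axis range after "keep"."""
    return {
        obj: (min(r[obj] for r in rows), max(r[obj] for r in rows))
        for obj in objectives
    }

# Hypothetical solutions with two of the objectives shown.
solutions = [
    {"reliability": 0.995, "storage": 20.0},
    {"reliability": 0.970, "storage": 45.0},
    {"reliability": 0.999, "storage": 35.0},
]

kept = brush(solutions, {"reliability": (0.99, 1.0)})
extents = axis_extents(kept, ["reliability", "storage"])
```

With these made-up values, two solutions survive the brush, and the storage axis shrinks from 20 to 45 down to 20 to 35.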

The heatmap of decision variables is dynamically updated as the parallel coordinate plot is brushed. An examination of the new heatmap can yield insights into connections between highly reliable solutions and the decision variables that comprise them. The heatmap still contains too many rows to provide insight into the decisions that compose individual solutions, but the color patterns can be compared to the original heatmap. Columns 21, 25 and 27 are noticeably darker in Figure 6 than in Figure 4, meaning that those decision variables are employed more in the solutions with greater than .99 reliability than in the overall Pareto set. Conversely, columns 19, 29 and 30 are noticeably lighter in Figure 6 than in Figure 4, meaning that those decision variables are not employed as commonly in higher reliability solutions.

heatmap_brushed1

Figure 6: Heatmap of solutions brushed to the 99% reliability criterion
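The visual comparison between the two heatmaps can also be approximated numerically by comparing the average value of each column before and after brushing. The Python below is a sketch of that idea and not something the tool does for you; `column_means`, `usage_shift`, the variable names `dv_a` and `dv_b`, and the sample values are hypothetical, and decision variables are assumed to be normalized as described in the upload section.

```python
def column_means(rows, decision_names):
    """Average value of each decision variable over the given solutions."""
    return {n: sum(r[n] for r in rows) / len(rows) for n in decision_names}

def usage_shift(full, brushed, decision_names):
    """Brushed mean minus full-set mean for each decision variable.

    Positive values correspond to columns that look darker after brushing,
    negative values to columns that look lighter.
    """
    full_mean = column_means(full, decision_names)
    brushed_mean = column_means(brushed, decision_names)
    return {n: brushed_mean[n] - full_mean[n] for n in decision_names}

# Hypothetical Pareto set with two normalized decision variables.
pareto = [
    {"reliability": 0.995, "dv_a": 0.0, "dv_b": 1.0},
    {"reliability": 0.970, "dv_a": 1.0, "dv_b": 0.0},
    {"reliability": 0.999, "dv_a": 0.0, "dv_b": 1.0},
    {"reliability": 0.950, "dv_a": 1.0, "dv_b": 0.0},
]
reliable = [r for r in pareto if r["reliability"] >= 0.99]
shift = usage_shift(pareto, reliable, ["dv_a", "dv_b"])
```

In this toy set, `dv_b` is used more heavily among the reliable solutions and `dv_a` less, which is the same kind of pattern read off the figures by eye.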

Step 3: Analyze Further

I then brushed the Pareto set again, yielding a more refined set of solutions. Figure 7 shows the Pareto set brushed to the satisficing criteria of reliability > 99% and storage < 30. The keep function was again used to rescale the axes.

brushed2_keep

heatmap_brushed2

Figure 7: The Pareto set brushed to satisficing criteria of >99% reliability and <30 storage.

The heatmap can again be used to compare brushed solutions to the entire set. The heatmap in Figure 7 reveals that Decision Variables 2, 19 and 29 are only employed in three solutions while 21, 25 and 27 are still very dark.

Step 4: Show the important

I then brushed the Pareto set a final time, which yielded a much smaller set of acceptable solutions. Figure 8 shows the Pareto set brushed to satisficing criteria of reliability > .99, storage < 30 and supply < 5. The parallel coordinate plot illuminates tradeoffs between the objective values of solutions, while the heatmap can be used to illuminate tradeoffs within the decision space. The heatmap shows that many decision variables are unused in every solution that meets these satisficing criteria, which may help analysts simplify future problem formulations.

brushed3_keep

heatmap_brushed3

Figure 8: Parallel Coordinate Plot and Linked heatmap shown for brushed criteria Reliability > .99, Storage < 30 and Supply < 5
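If the brushed solutions are exported, the unused decision variables can be listed directly rather than read off the heatmap. The Python below is a small sketch under the assumption that decision variable values are non-negative (as they are after the normalization described earlier); `unused_variables`, the `tol` parameter, and the sample rows are hypothetical.

```python
def unused_variables(rows, decision_names, tol=0.0):
    """Decision variables whose value never exceeds tol in any given solution."""
    return [n for n in decision_names if all(r[n] <= tol for r in rows)]

# Hypothetical brushed set with three normalized decision variables.
brushed = [
    {"dv1": 0.8, "dv2": 0.0, "dv3": 0.0},
    {"dv1": 0.4, "dv2": 0.0, "dv3": 0.2},
]
unused = unused_variables(brushed, ["dv1", "dv2", "dv3"])
```

A small positive `tol` can be used to treat near-zero usage as unused.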

It should also be noted that the proportion of solutions from each group in the brushed set differs from that in the entire Pareto set. While the most common group of solutions in the entire set was Group 5, no solution from Group 5 remains in the brushed set. Group 2, which represented only a small fraction of the entire Pareto set, represents a significant portion of the solutions that meet the satisficing criteria.

Step 5: Provide details on demand

Information on individual solutions can be found either by hovering over a cell of the heatmap or by hovering over the tabulated row of a given solution, which will highlight it on the parallel coordinate plot.

Conclusion

This tool can assist analysts in understanding tradeoffs within the decision space as well as the objective space. It can also help identify broad mappings between decision variables and their effects on objective values. It was designed to be a dynamic and easy-to-use tool to assist analysts during an iterative decision-making process that employs a MOEA. The tool is still a work in progress, so stay tuned to the blog to hear of improvements to the interactive nature of the tool.

5 thoughts on “The Linked Parallel Coordinate Plot with Linked Heatmap: A tool for simultaneous visualization of the objective and decision spaces”

  1. Would it be possible to make the source code of this tool available? It is possible to use D3-based code within the Jupyter notebook. If you release the source code of the tool, I will write a blog post showing how you can use the tool within the notebook to explore data stored in a pandas dataframe.
