Lower dimensional visualizations of many-objective Pareto fronts

Understanding the geometry of a many-objective Pareto front can be challenging due to the high dimensionality of many-objective problems. Often, we use tools such as parallel coordinate plots, glyph plots and pairwise scatter plot matrices to identify trade-offs and select candidate alternatives. While these tools are useful for decision making, they don’t always capture patterns or geometric properties that may be embedded within many-objective Pareto fronts.

Mathematical tools for visualizing high dimensional data on lower dimensional manifolds have existed for decades, and in recent years they have been applied in many-objective contexts (Filipič and Tušar, 2018). In this post I’ll give an overview of four common methods for representing Pareto fronts on lower dimensional manifolds and demonstrate them on two many-objective test problems: the four objective DTLZ 2 and four objective DTLZ 7 (Deb et al., 2005).

Parallel coordinate plots of the two test problems can be found in Figure 1. DTLZ 2 has a continuous Pareto front, while DTLZ 7 has a highly disconnected Pareto front. Both Pareto fronts visualized here are the analytical true Pareto fronts from the MOEAFramework.

I’ve added the code to produce the plots in this post to my repository on many-objective visualization, which can be found here.

Figure 1: Parallel coordinate plots of the four objective DTLZ 2 (left) and DTLZ 7 (right)

1. Multidimensional Scaling

Multidimensional Scaling (MDS) refers to a family of techniques that seek low dimensional representations of high dimensional spaces by preserving pairwise distances between points (Kruskal, 1978). Classic MDS attempts to preserve the Euclidean distance between each pair of points by minimizing a stress function defined as:

stress = \left(\frac{\sum_i \sum_j (f(x_{ij}) - d_{ij})^2}{\sum_i \sum_j d_{ij}^2}\right)^{1/2}

Where:

f(x_{ij}) is the Euclidean distance between points x_i and x_j in the full dimensional space. (Note: extensions of MDS have been developed that replace this distance with a weakly monotonic transformation of the original distances.)

d_{ij} is the Euclidean distance between points x_i and x_j in the lower dimensional representation

To perform MDS on the test problem Pareto fronts, I used the Manifold tool from the Yellowbrick package, a machine learning visualization module associated with sklearn. MDS representations of four objective DTLZ 2 and DTLZ 7 are shown in Figure 2. For the highly disconnected DTLZ 7 problem, MDS clearly distinguishes the 8 clusters within the Pareto front.

Figure 2: MDS representations of the four objective DTLZ 2 (left) and DTLZ 7 (right)
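To reproduce a plot like Figure 2, here is a minimal sketch of the Yellowbrick workflow, assuming the front’s objective values are saved to a CSV file (the file name dtlz2_front.csv is my own placeholder; any (n_solutions, n_objectives) array will work):

import numpy as np
from yellowbrick.features import Manifold

#load the Pareto front objective values (hypothetical file name)
objectives = np.loadtxt("dtlz2_front.csv", delimiter=",")

#embed the four objective front in two dimensions with classic MDS
viz = Manifold(manifold="mds")
viz.fit_transform(objectives)
viz.show()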

2. IsoMaps

IsoMaps are an extension of MDS that first connect the points in a k-nearest-neighbor graph, approximate the geodesic distance between each pair of points as the shortest path through that graph, then map the points to a lower dimensional space that preserves these geodesic distances. To create IsoMap visualizations for the test problems, I again utilized the Yellowbrick manifold function. IsoMap projections for four objective DTLZ 2 and DTLZ 7 are shown in Figure 3. Like MDS, IsoMapping clearly reveals the disconnected nature of the DTLZ 7 Pareto front. I should mention that I had to use 30 neighbors to achieve this, which is much higher than the 8-12 neighbors recommended as an unofficial rule of thumb. This could be a result of the highly disconnected nature of DTLZ 7, which may cause problems for IsoMap.

Figure 3: IsoMap representations of the four objective DTLZ 2 (left) and DTLZ 7 (right)
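The IsoMap projection uses the same Yellowbrick interface; only the manifold argument and the neighbor count change. A sketch, continuing from the MDS snippet above:

#IsoMap with the 30 neighbors mentioned above; n_neighbors controls the
#k-nearest-neighbor graph used to approximate geodesic distances
viz = Manifold(manifold="isomap", n_neighbors=30)
viz.fit_transform(objectives)
viz.show()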

3. Sammon Mapping

Like MDS and IsoMapping, Sammon Mapping (Sammon, 1969) seeks a lower dimensional representation that preserves the distances between each pair of points in the Pareto front from the high dimensional space. Sammon Mapping uses a modified version of stress known as “Sammon Stress”, defined as:

S = \frac{1}{\sum_{i<j} d^{*}_{ij}} \sum_{i} \sum_{j>i} \frac{(d^{*}_{ij} - d_{ij})^2}{d^{*}_{ij}}

Where:

d^{*}_{ij} is the distance between points x_i and x_j in the full objective space

d_{ij} is the distance between points x_i and x_j in the lower dimensional space

The key difference between Sammon Stress and the classic MDS stress is that Sammon Stress normalizes each term by the distance in the high dimensional space rather than the low dimensional representation, which emphasizes the preservation of small distances. This function is usually minimized by gradient descent.

I coded the Sammon maps for the two test problems using an open source implementation from tompollard on GitHub. Like the other two methods, Sammon mapping highlights the disconnected nature of DTLZ 7 while showing a relatively continuous representation of DTLZ 2 that suggests its spherical shape.

Figure 4: Sammon mapping representations of the four objective DTLZ 2 (left) and DTLZ 7 (right)
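If you’d rather avoid an extra dependency, the gradient descent described above is short enough to sketch by hand. Below is a minimal, illustrative implementation (this is my own sketch, not the tompollard code; the function name, step size and iteration count are arbitrary choices, and a serious implementation would add Sammon’s second-order update and a convergence check):

import numpy as np
from scipy.spatial.distance import pdist, squareform

def sammon_sketch(X, n_components=2, n_iter=500, step=1.0, seed=0):
    #pairwise distances in the full objective space
    D = squareform(pdist(X))
    np.fill_diagonal(D, 1.0)   #dummy value to avoid dividing by zero
    c = D.sum()                #normalizing constant in Sammon Stress
    rng = np.random.default_rng(seed)
    Y = rng.normal(scale=1e-2, size=(X.shape[0], n_components))
    for _ in range(n_iter):
        d = squareform(pdist(Y))
        np.fill_diagonal(d, 1.0)
        #gradient of Sammon Stress with respect to each embedded point
        delta = (D - d) / (D * d)
        grad = np.zeros_like(Y)
        for i in range(Y.shape[0]):
            grad[i] = (-2.0 / c) * (delta[i, :, None] * (Y[i] - Y)).sum(axis=0)
        Y -= step * grad
    return Y

#two dimensional embedding of the objectives array loaded in the MDS snippet
embedding = sammon_sketch(objectives)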

4. Self Organizing Maps

Self Organizing Maps (SOMs; Kohonen, 1982) use an artificial neural network to create a discrete, low dimensional representation of a high dimensional space. SOMs are a form of unsupervised machine learning used in both classification and dimensionality reduction applications.

To map the high dimensional data, SOMs start with a uniformly spaced grid of neurons, then implement a competitive learning algorithm to associate each neuron with a set of Pareto front solutions. This video has the best explanation I’ve seen of how SOMs work (thanks to Rohini for sharing it with me). I also found this GIF from Wikimedia Commons helpful for understanding SOMs.

Pareto front visualizations using SOMs plot the original uniform grid of neurons on an x-y plane and color each grid point by the distance between neighboring neurons in the final trained map (often called a U-matrix). Grid points with dark shading (long distances between final neurons) indicate boundaries between clusters in the high dimensional space. SOMs for the four objective DTLZ 2 and DTLZ 7 problems are shown in Figure 5. The disconnected clusters in DTLZ 7 are clearly apparent, while no real structure is shown for DTLZ 2.

Figure 5: SOM representations of the four objective DTLZ 2 (left) and DTLZ 7 (right)
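I won’t prescribe a particular SOM implementation here, but one accessible option is the MiniSom package, sketched below (the 10 by 10 grid, training length and file name are my own assumptions):

import numpy as np
import matplotlib.pyplot as plt
from minisom import MiniSom

#load the Pareto front objective values (hypothetical file name)
objectives = np.loadtxt("dtlz7_front.csv", delimiter=",")

#10x10 grid of neurons, one weight component per objective
som = MiniSom(10, 10, objectives.shape[1], sigma=1.0, learning_rate=0.5, random_seed=0)
som.random_weights_init(objectives)
som.train_random(objectives, 5000)

#distance_map() returns the U-matrix: each neuron's average distance to its neighbors
plt.pcolor(som.distance_map().T, cmap="bone_r")   #dark cells mark cluster boundaries
plt.colorbar()
plt.show()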

Concluding thoughts

To be perfectly honest, I don’t think any of the methods described in this post are particularly useful for understanding many-objective optimization results on their own. However, they may be a useful complement when exploring solution sets, revealing patterns that may not be apparent when visualizing the full dimensional space. Additionally, they are all fairly straightforward to code and can easily be included in an exploratory analysis.

References

Deb, K., Thiele, L., Laumanns, M., & Zitzler, E. (2005). Scalable test problems for evolutionary multiobjective optimization. In Evolutionary multiobjective optimization (pp. 105-145). Springer, London.

Filipič, B., & Tušar, T. (2018, July). A taxonomy of methods for visualizing Pareto front approximations. In Proceedings of the Genetic and Evolutionary Computation Conference (pp. 649-656).

Kohonen, T. (1982). Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43(1), 59-69.

Kruskal, J. B. (1978). Multidimensional scaling (No. 11). Sage.

Sammon, J. W. (1969). A nonlinear mapping for data structure analysis. IEEE Transactions on Computers, C-18(5), 401-409.

Parallel plots in R

Today I’d like to share some simple code we worked up with the help of CU graduate student Rebecca Smith for creating parallel coordinate plots in R. The code plots a Pareto approximate set in light gray, and on top of it places several highlighted solutions. Showing 3 highlighted solutions gives a nice view of the tradeoffs in many dimensions simultaneously. You can follow the colored lines and see how objectives and decisions interact, and having the colored solutions above the full tradeoff gives you a sense of how individual solutions compare to the full tradeoff set (i.e., is a solution the lowest in the set? The highest?). I would show you an example, but, well, that is left to the reader 🙂

I’ll post the code here and then explain below.

###
#R template to make Parallel Coordinate plots with the full set in gray and highlighted solutions in color
#
#by Rebecca Smith and Joe Kasprzyk
###

#
#User defined parameters
#

#the Excel file containing the data, with a header row
myFilename="an-excel-file.xlsx"

#the file folder you're working in
myWorkingDirectory="D:/Data/Folder/"

#a vector of the indices you want to plot, in order
objIndices=c(16, 2, 11, 5, 9, 17)

#highlighted solutions.  The default here is 3; to use more or fewer you have to change the code below

sol1.row = 125
sol1.color = "blue"

sol2.row = 109   #the red one
sol2.color = "red"

sol3.row = 179   #the green one
sol3.color = "green"

#
#The code is below.  You should only have to make minimal changes below this point!
#
setwd(myWorkingDirectory)

options(java.parameters = "-Xmx4g")   #give Java extra heap space before XLConnect loads
require(XLConnect)   #for reading Excel workbooks
library(MASS)        #provides parcoord()

wb1 = loadWorkbook(myFilename)
archive=as.matrix(readWorksheet(wb1,sheet="Sheet1",header = TRUE))

N=nrow(archive)   #number of solutions in the archive

#save the specific data you want to plot
par.plot.arch=archive[,objIndices]

#Concatenate a matrix that starts with the rows of all the solutions,
#then appends the specific solutions you want.
par.plot.arch.sol.1.2.3=rbind(par.plot.arch,par.plot.arch[sol1.row,],par.plot.arch[sol2.row,],par.plot.arch[sol3.row,])
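#to highlight a 4th solution, append par.plot.arch[sol4.row,] here and add
#matching color and line weight entries in the parcoord call below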

#parcoord takes the following arguments:
#the data
#col is a vector of the colors you want to plot
#lwd is a vector of values that will be assigned to line weight
#lty is a vector of values that will be assigned to line type

#note that rep.int below repeats the integer the specified number of times.  Basically we are telling R to make the first N values
#gray, and then the rest are assigned a specific color.

#including linetype...
#parcoord(par.plot.arch.sol.1.2.3, col=c(rep.int("grey",N),sol1.color,sol2.color,sol3.color), lwd=c(rep.int(1,N),6,6,6),lty=c(rep.int(1,N),2,2,2),var.label=TRUE)

#without linetype
parcoord(par.plot.arch.sol.1.2.3, col=c(rep.int("grey",N),sol1.color,sol2.color,sol3.color), lwd=c(rep.int(1,N),6,6,6),var.label=TRUE)

Most of my explanation is in the code comments above, but a few quick thoughts.

First, the code is separated into stuff the user can change and stuff the user shouldn’t have to play around with too much. One big thing is that the number of ‘highlighted’ solutions is hardcoded to 3. It is pretty simple to add more; you just have to make sure that you add the new entities everywhere they belong (i.e., in concatenating the data and in the plotting), as noted in the comment above the rbind call.

Second, play around with the ordering of the columns for the best effect. You’ll see that R’s c() construction lets you make a vector of column indices in any order.

Finally, remember you can also use the col option in parcoord to plot a real-valued color spectrum. We have found, though, that this is difficult to do well if there isn’t a clear monotonic relationship between the colored variable and the others.

That’s all for now but feel free to comment below!