(This post will be an introduction to the basic ideas behind MOEAs and will include references to relevant literature. Editors, please feel free to add citations as you see fit).
What is a MOEA?
Multi-objective evolutionary algorithms (MOEA) are used to optimize a set of decision variables with respect to a set of objectives. (The “multi-objective” distinction just means that more than one metric is being used to evaluate performance). In between the decision variables and objectives lies your model, which could be any application where performance needs to be optimized. For a detailed explanation of terminology related to decision variables and objectives, see Joe’s post. Evolutionary algorithms are considered heuristic methods because they do not solve problems deterministically, but rather explore the space of possible solutions and dynamically learn which decision variables lead to good performance. Perhaps the most comprehensive resource for learning about MOEAs is Evolutionary Algorithms for Solving Multi-Objective Problems by Coello Coello (ask around for hard or PDF copies). This can be a difficult read for those new to the subject, so you may want to read the rest of this post and peruse the suggested papers before diving in.
Why do we use them?
A number of different optimization strategies exist, including linear programming, response surface methods, and others. For some optimization problems, you can even find closed-form analytical solutions (think of finding function maxima in calculus). Compared to other strategies, evolutionary algorithms are often considered to be messy and inelegant; for example, in the Algorithm Design Manual, Steven Skeina refers to genetic algorithms as “voodoo”. EAs truly are the sledgehammer of optimization methods—but this is why we use them! In engineering, we frequently encounter all sorts of difficult problems, including: constrained non-convex (where you must travel through infeasible space to reach feasible space), multi-modal (many “false” local optima), and high-dimensional problems, in terms of both decision variables and output objectives. More elegant methods typically need to be reformulated on a per-problem basis, but modern MOEAs can handle these difficult problems without altering their implementation. We want a solver that is reliable and robust across problem types, which is why we turn to MOEAs. A good discussion of the motivation for multi-objective optimization from a management perspective is given in Brill et al. (1982) Modeling to Generate Alternatives (note that the specific algorithm they use is less important for our purposes).
How do they work? (Single-objective)
For simplicity, let’s start by looking at a basic single-objective EA. This description is mostly based on the early genetic algorithms, but please note that a wide variety of variations exist. An EA begins with a randomly selected population of solutions (i.e., it randomly selects parameter sets and evaluates them in your model). From here, the algorithm uses mathematical operators to guide its search. EAs generally employ three operators—crossover, mutation, and selection—although the specific form of these operators varies between algorithms. The crossover operator combines decision variables from two or more solutions to create new solutions. The mutation operator perturbs the decision variables of a single solution to look for improvement in its local vicinity. Finally, the selection operator determines which solutions are allowed to proceed to the next “generation” (the next cycle of running these operators) and which will be discarded. By performing this process over multiple generations, the algorithm is able to evolve the best-known solution (hence the analogies to biological evolution).
Some technical considerations
Every run of an EA is a battle between convergence toward the best solution (governed by the selection operator) and diversity of solutions (governed by the crossover and mutation operators). Ideally, we want fast convergence with maximum diversity, but a balance must be struck between the two—we can sacrifice some convergence speed to ensure that the algorithm does not converge prematurely to a local optimum.
We also must consider the mathematical properties of our problems and the difficulties these may pose for evolutionary search. The success of early EAs generally depends on whether the problem is decomposable (i.e., if the decision variables can be optimized individually). This is closely related to the concept of rotational invariance, which describes whether the decision variables interact with each other and thus need to be optimized together rather than individually. Finally, the success of EAs also depends on the causality of the problem: in a problem with strong causality, a small change in decision variables results in a small change in the objective function(s). We can expect EAs to perform better on problems that have both decomposability and strong causality. However, modern MOEAs utilize adaptive operator schemes to attempt to overcome these problems. A good discussion of testing the ability of EAs to overcome nasty problems (especially high dimensionality) is given by Deb et al. (2005) Scalable Test Problems.
Moving to multi-objective EAs
In a single-objective optimization, an EA will return only one point that is optimal with regard to the objective function. What if we want to incorporate multiple objectives? Our first thought might be to combine all of objectives together into one, using some kind of weighting scheme. However, this requires that we know ahead of time which objectives we care about most, and this is not always the case. Instead, we can incorporate all of our desired objective functions separately using the (crucial) principle of nondominance. In a set of solutions, a solution is said to be “dominated” if another solution exists which is better than it in all objectives. The solutions that are not dominated are said to be—you guessed it—”nondominated”. The set of all nondominated solutions in a given generation is referred to as the Pareto front or tradeoff surface (because objectives usually conflict with one another, resulting in multiple possible “best” solutions). So, rather than optimizing all of the objective functions themselves, modern MOEAs instead optimize the nondominance of solutions. The first algorithm to utilize this approach was Deb’s NSGA; Deb’s NSGAII (2002 paper) addressed many shortcomings of NSGA, and soon became extremely popular (to say the least). A thorough, broader discussion of multi-objective optimization is given by Fleming et al. 2005.
Where is the Pareto front?
A single run of a MOEA will provide a best-known approximation set of solutions, based on the principle of nondominance. But how do we know whether this is “right” or not? Many people have debated this question in more of a philosophical, mathematical sense. Practically speaking, we usually investigate this question using brute force. We don’t just run a MOEA one time—instead, we use many different random seeds, a high number of function evaluations, and sometimes even different algorithms. For example, our recent AWR study used 10 different algorithms with 50 random seeds each. Our approach to the question “is it right?” is just to take the results from all of these runs and find the final set of nondominated solutions, known as the reference set. This is the best-known set of solutions to the problem after performing as much evolution as computation allows. We can discuss whether a reference set is “right”, but from an engineering perspective, it is perhaps more important to discuss whether it provides a useful, diverse set of new solutions to the problem.
Improving efficiency: epsilon dominance
MOEAs rely on nondominated sorting to generate a set of optimal solutions. This is a computationally expensive operation, because an algorithm must compare every existing solution to every other solution. As a consequence, an algorithm may waste a lot of time sorting solutions that differ from each other by insignificant amounts. An important contribution to address this problem is the principle of epsilon dominance. Epsilon dominance asks the user to specify a desired precision for each objective value ahead of time in order to simplify the nondominated sorting operations. For example, you may only care about measuring cost to the nearest dollar, or for a big project, to the nearest thousand dollars. An algorithm will then only find the best solution within each “epsilon” of the objective space, rather than optimizing using full floating point precision. The theory behind epsilon dominance is discussed in Laumanns et al. 2002. Josh and Pat added epsilon dominance to NSGAII to create eNSGAII; a good description of the algorithm design and performance is given in their 2006 comparison paper. Finally, a clear example of the efficiency gains from epsilon dominance is given in Kollat and Reed (2007) Scaling Analysis.
Putting it all together: Visualizing alternatives
We perform all of this computation because, somewhere in the world, there are stakeholders who care about this problem. It would be somewhat unhelpful to approach these stakeholders with a 10-page ASCII printout of a reference set. If we want to increase the real-world impact of our heroic efforts, we need to provide stakeholders with visualizations of our best-known solutions. Fortunately for our group, Josh has created the Aerovis program to view solution sets in multiple dimensions. Aerovis allows us to explore our solution sets and identify regions of interest in the objective space, and allows stakeholders to easily see the full range of possible outcomes for their problem. Getting started with MOEAs involves a lot of in-depth technical effort, but it’s important not to lose sight of the big picture—ultimately, we want to present meaningful results to stakeholders in a visually appealing way. A company or government agency may not care what algorithm you used, or how many random seeds you ran, etc., but they certainly do care if you can provide them with a better solution than the status quo. Some great examples of visualizing practical results are given in the Kollat and Reed (2007) VIDEO paper and also Kasprzyk et al. 2011.
The reading listed here should get you started for now. I plan to write another post about more advanced topics, including MOEA diagnostics (why we need them, what they mean, etc.). If you want to get a head start on MOEA diagnostics, you can check out Dave and Pat’s recent paper in Evolutionary Computation. Thanks for reading and feel free to email jdh366 (at) cornell (dot) edu with questions.