
Multi-Objective Optimization of Stochastic, Black-Box Systems Using Direct Search and Indifference Values

Todd J. Paciencia¹ and James W. Chrissis²

Air Force Institute of Technology, WPAFB, OH 45433

In this work, a general framework is developed to solve black-box, multi-objective problems to a desired level of resolution or completeness of the Pareto front. This framework can be used to solve problems with or without closed-form representation and can be expanded easily for stochastic responses. An indifference region-based method is developed to help determine the completeness of a Pareto approximation and to find any possible missing portions of the optimal front. This method is used with optimization of single-objective formulations via direct search methods to complete the approximation. The resulting algorithm is evaluated on systems with up to eight objectives and is shown to provide a reasonably complete approximation of the Pareto set, and to do so efficiently if performed smartly.

Nomenclature

$f^g$ = Utopia point
$f^b$ = Nadir point
$\omega_i$ = Indifference value in objective $i$

¹ Student, Air Force Institute of Technology.
² Associate Professor, Air Force Institute of Technology, AFIT/ENS, 2950 Hobson Way, WPAFB, OH 45433, USA.

I. Introduction

THERE are many existing methodologies for optimization over multiple objectives, but most carry limitations. As discontinuity, non-convexity, a large number of objectives or decision variables, black-box systems, and/or uncertainty are introduced, existing methodologies may fail to find a relatively complete Pareto approximation in an efficient manner. During black-box, or simulation-based, optimization this is very important, as each function evaluation may be expensive and decision-makers and analysts may not have enough knowledge of the system a priori to fix focus on only a part of the Pareto set. Therefore, the priority becomes achieving a desired level of completeness while maintaining efficiency.

Three methods using direct search were introduced that each sought to avoid the limitations of other methods in order to solve a problem with no closed-form representation relatively completely and efficiently. Walston introduced the first method, Stochastic Multi-Objective Mesh Adaptive Direct Search (SMOMADS) [1]. SMOMADS uses aspiration and reservation levels within the context of scalarization functions to solve single-objective formulations. Using design of experiments with the aspiration and reservation levels as factors, different regions of the Pareto front can be found. SMOMADS uses Ranking and Selection (R&S), a method in which sample means are used to select a best candidate via probability of correct selection, to determine best points to account for variation [2]. The specific R&S procedure used by Walston was Sequential Selection with Memory (SSM) by Pichitlamken and Nelson [3].

Audet, Savard, and Zghal introduced the second and third methods, Bi-Objective MADS (BiMADS) and Multi-Objective MADS (MultiMADS) [4,5]. BiMADS applies only to two objectives, and uses single-objective formulations and the ordering property in two objectives to complete the Pareto approximation. MultiMADS also uses single-objective formulations, but generates reference points for these formulations using an alternate simplex to the Convex Hull of Individual Minima (CHIM). Both of these methods were developed for deterministic problems, but are easily extended to the stochastic case by using R&S.

All of these methods use Mesh Adaptive Direct Search (MADS), or its mixed-variable form, MV-MADS, to solve their sub-problems [6,7]. MADS allows an infinite set of directions to explore the set of decision variables and uses the concept of a mesh to control the search. More recently, OrthoMADS was introduced, where the polling directions are chosen deterministically such that they are orthogonal to each other, so that the convex cones of missed directions at each iteration are minimal in size [8]. Further, the convergence results for OrthoMADS hold deterministically, rather than with probability one, and allow for non-linear constraints. The true limitation of using MADS is the growth of the poll set as the number of decision variables increases.

This paper first discusses a few other multi-objective and black-box approaches for the sake of discussion. SMOMADS, BiMADS, and MultiMADS are then discussed in more detail. Next, an indifference region-based method is introduced to find potential missing parts of the Pareto front. This method is used to modify the MADS-based algorithms such that their objective functions can be used in a framework to create an n-dimensional algorithm, without the addition of further constraints or the need to build an alternate simplex. Results and parameter settings are also discussed.

II. Multi-Objective Optimization with a Stochastic Response

The multi-objective problem to be solved may include both continuous and discrete variables, as well as some level of noise or uncertainty in each objective. The specific problem formulation is:

Minimize:
\[
E[F(x)] = E[f(x) + \varepsilon_w(x)] \tag{1}
\]
subject to:
\[
g_i(x) \le 0, \quad i \in \{1, \ldots, N\}, \qquad x \in \mathbb{R}^{n_c} \times \mathbb{Z}^{n_d}
\]
where $F(x): \mathbb{R}^{n_c} \times \mathbb{Z}^{n_d} \to \mathbb{R}^N$, $F = (F_1, F_2, \ldots, F_N)$ is the set of objective functions, $\varepsilon_w(x)$ is the random error or noise such that $E[\varepsilon_w(x)] = 0$, and $x$ is the set of continuous and discrete design variables. Here, the objectives need not be smooth.

This formulation can have many optimal solutions depending upon the importance of the objectives to a decision-maker. Therefore, the Pareto set is the set of solutions such that no one solution is better than another in all objectives; any solution for which some other solution is at least as good in every objective and strictly better in at least one is dominated. The deterministic form of this problem may be easier to solve, as points do not need to be sampled repeatedly to find a "best" response. Two points of significance in the multi-objective context are the utopia and nadir points. The utopia point, $f^g$, is the vector consisting of the minimum objective function value of each objective over all feasible points. This can be "easily" found by minimizing each objective independently. The nadir point, $f^b$, is the vector consisting of the maximum objective function value of each objective over all Pareto solutions. This is often approximated by the pseudo-nadir, found using the maximum objective function value of each objective over those solutions corresponding to $f^g$. This is an approximation, as there may be multiple optima for any objective.
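As a rough illustration (not the authors' implementation), the dominance check and the utopia/pseudo-nadir estimates described above can be sketched in Python; the function names here are hypothetical:

```python
import numpy as np

def pareto_filter(F):
    """Return indices of non-dominated rows of F (minimization).

    A point is dominated if some other point is no worse in every
    objective and strictly better in at least one."""
    F = np.asarray(F, dtype=float)
    keep = []
    for j, fj in enumerate(F):
        dominated = any(
            np.all(fk <= fj) and np.any(fk < fj)
            for k, fk in enumerate(F) if k != j
        )
        if not dominated:
            keep.append(j)
    return keep

def utopia_and_pseudo_nadir(F, minimizers):
    """Utopia: componentwise minimum over all evaluated points.
    Pseudo-nadir: componentwise maximum over the solutions that attained
    each single-objective minimum (row indices `minimizers`)."""
    F = np.asarray(F, dtype=float)
    f_g = F.min(axis=0)
    f_b = F[minimizers].max(axis=0)
    return f_g, f_b
```

For example, with points (0,1), (1,0), (1,1), (2,2), only the first two are non-dominated, the utopia is (0,0), and the pseudo-nadir built from the two single-objective minimizers is (1,1).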

A. Methods to Solve the General Problem

There are many multi-objective optimization methods, but those that can find a complete front for any system may still carry limitations once efficiency is considered for a stochastic-response, black-box system. Messac and Mattson modified the Normal Constraint (NC) method to allow for exploration of the entire feasible space while finding an even distribution of points [9]. This method uses additional constraints built from the utopia plane to reduce the design space. For a stochastic, black-box system, R&S may not be sufficient to deal with the uncertainty both in the objectives and in these constraints. Shan and Wang [10] proposed a Pareto-Set-Pursuing (PSP) method to progressively sample closer to the Pareto front. This method is designed to generate evenly distributed solutions, but cannot guarantee evenness or a complete front. Kim and de Weck designed an adaptive weighted-sum technique to overcome the inability of weighted-sum methods to find nonconvex solutions or to easily find evenly distributed solutions [11]. However, their technique uses additional constraints, user-defined parameters in addition to any needed for the sub-problem solver, and a complex construction of Pareto front patching.

The use of surrogates is another tool often found in black-box optimization, and they can also be used in various ways within the framework presented here if means or a best candidate are used for the responses. The use of surrogates is not a focus here, but is worthy of mention. It is important to note that metamodels are entirely reliant on proper sampling and, in some cases, on picking the correct parameters. Metamodels can have issues with highly non-linear or erratic systems and discontinuous objectives. Mullur and Messac designed metamodels to find Pareto solutions after using NC to find near-Pareto solutions, but creating a non-linear metamodel for any of the sub-problems requires a certain number of near-Pareto solutions in an area [12]. Ryu, Kim, and Wan [13] used a metamodeling, trust-region, and weighted-sum combination to solve quadratic sub-problems to generate evenly distributed solutions in a bi-objective problem. However, they used a Central Composite Design to build the metamodels, again highlighting the possible issue of larger design spaces for surrogates. Jones, Schonlau, and Welch [14] designed an efficient global optimization algorithm (EGO) using sampling, metamodeling, cross-validation, and expected improvement. Couckuyt et al. [15] developed an evolutionary algorithm to pick a best surrogate using expected improvement.

Although the research highlighted thus far is only a small fraction of the work related to multi-objective or simulation-based optimization, other non-heuristic methods may not generate a well-distributed Pareto front efficiently in the general case. It is important to note again, however, that using direct search to solve sub-problems for any algorithm may become inefficient as the number of decision variables increases. This will be discussed in further detail in the results. Next we discuss the three MADS-based algorithms that we leverage in this work.

B. SMOMADS

SMOMADS solves Eq. (1) for Pareto optimal solutions by minimizing a single-objective formulation with a given aspiration level $a$ and reservation level $r$:
\[
S^r_a = -\left( \min_i (u_i) + \varepsilon \cdot \sum_{i=1}^{N} u_i \right) \tag{2}
\]
where
\[
u_i =
\begin{cases}
\alpha_i \cdot w_i \cdot (a_i - f_i) + 1, & f_i < a_i, \\
w_i \cdot (a_i - f_i) + 1, & a_i \le f_i \le r_i, \\
\beta_i \cdot w_i \cdot (r_i - f_i), & r_i < f_i,
\end{cases}
\qquad
w_i = \frac{1}{r_i - a_i},
\]
\[
\alpha_i =
\begin{cases}
(0.1)\,\dfrac{r_i - a_i}{a_i - f^g_i}, & a_i \ne f^g_i, \\[4pt]
(0.1)\,\dfrac{r_i - a_i}{10^{-7}}, & \text{o.w.},
\end{cases}
\qquad
\beta_i =
\begin{cases}
(-10)\,\dfrac{r_i - a_i}{a_i - f^b_i}, & a_i \ne f^b_i, \\[4pt]
(-10)\,\dfrac{r_i - a_i}{10^{-7}}, & \text{o.w.},
\end{cases}
\]
and where $\varepsilon$ was set to 5 in Walston's work. The function $u_i$ is of the type called component achievement functions, i.e., strictly monotone functions of the objectives. The minimization of Eq. (2) provides proper Pareto optimal solutions nearest the aspiration level. Therefore, sampling over a variety of aspiration and reservation levels can provide many solutions along the Pareto front. This is depicted in Figure 1.

Fig. 1 Intersection of Rays with Pareto Front.

A deterministic dominance check is performed for SMOMADS. As a tolerance is not always easily defined, and the Pareto front may be unknown a priori, the deterministic check is also used for this research.
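A minimal sketch of the achievement scalarization of Eq. (2), assuming vector inputs for the aspiration and reservation levels and the utopia/pseudo-nadir estimates (the function name is hypothetical):

```python
import numpy as np

def achievement_scalarization(f, a, r, f_g, f_b, eps=5.0):
    """Component achievement scalarization S_a^r (Eq. 2), to be
    minimized by the sub-problem solver.  `a`/`r` are aspiration and
    reservation levels, `f_g`/`f_b` utopia and (pseudo-)nadir
    estimates; eps=5 follows Walston's setting."""
    f, a, r = map(np.asarray, (f, a, r))
    f_g, f_b = np.asarray(f_g), np.asarray(f_b)
    w = 1.0 / (r - a)
    # Slope terms; the 1e-7 denominator mirrors the "otherwise" cases.
    alpha = 0.1 * (r - a) / np.where(a != f_g, a - f_g, 1e-7)
    beta = -10.0 * (r - a) / np.where(a != f_b, a - f_b, 1e-7)
    # Piecewise component achievement functions u_i.
    u = np.where(
        f < a, alpha * w * (a - f) + 1.0,
        np.where(f <= r, w * (a - f) + 1.0, beta * w * (r - f)),
    )
    return -(u.min() + eps * u.sum())
```

For a point midway between aspiration and reservation in every objective, each $u_i = 0.5$, so with two objectives the value is $-(0.5 + 5 \cdot 1.0) = -5.5$.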

C. BiMADS

BiMADS approximates two-objective Pareto fronts by solving a series of single-objective formulations [4]. Specifically, BiMADS begins by finding the solutions that correspond to the utopia point components. As evaluating these solutions for both objectives yields the pseudo-nadir, this also bounds the Pareto approximation. The algorithm works toward the Pareto front using a weighting strategy such that each current nondominated point has a corresponding $\delta$. This $\delta$ equals the sum of the distances from that point to its predecessor and successor in objective space (utilizing the ordering property), divided by the current weight. A point is selected using the maximum $\delta$. A new single-objective formulation is then solved using a reference point derived from the maximum objective function values of those predecessor and successor points. The weights are adjusted so that no point is selected too many times, in the case of discontinuity. This creates a gap-filling strategy. Within the algorithm, every point evaluated is also considered for nondominance, and the algorithm terminates once the maximum $\delta$ is below some predetermined value.

There are two specific single-objective formulations that may be used with a reference point $r$.

The single-objective normalized formulation is:
\[
\hat{R}_r: \quad \min_{x \in X} \hat{\psi}_r = \hat{\phi}_r(f_1(x), f_2(x), \ldots, f_N(x)) = \max_{i \in \{1,2,\ldots,N\}} \frac{f_i(x) - r_i}{s_i} \tag{3}
\]
where $s \in \mathbb{R}^N$. The single-objective product formulation is:
\[
\tilde{R}_r: \quad \min_{x \in X} \tilde{\psi}_r = \tilde{\phi}_r(f_1(x), f_2(x), \ldots, f_N(x)) = -\prod_{i=1}^{N} \left( (r_i - f_i(x))_+ \right)^2 \tag{4}
\]
where $(r_i - f_i(x))_+ = \max\{r_i - f_i(x), 0\}$ and $i = 1, 2, \ldots, N$. These formulations were shown to have convergence to Pareto solutions for any number of objectives using Clarke calculus for non-smooth functions [4]. The latter formulation restricts the choice of reference point, whose dominance zone should be non-empty, but preserves differentiability of the original problem [5].
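The two formulations can be sketched directly from Eqs. (3) and (4); this is an illustrative transcription, with hypothetical function names:

```python
import numpy as np

def psi_hat(f, r, s):
    """Normalized formulation (Eq. 3): max_i (f_i - r_i) / s_i."""
    f, r, s = map(np.asarray, (f, r, s))
    return np.max((f - r) / s)

def psi_tilde(f, r):
    """Product formulation (Eq. 4): -prod_i ((r_i - f_i)_+)^2.
    Zero whenever any objective exceeds its reference component,
    which is why the reference point's dominance zone must be
    non-empty for this formulation to provide descent."""
    f, r = np.asarray(f, float), np.asarray(r, float)
    return -np.prod(np.maximum(r - f, 0.0) ** 2)
```

For instance, with $f = (1, 1)$ and $r = (2, 3)$ the product formulation gives $-(1^2 \cdot 2^2) = -4$, while any point outside the dominance zone evaluates to zero.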

D. MultiMADS

MultiMADS was designed by Audet, Savard, and Zghal [5] to solve problems with more than two objectives without restricting the choice of reference point. They proposed a new single-objective formulation:
\[
R_r: \quad \min_{x \in X} \psi_r = \phi_r(f_1(x), f_2(x), \ldots, f_N(x)) =
\begin{cases}
-\mathrm{dist}^2(\partial D, f(x)), & \text{if } f(x) \in D, \\
\mathrm{dist}^2(\partial D, f(x)), & \text{o.w.},
\end{cases} \tag{5}
\]
where $\mathrm{dist}(\partial D, f(x))$ is the distance in the objective space from $f(x)$ to the boundary $\partial D$ of the dominance zone relative to $r$. Here, the $L_2$-norm is used. The dominance zone $D$ is defined as $\{x \in \mathbb{R}^n : f_i(x) \le r_i \text{ for } i = 1, 2, \ldots, p\}$. They showed that this formulation provides a more flexible optimality condition than $\tilde{R}_r$ and that it generalizes the $\hat{R}_r$ formulation. To construct a $y \in \partial D$ relative to $r$,
\[
y_i =
\begin{cases}
r_i, & \text{if } i = \hat{i}, \\
f_i(x), & \text{o.w.},
\end{cases}
\quad \text{for } i \in \{1, 2, \ldots, p\}, \tag{6}
\]
where $\hat{i} \in \mathrm{argmin}\,\{|f_i(x) - r_i| : i \in \{1, 2, \ldots, p\}\}$.

To determine the reference points, first $z^* = \min_{x \in X} \sum_{i=1}^{p} s_i f_i(x)$ is found, where $s_i$ is a positive scaling factor. Then vectors are generated from the set $B = \{\beta \in \mathbb{R}^p : \sum_{i=1}^{p} \beta_i = 1,\ \beta_i \ge 0\}$. A reference point is defined as $r = f^g + z^* \beta I_p$, $\beta \in B$. The set of these reference points is referred to as the Tangent Hull, and is the alternate simplex. To generate a nice distribution of vectors to create the reference points, the strategy from Normal Boundary Intersection (NBI) is used [16]. In this work, scaling factors of 1 are used in the formulation.
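A small sketch of Eq. (5), using the boundary-point construction of Eq. (6) so that the distance to $\partial D$ reduces to the smallest componentwise gap $\min_i |f_i - r_i|$; this is an illustration of the formulation, not the MultiMADS implementation:

```python
import numpy as np

def psi_multimads(f, r):
    """MultiMADS formulation (Eq. 5).  The nearest boundary point y of
    Eq. (6) replaces the single coordinate of f closest to r, so
    dist(dD, f) = min_i |f_i - r_i|.  The value is negative (a reward)
    inside the dominance zone D = {f : f_i <= r_i for all i} and
    positive outside it."""
    f, r = np.asarray(f, float), np.asarray(r, float)
    d2 = np.min(np.abs(f - r)) ** 2
    in_dominance_zone = np.all(f <= r)
    return -d2 if in_dominance_zone else d2
```

For example, $f = (1, 1)$ with $r = (2, 3)$ lies inside $D$ at squared distance 1 from the boundary, giving $-1$; flipping the first component to 3 moves the point outside $D$ and flips the sign.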

III. Determining the Completeness of a Pareto Approximation

It has been shown for SMOMADS [19] that sampling the aspiration and reservation levels using space-filling designs such as Hammersley sequence sampling [17] and Near-Uniform Design (NUD) [18] can be a more reasonable approach to finding a complete front. Furthermore, MultiMADS has been shown to have desirable results on three-objective problems [5]. However, in some cases being able to apply more of a gap-filling approach, such as that found in BiMADS, may be a more efficient and/or more straightforward way to guarantee the generation of well-distributed Pareto approximations. As the single-objective formulations of all of these methods can still be used if we can determine reference points and gaps in the front, the true issue for greater than two objectives is how to determine if, and where, portions of the Pareto front are missing from the approximation.

Wu and Azarm developed metrics to compare approximations [20], and Farhang-Mehr and Azarm developed an entropy metric to determine the true quality of the approximation of an unknown front [21]. A useful concept implemented by these metrics is an indifference region, or indifference values. Using indifference values, a decision-maker can attempt to decide a priori the required fidelity of the Pareto front approximation in each objective, creating a hypercube or indifference region around each point. We will show that this can be used to find gaps with respect to individual objectives in n-dimensional space.

Figure 2 depicts the basis behind Algorithm 1. In accordance with satisfying the decision-maker's preferences, each point on the Pareto approximation ideally has another point preceding and succeeding it within the indifference value $\omega_i$ (and region) for each objective $i$, as appropriate. Such an indifference region is shown in Figure 2(a).

Fig. 2 Searching Around a Point. (a) (b)

By sorting data one objective at a time, and searching along that objective starting from each data point, gaps may be found using the indifference values. However, as noted with BiMADS, sorting with more than two objectives no longer puts points into an order such that successive points are necessarily closest to or near each other in the space. Therefore, the distance between points relative to some fraction of a norm (here $L_2$ is used) of the indifference values can be used to ensure that points under consideration are in the correct portion of objective space. This method helps account for the fact that solutions lie on a hypercurve, and are not necessarily "linearly" next to each other, and produces a gap consisting of two endpoints. A reference point for $\tilde{R}_r$, $\hat{R}_r$, or $R_r$ can be constructed from these gap endpoints using their maximum values in the objective space. A reservation level for $S^r_a$ can be constructed in the same manner, with the aspiration level being formed by the minimum values. We will show later that this very simple means of finding gaps can be used to form an effective multi-objective framework.

Figure 2(b) demonstrates a simple example. Assume the curve is the Pareto front, the green point is the current solution being searched around, and the square and circle denote the indifference hypercube and indifference $L_2$-norm, respectively. When searching "above" in the second objective, Point 1 is within the indifference value, but outside the indifference norm. Point 2 is within the indifference value, but outside the indifference region. The algorithm would then evaluate Point 3, but as it and any succeeding points are outside the indifference value, a gap would be identified. Going by objective, this is a very quick way to search for gaps among multi-objective solutions. To rectify the issue of the $L_2$-norm not truly representing the indifference hypercube, 0.5 of the $L_2$-norm can be used, as shown in Figure 2(a).

Algorithm 1: Indifference Value-Based Gap Algorithm

Given $c > 0$, a vector of indifference values $\vec{\omega}$, and $p$ non-dominated points:

1: Set $d_{crit} = c \cdot \|\vec{\omega}\|$.
2: Repeat for each objective $n$.
3:   Sort the objective data in ascending order of function value. Set $j = 1$.
4:   Repeat for each data point $j$, relative to the sorted data.
5:     Set $i = 1$.
6:     If $j = 1$ or $j = p$, set $j = j + 1$ or stop, respectively (extreme points).
7:     Else, if $|f^n_j - f^n_{j-i}| \le \omega_n$ and $\|f_j - f_{j-i}\| \le d_{crit}$, set $j = j + 1$.
8:     Else, if $|f^n_j - f^n_{j-i}| > \omega_n$, find the closest point $k$ to $j$ (smallest $L_2$-norm), from point $1$ to $j - 1$.
9:       If $|f^n_j - f^n_k| \le \omega_n$, set $j = j + 1$ (will find in another objective).
10:      Else, add $(j, k)$ as a gap. Set $j = j + 1$.
11:      End If.
12:    Else, $i = i + 1$.
13:    End If.
14:  Search above using the same process (steps 5-12), except using $j + i$ instead of $j - i$ in lines 7-8, and points $j + 1$ to $p$ in line 8.
15: Remove gaps with a distance between their centers less than $d_{crit}$, retaining one.

Algorithm 1 is the resulting algorithm. Keeping efficiency in mind, Algorithm 1 removes similar gaps in its final step. Gaps may be found that are near each other in space when there are many objectives, and a single sub-problem may fill gaps for multiple objectives. If it does not, the gaps are re-identified in subsequent steps of the main optimization algorithm. Other steps in the algorithm also serve either to reduce similar gaps found or to increase computational efficiency.
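The gap search can be sketched as follows. This is a simplified reading of Algorithm 1, not the authors' code: the "below" and "above" scans are treated symmetrically, the step-7 norm is taken over full objective vectors, and de-duplication keeps the first of any pair of gaps whose centers fall within $d_{crit}$ of each other:

```python
import numpy as np

def find_gaps(F, omega, c=0.5):
    """Indifference-value gap search (sketch of Algorithm 1).
    F: (p, N) array of non-dominated objective vectors; omega: vector
    of indifference values.  Returns (j, k) index pairs into F marking
    gap endpoints."""
    F = np.asarray(F, float)
    omega = np.asarray(omega, float)
    p, N = F.shape
    d_crit = c * np.linalg.norm(omega)
    gaps = []
    for n in range(N):
        order = np.argsort(F[:, n])
        S = F[order]
        for j in range(1, p - 1):            # skip the extreme points
            for side in (-1, +1):            # search below, then above
                nbrs = range(j - 1, -1, -1) if side < 0 else range(j + 1, p)
                # Satisfied if some neighbor on this side is within the
                # indifference value in objective n AND within d_crit.
                ok = any(
                    abs(S[j, n] - S[i, n]) <= omega[n]
                    and np.linalg.norm(S[j] - S[i]) <= d_crit
                    for i in nbrs
                )
                if ok:
                    continue
                # Otherwise take the closest point on this side (L2).
                k = min(nbrs, key=lambda i: np.linalg.norm(S[j] - S[i]))
                if abs(S[j, n] - S[k, n]) > omega[n]:
                    gaps.append((order[j], order[k]))
    # Step 15: drop gaps whose centers are within d_crit of a kept gap.
    kept = []
    for g in gaps:
        ctr = (F[g[0]] + F[g[1]]) / 2
        if all(np.linalg.norm(ctr - (F[a] + F[b]) / 2) >= d_crit
               for a, b in kept):
            kept.append(g)
    return kept
```

On a two-objective front with two tight clusters separated by a large hole, the search flags exactly one gap whose endpoints are the facing cluster boundaries.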


Fig. 3 Algorithm 1 Limitation.

Algorithm 1 does have one known limitation. A single point can satisfy both the "above" and "below" search for more than one objective. Imagine a circle of points on the surface of a sphere, depicted two-dimensionally in Figure 3. These points can satisfy the indifference and distance criteria while leaving a gap in the center of the circle. Fortunately, the iterative nature of the formulations and the mesh construct of MADS produce a low probability of such successive points re-occurring as the main algorithm progresses. Further, $d_{crit}$ could be reduced, but this may affect the efficiency of the algorithm. In the event of such an occurrence, n-dimensional visualization techniques provide a means to potentially identify such gaps, after which reference points can be formulated from surrounding solutions to fill the gaps. Such techniques include Hyperspace Diagonal Counting (HSDC), where bins are created using a counting method away from the utopia to the nadir in two groups of objectives [22]. Parallel coordinates is another technique, where objectives are shown on parallel y-axes, and the Hyper Radial Value (HRV) is a third technique that splits objectives into two groups and plots the hyper radius of each group from the utopia [23].

IV. nMADS Algorithm

Adding Algorithm 1 and slightly changing the BiMADS strategy enables an n-objective algorithm, nMADS, shown as Algorithm 2. To use Algorithm 1 in this framework, decision-makers do not necessarily have to determine their exact indifference values a priori. Instead, they may choose a number of bins in each objective and use the utopia and pseudo-nadir to derive values. The only caution in doing this is that the pseudo-nadir may vastly over-estimate the true nadir for certain systems, such as those with many localized fronts and multiple optima for the utopia. In these cases, adjusting the indifference values during the course of the algorithm is recommended, as otherwise convergence to the true optimal front will be slowed. In this respect, some a priori knowledge as to the scale of the objective space may be required. Indifference values set to approximately $(f^b_i - f^g_i)/10$ seem to work very well in practice.
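The bin-based derivation amounts to one line; a sketch, with a hypothetical function name:

```python
import numpy as np

def indifference_from_bins(f_g, f_b, bins=10):
    """Derive indifference values from a desired number of bins per
    objective, using utopia and pseudo-nadir estimates.  bins=10 gives
    the (f_b - f_g)/10 rule of thumb noted in the text."""
    return (np.asarray(f_b, float) - np.asarray(f_g, float)) / bins
```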


Algorithm 2: nMADS

INITIALIZATION: Let $size(g)$ denote the Euclidean distance between the two endpoints of a gap $g$.

1: Apply the OrthoMADS algorithm (with R&S if applicable) from initial iterate $x_0$ to solve $\min_{x \in X} f_i(x)$ for each objective $i = 1, \ldots, N$.
2: Remove dominated points and run Algorithm 1 to identify a set of gaps $G$, given some $c > 0$ and indifference value vector $\vec{\omega}$.
3: Initialize the weights $w(g)$ to $size(g)$ for all gaps $g \in G$. Initialize the weights $v(g)$ to $1$ $\forall g \in G$.

MAIN ITERATIONS: Repeat while $G \ne \emptyset$ and $\max\{w(g)\} > c \cdot \|\vec{\omega}\|$:

4: For each $g \in G$:
5:   If $w(g) < c \cdot \|\vec{\omega}\|$, set $G = G \setminus g$, go to next gap.
6:   Else:
7:     Build reference point $r$ by using maximum objective values from the endpoints of $g$.
8:     Solve a single-objective formulation using the OrthoMADS (-R&S) algorithm from the starting iterate corresponding to one of the two endpoints of $g$.
9:   End If.
10: End For.
11: Remove dominated points and run Algorithm 1 with resulting gaps $G'$.
12: If any center of $g' \in G'$ is within $\|\vec{\omega}\|$ of any center of $g \in G$ (according to Euclidean distance), set $v(g') = 2v(g)$ and $w(g') = size(g')/v(g')$.
13: Else, set $w(g') = size(g')$ and $v(g') = 1$.
14: End If.
15: Set $G = G'$, REPEAT.
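The weight update of steps 12-13, which lets persistent gaps (true discontinuities) stop triggering sub-problems, can be sketched as follows; the gap bookkeeping as (center, size, v) tuples is illustrative, not the authors' data structure:

```python
import numpy as np

def update_gap_weights(new_gaps, old_gaps, omega):
    """nMADS steps 12-13 (sketch).  A re-identified gap, i.e. one whose
    center lies within ||omega|| of an old gap's center, has its
    counter v doubled and its weight w = size / v shrunk, so a gap that
    keeps reappearing eventually falls below the termination threshold.
    Gaps are (center, size, v) tuples; returns (center, w, v) tuples."""
    tol = np.linalg.norm(np.asarray(omega, float))
    out = []
    for c_new, size_new, _ in new_gaps:
        v_new = 1
        for c_old, _, v_old in old_gaps:
            if np.linalg.norm(np.asarray(c_new, float)
                              - np.asarray(c_old, float)) < tol:
                v_new = 2 * v_old
                break
        out.append((c_new, size_new / v_new, v_new))
    return out
```

A gap re-found near an old one keeps its size but halves its effective weight each iteration; a fresh gap starts with full weight.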

Using the same single-objective formulations from the algorithms in Section II, the reference points can still be built using the boundaries of an identified gap, as found by Algorithm 1. In the case of $\hat{R}_r$, $R_r$, and $S^r_a$, the optimum yields a solution in the dominance zone. Therefore, as Algorithm 2 progresses, gaps in the front are successively found and filled in to some acceptable resolution. Starting iterates are chosen from the gap boundaries with the intent of aiding direct search efficiency and because gaps are no longer tied to a "center" solution as with BiMADS. Weighting on the gap size is used to ensure that true discontinuities do not prevent termination of the algorithm.

Variations on the weighting scheme (such as an add-one to the denominator instead of doubling in Line 12), the starting iterate(s) to use for a sub-problem, the norm to use, and the sub-problem solver can also be applied to improve efficiency for a given problem. If using MADS, it is important to note that there is an optional search step where an experimental design can be used to sample on the mesh to try to speed convergence. Due to this and the pattern-based search, a single sub-problem may end up filling more than one gap, or a gap with respect to multiple objectives. Therefore, it may be even more efficient to only solve the sub-problem for the largest gap during an iteration. Additionally, other MADS parameters such as the initial mesh size can affect the convergence rate. A function evaluation (FEval) limit is typically imposed to prevent too many calls to an expensive function while waiting for mesh convergence. Thus, it can be important to ensure that a high enough limit is used to find an accurate $f^g$ or to allow the search to find improvement for a sub-problem. If there are localized fronts in the problem, due to the FEval limit imposed, it may be most efficient to first focus on those gaps closest to the utopia, in hopes of screening out local Pareto solutions earlier. When using R&S, the fact that solutions are sampled repeatedly should also be a consideration in the choice of FEval limits.

To exemplify Algorithm 2, we will first use the following three-objective problem, Viennet3:

Minimize:
\[
F_1(x, y) = 0.5(x^2 + y^2) + \sin(x^2 + y^2) \tag{7}
\]
\[
F_2(x, y) = \frac{(3x - 2y + 4)^2}{8} + \frac{(x - y + 1)^2}{27} + 15
\]
\[
F_3(x, y) = \frac{1}{x^2 + y^2 + 1} - 1.1\, e^{-x^2 - y^2}
\]
subject to $-3 \le x, y \le 3$.
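A direct transcription of Eq. (7), should the reader wish to reproduce the experiments (the function name is ours):

```python
import numpy as np

def viennet3(x, y):
    """Viennet3 test problem (Eq. 7), with -3 <= x, y <= 3."""
    r2 = x**2 + y**2
    f1 = 0.5 * r2 + np.sin(r2)
    f2 = (3*x - 2*y + 4)**2 / 8 + (x - y + 1)**2 / 27 + 15
    f3 = 1.0 / (r2 + 1) - 1.1 * np.exp(-r2)
    return f1, f2, f3
```

At the origin, for example, $F_1 = 0$, $F_2 = 2 + 1/27 + 15$, and $F_3 = 1 - 1.1 = -0.1$.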

An initial approximation from searching for $f^g$ is as shown in Figure 4(a). Algorithm 1 identifies gaps from the current approximation using $c = 0.5$ and indifference values chosen as $1/10$ of the difference between estimated utopia and pseudo-nadir components.

Fig. 4 Three-Objective Example. (a) (b) (c) (d)

In this case, one of the gaps identified was as shown, corresponding to Objective 3. Figure 4(b) depicts those points found using a sub-problem. Walston used SMOMADS to solve this problem, requiring 4,096 test points, or approximately 2,048,000 function evaluations, to find a relatively complete front [1]. Instead, using Algorithm 2 with $S^r_a$, a 10-sample Latin Hypercube in the MADS search step, and a 500-FEval limit for all sub-problems, the front shown in Figure 4(c) was found. This required only 3,352 FEvals and found 1,163 unique Pareto solutions. Adding one percent of the pseudo-nadir objective values to each objective as noise, and using SSM as the R&S technique per Walston's work [1], the front shown in Figure 4(d) was found. Although the effect of the noise and having to sample points repeatedly is evident, this is still relatively representative. This required 11,622 FEvals, and found 110 solutions.

V. Three- to Eight-Objective Problems using nMADS

In the remaining examples shown, default settings from Nomadm [7] were used, unless otherwise noted. For Algorithm 2, when the responses are treated as stochastic, noise is set to less than or equal to one percent of $f^b_i$, and $f^g$ and $f^b$ are estimated using two replications of the sub-problems. A 10-sample Latin Hypercube is used in the MADS search step, and the zero vector is used for $x_0$. No cache of function evaluations was maintained between sub-problems. These settings are meant to create a general application of the algorithm and to showcase its completeness, and are not necessarily good or optimal for a given problem. The number of function evaluations and unique solutions are used as a comparison. Although the framework still works in only two objectives, it would be very similar to BiMADS, and so results are only given for more than two objectives.

A. 3 Objectives

Having seen Viennet3, we will next look at a problem with many local fronts, and then one with disconnected Pareto regions. Consider the test problem DTLZ3 [24]:

Minimize:
\[
F_1(X) = (1 + g(X)) \cos(x_1 \pi/2) \cos(x_2 \pi/2) \tag{8}
\]
\[
F_2(X) = (1 + g(X)) \cos(x_1 \pi/2) \sin(x_2 \pi/2)
\]
\[
F_3(X) = (1 + g(X)) \sin(x_1 \pi/2)
\]
subject to $0 \le x_i \le 1$, where
\[
g(X) = 100 \left[ |X| - 2 + \sum_{i=3}^{|X|} \left( (x_i - 0.5)^2 - \cos\left( 20\pi (x_i - 0.5) \right) \right) \right].
\]
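A transcription of Eq. (8) for reference (our naming); note that $g = 0$ when $x_i = 0.5$ for $i \ge 3$, which places the global front on the unit sphere:

```python
import numpy as np

def dtlz3(X):
    """DTLZ3 (Eq. 8); X is a vector with entries in [0, 1]."""
    X = np.asarray(X, float)
    tail = X[2:] - 0.5
    g = 100 * (len(X) - 2 + np.sum(tail**2 - np.cos(20 * np.pi * tail)))
    f1 = (1 + g) * np.cos(X[0] * np.pi / 2) * np.cos(X[1] * np.pi / 2)
    f2 = (1 + g) * np.cos(X[0] * np.pi / 2) * np.sin(X[1] * np.pi / 2)
    f3 = (1 + g) * np.sin(X[0] * np.pi / 2)
    return f1, f2, f3
```

For instance, $X = (0.5, 0.5, 0.5, 0.5)$ gives $g = 0$ and an objective vector of unit Euclidean norm.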

This problem has $3^{|X|-2} - 1$ local Pareto-optimal fronts, and the pseudo-nadir can highly over-estimate the nadir. The global optimal front lies on the unit sphere. Figure 5(a) shows the front found for the four-variable problem, using Algorithm 2 with $R_r$, an initial mesh size of 0.1, a FEval limit of 500 for all sub-problems, and $\omega_i = 0.1$. Just over 1,890 solutions were found in 6,500 FEvals. When the initial mesh size was changed to 1, just over 2,200 solutions were found in 13,200 FEvals. Figure 5(b) depicts the front found again using an initial mesh size of 1, but using the solution found closest to the utopia as a starting iterate. This found almost 2,150 solutions in 12,900 FEvals. It is clear that various parameter settings may impact the efficiency of the algorithm on more difficult problems, but it is also evident that the algorithm consistently finds a relatively even and complete front.

Fig. 5 DTLZ3 Deterministic. (a) (b) (c) (d)

Next, the number of variables was changed to 12 and the FEval limit to 1,000 for a sub-problem. Figure 5(c) is included to further clarify the notion of the algorithm. After the first iteration of gap-filling based on the search for the utopia, the approximation was as shown. The magenta points depict the means of the gap endpoints for each identified gap. Continuing the algorithm, the front in Figure 5(d) was obtained, consisting of 1,759 solutions and requiring a total of 24,067 FEvals. For a more complete front, with a trade-off in efficiency, a lower weighting scheme could be used. Here, there were $3^{10} - 1$ local fronts. Performance on this problem can be variable in that, if the FEval limit is not chosen well, points found on these local fronts create gaps that may persist and thus unnecessarily increase the total FEvals used until a point is found that dominates them. In this sense, an additional filter could be useful to screen for such points and either remove them or increase their associated sub-problem's FEval limit.

Next, we consider the stochastic response case for the four-variable problem, again using 500 as a FEval limit.

Fig. 6 DTLZ3 Stochastic. (a) (b)

Figure 6(a) shows an approximation after 75,000 FEvals. We see that the noise in the objectives has allowed bad solutions to escape the deterministic dominance check. However, if we ignore these points and scale to the desired front (as seen in Figure 6(b)), we have in fact obtained a good approximation. In fact, this front had nearly 1,350 solutions.

Now consider DTLZ7 [24]:

Minimize:
\[
F_1(X) = x_1 \tag{9}
\]
\[
F_2(X) = x_2
\]
\[
F_3(X) = (1 + g(X))\, h(F, g)
\]
subject to $0 \le x_i \le 1$, where
\[
g(X) = 1 + \frac{9}{|X| - 2} \sum_{i=3}^{|X|} x_i, \qquad
h(F, g) = 3 - \sum_{i=1}^{2} \frac{F_i}{1 + g} \left( 1 + \sin(3\pi F_i) \right).
\]
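A transcription of Eq. (9) for reference (our naming):

```python
import numpy as np

def dtlz7(X):
    """DTLZ7 (Eq. 9); X in [0, 1]^n, with disconnected Pareto regions."""
    X = np.asarray(X, float)
    f1, f2 = X[0], X[1]
    g = 1 + 9.0 / (len(X) - 2) * np.sum(X[2:])
    h = 3 - sum(fi / (1 + g) * (1 + np.sin(3 * np.pi * fi))
                for fi in (f1, f2))
    f3 = (1 + g) * h
    return f1, f2, f3
```

At $X = \vec{0}$, for example, $g = 1$, $h = 3$, and $F_3 = 6$.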

Again using $R_r$, FEval limits of 1,500, $\vec{\omega} = [0.1, 0.1, 0.2]$, and 22 variables, the front shown in Figure 7(a) was found. This took 20,950 FEvals with 1,369 solutions. However, as Pareto-optimal solutions have $x_{3,\ldots,20} = 0$, a starting iterate with $X = \vec{0.5}$ is more difficult. Adjusting the initial mesh size to 0.1, so as to be smarter relative to the variable range, Figure 7(b) shows the front found using this new starting iterate. This found 699 solutions in just over 67,000 FEvals.

Fig. 7 DTLZ7. (a) (b) (c)

Now using a stochastic response with the $X = \vec{0.5}$ starting iterate, the front shown in Figure 7(c) was found. This found 470 solutions but required 527,812 FEvals. Part of this was due to the trade-off between R&S and being able to move towards a better solution during a sub-problem. Additionally, noise was added to $F_3$ in addition to that already present in $F_1$ and $F_2$, perhaps making the problem even more difficult. It is important to note here that these results are still fairly efficient given the number of decision variables and the nature of direct search.

B. More Than 3 Objectives

Using the HRV representation, which groups normalized objectives into two groups and plots their hyper-radial values, we can now also showcase the benefit of Algorithm 2 using problems in more than three objectives. We use a few problems that have deterministic published solutions using HRV. The coloring of the solutions depicted follows HRV schemes that indicate whether all objective function values for a solution are within some value of the utopia [23], and is not important here. We will use R̂r, as we have yet to show results using that formulation. All following results used stochastic responses, default NOMADm parameters, both gap endpoints as starting iterates (replicated sub-problems), a "plus-one" weighting strategy for the gap denominators instead of doubling, and indifference values derived from the estimate of (f^b_i − f^g_i)/10.
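To make these two conventions concrete, a sketch of both the HRV plotting transform and the indifference-value rule follows (our own simplification: equal weights within each HRV group, whereas the full HRV scheme allows preference weights):

```python
import math

def normalize(F, utopia, nadir):
    """Scale each objective to [0, 1] using utopia/nadir estimates."""
    return [(f - u) / (b - u) for f, u, b in zip(F, utopia, nadir)]

def hrv_coords(F, utopia, nadir, split):
    """2-D HRV plotting coordinates: split the normalized objectives into two
    groups and take each group's (equally weighted) hyper-radial value."""
    Ft = normalize(F, utopia, nadir)
    g1, g2 = Ft[:split], Ft[split:]
    hr = lambda g: math.sqrt(sum(v * v for v in g) / len(g))
    return hr(g1), hr(g2)

def indifference_values(utopia, nadir):
    """Indifference values as used here: one tenth of each objective's
    estimated range, (f^b_i - f^g_i) / 10."""
    return [(b - u) / 10.0 for u, b in zip(utopia, nadir)]
```

In this representation the utopia maps to the origin, so solutions plotted near (0, 0) are close to ideal in every objective.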


Fig. 8 4-Objective HRV Pareto Representation.

First, consider the following four-objective problem:

Minimize:
    F1(X) = 7.49 − 0.44x1 + 1.16x2 − 0.61x3
    F2(X) = 4.13 − 0.92x1 + 0.16x2 − 0.43x3                          (10)
    F3(X) = −21.9 + 1.94x1 + 0.3x2 + 1.04x3
    F4(X) = 11.33 − x1 − x2 − x3
subject to
    F1(X) − 7.49 ≤ −3.1725
    F2(X) − 4.13 ≤ −8.042
    1.94x1 − 0.3x2 − 1.04x3 ≤ 18.4988
    6.3969 ≤ x1 ≤ 7.0901
    0.6931 ≤ x2 ≤ 2.8904
    3.912 ≤ x3 ≤ 4.6052

Figure 8(a) shows a result, where [7, 2, 4.5] was used as the starting iterate. A FEval limit of 500 was used to find the utopia, and 150 was used for the gap sub-problems. Here, 1,414 solutions were found in 5,176 function evaluations. The published solution [25] is shown in Figure 8(b). Although the proposed framework is somewhat robust to parameter settings in finding a complete front, as has been shown, parameter settings can have an impact on the efficiency of the approximation. Here, high FEval limits were not needed, and so sub-problems could be replicated and gaps weighted less


Fig. 9 6-Objective Solutions.

so as to help mitigate the effect of noise.

Consider the following six-objective problem:

Minimize:
    F1(x1, x2) = x1^2 + (x2 − 1)^2
    F2(x1, x2) = x1^2 + (x2 + 1)^2 + 1                               (11)
    F3(x1, x2) = (x1 − 1)^2 + x2^2 + 2
    F4(x1, x2) = (x1 − 2)^2 / 2 + (x2 + 1)^2 / 13 + 3
    F5(x1, x2) = (x1 + x2 − 3)^2 / 36 + (−x1 + x2 + 2)^2 / 8 − 17
    F6(x1, x2) = (x1 + 2x2 − 1)^2 / 175 + (−x1 + 2x2)^2 / 17 − 13
subject to
    −2 ≤ x1, x2 ≤ 2
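Equation (11) transcribes directly into code (the function name is ours), which is convenient for spot-checking any reported front:

```python
def six_objective(x1, x2):
    """Direct transcription of the six objectives in Eq. (11);
    feasible region is -2 <= x1, x2 <= 2."""
    F1 = x1**2 + (x2 - 1)**2
    F2 = x1**2 + (x2 + 1)**2 + 1
    F3 = (x1 - 1)**2 + x2**2 + 2
    F4 = (x1 - 2)**2 / 2 + (x2 + 1)**2 / 13 + 3
    F5 = (x1 + x2 - 3)**2 / 36 + (-x1 + x2 + 2)**2 / 8 - 17
    F6 = (x1 + 2*x2 - 1)**2 / 175 + (-x1 + 2*x2)**2 / 17 - 13
    return [F1, F2, F3, F4, F5, F6]
```

For example, the point (0, 1) is the unconstrained minimizer of F1 but not of the other objectives, which is what generates the trade-off surface.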

Figure 9(a) depicts an approximation using nMADS, a FEval limit of 500 for the utopia, and a limit of 150 on the sub-problems. To investigate variability of efficiency, 20 runs were conducted of the double-weighting scheme against adding one to the weight denominator each iteration. An average of 50 more solutions were found using the latter "plus-one" scheme. However, an average of 400 more function evaluations were required. Table 1 shows metrics for the 20 double-weighted scheme runs. Figure 9(b) depicts an approximation using the "plus-one" weighting scheme and the same 150 function evaluation limit, with an additional iteration of gaps filled once the algorithm had terminated. In total, over 17,000 function evaluations were used to find 3,333 solutions. This demonstrates that in latter stages of Algorithm 2, or with non-optimal points that have yet to be dominated, failing to adjust parameter values based on observation may cause unnecessary expense.
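The practical difference between the two gap-weighting schemes can be illustrated as follows (a simplification of Algorithm 2's weighting: we assume a gap's weight is the reciprocal of its denominator and track how it decays when the gap persists):

```python
def weight_trajectory(scheme, iterations, start=1):
    """Weight placed on a persistent gap at each iteration, assuming the
    weight is 1/denominator. Doubling downweights a stubborn gap
    geometrically; "plus-one" downweights it only linearly, so it keeps
    receiving effort (more solutions found, but more FEvals spent)."""
    if scheme not in ("double", "plus-one"):
        raise ValueError("scheme must be 'double' or 'plus-one'")
    d = start
    weights = []
    for _ in range(iterations):
        weights.append(1.0 / d)
        d = d * 2 if scheme == "double" else d + 1
    return weights
```

After four iterations a persistent gap retains a weight of 0.25 under "plus-one" but only 0.125 under doubling, matching the observed trade of roughly 50 extra solutions for roughly 400 extra evaluations.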

Table 1: 6-Objective Problem Metrics

             FEvals    Solutions
    Mean       8897         1798
    St Dev      638          127
    Max        9894         2013
    Min        7819         1553

Now we consider the following eight-objective problem to show the ability to find a relatively complete and even front in a large number of objectives:

Minimize:
    F1(x, y) = (x − 2)^2 / 2 + (y + 1)^2 / 13 + 3
    F2(x, y) = (x + y − 3)^2 / 175 + (2y − x)^2 / 17 − 13            (12)
    F3(x, y) = (3x − 2y + 4)^2 / 8 + (x − y + 1)^2 / 27 + 15
    F4(x, y) = (3x + y + 9)^2 / 34 + (x + 1)^2 / 15 + 29
    F5(x, y) = (4x − y − 4)^2 / 22 − (y − 1)^2 / 5 − 17
    F6(x, y) = (y + 14)^2 / 8 + (x + y)^2 / 10 + 64
    F7(x, y) = (17 − x − y)^3 / 995 + (8y − 5x) / 65
    F8(x, y) = (7 + 2x + 5y) / 5 + (y − 3x)^3 / 235
subject to
    4x + y − 4 ≤ 0
    −1 − x ≤ 0
    x − y − 2 ≤ 0
    −4 ≤ x, y ≤ 4

Figure 10(a) shows the nMADS result in comparison to the published solution, Figure 10(b), again using HRV to plot the data. The published solution was found via a genetic algorithm and had 625 solutions [23]. nMADS (Algorithm 2) used 6,992 function evaluations and found 2,350 solutions


Fig. 10 8-Objective Solutions.

in this instance, using the "plus-one" weighting scheme and a function evaluation limit of 150 for each sub-problem. Over 20 replications of nMADS without any additional iterations, an average of 1,967 solutions were found using 5,773 function evaluations. The respective standard deviations were 459 for function evaluations and 154 for the number of solutions.

VI. Conclusion

In the stochastic case where R&S is required, the number of function evaluations may still be too high for some real-world problems due to the expense of a single function evaluation. Further, as the number of decision variables or local Pareto fronts increases, the number of evaluations needed for a sub-problem is likely to increase. There are several options to increase efficiency that warrant more rigorous investigation; these include adaptive indifference regions, adaptive mesh parameters, lowering the number of evaluations required by R&S, choosing a best formulation for a problem, using different norms in the algorithms, and the use of surrogates. The algorithm could also be made more efficient by development of a filter for clearly non-optimal solutions that persist until the correct portion of the global front is found. Otherwise, such points, which occur on more difficult or noisy problems, may necessitate a larger FEval limit or repeated use of a sub-problem, and may take several iterations to progressively work towards the global front.

Using the new iterate strategy and gap location algorithm enables the driving concepts behind BiMADS to work in more than two objectives. Further, it seems that, if used smartly, this framework allows for a robust, yet efficient, approximation relative to any multi-objective problem. The results on problems tested with up to eight objectives are promising and have been similar on a variety of other problems not shown here. nMADS's efficiency relative to other algorithms wanes as the number of decision variables increases due to direct search. However, nMADS avoids the limitations of many other multi-objective optimization algorithms and enables the solution of problems with large numbers of objectives. Unfortunately, parameter settings are not always trivial to create a most efficient instance for a problem, and in the stochastic case, a large number of function evaluations may be unavoidable. These aspects require further study.

VII. Acknowledgements

I must give a special thanks to Dr. James Chrissis for his continued support, assistance, and

patience.

VIII. References

[1] Walston, J., "Search Techniques for Multi-Objective Optimization of Mixed-Variable Systems Having Stochastic Responses," Ph.D. Dissertation, Department of Engineering Sciences, Air Force Institute of Technology, WPAFB, OH, 2007.

[2] Sriver, T., "Pattern Search Ranking and Selection Algorithms for Mixed-Variable Optimization of Stochastic Systems," Ph.D. Dissertation, Department of Engineering Sciences, Air Force Institute of Technology, WPAFB, OH, 2004.

[3] Pichitlamken, J., and Nelson, B., "Selection-of-the-best procedures for optimization via simulation," Proceedings of the 2001 Winter Simulation Conference, Arlington, 2001.

[4] Audet, C., Savard, G., and Zghal, W., "Multiobjective Optimization Through a Series of Single-Objective Formulations," Les Cahiers du GERAD, G-2007-05, Jan. 2007.

[5] Audet, C., Savard, G., and Zghal, W., "A mesh adaptive direct search algorithm for multiobjective optimization," European Journal of Operational Research, Vol. 204, pp. 545-556, 2010.

[6] Audet, C., and Dennis, J., "Mesh adaptive direct search algorithms for constrained optimization," SIAM Journal on Optimization, Vol. 17, No. 2, pp. 188-217, 2006.

[7] Abramson, M.A., "NOMADm version 4.6," Free Software Foundation, Inc., Boston, MA, 2009.

[8] Abramson, M., Audet, C., Dennis, J., and Le Digabel, S., "OrthoMADS: a deterministic MADS instance with orthogonal directions," SIAM Journal on Optimization, Vol. 20, No. 2, pp. 948-966, 2008.

[9] Messac, A., and Mattson, C., "Normal constraint method with guarantee of even representation of complete Pareto frontier," 45th AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference, Palm Springs, 2004.

[10] Shan, S., and Wang, G., "An efficient Pareto set identification approach for multiobjective optimization on black-box functions," Journal of Mechanical Design, Vol. 5, No. 5, pp. 866-874, 2004.

[11] Kim, I., and de Weck, O., "Adaptive weighted sum method for multiobjective optimization: a new method for Pareto front generation," Structural and Multidisciplinary Optimization, Vol. 31, pp. 105-116, 2006.

[12] Muller, A., and Messac, A., "Higher metamodel accuracy using computationally efficient pseudo response surfaces," 1st AIAA Multidisciplinary Design Optimization Specialist Conference, Austin, 2005.

[13] Ryu, J., Kim, S., and Wan, H., "Pareto front approximation with adaptive weighted sum method in multiobjective simulation optimization," Proceedings of the 2009 Winter Simulation Conference, Austin, 2009.

[14] Jones, D., Schonlau, M., and Welch, W., "Efficient global optimization of expensive black-box functions," Journal of Global Optimization, Vol. 13, pp. 455-492, 1998.

[15] Couckuyt, I., et al., "Automatic surrogate model type selection during the optimization of expensive black-box problems," Proceedings of the 2011 Winter Simulation Conference, Phoenix, 2011.

[16] Das, I., and Dennis, J., "Normal-boundary intersection: a new method for generating the Pareto surface in nonlinear multicriteria optimization problems," SIAM Journal on Optimization, Vol. 8, No. 3, pp. 631-657, 1998.

[17] Giunta, A., Wojtkiewicz, S., Jr., and Eldred, M., "Overview of modern design of experiments methods for computational simulations," AIAA Paper 2003-0649, 2003.

[18] Ma, C., and Fang, K., "A new approach to construction of nearly uniform designs," International Journal of Quality Technology, Vol. 20, pp. 115-126, 2004.

[19] Paciencia, T., "Multi-Objective Optimization of Mixed Variable, Stochastic Systems Using Single-Objective Formulations," Master's Thesis, Department of Engineering Sciences, Air Force Institute of Technology, WPAFB, OH, 2008.

[20] Wu, J., and Azarm, S., "Metrics for quality assessment of a multiobjective design optimization solution set," Transactions of the ASME, Vol. 123, No. 1, pp. 18-25, 2001.

[21] Farhang-Mehr, A., and Azarm, S., "An information-theoretic entropy metric for assessing multi-objective optimization solution set quality," Journal of Mechanical Design, Vol. 125, No. 4, pp. 655-663, 2003.

[22] Agarawal, G., Parashar, S., and Bloebaum, C.L., "Intuitive Visualization of Hyperspace Pareto Frontier for Robustness in Multi-Attribute Decision-Making," Proceedings of the 11th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, Portsmouth, 2006.

[23] Chiu, P., and Bloebaum, C.L., "Hyper-Radial Visualization for Decision-making in Multi-objective Optimization," 46th AIAA Aerospace Sciences Meeting and Exhibit, Reno, 2008.

[24] Deb, K., "Scalable Test Problems for Evolutionary Multi-Objective Optimization," TIK Technical Report No. 112, ETH Zurich, July 17, 2001.

[25] Chiu, P., and Bloebaum, C.L., "Hyper-Radial Visualization with Weighted Preferences for Multi-objective Decision Making," 12th AIAA/ISSMO Multidisciplinary Analysis and Optimization Conference, Victoria, 2008.