Surrogate-Assisted Partial Order-Based
Evolutionary Optimisation
Vanessa Volz¹, Günter Rudolph¹, and Boris Naujoks²
¹ TU Dortmund University
{vanessa.volz, guenter.rudolph}
² TH Köln - University of Applied Sciences
Abstract. In this paper, we propose a novel approach (SAPEO) to
support the survival selection process in evolutionary multi-objective al-
gorithms with surrogate models. The approach dynamically chooses in-
dividuals to evaluate exactly based on the model uncertainty and the dis-
tinctness of the population. We introduce multiple SAPEO variants that
differ in terms of the uncertainty they allow for survival selection and
evaluate their anytime performance on the BBOB bi-objective bench-
mark. In this paper, we use a Kriging model in conjunction with an
SMS-EMOA for SAPEO. We compare the obtained results with the
performance of the regular SMS-EMOA, as well as another surrogate-
assisted approach. The results open up general questions about the ap-
plicability and required conditions for surrogate-assisted evolutionary
multi-objective algorithms to be tackled in the future.
Keywords: partial order, multi-objective, surrogates, evolutionary al-
gorithms, bbob
1 Introduction
Surrogate model-assisted evolutionary multi-objective algorithms (SA-EMOAs)
are a group of fairly recent but popular approaches¹ to solve multi-objective
problems with expensive fitness functions. Using surrogate model predictions
of the function values instead of, or to complement, exact evaluations within an
evolutionary algorithm (EA) can save computational time and, in some cases,
make the problem tractable in the first place.
Many EMOAs only consider objective values for the purpose of sorting and
then selecting the best individuals in a population. Assuming that individuals
can confidently be distinguished based on surrogate model predictions, knowing
the individuals’ exact objective values is not necessary. Under this assumption,
the algorithm and its evolutionary path would not be affected at all by trusting
the predicted sorting, and the computational budget could be reduced.
In this paper, we present a novel approach to integrate surrogate models
¹ Workshop on the topic in 2016:
and evolutionary (multi-objective) algorithms (dubbed SAPEO for Surrogate-
Assisted Partial Order-Based Evolutionary Optimisation²) that seeks to reduce
the number of function evaluations while simultaneously controlling the prob-
ability of detrimental effects on the solution quality. The idea is to choose the
individuals for exact evaluation dynamically based on the model uncertainty and
the distinctness of the population. Preliminary experiments on single-objective
problems showed promising results, so in this paper, we investigate the approach
in the currently sought-after multi-objective context. We also present different
versions that allow differing levels and types of uncertainty for the survival
selection process, which, in turn, can potentially affect the solution quality.
In the following, we describe our extensive analysis of the anytime perfor-
mance of SAPEO using the BBOB-BIOBJ benchmark (refer to section 2.2),
focusing on use cases with low budgets to simulate applications with expensive
functions. Our SAPEO implementation³ uses a Kriging surrogate model [14] in
conjunction with the SMS-EMOA [2]. We further compare the algorithm and
its variants to the underlying SMS-EMOA and an SA-EMOA approach called
pre-selection [4] (SA-SMS in the following).
We specifically investigate if and under which conditions SAPEO outper-
forms the SMS-EMOA and SA-SMS in terms of the hypervolume indicator that
all algorithms use to evaluate populations. Surprisingly, none of the surrogate-
assisted algorithms can convincingly beat out the baseline SMS-EMOA, even
for small function budgets. This result opens up questions about SA-EMOAs in
general and about the necessary quality of the integrated surrogate models.
A potential explanation for this performance is the increased uncertainty of
the surrogate model predictions when compared to the single-objective exper-
iments. Thus, we analyse the effects of prediction uncertainty on the overall
performance of the SA-EMOAs. In the future, the resulting insights could be-
come important when (1) deciding whether using a surrogate model is beneficial
on a given problem at all and (2) when choosing the sample size and further pa-
rameters for the model. This is especially crucial for multi- and many-objective
problems, where learning an accurate surrogate model becomes increasingly ex-
pensive and thus renders analysing the trade-off between surrogate model com-
putation and function evaluations critical. Additionally, the stated questions and
insights are also relevant for noisy optimisation problems where uncertainty can
be reduced by repeated evaluations (although not eliminated).
In the following, we present related work in section 2 and introduce the
proposed SAPEO algorithm in section 3. The description and visualisation of
the benchmarking results are found in section 4. Section 5 concludes the paper
with an analysis of the results and lists open research problems.
² Acknowledgements: The SAPEO concept was developed during the SAMCO
Workshop in March 2016 at the Lorentz Center (Leiden, NL). This work is part of
a project that has received funding from the European Union's Horizon 2020
research and innovation programme under grant agreement No 692286.
³ Code and visualisations available at:
2 Background and Related work
2.1 Surrogate-Assisted Evolutionary Multi-Objective Optimisation
Let X_1, . . . , X_λ ∈ ℝ^n be a population and f : ℝ^n → ℝ^d the corresponding
fitness function. General concepts of multi-objective optimisation will not be
discussed here (refer to e.g. [16]). We will refer to Pareto dominance as ≺, and
to its weak and strong versions as ⪯ and ≺≺, respectively. We use the same
notation to compare vectors in objective space ℝ^d, i.e. let a, b ∈ ℝ^d, then
a ≺ b :⇔ ∀k ∈ {1 . . . d} : a_k ≤ b_k ∧ ∃k ∈ {1 . . . d} : a_k < b_k.
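The dominance relation just defined translates directly into code. The following is a minimal Python sketch for minimisation; it is our own illustration, not part of the paper's implementation:

```python
# Pareto dominance for minimisation, as defined above: a dominates b iff
# a is no worse in every objective and strictly better in at least one.

def dominates(a, b):
    """Return True iff objective vector a Pareto-dominates b (minimisation)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

print(dominates((1.0, 2.0), (2.0, 3.0)))  # True: better in both objectives
print(dominates((1.0, 3.0), (2.0, 2.0)))  # False: incomparable
print(dominates((1.0, 2.0), (1.0, 2.0)))  # False: equal, not strictly better
```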
Surrogate-assisted evolutionary multi-objective optimisation is surveyed in
[7,8], where several approaches for the integration of surrogates and EMOAs are
described. According to the surveys, the selection approaches can generally be
divided into individual-based (e.g. SAPEO), generation-based, and population-
based strategies. Additionally, there is pre-selection (e.g. SA-SMS [4]), which is
similar to individual-based strategies but does not retain any individuals with
uncertain fitness values.
However, none of the cited surveys, nor [9], features any algorithm that
chooses to propagate uncertainty instead of assuming a distribution and using
aggregated metrics such as the expected value. In [10], uncertainty propagation
is implemented for noisy optimisation problems using a partial order based on
confidence intervals (or hypercubes in higher dimensions) as examined in [13].
The only work we are aware of that transfers this approach to SA-EMOAs is
GP-DEMO [11] and the authors' previous publications, which use a differential
evolution algorithm.
Apart from the differences owed to the underlying optimisation algorithms
(e.g. crowding distance vs. hypervolume), GP-DEMO is very similar to the
SAPEO variant that allows the least uncertainty (SAPEO-uf-ho, cf. 4.1). An
important difference, however, is SAPEO's dynamic adaptation of the allowed
uncertainty throughout the runtime of the algorithm and its sequential application
of different partial orders. In addition to the partial orders inspired by [13] used
in both [11] and SAPEO, we propose another order that interprets the confidence
interval bounds as objectives and thus deals differently with overlapping intervals.
A further difference is the choice of samples for the surrogate models: GP-DEMO
uses the Pareto front, whereas SAPEO always uses a local model relative to the
solution in question.
2.2 Benchmarking with BBOB
BBOB-BIOBJ is a bi-objective Black-Box Optimisation Benchmarking test suite
[15]. It consists of 55 bi-objective functions that are a combination of 10 of
the 24 single-objective functions in the BBOB test suite established in 2009
[6]. In order to measure general algorithm performance across function types,
single-objective functions were chosen such that the resulting benchmark would
be diverse in terms of separability, conditioning, modality and global structure
[6]. Based on these properties, the single-objective functions are divided into 5
[Figure 1: triangular matrix (not reproduced) pairing the 10 single-objective
component functions f01, f02, f06, f08, f13, f14, f15, f17, f20 and f21 -
among them Gallagher 101, Schwefel x*sin(x), Schaffer F7 c10, Sum of different
powers, Sharp ridge and Attractive sector - into the 55 bi-objective functions,
with groups such as "weakly structured" colour-coded.]
Fig. 1: The 55 BBOB-BIOBJ functions are combinations of 10 single-objective
functions (on the top and right). The groups the single-objective and the result-
ing bi-objective functions belong to are colour-coded according to the legend.
function groups, from which 2 functions are chosen each. The resulting problems
and corresponding properties are visualised in figure 1.
In an effort to measure performance on the different function types more
accurately, each of the functions in the bi-objective test suite has 10 instances
that differ in terms of some properties, e.g. the location of optima. As a result,
the scale of the optima and the achievable absolute improvement of the objective
values also vary significantly across instances (and thus also between objectives). The
robustness of an algorithm’s performance on a function group can be evaluated
with a higher confidence by testing on multiple members of that group.
All of the functions in the test suites are defined for search spaces of multiple
dimensions, of which we consider dimensions 2, 3, 5, 10 and 20 in order
to evaluate a wide range of problem sizes. The search space of each
function is limited to [−100, 100] ⊂ ℝ per dimension in BBOB-BIOBJ.
The performance of an algorithm on the benchmarking suite is measured
using a quality indicator expressing both the size of the obtained Pareto set
and the proximity to a reference front. Since the true Pareto front is not known
for the functions in the test suite, an approximation is obtained by combining
all known solutions from popular algorithms. The ideal and nadir points are
known, however, and used to normalise the quality indicator to enable compar-
isons across functions [3]. The metric reported as a performance measure for the
algorithm is called precision. It is the difference between the quality indicator
value I_ref of the reference set and the indicator value of the obtained set. 58
target precisions are fixed and the number of function evaluations needed to achieve them
is reported during a benchmark run. This way, the COCO platform enables an
anytime comparison of algorithms, i.e. an evaluation of algorithm performance
for each target precision and number of function evaluations [3].
3 Surrogate-Assisted Partial Order-Based Evolutionary Optimisation
3.1 Formal description
Let f̃(X_i) ∈ ℝ^d be the predicted fitness for individual X_i as computed by a local
surrogate model with uncertainty σ̃_i, modelled by

f̃_k(X_i) = f_k(X_i) + e_i,  e_i ∼ N(0, σ̃_i),  k ∈ {1 . . . d}.  (1)

Assuming (Assumption A1) that the assumptions made by Kriging models [14]
hold and σ̃_i was estimated correctly, it follows that

P(f_k(X_i) ∈ [f̃_k(X_i) − u_i, f̃_k(X_i) + u_i]) = 1 − α  with  u_i = σ̃_i z_{1−α/2},

since P(|e_i| ≤ u_i) = 1 − α. Here, z denotes the quantile function of the standard
normal distribution. In case the objective values are stochastically independent,
which is true for the BBOB-BIOBJ benchmark, we can therefore conclude that
f(X_i) lies within the hypercube (or bounding box) bounded by the described
confidence interval in each dimension with probability (1 − α)^d.
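The bound u_i and the hypercube coverage probability can be sketched with Python's standard library; NormalDist supplies the quantile function z. This is a minimal sketch under the Gaussian assumption A1, and the function names are ours:

```python
from statistics import NormalDist

def conf_halfwidth(sigma, alpha):
    """Half-width u_i of the (1 - alpha) confidence interval for a Gaussian
    prediction error with standard deviation sigma: u_i = sigma * z_{1-alpha/2}."""
    return sigma * NormalDist().inv_cdf(1.0 - alpha / 2.0)

def hypercube_coverage(alpha, d):
    """Probability that the true objective vector lies inside the predicted
    hypercube, assuming independent objectives: (1 - alpha)^d."""
    return (1.0 - alpha) ** d

u = conf_halfwidth(sigma=0.5, alpha=0.05)  # ~0.5 * 1.96
print(round(u, 3))                         # → 0.98
print(hypercube_coverage(0.05, d=2))       # → 0.9025
```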
Assuming the function values lie within the defined hypercubes (Assumption
A2), we can distinguish individuals confidently just based on the predicted hy-
percubes. Of course, in case of large uncertainties σ̃_i, a distinction can rarely be
meaningful. To combat this problem, SAPEO introduces a threshold ε_g for the
uncertainty that is adapted in each iteration g of the EMOA and decreased over
the runtime of the algorithm depending on the distinctness of the population
(more details in section 3.2). Individuals X_i with an uncertainty higher than the
threshold (u_i > ε_g) are evaluated exactly in generation g.
To distinguish between individuals, we propose binary relations incorporating
the information on confidence bounds to varying degrees. All of these relations
induce a strict partial order (irreflexivity, transitivity) on a population, includ-
ing and akin to the Pareto dominance commonly used in EMOAs for the first
part of the selection process. We analyse the proposed relations in terms of the
probability and magnitude of a sorting error e_o, which per our definition are the
pairwise differences between the orders induced by Pareto dominance and the
proposed relations, respectively. For a relation r and individuals X_i, X_j, we
define the probability of a sorting error and its magnitude as

P(e^{i,j}_{o,r}) = P(X_i ⊀ X_j | X_i ≺_r X_j),
e^{i,j}_{o,r} = |f(X_i) − f(X_j)| if (X_i ≺_r X_j) ∧ (X_i ⊀ X_j), and 0 else.

One or more sorting errors can, but do not have to, lead to selection errors
e_s, where the individuals selected differ from the baseline comparison. This type
of error will not be analysed in this paper, but it is bounded by e_o.
We define:
≺_f: Pareto dominance on function values. This relation is the standard in
EMOAs and, since only f is considered, it is obvious that P(e^{i,j}_{o,f}) = 0.

X_i ≺_f X_j :⇔ f(X_i) ≺ f(X_j)
≺_u: Confidence interval dominance (cf. [11,13]). Assuming A2, if

X_i ≺_u X_j :⇔ ∀k ∈ {1 . . . d} : f̃_k(X_i) + u_i < f̃_k(X_j) − u_j

holds, it is guaranteed that X_i ≺ X_j. Assuming stochastic independence of
the errors on predicted uncertainty, we can compute an upper bound for the
probability of sorting errors per dimension:

P(e^{i,j}_{o,u}(k)) = P(f_k(X_i) ≥ f_k(X_j) | X_i ≺_u X_j)
  ≤ P(f_k(X_i) > f̃_k(X_i) + u_i ∨ f_k(X_j) < f̃_k(X_j) − u_j) ≤ α

Only if the confidence hypercubes of two individuals intersect is the probability
of them being incomparable greater than 0. Because of the way X_i ≺_u X_j
is defined, this is only possible if a sorting error is made in every dimension. It
follows that P(e^{i,j}_{o,u}) ≤ α^d assuming A1, making sorting errors controllable.
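The relation ≺_u amounts to comparing interval bounds dimension by dimension. A small sketch under assumption A2 (our own illustration; the function name is hypothetical):

```python
# Confidence interval dominance (the relation u above): i dominates j only
# if i's upper bound beats j's lower bound in every objective, i.e. the
# confidence boxes are disjoint and ordered in each dimension.

def ci_dominates(f_i, u_i, f_j, u_j):
    """True iff for all k: f_i[k] + u_i < f_j[k] - u_j, given predicted
    values f_i, f_j and confidence half-widths u_i, u_j."""
    return all(fi + u_i < fj - u_j for fi, fj in zip(f_i, f_j))

print(ci_dominates((1.0, 1.0), 0.1, (2.0, 2.0), 0.1))  # True: boxes disjoint
print(ci_dominates((1.0, 1.0), 0.6, (2.0, 2.0), 0.6))  # False: boxes overlap
```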
≺_c: Confidence interval bounds as objectives. Another way of limiting the predic-
tion errors potentially perpetuated through the algorithm is to limit the magni-
tude of the sorting error. For this reason we define:

X_i ≺_c X_j :⇔ ∀k ∈ {1 . . . d} : f̃_k(X_i) − u_i ≤ f̃_k(X_j) − u_j ∧ f̃_k(X_i) + u_i ≤ f̃_k(X_j) + u_j,
with at least one of these inequalities strict.

Under assumption A2, the error per dimension is bounded by the length of the
intersection of the confidence intervals, which is in turn bounded by the width
of the smaller interval. Therefore, it holds that e^{i,j}_{o,c} ≤ 2 min(u_i, u_j) in each
of the d dimensions.
≺_p: Pareto dominance on predicted values. This relation is the most straightfor-
ward, but it does not take the uncertainties of the predictions into account.

X_i ≺_p X_j :⇔ f̃(X_i) ≺ f̃(X_j)

Assuming A2 again, a sorting error can only be committed if the confidence
intervals intersect. Because of the symmetric nature of the interval, it holds that
e^{i,j}_{o,p} ≤ u_i + u_j in each of the d dimensions, as the magnitude of the sorting
error is again bounded by the confidence interval widths.
≺_o: Pareto dominance on lower bounds. This optimistic relation was motivated
by [4], where SA-EMOAs performed better using ≺_o instead of ≺_p.

X_i ≺_o X_j :⇔ ∀k ∈ {1 . . . d} : f̃_k(X_i) − u_i ≤ f̃_k(X_j) − u_j ∧ ∃k ∈ {1 . . . d} : f̃_k(X_i) − u_i < f̃_k(X_j) − u_j

The maximum error occurs when the lower confidence interval bounds are close
together, but in the wrong order, making e^{i,j}_{o,o} ≤ 2 max(u_i, u_j) in each of
the d dimensions.
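The remaining relations share one weak-and-strict Pareto pattern applied to different vectors. A sketch (our own naming, not the paper's code): ≺_p compares the predicted means, ≺_o only the lower interval bounds, and ≺_c treats both interval bounds as objectives.

```python
def _pareto(a, b):
    """Weak-and-strict Pareto pattern shared by all three relations."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def p_dominates(f_i, f_j):
    return _pareto(f_i, f_j)

def o_dominates(f_i, u_i, f_j, u_j):
    return _pareto([x - u_i for x in f_i], [x - u_j for x in f_j])

def c_dominates(f_i, u_i, f_j, u_j):
    # Concatenate lower and upper bounds so both act as objectives.
    bounds_i = [x - u_i for x in f_i] + [x + u_i for x in f_i]
    bounds_j = [x - u_j for x in f_j] + [x + u_j for x in f_j]
    return _pareto(bounds_i, bounds_j)

# With wide intervals the optimistic relation o still ranks the pair, while
# the more cautious c refuses (the upper bounds are in the wrong order):
print(o_dominates((1.0, 1.0), 0.5, (1.2, 1.2), 0.1))  # True
print(c_dominates((1.0, 1.0), 0.5, (1.2, 1.2), 0.1))  # False
```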
Now assume we have obtained a strict partial order based on any of the given
binary relations. Let r_c be the rank of the µ-th individual. Then, all individu-
als with rank less than r_c can confidently (with maximum errors as described
above) be selected. In case a selection has to be made from the individuals with
the critical rank r_c, one option is to apply another dominance relation to the
individuals in question in the hope that the required distinction can be made. In
case of ≺_f, this always means evaluating uncertain individuals until a confident
distinction can be made according to the previous relation.
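The critical-rank mechanism can be illustrated with nondominated sorting on known objective vectors. A simplified sketch (minimisation; names are ours, not the paper's):

```python
# Individuals with rank below the critical rank r_c survive outright; the
# tie at r_c needs a further relation or exact evaluations to be broken.

def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def nondominated_ranks(pop):
    """Rank 0 = nondominated front, rank 1 = next front once rank 0 is removed."""
    remaining = list(range(len(pop)))
    ranks = [0] * len(pop)
    rank = 0
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(pop[j], pop[i]) for j in remaining if j != i)]
        for i in front:
            ranks[i] = rank
        remaining = [i for i in remaining if i not in front]
        rank += 1
    return ranks

pop = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5)]
ranks = nondominated_ranks(pop)
print(ranks)  # → [0, 0, 0, 1, 2]

mu = 4
order = sorted(range(len(pop)), key=lambda i: ranks[i])
r_c = ranks[order[mu - 1]]                        # rank of the mu-th individual
safe = [i for i in order if ranks[i] < r_c]       # selected confidently
critical = [i for i in order if ranks[i] == r_c]  # need tie-breaking
print(safe, critical)  # → [0, 1, 2] [3]
```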
Another option is to use a relation inducing a total preorder (transitivity,
totality) as a secondary selection criterion (and random choice in case of further
ties), as most EMOAs do. Again incorporating different amounts of information,
we have tested the following hypervolume-based (hv) relations for this purpose:

Hypervolume contribution of objective values:
X_i ⪯_ho X_j :⇔ hv(f_o(X_i)) ≥ hv(f_o(X_j)), where f_o(X_i) = f(X_i) if u_i = 0
and f̃(X_i) else.

Hypervolume contribution of confidence interval bounds:
X_i ⪯_hc X_j :⇔ the hypervolume contribution computed from the confidence
interval bounds f̃_k(X_i) ± u_i is at least that computed from f̃_k(X_j) ± u_j.
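For the bi-objective case, the hypervolume contribution of a point on a nondominated front has a simple closed form: the rectangle only that point covers, relative to its neighbours and a reference point. An illustrative sketch (our own code, not the SMS-EMOA implementation):

```python
# 2-D hypervolume contribution for minimisation: sort the front by the first
# objective; point i's exclusive region is bounded by its successor in f1 and
# its predecessor in f2 (the reference point closes the ends).

def hv_contributions(front, ref):
    """front: list of (f1, f2) on a nondominated front; ref: reference point."""
    pts = sorted(front)  # ascending f1 implies descending f2 on such a front
    contrib = []
    for i, (x, y) in enumerate(pts):
        x_next = pts[i + 1][0] if i + 1 < len(pts) else ref[0]
        y_prev = pts[i - 1][1] if i > 0 else ref[1]
        contrib.append((x_next - x) * (y_prev - y))
    return contrib

print(hv_contributions([(1, 4), (2, 2), (4, 1)], ref=(6, 6)))  # → [2, 4, 2]
```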
3.2 SAPEO algorithm
Algorithm 1 describes the basic SAPEO algorithm, into which any EMOA and
surrogate model with uncertainty estimates can be plugged. As inputs, the algo-
rithm receives the fitness function fun, the number of points considered for the
surrogate model local_size and the budget of function evaluations. The order
of dominance relations (strategies) and the secondary criterion (scnd_crit)
are used for selection (cf. section 3.1). The output is the final population.
The algorithm starts with the mandatory data structures: the population is ini-
tialised randomly (line 1) and evaluated using the considered fitness function
fun (line 2), the EMOA is set up (line 3), and the error tolerance ε as well as the
generation counter are set to their initial values (line 4).
The core optimisation loop starts in line 5 and stops once the considered
optimiser terminates (due to the allocated budget or convergence), but not while
both the error tolerance ε is larger than 0 and there are function evaluations left,
in order to avoid convergence on imprecise values. Within the loop, new candidate
solutions X are first generated by the optimisation algorithm (line 6) and evaluated
Algorithm 1 SAPEO
Input: fun, local_size, budget, strategies, scnd_crit
Output: X_final - final population
1: X_0.genome ⇐ [random(n) : i ∈ 1 . . . pop_size]  ▷ random initialisation
2: X_0.f ⇐ [fun(x) : x ∈ X_0]; X_0.e ⇐ 0  ▷ evaluate sampled individuals
3: O ⇐ init(X_0, budget)  ▷ init optimiser with initial population
4: ε_0 ⇐ ∞; g ⇐ 1  ▷ init error tolerance, generation counter
5: while (¬O.stop()) ∨ (ε > 0 ∧ budget > 0) do
6:   X_g.phenome ⇐ O.evolve(X_{g−1})  ▷ get new population
7:   X_g.f, X_g.e ⇐ [model(x, k_nearest(x, X[X.e == 0], local_size)) : x ∈ X_g]
8:     ▷ predict value, error with surrogate from evaluated neighbours
9:   ε_g ⇐ min(ε_{g−1}, α-percentiles(diff(X_g.f)))  ▷ update error tolerance
10:  for x ∈ X_g do
11:    if x.e > ε ∨ select(X_g, strategy, scnd_crit) == NULL then
12:      x.f ⇐ fun(x)  ▷ evaluate individual
13:      bbob.recommend(X[X.e > 0, last])  ▷ recommend solution
14:    end if
15:  end for
16:  O.select(X_g, strategy, scnd_crit)  ▷ SAPEO survival selection
17:  g ⇐ g + 1  ▷ increase generation counter
18: end while
19: X_final ⇐ X_{g−1}; X_final.f ⇐ [fun(x) : x ∈ X_final]  ▷ evaluate final population
based on a local surrogate model trained from the local_size evaluated indi-
viduals closest in design space (line 8). The predicted function values and the
expected model errors (cf. equation 1) are stored. The error tolerance threshold
is then adapted (line 9). We reduce the threshold during the course of the algo-
rithm in order to limit the probability of sorting errors with large effects on the
final population. Therefore, ε_g is the minimum of the previous threshold ε_{g−1}
and the α-percentiles of the Euclidean distances in objective space per dimension.
The distances are a way to measure the distinctness of a population and thus
the potential of overlapping confidence intervals. By adapting ε_g accordingly, we
reduce the number and magnitude of potential sorting errors.
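The tolerance update of line 9 can be sketched as follows. This is a hypothetical reading of the update, assuming a single α-percentile of the pairwise Euclidean distances; the paper's exact percentile computation may differ:

```python
import math
from itertools import combinations

def update_tolerance(eps_prev, fitnesses, alpha=0.05):
    """epsilon_g = min(epsilon_{g-1}, alpha-percentile of pairwise distances
    in objective space) - a proxy for how distinct the population is."""
    dists = sorted(math.dist(a, b) for a, b in combinations(fitnesses, 2))
    idx = max(0, min(len(dists) - 1, int(alpha * len(dists))))
    return min(eps_prev, dists[idx])

fits = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
eps = update_tolerance(float("inf"), fits, alpha=0.5)
print(eps)  # median-ish pairwise distance; never exceeds the previous value
```

The monotone `min` with the previous threshold guarantees that the tolerance can only shrink over the run, which is what limits late-stage sorting errors.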
If any of the individuals in the population need to be evaluated - either
because the predicted uncertainty is above the threshold or because the indi-
viduals cannot be distinguished (see line 11) - they are evaluated in line 12 and
updated accordingly. In order to simulate anytime behaviour of the algorithm,
each time a solution is evaluated, an individual is recommended to the BBOB
framework in line 13. This serves to measure more accurately the solution quality
the algorithm would have achieved had it been stopped at that time.
The set of candidate solutions, along with the (predicted but reasonably
certain) function values and the expected prediction errors are then passed to
the optimiser in line 16. Depending on the selected strategy, the optimiser then
selects the succeeding population as described above with regard to the predicted
function values and uncertainties and resumes its regular process.
Finally, after the optimisation loop terminates, the function values of the
individuals in the final population are computed using the real fitness function
in line 19, in case there are any individuals left that have not been evaluated.
4 Evaluation
4.1 Experimental Setup
Each experiment was run as 550 parallel jobs that each took less than 3 hours,
with specifications according to table 1. Since performance is strictly
measured in terms of function evaluations (target precision reached per function
evaluation, cf. section 2.2), the runtime does not influence it.
Table 1: Experiment specifications and parameters
budget: 1000 per dimension (use case: expensive function)
variation operators: standard for all algorithms (cf. [2])
population size: 100 (as suggested in [4])
sample size for surrogate: 15 (due to computational concerns)⁴
number of candidate offspring: 15 (for SA-SMS, same as sample size)
correlation assumption: squared exponential
trend assumption: constant
regression weights: maximum likelihood using COBYLA (start: 10⁻², bounds: [10⁻⁴, 10¹])
We compare the performances with a standard [2] and surrogate-assisted
SMS-EMOA with pre-selection as proposed by [4], since we are not aware of
any other SA-EMOAs using the SMS-EMOA with individual-based surrogate
management strategies. Specifically, we look at the following algorithms:
SMS-EMOA Standard SMS-EMOA as baseline comparison.
SA-SMS-p Surrogate-assisted SMS-EMOA using ≺_p for pre-selection.
SA-SMS-o Surrogate-assisted SMS-EMOA using ≺_o instead (experimentally
shown to improve the performance of pre-selection for the NSGA-II [4]).
SAPEO-uf-ho SAPEO using ≺_u to rank the offspring, thus accepting a risk
of sorting errors of only α² (cf. 3.1). For as long as the population cannot
be distinguished by ≺_u, the individuals are evaluated according to ≺_f, thus
avoiding any further sorting errors. The hypervolume relation ⪯_ho is
used as secondary criterion. This algorithm should therefore only take small
risks and behave like the SMS-EMOA while saving function evaluations.
SAPEO-ucp-ho SAPEO using the increasingly risky relations ≺_u, ≺_c, ≺_p to avoid
evaluations completely if not forced by the uncertainty threshold ε, taking
the opposite approach to SAPEO-uf-ho. ⪯_ho is used as secondary criterion.
SAPEO-uc-hc SAPEO fully using the multi-objectification of the confidence
interval boundaries. It uses ≺_u as a first, safer way of ranking, followed by ≺_c
on critical individuals. Secondary criterion is ⪯_hc.
⁴ In a real-world application, the sample size should be chosen considering the trade-off
between computation times for the model and the fitness function.
4.2 Visualisation and Interpretation of Results
There are two main angles to evaluating the anytime performance of algorithms:
(1) measuring the performance indicator after a predefined number of function
evaluations (fixed budget) and (2) recording the function evaluation when target
performances are reached (fixed target) [6]. In the following, we use the latter.
For a detailed depiction of an algorithm’s performance for a fixed target, we
use heatmaps (cf. figure 2) that show the percentage of budget used per dimen-
sion until a target was reached according to the colour scale on the right. If the
target is not reached within the allocated budget, the corresponding square is
white. The dimensions and instances of each function are shown separately to
enable analysis of the generalisation of algorithm performance across function
instances and dimensions. This is very important to justify the aggregation of
performance measures across instances. The functions have colour codes accord-
ing to the legend in figure 1 that specify their function groups.
Fig. 2: SAPEO-uf-ho performance in terms of the percentage of the budget per
dimension used to reach target 10⁰ for all function instances and dimensions.
From the plot, it is apparent that for the selected target 10⁰, the algorithm
SAPEO-uf-ho has trouble with a number of functions even in small dimensions.
Additionally, for those functions, the algorithm’s performance seems to drop with
increasing dimension of the search space. Especially the Rosenbrock function
seems to be problematic for the algorithm: SAPEO-uf-ho rarely reaches the
target for dimensions 10 or 20 when the Rosenbrock function is part of the
problem (f04, f13, f21, f28-f34). A potential cause is an inaccurate representation
of its narrow valley containing the optimum with the surrogate model. Another
explanation could be a mismatch of the Rosenbrock function and the variation
operators, causing difficulty for the underlying SMS-EMOA. As expected, some
of the weakly structured problems were difficult for SAPEO-uf-ho as well.
A discussion of the potential causes of the performances is only possible
with reference to other algorithms. In order to get a better overview of the
performances of all algorithms and to detect patterns, we have compiled figure
3, which is an assembly of 30 heatmaps like the one in figure 2 for all algorithms
and different targets. The same colour scale as in figure 2 is used. Recall that
white spaces signify targets that were not reached within the allocated budget.
In figure 3, the most obvious trend is the performance declining from target
to target, which was of course expected. It is also apparent that the SMS-EMOA
generally performs better than all other algorithms for each target precision. We
can also see that all SAPEO versions are an improvement when compared to the
SA-SMS algorithms. Interestingly, we also see similar patterns in terms of which
functions are more difficult for all algorithms, indicating that the added surrogate
models do not influence the underlying optimisation behaviour significantly.
Fig. 3: Heatmaps visualising target performances for all algorithms (rows) across
multiple targets (columns). Refer to figure 2 for a detailed explanation.
Unfortunately, while providing a good overview, figure 3 is not well suited to
interpret the performance of each algorithm per function. While very detailed,
the plots are not easy to interpret due to the abundance of information displayed
at once. In order to analyse the circumstances of different performance patterns,
we compile a plot that aggregates the different instances of a function. This way,
the general performance of an algorithm per function can be expressed without
risk of overfitting, as intended by the COCO framework. To do that, we use the
expected runtime (expected number of function evaluations) to reach a target
[5] as a performance measure. The measure is estimated for a restart algorithm
with 1000 samples. The results are again displayed in a heatmap (figure 4). The
colour visualises the estimated expected runtime per dimension in log₁₀ scale
according to the scale on the right. Values higher than the maximum budget
(> 3) occur if a target is not reached in all instances. White spaces occur if the
target was never reached by the algorithm in any instance.
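The expected runtime of a conceptual restart algorithm can be sketched as total evaluations spent across runs divided by the number of successful runs. This is a simplified stand-in for the bootstrapped estimate COCO computes; the function name and per-dimension normalisation are our own:

```python
def expected_runtime(evals_per_run, successes, dimension=1):
    """evals_per_run[i]: evaluations used in run i (up to success or budget);
    successes[i]: whether run i reached the target."""
    n_success = sum(successes)
    if n_success == 0:
        return float("inf")  # target never reached: no finite estimate
    return sum(evals_per_run) / n_success / dimension

# 3 runs: two hit the target after 40 and 60 evals, one used the full budget.
print(expected_runtime([40, 60, 100], [True, True, False]))  # → 100.0
```

Failed runs thus inflate the estimate, which is why a target reached in only a few instances shows up as a very high expected runtime in the heatmap.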
The plot displays expected runtime for all dimensions in different columns
according to the labels above. Each of these columns is again divided into 3,
displaying the results for different algorithms according to the labels on the
bottom. There are two algorithms per column, whose results are displayed on
top of each other for each row corresponding to a function. For each algorithm,
the expected runtimes for targets 10¹, 10⁰, 10⁻¹, 10⁻², 10⁻³ are depicted in that
order. In case a target was never reached by any of the algorithms in a column, it
is omitted. For example, the expected runtime for SAPEO-uf-ho on function 01,
target 10¹ and dimension 2 is in the top left corner and encoded in light
blue, i.e. an expected runtime of around 2 (≈ 10^0.3).
The SMS-EMOA is directly below that and a shade lighter, so it has a slightly
higher expected runtime. As in figure 2, the group each function belongs to
is encoded according to the colour scheme in the legend of figure 1.
The general trends as seen in figure 3 can be observed here as well. However,
we can also see that SAPEO-uf-ho beats the SMS-EMOA in terms of precision
reached on very rare occasions, for example on functions f03 and f41 in dimension
2 and function f20 in dimension 10. Still, the SMS-EMOA generally reaches the
same or more precision targets than the other algorithms. However, in most
cases where a higher precision target is reached by only a single algorithm, the
corresponding colour indicates a very high expected runtime. This means that
the algorithm did not reach the higher target for most instances, which speaks
against a robust performance of that algorithm. More importantly, the SA-SMS
variants often reach fewer targets than the other algorithms, especially on
higher-dimensional problems, meaning they are clearly outperformed.
The colour gradients in most functions are remarkably alike, indicating simi-
lar behaviour and difficulties experienced with each problem. This is expected, as
the intention of SA-EMOAs is to avoid function evaluations with only controlled
Fig. 4: BBOB-BIOBJ performance results for all algorithms regarding expected
runtime per dimension (colour-coded in log₁₀ scale) for targets 10¹, 10⁰, 10⁻¹,
10⁻², 10⁻³, shown for dimensions 2, 3, 5, 10 and 20.
effects on the evolutionary path. Possibly due to the aggregating nature of the
expected runtime measure, the performance contrast does not appear to be as
stark as in figure 3. The gradient and number of performance targets reached per
function is in fact relatively similar for all algorithms. In most cases, differences
occur towards the end of the gradient, indicating that the precision improvement
of the surrogate-assisted algorithms is less steep than for the SMS-EMOA. How-
ever, in order to analyse the algorithms’ behaviour appropriately in that regard,
a more thorough analysis of the separate selection steps is required.
Regarding the different functions, there seems to be no clear performance
pattern. The different SAPEO versions rarely differ. The performance of all al-
gorithms seems to be more closely tied to the single-objective component functions,
e.g., Rosenbrock seems to pose problems whereas Schwefel seems more manageable.
5 Conclusions and Future Work
In this paper, we have proposed a novel approach to surrogate-assisted multi-
objective evolutionary algorithms called SAPEO. An extensive analysis of its
anytime performance using the BBOB-BIOBJ benchmark showed that it was
outperformed by its underlying algorithm in this study, the SMS-EMOA. This
fact is quite surprising, since the SAPEO-uf-ho variant allows minimal uncer-
tainties and should rarely make different decisions. However, SAPEO still beats
another SA-EMOA based on the SMS-EMOA [4] on the benchmark.
One potential source of error is the surrogate model, e.g. assumptions A1,
A2 (section 3.1) could be wrong. A large error in the predicted uncertainty
could have a tremendous influence on the algorithm. However, this is control-
lable through the adaptation of the uncertainty threshold ε. Additionally, the uncertainties at the start of the SAPEO runs were relatively large, which could also send the algorithm in the wrong direction. These uncertainties could be mitigated by using surrogate ensembles instead, distributing the samples better, increasing the sample size or selecting a more suitable kernel. Additionally, the performance of local versus global surrogates should be analysed more thoroughly.
It is apparent that the quality of the surrogate model is a major concern for
SA-EMOAs, which could be problematic for black-box optimisation in general.
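The role of the uncertainty threshold ε discussed above can be illustrated with a small sketch. It shows the SAPEO-style gating idea only: an individual is evaluated exactly when the surrogate's predictive uncertainty exceeds the current tolerance; the function name and values are hypothetical, not the paper's implementation:

```python
def select_for_evaluation(pred_std, epsilon):
    """SAPEO-style gating (sketch): an individual is evaluated exactly
    only when the surrogate's predictive uncertainty exceeds the current
    error-tolerance threshold epsilon; otherwise its Kriging prediction
    is propagated in place of the true fitness."""
    return [std > epsilon for std in pred_std]

# Predictive standard deviations from the Kriging model for four
# individuals; in SAPEO, epsilon is tightened over the run, so more
# individuals are evaluated exactly as the search converges.
mask = select_for_evaluation([0.01, 0.30, 0.05, 0.80], epsilon=0.1)
print(mask)  # → [False, True, False, True]
```

A mis-estimated predictive standard deviation feeds directly into this decision, which is why a large error in the predicted uncertainty can have such a strong influence on the algorithm.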
Apart from the model, there are possible improvements regarding the binary relations used. For one, the uncertainty relation u could be defined without forcing strict Pareto dominance of the hypercubes. Furthermore, using hypercubes for the potential location of the fitness values is a simplification; a binary relation on hyperellipsoids could perhaps provide better results.
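The strict Pareto dominance of hypercubes mentioned above can be sketched as follows, assuming minimisation and axis-aligned boxes given by per-objective lower and upper bounds (the function and its arguments are illustrative, not the paper's exact relation):

```python
def hypercube_dominates(lo_a, hi_a, lo_b, hi_b):
    """True iff hypercube A strictly Pareto-dominates hypercube B under
    minimisation: even the worst corner of A (its upper bounds) is at
    least as good as the best corner of B (its lower bounds) in every
    objective, and strictly better in at least one."""
    at_least_as_good = all(ha <= lb for ha, lb in zip(hi_a, lo_b))
    strictly_better = any(ha < lb for ha, lb in zip(hi_a, lo_b))
    return at_least_as_good and strictly_better

# A's box lies entirely below B's box in both objectives:
print(hypercube_dominates([0.1, 0.2], [0.3, 0.4],
                          [0.5, 0.6], [0.9, 1.0]))  # → True
```

Whenever the boxes overlap, this relation returns False for both directions, which is exactly the indistinguishable case in which SAPEO must fall back on exact evaluations.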
Furthermore, while the SAPEO approach worked well for single-objective problems, the corresponding multi-objective problems pose a considerably larger difficulty for a surrogate model. Additionally, even slightly overestimated function values could lead to an incorrect identification of dominated individuals, because a large number of critical values would need to be distinguished using less certain relations or expensive function evaluations.
Notice that previous studies [12,1] have shown that EAs with a larger number
of offspring are less vulnerable to noisy fitness functions. Therefore, it may be
conjectured that the µ+ 1 selection scheme of the SMS-EMOA causes the poor
performance under noise induced by the surrogate. This hypothesis, however,
remains a question for future research. In general, the influence of the quality
of surrogate models on SA-EMOAs should be analysed more carefully. With the
proper noise-robust optimisation algorithm and parametrisation, SAPEO should
be able to beat its underlying algorithm as it does on single-objective problems.
References

1. Arnold, D., Beyer, H.-G.: On the Benefits of Populations for Noisy Optimization. Evolutionary Computation 11(2), 111–127 (2003)
2. Beume, N., Naujoks, B., Emmerich, M.: SMS-EMOA: Multiobjective selection based on dominated hypervolume. European Journal of Operational Research 181(3), 1653–1669 (2007)
3. Brockhoff, D., Tušar, T., Tušar, D., Wagner, T., Hansen, N., Auger, A.: Biobjective Performance Assessment with the COCO Platform. CoRR abs/1605.01746 (2016), retrieved: 22/12/2016
4. Emmerich, M., Giannakoglou, K., Naujoks, B.: Single- and Multi-objective Evo-
lutionary Optimization Assisted by Gaussian Random Field Metamodels. IEEE
Transactions on Evolutionary Computation 10(4), 421–439 (2006)
5. Hansen, N., Auger, A., Ros, R., Finck, S., Pošík, P.: Comparing results of 31 algorithms from the black-box optimization benchmarking BBOB-2009. In: Companion of Genetic and Evolutionary Computation Conference (GECCO 2010), pp. 1689–1696. ACM Press, New York (2010)
6. Hansen, N., Finck, S., Ros, R., Auger, A.: Real-Parameter Black-Box Optimization
Benchmarking 2009: Noiseless Functions Definitions. Research Report RR-6829,
INRIA (2009), retrieved: 22/12/2016
7. Jin, Y.: A Comprehensive Survey of Fitness Approximation in Evolutionary Com-
putation. Soft Computing 9(1), 3–12 (2005)
8. Jin, Y.: Surrogate-assisted evolutionary computation: Recent advances and future
challenges. Swarm and Evolutionary Computation 1(2), 61–70 (2011)
9. Knowles, J., Nakayama, H.: Meta-Modeling in Multiobjective Optimization. In:
Branke, J., et al. (eds.) Multiobjective Optimization - Interactive and Evolutionary
Approaches, pp. 245–284. Springer, Berlin (2008)
10. Limbourg, P., Aponte, D.E.S.: An Optimization Algorithm for Imprecise Multi-
Objective Problem Functions. In: IEEE Congress on Evolutionary Computation
(CEC 2005). IEEE Press, Piscataway, NJ (2005)
11. Mlakar, M., Petelin, D., Tušar, T., Filipič, B.: GP-DEMO: Differential Evolution for Multiobjective Optimization based on Gaussian Process models. European Journal of Operational Research 243(2), 347–361 (2015)
12. Nissen, V., Propach, J.: Optimization with noisy function evaluations. In: Parallel
Problem Solving from Nature (PPSN V). pp. 159–168. Springer, Berlin (1998)
13. Rudolph, G.: A Partial Order Approach to Noisy Fitness Functions. In: IEEE
Congress on Evolutionary Computation (CEC 2001). pp. 318–325. IEEE Press,
Piscataway, NJ (2001)
14. Sacks, J., Welch, W.J., Mitchell, T.J., Wynn, H.P.: Design and analysis of computer
experiments. Statistical Science 4(4), 409–423 (1989)
15. Tušar, T., Brockhoff, D., Hansen, N., Auger, A.: COCO: The Bi-objective Black Box Optimization Benchmarking (bbob-biobj) Test Suite. CoRR abs/1604.00359 (2016), retrieved: 22/12/2016
16. Zitzler, E., Knowles, J., Thiele, L.: Quality Assessment of Pareto Set Approxima-
tions. In: Branke, J., et al. (eds.) Multiobjective Optimization - Interactive and
Evolutionary Approaches, pp. 373–404. Springer, Berlin (2008)