# An enhanced statistical approach for evolutionary algorithm comparison

**Abstract**

This paper presents an enhanced approach for comparing evolutionary algorithms. The approach is based on three statistical techniques: (a) Principal Component Analysis, which is used to make the data uncorrelated; (b) Bootstrapping, which is employed to build the probability distribution function of the merit functions; and (c) Stochastic Dominance Analysis, which is employed to make possible the comparison between two or more probability distribution functions. Since the approach proposed here is not based on parametric properties, it can be applied to compare any kind of quantity, regardless of its probability distribution function. When applied to the same problems, the proposed approach has provided more supported decisions than former approaches.


Eduardo G. Carrano, Research Group for Intelligent Systems (GPSI), Centro Federal de Educação Tecnológica de Minas Gerais, Av. Amazonas, 5253, Nova Suiça, 30480-000, Belo Horizonte - MG, Brazil (egcarrano@deii.cefetmg.br)

Ricardo H. C. Takahashi, Dep. Mathematics, Universidade Federal de Minas Gerais, Av. Antônio Carlos, 6627, Pampulha, 31270-010, Belo Horizonte - MG, Brazil (taka@mat.ufmg.br)

Elizabeth F. Wanner, Dep. Mathematics, Universidade Federal de Ouro Preto, Campus Universitário, Morro do Cruzeiro, 35400-000, Ouro Preto - MG, Brazil (efwanner@iceb.ufop.br)


Categories and Subject Descriptors

G.1.6 [Mathematics of Computing]: Numerical Analysis—Optimization

General Terms

Algorithms

Keywords

evolutionary algorithms, algorithm comparison, evolutionary encoding schemes, tree network design

1. INTRODUCTION

When deterministic optimization algorithms are compared, their performances are characterized by their computational complexity only. Since those algorithms always perform the same sequence of deterministic steps, it is guaranteed, under some assumptions, that, starting from a given initial point, an algorithm converges (i.e., reaches a stop criterion) in a fixed number of algorithm iterations (the solution that is reached is not necessarily the global optimum of the problem) [5].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
GECCO'08, July 12–16, 2008, Atlanta, Georgia, USA.
Copyright 2008 ACM 978-1-60558-130-9/08/07 ...$5.00.

The performance analysis of evolutionary algorithms cannot follow the same methodology. The stochastic nature of those algorithms introduces another issue that must be considered: it is not guaranteed that any single run achieves the same solution that has been found in another run and, even when the same solution is reached, the computational effort spent to obtain it varies between different runs of the same algorithm [5].

The flexible structure of evolutionary algorithms makes it possible to build them in several different ways, which leads to different algorithms that usually present different performances. This combinatorial scenario of possible algorithms motivates the efforts for developing evaluation/comparison methods for evolutionary algorithms, such as the ones presented in [5, 4, 18, 19, 1].

Some of the approaches proposed in the literature adopt the assumption that, in some cases, convergence is almost ensured, and therefore can be assumed as true. It then becomes possible to compare the algorithms based on a single criterion: the computational effort required to reach the optimum. In this approach, the computational cost of the algorithm is modeled as a random variable, and statistical analyses are employed to draw inferences about this variable [5]. Although this approach can be useful in some particular situations, the assumption of convergence is too strong. Notice that in large NP-hard problems, for instance, the evolutionary algorithms are expected to find only "good" sub-optimal solutions, and the assumption of finding the exact optimum is infeasible in practice. There are similar approaches which fix the computational cost, establishing a pre-determined number of algorithm generations. This approach can also cause problems, since the computational cost becomes an arbitrary parameter, which can strongly affect the algorithm analysis.

It is reasonable to consider the trade-off between faster algorithms that lead to rough solutions, and slower algorithms that deliver more accurate solutions. A feasible way for comparing algorithms in which the convergence is not ensured is the employment of multicriteria comparison schemes, such as discussed in [19] and presented in [18]. In this kind of approach, the convergence ability and the computational cost of the algorithms are modeled as random variables, and a statistical estimator, such as the mean, is employed to allow comparisons. Reference [18], for example, uses the number of function evaluations to estimate the computational cost of the algorithm and the convergence rate to estimate its convergence capacity. These criteria are evaluated through their means, and a multiobjective analysis [9] is employed to find which algorithms can be considered efficient. Since a multiobjective analysis is adopted, the result of this approach is a set of efficient algorithms, instead of a single "best one".

Although the approach described above is better suited than a single-criterion one, it still presents a weakness: the comparisons are made based only on mean values, which often implies the loss of the information about how the points are spread around those means. This paper proposes a multiobjective comparison approach for evolutionary algorithms that is based on the concept of stochastic dominance [16], instead of using comparisons of the mean values only. This makes it possible to consider the deviation of the criteria around the mean, which consequently provides more supported decisions. The approximated probability distribution functions (PDFs) are built using a Bootstrapping procedure [6]. Since this approach is non-parametric, it can be employed for comparing any data set, regardless of its distribution. The results achieved have shown that the approach has a high capacity of detecting significant statistical differences between algorithms.

The paper is structured as follows:

• Section 2 introduces the statistical concepts which are used in the comparison approach;

• Section 3 presents the methodology of the comparison proposed in this work;

• Section 4 presents the results achieved by the comparison approach in three instances of a discrete problem. The results obtained by the approach are compared with previous results for the same problem.

2. CONCEPTUAL BACKGROUND

The concepts which are employed in the comparison approach proposed in this paper are briefly introduced in this section.

2.1 Principal Component Analysis

Principal Component Analysis (PCA) is a mathematical procedure that is employed to transform a set of correlated variables into a new set (sometimes smaller) of uncorrelated variables. The new variables are arranged in such a way that they are sorted by variability, with the first component accounting for as much of the variability in the data as possible, and each succeeding component accounting for as much of the remaining variability as possible. The variables which represent a significant part of the variability are called principal components. Sometimes this procedure is called Proper Orthogonal Decomposition (POD) or Hotelling Transform.

The PCA is usually employed for some specific tasks:

• Reduce the dimensionality of a data set;

• Identify new meaningful underlying variables;

• Find the uncorrelated data, to analyze the variables independently.

Let X be a set of M variables with N observations each. (The description of Principal Component Analysis presented in this section has been adapted from references [10, 11, 13].) The PCA of X can be performed in 13 steps:

1. Arrange the data in N data vectors, X_1, ..., X_N. Each vector X_i is a column vector (M × 1) with a single observation of the M variables.

2. Build a matrix X (M × N) with the column vectors.

3. Find the empirical mean of each variable m = 1, ..., M:

u[m, 1] = (1/N) Σ_{n=1}^{N} X[m, n].

4. Subtract the empirical mean u from each column of matrix X:

B = X − u · h

where h[1, m] = 1 ∀ m = [1, ..., M].

5. Find the empirical covariance matrix C (M × M):

C = (1/N) B · B*

where B* is the conjugate transpose of matrix B.

6. Find the eigenvalues of matrix C and build a column vector E (M × 1).

7. Find the matrix of eigenvectors V in which:

V^{−1} · C · V = D

where D is a diagonal matrix, such that:

D[m, m] = E[m, 1] ∀ m = [1, ..., M].

8. Sort the columns of V and D in descending order of eigenvalues.

9. Find the cumulative energy content for each eigenvector:

g[m] = Σ_{i=1}^{m} D[i, i] ∀ m = [1, ..., M]

• Eigenvectors with higher cumulative energy represent a larger part of the variance than eigenvectors with lower cumulative energy.

10. Select a subset of the eigenvectors as basis vectors:

W[p, q] = V[p, q] ∀ p = [1, ..., M], q = [1, ..., L]

where 1 ≤ L ≤ M.

• The vector g can be used as a guide for choosing L. For instance, L can be chosen such that g[m = L] ≥ 0.9 (with g normalized by the total energy g[M]).

11. Find the empirical standard deviation vector s (M × 1):

s[m, 1] = √(C[m, m]) ∀ m = [1, ..., M].

12. Calculate the z-score matrix Z (M × N):

Z = B ÷ (s · h) (element-by-element division).

13. Project the z-scores of the data onto the new basis:

Y = W* · Z.

The result of this procedure is a set of data Y, which is uncorrelated and fitted to a lower dimension (L ≤ M). This procedure is useful in the analysis of high dimension data.

In the specific case of this work, the PCA is used only to make the data uncorrelated, regardless of the dimension reduction that could be exploited. Therefore, only steps 1 to 8 are required. The uncorrelated data is obtained through (1):

Y = V · X    (1)

In evolutionary algorithm comparison, at least two merit criteria should be considered: a criterion which estimates the convergence capacity of the algorithm and a criterion which estimates its computational cost [18]. Therefore, each algorithm run results in a point in a two-dimensional space (or in a higher dimensional one, if more criteria are considered), which is often correlated. The decoupling of the data can be useful to make meaningful the independent analysis of each merit criterion considered.
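Steps 1 to 8 can be sketched as follows in Python/NumPy (the experiments in this paper were run in Matlab 7, so this is only an illustrative translation; the function name `decorrelate` and the test data are ours). The centered data is projected onto the eigenvectors of the covariance matrix, sorted by decreasing eigenvalue, which corresponds to the projection in (1) applied to mean-removed data:

```python
import numpy as np

def decorrelate(X):
    """Steps 1 to 8: project the centered data onto the eigenvectors of
    the empirical covariance matrix, sorted by decreasing eigenvalue,
    so that the resulting variables are uncorrelated."""
    u = X.mean(axis=1, keepdims=True)      # step 3: empirical mean
    B = X - u                              # step 4: remove the mean
    C = (B @ B.T) / X.shape[1]             # step 5: covariance matrix
    evals, V = np.linalg.eigh(C)           # steps 6-7: eigen-decomposition
    V = V[:, np.argsort(evals)[::-1]]      # step 8: sort by eigenvalue
    return V.T @ B                         # projection of (1) on centered data

rng = np.random.default_rng(0)
X = np.array([[1.0, 0.0], [0.8, 0.6]]) @ rng.normal(size=(2, 500))  # correlated data
Y = decorrelate(X)
print(abs(np.corrcoef(Y)[0, 1]) < 1e-8)  # True: the components are uncorrelated
```

Because the eigenvector matrix of a symmetric covariance matrix is orthogonal, projecting onto it leaves the sample covariance of the output diagonal, which is exactly the decoupling needed here.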

2.2 Bootstrapping

Bootstrapping is a statistical procedure employed to infer the properties of an estimator (such as the mean or variance) by sampling from an approximating distribution. The empirical distribution obtained from a set of observations is generally used as the approximating distribution. (The description of Bootstrapping presented in this section has been adapted from references [2, 6, 7, 8].)

The bootstrapping procedure can be useful in the following situations:

• build hypothesis tests;

• make possible inferences based on parametric assumptions;

• make possible inferences in non-parametric data sets, using distribution comparison methods, such as non-parametric tests or stochastic dominance.

Let P = {x_1, x_2, x_3, ..., x_N} be a population and S = {X_1, X_2, X_3, ..., X_n} be an independent sample extracted from P (where n is much smaller than N). The properties of an estimator T for the population, Θ = T(P), can be inferred from the sample, θ = T(S), by the following steps:

1. Extract one sample S_i of size n from S, with replacement.

2. Calculate the value of the estimator T for the sample S_i.

3. Repeat steps 1 and 2 a pre-determined number of times.

4. Build the probability density function of the estimator T for P using the values obtained in step 2.
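These four steps can be sketched as follows for the estimator used in this paper, the mean (an illustrative NumPy sketch, not the paper's original Matlab code; the sample values and the number of resamples are arbitrary):

```python
import numpy as np

def bootstrap_mean_pdf(sample, n_resamples=2000, seed=0):
    """Steps 1-3: resample with replacement and record the estimator
    (here the mean) of each resample; the returned array approximates
    the sampling distribution of the mean (step 4)."""
    rng = np.random.default_rng(seed)
    n = len(sample)
    return np.array([rng.choice(sample, size=n, replace=True).mean()
                     for _ in range(n_resamples)])

sample = np.array([12.1, 9.8, 10.4, 11.7, 10.9, 9.5, 10.2, 11.1])
means = bootstrap_mean_pdf(sample)
lo, hi = np.percentile(means, [2.5, 97.5])  # 95% bootstrap confidence interval
print(lo, hi)
```

The array `means` is the empirical approximation of the PDF of the mean, from which confidence intervals or quantiles can be read directly.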

From the probability distribution function obtained in bootstrapping, it is possible to find confidence intervals, quantile distributions, etc., which can be useful for further analysis.

The main advantage of bootstrapping over analytical methods is its simplicity: it is easy to employ bootstrapping in order to estimate standard errors and confidence intervals for estimators of complex parameters of the distribution, such as percentile points, proportions and correlation coefficients.

In this work, bootstrapping is employed for estimating the independent probability density function (PDF) of the mean of each merit criterion. Since the data is uncorrelated (which is guaranteed by the employment of PCA), it is possible to perform the analysis over the independent PDFs instead of using a joint PDF. The comparison of evolutionary algorithms using the PDFs can provide more supported decisions than approaches based only on mean values, which are commonly adopted for this task.

2.3 First Order Stochastic Dominance

Stochastic dominance is a statistical concept employed to compare two (or more) distribution functions. Since it is not dependent on parametric distributions, it can be employed for comparing any entities, provided that their PDFs are known. Stochastic dominance can be defined as follows:

DEFINITION 2.1. Stochastic Dominance: Consider a minimization problem. A random variable X_1 presents stochastic dominance of first order over another random variable X_2 if:

cdf(X_1) ≥ cdf(X_2) ∀ x_i ∈ X_1 ∪ X_2    (2)

where cdf(X) is the cumulative distribution function of X.

Stochastic dominance is illustrated in Figure 1.

Since the stochastic dominance is going to be evaluated computationally, a discrete approximation of the concept becomes necessary, because it is not possible to process an infinite-dimensional variable (represented by a continuous function over a non-compact interval). This approximation can be calculated considering a finite number of values of the cumulative distribution function of the variables, which leads to an interval comparison, as follows.

Figure 1: 1st order stochastic dominance example (plot of cdf(X_1) and cdf(X_2) versus x): X_1 dominates X_2, since the quantiles of X_1 always occur before the quantiles of X_2 (minimization problem).

Let X_1 be a random variable, and x_1^{q1}, x_1^{q2}, ..., x_1^{qi}, ..., x_1^{qn} be the points of X_1 which define the quantiles q_0, q_{1/(n−1)}, ..., q_{(i−1)/(n−1)}, ..., q_1, as shown in Figure 2 (see footnote 3).

Figure 2: Random variable X_1, with its quantile-defining points x_1^{q1}, x_1^{q2}, x_1^{q3}, ..., x_1^{qi}, ..., x_1^{qn−1}, x_1^{qn} marked against the quantiles q_0, q_{1/(n−1)}, q_{2/(n−1)}, ..., q_{(i−1)/(n−1)}, ..., q_{(n−2)/(n−1)}, q_1; that is, P(X_1 < x_1^{q1}) = 0, P(X_1 < x_1^{q2}) = 1/(n−1), P(X_1 < x_1^{q3}) = 2/(n−1), ..., P(X_1 < x_1^{qi}) = (i−1)/(n−1), ..., P(X_1 < x_1^{qn−1}) = (n−2)/(n−1) and P(X_1 < x_1^{qn}) = 1.

Let X_2 be another random variable, and x_2^{q1}, ..., x_2^{qi}, ..., x_2^{qn} be the points of X_2 which define the same quantiles q_0, q_{1/(n−1)}, ..., q_{(i−1)/(n−1)}, ..., q_1.

It is said that X_1 "is better than" X_2 if, and only if:

x_1^{qi} ≤ x_2^{qi} ∀ i ∈ [1, n], with x_1^{qi} < x_2^{qi} for some i ∈ [1, n].    (3)

One should note that, when n → ∞, (3) is equivalent to First Order Stochastic Dominance [16].
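The interval comparison in (3) can be sketched as follows (an illustrative sketch; the number of quantiles `n` and the test distributions are arbitrary choices of ours):

```python
import numpy as np

def dominates(x1, x2, n=21):
    """Discrete first-order stochastic dominance test of eq. (3), in the
    minimization sense: x1 'is better than' x2 iff every quantile of x1
    is <= the matching quantile of x2, with at least one strictly smaller."""
    q = np.linspace(0.0, 1.0, n)
    q1 = np.quantile(x1, q)
    q2 = np.quantile(x2, q)
    return bool(np.all(q1 <= q2) and np.any(q1 < q2))

rng = np.random.default_rng(1)
a = rng.normal(0.0, 1.0, 1000)
b = a + 0.5  # same shape, shifted towards larger (worse) values
print(dominates(a, b), dominates(b, a))  # True False
```

When the two empirical distributions cross, neither call returns True, and the two samples are mutually non-dominated, which is what produces the equivalence groups in the rankings of Section 4.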

3. COMPARISON METHODOLOGY

In the proposed approach, two merit criteria have been established for comparison:

• Number of function evaluations performed by the algorithm (n_fe), which is employed for estimating the computational cost required by the algorithm. Optionally, the computation time of the algorithm could be used as an indicator of the algorithm cost [19].

• Objective function value of the best solution achieved by the algorithm (f_bs), which is employed for estimating the convergence ability of the algorithm.

Footnote 3: A point x_k of X defines the quantile q_α if P(X < x_k) = α.

It is straightforward to note that each algorithm run results in a point in the two-dimensional space [n_fe, f_bs]^T.

The algorithm comparison methodology proposed in this paper is performed through the following steps:

1. Each algorithm is executed n_r times, on a given test problem;

2. In each run, the number of objective function evaluations up to the stopping criterion being reached, n_fe, and the best reached value of the objective function, f_bs, are recorded;

3. The whole data achieved (all runs performed by all algorithms) is then submitted to a PCA, in order to obtain uncorrelated data with regard to the merit functions;

4. For each merit criterion:

(a) A bootstrapping procedure is performed over the uncorrelated data: bootstrapping is employed to estimate the probability distribution function of the estimator mean, for all algorithms analyzed.

(b) The stochastic dominance analysis is used to rank the algorithms, based on the PDFs found in bootstrapping. The algorithms which are not dominated by any other one receive rank 1, the algorithms which are dominated only by the algorithms of rank 1 receive rank 2, and so forth.
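The ranking rule of step 4(b) can be sketched as follows (an illustrative sketch; `dominated_by` encodes hypothetical pairwise dominance results among four made-up algorithms, and the helper assumes the dominance relation is acyclic, which holds for first order stochastic dominance):

```python
def assign_ranks(dominated_by):
    """Rank 1 = dominated by nobody; rank k+1 = dominated only by
    algorithms already ranked <= k.  `dominated_by[a]` is the set of
    algorithms that stochastically dominate algorithm `a`."""
    ranks, remaining = {}, set(dominated_by)
    level = 1
    while remaining:
        # Current front: all dominators already received a rank
        front = {a for a in remaining if not (dominated_by[a] & remaining)}
        for a in front:
            ranks[a] = level
        remaining -= front
        level += 1
    return ranks

# Hypothetical dominance relations among four algorithms:
dominated_by = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}
print(sorted(assign_ranks(dominated_by).items()))
# [('A', 1), ('B', 2), ('C', 2), ('D', 3)]
```

Algorithms left mutually non-dominated (here B and C) share the same rank, which is how statistically indistinguishable algorithms end up grouped in the results.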

This procedure provides a rank of the algorithms for each criterion under which they are being analyzed. A multiobjective analysis can be easily derived from this rank:

• If an algorithm A is better than an algorithm B in one criterion without being worse in the other, it is possible to say that A dominates B, taking into account all the merit functions considered.

It is possible to note that this approach is more powerful than former methods that are based on mean values, since it considers the whole distribution instead of considering only the mean. This makes it possible to take into account the spread of the merit function values for each one of the algorithms.

It is important to make clear that the accuracy of this approach is strongly dependent on the quality of the estimated PDFs. When the PDFs present high variability over different bootstrapping runs, it is possible to draw conclusions which do not reflect exactly what is present in the data. Larger samples tend to provide better PDF estimations, and should be used when possible. However, as can be seen in the results section, the approach proposed in this paper has led to reliable results, even for small samples (15 and 30 points), showing that this approach can be useful in complex problems (where each algorithm run requires high computation time). Finally, the use of unidimensional PDFs instead of a joint PDF (which considers all merit functions jointly) does not introduce analysis errors, since the data is uncorrelated by PCA.

4. NUMERICAL RESULTS

The proposed comparison methodology has been tested on the same instances proposed in [1], for the Optimal Communication Spanning Tree problem (OCST). The comparison approach proposed in that paper has been used as a benchmark for testing the approach proposed in this work. The comparison method proposed in [1] is based on Kruskal-Wallis Nonparametric Tests and Multiple Comparisons [3]. Although this method is general, and consequently can be applied to any kind of distribution, it is less powerful than parametric approaches, due to the characteristics of non-parametric tests: the lower power of those tests increases the chance of not finding significant differences between algorithms that could be considered different in practical situations. It is expected that the approach proposed in this paper presents a higher capacity of discriminating between algorithms that could be significantly different for practical purposes.

4.1 Optimal Communication Spanning Tree

In the Optimal Communication Spanning Tree problem [12], the algorithm looks for the spanning tree which presents minimum cost and complies with the communication requirements between the nodes [17]. This problem has been proved to be MAX SNP-hard [14, 17].

The problem can be stated as follows:

min Σ_{i,j ∈ V} R_{i,j} · C^X_{i,j}    (4)

where:

C^X_{i,j} is the sum of the weights of the edges in the path i−j (in the tree X);

R_{i,j} is the communication requirement between i and j;

V is the set of vertices.

The only constraint considered here is that the network must be a tree. Additionally, constraints which limit the maximum degree of each node could be considered, for modeling equipment used in real cases, such as hubs or switches (which present a limited number of ports).
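The objective (4) can be evaluated for a candidate tree with a short routine like the following (an illustrative sketch; the edge list, weights and requirement matrix are made up, and each unordered pair is counted once, since (4) leaves the pair ordering implicit):

```python
from collections import defaultdict, deque

def ocst_cost(edges, R):
    """Objective (4): sum over vertex pairs of the requirement R[i][j]
    times the total weight of the unique i-j path in the tree given by
    `edges`, a list of (i, j, weight) tuples."""
    adj = defaultdict(list)
    for i, j, w in edges:
        adj[i].append((j, w))
        adj[j].append((i, w))
    nodes = sorted(adj)
    total = 0.0
    for s in nodes:
        # BFS from s: distance along the unique tree path to every node
        dist, queue = {s: 0.0}, deque([s])
        while queue:
            u = queue.popleft()
            for v, w in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + w
                    queue.append(v)
        for t in nodes:
            if s < t:  # count each unordered pair once
                total += R[s][t] * dist[t]
    return total

edges = [(0, 1, 2.0), (1, 2, 3.0)]  # path tree 0-1-2
R = {0: {1: 1.0, 2: 1.0}, 1: {2: 1.0}}
print(ocst_cost(edges, R))  # 10.0 = 2*1 + 5*1 + 3*1
```

Because the network is a tree, the i−j path is unique, so a breadth-first traversal from each source suffices to obtain all path costs C^X_{i,j}.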

As in reference [1], the comparison approach has been employed to compare 12 genetic algorithms, when they are applied to solve 10, 25 and 50 node instances of the OCST problem.

In order to make possible the comparison of the approaches, the best function value and the computation time have been considered here as merit criteria. The algorithms have been tested on a Pentium IV (Prescott) at 3.2GHz with 1GB of RAM, using the Matlab 7 environment. Although the computation times are not comparable with other approaches, since they are strongly dependent on the hardware and software used, the time ratio between the methods can provide useful information about the computational cost of the methods. The algorithms have been labeled as follows:

Label  Decoding               Crossover     Mutation
A      Characteristic vector  single point  swap
B      Prüfer numbers         single point  point
C      Prüfer numbers         single point  swap
D      Network random keys    single point  point
E      Network random keys    single point  swap
F      Edge sets              single point  point
G      Edge sets              single point  swap
H      Node biased            single point  point
I      Node biased            single point  swap
J      Link and node biased   single point  point
K      Link and node biased   single point  swap
L      -                      kruskal       kruskal

Finally, the following parameters have been adopted in all simulations:

• Number of runs: 30 runs per method;

• Population size: 50 individuals;

• Crossover probability: 0.80;

• Mutation probability: 0.45 (per individual);

• Linear ranking and roulette-wheel selection;

• Generational replacement with elitism;

• Stop criterion: 100 generations without improvement.

4.2 Comparison Results - Full Data

The results achieved by the former approach [1] (which will be referred to here as Kruskal+MC) have been reproduced here to make the evaluation of the comparison methods easier. Tables 1 and 2 show the results (convergence time and best function value) obtained on three instances of the OCST problem. The ordering of the algorithm performances provided by both comparison approaches is presented together with the tables. The following notation has been adopted for exposing the results: the methods are listed in descending order of performance; in this list, underlining and overlining are used to indicate which methods did not exhibit statistically significant differences from one to the other.

Table 1: OCST (30 runs) - Best function value

       10 nodes          25 nodes          50 nodes
Lab.   avg     sd        avg     sd        avg     sd
A      3.65e5  7.25e3    2.58e6  1.09e4    1.19e7  3.52e5
B      3.67e5  1.24e4    2.90e6  2.01e5    1.32e7  7.53e5
C      3.74e5  1.62e4    2.85e6  2.00e5    1.30e7  8.96e5
D      3.75e5  1.59e4    2.93e6  2.81e5    1.42e7  1.36e6
E      3.74e5  1.57e4    2.80e6  1.59e5    1.42e7  1.64e6
F      3.64e5  -         2.57e6  9.89e3    1.16e7  2.28e5
G      3.64e5  -         2.60e6  2.13e4    1.25e7  7.30e5
H      3.64e5  -         2.56e6  6.24e3    1.11e7  2.53e4
I      3.64e5  1.65e3    2.58e6  1.02e4    1.12e7  3.97e4
J      3.64e5  1.44e3    2.64e6  3.37e4    1.17e7  2.98e5
K      3.64e5  9.32e2    2.62e6  3.16e4    1.17e7  3.07e5
L      3.64e5  -         2.57e6  5.60e3    1.20e7  4.99e5

Kruskal+MC:
10 nodes: FGHLKIJABECD
25 nodes: HLFIAGKJECBD
50 nodes: HIFKJALGCBED

Proposed approach:
10 nodes: LFGHKIJABECD
25 nodes: HLAIFGKJECBD
50 nodes: HIFKJALGCBED

Table 2: OCST (30 runs) - Computation time (ms)

       10 nodes           25 nodes           50 nodes
Lab.   avg      sd        avg      sd        avg      sd
A      1.956e4  2.628e3   1.325e5  2.235e4   7.317e5  2.262e5
B      8.984e3  7.426e2   3.009e4  7.329e3   1.917e5  4.363e4
C      8.578e3  1.655e3   2.246e4  4.737e3   1.272e5  4.821e4
D      1.492e4  2.255e3   7.437e4  1.828e4   3.376e5  1.046e5
E      1.648e4  3.692e3   7.306e4  2.400e4   3.194e5  1.132e5
F      1.516e4  1.225e3   6.471e4  9.189e3   4.280e5  9.074e4
G      1.638e4  2.266e3   7.306e4  2.147e4   3.412e5  1.241e5
H      1.648e4  8.892e2   7.474e4  1.331e4   5.401e5  1.531e5
I      1.682e4  4.310e2   5.902e4  1.546e4   3.409e5  9.116e4
J      1.773e4  1.085e3   1.103e5  2.824e4   7.909e5  2.169e5
K      1.913e4  1.785e3   1.321e5  4.066e4   8.369e5  2.968e5
L      2.001e4  1.745e3   9.654e4  1.227e4   6.972e5  1.792e5

Kruskal+MC:
10 nodes: CBDFGEHIJKAL
25 nodes: CBFEGDIHLJAK
50 nodes: CBEDGFIHLAJK

Proposed approach:
10 nodes: CBFDGHIEJKAL
25 nodes: CBFGIEHDLAJK
50 nodes: CBIGFHEDLAJK

Tables 1 and 2 show that the comparison methods have presented coherent results: algorithms which have been classified as efficient by one method have also received a similar classification by the other one. However, it is important to see that the approach proposed in this paper has shown the capacity of distinguishing between some algorithms that have been considered equivalent by the Kruskal+MC method. This was expected, and can be credited to the higher power of the statistical tests employed. The Kruskal+MC method presents some intransitivity in its classification that can make the decision process harder: for instance, in Table 1 (for 25 nodes), Kruskal+MC determined that the algorithm H is equivalent to L and L is equivalent to G; however, H is considered statistically better than G. This phenomenon has not occurred in the approach proposed in this work.

From Table 1 and the results obtained by the bootstrapping-based approach, it is possible to conclude that the algorithm H has presented the best convergence performance. This conclusion could not be safely drawn with the Kruskal+MC approach, since it has not detected any significant difference between H and I in any instance considered. With respect to the convergence time, the comparison approach proposed in this work has stated that the algorithm C is significantly faster than the other ones. Once more, this conclusion could not be drawn by the Kruskal+MC approach, since it could not distinguish between B and C.

As discussed in Section 3, it is possible to compare the algorithms considering both criteria jointly. It is assumed that one algorithm is better than another if it is better in one criterion without being worse in the other one. This methodology leads to the following classification:

10 nodes: FBGHCLIJDKEA
25 nodes: CFHIBGLAEDJK
50 nodes: CHIBFAGJKDEL

This classification suggests that the algorithms H and C are the best ones with regard to convergence capacity and time required, respectively. The algorithms F and I also represent intermediate choices, presenting good convergence (although lower than H) and requiring a computation time smaller than H. The choice between these algorithms should be made considering the time available to achieve a good solution. In the authors' opinion, the algorithms H and I are the most suitable ones in this case. The algorithm C should not be taken as a reasonable choice, since it presents very poor convergence performance, and could lead to expensive solutions.

4.3 Comparison Results - Reduced Data

A lower dimension set of 15 algorithm runs has been built by sampling (without replacement) 15 runs of the 30 previously performed. The intention of this test is to evaluate the effect of the reduction of the number of algorithm runs on the results achieved by the comparison approaches. If one comparison method requires a smaller set of runs to efficiently compare two or more algorithms, it can be considered more suitable, since the acquisition of additional data to perform the comparison carries a high computational cost (a whole algorithm run is required to find a single point for comparison).

From Tables 3 and 4, it is noticeable that the smaller set of algorithm runs has reduced the capacity of Kruskal+MC to detect significant differences between the algorithms. When this comparison method is applied over the reduced set, it provides a sorting similar to the one achieved with the larger set (with 30 runs). However, the uncertainty about the extent to which each method is better than the other ones increases significantly, which makes the decision of which algorithm should be used harder.

On the other hand, the smaller number of algorithm runs does not carry too many losses to the bootstrapping-based approach. It is obvious that some information is lost, since a small sample often implies a less reliable approximating function. However, at least in the particular case discussed in this paper, this loss has not caused significant changes in the algorithm sorting. This is an interesting property of the proposed method, since it seems to provide efficient comparisons from small sets of algorithm runs.

5. CONCLUSION AND FUTURE WORK

5.1 Concluding Remarks

This paper has proposed a new methodology for the comparison of evolutionary algorithms that is based on the techniques of (a) Principal Component Analysis, which is used to make the data uncorrelated; (b) Bootstrapping, which is employed to build the probability distribution function of the merit functions; and (c) Stochastic Dominance Analysis, which performs a multicriteria ordering between algorithms, considering their ability to reach solution accuracy and the related computational effort.

The proposed technique has presented a high capability to discriminate between different algorithms, allowing the detection of statistically significant differences between performances which were not detected by former methodologies.

Table 3: OCST (15 runs) - Best function value

       10 nodes          25 nodes          50 nodes
Lab.   avg     sd        avg     sd        avg     sd
A      3.64e5  -         2.57e6  8.76e3    1.19e7  3.03e5
B      3.67e5  1.16e4    2.93e6  2.29e5    1.32e7  9.19e5
C      3.73e5  1.58e4    2.84e6  1.63e5    1.28e7  6.72e5
D      3.71e5  1.27e4    2.92e6  2.19e5    1.42e7  1.43e6
E      3.69e5  9.57e4    2.85e6  1.91e5    1.42e7  1.59e6
F      3.64e5  -         2.57e6  9.26e3    1.16e7  1.98e5
G      3.64e5  -         2.60e6  2.33e4    1.25e7  7.64e5
H      3.64e5  -         2.57e6  8.40e3    1.11e7  8.60e3
I      3.64e5  2.13e3    2.58e6  1.22e4    1.12e7  3.51e4
J      3.64e5  1.33e3    2.65e6  3.92e4    1.17e7  3.35e5
K      3.64e5  9.76e2    2.63e6  2.70e4    1.17e7  3.78e5
L      3.64e5  -         2.57e6  5.10e3    1.19e7  3.78e5

Kruskal+MC:
10 nodes: AFGHLKJIBEDC
25 nodes: HLFAIGKJCEDB
50 nodes: HIFJKALGCBED

Proposed approach:
10 nodes: AFGHLKJIBEDC
25 nodes: HLAFIGKJCEDB
50 nodes: HIFJKALGCBED

5.2 Future Work

There are some points being considered as future extensions of this work:

• To exploit a parametric version of bootstrapping: when applied to estimate the characteristics of the mean operator, bootstrapping often returns normal (or normal-like) distributions, as can be justified by the Central Limit Theorem [15]. This property can be exploited in order to differentiate algorithms which have been considered similar in the stochastic dominance analysis.

• To extend the proposed comparison approach to multiobjective problems: the approach proposed in this paper can be extended to multiobjective problems by replacing the convergence criteria with some Pareto-set quality indicator. Scalar Pareto-set quality metrics, such as the S-Metric and the Integrated Sphere Counting, can be used for this task.
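The first extension above rests on a property that is easy to check numerically: the bootstrap distribution of the mean of a clearly non-normal sample concentrates around the sample mean with standard error close to s/sqrt(n), as the Central Limit Theorem predicts. The sample data and tolerances below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# A clearly non-normal (exponential) sample of merit values.
sample = rng.exponential(scale=5.0, size=50)

# Bootstrap distribution of the mean: resample with replacement.
idx = rng.integers(0, len(sample), size=(5000, len(sample)))
boot_means = sample[idx].mean(axis=1)

# The bootstrap means cluster around the sample mean, with spread
# close to the classical standard error s / sqrt(n).
print(abs(boot_means.mean() - sample.mean()) < 0.1)                    # True
print(abs(boot_means.std() - sample.std(ddof=1) / np.sqrt(50)) < 0.2)  # True
```

Fitting a normal distribution to `boot_means` would then allow parametric tests between algorithms that the nonparametric dominance analysis leaves unordered.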
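As an illustration of the S-Metric mentioned above, the sketch below computes the hypervolume of a two-objective (minimization) Pareto-front approximation with respect to a reference point. The function name and the sample front are hypothetical, for illustration only:

```python
def hypervolume_2d(front, ref):
    """S-Metric of a 2-objective minimization front w.r.t. a reference point."""
    # Sort by the first objective and filter dominated points.
    pts = sorted(set(front))
    nd, best_f2 = [], float("inf")
    for f1, f2 in pts:
        if f2 < best_f2:
            nd.append((f1, f2))
            best_f2 = f2
    # Sum the rectangular slices between consecutive front points and ref.
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in nd:
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv

front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
print(hypervolume_2d(front, ref=(4.0, 4.0)))  # prints 6.0
```

A larger hypervolume indicates a better Pareto-set approximation, so this scalar value could play the role of the merit function in the comparison pipeline.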

6. ACKNOWLEDGMENTS

The authors would like to thank the Brazilian agencies Capes, CNPq and Fapemig for the financial support.

Table 4: OCST (15 runs) - Computation time (ms)

| Lab. | 10 nodes avg | 10 nodes sd | 25 nodes avg | 25 nodes sd | 50 nodes avg | 50 nodes sd |
|------|--------------|-------------|--------------|-------------|--------------|-------------|
| A | 1.986e4 | 2.664e3 | 1.478e5 | 2.311e4 | 7.842e5 | 2.213e5 |
| B | 8.974e3 | 8.852e2 | 3.965e4 | 7.581e3 | 2.075e5 | 4.319e4 |
| C | 8.730e3 | 1.601e3 | 3.072e4 | 3.284e3 | 1.466e5 | 4.711e4 |
| D | 1.570e4 | 2.495e3 | 9.002e4 | 1.823e4 | 3.515e5 | 1.040e5 |
| E | 1.519e4 | 2.476e3 | 8.940e4 | 2.319e4 | 3.823e5 | 1.292e5 |
| F | 1.531e4 | 1.452e3 | 8.035e4 | 9.831e3 | 4.737e5 | 1.006e5 |
| G | 1.692e4 | 2.438e3 | 8.970e4 | 2.388e4 | 3.823e5 | 1.312e5 |
| H | 1.645e4 | 4.474e2 | 1.165e5 | 1.438e4 | 7.136e5 | 1.634e5 |
| I | 1.673e4 | 3.829e2 | 9.570e4 | 1.413e4 | 4.742e5 | 8.480e5 |
| J | 1.745e4 | 1.116e3 | 1.520e5 | 2.759e4 | 9.112e5 | 2.364e5 |
| K | 1.912e4 | 1.563e3 | 1.637e5 | 4.161e4 | 8.590e5 | 1.875e5 |
| L | 2.009e4 | 1.218e3 | 1.221e5 | 1.415e4 | 7.615e5 | 1.623e5 |

Kruskal+MC rankings:
10 nodes: CBEFDHIGJKAL
25 nodes: CBFEGDIHLAJK
50 nodes: CBDEGFIHLAKJ

Proposed approach rankings:
10 nodes: CBEFHIGDJKAL
25 nodes: CBFGIEHDLAJK
50 nodes: CBIGFDHELAKJ

7. REFERENCES

[1] E. G. Carrano, C. M. Fonseca, R. H. C. Takahashi, L. C. A. Pimenta, and O. M. Neto. A preliminary comparison of tree encoding schemes for evolutionary algorithms. In Proc. IEEE International Conference on Systems, Man and Cybernetics, Vancouver, Canada, 2007.

[2] M. R. Chernick. Bootstrap Methods: A Practitioner's Guide. Wiley Series in Probability and Statistics. John Wiley and Sons, 1999.

[3] W. J. Conover. Practical Nonparametric Statistics. Wiley, 3rd edition, 1999.

[4] B. G. W. Craenen, A. E. Eiben, and J. I. van Hemert. Comparing evolutionary algorithms on binary constraint satisfaction problems. IEEE Transactions on Evolutionary Computation, 7:424–444, 2003.

[5] P. Dutta and D. Dutta Majumder. Performance comparison of two evolutionary schemes. In Proc. International Conference on Pattern Recognition, pages 659–663, Vienna, Austria, 1996.

[6] B. Efron. Bootstrap methods: Another look at the jackknife. The Annals of Statistics, 7:1–26, 1979.

[7] B. Efron. Nonparametric estimates of standard error: The jackknife, the bootstrap and other methods. Biometrika, 68:589–599, 1981.

[8] B. Efron. The jackknife, the bootstrap, and other resampling plans. Technical report, Society of Industrial and Applied Mathematics CBMS-NSF Monographs, 1982.

[9] M. Ehrgott. Multicriteria Optimization. Springer, 2000.

[10] K. Fukunaga. Introduction to Statistical Pattern Recognition. Elsevier, 2nd edition, 1990.

[11] R. C. Gonzalez and R. E. Woods. Digital Image Processing. Addison Wesley, 2nd edition, 1992.

[12] T. C. Hu. Optimum communication spanning trees. SIAM Journal of Computing, 3:188–195, 1974.

[13] E. Oja. Neural networks, principal components, and subspaces. International Journal of Neural Systems, 1:61–68, 1989.

[14] C. H. Papadimitriou and M. Yannakakis. Optimization, approximation, and complexity classes. Journal of Computer and System Sciences, 43:425–440, 1991.

[15] A. Papoulis. Probability, Random Variables and Stochastic Processes. McGraw Hill, 3rd edition, 1991.

[16] S. Pemmaraju and S. Skiena. Implementing Discrete Mathematics: Combinatorics and Graph Theory with Mathematica. Cambridge University Press, 2003.

[17] S. Soak, D. W. Corne, and B. Ahn. The edge-window-decoder representation for tree-based problems. IEEE Transactions on Evolutionary Computation, 10:124–144, 2006.

[18] R. H. C. Takahashi, J. A. Vasconcelos, J. A. Ramirez, and L. Krahenbuhl. A multiobjective methodology for evaluating genetic operators. IEEE Transactions on Magnetics, 39:1321–1324, 2003.

[19] E. Zitzler, L. Thiele, M. Laumanns, C. M. Fonseca, and V. G. Fonseca. Performance assessment of multiobjective optimizers: An analysis and review. IEEE Transactions on Evolutionary Computation, 7:117–132, 2003.

