
On Cross-Correlation Evaluation Model of Internet Macroscopic Topology by Genetic Algorithm

Ye XU

College of Information Science and Engineering, Shenyang Ligong University, Shenyang, China

Email: xuy.mail@gmail.com

Zhuo WANG

College of Information Science and Engineering, Shenyang Ligong University, Shenyang, China

Email: {zhuowang}@163.com

Abstract—Cross-correlation evaluation model, CCEM, was

mainly studied to evaluate quantitatively how similar two different topologies are to each other, and further used to evaluate whether a topology generated by an Internet topology model is close to the real Internet. SLS (Signless Laplacian Spectra) is used to quantitatively identify the topology properties of both the Internet generated by the model and the Internet obtained from real measurement. This procedure yields SLS eigenvalue sequences, on which a cross-correlation calculation is then performed to quantify the difference. Finally, a recommended way of using the CCEM within a Genetic Algorithm is given.

Index Terms—Cross-correlation, Internet topology,

topology evaluation, SLS

I. INTRODUCTION

Research on Internet topology modeling has recently become a hot topic in Internet-related research fields [1]. There have been basically three phases of research in Internet topology modeling from the 1980s until now [4]. The latest phase began after Faloutsos et al. found power-law distributions in the Internet topology structure in 1999 [6], and it has greatly helped in discovering the characteristics of Internet topology. Based on these power-law findings, researchers have developed many kinds of Internet topology models. All of these models give a mathematical way to construct an Internet graph; however, they can only be called qualitative or quasi-quantitative models, because the way they are constructed is not completely quantitative.

To construct a completely quantitative model, a quantitative evaluation algorithm is necessary. The Cross-correlation Evaluation Model, CCEM, which combines methods from graph theory, spectral density [5] and the correlation algorithm [15], is studied in this paper.

A. Spectral density introduction

An undirected graph G can be denoted by its symmetric adjacency matrix A. If there is a link between node i and node j in G, then $A_{ij} = A_{ji} = 1$; otherwise $A_{ij} = A_{ji} = 0$. The eigenvalues of G are the eigenvalues of its matrix A, and they are denoted as $\lambda_1, \lambda_2, \dots, \lambda_n$. Research in graph theory shows that the eigenvalues of a graph are closely related to the structural properties of the graph topology, so studying a graph's eigenvalues is useful in topology research.

The spectrum of a graph G is the set of eigenvalues of its adjacency matrix A together with their multiplicities [2], and it is denoted as Eq. (1).

$$\mathrm{Spec}(G) = \begin{pmatrix} \lambda_1 & \cdots & \lambda_n \\ m_1 & \cdots & m_n \end{pmatrix} \tag{1}$$

where $m_i$ is the multiplicity of the eigenvalue $\lambda_i$.

The spectral density $\rho(\lambda)$ is the eigenvalue density of the adjacency matrix A, and it can be denoted as [2] [3] [4]:

$$\rho(\lambda) = \frac{1}{N} \sum_{i=1}^{n} \delta(\lambda - \lambda_i) \tag{2}$$

where $\lambda_i$ is the ith eigenvalue of the adjacency matrix A, and N is the number of eigenvalues. $\rho(\lambda)$ approaches a continuous function when $N \to \infty$.
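As an illustration of Eq. (2), the following minimal sketch computes the adjacency eigenvalues of a small graph and the share of eigenvalues at each distinct value. It assumes NumPy is available; the function names and the rounding step used to group numerically equal eigenvalues are choices of this sketch, not part of the paper.

```python
import numpy as np


def adjacency_spectrum(adjacency):
    """Return the eigenvalues of a symmetric adjacency matrix, ascending."""
    a = np.asarray(adjacency, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1] or not np.allclose(a, a.T):
        raise ValueError("adjacency matrix must be square and symmetric")
    return np.linalg.eigvalsh(a)


def spectral_density(eigenvalues, decimals=4):
    """Discrete reading of Eq. (2): share of eigenvalues at each distinct value.

    Eigenvalues are rounded to `decimals` places (an assumption of this sketch)
    so that numerically equal values are counted together.
    """
    # Adding 0.0 turns any -0.0 produced by rounding into +0.0.
    rounded = np.round(np.asarray(eigenvalues, dtype=float), decimals) + 0.0
    values, counts = np.unique(rounded, return_counts=True)
    return values, counts / rounded.size


# Toy example: a star graph with node 0 linked to nodes 1, 2 and 3.
A = np.zeros((4, 4), dtype=int)
for leaf in (1, 2, 3):
    A[0, leaf] = A[leaf, 0] = 1

values, density = spectral_density(adjacency_spectrum(A))
for lam, rho in zip(values, density):
    print(f"lambda = {lam:8.4f}   rho = {rho:.4f}")
```

For a finite graph the density is a set of spikes, each a multiple of 1/N, which is why a continuous curve only emerges as N grows.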

B. Experiment samples

The experiment samples in this paper are router-level Internet samples measured by CAIDA (footnote 1). The raw measurement results are the router-level Internet topology data measured on 30 Jan. 2006 (footnote 2) from as many as twenty-one CAIDA monitors (footnote 3). After IP alias resolution, we obtain twenty-one sets of measurement samples.

Footnote 1: CAIDA, the Cooperative Association for Internet Data Analysis, is a worldwide research center for Internet-related research. CAIDA has more than thirty monitor nodes distributed throughout the world, measuring and monitoring the Internet.

This work is supported by the National Natural Science Foundation of China (60802031).

JOURNAL OF NETWORKS, VOL. 6, NO. 2, FEBRUARY 2011

© 2011 ACADEMY PUBLISHER

doi:10.4304/jnw.6.2.230-237

We then handle the sampling bias. First, we merge the measuring results of all twenty-one monitors to form one complete testing sample, which reduces the impact of sampling bias as far as possible. This most complete sample is regarded as the key sample in the experiments of this paper.

However, we also built several inferior, incomplete testing samples for comparison: sample(1) comprises data from only one monitor (arin), sample(2) from two monitors (arin, b-root), and so on up to sample(20), which comprises data from twenty monitors.

Finally, we get an experimental Internet sample with 1,145,841 routers (nodes) and 2,907,638 links from the twenty-one monitors. After IP alias resolution [16] [17], the size of the sample is reduced to 29,367 routers and 190,280 links [12], which is still too large to be handled easily by computer.

To simplify the computation, we performed a second-order sampling (re-sampling) operation on the experiment samples. The re-sampling rules are:

1) The re-sampling operation is completely random; it can start from any effective node in the target graph;

2) The re-sampled result must be a connected graph;

3) The re-sampled result should cover as many nodes as possible, i.e., node selection takes priority over link selection.

Finally, the re-sampled Internet topology graph is converted into an adjacency matrix for further calculation.
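The three re-sampling rules can be sketched as a random growth process. The paper does not specify its re-sampling tool, so everything below is an assumption of this sketch: the function names, the frontier-based growth, and the choice to keep every source link between kept nodes.

```python
import random


def resample_connected(neighbors, target_nodes, seed=None):
    """Grow a random connected sub-graph with at most target_nodes nodes.

    neighbors: dict mapping each node to the set of its adjacent nodes.
    Returns (nodes, links); links are the source links between kept nodes.
    """
    rng = random.Random(seed)
    start = rng.choice(sorted(neighbors))        # rule 1: random start node
    kept = {start}
    frontier = sorted(neighbors[start])          # nodes adjacent to the kept set
    while frontier and len(kept) < target_nodes:
        node = frontier.pop(rng.randrange(len(frontier)))
        if node in kept:
            continue
        kept.add(node)                           # rule 2: attached to kept set
        # rule 3: every step adds a node; links are collected afterwards
        frontier.extend(n for n in sorted(neighbors[node]) if n not in kept)
    links = {(a, b) for a in kept for b in neighbors[a] if b in kept and a < b}
    return kept, links


def to_adjacency(nodes, links):
    """Convert a node set and link set into a 0/1 adjacency matrix."""
    index = {node: i for i, node in enumerate(sorted(nodes))}
    matrix = [[0] * len(index) for _ in index]
    for a, b in links:
        matrix[index[a]][index[b]] = matrix[index[b]][index[a]] = 1
    return matrix


# Toy source graph: a ring of 10 nodes.
ring = {i: {(i - 1) % 10, (i + 1) % 10} for i in range(10)}
nodes, links = resample_connected(ring, target_nodes=5, seed=1)
matrix = to_adjacency(nodes, links)
print(sorted(nodes), sorted(links))
```

Because each new node is always drawn from the neighbours of nodes already kept, the result stays connected by construction.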

II. POSSIBILITY OF USING SPECTRAL DENSITY IN DISTINGUISHING TOPOLOGY GRAPHS

Before using spectral density to construct CCEM, we first test whether it can be used to distinguish topology graphs (including the Internet topology) or not.

Three representative graphs were selected for the test: the ER random graph, the scale-free graph and the Internet topology graph.

A. Distinguishing ER random graph

According to [3], the spectral density of an ER random graph converges to a semicircle, and the lower part of the semicircle exhibits an exponential distribution, as shown in Fig. 1 from Ref. [3].

Footnote 2: The topology data measured on 30 Jan. 2006 were chosen because as many as twenty-one monitors provided effective measuring data on that day. On other days around that time, fewer monitors were effective.

Footnote 3: The twenty-one monitors are arin, b-root, cam, cdg-rssac, champagne, d-root, e-root, h-root, i-root, iad, ihug, k-root, lhr, m-root, mwest, neu1, nrt, riesling, sjc, uoregon and yto. The monitors are distributed over different continents so as to better measure the Internet throughout the world.

Figure 1. Diagram of ER Random Graph’s spectral density. The axes are calibrated by (Np(1-p))^0.5

B. Distinguishing scale-free graph

The spectral density of a scale-free graph generated by the BA model [3][8][9][10][11] exhibits a symmetric continuous curve with a triangular center and two power-law distributed sides, as shown in Fig. 2 from Ref. [8].

Figure 2. Diagram of Scale-free Graph’s spectral density. The axes are calibrated by (Np(1-p))^0.5

C. Distinguishing Internet topology graph

We can see from Fig. 1 and Fig. 2 that different graphs exhibit quite different spectra diagrams. Thus the spectral density can be used as a tool to distinguish graphs.

The Internet topology graph, as we know, is a type of graph different from both the ER graph and the scale-free graph, although it is somewhat similar to the scale-free one [1][6][12]. We therefore examine whether it is possible to distinguish the Internet topology graph from the scale-free one.

For simplicity and better comparison, we draw three Internet graphs with the re-sampling tool mentioned above. The sizes of the three samples after re-sampling are 30 nodes and 29 links, 300 nodes and 536 links, and 500 nodes and 753 links, respectively. Their eigenvalues and spectral densities are listed in Table I.


TABLE I
EIGEN VALUES AND SPECTRAL DENSITY OF THREE RE-SAMPLED INTERNET GRAPHS

      30 ips               300 ips               500 ips
λ (13)     ρ(λ)       λ (104)    ρ(λ)       λ (112)     ρ(λ)
-3.2196    0.0333     -8.7818    0.0033     -10.7058    0.0020
-2.6318    0.0333     -8.0004    0.0033     -10.2681    0.0020
…          …          …          …          …           …
-0.5663    0.0333     -0.1767    0.0033     -0.2635     0.0020
-0.0000    0.5333     -0.0000    0.5567     -0.0000     0.7320
0.5663     0.0333     0.1479     0.0033     0.1113      0.0020
…          …          …          …          …           …
2.6318     0.0333     8.8174     0.0033     10.9470     0.0020
3.2196     0.0333     14.1650    0.0033     12.3570     0.0020

Note: The value in the bracket is the total number of the eigen values.

The symmetry of the spectral density can be seen from Table I, and this is consistent with the spectral symmetry of scale-free graphs found in [3], [8]. At a coarse granularity, this correspondence shows that there is some similarity between the Internet graph and the scale-free graph, as mentioned previously.

However, there are differences between the graphs, and we plot the Internet's spectra in Fig. 3 for better comparison.

Figure 3. Spectral density diagrams of three Internet graphs. The sub-graph in the top-right is a plot zoomed in to [-5, 5] on axis x and [0, 0.2] on axis y for a better view

From Fig. 3, we first find that the three re-sampled graphs (30 ips, 300 ips and 500 ips) conform to one another completely: there are two small peaks at λ = ±1.0000, one distinct peak at λ = 0, and ρ(λ) < 0.005 for all λ < -1.0000 and λ > 1.0000.

The three graphs differ considerably in size and contents (specific routers and links) because of the re-sampling rules, and the conformity found in Fig. 3 shows that the spectral density gives similar results even though it is computed on different parts of the Internet. We therefore conclude that spectral density is suitable for representing the characteristics of the real Internet graph.

Next, we compare Fig. 3 with the scale-free graph (Fig. 2) and find that the center of the three spectral density curves in Fig. 3 has a triangular shape, which is similar to the scale-free graph. The two side parts, however, differ from those of the scale-free graph, since they follow neither an exponential distribution nor a power-law distribution. So the spectral density is able to distinguish the Internet graph from the scale-free graph.

Similarly, we distinguish the Internet graph from the ER graph by comparing Fig. 3 with Fig. 1, and the differences between the two figures are easy to see. So we conclude that the spectral density is able to distinguish the Internet graph from the ER graph.

Together with the fact that spectral density gives a quantitative description of Internet topology characteristics, we make use of it in CCEM for Internet topology modeling.

III. INTERNET TOPOLOGY CHARACTERS DISCOVERED BY SPECTRAL DENSITY

A. General spectral density

For a better view of the spectra distribution, we rescale the coordinate system by a factor of $\sqrt{Np(1-p)}$, so that the new axis X is $\lambda/\sqrt{Np(1-p)}$ and the new axis Y is $\rho(\lambda)\sqrt{Np(1-p)}$ [1] [3].

Furthermore, we enlarge the size of the re-sampled Internet topology graphs from 30, 300 and 500 ips (Fig. 3) to 300, 800, 2000, 3000 and 4000 ips (Fig. 4) so as to make the graphs closer to the real Internet.

We know that the more nodes a graph has, the closer it is to the real Internet. However, a graph with 4000 ips is the largest one in this paper, for two reasons: 1) computing ability is limited, and the efficiency of calculating the spectral density decreases sharply once the size of the graph exceeds 4000; 2) the Internet's characteristics can be well expressed through spectral density no matter how many nodes an Internet graph has. The latter was shown in Fig. 3 (graphs of different sizes conform in spectral density structure) and is shown again in Fig. 4.
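The rescaling of both axes by the factor (Np(1-p))^0.5 can be sketched as follows. The paper does not say how p is obtained for a measured graph; this sketch assumes p is the observed link density 2E/(N(N-1)), and the function name is hypothetical.

```python
import math


def rescale_spectrum(eigenvalues, densities, n_nodes, n_links):
    """Rescale (lambda, rho) pairs by sqrt(N p (1 - p)).

    Assumption of this sketch: p is the observed link density of the graph.
    """
    p = 2.0 * n_links / (n_nodes * (n_nodes - 1))
    if not 0.0 < p < 1.0:
        raise ValueError("link density must lie strictly between 0 and 1")
    scale = math.sqrt(n_nodes * p * (1.0 - p))
    xs = [lam / scale for lam in eigenvalues]      # new axis X
    ys = [rho * scale for rho in densities]        # new axis Y
    return xs, ys


# Example with the 300-node, 536-link sample size quoted in the text
# and two illustrative (lambda, rho) pairs.
xs, ys = rescale_spectrum([-1.0, 1.0], [0.0033, 0.0033], n_nodes=300, n_links=536)
print(xs, ys)
```

Dividing λ and multiplying ρ by the same factor keeps the area under the density unchanged, so curves from graphs of different sizes can be overlaid.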

Figure 4. Spectral density of five re-sampled graphs. The sub-graph in the top-right is a plot zoomed in to [-3, 3] on axis x and [0, 0.15] on axis y for a better view

From Fig. 4, we find that the spectral densities of all five graphs conform very well despite their different sizes. All five plots have their maximum at λ = 0 and their second maximum at around λ = 0.5.

Similar to what was found in Fig. 3, the conformity among the five Internet graphs shows that a small Internet graph obtained with the re-sampling tool can be enough to represent the key properties of the real Internet topology through spectral density. This means that experiments on the complete Internet topology graph are no longer necessary for studying its properties; a much smaller re-sampled graph with an appropriate algorithm can also be effective.

The basic idea of this paper is to distinguish topology graphs by comparing their spectral densities. However, the general spectral density is somewhat coarse-grained. There is another especially valuable kind of spectrum, named Signless Laplacian Spectra (SLS), which gives further and finer information on a graph's properties [14].

B. SLS

The SLS matrix |L| of a graph G is defined as |L| = D + A, where D is the diagonal matrix of the node degrees of G, and A is the adjacency matrix of G [14]. The SLS is the set of eigenvalues of |L|. Research in graph theory indicates that SLS is the best spectrum for distinguishing different graphs [14]. In this paper, SLS is applied to four re-sampled Internet topology graphs (3000 ips each). The result is illustrated in Fig. 5.
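The definition |L| = D + A translates directly into a few lines of code. The sketch below assumes NumPy; the function name `sls_sequence` is hypothetical, and the descending sort anticipates the ordering used for the axis x of Fig. 5.

```python
import numpy as np


def sls_sequence(adjacency):
    """Eigenvalues of the signless Laplacian |L| = D + A, in descending order."""
    a = np.asarray(adjacency, dtype=float)
    degree = np.diag(a.sum(axis=1))          # D: node degrees on the diagonal
    signless_laplacian = degree + a          # |L| = D + A
    # eigvalsh returns ascending eigenvalues of a symmetric matrix; reverse them.
    return np.linalg.eigvalsh(signless_laplacian)[::-1]


# Toy example: a path graph on three nodes (0 - 1 - 2).
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
sequence = sls_sequence(path)
print(np.round(sequence, 4))
```

For this path graph the characteristic polynomial of |L| factors as λ(λ - 1)(λ - 3), so the sequence is 3, 1, 0 up to floating-point error.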

Figure 5. SLS analysis results on four 3000-ip graphs, where axis y is logarithmic, and axis x is the rank of the eigenvalues in descending order

From Fig. 5, we first see that all four curves are highly similar, although the four samples are completely random and different from each other. Again, this can be regarded as further evidence that the re-sampled samples can effectively represent the properties of the real Internet graph.

There are two evident horizontal lines where the SLS eigenvalue equals 1 (10^0) and 2, which means that the value 1 occurs most often among the SLS eigenvalues of the Internet topology graph, and the value 2 second most often. All four samples clearly exhibit the same properties in Fig. 5.

For the other parts of Fig. 5, i.e., the parts where SLS > 2 and SLS < 1, we make further studies by performing power-law distribution fitting [1]. The fitting results are illustrated in Fig. 6 and Fig. 7.

From Fig. 6, we can see an obvious power-law relationship between the SLS eigenvalues and their corresponding descending order, and the fitting result ACC (absolute value of the correlation coefficient) is greater than 0.9, meaning that the fit is highly acceptable. The power-law relationship found here is consistent with what was found in the spectral density research on China CERNET in [1].

In Fig. 7, however, there is no clear power-law relationship, since ACC is rather small. This can also be regarded as a criterion for identifying the Internet graph.

Figure 6. Power law distribution fitting results with descending eigen value when SLS>2 of four re-sampled graphs.

Figure 7. Power law distribution fitting results with descending eigen value when SLS<1.
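A power-law fit of eigenvalue against descending rank, with ACC as the quality measure, might look like the sketch below. The paper does not describe its fitting procedure; this sketch assumes an ordinary least-squares line on log-log axes (a common way to fit power laws) and uses a hypothetical function name.

```python
import numpy as np


def power_law_fit(sls_values):
    """Fit value ~ c * rank**slope on log-log axes.

    Returns (slope, acc), where acc is the absolute value of the correlation
    coefficient between log(rank) and log(value). All values must be positive.
    """
    values = np.sort(np.asarray(sls_values, dtype=float))[::-1]
    ranks = np.arange(1, values.size + 1, dtype=float)
    log_rank, log_value = np.log10(ranks), np.log10(values)
    slope, _intercept = np.polyfit(log_rank, log_value, 1)
    acc = abs(np.corrcoef(log_rank, log_value)[0, 1])
    return slope, acc


# Synthetic check: an exact power law with exponent -0.8.
ranks = np.arange(1, 101, dtype=float)
slope, acc = power_law_fit(50.0 * ranks ** -0.8)
print(f"slope = {slope:.3f}, ACC = {acc:.3f}")
```

On a real SLS sequence `seq`, the two regions discussed above would be fitted separately, for example `power_law_fit(seq[seq > 2])` and `power_law_fit(seq[(seq > 0) & (seq < 1)])`.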

C. Selection for CCEM

Compared with the general spectral density, SLS is better because 1) SLS is recommended as the best spectrum in Ref. [14]; 2) SLS is equivalent to the general spectral density in quantitatively identifying the Internet graph by its eigenvalue sequence, but is better at revealing further characteristics of the Internet, such as the two horizontal phases at SLS = 1 and SLS = 2, the power-law distribution where SLS > 2 and the non-power-law distribution where SLS < 1.

So SLS is selected for studying CCEM.

IV. CROSS-CORRELATION EVALUATION MODEL

A. Transformation from SLS to data sequence

To evaluate an Internet model is to determine the differences between the generated Internet topology and the real Internet topology. SLS eigenvalue sequences are introduced to determine these differences quantitatively.

The SLS eigenvalues are a sequence of numbers representing the primary characteristics of the target graph, i.e., the Internet topology graph. Given the two eigenvalue sequences, one for the real topology and one for the generated topology, the remaining problem is to find an effective algorithm that evaluates how they differ.

CCEM is then used to evaluate whether a given or generated topology is similar to, or the same as, the real Internet topology. The first requirement of CCEM is to transform the SLS into a data sequence.

After the SLS eigenvalues are sorted in descending order, the data sequences are obtained and ready for the next evaluation step, as shown in Eq. (3) and Eq. (4).

$$u[m] = \mathrm{SLS}[m] \tag{3}$$

$$v[n] = \mathrm{SLS}[n] \tag{4}$$

where u[m] is the sequence of the real Internet topology, v[n] is the sequence of a given topology, and m and n denote the descending order of the SLS eigenvalues of the real Internet topology and of the given topology, respectively.

B. Cross-correlation algorithm

The cross-correlation algorithm is capable of distinguishing and identifying the differences between numerical sequences in a fully quantitative way [15], and it is defined in Eq. (5).

$$\rho_{uv}(n) = \frac{r_{uv}(n)}{\sqrt{r_{uu}(0)\, r_{vv}(0)}} \tag{5}$$

where n is the disalignment lag between u[m] and v[n], $r_{uv}(n)$ is the cross-covariance, and $r_{uu}(0)$ and $r_{vv}(0)$ are the autocorrelations of u[m] and v[n] with the disalignment lag set to 0, respectively. They are:

$$r_{uv}(n) = \frac{1}{N} \sum_{k=0}^{N-n-1} u(k)\, v(n+k) \tag{6}$$

$$r_{uu}(0) = \frac{1}{N} \sum_{k=0}^{N-1} u^{2}(k) \tag{7}$$

$$r_{vv}(0) = \frac{1}{N} \sum_{k=0}^{N-1} v^{2}(k) \tag{8}$$

where N is the length of u[m] and v[n]. Let $N_u$ = Length(u[m]) and $N_v$ = Length(v[n]); then:

$$N = \begin{cases} N_u, & \text{if } N_u = N_v \\ N_u + N_v - 1, & \text{if } N_u \ne N_v \end{cases} \tag{9}$$
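Eqs. (5) to (9) can be turned into a short function. The sketch below is one reading of those equations, not the authors' code: it handles non-negative lags only, and it assumes that sequences of unequal length are zero-padded to the common length N of Eq. (9), which the paper does not state.

```python
import math


def cross_correlation(u, v, lag=0):
    """Normalized cross-correlation rho_uv(lag) for 0 <= lag < N, after Eqs. (5)-(9)."""
    n_u, n_v = len(u), len(v)
    n = n_u if n_u == n_v else n_u + n_v - 1                     # Eq. (9)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < N")
    # Assumption of this sketch: pad both sequences with zeros up to length n.
    u = list(u) + [0.0] * (n - n_u)
    v = list(v) + [0.0] * (n - n_v)
    r_uv = sum(u[k] * v[lag + k] for k in range(n - lag)) / n    # Eq. (6)
    r_uu = sum(x * x for x in u) / n                             # Eq. (7)
    r_vv = sum(x * x for x in v) / n                             # Eq. (8)
    if r_uu == 0.0 or r_vv == 0.0:
        raise ValueError("sequences must not be all zeros")
    return r_uv / math.sqrt(r_uu * r_vv)                         # Eq. (5)


u = [5.0, 3.0, 2.0, 1.0]
print(cross_correlation(u, u, lag=0))   # identical sequences, zero lag
print(cross_correlation(u, u, lag=1))   # same sequence shifted by one position
```

With identical inputs and zero lag the numerator equals the denominator, so the result is 1; any shift drops terms from the numerator only.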

[Proof 1]: The cross-correlation maximum occurs if and only if the two given topologies are completely identical and the disalignment lag is 0.

Proof:

If the two given topologies are completely identical, then:

$$u[m] = v[n] \tag{10}$$

And if the disalignment lag is 0, with Eq. (6), we get:

$$r_{uv}(0) = \frac{1}{N} \sum_{k=0}^{N-1} u(k)\, v(k) \tag{11}$$

According to Eq. (10), we get:

$$r_{uv}(0) = r_{uu}(0) = r_{vv}(0) = \frac{1}{N} \sum_{k=0}^{N-1} u(k)\, u(k) = \frac{1}{N} \sum_{k=0}^{N-1} u^{2}(k) = E[u^{2}(k)] \tag{12}$$

First, we are going to prove:

$$r_{uu}(0) \ge |r_{uu}(j)| = |E[u(k)\, u(k+j)]|, \quad j \ne 0 \tag{13}$$

Consider a non-negative variable:

$$E[(u(k) \pm u(k+j))^{2}] \ge 0 \tag{14}$$

Expanding Eq. (14), we get:

$$E[u^{2}(k)] \pm 2E[u(k)\, u(k+j)] + E[u^{2}(k+j)] \ge 0 \tag{15}$$

With Eq. (12) and Eq. (13), we simplify Eq. (15) to:

$$2 r_{uu}(0) \pm 2 r_{uu}(j) \ge 0$$

Then:

$$-r_{uu}(0) \le r_{uu}(j) \le r_{uu}(0)$$

And:

$$r_{uu}(0) \ge |r_{uu}(j)|, \quad j \ne 0 \tag{16}$$

Now we have proved that the cross-correlation value reaches its maximum when u[m] = v[n] and the disalignment lag is set to 0.

Next, we are going to prove that when u[m] ≠ v[n], the maximum is still $r_{uu}(0)$ or $r_{vv}(0)$.

When the disalignment lag j ≠ 0, to simplify the proof procedure, we can set $r_{uv}(j)$ to be $r_{uu}(j)$, since u[m] and v[n] are both taken at lag j ≠ 0. So, according to Eq. (16), we get:

$$r_{uu}(0) \ge |r_{uv}(j)|, \quad j \ne 0$$

And for $r_{vv}(0)$, similar to $r_{uu}(0)$, we still get:

$$r_{vv}(0) \ge |r_{uv}(j)|, \quad j \ne 0$$

End proof.
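The bound that Proof 1 relies on can also be checked directly with the Cauchy-Schwarz inequality. The following is a sketch under the usual reading of Eqs. (5) to (8), with $r_{uv}(n)$ taken as the lagged sum of products $u(k)\,v(n+k)$ divided by N; it is offered as a supporting check, not as the authors' own derivation.

```latex
\begin{aligned}
\left| r_{uv}(n) \right|
  &= \frac{1}{N}\left| \sum_{k=0}^{N-n-1} u(k)\, v(n+k) \right|
   \le \frac{1}{N}\sqrt{\sum_{k=0}^{N-n-1} u^{2}(k)}\;
       \sqrt{\sum_{k=0}^{N-n-1} v^{2}(n+k)} \\
  &\le \sqrt{\frac{1}{N}\sum_{k=0}^{N-1} u^{2}(k)}\;
       \sqrt{\frac{1}{N}\sum_{k=0}^{N-1} v^{2}(k)}
   = \sqrt{r_{uu}(0)\, r_{vv}(0)}
\end{aligned}
\qquad\Longrightarrow\qquad
\left| \rho_{uv}(n) \right| \le 1 .
```

The first inequality is Cauchy-Schwarz; the second only adds non-negative terms to each sum. At lag 0, the value 1 is attained when v(k) is a positive multiple of u(k), and identical sequences are the special case where that multiple is 1.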

We then use the SLS eigenvalues from Fig. 5, i.e., the four SLS sequences from four real Internet topologies, to check whether Proof (1) holds.


From Fig. 8, it is clear that all four SLS sequences reach their maximum when the disalignment lag equals 0, consistent with what we proved in Proof (1).

Figure 8. Auto-correlation calculation of SLS eigen values with disalignment lags; all four SLS sequences come from Fig. 5.

Figure 9. Cross-correlation calculation of SLS eigen values with disalignment lags; all four SLS sequences come from Fig. 5. “SLS(1) and SLS(2)” means the cross-correlation calculation between SLS(1) and SLS(2), and likewise for the others.

In Fig. 9, we find that the cross-correlation still reaches its maximum when the disalignment lag equals 0, even though the four SLS sequences, SLS(1), SLS(2), SLS(3) and SLS(4), are different from each other.

The four SLS sequences, however, all come from the real Internet topology and are quite similar to each other. We can see that the maxima of the three cross-correlations nearly reach 1, close to the maximum value of the auto-correlation in Fig. 8. This is reasonable, because the topologies that SLS(1), SLS(2), SLS(3) and SLS(4) represent were shown to be similar in Section III, and they are again shown to be so close in topology structure that the cross-correlation values almost equal those of the auto-correlation, i.e., the four topologies are almost the same. At the same time, Proof (1) is confirmed.

So far, similar topologies always reach a maximum close to 1 in the cross-correlation calculation. What about dissimilar topologies? We select SLS(1), calculate its cross-correlation with three random sequences, and illustrate the results in Fig. 10.

Figure 10. Cross-correlation calculation of SLS(1) eigen values and three random sequences with disalignment lags; the SLS(1) sequence comes from Fig. 5. “SLS(1) and random(1)” means the cross-correlation calculation between SLS(1) and random sequence (1), and likewise for the others.

From Fig. 10, it is clear that the plot is quite different from that in Fig. 9. Firstly, the maximum of the cross-correlation is around 0.2, not 1 as in Fig. 8 and Fig. 9, meaning that SLS(1) is not similar to random sequences (1), (2) and (3), i.e., the topologies represented by SLS(1) and by the three random sequences are not alike. This is reasonable: the three random sequences originate from random operations, so they are unlikely to be identical to SLS(1); in other words, a randomly generated topology has very little chance of being similar to the real Internet topology.

Secondly, the rising parts of the curves are no longer close to zero, but close to 0.1. The reason is that part of each randomly generated sequence is “similar” in some way to part of the SLS(1) sequence. This “similarity”, however, is quite low, since the cross-correlation values are near 0.1 and 0.2, far from 1, the value obtained from completely identical topologies.

With Proof (1) and the illustrations in Fig. 8, 9 and 10, CCEM can be used to evaluate the difference between topologies and, more importantly, CCEM can function as a measuring scale to evaluate how close a given topology is to another one.

The result obtained from CCEM is a relatively large cross-correlation value if the two sequences (or the two topologies) are similar to each other, and a small value otherwise. A threshold is then usually set for making decisions when CCEM is used to evaluate an Internet topology model.

C. CCEM algorithm

The CCEM algorithm for the Internet topology is shown in Table II.


TABLE II
THE CCEM ALGORITHM FOR THE INTERNET TOPOLOGY MODELING

Steps    Operations

(1)      Get an adjacency matrix of the target Internet topology graph (as a template Internet graph) with the Internet re-sampling tool;
         /* the size of the matrix should be identical to that of the matrix from the Internet model; */

(2)      Get the SLS eigenvalue sequence of the template by SLS operations;
         /* this sequence is regarded as the base comparison sample; */

(3)      LOOP

(3.1)    Construct a modeled Internet with an Internet model (with specific parameters), and get its adjacency matrix;

(3.2)    Get its SLS eigenvalue sequence;

(3.3)    Perform the cross-correlation algorithm on the two sequences: the modeled sequence from (3.2) and the template sequence from (2);

         End LOOP when the cross-correlation result is greater than the threshold, which implies that the modeled Internet is similar enough to the real Internet;
         /* the threshold is adjustable. */
         Else adjust the parameters of the Internet model and continue the LOOP.
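The steps of Table II can be sketched as a small search loop. This is a minimal illustration assuming NumPy: `build_model` stands for any Internet model that returns an adjacency matrix of the same size as the template, `ring_lattice` is a hypothetical toy model used only to make the example runnable, and the similarity is evaluated at lag 0 only.

```python
import numpy as np


def sls_sequence(adjacency):
    """Descending eigenvalues of the signless Laplacian D + A."""
    a = np.asarray(adjacency, dtype=float)
    return np.linalg.eigvalsh(np.diag(a.sum(axis=1)) + a)[::-1]


def similarity(u, v):
    """Normalized cross-correlation of two equal-length sequences at lag 0."""
    return float(np.dot(u, v) / np.sqrt(np.dot(u, u) * np.dot(v, v)))


def ccem_search(template_adjacency, build_model, candidate_params, threshold):
    """Try model parameters in turn until the similarity exceeds the threshold."""
    template = sls_sequence(template_adjacency)           # steps (1)-(2)
    best_params, best_score = None, -1.0
    for params in candidate_params:                       # step (3)
        modeled = sls_sequence(build_model(params))       # steps (3.1)-(3.2)
        score = similarity(template, modeled)             # step (3.3)
        if score > best_score:
            best_params, best_score = params, score
        if score > threshold:                             # similar enough: stop
            break
    return best_params, best_score


def ring_lattice(k, n=12):
    """Hypothetical toy model: n nodes on a ring, each linked to k neighbours per side."""
    a = np.zeros((n, n))
    for i in range(n):
        for step in range(1, k + 1):
            a[i, (i + step) % n] = a[(i + step) % n, i] = 1
    return a


template_graph = ring_lattice(2)          # stands in for the re-sampled Internet graph
params, score = ccem_search(template_graph, ring_lattice, [1, 2, 3], threshold=0.99)
print(params, round(score, 4))
```

Because sorted, non-negative eigenvalue sequences tend to correlate strongly even for different graphs, the threshold in this toy run is set close to 1; Table II leaves the threshold adjustable.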

The size of the modeled Internet graph and that of the real Internet graph must be identical, and the user can choose this size. We know that real Internet graphs of different sizes are quite different, and even real Internet graphs of the same size that are re-sampled at different times are not identical to each other. So the result obtained from the algorithm may differ somewhat each time.

But we still consider the CCEM algorithm to be effective, because (1) the properties of the real Internet graphs obtained by the re-sampling rules are quite similar (Fig. 4, 5 and 6), so different re-sampled Internet graphs do not greatly change the result of the algorithm; and (2) the Internet is a dynamically growing network, so there is no static Internet graph that could be used as a template in the algorithm. The re-sampled Internet graph is therefore acceptable for use in the algorithm so far.

D. A recommended way to use CCEM

We recommend using CCEM within a Genetic Algorithm (GA), for the following reasons.

1) GA fits the CCEM studied in this paper quite well. GA can directly perform the calculations and optimizations needed when CCEM is used to evaluate a given topology and optimize it towards the real Internet topology.

2) Most Internet modeling research is currently based on statistics, because the Internet is too large to be handled by other approaches. Most statistical results are mathematical models with parameters that are not precisely determined; for example, some parameters are defined as a numerical interval [5] rather than a single value. Determining these parameters, or narrowing the numerical intervals, is what researchers are currently required to do, and technically it needs many rounds of repeated calculation. Under such conditions, GA is a highly appropriate tool because it is good at repeated computation and automatic decision-making. GA can automatically adjust the parameters of the Internet model until the optimization is done.

So CCEM is recommended to be used within a GA in Internet topology modeling research.

A possible use of GA with CCEM is listed in Table III.

TABLE III
POSSIBLE USAGE OF CCEM IN GA

Procedures and information

(1) Definition of genes in GA.
    Randomly initialize a gene group comprising a bundle of genes.

(2) Definition of evaluation function.
    The choice of ε should minimize the difference between the generated network and the real Internet, i.e., the cross-correlation outcome should be maximized. So the evaluation function should be based on the cross-correlation $r_{uv}[n]$.

(3) Selection.
    Genes are sorted by score from high to low in the gene group, and the first m*N genes, where m is a random number (0<m<1), are selected for the next round of calculation by GA. Then we duplicate the selected genes and delete the last m*N genes so that the group size remains the same.

(4) Crossover.
    Randomly select two genes, xi(vi…) and xj(vj…), out of the group to perform the crossover.

(5) Mutation.
    Randomly select two genes, xi(vi…) and xj(vj…), out of the group to perform the mutation.
    /* Unlike crossover operations, not all genes have to be mutated. We set up a threshold of 0.3 in the algorithm, which means only 30% of the genes are mutated. */

(6) Termination conditions.
    Basically there are two termination conditions in GA.
    Firstly, GA is terminated right after the best gene is found, i.e., when the evaluation function results in the highest score or a maximum value. As mentioned above, the maximized outcome of the cross-correlation only occurs when the two SLS eigenvalue sequences are totally identical. In this paper, r(y, y) (as mentioned, y is the SLS eigenvalue sequence of the real Internet topology from Fig. 5) is the maximum we are looking for, which means that the generated network is completely equivalent in topology to the real Internet.
    This maximum value, however, is hard to achieve, since it is hard to generate a network exactly the same as the real Internet. We therefore set up a threshold of $0.95 \cdot r_{uv}[n]$ to replace $r_{uv}[n]$. The best optimized parameter ε is regarded as found, and GA stops running, if the result of the evaluation function is greater than this threshold.
    The second termination condition applies when GA has repeated more than 1000 times before finding the best gene (parameter ε). If so, the algorithm is terminated. This ensures that GA ends in an appropriate way; otherwise GA might run for a very long time.
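The procedure of Table III can be sketched as the loop below. The table leaves most details open, so several choices here are assumptions of this sketch: each gene is a single real-valued parameter ε, crossover is an arithmetic blend of two genes, mutation adds Gaussian noise, and `fitness` is a hypothetical callable that would return the CCEM cross-correlation between the generated topology and the real one. A toy fitness function is used so that the example runs on its own.

```python
import random


def run_ga(fitness, low, high, group_size=20, threshold=0.95,
           max_rounds=1000, mutation_rate=0.3, seed=None):
    """Search [low, high] for a gene whose fitness exceeds the threshold."""
    rng = random.Random(seed)
    genes = [rng.uniform(low, high) for _ in range(group_size)]   # (1) genes
    best_gene, best_score = None, float("-inf")
    for _ in range(max_rounds):                                   # (6) round cap
        genes.sort(key=fitness, reverse=True)                     # (2) evaluate
        top_score = fitness(genes[0])
        if top_score > best_score:
            best_gene, best_score = genes[0], top_score
        if best_score > threshold:                                # (6) threshold
            break
        # (3) selection: keep the first m*N genes and duplicate them
        keep = max(1, int(rng.random() * group_size))
        genes = [genes[i % keep] for i in range(group_size)]
        # (4) crossover: blend two randomly chosen genes
        i, j = rng.sample(range(group_size), 2)
        w = rng.random()
        genes[i], genes[j] = (w * genes[i] + (1 - w) * genes[j],
                              (1 - w) * genes[i] + w * genes[j])
        # (5) mutation: about 30% of the genes are perturbed
        for idx in range(group_size):
            if rng.random() < mutation_rate:
                step = rng.gauss(0.0, 0.1 * (high - low))
                genes[idx] = min(high, max(low, genes[idx] + step))
    return best_gene, best_score


def toy_fitness(eps):
    """Stand-in for the CCEM score: peaks at 1.0 when eps equals 0.3."""
    return 1.0 - (eps - 0.3) ** 2


best, score = run_ga(toy_fitness, low=0.0, high=1.0, seed=7)
print(best, score)
```

In a real run, `fitness` would build a topology from ε with the chosen Internet model, compute its SLS sequence, and return its cross-correlation with the template sequence.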


V. CONCLUSION

CCEM and its algorithm were studied in this paper. Firstly, we tested the ability of spectral density to distinguish different graphs by applying it to three different but representative graphs: the ER random graph, the BA scale-free graph and the Internet topology graph. We found that the three resulting spectra showed quite different properties, which confirmed that the spectral density approach is capable of distinguishing and identifying Internet graphs.

Secondly, for the sake of quantitative research, we converted the Internet graph into its adjacency matrix and then obtained its SLS eigenvalues (a specific kind of spectrum), which made the quantitative research possible. Thirdly, a cross-correlation algorithm was introduced to quantitatively evaluate the difference between graphs. CCEM together with its algorithm was finally proposed for Internet topology modeling.

Furthermore, CCEM was recommended to be used within a Genetic Algorithm in Internet topology modeling research. Our next work is to set up a new Internet topology model with parameters optimized by CCEM together with the GA studied in this paper.

REFERENCES

[1] Jiang Y, Fang B.X., Hu M.Z. An Example of Analyzing the Characteristics of a Large Scale ISP Topology Measured from Multiple Vantage Points[J]. Journal of Software, 2005,16(5):846-856.

[2] Douglas B. West. An introduction to Graph Theory[M]. China Machine Press, 2006,1-47,339-348.

[3] Farkas IJ, Derényi I, Barabási A, Vicsek T. Spectra of ‘real-world’ graphs: Beyond the semicircle law[J]. Physical Review E, 2001,64(2):1-12.

[4] ZHANG Yu, ZHANG Hong-Li, FANG Bin-Xing. A Survey on Internet Topology Modeling[J]. Journal of Software, 2004,15(8):1221-1226.

[5] Wang X.F., Li X., Chen G.R. Complex networks theory and its application[M]. Beijing: Tsinghua University Press, 2006,49-70.

[6] Faloutsos M, Faloutsos P, Faloutsos C. On power-law relationships of the Internet topology[J]. ACM SIGCOMM Computer Communication Review, 1999,29(4):251-262.

[7] Mehta ML. Random Matrices. 2nd ed[M]. New York: Academic Press, 1991.

[8] Goh KI, Kahng B, Kim D. Spectra and eigenvectors of scale-free networks[J]. Physical Review E, 2001,64(5):1-5.

[9] Barabási AL, Bonabeau E. Scale-free networks[J]. Scientific American, 2003,50-59.

[10] Barabási AL, Albert R. Emergence of scaling in random networks[J]. Science, 1999,286(5439):509-512.

[11] Albert R, Barabási A. Statistical mechanics of complex networks[J]. Reviews of Modern Physics, 2002,74(1):47-97.

[12] XU Y. A TL Model for Router-level Internet Macroscopic Topology[D]. Shenyang: Northeastern University, 2006,36-38.

[13] Gkantsidis C, Mihail M, Zegura E. Spectral analysis of Internet topologies[C]. In Proceedings of the 22nd Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM'03), IEEE Computer Society, New York, 2003,364-374.

[14] van Dam ER, Haemers WH. Which graphs are determined by their spectrum?[J]. Linear Algebra and its Applications, 2003,373:241-272.

[15] Rorabaugh. COMPLETE DIGITAL SIGNAL PROCESSING[M]. US: McGraw-Hill, 2005.

[16] R. Teixeira, K. Marzullo, S. Savage, and G. Voelker. In search of path diversity in ISP networks[C]. Proceedings of the USENIX/ACM Internet Measurement Conference, (Miami, FL, USA), October 2003.

[17] S. Bilir, K. Sarac, and T. Korkmaz. End to end intersection characteristics of Internet paths and trees[C]. IEEE International Conference on Network Protocols (ICNP), (Boston, MA, USA), November 2005.

Ye XU was born in Dec. 1976 and received his Ph.D. degree in computer application technology in 2006 from Northeastern University, China. His research interests include complex network modeling and information processing technology. He is now an associate professor and Master's supervisor at the College of Information Science and Engineering, Shenyang Ligong University, China.