
Thomas Karagiannis,

Mart Molle,

and Michalis Faloutsos

University of California, Riverside

Long-Range Dependence

Ten Years of Internet Traffic Modeling

Self-similarity and scaling phenomena have dominated Internet traffic analysis for the past decade. With the identification of long-range dependence (LRD) in network traffic, the research community has undergone a mental shift from Poisson and memory-less processes to LRD and bursty processes. Despite its widespread use, though, LRD analysis is hindered by our difficulty in actually identifying dependence and estimating its parameters unambiguously. The authors outline LRD findings in network traffic and explore the current lack of accuracy and robustness in LRD estimation. In addition, the authors present recent evidence that packet arrivals appear to be in agreement with the Poisson assumption in the Internet core.

IEEE Internet Computing, September/October 2004, Internet Measurement. Published by the IEEE Computer Society. 1089-7801/04/$20.00 © 2004 IEEE.

Traffic modeling and analysis is a fundamental building block of Internet engineering and design. We can’t replicate the Internet and study it as a whole, so we rely on thorough analysis of network measurements and their transformation into models to help explain the Internet’s functionality and improve its performance.

About 10 years ago, the introduction of long-range dependence (LRD) and self-similarity revolutionized our understanding of network traffic. (LRD means that the behavior of a time-dependent process shows statistically significant correlations across large time scales; self-similarity describes the phenomenon in which the behavior of a process is preserved irrespective of scaling in space or time.) Prior to that, researchers considered Poisson processes (that is, the packet arrival process is memory-less and interarrival times follow the exponential distribution) to be an adequate representation for network traffic in real systems [1]. LRD flew in the face of conventional wisdom by stating that network traffic exhibits long-term memory (its behavior across widely separated times is correlated). This assertion challenged the validity of the Poisson assumption and shifted the community’s focus from assuming memory-less and smooth behavior to long memory and bursty behavior.

In this article, we provide an overview of what the community has learned from 10 years of LRD research; we also identify the caveats and limitations of our ability to detect LRD. In particular, we want to raise awareness of two issues: that identifying and estimating LRD is far from straightforward, and that the large-scale aggregation of the Internet’s core might have shifted packet-level behavior toward being a Poisson process. Ultimately, measuring and modeling the Internet requires us to constantly reinvent models and methods.

Self-Similarity in Internet Traffic

Ample evidence collected over the past decade suggests the existence of LRD, self-similarity, and heavy-tailed distributions (meaning large values can exist with non-negligible probability) in various aspects of network behavior.

Before we look at the major advances in LRD research, we must first describe LRD and self-similarity in the context of time-series analysis.

Stochastic Time Series

Let X(t) be a stochastic process. In some cases, X can take the form of a discrete time series $\{X_t\}$, t = 0, 1, ..., N, either through periodic sampling or by averaging its value across a series of fixed-length intervals. We say that X(t) is stationary if its joint distribution across a collection of times $t_1, \ldots, t_N$ is invariant to time shifting. Thus, we can characterize the dependence between the process’s values at different times by evaluating the process’s autocorrelation function (ACF), $\rho(k)$. The ACF measures similarity between a series $X_t$ and a shifted version of itself, $X_{t+k}$:

$$\rho(k) = \frac{E\left[(X_t - \mu)(X_{t+k} - \mu)\right]}{\sigma^2}, \qquad (1)$$

where $\mu$ and $\sigma$ are the mean and standard deviation, respectively, of X.

Also of interest is a time series’ aggregated process $X_k^{(m)}$:

$$X_k^{(m)} = \frac{1}{m} \sum_{i=km}^{(k+1)m-1} X_i, \qquad k = 0, 1, 2, \ldots, \left\lfloor \frac{N}{m} \right\rfloor - 1. \qquad (2)$$

Intuitively, $\{X_k^{(m)}\}$ describes the average value of the time series across “windows” of m consecutive values from the original time series. If the original values $\{X_t\}$ were independent and identically distributed, then $\mathrm{Var}(X^{(m)}) = \sigma^2/m$. However, if the sequence exhibits long memory, then the aggregated process’s variance converges to zero at a much slower rate than 1/m [2].
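The aggregation in Equation 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `aggregate`, the synthetic Gaussian input, and the chosen block sizes are hypothetical and are not taken from the article or from SELFIS. For independent values the printed variance should track $\sigma^2/m$ closely, which is the baseline against which long memory shows up as slower decay.

```python
import random
import statistics


def aggregate(series, m):
    """Equation 2: average the series over non-overlapping windows of m values."""
    n_blocks = len(series) // m  # only complete windows are used
    return [sum(series[k * m:(k + 1) * m]) / m for k in range(n_blocks)]


# Hypothetical test input: 20,000 i.i.d. Gaussian values with sigma^2 = 1.
rng = random.Random(1)
iid_series = [rng.gauss(0.0, 1.0) for _ in range(20000)]

for m in (1, 10, 100):
    var_m = statistics.pvariance(aggregate(iid_series, m))
    print(f"m={m:3d}  Var(X^(m))={var_m:.4f}  sigma^2/m={1.0 / m:.4f}")
```

Running the same loop on a long-memory series would be expected to show the variance falling more slowly than the last column.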

Self-Similarity and LRD

A stationary process X is long-range dependent if its autocorrelations decay to zero so slowly that their sum doesn’t converge; that is, $\sum_{k=1}^{\infty} |\rho(k)| = \infty$. Intuitively, memory is built into the process because the dependence among an LRD process’s widely separated values is significant, even across large time shifts.

A stochastic process X is self-similar if

$$X(at) = a^{H} X(t), \qquad a > 0,$$

where the equality refers to equality in distributions, a is a scaling factor, and the self-similarity parameter H is called the Hurst exponent. Intuitively, self-similarity describes the phenomenon in which certain process properties are preserved irrespective of scaling in space or time.

Second-order self-similarity describes the property that a time series’ correlation structure (ACF) is preserved irrespective of time aggregation. Simply put, a second-order self-similar time series’ ACF is the same for either coarse or fine time scales. A stationary process $X_t$ is second-order self-similar [3] if

$$\rho(k) = \frac{1}{2}\left[(k+1)^{2H} - 2k^{2H} + (k-1)^{2H}\right], \qquad 0.5 < H < 1, \qquad (3)$$

and asymptotically exactly self-similar if

$$\lim_{k \to \infty} \rho(k) = \frac{1}{2}\left[(k+1)^{2H} - 2k^{2H} + (k-1)^{2H}\right], \qquad 0.5 < H < 1.$$

Second-order self-similar processes are characterized by a hyperbolically decaying ACF and used extensively to model LRD processes. Conversely, quickly decaying correlations characterize short-range dependence. From these definitions, we can infer that LRD characterizes a time series if 0.5 < H < 1. As H → 1, the dependence is stronger.
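To make the link between Equation 3 and the non-summability condition concrete, the following sketch evaluates the ACF and its partial sums for a few Hurst values. The function names and the chosen values of H are illustrative assumptions. For H = 0.5 every term is zero, whereas for H > 0.5 the partial sums keep growing with the number of lags, which is the divergence that defines LRD; for large k the ACF behaves like $H(2H-1)k^{2H-2}$, the hyperbolic decay mentioned above.

```python
def acf_self_similar(k, hurst):
    """Equation 3: ACF of an exactly second-order self-similar process, lag k >= 1."""
    two_h = 2.0 * hurst
    return 0.5 * ((k + 1) ** two_h - 2.0 * k ** two_h + (k - 1) ** two_h)


def partial_sum(hurst, lags):
    """Sum of the ACF over lags 1..lags."""
    return sum(acf_self_similar(k, hurst) for k in range(1, lags + 1))


for hurst in (0.5, 0.7, 0.9):
    sums = [round(partial_sum(hurst, n), 2) for n in (10, 1000, 100000)]
    print(f"H={hurst}: partial sums over 10, 1000, 100000 lags = {sums}")
```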

For network-measurement processes, X refers to the number of packets and bytes at consecutive time intervals, meaning that X describes the volume of bytes/packets observed in a link every time interval t.

Self-Similarity in Internet Traffic

Leland and colleagues’ pioneering work provided the first empirical evidence of self-similar characteristics in LAN traffic [4]. They performed a rigorous statistical analysis of Ethernet traffic measurements and established its self-similar nature. Specifically, they observed that Internet traffic variability was invariant to the observed time scale; that is, traffic didn’t become smooth with aggregation as fast as the Poisson traffic model indicated. Subsequently, Paxson and Floyd described the failure of using Poisson modeling in wide-area Internet traffic [5]. They demonstrated that packet interarrival times for Telnet and FTP traffic were described by heavy-tailed distributions and characterized by burstiness, which indicated that the Poisson process underestimated both burstiness and variability. In addition, they proved that large-scale correlations characterized wide-area traffic traces, concluding, “We should abandon Poisson-based modeling of wide-area traffic for all but user session arrivals.”

These two landmark studies nudged researchers away from traditional Poisson modeling and independence assumptions, which were discarded as unrealistic and overly simplistic. The nature of the congestion produced from self-similar network traffic models had a considerable impact on queuing performance [6], due in large part to variability across various time scales. Further studies proved that Poisson-based models significantly underestimated performance measures, showing that self-similarity resulted in performance degradation by drastically increasing queuing delay and packet loss [7].

Self-similarity’s origins in Internet traffic are mainly attributed to heavy-tailed distributions of file sizes [8,9]. Several studies correlated the Hurst exponent with heavy-tailed distributions, indicating that extremely large transfer requests could occur with non-negligible probability.

Apart from LRD, Internet traffic presents complex scaling and multifractal characteristics. Many simulations and empirical studies illustrate how scaling behavior and the intensity of the observed dependence is related to the scale of observation. Specifically, loose versus strong dependence exists in smaller versus larger time scales, respectively. The change point is usually associated with either the round-trip time (RTT) or intrusive “fast” flows with small interarrival times [10,11].

Despite the overwhelming evidence of LRD’s presence in Internet traffic, a few findings indicate that Poisson models and independence could still be applicable as the number of sources increases in fast backbone links that carry vast numbers of distinct flows, leading to large volumes of traffic multiplexing [12]. In addition, other studies [13] point out that several end-to-end network properties seem to agree with the independence assumptions in the presence of nonstationarity (that is, statistical properties vary with time).

LRD Estimation and Its Limitations

The predominant way to quantify LRD is through the Hurst exponent, which is a scalar, but calculating this exponent isn’t straightforward. First, it can’t be calculated definitively, only estimated. Second, although we can use several different methods to estimate the Hurst exponent, they often produce conflicting results, and it’s not clear which provides the most accurate estimation.

We can classify Hurst exponent estimators into two general categories: those operating in the time domain and those operating in the frequency or wavelet domain. Due to space constraints, we can’t give a complete description of all available estimators, but an overview appears elsewhere [14].

Time-domain estimators investigate the power-law relationship between a specific statistical property in a time series and the time-aggregation block size m: LRD exists if the specific property versus m is a straight line when plotted in log-log scale. This line’s slope yields an estimate of the Hurst exponent, so time-domain estimators imply two presuppositions for LRD to exist: statistically significant evidence that the relevant points do indeed represent a straight line, and a slope such that 0.5 < H < 1 (the Hurst exponent H is computed from this slope). These estimators use several methodologies: R/S (rescaled range statistic), absolute value, variance, and variance of residuals.
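As a concrete instance of a time-domain estimator, the sketch below implements a bare-bones version of the variance method: it regresses the logarithm of the aggregated variance on the logarithm of m and converts the slope to H using the relation $\mathrm{Var}(X^{(m)}) \sim m^{2H-2}$. This is not the SELFIS implementation; the function names, the choice of block sizes, and the white-noise input are assumptions made for illustration, and no test of straight-line fit is included.

```python
import math
import random
import statistics


def aggregate(series, m):
    """Average the series over non-overlapping windows of m values."""
    n_blocks = len(series) // m
    return [sum(series[k * m:(k + 1) * m]) / m for k in range(n_blocks)]


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def hurst_variance_method(series, block_sizes):
    """Estimate H from the log-log slope of aggregated variance versus m."""
    log_m = [math.log10(m) for m in block_sizes]
    log_var = [math.log10(statistics.pvariance(aggregate(series, m)))
               for m in block_sizes]
    slope = least_squares_slope(log_m, log_var)  # slope = 2H - 2
    return 1.0 + slope / 2.0


# Hypothetical input: white Gaussian noise, for which H should be near 0.5.
rng = random.Random(42)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
block_sizes = [2 ** j for j in range(8)]  # 1, 2, 4, ..., 128
print(round(hurst_variance_method(white_noise, block_sizes), 2))
```

Larger block sizes leave fewer windows and therefore noisier variance estimates, which is one reason the choice of the fitting range matters in practice.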

Naturally, frequency- and wavelet-domain estimators operate in the frequency or wavelet domain. Similarly to the time-domain methodologies, they examine if a time series’ spectrum or energy follows power-law behavior. These estimators include the Periodogram, the Whittle, and the wavelet Abry-Veitch (AV) estimators [15].
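The frequency-domain idea can be illustrated with a simple periodogram regression. The sketch assumes the standard low-frequency behavior of an LRD spectrum, $f(\lambda) \sim c|\lambda|^{1-2H}$ as $\lambda \to 0$, and fits a line to the log periodogram over the lowest 10 percent of Fourier frequencies; that cutoff, the function names, and the direct (slow) DFT are illustrative assumptions rather than the procedure used in the article.

```python
import cmath
import math
import random


def periodogram_ordinate(series, j):
    """Return (frequency, periodogram value) at Fourier frequency 2*pi*j/n."""
    n = len(series)
    freq = 2.0 * math.pi * j / n
    dft = sum(x * cmath.exp(-1j * freq * t) for t, x in enumerate(series))
    return freq, abs(dft) ** 2 / (2.0 * math.pi * n)


def hurst_periodogram(series, low_fraction=0.1):
    """Estimate H from the log-log slope of the periodogram near frequency zero."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    n_freqs = max(2, int(low_fraction * (n // 2)))
    points = [periodogram_ordinate(centered, j) for j in range(1, n_freqs + 1)]
    xs = [math.log(f) for f, _ in points]
    ys = [math.log(i) for _, i in points]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))  # slope = 1 - 2H
    return (1.0 - slope) / 2.0


# Hypothetical input: white Gaussian noise has a flat spectrum, so H is near 0.5.
rng = random.Random(11)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
print(round(hurst_periodogram(white_noise), 2))
```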

We can test these estimation methodologies’ capabilities by first examining their accuracy on synthesized LRD series and then testing their ability to discriminate LRD behavior when applied to non-LRD data sets. In agreement with similar findings in earlier studies [14,16], our findings demonstrate that no estimator is consistently robust in every case: estimators can hide LRD or report it erroneously. Furthermore, each estimator has different strengths and limitations. We used the software package SELFIS (publicly distributed at our Web site, www.cs.ucr.edu/~tkarag) to perform the experiments described next.

Estimator Accuracy on Synthesized LRD Series

The most extensively used self-similar processes for simulating LRD are fractional Gaussian noise (fGn) and fractional Auto Regressive Integrated Moving Average (ARIMA) processes. fGn is the increment process of fractional Brownian motion (fBm), a random walk process with dependent increments; fGn is a Gaussian process and its ACF is given by Equation 3. The fractional ARIMA(p,d,q) model is a fractional integration of the autoregressive moving average, or ARMA(p,q), model. Fractional ARIMA processes describe LRD series when 0 < d < 0.5, in which H = d + 0.5.
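For readers who want to experiment, a fractional ARIMA(0,d,0) series can be approximated by filtering white noise with the power-series weights of $(1-B)^{-d}$, where B is the backshift operator. The sketch below is a truncated approximation and is not the generator used for the experiments in this article; the function names, the truncation length `n_weights`, and the seed are hypothetical choices.

```python
import random


def farima_weights(d, n_weights):
    """Moving-average weights of (1 - B)^(-d): psi_0 = 1, psi_j = psi_{j-1} * (j - 1 + d) / j."""
    psi = [1.0]
    for j in range(1, n_weights):
        psi.append(psi[-1] * (j - 1 + d) / j)
    return psi


def farima_0d0(hurst, length, n_weights=500, seed=0):
    """Approximate fractional ARIMA(0,d,0) with d = H - 0.5 by a truncated moving average."""
    d = hurst - 0.5
    psi = farima_weights(d, n_weights)
    rng = random.Random(seed)
    # Extra leading noise values so that every output uses the full filter length.
    noise = [rng.gauss(0.0, 1.0) for _ in range(length + n_weights - 1)]
    return [sum(psi[j] * noise[t + n_weights - 1 - j] for j in range(n_weights))
            for t in range(length)]


series = farima_0d0(hurst=0.8, length=2000)
print(len(series), [round(w, 4) for w in farima_weights(0.3, 4)])
```

Because the weights decay slowly, truncating them weakens the dependence at lags beyond `n_weights`, so a longer filter or an exact generator would be preferable for serious estimator testing.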

We tested each estimator against two different types of synthesized long-memory series: fractional ARIMA and fGn [17]. For each Hurst value between 0.5 and 1 (using a step of 0.1), we generated 100 fGn and 100 fractional ARIMA synthesized data sets of 64 Kbytes. Figure 1 reports the average estimated Hurst value for these data sets for each estimator as well as the 95 percent confidence intervals of the mean (that is, the range of values that has a high probability of containing the mean). However, these intervals are so close to the average that they’re barely discernible. Although many estimators and generators exist, we used and evaluated the most widely used ones.

Figure 1 shows significant variation in the estimated Hurst exponent value between the various methodologies, especially as the Hurst exponent tends to 1, where the intensity of long-range dependence is larger. Frequency-domain estimators seem to be more accurate. In the case of the fGn synthesized series, Whittle and Periodogram estimators fall exactly on top of the optimal estimation line. The Whittle estimator has the a priori advantage of being applied to a time series whose underlying structure matches the assumptions under which the estimator was derived. The wavelet AV estimator always overestimates the Hurst exponent’s value (usually by 0.05). Overall, time-domain estimators fail to report the correct Hurst exponent value, underestimating it by more than 20 percent. (In Figure 1, lines clustered under the optimal estimation line represent these estimators.) When we used fractional ARIMA to synthesize the time series, the estimations were generally closer to the optimal estimation line. However, none of the estimators consistently followed the optimal line across all Hurst values.

Figure 1. Estimating Hurst exponent values. We tested the performance of estimators on (a) fractional Gaussian noise (fGn) and (b) fractional ARIMA (Auto Regressive Integrated Moving Average) synthesized time series. The target line is the optimal estimation. In both cases, time-domain estimators, represented by the lines clustered below the target line, failed to capture the synthesized Hurst exponent value, especially as H tended to 1. Frequency-based estimators appear to be more accurate, following the target line more closely. Axes: Hurst exponents of the (a) fGn and (b) fractional ARIMA series, 0.50 to 0.95 (x); estimated Hurst exponent (y). Curves: Target, R/S, Variance, Abs, Residuals, Periodogram, Whittle, AV.

Discrimination of LRD Behavior in Deterministic Series

To study the estimations’ sensitivity, we examined the effects of various phenomena common to time-series analysis, such as periodicity, noise, and trend (where the mean of the process is steadily increasing or decreasing). Our analysis revealed that the presence of such processes significantly affects estimators. Furthermore, most methodologies fail to distinguish between LRD and such phenomena, and falsely report LRD in deterministic non-LRD time series. We examined four cases and learned that, essentially, no estimator is consistently robust in every case. Each one evaluates different statistics to estimate the Hurst exponent, which requires the examination of many estimators to get an overall picture of the time series’ properties. Applying signal-processing techniques and methodologies could help us overcome some of these limitations, but networking practitioners aren’t necessarily familiar with such practices.

Cosine plus white Gaussian noise. In our first test, we applied the estimators to periodic data sets, which we synthesized from white Gaussian noise plus a cosine function, A cos(αx). Periodicity can mislead the Whittle, Periodogram, R/S, and AV methods into falsely reporting LRD. The Hurst exponent estimation depends mainly on A, so the estimations approach 1 as A increases. Thus, as the amplitude increases, estimations become less reliable. If the amplitude is large and the period is small, Whittle always estimates the Hurst exponent to be 0.99. (Whittle estimates of 0.99 represent the failure of robust estimation.)
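One way to see why such a series can look like long memory, offered here as an illustration rather than as the article's analysis, is that its autocorrelation never dies out: at lags that are multiples of the period it stays near $(A^2/2)/(A^2/2 + \sigma^2)$, which grows toward 1 with the amplitude A. The sketch below checks this on a synthetic series; the amplitude, period, series length, and unit noise variance are hypothetical values.

```python
import math
import random


def sample_acf(series, lag):
    """Sample autocorrelation at one lag (covariance averaged over n - lag pairs)."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series) / n
    covariance = sum((series[t] - mean) * (series[t + lag] - mean)
                     for t in range(n - lag)) / (n - lag)
    return covariance / variance


amplitude = 2.0   # hypothetical A
period = 50       # hypothetical period in samples, so alpha = 2*pi/period
alpha = 2.0 * math.pi / period
rng = random.Random(7)
series = [amplitude * math.cos(alpha * t) + rng.gauss(0.0, 1.0)
          for t in range(20000)]

# Predicted correlation at multiples of the period, for unit noise variance.
predicted = (amplitude ** 2 / 2.0) / (amplitude ** 2 / 2.0 + 1.0)
for lag in (period, 10 * period, 100 * period):
    print(f"lag={lag:5d}  sample ACF={sample_acf(series, lag):.3f}  "
          f"predicted={predicted:.3f}")
```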

fGn series plus white Gaussian noise. We next examined the effect of noise on LRD data. We found that all estimators underestimate the Hurst exponent in the presence of noise, but with the exception of Whittle and the wavelet estimator, the difference is negligible. Depending on the signal-to-noise ratio and the fGn series’ Hurst exponent value, however, these two estimators could significantly underestimate the Hurst exponent, by more than 20 percent in some cases.

fGn series plus a cosine function. In studying the effect of periodicity on LRD data, we found that all estimations were affected by its presence. Depending on the cosine function’s amplitude, time-domain estimators tend to underestimate the Hurst exponent. On the other hand, frequency-based methodologies overestimate the Hurst exponent. As we increase the cosine function’s amplitude, estimates tend toward 1.

Trend. The definition of LRD assumes stationary time series. To study the impact of nonstationarity on the estimators, we therefore synthesized various series with different decaying or increasing trends. We also examined combinations of previous categories (white Gaussian noise and cosine functions) with trend. In every case, the Whittle estimate was consistently 0.99; the Periodogram method’s estimates for the Hurst exponent were greater than 1, whereas self-similarity is only defined for H < 1. No other methodology produced statistically significant estimations.

Examining the Poisson Assumption in the Backbone

We studied the Poisson assumption’s validity on several OC48 (2.5 Gbps) backbone traces taken from CAIDA (Cooperative Association for Internet Data Analysis) monitors located at two different SONET OC48 links belonging to two US tier-1 Internet service providers (ISPs).

To capture the traces, we used Linux-based monitors with Dag 4.11 network cards and packet-capture software originally developed at the University of Waikato and currently produced by Endace. We analyzed various backbone traces: August 2002 (backbone 1, eight hours), January 2003 (backbone 1, one hour), April 2003 (backbone 1, eight hours), May 2003 (backbone 1, 48 hours; backbone 2, two hours), and January 2004 (backbone 2, one hour).

Our analysis demonstrates that backbone packet arrivals appear to agree with the Poisson assumption [12,18], but our traces also appear to agree with self-similarity and past LRD findings. A more elaborate discussion of our findings, as well as a traffic characterization that reconciles these contradictory results, appears elsewhere [18]; there, we argue that Internet traffic behaves as a nonstationary, time-dependent Poisson process that, when viewed across very long time scales, exhibits the observed LRD.

To test the Poisson traffic model’s validity, we must examine two key properties: whether packet interarrival times follow the exponential distribution, and whether packet sizes and interarrival times appear mutually independent. Congestion in today’s Internet usually appears on access links rather than in the backbone, where ISPs overprovision their networks; traffic characteristics can vary on access links, which means our findings might not apply there.


Distribution of Packet Interarrival Times

An interarrival-time distribution consists of two portions, one that contains back-to-back packets and another with packets guaranteed to be separated by idle time. For heavily utilized links, interarrival times are a function of packet sizes because many packets are sent back to back. For overprovisioned links, the distribution tends to contain most probability in the “idle” portion (where packets are separated by idle time).

We can closely approximate packet interarrival times for our traces by using an exponential distribution. Figure 2 shows the packet interarrival distributions for two of the backbone traces. The Complementary Cumulative Distribution Function (CCDF) of packet interarrival times is a straight line when the y-axis is plotted in log scale, which corresponds to an exponential distribution.
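The straight-line criterion follows from the exponential CCDF, $P[X > x] = e^{-\lambda x}$, whose logarithm is linear in x with slope $-\lambda$. The sketch below applies that check to synthetic data; the arrival rate, sample size, thresholds, and function names are hypothetical, and real traces would also need the back-to-back portion handled separately, as discussed in the text.

```python
import bisect
import math
import random


def empirical_ccdf(samples, thresholds):
    """Fraction of samples strictly greater than each threshold."""
    ordered = sorted(samples)
    n = len(ordered)
    return [(n - bisect.bisect_right(ordered, x)) / n for x in thresholds]


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


rate = 0.1  # hypothetical arrival rate, in packets per microsecond
rng = random.Random(3)
gaps = [rng.expovariate(rate) for _ in range(100000)]  # synthetic interarrivals

thresholds = [5.0 * i for i in range(1, 11)]  # 5, 10, ..., 50 microseconds
ccdf = empirical_ccdf(gaps, thresholds)
slope = least_squares_slope(thresholds, [math.log(p) for p in ccdf])
print(f"fitted log-CCDF slope = {slope:.4f}, expected about {-rate:.4f}")
```

A heavy-tailed interarrival distribution would instead bend away from the fitted line at large x on the same log-linear axes.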

To highlight the differences between current backbone traces and past Ethernet-link traces, Figure 2 also shows the CCDF of interarrival times for the famous BC-pAug89 trace, which was first used to prove LRD in network traffic in the pioneering work of Leland and colleagues [4]. It was recorded at 11:25 (EDT) on 29 August 1989 from an Ethernet at the Bellcore Morristown Research and Engineering facility.

Figure 2 shows a minor discrepancy between our traces and the exponential distribution for small values of the interarrival times (that is, less than 6 µs; 5 µs is the time required for the transmission of 1,500-byte packets in an OC48 link). This discrepancy is caused by train-like interarrivals (back-to-back packets not separated by any intermediate idle time) during busy periods at the upstream router. However, the Poisson traffic model assumption does not require that interarrival times follow a perfect exponential distribution. In fact, these deviations and short-range artifacts can be incorporated into the Poisson model as “packet trains” [19].
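The 5 µs figure can be checked with a short calculation, assuming the nominal 2.5 Gbps rate quoted earlier for OC48 and ignoring framing overhead:

```latex
t_{\text{tx}}
  = \frac{1500 \times 8\ \text{bits}}{2.5 \times 10^{9}\ \text{bits/s}}
  = 4.8 \times 10^{-6}\ \text{s}
  \approx 5\ \mu\text{s}
```

Interarrival times shorter than roughly this value can therefore only occur when packets smaller than 1,500 bytes arrive back to back.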

Independence

We separately examined and showed that packet sizes and interarrival times appear to be uncorrelated in our traces. We validated the independence by using various tests, such as the ACF and cross-correlation function (XCF), visual examination of conditional probabilities and scatter plots, the Box-Ljung statistic, and Pearson’s chi-square test for independence. Other researchers have used similar tools in the literature to test the independence hypothesis [5,13].

Using the ACF, we examined two different time series. The sizes series consisted of the actual packet sizes as individual packets arrive, and the interarrival series consisted of timestamp differences between consecutive packets. Apart from limited correlation at small time lags (less than 20), sizes and interarrivals weren’t correlated. The trivial correlation at small time lags close to zero was due to back-to-back packets, as described earlier. The XCF between sizes and interarrivals points to independence beyond minimal correlation at small lags.

Figure 2. Complementary Cumulative Distribution Function of packet interarrival times over two backbone networks. For OC48 link traces on (a) January 2003, backbone 1, and (b) January 2004, backbone 2, as well as for (c) the BC-pAug89 1989 Bellcore trace, the y-axis is plotted in logarithmic scale. We can approximate the distributions of OC48 traces with an exponential distribution (straight line in log-linear scale), but the BC-pAug89 data set clearly deviates from the exponential distribution. Panels: (a) Backbone 1, OC-48, 2003-01-15, 10:00, dir. 1; (b) Backbone 2, OC-48, 2004-01-22, 14:00, dir. 1; (c) BC-pAug89, 10 Mb/s, 1989-08-29, 11:25 (EDT). Axes: interarrival time, in microseconds for (a) and (b) and in milliseconds for (c) (x); P[X > x], log scale (y).

Independence was also suggested by the Box-Ljung statistic $Q_k$, defined as

$$Q_k = n(n+2) \sum_{i=1}^{k} \frac{\rho_i^2}{n-i},$$

where $\rho_i$ is the autocorrelation coefficient for lags 1 ≤ i ≤ k and n is the series’ length. To test the null hypothesis (that is, independence), we compared the $Q_k$ statistic with the chi-square distribution, which had k degrees of freedom. We applied the test for varying numbers of consecutive packet arrivals for both the interarrival times and packet sizes. The Box-Ljung statistic shows that we can consider that both variables are not correlated with 95 percent confidence for up to a certain number of consecutive packet arrivals. The point at which dependence appears differs with the trace and time within the trace; for example, independence holds for 20,000 consecutive packet interarrivals on average according to the test for the January 2003, backbone 1 trace. For the packet-sizes series, the average is approximately 16,000 consecutive packet arrivals.
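The statistic above is straightforward to compute. The sketch below is a minimal illustration, not the procedure applied to the traces: the function names and the synthetic series are hypothetical, and the chi-square critical value is obtained with the Wilson-Hilferty approximation so that only the Python standard library is needed (a statistics package would normally supply the exact quantile).

```python
import math
import random
from statistics import NormalDist


def autocorrelations(series, max_lag):
    """Sample autocorrelation coefficients rho_1 .. rho_max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    c0 = sum(x * x for x in centered)
    return [sum(centered[t] * centered[t + lag] for t in range(n - lag)) / c0
            for lag in range(1, max_lag + 1)]


def box_ljung(series, max_lag):
    """Q_k = n (n + 2) * sum_{i=1..k} rho_i^2 / (n - i)."""
    n = len(series)
    rho = autocorrelations(series, max_lag)
    return n * (n + 2) * sum(r * r / (n - i) for i, r in enumerate(rho, start=1))


def chi2_quantile_approx(p, dof):
    """Wilson-Hilferty approximation to the chi-square quantile (an approximation)."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * math.sqrt(c)) ** 3


rng = random.Random(5)
independent = [rng.expovariate(1.0) for _ in range(5000)]  # synthetic i.i.d. gaps
correlated = [0.0]                                         # synthetic AR(1) series
for _ in range(4999):
    correlated.append(0.5 * correlated[-1] + rng.gauss(0.0, 1.0))

k = 10
critical = chi2_quantile_approx(0.95, k)
for name, series in (("independent", independent), ("AR(1)", correlated)):
    q = box_ljung(series, k)
    print(f"{name}: Q_{k} = {q:.1f}, critical = {critical:.1f}, reject = {q > critical}")
```

The null hypothesis of independence is rejected at the 95 percent level when the statistic exceeds the critical value, as should happen for the strongly correlated series.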

We validated these findings by applying Pearson’s chi-square test for independence. In all cases, we accepted the null hypothesis for similar numbers of consecutive interarrivals (as with the Box-Ljung statistic), provided that we apply the test to the “idle” portion of the distribution (that is, using interarrival times larger than 6 µs to remove back-to-back packet effects). Independence held for hundreds of thousands of consecutive interarrivals for the May 2003, backbone 2 trace.

LRD

Despite the Poisson characteristics of packet arrivals, our traces and analysis agreed with previous findings, showing that LRD characterizes backbone traffic. However, the intensity of correlation depends on the scale of observation. Specifically, in all traces analyzed, we saw a dichotomy in the scaling, in agreement with previous studies [10,11]. The change point is within the millisecond scale, albeit slightly different for each case, but the pattern is the same: at scales below the change point, the Hurst exponent is just above 0.6. At larger scales, it varies between 0.7 and 0.85 depending on the trace and the estimator used. We studied the series of byte and packet counts with the smallest aggregation level at 10 µs.

Conclusions

The findings we’ve presented here might further challenge established beliefs. They reflect an extremely dynamic and constantly evolving network expanding in size and complexity. Further analysis of other backbone links as well as links near the network’s periphery seems compelling. We could very well discover that individual links exhibit varying behavior, especially at small time scales. Why should traffic be an exception to the Internet’s diversity?

The problem of characterizing Internet traffic is not one that can be solved easily, once and for all. As the Internet increases in size and the technologies connected to it change, we must constantly monitor and reevaluate our assumptions to ensure that our conceptual models correctly represent reality.

Acknowledgments

This work was supported by grants from the US National Science Foundation through the Advanced Network Infrastructure and Research (ANIR-9985195) and the Information and Data Management (IDM-0208950) programs.

References

1. L. Kleinrock, Queueing Systems, Volume II: Computer Applications, John Wiley & Sons, 1976.

2. J. Beran, Statistics for Long-Memory Processes, Chapman and Hall, 1994.

3. K. Park and W. Willinger, “Self-Similar Network Traffic: An Overview,” Self-Similar Network Traffic and Performance Evaluation, Wiley-Interscience, 2000, pp. 1–39.

4. W.E. Leland et al., “On the Self-Similar Nature of Ethernet Traffic,” IEEE/ACM Trans. Networking, vol. 2, no. 1, 1994, pp. 1–15.

5. V. Paxson and S. Floyd, “Wide Area Traffic: The Failure of Poisson Modeling,” IEEE/ACM Trans. Networking, vol. 3, no. 3, 1995, pp. 226–244.

6. A. Erramilli, O. Narayan, and W. Willinger, “Experimental Queuing Analysis with Long-Range Dependent Packet Traffic,” IEEE/ACM Trans. Networking, vol. 4, no. 2, 1996, pp. 209–223.

7. K. Park, G. Kim, and M.E. Crovella, “On the Relationship Between File Sizes, Transport Protocols, and Self-Similar Network Traffic,” Int’l Conf. Network Protocols, IEEE CS Press, 1996, pp. 171–180.

8. M.E. Crovella and A. Bestavros, “Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes,” IEEE/ACM Trans. Networking, vol. 5, no. 6, 1997, pp. 835–846.

9. W. Willinger et al., “Self-Similarity through High-Variability: Statistical Analysis of Ethernet LAN Traffic at the Source Level,” IEEE/ACM Trans. Networking, vol. 5, no. 1, 1997, pp. 71–86.

10. A. Feldmann et al., “The Changing Nature of Network Traffic: Scaling Phenomena,” ACM Computer Comm. Rev., vol. 28, Apr. 1998, pp. 5–29.

11. Z.L. Zhang et al., “Small-Time Scaling Behaviors of Internet Backbone Traffic: An Empirical Study,” Proc. IEEE Infocom, IEEE CS Press, 2003, pp. 1826–1836.

12. J. Cao et al., “On the Nonstationarity of Internet Traffic,” Sigmetrics/Performance, ACM Press, 2001, pp. 102–112.

13. Y. Zhang et al., “On the Constancy of Internet Path Properties,” Proc. ACM Sigcomm Internet Measurement Workshop, ACM Press, 2001, pp. 197–212.

14. M.S. Taqqu and V. Teverovsky, “On Estimating the Intensity of Long-Range Dependence in Finite and Infinite Variance Time Series,” A Practical Guide to Heavy Tails: Statistical Techniques and Applications, Birkhauser, 1998, pp. 177–217.

15. P. Abry and D. Veitch, “Wavelet Analysis of Long-Range Dependence Traffic,” IEEE Trans. Information Theory, vol. 44, no. 1, 1998, pp. 2–15.

16. T. Karagiannis, M. Faloutsos, and R.H. Riedi, “Long-Range Dependence: Now You See It, Now You Don’t!” Proc. IEEE Global Telecommunications Conf. Global Internet Symp., 2002.

17. V. Paxson, “Fast Approximation of Self Similar Network Traffic,” Computer Comm. Rev., vol. 27, no. 5, 1997, pp. 5–18.

18. T. Karagiannis, M. Molle, and M. Faloutsos, “A Nonstationary Poisson View of Internet Traffic,” Proc. IEEE Infocom, IEEE CS Press, 2004.

19. R. Jain and S. Routhier, “Packet Trains: Measurements and a New Model for Computer Network Traffic,” IEEE J. Select. Areas Comm., vol. 4, no. 6, 1986, pp. 986–995.

Thomas Karagiannis is a PhD candidate in the Department of Computer Science and Engineering at the University of California, Riverside. His technical interests include Internet traffic measurements, analysis of Internet traffic dynamics, Internet protocols, and peer-to-peer networks. Karagiannis received a BSc from the Department of Applied Informatics at the University of Macedonia, Thessaloniki, Greece. He is a member of the IEEE. Contact him at tkarag@cs.ucr.edu.

Mart L. Molle is a professor in the Department of Computer Science and Engineering at the University of California, Riverside. His research interests include the performance evaluation of protocols for computer networks and of distributed systems. Molle received a BSc (Hons.) in mathematics/computing science from Queen’s University at Kingston, Canada, and an MS and PhD in computer science from the University of California, Los Angeles. He is a member of the IEEE. Contact him at mart@cs.ucr.edu.

Michalis Faloutsos is a faculty member in the Computer Science Department at the University of California, Riverside. His interests include Internet protocols and measurements, multicasting, and ad hoc networks. Faloutsos received a BS in electrical and computer engineering from the National Technical University of Athens and an MSc and PhD in computer science from the University of Toronto. Contact him at michalis@cs.ucr.edu.
