ABSTRACT: One of the distinguishing features of a backbone link is that it is designed to carry traffic from a large number of end users. This results in a Normal distribution for the number of bytes or packets that arrive in a fixed-length time interval. Based on this observation, which is substantiated by data analysis, we present a simple model for the steady-state loss probability that can be solved in closed form. This model assumes that there is no buffer, so that issues raised by the correlation of counts that is characteristic of packet traffic are bypassed. The longest interval that captures the relevant statistical fluctuations of backbone traffic is one second. Data collection on live commercial networks is costly, so byte and packet counts are usually collected over much longer time intervals; five minutes is a lower bound. This creates no problem in estimating the mean of the Normal distribution, but it makes direct estimation of the variance for one-second counts infeasible. Routers collect flow data; flows are analogous to "calls" in telephony. By modeling the number of active flows as an M/G/∞ queue and assuming that packets in a flow are spread uniformly in time, an equation for the variance of (say) one-second counts in terms of measured quantities is derived. The efficacy of this formula is demonstrated by applying it to data.
Operations Research 08/2005; 53(4):575-585. DOI:10.1287/opre.1040.0192 · 1.74 Impact Factor
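The paper's closed-form loss model is not reproduced here, but the bufferless Normal-traffic setting it describes admits a standard calculation: with per-interval arrival volume A ~ Normal(m, v) and capacity C, the fraction of offered traffic lost is E[(A − C)^+]/m, which has a closed form via the Normal excess formula. A minimal sketch (parameter values are illustrative, not from the paper):

```python
import math
import random

def normal_overflow_fraction(mean, var, capacity):
    """Fraction of offered traffic lost on a bufferless link when the
    per-interval arrival volume A is Normal(mean, var):
    E[(A - capacity)^+] / mean, via the standard Normal excess formula."""
    sigma = math.sqrt(var)
    z = (capacity - mean) / sigma
    phi = math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)  # Normal density at z
    tail = 0.5 * math.erfc(z / math.sqrt(2.0))               # P(Z > z)
    return (sigma * phi - (capacity - mean) * tail) / mean

# Monte Carlo check of the closed form (illustrative parameters).
random.seed(1)
mean, var, cap = 100.0, 25.0, 110.0
n = 200_000
sim = sum(max(random.gauss(mean, math.sqrt(var)) - cap, 0.0)
          for _ in range(n)) / (n * mean)
```

The simulated loss fraction agrees with the closed form to well within Monte Carlo error, which is the kind of check the abstract's "efficacy is demonstrated by applying it to data" suggests.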
ABSTRACT: We start with the premise, and provide evidence that it is valid, that a Markov-modulated Poisson process (MMPP) is a good model for Internet traffic at the packet/byte level. We present an algorithm to estimate the parameters and size of a discrete MMPP (D-MMPP) from a data trace. This algorithm requires only two passes through the data. In tandem-network queueing models, the input to a downstream queue is the output from an upstream queue, so the arrival rate is limited by the rate of the upstream queue. We show how to modify the MMPP describing the arrivals to the upstream queue to approximate this effect. To extend this idea to networks that are not tandem, we show how to approximate the superposition of MMPPs without encountering the state-space explosion that occurs in exact computations. Numerical examples that demonstrate the accuracy of these methods are given. We also present a method to convert our estimated D-MMPP to a continuous-time MMPP, which is used as the arrival process in a matrix-analytic queueing model.
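The estimation algorithm itself is not given in the abstract; as a sketch of the model being fitted, the following generates per-slot counts from a small discrete MMPP. The states, rates, and transition probabilities are illustrative, not estimated from any trace:

```python
import math
import random

def simulate_dmmpp(rates, trans, n_slots, rng):
    """Per-slot counts from a discrete MMPP: a hidden Markov chain picks
    each slot's Poisson mean. rates[i] is the mean count in state i;
    trans[i][j] = P(next state is j | current state is i)."""
    def poisson(lam):
        # Knuth's multiplication method; adequate for the small rates used here.
        limit, k, p = math.exp(-lam), 0, 1.0
        while True:
            p *= rng.random()
            if p <= limit:
                return k
            k += 1
    state, counts = 0, []
    for _ in range(n_slots):
        counts.append(poisson(rates[state]))
        u, cum = rng.random(), 0.0
        for j, pj in enumerate(trans[state]):
            cum += pj
            if u < cum:
                state = j
                break
    return counts

rng = random.Random(7)
# Two-state example: a quiet state (mean 2/slot) and a bursty state (mean 20/slot).
counts = simulate_dmmpp([2.0, 20.0], [[0.95, 0.05], [0.10, 0.90]], 50_000, rng)
```

The resulting count sequence is far burstier than a Poisson stream with the same mean, which is the qualitative feature that makes the MMPP a plausible packet-traffic model.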
ABSTRACT: With the rapid growth of Internet applications built on TCP/IP such as the World Wide Web and the standardization of traffic management schemes such as Available Bit Rate (ABR) in Asynchronous Transfer Mode (ATM) networks, it is essential to evaluate the performance of feedback-based protocols using traffic models which are specific to dominant applications. This paper presents a method for analyzing feedback-based protocols with a Web-user-like input traffic where the source alternates between 'transfer' periods followed by 'think' periods. Our key results, which are presented for the TCP protocol, are as follows: (1) When the round-trip time is the same for all users, the goodputs and the fraction of time that the system has some given number of transferring sources are insensitive to the distributions of transfer (file or page) sizes and think times except through the ratio of their means. Thus, apart from network round-trip times, only the ratio of average transfer sizes and think times of users need be known to size the network for achieving a specific quality of service. (2) The Engset model can be adapted to accurately compute goodputs for TCP and TCP over ATM, with different buffer management schemes. Though only these adaptations are given in the paper, the method based on the Engset model can be applied to analyze other feedback systems, such as ATM ABR, by finding a protocol-specific adaptation. Hence, the method we develop is useful not only for analyzing TCP using a source model significantly different from the commonly used persistent sources, but also can be useful for analyzing other feedback schemes. (3) Comparisons of simulated TCP traffic to measured Ethernet traffic show qualitatively similar second-order autocorrelation when think times follow a Pareto distribution with infinite variance. Also, the simulated and measured traffic have long-range dependence.
In this sense our traffic model, which purports to be Web-user-like, also agrees with measured data traffic.
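By the insensitivity result in item (1), the fraction of time that k of N sources are transferring depends on the transfer and think times only through the ratio of their means. For N independent on/off sources this gives the familiar binomial (Engset-like) form; the paper's protocol-specific adaptation for TCP is more involved, so treat this purely as a sketch of the insensitivity claim:

```python
from math import comb

def transfer_distribution(n_sources, ratio):
    """P(k of N independent on/off sources are in a 'transfer' period),
    where ratio = mean transfer time / mean think time. Only this ratio
    enters, matching the insensitivity result quoted above."""
    weights = [comb(n_sources, k) * ratio ** k for k in range(n_sources + 1)]
    total = sum(weights)
    return [w / total for w in weights]

# 10 sources, transfers on average a quarter as long as think times.
probs = transfer_distribution(10, 0.25)
```

The mean number of transferring sources is N·r/(1+r), so doubling both the mean transfer size and the mean think time leaves the whole distribution unchanged.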
ABSTRACT: Measurements of file sizes transported on the World-Wide-Web have led some researchers to propose describing them by probability distributions with infinite variance. The M/G/1 queue often arises as a performance model for components of the WWW, and the service times correspond to file sizes; the infinite variance of the file sizes becomes the variance of the service times. In this paper the effects of very large service-time variances on some performance measures for the M/G/1 queue are explored via numerical examples and analytic arguments. The first main conclusion is that it is the form of the service-time distribution over a wide finite range that controls the steady-state queueing performance, so distributions with very large finite variances can yield the same behavior as distributions with infinite variances. The second main conclusion is that very large service-time variances cause the rate of approach to steady-state performance to be so slow that steady-state performance measures are not likely to be of engineering interest. A third conclusion is that a common device of using the probability that the work in an infinite queue exceeds the level b to approximate the probability that a finite buffer of size b overflows may be very inaccurate. The approximation works better for the fat-tailed distributions studied than for the others. The most important engineering implication of these results is that when service times have a very large variance (such as file transfers on the WWW), performance criteria other than steady-state measures have to be used.
Performance Evaluation 03/2000; 40(1-3):47-70. DOI:10.1016/S0166-5316(99)00069-3 · 0.20 Impact Factor
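The abstract's first conclusion (a wide finite range of the service-time distribution controls steady-state performance) is visible in the Pollaczek-Khinchine mean-wait formula, where only E[S] and E[S^2] enter. A sketch with illustrative numbers:

```python
def pk_mean_wait(arrival_rate, s_mean, s_second_moment):
    """Pollaczek-Khinchine formula for the M/G/1 mean waiting time:
    W = lambda * E[S^2] / (2 * (1 - rho)), where rho = lambda * E[S]."""
    rho = arrival_rate * s_mean
    assert rho < 1.0, "queue must be stable"
    return arrival_rate * s_second_moment / (2.0 * (1.0 - rho))

# Same mean service time and load, very different second moments.
lam = 0.5
w_deterministic = pk_mean_wait(lam, 1.0, 1.0)     # constant S: E[S^2] = E[S]^2 = 1
w_heavy_tailed = pk_mean_wait(lam, 1.0, 1000.0)   # huge but finite E[S^2]
```

Two service distributions that agree over the range carrying most of E[S^2] give essentially the same steady-state wait, whether or not one of them has an infinite-variance tail beyond that range; the mean wait scales linearly in E[S^2], which also hints at why convergence to steady state is so slow for heavy tails.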
ABSTRACT: Data teletraffic is characterized by bursty arrival processes. Performance models are characterized by a desire to know under what circumstances the probability that an arrival finds a full input buffer is very small. In this paper I examine how four models proposed in the literature perform on two data sets of local area network traffic. Among my conclusions are (1) the protocol governing the data transmission may have a substantial effect on the statistical properties of the packet stream, (2) the probability that a finite buffer of size b overflows may not be adequately approximated by the probability that an infinite buffer has at least b packets in it, and (3) a data-based estimate of the large-deviation rate function does the best job of estimating packet loss on these data sets. This method may overestimate the loss rate by several orders of magnitude, so there is room for further refinements.
Performance Evaluation 12/1998; 34(4):227-247. DOI:10.1016/S0166-5316(98)00039-X · 0.20 Impact Factor
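Conclusion (3) above favors a data-based large-deviation rate-function estimate. One common form of such an estimator (an assumption here, not necessarily the exact one compared in the paper) obtains the decay rate delta in P(queue > b) ≈ exp(−delta·b) by solving Lambda(theta) = c·theta, where Lambda is the empirical log moment generating function of the per-slot counts and c is the service capacity per slot:

```python
import math

def decay_rate(counts, capacity, theta_hi=50.0, tol=1e-9):
    """Estimate the large-deviation decay rate delta with
    P(queue > b) ~ exp(-delta * b), by solving Lambda(theta) = capacity * theta
    for the empirical log moment generating function Lambda."""
    def lam(theta):
        m = max(counts)
        # log-sum-exp for numerical stability
        return m * theta + math.log(
            sum(math.exp(theta * (x - m)) for x in counts) / len(counts))
    # Lambda(theta)/theta rises from the mean rate toward the peak rate;
    # bisect for the root of Lambda(theta) - capacity * theta.
    lo, hi = tol, theta_hi
    if lam(hi) - capacity * hi < 0:
        return hi  # capacity above the peak rate observed in the data
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if lam(mid) - capacity * mid < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Toy data: slots carry 0 or 2 units with equal probability, capacity 1.5/slot.
delta = decay_rate([0, 2], 1.5)
```

For this toy data the root can be checked by hand (it is 2·ln u where u solves u^3 = u^2 + u + 1), so the bisection can be validated exactly.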
ABSTRACT: We present an algorithm for solving linear systems involving the probability or rate matrix for a Markov chain. It is based on a UL factorization but works only with a submatrix of the factor U. We demonstrate its utility on Erlang-B models as well as more complicated models of a telephone multiplexing system. Key words. Markov chains, fundamental matrix, decision process. 1. Introduction. Markov chain models can lend insight into the behavior of many physical systems, such as telephone networks, highway systems, and ATM switching networks. These models are based on properties of a matrix P whose entries depend on the probabilities of transition from one state to another, or on the arrival and departure rates for customers. The matrix P is nonnegative. If we define D to be a diagonal matrix whose diagonal entries are the row sums for P, then the matrix D − P has zero row sums. In other words, (D − P)e = 0, where e is the column vector of all ones. Thus, D − P has a ...
SIAM Journal on Matrix Analysis and Applications 04/1998; 19(2). DOI:10.1137/S0895479896301753 · 1.59 Impact Factor
ABSTRACT: When network demands are uncertain, a planner might design a network based on some nominal set of point‐to‐point demands, and later be faced with a different set of offered demands. To accommodate the offered demands, modification of the network may be required. Given this scenario, it seems natural to question how these modification costs might affect the overall cost. To address such questions, we study the effects of random demands on network costs. In this study, we design a network based on nominal demands, generate random demands based on the nominal demands, and then modify the designed network to carry the random demands. We generate the offered demands randomly from four different distributions. For each demand distribution we perform 300 simulations. This paper describes our observations.
Telecommunication Systems 01/1998; 10:409-421. DOI:10.1023/A:1019175202276 · 0.71 Impact Factor
ABSTRACT: Heyman (1992) examined three sequences giving the number of cells per frame of a VBR encoding of videoconferences (talking heads); these sequences were produced by hardware encoders using different coding algorithms. Each sequence had a gamma marginal distribution, and the autocorrelation function was geometric up to lags of at least 3 s, which includes all autocorrelation values larger than 0.1. We present an easy-to-simulate autoregressive process that has these properties. The model is tested by comparing the cell-loss rate produced when the data trace was used as the sole source in a simulation of an ATM switch to the cell-loss rates produced when traces generated by the model were used as the source.
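An easy-to-simulate process with a prescribed marginal and geometric autocorrelation is the DAR(1) model used in the authors' related video-modeling work (assumed here to be the process meant): keep the previous value with probability rho, otherwise redraw from the marginal, which yields lag-k autocorrelation exactly rho^k. A sketch with an illustrative gamma marginal:

```python
import random

def dar1(marginal_draw, rho, n, rng):
    """DAR(1): with probability rho repeat the previous value, otherwise
    draw afresh from the marginal. The marginal distribution is preserved
    and the lag-k autocorrelation is rho**k (geometric decay)."""
    x = [marginal_draw(rng)]
    for _ in range(n - 1):
        x.append(x[-1] if rng.random() < rho else marginal_draw(rng))
    return x

rng = random.Random(42)
# Gamma marginal; the shape/scale here are illustrative, not fitted to any trace.
trace = dar1(lambda r: r.gammavariate(4.0, 25.0), 0.9, 100_000, rng)
```

Because the fresh draws come straight from the marginal, fitting the model needs only the marginal distribution and one autocorrelation parameter, which is what makes it attractive for cells-per-frame sequences.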
ABSTRACT: Most of the studies of feedback-based flow and congestion control consider only persistent sources which always have data to send. However, with the rapid growth of Internet applications built on TCP/IP such as the World Wide Web and the standardization of traffic management schemes such as Available Bit Rate (ABR) in Asynchronous Transfer Mode (ATM) networks, it is essential to evaluate the performance of feedback-based protocols using traffic models which are specific to dominant applications. This paper presents a method for analysing feedback-based protocols with a Web-user-like input traffic where the source alternates between "transfer" periods followed by "think" periods. Our key results, which are presented for the TCP protocol, are: (1) The goodputs and the fraction of time that the system has some given number of transferring sources are insensitive to the distributions of transfer (file or page) sizes and think times except through the ratio of their means. Thus, apart from network round-trip times, only the ratio of average transfer sizes and think times of users need be known to size the network for achieving a specific quality of service. (2) The Engset model can be adapted to accurately compute goodputs for TCP and TCP over ATM, with different buffer management schemes. Though only these adaptations are given in the paper, the method based on the Engset model can be applied to analyze other feedback systems, such as ATM ABR, by finding a protocol-specific adaptation. Hence, the method we develop is useful not only for analysing TCP using a source model significantly different from the commonly used persistent sources, but also can be useful for analysing other feedback schemes. (3) Comparisons of simulated TCP traffic to measured Ethernet traffic show qualitatively similar autocorrelation when think times follow a Pareto distribution with infinite variance.
Also, the simulated and measured traffic have long range dependence. In this sense our traffic model, which purports to be Web-user-like, also agrees with measured traffic.
Proceedings of the 1997 ACM SIGMETRICS international conference on Measurement and modeling of computer systems; 06/1997
ABSTRACT: The Grassmann-Taksar-Heyman algorithm is a direct algorithm for computing the steady-state distribution of a finite irreducible Markov chain. We describe our experience in implementing this algorithm on a single-instruction multiple-data parallel processor computer. Our main conclusions are that a lower-level language has a performance advantage compared to Fortran, and that data storage is the limiting factor that determines the largest problem that can be solved. As a consequence, we devote considerable attention to storing a block tridiagonal transition matrix.
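For reference, here is a compact serial version of the GTH algorithm (the paper's subject is its parallel implementation; this sketch is the textbook sequential form, which uses no subtractions and so avoids catastrophic cancellation):

```python
def gth_steady_state(P):
    """Grassmann-Taksar-Heyman algorithm: steady-state distribution of a
    finite irreducible Markov chain with transition matrix P."""
    n = len(P)
    A = [row[:] for row in P]
    # Elimination phase: censor states n-1, n-2, ..., 1 out of the chain.
    for k in range(n - 1, 0, -1):
        s = sum(A[k][:k])          # probability of leaving k for a lower state
        for i in range(k):
            A[i][k] /= s
        for i in range(k):
            for j in range(k):
                A[i][j] += A[i][k] * A[k][j]
    # Back-substitution: unnormalized stationary vector, then normalize.
    pi = [1.0] + [0.0] * (n - 1)
    for k in range(1, n):
        pi[k] = sum(pi[i] * A[i][k] for i in range(k))
    total = sum(pi)
    return [p / total for p in pi]
```

On a small birth-death chain the result matches the detailed-balance solution to machine precision, the hallmark of the algorithm's numerical stability.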
ABSTRACT: Intuition suggests that a finite buffer limits the effect of traffic auto-correlations on the queue length. We investigate the extent to which finite buffers can, therefore, be expected to mitigate the effects of long-range dependence (LRD). With traffic sequences generated by the fractional autoregressive integrated moving average (f-ARIMA) models for LRD, and by AR models for short-range dependence (SRD), we investigate the traffic performance for a range of finite buffers, both for single and multiplexed streams. For design, the aim is to 'match' a given LRD auto-correlation function with a suitable SRD function whose performance provides, for a wide range of traffic intensities and buffer sizes, a conservative bound on the performance associated with the LRD function. The results suggest that by suitably 'dominating' an LRD auto-correlation function by an SRD function, one can obtain conservative performance bounds for a realistic range of traffic intensities and buffer sizes (delays). Also, in several cases, a 'cross-over' phenomenon is observed for cell-loss probabilities as the buffer size increases, i.e., the loss probabilities are smaller for LRD than for SRD for small buffers, with the converse true for large buffers. This suggests that finite buffers can, in some cases, counteract the effects of LRD in traffic arrivals, and thus enable conservative designs to be based on Markovian traffic models.
Global Telecommunications Conference, 1996. GLOBECOM '96. 'Communications: The Key to Global Prosperity'; 12/1996
ABSTRACT: The authors explore the influence of long-range dependence in broadband traffic engineering. The classification of stochastic processes {X_t} into those with short- or long-range dependence is based on the asymptotic properties of the variance of the sum S_m = X_1 + X_2 + ... + X_m. Suppose this process describes the number of packets (or ATM cells) that arrive at a buffer; X_t is the number that arrive in the t-th time slice (e.g., 10 ms). We use a generic buffer model to show how the distribution of S_m (for all values of m) determines the buffer occupancy. From this model we show that long-range dependence does not affect the buffer occupancy when the busy periods are not large. Numerical experiments show this property is present when data from four video conferences and two entertainment video sequences (which have long-range dependence) are used as the arrival process, even when the transmitting times are long enough to make the probability of buffer overflow 0.07. We generated sample paths from Markov chain models of the video traffic (these have short-range dependence). Various operating characteristics, computed over a wide range of loadings, closely agree when the data trace and the Markov chain paths are used to drive the model. From this, we conclude that long-range dependence is not a crucial property in determining the buffer behavior of variable bit rate (VBR) video sources.
ABSTRACT: Traffic from video services is expected to be a substantial portion of the traffic carried by emerging broadband integrated networks. For variable bit rate (VBR) coded video, statistical source models are needed to design networks that achieve acceptable picture quality at minimum cost and to design traffic shaping and control mechanisms. For video teleconference traffic, Heyman et al. (1992) showed that traffic is sufficiently accurately characterized by a multistate Markov chain model that can be derived from three traffic parameters (mean, correlation, and variance). The present authors describe modeling results for sequences with frequent scene changes (the previously studied video teleconferences have very little scene variation) such as entertainment television, news, and sports broadcasts. The authors analyze 11 long sequences of broadcast video traffic data. Unlike video teleconferences, the different sequences studied have different details regarding distributions of cells per frame. The authors present source models applicable to the different sequences and evaluate their accuracy as predictors of cell losses in asynchronous transfer mode (ATM) networks. The modeling approach is the same for all of the sequences, but use of a single model based on a few physically meaningful parameters and applicable to all sequences does not seem to be possible.
ABSTRACT: We consider the application of importance sampling in steady-state simulations of finite Markov chains. We show that, for a large class of performance measures, there is a choice of the alternative transition matrix for which the ratio of the variance of the importance sampling estimator to the variance of the naive simulation estimator converges to zero as the sample path length goes to infinity. Obtaining this ‘optimal’ transition matrix involves computing the performance measure of interest, so the optimal matrix cannot be computed in precisely those situations where simulation is required to estimate steady-state performance. However, our results show that alternative transition matrices of the form Q=P+E/T, where P is the original transition matrix and T is the sample path length, can be expected to provide good results. Moreover, we provide an iterative algorithm for obtaining alternative transition matrices of this form that converge to the optimal matrix as the number of iterations increases, and present an example that shows that spending some computer time iterating this algorithm and then conducting the simulation with the resulting alternative transition matrix may provide considerable variance reduction when compared to naive simulation.
Advances in Applied Probability 03/1996; 28(1). DOI:10.2307/1427916 · 0.71 Impact Factor
ABSTRACT: We prove that every infinite-state stochastic matrix P, say, that is irreducible and consists of positive-recurrent states can be represented in the form I − P = (A − I)(B − S), where A is strictly upper-triangular, B is strictly lower-triangular, and S is diagonal. Moreover, the elements of A are expected values of random variables that we will specify, and the elements of B and S are probabilities of events that we will specify. The decomposition can be used to obtain steady-state probabilities, mean first-passage times and the fundamental matrix.
Journal of Applied Probability 12/1995; 32(4). DOI:10.2307/3215202 · 0.59 Impact Factor
ABSTRACT: The main contributions of this paper are two-fold. First, we prove fundamental, similarly behaving lower and upper bounds, and give an approximation based on the bounds, which is effective for analyzing ATM multiplexers, even when the traffic has many, possibly heterogeneous, sources and their models are of high dimension. Second, we apply our analytic approximation to statistical models of video teleconference traffic, obtain the multiplexing system's capacity as determined by the number of admissible sources for given cell-loss probability, buffer size and trunk bandwidth, and, finally, compare with results from simulations, which are driven by actual data from coders. The results are surprisingly close. Our bounds are based on large deviations theory. The main assumption is that the sources are Markovian and time-reversible. Our approximation to the steady-state buffer distribution is called Chernoff-dominant eigenvalue since one parameter is obtained from Chernoff's theorem and the other is the system's dominant eigenvalue. Fast, effective techniques are given for their computation. In our application we process the output of variable bit rate coders to obtain DAR(1) source models which, while of high dimension, require only knowledge of the mean, variance, and correlation. We require the cell-loss probability not to exceed 10^-6, trunk bandwidth ranges from 45 to 150 Mb/s, buffer sizes are such that maximum delays range from 1 to 60 ms, and the number of coder-sources ranges from 15 to 150. Even for the largest systems, the time for analysis is a fraction of a second, while each simulation takes many hours. Thus, the real-time administration of admission control based on our analytic techniques is feasible.
IEEE Journal on Selected Areas in Communications 09/1995; 13(6-13):1004 - 1016. DOI:10.1109/49.400656 · 3.45 Impact Factor
ABSTRACT: Associated with every stochastic matrix is another matrix called the fundamental matrix. The fundamental matrix can be used to obtain mean first-passage times and other interesting operating characteristics. The fundamental matrix is defined as a matrix inverse, and computing it from the definition can be fraught with numerical errors. We establish a new representation of the fundamental matrix where matrix inversion is replaced by multiplying and then adding a pair of matrices. The representation requires the solution of a system of linear equations, and we show that it can be done via back and forward substitution from numbers that have already been calculated when the GTH algorithm is used to compute the steady-state probabilities. An algorithm based on this representation is given. The time complexity of the faster implementation is 75% of the time complexity of using Gaussian elimination.
SIAM Journal on Matrix Analysis and Applications 07/1995; 16(3). DOI:10.1137/S0895479893258814 · 1.59 Impact Factor
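The representation in the abstract avoids explicit inversion; for comparison, the textbook route computes the fundamental matrix Z = (I − P + e·pi^T)^{-1} directly and reads off mean first-passage times as m_ij = (z_jj − z_ij)/pi_j. A small self-contained sketch of that baseline (the two-state chain is illustrative):

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for small dense systems."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def fundamental_matrix(P, pi):
    """Z = (I - P + e pi^T)^{-1}, computed column by column."""
    n = len(P)
    A = [[(1.0 if i == j else 0.0) - P[i][j] + pi[j] for j in range(n)]
         for i in range(n)]
    cols = [solve(A, [1.0 if i == j else 0.0 for i in range(n)]) for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def mean_first_passage(P, pi):
    """m[i][j] = (Z[j][j] - Z[i][j]) / pi[j] for i != j."""
    Z = fundamental_matrix(P, pi)
    n = len(P)
    return [[0.0 if i == j else (Z[j][j] - Z[i][j]) / pi[j] for j in range(n)]
            for i in range(n)]

P = [[0.9, 0.1], [0.2, 0.8]]
pi = [2.0 / 3.0, 1.0 / 3.0]
m = mean_first_passage(P, pi)
```

For this chain the passage time from state 0 to state 1 is geometric with success probability 0.1, so m[0][1] = 10 (and m[1][0] = 5), which the fundamental-matrix route reproduces.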
ABSTRACT: In the simulation of Markov chains, importance sampling involves replacing the original transition matrix, say P, with a suitably chosen transition matrix Q that tends to visit the states of interest more frequently. The likelihood ratio of P relative to Q is an important random variable in the importance sampling method. It always has expectation one, and for any interesting pair of transition matrices P and Q, there is a sample path length that causes the likelihood ratio to be close to zero with probability close to one. This may cause the variance of the importance sampling estimator to be larger than the variance of the traditional estimator. We develop sufficient conditions for ensuring the tightness of the distribution of the logarithm of the likelihood ratio for all sample path lengths, and we show that when these conditions are satisfied, the likelihood ratio is approximately lognormally distributed with expected value one. These conditions can be used to eliminate some choices of the alternative transition matrix Q that are likely to result in a variance increase. We also show that if the likelihood ratio is to remain well behaved for all sample path lengths, the alternative transition matrix Q has to approach the original transition matrix P as the sample path length increases. The practical significance of this result is that importance sampling can be difficult to apply successfully in simulations that involve long sample paths.
Operations Research 06/1995; 43(3):509-519. DOI:10.1287/opre.43.3.509 · 1.74 Impact Factor
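The likelihood-ratio behavior described above is easy to observe empirically: the ratio always has expectation one, yet its distribution is skewed, with many paths giving values below one. A sketch with an illustrative two-state chain and a mildly tilted alternative Q:

```python
import random

def likelihood_ratio_path(P, Q, length, rng):
    """Simulate one path of a Markov chain under Q (starting in state 0)
    and return the likelihood ratio of P relative to Q along the path:
    the product of P[i][j] / Q[i][j] over the transitions taken."""
    state, lr = 0, 1.0
    for _ in range(length):
        u, cum = rng.random(), 0.0
        for j, q in enumerate(Q[state]):
            cum += q
            if u < cum:
                lr *= P[state][j] / q
                state = j
                break
    return lr

P = [[0.9, 0.1], [0.2, 0.8]]
Q = [[0.8, 0.2], [0.2, 0.8]]   # tilted so state 1 is visited more often
rng = random.Random(0)
lrs = [likelihood_ratio_path(P, Q, 10, rng) for _ in range(50_000)]
```

The sample mean of the ratios stays near one, while individual paths spread well above and below it; lengthening the paths widens this spread, which is the variance hazard the abstract warns about for long sample paths.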