Publications (36), 31.36 Total impact

Article: Sizing Backbone Internet Links
ABSTRACT: One of the distinguishing features of a backbone link is that it is designed to carry traffic from a large number of end users. This results in a Normal distribution for the number of bytes or packets that arrive in a fixed-length time interval. Based on this observation, which is substantiated by data analysis, we present a simple model for the steady-state loss probability that can be solved in closed form. This model assumes that there is no buffer, so that issues raised by the correlation of counts that is characteristic of packet traffic are bypassed. The longest interval that captures the relevant statistical fluctuations of backbone traffic is one second. Data collection on live commercial networks is costly, so byte and packet counts are usually collected over much longer time intervals; five minutes is a lower bound. This creates no problem in estimating the mean of the Normal distribution, but it makes direct estimation of the variance for one-second counts infeasible. Routers collect flow data; flows are analogous to "calls" in telephony. By modeling the number of active flows as an M/G/∞ queue and assuming that packets in a flow are spread uniformly in time, an equation for the variance of (say) one-second counts in terms of measured quantities is derived. The efficacy of this formula is demonstrated by applying it to data.
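As a rough illustration of the bufferless idea in the abstract above, the sketch below computes the fraction of offered traffic that exceeds link capacity when the per-interval load is Normal. This is an assumption-laden sketch, not the paper's model: the loss measure (expected excess over capacity divided by the mean load), the function name `normal_loss_fraction`, and its parameters are all chosen here for illustration. It uses the standard identity E[(X - c)+] = σ·φ(z) - (c - μ)·(1 - Φ(z)) with z = (c - μ)/σ.

```python
import math


def normal_loss_fraction(mean, std, capacity):
    """Fraction of offered load lost on a bufferless link (illustrative).

    Assumes the load per interval is Normal(mean, std**2) and that any
    load above `capacity` in an interval is dropped. All three arguments
    are in the same units (e.g. bytes per one-second interval).
    """
    if mean <= 0 or std <= 0:
        raise ValueError("mean and std must be positive")
    z = (capacity - mean) / std
    # Standard Normal density and upper tail probability at z.
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    tail = 0.5 * math.erfc(z / math.sqrt(2.0))
    # Expected load in excess of capacity, E[(X - c)+].
    expected_excess = std * pdf - (capacity - mean) * tail
    return expected_excess / mean
```

With this measure, raising `capacity` relative to `mean` in units of `std` drives the loss fraction down very quickly, which is why an estimate of the one-second variance matters for sizing.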
ABSTRACT: We start with the premise, and provide evidence that it is valid, that a Markov-modulated Poisson process (MMPP) is a good model for Internet traffic at the packet/byte level. We present an algorithm to estimate the parameters and size of a discrete MMPP (D-MMPP) from a data trace. This algorithm requires only two passes through the data. In tandem-network queueing models, the input to a downstream queue is the output from an upstream queue, so the arrival rate is limited by the rate of the upstream queue. We show how to modify the MMPP describing the arrivals to the upstream queue to approximate this effect. To extend this idea to networks that are not tandem, we show how to approximate the superposition of MMPPs without encountering the state-space explosion that occurs in exact computations. Numerical examples that demonstrate the accuracy of these methods are given. We also present a method to convert our estimated D-MMPP to a continuous-time MMPP, which is used as the arrival process in a matrix-analytic queueing model.
ABSTRACT: With the rapid growth of Internet applications built on TCP/IP such as the World Wide Web and the standardization of traffic management schemes such as Available Bit Rate (ABR) in Asynchronous Transfer Mode (ATM) networks, it is essential to evaluate the performance of feedback-based protocols using traffic models which are specific to dominant applications. This paper presents a method for analyzing feedback-based protocols with a Web-user-like input traffic where the source alternates between ‘transfer’ periods followed by ‘think’ periods. Our key results, which are presented for the TCP protocol, are as follows: (1) When the round-trip time is the same for all users, the goodputs and the fraction of time that the system has some given number of transferring sources are insensitive to the distributions of transfer (file or page) sizes and think times except through the ratio of their means. Thus, apart from network round-trip times, only the ratio of average transfer sizes and think times of users need be known to size the network for achieving a specific quality of service. (2) The Engset model can be adapted to accurately compute goodputs for TCP and TCP over ATM, with different buffer management schemes. Though only these adaptations are given in the paper, the method based on the Engset model can be applied to analyze other feedback systems, such as ATM ABR, by finding a protocol-specific adaptation. Hence, the method we develop is useful not only for analyzing TCP using a source model significantly different from the commonly used persistent sources, but also for analyzing other feedback schemes. (3) Comparisons of simulated TCP traffic to measured Ethernet traffic show qualitatively similar second-order autocorrelation when think times follow a Pareto distribution with infinite variance. Also, the simulated and measured traffic have long-range dependence. In this sense our traffic model, which purports to be Web-user-like, also agrees with measured data traffic.
ABSTRACT: Despite the fact that most of today's Internet traffic is transmitted via the TCP protocol, the performance behavior of networks with TCP traffic is still not well understood. Recent research activities have led to a number of performance models for TCP traffic, but the degree of accuracy of these models in realistic scenarios is still questionable. This paper provides a comparison of the results (in terms of average throughput per connection) of three different 'analytic' TCP models: (I) the throughput formula in [Padhye et al. 98], (II) the modified Engset model of [Heyman et al. 97], and (III) the analytic TCP queueing model of [Schwefel 01], which is a packet-based extension of (II). Results for all three models are computed for a scenario of N identical TCP sources that transmit data in individual TCP connections of stochastically varying size. The results for the average throughput per connection in the analytic models are compared with simulations of detailed TCP behavior. All of the analytic models are expected to show deficiencies in certain scenarios, since they neglect highly influential parameters of the actual simulation model: the approach of Models (I) and (II) only indirectly considers queueing in bottleneck routers, and in certain scenarios those models are not able to adequately describe the impact of buffer space, either qualitatively or quantitatively. Furthermore, (II) is insensitive to the actual distribution of the connection sizes. As a consequence, their prediction would also be insensitive to so-called long-range dependent (LRD) properties in the traffic that are caused by heavy-tailed connection size distributions. The simulation results show that such properties cannot be neglected for certain network topologies: LRD properties can even have a counterintuitive impact on the average goodput, namely the goodput can be higher for small buffer sizes. © (2001) SPIE, The International Society for Optical Engineering.
ABSTRACT: Measurements of file sizes transported on the World Wide Web have led some researchers to propose describing them by probability distributions with infinite variance. The M/G/1 queue often arises as a performance model for components of the WWW, and the service times correspond to file sizes; the infinite variance of the file sizes becomes the variance of the service times. In this paper the effects of very large service-time variances on some performance measures for the M/G/1 queue are explored via numerical examples and analytic arguments. The first main conclusion is that it is the form of the service-time distribution over a wide finite range that controls the steady-state queueing performance, so distributions with very large finite variances can yield the same behavior as distributions with infinite variances. The second main conclusion is that very large service-time variances cause the rate of approach to steady-state performance to be so slow that steady-state performance measures are not likely to be of engineering interest. A third conclusion is that a common device of using the probability that the work in an infinite queue exceeds the level b to approximate the probability that a finite buffer of size b overflows may be very inaccurate. The approximation works better for the fat-tailed distributions studied than for the others. The most important engineering implication of these results is that when service times have a very large variance (such as file transfers on the WWW), performance criteria other than steady-state measures have to be used.
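To make the role of the service-time variance concrete, the sketch below evaluates the standard Pollaczek-Khinchine formula for the steady-state mean waiting time in an M/G/1 queue, W = λ·E[S²] / (2(1 - ρ)). This textbook formula is brought in here for illustration and is not taken from the abstract; the function name `mg1_mean_wait` and its parameters are hypothetical.

```python
def mg1_mean_wait(arrival_rate, mean_service, service_variance):
    """Steady-state mean time in queue for M/G/1 (Pollaczek-Khinchine).

    arrival_rate     : Poisson arrival rate (lambda)
    mean_service     : mean service time E[S]
    service_variance : variance of the service time Var[S]
    """
    rho = arrival_rate * mean_service  # server utilization
    if not 0.0 <= rho < 1.0:
        raise ValueError("utilization must be in [0, 1) for a steady state")
    # E[S^2] = Var[S] + E[S]^2
    second_moment = service_variance + mean_service ** 2
    return arrival_rate * second_moment / (2.0 * (1.0 - rho))
```

The steady-state mean grows linearly in the variance and is infinite when the variance is, which is one way to see why the abstract argues for criteria other than steady-state measures in that regime.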
ABSTRACT: Data teletraffic is characterized by bursty arrival processes. Performance models are characterized by a desire to know under what circumstances the probability that an arrival finds a full input buffer is very small. In this paper I examine how four models proposed in the literature perform on two data sets of local area network traffic. Among my conclusions are (1) the protocol governing the data transmission may have a substantial effect on the statistical properties of the packet stream, (2) the probability that a finite buffer of size b overflows may not be adequately approximated by the probability that an infinite buffer has at least b packets in it, and (3) a data-based estimate of the large-deviation rate function does the best job of estimating packet loss on these data sets. This method may overestimate the loss rate by several orders of magnitude, so there is room for further refinements.
ABSTRACT: When network demands are uncertain, a planner might design a network based on some nominal set of point‐to‐point demands, and later be faced with a different set of offered demands. To accommodate the offered demands, modification of the network may be required. Given this scenario, it seems natural to question how these modification costs might affect the overall cost. To address such questions, we study the effects of random demands on network costs. In this study, we design a network based on nominal demands, generate random demands based on the nominal demands, and then modify the designed network to carry the random demands. We generate the offered demands randomly from four different distributions. For each demand distribution we perform 300 simulations. This paper describes our observations.
ABSTRACT: We present an algorithm for solving linear systems involving the probability or rate matrix for a Markov chain. It is based on a UL factorization but works only with a submatrix of the factor U. We demonstrate its utility on Erlang-B models as well as more complicated models of a telephone multiplexing system. Key words: Markov chains, fundamental matrix, decision process. 1. Introduction. Markov chain models can lend insight into the behavior of many physical systems, such as telephone networks, highway systems, and ATM switching networks. These models are based on properties of a matrix P whose entries depend on the probabilities of transition from one state to another, or on the arrival and departure rates for customers. The matrix P is nonnegative. If we define D to be a diagonal matrix whose diagonal entries are the row sums for P, then the matrix D − P has zero row sums. In other words, (D − P)e = 0, where e is the column vector of all ones. Thus, D − P has a ...
ABSTRACT: Heyman (1992) examined three sequences giving the number of cells per frame of a VBR encoding of videoconferences (talking heads); these sequences were produced by hardware encoders using different coding algorithms. Each sequence had a gamma marginal distribution, and the autocorrelation function was geometric up to lags of at least 3 s, which includes all autocorrelation values larger than 0.1. We present an easy-to-simulate autoregressive process that has these properties. The model is tested by comparing the cell-loss rate produced when the data trace was used as the sole source in a simulation of an ATM switch to the cell-loss rates produced when traces generated by the model were used as the source.
Conference Paper: A New Method for Analysing Feedback-Based Protocols with Applications to Engineering Web Traffic over the Internet.
ABSTRACT: Most of the studies of feedback-based flow and congestion control consider only persistent sources which always have data to send. However, with the rapid growth of Internet applications built on TCP/IP such as the World Wide Web and the standardization of traffic management schemes such as Available Bit Rate (ABR) in Asynchronous Transfer Mode (ATM) networks, it is essential to evaluate the performance of feedback-based protocols using traffic models which are specific to dominant applications. This paper presents a method for analysing feedback-based protocols with a Web-user-like input traffic where the source alternates between "transfer" periods followed by "think" periods. Our key results, which are presented for the TCP protocol, are: (1) The goodputs and the fraction of time that the system has some given number of transferring sources are insensitive to the distributions of transfer (file or page) sizes and think times except through the ratio of their means. Thus, apart from network round-trip times, only the ratio of average transfer sizes and think times of users need be known to size the network for achieving a specific quality of service. (2) The Engset model can be adapted to accurately compute goodputs for TCP and TCP over ATM, with different buffer management schemes. Though only these adaptations are given in the paper, the method based on the Engset model can be applied to analyze other feedback systems, such as ATM ABR, by finding a protocol-specific adaptation. Hence, the method we develop is useful not only for analysing TCP using a source model significantly different from the commonly used persistent sources, but also for analysing other feedback schemes. (3) Comparisons of simulated TCP traffic to measured Ethernet traffic show qualitatively similar autocorrelation when think times follow a Pareto distribution with infinite variance. Also, the simulated and measured traffic have long-range dependence. In this sense our traffic model, which purports to be Web-user-like, also agrees with measured traffic.
ABSTRACT: The Grassmann-Taksar-Heyman algorithm is a direct algorithm for computing the steady-state distribution of a finite irreducible Markov chain. We describe our experience in implementing this algorithm on a single-instruction multiple-data parallel processor computer. Our main conclusions are that a lower-level language has a performance advantage compared to Fortran, and that data storage is the limiting factor that determines the largest problem that can be solved. As a consequence, we devote considerable attention to storing a block tridiagonal transition matrix.
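For readers unfamiliar with the Grassmann-Taksar-Heyman (GTH) algorithm named above, here is a minimal serial sketch for a dense transition matrix. It is not the parallel, block-tridiagonal implementation the abstract describes; the function name `gth_steady_state` and the list-of-lists input format are assumptions made for this example. The defining feature of GTH is that each pivot is computed as a sum of nonnegative off-diagonal entries, so the elimination involves no subtractions.

```python
def gth_steady_state(P):
    """Stationary distribution of a finite irreducible Markov chain.

    P is a row-stochastic matrix given as a list of lists. The input is
    not modified; the elimination works on a copy.
    """
    n = len(P)
    A = [[float(x) for x in row] for row in P]

    # Elimination: remove states n-1, n-2, ..., 1 one at a time.
    for k in range(n - 1, 0, -1):
        # Probability of moving from state k to a lower-numbered state.
        # Summing off-diagonal entries avoids computing 1 - A[k][k].
        s = sum(A[k][j] for j in range(k))
        for i in range(k):
            A[i][k] /= s
        # Fold paths that pass through state k into the remaining states.
        for i in range(k):
            for j in range(k):
                A[i][j] += A[i][k] * A[k][j]

    # Back-substitution: build unnormalized probabilities, then normalize.
    pi = [0.0] * n
    pi[0] = 1.0
    for k in range(1, n):
        pi[k] = sum(pi[i] * A[i][k] for i in range(k))
    total = sum(pi)
    return [x / total for x in pi]
```

A quick check of any result is that the returned vector `pi` sums to one and satisfies `pi = pi P` up to rounding.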
ABSTRACT: Intuition suggests that a finite buffer limits the effect of traffic autocorrelations on the queue length. We investigate the extent to which finite buffers can, therefore, be expected to mitigate the effects of long-range dependence (LRD). With traffic sequences generated by the fractional autoregressive integrated moving average (fARIMA) models for LRD, and by AR models for short-range dependence (SRD), we investigate the traffic performance for a range of finite buffers, both for single and multiplexed streams. For design, the aim is to 'match' a given LRD autocorrelation function with a suitable SRD function whose performance provides, for a wide range of traffic intensities and buffer sizes, a conservative bound on the performance associated with the LRD function. The results suggest that by suitably 'dominating' an LRD autocorrelation function by an SRD function, one can obtain conservative performance bounds for a realistic range of traffic intensities and buffer sizes (delays). Also, in several cases, a 'crossover' phenomenon is observed for cell-loss probabilities as the buffer size increases, i.e., the loss probabilities are smaller for LRD than for SRD for small buffers, with the converse true for large buffers. This suggests that finite buffers can, in some cases, counteract the effects of LRD in traffic arrivals, and thus enable conservative designs to be based on Markovian traffic models.
ABSTRACT: The authors explore the influence of long-range dependence in broadband traffic engineering. The classification of stochastic processes {X_t} into those with short- or long-range dependence is based on the asymptotic properties of the variance of the sum S_m = X_1 + X_2 + ··· + X_m. Suppose this process describes the number of packets (or ATM cells) that arrive at a buffer; X_t is the number that arrive in the tth time slice (e.g., 10 ms). We use a generic buffer model to show how the distribution of S_m (for all values of m) determines the buffer occupancy. From this model we show that long-range dependence does not affect the buffer occupancy when the busy periods are not large. Numerical experiments show this property is present when data from four video conferences and two entertainment video sequences (which have long-range dependence) are used as the arrival process, even when the transmitting times are long enough to make the probability of buffer overflow 0.07. We generated sample paths from Markov chain models of the video traffic (these have short-range dependence). Various operating characteristics, computed over a wide range of loadings, closely agree when the data trace and the Markov chain paths are used to drive the model. From this, we conclude that long-range dependence is not a crucial property in determining the buffer behavior of variable bit rate (VBR) video sources.
ABSTRACT: Traffic from video services is expected to be a substantial portion of the traffic carried by emerging broadband integrated networks. For variable bit rate (VBR) coded video, statistical source models are needed to design networks that achieve acceptable picture quality at minimum cost and to design traffic shaping and control mechanisms. For video teleconference traffic, Heyman et al. (1992) showed that traffic is sufficiently accurately characterized by a multistate Markov chain model that can be derived from three traffic parameters (mean, correlation, and variance). The present authors describe modeling results for sequences with frequent scene changes (the previously studied video teleconferences have very little scene variation) such as entertainment television, news, and sports broadcasts. The authors analyze 11 long sequences of broadcast video traffic data. Unlike video teleconferences, the different sequences studied have different details regarding distributions of cells per frame. The authors present source models applicable to the different sequences and evaluate their accuracy as predictors of cell losses in asynchronous transfer mode (ATM) networks. The modeling approach is the same for all of the sequences, but use of a single model based on a few physically meaningful parameters and applicable to all sequences does not seem to be possible.
ABSTRACT: We consider the application of importance sampling in steady-state simulations of finite Markov chains. We show that, for a large class of performance measures, there is a choice of the alternative transition matrix for which the ratio of the variance of the importance sampling estimator to the variance of the naive simulation estimator converges to zero as the sample path length goes to infinity. Obtaining this ‘optimal’ transition matrix involves computing the performance measure of interest, so the optimal matrix cannot be computed in precisely those situations where simulation is required to estimate steady-state performance. However, our results show that alternative transition matrices of the form Q = P + E/T, where P is the original transition matrix and T is the sample path length, can be expected to provide good results. Moreover, we provide an iterative algorithm for obtaining alternative transition matrices of this form that converge to the optimal matrix as the number of iterations increases, and present an example that shows that spending some computer time iterating this algorithm and then conducting the simulation with the resulting alternative transition matrix may provide considerable variance reduction when compared to naive simulation.
ABSTRACT: We prove that every infinite-state stochastic matrix, P say, that is irreducible and consists of positive-recurrent states can be represented in the form I − P = (A − I)(B − S), where A is strictly upper-triangular, B is strictly lower-triangular, and S is diagonal. Moreover, the elements of A are expected values of random variables that we will specify, and the elements of B and S are probabilities of events that we will specify. The decomposition can be used to obtain steady-state probabilities, mean first-passage times, and the fundamental matrix.
ABSTRACT: The main contributions of this paper are twofold. First, we prove fundamental, similarly behaving lower and upper bounds, and give an approximation based on the bounds, which is effective for analyzing ATM multiplexers, even when the traffic has many, possibly heterogeneous, sources and their models are of high dimension. Second, we apply our analytic approximation to statistical models of video teleconference traffic, obtain the multiplexing system's capacity as determined by the number of admissible sources for given cell-loss probability, buffer size, and trunk bandwidth, and, finally, compare with results from simulations, which are driven by actual data from coders. The results are surprisingly close. Our bounds are based on large deviations theory. The main assumption is that the sources are Markovian and time-reversible. Our approximation to the steady-state buffer distribution is called Chernoff-dominant eigenvalue since one parameter is obtained from Chernoff's theorem and the other is the system's dominant eigenvalue. Fast, effective techniques are given for their computation. In our application we process the output of variable bit rate coders to obtain DAR(1) source models which, while of high dimension, require only knowledge of the mean, variance, and correlation. We require cell-loss probability not to exceed 10^-6, trunk bandwidth ranges from 45 to 150 Mb/s, buffer sizes are such that maximum delays range from 1 to 60 ms, and the number of coder-sources ranges from 15 to 150. Even for the largest systems, the time for analysis is a fraction of a second, while each simulation takes many hours. Thus, the real-time administration of admission control based on our analytic techniques is feasible.
ABSTRACT: Associated with every stochastic matrix is another matrix called the fundamental matrix. The fundamental matrix can be used to obtain mean first-passage times and other interesting operating characteristics. The fundamental matrix is defined as a matrix inverse, and computing it from the definition can be fraught with numerical errors. We establish a new representation of the fundamental matrix where matrix inversion is replaced by multiplying and then adding a pair of matrices. The representation requires the solution of a system of linear equations, and we show that this can be done via back and forward substitution from numbers that have already been calculated when the GTH algorithm is used to compute the steady-state probabilities. An algorithm based on this representation is given. The time complexity of the faster implementation is 75% of the time complexity of using Gaussian elimination.
ABSTRACT: In the simulation of Markov chains, importance sampling involves replacing the original transition matrix, say P, with a suitably chosen transition matrix Q that tends to visit the states of interest more frequently. The likelihood ratio of P relative to Q is an important random variable in the importance sampling method. It always has expectation one, and for any interesting pair of transition matrices P and Q, there is a sample path length that causes the likelihood ratio to be close to zero with probability close to one. This may cause the variance of the importance sampling estimator to be larger than the variance of the traditional estimator. We develop sufficient conditions for ensuring the tightness of the distribution of the logarithm of the likelihood ratio for all sample path lengths, and we show that when these conditions are satisfied, the likelihood ratio is approximately lognormally distributed with expected value one. These conditions can be used to eliminate some choices of the alternative transition matrix Q that are likely to result in a variance increase. We also show that if the likelihood ratio is to remain well behaved for all sample path lengths, the alternative transition matrix Q has to approach the original transition matrix P as the sample path length increases. The practical significance of this result is that importance sampling can be difficult to apply successfully in simulations that involve long sample paths.
Publication Stats
2k  Citations  
31.36  Total Impact Points  
Institutions

1997-2003

AT&T Labs
Austin, Texas, United States


1996

University of Wisconsin–Madison
Madison, Wisconsin, United States 
Texas A&M University
College Station, Texas, United States
