Daniel P. Heyman

AT&T Labs, Austin, Texas, United States

Publications (37) · 32.05 Total Impact Points

  • Daniel P. Heyman
    ABSTRACT: One of the distinguishing features of a backbone link is that it is designed to carry traffic from a large number of end users. This results in a Normal distribution for the number of bytes or packets that arrive in a fixed-length time interval. Based on this observation, which is substantiated by data analysis, we present a simple model for the steady-state loss probability that can be solved in closed form. This model assumes that there is no buffer, so that issues raised by the correlation of counts that is characteristic of packet traffic are bypassed. The longest interval that captures the relevant statistical fluctuations of backbone traffic is one second. Data collection on live commercial networks is costly, so byte and packet counts are usually collected over much longer time intervals; five minutes is a lower bound. This creates no problem in estimating the mean of the Normal distribution, but it makes direct estimation of the variance for one-second counts infeasible. Routers collect flow data; flows are analogous to "calls" in telephony. By modeling the number of active flows as an M/G/∞ queue and assuming that packets in a flow are spread uniformly in time, an equation for the variance of (say) one-second counts in terms of measured quantities is derived. The efficacy of this formula is demonstrated by applying it to data.
    Operations Research 08/2005; 53(4):575-585. DOI:10.1287/opre.1040.0192 · 1.50 Impact Factor
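Since the entry above reduces backbone-link dimensioning to a Normal model with no buffer, the loss fraction has a standard closed form via the Normal partial expectation. A minimal sketch, assuming hypothetical one-second mean/variance and capacity values; the paper's exact closed form and its M/G/∞-based variance equation may differ in detail.

```python
# Bufferless loss under the Normal-traffic assumption described above.
# mu, sigma2, capacity are hypothetical per-second inputs.
from scipy.stats import norm


def bufferless_loss_fraction(mu: float, sigma2: float, capacity: float) -> float:
    """Fraction of offered bytes lost when per-interval demand A ~ N(mu, sigma2)
    exceeds capacity C and there is no buffer: E[(A - C)^+] / E[A]."""
    sigma = sigma2 ** 0.5
    z = (capacity - mu) / sigma
    # Standard Normal partial expectation E[(A - C)^+].
    overflow = sigma * (norm.pdf(z) - z * (1.0 - norm.cdf(z)))
    return overflow / mu


# Example: link provisioned 25% above the mean one-second load.
print(bufferless_loss_fraction(mu=1e6, sigma2=4e10, capacity=1.25e6))  # ~1e-2
```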
  • Source
    D.P. Heyman, D. Lucantoni
    ABSTRACT: We start with the premise, and provide evidence that it is valid, that a Markov-modulated Poisson process (MMPP) is a good model for Internet traffic at the packet/byte level. We present an algorithm to estimate the parameters and size of a discrete MMPP (D-MMPP) from a data trace. This algorithm requires only two passes through the data. In tandem-network queueing models, the input to a downstream queue is the output from an upstream queue, so the arrival rate is limited by the rate of the upstream queue. We show how to modify the MMPP describing the arrivals to the upstream queue to approximate this effect. To extend this idea to networks that are not tandem, we show how to approximate the superposition of MMPPs without encountering the state-space explosion that occurs in exact computations. Numerical examples that demonstrate the accuracy of these methods are given. We also present a method to convert our estimated D-MMPP to a continuous-time MMPP, which is used as the arrival process in a matrix-analytic queueing model.
    IEEE/ACM Transactions on Networking 01/2004; DOI:10.1109/TNET.2003.820252 · 1.99 Impact Factor
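For context on the superposition step in the entry above: the exact superposition of independent MMPPs uses Kronecker sums, so the modulating state spaces multiply, which is precisely the explosion the paper's approximation avoids. A minimal sketch with two hypothetical 2-state MMPPs; the paper's approximation method itself is not reproduced here.

```python
# Exact superposition of two continuous-time MMPPs via Kronecker sums.
import numpy as np


def kron_sum(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Kronecker sum: A (+) B = A x I + I x B."""
    return np.kron(a, np.eye(b.shape[0])) + np.kron(np.eye(a.shape[0]), b)


# Two small hypothetical 2-state MMPPs: generator Q_i, Poisson rate matrix L_i.
Q1 = np.array([[-0.5, 0.5], [0.2, -0.2]])
L1 = np.diag([10.0, 40.0])
Q2 = np.array([[-1.0, 1.0], [0.3, -0.3]])
L2 = np.diag([5.0, 25.0])

Q = kron_sum(Q1, Q2)   # 4x4: the modulating states multiply (2 * 2)
L = kron_sum(L1, L2)   # the arrival rate is the sum of the component rates
print(Q.shape, np.diag(L))  # rates: 15, 35, 45, 65
```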
  • ABSTRACT: With the rapid growth of Internet applications built on TCP/IP, such as the World Wide Web, and the standardization of traffic management schemes such as Available Bit Rate (ABR) in Asynchronous Transfer Mode (ATM) networks, it is essential to evaluate the performance of feedback-based protocols using traffic models that are specific to dominant applications. This paper presents a method for analyzing feedback-based protocols with Web-user-like input traffic, where the source alternates between ‘transfer’ periods and ‘think’ periods. Our key results, which are presented for the TCP protocol, are as follows: (1) When the round-trip time is the same for all users, the goodputs and the fraction of time that the system has some given number of transferring sources are insensitive to the distributions of transfer (file or page) sizes and think times except through the ratio of their means. Thus, apart from network round-trip times, only the ratio of users' average transfer sizes to average think times need be known to size the network for achieving a specific quality of service. (2) The Engset model can be adapted to accurately compute goodputs for TCP and TCP over ATM, with different buffer management schemes. Though only these adaptations are given in the paper, the method based on the Engset model can be applied to analyze other feedback systems, such as ATM ABR, by finding a protocol-specific adaptation. Hence, the method we develop is useful not only for analyzing TCP using a source model significantly different from the commonly used persistent sources, but also for analyzing other feedback schemes. (3) Comparisons of simulated TCP traffic to measured Ethernet traffic show qualitatively similar second-order autocorrelation when think times follow a Pareto distribution with infinite variance. Also, the simulated and measured traffic have long-range dependence. In this sense our traffic model, which purports to be Web-user-like, also agrees with measured data traffic.
    Computer Communications 05/2003; DOI:10.1016/S0140-3664(02)00214-1 · 1.35 Impact Factor
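The Engset-style insensitivity claimed in result (1) above can be illustrated with a birth-death sketch in which only the ratio of mean transfer size to mean think time enters the stationary distribution. This is a simplified stand-in, assuming an idealized bottleneck shared equally by the transferring sources; the paper's protocol-specific goodput adaptations for TCP and ATM are not reproduced.

```python
# Birth-death sketch in the spirit of the adapted Engset model:
# births (N - n)/tau (sources finishing a think period), total completion
# rate c/s while at least one source is transferring.
import math


def engset_like_distribution(n_sources: int, s: float, tau: float, c: float):
    """Stationary distribution of the number of transferring sources.
    s: mean transfer size (bytes), tau: mean think time (s), c: capacity (bytes/s).
    Only the ratio s / (c * tau) matters, matching the insensitivity result."""
    rho = s / (c * tau)
    weights = [math.perm(n_sources, n) * rho ** n for n in range(n_sources + 1)]
    total = sum(weights)
    return [w / total for w in weights]


pi = engset_like_distribution(n_sources=50, s=60e3, tau=10.0, c=1.5e6 / 8)
print("P(idle bottleneck):", pi[0])
print("utilization bound on goodput (fraction of c):", 1 - pi[0])
```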
  • ABSTRACT: Despite the fact that most of today's Internet traffic is transmitted via the TCP protocol, the performance behavior of networks with TCP traffic is still not well understood. Recent research activities have led to a number of performance models for TCP traffic, but the degree of accuracy of these models in realistic scenarios is still questionable. This paper provides a comparison of the results (in terms of average throughput per connection) of three different ‘analytic’ TCP models: (I) the throughput formula in [Padhye et al. 98], (II) the modified Engset model of [Heyman et al. 97], and (III) the analytic TCP queueing model of [Schwefel 01], which is a packet-based extension of (II). Results for all three models are computed for a scenario of N identical TCP sources that transmit data in individual TCP connections of stochastically varying size. The results for the average throughput per connection in the analytic models are compared with simulations of detailed TCP behavior. All of the analytic models are expected to show deficiencies in certain scenarios, since they neglect highly influential parameters of the actual simulation model: the approach of Models (I) and (II) only indirectly considers queueing in bottleneck routers, and in certain scenarios those models are not able to adequately describe the impact of buffer space, either qualitatively or quantitatively. Furthermore, (II) is insensitive to the actual distribution of the connection sizes. As a consequence, its predictions would also be insensitive to so-called long-range dependent properties in the traffic that are caused by heavy-tailed connection-size distributions. The simulation results show that such properties cannot be neglected for certain network topologies: LRD properties can even have a counter-intuitive impact on the average goodput, namely that the goodput can be higher for small buffer sizes.
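Model (I) above is the throughput approximation of Padhye et al. (1998), which is compact enough to state in code. A sketch of that published formula in packets per second; the defaults here (timeout, packets acknowledged per ACK) are illustrative assumptions.

```python
# Steady-state TCP throughput approximation of Padhye et al. (1998).
# p: loss probability, rtt: round-trip time (s), t0: retransmission timeout (s),
# b: packets acknowledged per ACK, w_max: receiver-window limit in packets.
import math


def padhye_throughput(p, rtt, t0=1.0, b=2, w_max=None):
    denom = rtt * math.sqrt(2 * b * p / 3) \
        + t0 * min(1.0, 3 * math.sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2)
    rate = 1.0 / denom
    if w_max is not None:
        rate = min(rate, w_max / rtt)  # window-limited regime
    return rate


print(padhye_throughput(p=0.01, rtt=0.1))  # ~70 packets/s
```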
  • Robert D. van der Mei, Daniel P. Heyman
    Proceedings of SPIE - The International Society for Optical Engineering 03/2001; 43(4):201-202. DOI:10.1016/j.comcom.2003.08.015 · 0.20 Impact Factor
  • Source
    Daniel P. Heyman
    ABSTRACT: Measurements of file sizes transported on the World-Wide-Web have led some researchers to propose describing them by probability distributions with infinite variance. The M/G/1 queue often arises as a performance model for components of the WWW, and the service times correspond to file sizes; the infinite variance of the file sizes becomes the variance of the service times. In this paper the effects of very large service-time variances on some performance measures for the M/G/1 queue are explored via numerical examples and analytic arguments. The first main conclusion is that it is the form of the service-time distribution over a wide finite range that controls the steady-state queueing performance, so distributions with very large finite variances can yield the same behavior as distributions with infinite variances. The second main conclusion is that very large service-time variances cause the rate of approach to steady-state performance to be so slow that steady-state performance measures are not likely to be of engineering interest. A third conclusion is that a common device of using the probability that the work in an infinite queue exceeds the level b to approximate the probability that a finite buffer of size b overflows may be very inaccurate. The approximation works better for the fat-tailed distributions studied than for the others. The most important engineering implication of these results is that when service times have a very large variance (such as file transfers on the WWW), performance criteria other than steady-state measures have to be used.
    Performance Evaluation 03/2000; 40(1-3):47-70. DOI:10.1016/S0166-5316(99)00069-3 · 0.20 Impact Factor
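The slow approach to steady state described in the second conclusion above is easy to reproduce by simulation. A minimal sketch using the Lindley recursion for M/G/1 waiting times with infinite-variance Pareto service; the arrival rate, Pareto shape, and run lengths are illustrative choices, not the paper's.

```python
# M/G/1 waiting times via the Lindley recursion with heavy-tailed service.
import numpy as np

rng = np.random.default_rng(1)
lam, alpha = 0.3, 1.8                       # arrival rate; Pareto shape < 2 (infinite variance)
n = 10 ** 6
service = rng.pareto(alpha, n) + 1.0        # Pareto, scale 1, mean alpha/(alpha-1) = 2.25
inter = rng.exponential(1.0 / lam, n)       # utilization rho = 0.675

w = np.empty(n)
w[0] = 0.0
for i in range(1, n):                       # Lindley: W_{n+1} = max(0, W_n + S_n - A_{n+1})
    w[i] = max(0.0, w[i - 1] + service[i - 1] - inter[i])

# The running mean keeps drifting upward even after a million customers,
# illustrating why steady-state measures may not be of engineering interest.
for k in (10 ** 4, 10 ** 5, 10 ** 6):
    print(k, w[:k].mean())
```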
  • Daniel P. Heyman
    ABSTRACT: Data teletraffic is characterized by bursty arrival processes. Performance models are used to determine under what circumstances the probability that an arrival finds a full input buffer is very small. In this paper I examine how four models proposed in the literature perform on two data sets of local area network traffic. Among my conclusions are that (1) the protocol governing the data transmission may have a substantial effect on the statistical properties of the packet stream, (2) the probability that a finite buffer of size b overflows may not be adequately approximated by the probability that an infinite buffer holds at least b packets, and (3) a data-based estimate of the large-deviation rate function does the best job of estimating packet loss on these data sets. This method may overestimate the loss rate by several orders of magnitude, so there is room for further refinements.
    Performance Evaluation 12/1998; 34(4):227-247. DOI:10.1016/S0166-5316(98)00039-X · 0.20 Impact Factor
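Conclusion (3) above refers to a data-based large-deviation estimate of packet loss. A minimal sketch in the spirit of effective-bandwidth theory: estimate the scaled cumulant generating function from block sums, solve Λ(θ) = θc for the decay rate δ, and use P(overflow at level b) ≈ exp(-δb). The synthetic trace, block length, and service rate are stand-in assumptions; the paper's estimator may differ in detail.

```python
# Data-based large-deviation sketch of buffer overflow probabilities.
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(0)
# Bursty stand-in counts per slot: two-state mixture with mean 100.
counts = rng.poisson(rng.choice([50.0, 150.0], size=100_000)).astype(float)
c, m = 110.0, 50                                   # service per slot; block length

blocks = counts[: len(counts) // m * m].reshape(-1, m).sum(axis=1)


def scgf(theta: float) -> float:
    """Empirical scaled cumulant generating function (log-mean-exp for stability)."""
    x = theta * blocks
    return (np.log(np.mean(np.exp(x - x.max()))) + x.max()) / m


delta = brentq(lambda t: scgf(t) - t * c, 1e-6, 1.0)  # solve Lambda(theta) = theta * c
for b in (100, 500, 1000):
    print(b, np.exp(-delta * b))                   # estimated overflow probability
```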
  • Source
    Daniel P. Heyman, Dianne P. O'Leary
    ABSTRACT: We present an algorithm for solving linear systems involving the probability or rate matrix for a Markov chain. It is based on a UL factorization but works only with a submatrix of the factor U. We demonstrate its utility on Erlang-B models as well as more complicated models of a telephone multiplexing system.
    SIAM Journal on Matrix Analysis and Applications 04/1998; 19(2). DOI:10.1137/S0895479896301753 · 1.81 Impact Factor
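For readers unfamiliar with UL (as opposed to LU) factorizations, the entry above rests on this identity: reversing row and column order turns a UL factorization into an ordinary LU one. A generic numerical sketch on a stand-in matrix; the paper's algorithm additionally exploits Markov-chain structure and retains only a submatrix of U, which this sketch does not reproduce.

```python
# UL factorization (A = U L, U upper, L lower triangular) via the reversal trick.
import numpy as np
from scipy.linalg import lu

A = np.array([[4.0, -1.0, 0.0],
              [-2.0, 5.0, -1.0],
              [0.0, -3.0, 6.0]])       # stand-in for I - P on a set of states

J = np.eye(3)[::-1]                    # reversal permutation (its own inverse)
P_, L_, U_ = lu(J @ A @ J)             # ordinary LU of the reversed matrix
assert np.allclose(P_, np.eye(3))      # no pivoting needed for this example
U = J @ L_ @ J                         # reversing a lower factor gives an upper one
L = J @ U_ @ J
assert np.allclose(A, U @ L)           # A = U L
```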
  • Tamra Carpenter, Daniel P. Heyman, Iraj Saniee
    ABSTRACT: When network demands are uncertain, a planner might design a network based on some nominal set of point‐to‐point demands, and later be faced with a different set of offered demands. To accommodate the offered demands, modification of the network may be required. Given this scenario, it seems natural to question how these modification costs might affect the overall cost. To address such questions, we study the effects of random demands on network costs. In this study, we design a network based on nominal demands, generate random demands based on the nominal demands, and then modify the designed network to carry the random demands. We generate the offered demands randomly from four different distributions. For each demand distribution we perform 300 simulations. This paper describes our observations.
    Telecommunication Systems 01/1998; 10:409-421. DOI:10.1023/A:1019175202276 · 1.16 Impact Factor
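The experimental loop above (design for nominal demands, draw random offered demands, price the modifications) can be sketched in a few lines. Everything below is a deliberately crude stand-in: the abstract does not name its four demand distributions or its cost model, so the lognormal spread, linear capacity cost, and retrofit premium are hypothetical.

```python
# Crude Monte Carlo stand-in for the design-then-modify experiment.
import numpy as np

rng = np.random.default_rng(7)
nominal = np.array([10.0, 25.0, 5.0, 40.0])     # point-to-point demand units
built = nominal.copy()                          # capacity sized to the nominal demands
unit_cost, retrofit_premium = 1.0, 3.0          # hypothetical cost parameters


def total_cost(offered: np.ndarray) -> float:
    shortfall = np.maximum(offered - built, 0.0)
    return unit_cost * built.sum() + retrofit_premium * shortfall.sum()


# 300 simulations per distribution, as in the paper; the lognormal spread
# is an illustrative choice, not one of the paper's four distributions.
costs = [total_cost(nominal * rng.lognormal(0.0, 0.25, nominal.size))
         for _ in range(300)]
print(np.mean(costs), np.std(costs))
```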
  • D.P. Heyman
    ABSTRACT: Heyman (1992) examined three sequences giving the number of cells per frame of a VBR encoding of videoconferences (talking heads); these sequences were produced by hardware encoders using different coding algorithms. Each sequence had a gamma marginal distribution, and the autocorrelation function was geometric up to lags of at least 3 s, which includes all autocorrelation values larger than 0.1. We present an easy-to-simulate autoregressive process that has these properties. The model is tested by comparing the cell-loss rate produced when the data trace was used as the sole source in a simulation of an ATM switch to the cell-loss rates produced when traces generated by the model were used as the source.
    IEEE/ACM Transactions on Networking 09/1997; 5(4):554-560. DOI:10.1109/90.649513 · 1.99 Impact Factor
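A process with exactly the two properties named above (gamma marginal, geometric autocorrelation) is the discrete autoregressive DAR(1) construction: repeat the previous value with probability ρ, otherwise draw afresh from the marginal. A minimal sketch; whether it matches the paper's construction in every detail is not guaranteed.

```python
# DAR(1)-style process: gamma marginal, autocorrelation rho ** k.
import numpy as np

rng = np.random.default_rng(3)


def dar1_gamma(n: int, shape: float, scale: float, rho: float) -> np.ndarray:
    fresh = rng.gamma(shape, scale, n)         # i.i.d. gamma innovations
    keep = rng.random(n) < rho                 # repeat previous value w.p. rho
    x = fresh.copy()
    for t in range(1, n):
        if keep[t]:
            x[t] = x[t - 1]
    return x


cells = dar1_gamma(100_000, shape=50.0, scale=2.0, rho=0.98)
lag = 25
r = np.corrcoef(cells[:-lag], cells[lag:])[0, 1]
print(r, 0.98 ** lag)                          # empirical vs geometric autocorrelation
```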
  • Source
    Daniel P. Heyman, T. V. Lakshman, Arnold L. Neidhardt
    ABSTRACT: Most studies of feedback-based flow and congestion control consider only persistent sources, which always have data to send. However, with the rapid growth of Internet applications built on TCP/IP such as the World Wide Web and the standardization of traffic management schemes such as Available Bit Rate (ABR) in Asynchronous Transfer Mode (ATM) networks, it is essential to evaluate the performance of feedback-based protocols using traffic models which are specific to dominant applications. This paper presents a method for analyzing feedback-based protocols with Web-user-like input traffic where the source alternates between "transfer" periods and "think" periods. Our key results, which are presented for the TCP protocol, are: (1) The goodputs and the fraction of time that the system has some given number of transferring sources are insensitive to the distributions of transfer (file or page) sizes and think times except through the ratio of their means. Thus, apart from network round-trip times, only the ratio of users' average transfer sizes to average think times need be known to size the network for achieving a specific quality of service. (2) The Engset model can be adapted to accurately compute goodputs for TCP and TCP over ATM, with different buffer management schemes. Though only these adaptations are given in the paper, the method based on the Engset model can be applied to analyze other feedback systems, such as ATM ABR, by finding a protocol-specific adaptation. Hence, the method we develop is useful not only for analyzing TCP using a source model significantly different from the commonly used persistent sources, but also for analyzing other feedback schemes. (3) Comparisons of simulated TCP traffic to measured Ethernet traffic show qualitatively similar autocorrelation when think times follow a Pareto distribution with infinite variance. Also, the simulated and measured traffic have long-range dependence. In this sense our traffic model, which purports to be Web-user-like, also agrees with measured traffic.
    Proceedings of the 1997 ACM SIGMETRICS international conference on Measurement and modeling of computer systems; 06/1997
  • Source
    ABSTRACT: The Grassmann-Taksar-Heyman algorithm is a direct algorithm for computing the steady-state distribution of a finite irreducible Markov chain. We describe our experience in implementing this algorithm on a single-instruction multiple-data parallel processor computer. Our main conclusions are that a lower-level language has a performance advantage compared to Fortran, and that data storage is the limiting factor that determines the largest problem that can be solved. As a consequence, we devote considerable attention to storing a block tridiagonal transition matrix.
    Informs Journal on Computing 05/1997; 9:218-223. DOI:10.1287/ijoc.9.2.218 · 1.12 Impact Factor
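For reference, the Grassmann-Taksar-Heyman algorithm itself fits in a dozen lines of serial NumPy; the paper's contributions (the SIMD implementation and the block-tridiagonal storage scheme) are not reproduced here.

```python
# GTH algorithm for the steady-state vector of a finite irreducible chain.
import numpy as np


def gth(P: np.ndarray) -> np.ndarray:
    P = P.astype(float).copy()
    n = P.shape[0]
    # Elimination uses only sums of off-diagonal entries, so no subtractions
    # occur and the computation is numerically stable.
    for k in range(n - 1, 0, -1):
        s = P[k, :k].sum()
        P[:k, k] /= s
        P[:k, :k] += np.outer(P[:k, k], P[k, :k])
    # Back-substitution, then normalization.
    pi = np.zeros(n)
    pi[0] = 1.0
    for k in range(1, n):
        pi[k] = pi[:k] @ P[:k, k]
    return pi / pi.sum()


P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])
pi = gth(P)
print(pi, pi @ P)          # pi satisfies pi = pi P
```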
  • ABSTRACT: Intuition suggests that a finite buffer limits the effect of traffic autocorrelations on the queue length. We investigate the extent to which finite buffers can, therefore, be expected to mitigate the effects of long-range dependence (LRD). With traffic sequences generated by fractional autoregressive integrated moving average (f-ARIMA) models for LRD, and by AR models for short-range dependence (SRD), we investigate the traffic performance for a range of finite buffers, both for single and multiplexed streams. For design, the aim is to ‘match’ a given LRD autocorrelation function with a suitable SRD function whose performance provides, for a wide range of traffic intensities and buffer sizes, a conservative bound on the performance associated with the LRD function. The results suggest that by suitably ‘dominating’ an LRD autocorrelation function by an SRD function, one can obtain conservative performance bounds for a realistic range of traffic intensities and buffer sizes (delays). Also, in several cases, a ‘cross-over’ phenomenon is observed for cell-loss probabilities as the buffer size increases; i.e., the loss probabilities are smaller for LRD than for SRD for small buffers, with the converse true for large buffers. This suggests that finite buffers can, in some cases, counteract the effects of LRD in traffic arrivals, and thus enable conservative designs to be based on Markovian traffic models.
    Global Telecommunications Conference, 1996. GLOBECOM '96. 'Communications: The Key to Global Prosperity; 12/1996
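The ‘dominating’ step above can be illustrated directly: find the smallest geometric (AR(1)-style) autocorrelation r^k that lies above a given LRD autocorrelation at every lag up to a design horizon. The power-law form, lag-1 value, and horizon below are illustrative assumptions, not the paper's f-ARIMA models.

```python
# Smallest geometric autocorrelation dominating a power-law (LRD) one.
import numpy as np

a, beta, K = 0.7, 0.4, 1000            # hypothetical LRD acf: rho(k) = a * k**(-beta)
lags = np.arange(1, K + 1)
rho_lrd = a * lags ** (-beta)

# Smallest r with r**k >= rho(k) at every lag up to the horizon K.
r = np.max(rho_lrd ** (1.0 / lags))
assert np.all(r ** lags >= rho_lrd - 1e-12)
print("dominating geometric parameter r =", r)   # ~0.997 here
```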
  • D.P. Heyman, T.V. Lakshman
    ABSTRACT: The authors explore the influence of long-range dependence in broadband traffic engineering. The classification of stochastic processes {X_t} into those with short- or long-range dependence is based on the asymptotic properties of the variance of the sum S_m = X_1 + X_2 + ··· + X_m. Suppose this process describes the number of packets (or ATM cells) that arrive at a buffer; X_t is the number that arrive in the t-th time slice (e.g., 10 ms). We use a generic buffer model to show how the distribution of S_m (for all values of m) determines the buffer occupancy. From this model we show that long-range dependence does not affect the buffer occupancy when the busy periods are not large. Numerical experiments show this property is present when data from four video conferences and two entertainment video sequences (which have long-range dependence) are used as the arrival process, even when the transmitting times are long enough to make the probability of buffer overflow 0.07. We generated sample paths from Markov chain models of the video traffic (these have short-range dependence). Various operating characteristics, computed over a wide range of loadings, closely agree when the data trace and the Markov chain paths are used to drive the model. From this, we conclude that long-range dependence is not a crucial property in determining the buffer behavior of variable bit rate (VBR) video sources.
    IEEE/ACM Transactions on Networking 07/1996; DOI:10.1109/90.502230 · 1.99 Impact Factor
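The driving quantity in the entry above is Var(S_m) across block sizes m: roughly linear growth in m indicates SRD, while growth like m^(2H) with H > 1/2 indicates LRD. A minimal variance-time sketch on a synthetic SRD stand-in trace, not the video data of the paper.

```python
# Variance-time sketch: Var(S_m) as a function of the block size m.
import numpy as np

rng = np.random.default_rng(5)
x = rng.poisson(20, 2 ** 18).astype(float)        # SRD stand-in trace

for m in (1, 10, 100, 1000):
    s_m = x[: len(x) // m * m].reshape(-1, m).sum(axis=1)
    # Var(S_m)/m roughly constant => short-range dependence; for LRD traffic
    # the log-log slope of Var(S_m) vs m would exceed 1.
    print(m, s_m.var(), s_m.var() / m)
```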
  • D.P. Heyman, T.V. Lakshman
    ABSTRACT: Traffic from video services is expected to be a substantial portion of the traffic carried by emerging broadband integrated networks. For variable bit rate (VBR) coded video, statistical source models are needed to design networks that achieve acceptable picture quality at minimum cost and to design traffic shaping and control mechanisms. For video teleconference traffic, Heyman et al. (1992) showed that traffic is sufficiently accurately characterized by a multistate Markov chain model that can be derived from three traffic parameters (mean, correlation, and variance). The present authors describe modeling results for sequences with frequent scene changes (the previously studied video teleconferences have very little scene variation) such as entertainment television, news, and sports broadcasts. The authors analyze 11 long sequences of broadcast video traffic data. Unlike video teleconferences, the different sequences studied have different details regarding distributions of cells per frame. The authors present source models applicable to the different sequences and evaluate their accuracy as predictors of cell losses in asynchronous transfer mode (ATM) networks. The modeling approach is the same for all of the sequences, but use of a single model based on a few physically meaningful parameters and applicable to all sequences does not seem to be possible.
    IEEE/ACM Transactions on Networking 03/1996; DOI:10.1109/90.503760 · 1.99 Impact Factor
  • Sigrún Andradóttir, Daniel P. Heyman, Teunis J. Ott
    ABSTRACT: We consider the application of importance sampling in steady-state simulations of finite Markov chains. We show that, for a large class of performance measures, there is a choice of the alternative transition matrix for which the ratio of the variance of the importance sampling estimator to the variance of the naive simulation estimator converges to zero as the sample path length goes to infinity. Obtaining this ‘optimal’ transition matrix involves computing the performance measure of interest, so the optimal matrix cannot be computed in precisely those situations where simulation is required to estimate steady-state performance. However, our results show that alternative transition matrices of the form Q=P+E/T, where P is the original transition matrix and T is the sample path length, can be expected to provide good results. Moreover, we provide an iterative algorithm for obtaining alternative transition matrices of this form that converge to the optimal matrix as the number of iterations increases, and present an example that shows that spending some computer time iterating this algorithm and then conducting the simulation with the resulting alternative transition matrix may provide considerable variance reduction when compared to naive simulation.
    Advances in Applied Probability 03/1996; 28(1). DOI:10.2307/1427916 · 0.83 Impact Factor
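A minimal sketch of the setup above: simulate under an alternative matrix Q and reweight each path by its likelihood ratio. The perturbation E below is an arbitrary illustrative choice in the spirit of Q = P + E/T; the paper's iterative algorithm for the optimal matrix is not reproduced.

```python
# Importance sampling for a steady-state average of a finite Markov chain.
import numpy as np

rng = np.random.default_rng(11)
P = np.array([[0.9, 0.1], [0.5, 0.5]])          # original transition matrix
T = 1000                                        # sample path length
E = np.array([[-0.05, 0.05], [0.05, -0.05]])    # row sums zero
Q = P + E * (100.0 / T)                         # perturbation vanishing in T
f = np.array([0.0, 1.0])                        # reward: fraction of time in state 1


def one_path() -> float:
    x, logL, total = 0, 0.0, 0.0
    for _ in range(T):
        y = rng.choice(2, p=Q[x])
        logL += np.log(P[x, y]) - np.log(Q[x, y])   # path likelihood ratio
        x = y
        total += f[x]
    return np.exp(logL) * total / T


est = np.mean([one_path() for _ in range(200)])
print(est)                                      # true steady-state value: 1/6
```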
  • Daniel P. Heyman
    ABSTRACT: We prove that every infinite-state stochastic matrix, P say, that is irreducible and consists of positive-recurrent states can be represented in the form I - P = (A - I)(B - S), where A is strictly upper-triangular, B is strictly lower-triangular, and S is diagonal. Moreover, the elements of A are expected values of random variables that we will specify, and the elements of B and S are probabilities of events that we will specify. The decomposition can be used to obtain steady-state probabilities, mean first-passage times and the fundamental matrix.
    Journal of Applied Probability 12/1995; 32(4). DOI:10.2307/3215202 · 0.69 Impact Factor
  • Source
    ABSTRACT: The main contributions of this paper are two-fold. First, we prove fundamental, similarly behaving lower and upper bounds, and give an approximation based on the bounds, which is effective for analyzing ATM multiplexers, even when the traffic has many, possibly heterogeneous, sources and their models are of high dimension. Second, we apply our analytic approximation to statistical models of video teleconference traffic, obtain the multiplexing system's capacity as determined by the number of admissible sources for given cell-loss probability, buffer size and trunk bandwidth, and, finally, compare with results from simulations, which are driven by actual data from coders. The results are surprisingly close. Our bounds are based on large deviations theory. The main assumption is that the sources are Markovian and time-reversible. Our approximation to the steady-state buffer distribution is called Chernoff-dominant eigenvalue, since one parameter is obtained from Chernoff's theorem and the other is the system's dominant eigenvalue. Fast, effective techniques are given for their computation. In our application we process the output of variable bit rate coders to obtain DAR(1) source models which, while of high dimension, require only knowledge of the mean, variance, and correlation. We require the cell-loss probability not to exceed 10^-6, trunk bandwidth ranges from 45 to 150 Mb/s, buffer sizes are such that maximum delays range from 1 to 60 ms, and the number of coder-sources ranges from 15 to 150. Even for the largest systems, the time for analysis is a fraction of a second, while each simulation takes many hours. Thus, the real-time administration of admission control based on our analytic techniques is feasible.
    IEEE Journal on Selected Areas in Communications 09/1995; 13(6):1004-1016. DOI:10.1109/49.400656 · 4.14 Impact Factor
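The Chernoff half of the approximation above can be shown in isolation: bound the probability that the aggregate rate of N i.i.d. sources exceeds the trunk capacity by optimizing the exponent θC − N log M(θ). The on/off source below is a stand-in, not the paper's DAR(1) video model, and the dominant-eigenvalue half is omitted.

```python
# Chernoff bound on P(aggregate rate of N i.i.d. on/off sources > C).
import numpy as np
from scipy.optimize import minimize_scalar

N, C = 100, 40.0                      # number of sources; capacity in peak-rate units
p_on, rate_on = 0.3, 1.0              # each source: rate 1 w.p. 0.3, else 0


def neg_exponent(theta: float) -> float:
    # Negative of theta*C - N*log M(theta), so minimizing maximizes the exponent.
    log_mgf = np.log(1 - p_on + p_on * np.exp(theta * rate_on))
    return -(theta * C - N * log_mgf)


res = minimize_scalar(neg_exponent, bounds=(0.0, 50.0), method="bounded")
print("Chernoff bound:", np.exp(res.fun))   # exp(-optimized exponent), ~0.1 here
```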
  • Daniel P. Heyman
    ABSTRACT: Associated with every stochastic matrix is another matrix called the fundamental matrix. The fundamental matrix can be used to obtain mean first-passage times and other interesting operating characteristics. The fundamental matrix is defined as a matrix inverse, and computing it from the definition can be fraught with numerical errors. We establish a new representation of the fundamental matrix in which matrix inversion is replaced by multiplying and then adding a pair of matrices. The representation requires the solution of a system of linear equations, and we show that this can be done via back and forward substitution from numbers that have already been calculated when the GTH algorithm is used to compute the steady-state probabilities. An algorithm based on this representation is given. The time complexity of the faster implementation is 75% of the time complexity of using Gaussian elimination.
    SIAM Journal on Matrix Analysis and Applications 07/1995; 16(3). DOI:10.1137/S0895479893258814 · 1.81 Impact Factor
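For contrast with the paper's inversion-free representation, here is the classical inversion-based route it improves on: Z = (I − P + 1π)^(-1), from which mean first-passage times follow as m_ij = (z_jj − z_ij)/π_j. The 3-state chain is an arbitrary example.

```python
# Fundamental matrix from its definition (the route the paper avoids).
import numpy as np

P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.4, 0.2]])
n = P.shape[0]

# Steady-state vector via a direct solve (fine for a small example).
A = np.vstack([(P.T - np.eye(n))[:-1], np.ones(n)])
pi = np.linalg.solve(A, np.r_[np.zeros(n - 1), 1.0])

Z = np.linalg.inv(np.eye(n) - P + np.outer(np.ones(n), pi))   # fundamental matrix
# Mean first-passage times; diagonal is 0 by convention (recurrence time is 1/pi_j).
M = (np.diag(Z)[None, :] - Z) / pi[None, :]
print(M)
```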
  • Sigrún Andradóttir, Daniel P. Heyman, Teunis J. Ott
    ABSTRACT: In the simulation of Markov chains, importance sampling involves replacing the original transition matrix, say P, with a suitably chosen transition matrix Q that tends to visit the states of interest more frequently. The likelihood ratio of P relative to Q is an important random variable in the importance sampling method. It always has expectation one, and for any interesting pair of transition matrices P and Q, there is a sample path length that causes the likelihood ratio to be close to zero with probability close to one. This may cause the variance of the importance sampling estimator to be larger than the variance of the traditional estimator. We develop sufficient conditions for ensuring the tightness of the distribution of the logarithm of the likelihood ratio for all sample path lengths, and we show that when these conditions are satisfied, the likelihood ratio is approximately lognormally distributed with expected value one. These conditions can be used to eliminate some choices of the alternative transition matrix Q that are likely to result in a variance increase. We also show that if the likelihood ratio is to remain well behaved for all sample path lengths, the alternative transition matrix Q has to approach the original transition matrix P as the sample path length increases. The practical significance of this result is that importance sampling can be difficult to apply successfully in simulations that involve long sample paths.
    Operations Research 06/1995; 43(3):509-519. DOI:10.1287/opre.43.3.509 · 1.50 Impact Factor
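The degeneracy the paper analyzes is easy to observe empirically: the path likelihood ratio always has expectation one, yet its logarithm drifts down linearly in the path length, so typical paths carry ratios near zero. A minimal diagnostic sketch with arbitrary illustrative P and Q held fixed (not shrinking toward P).

```python
# Behavior of the path likelihood ratio L as the path length T grows.
import numpy as np

rng = np.random.default_rng(2)
P = np.array([[0.9, 0.1], [0.5, 0.5]])   # original chain
Q = np.array([[0.8, 0.2], [0.4, 0.6]])   # fixed alternative matrix

for T in (10, 100, 500):
    log_ratios = []
    for _ in range(500):
        x, logL = 0, 0.0
        for _ in range(T):
            y = rng.choice(2, p=Q[x])
            logL += np.log(P[x, y]) - np.log(Q[x, y])
            x = y
        log_ratios.append(logL)
    L = np.exp(np.array(log_ratios))
    # The mean stays near 1 (and becomes noisy, since rare paths carry huge
    # weight), while the median collapses toward 0 as T grows.
    print(T, L.mean(), np.median(L))
```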

Publication Stats

2k Citations
32.05 Total Impact Points

Institutions

  • 1997–2003
    • AT&T Labs
      Austin, Texas, United States
  • 1998
    • Loyola University Maryland
      Baltimore, Maryland, United States
  • 1996
    • University of Wisconsin–Madison
      Madison, Wisconsin, United States
    • Texas A&M University
      College Station, Texas, United States