ACM Transactions on Modeling and Computer Simulation

Published by Association for Computing Machinery
Print ISSN: 1049-3301
The Global-Scale Agent Model (GSAM) is presented. The GSAM is a high-performance distributed platform for agent-based epidemic modeling capable of simulating a disease outbreak in a population of several billion agents. It is unprecedented in its scale, its speed, and its use of Java. Solutions to multiple challenges inherent in distributing massive agent-based models are presented. Communication, synchronization, and memory usage are among the topics covered in detail. The memory usage discussion is Java specific. However, the communication and synchronization discussions apply broadly. We provide benchmarks illustrating the GSAM's speed and scalability.
We describe the design and prototype implementation of Indemics (Interactive Epidemic Simulation), a modeling environment utilizing high-performance computing technologies for supporting complex epidemic simulations. Indemics can support policy analysts and epidemiologists interested in planning and control of pandemics. Indemics goes beyond traditional epidemic simulations by providing a simple and powerful way to represent and analyze policy-based as well as individual-based adaptive interventions. Users can also stop the simulation at any point, assess the state of the simulated system, and add further interventions. Indemics is available to end-users via a web-based interface. Detailed performance analysis shows that Indemics greatly enhances the capability and productivity of simulating complex intervention strategies with a marginal decrease in performance. We also demonstrate how Indemics was applied in some real case studies where complex interventions were implemented.
The calendar queue is an important implementation of a priority queue which is particularly useful in discrete event simulators. In this paper we present an analysis of the static calendar queue which maintains N active events. A step of the discrete event simulator removes and processes the event with the smallest associated time and inserts a new event whose associated time is the time of the removed event plus a random increment with mean μ. We demonstrate that for the infinite bucket calendar queue the optimal bucket width is approximately δ_opt = √(2b/c)·μ/N, where b is the time to process an empty bucket and c the incremental time to process a list element. With bucket width chosen to be δ_opt, the expected time to process an event is approximately minimized at the constant c + √(2bc) + d, where d is the fixed time to process an event. We show that choosing the number of buckets to be O(N) yields a calendar queue with performance equal to or almost equal to the performance of the infinite bucket calendar queue.
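The two closed-form expressions quoted in this abstract can be evaluated directly. The sketch below only transcribes those formulas; the timing constants b, c, d and the workload parameters μ and N are made-up illustrative values, not figures from the paper.

```python
import math


def optimal_bucket_width(b, c, mu, n):
    """Approximate optimal bucket width: delta_opt = sqrt(2b/c) * mu / N."""
    return math.sqrt(2.0 * b / c) * mu / n


def expected_event_cost(b, c, d):
    """Approximate minimum expected time per event: c + sqrt(2bc) + d."""
    return c + math.sqrt(2.0 * b * c) + d


# Hypothetical timings (arbitrary units) and workload, chosen for illustration.
b, c, d = 2.0, 1.0, 5.0   # empty-bucket cost, per-element cost, fixed cost
mu, n = 10.0, 1000        # mean time increment, number of active events

width = optimal_bucket_width(b, c, mu, n)
cost = expected_event_cost(b, c, d)
print(f"bucket width ~ {width:.4f}, expected cost per event ~ {cost:.2f}")
```

Note that the width shrinks as 1/N while the per-event cost does not depend on N, which is the constant-time behavior the abstract describes.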
We consider a two-node tandem Jackson network. Starting from a given state, we are interested in estimating the probability that the content of the second buffer exceeds some high level L before it becomes empty. The theory of Markov additive processes is used to determine the asymptotic decay rate of this probability, for large L. Moreover, the optimal exponential change of measure to be used in importance sampling is derived and used for efficient estimation of the rare event probability of interest. Unlike changes of measure proposed and studied in recent literature, the one derived here is a function of the content of the first buffer, and yields asymptotically efficient simulation for any set of arrival and service rates. The relative error is bounded independent of the level L, except when the first server is the bottleneck and its buffer is infinite, in which case the relative error is bounded linearly in L.
We develop fully sequential procedures for comparison with a standard. The goal is to find systems whose expected performance measures are larger or smaller than that of a single system referred to as the standard and, if there are any, to find the one with the largest or smallest performance. Our procedures allow for unequal variances across systems, the use of common random numbers, and known or unknown expected performance of the standard. Experimental results are provided to compare the efficiency of the procedures with other existing procedures.
Comparison of SRWM (left), EE (center) and AEE (right) for a Gaussian mixture in one dimension
Comparison of the algorithms for a Gaussian mixture in two dimensions: (from left to right) the true density, SRWM, EE and AEE. 
Results given by AEE, I-MCMC and a MH sampler
Markov chain Monte Carlo (MCMC) methods make it possible to sample from a distribution known up to a multiplicative constant. Classical MCMC samplers are known to have very poor mixing properties when sampling multimodal distributions. The Equi-Energy sampler is an interacting MCMC sampler proposed by Kou, Zhou and Wong in 2006 to sample difficult multimodal distributions. This algorithm runs several chains at different temperatures in parallel, and allows lower-tempered chains to jump to a state from a higher-tempered chain having an energy “close” to that of the current state. A major drawback of this algorithm is that it depends on many design parameters and thus requires a significant effort to tune these parameters. In this article, we introduce an Adaptive Equi-Energy (AEE) sampler that automates the choice of the selection mechanism when jumping onto a state of the higher-temperature chain. We prove the ergodicity and a strong law of large numbers for AEE, and for the original Equi-Energy sampler as well. Finally, we apply our algorithm to motif sampling in DNA sequences.
Queuing Network. 
Convergence behaviour of SF and q-SF algorithms for q = 0.8, β = 0.005.
The importance of the q-Gaussian family of distributions lies in its power-law nature, and its close association with Gaussian, Cauchy and uniform distributions. This class of distributions arises from maximization of a generalized information measure. We use the power-law nature of the q-Gaussian distribution to improve upon the smoothing properties of Gaussian and Cauchy distributions. Based on this, we propose a Smoothed Functional (SF) scheme for gradient estimation using q-Gaussian distribution. Our work extends the class of distributions that can be used in SF algorithms by including the q-Gaussian distributions, which encompass the above three distributions as special cases. Using the derived gradient estimates, we propose two-timescale algorithms for optimization of a stochastic objective function with gradient descent method. We prove that the proposed algorithms converge to a local optimum. Performance of the algorithms is shown by simulation results on a queuing model.
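To illustrate the idea of a smoothed functional gradient estimate, here is a minimal sketch for the plain Gaussian case only, which the abstract names as a special case of the q-Gaussian family. It assumes the commonly used two-sided form, in which the gradient is approximated by averaging η·(f(θ+βη) − f(θ−βη))/(2β) over standard normal perturbations η. The deterministic quadratic objective, the value of β, and the function names are illustrative assumptions, not the q-SF algorithms of the paper.

```python
import random


def sf_gradient(f, theta, beta, n_samples, seed=0):
    """Two-sided Gaussian smoothed-functional gradient estimate (sketch)."""
    rng = random.Random(seed)
    dim = len(theta)
    grad = [0.0] * dim
    for _ in range(n_samples):
        # Draw a standard normal perturbation direction.
        eta = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        plus = f([t + beta * e for t, e in zip(theta, eta)])
        minus = f([t - beta * e for t, e in zip(theta, eta)])
        scale = (plus - minus) / (2.0 * beta)
        for k in range(dim):
            grad[k] += eta[k] * scale
    return [g / n_samples for g in grad]


def quadratic(x):
    """Hypothetical test objective with known gradient 2x."""
    return sum(v * v for v in x)


grad = sf_gradient(quadratic, theta=[1.0, -2.0], beta=0.1, n_samples=20_000)
print(grad)  # should be close to the exact gradient [2.0, -4.0]
```

In a two-timescale scheme such an estimate would be averaged on the faster timescale while the parameter θ is updated by gradient descent on the slower one.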
We discuss a unified approach to stochastic optimization of pseudo-Boolean objective functions based on particle methods, including the cross-entropy method and simulated annealing as special cases. We point out the need for auxiliary sampling distributions, that is parametric families on binary spaces, which are able to reproduce complex dependency structures, and illustrate their usefulness in our numerical experiments. We provide numerical evidence that particle-driven optimization algorithms based on parametric families yield superior results on strongly multi-modal optimization problems while local search heuristics outperform them on easier problems.
Exemplar uncertainty analysis of angioedema as an adverse event under the L1 prior. Here, we plot the non-parametric bootstrap 95% confidence intervals for the 441 drug effects that demonstrated non-zero coefficients in at least 50% of the bootstrap replicates. Gray-scaling reports the proportion of bootstrap replicates in which effect estimates are non-zero.
Following a series of high-profile drug safety disasters in recent years, many countries are redoubling their efforts to ensure the safety of licensed medical products. Large-scale observational databases such as claims databases or electronic health record systems are attracting particular attention in this regard, but present significant methodological and computational concerns. In this paper we show how high-performance statistical computation, including graphics processing units, relatively inexpensive highly parallel computing devices, can enable complex methods in large databases. We focus on optimization and massive parallelization of cyclic coordinate descent approaches to fit a conditioned generalized linear model involving tens of millions of observations and thousands of predictors in a Bayesian context. We find orders-of-magnitude improvement in overall run-time. Coordinate descent approaches are ubiquitous in high-dimensional statistics and the algorithms we propose open up exciting new methodological possibilities with the potential to significantly improve drug safety.
Recent advances in imaging technology now provide us with 3D images of developing organs. These can be used to extract 3D geometries for simulations of organ development. To solve models on growing domains, the displacement fields between consecutive image frames need to be determined. Here we develop and evaluate different landmark-free algorithms for the determination of such displacement fields from image data. In particular, we examine minimal distance, normal distance, diffusion-based and uniform mapping algorithms and test these algorithms with both synthetic and real data in 2D and 3D. We conclude that in most cases the normal distance algorithm is the method of choice and wherever it fails, diffusion-based mapping provides a good alternative.
Variance of graph metrics across sets of equal-sized random graphs.
The coarsest approximation of the structure of a complex network, such as the Internet, is a simple undirected unweighted graph. This approximation, however, loses too much detail. In reality, objects represented by vertices and edges in such a graph possess some nontrivial internal structure that varies across and differentiates among distinct types of links or nodes. In this work, we abstract such additional information as network annotations. We introduce a network topology modeling framework that treats annotations as an extended correlation profile of a network. Assuming we have this profile measured for a given network, we present an algorithm to rescale it in order to construct networks of varying size that still reproduce the original measured annotation profile. Using this methodology, we accurately capture the network properties essential for realistic simulations of network applications and protocols, or any other simulations involving complex network topologies, including modeling and simulation of network evolution. We apply our approach to the Autonomous System (AS) topology of the Internet annotated with business relationships between ASs. This topology captures the large-scale structure of the Internet. In-depth understanding of this structure and tools to model it are cornerstones of research on future Internet architectures and designs. We find that our techniques are able to accurately capture the structure of annotation correlations within this topology, thus reproducing a number of its important properties in synthetically generated random graphs.
True trajectory (bold line) and true landmark positions (balls) with the estimated path (dotted line) and the estimated landmark positions (stars) at the end of the run (T = 2000).
Distance between the final estimation and the true position for each of the 15 landmarks with the averaged marginal SLAM (left) and the averaged P-BOEM algorithm (right).
Online variants of the Expectation Maximization (EM) algorithm have recently been proposed to perform parameter inference with large data sets or data streams, in independent latent models and in hidden Markov models. Nevertheless, the convergence properties of these algorithms remain an open problem at least in the hidden Markov case. This contribution deals with a new online EM algorithm which updates the parameter at some deterministic times. Some convergence results have been derived even in general latent models such as hidden Markov models. These properties rely on the assumption that some intermediate quantities are available in closed form or can be approximated by Monte Carlo methods when the Monte Carlo error vanishes rapidly enough. In this paper, we propose an algorithm which approximates these quantities using Sequential Monte Carlo methods. The convergence of this algorithm and of an averaged version is established and their performance is illustrated through Monte Carlo experiments.
A queueing network with four stations and four classes of customers (three non-null exogenous and one null exogenous).
Multiclass open queueing networks find wide applications in communication, computer and fabrication networks. Often one is interested in steady-state performance measures associated with these networks. Conceptually, under mild conditions, a regenerative structure exists in multiclass networks, making them amenable to regenerative simulation for estimating the steady-state performance measures. However, typically, identification of a regenerative structure in these networks is difficult. A well known exception is when all the interarrival times are exponentially distributed, where the instants corresponding to customer arrivals to an empty network constitute a regenerative structure. In this paper, we consider networks where the interarrival times are generally distributed but have exponential or heavier tails. We show that these distributions can be decomposed into a mixture of sums of independent random variables such that at least one of the components is exponentially distributed. This allows an easily implementable embedded regenerative structure in the Markov process. We show that under mild conditions on the network primitives, the regenerative mean and standard deviation estimators are consistent and satisfy a joint central limit theorem useful for constructing asymptotically valid confidence intervals. We also show that amongst all such interarrival time decompositions, the one with the largest mean exponential component minimizes the asymptotic variance of the standard deviation estimator.
Trace of the last 20000 of 100000 samples of the energy for the Swendsen-Wang sampler. As we can see, the sampler performs poorly on the RBM model.  
RBM parameters. Each image corresponds to the parameters connecting a specific hidden unit to the entire set of visible units.  
This paper introduces a new specialized algorithm for equilibrium Monte Carlo sampling of binary-valued systems, which allows for large moves in the state space. This is achieved by constructing self-avoiding walks (SAWs) in the state space. As a consequence, many bits are flipped in a single MCMC step. We name the algorithm SARDONICS, an acronym for Self-Avoiding Random Dynamics on Integer Complex Systems. The algorithm has several free parameters, but we show that Bayesian optimization can be used to automatically tune them. SARDONICS performs remarkably well in a broad number of sampling tasks: toroidal ferromagnetic and frustrated Ising models, 3D Ising models, restricted Boltzmann machines and chimera graphs arising in the design of quantum computers.
We want to select the best systems out of a given set of systems (or rank them) with respect to their expected performance. The systems allow random observations only and we assume that the joint observation of the systems has a multivariate normal distribution with unknown mean and covariance. We allow dependent marginal observations as they occur when common random numbers are used for the simulation of the systems. In particular, we focus on positively dependent observations as they might be expected in heuristic optimization where 'systems' are different solutions to an optimization problem with common random inputs. In each iteration, we allocate a fixed budget of simulation runs to the solutions. We use a Bayesian setup and allocate the simulation effort according to the posterior covariances of the solutions until the ranking and selection decision is correct with a given high probability. Here, the complex posterior distributions are only approximated, but we give extensive empirical evidence that the observed error probabilities are well below the given bounds in most cases. We also use a generalized scheme for the target of the ranking and selection that allows us to bound the error probabilities with a Bonferroni approach. Our test results show that our procedure uses fewer simulations than comparable procedures from the literature, even in most of the cases where the observations are not positively correlated.
The seven blocks of Tetris.
We consider the inverse reinforcement learning problem, that is, the problem of learning from, and then predicting or mimicking a controller based on state/action data. We propose a statistical model for such data, derived from the structure of a Markov decision process. Adopting a Bayesian approach to inference, we show how latent variables of the model can be estimated, and how predictions about actions can be made, in a unified framework. A new Markov chain Monte Carlo (MCMC) sampler is devised for simulation from the posterior distribution. This step includes a parameter expansion step, which is shown to be essential for good convergence properties of the MCMC sampler. As an illustration, the method is applied to learning a human controller.
Consider a system of N identical hard spherical particles moving in a d-dimensional box and undergoing elastic, possibly multi-particle, collisions. We develop a new algorithm that recovers the pre-collision state from the post-collision state of the system, across a series of consecutive collisions, with essentially no memory overhead. The challenge in achieving reversibility for an n-particle collision (where n << N) arises from the presence of nd-d-1 degrees of freedom during each collision, and from the complex geometrical constraints placed on the colliding particles. To reverse the collisions in a traditional simulation setting, all of the particular realizations of these degrees of freedom during the forward simulation must be saved. This limitation is addressed here by first performing a pseudo-randomization of angles, ensuring determinism in the reverse path for any values of n and d. To address the more difficult problem of geometrical and dynamic constraints, a new approach is developed which correctly samples the constrained phase space. Upon combining the pseudo-randomization with correct phase space sampling, perfect reversibility of collisions is achieved, as illustrated for n <= 3, d=2, and n=2, d=3 (and, in principle, generalizable to larger n). This result enables for the first time reversible simulations of elastic collisions with essentially zero memory accumulation. The reverse computation methodology uncovers important issues of irreversibility in conventional models, and the difficulties encountered in arriving at a reversible model for one of the most basic physical system processes, namely, elastic collisions for hard spheres. Insights and solution methodologies, with regard to accurate phase space coverage with reversible random sampling proposed in this context, can help serve as models and/or starting points for other reversible simulations.
Average delay τ_m of a free packet as a function of m for a periodic lattice 50 × 50. The continuous line represents the least squares fit to the first 10 points.
We investigate individual packet delay in a model of data networks with table-free, partial-table and full-table routing. We present an analytical estimate of the average packet delay in a network with a small partial routing table. The dependence of the delay on the size of the network and on the size of the partial routing table is examined numerically. Consequences for network scalability are discussed.
Multipotent differentiation, where cells adopt one of several cell fates, is a determinate and orchestrated procedure that often incorporates stochastic mechanisms in order to diversify cell types. How these stochastic phenomena interact to govern cell fate is poorly understood. Nonetheless, the cell fate decision-making procedure is mainly regulated through the activation of differentiation waves and associated signaling pathways. In the current work, we focus on the Notch/Delta signaling pathway, which is not only known to trigger such waves but is also used to achieve the principle of lateral inhibition, i.e., a competition for exclusive fates through cross-signaling between neighboring cells. Such a process ensures unambiguous stochastic decisions influenced by intrinsic noise sources, e.g., those found in the regulation of signaling pathways, and extrinsic stochastic fluctuations, attributed to micro-environmental factors. However, the effect of intrinsic and extrinsic noise on cell fate determination is an open problem. Our goal is to elucidate how the induction of extrinsic noise affects cell fate specification in a lateral inhibition mechanism. Using a stochastic Cellular Automaton with continuous state space, we show that extrinsic noise results in the emergence of steady-state furrow patterns of cells in a "frustrated/transient" phenotypic state.
A new method for estimating Sobol' indices is proposed. The new method makes use of 3 independent input vectors rather than the usual 2. It attains much greater accuracy on problems where the target Sobol' index is small, even outperforming some oracles that adjust using the true but unknown mean of the function. The new estimator attains a better rate of convergence than the old one in a small effects limit. When the target Sobol' index is quite large, the oracles do better than the new method.
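For context, the sketch below shows one common form of the conventional estimator built from 2 independent input vectors, which the abstract contrasts with the new method. It is not the 3-vector estimator proposed in the paper, whose details are not given here. It estimates a first-order index as the sample covariance between f(x) and f(y), where y copies coordinate `index` from x and takes all other coordinates from an independent vector z, divided by the sample variance. Inputs are assumed uniform on [0, 1], and the linear test function is a hypothetical example with a known answer.

```python
import random


def first_order_sobol(f, dim, index, n_samples, seed=0):
    """Plain two-vector Monte Carlo estimate of a first-order Sobol' index."""
    rng = random.Random(seed)
    fx_values, fy_values = [], []
    for _ in range(n_samples):
        x = [rng.random() for _ in range(dim)]
        z = [rng.random() for _ in range(dim)]
        y = list(z)
        y[index] = x[index]  # y shares only the coordinate of interest with x
        fx_values.append(f(x))
        fy_values.append(f(y))
    mean_x = sum(fx_values) / n_samples
    mean_y = sum(fy_values) / n_samples
    cov = sum((a - mean_x) * (b - mean_y)
              for a, b in zip(fx_values, fy_values)) / n_samples
    var = sum((a - mean_x) ** 2 for a in fx_values) / n_samples
    return cov / var


def linear_test_function(x):
    """Hypothetical test function: exact first-order index of x[0] is 1/5."""
    return x[0] + 2.0 * x[1]


sobol_x1 = first_order_sobol(linear_test_function, dim=2, index=0,
                             n_samples=100_000)
print(sobol_x1)  # should be close to 0.2
```

When the target index is small, the covariance being estimated is small relative to the sampling noise, which is the regime in which the abstract reports the largest gains for the new estimator.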
Improving Importance Sampling estimators for rare event probabilities requires sharp approximations of conditional densities. This is achieved for events E_{n} := (f(X_{1}) + ... + f(X_{n})) \in A_{n}, where the summands are i.i.d. and E_{n} is a large or moderate deviation event. The approximation of the conditional density of the real r.v.'s X_{i}, for 1 \leq i \leq k_{n}, with respect to E_{n} on long runs, when k_{n}/n \to 1, is handled. The maximal value of k compatible with a given accuracy is discussed; algorithms and simulated results are presented.
This paper surveys efficient techniques for estimating, via simulation, the probabilities of certain rare events in queueing and reliability models. The rare events of interest are long waiting times or buffer overflows in queueing systems, and system failure events in reliability models of highly dependable computing systems. The general approach to speeding up such simulations is to accelerate the occurrence of the rare events by using importance sampling. In importance sampling, the system is simulated using a new set of input probability distributions, and unbiased estimates are recovered by multiplying the simulation output by a likelihood ratio. Our focus is on describing asymptotically optimal importance sampling techniques. Using asymptotically optimal importance sampling, only a fixed number of samples is required to get accurate estimates, no matter how rare the event of interest is. In practice, this means that the required run lengths can be reduced by many orders of magnitude, compared to standard simulation. The queueing systems studied include simple queues (e.g., GI/GI/1) and discrete time queues with multiple autocorrelated arrival processes that arise in the analysis of Asynchronous Transfer Mode communications switches. References for results on Jackson networks and tree-structured networks of ATM switches are given. Both Markovian and non-Markovian reliability models are treated.
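The likelihood-ratio mechanism described in this abstract can be seen in a one-dimensional toy problem. The sketch below estimates P(X > a) for X exponential with rate 1 by sampling from an exponential with a smaller rate and reweighting each hit by the ratio of the original density to the sampling density. The choice of sampling rate 1/a, the threshold a = 20, and the function name are illustrative assumptions. This is not one of the queueing or reliability schemes surveyed in the paper, and no optimality is claimed for it.

```python
import math
import random


def estimate_tail_probability(a, n_samples, seed=0):
    """Importance-sampling estimate of P(X > a) for X ~ Exp(rate 1)."""
    rng = random.Random(seed)
    tilted_rate = 1.0 / a  # sampling distribution has mean a, so hits are common
    total = 0.0
    for _ in range(n_samples):
        x = rng.expovariate(tilted_rate)
        if x > a:
            # Likelihood ratio: original density over sampling density at x.
            total += math.exp(-x) / (tilted_rate * math.exp(-tilted_rate * x))
    return total / n_samples


estimate = estimate_tail_probability(a=20.0, n_samples=100_000)
exact = math.exp(-20.0)
print(f"estimate = {estimate:.3e}, exact = {exact:.3e}")
```

With plain Monte Carlo, an event of probability about 2e-9 would almost never be observed in 100,000 draws, so the naive estimate would typically be zero.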
2: High-level description of the algorithm  
1: Figure 1 illustrates a sample path {S_n(µ) : 0 ≤ n ≤ 12}. If we set m = 1 and L = 2, then the corresponding stopping times are D_1 = 4, U_1 = 6, D_2 = 9. If in addition U_2 = ∞, then S_n(µ) stays below the dotted bold line for all n ≥ D_2. Therefore, at time t = D_2 the values of {M_n : 0 ≤ n ≤ 7} can be calculated and we can update C_UB ← S_{D_2}(µ) + 1.
We provide the first algorithm that, under minimal assumptions, makes it possible to simulate the stationary waiting-time sequence of a single-server queue backwards in time, jointly with the input processes of the queue (inter-arrival and service times). The single-server queue is useful in applications of DCFTP (Dominated Coupling From The Past), which is a well-known protocol for simulation without bias from steady-state distributions. Our algorithm terminates in finite time assuming only finite mean of the inter-arrival and service times. In order to simulate the single-server queue in stationarity until the first idle period in finite expected termination time, we require the existence of finite variance. This requirement is also necessary for such idle time (which is a natural coalescence time in DCFTP applications) to have finite mean. Thus, in this sense, our algorithm is applicable under minimal assumptions.
Importance sampling algorithms for heavy-tailed random walks are considered. Using a specification with algorithms based on mixtures of the original distribution with some other distribution, sufficient conditions for obtaining bounded relative error are presented. It is proved that mixture algorithms of this kind can achieve asymptotically optimal relative error. Some examples of mixture algorithms are presented, including mixture algorithms using a scaling of the original distribution, and the bounds of the relative errors are calculated. The algorithms are evaluated numerically in a simple setting.
Based on the theory of stochastic chemical kinetics, the inherent randomness and stochasticity of biochemical reaction networks can be accurately described by discrete-state continuous-time Markov chains. The analysis of such processes is, however, computationally expensive and sophisticated numerical methods are required. Here, we propose an analysis framework in which we integrate a number of moments of the process instead of the state probabilities. This results in a very efficient simulation of the time evolution of the process. In order to regain the state probabilities from the moment representation, we combine the fast moment-based simulation with a maximum entropy approach for the reconstruction of the underlying probability distribution. We investigate the usefulness of this combined approach in the setting of stochastic chemical kinetics and present numerical results for three reaction networks showing its efficiency and accuracy. Besides a simple dimerization system, we study a bistable switch system and a multi-attractor network with complex dynamics.
ion for Memory-System Simulation. Alvin R. Lebeck and David A. Wood, Computer Sciences Department, University of Wisconsin--Madison, 1210 West Dayton Street, Madison, WI 53706 USA. Abstract: This paper describes the active memory abstraction for memory-system simulation. In this abstraction, designed specifically for on-the-fly simulation, memory references logically invoke a user-specified function depending upon the reference's type and accessed memory block state. Active memory allows simulator writers to specify the appropriate action on each reference, including "no action" for the common case of cache hits. Because the abstraction hides implementation details, implementations can be carefully tuned for particular platforms, permitting much more efficient on-the-fly simulation than the traditional trace-driven abstraction. Our SPARC implementation, Fast-Cache, executes simple data cache simulations two or three times faster than a highly-tuned trace-driven...
Architecture for a Command Post Exercise (CPX).
Concept Implementation Activities.
Confederation Deployment Activities.
This article describes the verification, validation and accreditation process used within one of the largest current ADS efforts, the Aggregate Level Simulation Protocol (ALSP) Joint Training Confederation (JTC). The JTC is a collection of constructive training simulations that supports joint training at the command and battle staff levels during several major exercises each year. The primary objective of the article is to document "lessons learned," both in terms of failures as well as successes, for the benefit of future, similar systems. The remainder of this article is organized as follows. Section 2 provides the context for the case study, outlining the terminology used in the article and briefly reviewing both ALSP and the JTC. Section 3 presents a development process model for the JTC and highlights the VV&A activities within it. An evaluation of the process is given in Section 4. Initiatives that mark the future for joint training are briefly discussed in Section 5, and conclusions appear in Section 6.
7 [Simulation and Modeling]: Simulation Support Systems. General Terms: Experimentation, Measurement, Performance. A preliminary version of this work appeared in the Proceedings of the 1994 ACM SIGMETRICS Conference on Measurement and Modeling of Computer Systems, May 1994. This work was supported by NSF grant number MIP-94085550 and Innovative Science and Technology contract number DASG60-93-C-0126 provided by the Ballistic Missile Defense Organization and managed through the Strategic Defense Command Advanced Technology Directorate Processing Division. The work of the first author is also supported by DoD/AFOSR grant number F49620-96-1-0472 and NSF grant number CDA-9529541. NOTE: This is a preliminary release of an article accepted by the ACM Transactions on Modeling and Computer Simulation. The definitive version is currently in production at ACM and, when released, will supersede this version. Copyright (C) 1996 by the Association for Computing Mach
This paper describes and evaluates an efficient execution-driven technique for the simulation of multiprocessors that includes the simulation of system memory and is driven by real program workloads. The technique produces correctly interleaved address traces at run time without disk access overhead or hardware support, allowing accurate simulation of the effects of a variety of architectural alternatives on programs. We have implemented a simulator based on this technique that offers substantial advantages in terms of reduced time and space overheads when compared to instruction-driven or trace-driven simulation techniques, without significant loss of accuracy. The paper presents the results of several validation experiments used to quantify the accuracy and efficiency of the simulator for sequential, distributed, and shared-memory multiprocessors, and several parallel programs. These experiments show that prediction errors of less than 5% compared to actual execution times and overhe...
AES, the Advanced Encryption Standard, is one of the most important algorithms in modern cryptography. Certain randomness properties of AES are of vital importance for its security. At the same time, these properties make AES an interesting candidate for a fast nonlinear random number generator for stochastic simulation. In this article, we address both of these aspects of AES. We study the performance of AES in a series of statistical tests that are related to cryptographic notions like confusion and diffusion. At the same time, these tests provide empirical evidence for the suitability of AES in stochastic simulation. A substantial part of this article is devoted to the strategy behind our tests and to their relation to other important test statistics like Maurer's Universal Test.
This paper describes methods for simulating continuous time Markov chain models, using parallel architectures. The basis of our method is the technique of uniformization; within this framework there are a number of options concerning optimism and aggregation. We describe four different variations, paying particular attention to an adaptive method that optimistically assumes upper bounds on the rate at which one processor affects another in simulation time, and which recovers from violations of this assumption using global checkpoints. We describe our experiences with these methods on a variety of Intel multiprocessor architectures, including the Touchstone Delta, where excellent speedups of up to 220 using 256 processors are observed. Portions of this paper are reprinted with permission from "Parallel Algorithms for Simulating Continuous Time Markov Chains" in Proceedings of the 1993 Workshop on Parallel and Distributed Simulation, and from "Parallel Simulation of Markovian Que...
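As background for the uniformization technique this abstract builds on, here is a minimal sequential sketch, not the parallel methods of the paper. Candidate event times come from a Poisson process whose rate is at least the largest exit rate of any state, and at each candidate time the chain moves according to the matrix P = I + Q/rate, which may leave the state unchanged. The two-state generator Q, the horizon, and the function name are illustrative assumptions.

```python
import random


def simulate_uniformized(Q, start, horizon, seed=0):
    """Simulate a CTMC with generator Q by uniformization.

    Returns the fraction of time spent in each state over [0, horizon].
    """
    rng = random.Random(seed)
    n = len(Q)
    rate = max(-Q[i][i] for i in range(n))  # uniformization rate
    # Transition matrix of the embedded discrete-time chain: P = I + Q / rate.
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(n)]
         for i in range(n)]
    t, state = 0.0, start
    occupancy = [0.0] * n
    while True:
        dt = rng.expovariate(rate)  # time to the next candidate event
        if t + dt >= horizon:
            occupancy[state] += horizon - t
            break
        occupancy[state] += dt
        t += dt
        # Pseudo-transitions (staying in the same state) are allowed.
        state = rng.choices(range(n), weights=P[state])[0]
    return [x / horizon for x in occupancy]


# Hypothetical two-state chain: leaves state 0 at rate 1 and state 1 at rate 2.
Q = [[-1.0, 1.0],
     [2.0, -2.0]]
fractions = simulate_uniformized(Q, start=0, horizon=20000.0)
print(fractions)  # long-run fractions should approach [2/3, 1/3]
```

Because every state shares the same event clock, the event times can be generated independently of the state trajectory, which is one reason this construction is a convenient basis for parallel simulation.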
One of the most fundamental and frequently used operations in the process of simulating a stochastic discrete event system is the generation of a nonuniform discrete random variate. The simplest form of this operation can be stated as follows: Generate a random variable X which is distributed over the integers 1,2,...,n such that P(X = i) = p_i. A more difficult problem is to generate X when the p_i's change with time. For this case, there is a well-known algorithm which takes O(log n) time to generate each variate. Recently Fox [4] presented an algorithm that takes an expected o(log n) time to generate each variate under assumptions restricting the way the p_i's can change. In this paper we present algorithms for discrete random variate generation that take an expected O(1) time to generate each variate. Furthermore, our assumptions on how the p_i's change are less restrictive than those of Fox. The algorithms are quite simple and can be fine-tuned to suit a wide variety of applications. The application to the simulation of queueing networks is discussed in some detail.
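To make the O(log n) baseline concrete, here is one common way to support both weight updates and sampling in logarithmic time, using a binary indexed (Fenwick) tree over unnormalized weights. This is an illustrative sketch only: it is not claimed to be the specific algorithm the abstract refers to, and it is not the expected O(1) method of the paper. The class name DynamicDiscreteSampler and its methods are hypothetical.

```python
import random


class DynamicDiscreteSampler:
    """Sample index i with probability weights[i] / sum(weights).

    Both update() and sample() run in O(log n) time.
    """

    def __init__(self, weights):
        self.n = len(weights)
        self.weights = [0.0] * self.n
        self.tree = [0.0] * (self.n + 1)  # 1-based Fenwick tree
        for i, w in enumerate(weights):
            self.update(i, w)

    def update(self, i, w):
        """Set the weight of item i to w."""
        delta = w - self.weights[i]
        self.weights[i] = w
        k = i + 1
        while k <= self.n:
            self.tree[k] += delta
            k += k & -k

    def total(self):
        """Return the sum of all weights."""
        s, k = 0.0, self.n
        while k > 0:
            s += self.tree[k]
            k -= k & -k
        return s

    def sample(self, rng):
        """Draw one index by descending the implicit tree."""
        target = rng.random() * self.total()
        pos = 0
        step = 1 << (self.n.bit_length() - 1)
        while step:
            nxt = pos + step
            if nxt <= self.n and self.tree[nxt] <= target:
                pos = nxt
                target -= self.tree[nxt]
            step >>= 1
        return min(pos, self.n - 1)  # clamp guards floating-point round-off


if __name__ == "__main__":
    sampler = DynamicDiscreteSampler([1.0, 0.0, 3.0])
    rng = random.Random(0)
    print([sampler.sample(rng) for _ in range(10)])
```

Storing unnormalized weights means a single changed p_i touches only O(log n) tree nodes, with no renormalization pass over the whole vector.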
Maximum Generated Objects By Model Per Program than the homogeneous models. In particular, the Trans model displays a significant amount of variance in behavior in the espresso program. The figures also illustrate some of the characteristics of the test programs. gawk and perl are relatively well-behaved programs with respect to maximum bytes and objects allocated. In particular, these programs maintain a relatively constant level of allocation throughout their execution. A more difficult time-varying behavior to model is indicated by the cfrac and gs applications, in which the number of allocated objects increased monotonically as the program ran. Finally, the espresso and cham programs showed a widely varying allocation of objects and bytes, and all models had trouble accurately estimating the correct maxima for these programs. In summary, most of the models very effectively emulate the total bytes and objects allocated by the test programs, while none of the models accurately emulate the maximum allocations for all the test programs. Because the models have trouble emulating maximum program allocations, using these models to generate synthetic system loads would not result in accurate loads. The heart of the problem lies with the inability of the models investigated to accurately track rapid time-varying changes in the amount of data allocated. The Time model was our attempt to address this problem, but as the figures in this section show, that model is not particularly accurate in any case.
Because dynamic memory management is an important part of a large class of computer programs, high-performance algorithms for dynamic memory management have been, and will continue to be, of considerable interest. We evaluate and compare models of the memory allocation behavior in actual programs and investigate how these models can be used to explore the performance of memory management algorithms. These models, if accurate enough, provide an attractive alternative to algorithm evaluation based on trace-driven simulation using actual traces. We explore a range of models of increasing complexity including models that have been used by other researchers. Based on our analysis, we draw three important conclusions. First, a very simple model, which generates a uniform distribution around the mean of observed values, is often quite accurate. Second, two new models we propose show greater accuracy than those previously described in the literature. Finally, none of the models investigated ap...
Simple failure biasing is an importance-sampling technique used to reduce the variance of estimates of performance measures and their gradients in simulations of highly reliable Markovian systems. Although simple failure biasing yields bounded relative error for the performance measure estimate when the system is balanced, it may not provide bounded relative error when the system is unbalanced. In this article, we characterize when the simple failure-biasing method produces estimators of a performance measure and its derivatives with bounded relative error. We derive a necessary and sufficient condition on the structure of the system under which the performance measure can be estimated with bounded relative error using simple failure biasing. Furthermore, a similar condition for the derivative estimators is established. One interesting aspect of the conditions is that they show that, to obtain bounded relative error, not only the most likely paths to system failure must be examined but also some secondary paths leading to failure. We also show by example that the necessary and sufficient conditions for a derivative estimator do not imply those for the performance measure estimator; i.e., it is possible to estimate a derivative more efficiently than the performance measure when using simple failure biasing.
Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works, requires prior specific permission and/or a fee. Permissions may be requested from Publications Dept, ACM Inc., 1515 Broadway, New York, NY 10036 USA, fax +1 (212) 869-0481. Analysis of Bounded Time Warp and Comparison with YAWNS. This paper studies an analytic model of parallel discrete-event simulation, comparing the YAWNS conservative synchronization protocol with Bounded Time Warp. The assumed simulation problem is a heavily loaded queueing network where the probability of an idle server is close to zero. We model workload and job routing in standard ways, then develop and validate methods for computing approximated performance measures as a function of the degree of optimism allowed, overhead costs of state-saving, rollback, and barrier synchronization, and workload aggregation. We find t...
This paper describes an efficient technique for estimating, via simulation, the probability of buffer overflows in a queueing model that arises in the analysis of ATM (Asynchronous Transfer Mode) communication switches. There are multiple streams of (autocorrelated) traffic feeding the switch that has a buffer of finite capacity. Each stream is designated as either being of high or low priority. When the queue length reaches a certain threshold, only high priority packets are admitted to the switch's buffer. The problem is to estimate the loss rate of high priority packets. An asymptotically optimal importance sampling approach is developed for this rare event simulation problem. In this approach, the importance sampling is done in two distinct phases. In the first phase, an importance sampling change of measure is used to bring the queue length up to the threshold at which low priority packets get rejected. In the second phase, a different importance sampling change of measure is used to move the queue length from the threshold to the buffer capacity.
We present a Markov chain Monte Carlo method for class throughputs in closed multiclass product-form networks. The method is as follows. For a given network, we construct a "regularized" network with a highly simplified structure that has the same steady-state distribution. We then simulate the regularized network. The method has performed reasonably well across a broad range of problems. We give a heuristic explanation of this. We prove that the regularized network "mixes in polynomial time" in some special cases. A revision of a manuscript entitled "An efficient Markov chain Monte Carlo algorithm for closed product-form networks." Supported in part under NSF grants DMI-9414630 and DMI-9713730. 1 Introduction. 1.1 Overview. Closed multiclass product-form (CMP) queueing networks are useful as models for manufacturing and communication systems [5, 7, 20, 26, 31, 34, 46, 49]. These networks are of interest because their steady-state distributions are known explicitly up to a normali...
Systems with both continuous and discrete behaviors can be modeled using a mixed-signal style or a hybrid systems style. This paper presents a component-based modeling and simulation framework that supports both modeling styles. The component framework, based on an actor meta-model, takes a hierarchical approach to managing heterogeneity in modeling complex systems. We describe how ordinary differential equations, discrete-event systems, and finite state machines can be built under this meta-model. A mixed-signal system is a hierarchical composition of continuous-time and discrete-event models, and a hybrid system is a hierarchical composition of continuous-time and finite-state-machine models. Hierarchical composition and information hiding help in building clean models and efficient execution engines. Simulation technologies, in particular the interaction between a continuous-time ODE solving engine and various discrete simulation engines, are discussed. A signal type system is introduced to schedule hybrid components inside a continuous-time environment. Breakpoints are used to control the numerical integration step sizes so that discrete events are handled properly. A "refiring" mechanism and a "rollback" mechanism are designed to manage continuous components inside a discrete-event environment. The technologies are implemented in the Ptolemy II software environment. Examples are given to show the applications of this framework in mixed-signal and hybrid systems.
C. D. Carothers, K. S. Perumalla and R. M. Fujimoto. 1. INTRODUCTION. Parallel simulation approaches can be broadly categorized as optimistic or conservative, depending on whether (transient) incorrect computation is ever permitted to occur during the execution. Optimistic parallel simulations permit potentially incorrect computation to occur, but undo or roll back such computation after realizing that it was in fact incorrect. The "computation" in simulation applications is one in which a set of operations, called the event computation, modifies a set of memory items, called the state. Hence, in order to roll back a computation, ...
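The notion of rolling back an event computation that modified the state can be illustrated with a small sketch based on state saving, in which a snapshot of the state is taken before every event and restored on rollback. This is only one generic way to undo events and is an assumption of this example; the truncated text above does not say which mechanism the paper uses. The class OptimisticLP and its methods are hypothetical names.

```python
import copy


class OptimisticLP:
    """A toy logical process that can undo optimistically executed events."""

    def __init__(self, state):
        self.state = state
        self.now = 0.0
        self.history = []  # (time before the event, snapshot before the event)

    def execute(self, timestamp, handler):
        """Save the state, then let the event handler modify it."""
        self.history.append((self.now, copy.deepcopy(self.state)))
        handler(self.state)
        self.now = timestamp

    def rollback(self, to_time):
        """Undo every executed event whose timestamp exceeds to_time."""
        while self.history and self.now > to_time:
            self.now, self.state = self.history.pop()


def increment(state):
    state["count"] += 1


if __name__ == "__main__":
    lp = OptimisticLP({"count": 0})
    for ts in (1.0, 2.0, 3.0):
        lp.execute(ts, increment)
    lp.rollback(1.5)  # e.g. a straggler message with timestamp 1.5 arrived
    print(lp.now, lp.state)
```

Copying the full state before each event is simple but costs memory and time proportional to the state size, which is the overhead that motivates cheaper rollback schemes.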
Modeling Cost/Performance of a Parallel Computer Simulator. Babak Falsafi and David A. Wood. This paper examines the cost/performance of simulating a hypothetical target parallel computer using a commercial host parallel computer. We address the question of whether parallel simulation is simply faster than sequential simulation, or if it is also more cost-effective. To answer this, we develop a performance model of the Wisconsin Wind Tunnel (WWT), a system that simulates cache-coherent shared-memory machines on a message-passing Thinking Machines CM5. The performance model uses Kruskal and Weiss's fork-join model to account for the...
Hat function for univariate normal density
Two construction points in a cone  
Different universal methods (also called automatic or black-box methods) have been suggested to sample from univariate log-concave distributions. The description of a suitable universal generator for multivariate distributions in arbitrary dimensions has not been published up to now. The new algorithm is based on the method of transformed density rejection. To construct a hat function for the rejection algorithm the multivariate density is transformed by a proper transformation T into a concave function (in the case of a log-concave density, T(x) = log(x)). Then it is possible to construct a dominating function by taking the minimum of several tangent hyperplanes which are transformed back by T^{-1} into the original scale. The domains of different pieces of the hat function are polyhedra in the multivariate case. Although this method can be shown to work, it is too slow and complicated in higher dimensions. In this paper we split the R^n into simple cones. The hat function is co...
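The hat construction described above (take T = log, bound the concave transformed density by the minimum of tangents, transform back) can be illustrated in the univariate case. The sketch below is an assumption-laden toy, not the multivariate cone algorithm of the paper: it targets the standard normal density, uses three hand-picked construction points -1, 0, 1, and all function names are hypothetical. With these points the tangents intersect at -1/2 and 1/2, and each of the three hat pieces has area 1, so a piece can be chosen uniformly.

```python
import math
import random

CONSTRUCTION_POINTS = (-1.0, 0.0, 1.0)


def log_density(x):
    """Log of the unnormalized standard normal density (a concave function)."""
    return -0.5 * x * x


def log_hat(x):
    """Minimum of the tangents to log_density at the construction points."""
    # Tangent at c: h(c) + h'(c) * (x - c) with h(c) = -c^2/2 and h'(c) = -c.
    return min(-0.5 * c * c - c * (x - c) for c in CONSTRUCTION_POINTS)


def sample_from_hat(rng):
    """Draw from the density proportional to exp(log_hat)."""
    piece = rng.randrange(3)       # the three pieces have equal area
    u = 1.0 - rng.random()         # u lies in (0, 1], so log(u) is finite
    if piece == 0:                 # exp(x + 1/2) on (-inf, -1/2]
        return -0.5 + math.log(u)
    if piece == 1:                 # constant 1 on (-1/2, 1/2]
        return u - 0.5
    return 0.5 - math.log(u)       # exp(1/2 - x) on [1/2, inf)


def sample_normal(rng):
    """Rejection sampling: accept x with probability density(x) / hat(x)."""
    while True:
        x = sample_from_hat(rng)
        if rng.random() <= math.exp(log_density(x) - log_hat(x)):
            return x


if __name__ == "__main__":
    rng = random.Random(0)
    print([round(sample_normal(rng), 3) for _ in range(5)])
```

Concavity of the log-density guarantees that every tangent lies above it, so the minimum of tangents is a valid hat. Adding construction points tightens the hat and raises the acceptance rate.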
We show how to solve the maximal cut and partition problems using a randomized algorithm based on the cross-entropy method. The proposed algorithm employs an auxiliary Bernoulli distribution, which transforms the original deterministic network into an associated stochastic one, called the associated stochastic network (ASN). Each iteration of the randomized algorithm for ASN involves the following two phases: 1. Generation of random cuts using a multidimensional Ber(p) distribution and calculation of the associated cut lengths (objective functions) and some related quantities, such as rare-event probabilities. 2. Updating the parameter vector p on the basis of the data collected in the first phase.
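The two phases listed above can be sketched for the maximal cut problem as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the elite fraction rho, the smoothing factor alpha, the sample size, the iteration count, and the choice to pin node 0 to one side (to break the symmetry between a cut and its complement) are all choices made for this example, and the function names are hypothetical.

```python
import random


def cut_value(weights, x):
    """Total weight of edges whose endpoints lie on different sides of x."""
    n = len(weights)
    return sum(weights[i][j]
               for i in range(n) for j in range(i + 1, n)
               if x[i] != x[j])


def ce_maxcut(weights, n_samples=200, rho=0.1, alpha=0.7, iters=30, seed=0):
    """Cross-entropy style search for a large cut in a weighted graph."""
    rng = random.Random(seed)
    n = len(weights)
    p = [0.5] * n
    p[0] = 1.0  # assumption: node 0 is always placed on side 1
    best_val, best_x = -1.0, None
    for _ in range(iters):
        # Phase 1: generate random cuts from Ber(p) and score them.
        samples = []
        for _ in range(n_samples):
            x = [1 if rng.random() < pi else 0 for pi in p]
            samples.append((cut_value(weights, x), x))
        samples.sort(key=lambda pair: pair[0], reverse=True)
        elite = samples[:max(1, int(rho * n_samples))]
        if elite[0][0] > best_val:
            best_val, best_x = elite[0]
        # Phase 2: move p toward the frequencies observed in the elite cuts.
        for i in range(n):
            freq = sum(x[i] for _, x in elite) / len(elite)
            p[i] = alpha * freq + (1.0 - alpha) * p[i]
    return best_val, best_x


if __name__ == "__main__":
    # A 4-cycle with unit weights; alternating sides cuts all four edges.
    w = [[0, 1, 0, 1],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [1, 0, 1, 0]]
    print(ce_maxcut(w))
```

As the iterations proceed, the entries of p drift toward 0 or 1, so the sampling distribution concentrates on cuts that resemble the best ones seen so far.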
In this paper, we present semantic translations for the actions of Demos, a process-based, discrete event simulation language. Our formal translation schema permits the automatic construction of a process algebraic representation of the underlying simulation model which can then be checked for freedom from deadlock and livelock, as well as system-specific safety and liveness properties. As simulation methodologies are increasingly being used to design and implement complex systems of interacting objects, the ability to perform such verifications is of increasing methodological importance. We also present a normal form for the syntactic construction of Demos programs which allows for the direct comparison of such programs (two programs with the same normal form must execute in identical fashion), reduces model proof obligations by minimising the number of language constructs, and permits an implementer to concentrate on the basic features of the language (any program implementation whic...
This paper introduces Latin supercube sampling (LSS) for very high dimensional simulations, such as arise in particle transport, finance and queuing. LSS is developed as a combination of two widely used methods: Latin hypercube sampling (LHS), and Quasi-Monte Carlo (QMC). In LSS, the input variables are grouped into subsets, and a lower dimensional QMC method is used within each subset. The QMC points are presented in random order within subsets. QMC methods have been observed to lose effectiveness in high dimensional problems. This paper shows that LSS can extend the benefits of QMC to much higher dimensions, when one can make a good grouping of input variables. Some suggestions for grouping variables are given for the motivating examples. Even a poor grouping can still be expected to do as well as LHS. The paper also extends LHS and LSS to infinite dimensional problems. The paper includes a survey of QMC methods, randomized versions of them (RQMC) and previous methods for extending Q...
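The grouping idea can be sketched as follows: split the input variables into subsets, generate a low-dimensional QMC point set for each subset, and present those points in an independent random order within each subset before concatenating them. The sketch is a simplified illustration, not the construction analyzed in the paper: it uses plain Halton points as the QMC method, supports group sizes up to six, and the function names are hypothetical.

```python
import random

PRIMES = [2, 3, 5, 7, 11, 13]  # bases for up to six Halton coordinates


def radical_inverse(index, base):
    """Reflect the base-b digits of index about the radix point."""
    result, frac = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * frac
        index //= base
        frac /= base
    return result


def halton(n, dim):
    """First n points of the dim-dimensional Halton sequence (index from 1)."""
    return [[radical_inverse(i + 1, PRIMES[d]) for d in range(dim)]
            for i in range(n)]


def latin_supercube_sketch(n, group_sizes, rng):
    """Build n points whose coordinates are grouped QMC blocks.

    Each group gets its own QMC point set, and the points of that set are
    assigned to the n rows in an independent random order.
    """
    rows = [[] for _ in range(n)]
    for size in group_sizes:
        points = halton(n, size)
        order = list(range(n))
        rng.shuffle(order)  # random presentation order within this group
        for row, src in zip(rows, order):
            row.extend(points[src])
    return rows


if __name__ == "__main__":
    for point in latin_supercube_sketch(4, [2, 3], random.Random(0)):
        print([round(v, 3) for v in point])
```

Within a group the points keep their low-discrepancy structure, while the independent shuffles decouple the groups from each other, much as the columns of a Latin hypercube sample are randomly matched.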
A comparison of A(s) (solid) and B(s) (dashed).  
One popular family of low discrepancy sets is the (t, m, s)-nets. Recently a randomization of these nets that preserves their net property has been introduced. In this article a formula for the mean square L2-discrepancy of (0, m, s)-nets in base b is derived. This formula has a computational complexity of only O(s log(N) + s^2) for large N or s, where N = b^m is the number of points. Moreover, the root mean square L2-discrepancy of (0, m, s)-nets is shown to be O(N^{-1}[log(N)]^{(s-1)/2}) as N tends to infinity, the same asymptotic order as the known lower bound for the L2-discrepancy of an arbitrary set.
We present a case study in using simulation at design time to predict the performance and scalability properties of a large-scale distributed object system. The system, called Consul, is a network management system that is designed to support hundreds of operators managing millions of network devices. It is essential that a system such as Consul be designed with performance and scalability in mind, but due to Consul's complexity and scale, it is hard to reason about performance and scalability using ad-hoc techniques. We built a simulation of Consul's design to guide the design process by enabling performance and scalability analysis of various design alternatives.
This paper deals with simulation of approximate models of dynamic systems. We propose an approach appropriate when the uncertainty intrinsic in some models cannot be reduced by traditional identification techniques, due to the impossibility of gathering experimental data about the system itself. The paper presents a methodology for qualitative modeling and simulation of approximately known systems. The proposed solution is based on fuzzy set theory, extending the power of traditional numerical-logical methods. We have implemented a fuzzy simulator that integrates a fuzzy, qualitative approach and traditional, quantitative methods. 1. Introduction. Simulation can be considered as a part of the process of modeling and forecasting the behavior of a dynamic system. Its task is to reproduce, in the most suitable way, the evolution of a system model in time (Zeigler, 1976). A model is a finite set of formal relations which, in the traditional scientific approach, are mathema...
A pot of boiling water. 
Simulation model types. 
Partitioning of boiling water state space. 
Homogeneous FSA refinement.
Continuous state space for boiling water system. 
Qualitative models arising in the artificial intelligence domain often concern real systems that are difficult to represent with traditional means. However, some promise for dealing with such systems is offered by research in simulation methodology. Such research produces models that combine both continuous and discrete event formalisms. Nevertheless, the aims and approaches of the AI and the simulation communities remain rather mutually ill-understood. Consequently, there is a need to bridge theory and methodology in order to have a uniform language when either analyzing or reasoning about physical systems. This article introduces a methodology and formalism for developing multiple, cooperative models of physical systems of the type studied in qualitative physics. The formalism combines discrete event and continuous models and offers an approach to building intelligent machines capable of physical modeling and reasoning. Categories and Subject Descriptors: I.2.4 [Artificia...
This paper presents HCSM, a framework for behavior and scenario control based on communicating hierarchical, concurrent state machines. We specify the structure and an operational execution model of HCSM's state machines. Without providing formal semantics, we provide enough detail to implement the state machines and an execution engine to run them. HCSM explicitly marries the reactive (or logical) portion of system behavior with the control activities that produce the behavior. HCSM state machines contain activity functions that produce outputs each time a machine is executed. An activity function's output value is computed as a function of accessible external data and the outputs of lower level state machines. We show how this enables HCSM to model behaviors that involve attending to multiple concurrent concerns and arbitrating between conflicting demands for limited resources. The execution algorithm is free of order dependencies that cause robustness and stability problems in behav...