Conference Paper

Parallel simulation: distributed simulation systems.


Abstract

Originating from basic research conducted in the 1970s and 1980s, the parallel and distributed simulation field has matured over the last few decades. Today, operational systems have been fielded for applications such as military training, analysis of communication networks, and air traffic control systems, to mention a few. The article gives an overview of technologies to distribute the execution of simulation programs over multiple computer systems. Particular emphasis is placed on synchronization (also called time management) algorithms as well as data distribution techniques.


... To address the above question, this paper proposes a multi-homing network design model (MHND). In MHND, each user belongs to multiple servers so that applications can continue to be used even in the event of user-server link failures or server failures; the order of event occurrence is guaranteed by using conservative synchronization [3]. The delay, which is determined by the largest delay among each user's multiple user-server links, is therefore unaffected by a link or server failure. ...
... Research works that guarantee the order of events have been studied in parallel and distributed processing. These works are mainly classified into two categories: conservative synchronization and optimistic synchronization [3]. In conservative synchronization, time information is given to events, and the events are rearranged in the order of occurrence before processing the application, thereby guaranteeing the order of the events. ...
... In terms of availability, the works mentioned above [3,5,6] are based on single-homing, which provides lower availability of services and may not provide service continuity under user-server link and server failures. The work in [8] introduced trailing state synchronization (TSS) for multi-player games with low latency but strong consistency requirements. ...
Article
When mission-critical applications are provided over a network, high availability is required in addition to low delay. This paper proposes a multi-homing network design model, named MHND, that achieves low delay, high availability, and an order guarantee of events. MHND maintains the event occurrence order with a multi-homing configuration using conservative synchronization. We formulate MHND as an integer linear programming problem to minimize the delay. We prove that the distributed server allocation problem with MHND is NP-complete. Numerical results indicate that, as the multi-homing number, which is the number of servers to which each user belongs, increases, availability increases at the cost of a higher delay. Notably, a multi-homing number of two or more achieves approximately an order of magnitude higher availability than conventional single-homing, at the expense of up to a twofold delay increase. By using MHND, flexible network design is achieved based on the acceptable delay of the service and the required availability.
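The conservative rule MHND relies on can be made concrete with a small sketch. The class below (names are illustrative, not from the paper) buffers timestamped events from multiple user-server links and releases one only when every link has advanced past its timestamp, so events are always processed in occurrence order:

```python
import heapq

class ConservativeMerger:
    """Buffer per-link events; release only those that no future arrival
    on any link can precede (conservative synchronization)."""

    def __init__(self, links):
        self.clock = {link: 0.0 for link in links}  # latest timestamp per link
        self.pending = []                           # min-heap of (ts, link, event)

    def receive(self, link, ts, event):
        self.clock[link] = ts
        heapq.heappush(self.pending, (ts, link, event))

    def safe_events(self):
        horizon = min(self.clock.values())          # every link has reached this time
        while self.pending and self.pending[0][0] <= horizon:
            yield heapq.heappop(self.pending)

# Events arriving out of order over a slow link are still delivered in occurrence order.
m = ConservativeMerger(["link_a", "link_b"])
m.receive("link_a", 3.0, "move")
m.receive("link_b", 1.0, "shoot")
m.receive("link_a", 5.0, "move")
m.receive("link_b", 4.0, "shoot")
print([ts for ts, _, _ in m.safe_events()])         # [1.0, 3.0, 4.0]
```

The event at t=5.0 stays buffered because the other link has only promised up to t=4.0; this is exactly the blocking behaviour that trades delay for ordering guarantees.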
... An alternative concept to time stepping is Discrete-Event Simulation (DES), or in other words, event-driven simulation. DES is particularly well adopted in computer science (e.g., [2,6]) where this methodology has been used to model temporal evolution of networks and engineering systems whose elements are connected to each other through deterministic or stochastic relations. In these systems events refer to local (quantitative or qualitative) changes to the global state of the system. ...
... Therefore, at each time clock t, PEP in general selects and processes a batch of events. This makes parallel EMAPS straightforward compared to conservative and optimistic approaches used to parallelize traditional discrete-event simulations (e.g., see [6,10]). ...
... Efficient parallelization of event-driven simulations is not straightforward. Traditional discrete-event applications use various deterministic and optimistic synchronization schemes [6]. Deterministic algorithms always execute events in the event timestamp increasing order, which typically results in too few operations per global synchronization step. ...
Preprint
Full-text available
EMAPS is a novel multiscale approach to physics-driven simulations.
... Fujimoto [14] identifies two generations of conservative synchronisation algorithms. The first generation of algorithms was introduced by Chandy, Misra, and Bryant [5][9] in the late 1970s. ...
... Many other approaches were developed to prevent deadlocks and lookahead-creep problems. Those solutions include, among others, an algorithm that detects deadlocks and resolves them [8], barrier synchronisation [14], and the usage of time intervals [13]. ...
Preprint
Full-text available
Industrial Cyber-Physical Systems (CPS) are sophisticated interconnected systems that combine physical and software components driving various industry sectors worldwide. Distributed CPS (dCPS) consist of many multi-core systems connected via complicated networks. During the development of dCPS, researchers and industrial designers need to consider various design options which have the potential to impact the system's behaviour, cost, and performance. The resulting growth in size and complexity poses new challenges for manufacturing companies in designing their next-generation machines. However, objectively evaluating these machines' vast number of potential arrangements can be resource-intensive. One potential alternative is to use simulations to model the systems and provide initial analysis with reduced overhead and costs. This literature review investigates state-of-the-art scalability techniques for system-level simulation environments, i.e. Simulation Campaigns, Parallel Discrete Event Simulations (PDES), and Hardware Accelerators. The goal is to address the challenge of scalable Design Space Exploration (DSE) for dCPS, discussing such approaches' characteristics, applications, advantages, and limitations. The conclusion recommends starting with simulation campaigns as those provide increased throughput, adapt to the number of tasks and resources, and are already implemented by many state-of-the-art simulators. Nevertheless, further research has to be conducted to define, implement, and test a sophisticated general workflow addressing the diverse sub-challenges of scaling system-level simulation environments for the exploration of industrial-size distributed Cyber-Physical Systems.
... Simulation is needed to test what will happen to systems/subsystems in certain scenarios while the system is running. It tests the "what if" [18][11] scenarios that might crop up and how they affect the whole system. A DES needs to be run hundreds of times to gather enough data for a reasonable estimate of the dynamic functioning of the whole system/subsystem. ...
... The process consumes much time when running sequentially on a single processor. Parallel DES can parallelise the simulation process and drastically cut down the time required [18]. This parallelisation mainly takes advantage of multiple cores in the system, or tens of cores in a cloud environment. ...
Conference Paper
Discrete-event Simulation has been used widely in academia as well as in industry in different areas. However, Discrete-event Simulation (DES) has severe limitations, as it is designed to work on a single processor as a single process. With Moore's Law coming to an end, processor designers embed more cores on a single processor. A standard DES cannot take advantage of multiple cores. Parallel Discrete Event Simulators (PDES) take advantage of multiple cores in modern processors. However, designing and implementing a simulation scenario as a PDES is challenging: the execution of the various processes must be synchronised, and the processes must communicate with each other. This paper attempts to create a simple PDES to describe the simulation of multiple lifts in Scala. We demonstrate the ease and flexibility of Scala in designing and developing a simulation using a multi-threading architecture.
... Backstroke produces E_forward (which performs forward execution and saves necessary information), E_reverse (which uses this saved information to reverse all effects), and E_commit (which performs actions that are irreversible or that should not be undone). Optimistic PDES [14] can then be performed, with any events executed in the wrong order corrected via reversal [4,5,6]. ...
... Each pass of the algorithm contains an even and an odd phase. During even phases (lines 8–27), each element at an even index is compared with its neighbour and potentially switched. Each comparison is performed in parallel and can be interleaved in any order. ...
Preprint
Full-text available
We introduce a method of reversing the execution of imperative concurrent programs. Given an irreversible program, we describe the process of producing two versions. The first performs forward execution and saves information necessary for reversal. The second uses this saved information to simulate reversal. We propose using identifiers to overcome challenges of reversing concurrent programs. We prove this reversibility to be correct, showing that the initial program state is restored and that all saved information is used (garbage-free).
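The forward/reverse/commit split described above can be illustrated with a toy state-saving scheme (the function names and dictionary state are illustrative assumptions; the actual tooling generates these versions from imperative source code):

```python
# E_forward: apply a write and log what is needed to undo it.
def forward(state, key, value, log):
    log.append((key, state.get(key)))       # save the overwritten value (None = absent)
    state[key] = value

# E_reverse: consume the log in LIFO order, restoring prior values.
def reverse(state, log):
    while log:
        key, old = log.pop()
        if old is None:
            del state[key]                  # the key did not exist before
        else:
            state[key] = old

# E_commit: discard the log once rollback is no longer possible.
def commit(log):
    log.clear()

state, log = {"x": 1}, []
forward(state, "x", 42, log)
forward(state, "y", 7, log)
reverse(state, log)
print(state)                                # {'x': 1} - restored, log empty (garbage-free)
```

The "garbage-free" property claimed in the abstract corresponds here to the log being fully consumed by either reversal or commit.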
... However, the DAS, which sends fixed-time/variable-time updates as timestamped messages from the physical system to the RtS, runs on a different computer. In the PADS literature, distributed simulation can be defined as the distribution of the execution of a single run of a simulation program across multiple processors (Fujimoto 2000). Thus the execution of the RtS should arguably draw on concepts from distributed simulation, since both work as one unifying system (namely, integrated DAS-RtS execution and distributed simulation federation), with the sub-systems executed on different computers and coordinated through the exchange of timestamped messages. ...
... form in terms of safety, time, and other resources. To promote the reusability and interoperability of simulation models, allowing them to interoperate without geographic constraints, Distributed Simulation (DS) has been introduced (Fujimoto, 2000). One of the most widely adopted standards for distributed simulation is IEEE 1516-2010 – High Level Architecture (IEEE Std. ...
... There is a long history with many other sophisticated methods for implementing consistent systems. In addition to the legacy from database systems [14], distributed simulation has also contributed a wealth of techniques [12]. Lingua Franca (see Section 3.4) realizes extensions of several of these techniques [1]. ...
... Towards this, researchers could investigate using Machine-Learning and AI-based approaches to generate synthetic data for use in RtS and DTs. Similarly, Parallel and Distributed Simulation (PADS) techniques, such as optimistic synchronisation (Fujimoto, 2001), could enable the rollback of computations when real-time feeds are (eventually) restored, which is yet another opportunity for future research. ...
... Similar to #Study 4, the final study is also an example of Model Type E. It is an application of DES with standards and techniques developed in a very specialized area of Computer Science/Applied Computing called Parallel and Distributed Simulation (PADS). Since the late 1970s, this field has studied approaches to distributing a simulation across many computers and linking together and reusing existing simulations running on one or more processors (Fujimoto, 2000). Coordinated execution of such distributed models over different computers requires specialist distributed computing software (also referred to as distributed simulation middleware). ...
... Centralized coordination is based on high-level architecture (HLA) [35] and other distributed simulation frameworks [36,37], with significant extensions that we describe here. Distributed simulation is a relevant problem because, usually, consistency trumps availability. ...
Article
Full-text available
Tiered distributed computing systems, where components run in Internet-of-things devices, in edge computers, and in the cloud, introduce unique difficulties in maintaining consistency of shared data while ensuring availability. A major source of difficulty is the highly variable network latencies that applications must deal with. It is well known in distributed computing that when network latencies rise sufficiently, one or both of consistency and availability must be sacrificed. This paper quantifies consistency and availability and gives an algebraic relationship between these quantities and network latencies. The algebraic relation is linear in a max-plus algebra and supports heterogeneous networks, where the communication latency between two components may differ from the latency between another two components. We show how to make use of this algebraic relation to guide design, enabling software designers to specify consistency and availability requirements and to derive from those the requirements on network latencies. We show how to design systems to fail in predictable ways when the network latency requirements are violated, by choosing to sacrifice either consistency or availability.
... Lookahead is a safe time-frame, in which LPs will neither generate nor receive new events. In this work, instead of using Null messages [Fujimoto 2001], we use control messages that carry a more broad set of attributes [Cai and Turner 1995]. Each control message has header fields as an array that stores the message exchange route (trace) between LPs. ...
Conference Paper
Full-text available
Hybrid synchronization provides more in-depth details about real distributed systems. However, several advances in algorithms that provide synchronization between local processes bring new difficulties when integrating them into existing simulation architectures. This paper explores an alternative architecture for providing hybrid synchronization. We present optimistic and conservative synchronization primitives and design mechanisms that enable LPs to cooperate during the execution of a simulation. The results show that our primitives improve the simulation in terms of rollback time and idleness.
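For readers unfamiliar with the conservative side of this design space, the sketch below shows the classic null-message (Chandy-Misra-Bryant) pattern that the control messages cited above generalize. It is a minimal illustration under simplifying assumptions, not the paper's architecture; all names are made up:

```python
import heapq
import itertools

_seq = itertools.count()   # heap tie-breaker so payloads are never compared

class LogicalProcess:
    def __init__(self, name, lookahead):
        self.name, self.lookahead, self.clock = name, lookahead, 0.0
        self.inbox = {}          # sender name -> min-heap of (ts, seq, payload)
        self.neighbours = []

    def connect(self, other):
        self.neighbours.append(other)
        other.inbox[self.name] = []

    def send(self, other, ts, payload=None):
        heapq.heappush(other.inbox[self.name], (ts, next(_seq), payload))

    def step(self):
        if any(not q for q in self.inbox.values()):
            return               # blocked until every input link has a promise
        horizon = min(q[0][0] for q in self.inbox.values())
        for q in self.inbox.values():
            while q and q[0][0] <= horizon:
                ts, _, payload = heapq.heappop(q)
                if payload is not None:            # None marks a null message
                    print(f"{self.name} executes {payload} at t={ts}")
        self.clock = horizon
        for n in self.neighbours:                  # promise: nothing before clock+lookahead
            self.send(n, self.clock + self.lookahead)

a, b = LogicalProcess("a", 1.0), LogicalProcess("b", 1.0)
a.connect(b); b.connect(a)
a.send(b, 0.5, "job")            # one real event; null messages keep the cycle live
for _ in range(3):
    b.step(); a.step()
```

Without the null messages, the two-LP cycle would deadlock immediately; the lookahead value bounds how far each promise can advance the neighbour's clock.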
... Co-simulation procedures have been developed and proposed by different researchers, in particular with the appearance of parallel processing capabilities in modern computers [1][2][3] and with network distributed simulation environments [4][5][6][7]. The computational efficiency opportunities opened up by parallelization led to the breaking down of models into simpler sub-systems and to the development of co-simulation methods to simulate them, both in the framework of finite elements [8,9] or smoothed particle hydrodynamics [10,11] and in the context of multibody dynamics [12][13][14]. ...
Article
Full-text available
Multibody dynamics methodologies have the potential to integrate the different disciplines, each with its own equilibrium equations and computational methods, in a single simulation environment. The key issue is not so much the ability to include the different modelling and solution methods in the same computer code, but much more the possibility of handling different codes, eventually programmed in different languages and using their own numerical methods, in a computational environment in which they exchange data in a controlled and efficient form. Such an environment, in which different codes co-exist and have coordinated time stepping procedures, is defined as a co-simulation. In this work, a co-simulation environment in which a multibody dynamics code is simulated concurrently with a finite element code is presented and demonstrated. The main goal behind this development is to provide a virtual scenario of a realistic interaction between train roof-mounted pantographs and the overhead contact line, also known as catenary, in which different paradigms for the development of mechatronic pantographs can be tested. The multibody code is the simulation tool for the multibody pantograph, while the finite element code is the computational tool in which the catenary is modelled and simulated. Each code has its own time integration algorithm, which requires that the equations of motion of the multibody and finite element models be solved at different time instants. This paper proposes a coordination strategy for the time stepping and the input-output data required by each of the time integrators that complete the co-simulation environment. The complete co-simulation methodology is demonstrated with the study of the interaction between pantographs and a catenary, with the objectives of providing a realistic virtual test ground for the development of mechatronic pantographs and of identifying the maximum operating velocity at which the pantograph-catenary couple can operate.
Conference Paper
Full-text available
This article introduces a novel methodology, Network Simulator-centric Compositional Testing (NSCT), to enhance the verification of network protocols with a particular focus on time-varying network properties. NSCT follows a Model-Based Testing (MBT) approach. These approaches usually struggle to test and represent time-varying network properties. NSCT also aims to achieve more accurate and reproducible protocol testing. It is implemented using the Ivy tool and the Shadow network simulator. This enables online debugging of real protocol implementations. A case study on an implementation of QUIC (picoquic) is presented, revealing an error in its compliance with a time-varying specification. This error has subsequently been rectified, highlighting NSCT's effectiveness in uncovering and addressing real-world protocol implementation issues. The article underscores NSCT's potential in advancing protocol testing methodologies, offering a notable contribution to the field of network protocol verification.
Preprint
Full-text available
Discrete-event (DE) systems are concurrent programs where components communicate via tagged events, where tags are drawn from a totally ordered set. Reactors are an emerging model of computation based on DE and realized in the open-source coordination language Lingua Franca. Distributed DE (DDE) systems are DE systems where the components (reactors) communicate over networks. The prior art has required that for DDE systems with cycles, each cycle must contain at least one logical delay, where the tag of events is incremented. Such delays, however, are not required by the elegant fixed-point semantics of DE. The only requirement is that the program be constructive, meaning it is free of causality cycles. This paper gives a way to coordinate the execution of DDE systems that can execute any constructive program, even one with zero-delay cycles. It provides a formal model that exposes exactly the information that must be shared across networks for such execution to be possible. Furthermore, it describes a concrete implementation that is an extension of the coordination mechanisms in Lingua Franca.
Article
This letter proposes a delay-sensitive network design scheme, DSND, for multi-service slice networks. DSND provides a virtual processing system where users can share the same application space regardless of distance-related delays. DSND introduces the concepts of a service slice and a service virtual time. A service slice is a virtual network comprising user and server nodes. A service virtual time is a time for eliminating the difference in delay caused by distance; the user's events are reordered into occurrence order. The difference between the current time and the service virtual time is the end-to-end delay shared by all users within the same service slice. We formulate DSND as an integer linear programming problem and compare the delay of DSND with that of a benchmark scheme where each user selects the closest server. Numerical results indicate that DSND can reduce the delay by 4–38 percent compared to the benchmark scheme.
Article
This paper proposes a network design model that considers data consistency for a delay-sensitive distributed processing system. Data consistency is determined by collating a server's own state with the states of slave servers. If the state does not match that of other servers, a rollback process is initiated to modify the state and guarantee data consistency. In the proposed model, the selected servers and the master-slave server pairs are determined to minimize the end-to-end delay and the delay for data consistency. We formulate the proposed model as an integer linear programming problem. We evaluate the delay performance and computation time of the proposed model in two network models with two, three, and four slave servers. The proposed model reduces the delay for data consistency by up to 31 percent compared to that of a typical model that collates the status of all servers at one master server. The computation time is a few seconds, which is acceptable for network design before service launch. These results indicate that the proposed model is effective for delay-sensitive applications.
Article
We discuss a novel approach for constructing deterministic reactive systems that revolves around a temporal model that incorporates a multiplicity of timelines. This model is central to Lingua Franca (LF), a polyglot coordination language and compiler toolchain we are developing for the definition and composition of concurrent components called reactors, which are objects that react to and emit discrete events. Our temporal model differs from existing models like the logical execution time (LET) paradigm and synchronous languages in that it reflects that there are always at least two distinct timelines involved in a reactive system; a logical one and a physical one—and possibly multiple of each kind. This paper explains how the relationship between events across timelines facilitates reasoning about consistency and availability across components in Cyber-Physical Systems (CPS).
Chapter
With the rapid development of wireless networks, wireless local area networks (WLAN) are becoming more complex and densely deployed, resulting in a significant increase in the time consumption of traditional serial simulations. Aiming at the time consumption problem of traditional discrete-event-based WLAN serial simulation, a parallel simulation method based on offline learning with non-uniform time slices is proposed, which effectively reduces the time consumption. Firstly, the parallel simulation task is modeled as the problem of completing the simulation task within a given time consumption threshold constraint based on a process pool. Secondly, the time consumption factor is obtained by offline learning of the simulation platform. Thirdly, a parallel simulation algorithm with non-uniform time slice division (NUTSD) based on the time consumption factor is proposed to analyze and solve the problem. Finally, the method is validated by simulation. The simulation results show that this method can greatly reduce time consumption. Keywords: WLAN; Processes Pool; Offline Learning; Non-Uniform Time Slice; Parallel Simulation
Chapter
Providing the appropriate infrastructure for simulation is the topic of this chapter of the SCS M&S Body of Knowledge. It provides sections on various simulation standards, standards organizations, and compliance certificates. Publications of some applicable codes of best practices and lessons learned are covered, as well as a section on resource repositories. The currently dominant standards, Distributed Interactive Simulation (DIS) and High-Level Architecture (HLA), conclude the chapter. Keywords: Modeling and simulation; Simulation standards; Simulation standard organizations; Distributed interactive simulation (DIS); High-level architecture (HLA)
Article
Full-text available
We extend the work of Ravipati et al. [Comput. Phys. Commun., 2022, 270, 108148] in benchmarking the performance of large-scale, distributed, on-lattice kinetic Monte Carlo (KMC) simulations. Our software package, Zacros, employs a graph-theoretical approach to KMC, coupled with the Time-Warp algorithm for parallel discrete event simulations. The lattice is divided into equal subdomains, each assigned to a single processor; the cornerstone of the Time-Warp algorithm is the state queue, to which snapshots of the KMC (lattice) state are saved regularly, enabling historical KMC information to be corrected when conflicts occur at the subdomain boundaries. Focusing on three model systems, we highlight the key Time-Warp parameters that can be tuned to optimise KMC performance. The frequency of state saving, controlled by the state saving interval, δ_snap, is shown to have the largest effect on performance, which favours balancing the overhead of re-simulating KMC history with that of writing state snapshots to memory. Also important is the global virtual time (GVT) computation interval, Δτ_GVT, which has little direct effect on the progress of the simulation but controls how often the state queue memory can be freed up. We find that a vector data structure is, in general, more favourable than a linked list for storing the state queue, due to the reduced time required for allocating and de-allocating memory. These findings will guide users in maximising the efficiency of Zacros or other distributed KMC software, which is a vital step towards realising accurate, meso-scale simulations of heterogeneous catalysis.
Article
This is Part 2 of a trio of papers intended to provide a unifying framework in which conservative and optimistic synchronization for parallel discrete event simulations (PDES) can be freely and transparently combined in the same logical process (LP) on an event-by-event basis. In this paper we continue the outline of an approach called Unified Virtual Time (UVT) that was introduced in Part 1, and show in detail, via two extended examples, how conservative synchronization can be refactored and combined with optimistic synchronization in the UVT framework. We describe UVT versions of both a basic time windowing algorithm called USTW (Unified Simple Time Windows) and a refactored version of the Chandy-Misra-Bryant Null Message algorithm called UCMB (Unified CMB).
Article
Algorithms for synchronization of parallel discrete event simulation have historically been divided between conservative methods that require lookahead but not rollback, and optimistic methods that require rollback but not lookahead. In this paper we present a new approach in the form of a framework called Unified Virtual Time (UVT) that unifies the two approaches, combining the advantages of both within a single synchronization theory. Whenever timely lookahead information is available, a logical process (LP) executes conservatively using an irreversible event handler. When lookahead information is not available the LP does not block, as it would in a classical conservative execution, but instead executes optimistically using a reversible event handler. The switch from conservative to optimistic synchronization and back is decided on an event-by-event basis by the simulator, transparently to the model code. UVT treats conservative synchronization algorithms as optional accelerators for an underlying optimistic synchronization algorithm, enabling the speed of conservative execution whenever it is applicable, but otherwise falling back on the generality of optimistic execution. We describe UVT in a novel way, based on fundamental invariants, monotonicity requirements, and synchronization rules. UVT permits zero-delay messages and pays careful attention to tie-handling using superposition. We prove that under fairly general conditions a UVT simulation always makes progress in virtual time. This is Part 1 of a trio of papers describing the UVT framework for PDES, mixing conservative and optimistic synchronization and integrating throttling control.
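A rough sketch of the event-by-event switch UVT describes (an illustration of the idea only; the framework's actual invariants, tie-handling, and synchronization rules are far richer): events at or below a conservatively safe horizon run through an irreversible handler, while later events run reversibly and can be rolled back.

```python
class HybridExecutor:
    """Per-event choice between irreversible (conservative) and reversible
    (optimistic) execution; optimistic events keep undo information."""

    def __init__(self):
        self.state = {"count": 0}
        self.undo = []                       # (timestamp, restore closure)

    def execute(self, ts, safe_horizon):
        if ts <= safe_horizon:
            self.state["count"] += 1         # conservative: no state saving
        else:
            old = self.state["count"]        # optimistic: save enough to undo
            self.state["count"] += 1
            self.undo.append((ts, lambda v=old: self.state.update(count=v)))

    def rollback(self, to_ts):
        """A straggler at to_ts undoes optimistic events after it, newest first."""
        while self.undo and self.undo[-1][0] > to_ts:
            self.undo.pop()[1]()

ex = HybridExecutor()
ex.execute(1.0, safe_horizon=2.0)            # runs conservatively
ex.execute(3.0, safe_horizon=2.0)            # runs optimistically
ex.execute(4.0, safe_horizon=2.0)            # runs optimistically
ex.rollback(3.5)                             # undoes only the event at t=4.0
print(ex.state["count"])                     # 2
```

The appeal of the unified view is visible even here: the conservative path pays no state-saving cost, and the optimistic path is only exercised when lookahead information runs out.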
Article
Distributed simulation has yet to be adopted significantly by the wider simulation community. Reasons for this might be that distributed simulation applications are difficult to develop and that access to multiple computing resources is required. Cloud computing offers low-cost, on-demand computing resources. Developing applications that can use cloud computing can also be complex, particularly those that can run on different clouds. Cloud-based Distributed Simulation (CBDS) is potentially attractive, as it may solve the computing resources issue while offering other cloud benefits, such as convenient network access. However, as possibly shown by the lack of sustainable approaches in the literature, the combination of cloud and distributed simulation may be far too complex for a general approach. E-Infrastructures have emerged as large-scale distributed systems that support high-performance computing in various scientific fields. Workflow Management Systems (WMS) have been created to simplify the use of these e-Infrastructures. There are many examples of where both technologies have been extended to use cloud computing. This article therefore presents our investigation into the potential of using these technologies for CBDS in the above context and the WORkflow architecture for cLoud-based Distributed Simulation (WORLDS), our contribution to CBDS. We present an implementation of WORLDS using the CloudSME Simulation Platform that combines the WS-PGRADE/gUSE WMS with the CloudBroker Platform as a Service. The approach is demonstrated with a case study using an agent-based distributed simulation of an Emergency Medical Service in REPAST and the Portico HLA RTI on the Amazon EC2 cloud.
Article
Full-text available
In this paper, we propose DSP (Distributed algorithm Simulation Platform), a novel process-based distributed algorithm simulation platform to simulate real distributed systems for the design and verification of distributed algorithms. DSP consists of computer processes, and each process simulates an individual distributed node. A DSP process is mainly composed of a communication module, a computation module, an internal storage module, and an external interaction module. DSP is a flexible, versatile, and scalable simulation platform. It supports the testing of applications in various fields. Small-scale experiments can be done with a single personal computer, while large-scale experiments can be carried out through cloud servers. The greatest highlight of DSP is that it is plug and play, where nodes can be freely added or deleted during the simulation process. DSP is now open-sourced on GitHub, https://github.com/Wales-Wyf/Distributed-Algorithm-Simulation-Platform-DSP--2.0.
Chapter
Full-text available
Conducting simulation studies within a model-based framework is a complex process, in which many different concerns must be considered. Central tasks include the specification of the simulation model, the execution of simulation runs, the conduction of systematic simulation experiments, and the management and documentation of the model’s context. In this chapter, we look into how these concerns can be separated and handled by applying domain-specific languages (DSLs), that is, languages that are tailored to specific tasks in a specific application domain. We demonstrate and discuss the features of the approach by using the modelling language ML3, the experiment specification language SESSL, and PROV, a graph-based standard to describe the provenance information underlying the multi-stage process of model development.
Article
The field of Supply Chain Management (SCM) is experiencing rapid strides in the use of Industry 4.0 technologies and the conceptualization of new supply chain configurations for online retail, sustainable and green supply chains, and the Circular Economy. Thus, there is an increasing impetus to use simulation techniques such as discrete-event simulation, agent-based simulation, and hybrid simulation in the context of SCM. In conventional supply chain simulation, the underlying constituents of the system, like manufacturing, distribution, retail, and logistics processes, are often modelled and executed as a single model. Unlike this conventional approach, a distributed supply chain simulation (DSCS) enables the coordinated execution of simulation models using specialist software. To understand the current state-of-the-art of DSCS, this paper presents a methodological review and categorization of literature in DSCS using a framework-based approach. Through a study of over 130 articles, we report on the motivation for using DSCS, the modelling techniques, the underlying distributed computing technologies and middleware, its advantages and a future agenda, and also limitations and trade-offs that may be associated with this approach. The increasing adoption of technologies like Internet-of-Things and Cloud Computing will ensure the availability of both data and models for distributed decision-making, which is likely to enable data-driven DSCS of the future. This review aims to inform organizational stakeholders, simulation researchers and practitioners, distributed systems developers and software vendors as to the current state-of-the-art of DSCS, and will inform the development of future DSCS using new applied computing approaches.
Article
Full-text available
The simulation of high-speed telecommunication systems such as ATM (Asynchronous Transfer Mode) networks has generally required excessively long run times. This paper reviews alternative approaches using parallelism to speed up simulations of discrete event systems, and telecommunication networks in particular. Subsequently, a new simulation method is introduced for the fast parallel simulation of a common network element, namely, a work-conserving finite capacity statistical multiplexer of bursty ON/OFF sources arriving on input links of equal peak rate. The primary performance measure of interest is the cell loss ratio, due to buffer overflows. The proposed method is based on two principal techniques: (1) the derivation of low-level (cell level) statistics from a higher level (burst level) simulation and (2) parallel execution of the burst level simulation program. For the latter, a time-division parallel simulation method is used where simulations operating at different intervals of simulated time are executed concurrently on different processors. Both techniques contribute to the overall speedup. Furthermore, these techniques support simulations that are driven by traces of actual network traffic (trace-driven simulation), in addition to standard models for source traffic. An analysis of this technique is described, indicating that it offers excellent potential for delivering good performance. Measurements of an implementation running on a 32 processor KSR-2 multiprocessor demonstrate that, for certain model parameter settings, the simulator is able to simulate up to 10 billion cell arrivals per second of wallclock time.
Conference Paper
Full-text available
The aggregate level simulation protocol (ALSP) concept was initiated by ARPA in January 1990, the first laboratory demonstration took place in January 1991, and the first fielding in support of a major military exercise took place in July 1992. Since then, the ALSP confederation of models has grown from the original two members to six. In support of this growing confederation, the ALSP Infrastructure Software (AIS) has evolved from its fundamental functionality to the current focus on improved confederation management and performance. This paper describes the evolution of the AIS from the initial prototype to the present, emphasizing the discovery of new requirements and how they were accommodated.
Conference Paper
Full-text available
We describe NetEffect, a highly scalable architecture for developing, supporting and managing large, media-rich, 3D virtual worlds used by several thousand geographically dispersed users on low-end computers (PCs) and modems. NetEffect partitions a whole virtual world into communities, allocates these communities among a set of servers, and migrates clients from one server to another as clients move through the communities. It devotes special attention to minimizing the network traffic, in particular the traffic that must go through servers. HistoryCity, a virtual world for children, has been developed on NetEffect and is currently being beta-tested for deployment in Singapore.
Conference Paper
Full-text available
The Data Distribution Management (DDM) service is one of the six services provided in the Runtime Infrastructure (RTI) of the High Level Architecture (HLA). Its purpose is to perform data filtering and reduce the irrelevant data communicated between federates. The two DDM schemes proposed for the RTI, region-based and grid-based DDM, aim to send as little irrelevant data to subscribers as possible, but only manage to filter part of this information, and some irrelevant data is still communicated. Previously (G. Tan et al., 2000), we employed intelligent agents to perform data filtering in HLA, implemented an agent-based DDM in the RTI (ARTI) and compared it with the other two filtering mechanisms. The paper reports on additional experiments, results and analysis using two scenarios: the AWACS sensing aircraft simulation and the air traffic control simulation scenario. Experimental results show that, compared with other mechanisms, the agent-based approach communicates only relevant data and minimizes network communication, and is also comparable in terms of time efficiency. Some guidelines on when the agent-based scheme can be used are also given.
Conference Paper
Full-text available
This paper addresses the issue of efficient and accurate performance prediction of large-scale message-passing applications on high performance architectures using simulation. Such simulators are often based on parallel discrete event simulation, typically using the conservative protocol to synchronize the simulation threads. The paper considers how a compiler can be used to automatically extract information about the lookahead present in the application, and how this can be used to improve the performance of the null protocol used for synchronization. These techniques are implemented in the MPI-Sim simulator and the dHPF compiler, which had previously been extended to work together for optimizing the simulation of local computational components of an application. The results show that the availability of lookahead information improves the runtime of the simulator by amounts ranging from 9% up to two orders of magnitude, with 30-60% improvements being typical for real-world codes. The experiments also show that these improvements are directly correlated with reductions in the number of null messages required by the simulations.
Article
Full-text available
Virtual time is a new paradigm for organizing and synchronizing distributed systems which can be applied to such problems as distributed discrete event simulation and distributed database concurrency control. Virtual time provides a flexible abstraction of real time in much the same way that virtual memory provides an abstraction of real memory. It is implemented using the Time Warp mechanism, a synchronization protocol distinguished by its reliance on lookahead-rollback, and by its implementation of rollback via antimessages.
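The lookahead-rollback mechanism can be sketched compactly. In the toy LP below (a simplification for illustration, not Jefferson's full protocol: there is no GVT computation or fossil collection), each processed event saves the prior state, and each sent message is remembered so a rollback can cancel it with an antimessage:

```python
class TimeWarpLP:
    def __init__(self):
        self.clock, self.state = 0.0, 0
        self.inbox = []          # messages received from other LPs
        self.processed = []      # (timestamp, saved state) before each event
        self.sent = []           # (send time, receiver, message) for antimessages

    def handle(self, ts, receiver=None):
        if ts < self.clock:      # straggler: undo optimistic work past ts
            self.rollback(ts)
        self.processed.append((ts, self.state))
        self.state += 1
        self.clock = ts
        if receiver is not None:
            msg = (ts + 1.0, "update")
            receiver.inbox.append(msg)
            self.sent.append((ts, receiver, msg))

    def rollback(self, ts):
        while self.processed and self.processed[-1][0] >= ts:
            _, saved = self.processed.pop()
            self.state = saved                       # restore from saved state
        while self.sent and self.sent[-1][0] >= ts:
            _, receiver, msg = self.sent.pop()
            receiver.inbox.remove(msg)               # antimessage annihilation

lp1, lp2 = TimeWarpLP(), TimeWarpLP()
lp1.handle(1.0, receiver=lp2)
lp1.handle(2.0, receiver=lp2)
lp1.handle(1.5)                  # straggler rolls back the event at t=2.0
print(lp1.state, len(lp2.inbox)) # 2 1 : one message was cancelled
```

In a real Time Warp system the antimessage travels through the network and may itself trigger a rollback at the receiver; here the annihilation is collapsed into a direct queue removal for brevity.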
Article
Full-text available
Thesis (MS)--Massachusetts Institute of Technology. Includes bibliographical references (p. 91-92).
Conference Paper
Full-text available
We present a classification that groups lookback into four types: direct strong lookback, universal strong lookback, direct weak lookback, and universal weak lookback. They are defined in terms of absolute and dynamic impact times. We discuss relationships between lookback types by considering when rollbacks and/or antimessages are avoided. From different types of lookback, we also derive three optimization techniques for optimistic simulation and point out their advantages over lazy cancellation. Finally, we show that all four types of lookback exist in the PCS network simulation and can be exploited by either lookback-based or optimistic protocols.
Conference Paper
Full-text available
We describe two major developments in the general network simulation integration system (Genesis): the support for BGP protocol in large network simulations and distribution of the simulation memory among Genesis component simulations. Genesis uses a high granularity synchronization mechanism between parallel simulations simulating parts of a network. This mechanism uses checkpointed simulation state to iterate over the same time interval until convergence. It also replaces individual packet data for flows crossing the network partitions with statistical characterization of such flows over the synchronization time interval. We had achieved significant performance improvement over the sequential simulation for simulations with TCP and UDP traffic. However, this approach cannot be used directly to simulate dynamic routing protocols that use underlying network for exchanging protocol information, as no packets are exchanged in Genesis between simulated network parts. We have developed a new mechanism to exchange and synchronize BGP routing data among distributed Genesis simulators. The extended Genesis allows simulations of more realistic network scenarios, including routing flows, in addition to TCP or UDP data traffic. Large memory size required by simulation software hinders the simulation of large-scale networks. Based on our new support of distributed BGP simulation, we developed an approach to construct and simulate networks on distributed memory using Genesis simulators in such a way that each participating processor possesses only data related to the part of the network it simulates. This solution supports simulations of large-scale networks on machines with modest memory size.
Conference Paper
Full-text available
Wireless networks' models differ from wired ones at least in the innovative dynamic effects of host mobility and the open-broadcast nature of the wireless medium. Topology changes due to simulated hosts' mobility map onto causality effects in the "areas of influence" of each mobile device. The analysis of wireless networks of interest today may include a potentially high number of simulated hosts, resulting in performance and scalability problems for discrete-event sequential simulation tools and methods on a single physical execution unit (PEU). In a distributed simulation, the main bottleneck becomes the communication and synchronization required to maintain the causality constraints between distributed model components. We propose an HLA-based, dynamic mechanism for the runtime management and allocation of model entities in a distributed simulation of wireless network models, over a cluster of PEUs. By adopting a runtime evaluation of causal bindings between model entities we map the causal effects of virtual topology changes to dynamic migration of data structures. Preliminary results demonstrate that the prototype heuristics lead to a reduction in the percentage of external communication between the PEUs, limited overheads and performance enhancements for a worst-case scenario.
Conference Paper
Full-text available
This paper describes a time warp mechanism designed to exploit temporal uncertainty (TU) in distributed simulation. Novel in the proposed approach are: a formal event model where events are assigned time intervals instead of the usual punctual timestamps; an aggressive cancellation technique which shifts overheads from communication to computation; and an implementation in Java which deploys a framework for distributed simulations over the Internet. The paper introduces the time warp mechanism and reports some experimental results using a large PCS model. The experiments confirm that TU is able to speed up simulation without compromising the accuracy of the results.
Article
Simulating asynchronous multiple-loop networks is commonly considered a difficult task for parallel programming. This paper presents two examples of asynchronous multiple-loop networks: a stylized queuing system and an Ising model. The network topology in both cases is an n × n grid on a torus. A new distributed simulation algorithm is demonstrated on these two examples. The algorithm combines three elements: 1) the bounded lag restriction, 2) precomputed minimal propagation delays, and 3) the so-called opaque periods. Theoretical performance evaluation suggests that if N processing elements (PEs) execute the algorithm in parallel and the simulated system exhibits sufficient density of events, then, on average, processing one event would require O(log N) instructions of one PE. In practice, the algorithm has achieved substantial speed-ups: the speed-up is greater than 16 using 25 PEs on a shared memory MIMD bus computer, and greater than 1900 using 2¹⁴ PEs on a SIMD computer.
Article
The successful application of optimistic synchronization techniques in parallel simulation requires that rollback overheads be contained. The chief contributions to rollback overhead in a Time Warp simulation are the time required to save state information and the time required to restore a previous state. Two competing techniques for reducing rollback overhead are periodic checkpointing (Lin and Lazowska, 1989) and incremental state saving (Bauer et al., 1991). This paper analytically compares the relative performance of periodic checkpointing to incremental state savings. The analytical model derived for periodic checkpointing is based almost entirely on the previous model developed by Lin (Lin and Lazowska, 1989). The analytical model for incremental state saving has been developed for this study. The comparison assumes an optimal checkpoint interval and shows under what simulation parameters each technique performs best.
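To make the trade-off concrete, here is an illustrative back-of-the-envelope cost model (the expressions are simplified assumptions for intuition, not the analytical models derived in these papers):

```python
def periodic_overhead(chi, snapshot_cost, event_cost, rollback_rate):
    """Periodic checkpointing: snapshot every chi events, but a rollback must
    coast forward through (chi - 1) / 2 events on average to rebuild state."""
    return snapshot_cost / chi + rollback_rate * event_cost * (chi - 1) / 2.0

def incremental_overhead(writes_per_event, log_cost, rollback_rate, rollback_len):
    """Incremental state saving: log every write, restore directly from the log."""
    logging = writes_per_event * log_cost
    restoring = rollback_rate * rollback_len * writes_per_event * log_cost
    return logging + restoring

# Sweep the checkpoint interval to see the classic U-shaped overhead curve.
for chi in (1, 2, 4, 8, 16, 32):
    print(f"chi={chi:2d}  overhead={periodic_overhead(chi, 50.0, 10.0, 0.05):6.2f}")
print("incremental:", incremental_overhead(3, 1.0, 0.05, 4))
```

With these made-up parameters the periodic overhead bottoms out at an intermediate interval, which is the qualitative behaviour behind the "optimal checkpoint interval" assumption in the comparison.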
Technical Report
Data messages, called protocol data units (PDUs), that are exchanged between simulation applications are defined. These PDUs provide information concerning simulated entity states and the types of entity interactions that take place in a distributed interactive simulation (DIS). The messages defined are for interactions that are primarily within visual range. Future versions of this standard will contain additional PDUs required to exchange information about interactions and functions not currently supported.
Conference Paper
Architectural advances are making PDES more difficult. Processor speed improves much more quickly than interprocessor communication and memory. Lagging memory performance has a much greater impact on optimistic techniques, especially for large scale models, but conservative techniques require more small protocol messages and are therefore impacted by slow IPC. The performance of conservative PDES is heavily dependent on the lookahead available in the simulation model. Finding larger lookahead not only allows for increased parallelism, it also reduces the number of protocol messages required. In this paper, a global view of a PDES model as a set of data flows is presented. Using this view, the lookahead of the model can be optimized, resulting in a significant decrease in protocol messages, with only a marginal increase in computation, using realistic, detailed models as examples.
Article
This paper is a reminder of the danger of allowing "risk" when synchronizing a parallel discrete-event simulation: a simulation code that runs correctly on a serial machine may, when run in parallel, fail catastrophically. This can happen when Time Warp presents an "inconsistent" message to an LP, a message that makes absolutely no sense given the LP's state. Failure may result if the simulation modeler did not anticipate the possibility of this inconsistency. While the problem is not new, there has been little discussion of how to deal with it; furthermore the problem may not be evident to new users or potential users of parallel simulation. This paper shows how the problem may occur, and the damage it may cause. We show how one may eliminate inconsistencies due to lagging rollbacks and stale state, but then show that so long as risk is allowed it is still possible for an LP to be placed in a state that is inconsistent with model semantics, again making it vulnerable to failure. We finally show how simulation code can be tested to ensure safe execution under a risk-free protocol. Whether risky or risk-free, we conclude that under current practice the development of correct and safe parallel simulation code is not transparent to the modeler; certain protections must be included in model code or model testing that are not rigorously necessary if the simulation were executed only serially.
Article
An approach to carrying out asynchronous, distributed simulation on multiprocessor message-passing architectures is presented. This scheme differs from other distributed simulation schemes because (1) the amount of memory required by all processors together is bounded and is no more than the amount required in sequential simulation and (2) the multiprocessor network is allowed to deadlock, the deadlock is detected, and then the deadlock is broken. Proofs for the correctness of this approach are outlined.
Article
Simulating asynchronous multiple-loop networks is commonly considered a difficult task for parallel programming. Two examples of asynchronous multiple-loop networks are presented in this article: a stylized queuing system and an Ising model. In both cases, the network is an n × n grid on a torus and includes at least an order of n² feedback loops. A new distributed simulation algorithm is demonstrated on these two examples. The algorithm combines three elements: (1) the bounded lag restriction; (2) minimum propagation delays; and (3) the so-called opaque periods. We prove that if N processing elements (PEs) execute the algorithm in parallel and the simulated system exhibits sufficient density of events, then, on average, processing one event would require O(log N) instructions of one PE. Experiments on a shared memory MIMD bus computer (Sequent's Balance) and on a SIMD computer (Connection Machine) show speed-ups greater than 16 on 25 PEs of a Balance and greater than 1900 on 2¹⁴ PEs of a Connection Machine.
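The bounded-lag idea lends itself to a short sketch. Under the simplifying assumption that any scheduled effect propagates with at least delay B, all pending events within B of the global minimum timestamp are mutually independent and can be processed as one concurrent batch. The snippet below illustrates only that windowing rule, omitting the per-pair propagation delays and opaque periods of the full algorithm:

```python
import heapq

B = 2.0                         # bounded lag window, assumed <= any propagation delay
events = [(0.5, "n0"), (1.0, "n1"), (3.5, "n0"), (4.2, "n1")]
heapq.heapify(events)

while events:
    floor = events[0][0]        # global minimum unprocessed timestamp
    batch = []
    while events and events[0][0] < floor + B:
        batch.append(heapq.heappop(events))
    # Everything in `batch` may run in parallel: any event one of them
    # schedules lands at ts + delay >= floor + B, outside this window.
    print(f"floor={floor}: process {batch} concurrently")
```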
Conference Paper
Data distribution management (DDM) is one of the services defined by the DoD High Level Architecture. DDM is necessary to provide efficient, scalable mechanisms for distributing state updates and interaction information in large scale distributed simulations. We describe data distribution management mechanisms (also known as filtering) used for real time training simulations. We propose a new DDM approach to multicast group allocation, which we refer to as a dynamic grid-based allocation. Our scheme is based on a combination of a fixed grid-based method, known for its low overhead and ease of implementation, and a sender-based strategy, which uses fewer multicast groups than the fixed grid-based method. We describe our DDM algorithm, its implementation, and report on the performance results that we have obtained using the RTI-Kit framework. These results include the outcome of experiments comparing our approach to the fixed grid-based method, and they show that our scheme is scalable and significantly reduces the message overhead of previous grid-based allocation schemes.
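A minimal sketch of the fixed grid-based scheme this work builds on (cell size and regions are made-up values; the paper's dynamic scheme additionally allocates multicast groups only where senders exist):

```python
CELL = 100.0  # grid cell size in routing-space units

def cells(region):
    """Grid cells overlapped by an axis-aligned region (x0, y0, x1, y1);
    each cell is backed by one multicast group."""
    x0, y0, x1, y1 = region
    return {(i, j)
            for i in range(int(x0 // CELL), int(x1 // CELL) + 1)
            for j in range(int(y0 // CELL), int(y1 // CELL) + 1)}

update_region = (150.0, 40.0, 260.0, 90.0)     # a publisher's update region
interest_region = (220.0, 0.0, 420.0, 120.0)   # a federate's subscription region

# The update reaches the subscriber iff the two cell sets intersect.
print(cells(update_region) & cells(interest_region))   # {(2, 0)}
```

The filtering is approximate: overlapping a shared cell does not guarantee the regions themselves overlap, which is exactly the residual irrelevant traffic the grid-based approach tolerates in exchange for cheap matching.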
Conference Paper
Supply-chain management covers the planning and management of material and information from the manufacturer through the distributors, and finally to the customer. With the globalization of markets, the optimization of supply-chain management becomes more and more important. Simulation of supply-chains can help in the optimization process by evaluating the impact of alternative policies. To support the reusability of existing simulation models in a supply-chain simulation, a common standard is required for enabling the interoperability amongst different simulation programs through a well-defined interface. The High Level Architecture (HLA) is an architecture for reuse and interoperation of simulations. We report our experiences on employing the HLA to support reusability and interoperability in semiconductor supply-chain simulation. Our experiments show that by fine-tuning the integration of the application with the HLA Run-Time Infrastructure (RTI), considerable performance improvements can be achieved.
Conference Paper
The authors describe a technique for performing parallel simulation of a trace of address references for the purpose of evaluating different cache structures. One way to achieve fast parallel simulation is to simulate the individual independent sets of a cache concurrently on different computers, but this technique is not efficient in a statistical sense because of a high correlation of the activity between different sets. Only a small fraction of sets should actually be simulated. To put parallelism to effective use, a trace of the sets to be simulated can be partitioned into disjoint time intervals, and each interval can be simulated concurrently. Because the contents of the cache are unknown at the start of the time intervals, this parallel simulation does not produce the correct counts of cache hits and misses. However, after simulating the trace in parallel, a small amount of resimulation can produce the correct counts. The resimulation effort required is proportional to the size of the cache simulated and not to the length of the trace.
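The set-partitioning idea is easy to sketch: each cache set sees only the addresses that map to it, so sets can be simulated independently (and, statistically, only a sample of them need be). The toy below uses a 2-way LRU cache with illustrative parameters:

```python
NUM_SETS, WAYS = 4, 2            # toy 2-way set-associative cache, one-word blocks

def simulate_set(set_idx, trace):
    """Simulate a single cache set against its own slice of the trace."""
    lru, hits = [], 0            # LRU order: least recent first
    for addr in trace:
        if addr % NUM_SETS != set_idx:
            continue             # this address maps to a different set
        tag = addr // NUM_SETS
        if tag in lru:
            hits += 1
            lru.remove(tag)      # will be re-appended as most recent
        elif len(lru) == WAYS:
            lru.pop(0)           # evict the least recently used block
        lru.append(tag)
    return hits

trace = [0, 4, 8, 0, 4, 1, 5, 1]
# Each set could run on its own processor; here we simply loop over them.
print(sum(simulate_set(s, trace) for s in range(NUM_SETS)))   # total hits: 1
```

The time-interval partitioning described in the abstract is the complementary trick: split the trace in time, simulate intervals concurrently with unknown initial cache contents, then resimulate briefly to repair the boundary effects.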
Conference Paper
Distributed synchronization for parallel simulation is generally classified as being either optimistic or conservative. While considerable investigations have been conducted to analyze and optimize each of these synchronization strategies, very little study of the definition and strictness of causality has been conducted. Do we really need to preserve causality in all types of simulations? The paper attempts to answer this question. We argue that significant performance gains can be made by reconsidering this definition to decide whether a parallel simulation needs to preserve causality. We investigate the feasibility of unsynchronized parallel simulation through the use of several queuing model simulations and present a comparative analysis between unsynchronized and Time Warp simulation.
Conference Paper
One of the six categories of management services provided in the Run Time Infrastructure (RTI) to federated simulations is time management. Currently, it provides only two message ordering policies, namely time stamp ordering and receipt ordering. Temporal anomalies occurring during the execution of a federation, due to heterogeneous latencies in the communication network, are not handled by receipt ordering. While time stamp ordering eliminates the temporal anomalies entirely, it incurs great communication latency and a huge bandwidth requirement. The paper presents a novel time management mechanism which provides a less costly message ordering service, namely causal ordering, to federates. It does not require the specification of lookahead and allows federates that do not require stringent message ordering properties to achieve much more efficient execution. A series of experiments has been carried out to benchmark the performance of this new time management mechanism, and the results show that it incurs a slight overhead compared to the receipt ordering mechanism but achieves significant performance improvement over the time stamp ordering mechanism.
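Causal ordering is commonly realized with vector clocks; the sketch below shows one textbook delivery rule (an illustration of the service's semantics, not the paper's RTI implementation): a message from sender s stamped with vector V is deliverable once V[s] equals the receiver's count for s plus one, and V[k] is at most the receiver's count for every other k.

```python
def deliverable(msg_clock, sender, local_clock):
    """Textbook causal-delivery condition for a vector-clock-stamped message."""
    return (msg_clock[sender] == local_clock[sender] + 1 and
            all(msg_clock[k] <= local_clock[k]
                for k in range(len(local_clock)) if k != sender))

def try_deliver(pending, local_clock, delivered):
    progress = True
    while progress:                       # delivering one message may unblock others
        progress = False
        for msg in list(pending):
            sender, msg_clock, payload = msg
            if deliverable(msg_clock, sender, local_clock):
                pending.remove(msg)
                local_clock[sender] += 1
                delivered.append(payload)
                progress = True

# Federate 1's reply (which causally follows federate 0's update) arrives
# first over a faster link, but is held until the update is delivered.
pending = [(1, [1, 1, 0], "reply"), (0, [1, 0, 0], "update")]
clock, out = [0, 0, 0], []
try_deliver(pending, clock, out)
print(out)   # ['update', 'reply']
```

Note that no lookahead is involved, which matches the abstract's claim: causal ordering constrains delivery only by observed dependencies, not by simulated time.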
Conference Paper
A variation of the Time Warp parallel discrete event simulation mechanism is presented that is optimized for execution on a shared memory multiprocessor. In particular, the direct cancellation mechanism is proposed that eliminates the need for anti-messages and provides an efficient mechanism for cancelling erroneous computations. The mechanism thereby eliminates many of the overheads associated with conventional, message-based implementations of Time Warp. More importantly, this mechanism effects rapid repairs of the parallel computation when an error is discovered. Initial performance measurements of an implementation of the mechanism executing on a BBN Butterfly multiprocessor are presented. These measurements indicate that the mechanism achieves good performance, particularly for many workloads where conservative clock synchronization algorithms perform poorly. Speedups as high as 56.8 using 64 processors were obtained. However, our studies also indicate that state saving overheads represent a significant stumbling block for many parallel simulations using Time Warp.
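A minimal sketch of the direct-pointer idea, assuming a shared address space where every scheduled event object is directly reachable from its creator. The names and structure are illustrative, not the paper's implementation.

    class Event:
        def __init__(self, time, action=None):
            self.time = time
            self.action = action
            self.children = []      # events this event scheduled
            self.cancelled = False

    def schedule(parent, child, pending):
        # The scheduling event keeps a direct pointer to what it
        # created, which is exactly what shared memory makes cheap.
        parent.children.append(child)
        pending.append(child)

    def cancel(event):
        # Invalidate an erroneous event and, transitively, everything
        # it caused; no anti-messages are created, sent, or matched.
        if event.cancelled:
            return
        event.cancelled = True
        for child in event.children:
            cancel(child)

    def next_event(pending):
        # The scheduler simply skips events flagged as cancelled.
        live = [e for e in pending if not e.cancelled]
        return min(live, key=lambda e: e.time) if live else None

Because cancellation chases pointers rather than propagating messages, an erroneous computation is repaired as fast as the pointer chain can be walked, which is the rapid-repair property the abstract emphasizes.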
Article
Most parallel simulation algorithms (e.g., Chandy and Misra’s algorithm or the Time Warp algorithm) are based on a “space-division” approach. The parallelism of this approach is limited by the causality constraints. Another approach, the “time-division” approach, may provide more parallelism if the time domain is appropriately partitioned. We present a time-division parallel simulation algorithm that partitions the time domain via state matching. We show that linear speedup can be achieved. For a complex system, the best parallel simulation approach is to integrate “time-division” and “space-division” algorithms: the simulated system is partitioned into several subsystems; a subsystem may be simulated by the time-division approach (e.g., our algorithm), while the overall system is simulated by the space-division approach.
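A toy version of the time-division scheme, assuming a deterministic step function with pre-drawn randomness so that segments can be re-run reproducibly; the state-matching test here is plain equality, whereas the paper's matching is more refined.

    def run_segment(state, step_fn, noise):
        # Deterministically advance the state through one time segment.
        for u in noise:
            state = step_fn(state, u)
        return state

    def time_division(initial, step_fn, noise, segments):
        seg = len(noise) // segments
        chunks = [noise[i * seg:(i + 1) * seg] for i in range(segments)]
        guesses = [initial] * segments    # naive guessed start states
        while True:
            # Independent segment runs: the parallelizable part.
            finals = [run_segment(g, step_fn, c)
                      for g, c in zip(guesses, chunks)]
            stable = True
            for i in range(segments - 1):
                if finals[i] != guesses[i + 1]:
                    guesses[i + 1] = finals[i]  # repair mismatched seam
                    stable = False
            if stable:
                return finals[-1]

Each repair round makes at least one more segment exact, so at most `segments` rounds are needed; the speedup materializes when guessed states match early, which is the point of state matching.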
Article
A new parallel simulation method that is amenable to an SIMD implementation is proposed for the class of stochastic decision-free Petri nets. This method is based on the (max,+)-linear structure of the recurrence equations that were established for this type of system. Two variants are analyzed, the spatial and the temporal methods. The spatial method allows one to simulate large networks. The temporal method, which generalizes to Petri nets a method that was introduced recently for queues, is of more use for simulating systems for a long time interval. The emphasis is on the spatial approach, which is shown to provide a simple way of estimating both the cycle time and the statistics of the marking process. The theoretical parallel complexity of this algorithm is first investigated. In particular, a few examples of practical interest are provided (blocking queues in tandem and a stochastic job shop model) for which the cost of simulating O(NT) events of a net of size T is in O(N log T) with this parallel simulation method, while the classical sequential discrete event simulation is in O(NT) at least. These theoretical considerations are confirmed by experimental results obtained from a prototype that was implemented on the Connection Machine.
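To make the structure concrete, the recurrence in question takes, e.g. for queues in tandem, the standard (max,+) form, reconstructed here from the general theory rather than quoted from the paper:

    D_i(n) = \max\bigl(D_{i-1}(n),\; D_i(n-1)\bigr) + s_i(n),
    \qquad\text{i.e.}\qquad
    x(n) = A(n) \otimes x(n-1),
    \quad (A \otimes x)_i = \max_j \bigl(A_{ij} + x_j\bigr),

where D_i(n) is the departure time of the n-th customer from station i and s_i(n) its service time. Each (max,+) matrix-vector product over a net of size T is a max-reduction per component and thus parallelizes to O(log T) depth, so N iterations cost O(N log T) in parallel against O(NT) sequentially, matching the bound quoted above.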
Article
A discrete event simulation model may contain several events that have the same timestamp, referred to as simultaneous events. In general, the results of a simulation depend on the order in which simultaneous events are executed. Simulation languages and protocols use different, sometimes ad hoc, tie-breaking mechanisms to order simultaneous events. As a result, it may be impossible to reproduce the results of a simulation model across different simulators. This article presents a systematic analysis of the lookahead requirements for sequential and parallel simulation protocols, utilizing the process-oriented world view, with respect to their ability to execute models with simultaneous events in a deterministic order. In particular, the article shows that most protocols, including the global event list protocol and commonly used parallel conservative and optimistic protocols, require that the simulation model provide some form of lookahead guarantee to enforce deterministic ordering of simultaneous events. The article also shows that the lookahead requirements for many protocols can be weakened if the model allows simultaneous events to be processed in a nondeterministic order. Finally, the lookahead properties that must be satisfied by a model in order for its execution to make guaranteed progress are derived using various simulation protocols.
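One common remedy, shown here as a generic illustration rather than a mechanism from the article, is to extend the timestamp into a lexicographic key so that ties are broken identically on every run; all field choices below are assumptions.

    import heapq
    import itertools

    class EventList:
        # The composite key makes the order of equal timestamps
        # reproducible across runs and across simulators.
        def __init__(self):
            self.heap = []
            self.counter = itertools.count()  # global scheduling order

        def schedule(self, time, priority, source_id, event):
            # Lexicographic tie-breaking: equal times fall through to
            # a model-level priority, then the scheduling source, then
            # FIFO scheduling order.
            key = (time, priority, source_id, next(self.counter))
            heapq.heappush(self.heap, (key, event))

        def pop(self):
            key, event = heapq.heappop(self.heap)
            return key[0], event

Such a key makes ties deterministic for one event list; as the article shows, parallel protocols additionally need lookahead guarantees from the model before they can enforce the same order across processors.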
Article
Distributed virtual reality systems require accurate, efficient remote rendering of animated entities in the virtual environment. Position, velocity, and acceleration information about each player is maintained at the player's local machine, but remote hosts must display this information in real-time to support interaction between users across the network. Prior applications have transmitted position information at the local frame rate, or they have relied on dead-reckoning protocols using higher derivative information to extrapolate entity position between less frequent updates. These approaches require considerable network bandwidth and at times exhibit poor behavior. This paper describes a position history-based protocol whose update packets contain only position information. Remote hosts extrapolate from several position updates to track the location and orientation of entities between infrequent updates. Our evaluation suggests that the position history-based protocol provides a network-scalable solution for generating smooth, accurate rendering of remote entities.
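A minimal dead-reckoning stand-in for the position-history idea, assuming linear extrapolation from the two most recent samples; the actual protocol fits smoother curves to a longer history, and everything named here is illustrative.

    class RemoteEntity:
        # Remote hosts receive only (time, position) samples; velocity
        # is estimated locally, never transmitted on the wire.
        def __init__(self):
            self.history = []                  # recent (time, position)

        def update(self, t, pos):
            self.history.append((t, tuple(pos)))
            self.history = self.history[-3:]   # keep a short history

        def position_at(self, t):
            if not self.history:
                return None
            if len(self.history) == 1:
                return self.history[-1][1]     # nothing to extrapolate from
            (t1, p1), (t2, p2) = self.history[-2:]
            v = [(b - a) / (t2 - t1) for a, b in zip(p1, p2)]
            # First-order extrapolation between infrequent updates.
            return tuple(p + vi * (t - t2) for p, vi in zip(p2, v))

Because updates carry positions only, the sender needs no agreement with receivers about derivative semantics, which is part of what makes the scheme network-scalable.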
Article
Ph.D. thesis, University of California, Los Angeles, 1985. Includes bibliographical references (leaves 211-216).
Article
A guide to the High Level Architecture (HLA), a new global standard used for building component-based computer models and simulations.
Article
Synchronous Parallel Environment for Emulation and Discrete-Event Simulation (SPEEDES) is a unified parallel simulation environment. It supports multiple synchronization protocols without requiring users to recompile their code. When a SPEEDES simulation runs on one node, all the extra parallel overhead is removed automatically at run time. When the same executable runs in parallel, the user preselects the synchronization algorithm from a list of options. SPEEDES currently runs on UNIX networks and on the California Institute of Technology/Jet Propulsion Laboratory Mark III Hypercube. SPEEDES also supports interactive simulations. Featured in the SPEEDES environment is a new parallel synchronization approach called Breathing Time Buckets. This algorithm uses some of the conservative techniques found in Time Bucket synchronization, along with the optimism that characterizes the Time Warp approach. A mathematical model derived from first principles predicts the performance of Breathing Time Buckets. Along with the Breathing Time Buckets algorithm, this paper discusses the rules for processing events in SPEEDES, describes the implementation of various other synchronization protocols supported by SPEEDES, describes some new ones for the future, discusses interactive simulations, and then gives some performance results.
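A schematic of the event-horizon idea behind Breathing Time Buckets, written as a sequential stand-in for the parallel algorithm. It assumes handlers generate only strictly later events, and everything named is illustrative; in SPEEDES proper the optimistic phase runs across processors and the horizon is established globally.

    import heapq

    def btb_cycle(pending, handler):
        # One cycle: pending is a heap of (time, event); handler(event)
        # returns newly generated (time, event) pairs.
        horizon = float("inf")
        processed = []            # (time, event, generated events)
        # Optimistic phase: process events while their timestamps stay
        # below the event horizon, i.e. below the earliest message
        # generated so far this cycle; the horizon can only shrink.
        while pending and pending[0][0] < horizon:
            t, e = heapq.heappop(pending)
            out = handler(e)
            processed.append((t, e, out))
            for nt, _ in out:
                horizon = min(horizon, nt)
        # Commit phase: events before the horizon cannot be affected
        # by any held message, so they commit and release their output;
        # events at or past the horizon roll back, their output is
        # discarded, and they are simply re-processed next cycle.
        committed = []
        for t, e, out in processed:
            if t < horizon:
                committed.append((t, e))
                for m in out:
                    heapq.heappush(pending, m)
            else:
                heapq.heappush(pending, (t, e))
        return committed

Calling btb_cycle repeatedly until the heap drains yields a complete run; holding messages until the horizon is known supplies the conservative side of the algorithm, while processing past unreleased messages supplies the optimism.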
Conference Paper
Federated simulation interfaces such as the high level architecture (HLA) were designed for interoperability, and as such are not traditionally associated with high-performance computing. We present results of a case study examining the use of federated simulations using runtime infrastructure (RTI) software to realize large-scale parallel network simulators. We examine the performance of two different federated network simulators, and describe RTI performance optimizations that were used to achieve efficient execution. We show that RTI-based parallel simulations can scale extremely well and achieve very high speedup. Our experiments yielded more than 80-fold scaled speedup in simulating large TCP/IP networks, demonstrating performance of up to 6 million simulated packet transmissions per second on a Linux cluster. Networks containing up to two million network nodes (routers and end systems) were simulated.
Conference Paper
HLA is a mechanism for interconnecting disparate simulations over a network. Its main application has been distributed wargaming, where simulations prepared by different organizations are combined in a virtual environment for a specific training exercise or study objective. The individual simulations are called federates in the HLA world, while the collection of federates that interoperate in the virtual world is called a federation. We investigate an alternate use of the HLA. It is possible to imagine that the federates are not disparate simulations, but a single simulation that has been partitioned along some logical axis and federated with itself. Such a self-federation has many attractive features. First, the data structures are identical among the various federates. Second, there are no problems with common definitions of data elements, messages, interactions, or algorithms, as the federates all derive from the same source code. Finally, the mechanism can be viewed as a lightweight parallel simulation engine, as all of the details of synchronization, time management, message passing, and monitoring have already been well thought out and implemented in the HLA system. The Total Airport and Airspace Model (TAAM) is a large air traffic simulation that is a worldwide standard for aviation analysis. Traditionally it has been used for regional studies, consisting of a small subset of airports and airspace. Recently, there has been interest in using TAAM for much larger scenarios, such as simulations of traffic throughout the entire United States. It is possible, but not practical, to run such simulations with TAAM.
Conference Paper
Time Warp is known for its ability to maximize the exploitation of the parallelism inherent in a simulation. However, this potential has been undermined by the cost of processing causality violations. Minimizing this cost has been one of the most challenging issues facing Time Warp. In this paper, we present dependence list cancellation, a direct cancellation technique for Time Warp which is intended for use in a distributed memory environment such as a network of workstations. This approach provides for the swift cancellation of erroneous events, thereby preventing the propagation of their (erroneous) descendants. The dependence list also provides an event filtering function which detects erroneous future events, and also reduces the number of anti-messages used in the simulation. Our experimental work indicates that dependence list cancellation results in a dramatic reduction in the time required to process causality violations in Time Warp.