Conference Paper

Conservative Synchronization in Object-Oriented Parallel Battlefield Discrete Event Simulations.


Abstract

We present a conservative strategy for spatially decomposed parallel discrete event battlefield simulation. The traditional null message algorithm provides a foundation from which a mapping to generic simulation attributes can be made. We informally discuss preservation of logical correctness and freedom from deadlock. Experimental results demonstrate the potential execution time savings when load imbalance is not dominant; more importantly, they highlight improvement opportunities in spite of potential load imbalance. The net result is that a very reasonable performance gain can be delivered for little effort in a way that supports good simulation system design principles. The approach is straightforward and can be easily implemented as part of a more general sequential or parallel simulation support environment. While the approach is expressed in terms of battlefield simulation, its essence applies to many simulation applications.
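The null message algorithm that the abstract builds on can be sketched as a per-process loop. The following is a minimal illustration of the generic Chandy-Misra-Bryant scheme, assuming FIFO channels with non-decreasing timestamps and a fixed per-process lookahead; all class and method names are invented for the example and are not taken from the paper.

```python
import heapq

class LogicalProcess:
    """Minimal sketch of one logical process (LP) in a conservative,
    null-message-based parallel simulation. Illustrative only."""

    def __init__(self, name, lookahead, neighbors):
        self.name = name
        self.lookahead = lookahead      # minimum delay this LP adds to any output
        self.neighbors = neighbors      # LPs on outgoing channels
        self.channel_clock = {}         # latest timestamp seen per input channel
        self.pending = []               # min-heap of (timestamp, event)
        self.clock = 0.0

    def receive(self, sender, timestamp, event=None):
        # A null message carries event=None: it only advances the channel clock,
        # promising that no earlier message will arrive on that channel.
        self.channel_clock[sender] = timestamp
        if event is not None:
            heapq.heappush(self.pending, (timestamp, event))

    def safe_time(self):
        # Events at or before the minimum input-channel clock can be processed
        # without risking a causality violation.
        return min(self.channel_clock.values(), default=float("inf"))

    def step(self):
        horizon = self.safe_time()
        while self.pending and self.pending[0][0] <= horizon:
            self.clock, event = heapq.heappop(self.pending)
            # ... process the event here, possibly sending real messages ...
        # Send null messages: promise neighbors nothing earlier than
        # clock + lookahead, which is what prevents deadlock.
        for nb in self.neighbors:
            nb.receive(self, self.clock + self.lookahead)
```

The null message (`event=None`) carries only a timestamp promise; that promise is what lets a neighbor advance its safe horizon past an otherwise empty channel instead of blocking forever.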


Article
This article focuses on splicing technology for two kinds of multi-channel display system, flat-panel screen and ring screen, and gives a realization method for each. A distributed communication environment is established using the TCP protocol, and the multi-channel systems are optimized by adding a maximum relevant waiting time to the communication process between the master side and the slave side, achieving a large-screen multi-channel 3D display system; a hardware solution for a passive multi-channel 3D display system is also given. The experimental results show that the system performs well in real-time behavior, consistency, and display quality.
Article
This paper addresses issues of implementation and performance optimization of simulations designed to model spatially explicit problems with the use of parallel discrete event simulation. A simulation system is presented that uses the optimistic protocol and runs on a distributed memory machine, the IBM SP. The efficiency of parallel discrete event simulations that use the optimistic protocol is strongly dependent on the overhead incurred by rollbacks. This paper introduces a novel approach to rollback processing which limits the number of events rolled back as a result of a straggler or antimessage. The method, called Breadth-First Rollback (BFR), is suitable for spatially explicit problems where the space is discretized and distributed among processes and simulation objects move freely in the space. BFR uses incremental state saving, allowing the recovery of causal relationships between events during rollback. These relationships are then used to determine which events need to be rolled back. This paper presents an application of BFR to the simulation of Lyme disease. Our results demonstrate an almost linear speedup, a dramatic improvement over the traditional approach to rollback processing. Additionally, BFR is used as the basis of a dynamic load balancing algorithm that migrates load between the simulation processes. A brief outline of the algorithm and its potential performance are presented. © 2002 Elsevier Science
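The incremental state saving that optimistic protocols such as BFR rely on can be illustrated generically: rather than copying the whole state at every event, each write logs the previous value so a rollback undoes writes in reverse order. This sketch is illustrative only; the class and method names are invented and it does not reproduce the paper's BFR mechanism.

```python
class IncrementalState:
    """Generic incremental state saving with undo-log rollback (sketch)."""

    def __init__(self, **initial):
        self.vars = dict(initial)
        self.undo_log = []              # (key, old_value), newest last

    def write(self, key, value):
        # Record the old value before overwriting, so it can be restored.
        self.undo_log.append((key, self.vars.get(key)))
        self.vars[key] = value

    def checkpoint(self):
        # A checkpoint is just a position in the undo log.
        return len(self.undo_log)

    def rollback(self, mark):
        # Undo writes newest-first until the log shrinks back to the mark.
        while len(self.undo_log) > mark:
            key, old = self.undo_log.pop()
            self.vars[key] = old
```

Rolling back costs time proportional to the number of writes since the checkpoint, rather than the size of the full state, which is why the technique suits large spatial states where each event touches only a few variables.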
Conference Paper
Full-text available
Load balancing is a critical issue for exploiting the parallelism in any application and, particularly, in battlefield simulation, where the computational load dynamically changes with both time and space. Domain decomposition is an effective means to balance the load distribution in battlefield simulation. However, finer domain decompositions that lead to better load balance incur heavier communication overhead. Earlier attempts at parallelizing battlefield simulation have traded load balance in favor of low communication overhead. In this paper, we present three parallel battlefield simulators, implemented on Intel's iPSC/2 and BBN Butterfly GP-1000 multicomputers, with finer domain decomposition, and address the communication overhead problem by processor allocation strategies that suit the underlying architecture of the machine. On the shared-memory BBN Butterfly, the strategy leads to a new parallel battlefield simulation with dynamic load balancing. Execution times of these simulators are provided, which show that the communication overhead is tolerable.
Article
Full-text available
The performance of a synchronous conservative parallel discrete-event simulation protocol is analyzed. The class of simulation models considered is oriented around a physical domain and possesses a limited ability to predict future behavior. A stochastic model is used to show that as the volume of simulation activity in the model increases relative to a fixed architecture, the complexity of the average per-event overhead due to synchronization, event list manipulation, lookahead calculations, and processor idle time approaches the complexity of the average per-event overhead of a serial simulation. The method is therefore within a constant factor of optimal. The analysis demonstrates that on large problems--those for which parallel processing is ideally suited--there is often enough parallel workload so that processors are not usually idle. The viability of the method is also demonstrated empirically, showing how good performance is achieved on large problems using a thirty-two node Intel iPSC/2 distributed memory multiprocessor.
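A synchronous conservative protocol of the kind analyzed above can be sketched as a barrier-synchronized window loop: each round, every process reports its next event time, and all events before the global minimum plus the model's lookahead are safe to process in parallel. The following is a minimal illustration under that assumption; the names are invented for the example.

```python
class WindowLP:
    """Toy logical process holding a sorted list of pending event times."""

    def __init__(self, events):
        self.events = sorted(events)
        self.processed = []

    def next_event_time(self):
        return self.events[0] if self.events else float("inf")

    def process_events_before(self, t):
        while self.events and self.events[0] < t:
            self.processed.append(self.events.pop(0))


def synchronous_window_step(lps, lookahead):
    """One round of a barrier-synchronized conservative protocol (sketch).

    The global minimum next-event time plus the lookahead bounds a window
    of simulated time in which no process can affect another, so every
    process may safely execute its events in that window concurrently.
    """
    window_end = min(lp.next_event_time() for lp in lps) + lookahead
    for lp in lps:
        lp.process_events_before(window_end)  # would run in parallel
    return window_end
```

The per-round overhead is one global reduction (the minimum) plus a barrier, which is what the stochastic analysis amortizes over the growing volume of events per window on large problems.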
Article
Full-text available
The problem of system simulation is typically solved in a sequential manner due to the wide and intensive sharing of variables by all parts of the system. We propose a distributed solution where processes communicate only through messages with their neighbors; there are no shared variables and there is no central process for message routing or process scheduling. Deadlock is avoided in this system despite the absence of global control. Each process in the solution requires only a limited amount of memory. The correctness of a distributed system is proven by proving the correctness of each of its component processes and then using inductive arguments. The proposed solution has been empirically found to be efficient in preliminary studies. The paper presents formal, detailed proofs of correctness.
Article
Full-text available
SPECTRUM is a testbed for designing and evaluating parallel simulation protocols. The concept is based on the work reported in [Reyn88] in which it was shown that a broad range of possibilities exists for designing parallel simulation protocols. SPECTRUM is the first known testbed in which experimentation on a full range of protocols is supported in a common environment. We introduce the use of filters as a means for efficiently specifying protocols. We have implemented prototype versions of our testbed on an Intel iPSC/2 and a BBN GP-1000. INTRODUCTION The SPECTRUM testbed is meant to support experimentation with simulation protocol design variables. Our goal is to facilitate experimentation so that a designer may focus on protocols and performance rather than implementation details. We give a brief overview of the state of research into parallel simulation protocols and then discuss the SPECTRUM testbed. Parallel simulation, generally called distributed simulation in the literature...
Article
This paper presents an approach for speculative parallel execution of rendezvous-synchronized simulations. Rendezvous-synchronized simulation is based on the notions of processes and gates and on the rendezvous mechanism defined in the basic process algebra of LOTOS—a standard formal specification language for temporal ordering [2]. Time is introduced via a mechanism similar to the delay behaviour annotation provided by the TOPO toolset [4-6]. The algorithm allows speculative gate activations. This increases the available parallelism while ensuring correct execution of the computation. The model is used to describe closed stochastic queueing network simulations. Analysis of their execution results suggests that the model makes available a promising degree of parallelism.
Article
Generalized proximity detection for moving objects in a logically correct parallel discrete-event simulation is an interesting and fundamentally challenging problem. Determining who can see whom in a manner that is fully scalable in terms of CPU usage, number of messages, and memory requirements is highly non-trivial. A new scalable approach has been developed to solve this problem. This algorithm, called The Distribution List, has been designed and tested using the object-oriented Synchronous Parallel Environment for Emulation and Discrete-Event Simulation (SPEEDES) operating system. Preliminary results show that the Distribution List algorithm achieves excellent parallel performance.
Article
An approach to carrying out asynchronous, distributed simulation on multiprocessor message-passing architectures is presented. This scheme differs from other distributed simulation schemes because (1) the amount of memory required by all processors together is bounded and is no more than the amount required in sequential simulation and (2) the multiprocessor network is allowed to deadlock, the deadlock is detected, and then the deadlock is broken. Proofs for the correctness of this approach are outlined.
Article
Traditional discrete-event simulations employ an inherently sequential algorithm. In practice, simulations of large systems are limited by this sequentiality, because only a modest number of events can be simulated. Distributed discrete-event simulation (carried out on a network of processors with asynchronous message-communicating capabilities) is proposed as an alternative; it may provide better performance by partitioning the simulation among the component processors. The basic distributed simulation scheme, which uses time encoding, is described. Its major shortcoming is a possibility of deadlock. Several techniques for deadlock avoidance and deadlock detection are suggested. The focus of this work is on the theory of distributed discrete-event simulation.
Conference Paper
The authors assert that the major flaw with the ModSim/TWOS (Time Warp Operating System) system as it currently exists is that there is no compiler support for mapping a ModSim application into an efficient C/TWOS application. Moreover, the ModSim language as currently defined does not provide explicit hooks into the TWOS, and hence the developer is unable to tailor a ModSim application in the same fashion that a C application can be tailored. Without sufficient compiler support, there is a mismatch between ModSim's object-oriented, process-based execution model and the Time Warp execution model. The authors present their assessment of ModSim/TWOS and also discuss both components in isolation.
Article
Despite over a decade and a half of research and several successes, technologies to use parallel computers to speed up the execution of discrete event simulation programs have not had a significant impact in the general simulation community. Unless new inroads are made in reducing the effort and expertise required to develop efficient parallel simulation models, the field will continue to have limited application, and will remain a specialized technique used by only a handful of researchers. The future success, or failure, of the parallel discrete event simulation field hinges on the extent to which this problem can be addressed. Moreover, failure to meet this challenge will ultimately limit the effectiveness of discrete event simulation, in general, as a tool for analyzing and understanding large-scale systems. Basic underlying principles and techniques that are used in parallel discrete event simulation are briefly reviewed. Taking a retrospective look at the field, several successes and failures in utilizing this technology are discussed. It is noted that past research has not paid adequate attention to the problem of developing simulation models for efficient parallel execution, highlighting the need for future research to pay more attention to this problem. A variety of approaches to make parallel discrete event simulation an effective tool are discussed. INFORMS Journal on Computing, ISSN 1091-9856, was published as ORSA Journal on Computing from 1989 to 1995 under ISSN 0899-1499.
Article
This tutorial surveys the state of the art in executing discrete event simulation programs on a parallel computer. Specifically, we will focus attention on asynchronous simulation programs where few events occur at any single point in simulated time, necessitating the concurrent execution of events occurring at different points in time. We first describe the parallel discrete event simulation problem, and examine why it is so difficult. We review several simulation strategies that have been proposed, and discuss the underlying ideas on which they are based. We critique existing approaches in order to clarify their respective strengths and weaknesses.
Article
Parallel discrete event simulation (PDES), sometimes called distributed simulation, refers to the execution of a single discrete event simulation program on a parallel computer. PDES has attracted a considerable amount of interest in recent years. From a pragmatic standpoint, this interest arises from the fact that large simulations in engineering, computer science, economics, and military applications, to mention a few, consume enormous amounts of time on sequential machines. From an academic point of view, parallel simulation is interesting because it represents a problem domain that often contains substantial amounts of parallelism (e.g., see [59]), yet paradoxically, is surprisingly difficult to parallelize in practice. A sufficiently general solution to the PDES problem may lead to new insights in parallel computation as a whole. Historically, the irregular, data-dependent nature of PDES programs has identified it as an application where vectorization techniques using supercomputer hardware provide little benefit [14]. A discrete event simulation model assumes the system being simulated only changes state at discrete points in simulated time. The simulation model jumps from one state to another upon the occurrence of an event. For example, a simulator of a store-and-forward communication network might include state variables to indicate the length of message queues, the status of communication links (busy or idle), etc. Typical events might include arrival of a message at some node in the network, forwarding a message to another network node, component failures, etc. We are especially concerned with the simulation of asynchronous systems where events are not synchronized by a global clock, but rather, occur at irregular time intervals. 
For these systems, few simulator events occur at any single point in simulated time; therefore parallelization techniques based on lock-step execution using a global simulation clock perform poorly or require assumptions in the timing model that may compromise the fidelity of the simulation. Concurrent execution of events at different points in simulated time is required, but as we shall soon see, this introduces interesting synchronization problems that are at the heart of the PDES problem. This article deals with the execution of a simulation program on a parallel computer by decomposing the simulation application into a set of concurrently executing processes. For completeness, we conclude this section by mentioning other approaches to exploiting parallelism in simulation problems. Comfort and Shepard et al. have proposed using dedicated functional units to implement specific sequential simulation functions, (e.g., event list manipulation and random number generation [20, 23, 47]). This method can provide only a limited amount of speedup, however. Zhang, Zeigler, and Concepcion use the hierarchical decomposition of the simulation model to allow an event consisting of several subevents to be processed concurrently [21, 98]. A third alternative is to execute independent, sequential simulation programs on different processors [11, 39]. This replicated trials approach is useful if the simulation is largely stochastic and one is performing long simulation runs to reduce variance, or if one is attempting to simulate a specific simulation problem across a large number of different parameter settings. However, one drawback with this approach is that each processor must contain sufficient memory to hold the entire simulation. 
Furthermore, this approach is less suitable in a design environment where results of one experiment are used to determine the experiment that should be performed next because one must wait for a sequential execution to be completed before results are obtained.
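The discrete event model described above (a simulator that jumps from state to state as timestamped events are dequeued) reduces, in the sequential case, to a loop over a future event list. The following minimal sketch illustrates that loop for a store-and-forward style example; the handler names and event kinds are invented for the illustration.

```python
import heapq

def simulate(initial_events, handlers, until):
    """Minimal sequential discrete-event loop (sketch).

    Events are (time, kind, data) tuples kept in a min-heap ordered by
    timestamp (the future event list). Each handler may schedule new
    events; the loop runs until the list is empty or time runs out.
    """
    fel = list(initial_events)
    heapq.heapify(fel)
    state = {"log": []}                     # toy state: a trace of events
    while fel and fel[0][0] <= until:
        now, kind, data = heapq.heappop(fel)
        state["log"].append((now, kind))    # the "state jump" at this event
        for new_event in handlers[kind](now, data, state):
            heapq.heappush(fel, new_event)
    return state
```

A usage sketch for a message that arrives and is then forwarded after a fixed delay:

```python
def arrive(now, data, state):
    return [(now + 1.5, "forward", data)]   # schedule the forwarding event

def forward(now, data, state):
    return []                               # terminal event, schedules nothing

simulate([(0.0, "arrive", "msg")], {"arrive": arrive, "forward": forward}, 10.0)
```

It is exactly this single, globally ordered event list that the parallel approaches in these papers decompose across processors.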
Article
Perhaps the most critical problem in distributed simulation is that of mapping: without an effective mapping of workload to processors the speedup potential of parallel processing cannot be realized. Mapping a simulation onto a message-passing architecture is especially difficult when the computational workload dynamically changes as a function of time and space; this is exactly the situation faced by battlefield simulations. This paper studies an approach where the simulated battlefield domain is first partitioned into many regions of equal size; typically there are more regions than processors. The regions are then assigned to processors; a processor is responsible for performing all simulation activity associated with the regions. The assignment algorithm is quite simple and attempts to balance load by exploiting locality of workload intensity. The performance of this technique is studied on a simple battlefield simulation implemented on the Flex/32 multiprocessor. Measurements show that the proposed method achieves reasonable processor efficiencies. Furthermore, the method shows promise for use in dynamic remapping of the simulation.
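The region-to-processor assignment step described above can be illustrated with a simple greedy heuristic: repeatedly give the heaviest unassigned region to the currently least-loaded processor. This is a generic sketch, not the paper's algorithm (which additionally exploits locality of workload intensity), and the names are invented for the example.

```python
import heapq

def assign_regions(region_loads, num_procs):
    """Greedy mapping of many regions to fewer processors (sketch).

    region_loads: dict of region name -> estimated workload.
    Returns a dict of processor index -> list of assigned regions.
    """
    # Min-heap of (total_load, proc_index, assigned_regions).
    procs = [(0.0, p, []) for p in range(num_procs)]
    heapq.heapify(procs)
    # Assign heaviest regions first for a tighter balance.
    for region, load in sorted(region_loads.items(), key=lambda kv: -kv[1]):
        total, p, regions = heapq.heappop(procs)
        regions.append(region)
        heapq.heappush(procs, (total + load, p, regions))
    return {p: regions for _, p, regions in procs}
```

With more regions than processors, as in the paper's setup, a mapping like this can be recomputed periodically from measured loads, which is the basis for dynamic remapping.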
Conference Paper
Can parallel simulations efficiently exploit a network of workstations? Why haven't PDES models followed standard modeling methodologies? Will the field of PDES survive, and if so, in what form? Researchers in the PDES field have addressed these questions and others in a series of papers published in the last few years. The purpose of this paper is to shed light on these questions by documenting an actual case study of the development of an optimistically synchronized PDES application on a network of workstations. This paper is unique in that its focus is not necessarily on performance, but on the whole process of developing a model, from the physical system being simulated, through its conceptual design, validation, implementation, and, of course, its performance. This paper also presents the first reported performance results indicating the impact of risk on performance. The results suggest that the optimal value of risk is sensitive to the latency parameters of the communications network.
AFIT Guide to SPECTRUM. Department of Electrical and Computer Engineering, School of Engineering, Air Force Institute of Technology, Wright-Patterson AFB, OH.
  • T C Hartrum