Content uploaded by Khaled Elleithy
Author content
All content in this area was uploaded by Khaled Elleithy
Content may be subject to copyright.
Reducing Null Messages Using Grouping and Status Retrieval for a Conservative
Discrete-Event Simulation System
Bevin Thomas, Syed S. Rizvi, Khaled M. Elleithy
Department of Computer Science, University of Bridgeport, Bridgeport, CT 06604, USA
{srizvi, bthomas, elleithy}@bridgeport.edu
Tel: 92 (111) 002-004
Fax:92 (021) 509-0968
Keywords: Discrete event simulation, conservative
algorithms, null message algorithm, parallel systems.
Abstract
In this paper we investigate the Chandy-Misra-Bryant null
message algorithm and propose a grouping technique to
improve its performance. Combined with status retrieval, which
is explained in detail below, this technique can improve
performance compared to the traditional conservative algorithm
of Chandy, Misra, and Bryant. The null message algorithm is
an efficient conservative algorithm that uses null messages to
synchronize logical processes (LPs) in a parallel discrete event
simulation (PDES) system. Performance can degrade if LPs
generate a large number of null messages to avoid deadlock.
The main objective of this research is to propose a new
grouping technique that reduces the null messages exchanged
between logical processes. Since the performance of the null
message algorithm depends mainly on the Lookahead (L) value,
our proposed technique can be used to determine an optimum
value of the Lookahead.
1. INTRODUCTION
Parallel and distributed simulation refers to technologies that
enable a simulation program to execute on multiple processors
connected by a network. Parallel simulations execute on
multiple processors or multiple computers confined to a single
machine room, while distributed simulations execute on
computers that are geographically distributed. Time
management is required to ensure that the execution of a
distributed simulation is properly synchronized, that is, that
events are processed in the correct order. Time management
algorithms assume that logical processes (LPs) communicate by
exchanging time-stamped messages or events. The criterion is
that each LP processes its events in time stamp order.
There are two kinds of parallel simulation: optimistic and
conservative. Optimistic simulation allows processors to
simulate events independently, assuming they are temporally
correct. When a temporal discrepancy is discovered, the
simulation is “rolled back” to the time of the discrepancy and
then proceeds again. Conservative simulation never allows
discrepancies: an event is processed only when it can be
guaranteed that the event will not be altered. The principal
task of conservative simulation is to determine when it is
“safe” to process an event. An event is said to be “safe” if its
time stamp is less than the Lower Bound on the Time Stamp
(LBTS).
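The safety test can be illustrated with a minimal sketch. It assumes that an LP takes its LBTS to be the smallest time stamp bound over its incoming channels; the function names `lbts` and `safe_events` and the sample values are illustrative and do not come from the paper.

```python
def lbts(channel_clocks):
    """Lower bound on the time stamp of any future incoming message.

    Assumption: each incoming channel keeps a clock equal to the time
    stamp of the last message (or null message) received on it.
    """
    return min(channel_clocks.values())


def safe_events(pending, channel_clocks):
    """Return the pending (time stamp, name) events below the LBTS, in order."""
    bound = lbts(channel_clocks)
    return sorted(event for event in pending if event[0] < bound)


clocks = {"LP_a": 12.0, "LP_b": 7.5}
events = [(9.0, "arrive"), (3.0, "depart"), (7.0, "arrive")]
print(safe_events(events, clocks))  # [(3.0, 'depart'), (7.0, 'arrive')]
```

The event at time 9.0 is held back because LP_b may still send a message with a time stamp between 7.5 and 9.0.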
The algorithms developed by Chandy, Misra, and Bryant were
the first synchronization algorithms to be developed. Each LP
sends messages with non-decreasing time stamps, and messages
are received in the same order in which they were sent. Each
process executes the event with the smallest time stamp from
its input queues. If any of the queues is empty, the process
is blocked until that queue is no longer empty. This approach is
prone to deadlocks, and therefore Chandy, Misra, and Bryant
suggested the Null Message Algorithm (NMA). Null messages
are used to avoid deadlocks. A null message with time stamp
T_null sent from LP_a to LP_b is a promise by LP_a that it will
not later send a message to LP_b carrying a time stamp smaller
than T_null. The null message algorithm introduced a key
property, the Lookahead. In simple words, if an LP is at
simulation time T and it can guarantee that any message it will
send in the future will have a time stamp of at least T + L,
regardless of what messages it may later receive, then the LP is
said to have a Lookahead of L.
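The following sketch illustrates the two rules described above: an LP blocks while any input queue is empty, and a null message carries the promise T + L. The `Message` class, the helper names, and the sample values are assumptions made for illustration, not part of the original algorithm description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    timestamp: float
    payload: object = None  # in this sketch, None marks a null message


def make_null_message(current_time, lookahead):
    """Promise that no later message will carry a time stamp below T + L."""
    return Message(timestamp=current_time + lookahead)


def next_queue(queues):
    """Name of the input queue whose head has the smallest time stamp.

    Returns None when any queue is empty, meaning the LP must block.
    """
    if any(len(q) == 0 for q in queues.values()):
        return None
    return min(queues, key=lambda name: queues[name][0].timestamp)


queues = {"LP_a": [Message(5.0, "job")], "LP_b": []}
print(next_queue(queues))  # None, because the queue from LP_b is empty

# LP_b is at time 4.0 with Lookahead 2.0, so it promises nothing before 6.0.
queues["LP_b"].append(make_null_message(current_time=4.0, lookahead=2.0))
print(next_queue(queues))  # LP_a
```

Once the null message arrives, the receiving LP knows that the event at time 5.0 is safe and can proceed instead of waiting.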
1.1 Problem Identification
The null message algorithm (NMA) resolves the problem of
deadlocks by sending null messages between neighboring LPs.
The drawback of NMA is that performance can degrade
drastically if a large number of null messages are sent across
the network between LPs. The number of null messages depends
on the Lookahead value (L): if the Lookahead value is very
small, many null messages are sent. The main objective is to
calculate an optimum Lookahead value so that the number of null
messages is reduced and performance therefore improves.
1-4244-1457-1/08/$25.00 © IEEE
This full text paper was peer reviewed at the direction of IEEE Communications Society subject matter experts for publication in the SpringSim 2009 proceeding.
2. RELATED WORK
A few studies have addressed NMA in terms of reducing the
number of null messages and calculating the optimum
Lookahead value. Ronald C. De Vries in [4] reduced the
number of null messages through prediction of channel
lines. A framework is presented on which distributed
discrete event simulation can be built for applications that
can be decomposed into feed-forward and feedback networks.
Another notable work is [2], in which Syed S. Rizvi,
K. M. Elleithy, and Aasia Riasat proposed a mathematical
model that can be used to approximate the optimal values of
some critical parameters, such as the frequency of
transmission, the Lookahead (L) values, and the variance of
null message elimination.
According to B. R. Preiss, W. M. Loucks, J. D. MacIntyre, and
J. A. Field in [3], null message cancellation can improve
performance by a large factor. Null message cancellation is an
algorithmic modification to the basic conservative
synchronization scheme wherein a null message is discarded
before receipt when it is overtaken by a message with a larger
time stamp.
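A minimal sketch of the cancellation idea follows. It assumes a FIFO channel whose undelivered null messages are dropped as soon as a message with a larger time stamp is sent on the same channel; the `Channel` class is hypothetical and is not the implementation described in [3].

```python
from collections import deque


class Channel:
    """FIFO link that drops pending null messages overtaken by newer messages."""

    def __init__(self):
        # Entries are (time stamp, payload); a payload of None marks a null message.
        self._buffer = deque()

    def send(self, timestamp, payload=None):
        # Discard undelivered null messages at the tail that the new message supersedes.
        while (self._buffer
               and self._buffer[-1][1] is None
               and self._buffer[-1][0] < timestamp):
            self._buffer.pop()
        self._buffer.append((timestamp, payload))

    def receive(self):
        """Deliver the oldest buffered message, or None if the channel is empty."""
        return self._buffer.popleft() if self._buffer else None

    def __len__(self):
        return len(self._buffer)


channel = Channel()
channel.send(3.0)          # null message
channel.send(5.0)          # null message, cancels the one at 3.0
channel.send(6.0, "job")   # real message, cancels the null message at 5.0
print(len(channel))        # 1
print(channel.receive())   # (6.0, 'job')
```

The receiver handles one message instead of three, and the promise carried by the surviving message is at least as strong as the promises that were cancelled.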
3. METHOD OF GROUPING AND STATUS RETRIEVAL
In this paper, the approach to reducing the number of null
messages is to group logical processes, where each group may
consist of n LPs. The number of LPs in each group depends
either on their similarities or on an optimum group size, which
should be calculated. Each group is controlled by a controller.
The role of the controller is to synchronize the LPs within its
group and to send synchronization messages across groups to
the other controllers. Synchronization messages are sent after a
fixed interval of time. Whenever a controller receives a message
from a neighboring linked controller, it broadcasts the
message within its group.
The controller has to be directly connected to the LPs of its
group. All LPs are connected to their neighboring LPs using a
mesh topology. However, an LP can send synchronization
messages only to its controller. A synchronization message from
an LP to its controller indicates that the sending LP has
finished its assigned tasks. Upon receiving a synchronization
message from one of its LPs, the controller broadcasts it inside
the group. Controllers send synchronization messages after a
fixed interval of time.
Fig. 1 represents the implementation of the proposed
algorithm. The upper dotted box of Fig. 1 represents the
behavior of a controller when it receives a message from a
grouped LP. The lower dotted box of Fig. 1 represents how the
controller deals with a message originating from an ungrouped
LP.
[Fig. 1 flowchart boxes: Start; Pool of LPs; Grouped LPs; Non-Grouped LPs; Controller Receives a Message from LP; Reorder LP and Time Stamp; Message Broadcast to LPs; Message Transmitted to Controller LPs; Select Smallest Time Stamp Message]
Fig. 1. An illustration of the proposed algorithm. The upper dotted box represents the implementation of the first 4 steps. The lower dotted
box represents steps 5 to 8 of the proposed algorithm. The last 2 steps of the proposed algorithm are used in both the upper and lower boxes.
In Fig. 2, nine LPs are divided into three groups of three LPs
each. Each group also includes a controller, which is directly
connected to each LP in the group.
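The grouping of Fig. 2 can be written down as a small data structure. This is only a sketch: the function `build_group` and the controller names are hypothetical, and the links between controllers are left out because the text does not specify them.

```python
from itertools import combinations


def build_group(group_name, lp_names):
    """Describe one group: a full mesh among its LPs plus one controller link per LP."""
    controller = f"Controller-{group_name}"
    return {
        "controller": controller,
        # Mesh topology: every pair of LPs in the group shares a link.
        "mesh_links": list(combinations(lp_names, 2)),
        # Synchronization messages travel only on these LP-to-controller links.
        "sync_links": [(lp, controller) for lp in lp_names],
    }


# Three groups of three LPs each, as in Fig. 2.
topology = {g: build_group(g, ["LP1", "LP2", "LP3"]) for g in ("G1", "G2", "G3")}
print(len(topology["G1"]["mesh_links"]))  # 3
print(topology["G2"]["sync_links"][0])    # ('LP1', 'Controller-G2')
```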
3.1 Proposed Algorithm
1. While (simulation is not over)
2.   Controller receives a message from an LP inside the group
3.   Record the LP and the time stamp
4.   Broadcast the message to the other LPs
5.   If a message is received from the controller of another group
6.     Broadcast the message to the LPs within the group
7.     Send null messages to neighboring controllers
8.     with the smallest time stamp, indicating a lower bound on
       future messages sent from that group,
9.     approximately (T + L)
10.  End if
11. End loop
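The numbered steps can be sketched in Python as follows. This is one possible reading of the pseudocode, not the authors' implementation: the `Controller` class and its method names are hypothetical, T is assumed to be the smallest time stamp recorded for the group, and outgoing messages are only logged rather than delivered.

```python
class Controller:
    """Hypothetical group controller following the numbered steps above."""

    def __init__(self, name, lps, lookahead):
        self.name = name
        self.lookahead = lookahead
        self.latest = {lp: 0.0 for lp in lps}  # status table: last time stamp per LP
        self.neighbors = []                    # controllers of other groups
        self.sent = []                         # log of (kind, receiver, time stamp)

    def on_lp_message(self, sender, timestamp):
        # Steps 2-4: record the sender's time stamp, then broadcast inside the group.
        self.latest[sender] = timestamp
        for lp in self.latest:
            if lp != sender:
                self.sent.append(("sync", lp, timestamp))

    def on_controller_message(self, timestamp):
        # Steps 5-6: relay a message from another group's controller to every local LP.
        for lp in self.latest:
            self.sent.append(("sync", lp, timestamp))
        # Steps 7-9: promise the neighbors a lower bound of T + L, where T is
        # assumed to be the smallest time stamp recorded for this group.
        bound = min(self.latest.values()) + self.lookahead
        for other in self.neighbors:
            self.sent.append(("null", other.name, bound))
        return bound


g1 = Controller("C1", ["LP1", "LP2", "LP3"], lookahead=2.0)
g2 = Controller("C2", ["LP4", "LP5", "LP6"], lookahead=2.0)
g1.neighbors.append(g2)
g1.on_lp_message("LP1", 4.0)
g1.on_lp_message("LP2", 6.0)
g1.on_lp_message("LP3", 5.0)
print(g1.on_controller_message(3.0))  # 6.0, i.e. min(4.0, 6.0, 5.0) + 2.0
```

Keeping the latest time stamp of every LP in the controller is what allows a single null message per neighboring controller to stand in for the null messages that the individual LPs would otherwise exchange across groups.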
Fig. 3 shows the implementation of the proposed algorithm
with eight LPs. If all LPs were interconnected, the total
number of null messages sent for a single deadlock would be
32. If the Lookahead is a small value, the total number of null
messages transmitted between the LPs can be approximated as
follows:
Total Null Messages = (32 * n), where n is the iteration number.
[Fig. 2 diagram: Group-1 (G1), Group-2 (G2), and Group-3 (G3), each containing LP 1, LP 2, LP 3, and a Controller LP]
Fig. 2. Topological map of 3 groups with nine LPs and three master controllers. Within a group, LPs are connected via mesh topology.
LPs can communicate with each other within a group, but they can only send synch-messages to their respective controller.
It should be noted in Fig. 3 that the total number of null
messages is reduced by the proposed algorithm. Node A sends a
sync message to controller A. On receiving it, the controller
sends a message inside the group on every link except the one
on which the message arrived, and also keeps track of the
latest time stamp of each LP. Thus, in our example, it sends 9
messages within the group. The total number of messages
exchanged across all the LPs is:
Total number of null messages = 9 x 2 (inside the group) + 2
(between controllers) + 3 x 2 (broadcast inside group) = 26.
The main point to note is that only 2 messages were sent
outside the group. This brief analysis emphasizes the
significance of the proposed methodology for efficiently
grouping the LPs with their respective controllers and
connecting them using the mesh topology.
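The two message counts in this section can be checked with a few lines of Python. The function and parameter names are illustrative, and reading the factor 2 in "9 x 2" and "3 x 2" as the number of groups is an assumption.

```python
def baseline_null_messages(iterations, per_deadlock=32):
    """Total Null Messages = 32 * n for the ungrouped eight-LP example."""
    return per_deadlock * iterations


def grouped_null_messages(inside_group=9, groups=2, between_controllers=2,
                          broadcast_per_group=3):
    """9 x 2 (inside the group) + 2 (between controllers) + 3 x 2 (broadcast)."""
    return (inside_group * groups
            + between_controllers
            + broadcast_per_group * groups)


print(baseline_null_messages(1))  # 32
print(grouped_null_messages())    # 26
```

For a single deadlock, the grouped arrangement therefore exchanges 26 messages where the baseline exchanges 32, and only 2 of the 26 cross group boundaries.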
4. CONCLUSION
In this paper, we presented a new technique for grouping
LPs, combined with a status retrieval method. To support the
proposed technique, we presented an algorithm for the
modified NMA. In addition, we discussed the implementation of
the proposed technique with the help of schematics. We
believe that the proposed method of grouping LPs can reduce
the transmission of null messages to a reasonable extent.
Although this technique is expensive, since a controller has to
be set up for each group, it serves the purpose of reducing the
null messages.
REFERENCES
[1] K. M. Chandy and J. Misra, “Distributed Simulation: A Case
Study in Design and Verification of Distributed Programs,” IEEE
Transactions on Software Engineering, vol. SE-5, no. 5,
pp. 440-452, 1979.
[2] Syed S. Rizvi, K. M. Elleithy, and Aasia Riasat, “Minimizing
the Null Message Exchange in Conservative Distributed
Simulation,” International Joint Conferences on Computer,
Information, and Systems Sciences, and Engineering, CISSE
2006, pp. 443-448, Bridgeport, CT, December 4-14, 2006.
[3] B. R. Preiss, W. M. Loucks, J. D. MacIntyre, and J. A. Field,
“Null Message Cancellation in Conservative Distributed
Simulation,” Distributed Simulation 91: Proceedings of the SCS
Multiconference on Advances in Parallel and Distributed
Simulation, 1991.
[4] Ronald C. De Vries, “Reducing Null Messages in Misra’s
Distributed Discrete Event Simulation Method.”
Fig. 3. Implementation of 8 LPs with the mesh topology within the group. The figure shows a different construction of grouped LPs
connected via a unicast connection between B and H.