Optimistic Synchronization-Based State-Space Reduction

Scott D. Stoller*¹ and Ernie Cohen**²

¹ State University of New York at Stony Brook
² Microsoft Research, Cambridge, UK
Abstract. Reductions that aggregate fine-grained transitions into coarser transitions can significantly reduce the cost of automated verification, by reducing the size of the state space. We propose a reduction that can exploit common synchronization disciplines, such as the use of mutual exclusion for accesses to shared data structures. Exploiting them using traditional reduction theorems requires checking that the discipline is followed in the original (i.e., unreduced) system. That check can be prohibitively expensive. This paper presents a reduction that instead requires checking whether the discipline is followed in the reduced system. This check may be much cheaper, because the reachable state space is smaller.
1 Introduction
For many concurrent software systems, a straightforward model of the system has such a large and complicated state space that automated verification, by automated theorem proving or state-space exploration (model checking), is infeasible. Reduction is an important technique for reducing the size of the state space by aggregating transitions into coarser-grained transitions.
When exploring the state space of a concurrent system, context switches between threads are typically allowed before each transition. A simple example of a reduction for concurrent systems is to inhibit context switches within sequences of transitions that access only unshared variables. This effectively increases the granularity of transitions. Thus, one can regard this and similar reductions as defining a reduced system, which is a coarser-grained version of the original system. The reduced system may have dramatically fewer states than the original system. A reduction theorem asserts that certain properties are preserved by the transformation.
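The effect of this simple reduction can be sketched with a toy explicit-state explorer. The two-thread program below, its variables, and the shared/local tagging are our own illustrative assumptions, not from the paper; the point is only that suppressing context switches inside runs of unshared-variable transitions shrinks the reachable state count.

```python
# Toy illustration (our own, not the paper's): explore a 2-thread program
# with and without the simple reduction that inhibits context switches
# within sequences of transitions touching only unshared variables.

def program(tid):
    # Each transition is (touches_shared, update); the updates are
    # hypothetical straight-line code for thread `tid`.
    return [
        (False, lambda s, t=tid: {**s, f"local{t}": s[f"local{t}"] + 1}),
        (False, lambda s, t=tid: {**s, f"local{t}": s[f"local{t}"] * 2}),
        (True,  lambda s, t=tid: {**s, "shared": s["shared"] + s[f"local{t}"]}),
    ]

THREADS = [program(0), program(1)]

def step_thread(vars_, prog, pc, reduced):
    """Execute one transition; in the reduced system, keep executing the
    same thread while its following transitions are local-only."""
    _, update = prog[pc]
    vars_, pc = update(vars_), pc + 1
    while reduced and pc < len(prog) and not prog[pc][0]:
        vars_, pc = prog[pc][1](vars_), pc + 1
    return vars_, pc

def explore(reduced):
    init_vars = {"local0": 0, "local1": 0, "shared": 0}
    init = (frozenset(init_vars.items()), (0, 0))
    seen, stack = {init}, [init]
    while stack:
        fvars, pcs = stack.pop()
        for tid, prog in enumerate(THREADS):
            if pcs[tid] < len(prog):
                vars2, pc2 = step_thread(dict(fvars), prog, pcs[tid], reduced)
                succ = (frozenset(vars2.items()),
                        pcs[:tid] + (pc2,) + pcs[tid + 1:])
                if succ not in seen:
                    seen.add(succ)
                    stack.append(succ)
    return len(seen)

print(explore(reduced=False), explore(reduced=True))  # reduced explores fewer states
```

In this toy instance the full system reaches 16 states while the reduced system reaches 9; for realistic programs with long runs of local computation the gap is far larger.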
We consider a more powerful reduction that exploits common synchronization disciplines. For example, in a system that uses mutual exclusion on accesses to some shared variables—called protected variables—our reduction inhibits context switches within sequences of transitions that access only unshared variables and protected variables. The model-checking experiments reported in [Sto02] are based on a similar reduction, which decreased memory usage (which is proportional to the number of states) by a factor of 25 or more. Such reductions can also significantly decrease the computational cost of the automated theorem proving needed for thread-modular verification [FQS02].

* The author gratefully acknowledges the support of NSF under Grant CCR-9876058 and the support of ONR under Grants N00014-01-1-0109 and N00014-02-1-0363. Address: Computer Science Dept., SUNY at Stony Brook, Stony Brook, NY 11794-4400. Email: stoller@cs.sunysb.edu  Web: http://www.cs.sunysb.edu/~stoller/
** Email: ernie.cohen@acm.org
Traditional reduction theorems, such as [Lip75,CL98,Coh00], can also exploit such synchronization disciplines. However, a hypothesis of these traditional theorems is that the allegedly protected variables are indeed protected (by synchronization that enforces mutual exclusion) in the original (i.e., unreduced) system. How can we establish this? Static analyses like [BR01,FF01] can automatically provide a conservative approximation but sometimes return "don't know". For general finite-state systems, it might seem that the only way to automatically obtain exact information about whether selected variables are actually protected is to express this condition as a history property and check it by state-space exploration of the original system. If this were the case, then the reduction would be almost pointless.

Our reduction theorem implies that one can determine exactly during state-space exploration of the reduced system whether the synchronization discipline is followed in the original system.
Our reduction theorem is designed to be used together with traditional reduction theorems. Suppose a traditional reduction theorem asserts that some property φ is preserved by the reduction if the original system follows the synchronization discipline. After checking that the reduced system follows the discipline and satisfies φ, one can use our reduction theorem to conclude that the original system follows the discipline, and then use the traditional reduction theorem to conclude that the original system satisfies φ.
The reduction in [Sto02] is similar in spirit to the one in this paper. The main contributions of this paper relative to [Sto02] are: (1) a reduction that applies to systems that use arbitrary synchronization mechanisms to achieve mutual exclusion (the results in [Sto02] apply only when monitors are used); (2) separation of a general reduction theorem that justifies checking hypotheses of traditional reduction theorems in the reduced system from the application of this technique to mutual-exclusion synchronization disciplines; (3) allowing nondeterminism in invisible transitions (in the notation of Section 3, [Sto02] requires that u be deterministic); (4) significantly shorter and cleaner proofs, based on ω-algebra. The first author initially tried to prove similar results in a transition-system framework, like the one in [God96]; that should be possible, but our experience suggests that the algebraic framework facilitates the task.

Operations on monitors are not analyzed specially in this paper. As a result, for systems that mainly use monitors for synchronization, this reduction is not as effective as the one in [Sto02]. It should be possible to integrate the specialized treatment of monitor operations in [Sto02] into this paper's broader framework.
Our method and traditional partial-order methods (e.g., ample sets [CGP99], stubborn sets [Val97], and persistent sets [God96]) both exploit independence (commutativity) of transitions, but our method can establish independence of transitions—and hence achieve a reduction—in many cases where traditional partial-order methods cannot. Traditional partial-order methods, as implemented in tools such as Spin [Hol97] and VeriSoft [God97], use two kinds of information to determine independence of transitions: program-specific information about which processes may perform which operations on which objects (e.g., only process P2 sends messages on channel C1), and manually supplied program-independent information about dependencies between operations on selected datatypes (e.g., a send operation on a full channel is disabled until a receive operation is performed on that channel). Our method also exploits more complicated program-specific information to determine independence of transitions, e.g., the invariant that a particular variable is always protected by particular synchronization constructs.

Traditional partial-order methods rely on static analysis to conservatively determine dependencies between transitions. As a result, those methods are less effective for programs that contain references (or pointers) and arrays, because static analysis cannot in general determine exactly which locations are accessed by each transition, and the static analysis of dependencies between transitions is correspondingly imprecise. Our method does not rely on conservative static analysis of dependencies and has no difficulty with references, etc.
2 Omega Algebra
An omega algebra is an algebraic structure over the operators (in order of increasing precedence) 0 (nullary), 1 (nullary), + (binary infix), · (binary infix, usually written as simple juxtaposition), ⋆ (binary infix, same precedence as ·), * (unary suffix), and ω (unary suffix), satisfying the following axioms³:

(x + y) + z = x + (y + z)        x ≤ y ⇔ x + y = y
x + y = y + x                    x* = 1 + x + x* x*
x + x = x                        x y ≤ x ⇒ x y* = x        (* ind)
0 + x = x                        x y ≤ y ⇒ x* y = y        (* ind)
x (y z) = (x y) z                x ⋆ y = x^ω + x* y
0 x = x 0 = 0                    x^ω = x x^ω
1 x = x 1 = x                    x ≤ y x + z ⇒ x ≤ y ⋆ z   (⋆ ind)
x (y + z) = x y + x z
(x + y) z = x z + y z

In parsing formulas, · and ⋆ associate to the right; e.g., u v ⋆ x ⋆ y parses to (u · (v^ω + v* · (x^ω + x* · y))). In proofs, we use the hint "(dist)" to indicate application of the distributivity laws, and the hint "(hyp)" to indicate the use of hypotheses. If x_i is a finite collection of terms, we write (+i : x_i) and (·i : x_i) for the sum and product, respectively, of these terms.

These axioms are sound and complete for the usual equational theory of omega-regular expressions. (Completeness holds only for standard terms, where the first arguments to ·, ω, and ⋆ are regular.) Thus, we make free use, without proof, of familiar equations from the theory of (omega-)regular languages (e.g., x* x* = x*).

³ The axioms are equivalent to Kozen's axioms for Kleene algebra [Koz94], plus the three axioms for omega terms.
y is a complement of x iff x y = 0 = y x and x + y = 1. It is easy to show that complements (when they exist) are unique and that complementation is an involution; a predicate is an element of the algebra with a complement. In this paper, p and q range over predicates, with complements p̄ and q̄. It is easy to show that the predicates form a Boolean algebra, with + as disjunction, · as conjunction, 0 as false, 1 as true, complementation as negation, and ≤ as implication. Common properties of Boolean algebras (e.g., p q = q p) are used silently in proofs, as is the fact x p̄ y = 0 ⇒ x y = x p y.

The omega algebra axioms support several interesting programming models, where (intuitively) 0 is magic⁴, 1 is skip, + is chaotic nondeterministic choice, · is sequential composition, ≤ is refinement, x* is executed by executing x any finite number of times, and x^ω is executed by executing x an infinite number of times. The results of this paper are largely motivated by the relational model, where terms denote binary relations over a state space, 0 is the empty relation, 1 is the identity relation, · is relational composition, + is union, * is reflexive-transitive closure, ≤ is subset, and x^ω relates an input state s to an output state if there is an infinite sequence of states starting with s, with consecutive states related by x. (Thus, x^ω relates an input state to either all states or none, and x^ω = 0 iff x is well-founded.) Predicates are identified with the set of states in their domain (i.e., the states from which they can be executed). We define ⊤ = 1^ω; it is easy to see that ⊤ is the maximal element under ≤, and in the relational model, it relates all pairs of states.

In addition to equational identities of regular languages, we will use the following two standard theorems (proofs of these theorems and more sophisticated theorems of this type appear in [Coh00]):

x y ≤ y z ⇒ x* y ≤ y z*          (1)
y x ≤ x y ⇒ (x + y)* ≤ x* y*     (2)
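The relational model can be made concrete in a few lines of code. The following sketch (our own illustration; the paper itself works axiomatically) represents terms as binary relations over a small finite state set, computing x^ω from the observation that, over a finite state space, an infinite x-path starts at s iff s can reach a state on an x-cycle.

```python
# Relational model of omega algebra over a finite state space (a sketch).
from itertools import product

STATES = range(3)
ZERO = frozenset()                                  # 0: empty relation
ONE = frozenset((s, s) for s in STATES)             # 1: identity relation
TOP = frozenset(product(STATES, STATES))            # top = 1^omega: all pairs

def compose(x, y):                                  # · : relational composition
    return frozenset((a, c) for a, b in x for b2, c in y if b == b2)

def star(x):                                        # * : reflexive-transitive closure
    r = ONE
    while True:
        r2 = ONE | compose(x, r)
        if r2 == r:
            return r
        r = r2

def omega(x):                                       # x^omega: s related to everything
    # iff an infinite x-path (i.e., a reachable x-cycle) starts at s
    on_cycle = {s for s in STATES if (s, s) in compose(x, star(x))}
    starts = {a for a in STATES
              if any((a, c) in star(x) for c in on_cycle)}
    return frozenset(product(starts, STATES))

def bind(x, y):                                     # the binary operator: x^omega + x* y
    return omega(x) | compose(star(x), y)
```

For instance, an acyclic relation such as {(0,1), (1,2)} is well-founded, so its ω is 0 and x ⋆ y collapses to x* y, matching the discussion above.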
3 A Reduction Theorem
We consider systems composed of a fixed, finite set of concurrent processes (each perhaps internally concurrent and nondeterministic). Variables i and j range over process indices. Each process i has a visible action v_i and an invisible action u_i⁵, where the invisible action is constrained neither to receive information from other processes nor to send information to other processes so as to create a race condition in the recipient. This constraint is guaranteed only so long as some global synchronization policy is followed. For example, in a system where processes are synchronized using locks, either visible or invisible actions of process i might modify variables that are either local to process i or protected by locks held by process i, or send asynchronous messages to other processes; but only visible actions can acquire locks or wait for a condition to hold. Note that violation of the synchronization discipline (e.g., an action accessing a shared variable without first obtaining an appropriate lock) might cause a race condition between an invisible action and the actions of another process, violating the constraint on invisible actions.

⁴ magic is the program that has no possible executions (and so satisfies every possible specification). Of course, it cannot be implemented.
⁵ Note that u_i and v_i can be sums of nondeterministic actions that correspond to individual transitions of process i.
To avoid introducing temporal operators, we introduce a Boolean history variable q that records whether the synchronization discipline has been violated at some point in the execution. Predicate p_i means that process i cannot perform an invisible action, i.e., that u_i is disabled. Let p be the conjunction of the p_i's, i.e., p = (·i : p_i). A state satisfying p is called visible; thus, in a visible state, all invisible transitions are disabled.
We now define several actions, formalized in the definitions (3)–(9) below. An M_i action consists of a visible action of process i followed by a sequence of invisible actions of process i. An N_i action is an M_i action that is "maximal" (i.e., further u_i actions are disabled) and that finishes in a state where the synchronization discipline has not been violated. N_i is effectively the transition relation of thread i in the reduced system. (Additional conditions will imply that executing an N action in a visible state results in a visible state; thus, in the reduced system, context switches occur only in visible states.) A u (respectively v, M, N) action is a u_i (respectively, v_i, M_i, N_i) action of some process i. Finally, an R action is executable iff (i) the discipline has been violated, or (ii) such a violation is possible after execution of a single M action. (Like x^ω, R relates each initial state to either all final states or none.)

M_i = v_i u_i*           (3)
N_i = M_i p_i q̄          (4)
u = (+i : u_i)           (5)
v = (+i : v_i)           (6)
M = (+i : M_i)           (7)
N = (+i : N_i)           (8)
R = (1 + M) q ⊤          (9)

Our reduction theorem says that if the original system can reach a violation of the synchronization discipline starting from some visible state, then the reduced system can also reach a violation starting from the same initial state, except that the violation might occur partway through the last transition of the reduced system (i.e., the last transition might be an M action rather than an N action). The transition relations of the original and reduced systems are u + v and N, respectively. Thus, the conclusion of the reduction theorem, (19), is

p (u + v)* q ≤ N* R      (10)
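Definitions (3)–(9) have a direct reading in the relational model. The sketch below (our own construction; the process, its relations, and the predicate choices are hypothetical) builds M_i, N_i, and R from given u_i, v_i, p_i, and q, with predicates represented as subrelations of the identity.

```python
# Definitions (3)-(9) read in the relational model (a sketch).
STATES = range(4)
ID = frozenset((s, s) for s in STATES)
TOP = frozenset((a, b) for a in STATES for b in STATES)

def compose(x, y):
    return frozenset((a, c) for a, b in x for b2, c in y if b == b2)

def star(x):
    r = ID
    while True:
        r2 = ID | compose(x, r)
        if r2 == r:
            return r
        r = r2

def pred(states):                     # a predicate is a subrelation of 1
    return frozenset((s, s) for s in states)

# Hypothetical single process 0: visible action v0 moves 0 -> 1,
# invisible action u0 moves 1 -> 2 (so u0 is enabled only in state 1).
u = {0: frozenset({(1, 2)})}
v = {0: frozenset({(0, 1)})}
p = {0: pred({0, 2, 3})}              # p_0: states where u_0 is disabled
q = pred({3})                         # discipline violated in state 3
q_bar = pred({0, 1, 2})

M = {i: compose(v[i], star(u[i])) for i in u}             # (3): M_i = v_i u_i*
N = {i: compose(compose(M[i], p[i]), q_bar) for i in u}   # (4): N_i = M_i p_i q-bar
M_all = frozenset().union(*M.values())                    # (7)
R = compose(compose(ID | M_all, q), TOP)                  # (9): R = (1 + M) q top
```

Here M_0 contains both the partial big step (0,1) and the completed one (0,2), while N_0 keeps only the maximal, violation-free step (0,2), which is exactly the reduced-system transition.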
The hypotheses of our reduction theorem are as follows, formalized in formulas (11)–(18) below. It is impossible to execute invisible actions of a single process forever without violating the discipline (11). An action cannot enable or disable an invisible action of another process (12),(13), and in the absence of a discipline violation, it commutes to the right of such an action (14),(15). Visible and invisible actions of a process cannot be simultaneously enabled (16). u_i is enabled whenever p_i is false (17). Invisible actions cannot hide violations of the discipline (18).

(u_i q̄)^ω = 0                                  (11)
i ≠ j ⇒ u_j p_i = p_i u_j                      (12)
i ≠ j ⇒ v_j p_i = p_i v_j                      (13)
i ≠ j ⇒ u_j u_i ≤ u_i (q ⊤ + u_j + u_j q ⊤)    (14)
i ≠ j ⇒ v_j u_i ≤ u_i (q ⊤ + v_j + v_j q ⊤)    (15)
p̄_i v_i = 0 = p_i u_i                          (16)
1 ≤ p_i + u_i ⊤                                (17)
q u_i ≤ u_i q                                  (18)

Our reduction theorem can be used to check not only the synchronization discipline, but also the invariance of any other predicate I such that violations of I cannot be hidden by invisible actions. To see this, note that, except for (18), the conditions above are all monotonic in q. Thus, if all the conditions above (including (18)) are satisfied for a predicate q, and there is a predicate I such that Ī u_i ≤ u_i Ī for each i, then all the conditions are still satisfied if q is replaced with q + Ī.

The proof below can be viewed as formalizing the following construction, which starts from an execution that violates the discipline and produces an execution of the reduced system that also violates the discipline. First, we try to move invisible u_i actions to the left of u_j and v_j actions, where i ≠ j, starting from the left (i.e., from the leftmost u_i action that immediately follows a u_j or v_j action). The u_i action cannot make it all the way to the beginning of the execution (since p u_i = 0), so it must eventually run into either another u_i or a v_i. Repeating this produces an execution in which a sequence of M actions leads to a violation of the discipline.

Next, we try to turn all but the last of these M actions into N actions, starting from the next-to-last M action. In general, we will have done this for some number of M actions, so we will have an execution that ends with N* R. Now try to convert the last M_i before the N* R suffix into an N action. Suppose this M_i action ends with u_i enabled. u_i must then also be enabled later when the discipline is first violated (because (12) and (13) imply N_j does not affect enabledness of u_i, and (16) implies N_i is disabled when u_i is enabled), so we add a u_i action just after the violation and try to push it backward (through the N* (1 + M)). This may create additional violations of the discipline, but there will always be an N* R to the right of the new u_i. Eventually, u_i makes it back to the M_i, extending M_i with another u_i. By (11), u_i's cannot continue forever without violating the discipline, so repeating this extension process eventually either gives us a violation right after M_i (in which case we have produced a new N* R action, so we can discard everything after it) or leads to the u_i's being disabled, in which case we have successfully turned the M_i action into an N action and again turned the extended execution into an execution that ends with N* R. Repeating this for each M_i action, moving from right to left, produces the desired execution of the reduced system.
We now turn to the formal proof of the reduction theorem (19). We push u's left (lines 1–2) where they are eliminated by the initial p (line 3), push M's to the left of the R's (line 4), condense the R's to a single R (lines 5–6), and finally turn the M's into N's (lines 7–8):

p (u + v)* q ≤ N* R      (19)

  p (u + v)* q
≤   {v ≤ M ≤ M + R}
  p (u + M + R)* q
≤   {(M + R) u ≤ (1 + u) (M + R) (20); (2)}
  p u* (M + R)* q
≤   {p u = 0 (16); p ≤ 1}
  (M + R)* q
≤   {R M ≤ R; (2)}
  M* R* q
≤   {R R ≤ R, so R* = (1 + R)}
  M* (1 + R) q
≤   {(1 + R) q ≤ R}
  M* R
≤   {1 ≤ N*}
  M* N* R
=   {M N* R ≤ N* R (21); (* ind)}
  N* R
(20) says that a u moves to the left of an M or R (but may disappear in the process):

(M + R) u ≤ (1 + u) (M + R)      (20)

  (M + R) u
≤   {(dist)}
  M u + R u
≤   {R = R ⊤, so R u = R ⊤ u ≤ R ⊤ = R}
  M u + R
=   {(7),(5),(dist)}
  (+i,j : M_j u_i) + R
≤   {M_j u_i ≤ (1 + u_i) (M_j + R) (25)}
  (+i,j : (1 + u_i) (M_j + R)) + R
≤   {(7),(5),(dist)}
  (1 + u) (M + R)
(21) shows that N* R actions act as a factory for u_i actions until they either produce a discipline violation (q) or until they produce enough u_i's to turn the M_i to their left into an N.

M_i N* R ≤ N* R      (21)

  M_i N* R
≤   {N* R ≤ (u_i q̄) N* R + (p_i + u_i q) N* R (22); (⋆ ind)}
  M_i (u_i q̄) ⋆ (p_i + u_i q) N* R
≤   {(u_i q̄)^ω = 0 (11)}
  M_i (u_i q̄)* (p_i + u_i q) N* R
≤   {q̄ ≤ 1; M_i u_i* = M_i (3)}
  M_i (p_i + u_i q) N* R
=   {(dist)}
  (M_i p_i + M_i u_i q) N* R
≤   {M_i u_i ≤ M_i (3); M_i p_i ≤ M_i p_i q̄ + M_i q}
  (M_i p_i q̄ + M_i q) N* R
≤   {M_i p_i q̄ = N_i (4); N_i ≤ N (8)}
  (N + M_i q) N* R
≤   {(dist)}
  N N* R + M_i q N* R
≤   {M_i q N* R ≤ M_i q ⊤ ≤ R (9)}
  N N* R + R
≤   {N N* ≤ N*, 1 ≤ N*}
  N* R
(22) and (23) show that N* R generates a u_i (unless p_i already holds, or the discipline has already been violated).

N* R ≤ (u_i q̄) N* R + (p_i + u_i q) N* R      (22)

  N* R
≤   {N* R ≤ (p_i + u_i) N* R (23)}
  (p_i + u_i) N* R
≤   {u_i = u_i q̄ + u_i q}
  (p_i + u_i q̄ + u_i q) N* R
≤   {(dist)}
  (u_i q̄) N* R + (p_i + u_i q) N* R

N* R ≤ (p_i + u_i) N* R      (23)

  N* R
=   {(9)}
  N* (1 + M) q ⊤
≤   {1 ≤ p_i + u_i ⊤ (17)}
  N* (1 + M) q (p_i + u_i) ⊤
≤   {q p_i = p_i q; q u_i ≤ u_i q (18)}
  N* (1 + M) (p_i + u_i) q ⊤
≤   {M (p_i + u_i) ≤ (p_i + u_i) (M + R) (25)}
  N* (p_i + u_i) (1 + M + R) q ⊤
≤   {(1 + M + R) q ⊤ ≤ R (9)}
  N* (p_i + u_i) R
≤   {N (p_i + u_i) ≤ (p_i + u_i) (N + R) (24); (1)}
  (p_i + u_i) (N + R)* R
≤   {R N ≤ R; (2)}
  (p_i + u_i) N* R* R
=   {R R ≤ R; (* ind)}
  (p_i + u_i) N* R
Finally, (24) and (25) show that (p_i + u_i) commutes left past an N or M (possibly changing them into R's).

N_j (p_i + u_i) ≤ (p_i + u_i) (N + R)      (24)

  N_j (p_i + u_i)
≤   {N_j ≤ M_j p_j (4)}
  M_j p_j (p_i + u_i)
≤   {p_j p_i = p_i p_j; p_j u_i ≤ u_i p_j (12)(16)}
  M_j (p_i + u_i) p_j
≤   {M_j (p_i + u_i) ≤ (p_i + u_i) (M_j + R) (25)}
  (p_i + u_i) (M_j + R) p_j
≤   {1 = q + q̄}
  (p_i + u_i) (M_j + R) p_j (q + q̄)
≤   {M_j p_j q̄ ≤ N (4)(8); M_j q ≤ R (9)}
  (p_i + u_i) (N + R)

M_j (p_i + u_i) ≤ (p_i + u_i) (M_j + R)      (25)

case i = j:

  M_i (p_i + u_i)
≤   {p_i ≤ 1; M_i u_i ≤ M_i (3)}
  M_i
≤   {v_i = p_i v_i (16), so M_i = p_i M_i (3)}
  (p_i + u_i) (M_i + R)

case i ≠ j: Let [x] = x + x q ⊤ + q ⊤; then

  M_j (p_i + u_i)
=   {(3)}
  v_j u_j* (p_i + u_i)
≤   {u_j (p_i + u_i) ≤ (p_i + u_i) [u_j] (12)(14); (1)}
  v_j (p_i + u_i) [u_j]*
=   {[u_j]* = [u_j*] (2)}
  v_j (p_i + u_i) [u_j*]
≤   {v_j (p_i + u_i) ≤ (p_i + u_i) [v_j] (13)(15)}
  (p_i + u_i) [v_j] [u_j*]
≤   {[v_j] [u_j*] ≤ M_j + R}
  (p_i + u_i) (M_j + R)
4 System Model and Synchronization Discipline
We define a simple model of concurrent systems that use mutual exclusion for
access to selected variables, and we prove that our reduction theorem applies
to these systems. This model is intended to be the simplest one that retains all
relevant aspects of concurrent programming languages, such as Java. It can be
modified and generalized in various ways with little effect on our results.
Each shared variable is classified as protected or unprotected. There are no constraints on how unprotected variables are accessed. The synchronization discipline requires that mutual exclusion be used for access to protected variables. Any combination of synchronization mechanisms (locks, condition variables, semaphores, barriers, etc.) can be used to provide the mutual exclusion, provided the scheme can be captured by exclusive access predicates. For each protected variable x and each thread i, there is an exclusive access predicate e^x_i. The synchronization discipline requires that e^x_i hold in states from which thread i can execute a transition that accesses x. Mutual exclusion is expressed by the requirement that, for every variable x and every two distinct threads i and j, e^x_i and e^x_j are mutually exclusive (i.e., cannot hold simultaneously).

Formally, a system is a tuple ⟨Θ, Vunsh, Vprot, Vunprot, T, I, e⟩ where

– Θ is a set of threads (thread identifiers). i and j range over Θ.
– Vunsh is a set of unshared variables, i.e., variables that appear in transitions of at most one thread.
– Vprot is a set of variables declared (possibly incorrectly) to be "protected", i.e., there are synchronization mechanisms that ensure mutual exclusion for accesses to these variables. For each variable x ∈ Vprot and each thread i, there is an exclusive access predicate e^x_i.
– Vunprot is a set of (possibly shared) variables, called "unprotected variables". No assumptions are made regarding synchronization for accesses to them.
– T = ∪i Ti is a set of transitions, where Ti is the set of transitions of thread i. Let V = Vunsh ∪ Vprot ∪ Vunprot and Vguard = Vunsh ∪ Vunprot. A transition t is a guarded command g → c, where the guard g is a predicate over Vguard, and c is built from assignments over V, sequential composition, and conditionals (if-then and if-then-else).
– I is a predicate over V. I characterizes the initial states.
– e is a family of (possibly incorrect) exclusive access predicates e^x_i over V.

Guards are used for synchronization (blocking). Conditionals in commands are used for sequential control flow. For convenience of analysis, protected variables cannot appear in guards. This is reasonable because the synchronization mechanisms that protect the variables, not the protected variables themselves, should be used to achieve the necessary synchronization. The value of a protected variable v can be copied into an unshared or unprotected variable, and the latter variable can be used in a guard, or v can be moved from Vprot to Vunprot and then used in a guard directly.

Fix a system. A state is a mapping from variables to values. Let Σ be the set of states. We also use states as maps from expressions to values, with the usual meaning (homomorphic extension).
A transition t is enabled in state s if its guard is true in s. An execution is a finite or infinite sequence σ of states such that σ(0) satisfies I and every pair of consecutive states in σ is in [[t]] for some transition t.

A transition is visible if it (i) contains an occurrence of a variable in Vunprot or (ii) might change the value of an exclusive access predicate. Other transitions are invisible. This classification of transitions determines the transition relations u_i and v_i and the predicates p_i.
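As an illustration of this model (our own sketch; the variable names and the two transitions are hypothetical), a transition can be represented as a guard plus a command over state dictionaries, with visibility decided by whether the transition mentions an unprotected variable. For brevity, only clause (i) of the visibility definition is modeled; the clause about changing exclusive access predicates is omitted.

```python
# Guarded-command transitions over state dictionaries (a sketch).
V_UNSH = {"loc"}
V_PROT = {"x"}
V_UNPROT = {"flag"}

class Transition:
    """A guarded command g -> c; `occurs` lists the variables occurring
    in the transition (used to classify it as visible or invisible)."""
    def __init__(self, guard, command, occurs):
        self.guard = guard            # predicate over V_unsh and V_unprot
        self.command = command        # state dict -> state dict
        self.occurs = set(occurs)

    def enabled(self, s):
        return self.guard(s)

    def visible(self):
        # Clause (i) only: the transition mentions an unprotected variable.
        return bool(self.occurs & V_UNPROT)

t1 = Transition(lambda s: s["flag"] == 0,
                lambda s: {**s, "flag": 1},
                {"flag"})                                 # visible
t2 = Transition(lambda s: s["loc"] < 2,
                lambda s: {**s, "x": s["x"] + 1, "loc": s["loc"] + 1},
                {"x", "loc"})                             # invisible
```

Note that t2's guard mentions only the unshared variable "loc", in line with the requirement above that protected variables not appear in guards.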
A system is well-formed if the following conditions hold.

WF-initVis. The initial transitions of each thread are visible, i.e., I ⇒ p. (This ensures that the conclusion of the reduction theorem applies to all reachable states of the original system.)

WF-sep. Visible and invisible transitions of each thread are separate, i.e., cannot be executed from the same state. Formally, (∀i : domain(u_i) ∩ domain(v_i) = ∅).

WF-acc. Internal nondeterminism in a transition (i.e., nondeterministic choices that do not affect the ending state) does not affect the set of variables accessed by the transition or the order in which those variables are first accessed. (This ensures well-definedness of acc in Section 4.1 and of x in case 2 of the proof of (15) in Section 5.)

WF-finiteInvis. No thread has an infinite execution sequence containing only invisible transitions. Formally, (∀i : u_i^ω = 0).

WF-initExcl. For each protected variable x, the exclusive access predicates for x are initially disjoint, i.e., I ⇒ disjoint(e^x), where disjoint(e^x) = ¬(∃i, j : i ≠ j ∧ e^x_i ∧ e^x_j).

WF-endExcl. A thread cannot take away another thread's exclusive access to a variable. Formally, for an exclusive access predicate e^x_i and j ≠ i, transitions of thread j cannot falsify e^x_i.
4.1 Mutual-Exclusion Synchronization Discipline
The synchronization discipline requires that, for every variable x ∈ Vprot, (i) a
transition of thread i executed from a state s may access x only if s = ex
(ii) disjoint(ex) holds in every reachable state.
Let acc(s1,t,s2) denote the set of variables accessed by execution of transition
t from state s1to s2. The set of accessed variables may depend on which branches
of conditionals are taken. The ending state s2is included as an argument to acc
because t may be nondeterministic. WFacc ensures that acc is welldefined.
Since guards do not contain protected variables, acc(s1,t,s2) = ∅ if t is disabled
in s1(otherwise, acc(s1,t,s2) would be the set of protected variables in t’s guard).
We augment the system with a predicate q that holds iff the synchronization
discipline has been violated. Formally, q is the least predicate that satisfies
∀i : ∀x ∈ Vprot: ∀t ∈ Ti: ∀?s1,s2? ∈ [[ti]] : s2= q ⇐⇒
((x ∈ acc(s1,t,s2) ∧ s1?= ex
The third disjunct in (26) implies that q is monotonic, i.e., it can be truthified
but not falsified.
i, and
i) ∨ s2?= disjoint(ex) ∨ s1= q).
(26)
Page 11
Maintaining q involves accesses to q and accesses to variables that occur
in exclusive access predicates. These accesses are ignored when determining
acc(s1,t,s2).
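As a concrete reading of (26), the violation flag at the ending state of a single transition can be computed from three ingredients. The following is an illustrative sketch; the function and parameter names are ours, not part of the formal development:

```python
def step_q(q_s1, accessed_prot, excl_holds_s1, disjoint_s2):
    """Violation flag q at the ending state s2 of one transition of thread i.

    q_s1:           whether q already held in the starting state s1
    accessed_prot:  protected variables accessed by the transition
    excl_holds_s1:  maps each protected variable x to whether e^x_i held in s1
    disjoint_s2:    whether disjoint(e^x) holds in s2 for every protected x
    """
    unsynchronized_access = any(not excl_holds_s1[x] for x in accessed_prot)
    # The third disjunct makes q monotonic: once true, it stays true.
    return unsynchronized_access or (not disjoint_s2) or q_s1
```

The monotonicity noted after (26) is visible here: when q_s1 is true, the result is true regardless of the other arguments.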
5 Proof that the Reduction Theorem Applies to the Mutual-Exclusion Synchronization Discipline

We prove in [SC02] that well-formed systems satisfy the hypotheses (11)–(18) of
the reduction theorem. Most of the proofs are straightforward. Here we consider
only the most interesting one.

Proof of (15). Let t_i be an invisible transition of thread i, let t_j be a visible
transition of thread j, and let s1, s2, and s3 be states such that ⟨s1,s2⟩ ∈ t_j and
⟨s2,s3⟩ ∈ t_i. Let t_i = g_i → c_i and t_j = g_j → c_j. t_j does not enable t_i, because c_j
and g_i access disjoint sets of variables (because t_i is invisible and hence does not
access unprotected variables, and protected variables do not appear in guards).
t_i does not disable t_j, for analogous reasons. Thus, there exist states s′2 and s′3
such that ⟨s1,s′2⟩ ∈ t_i and ⟨s′2,s′3⟩ ∈ t_j. Transitions may be nondeterministic,
so s′2 and s′3 are not uniquely determined by these conditions. It suffices to show
that s′2 and s′3 can be chosen so that one of the following conditions (which
correspond to the summands in (15)) holds: (i) s′2 ⊨ q, (ii) s′3 = s3 (i.e., c_i
left-commutes with c_j), or (iii) s′3 ⊨ q. Let A = acc(s1,t_j,s2) ∩ acc(s1,t_i,s′2).

case 1: A = ∅. This implies that

acc(s1,t_j,s2) = acc(s′2,t_j,s′3) ∧ acc(s2,t_i,s3) = acc(s1,t_i,s′2),   (27)

because the same branches of conditionals will be executed from either source
state. This and A = ∅ imply that (∀x ∈ acc(s2,t_i,s3) : s1(x) = s2(x)) and
(∀x ∈ acc(s1,t_j,s2) : s1(x) = s′2(x)). Thus, by resolving nondeterminism (if
any) in the transitions in the same way when executing t_i followed by t_j as when
executing t_j followed by t_i to reach s3, we obtain s′3(v) = s3(v) for all variables
v ∈ V \ {q}. We must exclude q here because acc does not reflect accesses used
to update q, as stated in Section 4.1.

case 1.1: s3 ⊨ ¬q. If s′3 ⊨ ¬q, then s′3 = s3, i.e., condition (ii) holds. If s′3 ⊨ q,
then condition (iii) holds.

case 1.2: s3 ⊨ q. We show that s′2 ⊨ q or s′3 ⊨ q.

case 1.2.1: s1 ⊨ q. This and monotonicity of q imply s′3 ⊨ q.

case 1.2.2: s1 ⊨ ¬q. This and s3 ⊨ q imply that the synchronization discipline
is violated either by execution of t_j from s1 or by execution of t_i from s2. The
violation corresponds to the first or second disjunct in (26) being true (the third
disjunct just makes q monotonic). Thus, there are 2 × 2 cases to consider.

case 1.2.2.1: (∃x ∈ Vprot : x ∈ acc(s1,t_j,s2) ∧ s1 ⊭ e^x_j). (27) implies x ∈
acc(s′2,t_j,s′3). t_i is invisible, so it cannot truthify e^x_j, so s′2 ⊭ e^x_j. Thus, the
definition of q implies s′3 ⊨ q.

case 1.2.2.2: (∃x ∈ Vprot : x ∈ acc(s2,t_i,s3) ∧ s2 ⊭ e^x_i). (27) implies x ∈
acc(s1,t_i,s′2). WF-endExcl implies t_j did not falsify e^x_i, so s1 ⊭ e^x_i. Thus, the
definition of q implies s′2 ⊨ q.

case 1.2.2.3: (∃x ∈ Vprot : s2 ⊭ disjoint(e^x)). t_i is invisible, so it cannot
falsify any exclusive access predicate, so s3 ⊭ disjoint(e^x). s3 and s′3 have the
same values for all variables except q, so s′3 ⊭ disjoint(e^x). Thus, the definition
of q implies s′3 ⊨ q.

case 1.2.2.4: (∃x ∈ Vprot : s3 ⊭ disjoint(e^x)). s3 and s′3 have the same values
for all variables except q, so s′3 ⊭ disjoint(e^x). Thus, the definition of q implies
s′3 ⊨ q.

case 2: A ≠ ∅. Let x be the variable in A first accessed by execution of t_j
from s1 to s2.

case 2.1: s1 ⊨ e^x_j. By definition of A, x ∈ acc(s1,t_i,s′2).

case 2.1.1: s1 ⊨ disjoint(e^x). The hypotheses of cases 2.1 and 2.1.1, together
with i ≠ j, imply s1 ⊭ e^x_i. This and x ∈ acc(s1,t_i,s′2) imply s′2 ⊨ q.

case 2.1.2: s1 ⊭ disjoint(e^x). This and the definition of q imply s1 ⊨ q. This
and monotonicity of q imply s′2 ⊨ q.

case 2.2: s1 ⊭ e^x_j. The definitions of A and x imply that x ∈ acc(s′2,t_j,s′3),
because the first access to x by t_j precedes execution of conditionals in t_j whose
conditions could be affected by execution of t_i from s1. t_i is invisible, so it cannot
truthify e^x_j, so s′2 ⊭ e^x_j. Thus, the definition of q implies s′3 ⊨ q.
6 Examples

This section contains examples of systems for which the current reduction is
effective (i.e., it reduces the number of reachable states) and the reduction in
[Sto02] is not effective. In general, our method is effective whenever some
variables can be classified as protected. These examples are based mainly on
descriptions in [SBN+97] of code in real systems.
Semaphores. A user thread sends a request to a device driver thread, asking
the device driver to store data in a buffer b, and then waits for the result by
invoking sem.down(), where sem is a semaphore, initialized to zero. The device
driver thread receives the request, waits for the device to supply the data, stores
the data in b, and then calls sem.up(). The buffer b can be classified as protected.
For example, e^b_user holds when the program counter of the user thread points to
a statement after the call to sem.down(), and e^b_driver holds when the program
counter of the device driver thread points to a statement before the call to
sem.up(). The semaphore ensures disjointness of e^b_user and e^b_driver.
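These two predicates can be written directly as functions of the program counters. A minimal sketch, assuming a hypothetical set of control-point labels (the labels and the state layout are ours):

```python
# Hypothetical control-point labels; only membership in these sets matters.
USER_AFTER_DOWN = {"read_result", "done"}              # after sem.down() returns
DRIVER_BEFORE_UP = {"recv", "await_device", "fill_b"}  # before sem.up()

def eb_user(state):
    # e^b_user: user thread is past its sem.down() call
    return state["pc_user"] in USER_AFTER_DOWN

def eb_driver(state):
    # e^b_driver: driver thread has not yet called sem.up()
    return state["pc_driver"] in DRIVER_BEFORE_UP

def disjoint_eb(state):
    # The zero-initialized semaphore should make this hold in every reachable
    # state: the user gets past down() only after the driver's up().
    return not (eb_user(state) and eb_driver(state))
```

In a reachable state the two predicates should never hold simultaneously, which is exactly the disjointness the semaphore provides.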
Memory Reuse. Some systems reuse objects (or structures) by placing them on
a free list when they are not in use. These objects may be protected by different
locks each time they are reused, violating the locking discipline of [Sto02]. For
example, consider a file system in which blocks in a file are protected by the lock
associated with (the inode of) that file, and blocks on the free list are protected
by the lock associated with the free list. A block may be in a different file, and
hence protected by a different lock, each time it is reused. Let mF denote the
lock associated with the free list. Let mf denote the lock associated with file f.
The exclusive access predicate e^b_i for a block b might be

(onFreeList(b) ∧ mF.owner = i) ∨ (∃ file f : allocatedTo(b,f) ∧ m_f.owner = i)
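The displayed predicate transcribes mechanically. The sketch below assumes a simple dictionary encoding of the file-system state; the encoding is ours, purely illustrative:

```python
def eb_i(state, b, i):
    """e^b_i for block b and thread i, per the displayed predicate."""
    # onFreeList(b) ∧ mF.owner = i
    on_free_list = b in state["free_list"] and state["owner"]["mF"] == i
    # ∃ file f : allocatedTo(b, f) ∧ m_f.owner = i
    in_owned_file = any(b in blocks and state["owner"][f] == i
                        for f, blocks in state["files"].items())
    return on_free_list or in_owned_file
```

Because ownership of mF and of the per-file locks changes as blocks migrate between files and the free list, the same block is guarded by different locks over its lifetime, which is exactly what defeats a fixed lock-per-variable discipline.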
Master-Worker Paradigm. In the master-worker paradigm, a master thread
assigns tasks to worker threads. Typically, each task is represented by an object
created by the master thread and passed to a worker thread. The master thread
does not access a task object after passing it to a worker. Task objects can be
classified as protected. Suppose each worker thread w has a field w.task that
refers to the worker's task. For a task object x, the exclusive access predicate
e^x_master holds before x has been passed to a worker thread, and e^x_w holds when
w.task = x.
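These two predicates have a particularly direct encoding. A sketch under an assumed state layout (all names are ours):

```python
def ex_master(state, x):
    # e^x_master: holds until the master hands task x to some worker.
    return x not in state["passed"]

def ex_w(state, x, w):
    # e^x_w: holds while worker w's task field refers to x (w.task = x).
    return state["task"].get(w) == x
```

Disjointness here rests on the hand-off protocol: the step that sets w.task to x also records x as passed, and the master never touches x afterwards.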
7 Comparison to Traditional Partial-Order Methods

This section demonstrates that our method has advantages over traditional
partial-order methods even for some simple systems for which precise static
analysis of transition dependencies is feasible. Consider a system with two threads
that use monitors m0 and m1 as locks and use an integer variable y to implement
a barrier. Let uppercase letters denote control points. Let guard → stmt denote
a transition that blocks when guard is false and can execute stmt when guard is
true. For i ∈ {0,1}, the code for thread i is

A m0.acquire(); B x0 := i; C m0.release(); D m1.acquire(); E x1 := i;
F m1.release(); G y++; H y = 2 → skip; I x_i = i J

In the initial state, x_j = j and y = 0, and both threads are at control point A.
x_j is a protected variable, with exclusive access predicate

e^{x_j}_i = (m_j.owner = i) ∨ (y = 2 ∧ i = j)   (28)

y is not protected.

This system has 106 reachable states. With the reduction in this paper,
transitions that update x0 or x1 are invisible; other transitions are visible. The
reachable states of the reduced system are the reachable states of the original
system in which every thread is ready to perform a visible transition or is at its
final control point. There are 62 such states.
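To make the state counts concrete, the barrier system can be explored with a plain breadth-first search. The sketch below is our own transcription of the thread code into a guarded-transition relation; the state encoding is ours, so the resulting count need not match the figures above exactly, but the search does confirm that both threads can pass the barrier and terminate:

```python
from collections import deque

def successors(s):
    """Enabled transitions of our transcription of the two-thread example.
    State: (pcs, x0, x1, y, m0, m1); m0/m1 hold the owning thread id or None."""
    pcs, x0, x1, y, m0, m1 = s
    for i in (0, 1):
        pc, upd = pcs[i], None
        if pc == 'A' and m0 is None:   upd = ('B', {'m0': i})     # m0.acquire()
        elif pc == 'B':                upd = ('C', {'x0': i})     # x0 := i
        elif pc == 'C':                upd = ('D', {'m0': None})  # m0.release()
        elif pc == 'D' and m1 is None: upd = ('E', {'m1': i})     # m1.acquire()
        elif pc == 'E':                upd = ('F', {'x1': i})     # x1 := i
        elif pc == 'F':                upd = ('G', {'m1': None})  # m1.release()
        elif pc == 'G':                upd = ('H', {'y': y + 1})  # y++
        elif pc == 'H' and y == 2:     upd = ('I', {})            # y = 2 -> skip
        elif pc == 'I' and (x0, x1)[i] == i:
            upd = ('J', {})                                       # x_i = i
        if upd is None:
            continue                   # thread i is blocked in s
        new_pc, changes = upd
        st = {'x0': x0, 'x1': x1, 'y': y, 'm0': m0, 'm1': m1}
        st.update(changes)
        new_pcs = (new_pc, pcs[1]) if i == 0 else (pcs[0], new_pc)
        yield (new_pcs, st['x0'], st['x1'], st['y'], st['m0'], st['m1'])

def reachable(init):
    """Explicit-state BFS over the transition relation."""
    seen, frontier = {init}, deque([init])
    while frontier:
        for t in successors(frontier.popleft()):
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return seen

init = (('A', 'A'), 0, 1, 0, None, None)   # x_j = j, y = 0, both threads at A
states = reachable(init)
```

Note that both threads reaching control point J requires the interleaving in which thread 1 writes x0 before thread 0 does, and thread 0 writes x1 before thread 1 does; the search finds such states among the reachable ones.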
Traditional partial-order methods based on persistent sets [God96] (or ample
sets [CGP99]) can also significantly reduce the number of explored states but
do not achieve the same benefits as our reduction. For concreteness, we compare
our method to selective search using the conditional stubborn set algorithm
(CSSA) [God96]. We always resolve nondeterminism in CSSA in a way that
yields a minimum-size persistent set. CSSA is parameterized by dependency
relations on operations. For acquire and release, we use the might-be-the-first-
to-interfere-with relation in [Sto02, Fig. 3]. For accesses to y, we use the minimal
might-be-the-first-to-interfere-with relation, based on the dependency relation on
operations in which an increment to y is dependent with the condition y = 2
only in states in which the increment changes the truth value of the condition.
The selective search (using CSSA) explores 77 states. To illustrate why it
explores more than the 62 states explored by our method, consider the reachable
state s in which thread 0 is at control point D and thread 1 is at control
point B. With the reduction in this paper, the transitions that update x0 or x1
are invisible, so the system passes through this invisible state by executing the
enabled transition of thread 1; the enabled transition of thread 0 is not executed
in s. In contrast, the selective search explores both enabled transitions in s, as
explained in detail in [SC02].

This example can be generalized to show our method outperforming the
selective search by an arbitrary amount: simply insert additional transitions
that access x0 before the transition m0.release() in thread 0.
The selective search exploits some independence that our method does not,
in particular, independence of release with acquire and release, and
independence of acquire with acquire in some states. One way to obtain the benefits
of both methods is to apply selective search to the reduced system. This works
for systems for which sufficiently precise static analysis of dependencies between
transitions is feasible (cf. Section 1). Another approach is to extend our method,
e.g., to incorporate the specialized treatment of monitor operations in [Sto02]
that allows release to be classified as invisible.
8 How to Use the Reduction

The intended methodology for using the reduction is as follows.

1. Guess the set Vprot of protected variables and the exclusive access
predicates e^x_i. These guesses determine visibility of transitions and hence define a
reduced system, in which the transition relation of thread i is N_i, defined in (4).

2. Augment the reduced system with a predicate q, as described in Section 4.1.

3. Check whether ¬q holds in all reachable states of the reduced system. Check
this using your favorite technique: model checking, theorem proving, hand-waving,
etc.

4. If so, then the reduction theorem implies that ¬q holds in all reachable
states of the original system, i.e., the guesses in Step 1 are correct. Traditional
reduction theorems can now be used to infer other properties of the original
system from properties of the reduced system.

5. If not, then for some variable x in Vprot, the reduced system has a reachable
state in which the mutual-exclusion synchronization discipline for x is violated.
Revise the guess for e^x (using the path to the violation as a guide) or reclassify
x as unprotected, and then return to Step 1.
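The five steps form a guess-check-refine loop. The sketch below shows its overall shape with all system-specific work abstracted into caller-supplied functions; every name here is ours and purely illustrative:

```python
def use_reduction(build_reduced, check_not_q, refine, guesses, max_rounds=10):
    """Guess-check-refine loop from Section 8 (illustrative skeleton).

    build_reduced(guesses)   -> reduced system (Step 1: visibility from guesses)
    check_not_q(reduced)     -> (ok, witness)  (Steps 2-3: does ¬q hold everywhere?)
    refine(guesses, witness) -> new guesses    (Step 5: revise or reclassify)
    """
    for _ in range(max_rounds):
        reduced = build_reduced(guesses)
        ok, witness = check_not_q(reduced)
        if ok:
            return guesses, reduced  # Step 4: guesses validated for the original
        guesses = refine(guesses, witness)
    raise RuntimeError("no stable classification within the round limit")
```

For instance, refine might simply drop the variable named in the violation witness from Vprot, which always terminates because Vprot shrinks each round.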
9 How to Use the Reduction Automatically for Systems with Monitors

The methodology in Section 8 is automatic except that the user must guess
Vprot and the exclusive access predicates. For systems that use monitors for
synchronization, this step, too, can be automated, based on the observation
that the exclusive access predicates typically have the form e^x_i = eap^{x,m}_i, where

eap^{x,m}_i = init^x_i ∨ (i = m.owner ∧ ¬init^x)   (29)

init^x = (∃i ∈ Θ : init^x_i)   (30)

and where the initialization predicate init^x_i holds while thread i is executing code
that initializes x. Note that the lock protecting a variable does not need to be
held while the variable is being initialized.

Initialization predicates for variables in systems that correspond to Java
programs can be guessed automatically: the initialization predicate holds when the
thread's program counter is in the appropriate class initializer (for static fields)
or the appropriate constructor invocation (for instance fields).

To use (29), we need to identify, for each variable x in Vprot, a monitor m that
protects x. This can be done automatically by running a variant of the lockset
algorithm [SBN+97] during state-space exploration of the reduced system.
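Equations (29) and (30) transcribe directly. The sketch below assumes a per-variable set of initializer program points; the state encoding and the init-region representation are ours:

```python
def init_x_i(state, x, i):
    # init^x_i: thread i's pc is inside code that initializes x (hypothetical
    # encoding: a set of initializer program points per variable).
    return state["pc"][i] in state["init_region"][x]

def init_x(state, x):
    # (30): init^x = (∃ i ∈ Θ : init^x_i)
    return any(init_x_i(state, x, i) for i in state["pc"])

def eap(state, x, m, i):
    # (29): eap^{x,m}_i = init^x_i ∨ (i = m.owner ∧ ¬init^x)
    return init_x_i(state, x, i) or (
        state["owner"][m] == i and not init_x(state, x))
```

The first disjunct is what lets a thread initialize x without holding m; the ¬init^x conjunct denies the lock holder access while any thread is still initializing.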
10 Experimental Results

We implemented the similar reduction of [Sto02] in Java PathFinder (JPF)
[BHPV00] and measured the benefit of the reduction for several programs with
monitor-based synchronization. HaltException and Clean [BHPV00, Figure 1]
are small "synchronization skeletons" supplied by the developers of JPF. XtangoDP
and XtangoQS are animations of a dining philosophers algorithm and quicksort,
respectively, from http://www.mcs.drexel.edu/~shartley/; we replaced java.awt
methods with methods having empty bodies, due to limitations of JPF. The
lockset algorithm was used in all experiments. With negligible manual effort (to
write a few lines of config files), the reduction decreases memory usage by a factor
of 1.4MB/0.77MB ≈ 1.8 for HaltException, 4.3MB/2.2MB ≈ 2.0 for Clean,
609MB/236MB ≈ 2.6 for XtangoDP, and 344MB/101MB ≈ 3.4 for XtangoQS,
compared to model checking with JPF's default granularity, which executes
each line of source code atomically. In a real JVM, bytecode instructions execute
atomically. Our reduction preserves that semantics. JPF's source-line granularity
does not: it can miss errors. Compared to bytecode granularity, our reduction
decreases memory usage by a factor of 13.9MB/0.77MB ≈ 18 for HaltException
and at least 1800MB/101MB ≈ 18 for XtangoQS ("at least" reflects an
out-of-memory exception).
Acknowledgements. We thank Shaz Qadeer for telling us about exclusive access
predicates, Liqiang Wang for doing the experiments with JPF, and Patrice
Godefroid for comments about partial-order methods.
References

[BR01] C. Boyapati and M. C. Rinard. A parameterized type system for race-free
Java programs. In Proc. 16th ACM Conference on Object-Oriented Programming,
Systems, Languages and Applications (OOPSLA), volume 36(11) of
SIGPLAN Notices, pages 56–69. ACM Press, November 2001.