# Optimistic Synchronization-Based State-Space Reduction

**ABSTRACT** Reductions that aggregate fine-grained transitions into coarser transitions can significantly reduce the cost of automated verification, by reducing the size of the state space. We propose a reduction that can exploit common synchronization disciplines, such as the use of mutual exclusion for accesses to shared data structures. Exploiting them using traditional reduction theorems requires checking that the discipline is followed in the original (i.e., unreduced) system. That check can be prohibitively expensive. This paper presents a reduction that instead requires checking whether the discipline is followed in the reduced system. This check may be much cheaper, because the reachable state space is smaller.


Scott D. Stoller\*¹ and Ernie Cohen\*\*²

¹ State University of New York at Stony Brook

² Microsoft Research, Cambridge, UK


1 Introduction

For many concurrent software systems, a straightforward model of the system has such a large and complicated state space that automated verification, by automated theorem-proving or state-space exploration (model checking), is infeasible. Reduction is an important technique for reducing the size of the state space by aggregating transitions into coarser-grained transitions.

When exploring the state space of a concurrent system, context switches between threads are typically allowed before each transition. A simple example of a reduction for concurrent systems is to inhibit context switches within sequences of transitions that access only unshared variables. This effectively increases the granularity of transitions. Thus, one can regard this and similar reductions as defining a reduced system, which is a coarser-grained version of the original system. The reduced system may have dramatically fewer states than the original system. A reduction theorem asserts that certain properties are preserved by the transformation.
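The effect of inhibiting context switches can be seen on a toy system. The following is a minimal sketch, not taken from the paper: it assumes two threads that each perform `K` steps on unshared variables only, and it compares the number of reachable states when a context switch is allowed before every step with the number when each thread's run of local steps is a single transition. The names `explore`, `fine_grained`, `coarse_grained`, and `K` are illustrative assumptions.

```python
from collections import deque

def explore(initial, successors):
    """Breadth-first search; returns the set of reachable states."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

K = 3  # hypothetical number of unshared-variable steps per thread

# A state is (pc0, pc1): each thread performs K local steps, then stops.
def fine_grained(state):
    """Original system: a context switch is allowed before every step."""
    for i in range(2):
        if state[i] < K:
            nxt = list(state)
            nxt[i] += 1
            yield tuple(nxt)

def coarse_grained(state):
    """Reduced system: a thread's whole run of local steps is one transition."""
    for i in range(2):
        if state[i] < K:
            nxt = list(state)
            nxt[i] = K
            yield tuple(nxt)

original = explore((0, 0), fine_grained)
reduced = explore((0, 0), coarse_grained)
print(len(original), len(reduced))  # prints "16 4"
```

In this sketch the original system visits every combination of program counters, while the reduced system visits only the combinations in which each thread is at the start or the end of its run.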

We consider a more powerful reduction that exploits common synchronization disciplines. For example, in a system that uses mutual exclusion on accesses to some shared variables, called protected variables, our reduction inhibits context switches within sequences of transitions that access only unshared variables and protected variables. The model-checking experiments reported in [Sto02] are based on a similar reduction, which decreased memory usage (which is proportional to the number of states) by a factor of 25 or more. Such reductions can also significantly decrease the computational cost of the automated theorem-proving needed for thread-modular verification [FQS02].

\* The author gratefully acknowledges the support of NSF under Grant CCR-9876058 and the support of ONR under Grants N00014-01-1-0109 and N00014-02-1-0363. Address: Computer Science Dept., SUNY at Stony Brook, Stony Brook, NY 11794-4400. Email: stoller@cs.sunysb.edu Web: http://www.cs.sunysb.edu/~stoller/

\*\* Email: ernie.cohen@acm.org

Traditional reduction theorems, such as [Lip75,CL98,Coh00], can also exploit such synchronization disciplines. However, a hypothesis of these traditional theorems is that the allegedly protected variables are indeed protected (by synchronization that enforces mutual exclusion) in the original (i.e., unreduced) system. How can we establish this? Static analyses like [BR01,FF01] can automatically provide a conservative approximation but sometimes return “don’t know”. For general finite-state systems, it might seem that the only way to automatically obtain exact information about whether selected variables are actually protected is to express this condition as a history property and check it by state-space exploration of the original system. If this were the case, then the reduction would be almost pointless.

Our reduction theorem implies that one can determine exactly during state-space exploration of the reduced system whether the synchronization discipline is followed in the original system.

Our reduction theorem is designed to be used together with traditional reduction theorems. Suppose a traditional reduction theorem asserts that some property φ is preserved by the reduction if the original system follows the synchronization discipline. After checking that the reduced system follows the discipline and satisfies φ, one can use our reduction theorem to conclude that the original system follows the discipline, and then use the traditional reduction theorem to conclude that the original system satisfies φ.

The reduction in [Sto02] is similar in spirit to the one in this paper. The main contributions of this paper relative to [Sto02] are: (1) a reduction that applies to systems that use arbitrary synchronization mechanisms to achieve mutual exclusion (the results in [Sto02] apply only when monitors are used); (2) separation of a general reduction theorem that justifies checking hypotheses of traditional reduction theorems in the reduced system from the application of this technique to mutual-exclusion synchronization disciplines; (3) allowing non-determinism in invisible transitions (in the notation of Section 3, [Sto02] requires that u be deterministic); (4) significantly shorter and cleaner proofs, based on ω-algebra. The first author initially tried to prove similar results in a transition-system framework, like the one in [God96]; that should be possible, but our experience suggests that the algebraic framework facilitates the task.

Operations on monitors are not analyzed specially in this paper. As a result, for systems that mainly use monitors for synchronization, this reduction is not as effective as the one in [Sto02]. It should be possible to integrate the specialized treatment of monitor operations in [Sto02] into this paper’s broader framework.

Our method and traditional partial-order methods (e.g., ample sets [CGP99], stubborn sets [Val97], and persistent sets [God96]) both exploit independence (commutativity) of transitions, but our method can establish independence of transitions, and hence achieve a reduction, in many cases where traditional partial-order methods cannot. Traditional partial-order methods, as implemented in tools such as Spin [Hol97] and VeriSoft [God97], use two kinds of information to determine independence of transitions: program-specific information about which processes may perform which operations on which objects (e.g., only process P2 sends messages on channel C1), and manually supplied program-independent information about dependencies between operations on selected datatypes (e.g., a send operation on a full channel is disabled until a receive operation is performed on that channel). Our method also exploits more complicated program-specific information to determine independence of transitions, e.g., the invariant that a particular variable is always protected by particular synchronization constructs.

Traditional partial-order methods rely on static analysis to conservatively determine dependencies between transitions. As a result, those methods are less effective for programs that contain references (or pointers) and arrays, because static analysis cannot in general determine exactly which locations are accessed by each transition, and the static analysis of dependencies between transitions is correspondingly imprecise. Our method does not rely on conservative static analysis of dependencies and has no difficulty with references, etc.

2 Omega Algebra

An omega algebra is an algebraic structure over the operators (in order of increasing precedence) $0$ (nullary), $1$ (nullary), $+$ (binary infix), $\cdot$ (binary infix, usually written as simple juxtaposition), $\star$ (binary infix, same precedence as $\cdot$), ${}^*$ (unary suffix), and ${}^\omega$ (unary suffix), satisfying the following axioms³:

$$
\begin{array}{ll}
(x + y) + z = x + (y + z) & x \le y \iff x + y = y \\
x + y = y + x & x^* = 1 + x + x^* x^* \\
x + x = x & x\,y \le x \implies x\,y^* = x \quad (*\ \text{ind}) \\
0 + x = x & x\,y \le y \implies x^* y = y \quad (*\ \text{ind}) \\
x\,(y\,z) = (x\,y)\,z & x \star y = x^\omega + x^* y \\
0\,x = x\,0 = 0 & x^\omega = x\,x^\omega \\
1\,x = x\,1 = x & x \le y\,x + z \implies x \le y \star z \quad (\star\ \text{ind}) \\
x\,(y + z) = x\,y + x\,z & \\
(x + y)\,z = x\,z + y\,z &
\end{array}
$$

³ The axioms are equivalent to Kozen’s axioms for Kleene algebra [Koz94], plus the three axioms for omega terms.

In parsing formulas, $\cdot$ and $\star$ associate to the right; e.g., $u\,v \star x \star y$ parses to $(u \cdot (v^\omega + v^* \cdot (x^\omega + x^* \cdot y)))$. In proofs, we use the hint “(dist)” to indicate application of the distributivity laws, and the hint “(hyp)” to indicate the use of hypotheses. If $x_i$ is a finite collection of terms, we write $(+i : x_i)$ and $(\cdot i : x_i)$ for the sum and product, respectively, of these terms.

These axioms are sound and complete for the usual equational theory of omega-regular expressions. (Completeness holds only for standard terms, where the first arguments to $\cdot$, ${}^\omega$, and $\star$ are regular.) Thus, we make free use, without proof, of familiar equations from the theory of (omega-)regular languages (e.g., $x^* x^* = x^*$).

$y$ is a complement of $x$ iff $x\,y = 0 = y\,x$ and $x + y = 1$. It is easy to show that complements (when they exist) are unique and that complementation is an involution; a predicate is an element of the algebra with a complement. In this paper, $p$ and $q$ range over predicates, with complements $\bar p$ and $\bar q$. It is easy to show that the predicates form a Boolean algebra, with $+$ as disjunction, $\cdot$ as conjunction, $0$ as false, $1$ as true, complementation as negation, and $\le$ as implication. Common properties of Boolean algebras (e.g., $p\,q = q\,p$) are used silently in proofs, as is the fact $x\,p\,y = 0 \implies x\,y = x\,\bar p\,y$.

The omega algebra axioms support several interesting programming models, where (intuitively) $0$ is magic⁴, $1$ is skip, $+$ is chaotic nondeterministic choice, $\cdot$ is sequential composition, $\le$ is refinement, $x^*$ is executed by executing $x$ any finite number of times, and $x^\omega$ is executed by executing $x$ an infinite number of times. The results of this paper are largely motivated by the relational model, where terms denote binary relations over a state space, $0$ is the empty relation, $1$ is the identity relation, $\cdot$ is relational composition, $+$ is union, ${}^*$ is reflexive-transitive closure, $\le$ is subset, and $x^\omega$ relates an input state $s$ to an output state if there is an infinite sequence of states starting with $s$, with consecutive states related by $x$. (Thus, $x^\omega$ relates an input state to either all states or none, and $x^\omega = 0$ iff $x$ is well-founded.) Predicates are identified with the set of states in their domain (i.e., the states from which they can be executed). We define $\top = 1^\omega$; it is easy to see that $\top$ is the maximal element under $\le$, and in the relational model, it relates all pairs of states.
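The relational model described above can be made concrete on a finite state space. The following is a minimal sketch under stated assumptions: relations are represented as frozensets of state pairs, the four-element state space `STATES` and the sample relation `x` are arbitrary choices, and `omega` computes the states with an infinite outgoing path as a greatest fixpoint. None of these names come from the paper.

```python
from itertools import product

STATES = range(4)  # hypothetical four-element state space

ZERO = frozenset()                               # empty relation
ONE = frozenset((s, s) for s in STATES)          # identity relation
TOP = frozenset(product(STATES, STATES))         # relates all pairs of states

def compose(x, y):
    """Relational composition x . y."""
    return frozenset((a, c) for (a, b) in x for (b2, c) in y if b == b2)

def star(x):
    """Reflexive-transitive closure, computed as a least fixpoint."""
    result = ONE
    while True:
        bigger = result | compose(result, x)
        if bigger == result:
            return result
        result = bigger

def omega(x):
    """Relate s to every state iff an infinite x-path starts at s."""
    live = set(STATES)
    while True:
        # Keep the states that still have a successor inside `live`.
        keep = {a for (a, b) in x if a in live and b in live}
        if keep == live:
            break
        live = keep
    return frozenset((a, b) for a in live for b in STATES)

# Sample relation: a cycle between 0 and 1, plus a dead-end step from 2 to 3.
x = frozenset({(0, 1), (1, 0), (2, 3)})
```

With these definitions, `omega(ONE)` equals `TOP`, the axiom $x^\omega = x\,x^\omega$ can be checked as `compose(x, omega(x)) == omega(x)`, and a well-founded relation such as `{(2, 3)}` has an empty omega iterate.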

In addition to equational identities of regular languages, we will use the following two standard theorems (proofs of these theorems and more sophisticated theorems of this type appear in [Coh00]):

$$
\begin{align}
x\,y \le y\,z &\implies x^* y \le y\,z^* \tag{1} \\
y\,x \le x\,y &\implies (x + y)^* \le x^* y^* \tag{2}
\end{align}
$$
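Theorems (1) and (2) can be sanity-checked in the relational model by brute force. The sketch below is an illustration only, not a proof: it assumes a two-element state space so that all 16 binary relations can be enumerated, and it tests both implications on every combination of relations. The helper names are hypothetical.

```python
from itertools import combinations, product

STATES = (0, 1)  # hypothetical two-element state space
PAIRS = list(product(STATES, STATES))
ONE = frozenset((s, s) for s in STATES)

def compose(x, y):
    """Relational composition x . y."""
    return frozenset((a, c) for (a, b) in x for (b2, c) in y if b == b2)

def star(x):
    """Reflexive-transitive closure."""
    result = ONE
    while True:
        bigger = result | compose(result, x)
        if bigger == result:
            return result
        result = bigger

# Every binary relation over STATES (2^4 = 16 of them).
RELS = [frozenset(subset)
        for r in range(len(PAIRS) + 1)
        for subset in combinations(PAIRS, r)]

def theorem_1_holds():
    """x y <= y z  implies  x* y <= y z*, for all relations x, y, z."""
    return all(compose(star(x), y) <= compose(y, star(z))
               for x, y, z in product(RELS, repeat=3)
               if compose(x, y) <= compose(y, z))

def theorem_2_holds():
    """y x <= x y  implies  (x + y)* <= x* y*, for all relations x, y."""
    return all(star(x | y) <= compose(star(x), star(y))
               for x, y in product(RELS, repeat=2)
               if compose(y, x) <= compose(x, y))
```

Both functions return `True` on this state space, which is consistent with the theorems but of course covers only one small instance of the relational model.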

3 A Reduction Theorem

We consider systems composed of a fixed, finite set of concurrent processes (each perhaps internally concurrent and nondeterministic). Variables $i$ and $j$ range over process indices. Each process $i$ has a visible action $v_i$ and an invisible action $u_i$⁵, where the invisible action is constrained to neither receive information from other processes nor to send information to other processes so as to create a race condition in the recipient. This constraint is guaranteed only so long as some global synchronization policy is followed. For example, in a system where processes are synchronized using locks, either visible or invisible actions of process $i$ might modify variables that are either local to process $i$ or protected by locks held by process $i$, or send asynchronous messages to other processes; but only visible actions can acquire locks or wait for a condition to hold. Note that violation of the synchronization discipline (e.g., an action accessing a shared variable without first obtaining an appropriate lock) might cause a race condition between an invisible action and the actions of another process, violating the constraint on invisible actions.

⁴ magic is the program that has no possible executions (and so satisfies every possible specification). Of course, it cannot be implemented.

⁵ Note that $u_i$ and $v_i$ can be sums of nondeterministic actions that correspond to individual transitions of process $i$.

To avoid introducing temporal operators, we introduce a Boolean history variable $q$ that records whether the synchronization discipline has been violated at some point in the execution. Predicate $p_i$ means that process $i$ cannot perform an invisible action, i.e., that $u_i$ is disabled. Let $p$ be the conjunction of the $p_i$’s, i.e., $p = (\cdot i : p_i)$. A state satisfying $p$ is called visible; thus, in a visible state, all invisible transitions are disabled.

We now define several actions, formalized in the definitions (3)–(9) below. An $M_i$ action consists of a visible action of process $i$ followed by a sequence of invisible actions of process $i$. An $N_i$ action is an $M_i$ action that is “maximal” (i.e., further $u_i$ actions are disabled) and that finishes in a state where the synchronization discipline has not been violated. $N_i$ is effectively the transition relation of thread $i$ in the reduced system. (Additional conditions will imply that executing an $N$ action in a visible state results in a visible state; thus, in the reduced system, context switches occur only in visible states.) A $u$ (respectively $v$, $M$, $N$) action is a $u_i$ (respectively, $v_i$, $M_i$, $N_i$) action of some process $i$. Finally, an $R$ action is executable iff (i) the discipline has been violated, or (ii) such a violation is possible after execution of a single $M$ action. (Like $x^\omega$, $R$ relates each initial state to either all final states or none.)

$$
\begin{align}
M_i &= v_i\,u_i^* \tag{3} \\
N_i &= M_i\,p_i\,\bar q \tag{4} \\
u &= (+i : u_i) \tag{5} \\
v &= (+i : v_i) \tag{6} \\
M &= (+i : M_i) \tag{7} \\
N &= (+i : N_i) \tag{8} \\
R &= (1 + M)\,q\,\top \tag{9}
\end{align}
$$

Our reduction theorem says that if the original system can reach a violation of the synchronization discipline starting from some visible state, then the reduced system can also reach a violation starting from the same initial state, except that the violation might occur partway through the last transition of the reduced system (i.e., the last transition might be an $M$ action rather than an $N$ action). The transition relations of the original and reduced systems are $u + v$ and $N$, respectively. Thus, the conclusion of the reduction theorem, (19), is

$$
p\,(u + v)^*\,q \le N^* R \tag{10}
$$

The hypotheses of our reduction theorem are as follows, formalized in formulas (11)–(18) below. It is impossible to execute invisible actions of a single process forever without violating the discipline (11). An action cannot enable or disable an invisible action of another process (12),(13), and in the absence of a discipline violation, it commutes to the right of such an action (14),(15). Visible and invisible actions of a process cannot be simultaneously enabled (16). $u_i$ is enabled whenever $p_i$ is false (17). Invisible actions cannot hide violations of the discipline (18).

$$
\begin{align}
& (u_i\,\bar q)^\omega = 0 \tag{11} \\
i \ne j \implies{} & u_j\,p_i = p_i\,u_j \tag{12} \\
i \ne j \implies{} & v_j\,p_i = p_i\,v_j \tag{13} \\
i \ne j \implies{} & u_j\,u_i \le u_i\,(q\,\top + u_j + u_j\,q\,\top) \tag{14} \\
i \ne j \implies{} & v_j\,u_i \le u_i\,(q\,\top + v_j + v_j\,q\,\top) \tag{15} \\
& \bar p_i\,v_i = 0 = p_i\,u_i \tag{16} \\
& 1 \le p_i + u_i\,\top \tag{17} \\
& q\,u_i \le u_i\,q \tag{18}
\end{align}
$$

Our reduction theorem can be used to check not only the synchronization discipline, but also the invariance of any other predicate $I$ such that violations of $I$ cannot be hidden by invisible actions. To see this, note that, except for (18), the conditions above are all monotonic in $q$. Thus, if all the conditions above (including (18)) are satisfied for a predicate $q$, and there is a predicate $I$ such that $\bar I\,u_i \le u_i\,\bar I$ for each $i$, then all the conditions are still satisfied if $q$ is replaced with $q + \bar I$.
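The definitions (3)–(9), the hypotheses, and the conclusion (10) can be exercised on a small relational instance. The sketch below is an illustration under stated assumptions, not the paper's case study: two threads each acquire a lock (visible), access a protected variable (invisible), and release the lock (visible); a thread listed in `buggy` skips the acquire. The state layout, the `step` function, and the `check` helper are all hypothetical. Hypothesis (11) is not checked because each $u_i$ strictly advances a program counter in this model, so it cannot repeat forever.

```python
from itertools import product

# State: (pc0, pc1, lock, q); lock is None or the index of the holder.
STATES = list(product(range(4), range(4), (None, 0, 1), (False, True)))

def step(s, i, buggy):
    """Yield (kind, successor) for thread i, where kind is 'u' or 'v'."""
    pcs, lock, q = [s[0], s[1]], s[2], s[3]
    if pcs[i] == 0:                      # visible: acquire, or skip if buggy
        if i not in buggy:
            if lock is not None:
                return                   # blocked on the lock
            lock = i
        kind = 'v'
    elif pcs[i] == 1:                    # invisible: access x, record violation
        q, kind = q or lock != i, 'u'
    elif pcs[i] == 2:                    # visible: release only our own lock
        lock, kind = (None if lock == i else lock), 'v'
    else:
        return                           # thread has terminated
    pcs[i] += 1
    yield kind, (pcs[0], pcs[1], lock, q)

def compose(x, y):
    succ = {}
    for a, b in y:
        succ.setdefault(a, []).append(b)
    return frozenset((a, c) for a, b in x for c in succ.get(b, ()))

def pred(f):
    """A predicate is the identity relation restricted to states satisfying f."""
    return frozenset((s, s) for s in STATES if f(s))

ONE = pred(lambda s: True)
TOP = frozenset(product(STATES, STATES))
Q = pred(lambda s: s[3])
Q_TOP = compose(Q, TOP)

def star(x):
    result = ONE
    while True:
        bigger = result | compose(result, x)
        if bigger == result:
            return result
        result = bigger

def check(buggy):
    """Return (hypotheses_ok, conclusion_ok, original_finds, reduced_finds)."""
    u, v, p, M, N = {}, {}, {}, {}, {}
    for i in (0, 1):
        moves = [(k, s, t) for s in STATES for k, t in step(s, i, buggy)]
        u[i] = frozenset((s, t) for k, s, t in moves if k == 'u')
        v[i] = frozenset((s, t) for k, s, t in moves if k == 'v')
        p[i] = pred(lambda s, i=i: s[i] != 1)            # u_i is disabled
        M[i] = compose(v[i], star(u[i]))                 # (3)
        N[i] = compose(compose(M[i], p[i]), ONE - Q)     # (4)

    def rhs_14_15(x, i):                                 # u_i (q T + x + x q T)
        return compose(u[i], Q_TOP | x | compose(x, Q_TOP))

    hyp = all(compose(u[j], p[i]) == compose(p[i], u[j])        # (12)
              and compose(v[j], p[i]) == compose(p[i], v[j])    # (13)
              and compose(u[j], u[i]) <= rhs_14_15(u[j], i)     # (14)
              and compose(v[j], u[i]) <= rhs_14_15(v[j], i)     # (15)
              for i, j in ((0, 1), (1, 0)))
    hyp = hyp and all(not compose(ONE - p[i], v[i])             # (16)
                      and not compose(p[i], u[i])               # (16)
                      and ONE <= p[i] | compose(u[i], TOP)      # (17)
                      and compose(Q, u[i]) <= compose(u[i], Q)  # (18)
                      for i in (0, 1))

    R = compose(ONE | M[0] | M[1], Q_TOP)                       # (9)
    lhs = compose(compose(p[0] & p[1], star(u[0] | u[1] | v[0] | v[1])), Q)
    rhs = compose(star(N[0] | N[1]), R)
    s0 = (0, 0, None, False)
    return (hyp, lhs <= rhs,
            any(a == s0 for a, _ in lhs), any(a == s0 for a, _ in rhs))
```

Calling `check(set())` models two well-behaved threads and `check({1})` models a second thread that never acquires the lock. In both cases the checked hypotheses and the inclusion (10) hold, and the reduced system reports a violation from the initial state exactly when the original system can reach one.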

The proof below can be viewed as formalizing the following construction, which starts from an execution that violates the discipline and produces an execution of the reduced system that also violates the discipline. First, we try to move invisible $u_i$ actions to the left of $u_j$ and $v_j$ actions, where $i \ne j$, starting from the left (i.e., from the leftmost $u_i$ action that immediately follows a $u_j$ or $v_j$ action). The $u_i$ action cannot make it all the way to the beginning of the execution (since $p\,u_i = 0$), so it must eventually run into either another $u_i$ or a $v_i$. Repeating this produces an execution in which a sequence of $M$ actions leads to a violation of the discipline.

Next, we try to turn all but the last of these $M$ actions into $N$ actions, starting from the next to last $M$ action. In general, we will have done this for some number of $M$ actions, so we will have an execution that ends with $N^* R$. Now try to convert the last $M_i$ before the $N^* R$ suffix into an $N$ action. Suppose this $M_i$ action ends with $u_i$ enabled. $u_i$ must then also be enabled later when the discipline is first violated (because (12) and (13) imply $N_j$ does not affect enabledness of $u_i$, and (16) implies $N_i$ is disabled when $u_i$ is enabled), so we add a $u_i$ action just after the violation and try to push it backward (through the $N^* (1 + M)$). This may create additional violations of the discipline, but there will always be an $N^* R$ to the right of the new $u_i$. Eventually, $u_i$ makes it back to the $M_i$, extending $M_i$ with another $u_i$. By (11), $u_i$’s cannot continue forever without violating the discipline, so repeating this extension process eventually either gives us a violation right after $M_i$ (in which case we have produced a new $N^* R$ action, so we can discard everything after it) or leads to the $u_i$’s being disabled, in which case we have successfully turned the $M_i$ action into an $N$ action and again turned the extended execution into an execution that ends with $N^* R$. Repeating this for each $M_i$ action, moving from right to left, produces the desired execution of the reduced system.

We now turn to the formal proof of the reduction theorem (19). We push $u$’s left (lines 1-2) where they are eliminated by the initial $p$ (line 3), push $M$’s to the left of $R$’s (line 4), condense the $R$’s to a single $R$ (lines 5-6), and finally turn the $M$’s into $N$’s (lines 7-8):

$$
p\,(u + v)^*\,q \le N^* R \tag{19}
$$

$$
\begin{array}{cll}
 & p\,(u + v)^*\,q & \\
\le & p\,(u + M + R)^*\,q & \{\,v \le M \le M + R\,\} \\
\le & p\,u^*\,(M + R)^*\,q & \{\,(M + R)\,u \le (1 + u)\,(M + R)\ (20);\ (2)\,\} \\
\le & (M + R)^*\,q & \{\,p\,u = 0\ (16);\ p \le 1\,\} \\
\le & M^* R^*\,q & \{\,R\,M \le R;\ (2)\,\} \\
\le & M^* (1 + R)\,q & \{\,R\,R \le R,\ \text{so}\ R^* = (1 + R)\,\} \\
\le & M^* R & \{\,(1 + R)\,q = R\,\} \\
\le & M^* N^* R & \{\,1 \le N^*\,\} \\
= & N^* R & \{\,M\,N^* R \le N^* R\ (21);\ (*\ \text{ind})\,\}
\end{array}
$$

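(20) says that a $u$ moves to the left of an $M$ or $R$ (but may disappear in the process):

$$
(M + R)\,u \le (1 + u)\,(M + R) \tag{20}
$$

$$
\begin{array}{cll}
 & (M + R)\,u & \\
\le & M\,u + R\,u & \{\,\text{(dist)}\,\} \\
\le & M\,u + R & \{\,R = R\,\top,\ \text{so}\ R\,u = R\,\top\,u \le R\,\top = R\,\} \\
= & (+i,j : M_j\,u_i) + R & \{\,(7),(5),\text{(dist)}\,\} \\
\le & (+i,j : (1 + u_i)\,(M_j + R)) + R & \{\,M_j\,u_i \le (1 + u_i)\,(M_j + R)\ (25)\,\} \\
\le & (1 + u)\,(M + R) & \{\,(7),(5),\text{(dist)}\,\}
\end{array}
$$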

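(21) shows that $N^* R$ actions act as a factory for $u_i$ actions until they either produce a discipline violation ($q$) or until they produce enough $u_i$’s to turn the $M_i$ to their left into an $N$.

$$
M_i\,N^* R \le N^* R \tag{21}
$$

$$
\begin{array}{cll}
 & M_i\,N^* R & \\
\le & M_i\,(u_i\,\bar q) \star (p_i + u_i\,q)\,N^* R & \{\,N^* R \le (u_i\,\bar q)\,N^* R + (p_i + u_i\,q)\,N^* R\ (22);\ (\star\ \text{ind})\,\} \\
\le & M_i\,(u_i\,\bar q)^*\,(p_i + u_i\,q)\,N^* R & \{\,(u_i\,\bar q)^\omega = 0\ (11)\,\} \\
\le & M_i\,(p_i + u_i\,q)\,N^* R & \{\,\bar q \le 1;\ M_i\,u_i^* = M_i\ (3)\,\} \\
= & (M_i\,p_i + M_i\,u_i\,q)\,N^* R & \{\,\text{(dist)}\,\} \\
\le & (M_i\,p_i\,\bar q + M_i\,q)\,N^* R & \{\,M_i\,u_i \le M_i\ (3);\ M_i\,p_i \le M_i\,p_i\,\bar q + M_i\,q\,\} \\
\le & (N + M_i\,q)\,N^* R & \{\,M_i\,p_i\,\bar q = N_i\ (4);\ N_i \le N\ (8)\,\} \\
\le & N\,N^* R + M_i\,q\,N^* R & \{\,\text{(dist)}\,\} \\
\le & N\,N^* R + R & \{\,M_i\,q\,N^* R \le M_i\,q\,\top \le R\ (9)\,\} \\
\le & N^* R & \{\,N\,N^* \le N^*,\ 1 \le N^*\,\}
\end{array}
$$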

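(22) and (23) show that $N^* R$ generates a $u_i$ (unless $p_i$ already holds, or the discipline has already been violated).

$$
N^* R \le (u_i\,\bar q)\,N^* R + (p_i + u_i\,q)\,N^* R \tag{22}
$$

$$
\begin{array}{cll}
 & N^* R & \\
\le & (p_i + u_i)\,N^* R & \{\,N^* R \le (p_i + u_i)\,N^* R\ (23)\,\} \\
\le & (p_i + u_i\,\bar q + u_i\,q)\,N^* R & \{\,u_i = u_i\,\bar q + u_i\,q\,\} \\
\le & (u_i\,\bar q)\,N^* R + (p_i + u_i\,q)\,N^* R & \{\,\text{(dist)}\,\}
\end{array}
$$

$$
N^* R \le (p_i + u_i)\,N^* R \tag{23}
$$

$$
\begin{array}{cll}
 & N^* R & \\
= & N^* (1 + M)\,q\,\top & \{\,(9)\,\} \\
\le & N^* (1 + M)\,q\,(p_i + u_i)\,\top & \{\,1 \le p_i + u_i\,\top\ (17)\,\} \\
\le & N^* (1 + M)\,(p_i + u_i)\,q\,\top & \{\,q\,p_i = p_i\,q;\ q\,u_i \le u_i\,q\ (18)\,\} \\
\le & N^* (p_i + u_i)\,(1 + M + R)\,q\,\top & \{\,M\,(p_i + u_i) \le (p_i + u_i)\,(M + R)\ (25)\,\} \\
\le & N^* (p_i + u_i)\,R & \{\,(1 + M + R)\,q\,\top \le R\ (9)\,\} \\
\le & (p_i + u_i)\,(N + R)^* R & \{\,N\,(p_i + u_i) \le (p_i + u_i)\,(N + R)\ (24);\ (1)\,\} \\
\le & (p_i + u_i)\,N^* R^* R & \{\,R\,N \le R;\ (2)\,\} \\
= & (p_i + u_i)\,N^* R & \{\,R\,R \le R;\ (*\ \text{ind})\,\}
\end{array}
$$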

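Finally, (24) and (25) show that $(p_i + u_i)$ commutes left past an $N$ or $M$ (possibly changing them into $R$’s).

$$
N_j\,(p_i + u_i) \le (p_i + u_i)\,(N + R) \tag{24}
$$

$$
\begin{array}{cll}
 & N_j\,(p_i + u_i) & \\
\le & M_j\,p_j\,(p_i + u_i) & \{\,N_j \le M_j\,p_j\ (4)\,\} \\
\le & M_j\,(p_i + u_i)\,p_j & \{\,p_j\,p_i = p_i\,p_j;\ p_j\,u_i \le u_i\,p_j\ (12)(16)\,\} \\
\le & (p_i + u_i)\,(M_j + R)\,p_j & \{\,M_j\,(p_i + u_i) \le (p_i + u_i)\,(M_j + R)\ (25)\,\} \\
\le & (p_i + u_i)\,(M_j + R)\,p_j\,(q + \bar q) & \{\,1 = q + \bar q\,\} \\
\le & (p_i + u_i)\,(N + R) & \{\,M_j\,p_j\,\bar q \le N\ (4)(8);\ M_j\,q \le R\ (9)\,\}
\end{array}
$$

$$
M_j\,(p_i + u_i) \le (p_i + u_i)\,(M_j + R) \tag{25}
$$

Case $i = j$:

$$
\begin{array}{cll}
 & M_i\,(p_i + u_i) & \\
\le & M_i & \{\,p_i \le 1;\ M_i\,u_i \le M_i\ (3)\,\} \\
\le & (p_i + u_i)\,(M_i + R) & \{\,v_i = p_i\,v_i\ (16),\ \text{so}\ M_i = p_i\,M_i\ (3)\,\}
\end{array}
$$

Case $i \ne j$: Let $[x] = x + x\,q\,\top + q\,\top$; then

$$
\begin{array}{cll}
 & M_j\,(p_i + u_i) & \\
= & v_j\,u_j^*\,(p_i + u_i) & \{\,(3)\,\} \\
\le & v_j\,(p_i + u_i)\,[u_j]^* & \{\,u_j\,(p_i + u_i) \le (p_i + u_i)\,[u_j]\ (12)(14);\ (1)\,\} \\
= & v_j\,(p_i + u_i)\,[u_j^*] & \{\,[u_j]^* = [u_j^*]\ (2)\,\} \\
\le & (p_i + u_i)\,[v_j]\,[u_j^*] & \{\,v_j\,(p_i + u_i) \le (p_i + u_i)\,[v_j]\ (13)(15)\,\} \\
\le & (p_i + u_i)\,(M_j + R) & \{\,[v_j]\,[u_j]^* \le M_j + R\,\}
\end{array}
$$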

4 System Model and Synchronization Discipline

We define a simple model of concurrent systems that use mutual exclusion for access to selected variables, and we prove that our reduction theorem applies to these systems. This model is intended to be the simplest one that retains all relevant aspects of concurrent programming languages, such as Java. It can be modified and generalized in various ways with little effect on our results.

Each shared variable is classified as protected or unprotected. There are no constraints on how unprotected variables are accessed. The synchronization discipline requires that mutual exclusion be used for access to protected variables. Any combination of synchronization mechanisms (locks, condition variables, semaphores, barriers, etc.) can be used to provide the mutual exclusion, provided the scheme can be captured by exclusive access predicates. For each protected variable $x$ and each thread $i$, there is an exclusive access predicate $e^x_i$. The synchronization discipline requires that $e^x_i$ hold in states from which thread $i$ can execute a transition that accesses $x$. Mutual exclusion is expressed by the requirement that, for every variable $x$ and every two distinct threads $i$ and $j$, $e^x_i$ and $e^x_j$ are mutually exclusive (i.e., cannot hold simultaneously).

Formally, a system is a tuple $\langle \Theta, V_{unsh}, V_{prot}, V_{unprot}, T, I, e \rangle$ where

$\Theta$ is a set of threads (thread identifiers). $i$ and $j$ range over $\Theta$.

$V_{unsh}$ is a set of unshared variables, i.e., variables that appear in transitions of at most one thread.

$V_{prot}$ is a set of variables declared (possibly incorrectly) to be “protected”, i.e., there are synchronization mechanisms that ensure mutual exclusion for accesses to these variables. For each variable $x \in V_{prot}$ and each thread $i$, there is an exclusive access predicate $e^x_i$.

$V_{unprot}$ is a set of (possibly shared) variables, called “unprotected variables”. No assumptions are made regarding synchronization for accesses to them.

$T = \bigcup_i T_i$ is a set of transitions, where $T_i$ is the set of transitions of thread $i$. Let $V = V_{unsh} \cup V_{prot} \cup V_{unprot}$ and $V_{guard} = V_{unsh} \cup V_{unprot}$. A transition $t$ is a guarded command $g \to c$, where the guard $g$ is a predicate over $V_{guard}$, and $c$ is built from assignments over $V$, sequential composition, and conditionals (if-then and if-then-else).

$I$ is a predicate over $V$. $I$ characterizes the initial states.

$e$ is a family of (possibly incorrect) exclusive access predicates $e^x_i$ over $V$.

Guards are used for synchronization (blocking). Conditionals in commands are used for sequential control flow. For convenience of analysis, protected variables cannot appear in guards. This is reasonable because the synchronization mechanisms that protect the variables, not the protected variables themselves, should be used to achieve the necessary synchronization. The value of a protected variable $v$ can be copied into an unshared or unprotected variable, and the latter variable can be used in a guard, or $v$ can be moved from $V_{prot}$ to $V_{unprot}$ and then used in a guard directly.

Fix a system. A state is a mapping from variables to values. Let $\Sigma$ be the set of states. We also use states as maps from expressions to values, with the usual meaning (homomorphic extension).

A transition $t$ is enabled in state $s$ if its guard is true in $s$. An execution is a finite or infinite sequence $\sigma$ of states such that $\sigma(0)$ satisfies $I$ and every pair of consecutive states in $\sigma$ is in $[\![t]\!]$ for some transition $t$.

A transition is visible if it (i) contains an occurrence of a variable in $V_{unprot}$ or (ii) might change the value of an exclusive access predicate. Other transitions are invisible. This classification of transitions determines the transition relations $u_i$ and $v_i$ and the predicates $p_i$.
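The visible/invisible classification can be sketched as a syntactic check on guarded commands. The following is a minimal illustration under stated assumptions: the `Transition` record, its `variables` and `writes` fields, and the `is_visible` function are hypothetical names, and condition (ii) is approximated conservatively by "the command may assign a variable on which some exclusive access predicate depends".

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet

State = Dict[str, object]

@dataclass(frozen=True)
class Transition:
    thread: int
    guard: Callable[[State], bool]      # predicate over V_guard
    command: Callable[[State], State]   # effect of the command c
    variables: FrozenSet[str]           # variables occurring in guard or command
    writes: FrozenSet[str]              # variables the command may assign

def is_visible(t, v_unprot, excl_vars):
    """Visible iff t mentions an unprotected variable (i) or may change an
    exclusive access predicate (ii, approximated via written variables)."""
    return bool(t.variables & v_unprot) or bool(t.writes & excl_vars)

# Hypothetical instance: x is protected by `lock`, and e^x_i is (lock == i).
V_UNPROT = frozenset({"lock"})
EXCL_VARS = frozenset({"lock"})   # variables that exclusive access predicates read

acquire = Transition(
    thread=0,
    guard=lambda s: s["lock"] is None,
    command=lambda s: {**s, "lock": 0},
    variables=frozenset({"lock"}),
    writes=frozenset({"lock"}),
)
increment = Transition(
    thread=0,
    guard=lambda s: True,
    command=lambda s: {**s, "x": s["x"] + 1},
    variables=frozenset({"x"}),
    writes=frozenset({"x"}),
)
```

Here `acquire` is classified as visible and `increment` as invisible, so in the reduced system a context switch could occur before `acquire` but not between `acquire` and a following `increment`.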

A system is well-formed if the following conditions hold.

WF-initVis. The initial transitions of each thread are visible, i.e., $I \Rightarrow p$. (This ensures that the conclusion of the reduction theorem applies to all reachable states of the original system.)

WF-sep. Visible and invisible transitions of each thread are separate, i.e., cannot be executed from the same state. Formally, $(\forall i : domain(u_i) \cap domain(v_i) = \emptyset)$.

WF-acc. Internal non-determinism in a transition (i.e., non-deterministic choices that do not affect the ending state) does not affect the set of variables accessed by the transition or the order in which those variables are first accessed. (This ensures well-definedness of $acc$ in Section 4.1 and of $x$ in case 2 of the proof of (15) in Section 5.)

WF-finiteInvis. No thread has an infinite execution sequence containing only invisible transitions. Formally, $(\forall i : u_i^\omega = 0)$.

WF-initExcl. For each protected variable $x$, the exclusive access predicates for $x$ are initially disjoint, i.e., $I \Rightarrow disjoint(e^x)$, where $disjoint(e^x) = \neg(\exists i,j : i \ne j \wedge e^x_i \wedge e^x_j)$.

WF-endExcl. A thread cannot take away another thread’s exclusive access to a variable. Formally, for an exclusive access predicate $e^x_i$ and $j \ne i$, transitions of thread $j$ cannot falsify $e^x_i$.

4.1 Mutual-Exclusion Synchronization Discipline

The synchronization discipline requires that, for every variable $x \in V_{prot}$, (i) a transition of thread $i$ executed from a state $s$ may access $x$ only if $s \models e^x_i$, and (ii) $disjoint(e^x)$ holds in every reachable state.

Let $acc(s_1, t, s_2)$ denote the set of variables accessed by execution of transition $t$ from state $s_1$ to $s_2$. The set of accessed variables may depend on which branches of conditionals are taken. The ending state $s_2$ is included as an argument to $acc$ because $t$ may be non-deterministic. WF-acc ensures that $acc$ is well-defined. Since guards do not contain protected variables, $acc(s_1, t, s_2) = \emptyset$ if $t$ is disabled in $s_1$ (otherwise, $acc(s_1, t, s_2)$ would be the set of protected variables in $t$’s guard).

We augment the system with a predicate $q$ that holds iff the synchronization discipline has been violated. Formally, $q$ is the least predicate that satisfies

$$
\begin{array}{l}
\forall i : \forall x \in V_{prot} : \forall t \in T_i : \forall \langle s_1, s_2 \rangle \in [\![t]\!] : s_2 \models q \iff \\
\quad ((x \in acc(s_1, t, s_2) \wedge s_1 \not\models e^x_i) \vee s_2 \not\models disjoint(e^x) \vee s_1 \models q).
\end{array}
\tag{26}
$$

The third disjunct in (26) implies that $q$ is monotonic, i.e., it can be truthified but not falsified.
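The three disjuncts of (26) translate directly into an update rule for the history variable. The sketch below is an illustration under stated assumptions: `excl[x][i]` stands for the exclusive access predicate $e^x_i$ as a Python function on states, `accessed` plays the role of $acc(s_1, t, s_2)$ and is supplied by the caller, and the names `disjoint` and `next_q` are hypothetical.

```python
def disjoint(x, state, excl):
    """True iff at most one thread's exclusive access predicate for x holds."""
    return sum(1 for e in excl[x].values() if e(state)) <= 1

def next_q(q_before, thread, accessed, s1, s2, excl):
    """Value of q after `thread` moves from s1 to s2 accessing `accessed`."""
    # First disjunct: a protected variable is accessed without exclusive access.
    bad_access = any(not excl[x][thread](s1) for x in accessed if x in excl)
    # Second disjunct: exclusive access predicates overlap in the new state.
    overlap = any(not disjoint(x, s2, excl) for x in excl)
    # Third disjunct: q is monotonic, so an earlier violation is remembered.
    return bad_access or overlap or q_before

# Hypothetical instance: x is protected by a lock; e^x_i holds iff i holds it.
excl = {"x": {0: lambda s: s["lock"] == 0, 1: lambda s: s["lock"] == 1}}
held_by_0 = {"lock": 0}
```

For example, `next_q(False, 0, {"x"}, held_by_0, held_by_0, excl)` stays `False` because thread 0 holds the lock, whereas the same access by thread 1 yields `True`.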

Maintaining $q$ involves accesses to $q$ and accesses to variables that occur in exclusive access predicates. These accesses are ignored when determining $acc(s_1, t, s_2)$.

5 Proof that the Reduction Theorem Applies to the Mutual-Exclusion Synchronization Discipline

We prove in [SC02] that well-formed systems satisfy the hypotheses (11)–(18) of the reduction theorem. Most of the proofs are straightforward. Here we consider only the most interesting one.

Proof of (15). Let $t_i$ be an invisible transition of thread $i$, let $t_j$ be a visible transition of thread $j$, and let $s_1$, $s_2$, and $s_3$ be states such that $\langle s_1,s_2\rangle \in t_j$ and $\langle s_2,s_3\rangle \in t_i$. Let $t_i = g_i \rightarrow c_i$ and $t_j = g_j \rightarrow c_j$. $t_j$ does not enable $t_i$, because $c_j$ and $g_i$ access disjoint sets of variables (because $t_i$ is invisible and hence does not access unprotected variables, and protected variables do not appear in guards). $t_i$ does not disable $t_j$, for analogous reasons. Thus, there exist states $s'_2$ and $s'_3$ such that $\langle s_1,s'_2\rangle \in t_i$ and $\langle s'_2,s'_3\rangle \in t_j$. Transitions may be non-deterministic, so $s'_2$ and $s'_3$ are not uniquely determined by these conditions. It suffices to show that $s'_2$ and $s'_3$ can be chosen so that one of the following conditions (which correspond to the summands in (15)) holds: (i) $s'_2 \models q$, (ii) $s'_3 = s_3$ (i.e., $c_i$ left-commutes with $c_j$), or (iii) $s'_3 \models q$. Let $A = acc(s_1,t_j,s_2) \cap acc(s_1,t_i,s'_2)$.

case 1: $A = \emptyset$. This implies that

$$acc(s_1,t_j,s_2) = acc(s'_2,t_j,s'_3) \;\wedge\; acc(s_2,t_i,s_3) = acc(s_1,t_i,s'_2), \tag{27}$$

because the same branches of conditionals will be executed from either source state. This and $A = \emptyset$ imply that $(\forall x \in acc(s_2,t_i,s_3) : s_1(x) = s_2(x))$ and $(\forall x \in acc(s_1,t_j,s_2) : s_1(x) = s'_2(x))$. Thus, by resolving non-determinism (if any) in the transitions in the same way when executing $t_i$ followed by $t_j$ as when executing $t_j$ followed by $t_i$ to reach $s_3$, we obtain $s'_3(v) = s_3(v)$ for all variables $v \in V \setminus \{q\}$. We must exclude $q$ here because $acc$ does not reflect accesses used to update $q$, as stated in Section 4.1.

case 1.1: $s_3 \models \bar{q}$. If $s'_3 \models \bar{q}$, then $s'_3 = s_3$, i.e., condition (ii) holds. If $s'_3 \models q$, then condition (iii) holds.

case 1.2: $s_3 \models q$. We show that $s'_2 \models q$ or $s'_3 \models q$.

case 1.2.1: $s_1 \models q$. This and monotonicity of $q$ imply $s'_3 \models q$.

case 1.2.2: $s_1 \models \bar{q}$. This and $s_3 \models q$ imply that the synchronization discipline is violated either by execution of $t_j$ from $s_1$ or by execution of $t_i$ from $s_2$. The violation corresponds to the first or second disjunct in (26) being true (the third disjunct just makes $q$ monotonic). Thus, there are 2 × 2 cases to consider.

case 1.2.2.1: $(\exists x \in V_{prot} : x \in acc(s_1,t_j,s_2) \wedge s_1 \not\models e^x_j)$. (27) implies $x \in acc(s'_2,t_j,s'_3)$. $t_i$ is invisible, so it cannot truthify $e^x_j$, so $s'_2 \not\models e^x_j$. Thus, the definition of $q$ implies $s'_3 \models q$.

case 1.2.2.2: $(\exists x \in V_{prot} : x \in acc(s_2,t_i,s_3) \wedge s_2 \not\models e^x_i)$. (27) implies $x \in acc(s_1,t_i,s'_2)$. WF-endExcl implies $t_j$ did not falsify $e^x_i$, so $s_1 \not\models e^x_i$. Thus, the definition of $q$ implies $s'_2 \models q$.

case 1.2.2.3: $(\exists x \in V_{prot} : s_2 \not\models disjoint(e^x))$. $t_i$ is invisible, so it cannot falsify any exclusive access predicate, so $s_3 \not\models disjoint(e^x)$. $s_3$ and $s'_3$ have the same values for all variables except $q$, so $s'_3 \not\models disjoint(e^x)$. Thus, the definition of $q$ implies $s'_3 \models q$.

case 1.2.2.4: $(\exists x \in V_{prot} : s_3 \not\models disjoint(e^x))$. $s_3$ and $s'_3$ have the same values for all variables except $q$, so $s'_3 \not\models disjoint(e^x)$. Thus, the definition of $q$ implies $s'_3 \models q$.

case 2: $A \neq \emptyset$. Let $x$ be the variable in $A$ first accessed by execution of $t_j$ from $s_1$ to $s_2$.

case 2.1: $s_1 \models e^x_j$. By definition of $A$, $x \in acc(s_1,t_i,s'_2)$.

case 2.1.1: $s_1 \models disjoint(e^x)$. The hypotheses of cases 2.1 and 2.1.1, together with $i \neq j$, imply $s_1 \not\models e^x_i$. This and $x \in acc(s_1,t_i,s'_2)$ imply $s'_2 \models q$.

case 2.1.2: $s_1 \not\models disjoint(e^x)$. This and the definition of $q$ imply $s_1 \models q$. This and monotonicity of $q$ imply $s'_2 \models q$.

case 2.2: $s_1 \not\models e^x_j$. The definitions of $A$ and $x$ imply that $x \in acc(s'_2,t_j,s'_3)$, because the first access to $x$ by $t_j$ precedes execution of conditionals in $t_j$ whose conditions could be affected by execution of $t_i$ from $s_1$. $t_i$ is invisible, so it cannot truthify $e^x_j$, so $s'_2 \not\models e^x_j$. Thus, the definition of $q$ implies $s'_3 \models q$.

6 Examples

This section contains examples of systems for which the current reduction is effective (i.e., it reduces the number of reachable states) and the reduction in [Sto02] is not effective. In general, our method is effective whenever some variables can be classified as protected. These examples are based mainly on descriptions in [SBN+97] of code in real systems.

Semaphores. A user thread sends a request to a device driver thread, asking the device driver to store data in a buffer $b$, and then waits for the result by invoking sem.down(), where sem is a semaphore, initialized to zero. The device driver thread receives the request, waits for the device to supply the data, stores the data in $b$, and then calls sem.up(). The buffer $b$ can be classified as protected. For example, $e^b_{user}$ holds when the program counter of the user thread points to a statement after the call to sem.down(), and $e^b_{driver}$ holds when the program counter of the device driver thread points to a statement before the call to sem.up(). The semaphore ensures disjointness of $e^b_{user}$ and $e^b_{driver}$.

Memory Re-use. Some systems re-use objects (or structures) by placing them on a free list when they are not in use. These objects may be protected by different locks each time they are re-used, violating the locking discipline of [Sto02]. For example, consider a file system in which blocks in a file are protected by the lock associated with (the i-node of) that file, and blocks on the free list are protected by the lock associated with the free list. A block may be in a different file, and hence protected by a different lock, each time it is re-used. Let $m_F$ denote the lock associated with the free list. Let $m_f$ denote the lock associated with file $f$. The exclusive access predicate $e^b_i$ for a block $b$ might be

$$(onFreeList(b) \wedge m_F.owner = i) \;\vee\; (\exists \text{ file } f : allocatedTo(b,f) \wedge m_f.owner = i)$$
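The predicate above can be written as an executable check. The following Python sketch assumes a hypothetical state layout (a dictionary with the keys `free_list`, `free_list_lock_owner`, `files`, and `file_lock_owner`); only the logical shape of the predicate is taken from the text.

```python
def block_excl(b, i, state):
    """Exclusive access predicate e^b_i for block b and thread i (assumed state layout)."""
    # onFreeList(b) and m_F.owner = i
    on_free = b in state["free_list"] and state["free_list_lock_owner"] == i
    # exists file f : allocatedTo(b, f) and m_f.owner = i
    in_file = any(
        b in blocks and state["file_lock_owner"].get(f) == i
        for f, blocks in state["files"].items()
    )
    return on_free or in_file


# Block 7 first belongs to file "a" (locked by thread 1), then sits on the
# free list (locked by thread 2): the protecting lock changes with re-use.
allocated = {
    "free_list": set(), "free_list_lock_owner": None,
    "files": {"a": {7}}, "file_lock_owner": {"a": 1},
}
freed = {
    "free_list": {7}, "free_list_lock_owner": 2,
    "files": {"a": set()}, "file_lock_owner": {"a": 1},
}
```

The two sample states show why a fixed variable-to-lock mapping is too rigid here: thread 1 has exclusive access to block 7 in `allocated`, while thread 2 has it in `freed`.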

Master-Worker Paradigm. In the master-worker paradigm, a master thread assigns tasks to worker threads. Typically, each task is represented by an object created by the master thread and passed to a worker thread. The master thread does not access a task object after passing it to a worker. Task objects can be classified as protected. Suppose each worker thread $w$ has a field $w.task$ that refers to the worker's task. For a task object $x$, the exclusive access predicate $e^x_{master}$ holds before $x$ has been passed to a worker thread, and $e^x_w$ holds when $w.task = x$.

7 Comparison to Traditional Partial-Order Methods

This section demonstrates that our method has advantages over traditional partial-order methods even for some simple systems for which precise static analysis of transition dependencies is feasible. Consider a system with two threads that use monitors $m_0$ and $m_1$ as locks and use an integer variable $y$ to implement a barrier. Let uppercase letters denote control points. Let guard → stmt denote a transition that blocks when guard is false and can execute stmt when guard is true. For $i \in \{0,1\}$, the code for thread $i$ is

A $m_0$.acquire(); B $x_0 := i$; C $m_0$.release(); D $m_1$.acquire(); E $x_1 := i$; F $m_1$.release(); G $y$++; H $y = 2 \rightarrow$ skip; I $x_i := i$ J    (28)

In the initial state, $x_j = j$ and $y = 0$, and both threads are at control point A. $x_j$ is a protected variable, with exclusive access predicate $e^{x_j}_i = (m_j.owner = i) \vee (y = 2 \wedge i = j)$. $y$ is not protected.

This system has 106 reachable states. With the reduction in this paper, transitions that update $x_0$ or $x_1$ are invisible; other transitions are visible. The reachable states of the reduced system are the reachable states of the original system in which every thread is ready to perform a visible transition or is at its final control point. There are 62 such states.

Traditional partial-order methods based on persistent sets [God96] (or ample sets [CGP99]) can also significantly reduce the number of explored states but do not achieve the same benefits as our reduction. For concreteness, we compare our method to selective search using the conditional stubborn set algorithm (CSSA) [God96]. We always resolve non-determinism in CSSA in a way that yields a minimum-size persistent set. CSSA is parameterized by dependency relations on operations. For acquire and release, we use the might-be-the-first-to-interfere-with relation in [Sto02, Fig. 3]. For accesses to $y$, we use the minimal might-be-the-first-to-interfere-with relation, based on the dependency relation on operations in which an increment to $y$ is dependent with the condition $y = 2$ only in states in which the increment changes the truth value of the condition.

The selective search (using CSSA) explores 77 states. To illustrate why it explores more than the 62 states explored by our method, consider the reachable state $s$ in which thread 0 is at control point D and thread 1 is at control point B. With the reduction in this paper, the transitions that update $x_0$ or $x_1$ are invisible, so the system passes through this invisible state by executing the enabled transition of thread 1; the enabled transition of thread 0 is not executed in $s$. In contrast, the selective search explores both enabled transitions in $s$, as explained in detail in [SC02].
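The state counts quoted for this example can be reproduced with a small explicit-state search. The Python sketch below is an illustration under stated assumptions, not the tool used in the paper: a state is encoded as (control points, values of $x_0$ and $x_1$, monitor owners, $y$), statement I is read as the assignment $x_i := i$, and the reduced system's states are obtained by filtering as described in the text (every thread is at a control point other than B, E, or I) rather than by constructing the reduced transition relation. All identifiers are hypothetical.

```python
from collections import deque

POINTS = "ABCDEFGHIJ"
INVISIBLE_AT = {"B", "E", "I"}  # control points whose next transition updates x0 or x1


def step(state, i):
    """Successor state when thread i fires its next transition, or None if it cannot."""
    pcs, xs, owners, y = state
    p = pcs[i]
    pcs, xs, owners = list(pcs), list(xs), list(owners)
    if p == "J":                      # final control point
        return None
    if p in ("A", "D"):               # m0.acquire() / m1.acquire()
        m = 0 if p == "A" else 1
        if owners[m] is not None:
            return None               # blocked: monitor is held
        owners[m] = i
    elif p in ("B", "E"):             # x0 := i / x1 := i
        xs[0 if p == "B" else 1] = i
    elif p in ("C", "F"):             # m0.release() / m1.release()
        owners[0 if p == "C" else 1] = None
    elif p == "G":                    # y++
        y += 1
    elif p == "H":                    # y = 2 -> skip (barrier)
        if y != 2:
            return None
    else:                             # p == "I": x_i := i
        xs[i] = i
    pcs[i] = POINTS[POINTS.index(p) + 1]
    return (tuple(pcs), tuple(xs), tuple(owners), y)


def reachable():
    """Breadth-first exploration of the unreduced system."""
    init = (("A", "A"), (0, 1), (None, None), 0)
    seen, todo = {init}, deque([init])
    while todo:
        s = todo.popleft()
        for i in (0, 1):
            nxt = step(s, i)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen


states = reachable()
# States in which no thread is about to perform an invisible transition.
stable = {s for s in states if not (set(s[0]) & INVISIBLE_AT)}
print(len(states), len(stable))
```

Under these assumptions the search yields 106 states in total, of which 62 have every thread ready to perform a visible transition or at its final control point, matching the figures in the text. The 77 states explored by CSSA are not reproduced here, since that would require the dependency relations of [Sto02] and [God96].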

This example can be generalized to show our method outperforming the selective search by an arbitrary amount: simply insert additional transitions that access $x_0$ before the transition $m_0$.release() in thread 0.

The selective search exploits some independence that our method does not, in particular, independence of release with acquire and release, and independence of acquire with acquire in some states. One way to obtain the benefits of both methods is to apply selective search to the reduced system. This works for systems for which sufficiently precise static analysis of dependencies between transitions is feasible (cf. Section 1). Another approach is to extend our method, e.g., to incorporate the specialized treatment of monitor operations in [Sto02] that allows release to be classified as invisible.

8 How to Use the Reduction

The intended methodology for using the reduction is as follows.

1. Guess the set $V_{prot}$ of protected variables and the exclusive access predicates $e^x_i$. These guesses determine visibility of transitions and hence define a reduced system, in which the transition relation of thread $i$ is $N_i$, defined in (4).

2. Augment the reduced system with a predicate $q$, as described in Section 4.1.

3. Check whether $\bar{q}$ holds in all reachable states of the reduced system. Check this using your favorite technique: model checking, theorem proving, hand waving, etc.

4. If so, then the reduction theorem implies that $\bar{q}$ holds in all reachable states of the original system, i.e., the guesses in Step 1 are correct. Traditional reduction theorems can now be used to infer other properties of the original system from properties of the reduced system.

5. If not, then for some variable $x$ in $V_{prot}$, the reduced system has a reachable state in which the mutual-exclusion synchronization discipline for $x$ is violated. Revise the guess for $e^x$ (using the path to the violation as a guide) or re-classify $x$ as unprotected, and then return to Step 1.
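When Step 3 is carried out by explicit-state model checking, it amounts to a reachability search for a state satisfying $q$, and the path it returns is the guide mentioned in Step 5. The following Python sketch shows one possible shape for that search. It assumes two hypothetical callables: `successors(s)`, standing for the transition relation of the reduced system, and `q(s)`, standing for the violation predicate of Section 4.1. States must be hashable.

```python
from collections import deque


def find_violation(initial, successors, q):
    """Breadth-first search; returns a shortest path to a state satisfying q, or None."""
    parent = {initial: None}
    todo = deque([initial])
    while todo:
        s = todo.popleft()
        if q(s):
            # Rebuild the path from the initial state to the violation.
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1]
        for nxt in successors(s):
            if nxt not in parent:
                parent[nxt] = s
                todo.append(nxt)
    return None  # q never holds: the guesses of Step 1 pass the check


# Hypothetical toy system: a counter that runs from 0 to 5.
def count_up(n):
    return [n + 1] if n < 5 else []
```

For the toy counter, `find_violation(0, count_up, lambda n: n == 3)` returns the path `[0, 1, 2, 3]`, while a predicate that never holds yields `None`, which corresponds to Step 4.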

9 How to Use the Reduction Automatically for Systems with Monitors

The methodology in Section 8 is automatic except that the user must guess $V_{prot}$ and the exclusive access predicates. For systems that use monitors for synchronization, this step, too, can be automated, based on the observation that the exclusive access predicates typically have the form $e^x_i = eap^{x,m}_i$, where

$$eap^{x,m}_i = init^x_i \vee (i = m.owner \wedge \neg init^x) \tag{29}$$

$$init^x = (\exists i \in \Theta : init^x_i). \tag{30}$$

and where the initialization predicate $init^x_i$ holds while thread $i$ is executing code that initializes $x$. Note that the lock protecting a variable does not need to be held while the variable is being initialized.

Initialization predicates for variables in systems that correspond to Java programs can be guessed automatically: the initialization predicate holds when the thread's program counter is in the appropriate class initializer (for static fields) or the appropriate constructor invocation (for instance fields).

To use (29), we need to identify, for each variable $x$ in $V_{prot}$, a monitor $m$ that protects $x$. This can be done automatically by running a variant of the lockset algorithm [SBN+97] during state-space exploration of the reduced system.
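Equations (29) and (30) and the lock-identification step can be sketched in Python as follows. This is a minimal illustration under assumptions: `init(x, j, state)` is a hypothetical callable standing for $init^x_j$, monitor ownership is stored in `state["owner"]`, and `refine_locksets` shows only the basic candidate-set intersection of the lockset algorithm [SBN+97], assuming that accesses made during initialization have already been filtered out. The variant actually used during state-space exploration is not specified in this section.

```python
def eap(x, m, i, state, init, threads):
    """Equation (29), with init^x expanded as in (30)."""
    init_x = any(init(x, j, state) for j in threads)  # (30): some thread is initializing x
    return init(x, i, state) or (state["owner"].get(m) == i and not init_x)


def refine_locksets(accesses, all_locks):
    """Intersect, per variable, the sets of locks held at each observed access."""
    candidates = {}
    for var, held in accesses:
        candidates[var] = candidates.get(var, set(all_locks)) & set(held)
    return candidates


# Hypothetical observations: (variable, locks held by the accessing thread).
observed = [("x", {"m", "n"}), ("x", {"m"}), ("y", {"n"}), ("y", set())]
locksets = refine_locksets(observed, {"m", "n"})
```

Here `locksets` maps "x" to `{"m"}`, so `m` is a candidate monitor for (29), and maps "y" to the empty set, which suggests that no single monitor protects "y" and that it should be treated as unprotected.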

10 Experimental Results

We implemented the similar reduction of [Sto02] in Java PathFinder (JPF) [BHPV00] and measured the benefit of the reduction for several programs with monitor-based synchronization. HaltException and Clean [BHPV00, Figure 1] are small “synchronization skeletons” supplied by the developers of JPF. Xtango-DP and Xtango-QS are animations of a dining philosophers algorithm and quicksort, respectively, from http://www.mcs.drexel.edu/~shartley/; we replaced java.awt methods with methods having empty bodies, due to limitations of JPF. The lockset algorithm was used in all experiments. With negligible manual effort (to write a few lines of config files), the reduction decreases memory usage by a factor of 1.4MB/0.77MB ≈ 1.8 for HaltException, 4.3MB/2.2MB ≈ 2.0 for Clean, 609MB/236MB ≈ 2.6 for Xtango-DP, and 344MB/101MB ≈ 3.4 for Xtango-QS, compared to model checking with JPF's default granularity, which executes each line of source code atomically. In a real JVM, bytecode instructions execute atomically. Our reduction preserves that semantics. JPF's source-line granularity does not: it can miss errors. Compared to bytecode granularity, our reduction decreases memory usage by a factor of 13.9MB/0.77MB ≈ 18 for HaltException and at least 1800MB/101MB ≈ 18 for Xtango-QS (“at least” reflects an out-of-memory exception).

Acknowledgements. We thank Shaz Qadeer for telling us about exclusive access predicates, Liqiang Wang for doing the experiments with JPF, and Patrice Godefroid for comments about partial-order methods.

References

[BR01] C. Boyapati and M. C. Rinard. A parameterized type system for race-free Java programs. In Proc. 16th ACM Conference on Object-Oriented Programming, Systems, Languages and Applications (OOPSLA), volume 36(11) of SIGPLAN Notices, pages 56–69. ACM Press, November 2001.