
Dynamic Model Checking with Property Driven

Pruning to Detect Race Conditions⋆

Chao Wang¹, Yu Yang², Aarti Gupta¹, and Ganesh Gopalakrishnan²

¹ NEC Laboratories America, Princeton, New Jersey, USA

² School of Computing, University of Utah, Salt Lake City, Utah, USA

Abstract. We present a new property driven pruning algorithm in dynamic model

checking to efficiently detect race conditions in multithreaded programs. The

main idea is to use a lockset based analysis of observed executions to help prune

the search space to be explored by the dynamic search. We assume that a stateless

search algorithm is used to systematically execute the program in a depth-first

search order. If our conservative lockset analysis shows that a search subspace is

race-free, it can be pruned away by avoiding backtracks to certain states in the

depth-first search. The new dynamic race detection algorithm is both sound and

complete (as precise as the dynamic partial order reduction algorithm by Flana-

gan and Godefroid). The algorithm is also more efficient in practice, allowing it

to scale much better to real-world multithreaded C programs.

1 Introduction

Concurrent programs are notoriously hard to debug because of their often large number

of possible interleavings of thread executions. Concurrency bugs often arise in rare

situations that are hard to anticipate and handle by standard testing techniques. One

representative type of bug in concurrent programs is a data race, which happens when

multiple threads access a shared data variable simultaneously and at least one of the

accesses is a write. Race conditions were among the flaws in the Therac-25 radiation

therapy machine [12], which led to the death of three patients and injuries to several

more. A race condition in the energy management system of some power facilities

prevented alerts from being raised to the monitoring technicians, eventually leading to

the 2003 North American Blackout.

To completely verify a multithreaded program for a given test input, one has to

inspect all possible thread interleavings. For deterministic threads, the only source of

nondeterminism in their execution comes from the thread scheduler of the operating

system. In a typical testing environment, the user does not have full control over the

scheduling of threads; running the same test multiple times does not necessarily trans-

late into a better interleaving coverage. Static analysis has been used for detecting data

races in multithreaded programs, both for a given test input [20, 16] and for all possible

inputs [6, 4, 17, 11, 22]. However, a race condition reported by static analysis may be

bogus (there can be many false alarms); even if it is real, there is often little information

for the user to reproduce the race. Model checking [3, 18] has the advantage of exhaus-

tive coverage which means all possible thread interleavings will be explored. However,

⋆Yu Yang and Ganesh Gopalakrishnan were supported in part by NSF award CNS-0509379 and

the Microsoft HPC Institutes program.


model checkers require building finite-state or pushdown automata models of the soft-

ware [10, 1]; they often do not perform well in the presence of lock pointers and other

heap allocated data structures.

Dynamic model checking as in [9, 5, 14, 23, 24] can directly check programs written

in full-fledged programming languages such as C and Java. For detecting data races,

these methods are sound (no bogus race) due to their concrete execution of the pro-

gram itself as opposed to a model. While a bounded analysis is used in [14], the other

methods [9, 5, 23, 24] are complete for terminating programs (do not miss real races)

by systematically exploring the state space without explicitly storing the intermediate

states. Although such dynamic software model checking is both sound and complete,

the search is often inefficient due to the astronomically large number of thread inter-

leavings and the lack of property specific pruning. Dynamic partial order reduction

(DPOR) techniques [5, 23, 7] have been used in this context to remove the redundant

interleavings from each equivalence class, provided that the representative interleaving

has been explored. However, the pruning techniques used by these DPOR tools have

been generic, rather than property-specific.

            T1                          T2
            ...                         ...
  a1:  lock(f1);              b1:  lock(f1);
  a2:  x++;                   b2:  lock(f2);
  a3:  unlock(f1);            b3:  z++;
  a4:  ...                    b4:  c = x;
  a5:  lock(f2);              b5:  unlock(f2);
  a6:  y++;                   b6:  unlock(f1);
  a7:  unlock(f2);            b7:  ...
  a8:  ...                    b8:  lock(f1);
  a9:  lock(f1);              b9:  if (c==0)
  a10: z++;                   b10:   y++;
  a11: unlock(f1);            b11: unlock(f1);

Fig. 1. Race condition on accessing variable y (assume that x = y = 0 initially)

Without a conservative or warranty type of analysis tailored toward the property to

be checked, model checking has to enumerate all the equivalence classes of interleav-

ings. Our observation is that, as far as race detection is concerned, many equivalence

classes themselves may be redundant. Fig. 1 shows a motivating example, in which

two threads use locks to protect accesses to shared variables x, y, and z. A race condition

between a6 and b10 may occur when b4 is executed before a2, by setting c to

0. Let the first execution sequence be a1 ... a11 b1 ... b9 b11. According to the DPOR

algorithm by Flanagan and Godefroid [5], since a10 and b3 have a read-write conflict,

we need to backtrack to a8 and continue the search from a1 ... a8 b1. As a generic pruning

technique, this is reasonable since the two executions are not Mazurkiewicz-trace

equivalent [13]. For data race detection, however, it is futile to search any of these execution

traces in which a6 and b10 cannot be simultaneously reachable (which can be


revealed by a conservative lockset analysis). We provide a property-specific pruning

algorithm to skip such redundant interleavings and backtrack directly to a1.
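The motivating example can be checked by brute force. The following is a minimal sketch, not the pruning algorithm of this paper: it enumerates every lock-respecting interleaving of the two threads of Fig. 1 and reports a race whenever the next operations of both threads access the same shared variable and at least one is a write. The statement labels (a2 is x++, a6 and b10 are y++, b4 is c = x) are inferred from the discussion above, the "..." statements are omitted, and the names THREADS, access, and explore are hypothetical.

```python
# Assumed encoding of Fig. 1: (label, operation, argument). The "..." lines are dropped.
T1 = [("a1", "lock", "f1"), ("a2", "inc", "x"), ("a3", "unlock", "f1"),
      ("a5", "lock", "f2"), ("a6", "inc", "y"), ("a7", "unlock", "f2"),
      ("a9", "lock", "f1"), ("a10", "inc", "z"), ("a11", "unlock", "f1")]
T2 = [("b1", "lock", "f1"), ("b2", "lock", "f2"), ("b3", "inc", "z"),
      ("b4", "read_x", "c"), ("b5", "unlock", "f2"), ("b6", "unlock", "f1"),
      ("b8", "lock", "f1"), ("b9", "if_c_zero", None), ("b10", "inc", "y"),
      ("b11", "unlock", "f1")]
THREADS = (T1, T2)


def access(transition):
    """Return (shared variable, is_write) for a data access, else None."""
    _, op, arg = transition
    if op == "inc":
        return (arg, True)
    if op == "read_x":
        return ("x", False)
    return None


def explore():
    """Enumerate all interleavings; map each racing label pair to its witnesses."""
    races = {}

    def dfs(pcs, owner, x, c, prefix):
        nxt = [THREADS[i][pcs[i]] if pcs[i] < len(THREADS[i]) else None
               for i in (0, 1)]
        # Race check: both pending operations touch the same variable,
        # at least one writes, and both are enabled (data accesses never block).
        if nxt[0] and nxt[1]:
            a, b = access(nxt[0]), access(nxt[1])
            if a and b and a[0] == b[0] and (a[1] or b[1]):
                races.setdefault((nxt[0][0], nxt[1][0]), []).append(tuple(prefix))
                return
        for tid in (0, 1):
            if nxt[tid] is None:
                continue
            label, op, arg = nxt[tid]
            if op == "lock" and owner[arg] is not None:
                continue  # lock is held, so this transition is disabled
            new_owner, new_x, new_c, jump = dict(owner), x, c, 1
            if op == "lock":
                new_owner[arg] = tid
            elif op == "unlock":
                new_owner[arg] = None
            elif op == "inc" and arg == "x":
                new_x = x + 1
            elif op == "read_x":
                new_c = x
            elif op == "if_c_zero" and c != 0:
                jump = 2  # condition false: skip the guarded y++
            new_pcs = list(pcs)
            new_pcs[tid] += jump
            dfs(tuple(new_pcs), new_owner, new_x, new_c, prefix + [label])

    dfs((0, 0), {"f1": None, "f2": None}, 0, None, [])
    return races


races = explore()
print(sorted(races))
```

Under these assumptions, every witness schedule executes b4 before a2, which matches the observation that interleavings starting with a1 cannot expose the race on y.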

In this paper, we propose a trace-based dynamic lockset analysis to prune the search

space in the context of dynamic model checking. Our main contributions are: (1) a new

lockset analysis of the observed execution trace for checking whether the associated

search subspace is race-free; and (2) property driven pruning in a backtracking algorithm

using depth-first search.

We analyze the various alternatives of the current execution trace to anticipate race

conditions in the corresponding search space. Our trace-based lockset analysis relies

on both information derived from the dynamic execution and information collected

statically from the program; therefore, it is more precise than the purely static lock-

set analysis conducted a priori on the program [4, 6, 17, 11, 22]. Our method is also

different from the Eraser-style dynamic lockset algorithms [20, 16], since our method

decides whether the entire search subspace related to the concrete execution generated

is race-free, not merely the execution itself. The crucial requirement for a method to be

used in our framework for pruning of the search space is completeness—pruning must

not remove real races. Therefore, neither the aforementioned dynamic lockset analysis

nor the various predictive testing techniques [21, 2] based on happens-before causality

(sound but incomplete) can be used in this framework. CHESS [14] can detect races

that may show up within a preemption bound; it exploits the preemption bounding for

pruning, but does not exploit the lock semantics to effect reduction.

In our approach, if the search subspace is found to be race-free, we prune it away

during the search by avoiding backtracks to the corresponding states. Recall that essentially

the search is conducted in a DFS order. If there is a potential race, we analyze

the cause in order to compute a proper backtracking point. Our backtracking algorithm

shares the same insights as the DPOR algorithm [5], with the additional pruning capa-

bility provided by the trace-based lockset analysis. Note that DPOR relies solely on the

independence relation to prune redundant interleavings (if t1, t2 are independent, there

is no need to flip their execution order). In our algorithm, even if t1, t2 are dependent,

we may skip the corresponding search space if flipping the order of t1, t2 does not affect

the reachability of any race condition. If there is no data race at all in the program, our

algorithm can obtain the desired race-freedom assurance much faster.

2 Preliminaries

2.1 Concurrent Programs

We consider a concurrent program with a finite number of threads as a state transition

system. Let Tid = {1,...,n} be a set of thread indices. Threads may access local

variables in their own stacks, as well as global variables in a shared heap. The operations

on global variables are called visible operations, while those on thread local variables

are called invisible operations. We use Global to denote the set of states of all global

variables, Local to denote the set of local states of a thread. PC is the set of values of

the program counter of a thread. The entire system state (S), the program counters of

the threads (PCs), and the local states of threads (Locals) are defined as follows:


S ⊆ Global × Locals × PCs

PCs = Tid → PC

Locals = Tid → Local

A transition t : S → S advances the program from one state to a subsequent state.

Following the notation of [5, 23], each transition t consists of one visible operation,

followed by a finite sequence of invisible operations of the same thread up to (but ex-

cluding) the next visible operation. We use tid(t) ∈ Tid to denote the thread index of

the transition t. Let T be the set of all transitions of a program. A transition t ∈ T

is enabled in a state s if the next state t(s) is defined. We use s -t→ s′ to denote that

t is enabled in s and s′ = t(s). Two transitions t1, t2 may be co-enabled if there exists

a state in which both t1 and t2 are enabled. The state transition graph is denoted

⟨S, s0, Γ⟩, where s0 ∈ S is the unique initial state and Γ ⊆ S × S is the transition

relation: (s, s′) ∈ Γ iff ∃t ∈ T : s -t→ s′. An execution sequence is a sequence of states

s_0, ..., s_n such that ∃t_i : s_{i−1} -t_i→ s_i for all 1 ≤ i ≤ n.

Two transitions are independent if and only if they can neither disable nor enable

each other, and swapping their order of execution does not change the combined effect.

Two execution traces are equivalent iff they can be transformed into each other by

repeatedly swapping adjacent independent transitions. In model checking, partial order

reduction (POR [8]) has been used to exploit the redundancy of executions from the

same equivalence class to prune the search space; in particular, model checking has to

consider only one representative from each equivalence class.
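The notion of trace equivalence can be made concrete with a small sketch that computes the equivalence class of a trace directly from the definition, by repeatedly swapping adjacent independent transitions. The dependence test used here is a simplifying assumption (same thread, or same object with at least one write), not the relation used by any particular tool, and the tuple layout and function names are hypothetical. The search is exponential and only meant for tiny traces.

```python
from collections import deque


def dependent(t1, t2):
    """Assumed dependence: same thread, or same object with at least one write."""
    _, tid1, obj1, write1 = t1
    _, tid2, obj2, write2 = t2
    if tid1 == tid2:
        return True
    return obj1 == obj2 and (write1 or write2)


def equivalence_class(trace):
    """All traces reachable by swapping adjacent independent transitions."""
    start = tuple(trace)
    seen = {start}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        for i in range(len(cur) - 1):
            if not dependent(cur[i], cur[i + 1]):
                swapped = cur[:i] + (cur[i + 1], cur[i]) + cur[i + 2:]
                if swapped not in seen:
                    seen.add(swapped)
                    queue.append(swapped)
    return seen


def equivalent(trace1, trace2):
    return tuple(trace2) in equivalence_class(trace1)


# Transitions are (label, thread id, object, is_write).
a = ("a", 1, "x", True)   # thread 1 writes x
b = ("b", 2, "y", True)   # thread 2 writes y
c = ("c", 2, "x", False)  # thread 2 reads x

print(equivalent([a, b, c], [b, a, c]))
print(equivalent([a, b, c], [b, c, a]))
```

In this example a and b are independent, so swapping them stays within the same class, whereas moving the read c ahead of the write a changes the order of a dependent pair and leaves the class. A partial order reduction needs to explore only one trace per class.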

2.2 Dynamic Partial Order Reduction

Model checking of a multithreaded program can be conducted in a stateless fashion by

systematically executing the program in a depth-first search order. This can be imple-

mented by using a special scheduler to control the execution of visible operations of

all threads; the scheduler needs to give permission to, and observe the result of every

visible operation of the program. Instead of enumerating the reachable states, as in classic

model checkers, it exhaustively explores all the feasible thread interleavings. Fig. 2

shows a typical stateless search algorithm. The scheduler maintains a search stack S

of states. Each state s ∈ S is associated with a set s.enabled of enabled transitions,

a set s.done of executed transitions, and a backtracking set, consisting of the thread

indices of some enabled transitions in s that need to be explored from s. In this context,

backtracking is implemented by re-starting the program afresh under a different thread

schedule [23], while ensuring that the replay is deterministic, i.e., all external behaviors

(e.g., mallocs and IO) are also assumed to be replayable¹.
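Backtracking by re-execution can be sketched in a few lines. The code below is an illustrative sketch of a plain stateless depth-first search with no partial order reduction, not the algorithm of Fig. 2: no program states are stored, only schedule prefixes, and each prefix is explored by restarting the program from scratch and replaying it. Threads are modeled as Python generators that announce their next visible operation; make_program, replay, and explore are hypothetical names, and every thread is assumed to have at least one visible operation.

```python
def make_program():
    """Build a fresh instance of a toy program: two threads incrementing x."""
    shared = {"x": 0}

    def worker():
        yield "read x"            # announce the next visible operation
        tmp = shared["x"]         # executed when the scheduler resumes us
        yield "write x"
        shared["x"] = tmp + 1

    return shared, [worker(), worker()]


def replay(program_factory, schedule):
    """Restart the program and deterministically re-execute a schedule prefix."""
    shared, threads = program_factory()
    pending = {tid: next(gen) for tid, gen in enumerate(threads)}
    for tid in schedule:
        try:
            pending[tid] = next(threads[tid])  # run one visible operation
        except StopIteration:
            del pending[tid]                   # thread has terminated
    return shared, sorted(pending)


def explore(program_factory):
    """Depth-first enumeration of all interleavings, storing no program states."""
    results = []
    stack = [[]]  # schedule prefixes still to be explored
    while stack:
        schedule = stack.pop()
        shared, enabled = replay(program_factory, schedule)
        if not enabled:
            results.append((tuple(schedule), dict(shared)))
        for tid in reversed(enabled):
            stack.append(schedule + [tid])
    return results


results = explore(make_program)
print(len(results), sorted({state["x"] for _, state in results}))
```

Replaying every prefix from the initial state costs time quadratic in the trace length, but it avoids capturing heap and stack snapshots of a real C program, which is the trade-off the stateless approach accepts.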

The procedure DPORUPDATEBACKTRACKSETS(S, t) implements the dynamic partial

order reduction algorithm of [5]. It updates the backtrack set only for the last transition

td in T such that td is dependent and may be co-enabled with t (line 19). The

set sd.backtrack is also a subset of the enabled transitions, and the set E consists of

¹ While malloc replayability is ensured by allocating objects in the same fashion, IO

replayability is ensured by creating suitable closed environments.


 1: Initially: S is empty; DPORSEARCH(S, s0)
 2: DPORSEARCH(S, s) {
 3:   if (DETECTRACE(s)) exit(S);
 4:   S.push(s);
 5:   for each t ∈ s.enabled, DPORUPDATEBACKTRACKSETS(S, t);
 6:   let τ ∈ Tid such that ∃t ∈ s.enabled : tid(t) = τ;
 7:   s.backtrack ← {τ};
 8:   s.done ← ∅;
 9:   while (∃t : tid(t) ∈ s.backtrack and t ∉ s.done) {
10:     s.done ← s.done ∪ {t};
11:     s.backtrack ← s.backtrack \ {tid(t)};
12:     let s′ ∈ S such that s -t→ s′;
13:     DPORSEARCH(S, s′);
14:     S.pop(s);
15:   }
16: }
17: DPORUPDATEBACKTRACKSETS(S, t) {
18:   let T = {t1, ..., tn} be the sequence of transitions associated with S;
19:   let td be the latest transition in T that is dependent and may be co-enabled with t;
20:   if (td ≠ null) {
21:     let sd be the state in S from which td is executed;
22:     let E be {q ∈ sd.enabled | either tid(q) = tid(t), or q was executed after td in T and
          a happens-before relation exists for (q, t)}
23:     if (E ≠ ∅)
24:       choose any q in E, add tid(q) to sd.backtrack;
25:     else
26:       sd.backtrack ← sd.backtrack ∪ {tid(q) | q ∈ sd.enabled};
27:   }
28: }

Fig. 2. Stateless search with dynamic partial order reduction (cf. [5])

transitions q in T such that (q,t) has a happens-before relation (line 22). Intuitively,

q happens-before t means that flipping the execution order of q and t may lead to in-

terleavings in a different equivalence class. For a better understanding, a plain depth-

first search, with no partial order reduction at all, would correspond to an alternative

implementation of line 19 in which td is defined as the last transition in T such that

tid(td) ≠ tid(t), regardless of whether td and t are dependent, and an alternative

implementation of line 22 in which E = ∅.

Data race detection is essentially checking the simultaneous reachability of two

conflicting transitions. The procedure DETECTRACE(s) used in line 3 of Fig. 2 checks

in each state s whether there exist two transitions t1, t2 such that (1) they access the

same shared variable; (2) at least one of them is a write; and (3) both transitions are

enabled in s. If all three conditions hold, it reports a data race; in this case, the sequence

of states s0, s1, ..., s currently in the stack S serves as a counterexample. The

advantage of this race detection procedure is that it does not report bogus races (of

course, the race itself may be benign; detecting whether races are malicious is outside

the scope of our approach, as well as most other approaches in this area). If the top-level