Conference PaperPDF Available

Optimal Dynamic Partial Order Reduction

Authors:

Abstract

Stateless model checking is a powerful technique for program verification, which however suffers from an exponential growth in the number of explored executions. A successful technique for reducing this number, while still maintaining complete coverage, is Dynamic Partial Order Reduction (DPOR). We present a new DPOR algorithm, which is the first to be provably optimal in that it always explores the minimal number of executions. It is based on a novel class of sets, called source sets, which replace the role of persistent sets in previous algorithms. First, we show how to modify an existing DPOR algorithm to work with source sets, resulting in an efficient and simple to implement algorithm. Second, we extend this algorithm with a novel mechanism, called wakeup trees, that allows to achieve optimality. We have implemented both algorithms in a stateless model checking tool for Erlang programs. Experiments show that source sets significantly increase the performance and that wakeup trees incur only a small overhead in both time and space.
A preview of the PDF is not available
... SMC has been implemented in many tools (e.g., VeriSoft [16], Chess [34], Concuerror [9], Nidhugg [2], rInspect [42], CDSChecker [35], RCMC [22], and GenMC [26]), and successfully applied to realistic programs (e.g., [17] and [25]). To reduce the number of explored executions, SMC tools typically employ dynamic partial order reduction (DPOR) [12,1]. DPOR defines an equivalence relation on executions, which preserves relevant correctness properties, such as reachability of local states and assertion violations, and explores at least one execution in each equivalence class. ...
... This can be achieved by defining an equivalence on executions, which respects only the ordering of conflicting accesses to shared variables, irrespective of the order in which events are executed. For plain multi-threaded programs, this equivalence is the basis for several effective DPOR algorithms [12,1]. The challenge is to develop an effective DPOR algorithm also for event-driven programs. ...
... The multiset semantics is used in many works [21,37,20], often with the significant restriction that there is only one handler thread; we consider the more general situation with an arbitrary number of handler threads. Event-DPOR is based on Optimal-DPOR [1,3], a DPOR algorithm for multi-threaded programs. The basic working mode of Optimal-DPOR is similar to several other DPOR algorithms: Given a terminating program, one of its executions is explored and then analyzed to construct initial fragments of new executions; each fragment that is not redundant (i.e., which can be extended to an execution that is not equivalent to a previously explored execution), is subsequently extended to a maximal execution, which is analyzed to construct initial fragments of new executions, and so on. ...
Preprint
Full-text available
Event-driven multi-threaded programming is an important idiom for structuring concurrent computations. Stateless Model Checking (SMC) is an effective verification technique for multi-threaded programs, especially when coupled with Dynamic Partial Order Reduction (DPOR). Existing SMC techniques are often ineffective in handling event-driven programs, since they will typically explore all possible orderings of event processing, even when events do not conflict. We present Event-DPOR , a DPOR algorithm tailored to event-driven multi-threaded programs. It is based on Optimal-DPOR, an optimal DPOR algorithm for multi-threaded programs; we show how it can be extended for event-driven programs. We prove correctness of Event-DPOR for all programs, and optimality for a large subclass. One complication is that an operation in Event-DPOR, which checks for redundancy of new executions, is NP-hard, as we show in this paper; we address this by a sequence of inexpensive (but incomplete) tests which check for redundancy efficiently. Our implementation and experimental evaluation show that, in comparison with other tools in which handler threads are simulated using locks, Event-DPOR can be exponentially faster than other state-of-the-art DPOR algorithms on a variety of programs and manages to completely avoid unnecessary exploration of executions.
... Specifically, they partition the interleavings into equivalence classes (two interleavings are equivalent if one can be obtained from the other by reordering independent instructions), and strive to explore one interleaving per equivalence class. Optimal algorithms [2,15] achieve this goal. DPOR algorithms explore interleavings dynamically. ...
... Theorem 2 (Optimality). Given a well-formed program P (1) Verify(P) never visits two equivalent final executions, and (2) if Visit(P, G) directly leads to a call to Visit(P, G ) with G being fruitless, then Visit(P, G ) will not initiate any other Visit calls. ...
... The seminal work of Flanagan and Godefroid [13] has spawned a number of papers on DPOR. Among these, Optimal-DPOR [2] and TruSt [15] stand out, as they provide the first optimal DPOR algorithm, and the first optimal DPOR algorithm with polynomial memory consumption, respectively. TruSt is based on [17] and thus has the extra advantage of being parametric in the choice of the underlying weak memory model. ...
Chapter
Full-text available
Existing dynamic partial order reduction (DPOR) algorithms scale poorly on concurrent data structure benchmarks because they visit a huge number of blocked executions due to spinloops. In response, we develop Awamoche , a sound, complete, and strongly optimal DPOR algorithm that avoids exploring any useless blocked executions in programs with await and confirmation-CAS loops. Consequently, it outperforms the state-of-the-art, often by an exponential factor.
... The basic DPOR algorithm described in § 2.1 does not guarantee optimality, i.e., that only one execution from each equivalent class will be explored. There are several improvements of the basic algorithm, some of which achieve optimality (e.g., [2,17]). Here, we follow the most recent such improvement, TruSt [16], which achieves optimality with polynomial memory consumption. ...
... There is a large body of work that has improved the original DPOR algorithm of Flanagan et al. [11]. Abdulla et al. [2] introduced the first optimal DPOR algorithm, which, however, suffers from possibly exponential memory consumption. Kokologiannakis et al. [16] developed TruSt, which is the first optimal DPOR algorithm that consumes polynomial memory. ...
Chapter
Full-text available
There are two major techniques for scaling up stateless model checking: dynamic partial order reduction (DPOR), which only explores executions that differ in the ordering of racy accesses, and preemption bounding , which only explores executions containing up to k preemptions (preemptive context-switches). Combining these two techniques is challenging because DPOR-equivalent executions often contain a different number of preemptions, making it incorrect to cut explorations that exceed the preemption bound. To restore completeness, prior work has weakened the DPOR algorithm, which often results in the exploration of many redundant executions. We propose an alternative approach. Starting from an optimal DPOR algorithm, we achieve completeness by allowing some slack on the preemption-bound of the explored executions. We prove that the required slack does not exceed the number of threads of the program (minus two), and that this upper limit is tight.
... There has been a variety of previous work on using commutativity for the analysis of concurrent programs. Many approaches in this field do not treat commutativity relations as first-class objects, but rather fix one underlying definition of commutativity (or łindependencež) for the approach [Abdulla et al. 2014;Flanagan and Godefroid 2005;Kahlon et al. 2009]. While [Godefroid 1996] considers the possibility of different commutativity relations, they impose a soundness notion (łvalid dependency relationž) that effectively limits these relations to capture the spirit of what we call concrete commutativity. ...
Article
The importance of exploiting commutativity relations in verification algorithms for concurrent programs is well-known. They can help simplify the proof and improve the time and space efficiency. This paper studies commutativity relations as a first-class object in the setting of verification algorithms for concurrent programs. A first contribution is a general framework for abstract commutativity relations . We introduce a general soundness condition for commutativity relations, and present a method to automatically derive sound abstract commutativity relations from a given proof. The method can be used in a verification algorithm based on abstraction refinement to compute a new commutativity relation in each iteration of the abstraction refinement loop. A second result is a general proof rule that allows one to combine multiple commutativity relations, with incomparable power, in a stratified way that preserves soundness and allows one to profit from the full power of the combined relations. We present an algorithm for the stratified proof rule that performs an optimal combination (in a sense made formal), enabling usage of stratified commutativity in algorithmic verification. We empirically evaluate the impact of abstract commutativity and stratified combination of commutativity relations on verification algorithms for concurrent programs.
... This has been adapted to SMC as DPOR (dynamic partial order reduction). DPOR was first developed for concurrent programs under SC [Abdulla et al. 2014;Sen and Agha 2006]. Recent years have seen DPOR adapted to language induced weak memory models such as RC11 [Kokologiannakis et al. 2018;Norris and Demsky 2016], the release acquire fragment of C11 [Abdulla et al. 2018], as well as hardware-induced relaxed memory models [Abdulla et al. 2015;Zhang et al. 2015]. ...
Preprint
Full-text available
The verification of transactional concurrent programs running over causally consistent databases is a challenging problem. We present a framework for efficient stateless model checking (SMC) of concurrent programs with transactions under two well known models of causal consistency, CCv and CC. Our approach is based on exploring the program order po and the reads from rf relations, avoiding exploration of all possible coherence orders. Our SMC algorithm is provably optimal in the sense that it explores each po and rf relation exactly once. We have implemented our framework in a tool called \ourtool{}. Experiments show that \ourtool{} performs well in detecting anomalies in classical distributed databases benchmarks.
Conference Paper
Event-driven multi-threaded programming is an important idiom for structuring concurrent computations. Stateless Model Checking (SMC) is an effective verification technique for multi-threaded programs, especially when coupled with Dynamic Partial Order Reduction (DPOR). Existing SMC techniques are often ineffective in handling event-driven programs, since they will typically explore all possible orderings of event processing, even when events do not conflict. We present Event-DPOR, a DPOR algorithm tailored to event-driven multi-threaded programs. It is based on Optimal-DPOR, an optimal DPOR algorithm for multi-threaded programs; we show how it can be extended for event-driven programs. We prove correctness of Event-DPOR for all programs, and optimality for a large subclass. One complication is that an operation in Event-DPOR, which checks for redundancy of new executions, is NP-hard, as we show in this paper; we address this by a sequence of inexpensive (but incomplete) tests which check for redundancy efficiently. Our implementation and experimental evaluation show that, in comparison with other tools in which handler threads are simulated using locks, Event-DPOR can be exponentially faster than other state-of-the-art DPOR algorithms on a variety of programs and manages to completely avoid unnecessary exploration of executions.
Article
Verification of distributed systems using systematic exploration is daunting because of the many possible interleavings of messages and failures. When faced with this scalability challenge, existing approaches have traditionally mitigated state space explosion by avoiding exploration of redundant states (e.g., via state hashing) and redundant interleavings of transitions (e.g., via partial-order reductions). In this paper, we present an efficient symbolic exploration method that not only avoids redundancies in states and interleavings, but additionally avoids redundant computations that are performed during updates to states on transitions. Our symbolic explorer leverages a novel, fine-grained, canonical representation of distributed system configurations (states) to identify opportunities for avoiding such redundancies on-the-fly. The explorer also includes an interface that is compatible with abstractions for state-space reduction and with partial-order and other reductions for avoiding redundant interleavings. We implement our approach in the tool Psym and empirically demonstrate that it outperforms a state-of-the-art exploration tool, can successfully verify many common distributed protocols, and can scale to multiple real-world industrial case studies across
Article
Automatically verifying multi-threaded programs is difficult because of the vast number of thread interleavings, a problem aggravated by weak memory consistency. Partial orders can help with verification because they can represent many thread interleavings concisely. However, there is no dedicated decision procedure for solving partial-order constraints. In this paper, we propose a novel ordering consistency theory for concurrent program verification that is applicable not only under sequential consistency, but also under the TSO and PSO weak memory models. We further develop an efficient theory solver, which checks consistency incrementally, generates minimal conflict clauses, and includes a custom propagation procedure. We have implemented our approach in a tool, called Zord , and have conducted extensive experiments on the SV-COMP 2020 ConcurrencySafety benchmarks. Our experimental results show a significant improvement over the state of the art.
Conference Paper
Full-text available
To detect hard-to-find concurrency bugs, testing tools try to systematically explore all possible interleavings of the transitions in a concurrent program. Unfortunately, because of the nondeterminism in concurrent programs, exhaustively exploring all interleavings is time-consuming and often computationally intractable. Speeding up such tools requires pruning the state space explored. Partial-order reduction (POR) techniques can substantially prune the number of explored interleavings. These techniques require defining a dependency relation on transitions in the program, and exploit independency among certain transitions to prune the state space. We observe that actor systems, a prevalent class of programs where computation entities communicate by exchanging messages, exhibit a dependency relation among co-enabled transitions with an interesting property: transitivity. This paper introduces a novel dynamic POR technique, TransDPOR, that exploits the transitivity of the dependency relation in actor systems. Empirical results show that leveraging transitivity speeds up exploration by up to two orders of magnitude compared to existing POR techniques.
Data
Full-text available
In multithreaded programs both environment input data and the nondeterministic interleavings of concurrent events can affect the behavior of the program. One approach to systematically explore the nondeterminism caused by input data is dynamic symbolic execution. For testing multithreaded programs we present a new approach that combinesdynamicsymbolicexecutionwithunfoldings, amethod originally developed for Petri nets but also applied to many other models of concurrency. We provide an experimental comparison of our new approach with existing algorithms combining dynamic symbolic execution and partial-order reductions and show that the new algorithm can explore the reachable control states of each thread with a significantly smaller number of test runs. In some cases the reduction to the number of test runs can be even exponential allowing programs with long test executions or hard-to-solve constrains generated by symbolic execution to be tested more efficiently. Categories andSubject Descriptors
Conference Paper
Full-text available
We present the techniques used in Concuerror, a systematic testing tool able to find and reproduce a wide class of concurrency errors in Erlang programs. We describe how we take advantage of the characteristics of Erlang's actor model of concurrency to selectively instrument the program under test and how we subsequently employ a stateless search strategy to systematically explore the state space of process interleaving sequences triggered by unit tests. To ameliorate the problem of combinatorial explosion, we propose a novel technique for avoiding process blocks and describe how we can effectively combine it with preemption bounding, a heuristic algorithm for reducing the number of explored interleaving sequences. We also briefly discuss issues related to soundness, completeness and effectiveness of techniques used by Concuerror.
Conference Paper
Full-text available
Testing multi-threaded programs is hard due to the state explosion problem arising from the different interleavings of concurrent operations. The dynamic partial order reduction (DPOR) algorithm by Flanagan and Godefroid is one solution to reducing this problem. We present a modification to this algorithm that allows it to exploit the commutativity of read operations and provide further reduction. To enable testing of multi-threaded programs that also take input we show that it is possible to combine DPOR with concolic testing. We have implemented our modified DPOR algorithm in the LCT concolic testing tool. We have also implemented the sleep set algorithm, which can be used along with DPOR to provide further reduction. As the LCT tool was designed for distributed use we have modified the sleep set algorithm for use in a distributed testing client-server setting.
Article
Full-text available
In multithreaded programs both environment input data and the nondeterministic interleavings of concurrent events can affect the behavior of the program. One approach to systematically explore the nondeterminism caused by input data is dynamic symbolic execution. For testing multithreaded programs we present a new approach that combines dynamic symbolic execution with unfoldings, a method originally developed for Petri nets but also applied to many other models of concurrency. We provide an experimental comparison of our new approach with existing algorithms combining dynamic symbolic execution and partial-order reductions and show that the new algorithm can explore the reachable control states of each thread with a significantly smaller number of test runs. In some cases the reduction to the number of test runs can be even exponential allowing programs with long test executions or hard-to-solve constrains generated by symbolic execution to be tested more efficiently.
Article
Full-text available
State-space caching is a verification technique for finite-state concurrent systems. It performs an exhaustive exploration of the state space of the system being checked while storing only all states of just one execution sequence plus as many other previously visited states as available memory allows. So far, this technique has been of little practical significance: it allows one to reduce memory usage by only twoo to three times, before an unacceptable blow-up of the run-time overhead sets in. The explosion of the run-time requirements is due to redundant multiple explorations of unstored parts of the state space. Indeed, almost all states in the state space of concurrent systems are typically reached several times during the search. In this paper, we present a method to tackle the main cause of this prohibitive state matching: the exploration of all possible interleavings of concurrent executions of the system which all lead to the same state. Then, we show that, in many cases, with this method, most reachable states are visited only once during state-space exploration. This enables one not to store most of the states that have already been visited without incurring too much redundant explorations of parts of the state space, and makes therefore state-space caching a much more attractive verification method. As an example, we were able to competely explore a state space of 250,000 states while storing simultaneously no more than 500 states and with only a three-fold increas of the run-time requirements.
Article
Full-text available
With the advancement of computer technology, highly concurrent systems are being developed. The verification of such systems is a challenging task, as their state space grows exponentially with the number of processes. Partial order reduction is an effective technique to address this problem. It relies on the observation that the effect of executing transitions concurrently is often independent of their ordering. In this paper we present the basic principles behind partial order reduction and its implementation.
Article
We present a new approach to partial-order reduction for model checking software. This approach is based on initially exploring an arbitrary interleaving of the various concurrent processes/threads, and dynamically tracking interactions between these to identify backtracking points where alternative paths in the state space need to be explored. We present examples of multi-threaded programs where our new dynamic partial-order reduction technique significantly reduces the search space, even though traditional partial-order algorithms are helpless.
Article
Testing concurrent programs that accept data inputs is no- toriously hard because, besides the large number of possi- ble data inputs, nondeterminism results in an exponentially large number of interleavings of concurrent events. We pro- pose a novel testing algorithm for concurrent programs in which our goal is not only to execute all reachable state- ments of a program, but to detect all possible data races, and deadlock states. The algorithm uses a combination of symbolic and concrete execution (called concolic execution) to explore all distinct causal structures (or partial order re- lations among events generated during execution) of a con- current program. The idea of concolic testing is to use the symbolic execution to generate inputs that direct a program to alternate paths, and to use the concrete execution to guide the symbolic execution along a concrete path. Our algorithm uses the concrete execution to compute the exact race conditions between the events of an execution at run- time. Subsequently, we systematically re-order or permute the events involved in these races by generating new thread schedules as well as generate new test inputs. This way we explore at least one representative from each partial order. We describe jCUTE, a tool implementing the testing algo- rithm together with the results of applying jCUTE to real- world multithreaded Java applications and libraries. In one of our case studies, we discovered several undocumented po- tential concurrency-related bugs in the widely used Java col- lection framework distributed with the Sun Microsystems' JDK 1.4.