Generic CDCL – A Formalization of Modern
Propositional Satisfiability Solvers
Steffen Hölldobler, Norbert Manthey, Tobias Philipp and Peter Steinke
International Center for Computational Logic
Technische Universität Dresden
Abstract. Modern propositional satisfiability (SAT) solvers are very
powerful due to recent developments in the underlying data structures,
the heuristics that guide the search, the deduction techniques that infer
knowledge, and the formula simplification techniques that are used
during pre- and inprocessing. However, when all these techniques are
put together, the soundness of the combined algorithm is no longer
guaranteed. In this paper we present a small set of rules that allows
modern SAT solvers to be modeled as a state transition system. With
these rules, all techniques that are applied in modern SAT solvers can
be adequately modeled. Finally, we compare Generic CDCL with related
systems.
1 Introduction
Many practical problems of computer science are in the complexity class NP.
There are many well-studied formalisms that can handle problems of this class,
among them constraint satisfaction [20], answer set programming [7], and
satisfiability checking [3]. Although the two former formalisms admit a richer
language, the latter approach is still very competitive even though the
expressivity of its language is comparatively low.
The propositional satisfiability problem (SAT) consists of a propositional for-
mula and asks whether there is a satisfying assignment for the Boolean variables
occurring in the formula. From a complexity theory point of view SAT is NP-
complete [4] and, thus, intractable. Still, there are many industrial and academic
applications that can be solved efficiently with modern SAT solvers. For instance, a
SAT-based railway scheduling software outperformed the native version [9].
Likewise, haplotype matching [13] can be solved efficiently with modern SAT solvers.
The success of the SAT approach lies in the strength of today's SAT solvers.
SAT solvers do not test all possible variable assignments, but construct an
assignment by interleaving two processes, viz., guessing and propagating the
assignment of literals. The main inference rule is unit propagation, an
efficient form of resolution. Combined with a decision rule it is the core of
the basic algorithm known as the DPLL algorithm [5]. If a contradiction is
found in the formula with respect to the current variable assignment, advanced
SAT solvers backtrack and learn a conflict clause which prevents the current
and similar conflicts. With the addition of such learned clauses the basic
algorithm is known as the CDCL algorithm [16].
Modern systematic SAT solvers are highly tuned and complex proof procedures
employing many advanced techniques like clause learning [16], non-chronological
backtracking, restarts [8], clause removal [2, 6], decision heuristics [2, 17],
and formula simplification techniques [12]. Specialized, cache-conscious
data structures [10] further improve the performance. This way, today's solvers
like Riss (see footnote 1), MiniSAT or Lingeling can handle formulas with
millions of variables and millions of clauses.
However, the success of modern solvers carries a price tag: increased code
complexity. Successful SAT solvers like the ones mentioned above consist of
several thousand lines of code and are written in programming languages with
side effects like C or C++. Due to the code complexity, the behavior of SAT
solvers is hard to understand and state-of-the-art SAT solver internals are hard
to teach. Moreover, finding additional techniques and integrating them into a
SAT solver is getting more complex, as we have to consider the interplay with
all the remaining techniques. Consequently, abstracting from specific algorithms,
data structures, and heuristics is extremely important in order to discover and
prove properties of a modern SAT solver as well as to understand the principles
of SAT solving.
This problem was tackled by different formalizations, notably Linearized
DPLL [1], Rule-based SAT Solver Descriptions [15], and Abstract
DPLL [18]. However, these systems no longer model modern SAT solvers
appropriately. In particular, preprocessing and applying preprocessing
techniques interleaved with search, known as inprocessing, have become a
crucial part of SAT solving. Applying formula simplification techniques during
search as well is an attractive idea since it allows valuable formula
simplifications to be used while taking learned clauses into account. For
example, the SAT solver Lingeling benefits considerably from this approach.
The contribution of this paper is the formalism Generic CDCL that models
the computation of modern SAT solvers. Equipped with a small set of simple
state transition rules, we can model all well-established techniques like prepro-
cessing, inprocessing, restarts, clause sharing, as well as clause learning and for-
getting. This formalism allows us to reason about the behavior of SAT solvers
independently of the specific implementation. Additionally, the framework is a
first step towards explaining how modern SAT solvers work in a compact and
accessible way. Besides the presentation of Generic CDCL, the main result of
this paper is the proof that Generic CDCL and, consequently, all its instances
are sound.
The paper is structured as follows: In Section 2 we describe basic concepts of
satisfiability testing. We present Generic CDCL in Section 3, where we also prove
that Generic CDCL correctly solves the satisfiability problem. Afterwards, we
compare Generic CDCL with related formalisms in Section 4, and we conclude
the paper in Section 5.
Footnote 1: The SAT solver Riss is freely available at tools.computational-logic.org.
2 Preliminaries
2.1 The Satisfiability Problem
We assume a fixed infinite set $\mathcal{V}$ of Boolean variables. A literal is a
variable $v$ (positive literal) or a negated variable $\neg v$ (negative literal).
The complement $\overline{x}$ of a positive (negative, resp.) literal $x$ is the
negative (positive, resp.) literal with the same variable as $x$. The complement
of a set $S$ of literals, denoted by $\overline{S}$, is defined as
$\overline{S} = \{\overline{x} \mid x \in S\}$. Finite sets of clauses are called
formulas, where a clause is a finite set of literals. Sometimes, we write a clause
$\{x_1, \ldots, x_n\}$ also as the disjunction $(x_1 \vee \ldots \vee x_n)$ and a
formula $\{C_1, \ldots, C_n\}$ as the conjunction $(C_1 \wedge \ldots \wedge C_n)$.
The empty clause is denoted by $\bot$, the empty formula by $\top$. The
formula obtained from $F$ by replacing all occurrences of the variable $v$ by the
variable $w$ is denoted by $F[v \mapsto w]$. The set of all variables occurring in a
formula $F$ (in positive or negative literals) is denoted by $\mathrm{vars}(F)$; the
set of all literals occurring in $F$ by $\mathrm{lits}(F)$. For instance, if
$x, y \in \mathcal{V}$, then $F = \{\{x, y\}, \{y\}\}$ is a formula, its alternative
representation using logical connectives is $(x \vee y) \wedge y$,
$\mathrm{vars}(F) = \{x, y\}$, and $\mathrm{lits}(F) = \{x, y\}$.
The semantics of formulas is based on the notion of an interpretation. An
interpretation $I$ is a set of literals which does not contain a complementary pair
$x, \overline{x}$ of literals. An interpretation $I$ is total iff for each
$v \in \mathcal{V}$ either $v \in I$ or $\neg v \in I$.
The satisfaction relation $\models$ is defined as follows: Let $I$ be an
interpretation, then $I \models \top$, $I \not\models \bot$,
$I \models (x_1 \vee \ldots \vee x_n)$ iff $I \models x_i$ for some $i \in \{1, \ldots, n\}$, and
$I \models (C_1 \wedge \ldots \wedge C_n)$ iff $I \models C_i$ for all $i \in \{1, \ldots, n\}$.
Interpretation $I$ is a model for the formula $F$ iff $I \models F$. If a formula $F$
has a model, then $F$ is satisfiable, otherwise it is unsatisfiable.
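The satisfaction relation can be illustrated with a small Python sketch. The encoding is an assumption of this sketch and is not taken from the paper: variables are positive integers, the negative literal of variable v is the integer -v, a clause is a frozenset of literals, and a formula is a set of clauses. The function names `satisfies` and `is_satisfiable` are hypothetical.

```python
from itertools import product

def satisfies(interpretation, formula):
    """I |= F: every clause contains at least one literal of I."""
    return all(any(lit in interpretation for lit in clause) for clause in formula)

def is_satisfiable(formula):
    """Brute force over all total assignments of the variables in the formula."""
    variables = sorted({abs(lit) for clause in formula for lit in clause})
    for signs in product((1, -1), repeat=len(variables)):
        interpretation = {sign * var for sign, var in zip(signs, variables)}
        if satisfies(interpretation, formula):
            return True
    return False
```

The brute-force search is exponential in the number of variables and serves only to make the definitions executable; it is not how a SAT solver proceeds.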
We relate formulas by three relations: entailment, equivalence and
equisatisfiability. Formula $F$ entails formula $F'$, in symbols $F \models F'$, iff
every total model of $F$ is a model of $F'$. Two formulas $F$ and $F'$ are
equivalent, in symbols $F \equiv F'$, iff $F$ entails $F'$ and $F'$ entails $F$. Two
formulas $F$ and $F'$ are equisatisfiable, in symbols $F \equiv_{sat} F'$, iff either
both are satisfiable or both are unsatisfiable.
For instance, the interpretation $I = \{x, \neg z\}$ is a model of the formula
$F_1 = (x \vee y) \wedge (x \vee z)$ and, therefore, $F_1$ is satisfiable. The formula
$F_2 = x \wedge \overline{x}$ has no model and, therefore, is unsatisfiable. The
formula $F_3 = x \wedge z$ is satisfiable and, therefore, the formulas $F_1$ and $F_3$
are equisatisfiable, but the formulas $F_1$ and $F_2$ are not equisatisfiable. In
fact, $F_3 \models F_1$ since every total model $I$ of the formula $F_3$ must contain
$x$ and $z$ and, hence, the two clauses of $F_1$ are satisfied by $I$. Finally, we
find for all clauses $C$ and formulas $F$ that
$C \vee \top \equiv \top \vee C \equiv \top$, $C \vee \bot \equiv \bot \vee C \equiv C$,
$F \wedge \top \equiv \top \wedge F \equiv F$, and $F \wedge \bot \equiv \bot \wedge F \equiv \bot$.
Let $x$ be a literal, and let $C = (x \vee C')$ and $D = (\overline{x} \vee D')$ be
two clauses. Then the clause $(C' \vee D')$ is the resolvent of the clauses $C$ and
$D$ upon the literal $x$. A linear resolution derivation from the clause $C$ to the
clause $D$ in the formula $F$ is a finite sequence of clauses
$(C_i \mid 1 \le i \le n)$ such that $C_1 = C$, $C_n = D$ and $C_i$ is a resolvent of
the clause $C_{i-1}$ and some clause in the formula $F$ for all
$i \in \{2, \ldots, n-1\}$; one should observe that $F$ entails $D$ and that the
addition of entailed clauses to a formula preserves equivalence.
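As a minimal sketch of the resolution step, the following Python function computes the resolvent of two clauses upon a literal. The integer encoding of literals (the complement of literal x is -x) and the name `resolvent` are assumptions of this sketch, not notation from the paper.

```python
def resolvent(c, d, x):
    """Resolvent of clause c (containing x) and clause d (containing the
    complement -x) upon the literal x; clauses are frozensets of integers."""
    if x not in c or -x not in d:
        raise ValueError("c must contain x and d must contain its complement")
    return frozenset((c - {x}) | (d - {-x}))
```

For example, resolving (x ∨ y) with (¬x ∨ z) upon x, encoded as `{1, 2}` and `{-1, 3}`, yields the clause encoded as `{2, 3}`.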
2.2 Variable Assignments and the Reduct Operator
Let $J$ be a finite sequence of literals. In $J$ each literal may be marked as a
decision literal by placing a dot on top like in $\dot{x}$; if a literal $x$ is not
marked, then it is a propagation literal. Let $J = (x_1, \ldots, x_n)$ be a sequence
of literals of length $n$. We write $x \in J$ iff there is a $k \in \{1, \ldots, n\}$
such that $x = x_k$. Let $J_1 = (x_1, \ldots, x_n)$ and $J_2 = (y_1, \ldots, y_m)$ be two
sequences of literals; their concatenation $J_1 J_2$ is the sequence
$(x_1, \ldots, x_n, y_1, \ldots, y_m)$. If a finite sequence $J$ of literals does not
contain a complementary pair of literals, then $J$ represents an interpretation. As
this condition is always met in this paper, we identify sequences of literals with
interpretations whenever appropriate.
The reduct of a formula $F$ w.r.t. an interpretation $J$, in symbols $F|_J$, is
defined as
$F|_J := \{C|_J \mid C \in F \text{ and for every literal } x \in C \text{ we find that } x \notin J\}$,
where $C|_J = C \setminus \{\overline{x} \mid x \in J\}$. Intuitively, the reduct
operator expresses the state of a SAT solver, where the formula $F$ is the working
formula and $J$ is the working assignment. For instance, let
$F = \{\{\overline{x}, y\}, \{z\}\}$, then $F|_x = \{\{y\}, \{z\}\}$,
$F|_{\overline{z}} = \{\{\overline{x}, y\}, \bot\}$ and $F|_{y z} = \top$, where the
interpretations are written as sequences of literals. One should observe that the
reduct operator does not distinguish between propagation and decision literals.
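The reduct operator translates directly into code. The following Python sketch assumes an encoding that is not part of the paper: literals are nonzero integers with -x as the complement of x, clauses are frozensets, and formulas are sets of clauses. The name `reduct` is hypothetical.

```python
def reduct(formula, assignment):
    """Reduct of a formula w.r.t. a sequence of literals: clauses containing
    an assigned literal are dropped, and the complements of assigned
    literals are deleted from the remaining clauses."""
    assigned = set(assignment)
    falsified = {-lit for lit in assigned}
    return {frozenset(clause - falsified)
            for clause in formula
            if not (clause & assigned)}
```

With x, y, z encoded as 1, 2, 3 and the formula consisting of the clauses `{-1, 2}` and `{3}`, the reduct w.r.t. `[1]` is `{{2}, {3}}`, the reduct w.r.t. `[-3]` contains the empty clause, and the reduct w.r.t. `[2, 3]` is the empty formula.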
Lemma 1 below summarizes the properties of the reduct operator: (i) The
reduct is monotone. (ii) A formula $F$ is satisfiable iff there exists an
interpretation $J$ such that the reduct of $F$ w.r.t. $J$ is the empty formula. (iii)
If the reduct of a formula $F$ w.r.t. the interpretation $\{x\}$ is unsatisfiable, and
the formula $F$ entails the literal $x$, then the formula $F$ is unsatisfiable. (iv) The
reduct operator is a semantic operator in the sense that it cannot distinguish
equivalent formulas.
Lemma 1 (Reduct Operator). Let $F, F'$ be formulas and $x$ a literal.
(i) $F|_J \subseteq (F \cup F')|_J$ for every interpretation $J$.
(ii) $F$ is satisfiable iff there exists a $J$ such that $F|_J = \top$.
(iii) If $F|_x$ is unsatisfiable and $F \models x$, then $F$ is unsatisfiable.
(iv) If $F \equiv F'$, then $F|_J \equiv F'|_J$ for every interpretation $J$.
Proof. See [19, pp. 10–12]. □
3 Generic CDCL
Modern SAT solvers are based on the linearized DPLL [5] algorithm and consist
of the following components: termination criteria, a decision component that
picks the branching literals, an inference component that adds propagation
literals to the working sequence of literals, a backtracking component that rolls
back wrong decisions, and a formula management component that simplifies the
working formula. We maintain two data structures during the execution of modern
SAT solvers: the working formula and the working sequence of literals. Together
they define the state. The components are modelled as a transition relation over
the set of states; the union of the rules in Fig. 1 is then the transition relation
of Generic CDCL.
SAT-rule: $F \parallel J \leadsto_{\mathrm{SAT}} \mathrm{SAT}$ iff $F|_J = \top$.
UNSAT-rule: $F \parallel J \leadsto_{\mathrm{UNSAT}} \mathrm{UNSAT}$ iff $\bot \in F|_J$ and $J$ contains only propagation literals.
DEC-rule: $F \parallel J \leadsto_{\mathrm{DEC}} F \parallel J\,\dot{x}$ iff $x \in \mathrm{vars}(F) \cup \overline{\mathrm{vars}(F)}$ and $\{x, \overline{x}\} \cap J = \emptyset$.
INF-rule: $F \parallel J \leadsto_{\mathrm{INF}} F \parallel J\,x$ iff $F|_J \equiv_{sat} F|_{J x}$ and $\{x, \overline{x}\} \cap J = \emptyset$.
LEARN-rule: $F \parallel J \leadsto_{\mathrm{LEARN}} F \cup \{C\} \parallel J$ iff $F \models C$.
FORGET-rule: $F \parallel J \leadsto_{\mathrm{FORGET}} F \setminus \{C\} \parallel J$ iff $F \setminus \{C\} \models C$.
BACK-rule: $F \parallel J\,J' \leadsto_{\mathrm{BACK}} F \parallel J$.
INP-rule: $F \parallel \varepsilon \leadsto_{\mathrm{INP}} F' \parallel \varepsilon$ iff $F \equiv_{sat} F'$.
Fig. 1: Transition relations of Generic CDCL. These relations apply to all
formulas $F$ and $F'$, clauses $C$, literals $x$ and sequences of literals $J$ and
$J'$. $\varepsilon$ denotes the empty sequence of literals.
Formally, we model the computation of modern SAT solvers
by means of state transition systems as follows:
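Definition 2 (Generic CDCL). Generic CDCL is a state transition system
whose set of states is
$\{F \parallel J \mid F \text{ is a formula and } J \text{ is a sequence of literals}\} \cup \{\mathrm{SAT}, \mathrm{UNSAT}\}$,
whose initial state for the input formula $F$ is $\mathrm{init}(F) = F \parallel \varepsilon$,
whose set of terminal states is $\{\mathrm{SAT}, \mathrm{UNSAT}\}$, and whose
transition relation $\leadsto$ is defined as the union
$\leadsto \;:=\; \leadsto_{\mathrm{SAT}} \cup \leadsto_{\mathrm{UNSAT}} \cup \leadsto_{\mathrm{DEC}} \cup \leadsto_{\mathrm{INF}} \cup \leadsto_{\mathrm{LEARN}} \cup \leadsto_{\mathrm{FORGET}} \cup \leadsto_{\mathrm{BACK}} \cup \leadsto_{\mathrm{INP}}$.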
The SAT-rule terminates the computation with the output SAT if the reduct of
the working formula w.r.t. the working sequence of literals is the empty formula.
This condition can be decided in linear time w.r.t. the size of the working
formula $F$. By Lemma 1 (ii) the working formula is then satisfiable.
The UNSAT-rule terminates the computation with the output UNSAT if no
model of the working formula exists. This is the case when a conflict occurs at
the top level, i.e. $\bot \in F|_J$ and the sequence $J$ of literals contains only
propagation literals. These conditions can be decided in polynomial time.
The DEC-rule extends the working sequence of literals by an unassigned literal
$\dot{x}$ as a decision literal.
The INF-rule extends the working sequence of literals by a propagation literal
$x$ if the reducts of the working formula w.r.t. the working sequence of literals
and its extension are equisatisfiable.
The BACK-rule models backtracking, as well as backjumping and restarts, by
deleting a suffix of the working sequence of literals.
The LEARN-rule adds a clause $C$ to the working formula if it is entailed by
the working formula $F$. Deciding whether $F \models C$ holds is coNP-complete.
Similarly to the INF-rule, SAT solvers avoid this check by using techniques for
creating the clause $C$ that ensure this property, for example resolution.
The FORGET-rule deletes a clause $C$ from the working formula $F$ if
$F \setminus \{C\} \models C$. The question whether $F \setminus \{C\} \models C$ holds is
coNP-complete. Typically, tractable algorithms are used to identify redundant
clauses. For instance, clauses that were introduced by the LEARN-rule but have
turned out to be useless and did not participate in the elimination of other
clauses in the formula can be removed. For more details on the deletion of
clauses see [12].
The INP-rule models formula simplifications that are used in pre- and
inprocessing. It replaces the working formula with an equisatisfiable formula
when the working sequence of literals is empty.
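To illustrate how the rules interact, the following Python sketch implements one very simple instance of the transition system: a DPLL-style search that uses only the SAT-, UNSAT-, DEC-, INF- and BACK-rules. It is an illustration under stated assumptions and not the procedure of any particular solver. Literals are nonzero integers with -x as the complement of x, the working sequence is a list of pairs (literal, is_decision), and the names `reduct` and `solve` are hypothetical. After a conflict the sketch removes the last decision literal together with everything after it (BACK) and appends the complement of that decision as a propagation literal (INF); this INF step is justified because the reduct w.r.t. the prefix extended by the decision literal has just been found unsatisfiable.

```python
def reduct(formula, lits):
    """Drop satisfied clauses and delete falsified literals."""
    assigned = set(lits)
    falsified = {-lit for lit in assigned}
    return {frozenset(c - falsified) for c in formula if not (c & assigned)}

def solve(formula):
    """Return ("SAT", literals) or ("UNSAT", None)."""
    variables = {abs(lit) for clause in formula for lit in clause}
    trail = []  # working sequence: (literal, is_decision)
    while True:
        lits = [lit for lit, _ in trail]
        red = reduct(formula, lits)
        if not red:                                   # SAT-rule
            return "SAT", lits
        if frozenset() in red:                        # conflict
            if not any(dec for _, dec in trail):      # UNSAT-rule
                return "UNSAT", None
            while True:                               # BACK-rule
                lit, is_decision = trail.pop()
                if is_decision:
                    break
            trail.append((-lit, False))               # INF-rule
            continue
        unit = next((c for c in red if len(c) == 1), None)
        if unit is not None:                          # INF-rule (unit clause)
            (x,) = unit
            trail.append((x, False))
            continue
        unassigned = variables - {abs(lit) for lit in lits}
        trail.append((min(unassigned), True))         # DEC-rule
```

The sketch recomputes the reduct in every step, which is far from the data structures used in practice; the point is only that each branch of the loop corresponds to one rule of Fig. 1.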
Let $\leadsto^*$ be the reflexive and transitive closure of $\leadsto$. Let
$x \leadsto^0 x$ for all states $x$, and $x \leadsto^n z$ for all natural numbers
$n \in \mathbb{N}$ if and only if there exists a state $y$ such that
$x \leadsto^{n-1} y \leadsto z$. In the next subsection we investigate the question
whether Generic CDCL correctly solves the SAT problem. Formally, we define
Generic CDCL to be sound iff for all formulas $F_0$ we have that
$\mathrm{init}(F_0) \leadsto^* \mathrm{SAT}$ implies that $F_0$ is satisfiable and
$\mathrm{init}(F_0) \leadsto^* \mathrm{UNSAT}$ implies that $F_0$ is unsatisfiable.
3.1 Generic CDCL is Sound
Before proceeding to the soundness proof of Generic CDCL, it will be necessary
to study two invariants of Generic CDCL that are presented in the proposition
below: (i) states that the rules of Generic CDCL do not change the satisfiability
of the working formula, and (ii) states that whenever the working sequence of
literals is of the form $J_1\, x\, J_2$, where $x$ is a propagation literal, the
reducts of the working formula w.r.t. $J_1$ and $J_1 x$ are equisatisfiable.
Proposition 3 (Invariants). Let $F_0, F$ be formulas, $J$ be a sequence of literals,
and $n \in \mathbb{N}$. If $\mathrm{init}(F_0) \leadsto^n F \parallel J$, then
1. $F_0 \equiv_{sat} F$, and
2. $F|_{J_1} \equiv_{sat} F|_{J_1 x}$, for all sequences of literals $J_1, J_2$ and
propagation literals $x$ with $J = J_1\, x\, J_2$.
Proof. The claims are proven by induction on $n$. For the base case $n = 0$, 1.
follows from $F_0 = F$ and 2. holds since $J$ is empty. For the induction step,
assume that the claim holds for the state $F \parallel J$ and suppose that
$F \parallel J \leadsto_R F' \parallel J'$ where
$R \in \{\mathrm{DEC}, \mathrm{INF}, \mathrm{LEARN}, \mathrm{FORGET}, \mathrm{BACK}, \mathrm{INP}\}$:
DEC-rule: In this case, $F' = F$ and $J' = J\,\dot{x}$ for some decision literal
$\dot{x}$ with $\{x, \overline{x}\} \cap J = \emptyset$. 1. follows since
$F_0 \equiv_{sat} F$ holds by induction. 2. holds because the appended literal is a
decision literal. Formally, let $J'_1, J'_2$ be literal sequences and $y$ be a
propagation literal such that $J' = J'_1\, y\, J'_2\, \dot{x}$. By induction, we
conclude that $F|_{J'_1} \equiv_{sat} F|_{J'_1 y}$. Hence,
$F'|_{J'_1} \equiv_{sat} F'|_{J'_1 y}$.
BACK-rule: In this case, $F' = F$ and $J = J' J''$. 1. follows since
$F_0 \equiv_{sat} F$ by induction. 2. holds because the literal sequence $J'$ is a
prefix of $J$. Formally, let $J'_1, J'_2$ be literal sequences and $y$ be a
propagation literal such that $J' = J'_1\, y\, J'_2$. By induction, we conclude that
$F|_{J'_1} \equiv_{sat} F|_{J'_1 y}$, and consequently we know that
$F'|_{J'_1} \equiv_{sat} F'|_{J'_1 y}$.
LEARN-rule: In this case, $F' = F \cup \{C\}$ where $F \models C$ and $J' = J$. 1.
follows since $F \equiv F'$, as the addition of the entailed clause $C$ preserves
equivalence of the working formula. 2. follows from the reduct operator being a
semantic operator by Lemma 1 (iv), and therefore
$F'|_{J'_1} \equiv_{sat} F'|_{J'_1 y}$ holds by induction for all literal sequences
$J'_1, J'_2$ and propagation literals $y$ with $J' = J'_1\, y\, J'_2$.
FORGET-rule: This case can be treated as in the LEARN-rule.
INF-rule: In this case, $F' = F$ and $J' = J\,x$ for some propagation literal $x$
with $\{x, \overline{x}\} \cap J = \emptyset$. 1. follows since $F_0 \equiv_{sat} F$
holds by induction. 2. follows from the definition of the INF-rule: Consider the
literal sequences $J'_1, J'_2$ and a propagation literal $y$ such that
$J' = J'_1\, y\, J'_2$. In the case that $y = x$, we know that $J'_2$ is the empty
sequence of literals and consequently $F|_{J'_1} \equiv_{sat} F|_{J'_1 y}$ holds by
the definition of the INF-rule. In the case of $y \neq x$, we can conclude the claim
by induction.
INP-rule: In this case, $F' \equiv_{sat} F$ and $J'$ is the empty sequence.
Consequently, 1. holds by the definition of the INP-rule, and 2. is satisfied as
$J' = \varepsilon$. □
We can now show the main theorem in this paper.
Theorem 4 (Soundness). Generic CDCL is sound.
Proof. We divide the proof into two parts, first proving that the output SAT is
correct, and then proving that the output UNSAT is correct. Let $F_0, F$ be
formulas, $J$ be a sequence of literals and suppose that
$\mathrm{init}(F_0) \leadsto^* F \parallel J \leadsto \mathrm{SAT}$ ($\mathrm{UNSAT}$, resp.).
SAT: By the definition of the SAT-rule, we know that $F|_J = \top$. By Lemma 1 (ii),
we know that the formula $F$ is satisfiable. From the result that the formula $F$ is
satisfiable together with the property that the formulas $F_0$ and $F$ are
equisatisfiable given in Prop. 3 (1.), we conclude that the input formula $F_0$ is
satisfiable.
UNSAT: By the definition of the UNSAT-rule, we know that $\bot \in F|_J$ and the
working sequence of literals $J = (x_1 \ldots x_n)$ contains only propagation
literals. Since a conflict occurs, $F|_J$ is unsatisfiable. From the result that the
formula $F|_J$ is unsatisfiable and the literal sequence $J$ contains only
propagation literals, we can repeatedly apply Prop. 3 (2.) and obtain that the
formula $F$ is unsatisfiable. Since the formula $F$ is unsatisfiable and the
formulas $F$ and $F_0$ are equisatisfiable by Prop. 3 (1.), we conclude that $F_0$ is
unsatisfiable. □
3.2 Generic CDCL Subsumes Important SAT Solving Techniques
We now describe some important SAT solving techniques, and demonstrate that
Generic CDCL can adequately model these techniques.
Subsumption. For a formula $F$, the clause $C \in F$ subsumes the clause $D \in F$
iff $C \subseteq D$. In this case, $D$ can be deleted because
$F \setminus \{D\} \models D$. Consequently,
$F \parallel J \leadsto_{\mathrm{FORGET}} F \setminus \{D\} \parallel J$ holds for every
literal sequence $J$. Removing subsumed clauses is done as a preprocessing step in
SAT solvers and during clause learning.
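A naive subsumption check can be sketched as follows. The representation (clauses as frozensets of integer literals, formulas as sets of clauses) and the name `remove_subsumed` are assumptions of this sketch; the quadratic loop is chosen for clarity and is not how solvers implement the technique. Because a formula is a set, a subsuming clause that differs from the subsumed one is always a proper subset.

```python
def remove_subsumed(formula):
    """Delete every clause D for which another clause C with C ⊂ D exists.
    Each deletion corresponds to one FORGET step."""
    return {d for d in formula if not any(c < d for c in formula)}
```

For instance, in the formula with the clauses `{1}`, `{1, 2}` and `{2, 3}`, the clause `{1, 2}` is subsumed by `{1}` and is removed.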
Tautologies. A clause $C$ is a tautology if it contains a complementary pair of
literals. Every formula $F$ entails a tautology and the LEARN-rule in Generic CDCL
subsumes this technique. Tautologies are eliminated during preprocessing.
Conflict-Directed Backtracking and Learning [21] is an improvement of naive
backtracking that takes the reason of the conflict into account. Consider the
state $F \parallel J\,\dot{x}\,J'$ and a clause $C \in F$ where
$C|_{J \dot{x} J'} = \bot$. The clause $C$ is the conflict clause. If there is a
linear resolution derivation from the conflict clause $C$ to a clause $D$ in the
formula $F$ such that $D|_J$ is the unit clause $(y)$, the technique rewrites the
state $F \parallel J\,\dot{x}\,J'$ into the state $F \cup \{D\} \parallel J\,y$.
Conflict-directed backtracking and learning can be simulated by the following
transition steps:
$F \parallel J\,\dot{x}\,J' \leadsto_{\mathrm{BACK}} F \parallel J \leadsto_{\mathrm{LEARN}} F \cup \{D\} \parallel J \leadsto_{\mathrm{INF}} F \cup \{D\} \parallel J\,y$.
Blocked Clause Elimination [11]. A clause $C$ is blocked in the formula $F$ if it
contains a literal $x$ such that all resolvents of the clause $C$ and clauses
$D \in F$ upon $x$ are tautological. Blocked clauses are removed from a formula
during pre- and inprocessing. If $C$ is blocked in $F$, then
$F \equiv_{sat} F \setminus \{C\}$ and, therefore, the INP-rule subsumes the blocked
clause elimination technique.
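The blockedness condition can be checked directly from its definition. The sketch below assumes integer literals with -x as the complement of x and clauses as frozensets; the names `is_tautology` and `is_blocked` are hypothetical.

```python
def is_tautology(clause):
    """A clause is a tautology if it contains a complementary pair."""
    return any(-lit in clause for lit in clause)

def is_blocked(clause, formula):
    """True if the clause contains a literal x such that every resolvent
    with a clause of the formula containing -x is a tautology."""
    for x in clause:
        partners = [d for d in formula if -x in d]
        if all(is_tautology((clause - {x}) | (d - {-x})) for d in partners):
            return True
    return False
```

Note that a literal whose complement does not occur in the formula has no resolution partners, so any clause containing such a literal is blocked under this check.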
Unit Propagation. A clause that contains a single literal is a unit clause. Unit
propagation adds the propagation literal $x$ to the literal sequence $J$ whenever
the reduct of the working formula w.r.t. $J$ contains the unit clause $(x)$. Since
$F|_J \models x$, we know that $F|_J \equiv_{sat} F|_{J x}$ and consequently the
INF-rule subsumes unit propagation.
Pure Literal. A literal $x$ is pure in the formula $F$ if
$\overline{x} \notin \mathrm{lits}(F)$. For pure literals, it holds that
$F \equiv_{sat} F|_x$ and, therefore, whenever a literal $x$ is pure in the formula
$F|_J$ for some literal sequence $J$, Generic CDCL can add the pure literal to the
working literal sequence: $F \parallel J \leadsto_{\mathrm{INF}} F \parallel J\,x$.
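Detecting pure literals amounts to a single pass over the literals of the formula. The following sketch assumes integer literals with -x as the complement of x; the name `pure_literals` is hypothetical, and the function reports only pure literals that actually occur in the formula.

```python
def pure_literals(formula):
    """Literals occurring in the formula whose complement does not occur."""
    lits = {lit for clause in formula for lit in clause}
    return {lit for lit in lits if -lit not in lits}
```

Applied to the reduct of the working formula, each returned literal is a candidate for an INF step as described above.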
4 Related Work
Several attempts have been made to formalize sequential SAT solvers in terms of
transition systems or proof calculi: Abstract DPLL [18], Linearized DPLL [1],
and the Rule-based SAT solver description [15]. In general, these formalizations
are based on a notion of state, like Generic CDCL.
However, these formalizations cannot adequately model recent SAT solvers:
For instance, Linearized DPLL does not model the SAT solver MiniSAT, because
Linearized DPLL restricts decision literals to occur in the working formula, but
the solver MiniSAT can also select the complements of such literals.
Additionally, Linearized DPLL does not model formula simplification techniques
such as blocked clause elimination, or probing-based inference techniques.
Similarly, Abstract DPLL and the Rule-based SAT solver description [15] do not
model formula simplifications that change the semantics of formulas, like blocked
clause elimination. Marić highlighted the implementation of clause learning
techniques in his Rule-based SAT solver description [15], but it does not include
recent developments such as clause strengthening. All these formalizations
consider DPLL-based SAT solvers, but the ancient pure literal rule is not subsumed
in these systems. In contrast, Generic CDCL subsumes all recent SAT techniques to
the best of our knowledge.
In [12] Järvisalo et al. developed a formal system to model clause learning,
forgetting and formula simplification techniques in order to understand the
side effects of the combination of different rules. They drew our attention to
the interplay of the learned clause database with inprocessing techniques. The
interplay of clause sharing and formula simplification techniques in parallel SAT
solvers was analyzed in [14], where the state of a sequential SAT solver was
modelled just as the working formula. We believe that Generic CDCL is an
important fragment for understanding sequential SAT solvers with inprocessing and
their cooperation in the parallel-portfolio setting with clause sharing.
5 Conclusion
The propositional satisfiability problem is of great practical interest and can be
efficiently answered by modern SAT solvers like Riss, Lingeling or MiniSAT.
Today, modern SAT solvers are highly tuned proof procedures with many advanced
techniques. Therefore, it is desirable to investigate SAT solving techniques in
combination with each other and to abstract from implementations.
In this paper, we developed Generic CDCL, a formalism that models the
computation of modern SAT solvers in terms of a state transition system, where
each transition rule abstracts a component in a SAT solver. In particular, the
transition rules INF and INP model formula simplification techniques like blocked
clause elimination and inference techniques such as the pure literal rule. We
have examined invariants in Generic CDCL and have shown that Generic CDCL
is sound. In contrast to previous work on formalizations of SAT solvers, we
can model all recent techniques. The findings add to our understanding of the
interplay of inprocessing techniques with the other components of SAT solvers.
A limitation of Generic CDCL is its lack of detail in the learning and
inference components. As future work, we plan to investigate properties such as
completeness, confluence and termination.
References
1. Arnold, H.: A linearized DPLL calculus with clause learning. Tech. rep.,
Universität Potsdam, Institut für Informatik (2009)
2. Audemard, G., Simon, L.: Predicting learnt clauses quality in modern SAT solvers.
In: Boutilier, C. (ed.) IJCAI 2009. pp. 399–404. Morgan Kaufmann Publishers Inc.,
Pasadena (2009)
3. Biere, A., Heule, M., van Maaren, H., Walsh, T. (eds.): Handbook of Satisfiability.
IOS Press, Amsterdam (2009)
4. Cook, S.A.: The complexity of theorem-proving procedures. In: Harrison, M.A.,
Banerji, R.B., Ullman, J.D. (eds.) STOC 1971. pp. 151–158. ACM (1971)
5. Davis, M., Logemann, G., Loveland, D.: A machine program for theorem-proving.
Commun. ACM 5(7), 394–397 (1962)
6. Eén, N., Sörensson, N.: An extensible SAT-solver. In: Giunchiglia, E., Tacchella,
A. (eds.) SAT 2003. LNCS, vol. 2919, pp. 502–518. Springer, Heidelberg (2004)
7. Gelfond, M., Lifschitz, V.: The stable model semantics for logic programming. In:
Kowalski, R.A., Bowen, K.A. (eds.) ICLP 1988. pp. 1070–1080. MIT Press (1988)
8. Gomes, C.P., Selman, B., Crato, N., Kautz, H.: Heavy-tailed phenomena in
satisfiability and constraint satisfaction problems. J. Autom. Reason. 24(1–2),
67–100 (2000)
9. Großmann, P., Hölldobler, S., Manthey, N., Nachtigall, K., Opitz, J., Steinke, P.:
Solving periodic event scheduling problems with SAT. In: Jiang, H., Ding, W.,
Ali, M., Wu, X. (eds.) IEA/AIE 2012. LNCS, vol. 7345, pp. 166–175. Springer,
Heidelberg (2012)
10. Hölldobler, S., Manthey, N., Saptawijaya, A.: Improving resource-unaware SAT
solvers. In: Fermüller, C.G., Voronkov, A. (eds.) LPAR 2010. LNCS, vol. 6397,
pp. 519–534. Springer, Heidelberg (2010)
11. Järvisalo, M., Biere, A., Heule, M.: Blocked clause elimination. In: Esparza, J.,
Majumdar, R. (eds.) TACAS 2010. LNCS, vol. 6015, pp. 129–144. Springer,
Heidelberg (2010)
12. Järvisalo, M., Heule, M.J.H., Biere, A.: Inprocessing rules. In: Gramlich, B.,
Miller, D., Sattler, U. (eds.) IJCAR 2012. LNCS, vol. 7364, pp. 355–370. Springer,
Heidelberg (2012)
13. Lynce, I., Marques-Silva, J.: Efficient haplotype inference with Boolean
satisfiability. In: AAAI 2006. pp. 104–109. AAAI Press (2006)
14. Manthey, N., Philipp, T., Wernhard, C.: Soundness of inprocessing in clause sharing
SAT solvers. In: Järvisalo, M., Van Gelder, A. (eds.) SAT 2013. LNCS, vol. 7962,
pp. 22–39. Springer, Heidelberg (2013)
15. Marić, F.: Formalization and implementation of modern SAT solvers. J. Autom.
Reason. 43(1), 81–119 (2009)
16. Marques Silva, J.P., Sakallah, K.A.: GRASP: A search algorithm for propositional
satisfiability. IEEE Transactions on Computers 48(5), 506–521 (1999)
17. Moskewicz, M.W., Madigan, C.F., Zhao, Y., Zhang, L., Malik, S.: Chaff:
Engineering an efficient SAT solver. In: DAC 2001. pp. 530–535. ACM, New York (2001)
18. Nieuwenhuis, R., Oliveras, A., Tinelli, C.: Abstract DPLL and abstract DPLL
modulo theories. In: Baader, F., Voronkov, A. (eds.) LPAR 2004. LNCS, vol. 3452,
pp. 36–50. Springer, Heidelberg (2005)
19. Philipp, T.: Expressive Models for Parallel Satisfiability Solvers. Master's thesis,
Technische Universität Dresden (2013)
20. Rossi, F., Beek, P.v., Walsh, T.: Handbook of Constraint Programming. Elsevier
Science Inc., New York (2006)
21. Silva, J.P.M., Sakallah, K.A.: GRASP - a new search algorithm for satisfiability.
In: ICCAD 1996. pp. 220–227. IEEE Computer Society, Washington (1996)
... Many formula simplifications were proposed to speed up SAT solvers. Formalizations of SAT solvers [1,20,30,31,33] are necessary tools to study soundness of various systems, but they do not guarantee the absence of bugs. For solving this issue, the DRAT proof format was developed such that SAT solvers can emit a witness of unsatisfiability which can be easily verified [16,38]. ...
Conference Paper
Full-text available
Many real world problems are solved with satisfiability testing (SAT). However, SAT solvers have documented bugs and therefore the answer that a formula is unsatisfiable can be incorrect. Certifying algorithms are an attractive approach to increase the reliability of SAT solvers. For unsatisfiable formulas an unsatisfiability proof has to be created. This paper presents certificate constructions for various formula simplification techniques, which are crucial to the success of modern SAT solvers.
... On the other hand, since they are used in proving properties of hardware, software and even mathematical theorems, the whole proof often relies on their correctness. Several attempts at proving SAT solver correctness have been made [11,16,17,23,30]. However, better results are usually obtained by extending SAT solvers to provide certificates for their results (models for satisfiable and proofs for unsatisfiable formulas). ...
Article
Full-text available
A conjecture originally made by Klein and Szekeres in 1932 (now commonly known as “Erdős–Szekeres” or “Happy Ending” conjecture) claims that for every \(m \ge 3\), every set of \(2^{m-2}+1\) points in a general position (none three different points are collinear) contains a convex m-gon. The conjecture has been verified for \(m \le 6\). The case \(m=6\) was solved by Szekeres and Peters and required a huge computer enumeration that took “more than 3000 GHz hours”. In this paper we improve the solution in several directions. By changing the problem representation, by employing symmetry-breaking and by using modern SAT solvers, we reduce the proving time to around only a half of an hour on an ordinary PC computer (i.e., our proof requires only around 1 GHz hour). Also, we formalize the proof within the Isabelle/HOL proof assistant, making it significantly more reliable.
... The formalisms presented in [1,30,34] models clause learning sequential SAT solvers, but they are inadequate to express advanced features like restarts, preprocessing, inprocessing and the ability to share clauses. But, Generic CDCL [20] can handle the above techniques. A formalization in [29] was introduced, that models portfolio solvers with inprocessing, and it was argued that this can be extended to guiding path solvers. ...
Conference Paper
Full-text available
Inprocessing is to apply clause simplification techniques during search and is a very attractive idea since it allows to apply computationally expensive techniques. In this paper we present the search space decomposition formalism SSD that models parallel SAT solvers with clause sharing and inprocessing. The main result of this paper is that the sharing model SSD is sound. In the formalism, we combine clause addition and clause elimination techniques, and it covers many SAT solvers such as PaSAT, PaMira, PMSat, MiraXT and ccc. Inprocessing is not used in these solvers, and we therefore propose a novel way how to combine clause sharing, search space decomposition and inprocessing.
Conference Paper
In many recent applications of model counting, not all variables are relevant for a specific problem. For instance, redundant variables are added during formula transformation. In projected model counting these redundant variables are ignored by projecting models onto the relevant variables. Inspired by dual propagation, which has its origin in solving quantified Boolean formulae and works jointly on both the original formula and its negation, we present a novel calculus for dual projected model counting. It captures existing techniques such as blocking clauses and chronological as well as non-chronological backtracking, and it also introduces new concepts, including discounting and dual conflict analysis, to obtain partial models. Experiments demonstrate the benefit of our approach.
Conference Paper
We present a formalism that models the computation of clause-sharing portfolio solvers with inprocessing. The soundness of these solvers is not a straightforward property, since shared clauses can make a formula unsatisfiable. Therefore, we develop characterizations of simplification techniques and suggest various settings in which clause sharing and inprocessing can be combined. Our formalization models most of the recently implemented portfolio systems, and we indicate possibilities to improve them. A particular improvement is a novel way to combine clause addition techniques, like blocked clause addition, with clause deletion techniques, like blocked clause elimination or variable elimination.
Conference Paper
In this paper, periodic event scheduling problems (PESP) are encoded as satisfiability problems (SAT) and solved by a state-of-the-art SAT solver. Two encodings, based on direct and order encoded domains, are presented. An experimental evaluation suggests that the SAT-based approach using order encoding outperforms constraint-based PESP solvers, which until now were considered to be the best solvers for PESP. This opens the possibility to model significantly larger real-world problems.
Article
Most, if not all, state-of-the-art complete SAT solvers are complex variations of the DPLL procedure described in the early 1960s. Published descriptions of these modern algorithms and related data structures are given either as high-level state transition systems or, informally, as (pseudo) programming language code. The former, although often accompanied by (informal) correctness proofs, are usually very abstract and do not specify many details crucial for efficient implementation. The latter usually do not involve any correctness argument, and the given code is often hard to understand and modify. This paper aims to bridge this gap by presenting SAT solving algorithms that are formally proved correct and also contain the information required for efficient implementation. We use a tutorial, top-down approach and develop a SAT solver, starting from a simple design that is subsequently extended, step by step, with a requisite series of features. The heuristic parts of the solver are abstracted away, since they usually do not affect solver correctness (although they are very important for efficiency). All algorithms are given in pseudo-code and are accompanied by correctness conditions, given in Hoare logic style. The correctness proofs are formalized within the Isabelle theorem proving system and are available in the extended version of this paper. The given pseudo-code served as a basis for our SAT solver argo-sat.
Conference Paper
The paper discusses cache utilization in state-of-the-art SAT solvers. The aim of the study is to show how a resource-unaware SAT solver can be improved by utilizing the cache sensibly. The analysis is performed on a CDCL-based SAT solver using a subset of the industrial SAT Competition 2009 benchmark. For the analysis, the total cycles, the resource stall cycles, the L2 cache hits and the L2 cache misses are traced using sample-based profiling. Based on the analysis, several techniques, some of which have not been used in SAT solvers so far, are proposed, resulting in a combined speedup of up to 83% without affecting the search path of the solver. The average speedup on the benchmark is 60%. The new techniques are also applied to MiniSAT2.0, improving its runtime by 20% on average.
Article
Many formal descriptions of DPLL-based SAT algorithms either do not include all essential proof techniques applied by modern SAT solvers or are bound to particular heuristics or data structures. This makes it difficult to analyze proof-theoretic properties or the search complexity of these algorithms. In this paper we try to improve this situation by developing a nondeterministic proof calculus that models the functioning of SAT algorithms based on the DPLL calculus with clause learning. This calculus is independent of implementation details yet precise enough to enable a formal analysis of realistic DPLL-based SAT algorithms.
Article
Constraint programming is a powerful paradigm for solving combinatorial search problems that draws on a wide range of techniques from artificial intelligence (AI), operations research, algorithms, and graph theory. The basic idea in constraint programming is that the user states the constraints, and a general-purpose constraint solver is used to solve them. Constraints are just relations, and a constraint satisfaction problem (CSP) states the relations that should hold among the given decision variables. A constraint satisfaction problem consists of a set of variables, each with some domain of values, and a set of relations on subsets of these variables. For example, in scheduling exams at a university, the decision variables might be the times and locations of different exams, and the constraints might be on the capacity of each examination room (e.g., we cannot schedule more students to sit for exams in a given room at any one time than the room's capacity) and on the exams scheduled at the same time (e.g., we cannot schedule two exams at the same time if they share students). Constraint solvers take a real-world problem like this, represented in terms of decision variables and constraints, and find an assignment to all the variables that satisfies the constraints. Extensions of this framework may involve, for example, finding optimal solutions according to one or more optimization criteria (e.g., minimizing the number of days over which exams need to be scheduled), finding all solutions, replacing (some or all) constraints with preferences, and considering a distributed setting where constraints are distributed among several agents.
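The exam-scheduling example above can be made concrete with a minimal sketch. It assumes a tiny, hypothetical instance (the exam names, time slots, and the `solve_csp` helper are illustrative and not taken from the abstract) and uses plain exhaustive enumeration rather than the propagation and search techniques a real constraint solver would apply.

```python
from itertools import product


def solve_csp(variables, domains, constraints):
    """Return the first assignment satisfying every constraint, or None.

    Brute-force enumeration over the Cartesian product of the domains;
    only meant to illustrate the variables/domains/constraints structure.
    """
    for values in product(*(domains[v] for v in variables)):
        assignment = dict(zip(variables, values))
        if all(check(assignment) for check in constraints):
            return assignment
    return None


# Hypothetical instance: three exams, two time slots.
exams = ["algebra", "logic", "physics"]
slots = ["mon", "tue"]
domains = {exam: slots for exam in exams}

# Pairs of exams that share students must not be scheduled at the same time.
shared_students = [("algebra", "logic"), ("logic", "physics")]
constraints = [
    (lambda a, x=x, y=y: a[x] != a[y]) for x, y in shared_students
]

solution = solve_csp(exams, domains, constraints)
print(solution)
```

Adding a third conflicting pair, such as algebra and physics, makes this two-slot instance unsatisfiable, and `solve_csp` then returns `None`.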
Article
One of the main topics of research in genomics is determining the relevance of mutations, described in haplotype data, as causes of some genetic diseases. However, due to technological limitations, genotype data rather than haplotype data is usually obtained. The haplotype inference by pure parsimony (HIPP) problem consists in inferring haplotypes from genotypes s.t. the number of required haplotypes is minimum. Previous approaches to the HIPP problem have focused on integer programming models and branch-and-bound algorithms. In contrast, this paper proposes the utilization of Boolean Satisfiability (SAT). The proposed solution entails a SAT model, a number of key pruning techniques, and an iterative algorithm that enumerates the possible solution values for the target optimization problem. Experimental results, obtained on a wide range of instances, demonstrate that the SAT-based approach can be several orders of magnitude faster than existing solutions. Besides being more efficient, the SAT-based approach is also the only one capable of computing the solution for a large number of instances.
Conference Paper
It is shown that any recognition problem solved by a polynomial time-bounded nondeterministic Turing machine can be "reduced" to the problem of determining whether a given propositional formula is a tautology. Here "reduced" means, roughly speaking, that the first problem can be solved deterministically in polynomial time provided an oracle is available for solving the second. From this notion of reducible, polynomial degrees of difficulty are defined, and it is shown that the problem of determining tautologyhood has the same polynomial degree as the problem of determining whether the first of two given graphs is isomorphic to a subgraph of the second. Other examples are discussed. A method of measuring the complexity of proof procedures for the predicate calculus is introduced and discussed.
Conference Paper
We introduce Abstract DPLL, a general and simple abstract rule-based formulation of the Davis-Putnam-Logemann-Loveland (DPLL) procedure. Its properties, such as soundness, completeness or termination, immediately carry over to the modern DPLL implementations with features such as non-chronological backtracking or clause learning. This allows one to formally reason about practical DPLL algorithms in a simple way. In the second part of this paper we extend the framework to Abstract DPLL modulo theories. This allows us to express—and formally reason about—state-of-the-art concrete DPLL-based techniques for satisfiability modulo background theories, such as the different lazy approaches, or our DPLL(T) framework.