
A Fast Disprover for VeriFun

Abstract

We present a disprover for universal formulas stating conjectures about functional programs. The quantified variables range over freely generated polymorphic data types, thus the domains of discourse are infinite in general. The objective in the development was to quickly find counter-examples for as many false conjectures as possible without wasting too much time on true statements. We present the reasoning method underlying the disprover and illustrate its practical value in several applications in an experimental version of the verification tool VeriFun.
The 2006 Federated Logic Conference
The Seattle Sheraton Hotel and Towers
Seattle, Washington
August 10 - 22, 2006

IJCAR’06 Workshop
DISPROVING’06:
Non-Theorems, Non-Validity, Non-Provability
August 16th, 2006
Proceedings
Editors:
W. Ahrendt, P. Baumgartner, H. de Nivelle
A Fast Disprover for VeriFun
Markus Aderhold, Christoph Walther, Daniel Szallies, and Andreas Schlosser
Fachgebiet Programmiermethodik, Technische Universität Darmstadt, Germany
{aderhold, chr.walther, szallies, schlosser}@pm.tu-darmstadt.de
1 Introduction
As a common experience, specifications are faulty and programs do not meet
their intention. Program bugs range from easily detected simple lapses (such
as not excluding division by zero or typos when setting array bounds) to deep
logical flaws in the program design which emerge elsewhere in the program and
therefore are hard to discover.
But programmers’ faults are not the only source of bugs. State-of-the-art
verifiers synthesize conjectures about a program that are needed (or are at least
useful) in the course of verification: Statements may be generalized to be qualified
for a proof by induction, the verifier might generate termination hypotheses that
ensure a procedure’s termination, or it might synthesize conjectures justifying
an optimization of a procedure. Sometimes these conjectures can be faulty, i.e.
over-generalizations might result, the verifier comes up with a wrong idea for
termination, or an optimization simply does not apply.
Verifying that a program meets its specification is a waste of time in all these
cases, and therefore one should begin with testing the program beforehand. How-
ever, as testing is a time consuming and boring task, machine support is welcome
(not to say needed) to relieve the human from the test-and-verify cycle. Program
testing can be reformulated as a verification problem: A program conjecture $\varphi$
fails the test if the negation of $\varphi$ can be verified. However, for proving these
negated conjectures, a special verifier, called a disprover, is needed.
In this paper, we present such a disprover for statements about programs
written in the functional programming language L [14], which has been
integrated into an experimental version [12] of the interactive verification tool
VeriFun [15, 16]. The procedures of L-programs operate over freely generated
polymorphic data types and are defined by using recursion, case analyses,
let-expressions, and functional composition. The data types bool for Boolean values
and $\mathbb{N}$ for natural numbers as well as equality $= \,:\, @A \times @A \to \mathit{bool}$ and
a procedure $> \,:\, \mathbb{N} \times \mathbb{N} \to \mathit{bool}$ deciding the $>$-relation on $\mathbb{N}$ are predefined
in L. Figure 1 shows an example of an L-program that defines a polymorphic
data type list[@A], list concatenation <>, and list reversal rev. In this program,
the symbols true, false are constructors of type bool, $^{-}(\ldots)$ is the selector of
the $\mathbb{N}$-constructor $^{+}(\ldots)$, and hd and tl are the selectors of constructor :: for
lists. Subsequently, we let $\Sigma(P)$ denote the signature of all function symbols
defined by an L-program $P$, and $\Sigma(P)^c$ is the signature of all constructor function
symbols in $P$. An operational semantics for L-programs $P$ is defined by an
interpreter $\mathit{eval}_P : \mathcal{T}(\Sigma(P)) \to \mathcal{T}(\Sigma(P)^c)$ which maps ground terms to constructor
ground terms of the respective monomorphic data types using the definition of
the procedures and data types in $P$, cf. [10, 14, 18].

structure bool <= true, false
structure ℕ <= 0, ⁺(⁻ : ℕ)
structure list[@A] <= ε, [infix] ::(hd : @A, tl : list[@A])
function [outfix] |(k : list[@A])| : ℕ <=
  if k = ε then 0 else ⁺(|tl(k)|) end
function [infix] <>(k, l : list[@A]) : list[@A] <=
  if k = ε then l else hd(k) :: (tl(k) <> l) end
function rev(k : list[@A]) : list[@A] <=
  if k = ε then ε else rev(tl(k)) <> hd(k) :: ε end
lemma rev <> <=
  ∀ k, l : list[@A] rev(k <> l) = rev(k) <> rev(l)
Fig. 1. A simple L-program
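As a rough illustration, the program of Fig. 1 can be transliterated into Python. This is only a sketch under stated assumptions: Python lists stand in for list[@A], Python integers for the natural numbers, and the names `length`, `append`, and `lemma_rev_append` are chosen here for the outfix |.|, the infix <>, and the lemma body; none of them are part of L.

```python
def length(k):
    """|k|: number of elements, by recursion on tl(k)."""
    return 0 if k == [] else 1 + length(k[1:])

def append(k, l):
    """k <> l: list concatenation by recursion on k."""
    return l if k == [] else [k[0]] + append(k[1:], l)

def rev(k):
    """rev(k): list reversal defined via append, as in Fig. 1."""
    return [] if k == [] else append(rev(k[1:]), [k[0]])

def lemma_rev_append(k, l):
    """Body of lemma "rev <>" from Fig. 1, evaluated on concrete lists."""
    return rev(append(k, l)) == append(rev(k), rev(l))
```

Evaluating `lemma_rev_append([0], [1])` compares `[1, 0]` with `[0, 1]` and therefore yields `False`, so the lemma as stated in Fig. 1 does not hold for all lists.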
In L, statements about programs are given by expressions of the form lemma
name <= $\forall x_1{:}\tau_1, \ldots, x_n{:}\tau_n\; b$ (cf. Fig. 1), where $b$ (called the body of the
lemma) is a Boolean term built with the variables $x_i$ (of type $\tau_i$) from a set $\mathcal{V}$
of typed variables and the function symbols in $\Sigma(P)$, where case analyses (like
in procedure definitions) and the truth values are used to represent connectives.
Hence the proof obligations we are concerned with have the general form of
universal formulas $\varphi = \forall x_1{:}\tau_1, \ldots, x_n{:}\tau_n\; b$. Disproving such a formula $\varphi$ is equivalent
to proving its negation $\neg\varphi \approx \exists x_1{:}\tau_1, \ldots, x_n{:}\tau_n\; \neg b$, and as the domain of each
type $\tau_i$ can be enumerated, disproving $\varphi$ is a semi-decidable problem. A disproof
of $\varphi$ (also called a witness of $\neg\varphi$) can be represented by a constructor ground
substitution $\sigma$ such that $\mathit{eval}_P(\sigma(b)) = \mathit{false}$. Consequently, disproving $\varphi$ can be
viewed as solving the (semi-decidable) equational problem $b \doteq \mathit{false}$.
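Semi-decidability can be made concrete with a naive enumeration procedure. The sketch below is not the method developed in this paper; it merely enumerates constructor ground values of growing size for a conjecture over lists of natural numbers. The function names, the size measure, and the encoding of a conjecture body as a Python predicate are assumptions made for illustration.

```python
from itertools import product

def lists_within(n):
    """All lists of length <= n whose elements are naturals <= n."""
    for length in range(n + 1):
        for elems in product(range(n + 1), repeat=length):
            yield list(elems)

def find_witness(body, arity, max_bound=None):
    """Search for arguments that make `body` evaluate to False.

    Argument tuples are enumerated for the bounds 0, 1, 2, ...; without
    `max_bound` the search does not terminate on a true conjecture.
    """
    n = 0
    while max_bound is None or n <= max_bound:
        for args in product(list(lists_within(n)), repeat=arity):
            if not body(*args):
                return args
        n += 1
    return None  # gave up: no witness within the bound

def commutative(k, l):
    # body of the conjecture  k <> l = l <> k
    return k + l == l + k
```

Here `find_witness(commutative, 2)` returns `([0], [1])`. On a true conjecture the unbounded search runs forever, which is why a bound (and with it incompleteness) is needed in an interactive setting.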
To solve such an equation, we develop two disproving calculi that constitute
the two phases of our disprover. The inference rules of both calculi are
inspired by the calculus proposed in [6] by Comon and Lescanne for solving
equational problems. As disproving is semi-decidable, a complete disprover can
be developed. However, as truth of universal formulas $\varphi$ is not semi-decidable
by Gödel's first incompleteness theorem, disproving $\varphi$ is undecidable. Therefore
a complete (and sound) disprover need not terminate. But the use in an
interactive environment (such as the one VeriFun provides) requires termination of
all subsystems, hence completeness must be sacrificed in favor of termination.
The use in an interactive environment also demands runtime performance, so
particular care is taken to achieve early failure on non-disprovable conjectures.
In Section 2, we explain how we disprove universal formulas $\varphi$. In Section 3,
we demonstrate the practical use of our disprover when VeriFun employs it in
different disproving applications. We compare our proposal with related work in
Section 4 and conclude with an outlook on future work in Section 5.
2 Disproving Universal Formulas
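Our disproving method proceeds in two phases. The first phase is based on
the elimination calculus (E-calculus for short). Its language is given by
$\mathcal{L}_E := \{\langle E, \sigma\rangle \in \mathbb{L}_E \times \mathit{Sub} \mid \mathcal{V}(E) \cap \mathrm{dom}(\sigma) = \emptyset\}$: $\mathbb{L}_E$ is the set of clauses in which
atoms are built with terms from $\mathcal{T}(\Sigma(P), \mathcal{V})$ and the predicate symbols $\doteq$ of
type $@A \times @A$ and $\gtrdot$ of type $\mathbb{N} \times \mathbb{N}$; negative literals are written $t_1 \not\doteq t_2$ or $t_1 \not\gtrdot t_2$,
respectively. $\mathit{Sub}$ denotes the set of all constructor ground substitutions $\sigma$, i.e.
$\sigma(v) \in \mathcal{T}(\Sigma(P)^c)$ for each $v \in \mathrm{dom}(\sigma)$. The inference rules of the E-calculus
(defined below) are of the form "$\dfrac{\langle E, \sigma\rangle}{\langle \lambda(E'), \sigma \cup \lambda\rangle}$, if cond", where cond stands for
a side condition that has to be satisfied to apply the rule, and $\lambda \in \mathit{Sub}$. An
E-deduction is a sequence $\langle E_1, \sigma_1\rangle, \ldots, \langle E_n, \sigma_n\rangle$ such that for each $i$, $\langle E_{i+1}, \sigma_{i+1}\rangle$
originates from $\langle E_i, \sigma_i\rangle$ by applying an E-inference rule, and $\langle E_1, \sigma_1\rangle \vdash_E \langle E_n, \sigma_n\rangle$
denotes the existence of such an E-deduction.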
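The second phase of our disproving method uses the solution calculus
(S-calculus for short). It operates on $\mathcal{L}_S := \{\langle E, \sigma\rangle \in \mathbb{L}_S \times \mathit{Sub} \mid \mathcal{V}(E) \cap \mathrm{dom}(\sigma) = \emptyset\}$:
$\mathbb{L}_S \subset \mathbb{L}_E$ is the set of clauses in which atoms are formed with predicate
symbols $\doteq$, $\gtrdot$ and terms from $\mathcal{T}(\Sigma(P'), \mathcal{V})$, where $\Sigma(P')$ emerges from $\Sigma(P)$
by removing the function symbols if and = as well as all procedure function
symbols. The form of the S-inference rules (defined below) and deduction $\vdash_S$
are defined identically to the E-calculus, and $\langle E, \sigma\rangle \vdash_{S \circ E} \langle E'', \sigma''\rangle$ denotes the
existence of a composed deduction $\langle E, \sigma\rangle \vdash_E \langle E', \sigma'\rangle \vdash_S \langle E'', \sigma''\rangle$.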
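A substitution $\sigma$ is an E-substitution for a clause $E \in \mathbb{L}_E$, $\sigma \in \mathit{Sub}_E$ for short,
iff $\sigma(v) \in \mathcal{T}(\Sigma(P)^c)$ for each $v \in \mathcal{V}(E)$. We write $\sigma \models l$ if an $\{l\}$-substitution $\sigma$
solves an E-literal $l$, defined by $\sigma \models t_1 \doteq t_2$ iff $\mathit{eval}_P(\sigma(t_1)) = \mathit{eval}_P(\sigma(t_2))$ and
$\sigma \models t_1 \gtrdot t_2$ iff $\mathit{eval}_P(\sigma(t_1)) > \mathit{eval}_P(\sigma(t_2))$. An E-clause $E$ is solved by $\sigma \in \mathit{Sub}_E$,
$\sigma \models E$ for short, iff $\sigma \models l$ for each $l \in E$. Both calculi are sound in the sense that
$\langle E, \sigma\rangle \vdash_{\ldots} \langle E', \sigma'\rangle$ entails $\theta \models \sigma'(E)$ for each $\theta \in \mathit{Sub}_{E'}$ with $\theta \models E'$, and $\sigma \subseteq \sigma'$.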
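The solution relation on literals can be sketched for the special case of terms over the natural-number constructors only. The encoding below is an assumption for illustration: variables are Python strings, `('0',)` and `('succ', t)` stand for the two constructors, and the relation tags `'eq'`, `'neq'`, `'gt'`, `'ngt'` stand for the positive and negative literals; negative literals are read here as the negation of their positive counterparts. Procedure calls, which the interpreter of the paper also evaluates, are left out.

```python
def apply_subst(sigma, t):
    """Replace variables (strings) in term t by their images under sigma."""
    if isinstance(t, str):
        return sigma[t]
    return (t[0],) + tuple(apply_subst(sigma, a) for a in t[1:])

def eval_nat(t):
    """Evaluate a ground term built from ('0',) and ('succ', t) to an int."""
    return 0 if t[0] == '0' else 1 + eval_nat(t[1])

def solves(sigma, literal):
    """sigma solves a literal (rel, t1, t2), rel in {'eq', 'neq', 'gt', 'ngt'}."""
    rel, t1, t2 = literal
    a = eval_nat(apply_subst(sigma, t1))
    b = eval_nat(apply_subst(sigma, t2))
    return {'eq': a == b, 'neq': a != b, 'gt': a > b, 'ngt': not a > b}[rel]

def solves_clause(sigma, clause):
    """sigma solves a clause iff it solves every literal of the clause."""
    return all(solves(sigma, lit) for lit in clause)
```

For example, the substitution `{'x': ('succ', ('0',)), 'y': ('0',)}` solves the clause `[('gt', 'x', 'y'), ('eq', 'x', ('succ', 'y'))]`.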
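To disprove a conjecture $\varphi = \forall x_1{:}\tau_1, \ldots, x_n{:}\tau_n\; b$, we search for a deduction
$\langle \{b \doteq \mathit{false}\}, \varepsilon\rangle \vdash_{S \circ E} \langle \emptyset, \sigma\rangle$.¹ Hence $\sigma$ represents a disproof of $\varphi$, as $\sigma \models b \doteq \mathit{false}$.

The inference rules of both calculi are given in the subsequent paragraphs.
The most important rules are formally defined whereas others (denoted by rule
numbers in italics) are only informally described for the sake of brevity. In order
to reduce the depth of the terms in E- and S-literals, some of the rules introduce
fresh variables (called "auxiliary unknowns" in [6]), which we denote by $w$ and
$w'$. Terms are written as $t$, $t_1$ and $t_2$, and $v$ and $v'$ denote variables.

(1) $\dfrac{\langle E \uplus \{l\}, \sigma\rangle}{\langle E \cup \{l[\pi \leftarrow t_2],\, t_1 \doteq \mathit{true}\}, \sigma\rangle \;\mid\; \langle E \cup \{l[\pi \leftarrow t_3],\, t_1 \doteq \mathit{false}\}, \sigma\rangle}$,
    if $l|_\pi = \mathit{if}(t_1, t_2, t_3)$

(2) $\dfrac{\langle E \uplus \{l\}, \sigma\rangle}{\langle E \cup \{l[\pi \leftarrow w],\, w \doteq f(\ldots)\}, \sigma\rangle}$,
    if $f \in \Sigma_{proc}$, $l|_\pi = f(\ldots)$, and $\pi \notin \{1, 2\}$

(3) $\dfrac{\langle E \uplus \{l\}, \sigma\rangle}{\langle E \cup \{l[\pi \leftarrow \rho(r)]\} \cup E_{cond}, \sigma\rangle}$,
    if $l|_\pi = f(t_1, \ldots, t_n)$ for some $\pi \in \{1, 2\}$ and $\langle C, \overline{C}, r\rangle \in D_f$ for $f \in \Sigma_{proc}$,
    where $E_{cond} = \bigcup_{c \in C} \{\rho(c) \doteq \mathit{true}\} \cup \bigcup_{c \in \overline{C}} \{\rho(c) \doteq \mathit{false}\}$ and
    $\rho := \{x_1/t_1, \ldots, x_n/t_n\}$ for the formal parameters $x_i$ of $f$

(5) $\dfrac{\langle E \uplus \{v \doteq \mathit{cons}(t_1, \ldots, t_n),\, v \doteq \mathit{cons}(t'_1, \ldots, t'_n)\}, \sigma\rangle}{\langle E \cup \{v \doteq \mathit{cons}(w_1, \ldots, w_n)\} \cup \bigcup_{j=1}^{n} \{w_j \doteq t_j,\, w_j \doteq t'_j\}, \sigma\rangle}$

(6) $\dfrac{\langle E \uplus \{v \doteq \mathit{cons}(t_1, \ldots, t_n),\, v \not\doteq v',\, v' \doteq \mathit{cons}(t'_1, \ldots, t'_n)\}, \sigma\rangle}{\langle E \cup \{v \doteq \mathit{cons}(w_1, \ldots, w_n),\, v' \doteq \mathit{cons}(w'_1, \ldots, w'_n)\} \cup \{w_i \not\doteq w'_i\} \cup \bigcup_{j=1}^{n} \{w_j \doteq t_j,\, w'_j \doteq t'_j\}, \sigma\rangle}$,
    if $i \in \{1, \ldots, n\}$

(7) $\dfrac{\langle E \uplus \{l\}, \sigma\rangle}{\langle E \cup \{l[\pi \leftarrow w_i],\, w \doteq \mathit{cons}(w_1, \ldots, w_n),\, w \doteq t\}, \sigma\rangle}$,
    if $l|_\pi = \mathit{sel}_i(t)$ ²

(8) $\dfrac{\langle E \uplus \{{}^{+}(t_1) \diamond {}^{+}(t_2)\}, \sigma\rangle}{\langle E \cup \{t_1 \diamond t_2\}, \sigma\rangle}$,
    if $\diamond \in \{\gtrdot, \not\gtrdot\}$

Fig. 2. Inference rules of the E-calculus

2.1 Inference rules of the E-calculus
The E-calculus consists of the inference rules (1)–(3) and (5)–(8) of Fig. 2 plus
rules (4) and (9)–(10) described informally. The purpose of the E-inference rules
is to eliminate all occurrences of if, =, and of procedure function symbols so that
some $\langle E, \sigma\rangle \in \mathcal{L}_S$ is obtained by an E-deduction $\langle \{b \doteq \mathit{false}\}, \varepsilon\rangle \vdash_E \langle E, \sigma\rangle$.³ All
rules are supplied with an additional side condition (*) demanding $E \notin \mathfrak{L}$ for
each $\langle E, \sigma\rangle$ they apply to, where $\mathfrak{L}$ is the set of all E-clauses containing evident
contradictions such as $\{t \not\doteq t, \ldots\}$, $\{0 \gtrdot t, \ldots\}$ or $\{t \doteq 0,\, t \doteq 1, \ldots\}$. This
proviso corresponds to the elimination of trivial disequations and the clash rule
for equations in [6].

¹ Since the domain of each data type is at most countably infinite, we actually use
monomorphic types $\tau'_i$ instead of the polymorphic types $\tau_i$ in $\varphi$ without loss of
generality. Type $\tau'_i$ originates from type $\tau_i$ by instantiating each type variable in $\tau_i$ with
type $\mathbb{N}$. E.g., to disprove $\forall k, l{:}\mathit{list}[@A]\; k <> l = l <> k$, the monomorphic instance
$\forall k, l{:}\mathit{list}[\mathbb{N}]\; k <> l = l <> k$ is considered.
² Assuming $t \doteq \mathit{cons}(\ldots)$ is sound, as well-typedness is demanded.
³ This elimination is possible whenever $b \doteq \mathit{false}$ is solvable, as each procedure call
needs to be unfolded by rule (3) only finitely many times.

(15) $\dfrac{\langle E \uplus \{t_1 \diamond t_2\}, \sigma\rangle}{\langle E \cup \{w_1 \doteq t_1,\, w_2 \doteq t_2,\, w_1 \diamond w_2\}, \sigma\rangle}$,
     if $t_1, t_2 \notin \mathcal{V}$

(16) $\dfrac{\langle E \uplus \{v \not\doteq t\}, \sigma\rangle}{\langle E \cup \{v \not\doteq t,\, v \doteq \mathit{cons}'(w_1, \ldots, w_n)\}, \sigma\rangle}$,
     if $t \in \mathcal{V}$ or $t = \mathit{cons}(\ldots)$

(17) $\dfrac{\langle E \uplus \{t_1 \not\gtrdot t_2\}, \sigma\rangle}{\langle E \cup \{t_1 \doteq t_2\}, \sigma\rangle \;\mid\; \langle E \cup \{t_2 \gtrdot t_1\}, \sigma\rangle}$,
     if $t_1, t_2 \notin \mathcal{V}$

(18) $\dfrac{\langle E \uplus \{t_1 \not\doteq t_2\}, \sigma\rangle}{\langle E \cup \{t_1 \gtrdot t_2\}, \sigma\rangle \;\mid\; \langle E \cup \{t_2 \gtrdot t_1\}, \sigma\rangle}$,
     if $t_1, t_2 \in \mathcal{T}(\Sigma(P'), \mathcal{V})_{\mathbb{N}}$

Fig. 3. Inference rules of the S-calculus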
Rule (1) eliminates an if-conditional and rule (2) eliminates an inner
procedure call from a literal.⁴ Rule (3) unfolds a call of procedure $f$ that occurs as
a direct argument in a literal. A procedure $f$ is represented here by a set $D_f$ of
triples $\langle C, \overline{C}, r\rangle$ such that $r$ is the if-free result term in the procedure body of $f$
obtained under the conditions $C \cup \{\neg c \mid c \in \overline{C}\}$, where $C$ and $\overline{C}$ consist of if-free
Boolean terms only. E.g., $D_{<>}$ consists of two triples, viz. $d_1 = \langle \{k = \varepsilon\}, \emptyset, l\rangle$
and $d_2 = \langle \emptyset, \{k = \varepsilon\}, \mathit{hd}(k) :: (\mathit{tl}(k) <> l)\rangle$, for procedure <> of Fig. 1.
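The case split performed by rule (1) can be sketched as follows. The encoding is hypothetical: terms are nested tuples whose first component is the function symbol, variables are strings, a conditional is `('if', c, a, b)`, and a literal is `(rel, t1, t2)`. Unlike rule (1), which replaces the subterm at a single occurrence, this simplified version replaces every occurrence of the selected conditional.

```python
def find_if(t):
    """Return the first subterm of the form ('if', c, a, b), or None."""
    if isinstance(t, tuple):
        if t[0] == 'if':
            return t
        for arg in t[1:]:
            found = find_if(arg)
            if found is not None:
                return found
    return None

def replace(t, old, new):
    """Replace every occurrence of subterm `old` in t by `new`."""
    if t == old:
        return new
    if isinstance(t, tuple):
        return (t[0],) + tuple(replace(a, old, new) for a in t[1:])
    return t

def eliminate_if(clause, literal):
    """Split `literal` on a conditional; returns two clauses or None."""
    rel, t1, t2 = literal
    cond = find_if(t1) or find_if(t2)
    if cond is None:
        return None
    _, c, then_t, else_t = cond
    rest = [l for l in clause if l != literal]

    def branch(result, truth):
        new_lit = (rel, replace(t1, cond, result), replace(t2, cond, result))
        return rest + [new_lit, ('eq', c, (truth,))]

    return branch(then_t, 'true'), branch(else_t, 'false')
```

Each of the two returned clauses corresponds to one alternative succedent of rule (1): the conditional is replaced by one of its branches, and an equation fixing the truth value of the condition is added.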
A further rule (4) translates inequations and equations expressed with
symbols from $\Sigma(P)$ into E-literals; e.g., "$t_1 > t_2 \doteq \mathit{false}$" is translated into "$t_1 \not\gtrdot t_2$".
Rule (6) is like the decomposition rule from [6] for inequations, but restricted
to constructors. Another rule (9), corresponding to the elimination of trivial
equations and clash for inequations in [6], removes trivial literals such as $t \doteq t$
or $0 \not\gtrdot t$ from a clause $E \in \mathbb{L}_E$ and supplies arbitrary values in $\lambda$ for variables
that disappear from the clause. Finally, literals are simplified by rule (10), which
replaces subterms of the form $\mathit{sel}_i(\mathit{cons}(t_1, \ldots, t_n))$ with $t_i$. This rule does not
exist in [6] and accounts for data type definitions with selectors.
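The simplification of rule (10) amounts to a small term rewriting step. The following sketch assumes terms encoded as nested tuples with variables as strings; the table `SELECTORS` and the names `cons`, `succ`, and `pred` are hypothetical stand-ins for the constructors and selectors of Fig. 1.

```python
# Selectors of each constructor, in argument order (names are assumptions).
SELECTORS = {'cons': ('hd', 'tl'), 'succ': ('pred',)}

def simplify(t):
    """Rewrite sel_i(cons(t_1, ..., t_n)) to t_i, working bottom-up."""
    if not isinstance(t, tuple):
        return t  # a variable
    head, args = t[0], tuple(simplify(a) for a in t[1:])
    if len(args) == 1 and isinstance(args[0], tuple):
        inner = args[0]
        sels = SELECTORS.get(inner[0], ())
        if head in sels:
            # pick the argument that belongs to this selector
            return inner[1 + sels.index(head)]
    return (head,) + args
```

For instance, `simplify(('hd', ('cons', 'x', 'k')))` yields `'x'`, whereas `('hd', 'k')` is left unchanged because `k` is not a constructor term.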
2.2 Inference rules of the S-calculus
The S-calculus consists of the inference rules (5)–(19), to which the additional
side condition (*) applies as well. Rules (5)–(10) are the same as in the E-calculus,
rules (11)–(14) are "structural" rules to merge (in)equations, to replace variables
with terms, and to solve equations $v \doteq t$ with $t \in \mathcal{T}(\Sigma(P)^c)$ by substitutions
$\lambda := \{v/t\}$.
Rules (15)–(18) are given in Fig. 3. Rule (15) removes non-variable
arguments from an S-literal, rule (16) is basically a case analysis on $v$ using some
constructor $\mathit{cons}'$, and rules (17) and (18) eliminate negative literals. Rule (19)
invokes a constraint solver.⁵ We call an S-literal $t_1 \diamond t_2$ a constraint literal iff
at least one of the $t_i$ is a variable of type $\mathbb{N}$, $\Sigma(t_1) \cup \Sigma(t_2) \subseteq \{0, {}^{+}, {}^{-}\}$, and
$\diamond \in \{\doteq, \gtrdot\}$; $\mathcal{C}$ is the set of all constraint literals. When none of the other S-rules
is applicable, rule (19) passes the constraint literals $E \cap \mathcal{C}$ to a modified version
of Indigo [5]: To terminate on cyclic constraints like $\{x \gtrdot y,\, y \gtrdot x\}$, we simply
limit the number of times a constraint can be used by the number of variables
in $E \cap \mathcal{C}$. If $E \cap \mathcal{C}$ can be satisfied, we get a solving assignment $\lambda \in \mathit{Sub}_{E \cap \mathcal{C}}$;
otherwise rule (19) fails.⁶

⁴ For a literal $l = t_1 \diamond t_2$, $l|_\pi$ is the subterm of $l$ at occurrence $\pi \in \mathit{Occ}(l)$, and $l[\pi \leftarrow t]$
is obtained from $l$ by replacing $l|_\pi$ in $l$ with $t$. We use "|" as a shorthand for the
succedents of different rules with the same premises and side conditions. All rules
are applied "modulo symmetry" of $\doteq$ and $\not\doteq$ if possible (e.g., see rule (5)).
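The effect of limiting how often a constraint may be used can be illustrated with a much simplified propagation loop. This is not Indigo and not the modified solver of the disprover; it is a stand-in sketch that handles only constraints between variables, each given as a pair `(x, y)` and read as "x is greater than y". The function name and the representation are assumptions.

```python
def solve_gt_constraints(constraints):
    """Find naturals satisfying constraints (x, y), each read as x > y.

    Lower bounds start at 0 and are raised by local propagation. Each
    constraint may fire at most as often as there are variables, so cyclic
    systems such as {x > y, y > x} are rejected instead of looping.
    Returns a dict assignment, or None on failure.
    """
    variables = {v for c in constraints for v in c}
    value = {v: 0 for v in variables}
    uses = {c: 0 for c in constraints}
    changed = True
    while changed:
        changed = False
        for (x, y) in constraints:
            if value[x] <= value[y]:          # constraint currently violated
                if uses[(x, y)] == len(variables):
                    return None               # usage limit reached: give up
                value[x] = value[y] + 1
                uses[(x, y)] += 1
                changed = True
    return value
```

For an acyclic system every value stays below the number of variables, so the limit is never reached; for the cyclic system `[('x', 'y'), ('y', 'x')]` the limit is hit and the function returns `None`.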
2.3 Search Heuristic and Implementation
By the inherent indeterminism of both calculi, search is required for computing
E- and S-deductions: An E-clause $E$ to be solved defines an infinite E-search tree
$T_E^{\langle E, \varepsilon\rangle}$ whose nodes are labeled with elements from $\mathcal{L}_E$. The root node of $T_E^{\langle E, \varepsilon\rangle}$
is labeled with $\langle E, \varepsilon\rangle$, and $\langle E'', \sigma''\rangle$ is a successor of node $\langle E', \sigma'\rangle$ iff $\langle E'', \sigma''\rangle$
originates from $\langle E', \sigma'\rangle$ by applying some E-inference rule. The leaves of $T_E^{\langle E, \varepsilon\rangle}$
are given by the E-success and the E-failure nodes: $\langle E', \sigma'\rangle$ is an E-success node
iff $E' \in \mathbb{L}_S \setminus \mathfrak{L}$, and $\langle E', \sigma'\rangle$ is an E-failure node iff $E' \in \mathfrak{L}$. A path from the
root node to an E-success node is called an E-solution path. All these notions
carry over to S-search trees $T_S^{\langle E, \sigma\rangle}$ by replacing E with S literally, except that
an S-node labeled with $\langle E', \sigma'\rangle$ is an S-success node iff $E' = \emptyset$, and $\langle E', \sigma'\rangle$ is
an S-failure node iff $E' \neq \emptyset$ and no S-inference rule applies to $\langle E', \sigma'\rangle$.⁷
An E-clause $E$ to be solved defines an infinite $S \circ E$-search tree $T_{S \circ E}^{E}$, which
originates from $T_E^{\langle E, \varepsilon\rangle}$ by replacing each E-success node $\langle E', \sigma'\rangle$ with the S-search
tree $T_S^{\langle E', \sigma'\rangle}$. An $S \circ E$-solution path in $T_{S \circ E}^{E}$ is an E-solution path $p$ followed by
an S-solution path that starts at the E-success node of path $p$.
To disprove a conjecture $\varphi = \forall x_1{:}\tau_1, \ldots, x_n{:}\tau_n\; b$, the $S \circ E$-search tree
$T_{S \circ E}^{\{b \doteq \mathit{false}\}}$ is explored to find an $S \circ E$-solution path. In order to guarantee
termination of the search, only a finite part of $T_{S \circ E}^{\{b \doteq \mathit{false}\}}$ may be explored. Additional
side conditions for the E- and S-inference rules ensure that one rule does not
undo another rule's work. Rules that do not require a choice, e.g. rule (5), are
preferred to those that need some choice, e.g. rules (6) and (16).
The most significant restriction in exploring $T_{S \circ E}^{\{b \doteq \mathit{false}\}}$ (supporting
termination at the cost of completeness) comes from an additional side condition (**) of
rule (3), called the paramodulation rule in [8]: Definition triples $d := (C, \overline{C}, r) \in D_f$
with $f \notin \Sigma(r)$, called non-recursive definition triples, can be applied as often
as possible. However, if $f \in \Sigma(r)$, we need to limit the usage of $d$. Side
condition (**) demands that recursive definition triples be used at most once on each
side of a literal in each branch of $T_E^{\langle \{b \doteq \mathit{false}\}, \varepsilon\rangle}$. This leads to a fast disprover
that works well on simple examples, cf. Sect. 3. We call this restriction simple
paramodulation.

⁵ We use $\gtrdot$ and $\not\gtrdot$ in $\mathcal{L}_E$ only to handle calls of the predefined procedure $>$ more
efficiently by a constraint solver.
⁶ In our setting there is no need to assign priorities to constraints, so we can simplify
the algorithm by treating all constraints as "required" constraints.
⁷ $E' \in \mathfrak{L}$ is sufficient but not necessary for $\langle E', \sigma'\rangle$ being an S-failure node, as the
constraint solver called by rule (19) might fail on some S-clause $E' \in \mathbb{L}_S \setminus \mathfrak{L}$.
To increase the deductive performance of the disprover, we can allow more
applications of a recursive definition triple $d$ by considering the "context" of
$f$-procedure calls. Each procedure call $f(\ldots)$ in the original formula $\varphi$ is labeled
with $(N, \ldots, N) \in \mathbb{N}^k$ for a constant $N$, e.g. $N = 2$, and $k = |D_f|$.
Context-sensitive paramodulation modifies side condition (**) in the following way: A
recursive definition triple $d_i := (C_i, \overline{C}_i, r_i) \in D_f$ may only be used if procedure
call $f(\ldots)$ is labeled with $(n_1, \ldots, n_k)$ such that $n_i > 0$. The recursive calls of $f$
in $r_i$ are labeled with $(n_1, \ldots, n_{i-1}, n_i - 1, n_{i+1}, \ldots, n_k)$, and the other procedure
calls in $r_i$ are labeled with $(N, \ldots, N)$. Context-sensitive paramodulation still
allows only finitely many applications of rule (3), as the labels decrease with each
rule application. Section 3 gives examples that illustrate the difference between
these alternatives in practice. Note that simple paramodulation is not a special
case of context-sensitive paramodulation (by setting $N = 1$), because it does
not distinguish between different occurrences of a procedure as context-sensitive
paramodulation does.
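The bookkeeping of labels in context-sensitive paramodulation can be sketched in a few lines. The function names are hypothetical, labels are Python tuples, and the index `i` is 0-based here, whereas the text counts definition triples from 1.

```python
def initial_label(k, N=2):
    """Label (N, ..., N) of length k = |D_f| for a call in the conjecture."""
    return (N,) * k

def may_unfold(label, i):
    """Triple d_i may be used only if the i-th counter is positive."""
    return label[i] > 0

def label_after_unfold(label, i):
    """Label for the recursive calls of f in r_i: counter i is decremented."""
    assert may_unfold(label, i)
    return label[:i] + (label[i] - 1,) + label[i + 1:]
```

With N = 2 and two definition triples, a call starts with the label (2, 2). Unfolding with the second triple twice leads to (2, 1) and then (2, 0), after which that triple may no longer be applied to the resulting recursive call.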
For efficiency reasons (with respect to memory consumption), we explore $T_E^{\langle \{b \doteq \mathit{false}\}, \varepsilon\rangle}$
with a depth-first strategy, whereas $T_S^{\langle E', \sigma'\rangle}$ is examined breadth-first to avoid
infinite applications of rule (16). Two technical optimizations considerably speed
up the search for an $S \circ E$-solution path. Firstly, caching allows us to prune a
branch that has already been considered in another derivation. The cache hit
rates are about 20 %. Secondly, while exploring $T_E^{\langle \{b \doteq \mathit{false}\}, \varepsilon\rangle}$ we can already
start a subsidiary S-search on S-literals from a clause $E \notin \mathfrak{L}$ (even though
node $\langle E, \sigma\rangle$ of $T_E^{\langle \{b \doteq \mathit{false}\}, \varepsilon\rangle}$ is not an E-success node) and feed the results back
to the E-search node $\langle E, \sigma\rangle$. For instance, if we derive $x \doteq {}^{+}(y)$ from $E$, we
can discard E-branches that consider the case $x \doteq 0$. In conjunction with simple
paramodulation, this (empirically) leads to early failure on unsolvable examples.
3 Using the Disprover
In this section we illustrate the use and the performance of our disprover when
it is employed as a subsystem of VeriFun [12]. Unless otherwise stated, we use
simple paramodulation. We distinguish between conjectures provided by the user
and conjectures speculated by the system.
3.1 User-Provided Conjectures
Before trying to verify a program statement, it is advisable to make sure that
it does not contain lapses that render it false. E.g., in arithmetic we are often
interested in cancelation lemmas such as $x^y = x^z \to y = z$. However, the disprover
finds the witness $\{x/0, y/0, z/1\}$ falsifying the conjecture. Excluding $x = 0$ does
not help, as now the witness $\{x/1, y/0, z/1\}$ is quickly computed. But excluding
$x = 1$ as well causes the disprover to fail, hence we expect that verification
of $\forall x, y, z{:}\mathbb{N}\; x \neq 0 \wedge {}^{-}(x) \neq 0 \wedge x^y = x^z \to y = z$ will succeed. If we
conjecture the associativity of exponentiation, $(x^y)^z = x^{(y^z)}$, the disprover finds the
witness $\{x/2, y/0, z/0\}$. For the injectivity conjecture of the factorial function,
i.e. $\forall x, y{:}\mathbb{N}\; x! = y! \to x = y$, the disprover comes up with the witness $\{x/1, y/0\}$
and fails if we demand $x \neq 0 \wedge y \neq 0$ in addition. For $\forall k, l{:}\mathit{list}[@A]\; k <> l = l <> k$,
the solution $\{k/0 :: \varepsilon,\ l/1 :: \varepsilon\}$ is computed.
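Some of the witnesses reported above can be checked directly. The snippet below is a plain evaluation in Python, not a use of the disprover; it assumes Python's integer conventions (in particular 0 ** 0 == 1) and uses Python lists for lists of natural numbers. The variable names are chosen for illustration.

```python
from math import factorial

# associativity of exponentiation, witness {x/2, y/0, z/0}
x, y, z = 2, 0, 0
assoc_holds = (x ** y) ** z == x ** (y ** z)       # compares 1 with 2

# injectivity of the factorial function, witness {x/1, y/0}:
# the implication  x! = y!  ->  x = y  evaluated at x = 1, y = 0
fact_injective_holds = not (factorial(1) == factorial(0)) or 1 == 0

# commutativity of concatenation, witness {k/0::eps, l/1::eps}
k, l = [0], [1]
concat_commutes = k + l == l + k
```

All three Boolean values are `False`, i.e. each witness falsifies the body of the corresponding conjecture.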
All conjectures from above are disproved within less than a second.⁸ One
might argue that these disproofs are quite simple, so they should be easy to
find. VeriFun's old disprover [1] basically substitutes the variables with values
(or value templates like $n :: k$ for lists) of a limited size and uses a heuristic search
strategy to track down a counter-example quickly if one exists. However, such
a strategy does not lead to early failure on true conjectures: The old disprover
fails after 46 s on the conjecture that procedure perm (deciding whether two
lists are a permutation of each other) computes a symmetric relation.⁹ The new
disprover fails after just a second.
The disprover also helps to find simple flaws in the definition of lemmas
or procedures. For instance, it disproves lemma "rev <>" (cf. Fig. 1), yielding
$\{k/0 :: \varepsilon,\ l/1 :: \varepsilon\}$. Also, the termination hypothesis for <> is disproved at once
if one inadvertently writes tl(l) in the recursive call of <> (instead of tl(k)).
Similar errors are the use of $\geq$ instead of $>$ in program conjectures and procedure
definitions.
To illustrate the consequences of simple paramodulation, consider formula
$\forall k{:}\mathit{list}[@A]\; \mathit{rev}(k) = k$. As the smallest solution is $\{k/0 :: 1 :: \varepsilon\}$, we need to open
rev twice. Thus the disprover fails to find this witness with simple
paramodulation, but succeeds with context-sensitive paramodulation. The same effect is
observed with lemma "rev <>" or with $\forall x{:}\mathbb{N}\; 2^x > x^2$. However, as most
conjectures do not need extensive search, we prefer to save time and offer this
alternative only as an option to the user who is willing to spend more time on
the search for a disproof.
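That a two-element list is the smallest counter-example for rev(k) = k can be confirmed by brute force. The sketch below is only an enumeration check in Python and says nothing about how the disprover finds the witness; `smallest_counterexample` and its bounds are hypothetical names and parameters.

```python
from itertools import product

def rev(k):
    """List reversal, mirroring procedure rev of Fig. 1."""
    return [] if k == [] else rev(k[1:]) + [k[0]]

def smallest_counterexample(max_len=3, max_elem=1):
    """First list k (by length, then lexicographically) with rev(k) != k."""
    for n in range(max_len + 1):
        for elems in product(range(max_elem + 1), repeat=n):
            if rev(list(elems)) != list(elems):
                return list(elems)
    return None
```

The call `smallest_counterexample()` returns `[0, 1]`. Computing `rev([0, 1])` takes the recursive case of `rev` twice, which matches the observation that rev has to be opened twice to reach this witness.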
3.2 Conjectures Speculated by the System
When generalizing statements by machine, a disprover is needed to detect
over-generalizations. E.g., VeriFun's generalization heuristic [1] tries to generalize
$\varphi = \forall k, l{:}\mathit{list}[@A]\; \mathit{half}(|k <> l|) = \mathit{half}(|l <> k|)$ to $\varphi' = \forall k, l{:}\mathit{list}[@A]\; |k <> l| = |l <> k|$
and then $\varphi'$ to $\varphi'' = \forall k, l{:}\mathit{list}[@A]\; k <> l = l <> k$. Our disprover quickly
fails on $\varphi'$ and succeeds on $\varphi''$ (see above), hence generalization $\varphi'$ is a good
candidate for a proof by induction, whereas $\varphi''$ is recognized as an over-generalization
of $\varphi$.
⁸ All timing details refer to our single-threaded Java implementation on a 3.2 GHz
hyper-threading CPU, where the Java VM was assigned 300 MB of main memory.
⁹ The old disprover examined the conjecture for lists of length ≤ 2 and natural numbers
between 0 and 2.
Another example of such a generate-and-test cycle is recursion elimination:
For user-defined procedures, VeriFun synthesizes so-called difference and domain
procedures which represent information that is useful for automated analysis
of termination [13, 17] and for proving absence of “exceptions” [18] (caused
by division by 0, for example). Both kinds of procedures may contain unneces-
sary recursive calls, which complicate subsequent proofs. Therefore the system
generates recursion elimination formulas [13] justifying a sound replacement of
some recursive calls with truth values. For those formulas that the system could
not prove, the user has to decide whether to support the system either by inter-
actively constructing a proof or by giving a witness to disprove the conjecture.
He can also ignore the often unreadable conjectures (which most users do), not
being aware that missing a true recursion elimination formula means much more
work in subsequent proofs.
For example, for the domain procedure of a tautology checker (cf. procedure 0
in [14]), VeriFun generates 62 recursion elimination formulas. Our disprover
falsifies all of them within 33 s. Without a disprover, we wasted four times longer
on futile proof attempts from which we cannot conclude anything. With the old
disprover, it took more than five times longer to disprove 59 formulas; it failed
on the others. For other domain or difference procedures, the disprover performs
equally well, so in the vast majority of cases the user does not need to worry
about recursion elimination any more. This is a tremendous improvement in
user-friendliness.
4 Related Work
The problem of automatically disproving statements in the context of program
verification has been tackled in various research projects.
Protzen [8] describes a calculus to disprove universal conjectures in the INKA
system [4]. While it apparently performs quite well on false system-generated
conjectures, it has a rather poor performance on true ones; if the input conjecture
is true, it searches until it reaches an explicit limit of the search depth.
A disprover for KIV is presented in [9]. The existing proof calculus is modified
so that it is able to construct disproofs. This interleaves the incremental instan-
tiation of variables and simplifying proof steps. For solvable cases “good results”
are reported, whereas performance on unsolvable problems is not communicated.
Ahrendt has developed a complete disprover for free data type specifica-
tions [2]. Since the interpretation of function symbols is left open in this loose
semantics approach, one needs to consider all models satisfying the axioms when
proving the non-consequence of a conjecture φ. Similarly, the Alloy modeling
system [7] can investigate properties of under-specified models. The correspond-
ing constraint analyzer checks only models with a bounded number of elements
in each primitive type, so (like our disprover) it is incomplete. Differently from
these approaches, we consider only a fixed interpretation of function symbols
(given by the interpreter $\mathit{eval}_P$) in our setting.
Isabelle supports a “quickcheck” command [3] to test a conjecture by substi-
tuting random values for the variables several times. A comparison of the success
rates and the performance of this approach with our results is planned as future
work.
Coral [11] is a system designed to find non-trivial flaws in security protocols.
It is based on the Comon-Nieuwenhuis method for proof by consistency and uses
a parallel architecture for consistency checking and so-called induction derivation
to ensure termination. Finding an attack on a protocol may take several hours
with Coral.
5 Conclusion
In the design of our disprover we tried to minimize the time wasted on true
conjectures. We achieved this by limiting the application of the paramodulation
rule. Apart from this, we do not need any explicit depth limits. In particular,
there is no explicit limit on the size of a witness. We also reduce the cost of
simplifications by restricting them to selector and constructor calls. By
incorporating a constraint solver [5] for inequalities on the predefined data type $\mathbb{N}$
for natural numbers, we further improved the performance.
We identified several applications of our disprover that considerably improve
the productivity when working with the VeriFun system. The main application
of our disprover is bulk processing (such as recursion elimination) or automatic
generalization. While it is possible to approximate completeness arbitrarily well
to find deeper flaws in a program (conjecture), this would tremendously increase
the time wasted on true conjectures. The advantage of our disprover is that it
is successful in most solvable cases and quickly gives up in unsolvable cases, as
practical experiments reveal.
In future work, we intend to investigate further heuristics for the paramodu-
lation rule, which primarily controls the power of the disprover. We also intend
to examine whether the use of verified lemmas supports the disproving pro-
cess. Finally, it would be interesting to look at combinations of various disprov-
ing strategies. When we are aware of the strengths and weaknesses of different
strategies, we could possibly decide beforehand which one is most suitable for a
specific problem.
References
1. Markus Aderhold. Formula generalization in VeriFun. Diploma thesis, Technische
Universität Darmstadt, 2004.
2. Wolfgang Ahrendt. Deductive search for errors in free data type specifications
using model generation. In A. Voronkov, editor, Proc. of the 18th International
Conference on Automated Deduction, volume 2392 of LNCS. Springer, 2002.
3. Stefan Berghofer and Tobias Nipkow. Random testing in Isabelle/HOL. In Software
Engineering and Formal Methods, pages 230–239. IEEE Computer Society, 2004.
4. Susanne Biundo, Birgit Hummel, Dieter Hutter, and Christoph Walther. The
Karlsruhe induction theorem proving system. In J. Siekmann, editor, Proc. of
CADE-8, volume 230 of LNCS, pages 672–674. Springer, 1986.
5. Alan Borning, Richard Anderson, and Bjorn N. Freeman-Benson. Indigo: A local
propagation algorithm for inequality constraints. In ACM Symposium on User
Interface Software and Technology, pages 129–136, 1996.
6. Hubert Comon and Pierre Lescanne. Equational problems and disunification. Jour-
nal of Symbolic Computation, 7:371–425, 1989.
7. Daniel Jackson. Alloy: A lightweight object modelling notation. ACM Transactions
on Software Engineering and Methodology, 11(2):256–290, 2002.
8. Martin Protzen. Disproving conjectures. In D. Kapur, editor, Proc. of CADE-11,
volume 607 of LNAI, pages 340–354. Springer, 1992.
9. Wolfgang Reif, Gerhard Schellhorn, and Andreas Thums. Flaw detection in formal
specifications. In Proc. of IJCAR-1, pages 642–657. Springer, 2001.
10. Stephan Schweitzer. Symbolische Auswertung und Heuristiken zur Verifikation
funktionaler Programme. Doctoral dissertation, TU Darmstadt, to appear 2006.
11. Graham Steel and Alan Bundy. Attacking group protocols by refuting incorrect
inductive conjectures. Journal of Automated Reasoning, pages 1–28, 2005.
12. Daniel Szallies. Ein Werkzeug zur automatischen Widerlegung von Aussagen in
VeriFun. Diplomarbeit, Technische Universität Darmstadt, 2006.
13. Christoph Walther. On proving the termination of algorithms by machine. Artifi-
cial Intelligence, 71(1):101–157, 1994.
14. Christoph Walther, Markus Aderhold, and Andreas Schlosser. The L1.0 Primer.
Technical Report VFR 06/01, Technische Universität Darmstadt, 2006.
15. Christoph Walther and Stephan Schweitzer. About VeriFun. In F. Baader, editor,
Proc. of CADE-19, volume 2741 of LNCS, pages 322–327. Springer, 2003.
16. Christoph Walther and Stephan Schweitzer. Verification in the classroom. Journal
of Automated Reasoning, 32(1):35–73, 2004.
17. Christoph Walther and Stephan Schweitzer. Automated termination analysis for
incompletely defined programs. In F. Baader and A. Voronkov, editors, Proc. of
LPAR-11, volume 3452 of LNAI, pages 332–346. Springer, 2005.
18. Christoph Walther and Stephan Schweitzer. Reasoning about incompletely defined
programs. In G. Sutcliffe and A. Voronkov, editors, Proc. of LPAR-12, volume 3835
of LNAI, pages 427–442. Springer, 2005.