The CADE-13 ATP System Competition tested 18 ATP systems on 50 theorems, in five competition categories, with a time limit of 300 seconds imposed on each system run. This paper records the results of the competition. Some analysis of these results is given, and interesting points are highlighted.
Key words: Automated theorem proving, competition, results
1. Introduction
The CADE-13 ATP System Competition tested 18 ATP systems on 50 theorems, in five competition categories, with a time limit of 300 seconds imposed on each system run. Full details of the competition design appear in [5], and details of the procedures used in running the competition appear in [2]. System descriptions of the competing ATP systems appear in [1] (this issue). For each system run, three items of data were recorded: whether or not a proof was found, the CPU time taken in seconds, and whether or not a proof object was created. From these data two scores were calculated. The "A" score is the number of proofs f...
Running a competition for automated theorem proving (ATP) systems is a difficult and arguable venture. However, the potential benefits of such an event by far outweigh the controversial aspects. The motivations for running the CADE-13 ATP System Competition were to contribute to the evaluation of ATP systems, to stimulate ATP research and system development, and to expose ATP systems to researchers both within and outside the ATP community. This article identifies and discusses the issues that determine the nature of such a competition. Choices and motivated decisions for the CADE-13 competition, with respect to the issues, are given.
We present a modification of the superposition calculus that is meant to
generate consequences of sets of first-order axioms. This approach is proven to
be sound and deductive-complete in the presence of redundancy elimination
rules, provided the considered consequences are built on a given finite set of
ground terms, represented by constant symbols. In contrast to other approaches,
most existing results about the termination of the superposition calculus can
be carried over to our procedure. This ensures in particular that the calculus
is terminating for many theories of interest to the SMT community.
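As a schematic illustration (our example, not one from the paper): given the axioms $f(a) \approx b$ and $b \approx c$, superposition of the second equation into the first yields $f(a) \approx c$, a consequence built entirely over the ground terms $a$, $b$ and $c$; deductive completeness means that every such consequence over the designated constants is eventually derived, up to redundancy.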
Adequacy is an important criterion for judging whether a formalization is suitable for reasoning about the actual object of
study. The issue is particularly subtle in the case of approaches to languages with name-binding. In prior work,
adequacy has been formalized only with respect to specific representation techniques. In this article, we give a general formal
definition based on model-theoretic isomorphisms or interpretations. We investigate and formalize an adequate interpretation of untyped lambda-calculus within a higher-order metalanguage in
Isabelle/HOL using the Nominal Datatype Package. Formalization elucidates some subtle issues that have been neglected in informal
arguments concerning adequacy.
Keywords: Adequacy, Isomorphism, Interpretation, Nominal abstract syntax, Higher-order abstract syntax
This paper introduces a variant of nominal abstract syntax in which bindable names are represented by normal meta-variables
as opposed to a separate class of globally fresh names. Distinct meta-variables can be instantiated with the same concrete
name, which we call aliasing. The possible aliasing patterns are controlled by explicit constraints on the distinctness (freshness)
of names. This approach has already been used in the nominal meta-programming language αML. We recap that language and develop a theory of contextual equivalence for it. The central result of the paper is that
abstract syntax trees (ASTs) involving binders can be encoded into αML in such a way that α-equivalence of ASTs corresponds with contextual equivalence of their encodings. This is novel because the encoding does not
rely on the existence of globally fresh names and fresh name generation, which are fundamental to the correctness of the pre-existing
encoding of abstract syntax into FreshML.
Keywords: Meta-programming, Alpha-equivalence, Nominal abstract syntax
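To make the aliasing idea concrete, here is a minimal Python sketch of ours (αML itself is a different language; all names below are our own) that checks α-equivalence of ASTs without ever generating a globally fresh name, by maintaining a correspondence between the binders met on each side:

    from dataclasses import dataclass

    @dataclass
    class Var: name: str
    @dataclass
    class Lam: binder: str; body: object
    @dataclass
    class App: fun: object; arg: object

    def alpha_eq(s, t, env=()):
        # env: innermost-first stack of (binder-in-s, binder-in-t) pairs
        if isinstance(s, Var) and isinstance(t, Var):
            for a, b in env:            # the first frame naming either side decides
                if s.name == a or t.name == b:
                    return s.name == a and t.name == b
            return s.name == t.name     # both occurrences are free
        if isinstance(s, Lam) and isinstance(t, Lam):
            # distinct binder names may alias; no fresh name is ever generated
            return alpha_eq(s.body, t.body, ((s.binder, t.binder),) + env)
        if isinstance(s, App) and isinstance(t, App):
            return alpha_eq(s.fun, t.fun, env) and alpha_eq(s.arg, t.arg, env)
        return False

    # \x. x y  and  \z. z y  are alpha-equivalent
    assert alpha_eq(Lam("x", App(Var("x"), Var("y"))),
                    Lam("z", App(Var("z"), Var("y"))))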
Abstraction has been widely used in automated deduction; a major problem with its use is that the abstract space can be inconsistent even though the ground space is consistent. We show that, under certain very weak conditions true of practically all the abstractions used in the past (but true also of a much wider class of abstractions), this problem cannot be avoided.
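A minimal example of the phenomenon (ours): the ground theory $\{p(a), \neg p(b)\}$ is consistent, yet an abstraction that identifies the constants $a$ and $b$, mapping both to a single abstract constant $c$, produces the inconsistent abstract theory $\{p(c), \neg p(c)\}$.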
We illustrate Nested Abstract Syntax as a high-level alternative representation of languages with binding constructs, based on nested datatypes. Our running example
is a partial solution in the Coq proof assistant to the POPLmark Challenge. The resulting formalization is very compact and
does not require any extra library or special logical apparatus. Along the way, we propose an original, high-level perspective
on environments.
Keywords: POPLmark, Abstract syntax, Semantics, Nested datatypes, Coq, System F<:
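For readers unfamiliar with nested datatypes, the following rough Python transcription (ours; the paper's development is in Coq) shows the key idea: the body of a binder is a term over a variable type enlarged by exactly one element, so well-scopedness is enforced by typing alone:

    from __future__ import annotations
    from dataclasses import dataclass
    from typing import Generic, Optional, TypeVar, Union

    A = TypeVar("A")

    @dataclass
    class Var(Generic[A]):
        name: A                     # a variable drawn from the type A

    @dataclass
    class App(Generic[A]):
        fun: Term[A]
        arg: Term[A]

    @dataclass
    class Lam(Generic[A]):
        body: Term[Optional[A]]     # None stands for the freshly bound variable

    Term = Union[Var[A], App[A], Lam[A]]

    # \x. x : the bound occurrence is Var(None)
    ident: Term[str] = Lam(Var(None))
    # \x. y : the free variable y is just an ordinary name
    const_y: Term[str] = Lam(Var("y"))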
We present mascott, a tool for Knuth-Bendix completion modulo the theory of associative and commutative operators. In contrast to classical completion tools, mascott does not rely on a fixed AC-compatible reduction order. Instead, a suitable order is implicitly constructed during a deduction by collecting all oriented rules, in a similar fashion to the tool Slothrop. This allows the derivation of convergent systems that cannot be completed using standard orders. We outline the underlying inference system and comment on implementation details such as the use of multi-completion, term indexing techniques, and critical pair criteria.
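As a point of reference, the textbook example of AC completion (due to Peterson and Stickel; we do not attribute this particular run to mascott) turns the abelian group axioms into the following system, convergent modulo the AC axioms for "+":

    x + 0     -> x
    x + (-x)  -> 0
    -(-x)     -> x
    -0        -> 0
    -(x + y)  -> (-x) + (-y)

Orienting such rules requires an AC-compatible order, which is precisely what mascott constructs implicitly during the deduction rather than fixing in advance.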
In this paper we present a formalization and proof of Higman’s Lemma in ACL2. We formalize the constructive proof described
in [10] where the result is proved using a termination argument justified by the multiset extension of a well-founded relation.
To our knowledge, this is the first mechanization of this proof.
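For reference, the statement being formalized, in our words: for a finite alphabet $\Sigma$, the (scattered) subsequence embedding order on $\Sigma^*$ is a well-quasi-order, i.e., every infinite sequence of words $w_1, w_2, \ldots$ contains indices $i < j$ such that $w_i$ embeds into $w_j$.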
ACL2(r) is a modified version of the theorem prover ACL2 that adds support for the irrational numbers using nonstandard analysis.
It has been used to prove basic theorems of analysis, as well as the correctness of the implementation of transcendental functions
in hardware. This paper presents the logical foundations of ACL2(r). These foundations are also used to justify significant
enhancements to ACL2(r).
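To give the flavor of the nonstandard style (our illustration, not text from the paper): for any nonzero infinitesimal $\varepsilon$, the derivative can be characterized without limits as

    $f'(x) = \mathrm{st}\big( (f(x+\varepsilon) - f(x)) / \varepsilon \big)$

where $\mathrm{st}$ is the standard-part function; the logical foundations presented in the paper are what make such definitions, and proofs about them, sound in ACL2(r).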
Description logics provide powerful languages for representing and reasoning about knowledge of static application domains. The main strength of description logics is that they offer considerable expressive power going far beyond propositional logic, while reasoning is still decidable. There is a demand to bring the power and character of description logics to the description of, and reasoning about, dynamic application domains which are characterized by actions. In this paper, based on a combination of the propositional dynamic logic PDL, a family of description logics, and an action formalism constructed over description logics, we propose a family of dynamic description logics DDL(X@) for representing and reasoning about actions, where X ranges over well-studied description logics and X@ denotes the extension of X with the @ constructor. The representation power of DDL(X@) is reflected in four aspects. Firstly, the static knowledge of application domains is represented as RBoxes and acyclic TBoxes of the description logic X. Secondly, the states of the world and the pre-conditions of atomic actions are described by ABox assertions of the description logic X@, and the post-conditions of atomic actions are described by primitive literals of X@. Thirdly, starting with atomic actions and ABox assertions of X@, complex actions are constructed with the regular program constructors of PDL, so that various control structures on actions such as "Sequence", "Choice", "Any-Order", "Iterate", "If-Then-Else", "Repeat-While" and "Repeat-Until" can be represented. Finally, both atomic actions and complex actions are used as modal operators in the construction of formulas, so that many properties of actions can be stated explicitly by formulas. A tableau algorithm is provided for deciding the satisfiability of DDL(X@)-formulas; based on this algorithm, reasoning tasks such as the realizability, executability and projection of actions can be effectively carried out. As a result, DDL(X@) not only offers considerable expressive power going beyond many action formalisms which are propositional, but also provides decidable reasoning services for actions described by it.
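As a small illustration of the style (ours, with hypothetical action and concept names): an atomic action $\mathit{shoot}(x)$ with pre-condition $\{\mathit{Loaded}(\mathit{gun}), \mathit{Alive}(x)\}$ and post-condition $\{\neg\mathit{Alive}(x)\}$ gives rise, via the use of actions as modal operators, to formulas such as

    $\mathit{Loaded}(\mathit{gun}) \wedge \mathit{Alive}(x) \rightarrow [\mathit{shoot}(x)]\,\neg\mathit{Alive}(x)$

and complex actions, such as a "Repeat-While" loop over $\mathit{shoot}(x)$, are then built from such atoms with the regular program constructors of PDL.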
E-unification problems are central in automated deduction. In this work, we consider unification modulo theories that extend the well-known ACI or ACUI by adding a binary symbol "*" that distributes over the AC(U)I-symbol "+". If this distributivity is one-sided (say, to the left), we get the theory denoted AC(U)ID_l; we show that AC(U)ID_l-unification is DEXPTIME-complete. If "*" is assumed two-sided distributive over "+", we get the theory denoted AC(U)ID; we show unification modulo AC(U)ID to be NEXPTIME-decidable and DEXPTIME-hard. Both AC(U)ID_l and AC(U)ID seem to be of practical interest, for example in the analysis of programs modeled in terms of process algebras. Our results, for the two theories considered, are obtained through two entirely different lines of reasoning. A consequence of our methods of proof is that, modulo the theory that adds to AC(U)ID the assumption that "*" is associative-commutative, or just associative, unification is undecidable.
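A small example of the kind of problem involved (our example): modulo left distributivity, $x * (y + z) = x * y + x * z$, the unification problem

    $u * (a + b) \stackrel{?}{=} (u * a) + w$

has the solution $\{w \mapsto u * b\}$, since the left-hand side then equals $(u * a) + (u * b)$; the ACI axioms for "+" additionally force solutions to be considered modulo associativity, commutativity and idempotence.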
We describe and verify an elegant equivalence checker for regular expressions. It works by constructing a bisimulation relation
between (derivatives of) regular expressions. By mapping regular expressions to binary relations, an automatic and complete
proof method for (in)equalities of binary relations over union, composition and (reflexive) transitive closure is obtained.
The verification is carried out in the theorem prover Isabelle/HOL, yielding a practically useful decision procedure.
Keywords: Isabelle/HOL, Decision procedure, Regular expressions
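The underlying idea translates into a short, unverified Python sketch of ours (all names are our own): compute Brzozowski derivatives, simplify sums modulo associativity, commutativity and idempotence so that only finitely many derivatives arise, and check the bisimulation condition that every reachable pair agrees on nullability:

    from functools import reduce

    EMPTY, EPS = "0", "1"   # the regexes 0 (no words) and 1 (empty word only);
                            # letters are other one-character strings

    def plus(r, s):
        # smart constructor for +: assoc/comm/idem via a frozenset, and r + 0 = r
        parts = set()
        for x in (r, s):
            if x == EMPTY:
                continue
            parts |= set(x[1]) if isinstance(x, tuple) and x[0] == "+" else {x}
        if not parts:
            return EMPTY
        return parts.pop() if len(parts) == 1 else ("+", frozenset(parts))

    def seq(r, s):
        # smart constructor for concatenation: zero and unit laws
        if EMPTY in (r, s):
            return EMPTY
        if r == EPS:
            return s
        return r if s == EPS else (".", r, s)

    def nullable(r):
        if r == EPS:
            return True
        if r == EMPTY or isinstance(r, str):
            return False
        if r[0] == "+":
            return any(nullable(x) for x in r[1])
        if r[0] == ".":
            return nullable(r[1]) and nullable(r[2])
        return True                                    # star

    def deriv(a, r):
        if r in (EMPTY, EPS):
            return EMPTY
        if isinstance(r, str):
            return EPS if r == a else EMPTY
        if r[0] == "+":
            return reduce(plus, (deriv(a, x) for x in r[1]), EMPTY)
        if r[0] == ".":
            d = seq(deriv(a, r[1]), r[2])
            return plus(d, deriv(a, r[2])) if nullable(r[1]) else d
        return seq(deriv(a, r[1]), r)                  # star

    def equiv(r, s, alphabet):
        # r and s are equivalent iff the reachable pairs form a bisimulation
        seen, todo = set(), [(r, s)]
        while todo:
            r1, s1 = pair = todo.pop()
            if pair in seen:
                continue
            if nullable(r1) != nullable(s1):
                return False
            seen.add(pair)
            todo += [(deriv(a, r1), deriv(a, s1)) for a in alphabet]
        return True

    star = lambda r: ("*", r)
    # (a+b)* and (a*b*)* both denote every word over {a, b}
    print(equiv(star(plus("a", "b")),
                star(seq(star("a"), star("b"))), "ab"))   # -> True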
We describe an algorithm for deciding the first-order multisorted theory BAPA, which combines Boolean algebras of sets of
uninterpreted elements (BA) and Presburger arithmetic operations (PA). BAPA can express the relationship between integer variables
and cardinalities of unbounded finite sets, and it supports arbitrary quantification over sets and integers. Our motivation
for BAPA is deciding verification conditions that arise in the static analysis of data structure consistency properties. Data
structures often use an integer variable to keep track of the number of elements they store; an invariant of such a data structure
is that the value of the integer variable is equal to the number of elements stored in the data structure. When the data structure
content is represented by a set, the resulting constraints can be captured in BAPA. BAPA formulas with quantifier alternations
arise when verifying programs with annotations containing quantifiers or when proving simulation relation conditions for refinement
and equivalence of program fragments. Furthermore, BAPA constraints can be used for proving the termination of programs that
manipulate data structures, as well as in constraint database query evaluation and loop invariant inference. We give a formal
description of an algorithm for deciding BAPA. We analyze our algorithm and show that it has optimal alternating time complexity
and that the complexity of BAPA matches the complexity of PA. Because it works by a reduction to PA, our algorithm yields
the decidability of a combination of sets of uninterpreted elements with any decidable extension of PA. When restricted to
BA formulas, the algorithm can be used to decide BA in optimal alternating time. Furthermore, the algorithm can eliminate
individual quantifiers from a formula with free variables and therefore perform projection onto a desirable set of variables.
We have implemented our algorithm and used it to discharge verification conditions in the Jahob system for data structure
consistency checking of Java programs; our experience suggests that a straightforward implementation of the algorithm is effective
on nontrivial formulas as long as the number of set variables is small. We also report on a new algorithm for solving the
quantifier-free fragment of BAPA.
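As a concrete illustration (our example, not one from the paper): for a container whose contents are modeled by a set $C$ and whose size field is the integer $s$, the verification condition for inserting a fresh element $x$ (modeled as a singleton set) is directly expressible in BAPA:

    $|C| = s \wedge x \notin C \rightarrow |C \cup \{x\}| = s + 1$

and since BAPA supports arbitrary quantification over sets and integers, the universally quantified form of this condition remains inside the decidable fragment.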
Mechanized reasoning systems and computer algebra systems have different objectives. Their integration is highly desirable, since formal proofs often involve both of the different tasks of proving and calculating. Even more important, proof and computation are often interwoven and not easily separable. In this article we advocate an integration of computer algebra into mechanized reasoning systems at the proof plan level. This approach allows us to view the computer algebra algorithms as methods, that is, declarative representations of the problem-solving knowledge specific to a certain mathematical domain. Automation can be achieved in many cases by searching for a hierarchic proof plan at the method level, using suitable domain-specific control knowledge about the mathematical algorithms. In other words, the uniform framework of proof planning allows us to solve a large class of problems that are not automatically solvable by the separate systems. Our approach also gives an answer to the correctness problems inherent in such an integration. We advocate an approach in which the computer algebra system produces high-level protocol information that can be processed by an interface to derive proof plans. Such a proof plan can in turn be expanded to proofs at different levels of abstraction, so the approach is well suited for producing a high-level verbalized explication as well as a low-level, machine-checkable, calculus-level proof. We present an implementation of our ideas and exemplify them using an automatically solved example. "Changes in the criterion of 'rigor of the proof' engender major revolutions in mathematics." (H. Poincaré, 1905)
In this paper a new method is proposed for mechanically proving theorems in the local theory of space curves. The method is based on Ritt-Wu's well-ordering principle for ordinary differential polynomials, the Clifford algebraic representation of Euclidean space, and equation-set solving in the Clifford algebra formalism. It has been tested on various theorems and appears to be efficient.
DPLL (for Davis, Putnam, Logemann, and Loveland) algorithms form the largest family of contemporary algorithms for SAT (the propositional satisfiability problem) and are
widely used in applications. The recursion trees of DPLL algorithm executions on unsatisfiable formulas are equivalent to
treelike resolution proofs. Therefore, lower bounds for treelike resolution (known since the 1960s) apply to them. However,
these lower bounds say nothing about the behavior of such algorithms on satisfiable formulas. Proving exponential lower bounds
for them in the most general setting is impossible without proving P ≠ NP; therefore, to prove lower bounds, one has to restrict the power of branching heuristics. In this paper, we give exponential
lower bounds for two families of DPLL algorithms: generalized myopic algorithms, which read up to n^{1−ε} of the clauses at each step and see the remaining part of the formula without negations, and drunk algorithms, which choose a variable using any complicated rule and then pick its value at random.
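For orientation, a bare-bones DPLL procedure looks as follows (our sketch; the two families analyzed in the paper differ precisely in what the branching step below is allowed to do):

    def assign(clauses, lit):
        # simplify under lit = True: drop satisfied clauses, shrink the rest
        return [c - {-lit} for c in clauses if lit not in c]

    def dpll(clauses):
        # clauses: DIMACS-style literal sets, e.g. {1, -2} means (x1 or not x2)
        clauses = [set(c) for c in clauses]
        if any(not c for c in clauses):
            return False                  # an empty clause: conflict
        if not clauses:
            return True                   # every clause satisfied
        for c in clauses:                 # unit propagation
            if len(c) == 1:
                return dpll(assign(clauses, next(iter(c))))
        # branching: myopic algorithms restrict what this choice may read,
        # drunk algorithms fix the polarity by a coin flip
        lit = next(iter(clauses[0]))
        return dpll(assign(clauses, lit)) or dpll(assign(clauses, -lit))

    # (x1 or x2) and (not x1 or x2) and (not x2): unsatisfiable
    print(dpll([{1, 2}, {-1, 2}, {-2}]))  # -> False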
The Gelfond-Lifschitz operator associated with a logic program (and likewise the operator associated with default theories by Reiter) exhibits oscillating behavior. In the case of logic programs, there is always at least one finite, nonempty collection of Herbrand interpretations around which the Gelfond-Lifschitz operator oscillates. The same phenomenon occurs with default logic when Reiter's operator is considered. Based on this, a stable class semantics and extension class semantics have been proposed. The main advantage of this semantics was that it was defined for all logic programs (and default theories), and that this definition was modelled using the standard operators existing in the literature, such as Reiter's operator. In this paper our primary aim is to prove that there is a very interesting duality between stable class theory and the well-founded semantics for logic programming. In the stable class semantics, classes that were minimal with respect to Smyth's power-domain ordering were selected. We show that the well-founded semantics precisely corresponds to a class that is minimal w.r.t. Hoare's power-domain ordering, the well-known dual of Smyth's ordering. Besides being an elegant duality in its own right, this result immediately suggests how to define a well-founded semantics for default logic in such a way that the dualities that hold for logic programming continue to hold for default theories. We show how the same technique may be applied to strong autoepistemic logic: the logic of strong expansions proposed by Marek and Truszczynski.
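The canonical example of the oscillation (a standard one, not specific to this paper) is the program $P = \{\, p \leftarrow \mathit{not}\ p \,\}$: the reduct of $P$ with respect to $\emptyset$ is $\{p \leftarrow\}$, so $\mathit{GL}_P(\emptyset) = \{p\}$, while the reduct with respect to $\{p\}$ is empty, so $\mathit{GL}_P(\{p\}) = \emptyset$. Hence $P$ has no stable model, but $\{\emptyset, \{p\}\}$ is a stable class, and the well-founded semantics correspondingly leaves $p$ undefined.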
In this paper, we investigate analogy-driven proof plan construction in inductive theorem proving. The intention is to produce a plan for a target theorem that is similar to a given source theorem. We identify second-order mappings from the source to the target that preserve induction-specific, proof-relevant abstractions dictating whether the source plan can be replayed. We replay the planning decisions taken in the source if the reasons or justifications for these decisions still hold in the target. If the source and target plan differ significantly at some isolated point, additional reformulations are invoked to add, delete, or modify planning steps. These reformulations are not ad hoc but are triggered by peculiarities of the mappings and by failed justifications. Employing analogy on top of the proof planner CLAM has extended the problem-solving horizon of CLAM: with analogy, some theorems could be proved automatically that neither CLAM nor NQTHM could prove automatically.
While many higher-order interactive theorem provers include a choice operator, higher-order automated theorem provers currently
do not. As a step towards supporting automated reasoning in the presence of a choice operator, we present a cut-free ground
tableau calculus for Church’s simple type theory with choice. The tableau calculus is designed with automated search in mind.
In particular, the rules only operate on the top level structure of formulas. Additionally, we restrict the instantiation
terms for quantifiers to a universe that depends on the current branch. At base types the universe of instantiations is finite.
We prove completeness of the tableau calculus relative to Henkin models.
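Concretely, the choice operator $\varepsilon$ is governed by the principle (standard in simple type theory with choice)

    $\forall p.\ (\exists x.\ p\,x) \rightarrow p\,(\varepsilon\,p)$

so, roughly, a tableau rule for choice lets the term $\varepsilon\,p$ serve as a witness whenever the current branch asserts that $p$ is nonempty; the branch-dependent instantiation universe, finite at base types, keeps the resulting search manageable.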
Rippling is a proof search guidance technique with particular application to proof by mathematical induction. It is based on a concept
of annotating the differences between two terms. In its original formulation this annotation was only appropriate to first-order
formulae. We use a notion of embedding to adapt these annotations appropriately for higher-order syntax. This representation simplifies the theory of annotated
terms, no longer requiring special substitution and unification theorems. A key feature of the representation is that it provides
a clean separation of the term and the annotation. We illustrate this with selected examples using our implementation of these
ideas in λClam.
Keywords: Higher-order logic, Automated theorem proving, Proof planning, Rippling
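A small first-order example, in our own rendering of the annotations (wave fronts boxed, skeleton underlined): in the step case of proving $x + s(y) = s(x + y)$ by induction on $x$, the conclusion is annotated as

    $\fbox{$s(\underline{x})$} + s(y) = s(\fbox{$s(\underline{x})$} + y)$

and the wave rule $s(u) + v \Rightarrow s(u + v)$ ripples the boxed difference outwards on both sides until the induction hypothesis applies to the underlined skeleton. The embedding-based annotations of this paper carry the same information for higher-order syntax.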
We describe the reconstruction of a phylogeny for a set of taxa, with a character-based cladistics approach, in a declarative
knowledge representation formalism, and show how to use computational methods of answer set programming to generate conjectures
about the evolution of the given taxa. We have applied this computational method in two domains: historical analysis of languages
and historical analysis of parasite-host systems. In particular, using this method, we have computed some plausible phylogenies
for Chinese dialects, for Indo-European language groups, and for Alcataenia species. Some of these plausible phylogenies are different from the ones computed by other software. Using this method, we
can easily describe domain-specific information (e.g., temporal and geographical constraints), and thus prevent the reconstruction
of some phylogenies that are not plausible.
We present a new approach to query answering in default logics. The basic idea is to treat default rules as classical implications along with some qualifying conditions restricting the use of such rules during query answering. We accomplish this by taking advantage of the conception of structure-oriented theorem proving provided by Bibel's connection method. We show that the structure-sensitive nature of the connection method allows for an elegant characterization of proofs in default logic. After introducing our basic method for query answering in default logics, we present a corresponding algorithm and describe its implementation. Both the algorithm and its implementation are obtained by slightly modifying an existing algorithm and an existing implementation of the standard connection method. In turn, we give a couple of refinements of the basic method that lead to conceptually different algorithms. The approach turns out to be extraordinarily well suited to implementation by means of existing automated theorem proving techniques. We substantiate this claim by presenting implementations of the various algorithms along with some experimental analysis.
Even though our method has a general nature, we introduce it in the first part of this paper with the example of constrained default logic. This default logic is tantamount to a variant due to Brewka, and it coincides with Reiter's default logic and a variant due to Łukaszewicz on a large fragment of default logic. Accordingly, our exposition applies to these instances of default logic without any modifications.
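To illustrate the basic idea (our example): given the fact $\mathit{bird}(\mathit{tweety})$ and the default $\mathit{bird}(x) : \mathit{flies}(x) \,/\, \mathit{flies}(x)$, the query $\mathit{flies}(\mathit{tweety})$ is answered by using the default as the classical implication $\mathit{bird}(\mathit{tweety}) \rightarrow \mathit{flies}(\mathit{tweety})$ within a connection proof, subject to the qualifying condition that the justification $\mathit{flies}(\mathit{tweety})$ remains consistent with the facts and with the consequents of the other defaults used in the proof; it is this compatibility check that the structure-sensitive connection method characterizes directly on the proof structure.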