# Journal of Automated Reasoning

Online ISSN: 1573-0670
Print ISSN: 0168-7433
Recent publications
• Richard Schmoetten
• Jake E. Palmer
• Jacques D. Fleuriot
Special relativity is a cornerstone of modern physical theory. While a standard coordinate model is well known and widely taught today, multiple axiomatic systems for SR have been constructed over the past century. This paper reports on the formalisation of one such system, which is closer in spirit to Hilbert’s axiomatic approach to Euclidean geometry than to the vector space approach employed by Minkowski. We present a mechanisation in Isabelle/HOL of the system of axioms as well as theorems relating to temporal order. Some proofs are discussed, particularly where the formal work required additional steps, alternative approaches or corrections to Schutz’ prose.

This article presents Pardinus\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathsf {Pardinus}$$\end{document}, an extension of the popular Kodkod\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathsf {Kodkod}$$\end{document} relational model finder with linear temporal logic (including past operators), to simplify the analysis of dynamic systems. Pardinus\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathsf {Pardinus}$$\end{document} includes a SAT-based bounded-model checking engine and an SMV-based complete model checking engine, both allowing iteration through the different instances (or counter-examples) of a specification. It also supports a decomposed parallel analysis strategy that improves the efficiency of both analysis engines on commodity multi-core machines.

Dedekind domains and their class groups are notions in commutative algebra that are essential in algebraic number theory. We formalized these structures and several fundamental properties, including number-theoretic finiteness results for class groups, in the Lean prover as part of the mathlib mathematical library. This paper describes the formalization process, noting the idioms we found useful in our development and mathlib’s decentralized collaboration processes involved in this project.

The Robinson unification algorithm has exponential worst case behavior. This triggered the development of (semi-)linear versions around 1976 by Martelli and Montanari as well as by Paterson and Wegman (J Comput Syst Sci 16(2):158–167, 1978, https://doi.org/10.1016/0022-0000(78)90043-0). Another version emerged by Baader and Snyder around 2001. While these versions are distinctly faster on larger input pairs, the Robinson version still does better than them on small-sized inputs. This paper describes yet another (semi-)linear version that is faster and challenges also the Robinson version on small-sized inputs. All versions have been implemented and compared against each other on different types and sizes of input pairs.

We provide a formal semantics for a large subset of the Lua programming language, in its version 5.2. The semantics is a major part of an ongoing effort to construct reliable tools to analyze Lua code. In this work, we present the details of several key aspects of the language, like the semantics of its only structured data-type (tables), its meta-programming mechanism (metatables), error handling, and how these mechanisms are used to define a complex dynamic semantics that must deal with several possible erroneous situations during run time, given the nature of the language. The semantics is mechanized in Redex, a DSL specially designed to specify and debug operational semantics. We validated the mechanization in two ways: first, by executing within Redex the test suite of the reference interpreter of Lua, and second, by specifying and performing random testing of its fundamental properties using the redex-check tool. Together, they evidence that our model soundly captures the semantics of the selected fragment of the language. Additionally, we address some of the performance problems that typically arise when testing a mechanization in Redex, by using a simple implementation of a reachability-based garbage collector that captures key aspects of Lua’s. By collecting syntactic garbage, we reduce the size of configurations during run time. Finally, we briefly discuss this avenue of development of our semantics, together with the implementation of a prototype tool to perform static analysis of Lua programs.

The word problem for a finite set of ground identities is known to be decidable in polynomial time using congruence closure, and this is also the case if some of the function symbols are assumed to be commutative or defined by certain shallow identities, called strongly shallow. We show that decidability in P is preserved if we add the assumption that certain function symbols f are extensional in the sense that f(s1,…,sn)≈f(t1,…,tn)\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$f(s_1,\ldots ,s_n) \mathrel {\approx }f(t_1,\ldots ,t_n)$$\end{document} implies s1≈t1,…,sn≈tn\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$s_1 \mathrel {\approx }t_1,\ldots ,s_n \mathrel {\approx }t_n$$\end{document}. In addition, we investigate a variant of extensionality that is more appropriate for commutative function symbols, but which raises the complexity of the word problem to coNP.

We consider three graphs, G7,3\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$G_{7,3}$$\end{document}, G7,4\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$G_{7,4}$$\end{document}, and G7,6\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$G_{7,6}$$\end{document}, related to Keller’s conjecture in dimension 7. The conjecture is false for this dimension if and only if at least one of the graphs contains a clique of size 27=128\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$2^7 = 128$$\end{document}. We present an automated method to solve this conjecture by encoding the existence of such a clique as a propositional formula. We apply satisfiability solving combined with symmetry-breaking techniques to determine that no such clique exists. This result implies that every unit cube tiling of R7\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathbb {R}^7$$\end{document} contains a facesharing pair of cubes. Since a faceshare-free unit cube tiling of R8\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathbb {R}^8$$\end{document} exists (which we also verify), this completely resolves Keller’s conjecture.

Algebraic datatypes, and among them lists and trees, have attracted a lot of interest in automated reasoning and Satisfiability Modulo Theories (SMT). Since its latest stable version, the SMT-LIB standard defines a theory of algebraic datatypes, which is currently supported by several mainstream SMT solvers. In this paper, we study this particular theory of datatypes and prove that it is strongly polite, showing how it can be combined with other arbitrary disjoint theories using polite combination. The combination method uses a new, simple, and natural notion of additivity that enables deducing strong politeness from (weak) politeness.

Concrete domains have been introduced in the area of Description Logic to enable reference to concrete objects (such as numbers) and predefined predicates on these objects (such as numerical comparisons) when defining concepts. Unfortunately, in the presence of general concept inclusions (GCIs), which are supported by all modern DL systems, adding concrete domains may easily lead to undecidability. To regain decidability of the DL $$\mathcal {ALC}$$ ALC in the presence of GCIs, quite strong restrictions, in sum called $$\omega$$ ω -admissibility, were imposed on the concrete domain. On the one hand, we generalize the notion of $$\omega$$ ω -admissibility from concrete domains with only binary predicates to concrete domains with predicates of arbitrary arity. On the other hand, we relate $$\omega$$ ω -admissibility to well-known notions from model theory. In particular, we show that finitely bounded homogeneous structures yield $$\omega$$ ω -admissible concrete domains. This allows us to show $$\omega$$ ω -admissibility of concrete domains using existing results from model theory. When integrating concrete domains into lightweight DLs of the $$\mathcal {EL}$$ EL family, achieving decidability is not enough. One wants reasoning in the resulting DL to be tractable. This can be achieved by using so-called p-admissible concrete domains and restricting the interaction between the DL and the concrete domain. We investigate p-admissibility from an algebraic point of view. Again, this yields strong algebraic tools for demonstrating p-admissibility. In particular, we obtain an expressive numerical p-admissible concrete domain based on the rational numbers. Although $$\omega$$ ω -admissibility and p-admissibility are orthogonal conditions that are almost exclusive, our algebraic characterizations of these two properties allow us to locate an infinite class of p-admissible concrete domains whose integration into $$\mathcal {ALC}$$ ALC yields decidable DLs.

SQL is the world’s most popular declarative language, forming the basis of the multi-billion-dollar database industry. Although SQL has been standardized, the full standard is based on ambiguous natural language rather than formal specification. Commercial SQL implementations interpret the standard in different ways, so that, given the same input data, the same query can yield different results depending on the SQL system it is run on. Even for a particular system, mechanically checked formalization of all widely-used features of SQL remains an open problem. The lack of a well-understood formal semantics makes it very difficult to validate the soundness of database implementations. Although formal semantics for fragments of SQL were designed in the past, they usually did not support set and bag operations, lateral joins, nested subqueries, and, crucially, null values. Null values complicate SQL’s semantics in profound ways analogous to null pointers or side-effects in other programming languages. Since certain SQL queries are equivalent in the absence of null values, but produce different results when applied to tables containing incomplete data, semantics which ignore null values are able to prove query equivalences that are unsound in realistic databases. A formal semantics of SQL supporting all the aforementioned features was only proposed recently. In this paper, we report about our mechanization of SQL semantics covering set/bag operations, lateral joins, nested subqueries, and nulls, written in the Coq proof assistant, and describe the validation of key metatheoretic properties. Additionally, we are able to use the same framework to formalize the semantics of a flat relational calculus (with null values), and show a certified translation of its normal forms into SQL.

Interpretation methods constitute a foundation of the termination analysis of term rewriting. From time to time, remarkable instances of interpretation methods appeared, such as polynomial interpretations, matrix interpretations, arctic interpretations, and their variants. In this paper, we introduce a general framework, the tuple interpretation method, that subsumes these variants as well as many previously unknown interpretation methods as instances. Employing the notion of derivers, we prove the soundness of the proposed method in an elegant way. We implement the proposed method in the termination prover NaTT and verify its significance through experiments.

We show how computation of left Kan extensions can be reduced to computation of free models of cartesian (finite-limit) theories. We discuss how the standard and parallel chase compute weakly free models of regular theories and free models of cartesian theories and compare the concept of “free model” with a similar concept from database theory known as “universal model”. We prove that, as algorithms for computing finite-free models of cartesian theories, the standard and parallel chase are complete under fairness assumptions. Finally, we describe an optimized implementation of the parallel chase specialized to left Kan extensions that achieves an order of magnitude improvement in our performance benchmarks compared to the next fastest left Kan extension algorithm we are aware of.

A crucial operation of saturation theorem provers is deletion of subsumed formulas. Designers of proof calculi, however, usually discuss this only informally, and the rare formal expositions tend to be clumsy. This is because the equivalence of dynamic and static refutational completeness holds only for derivations where all deleted formulas are redundant, but the standard notion of redundancy is too weak: A clause C does not make an instance $$C\sigma$$ C σ redundant. We present a framework for formal refutational completeness proofs of abstract provers that implement saturation calculi, such as ordered resolution and superposition. The framework modularly extends redundancy criteria derived via a familiar ground-to-nonground lifting. It allows us to extend redundancy criteria so that they cover subsumption, and also to model entire prover architectures so that the static refutational completeness of a calculus immediately implies the dynamic refutational completeness of a prover implementing the calculus within, for instance, an Otter or DISCOUNT loop. Our framework is mechanized in Isabelle/HOL.

This work presents formal correctness proofs in Isabelle/HOL of algorithms to transform a matrix into Smith normal form, a canonical matrix form, in a general setting: the algorithms are written in an abstract form and parameterized by very few simple operations. We formally show their soundness provided the operations exist and satisfy some conditions, which always hold on Euclidean domains. We also provide a formal proof on some results about the generality of such algorithms as well as the uniqueness of the Smith normal form. Since Isabelle/HOL does not feature dependent types, the development is carried out by switching conveniently between two different existing libraries by means of the lifting and transfer package and the use of local type definitions, a sound extension to HOL.

This article is a tribute to the scientific legacy of automated reasoning pioneer and JAR founder Lawrence T. (Larry) Wos. Larry’s main technical contributions were the set-of-support strategy for resolution theorem proving, and the demodulation and paramodulation inference rules for building equality into resolution. Starting from the original definitions of these concepts in Larry’s papers, this survey traces their evolution, unearthing the often forgotten trails that connect Larry’s original definitions to those that became standard in the field.

We present novel reductions of extensions of the basic modal logic $${\textsf {K} }$$ K with axioms $$\textsf {B}$$ B , $$\textsf {D}$$ D , $$\textsf {T}$$ T , $$\textsf {4}$$ 4 and $$\textsf {5}$$ 5 to Separated Normal Form with Sets of Modal Levels $$\textsf {SNF} _{sml}$$ SNF sml . The reductions typically result in smaller formulae than the reductions by Kracht. The reductions to $$\textsf {SNF} _{sml}$$ SNF sml combined with a reduction to $$\textsf {SNF} _{sml}$$ SNF sml allow us to use the local reasoning of the prover $${\text {K}_{_\text {S}}}{\text {P}}$$ K S P to determine the satisfiability of modal formulae in the considered logics. We show experimentally that the combination of our reductions with the prover $${\text {K}_{_\text {S}}}{\text {P}}$$ K S P performs well when compared with a specialised resolution calculus for these logics, the built-in reductions of the first-order prover SPASS, and the higher-order logic prover LEO-III.

In contrast to common automated theorem proving approaches, in which the search space is a set of some formulae and what is sought is again a (goal) formula, we propose an approach based on searching for a proof (of a given length) as a whole. Namely, a proof of a formula in a fixed logical setting can be encoded as a sequence of natural numbers meeting some conditions and a suitable constraint solver can find such a sequence. The sequence can then be decoded giving a proof in the original theory language. This approach leads to several unique features, for instance, it can provide shortest proofs. In this paper, we focus on proofs in coherent logic, an expressive fragment of first-order logic, and on SAT and SMT solvers for solving sets of constraints, but the approach could be tried in other contexts as well. We implemented the proposed method and we present its features, perspectives and performances. One of the features of the implemented prover is that it can generate human understandable proofs in natural language and also machine-verifiable proofs for the interactive prover Coq.

Uniform interpolants were largely studied in non-classical propositional logics since the nineties, and their connection to model completeness was pointed out in the literature. A successive parallel research line inside the automated reasoning community investigated uniform quantifier-free interpolants (sometimes referred to as “covers”) in first-order theories. In this paper, we investigate cover transfer to theory combinations in the disjoint signatures case. We prove that, for convex theories, cover algorithms can be transferred to theory combinations under the same hypothesis needed to transfer quantifier-free interpolation (i.e., the equality interpolating property, aka strong amalgamation property). The key feature of our algorithm relies on the extensive usage of the Beth definability property for primitive fragments to convert implicitly defined variables into their explicitly defining terms. In the non-convex case, we show by a counterexample that covers may not exist in the combined theories, even in case combined quantifier-free interpolants do exist. However, we exhibit a cover transfer algorithm operating also in the non-convex case for special kinds of theory combinations; these combinations (called ‘tame combinations’) concern multi-sorted theories arising in many model-checking applications (in particular, the ones oriented to verification of data-aware processes).

In this paper, we investigate variants of cutting plane proof systems for a class of integer programs called Horn constraint systems (HCS). Briefly, a system of linear inequalities A·x≥b\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathbf{A \cdot x \ge b}$$\end{document} is called a Horn constraint system, if each entry in A\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathbf{A}$$\end{document} belongs to the set {0,1,-1}\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\{0,1,-1\}$$\end{document} and furthermore there is at most one positive entry per row. Our focus is on deriving refutations, i.e., proofs of unsatisfiability of such programs in variants of the cutting plane proof system. Horn systems generalize Horn formulas, i.e., CNF formulas with at most one positive literal per clause. A Horn system which results from rewriting a Horn clausal formula is called a Horn clausal constraint system (HClCS). The cutting plane calculus (CP) is a well-known calculus for deciding the unsatisfiability of propositional CNF formulas and integer programs. Usually, CP consists of the addition rule (ADD) and the division rule (DIV). We show that the cutting plane calculus with the addition rule only (CP-ADD) does not require constraints of the form 0≤xi≤1\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$0 \le x_i \le 1$$\end{document}. We also investigate the existence of read-once refutations in Horn clausal constraint systems in the cutting plane proof system. We show that read-once refutations are incomplete. We show that the problem of finding a read-once refutation using only the ADD rule of an HCS is NP-hard even when the right-hand sides of the HCS belong to the set {0,1}\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\{0,1\}$$\end{document}. Additionally, we show that the problem of finding a read-once refutation using only the ADD rule of an HClCS is NP-hard. We then show that these problems remain hard when we can use both the ADD and DIV rules. We then show that the problem of finding a shortest read-once refutation of an HCS whose right-hand sides belong to the set {0,1}\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\{0,1\}$$\end{document} is NPO BP-complete when the refutation can use only the ADD rule or the refutation can use both the ADD and DIV rules. Finally, we provide a parameterized exponential time algorithm for finding a read-once refutation of a system of Horn constraints using only the ADD rule.

We implement a user-extensible ad hoc connection between the Lean proof assistant and the computer algebra system Mathematica. By reflecting the syntax of each system in the other and providing a flexible interface for extending translation, our connection allows for the exchange of arbitrary information between the two systems. We show how to make use of the Lean metaprogramming framework to verify certain Mathematica computations, so that the rigor of the proof assistant is not compromised. We also use Mathematica as an untrusted oracle to guide proof search in the proof assistant and interact with a Mathematica notebook from within a Lean session. In the other direction, we import and process Lean declarations from within Mathematica. The proof assistant library serves as a database of mathematical knowledge that the CAS can display and explore.

Integration, just as much as differentiation, is a fundamental calculus tool that is widely used in many scientific domains. Formalizing the mathematical concept of integration and the associated results in a formal proof assistant helps in providing the highest confidence on the correctness of numerical programs involving the use of integration, directly or indirectly. By its capability to extend the (Riemann) integral to a wide class of irregular functions, and to functions defined on more general spaces than the real line, the Lebesgue integral is perfectly suited for use in mathematical fields such as probability theory, numerical mathematics, and real analysis. In this article, we present the Coq formalization of $$\sigma$$-algebras, measures, simple functions, and integration of nonnegative measurable functions, up to the full formal proofs of the Beppo Levi (monotone convergence) theorem and Fatou’s lemma. More than a plain formalization of the known literature, we present several design choices made to balance the harmony between mathematical readability and usability of Coq theorems. These results are a first milestone toward the formalization of $$L^p$$ spaces such as Banach spaces.

The quest to develop increasingly sophisticated verification systems continues unabated. Tools such as Dafny, Spec#, ESC/Java, SPARK Ada and Whiley attempt to seamlessly integrate specification and verification into a programming language, in a similar way to type checking. A common integration approach is to generate verification conditions that are handed off to an automated theorem prover. This provides a nice separation of concerns and allows different theorem provers to be used interchangeably. However, generating verification conditions is still a difficult undertaking and the use of more “high-level” intermediate verification languages has become commonplace. In particular, Boogie provides a widely used and understood intermediate verification language. A common difficulty is the potential for an impedance mismatch between the source language and the intermediate verification language. In this paper, we explore the use of Boogie as an intermediate verification language for verifying programs in Whiley. This is noteworthy because the Whiley language has (amongst other things) a rich type system with considerable potential for an impedance mismatch. We provide a comprehensive account of translating Whiley to Boogie which demonstrates that it is possible to model most aspects of the Whiley language. Key challenges posed by the Whiley language included: the encoding of Whiley’s expressive type system and support for flow typing and generics; the implicit assumption that expressions in specifications are well defined; the ability to invoke methods from within expressions; the ability to return multiple values from a function or method; the presence of unrestricted lambda functions; and the limited syntax for framing. We demonstrate that the resulting verification tool can verify significantly more programs than the native Whiley verifier which was custom-built for Whiley verification. Furthermore, our work provides evidence that Boogie is (for the most part) sufficiently general to act as an intermediate language for a wide range of source languages.

This paper celebrates the scientific discoveries and the service to the automated reasoning community of Lawrence (Larry) T. Wos, who passed away in August 2020. The narrative covers Larry’s most long-lasting ideas about inference rules and search strategies for theorem proving, his work on applications of theorem proving, and a collection of personal memories and anecdotes that let readers appreciate Larry’s personality and enthusiasm for automated reasoning.

Shortly before Larry Wos passed away, he sent a manuscript for discussion to Sophie Tourret, the editor of the AAR newsletter. We present excerpts from this final manuscript, put it in its historic context and explain its relevance for today’s research in automated reasoning.

Weak memory presents a new challenge for program verification and has resulted in the development of a variety of specialised logics. For C11-style memory models, our previous work has shown that it is possible to extend Hoare logic and Owicki–Gries reasoning to verify correctness of weak memory programs. The technique introduces a set of high-level assertions over C11 states together with a set of basic Hoare-style axioms over atomic weak memory statements (e.g. reads/writes), but retains all other standard proof obligations for compound statements. This paper takes this line of work further by introducing the first deductive verification environment in Isabelle/HOL for C11-like weak memory programs. This verification environment is built on the Nipkow and Nieto’s encoding of Owicki–Gries in the Isabelle theorem prover. We exemplify our techniques over several litmus tests from the literature and two non-trivial examples: Peterson’s algorithm and a read–copy–update algorithm adapted for C11. For the examples we consider, the proof outlines can be automatically discharged using the existing Isabelle tactics developed by Nipkow and Nieto. The benefit here is that programs can be written using a familiar pseudocode syntax with assertions embedded directly into the program.

We present a semantic framework for the deductive verification of hybrid systems with Isabelle/HOL. It supports reasoning about the temporal evolutions of hybrid programs in the style of differential dynamic logic modelled by flows or invariant sets for vector fields. We introduce the semantic foundations of this framework and summarise their Isabelle formalisation as well as the resulting verification components. A series of simple examples shows our approach at work.

We investigate the proof complexity of modal resolution systems developed by Nalon and Dixon (J Algorithms 62(3–4):117–134, 2007) and Nalon et al. (in: Automated reasoning with analytic Tableaux and related methods—24th international conference, (TABLEAUX’15), pp 185–200, 2015), which form the basis of modal theorem proving (Nalon et al., in: Proceedings of the twenty-sixth international joint conference on artificial intelligence (IJCAI’17), pp 4919–4923, 2017). We complement these calculi by a new tighter variant and show that proofs can be efficiently translated between all these variants, meaning that the calculi are equivalent from a proof complexity perspective. We then develop the first lower bound technique for modal resolution using Prover–Delayer games, which can be used to establish “genuine” modal lower bounds for size of dag-like modal resolution proofs. We illustrate the technique by devising a new modal pigeonhole principle, which we demonstrate to require exponential-size proofs in modal resolution. Finally, we compare modal resolution to the modal Frege systems of Hrubeš (Ann Pure Appl Log 157(2–3):194–205, 2009) and obtain a “genuinely” modal separation.

Search-based satisfiability procedures try to build a model of the input formula by simultaneously proposing candidate models and deriving new formulae implied by the input. Conflict-driven procedures perform non-trivial inferences only when resolving conflicts between formulæ and assignments representing the candidate model. CDSAT ( Conflict-Driven SATisfiability ) is a method for conflict-driven reasoning in unions of theories . It combines inference systems for individual theories as theory modules within a solver for the union of the theories. This article augments CDSAT with a more general lemma learning capability and with proof generation . Furthermore, theory modules for several theories of practical interest are shown to fulfill the requirements for completeness and termination of CDSAT. Proof generation is accomplished by a proof-carrying version of the CDSAT transition system that produces proof objects in memory accommodating multiple proof formats. Alternatively, one can apply to CDSAT the LCF approach to proofs from interactive theorem proving, by defining a kernel of reasoning primitives that guarantees the correctness by construction of CDSAT proofs.

This work presents a formalization in PVS of the computational theory for a computational model given as a class of partial recursive functions called PVS0. The model is built over basic operators, which, when restricted to constants, successor, projections, greater-than, and bijections from tuples of naturals to naturals, results in a proven (formalized) Turing complete model. Complete formalizations of the Recursion theorem and Rice’s theorem are discussed in detail. Other relevant results, such as the undecidability of the Halting problem and the fixed-point theorem, were also fully formalized.

Superposition is among the most successful calculi for first-order logic. Its extension to higher-order logic introduces new challenges such as infinitely branching inference rules, new possibilities such as reasoning about Booleans, and the need to curb the explosion of specific higher-order rules. We describe techniques that address these issues and extensively evaluate their implementation in the Zipperposition theorem prover. Largely thanks to their use, Zipperposition won the higher-order division of the CASC-J10 competition.

Aharoni et al. (J Combinat Theory, Ser B 101:1–17, 2010) proved the max-flow min-cut theorem for countable networks, namely that in every countable network with finite edge capacities, there exists a flow and a cut such that the flow saturates all outgoing edges of the cut and is zero on all incoming edges. In this paper, we formalize their proof in Isabelle/HOL and thereby identify and fix several problems with their proof. We also provide a simpler proof for networks where the total outgoing capacity of all vertices other than the source and the sink is finite. This proof is based on the max-flow min-cut theorem for finite networks. As a use case, we formalize a characterization theorem for relation lifting on discrete probability distributions and two of its applications.

In his regular column in the AAR Newsletter, Larry Wos typically posed challenges to the automated reasoning community. Some of these challenges concern general research objectives, but several of them concern specific problems relevant to his own experience and attempts to solve especially difficult problems, including open questions, with an automated theorem prover. This note honors Larry’s memory by working through a solution to one of these challenges.

We present a number of alternative ways of handling transitive binary relations that commonly occur in first-order problems, in particular equivalence relations , total orders , and transitive relations in general. We show how such relations can be discovered syntactically in an input theory, and how they can be expressed in alternative ways. We experimentally evaluate different such ways on problems from the TPTP, using resolution-based reasoning tools as well as instance-based tools. Our conclusions are that (1) it is beneficial to consider different treatments of binary relations as a user, and that (2) reasoning tools could benefit from using a preprocessor or even built-in support for certain types of binary relations.

The Tokeneer project was an initiative set forth by the National Security Agency (NSA, USA) to be used as a demonstration that developing highly secure systems can be made by applying rigorous methods in a cost-effective manner. Altran UK was selected by NSA to carry out the development of the Tokeneer ID Station. The company wrote a Z specification later implemented in the SPARK Ada programming language, which was verified using the SPARK Examiner toolset. In this paper, we show that the Z specification can be readily and naturally encoded in the $$\{log\}$$ set constraint language, thereby generating a functional prototype. Furthermore, we show that $$\{log\}$$ ’s automated proving capabilities can discharge all the proof obligations concerning state invariants as well as important security properties. As a consequence, the prototype can be regarded as correct with respect to the verified properties. This provides empirical evidence that Z users can use $$\{log\}$$ to generate correct prototypes from their Z specifications. In turn, these prototypes enable or simplify some verification activities discussed in the paper.

The interoperability of proof assistants and the integration of their libraries is a highly valued but elusive goal in the field of theorem proving. As a preparatory step, in previous work, we translated the libraries of multiple proof assistants, specifically the ones of Coq, HOL Light, IMPS, Isabelle, Mizar, and PVS into a universal format: OMDoc/MMT. Each translation presented great theoretical, technical, and social challenges, some universal and some system-specific, some solvable and some still open. In this paper, we survey these challenges and compare and evaluate the solutions we chose. We believe similar library translations will be an essential part of any future system interoperability solution, and our experiences will prove valuable to others undertaking such efforts.

Human-centered automated proof search aims to capture structures of ordinary mathematical proofs and discover human strategies that are used (implicitly) in their construction. We analyze the ways of two theorem provers for approaching that goal. One, the G&G-prover, is presented in Ganesalingam and Gowers (J Autom Reason 58(2):253–291, 2017); the other, Sieg’s AProS system, is described in Sieg and Walsh (Rev Symb Logic 1-35, 2019). Both systems make explicit, via their underlying logical calculi, the goal-directedness and bi-directionality of proof construction. However, the calculus for the G&G-prover is a weak fragment of minimal first-order logic, whereas AProS uses complete calculi for intuitionist and classical first-order logic. The strategies for the construction of proofs are dramatically different as well. The G&G-prover uses a waterfall strategy and is thus restricted to problems that can be solved without backtracking. The AProS strategies, by contrast, support a complete search procedure with backtracking. These divergences are rooted in the fact that the concrete goals of the systems are different: The G&G-prover is to yield write-ups indistinguishable from good mathematical writing; AProS is to yield humanly intelligible formal proofs by logically and mathematically motivated strategies. In our final Programmatic remarks, we sketch a plausible, but difficult project for achieving more fully G&G’s broad goals by radically separating proof search from proof translation: one could use AProS for the proof search and then exploit the strategic structure of the completed proof as the deterministic underpinning for its translation into a natural language.

Context-sensitive rewriting is a restriction of rewriting where reduction steps are allowed on specific arguments µ(f) ⊆ {1,. .. , k} of k-ary function symbols f only. Terms which cannot be further rewritten in this way are called µ-normal forms. For left-linear Term Rewriting Systems (TRSs), the so-called normalization via µ-normalization procedure provides a systematic way to obtain normal forms by the stepwise computation and combination of intermediate µ-normal forms. In this paper we show how to obtain bounds on the derivational complexity of computations using this procedure by using bounds on the derivational complexity of context-sensitive rewriting. Two main applications are envisaged: normalization via µ-normalization can be used with non-terminating TRSs where the procedure still terminates; on the other hand, it can be used to improve on bounds of derivational complexity of terminating TRSs as it discards many rewritings.

This paper presents a PVS development of relevant results of the theory of rings. The PVS theory includes complete proofs of the three classical isomorphism theorems for rings, and characterizations of principal, prime and maximal ideals. Algebraic concepts and properties are specified and formalized as generally as possible allowing in this manner their application to other algebraic structures. The development provides the required elements to formalize important algebraic theorems. In particular, the paper presents the formalization of the general algebraic-theoretical version of the Chinese remainder theorem (CRT) for the theory of rings, as given in abstract algebra textbooks, proved as a consequence of the first isomorphism theorem. Also, the PVS theory includes a formalization of the number-theoretical version of CRT for the structure of integers, which is the version of CRT found in formalizations. CRT for integers is obtained as a consequence of the general version of CRT for the theory of rings.

We present an abstract development of Gödel’s incompleteness theorems, performed with the help of the Isabelle/HOL proof assistant. We analyze sufficient conditions for the applicability of our theorems to a partially specified logic. In addition to the usual benefits of generality, our abstract perspective enables a comparison between alternative approaches from the literature. These include Rosser’s variation of the first theorem, Jeroslow’s variation of the second theorem, and the Świerczkowski–Paulson semantics-based approach. As part of the validation of our framework, we upgrade Paulson’s Isabelle proof to produce a mechanization of the second theorem that does not assume soundness in the standard model, and in fact does not rely on any notion of model or semantic interpretation.

We designed a superposition calculus for a clausal fragment of extensional polymorphic higher-order logic that includes anonymous functions but excludes Booleans. The inference rules work on $$\beta \eta$$ β η -equivalence classes of $$\lambda$$ λ -terms and rely on higher-order unification to achieve refutational completeness. We implemented the calculus in the Zipperposition prover and evaluated it on TPTP and Isabelle benchmarks. The results suggest that superposition is a suitable basis for higher-order reasoning.

We propose a lazy decision procedure for the logic WSk\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$k$$\end{document}S. It builds a term-based symbolic representation of the state space of the tree automaton (TA) constructed by the classical WSk\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$k$$\end{document}S decision procedure. The classical decision procedure transforms the symbolic representation into a TA via a bottom-up traversal and then tests its language non-emptiness, which corresponds to satisfiability of the formula. On the other hand, we start evaluating the representation from the top, construct the state space on the fly, and utilize opportunities to prune away parts of the state space irrelevant to the language emptiness test. In order to do so, we needed to extend the notion of language terms (denoting language derivatives) used in our previous procedure for the linear fragment of the logic (the so-called WS1S) into automata terms. We implemented our decision procedure and identified classes of formulae on which our prototype implementation is significantly faster than the classical procedure implemented in the Mona tool.

Optimization modulo theories (OMT) is an important extension of SMT which allows for finding models that optimize given objective functions, typically consisting in linear-arithmetic or Pseudo-Boolean terms. However, many SMT and OMT applications, in particular from SW and HW verification, require handling bit-precise representations of numbers, which in SMT are handled by means of the theory of bit-vectors ( $${{\mathcal {B}}}{{\mathcal {V}}}$$ B V ) for the integers and that of floating-point numbers ( $$\mathcal {FP}$$ FP ) for the reals respectively. Whereas an approach for OMT with (unsigned) $${{\mathcal {B}}}{{\mathcal {V}}}$$ B V objectives has been proposed by Nadel & Ryvchin, unfortunately we are not aware of any existing approach for OMT with $$\mathcal {FP}$$ FP objectives. In this paper we fill this gap, and we address for the first time $$\text {OMT}$$ OMT with $$\mathcal {FP}$$ FP objectives. We present a novel OMT approach, based on the novel concept of attractor and dynamic attractor , which extends the work of Nadel and Ryvchin to work with signed- $${{\mathcal {B}}}{{\mathcal {V}}}$$ B V objectives and, most importantly, with $$\mathcal {FP}$$ FP objectives. We have implemented some novel $$\text {OMT}$$ OMT procedures on top of OptiMathSAT and tested them on modified problems from the SMT-LIB repository. The empirical results support the validity and feasibility of our novel approach.

Uniform interpolants have been largely studied in non-classical propositional logics since the nineties; a successive research line within the automated reasoning community investigated uniform quantifier-free interpolants (sometimes referred to as “covers”) in first-order theories. This further research line is motivated by the fact that uniform interpolants offer an effective solution to tackle quantifier elimination and symbol elimination problems, which are central in model checking infinite state systems. This was first pointed out in ESOP 2008 by Gulwani and Musuvathi, and then by the authors of the present contribution in the context of recent applications to the verification of data-aware processes. In this paper, we show how covers are strictly related to model completions, a well-known topic in model theory. We also investigate the computation of covers within the Superposition Calculus, by adopting a constrained version of the calculus and by defining appropriate settings and reduction strategies. In addition, we show that computing covers is computationally tractable for the fragment of the language used when tackling the verification of data-aware processes. This observation is confirmed by analyzing the preliminary results obtained using the mcmt tool to verify relevant examples of data-aware processes. These examples can be found in the last version of the tool distribution.

Many SMT solvers implement efficient SAT-based procedures for solving fixed-size bit-vector formulas. These techniques, however, cannot be used directly to reason about bit-vectors of symbolic bit-width. To address this shortcoming, we propose a translation from bit-vector formulas with parametric bit-width to formulas in a logic supported by SMT solvers that includes non-linear integer arithmetic, uninterpreted functions, and universal quantification. While this logic is undecidable, our approach can still solve many formulas that arise in practice by capitalizing on advances in SMT solving for non-linear arithmetic and universally quantified formulas. We provide several case studies in which we have applied this approach with promising results, including the bit-width independent verification of invertibility conditions, compiler optimizations, and bit-vector rewrite rules.

The development of domain-independent planners within the AI planning community is leading to “off-the-shelf” technology that can be used in a wide range of applications. Moreover, it allows a modular approach—in which planners and domain knowledge are modules of larger software applications—that facilitates substitutions or improvements of individual modules without changing the rest of the system. This approach also supports the use of reformulation and configuration techniques, which transform how a model is represented in order to improve the efficiency of plan generation. In this article, we investigate how the performance of domain-independent planners is affected by domain model configuration, i.e. the order in which elements are ordered in the model, particularly in the light of planner comparisons. We then introduce techniques for the online and offline configuration of domain models, and we analyse the impact of domain model configuration on other reformulation approaches, such as macros.

Top-cited authors
• University of Miami
• Connion GmbH
• University of Bialystok
• University of Bialystok
• University of Bialystok