Conference Paper

Adding parallelism capabilities to ACL2


Abstract

We have implemented parallelism primitives that permit an ACL2 programmer to parallelize execution of ACL2 functions. We (1) introduce logical definitions for these primitives, (2) explain the features of our extension, (3) give an evaluation strategy for our implementation, and (4) use the parallelism primitives in examples to show speedup.


... Several capabilities allow it to do some things that might be considered higher-order in nature: macros, a proof technique called functional instantiation, and oracle-apply. ACL2(p) [227] supports parallelism for execution and proof, and other infrastructure supports parallelism at the level of collections of files. Run-time assertions are supported, and many debugging tools are available for program execution and proof. ...
Preprint
Mechanical reasoning is a key area of research that lies at the crossroads of mathematical logic and artificial intelligence. The original aim in developing mechanical reasoning systems (also known as theorem provers) was to enable mathematicians to prove theorems with computer programs. However, these tools have evolved over time and now play a vital role in modeling and reasoning about complex and large-scale systems, especially safety-critical systems. Technically, mathematical formalisms and automated-reasoning-based approaches are employed to perform inferences and to generate proofs in theorem provers. In the literature, there is a shortage of comprehensive documents that provide proper guidance on choosing among theorem provers with respect to their designs, performance, logical frameworks, strengths, differences, and application areas. In this work, more than 40 theorem provers are studied in detail and compared to present a comprehensive analysis and evaluation of these tools. Theorem provers are investigated based on various parameters, which include implementation architecture, logic and calculus used, library support, level of automation, programming paradigm, programming language, differences, and application areas.
... So, following Ginsburg's lead, we also decided to develop a non-parallel reference implementation, this time in ACL2. We could possibly have explored the use of ACL2(p) [16] in our modeling, and developed a more faithful parallel implementation, but were uncertain about the interplay of ACL2(p) and stobjs. (One of the reviewers of this paper noted that, indeed, plet and stobjs are incompatible.) ...
Article
As Graphics Processing Units (GPUs) have gained in capability and GPU development environments have matured, developers are increasingly turning to the GPU to off-load the main host CPU of numerically-intensive, parallelizable computations. Modern GPUs feature hundreds of cores, and offer programming niceties such as double-precision floating point, and even limited recursion. This shift from CPU to GPU, however, raises the question: how do we know that these new GPU-based algorithms are correct? In order to explore this new verification frontier, we formalized a parallelizable all-pairs shortest path (APSP) algorithm for weighted graphs, originally coded in NVIDIA's CUDA language, in ACL2. The ACL2 specification is written using a single-threaded object (stobj) and tail recursion, as the stobj/tail recursion combination yields the most straightforward translation from imperative programming languages, as well as efficient, scalable executable specifications within ACL2 itself. The ACL2 version of the APSP algorithm can process millions of vertices and edges with little to no garbage generation, and executes at one-sixth the speed of a host-based version of APSP coded in C, a very respectable result for a theorem prover. In addition to formalizing the APSP algorithm (which uses Dijkstra's shortest path algorithm at its core), we have also provided a capability that the original APSP code lacked, namely shortest path recovery. Path recovery is accomplished using a secondary ACL2 stobj implementing a LIFO stack, which is proven correct. To conclude the experiment, we ported the ACL2 version of the APSP kernels back to C, resulting in a less than 5% slowdown, and also performed a partial back-port to CUDA, which, surprisingly, yielded a slight performance increase.
... It is this interactive delay that this dissertation seeks to improve. Additionally, we have fully implemented and integrated four parallelism primitives designed to allow an ACL2 user to evaluate expressions in parallel: plet, pargs, pand, and por [27, 28, 29]. As explained in section 4.2, we anticipate improving plet to be more versatile. ...
Article
Multi-core CPUs have become commonplace in desktop computers, but theorem provers often do not take advantage of the additional resources these CPUs provide in an interactive setting. This PhD proposal focuses on automatically using these additional resources to lessen the delay between when a user submits a conjecture to the ACL2 theorem prover and when the user receives feedback from the prover useful for learning how to be successful in completing a failed proof. Research contributions include: (1) maintaining the ability of users to interact with the theorem prover despite the use of parallel execution in its proof process, (2) mechanisms for providing early feedback to the user, (3) improving support for multi-threaded programming in an implementation language (Lisp) and a language used for reasoning (ACL2), and (4) evaluating the usefulness of parallelizing a theorem prover at the subgoal level.
... Future work includes further stressing of these multithreading components and fixing the bugs that cause the system to halt, wherever they may be. The ACL2 parallelism paper from 2006 (Rager 2006) documents an average speedup factor of 3.8x on four cores for the parallelized Fibonacci function. The results in the current implementation show a 1.7x and 3.5x speedup on dual and quad-core machines. ...
Article
This paper discusses four primitives supporting parallel evaluation for a functional subset of LISP, specifically that subset supported by the ACL2 theorem prover. These primitives can be used to provide parallel execution for functions free from side effects without considering race conditions, deadlocks, and other common parallelism pitfalls. We (1) introduce logical definitions for these primitives, (2) explain three features that improve the performance of these primitives, (3) give a brief explanation of the implementation, and (4) use the parallelism primitives in examples to show improvement in evaluation time.
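As a rough illustration of the idea behind evaluating a function's arguments in parallel (the role the citing works above attribute to primitives such as pargs), here is a minimal Python sketch. It is not ACL2 code and not the authors' implementation: `pargs_apply`, `fib`, and `pfib` are hypothetical names introduced only for this example. Because `fib` is free of side effects, its two recursive calls can run concurrently without locks.

```python
from concurrent.futures import ThreadPoolExecutor


def pargs_apply(fn, *thunks, executor):
    """Hypothetical helper (not an ACL2 API): run each zero-argument
    thunk on the executor, then apply fn to the collected results."""
    futures = [executor.submit(thunk) for thunk in thunks]
    return fn(*[f.result() for f in futures])


def fib(n):
    """Plain serial Fibonacci; pure, so safe to evaluate concurrently."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def pfib(n, executor):
    """Evaluate the two arguments of the top-level addition in parallel.
    Only the top level is parallelized, so no worker ever waits on
    another task submitted to the same pool."""
    if n < 2:
        return n
    return pargs_apply(
        lambda a, b: a + b,
        lambda: fib(n - 1),
        lambda: fib(n - 2),
        executor=executor,
    )


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=2) as pool:
        print(pfib(20, pool))
```

The sketch shows only the evaluation pattern. CPython threads do not speed up CPU-bound work like this, so it says nothing about the speedups reported for ACL2.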
...
  • Real numbers (through non-standard analysis) [8]
  • Hash cons, function memoization, and applicative hash tables [4]
  • Parallel evaluation [19]
  • more debugging tools, e.g. to trace or to inspect the rewrite stack
  • diverse tools for querying the logical database (see history)
  • quantification via Skolemization (see defun-sk) ...
Conference Paper
We describe a tutorial that demonstrates the use of the ACL2 theorem prover. We have three goals: to enable a motivated reader to start on a path towards effective use of ACL2; to provide ideas for other interactive theorem prover projects; and to elicit feedback on how we might incorporate features of other proof tools into ACL2.
... Question: How can I build an application that evaluates code in parallel? ACL2(p) is an experimental extension of ACL2 that incorporates research and code from David Rager [7,9]. Recent additions include a macro spec-mv-let, which allows speculative evaluation in parallel. ...
Article
The last several years have seen major enhancements to ACL2 functionality, largely driven by requests from its user community, including utilities now in common use such as 'make-event', 'mbe', and trust tags. In this paper we provide user-level summaries of some ACL2 enhancements introduced after the release of Version 3.5 (in May, 2009, at about the time of the 2009 ACL2 workshop) up through the release of Version 4.3 in July, 2011, roughly a couple of years later. Many of these features are not particularly well known yet, but most ACL2 users could take advantage of at least some of them. Some of the changes could affect existing proof efforts, such as a change that treats pairs of functions such as 'member' and 'member-equal' as the same function.
Conference Paper
We address the multicore problem for interactive theorem proving, notably for Isabelle. The stagnation of CPU clock frequency since 2005 means that hardware manufacturers multiply cores to keep up with “Moore’s Law”, but this imposes the burden of explicit parallelism on application developers. To cope with this trend, Isabelle started to support parallel theory and proof processing in 2007, and has continuously improved its use of multicore hardware in recent years. This is of practical relevance to theory and proof development, since their size and complexity are roughly correlated with the real time required for re-checking. Scaling up the prover on parallel hardware will facilitate maintenance of larger theory libraries, for example. Our approach to parallel processing in Isabelle is mostly implicit, without user intervention. The system is able to exploit the inherent problem-structure of LCF-style proof checking, although it requires substantial reforms of the prover architecture and its implementation. Thus the user gains significant speedup factors on typical commodity hardware with 2–32 cores; saturation of 8 cores is already routine in many applications. The present paper provides an overview of the current state of shared-memory multiprocessing in Isabelle2013, which also benefits from recent improvements of parallel memory management in Poly/ML (by David Matthews). We discuss common requirements, problems, and solutions. Concrete performance figures are analyzed for some applications from the Isabelle distribution and the Archive of Formal Proofs (AFP).
Article
One of the most attractive features of functional programming languages is their suitability for programming parallel computers. This paper is devoted to discussion of such a claim. Firstly, parallel functional programming is discussed from the programmer's point of view. Secondly, since most parallel functional language implementations are based on the concept of graph reduction, the issues raised by graph reduction are discussed. Finally, the paper concludes with a case study of a particular parallel graph reduction machine and a survey of other parallel architectures.
Article
We present a new finite set theory implementation for ACL2 wherein sets are implemented as fully ordered lists. This order unifies the notions of set equality and element equality by creating a unique representation for each set, which in turn enables nested sets to be trivially supported and eliminates the need for congruence rules. We demonstrate that ordered sets can be reasoned about in the traditional style of membership arguments. Using this technique, we prove the classic properties of set operations in a natural and effortless manner. We then use the exciting new MBE feature of ACL2 to provide linear-time implementations of all basic set operations. These optimizations are made "behind the scenes" and do not adversely impact reasoning ability. We finally develop a framework for reasoning about quantification over set elements. We also begin to provide common higher-order patterns from functional programming. The net result is an efficient library that is easy to use and reason about.
Conference Paper
As the need for high-speed computers increases, the need for multi-processors will become more apparent. One of the major stumbling blocks to the development of useful multi-processors has been the lack of a good multi-processing language, one which is both powerful and understandable to programmers. Among the most compute-intensive programs are artificial intelligence (AI) programs, and researchers hope that the potential degree of parallelism in AI programs is higher than in many other applications. In this paper we propose multi-processing extensions to Lisp. Unlike other proposed multi-processing Lisps, this one provides only a few very powerful and intuitive primitives rather than a number of parallel variants of familiar constructs.
Conference Paper
Multilisp is an extension of Lisp (more specifically, of the Lisp dialect Scheme [15]) with additional operators and additional semantics to deal with parallel execution. It is being implemented on the 32-processor Concert multiprocessor. The current implementation is complete enough to run the Multilisp compiler itself, and has been run on Concert prototypes including up to four processors. Novel techniques are used for task scheduling and garbage collection. The task scheduler helps control excessive resource utilization by means of an unfair scheduling policy; the garbage collector uses a multiprocessor algorithm modeled after the incremental garbage collector of Baker [2]. A companion paper [9] discusses language design issues relating to Multilisp.
Matt Kaufmann and J Strother Moore. Miscellaneous remarks about guards. On the Web, April 2006. http://www.cs.utexas.edu/users/moore/acl2/v2-9/GUARD-MISCELLANY.html.