Article

Abstract

We present an extensive, annotated bibliography of the abstract machines designed for each of the main programming paradigms (imperative, object-oriented, functional, logic, and concurrent). We conclude that whilst a large number of efficient abstract machines have been designed for particular language implementations, relatively little work has been done to design abstract machines in a systematic fashion.




... Today nearly all microprocessor systems are based on the von Neumann architecture. Abstract machines can be used to execute programs written in programming languages that do not fit the von Neumann architecture well [19]. An abstract machine usually includes a stack and registers, similar to a microprocessor. ...
... An abstract machine usually includes a stack and registers, similar to a microprocessor. It permits step-by-step execution of a program [19]. A program is a sequence of instructions taken from the instruction set of the abstract machine. ...
... If there is no further instruction to execute, the current program terminates. This basic control mechanism is known as the program loop [19]. ...
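The basic control mechanism described in these excerpts can be made concrete in a few lines of Python. The three-instruction set below is hypothetical, chosen only to show the stack, the instruction sequence, and the program loop that terminates when no instruction remains:

```python
# A minimal sketch of an abstract machine: a stack, a program counter,
# and the program loop that fetches and executes instructions until
# none remain. The instruction set (PUSH, ADD, MUL) is invented for
# illustration only.

def run(program):
    """Execute a list of (opcode, operand) instructions on a stack."""
    stack, pc = [], 0
    while pc < len(program):      # program loop: stop when no instruction is left
        op, arg = program[pc]
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        pc += 1                   # step-by-step execution
    return stack[-1]

# (2 + 3) * 4
prog = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
```

Running `run(prog)` steps through the five instructions and leaves the result on top of the stack.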
Thesis
Full-text available
In Named Function Networking (NFN), the λ-calculus [1] enables a programmatic way of interacting with an information-centric network (ICN) as a single computer. To enable the ICN to understand and handle expressions written in the λ-calculus, which are encoded in names, NFN proposes the integration of an abstract machine that couples β-reduction with ICN forwarding primitives and uses ICN as a memory substrate [2]. The goals of this project are, first, to integrate a Call-By-Name Abstract Machine [3] into the existing CCN-lite implementation and, second, to couple its operation with a forwarding strategy, thereby combining resolution and distribution of λ-expressions across the network. In the end, an NFN-capable ICN network will be able to evaluate λ-functions and optimize the location of the evaluation in the network. Extensions for parallel computing and better load distribution inside the network are developed to provide a stable and fast execution process for computations. For testing, native and Omnet experiments are conducted. The functionality of the NFN is proven in different test scenarios.
... Abstract machine implementations exist for a variety of programming languages including functional, logic, imperative and object-oriented languages. Diehl, Hartel and Sestoft provide an extensive overview of existing abstract machines for different paradigms [65]. In addition, abstract machines provide a convenient format for reasoning about advanced programming language concepts such as control operators and local state, where an abstract machine's explicit representation of control contexts is useful [23, 27-29, 69-72, 129]. ...
... Abstract machines have mostly been designed in an ad hoc manner based on the experience of the designer [65]. Since abstract machines are mostly constructed in an ad hoc manner, the correctness of each abstract machine has to be established separately after the construction [19,67,93]. ...
... Designing abstract machines is a favorite among functional programmers [65]. On the one hand, few abstract machines are actually derived with meaning-preserving steps, and on the other hand, few abstract machines are invented from scratch. ...
... - Using the case of automatic memory management, we illustrate that the choice for a particular runtime has a severe impact on the structure of the evaluator (Section 2). - We introduce the notion of a generic abstract machine (Section 3.2), an abstract machine [5] that anticipates the supporting runtime without committing to any details using generic programming techniques. A generic abstract machine corresponds to the evaluator in an intermediate form called defunctionalized monadic style. ...
... An evaluator expressed in defunctionalized monadic style has the structure of an abstract machine [5], and is better suited for composition with a supporting runtime. Using Game, abstract machines do not have to be developed by hand. ...
Conference Paper
Full-text available
Separation of concerns is difficult to achieve in the implementation of a programming language interpreter. We argue that evaluator concerns (i.e., those implementing the operational semantics of the language) are, in particular, difficult to separate from the runtime concerns (e.g., memory and stack management) that support them. This precludes the former from being reused and limits variability in the latter. In this paper, we present the Game environment for composing customized interpreters from a reusable evaluator and different variants of its supporting runtime. To this end, Game offers a language for specifying the evaluator according to the generic programming methodology. Through a transformation into defunctionalized monadic style, the Game toolchain generates a generic abstract machine in which the sequencing of low-level interpretational steps is parameterized. Given a suitable instantiation of these parameters for a particular runtime, the toolchain is able to inject the runtime into the generic abstract machine such that a complete interpreter is generated. To validate our approach, we port the prototypical Scheme evaluator to Game and compose the resulting generic abstract machine with several runtimes that vary in their automatic memory management as well as their stack discipline.
... Among them, the widely known asymptotic analysis of worst-case complexity [1,2] and the random access machine abstract model of computation are indeed relevant. Generally speaking, the latter is a rather simple theoretical model which helps to understand how an algorithm performs on actual machines; although alternative theoretical models exist [25,26], the random access machine model strikes a fine balance between capturing the essential behavior of the computer and being simple to work with. In addition, it has proven useful in practice: under this model, the algorithmic analysis of efficiency seldom produces substantially misleading results. ...
... To be consistent with the theoretical model described above, this comparison should ideally be based on metrics that are independent of specific languages or machine implementations [25,26]. The chosen metrics capture differences in the theoretical efficiency of the algorithms; some of them are space metrics, while the remaining ones are time metrics: ...
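A machine-independent metric of the kind described can be obtained by instrumenting an algorithm to count its abstract operations, in the spirit of the RAM model. An illustrative Python sketch counting comparisons in insertion sort (the choice of algorithm and metric is ours, for illustration only):

```python
# Counting abstract operations (here, key comparisons) instead of
# wall-clock time gives a cost measure independent of the language
# and machine that runs the code.

def insertion_sort(a):
    """Return (sorted copy, number of comparisons performed)."""
    a = list(a)
    comparisons = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            comparisons += 1              # one abstract comparison step
            if a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]
                j -= 1
            else:
                break
    return a, comparisons
```

The comparison count depends only on the input, not on the hardware, so two implementations of the same algorithm report the same cost.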
Article
Full-text available
Median filtering (MF) is a canonical image processing operation that is truly useful in many practical applications. The most appealing feature of MF is its resistance to noise and errors in data, but because the method requires window values to be sorted it is computationally expensive. In this work, a new insight into MF capabilities based on the optimal breakdown value (BV) of the median is offered, and it is also shown that the BV-based versions of two of the most popular MF algorithms outperform their corresponding standard versions. A general framework for both the theoretical analysis and comparison of MF algorithms is presented in the process, which will hopefully contribute to a better understanding of the many subtle features of MF. The introduced ideas are experimentally tested by using real and synthetic images. FULL TEXT ONLINE, http://www.readcube.com/articles/10.1007/s10851-016-0694-0
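For orientation, the standard (non-BV) version of the method is easy to state: each output sample is the median of a sorted window around the input sample, and the per-window sort is what makes it expensive. A 1-D baseline sketch in Python (illustrative only; the paper's BV-based variants are not reproduced here):

```python
# Plain sliding-window median filter for a 1-D signal. The per-window
# sort is the cost the abstract says makes MF expensive; the median is
# what makes it robust to impulsive noise.

def median_filter(signal, width=3):
    assert width % 2 == 1, "window width must be odd"
    half = width // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        window = sorted(signal[lo:hi])   # sorting dominates the running time
        out.append(window[len(window) // 2])
    return out
```

An isolated spike is completely removed because it can never be the median of its window.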
... Given the abundance of abstract machines [77], relatively little effort has been put into designing machines in a systematic manner and connecting them with other semantic descriptions. The benefits of such a systematic method are not to be underestimated: the correctness of each new abstract machine no longer needs to be proved from scratch, but follows from the correctness of the method itself; moreover, the connection between various language features and their efficient machine realization facilitates modular machine design and extensibility. ...
... This distinction is not widely recognized in the literature: some authors use the two terms interchangeably, and some refer to implementations of abstract machines as virtual machines[77]. We find it, however, convenient. ...
... Since Landin's SECD machine [155], abstract machines have been ubiquitous in the literature on programming language semantics. Not only have they proven useful in prototyping implementations of programming languages [6,79,115,200], but they also facilitate studies of semantic [34-37, 85, 88, 89, 116, 118, 155, 185, 186] and logical [49,51,53,149,151,226] aspects of programming languages. (We distinguish between 'abstract' machines that operate directly on a source language and 'virtual' machines that operate on compiled code [4,5].) ...
... Designing abstract machines is a favorite among functional programmers [79]. Unsurprisingly, this is also the case among logic programmers, for example, with Warren's abstract machine [8], which incidentally is more of a device for high performance than a model of computation. ...
... Diehl, Hartel, and Sestoft's overview of abstract machines for programming-language implementation [12] concluded on the need to develop a theory of abstract machines. In previous work [2,5,8], we have attempted to contribute to this theory by identifying a correspondence between interpreters (i.e., evaluation functions in the sense of denotational semantics) and abstract machines (i.e., transition functions in the sense of operational semantics). ...
... For several decades abstract machines have been an active area of research, ranging from Landin's classical SECD machine [20] to the modern JVM [21]. As observed by Diehl, Hartel, and Sestoft [12], research on abstract machines has chiefly focused on developing new machines and proving them correct. The thrust of our work is a correspondence between interpreters and abstract machines [2,8]. ...
Article
We extend our correspondence between evaluators and abstract machines from the pure setting of the λ-calculus to the impure setting of the computational λ-calculus. We show how to derive new abstract machines from monadic evaluators for the computational λ-calculus. Starting from (1) a generic evaluator parameterized by a monad and (2) a monad specifying a computational effect, we inline the components of the monad in the generic evaluator to obtain an evaluator written in a style that is specific to this computational effect. We then derive the corresponding abstract machine by closure-converting, CPS-transforming, and defunctionalizing this specific evaluator. We illustrate the construction with the identity monad, obtaining the CEK machine, and with a lifted state monad, obtaining a variant of the CEK machine with error and state. In addition, we characterize the tail-recursive stack inspection presented by Clements and Felleisen as a lifted state monad. This enables us to combine this stack-inspection monad with other monads and to construct abstract machines for languages with properly tail-recursive stack inspection and other computational effects. The construction scales to other monads—including one more properly dedicated to stack inspection than the lifted state monad—and other monadic evaluators.
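The CEK machine named in this abstract is small enough to sketch directly. The following Python sketch (terms, closures, and defunctionalized continuations as tagged tuples; an illustrative call-by-value machine, not the paper's derivation) evaluates pure λ-terms:

```python
# CEK machine for the call-by-value λ-calculus. States are either
# ("ev", term, env, kont), which evaluates a term, or ("ko", kont, value),
# which sends a value to the current continuation. Continuations are
# defunctionalized: ("halt",), ("arg", argterm, env, k), ("fun", closure, k).

def cek(term):
    state = ("ev", term, {}, ("halt",))
    while True:
        if state[0] == "ev":
            _, c, e, k = state
            if c[0] == "var":
                state = ("ko", k, e[c[1]])             # look up the value
            elif c[0] == "lam":
                state = ("ko", k, ("clo", c, e))       # build a closure
            else:                                      # ("app", f, a)
                state = ("ev", c[1], e, ("arg", c[2], e, k))
        else:
            _, k, v = state
            if k[0] == "halt":
                return v
            if k[0] == "arg":                          # function value arrived
                _, a, e, k2 = k
                state = ("ev", a, e, ("fun", v, k2))
            else:                                      # ("fun", clo, k2): apply
                _, (_, (_, x, body), e), k2 = k
                state = ("ev", body, {**e, x: v}, k2)

# (λx.x) (λy.y) evaluates to the closure of λy.y
ident = ("lam", "y", ("var", "y"))
prog = ("app", ("lam", "x", ("var", "x")), ident)
```

Each transition inspects only the top of the state, which is what makes the machine a direct implementation of small-step operational semantics.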
... Compiling languages to the intermediate code of a virtual machine offers many benefits such as platform neutrality, compiler simplification, application distribution, direct support of high-level paradigms and application interoperability [37]. In addition, compiling languages to a virtual machine with a lower abstraction level improves runtime performance in comparison with direct interpretation of programs. ...
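The compilation pipeline the excerpt alludes to can be sketched end to end for a toy expression language: a compiler emits intermediate code for a tiny stack-based virtual machine, which then executes it. Names and the instruction set below are invented for illustration:

```python
# Compiling expressions to the intermediate code of a toy stack-based
# virtual machine, then interpreting that code. Expressions are nested
# tuples: ("num", n), ("add", l, r), ("mul", l, r).

def compile_expr(e):
    if e[0] == "num":
        return [("PUSH", e[1])]
    op = {"add": "ADD", "mul": "MUL"}[e[0]]
    # post-order traversal: operands first, then the operator
    return compile_expr(e[1]) + compile_expr(e[2]) + [(op, None)]

def execute(code):
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if op == "ADD" else a * b)
    return stack[-1]

# (2 + 3) * 4, compiled once and run by the VM
expr = ("mul", ("add", ("num", 2), ("num", 3)), ("num", 4))
```

The intermediate code is platform-neutral: any machine with an implementation of `execute` runs the same compiled program.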
Article
Increasing trends towards adaptive, distributed, generative and pervasive software have made object-oriented dynamically typed languages become increasingly popular. These languages offer dynamic software evolution by means of reflection, facilitating the development of dynamic systems. Unfortunately, this dynamism commonly imposes a runtime performance penalty. In this paper, we describe how to extend a production JIT-compiler virtual machine to support runtime object-oriented structural reflection offered by many dynamic languages. Our approach improves runtime performance of dynamic languages running on statically typed virtual machines. At the same time, existing statically typed languages are still supported by the virtual machine. We have extended the .Net platform with runtime structural reflection, adding prototype-based object-oriented semantics to the statically typed class-based model of .Net, supporting both kinds of programming languages. The assessment of runtime performance and memory consumption has revealed that a direct support of structural reflection in a production JIT-based virtual machine designed for statically typed languages provides a significant performance improvement for dynamically typed languages.
... In order to overcome the runtime performance drawback of the previous approach, this work has focused on applying JIT compilation techniques to optimize the structural reflection primitives used in our dynamic AOSD system. Compiling languages to a virtual machine's intermediate code offers many benefits such as platform neutrality, compiler simplification, application distribution, direct support of high-level paradigms and application interoperability [8]. In addition, compiling languages to a virtual machine with a lower abstraction level improves runtime performance in comparison with direct interpretation of programs. ...
Article
Full-text available
This project is aimed at investigating how different reflective technologies can be used to develop a dynamically weaving aspect-oriented computing system, without any dependency on a concrete programming language, built over a heterogeneous computing platform. An abstract machine with a reduced instruction set has been employed as the root computation system's engine; it offers the programmer basic reflective computation primitives. Its reduced size and its introspective capabilities make it easy to deploy in heterogeneous computational systems, becoming a platform-independent computational system. By using the reflective features offered by the abstract machine, running applications can be adapted and extended. This can be programmed in the machine's own language, without needing to modify the virtual machine's source code and, therefore, without losing code portability. As an example of its extensibility, new software aspects can be programmed, providing facilities such as persistence, distribution, logging or tracing. All these new abstractions are adaptable at runtime to any application. By using a reflective language-neutral computing platform, a framework has been created to develop dynamic-weaving aspect scenarios. Without needing to modify an application's functional code, new crosscutting concerns can be woven into any program at runtime. Following the same scheme, when these dynamic and language-neutral aspects are no longer needed, they can be removed at runtime programmatically, i.e., not only a human but another program, or even the application itself, can adapt an application at runtime.
... For a survey of different abstract machines see [Diehl et al. 2000]. ...
... The employment of specific abstract machines implemented by different virtual machines has brought many benefits to different computing systems. The most relevant are platform neutrality, compiler simplification, application distribution, direct support of high-level paradigms and application interoperability [4]. These benefits of using abstract machines were first employed in an exclusive way. ...
Article
The concepts of abstract and virtual machines have been used for many different purposes to obtain diverse benefits such as code portability, compiler simplification, interoperability, distribution and direct support of specific paradigms. Despite these benefits, the main drawback of virtual machines has always been execution performance. Consequently, there has been considerable research aimed at improving the performance of applications executed on virtual machines compared to their native counterparts. Techniques like adaptive Just-In-Time compilation or efficient and complex garbage collection algorithms have reached such a point that Microsoft and Sun Microsystems identify this kind of platform as appropriate for implementing commercial applications. What we have noticed in our research work is that these platforms have heterogeneity, extensibility, platform-porting and adaptability limitations caused by their monolithic designs. Most designs of common abstract machines are focused on supporting a fixed programming language, and the computation model they offer is set to the one employed by the specific language. We have identified reflection as a basis for designing an abstract machine capable of overcoming the previously mentioned limitations. Reflection is a mechanism that gives our platform the capability to adapt the abstract machine to different computation models and heterogeneous computing environments, without needing to modify its implementation. In this paper we present the reflective design of our abstract machine, example code extending the platform, a reference implementation, and a comparison between our implementation and other well-known platforms.
... Further relevant work (although not dealing with abstract machines) is developed by Wadler et al. [24,22] and Bierman et al. [5] among others. An annotated bibliography on abstract machines is provided by Diehl et al. [8]. ...
Article
Full-text available
We derive an abstract machine from the Curry-Howard correspondence with a sequent calculus presentation of Intuitionistic Propositional Linear Logic. The states of the register based abstract machine comprise a low-level code block, a register bank and a dump holding suspended procedure activations. Transformation of natural deduction proofs into our sequent calculus yields a type-preserving compilation function from the Linear Lambda Calculus to the abstract machine. We prove correctness of the abstract machine with respect to the standard call-by-value evaluation semantics of the Linear Lambda Calculus.
... Compiling languages to the intermediate code of a virtual machine offers many benefits such as platform neutrality, compiler simplification, application distribution, direct support of high-level paradigms and application interoperability [37]. In addition, compiling languages to a virtual machine with a lower abstraction level improves runtime performance in comparison with direct interpretation of programs. ...
Article
Increasing trends towards adaptive, distributed, generative and pervasive software have made object-oriented dynamically typed languages become increasingly popular. These languages offer dynamic software evolution by means of reflection, facilitating the development of dynamic systems. Unfortunately, this dynamism commonly imposes a runtime performance penalty. In this paper, we describe how to extend a production JIT-compiler virtual machine to support runtime object-oriented structural reflection offered by many dynamic languages. Our approach improves runtime performance of dynamic languages running on statically typed virtual machines. At the same time, existing statically typed languages are still supported by the virtual machine. We have extended the .Net platform with runtime structural reflection, adding prototype-based object-oriented semantics to the statically typed class-based model of .Net, supporting both kinds of programming languages. The assessment of runtime performance and memory consumption has revealed that a direct support of structural reflection in a production JIT-based virtual machine designed for statically typed languages provides a significant performance improvement for dynamically typed languages.
... Abstract machines (ABM) are mostly used for compilation [11], but we proposed an all-hardware ABM approach. The ABM-based DBT technique (ADBT) is shown in Figure 1. ...
Article
Binary translation is a migration technique that allows software to run on other machines while achieving near-native code performance. This paper proposes an abstract machine-based dynamic translation technique for Java processors. The technique employs the "mock execution" of the hardware abstract machine (HAM) to identify and analyze the dependencies among Java programs, and dynamically translates Java bytecode into tag-based RISC-like instructions. After that, stack folding is combined with the technique to further optimize the translated instructions. We used the technique to realize a Java ILP processor. To further demonstrate the technique's applicability, we extended the Java processor to design a multithreading Java processor, and explain some of its new features.
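Stack folding, the optimization combined with the translation here, can be illustrated in a few lines: the translator simulates the operand stack at translation time and fuses push/pop traffic into three-address, register-style instructions. A hypothetical Python sketch (the opcodes are invented, not actual Java bytecodes):

```python
# Stack folding: stack bytecodes are folded into three-address
# instructions by tracking a virtual operand stack during translation.
# Pushes and loads only record operands; each binary operation pops two
# tracked operands and emits one register-style instruction.

def fold(bytecode):
    virtual_stack, out, temp = [], [], 0
    for ins in bytecode:
        if ins[0] == "push":                  # constant: tracked, not emitted
            virtual_stack.append(str(ins[1]))
        elif ins[0] == "load":                # local variable: tracked
            virtual_stack.append(ins[1])
        else:                                 # binary op: pop two, emit one
            b, a = virtual_stack.pop(), virtual_stack.pop()
            dst = f"t{temp}"
            temp += 1
            out.append((ins[0], dst, a, b))
            virtual_stack.append(dst)         # result feeds later instructions
    return out

# load a; load b; add; push 4; mul  ->  add t0,a,b ; mul t1,t0,4
code = [("load", "a"), ("load", "b"), ("add",), ("push", 4), ("mul",)]
```

Five stack operations fold into two instructions, which is the kind of reduction that enables instruction-level parallelism in the processor described.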
... Abstract machines are often used as an idealized model of execution; they are in general simpler than a real machine since they lack certain hardware details that would otherwise complicate the reasoning and the analysis of its behaviour. They are therefore suitable as an intermediate language for compilation [10]. We proceed with the definition of the i : |π components of our machine. ...
Article
Full-text available
In this paper we prove the correctness of a compiler for a call-by-name language using step-indexed logical relations and biorthogonality. The source language is an extension of the simply typed lambda-calculus with recursion, and the target language is an extension of the Krivine abstract machine. We formalized the proof in the Coq proof assistant.
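For reference, the Krivine machine extended in this work reduces, in its pure form, to three transitions. A minimal Python sketch (call by name, de Bruijn indices, tagged tuples; an illustration rather than the paper's formalization):

```python
# Krivine machine for the call-by-name λ-calculus with de Bruijn
# indices. The state is a term, its environment (a list of closures),
# and a stack of unevaluated argument closures.

def krivine(term):
    env, stack = [], []
    while True:
        if term[0] == "app":              # push the argument, unevaluated
            stack.append((term[2], env))
            term = term[1]
        elif term[0] == "lam" and stack:  # bind one argument closure
            env = [stack.pop()] + env
            term = term[1]
        elif term[0] == "var":            # enter the closure, by name
            term, env = env[term[1]]
        else:                             # λ-abstraction, no arguments left
            return term, env

# K applied once: (λ.λ. var 1) id  ->  λ. var 1, closed over id
ident = ("lam", ("var", 0))
kcomb = ("lam", ("lam", ("var", 1)))
```

Arguments are pushed without being evaluated, which is exactly the call-by-name discipline of the source language in the abstract.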
... The foremost innovation of memory-centric programming is the use of a formal specification for representing computer system architecture including the memory systems across the programming software stack. An abstract machine model is one such option that provides conceptual models for hardware design, performance modeling, and compiler implementation [4,5,13,15]; however, the model needs to be in a formal specification for parallel programming. For the hardware memory hierarchy, a tree-based abstract machine model is a natural choice. ...
Conference Paper
The memory wall challenge -- the growing disparity between CPU speed and memory speed -- has been one of the most critical and long-standing challenges in computing. For high performance computing, programming to achieve efficient execution of parallel applications often requires more tuning and optimization efforts to improve data and memory access than for managing parallelism. The situation is further complicated by the recent expansion of the memory hierarchy, which is becoming deeper and more diversified with the adoption of new memory technologies and architectures such as 3D-stacked memory, non-volatile random-access memory (NVRAM), and hybrid software and hardware caches. The authors believe it is important to elevate the notion of memory-centric programming, with relevance to the compute-centric or data-centric programming paradigms, to utilize the unprecedented and ever-elevating modern memory systems. Memory-centric programming refers to the notion and techniques of exposing the hardware memory system and its hierarchy, which could include DRAM and NUMA regions, shared and private caches, scratch pad, 3-D stacked memory, non-volatile memory, and remote memory, to the programmer via portable programming abstractions and APIs. These interfaces seek to improve the dialogue between programmers and system software, and to enable compiler optimizations, runtime adaptation, and hardware reconfiguration with regard to data movement, beyond what can be achieved using existing parallel programming APIs. In this paper, we provide an overview of memory-centric programming concepts and principles for high performance computing.
... There, our matrix transposition transformation can be employed for whole-program optimization (such as [6]), as follows. An opportunity for optimization presents itself when the compiler is able to recognize an abstract machine in the code; optimizing this abstract machine is then an intermediate step, more generally applicable, that precedes hardware-specific optimizations [18]. As outlined above, defunctionalization can turn higher-order programs into first-order programs where this machine might be apparent. ...
Chapter
Full-text available
We characterize the relation between generalized algebraic datatypes (GADTs) with pattern matching on their constructors on the one hand, and generalized algebraic co-datatypes (GAcoDTs) with copattern matching on their destructors on the other hand: GADTs can be converted mechanically to GAcoDTs by refunctionalization, GAcoDTs can be converted mechanically to GADTs by defunctionalization, and both defunctionalization and refunctionalization correspond to a transposition of the matrix in which the equations for each constructor/destructor pair of the (co-)datatype are organized. We have defined a calculus, \(GADT^T\), which unifies GADTs and GAcoDTs in such a way that GADTs and GAcoDTs are merely different ways to partition the program.
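Defunctionalization, one direction of the correspondence described here, can be illustrated on a toy program: every first-class function is replaced by a tagged first-order value, and a single `apply` function dispatches on the tag (the constructor side of the matrix). A Python sketch with invented names, purely for illustration:

```python
# Higher-order original: functions are passed and returned directly.
def compose_ho(f, g):
    return lambda x: f(g(x))

# Defunctionalized version: one tagged constructor per λ in the
# program, and one first-order `apply` that pattern-matches on the tag.
def apply(fn, x):
    tag = fn[0]
    if tag == "inc":
        return x + 1
    if tag == "dbl":
        return x * 2
    if tag == "compose":          # closure-converted: free variables stored
        _, f, g = fn
        return apply(f, apply(g, x))

inc, dbl = ("inc",), ("dbl",)
```

Refunctionalization is the inverse move: each `apply` branch becomes a λ again, which is the matrix transposition the abstract describes.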
... Designing abstract machines is a favorite among functional programmers [14]. On the one hand, few abstract machines are actually derived with meaningpreserving steps, and on the other hand, few abstract machines are invented from scratch. ...
Article
We bridge the gap between compositional evaluators and abstract machines for the lambda-calculus, using closure conversion, transformation into continuation-passing style, and defunctionalization of continuations. This article is a follow-up of our article at PPDP 2003, where we consider call by name and call by value. Here, however, we consider call by need. We derive a lazy abstract machine from an ordinary call-by-need evaluator that threads a heap of updatable cells. In this resulting abstract machine, the continuation fragment for updating a heap cell naturally appears as an ‘update marker’, an implementation technique that was invented for the Three Instruction Machine and subsequently used to construct lazy variants of Krivine's abstract machine. Tuning the evaluator leads to other implementation techniques such as unboxed values. The correctness of the resulting abstract machines is a corollary of the correctness of the original evaluators and of the program transformations used in the derivation.
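The heap of updatable cells that drives call by need can be illustrated independently of the machine: a suspension is stored in a cell, and the first demand overwrites the cell with its value, which is the bookkeeping the 'update markers' perform in the derived machine. A minimal Python sketch (names invented for illustration):

```python
# Call-by-need in miniature: a thunk is an unevaluated suspension in a
# heap cell; forcing it updates the cell in place so the computation
# runs at most once, however many times the value is demanded.

class Cell:
    def __init__(self, suspension):
        self.suspension = suspension   # a zero-argument function
        self.value = None
        self.evaluated = False

def force(cell):
    if not cell.evaluated:             # first demand: compute and update
        cell.value = cell.suspension()
        cell.evaluated = True
        cell.suspension = None         # the cell now holds a plain value
    return cell.value

counter = {"runs": 0}

def expensive():
    counter["runs"] += 1               # observe how often the work is done
    return 6 * 7

cell = Cell(expensive)
```

Forcing `cell` repeatedly returns the same value while running `expensive` only once, which distinguishes call by need from call by name.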
... Since "The Mechanical Evaluation of Expressions," many other abstract machines for the λ-calculus have been invented, discovered, or derived [20]. In fact, the literature simply abounds with derivations of abstract machines, though with one remarkable exception: there is no derivation of Landin's original SECD machine, even though it was the first such abstract machine. ...
Conference Paper
Full-text available
Landin’s SECD machine was the first abstract machine for the λ-calculus viewed as a programming language. Both theoretically as a model of computation and practically as an idealized implementation, it has set the tone for the subsequent development of abstract machines for functional programming languages. However, and even though variants of the SECD machine have been presented, derived, and invented, the precise rationale for its architecture and modus operandi has remained elusive. In this article, we deconstruct the SECD machine into a λ-interpreter, i.e., an evaluation function, and we reconstruct λ-interpreters into a variety of SECD-like machines. The deconstruction and reconstructions are transformational: they are based on equational reasoning and on a combination of simple program transformations—mainly closure conversion, transformation into continuation-passing style, and defunctionalization. The evaluation function underlying the SECD machine provides a precise rationale for its architecture: it is an environment-based eval-apply evaluator with a callee-save strategy for the environment, a data stack of intermediate results, and a control delimiter. Each of the components of the SECD machine (stack, environment, control, and dump) is therefore rationalized and so are its transitions. The deconstruction and reconstruction method also applies to other abstract machines and other evaluation functions.
... Abstract machines have been widely used in the implementation of programming languages [7]. Most of them have been invented from scratch and subsequently been proved to correctly implement the specification of a programming language [11]. ...
Conference Paper
We describe how to construct correct abstract machines from the class of L-attributed natural semantics introduced by Ibraheem and Schmidt at HOOTS 1997. The construction produces stack-based abstract machines where the stack contains evaluation contexts. It is defined directly on the natural semantics rules. We formalize it as an extraction algorithm and we prove that the algorithm produces abstract machines that are equivalent to the original natural semantics. We illustrate the algorithm by extracting abstract machines from natural semantics for call-by-value and call-by-name evaluation of lambda terms.
... We answer these questions with the key idea of a Domain Specific Machine (DSM). Although it is inspired by the concept of abstract machines, commonly used in automata theory in computer science [5], it serves a different purpose. Turing machines and state machines are examples of abstract machines that serve as foundations for analyzing computer programs. ...
Preprint
Domain Specific Modeling Languages (DSMLs) significantly improve productivity in designing Computer Based Systems (CBSs) by enabling them to be modeled at higher levels of abstraction. It is common for large and complex systems with distributed teams to use DSMLs to express and communicate designs of such systems uniformly, using a common language. DSMLs enable domain experts, with no or minimal software development background, to model solutions using the language and terminologies of their respective domains. Although there are already a number of DSMLs available for modeling CBSs, their need is felt strongly across multiple domains that are still not well supported with DSMLs. Developing a new DSML, however, is nontrivial, as it requires (a) significant knowledge about the domain for which the DSML needs to be developed, as well as (b) skills to create new languages. In the current practice, DSMLs are developed by experts who have substantial understanding of the domain of interest and a strong background in computer science. One of the many challenges in the development of DSMLs is the collection of domain knowledge and its utilization, based on which the abstract syntax, the backbone of the DSML, is defined. There is a clear gap in the current state of the art and practice with respect to overcoming this challenge. We propose a methodology which makes it easier for people with different backgrounds, such as domain experts and solution architects, to contribute towards defining the abstract syntax of the DSML. The methodology outlines a set of steps to systematically capture knowledge about the domain of interest, and use that to arrive at the abstract syntax of the DSML. The key contribution of our work is in abstracting a CBS from a domain into a Domain Specific Machine, embodied in domain-specific concepts.
The methodology outlines how the Domain Specific Machine, when coupled with guidelines from current practices of developing DSMLs, results in the definition of the abstract syntax of the intended DSML. We discuss our methodology in detail in this paper. CCS Concepts: • Software and its engineering → Domain specific languages.
... The virtual machine approach then consists in following the compilation process of a program up to the production of a common, non-native representation of programs: the bytecode. It is then left to a virtual machine, that is, a bytecode interpreter coupled with a runtime environment, to execute the program in this intermediate form [DS00]. ...
Thesis
Microcontrollers are programmable integrated circuits embedded in many everyday objects. Because of their limited resources, they are often programmed in low-level languages such as C, or in assembly language. These do not offer the same abstractions and guarantees as high-level languages such as OCaml. This thesis therefore proposes a set of solutions intended to enrich microcontroller programming with higher-level programming paradigms. These solutions provide a progressive rise in abstraction, in particular making it possible to write programs that are independent of the hardware used. We first present a hardware abstraction in the form of an OCaml virtual machine, which allows programmers to benefit from the language's many advantages while keeping a small memory footprint. We then extend OCaml with a synchronous programming model inspired by the Lustre language, abstracting the concurrent aspects of a program. A formal specification of the language is given, and several typing properties are then verified. The abstractions offered by our work also make certain static analyses of program bytecode portable. One such analysis, estimating the worst-case execution time of a synchronous program, is proposed. Together, the contributions of this thesis constitute a complete development chain, and several examples of concrete applications illustrate the completeness of the solutions offered.
... Designing abstract machines is a favorite among functional programmers [16]. Unsurprisingly, this is also the case among logic programmers, for example, with Warren's abstract machine [4], which incidentally is more of a device for high performance than a model of computation. ...
Conference Paper
Starting from a continuation-based interpreter for a simple logic programming language, propositional Prolog with cut, we derive the corresponding logic engine in the form of an abstract machine. The derivation originates in previous work (our article at PPDP 2003) where it was applied to the lambda-calculus. The key transformation here is Reynolds's defunctionalization, which transforms a tail-recursive, continuation-passing interpreter into a transition system, i.e., an abstract machine. Similar denotational and operational semantics were studied by de Bruin and de Vink (their article at TAPSOFT 1989), and we compare their study with our derivation. Additionally, we present a direct-style interpreter of propositional Prolog expressed with control operators for delimited continuations.
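Reynolds's defunctionalization, the key transformation named above, replaces first-class continuations with a first-order data type plus an `apply` function that interprets its constructors. A minimal sketch on a factorial evaluator (the names and the example program are ours, not from the paper):

```python
# Before: continuation-passing style, with continuations as closures.
def fact_cps(n, k):
    if n == 0:
        return k(1)
    return fact_cps(n - 1, lambda v: k(n * v))

# After defunctionalization: each lambda becomes a tagged constructor,
# and 'apply_k' interprets the constructors. The result is a first-order
# transition system, i.e. an abstract machine.
def fact_def(n, k):
    while True:
        if n == 0:
            return apply_k(k, 1)
        k, n = ("MUL", n, k), n - 1      # constructor replaces the lambda

def apply_k(k, v):
    while k != ("ID",):                  # ("ID",) is the initial continuation
        _, n, k = k
        v = n * v
    return v
```

Both versions compute the same function; the defunctionalized one needs no higher-order values, which is exactly what makes it an abstract machine rather than an interpreter.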
... Operational semantics specifies programming languages in terms of program execution on abstract machines which provide an intermediate language stage for compilation. They bridge the gap between the high level of a programming language and the low level of a real machine [4]. Structural operational semantics represents computation by means of deductive systems that turn the abstract machine into a system of logical inferences [19]. ...
Conference Paper
This paper presents a step forward on a research trend focused on increasing runtime adaptability of commercial JIT-based virtual machines, describing how to include dynamic inheritance into this kind of platforms. A considerable amount of research aimed at improving runtime performance of virtual machines has converted them into the ideal support for developing different types of software products. Current virtual machines do not only provide benefits such as application interoperability, distribution and code portability, but they also offer a competitive runtime performance. Since JIT compilation has played a very important role in improving runtime performance of virtual machines, we first extended a production JIT-based virtual machine to support efficient language-neutral structural reflective primitives of dynamically typed programming languages. This article presents the next step in our research work: supporting language-neutral dynamic inheritance for both statically and dynamically typed programming languages. Executing both kinds of programming languages over the same platform provides a direct interoperation between them.
Article
One of the most attractive features of functional languages is that functions are first-class. To support first-class functions, many functional languages create heap-allocated closures to store the bindings of free variables. This makes it difficult to predict how the heap is accessed and makes accesses to free variables slower than accesses to bound variables at runtime. This article presents how support for first-class functions is provided in the MT Evaluator Virtual Machine without creating heap-allocated closures, by using partial evaluation at runtime.
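The heap-allocated-closure cost described here is easy to observe in Python, where every nested function capturing a free variable allocates closure cells on the heap:

```python
# A curried adder: 'make_adder' returns a function that captures the free
# variable 'n'. Python stores the captured binding in a heap-allocated cell,
# reachable through __closure__ -- precisely the indirection that the MT
# Evaluator's runtime partial evaluation is designed to avoid.
def make_adder(n):
    def add(x):          # 'n' is free in 'add', 'x' is bound
        return x + n
    return add

add5 = make_adder(5)
```

A free-variable access here goes through the cell (`__closure__[0].cell_contents`), one level of indirection more than an access to the bound variable `x`.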
Article
Abstract machines have been widely employed in computing systems in order to achieve different aims. Compiler simplification, platform neutrality, code distribution, interoperability, and direct support for specific paradigms are examples of the benefits they offer. Although performance has been their main drawback, the use of modern techniques such as adaptive (hotspot) just-in-time compilation has overcome this weakness. Nowadays, well-known platforms based on abstract machines such as Java™ or Microsoft .NET are in commercial use. With the purpose of supporting any programming-language computational model in heterogeneous environments, we have noticed that most abstract machines lack extensibility, adaptability, and support for heterogeneity. We have designed an abstract machine that, using reflection as its main design principle, overcomes these limitations. In this paper, we describe its objectives, its design, a sample implementation, and a comparison with other similar platforms.
Conference Paper
Abstract machines provide a certain separation between platform-dependent and platform-independent concerns in compilation. Many of the differences between architectures are encapsulated in the specific abstract machine implementation and the bytecode is left largely architecture independent. Taking advantage of this fact, we present a framework for estimating upper and lower bounds on the execution times of logic programs running on a bytecode-based abstract machine. Our approach includes a one-time, program-independent profiling stage which calculates constants or functions bounding the execution time of each abstract machine instruction. Then, a compile-time cost estimation phase, using the instruction timing information, infers expressions giving platform-dependent upper and lower bounds on actual execution time as functions of input data sizes for each program. Working at the abstract machine level makes it possible to take into account low-level issues in new architectures and platforms by just reexecuting the calibration stage instead of having to tailor the analysis for each architecture and platform. Applications of such predicted execution times include debugging/verification of time properties, certification of time properties in mobile code, granularity control in parallel/distributed computing, and resource-oriented specialization
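The two-phase scheme described in this abstract, a one-time per-platform calibration followed by a symbolic compile-time phase, can be sketched as follows. The instruction names and unit costs are hypothetical stand-ins for measured timings, not figures from the paper.

```python
# Phase 1 (one-time, per platform): calibrate the cost of each abstract
# machine instruction. These numbers stand in for profiled timings (ns).
unit_cost = {"get_constant": 3, "unify": 10, "call": 7}   # hypothetical

# Phase 2 (per program, platform-independent analysis): instruction counts
# are expressed as functions of input size n; combining them with the
# calibrated costs yields an execution-time bound as a function of n.
def time_bound(counts, n):
    """counts maps instruction name -> function of n giving its count."""
    return sum(unit_cost[ins] * f(n) for ins, f in counts.items())

# e.g. a list-traversal predicate: one constant load, n unifications, n calls
traversal = {"get_constant": lambda n: 1,
             "unify": lambda n: n,
             "call": lambda n: n}
```

Porting the analysis to a new architecture then only means re-running phase 1 to refill `unit_cost`; the phase-2 expressions are unchanged.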
Article
Full-text available
We develop a compilation scheme and categorical abstract machine for execution of logic programs based on allegories, the categorical version of the calculus of relations. Operational and denotational semantics are developed using the same formalism, and query execution is performed using algebraic reasoning. Our work serves two purposes: achieving a formal model of a logic programming compiler and efficient runtime; building the base for incorporating features typical of functional programming in a declarative way, while maintaining 100% compatibility with existing Prolog programs.
Article
Abstract machines bridge the gap between the high level of programming languages and the low-level mechanisms of a real machine. This paper proposes a general abstract-machine-based framework (AMBF) for building instruction-level-parallelism processors using an instruction tagging technique. The constructed processor may accept code written in any (abstract or real) machine instruction set, and produces tagged machine code after data conflicts are resolved. This requires the construction of a tagging unit which emulates the sequential execution of the program using tags rather than actual values. The paper presents a Java ILP processor built using the proposed framework. The Java processor takes advantage of the tagging unit to dynamically translate Java bytecode instructions into RISC-like tag-based instructions, facilitating the use of a general-purpose RISC core and enabling the exploitation of instruction-level parallelism. We detail the Java ILP processor architecture and the design issues. Benchmarking of the Java processor using SpecJVM98 and Linpack shows overall ILP speedup improvements of between 78% and 173%.
Conference Paper
As scalable parallel systems evolve towards more complex nodes with many-core architectures and larger trans-petascale & upcoming exascale deployments, there is a need to understand, characterize and quantify the underlying execution models being used on such systems. Execution models are a conceptual layer between applications & algorithms and the underlying parallel hardware and systems software on which those applications run. This paper presents the SCaLeM (Synchronization, Concurrency, Locality, Memory) framework for characterizing execution models. SCaLeM consists of three basic elements: attributes, compositions and mapping of these compositions to abstract parallel systems. The fundamental Synchronization, Concurrency, Locality and Memory attributes are used to characterize each execution model, while the combinations of those attributes in the form of compositions are used to describe the primitive operations of the execution model. The mapping of the execution model's primitive operations described by compositions, to an underlying abstract parallel system can be evaluated quantitatively to determine its effectiveness. Finally, SCaLeM also enables the representation and analysis of applications in terms of execution models, for the purpose of evaluating the effectiveness of such mapping.
Article
We exploit the idea of proving properties of an abstract machine by using a corresponding semantic artefact better suited to their proof. The abstract machine is an improved version of Pierre Crégut’s full-reducing Krivine machine KN. The original version works with closed terms of the pure lambda calculus with de Bruijn indices. The improved version reduces in similar fashion but works on closures where terms may be open. The corresponding semantic artefact is a structural operational semantics of a calculus of closures whose reduction relation is purposely a reduction strategy. As shown in previous work, improved KN and the structural operational semantics ‘correspond’, i.e. both artefacts realise the same reduction strategy. In this paper, we prove in the calculus of closures that the reduction strategy simulates in lockstep (at every reduction step) the complete and standard normal-order strategy (i.e. leftmost reduction to normal form) of the pure lambda calculus. The simulation is witnessed by a substitution function from closures of the closure calculus to pure terms of the pure lambda calculus. Thus, KN also simulates normal-order in lockstep by the correspondence. This result is stronger than the known proof that KN is complete, for in the pure lambda calculus there are complete but non-standard strategies. The lockstep simulation proof consists of straightforward structural inductions, thanks to three properties of the closure calculus we call ‘index alignment’, ‘parameters-as-levels’ and ‘balanced derivations’. The first two come from KN. Thanks to these properties, a proof in a calculus of closures involving de Bruijn indices and de Bruijn levels is unproblematic. There is no lexical adjustment at binding lookup, on-the-fly alpha conversion or recursive traversals of the term to deal with bound and free variables as in other calculi.
This paper contributes to the framework for environment machines of Biernacka and Danvy a full- reducing open-terms closure calculus, its corresponding abstract machine, and a lockstep simulation proof via a substitution function.
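For context, the core of the plain call-by-name Krivine machine on de Bruijn-indexed closed terms is very compact; Crégut's KN discussed above extends this to full reduction on open terms. A sketch of the plain machine (term encoding is our own):

```python
# Plain Krivine machine: call-by-name weak-head reduction of closed de Bruijn
# terms. Terms: ("var", i) | ("lam", body) | ("app", f, a).
# A closure is a (term, env) pair; env is a tuple of closures; the stack
# holds argument closures. This is NOT Cregut's full-reducing KN.
def krivine(term, env=(), stack=()):
    while True:
        tag = term[0]
        if tag == "app":                    # push the argument as a closure
            stack = ((term[2], env),) + stack
            term = term[1]
        elif tag == "lam" and stack:        # pop one argument, bind it
            env = (stack[0],) + env
            term, stack = term[1], stack[1:]
        elif tag == "var":                  # enter the closure at index i
            term, env = env[term[1]]
        else:                               # lambda, empty stack: WHNF
            return term, env

# (lambda x. x) (lambda y. y), in de Bruijn notation
ident = ("lam", ("var", 0))
prog = ("app", ident, ident)
```

Each branch of the loop is one machine transition, which is what makes proofs by induction on transition sequences, like the lockstep simulation above, tractable.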
Chapter
Existing machines for lazy evaluation use a flat representation of environments, storing the terms associated with free variables in an array. Combined with a heap, this structure supports the shared intermediate results required by lazy evaluation. We propose and describe an alternative approach that uses a shared environment to minimize the overhead of delayed computations. We show how a shared environment can act as both an environment and a mechanism for sharing results. To formalize this approach, we introduce a calculus that makes the shared environment explicit, as well as a machine to implement the calculus, the Cactus Environment Machine. A simple compiler implements the machine and is used to run experiments for assessing performance. The results show reasonable performance and suggest that incorporating this approach into real-world compilers could yield performance benefits in some scenarios.
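The contrast drawn here, flat array environments versus a shared environment, comes down to a parent-linked chain of frames whose tails are physically shared between closures. A minimal sketch (names are ours):

```python
# A shared ("cactus") environment: each frame points at its parent, so two
# extensions of the same environment share the tail frames instead of each
# copying the bindings into a flat array.
class Frame:
    def __init__(self, name, value, parent=None):
        self.name, self.value, self.parent = name, value, parent

def lookup(env, name):
    """Walk the parent chain; cost is depth, but extension is O(1)."""
    while env is not None:
        if env.name == name:
            return env.value
        env = env.parent
    raise NameError(name)

base = Frame("x", 1)            # the shared tail
left = Frame("y", 2, base)      # two closures extend 'base' ...
right = Frame("y", 3, base)     # ... without copying it
```

The trade-off is the one the chapter evaluates: variable lookup becomes a chain walk rather than an array index, but environment extension allocates a single frame and sharing of delayed results falls out of the structure.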
Chapter
This chapter discusses some widely used real-world abstract machines.
Article
In previous work, we proposed a new approach to the problem of implementing compilers in a modular manner, by combining earlier work on the development of modular interpreters using monad transformers with the à la carte approach to modular syntax. In this article, we refine and extend our existing framework in a number of directions. In particular, we show how generalised algebraic datatypes can be used to support a more modular approach to typing individual language features, we increase the expressive power of the framework by considering mutable state, variable binding, and the issue of noncommutative effects, and we show how the Zinc Abstract Machine can be adapted to provide a modular universal target machine for our modular compilers.
Conference Paper
Perhaps the most popular approach to animating algorithms consists of identifying interesting events in the implementation code, corresponding to relevant actions in the underlying algorithm, and turning them into graphical events by inserting calls ...
Conference Paper
In this paper we present a compiler that translates programs from an imperative higher-order language into a sequence of instructions for an abstract machine. We consider an extension of the Krivine machine for the call-by-name lambda calculus, which includes strict operators and imperative features. We show that the compiler is correct with respect to the big-step semantics of our language, both for convergent and divergent programs.
Article
Dynamic languages are becoming widely used in software engineering due to the flexibility needs of specific software systems. Example scenarios include the development of dynamic aspect-oriented software, Web applications, adaptable and adaptive software, and application frameworks. One important shortcoming of these languages is the lack of the compile-time error detection offered by static languages. However, runtime performance is the most serious limitation to their use in commercial software development. Although JIT optimizing compilation is a widely used technique to speed up intermediate code execution, it has not yet been successfully applied to dynamically adaptive platforms. We present an approach to improve the structural reflective primitives offered by dynamic languages. Looking for a language-neutral platform with good JIT-based runtime performance, we have used the Microsoft shared source implementation of the CLI. Its model has been extended with the semantics of prototype-based object-oriented models, much more suitable than the class-based one for reflective environments. This augmented semantics, together with JIT generation of native code, has produced significantly better runtime performance than the existing implementations.
Conference Paper
We present the first operational account of call by need that connects syntactic theory and implementation practice. Syntactic theory: the storeless operational semantics using syntax rewriting to account for demand-driven computation and for caching intermediate results. Implementational practice: the store-based operational technique using memo-thunks to implement demand-driven computation and to cache intermediate results for subsequent sharing. The implementational practice was initiated by Landin and Wadsworth and is prevalent today to implement lazy programming languages such as Haskell. The syntactic theory was initiated by Ariola, Felleisen, Maraist, Odersky and Wadler and is prevalent today to reason equationally about lazy programs, on par with Barendregt et al.'s term graphs. Nobody knows, however, how the theory of call by need compares to the practice of call by need: all that is known is that the theory of call by need agrees with the theory of call by name, and that the practice of call by need optimizes the practice of call by name. Our operational account takes the form of three new calculi for lazy evaluation of lambda-terms and our synthesis takes the form of three lock-step equivalences. The first calculus is a hereditarily compressed variant of Ariola et al.'s call-by-need lambda-calculus and makes "neededness" syntactically explicit. The second calculus distinguishes between strict bindings (which are induced by demand-driven computation) and non-strict bindings (which are used for caching intermediate results). The third calculus uses memo-thunks and an algebraic store. The first calculus syntactically corresponds to a storeless abstract machine, the second to an abstract machine with local stores, and the third to a lazy Krivine machine, i.e., a traditional store-based abstract machine implementing lazy evaluation. The machines are intensionally compatible with extensional reasoning about lazy programs and they are lock-step equivalent. 
Each machine functionally corresponds to a natural semantics for call by need in the style of Launchbury, though for non-preprocessed λ-terms. Our results reveal a genuine and principled unity of computational theory and computational practice, one that readily applies to variations on the general theme of call by need.
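The "memo-thunk" implementational practice that this work connects to the syntactic theory can be sketched directly: a thunk suspends a computation and caches its first result, so every subsequent force shares the cached value.

```python
# Memo-thunk: the store-based practice of call by need. The suspended
# computation runs at most once; later forces return the cached value.
class Thunk:
    def __init__(self, compute):
        self.compute, self.forced, self.value = compute, False, None

    def force(self):
        if not self.forced:                        # demand-driven: run once
            self.value = self.compute()
            self.forced, self.compute = True, None  # cache result, drop code
        return self.value

calls = []                                         # observe evaluation count
t = Thunk(lambda: calls.append("run") or 42)
```

Forcing `t` twice yields the same value while the underlying computation runs only once, which is the sharing behaviour the three calculi above account for syntactically.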
Article
It has been a long way since Knowlton’s movie about list processing with the programming language L6. Thousands of algorithm animations, hundreds of systems, and numerous case studies and evaluations have been produced since. But don’t get me wrong, it’s not all said and done. By and large, software visualization research has concentrated on a few aspects of software. So you might ask, what should it concentrate on in the future? To answer this question we present a quantitative map of existing research and discuss some cross-topic research themes.
Conference Paper
One of the most attractive features of functional languages is first-class functions. To support first-class functions many functional languages create heap-allocated closures to store the bindings of free variables. This makes it difficult to predict how the heap is accessed and makes accesses to free variables slower than accesses to bound variables. This article presents the operational semantics of the MT evaluator virtual machine and proposes a new implementation strategy for first-class functions in a pure functional language that eliminates the use of heap-allocated closures by using partial evaluation and dynamic code generation. At runtime, functions containing references to free variables are specialized for the bindings of these variables. In these specialized functions, references to free variables become references to constant values. As a result, the need for heap allocated closures is eliminated, accesses to free variables become faster than accesses to bound variables, and expected heap access patterns remain unchanged.
Article
Full-text available
Computational logic can roughly be defined as that branch of artificial intelligence that is based on logic. It includes logic programming, theorem-proving, rewrite-rule systems, and production-rule systems. Such fields are going through a truly interesting transition. It now is becoming apparent that performance improvements for computational logic systems could be as high as 3 to 5 orders of magnitude during the next 5 to 7 years. These improvements will come from three distinct sources: (1) The processors used in the commonly available computing systems will improve by a factor of 5 to 10. That is, commonly available, cheap processors will increase substantially in speed. (2) Multiprocessors featuring up to 1024 nodes will become widely available. Each node may contain 4 to 16 tightly coupled processors (on a shared memory). (3) Substantial speedups will occur due to improvements in the implementation of the basic algorithms. The last point is the focus of this document.
Article
Full-text available
We have developed and implemented techniques that double the performance of dynamically-typed object-oriented languages. Our SELF implementation runs twice as fast as the fastest Smalltalk implementation, despite SELF's lack of classes and explicit variables. To compensate for the absence of classes, our system uses implementation-level maps to transparently group objects cloned from the same prototype, providing data type information and eliminating the apparent space overhead for prototype-based systems. To compensate for dynamic typing, user-defined control structures, and the lack of explicit variables, our system dynamically compiles multiple versions of a source method, each customized according to its receiver's map. Within each version the type of the receiver is fixed, and thus the compiler can statically bind and inline all messages sent to self. Message splitting and type prediction extract and preserve even more static type information, allowing the compiler to inline many other messages. Inlining dramatically improves performance and eliminates the need to hard-wire low-level methods such as +,==, and ifTrue:. Despite inlining and other optimizations, our system still supports interactive programming environments. The system traverses internal dependency lists to invalidate all compiled methods affected by a programming change. The debugger reconstructs inlined stack frames from compiler-generated debugging information, making inlining invisible to the SELF programmer.
Article
This paper describes the principles underlying an efficient implementation of a lazy functional language, compiling to code for ordinary computers. It is based on combinator-like graph reduction: the user defined functions are used as rewrite rules in the graph. Each function is compiled into an instruction sequence for an abstract graph reduction machine, called the G-machine, the code reduces a function application graph to its value. The G-machine instructions are then translated into target code. Speed improvements by almost two orders of magnitude over previous lazy evaluators have been measured; we provide some performance figures.
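The essence of graph reduction described here, rewriting an application node in place so that all shared references see the result of a single evaluation, can be shown in miniature. This is a toy update-in-place reducer, not the G-machine's compiled instruction sequences.

```python
# Graph reduction in miniature: an application node is overwritten with its
# value, so every reference sharing that node sees a result computed once.
class Node:
    def __init__(self, tag, *fields):
        self.tag, self.fields = tag, fields

evals = []                                   # records each function call
def reduce(node):
    if node.tag == "app":
        f, x = node.fields
        evals.append(f.__name__)
        result = f(reduce(x))
        node.tag, node.fields = "int", (result,)   # update the graph node
    return node.fields[0]

def double(n): return 2 * n
shared = Node("app", double, Node("int", 3))
root = Node("app", double, shared)           # 'shared' is referenced twice
```

After reducing `root`, the `shared` node has already been overwritten with its value, so reducing it again performs no further function calls; this sharing of results is what makes graph reduction a fit for lazy evaluation.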
Article
GUM is a portable, parallel implementation of the Haskell functional language. Despite sustained research interest in parallel functional programming, GUM is one of the first such systems to be made publicly available. GUM is message-based, and portability is facilitated by using the PVM communications harness that is available on many multi-processors. As a result, GUM is available for both shared-memory (Sun SPARCserver multiprocessors) and distributed-memory (networks of workstations) architectures. The high message-latency of distributed machines is ameliorated by sending messages asynchronously, and by sending large packets of related data in each message. Initial performance figures demonstrate absolute speedups relative to the best sequential compiler technology. To improve the performance of a parallel Haskell program, GUM provides tools for monitoring and visualising the behaviour of threads and of processors during execution.
Article
This paper provides a mathematical analysis of the Warren Abstract Machine for executing Prolog and a correctness proof for a general compilation scheme of Prolog for the WAM. Starting from an abstract Prolog model which is close to the programmer’s intuition, we derive the WAM methodically by stepwise refinement of Prolog models, proving correctness and completeness for each refinement step. Along the way we explicitly formulate, as proof assumptions, a set of natural conditions for a compiler to be correct, thus making our proof applicable to a whole class of compilers. The proof method provides a rigorous mathematical framework for the study of Prolog compilation techniques. It can be applied in a natural way to extensions and variants of Prolog and related WAMs allowing for parallelism, constraint handling, types, functional components – in some cases it has in fact been successfully extended. Our exposition assumes only a general understanding of Prolog. We reach full mathematical rigour, without heavy methodological overhead, by using Gurevich’s notion of evolving algebras.
Article
This paper suggests that input and output are basic primitives of programming and that parallel composition of communicating sequential processes is a fundamental program structuring method. When combined with a development of Dijkstra's guarded command, these concepts are surprisingly versatile. Their use is illustrated by sample solutions of a variety of familiar programming exercises.
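The structuring idea above, sequential processes composed in parallel and interacting only through communication, can be roughly sketched with threads and a bounded channel. This is an informal illustration in Python, not Hoare's notation; the end-of-stream marker is our own convention.

```python
import queue
import threading

# CSP in miniature: two sequential processes composed in parallel,
# communicating only through a channel (a bounded queue, so 'put' blocks
# when the channel is full, loosely approximating synchronous handoff).
def producer(ch):
    for i in range(3):
        ch.put(i)            # output primitive: send i on the channel
    ch.put(None)             # end-of-stream marker (our convention)

def consumer(ch, out):
    while (v := ch.get()) is not None:   # input primitive: receive
        out.append(v * v)

ch, out = queue.Queue(maxsize=1), []
threads = [threading.Thread(target=producer, args=(ch,)),
           threading.Thread(target=consumer, args=(ch, out))]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Neither process touches the other's state; all interaction flows through `ch`, which is the discipline the paper argues for.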
Conference Paper
Effective, efficient implementations of proof systems are rare. Efficient parallel proof systems are even scarcer. In order to practically address this problem it is necessary to produce a programming model that enables the user to construct portable parallel proof systems quickly, and with minimal programming effort. The model must also offer the facility to adapt the produced proof system so that different heuristic variants can be considered. To this end we introduce an abstract graph machine (AGM), and a language for programming the machine, AGML. Their structure has evolved from both consideration of the requirements of the user, and the restrictions inherent in particular parallel architectures. We show how the data structures used in proof systems can be mapped to this graph-based model, and how the specific algorithms/heuristics for deduction can be represented as operations in AGML.
Conference Paper
Traditionally, the fields of parallel and distributed computing have been quite distinct and have been motivated by different concerns. The parallel community have focused on issues of performance and scalability while the distributed community have been more concerned with sharing and wide area connectivity. There is now however some evidence that the interests of the two communities are converging. This paper examines arguments for and against convergence (from the perspective of the distributed systems community). In particular, the paper examines three trends towards convergence: i) the emergence of high speed networks and, in particular; ATM, ii) the shared interest in platform-independent computational models, and iii) the emergence of microkernel-based architectures for operating systems. The paper concludes that there is cause for optimism that there will be some steps towards a convergence between the two communities in the next few years. Before this becomes reality, there are a number of key research questions that must be answered.
Conference Paper
ksh is a high level interactive script language that is a superset of the UNIX system shell. ksh has better programming features and better performance. Versions of ksh are distributed with the UNIX system by many vendors; this has created a large and growing user community in many different companies and universities. Applications of up to 25,000 lines have been written in ksh and are in production use. ksh-93 is the first major revision of ksh in five years. Many of the changes for ksh-93 were made in order to conform to the IEEE POSIX and ISO shell standards. In addition, ksh-93 has many new programming features that vastly extend the power of shell programming. It was revised to meet the needs of a new generation of tools and graphical interfaces. Much of the impetus for ksh-93 was wksh, which allows graphical user interfaces to be written in ksh. ksh-93 includes the functionality of awk, perl, and tcl. Because ksh-93 maintains backward compatibility with earlier versions of ksh, older ksh and Bourne shell scripts should continue to run with ksh-93.
Article
The Threaded Abstract Machine (TAM) refines dataflow execution models to address the critical constraints that modern parallel architectures place on the compilation of general-purpose parallel programming languages. TAM defines a self-scheduled machine language of parallel threads, which provides a path from data-flow-graph program representations to conventional control flow. The most important feature of TAM is the way it exposes the interaction between the handling of asynchronous message events, the scheduling of computation, and the utilization of the storage hierarchy. This paper provides a complete description of TAM and codifies the model in terms of a pseudo machine language TL0. Issues in compilation from a high level parallel language to TL0 are discussed in general and specifically in regard to the Id90 language. The implementation of TL0 on the CM-5 multiprocessor is explained in detail. Using this implementation, a cost model is developed for the various TAM primitives. The TAM approach is evaluated on sizable Id90 programs on a 64 processor system. The scheduling hierarchy of quanta and threads is shown to provide substantial locality while tolerating long latencies. This allows the average thread scheduling cost to be extremely low.
Article
The Spineless Tagless G-machine is an abstract machine designed to support nonstrict higher-order functional languages. This presentation of the machine falls into three parts. Firstly, we give a general discussion of the design issues involved in implementing non-strict functional languages. Next, we present the STG language, an austere but recognisably-functional language, which as well as a denotational meaning has a well-defined operational semantics. The STG language is the "abstract machine code" for the Spineless Tagless G-machine. Lastly, we discuss the mapping of the STG language onto stock hardware. The success of an abstract machine model depends largely on how efficient this mapping can be made, though this topic is often relegated to a short section. Instead, we give a detailed discussion of the design issues and the choices we have made. Our principal target is the C language, treating the C compiler as a portable assembler. Version 2.5 of this paper (minus appendix) appe...
Article
We have implemented a parallel graph reducer on a commercially available shared memory multiprocessor (a Sequent Symmetry™) that achieves real speedup compared to a fast compiled implementation of the conventional G-machine. Using 15 processors, this speedup ranges between 5 and 11, depending on the program. Underlying the implementation is an abstract machine called the ⟨ν,G⟩-machine. We describe the sequential and the parallel ⟨ν,G⟩-machine, and our implementation of them. We provide...
Book
The research described in this book addresses the semantic gap between logic programming languages and the architecture of parallel computers: the problem of how to implement logic programming languages on parallel computers in a way that can most effectively exploit the inherent parallelism of the language and efficiently utilize the parallel architecture of the computer. Following a review of other research results, the first project explores the possibilities of implementing logic programs on MIMD, nonshared memory massively parallel computers containing 100 to 1,000 processing elements. The second investigates the possibility of implementing Prolog on a distributed processor array. The author's objectives are to define a parallel computational paradigm (the extended cellular-dataflow model) that can be used to create a parallel Prolog abstract machine.
Article
An abstract is not available.
Article
UNCOL--UNiversal Computer Oriented Language--is being designed as an empirical, pragmatic aid to the solution of a fundamental problem of the digital data processing business: automated translation of programs from expressions in an ever increasing set of problem oriented languages into the machine languages of an expanding variety of digital processing devices. By application of a program called a generator, specific to a given problem language, program statements in this problem language are transformed into equivalent UNCOL statements, independent of any consideration of potential target computers. Subsequently, without regard to the identity of the original problem language, the UNCOL statement of the problem is processed by a program called a translator, which is specific to a given target computer, and the result is an expression of the original problem solution in the machine language of the desired processor. The advantage of this apparent complication over the current procedure of employing a program called a compiler for direct transformation from problem language to machine language is evident when one examines the number of languages and machines involved and the not inconsiderable expense of translation program construction. If there are M problem languages and N machines, then M×N compilers are required but only M+N generators and translators. In order to arrive at sensible specifications for UNCOL, certain limitations in its scope are essential. Accordingly, UNCOL is designed to cope with only those problem language and machine language characteristics that can reasonably be expected to enjoy general use in the next decade. Any broader approach shows promise of leading to elegant, impractical results. A glance at the preliminary specifications for UNCOL shows a language akin to a symbolic assembly language for a register-free, single address, indexed machine.
The specific commands are few in number and simple in construction, depending on a special defining capability for the expression of more elaborate instructions. The data referencing scheme is complex, allowing the application of a signed index to the primary address, and permitting both the primary and index parts to be delegated to an indefinite level. Each item of data, either input or calculated, must be described as to type, range, precision and the like by special data descriptive syntactical sentences in the language. These descriptions, additionally, provide information concerning ultimate storage allocation as well as indicators of contextual meaning for the explicit commands. Supplementary to the instructions and data descriptions are certain declarative sentences, inessential to the explicit statement of the problem solutions being translated, designed to provide information useful to the translator in the introduction of computational efficiency into the object program.
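The economic argument is purely arithmetic: M front-end generators plus N back-end translators replace the M×N compilers needed for direct translation. A one-line check:

```python
M, N = 10, 8          # say, 10 problem languages and 8 target machines
direct = M * N        # one compiler per (language, machine) pair
via_uncol = M + N     # M generators + N translators
assert direct == 80 and via_uncol == 18
```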
Article
Interpreted languages have become increasingly popular due to demands for rapid program development, ease of use, portability, and safety. Beyond the general impression that they are "slow," however, little has been documented about the performance of interpreters as a class of applications. This paper examines interpreter performance by measuring and analyzing interpreters from both software and hardware perspectives. As examples, we measure the MIPSI, Java, Perl, and Tcl interpreters running an array of micro and macro benchmarks on a DEC Alpha platform. Our measurements of these interpreters relate performance to the complexity of the interpreter's virtual machine and demonstrate that native runtime libraries can play a key role in providing good performance. From an architectural perspective, we show that interpreter performance is primarily a function of the interpreter itself and is relatively independent of the application being interpreted. We also demonstrate that high-level interpreters' demands on processor resources are comparable to those of other complex compiled programs, such as gcc. We conclude that interpreters, as a class of applications, do not currently motivate special hardware support for increased performance.
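As a point of reference for what such measurements target, the core of any bytecode interpreter is a dispatch loop whose per-instruction overhead dominates. A minimal stack-based sketch — illustrative only, not one of the interpreters measured in the paper:

```python
def interpret(code):
    """A minimal stack-based bytecode interpreter: the dispatch loop
    whose overhead studies like the one above measure."""
    stack = []
    pc = 0
    while pc < len(code):
        op, arg = code[pc]
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "HALT":
            break
        pc += 1
    return stack[-1]

# (2 + 3) * 4
prog = [("PUSH", 2), ("PUSH", 3), ("ADD", None),
        ("PUSH", 4), ("MUL", None), ("HALT", None)]
assert interpret(prog) == 20
```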
Article
The functional or applicative languages have long been regarded as suitable vehicles for overcoming many of the problems involved in the production and maintenance of correct and reliable software. However, their inherent inefficiencies when run on conventional von Neumann style machines have prevented their widespread acceptance. With the declining cost of hardware and the increasing feasibility of multi-processor architectures this position is changing, for, in contrast to conventional programs where it is difficult to detect those parts that may be executed concurrently, applicative programs are ideally suited to parallel evaluation. In this paper we present a scheme for the parallel evaluation of a wide variety of applicative languages and provide an overview of the architecture of a machine on which it may be implemented. First we describe the scheme, which may be characterized as performing graph reduction, at the abstract level and discuss mechanisms that allow several modes of parallel evaluation to be achieved efficiently. We also show how a variety of languages are supported. We then suggest an implementation of the scheme that has the property of being highly modular; larger and faster machines being built by joining together smaller ones. Performance estimates illustrate that a small machine (of the size that we envisage would form the basic building block of large systems) would provide an efficient desk-top personal applicative computer, while the larger versions promise very high levels of performance indeed. The machine is designed to be ultimately constructed from a small number of types of VLSI component. Finally we compare our approach with the other proposed schemes for the parallel evaluation of applicative languages and discuss planned future developments.
Article
From the Publisher: Whether your knowledge of Perl is casual or deep, this book will make you a more accomplished programmer. Here you can learn the complex techniques for production-ready Perl programs. This book explains methods for manipulating data and objects that may have looked like magic before. Furthermore, it sets Perl in the context of a larger environment, giving you the background you need for dealing with networks, databases, and GUIs. The discussion of internals helps you program more efficiently and embed Perl within C, or C within Perl. In addition, the book patiently explains all sorts of language details you've always wanted to know more about, such as the use of references, trapping errors through the eval operator, non-blocking I/O, when closures are helpful, and using ties to trigger actions when data is accessed. You will emerge from this book a better hacker, and a proud master of Perl.
Article
Lisp is restricted in application because of the unpredictable delays introduced by garbage collection. Incremental garbage collectors eliminate the delays, but are less efficient overall. We present a semi-incremental algorithm that reduces the delays and is, moreover, more efficient than the mark-scan algorithm from which it is derived. The mark-scan algorithm is explained: it consists of a mark phase followed by a sweep phase. The sweep phase can be performed incrementally, but the mark phase cannot. If this modification is made, our semi-incremental algorithm is derived. Using the new algorithm the delay on garbage collection is proportional to the amount of heap actually in use, not the size of the heap. Allocating a cell takes a variable amount of time, depending on the proportion of the heap in use. Comparing the number of operations in the old and new algorithms, we see that the new algorithm is more efficient. The new algorithm was used as part of a Lisp implementation on an LSI-11/03 and found to behave well.
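The algorithm described — one atomic mark phase, with the sweep folded into allocation so each allocation reclaims the next dead cell on demand — can be sketched as follows. This is a loose Python rendering under our own naming, not the paper's Lisp implementation:

```python
class SemiIncrementalHeap:
    """Sketch of a semi-incremental collector: marking is one atomic phase
    (pause proportional to live data), sweeping happens lazily inside
    allocation."""
    def __init__(self, size):
        self.cells = [None] * size
        self.marked = [False] * size
        self.sweep_ptr = 0

    def mark_from(self, roots, children):
        """Atomic mark phase: cost proportional to the live heap only."""
        stack = list(roots)
        while stack:
            i = stack.pop()
            if not self.marked[i]:
                self.marked[i] = True
                stack.extend(children(i))
        self.sweep_ptr = 0          # restart the lazy sweep

    def allocate(self, value):
        """Incremental sweep: scan forward for the next unmarked cell."""
        while self.sweep_ptr < len(self.cells):
            i = self.sweep_ptr
            self.sweep_ptr += 1
            if not self.marked[i]:
                self.cells[i] = value
                return i
            self.marked[i] = False  # clear the mark for the next collection
        raise MemoryError("heap exhausted; a new mark phase is needed")

heap = SemiIncrementalHeap(4)
heap.mark_from(roots=[0, 2], children=lambda i: [])  # cells 0 and 2 are live
assert heap.allocate("x") == 1    # cell 0 survives, cell 1 is reclaimed
assert heap.allocate("y") == 3    # cell 2 survives, cell 3 is reclaimed
```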
Conference Paper
The Cartesian closed categories have been shown by several authors to provide the right framework for the model theory of the λ-calculus. The second author developed this as a syntactic equivalence between two calculi, giving rise to a new kind of combinatory logic: the categorical combinatory logic, where computations can be done through simple rewrite rules, and, as usual with combinators, avoiding problems with variable name clashes. This paper goes further (though not requiring a previous knowledge of categorical combinatory logic) and describes a very simple machine where categorical terms can be considered as code acting on a graph of values. The only saving mechanism is a stack containing pointers on code or on the graph. Abstractions are handled using closures. The machine is called categorical abstract machine or CAM. The CAM is easier to grasp and prove than the SECD machine. The natural evaluation strategy in the CAM is call-by-value, but lazy evaluation can be easily incorporated. The paper discusses the implementation of a real functional programming language, ML, through the CAM. A basic acquaintance with the λ-calculus is required.
Chapter
Although the sequential execution speed of logic programs has been greatly improved by the concepts introduced in the Warren Abstract Machine (WAM), parallel execution represents the only way to increase this speed beyond the natural limits of sequential systems. However, most proposed parallel logic programming execution models lack the performance optimizations and storage efficiency of sequential systems. This paper presents a parallel abstract machine which is an extension of the WAM and is thus capable of supporting AND-Parallelism without giving up the optimizations present in sequential implementations. A suitable instruction set, which can be used as a target by a variety of logic programming languages, is also included. Special instructions are provided to support a generalized version of "Restricted AND-Parallelism" (RAP), a technique which reduces the overhead traditionally associated with the run-time management of variable binding conflicts to a series of simple run-time checks, which select one out of a series of compiled execution graphs.
Article
In this paper we study the compilation of Prolog by making visible hidden operations (especially unification), and then optimizing them using well-known partial evaluation techniques. Inspection of straightforwardly partially evaluated unification algorithms gives an idea of how to design special abstract machine instructions which later form the target language of our compilation. We handle typical compiler problems like representation of terms explicitly. This work gives a logical reconstruction of abstract Prolog machine code, and represents an approach to constructing a correct compiler from Prolog to such a code. As an example, we explain the unification principles of Warren's New Prolog Engine within our framework.
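The hidden operation made visible here is unification. A textbook version, with dereferencing through a binding store as an abstract Prolog machine would perform it, can be sketched as follows; the term representation (uppercase strings as variables, per Prolog convention) is our own choice:

```python
def is_var(t):
    """Prolog convention: strings starting with an upper-case letter
    are variables; other strings are atoms."""
    return isinstance(t, str) and t[:1].isupper()

def deref(t, bindings):
    """Follow variable bindings to the representative term."""
    while is_var(t) and t in bindings:
        t = bindings[t]
    return t

def unify(a, b, bindings):
    """Structural unification; compound terms are ('functor', arg, ...)
    tuples. Returns an extended binding store, or None on failure.
    (Occurs check omitted, as in most Prolog implementations.)"""
    a, b = deref(a, bindings), deref(b, bindings)
    if a == b:
        return bindings
    if is_var(a):
        return {**bindings, a: b}
    if is_var(b):
        return {**bindings, b: a}
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and len(a) == len(b) and a[0] == b[0]):
        for x, y in zip(a[1:], b[1:]):
            bindings = unify(x, y, bindings)
            if bindings is None:
                return None
        return bindings
    return None

# f(X, b) = f(a, Y)  succeeds with  X = a, Y = b
env = unify(("f", "X", "b"), ("f", "a", "Y"), {})
assert env == {"X": "a", "Y": "b"}
```

Partially evaluating such a general loop against a known head term is exactly what yields the specialized get/put-style instructions the paper reconstructs.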
Conference Paper
An analysis is made of simple probabilistic implementations of (slightly restricted) parallel graph rewriting both on a shared memory architecture like a PRAM and a more realistic distributed memory architecture like a transputer network. Graph rewriting is executed in cycles where every cycle consists of the execution of all the tasks presently available in the graph. Assuming that there are p processors and N executable tasks in the cycle, it is shown that the PRAM can execute the cycle in (optimal) time O(N/p) with high probability provided N = Ω(p² log p), whereas a processor net can execute the cycle in time O(N/p log p) with high probability using chunks of messages of size O(N/p) if only N = Ω(p log p).
Article
We present a nondeterministic calculus of closures for the evaluation of the λ-calculus, which is based, not on symbolic evaluation (β-reduction), but on the paradigm of environments and closures, as in the old SECD machine, or in the more recent Categorical Abstract Machine (CAM). This calculus stands as a suitable abstraction right above those devices: there is no commitment as to the order of evaluation, and no explicit handling of stacks: the calculus is expressed in the style of Structural Operational Semantics, by a set of conditional rewrite rules. The CAM, and a very simple lazy abstract machine, due to J.-L. Krivine, arise naturally by first restricting the calculus to a specific strategy, and then implementing recursive calls by a stack. The Church–Rosser property and termination in presence of types hold in the calculus of closures, and are easier to establish than in the λ-calculus. The calculus of closures has served as a starting point to a more powerful general calculus of substitutions at which we hint shortly, insisting on its category-theoretic significance.
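Krivine's lazy machine mentioned above needs only a current closure and a stack of argument closures. A direct transcription using de Bruijn indices (the encoding and names are ours) fits in a few lines:

```python
# Terms: ("var", n)  de Bruijn index
#        ("lam", body)
#        ("app", function, argument)
def krivine(term, env=(), stack=()):
    """Krivine's call-by-name machine: weak head reduction of λ-terms.
    Environments and stacks hold closures, i.e. (term, env) pairs."""
    while True:
        tag = term[0]
        if tag == "app":                 # push the argument as a closure
            stack = ((term[2], env),) + stack
            term = term[1]
        elif tag == "lam":
            if not stack:                # weak head normal form reached
                return (term, env)
            clo, stack = stack[0], stack[1:]
            env = (clo,) + env           # bind the argument, enter the body
            term = term[1]
        else:                            # variable: enter its closure
            term, env = env[term[1]]

# (λx. x) (λy. y)  reduces to  λy. y
ident = ("lam", ("var", 0))
whnf, _ = krivine(("app", ident, ident))
assert whnf == ident
```

Note how the machine never builds a substituted term: arguments are pushed unevaluated, which is exactly the call-by-name strategy the calculus of closures abstracts over.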
Article
This article surveys the major developments in sequential Prolog implementation during the period 1983–1993. In this decade, implementation technology has matured to such a degree that Prolog has left the university and become useful in industry. The survey is divided into four parts. The first part gives an overview of the important technical developments starting with the Warren abstract machine. The second part presents the history and the contributions of the major software and hardware systems. The third part charts the evolution of Prolog performance since Warren's DEC-10 compiler. The fourth part extrapolates current trends regarding the evolution of sequential logic languages, their implementation, and their role in the marketplace.
Article
We extend the theory of Prolog to provide a framework for the study of Prolog compilation technology. For this purpose, we first demonstrate the semantic equivalence of two Prolog interpreters: a conventional SLD-refutation procedure and one that employs Warren's “last call” optimization. Next, we formally define the Warren Abstract Machine (WAM) and its instruction set and present a Prolog compiler for the WAM. Finally, we prove that the WAM execution of a compiled Prolog program produces the same result as the interpretation of its source.
Article
PROLOG implementation efforts have recently begun to shift from single-processor systems to the new commercially available shared-memory multiprocessors. Among the problems encountered are efficient implementation of operations on variables and the scheduling of the processors. Most of the solutions proposed so far suffer from expensive, nonconstant-time implementation of operations on variables. We propose a storage model (versions-vector model) and a scheduling algorithm. The objectives of the scheduling algorithm are to approximate the sequential processing whenever feasible and to minimize the change in the state of a processor looking for a new task. The most important property of the storage model is a constant-time implementation of operations on all variables. The price paid for efficiency in managing variables is a nonconstant time of task switching. We propose three ways to decrease this price. The first is promotion of variables from versions vectors to value cells on the stack or heap during a task switch, making the subsequent task switches cheaper. The second is delayed installation of variables in versions vectors, decreasing the cost of short branches. The third is a possibility of restricting parallelism to predicates which can gain from the OR-parallel execution.
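The storage model's key idea can be reduced to a few lines: each potentially shared variable carries one binding slot per OR-parallel worker, so binding and dereferencing cost O(1) regardless of the depth of the search tree, while task switching must pay to manage the vectors. A sketch with invented names:

```python
class VersionsVector:
    """Sketch of the versions-vector idea: each shared variable keeps one
    binding slot per OR-parallel worker, giving constant-time access."""
    def __init__(self, num_workers):
        self.slots = [None] * num_workers

    def bind(self, worker, value):
        self.slots[worker] = value     # O(1), private to this branch

    def lookup(self, worker):
        return self.slots[worker]      # O(1), no chain of trail entries

X = VersionsVector(num_workers=3)
X.bind(0, "a")                         # branch explored by worker 0
X.bind(2, "b")                         # a different branch, worker 2
assert X.lookup(0) == "a" and X.lookup(2) == "b" and X.lookup(1) is None
```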
Conference Paper
Eden is being implemented by extending the Glasgow Haskell Compiler (GHC) which is based on the Spineless Tagless G-Machine (STGM). In this paper we present a parallel abstract machine which embodies the STGM for sequential computations and describes a distributed runtime environment for Eden programs on an abstract level.
Conference Paper
By studying properties of CLP over an unspecified constraint domain X one obtains general results applicable to all instances of CLP(X). The purpose of this paper is to study a general implementation scheme for CLP(X) by designing a generic extension WAM(X) of the WAM and a corresponding generic compilation scheme from CLP(X) programs to WAM(X) code, which is based on Börger and Rosenzweig's WAM specification and correctness proof. Thus, using the evolving algebra specification method, we obtain not only a formal description of our WAM(X) scheme, but also a mathematical correctness proof for the design.