Article

Paging Performance in the MT Evaluator Virtual Machine

Authors:

Abstract

The performance of functional languages is closely related to the manner in which they utilize memory, and it is commonly believed that functional languages are slow due to their poor interaction with memory. To make functional languages faster, research efforts have focused on parallelism, on memory allocation and performance, and on compiler technology. Each of these lines of research has advanced considerably, but they have rarely been combined into one comprehensive approach. The MT system is being developed as a test bed for novel implementation techniques to improve memory performance by utilizing parallelism and modern compiler technology to manage a distributed memory system that supports a program evaluator. In this paper, we first briefly describe the MT evaluator virtual machine and its new implementation in C++. We then present performance measurements of the paging behavior of MT's heap, stack, and code space using three benchmarks that represent typical programs written in a functional language. The results confirm previous findings that suggest FIFO ought to be used over LRU as the page replacement policy for MT's heap, while LRU ought to be used for MT's stack. We also present the first empirical measurements of the paging behavior of MT's code space. These measurements suggest that LRU performs better than FIFO, but that FIFO is a competitive page replacement policy for MT's code space.
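As an illustration of what the heap, stack, and code-space measurements compare, the sketch below (our own, not MT's implementation; the reference string and frame count are made up) counts page faults for a trace under FIFO and LRU with a fixed pool of resident frames.

```cpp
// Illustrative sketch: page-fault counts for one reference trace under FIFO and LRU.
#include <cstddef>
#include <deque>
#include <iostream>
#include <list>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// FIFO: evict the page that has been resident longest, regardless of recent use.
std::size_t fifo_faults(const std::vector<int>& trace, std::size_t frames) {
    std::deque<int> arrival;               // residency order
    std::unordered_set<int> resident;
    std::size_t faults = 0;
    for (int page : trace) {
        if (resident.count(page)) continue;   // hit: FIFO does not reorder
        ++faults;
        if (resident.size() == frames) {      // evict the oldest arrival
            resident.erase(arrival.front());
            arrival.pop_front();
        }
        arrival.push_back(page);
        resident.insert(page);
    }
    return faults;
}

// LRU: evict the page whose most recent reference lies furthest in the past.
std::size_t lru_faults(const std::vector<int>& trace, std::size_t frames) {
    std::list<int> order;                                  // most recently used at front
    std::unordered_map<int, std::list<int>::iterator> pos;
    std::size_t faults = 0;
    for (int page : trace) {
        auto it = pos.find(page);
        if (it != pos.end()) {                             // hit: move to front
            order.erase(it->second);
        } else {
            ++faults;
            if (pos.size() == frames) {                    // evict least recently used
                pos.erase(order.back());
                order.pop_back();
            }
        }
        order.push_front(page);
        pos[page] = order.begin();
    }
    return faults;
}

int main() {
    // A toy reference string; real MT traces would come from heap, stack, or code accesses.
    std::vector<int> trace = {1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5};
    std::cout << "FIFO faults: " << fifo_faults(trace, 3) << "\n"
              << "LRU  faults: " << lru_faults(trace, 3) << "\n";
}
```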

References
Book
Full-text available
This book provides a practical approach to understanding implementations of non-strict functional languages using lazy graph reduction. It is intended to be a source of practical laboratory work material and to help students to develop, modify and experiment with their own implementations. The emphasis lies on the building of working prototypes of several functional language implementations and in each case the approach is to give a complete working prototype of a particular implementation, then lead the reader through a sequence of improvements to expand its scope. The prototypes are expressed in the functional language Miranda and software is available in machine-readable form.
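As a rough illustration of the update-in-place behaviour at the heart of lazy graph reduction, the following sketch (our own, in C++ rather than the book's Miranda) shows a node that is reduced at most once and then overwritten with its value, so shared references never recompute it.

```cpp
// Minimal sketch of graph reduction's "update after evaluation" idea:
// a node holds either an unevaluated suspension or its computed value.
#include <functional>
#include <iostream>
#include <memory>

struct Thunk {
    std::function<int()> compute;   // suspension (dropped once evaluated)
    int value = 0;
    bool evaluated = false;

    int force() {
        if (!evaluated) {
            value = compute();      // reduce once...
            evaluated = true;       // ...and overwrite the node with the result
            compute = nullptr;
        }
        return value;
    }
};

int main() {
    auto node = std::make_shared<Thunk>();
    node->compute = [] {
        std::cout << "reducing...\n";   // printed only once
        return 6 * 7;
    };
    // The graph may share this node; every reference sees the single update.
    std::cout << node->force() << "\n";
    std::cout << node->force() << "\n";  // no re-reduction
}
```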
Conference Paper
Full-text available
Poor memory performance has been associated with functional languages. The MT System is being developed to study the design choices in the development of an all-software based distributed virtual memory for a pure list-based functional language. It is of key importance to understand the virtual memory behaviour of the different components of a pure functional system. In this paper, we present results obtained from observing the paging behaviour of the MT heap that demonstrate the virtual memory benefits of using the MT allocation algorithm over allocation algorithms that box their data. FIFO performs nearly as well as LRU for list-based memory on a set of benchmarks which include a metacircular evaluator for MT. The empirical results are explained by the paging behaviour of three frequent high-level memory accessing operations.
Article
Full-text available
Advances in parallel computation are of central importance to Artificial Intelligence due to the significant amount of time and space their programs require. Functional languages have been identified as providing a clear and concise way of programming parallel machines for artificial intelligence tasks. The problems of exporting, creating, and manipulating processes have been thoroughly studied in relation to the parallelization of functional languages, but none of the necessary support structures needed for the abstraction, like a distributed memory, have been properly designed. In order to design and implement parallel functional languages efficiently, we propose the development of an all-software based distributed virtual memory system designed specifically for the memory demands of a functional language. In this paper, we review the MT architecture and briefly survey the related literature that led to its development. We then present empirical results obtained from observing the paging behavior of the MT stack. Our empirical results suggest that LRU is superior to FIFO as a page replacement policy for MT stack pages. We present a proof that LRU is an opti…
Article
Full-text available
"The Structure and Interpretation of Computer Programs" is the entry-level subject in Computer Science at the Massachusetts Institute of Technology. It is required of all students at MIT who major in Electrical Engineering or in Computer Science, as one fourth of the "common core curriculum," which also includes two subjects on circuits and linear systems and a subject on the design of digital systems. We have been involved in the development of this subject since 1978, and we have taught this material in its present form since the fall of 1980 to approximately 600 students each year. Most of these students have had little or no prior formal training in computation, although most have played with computers a bit and a few have had extensive programming or hardware design experience. Our design of this introductory Computer Science subject reflects two major concerns. First we want to establish the idea that a computer language is not just a way of getting a computer to perform operations, but rather that it is a novel formal medium for expressing ideas about methodology. Thus, programs must be written for people to read, and only incidentally for machines to execute. Secondly, we believe that the essential material to be addressed by a subject at this level, is not the syntax of particular programming language constructs, nor clever algorithms for computing particular functions of efficiently, not even the mathematical analysis of algorithms and the foundations of computing, but rather the techniques used to control the intellectual complexity of large software systems.
Article
Full-text available
Dynamic memory allocation has been a fundamental part of most computer systems since roughly 1960, and memory allocation is widely considered to be either a solved problem or an insoluble one. In this survey, we describe a variety of memory allocator designs and point out issues relevant to their design and evaluation. We then chronologically survey most of the literature on allocators between 1961 and 1995. (Scores of papers are discussed, in varying detail, and over 150 references are given.) We argue that allocator designs have been unduly restricted by an emphasis on mechanism, rather than policy, while the latter is more important; higher-level strategic issues are still more important, but have not been given much attention. Most theoretical analyses and empirical allocator evaluations to date have relied on very strong assumptions of randomness and independence, but real program behavior exhibits important regularities that must be exploited if allocator...
Article
The third edition of Computer Architecture and Organization features a comprehensive updating of the material, especially case studies, worked examples, and problem sets, while retaining the book's time-proven emphasis on basic principles. Reflecting the dramatic changes in computer technology that have taken place over the last decade, the treatment of performance-related topics such as pipelines, caches, and RISCs has been expanded. Many examples and end-of-chapter problems have also been added. Table of contents: 1. Computation and Computers; 2. Design Methodology; 3. Processor Design; 4. Datapath Design; 5. Control Design; 6. Memory Organization; 7. System Organization.
Article
One of the basic limitations of a digital computer is the size of its available memory. In most cases, it is neither feasible nor economical for a user to insist that every problem program fit into memory. The number of words of information in a program often exceeds the number of cells (i.e., word locations) in memory. The only way to solve this problem is to assign more than one program word to a cell. Since a cell can hold only one word at a time, extra words assigned to the cell must be held in external storage. Conventionally, overlay techniques are employed to exchange memory words and external-storage words whenever needed; this, of course, places an additional planning and coding burden on the programmer. For several reasons, it would be advantageous to rid the programmer of this function by providing him with a “virtual” memory larger than his program. An approach that permits him to use a sufficiently large address range can accomplish this objective, assuming that means are provided for automatic execution of the memory-overlay functions.
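The mechanism the passage motivates can be sketched in a few lines: split a virtual address into a page number and an offset, consult a page table, and perform the "overlay" (page-in) automatically on a fault. The sketch below is ours, uses made-up sizes, and omits eviction and backing-store I/O.

```cpp
// Hedged sketch of automatic demand paging: translation plus page-in on a fault.
#include <cstdint>
#include <iostream>
#include <unordered_map>

constexpr std::uint32_t kPageSize = 4096;

struct PageTableEntry { bool present = false; std::uint32_t frame = 0; };

struct VirtualMemory {
    std::unordered_map<std::uint32_t, PageTableEntry> page_table;
    std::uint32_t next_free_frame = 0;   // toy assumption: frames never run out

    std::uint32_t translate(std::uint32_t vaddr) {
        std::uint32_t page = vaddr / kPageSize;
        std::uint32_t offset = vaddr % kPageSize;
        PageTableEntry& pte = page_table[page];
        if (!pte.present) {               // page fault: the "overlay" happens here
            std::cout << "fault on page " << page << ", loading from backing store\n";
            pte.frame = next_free_frame++;
            pte.present = true;
        }
        return pte.frame * kPageSize + offset;
    }
};

int main() {
    VirtualMemory vm;
    std::cout << vm.translate(0x12345) << "\n";  // faults, then maps the page
    std::cout << vm.translate(0x12FFF) << "\n";  // same page: no fault
}
```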
Conference Paper
Stop-and-copy garbage collection has been preferred to mark-and-sweep collection in the last decade because its collection time is proportional to the size of reachable data and not to the memory size. This paper compares the CPU overhead and the memory requirements of the two collection algorithms extended with generations, and finds that mark-and-sweep collection requires at most a small amount of additional CPU overhead (3-6%) but requires an average of 20% (and up to 40%) less memory to achieve the same page fault rate. The comparison is based on results obtained using trace-driven simulation with large Common Lisp programs.
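The reason stop-and-copy cost tracks reachable data can be seen in a toy Cheney-style copy, sketched below (our own sketch, not the paper's trace-driven simulator): the collector visits only cells reachable from the roots, so a mostly-garbage heap is cheap to collect.

```cpp
// Toy Cheney-style copying collection over a heap of cons-like cells.
#include <cstddef>
#include <iostream>
#include <vector>

constexpr std::size_t kNil = static_cast<std::size_t>(-1);

struct Cell {
    int value = 0;
    std::size_t left = kNil;     // indices into the heap
    std::size_t right = kNil;
    std::size_t forward = kNil;  // forwarding address once copied to to-space
};

// Breadth-first copy from the roots: work is proportional to live cells only.
std::vector<Cell> collect(const std::vector<Cell>& from, std::vector<std::size_t>& roots) {
    std::vector<Cell> to;
    std::vector<Cell> from_copy = from;          // mutable view holding forwarding pointers

    auto copy = [&](std::size_t idx) -> std::size_t {
        if (idx == kNil) return kNil;
        if (from_copy[idx].forward != kNil) return from_copy[idx].forward;  // already moved
        to.push_back(from_copy[idx]);
        from_copy[idx].forward = to.size() - 1;
        return to.size() - 1;
    };

    for (auto& r : roots) r = copy(r);           // copy the roots first
    for (std::size_t scan = 0; scan < to.size(); ++scan) {   // then scan to-space
        std::size_t l = copy(to[scan].left);
        to[scan].left = l;
        std::size_t r = copy(to[scan].right);
        to[scan].right = r;
    }
    return to;
}

int main() {
    std::vector<Cell> heap(1000);                // mostly garbage
    heap[0] = {1, 1, kNil, kNil};
    heap[1] = {2, kNil, kNil, kNil};
    std::vector<std::size_t> roots = {0};
    auto survivors = collect(heap, roots);
    std::cout << "live cells copied: " << survivors.size() << "\n";  // 2, not 1000
}
```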
Conference Paper
This paper discusses garbage collection techniques used in a high-performance Lisp implementation with a large virtual memory, the Symbolics 3600. Particular attention is paid to practical issues and experience. In a large system problems of scale appear and the most straightforward garbage-collection techniques do not work well. Many of these problems involve the interaction of the garbage collector with demand-paged virtual memory. Some of the solutions adopted in the 3600 are presented, including incremental copying garbage collection, approximately depth-first copying, ephemeral objects, tagged architecture, and hardware assists. We discuss techniques for improving the efficiency of garbage collection by recognizing that objects in the Lisp world have a variety of lifetimes. The importance of designing the architecture and the hardware to facilitate garbage collection is stressed.
Conference Paper
Functional languages have failed to become preeminent in the industrial world because they are perceived as slow. This reputation is mostly undeserved, but some lentitude is associated with functional languages due to their interaction with memory systems. In fact, this interaction is poorly understood and not enough research is conducted on the design of memory systems for functional languages. The MT system is being developed to understand and improve the interaction between a pure list-based functional language and memory. MT provides its evaluator with an intelligent backing store implemented as an all-software paged distributed virtual memory system. Memory is divided into different spaces that are each managed independently and in parallel with program execution. In this paper, we present results obtained from observing the MT paging traffic between the processor running the evaluator and the MT heap's backing store in a system where the heap and stack are distributed, managed as separate address spaces, and each is allocated exclusive use of a set of frames. The results suggest that FIFO is as competitive as LRU, and it is argued that FIFO should be used as the MT heap page replacement policy. Furthermore, we present a brief argument that justifies using multiple address spaces and exclusive frame pools for the MT heap and MT stack.
Article
Modern Lisp systems make heavy use of a garbage-collecting style of memory management. Generally, the locality of reference in garbage-collected systems has been very poor. In virtual memory systems, this poor locality of reference generally causes a large amount of wasted time waiting on page faults or uses excessively large amounts of main memory. An adaptive memory management algorithm, described in this article, allows substantial improvement in locality of reference. Performance measurements indicate that page-wait time typically is reduced by a factor of four with constant memory size and disk technology. Alternately, the size of memory typically can be reduced by a factor of two with constant performance.
Article
This paper is a contribution to the “theory” of the activity of using computers. It shows how some forms of expression used in current programming languages can be modelled in Church's λ-notation, and then describes a way of “interpreting” such expressions. This suggests a method of analyzing the things computer users write that applies to many different problem orientations and to different phases of the activity of using a computer. Also a technique is introduced by which the various composite information structures involved can be formally characterized in their essentials, without commitment to specific written or other representations.
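A deliberately tiny sketch of the idea, with names of our own choosing: a programming-language form such as "let x = 42 in x" is modelled as a λ-application, and "interpreting" it means evaluating against an environment, pairing each λ-abstraction with its defining environment (a closure).

```cpp
// Minimal environment-based evaluator for λ-expressions (illustrative only).
#include <iostream>
#include <map>
#include <memory>
#include <string>

struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

struct Expr {
    enum Kind { Num, Var, Lam, App } kind;
    int num = 0;        // Num
    std::string name;   // Var name or Lam parameter
    ExprPtr a, b;       // Lam body (a); App function (a) and argument (b)
};

struct Value;
using Env = std::map<std::string, std::shared_ptr<Value>>;

struct Value {
    bool is_num = true;
    int num = 0;
    ExprPtr body;       // closure body
    std::string param;  // closure parameter
    Env env;            // closure's defining environment
};

std::shared_ptr<Value> eval(const ExprPtr& e, const Env& env) {
    switch (e->kind) {
        case Expr::Num: return std::make_shared<Value>(Value{true, e->num, nullptr, "", {}});
        case Expr::Var: return env.at(e->name);
        case Expr::Lam: return std::make_shared<Value>(Value{false, 0, e->a, e->name, env});
        case Expr::App: {
            auto f = eval(e->a, env);
            auto arg = eval(e->b, env);
            Env inner = f->env;
            inner[f->param] = arg;          // bind the parameter in the closure's environment
            return eval(f->body, inner);
        }
    }
    return nullptr;
}

// Helpers to build expression trees.
ExprPtr num(int n) { return std::make_shared<Expr>(Expr{Expr::Num, n, "", nullptr, nullptr}); }
ExprPtr var(std::string s) { return std::make_shared<Expr>(Expr{Expr::Var, 0, std::move(s), nullptr, nullptr}); }
ExprPtr lam(std::string p, ExprPtr body) { return std::make_shared<Expr>(Expr{Expr::Lam, 0, std::move(p), std::move(body), nullptr}); }
ExprPtr app(ExprPtr f, ExprPtr x) { return std::make_shared<Expr>(Expr{Expr::App, 0, "", std::move(f), std::move(x)}); }

int main() {
    // "let x = 42 in x" modelled as ((λx. x) 42).
    auto program = app(lam("x", var("x")), num(42));
    std::cout << eval(program, {})->num << "\n";   // prints 42
}
```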
Article
Cover title. "February 1988." Thesis (Ph. D.)--Stanford University, 1988. Includes bibliographical references. Supported in part by Hewlett-Packard, using facilities provided by NASA.
Article
A program's working set is the collection of segments (or pages) recently referenced. This concept has led to efficient methods for measuring a program's intrinsic memory demand; it has assisted in understanding and in modeling program behavior; and it has been used as the basis of optimal multiprogrammed memory management. The total cost of a working set dispatcher is no larger than the total cost of other common dispatchers. This paper outlines the argument why it is unlikely that anyone will find a cheaper nonlookahead memory policy that delivers significantly better performance.
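The definition is easy to state operationally; the sketch below (ours, with an arbitrary window size) computes the working-set size over a toy reference string: the number of distinct pages referenced in the last τ references.

```cpp
// Working-set size |W(t, tau)|: distinct pages referenced in the window (t - tau, t].
#include <cstddef>
#include <iostream>
#include <set>
#include <vector>

std::size_t working_set_size(const std::vector<int>& trace, std::size_t t, std::size_t tau) {
    std::size_t start = (t + 1 >= tau) ? t + 1 - tau : 0;   // clamp window at the trace start
    std::set<int> pages(trace.begin() + start, trace.begin() + t + 1);
    return pages.size();
}

int main() {
    std::vector<int> trace = {1, 2, 1, 3, 3, 3, 4, 2, 2, 1};
    for (std::size_t t = 0; t < trace.size(); ++t)
        std::cout << "t=" << t << "  |W(t,4)| = " << working_set_size(trace, t, 4) << "\n";
}
```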
Article
Rationale. Throughout this document, the rationale for design choices made in the interface specification is set off in this format. Some readers may wish to skip these sections, while readers interested in interface design may want to read them carefully. (End of rationale.) Advice to users. Throughout this document, material that speaks to users and illustrates usage is set off in this format. Some readers may wish to skip these sections, while readers interested in programming in MPI may want to read them carefully. (End of advice to users.) Advice to implementors. Throughout this document, material that is primarily commentary to implementors is set off in this format. Some readers may wish to skip these sections, while readers interested in MPI implementations may want to read them carefully. (End of advice to implementors.) 1.7.2 Procedure Specification. MPI procedures are specified using a language-independent notation. The arguments of procedure calls are marked as IN, OUT or INOUT. The meanings of these are: the call uses but does not update an argument marked IN; the call may update an argument marked OUT; the call both uses and updates an argument marked INOUT.
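For concreteness, here is a small C++/MPI fragment (ours, not from the document) with the IN/OUT/INOUT role of each argument noted in comments; only standard MPI calls are used.

```cpp
// IN/OUT/INOUT argument roles annotated on two standard MPI calls.
#include <mpi.h>
#include <iostream>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);                    // argc, argv: INOUT (may be updated)

    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);      // comm: IN, rank: OUT

    int local = rank, sum = 0;
    // sendbuf: IN (used, never written); recvbuf: OUT; count/datatype/op/comm: IN
    MPI_Allreduce(&local, &sum, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0) std::cout << "sum of ranks = " << sum << "\n";
    MPI_Finalize();
}
```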
Article
The Functional Abstract Machine (Fam) is a stack machine designed to support functional languages on large address space computers. It can be considered an SECD machine [Landin 64] which has been optimized to allow very fast function application and the use of true stacks (as opposed to linked lists). This section contains a brief overview of the Fam, which is fully described in [Cardelli 83]. The machine supports functional objects (closures, which are dynamically allocated and garbage collected). All the optimization and support techniques which make application slower are strictly avoided, while tail recursion and pattern-matching calls are supported. Restricted side effects and arrays are provided. The machine is intended to make compilation from ML and other functional languages easy and regular, by providing a rich set of operations and an open-ended collection of data types. The instructions of the machine are not supposed to be interpret...
Article
In a recent study [1], Prechelt compared the relative performance of Java and C++ in terms of execution time and memory utilization. Unlike many benchmark studies, Prechelt compared multiple implementations of the same task by multiple programmers in order to control for the effects of differences in programmer skill. Prechelt concluded that, "as of JDK 1.2, Java programs are typically much slower than programs written in C or C++. They also consume much more memory." We have repeated Prechelt's study using Lisp as the implementation language. Our results show that Lisp's performance is comparable to or better than C++ in terms of execution speed, with significantly lower variability, which translates into reduced project risk. Furthermore, development time is significantly lower and less variable than either C++ or Java. Memory consumption is comparable to Java. Lisp thus presents a viable alternative to Java for dynamic applications where performance is important. The ex...
Article
This bibliography cites and comments on almost 400 publications on the parallel functional programming research of the last 15 years. It focuses on the software aspect of this area, i.e., on languages, compile-time analysis techniques (in particular for strictness and weight analysis), code generation, and runtime systems. Excluded from this bibliography are publications on special architectures and on garbage collection, unless they contain aspects interesting for the above areas. Most bibliographic items are listed including their full abstracts. Supported by the Austrian Science Foundation (FWF) contract S5302-PHY "Parallel Symbolic Computation". Contents: 1. Introduction to the 2nd Edition; 2. Introduction; 3. General Work; 4. Streams, Process Networks, and Operating Systems; 5. Distributed Data Structures; 6. Para-Functional Programming; 7. Non-Determinism; 8. Program Analysis; 9. Parallelization and Compilation; 10. Abstract Machines and Runtime Systems; 11. Dynamic Pr...
P. Kelly. Functional Programming for Loosely-Coupled Multiprocessors. Research Monographs in Parallel and Distributed Computing. MIT Press, Cambridge, MA, USA, 1989.
William Gropp, Ewing Lusk, and Anthony Skjellum. Using MPI. Scientific and Engineering Computation. The MIT Press, 1999.