Article

Data Movement in Kernelized Systems


Abstract

... The Mach 3.0 Microkernel and the CHORUS Nucleus supply a similar set of abstractions for building systems servers [10, 17]. Unfortunately, for historical reasons, the two systems often use different names to describe the same thing. The remainder of this section describes the abstractions of Mach 3.0 and CHORUS relevant for understanding the rest of the paper, using either the common name or both names when necessary.
  • A Task [4] or Actor [8] is an execution environment and the basic unit of resource allocation. Both include virtual memory and threads. The Mach task also includes port rights. An actor includes ports as communication resources. A task or actor can be in either kernel or user space.
  • Threads are the basic unit of execution. A task or actor can have multiple simultaneous threads of execution. Threads may communicate via Ports.
  • Both systems are built around Interprocess Communication or IPC.
  • Mach ports are protected communication channels [12] managed by the kerne...


... The effect of sharing code space in such cases significantly reduces application startup latencies. Experience with Mach and Chorus suggested that heap growth may inhibit system performance if it is slow [8, 21]. The POSIX design allows the operating system to grab any available swap page and zero it. ...
... Both use external memory managers. Neither externalizes storage allocation or provides persistence, and in both cases the external memory manager has proven to be a source of performance difficulties [21, 8]. Both systems are hybrid designs, in that other system calls are present in addition to capability invocation. ...
Conference Paper
Full-text available
EROS is a capability-based operating system for commodity processors which uses a single level storage model. The single level store's persistence is transparent to applications. The performance consequences of support for transparent persistence and capability-based architectures are generally believed to be negative. Surprisingly, the basic operations of EROS (such as IPC) are generally comparable in cost to similar operations in conventional systems. This is demonstrated with a set of microbenchmark measurements of semantically similar operations in Linux. The EROS system achieves its performance by coupling well-chosen abstract objects with caching techniques for those objects. The objects (processes, nodes, and pages) are well-supported by conventional hardware, reducing the overhead of capabilities. Software-managed caching techniques for these objects reduce the cost of persistence. The resulting performance suggests that composing protected subsystems may be less costly than commonly believed.
... It is possible to implement most of Unix I/O in the application address space to reduce the number of system calls. For example, the OSF/1 implementation of Unix I/O uses an emulation library in the application address space that maps (Section 2.2.2) large file regions into the application address space and copies data to or from the mapped regions, hence reducing the number of system calls [30]. ...
... One open question in porting HFS to a multicomputer would be whether mapped-file I/O should be used. The fact that the OSF/1 file system for the Paragon multicomputer uses mapped-file I/O [30] suggests that it would not be a problem. However, it is not clear how the mapping facility provided by the Vesta file system [25] (where an application can impose a mapping between the bytes in the file and the order that the file system will provide these bytes to the application) can be efficiently supported for the case where the mapping requires a small amount of data to be transferred from each of a large number of pages. ...
Article
Full-text available
The HURRICANE File System (HFS) is designed for large-scale, shared-memory multiprocessors. Its architecture is based on the principle that a file system must support a wide variety of file structures, file system policies and I/O interfaces to maximize performance for a wide variety of applications. HFS uses a novel, object-oriented building-block approach to provide the flexibility needed to support this variety of file structures, policies, and I/O interfaces. File structures can be defined in HFS that optimize for sequential or random access, read-only, write-only or read/write access, sparse or dense data, large or small file sizes, and different degrees of application concurrency. Policies that can be defined on a per-file or per-open instance basis include locking policies, prefetching policies, compression/decompression policies and file cache management policies. In contrast, most existing file systems have been designed to support a single file structure and a small set of po...
... In fact, the Mach implementation of RPC has been highly optimized through the use of techniques such as stack-handoff scheduling and continuations [Draves91] for the common case of small messages and out-of-line (virtual memory) transfers for the expensive case of large messages [Dean91]. ...
Article
Full-text available
The allocation of die area to different processor components is a central issue in the design of single-chip microprocessors. Chip area is occupied by both core execution logic, such as ALU and FPU datapaths, and memory structures, such as caches, TLBs, and write buffers. This work focuses on the allocation of die area to memory structures through a cost/benefit analysis. The cost of memory structures with different sizes and associativities is estimated by using an established area model for on-chip memory. The performance benefits of selecting a given structure are measured through a collection of methods including on-the-fly hardware monitoring, trace-driven simulation and kernel-based analysis. Special consideration is given to operating systems that support multiple application programming interfaces (APIs), a software trend that substantially affects on-chip memory allocation decisions. Results: Small adjustments in cache and TLB design parameters can significantly impact overall performance. Operating systems that support multiple APIs, such as Mach 3.0, increase the relative importance of on-chip instruction caches and TLBs when compared against single-API systems such as Ultrix.
... To do so, however, requires the client to act effectively as a VM system, engaging the pager in the memory object's paging and initialization/termination protocols. In practice, UNIX applications on MACH access files through an emulation library that maps the file in the process' address space [12]. Although one may argue that it is cheaper to access the file by mapping it rather than by reading/writing to it (which probably requires someone to map it somewhere anyway), in the Spring operating system, we wanted to retain the ability to issue read/write requests on the file object directly. ...
Article
In this paper we describe an aspect of the Spring virtual memory system that was influenced by the distributed object-oriented architecture of Spring. The virtual memory system supports external pagers like those provided in the MACH® operating system, yet the architecture is more flexible and provides better caching opportunities than is possible in other systems. A novel aspect of the architecture is the separation of the memory abstraction from the interface that provides the paging operations. This separation provides considerable caching opportunities in our file system, and it facilitates our extensible stackable file system architecture. The virtual memory architecture described in this paper is implemented and has been in use for over three years as part of the experimental Spring operating system.
Article
The number of applications for small Linux-based embedded systems, such as PDAs and electronic notebooks, has increased. Because the Linux kernel is monolithic, it is not well suited to the varied requirements of embedded applications. To address this shortcoming of the monolithic kernel, we implement the uJFFS file system as an application process that runs in user space. This solution consists of a file system and a flash device driver, and makes the Linux kernel smaller by separating the file system from the kernel. uJFFS consists of ujffs_fs, which acts as the file system, and ujffs_drv, which controls a flash device. It provides the same user interface as Linux does. The device driver for the physical device is implemented in user space, which prevents file system errors from causing kernel failures. uJFFS can therefore increase the stability of the system.
Conference Paper
This paper describes the Unix file access and caching mechanisms in a version of the OSF/1 Unix operating system designed to run in a multicomputer environment. The multicomputer hardware platforms targeted can consist of hundreds or even thousands of individual nodes, where each node consists of one or more processors. The multicomputer version of OSF/1 (called OSF/1 AD) uses Mach memory objects to cache data from Unix files, and relies on an in-kernel distributed shared memory implementation to maintain coherency for data cached across multiple nodes. The focus of this paper is on the modifications made to standard OSF/1 functionality to support distributed, efficient access to memory objects. Of particular interest are the introduction of a mapped files module for synchronizing clients and maintaining file meta data, the elimination of the traditional Unix buffer cache from the file data access path, and the implementation of a disk block reservation scheme to correctly support Unix write() semantics. An evaluation of the technology is presented, providing insight into how it can be improved in the future, including several possible enhancements to Mach. As will be seen, most of this insight would equally apply to a single-node operating system based on Mach.
Article
This paper presents the design of an object-oriented file system which was developed as a part of the "OBJIX Object-Oriented Operating System" project. The file system is a self-contained program system which is decomposed using a standard object-oriented framework concept. A novel approach to object-oriented frameworks, the Class Hierarchy Framework concept recapitulated in this paper, is employed in structuring components of the file system. Further, this paper illustrates with an example how the file system processes a typical system call.
Article
Full-text available
The authors introduce an application-level I/O facility, the Alloc Stream Facility, that addresses three primary goals. First, ASF addresses recent computing substrate changes to improve performance, allowing applications to benefit from specific features such as mapped files. Second, it is designed for parallel systems, maximizing concurrency and reporting errors properly. Finally, its modular and object-oriented structure allows it to support a variety of popular I/O interfaces (including stdio and C++ stream I/O) and to be tuned to system behavior, exploiting a system's strengths while avoiding its weaknesses. On a number of standard Unix systems, I/O-intensive applications perform substantially better when linked to the Alloc facility. Also, modifying applications to use a new interface provided by the facility can improve performance by another factor of two. These performance improvements are achieved primarily by reducing data copying and the number of system calls. Not visible in these improvements is the extra degree of concurrency the facility brings to multithreaded and parallel applications.
Article
Full-text available
... this paper appeared in the Proceedings of the 20th Annual International Symposium on Computer Architecture, San Diego, May 1993. Authors' address: Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109-2122. This work was supported by Defense Advanced Research Projects Agency under DARPA/ARO Contract Number DAAL03-90-C-0028 and a National Science Foundation Graduate Fellowship. ... within the kernel. These and related operating system trends place greater stress upon the TLB by increasing miss rates and, hence, decreasing overall system performance. This paper explores these issues by examining design trade-offs for software-managed TLBs and their impact, in conjunction with various operating systems, on overall system performance. To examine issues which cannot be adequately modeled with simulation, we have developed a system analysis tool called Monster, which enables us to monitor actual systems. We have also developed a novel TLB simulator called Tapeworm, which is compiled directly into the operating system so that it can intercept all TLB misses caused by both user process and OS kernel memory references. The information that Tapeworm extracts from the running system is used to obtain TLB miss counts and to simulate different TLB configurations. The remainder of this paper is organized as follows: Section 2 examines previous TLB and OS research related to this work. Section 3 describes our analysis tools, Monster and Tapeworm. The MIPS R2000 TLB structure and its performance under Ultrix, OSF/1 and Mach 3.0 are explored in Section 4. Hardware- and software-based performance improvements are presented in Section 5. Section 6 summarizes our conclusions. 2 RELATED WORK By caching page table entries, TLBs g...
Conference Paper
A set of programs, the WPI Benchmark Suite (WBS), was developed with the specific objective of comparing the performances of Unix-like operating systems running on identical hardware platforms. Performance results from running parts of the WBS under Mach 2.5, Mach 3.0, and SunOS 4.1.1 on HP 386 PCs, HP 486 PCs, and Sun 3/60 workstations are presented. The focus is on the WPI benchmarks designed to evaluate the effectiveness of different operating system mechanisms for distributed applications. The results identify strengths and weaknesses of these operating systems. The results show that Mach 3.0 does not perform as well as Mach 2.5 on benchmarks involving network communications or local socket communication. Mach 3.0 does outperform Mach 2.5 on a benchmark which involves extensive disk input/output (I/O). The Jigsaw benchmark demonstrates that Mach 3.0 has difficulty with heavy paging activity. Additionally, SunOS 4.1.1 performs better than Mach 2.5 for certain interprocess communication methods.
Article
Full-text available
The Hurricane File System (HFS) is a new file system being developed for large-scale shared memory multiprocessors with distributed disks. The main goal of this file system is scalability; that is, the file system is designed to handle demands that are expected to grow linearly with the number of processors in the system. To achieve this goal, HFS is designed using a new structuring technique called Hierarchical Clustering. HFS is also designed to be flexible in supporting a variety of policies for managing file data and for managing file system state. This flexibility is necessary to support in a scalable fashion the diverse workloads we expect for a multiprocessor file system. 1 Introduction The Hurricane File System (HFS) is a new file system being developed for large-scale shared memory multiprocessors. In this paper the goals and basic architecture of this file system are introduced. The main goal of this file system is scalability; we expect the file system load to grow linearly...
Article
Execution | [Larus90] | 10 - 40 | 20 - 60 | 0 | Yes | N/A | None
 | [Eggers90] | --- | 1,000 + | 0 | Yes | N/A | None
Stack Deletion | [Smith77] | 5 - 100 | 0 | 4 - 50 | No | 4 - 5% | Fully-associative Memories
Snapshot Method | [Smith77] | 5 - 100 | 0 | 4 - 50 | No | 4 - 5% | Fully-associative Memories
Cache Filter | [Puzak85] | 10 - 20 | 0 | --- | Yes | N/A | Fixed-line-size Caches
 | [Wang90] | 10 - 20 | 0 | 7 - 15 | Yes | N/A | Fixed-line-size Caches
Block Filter | [Agarwal90] | 50 - 100 | 0 | --- | No | 12% | Fixed-line-size Caches
Time Sampling | [Laha88] | 5 - 20 | 0 | 5 - 20 | No | 5% | Small Caches (< 128 K-byte)
 | [Kessler91] | 10 | 0 | 10 | No | 10% | Small Caches (< 1 M-byte)
Set Sampling | [Puzak85] | 5 - 10 | 0 | 10 | No | 2% | Set Sample Not General
 | [Kessler91] | 10 | 0 | 10 | No | 10% | Constant-bits Set Sample
Table 2.5 Address Trace Reduction Methods. The trace reduction factor is the ratio in size between the reduced trace and the full trace.
Article
OF THE TECHNOLOGY MASTER'S THESIS Author: Johannes Helander Thesis Title: Unix under Mach -- The Lites Server Date: December 30, 1994 Pages: 7 + 64 Department: Faculty of Information Technology Chair: Tik-76 Supervisor: Professor Heikki Saikkonen Instructor: Professor Eljas Soisalon-Soininen Lites is a 4.4 BSD Lite based server and an emulation library that provide free unix functionality to a Mach based system. Lites provides binary compatibility with 4.4 BSD, NetBSD (0.8, 0.9, and 1.0), FreeBSD (1.1.5 and 2.0), 386BSD, UX (4.3BSD) and Linux on the i386 platform. It has also been ported to the pc532, PA-RISC, and R3000. It works with Mach 3.0, Mach 4, and RT-Mach. This thesis describes the Lites server and its design motivations. It shows that emulation libraries are a good way of structuring systems software. Problems identified with earlier emulators are not inherent to the idea but rather to the implementations. Lites provides a completely new emulation library. It provid...
Article
Full-text available
Distributed computer systems support several key attributes that are essential for the development and execution of command and control (C2) applications. Since C2 applications need to become more survivable, more dispersed, and better able to quickly adapt to new threats, we are seeking to provide an architecture for a survivable Distributed Computing Environment (SDCE). In essence, the SDCE will be a base upon which survivable distributed applications can be built. This base must be flexible enough to incorporate advances in technology. It must also be tailorable to the needs of specific C2 applications, and well structured for ease of maintenance. Hence, this base must be capable of evolving with the needs of C2 systems and their supporting technologies. The approach that was used in this effort was to utilize existing technologies such as the Mach micro kernel, along with the CRONUS and/or ISIS distributed Computing Environments to provide many of the SDCE requirements.
Article
Abstract One of the most important resources that an operating system manages is main memory. Systems often provide virtual memory, using this main memory to cache the contents of a larger virtual storage. Unfortunately, existing systems hide the algorithms for controlling the contents of the main memory cache and for managing the external storage used to represent that virtual memory within the operating system kernel, making them unavailable to important systems applications. My thesis is that it is desirable and practical to provide an external memory management
Conference Paper
We discuss the rationale and design of a Generic Memory management Interface, for a family of scalable operating systems. It consists of a general interface for managing virtual memory, independently of the underlying hardware architecture (e.g. paged versus segmented memory), and independently of the operating system kernel in which it is to be integrated. In particular, this interface provides abstractions for support of a single, consistent cache for both mapped objects and explicit I/O, and control of data caching in real memory. Data management policies are delegated to external managers. A portable implementation of the Generic Memory management Interface for paged architectures, the Paged Virtual Memory manager, is detailed. The PVM uses the novel history object technique for efficient deferred copying. The GMI is used by the Chorus Nucleus, in particular to support a distributed version of Unix. Performance measurements compare favorably with other systems.
Article
As hardware prices continue to drop rapidly, building large computer systems by interconnecting substantial numbers of microcomputers becomes increasingly attractive. Many techniques for interconnecting the hardware, such as Ethernet [Metcalfe and Boggs, 1976], ring nets [Farber and Larson, 1972], packet switching, and shared memory are well understood, but the corresponding software techniques are poorly understood. The design of general purpose distributed operating systems is one of the key research issues for the 1980s.
Article
The Chorus Object-Oriented Layer (COOL) is an extension of the facilities provided by the Chorus distributed operating system with additional functionality for the support of object-oriented environments. This functionality is realized by a layer built on top of the Chorus V3 Nucleus, which extends the Chorus interface with generic functions for object management: creation, deletion, storage, remote invocation and migration. One major goal of this approach was to explore the feasibility of general object management at the kernel level, with support of multiple object models at a higher level. We present the implementation of COOL and a first evaluation of this approach with a C++ environment using the COOL mechanisms. 1 Introduction COOL is a distributed object-oriented system, built on top of the Chorus 1 V3 minimal kernel, or Nucleus 3 Joint ECOOP/OOPSLA Conference, Ottawa (Canada), October 1990 y Author's current address: Department of Computer Science and Engineering, FR-35,...
Article
This paper argues that a shared, distributed name space and I/O interface should be implemented inside the operating system kernel. The grounding for the argument is a comparison between the Sprite network operating system and the Mach microkernel. Sprite optimizes the common case of file and device access, both local and remote, by providing a kernel-level implementation. Sprite also allows for user-level extensibility by letting a user-level process implement the naming and I/O interfaces of the file system. Mach, in contrast, provides general interprocess communication and does not define a file system protocol in the kernel. [Published in the Proceedings of the 2nd USENIX Mach Symposium, Nov 20-22, 1991, pages 233-250] 1 Introduction This paper argues that the file system is a mature enough abstraction that it should be implemented inside the operating system kernel for optimal performance. Data storage and high-level naming are fundamental features of today's computer syste...
Article
We have improved the performance of the Mach 3.0 operating system by redesigning its internal thread and interprocess communication facilities to use continuations as the basis for control transfer. Compared to previous versions of Mach 3.0, our new system consumes 85% less space per thread. Cross-address space remote procedure calls execute 14% faster. Exception handling runs over 60% faster. In addition to improving system performance, we have used continuations to generalize many control transfer optimizations that are common to operating systems, and have recast those optimizations in terms of a single implementation methodology. This paper describes our experiences with using continuations in the Mach operating system. This research was sponsored in part by The Defense Advanced Research Projects Agency, Information Science and Technology Office, under the title "Research on Parallel Computing", ARPA Order No. 7330, issued by DARPA/CMO under Contract MDA972-90-C-0035 and in part by...
Chorus Systemes. CHORUS Kernel v3 r4.0 Specification and Interface
Chorus Systemes. CHORUS Kernel v3 r4.0 Specification and Interface. Technical Report CS/TR-91-69, Chorus Systemes, September, 1991.
MACH 3 Kernel Principles. Open Software Foundation and Carnegie Mellon University
  • Keith Loepere
Keith Loepere. MACH 3 Kernel Principles. Open Software Foundation and Carnegie Mellon University, 1992.
MACH Kernel Interface Manual
  • R V Baron
Baron, R.V. et al. MACH Kernel Interface Manual. Technical Report, School of Computer Science, Carnegie Mellon University, September, 1988.
Revolution 89: or Distributing Unix brings it back to its Original Virtue
  • F Armand
  • M Gien
  • F Herrmann
  • M Rozier
Armand F., Gien M., Herrmann F., Rozier M. Revolution 89: or Distributing Unix brings it back to its Original Virtue. In Proceedings WEDMS I. October, 1989.
Give a Process Manager to your drivers!
  • Francois Armand
Francois Armand. Give a Process Manager to your drivers! In Proceedings of EurOpen Autumn 1991. September, 1991.
Kernel Support for Distributed Memory Multiprocessors
  • Joseph S Barrera
Joseph S. Barrera III. Kernel Support for Distributed Memory Multiprocessors. PhD thesis, School of Computer Science, Carnegie Mellon University, To be published, 1992.
A New Look at Microkernel-Based UNIX Operating Systems: Lessons in Performance and Compatibility
  • A Bricker
  • M Gien
  • M Guillemont
  • J Lipkis
  • D Orr
  • M Rozier
Bricker, A., Gien, M., Guillemont, M., Lipkis, J., Orr, D., Rozier, M. A New Look at Microkernel-Based UNIX Operating Systems: Lessons in Performance and Compatibility. In Proceedings of EurOpen Spring 1991 Conference. May, 1991.
Binary Emulation of UNIX using the V Kernel
  • D R Cheriton
  • G R Whitehead
  • E W Szynter
Cheriton, D.R., Whitehead, G.R., Szynter, E.W. Binary Emulation of UNIX using the V Kernel. In Proceedings of Summer 1990 USENIX Conference. June, 1990.
CHORUS Kernel v3 r4.0 Programmer's Reference Manual
Chorus Systemes. CHORUS Kernel v3 r4.0 Programmer's Reference Manual. Technical Report CS/TR-91-71, Chorus Systemes, September, 1991.
CHORUS Distributed Operating Systems
  • M Rozier
Rozier, M., et al. CHORUS Distributed Operating Systems. Computing Systems 1(4), December, 1988.
Using Continuations to Implement Thread Management and Communication in Operating Systems
  • R P Draves
  • B Bershad
  • R F Rashid
  • R W Dean
Draves, R.P., Bershad, B., Rashid, R.F., Dean, R.W. Using Continuations to Implement Thread Management and Communication in Operating Systems. In Proceedings of the Thirteenth ACM Symposium on Operating Systems Principles. April, 1991.
Real Time in a Distributed Computing Environment
  • M Guillemont
Guillemont M. Real Time in a Distributed Computing Environment. Computer Technology Review, October, 1990.
Real-Time Mach: Towards a Predictable Real-Time System
  • H Tokuda
  • T Nakajima
  • P Rao
Tokuda, H., Nakajima, T., Rao, P. Real-Time Mach: Towards a Predictable Real-Time System. In Proceedings of the First Mach USENIX Workshop, pages 73--82. October, 1990.