Wilson Hsieh

102
Publications
23,238
Reads
8,497
Citations

Publications (102)
Article
Spanner is Google’s scalable, multiversion, globally distributed, and synchronously replicated database. It is the first system to distribute data at global scale and support externally-consistent distributed transactions. This article describes how Spanner is structured, its feature set, the rationale underlying various design decisions, and a nov...
Article
Full-text available
Spanner is Google’s scalable, multiversion, globally distributed, and synchronously replicated database. It is the first system to distribute data at global scale and support externally-consistent distributed transactions. This article describes how Spanner is structured, its feature set, the rationale underlying various design decisions, and a nov...
Conference Paper
Full-text available
Spanner is Google's scalable, multi-version, globally-distributed, and synchronously-replicated database. It is the first system to distribute data at global scale and support externally-consistent distributed transactions. This paper describes how Spanner is structured, its feature set, the rationale underlying various design decisions, and a nove...
Article
Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in...
Article
Full-text available
This article describes some of the ongoing research projects related to structured data management at Google today. The organization of Google encourages research scientists to work closely with engineering teams. As a result, the research projects tend to be motivated by real needs faced by Google's products and services, and solutions are put int...
Article
Full-text available
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1995. Includes bibliographical references (p. 123-131). Vita.
Conference Paper
Full-text available
The successful assembly of large programs out of software components depends on modular reasoning. When the linking of component code is modular, components can be compiled and type checked separately, deployed in binary form, and are easier to reuse. Unfortunately, linking is not modular in many mainstream OO languages like Java. In this paper we propose an...
Conference Paper
Full-text available
Components that process state make programs more interactive with their ability to handle continuously changing data. Unfortunately, the code that glues together such state-processing components is often difficult to write because it is exposed to many complicated event-handling details. This paper introduces SuperGlue as a language for assembl...
Conference Paper
Full-text available
In this paper we describe PRELUDE, a programming language and accompanying system support for writing portable MIMD parallel programs. PRELUDE supports a methodology for designing and organizing parallel programs that makes them easier to tune for particular architectures and to port to new architectures. It builds on earlier work on Emerald, Amb...
Conference Paper
This session describes three data management projects at Google. BigTable is a highly scalable system for distributed storage and querying of structured data. Sawzall is a system for large-scale analysis of data sets that have a flat but regular structure. Finally, GoogleBase is a system for storing and searching structured data contributed by exte...
Article
Full-text available
Vita. Supervised by William E. Weihl. Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1988. Includes bibliographical references (leaves 66-69).
Article
Single-language runtime systems, in the form of Java virtual machines, are widely deployed platforms for executing untrusted mobile code. These runtimes provide some of the features that operating systems provide: interapplication memory protection and basic system services. They do not, however, provide the ability to isolate applications from eac...
Conference Paper
Full-text available
This paper describes Splice, a system for writing aspects that perform static program analyses to direct program modifications. The power of an inter-procedural data-flow analysis enables an aspect to examine the flow of data around a program execution point when it determines what code to add or change at that point. For example, an aspect can cha...
Article
ue on Advances in High Performance Memory Systems, 2001. Design of a Parallel Vector Access Unit for SDRAM Memory Systems, BINU K. MATHEW, SALLY A. MCKEE, JOHN B. CARTER, AND AL DAVIS, Proceedings of the Sixth Annual Symposium on High Performance Computer Architecture (HPCA), 2000. Algorithmic Foundations for a Parallel Vector Access Memory Sys...
Article
Full-text available
Feature-wise decomposition is an important approach to building configurable software systems. Although there has been research on the usefulness of particular tools for feature-wise decomposition, there are not many informative comparisons on the relative effectiveness of different tools. In this paper, we compare AspectJ and Jiazzi, which are tw...
Article
Full-text available
Many language implementations provide a mechanism to express concurrent processes, but few provide support for terminating a process based on its resource consumption. Those implementations that do support termination generally charge the cost of a resource to the principal that allocates the resource, rather than the principal that retains the res...
Article
Full-text available
Recent systems such as SLAM, Metal, and ESP help programmers by automating reasoning about the correctness of temporal program properties. This paper presents a technique called property synthesis, which can be viewed as the inverse of property checking. We show that the code for some program properties, such as proper lock acquisition, can be aut...
Article
Full-text available
The ability of active networks technology to allow customized router computation critically depends on having resource control techniques that prevent buggy, malicious, or greedy code from affecting the integrity or availability of the router's resources. It is hard to choose between static and dynamic checking for resource control. Dynamic checking...
Article
Full-text available
Compilers must make choices between different optimizations; in this paper we present an analytic cost model that can be used to compare several compile-time optimizations for memory-intensive, matrix-based codes. These optimizations increase the spatial locality of references to improve cache hierarchy performance. Specifically, we consider loop t...
Article
Full-text available
Language-based extensible systems such as Java use type safety to provide memory safety in a single address space. Memory safety alone, however, is not sufficient to protect different applications from each other. Such systems must support a process model that enables the control and management of computational resources. In particular, language-ba...
Article
Full-text available
Data access costs contribute significantly to the execution time of applications with complex data structures. As the latency of memory accesses becomes high relative to processor cycle times, application performance is increasingly limited by memory performance. In some situations it is useful to trade increased computation costs for reduced memo...
Article
Full-text available
Dynamic code generation allows programmers to use run-time information in order to achieve performance and expressiveness superior to those of static code. The 'C (Tick C) language is a superset of ANSI C that supports efficient and high-level use of dynamic code generation. 'C provides dynamic code generation at the level of C expressions and stat...
Article
Full-text available
We present Jiazzi, a system that enables the construction of large-scale binary components in Java. Jiazzi components can be thought of as generalizations of Java packages with added support for external linking and separate compilation. Jiazzi components are practical because they are constructed out of standard Java source code. Jiazzi requires n...
Article
Full-text available
We present aspect-oriented programming in Jiazzi. Jiazzi enhances Java with separately compiled, externally-linked code modules called units. Besides making programming in Java generally more modular, units are also effective "aspect" constructs that can separate concerns. The unit-linking metaphor provides a convenient and explicit way for program...
Article
Full-text available
In this paper we show how modular linking of program fragments can be added to statically typed, object-oriented (OO) languages. Programs are being assembled out of separately developed software components deployed in binary form. Unfortunately, mainstream OO languages (such as Java) still do not provide support for true modular linking. Modular li...
Article
Full-text available
readers to acquire locks independently. We describe two new algorithms for reader-writer synchronization that allow parallelism among readers during lock acquisition. We achieve this parallelism by distributing the lock state among different processors, and by trading reader throughput for writer throughput; we expect that in highly concurrent pro...
Article
We have designed and implemented Maya, a version of Java that allows programmers to extend and reinterpret its syntax. Maya generalizes macro systems by treating grammar productions as generic functions, and semantic actions on productions as multimethods on the corresponding generic functions. Programmers can write new generic functions (i.e., gra...
Article
Full-text available
We have designed and implemented Maya, a version of Java that allows programmers to extend and reinterpret its syntax. Maya generalizes macro systems by treating grammar productions as generic functions, and semantic actions on productions as multimethods on the corresponding generic functions. Programmers can write new generic functions (i.e., gra...
Article
Full-text available
this paper was published in Proceedings of the 1st International Conference on Aspect-Oriented Software Development (AOSD 2002), Enschede, NL, April 2002. Please read and cite the published AOSD 2002 paper in preference to this report.
Article
Full-text available
We describe an extension to the Java language, Handi-Wrap, that supports weaving aspects into code at runtime. Aspects in Handi-Wrap take the form of method wrappers, which allow aspect code to be inserted around method bodies like advice in AspectJ. Handi-Wrap offers several advantages over static aspect languages such as AspectJ. First, aspects can...
Article
Full-text available
Language-based extensible systems, such as Java Virtual Machines and SPIN, use type safety to provide memory safety in a single address space. By using software to provide safety, they can support more efficient IPC. Memory safety alone, however, is not sufficient to protect different applications from each other. Such systems need to support a proc...
Article
Full-text available
h are built from other units. Both atoms and compounds import and export Java classes. P3. Coarse-grained connections: Connections between imports and exports should be able to connect many classes at once. Connections between components should be coarse-grained, because components represent large-scale software entities. Components should import,...
Article
Full-text available
Current Java constructs for code reuse, including classes, are insufficient for organizing programs in terms of reusable software components. Although packages, class loaders, and various design patterns can implement forms of components in ad hoc manners, the lack of an explicit language construct for components places a substantial burden on prog...
Conference Paper
Full-text available
Data access costs contribute significantly to the execution time of applications with complex data structures. As the latency of memory accesses becomes high relative to processor cycle times, application performance is increasingly limited by memory performance. In some situations it may be reasonable to trade increased computation costs for reduc...
Article
Full-text available
We describe a new architecture that improves message-passing performance, both for device I/O and for interprocessor communication. Our architecture integrates an SMT processor with a user-level network interface that can directly schedule threads on the processor. By allowing the network interface to directly initiate message handling code at user...
Article
Full-text available
Impulse is a memory system architecture that adds an optional level of address indirection at the memory controller. Applications can use this level of indirection to remap their data structures in memory. As a result, they can control how their data is accessed and cached, which can improve cache and bus utilization. The Impulse design does not re...
Article
Full-text available
Prefetching has long been used to mask the latency of memory loads. This paper presents results for an initial implementation of pointer-based prefetching within the Impulse adaptable memory controller. We conduct our experiments on a four-way issue superscalar machine. For the microbenchmarks we examine, we consistently realize about a 20% improve...
Article
Full-text available
Because irregular applications have unpredictable memory access patterns, their performance is dominated by memory behavior.
Thesis
Scalable cache-coherent nonuniform memory access (ccNUMA) architectures are an important design segment for high-performance scalable multiprocessor systems. In order to write application programs that take advantage of such systems, or port application programs written for symmetric multiprocessor systems with uniform memory access times, it is im...
Article
Full-text available
Loop transformation and array restructuring are important compiler optimizations that improve memory locality in complementary ways. Although previous researchers have proposed integrating the two techniques, there exists no analytical framework for determining how best to combine them for a given program. In this paper, we propose a cost model for...
Article
Binary tools such as disassemblers, just-in-time compilers, and executable code rewriters need to have an explicit representation of how machine instructions are encoded. Unfortunately, writing encodings for an entire instruction set by hand is both tedious and error-prone. We describe DERIVE, a tool that extracts bit-level instruction encoding inf...
Article
Language-based extensible systems, such as Java Virtual Machines and SPIN, use type safety to provide memory safety in a single address space. By using software to provide safety, they can support more efficient IPC. Memory safety alone, however, is not sufficient to protect different applications from each other. Such systems need to support a pro...
Article
Making a file system efficient usually requires extensive modifications. For example, making a file system log-structured requires the introduction of new data structures that are tightly coupled with the general file system code. This paper describes a new organization for file systems, using a Logical Disk (LD); LD defines a simple new interface...
Article
Full-text available
In this paper we describe PRELUDE, a programming language and accompanying system support for writing portable MIMD parallel programs. PRELUDE supports a methodology for designing and organizing parallel programs that makes them easier to tune for particular architectures and to port to new architectures. It builds on earlier work on Emerald, Amb...
Article
Full-text available
Dynamic code generation allows specialized code sequences to be crafted using runtime information. Since this information is by definition not available statically, the use of dynamic code generation can achieve performance inherently beyond that of static code generation. Previous attempts to support dynamic code generation have been low-level, ex...
Article
Full-text available
Single-language runtime systems, in the form of Java virtual machines, are widely deployed platforms for executing untrusted mobile code. These runtimes provide some of the features that operating systems provide: inter-application memory protection and basic system services. They do not, however, provide the ability to isolate applications from ea...
Article
in multiprocessor systems. Pipes allow a sequence of remote invocations to be performed in order, but asynchronously with respect to the calling thread. Using pipes results in programs that are easier to understand and debug than those with explicit synchronization between asynchronous invocations.
Conference Paper
Full-text available
Loop transformations and array restructuring optimizations usually improve performance by increasing the memory locality of applications, but not always. For instance, loop and array restructuring can either complement or compete with one another. Previous research has proposed integrating loop and array restructuring, but there existed no analytic...
Article
Full-text available
Typical translation lookaside buffers (TLBs) can map a far smaller region of memory than application footprints demand, and the cost of handling TLB misses therefore limits the performance of an increasing number of applications. This bottleneck can be mitigated by the use of superpages, multiple adjacent virtual memory pages that can be mapped wit...
Article
Full-text available
Modern programming languages such as Java are increasingly being used to write systems programs. By "systems programs," we mean programs that provide critical services (compilers), are long-running (Web servers), or have time-critical aspects (databases or query engines). One of the requirements of such programs is predictable behavior. Unfortunate...
Article
The amount of data that a typical translation lookaside buffer (TLB) can map has not kept pace with the growth in cache sizes and application footprints. As a result, the cost of handling TLB misses limits the performance of an increasing number of applications. The use of superpages, multiple adjacent virtual memory pages that can be mapped with a s...
Article
Making a file system efficient usually requires extensive modifications. For example, making a file system log-structured requires the introduction of new data structures that are tightly coupled with the general file system code. This paper describes a new organization for file systems, using a Logical Disk (LD); LD defines a simple new interface...
Conference Paper
The conventional wisdom has been that IP is the natural protocol layer for implementing multicast related functionality. However, ten years after its initial proposal, IP Multicast is still plagued with concerns pertaining to scalability, network management, ...
Article
Full-text available
Many binary tools, such as disassemblers, dynamic code generation systems, and executable code rewriters, need to understand how machine instructions are encoded. Unfortunately, specifying such encodings is tedious and error-prone. Users must typically specify thousands of details of instruction layout, such as opcode and field locations values, le...
Article
Full-text available
Providing efficient device driver support in the Fluke operating system presents novel challenges, which stem from two conflicting factors: (i) a design and maintenance requirement to reuse unmodified legacy device drivers, and (ii) the mismatch between the Fluke kernel's internal execution environment and the execution environment expected by these...
Article
Full-text available
Impulse is a new memory system architecture that adds two important features to a traditional memory controller. First, Impulse supports application-specific optimizations through configurable physical address remapping. By remapping physical addresses, applications control how their data is accessed and cached, improving their cache and bus utiliz...
Article
Full-text available
Many modern systems, including web servers, database engines, and operating system kernels, are using language-based protection mechanisms to provide the safety and integrity traditionally supplied by hardware. As these language-based systems become used in more demanding situations, they are faced with the same problems that traditional operating...
Article
Full-text available
Processor speeds are increasing rapidly, but memory speeds are not keeping pace. Image processing is an important application domain that is particularly impacted by this growing performance gap. Image processing algorithms tend to have poor memory locality because they access their data in a non-sequential fashion and reuse that data infrequently....
Article
Full-text available
This paper extends the notion of shared and private
Conference Paper
Full-text available
Software-based protection has become a viable alternative to hardware-based protection in systems based on languages such as Java, but the absence of hardware mechanisms for protection has been coupled with an absence of a user/kernel boundary. We show why such a “red line” must be present in order for a Java virtual machine to be as effective and...
Conference Paper
Full-text available
Image processing applications tend to access their data non-sequentially and reuse that data infrequently. As a result, they tend to perform poorly on conventional memory systems due to high cache and TLB miss rates and are particularly sensitive to the growing latency of main memory. We analyze the memory performance of three image processing algo...
Conference Paper
Full-text available
Impulse is a new memory system architecture that adds two important features to a traditional memory controller. First, Impulse supports application-specific optimizations through configurable physical address remapping. By remapping physical addresses, applications control how their data is accessed and cached, improving their cache and bus utiliz...
Article
Full-text available
Distributed systems such as client-server applications and cluster-based parallel computation are an important part of modern computing. Distributed computing allows the balancing of processing load, increases program modularity, isolates functionality, and can provide an element of fault tolerance. In these environments, systems must be able to sy...
Article
Many modern extensible systems, such as Java and the SPIN operating system, depend on type safety for memory protection. Unfortunately, current type-safe languages do not support systems programming well, because they do not give programmers the ability to deal with untyped data easily. In particular, they do not support the ability to cast between...
Article
Full-text available
Atomic recovery units (ARUs) are a mechanism that allows several logical disk operations to be executed as a single atomic unit with respect to failures. For example, ARUs can be used during file creation to update several pieces of file meta-data atomically. ARUs simplify file systems, as they isolate issues of atomicity within the logical disk sy...
Article
Full-text available
This paper presents the Impulse adaptable memory system, which allows applications to make efficient use of cache space and bus bandwidth. Impulse has a configurable memory controller that allows applications to remap data in the memory system. As a result, applications can control how their data is accessed, organized, and cached. We describe the...
Conference Paper
Full-text available
Because irregular applications have unpredictable memory access patterns, their performance is dominated by memory behavior. The Impulse configurable memory controller will enable significant performance improvements for irregular applications, because it can be configured to optimize memory accesses on an application-by-application basis. In this...