Alfred V. Aho

Columbia University | CU · Department of Computer Science

PhD, Princeton University

About

195
Publications
50,277
Reads
33,440
Citations
Additional affiliations
January 1995 - present
Columbia University
Position
  • Lawrence Gussman Professor

Publications (195)
Article
Jeffrey D. Ullman and Alfred V. Aho are recipients of the 2020 ACM A.M. Turing Award. They were recognized for creating fundamental algorithms and theory underlying programming language implementation and for synthesizing these results and those of others in their highly influential books, which educated generations of computer scientists.
Conference Paper
During 2014, Business Insider announced that there are over a billion users of Android worldwide. Government officials are also trending towards acquiring Android mobile devices. Google's application architecture is already ubiquitous and will keep expanding. The beauty of an application-based architecture is the flexibility, interoperability and c...
Article
The Communications Web site, http://cacm.acm.org, features more than a dozen bloggers in the BLOG@CACM community. In each issue of Communications, we'll publish selected posts or excerpts. Follow us on ...
Article
We recommend using the term Computation in conjunction with a well-defined model of computation whose semantics is clear and which matches the problem being investigated. Computer science already has a number of useful clearly defined models of computation whose behaviors and capabilities are well understood. We should use such models as part of an...
Article
Complexity theory is the area of the theory of computation that deals with the study and classification of the amount of computational resources required to solve problems. The subject is intellectually exciting and central to the field of computer science as well as to understanding how complex systems outside of computer science behave and comput...
Article
In this ninth piece to the Ubiquity symposium discussing What is computation? Alfred V. Aho shares his views about the importance of computational thinking in answering the question. --Editor
Article
Full-text available
Article
Full-text available
There is a growing consensus that crosscutting concerns harm code quality. An example of a crosscutting concern is a functional requirement whose implementation is distributed across multiple software modules. We asked the question, "How much does the amount that a concern is crosscutting affect the number of defects in a program?" We conducted thr...
Conference Paper
Full-text available
The concern location problem is to identify the source code within a program related to the features, requirements, or other concerns of the program. This problem is central to program development and maintenance. We present a new technique called prune dependency analysis that can be combined with existing techniques to dramatically improve the ac...
Conference Paper
Full-text available
The ability to debug programs composed using aspect-oriented programming (AOP) techniques is critical to the adoption of AOP. Nevertheless, many AOP systems lack adequate support for debugging, making it difficult to diagnose faults and understand the program’s composition and control flow. We present an AOP debug model that characterizes AOP-speci...
Article
This book provides the foundation for understanding the theory and practice of compilers.
Article
A minimal rectilinear Steiner tree for a set A of points in the plane is a tree which interconnects A using horizontal and vertical lines of shortest possible total length. Such trees have potential application to wire layout for printed circuits. Unfortunately, at present no practical algorithm is known for constructing these trees in general. We...
Article
Full-text available
An arbitrarily reliable quantum computer can be efficiently constructed from noisy components using a recursive simulation procedure, provided that those components fail with probability less than the fault-tolerance threshold. Recent estimates of the threshold are near some experimentally achieved gate fidelities. However, the landscape of thresho...
Article
Full-text available
Despite convincing laboratory demonstrations of quantum information processing, it remains difficult to scale because it relies on inherently noisy components. Adequate use of quantum error correction and fault tolerance theoretically should enable much better scaling, but the sheer complexity of the techniques involved limits what is achievable to...
Conference Paper
Full-text available
AspectJ-like languages are currently ineffective at modularizing heterogeneous concerns that are tightly coupled to the source code of the base program, such as logging, invariants, error handling, and optimization. This leads to complicated and fragile pointcuts and large numbers of highly-repetitive and incomprehensible aspects. We propose statem...
Article
Compilers and computer-aided design tools will be essential for quantum computing. We present a computer-aided design flow that transforms a high-level language program representing a quantum computing algorithm into a technology-specific implementation. We trace the significant steps in this flow and illustrate the transformations to the representa...
Article
Full-text available
We present a novel cluster architecture that unifies switch, server and storage processing to achieve a level of price-performance and simplicity of application development not achievable with current architectures. Our architecture takes advantage of the increasing disparity between storage capacity, network switching on the one hand, and processi...
Article
Although software is the key enabler of the global information infrastructure, the amount and extent of software in use in the world today are not widely understood, nor are the programming languages and paradigms that have been used to create the software. The vast size of the embedded base of existing software and the increasing costs of software...
Article
Full-text available
The design and optimization of quantum circuits is central to quantum computation. This paper presents new algorithms for compiling arbitrary 2^n x 2^n unitary matrices into efficient circuits of (n-1)-controlled single-qubit and (n-1)-controlled-NOT gates. We first present a general algebraic optimization technique, which we call the Palindrome Tr...
Article
In this paper we describe an ongoing research project called the Columbia Digital News Project. The goal of this project is to develop a suite of effective interoperable tools with which people can find relevant information (text, images, video, and structured documents) from distributed sources and track it over a period of time. Our initial focus i...
Presentation
Full-text available
Software Architecture Principles within Software Engineering are introduced. Software Architecture Styles and Patterns are given special attention. Software Architecture in Practice is demonstrated through examples. A variety of specific Architecture Implementation Considerations are reviewed.
Presentation
Full-text available
Software process and project planning approaches as well as software process models are introduced and reviewed. Various lifecycle models are discussed along with estimation and scheduling methods.
Presentation
Full-text available
Software engineering is introduced, and an engineer’s role is considered. First, software engineering is defined and put in the context of systems in a software-intensive world. The presentation also explores why architectures and patterns provide a critical foundation for modern software.
Article
Full-text available
In this paper we describe an ongoing research project called the Columbia Digital News Project. The goal of this project is to develop a suite of effective interoperable tools with which people can find relevant information (text, images, video, and structured documents) from distributed sources and track it over a period of time. Our initial focus...
Conference Paper
Full-text available
In this paper we describe an ongoing research project called the Columbia Digital News System. The goal of this project is to develop a suite of effective interoperable tools with which people can find relevant information (text, images, video, and structured documents) from distributed sources and track it over a period of time. Our initial focus...
Article
The principles underlying this report can be summarized as follows: 1. A strong theoretical foundation is vital to computer science. 2. Theory can be enriched by practice. 3. Practice can be enriched by theory. 4. If we consider (2) and (3), the value, impact, and funding of theory will be enhanced. In order to achieve a greater synergy between the...
Article
Full-text available
Awk is a programming language whose basic operation is to search a set of files for patterns, and to perform specified actions upon lines or fields of lines which contain instances of those patterns. Awk makes certain data selection and transformation operations easy to express; for example, the awk program length > 72 prints all input lines whose...
Conference Paper
Full-text available
Abstract This paper discusses some of the major technical obstacles standing in the way of achieving cost-effective universal access to multimedia information stored in globally distributed knowledge repositories. Opportunities for contributions from the database research community are highlighted. 1 Introduction The goal of developing an i...
Article
The international telecommunications system is the world’s largest distributed computing system, offering a wide range of services and service features. Telecommunications service providers are eager to offer new services on this infrastructure, but the development of these new services is hampered by interactions of the service features among the...
Article
Simulation, pp. 75--84, SCS, January 1991. [74] J. S. Steinman, "Breathing time warp," in 7th Workshop on Parallel and Distributed Simulation, pp. 109--118, SCS, May 1993. [75] B. D. Lubachevsky, "Efficient distributed event driven simulations," Communications of the ACM, vol. 32, pp. 63--72, January 1978. [76] S. Haykin, Adaptive Filter Theory. En...
Conference Paper
Full-text available
The international telecommunications system is the world's largest distributed computing system, offering a wide range of services and service features. Telecommunications service providers are eager to offer new services on this infrastructure, but the development of these new services is hampered by interactions of the service features among the differ...
Article
A method for generating test sequences for checking the conformance of a protocol implementation to its specification is described. A rural Chinese postman tour problem algorithm is used to determine a minimum-cost tour of the transition graph of a finite-state machine. It is shown that, when the unique input/output sequence (UIO) is used in place...
Article
Full-text available
Although testing is an essential part of program and circuit design, the area is still more an art than a science. This paper considers several fundamental problems arising in program and circuit testing, and abstracts them in terms of path-covering problems on graphs. These problems are representative of important classes of graph-optimization pro...
Article
This chapter discusses the algorithms for solving string-matching problems that have proven useful for text-editing and text-processing applications. String pattern matching is an important problem that occurs in many areas of science and information processing. In computing, it occurs naturally as part of data processing, text editing, term rewrit...
Article
Full-text available
Compiler-component generators, such as lexical analyzer generators and parser generators, have long been used to facilitate the construction of compilers. A tree-manipulation language called twig has been developed to help construct efficient code generators. Twig transforms a tree-translation scheme into a code generator that combines a fast top-d...
Article
In this note, we show how a few UNIX™ commands can be combined to create a flexible reference assembler that automatically maintains the consistency of cross references in manuscripts. The reference assembler can be used in conjunction with any text formatter.
Article
From the Publisher: This book presents the data structures and algorithms that underpin much of today's computer programming. The basis of this book is the material contained in the first six chapters of our earlier work, The Design and Analysis of Computer Algorithms. We have expanded that coverage and have added material on algorithms for external...
Conference Paper
We present a family of data structures that can process a sequence of insert, delete, and lookup instructions such that each lookup and deletion is done in constant worst-case time and each insertion is done in constant expected time. The amount of space used by each data structure is proportional to the maximal number of elements that need to be s...
Article
We show that tree pattern matching has significant advantages in the specification and implementation of efficient code generators. We present a top-down tree-matching algorithm that is particularly well suited to code generation applications. Finally, we present a new back-end language that incorporates tree pattern matching with dynamic programmi...
Article
This is the second issue of the Technical Journal devoted exclusively to papers on the family of computer operating systems bearing the UNIX trademark of AT&T Bell Laboratories. The UNIX operating system was created in 1969 by K. Thompson and D. M. Ritchie. Its growth since then, in both the commercial world and the research community, has been tru...
Conference Paper
Several papers have recently dealt with techniques for proving area-time lower bounds for VLSI computation by “crossing sequence” methods. A number of natural questions are raised by these definitions. 1. Is the fooling set approach the most powerful way to get information-transfer-based lower bounds? We shall show it is not, and offer a candidate f...
Article
Using a pair of finite-state automata to model the transmitter-receiver protocol in a data communications system, we derive lower bounds on the size of automata needed to achieve reliable communication across an error-prone channel. We also show that, at the cost of increasing the size of the automata, a transmission rate close to the theoretical m...
Article
We present an algorithm for constructing a tree to satisfy a set of lineage constraints on common ancestors. We then apply this algorithm to synthesize a relational algebra expression from a simple tableau, a problem arising in the theory of relational databases.
Chapter
To specify and match various patterns of strings and words is an essential part of computerized information processing activities such as text editing, data retrieval, bibliographic search, query processing, lexical analysis, and linguistic analysis. This chapter discusses three basic classes of string patterns that are useful in these activities a...
Article
The design of several database query languages has been influenced by Codd's relational algebra. This paper discusses the difficulty of optimizing queries based on the relational algebra operations select, project, and join. A matrix, called a tableau, is proposed as a useful device for representing the value of a query, and optimization of queries...
Conference Paper
Using a pair of finite-state automata to model the transmitter-receiver protocol in a data communications system, we derive lower bounds on the size of automata needed to achieve reliable communication across an error-prone channel. We also show that, at the cost of increasing the size of the automata, a transmission rate close to the theoretical m...
Article
Answering queries in a relational database often requires that the natural join of two or more relations be computed. However, the result of a join may not be what one expects. In this paper we give efficient algorithms to determine whether the join of several relations has the intuitively expected value (is lossless) and to determine whether a set...
Article
This paper considers the design of a system to answer partial-match queries from a file containing a collection of records, each record consisting of a sequence of fields. A partial-match query is a specification of values for zero or more fields of a record, and the answer to a query is a listing of all records in the file whose fields match the s...
Article
Many database queries can be formulated in terms of expressions whose operands represent tables of information (relations) and whose operators are the relational operations select, project, and join. This paper studies the equivalence problem for these relational expressions, with expression optimization in mind. A matrix, called a tableau, is prop...
Article
This paper describes the design and implementation of awk, a programming language which searches a set of files for patterns, and performs specified actions upon records or fields of records which match the patterns. Awk makes common data selection and transformation operations easy to express; for example, length > 72 is a complete awk program that prints al...
Conference Paper
Full-text available
We consider the question of how powerful a relational query language should be and state two principles that we feel any query language should satisfy. We show that although relational algebra and relational calculus satisfy these principles, there are certain queries involving least fixed points that cannot be expressed by these languages, yet tha...
Article
Using a pair of finite-state automata to model the transmitter-receiver protocol in a data communication system, lower bounds are derived on the size of automata needed to achieve reliable communication across an error-prone channel. It is also shown that, at the cost of increasing the size of the automata, a transmission rate close to the theoreti...
Conference Paper
Many useful database queries can be formulated in terms of expressions whose operands are relations and whose operators are the relational operations select, project, and join. This paper investigates the computational complexity of optimizing relational expressions of this form under a variety of cost measures. A matrix, called a tableau, is propo...
Conference Paper
As various parts of the compiling process have become better understood, it has been possible to package this understanding in tools that can be used by nonspecialists. This talk describes tools for parser generation and lexical analyzer generation which are available under the UNIX operating system. It will also touch on some less successful a...
Conference Paper
Answering queries in a relational database often requires that the natural join of two or more relations be computed. However, not all joins are semantically meaningful. This paper gives an efficient algorithm to determine whether the join of several relations is semantically meaningful (lossless) and an efficient algorithm to determine whether a s...
Article
Easy as the task may seem, many compilers generate rather inefficient code. Some of the difficulty of generating good code may arise from the lack of realistic models for programming language and machine semantics. In this paper we show that the computational complexity of generating efficient code in realistic situations may also be a major cause...
Conference Paper
Previous work on optimal code generation has usually assumed that the underlying machine has identical registers and that all operands fit in a single register or memory location. This paper considers the more realistic problem of generating optimal code for expressions involving single and double length operands, using several models of register-p...
Article
K. Kennedy recently conjectured that for every n node reducible flow graph, there is a sequence of nodes (with repetitions) of length O(n log n) such that all acyclic paths are subsequences thereof. Such a sequence would, if it could be found easily, enable one to do various kinds of global data flow analyses quickly. We show that for all reducible...
Article
This article defines a problem that involves merging nodes into trees while retaining the ability to determine the lowest common ancestor of any two nodes. An O(n log n) algorithm is offered to solve the problem on-line. It is shown how this algorithm provides a fast way of computing the dominator tree of a reducible flow graph.
Article
The problem of finding a longest common subsequence of two strings is discussed. This problem arises in data processing applications such as comparing two files and in genetic applications such as studying molecular evolution. The difficulty of computing a longest common subsequence of two strings is examined using the decision tree model of comput...
Article
This paper shows the problem of generating optimal code for expressions containing common subexpressions is computationally difficult, even for simple expressions and simple machines. Some heuristics for code generation are given and their worst-case behavior is analyzed. For one register machines, an optimal code generation algorithm is given whos...
Article
Full-text available
We investigate the evaluation of an $(n - 1)$st degree polynomial at a sequence of n points. It is shown that such an evaluation reduces directly to a simple convolution if and only if the sequence of points is of the form $b, ba,ba^2 , \cdots ,ba^{n - 1} $ for complex numbers a and b (the so-called “chirp transform”). By more complex reductions we...
Article
Full-text available
This paper describes a simple, efficient algorithm to locate all occurrences of any of a finite number of keywords in a string of text. The algorithm consists of constructing a finite state pattern matching machine from the keywords and then using the pattern matching machine to process the text string in a single pass. Construction of the pattern...
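
The two phases named in the abstract above (build a finite state pattern matching machine from the keywords, then scan the text in one pass) can be illustrated with a minimal Python sketch. This is not taken from the paper: the function names `build_machine` and `find_all`, the list-of-dicts table layout, and the sample keywords are assumptions chosen for illustration, and the truncated part of the abstract is not reconstructed here.

```python
from collections import deque


def build_machine(keywords):
    """Build goto, failure, and output tables (hypothetical layout)."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link for each state
    out = [[]]    # keywords recognized on entering each state

    # Phase 1: insert every keyword into a trie.
    for word in keywords:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)

    # Phase 2: breadth-first pass computing failure links.
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit keywords that end at the failure state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_all(text, keywords):
    """Return (start_index, keyword) pairs found in one pass over text."""
    goto, fail, out = build_machine(keywords)
    state = 0
    matches = []
    for i, ch in enumerate(text):
        # Follow failure links until a transition on ch exists or we hit the root.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            matches.append((i - len(word) + 1, word))
    return matches
```

For example, `find_all("ushers", ["he", "she", "his", "hers"])` reports `she` starting at index 1 and both `he` and `hers` starting at index 2, without rescanning any character of the text.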
