Terence Parr

Google Inc. | Google

Ph.D. Computer Engineering

About

39 Publications
19,828 Reads
2,084 Citations
Additional affiliations
August 2003 - present
University of San Francisco
Position
  • Professor (Full)

Publications (39)
Article
Partial dependence curves (FPD) are commonly used to explain feature importance once a supervised learning model has been fitted to data. However, it is common for the same partial dependence algorithm to give meaningfully different curves for different supervised models, even when the algorithm is applied to the same data. As a result, it is diffi...
Preprint
Full-text available
Practitioners use feature importance to rank and eliminate weak predictors during model development in an effort to simplify models and improve generality. Unfortunately, they also routinely conflate such feature importance measures with feature impact, the isolated effect of an explanatory variable on the response variable. This can lead to real-w...
Preprint
Full-text available
Partial dependence curves (FPD), introduced by Friedman (2000), are an important model interpretation tool, but are often not accessible to business analysts and scientists who typically lack the skills to choose, tune, and assess machine learning models. It is also common for the same partial dependence algorithm on the same data to give meaningful...
Preprint
Full-text available
Model interpretability is important to machine learning practitioners, and a key component of interpretation is the characterization of partial dependence of the response variable on any subset of features used in the model. The two most common strategies for assessing partial dependence suffer from a number of critical weaknesses. In the first str...
Article
Full-text available
This paper is an attempt to explain all the matrix calculus you need in order to understand the training of deep neural networks. We assume no math knowledge beyond what you learned in calculus 1, and provide links to help you refresh the necessary math where needed. Note that you do not need to understand this material before you start learning to...
Conference Paper
There are many declarative frameworks that allow us to implement code formatters relatively easily for any specific language, but constructing them is cumbersome. The first problem is that “everybody” wants to format their code differently, leading to either many formatter variants or a ridiculous number of configuration options. Second, the size o...
Article
Full-text available
There are many declarative frameworks that allow us to implement code formatters relatively easily for any specific language, but constructing them is cumbersome. The first problem is that "everybody" wants to format their code differently, leading to either many formatter variants or a ridiculous number of configuration options. Second, the size o...
Conference Paper
Full-text available
Despite the advances made by modern parsing strategies such as PEG, LL(*), GLR, and GLL, parsing is not a solved problem. Existing approaches suffer from a number of weaknesses, including difficulties supporting side-effecting embedded actions, slow and/or unpredictable performance, and counterintuitive matching strategies. This paper introduces th...
Book
Programmers run into parsing problems all the time. Whether it's a data format like JSON, a network protocol like SMTP, a server configuration file for Apache, a PostScript/PDF file, or a simple spreadsheet macro language--ANTLR v4 and this book will demystify the process. ANTLR v4 has been rewritten from scratch to make it easier than ever to buil...
Conference Paper
Full-text available
Despite the power of Parsing Expression Grammars (PEGs) and GLR, parsing is not a solved problem. Adding nondeterminism (parser speculation) to traditional LL and LR parsers can lead to unexpected parse-time behavior and introduces practical issues with error handling, single-step debugging, and side-effecting embedded grammar actions. This paper in...
Conference Paper
Despite the power of Parsing Expression Grammars (PEGs) and GLR, parsing is not a solved problem. Adding nondeterminism (parser speculation) to traditional LL and LR parsers can lead to unexpected parse-time behavior and introduces practical issues with error handling, single-step debugging, and side-effecting embedded grammar actions. This paper in...
Book
Knowing how to create domain-specific languages (DSLs) can give you a huge productivity boost. Instead of writing code in a general-purpose programming language, you can first build a custom language tailored to make you efficient in a particular domain. The key is understanding the common patterns found across language implementations. Language I...
Article
Programmers tend to avoid using language tools, resorting to ad-hoc methods, because tools can be hard to use, their parsing strategies can be difficult to understand and debug, and their generated parsers can be opaque black-boxes. In particular, there are two very common difficulties encountered by grammar developers: Understanding why a grammar fragme...
Conference Paper
Full-text available
Reusing syntax specifications without embedded arbitrary semantic actions is straightforward because the semantic analysis phases of new applications can feed off trees or other intermediate structures constructed by the pre-existing parser. The presence of arbitrary embedded semantic actions, however, makes reuse difficult with existing mechanisms...
Conference Paper
Full-text available
Search engines regularly crawl the web, taking vast snapshots of site content. Because previous crawls are not archived, however, search results pertain only to a single, recent instant in time. Search engine users are unable to request pages discussing UK politics in 2001, for example. The Internet Archive, an organization dedicated to maintaining suc...
Conference Paper
Full-text available
A template engine that strictly enforces model-view separation has been shown to be at least as expressive as a context-free grammar, allowing the engine to, for example, easily generate any file describable by an XML DTD (7). When faced with supporting internationalized web applications, however, template engine designers have backed off from enfor...
Conference Paper
Full-text available
The mantra of every experienced web application developer is the same: thou shalt separate business logic from display. Ironically, almost all template engines allow violation of this separation principle, which is the very impetus for HTML template engine development. This situation is due mostly to a lack of formal definition of separation and fe...
Article
Documents written in two-dimensional markup languages such as HTML are easy to create with word processor-style editors. Creating VRML[1] documents, however, is complicated by its three-dimensional nature. VRML can be generated from existing 3D graphics editors or new applications can be written specifically for constructing VRML scenes. In this pa...
Article
Full-text available
Despite the sophistication of code-generator generators and source-to-source translator generators (such as attribute grammar based tools), programmers often choose to build tree parsers by hand for source translation problems. In many cases, a programmer has a front-end that constructs intermediate form trees and simply wants to traverse the tre...
Article
Full-text available
Syntax Trees. -gx Do not create the lexical analyzer files (dlg-related). This option should be given when the user wishes to provide a customized lexical analyzer. It may also be used in make scripts to cause only the parser to be rebuilt when a change not affecting the lexical structure is made to the input grammars. -k n Set k of LL(k) to n; i...
Article
In this paper, we introduce the ANTLR (ANother Tool for Language Recognition) parser generator, which addresses all these issues. ANTLR is a component of the Purdue Compiler Construction Tool Set (PCCTS).
Article
Full-text available
Language translation is a harder and more important problem than language recognition. In particular, programmers implement translators not recognizers. Yet too often, translation is equated with the simpler task of syntactic parsing. This misconception coupled with computing limitations of past computers has led to the almost exclusive use of LR(1...
Article
Despite the parsing power of LR/LALR algorithms, e.g., YACC, programmers often choose to write recursive-descent parsers by hand to obtain increased flexibility, better error handling, and ease of debugging. We introduce ANTLR, a public-domain parser generator that combines the flexibility of hand-coded parsing with the convenience of a parser generator, which is a component of PCCTS. ANTLR has many features that...
Article
Full-text available
Massively parallel processors (MPPs) hold the promise of extremely high performance that, if realized, could be used to study problems of unprecedented size and complexity. One of the primary stumbling blocks to this promise has been the lack of tools to translate application codes to MPP form. In this paper we show how applications codes written i...
Article
Massively parallel processors (MPPs) hold the promise of extremely high performance that, if realized, could be used to study problems of unprecedented size and complexity. One of the primary stumbling blocks to this promise has been the lack of tools to translate application codes to MPP form. In this article we show how applications codes written...
Conference Paper
Full-text available
Most language translation problems can be solved with existing LALR(1) or LL(k) language tools; e.g., YACC [Joh78] or ANTLR [PDC92]. However, there are language constructs that defy almost all parsing strategies commonly in use. Some of these constructs cannot be parsed without semantics, such as symbol table information, and some cannot be properly...
Thesis
Full-text available
LL(k) and LR(k) parsers for $k>1$ are well understood in theory, but little work has been done in practice to implement them. This fact arises primarily because traditional lookahead information for LL(k) and LR(k) parsers and their variants is exponentially large in k. Fortunately, this worst case behavior can usually be averted and practical dete...
Article
Full-text available
Although existing LR(1) or LL(1) parser generators suffice for many language recognition problems, writing a straightforward grammar to translate a complicated language, such as C++ or even C, remains a non-trivial task. We have often found that adding translation actions to the grammar is harder than writing the grammar itself. Part of the problem...
Article
Full-text available
Article
Full-text available
This paper describes ST (StringTemplate), a domain-specific functional language for generating structured text from internal data structures that has the flavor of an output grammar. ST's feature set is driven by solving real problems encountered in complicated systems such as ANTLR version 3's retargetable code generator. Features include templ...
Article
Full-text available
In terms of recognition strength, LL techniques are widely held to be inferior to LR parsers. The fact that any LR(k) grammar can be rewritten to be LR(1), whereas LL(k) is stronger than LL(1), appears to give LR techniques the additional benefit of not requiring k-token lookahead and its associated overhead. In this paper, we suggest that LL(k)...

Projects (2)
Project
To build and maintain a powerful but easy-to-use parser generator that generates code in multiple languages.