Conference Paper
Deterministic Techniques for Efficient Non-Deterministic Parsers
DOI: 10.1007/3-540-06841-4_65 Conference: Automata, Languages and Programming, 2nd Colloquium, University of Saarbrücken, July 29 – August 2, 1974, Proceedings
Source: DBLP
ABSTRACT
A general study of parallel non-deterministic parsing and translation à la Earley is developed formally, based on non-deterministic pushdown acceptor-transducers. Several results (complexity and efficiency) are established, some new and others previously proved only in special cases. As an application, we show that for every family of deterministic context-free pushdown parsers (e.g. precedence, LR(k), LL(k), ...) there is a family of general context-free parallel parsers that have the same efficiency in most practical cases (e.g. analysis of programming languages).
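The parallel, breadth-first style of parsing described in the abstract (all candidate analyses advanced together through the input) can be illustrated with a minimal Earley-style recognizer. This is a sketch over a hypothetical toy grammar, not the paper's formal pushdown acceptor-transducer construction:

```python
# Minimal Earley recognizer sketch: all partial parses advance in parallel.
# Grammar, names, and example input are illustrative assumptions.

def earley_recognize(grammar, start, tokens):
    """grammar maps nonterminal -> list of right-hand sides (tuples)."""
    # An item is (lhs, rhs, dot, origin).
    chart = [set() for _ in range(len(tokens) + 1)]
    for rhs in grammar[start]:
        chart[0].add((start, rhs, 0, 0))
    for i in range(len(tokens) + 1):
        changed = True
        while changed:                       # close chart[i] under predict/complete
            changed = False
            for lhs, rhs, dot, origin in list(chart[i]):
                if dot < len(rhs) and rhs[dot] in grammar:        # predict
                    for alt in grammar[rhs[dot]]:
                        new = (rhs[dot], alt, 0, i)
                        if new not in chart[i]:
                            chart[i].add(new); changed = True
                elif dot == len(rhs):                             # complete
                    for plhs, prhs, pdot, porig in list(chart[origin]):
                        if pdot < len(prhs) and prhs[pdot] == lhs:
                            new = (plhs, prhs, pdot + 1, porig)
                            if new not in chart[i]:
                                chart[i].add(new); changed = True
        if i < len(tokens):                                       # scan
            for lhs, rhs, dot, origin in chart[i]:
                if dot < len(rhs) and rhs[dot] == tokens[i]:
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
    return any(lhs == start and dot == len(rhs) and origin == 0
               for lhs, rhs, dot, origin in chart[len(tokens)])

# Deliberately ambiguous grammar: E -> E + E | n
G = {"E": [("E", "+", "E"), ("n",)]}
print(earley_recognize(G, "E", ["n", "+", "n", "+", "n"]))  # True
```

Note that the ambiguous input is handled without backtracking: both groupings of the expression live side by side in the chart, which is the "parallel non-deterministic" behavior the paper studies.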

 "The sbp package is an implementation of the LangTomita Generalized LR Parsing Algorithm [2] [3], employing Johnstone & Scott's RNGLR algorithm [13] for handling productions and circularities. "
Article: Scannerless Boolean Parsing
ABSTRACT: Scannerless generalized parsing techniques allow parsers to be derived directly from unified, declarative specifications. Unfortunately, in order to uniquely parse existing programming languages at the character level, disambiguation extensions beyond the usual context-free formalism are required. This paper explains how scannerless parsers for boolean grammars (context-free grammars extended with intersection and negation) can specify such languages unambiguously, and can also describe other interesting constructs such as indentation-based block structure.
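To see what intersection adds beyond plain context-free grammars, consider the classic non-context-free language {aⁿbⁿcⁿ}: it is exactly the conjunction of two context-free conditions. The following is a hedged toy illustration of that boolean-grammar idea, not the paper's implementation:

```python
import re

# Boolean-grammar-style conjunct: a string is accepted only if it satisfies
# BOTH context-free conditions. All names here are illustrative.

def in_anbn_cstar(s):
    # a^n b^n c^*  (equal a- and b-runs, any number of c's)
    m = re.fullmatch(r"(a*)(b*)(c*)", s)
    return bool(m) and len(m.group(1)) == len(m.group(2))

def in_astar_bncn(s):
    # a^* b^n c^n  (equal b- and c-runs, any number of a's)
    m = re.fullmatch(r"(a*)(b*)(c*)", s)
    return bool(m) and len(m.group(2)) == len(m.group(3))

def in_anbncn(s):
    # The intersection a^n b^n c^n is not context-free, yet each
    # conjunct individually is.
    return in_anbn_cstar(s) and in_astar_bncn(s)

print(in_anbncn("aabbcc"))  # True
print(in_anbncn("aabbc"))   # False
```

The same conjunction mechanism is what lets a scannerless boolean grammar impose cross-cutting constraints (such as indentation-based block structure) that a single context-free grammar cannot express.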
 "Members of this class are widely agreed to be expressive enough to accommodate reasonable structures for natural language sentences while still ruling out some conceivable alternatives (Frank, 2004; Joshi, VijayShanker, & Weir, 1991). Section 3 answers this question in the affirmative, showing how the entropy reduction idea can be extended to mildly contextsensitive languages by applying two classical ideas in (probabilistic) formal language theory: Grenander's (1967) closedform solution for the entropy of a nonterminal in a probabilistic grammar, and Lang's (1974, 1988) insight that an intermediate parser state is itself a specification of a grammar. Sections 4 through 10 assert the feasibility of this extension by examining the implications of two alternative relative clauses analyses for a proposed linguistic universal. "
ABSTRACT: A word-by-word human sentence processing complexity metric is presented. This metric formalizes the intuition that comprehenders have more trouble on words contributing larger amounts of information about the syntactic structure of the sentence as a whole. The formalization is in terms of the conditional entropy of grammatical continuations, given the words that have been heard so far. To calculate the predictions of this metric, Wilson and Carroll's (1954) original entropy reduction idea is extended to infinite languages. This is demonstrated with a mildly context-sensitive language that includes relative clauses formed on a variety of grammatical relations across the Accessibility Hierarchy of Keenan and Comrie (1977). Predictions are derived that correlate significantly with repetition accuracy results obtained in a sentence-memory experiment (Keenan & Hawkins, 1987). 
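The closed-form entropy idea cited above (Grenander, 1967) can be sketched on a one-nonterminal toy grammar. The grammar and probabilities below are illustrative assumptions: the derivation entropy H(X) of a nonterminal satisfies H(X) = h(X) + Σ_Y E[occurrences of Y per expansion of X] · H(Y), where h(X) is the entropy of X's rule choice, and the resulting linear system (I − A)H = h has a closed-form solution:

```python
import math

# Hedged sketch of a closed-form PCFG nonterminal entropy (toy grammar):
#   S -> 'a' S   with probability p
#   S -> 'b'     with probability 1 - p
# Each expansion of S contributes h(p) bits of rule-choice entropy, and
# with probability p spawns one more S, so H(S) = h(p) + p * H(S).

def rule_choice_entropy(probs):
    # Shannon entropy (in bits) of a single rule choice.
    return -sum(p * math.log2(p) for p in probs if p > 0)

p = 0.5
h_local = rule_choice_entropy([p, 1 - p])  # 1 bit per expansion of S
# The "system" here is 1x1: A = [p], so (I - A)^-1 h = h / (1 - p).
H_S = h_local / (1 - p)
print(H_S)  # 2.0 bits: total entropy of a complete derivation from S
```

The entropy reduction metric then compares such entropies before and after each word; an intermediate parser state determines a (residual) grammar, so the same formula applies at every prefix.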
 "An obvious way to extend the standard LR parsing approach to incorporate nondeterminism is to replicate the stack when a point of nondeterminism is reached, and to explore all the possible traversals of the DFA. An efficient algorithm for exploring all traversals of a nondeterministic PDA which performs at most one stack pop and one stack push at each step, was given by Lang [11]. Tomita [15] gave an algorithm aimed explicitly at LR DFAs (which in their standard form can pop multiple stack symbols at each step). "
[Show abstract] [Hide abstract]
ABSTRACT: Reduction Incorporated (RI) recognisers and parsers deliver high performance by suppressing the stack activity except for those rules that generate fully embedded recursion. Automaton constructions for RI parsing have been presented by Aycock and Horspool [John Aycock and Nigel Horspool. Faster generalised LR parsing. In Compiler Construction, 8th Intnl. Conf, CC'99, volume 1575 of Lecture Notes in Computer Science, pages 32–46. Springer-Verlag, 1999] and by Scott and Johnstone [Adrian Johnstone and Elizabeth Scott. Generalised regular parsers. In Görel Hedin, editor, Compiler Construction, 12th Intnl. Conf, CC'03, volume 2622 of Lecture Notes in Computer Science, pages 232–246. Springer-Verlag, Berlin, 2003] but both can yield very large tables. An unusual aspect of the RI automaton is that the degree of stack activity suppression can be varied in a fine-grained way, and this provides a large family of potential RI automata for real programming languages, some of which have manageable table size but still show high performance. We give examples drawn from ANSI C, Cobol and Pascal; discuss some heuristics for guiding manual specification of stack activity suppression; and describe work in progress on the automatic construction of RI automata using profiling information gathered from running parsers: in this way we propose to optimise our parsers' table size against performance on actual parsing tasks.