Prototyping Efficient Natural Language Parsers
ABSTRACT: We present a technique for the construction of efficient prototypes for natural language parsing based on the compilation of parsing schemata to executable implementations of their corresponding algorithms. Taking a simple description of a schema as input, Java code for the corresponding parsing algorithm is generated, including schema-specific indexing code in order to attain efficiency.
Available from: Miguel Ángel Alonso Pardo, Aug 12, 2015
- Source: sciencedirect.com
ABSTRACT: A recognition algorithm is exhibited whereby an arbitrary string over a given vocabulary can be tested for containment in a given context-free language. A special merit of this algorithm is that it is completed in a number of steps proportional to the “cube” of the number of symbols in the tested string. As a byproduct of the grammatical analysis, required by the recognition algorithm, one can obtain, by some additional processing not exceeding the “cube” factor of computational complexity, a parsing matrix—a complete summary of the grammatical structure of the sentence. It is also shown how, by means of a minor modification of the recognition algorithm, one can obtain an integer representing the ambiguity of the sentence, i.e., the number of distinct ways in which that sentence can be generated by the grammar. The recognition algorithm is then simulated on a Turing Machine. It is shown that this simulation likewise requires a number of steps proportional to only the “cube” of the test string length.
Information and Control 02/1967; 10(2-10):189-208. DOI:10.1016/S0019-9958(67)80007-X
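The cubic-time recognition and the ambiguity count described in this abstract can be illustrated with a small chart-based sketch. This is not the code from the cited article: it assumes a grammar already in Chomsky normal form, and the names `CykSketch`, `Binary`, `Lexical` and `countParses` are hypothetical, chosen only for this example (records require Java 16 or later).

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CykSketch {
    // Hypothetical rule types for a grammar in Chomsky normal form.
    record Binary(String lhs, String left, String right) {}   // A -> B C
    record Lexical(String lhs, String word) {}                // A -> a

    /** Number of distinct parse trees rooted in start; 0 means the string is rejected. */
    static long countParses(List<Binary> binary, List<Lexical> lexical,
                            String start, List<String> words) {
        int n = words.size();
        if (n == 0) return 0;

        // chart[i][j] maps a nonterminal to the number of ways it derives words i..j-1.
        @SuppressWarnings("unchecked")
        Map<String, Long>[][] chart = new HashMap[n + 1][n + 1];
        for (int i = 0; i <= n; i++)
            for (int j = 0; j <= n; j++)
                chart[i][j] = new HashMap<>();

        // Spans of length 1 come from lexical rules.
        for (int i = 0; i < n; i++)
            for (Lexical r : lexical)
                if (r.word().equals(words.get(i)))
                    chart[i][i + 1].merge(r.lhs(), 1L, Long::sum);

        // Longer spans: three nested position loops give the cubic bound in n.
        for (int len = 2; len <= n; len++)
            for (int i = 0; i + len <= n; i++) {
                int j = i + len;
                for (int k = i + 1; k < j; k++)
                    for (Binary r : binary) {
                        Long l = chart[i][k].get(r.left());
                        Long rr = chart[k][j].get(r.right());
                        if (l != null && rr != null)
                            chart[i][j].merge(r.lhs(), l * rr, Long::sum);
                    }
            }
        return chart[0][n].getOrDefault(start, 0L);
    }

    public static void main(String[] args) {
        // Toy ambiguous grammar: S -> S S | a
        List<Binary> binary = List.of(new Binary("S", "S", "S"));
        List<Lexical> lexical = List.of(new Lexical("S", "a"));
        System.out.println(countParses(binary, lexical, "S", List.of("a", "a", "a")));      // prints 2
        System.out.println(countParses(binary, lexical, "S", List.of("a", "a", "a", "a"))); // prints 5
    }
}
```

Storing a count per nonterminal instead of a boolean is what turns plain recognition into the ambiguity count; a result greater than zero means the string belongs to the language.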
- ABSTRACT: Parsing schemata provide a formal, simple and uniform way to describe, analyze and compare different parsing algorithms. The notion of a parsing schema comes from considering parsing as a deduction process which generates intermediate results called items. An initial set of items is directly obtained from the input sentence, and the parsing process consists of the application of inference rules (called deductive steps) which produce new items from existing ones. Each item contains a piece of information about the sentence’s structure, and a successful parsing process will produce at least one final item containing a full parse tree for the sentence or guaranteeing its existence. Their abstraction of low-level details makes parsing schemata useful to define parsers in a simple and straightforward way. Comparing parsers, or considering aspects such as their correctness and completeness or their computational complexity, also becomes easier if we think in terms of schemata. However, when we want to actually use a parser by running it on a computer, we need to implement it in a programming language, so we have to abandon the high level of abstraction and worry about implementation details that were irrelevant at the schema level. In particular, we study in this article how the source parsing schema should be analysed to decide what kind of indexes need to be generated in order to obtain an efficient parser.
Computer Aided Systems Theory, Edited by Roberto Moreno-Díaz, Franz Pichler, Alexis Quesada-Arencibia, 01/1970: pages 257-264; Springer.
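The deduction process summarised in this abstract (initial items, deductive steps, final items) and the role of indexes can be sketched with a hand-written agenda-driven engine for a CYK-style schema. This is only an illustration under stated assumptions, not the code the authors' compiler generates: the class `DeductiveCykSketch`, its record types and the `byStart`/`byEnd` indexes are all hypothetical names, and the single deductive step assumed is [B, i, k], [C, k, j] ⊢ [A, i, j] for each rule A -> B C (records require Java 16 or later).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DeductiveCykSketch {
    record Item(String nt, int start, int end) {}              // item [A, i, j]
    record Binary(String lhs, String left, String right) {}    // A -> B C
    record Lexical(String lhs, String word) {}                 // A -> a

    /** Computes the closure of the initial items under the deductive step. */
    static Set<Item> deduce(List<Binary> binary, List<Lexical> lexical, List<String> words) {
        Set<Item> chart = new HashSet<>();
        Deque<Item> agenda = new ArrayDeque<>();
        // Indexes over already-processed items, keyed by the positions the step must match.
        Map<Integer, List<Item>> byStart = new HashMap<>();
        Map<Integer, List<Item>> byEnd = new HashMap<>();

        // Initial items are obtained directly from the input sentence.
        for (int i = 0; i < words.size(); i++)
            for (Lexical r : lexical)
                if (r.word().equals(words.get(i))) {
                    Item item = new Item(r.lhs(), i, i + 1);
                    if (chart.add(item)) agenda.add(item);
                }

        while (!agenda.isEmpty()) {
            Item trigger = agenda.poll();
            byStart.computeIfAbsent(trigger.start(), k -> new ArrayList<>()).add(trigger);
            byEnd.computeIfAbsent(trigger.end(), k -> new ArrayList<>()).add(trigger);

            List<Item> derived = new ArrayList<>();
            // Trigger as left antecedent: partners must start where the trigger ends.
            for (Item right : byStart.getOrDefault(trigger.end(), List.of()))
                for (Binary r : binary)
                    if (r.left().equals(trigger.nt()) && r.right().equals(right.nt()))
                        derived.add(new Item(r.lhs(), trigger.start(), right.end()));
            // Trigger as right antecedent: partners must end where the trigger starts.
            for (Item left : byEnd.getOrDefault(trigger.start(), List.of()))
                for (Binary r : binary)
                    if (r.left().equals(left.nt()) && r.right().equals(trigger.nt()))
                        derived.add(new Item(r.lhs(), left.start(), trigger.end()));

            for (Item d : derived)
                if (chart.add(d)) agenda.add(d);   // only genuinely new items are processed
        }
        return chart;
    }

    /** The final item [start, 0, n] guarantees that a full parse exists. */
    static boolean recognize(List<Binary> binary, List<Lexical> lexical,
                             String start, List<String> words) {
        return deduce(binary, lexical, words).contains(new Item(start, 0, words.size()));
    }

    public static void main(String[] args) {
        List<Binary> binary = List.of(new Binary("S", "NP", "VP"), new Binary("VP", "V", "NP"));
        List<Lexical> lexical = List.of(new Lexical("NP", "she"), new Lexical("NP", "fish"),
                                        new Lexical("V", "eats"));
        System.out.println(recognize(binary, lexical, "S", List.of("she", "eats", "fish"))); // prints true
        System.out.println(recognize(binary, lexical, "S", List.of("eats", "she", "fish"))); // prints false
    }
}
```

The two position indexes stand in for the kind of schema-specific indexing the abstract refers to: without them, every popped item would have to be compared against the whole chart to find antecedents that satisfy the step's side condition on the shared position k.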