Article · PDF Available

Constructing Natural Language Interpreters in a Lazy Functional Language.

Authors: Richard A. Frost and John Launchbury
Downloaded from comjnl.oxfordjournals.org at University of Windsor on September 22, 2010
... Finally, in some problem domains one may consider avoiding generated source code entirely. For example, in parsing, some programmers find parser combinators [5,6] to be a suitable or even preferable alternative to Yacc-like tools. Nevertheless, many programmers prefer traditional LR parser generators for various reasons including error reporting and recovery, and ambiguity diagnostics. ...
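The parser-combinator alternative mentioned in this excerpt can be sketched in a few lines of Haskell. This is a minimal illustrative core, not the API of any cited library; the names (`sat`, `orElse`, `many1`) follow common tutorial usage.

```haskell
-- A minimal parser-combinator core: a parser is a function from input
-- to an optional (result, remaining input) pair.
newtype Parser a = Parser { runParser :: String -> Maybe (a, String) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> fmap (\(a, rest) -> (f a, rest)) (p s)

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> case pf s of
    Nothing        -> Nothing
    Just (f, rest) -> fmap (\(a, rest') -> (f a, rest')) (pa rest)

-- Consume one character satisfying a predicate.
sat :: (Char -> Bool) -> Parser Char
sat ok = Parser $ \s -> case s of
  (c:cs) | ok c -> Just (c, cs)
  _             -> Nothing

-- Ordered choice: try the first parser, fall back to the second.
orElse :: Parser a -> Parser a -> Parser a
orElse (Parser p) (Parser q) = Parser $ \s -> maybe (q s) Just (p s)

-- One or more repetitions (laziness makes the recursion well-defined).
many1 :: Parser a -> Parser [a]
many1 p = (:) <$> p <*> (many1 p `orElse` pure [])

digits :: Parser String
digits = many1 (sat (`elem` "0123456789"))

main :: IO ()
main = print (runParser digits "123+4")  -- Just ("123","+4")
```

Because grammars are built from ordinary functions, programmers get the host language's abstraction and type checking for free, which is the trade-off against Yacc-style tooling that the excerpt alludes to.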
... The tool is implemented in Standard ML. Although SREs are less compact than some other notations, we find their syntax is much easier to remember. ...
Article
Existing source-code-generating tools such as Lex and Yacc suffer from practical inconveniences because they use disembodied code to implement actions. To prevent this problem, such tools could generate closed functors that are then instantiated by the programmer with appropriate action code. This results in all code being type checked in its appropriate context, and it assists the type checker in localizing errors correctly. We have implemented a lexer generator and parser generator based on this technique for Standard ML, OCaml, and Haskell.
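A rough Haskell analogue of the "closed functor" idea in this abstract: rather than splicing user action code into generated text, the generated parser can be parameterised by a record of actions, so the compiler type-checks each action in its declared context. The grammar shape and names below are invented for illustration; they are not the tool's actual interface.

```haskell
-- Actions the user supplies, with their types fixed by the "generated" code.
data ExprActions r = ExprActions
  { onNum :: Int -> r        -- action for a numeric leaf
  , onAdd :: r -> r -> r     -- action for `e1 + e2`
  }

-- The shape a generator might emit: a fold over a fixed parse structure.
-- (A real generator would emit a full parser; only the interface matters here.)
data Expr = Num Int | Add Expr Expr

runActions :: ExprActions r -> Expr -> r
runActions acts (Num n)   = onNum acts n
runActions acts (Add l r) = onAdd acts (runActions acts l) (runActions acts r)

-- The user "instantiates" the generated interface with typed actions;
-- a type error in an action is reported here, in context.
evalActs :: ExprActions Int
evalActs = ExprActions { onNum = id, onAdd = (+) }

main :: IO ()
main = print (runActions evalActs (Add (Num 1) (Num 2)))  -- 3
```

Mistyped action code fails to compile at the instantiation site rather than deep inside generated source, which is the error-localization benefit the abstract claims.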
... Due to their declarative nature, the definition of the mathematical function itself can often be directly expressed in the language. Since UEV-FLMS is heavily rooted in set-relational theory [39], we were able to implement all of our semantic functions as they were defined in Chapter 3 by directly expressing those functions in Haskell. To accommodate communication with external triplestores, we "lifted" the implementations of the semantic functions into the IO monad, enabling our semantics to use non-referentially transparent getts functions. ...
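The lifting described in this excerpt can be sketched as follows. The triplestore here is a stubbed in-memory list and the names (`getts`, `moonsOfJupiter`) are illustrative, not the thesis's actual API; the point is that a set-theoretic semantics (an intersection of two sets) survives the move into IO unchanged in shape.

```haskell
-- A minimal sketch of lifting set-valued semantic functions into IO.
import Data.List (intersect)

type Triple = (String, String, String)  -- (subject, predicate, object)

-- Stand-in for a network call to an external triplestore.
getts :: (String, String) -> IO [String]
getts (prop, obj) =
  pure [ s | (s, p, o) <- store, p == prop, o == obj ]
  where
    store :: [Triple]
    store = [ ("io",     "orbits", "jupiter")
            , ("europa", "orbits", "jupiter")
            , ("io",     "type",   "moon") ]

-- Pure semantics for "moons that orbit jupiter" is set intersection;
-- lifted applicatively into IO it reads almost identically.
moonsOfJupiter :: IO [String]
moonsOfJupiter = intersect <$> getts ("type", "moon")
                           <*> getts ("orbits", "jupiter")

main :: IO ()
main = moonsOfJupiter >>= print  -- ["io"]
```

Swapping the stub for a real SPARQL endpoint call changes only `getts`; the semantic functions built on top of it are untouched.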
... The query program is called "Solarman", and there exist two web-based interfaces that can be used to interact with it. Solarman was a program originally built in Miranda in 1989 to demonstrate Frost's FLMS semantics [39], enabling the user to perform queries about objects in the Solar system. It was later ported to Haskell and integrated with Hafiz's parser in 2008 [25] to form a Natural Language Interface using FLMS semantics to perform queries. ...
Thesis
Full-text available
The Semantic Web is an emerging component of the set of technologies that will be known as Web 3.0 in the future. With the large changes it brings to how information is stored and represented to users, there is a need to re-evaluate how this information can be queried. Specifically, there is a need for Natural Language Interfaces that allow users to easily query for information on the Semantic Web. While there has been previous work in this area, existing solutions suffer from the problem that they do not support prepositional phrases in queries (e.g., “in 1958” or “with a key”). To address this, we improve on an existing semantics for event-based triplestores that supports prepositional phrases and demonstrate a novel method of handling the word “by”, treating it directly as a preposition in queries. We then show how this new semantics can be integrated with a parser constructed as an executable attribute grammar to create a highly modular and extensible Natural Language Interface to the Semantic Web that supports prepositional phrases in queries.
... Natural language processing is of course a very rich and diverse research area, and space limitations preclude a summary of techniques. However, the topic of natural language processing in a functional language has also been discussed by Frost and Launchbury [5]. Their work differs from ours by its foundation on a semantic theory that is based on principles proposed by Montague [12]. ...
... Parser-combinator morphology: Our third lemmatizer (named in our submission filenames as 'ipamorph', since it operates on the output of our Epitran IPA system) arose from a refactoring of the regular-expression compiler into a recursive descent parser, by converting the primitive elements (e.g., suffix specifications) to parser combinators (Hutton and Meijer 1988; Frost and Launchbury 1989) rather than regular expression snippets. We also significantly expanded the suffix inventory and morphotactic com- ... [Table 3: Example output from franmorph, including lemmas and detailed lexical and morphological information] ...
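The refactoring this excerpt describes, replacing suffix regex snippets with composable parsers, can be sketched in miniature. The suffix inventory and names below are invented toys for illustration (real lemmatization is far subtler), but the combinator shape is the point.

```haskell
-- A toy "suffix parser": try to strip one inflectional ending from a
-- word, returning the stem on success.
import Data.List (isSuffixOf)

type SuffixParser = String -> Maybe String

suffix :: String -> SuffixParser
suffix suf w
  | suf `isSuffixOf` w = Just (take (length w - length suf) w)
  | otherwise          = Nothing

-- Combinator: ordered choice over alternative endings.
(<+>) :: SuffixParser -> SuffixParser -> SuffixParser
(p <+> q) w = maybe (q w) Just (p w)

-- A tiny English-like inventory; falls back to the word itself.
lemmatize :: String -> String
lemmatize w = maybe w id ((suffix "ing" <+> suffix "es" <+> suffix "s") w)

main :: IO ()
main = mapM_ (putStrLn . lemmatize) ["combinators", "boxes"]
```

Unlike regex snippets glued together as text, each `SuffixParser` is a first-class value, so new endings and morphotactic constraints compose with `<+>` and ordinary function application.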
Article
Full-text available
The LoReHLT16 evaluation challenged participants to extract Situation Frames (SFs)—structured descriptions of humanitarian need situations—from monolingual Uyghur text. The ARIEL-CMU SF detector combines two classification paradigms, a manually curated keyword-spotting system and a machine learning classifier. These were applied by translating the models on a per-feature basis, rather than translating the input text. The resulting combined model provides the accuracy of human insight with the generality of machine learning, and is relatively tractable to human analysis and error correction. Other factors contributing to success were automatic dictionary creation, the use of phonetic transcription, detailed, hand-written morphological analysis, and naturalistic glossing for error analysis by humans. The ARIEL-CMU SF pipeline produced the top-scoring LoReHLT16 situation frame detection systems for the metrics SFType, SFType+Place+Need, SFType+Place+Relief, and SFType+Place+Urgency, at each of the three checkpoints.
... Functional approaches to parsing, including parser combinators, have been studied for several decades (Burge, 1975; Fairbairn, 1987; Frost and Launchbury, 1989; Hutton, 1990). Both Norvig (1991) and Leermakers (1993) use memoisation to improve efficiency, but Norvig forbids left recursive rules, while Leermakers avoids the problem of left recursion by using a 'recursive ascent' strategy, sacrificing the modularity of top-down approaches (Koskimies, 1990). ...
Article
Full-text available
Memoisation, or tabling, is a well-known technique that yields large improvements in the performance of some recursive computations. Tabled resolution in Prologs such as XSB and B-Prolog can transform so-called left-recursive predicates from non-terminating computations into finite and well-behaved ones. In the functional programming literature, memoisation has usually been implemented in a way that does not handle left-recursion, requiring supplementary mechanisms to prevent non-termination. A notable exception is Johnson's (1995) continuation passing approach in Scheme. This, however, relies on mutation of a memo table data structure and coding in explicit continuation passing style. We show how Johnson's approach can be implemented purely functionally in a modern, strongly typed functional language (OCaml), presented via a monadic interface that hides the implementation details yet provides a way to return a compact representation of the memo tables at the end of the computation.
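The purely functional, monadically hidden tabling this abstract describes can be illustrated on a simpler problem. The sketch below threads a memo table through a `State` monad (Haskell rather than the paper's OCaml); it shows tabling without mutation only, and omits the continuation-passing machinery the paper needs to handle left recursion.

```haskell
-- State-threaded memoisation: the table is ordinary immutable data,
-- and can be inspected or returned at the end of the computation.
import qualified Data.Map.Strict as M
import Control.Monad.State (State, gets, modify, evalState)

fibM :: Integer -> State (M.Map Integer Integer) Integer
fibM n
  | n < 2 = pure n
  | otherwise = do
      cached <- gets (M.lookup n)
      case cached of
        Just v  -> pure v            -- table hit: no recomputation
        Nothing -> do
          v <- (+) <$> fibM (n - 1) <*> fibM (n - 2)
          modify (M.insert n v)      -- record the result, purely
          pure v

-- The monadic interface hides the table from callers entirely.
fib :: Integer -> Integer
fib n = evalState (fibM n) M.empty

main :: IO ()
main = print (fib 50)  -- 12586269025
```

Replacing `evalState` with `runState` would return the final memo table alongside the result, analogous to the compact table representation the paper exposes.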
... Syntactic integration Syntactic integration of domain-specific languages is commonly supported by so-called language workbenches [22], environments that define (1) a schema for an abstract syntax for a language (i.e., a grammar), (2) one or more rich editing environments for the language, and (3) language semantics, typically either by direct interpretation or code generation. Language workbenches can be based on a variety of parsing technologies, such as generalized LR (GLR) parsing [48], generalized LL (GLL) parsing [43], term rewriting [23], parser combinators [24], or parsing expression grammars [9]. ...
Book
Full-text available
When a project is realized in a globalized environment, multiple stakeholders from different organizations work on the same system. Depending on the stakeholders and their organizations, various (possibly overlapping) concerns are raised in the development of the system. In this context a Domain Specific Language (DSL) supports the work of a group of stakeholders who are responsible for addressing a specific set of concerns. This chapter identifies the open challenges arising from the coordination of globalized domain-specific languages. We identify two types of coordination: technical coordination and social coordination. After presenting an overview of the current state of the art, we first discuss the open challenges arising from the composition of multiple DSLs, and then the open challenges associated with collaboration in a globalized environment.
... Bounded seas can be integrated into a parser combinator framework, a highly modular framework for building a parser from other composable parsers [10]. The fact that a bounded sea can be implemented as a parser combinator demonstrates its composability and flexibility. ...
Article
Imprecise manipulation of source code (semi-parsing) is useful for tasks such as robust parsing, error recovery, lexical analysis, and rapid development of parsers for data extraction. An island grammar precisely defines only a subset of a language syntax (islands), while the rest of the syntax (water) is defined imprecisely. Usually, water is defined as the negation of islands. Albeit simple, such a definition of water is naive and impedes composition of islands. When developing an island grammar, sooner or later a programmer has to create water tailored to each individual island. Such an approach is fragile, however, because water can change with any change of a grammar. It is time-consuming, because water is defined manually by a programmer and not automatically. Finally, an island surrounded by water cannot be reused because water has to be defined for every grammar individually. In this paper we propose a new technique of island parsing — bounded seas. Bounded seas are composable, robust, reusable and easy to use because island-specific water is created automatically. We integrated bounded seas into a parser combinator framework as a demonstration of their composability and reusability.
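The island-and-water idea in this abstract can be sketched with a toy combinator: an `island` parser for the interesting fragment, and a `sea` combinator that skips "water" until the island succeeds. The names and representation are invented; the actual bounded-seas work derives the water automatically from grammar context rather than skipping character by character.

```haskell
-- A toy island parser embedded in a minimal combinator type.
newtype P a = P { runP :: String -> Maybe (a, String) }

-- Island: a literal keyword.
keyword :: String -> P String
keyword k = P $ \s -> case splitAt (length k) s of
  (pre, rest) | pre == k -> Just (k, rest)
  _                      -> Nothing

-- Sea: treat everything before the island as water, dropping input
-- one character at a time until the island parses.
sea :: P a -> P a
sea p = P go
  where
    go s = case runP p s of
      Just r  -> Just r
      Nothing -> case s of
        []     -> Nothing
        (_:cs) -> go cs

main :: IO ()
main = print (runP (sea (keyword "class")) "junk junk class Foo {")
  -- Just ("class"," Foo {")
```

Because `sea p` is itself just a parser, it composes with other combinators like any island, which is the composability property the excerpt above highlights.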
... Parser combinators are a technique developed for lazily evaluated functional programming languages. Some early impressive examples can be found in [FL89]. A recent and very efficient implementation is the Parsec library for Haskell [LM01]. ...
Article
Monadic Parser Combinators stem from functional programming. This paper exploits the ideas of parser combinators and applies them to the C++ programming language. The resulting library is extremely small, flexible and easy to use. The paper contains the complete source code of the resulting parser library. As an example a parser of N. Wirth's language PL/0 is given in terms of the parser library.
... It is worth mentioning that any practical grammar for a Natural Language would be much less ambiguous than the above grammars. In a CNF grammar, each rule has at most two symbols in sequence for each alternative. ...
Thesis
Full-text available
The Internet of Things (IoT) is an emerging phenomenon in the public space. Users with accessibility needs could especially benefit from these “smart” devices if they were able to interact with them through speech. This thesis presents a Compositional Semantics and framework for developing extensible and expressive Natural Language Query Interfaces to the Semantic Web, addressing privacy and auditability needs in the process. This could be particularly useful in healthcare or legal applications, where confidentiality of information is a key concern.