Article · PDF Available

The Lunar Science Natural Language Information System: Final Report

Authors:
  • A9.com (Amazon)
... Natural language interfaces (NLIs) have been the "holy grail" of natural language understanding and human-computer interaction for decades (Woods et al., 1972; Codd, 1974; Hendrix et al., 1978; Zettlemoyer and Collins, 2005). However, early attempts at building NLIs to databases did not achieve the expected success due to limitations in language understanding capability, among other reasons (Androutsopoulos et al., 1995; Jones and Galliers, 1995). ...
... Text-to-SQL Parsing: Natural language to SQL (natural language interfaces to databases) has been an active field of study for several decades (Woods et al., 1972; Hendrix et al., 1978; Warren and Pereira, 1982; Popescu et al., 2003; Li and Jagadish, 2014). This line of work has been receiving increased attention recently, driven in part by the development of new large-scale datasets such as WikiSQL (Zhong et al., 2017) and Spider (Yu et al., 2018b). ...
Preprint
We study the task of semantic parse correction with natural language feedback. Given a natural language utterance, most semantic parsing systems pose the problem as one-shot translation, where the utterance is mapped to a corresponding logical form. In this paper, we investigate a more interactive scenario in which humans can further interact with the system by providing free-form natural language feedback to correct the system when it generates an inaccurate interpretation of an initial utterance. We focus on natural language to SQL systems and construct SPLASH, a dataset of utterances, incorrect SQL interpretations, and the corresponding natural language feedback. We compare various reference models for the correction task and show that incorporating such a rich form of feedback can significantly improve the overall semantic parsing accuracy while retaining the flexibility of natural language interaction. While our estimated human correction accuracy is 81.5%, our best model achieves only 25.1%, which leaves a large gap for improvement in future research. SPLASH is publicly available at https://aka.ms/Splash_dataset.
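The correction task described above can be illustrated with a minimal sketch. The record fields, toy schema, and string-swap baseline here are hypothetical stand-ins, not the paper's actual data format or models:

```python
from dataclasses import dataclass

@dataclass
class CorrectionExample:
    # One SPLASH-style record: the task maps
    # (question, incorrect parse, feedback) -> corrected parse.
    question: str       # initial natural language utterance
    predicted_sql: str  # the parser's (incorrect) first attempt
    feedback: str       # free-form natural language correction
    gold_sql: str       # reference corrected query

ex = CorrectionExample(
    question="Show the names of all students",
    predicted_sql="SELECT id FROM students",
    feedback="you should show the name, not the id",
    gold_sql="SELECT name FROM students",
)

def naive_correct(example: CorrectionExample) -> str:
    # Toy baseline: if the feedback mentions a schema column that is
    # absent from the predicted query, swap it into the SELECT clause.
    # The reference models in the paper are far more involved.
    schema = {"id", "name", "age"}
    mentioned = [w.strip(",.") for w in example.feedback.lower().split()]
    for col in mentioned:
        if col in schema and col not in example.predicted_sql:
            _, _, tail = example.predicted_sql.partition(" FROM ")
            return f"SELECT {col} FROM {tail}"
    return example.predicted_sql

print(naive_correct(ex))  # SELECT name FROM students
```

Even this crude rule shows why feedback helps: it narrows the correction to a local edit instead of re-parsing the utterance from scratch.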
... Text-to-SQL Parsing: Natural language to SQL (natural language interfaces to databases) has been an active field of study for several decades (Woods et al., 1972; Hendrix et al., 1978; Warren and Pereira, 1982; Popescu et al., 2003; Li and Jagadish, 2014), driven recently by new large-scale datasets such as WikiSQL (Zhong et al., 2017) and Spider (Yu et al., 2018b). The majority of this work has focused on mapping a single query to the corresponding SQL, with the exception of a few datasets, e.g., SParC and CoSQL (Yu et al., 2019a), that target inducing SQL parses for sequentially related questions. ...
... Language as an interface for interactions. Natural language understanding (NLU) has been an important direction for human-computer interaction and information search for decades [22,40,87]. The recent impressive advances in NLU capabilities [1,14,20,28,63,75], powered by large-scale deep learning, together with increasing demand for new applications, have led to a major resurgence of natural language interfaces in the form of virtual assistants, dialog systems, conversational search, semantic parsing, and question answering systems [29,60,61,92]. ...
Preprint
Current interactive systems with natural language interfaces lack the ability to understand a complex information-seeking request that expresses several implicit constraints at once, with no prior information about user preferences, e.g., "find hiking trails around San Francisco which are accessible with toddlers and have beautiful scenery in summer", where the output is a list of possible suggestions from which users can start their exploration. In such scenarios, the user request can be issued at once in the form of a long, complex query, unlike in conversational and exploratory search models, which require short utterances or queries fed into the system step by step. This gives the end user more flexibility and precision in expressing their intent through the search process. Such systems are inherently helpful for day-to-day user tasks requiring planning, which are usually time-consuming, sometimes tricky, and cognitively taxing. We have designed and deployed a platform to collect data on how users approach such complex interactive systems. In this paper, we propose an Interactive Agent (IA) that refines intricate user requests by making them complete, which should lead to better retrieval. To demonstrate the performance of the proposed modeling paradigm, we adopt various pre-retrieval metrics that capture the extent to which guided interactions with our system yield better retrieval results. Through extensive experimentation, we demonstrate that our method significantly outperforms several robust baselines.
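Pre-retrieval metrics of the kind mentioned above judge a query before any documents are retrieved. One classic predictor is the average inverse document frequency (IDF) of the query terms; the corpus and the specific predictor below are illustrative assumptions, not necessarily the metrics used in the paper:

```python
import math

# Toy corpus standing in for a document collection.
corpus = [
    "hiking trails near san francisco",
    "best restaurants in san francisco",
    "toddler friendly hiking trails with scenery",
    "summer weather in california",
]

def avg_idf(query: str) -> float:
    # Average smoothed IDF of the query terms. Higher values suggest
    # a more specific query, which tends to correlate with better
    # retrieval, before running any retrieval at all.
    docs = [set(d.split()) for d in corpus]
    n = len(docs)
    idfs = []
    for term in query.lower().split():
        df = sum(term in d for d in docs)
        idfs.append(math.log((n + 1) / (df + 1)))
    return sum(idfs) / len(idfs)

# A refined, more constrained request scores higher on this predictor
# than a vague one-word query.
print(avg_idf("hiking"), avg_idf("toddler friendly hiking scenery"))
```

Under this kind of predictor, an interaction that elicits extra constraints from the user ("with toddlers", "in summer") measurably sharpens the query before retrieval runs.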
... Natural language to SQL: Natural language interfaces to databases have been an active field of study for many years (Woods et al., 1972; Warren and Pereira, 1982; Popescu et al., 2003; Li and Jagadish, 2014). The development of new large-scale datasets, such as WikiSQL (Zhong et al., 2017) and SPIDER (Yu et al., 2018b), has reignited the interest in this area, with several new models introduced recently (Choi et al., 2020; Wang et al., 2020; Scholak et al., 2020). ...
Preprint
We study semantic parsing in an interactive setting in which users correct errors with natural language feedback. We present NL-EDIT, a model for interpreting natural language feedback in the interaction context to generate a sequence of edits that can be applied to the initial parse to correct its errors. We show that NL-EDIT can boost the accuracy of existing text-to-SQL parsers by up to 20% with only one turn of correction. We analyze the limitations of the model and discuss directions for improvement and evaluation. The code and datasets used in this paper are publicly available at http://aka.ms/NLEdit.
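The edit-based idea above can be sketched in a few lines. The operation names and string-level representation here are hypothetical simplifications; NL-EDIT's actual edit vocabulary is defined over a structured SQL representation:

```python
# Instead of regenerating the whole query, the model emits a short
# sequence of edit operations that transform the initial (incorrect)
# parse into the corrected one.
def apply_edits(sql_tokens, edits):
    tokens = list(sql_tokens)
    for op, *args in edits:
        if op == "replace":   # replace one token with another
            old, new = args
            tokens = [new if t == old else t for t in tokens]
        elif op == "append":  # add a clause at the end
            tokens.extend(args[0].split())
    return " ".join(tokens)

initial = "SELECT id FROM students".split()
edits = [("replace", "id", "name"),
         ("append", "ORDER BY name")]
print(apply_edits(initial, edits))
# SELECT name FROM students ORDER BY name
```

Representing a correction as a handful of edits rather than a full re-parse is what lets a single feedback turn repair an otherwise mostly correct query.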
... SHRDLU (Winograd, 1971) was a simulated robot that used natural language to query and manipulate objects inside a very simple virtual micro-world consisting of a number of colored blocks and pyramids. LUNAR (Woods et al., 1972) was developed as an interface system to a database of information about lunar rock samples, using an augmented transition network grammar. Lastly, PARRY (Colby, 1974) attempted to simulate a person with paranoid schizophrenia based on concepts, conceptualizations, and beliefs. ...
Thesis
Full-text available
Obtaining accurate information about products in a fast and efficient way is becoming increasingly important at Cisco as the related documentation rapidly grows. Thanks to recent progress in natural language processing (NLP), extracting valuable information from general-domain documents has gained in popularity, and deep learning has boosted the development of effective text mining systems. However, directly applying the advancements in NLP to domain-specific documentation might yield unsatisfactory results due to a word distribution shift from general-domain language to domain-specific language. Hence, this thesis aims to determine whether a large language model pre-trained on domain-specific (computer networking) text corpora improves performance over the same model pre-trained exclusively on general-domain text, when evaluated on in-domain text mining tasks. To this end, we introduce NetBERT (Bidirectional Encoder Representations from Transformers for Computer Networking), a domain-specific language representation model based on BERT (Devlin et al., 2018) and pre-trained on large-scale computer networking corpora. Through several extrinsic and intrinsic evaluations, we compare the performance of our novel model against the domain-general BERT. We demonstrate clear improvements over BERT on the following two representative text mining tasks: networking text classification (0.9% F1 improvement) and networking information retrieval (12.3% improvement on a custom retrieval score). Additional experiments on word similarity and word analogy tend to show that NetBERT captures more meaningful semantic properties and relations between networking concepts than BERT does. We conclude that pre-training BERT on computer networking corpora helps it understand domain-related text more accurately.
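Domain-adaptive pre-training of the kind described above reuses BERT's masked-language-model objective on in-domain text. This sketch reproduces only the token-masking step (BERT masks 15% of tokens; of those, 80% become [MASK], 10% a random token, 10% are kept); tokenization and the model itself are omitted, and the toy vocabulary is an assumption:

```python
import random

def mask_tokens(tokens, vocab, rng, mask_rate=0.15):
    # BERT-style masking: each selected position keeps its original
    # token as the prediction label; unselected positions are not scored.
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_rate:
            labels.append(tok)            # model must predict this token
            r = rng.random()
            if r < 0.8:
                inputs.append("[MASK]")   # 80%: mask token
            elif r < 0.9:
                inputs.append(rng.choice(vocab))  # 10%: random token
            else:
                inputs.append(tok)        # 10%: keep as-is
        else:
            inputs.append(tok)
            labels.append(None)           # not scored in the loss
    return inputs, labels

rng = random.Random(0)
vocab = ["router", "switch", "packet", "vlan"]
tokens = "the router forwards each packet to the correct vlan".split()
inputs, labels = mask_tokens(tokens, vocab, rng)
assert len(inputs) == len(tokens)
```

Running exactly this objective over networking corpora (rather than general text) is what shifts the learned word distribution toward the domain.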
... More traditional, semantic parsing methods map questions to compositional programs, whose sub-programs can be viewed as question decompositions in a formal language (Talmor & Berant, 2018; Wolfson et al., 2020). Examples include classical QA systems like SHRDLU (Winograd, 1972) and LUNAR (Woods et al., 1974), as well as neural Seq2Seq semantic parsers (Dong & Lapata, 2016) and neural module networks (Andreas et al., 2015). Such methods usually require strong, program-level supervision to generate programs, as in visual QA (Johnson et al., 2017b) and on HOTPOTQA (Jiang & Bansal, 2019b). ...
Preprint
We aim to improve question answering (QA) by decomposing hard questions into easier sub-questions that existing QA systems can answer. Since collecting labeled decompositions is cumbersome, we propose an unsupervised approach to produce sub-questions. Specifically, by leveraging >10M questions from Common Crawl, we learn to map from the distribution of multi-hop questions to the distribution of single-hop sub-questions. We answer sub-questions with an off-the-shelf QA model and incorporate the resulting answers in a downstream, multi-hop QA system. On a popular multi-hop QA dataset, HotpotQA, we show large improvements over a strong baseline, especially on adversarial and out-of-domain questions. Our method is generally applicable and automatically learns to decompose questions of different classes, while matching the performance of decomposition methods that rely heavily on hand-engineering and annotation.
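The decompose-then-answer pipeline above can be sketched end to end. The hand-written decomposer and lookup-table "QA model" here are hypothetical stand-ins for the paper's learned, unsupervised mapping and the off-the-shelf single-hop model:

```python
def decompose(question):
    # Stand-in decomposer for one "bridge" question pattern,
    # e.g. "Where was the author of X born?"
    if "author of" in question and "born" in question:
        topic = question.split("author of ")[1].rstrip("?").split(" born")[0].strip()
        return [f"Who is the author of {topic}?",
                "Where was [ANS1] born?"]
    return [question]

def single_hop_qa(q, facts):
    # Stand-in for an off-the-shelf single-hop QA model.
    return facts.get(q, "unknown")

facts = {
    "Who is the author of Hamlet?": "Shakespeare",
    "Where was Shakespeare born?": "Stratford-upon-Avon",
}

subqs = decompose("Where was the author of Hamlet born?")
ans1 = single_hop_qa(subqs[0], facts)                       # first hop
final = single_hop_qa(subqs[1].replace("[ANS1]", ans1), facts)  # second hop
print(final)  # Stratford-upon-Avon
```

The point of the chain is that each sub-question is single-hop, so an existing QA model can answer it even when it would fail on the composed multi-hop question.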
... [Figure: A schematic ATN grammar for transitive, intransitive, and full-passive sentences] ... the first version of the system (Woods and Kaplan 1971). Bonnie Webber joined us and extended the semantics and other features for the second version (Woods, Kaplan, and Nash-Webber 1972). Lunar and its ATN parser and grammar are described at greater length in Bill's LTA acceptance paper (Woods 2006) and elsewhere. ...
Article
Full-text available
Chapter
Today's corporate databases are so huge that they can only be approached by experienced programmers. Accessing data from a database usually requires notable skills, such as knowledge of SQL; however, most of us who interact with databases every day do not have that background. Hence, there is an increasing demand for non-technical users to be able to retrieve data from databases without having to write SQL queries, and this problem can be addressed using natural language processing. This research work presents an approach for a natural language querying system for databases, which can dramatically simplify the process of handling large data and make data available to everyone.
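The input/output contract of such a querying system can be shown with a minimal rule-based sketch over one fixed, invented schema. Real NLIDB systems use semantic parsing or neural models; the patterns and table below are assumptions for illustration only:

```python
import re

# Hypothetical single-table schema for the sketch.
SCHEMA = {"employees": ["name", "salary", "department"]}

def nl_to_sql(question: str) -> str:
    # Pick the projected column from any schema column mentioned in
    # the question; fall back to SELECT *.
    q = question.lower()
    m = re.search(r"(name|salary|department)s?\b", q)
    column = m.group(1) if m else "*"
    sql = f"SELECT {column} FROM employees"
    # One hard-coded filter pattern: "... department is <value>".
    m = re.search(r"department is (\w+)", q)
    if m:
        sql += f" WHERE department = '{m.group(1)}'"
    return sql

print(nl_to_sql("List the names of employees whose department is sales"))
# SELECT name FROM employees WHERE department = 'sales'
```

Even this toy shows the shape of the task: a free-form question in, a well-formed SQL query out, with no SQL knowledge required of the user.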