David Stallard

Raytheon BBN Technologies | BBN

About

58
Publications
7,146
Reads
1,056
Citations
Citations since 2017
0 Research Items
213 Citations

Publications (58)
Article
The development of high-performance statistical machine translation (SMT) systems is contingent on the availability of substantial, in-domain parallel training corpora. The latter, however, are expensive to produce due to the labor-intensive nature of manual translation. We propose to alleviate this problem with a novel, semi-supervised, batch-mode...
Article
In this paper we present a speech-to-speech (S2S) translation system called the BBN TransTalk that enables two-way communication between speakers of English and speakers who do not understand or speak English. The BBN TransTalk has been configured for several languages including Iraqi Arabic, Pashto, Dari, Farsi, Malay, Indonesian, and Levantine Ar...
Conference Paper
If unsupervised morphological analyzers could approach the effectiveness of supervised ones, they would be a very attractive choice for improving MT performance on low-resource inflected languages. In this paper, we compare performance gains for state-of-the-art supervised vs. unsupervised morphological analyzers, using a state-of-the-art Arabic-to...
Conference Paper
Full-text available
Arabic Dialects present many challenges for machine translation, not least of which is the lack of data resources. We use crowdsourcing to cheaply and quickly build Levantine-English and Egyptian-English parallel corpora, consisting of 1.1M words and 380k words, respectively. The dialectal sentences are selected from a large corpus of Arabic web te...
Article
A common cause of errors in spoken language systems is the presence of out-of-vocabulary (OOV) words in the input. Named entities (people, places, organizations, etc.) are a particularly important class of OOVs. In this paper we focus on detecting OOV named entities (NEs) for two-way English/Iraqi speech-to-speech translation. Our approach builds o...
Conference Paper
Full-text available
The availability of substantial, in-domain parallel corpora is critical for the development of high-performance statistical machine translation (SMT) systems. Such corpora, however, are expensive to produce due to the labor-intensive nature of manual translation. We propose to alleviate this problem with a novel, semi-supervised, batch-mode active...
Conference Paper
Speech-to-speech translation systems have made a great deal of progress in recent years. But users of such systems still face the problem of not knowing whether the system has translated their utterance correctly. Various confirmation strategies can be used to address this problem. Some of these generate a confirmation utterance for the user to app...
Conference Paper
Full-text available
Production of parallel training corpora for the development of statistical machine translation (SMT) systems for resource-poor languages usually requires extensive manual effort. Active sample selection aims to reduce the labor, time, and expense incurred in producing such resources, attaining a given performance benchmark with the smallest possibl...
Conference Paper
Full-text available
We report on recent improvements in our English/Iraqi Arabic speech-to-speech translation system. User interface improvements include a novel parallel approach to user confirmation which makes confirmation cost-free in terms of dialog duration. Automatic speech recognition improvements include the incorporation of state-of-the-art techniques in fea...
Conference Paper
In this paper, we describe a novel approach that exploits intra-sentence and dialog-level context for improving translation performance on spoken Iraqi utterances that contain named entities (NEs). Dialog-level context is used to predict whether the Iraqi response is likely to contain names and the intra-sentence context is used to determine words...
Article
Speech-to-speech translation (S2S) technology holds out the promise of allowing spoken communication across language barriers. Recently, there has been a great deal of progress in S2S technology, much of it under the sponsorship of DARPA's TransTac program. In this paper, we present BBN's S2S system, "TransTalk", whose development has been funded u...
Conference Paper
We report on recent ASR and MT work on our English/Iraqi Arabic speech-to-speech translation system. We present detailed results for both objective and subjective evaluations of translation quality, along with a detailed analysis and categorization of translation errors. We also present novel ideas for quantifying the relative importance of differe...
Conference Paper
In this paper, we introduce a new metric which we call the semantic translation error rate, or STER, for evaluating the performance of machine translation systems. STER is based on the previously published translation error rate (TER) (Snover et al., 2006) and METEOR (Banerjee and Lavie, 2005) metrics. Specifically, STER extends TER in two ways: fi...
Conference Paper
In this paper we present a speech-to-speech translation system configured for translingual communication in English and colloquial Iraqi on a mobile, handheld device. The end-to-end system employs a medium/large vocabulary n-gram speech recognition engine for recognizing English and colloquial Iraqi, a question canonicalizer for mapping a recognize...
Conference Paper
Full-text available
In this paper, we present a 2-way speech-to-speech translation system for English and Iraqi colloquial Arabic, the dialect of Arabic spoken by ordinary people in Iraq. The application domain of the system is military force protection, including municipal services surveys, detainee screening, and descriptions of people, houses, vehicles, etc. The sy...
Article
Full-text available
We describe and present evaluation results for Talk'n'Travel, a spoken dialogue language system for making air travel plans over the telephone. Talk'n'Travel is a fully conversational, mixed-initiative system that allows the user to specify the constraints on his travel plan in arbitrary order, ask questions, etc., in general spoken English. The sys...
Conference Paper
Full-text available
This paper describes the evaluation methodology and results of the 2001 DARPA Communicator evaluation. The experiment spanned 6 months of 2001 and involved eight DARPA Communicator systems in the travel planning domain. It resulted in a corpus of 1242 dialogs which include many more dialogues for complex tasks than the 2000 evaluation. We describ...
Article
this document, the user will get fired. This means that in every possible future where the system fails to print this document, the user gets fired. This research is by Andrew Haas, who is also working on a planning program that uses these ideas
Article
Full-text available
We present a natural language interface system which is based entirely on trained statistical models. The system consists of three stages of processing: parsing, semantic interpretation, and discourse. Each of these stages is modeled as a statistical process. The models are fully integrated, resulting in an end-to-end system that maps input utteran...
Article
Full-text available
We propose a distinction between two kinds of metonymy: "referential" metonymy, in which the referent of an NP is shifted, and "predicative" metonymy, in which the referent of the NP is unchanged and the argument place of the predicate is shifted instead. Examples are, respectively, "The hamburger is waiting for his check" and "Which airlines fl...
Article
Full-text available
We describe Talk'n'Travel, a spoken dialogue language system for making air travel plans over the telephone.
Article
Full-text available
We present a computational treatment of the semantics of plural Noun Phrases which extends an earlier approach presented by Scha [7] to be able to deal with multiple-level plurals ("the boys and the girls", "the juries and the committees", etc.). We argue that the arbitrary depth to which such plural structures can be nested creates a correspond...
Article
Full-text available
the syntactically impossible antecedents. This latter for handling bound anaphora, disjoint reference, and pronominal reference. The algorithm maps over every node in a parse tree in a left-to-right, depth-first manner. Forward and backwards coreference, and disjoint reference are assigned during this tree walk. A semantic interpretation procedure...
Article
Full-text available
A new method is presented for simplifying the logical expressions used to represent utterance meaning in a natural language system. This simplification method utilizes the encoded knowledge and the limited inference-making capability of a taxonomic knowledge representation system to reduce the constituent structure of logical expressions. The spe...
Article
This paper describes results of an experiment with 9 different DARPA Communicator Systems that participated in the June 2000 data collection. All systems supported travel planning and utilized some form of mixed-initiative interaction. However, they varied in several critical dimensions: (1) They targeted different back-end databases for travel infor...
Conference Paper
A central problem for mixed-initiative dialogue management is coping with user utterances that fall outside of the expected sequence of dialogue. Independent initiative by the user may require a complete revision of the future course of the dialogue, even when the system is engaged in activities of its own, such as querying a database, etc. This pa...
Conference Paper
Full-text available
We describe the first sentence understanding system that is completely based on learned methods both for understanding individual sentences, and determining their meaning in the context of preceding sentences. We divide the problem into three stages: semantic parsing, semantic classification, and discourse modeling. Each of these stages requires a...
Conference Paper
Full-text available
Describes a sentence understanding system that is completely based on learned methods both for understanding individual sentences and for determining their meaning in the context of the preceding sentences. We describe the models used for each of three stages in the understanding: semantic parsing, semantic classification and discourse modeling. Wh...
Conference Paper
Full-text available
The design and performance of a complete spoken language understanding system under development at BBN are described. The system, dubbed HARC (Hear And Respond to Continuous speech), successfully integrates state-of-the-art speech recognition and natural language understanding subsystems. The system has been tested extensively on a restricted airli...
Conference Paper
Full-text available
This paper presents the Semantic Linker, the fallback component used by the DELPHI natural language component of the BBN spoken language system HARC. The Semantic Linker is invoked when DELPHI's regular chart-based unification grammar parser is unable to parse an input; it attempts to come up with a semantic interpretation by combining the frag...
Conference Paper
Full-text available
We have recently made significant changes to the BBN DELPHI syntactic and semantic analysis component. The goal of these changes was to maintain the tight coupling between syntax and semantics characteristic of earlier versions of DELPHI, while making it possible for the system to provide useful semantic interpretations of input for which complet...
Article
Full-text available
We present results from the February '92 evaluation on the ATIS travel planning domain for HARC, the BBN spoken language system (SLS). In addition, we discuss in detail the individual performance of BYBLOS, the speech recognition (SPREC) component. In the official scoring, conducted by NIST, BBN's HARC system produced a weighted SLS score of 43.7 on...
Article
Full-text available
This paper presents the fallback understanding component of BBN's DELPHI NL system. This component is invoked when the core DELPHI system is unable to understand an input. It incorporates both syntax- and frame-based fragment combination sub-components, in an attempt to provide a smoother path from accurate but fragile conventional parsers on the...
Conference Paper
Full-text available
We present the "mapping unit" approach to representing subcategorization information, a computational framework for encoding subcategorization information which has been developed and implemented for BBN's DELPHI system (the NL component of the HARC spoken language system). The advantage of our approach to subcategorization lies in its flexibil...
Conference Paper
Full-text available
This paper presents the test results of running BBN's HARC spoken language system and DELPHI natural language understanding system on the ATIS benchmarks. We give a brief system overview, and review the major changes that have
Conference Paper
This paper reports recent progress on the development of the Delphi natural language component of the BBN spoken language system for the ATIS domain, focussing on the comparative evaluation performed by NIST in June, 1990.
Article
Full-text available
This paper presents recent natural language work on HARC, the BBN Spoken Language System. The HARC system incorporates the Byblos system (6) as its speech recognition component and the natural language system Delphi, which consists of a bottom-up parser paired with an integrated syntax/semantics unification grammar, a discourse module, and a da...
Article
Full-text available
We describe HARC, a system for speech understanding that integrates speech recognition techniques with natural language processing. The integrated system uses statistical pattern recognition to build a lattice of potential words in the input speech. This word lattice is passed to a unification parser to derive all possible associated syntactic stru...
Article
Full-text available
This paper describes the current state of work on unification-based semantic interpretation in HARC (for Hear and Recognize Continuous speech), the BBN Spoken Language System. It presents the implementation of an integrated syntax/semantics grammar written in a unification formalism similar to Definite Clause Grammar. This formalism is described, and...
Article
Full-text available
Theories of semantic interpretation which wish to capture as many generalizations as possible must face up to the manifoldly ambiguous and contextually dependent nature of word meaning. In this paper I present a two-level scheme of semantic interpretation in which the first level deals with the semantic consequences of syntactic structure and the s...
Conference Paper
Full-text available
BBN's responsibility is to conduct research and development in natural language interface technology. This responsibility has three aspects:
• to demonstrate state-of-the-art technology in a Strategic Computing application, collecting data regarding the effectiveness of the demonstrated heuristics,
• to conduct research in natural language interface...
Article
Full-text available
Significant advances have been achieved in Speech-to-Speech (S2S) translation systems in recent years. However, rapid configuration of S2S systems for low-resource language pairs and domains remains a challenging problem due to the lack of human-translated bilingual training data. In this paper, we report on an effort to port our existing English/Iraqi...
