Pierre Isabelle
  • Ph. D. Comput. Linguistics
  • Principal Investigator at National Research Council Canada

About

59
Publications
9,114
Reads
1,777
Citations
Current institution
National Research Council Canada
Current position
  • Principal Investigator

Publications (59)
Preprint
Full-text available
We present a challenge set for French --> English machine translation based on the approach introduced in Isabelle, Cherry and Foster (EMNLP 2017). Such challenge sets are made up of sentences that are expected to be relatively difficult for machines to translate correctly because their most straightforward translations tend to be linguistically di...
Article
Full-text available
Techniques for generating and recognizing paraphrases, i.e., semantically equivalent expressions, play an important role in a wide range of natural language processing tasks. In the last decade, the task of automatic acquisition of subsentential paraphrases, i.e., words and phrases with (approximately) the same meaning, has been drawing much attent...
Article
Full-text available
Neural machine translation represents an exciting leap forward in translation quality. But what longstanding weaknesses does it resolve, and which remain? We address these questions with a challenge set approach to translation evaluation and error analysis. A challenge set consists of a small set of sentences, each hand-designed to probe a system's...
Patent
Full-text available
This application is related to a means and a method for facilitating the use of translation memories by aligning the words of an input source-language sentence with the corresponding translated words in the target-language sentence. More specifically, this invention relates to such a means and method where there is an enhanced translation memory comprising...
Conference Paper
Full-text available
This paper presents a paraphrase acquisition method that uncovers and exploits generalities underlying paraphrases: paraphrase patterns are first induced and then used to collect novel instances. Unlike existing methods, ours uses both bilingual parallel and monolingual corpora. While the former are regarded as a source of high-quality seed paraphr...
Conference Paper
Translation is a key capability to access relevant information expressed in various languages on social media. Unfortunately, systematically translating all content far exceeds the capacity of most organizations. Computer-aided translation (CAT) tools can significantly increase the productivity of translators, but cannot ultimately cope with the o...
Conference Paper
Full-text available
In this paper, we show how a large bilingual English-French parallel corpus can be brought to bear in terminology search. First, we demonstrate that the coverage of available corpora has become substantially more extensive than that of mainstream term banks. One potential drawback in searching large unstructured corpora is that large numbers of s...
Article
Full-text available
We investigate the possibility of automatically detecting whether a piece of text is an original or a translation. On a large parallel English-French corpus where reference information is available, we find that this is possible with around 90% accuracy. We further study the implication this has on Machine Translation performance. After separating...
Article
Full-text available
We propose to use a statistical phrase-based machine translation system in a post-editing task: the system takes as input raw machine translation output (from a commercial rule-based MT system), and produces post-edited target-language text. We report on experiments that were performed on data collected in precisely such a setting: pairs of raw MT...
Article
Full-text available
This article describes a machine translation system based on an automatic post-editing strategy: initially translate the input text into the target-language using a rule-based MT system, then automatically post-edit the output using a statistical phrase-based system. An implementation of this approach based on the SYSTRAN and PORTAGE MT systems was...
Article
Full-text available
It is generally acknowledged that the performance of rule-based machine translation (RBMT) systems can be greatly improved through domain-specific system adaptation. To that end, RBMT users often choose to invest significant resources into the development of ad hoc MT dictionaries. In this paper, we demonstrate that comparable customization effects...
Article
Machine translation – the use of computers to translate automatically among human languages – is an alluring prospect, one that for more than 50 years has fascinated researchers, inspired idealists and opportunists, and provoked unease among professional translators. This article gives a broad survey of this diverse and active field. It begins with...
Article
Full-text available
Lexical Grammars are a class of unification grammars which share a fixed rule component, for which there exists a simple left-recursion elimination transformation. The parsing and generation programs are seen as two dual non-left-recursive versions of the original grammar, and are implemented through a standard top-down Prolog interpreter. Formal c...
Article
Full-text available
We argue that the conventional approach to Interactive Machine Translation is not the best way to provide assistance to skilled translators, and propose an alternative whose central feature is the use of the target text as a medium of interaction. We describe an automatic word-completion system intended to serve as a vehicle for exploring the fea...
Article
Full-text available
There is an increasing need for document search mechanisms capable of matching a natural language query with documents written in a different language. Recently, we conducted several experiments aimed at comparing various methods of incorporating a cross-linguistic capability to existing information retrieval (IR) systems. Our results indicate tha...
Article
d on a daily basis for the Canadian Environment Department (Chandioux and Guéraud 1981). Its current workload represents an annual volume of 8.5 million words (Bourbeau 1984). In spite of its very narrow scope, TAUM-METEO represents an important breakthrough in MT, since it is the only system that currently produces high quality translation without...
Article
Research on a number of developments in language technologies, targeted at improving patent processing procedures within patent offices and in subsequent patent database search systems, is described. Aspects of patent processing covered are (1) OCR correction, to assist the conversion of paper documents to electronic versions, and (2) text classifi...
Article
Full-text available
This book is intended for researchers who want to keep abreast of current developments in corpus-based natural language processing. It captures the essence of a series of highly successful workshops organized over the last few years. The papers cover a range of current research topics in this field including part-of-speech tagging, word sense disam...
Article
Full-text available
This paper describes the use of a probabilistic translation model to cross-language IR (CLIR). The performance of this approach is compared with that using machine translation (MT). It is shown that using a probabilistic model, we are able to obtain performances close to those using an MT system. In addition, we also investigated the possibility of...
Conference Paper
Full-text available
This paper describes the use of a probabilistic translation model to cross-language IR (CLIR). The performance of this approach is compared with that using machine translation (MT). It is shown that using a probabilistic model, we are able to obtain performances close to those using an MT system. In addition, we also investigated the possibility of...
Book
ABOUT THIS BOOK This book is intended for researchers who want to keep abreast of current developments in corpus-based natural language processing. It is not meant as an introduction to this field; for readers who need one, several entry-level texts are available, including those of (Church and Mercer, 1993; Charniak, 1993; Jelinek, 1997). This b...
Article
This paper describes the work achieved in the first half of a 4-year cooperative research project (ARCADE), financed by AUPELF-UREF. The project is devoted to the evaluation of parallel text alignment techniques. In its first period ARCADE ran a competition between six systems on a sentence-to-sentence alignment task which yielded two main types of re...
Article
Full-text available
While machine translation can successfully tackle some highly restricted sublanguages, it is in most cases more productive to turn to support tools for human translators. The functions taken over by existing translator's workstations are rather peripheral with respect to the core aspects of the translation task. However, recent developments show th...
Article
Full-text available
The use of machine translation as a tool for professional or other highly skilled translators is for the most part currently limited to postediting arrangements in which the translator invokes MT when desired and then manually cleans up the results. A theoretically promising but hitherto largely unsuccessful alternative to postedition for this ap...
Article
Full-text available
In a recent paper, Gale and Church describe an inexpensive method for aligning bitext, based exclusively on sentence lengths [Gale and Church, 1991]. While this method produces surprisingly good results (a success rate around 96%), even better results are required to perform such tasks as the computer-assisted revision of translations. In this pape...
Article
Full-text available
ingual texts. This simple result turns out to be of fundamental importance from the point of view of MAHT. It constitutes in itself a suitable foundation for many kinds of new translation support tools. More on this below. 5. Why should there be such a difference between the two paradigms? The explanation, I think, is as follows. Rule-based MT tend...
Article
Full-text available
This paper describes a system designed for use by professional translators that enables them to dictate their translation. Because the speech recognizer has access to the source text as well as the spoken translation, a statistical translation model can guide recognition. This can be done in many different ways---which is best? We discuss the exper...
Article
en using systems like SYSTRAN since the early sixties. Moreover, this kind of application is now becoming a mass-market business under the guise of "browsing tools" for the Internet. For example, Web Translator, a multilingual Netscape-integrated MT system is now available from Globalink for less than $50. No doubt, countless new users will be dazz...
Article
Full-text available
Professional translators often dictate their translations orally and have them typed afterwards. The TransTalk project aims at automating the second part of this process. Its originality as a dictation system lies in the fact that both the acoustic signal produced by the translator and the source text under translation are made available to the sys...
Conference Paper
Full-text available
We argue that the concept of translation analysis provides a suitable foundation for a new generation of translation support tools. We show that pre-existing translations can be analyzed into a structured translation memory and describe our TransSearch bilingual concordancing system, which allows translators to harness such a memory. We claim that...
Article
Somers' paper falls short of its declared aim of demonstrating that: 1) G2 principles of modularity are inadequate; 2) current knowledge-based and linguistics-based MT research is wrong in adhering to these modularity principles; and 3) clearcut alternatives to these principles have now emerged. Finally, I would like to point out that Somers' revie...
Article
Full-text available
Abstract: Current translator's workstations still focus too little on the properly translational aspects of the translator's task. We show that the concept of bi-text opens up new possibilities in this regard. A bi-text consists of a pair of texts (a source and its translation) linked by a representatio...
Conference Paper
Full-text available
Lexical Grammars are a class of unification grammars which share a fixed rule component, for which there exists a simple left-recursion elimination transformation. The parsing and generation programs are seen as two dual non-left-recursive versions of the original grammar, and are implemented through a standard top-down Prolog interpreter. Formal c...
Conference Paper
Full-text available
The CRITTER system is being developed to translate agricultural market reports between English and French. It is based on a transfer model, and designed to be reversible. The source and target language texts are described by means of: a) a surface syntactic representation consisting of a tree annotated with feature structures, built by an extraposi...
Conference Paper
Full-text available
The transfer components of typical second generation (G2) MT systems do not fully conform to the principles of G2 modularity, incorporating extensive target language information while failing to separate translation facts from linguistic theory. The exclusion from transfer of all non-contrastive information leads us to a system design in which the...
Article
Full-text available
Upon the completion of its highly successful TAUM-METEO machine translation system, the TAUM group undertook the construction of TAUM-AVIATION, an experimental system for English to French translation in the sublanguage of technical maintenance manuals. A detailed description of the resulting prototype is offered. In particular, the paper includes:...
Article
This document contains the instructions for preparing a camera-ready manuscript for the proceedings of MT-Summit XII. The document itself conforms to its own specifications, and is therefore an example of what your manuscript should look like. Authors are asked to conform to all the directions reported in this document.
Article
Full-text available
We explore the problem of integrating a phrase-based MT system within a computer-assisted translation (CAT) environment. We argue that one way of achieving successful integration is to design an MT system that behaves more like the translation memory (TM) component of CAT systems. This implies producing MT output that is consistent with that...
Article
Full-text available
Machine Translation traditionally treats documents as sets of independent sentences. In many genres, however, documents are highly structured, and their structure contains information that can be used to improve translation quality. We present a preliminary approach to document translation that uses structural features to modify the behaviour...
Article
Full-text available
The CRITTER translation system makes use of a single grammar to perform analysis and synthesis tasks. The formalism used is a variant of DCG (Definite Clause Grammars), in which annotations have been added to allow for dual compilations of the grammar into analysis and synthesis Prolog programs sharing the same declarative content. These annotation...
Article
Thesis (M.A.), Université du Québec à Montréal, 1978. Includes bibliographical references.
