Table 4 - uploaded by Kazuhide Yamamoto
Examples of correct paraphrasing
Source publication
We propose a method of paraphrasing a Japanese noun modifier into a noun phrase in the form of "A no B." The semantic structures of "A no B" are sometimes recognized by supplementing some abbreviated predicate. We define these abbreviated verbs as "deletable verbs" in two ways: 1. We choose verbs matched with the semantic relations of "A no B" by usi...
Similar publications
Media coverage of the vegetative state (VS) includes refutations of the VS diagnosis and describes behaviors inconsistent with VS. We used a quality score to assess the reporting in articles describing the medical characteristics of VS in Italian newspapers.
Our search covered a 7-month period from July 1, 2008, to February 28, 2009, using the onli...
This paper presents our first experiments aimed at the automatic selection of the relevant documents for the blind relevance feedback method in speech information retrieval. Usually the relevant documents are selected only by simply determining the first N documents to be relevant. We consider this approach to be insufficient and we would try in th...
The main purpose of the article is to analyze how effectively different types of thesaurus relations can be used for solutions of text classification tasks. The basis of the study is an automatically generated thesaurus of a subject area, that contains three types of relations: synonymous, hierarchical and associative. To generate the thesaurus the...
Objective
To investigate the replication validity of biomedical association studies covered by newspapers.
Methods
We used a database of 4723 primary studies included in 306 meta-analysis articles. These studies associated a risk factor with a disease in three biomedical domains, psychiatry, neurology and four somatic diseases. They were classifie...
Citations
... To address this predicament, many techniques for automatic text summarization systems have been developed in the last decade. Summarization systems for different languages, such as French (Lehmam, 1999), German (Reimer & Hahn 1990), Chinese (Chen, Kuo, Huang, Lin, & Wung, 2003), Japanese (Kataoka, Masuyama, & Yamamoto, 1999), and Korean (Myaeng & Jang, 1999), have also been explored. Unfortunately, most of these systems are monolingual summarization systems. ...
Based on the salient features of the documents, automatic text summarization systems extract the key sentences from source documents. This process supports the users in evaluating the relevance of the extracted documents returned by information retrieval systems. Because of this tool, efficient filtering can be achieved. Indirectly, these systems help to resolve the problem of information overloading. Many automatic text summarization systems have been implemented for use with different languages. It has been established that the grammatical and lexical differences between languages have a significant effect on text processing. However, the impact of the language differences on the automatic text summarization systems has not yet been investigated. The authors provide an impact analysis of language difference on automatic text summarization. It includes the effect on the extraction processes, the scoring mechanisms, the performance, and the matching of the extracted sentences, using the parallel corpus in English and Chinese as the tested object. The analysis results provide a greater understanding of language differences and promote the future development of more advanced text summarization techniques.
... Some cases of paraphrasing research with certain targets have been reported. For example, there has been work on rewriting the source language in machine translation with a focus on reducing syntactic ambiguities (Shirai et al., 1993), research on paraphrasing paper titles with a focus on transforming syntactic structures to achieve readability (Sato, 1999), and research on paraphrasing Japanese in summarization with a focus on transforming a noun modifier into a noun phrase (Kataoka et al., 1999). We have reported some research on Chinese paraphrasing (). ...
... In fact, the paraphrasing can be conducted at many different levels, for instance, words, phrases, or larger constituents. Although the paraphrasing of such constituents is probably related to context, it is not true that paraphrasing is impossible without being able to understand the whole sentence (Kataoka et al., 1999). The paraphrasing process encounters the following problems. ...
One of the key issues in spoken language translation is how to deal with unrestricted expressions in spontaneous utterances. This research is centered on the development of a Chinese paraphraser that automatically paraphrases utterances prior to transfer in Chinese-Japanese spoken language translation. In this paper, a pattern-based approach to paraphrasing is proposed for which only morphological analysis is required. In addition, a pattern construction method is described through which paraphrasing patterns can be efficiently learned from a paraphrase corpus and human experience. Using the implemented paraphraser and the obtained patterns, a paraphrasing experiment was conducted and the results were evaluated.
... Consequently, it is necessary to conduct a feasibility study on collecting various kinds of paraphrase knowledge from non-paraphrase corpora, particularly from raw text corpora. Although we have already reported extracting paraphrasing knowledge of Japanese noun modifiers from a raw corpus (Kataoka et al., 1999), we need to explore other types of expressions. With this motivation, we have attempted to acquire paraphrasing knowledge on content words, mainly nouns and verbs. ...
Automatic acquisition of paraphrase knowledge for content words is proposed. Using only a non-parallel text corpus, we compute the paraphrasability metrics between two words from their similarity in context. We then filter words such as proper nouns from external knowledge. Finally, we use a heuristic in further filtering to improve the accuracy of the automatic acquisition. In this paper, we report the results of acquisition experiments.
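The idea of scoring two words by their similarity in context can be illustrated with a minimal sketch. The abstract does not state which similarity measure or context definition is used, so the cosine measure, the window size, the function names, and the toy English corpus below are all assumptions made for illustration only.

```python
from collections import Counter
from math import sqrt

def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it.
    The window size is an assumed parameter, not taken from the paper."""
    vectors = {}
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            ctx = vectors.setdefault(word, Counter())
            for j in range(lo, hi):
                if j != i:
                    ctx[tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity of two sparse count vectors (an assumed metric)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Toy, non-parallel corpus (hypothetical data).
corpus = [
    "we buy a ticket at the station".split(),
    "we purchase a ticket at the station".split(),
    "the station is near the hotel".split(),
]
vecs = context_vectors(corpus)
score_similar = cosine(vecs["buy"], vecs["purchase"])
score_different = cosine(vecs["buy"], vecs["hotel"])
```

In this toy corpus, "buy" and "purchase" share identical contexts and score higher than "buy" and "hotel", which share none. The filtering steps mentioned in the abstract (proper nouns, further heuristics) are not modeled here.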
... In Japanese-English machine translation, rewriting the source language before the translation process has shown improvements in the translation quality [1]. Recently, research on paraphrasing has received more attention and some useful techniques have been developed [2][3]. At present, ATR Spoken Language Translation Research Laboratories is involved in the research and development of the proposed SLT system, called the Sandglass project [4]. ...
In this paper, we propose a paraphrasing approach to spoken language processing and introduce our preliminary investigation on phenomena of the Chinese spoken language. In spoken language processing, many problems have still not been resolved satisfactorily, such as ungrammatical expressions due to spontaneous utterances and speech recognition errors due to noisy environments. One of the important issues in this field is how to achieve robustness against these phenomena. We propose transforming various expressions of a spoken language into formal expressions of a written language with the same meanings, i.e. paraphrasing. For this purpose, we design three types of paraphrasing processes, i.e. (1) to correct speech recognition errors (2) to provide formal and simple expressions, and (3) to add informative expressions for disambiguation. In order to automatically paraphrase the Chinese spoken language, we carry out an investigation into phenomena of Chinese spontaneous utterances in the ATR travel conversation corpus and LDC CallHome Mandarin transcript corpus. The investigation results point out the direction of future research.
The purpose of this paper is to propose a method of paraphrasing a Japanese verbal noun phrase into a noun phrase in the form of "N1 no N2". The semantic structure of "N1 no N2" can be recognized by supplementing some abbreviated predicate. We define "deletable verbs" as these abbreviated predicates in two ways. 1. Choose verbs equivalent to the semantic relations of "N1 no N2" using a thesaurus. 2. Choose verbs associated with nouns. If a verb frequently cooccurs with a noun in newspaper articles, it is concluded that the verb is associated with the noun. By defining "deletable verbs" and utilizing a variety of the semantic structure of "N1 no N2", this paraphrasing is accomplished. The subjective evaluation of our paraphrasing method shows that the precision is 63.8% and the recall is 61.4%. It is also shown that restriction on targets can increase the precision by 82.9%.
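The second way of choosing verbs, treating a verb as associated with a noun when the two cooccur frequently, could be sketched roughly as follows. The abstract does not say how "frequently" is measured, so the raw-count threshold `min_count`, the function name, and the toy noun-verb pairs are assumptions; the pairs are assumed to have been extracted from parsed text beforehand.

```python
from collections import Counter, defaultdict

def associated_verbs(noun_verb_pairs, min_count=2):
    """Return, for each noun, the verbs that cooccur with it at least
    `min_count` times. The threshold is a hypothetical stand-in for
    whatever frequency criterion the original method uses."""
    counts = Counter(noun_verb_pairs)
    by_noun = defaultdict(set)
    for (noun, verb), n in counts.items():
        if n >= min_count:
            by_noun[noun].add(verb)
    return dict(by_noun)

# Toy cooccurrence data (hypothetical; English glosses stand in for Japanese).
pairs = [
    ("book", "read"), ("book", "read"), ("book", "write"),
    ("book", "read"), ("rain", "fall"), ("rain", "fall"),
    ("book", "buy"),
]
assoc = associated_verbs(pairs)
```

With this data, only "read" is kept for "book" and "fall" for "rain"; verbs seen once are discarded. A normalized association score would be a natural alternative to the raw count, but the abstract does not specify one.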
Inquiries through Web forms and e-mails are generally increasing. These inquiry texts usually include many informal expressions in a colloquial style, close to spoken language, and many omitted words. An omitted word causes the meaning of a sentence to become ambiguous and may make the reader misread and misunderstand the context. In this paper we focus on the frequently omitted noun "B" in the noun phrase "A NO1 B" (usually meaning B of A) seen in colloquial-style inquiry text and propose a method to predict the omitted noun "B" from context and knowledge using topic information. From the results of the evaluation experiment, we have confirmed that our method improved on the conventional method by 11.34 points and predicted the omitted word with an accuracy rate of more than 75% using "Latent Dirichlet Allocation" (LDA).
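Akira Kataoka, Shigeru Masuyama, Kazuhide Yamamoto. Paraphrasing verbal noun-modifier expressions into "N1 no N2". Journal of Natural Language Processing, Vol.7, No.4, pp.79-98, The Association for Natural Language Processing (2000.10)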
As a result of the rapid growth in Internet access, significantly more information has become available online in real time. However, there is not sufficient time for users to read large volumes of information and make decisions accordingly. The problem of information overloading can be resolved through the application of automatic summarization. Many summarization systems for documents in different languages have been implemented. However, the performance of summarization systems on documents in different languages has not yet been investigated. In this paper, we compare the results of the fractal summarization technique on parallel documents in Chinese and English. The grammatical and lexical differences between Chinese and English have a significant effect on the summarization processes. Their impact on the performance of summarization for the Chinese and English parallel documents is compared.
One of the key issues in spoken-language translation is how to deal with unrestricted expressions in spontaneous utterances. We have developed a paraphraser for use as part of a translation system, and in this paper we describe the implementation of a Chinese paraphraser for a Chinese-Japanese spoken-language translation system. When an input sentence cannot be translated by the transfer engine, the paraphraser automatically transforms the sentence into alternative expressions until one of these alternatives can be translated by the transfer engine. Two primary issues must be dealt with in paraphrasing: how to determine new expressions, and how to retain the meaning of the input sentence. We use a pattern-based approach in which the meaning is retained to the greatest possible extent without deep parsing. The paraphrase patterns are acquired from a paraphrase corpus and human experience. The paraphrase instances are automatically extracted and then generalized into paraphrase patterns. A total of 1719 paraphrase patterns obtained using this method and an implemented paraphraser were used in a paraphrasing experiment. The results showed that the implemented paraphraser generated 1.7 paraphrases on average for each test sentence and achieved an accuracy of 88%.
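The control flow described above, trying alternative expressions until the transfer engine accepts one, can be sketched as a simple loop. This is only an illustration: the regex-based patterns, the `can_translate` predicate, and the English toy sentence are hypothetical stand-ins, whereas the described system works on Chinese and derives its patterns from a paraphrase corpus.

```python
import re

def paraphrase_until_translatable(sentence, patterns, can_translate):
    """Return the input if it is already translatable; otherwise apply each
    (regex, replacement) pattern in turn and return the first rewritten
    sentence that `can_translate` accepts, or None if none is accepted."""
    if can_translate(sentence):
        return sentence
    for pattern, replacement in patterns:
        candidate = re.sub(pattern, replacement, sentence)
        # Skip patterns that did not change anything.
        if candidate != sentence and can_translate(candidate):
            return candidate
    return None

# Hypothetical patterns and a stand-in for the transfer engine's coverage check.
patterns = [(r"\bgonna\b", "going to"), (r"\bwanna\b", "want to")]
can_translate = lambda s: "wanna" not in s and "gonna" not in s

result = paraphrase_until_translatable("I wanna book a room", patterns, can_translate)
unchanged = paraphrase_until_translatable("I want to book a room", patterns, can_translate)
failed = paraphrase_until_translatable("I wanna and gonna go", patterns, can_translate)
```

Each pattern is tried against the original sentence independently, so an input that needs two rewrites at once, like the last example, is reported as untranslatable. Chaining patterns would be one way to extend the sketch.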