Conference Paper

Acquisition of Word Translations Using Local Focus-Based Learning in Ainu-Japanese Parallel Corpora.

DOI: 10.1007/978-3-540-24630-5_36
Conference: Computational Linguistics and Intelligent Text Processing, 5th International Conference, CICLing 2004, Seoul, Korea, February 15-21, 2004, Proceedings
Source: DBLP

ABSTRACT This paper describes a new learning method for acquiring word translations from small parallel corpora. Our proposed method, Local Focus-based Learning (LFL), efficiently acquires word translations and collocation templates by focusing on parts of sentences rather than on entire sentences. Collocation templates carry the collocation information used to acquire word translations from each sentence pair. This method is useful even when word translations appear very infrequently in the sentence pairs. The LFL system described in this paper extracts Ainu-Japanese word translations from small Ainu-Japanese parallel corpora. The Ainu language is spoken by the Ainu ethnic group residing in northern Japan and Sakhalin. An evaluation experiment indicated that the recall was 57.4% and the precision was 72.0% for 546 kinds of nouns and verbs in 287 Ainu-Japanese sentence pairs, even though the average frequency of appearance of these 546 kinds of nouns and verbs was only 1.98.

  • Source
    ABSTRACT: This paper proposes a method for learning and extracting word sequence correspondences from non-aligned parallel corpora with Support Vector Machines, which generalize well, rarely overfit the training samples, and can learn dependencies between features by using a kernel function. Our method uses translation-model features based on the translation dictionary, the number of words, part of speech, constituent words, and neighboring words. In experiments on Japanese and English parallel corpora, the method achieved 81.1% precision and 69.0% recall for the extracted word sequence correspondences. This demonstrates that our method could reduce the cost of making translation dictionaries.
    Proceedings of the 19th international conference on Computational linguistics - Volume 1; 01/2002
  • Source
    ABSTRACT: This paper proposes a new method for learning bilingual collocations from sentence-aligned parallel corpora.
  • Source
    ABSTRACT: A number of machine translation systems based on learning algorithms have been presented. These methods acquire translation rules from pairs of similar sentences in bilingual text corpora. This means that it is difficult for the systems to acquire translation rules from sparse data. As a result, these methods require large amounts of training data in order to acquire high-quality translation rules. To overcome this problem, we propose a method of machine translation using Recursive Chain-link-type Learning. In our new method, the system can acquire many new high-quality translation rules from sparse translation examples based on already acquired translation rules. Each newly acquired translation rule therefore leads to the generation of further translation rules, so that the acquisition process resembles a linked chain. The results of evaluation experiments confirmed the effectiveness of Recursive Chain-link-type Learning.
    Proceedings of the 19th international conference on Computational linguistics - Volume 1; 01/2002