Publications (5)
-
Conference Proceeding: A Novel Reordering Model Based on Multi-layer Phrase for Statistical Machine Translation.
COLING 2010, 23rd International Conference on Computational Linguistics, Proceedings of the Conference, 23-27 August 2010, Beijing, China; 01/2010 -
Conference Proceeding: Word Alignment Based on Multi-Grain Model
ABSTRACT: Word alignment plays a critical role in statistical machine translation (SMT) and cross-language information retrieval. Most existing methods compute word alignments over the entire length of the sentence, and the resulting alignment quality is unsatisfactory. In this paper, we propose a novel approach to word alignment based on a multi-grain model (WAMG). We split a parallel sentence pair into blocks of different granularity and compute the word alignments within each pair of corresponding blocks. Our approach restricts the search space of word alignment to a relatively accurate local range and reduces mapping errors. Experiments show that our approach outperforms the traditional word alignment algorithm by about 12% relative in AER and improves the performance of a Chinese-to-English translation system by about 2.8% relative in BLEU.
Chinese Spoken Language Processing, 2008. ISCSLP '08. 6th International Symposium on; 01/2009 -
Article: The CASIA phrase-based statistical machine translation system for IWSLT 2007
ABSTRACT: This paper describes our phrase-based statistical machine translation system (CASIA) used in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2007. In this year's evaluation, we participated in the open data track on clean text for Chinese-to-English machine translation. Here, we give an overview of the system, its primary modules, the key techniques, and the evaluation results. -
Article: The CASIA Statistical Machine Translation System for IWSLT 2008
ABSTRACT: This paper describes our statistical machine translation system (CASIA) used in the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2008. In this year's evaluation, we participated in the challenge task for Chinese-English and English-Chinese and in the BTEC task for Chinese-English. Here, we give an overview of our system, its primary modules, the key techniques, and the evaluation results. -
Article: A Generalized Reordering Model for Phrase-Based Statistical Machine Translation
ABSTRACT: Phrase-based translation models are widely studied in statistical machine translation (SMT). However, existing phrase-based translation models either cannot deal with non-contiguous phrases or reorder phrases only by rules, without an effective reordering model. In this paper, we propose a generalized reordering model (GREM) for phrase-based statistical machine translation, which not only captures knowledge of the local and global reordering of phrases but also gains some capability for phrasal generalization by using non-contiguous phrases. The experimental results indicate that our model outperforms MEBTG (enhanced BTG with a maximum entropy-based reordering model) and HPTM (hierarchical phrase-based translation model) by 1.54% and 0.66% in BLEU, respectively.
Institutions
-
2009
-
Chinese Academy of Sciences
- Institute of Automation
Beijing, Beijing Shi, China
-