Conference Paper

Rule Filtering by Pattern for Efficient Hierarchical Translation.

DOI: 10.3115/1609067.1609109 Conference: EACL 2009, 12th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference, March 30 - April 3, 2009, Athens, Greece
Source: DBLP

ABSTRACT We describe refinements to hierarchical translation search procedures intended to reduce both search errors and memory usage through modifications to hypothesis expansion in cube pruning and reductions in the size of the rule sets used in translation. Rules are put into syntactic classes based on the number of non-terminals and the pattern, and various filtering strategies are then applied to assess the impact on translation speed and quality. Results are reported on the 2008 NIST Arabic-to-English evaluation task.

Memory usage can be reduced in cube pruning (Chiang, 2007) through smart memoization, and spreading neighborhood exploration can be used to reduce search errors. However, search errors can still remain even when implementing simple phrase-based translation. We describe a 'shallow' search through hierarchical rules which greatly speeds translation without any effect on quality. We then describe techniques to analyze and reduce the set of hierarchical rules. We do this based on the structural properties of rules and develop strategies to identify and remove redundant or harmful rules. We identify groupings of rules based on non-terminals and their patterns and assess the impact on translation quality and computational requirements for each given rule group. We find that with appropriate filtering strategies rule sets can be greatly reduced in size without impact on translation performance.
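The abstract does not spell out how a rule's "pattern" is defined. As a rough sketch under one plausible reading, each run of terminals is collapsed to a placeholder `w` while non-terminals are kept, and a rule's class is the triple (number of non-terminals, source pattern, target pattern). The `w` notation, the symbols `X1` and `X2`, the toy rules, and all function names below are assumptions made for illustration, not details taken from the paper.

```python
# Hypothetical non-terminal symbols; a real grammar may use different names.
NONTERMINALS = frozenset({"X1", "X2"})


def rule_pattern(symbols):
    """Collapse each run of terminals to 'w'; keep non-terminals unchanged."""
    pattern = []
    for sym in symbols:
        if sym in NONTERMINALS:
            pattern.append(sym)
        elif not pattern or pattern[-1] != "w":
            # First terminal of a new run: emit a single placeholder.
            pattern.append("w")
    return " ".join(pattern)


def rule_class(source, target):
    """Class = (number of source non-terminals, source pattern, target pattern)."""
    n_nonterminals = sum(1 for sym in source if sym in NONTERMINALS)
    return (n_nonterminals, rule_pattern(source), rule_pattern(target))


def filter_rules(rules, excluded_classes):
    """Keep only rules whose class is not in the excluded set."""
    return [r for r in rules if rule_class(*r) not in excluded_classes]


# Toy rules as (source symbols, target symbols); the tokens are made up.
rules = [
    (["a", "X1", "b", "c"], ["d", "X1", "e"]),
    (["X1", "a", "X2"], ["X2", "d", "X1"]),
    (["a", "b"], ["d"]),
]

# Example filter: drop the class of two-non-terminal rules that swap X1 and X2.
kept = filter_rules(rules, {(2, "X1 w X2", "X2 w X1")})
```

Grouping rules this way makes it possible to remove or keep a whole class at once and then measure the effect on rule-set size, speed, and quality, which is the kind of assessment the abstract describes.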

  • ABSTRACT: Shallow-n grammars (de Gispert et al., 2010) were introduced to reduce over-generation in the Hiero translation model (Chiang, 2005) resulting in much faster decoding and restricting reordering to a desired level for specific language pairs. However, Shallow-n grammars require parameters which cannot be directly optimized using minimum error-rate tuning by the decoder. This paper introduces some novel improvements to the translation model for Shallow-n grammars. We introduce two rules: a BITG-style reordering glue rule and a simpler monotonic concatenation rule. We use separate features for the new rules in our log-linear model allowing the decoder to directly optimize the feature weights. We show this formulation of Shallow-n hierarchical phrase-based translation is comparable in translation quality to full Hiero-style decoding (without shallow rules) while at the same time being considerably faster.
    Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies; 06/2012
  • ABSTRACT: In this paper we apply lightly-supervised training to a hierarchical phrase-based statistical machine translation system. We employ bitexts that have been built by automatically translating large amounts of monolingual data as additional parallel training corpora. We explore different ways of using this additional data to improve our system. Our results show that integrating a second translation model with only non-hierarchical phrases extracted from the automatically generated bitexts is a reasonable approach. The translation performance matches the result we achieve with a joint extraction on all training bitexts while the system is kept smaller due to a considerably lower overall number of phrases.
    Proceedings of the First Workshop on Unsupervised Learning in NLP; 07/2011
  • ABSTRACT: Hierarchical phrase-based (HPB) translation has been introduced to speech-to-speech (S2S) translation systems on mobile terminals, such as smartphones. However, when memory and decoding speed are restricted, it suffers from explosive growth in the number of rules and a corresponding increase in decoding time. In this paper, we propose a nesting HPB model to capture the topological structure of hierarchical rules on the source language side, which not only filters out the redundant rules in the HPB model but also speeds up the decoder. Experiments on the HPB translation system show that our approach can reduce the rule table size by 75% with a faster decoder, and yield the same translation quality (measured by BLEU) as the state-of-the-art HPB model.
    Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on; 01/2012

