Steps taken in Spanish-Basque speech translation using stochastic
finite-state transducers
Alicia Pérez, M. Inés Torres
Electricity and Electronics
University of the Basque Country
manes.torres@ehu.es
Francisco Casacuberta
Information Systems and Computation
Polytechnic University of Valencia
fcn@iti.upv.es
Abstract
The goal of this paper is to summarise work in progress on speech
translation using stochastic finite-state transducers (SFSTs). The
appeal of these devices lies in their versatility to integrate
acoustic models within translation models.
Our interest lies in Spanish and Basque, official languages in the
600,000-inhabitant Basque Autonomous Community. These two languages
show remarkable differences in both syntax and morphology and
therefore represent a challenge for SFSTs. In addition, linguistic
resources are scarce because Basque is a minority language.
1 On the use of SFSTs for ST
Finite-state models constitute a basic framework not only in
syntactic pattern recognition but also in language processing. In
particular, stochastic finite-state transducers (SFSTs) have proved
useful in machine translation of restricted domains. SFSTs can be
inferred from positive bilingual samples with the GIATI algorithm
(Casacuberta and Vidal, 2004; González and Casacuberta, 2009). In
short, given a bilingual training set, GIATI looks for a monotonic
segmentation and then generates a stochastic regular grammar over
bilingual symbols (source words together with target phrases),
yielding an SFST.
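
The segmentation step just described can be illustrated with a minimal Python sketch. It assumes that a word alignment is already available (here it is written by hand) and applies one simple monotonic rule: a target word is emitted at the source position it aligns to, or later if an earlier target word has already pushed the output position forward. The function names `bilingual_symbols` and `bigram_counts`, the Spanish-English toy pair and the choice of bigrams are illustrative assumptions, not the implementation used in the cited work.

```python
from collections import Counter, defaultdict

def bilingual_symbols(source, target, alignment):
    """Label every source word with a (possibly empty) target phrase.

    alignment[j] is the index of the source word aligned with target[j].
    Target word order is preserved by never emitting before the frontier.
    """
    attached = defaultdict(list)
    frontier = 0
    for j, word in enumerate(target):
        frontier = max(frontier, alignment[j])
        attached[frontier].append(word)
    return [(s, tuple(attached[i])) for i, s in enumerate(source)]

def bigram_counts(symbol_strings):
    """Count bigrams of bilingual symbols, with sentence boundary markers."""
    counts = Counter()
    for symbols in symbol_strings:
        padded = ["<s>"] + list(symbols) + ["</s>"]
        counts.update(zip(padded, padded[1:]))
    return counts

# Toy sentence pair with a hand-written alignment (an assumption).
source = ["una", "habitación", "doble"]
target = ["a", "double", "room"]
alignment = [0, 2, 1]  # a->una, double->doble, room->habitación

symbols = bilingual_symbols(source, target, alignment)
counts = bigram_counts([symbols])
for s, t in symbols:
    print(s, "->", " ".join(t) if t else "(empty)")
```

Normalising such counts would give the transition probabilities of an n-gram model over bilingual symbols, which can then be read as a transducer whose edges consume a source word and emit the attached target phrase.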
What makes SFSTs interesting for speech translation (ST), besides
the fast decoding algorithms they rely on, is the ease with which
they can be integrated with other finite-state models.
Decoupled architectures tackle speech translation in two consecutive
decoding steps: the first step converts the speech utterance into a
text transcription, and the second translates the recognised string
from text to text. Admittedly, approaches differ in the amount and
type of information passed from the first stage to the second. As
far as integration is concerned, one of the challenges of speech
translation is to explore different ways of integrating the acoustic
and translation knowledge sources so that they collaborate.
Intuitively, cooperative models that make the most of both knowledge
sources might yield more accurate hypotheses than models that make
decisions in isolation. In (Pérez et al., 2010) it was shown that
integrated architectures yield significantly more accurate
hypotheses than decoupled architectures.
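
The difference between the two decision rules can be made concrete with a small Python sketch. It assumes a toy joint translation model P(s, t) stored as a dictionary and toy acoustic scores P(x | s) for two competing transcriptions. The names `decoupled` and `integrated`, the placeholder strings and all the numbers are hypothetical, chosen only so that the two strategies disagree; they are not results from the cited experiments.

```python
def decoupled(acoustic, joint):
    """Two-step decoding: best transcription first, then its best translation."""
    # Source language model obtained as the marginal of the joint model.
    source_prob = {}
    for (s, _t), p in joint.items():
        source_prob[s] = source_prob.get(s, 0.0) + p
    best_source = max(acoustic, key=lambda s: acoustic[s] * source_prob.get(s, 0.0))
    # Only translations of the single selected transcription are considered.
    candidates = {t: p for (s, t), p in joint.items() if s == best_source}
    return best_source, max(candidates, key=candidates.get)

def integrated(acoustic, joint):
    """Single-step decoding: score each translation over all transcriptions."""
    scores = {}
    for (s, t), p in joint.items():
        scores[t] = scores.get(t, 0.0) + acoustic.get(s, 0.0) * p
    return max(scores, key=scores.get)

# Hypothetical scores: two acoustically confusable transcriptions.
acoustic = {"src_a": 0.5, "src_b": 0.5}
joint = {
    ("src_a", "tgt_1"): 0.30,
    ("src_a", "tgt_2"): 0.25,
    ("src_b", "tgt_2"): 0.45,
}

print(decoupled(acoustic, joint))
print(integrated(acoustic, joint))
```

In this toy setting the decoupled rule commits to `src_a` and outputs `tgt_1`, whereas the integrated rule outputs `tgt_2` because both transcriptions lend it support. The sketch only shows that the two rules can differ, not which one is more accurate in practice.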
Apart from integrating acoustic and translation models, SFSTs have
also been shown to allow the integration of multiple languages in
order to carry out multi-target speech translation (Pérez et al.,
2007a), so that speech is translated simultaneously into several
languages.
2 Overcoming challenges of Basque
Basque is a minority language of unknown origin, in contrast to
Spanish, which is a Romance language. Although both languages
coexist in the Basque Autonomous Community, they differ in both
morphology and syntax.
As for morphology, Basque (unlike Spanish) is very productive in
both nouns and verbs, with more than 17 declension cases that can be
recursively appended to a lemma. As a result, a Basque word tends to
be translated into Spanish by more than one word.
Moreover, the morphology of a word may encode syntactic features;
e.g. the Basque word irakasleek means "the teachers" as the subject
of a transitive clause (Fig. 1).
To tackle the rich morphology of Basque, phrase-based SFSTs
(PB-SFSTs) were proposed within the GIATI framework in (Pérez et
al., 2007b). PB-SFSTs represent a step forward with respect to
previous SFSTs, since the monotonic bilingual segmentation groups
words not only in the target language but also in the source
language. Along this line, both statistically and linguistically
motivated phrases were explored. Among the linguistically motivated
phrases, morphologically and syntactically motivated ones were
distinguished (Pérez et al., 2008). Syntactically motivated phrases
improved the performance of the system significantly.
As a consequence of the rich morphology of Basque, inflected words
show a low repetition ratio within the corpus. As an alternative
mechanism to deal with data sparsity, categorisation was used in
(Justo et al., 2010), yielding a hierarchically arranged SFST. This
approach made it possible to categorise the bilanguage and infer a
specialised SFST for each category, and also to integrate the
acoustic models in the same network. While the formulation of the
models is neat, experimentally it did not offer much benefit in
terms of performance. Nevertheless, this might have to do with the
small size of the corpus on which the experiments were carried out.
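
The idea of categorising the bilanguage can be sketched in Python as follows. The sketch assumes that bilingual symbols are (source word, target phrase) pairs and that a hand-made dictionary maps some of them to a category label. The function name `categorise`, the `$DAY` label and the toy Spanish-Basque corpus are illustrative assumptions rather than the procedure of the cited work.

```python
from collections import Counter

def categorise(sentences, categories):
    """Split a bilanguage corpus into a top-level corpus and per-category counts.

    sentences: lists of bilingual symbols (source word, target phrase).
    categories: maps a bilingual symbol to a category label.
    """
    top_level = []
    per_category = {}
    for sentence in sentences:
        # In the top-level corpus, category members are replaced by their label.
        top_level.append([categories.get(sym, sym) for sym in sentence])
        # Each category keeps its own counts, from which a specialised
        # model could later be estimated.
        for sym in sentence:
            if sym in categories:
                per_category.setdefault(categories[sym], Counter())[sym] += 1
    return top_level, per_category

# Toy bilanguage for "hoy es lunes/martes" (hand-made, an assumption).
hoy = ("hoy", ("gaur",))
es = ("es", ())
lunes = ("lunes", ("astelehena", "da"))
martes = ("martes", ("asteartea", "da"))
corpus = [[hoy, es, lunes], [hoy, es, martes]]
categories = {lunes: "$DAY", martes: "$DAY"}

top_level, per_category = categorise(corpus, categories)
print(top_level[0] == top_level[1])
print(per_category["$DAY"])
```

After categorisation the two toy sentences share one top-level pattern, so their counts are pooled, while the distinction between the day names is kept in the model of the category.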
Regarding syntax, Spanish tends to follow an SVO arrangement,
whereas Basque tends to follow SOV. As a result, very long-distance
alignments are frequent, which is a serious problem for SFSTs under
the GIATI approach. The SFSTs we are dealing with have shown a
limited ability to cope with reordering.
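
To illustrate why long-distance alignments are costly under a monotonic segmentation, the following Python sketch measures how many source positions the emission of each target word is postponed. The sentence pair and its alignment are hand-made, and the function name `emission_delays` and the delay measure itself are assumptions introduced for illustration.

```python
def emission_delays(alignment):
    """Delay, in source positions, of each target word under monotonic output.

    alignment[j] is the index of the source word aligned with target word j.
    A target word cannot be emitted before any target word that precedes it,
    so it waits until the furthest source position required so far is read.
    """
    delays = []
    frontier = 0
    for a in alignment:
        frontier = max(frontier, a)
        delays.append(frontier - a)
    return delays

# Spanish (SVO) and Basque (SOV) toy pair with a hand-written alignment.
source = ["el", "profesor", "ha", "comprado", "un", "libro"]
target = ["irakasleak", "liburu", "bat", "erosi", "du"]
alignment = [1, 5, 4, 3, 2]  # e.g. "liburu" aligns with "libro" (index 5)

delays = emission_delays(alignment)
for word, delay in zip(target, delays):
    print(word, delay)
```

Because the Basque object precedes the verb, the verb and auxiliary can only be emitted after the Spanish object has been read, so most of the target sentence ends up attached to the last source word.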
ira-        -kas-        -le-           -e-            -k
prefix      verb stem    suffix         det. plural    ergative case
(make it)   (learn)      (collective)   (the)          (subject)
Figure 1: Analysis of a Basque word.
3 Concluding remarks and further work
Speech translation with SFSTs offers a versatile framework: the
recognised utterance and its translation can be obtained in a single
decoding pass.
PB-SFSTs under the GIATI approach have proved useful for speech
translation between Spanish and Basque. The phrase-based approach
yields more accurate alignments than the word-based one. In
addition, gathering words into phrases shortens the distance over
which alignments occur, and thus overcomes one of the weaknesses of
regular SFSTs. Reordering remains an open problem for SFSTs facing
this pair of languages.
Since Basque is a minority language, linguistic resources are
limited. The aforementioned methods were explored with a
restricted-domain corpus. Admittedly, in order to draw solid
conclusions we should experiment with broad-domain corpora. To this
end, we are currently working to collect a EuroParl-like corpus for
Basque with text and speech.
References
F. Casacuberta and E. Vidal. 2004. Machine translation with inferred
stochastic finite-state transducers. Computational Linguistics,
30(2):205–225.
J. González and F. Casacuberta. 2009. GREAT: a finite-state machine
translation toolkit implementing a Grammatical Inference Approach
for Transducer Inference. In EACL workshop on Computational
Linguistics Aspects of Grammatical Inference, 24–32.
R. Justo, A. Pérez, M. Inés Torres, and F. Casacuberta. 2010.
Hierarchical finite-state models for speech translation using
categorization of phrases. In 11th International Conference on
Intelligent Text Processing and Computational Linguistics.
A. Pérez, M. I. Torres, M. T. González, and F. Casacuberta. 2007a.
An integrated architecture for speech-input multi-target machine
translation. In Proc. NAACL-HLT, 133–136.
A. Pérez, M. I. Torres, and F. Casacuberta. 2007b. Speech
translation with phrase based stochastic finite-state transducers.
In Proc. IEEE ICASSP.
A. Pérez, M. I. Torres, and F. Casacuberta. 2008. Joining linguistic
and statistical methods for Spanish-to-Basque speech translation.
Speech Communication.
A. Pérez, M. I. Torres, and F. Casacuberta. 2010. Potential scope of
a fully-integrated architecture for speech translation. In Proc.
EAMT10.