Machine Translation (2020) 34:325–346
https://doi.org/10.1007/s10590-020-09255-9
Neural machine translation with a polysynthetic low
resource language
John E. Ortega1 · Richard Castro Mamani2 · Kyunghyun Cho1
Received: 24 February 2020 / Accepted: 13 December 2020 / Published online: 4 February 2021
© The Author(s), under exclusive licence to Springer Nature B.V. part of Springer Nature 2021
Abstract
Low-resource languages (LRLs) with complex morphology are known to be more
difficult to translate automatically. Some LRLs are more difficult to translate
than others due to a lack of research interest or collaboration. In this
article, we experiment with a specific LRL, Quechua, which is spoken by millions of
people in South America yet has not been the subject of a neural approach to translation
until now. We improve the latest published results with baseline BLEU scores using
the state-of-the-art recurrent neural network approaches for translation.
Additionally, we experiment with several morphological segmentation techniques and
introduce a new one in order to decompose the language’s suffix-based morphemes. We
extend our work to other high-resource languages (HRL) like Finnish and Spanish to
show that Quechua, for qualitative purposes, can be considered compatible with and
translatable into other major European languages with measurements comparable to
the state-of-the-art HRLs at this time. We finalize our work by making our best two
Quechua–Spanish translation engines available on-line.
Keywords Neural machine translation · Low resource languages · Morphology ·
Quechua · Finnish · Spanish
* John E. Ortega
jortega@cs.nyu.edu
Richard Castro Mamani
rcastro@hinant.in
Kyunghyun Cho
kyunghyun.cho@nyu.edu
1 New York University, New York, USA
2 Hinantin Software, Cusco, Peru