Thesis

Productivity and quality in the post-editing of outputs from translation memories and machine translation

Abstract

This study presents empirical research on no-match, machine-translated and translation-memory segments, analyzed in terms of translators’ productivity, final quality and prior professional experience. The findings suggest that translators have higher productivity and quality when using machine-translated output than when translating on their own, and that the productivity and quality gained with machine translation are not significantly different from the values obtained when processing fuzzy matches from a translation memory in the 85-94 percent range. The translators’ prior experience affects the quality they deliver but not their productivity. These quantitative findings are triangulated with qualitative data from an online questionnaire and from one-to-one debriefings with the translators.
... Rate (Snover et al. 2006), an automatic score that reflects the number of edits performed on the MT output normalized by the number of words in the sentence. Research has shown that this metric correlates well with actual post-editing effort (O'Brien, 2011; Guerberof-Arenas, 2012). The closer HTER is to 0 (the lowest possible value), the lower the translator's effort because fewer changes are required. ...
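The edit-rate idea in the snippet above can be sketched roughly as follows. This is a simplified illustration, not the metric as defined by Snover et al.: it counts only word-level insertions, deletions and substitutions (full TER also treats block shifts as single edits), and it assumes that the post-edited segment serves as the reference and supplies the word count used for normalization. The function names `word_edit_distance` and `simple_hter` are hypothetical.

```python
def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a word from the MT output
                            curr[j - 1] + 1,      # insert a word into the MT output
                            prev[j - 1] + cost))  # keep or substitute a word
        prev = curr
    return prev[-1]


def simple_hter(mt_output, post_edited):
    """Edits needed to turn the MT output into the post-edited text,
    divided by the number of words in the post-edited text.
    Assumption: whitespace tokenization, no block shifts."""
    hyp = mt_output.split()
    ref = post_edited.split()
    if not ref:
        raise ValueError("post-edited segment must contain at least one word")
    return word_edit_distance(hyp, ref) / len(ref)
```

Under these assumptions, an unchanged segment scores 0, and the score rises with each word the post-editor had to add, remove or replace.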
... These results seem in line with the view often expressed by translators that they are constrained by MT proposals, which makes MTPE texts less fluent (Moorkens et al. 2018). They are also consistent with previous research that indicates that the final quality of post-edited products (in terms of number of errors) is equal to or higher than that of translations produced without any aid (Guerberof-Arenas 2012). ...
Preprint
This article presents the results of a study involving the translation of a fictional story from English into Catalan in three modalities: machine-translated (MT), post-edited (MTPE) and translated without aid (HT). Each translation was analysed to evaluate its creativity. Subsequently, a cohort of 88 Catalan participants read the story in a randomly assigned modality and completed a survey. The results show that HT achieved a higher creativity score than MTPE and MT. HT also ranked higher in narrative engagement and translation reception, while MTPE ranked marginally higher in enjoyment. HT and MTPE show no statistically significant differences in any category, whereas MT differs significantly on all variables tested. We conclude that creativity is highest when professional translators intervene in the process, especially when working without any aid. We hypothesize that creativity in translation could be the factor that enhances reading engagement and the reception of translated literary texts.
... More recent studies on translation memories show that the focus of attention has shifted towards more automated systems, in which TMs and machine translation engines are treated as aids to each other or compared in terms of productivity (Biçici & Dymetman, 2008; Sánchez-Gijón, Moorkens, & Way, 2019). To this end, translation scholars have been discussing the use of machine translation engines in combination with translation memories, or feeding machine translation engines with translation memory corpora and vice versa (Guerberof, 2012; Moorkens, Doherty, Kenny, & O'Brien, 2014; Screen, 2017). ...
... Therefore, there is a need to evaluate the quality of inverse translations and determine whether or not non-native speakers can be trained to meet professional standards of quality. There is no doubt that translation technologies, such as machine translation (MT) and translation memory (TM), have advanced significantly over the years, but whether or not these technologies can help non-native translators produce good-quality translations remains debatable, because many factors, such as language pairs, text genres, sentence length, and MT users, can contribute to the findings of the existing studies (Guerberof, 2012; Koponen, Aziz, Ramos & Specia, 2012; Koponen & Salmi, 2015). ...
Article
This paper aims to analyse the quality of inverse translation and to see whether or not trainee translators, such as undergraduate language students, can produce translations between foreign languages, and whether or not post-editing machine translation and translation memories have any effect on the Malay students' performance. Through an error analysis approach, this paper also aims to reveal the factors contributing to the mistakes the students made in their translations and to uncover the nature of Google Translate by identifying the recurring types of errors in the MT outputs. Results revealed that the translation technologies, particularly in the tasks involving post-editing of modified translation memories and machine translation, helped the students improve the quality of their translations, suggesting that non-native speakers can become highly skilled professional translators with years of experience and proper training. Based on the error analysis, syntactic and lexical errors seem to be problematic in Google Translate in both Arabic-English and English-Arabic translations, implying that proper guidelines are crucial in post-editing so that post-editors are aware of the potential recurrent errors and do not overlook them. The study also identified that linguistic interference might have significantly influenced the students' performance, as the three languages differ from one another in many aspects.
... Despite this, institutions offering academic translation training lag behind technological developments, that is, "institutional lateness" (Pym, 2011) ... With the problems this clear, the first step towards solving them would be for institutions offering academic translation training to add a course dealing directly with machine translation to their curricula. In this regard, Guerberof (2008) emphasized that a course teaching post-editing could be developed, that such a course could provide background knowledge on different machine translation technologies in the context of post-editing, and that hands-on assignments in correcting raw machine translation output would be useful for students. On this subject, Mellinger (2017) reiterated that approaches to curriculum design should be aligned with the needs of the translation market, so that students can position themselves appropriately in the market after graduation. ...
Article
Demand for machine translation has been growing in recent years. In other words, recent developments in machine translation systems have attracted interest both in the translation market and in academic circles concerned with translation. Hopes for "Fully Automatic High-Quality Machine Translation", once interrupted by the ALPAC report, have thus been revived. However, as many experimental studies have shown, machine translation systems are not yet able to produce translations that can readily be published without any correction. Under these circumstances, a new research area called post-editing has emerged. According to the most common definition in the literature, post-editing means correcting or modifying a text translated by machine translation. This definition draws attention to the distinction between an ordinary translator and a translator who performs post-editing. Moreover, as noted in the literature, the quality of the final translation depends largely on the raw machine translation output. For this reason, parameters such as controlled language rules or pre-editing steps have come to the fore as factors that ensure high-quality raw output. In addition, considering the translator's role and the process in which the translator is involved, post-editing has paved the way for new paradigm shifts in Translation Studies. However, all of these issues and concepts are explained independently of one another in the literature. This study therefore aims, using a descriptive method, to address these concepts comprehensively, relate them to one another, and examine the place of post-editing in terms of the periods of Translation Studies.
Article
Machine translation (MT) has made significant strides and has reached accuracy levels that often make the post-editing (PE) of MT output a viable alternative to manual translation. However, despite professional translators increasingly considering PE as a valid stage in their translation workflow, little has been done to investigate MT output for the purpose of informing training in PE. Against this background, the present project focuses on the handling of tense and aspect configurations in the English translation of Arabic sentences using current neural machine translation (NMT) systems. Using a dataset of representative Arabic sentences, the output of five NMT engines was assessed against reference translations. The investigation reveals regressing accuracy levels when comparing morphological, structural, and contextual tenses. These findings are believed to represent valuable information that contributes to a more informed training in the PE of Arabic-into-English NMT output.
Article
In the current context of technological advances and the development of artificial intelligence, the digitalization of societies and technological improvements are transforming our lives in every sphere. Translation is no exception. With the emergence of neural machine translation, a new machine translation paradigm, the quality such systems offer has improved substantially, to the point that it has been claimed to equal or surpass the quality of human translation in certain domains such as news. Nevertheless, specialized languages entail intrinsic complexities. In legal translation, the anisomorphism of legal language can be a very difficult gap for machines to bridge: different terms for the same concept in different legal systems, zero or partial equivalence, etc. The aim of this paper is therefore to study the usefulness of machine translation as a training resource in the legal translation classroom, taking into account the characteristics and trends of the professional translation sector. To that end, the study carries out a human evaluation of three human translations of English-Spanish corporate contracts and of one translation generated by a neural machine translation engine. The results point to 1) the fact that machine translation could be a very useful teaching tool in the legal translation class; 2) the fact that the identification of competences could be enhanced by an approach of this kind; 3) how machine translation can be incorporated into legal translation training; and 4) the advantages it would have over traditional teaching and learning methods. Keywords: machine translation; translation technologies; legal translation; translation quality; quality assessment.
Article
Usability is a key factor for increasing adoption of machine translation. This study aims to measure the usability of machine translation in the classroom context by comparing translation students’ machine translation post-editing output with their manual translation in two comparable translation tasks. Three dimensions of usability were empirically measured: efficiency, effectiveness, and satisfaction. The findings suggest that machine translation post-editing is more efficient than human translation and that post-editing produces fewer errors than human translation. While the types of errors vary, errors in terms of accuracy outnumber those related to fluency. In addition, participants perceive the amount of time and work saved when post-editing to be a greater benefit than the overall utility of post-editing. Likewise, students report a strong desire to learn post-editing skills in training programs.
Book
The volume Lektorálástudomány – fordításban (Revision Studies in Translation), a continuation of the previous volume in the series, Fordítástudomány – fordításban (Translation Studies in Translation), offers readers selected writings by leading authorities in research on revision. Its distinctive feature is that representatives of the international literature speak here in Hungarian, in translations by specialized-translation students of Eötvös Loránd University. The selected studies faithfully attest to the diversity of revision research, summarize the results to date, and provide an overview of the main theoretical dilemmas. They offer interesting ideas on the terminology of revision, revisers' interventions, the key components of reviser competence, methodological questions of revision, and transediting in news translation. The volume should interest all those who practise translation and like to keep up with scholarly inquiry into their profession, particularly revision; those who themselves research the processes and characteristics of translation on a scholarly basis; and prospective specialized-translation students looking for translation studies texts on revision, in Hungarian translation, for their own studies.
Article
The application of machine translation (MT) in crisis settings is of increasing interest to humanitarian practitioners. We collaborated with industry and non-profit partners: (1) to develop and test the utility of an MT system trained specifically on crisis-related content in an under-resourced language combination (French-to-Swahili); and (2) to evaluate the extent to which speakers of both French and Swahili without post-editing experience could be mobilized to post-edit the output of this system effectively. Our small study carried out in Kenya found that our system performed well, provided useful output, and was positively evaluated by inexperienced post-editors. We use the study to discuss the feasibility of MT use in crisis settings for low-resource language combinations and make recommendations on data selection and domain consideration for future crisis-related MT development.
Chapter
Chinese, in particular Mandarin Chinese, is currently the most spoken language in the world, with an estimated 1.2 billion primary and secondary speakers, while English ranks a distant second with 330 million native speakers, and a further 150 million secondary speakers. Among various Chinese languages, Standard Mandarin (Putonghua/Guoyu/Huayu) is the only official written form and is the only common official language in the four Chinese-speaking countries and regions, including the People’s Republic of China, the Republic of China (commonly known as “Taiwan”), Hong Kong, Macau, and Singapore. Standard Mandarin is also one of the six official languages of the United Nations. (There are dialects within the Mandarin language family, spoken in various regions in the north and southwest of China.) Incidentally, Standard Mandarin Chinese, together with the other five official UN languages, is also ranked among the six most influential languages in the world, when judged by the total number of world speakers, the geographical influence, the economic power of countries speaking the language, and the literary and scientific use of the language. China has the fastest growing economy in the world, and is the second largest economy, after the United States, in terms of purchasing power parity GDP. Perhaps most pertinent to the topic in this chapter, China (including Hong Kong) was the biggest exporter in 2008 and is poised to become the world’s biggest importer in 2010. The largest trading partners with China are (1) the European Union, (2) the United States, (3) Japan, and (4) the Association of South East Asian Nations. Consequently, for various economic, political, cultural, and humanitarian reasons, machine translation (MT) of Chinese from and into other languages is an increasingly important application in the natural language processing (NLP) area.
Chapter
This chapter is concerned with the relationship between Translation Studies and translation technology. We begin by discussing translation theory and describing the professional and academic groups who are involved in translation. In addition, the schema of the applied branch of Translation Studies proposed by John S. Holmes (1988/2000) is explored to show the areas of Translation Studies that have direct relevance to natural-language processing applications. A description of several stages in the translation process follows, involving pre- and post-editing tasks. We also consider the idea of a ‘controlled language’, frequently used to author texts as input for machine translation systems. The semiotic classification of translation models introduced by Roman Jakobson (1959/2000) is used to illustrate a different perspective on the translation process, involving editing tasks and controlled language.