Assessing User Interface Needs of Post-Editors of Machine Translation

The increased use of machine translation (MT) in commercial localization has seen post-edited MT becoming an increasingly cost-effective solution for specific domains and language pairs. At present, post-editing tends to be carried out via tools built for editing human-generated translations. These tools, while suited to their intended task, do not always integrate with machine translation or support post-editing. This lack of support, and a sense of apprehension related to MT, leads to reluctance among translators to accept post-editing jobs. This article describes the results of a survey and a series of follow-up interviews with professional translators and post-editors, in which they discussed features and functions for their ideal translation and post-editing User Interface (UI). Participants stressed the need for a simple and customisable UI that supports multilingual formatting and follows established translation editor conventions, in addition to some functionality specific to post-editing. The survey and interview responses led to the creation of specifications for a UI that better supports the post-editing task.
Joss Moorkens & Sharon O’Brien, ADAPT/Dublin City University
Introduction
Translation Memory (TM) and Machine Translation (MT) were until quite recently considered to
be distinct and diverging technologies. The current trend in translation technology, however, is to
attempt to create a synergy of the two. At present, the TM tools used for the last decades to recycle
human translation are being adopted also for the task of post-editing MT output. We consider that,
while these existing translation editor interfaces are by now familiar and functional for translators
working with TM, any support for post-editing or integration with MT has tended to be appended
as an afterthought. Post-editing of MT output is different from revision of an existing translation
suggested by a TM, or, indeed, from translation without any suggestion whatsoever, primarily
because the type of revisions differ. MT output tends to include mistakes that would not generally
be made by professional human translators. When this is coupled with the fact that few professional
translators have received training either in machine translation technology or in post-editing
practices to date, the result is often apprehension among translators with regard to the post-editing
task, along with a high level of frustration. Some of the most common complaints from translators
about the task of post-editing stem from the fact that it is an edit-intensive, mechanical task that
requires correction of basic linguistic errors over and over again (Guerberof 2013; Moorkens and
O’Brien 2014). Understandably, translators see this task as boring and demeaning and especially
despise it when the ‘machine’ does not ‘learn’ from its mistakes or from translators’ edits. Kelly
(2014: online) even goes so far as to call this task “linguistic janitorial work”.
This Chapter describes our first steps in an ongoing effort towards creating specifications for user
interfaces (UIs) that better support the post-editing task, with a view to making the task less boring,
repetitive and edit-intensive for the human translator. While our focus is on features for post-editing,
our results demonstrate that, according to translator-users, even the human translation task is not
well-supported by existing tools. We therefore also discuss basic features of TM tools, as well as
those features that are used to facilitate MT post-editing using TM interfaces.
The project began with a pre-design survey of professional translators, focusing on features of the
translation UI that they commonly use. The survey was followed by a series of interviews with
professional translators, all of whom have experience with post-editing, to examine their survey
responses in more detail, to discuss the specific challenges of post-editing, and so that potential
users could “instil their knowledge and concern into the design process from the very beginning”
(Gould and Lewis 1985). Interim results from the survey were published in Moorkens and O’Brien
(2013), which we extend here by reporting on all survey responses and by including interview data.
We assume from the outset that a standalone editor for post-editing is not required, but features and
functionality as specified could instead be built into existing translation editing environments in
order to better integrate with MT systems, and to support the post-editing task.
In this Chapter, we first define post-editing and present previous relevant research on interfaces used
in post-editing. We explain how software designers are currently responding to the emergent
integration of TM and MT. We go on to describe our survey and interview design, and present a
summary of the results of the survey, followed by the interview findings. For reasons of space, the
final UI specifications will be published elsewhere. Our conclusions are presented in the final
section.
Interfaces for Editing Human Translation and Post-Editing Machine
Translation
Somers (2001 p.138) describes post-editing of MT as “tidying up the raw output, correcting mistakes,
revisiting entire, or, in the worst case, retranslating entire sections” of a machine translated text.
This task can be further subdivided into ‘light’ (or ‘fast’) post-editing, to create output that is
understandable and readable, but not necessarily stylistically perfect, and ‘full’ (or ‘conventional’)
post-editing, with the aim of producing quality equivalent to “that of a text that has been translated
by a human linguist” (de Almeida 2013 p.13). The type of post-editing chosen depends on the
purpose of the translated text, and the financial resources available.
In professional translation workflows, once the source text has been machine translated, the output
is presented to the translator-user using a suitable interface. It is increasingly common for this
interface to be one and the same as that provided by the translation memory tool. As described in
the introduction to this volume, the TM UI is where specialized translators translate from scratch,
or edit legacy translations when the source text segment is the same as or similar to one that has
been translated previously. Target text segments from the TM are assigned a match percentage based
on the difference between the source text to be translated and the source text in the TM (a match of
less than 100% is known as a ‘fuzzy match’). This gives the translator an estimate of the amount of
similarity with a source segment previously stored in the TM. The new translation or edited fuzzy
match is then saved to the TM, dynamically improving future match suggestions. It is increasingly
common for a TM UI to have a facility to use MT when no TM target text match is available, or
when the TM match’s fuzzy match percentage is low. There is no universally accepted threshold
above which MT output is considered to require less editing effort than a TM fuzzy match but
research has suggested that raw MT editing effort may be equivalent to that required for 80%-90%
fuzzy matches (O’Brien 2006; Guerberof 2008), although this would depend on the text type,
language pair, and the raw quality from the MT engine.
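As a rough illustration of how a match percentage might be derived, the sketch below scores a new source segment against a stored TM source segment using Python's standard difflib module. Commercial TM tools use their own matching algorithms, which are not described here; the function name fuzzy_match_percent, the whitespace tokenisation, the lower-casing, and the example segments are all assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def fuzzy_match_percent(new_source: str, tm_source: str) -> int:
    """Return a 0-100 similarity score between two source segments.

    Hypothetical helper: word-level similarity via difflib, not the
    algorithm of any particular TM tool.
    """
    new_tokens = new_source.lower().split()
    tm_tokens = tm_source.lower().split()
    ratio = SequenceMatcher(None, new_tokens, tm_tokens).ratio()
    return round(ratio * 100)


# Illustrative segments (invented for this sketch).
stored = "Click the Save button to store your changes."
incoming = "Click the Cancel button to discard your changes."

score = fuzzy_match_percent(incoming, stored)
# Any score below 100 would be presented to the translator as a fuzzy match.
print(f"Match: {score}%")
```

A word-level comparison is used here because edits to TM matches are usually made word by word; a character-level comparison would be an equally plausible choice and would give different scores.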
Given the growing importance of TM interfaces in the post-editing process, it is worthwhile
considering the extent to which they meet their users’ basic translation editing needs, even before
post-editing requirements are factored in. Unfortunately, previous research suggests that user-centred
design (UCD) has not been general practice in TM technology. In Lagoudaki’s survey of TM UIs
in 2006, it was found that industry research and development was mostly motivated by “technical
improvement of the TM system and not how the TM system can best meet the needs of its users”
(Lagoudaki 2008 p.17). She added that “systems’ usability and end-users’ demands seem to have
been of only subordinate interest” in TM system development (2008 p.17). In practice, TM users
are usually “invited to provide feedback on an almost finished product with limited possibilities for
changes” (Lagoudaki 2006 p.1). Lagoudaki concluded that TM tool users wanted simplicity in their
UI, not necessarily meaning fewer features, but focusing on stable performance, improved
interoperability, and high compatibility between file formats.
The non-UCD design process goes some way to explaining why Lagoudaki found that users were
widely dissatisfied with their translation editing interface, despite (at that time) 14 years of TM tool
development. This dissatisfaction was reiterated in McBride (2009 p.125), where one forum
contributor said that the user was not the focus of the design process as tool developers hope “above
all to sell to giant corporations, who will put pressure on translation agencies to buy, who will
likewise pressure translators to buy”. According to Ozcelik et al (2011 p.7), end user involvement
in software development is frequently mitigated, as “often the person who decides on purchase is
not really the end user.”
Other research has looked at the type of post edits that are typically made by human post-editors.
De Almeida (2013), for example, found that post edits (in English to French and English to Brazilian
Portuguese) typically include changes such as word reordering, addition or removal of capitalization,
and changes to the gender or number of a word. A post-editor may find him or herself repeating
edits throughout a project, e.g. changing the same word from masculine to feminine inflection every
time it occurs, with no associated improvement to the MT suggestions (ibid.). Koponen (2012)
reported similar edits in English to Spanish post-editing, noting that word order changes were
perceived by her study participants as being more cognitively challenging than correction of an
individual word.
Unfortunately, current TM UIs are usually incapable of providing the post-editor with an estimate
of editing effort required for each segment, or of assisting with these common edits, although several
research projects are now underway that aim to create new enhanced CAT tools, by adding
functionality to assist post-editing. An early example of such a project, TransType, integrated
Interactive Machine Translation (IMT), with varying levels of success, to suggest completions of a
segment that the translator had already started to translate (Langlais, LaPalme, and Loranger 2002).
The technique used was similar to that used in predictive texting. Later projects integrated functions
that have only recently become feasible, such as the use of translation quality estimation to
recommend TM or MT matches (as proposed by Dara et al. 2013), although most are not in use by
professional translators or post-editors at the time of writing. Another tool under active research and
development, iOmega-T, is a version of the popular open source CAT tool Omega-T that retains
information on edits carried out by translators for post-translation analysis. This information can
give valuable detail to researchers and managers of post-editing activities, but the UI itself offers no
novel functionality for post-editors (Moran and Lewis 2011). The Matecat project, meanwhile, was
an industry/academia collaboration (since commercialised) that aimed to create a web-based CAT
tool to include estimation of MT quality, and incremental “tuning” of the MT output based on
post-edits (Cettolo, Bertoldi, and Federico 2013). At the time of writing it is in full production use by the
project’s industry partner, although MT quality estimation has not yet been incorporated (de Souza
et al. 2014). The associated Casmacat project focused further on novel functionality deployed in a
web-based platform, adding interactive machine translation prediction, intelligent auto-completion,
word alignment visualisations, and confidence measures.[i] Casmacat also added integration with
eye-trackers and e-pens, and has subsequently been made available as open source software for end users
(Koehn et al. 2015). Two other tools, PET (Aziz, de Sousa, and Specia 2012) and Caitra (Koehn
2009), were developed for post-editing research purposes, although they are not actually used in
production. Notably, prior to the current research there has not been a focus on what functionality
users would like to see in a tool for post-editing as the MT research community has had a tendency
to “focus on system development and evaluation” rather than considering end users (Doherty and
O’Brien 2013 p.4). Our work builds on previous research by gathering several possibilities for a
post-editing UI and inviting user input into whether and how this may be implemented.
Research design
The main objective in this research is to create user-focused specifications for editing interfaces to
better support the post-editing task. Our two research questions are:
1. Can we get pre-design information from users in order to redress the balance of
user/engineering input that is common in translation tool development?
2. What are the pain points in post-editing and how can these be addressed in a translation
tool?
The method employed in answering these questions was a pre-design user survey (Tidwell 2006),
followed by detailed interviews with several of the survey participants. The findings from this initial
research may form a starting point for tool development, which should involve evaluation and
validation (or otherwise) of the specifications as gathered from direct observation of users.
Survey
The first phase of this research was a pre-design survey focusing on five broad areas so as to better
understand what support features post-editors might require. These five areas were: (1) biographical
details, (2) current working methods, (3) concepts of the ideal UI, (4) presentation of TM matches
and MT output, and (5) intelligent functions to combine TM and MT matches. The survey contained
ideas for specific features that we considered might serve post-editors, based on common edits
reported in research, and post-editing functions currently in development within the research
community (see above). Respondents to the survey were also able to give more detailed comments
or suggestions immediately following specific questions.
The survey was carried out via the Internet after ethics approval had been granted by Dublin City
University Research Ethics Committee using the LimeService platform (www.limeservice.com),
and required completion of an Informed Consent section prior to beginning the main body of the
survey. In section 1 of the survey, participants were asked about their length of experience as a
translator and as a post-editor, in years and months. They were asked about their professional status
(freelance or employed by a company), and their views on TM and MT technology respectively (I
like using it; I dislike using it; I use it because I have to; I use it because it helps in my work; MT is
now an advanced technology; MT is still problematic). In section 2, participants were asked for their
source and target languages, what editing environments they currently use, what they like most
about their current tools, and to “describe one aspect of your current editing environment that
frustrates you.” These were all to be answered in free text. They were asked whether they customize
their translation UI and, if so, what elements they customize. Then, finally, they were asked about
preferred screen layouts for source/target texts and for glossaries.
Section 3 focused on “features respondents would like to see in the post-editing environment and that
are not currently available in (their) regular translation editing environments”, again leaving a free
text box for response. Questions about keyboard shortcuts and about preference for a simple or
feature-rich UI were followed by a series of questions about specific types of post-edits that
respondents might like a keyboard shortcut to automate, with answers to be chosen on a four-point
Likert scale (see Figure 1), and a query about whether a macro creation tool would be useful.
Fig. 1. Survey question with radio button responses.
Questions in section 4 addressed the combination of TM features with support features for MT and
post-editing. Finally, participants were asked whether they would leave an email address for further
participation in an interview, noting that in doing so, they would waive their anonymity. The survey
went live on May 7th 2013, and a link was sent to six localization companies, who disseminated it
internally (see Acknowledgments). The survey was closed at the end of business hours on June 6th
2013.
Interviews
The follow-on interviews were largely based on the survey results. They were carried out in order
to ascertain what post-editors consider most troublesome about post-editing, and to ask them in more
detail about some of the features suggested in the survey. These interviews also presented an
opportunity to see if participants had requirements for a post-editing UI that had not been identified
in the survey responses. The final question was an open one: “Do you have any other suggestions
on how to support post-editing through the UI?” Interviews took place between July 2nd and August
2nd 2013 via the Skype voice-over-IP tool, and were recorded using Callnote.
Survey Results
Demographics
There was a total of 403 survey participants, of whom 231 answered all sections. The number of
participants who completed a section will be presented in reporting that section. Where
percentages are given, these represent the percentage of the participants who completed that
particular section. (In such cases, absolute values are given in parentheses.) As the survey was
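This reporting convention can be made concrete with a small calculation. The sketch below, in which the helper name report is hypothetical, formats a count as a rounded percentage of the respondents who completed the relevant section, followed by the absolute value in parentheses. The figures are taken from the section on current editing environments, where 182 of 246 respondents report using SDL Trados.

```python
def report(count: int, section_total: int) -> str:
    """Format a count as 'NN% (count)' relative to one survey section."""
    if section_total <= 0:
        raise ValueError("section_total must be positive")
    percentage = round(100 * count / section_total)
    return f"{percentage}% ({count})"


# 246 respondents described their editing environment; 182 use SDL Trados.
print(report(182, 246))  # prints "74% (182)"
```

Because the denominator changes from section to section, the same absolute count can correspond to different percentages in different parts of the survey.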
advertised by six localization service providers to their translator base, responses were somewhat
biased by IT localization practices and experiences. At the same time, this sector has in recent
years embraced the deployment of MT and we therefore expected to access survey respondents
who had acquired post-editing experience and who would be in a good position to respond to the
questions. 280 participants completed the biographical section. Figure 2 shows the age ranges of
these participants.
Fig. 2. Participants’ age range.
Most participants reported that they had 6-12 years’ translation experience. 26 participants
claimed more than 20 years’ experience. As post-editing has only recently become commonplace,
reported post-editing experience was mostly 1-3 years, with 69 participants reporting no
experience of post-editing. All but three of the 42 respondents aged 20-30 had some experience of
post-editing (at most 2 years). Roughly 80% of respondents aged between 31 and 50 (125
respondents) had experience of post-editing (usually 2 to 6 years), and just over half (17) aged
over 50 had post-editing experience.
In response to a question about professional status, 29% of participants (81) reported that they
work as freelancers without an agency, 31% (85) work closely with one agency on a freelance
basis (9 participants work on a freelance basis with several agencies), and 23% (63) are translation
or localization company employees. 21 respondents run their own companies. This cohort
represents a good spread of work profiles typical to the translation industry. A statistically
significant association was found between translators’ age and professional status. Respondents
under 30 are more likely to be company employees (67%, or 23), whereas those over 30 are more
likely to work on a freelance basis (71% or 148). The proportion employed directly by a company
drops to 26% (23) for those aged 30-40, falling to 6% (2) for those over 50.
[Figure 2 axes: age range (<20, 20-30, 31-40, 41-50, 50+) against a count axis running from 0 to 120.]
56% of participants (153) reported that they like using TM technology, as compared with 18%
(49) who said that they like using MT. 75% of participants (206) report using TM because it helps
with their work, whereas 30% (83) report using MT because it helps. 56% (149) hold the view that
MT is still a problematic technology. Fewer respondents aged over 40 agreed that “MT was still
problematic”, which suggests that they do not feel threatened by MT, but taken in conjunction
with the older group’s lesser post-editing experience, it could also mean that they have less
familiarity with MT and its associated errors. These differences aside, responses were consistent
between age ranges.
223 of 280 participants (80%) translate from English, a reflection of the nature of the respondents
and the companies who promoted the survey, although many translate from more than one source
language. (In the IT localization sector, English is the main source language (Kelly et al. 2012).)
Target languages are reasonably well spread amongst participants, and are listed in Table 1. This
spread of target languages was important for the survey results as the post-editing task can vary
depending on the target language in question and its typical linguistic features.
Table 1. Participants’ target languages.
Target Language    No.
Arabic             5
Chinese            24
Czech              11
Danish             3
Dutch              7
English            49
Finnish            4
French             34
German             26
Greek              5
Hindi              3
Hungarian          5
Italian            18
Japanese           17
Korean             4
Malay              1
Norwegian          3
Polish             1
Portuguese         24
Russian            7
Spanish            27
Swedish            6
Thai               3
Turkish            4
Urdu               3
Current Editing Environment
246 participants provided details of their current editing environments (63%, or 155, use more than
one environment regularly). 74% (182) of these use a version of the SDL Trados TM tool. Company
employees are more likely to use SDL Trados; 109 of 152 freelance translators (72%) and 65 of 76
(84%) company employees said that they use a version of SDL Trados.[ii] 62% of company
employees (39) in our current survey said that they use multiple tools, but the rate was even higher
among freelancers (68%, or 116).[iii] Contrary to our expectations, 38% of participants (94) use
Microsoft Word for post-editing, which suggests that MT and TM are not currently as integrated as
we had thought, despite the increasing industry focus on MT integration as reported above. Figure
3 shows the number of users per editing UI among survey participants. Some other tools used by
fewer than 15 participants were XTM (13), Alchemy Catalyst (12), OmegaT (8), Star Transit (5),
TransStudio (5), and Alchemy Publisher (1). 28 participants also listed proprietary tools (Translation
Workspace, Helium, and MS LocStudio).
Fig. 3. Tools used for translation and post-editing.
Roughly half of the participants in this survey section reported unhappiness with the default layout,
colouring, and display of mark-up tags in their current editing UI. 15 complained specifically about
their current UI, citing poor layout or visibility, outdated UIs, and unhappiness with too many
product updates. “The UI is not user friendly,” wrote one, “each UI uses their own different shortcuts,
there is an inability to see segments comfortably.” Seven participants mentioned compatibility
issues and problems with tags. 66% (167) of participants would rather customize their editor than
use the default set-up. 79% of those 167 respondents adjust their onscreen layout, 74% adjust tag
visibility, 68% adjust font type, and 23% adjust colours.
Performance issues figured strongly amongst survey comments, with 19 participants complaining
about bugs, errors, and slow response times within their current UI. One participant wrote: “I work
mainly in [software name], which is useful [but] an incredibly fragile piece of software that has
caused me to lose time due to crashing or failing to save output files correctly.” 21 other participants
stated that they have experienced formatting problems.
[Figure 3 data, number of respondents per tool: SDL Trados (182), MS Word (94), Wordfast (50), Idiom Worldserver (46), MemoQ (41), SDLX (26), Passolo (15).]
25 participants expressed unhappiness with the quality of MT output and MT support within their
current tool. One wrote that “sometimes the quality of MT makes me stay longer at a (translation
unit) than I would having no MT to deal with”. This was a recurring bugbear mentioned in open
responses throughout the survey, as some participants appear to be dissatisfied with and suspicious
of MT. Another problem mentioned was the steep learning curve of CAT tools. On the other hand,
30 participants said that they are happy with their current UI, although not necessarily in the most
positive terms: “I'm so used to it, that I can't find anything frustrating”.
When asked what they liked most about their current tools, many (33) mentioned performance, ease
of use, and stability. 17 mentioned specific features such as auto-propagation, integrated QA (quality
assurance) checking, and concordance searches. Six participants wrote that they liked their current
UI, with one writing “the editing changes are clearly marked and text before and after are displayed
side by side”.
UI wish list
The importance of customizability was emphasized in many of the 245 responses to this section of
the survey. 63% of participants (152) expressed a preference for a customizable UI, and 57% (138)
a clean and uncluttered UI. In response to a question about features currently unavailable in regular
translation editing UIs, but that participants would like to see in a UI that supports post-editing, 14
users said they would like to see improved glossaries and dictionaries, with six wanting to be able
to make glossary changes that would be propagated throughout a document. Three suggestions
related to MT and involved improved display of provenance data (e.g. the origin of the suggested
translation), improved pre-processing, and dynamic changes to the MT system in the case of a
recurrent MT error which needs to be fixed many times during post-editing. Other UI wishes
included a global find-and-replace function, reliable concordance features, and grammar checking.
Notably, these latter UI requests are for features to support the general translation task, adjudged to
be lacking in users’ current tools (see Figure 3) despite two decades of TM tool development.
Participants appear to be keen users of keyboard shortcuts; 29% (70) of 241 participants use
keyboard shortcuts often, and 40% (96) use them “very often”, while only 5% never use them at all.
80% (193) responded that their productivity was improved by using keyboard shortcuts. For specific
operations required by MT post-editing and some operations not covered by current default shortcut
options, participants were asked whether a keyboard shortcut would be useful. Responses to specific
keyboard shortcut suggestions are shown in Table 2. Proposed shortcuts are listed in the left-hand
column and the number of respondents who considered these shortcuts useful is in the right-hand
column.
Table 2. Keyboard shortcuts requested.
Shortcut                                No.
Dictionary search                       203
One-click rejection of MT suggestion    185
Web-based parallel text lookup          158
Change capitalization                   149
Apply source punctuation to target      128
Add/delete spaces                       124
Dictionary search and the suggestion of a shortcut for web-based parallel text lookup, while popular
in survey responses, are not post-editing-specific. Post-editing specific responses included a 77%
preference (185) for a keyboard shortcut that would allow a one-click rejection of an MT suggestion.
This assumes that the MT suggestion is automatically pasted to the edit window, of course, which
could be configurable in an editing interface. In XTM, for example, the MT suggestion may either
be automatically pasted or added electively using a keyboard shortcut, depending on user settings.
Incorrect letter casing is also often problematic in MT output, reflected by the 62% (149) who would
like to see a keyboard shortcut for changing capitalization.
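To make the capitalization request concrete, here is a minimal sketch of what such a shortcut might do to a selected span of target text: cycle it through lower case, title case, and upper case on successive key presses. The function name cycle_case and the three-step cycle are assumptions for illustration; the survey did not specify how the shortcut should behave.

```python
def cycle_case(selection: str) -> str:
    """Cycle selected text: lower -> Title -> UPPER -> lower.

    Hypothetical behaviour for a 'change capitalization' shortcut.
    Note that str.title() also capitalises letters after apostrophes,
    so a real editor would need language-aware casing rules.
    """
    if selection.islower():
        return selection.title()
    if selection.isupper():
        return selection.lower()
    # Title case or mixed case: move on to upper case.
    return selection.upper()


# Three successive key presses on the same selection.
text = "file menu"
for _ in range(3):
    text = cycle_case(text)
    print(text)
```

Binding a single key to a cycle, rather than one key per casing style, keeps the number of shortcuts to memorise small, which matters given the concerns some respondents raised about learning many shortcuts.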
Fewer participants consider our suggested language-specific keyboard shortcut suggestions useful,
possibly due to the large spread of target languages among participants. The most popular suggested
shortcut would change the word order in a machine translated segment (considered useful by 42%
or 102 participants), followed by a change in the grammatical number of a word (e.g. from singular
to plural). Further responses to suggested language-specific shortcuts may be seen in Table 3.
Notably, the suggestions for prepositions and for postpositions do not apply to all languages, which
may have led to lower numbers considering these options useful. Additionally, participants may be
unable to measure usefulness without first testing these features in practice.
Table 3. Language-specific keyboard shortcuts requested.
Shortcut                           No.
Adjust word order                  102
Change number (singular/plural)    99
Change gender                      79
Change verb form                   68
Add/delete preposition             67
Add/delete postposition            65
Participants expressed their opinions relating to these shortcuts in the open comments sections of
the survey. Of 125 commenters, 34 were in favour of the shortcuts: “It seems obvious to me that all
such keyboard shortcuts would be useful. I reckon I use 15-25 keyboard shortcuts in each of the
main CAT and other productivity applications I use on a daily basis.” Nine comments had further
suggestions for shortcuts, such as Internet-based text lookup, parallel text searching, and ‘copy
source formatting to target’. 41 participants provided negative comments, with many unable to
understand how the shortcuts might work in practice. 18 participants had misgivings specific to one
of their languages. Several thought that manual changes would be easier or less time-consuming
than memorizing a large number of shortcuts, an opinion that recurs in the interviews.
Participants appeared to favour customizable shortcuts. 68% (164) would like to be able to adapt
the UI functionality using macros or scripts. 52% (125) would like to be helped or guided in creating
such a macro. Three comments expressed a desire for instructions to be clear and simple, and for
interoperability considerations to be taken into account, so that macros from other programs (MS
Word was suggested) might work in the UI. Of 20 comments, all but one were positive about
user-added macros. Two commenters use AutoHotkey to set shortcuts globally on their systems, but
would like to be able to add program-specific shortcuts.
Presentation of TM matches and MT output
233 participants completed this section of the survey, of whom 81% (189) would like to be presented
with confidence scores for each target text segment from the MT engine. 70% of those 189 would
like confidence scores to be expressed as a percentage (like a fuzzy match score in a TM tool), and
25% expressed a preference for colour-coding. If a machine translation suggestion received a higher
confidence score than any fuzzy match available from the translation memory, 88% of participants
(205) said that they would nevertheless like to see both MT suggestions and TM matches. Only
three participants (just above 1%) would like to see the MT match only. Both of these findings
suggest a lack of translator confidence in MT output. This scepticism is also expressed by the 14
participants who would like to see the TM match only, even when a higher-rated MT suggestion is
available, and in the choice by many participants of the lowest possible fuzzy match value below
which they would prefer to see MT output rather than a TM fuzzy match. (The thresholds chosen
by participants may be seen in Figure 4.) But despite other evidence suggesting a low level of
post-editor confidence in MT, 80% of respondents (186) would like to see ‘the best MT suggestion’
automatically appear in the UI target window when no TM match is available.
Fig. 4. Fuzzy match thresholds below which an MT match is favoured.
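The preferences reported in this section can be summarised as a simple display policy. The following sketch is hypothetical: the Suggestion class, the choose_display function, and the 0-100 score scale are assumptions, and it presumes an MT confidence score that is comparable to a fuzzy match percentage, which is itself only a proposal in the survey. It lists both TM and MT suggestions with their provenance, ranks them by score, and pre-fills the target window with the MT suggestion only when no TM match is available.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Suggestion:
    text: str
    score: int       # 0-100: fuzzy match % for TM, confidence for MT (assumed)
    provenance: str  # "TM" or "MT"


def choose_display(
    tm: Optional[Suggestion], mt: Optional[Suggestion]
) -> Tuple[List[Suggestion], Optional[str]]:
    """Return (suggestions to list, text to pre-fill in the target window)."""
    shown = [s for s in (tm, mt) if s is not None]
    # List every available suggestion, highest score first, so the
    # translator still sees the TM match when the MT score is higher.
    shown.sort(key=lambda s: s.score, reverse=True)
    # Pre-fill with MT only when there is no TM match at all.
    prefill = mt.text if tm is None and mt is not None else None
    return shown, prefill


# Invented example: MT outscores the TM match, yet both are listed.
tm_hit = Suggestion("Enregistrer le fichier", 72, "TM")
mt_hit = Suggestion("Sauvegarder le fichier", 85, "MT")
shown, prefill = choose_display(tm_hit, mt_hit)
for s in shown:
    print(f"[{s.provenance} {s.score}%] {s.text}")
```

Keeping the provenance label on every suggestion reflects the respondents' evident wish to know where a proposed translation comes from before deciding how far to trust it.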
62% (144) felt it would be useful if the editing environment could combine the best sub-segments
(or phrases) from the MT system and the TM to produce an MTM (Machine Translation/Translation
Memory) combined suggestion. 47 participants commented on this proposed function, showing
mixed opinions. 21 commenters responded positively about a potential MTM match, while nine
commenters were not in favour of the feature, with one writing that it “seems theoretically useful,
but when really applied it (could) create confusion”.
[Figure 4 axes: number of respondents (0 to 70) per fuzzy match value (< 65%, < 70%, < 75%, < 80%, < 85%, < 95%, Don't know).]
Five commenters had suggestions such as allowing the feature to be disabled, while 87% of
respondents (203) said they would like to see the provenance of MT or TM suggestions denoted by
colour at a sub-segment level. The importance of provenance and retention of meta-data showing
the origin of match suggestions appeared clear across the whole survey.
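One way to picture sub-segment provenance is as a label carried by every piece of a combined suggestion. The Python sketch below is a minimal illustration under stated assumptions: the `SubSegment` type, the three origin labels and the colour names are all hypothetical, chosen only to show how origin metadata could be retained and displayed.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical colour scheme; a real UI would presumably let the user choose.
PROVENANCE_COLOURS = {"TM": "green", "MT": "violet", "USER": "black"}


@dataclass
class SubSegment:
    text: str
    origin: str  # "TM", "MT" or "USER"


def render_with_provenance(parts: List[SubSegment]) -> str:
    """Mark up each sub-segment with the colour assigned to its origin."""
    return " ".join(f"[{PROVENANCE_COLOURS[p.origin]}:{p.text}]" for p in parts)


def apply_user_edit(parts: List[SubSegment], index: int,
                    new_text: str) -> List[SubSegment]:
    """Return a copy in which one sub-segment is replaced by a user edit."""
    edited = list(parts)
    edited[index] = SubSegment(new_text, "USER")
    return edited


combined = [SubSegment("Click the", "TM"), SubSegment("save button", "MT")]
print(render_with_provenance(combined))
```

Because the origin travels with each sub-segment, it survives editing: a piece the translator changes is relabelled rather than silently merged with its neighbours.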
Features to Support Post-Editing
231 participants completed this section of the survey, wherein participants were asked for their
opinion on some functions that have been suggested for supporting post-editing. As these functions
have not yet been implemented in a commercial tool, they remain largely untested. We considered
it worthwhile to gather participants’ opinions on these functions, but they would, of course, require
evaluation and validation prior to adding to a UI for release. We have already outlined above how
some participants expressed a desire to see dynamic, real-time improvements to MT systems. Some
work on this topic has been published by Alabau et al. (2012), who suggest that MT systems could
use human post-edits as “additional information to achieve improved suggestions”. In our survey
71% of respondents (164) said that their edits should be used to improve a client-specific MT system.
23% (53) were unsure, with concerns expressed in 42 comments. Four commenters were concerned
about issues relating to client confidentiality, while others resented further reuse of their translation
work. This intellectual property (IP) concern was expressed by one participant who wrote: “Who
would pay a translator for his intellectual work in improving the TM/MTM?”
69% of participants (159) would like to see edits not just used to improve suggestions, but to also
retrain an MT system in real time. Most commenters felt positively about potential improvements,
one writing that “That should be one of (the) main goals of MT, not only lower rates.” Again, several
participants would like to use this function electively. Three commenters believe that the client
should decide whether the content should be added to the MT engine, and five were not in favour
of this function. One participant feels that, were it possible to incorporate this function, it may lead
to further complications depending on the work flow and steps required for review or approval. “If
immediately incorporated, I'd like to know where each segment is coming from (i.e. what is from
(the) old MT engine, what is a recent addition/my own work, what's been reviewed as accurate, and
what's still pending.” Another commenter wondered how a system could learn only the ‘right’
changes (i.e. changes to recurring incorrect phrases or terms): “a too-generalized auto-adaptation
feature may create errors”.
Participants were asked how useful they would find two variants of IMT. Using the first variant, the
editing environment could dynamically alter pre-populated MT suggestions depending on edits as
the user moves through a segment, so that as the user edits the MT suggestion, the system would
offer “context dependent completions”, adjusting the remainder of the target segment based on the
user’s edit (Langlais, Lapalme, and Loranger, 2002 p.78). 48% (111) considered that this feature
would be useful, with 28% (65) unsure. Participants were more certain that they would like to be
able to turn this feature on and off, with 93% (215) requesting that the function could be used
electively. 46% (106) were in favour of a second variant of IMT, whereby the editing environment
could dynamically auto-complete segments translated from scratch based on MT proposals. (20%
(46) thought that this would not be at all useful.) A slightly higher proportion of 54% (125) would
like to see MT suggestions at a sub-segment level. 184 participants (80%) said that they would like
to see sub-segment MT suggestions provided as a drop-down list, with 35% (64) of those suggesting
two to three list items, and a further 34% (63) preferring the ability to customize the number of list
items themselves.
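The idea of completions that depend on what the user has already typed can be illustrated with a deliberately simplified Python sketch. It assumes that the MT system has already produced a ranked list of alternative translations and merely filters that list by the typed prefix; interactive MT systems of the kind described would typically generate a new continuation after each edit instead. The function name `completions` and the `max_items` setting (standing in for the user-customisable number of list items) are hypothetical.

```python
from typing import List


def completions(prefix: str, hypotheses: List[str],
                max_items: int = 3) -> List[str]:
    """Return up to max_items continuations of the text typed so far.

    hypotheses are assumed to be ranked best-first by the MT system.
    """
    found: List[str] = []
    for hyp in hypotheses:
        if len(found) >= max_items:
            break
        if hyp.startswith(prefix):
            rest = hyp[len(prefix):]
            if rest and rest not in found:
                found.append(rest)
    return found


ranked = ["the red house is big", "the red home is large", "a red house is big"]
print(completions("the red h", ranked))
```

Setting `max_items` to two or three matches the list length that the largest group of respondents suggested for a drop-down list.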
The final questions related to user feedback on productivity, which is of great importance in the
translation industry and is largely driving MT deployment (DePalma et al. 2013). The responses also
revealed participants’ suspicion of data dispossession. 70% (162) would like to see their productivity
reported dynamically, such as in words per hour or percentage completed, as long as this reporting
function can be turned on and off, and that the information is for their personal use only. 48% would
like to see dynamic reporting of their earnings. Among 58 comments about this proposed feature,
36 participants consider this to be a great idea. One participant wrote “Even after 6 years in the
industry, I still find estimating time vs. fees to be quite difficult. Even when I am able to view the
source text beforehand to make my estimate, I often misjudge the quantity or technicality of the
work. I think having an automated tracker would be fairer for me and the client.” Ten commenters
specified that this sort of information should not be available to the client (“For client tracking – oh
hell no.”) and 13 would not be in favour of this function at all, saying that it is unnecessary, will
create more clutter, and put post-editors under too much pressure.
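The two measures mentioned by participants, words per hour and percentage completed, involve only simple arithmetic. The Python sketch below is a hypothetical illustration: the `ProductivityTracker` class and its field names are invented for this example. It keeps the figures local to the user and returns nothing at all when reporting is switched off.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProductivityTracker:
    total_words: int           # words in the whole job
    enabled: bool = True       # the user can switch reporting on and off
    words_done: int = 0
    seconds_worked: float = 0.0

    def record(self, words: int, seconds: float) -> None:
        """Add the words confirmed and the time spent on them."""
        self.words_done += words
        self.seconds_worked += seconds

    def words_per_hour(self) -> float:
        if self.seconds_worked == 0:
            return 0.0
        return self.words_done * 3600 / self.seconds_worked

    def percent_complete(self) -> float:
        if self.total_words == 0:
            return 0.0
        return 100 * self.words_done / self.total_words

    def report(self) -> Optional[str]:
        """Return a status line for the user, or None when reporting is off."""
        if not self.enabled:
            return None
        return (f"{self.words_per_hour():.0f} words/hour, "
                f"{self.percent_complete():.0f}% complete")
```

For example, 250 words confirmed in 30 minutes of a 1,000-word job would be reported as 500 words per hour and 25% complete.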
Interview results
43 survey participants provided email addresses, agreeing to waive their anonymity and to provide
more details about their preferences for a post-editing user interface. In order to better answer our
research questions (see Section 3), we contacted 16 of these survey participants, choosing only those
who had post-editing experience and attempting to cover a wide range of language pairs. Ten
participants agreed to participate in follow-on interviews, all but one of whom listed English as their
source language. Interviewee profiles are shown in Table 4.
Table 4. Interviewee profiles.
Post-Editor   Post-Editing Experience        Tools
A             2 years                        SDLX
B             6 years                        SDLT, GTT
C             2 years                        MemoQ, SDLX, SDL Trados, Worldserver, others
D             5 years                        SDLX, Trados (2007 and Studio), TagEditor, Idiom
E             2 years                        Wordfast Pro or SDL Trados Studio 2011
F             9 months                       MemoQ, Trados (2007 and Studio), TagEditor, Idiom
G             Various projects over years    OmegaT
H             5 months or so                 SDL Trados
I             1 year                         Trados, Wordfast, Idiom Worldserver
J             2 years                        SDL Trados, MemoQ, WordFast, Catalyst (SDL Trados 2009 most efficient)
Interface design
In response to the question “What existing software comes closest to your ideal UI?”, four
interviewees chose SDL Trados (all but one specified the Studio version), four interviewees
mentioned MemoQ, one chose SDLX, and one chose OmegaT. Informant D chose the SDLX tool,
but only as she had been able to customize the tool so as to link with online dictionaries. Interviewees
were asked “What do you think is most important: simplicity in the UI or a feature-rich UI?” Three
interviewees chose ‘feature-rich’ and three ‘simplicity’, but for many this was not an appropriate
distinction. Rather than having many or few items onscreen, they considered it important to have
‘the right’ features. Informant B complained of too much information displayed to the user in the
SDL Trados interface, saying “you have to use a big screen to be able to leverage from all of that
information”.
Several informants felt that the solution to onscreen clutter is in having a highly customizable UI.
H said “what I'd like would be more opportunity to build buttons into the interface myself.” He
continued, “There are some functions that I want to use repeatedly, and I have to go through various
dropdown menus (to access them)”. Another possible approach could be to create a UI that adapts
to the user or one that presupposes a learning curve whereby more functionality may be revealed as
users gain experience with the new interface. Tidwell (2006 p.45) recommends designing a UI,
particularly for new users, that hides all but the “most commonly used, most important items” by
default. Informant C explained that user needs may change as they become familiar with the UI:
“I have a lot of difficulties learning shortcuts at the beginning, but then after 6 months
using the same tool, you find that those buttons, you don't need any more, so maybe
something that could be customized to the user.”
F added that, for him, performance ranks higher than the look of a UI, and additional functionality
is only useful if allied with performance. “In the end,” he said, “what decides is how fast we can
work with it.” In his office, although he considers that SDL Trados Studio has good features, “we
use MemoQ because it’s quicker.” Good performance makes this compromise worthwhile. “We
miss some features, but we make up for it in speed.” Referring to ease of use of editing interfaces,
F said that many of the edits he makes “are very easy to see but very cumbersome to make with the
common tools”. “When a segment is wrong, you have to start all over again from scratch, then the
user interface has no influence whatsoever, but if a segment is almost right, these little things that
could make it better, those are the things that could really speed up the process.”
Current post-editing problems
Interviewees were asked about common edits they make during post-editing and possible ways to
support these changes in the UI. Six informants mentioned word order changes. Four informants
reported problems with terminology, such as term updates since the MT engine was trained. Three
informants mentioned grammar problems generally, with D asserting that for her, the main pitfall
of Statistical Machine Translation (SMT) is “all the agreement: all the gender, number,
conjugation.” Other orthographical and grammatical pain points mentioned were letter case,
prepositions, number, and gender. F said that these kinds of changes “are the most frustrating for us
because it’s mechanics, and if it’s mechanic, there must be a way it could be done by a machine”.
He noted that gender changes should also include “other elements… like associated articles and the
words surrounding the noun.” Many of these suggested features are similar to those proposed in the
survey, which raises the question: did the survey prime participants for these interviews, or set them
thinking on the topic of a post-editing UI? It is unlikely that they remembered an online survey in
detail from one month previously (and once closed, the survey was not available to browse), but it
is quite possible that specific suggested features remained in mind at the time of the interviews.
Three interviewees would like to see automated language-specific changes to word order with B
commenting that “one of my dream features that I haven’t seen so far is changing word order.” She
said that in Portuguese “we use an adjective after the noun and it's quite the opposite in English, so
that is a really common error in machines that are not well-trained.” Two interviewees requested
that highlighted words could be swapped around within a segment. E said she would like to drag
and drop words into the correct order, while G thought such a feature would be superfluous: “It will
always be quicker doing that with a few keystrokes rather than clicking some button or special
shortcut.”
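The requested swap of highlighted words can be expressed as an operation on two spans of tokens. The Python sketch below is one hypothetical way to do this; the function name `swap_spans` and the use of `range` objects to stand for the highlighted selections are assumptions made for illustration, and how a user would trigger the swap (drag and drop, button or shortcut) is left open.

```python
from typing import List


def swap_spans(tokens: List[str], first: range, second: range) -> List[str]:
    """Swap two non-overlapping token spans, keeping everything else in place."""
    if first.start > second.start:
        first, second = second, first
    if first.stop > second.start:
        raise ValueError("spans overlap")
    return (tokens[:first.start]
            + tokens[second.start:second.stop]   # second span moves forward
            + tokens[first.stop:second.start]    # tokens between the spans
            + tokens[first.start:first.stop]     # first span moves back
            + tokens[second.stop:])


# An adjective wrongly placed before the noun in Portuguese output.
print(swap_spans(["a", "vermelha", "casa"], range(1, 2), range(2, 3)))
```

Here the call reorders "a vermelha casa" to "a casa vermelha", the kind of adjective and noun correction that B described.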
Two interviewees requested propagation of edits in subsequent segments, one would like to see a
feature like Autosuggest in SDL Trados Studio (a feature whereby a word or phrase from the TM
or glossary is suggested to the user based on context while the user types), and one suggested a list
of alternative translations for a word when hovered over by a mouse. When interviewees were asked
more generally about pain points in the post-editing task, five returned to the topic of MT quality.
D stated “I will only work with clients who customize their MT. I will not work with anybody who
just sends something to Google Translate and says ‘go ahead and post-edit this’.” A finds that,
because in her current UI MT is highlighted in violet, she tends to “jump automatically to the target
segment as if I were reviewing.” Even though she finds it “really is important to look at the source
first,” the highlighting makes it “difficult to focus on this ideal process of looking at the source
segment before looking at the target.”
Four participants were dissatisfied with the lack of confidence scores for MT. Interviewee I
complained that, when the MT is completely wrong, “it’s more work to post-edit machine translation
than just to translate from scratch.” Nine interviewees would like to see improved terminology
features. B requested “a shortcut to get different (language-specific) variations for the same term.”
She added that “having a glossary that automatically produces variation regarding gender and
number; I think that would be a killer glossary.” Four would like to make global term changes, with
D suggesting a scenario where she could update the second part of a project “taking into account
what you’ve added during the first part - that would be amazing”.
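A global term change of the kind requested can be sketched as a whole-word replacement applied across all target segments. The Python example below is a simplified, hypothetical illustration: `propagate_term` is an invented name, and the sketch matches exact word forms only, so it does not produce the gender and number variants that B described.

```python
import re
from typing import List, Tuple


def propagate_term(segments: List[str], old: str,
                   new: str) -> Tuple[List[str], List[int]]:
    """Replace whole-word occurrences of old with new in every segment.

    Returns the updated segments and the indices of those that changed,
    so that a UI could flag them for the translator to review.
    """
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    updated: List[str] = []
    changed: List[int] = []
    for i, seg in enumerate(segments):
        # A function is used as the replacement so that new is inserted literally.
        new_seg, count = pattern.subn(lambda m: new, seg)
        updated.append(new_seg)
        if count:
            changed.append(i)
    return updated, changed
```

Returning the list of changed segments keeps the translator in control, since each propagated change can still be checked in context.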
Features to Support Post-Editing
When asked in interviews about combined sub-segment matches from MT and TM, all but one of
the interviewees responded positively. Some were concerned with how it would work in practice.
An advantage of the interview stage was the possibility of yielding these more detailed and
considered opinions from the participants. B felt it would be useful “where you don’t have a
dedicated glossary but you have a really rich TM.” J, however, believes that this is a “far-fetched
feature that will complicate things further.”
All interviewees would like to see edits used to improve an MT system in real time, although five
would like these improvements communicated to the user (mostly using color-coding), and five
would like it to happen without notification. F said “It’s very important to know where a segment
has been assembled from, because you have to take a different approach to post-edit it.” E suggests
feedback if the user hovers with the mouse over a word that has been updated, and C would like to
see prompting as unsupervised improvements could be “very dangerous”. B and J felt that
notifications might delay them. J said “I don’t want to go through too many details while I work... I
just want the suggestion in order to be used as quickly as possible.”
Interviewees were asked about a function whereby the MT system would provide suggestions that
would be dynamically altered as they type, based on their edits. Six interviewees were in favour of
such a function, whereas four interviewees thought that this would be distracting or might delay their
translation; this could also be tested during development. D said “Although I type fast, I still look
quite a lot at my keyboard, so I wouldn’t see the edits as they happen while I’m typing.”
Discussion
The survey and interviews elicited user input to potential future UIs for post-editing, and identified
pain points in the post-editing process. They also revealed a prevailing level of frustration amongst
users concerning fundamental support for the translation task quite aside from post-editing. This
finding is in keeping with recent workplace studies such as the one carried out by Ehrensberger-
Dow (2014) in which 19 hours of screen recordings were collected in the translation workplace for
analysis. Ehrensberger-Dow and Massey (2014) found that certain features of the text editing
software used by the translators in their study were slowing down the process. For example, the
small window in which the translators had to type resulted in them frequently losing their position
when they had to shift between areas of the UI to obtain other information.
Our modus operandi could be criticized on the basis that participants were asked to comment on
features that are not yet available in commercial tools. Sleeswijk Visser et al. warn that such user
responses may “offer a view on people’s current and past experiences” rather than exposing “latent
needs” (2005 p.122). While we accept that such responses have their drawbacks, we consider user
input very much worthwhile as part of a pre-design survey that may lead to a new step in interactive
translation support. We also stress the necessity of testing and evaluation of all recommended
features prior to implementation as sometimes features are implemented with the supposition that
they will be useful, but this is not rigorously tested with a significant number of relevant users.
In much of the survey and in particular in the interviews, we focused on post-editing-specific
problems and requirements. We found evidence of scepticism towards MT based on perceptions of
MT quality and a feeling of dispossession related to the translator’s work being used for MT
retraining. Survey respondents stated a preference for even low-threshold TM matches (<65%) to
MT output, despite previous findings that such TM matches require more effort to edit than MT
output would (O’Brien 2006; Guerberof 2008). Scepticism about MT might also be behind survey
respondents’ enthusiasm for one-click rejection of MT suggestions. Both survey respondents and
interviewees expressed frustration at having to make the same post-edits over and over again, and
would like to see ‘on-the-fly’ improvements of MT output based on their edits (increasingly
plausible due to recent breakthroughs in SMT retraining speeds (Du et al. 2015)). Nonetheless, MT
and UI developers need to find an efficient method of making incremental improvements to MT
output in real time, in order to lessen the frustration of post-editors who are currently obliged to
make the same edits repeatedly.
A further issue that emerged in discussions of the integration of MT with TM related to post-editors’
strong desire to know the provenance of data that would be re-used in such scenarios. Retaining this
level of provenance data, however, would require user tracking and careful retention of metadata
from both TMs and the training data used in SMT, the latter of which would be particularly
technically demanding.
Conclusion
In this chapter we have reported on the results of an online survey with 231 full participants and a
series of interviews with 10 informants, focusing on the task of post-editing Machine Translation
(MT) and associated UI requirements. Our results provide an update to survey work by Lagoudaki
(2006 and 2008) on TM tools and what users consider important and desirable in a translation editor.
The focus of the study was on post-editing and MT, but an unexpected pain point was continued
dissatisfaction with translation tools in general. Despite updates of popular tools in recent years and
new UI features appearing in tools such as Matecat, Casmacat and Lilt (see note iv), our study identified a
perceived lack of development within existing translation interfaces, with survey and interview
participants complaining of long-standing issues with their current interfaces. This highlights a lack
of HCI input in translation tool development and design, and would suggest a real need for input
from HCI experts.
In addition, we must introduce a note of caution when basing user requirements on users’ opinions.
We hope that some latent needs may have been revealed during the interview stage, and consider
that unforeseen requirements may arise during user testing. We furthermore hope that this research
may contribute to translation editor interfaces optimised for post-editing that reflect the functional
and design needs of users.
Acknowledgments
This research is supported by the Science Foundation Ireland (Grant 12/CE/I2267) as part of the
CNGL (www.cngl.ie) at Dublin City University. The authors would like to thank the companies that
helped promote the post-editing survey: Alchemy/TDC, Lingo24, Pactera, Roundtable, VistaTEC,
and WeLocalize. We are grateful to Brian Kelly of TimeTrade Systems for advice on specifications
creation, and Prof. Dave Lewis and Joris Vreeke for feedback on the ensuing specifications. Finally,
we would like to thank Dr. Johann Roturier (Symantec), Morgan O’Brien (Intel), and Linda Mitchell
(DCU) for comments and advice on survey questions.
References
Alabau, V., Leiva, L. A., Ortiz-Martínez, D., Casacuberta, F. 2012. User Evaluation of Interactive
Machine Translation Systems. In Proceedings of the 16th Annual Conference of the European
Association for Machine Translation (EAMT). Association for Computational Linguistics,
Stroudsburg, PA, 20-23.
Aziz, W., Castilho de Sousa, S., Specia, L. 2012. PET: a tool for post-editing and assessing machine
translation. In Proceedings of the Eighth International Conference on Language Resources and
Evaluation (LREC'12). Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur
Doğan, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis (Eds.) European Language
Resources Association, Paris, France.
Blatz, J., Fitzgerald, E., Foster, G., Gandrabur, S., Goutte, C., Kulesza, A., Sanchis, A., Ueffing, N.
2004. Confidence Estimation for Machine Translation. In Proceedings of COLING '04: the 20th
International Conference on Computational Linguistics. Association for
Computational Linguistics, Stroudsburg, PA, 315-321. DOI:http://dx.doi.org/10.3115/1220355.1220401.
De Almeida, G. 2013. Translating the Post-Editor: An investigation of post-editing changes and
correlations with professional experience across two romance languages. PhD dissertation.
Dublin City University (DCU), Dublin, Ireland.
DePalma, D. A., Hegde, V., Pielmeier, H., Stewart, R. G. 2013. The Language Services Market:
2013. Common Sense Advisory, Boston, USA.
De Souza, J. G. C., Turchi, M., Negri, M. 2014. Machine Translation Quality Estimation Across
Domains. In Proceedings of COLING 2014, the 25th International Conference on
Computational Linguistics: Technical Papers, 409-420, Dublin, Ireland, August 23-29 2014.
Du, J., Moorkens, J., Srivastava, A., Lauer, M., Way, A., Lewis, D. 2015. D4.3: Translation Project-
Level Evaluation. FALCON Project EU FP7 deliverable.
Ehrensberger-Dow, M. 2014. Challenges of Translation Process Research at the Workplace. MonTI
Monographs in Translation and Interpreting 7: 355-383.
Ehrensberger-Dow, M., Massey, G. 2014. Cognitive Ergonomic Issues in Professional Translation.
In The Development of Translation Competence: Theories and Methodologies from
Psycholinguistics and Cognitive Science, John W. Schwieter and Aline Ferreira (Eds.), 58-86.
Newcastle upon Tyne: Cambridge Scholars Publishing.
Gould, J. D., Lewis, C. 1985. Designing for Usability: Key Principles and What Designers Think.
Commun. ACM 28, 3 (1985) 300-311. ACM Press, New York, NY, 4. DOI:http://dx.doi.org/10.1145/3166.3170
González-Rubio, J., Ortiz-Martínez, D., Casacuberta, F. 2010. On the Use of Confidence Measures
within an Interactive-predictive Machine Translation System. In Proceedings of the 14th
Annual Conference of the European Association for Machine Translation (EAMT). Association
for Computational Linguistics, Stroudsburg, PA, 8pp.
Guerberof, A. 2008. Productivity and Quality in the Post-editing of Outputs from Translation
Memories and Machine Translation. Minor dissertation. Universitat Rovira i Virgili, Tarragona,
Spain.
Guerberof, A. 2013. What do professional translators think about post-editing? The Journal of
Specialised Translation 19, 75-95.
Hutchins, W. J. 2001. Machine translation over fifty years. Histoire, Epistémologie, Langage. Vol.
23 (1), 2001: Le traitement automatique des langues (ed. Jacqueline Léon); 7-31. Retrieved
October 18, 2013 from http://www.mt-archive.info/HEL-2001-Hutchins.pdf
Kelly, N. 2014. Why so many translators hate translation technology. The Huffington Post. Posted
online on 19/06/2014. http://www.huffingtonpost.com/nataly-kelly/why-so-many-translators-
h_b_5506533.html. Last accessed: 04/11/2015.
Kelly, N., DePalma, D., Hegde, V. 2012. Voices from the Freelance Translator Community.
Common Sense Advisory, Boston, USA.
Kluger, A. N., Denisi, A. 1996. The effects of feedback interventions on performance: A historical
review, a meta-analysis, and a preliminary feedback intervention theory. Psychological Bulletin,
119, 2, Mar 1996, 254-284.
Koehn, P. 2009. A process study of computer aided translation. Machine Translation 23, 4, 241-
263.
Koehn, P., Alabau, V., Carl, M., Casacuberta, F., García-Martínez, M., González-Rubio, J., Keller,
F., Ortiz-Martínez, D., Sanchis-Trilles, G., Germann, U. 2015. CASMACAT Final Public Report.
CASMACAT EU FP7 Deliverable.
Koponen, M. 2012. Comparing human perceptions of post-editing effort with post-editing
operations. In proceedings of the Seventh Workshop on Statistical Machine Translation. 181-
190.
Lagoudaki, E. 2006. Translation Memories Survey 2006: Users’ Perceptions Around TM Use. In
proceedings of ASLIB Translating and the Computer 28. London, UK. 15-16 November 2006.
Lagoudaki, E. 2008. Expanding the Possibilities of Translation Memory Systems: From the
Translator’s Wishlist to the Developer’s Design. PhD dissertation. Imperial College, London,
UK.
Langlais, P., Lapalme, G., Loranger, M. 2002. Transtype: Development-evaluation cycles to boost
translator’s productivity. Machine Translation (Special Issue on Embedded Machine
Translation Systems), 15, 2, 77-98.
McBride, C. 2009. Translation Memory Systems: An Analysis of Translators’ Attitudes and
Opinions. Master’s thesis. University of Ottawa, Canada.
Moorkens, J., O’Brien, S. 2013. User Attitudes to the Post-Editing Interface. In Proceedings of
Machine Translation Summit XIV Workshop on Post-editing Technology and Practice. Sharon
O’Brien, Michel Simard, Lucia Specia (Eds.) Association for Computational Linguistics,
Stroudsburg, PA, 19-25.
Moorkens, J., O’Brien, S. 2014. Post-Editing Evaluations: Trade-offs between Novice and
Professional Participants. El-Kahlout, İ. D., Özkan, M., Sánchez-Martínez, F., Ramírez-Sánchez,
G., Hollowood, F., Way, A. (Eds.). Proceedings of the 18th Annual Conference of the European
Association for Machine Translation (EAMT 2015), 75-81.
Moran, J., Lewis, D. 2011. Unobtrusive methods for low-cost manual evaluation of machine
translation. In Proceedings of Tralogy 2011, Centre National de la Recherche Scientifique, Paris,
France, 3-4 March 2011.
O’Brien, S. 2006. Pauses as indicators of cognitive effort in post-editing machine. Across Languages
And Cultures, 7, 1, 1-21, DOI:http://dx.doi.org/10.1556/Acr.7.2006.1.1
Ozcelik, D., Quevedo-Fernandez, J., Thalen, J., Terken, J. 2011. Engaging users in the early phases
of the design process: attitudes, concerns and challenges from industrial practice. In proceedings
of the 2011 Conference on Designing Pleasurable Products and Interfaces (DPPI '11). ACM
New York. DOI:http://dx.doi.org/10.1145/2347504.2347519
Visser, F. S., Stappers, P. J., van der Lugt, R., Sanders, E. B. N. 2005. Contextmapping: experiences
from practice. CoDesign, 1, 2, 119-149, DOI:http://dx.doi.org/10.1080/15710880500135987
Somers, H. 2001. Machine Translation: Applications. Mona Baker (Ed.) Routledge Encyclopaedia
of Translation Studies. London: Routledge, 140-143.
Tidwell, J. 2006. Designing Interfaces. Sebastopol, CA: O’Reilly.
i. A confidence measure is an automated estimate of raw MT quality, and may be presented at a
segment or sub-segment level (Blatz et al. 2004). At the segment level, they may inform a user of
the likely usefulness of the raw MT, and at the sub-segment level they may inform a user about
incorrect or possibly incorrect translations (Alabau et al. 2013). There have been various
suggestions as to how to estimate and display confidence scores (Blatz et al. 2004; González-Rubio
et al. 2010), but a confidence estimation feature has not yet been included in a commercial
translation editor.
ii. Lagoudaki (2006) found that SDL Trados was also the most widely-used tool within her 2006
cohort, with 51% reporting that they used the tool regularly.
iii. Lagoudaki (ibid.) on the contrary, reported that company employees were more likely than
freelancers to use multiple tools.
iv. www.lilt.com
... In the light of prospective developments, it is essential to find out which of the features discussed above would cater to real user needs and thus enhance the PE experience. In that regard, Moorkens and O'Brien [9] investigated user attitudes towards PE interfaces and identified several suggestions for improvement, including confidence scores and dynamic MT adaptation. The authors also concluded that translation tools required to be improved before even considering PE, with higher customisability for example. ...
... It should be acknowledged here that while the sample size is considered sufficient to formulate practical recommendations, it can be limited in comparison to other studies (e.g. [9]). The clear prevalence of Romance languages may also have affected the results. ...
Conference Paper
The advances in Machine Translation led to the application of this technology in the translation workflow, thus resulting in Post-Editing being performed in Computer-Assisted Translation tools. While these tools have experienced a rather steady evolution since the commercialisation of the first systems, a plethora of enhancements can now be envisaged due to the latest technological developments in Artificial Intelligence. In this context, the present study seeks to elucidate Post-Editing needs and to identify potential features which could cater to those. To that end, the outcomes of a user survey were examined. This analysis allowed for determining the most complex types of Machine Translation errors, editing actions and decisions during Post-Editing. A feature wish list was also proposed, and the results allowed for identifying the most popular functionalities, which are aligned with Post-Editing needs. Keywords: Post-Editing, Computer-Assisted Translation, user needs.
... In either case, freelancers undertake the majority of the work (O'Hagan & Mangiron, 2013). The proportion of freelance translators has been growing since the 1990s (Pym et al., 2013), partly thanks to the developers' growing interest in vendor models (Moorkens & O'Brien, 2017). ...
Article
The nuances involved in game localization call for an expert workforce, well-versed in dealing with the challenges involved. Yet, the prospect of game localization is still a blue-water area of research and not much is known regarding the profiles and the current industry practices. Thus, the present study seeks to throw light on the profiles, perceptions, and experiences of game translators. A total of 125 game translators provided qualitative and quantitative data regarding the various aspects of the profession through a 25-item online questionnaire. The findings point to a relatively young, highly educated, and mostly self-employed workforce who undertake translation as their main source of income and have a strong gaming background. The strengths and the weaknesses of the current workflow practices are identified and discussed, and suggestions are made drawing on the perceptions and experiences of game translation practitioners.
... Davon betroffen sind vor allem die Arbeitsprozesse und Workflows von professionellen Übersetzer:innen, die sich durch die Digitalisierung und vor allem durch den Einsatz von MÜ stark verändern und neue Kompetenzen erfordern (Pym, 2013;Läubli, & Green, 2019). Darüber hinaus kommt es auch in der gesamten Übersetzungsbranche zu tiefgreifenden Transformationen (Moorkens, 2017;do Carmo, 2020;Vieira, 2020), wie sich beispielsweise in der zunehmenden Bedeutung von MTPE zeigt (Moorkens, & O'Brien, 2017). Auf einer Meta-Ebene steht der Erfolg der MÜ für einen anbrechenden Posthumanismus im Übersetzungsbereich (O'Thomas, 2017) sowie für eine Machtverschiebung weg von demokratischen Entscheidungsprozessen hin zu monopolistisch agierenden Technologiekonzernen. ...
Book
The development and spread of machine translation systems are bringing about massive transformation processes in the language services industry. The 'machinisation' of translation not only leads to upheavals within the translation market but also confronts us with a fundamental question: what is 'translation' when a machine translates human language? This work addresses that problem from the perspective of translation studies and the sociology of technology. The focus is on concepts of translation in computational linguistics that result from an interplay between social construction and technical conditions. Computational linguists' concept of translation is oriented towards the mechanics of the machine, which creates a tension with the paradigms of human translation.
Thesis
Full-text available
Recent language technology developments have disrupted the translation and interpreting professions. However, the focus has been on using more computational power and training larger language models, often neglecting the users of such technology (do Carmo and Moorkens 2022). To date, the goal of technology development has been the creation of an intelligent agent that emulates human behaviour to increase automation. As a response, a novel technology design framework has gained a foothold recently: human-centered artificial intelligence, where instead of human replacement, the aim is to produce a powerful tool that augments human capabilities, enhances performance, and empowers users, who are at all instances in supervisory control of such systems (Shneiderman 2022). If applied to machine translation (MT), we can talk about human-centered, augmented MT (HCAMT). This shift, moving from emulation to empowerment, places humans at the centre of AI/language technology. This PhD thesis presents the concept of Machine Translation User Experience (MTUX) as a way to foster HCAMT. Consequently, we conduct a longitudinal user study with 11 professional translators in the English-Spanish language combination that analyses the effects of traditional post-editing (TPE) and interactive post-editing (IPE) on MTUX, translation quality and productivity. MTUX results suggest that translators prefer IPE to TPE because they are in control of the interaction in this new form of translator-computer interaction and feel more empowered in their interaction with MT. Productivity results also suggest that translators working with IPE report a statistically significantly higher productivity than when working with TPE. Quality results also indicate that translators offer more fluent translations in IPE, and equally adequate translations in both post-editing modalities. 
All these results allow for reflection on the potential adoption of IPE as a more HCAMT post-editing modality, which empowers the users, who have been increasingly reluctant to interact with machine translation post-editing in industry workflows (Cadwell, O’Brien, and Teixeira 2018). This PhD thesis establishes the methodology for fostering HCAMT tools, systems and workflows through the study of MTUX. The successful implementation of HCAMT in translation and interpreting may lead to sustainable, diverse, and ethically sound development in MT systems and other technological tools through a wide variety of users and use-cases.
Book
Tasa Fuster, Vicenta, Esther Monzó-Nebot, and Rafael Castelló-Cogollos, eds. 2023. Repurposing language rights. Guiding the uses of artificial intelligence. València: Tirant lo Blanch. Repurposing language rights. Guiding the uses of artificial intelligence stresses the relevance of minoritized language communities in our future uses of Artificial Intelligence (AI). This edited volume presents an overview of the issues and discussions related to minoritized language communities that arise in connection with the development of language-based AI applications. It offers a blueprint of how the different language communities can be guided towards cooperation, and highlights the challenges that risk frustrating global efforts to enhance global-scale, inclusive communication. The chapters focus on legal and social issues concerning minoritized communities and stress how machine translation in particular can be redirected for greater inclusivity. The book underscores that industry efforts and public policy frameworks are required to ensure that AI-based applications consider the needs of all language communities.
Chapter
Translation has been one of the first skilled occupations targeted by the current wave of industrial automation. The chapter argues that reviewing how we have collectively approached the automation of translation reveals some underexplored risks that have impacted both society at large and marginalized groups. At a time when humanity faces the farthest-reaching technological and industrial revolution of all times, exploring the roots and routes of translation automation may be key for maintaining the hard-won stability of our societies, and is a necessity to prevent existing inequalities from growing. After characterizing translation automation as an industrial process involving deskilling and a simplification of what it is to translate, cultural, social, gender, psycho-social, health, communicative, legal, and ethical risks of translation automation are discussed. From a care-ethics lens, it is suggested that our approach to translation automation so far has been unethical, and that particularly care as a value has been, once again, underestimated in both misrepresenting how humans communicate and how human translators craft mediated communication. The need for legal and policy actions to redress the approach to automation is stressed.
Chapter
Extraordinary advances in machine translation over the last three quarters of a century have profoundly affected many aspects of the translation profession. The widespread integration of adaptive “artificially intelligent” technologies has radically changed the way many translators think and work. In turn, groundbreaking empirical research has yielded new perspectives on the cognitive basis of the human translation process. Translation is in the throes of radical transition on both professional and academic levels. The game-changing introduction of neural machine translation engines almost a decade ago accelerated these transitions. This volume takes stock of the depth and breadth of resulting developments, highlighting the emerging rivalry of human and machine intelligence. The gathering and analysis of big data is a common thread that has given access to new insights in widely divergent areas, from literary translation to movie subtitling to consecutive interpreting to development of flexible and powerful new cognitive models of translation.
Conference Paper
Full-text available
With the increase in machine translation (MT) quality in recent years, it is now common practice to integrate machine-translated segments as well as translation memory (TM) fuzzy matches in the translation environment used in production settings. Language Service Providers (LSPs) and translation agencies usually set a threshold value above which segments are recovered from TMs, when fuzzy matches from 100% down to that value are present, and below which MT output is provided for post-editing. Though fuzzy recovery is usually allowed from 75% to 100%, the threshold may be set higher, and with neural machine translation (NMT) 85% is usually considered a good threshold value for maximizing productivity. This study aims to verify this 85% threshold in a production environment and to propose a new fuzzy value to be applied for pre-translation. The testing was carried out on the output of both a generic model and trained models, and productivity differences when post-editing the two outputs were calculated. Professional linguists were asked to translate from scratch, to edit and correct 75%-99% fuzzy matches, and to post-edit segments for both kinds of MT. Findings suggest that most linguists show higher productivity when editing NMT-translated segments than when working on fuzzy matches, even in the higher bands (85%-99%). Furthermore, productivity appears to be influenced by the kind of changes to be applied to the fuzzy matches, while it shows an inconsistent correlation with the fuzzy percentage value shown to linguists by the CAT tool.
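The threshold rule described in this abstract can be illustrated with a minimal sketch. The `Segment` class, its field names, and the `pretranslate` function are hypothetical names introduced here for illustration; they are not taken from the paper or from any CAT tool. The default of 85 is the value the abstract says is usually used with NMT.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Segment:
    """Hypothetical container for one source segment and its candidates."""
    source: str
    tm_match: Optional[str]  # best TM target text, or None if the TM has no match
    tm_score: float          # fuzzy match percentage of that TM match (0-100)
    mt_output: str           # raw MT output for the same source


def pretranslate(segment: Segment, threshold: float = 85.0) -> Tuple[str, str]:
    """Pick the pre-translation for a segment.

    A TM match at or above the threshold is recovered from the TM;
    otherwise the MT output is supplied for post-editing.
    """
    if segment.tm_match is not None and segment.tm_score >= threshold:
        return ("TM", segment.tm_match)
    return ("MT", segment.mt_output)


if __name__ == "__main__":
    seg = Segment("Save the file.", "Datei speichern.", 80.0, "Speichern Sie die Datei.")
    print(pretranslate(seg))                  # threshold 85: falls back to MT
    print(pretranslate(seg, threshold=75.0))  # threshold 75: TM match is used
```

Lowering the threshold towards 75 sends more segments to fuzzy-match editing, while raising it sends more to MT post-editing, which is the trade-off the study examines.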
Thesis
Full-text available
Post-editing of machine translation (MT) is a workflow that is being used for an increasing number of text types and domains (Koponen, 2016; Hu, 2020; Zouhar et al., 2021), but the sections of text that post-editors need to fix have become harder to detect due to the increased human-like fluency that neural machine translation (NMT) affords (Comparin & Mendes, 2017; Yamada, 2019). This dissertation seeks to address this problem by developing a word-level machine translation quality estimation (MTQE) system to highlight words in raw MT output that need editing in order to aid post-editors. Subsequently, this MTQE system is tested in a large-scale post-editing experiment to determine if it increases productivity and decreases cognitive effort and error rate. This MTQE system is based on two automatically generated features: word translation entropy, generated from the output of multiple MT systems (a feature that has never been used in MTQE), and word class (based on part-of-speech tags). For the post-editing experiment, a within-subjects design assigns raw MT output to participants under three different conditions. Two experimental conditions consist of MT output that has been enhanced with highlighting surrounding the stretches of text that likely need to be edited. In the first experimental condition, this highlighting is supplied automatically by the MTQE system, and in the second experimental condition, this highlighting is supplied by an experienced translator, indicating what text needs editing. The control condition constitutes MT output without highlighting.
Participants post-edit three experimental texts in Trados Studio while time-stamped keystroke logs are gathered (which are later integrated into the CRITT Translation Process Research Database (TPR-DB)), and various measures of temporal, technical, cognitive, perceived effort, and group editing activity are used to assess the efficacy and usefulness of highlighting potential errors in the post-editing user interface.
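As a rough illustration of the word translation entropy feature mentioned in this abstract, the sketch below computes the Shannon entropy of the translations that several MT systems produce for one source word. This is an assumption about the general form of the measure, not the dissertation's implementation: the function name is hypothetical, and word alignment between the source and each system's output is assumed to have been done already.

```python
import math
from collections import Counter
from typing import Sequence


def word_translation_entropy(translations: Sequence[str]) -> float:
    """Shannon entropy (in bits) over the translations of a single source word.

    `translations` holds one aligned target word per MT system.
    0.0 means all systems agree; higher values mean more disagreement.
    """
    if not translations:
        raise ValueError("at least one translation is required")
    counts = Counter(translations)
    total = len(translations)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Four hypothetical MT systems translating the same source word.
    print(word_translation_entropy(["Haus", "Haus", "Haus", "Haus"]))
    print(word_translation_entropy(["Haus", "Heim", "Haus", "Heim"]))
```

Under this reading, words on which the systems disagree receive higher entropy and would be more likely candidates for highlighting.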
Article
Full-text available
Machine-translated segments are increasingly included as fuzzy matches within the translation-memory systems in the localisation workflow. This study presents preliminary results on the correlation between these two types of segments in terms of productivity and final quality. In order to test these variables, we set up an experiment with a group of eight professional translators using an on-line post-editing tool and a statistical-based machine translation engine. The translators were asked to translate new, machine-translated and translation-memory segments from the 80-90 percent value range using a post-editing tool without actually knowing the origin of each segment, and to complete a questionnaire. The findings suggest that translators have higher productivity and quality when using machine-translated output than when processing fuzzy matches from translation memories. Furthermore, translators' technical experience seems to have an impact on productivity but not on quality.
Article
Full-text available
As part of a larger research project on productivity and quality in the post-editing of machine-translated and translation-memory outputs, 24 translators and three reviewers were asked to complete an on-line questionnaire to gather information about their professional experience but also to obtain data on their opinions about post-editing and machine translation. The participants were also debriefed after finalising the assignment to triangulate the data with the quantitative results and the questionnaire. The results show that translators have mixed experiences and feelings towards machine-translated output and post-editing, not necessarily because they are misinformed or reluctant to accept its inclusion in the localisation process but due to their previous experience with various degrees of output quality and to the characteristics of this type of projects. The translators were quite satisfied in general with the work they do as translators, but not necessarily with the payment they receive for the work done, although this was highly dependent on different customers and type of tasks.
Conference Paper
Full-text available
At present, the task of post-editing Machine Translation (MT) is commonly carried out via Translation Memory (TM) tools that, while well-suited for editing of TM matches, do not fully support MT post-editing. This paper describes the results of a survey of professional translators and post-editors, in which they chose features and functions that they would like to see in translation and post-editing User Interfaces (UIs). 181 participants provided details of their translation and post-editing experience, along with their current working methods. The survey results suggest that some of the desired features pertain to supporting the translation task in general, even before post-editing is considered. Simplicity and customizability were emphasized as important features. There was cautious support for the idea of a UI that made use of post-edits to improve the MT output. This research is intended as a first step towards creating specifications for a UI that better supports the task of post-editing.
Conference Paper
Full-text available
The increasing use of post-editing in localisation workflows has led to a great deal of research and development in the area, much of it requiring user evaluation. This paper compares some results from a post-editing user interface study carried out using novice and expert translator groups. By comparing rates of productivity, edit distance, engagement with the research, and qualitative findings regarding each group’s attitude to post-editing, we find that there are trade-offs to be considered when selecting participants for evaluation tasks. Novices may generally be more positive and enthusiastic and will engage considerably with the research while professionals will be more efficient, but their routines and attitudes may prevent full engagement with research objectives.
Chapter
Full-text available
This chapter explores the interface between translation studies and ergonomics, especially those factors that can affect cognitive processing. Professional translators perform a challenging multi-activity task involving receptive and productive language proficiency, advanced information literacy skills, and a high degree of instrumental competence. They do so under tight temporal constraints in an increasingly technologized environment; many work in offices that may not be designed for intensive text work and within organizational systems that do not suit their cognitive and informational needs. Drawing on quantitative and qualitative translation process data from a large corpus comprising screen recordings, keystroke logs, eye-tracking records, retrospective verbalizations, questionnaire surveys, and interviews, the chapter illustrates how ergonomic issues at translators’ workplaces can have an impact on the efficiency of the process and the quality of the product, and how they could affect professional identity. It closes with a discussion of how findings from such research can contribute to the language service industry and the professional development of translators.
Article
Full-text available
Translation usually takes place at translators' workplaces, yet much translation process research refers to data collected under controlled conditions such as the classroom or the lab. Pursuant with recent descriptions of translation as a situated activity comes the necessity of investigating that activity where and when it occurs. Many of the methods that have proved useful in the lab have also been applied in the field, and some of the challenges associated with investigating translation at the workplace are common to any kind of empirical translation research. However, certain workplace constraints present special challenges to everyone involved. Some solutions that were developed for a workplace study in Switzerland may prove useful in other investigations and might allow new questions to emerge in this developing field.
Article
Full-text available
In this work, we address the question of how to integrate confidence measures into an interactive-predictive machine translation system and reduce user effort. Specifically, we propose to use word confidence measures to aid the user in validating correct prefixes from the outputs given by the system. Experimental results obtained on a corpus of the Bulletin of the European Union show that confidence information can help to reduce user effort.
Article
Full-text available
Given the significant improvements in Machine Translation (MT) quality and the increasing demand for translations, post-editing of automatic translations is becoming a popular practice in the translation industry. It has been shown to allow for larger volumes of translations to be produced, saving time and costs. In addition, the post-editing of automatic translations can help understand problems in such translations and this can be used as feedback for researchers and developers to improve MT systems. Finally, post-editing can be used as a way of evaluating the quality of translations in terms of how much effort these translations require in order to be fixed. We describe a standalone tool that has two main purposes: facilitate the post-editing of translations from any MT system so that they reach publishable quality and collect sentence-level information from the post-editing process, e.g.: post-editing time and detailed keystroke statistics.
Conference Paper
Any system designed for people to use should be (a) easy to learn; (b) useful, i.e., contain functions people really need in their work; (c) easy to use; and (d) pleasant to use. In this note we present theoretical considerations and empirical data relevant to attaining these goals. First, we mention four principles for system design which we believe are necessary to attain these goals. Then we present survey results that demonstrate that our principles are not really all that obvious, but just seem obvious once presented. The responses of designers suggest they may sometimes think they are doing what we recommend when in fact they are not. This is consistent with the experience that systems designers do not often recommend or use them themselves. We contrast some of these responses with what we have in mind in order to provide a more useful description of our principles. Lastly, we consider why this might be so. These sections are summaries of those in a longer paper to appear elsewhere (Gould & Lewis, 1983). In that paper we elaborate on our four principles, showing how they form the basis for a general methodology of design, and we describe a successful example of using them in actual system design (IBM's Audio Distribution System).