Source publication
Although research suggests the use of a TM (translation memory) can lead to a productivity increase of 10% to 70%, any actual gain depends on the TM content. If the target renditions included in the TM database exhibit freer characteristics, this may adversely affect the translator's productivity. This paper examines how productivity...
Context in source publication
Context 1
... order to analyze more closely the effect of the two different types of TM, I measured each individual translator's speed for every 10% of the fuzzy-match ranges. Table 3 gives the mean speed of individual translators sorted by match rate, and Figures 2 and 3 show their behavioral patterns. Figure 2: WPM change at detailed match rate for TM-F. Figure 3: WPM change at detailed match rate for TM-L. ...
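To make the banding concrete, here is a minimal sketch of how per-translator mean speed could be computed for each 10% fuzzy-match band. This is an illustration only: the segment log, its fields, and the WPM aggregation are assumptions, not the paper's actual data or tooling.

```python
from collections import defaultdict

# Hypothetical segment log: (translator, fuzzy-match %, words, seconds).
# Field names and values are invented for illustration.
segments = [
    ("T1", 74, 12, 40.0),
    ("T1", 96, 15, 20.0),
    ("T2", 83, 10, 35.0),
]

def band(match: int) -> str:
    """Bucket a fuzzy-match percentage into a 10% band, e.g. 74 -> '70-79'."""
    lo = (match // 10) * 10
    return f"{lo}-{lo + 9}"

# Accumulate words and seconds per (translator, band), then report WPM.
totals = defaultdict(lambda: [0, 0.0])
for translator, match, words, seconds in segments:
    key = (translator, band(match))
    totals[key][0] += words
    totals[key][1] += seconds

for (translator, b), (words, seconds) in sorted(totals.items()):
    print(f"{translator}  match {b}%  {words / (seconds / 60):.1f} WPM")
```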
Citations
... Since the world is not a snapshot once the training corpus is collected, we can never expect an ever-larger model to capture everything in its parameters, even for LLMs like GPT-3 (Brown et al., 2020), and it is important to endow the model with access to an external memory bank to solve different NLP tasks (Lewis et al., 2020c). For the translation task, long before machine translation, the localization industry had been proposing retrieval techniques to help human translators achieve higher productivity and consistency (Yamada, 2011). Early work on machine translation mainly employed memory for statistical machine translation (SMT) systems (Simard and Isabelle, 2009; Liu et al., 2012). ...
With direct access to human-written references as memory, retrieval-augmented generation has achieved much progress in a wide range of text generation tasks. Since better memory typically prompts better generation (we define this as the primal problem), previous works mainly focus on how to retrieve better memory. However, one fundamental limitation exists in the current literature: the memory is retrieved from a fixed corpus and is bounded by the quality of that corpus. Due to the finite retrieval space, bounded memory greatly limits the potential of memory-augmented generation models. In this paper, by exploring the dual of the primal problem (better generation also prompts better memory), we propose a framework called Selfmem, which iteratively employs a retrieval-augmented generator to create an unbounded memory pool and uses a memory selector to pick one generated memory for the next generation round. By combining the primal and dual problems, a retrieval-augmented generation model can lift itself up with its own output in the infinite generation space. To verify our framework, we conduct extensive experiments across various text generation scenarios, including neural machine translation, abstractive summarization, and dialogue generation, over seven datasets, and achieve state-of-the-art results on JRC-Acquis (four directions), XSum (50.3 ROUGE-1), and BigPatent (62.9 ROUGE-1).
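The abstract describes an iterative generate-then-select loop. A schematic sketch of that loop follows; `generator` and `selector` are placeholder callables, not the authors' API, and this is a reading of the abstract rather than the Selfmem implementation.

```python
def selfmem_style_translate(source, generator, selector, rounds=3):
    """Schematic loop in the spirit of the abstract: the generator's own
    outputs form the memory pool, and a selector picks one memory for the
    next round. `generator` and `selector` are hypothetical callables."""
    memory = None
    for _ in range(rounds):
        # Generate candidate translations conditioned on the current memory.
        candidates = generator(source, memory)
        # Choose the candidate expected to help most as next-round memory.
        memory = selector(source, candidates)
    # Final pass with the last selected memory; return the top candidate.
    return generator(source, memory)[0]
```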
... The use of TM is essential for computer-aided translation (Yamada, 2011) and for computational approaches to machine translation (Koehn and Senellart, 2010). Similar sentence pairs retrieved from a TM are also utilized as a type of knowledge to enhance the translation (Liu et al., 2019a; He et al., 2021; Khandelwal et al., 2021). ...
... Motivated by the use of Translation Memory in Computer-Aided Translation (Yamada, 2011) and its usage in computational approaches to Machine Translation (Somers, 1999; Koehn and Senellart, 2010; Khandelwal et al., 2020, inter alia), we retrieve examples similar to the test source from a datastore that includes pairs of the source text and their corresponding translations via BM25, an unsupervised, efficient retriever, to provide additional context to the model. We propose a novel in-context example selection and re-ranking strategy to maximize the coverage of the source n-grams in the retrieved examples. ...
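As a hedged illustration of the retrieval step described above, the sketch below indexes the source side of a toy datastore with BM25 (via the third-party rank_bm25 package) and retrieves the most similar pairs; the datastore contents are invented, and the papers' re-ranking step is omitted.

```python
from rank_bm25 import BM25Okapi  # pip install rank-bm25

# Invented datastore of (source, translation) pairs.
datastore = [
    ("the cat sat on the mat", "le chat s'est assis sur le tapis"),
    ("the dog barked loudly", "le chien a aboyé bruyamment"),
    ("a cat chased the dog", "un chat a poursuivi le chien"),
]

# Index the source side with BM25 (whitespace tokenization for simplicity).
bm25 = BM25Okapi([src.split() for src, _ in datastore])

# Retrieve the pairs whose source side best matches the test input.
query = "the cat chased a mouse".split()
scores = bm25.get_scores(query)
top = sorted(range(len(datastore)), key=lambda i: scores[i], reverse=True)[:2]
in_context_examples = [datastore[i] for i in top]
print(in_context_examples)
```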
... Translators wishing to translate a sentence can benefit from fuzzy matching techniques to retrieve similar segments from the TM. These segments can then be revised, thereby improving the productivity and consistency of the translation process (Koehn and Senellart, 2010; Yamada, 2011). The retrieval of similar examples from a TM has also proved useful in conventional (AR) neural MT systems; they can be injected into the encoder (Bulte and Tezcan, 2019) or used as priming signals in the decoder (Pham et al., 2020) to influence the translation process. ...
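A minimal sketch of such fuzzy matching follows, using Python's standard difflib as a stand-in for the character-based edit-distance scoring CAT tools typically use; the TM entries and the 70% threshold are assumptions.

```python
import difflib

# Invented translation memory of (source, target) pairs.
tm = [
    ("Click the Save button to store your changes.",
     "Cliquez sur le bouton Enregistrer pour sauvegarder vos modifications."),
    ("Click the Cancel button to discard your changes.",
     "Cliquez sur le bouton Annuler pour abandonner vos modifications."),
]

def fuzzy_matches(sentence, memory, threshold=0.7):
    """Return TM entries whose source side is at least `threshold` similar,
    best match first. difflib's ratio stands in for a CAT tool's scoring."""
    hits = []
    for src, tgt in memory:
        ratio = difflib.SequenceMatcher(None, sentence, src).ratio()
        if ratio >= threshold:
            hits.append((ratio, src, tgt))
    return sorted(hits, reverse=True)

for ratio, src, tgt in fuzzy_matches("Click the Save button to keep your changes.", tm):
    print(f"{ratio:.0%} match: {tgt}")
```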
... Translation memory (TM) is basically a database of segmented and paired source and target texts that translators can access in order to re-use previous translations while translating new texts (Christensen and Schjoldager, 2010). For human translators, such similar translation pieces can lead to higher productivity and consistency (Yamada, 2011). For machine translation, early work mainly employed TM for statistical machine translation (SMT) systems (Simard and Isabelle, 2009; Utiyama et al., 2011; Liu et al., 2012). ...
Retrieval-augmented Neural Machine Translation models have been successful in many translation scenarios. Different from previous works that make use of mutually similar but redundant translation memories (TMs), we propose a new retrieval-augmented NMT that models contrastively retrieved translation memories, which are holistically similar to the source sentence while individually contrastive to each other, providing maximal information gain across three phases. First, in the TM retrieval phase, we adopt a contrastive retrieval algorithm to avoid redundant and uninformative similar translation pieces. Second, in the memory encoding stage, given a set of TMs, we propose a novel Hierarchical Group Attention module to gather both the local context of each TM and the global context of the whole TM set. Finally, in the training phase, a multi-TM contrastive learning objective is introduced to learn the salient features of each TM with respect to the target sentence. Experimental results show that our framework obtains improvements over strong baselines on the benchmark datasets.
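One common reading of such contrastive retrieval is a maximal-marginal-relevance (MMR) style greedy selection; the sketch below follows that reading and is not the paper's exact algorithm. `sim` is a placeholder pairwise similarity function.

```python
def contrastive_retrieve(source, pool, sim, k=3, lam=0.7):
    """Greedily pick k memories similar to `source` but dissimilar to one
    another (an MMR-style reading of contrastive retrieval, not the paper's
    exact algorithm). `sim` is any pairwise similarity function."""
    selected = []
    candidates = list(pool)
    while candidates and len(selected) < k:
        def score(c):
            redundancy = max((sim(c, s) for s in selected), default=0.0)
            return lam * sim(c, source) - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```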
... Furthermore, motivated by the use of Translation Memory in Computer-Aided Translation (Yamada, 2011) and its usage in computational approaches to Machine Translation (Somers, 1999; Koehn and Senellart, 2010; Khandelwal et al., 2020, inter alia), we retrieve examples similar to the test source from a datastore that includes pairs of source text and their corresponding translations via BM25, an unsupervised, efficient retriever, to provide additional context to the model. ...
Large-scale generative models show an impressive ability to perform a wide range of Natural Language Processing (NLP) tasks using in-context learning, where a few examples are used to describe a task to the model. For Machine Translation (MT), these examples are typically randomly sampled from the development dataset with a distribution similar to the evaluation set. However, it is unclear how the choice of these in-context examples and their ordering impacts the output translation quality. In this work, we aim to understand the properties of good in-context examples for MT in both in-domain and out-of-domain settings. We show that the translation quality and the domain of the in-context examples matter, and that a single noisy, unrelated 1-shot example can have a catastrophic impact on output quality. While concatenating multiple random examples reduces the effect of noise, a single good prompt optimized to maximize translation quality on the development dataset can elicit learned information from the pre-trained language model. Adding similar examples based on n-gram overlap with the test source significantly and consistently improves the translation quality of the outputs, outperforming a strong kNN-MT baseline in 2 out of 4 out-of-domain datasets.
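A small sketch of the n-gram overlap criterion mentioned above: candidate in-context examples are ranked by how many of the test source's bigrams they cover. The scoring function and example strings are illustrative assumptions, not the paper's implementation.

```python
def ngrams(tokens, n=2):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def coverage(source, example, n=2):
    """Fraction of the test source's n-grams that a candidate example covers;
    a simple stand-in for the selection criterion described above."""
    src = ngrams(source.split(), n)
    return len(src & ngrams(example.split(), n)) / max(len(src), 1)

# Rank invented candidate examples by bigram overlap with the test source.
source = "the committee approved the annual budget"
candidates = [
    "the committee rejected the annual budget",
    "a new budget was proposed",
]
print(sorted(candidates, key=lambda c: coverage(source, c), reverse=True)[0])
```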
... For decades, the localization industry has proposed TM technologies in CAT tools to allow human translators to visualize one or several similar or usable translations extracted from the TM when translating a sentence, leading to higher productivity and consistency (Yamada, 2011). Hence, even though the retrieval methods of TM differ among CAT tools (Bloodgood and Strauss, 2014), human translators generally accept discounted translation rates for sentences with high fuzzy matches. ...
In an increasingly global world, more situations appear where people need to express themselves in a foreign language or multiple languages. However, for many people, writing in a foreign language is not an easy task. Machine translation tools can help generate texts in multiple languages. With the tangible progress in neural machine translation (NMT), translation technologies are delivering usable translations in a growing number of contexts. However, it is not yet realistic for NMT systems to produce error-free translations. Therefore, users with a good command of a given foreign language may find assistance from computer-aided translation technologies. In case of difficulties, users writing in a foreign language can access external resources such as dictionaries, terminologies, or bilingual concordancers. However, consulting these resources causes an interruption in the writing process and starts another cognitive activity. To make the process smoother, it is possible to extend writing assistant systems to support bilingual text composition. However, existing studies have mainly focused on generating texts in a foreign language. We suggest that showing corresponding texts in the user's mother tongue can also help users to verify the composed texts with synchronized bitexts. In this thesis, we study techniques to build bilingual writing assistant systems that allow free composition in both languages and display synchronized monolingual texts in the two languages. We introduce two types of simulated interactive systems. The first solution allows users to compose mixed-language texts, which are then translated into their monolingual counterparts. We propose a dual decoder Transformer model comprising a shared encoder and two decoders to simultaneously produce texts in two languages. We also explore the dual decoder model for various other tasks, such as multi-target translation, bidirectional translation, generating translation variants, and multilingual subtitling. The second design aims to extend commercial online translation systems by letting users freely alternate between the two languages, switching the text input box at will. In this scenario, the technical challenge is to keep the two input texts synchronized while taking the users' inputs into account, again with the goal of authoring two equally good versions of the text. For this, we introduce a general bilingual synchronization task and implement and experiment with autoregressive and non-autoregressive synchronization systems. We also investigate bilingual synchronization models on specific downstream tasks, such as parallel corpus cleaning and NMT with translation memories, to study the generalization ability of the proposed models.
... For this reason, translators need training in using TMs to gain the ideal benefit from their functions (p. 13). Later, Yamada (2011) investigated the effect of TMs with different features (e.g. free and literal ones) on translators' productivity within the context of localization and concluded that localizability might be adversely affected if the TM contained "freer renditions", which might in turn reduce productivity (p.
... In the face of these challenges, translation technologies were seen as helpful agents for improving consistency and increasing productivity (Bowker, 2005; Yamada, 2011). Over time, these tools have grown more sophisticated, offering many functions that were previously handled with external resources. ...
Achieving employment at the highest possible level is one of the conditions for the people of a country to live in prosperity. Unemployment has mostly been explained by the transition from an agricultural to an industrial society and the rise of mechanization. In our country, where many studies indicate structural unemployment, support is provided through various institutions and projects to encourage entrepreneurship as a remedy. Looking at the state's management of employment, the first initiatives date to the mid-nineteenth century. The state's institutional handling of the issue in our country, prompted by international obligation, dates to the end of the Second World War. The İş ve İşçi Bulma Kurumu (Employment and Worker Placement Agency), established in 1947, was the first such body. Through two separate reorganizations in the 1990s and 2000s, the institution has adapted to new developments and continues to operate.
With the planned development era, the Devlet Planlama Teşkilatı (State Planning Organization) was established, and the Devlet Personel Dairesi (State Personnel Office) was founded to structure public employment. Although set up as administratively and financially autonomous, the institution long operated without the resources to perform its enumerated functions. Renamed the Devlet Personel Başkanlığı (State Personnel Presidency) in 1984 and attached to the Prime Ministry, it came under criticism in the 1990s for failing to fulfill its function, and it was finally abolished in 2019 through transfer to the Çalışma Bakanlığı (Ministry of Labor). In the same year, the Cumhurbaşkanlığı İnsan Kaynakları Ofisi (Presidential Human Resources Office) was established to manage employment. Under Presidential Decree No. 1 on the Organization of the Presidency, it was created along the same lines as the administratively and financially autonomous Dijital Dönüşüm Ofisi (Digital Transformation Office), Finans Ofisi (Finance Office), and Yatırım Ofisi (Investment Office). With this structure it can address problems without getting caught up in bureaucracy, and it has a specialized character and a decision-making structure close to the central state apparatus, a stance that signals the importance attached to employment management. It is known that unemployment, since the 1980s when neoliberal policies began to be implemented, has not tracked growth and has instead risen year after year. By building an inventory of human resources, the Office aims to direct and develop this resource toward the areas where it is needed. It will work to produce the projects required to increase merit and competence in public employment, and it appears to aspire to the former functions of the Devlet Personel Dairesi. It is stated that, alongside an all-out human-resources mobilization in which it will manage talent by enabling the discovery of special talents, it will provide those talents with opportunities to advance.
This study examines the history of these institutions, their activities, and their effects on unemployment and employment. The functions assigned to the Office by law are not yet visible. On this basis, the study attempts to infer the possible effects of the work of the centrally established Cumhurbaşkanlığı İnsan Kaynakları Ofisi on employment, reasoning from assumptions grounded in the legislation.