Conference Paper

Semantic web based machine translation

Authors:

Abstract

This paper describes an experimental combination of traditional Natural Language Processing (NLP) technology with the Semantic Web stack in order to extend the expert knowledge required for a Machine Translation (MT) task. We first give a short introduction to the state of the art of MT and the Semantic Web and discuss disambiguation, one of the common challenges in MT, which can only be solved by drawing on world knowledge during the disambiguation process. We then construct a sample sentence that demonstrates the need for world knowledge and design a prototype program that successfully solves the outlined translation problem. We conclude with a critical view of the developed approach.

... • Heuss et al. [124] combined SWT as a post-editing technique with the direct approach in their work. Post-editing technique involves fixing mistakes from a given output by choosing the right translated word or order. ...
... Although this technique is commonly used by a professional translator or a linguist, its automated implementation has been widely researched recently in order to reduce human efforts. The works [15,124] propose a method for retrieving translations from a domain ontology. The approach performs SPARQL queries to search translations of a given word. ...
... The ontology was mainly used for supporting a given parser in the extraction of syntactic and semantic rules. These syntactic-semantic rules enable a deep analysis of sentences thus avoiding the problem of ambiguity in USL. ...
Citation | Year | MT approach | SW method | SW resource | Evaluation
C. Vertan [115] | 2004 | EBMT | Annotation | Ontologies | None
N. Elita and A. Birladeanu [117] | 2005 | EBMT | SPARQL | Ontologies | None
W. Hahn and C. Vertan [116] | 2005 | EBMT | SPARQL + Annotation | Ontologies | None
C. Shi and H. Wang [118] | 2005 | None | Reasoner | Ontologies | None
N. Elita and M. Gavrila [119] | 2006 | EBMT | SPARQL | Ontologies | Human
E. Seo et al. [120] | 2009 | None | Reasoner | Ontologies | None
P. Knoth et al. [121] | 2010 | RBMT or SMT | Annotation | Ontologies | Human
A. M. Almasoud and H. S. Al-Khalifa [123] | 2011 | TBMT + EBMT | SPARQL | Ontologies | Human
L. Lesmo et al. [122] | 2011 | Interlingua | Annotation | Ontologies | Human
B. Harriehausen-Mühlbauer and T. Heuss [15,124] | 2012 | Direct | SPARQL + Reasoner | Ontologies | Human
K. Nebhi et al. [125] | 2013 | TBMT | Annotation | LOD | None
J. P. McCrae and P. Cimiano [126] | 2013 | SMT | Annotation | LOD | Human
D. Moussallem and R. Choren [127] | 2015 | SMT | SPARQL | Ontologies | Human
O. Lozynska and M. Davydov [128] | 2015 | RBMT | Annotation | Ontologies | Human
K. Simov et al. [129] | 2016 | RBMT + SMT | SPARQL | LOD | Automatic
T. S. Santosh Kumar [130] | 2016 | SMT | SPARQL | Ontologies | Human
N. Abdulaziz et al. [131] | 2016 | SMT | SPARQL | Ontologies | Human
J. Du et al. [132] | 2016 | SMT | SPARQL | LOD | Automatic
A. Srivastava et al. [133] | 2016 | SMT | SPARQL + Annotation | LOD | Automatic
C. Shi et al. [134] | 2016 | NMT | Annotation | LOD | Automatic + Human
A. Srivastava et al. [135] | 2017 | SMT | SPARQL + Annotation | LOD | Automatic
Preprint
A large number of machine translation approaches have recently been developed to facilitate the fluid migration of content across languages. However, the literature suggests that many obstacles must still be dealt with to achieve better automatic translations. One of these obstacles is lexical and syntactic ambiguity. A promising way of overcoming this problem is using Semantic Web technologies. This article presents the results of a systematic review of machine translation approaches that rely on Semantic Web technologies for translating texts. Overall, our survey suggests that while Semantic Web technologies can enhance the quality of machine translation outputs for various problems, the combination of both is still in its infancy.
... Although this technique is commonly used by a professional translator or a linguist, its automated implementation has been widely researched recently in order to reduce human efforts. The works [17,129] proposed an approach for retrieving translations from a domain ontology. The approach performs SPARQL queries to search the translations of a given word. ...
... SKOS is a common model for describing concepts in SW. It uses the prefLabel property to assign a primary label to a particular concept and altLabel for alternative names or translations. ...
Citation | Year | MT approach | SW method | SW resource | Evaluation
C. Vertan [120] | 2004 | EBMT | Annotation | Ontologies | None
N. Elita and A. Birladeanu [122] | 2005 | EBMT | SPARQL | Ontologies | None
W. Hahn and C. Vertan [121] | 2005 | EBMT | SPARQL + Annotation | Ontologies | None
C. Shi and H. Wang [123] | 2005 | None | Reasoner | Ontologies | None
N. Elita and M. Gavrila [124] | 2006 | EBMT | SPARQL | Ontologies | Human
E. Seo et al. [125] | 2009 | None | Reasoner | Ontologies | Human
P. Knoth et al. [126] | 2010 | RBMT or SMT | Annotation | Ontologies | Human
A. M. Almasoud and H. S. Al-Khalifa [128] | 2011 | TBMT + EBMT | SPARQL | Ontologies | Human
L. Lesmo et al. [127] | 2011 | Interlingua | Annotation | Ontologies | Human
B. Harriehausen-Mühlbauer and T. Heuss [17,129] | 2012 | Direct | SPARQL + Reasoner | Ontologies | Human
K. Nebhi et al. [130] | 2013 | TBMT | Annotation | LOD | None
J. P. McCrae and P. Cimiano [131] | 2013 | SMT | Annotation | LOD | Human
D. Moussallem and R. Choren [132] | 2015 | SMT | SPARQL | Ontologies | Human
O. Lozynska and M. Davydov [133] | 2015 | RBMT | Annotation | Ontologies | Human
K. Simov et al. [134] | 2016 | RBMT + SMT | SPARQL | LOD | Automatic
T. S. Santosh Kumar [135] | 2016 | SMT | SPARQL | Ontologies | Human
N. Abdulaziz et al. [136] | 2016 | SMT | SPARQL | Ontologies | Human
J. Du et al. [137] | 2016 | SMT | SPARQL | LOD | Automatic
A. Srivastava et al. [138] | 2016 | SMT | SPARQL + Annotation | LOD | Automatic
C. Shi et al. [139] | 2016 | NMT | Annotation | LOD | Automatic + Human
A. Srivastava et al. [140] | 2017 | SMT | SPARQL + Annotation | LOD | Automatic
... The result was very promising, but unfortunately the work was discontinued. Additionally, some researchers, including Harriehausen-Mühlbauer and Heuss and Seo et al. [125,129], have used reasoners to disambiguate words in MT systems. ...
Article
A large number of machine translation approaches has been developed recently with the aim of migrating content easily across languages. However, the literature suggests that many boundaries have to be dealt with to achieve better automatic translations. A central issue that machine translation systems must handle is ambiguity. A promising way of overcoming this problem is using semantic web technologies. This article presents the results of a systematic review of approaches that rely on semantic web technologies within machine translation approaches for translating natural-language sentences. Overall, our survey suggests that while semantic web technologies can enhance the quality of machine translation outputs for various problems, the combination of both is still in its infancy.
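The excerpts above mention retrieving translations of a word from a domain ontology and the SKOS prefLabel and altLabel properties. The following is a minimal sketch of that kind of label lookup, written in plain Python over an in-memory list of triples instead of a SPARQL endpoint. It is not the implementation of any of the cited works: the concept names (ex:Bank_Institution, ex:Bank_Riverside), the labels, and the function translations are all hypothetical and chosen only for illustration.

```python
# Hypothetical mini-ontology: (subject, predicate, (label, language)) triples.
# Concept names and labels are invented for illustration only.
TRIPLES = [
    ("ex:Bank_Institution", "skos:prefLabel", ("bank", "en")),
    ("ex:Bank_Institution", "skos:prefLabel", ("Bank", "de")),
    ("ex:Bank_Institution", "skos:altLabel", ("Geldinstitut", "de")),
    ("ex:Bank_Riverside", "skos:prefLabel", ("bank", "en")),
    ("ex:Bank_Riverside", "skos:prefLabel", ("Ufer", "de")),
]

# Both primary and alternative labels count as candidate translations.
LABEL_PROPERTIES = ("skos:prefLabel", "skos:altLabel")


def translations(word, source_lang, target_lang):
    """Map each concept labelled `word` in source_lang to its target_lang labels."""
    # Step 1: find every concept that carries the word as a source-language label.
    concepts = [
        s
        for s, p, (label, lang) in TRIPLES
        if p in LABEL_PROPERTIES
        and lang == source_lang
        and label.lower() == word.lower()
    ]
    # Step 2: collect the target-language labels of each matching concept.
    result = {}
    for concept in concepts:
        result[concept] = [
            label
            for s, p, (label, lang) in TRIPLES
            if s == concept and p in LABEL_PROPERTIES and lang == target_lang
        ]
    return result


if __name__ == "__main__":
    for concept, labels in translations("bank", "en", "de").items():
        print(concept, labels)
```

In this toy data the English word "bank" maps to two concepts with different German labels, which is the kind of ambiguity the surveyed approaches try to resolve with additional knowledge such as a reasoner. Against a real triple store, the two list comprehensions would typically be replaced by a single SPARQL query.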
... al. [10] uses statistical methods after the semantic step, so it is not well suited to disambiguating short sentences. Harriehausen-Muhlbauer and Heuss [11] present an interesting proposal, but its performance is too poor for use in a dialogue system and the authors present only simple examples; nevertheless, the approach is very interesting for the future of machine translation. ...
... This work disambiguates multilingual terms so that they can be applied in their own language or in a possible translation. Harriehausen-Muhlbauer [11] implements a novel approach to semantic machine translation. The system translates a sentence automatically based on semantic relationships in ontologies. ...
... The idea proposed by Harriehausen-Muhlbauer [11] is presented as the future of automatic machine translation. Machines need to translate on the basis of semantic relationships extracted from an ontology. ...
Article
This paper introduces a novel approach to closing the existing gap in message translation in dialogue systems. Currently, messages submitted to dialogue systems are treated as isolated sentences. The missing context information thus impedes the disambiguation of homographs in ambiguous sentences. Our approach solves this disambiguation problem by using concepts from existing ontologies.
... However, in some situations its application is not feasible, as in the case of languages that lack an extensive parallel corpus of technical texts [Brandt and Tyers 2011]. Two examples of languages without an extensive, public, technical parallel corpus are Brazilian Portuguese and Icelandic, both with respect to English [...] In the case of the Knowledge-Based translator (KBMT) presented, the use of the Jena tool was described [Harriehausen-Mühlbauer and Heuss 2012], which carries out the central processing flow of a translator. To create a technical translator, it would be necessary to develop or find a specific Semantic Web RDF knowledge base to use as the basis of the system. ...
Conference Paper
Machine Translation (MT) has emerged as one of the relevant agents that can enable human communication across different cultures and languages. To date, known Machine Translation tools, such as Google Translate, still perform poorly at resolving ambiguity in specific contexts. For that reason, this paper reviews three popular MT techniques for written text, presenting a comparative study and describing example translators that use these techniques. This paper can therefore serve as concise introductory material for first steps in Machine Translation development.
Preprint
The Natural Language Processing (NLP) community has recently seen outstanding progress, catalysed by the release of different Neural Network (NN) architectures. Neural-based approaches have proven effective by significantly increasing the output quality of a large number of automated solutions for NLP tasks (Belinkov and Glass, 2019). Despite these notable advancements, dealing with entities still poses a difficult challenge as they are rarely seen in training data. Entities can be classified into two groups, i.e., proper nouns and common nouns. Proper nouns are also known as Named Entities (NE) and correspond to the name of people, organizations, or locations, e.g., John, WHO, or Canada. Common nouns describe classes of objects, e.g., spoon or cancer. Both types of entities can be found in a Knowledge Graph (KG). Recent work has successfully exploited the contribution of KGs in NLP tasks, such as Natural Language Inference (NLI) (KM et al.,2018) and Question Answering (QA) (Sorokin and Gurevych, 2018). Only a few works had exploited the benefits of KGs in Neural Machine Translation (NMT) when the work presented herein began. Additionally, few works had studied the contribution of KGs to Natural Language Generation (NLG) tasks. Moreover, the multilinguality also remained an open research area in these respective tasks (Young et al., 2018). In this thesis, we focus on the use of KGs for machine translation and the generation of texts to deal with the problems caused by entities and consequently enhance the quality of automatically generated texts.
Article
This paper describes an approach to modelling a general-language wordnet, GermaNet, and a domain-specific wordnet, TermNet, in the web ontology language OWL. While the modelling process for GermaNet adopts relevant recommendations with respect to the English Princeton WordNet, for TermNet an alternative modelling concept is developed that considers the special characteristics of domain-specific terminologies. We present a proposal for linking a general-language wordnet and a terminological wordnet within the framework of OWL and on this basis discuss problems and alternative modelling approaches.
Article
This paper discusses current versions of WordNet from a data modelling perspective. We show that these versions do not consider basic data model desiderata for their design, like flexibility, extensibility and interoperability. We claim that a data model for WordNet must also consider the inherent network structure of WordNet data. Thus we make the case for an RDF model for WordNet and present a concrete version of WordNet in RDF format.
Conference Paper
Automatic translation from one human language to another using computers, better known as machine translation (MT), is a long-standing goal of computer science. Accurate translation requires a great deal of knowledge about the usage and meaning of words, the structure of phrases, the meaning of sentences, and which real-life situations are plausible. For general-purpose translation, the amount of required knowledge is staggering, and it is not clear how to prioritize knowledge acquisition efforts. Recently, there has been a fair amount of research into extracting translation-relevant knowledge automatically from bilingual texts. In the early 1990s, IBM pioneered automatic bilingual-text analysis. A 1999 workshop at Johns Hopkins University saw a re-implementation of many of the core components of this work, aimed at attracting more researchers into the field. Over the past years, several statistical MT projects have appeared in North America, Europe, and Asia, and the literature is growing substantially. We will provide a technical overview of the state-of-the-art.
Article
In this paper we present the steps we took to prepare an EBMT system for integration into the Semantic Web. We also briefly describe the EBMT tool developed for translators.