Conference Paper

Enhancing Translation Quality of English Medical Text Reports into Marathi through integration of Google And Microsoft Bing Transliteration

Abstract

Among the many research areas in natural language processing (NLP), accurate translation of text from a source language to a target language remains an open problem. Machine translation systems often return “transliterated” output where “translated” output is expected. High-quality translation of English medical text reports into Marathi is a research domain in its own right. This paper proposes an algorithm whose first step is to submit the English medical text report to Google Translate and Microsoft Bing Translator and to receive a first-level Marathi translation from each. Because this output often contains many transliterated words, especially medical terms, the algorithm then uses the proposed medical dictionary to replace the transliterated words with equivalent Marathi words, improving the readability of the medical text reports.

Keywords: NLP, Machine Translation, Transliterated words, Loan words, Text similarity
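The dictionary step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `replace_transliterated`, the three dictionary entries, and the sample sentence are all hypothetical, and the first-level translator output is passed in as a plain string rather than fetched from any translation service. The sketch matches whole whitespace-separated tokens only, so inflected Marathi forms and multi-word terms would need extra handling.

```python
# Hypothetical entries for illustration; the paper's actual medical
# dictionary is not available in this text.
MEDICAL_DICT = {
    "किडनी": "मूत्रपिंड",    # transliterated "kidney"
    "लिव्हर": "यकृत",        # transliterated "liver"
    "फ्रॅक्चर": "अस्थिभंग",  # transliterated "fracture"
}

# Punctuation that may be attached to the end of a token,
# including the Devanagari danda.
TRAILING_PUNCT = ".,;:!?।"


def replace_transliterated(text, dictionary):
    """Replace transliterated tokens with dictionary equivalents.

    Tokens are split on whitespace, so runs of whitespace in the input
    are collapsed to single spaces in the output.
    """
    result = []
    for token in text.split():
        core = token.rstrip(TRAILING_PUNCT)
        tail = token[len(core):]
        # Unknown tokens pass through unchanged.
        result.append(dictionary.get(core, core) + tail)
    return " ".join(result)


if __name__ == "__main__":
    # Hypothetical first-level output from one of the two translators.
    first_level_output = "किडनी आणि लिव्हर सामान्य आहेत."
    print(replace_transliterated(first_level_output, MEDICAL_DICT))
```

The same function can be applied to each translator's output in turn. Token-level lookup is used here instead of regular-expression word boundaries because Devanagari vowel signs are combining marks, which makes `\b` unreliable for this script.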

Article
The current research paper discusses the phenomenon of loanwords in light of a range of other borrowing phenomena that are more or less closely related to loanwords. The study concluded that loanwords make up the most frequent type of lexical borrowing and an inevitable consequence, among other various outcomes, of the contact between languages. The study further concluded that borrowing loanwords allows the recipient language to expand its vocabulary. However, the loanwords borrowed from any donor language have to undergo certain processes to make them fit appropriately into the recipient language. These processes include: 1) a process of adaptation, in which non-native phonemes are substituted to fit the recipient language's sound structure, and 2) a process of accommodation, in which phonological patterns are modified according to the phonological rules of the recipient language. The results provided from this present study also showed that there are different levels to which a borrowed loanword from the donor language becomes assimilated into the recipient language. In addition, the level of such assimilation depends on two factors: time and usage. That is, the longer the loanword is borrowed from the donor language and the more it is used by the speakers of the recipient language, the greater its degree of assimilation and familiarity. Finally, many reasons and motives lying behind the existence of loanwords were highlighted in the current research paper.
Article
Meaningful translation and transliteration is an NP problem for languages such as Marathi, because many words are ambiguous and a single word can have multiple uses and meanings in different contexts. Identifying the correct informational need and translating text into meaningful information is therefore a tedious and error-prone task. Google Translate is based on a neural network, and WordNet is an online reference system based on a psycholinguistic theory of human memory. Both approaches are promising tools for language translation. Complete translation of Marathi text to English, or of English text to Marathi, also tends to produce overly complicated, meaningless, or tedious output. The proposed algorithm takes into account meaningful translation or transliteration according to the user’s informational need. This novel approach considers a neural network for meaningful formation of the translated sentence and morphological structure for correct translation of each word, based on ontological analysis of the word.
Conference Paper
The document text similarity measurement and analysis is a growing application of Natural Language Processing. This paper presents the results of using different techniques for semantic text similarity measurements in documents used for safety-critical systems. The research objective of this work is to measure the degree of semantic equivalence of multi-word sentences for rules and procedures contained in the documents on railway safety. These documents, with unstructured data and different formats, need to be preprocessed and cleaned before the set of Natural Language Processing toolkits, and Jaccard and Cosine similarity metrics are applied. The results demonstrate that it is feasible to automate the process of identifying equivalent rules and procedures and measure similarity of disparate safety-critical documents using Natural language processing and similarity measurement techniques.
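The two metrics named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the cited paper's pipeline: tokenization is a plain lowercase whitespace split (the paper's preprocessing and toolkits are not described here), cosine similarity is computed over raw term counts, and the function names are chosen for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, simplistic tokenizer)."""
    return text.lower().split()


def jaccard_similarity(a, b):
    """Size of the token-set intersection divided by the size of the union."""
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    if not set_a and not set_b:
        return 1.0  # convention chosen here: two empty texts are identical
    return len(set_a & set_b) / len(set_a | set_b)


def cosine_similarity(a, b):
    """Cosine of the angle between the two term-count vectors."""
    vec_a, vec_b = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    rule_1 = "drivers must stop at a red signal"
    rule_2 = "a driver must stop at every red signal"
    print(jaccard_similarity(rule_1, rule_2))
    print(cosine_similarity(rule_1, rule_2))
```

Jaccard ignores how often a token occurs, while cosine on term counts weights repeated tokens more heavily, so the two scores can rank the same pair of texts differently.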
Article
Text similarity measurement is the basis of natural language processing tasks, which play an important role in information retrieval, automatic question answering, machine translation, dialogue systems, and document matching. This paper systematically combs the research status of similarity measurement, analyzes the advantages and disadvantages of current methods, develops a more comprehensive classification description system of text similarity measurement algorithms, and summarizes the future development direction. With the aim of providing reference for related research and application, the text similarity measurement method is described by two aspects: text distance and text representation. The text distance can be divided into length distance, distribution distance, and semantic distance; text representation is divided into string-based, corpus-based, single-semantic text, multi-semantic text, and graph-structure-based representation. Finally, the development of text similarity is also summarized in the discussion section.
Article
Text similarity measurement compares text with available references to indicate the degree of similarity between them. Many studies of text similarity have produced a variety of approaches and algorithms. This paper investigates four major text similarity measurements: String-based, Corpus-based, Knowledge-based, and Hybrid similarities. The results of the investigation showed that the semantic similarity approach is more rational in finding the substantial relationship between texts.
Article
Measuring the similarity between words, sentences, paragraphs and documents is an important component in various tasks such as information retrieval, document clustering, word-sense disambiguation, automatic essay scoring, short answer grading, machine translation and text summarization. This survey discusses existing work on text similarity by partitioning it into three approaches: String-based, Corpus-based and Knowledge-based similarities. Furthermore, samples of combinations of these similarities are presented.
Article
Users of the WWW across the globe are increasing rapidly. According to Internet live stats there are more than 3 billion Internet users worldwide today and the number of non-English native speakers is quite high there. A large proportion of these non-English speakers access the Internet in their native languages but use the Roman script to express themselves through various communication channels like messages and posts. With the advent of Web 2.0, user-generated content is increasing on the Web at a very rapid rate. A substantial proportion of this content is transliterated data. To leverage this huge information repository, there is a matching effort to process transliterated text. In this article, we survey the recent body of work in the field of transliteration. We start with a definition and discussion of the different types of transliteration followed by various deterministic and non-deterministic approaches used to tackle transliteration-related issues in machine translation and information retrieval. Finally, we study the performance of those techniques and present a comparative analysis of them.