Conference Paper

Hybrid word sense disambiguation using language resources for transliteration of Arabic numerals in Korean

DOI: 10.1145/1644993.1645053 Conference: Proceedings of the 2009 International Conference on Hybrid Information Technology, ICHIT 2009, Daejeon, Korea, August 27-29, 2009
Source: DBLP


The high frequency of Arabic numerals in informative texts, together with their multiple senses and readings, degrades the accuracy of text-to-speech (TTS) systems. This paper presents a hybrid word sense disambiguation method that exploits a tagged corpus and a Korean wordnet, KorLex 1.0, to convert Arabic numerals into Korean phonemes correctly and efficiently according to their senses. Individual contextual features are extracted from the tagged corpus and grouped to determine the sense of each Arabic numeral. Least upper bound synsets among the common hypernyms of the contextual features were obtained from the KorLex hierarchy and used as semantic categories of those features. The semantic classes were used to train a decision tree that classifies the meaning and reading of Arabic numerals, and to compose grapheme-to-phoneme rules for an automatic transliteration system for Arabic numerals. The proposed system outperforms the customized TTS systems by 3.9% to 20.3%.
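The abstract's step of finding a least upper bound among common hypernyms can be illustrated with a minimal sketch. This is not the authors' implementation: the hierarchy below is hypothetical toy data rather than KorLex content, the names `HYPERNYM`, `hypernym_path`, and `least_upper_bound` are invented for illustration, and each node is assumed to have a single parent, which a real wordnet does not guarantee.

```python
# Toy hypernym hierarchy: child -> parent.
# Hypothetical data for illustration only; not taken from KorLex.
HYPERNYM = {
    "dollar": "monetary_unit",
    "won": "monetary_unit",
    "monetary_unit": "unit",
    "kilometer": "length_unit",
    "meter": "length_unit",
    "length_unit": "unit",
    "unit": "entity",
}


def hypernym_path(word):
    """Return the chain from word up to the root, starting with word itself."""
    path = [word]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path


def least_upper_bound(words):
    """Return the most specific node that is an ancestor-or-self of every word.

    Assumes a tree (one parent per node), so the first shared node met while
    walking up from the first word is the least upper bound.
    """
    paths = [hypernym_path(w) for w in words]
    common = set(paths[0]).intersection(*map(set, paths[1:]))
    for node in paths[0]:
        if node in common:
            return node
    return None  # the words share no ancestor in this hierarchy


if __name__ == "__main__":
    print(least_upper_bound(["dollar", "won"]))
    print(least_upper_bound(["dollar", "meter"]))
```

In this sketch, context words that co-occur with a numeral, such as currency names, collapse into one shared category, which could then serve as a single feature for a classifier instead of many sparse word-level features.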


Available from: Youngim Jung
  • Source
    ABSTRACT: India is a multilingual country: people speak different languages and also write numbers in different ways. Without knowledge of a language, people cannot read number text written in it. The proposed method addresses the problem of reading number text across languages and falls within Natural Language Processing (NLP). Optical character recognition is used to separate the number text from an image. A rule-based approach then translates the number text from one regional language to another, and speech synthesis renders the result in voice form.
    Full-text · Conference Paper · Dec 2014