Conference Paper

Finite-State Machine for Post-Processing Method of Balinese Script to Latin Transliteration


Abstract

The decreasing use of the Balinese Script, including knowledge of Balinese Script to Latin transliteration, has raised concern over the threat of its extinction. This research joins the preservation effort through collaboration between the Engineering and Language disciplines. It focuses on developing a modular post-processing method for that transliteration using a Finite-State Machine (FSM). The method can be used in a mobile application for ubiquitous learning and handles the transliteration from Unicode Balinese Script text to Latin text. It receives the output of the preceding process, which converts a Balinese Script image to Unicode Balinese Script text. The FSM is combined with a dictionary data structure to obtain O(1) lookup time and to avoid hard-coded transliteration rules. The research contributes this development because no comparable development has been reported in this research area. The FSM is represented by a state-transition table showing six possible states, the transitions between them (based on twenty inputs), and the outputs. The dictionary consists of 9620 key-value pairs that comply with the transliteration rules. In the experiment, the method passed over 99% (251 of 253) of the test cases, which were based on the intermediate and output results of a selected image data set covering various kinds of post-processing cases.
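The combination of a state-transition table with a dictionary lookup can be illustrated with a minimal sketch. It is not the paper's machine: the two states, four input classes, and six dictionary entries below are hypothetical stand-ins for the six states, twenty inputs, and 9620 key-value pairs described in the abstract, and the names `LATIN`, `TRANSITIONS`, and `transliterate` are assumptions made for illustration.

```python
from enum import Enum, auto


class State(Enum):
    START = auto()      # no pending inherent vowel
    CONSONANT = auto()  # a consonant was read; its inherent "a" is pending


# Dictionary lookup (average O(1)): character -> (input class, Latin output).
# Hypothetical six-entry subset, not the paper's dictionary.
LATIN = {
    "\N{BALINESE LETTER KA}": ("consonant", "k"),
    "\N{BALINESE LETTER BA}": ("consonant", "b"),
    "\N{BALINESE LETTER LA}": ("consonant", "l"),
    "\N{BALINESE VOWEL SIGN ULU}": ("vowel_sign", "i"),
    "\N{BALINESE VOWEL SIGN SUKU}": ("vowel_sign", "u"),
    "\N{BALINESE ADEG ADEG}": ("killer", ""),  # suppresses the inherent vowel
}

# State-transition table:
# (state, input class) -> (next state, emit pending "a" before the output?)
TRANSITIONS = {
    (State.START, "consonant"): (State.CONSONANT, False),
    (State.START, "vowel_sign"): (State.START, False),
    (State.START, "killer"): (State.START, False),
    (State.START, "other"): (State.START, False),
    (State.CONSONANT, "consonant"): (State.CONSONANT, True),
    (State.CONSONANT, "vowel_sign"): (State.START, False),
    (State.CONSONANT, "killer"): (State.START, False),
    (State.CONSONANT, "other"): (State.START, True),
}


def transliterate(text: str) -> str:
    """Transliterate Unicode Balinese text to Latin with the toy FSM above."""
    state = State.START
    out = []
    for ch in text:
        # Characters missing from the dictionary pass through unchanged.
        kind, latin = LATIN.get(ch, ("other", ch))
        state, emit_inherent_a = TRANSITIONS[(state, kind)]
        if emit_inherent_a:
            out.append("a")
        out.append(latin)
    if state is State.CONSONANT:  # end of input with a pending inherent vowel
        out.append("a")
    return "".join(out)


word = (
    "\N{BALINESE LETTER BA}"
    "\N{BALINESE LETTER LA}"
    "\N{BALINESE VOWEL SIGN ULU}"
)
print(transliterate(word))  # bali
```

Because the rules live in the two tables rather than in branching code, extending the sketch means adding entries, not rewriting the loop, which is the benefit the abstract attributes to avoiding hard-coded rules.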


... et al. [6] developed the method further and increased accuracy to as much as 94% by adding special words and corrections. A more developed dedicated Balinese-Unicode font was found in the Noto Serif Balinese font than in the Noto Sans Balinese font [7][8][9] ...
Article
The existence of a regional language reflects the identity of a people. Indonesia has many regional languages, one of which is Balinese, written in the Balinese script, which comes from Sanskrit. Reading the Balinese script requires knowledge of both the script and Balinese culture in order to pronounce the written text correctly. Because learning the Balinese script takes a long time, knowledge of how to read it spreads slowly. A system that can convert Balinese script into speech is therefore needed, so that it can be used in learning the script without mistakes in pronouncing vowels. This research designed a system that synthesizes Balinese script, in the Noto Serif Balinese font, into speech through three processes: text pre-processing, prosody generation, and concatenation. Concatenation, i.e. combining speech units, uses the Pitch Synchronous Overlap Add (PSOLA) method. The intended result is to decompose the Unicode text of the Noto Serif Balinese font into a sentence, which then becomes the message to be spoken. In the future, this research could take an image as input and convert it directly into speech, making it easier for users to translate the Balinese script.
Article
This research proposes a method for transliterating affixed words to the Balinese Script. The topic has not been studied before, and it matters because affixed words inevitably need to be transliterated. The research is one of the efforts to digitally preserve endangered knowledge of the Balinese local language in Indonesia through multi-disciplinary collaboration between the Computer Science and Language disciplines. The proposed method takes care of two related aspects: (1) a Latin root word maps to its Balinese Script root word through either the default or a special transliteration rule; and (2) a Latin root word whose Balinese Script root word requires a special transliteration rule also needs a special transliteration rule for its affixed forms. The study was conducted on the pioneering web-based transliteration learning application, BaliScript, which receives Latin text input and outputs Balinese Script using the Noto Serif Balinese font with its dedicated Balinese Unicode. In the experiment, the proposed method gave the expected transliteration results, added a certain perspective, and strengthened the transliteration knowledge. Future work is to enhance and reuse this method on mobile computing devices, as part of Balinese Language ubiquitous learning that supports Balinese Language education, a mandatory local subject from elementary school to high school in Bali Province.
Article
Many Javanese manuscripts in Indonesia are stored in museums and libraries. Most of these manuscripts were written in local scripts that are rarely used in everyday life, so a software application that helps people read them is valuable. An essential step in automatic manuscript image transliteration is post-processing, which involves editing and concatenating syllables into words. The main problem of post-processing is that there is no symbol for the space between words in a sentence, which is called the scriptio continua problem. This paper proposes methods based on the backtracking algorithm to solve the scriptio continua problem in the post-processing step of Javanese manuscript image transliteration. The proposed methods use a depth-first search to find relevant candidate words and decide whether or not to merge a new syllable. The results of concatenating 17,687 syllables from the Hamong Tani book, using a dictionary containing 49,801 words, are found to be satisfactory in terms of computation and accuracy. The implemented greedy and brute-force methods both reach an accuracy of 81.64%. However, the greedy method is more efficient and performs better than the brute-force method.
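The idea of merging syllables into dictionary words by depth-first search with backtracking can be sketched as follows. This is a generic illustration rather than the cited paper's greedy or brute-force implementation: the function name `segment`, the `max_len` limit, and the sample syllables and dictionaries are all assumptions.

```python
from typing import List, Optional, Set


def segment(syllables: List[str], dictionary: Set[str],
            max_len: int = 4) -> Optional[List[str]]:
    """Group syllables into dictionary words using DFS with backtracking.

    Longer candidates are tried first (a greedy ordering). If a choice
    leaves a remainder that cannot be segmented, the search backtracks
    and tries a shorter candidate. Returns None when no complete
    segmentation exists. No memoization is used, so the worst case is
    exponential in the number of syllables.
    """
    def dfs(start: int) -> Optional[List[str]]:
        if start == len(syllables):
            return []  # every syllable has been consumed
        longest = min(len(syllables), start + max_len)
        for end in range(longest, start, -1):
            word = "".join(syllables[start:end])
            if word in dictionary:
                rest = dfs(end)
                if rest is not None:
                    return [word] + rest
        return None  # dead end: the caller tries a shorter candidate

    return dfs(0)


# Hypothetical data; a real run would use the full syllable stream and lexicon.
print(segment(["ha", "mong", "ta", "ni"], {"ha", "hamong", "tani"}))
# ['hamong', 'tani']

# Backtracking case: the longest first match "ab" strands "c", so the search
# falls back to "a" followed by "bc".
print(segment(["a", "b", "c"], {"a", "ab", "bc"}))
# ['a', 'bc']
```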
Article
This research aims to preserve knowledge of Balinese script writing using a technological approach. This cross-disciplinary research (Computer Science and Balinese Language) contributes the development of a Latin-to-Balinese script transliteration robotic system called LBtrans-Bot. LBtrans-Bot can be used as a learning system that teaches transliteration knowledge as one aspect of Balinese script writing. LBtrans-Bot is known as the first system that utilizes the Noto Sans Balinese font, and it was developed based on seventeen identified kinds of special words. It consists of the transliterator web application, the transceiver console application, and the robotic arm with its GUI controller application. In the experiment, LBtrans-Bot was able to write the Noto Sans Balinese font at a 34-pixel font size from an HTML 5 canvas set up with an additional 10 pixels in the width and the height of the Balinese script writing area. Its transliterator achieved an accuracy of up to 91% on test cases from The Balinese Alphabet writing rules and examples document. This result outperformed the best known existing transliterator based on the Bali Simbar font, Transliterasi Aksara Bali, which achieved an accuracy of only up to 68% on the same testing document. In future work, LBtrans-Bot could be improved by: 1) accommodating more complex Balinese script, with a trade-off against the limited writing area of the robotic system; and 2) enhancing its transliterator by enriching the database of words belonging to the seventeen kinds of special words and by implementing semantic relation transliteration. © 2018 Institute of Advanced Engineering and Science. All rights reserved.
Article
Balinese script writing, as part of Balinese cultural richness, is heading toward extinction because of its decreasing use. This research is a way to preserve it through collaboration between the Computer Science and Language disciplines, focusing on an accuracy comparison of Latin-to-Balinese script transliteration methods in mobile applications as ubiquitous learning media. Among the few studies in this area, only two existing methods could be compared, each implemented in an Android mobile application: Belajar Aksara Bali (BAB) and Transliterasi Aksara Bali (TAB). The comparison was based on The Balinese Alphabet writing rules and examples document by Sudewa. In the experiment, TAB outperformed BAB: TAB passed over 68% (103 of 151) of the cases, while BAB passed only over 39% (59 of 151). This research contributes a comprehensive accuracy comparison analysis of Latin-to-Balinese script transliteration methods, specifically in mobile applications, since no such study existed. It also identifies possibilities for improving those methods. In the future, this research can serve as a reference for improving any Latin-to-Balinese script transliteration method by taking care of the thirteen kinds of special words found during this comparison study. © 2018 Institute of Advanced Engineering and Science. All rights reserved.
Article
This paper proposes a personal learning assistant called LORAMS (Link of RFID and Movies System), which supports learners with a system to share and reuse learning experiences by linking movies and environmental objects. These movies include not only classroom experiments but also daily experiences, so they can be shared with other people. LORAMS can infer contexts from the objects around the learner and search for shared movies that match those contexts. We think that these movies are very useful for learning various kinds of subjects. We conducted evaluation experiments in which some participants recorded movies and linked objects, while others learned using LORAMS and attempted a task. The results show that learners performed the task better with LORAMS than without its assistance.
Article
Machine transliteration is the process of automatically transforming the script of a word from a source language to a target language, while preserving pronunciation. The development of algorithms specifically for machine transliteration began over a decade ago based on the phonetics of source and target languages, followed by approaches using statistical and language-specific methods. In this survey, we review the key methodologies introduced in the transliteration literature. The approaches are categorized based on the resources and algorithms used, and the effectiveness is compared.
Conference Paper
The decreasing use of the Balinese Script, including knowledge of Latin-to-Balinese Script transliteration, has raised concern over the threat of extinction of this part of Balinese culture. This research aims to preserve that knowledge through a technological approach, in collaboration between the Engineering and Language disciplines. It provides lessons learned from analyzing a computer-based implementation of Latin-to-Balinese Script transliteration on a relatively large document. Such a study had not been conducted before; it adds a certain perspective and strengthens computer-based transliteration knowledge as part of Balinese Language ubiquitous learning. The analysis was conducted on the pioneering Aksara Bali Simbar Dwijendra (SD) web application, which receives Latin text input and outputs Balinese Script based on the Bali SD font. In the experiment, ten categorized lessons were identified. Future work to enrich this transliteration knowledge is to explore computer-based implementations that use dedicated Balinese-Unicode fonts.
Conference Paper
This study aims to preserve an endangered part of Balinese local culture, namely Latin-to-Balinese Script transliteration knowledge, through web technology that supports ubiquitous learning. It analyzes the handling of line breaking in a Latin-to-Balinese Script transliteration web application, since line breaking has not yet been studied for this script, which has no word boundaries (scriptio continua). Two rules of thumb should be applied: 1) no line break is allowed within a syllable-sign cluster or consonant-vowel cluster; and 2) no line break is allowed just before a colon, comma, or period. This research added a certain perspective and strengthened the transliteration knowledge, as part of Balinese Language ubiquitous learning that supports Balinese Language education, which is a mandatory local subject from basic to high school in Bali Province. The analysis was conducted on the Hanacaraka web application, developed as a technological product of Universitas Pendidikan Ganesha (Undiksha), Indonesia. Hanacaraka receives Latin text input and outputs Balinese Script based on the Bali Simbar font. In the experiment, its handling of line breaking gave good transliteration results because a special algorithm was applied. Future work to enrich and strengthen the transliteration knowledge is to extend the handling of line breaking to Latin-to-Balinese Script web applications supported by a dedicated Balinese-Unicode font.
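The two rules of thumb can be sketched as a function that lists permitted break positions. This is a hedged illustration, not the special algorithm used in Hanacaraka, which targets the Bali Simbar font. The sketch assumes Unicode-encoded Balinese text instead, treats every combining mark as part of its cluster, and additionally assumes that a consonant following adeg adeg belongs to the same cluster. The names `NO_BREAK_BEFORE` and `break_opportunities` are hypothetical.

```python
import unicodedata

ADEG_ADEG = "\N{BALINESE ADEG ADEG}"

# Rule 2: characters that must not start a line. The ASCII marks cover Latin
# text; the three Balinese marks are assumed here to play the roles of
# colon, comma, and period.
NO_BREAK_BEFORE = set(":,.") | {
    "\N{BALINESE CARIK PAMUNGKAH}",
    "\N{BALINESE CARIK SIKI}",
    "\N{BALINESE CARIK PAREREN}",
}


def break_opportunities(text: str) -> list:
    """Return every index i at which a line break before text[i] is permitted."""
    allowed = []
    for i in range(1, len(text)):
        ch, prev = text[i], text[i - 1]
        # Rule 1: a combining mark (vowel sign or other dependent sign)
        # stays with the consonant it is attached to.
        if unicodedata.category(ch).startswith("M"):
            continue
        # Rule 1, assumption: a consonant right after adeg adeg is treated
        # as part of the same cluster.
        if prev == ADEG_ADEG:
            continue
        # Rule 2: never break just before a colon, comma, or period.
        if ch in NO_BREAK_BEFORE:
            continue
        allowed.append(i)
    return allowed


sample = (
    "\N{BALINESE LETTER BA}"
    "\N{BALINESE LETTER LA}"
    "\N{BALINESE VOWEL SIGN ULU}"
    "\N{BALINESE CARIK SIKI}"
    "\N{BALINESE LETTER KA}"
)
print(break_opportunities(sample))  # [1, 4]
```

In the sample, index 2 is rejected by rule 1 (the vowel sign stays with LA) and index 3 by rule 2 (the comma-like mark cannot start a line).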
Conference Paper
This study analyzes the handling of mathematical expressions in the Latin-to-Balinese Script transliteration method, which has not been studied before. It is one of the ways to preserve endangered local cultural knowledge through collaboration between the Computer Science and Language disciplines. Moreover, the study was conducted on mobile computing that supports ubiquitous learning. Three aspects are involved: 1) the Balinese Language uses verbal mathematical expressions rather than mathematical notation in its writing; 2) in transliteration, the Latin text of a mathematical expression written in notation should be preserved to avoid the complexity of the various verbal mathematical expressions (which actually have the same meaning); and 3) the second aspect was handled only to a limited extent by the supporting computer font, in this case the Bali Simbar Dwijendra (SD) font, so a special algorithm needed to be applied. This research added a certain perspective and strengthened the transliteration knowledge, as part of Balinese Language ubiquitous learning that supports Balinese Language education, which is a mandatory local subject from basic to high school in Bali Province. The analysis was conducted on the pioneering Aksara Bali SD mobile application, which receives Latin text input and outputs Balinese Script based on the Bali SD font. In the experiment, its handling of mathematical expressions gave good transliteration results because a special rule-based algorithm was applied.
Article
Transliteration is the conversion of text from one script to another, representing words from one language using the approximate phonetic or spelling equivalents of another language. Machine transliteration has emerged as an important research area within the field of machine translation. Transliteration systems are very beneficial for removing language and script barriers. Machine transliteration has gained prime importance as a supporting tool for machine translation and cross-language information retrieval, especially when proper names and technical terms are involved. Various techniques are available for the transliteration process. This paper gives a brief overview of commonly used machine transliteration techniques.
The Unicode Consortium, "Balinese Unicode Table," 2020. [Online]. Available: http://unicode.org/charts/PDF/U1B00.pdf. [Accessed: 01-May-2020].
Bali Government, "Peraturan Gubernur Bali No. 80 tentang Pelindungan dan Penggunaan Bahasa, Aksara, dan Sastra Bali serta Penyelenggaraan Bulan Bahasa Bali [Bali Governor Regulation No. 80 on Protection and Usage of Balinese Language, Script, and Literature, also Organizing Balinese Language Month]," 2018. [Online]. Available: https://jdih.baliprov.go.id/produkhukum/peraturan/abstrak/24665. [Accessed: 17-Aug-2020].
Bali Government, "Surat Edaran Gubernur Bali No. 3172 Tahun 2019 tentang Penggunaan Busana Adat Bali dan Aksara Bali [Bali Governor Circular Letter No. 3172 Year 2019 about The Usage of Balinese Traditional Clothing and Balinese Script]," 2019. [Online]. Available: https://jdih.baliprov.go.id/produk-hukum/peraturan/abstrak/24741. [Accessed: 17-Aug-2020].
G. Indrawan, I. P. E. Swastika, Sariyasa, and I. K. Paramarta, "An Improved Algorithm and Accuracy Analysis Testing Cases of Latin-to-Balinese Script Transliteration Method based on Bali Simbar Dwijendra Font," Test Eng. Manag., vol. 83, pp. 7676-7683, 2020.
Ida Bagus Adi Sudewa, "The Balinese Alphabet," 2003. [Online]. Available: http://www.babadbali.com/aksarabali/alphabet.htm. [Accessed: 01-May-2020].
I. G. K. Anom et al., Kamus Bali-Indonesia Beraksara Latin dan Bali [Balinese-Indonesian Dictionary with its Latin and Balinese Script]. Denpasar: Bali Province, 2009.
J. Esling, Handbook of the International Phonetic Association: A Guide to the Use of the International Phonetic Alphabet. Cambridge University Press, 1999.