
Abstract and Figures

In this paper, we report a survey of language resources (LRs) in Indonesia, primarily of indigenous languages. We look at the official Indonesian language (Bahasa Indonesia) and 726 regional languages of Indonesia (Bahasa Nusantara) and list all the available LRs that we could gather. This paper suggests that the smaller regional languages may remain relatively unstudied and unknown, but they are still worthy of our attention. Various LRs for these endangered languages are being built and collected by regional language centers for study and preservation. We also briefly report on their presence on the Internet.
... As a geographically wide country, Indonesia has hundreds of regional languages spoken by its diverse ethnic groups. It has been reported that Indonesia has at least 726 identified regional languages; the most widely spoken are, in order, Javanese, Sundanese, Malay, Madurese, Minangkabau, Batak, Buginese, and Balinese (Hammam, 2008). In practice, Indonesian, the national language, even serves as an L2 in certain regions of the country. ...
Article
This study examines the attitudes and motivation of Airlangga University linguistics students in learning and mastering English as their L2. Sufficient proficiency in and command of English are needed to support academic success in the teaching-learning process, particularly at a university. A survey of 20 active linguistics students at Airlangga University found that many of them still do not meet the minimum proficiency that the university has set. Several factors affect these students’ proficiency in English. The findings show that these factors derive from the students’ internal motivation and from external factors that keep motivating them to attain adequate proficiency in and command of English. The main driver of the students’ motivation to master English as their L2 is the need to understand the teaching-learning materials, which are predominantly delivered in English. This research offers students, academics, scholars, and institutions several suggestions and solutions for overcoming the obstacles that learners face in achieving their L2 learning goals, as well as suggestions for maximising the L2 learning process.
... In Indonesia, the Minangkabau language is the fifth most spoken indigenous language after Javanese (75m), Sundanese (27m), Malay (20m), and Madurese (14m) (Riza, 2008). ...
Preprint
Although some linguists (Rusmali et al., 1985; Crouch, 2009) have attempted to define the morphology and syntax of Minangkabau, information processing in this language is still absent due to the scarcity of annotated resources. In this work, we release two Minangkabau corpora, one for sentiment analysis and one for machine translation, harvested and constructed from Twitter and Wikipedia. We conduct the first computational linguistics experiments on the Minangkabau language, employing classic machine learning and sequence-to-sequence models such as LSTM and Transformer. Our first experiments show that classification performance on Minangkabau text drops significantly when tested with a model trained on Indonesian. In the machine translation experiment, by contrast, a simple word-to-word translation using a bilingual dictionary outperforms the LSTM and Transformer models in terms of BLEU score.
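The word-to-word baseline mentioned in this abstract can be illustrated with a minimal sketch. It assumes whitespace tokenization and lowercasing, and the function name and the four dictionary entries are illustrative choices, not taken from the paper's bilingual dictionary or code.

```python
def translate_word_by_word(sentence, dictionary):
    """Replace each token with its dictionary entry; unknown tokens pass through unchanged."""
    tokens = sentence.lower().split()
    return " ".join(dictionary.get(token, token) for token in tokens)


# Toy Minangkabau -> Indonesian entries, for illustration only (assumption:
# the actual dictionary used in the paper is far larger and is not shown here).
TOY_DICT = {"indak": "tidak", "ado": "ada", "apo": "apa", "urang": "orang"}

print(translate_word_by_word("Indak ado urang", TOY_DICT))  # tidak ada orang
```

A baseline like this cannot reorder words or handle one-to-many mappings, which is why it is usually only competitive for closely related language pairs with little parallel training data.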
... We conduct the experiment on the PANLOC parallel dataset, which relates to four topics: the economy, international affairs, science and sport [18]. The translation of the PANLOC parallel dataset was carried out by a human being, and therefore not all words in a sentence pair have corresponding/projected words. ...
Conference Paper
Open Information Extraction (Open IE) is a paradigm that tries to extract as much information as possible, with few restrictions on the type of information to be extracted. It extracts relation tuples, each of which consists of a relation tuple trigger and several relation arguments. Previous studies on developing Open IE systems have mainly been for English. Recently, several works have also been carried out in other languages, but there is no study on Open IE for Indonesian. In this paper, we investigate several rule-based methods for building an Open IE system for Indonesian. We use lexical and syntactic features obtained from an Indonesian language processing tool and compare the extraction results against the standard English Open IE systems. The experimental results for English-Indonesian parallel sentences show that the POSTag+Noun Phrase-based rules perform better. The performance of the dependency relation-based rules, in turn, depends on the performance of the dependency parser, which still needs improvement because we used a small dataset to train it. However, both approaches perform well in identifying the relation tuple trigger, with a recall of 0.96 for the POSTag+Noun Phrase-based rules and 0.6 for the POSTag+Dependency relation-based rules. Index Terms: Open Information Extraction, rule-based, POSTag and Noun Phrase-based, POSTag and dependency relation-based, Indonesian Open IE system
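To make the idea of a POS-tag and noun-phrase rule concrete, here is a minimal sketch of one such rule: a verb flanked by two noun phrases yields an (argument, trigger, argument) tuple. The tag names (Universal POS style), the noun-phrase definition, and the single rule are assumptions for illustration; the paper's actual rule set and tagger output are not given in the abstract.

```python
# Tags treated as part of a noun phrase (assumption: Universal POS-style tags).
NP_TAGS = {"NOUN", "PROPN", "ADJ", "DET", "NUM"}


def chunk(tagged):
    """Merge consecutive noun-phrase tokens; every other token is its own chunk."""
    chunks = []
    for word, tag in tagged:
        label = "NP" if tag in NP_TAGS else tag
        if label == "NP" and chunks and chunks[-1][0] == "NP":
            chunks[-1][1].append(word)
        else:
            chunks.append((label, [word]))
    return chunks


def extract_tuples(tagged):
    """Apply one rule: NP VERB NP -> (argument 1, relation trigger, argument 2)."""
    chunks = chunk(tagged)
    tuples = []
    for i in range(1, len(chunks) - 1):
        left, middle, right = chunks[i - 1], chunks[i], chunks[i + 1]
        if middle[0] == "VERB" and left[0] == "NP" and right[0] == "NP":
            tuples.append((" ".join(left[1]), " ".join(middle[1]), " ".join(right[1])))
    return tuples


# "Budi membaca buku baru" (Budi reads a new book), tagged by hand.
sentence = [("Budi", "PROPN"), ("membaca", "VERB"), ("buku", "NOUN"), ("baru", "ADJ")]
print(extract_tuples(sentence))  # [('Budi', 'membaca', 'buku baru')]
```

A rule of this kind depends only on the tagger, which fits the abstract's observation that the dependency-based variant is limited by parser quality.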
... Javanese is one of the indigenous languages of Indonesia and is still relatively strong in terms of its number of speakers, who account for 69.91% of the total population of Indonesia (Riza, 2008:93). However, this condition is worrying due to the dominance of Bahasa Indonesia; as Lauder states in Riza (2008), the increasing number of speakers of Indonesian causes a decrease in the number of speakers of the indigenous languages. Even though the use of Javanese is declining, its values are still preserved in the proverbs. ...
Conference Paper
Proverbs are expressions representing views and values for creating a harmonious and successful life. This paper investigates the local wisdom in Javanese proverbs. A nonparticipant observation method with a note-taking technique is used to collect the data. To analyze the data, I used distributional, identity, and inferential methods. From the analysis, I found values related to, among other things, the Javanese views on social and cultural diversity and on self-control and management in individual, social, and spiritual life. The proverbs convey four levels of meaning, i.e., literal, cognitive, literary, and cultural, since they represent concepts for conceptualizing. The concepts are related to nature, body organs, building, motion, space, visual experience, habit, cosmology, number, family relationship, country, God, container, and shape. This implies that language, in this case Javanese proverbs, can preserve human experiences and habits that may be lost within a decade due to social dynamics and natural changes. The loss may happen along with the loss of the language. Therefore, there must be efforts to preserve the language that upholds them. Keywords: Javanese proverbs, metaphorical, embodied, values, conceptually.
... Indonesia has broad linguistic diversity. There are 726 languages in the country, making it the world's second most linguistically diverse, after Papua New Guinea, which has 823 local languages [1], [2]. Therefore, Indonesia's national language policy, centered on the Indonesian language, has been called a "miraculous success" [3]. ...
Article
There are six vowels in the Indonesian language, i.e. /a/, /i/, /u/, /e/, /ə/ and /o/. This paper presents Indonesian vowel recognition using an artificial neural network (ANN) based on wavelet features. The wavelet features were the wavelet coefficients of the vowel signal, extracted using the discrete wavelet transform (DWT). Vowel samples were recorded from native Indonesian speakers, 10 males and 10 females. Db4 and sym4 were used as the mother wavelets, and decomposition levels 2, 4 and 6 were applied to each vowel sample. The minimum, maximum, mean and standard deviation of the wavelet coefficients were then used as input vectors for an ANN with 2 hidden layers. The backpropagation algorithm was used to train the ANN. In the experiments, an overall recognition rate of 70.83% was achieved. The highest recognition rate is 90% for male speakers and 80% for female speakers. DOI: http://dx.doi.org/10.11591/ijece.v3i2.2325
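The feature extraction described in this abstract (DWT coefficients summarized by minimum, maximum, mean and standard deviation) can be sketched as follows. To stay dependency-free, the sketch swaps in the Haar wavelet rather than the Db4 and sym4 wavelets the abstract names, and it assumes the four statistics are computed per sub-band; the abstract does not say whether they are taken per band or over all coefficients. Function names are hypothetical.

```python
import math
import statistics


def haar_step(signal):
    """One level of the Haar DWT: return (approximation, detail) coefficients."""
    s = math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / s for i in pairs]
    return approx, detail


def wavelet_features(signal, levels):
    """Min, max, mean and standard deviation of each sub-band.

    Requires len(signal) >= 2 ** levels so that no sub-band is empty.
    """
    bands = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)  # final approximation band
    features = []
    for band in bands:
        features.extend(
            [min(band), max(band), statistics.fmean(band), statistics.pstdev(band)]
        )
    return features


# Synthetic stand-in for a recorded vowel: 64 samples of a sine wave.
samples = [math.sin(2 * math.pi * 5 * n / 64) for n in range(64)]
print(len(wavelet_features(samples, levels=2)))  # 12 = (2 detail bands + 1 approximation) * 4
```

The resulting fixed-length vector is what would be fed to the classifier; with decomposition level L the vector has 4 * (L + 1) entries under the per-band assumption.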
... This work contributes an effective solution that automatically translates information mostly provided digitally in English, and provides the Natural Language Processing (NLP) community with a method for developing a majorless-resourced language pair MT system, such as an English-Indonesian MT system. There have been several MT research activities for the Indonesian language, such as the Multilingual Machine Translation System (MMTS) project, which uses an interlingual approach [1], and two statistical MT systems, namely the Google Translate application, which uses a phrase-based statistical approach [2], and a system developed by BPPT and ANTARA [3]. In recent years, a few English-Indonesian MT software packages have become available: Rekso Translator, Translator XP, and KatakuTM. ...
Article
In this paper, the annotated disjunct (ADJ) technique is discussed as a way to develop phrase-based transfer rules. We developed an English to Indonesian Machine Translation (MT) system using those transfer rules and compared it with three available MT software packages and our earlier prototype, which uses the sentence-based ADJ technique. It was found that the developed system outperforms the other systems. In addition, the phrase-based approach generalized the transfer rules, thus reducing the number of transfer rules.
Chapter
Covering an area of approximately 1.8 million square kilometers, and including a vast number of ethnic and linguistic groups, encompassing approximately 700 indigenous and minority languages from varying cultural backgrounds, Indonesia presents a unique opportunity to examine how its education system addresses such a variety of needs across such a diverse context. Any second language education, including English, presents unique challenges in such a linguistically diverse landscape. Examining how such an education system addresses these challenges would be useful for policy makers, administrators, teacher educators, and teachers throughout the region, as these stakeholders are faced with similar challenges in other Asian countries and beyond. In order to investigate how the Indonesian educational system prepares for and responds to such a complex environment, participants for the current ongoing research project are drawn from four groups: current preservice language educators, teacher educators, graduated/practicing teachers, and institutional leaders (principals, headmasters, etc.). In-depth individual or group interviews were conducted with members of each of the four groups. Classroom observations were also conducted with practicing teachers and are used to check information gathered through interviews against actual practice in the language-learning classroom. The results of such research could be instrumental in providing a better understanding of the current situation of English language teacher education and teaching practice in Indonesia, as well as how to plan for future curricular developments. The results could be valuable not only to policymakers and planners in Indonesia and across the Asian region but also in other contexts where learners represent a variety of linguistic and cultural backgrounds.
Article
Since 1998, regional governments in Indonesia have had greater autonomy due to the commencement of a reformation movement across Indonesia. Large portions of education management were delegated to the regional governments. Because of this, the education level varies strongly across Indonesia’s provinces. Data provided by the Indonesian Bureau of Statistics show that Eastern Indonesia generally has a higher rate of uneducated people than Western Indonesia. We review the current condition of Indonesian education in terms of regional disparity between eastern and western provinces and study the correlation between inequality in education and other related aspects, such as social and economic conditions. We find that inequality in socio-economic conditions is reflected in the education disparity between Eastern and Western Indonesia. By employing panel data with provinces as units of observation, we find that the difference in regional development among Indonesian provinces influences education issues. By evaluating the standard deviation of the statistic, we were able to identify socio-economic factors that influence the regional education disparity.
Conference Paper
Bahasa Indonesia and English have many differences in their linguistic structure. Translating sentences from one language to the other is not a straightforward task for this pair of languages. The Example-Based Machine Translation (EBMT) approach, which was introduced as a new paradigm in the machine translation field, is used in this initial work on developing a Bahasa Indonesia to English machine translation system. The system is developed using the Moses system. Experiments in translating Bahasa Indonesia to English by tuning the parameters of the Moses decoder have shed light on how adjusting the weights of the translation model, language model, distortion (re-ordering) and word penalty affects the quality of the translation.
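The effect of the decoder weights mentioned in this abstract can be illustrated with a small sketch of log-linear scoring, in which each candidate translation receives a weighted sum of feature scores and the highest-scoring candidate wins. All feature values, weight settings, and names below are hypothetical and are not taken from the paper or from Moses output.

```python
def loglinear_score(features, weights):
    """Weighted sum of feature scores for one candidate translation."""
    return sum(weights[name] * value for name, value in features.items())


def best_candidate(candidates, weights):
    """Return the name of the candidate with the highest weighted score."""
    return max(candidates, key=lambda name: loglinear_score(candidates[name], weights))


# Hypothetical log-domain feature scores for two candidate translations.
candidates = {
    "hypothesis_a": {"tm": -2.0, "lm": -5.0, "distortion": -1.0, "word_penalty": -4.0},
    "hypothesis_b": {"tm": -3.0, "lm": -3.5, "distortion": 0.0, "word_penalty": -5.0},
}

# Two hypothetical weight settings: one trusts the translation model more,
# the other trusts the language model more.
tm_heavy = {"tm": 1.0, "lm": 0.2, "distortion": 0.1, "word_penalty": 0.0}
lm_heavy = {"tm": 0.2, "lm": 1.0, "distortion": 0.1, "word_penalty": 0.0}

print(best_candidate(candidates, tm_heavy))  # hypothesis_a
print(best_candidate(candidates, lm_heavy))  # hypothesis_b
```

The same pair of candidates is ranked differently under the two settings, which is the kind of sensitivity that weight-tuning experiments are meant to expose.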
Book
The now-classic Metaphors We Live By changed our understanding of metaphor and its role in language and the mind. Metaphor, the authors explain, is a fundamental mechanism of mind, one that allows us to use what we know about our physical and social experience to provide understanding of countless other subjects. Because such metaphors structure our most basic understandings of our experience, they are "metaphors we live by"--metaphors that can shape our perceptions and actions without our ever noticing them. In this updated edition of Lakoff and Johnson's influential book, the authors supply an afterword surveying how their theory of metaphor has developed within the cognitive sciences to become central to the contemporary understanding of how we think and how we express our thoughts in language.
Conference Paper
This paper reports on the development of a Named Entity Recognition (NER) system for Bengali. A pattern-directed bootstrapping method has been used to develop the NER system from a tagged Bengali news corpus, developed from the web. Different tags of the tagged news corpus help to identify the seed data in the system. The training corpus is initially tagged against the different seed data, and a lexical contextual seed pattern is generated for each tag. The entire training corpus is shallow parsed to identify the occurrence of these initial seed patterns, and further patterns are generated through bootstrapping. Patterns that occur in the entire training corpus above a certain threshold frequency are considered the final set of patterns learnt from the training corpus. The test corpus is shallow parsed to identify the occurrence of these patterns and estimate the named entities. The system has been tested with four manually tagged Bengali news corpora (Gold Standard Test Sets), and it has demonstrated highest Recall, Precision and F-Score values of 63.3%, 84.8% and 73.2% respectively.