Bornini Lahiri
Indian Institute of Technology Kharagpur | IIT KGP · Department of Humanities and Social Sciences
PhD in Linguistics
Publications: 37
Reads: 20,335
Citations: 70
Introduction
Bornini Lahiri currently works at the Indian Institute of Technology Kharagpur as an Assistant Professor. Earlier, she was an Assistant Professor at the Central Institute of Hindi. She has also taught Linguistics in the School of Languages and Linguistics, Jadavpur University. She has worked on different projects related to language documentation. Her research areas include exploring and documenting minor and endangered languages of India, along with typology and sociolinguistics.
Additional affiliations
November 2017 - January 2019
Education
July 2010 - April 2015
Publications (37)
This article analyses the cases and case markers of the Kurukh language. Kurukh is a language of the North Dravidian language family. It is the spoken language of the Oraon tribe of the east-central Indian region. About two crore (20 million) people speak this language in Jharkhand, Chhattisgarh, Odisha, West Bengal and Assam. Kurukh has been marked by UNESCO as an 'at-risk' or vulnerable language...
In this paper, we discuss the development of a multilingual dataset annotated with a hierarchical, fine-grained tagset marking different types of aggression and the “context” in which they occur. The context, here, is defined by the conversational thread in which a specific comment occurs and also the “type” of discursive role that the comment is p...
The preparation of speech corpora for languages un(der)represented on the web largely depends on the manual methods of data collection and processing from different sources. The methods used in field linguistics and documentary linguistics for collecting data from the speech communities provide a valuable set of resources and methodologies for such...
In this paper, we discuss in-progress work on the development of a speech corpus for four low-resource Indo-Aryan languages (Awadhi, Bhojpuri, Braj and Magahi) using the field methods of linguistic data collection. The total size of the corpus currently stands at approximately 18 hours (approx. 4-5 hours for each language) and it is transcribed and...
In this paper, we discuss in-progress work on the development of a speech corpus for four low-resource Indo-Aryan languages (Awadhi, Bhojpuri, Braj and Magahi) using the field methods of linguistic data collection. The total size of the corpus currently stands at approximately 18 hours (approx. 4-5 hours for each language) and it is transcribed and...
In the present paper, we will present the results of an acoustic analysis of political discourse in Hindi and discuss some of the conventionalised acoustic features of aggressive speech regularly employed by the speakers of Hindi and English. The study is based on a corpus of slightly over 10 hours of political discourse and includes debates on new...
In the present paper, we will present a survey of the language resources and technologies available for the non-scheduled and endangered languages of India. While there have been different estimates from different sources about the number of languages in India, it could be assumed that there are more than 1,000 languages currently being spoken in I...
In the present paper, we present a detailed description of the classifier systems of five Indian languages: Mizo, Galo, Tagin (all belonging to the Tibeto-Burman family), Assamese (Indo-Aryan) and Malto (Dravidian). It is observed that classifiers are a predominant feature in Tibeto-Burman and we observe an extensive classifier system in the...
In this paper, we give a description of one of the varieties of Eastern Hindi spoken in the central, Magahi-speaking parts of Bihar (the variety spoken in and around the capital city of Patna) and present the case for it being a mixed language. Based on extensive empirical evidence, we conclude that Eastern Hindi is a conventionalised/plain mixed lang...
In the present paper, we carry out a comparative study between offensive and aggressive language and attempt to understand their inter-relationship. To carry out this study, we develop classifiers for offensive and aggressive language identification in Hindi, Bangla, and English using the datasets released for the languages as part of the two share...
In the situation of language endangerment, especially because of various kinds of pressure from surrounding majority languages and a low language prestige among the community members, language games of various kinds could prove to be an effective tool enhancing the prestige, providing an additional domain of language use to the community members an...
The paper discusses the effects of borrowing Bangla words on Koda verb morphology. Koda belongs to the Munda language group within the Austro-Asiatic (AA) language family. The paper is based on primary data collected from Paschim Medinipur in West Bengal. Words like /ar/ (and) came into Bangla from AA languages. The borrowed Bangla words take Koda suffixes...
The Kurmali-speaking community in West Bengal is part of a continuum of the Kurmali belt which exists in the northern part of India. The continuum runs through Jharkhand, Bihar, Odisha and West Bengal. In this paper, the kinship terms of Kurmali have been studied from both linguistic and anthropological points of view. Kurmali is an Indo-Aryan langu...
The document has been compiled under the aegis of UGC SRIELI project (2016-2021) funded by the University Grants Commission, India. The version of this questionnaire is v.oct.18.1.
In this paper, we discuss the development of a multilingual annotated corpus of misogyny and aggression in Indian English, Hindi, and Indian Bangla as part of a project on studying and automatically identifying misogyny and communalism on social media (the ComMA Project). The dataset is collected from comments on YouTube videos and currently contai...
The document has been compiled under the aegis of UGC SRIELI project (2016-2021) funded by the University Grants Commission, India. The version of this questionnaire is v.oct.18.1.
Surjapuri is a minor language of India, spoken in the eastern part of Bihar. It is an Indo-Aryan language that has not been studied much. The language shares features and lexical items with neighbouring languages like Maithili, Bhojpuri, Bangla and Assamese. However, the language has an interesting feature which is not found in the other In...
In this paper, we discuss an attempt to develop an automatic language identification system for 5 closely related Indo-Aryan languages of India: Awadhi, Bhojpuri, Braj, Hindi and Magahi. We have compiled comparable corpora of varying length for these languages from various resources. We discuss the method of creation of these corpora in detail. U...
This paper is about a Tibeto-Burman language called Dhimal. It is an endangered language of India. However, the community in recent times is trying hard to revitalize their language through various methods, which are described in this paper.
Verbal aggression could be defined as any act which seeks to disturb the social and relational equilibrium. In a large number of cases, verbal aggression could be a precursor to certain kinds of criminal activities; in others, as in political speeches, it might be desirable to take note of specific kinds of aggression. In this paper we discuss the d...
The paper describes the contact-induced changes that can be found in Bangla spoken in states where Hindi is the major and official language. The paper also compares the Bangla of West Bengal, where Bangla is the major language, with the Bangla of the states of Bihar, Uttar Pradesh and Delhi, where Bangla is not the major langu...
The present study reports an investigation of English spelling errors made by students who are native Hindi speakers. The study was conducted on grade five students of an English-medium school in India. Students with similar socioeconomic backgrounds, studying in the same school, were given various tasks to test their English spelling skills. The errors we...
Magahi is an Indo-Aryan Language, spoken mainly in the Eastern parts of India. Despite having a significant number of speakers, there has been virtually no language resource (LR) or language technology (LT) developed for the language, mainly because of its status as a non-scheduled language. The present paper describes an attempt to develop an anno...
The paper describes the local cases of four of the eastern Indo-Aryan languages (EIA) using the cognitive framework. The languages under observation are Asamiya, Bangla, Maithili, and Oriya. The local cases are used to mark the position or location of an object which is always stated in reference to another object. These languages use local cases t...