Shikhar Kr. Sarma

  • Ph.D.
  • Professor at Gauhati University

About

72
Publications
54,057
Reads
520
Citations
Introduction
NLP, Language Technology in Assamese, Bodo
Current institution
Gauhati University
Current position
  • Professor
Additional affiliations
January 2008 - present
Gauhati University
Position
  • Head of Faculty

Publications

Chapter
This paper proposes an approach to text summarization in the Assamese language. Text summarization is an important natural language processing application that condenses the original text by retrieving the relevant information while retaining its main content. In the proposed approac...
Article
Full-text available
Social media users and online news portals are rising exponentially in the Northeastern state of India, where every small instance of daily life is posted by users in their native language on social media platforms such as Facebook and Twitter. Every social media user posts about their daily experiences on this kind of social media platform, which...
Chapter
The popularity of mobile devices and the availability of the Internet increase the use of various online platforms for chatting and communicating with others. Due to the use of such platforms, the use of local languages is also increasing because everyone feels comfortable with his/her mother tongue. In this research work, clustering of Assamese wo...
Article
Full-text available
IEEE 802.11 Wireless LAN, popularly known as WiFi, has become a favored source of internet connectivity for most offices and organizations. Due to the rapid growth of multimedia data and VoIP, and the need to provide better quality of service (QoS), bandwidth and bandwidth management have become important factors in 802.11 wireless LA...
Chapter
In this paper, an extractive text summarization approach using Assamese WordNet is proposed, and the difficulties faced while extracting a summary from an Assamese document are discussed. The Assamese language is a low-resource language. Synsets from Assamese WordNet are applied. The various features used for identifying the most salient sentences to gener...
Article
Full-text available
This review article sheds light on text-to-speech synthesis of the Assamese language using the unit selection concatenative speech synthesis technique. Assamese is one of the North East Indian languages, spoken by millions of people. This article highlights some major difficulties in developing the synthesizer. The speech un...
Article
Full-text available
It is very common to miss certain words when writing while listening to others. A similar problem can arise when typing on a computer. Automatically generating the missed words would help users greatly by suggesting the required words. In this research work, missed words of Assamese sentences are generated. At present, there is no suc...
Conference Paper
Full-text available
WordNets have been used in a wide variety of applications, including in design and development of intelligent and human assisting systems. Although WordNet was initially developed as an online lexical database (Miller, 1995; Fellbaum, 1998), later developments have inspired using WordNet database as resources in NLP applications, Languag...
Article
Full-text available
The present-day scenario demands using and adopting the Green Computer, Green Computing, and, as such, the Green Computer Network. Many industries have already initiated tasks to attain green soft-computing and green networking in line with eco-friendly technology so as to help the ecosystem and thus make an impact in the biological sciences....
Chapter
A natural language or an everyday language is an accustomed form of communication used by people to speak, express and write. These languages are called natural because they evolved naturally among communities. Natural Language Processing is a vital field in connection with Artificial Intelligence, where research has expon...
Chapter
With the rapid growth of ICT, almost all entities that exist in this world are moving towards a computing-enabled digital space. At present, the computing infrastructure is not only an infrastructure for scientific computation and communication but is also considered a platform for computation of anything appearing at anytime and anywhere...
Article
Full-text available
Text summarization is the task of condensing the input text documents into a shorter version while retaining the overall meaning and information of the original document. Though plenty of text compaction research in English and other languages has been done to date, text compaction in the Assamese language is still lagging behind due to the l...
Article
Full-text available
A statistical machine translation system for the Assamese and English language pair, developed using Marie.
Article
Full-text available
The demand for Machine Translation (MT) is increasing due to the increased rate of exchange of information around the globe. Considering the Internet as the main channel of information sharing, the source of information is not confined to a specific geographical location or a specific language. MT is the way of translating from one language to another...
Article
Word prediction is a technique that tries to suggest a word by observing the previous input letters or words in any text editor. At present there is no software or tool in Assamese that can predict the upcoming word(s) of a sentence. This method helps people who are not proficient at typing, and this research aims to reduce the gap be...
Chapter
Analyzing the morphology of a word is a crucial task and may vary across languages according to their grammar. Assamese is a language spoken by the people of Assam, a state in the northeastern part of India located south of the eastern Himalayas. Assamese is the major language spoken in Assam, and it serves as a bridge language among diff...
Chapter
Prosody is a term associated with literature as well as speech technology. It is one of the primary components in the design of any natural-sounding text-to-speech synthesis system. Prosody is a broad and complex way of expressing the meaning of a text segment in terms of pitch (the fundamental frequency of an utterance), loudness or intensity, intonation, and rhyth...
Chapter
This paper deals with a major issue in designing a Text-to-Speech synthesizer. To design a speech synthesizer, we need speech prosody in which all significant utterance-related information is systematically stored. An utterance can be divided into the segmental level as well as the suprasegmental level. The suprasegmental level deals with syll...
Conference Paper
To build synthesized discourse close to human speech, it is important that the content-preparing segment delivers a suitable arrangement of phonemic units corresponding to the text given as input. Detailed experimental research has been done throughout the project to syllabify Assamese words by using existing Phonetic Al...
Conference Paper
This paper presents the design and development of an expert system that aims to provide suitable diagnosis of some of the diseases of rice plants. An expert system can be defined as a computer program that uses encoded knowledge to solve problems in a specific domain that normally requires a specialized human expert. The proposed system composed...
Article
Word Sense Disambiguation (WSD) aims to automatically disambiguate words that have multiple senses in a context. Sense denotes the meaning of a word, and words that have various meanings in a context are referred to as ambiguous words. WSD is vital in many important Natural Language Processing tasks like MT, IR, TC, SP etc. This research paper atte...
Conference Paper
Word Sense Disambiguation (WSD) is the process of identifying the proper sense of an ambiguous word depending on the particular context. It is to find the accurate sense si among the set of senses {s1, s2, ..., sn}. This task was motivated by its interpretation in various Natural Language Processing (NLP) applications like IR, MT, QA, TC, SP etc. In this...
Article
Full-text available
Searching for a document in the huge collection all over the internet is becoming a challenge. Like other languages, the Bodo language is also contributing content to the electronic world. Bodo is widely used in the North Eastern states of India. As text documents are increasing exponentially across the web, grouping similar documents for versatile applications...
Article
Full-text available
Information or knowledge contained by the texts is structured in a language specific syntactic form. They are neither understood nor processed by the computers. They must be organized in a structured form. Structured representation of sentences written in a particular language enables a computer to have a good understanding of the knowledge they co...
Article
Full-text available
Morphological Analysis is an important branch of linguistics for any Natural Language Processing Technology. Morphology studies the word structure and word formation of a language. In the current scenario of NLP research, morphological analysis techniques are becoming more popular day by day. For processing any language, morphology of the word should...
Article
Full-text available
This work primarily focuses on different aspects of designing a spell checker for the Assamese language and integrating it as an add-on into Open Office Writer. Besides emphasizing error detection and suggestion generation, the programming model and challenges of developing the add-on for the aforementioned application are also discussed. The sy...
Article
Full-text available
Machine translation is the process of translating text from one language to another. In this paper, Statistical Machine Translation is done on the Assamese and English languages using their respective parallel corpus. The statistical phrase-based translation toolkit Moses is used here. To develop the language model and to align the words we used two a...
Article
The IEEE 802.11 WLAN is primarily used for web browsing, which belongs to the category of non-real-time applications. But the demand for real-time applications like VoIP and video conferencing has become very common in such WLANs. With the IEEE 802.11e MAC protocol it is possible to improve the QoS for both real and non-real time traffic by service di...
Article
Machine Translation is the task of automatically translating text from a source language to a target language. Here, we describe a system that translates English text into Assamese, based on a phrase-based statistical translation technique. To overcome the translation problem related to highly open word class lik...
Article
Full-text available
The paper defines the term electrical noise with its types. Electromagnetic Interference (EMI), which is one type of electrical noise, is also defined and general techniques used for controlling EMI are described. Networking cables are affected by the EMI effect caused by a nearby power cable and data transmission through Unshielded Twisted Pair (U...
Conference Paper
The increasing amount of Assamese text content in digital format encourages us to build a system that automatically categorizes it. This paper discusses a system that will perform the categorization of texts automatically based on knowledge from Assamese WordNet. In WordNet, a synset corresponds to the words that imply the same concept and...
Conference Paper
Multiword Expressions (MWEs) are sequences of words separated by a space or delimiter that convey a unique meaning instead of the words' individual meanings. Our work concentrates on automatic identification of MWEs for two less computationally aware languages, Assamese and Bodo, spoken in the North Eastern part of India. Statistical measure and Langua...
Article
Full-text available
The performance of automatic speaker recognition (ASR) system degrades drastically in the presence of noise and other distortions, especially when there is a noise level mismatch between the training and testing environments. This paper explores the problem of speaker recognition in noisy conditions, assuming that speech signals are corrupted by no...
Conference Paper
The objective of the paper is to give an idea about copper-based networking cables and their important characteristics. Copper-based cables, especially UTP cables, are very sensitive to EMI, whereas optical fiber is insensitive to EMI. Even so, UTP cable is still the most popularly used networking cable today, supporting the standards up to Ten Gigabit Et...
Conference Paper
Assamese is one of the regional languages of India, spoken by the people of Assam and other north eastern states of India. Parts Of Speech (POS) tagging is one of the most important research issues, as it is a basic need for any Natural Language Processing (NLP) task. An automated way to provide a Parts Of Speech label to a word in a context is known as...
Conference Paper
Extracting the user's expected information from a large text collection based on some query is the aim of an Information Retrieval (IR) system. Nowadays Assamese digital documents are increasing at a huge rate, and to collect information efficiently from them we need an Assamese IR system for retrieving documents. Comparing query and d...
Conference Paper
The objective of the paper is to outline the current trend of cabling in the networking world. Today, UTP cable (CAT5e, CAT6) is the most popularly used cable to provide Gigabit Ethernet Networking despite some disadvantages compared to Fiber Optic cable, which offers more signal reliability. The paper tries to give an overview of all categories...
Conference Paper
Data transmission through a UTP cabling system is affected by EMI from a nearby power line through the coupling mechanisms. Today, though UTP cable is the most preferred cabling supporting 10G Ethernet, it is also the cable most influenced by EMI, since it is unshielded. Shielding and Physical Separation are the two most effective methods to av...
Conference Paper
An expert system is a computer program composed of a knowledge base, an inference engine and a user interface. Its technical aspect involves the design and implementation of the architectural model of an expert system, namely the knowledge base component, the graphical user interface component, the application component and the database. This paper presents...
Conference Paper
IEEE802.11 WLAN is primarily used for non-real-time traffic, but in recent times real-time applications like VoIP and video conferencing have emerged as exciting and heavily used applications in such WLANs, and they need special attention to attributes like delay sensitivity or bandwidth requirements. The IEEE 802.11e MAC protocol produces improved perf...
Conference Paper
The objective of the paper is to study Electromagnetic Interference (EMI) produced by AC Power lines and its effect on communication/networking cables. Causes of EMI and techniques used for reduction of EMI are pointed out. The aim of the paper is to investigate and analyze the research works, standards, and studies on effect of AC Power line on UT...
Conference Paper
Full-text available
The present paper deals with the design and implementation of multilingual lexical resources of Assamese and Bodo Language with the help of Hindi Wordnet. Here, we present the multilingual dictionaries (for Hindi, Assamese and Bodo), synset based word search for Assamese-Hindi and Bodo-Hindi language. These words, of course, will have to go through...
Conference Paper
Integrating Expert System based on fuzzy logic and its inferences is a new dimension of research in e-learning environment. Standards so far do not provide suitable methods to extract learner's correct expertise level in e-learning environment. This paper depicts adaptation of expert system technology using fuzzy logic and inferences to handle the...
Article
Although there are various applications of Expert Systems in various fields, from agriculture to the diagnosis of diseases in patients, they have potential for extensive contribution in digital learning. This paper discusses and analyses the present applications of Expert Systems in e-learning and examines their usefulness and effectiveness. The...
Conference Paper
Full-text available
Kinship terms form a considerable part of the Wordnet in any language. Most of the kinship terms interact with each other through different relational characteristics of Wordnet. This paper explores the area of kinship terms in the Assamese language and outlines the standard kinship relations and the associated set of terms in the language. The formation of such ter...
Article
Full-text available
This paper presents an architectural framework of an Expert System in the area of agriculture and describes the design and development of the rule based expert system, using the shell ESTA (Expert System for Text Animation). The designed system is intended for the diagnosis of common diseases occurring in the rice plant. An Expert System is a compu...
Article
Full-text available
Development of Wordnets of regional languages has been of great concern in recent years. This is mainly due to the ever increasing demands and requirements of putting those languages as effective media of the digital world, including the internet. As the technologies for putting regional languages in the digital media are being developed, research...
Conference Paper
We have made an attempt to study the spectral characteristics of two North East Indian languages, Assamese and Boro, coming from different genres. We have taken a few words with similar, partially similar, and dissimilar characteristics in their nature of utterance from Assamese and Boro. The spectral analysis reveals that both the languages have a...
Article
Full-text available
This paper discusses the linguistic foundations for developing a Bodo Wordnet, describing the Bodo language characteristics and properties specific to the development of Wordnet. The characteristics of the Bodo language in terms of its morphological and syntactic structure are outlined. Important characteristics related to building of Wordnet are...
Article
Full-text available
Kinship terms form a considerable part of the Wordnet in any language. Most of the kinship terms interact with each other through different relational characteristics of Wordnet. This paper explores the area of kinship terms in the Assamese language and outlines the standard kinship relations and the associated set of terms in the language. The formation of such ter...
Article
Full-text available
In this paper, a new simplified approach is presented for the design and implementation of a noise-robust speech recognition system using a Multilayer Perceptron (MLP) based Artificial Neural Network and LPC-Cepstral Coefficients. Cepstral matrices obtained via Linear Prediction Coefficients are chosen as the eligible features. Here, MLP neural network based...
