Sivaji Bandyopadhyay

Jadavpur University | JU · Department of Computer Science and Engineering

About

354
Publications
223,068
Reads
3,930
Citations
Additional affiliations
July 1989 - June 2016
Jadavpur University
Position
  • Professor, Computer Science and Engineering

Publications (354)
Conference Paper
In recent times, the extraction of semantic relations has become extremely useful for the tasks of information retrieval, question answering, decision making, and event prediction. There are a number of relationships, such as cause-effect, if-then, and part-whole, that express essential information about how different events or entities are ant...
Article
The mathematical formula is one of the most vital components in a scientific document, which can explicitly describe various complex concepts and ideas. In addition to numerical calculations, they are also used to clarify definitions and disambiguate explanations transcribed in natural language. Nevertheless, the formulas have a noteworthy impact i...
Chapter
The purpose of this paper is to design and develop a complete Hindi-to-English speech-to-speech translation system. We employ three modules, i.e. automatic speech recognition (ASR), neural machine translation (NMT) and text-to-speech (TTS). The ASR recognizes speech signal in source language A (i.e. Hindi) and gives the text version of it. The NMT...
Article
Full-text available
The generation of natural language descriptions for a video has been reported by many researchers till now. But, it is still the most interesting research topic among the researchers due to the emerging interdisciplinary problem of Computer Vision (CV), Natural Language Processing (NLP) and Deep Learning (DL). The results of a video description are...
Article
In the scientific field, mathematical formulae are a significant factor in communicating the ideas and the fundamental principles of any scientific knowledge. Nowadays, the scientific research community generates a huge number of documents that comprise both textual and mathematical formulae. For the retrieval of textual information, numerous retri...
Article
Full-text available
Training translation systems on complex and compound sentences is generally considered computationally tough, and such systems fail to process the large amount of syntactic information carried by these sentences. This issue subsequently affects the overall quality of translations. On the other hand, simple sentences are shorter by nature and produce...
Article
Full-text available
In recent times, active research is going on for bridging the gap between computer vision and natural language. In this paper, we attempt to address the problem of Hindi video captioning. In a linguistically diverse country like India, it is important to provide a means which can help in understanding the visual entities in native languages. In thi...
Article
Language translation is essential to bring the world closer and plays a significant part in building a community among people of different linguistic backgrounds. Machine translation dramatically helps in removing the language barrier and allows easier communication among linguistically diverse communities. Due to the unavailability of resources, m...
Article
Detecting entailment relationship between two sentences has profoundly impacted several different application areas of Natural Language Processing (NLP). Though recognizing textual entailment (TE) is amongst the widely studied problems, the research on detecting entailment between pieces of scientific texts is still in its infancy. To this end the...
Article
Full-text available
Sentiment analysis is a classification task where polarity of textual data is identified, i.e. to analyze whether a sentence or document expresses a negative, positive or neutral sentiment. Manipuri is a less privileged, highly agglutinative and tonal language. Despite being a scheduled language of Indian Constitution, it is also a resource constra...
Article
Full-text available
Parallel corpora are central to translation studies and contrastive linguistics. However, training machine translation (MT) systems by barely using the semantic aspects of a parallel corpus leads to unsatisfactory results, as then the trained MT systems are likely to generate target sentences that are semantically and pragmatically different from t...
Article
Full-text available
In recent times, research activity on image caption generation has attracted several researchers. The present work attempts to address the problem of Hindi image caption generation using the Hindi Visual Genome dataset. Hindi is the official and most spoken language in India. In a linguistically diverse country like India, it is essential to provide a m...
Conference Paper
Full-text available
The neural machine translation approach has gained popularity in machine translation because of its context analysing ability and its handling of long-term dependency issues. We have participated in the WMT21 shared task of similar language translation on a Tamil-Telugu pair with the team name: CNLP-NITS. In this task, we utilized monolingual data...
Conference Paper
Full-text available
In machine translation, corpus preparation is one of the crucial tasks, particularly for low resource pairs. In multilingual countries like India, machine translation plays a vital role in communication among people with various linguistic backgrounds. There are available online automatic translation systems by Google and Microsoft which include...
Conference Paper
Full-text available
Machine translation performs automatic translation from one natural language to another. Neural machine translation attains a state-of-the-art approach in machine translation, but it requires adequate training data, which is a severe problem for low-resource language pairs translation. The concept of multimodal is introduced in neural machine trans...
Article
Retrieval of mathematical information from scientific documents is one of the crucial tasks. Numerous Mathematical Information Retrieval (MIR) systems have been developed, which mainly focus on improving the indexing and searching mechanisms, but the poor results obtained for evaluation measures reveal major limitations of such systems. T...
Chapter
Dadure, Pankaj; Pakray, Partha; Bandyopadhyay, Sivaji. Mathematical information retrieval is at a comparatively early stage of research, lying at the interaction of a text-based retrieval approach and mathematical knowledge management. A wide variety of approaches have been developed to cope with the challenge of formula representation, indexin...
Chapter
Full-text available
Laskar, Sahinur Rahman; Pakray, Partha; Bandyopadhyay, Sivaji. Neural machine translation (NMT) attracts attention to the machine translation (MT) community because of its potential to persist sequential data over variable lengths of input and output sentences. With the attention mechanism, the NMT system achieves state-of-the-art technique which allows...
Chapter
Singh, Alok; Meetei, Loitongbam Sanayai; Singh, Thoudam Doren; Bandyopadhyay, Sivaji. Automatic image caption generation with proper fluency and expressiveness is an emerging area of research. A lot of research has been done on image caption generation for English, but very little work has been done in the area of generating and evaluating captions in Hin...
Article
Full-text available
The command line has always been the most efficient method to interact with UNIX flavor based systems while offering a great deal of flexibility and efficiency as preferred by professionals. Such a system is based on manually inputting commands to instruct the computing machine to carry out tasks as desired. This human-computer interface is quite t...
Chapter
Full-text available
Laskar, Sahinur Rahman; Pakray, Partha; Bandyopadhyay, Sivaji. Neural machine translation (NMT) is a state-of-the-art technique in the task of machine translation (MT), where a source-language text is converted into a target language text while preserving its meaning. NMT attracts attention because it handles sequence to sequence learning problems for va...
Chapter
Mahata, Kumar; Chandra, Amrita. In the current work, we explore the enrichment in the machine translation output when the training parallel corpus is augmented with the introduction of sentiment analysis. The paper discusses the preparation of the same sentiment tagged English-Bengali parallel corpus. The preparation of raw parallel corpus, sentiment a...
Article
Full-text available
In today’s world where an individual is becoming more and more busy and independent, the use of recommendation-based systems is steadily increasing. Thus, making professional knowledge available to the common man in a short span is quite necessary. The aim of our recipe recommendation system is to recommend recipes to users based on their questions. T...
Chapter
The continuous growth in the development of interactive technologies has lighted up the game-based learning applications. The game-based learning applications motivate the students to enhance their knowledge and improve the overall student learning experience. Learning with fun and entertainment is the prime aspect of any interactive platform. The...
Chapter
Mathematical formulas are widely used to express ideas and fundamental principles of science, technology, engineering, and mathematics. The rapidly growing research in science and engineering leads to a generation of a huge number of scientific documents which contain both textual as well as mathematical terms. In a scientific document, the sense o...
Preprint
Full-text available
Neural machine translation (NMT) is a widely accepted approach in the machine translation (MT) community, translating from one natural language to another natural language. Although NMT shows remarkable performance in both high and low resource languages, it needs sufficient training corpus. The availability of a parallel corpus in low resource l...
Conference Paper
Full-text available
Neural machine translation (NMT) is a widely accepted approach in the machine translation (MT) community, translating from one natural language to another natural language. Although NMT shows remarkable performance in both high and low resource languages, it needs sufficient training corpus. The availability of a parallel corpus in low resource l...
Conference Paper
Full-text available
Machine translation (MT) focuses on the automatic translation of text from one natural language to another natural language. Neural machine translation (NMT) achieves state-of-the-art results in the task of machine translation because it utilizes advanced deep learning techniques and handles issues like long-term dependency and context analysis....
Conference Paper
Full-text available
Corpus preparation is one of the important and challenging tasks in the domain of machine translation, especially in low resource language scenarios. In a country like India, where multiple languages exist, machine translation attempts to minimize the communication gap among people with different linguistic backgrounds. Although Google Translation covers...
Preprint
Full-text available
Corpus preparation is one of the important and challenging tasks in the domain of machine translation, especially in low resource language scenarios. In a country like India, where multiple languages exist, machine translation attempts to minimize the communication gap among people with different linguistic backgrounds. Although Google Translation covers...
Preprint
Full-text available
Video description involves the generation of the natural language description of actions, events, and objects in the video. There are various applications of video description by filling the gap between languages and vision for visually impaired people, generating automatic title suggestion based on content, browsing of the video based on the conte...
Chapter
With the increasing amount of digital data, it is becoming increasingly hard to extract useful information from text data, especially for resource-constrained languages. In this work, we report the task of language-independent automatic extraction of locations from news articles using domain knowledge. The work is tested on four languages namely, E...
Preprint
Full-text available
Machine translation (MT) is a vital tool for aiding communication between linguistically separate groups of people. Neural machine translation (NMT) based approaches have gained widespread acceptance because of their outstanding performance. We have participated in the WMT20 shared task of similar language translation on the Hindi-Marathi pair. The main...
Conference Paper
Full-text available
Machine translation (MT) is a vital tool for aiding communication between linguistically separate groups of people. Neural machine translation (NMT) based approaches have gained widespread acceptance because of their outstanding performance. We have participated in the WMT20 shared task of similar language translation on the Hindi-Marathi pair. The main...
Preprint
Full-text available
Sentiment analysis has been an active area of research in the past two decades and recently, with the advent of social media, there has been an increasing demand for sentiment analysis on social media texts. Since the social media texts are not in one language and are largely code-mixed in nature, the traditional sentiment classification models fai...
Conference Paper
Full-text available
The continuously increasing research in the field of science, engineering, and technology has generated textual and mathematical data in huge amounts. The research in retrieval and searching of textual data achieved the state-of-the-art results while searching and retrieval of mathematical information is in the early stage of research and requires...
Preprint
Full-text available
Question classification (QC) is a prime constituent of an automated question answering system. The work presented here demonstrates that the combination of multiple models achieves better classification performance than that obtained with existing individual models for the question classification task in Bengali. We have exploited state-of-the-art mul...
Conference Paper
Code-mixed texts are widespread nowadays due to the advent of social media. Since these texts combine two languages to formulate a sentence, it gives rise to various research problems related to Natural Language Processing. In this paper, we try to excavate one such problem, namely, Parts of Speech tagging of code-mixed texts. We have built a syste...
Conference Paper
In the current work, we explore the enrichment in the machine translation output when the training parallel corpus is augmented with the introduction of sentiment analysis. The paper discusses the preparation of the same sentiment tagged English-Bengali parallel corpus. The preparation of raw parallel corpus, sentiment analysis of the sentences and...
Preprint
In the current work, we explore the enrichment in the machine translation output when the training parallel corpus is augmented with the introduction of sentiment analysis. The paper discusses the preparation of the same sentiment tagged English-Bengali parallel corpus. The preparation of raw parallel corpus, sentiment analysis of the sentences and...
Preprint
Full-text available
Code-mixed texts are widespread nowadays due to the advent of social media. Since these texts combine two languages to formulate a sentence, it gives rise to various research problems related to Natural Language Processing. In this paper, we try to excavate one such problem, namely, Parts of Speech tagging of code-mixed texts. We have built a syste...
Conference Paper
Full-text available
A chatbot is a software application aimed at simulating real-time conversations. This system has been designed to address a plethora of domains where they have proved themselves worthy to complement or in some areas replace human-based information acquisition. Though some domains like travel and food have advanced with the growing consumer demand,...
Article
Full-text available
In an automated Question Answering (QA) system, Question Classification (QC) is an essential module. The aim of QC is to identify the type of questions and classify them based on the expected answer type. Although the machine-learning approach overcomes the limitation of rules as is the case with the conventional rule-based approach but is restrict...
Preprint
Full-text available
Despite being an open-source operating system pioneered in the early 90s, UNIX based platforms have not been able to garner an overwhelming reception from amateur end users. One of the rationales for under popularity of UNIX based systems is the steep learning curve corresponding to them due to extensive use of command line interface instead of usu...
Presentation
Full-text available
System description of the framework used for VATEX-2020 video captioning challenge
Preprint
Full-text available
Video captioning is the process of summarising the content, events and actions of a video into a short textual form, which can be helpful in many research areas such as video-guided machine translation, video sentiment analysis and providing aid to needy individuals. In this paper, a system description of the framework used for VATEX-2020 video captionin...
Article
Full-text available
India is a nation of geographical and cultural diversity where over 1600 dialects are spoken by the people. With the technological advancement, penetration of the internet and cheaper access to mobile data, India has recently seen a sudden growth of internet users. These Indian internet users generate contents either in English or in other vernacul...
Preprint
Full-text available
The use of multilingualism in the new generation is widespread in the form of code-mixed data on social media, and therefore a robust translation system is required for catering to the monolingual users, as well as for easier comprehension by language processing models. In this work, we present a translation framework that uses a translation-transl...
Preprint
Full-text available
With the widespread use of Machine Translation (MT) techniques, attempts are made to minimize the communication gap among people from diverse linguistic backgrounds. We have participated in the Workshop on Asian Translation 2019 (WAT2019) multi-modal translation task. There are three types of submission track, namely multi-modal translation, Hindi-only image caption...
Chapter
Full-text available
The content inside an image is exceptionally compelling. As such, text within an image can be of special interest and compared to other semantic contents, it tends to be effectively extracted. Text detection within an image is the task of detecting and localizing the portion of an image that contains the text information. Manipuri and Mizo are resp...
Preprint
Full-text available
With the extensive use of Machine Translation (MT) technology, there is progressively more interest in directly translating between pairs of similar languages. The main challenge is to overcome the limitation of available parallel data to produce a precise MT output. Current work relies on the Neural Machine Translation (NMT) with attention mech...
Conference Paper
Full-text available
In the current work, we present a description of the system submitted to WMT 2019 News Translation Shared task. The system was created to translate news text from Lithuanian to English. To accomplish the given task, our system used a Word Embedding based Neural Machine Translation model to post edit the outputs generated by a Statistical Machine T...