Dipankar Das

About

Publications: 143
Reads: 104,161
Citations: 1,952
Citations since 2016: 1,508 (80 research items)

Publications (143)
Article
Full-text available
Training translation systems on complex and compound sentences is generally considered computationally hard, and such systems fail to process the large amount of syntactic information these sentences carry. This, in turn, affects the overall quality of translations. On the other hand, simple sentences are shorter by nature and produce...
Chapter
Building an automatic system to solve math word problems is an interesting challenge in the AI domain. Over the last four decades, math word problems have attracted growing attention from researchers. An automatic math word problem solver can increase the effectiveness of e-learning systems, which are widely used today. Many studies have claimed promising results, but most of...
Preprint
Full-text available
Identifying argument components from unstructured texts and predicting the relationships expressed among them are two primary steps of argument mining. The intrinsic complexity of these tasks demands powerful learning models. While pretrained Transformer-based Language Models (LM) have been shown to provide state-of-the-art results over different N...
Article
The behavior of information cascades (such as retweets) has been modeled extensively. While point process-based generative models have long been in use for estimating cascade growths, deep learning has greatly enhanced the integration of diverse features and signals. We observe two significant temporal signals in cascade data that have not been rep...
Article
Detecting entailment relationship between two sentences has profoundly impacted several different application areas of Natural Language Processing (NLP). Though recognizing textual entailment (TE) is amongst the widely studied problems, the research on detecting entailment between pieces of scientific texts is still in its infancy. To this end the...
Article
Full-text available
Parallel corpora are central to translation studies and contrastive linguistics. However, training machine translation (MT) systems by barely using the semantic aspects of a parallel corpus leads to unsatisfactory results, as then the trained MT systems are likely to generate target sentences that are semantically and pragmatically different from t...
Preprint
Research on adversarial attacks has become widely popular in recent years. One unexplored area where prior research is lacking is the effect of adversarial attacks on code-mixed data. Therefore, in the present work, we have explained the first generalized framework on text perturbation to attack code-mixed classification models in a b...
Conference Paper
Full-text available
The ongoing pandemic has opened a Pandora’s box of hidden problems that society has been concealing for years. But the positive side to the present scenario is the opening up of opportunities to solve these problems on the global stage. One such area which was being flooded with all kinds of different emotions, and reaction from...
Preprint
Full-text available
The behaviour of information cascades (such as retweets) has been modelled extensively. While point process-based generative models have long been in use for estimating cascade growths, deep learning has greatly enhanced diverse feature integration. We observe two significant temporal signals in cascade data that have not been emphasized or reporte...
Chapter
In the current work, we explore the enrichment in the machine translation output when the training parallel corpus is augmented with the introduction of sentiment analysis. The paper discusses the preparation of the same sentiment tagged English-Bengali parallel corpus. The preparation of raw parallel corpus, sentiment a...
Article
In this paper, we propose a novel approach to improve the performance of multiple-choice question answering (MCQA) systems using distributed semantic similarity and a classification approach. We mainly focus on science-based MCQs, which are particularly difficult to handle. Our proposed method is based on the hypothesis that the relation between question and...
Article
Full-text available
Question answering (QA), one of the important applications of Natural Language Processing (NLP), aims to take user questions and return answers to the user. An open-domain QA system deals with a set of questions that can be of any domain. The other type of QA is closed-domain, where it deals with the questions under a specific domain e....
Chapter
Reading habits play an important role when students and children learn different subjects to enhance their knowledge. Reading difficulties often arise when students are non-native English learners or suffer from dyslexia. Thus, in the present work, we have built a hybrid sequential model for text simplification (i...
Article
Full-text available
In today’s world where an individual is becoming more and more busy and independent, the use of recommendation-based systems is steadily increasing. Thus, making professional knowledge available to the common man in a short span of time is quite necessary. The aim of our recipe recommendation system is to recommend recipes to users based on their questions. T...
Preprint
Full-text available
Sentiment analysis has been an active area of research in the past two decades and recently, with the advent of social media, there has been an increasing demand for sentiment analysis on social media texts. Since the social media texts are not in one language and are largely code-mixed in nature, the traditional sentiment classification models fai...
Chapter
Question Answering (QA) is an emerging domain of research that retrieves a textual segment from a set of documents in response to users’ queries. Recommending answers to cooking-recipe-related questions is still at an early stage of research and requires significant refinement. In this paper, we have developed a question answering...
Chapter
Question Answering is a very actively developing field of Natural Language Processing. In this field, the most active research work is currently going on with relation to the Question Classification module of a Question Answering System. It plays a very important role in determining the expectations of the user. The aim of question classification i...
Conference Paper
Code-mixed texts are widespread nowadays due to the advent of social media. Since these texts combine two languages to formulate a sentence, they give rise to various research problems related to Natural Language Processing. In this paper, we try to examine one such problem, namely, Part-of-Speech tagging of code-mixed texts. We have built a syste...
Conference Paper
In the current work, we explore the enrichment in the machine translation output when the training parallel corpus is augmented with the introduction of sentiment analysis. The paper discusses the preparation of the same sentiment tagged English-Bengali parallel corpus. The preparation of raw parallel corpus, sentiment analysis of the sentences and...
Preprint
In the current work, we explore the enrichment in the machine translation output when the training parallel corpus is augmented with the introduction of sentiment analysis. The paper discusses the preparation of the same sentiment tagged English-Bengali parallel corpus. The preparation of raw parallel corpus, sentiment analysis of the sentences and...
Preprint
Full-text available
Code-mixed texts are widespread nowadays due to the advent of social media. Since these texts combine two languages to formulate a sentence, they give rise to various research problems related to Natural Language Processing. In this paper, we try to examine one such problem, namely, Part-of-Speech tagging of code-mixed texts. We have built a syste...
Article
Full-text available
In an automated Question Answering (QA) system, Question Classification (QC) is an essential module. The aim of QC is to identify the type of questions and classify them based on the expected answer type. Although the machine-learning approach overcomes the limitations of rules found in the conventional rule-based approach, it is restrict...
Preprint
Full-text available
The use of multilingualism in the new generation is widespread in the form of code-mixed data on social media, and therefore a robust translation system is required for catering to the monolingual users, as well as for easier comprehension by language processing models. In this work, we present a translation framework that uses a translation-transl...
Conference Paper
Full-text available
Over the last decade, online forums have become primary news sources for readers around the globe, and social media platforms are the space where these news forums find most of their audience and engagement. Our particular focus in this paper is to study conflict dynamics over online news articles in Reddit, one of the most popular online discussio...
Preprint
Full-text available
Clustering is an unsupervised learning problem in the domain of machine learning and data science, where information about data instances may or may not be given. The K-Means algorithm is one such clustering algorithm, and its use is widespread. At the same time, however, K-Means suffers from a few disadvantages such as low accuracy and high number of...
Conference Paper
Full-text available
In the current work, we present a description of the system submitted to WMT 2019 News Translation Shared task. The system was created to translate news text from Lithuanian to English. To accomplish the given task, our system used a Word Embedding based Neural Machine Translation model to post edit the outputs generated by a Statistical Machine T...
Preprint
Full-text available
Online discussions are valuable resources to study user behaviour on a diverse set of topics. Unlike previous studies which model a discussion in a static manner, in the present study, we model it as a time-varying process and solve two inter-related problems -- predict which user groups will get engaged with an ongoing discussion, and forecast the...
Preprint
In the current work, we present a description of the system submitted to WMT 2018 News Translation Shared task. The system was created to translate news text from Finnish to English. The system used a Character Based Neural Machine Translation model to accomplish the given task. The current paper documents the preprocessing steps, the description o...
Preprint
Full-text available
In the current work, we present a description of the system submitted to WMT 2019 News Translation Shared task. The system was created to translate news text from Lithuanian to English. To accomplish the given task, our system used a Word Embedding based Neural Machine Translation model to post edit the outputs generated by a Statistical Machine Tr...
Preprint
Full-text available
Persuasion and argumentation are possibly among the most complex examples of the interplay between multiple human subjects. With the advent of the Internet, online forums provide wide platforms for people to share their opinions and reasonings around various diverse topics. In this work, we attempt to model persuasive interaction between users on R...
Preprint
Full-text available
Persuasion and argumentation are possibly among the most complex examples of the interplay between multiple human subjects. With the advent of the Internet, online forums provide wide platforms for people to share their opinions and reasonings around various diverse topics. In this work, we attempt to model persuasive interaction between users on...
Article
Persuasion and argumentation are possibly among the most complex examples of the interplay between multiple human subjects. With the advent of the Internet, online forums provide wide platforms for people to share their opinions and reasonings around various diverse topics. In this work, we attempt to model persuasive interaction between users on R...
Chapter
Full-text available
Over the last two decades, social media has emerged as almost an alternate world where people communicate with each other and express opinions about almost anything. This makes platforms like Facebook, Reddit, Twitter, Myspace, etc., a rich bank of heterogeneous data, primarily expressed via text but reflecting all textual and non-textual data that...
Preprint
Full-text available
In the present article, we identified the qualitative differences between Statistical Machine Translation (SMT) and Neural Machine Translation (NMT) outputs. We have tried to answer two important questions: 1. Does NMT perform equivalently well with respect to SMT and 2. Does it add extra flavor in improving the quality of MT output by employing si...
Conference Paper
In the current work, we present a description of the system submitted to WMT 2018 News Translation Shared task. The system was created to translate news text from Finnish to English. The system used a Character Based Neural Machine Translation model to accomplish the given task. The current paper documents the preprocessing steps, the description o...
Conference Paper
Full-text available
Author profiling is gaining the interest of people in both academia and outside it. Author profiling/analysis deals with the identification of author information from text based on stylistic choices. It helps in identifying author related information such as gender, age, native language, personality, demographics, etc. Thus, author profiling is bot...
Preprint
Full-text available
We propose a novel attention based hierarchical LSTM model to classify discourse act sequences in social media conversations, aimed at mining data from online discussion using textual meanings beyond sentence level. The very uniqueness of the task is the complete categorization of possible pragmatic roles in informal textual discussions, contrary t...
Article
Full-text available
In healthcare services, information extraction is the key to understanding any corpus-based knowledge. The process becomes laborious when the annotation is done manually, given the large number of text corpora available. Hence, future automated extraction systems will be essential for groups of experts such as doctors and medical practitioners as...
Article
Full-text available
Machine translation (MT) is the automatic translation of the source language to its target language by a computer system. In the current paper, we propose an approach of using recurrent neural networks (RNNs) over traditional statistical MT (SMT). We compare the performance of the phrase table of SMT to the performance of the proposed RNN and in tu...
Chapter
Full-text available
A commendable amount of work has been done in the field of Sentiment Analysis or Opinion Mining from natural language texts and Twitter texts. One of the main goals in such tasks is to assign polarities (positive or negative) to a piece of text. But, at the same time, one of the important as well as difficult issues is how to assign the degree o...
Preprint
Full-text available
Sentiment analysis is essential in many real-world applications such as stance detection, review analysis, recommendation systems, and so on. Sentiment analysis becomes more difficult when the data is noisy and collected from social media. India is a multilingual country; people use more than one language to communicate among themselves. The switc...
Article
Music mood classification is one of the most interesting research areas in music information retrieval, and it has many real-world applications. Many experiments have been performed in mood classification or emotion recognition of Western music; however, research on mood classification of Indian music is still at an initial stage due to the scarcity of di...
Conference Paper
Full-text available
In this paper, we describe a deep learning framework for analyzing the customer feedback as part of our participation in the shared task on Customer Feedback Analysis at the 8th International Joint Conference on Natural Language Processing (IJCNLP 2017). A Convolutional Neural Network (CNN) based deep neural network model was employed for the custo...
Conference Paper
Full-text available
The present task describes the participation of the JU NITM team in IJCNLP2017 Shared Task 5: “Multi-choice Question Answering in Examinations”. One of the main aims of this shared task is to choose the correct option for each of the multi-choice questions. We represent each of the questions and its corresponding answer in vector space and find the...
Conference Paper
Full-text available
Presently, millions of music tracks are available on the web. In this context, a music recommender system can be helpful to filter and organize the music tracks according to the needs of users. To develop a recommendation system, we need an enormous amount of data with the user preference information. However, there is a scarcity of such datasets for...
Conference Paper
We present an approach to develop a Question Answering (QA) system over cooking recipes that makes use of Cooking Ontology management. QA systems are designed to satisfy the user’s specific information need whereas ontology is the conceptualization of knowledge and it exhibits the hierarchical structure. The system is an Information retrieval (IR)...
Conference Paper
Full-text available
Social media is a rich source of human-human interactions on an exhaustive number of topics. Although dialogue modeling from human-human interactions is not new, to the best of our knowledge there is no previous work attempting to model dialogues from social media data. This paper implements and compares multiple supervised and unsupervised approaches...
Article
Full-text available
Whenever human beings interact with each other, they exchange or express opinions, emotions, and sentiments. These opinions can be expressed in text, speech or images. Analysis of these sentiments is one of the popular research areas of present-day researchers. Sentiment analysis, also known as opinion mining, tries to identify or classify these sen...
Article
Full-text available
Sentiment analysis is the Natural Language Processing (NLP) task dealing with the detection and classification of sentiments in texts. While some tasks deal with identifying the presence of sentiment in the text (Subjectivity analysis), other tasks aim at determining the polarity of the text categorizing them as positive, negative and neutral. When...
Article
Full-text available
An evaluation metric is an absolute necessity for measuring the performance of any system and complexity of any data. In this paper, we have discussed how to determine the level of complexity of code-mixed social media texts that are growing rapidly due to multilingual interference. In general, texts written in multiple languages are often hard to...
Article
Full-text available
Digitization has created a wide platform for music, in the form of televisions, desktops and other handheld devices. This has increased the reach of musical content as well as its impact on people. Music is often associated with distinct emotional content, generally referred to as music mood. Literature focusing on analyzing the content of a mus...
Article
Full-text available
Effective retrieval of mathematical content from a vast corpus of scientific documents demands enhancement in the conventional indexing and searching mechanisms. The indexing mechanism and the choice of semantic similarity measures guide the results of Math Information Retrieval systems (MathIRs) to perfection. Tokenization and formula unification are am...