Pushpak Bhattacharyya

Indian Institute of Technology Bombay | IIT Bombay ·  Department of Computer Science & Engineering

About

323
Publications
59,668
Reads
4,917
Citations

Publications (323)
Article
Full-text available
In the era of social media, the use of emojis and code-mixed language has become essential in online communication. However, selecting the appropriate emoji that matches a particular sentiment or emotion in the code-mixed text can be difficult. This paper presents a novel task of predicting multiple emojis in English-Hindi code-mixed sentences and...
Article
Image captioning frameworks usually employ an encoder-decoder paradigm, with the encoder receiving abstract image feature vectors as input and decoder for language modeling. Nowadays, most prominent architectures employ features from region proposals derived from object detection modules. In this work, we propose a novel architecture for image capt...
Article
Full-text available
Emotion classification along with sentiment analysis in dialogues is a complex task that has recently attained immense popularity. When communicating their thoughts and feelings, humans are prone to having many emotions of varying intensities. The task is complicated and fascinating since emotions in a dialogue utterance can be independent or ba...
Chapter
Cyber forensics, personalized services, and recommender systems require the development of automatic personality prediction systems. The current paper develops a multi-modal personality prediction system from videos, considering three different modalities: text, audio, and video. The emotional state of a user helps in revealing the personality...
Article
Full-text available
Moderators often face a double challenge regarding reducing offensive and harmful content in social media. Despite the need to prevent the free circulation of such content, strict censorship on social media cannot be implemented due to a tricky dilemma - preserving free speech on the Internet while limiting them and how not to overreact. Existing s...
Chapter
Despite significant evidence linking mental health to almost every major development issue, individuals with mental disorders are among those most at risk of being excluded from development programs. We outline a novel task of detection of Cognitive Distortion and Emotion Cause extraction of associated emotions in conversations. Cognitive distortio...
Article
Detecting suicidal tendencies and preventing suicides is an important social goal. The rise and continuance of emotion, the emotion category, and the intensity of the emotion are important clues about suicidal tendencies. The three determinants of emotion, viz. Valence, Arousal, and Dominance (VAD) can help determine a person’s exact emotion(s) and...
Chapter
News Title (NT) and News Body (NB) consistency detection is a demanding problem in Fake News Detection. In this paper, we formulate consistency detection between NT and NB from the perspective of Textual Entailment (TE), and propose various deep learning based methods for solving this problem. Inconsistency between NT and NB can affect the purpose...
Chapter
Expressing the polarity of sentiment as ‘positive’ and ‘negative’ usually has limited scope compared with the intensity/degree of polarity. These two tasks (i.e. sentiment classification and sentiment intensity prediction) are closely related and may offer assistance to each other during the learning process. In this paper, we propose to leverage...
Chapter
Due to the ever-changing nature of the human language and the variations in writing style, age-old texts in one language may be incomprehensible to a modern reader. In order to make these texts familiar to the modern reader, we need to rewrite them manually. But this is not always feasible if the volume of texts is very large. In this paper, we pre...
Chapter
Information extraction in the disaster domain is a critical task for effective disaster management. A high-quality event detection system is the very first step towards this. Since disaster-annotated datasets are not available in Indian languages, we first create and annotate a dataset in three different languages, namely Hindi, Bengali and English. T...
Chapter
Recently, neural machine translation (NMT) has become highly successful, achieving state-of-the-art results on many resource-rich language pairs. However, it fails when a domain and/or language pair lacks a sufficiently large amount of parallel corpora. In this paper, we propose an effective method for NMT under a low-resource scenari...
Chapter
We discuss the importance of a domain-specific language model in a statistical machine translation system. Neither the structures nor the phrase selection is the same across different domains. So, a language model trained with general-domain data or other-domain data cannot provide better accuracy. Moreover, there may be some specific focus in diff...
Article
Full-text available
Effective dialogue generation for task completion is challenging to build. The task requires the response generation system to generate the responses consistent with intent and slot values, have diversity in response and be able to handle multiple domains. The response also needs to be context relevant with respect to the previous utterances in the...
Chapter
Sentiment research encompasses all aspects of identifying, interpreting, and evaluating people's attitudes toward different events, problems, administrations, or other topics. Sentiment analysis is a more extensive portrayal of an examination field, which considers emotional processing applied to text-based investigation. In this sense, it addition...
Article
As the number of non-native English speakers on social media has skyrocketed in recent years, sentiment and emotion analysis on regional languages and code-mixed data has gained traction. Despite extensive research on English, the area of Hindi-English code-mixed texts is still relatively new and understudied. We create an emotion annotated Hindi-E...
Preprint
With the rise of online hate speech, automatic detection of Hate Speech, Offensive texts as a natural language processing task is getting popular. However, very little research has been done to detect unintended social bias from these toxic language datasets. This paper introduces a new dataset ToxicBias curated from the existing dataset of Kaggle...
Conference Paper
Full-text available
The World Health Organization has emphasised the need to step up suicide prevention efforts to meet the United Nations’ Sustainable Development Goal target of 2030 (Goal 3: Good health and well-being). We address the challenging task of personality subtyping from suicide notes. Most research on personality subtyping has relied on statistical an...
Conference Paper
Full-text available
Mental health is a critical component of the United Nations’ Sustainable Development Goals (SDGs), particularly Goal 3, which aims to provide “good health and well-being”. The present mental health treatment gap is exacerbated by stigma, lack of human resources, and lack of research capability for implementation and policy reform. We present and di...
Article
Full-text available
Machine Reading Comprehension (MRC) of a document is a challenging problem that requires discourse-level understanding. Information extraction from scholarly articles nowadays is a critical use case for researchers to understand the underlying research quickly and move forward, especially in this age of infodemic. MRC on research articles can also...
Article
Sarcasm is a case of implicit emotion and needs additional information like context and multimodality for better detection. But sometimes, this additional information also fails to help in sarcasm detection. For example, the utterance “Oh yes, you’ve been so helpful. Thank you so much for all your help”, said in a polite tone with a smiling face, c...
Article
Dense image captioning is a task that requires generating localized captions in natural language for multiple regions of an image. This task leverages its functionalities from both computer vision for recognizing regions in an image and natural language processing for generating captions. Numerous works have been carried out on dense image captioni...
Article
Deep learning has become the most prominent approach to solving various Natural Language Processing (NLP) tasks, including sentiment analysis. However, these techniques require a considerably large annotated corpus, which is not easy to obtain for most languages, especially in a low-resource setting. In this paper, we propose a de...
Article
Full-text available
The rising usage of social media has motivated the invention of different methods of anonymous writing, which leads to an increase in malicious and suspicious activities. This anonymity has created difficulty in finding the suspect. Author profiling deals with characterization of an author through some key attributes such as gender, age, language, dial...
Article
Contextualization: In recent years, the popularity of virtual agents, particularly task-oriented dialogue agents, has increased immensely due to their effectiveness and simplicity in various domains such as industry, e-commerce, and health. Problem: In the real world, users do not always have a predefined and immutable goal, i.e., they may upgrade/downgra...
Article
Image captioning refers to the process of generating a textual description that describes objects and activities present in a given image. It connects two fields of artificial intelligence, computer vision, and natural language processing. Computer vision and natural language processing deal with image understanding and language modeling, respectiv...
Preprint
Full-text available
Movies reflect society and also hold power to transform opinions. Social biases and stereotypes present in movies can cause extensive damage due to their reach. These biases are not always found to be the need of storyline but can creep in as the author's bias. Movie production houses would prefer to ascertain that the bias present in a script is t...
Preprint
Full-text available
The long-standing goal of Artificial Intelligence (AI) has been to create human-like conversational systems. Such systems should have the ability to develop an emotional connection with the users, hence emotion recognition in dialogues is an important task. Emotion detection in dialogues is a challenging task because humans usually convey multiple...
Conference Paper
Full-text available
Computational comprehension and identifying emotional components in language have been critical in enhancing human-computer connection in recent years. The WASSA 2022 Shared Task introduced four tracks and released a dataset of news stories: Track-1 for Empathy and Distress Prediction, Track-2 for Emotion classification, Track-3 for Personality pre...
Preprint
Full-text available
The World Health Organization (WHO) has emphasized the importance of significantly accelerating suicide prevention efforts to fulfill the United Nations' Sustainable Development Goal (SDG) objective of 2030. In this paper, we present an end-to-end multitask system to address a novel task of detection of two interpersonal risk factors of suicide, Pe...
Conference Paper
Full-text available
The World Health Organization (WHO) has emphasized the importance of significantly accelerating suicide prevention efforts to fulfill the United Nations' Sustainable Development Goal (SDG) objective of 2030. In this paper, we present an end-to-end multitask system to address a novel task of detection of two interpersonal risk factors of suicide, Pe...
Conference Paper
Inspired by recent advances in emotion-cause extraction in texts and its potential in research on computational studies in suicide motives and tendencies and mental health, we address the problem of cause identification and cause extraction for emotion in suicide notes. We introduce an emotion-cause annotated suicide corpus of 5769 sentences by lab...
Article
Thanks to the digital age, online speech and information may now be disseminated anonymously without regard for repercussions. Regulators face a unique problem with social media platforms because of the speed and volume of material and the lack of editorial supervision. The existing datasets on hate speech or offensive language identification lack...
Article
Full-text available
With the upsurge in suicide rates worldwide, timely identification of the at-risk individuals using computational methods has been a severe challenge. Anyone presenting with suicidal thoughts, mainly recurring and containing a deep desire to die, requires urgent and ongoing psychiatric treatment. This work focuses on investigating the role of tempo...
Article
Disease diagnosis is an essential and critical step in any disease treatment process. Automatic diagnostic testing has gained popularity in recent years due to its scalability, rationality, and efficacy. The major challenges for the diagnosis agent are inevitably large action space (symptoms) and varieties of diseases, which demand either rich doma...
Article
Full-text available
Temporal orientation is an important aspect of human cognition which shows how an individual emphasizes past, present, and future. Theoretical research in psychology shows that one’s emotional state can influence his/her temporal orientation. We hypothesize that measuring human temporal orientation can benefit from concurrent learning of emotion. T...
Article
Full-text available
Creation of task-oriented dialog/virtual agent (VA) capable of managing complex domain-specific user queries pertaining to multiple intents is difficult since the agent must deal with several subtasks simultaneously. Most end-to-end dialogue systems, however, only provide user semantics as inputs from texts into the learning process and neglect oth...
Article
Full-text available
The exponential growth in the number of scientific articles has made it difficult for the researchers to keep themselves updated with the new developments. Scientific document summarization solves this problem by providing a summary of essential contributions. In this paper, we have presented a novel method of scientific document summarization usin...
Article
Full-text available
Social media platforms have become paramount for gathering relevant information during natural disasters. Twitter has emerged as a platform that is heavily used for communication during disaster events. Therefore, it becomes necessary to design a technique which can summarize the relevant tweets and thus, can help in...
Article
Mental health disorder continues to be a grievous concern plaguing humans worldwide. The scarcity of mental health professionals (MHPs) has driven novel efforts lately to combat mental illness by developing automated systems capable of assisting MHPs. However, lack of high-quality conversational data due to privacy concerns remains a bottleneck tow...
Article
Emojis or emoticons are not just a modern trend but have become an essential part of our day-to-day interactions. Predicting a suitable emoji for a given tweet is a challenging task because a wrong emoji prediction for a tweet can change the meaning of the message or can amplify the emotion of the message. This task is particularly challenging sinc...
Article
Gender plays a crucial role in improving the performance quality of personalized systems. Privacy and anonymity allow users to hide their details. Based on the intuition that post contents of male and female users differ, we can predict the gender of the social media account holder via their corresponding posts. These posts can be multimodal (text...
Article
Politeness enhances interactions by improving relations between the participants. If there is a display of rudeness, even the finest conversation can fall through. In addition, if lathered with kindness, even the most angst-prone circumstance can be expressed with far less suffering. Previously, researchers have focused upon including politeness in...
Article
The advent of the Internet is a boon to society. However, many of its banes cannot be undermined, cyberbullying being one of them. The emotional state and sentiment of a person have a significant influence on the intended content. The current work is the first attempt in investigating the role of sentiment and emotion information for identifying cy...
Article
The advent of internet technologies has created different ways of writing anonymously, which has led to criminal and malicious activities over social media platforms. Thus, the automatic authentication checking of the available contents is the need of the hour. Social media sites, such as Facebook, Twitter, and so on, are used heavily by...
Article
Full-text available
Linking event triggers with their respective arguments is an essential component for building an event extraction system. It is challenging to link event triggers with the corresponding arguments triggers when the sentence contains multiple events and arguments triggers. The task becomes even more challenging in a low-resource setup due to the unav...
Article
In the natural language processing community, open-domain conversational agents, also known as chatbots, are gaining popularity. One of the difficulties is getting them to communicate in an emotionally intelligent manner. To generate dialogues, current neural response generation methods depend solely on end-to-end learning from large scale conversa...
Article
Social chatbots have gained immense popularity, and their appeal lies in their capacity to respond to diverse requests, but also in their ability to develop an emotional connection with users. To develop and promote social chatbots, we need to concentrate on increasing user interaction and consider both the intellectual and emotional quotient in co...
Article
Full-text available
The quest for new information is an inborn human trait and has always been quintessential for human survival and progress. Novelty drives curiosity, which in turn drives innovation. In Natural Language Processing (NLP), Novelty Detection refers to finding text that has some new information to offer with respect to whatever is earlier seen or known....
Article
Full-text available
Neural machine translation (NMT) has emerged as a preferred alternative to the previous mainstream statistical machine translation (SMT) approaches largely due to its ability to produce better translations. The NMT training is often characterized as data hungry since a lot of training data, in the order of a few million parallel sentences, is gener...
Article
Building Virtual Agents capable of carrying out complex queries of the user involving multiple intents of a domain is quite a challenge, because it demands that the agent manages several subtasks simultaneously. This article presents a universal Deep Reinforcement Learning framework that can synthesize dialogue managers capable of working in a task...
Article
Full-text available
This paper proposes a hierarchical method for learning an efficient Dialogue Management (DM) strategy for task-oriented conversations serving multiple intents of a domain. Deep Reinforcement Learning (DRL) networks specializing in individual intents communicate with each other, having the capability of sharing overlapping information across intents...
Article
The behavior, mental-health, emotion, life choices, social nature, and thought patterns of an individual are revealed by personality. Cyber forensics, personalized services, recommender systems are some of the examples of automatic personality prediction. A deep learning based personality prediction system has been developed in this work. Facial an...
Article
Microblog summarization systems are gaining importance during natural disasters. A lot of tweets are posted along with multimedia content during the occurrence of any natural disaster event. Extracting relevant information/summary from these tweets is important for the smooth functioning of the rescue operation. Moreover, because of the limited siz...
Article
People suffering from stress and various mental health problems find it easier to express and share their feelings on online platforms, such as Twitter. However, the character limit imposed by Twitter (280 characters) and the infrequent online activity of a section of users pose a serious setback in using computational methods for mental health anal...
Preprint
Full-text available
During this pandemic situation, extracting any relevant information related to COVID-19 will be immensely beneficial to the community at large. In this paper, we present a very important resource, COVIDRead, a Stanford Question Answering Dataset (SQuAD)-like dataset of more than 100k question-answer pairs. The dataset consists of Context-Answer-Q...
Preprint
Full-text available
Humor recognition in conversations is a challenging task that has recently gained popularity due to its importance in dialogue understanding, including in multimodal settings (i.e., text, acoustics, and visual). The few existing datasets for humor are mostly in English. However, due to the tremendous growth in multilingual content, there is a great...
Conference Paper
Emotion analysis from texts has emerged as an important area of research in the field of Natural Language Processing (NLP) in the past few years. Several benchmark datasets have been released for this task; however, most of the datasets are open-domain in nature and for resource-rich languages like English. These datasets may not always be sufficient...
Article
In this paper, we propose an unsupervised approach for knowledge graph (KG) creation from conversational data. We make use of intent classification and slot-filling, the two important components of any dialogue agent, exploit their inter-connectedness, and finally construct a KG. We build a supervised intent classifier to extract the intent classes...
Conference Paper
Emotion Recognition in Conversation (ERC) is becoming increasingly popular due to the availability of an enormous amount of openly accessible conversational data. Moreover, it has potential applications in opinion mining, social media, and the health care domain. In this paper, we propose a novel Context and Knowledge Enriched Transformer...