Uma Shanker Tiwary

  • PhD; FIETE; SMIEEE
  • Professor (Full) at Indian Institute of Information Technology Allahabad

About

153 Publications
51,408 Reads
1,778 Citations
Introduction
Currently working on how to enhance human intelligence working with computational systems.
Current institution
Indian Institute of Information Technology Allahabad
Current position
  • Professor (Full)
Additional affiliations
July 2002 - present
Indian Institute of Information Technology Allahabad
Position
  • Professor (Full)
Description
  • Research interests: Speech, Image and Language Processing; Emotion and Mental State Recognition; Wavelet Transform
September 1984 - February 1988
Indian Institute of Technology BHU
Position
  • Senior Researcher

Publications (153)
Article
Full-text available
Our team, silp_nlp, participated in the LLMs4OL Challenge at ISWC 2024, engaging in all three tasks focused on ontology generation. The tasks include predicting the type of a given term, extracting a hierarchical taxonomy between two terms, and extracting non-taxonomy relations between two terms. To accomplish these tasks, we used machine learning...
Conference Paper
Full-text available
Emotion space models are frameworks that represent emotions in a multidimensional space, providing a structured way to understand and analyze the complex landscape of human emotions. However, the dimensional representation of emotions is still debatable. In this work, we are probing the higher dimensional space constituted by emotion labeling done...
Article
This study aimed to investigate the relationship between the gender and personality traits of individuals and their ability to detect, identify, and discriminate odors. The dataset on olfactory performance was compiled from a panel of 207 healthy human volunteers. The dataset for the odor identification test was collected using a set of 10 distinct...
Chapter
Full-text available
To our knowledge, there is no dialog system for the mental health-care domain in Hindi. This may be due to the unavailability of user utterance corpora in Hindi for this domain. In this paper, we propose a novel algorithmic approach for user utterance generation in Hindi by considering dialects, linguistic attributes, symptoms, frequency of symptoms, and...
Chapter
Visual content generation is an active area of research with considerable existing work, yet it still leaves room for enhancement. However, in this area, the domain of the Hindi language has remained unexplored, largely due to the scarcity of linguistic resources. As Hindi is the fourth most spoken language in the world, such work will be a con...
Chapter
The experience and expression of emotions are subject to cultural influences. In behavioural experiments, Western and Eastern cultures are shown to differ in emotional experience and expression. Western culture promotes the expression of emotional experience, whereas, in Eastern culture, emotional expressions are not very explicit and are sometimes res...
Preprint
Recently, the representation of emotions in the Valence, Arousal and Dominance (VAD) space has drawn considerable attention. However, the complex nature of emotions and the subjective biases in self-reported values of VAD make the emotion model too specific to a particular experiment. This study aims to develop a generic model representing emotions using...
Article
Full-text available
Named entities can be unpredictable, as with emerging and complex entities. Most large language model tokenizers have a fixed vocabulary; hence, they split out-of-vocabulary (OOV) words into multiple sub-words during tokenisation. During fine-tuning for any downstream task, these sub-words (tokens) make the named entity classification more complex sinc...
Conference Paper
Full-text available
In this paper, we fuse information from multiple brain regions. The idea is that the dynamic processing of an emotional video stimulus involves different brain regions, and hence fusing information from these regions can increase emotion recognition accuracy signi...
Chapter
Time-series forecasting emphasizes a specific future value prediction over a specific period. By utilizing the available time resources, forecasting assists in estimating significant values beyond time. Numerous real-world forecasting applications exist that aim to estimate prominent values over time by exploiting the current time resources. Foreca...
Article
Full-text available
We describe the creation of an affective film dataset for researchers interested in studying a spectrum of emotional experiences. We followed a two-stage process. In the first stage, two hundred twenty-two 60-s audio-visual clips were rated in the lab by 407 participants. Based on the selection criteria, 69 audio-visual clips wer...
Conference Paper
This report discusses the problem of denoising in image processing and the application of Generative Adversarial Networks (GANs) to address this challenge. GANs have demonstrated promising results in denoising tasks by learning to generate clean images from noisy ones through training on paired noisy and clean image datasets. Several variations of...
Conference Paper
Inter-subject or subject-independent emotion recognition has been a challenging task in affective computing. This work is about an easy-to-implement emotion recognition model that classifies emotions from EEG signals subject independently. It is based on the famous EEGNet architecture, which is used in EEG-related BCIs. We used the ‘Dataset on Emot...
Preprint
Full-text available
Inter-subject or subject-independent emotion recognition has been a challenging task in affective computing. This work is about an easy-to-implement emotion recognition model that classifies emotions from EEG signals subject independently. It is based on the famous EEGNet architecture, which is used in EEG-related BCIs. We used the Dataset on Emoti...
Article
Generating natural language description for visual content is a technique for describing the content available in the image(s). It requires knowledge of both the domains of computer vision and natural language processing. For this, various models with different approaches are suggested. One of them is encoder-decoder-based description generation. E...
Article
Full-text available
In this paper, we propose a novel method of evaluating text-to-speech systems named “Learning-Based Objective Evaluation” (LBOE), which utilises a set of selected low-level-descriptors (LLD) based features to assess the speech-quality of a TTS model. We have considered Unit selection speech synthesis (USS), Hidden Markov Model speech synthesis (HMM...
Article
Full-text available
Emotion recognition using EEG signals is an emerging area of research due to its broad applicability in Brain-Computer Interfaces. Emotional feelings are hard to stimulate in the lab. Emotions don’t last long, yet they need enough context to be perceived and felt. However, most EEG-related emotion databases either suffer from emotionally irrelevant...
Article
Full-text available
All fruits emit specific volatile organic compounds (VOCs) during their life cycle. The characteristics of these VOCs can be used to identify a fruit's ripening stage without destroying the fruit. In this study, an application-specific electronic nose device was designed for monitoring fruit ripeness. The propose...
Preprint
We describe the creation of an affective film dataset for researchers interested in studying a spectrum of emotional experiences. We followed a two-stage process. In the first stage, two hundred twenty-two 60-second video clips were rated in the lab by 407 participants. Based on the selection criteria, 69 audio-visual clips were...
Preprint
Full-text available
Emotion recognition using EEG signals is an emerging area of research due to its broad applicability in BCI. Emotional feelings are hard to stimulate in the lab. Emotions do not last long, yet they need enough context to be perceived and felt. However, most EEG-related emotion databases either suffer from emotionally irrelevant details (due to prol...
Preprint
Full-text available
Naturalistic affective stimuli are needed both for creating affective technological solutions and for making progress in affective science. Although a lot of progress in the collection of affective multimedia stimuli has been made in Western countries, the technology and findings based on such monocultural datasets may not be scalab...
Preprint
Full-text available
Understanding the dynamics of emotional experience is an old problem. However, a clear understanding of the mechanism of emotional experience is still far away. In the presented work, we tried to address this problem using a well-established method called microstate analysis using multichannel electroencephalography (EEG). We recorded the brain act...
Article
Full-text available
This paper presents an improved adversarial network for visual content generation from textual description. Synthesizing high-quality images from the textual description is the most challenging problem in Computer vision. Existing methods first generate the initial image sketch and then refine that to fine-grained details at different portions of t...
Article
Full-text available
While naturalistic stimuli, such as movies, better represent the complexity of the real world and are perhaps crucial to understanding the dynamics of emotion processing, there is limited research on emotions with naturalistic stimuli. There is a need to understand the temporal dynamics of emotion processing and their relationship to different dime...
Preprint
While naturalistic stimuli like movies better resemble the complexity of the real world and are perhaps crucial to understanding the dynamics of emotion processing, there is limited research on emotions with naturalistic stimuli. There is a need to understand the temporal dynamics of emotion processing and their relationship to different dimensions...
Article
Full-text available
Our brain continuously interacts with the body as we engage with the world. Although we are mostly unaware of internal bodily processes, such as our heartbeats, they may be influenced by and in turn influence our perception and emotional feelings. Although there is a recent focus on understanding cardiac interoceptive activity and interaction with...
Article
Full-text available
Dialogue policy is a crucial component in task-oriented Spoken Dialogue Systems (SDSs). As a decision function, it takes the current dialogue state as input and generates an appropriate system response. In this paper, we explore reinforcement learning approaches to solve this problem in an Indic language scenario. Recently, Deep Reinforcement Le...
Article
Full-text available
Perceptual computing (Per-C) is a branch of Computing with Words (CWW) that assists people in making subjective decisions. Its applications take linguistic inputs (i.e., words) from the user and return a linguistic output (i.e., a word). The perception of these linguistic inputs suffers from uncertainties, for which IT2FSs (Interval Type-2 Fuzzy Set...
Preprint
Full-text available
Our brain continuously interacts with the body as we engage with the world. Although we are mostly unaware of internal bodily processes, such as our heartbeats, they may be influenced by and in turn influence our perception and emotional feelings. While there is a recent focus on understanding cardiac interoceptive activity and interaction with bra...
Article
Full-text available
Hand signs are an effective form of human-to-human communication that has a number of possible applications. Being a natural means of interaction, they are commonly used for communication purposes by speech impaired people worldwide. In fact, about one percent of the Indian population belongs to this category. This is the key reason why it would ha...
Article
Full-text available
Features in a text are hierarchically structured and may not be optimally learned using one-step encoding. Scrutinizing the literature several times facilitates a better understanding of content and helps frame faithful context representations. The proposed model encapsulates the idea of re-examining a piece of text multiple times to grasp the unde...
Chapter
Various methods in machine learning have noticeable use in generating descriptive text for images and video frames and processing them. This area has attracted the immense interest of researchers in past years. For text generation, various models contain CNN and RNN combined approaches. RNN works well in language modeling; it lacks in maintaining i...
Article
Impreciseness and uncertainty are the fabrics that make life interesting. For decades, human beings have developed strategies to cope with uncertainties and automate them. In personnel selection for the I.T. field, selectors often find it very difficult to select candidates by going through a set of resumes containing similar kinds of skills. Hence...
Preprint
We describe the creation of an affective film dataset for researchers interested in studying a broad spectrum of emotional experiences. Two hundred twenty-two 60-seconds long video clips were selected based on multimedia content analysis and screened in the lab with 407 participants. The participants' ratings mapped to 31 emotion categories in the...
Preprint
Full-text available
Analysing the expressions on a person's face plays a vital role in identifying that person's emotions and behavior. Recognizing these expressions automatically is a crucial component of natural human-machine interfaces. Therefore, research in this field has a wide range of applications in biometric authentication, surveillance systems, em...
Article
Summaries are expected to relay the most information in the fewest words. Summary assessment tools such as ROUGE, METEOR, and BLEU fail to explain the factual consistency of summaries with source documents and also disregard the synonymy between expressions. Information-coverage is a measure of the amount of important information...
Preprint
Full-text available
Emotion experiments with naturalistic paradigms are emerging and giving new insights into dynamic brain activity. Context familiarity is considered an important dimension of emotion processing by appraisal theorists. However, how the context un/familiarity of the naturalistic stimuli influences the central and autonomic activity is not probed y...
Preprint
Full-text available
Emotion research with artificial stimuli does not represent the dynamic processing of emotions in real-life situations. The lack of data on emotion with an ecologically valid naturalistic paradigm hinders the knowledge of emotion mechanisms in real-world interaction. To this end, we collected emotional multimedia clips, validated them wi...
Preprint
Full-text available
Emotion is a constructed phenomenon that emerges from the dynamic interaction of multiple components neurologically, physiologically and behaviorally. Such dynamics cannot be captured by static and controlled experiments. Hence, the study of emotion with a naturalistic paradigm is needed. In this dataset, multimedia naturalistic stimuli are used t...
Article
Due to the rapid increase in the development of Task-oriented dialogue systems, the need for labelled dialogue corpus has become inevitable. For the Hindi language, there is no such dialogue corpus yet available. As a first attempt, we release a Hindi Dialogue Restaurant Search (HDRS) corpus and compare various state-of-the-art dialogue state track...
Chapter
Knowledge sharing platforms like Quora have millions and billions of questions. With such a vast number of questions, there will be a lot of duplicates in it. Duplicate questions in these sites are normal, especially with the increasing number of questions asked. These redundant queries reduce efficiency and create repetitive data on the data serve...
Book
The two-volume set LNCS 12615 + 12616 constitutes the refereed proceedings of the 12th International Conference on Intelligent Human Computer Interaction, IHCI 2020, which took place in Daegu, South Korea, during November 24-26, 2020. The 75 full and 18 short papers included in these proceedings were carefully reviewed and selected from a total of...
Book
The two-volume set LNCS 12615 + 12616 constitutes the refereed proceedings of the 12th International Conference on Intelligent Human Computer Interaction, IHCI 2020, which took place in Daegu, South Korea, during November 24-26, 2020. The 75 full and 18 short papers included in these proceedings were carefully reviewed and selected from a total of...
Preprint
Full-text available
Repeated reading (RR) helps learners who have little to no experience with reading fluently to gain confidence and speed and to process words automatically. The benefits of repeated readings include helping all learners with fact recall, aiding identification of learners' main ideas and vocabulary, increasing comprehension, leading to faster reading as...
Chapter
Natural Language Generation (NLG) is a crucial component of a Spoken Dialogue System. Its task is to generate utterances with intended attributes like fluency, variation, readability, scalability and adequacy. As the handcrafted models are rigid and tedious to build, people have proposed many statistical and deep-learning based models to bring abou...
Chapter
Full-text available
The proposed model unites the robustness of the extractive and abstractive summarization strategies. Three tasks indispensable to automatic summarization, namely, apprehension, extraction, and abstraction, are performed by two specially designed networks, the highlighter RNN and the generator RNN. While the highlighter RNN collectively performs the...
Chapter
Interval Type-2 fuzzy sets (IT2FSs) are used for modeling uncertainty and imprecision in a better way. In a conversation, the information given by humans is mostly words. IT2FSs can be used to provide a suitable mathematical representation of a word. The IT2FSs can be further processed using a Computing with Words (CWW) engine to return the IT2F...
Chapter
In this work, we perform feature selection on fifty-six prosodic features (such as intensity, formants, pause) extracted from the MIT-interview dataset. These features help in rating the various personality traits (Engaged, Excited, Friendly, Speaking Rate) which in turn help to determine an interviewee's performance. First, we have demonstrated how...
Book
This volume constitutes the proceedings of the 11th International Conference on Intelligent Human Computer Interaction, IHCI 2019, held in Allahabad, India, in December 2019. The 25 full papers presented in this volume were carefully reviewed and selected from 73 submissions. The papers are grouped in the following topics: EEG and other biological...
Article
Full-text available
This paper presents an approach to the problem of speech retrieval (SR). The feature used in this approach is MFCC. The approach does not use any phoneme recognizer or speech-to-text tool; hence, it can be used for other languages as well. This method retrieves ranked audio files containing spoken text in response to...
Preprint
Full-text available
What is an emotion? It is an old riddle, repeatedly attempted with the advanced tools and understanding of each age. With each new advancement, old theories are tested and, with correction, new ones are formed. Such is the case with defining emotion, which has broadly shifted from the classical definite-marker theory to a statistically context-situated conceptual theory. In...
Book
This book presents select papers from the International Conference on Emerging Trends in Communication, Computing and Electronics (IC3E 2018). Covering the latest theories and methods in three related fields – electronics, communication and computing, it describes cutting-edge methods and applications in the areas of signal and image processing, cy...
Book
This book constitutes the thoroughly refereed proceedings of the 10th International Conference on Intelligent Human Computer Interaction, IHCI 2018, held in Allahabad, India, in December 2018. The 28 regular papers presented were carefully reviewed and selected from 89 submissions. The papers have been organized in the following topical sections: E...
Conference Paper
Full-text available
In education, some students lack language comprehension, language production and language acquisition skills. In this paper we extracted several psycholinguistic features broadly grouped into lexical and morphological complexity, syntactic complexity, production units, syntactic pattern density, referential cohesion, connectives, amounts of coordi...
Article
Full-text available
In this paper, an extended combined approach of phrase based statistical machine translation (SMT), example based MT (EBMT) and rule based MT (RBMT) is proposed to develop a novel hybrid data driven MT system capable of outperforming the baseline SMT, EBMT and RBMT systems from which it is derived. In short, the proposed hybrid MT process is guided...
Article
Full-text available
Automatic speech recognition (ASR) and text-to-speech (TTS) are two prominent areas of research in human-computer interaction (HCI) nowadays. A set of phonetically rich sentences is important for developing these two interactive modules of HCI. Essentially, the set of phonetically rich sentences has to cover all possible phone units d...
Conference Paper
The algorithm proposed in this paper aims to achieve pose recognition in the Indian classical dance domain. Three different dance forms, namely Bharatnatyam, Kathak and Odissi, with 15 poses altogether, have been considered for the pose classification problem. An initial database is created consisting of 100 images and split into training and testing datas...
Article
Full-text available
Currently, the research on human affect recognition has shifted from six basic emotions to complex affect recognition in continuous two or three dimensional space due to the following challenges: (i) representing and analyzing large number of emotions in one framework, (ii) representing complex emotions in the framework, and (iii) validating the fr...
Article
Full-text available
In this paper an approach of Semantic Knowledge Extraction (SKE), from a set of research papers, is proposed to develop a system Summarized Research Article Generator (SRAG) which would generate a summarized research article based on the query given by a user. The SRAG stores the semantic knowledge extracted from the query relevant papers in the fo...
Conference Paper
Full-text available
EEG based biometric system can be used for authentication, with advantages like confidentiality retention and forgery prevention. Signals which are taken from maximum brain regions show some sort of unique information that can be used for extracting the subject dependent pattern. This paper presents an approach to find the relationships among signa...
Conference Paper
Full-text available
Clustering algorithms have been used for systematic retrieval of data by organizing them into several clusters and K-Means is one such algorithm, which partitions data into groups based on distance metric in an unsupervised way. In this paper, we study denoising of images corrupted with variable Gaussian noise spread across the entire images in the...
Article
Full-text available
In this paper we implemented an unobtrusive and non-invasive method to measure pulse rate and heart rate variability. A Ballistocardiography technique has been used which describes ballistic force applied by heart on blood vessels. Ballistocardiography depicts repetitive motion in human body against blood flow because of the ballistic force. In thi...
Conference Paper
Full-text available
Type-1 Fuzzy Systems (FS) have been successfully implemented in diverse application areas to date. Nonetheless, type-1 FS cannot handle the significant amount of uncertainty present in dynamic real-world applications. Improved performance against these uncertainties is achieved by type-2 FS. In this paper, a type-2 FS is...
Chapter
The objective of this chapter is twofold. On one hand, it tries to introduce and present various components of Human Computer Interaction (HCI), if HCI is modeled as a process of cognition; on the other hand, it tries to underline those representations and mechanisms which are required to develop a general framework for a collaborative HCI. One mus...
Conference Paper
Full-text available
The objective of this paper is two-fold. First, to adapt the earlier cognitive interactive framework for speech retrieval application and second, to enhance the accuracy of continuous speech retrieval system using fuzzy word mesh representing linguistic knowledge of users. The proposed method recognizes the audio query and retrieves the audio file(...
Conference Paper
Full-text available
Associationism is one of the dominant theories of learning. It is believed to play an implicit role in formation of inert cognition when provided with proper reinforcements. On the other hand learning by analogy plays an important role in the learning process of humans. Instructors all over the world use analogies to put their point through. We hav...
Conference Paper
Automated human quality translation between languages has been an enduring goal in computer science. We have worked on the concept of chunk-based statistical machine translation for the purpose of machine translation between Hindi and English using a probabilistic approach. Translation between Hindi and English is a challenging task as Hindi and En...
Chapter
Full-text available
In this paper, we present a generic interactive framework based on human cognition, where the system can learn continuously from the Internet and from its interaction with the users. To show the utilization of this framework, Iintelli, an agent based application for multiple text document summarization is developed and compared with the MEAD on the...
Conference Paper
In this paper we propose the use of multilevel classification techniques similar to concept of Bayesian belief networks for Combining Words and Pictures (Images) for Museum Information Retrieval. We have designed our own corpus on Allahabad Museum. This approach is static which allows one to compute the rank of documents of relevant words and pictu...
Chapter
The objective of this chapter is twofold. On one hand, it tries to introduce and present various components of Human Computer Interaction (HCI), if HCI is modeled as a process of cognition; on the other hand, it tries to underline those representations and mechanisms which are required to develop a general framework for a collaborative HCI. One mus...
Chapter
Spoken dialogue systems are a step forward towards the realization of human-like interaction with computer-based systems. This chapter focuses on issues related to spoken dialog systems. It presents a general architecture for spoken dialogue systems for human-computer interaction, describes its components, and highlights key research challenges in...
Book
Human Computer Interaction is the study of relationships among people and computers. As the digital world is getting multi-modal, the information space is getting more and more complex. In order to navigate this information space and to capture and apply this information to appropriate use, an effective interaction between human and computer is req...
Conference Paper
Full-text available
In this paper, we have proposed speech emotion recognition system based on multi-algorithm fusion. Mel Frequency Cepstral Coefficients (MFCC) and Discrete Wavelet Transform (DWT), the two prominent algorithms for speech analysis, have been used to extract emotion information from speech signal. MFCC, a representation of the short-term power spectru...
Conference Paper
Full-text available
In this paper, we have investigated the performance of different multi-resolution transforms in the application of emotion recognition from facial images. Multi-resolution analysis of image provides frequency information along with time information in different scale, orientation and locations. The emotion information from facial images was being c...
Conference Paper
Full-text available
This paper presents an overview of multimodal information fusion strategies such as early, intermediate and late fusion as reported in the literature. We also made an experimental evaluation of one of them for a multimodal emotion recognition system. Further, we propose a fusion scheme based on speech and facial expression for multimodal emotion recogniti...
Article
Full-text available
In this paper, we have proposed a multilevel soft thresholding technique for noise removal in Daubechies complex wavelet transform domain. Two useful properties of Daubechies complex wavelet transform, approximate shift invariance and strong edge representation, have been explored. Most of the uncorrelated noise gets removed by shrinking complex wa...
Conference Paper
Full-text available
The purpose of this paper is to evolve a robust text independent speaker identification system based on the wavelet transform, which is able to analyze signal at multiple resolutions. The proposed system identifies speakers by their acoustic characteristics embedded in speech signal of speakers. Features are obtained from approximation and detail c...
Article
Full-text available
Various structural and functional changes associated with ischemic (myocardial infarcted) heart cause amplitude and spectral changes in signals obtained at different leads of ECG. In order to capture these changes, Relative Frequency Band Coefficient (RFBC) features from 12-lead ECG have been proposed and used for automated identification of myocar...
Article
Full-text available
The appropriateness of dielectric loaded antenna for the passive millimeter wave imaging application has recently been demonstrated. In this paper, we analyze the optical performance of the passive millimeter wave (PMMW) imaging system based on a 1D focal plane array (FPA) of dielectric rod waveguide (DRW) antennas. A first step in the design proce...
