Sandeep Pandey

Indian Institute of Technology Guwahati | IIT Guwahati · Department of Electronics and Electrical Engineering (EEE)

Bachelor of Technology

About

10
Publications
6,807
Reads
46
Citations

Publications (10)
Article
Full-text available
Depression is one of the significant mental health issues affecting all age groups globally. While it has been widely recognized to be one of the major disease burdens in populations, complexities in definitive diagnosis present a major challenge. Usually, trained psychologists utilize conventional methods including individualized interview assessm...
Article
In an attempt to make Human-Computer Interactions more natural, we propose the use of Tensor Factorized Neural Networks (TFNN) and Attention Gated Tensor Factorized Neural Networks (AG-TFNN) for the Speech Emotion Recognition (SER) task. Standard speech representations such as 2D and 3D Mel-Spectrogram and Temporal Modulation Spectrogram are explored to...
Chapter
Emotions play an essential role in public speaking. The emotional content of speech has the power to influence minds. As such, we present an analysis of the emotional content of politicians' speech in the Indian political scenario. We investigate the emotional content present in the speeches of politicians using an attention-based CNN+LSTM network....
Preprint
Full-text available
Emotions play an essential role in public speaking. The emotional content of speech has the power to influence minds. As such, we present an analysis of the emotional content of politicians' speech in the Indian political scenario. We investigate the emotional content present in the speeches of politicians using an attention-based CNN+LSTM network....
Conference Paper
Full-text available
This paper proposes applying the WaveNet architecture to the task of speech emotion recognition using raw speech signals. In contrast to conventional deep learning methods, WaveNet utilises dilation filters, residual blocks and skip connections to model the long-term dependencies in speech signals, eliminating the need for LSTMs for that purpose. W...
Conference Paper
Full-text available
This paper presents an introduction to various deep learning techniques with the aim of capturing and classifying the emotional state from speech utterances. Architectures such as Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) have been used to test the emotion-capturing capability of various standard speech representations such as...
Conference Paper
Full-text available
This paper explores speaker identification based on speaker adaptation via multilinear decomposition of a speaker model. Tucker decomposition of the third-order mean tensor of the training speakers yields three subspaces, one corresponding to each mode. The mean of the mixtures for speakers is expressed as a product of the mixture space and a weight matri...
Article
Full-text available
Injection locking characteristics of oscillators are reviewed both theoretically and experimentally. Theoretical results coupled with experimental findings are presented. A simple method of deriving the equation for locking range is reported.

Questions (7)
Question
I want to collect a speech dataset of real-world interview conversations between two people. Which microphone or recording instrument is advisable so that it does not interfere with the interview? Please suggest.
Question
I am looking to develop a voice-based diagnostic system for COVID-19 detection from speech. Is there any publicly available dataset of COVID-19 speech files?
Question
I need to place three images side by side in the IEEE two-column format on overleaf.com, but I am unable to do so using subfigure or minipage. Please suggest a workaround.
Question
Is there any Python toolbox that calculates the modulation spectrogram (STFT- or wavelet-based) and returns it as a matrix? The one available at https://github.com/MuSAELab/amplitude-modulation-analysis-module gives a memory error.
Question
Please elaborate on the merits and demerits in comparison with GMM, Eigenvoice, etc. for speaker verification.
Question
For a speaker identification problem, an MFCC tensor of dimensions features x frames x number of utterances is constructed for each speaker, and HOSVD is used for dimensionality reduction. What will the HOSVD capture in each tensor along each mode?
Question
Terminology of a tensor: how is it different from an N-dimensional array?
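
The HOSVD question above (an MFCC tensor of features x frames x utterances) can be made concrete with a small NumPy sketch. This is only an illustration under stated assumptions, not the method used in any of the listed papers: the tensor sizes (13 x 50 x 8), the random stand-in data, and the helper names `unfold`, `hosvd` and `reconstruct` are all hypothetical. Each mode is unfolded into a matrix, and the leading left singular vectors of that matrix give an orthonormal basis for that mode, so one would expect mode 0 to describe variation across feature coefficients, mode 1 across frames, and mode 2 across utterances.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` along axis `mode` by `matrix` (shape: new_dim x old_dim)."""
    result = np.tensordot(matrix, tensor, axes=(1, mode))
    return np.moveaxis(result, 0, mode)

def hosvd(tensor, ranks):
    """Truncated HOSVD: returns a core tensor and one factor matrix per mode."""
    factors = []
    for mode, rank in enumerate(ranks):
        # Left singular vectors of the unfolding span the subspace of this mode.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    core = tensor
    for mode, u in enumerate(factors):
        # Project each mode onto its retained subspace.
        core = mode_product(core, u.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Map the core tensor back to the original space."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out

# Hypothetical stand-in for one speaker's MFCC tensor:
# 13 coefficients x 50 frames x 8 utterances (random data, for shape only).
rng = np.random.default_rng(0)
mfcc_tensor = rng.standard_normal((13, 50, 8))

core, factors = hosvd(mfcc_tensor, ranks=(5, 10, 3))
print(core.shape)                    # reduced core tensor
print([u.shape for u in factors])    # one basis per mode
```

Keeping all components in every mode reproduces the tensor exactly; truncating the ranks gives the dimensionality reduction the question refers to. The sketch also bears on the terminology question: in code the tensor is simply an N-dimensional array, and the multilinear structure comes from the mode-wise operations applied to it.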

Projects (2)
Archived project