Emotion identification from TQWT
based EEG rhythms
Aditya Nalwaya, Kritiprasanna Das, and Ram Bilas Pachori
Department of Electrical Engineering, Indian Institute of Technology Indore, Indore, India
ABSTRACT
Electroencephalogram (EEG) signals are recordings of brain electrical activity, commonly used for emotion recognition. Different EEG rhythms carry different neural dynamics. EEG rhythms are separated using the tunable Q-factor wavelet transform (TQWT). Several features, such as the mean, standard deviation, and information potential, are extracted from the TQWT-based EEG rhythms. Machine learning classifiers are used to differentiate various emotional states automatically. We have validated our proposed model using a publicly available database. The obtained classification accuracy of 92.86% demonstrates the suitability of the proposed method for emotion identification.
Keywords: Emotion recognition, affective computing, signal processing, machine learning, physiological
signal.
INTRODUCTION
Emotion plays a vital role in human life, as it influences human behavior, mental state, decision making,
etc. [1]. In humans, overall intelligence is generally measured by logical and emotional intelligence
[2],[3]. In recent years, artificial intelligence (AI) and machine learning (ML) have helped computers achieve higher intelligence, particularly in numerical computing and logical reasoning. However, computers are still limited in their ability to understand, comprehend, and respond according to the emotional state of the persons interacting with them. To address these shortcomings, research in the domain of affective computing is ongoing. Affective computing is a field that aims to design machines that can recognize, interpret, process, and simulate the human experience of feeling or emotion. Recognizing a person's emotional state can help a computer interact with humans in a better way.
In order to get more customized and user-centric information and communications technology solutions,
an emotion recognition system could play an important role. Although computing systems have made great progress in AI, they still lag behind humans in intelligence. The reason is the absence of emotional intelligence, which helps in understanding a situation and making decisions accordingly. Thus, instead of making decisions purely logically, computers can be made aware of the human emotional state before making a decision. Emotion recognition is also helpful in upcoming new
entertainment systems such as virtual reality systems for enhancing user experience [4]. Emotion
recognition systems can also be used in understanding the health condition of patients with mental
disabilities or infant patients [5]. Emotion detection can be used to monitor students' learning and create personalized educational content for them [6]. Also, a software developer can examine user experience by using an emotion recognition system. Emotion recognition systems have a vast area of application, such as health care, brain-computer interface (BCI), education, smart entertainment systems, smart rooms, intelligent cars, psychological studies, etc. [6].
Emotions are revealed by a human through either facial expression, verbal expression, or several
physiological signals such as variability in heart rate, skin conductance, etc. These are generated by the
human body in response to the emotion evoked [1].
In an emotion recognition system, emotions can be evoked or elicited either in a passive way or in an
active way. In the case of passive emotion elicitation, the subject's emotions are evoked by exposing them
to targeted emotion elicitation material. Some of the publicly available elicitation materials are the international affective picture system (IAPS) [52], a library of photographs used extensively for emotion elicitation, and the Nencki affective picture system (NAPS), another database of visual stimuli. The Montreal affective voices and the international affective digitized sounds (IADS) are some of the acoustic stimulus databases used for passive emotion elicitation [53]. In the case of active emotion elicitation,
subjects will be asked to actively participate in a certain task that leads to emotion elicitation. Participants
may be asked to play video games [54] or engage in conversation with another participant [55]; thus, by
actively participating in the experiment, subjects' emotions can be evoked. The elicited emotion can be labeled either through explicit assessment by the subjects themselves, where they report their feelings, or through implicit assessment, where the subject's emotional state is evaluated externally by another person. Some standard psychological questionnaires used for emotion evaluation are the self-assessment manikin (SAM) [56], the positive and negative affect schedule (PANAS) [57], and the differential emotion scale (DES) [58], which subjects answer according to their feelings. Both implicit and explicit methods of assessment are approximate evaluations of elicitation. Therefore, in [59], both techniques are used in combination to ensure the correctness of the label. Thus, in order to obtain physiological signals for a targeted emotion, the elicitation material or stimulus for that emotion must be chosen carefully.
Signals which are interpretable such as facial expressions, speech expressions, etc., can be collected easily
as the subject is not required to wear any equipment for recording such signals. Most facial emotion
recognition (FER) approaches have three main stages: preprocessing, feature extraction, and emotion
classification.
Preprocessing involves operations related to face detection and face alignment. There are many face
detection techniques available such as Viola-Jones [6], normalized pixel difference (NPD) feature-based
face detection [7], and facial image threshing machine [8]. Zhang et al.[9] proposed an algorithm that can
detect faces as well as perform a face alignment operation. Face alignment is an important operation for converting a non-frontal face image into a frontal one. An active appearance model (AAM) iteratively matches a statistical appearance model to obtain aligned face images [10]. Another face alignment technique is constrained local models (CLM), which produces smoother image alignment due to the use of linear filters [11].
Feature extraction is a process of extracting useful information from any given image or video. The
process of identifying information from a given image is called data registration [12]. It can be either
from a full facial image, part of a facial image, or a point-based method. The full facial image is generally
used when one is looking for every single detail of variation across the face. Whereas in the case of the
part-based method, only a part of the facial image, such as eye, nose, etc., is considered. The point-based
method is useful in getting information related to shape. Both part-based and point-based methods hold low-level feature information. Common low-level features are the local binary pattern (LBP) [13], local phase quantisation (LPQ) [14], histogram of oriented gradients (HOG) [15], etc. Such low-level features generate high-dimensional feature vectors. Therefore, pooling methods are used to remove redundancy from the obtained feature vector. Next, emotions are classified into different classes using the feature vector
obtained.
In [16], authors have used discrete wavelet transform (DWT) to extract features from a face image and
then convolutional neural networks are used to classify the emotion. In [17], authors have extracted
multiscale features using biorthogonal wavelet entropy from the face image, and then a fuzzy support vector machine (SVM) classifier is used to classify the emotion. Jeen et al. [18] have calculated features using a multilevel wavelet gradient transform; pooling is then done using Pearson kernel principal component analysis. The classification of emotion is done using a fuzzy SVM classifier. In [19], the authors
have used pre-trained convolutional neural networks that were trained using the ImageNet database.
Using such a transfer learning approach, facial emotions have been recognized. Such FER systems have found applications in video analytics for monitoring people [20], e-learning to
identify student engagement [21], reducing fatigue during video conferencing [22], etc.
Another popular approach for recognizing a person's emotions is speech analysis. Speech emotion
recognition (SER) estimates the emotional state of a speaker from the voice signal. The SER system has the same stages as the FER system; the only difference is the kind of features extracted from the input signal. The preprocessing stage helps in extracting the speech signal of the target speaker and removing the voice of non-target speakers as well as background noise and reverberation. Frequently used features
for emotion recognition through speech are the Teager-energy operator (TEO) [23], prosodic features
[24], voice quality features [25], and spectral features [26]. TEO features find stress in the speech. It has
been observed that speech is produced due to the nonlinear airflow from the human vocal tract, which is
directly related to change in muscle tension. Thus, TEO can be used to analyze the pitch contour to detect
emotions such as neutral, angry, etc. Prosodic features highlight pitch-related information such as stress, tone, pauses between words, etc. Voice quality features represent the voice level and voice pitch for a particular emotion, i.e., the amplitude and duration of the speech. Spectral features give information
related to frequency distribution over the audible frequency range. Linear predictive cepstral coefficients
(LPCC), mel frequency cepstral coefficients (MFCC), modulation spectral features, etc., are some of the
popular spectral features used for emotion recognition [26].
In [27], the energy content in the speech signal is computed using wavelet-based time-frequency
distribution for the classification of emotions. In [28], authors have used empirical mode decomposition
(EMD) based signal reconstruction method for feature extraction. Daneshfar et al. [29] have proposed
hybrid spectral-prosodic features of the speech signal. Using quantum-behaved particle swarm optimization (QPSO), the dimensionality of the feature vector is reduced. The reduced feature vector is then passed to a neural network classifier with Gaussian elliptical basis functions (GEBF) for detecting speech emotion. In [30], LPCC and MFCC features have been extracted using wavelet decomposition. These features are then reduced using the vector quantization method, and emotions are classified using a radial basis function network (RBFNN) classifier. Compared to biological signals, facial expressions and speech signals can be acquired comfortably and economically. However, such signals can be controlled or fabricated by the subject and are thus less reliable when compared with physiological signals [51].
The autonomic nervous system (ANS) regulates different parameters of our body. Emotions cause a change in the activity of the ANS [31]. Thus, to analyze changes in a person's emotional state, heart rate, body temperature, respiration rate, and other physiological signals are often used. There are many
physiological signals such as electroencephalogram (EEG) [32], electrocardiogram (ECG) [33],
phonocardiogram (PCG) [34], galvanic skin response (GSR) [35], respiration [36], etc. which have been
used for emotion recognition.
ECG represents the heart's electrical activity due to cardiac contraction and expansion. The sympathetic system of the ANS is stimulated differently by different emotions. Emotions influence the ANS activity,
which causes changes in the heartbeat rhythm. The ECG signals can be recorded by placing electrodes at
different parts of the chest [33].
As in the case of FER and SER systems, here also features are extracted for emotion recognition. Common approaches to ECG feature extraction include PQRST detection, heart rate (HR) and within-beat (WIB) features, heart rate variability (HRV), and the inter-beat interval (IBI) [37]. HRV is a time-domain feature and is most
widely used for the purpose of emotion recognition [38]. It measures the variation in the intervals between heartbeats. The time between beats is called the IBI or RR interval. There are three domains of features that are generally extracted from the HRV, namely: time-domain, frequency-domain, and nonlinear-domain features.
Emotion recognition using ECG signal is done in [39]. ECG signal features such as WIB mean, standard
deviation, median, etc., are calculated, and various time and frequency domain parameters have been
calculated using EMD. This feature vector is then passed to ensemble classifiers such as extra trees and random forest. In [40], the signal is decomposed using DWT and different features are calculated; emotions are then classified from the feature vector with the help of an SVM classifier.
GSR measures the conductivity of the skin, which is also known as electrodermal activity (EDA). Like the heartbeat, sweating is also regulated by the ANS. Under emotions such as stress or fear, the nervous system gets stimulated and sweat is generated. Thus, EDA signals can be used for the purpose of emotion detection. The conductivity of the skin increases when the subject is active and decreases when the subject is in a relaxed state. EDA signals can be recorded by placing electrodes over the fingers [35]. In [41], the fractional Fourier transform (FrFT) is used for feature extraction from the GSR signal, feature selection is then done using the Wilcoxon test, and classification is performed with an SVM. Generally, for emotion recognition
purposes, GSR is used in combination with other physiological signals such as ECG, EEG, etc.
Respiration rate is defined as the number of times a person breathes per unit time. The breathing pattern
of a person varies with changes in the physical and emotional state. Due to an increase in physical
workload, respiration may increase. Similarly, a decreased respiration rate indicates a relaxed state. Thus
the respiration rate reflects the state of the ANS under emotional response and mental
workload. Fast and deep breathing indicates anger or a happy emotional state. Momentary interruption of
respiration indicates tension. Irregular respiration may also indicate a depressed or fearful emotional state
[36]. In [42], a deep learning-based sparse auto-encoder (SAE) is used to extract features for recognizing
emotional information from respiratory signals. Then logistic regression is used for the classification of
emotion.
An EEG signal is generated due to electrical activities inside the brain. EEG signals can be recorded both
invasively and non-invasively. A non-invasive way of EEG signal recording is popularly used in the case
of human brain study. A cap with multiple electrodes arranged according to the international standard 10-20 system is placed over the scalp to record the EEG signal, and the potential differences among the electrodes capture the electrical activity inside the brain. Here, 10-20 refers to the spacing between adjacent electrodes, i.e., 10% or 20% of the front-to-back or right-to-left distance of the skull. The EEG signals received from the various channels of the cap have a nonstationary characteristic. Only a professionally trained person can interpret such nonstationary EEG signals, as, for example, a doctor does when diagnosing different brain disorders.
Various signal processing algorithms can help extract information from such nonstationary signals to
automate this manual process. Before applying any signal processing algorithm, a signal must be
preprocessed. Preprocessing removes artifacts and noise from the raw input signal and makes it suitable for further processing. It includes operations such as artifact
removal, noise filtering, and resampling the signal. The signal is generally recorded at a higher sampling
rate, and then the signal is downsampled before further processing to reduce the computational
complexity. Downsampling helps in reducing the number of samples used while still maintaining the
needed information. Several artifact removal techniques based on independent component analysis (ICA),
bandpass filtering, deep learning, etc., can further improve the signal quality [51]. The preprocessed signal is then analyzed using various signal decomposition techniques, and certain features are extracted from the decomposed signal. Various features that help characterize the signal, such as spatial, spectral, temporal, and statistical features, can be extracted from the oscillatory components. Spatial information helps in finding the source of the activity. In the case of EEG signals,
this feature will help in selecting specific EEG channels or focusing more on the signal coming from a
specific region of the brain. The spectral feature can be helpful in describing signal power distribution in
different frequency bands. Mean, variance, standard deviation, skewness, kurtosis etc., are some of the
widely used statistical features.
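As a rough illustration of the preprocessing chain described above (band-pass filtering, downsampling, and segmentation into fixed-length epochs), a minimal Python sketch is given below. The sampling rates, filter order, pass band, and channels-by-samples array layout are all assumed values, not details taken from this chapter's experiments.

```python
import numpy as np
from scipy.signal import butter, filtfilt, resample_poly

def preprocess_eeg(raw, fs_in=1000, fs_out=200, band=(0.5, 75.0), epoch_sec=1.0):
    """Band-pass filter, downsample, and epoch a multichannel EEG recording.

    raw      : array of shape (n_channels, n_samples) -- assumed layout.
    fs_in    : original sampling rate in Hz (assumed value).
    fs_out   : target sampling rate in Hz after downsampling (assumed value).
    band     : pass band in Hz, chosen to cover the delta-to-gamma range.
    epoch_sec: epoch length in seconds (the chapter uses one-second epochs).
    """
    # Zero-phase band-pass filtering to suppress slow drift and high-frequency noise
    nyq = fs_in / 2.0
    b, a = butter(4, [band[0] / nyq, band[1] / nyq], btype="bandpass")
    filtered = filtfilt(b, a, raw, axis=-1)

    # Downsample to reduce the computational load while keeping the band of interest
    down = resample_poly(filtered, fs_out, fs_in, axis=-1)

    # Split the recording into non-overlapping epochs
    epoch_len = int(round(epoch_sec * fs_out))
    n_epochs = down.shape[-1] // epoch_len
    trimmed = down[:, : n_epochs * epoch_len]
    epochs = trimmed.reshape(down.shape[0], n_epochs, epoch_len)
    return np.transpose(epochs, (1, 0, 2))  # (n_epochs, n_channels, epoch_len)
```

Zero-phase filtering (filtfilt) avoids introducing phase distortion, and the one-second epoch length matches the segmentation used later in the chapter.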
EEG signals are very useful physiological signals for the study of human emotion recognition, as they have a high temporal resolution [32]. Also, since they are generated directly by the brain, these signals carry more emotion-related information. EEG signals consist of different rhythms such as delta (δ) (0.5-4 Hz), theta (θ) (4-8 Hz), alpha (α) (8-13 Hz), beta (β) (13-30 Hz), and gamma (γ) (more than 30 Hz) [43]. The study of these EEG rhythms gives information about the user's mental and emotional state.
Several methods have been proposed for emotion recognition using EEG signals. In [45], feature
extraction from the EEG signal is done using the short-time Fourier transform (STFT); the F-score is used for feature selection, and classification is then done using an SVM. In [46], the authors extracted the rational asymmetry (RASM) feature, which describes the frequency-space domain characteristics of the EEG signal, and then classified different emotions with an accuracy of 76% using long short-term memory (LSTM) recurrent neural networks. In [47], the multivariate synchrosqueezing transform (MSST) is used for time-frequency
representation. The high-dimensional extracted feature is reduced using independent component analysis
(ICA). Gupta et al. [48] decomposed EEG signals using the flexible analytic wavelet transform. The information potential (IP) based on Renyi's quadratic entropy is computed from each sub-band, the obtained feature is smoothed using a moving average filter, and classification is then done using a random forest classifier. In [49], the authors have divided an EEG signal into small segments. For each segment, statistical parameters such as the mean, median, Fisher information ratio, standard deviation, variance, maximum, minimum, range, skewness, kurtosis, entropy, and Petrosian fractal dimension are computed. The feature vector of the above-computed statistical parameters is passed to a classifier called sparse
discriminative ensemble learning (SDEL) for emotion classification. In [50], Fourier-Bessel series
expansion (FBSE) based empirical wavelet transform (FBSE-EWT) is used for computing K-nearest
neighbour (K-NN) and spectral Shannon entropies. The extracted features are smoothed and then given to
the sparse autoencoder-based random forest (ARF) classifier for emotion classification.
In [51], various multimodal emotion recognition approaches are explained. In the multimodal-based
approach, different signals are captured from the subject, and a separate feature vector is formed from
each signal. These feature vectors of different signals are combined either at the feature level or at the
decision level.
As this chapter focuses more on EEG-related emotion recognition, the subsequent section will give more
details related to different blocks or stages of EEG-based emotion recognition systems.
PROPOSED FRAMEWORK
This section explains the proposed methodology; the flow chart of the proposed emotion recognition system is shown in Fig. 1.
Figure 1. Flow chart of the proposed methodology
a) Database description:
A publicly available database called the SJTU Emotion EEG Dataset (SEED) is used for validating the proposed model [60]. EEG signals of 15 subjects were recorded for three classes of emotions, i.e., neutral, sad, and happy. Each subject participated in three sessions. The SEED database includes the Chinese movie clips used to elicit emotion, a list of subjects, and the recorded EEG data. The Chinese movie clips used for eliciting emotion were selected based on certain criteria: 1) to avoid fatigue among subjects during the trial, clips are chosen as short as possible, 2) the subject must easily understand the clip, and 3) only the desired emotion must get elicited. The clips were first shown to twenty participants, and 15 clips were finally selected, with an equal number of clips for evoking each emotion. Each movie clip was approximately 4 minutes long.
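For readers who want to experiment with the SEED recordings, the sketch below shows one way to read a recording file with SciPy. It is only an assumption-laden example: the variable names stored inside the .mat files and the exact trial array layout depend on the dataset release, so the key filtering and the commented shape are placeholders rather than a description of the authors' code.

```python
import numpy as np
from scipy.io import loadmat

def load_seed_trials(mat_path):
    """Return the trial arrays stored in one SEED recording file.

    Assumption: each non-bookkeeping variable in the file is one movie-clip
    trial, stored roughly as a (n_channels, n_samples) array.
    """
    contents = loadmat(mat_path)
    trials = []
    for key, value in contents.items():
        if key.startswith("__"):   # skip MATLAB bookkeeping fields
            continue
        trials.append(np.asarray(value))
    return trials
```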
b) Tunable Q-factor wavelet transform (TQWT) [68]: In this stage, the signal is decomposed into different sub-bands. DWT is the most widely used time-frequency analysis technique for analyzing nonlinear and nonstationary processes [book]. However, DWT has a constant Q-factor, which is not suitable for all varieties of signals: a highly oscillatory signal should be analyzed with a high Q-factor, whereas a less oscillatory signal should be analyzed with a low Q-factor. TQWT is a more advanced version of the wavelet transform. Using TQWT, multi-component EEG signals can be decomposed into several sub-band signals. TQWT is more flexible than the original DWT because the Q-factor of its filters can be adjusted, and it provides good time-frequency localization. TQWT has three main parameters: Q, r, and J, where Q is the Q-factor (a dimensionless quantity), J is the number of decomposition levels (a positive integer), and r is the oversampling rate. Wavelet oscillations are adjusted by Q, whereas r controls temporal localization while conserving the wavelet's shape. Increasing Q makes the frequency responses narrower, resulting in more decomposition levels for the same frequency span. For a fixed Q, increasing r increases the overlap between adjacent frequency responses, which in turn increases the number of decomposition levels needed to cover the same frequency range. TQWT consists of a chain of two-channel low-pass and high-pass filter banks, where the output of the low-pass filter is connected to the input of the next stage of the filter bank. Fig. 3 (a) shows the decomposition and reconstruction of a given input EEG signal using the TQWT-based approach, and Fig. 3 (b) shows the iterative signal decomposition up to J levels. For J levels of decomposition, J+1 sub-bands are obtained, i.e., one low-pass sub-band and J high-pass sub-bands. In Fig. 3 (a), α represents the low-pass scaling factor, which preserves the low-frequency components of the signal; similarly, β represents the high-pass scaling factor, which preserves the high-frequency components of the signal.
Figure 3. (a) TQWT decomposition and reconstruction. (b) Multi-stage filter bank.
The frequency responses of the low-pass filter \(H_0(\omega)\) and high-pass filter \(H_1(\omega)\) of the two-channel filter bank are given as [68]:

\[
H_0(\omega) =
\begin{cases}
1, & |\omega| \le (1-\beta)\pi \\
\theta\!\left(\dfrac{\omega + (\beta - 1)\pi}{\alpha + \beta - 1}\right), & (1-\beta)\pi < |\omega| < \alpha\pi \\
0, & \alpha\pi \le |\omega| \le \pi
\end{cases}
\qquad
H_1(\omega) =
\begin{cases}
0, & |\omega| \le (1-\beta)\pi \\
\theta\!\left(\dfrac{\alpha\pi - \omega}{\alpha + \beta - 1}\right), & (1-\beta)\pi < |\omega| < \alpha\pi \\
1, & \alpha\pi \le |\omega| \le \pi
\end{cases}
\]

where \(\theta(\omega) = \tfrac{1}{2}(1 + \cos\omega)\sqrt{2 - \cos\omega}\) for \(|\omega| \le \pi\). The equivalent frequency response after J levels is obtained by cascading these responses with the frequency axis scaled by α at each level. For perfect reconstruction, α + β > 1 is required; for α + β = 1, the TQWT is critically sampled with no transition width. For the desired Q-factor and oversampling rate r, the wavelet filter bank parameters α and β are given by [68]:

\[
\beta = \frac{2}{Q + 1}, \qquad \alpha = 1 - \frac{\beta}{r}.
\]
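The relations above can be evaluated numerically; the sketch below is only illustrative. It computes α and β from Q and r and evaluates the two-channel low-pass and high-pass responses on a frequency grid, using the Q = 5 and r = 3 values reported later in the chapter.

```python
import numpy as np

def tqwt_params(Q, r):
    """Scaling factors alpha (low pass) and beta (high pass) for a given
    Q-factor and oversampling rate r, following the relations above."""
    beta = 2.0 / (Q + 1.0)
    alpha = 1.0 - beta / r
    return alpha, beta

def theta(w):
    """Daubechies-type transition function used in the TQWT filter responses."""
    return 0.5 * (1.0 + np.cos(w)) * np.sqrt(2.0 - np.cos(w))

def tqwt_filters(w, alpha, beta):
    """Evaluate the low-pass H0 and high-pass H1 responses for radian
    frequencies w in [0, pi]."""
    H0 = np.zeros_like(w)
    H1 = np.zeros_like(w)
    pass_lo = w <= (1.0 - beta) * np.pi                       # flat low-pass region
    trans = ((1.0 - beta) * np.pi < w) & (w < alpha * np.pi)  # transition band
    pass_hi = w >= alpha * np.pi                              # flat high-pass region
    H0[pass_lo] = 1.0
    H0[trans] = theta((w[trans] + (beta - 1.0) * np.pi) / (alpha + beta - 1.0))
    H1[trans] = theta((alpha * np.pi - w[trans]) / (alpha + beta - 1.0))
    H1[pass_hi] = 1.0
    return H0, H1

# Example with the parameter values used later in the chapter (Q = 5, r = 3)
alpha, beta = tqwt_params(Q=5, r=3)
w = np.linspace(0.0, np.pi, 1024)
H0, H1 = tqwt_filters(w, alpha, beta)
```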
c) Rhythm Separation: Rhythms are separated from the decomposed EEG signal by calculating the mean frequency of the sub-band signals. If the mean frequency lies between 0.5 and 4 Hz, the sub-band contributes to the delta rhythm. Similarly, the theta, alpha, beta, and gamma rhythms correspond to mean frequency values between 4 to 8 Hz, 8 to 13 Hz, 13 to 30 Hz, and 30 to 75 Hz, respectively.
Sub-bands of the decomposed EEG signal are grouped to separate rhythms according to their mean frequency (\(\mu_k\)), calculated as follows [70]:

\[
\mu_k = \frac{\sum_{i=0}^{N/2 - 1} f_i \, |X_k(i)|^2}{\sum_{i=0}^{N/2 - 1} |X_k(i)|^2}, \qquad f_i = \frac{i f_s}{N},
\]

where \(f_s\) is the sampling frequency, N is the length of the signal, and \(X_k(i)\) is the DFT of the k-th sub-band. As only a one-sided spectrum is considered, the index i ranges from 0 to (N/2)-1.
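A minimal sketch of this rhythm-separation step is given below. It assumes the TQWT sub-band signals have already been reconstructed to a common length; the band boundaries follow the values stated above, and the function names are illustrative.

```python
import numpy as np

# Rhythm boundaries used in the chapter (Hz)
RHYTHM_BANDS = {
    "delta": (0.5, 4.0), "theta": (4.0, 8.0), "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0), "gamma": (30.0, 75.0),
}

def mean_frequency(subband, fs):
    """Power-weighted mean frequency of one sub-band (one-sided spectrum)."""
    spectrum = np.abs(np.fft.rfft(subband)) ** 2
    freqs = np.fft.rfftfreq(len(subband), d=1.0 / fs)
    return np.sum(freqs * spectrum) / np.sum(spectrum)

def group_subbands_into_rhythms(subbands, fs):
    """Sum the sub-band signals whose mean frequency falls inside each band.

    subbands : list of 1-D arrays (TQWT sub-band signals of equal length).
    Returns a dict mapping rhythm name -> reconstructed rhythm signal.
    """
    rhythms = {name: np.zeros_like(subbands[0]) for name in RHYTHM_BANDS}
    for sb in subbands:
        mu = mean_frequency(sb, fs)
        for name, (lo, hi) in RHYTHM_BANDS.items():
            if lo <= mu < hi:
                rhythms[name] += sb
                break
    return rhythms
```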
d) Feature extraction: The IP of each obtained rhythm is then calculated using Renyi's quadratic entropy [71]. For adaptation and learning of information, Renyi derived a family of estimators that use entropies and divergences as cost functions. Since entropy is a scalar quantity, calculating the entropy of random data normally requires first estimating its probability density function (PDF), which is difficult in high-dimensional spaces. Using quadratic Renyi's entropy and the IP, the requirement of estimating the PDF can be relaxed. The quadratic Renyi entropy \(H_2(x)\) and its IP estimator are given by

\[
H_2(x) = -\log\big(\mathrm{IP}_\sigma(x)\big), \qquad \mathrm{IP}_\sigma(x) = \frac{1}{N^2}\sum_{m=1}^{N}\sum_{n=1}^{N} G_\sigma(x_m - x_n),
\]

where \(G_\sigma(\cdot)\) is a Gaussian kernel with width \(\sigma\), \(\mathrm{IP}_\sigma(x)\) is the quadratic IP estimator which depends on \(\sigma\), \(x_m\) and \(x_n\) are sample pairs, and the total number of samples is given by N.
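The information potential of each rhythm can be estimated directly from its samples, as in the sketch below. The kernel width sigma is an assumed, tunable value; the direct double sum costs O(N^2), which is manageable for short one-second epochs.

```python
import numpy as np

def information_potential(x, sigma=1.0):
    """Quadratic information potential of a 1-D signal with a Gaussian kernel."""
    x = np.asarray(x, dtype=float)
    diffs = x[:, None] - x[None, :]                      # all pairwise differences
    kernel = np.exp(-(diffs ** 2) / (2.0 * sigma ** 2))  # Gaussian kernel values
    kernel /= sigma * np.sqrt(2.0 * np.pi)               # kernel normalization
    return kernel.mean()                                 # (1/N^2) * double sum

def renyi_quadratic_entropy(x, sigma=1.0):
    """Quadratic Renyi entropy estimate: H2(x) = -log(IP_sigma(x))."""
    return -np.log(information_potential(x, sigma))
```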
󰇛󰇜
󰇛󰇜

 
e) Classification: SVM is a supervised machine learning algorithm. It is mainly used for finding decision boundaries, defined by support vectors, which are then used for classification. The SVM classifier learns from training data that are projected into a higher-dimensional space, where the data are separated into two classes by a hyperplane [72]. A user-defined kernel function transforms the original feature space into this higher-dimensional space. The SVM finds the support vectors that maximize the separation between the two classes; the margin is the total separation between the hyperplane and the nearest training samples of each class. Once the hyperplane is defined, the SVM iteratively optimizes it in order to maximize the margin. SVM can perform both linear and nonlinear classification. In the case of a nonlinear classifier, commonly used kernels are the homogeneous and inhomogeneous polynomial, Gaussian radial basis function, and hyperbolic tangent kernels. In this work, an SVM classifier with a cubic kernel is used.

The decision function of the SVM classifier can be expressed mathematically as follows [73]:

\[
y(x) = \operatorname{sign}\left(\sum_{n=1}^{N} \alpha_n y_n K(x, x_n) + c\right),
\]

where \(\alpha_n\) is a positive real constant, c is a real constant, \(K(x, x_n)\) is the kernel defining the feature space, and \(x_n\) and \(y_n\) are the n-th input and output vectors, respectively. For a linear SVM, \(K(x, x_n) = x_n^{T} x\); for a polynomial SVM of order d, \(K(x, x_n) = (x_n^{T} x + 1)^d\), i.e., d = 2 for the quadratic and d = 3 for the cubic kernel.
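For illustration, a cubic-kernel SVM analogous to the classifier described above can be trained with scikit-learn as sketched below. The feature matrix and labels are random placeholders, and hyperparameters such as the box constraint C are assumptions; the experiments reported in this chapter were carried out in MATLAB.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Placeholder data: rows are epochs, columns are rhythm-wise features;
# labels follow the SEED convention: -1 (sad), 0 (neutral), 1 (happy).
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 20))     # replace with the real feature matrix
y = rng.integers(-1, 2, size=300)      # replace with the real emotion labels

# Cubic-kernel SVM: a polynomial kernel of degree 3, with feature standardization
clf = make_pipeline(StandardScaler(), SVC(kernel="poly", degree=3, C=1.0))

scores = cross_val_score(clf, X, y, cv=5)
print(f"Cross-validated accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```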
RESULTS AND DISCUSSION
The proposed methodology for the recognition of human emotion using EEG signals is evaluated using a publicly available database consisting of EEG signals of 15 participants. TQWT decomposes the EEG signal into several sub-bands. The TQWT parameters were chosen as Q = 5, r = 3, and J = 18. One-second epochs of the EEG signal are decomposed into (J+1) sub-bands. Then the mean frequency is calculated, and based on its value, the EEG rhythms are extracted. Thus, all sub-bands with a mean frequency value between 1 and 4 Hz are summed together to obtain the delta rhythm. In a similar way, all other EEG rhythms are obtained, as shown in Figs. 4, 5, and 6. Then the information potential of each rhythm is calculated, and a feature vector corresponding to a particular emotion is formed. Several machine learning classifiers are trained using the feature vector. The SVM classifier with a cubic kernel outperforms the other classifiers.
Figure 4: EEG rhythms for happy emotion.
Figure 5: EEG rhythms for neutral emotion.
Figure 6: EEG rhythms for sad emotion.
All simulations are done using MATLAB 2021b, installed on a system having an Intel i5 processor and 8
GB RAM. Table 1 shows the accuracy of the classifier for an individual subject. The overall average
accuracy for the subject-dependent emotion recognition task is 92.86%.
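The per-class true positive rates and overall accuracy reported in Table 1 can be computed from the true and predicted labels as in the following sketch; the label convention follows the SEED classes, and the helper name is illustrative.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score

def per_class_tpr_and_accuracy(y_true, y_pred, labels=(-1, 0, 1)):
    """Per-class true positive rate (recall) and overall accuracy, in percent."""
    cm = confusion_matrix(y_true, y_pred, labels=list(labels))
    tpr = cm.diagonal() / cm.sum(axis=1)   # recall of each emotion class
    overall = accuracy_score(y_true, y_pred)
    return dict(zip(labels, 100.0 * tpr)), 100.0 * overall
```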
Table 1
SVM (cubic) classifier performance for subject-dependent emotion recognition. True positive rate per class and overall accuracy, in %.

Subject      Negative (-1)   Neutral (0)   Positive (1)   Overall accuracy
Subject 1        91.7            93.7           95.9            93.8
Subject 2        93.6            94.7           94.2            94.2
Subject 3        86.8            88.7           94.2            90
Subject 4        99.1            99.5           99.6            99.4
Subject 5        93.8            93             92.7            93.2
Subject 6        93.8            92.4           97.9            94.7
Subject 7        82.1            87.7           88.4            86.1
Subject 8        89              90.2           92.5            90.6
Subject 9        88.8            87.2           95.2            90.5
Subject 10       92.1            -              97.5            94.2
Subject 11       89              91.8           91.5            90.8
Subject 12       93              94.8           92.9            93.6
Subject 13       91.4            -              94.8            93
Subject 14       93.6            -              96.5            94.9
Subject 15       93.1            95.5           93.2            93.9
Average          91.39           92.61          94.46           92.86
In the case of subject-independent classification, the MATLAB Classification Learner application is used, and five
different classifiers are trained. Table 2 shows a comparison of different classifiers' performance in the
case of subject independent emotion recognition. Ensemble bagged trees provides the highest overall
accuracy of 86.8%.
Table 2
Classification performance for subject-independent emotion recognition. True positive rate per class and overall accuracy, in %.

Classifier                 Negative (-1)   Neutral (0)   Positive (1)   Overall accuracy
Fine tree                      51.8            68.9           80.1            67.1
SVM (cubic)                    81.8            83.1           90.9            85.4
KNN (cubic)                    58.2            61.4           73.4            64.4
Ensemble boosted trees         50              59.4           80.3            64.2
Ensemble bagged trees          85              85.2           90              86.8
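A comparison loosely analogous to Table 2 can be set up with scikit-learn as sketched below. The listed models are rough counterparts of the MATLAB Classification Learner presets used in the chapter, the data are random placeholders, and all hyperparameters are assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Placeholder feature matrix and labels pooled over all subjects
rng = np.random.default_rng(1)
X = rng.standard_normal((600, 20))
y = rng.integers(-1, 2, size=600)

# Rough scikit-learn counterparts of the classifiers compared in Table 2
models = {
    "Fine tree": DecisionTreeClassifier(),
    "SVM (cubic)": make_pipeline(StandardScaler(), SVC(kernel="poly", degree=3)),
    "KNN (cubic)": KNeighborsClassifier(n_neighbors=10, metric="minkowski", p=3),
    "Ensemble boosted trees": AdaBoostClassifier(n_estimators=100),
    "Ensemble bagged trees": BaggingClassifier(DecisionTreeClassifier(), n_estimators=100),
}

for name, model in models.items():
    acc = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name:<25s} cross-validated accuracy: {acc:.3f}")
```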
TQWT-based emotion recognition using EEG signals requires less training data compared to deep neural network-based emotion recognition systems. The comparative performance of different emotion recognition methods is shown in Table 3.
Table 3
Performance comparison of the different existing methods with the proposed method

Author                 Year   Dataset   Methodology                                                                           Accuracy
Li et al. [75]         2017   DEAP      CNN + LSTM, RNN                                                                       75.21%
W. Zhang et al. [76]   2019   SEED      CNN and DDC                                                                           82.1%
Zhong et al. [77]      2020   SEED      RGNN                                                                                  85.3%
Yucel et al. [78]      2020   SEED      Pretrained CNN                                                                        78.34%
Wang et al. [79]       2020   SEED      CNN, EFDMs, and STFT                                                                  90.59%
Wei et al. [80]        2020   SEED      Dual-tree complex wavelet transform and simple recurrent units network                80.02%
Khateeb et al. [81]    2021   DEAP      Multi-domain feature extraction using wavelet (entropy, energy) and SVM classifier    65.92%
Haqque et al. [82]     2021   SEED      Wavelet filters and CNN                                                               83.44%
Proposed work                 SEED      TQWT for feature extraction and SVM with cubic kernel for classification              92.86%

CNN: convolutional neural network, LSTM: long short-term memory, RNN: recurrent neural network, STFT: short-time Fourier transform, DDC: deep domain confusion, EFDMs: electrode-frequency distribution maps, RGNN: regularized graph neural network.
CONCLUSION
This study recognized human emotions using EEG signals by applying advanced signal processing techniques. TQWT decomposes the EEG signal into several sub-bands. Depending on the mean frequency, sub-bands are grouped together in order to obtain the EEG rhythms. Features from each rhythm are computed and used for classification with an SVM classifier. The SEED emotion database is used for evaluation of the proposed methodology. For subject-dependent and subject-independent emotion recognition, accuracies of 92.86% and 86.8%, respectively, are obtained. The proposed technique may be helpful in creating more user-friendly interactive systems. Also, the complexity of such a system is lower than that of artificial neural network-based techniques. Therefore, it can be easily deployed in real-time applications.
ACKNOWLEDGEMENT
This study is supported by the Council of Scientific & Industrial Research (CSIR) funded Research
Project, Government of India, Grant number: 22(0851)/20/EMR-II.
REFERENCES
[1] Šimić, G., Tkalčić, M., Vukić, V., Mulc, D., Španić, E., Šagud, M., ... & R Hof, P. (2021). Understanding
Emotions: Origins and Roles of the Amygdala. Biomolecules, 11(6), 823.
[2] Picard, R.W., Vyzas, E., & Healey, J. (2001). Toward Machine Emotional Intelligence: Analysis of Affective
Physiological State. IEEE Trans. Pattern Anal. Mach. Intell., 23, 1175-1191.
[3] Salovey, P., & Mayer, J. D. (1990). Emotional intelligence. Imagination, cognition and personality, 9(3), 185-
211.
[4] Gupta, K., Lazarevic, J., Pai, Y. S., & Billinghurst, M. (2020, November). AffectivelyVR: Towards VR
Personalized Emotion Recognition. In 26th ACM Symposium on Virtual Reality Software and Technology (pp. 1-3).
[5] Hassouneh, A., Mutawa, A. M., & Murugappan, M. (2020). Development of a real-time emotion recognition
system using facial expressions and EEG based on machine learning and deep neural network methods. Informatics
in Medicine Unlocked, 20, 100372.
[6] Kołakowska, A., Landowska, A., Szwoch, M., Szwoch, W., & Wrobel, M. R. (2014). Emotion recognition and
its applications. In Human-Computer Systems Interaction: Backgrounds and Applications 3 (pp. 51-62). Springer,
Cham.
[7] Egger, M., Ley, M., & Hanke, S. (2019). Emotion recognition from physiological signal analysis: A review.
Electronic Notes in Theoretical Computer Science, 343, 35-55.
[] Viola, P.A., & Jones, M.J. (2001). Rapid object detection using a boosted cascade of simple features. Proceedings
of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001, 1, I-I.
[7] Liao, S., Jain, A. K., & Li, S. Z. (2015). A fast and accurate unconstrained face detector. IEEE Transactions on
Pattern Analysis And Machine Intelligence, 38(2), 211-223.
[8] Kim, J. H., Poulose, A., & Han, D. S. (2021). The extensive usage of the facial image threshing machine for
facial emotion recognition performance. Sensors, 21(6), 2026.
[9] Zhang, K., Zhang, Z., Li, Z., & Qiao, Y. (2016). Joint face detection and alignment using multitask cascaded
convolutional networks. IEEE Signal Processing Letters, 23(10), 1499-1503.
[10] Cootes, T. F., Edwards, G. J., & Taylor, C. J. (2001). Active appearance models. IEEE Transactions On Pattern
Analysis And Machine Intelligence, 23(6), 681-685.
[11] Saragih, J. M., Lucey, S., & Cohn, J. F. (2009, September). Face alignment through subspace constrained
mean-shifts. In 2009 IEEE 12th International Conference on Computer Vision (pp. 1034-1041). Ieee.
[12] Sariyanidi, E., Gunes, H., & Cavallaro, A. (2014). Automatic analysis of facial affect: A survey of registration,
representation, and recognition. IEEE Transactions On Pattern Analysis And Machine Intelligence, 37(6), 1113-
1133.
[13] Jabid, T., Kabir, M. H., & Chae, O. (2010, January). Local directional pattern (LDP) for face recognition. In
2010 Digest of Technical Papers International Conference On Consumer Electronics (ICCE) (pp. 329-330). IEEE.
[14] Ojansivu, V., & Heikkilä, J. (2008, July). Blur insensitive texture classification using local phase quantization.
In International Conference On Image And Signal Processing (pp. 236-243). Springer, Berlin, Heidelberg.
[15] Dalal, N., & Triggs, B. (2005, June). Histograms of oriented gradients for human detection. In 2005 IEEE
Computer Society Conference on Computer Vision And Pattern Recognition (CVPR'05) (Vol. 1, pp. 886-893).
[16] Bendjillali, R. I., Beladgham, M., Merit, K., & Taleb-Ahmed, A. (2019). Improved facial expression
recognition based on DWT feature for deep CNN. Electronics, 8(3), 324.
[17] Zhang, Y. D., Yang, Z. J., Lu, H. M., Zhou, X. X., Phillips, P., Liu, Q. M., & Wang, S. H. (2016). Facial
emotion recognition based on biorthogonal wavelet entropy, fuzzy support vector machine, and stratified cross
validation. IEEE Access, 4, 8375-8385.
[18] Kumar, R., Sundaram, M., & Arumugam, N. (2021). Facial emotion recognition using subband selective
multilevel stationary wavelet gradient transform and fuzzy support vector machine. The Visual Computer, 37(8),
2315-2329.
[19] Chowdary, M. K., Nguyen, T. N., & Hemanth, D. J. (2021). Deep learning-based facial emotion recognition for
humancomputer interaction applications. Neural Computing and Applications, 1-18.
[20] Gautam, K. S., & Thangavel, S. K. (2021). Video analytics-based facial emotion recognition system for smart
buildings. International Journal of Computers and Applications, 43(9), 858-867.
[21] De Carolis, B., D'Errico, F., Macchiarulo, N., & Palestra, G. (2019, October). "Engaged Faces": Measuring and
Monitoring Student Engagement from Face and Gaze Behavior. In IEEE/WIC/ACM International Conference on
Web Intelligence-Companion Volume (pp. 80-85).
[22] Rößler, J., Sun, J., & Gloor, P. (2021). Reducing Videoconferencing Fatigue through Facial Emotion
Recognition. Future Internet, 13(5), 126.
[23] Kaiser, J. F. (1993, April). Some useful properties of Teager's energy operators. In 1993 IEEE International
Conference on Acoustics, Speech, and Signal Processing (Vol. 3, pp. 149-152). IEEE.
[24] Ingale, A. B., & Chaudhari, D. S. (2012). Speech emotion recognition. International Journal of Soft Computing
and Engineering (IJSCE), 2(1), 235-238.
[25] Guidi, A., Gentili, C., Scilingo, E. P., & Vanello, N. (2019). Analysis of speech features and personality traits.
Biomedical Signal Processing and Control, 51, 1-7.
[26] Gupta, D., Bansal, P., & Choudhary, K. (2018). The state of the art of feature extraction techniques in speech
recognition. Speech and Language Processing For Human-Machine Communications, 195-207.
[27] Vasquez-Correa, J. C., Arias-Vergara, T., Orozco-Arroyave, J. R., Vargas-Bonilla, J. F., & Noeth, E. (2016,
October). Wavelet-based time-frequency representations for automatic recognition of emotions from speech. In
Speech Communication; 12. ITG Symposium (pp. 1-5). VDE.
[28] Li, X., Li, X., Zheng, X., & Zhang, D. (2010). EMD-TEO based speech emotion recognition. In Life System
Modeling and Intelligent Computing (pp. 180-189). Springer, Berlin, Heidelberg.
[29] Daneshfar, F., Kabudian, S. J., & Neekabadi, A. (2020). Speech emotion recognition using hybrid spectral-
prosodic features of speech signal/glottal waveform, metaheuristic-based dimensionality reduction, and Gaussian
elliptical basis function network classifier. Applied Acoustics, 166, 107360.
[30] Palo, H. K., & Mohanty, M. N. (2018). Wavelet-based feature combination for recognition of emotions. Ain
Shams Engineering Journal, 9(4), 1799-1806.
[31] McCraty, R. "Science of the heart: Exploring the role of the heart in human performance (Vol. 2)". Boulder
Creek, CA: HeartMath Institute (2015).
[32] Zheng, W. (2016). Multichannel EEG-based emotion recognition via group sparse canonical correlation
analysis. IEEE Transactions on Cognitive and Developmental Systems, 9(3), 281-290.
[33] Jing, C., Liu, G., & Hao, M. (2009, July). The research on emotion recognition from ECG signal. In 2009
International Conference on Information Technology and Computer Science (Vol. 1, pp. 497-500). IEEE.
[34] Xiefeng, C., Wang, Y., Dai, S., Zhao, P., & Liu, Q. (2019). Heart sound signals can be used for emotion
recognition. Scientific reports, 9(1), 1-11.
[35] Wu, G., Liu, G., & Hao, M. (2010, October). The analysis of emotion recognition from GSR based on PSO. In
2010 International Symposium on Intelligence Information Processing and Trusted Computing (pp. 360-363). IEEE.
[36] Philippot, P., Chapelle, G., & Blairy, S. (2002). Respiratory feedback in the generation of emotion. Cognition &
Emotion, 16(5), 605-627.
[37] Hasnul, M. A., Alelyani, S., & Mohana, M. (2021). Electrocardiogram-Based Emotion Recognition Systems
and Their Applications in HealthcareA Review. Sensors, 21(15), 5015.
[38] Ferdinando, H., Seppänen, T., & Alasaarela, E. (2016, October). Comparing features from ECG pattern and
HRV analysis for emotion recognition system. In 2016 IEEE Conference on Computational Intelligence in
Bioinformatics and Computational Biology (CIBCB) (pp. 1-6). IEEE.
[39] Dissanayake, T., Rajapaksha, Y., Ragel, R., & Nawinne, I. (2019). An ensemble learning approach for
electrocardiogram sensor based human emotion recognition. Sensors, 19(20), 4495.
[40] Chen, Genlang, Yi Zhu, Zhiqing Hong, and Zhen Yang. "EmotionalGAN: generating ECG to enhance emotion
state classification." In Proceedings of the 2019 International Conference on Artificial Intelligence and Computer
Science, pp. 309-313. 2019.
[41] Panahi, F., Rashidi, S., & Sheikhani, A. (2021). Application of fractional Fourier transform in feature extraction
from ELECTROCARDIOGRAM and GALVANIC SKIN RESPONSE for emotion recognition. Biomedical Signal
Processing and Control, 69, 102863.
[42] Zhang, Qiang, Xianxiang Chen, Qingyuan Zhan, Ting Yang, and Shanhong Xia. "Respiration-based emotion
recognition with deep learning." Computers in Industry 92 (2017): 84-90.
[43] Das, K., & Pachori, R. B. (2021). Schizophrenia detection technique using multivariate iterative filtering and
multichannel EEG signals. Biomedical Signal Processing and Control, 67, 102525.
[44] Mühl, C., Allison, B., Nijholt, A., & Chanel, G. (2014). A survey of affective brain computer interfaces:
principles, state-of-the-art, and challenges. Brain-Computer Interfaces, 1(2), 66-84.
[45] Lin, Y. P., Wang, C. H., Jung, T. P., Wu, T. L., Jeng, S. K., Duann, J. R., & Chen, J. H. (2010). EEG-based
emotion recognition in music listening. IEEE Transactions on Biomedical Engineering, 57(7), 1798-1806.
[46] Li, Z., Tian, X., Shu, L., Xu, X., & Hu, B. (2017, August). Emotion recognition from EEG using RASM and
LSTM. In International Conference on Internet Multimedia Computing and Service (pp. 310-318). Springer,
Singapore.
[47] Mert, A., & Akan, A. (2018). Emotion recognition based on timefrequency distribution of EEG signals using
multivariate synchrosqueezing transform. Digital Signal Processing, 81, 106-115.
[48] Gupta, V., Chopda, M. D., & Pachori, R. B. (2018). Cross-subject emotion recognition using flexible analytic
wavelet transform from EEG signals. IEEE Sensors Journal, 19(6), 2266-2274.
[49] Ullah, H., Uzair, M., Mahmood, A., Ullah, M., Khan, S. D., & Cheikh, F. A. (2019). Internal emotion
classification using EEG signal with sparse discriminative ensemble. IEEE Access, 7, 40144-40153.
[50] Bhattacharyya, A., Tripathy, R. K., Garg, L., & Pachori, R. B. (2020). A novel multivariate-multiscale
approach for computing EEG spectral and temporal complexity for human emotion recognition. IEEE Sensors
Journal, 21(3), 3579-3591.
[51] Li, W., Zhang, Z., & Song, A. (2021). Physiological-signal-based emotion recognition: An odyssey from
methodology to philosophy. Measurement, 172, 108747.
[52] Lang, P. J., Bradley, M. M., & Cuthbert, B. N. (1997). International affective picture system (IAPS): Technical
manual and affective ratings. NIMH Center for the Study of Emotion and Attention, 1(39-58), 3.
[53] Yang, W., Makita, K., Nakao, T., Kanayama, N., Machizawa, M. G., Sasaoka, T., ... & Miyatani, M. (2018).
Affective auditory stimulus database: An expanded version of the International Affective Digitized Sounds (IADS-
E). Behavior Research Methods, 50(4), 1415-1429.
[54] Martínez-Tejada, L. A., Puertas-González, A., Yoshimura, N., & Koike, Y. (2021). Exploring EEG
Characteristics to Identify Emotional Reactions under Videogame Scenarios. Brain Sciences, 11(3), 378.
[55] Boateng, G., Sels, L., Kuppens, P., Lüscher, J., Scholz, U., & Kowatsch, T. (2020, April). Emotion elicitation
and capture among real couples in the lab. In 1st Momentary Emotion Elicitation & Capture workshop (MEEC
2020, cancelled). ETH Zurich, Department of Management, Technology, and Economics.
[56] Bradley, M. M., & Lang, P. J. (1994). Measuring emotion: the self-assessment manikin and the semantic
differential. Journal of Behavior Therapy and Experimental Psychiatry, 25(1), 49-59.
[57] Watson, D., Clark, L. A., & Tellegen, A. (1988). Development and validation of brief measures of positive and
negative affect: the PANAS scales. Journal of personality and social psychology, 54(6), 1063.
[58] Gross, J. J., & Levenson, R. W. (1995). Emotion elicitation using films. Cognition & emotion, 9(1), 87-108.
[59] Correa, J. A. M., Abadi, M. K., Sebe, N., & Patras, I. (2018). Amigos: A dataset for affect, personality and
mood research on individuals and groups. IEEE Transactions on Affective Computing.
[60] Zheng, W. L., & Lu, B. L. (2015). Investigating critical frequency bands and channels for EEG-based emotion
recognition with deep neural networks. IEEE Transactions on Autonomous Mental Development, 7(3), 162-175.
[61] Koelstra, S., Muhl, C., Soleymani, M., Lee, J. S., Yazdani, A., Ebrahimi, T., ... & Patras, I. (2011). Deap: A
database for emotion analysis; using physiological signals. IEEE Transactions on Affective Computing, 3(1), 18-31.
[62] Soleymani, M., Lichtenauer, J., Pun, T., & Pantic, M. (2011). A multimodal database for affect recognition and
implicit tagging. IEEE Transactions on Affective Computing, 3(1), 42-55.
[63] Subramanian, R., Wache, J., Abadi, M. K., Vieriu, R. L., Winkler, S., & Sebe, N. (2016). ASCERTAIN:
Emotion and personality recognition using commercial sensors. IEEE Transactions on Affective Computing, 9(2),
147-160.
[64] Correa, J. A. M., Abadi, M. K., Sebe, N., & Patras, I. (2018). Amigos: A dataset for affect, personality and
mood research on individuals and groups. IEEE Transactions on Affective Computing.
[65] Katsigiannis, S., & Ramzan, N. (2017). DREAMER: A database for emotion recognition through EEG and
ECG signals from wireless low-cost off-the-shelf devices. IEEE Journal of Biomedical and Health Informatics,
22(1), 98-107.
[66] Dragomiretskiy, K., & Zosso, D. (2013). Variational mode decomposition. IEEE Transactions On Signal
Processing, 62(3), 531-544.
[67] Gilles, J. (2013). Empirical wavelet transform. IEEE transactions on signal processing, 61(16), 3999-4010.
[68] Selesnick, I. W. (2011). Wavelet transform with tunable Q-factor. IEEE Transactions on Signal
Processing, 59(8), 3560-3575.
[69] Daubechies, I., Lu, J., & Wu, H. T. (2011). Synchrosqueezed wavelet transforms: An empirical mode
decomposition-like tool. Applied and Computational Harmonic Analysis, 30(2), 243-261.
[70] Singh, P., Joshi, S. D., Patney, R. K., & Saha, K. (2016). Fourier-based feature extraction for classification of
EEG signals using EEG rhythms. Circuits, Systems, and Signal Processing, 35(10), 3700-3715.
[71] Xu, D., & Erdogmuns, D. (2010). Renyi's entropy, divergence and their nonparametric estimators. In
Information Theoretic Learning (pp. 47-102). Springer, New York, NY.
[72] Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine learning, 20(3), 273-297.
[73] Suykens, J. A., & Vandewalle, J. (1999). Least squares support vector machine classifiers. Neural Processing
Letters, 9(3), 293-300.
[74] Tuncer, T., Dogan, S., & Subasi, A. (2021). A new fractal pattern feature generation function based emotion
recognition method using EEG. Chaos, Solitons & Fractals, 144, 110671.
[75] Li, Y., Huang, J., Zhou, H., & Zhong, N. (2017). Human emotion recognition with electroencephalographic
multidimensional features by hybrid deep neural networks. Applied Sciences, 7(10), 1060.
[76] Zhang, W., Wang, F., Jiang, Y., Xu, Z., Wu, S., & Zhang, Y. (2019, August). Cross-subject EEG-based
emotion recognition with deep domain confusion. In International Conference On Intelligent Robotics And
Applications (pp. 558-570). Springer, Cham.
[77] Zhong, P., Wang, D., & Miao, C. (2020). EEG-based emotion recognition using regularized graph neural
networks. IEEE Transactions on Affective Computing.
[78] Cimtay, Y., & Ekmekcioglu, E. (2020). Investigating the use of pretrained convolutional neural network on
cross-subject and cross-dataset EEG emotion recognition. Sensors, 20(7), 2034.
[79] Wang, F., Wu, S., Zhang, W., Xu, Z., Zhang, Y., Wu, C., & Coleman, S. (2020). Emotion recognition with
convolutional neural network and EEG-based EFDMs. Neuropsychologia, 146, 107506.
[80] Wei, C., Chen, L. L., Song, Z. Z., Lou, X. G., & Li, D. D. (2020). EEG-based emotion recognition using simple
recurrent units network and ensemble learning. Biomedical Signal Processing and Control, 58, 101756.
[81] Khateeb, M., Anwar, S. M., & Alnowami, M. (2021). Multi-Domain Feature Fusion for Emotion Classification
Using DEAP Dataset. IEEE Access, 9, 12134-12142.
[82] Haqque, R. H. D., Djamal, E. C., & Wulandari, A. (2021, September). Emotion Recognition of EEG Signals
Using Wavelet Filter and Convolutional Neural Networks. In 2021 8th International Conference on Advanced
Informatics: Concepts, Theory and Applications (ICAICTA) (pp. 1-6). IEEE.
[book] Daubechies, I. (1992). Ten lectures on wavelets. Society for industrial and applied mathematics.
... Sharma et al. [14] proposed the discrete wavelet transform as decomposition method to explore the nonlinear dynamics of each subband signal. Nalwaya et al. [15] proposed tunable Q-factor wavelet transform to separate EEG rhythms. Kritiprasanna et al. [16] proposed a EEG rhythms segmentation method using multivariate iterative filtering. ...
Article
Full-text available
Constructing reliable and effective models to recognize human emotional states has become an important issue in recent years. In this article, we propose a double way deep residual neural network combined with brain network analysis, which enables the classification of multiple emotional states. To begin with, we transform the emotional EEG signals into five frequency bands by wavelet transform and construct brain networks by inter-channel correlation coefficients. These brain networks are then fed into a subsequent deep neural network block which contains several modules with residual connection and enhanced by channel attention mechanism and spatial attention mechanism. In the second way of the model, we feed the emotional EEG signals directly into another deep neural network block to extract temporal features. At the end of the two ways, the features are concatenated for classification. To verify the effectiveness of our proposed model, we carried out a series of experiments to collect emotional EEG from eight subjects. The average accuracy of the proposed model on our emotional dataset is 94.57%. In addition, the evaluation results on public databases SEED and SEED-IV are 94.55% and 78.91%, respectively, demonstrating the superiority of our model in emotion recognition tasks.
... A two-hidden layer multilayer perceptron is used for multi class classification and SVM is used for binary classification. Nalwaya et al. [30], have used tunable Q-factor wavelet transform (TQWT) to separate various rhythms of EEG. Different statistical and information potential features are computed for each rhythm, which are then fed to SVM cubic classifier for emotion identification. ...
Article
Full-text available
Human dependence on computers is increasing day by day, thus human interaction with computers must be more dynamic and contextual rather than static or generalized. The development of such devices requires knowledge of the emotional state of the user interacting with it. Thus, for this purpose an emotion recognition system is required. Physiological signals namely, the electrocardiogram (ECG) and electroencephalogram (EEG) are being studied here for emotion recognition. This paper proposes novel entropy-based features in the Fourier-Bessel domain instead of the Fourier domain where frequency resolution is twice as compared to the later. Also, to represent such non-stationary signals, Fourier-Bessel series expansion (FBSE) is used. It has non-stationary basis functions which makes it more suitable than the Fourier representation. EEG and ECG signals are decomposed into narrow-band modes, using FBSE based empirical wavelet transform (FBSE-EWT). The proposed entropies of each mode are computed to form the feature vector which are further used to develop a machine learning models. The proposed emotion detection algorithm is evaluated using publicly available DREAMER dataset. K-nearest neighbors (KNN) classifier provides accuracies of 97.84%, 97.91%, and 97.86% for arousal, valence, and dominance classes, respectively. Finally, this paper concludes that the obtained entropy features are suitable for emotion recognition from given physiological signals.
Article
Full-text available
Affective computing is a field of study that integrates human affects and emotions with artificial intelligence into systems or devices. A system or device with affective computing is beneficial for the mental health and wellbeing of individuals that are stressed, anguished, or depressed. Emotion recognition systems are an important technology that enables affective computing. Currently, there are a lot of ways to build an emotion recognition system using various techniques and algorithms. This review paper focuses on emotion recognition research that adopted electrocardiograms (ECGs) as a unimodal approach as well as part of a multimodal approach for emotion recognition systems. Critical observations of data collection, pre-processing, feature extraction, feature selection and dimensionality reduction, classification, and validation are conducted. This paper also highlights the architectures with accuracy of above 90%. The available ECG-inclusive affective databases are also reviewed, and a popularity analysis is presented. Additionally, the benefit of emotion recognition systems towards healthcare systems is also reviewed here. Based on the literature reviewed, a thorough discussion on the subject matter and future works is suggested and concluded. The findings presented here are beneficial for prospective researchers to look into the summary of previous works conducted in the field of ECG-based emotion recognition systems, and for identifying gaps in the area, as well as in developing and designing future applications of emotion recognition systems, especially in improving healthcare.
Article
Full-text available
Emotions arise from activations of specialized neuronal populations in several parts of the cerebral cortex, notably the anterior cingulate, insula, ventromedial prefrontal, and subcortical structures, such as the amygdala, ventral striatum, putamen, caudate nucleus, and ventral tegmental area. Feelings are conscious, emotional experiences of these activations that contribute to neuronal networks mediating thoughts, language, and behavior, thus enhancing the ability to predict, learn, and reappraise stimuli and situations in the environment based on previous experiences. Contemporary theories of emotion converge around the key role of the amygdala as the central subcortical emotional brain structure that constantly evaluates and integrates a variety of sensory information from the surroundings and assigns them appropriate values of emotional dimensions, such as valence, intensity, and approachability. The amygdala participates in the regulation of autonomic and endocrine functions, decision-making and adaptations of instinctive and motivational behaviors to changes in the environment through implicit associative learning, changes in short- and long-term synaptic plasticity, and activation of the fight-or-flight response via efferent projections from its central nucleus to cortical and subcortical structures.
Article
Full-text available
In the last 14 months, COVID-19 made face-to-face meetings impossible and this has led to rapid growth in videoconferencing. As highly social creatures, humans strive for direct interpersonal interaction, which means that in most of these video meetings the webcam is switched on and people are “looking each other in the eyes”. However, it is far from clear what the psychological consequences of this shift to virtual face-to-face communication are and if there are methods to alleviate “videoconferencing fatigue”. We have studied the influence of emotions of meeting participants on the perceived outcome of video meetings. Our experimental setting consisted of 35 participants collaborating in eight teams over Zoom in a one semester course on Collaborative Innovation Networks in bi-weekly video meetings, where each team presented its progress. Emotion was tracked through Zoom face video snapshots using facial emotion recognition that recognized six emotions (happy, sad, fear, anger, neutral, and surprise). Our dependent variable was a score given after each presentation by all participants except the presenter. We found that the happier the speaker is, the happier and less neutral the audience is. More importantly, we found that the presentations that triggered wide swings in “fear” and “joy” among the participants are correlated with a higher rating. Our findings provide valuable input for online video presenters on how to conduct better and less tiring meetings; this will lead to a decrease in “videoconferencing fatigue”.
Article
One of the most significant fields in the man–machine interface is emotion recognition using facial expressions. Some of the challenges in emotion recognition are facial accessories, non-uniform illumination, pose variations, etc. Emotion detection using conventional approaches has the drawback of mutual optimization of feature extraction and classification. To overcome this problem, researchers are paying more attention to deep learning techniques. Nowadays, deep-learning approaches play a major role in classification tasks. This paper deals with emotion recognition using transfer learning approaches. In this work, pre-trained ResNet50, VGG19, Inception V3, and MobileNet networks are used. The fully connected layers of the pre-trained ConvNets are eliminated, and new fully connected layers suited to the number of classes in the task are added. Only the newly added layers are trainable, and their weights are updated during training. The experiment was conducted using the CK+ database and achieved an average accuracy of 96% on the emotion detection problem.
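As a rough illustration of the transfer-learning setup described above (a frozen pre-trained backbone with a newly added, trainable fully connected head), a minimal PyTorch sketch is given below. ResNet50 is used as the example backbone; the class count and head sizes are assumptions, and data loading and the training loop are omitted.

    import torch
    import torch.nn as nn
    from torchvision import models

    NUM_CLASSES = 7  # assumption: emotion classes in a CK+-style setup

    # Load a pre-trained backbone (ResNet50 is one of the networks listed above)
    backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

    # Freeze all pre-trained weights
    for param in backbone.parameters():
        param.requires_grad = False

    # Replace the original fully connected layer with a new head sized for the task;
    # only these newly added layers remain trainable
    backbone.fc = nn.Sequential(
        nn.Linear(backbone.fc.in_features, 256),
        nn.ReLU(),
        nn.Dropout(0.5),
        nn.Linear(256, NUM_CLASSES),
    )

    # Only the parameters of the new head are passed to the optimizer
    optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()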
Article
In this article we present a study of electroencephalography (EEG) traits for the emotion recognition process using a videogame as a stimulus tool, considering two kinds of emotion-related information: arousal–valence self-assessment answers from participants, and game events that represented positive and negative emotional experiences in the videogame context. We performed a statistical analysis using Spearman's correlation between the EEG traits and the emotional information. We found that EEG traits had strong correlations with arousal and valence scores; also, the common EEG traits with strong correlations belonged to the theta band of the central channels. We then implemented a regression algorithm with feature selection to predict arousal and valence scores using EEG traits, achieving better results for arousal regression than for valence regression. The EEG traits selected for arousal and valence regression belonged to the time domain (standard deviation, complexity, mobility, kurtosis, skewness) and the frequency domain (power spectral density (PSD) and differential entropy (DE) from the theta, alpha, beta, gamma, and full EEG frequency spectrum). Addressing game events, we found that EEG traits related to the theta, alpha, and beta bands had strong correlations. In addition, distinctive event-related potentials were identified in the presence of both types of game events. Finally, we implemented a classification algorithm to discriminate between positive and negative events using EEG traits to identify emotional information. We obtained good classification performance using only two traits related to the frequency domain, on the theta band and on the full EEG spectrum.
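A minimal sketch of the Spearman-correlation step described above is shown below, assuming a hypothetical per-trial EEG feature matrix and self-assessed arousal scores; the "strong correlation" threshold is an assumption chosen only for illustration.

    import numpy as np
    from scipy.stats import spearmanr

    rng = np.random.default_rng(0)
    eeg_features = rng.normal(size=(40, 10))  # 40 trials x 10 EEG traits (placeholder)
    arousal = rng.integers(1, 10, size=40)    # self-assessed arousal scores (placeholder)

    # Spearman correlation of every trait with the arousal score
    for idx in range(eeg_features.shape[1]):
        rho, p = spearmanr(eeg_features[:, idx], arousal)
        if p < 0.05 and abs(rho) > 0.5:       # "strong correlation" threshold (assumption)
            print(f"trait {idx}: rho={rho:.2f}, p={p:.3f}")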
Article
Facial emotion recognition (FER) systems play a significant role in identifying driver emotions. Accurate facial emotion recognition of drivers in autonomous vehicles reduces road rage. However, training even an advanced FER model without proper datasets causes poor performance in real-time testing. FER system performance is affected more heavily by the quality of the datasets than by the quality of the algorithms. To improve FER system performance for autonomous vehicles, we propose a facial image threshing (FIT) machine that uses advanced features of pre-trained facial recognition and training from the Xception algorithm. The FIT machine involves removing irrelevant facial images, collecting facial images, correcting misplaced face data, and merging original datasets on a massive scale, in addition to the data-augmentation technique. The final FER results of the proposed method improved the validation accuracy by 16.95% over the conventional approach with the FER 2013 dataset. The confusion matrix evaluation based on an unseen private dataset shows a 5% improvement over the original approach with the FER 2013 dataset, confirming the real-time testing performance.
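Of the steps listed above, only the data-augmentation technique lends itself to a short, generic sketch; the FIT machine's image filtering, relabeling, and dataset merging are not reproduced here. A possible torchvision augmentation pipeline, with assumed parameter values, is:

    from torchvision import transforms

    # Typical augmentations for facial-emotion training images (values are assumptions)
    augment = transforms.Compose([
        transforms.Resize((224, 224)),              # match the pre-trained input size
        transforms.RandomHorizontalFlip(p=0.5),
        transforms.RandomRotation(degrees=10),
        transforms.ColorJitter(brightness=0.2, contrast=0.2),
        transforms.ToTensor(),
    ])
    # `augment` would be applied to each face image when building the training dataset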
Article
A new approach for extending univariate iterative filtering (IF), which decomposes a signal into intrinsic mode functions (IMFs) or oscillatory modes, to multivariate multi-component signals is proposed. Additionally, the paper proposes a method to detect schizophrenia (Sz) based on analysing multi-channel electroencephalogram (EEG) signals. Using the proposed multivariate iterative filtering (MIF), multi-channel EEG data are decomposed into multivariate IMFs (MIMFs). Depending on their mean frequency, the IMFs are grouped in order to separate the EEG rhythms (delta, theta, alpha, beta, gamma) from the EEG signals. Features such as Hjorth parameters are extracted from the EEG rhythms. The extracted features are ranked using Student's t-test, and the thirty most discriminant features are used for classification. Different classifiers, such as K-nearest neighbors (KNN), linear discriminant analysis (LDA), and support vector machine (SVM) with different kernels, are considered to classify Sz and healthy EEG patterns. The proposed method is employed to evaluate 19-channel EEG signals recorded from 14 paranoid Sz patients and 14 healthy subjects. The highest accuracy of 98.9% is achieved using the SVM (cubic) classifier. The sensitivity, specificity, positive predictive value (PPV), and area under the ROC curve (AUC) of the same classifier are 99.1%, 98.8%, 98.4%, and 0.999, respectively. The proposed MIF approach is computationally efficient compared to other multivariate signal decomposition algorithms. This paper presents a framework for decomposing multivariate signals efficiently and builds a model for detecting Sz accurately.
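The Hjorth parameters mentioned above have standard definitions (activity, mobility, complexity), which the following minimal NumPy sketch computes on a single rhythm segment; the synthetic signal is only a placeholder for a real EEG rhythm.

    import numpy as np

    def hjorth_parameters(x):
        """Hjorth activity, mobility, and complexity of a 1-D signal."""
        dx = np.diff(x)                    # first difference
        ddx = np.diff(dx)                  # second difference
        activity = np.var(x)
        mobility = np.sqrt(np.var(dx) / np.var(x))
        complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
        return activity, mobility, complexity

    # Example on a synthetic 10 Hz "rhythm" segment (placeholder for a real EEG rhythm)
    t = np.linspace(0, 2, 512)
    rhythm = np.sin(2 * np.pi * 10 * t) + 0.1 * np.random.default_rng(0).normal(size=t.size)
    print(hjorth_parameters(rhythm))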
Article
Emotion recognition in real time using electroencephalography (EEG) signals plays a key role in human-computer interaction and affective computing. Existing emotion recognition models, which use stimuli such as music and pictures in controlled lab settings and a limited number of emotion classes, have low ecological validity. Moreover, for effective emotion recognition, identifying significant EEG features and electrodes is important. In our proposed model, we use the DEAP dataset, consisting of physiological signals collected from 32 participants as they watched 40 movie clips (each of 60 seconds). The main objective of this study is to explore multi-domain (time, wavelet, and frequency) features and hence identify the set of stable features that contribute towards emotion classification across a larger number of emotion classes. Our proposed model is able to identify nine classes of emotions, including happy, pleased, relaxed, excited, neutral, calm, distressed, miserable, and depressed, with an average accuracy of 65.92%. Towards this end, we use a support vector machine as the classifier along with 10-fold and leave-one-out cross-validation techniques. We achieve a significant emotion classification accuracy, which could be vital towards developing solutions for affective computing that deal with a larger number of emotional states.
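A minimal scikit-learn sketch of the evaluation protocol described above (an SVM classifier assessed with both 10-fold and leave-one-out cross-validation) is given below; the feature matrix, labels, and SVM hyperparameters are placeholders rather than the study's actual settings.

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.model_selection import cross_val_score, StratifiedKFold, LeaveOneOut

    rng = np.random.default_rng(0)
    X = rng.normal(size=(180, 32))          # placeholder multi-domain feature matrix
    y = np.repeat(np.arange(9), 20)         # placeholder labels for nine emotion classes

    # Standardize features, then classify with an SVM
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))

    kfold_acc = cross_val_score(clf, X, y, cv=StratifiedKFold(n_splits=10, shuffle=True, random_state=0))
    loo_acc = cross_val_score(clf, X, y, cv=LeaveOneOut())

    print(f"10-fold accuracy: {kfold_acc.mean():.3f}")
    print(f"leave-one-out accuracy: {loo_acc.mean():.3f}")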
Article
Emotion recognition from physiological signals plays an essential role in human-computer interaction and affective computing. This paper studies the effectiveness of the fractional Fourier transform (FrFT) as a novel feature extraction method for improving the accuracy of emotion recognition from physiological signals. Emotion detection is performed in two dimensions, arousal and valence, using electrocardiogram (ECG) and galvanic skin response (GSR) signals recorded in the ASCERTAIN database. Features extracted in the FrFT, time, and frequency domains are classified using two binary SVMs. The results suggest the usefulness of the phase information of the FrFT coefficients in arousal and valence detection, and above-chance emotion recognition is achieved with both ECG and GSR signals. A comparison of the time-domain features, frequency-domain features, and their combination shows that FrFT features are more discriminative for emotion detection. The best recognition accuracies in valence and arousal, 78.32% and 76.83% respectively, are achieved from the phase information of the FrFT coefficients using the ECG signal.
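Since no standard NumPy/SciPy routine implements the fractional Fourier transform, the sketch below uses a placeholder frft() helper (here simply the ordinary FFT) to illustrate only the surrounding steps described above: extracting phase-based features from the coefficients and training separate binary SVMs for arousal and valence. All names and parameter values are assumptions.

    import numpy as np
    from sklearn.svm import SVC

    def frft(x, order):
        """Placeholder fractional Fourier transform of the given order.
        A real chirp-based implementation would go here; the ordinary FFT
        is used only so the sketch runs end to end."""
        return np.fft.fft(x)

    def phase_features(signal, order=0.5, n_bins=8):
        """Summarize the phase of the FrFT coefficients into a short feature vector."""
        phase = np.angle(frft(signal, order))
        bins = np.array_split(phase, n_bins)
        return np.array([b.mean() for b in bins] + [b.std() for b in bins])

    rng = np.random.default_rng(0)
    ecg_epochs = rng.normal(size=(60, 1024))          # placeholder ECG epochs
    arousal_labels = rng.integers(0, 2, size=60)      # placeholder binary labels
    valence_labels = rng.integers(0, 2, size=60)

    X = np.array([phase_features(e) for e in ecg_epochs])
    arousal_svm = SVC(kernel="rbf").fit(X, arousal_labels)   # one binary SVM per dimension
    valence_svm = SVC(kernel="rbf").fit(X, valence_labels)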