Citation: Nalwaya, A.; Das, K.; Pachori, R.B. Automated Emotion Identification Using Fourier–Bessel Domain-Based Entropies. Entropy 2022, 24, 1322. https://doi.org/10.3390/e24101322
Academic Editor: Daniel Abasolo
Received: 7 August 2022
Accepted: 16 September 2022
Published: 20 September 2022
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Automated Emotion Identification Using Fourier–Bessel
Domain-Based Entropies
Aditya Nalwaya *, Kritiprasanna Das and Ram Bilas Pachori
Department of Electrical Engineering, Indian Institute of Technology Indore, Indore 453552, India
*Correspondence: adityanalwaya@iiti.ac.in
Abstract: Human dependence on computers is increasing day by day; thus, human interaction with computers must be more dynamic and contextual rather than static or generalized. The development of such devices requires knowledge of the emotional state of the user interacting with them; for this purpose, an emotion recognition system is required. Physiological signals, specifically the electrocardiogram (ECG) and electroencephalogram (EEG), are studied here for the purpose of emotion recognition. This paper proposes novel entropy-based features in the Fourier–Bessel domain instead of the Fourier domain, as the frequency resolution of the former is twice that of the latter. Further, to represent such non-stationary signals, the Fourier–Bessel series expansion (FBSE) is used, whose non-stationary basis functions make it more suitable than the Fourier representation. EEG and ECG signals are decomposed into narrow-band modes using the FBSE-based empirical wavelet transform (FBSE-EWT). The proposed entropies of each mode are computed to form the feature vector, which is further used to develop machine learning models. The proposed emotion detection algorithm is evaluated using the publicly available DREAMER dataset. The K-nearest neighbors (KNN) classifier provides accuracies of 97.84%, 97.91%, and 97.86% for the arousal, valence, and dominance classes, respectively. Finally, this paper concludes that the obtained entropy features are suitable for emotion recognition from the given physiological signals.
Keywords: Fourier–Bessel series expansion; spectral entropy; ECG; EEG; FBSE-EWT
1. Introduction
Nowadays, humans are becoming increasingly dependent on computers for their various day-to-day tasks. Thus, computers need to be user-friendly and engaging rather than merely logically or computationally efficient [1]. To make a computer more user-friendly, it must recognize the emotion of the user interacting with it, as emotion is the most basic component responsible for human attention, action, and behavior in a particular situation [2]; therefore, an emotion recognition system is very useful for applications related to human–computer interaction (HCI) [3]. Further, such HCI systems can be very useful in areas related to mental health care [4], smart entertainment [5], and assistive systems for physically disabled persons [6].
Emotion recognition systems generally follow one of two approaches for recognizing emotion. The explicit approach relies on facial expressions, speech, etc., which are visible and can be easily captured for analysis; however, the signals obtained with this approach are not very reliable, because the subject under consideration can hide his/her emotions, so these signals might not reflect the actual emotional state of the subject. The alternative is the implicit approach, in which physiological signals, such as the electroencephalogram (EEG), electrocardiogram (ECG), and galvanic skin response (GSR), are captured and analyzed. Such signals are not visible and are generated inside the body by the autonomic nervous system (ANS). Any change in the emotional state of a person is reflected in the respiration rate, body temperature, heart rate, and other physiological signals, which are not under the control of the subject [7].
The brain is the central command center. It reacts to any external stimuli by firing neurons, causing changes in ANS activity, which in turn alters the routine activity of the heart and other peripheral organs. In addition, a change in the emotional state affects heart activity [8]. EEG and ECG signals are biopotentials that reflect the activities of the brain and heart, respectively. Thus, in this study, the physiological signals EEG and ECG are considered to recognize the emotion of the subject. In the literature, some other physiological signals have also been used to determine the emotional state of a person, such as changes in the conductivity of the skin, which are measured using GSR [9–11]. In [12], the authors extracted time–frequency domain features using the fractional Fourier transform (FrFT); the most relevant features were selected using the Wilcoxon method and then given as input to a support vector machine (SVM) classifier to determine the emotional state of the person. Similarly, the respiration rate also helps in determining the emotional state of the person. In [13], the authors used a deep learning (DL)-based feature extraction method and logistic regression to determine different emotional states from the obtained features.
In recent years, several emotion recognition frameworks have been established using different physiological signals. The authors in [14] proposed a methodology that uses ECG signal features derived from different intrinsic mode functions (IMFs) for emotion classification: the IMFs are obtained using bivariate empirical mode decomposition (BEMD), and the instantaneous frequency and local oscillation features are computed and given as input to a linear discriminant classifier to discriminate different emotional states. In [15], a method is proposed to determine negative emotion from a single-channel ECG signal. Different features are extracted from the ECG signals: linear features (e.g., the mean of the RR intervals), nonlinear features based on bispectral analysis and the power content in different frequency bands, time-domain features consisting of different statistical parameters of the ECG signal, and time–frequency domain features obtained using wavelet-based signal decomposition. Based on these features, different emotions are detected using several machine learning classifiers. The authors of [16] extracted four different types of features from the ECG signal, i.e., heart rate variability (HRV), within-beat (WIB), frequency spectrum, and signal decomposition-based features; ensemble classifiers are used to evaluate the performance of the obtained features. In [17], the authors extracted ECG signal features at different time scales using wavelet scattering, and the obtained features are given as input to various classifiers to evaluate their performance.
Similarly, different emotion recognition techniques have been proposed based on EEG signals. Anuragi et al. [18] proposed an EEG-based emotion detection framework. Using the Fourier–Bessel series expansion (FBSE)-based empirical wavelet transform (EWT) (FBSE-EWT) method, the EEG signals are decomposed into four sub-band signals, from which features such as energy and entropy are extracted. The features are selected using neighborhood component analysis (NCA), and the emotion class is identified using different classifiers, such as k-nearest neighbors (KNN), ensemble bagged tree, and artificial neural network (ANN). Sharma et al. [19] used the discrete wavelet transform (DWT) for EEG signal decomposition; third-order cumulant nonlinear features are extracted from each sub-band, the feature dimensionality is reduced using swarm optimization, and a long short-term memory-based technique is used to classify the emotional states. Bajaj et al. [20] used the multiwavelet transform to decompose EEG signals into narrow sub-signals; features computed from a 3-D phase space diagram using the Euclidean distance are extracted, and a multiclass least squares support vector machine (MC-LS-SVM) classifier is used to identify the emotion class. The authors in [21] computed features, namely entropy and a ratio-of-norms-based measure, from the sub-signals obtained after decomposing the EEG signal using multiwavelet decomposition; the feature matrix is given as input to an MC-LS-SVM to determine the emotional state.
In [22], features related to changes in the spectral power of the EEG signal are extracted using a short-time Fourier transform (STFT). The best 55 features among 60 are selected using the F-score; a feature vector formed from the selected features is given as input to an SVM classifier to determine the emotional state of the subject. Liu et al. [23] extracted the fractal dimension and higher-order crossings (HOC) as features via a sliding window approach from the given EEG signal; based on these features, the emotion class is determined using an SVM classifier. In [24], the authors decomposed EEG signals into different sub-bands using the DWT; entropy and energy features are calculated from the obtained wavelet coefficients, and emotional states are finally obtained using SVM and KNN classifiers. The authors in [25] used a multi-channel signal processing technique called the multivariate synchrosqueezing transform (MSST) to represent EEG signals in the time–frequency domain. Independent component analysis (ICA) is used to reduce the dimensionality of the obtained high-dimensional feature matrix, and its performance is compared with that of non-negative matrix factorization (NMF), an alternative feature selection method. The reduced feature matrix is passed to different classifiers, such as SVM, KNN, and ANN, to discriminate different emotions.
Gupta et al. used the flexible analytic wavelet transform to decompose EEG signals into narrow sub-bands [26]. From each sub-band, the information potential (IP) is computed using Rényi's quadratic entropy, and the extracted features are smoothed using a moving average filter; a random forest classifier is used to determine the different emotion classes. Ullah et al. segmented long-duration EEG signals into smaller epochs [27]; from each segment, features, specifically the Fisher information ratio, entropy, statistical parameters, and the Petrosian fractal dimension, are computed, and the resulting feature vector is passed to a sparse discriminative ensemble learning (SDEL) classifier to identify the emotion class. Bhattacharyya et al. decomposed EEG signals into different modes using FBSE-EWT [28]; from each mode, KNN entropy, multiscale multivariate Hilbert marginal spectrum (MHMS), and spectral Shannon entropy features are calculated, smoothed, and fed to a classifier called sparse autoencoder-based random forest (ARF) to determine the class of human emotion. Nilima et al. used the empirical mode decomposition (EMD) method to decompose EEG signals [29]; second-order difference plot (SODP) features are calculated from each IMF, a two-hidden-layer multilayer perceptron is used for multiclass classification, and an SVM is used for binary classification. Nalwaya et al. [30] used the tunable Q-factor wavelet transform (TQWT) to separate the various rhythms of the EEG; different statistical and information potential features are computed for each rhythm, which are then fed to a cubic SVM classifier for emotion identification.
Further, some studies use both ECG and EEG signals. In [31], the authors recorded EEG and ECG signals while the subjects were exposed to immersive virtual environments to elicit emotion. From the ECG signal, time-domain, frequency-domain, and nonlinear features are calculated, whereas band power and mean phase connectivity features are calculated from the EEG signal. The EEG and ECG features are combined to form a single feature matrix, whose dimensionality is then reduced using principal component analysis (PCA); the reduced matrix is passed to an SVM classifier to determine the emotional state of the subject.
The literature review of previous studies on human emotion identification shows emerging research trends in finding appropriate signal decomposition techniques, distinctive features, and machine learning algorithms to classify emotional states. Most previous studies used a single modality, i.e., either EEG, ECG, or other peripheral physiological signals, and very few studies have been conducted using multiple modalities. Moreover, there is still scope for improvement over the classification accuracy of the earlier proposed methods.
In this article, an automated emotion identification framework using EEG and ECG signals is developed. The Fourier–Bessel (FB) domain is explored instead of the traditional Fourier domain because of the various advantages associated with the former. FBSE uses Bessel functions as basis functions; their amplitudes decay with time and are damped in nature, which makes them more suitable for non-stationary signal analysis. In addition, the FBSE spectrum has twice the frequency resolution of the Fourier spectral representation. Considering these advantages, FBSE-EWT is used instead of EWT to extract the different modes from a given signal. New FB-domain spectral entropies, namely Shannon spectral entropy (SSE), log energy entropy (LEE), and Wiener entropy (WE), are proposed and used as features computed from the obtained modes. The feature values are then smoothed by applying a moving average, and the smoothed features are given as input to SVM and KNN classifiers for emotion class identification. The block diagram of the proposed methodology is shown in Figure 1.
Figure 1. Block diagram for the proposed emotion detection methodology (EEG/ECG signals → FBSE-EWT-based signal decomposition → feature extraction → feature smoothing → classifier → emotional state).
The rest of the article is organized as follows: In Section 2, the materials and methodology are discussed in detail. In Section 3, the results obtained after applying the proposed methodology are presented. Section 4 discusses the performance of the proposed method with the help of the obtained results, compares it with other existing methodologies, and highlights some of its limitations. Finally, Section 5 concludes the article.
2. Materials and Methods
The emotion recognition framework presented in this article consists of preprocessing,
signal processing, feature extraction, feature smoothing, and classification steps.
2.1. Dataset Description
A publicly available dataset, DREAMER, is used to evaluate the proposed methodology. It contains raw EEG and ECG signals recorded from 23 healthy participants while they watched audio–visual clips. Emotions are quantified in terms of three different scales: arousal, valence, and dominance [32]. Each participant was shown 18 different clips of varying durations, with a mean duration of 199 s. EEG was recorded using the Emotiv EPOC system, which contains 16 gold-plated dry electrodes placed in accordance with the international 10–20 system; the 14 recording channels are AF3, AF4, F3, F4, F7, F8, FC5, FC6, T7, T8, P7, P8, O1, and O2, and the two reference electrodes, M1 and M2, were placed over the mastoids, as described in [32]. To obtain the ECG signals, electrodes were placed along two vector directions, i.e., the right arm–left leg (RA-LL) vector and the left arm–left leg (LA-LL) vector, and the ECG was recorded using the SHIMMER ECG sensor. The EEG and ECG signals were recorded at different sampling rates, 128 Hz and 256 Hz, respectively. Sample EEG and ECG signals obtained from the dataset are shown in Figure 2 for the different classes of emotion (high arousal (HA) or low arousal (LA), high dominance (HD) or low dominance (LD), and high valence (HV) or low valence (LV)). The dataset also contains the rating of each video on a scale of 1 to 5. Each participant rated each video according to the level of emotion elicited in three different dimensions, i.e., valence (pleasantness level), arousal (excitement level), and dominance (level of control). Ratings were binarized for each dimension with 3 as the threshold: a rating from 1 to 3 was labeled '0' (low) and a rating above 3 was labeled '1' (high) [33,34].
Figure 2. Epochs of raw EEG and ECG signals obtained from the DREAMER dataset (EEG and ECG epochs of the LA, HA, LD, HD, LV, and HV classes; amplitude versus sample number).
2.2. Preprocessing
At this stage, noise and artifacts are typically removed from the raw signal with the help of filters and denoising algorithms. As in [32], only the last 60 s of each signal was analyzed. The signals were then segmented into small epochs of 1 s, which were used for further processing, and the mean value was subtracted from each epoch.
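The segmentation step above amounts to a few lines of array manipulation. The following is a minimal sketch (not the paper's code), assuming the raw recording is a 1-D NumPy array and using the DREAMER EEG sampling rate of 128 Hz; the function name `make_epochs` is ours.

```python
import numpy as np

def make_epochs(signal, fs, keep_last_s=60, epoch_s=1):
    """Keep the last `keep_last_s` seconds of `signal` (sampled at `fs` Hz),
    split it into non-overlapping epochs of `epoch_s` seconds, and remove
    the mean of each epoch."""
    signal = np.asarray(signal, dtype=float)
    tail = signal[-keep_last_s * fs:]              # last 60 s of the recording
    n_epochs = keep_last_s // epoch_s
    epochs = tail.reshape(n_epochs, epoch_s * fs)  # one epoch per row
    return epochs - epochs.mean(axis=1, keepdims=True)

# Example: a 199 s EEG channel sampled at 128 Hz, as in DREAMER
eeg = np.random.randn(199 * 128)
epochs = make_epochs(eeg, fs=128)
print(epochs.shape)  # (60, 128)
```

The same function applies to the ECG channels with `fs=256`, giving 60 epochs of 256 samples each.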
2.3. FBSE-EWT
Analyzing non-stationary signals such as EEG and ECG is difficult because the signals are time-varying in nature and their properties change continuously. Decomposing such signals into simpler narrow-band components makes their properties easier to understand; therefore, the preprocessed signal is given to a signal decomposition algorithm, which decomposes the input EEG and ECG signals into various modes.
FBSE-EWT is an improved version of the EWT, which has adaptive basis functions derived from the signal. FBSE-EWT has been used in many biomedical signal processing applications, such as epileptic seizure detection [35,36], valvular heart disease diagnosis [37], and posterior myocardial infarction detection [38]. The flow of the FBSE-EWT technique is shown in Figure 3, and its step-by-step working is as follows:
Figure 3. Flow diagram of FBSE-EWT-based signal decomposition (input signal → FBSE spectrum → scale-space representation → boundary detection → EWT-based filter bank → modes 1 to M).
1. FBSE spectrum: FBSE has twice the frequency resolution of the Fourier domain. The FBSE spectrum of a signal $s(n)$ of length $S$ samples can be obtained using zero-order Bessel functions. The magnitude of the FB coefficients $K(i)$ can be computed mathematically as follows [39–42]:

$$K(i) = \frac{2}{S^2 [J_1(\beta_i)]^2} \sum_{n=0}^{S-1} n\, s(n)\, J_0\!\left(\frac{\beta_i n}{S}\right) \quad (1)$$

where $J_0(\cdot)$ is the zero-order and $J_1(\cdot)$ the first-order Bessel function. The positive roots of the zero-order Bessel function, arranged in ascending order, are denoted by $\beta_i$, where $i = 1, 2, 3, \ldots, S$. A one-to-one relation between the order $i$ and the continuous frequency $f_i$ is given by [39,42]

$$\beta_i \approx \frac{2\pi f_i S}{f_s} \quad (2)$$

where $\beta_i \approx \beta_{i-1} + \pi \approx i\pi$ and $f_s$ denotes the sampling frequency. Here, $f_i$ is the continuous frequency corresponding to the $i$th order; combining the two relations gives [39]

$$i \approx \frac{2 f_i S}{f_s} \quad (3)$$

To cover the whole bandwidth of $s(n)$, $i$ must vary from 1 to $S$. The FBSE spectrum is the plot of the magnitude of the FB coefficients $K(i)$ versus the frequency $f_i$.
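As an illustration of Eqs. (1)–(3), the FBSE spectrum can be computed directly with NumPy and SciPy. This is a sketch under our own naming and vectorization choices, not the paper's implementation; the $O(S^2)$ direct sum is acceptable for 1 s epochs.

```python
import numpy as np
from scipy.special import j0, j1, jn_zeros

def fbse_spectrum(s, fs):
    """FBSE magnitude spectrum of a length-S signal `s` following Eq. (1):
    K(i) = 2/(S^2 [J1(b_i)]^2) * sum_n n*s(n)*J0(b_i*n/S),
    with frequencies f_i = b_i*fs/(2*pi*S) from Eq. (2)."""
    s = np.asarray(s, dtype=float)
    S = len(s)
    beta = jn_zeros(0, S)                  # first S positive roots of J0
    n = np.arange(S)
    K = np.array([2.0 / (S**2 * j1(b)**2) * np.sum(n * s * j0(b * n / S))
                  for b in beta])          # one coefficient per root
    freqs = beta * fs / (2 * np.pi * S)    # continuous frequencies f_i
    return freqs, np.abs(K)

# A 10 Hz sinusoid sampled at 128 Hz for 1 s should peak near 10 Hz,
# around order i = 2*f*S/fs = 20 (Eq. (3)).
fs = 128
t = np.arange(128) / fs
freqs, mag = fbse_spectrum(np.sin(2 * np.pi * 10 * t), fs)
print(freqs[np.argmax(mag)])
```

Note that the $S$ FBSE coefficients cover the same bandwidth that only $S/2$ Fourier bins would cover, which is the doubled frequency resolution mentioned above.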
2. Scale-space-based boundary detection [39,42,43]: For the FBSE spectrum, a scale-space representation can be obtained by convolving the spectrum with a Gaussian kernel, which is expressed as follows:

$$\vartheta(i, q) = \sum_{n=-N}^{N} K(i-n)\, g(n; q) \quad (4)$$

where $g(n; q) = \frac{1}{\sqrt{2\pi q}}\, e^{-\frac{n^2}{2q}}$. Here, $N = \lceil W\sqrt{q}\rceil + 1$ with $3 \le W \le 6$, and $q$ is the scale parameter. As the scale parameter increases in steps $q = \varepsilon q_0$, with $\varepsilon = 1, 2, \ldots, \varepsilon_{\max}$, the number of minima decreases and no new minima appear in the scale-space plane. The FBSE spectrum, which ranges from 0 to $\pi$, is segmented using the boundary detection technique into the segments $[0, \omega_1], [\omega_1, \omega_2], \ldots, [\omega_{i-1}, \pi]$, where $\omega_1, \omega_2, \ldots, \omega_{i-1}$ are the boundaries. Typically, boundaries are defined between two local minima, obtained from those curves in the scale-space plane whose length is greater than a threshold computed using the Otsu method [44].
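The scale-space behavior that this step relies on can be illustrated with a toy example. The sketch below (ours, and deliberately simplified: it only demonstrates that Gaussian smoothing at increasing scale removes spurious minima, omitting the curve-length tracking and Otsu thresholding of the actual method) shows a noisy two-lobe "spectrum" whose true valley survives as the scale grows.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def count_local_minima(y):
    """Number of strict interior local minima of a 1-D array."""
    i = np.arange(1, len(y) - 1)
    return int(np.sum((y[i] < y[i - 1]) & (y[i] < y[i + 1])))

# Noisy "spectrum" with two dominant lobes and one true valley between them
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 400)
spec = (np.exp(-((x - 0.25) / 0.06) ** 2)
        + np.exp(-((x - 0.75) / 0.06) ** 2)
        + 0.05 * rng.standard_normal(x.size))

# Smoothing with Gaussian kernels of growing scale q (std = sqrt(q)):
# minima only disappear, never appear; the persistent one marks the
# candidate filter-bank boundary.
counts = [count_local_minima(gaussian_filter1d(spec, np.sqrt(q)))
          for q in (1.0, 9.0, 64.0)]
print(counts)
```

In the full method, each minimum's persistence across scales is recorded, and only minima whose scale-space curves exceed the Otsu-selected threshold become the boundaries $\omega_j$.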
3. EWT filter-bank design: After the boundaries are obtained, the empirical scaling and wavelet functions are adjusted based on these parameters, and different band-pass filters are designed. The empirical scaling functions $\Phi_j(\omega)$ and wavelet functions $\nu_j(\omega)$ are constructed, and their mathematical expressions are given by [45]:

$$\Phi_j(\omega) = \begin{cases} 1 & \text{if } |\omega| \le (1-\tau)\omega_j \\ \cos\!\left[\frac{\pi \gamma(\tau, \omega_j)}{2}\right] & \text{if } (1-\tau)\omega_j \le |\omega| \le (1+\tau)\omega_j \\ 0 & \text{otherwise} \end{cases} \quad (5)$$

$$\nu_j(\omega) = \begin{cases} 1 & \text{if } (1+\tau)\omega_j \le |\omega| \le (1-\tau)\omega_{j+1} \\ \cos\!\left[\frac{\pi \gamma(\tau, \omega_{j+1})}{2}\right] & \text{if } (1-\tau)\omega_{j+1} \le |\omega| \le (1+\tau)\omega_{j+1} \\ \sin\!\left[\frac{\pi \gamma(\tau, \omega_j)}{2}\right] & \text{if } (1-\tau)\omega_j \le |\omega| \le (1+\tau)\omega_j \\ 0 & \text{otherwise} \end{cases} \quad (6)$$

where $\gamma(\tau, \omega_j) = \delta\!\left[\frac{|\omega| - (1-\tau)\omega_j}{2\tau\omega_j}\right]$. To obtain tight frames, the parameters $\delta$ and $\tau$ are defined as

$$\delta(x) = \begin{cases} 0 & \text{if } x \le 0 \\ 1 & \text{if } x \ge 1 \end{cases}, \qquad \delta(x) + \delta(1-x) = 1 \;\; \forall\, x \in [0, 1] \quad (7)$$

$$\tau < \min_j \left[\frac{\omega_{j+1} - \omega_j}{\omega_{j+1} + \omega_j}\right] \quad (8)$$
4. Filtering: The expression for the detail coefficients from the EWT filter bank for the analyzed signal $a(m)$ is given by [45]

$$D_{a,\nu}(j, n) = \int a(m)\, \nu_j(m - n)\, dm \quad (9)$$

The approximation coefficients from the EWT filter bank are computed by [45]

$$A_{a,\Phi}(0, n) = \int a(m)\, \Phi_1(m - n)\, dm \quad (10)$$

The $j$th empirical mode is obtained by convolving the wavelet function with the detail coefficients, where $j = 1, 2, \ldots, M$ indexes the different empirical modes. The original signal can be reconstructed by adding all $M$ reconstructed modes and one low-frequency component; mathematically, these are expressed as follows [45]:

$$r_j(n) = D_{a,\nu}(j, n) \star \nu_j(n) \quad (11)$$

$$r_0(n) = A_{a,\Phi}(0, n) \star \Phi_1(n) \quad (12)$$

where $\star$ denotes the convolution operation.
2.4. Feature Extraction
Entropy is the most widely used parameter for understanding and quantifying the information associated with any dynamically changing phenomenon [46]. Thus, to capture the non-stationary behavior of the physiological signals considered in this study, entropies are used as the feature set. The advantages of the FBSE representation are that the Bessel basis functions decay with time, providing a more suitable representation of non-stationary signals, and that the FBSE has twice the frequency resolution of the Fourier spectrum. In addition, entropies have been used in many previous studies on emotion recognition [26,28,42,47]. The SSE, WE, and LEE are defined as follows:
2.4.1. SSE
Uniformity in the distribution of signal energy can be measured using spectral entropy, a measure of uncertainty derived from Shannon's expression. The SSE is defined based on the energy spectrum $E_i$ obtained using FBSE. The energy corresponding to the $i$th-order FB coefficient $K(i)$ is defined mathematically as [48]

$$E_i = \frac{[K(i)]^2\, S^2\, [J_1(\beta_i)]^2}{2} \quad (13)$$

The SSE is expressed mathematically as [49]

$$H_{SSE} = -\sum_{i=1}^{S} P(i) \log_2(P(i)) \quad (14)$$

where $S$ is the total number of FBSE coefficients and $P(i)$, the normalized energy distribution over the $i$th order, is mathematically defined as

$$P(i) = \frac{E_i}{\sum_{i=1}^{S} E_i} \quad (15)$$
2.4.2. WE
The WE is another measure of the flatness of the distribution of signal spectral power and is also called the spectral flatness measure. It is calculated as the ratio of the geometric mean to the arithmetic mean of the energy spectrum $E_i$ of the FB coefficients and is expressed as [50]

$$H_{WE} = \frac{S\,\sqrt[S]{\prod_{i=1}^{S} E_i}}{\sum_{i=1}^{S} E_i} \quad (16)$$

The WE is a unitless quantity; its value is a pure number ranging from 0 to 1, where 1 represents a uniform (flat) spectrum and 0 indicates a pure tone.
2.4.3. LEE
Another variant of information measurement, the LEE, is defined as the logarithm of $P(i)$ in the FB domain and is given by [51]

$$H_{LEE} = \sum_{i=1}^{S} \log_2(P(i)) \quad (17)$$
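The three entropies of Eqs. (13)–(17) reduce to a few NumPy operations once the energy spectrum of a mode is available. The following sketch uses our own function name; the energies `E` would come from the FB coefficients of a decomposed mode via Eq. (13).

```python
import numpy as np

def fb_entropies(E):
    """SSE, WE, and LEE from an FB energy spectrum E_i (Eqs. (13)-(17))."""
    E = np.asarray(E, dtype=float)
    P = E / E.sum()                                # Eq. (15): normalized energies
    sse = -np.sum(P * np.log2(P))                  # Eq. (14): Shannon spectral entropy
    we = np.exp(np.mean(np.log(E))) / np.mean(E)   # Eq. (16): geometric / arithmetic mean
    lee = np.sum(np.log2(P))                       # Eq. (17): log energy entropy
    return sse, we, lee

# A flat spectrum of S = 8 bins has maximal SSE (log2 8 = 3) and WE = 1
sse, we, lee = fb_entropies(np.ones(8))
print(sse, we, lee)  # 3.0 1.0 -24.0
```

For the WE, the identity $(\prod_i E_i)^{1/S} = \exp(\frac{1}{S}\sum_i \ln E_i)$ is used to avoid overflow in the product of many energies.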
2.5. Feature Smoothing
The brain is the control center of all human activities and internal functioning. A typical EEG recording is thus a combination of various brain activities and other noise related either to the environment or to the body [52]. Rapid fluctuations in feature values may arise from this noise. Human emotion changes gradually [26]; to reduce the effect of such variations on the emotion-related feature values, a moving average filter with a window size of 3 samples is utilized. The effect of the moving average filter on the raw feature values of the first channel can be seen in Figure 4.
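A window-3 moving average over a per-epoch feature sequence is a one-line convolution. This is a sketch; the zero-padded edge handling implied by `mode='same'` is our choice, as the paper does not specify its edge behavior.

```python
import numpy as np

def smooth(feature, window=3):
    """Moving-average smoothing of a per-epoch feature sequence."""
    kernel = np.ones(window) / window
    return np.convolve(feature, kernel, mode='same')

raw = np.array([1.0, 4.0, 1.0, 4.0, 1.0])
print(smooth(raw))  # interior values: [2. 3. 2.]
```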
2.6. Classifiers
The feature vectors extracted from the signals are used for the classification of different emotions. Based on the feature vector, the classifier discriminates the data into the high and low levels of each emotion dimension, i.e., HV or LV, HA or LA, and HD or LD. In this study, an SVM with a cubic kernel and KNN are used independently for the purpose of classification.
SVM is a supervised machine learning algorithm that classifies data by first learning from labeled data belonging to different classes. The main principle behind the classifier is to find a decision boundary, formed by a hyperplane, that separates the data into two classes. The hyperplane is constructed by training the classifier with sample data; its optimum location depends on the support vectors, which are the points nearest to the decision boundary [53]. The SVM iteratively optimizes the location of the hyperplane to maximize the margin, i.e., the total separation between the two classes. An SVM can be a linear or nonlinear classifier depending on the kernel used. Generally, a linear SVM separates the data into two groups with a straight line, flat plane, or N-dimensional hyperplane, but in certain situations a nonlinear boundary separates the data more efficiently. For a nonlinear classifier, the kernel can be a polynomial, hyperbolic tangent function, or Gaussian radial basis function. In this work, an SVM classifier with a cubic kernel is used. The generalized mathematical expression defining the hyperplane of the SVM classifier can be expressed as follows [54]:

$$f(x) = \mathrm{sign}\left[\sum_{i=1}^{R} b_i f_i K(x, x_i) + c\right] \quad (18)$$

where $b_i$ is a positive real constant, $R$ is the total number of observations, $c$ is a real constant, $K(x, x_i)$ is a kernel or feature space, $x_i$ is the input vector, and the output is denoted by $f_i$. For a linear feature space, $K(x, x_i) = x_i^T x$; for a polynomial SVM of order $p$, the feature space is defined by $K(x, x_i) = (x_i^T x + 1)^p$, i.e., $p = 2$ for a quadratic polynomial and $p = 3$ for a cubic polynomial.
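The cubic kernel $(x_i^T x + 1)^3$ corresponds to scikit-learn's polynomial kernel $(\gamma\, x_i^T x + r)^{p}$ with `degree=3`, `gamma=1`, and `coef0=1`. The sketch below is ours (toy data, not the DREAMER features) and simply shows how such a classifier would be configured for one high/low emotion dimension.

```python
import numpy as np
from sklearn.svm import SVC

# Cubic kernel K(x, xi) = (xi^T x + 1)^3, matched by scikit-learn's
# polynomial kernel with degree=3, gamma=1, coef0=1.
clf = SVC(kernel='poly', degree=3, gamma=1.0, coef0=1.0)

# Tiny separable toy problem standing in for one high/low dimension
X = np.array([[-2.0, -2.0], [-1.5, -2.2], [-2.2, -1.4], [-1.0, -1.0],
              [2.0, 2.0], [1.5, 2.2], [2.2, 1.4], [1.0, 1.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
clf.fit(X, y)
print(clf.score(X, y))
```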
Figure 4. Feature values obtained from the signal epochs (left) and their smoothed versions (right) for the SSE, LEE, and WE features.
The KNN is a non-parametric, supervised machine learning algorithm used for data classification and regression [55]. It categorizes data points into different groups based on their distance from a number of nearest neighbors. To classify any particular sample, the KNN algorithm follows these steps:
Step 1: Compute the distance between the sample and the other samples using any one of the distance metrics, such as the Euclidean, Mahalanobis, or Minkowski distance.
Step 2: Sort the distances obtained in the first step in ascending order and keep the top k samples with the minimum distance from the current sample.
Step 3: Assign to the sample the class that occurs most frequently among these nearest neighbors.
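The three steps above can be sketched directly in NumPy (our own minimal implementation with the Euclidean metric, not a production classifier):

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=1):
    """KNN classification following the three steps above."""
    d = np.linalg.norm(X_train - x, axis=1)        # Step 1: distances
    nearest = np.argsort(d)[:k]                    # Step 2: k smallest
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return int(labels[np.argmax(counts)])          # Step 3: majority class

X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([0.95, 1.0]), k=1))  # 1
```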
3. Results
In order to evaluate the performance of the proposed methodology, the publicly available DREAMER dataset is used [32]. In this section, the results obtained after applying the proposed emotion identification method to the DREAMER dataset are discussed in detail. The raw data are segmented into epochs of one second; an EEG epoch is 128 samples long and an ECG epoch is 256 samples long. These epochs are decomposed into four modes using FBSE-EWT, as shown in Figure 5, and the corresponding filter bank is shown in Figure 6. The decomposition of the EEG and ECG epochs yields a minimum of four modes, so the required number of modes is set to four to obtain a uniform number of features across the different observations. From each mode, three features are extracted. The EEG data consist of 14 channels; therefore, the dimension of the feature vector is 168 (= 14 × 3 × 4). Similarly, the two ECG channels yield a 24-element feature vector. With 18 trials for each of the 23 participants, there are a total of 24,840 (= 60 × 18 × 23) observations or epochs. The final dimensions of the feature matrices are thus 24,840 × 168 and 24,840 × 24 for EEG and ECG, respectively. For the multimodal case, the EEG and ECG features are combined, giving a feature matrix with a dimension of 24,840 × 192. This feature matrix is classified using two classifiers: SVM and KNN. For KNN (k = 1), the Euclidean distance is used as the distance metric; for the SVM, a cubic kernel is used.
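The dimension bookkeeping above can be checked directly (values taken from the setup described in this section):

```python
# Feature-matrix bookkeeping for the DREAMER setup described above:
# 4 modes x 3 entropies per channel; 14 EEG + 2 ECG channels;
# 60 one-second epochs x 18 clips x 23 participants.
modes, feats = 4, 3
eeg_dim = 14 * feats * modes   # features per EEG epoch
ecg_dim = 2 * feats * modes    # features per ECG epoch
n_epochs = 60 * 18 * 23        # total observations
print(eeg_dim, ecg_dim, eeg_dim + ecg_dim, n_epochs)  # 168 24 192 24840
```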
Figure 5. FBSE-EWT based signal decomposition of (a) EEG and (b) ECG epochs into different modes (amplitude versus sample number for the epoch and Modes 1-4).
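The three features computed from each mode are the FB-domain entropies (SSE, WE, and LEE). As a simplified stand-in, the sketch below computes them from the ordinary FFT power spectrum rather than the FBSE magnitude spectrum used in the paper, so the numbers are illustrative only; the log bases and exact normalizations here are assumptions of this sketch:

```python
import numpy as np

def spectral_entropies(x):
    """SSE, WE, and LEE of a signal, computed from its FFT power spectrum
    (the paper computes these from the Fourier-Bessel spectrum of each mode)."""
    psd = np.abs(np.fft.rfft(x)) ** 2
    p = psd / psd.sum()                                      # normalized spectral distribution
    eps = np.finfo(float).eps                                # guard against log(0)
    sse = -np.sum(p * np.log2(p + eps))                      # Shannon spectral entropy
    we = np.exp(np.mean(np.log(psd + eps))) / np.mean(psd)   # Wiener entropy (spectral flatness)
    lee = np.sum(np.log2(p + eps))                           # log energy entropy
    return sse, we, lee

rng = np.random.default_rng(1)
noise = rng.standard_normal(128)                             # flat spectrum -> WE near 1
tone = np.sin(2 * np.pi * 10 * np.arange(128) / 128)         # single peak -> WE near 0
print(spectral_entropies(noise)[1] > spectral_entropies(tone)[1])  # -> True
```

The contrast between the broadband noise and the single tone shows why these entropies separate signals with different spectral concentration.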
Performance was evaluated using the Classification Learner app in MATLAB R2022a. The system used for computing has an Intel i7 CPU with 8 GB of RAM. For the three dimensions of emotion, i.e., arousal, dominance, and valence, three separate binary classifications are performed. Thus, each classifier groups data into either high or low arousal, high or low dominance, or high or low valence classes. The low and high classes are decided based on the rating given by the participant on a scale of 1 to 5, with 3 as the threshold, i.e., a rating ≤ 3 is considered low and anything above it is considered high [33,34], due to which unbalanced classes are obtained for some participants [32]. Further, the performance of the proposed method is evaluated first using the features obtained from EEG signals, then ECG signals, and then using the combined features.
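The thresholding of the self-assessment ratings into binary labels is a one-line rule; a trivial sketch:

```python
def label_rating(rating):
    """Map a 1-5 self-assessment rating to a binary class: <= 3 -> low (0), > 3 -> high (1)."""
    return 0 if rating <= 3 else 1

print([label_rating(r) for r in [1, 2, 3, 4, 5]])  # -> [0, 0, 0, 1, 1]
```

Because three of the five rating values map to "low", the resulting classes can be unbalanced for participants whose ratings cluster on one side of the threshold.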
Figure 6. FBSE-EWT based filter bank used for decomposing (a) EEG and (b) ECG epochs into different modes (filters corresponding to different modes are shown in different colors; magnitude versus frequency in Hz).
Features obtained from the EEG and ECG signals of different subjects are combined, and the classifiers are trained and tested in a subject-independent manner. Both SVM and KNN are evaluated using five-fold cross-validation, where feature observations are assigned to folds randomly, independent of the subject. For the EEG features and the arousal dimension, the accuracy obtained for SVM is 81.53% and for KNN it is 97.50%. Performance was evaluated similarly for the dominance and valence dimensions, and the results are summarized in Tables 1-3. The tables list the classification results obtained using the different modalities, i.e., EEG, ECG, and the combined multimodal approach; in each case, the best results are obtained with the multimodal KNN configuration. The confusion matrices for the three emotion dimension classifications based on EEG, ECG, and multimodal signals are shown in Figures 7-9.
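The paper uses MATLAB's Classification Learner for this evaluation; an equivalent sketch in Python with scikit-learn is shown below. Synthetic data stand in for the entropy feature matrix, and the "cubic SVM" is expressed as a degree-3 polynomial kernel:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Stand-in features; in the paper X would be the 24,840 x 192 entropy feature matrix.
X, y = make_classification(n_samples=600, n_features=24, random_state=0)

knn = KNeighborsClassifier(n_neighbors=1, metric="euclidean")  # KNN (k = 1), Euclidean
svm = SVC(kernel="poly", degree=3)                             # cubic-kernel SVM

results = {}
for name, clf in [("KNN", knn), ("SVM", svm)]:
    results[name] = cross_val_score(clf, X, y, cv=5).mean()    # five-fold cross-validation
    print(name, round(results[name], 3))
```

Note that randomly assigning epochs to folds, as done here and in the paper, mixes epochs of the same subject across training and test folds; a stricter subject-wise split would use `GroupKFold` with subject IDs as groups.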
Table 1. Performance with different modalities for arousal emotion dimension.

Modality | Model | Accuracy * | Sensitivity * | Specificity * | Precision * | F1 Score *
EEG | KNN (k = 1) | 97.50 | 97.89 | 97.01 | 97.67 | 97.78
EEG | SVM (Cubic) | 81.53 | 82.04 | 80.81 | 86.02 | 83.98
ECG | KNN (k = 1) | 69.94 | 73.17 | 65.73 | 73.56 | 73.37
ECG | SVM (Cubic) | 63.84 | 65.09 | 61.35 | 77.11 | 70.59
Multimodal | KNN (k = 1) | 97.84 | 98.10 | 97.51 | 98.06 | 98.08
Multimodal | SVM (Cubic) | 84.54 | 84.75 | 84.24 | 88.45 | 86.56
* in percent.
Accuracy-related parameters, namely sensitivity, specificity, precision, and F1 score, are also shown in Tables 1-3. From these tables, it may be observed that the accuracies obtained from EEG and multimodal features are almost similar; however, the multimodal approach is still preferable, as all the other classifier reliability parameters are slightly better than those of the unimodal EEG signal. The following mathematical expressions have been used for calculating sensitivity, specificity, precision, and F1 score [53]:
Sensitivity = TP/(TP + FN) (19)

Specificity = TN/(FP + TN) (20)

Precision = TP/(TP + FP) (21)

F1 Score = 2TP/(2TP + FP + FN) (22)

where TP is the number of true positives, TN the number of true negatives, FP the number of false positives, and FN the number of false negatives.
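Equations (19)-(22) translate directly into code; the counts below are round illustrative numbers, not values from the paper's confusion matrices:

```python
def classification_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, precision, and F1 score from binary confusion counts
    (Equations (19)-(22))."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (fp + tn)
    precision = tp / (tp + fp)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return sensitivity, specificity, precision, f1

# Illustrative counts only.
metrics = classification_metrics(tp=90, tn=80, fp=10, fn=20)
print(tuple(round(m, 3) for m in metrics))  # -> (0.818, 0.889, 0.9, 0.857)
```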
Table 2. Performance with different modalities for dominance emotion dimension.

Modality | Model | Accuracy * | Sensitivity * | Specificity * | Precision * | F1 Score *
EEG | KNN (k = 1) | 97.68 | 97.89 | 97.44 | 97.63 | 97.76
EEG | SVM (Cubic) | 81.53 | 82.04 | 80.81 | 86.02 | 83.98
ECG | KNN (k = 1) | 69.25 | 70.33 | 68.08 | 70.57 | 70.45
ECG | SVM (Cubic) | 62.30 | 62.90 | 61.56 | 66.81 | 64.80
Multimodal | KNN (k = 1) | 97.91 | 97.95 | 97.86 | 98.02 | 97.99
Multimodal | SVM (Cubic) | 84.12 | 83.74 | 84.55 | 86.15 | 84.93
* in percent.
Table 3. Performance with different modalities for valence emotion dimension.

Modality | Model | Accuracy * | Sensitivity * | Specificity * | Precision * | F1 Score *
EEG | KNN (k = 1) | 97.55 | 97.97 | 96.89 | 97.98 | 97.98
EEG | SVM (Cubic) | 81.49 | 82.03 | 80.47 | 88.97 | 85.36
ECG | KNN (k = 1) | 68.82 | 74.59 | 60.22 | 73.69 | 74.13
ECG | SVM (Cubic) | 63.31 | 66.03 | 55.28 | 81.31 | 72.88
Multimodal | KNN (k = 1) | 97.86 | 98.29 | 97.19 | 98.17 | 98.23
Multimodal | SVM (Cubic) | 84.10 | 84.77 | 82.90 | 89.93 | 87.28
* in percent.
Figure 7. Confusion matrix for arousal emotion dimension (panels: EEG, ECG, and multimodal features, each with the KNN and SVM classifiers; axes are true class versus predicted class).
Figure 8. Confusion matrix for dominance emotion dimension.
Figure 9. Confusion matrix for valence emotion dimension (panels: EEG, ECG, and multimodal features, each with the KNN and SVM classifiers; axes are true class versus predicted class).
4. Discussion
Although significant research has been carried out on emotion recognition in recent years, the task remains challenging because of the fuzzy distinctions among different emotions. The study presented in this article demonstrated an approach to automated human emotion identification that categorizes each of the three dimensions of emotion, i.e., arousal, valence, and dominance, into a high or a low class. In order to retrieve emotion-related information from the physiological signals, the FBSE-EWT signal decomposition method is used. Then, various FB-based entropies, namely SSE, WE, and LEE, are computed from the modes obtained after signal decomposition. Figures 10 and 11 show boxplots of these entropy features obtained from EEG and ECG, respectively. The plots show how the feature values are distributed among the different classes; the differences in interquartile range between the classes explain the good classification accuracy obtained. The extracted features are then fed to two different classifiers: SVM and KNN. The results above show that the KNN classifier achieves higher accuracy than the SVM classifier. Further, the multimodal scenario is found to be more beneficial for reliably identifying the emotional states of the subject. A statistical significance test was performed to show that the accuracy improvement of the multimodal emotion recognition model is significant [56]. An analysis of variance (ANOVA) test was performed on the five-fold accuracies of the different modalities, i.e., EEG, ECG, and multimodal. Figure 12 shows the accuracies for the high and low states of arousal, dominance, and valence using boxplots. The p-values of the statistical test for arousal, dominance, and valence are 1.33 × 10⁻¹⁸, 2.82 × 10⁻²¹, and 1.50 × 10⁻¹⁷, respectively, which indicates a significant improvement in accuracy for the multimodal emotion detection model. The proposed methodology is compared with various existing approaches for identifying emotional states, and it is found to have superior performance, as can be seen in Table 4. In [57], various EEG- and ECG-based features, such as power spectral density (PSD), HRV, entropy, EEG-topographical image-based, and ECG spectrogram image-based DL features, are calculated. Topic et al. [58] used EEG signals for emotion detection, deriving holographic features and selecting only the relevant channels to maximize model accuracy. In [59], frame-level features are derived from preprocessed EEG signals, from which effective features are extracted using a teacher-student framework. In [60], the deep canonical correlation analysis (DCCA) method is proposed for emotion recognition. From the table, it may be noted that many recent studies perform emotion detection within DL-based frameworks [34,58,60-62], as such frameworks automatically find the most significant features; however, it is difficult to interpret the reasons behind the results obtained. Moreover, DL models are computationally complex and require a large amount of data to train [28]. Feature extraction in the other existing state-of-the-art methods is a more complex process and less accurate than in the study presented here, whose encouraging results differentiate it from the existing studies. Thus, the proposed method has several advantages over the listed DL-based feature extraction methods, such as being easy to comprehend, less complex, and more reliable. A disadvantage of the proposed multimodal approach is that its time complexity is higher than that of the single EEG-only modality; still, the increased reliability makes it more suitable. As the proposed methodology has only been tested on a small dataset, it can be evaluated on a larger database in the future to verify its validity. Furthermore, a fixed 60 s duration of the physiological signals was used in this study; the selection of the time portion could also be made adaptive, which could make the methodology more efficient.
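The ANOVA comparison described above can be sketched as follows; the per-fold accuracies below are hypothetical stand-ins (the paper does not report the individual fold scores), so only the shape of the computation is meaningful:

```python
import numpy as np
from scipy.stats import f_oneway

# Hypothetical five-fold accuracies (in %) for each modality.
eeg = np.array([97.4, 97.5, 97.6, 97.5, 97.5])
ecg = np.array([69.8, 70.1, 69.9, 70.0, 69.9])
multimodal = np.array([97.8, 97.9, 97.8, 97.9, 97.8])

# One-way ANOVA: do the mean accuracies differ across the three modalities?
stat, p = f_oneway(eeg, ecg, multimodal)
print(p < 0.05)  # -> True: the group means differ significantly
```

A small p-value here says only that the modality means differ; pairwise post hoc tests would be needed to attribute the difference specifically to the multimodal group.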
Table 4. Comparison of existing methodologies applied on the DREAMER dataset with the proposed emotion recognition method.

Authors (Year) | Methodology | Modality | LA-HA | LD-HD | LV-HV
Katsigiannis et al. [32] (2018) | PSD features and SVM classifier | EEG & ECG | 62.32 | 61.84 | 62.49
Song et al. [62] (2018) | DGCNN | EEG | 84.54 | 85.02 | 86.23
Zhang et al. [34] (2019) | PSD features and GCB-net classifier | EEG | 89.32 | 89.20 | 86.99
Bhattacharyya et al. [28] (2020) | MFBSE-EWT based entropy features and ARF classifier | EEG | 85.4 | 86.2 | 84.5
Cui et al. [61] (2020) | RACNN | EEG | 97.01 | - | 95.55
Kamble et al. [63] (2021) | DWT, EMD-based features, and CML and EML-based classifiers | EEG | 93.79 | - | 94.5
Li et al. [64] (2021) | 3DFR-DFCN | EEG | 75.97 | 85.14 | 82.68
Siddharth et al. [57] (2022) | PSD, HRV, entropy, DL-based features, and LSTM-based classifier | EEG & ECG | 79.95 | - | 79.95
Topic et al. [58] (2022) | Holographic features and CNN | EEG | 92.92 | 92.97 | 90.76
Gu et al. [59] (2022) | FLTSDP | EEG | 90.61 | 91.00 | 91.54
Liu et al. [60] (2022) | DCCA | EEG & ECG | 89.00 | 90.7 | 90.6
Proposed work | FBSE-EWT-based entropy features and KNN classifier | EEG & ECG | 97.84 | 97.91 | 97.86

Accuracy (%): LA-HA = low vs. high arousal, LD-HD = low vs. high dominance, LV-HV = low vs. high valence. PSD = power spectral density, SVM = support vector machine, DGCNN = dynamical graph convolutional neural network, GCB-net = graph convolutional broad network, MFBSE-EWT = multivariate FBSE-EWT, ARF = adaptive random forest, RACNN = regional-asymmetric convolutional neural network, DWT = discrete wavelet transform, EMD = empirical mode decomposition, CML = conventional machine learning, 3DFR-DFCN = 3-D feature representation and dilated fully convolutional network, HRV = heart rate variability, DL = deep learning, LSTM = long short-term memory, CNN = convolutional neural network, FLTSDP = frame-level teacher-student framework with data privacy, DCCA = deep canonical correlation analysis.
Figure 10. Box plots for the features calculated using EEG signals (SSE, LEE, and WE for the high/low arousal, dominance, and valence classes).
Figure 11. Box plots for the features calculated using ECG signals (SSE, LEE, and WE for the high/low arousal, dominance, and valence classes).
Figure 12. Box plots showing statistical significance of accuracies (in %) for the different modalities (EEG, ECG, and multimodal): (a) arousal, (b) dominance, and (c) valence.
5. Conclusions
This study presented an automated emotion identification method based on a multimodal approach using FB domain-based entropies. The publicly available DREAMER dataset was used to evaluate the performance of the proposed algorithm. The physiological signals obtained from the dataset, namely EEG and ECG, are decomposed into four modes using the FBSE-EWT. From these modes, several new FB-based entropy features, namely the FB spectral-based SSE, LEE, and WE, were computed. The dataset uses a three-dimensional emotional space, i.e., arousal, valence, and dominance, and each dimension is categorized into high and low classes based on the ratings provided with the dataset. The proposed method using the KNN classifier provides the highest accuracies of 97.84%, 97.91%, and 97.86% for the arousal, dominance, and valence emotion dimensions, respectively. Compared with the results obtained by existing methods, this is a significant improvement. Moreover, the multimodal approach is found to be the most accurate in identifying human emotional states.
Author Contributions:
A.N., K.D. and R.B.P. contributed equally to this work. All authors have read
and agreed to the published version of the manuscript.
Funding:
This study is supported by the Council of Scientific & Industrial Research (CSIR) funded
Research Project, Government of India, Grant number: 22(0851)/20/EMR-II.
Institutional Review Board Statement: Not applicable.
Data Availability Statement: The EEG and ECG data are provided by Stamos Katsigiannis and were collected at the University of the West of Scotland.
Conflicts of Interest: The authors declare no conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
ANN Artificial neural network
ANOVA Analysis of variance
ANS Autonomous nervous system
ARF Autoencoder-based random forest
BEMD Bivariate empirical mode decomposition
DCCA Deep canonical correlation analysis
DL Deep learning
DWT Discrete wavelet transform
ECG Electrocardiogram
EEG Electroencephalogram
EMD Empirical mode decomposition
EWT Empirical wavelet transform
FB Fourier–Bessel
FBSE Fourier–Bessel series expansion
FBSE-EWT FBSE-based EWT
FN False negative
FP False positive
FrFT Fractional Fourier transform
GSR Galvanic skin response
HA High arousal
HCI Human–computer interaction
HD High dominance
HOC Higher-order crossing
HRV Heart rate variability
HV High valence
ICA Independent component analysis
IMF Intrinsic mode functions
IP Information potential
KNN K-nearest neighbors
LA Low arousal
LA-LL Left arm and left leg
LD Low dominance
LEE Log energy entropy
LV Low valence
MC-LS-SVM Multiclass least squares support vector machines
MHMS Multivariate Hilbert marginal spectrum
MSST Multivariate synchrosqueezing transform
NCA Neighborhood component analysis
NMF Non-negative matrix factorization
PCA Principal component analysis
PSD Power spectral density
RA-LL Right arm and left leg
SDEL Sparse discriminative ensemble learning
SODP Second order difference plot
SSE Shannon spectral entropy
STFT Short time Fourier transform
SVM Support vector machines
TN True negative
TP True positive
TQWT Tunable Q-factor wavelet transform
WE Wiener entropy
WIB With-in beat
References
1. Ptaszynski, M.; Dybala, P.; Shi, W.; Rzepka, R.; Araki, K. Towards context aware emotional intelligence in machines: Computing contextual appropriateness of affective states. In Proceedings of the Twenty-First International Joint Conference on Artificial Intelligence (IJCAI-09), Pasadena, CA, USA, 11-17 July 2009; AAAI: Menlo Park, CA, USA; pp. 1469-1474.
2. Vingerhoets, A.; Nyklícek, I.; Denollet, J. Emotion Regulation; Springer: Berlin/Heidelberg, Germany, 2008.
3. Kroupi, E.; Yazdani, A.; Ebrahimi, T. EEG correlates of different emotional states elicited during watching music videos. In Proceedings of the International Conference on Affective Computing and Intelligent Interaction, Memphis, TN, USA, 9-12 October 2011; Springer: Berlin/Heidelberg, Germany, 2011; pp. 457-466.
4. Jin, L.; Kim, E.Y. Interpretable cross-subject EEG-based emotion recognition using channel-wise features. Sensors 2020, 20, 6719.
5. Kołakowska, A.; Landowska, A.; Szwoch, M.; Szwoch, W.; Wrobel, M.R. Emotion recognition and its applications. In Human-Computer Systems Interaction: Backgrounds and Applications 3; Springer: Berlin/Heidelberg, Germany, 2014; pp. 51-62.
6. Šimić, G.; Tkalčić, M.; Vukić, V.; Mulc, D.; Španić, E.; Šagud, M.; Olucha-Bordonau, F.E.; Vukšić, M.; Hof, P.R. Understanding emotions: Origins and roles of the amygdala. Biomolecules 2021, 11, 823.
7. Doukas, C.; Maglogiannis, I. Intelligent pervasive healthcare systems. In Advanced Computational Intelligence Paradigms in Healthcare-3; Springer: Berlin/Heidelberg, Germany, 2008; pp. 95-115.
8. McCraty, R. Science of the Heart: Exploring the Role of the Heart in Human Performance; HeartMath Research Center, Institute of HeartMath: Boulder Creek, CA, USA, 2015.
9. Filippini, C.; Di Crosta, A.; Palumbo, R.; Perpetuini, D.; Cardone, D.; Ceccato, I.; Di Domenico, A.; Merla, A. Automated affective computing based on bio-signals analysis and deep learning approach. Sensors 2022, 22, 1789.
10. Kipli, K.; Latip, A.A.A.; Lias, K.; Bateni, N.; Yusoff, S.M.; Tajudin, N.M.A.; Jalil, M.; Ray, K.; Shamim Kaiser, M.; Mahmud, M. GSR signals features extraction for emotion recognition. In Proceedings of Trends in Electronics and Health Informatics, Singapore, 22 March 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 329-338.
11. Dutta, S.; Mishra, B.K.; Mitra, A.; Chakraborty, A. An analysis of emotion recognition based on GSR signal. ECS Trans. 2022, 107, 12535.
12. Panahi, F.; Rashidi, S.; Sheikhani, A. Application of fractional Fourier transform in feature extraction from electrocardiogram and galvanic skin response for emotion recognition. Biomed. Signal Process. Control 2021, 69, 102863.
13. Zhang, Q.; Chen, X.; Zhan, Q.; Yang, T.; Xia, S. Respiration-based emotion recognition with deep learning. Comput. Ind. 2017, 92, 84-90.
14. Agrafioti, F.; Hatzinakos, D.; Anderson, A.K. ECG pattern analysis for emotion detection. IEEE Trans. Affect. Comput. 2011, 3, 102-115.
15. Cheng, Z.; Shu, L.; Xie, J.; Chen, C.P. A novel ECG-based real-time detection method of negative emotions in wearable applications. In Proceedings of the 2017 International Conference on Security, Pattern Analysis, and Cybernetics (SPAC), Shenzhen, China, 15-17 December 2017; IEEE: Washington, DC, USA, 2017; pp. 296-301.
16. Dissanayake, T.; Rajapaksha, Y.; Ragel, R.; Nawinne, I. An ensemble learning approach for electrocardiogram sensor based human emotion recognition. Sensors 2019, 19, 4495.
17. Sepúlveda, A.; Castillo, F.; Palma, C.; Rodriguez-Fernandez, M. Emotion recognition from ECG signals using wavelet scattering and machine learning. Appl. Sci. 2021, 11, 4945.
18. Anuragi, A.; Sisodia, D.S.; Pachori, R.B. EEG-based cross-subject emotion recognition using Fourier-Bessel series expansion based empirical wavelet transform and NCA feature selection method. Inf. Sci. 2022, 610, 508-524.
19. Sharma, R.; Pachori, R.B.; Sircar, P. Automated emotion recognition based on higher order statistics and deep learning algorithm. Biomed. Signal Process. Control 2020, 58, 101867.
20. Bajaj, V.; Pachori, R.B. Human emotion classification from EEG signals using multiwavelet transform. In Proceedings of the 2014 International Conference on Medical Biometrics, Shenzhen, China, 30 May-1 June 2014; IEEE: Washington, DC, USA, 2014; pp. 125-130.
21. Bajaj, V.; Pachori, R.B. Detection of human emotions using features based on the multiwavelet transform of EEG signals. In Brain-Computer Interfaces; Springer: Berlin/Heidelberg, Germany, 2015; pp. 215-240.
22. Lin, Y.P.; Wang, C.H.; Jung, T.P.; Wu, T.L.; Jeng, S.K.; Duann, J.R.; Chen, J.H. EEG-based emotion recognition in music listening. IEEE Trans. Biomed. Eng. 2010, 57, 1798-1806.
23. Liu, Y.; Sourina, O. Real-time subject-dependent EEG-based emotion recognition algorithm. In Transactions on Computational Science XXIII; Springer: Berlin/Heidelberg, Germany, 2014; pp. 199-223.
24. Mohammadi, Z.; Frounchi, J.; Amiri, M. Wavelet-based emotion recognition system using EEG signal. Neural Comput. Appl. 2017, 28, 1985-1990.
25. Mert, A.; Akan, A. Emotion recognition based on time-frequency distribution of EEG signals using multivariate synchrosqueezing transform. Digit. Signal Process. 2018, 81, 106-115.
26. Gupta, V.; Chopda, M.D.; Pachori, R.B. Cross-subject emotion recognition using flexible analytic wavelet transform from EEG signals. IEEE Sens. J. 2018, 19, 2266-2274.
27. Ullah, H.; Uzair, M.; Mahmood, A.; Ullah, M.; Khan, S.D.; Cheikh, F.A. Internal emotion classification using EEG signal with sparse discriminative ensemble. IEEE Access 2019, 7, 40144-40153.
28. Bhattacharyya, A.; Tripathy, R.K.; Garg, L.; Pachori, R.B. A novel multivariate-multiscale approach for computing EEG spectral and temporal complexity for human emotion recognition. IEEE Sens. J. 2020, 21, 3579-3591.
29. Salankar, N.; Mishra, P.; Garg, L. Emotion recognition from EEG signals using empirical mode decomposition and second-order difference plot. Biomed. Signal Process. Control 2021, 65, 102389.
30. Nalwaya, A.; Das, K.; Pachori, R.B. Emotion identification from TQWT-based EEG rhythms. In AI-Enabled Smart Healthcare Using Biomedical Signals; IGI Global: Hershey, PA, USA, 2022; pp. 195-216.
31. Marín-Morales, J.; Higuera-Trujillo, J.L.; Greco, A.; Guixeres, J.; Llinares, C.; Scilingo, E.P.; Alcañiz, M.; Valenza, G. Affective computing in virtual reality: Emotion recognition from brain and heartbeat dynamics using wearable sensors. Sci. Rep. 2018, 8, 13657.
32. Katsigiannis, S.; Ramzan, N. DREAMER: A database for emotion recognition through EEG and ECG signals from wireless low-cost off-the-shelf devices. IEEE J. Biomed. Health Inform. 2017, 22, 98-107.
33. Cheng, J.; Chen, M.; Li, C.; Liu, Y.; Song, R.; Liu, A.; Chen, X. Emotion recognition from multi-channel EEG via deep forest. IEEE J. Biomed. Health Inform. 2020, 25, 453-464.
34. Zhang, T.; Wang, X.; Xu, X.; Chen, C.P. GCB-Net: Graph convolutional broad network and its application in emotion recognition. IEEE Trans. Affect. Comput. 2019, 13, 379-388.
35. Gupta, V.; Bhattacharyya, A.; Pachori, R.B. Automated identification of epileptic seizures from EEG signals using FBSE-EWT method. In Biomedical Signal Processing; Springer: Berlin/Heidelberg, Germany, 2020; pp. 157-179.
36. Anuragi, A.; Sisodia, D.S.; Pachori, R.B. Epileptic-seizure classification using phase-space representation of FBSE-EWT based EEG sub-band signals and ensemble learners. Biomed. Signal Process. Control 2022, 71, 103138.
37. Khan, S.I.; Qaisar, S.M.; Pachori, R.B. Automated classification of valvular heart diseases using FBSE-EWT and PSR based geometrical features. Biomed. Signal Process. Control 2022, 73, 103445.
38. Khan, S.I.; Pachori, R.B. Derived vectorcardiogram based automated detection of posterior myocardial infarction using FBSE-EWT technique. Biomed. Signal Process. Control 2021, 70, 103051.
39. Bhattacharyya, A.; Singh, L.; Pachori, R.B. Fourier-Bessel series expansion based empirical wavelet transform for analysis of non-stationary signals. Digit. Signal Process. 2018, 78, 185-196.
40. Schroeder, J. Signal processing via Fourier-Bessel series expansion. Digit. Signal Process. 1993, 3, 112-124.
41. Das, K.; Verma, P.; Pachori, R.B. Assessment of chanting effects using EEG signals. In Proceedings of the 2022 24th International Conference on Digital Signal Processing and Its Applications (DSPA), Moscow, Russia, 30 March-1 April 2022; IEEE: Washington, DC, USA, 2022; pp. 1-5.
42. Pachori, R.B.; Sircar, P. EEG signal analysis using FB expansion and second-order linear TVAR process. Signal Process. 2008, 88, 415-420.
43. Gilles, J.; Heal, K. A parameterless scale-space approach to find meaningful modes in histograms—Application to image and spectrum segmentation. Int. J. Wavelets Multiresolut. Inf. Process. 2014, 12, 1450044.
44. Otsu, N. A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 1979, 9, 62-66.
45. Gilles, J. Empirical wavelet transform. IEEE Trans. Signal Process. 2013, 61, 3999-4010.
46. Kowalski, A.; Martín, M.; Plastino, A.; Rosso, O. Bandt-Pompe approach to the classical-quantum transition. Phys. D Nonlinear Phenom. 2007, 233, 21-31.
47. Zheng, W.L.; Lu, B.L. Investigating critical frequency bands and channels for EEG-based emotion recognition with deep neural networks. IEEE Trans. Auton. Ment. Dev. 2015, 7, 162-175.
48. Pachori, R.B.; Hewson, D.; Snoussi, H.; Duchêne, J. Analysis of center of pressure signals using empirical mode decomposition and Fourier-Bessel expansion. In Proceedings of the TENCON 2008—2008 IEEE Region 10 Conference, Hyderabad, India, 19-21 November 2008; IEEE: Washington, DC, USA, 2008; pp. 1-6.
49. Shannon, C.E. A mathematical theory of communication. ACM SIGMOBILE Mob. Comput. Commun. Rev. 2001, 5, 3-55.
50. Dubnov, S. Generalization of spectral flatness measure for non-Gaussian linear processes. IEEE Signal Process. Lett. 2004, 11, 698-701.
51. Aydın, S.; Saraoğlu, H.M.; Kara, S. Log energy entropy-based EEG classification with multilayer neural networks in seizure. Ann. Biomed. Eng. 2009, 37, 2626-2630.
52. Li, X.; Zhang, Y.; Tiwari, P.; Song, D.; Hu, B.; Yang, M.; Zhao, Z.; Kumar, N.; Marttinen, P. EEG based emotion recognition: A tutorial and review. ACM Comput. Surv. 2022. https://doi.org/10.1145/3524499.
53. Sebastiani, F. Machine learning in automated text categorization. ACM Comput. Surv. 2002, 34, 1-47.
54. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273-297.
55. Fukunaga, K. Introduction to Statistical Pattern Recognition, 2nd ed.; Academic Press: Cambridge, MA, USA, 1990; pp. 300-361.
56. Muralidharan, N.; Gupta, S.; Prusty, M.R.; Tripathy, R.K. Detection of COVID-19 from X-ray images using multiscale deep convolutional neural network. Appl. Soft Comput. 2022, 119, 108610.
57. Siddharth, S.; Jung, T.P.; Sejnowski, T.J. Utilizing deep learning towards multi-modal bio-sensing and vision-based affective computing. IEEE Trans. Affect. Comput. 2019, 13, 96-107.
58. Topic, A.; Russo, M.; Stella, M.; Saric, M. Emotion recognition using a reduced set of EEG channels based on holographic feature maps. Sensors 2022, 22, 3248.
59. Gu, T.; Wang, Z.; Xu, X.; Li, D.; Yang, H.; Du, W. Frame-level teacher-student learning with data privacy for EEG emotion recognition. IEEE Trans. Neural Netw. Learn. Syst. 2022. https://doi.org/10.1109/TNNLS.2022.3168935.
60. Liu, W.; Qiu, J.L.; Zheng, W.L.; Lu, B.L. Comparing recognition performance and robustness of multimodal deep learning models for multimodal emotion recognition. IEEE Trans. Cogn. Dev. Syst. 2021, 14, 715-729.
61. Cui, H.; Liu, A.; Zhang, X.; Chen, X.; Wang, K.; Chen, X. EEG-based emotion recognition using an end-to-end regional-asymmetric convolutional neural network. Knowl.-Based Syst. 2020, 205, 106243.
62. Song, T.; Zheng, W.; Song, P.; Cui, Z. EEG emotion recognition using dynamical graph convolutional neural networks. IEEE Trans. Affect. Comput. 2018, 11, 532-541.
63. Kamble, K.S.; Sengupta, J. Ensemble machine learning-based affective computing for emotion recognition using dual-decomposed EEG signals. IEEE Sens. J. 2021, 22, 2496-2507.
64. Li, D.; Chai, B.; Wang, Z.; Yang, H.; Du, W. EEG emotion recognition based on 3-D feature representation and dilated fully convolutional networks. IEEE Trans. Cogn. Dev. Syst. 2021, 13, 885-897.
... FBSE-EWT based framework has been used for different applications of EEG signal namely, epileptic seizure detection [112,113,114,115,116], emotion recognition [117], and alcoholism detection [118], etc. In [119], authors have proposed FBSE-EWT based framework for emotion recognition using EEG and ECG signals. ...
Article
Full-text available
Several applications, analysis and visualization of signal demand representation of time-domain signal in different domains like frequency-domain representation based on Fourier transform (FT). Representing a signal in frequency-domain, where parameters of interest are more compact than in original form (time-domain). It is considered that the basis functions which are used to represent the signal should be highly correlated with the signal which is under analysis. Bessel functions are one of the set of basis functions which have been used in literature for representing non-stationary signals due to their damping (non-stationary) nature, and the representation methods based on these basis functions are named as Fourier-Bessel series expansion (FBSE) and Fourier-Bessel transform (FBT). The main purpose of this paper is to present a review related to theory and applications of FBSE and FBT methods. Roots calculation of Bessel functions, the relation between root order of Bessel function and frequency, advantages of Fourier-Bessel representation over FT have also been included in the paper. In order to make the implementation of FBSE based decomposition methods easy, the pseudo-code of decomposition methods are included. The paper also describes various applications of FBSE and FBT based methods present in the literature. Finally, the future scope of the Fourier-Bessel representation is discussed.
Article
Full-text available
Constructing reliable and effective models to recognize human emotional states has become an important issue in recent years. In this article, we propose a double way deep residual neural network combined with brain network analysis, which enables the classification of multiple emotional states. To begin with, we transform the emotional EEG signals into five frequency bands by wavelet transform and construct brain networks by inter-channel correlation coefficients. These brain networks are then fed into a subsequent deep neural network block which contains several modules with residual connection and enhanced by channel attention mechanism and spatial attention mechanism. In the second way of the model, we feed the emotional EEG signals directly into another deep neural network block to extract temporal features. At the end of the two ways, the features are concatenated for classification. To verify the effectiveness of our proposed model, we carried out a series of experiments to collect emotional EEG from eight subjects. The average accuracy of the proposed model on our emotional dataset is 94.57%. In addition, the evaluation results on public databases SEED and SEED-IV are 94.55% and 78.91%, respectively, demonstrating the superiority of our model in emotion recognition tasks.
Chapter
Full-text available
Electroencephalogram (EEG) signals are the recording of brain electrical activity, commonly used for emotion recognition. Different EEG rhythms carry different neural dynamics. EEG rhythms are separated using tunable Q-factor wavelet transform (TQWT). Several features like mean, standard deviation, information potential are extracted from the TQWT-based EEG rhythms. Machine learning classifiers are used to differentiate various emotional states automatically. The authors have validated the proposed model using a publicly available database. Obtained classification accuracy of 92.9% proves the candidature of the proposed method for emotion identification.
Article
Full-text available
In our day-to-day life, the proper perception of emotion plays an important role in human decision making and behavior. Nowadays, a lot of research is focused on the evocation and precise detection of human emotion, which can later be utilized in different arenas. There is a good amount of research on emotion detection through parameters extracted via face recognition, speech modulation, etc. However, there are serious questions about the accuracy or effectiveness of these results, as such features can be controlled or manipulated by the subject. So, the next approach is the usage of physiological signals. These signals are generated by the central nervous system (CNS) and cannot be controlled or manipulated by the subject. In the proposed work, we have used Galvanic Skin Response (GSR) signals for emotion detection. The GSR sensor is an easily available, off-the-shelf, non-invasive sensing device that is easy to use. We have used different machine learning models to classify the various emotional states with better accuracy. The classifiers used are k-Nearest Neighbors (kNN), Support Vector Machine (SVM), and Logistic Regression (LR).
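As an illustration of the kNN step on GSR-derived features, here is a minimal from-scratch sketch (the toy features, mean conductance and peak rate, are assumptions for demonstration only, not the paper's feature set):

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Label the query with the majority vote of its k nearest training points."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# toy GSR features: (mean conductance, peak rate) per window
X = [(0.2, 1.0), (0.3, 1.2), (0.8, 4.0), (0.9, 3.8)]
y = ["calm", "calm", "aroused", "aroused"]
label = knn_predict(X, y, (0.85, 3.9), k=3)  # → "aroused"
```

In practice one would use a tuned library implementation (e.g. scikit-learn's `KNeighborsClassifier`) rather than this direct search.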
Article
Full-text available
An important function of the construction of the Brain-Computer Interface (BCI) device is the development of a model that is able to recognize emotions from electroencephalogram (EEG) signals. Research in this area is very challenging because the EEG signal is non-stationary, non-linear, and contains a lot of noise due to artefacts caused by muscle activity and poor electrode contact. EEG signals are recorded with non-invasive wearable devices using a large number of electrodes, which increases the dimensionality and, thereby, the computational complexity of EEG data. It also reduces the level of comfort of the subjects. This paper implements our holographic features, investigates electrode selection, and uses the most relevant channels to maximize model accuracy. The ReliefF and Neighborhood Component Analysis (NCA) methods were used to select the optimal electrodes. Verification was performed on four publicly available datasets. Our holographic feature maps were constructed using computer-generated holography (CGH) based on the values of signal characteristics displayed in space. The resulting 2D maps are the input to the Convolutional Neural Network (CNN), which serves as a feature extraction method. This methodology uses a reduced set of electrodes, which differ between men and women, and obtains state-of-the-art results in a three-dimensional emotional space. The experimental results show that the channel selection methods improve emotion recognition rates significantly, with an accuracy of 90.76% for valence, 92.92% for arousal, and 92.97% for dominance.
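A simplified Relief weighting (basic two-class Relief rather than the full ReliefF variant used in the paper) can illustrate how channels or features are scored by nearest-hit/nearest-miss contrast; all names here are illustrative:

```python
import math

def relief_weights(X, y):
    """Basic two-class Relief: reward features that separate the classes."""
    n_feat = len(X[0])
    w = [0.0] * n_feat
    for i, (xi, yi) in enumerate(zip(X, y)):
        hits = [x for j, (x, yj) in enumerate(zip(X, y)) if yj == yi and j != i]
        misses = [x for x, yj in zip(X, y) if yj != yi]
        near_hit = min(hits, key=lambda x: math.dist(x, xi))
        near_miss = min(misses, key=lambda x: math.dist(x, xi))
        for f in range(n_feat):
            # large weight: far from the nearest miss, close to the nearest hit
            w[f] += (xi[f] - near_miss[f]) ** 2 - (xi[f] - near_hit[f]) ** 2
    return w

# feature 0 separates the classes, feature 1 is noise
X = [(0.0, 0.5), (0.1, 0.4), (1.0, 0.45), (1.1, 0.55)]
y = [0, 0, 1, 1]
w = relief_weights(X, y)
```

Channels (or features) with the highest weights would then be retained, analogous to the electrode selection step above.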
Article
Full-text available
Emotion recognition technology through analyzing the EEG signal is currently an essential concept in Artificial Intelligence and holds great potential in emotional health care, human-computer interaction, multimedia content recommendation, etc. Though several works have been devoted to reviewing EEG-based emotion recognition, their content needs to be updated. In addition, those works are either fragmented in content or focus only on specific techniques adopted in this area, neglecting the holistic perspective of the entire technical route. Hence, in this paper, we review the topic from the perspective of researchers taking their first steps in it. We review recent representative works in EEG-based emotion recognition research and provide a tutorial to guide researchers starting from the beginning. The scientific basis of EEG-based emotion recognition at the psychological and physiological levels is introduced. Further, we categorize the reviewed works into different technical routes and illustrate the theoretical basis and research motivation, which will help readers better understand why those techniques are studied and employed. Finally, existing challenges and future investigations are also discussed, guiding researchers toward potential future research directions.
Conference Paper
Full-text available
Meditation has been practiced since ancient times, and its popularity has grown in recent years as a natural way to improve mental as well as physical health. A rapidly increasing number of studies aim to find the biological mechanisms underlying the beneficial impacts of meditation. Surface electroencephalogram (EEG) is a non-invasive way to record the electrical activity of the brain, which carries important signatures of the different neural processing going on inside the brain. EEG signals show oscillations in different frequency bands, known as EEG rhythms, which are associated with distinct neurophysiological states. In this paper, we have analyzed the effect of chanting the 'Hare Krishna Mantra' (HKM) on EEG rhythms. Relative band powers of the different rhythms, before and after one round (108 repetitions) of chanting HKM, are compared. A non-stationary signal decomposition tool, the Fourier-Bessel series expansion, is used to calculate the band power. After meditation, alpha band power increased significantly, which implies a relaxed and peaceful state of mind. This study of the effects of HKM chanting on EEG rhythms may show a simple but effective path to controlling stress, depression, tension, etc.
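The relative-band-power comparison can be sketched as follows, using a plain DFT in place of the Fourier-Bessel series expansion the paper employs, so this is a simplified stand-in rather than the paper's method, and all names are illustrative:

```python
import math

def dft_power(x):
    """Power spectrum of a real signal via a direct DFT (fine for short windows)."""
    n = len(x)
    power = []
    for k in range(n // 2 + 1):
        re = sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(x[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        power.append(re * re + im * im)
    return power

def relative_band_power(x, fs, band):
    """Fraction of total spectral power falling inside band = (lo, hi) in Hz."""
    power = dft_power(x)
    freqs = [k * fs / len(x) for k in range(len(power))]
    total = sum(power[1:])  # skip the DC term
    in_band = sum(p for f, p in zip(freqs, power) if band[0] <= f <= band[1])
    return in_band / total

# synthetic 10 Hz "alpha" tone sampled at 128 Hz for one second
fs = 128
sig = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
alpha = relative_band_power(sig, fs, (8, 13))  # ≈ 1.0 for a pure alpha tone
```

Comparing such relative band powers before and after the chanting session is the contrast the study reports.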
Article
Full-text available
Extensive application possibilities have rendered emotion recognition ineluctable and challenging in computer science as well as in human-machine interaction and affective computing, fields that, in turn, increasingly require real-time applications or interactions in everyday-life scenarios. However, while extremely desirable, an accurate and automated emotion classification approach remains a challenging issue. To this end, this study presents an automated emotion recognition model based on easily accessible physiological signals and deep learning (DL) approaches. As the DL algorithm, a feedforward neural network was employed in this study. The network's outcome was further compared with canonical machine learning algorithms such as random forest (RF). The developed DL model relied on the combined use of wearables and contactless technologies, such as thermal infrared imaging. The model is able to classify the emotional state into four classes derived from the linear combination of valence and arousal (referring to the four-quadrant structure of the circumplex model of affect), with an overall accuracy of 70%, outperforming the 66% accuracy reached by the RF model. Considering the ecological and agile nature of the technique used, the proposed model could lead to innovative applications in the affective computing field.
Article
Multimodal signals are powerful for emotion recognition since they can represent emotions comprehensively. In this article, we compare the recognition performance and robustness of two multimodal emotion recognition models: 1) deep canonical correlation analysis (DCCA) and 2) the bimodal deep autoencoder (BDAE). The contributions of this article are threefold: 1) we propose two methods for extending the original DCCA model for multimodal fusion: a) weighted sum fusion and b) attention-based fusion; 2) we systematically compare the performance of DCCA, BDAE, and traditional approaches on five multimodal data sets; and 3) we investigate the robustness of DCCA, BDAE, and traditional approaches on the SEED-V and DREAMER data sets under two conditions: 1) adding noises to multimodal features and 2) replacing electroencephalography features with noises. Our experimental results demonstrate that DCCA achieves state-of-the-art recognition results on all five data sets: 1) 94.6% on the SEED data set; 2) 87.5% on the SEED-IV data set; 3) 84.3% and 85.6% on the DEAP data set; 4) 85.3% on the SEED-V data set; and 5) 89.0%, 90.6%, and 90.7% on the DREAMER data set. Meanwhile, DCCA shows greater robustness when various amounts of noise are added to the SEED-V and DREAMER data sets. By visualizing features before and after DCCA transformation on the SEED-V data set, we find that the transformed features are more homogeneous and discriminative across emotions.
Article
Automated emotion recognition using brain electroencephalogram (EEG) signals is predominantly used for the accurate assessment of human actions, as compared to facial expressions or speech signals. Various signal processing methods have been used for extracting representative features from EEG signals for emotion recognition. However, EEG signals are non-stationary and vary across subjects, as well as across different sessions of the same subject; hence, they exhibit poor generalizability and low classification accuracy for cross-subject emotion classification. In this paper, an EEG-based automated cross-subject emotion recognition framework is proposed using the Fourier-Bessel series expansion-based empirical wavelet transform (FBSE-EWT) method. This method is used to decompose the EEG signals from each channel into four sub-band signals. Ten channels are manually selected from the frontal lobe, and entropy and energy features are extracted from each sub-band signal. Subject variability is reduced using a moving average filter on each channel to obtain a smoothened feature vector of size 80. Three feature selection techniques, namely neighborhood component analysis (NCA), relief-F, and mRMR, are used to obtain an optimal feature vector. Machine learning models, such as an artificial neural network (ANN), k-nearest neighbors (k-NN) with two (fine and weighted) functions, and ensemble bagged tree classifiers, are trained on the obtained feature vectors. The experiments are performed on two publicly accessible databases, the SJTU emotion EEG dataset (SEED) and the dataset for emotion analysis using physiological signals (DEAP). Training and testing of the models have been performed using 10-fold cross-validation and leave-one-subject-out cross-validation (LOSOCV).
The proposed framework, based on FBSE-EWT and the NCA feature selection approach, shows superior results for classifying human emotions compared to other state-of-the-art emotion classification models.
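The entropy/energy features and moving-average smoothing mentioned above can be sketched as follows. The FBSE-EWT decomposition itself is not reproduced; these helpers would be applied per sub-band signal, and all names are illustrative:

```python
import math

def shannon_entropy(x, bins=8):
    """Shannon entropy (bits) of a signal's amplitude histogram."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins or 1.0  # guard against constant signals
    counts = [0] * bins
    for v in x:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    probs = [c / len(x) for c in counts if c]
    return -sum(p * math.log2(p) for p in probs)

def energy(x):
    """Signal energy: sum of squared samples."""
    return sum(v * v for v in x)

def moving_average(values, k=3):
    """Smooth a feature sequence with a causal length-k moving average."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - k + 1): i + 1]
        out.append(sum(window) / len(window))
    return out
```

Per channel, entropy and energy would be computed for each of the four sub-bands and the resulting feature sequences smoothed with the moving average before feature selection.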
Article
Recently, electroencephalogram (EEG) emotion recognition has gradually attracted a lot of attention. This brief designs a novel frame-level teacher-student framework with data privacy (FLTSDP) for EEG emotion recognition. The framework first proposes a teacher-student network, requiring no prior professional information, that automatically filters useful frame-level features with a gated mechanism and extracts high-level features by using knowledge distillation to capture the EEG emotion recognition results of a teacher network and student networks. Then, the results from the subnetworks are integrated by a novel decision module which, motivated by the voting mechanism, adjusts the composition of feature vectors and increases the weight of accurate predictions to optimize the integration effect. During training, an innovative data privacy protection mechanism is applied to avoid data sharing: each student network inherits only weights from all trained networks, not the training dataset. Hence, the framework can be repeatedly optimized and improved by training only the next student subnetwork on new EEG signals. Experimental results show that our framework improves the accuracy of EEG emotion recognition by more than 5% and achieves state-of-the-art performance for EEG emotion recognition in the subject-independent mode.
Chapter
Over the years, emotion recognition has become more efficient, diverse, and easily accessible. In general, emotion recognition is conducted in four main steps: signal acquisition, preprocessing, feature extraction, and classification. Galvanic skin response (GSR) is the autonomic activation of sweat glands in the skin when an individual is triggered by emotional stimulation. The paper provides an overview of emotion recognition, GSR signals, and how GSR signals are analyzed for emotion recognition. The focus of this research is on the performance of feature extraction from GSR signals. Therefore, related sources were identified using combinations of keywords and terms such as feature extraction, emotion recognition, and galvanic skin response. Existing emotion recognition methods were investigated, focusing on the different feature extraction methods. The research conducted has shown that feature extraction methods in the time-frequency domain have the best overall accuracy rate compared to the time domain and frequency domain. Current GSR-based technology also has the potential to be improved further toward the implementation of a more efficient and reliable emotion recognition system. Keywords: Emotion recognition, Galvanic skin response, Feature extraction