IEICE TRANS. INF. & SYST., VOL.E99–D, NO.4 APRIL 2016
PAPER
Continuous Music-Emotion Recognition Based on
Electroencephalogram
Nattapong THAMMASAN†a), Nonmember, Koichi MORIYAMA†∗, Member, Ken-ichi FUKUI†, Nonmember,
and Masayuki NUMAO†, Member
SUMMARY Research on emotion recognition using electroencephalo-
gram (EEG) of subjects listening to music has become more active in the
past decade. However, previous works did not consider emotional oscil-
lations within a single musical piece. In this research, we propose a con-
tinuous music-emotion recognition approach based on brainwave signals.
While considering the subject-dependent and changing-over-time charac-
teristics of emotion, our experiment included self-reporting and continuous
emotion annotation in the arousal-valence space. Fractal dimension (FD)
and power spectral density (PSD) approaches were adopted to extract infor-
mative features from raw EEG signals and then we applied emotion clas-
sification algorithms to discriminate binary classes of emotion. According
to our experimental results, FD slightly outperformed PSD approach both
in arousal and valence classification, and FD was found to have the higher
correlation with emotion reports than PSD. In addition, continuous emo-
tion recognition during music listening based on EEG was found to be an
effective method for tracking emotional reporting oscillations and provides
an opportunity to better understand human emotional processes.
key words: music, emotion, electroencephalogram
1. Introduction
Emotion is a crucial factor in human-computer interac-
tion. An emotion reflects a mental state and psycho-
physiological expression. Scientists realize that an emo-
tion has a strong connection with physiological signals, including brainwaves, and emotion-brain research has become a highly active research area. An electroencephalogram (EEG) allows the feasible and cost-effective investigation
of emotion by analyzing electrical activities along the scalp
with high temporal resolution. Based on the neural corre-
late studies of emotion using EEG data, various algorithms,
e.g. fractal dimension (FD), power spectral density (PSD)
and discrete wavelet transform, have been proposed to ex-
tract meaningful information from EEG data and construct
models to recognize human emotion [1],[2].
Although emotion can be evoked by various types of
stimuli, including pictures, videos, or even voluntarily, mu-
sic is one of the most frequently used materials in emotion
research because of several benefits. For example, music
is considered an extraordinary material that elicits emotion powerfully and evokes a wide variety of emotions [3]. Music
Manuscript received June 29, 2015.
Manuscript revised December 3, 2015.
Manuscript publicized January 22, 2016.
†The authors are with the Institute of Scientific and Industrial
Research, Osaka University, Ibaraki-shi, 567–0047 Japan.
∗Presently, with the Graduate School of Engineering, Nagoya
Institute of Technology, Japan.
a) E-mail: nattapong@ai.sanken.osaka-u.ac.jp
DOI: 10.1587/transinf.2015EDP7251
also enables a study of time courses of emotional processes.
Furthermore, using music in EEG-based emotion recogni-
tion also holds promise for applications such as
music therapy [4], implicit multimedia tagging [5] and re-
trieval [6]. Thus, we focus on emotion while listening to
music.
Emotion while listening to music can change over time,
especially for long-duration music. Studies have found that
peripheral physiological reactions of listeners during mu-
sic perception change over time relative to emotional states
that were elicited continuously by music [7]. In an fMRI study [8], brain activation differences between the first 30 seconds and the remaining 30 seconds of musical excerpts were found. The following EEG study using the same stimuli also confirmed the differences of cortical activities [9].
Considering the dynamic process of music emotion, a con-
siderable amount of works in music emotion variation de-
tection have been done in the last decade by utilizing musi-
cal features [10]. However, information from musical pieces
does not always reflect the listener’s emotional states.
A huge number of works have been proposed to esti-
mate emotional states during music listening by using EEG
and peripheral signals. However, the previous studies did
not consider the emotion variation because the chosen mu-
sical excerpts were relatively short (less than one minute).
Most existing music emotion recognition studies using EEG have been based on a single emotion annotation for one musical excerpt [1]. Indeed, the duration of music in the real world is
generally longer than one minute.
In this study, we propose a method to extract emotion-
related time-varying information from EEG signals during
music listening and create an emotion recognition model
based on the time-varying “ground truth”. This study fo-
cuses on continuous emotion annotations in a single song
rather than employing an entire song-level annotation, because the duration of the songs is sufficiently long for emotion to change. The capability of our approach to capture emotional oscillation within songs is also investigated. Importantly, emotion when listening to music is subjective, i.e., the same piece of music can induce different emotions in different
listeners. Therefore, rather than relying on the predefined
emotion labels indicated by another listener, we gather self-
annotated emotion labels from the actual listeners.
The remainder of this paper is organized as follows. In
Sect. 2, we briefly review studies related to emotion recog-
nition based on physiological reactions. We describe our
methodology and experimental setting in Sect. 3. In Sect. 4,
we present experimental results, and we discuss the results
in Sect. 5. Finally, in Sect. 6, we conclude this paper.
2. Related Work
2.1 Dimensional Emotion Space
Emotional representation models have been proposed to de-
scribe emotion systematically. The dimensional approach
defines models based on the principle that human emo-
tions can be represented as points lying in two or
three continuous dimensions. One of the most prominent
models is the arousal-valence emotion model proposed by
Russell [11]. In this bipolar model (Fig. 1), valence is
represented by the horizontal axis indicating the positiv-
ity or negativity of emotions. The vertical axis represents
arousal which describes the activation levels of emotions.
In this study, we have employed the arousal-valence emo-
tion model to represent human emotions because it has been
shown to be an effective and reliable model for recognizing
emotion during music listening [12].
2.2 EEG Bandwave
In healthy adults, changing from one cognitive state to an-
other leads to the alteration of the amplitudes and frequen-
cies of EEG signals. The electrical activities of the brain
are classified according to rhythms and defined in terms of delta (δ), theta (θ), alpha (α), beta (β), and gamma (γ) waves, from low to high frequencies [13]. These are occasionally referred to
as EEG bandwaves because each wave lies within a specific
range. The frequency ranges of these brain waves and their
association with normal human activities are summarized in
Table 1.
2.3 EEG-Based Emotion Recognition
Fig. 1  The arousal-valence emotion space (redrawn from Russell [11]) with axis labels

Studies on neural correlates of emotion have found evidence of emotion-influenced changes in EEG signals. Evidence of
higher activity in the left frontal lobe of the brain in com-
parison with the right hemisphere while subjects were ex-
periencing positive emotions, prominently in α band power, has been reported [14],[15]. Moreover, Sammler et al. [9]
found increases in the frontal midline theta power while sub-
jects were listening to pleasant musical excerpts. Based on
these discoveries, computational and machine learning al-
gorithms have been applied to EEG signals to achieve high-
performance in emotion estimation and prediction [1].
Various types of materials have been used to elicit emo-
tion. Images from the International Affective Picture Sys-
tem (IAPS) [16] were utilized in research on the identifica-
tion of emotion from EEG data [17]. In addition, videos
were also employed to evoke targeted emotions [18],[19]
and self-elicitation has also been performed [20]. Differ-
ent algorithms have been introduced to extract informative
features from EEG signals, such as PSD [20], higher or-
der spectra [21], higher order crossings [22], and discrete
wavelet transform [23].
Music has also been used as stimuli in EEG-based
emotion recognition. As music-emotion recognition based
on EEG and peripheral signals is still in its infancy, re-
searchers are aiming to identify two or more finite classes
of arousal and valence although emotion can be repre-
sented continuously in dimensional space [1]. Bos [24] used sound clips from the International Affective Digitized
Sounds (IADS) [25] and images from IAPS to classify emo-
tional states into positive-arousal, positive-calm, negative-
arousal, and negative-calm states. Bos achieved an accu-
racy of 92.3%. Khosrowabadi et al. [26] introduced kernel
density estimation and Gaussian mixture model probability
estimation to extract features from EEG data and classified
six categorical emotions using a Bayesian network, multi-
layer perceptron (MLP), one-rule, random tree and a radial
basis function. They accomplished inter-subject accuracies
of 90%. Lin et al. [27] used pre-labeled music, and joy, sad-
ness, anger, and pleasure emotions could be discriminated at
a performance rate of 85%. Sourina et al. [28] utilized self-
emotion reporting after listening to specific sound clips and
emotion labels in selected clips from IADS, and positive-high-aroused, positive-low-aroused, negative-high-aroused, and negative-low-aroused emotions were discriminated at an accuracy of 84.9% for arousal classification and 90.0% for valence classification.

Table 1  Comparison of EEG bandwaves, summarized from [13]

Band  | Frequency range (Hz) | Association
Delta | 0.5–4                | Deep sleep
Theta | 4–8                  | Consciousness slips, drowsiness, unconscious material, creative inspiration, deep meditation
Alpha | 8–13                 | Relaxed awareness, eye closing
Beta  | 14–30                | Active thinking, attention, motor behavior, focusing on the outside world, solving concrete problems
Gamma | 30+                  | Sensory processing; certain cognitive and motor functions
Although many methods to estimate the emotional
states that utilize EEG data have been proposed, these stud-
ies have primarily used pre-emotion-labeled music pieces
obtained from standard libraries, where the emotion labels are assigned by experts or another person. Realiz-
ing the fact that emotion during music listening is subjective
and can change over time, we applied a technique to com-
bine temporal continuous annotation and self-reporting.
3. Research Methodology
3.1 Participants and Materials
Fifteen males between 22 and 30 years of age (mean =
25.52, SD = 2.14) participated in the experiments. All sub-
jects were mentally healthy students of Osaka University.
None had formal music education.
Our music collection is a set of MIDI files comprising 40 instrumental pop songs with different instruments and tempos. By using MIDI files, any additional emotions contributed by lyrics can be eliminated. MIDI files also enable musical feature investigation and application to music composition, which we consider as future work.
3.2 Data Collection
Our data collection software was developed using Java.
The experiment began with a questionnaire regarding per-
sonal information and musical preferences. Then, the sub-
ject selected 16 MIDI musical excerpts from the 40-song
MIDI collection using the software. We asked the subject
to choose eight familiar songs and eight unfamiliar songs
for further investigation of music familiarity in our future
work. Our software provided a function to play short samples of the songs and asked the subject to rate familiarity on a 1–6 scale, where 1–3 referred to low familiarity (unfamiliar songs) and 4–6 denoted high familiarity (familiar songs).
We placed a Waveguard EEG cap† on the subject’s
head in accordance with the 10–20 international system to
measure electrical activities along the brain. We selected 12
electrodes, i.e., Fp1, Fp2, F3, F4, F7, F8, Fz, C3, C4, T3,
T4, and Pz, out of 21 available electrodes (Fig. 2) as these
electrodes are located close to the frontal lobe which plays
a crucial role in emotion regulation [4],[14]. The sampling
frequency was set to 250 Hz. The impedance of each elec-
trode was less than 20 kΩ. A notch filter, a type of bandstop
filter that reduces a narrow range of frequencies, was applied
to reduce the 60-Hz electrical power line artifact. Brain sig-
nals were transmitted to a Polymate AP1532†† amplifier and then visualized by its software, APMonitor†††.

†http://www.ant-neuro.com/products/waveguard
††http://www.teac.co.jp/industry/me/ap1132/

Fig. 2  Position of selected electrodes in accordance with the 10–20 international system
Later, the selected music clips were presented as
sounds synthesized by the Java Sound API’s MIDI pack-
age††††. The average duration of each song was approx-
imately two minutes. Each song trial ended with a 16-
second silent rest to reduce the effects from the previous
song. The subjects were instructed to close their eyes and
minimize body movement while wearing the EEG cap and
listening to music. After completing all listening sessions,
the subject removed the EEG cap and proceeded to an an-
notation session. During the session, the subject listened
to the same songs and annotated his/her emotions perceived
in the previous session continuously by clicking on corre-
sponding points in the arousal-valence emotion space shown
on a monitor. Arousal and valence were recorded indepen-
dently as numeric values from –1 to 1. A brief guideline
of the arousal-valence emotion space, which included Fig. 1, was given throughout the annotation session to acquaint the subjects
with the arousal-valence model.
3.3 Data Preprocessing
To filter out unrelated artifacts, a bandpass filter was ap-
plied to extract only 0.5–60-Hz EEG signals. We uti-
lized EEGLAB [29] to identify and reject distinct artifact-
contaminating data automatically. The rejection of epochs
of continuous EEG data was implemented using a function
in EEGLAB. In addition, eye-blinking related artifacts were
removed from EEG signals by applying the independent
component analysis (ICA) signal processing method [30].
ICA decomposes multivariate signals into independent non-
Gaussian subcomponents. Lacking a dedicated electroocu-
logram, we adapted the artifact removal technique to use the components of the Fp1 and Fp2 electrodes instead, because
the two frontal electrodes were positioned nearest to the eyes
and are obviously influenced by eye-blinking artifacts.

†††Software developed for Polymate AP1532 by TEAC Corporation.
††††http://docs.oracle.com/javase/7/docs/technotes/guides/sound/

Finally, we associated EEG signals with the emotion labels annotated by the subjects via timestamps.
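The paper performs this filtering and artifact rejection with EEGLAB and ICA; purely as an illustration, the following minimal sketch applies the same 0.5–60 Hz bandpass and a 60-Hz notch filter to one raw channel with SciPy. The eye-blink removal step is omitted, and the filter orders and names used here are assumptions, not the authors' exact settings.

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

FS = 250.0  # sampling frequency used in the experiment (Hz)

def preprocess_channel(raw, fs=FS):
    """Bandpass (0.5-60 Hz) and 60-Hz notch filter one EEG channel."""
    # 4th-order Butterworth bandpass, applied forward-backward (zero phase)
    b_bp, a_bp = butter(4, [0.5, 60.0], btype="bandpass", fs=fs)
    x = filtfilt(b_bp, a_bp, raw)
    # Narrow notch at 60 Hz to suppress power-line interference
    b_n, a_n = iirnotch(w0=60.0, Q=30.0, fs=fs)
    return filtfilt(b_n, a_n, x)

# Example: filter 60 s of synthetic data for one channel
if __name__ == "__main__":
    t = np.arange(0, 60, 1 / FS)
    noisy = np.random.randn(t.size) + 0.5 * np.sin(2 * np.pi * 60 * t)
    clean = preprocess_channel(noisy)
```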
3.4 Feature Extraction Algorithms
EEG signals were processed to retrieve informative features
using two approaches, FD and PSD. The calculations were
performed by MATLAB analysis tools. We applied a sliding
window segmentation technique to analyze temporal data
and track emotional fluctuation. The window size was de-
fined as 1000 samples, which was equivalent to 4 seconds.
In this study, the overlap between one sliding window and
the consecutive window was set to zero.
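A minimal sketch of this non-overlapping segmentation, assuming the EEG of one song is stored as a (channels × samples) NumPy array; the array layout and names are illustrative only.

```python
import numpy as np

def segment_windows(eeg, win_size=1000):
    """Split a (channels x samples) EEG array into non-overlapping windows.

    With a 250-Hz sampling rate, win_size=1000 corresponds to 4-second
    windows; trailing samples that do not fill a whole window are dropped.
    """
    n_channels, n_samples = eeg.shape
    n_windows = n_samples // win_size
    trimmed = eeg[:, : n_windows * win_size]
    # result shape: (n_windows, n_channels, win_size)
    return trimmed.reshape(n_channels, n_windows, win_size).transpose(1, 0, 2)

# Example: a 2-minute, 12-channel recording yields 30 windows of 4 s each
windows = segment_windows(np.random.randn(12, 250 * 120))
assert windows.shape == (30, 12, 1000)
```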
3.4.1 Fractal Dimension
FD values characterize the complexity of time-varying sig-
nals. Higher FD values of EEG signals reflect the higher
activity of the brain [31]. FD values are typically employed
in affective computing research, including emotion recogni-
tion based on EEG [28], because of their simplicity and in-
formative characteristics that properly indicate brain states.
In this study, we applied the Higuchi algorithm [32] to cal-
culate time series FD values directly in the time domain.
Given a time series $X(i)$, $i = 1, \ldots, N$, a new series $X_m^k$ can be constructed by the following definition:

$$X_m^k : X(m),\; X(m+k),\; X(m+2k),\; \ldots,\; X\!\left(m + \left\lfloor \tfrac{N-m}{k} \right\rfloor k\right), \tag{1}$$

where $k$ is the interval time and $m = 1, 2, \ldots, k$ is the initial time. For example, assuming that the series has $N = 100$ elements and $k = 3$, then the series is separated into three sub-series as follows:

$$\begin{aligned}
X_1^3 &: X(1), X(4), X(7), \ldots, X(97), X(100) \\
X_2^3 &: X(2), X(5), X(8), \ldots, X(98) \\
X_3^3 &: X(3), X(6), X(9), \ldots, X(99).
\end{aligned} \tag{2}$$

Then, the length of the series $X_m^k$ is defined as:

$$L_m(k) = \frac{1}{k} \left[ \left( \sum_{i=1}^{\lfloor (N-m)/k \rfloor} \bigl| X(m+ik) - X(m+(i-1)k) \bigr| \right) \frac{N-1}{\left\lfloor \frac{N-m}{k} \right\rfloor k} \right], \tag{3}$$

where the term $\frac{N-1}{\lfloor (N-m)/k \rfloor \, k}$ represents a normalization factor.

The length at time interval $k$, denoted $L(k)$, is obtained by averaging all the sub-series lengths $L_m(k)$. The following relationship exists:

$$L(k) \propto k^{-\mathrm{FD}}. \tag{4}$$

The FD is obtained as the slope of the logarithmic plot between $k$ and its associated $L(k)$.
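The following Python sketch implements the Higuchi procedure of Eqs. (1)–(4) for a single window; the maximum interval k_max is not specified above, so its value here is only an assumption.

```python
import numpy as np

def higuchi_fd(x, k_max=8):
    """Estimate the Higuchi fractal dimension of a 1-D signal x.

    For each interval k, the mean curve length L(k) is computed from the
    k sub-series of Eqs. (1)-(3); the FD is the slope of log(L(k)) versus
    log(1/k), following L(k) ~ k^(-FD) in Eq. (4). k_max is an assumption.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    ks = np.arange(1, k_max + 1)
    lengths = []
    for k in ks:
        lm = []
        for m in range(1, k + 1):
            idx = np.arange(m - 1, n, k)            # X(m), X(m+k), ...
            if idx.size < 2:
                continue
            diff = np.abs(np.diff(x[idx])).sum()
            norm = (n - 1) / (((n - m) // k) * k)   # normalization factor
            lm.append(diff * norm / k)
        lengths.append(np.mean(lm))
    # Slope of the log-log plot gives the fractal dimension
    fd, _ = np.polyfit(np.log(1.0 / ks), np.log(lengths), 1)
    return fd

# Example: FD of one 4-second EEG window (1000 samples)
window = np.random.randn(1000)
print(higuchi_fd(window))
```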
3.4.2 Power Spectral Density
Over the last few decades, the PSD analysis of EEG data has
been a typical approach to investigate the relevance of affec-
tive states and brainwaves [1]. PSD indicates signal power
in specific frequency ranges. This method is based on fast
Fourier transform, which is an algorithm to compute the dis-
crete Fourier transform and its inverse. This transformation
converts data in the time domain to the frequency domain
and vice versa. It is widely used for numerous applications
in engineering, science, and mathematics.
In this research, each EEG signal is decomposed into
five frequency ranges (delta, theta, alpha, beta, and gamma)
using the PSD approach. The PSD values were calcu-
lated using MATLAB Signal Processing Toolbox. As PSD
represents signals in the continuous frequency domain, we
needed to calculate a feature that represents the overall char-
acteristics of a specific frequency range. As the feature, we
used the average power over the given frequency band cal-
culated using the avgpower function in the toolbox.
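The paper computes these features with MATLAB's avgpower function; the sketch below is only an approximate SciPy analogue that estimates the PSD of one window with Welch's method and averages it over the five bands of Table 1. The 60-Hz upper limit for the gamma band follows the bandpass filter and is an assumption.

```python
import numpy as np
from scipy.signal import welch

FS = 250.0
BANDS = {            # frequency ranges (Hz), following Table 1
    "delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13),
    "beta": (14, 30), "gamma": (30, 60),   # gamma capped at the 60-Hz bandpass edge
}

def band_powers(window, fs=FS):
    """Average PSD of one EEG window in each of the five EEG bands."""
    freqs, psd = welch(window, fs=fs, nperseg=min(len(window), 256))
    features = {}
    for name, (lo, hi) in BANDS.items():
        mask = (freqs >= lo) & (freqs < hi)
        features[name] = psd[mask].mean()  # mean power in the band
    return features

# Example: five PSD features for a 4-second window of one channel
print(band_powers(np.random.randn(1000)))
```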
3.5 Emotion Classification
In this research, emotional arousal was classified as high or
low. Similarly, emotional valence was classified as positive
or negative. Because of the application of sliding window
technique, subjects’ annotated emotion labels could vary
within one window. We unified the emotion label of each window by the majority method, i.e., the label of a window was set to high (or positive) if the number of high (or positive) annotations exceeded the number of low (or negative) annotations, and vice versa.
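A minimal sketch of this majority rule, assuming the annotations falling inside one window have already been binarized (1 for high/positive, 0 for low/negative); tie handling is not specified in the paper, so the choice below is an assumption.

```python
import numpy as np

def majority_label(binary_annotations):
    """Return 1 if high/positive annotations outnumber low/negative ones.

    binary_annotations: 0/1 labels of the annotation samples in one window.
    Ties default to 0 here; the paper does not specify tie handling.
    """
    annotations = np.asarray(binary_annotations)
    return int(annotations.sum() > annotations.size / 2)

# Example: a window whose annotations are mostly positive is labeled 1
print(majority_label([1, 1, 0, 1, 0]))  # -> 1
```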
For the classification of emotion, we trained two mod-
els to identify arousal and valence classes independently by
employing three types of classification algorithms: support
vector machine (SVM), MLP, and C4.5. The SVM and
MLP are typical algorithms in brain-computer interaction
research. C4.5 was chosen for its learning speed. All classification algorithms were implemented using the Waikato Environment for Knowledge Analysis (WEKA) library [33].
4. Experiments and Results
4.1 Experimental Setup
After retrieving experimental data, we applied feature ex-
traction algorithms to the data from each electrode. As a
result, we obtained 12 features by FD value calculation. In
contrast, the PSD approach produced 60 features.
Previous reports have indicated that asymmetries of
features from symmetric electrode pairs can be used as in-
formative features to classify emotions [19],[27],[28],[34].
One plausible reason is that the asymmetry indexes might
suppress underlying artifact sources that contributed equally
to hemispheric electrode pairs [35]. Therefore, we added
asymmetry indexes to our original features as additional fea-
tures. These additional features were the differences of feature values between the left-hemisphere electrodes and the symmetric electrodes in the right hemisphere, e.g., the difference between the power of Fp1 and that of Fp2 in PSD features.

http://www.mathworks.com/help/signal/ref/dspdata.psd.html

Fig. 3  Average arousal and valence classification accuracy for all subjects; error bars denote standard deviation and stars indicate significant difference compared to chance levels
There were five symmetric electrode pairs; consequently, we
obtained 17 FD-value features and 85 PSD features.
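A sketch of how such asymmetry indexes could be appended, shown for the FD features (one value per electrode); the PSD case is analogous with one value per electrode-band pair. The dictionary layout and names are assumptions made for illustration.

```python
# Symmetric left-right electrode pairs among the 12 selected channels
PAIRS = [("Fp1", "Fp2"), ("F3", "F4"), ("F7", "F8"),
         ("C3", "C4"), ("T3", "T4")]

def add_asymmetry(features):
    """Append left-minus-right asymmetry indexes to a feature dict.

    features maps an electrode name to the scalar feature value
    (here, the FD value) computed for one window.
    """
    augmented = dict(features)
    for left, right in PAIRS:
        augmented[f"{left}-{right}"] = features[left] - features[right]
    return augmented

# Example with FD features: 12 electrode values become 17 features
fd_features = {ch: 1.5 for ch in
               ["Fp1", "Fp2", "F3", "F4", "F7", "F8",
                "Fz", "C3", "C4", "T3", "T4", "Pz"]}
print(len(add_asymmetry(fd_features)))  # -> 17
```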
Then, we trained a subject-dependent recognition
model and tested it using data from one subject. We adopted
the 10-fold cross-validation method to evaluate each sub-
ject’s model to obtain overall classification results. Note
that the non-overlapping characteristic of the adjacent slid-
ing window in our experiment avoided classification bias
caused by similar training and testing instances.
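The paper runs this evaluation in WEKA; as a stand-in, the sketch below reproduces the subject-dependent, 10-fold cross-validation protocol with scikit-learn's SVM, so the classifier settings (and the added feature scaling step) are assumptions rather than the authors' exact configuration.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def evaluate_subject(X, y, folds=10):
    """Subject-dependent accuracy: X is (windows x features) for one
    subject, y holds the binary arousal (or valence) window labels."""
    # Feature scaling is an added assumption, not described in the paper
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    scores = cross_val_score(model, X, y, cv=folds, scoring="accuracy")
    return scores.mean()

# Example with random data: 480 windows, 17 FD features, binary labels
rng = np.random.default_rng(0)
X = rng.standard_normal((480, 17))
y = rng.integers(0, 2, size=480)
print(evaluate_subject(X, y))
```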
Traditional methodologies have primarily neglected
emotional changes over time. In other words, the informa-
tive features were extracted at the song-length level. Those
traditional models were trained on aggregated instances
from multiple songs. To simulate such conventional meth-
ods, we adapted our methodology by expanding the size of
the sliding window to the full length of the song. We trained
the model with the song-length data using the same feature
extraction and classification algorithms. The overall label of
the window was produced by a majority vote of the annotations.
4.2 Chance Level
As our research relies on subject annotations, an unbalanced
data set could be obtained. In other words, if a subject la-
bels his/her perceived emotion primarily as a positive va-
lence, the number of positive valence instances would be
higher than the number of negative instances. This asymme-
try would lead to misinterpretation of classification results.
Therefore, we introduce a new indicator, chance level, as a
benchmark to evaluate models. The chance level, or ran-
dom guessing level, is defined by the majority class of the
training data. For example, in a training set consisting of
60% positive samples and 40% negative samples, the chance
level would be 60%.
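A short sketch of this chance-level computation from the training labels of one fold; the list-based input format is only an assumption.

```python
import numpy as np

def chance_level(train_labels):
    """Fraction of the majority class in the training labels,
    e.g. [1, 1, 1, 0, 0] -> 0.6."""
    labels = np.asarray(train_labels)
    _, counts = np.unique(labels, return_counts=True)
    return counts.max() / labels.size

print(chance_level([1, 1, 1, 0, 0]))  # -> 0.6
```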
4.3 Results of Emotion Classification
Emotion classification accuracy in each 10-fold cross-
validation was defined as the proportion of correctly classi-
fied test instances (true positives and true negatives) among
the total number of instances in the test set. The average
emotion classification accuracy for all subjects is shown in
Fig. 3. All approaches that considered the dynamics of emo-
tions outperformed the chance level for arousal recognition
significantly (p < 0.01). Classification by FD value features with the SVM achieved the best relative result (82.8%, SD = 8.1%). In this case, the chance level was 62.0% (SD = 6.6%). It should be noted that data from a subject who annotated only a single class of arousal was removed
from arousal classification to avoid any bias.
Similarly, valence classification performance was su-
perior to the chance level. Again, the FD approach was su-
perior to the chance level significantly (p < 0.01) regardless of the classification algorithm (the SVM achieved the highest accuracy at 87.2%, SD = 5.9%), while PSD gave better results compared to the chance level only with the SVM and MLP classifiers. In valence classification, the chance level was 72.9% (SD = 13.0%).
Compared with the traditional approaches, using a statistical paired t-test, all of our methodologies demonstrated superior performance. In particular, for all algorithms, considering emotional oscillation improved the performance of arousal classification significantly (p < 0.01). Valence recognition by any feature extraction or classification technique also achieved higher results (p < 0.05).
Fig. 4 Arousal and valence annotations from subject No. 4 and their estimation by the model con-
structed with SVM and FD values from all instances (horizontal axis represents the order of songs
selected by the subject; data are plotted in time order)
5. Discussion
This research focuses on continuous emotion recognition
relying on continuous self-reported emotion labels. The
improved performance of continuous emotion recognition
over the traditional approach of using song-level labels are
promising but leave room for discussion. Empirical results
also showed that each feature extraction algorithm and clas-
sifier achieved dierent results, which we analyze and dis-
cuss in this section. To investigate, we studied the associa-
tion of features with the reported emotional states. Further-
more, we examine whether our approach could track emo-
tional variation by visualizing estimated emotion and then
comparing it with subject-reported emotion.
According to the obtained results, our proposed
methodology that considers emotion variation and applies
a sliding window technique outperformed traditional meth-
ods. It is possible that the results of conventional approaches
could suffer because of the limited training examples. To compensate for this, multiple sessions are required to elicit different types of emotions. Multiple sessions incur a time
cost because of a large number of resting periods between
sessions. On the other hand, the proposed method utilizes
fewer songs to construct an emotion recognition model,
which is a more practical technique in real-world applica-
tions. Continuous annotation enables temporal data seg-
mentation and provides larger amounts of data to analyze.
According to our results, temporal data partitioning has enhanced the efficiency of emotion recognition empirically.
To investigate the correlates of extracted features with emotion reports, we computed Pearson product-moment correlation coefficients between each EEG feature and the numerically reported arousal and valence for each subject separately. The resulting correlations were then averaged over all subjects to produce overall correlations. For the FD value and PSD features, the five features with the highest absolute averaged correlations with arousal and valence are summarized in Table 2.

Table 2  Top-5 features with the highest absolute averaged correlations (over all subjects) with the valence and arousal ratings

          FD value               |  PSD
Arousal   Fp2          0.1405    |  Fp2-γ          0.0705
          F7           0.0858    |  F7-β – F8-β    0.0665
          Fp1 – Fp2   -0.0776    |  Fp2-δ          0.0581
          C3           0.0727    |  F3-θ – F4-θ   -0.0572
          F4           0.0702    |  C3-δ          -0.0562
Valence   F4           0.1040    |  F8-α           0.0804
          F3 – F4     -0.0966    |  T4-α           0.0777
          Fz           0.0536    |  C3-α           0.0755
          Pz          -0.0462    |  Fp2-α          0.0735
          F8           0.0357    |  T3-α           0.0683

The involvement of these features was
partly consistent with previous literature. The FD asymme-
try at F3–F4 was comparable with the finding that the frontal
FD asymmetry at AF3–F4 can recognize valence with high
accuracies [36]. In a work using the DEAP dataset [19], FD val-
ues of F4 and F7 were among the 4 channels selected by the
Fisher discriminant ratio channel selection method to clas-
sify emotions using FD and HOC features [31]. Further, the
relevance of beta asymmetry at F7–F8 and theta asymmetry
at F3–F4 to arousal was consistent with their involvement
in top-5 ranked asymmetric features to classify emotions in
previous work [27],[34].
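A sketch of this correlation analysis, assuming the per-window feature matrices and the corresponding numeric ratings are available for each subject; the averaging over subjects follows the description above, and all names are illustrative.

```python
import numpy as np
from scipy.stats import pearsonr

def averaged_correlations(per_subject_data):
    """per_subject_data: list of (features, ratings) pairs, one per subject,
    where features is (windows x n_features) and ratings is (windows,)
    holding the numeric arousal (or valence) annotation of each window.
    Returns each feature's correlation averaged over subjects."""
    per_subject_r = []
    for features, ratings in per_subject_data:
        r = [pearsonr(features[:, j], ratings)[0]
             for j in range(features.shape[1])]
        per_subject_r.append(r)
    return np.mean(per_subject_r, axis=0)

# Example with two synthetic subjects and three features
rng = np.random.default_rng(1)
data = [(rng.standard_normal((100, 3)), rng.standard_normal(100))
        for _ in range(2)]
print(averaged_correlations(data))
```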
The classification results suggest that FD value features
outperform PSD. The generally higher arousal correlations
of FD value features compared to PSD features may have
contributed to the superior performance of models using FD
value features for arousal recognition. Similarly, valence
recognition with FD value features could achieve better re-
sults compared to PSD features because of their slightly
higher absolute correlations. This evidence coincides with
results from previous studies in the field of EEG-based af-
fective computing [37],[38], which reported that the FD ap-
proach is superior to PSD in recognizing affective states be-
cause of the superior ability to analyze the non-linear behav-
ior of the brain. Note that the SVM achieved better results
than the other classifiers, i.e., C4.5 and MLP, and that simi-
lar results were also obtained in previous works [1].
Examining whether the model could track variation of
self-reported labels is also informative. To illustrate this,
we used data from subject No. 4 because of the obvious
fluctuations in the annotation. We trained the recognition
model with all instances and compared the estimation of
emotion from the trained model with the annotated labels.
The models were trained by using FD and SVM because
of their success in our experiment. The results for arousal
and valence recognition are shown in Fig. 4. The horizontal
axis shows the order of songs selected by the subject from
songs 1 to 16. Emotional reporting oscillations are observ-
able for some songs. According to the results, the emotion
recognition model could handle the distinct shifts of emo-
tion during some songs. For example, the model changes
the estimation of arousal from low to high for songs 2 and
12. Furthermore, the proposed model could track the trajec-
tory of valence shift from negative to positive while listening
to song 1, and the converse shift during song 5. How-
ever, the results shown in Fig. 4 need to be interpreted care-
fully. The model was trained on all the available instances in
the dataset; hence, it reflects the maximum capability
of the model to capture emotions, which could be slightly
higher than the results from cross-validation. Moreover, the
model relies heavily on subjective annotation whose capa-
bility of reflecting real emotions is limited by the subject’s
own emotional self-awareness.
In addition, data available to investigate tracking of
emotional reporting oscillations were relatively limited; i.e.,
the annotation oscillations were found in eight of sixteen
songs on average over all subjects. Therefore, further ex-
perimentation with select songs with high emotional oscil-
lation may be necessary to confirm the capabilities of our
approach. This is considered as our future work.
6. Conclusion
In this work, we have presented a study of continuous
music-emotion recognition using EEG based on the hypoth-
esis that emotions evoked when listening to music are sub-
jective and vary over time. Experiments were performed by
focusing on self-reporting and continuous emotion annota-
tion in the arousal-valence space. The results showed that
our approach outperformed traditional approaches that did
not consider emotional changes over time for arousal and
valence recognition, especially when classifying emotions
with FD value features and SVM classifier. We also found
that the emotional ground truth had a higher correlation with FD values than with PSD features. Finally, the models constructed through
our approach were found to display a satisfactory capac-
ity for tracking reporting oscillations of subjects’ emotions
while listening to music.
References
[1] M.-K. Kim, M. Kim, E. Oh, and S.-P. Kim, “A review on the com-
putational methods for emotional state estimation from the human
EEG,” Comp. Math. Methods in Medicine, vol.2013, pp.1–13, 2013.
[2] R. Jenke, A. Peer, and M. Buss, “Feature extraction and selection for
emotion recognition from EEG,” IEEE Trans. Affective Computing,
vol.5, no.3, pp.327–339, 2014.
[3] S. Koelsch, Brain and Music, Wiley-Blackwell, 2012.
[4] S. Koelsch, “Brain correlates of music-evoked emotions,” Nat. Rev.
Neurosci., vol.15, no.3, pp.170–180, 2014.
[5] M. Soleymani, J. Lichtenauer, T. Pun, and M. Pantic, “A multimodal
database for affect recognition and implicit tagging,” IEEE Trans.
Affective Computing, vol.3, no.1, pp.42–55, 2012.
[6] J. Eaton, D. Williams, and E. Miranda, “AFFECTIVE JUKEBOX:
A confirmatory study of EEG emotional correlates in response to
musical stimuli,” Proc. ICMC/SMC 2014, pp.580–585, 2014.
[7] O. Grewe, F. Nagel, R. Kopiez, and E. Altenmüller, “Emotions over
time: Synchronicity and development of subjective, physiological,
and facial affective reactions to music,” Emotion, vol.7, no.4,
pp.774–788, 2007.
[8] S. Koelsch, T. Fritz, D. Cramon, K. Müller, and A.D. Friederici,
“Investigating emotion with music: An fMRI study,” Human Brain
Mapping, vol.27, no.3, pp.239–250, 2006.
[9] D. Sammler, M. Grigutsch, T. Fritz, and S. Koelsch, “Music
and emotion: Electrophysiological correlates of the processing of
pleasant and unpleasant music,” Psychophysiology, vol.44, no.2,
pp.293–304, 2007.
[10] Y.H. Yang and H.H. Chen, “Machine recognition of music emo-
tion: A review,” ACM Trans. Intell. Syst. Technol., vol.3, no.3,
pp.40:1–40:30, 2012.
[11] J.A. Russell, “A circumplex model of affect,” J. Personality and So-
cial Psychology, vol.39, no.6, pp.1161–1178, 1980.
[12] Y. Yamano, R. Cabredo, P. Salvador Inventado, R. Legaspi, K.
Moriyama, K.I. Fukui, S. Kurihara, and M. Numao, “Estimat-
ing emotions on music based on brainwave analyses,” Proc. 3rd
Intl. Workshop on Empathic Computing (IWEC2012), pp.115–124,
2012.
[13] S. Sanei and J. Chambers, EEG Signal Processing, Wiley, 2008.
[14] L.A. Schmidt and L.J. Trainor, “Frontal brain electrical activity EEG
distinguishes valence and intensity of musical emotions,” Cognition
& Emotion, vol.15, no.4, pp.487–500, 2001.
[15] T. Baumgartner, M. Esslen, and L. Jäncke, “From emotion percep-
tion to emotion experience: Emotions evoked by pictures and classi-
cal music,” Intl. J. Psychophysiology, vol.60, no.1, pp.34–43, 2006.
[16] P.J. Lang, M.M. Bradley, and B.N. Cuthbert, “International affective
picture system (IAPS): Affective ratings of pictures and instruction
manual,” Tech. Rep. A-8, The Center for Research in Psychophysi-
ology, University of Florida, Gainesville, FL, 2008.
[17] G. Chanel, J. Kronegg, D. Grandjean, and T. Pun, “Emotion assess-
ment: Arousal evaluation using EEG’s and peripheral physiological
signals,” Multimedia Content Representation, Classification and Se-
curity, Lecture Notes in Computer Science, vol.4105, pp.530–537,
Springer Berlin Heidelberg, 2006.
[18] X.-W. Wang, D. Nie, and B.-L. Lu, “Emotional state classification
from EEG data using machine learning approach,” Neurocomputing,
vol.129, pp.94–106, 2014.
[19] S. Koelstra, C. Muhl, M. Soleymani, J.-S. Lee, A. Yazdani, T.
Ebrahimi, T. Pun, A. Nijholt, and I. Patras, “DEAP: A database for
emotion analysis using physiological signals,” IEEE Trans. Affec-
tive Computing, vol.3, no.1, pp.18–31, 2012.
[20] O. AlZoubi, R.A. Calvo, and R.H. Stevens, “Classification of EEG
for affect recognition: An adaptive approach,” AI 2009: Ad-
vances in Artificial Intelligence, Lecture Notes in Computer Science,
vol.5866, pp.52–61, Springer Berlin Heidelberg, 2009.
[21] S.A. Hosseini, M.A. Khalilzadeh, M.B. Naghibi-Sistani, and V.
Niazmand, “Higher order spectra analysis of EEG signals in emo-
tional stress states,” Proc. 2nd Intl. Conf. on Inf. Tech. and Comp.
Sci. (ITCS) 2010, pp.60–63, 2010.
[22] P.C. Petrantonakis and L.J. Hadjileontiadis, “Emotion recognition
from brain signals using hybrid adaptive filtering and higher order
crossings analysis,” IEEE Trans. Affective Computing, vol.1, no.2,
pp.81–97, 2010.
[23] M. Murugappan, N. Ramachandran, and Y. Sazali, “Classification
of human emotion from EEG using discrete wavelet transform,”
J. Biomedical Science and Engineering, vol.3, no.4, pp.390–396,
2010.
[24] D.P.O. Bos, “EEG-based emotion recognition: The influence of visual
and auditory stimuli,” Capita Selecta, University of Twente, 2006.
[25] M.M. Bradley and P.J. Lang, “International affective digitized
sounds (IADS): Stimuli, instruction manual and affective ratings,”
Tech. Rep. B-2, The Center for Research in Psychophysiology, Uni-
versity of Florida, Gainesville, FL, 1999.
[26] R. Khosrowabadi, A. Wahab, K.K. Ang, and M. Baniasad, “Affec-
tive computation on EEG correlates of emotion from musical and
vocal stimuli,” Proc. Intl. Joint Conf. on Neural Networks (IJCNN
2009), pp.1590–1594, 2009.
[27] Y.-P. Lin, C.-H. Wang, T.-P. Jung, T.-L. Wu, S.-K. Jeng, J.-R.
Duann, and J.-H. Chen, “EEG-based emotion recognition in mu-
sic listening,” IEEE Trans. Biomedical Engineering, vol.57, no.7,
pp.1798–1806, 2010.
[28] O. Sourina and Y. Liu, “A fractal-based algorithm of emotion recog-
nition from EEG using arousal-valence model,” Proc. Biosignals
2011, pp.209–214, 2011.
[29] A. Delorme, T. Mullen, C. Kothe, Z.A. Acar, N. Bigdely-Shamlo,
A. Vankov, and S. Makeig, “EEGLAB, SIFT, NFT, BCILAB, and
ERICA: New tools for advanced EEG processing,” Comp. Intell.
Neurosci., vol.2011, pp.1–12, 2011.
[30] T.-P. Jung, S. Makeig, C. Humphries, T.-W. Lee, M.J. Mckeown, V.
Iragui, and T.J. Sejnowski, “Removing electroencephalographic ar-
tifacts by blind source separation,” Psychophysiology, vol.37, no.2,
pp.163–178, 2000.
[31] Y. Liu and O. Sourina, “EEG databases for emotion recognition,”
Proc. Intl. Conf. on Cyberworlds 2013, pp.302–309, 2013.
[32] T. Higuchi, “Approach to an irregular time series on the basis of the
fractal theory,” Physica D, vol.31, no.2, pp.277–283, 1988.
[33] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and
I.H. Witten, “The weka data mining software: An update,” SIGKDD
Explor. Newsl., vol.11, no.1, pp.10–18, 2009.
[34] Y.-P. Lin, Y.-H. Yang, and T.-P. Jung, “Fusion of electroencephalo-
gram dynamics and musical contents for estimating emotional re-
sponses in music listening,” Front. Neurosci., vol.8, no.94, 2014.
[35] A. Konar and A. Chakraborty, Emotion Recognition: A Pattern
Analysis Approach, Wiley, 2015.
[36] O. Sourina, Y. Liu, and M.K. Nguyen, “Real-time EEG-based emo-
tion recognition for music therapy,” J. Multimodal User Interfaces,
vol.5, no.1-2, pp.27–35, 2012.
[37] B. Weiss, Z. Clemens, R. Bódizs, and P. Halász, “Comparison of
fractal and power spectral EEG features: Effects of topography and
sleep stages,” Brain Research Bulletin, vol.84, no.6, pp.359–375,
2011.
[38] M. Bachmann, J. Lass, A. Suhhova, and H. Hinrikus, “Spectral
asymmetry and Higuchi’s fractal dimension measures of depres-
sion electroencephalogram,” Comp. Math. Methods in Medicine,
vol.2013, pp.299–309, 2013.
Nattapong Thammasan received a B.
Computer Eng. degree from Chulalongkorn
University in 2012 and an M.Sc. from Osaka
University in 2015. He is currently a Ph.D.
candidate at the Institute of Scientific and In-
dustrial Research (ISIR), Osaka University. His
research interests include artificial intelligence,
brain–computer interaction, and affective com-
puting.
Koichi Moriyama received B.Eng., M.Eng.,
and D.Eng. from Tokyo Institute of Technology
in 1998, 2000, and 2003, respectively. After
working at Tokyo Institute of Technology and
Osaka University, he is currently an associate
professor at Graduate School of Engineering,
Nagoya Institute of Technology. His research
interests include artificial intelligence, multia-
gent systems, game theory, and cognitive sci-
ence. He is a member of the Japanese Society
for Artificial Intelligence (JSAI).
Ken-ichi Fukui is an Associate Professor in
ISIR, Osaka University. He received Master of
Arts from Nagoya University in 2003 and Ph.D.
in information science from Osaka University
in 2010. He was a Specially Appointed Assis-
tant Professor in ISIR, Osaka University from
2005 to 2010, and an Assistant Professor from
2010 to 2015. His research interest includes data
mining algorithm and its environmental contri-
bution. He is a member of JSAI, IPSJ, and the
Japanese Society for Evolutionary Computation.
Masayuki Numao is a professor in the
Department of Architecture for Intelligence, the
ISIR, Osaka University. He received a B.Eng.
in electrical and electronics engineering in 1982
and his Ph.D. in computer science in 1987 from
the Tokyo Institute of Technology. He worked
in the Department of Computer Science, Tokyo
Institute of Technology from 1987 to 2003 and
was a visiting scholar at CSLI, Stanford Univer-
sity from 1989 to 1990. His research interests
include artificial intelligence, machine learning,
affective computing and empathic computing. He is a member of the In-
formation Processing Society of Japan, the JSAI, the Japanese Cognitive
Science Society, the Japan Society for Software Science and Technology,
and the American Association for Artificial Intelligence.