DISP '19, Oxford, United Kingdom
ISBN: 978-1-912532-09-4
DOI: http://dx.doi.org/10.17501........................................
Mental Emotional Sentiment Classification with an EEG-based Brain-Machine Interface
Jordan J. Bird
School of Engineering and Applied Science
Aston University
Birmingham, UK
birdj1@aston.ac.uk
Christopher D. Buckingham
School of Engineering and Applied Science
Aston University
Birmingham, UK
c.d.buckingham@aston.ac.uk
Anikó Ekárt
School of Engineering and Applied Science
Aston University
Birmingham, UK
a.ekart@aston.ac.uk
Diego R. Faria
School of Engineering and Applied Science
Aston University
Birmingham, UK
d.faria@aston.ac.uk
ABSTRACT
This paper explores single and ensemble methods to classify
emotional experiences based on EEG brainwave data. A
commercial MUSE EEG headband is used with a resolution of
four (TP9, AF7, AF8, TP10) electrodes. Positive and negative
emotional states are invoked using film clips with an obvious
valence, and neutral resting data is also recorded with no stimuli
involved, all for one minute per session. Statistical extraction of
the alpha, beta, theta, delta and gamma brainwaves is performed
to generate a large dataset that is then reduced to smaller datasets
by feature selection using scores from OneR, Bayes Network,
Information Gain, and Symmetrical Uncertainty. Of the set of
2548 features, a subset of 63 selected by their Information Gain
values were found to be best when used with ensemble classifiers
such as Random Forest. They attained an overall accuracy of
around 97.89%, outperforming the current state of the art by 2.99
percentage points. The best single classifier was a deep neural
network with an accuracy of 94.89%.
Keywords
Emotion Classification, Brain-Machine Interface, Machine
Learning.
1. INTRODUCTION
Autonomous non-invasive detection of emotional states is
potentially useful in multiple domains such as human robot
interaction and mental healthcare. It can provide an extra
dimension of interaction between user and device, as well as
enabling tangible information to be derived that does not depend
on verbal communication [1]. With the increasing availability of
low-cost electroencephalography (EEG) devices, brainwave data
is becoming affordable for the consumer industry as well as for
research, introducing the need for autonomous classification
without the requirement of an expert on hand.
Due to the complexity, randomness, and non-stationary aspects of
brainwave data, classification is very difficult with a raw EEG
stream. For this reason, stationary techniques such as time
windowing must be introduced alongside feature extraction of the
data within a window. There are many statistics that can be
derived from such EEG windows, each of which has varying
classification efficacy depending on the goal. Feature selection
must be performed to identify useful statistics and reduce the
complexity of the model generation process, saving both time and
computational resources during the training and classification
processes.
The main contributions of this work are as follows:
- Exploration of single and ensemble methods for the classification of emotions.
- A high-performing data mining strategy reaching 97.89% accuracy.
- The inclusion of facial EMG signals as part of the classification process.
- A resolution of three emotional classes (positive, neutral, negative) to allow for real-world use on mental states that are not defined by prominent emotions.
- One Rule classification demonstrating how accurately the AF7 electrode's mean value classifies mental states.
The remainder of this paper will explore related state-of-the-art
research and provide the main inspiration and influences for the
study. It will explain the methodology of data collection, feature
generation, feature selection and prediction methods. The results
will be presented and discussed alongside comparable work,
followed by conclusions and future work.
Figure 1. Diagram to show Lövheim's Cube of Emotional Categorization
2. RELATED WORK
Statistics derived from a time-windowing technique with feature
selection have been found to be effective for classifying mental
states such as relaxed, neutral, and concentrating [2]. An ensemble
method of Random Forest had an observed classification accuracy
of 87% when performed with a dataset which was pre-processed
with the OneR classifier as a feature selector. These promising results suggested that a similar exploration method could also succeed in classifying emotional states.
The best current state-of-the-art solution for classification of
emotional EEG data from a low-resolution, low-cost EEG setup
used Fisher’s Discriminant Analysis to produce an accuracy of
95% [3]. That study tried to prevent participants from becoming tense and discouraged blinking, but the previous study [2] found that EMG data from these activities helped classification because blink rates are a factor in concentration, for example. Hence the
new study described in this paper will explore classification of
emotions in EEG data when unconscious movements are neither
encouraged nor discouraged. Conscious extraneous movements
such as taking a sip of water will not be allowed because they just
form outlying or masking points in the data. For example, if the
people experiencing positive emotions are also drinking water, the
model will simply classify the electrical data that has been
generated by those movements. Stimuli to evoke emotions for
EEG-based studies are often found to be best with music [4] and
film [5]. This paper thus focuses on film clips that have audio
tracks (speech and/or music) to evoke emotions, similarly to a
related study that used music videos [6].
Common Spatial Patterns have proved extremely effective for
emotion classification, attaining an overall best solution at 93.5%
[7]. A MUSE EEG headband was successfully used to classify
high resolutions of valence through differing levels of enjoyment
during a certain task [8]. Deep Belief Network (DBN), Artificial Neural Network (ANN), and Support Vector Machine (SVM) methods have all been found effective for classifying emotions from EEG data when considering binary classes of positive and negative [9]. This study will build on all
these results using similar methods as well as an ensemble, to
exploit their differing strengths and weaknesses. The study also
supports the usage of a neutral class, for transition into real-world
use, to provide a platform for emotional classification where
emotions are not prominent. It adds valence or perceived
sentiment because this was previously found to be helpful in the
learning processes for a web-based chatbot [10].
3. BACKGROUND
3.1 Electroencephalography
Electroencephalography is the process of using applied electrodes to derive electrophysiological data and signals produced by the brain [11] [12]. Electrodes can be subdural [13], i.e. under the skull, placed on and within the brain itself. Noninvasive techniques require either wet or dry electrodes to be placed around the cranium [14]. Raw electrical data is measured in microvolts (µV) at observed time t, producing wave patterns from t to t+n.
3.2 Human Emotion
Human emotions are varied and complex but can be generalized into positive and negative categories [15]. Some emotions overlap, such as 'hope' and 'anguish', which are considered positive and negative respectively but are often experienced contemporaneously: e.g. the clearly doomed hope and accompanying anguish for a character's survival in a film. This study will concentrate on those emotions that do not overlap, to help correctly classify what is and is not a positive experience.

Table 1. Table to show Lövheim categories and their encapsulated emotions with a valence label

Emotion Category | Emotion/Valence
A | Shame (Negative), Humiliation (Negative)
B | Contempt (Negative), Disgust (Negative)
C | Fear (Negative), Terror (Negative)
D | Enjoyment (Positive), Joy (Positive)
E | Distress (Negative), Anguish (Negative)
F | Surprise (Negative) (Lack of Dopamine)
G | Anger (Negative), Rage (Negative)
H | Interest (Positive), Excitement (Positive)
Lövheim’s three-dimensional emotional model maps brain
chemical composition to generalised states of positive and
negative valence [16]. This is shown in Fig. 1 with emotion
categories A-H from each of the model’s vertices, further detailed
in Table 1. Various chemical compositions can be mapped to
emotions with positive and negative classes. Furthermore, studies
show that chemical composition influences nervous oscillation
and thus the generation of electrical brainwaves [17]. Since
emotions are encoded within chemical composition that directly
influence electrical brain activity, this study proposes that they
can be classed using statistical features of the produced
brainwaves.
Figure 2. A simplified diagram of a fully-connected feedforward deep neural network.
3.3 Machine Learning Algorithms
The study in this paper applies a number of machine learning algorithms. One Rule (OneR) classification is a simple process of selecting one attribute from the dataset and generating logical rules based upon it. For example:

"IF temperature LESS THAN 5.56 THEN December"

and

"IF temperature MORE THAN 23.43 THEN July"

are rules generated based on a temperature attribute to predict the month (class). This model will identify the strongest attribute within the dataset for classifying emotions.
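As an illustration of the OneR heuristic on hypothetical discretised data (not the paper's EEG features), a minimal sketch might look like:

```python
from collections import Counter, defaultdict

def one_rule(rows, labels):
    """Pick the single attribute whose per-value majority rules
    classify the training data best (the OneR heuristic)."""
    best = None
    for a in range(len(rows[0])):
        # For each value of attribute a, predict that value's majority class.
        by_value = defaultdict(Counter)
        for row, label in zip(rows, labels):
            by_value[row[a]][label] += 1
        rule = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
        correct = sum(rule[row[a]] == label for row, label in zip(rows, labels))
        if best is None or correct > best[2]:
            best = (a, rule, correct)
    return best  # (attribute index, value -> class rule, training hits)

# Hypothetical data: attribute 0 is informative, attribute 1 is noise.
rows = [("low", "x"), ("low", "y"), ("high", "x"), ("high", "y")]
labels = ["neg", "neg", "pos", "pos"]
attr, rule, hits = one_rule(rows, labels)
```

Here OneR correctly singles out attribute 0, whose rules classify all four training rows.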
Decision Trees follow a linear process of conditional control
statements based on attributes, through a tree-like structure where
each node is a rule based decision that will further lead to other
nodes. Finally, an end node is reached, and a class is given to the
data object. The level of randomness or entropy on all end nodes
is used to measure the classification ability of the tree. The
calculation of entropy is given as:
$E(S) = -\sum_{i=1}^{c} p_i \log_2 p_i$  (1)

Entropic models are compared by their difference in entropy, which is information gain. A positive value would be a better model, whereas a negative value shows information loss versus the comparative model. This is given as:

$Gain(T, X) = E(T) - E(T \mid X)$  (2)
where E is the entropy calculated by Equation 1.
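A minimal sketch of Equations 1 and 2 on toy labels (hypothetical data, not the EEG features):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a class distribution, as in Equation 1."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(labels, attribute_values):
    """Gain(T, X) = E(T) - E(T | X): entropy reduction from splitting on X."""
    total = len(labels)
    groups = {}
    for v, y in zip(attribute_values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

labels = ["pos", "pos", "neg", "neg"]
perfect_split = ["a", "a", "b", "b"]   # separates the classes exactly
useless_split = ["a", "b", "a", "b"]   # independent of the class
```

A perfect split recovers the full 1 bit of entropy; a class-independent split yields zero gain, which is exactly the drop-off used later for feature selection.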
Support Vector Machines (SVM) classify data points by generating and optimising a hyperplane to separate them, classifying each point based on its position relative to the hyperplane [18]. A model is considered optimised when the margin between the separating hyperplane and the closest points is at its maximum value. Sequential Minimal Optimisation (SMO) is a high-performing algorithm to generate and implement an SVM classifier [19]. The large optimisation problem is broken down into smaller subproblems that can then be solved analytically.
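The decision side of an SVM reduces to checking which side of the hyperplane a point falls on; a minimal sketch with a hypothetical, already-fitted separator (the training itself, e.g. via SMO, is omitted):

```python
def hyperplane_class(x, w, b):
    """Classify by which side of the hyperplane w.x + b = 0 the point lies on."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "pos" if score >= 0 else "neg"

# Hypothetical separator: the line x0 + x1 = 1 in two dimensions.
w, b = (1.0, 1.0), -1.0
```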
Bayes’ Theorem [20] uses conditional probabilities to determine
the likelihood of Class A based on Evidence, B, as follows:
$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$  (3)
For this study, evidence consists of attribute values (EEG
time-window statistics) and ground-truth training for determining
their most likely classes. A simpler version is known as
Naive Bayes, which assumes independence of attribute values
whether or not they are really unrelated. Classification of
Naive Bayes is adapted from Equation 3 as follows:
$\hat{y} = \arg\max_{k \in \{1, \dots, K\}} P(C_k) \prod_{i=1}^{n} P(x_i \mid C_k)$  (4)

where $\hat{y}$ is the predicted class, $k$ ranges over the $K$ classes, and $x_i$ are the attribute values of the data object (row) being classified.
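Equation 4 can be sketched with simple counting on hypothetical categorical data (no smoothing, unlike practical implementations):

```python
from collections import Counter

def naive_bayes_predict(rows, labels, query):
    """argmax_k P(C_k) * prod_i P(x_i | C_k), with probabilities estimated
    by counting over the training data (Equation 4; no smoothing)."""
    classes = Counter(labels)
    total = len(labels)
    best_class, best_score = None, -1.0
    for k, nk in classes.items():
        score = nk / total                      # prior P(C_k)
        for i, value in enumerate(query):
            match = sum(1 for row, y in zip(rows, labels)
                        if y == k and row[i] == value)
            score *= match / nk                 # likelihood P(x_i | C_k)
        if score > best_score:
            best_class, best_score = k, score
    return best_class

# Hypothetical discretised attributes with class-aligned values.
rows = [("high", "yes"), ("high", "no"), ("low", "yes"), ("low", "no")]
labels = ["pos", "pos", "neg", "neg"]
```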
Logistic Regression is a symmetric statistical model used for mapping a numerical value to a probability, e.g. hours of study to predict a student's exam grade [21]. For a binary classification problem with attributes $x_i$ and model parameters $\beta$, the log odds $l$ are given as

$l = \beta_0 + \sum_{i=1}^{n} \beta_i x_i$

and the corresponding odds of the outcome are therefore

$o = e^{\beta_0 + \sum_{i=1}^{n} \beta_i x_i}$

which can be used to predict a model outcome based on previous data.
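A minimal sketch of the log odds and the resulting probability, with hypothetical fitted coefficients:

```python
import math

def log_odds(x, beta0, betas):
    """The linear predictor l = beta_0 + sum_i beta_i * x_i."""
    return beta0 + sum(b * xi for b, xi in zip(betas, x))

def probability(x, beta0, betas):
    """Map log odds to a probability: p = o / (1 + o), where o = e^l."""
    odds = math.exp(log_odds(x, beta0, betas))
    return odds / (1.0 + odds)

# Hypothetical fitted model with one attribute: beta_0 = -1, beta_1 = 2.
p = probability([0.5], -1.0, [2.0])   # log odds 0, so p = 0.5
```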
A Multilayer Perceptron is a type of Artificial Neural Network
(ANN) that predicts a class by taking input parameters and
computing them through a series of hidden layers to one or more
nodes on the final output layer. More than one hidden layer forms
a deep neural network and output layers can be different classes
or, if there is just one, a regression output. A simplified diagram
of a fully connected feed forward deep neural network can be seen
in Fig. 2. Learning is performed for a defined time and follows the
process of backpropagation [22], which is the process of deriving
a gradient that is further used to calculate weights for each node
(neuron) in the network. Training is based on reducing the error rate given by the error function, i.e. the performance of a network in terms of correct and incorrect classifications, or the total Euclidean distance from the real numerical values. An error is calculated at the output and fed backwards from outputs to inputs.
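A toy backpropagation sketch (hypothetical network size, learning rate and task, not the paper's tuned network; a separable OR task is used so the loss reliably decreases):

```python
import math
import random

def train(data, epochs=2000, lr=0.5, seed=0):
    """A tiny 2-2-1 sigmoid network trained by backpropagation:
    the output error is propagated backwards to update every weight."""
    rng = random.Random(seed)
    sig = lambda z: 1.0 / (1.0 + math.exp(-z))
    wh = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # 2 inputs + bias
    wo = [rng.uniform(-1, 1) for _ in range(3)]                      # 2 hidden + bias
    losses = []
    for _ in range(epochs):
        total = 0.0
        for (x1, x2), t in data:
            # forward pass
            h = [sig(w[0] * x1 + w[1] * x2 + w[2]) for w in wh]
            o = sig(wo[0] * h[0] + wo[1] * h[1] + wo[2])
            total += (t - o) ** 2
            # backward pass: output delta first, then hidden deltas
            do = (o - t) * o * (1 - o)
            dh = [do * wo[j] * h[j] * (1 - h[j]) for j in range(2)]
            for j in range(2):
                wo[j] -= lr * do * h[j]
                wh[j][0] -= lr * dh[j] * x1
                wh[j][1] -= lr * dh[j] * x2
                wh[j][2] -= lr * dh[j]
            wo[2] -= lr * do
        losses.append(total)
    return losses

# Hypothetical separable task: logical OR.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
losses = train(data)
```

The squared-error loss falls over the epochs, illustrating the error being fed backwards from outputs to inputs.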
3.4 Model Ensemble Methods
An ensemble combines two or more prediction models into a
single process. A method of fusion takes place to increase the
success rate of a prediction process by treating the models as a
sum of their parts.
Voting is a simple ensemble process of combining models and allowing them to vote through a democratic or elitist process. Each of the models is trained, and then for prediction, they award vote v to class(es) via a specified method:
- Average of probabilities: v = confidence
- Majority vote: v = 1
- Min/Max probability: v = average confidence of all models
Following the selected process, a democracy will produce an outcome prediction as the class that has received the strongest vote or set of votes.
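The two most common fusion rules can be sketched directly; the confidences below are hypothetical model outputs, not values from this study:

```python
from collections import Counter

def vote_average(prob_lists):
    """Average-of-probabilities fusion: each model contributes its class
    confidences; the class with the highest mean confidence wins."""
    classes = prob_lists[0].keys()
    return max(classes,
               key=lambda c: sum(p[c] for p in prob_lists) / len(prob_lists))

def vote_majority(predictions):
    """Majority vote: every model casts v = 1 for its predicted class."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical confidences from three trained models for one data object.
probs = [{"pos": 0.6, "neg": 0.4},
         {"pos": 0.4, "neg": 0.6},
         {"pos": 0.9, "neg": 0.1}]
```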
Figure 3. EEG sensors TP9, AF7, AF8 and TP10 of the
Muse headband on the international standard EEG
placement system [26]
Random Forest forms a voting ensemble from Decision Trees
[23]. Multiple trees are generated on randomly generated subsets
of the input data (Bootstrap Aggregation) and then those trees, the
random forest, will all vote on their predicted outcome and a
prediction is derived. Adaptive Boosting is the process of creating
multiple unique instances of one type of model prediction to
effectively improve the model in situations where selected
parameters may prove ineffective [24]. Classification predictions
are combined and weighted after a process of using a random data
subset to improve on a previous iteration of a model. Combination
is given as:

$F_T(x) = \sum_{t=1}^{T} f_t(x)$  (5)

where $F_T$ is the combination of the $T$ constituent models $f_t$ and $x$ is the data object with an unknown class [25].
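A minimal sketch of bootstrap aggregation, with one-attribute stumps standing in for full decision trees (hypothetical data; a real Random Forest grows much deeper trees):

```python
import random
from collections import Counter

def stump(rows, labels, attr):
    """A one-attribute 'decision stump': majority class per attribute value."""
    votes = {}
    for row, y in zip(rows, labels):
        votes.setdefault(row[attr], Counter())[y] += 1
    return {v: c.most_common(1)[0][0] for v, c in votes.items()}

def bagged_forest(rows, labels, n_trees=15, seed=0):
    """Bootstrap aggregation: each stump is trained on a random resample
    and a random attribute, and the ensemble predicts by majority vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_trees):
        sample = [rng.randrange(len(rows)) for _ in rows]   # bootstrap indices
        attr = rng.randrange(len(rows[0]))                  # random attribute
        rule = stump([rows[i] for i in sample],
                     [labels[i] for i in sample], attr)
        models.append((attr, rule))
    def predict(row):
        votes = [rule.get(row[attr], labels[0]) for attr, rule in models]
        return Counter(votes).most_common(1)[0][0]
    return predict

# Hypothetical data where both attributes align with the class.
rows = [("low", "a"), ("low", "a"), ("high", "b"), ("high", "b")]
labels = ["neg", "neg", "pos", "pos"]
predict = bagged_forest(rows, labels)
```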
4. METHOD
The study employs four dry extra-cranial electrodes via a
commercially available MUSE EEG headband. Microvoltage
measurements are recorded from the TP9, AF7, AF8, and TP10
electrodes, as seen in Fig. 3. Sixty seconds of data were recorded from two subjects (1 male, 1 female, aged 20-22) for each of the 6 film clips found in Table 2, producing 12 minutes (720 seconds) of brain activity data (6 minutes for each emotional state). Six minutes of neutral brainwave data were also collected, resulting in a grand total of 36 minutes of EEG data recorded from subjects. With a variable frequency resampled to 150 Hz, this resulted in a dataset of 324,000 data points collected from the waves produced by the brain. Activities were exclusively stimuli that would evoke emotional responses from the set of emotions found in Table 1 and were considered by their valence labels of positive and negative rather than the emotions themselves. Neutral data were also collected, without stimuli and before any of the emotions data (to avoid contamination by the latter), for a third class that would be the resting emotional state of the subject. Three minutes of data were collected per day to reduce the interference of a resting emotional state.
Table 2. Source of Film Clips used as Stimuli for EEG Brainwave Data Collection

Stimulus      | Valence | Year
Marley and Me | Neg     | 2008
Up            | Neg     | 2009
My Girl       | Neg     | 1991
La La Land    | Pos     | 2016
Slow Life     | Pos     | 2014
Funny Dogs    | Pos     | 2015
Table 3. Attribute Evaluation Methods used to Generate Datasets for Model Training

Evaluator               | Ranker Cutoff | No. Attributes
OneR                    | 0.4           | 52
BayesNet                | 0.4           | 67
InfoGain                | 0.75          | 63
Symmetrical Uncertainty | 0.4           | 72
Participants were asked to watch the film without making any conscious movements (e.g. drinking coffee) to prevent the influence of Electromyographic (EMG) signals on the data, due to their prominence over brainwaves in terms of signal strength. A previous study that suggested blinking patterns are useful for classifying mental states [2] inspired this study to neither encourage nor discourage unconscious movements. Observations of the experiment showed a participant smile for a few seconds during the 'funny dogs' compilation clip, as well as become visibly upset during the 'Marley and Me' film clip (death scene). These facial expressions will influence the recorded data but are factored into the classification model because they accurately reflect behaviour in the real world, where these emotional responses would also occur. Hence, to accurately model realistic situations, both EEG and facial EMG signals are considered as informative. To generate a dataset of statistical features, an effective methodology from a previous study [2] was used to extract 2400 features through a sliding window of 1 second beginning at t=0 and t=0.5. Downsampling was set to the minimum observed frequency of 150 Hz.
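A minimal sketch of the windowing step, assuming a hypothetical single-channel microvolt trace and only four of the many statistical features extracted in the paper:

```python
import math
import statistics

def window_features(signal, rate=150, width=1.0, step=0.5):
    """Statistical features over 1-second windows that slide by 0.5 s,
    mirroring the windowing used to build the dataset (a simplified sketch:
    only mean/std/min/max, not the full feature set)."""
    w, s = int(rate * width), int(rate * step)
    features = []
    for start in range(0, len(signal) - w + 1, s):
        chunk = signal[start:start + w]
        features.append({
            "mean": statistics.fmean(chunk),
            "std": statistics.pstdev(chunk),
            "min": min(chunk),
            "max": max(chunk),
        })
    return features

# Hypothetical 2-second trace sampled at 150 Hz: a pure 10 Hz oscillation.
signal = [math.sin(2 * math.pi * 10 * t / 150) for t in range(300)]
feats = window_features(signal)
```

Two seconds of signal yield three overlapping windows (starts at 0 s, 0.5 s and 1.0 s), each summarised by its own feature dictionary.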
Feature selection algorithms were run to generate a reduced dataset from the 2,549 source attributes. Chosen methods ranked attributes based on their effectiveness when used in classification, and a manual cutoff point was tuned where the score began to drop off, therefore retaining only the strongest attributes. Details of attribute numbers generated by each method can be seen in Table 3. The reduced dimensionality makes the classification experiments more tractable and within the remit of the given computational resources.
5. PRELIMINARY RESULTS
Model training for each method was performed on every dataset generated by the four methods shown in Table 3. The parameters, where required, were set to the following:
- 10-fold cross validation for training models (average of 10 models on 10 folds of data).
- A manually tuned deep neural network of two layers, with 30 and 20 neurons respectively. Backward propagation of errors. 500-epoch training time.
- All random numbers generated by the Java Virtual Machine with a seed of 0.
- Ensemble voting based on Average Probabilities.
After downsampling, there were slightly more data points for the neutral state; as a benchmark, a Zero Rules ('most common class') classifier would therefore classify all points as neutral, achieving 33.58% accuracy. Any result above this shows useful rule generation.
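The Zero Rules baseline is simply the relative frequency of the majority class; a minimal sketch with hypothetical label counts (not the study's actual distribution):

```python
from collections import Counter

def zero_rule_accuracy(labels):
    """Accuracy of the 'most common class' (Zero Rules) baseline:
    always predict the majority class."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)

# Hypothetical distribution with slightly more 'neutral' windows.
labels = ["neutral"] * 340 + ["positive"] * 330 + ["negative"] * 330
baseline = zero_rule_accuracy(labels)   # fraction of neutral windows
```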
Models for the ensemble were selected manually based on best performance. Voting was performed on average probabilities using the Random Tree, SMO, BayesNet, Logistic Regression, and MLP models. Random Forest, due to its impressive classification ability, was further optimised with the AdaBoost algorithm.
Results of both single and ensemble classifiers can be seen in Table 4. The best model, a Random Forest with the InfoGain dataset, achieved a high accuracy of 97.89%. The few classification errors came from a short few seconds of the half-hour dataset, meaning that errors could be almost completely mitigated when classifying in real time due to the sliding window technique used for small timeframes t-n. Adaptive boosting was promising for all Random Forest models but could not achieve a higher score, pointing towards the possibility of outlying points. For single classification, the multilayer perceptron was the most consistently best model, showing the effectiveness of neural networks for this particular problem.
The effectiveness of OneR classification showed that a single best attribute (the mean value of AF7) existed that alone had a classification ability of 85.27%. The rule is specified in Fig. 4. The normalised mean values of the time windows extracted from the AF7 electrode show that minimum and maximum values most commonly map to negative emotions, whereas positive and neutral are very closely related, having rules overlapping one another. One Rule classification improved over the Zero Rule benchmark by over 50 percentage points, making this an effective attribute to prioritise when utilising more than one attribute in the other methods.
The two best models in our study are compared to the state-of-the-art alternatives in Table 5. The method of generating attributes, attribute selection via information gain, and finally classification with a
Table 4. Classification Accuracy of Single and Ensemble Methods on the Four Generated Datasets

Dataset                 | OneR  | RT    | SMO   | NB    | BN    | LR    | MLP   | RF    | Vote  | AB(RF)
OneR                    | 85.18 | 91.18 | 89.49 | 66.56 | 91.18 | 91.84 | 92.07 | 95.26 | 92.68 | 95.59
BayesNet                | 85.27 | 93.05 | 89.49 | 60.69 | 91.23 | 91.93 | 93.81 | 97.14 | 93.39 | 97.23
InfoGain                | 85.27 | 94.18 | 89.82 | 60.98 | 91.46 | 92.35 | 94.89 | 97.89 | 94.04 | 97.84
Symmetrical Uncertainty | 85.27 | 94.15 | 89.54 | 69.66 | 92.03 | 91.93 | 94.18 | 97.56 | 94.32 | 97.65

(OneR through MLP are single models; RF, Vote, and AB(RF) are ensemble models.)
Table 5. An Indirect Comparison of this Study to Similar Works Performed on Different Datasets

Study                | Method                   | Accuracy
This study           | InfoGain, Random Forest  | 97.89
Bos, et al. [3]      | Fisher's Discriminant    | 94.9
This study           | InfoGain, MLP            | 94.89
Li, et al. [7]       | Common Spatial Patterns  | 93.5
Li, et al.           | Linear SVM               | 93
Zheng, et al. [9]    | Deep Belief Network      | 87.62
Koelstra, et al. [6] | Common Spatial Patterns  | 58.8
Normalised mean value of the AF7 electrode:
< -460.0 -> NEGATIVE
< -436.5 -> POSITIVE
< -101.5 -> NEGATIVE
< 25.45 -> POSITIVE
< 25.85 -> NEUTRAL
< 26.25 -> POSITIVE
< 37.7 -> NEUTRAL
< 39.05 -> POSITIVE
< 43.599999999999994 -> NEUTRAL
< 63.95 -> POSITIVE
< 97.7 -> NEUTRAL
< 423.0 -> POSITIVE
>= 423.0 -> NEGATIVE
Figure 4. The most effective single rule for
classification.
Random Forest outperforms an FDA model by 2.99 points.
Further work should be carried out to identify whether this
improved result was due to the methods chosen or the attribute
generation and selection, or possibly both.
6. DISCUSSION
The high performance of simple multilayer perceptrons suggests
neural network models can be effective, especially more complex
ones such as Convolutional Neural Networks (CNNs) that have
performed well in various classification experiments [27].
Similarly, ensemble and Bayesian models are promising avenues that could perform better with more advanced models, such as Dynamic Bayesian Mixture Models (DBMM) [28] that have previously been applied to statistical data extracted from EEG brainwave signals.
Being able to recognise emotions autonomously would be
valuable for mental-health decision support systems such as
GRiST which is a risk and safety management system used by
mental-health practitioners and by people for assessing
themselves [29], [30]. Evaluations of emotions independent of
self-reporting would help calibrate the advice as well as guiding
more sensitive interactions. The measurement of brainwaves used
in this paper is too intrusive but would be useful for providing a
benchmark for finding other more appropriate methods.
7. CONCLUSION
This paper explored the application of single and ensemble
methods of classification to take windowed data from four points
on the scalp and quantify that data into an emotional
representation of what the participant was feeling at that time. The
methods showed that using a low resolution, commercially
available EEG headband can be effective for classifying a
participant’s emotional state. There is considerable potential for
producing classification algorithms that have practical value for
real-world decision support systems. Responding to emotional
states can improve interaction and, for mental-health systems,
contribute to the overall assessment of issues and how to resolve
them.
ACKNOWLEDGEMENT
This work was partially supported by the European Commission
through the H2020 project EXCELL (https://www.excell-
project.eu/), grant number 691829 (A. Ekart) and by the EIT
Health GRaCEAGE grant number 18429 awarded to C. D.
Buckingham.
REFERENCES
[1] M. S. El-Nasr, J. Yen, and T. R. Ioerger, "FLAME - fuzzy logic adaptive model of emotions," Autonomous Agents and Multi-Agent Systems, vol. 3, no. 3, pp. 219-257, 2000.
[2] J. J. Bird, L. J. Manso, E. P. Ribiero, A. Ekart, and D. R. Faria, "A study on mental state classification using EEG-based brain-machine interface," in 9th International Conference on Intelligent Systems, IEEE, 2018.
[3] D. O. Bos et al., "EEG-based emotion recognition," The Influence of Visual and Auditory Stimuli, pp. 1-17, 2006.
[4] Y.-P. Lin, C.-H. Wang, T.-P. Jung, T.-L. Wu, S.-K. Jeng, J.-R. Duann, and J.-H. Chen, "EEG-based emotion recognition in music listening," IEEE Transactions on Biomedical Engineering, vol. 57, no. 7, pp. 1798-1806, 2010.
[5] X.-W. Wang, D. Nie, and B.-L. Lu, "Emotional state classification from EEG data using machine learning approach," Neurocomputing, vol. 129, pp. 94-106, 2014.
[6] S. Koelstra, A. Yazdani, M. Soleymani, C. Mühl, J.-S. Lee, A. Nijholt, T. Pun, T. Ebrahimi, and I. Patras, "Single trial classification of EEG and peripheral physiological signals for recognition of emotions induced by music videos," in Int. Conf. on Brain Informatics, pp. 89-100, Springer, 2010.
[7] M. Li and B.-L. Lu, "Emotion classification based on gamma-band EEG," in Engineering in Medicine and Biology Society, EMBC 2009, Annual International Conference of the IEEE, pp. 1223-1226, IEEE, 2009.
[8] M. Abujelala, C. Abellanoza, A. Sharma, and F. Makedon, "Brain-EE: Brain enjoyment evaluation using commercial EEG headband," in Proceedings of the 9th ACM International Conference on Pervasive Technologies Related to Assistive Environments, p. 33, ACM, 2016.
[9] W.-L. Zheng, J.-Y. Zhu, Y. Peng, and B.-L. Lu, "EEG-based emotion classification using deep belief networks," in Multimedia and Expo (ICME), 2014 IEEE International Conference on, pp. 1-6, IEEE, 2014.
[10] J. J. Bird, A. Ekárt, and D. R. Faria, "Learning from interaction: An intelligent networked-based human-bot and bot-bot chatbot system," in UK Workshop on Computational Intelligence, pp. 179-190, Springer, 2018.
[11] B. E. Swartz, "The advantages of digital over analog recording techniques," Electroencephalography and Clinical Neurophysiology, vol. 106, no. 2, pp. 113-117, 1998.
[12] A. Coenen, E. Fine, and O. Zayachkivska, "Adolf Beck: A forgotten pioneer in electroencephalography," Journal of the History of the Neurosciences, vol. 23, pp. 276-286, 2014.
[13] A. K. Shah and S. Mittal, "Invasive electroencephalography monitoring: Indications and presurgical planning," Annals of Indian Academy of Neurology, vol. 17, p. S89, 2014.
[14] B. A. Taheri, R. T. Knight, and R. L. Smith, "A dry electrode for EEG recording," Electroencephalography and Clinical Neurophysiology, vol. 90, no. 5, pp. 376-383, 1994.
[15] K. Oatley and J. M. Jenkins, Understanding Emotions. Blackwell Publishing, 1996.
[16] H. Lövheim, "A new three-dimensional model for emotions and monoamine neurotransmitters," Medical Hypotheses, vol. 78, no. 2, pp. 341-348, 2012.
[17] J. Gruzelier, "A theory of alpha/theta neurofeedback, creative performance enhancement, long distance functional connectivity and psychological integration," Cognitive Processing, vol. 10, no. 1, pp. 101-109, 2009.
[18] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273-297, 1995.
[19] J. Platt, "Sequential minimal optimization: A fast algorithm for training support vector machines," 1998.
[20] T. Bayes, R. Price, and J. Canton, "An essay towards solving a problem in the doctrine of chances," 1763.
[21] S. H. Walker and D. B. Duncan, "Estimation of the probability of an event as a function of several independent variables," Biometrika, vol. 54, no. 1-2, pp. 167-179, 1967.
[22] Y. Bengio, I. J. Goodfellow, and A. Courville, "Deep learning," Nature, vol. 521, no. 7553, pp. 436-444, 2015.
[23] T. K. Ho, "Random decision forests," in Document Analysis and Recognition, 1995, Proceedings of the Third International Conference on, vol. 1, pp. 278-282, IEEE, 1995.
[24] Y. Freund and R. E. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting," Journal of Computer and System Sciences, vol. 55, no. 1, pp. 119-139, 1997.
[25] R. Rojas, "AdaBoost and the super bowl of classifiers: a tutorial introduction to adaptive boosting," Freie University, Berlin, Tech. Rep., 2009.
[26] H. H. Jasper, "The ten-twenty electrode system of the international federation," Electroencephalogr. Clin. Neurophysiol., vol. 10, pp. 370-375, 1958.
[27] M. Hussain, J. J. Bird, and D. R. Faria, "A study on CNN transfer learning for image classification," in UK Workshop on Computational Intelligence, pp. 191-202, Springer, 2018.
[28] D. R. Faria, M. Vieira, C. Premebida, and U. Nunes, "Probabilistic human daily activity recognition towards robot-assisted living," in Robot and Human Interactive Communication (RO-MAN), 2015 24th IEEE International Symposium on, pp. 582-587, IEEE, 2015.
[29] C. D. Buckingham, A. Ahmed, and A. Adams, "Designing multiple user perspectives and functionality for clinical decision support systems," pp. 211-218, 2013.
[30] C. D. Buckingham, A. Adams, L. Vail, A. Kumar, A. Ahmed, A. Whelan, and E. Karasouli, "Integrating service user and practitioner expertise within a web-based system for collaborative mental-health risk and safety management," Patient Education and Counseling, pp. 1189-1196, 2015.
... Gamma brainwaves, the fastest detectable EEG brainwaves, are associated with peak mental states and have been compared to heightened perception [7]. Beta brainwaves (13–32 Hz) are best identified when we are actively thinking. The first brainwaves to be identified were alpha waves (8–13 Hz), among the easiest to examine. ...
... The proposed watermarking method uses EEG raw data [31] containing four signals (TP9, AF7, AF8, TP10), with the emotions dataset [32] serving as the input host image of size 512 × 512. The identity of the patient is taken as the watermark image. ...
... Table 8 shows the extracted watermark images under various attacks. Tables 9 and 10 show the values of the performance parameters for the emotions dataset [32]. ...
Article
Full-text available
Medical data are increasing drastically due to the vast development of the medical sciences, and securing this immense volume of data is a challenge of the present era. Image watermarking is a technique to secure medical data against alteration, and authentication of patient records is also necessary when medical data are transmitted. In this paper, an optimized electroencephalogram watermarking technique with dual authentication, using the Advanced Encryption Standard (AES) and speeded-up robust features, is proposed. The scaling factor plays an important role in balancing the properties of a watermarking algorithm, and cuckoo search optimization is used to obtain an optimized scaling factor. Henon encryption (HE) enhances the security of the sub-band obtained from the patient-identity image used as the watermark. The diagonalized Hessenberg decomposition (HD) is used for embedding the watermark, while the secure hash algorithm (SHA-256) protects the watermark against malicious attacks. A detailed security analysis has been performed for the AES encryption technique, and various performance metrics are computed to estimate the effectiveness of the watermarking system.
... The extracted features were then saved and later appended to those extracted from the biorthogonal wavelet transform applied to the same pre-processed spectrogram images. This amalgamation of feature sets had 2359 attributes: 2047 extracted by the Xception model and the remaining 312 output by the wavelet transformation performed up to level 2. This was further reinforced by appending it to the original feature dataset used by J. J. Bird [2], which combined max, min, log-covariance features, statistical features, Shannon entropy, an energy model, derivatives, and log energy entropy, to generate a feature set with 4907 attributes. ...
... 2047 features were extracted courtesy of the Xception model, and 312 additional features were extracted using a multilevel (up to level 2) wavelet transform. All of these features were then appended to the original statistical dataset by J. J. Bird [2] to generate a total of 4907 attributes. ...
Article
Full-text available
Throughout the years, major advancements have been made in the field of EEG-based emotion classification. Implementing deep architectures for supervised and unsupervised learning from data has come a long way. This study aims to capitalize on these advancements to classify emotions from EEG signals accurately. It still is, however, a challenging task. The fact that the data we are reliant on changes from person to person calls for an elaborate machine-learning solution that can achieve high degrees of abstraction without sacrificing accuracy and legibility. In this study, the Xception model from Keras API was utilized, as well as wavelet transform for feature extraction, which was then used for classification using different classifiers. These features were classified into three distinct categories: NEGATIVE, POSITIVE and NEUTRAL. To examine the effectiveness of the Xception deep neural net, we compare the results of different classifiers like Support Vector Machine, Random Forest, AdaBoostM1, LogitBoost, Naïve Bayes Updateable and Non-Nested Generalization Exemplars. The random forest ensemble achieved the best results from all the classifiers implemented in this study. It had higher accuracy scores than existing models without compromising on areas like precision, F1 score, and recall value.
... Several studies demonstrate machine learning's ability to decode cognitive functions from neural activity recorded via EEG. Emotion recognition, which seeks to identify emotional states [4][5][6][7][8], has applications ranging from clinical research [9] to targeted product development [10,11]. Other studies involving neural decoding via EEG and machine learning involve motor tasks and motor imagery [12][13][14][15][16][17][18], seizure detection [19][20][21][22], mental workload [23,24], and even visual stimulus decoding [25,26]. ...
Preprint
We describe a method for the neural decoding of memory from EEG data. Using this method, a concept being recalled can be identified from an EEG trace with an average top-1 accuracy of about 78.4% (chance 4%). The method employs deep representation learning with supervised contrastive loss to map an EEG recording of brain activity to a low-dimensional space. Because representation learning is used, concepts can be identified even if they do not appear in the training data set. However, reference EEG data must exist for each such concept. We also show an application of the method to the problem of information retrieval. In neural information retrieval, EEG data is captured while a user recalls the contents of a document, and a list of links to predicted documents is produced.
... Using the XGBoost algorithm, the technique used in this study achieved a best accuracy of 99.62%. To the best of our knowledge, using the RandomSearchCV technique to adjust the hyperparameters greatly helped produce comparatively better performance across all outcomes [35]. Based on the results, the dataset may have been overfitted, as certain methods still had lower values even after hyperparameter adjustment. ...
Article
One of the most exciting areas of computer science right now is brain-computer interface (BCI) research. A brain-computer interface is a conduit for data flow between the brain and an electronic device. Researchers in several disciplines have benefited from the advances made possible by brain-computer interfaces; primary fields of study include healthcare and neuroergonomics. Brain signals could be used in a variety of ways to improve healthcare at every stage, from diagnosis to rehabilitation to eventual restoration. In this research, we demonstrate how to classify EEG signals of brain waves using machine learning algorithms to predict mental health states. The XGBoost algorithm achieves an accuracy of 99.62%, which is higher than that of any other study of its kind and the best result to date for diagnosing people's mental states from their EEG signals. This discovery will aid in taking efforts to predict mental state from EEG signals [1] to the next level.
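The randomized hyperparameter search described in the abstract above can be sketched in a few lines. This is a minimal illustrative sketch, not the study's pipeline: the parameter space, the `mock_cv_score` stand-in, and all names are hypothetical, and in practice the score function would be a cross-validated XGBoost fit (e.g. via scikit-learn's `RandomizedSearchCV`).

```python
import random

def random_search(param_space, score_fn, n_iter=20, seed=0):
    """Randomized hyperparameter search: sample n_iter configurations
    uniformly from param_space and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: rng.choice(choices) for name, choices in param_space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical XGBoost-style search space (illustrative values only).
space = {
    "n_estimators": [100, 200, 400],
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.1, 0.3],
}

# Stand-in scorer: a real pipeline would fit and cross-validate a model here.
def mock_cv_score(params):
    return params["n_estimators"] / 400 - abs(params["max_depth"] - 5) * 0.05

best, score = random_search(space, mock_cv_score)
```

Sampling configurations at random, rather than exhaustively as in a grid search, is what lets such a search cover wide hyperparameter ranges cheaply.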
... EEG is involved in online gaming, virtual reality, and e-health, and it supports psychological analyses [1]. The majority of research papers have explored emotion recognition based on EEG signals using machine learning approaches [1,[6][7][8][9][10][11][12][13][14][15][16][17]]. Some datasets have been established for emotion recognition, such as the SJTU Emotion EEG Dataset (SEED) [7]. ...
Article
Full-text available
Human emotions are too complex to be accurately recognized by others. In the era of Artificial Intelligence (AI), automatic emotion recognition has become an active field for research and applications.
Article
Full-text available
The development of brain-machine interfaces (BMIs) has revolutionized the study of neuroscience by making it possible for the brain to communicate directly with outside objects. In this study, EEG brainwave data are used to categorize emotional experiences using both single and ensemble methods. We employ a four-electrode (TP9, AF7, AF8, and TP10) commercial MUSE EEG headband. Film clips with clear emotional content elicit both positive and negative emotions, and neutral resting data are also obtained without external stimuli for one minute per session. We use machine learning algorithms to decode and interpret participant EEG data recorded while they perform activities that evoke a range of emotional reactions. Relevant characteristics are collected from the EEG signals using intensive data preprocessing, feature extraction, and selection approaches to identify the underlying patterns of cognitive sentimental feelings. The retrieved features are then used as input for the development and assessment of classification models, such as Gradient Artificial Neural Networks (G-ANN). In conclusion, this study presents an EEG-based BMI system for categorizing cognitive sentimental emotions. The proposed G-ANN achieves a high accuracy of 98.59%, demonstrating superior performance compared to existing methodologies.
Chapter
This short chapter is dedicated to a research approach that is promising but still in its infancy and ethically questionable: direct connection to the brain. In this chapter, we present the main techniques of brain–computer interfaces and their role in virtual reality.
Article
During the Covid-19 lockdown, we invited several students with varying levels of education (high school, middle school, undergraduate) to watch an online lecture, and we recorded their EEG data and brain waves during the lecture. We began by asking them several questions to understand their knowledge base, then picked several videos that they would be able to understand and several that they would not. We then recorded their EEG data and brain waves, and added a binary variable indicating whether the student understood the lecture (1 = understood the lecture, 0 = did not understand the lecture). We compiled all of our recordings and appended them to a single dataset (EEG data.csv). We also recorded various details relating to the students and videos used; they can be found in Subject details.csv and Video details.csv.
Conference Paper
Full-text available
This work aims to find discriminative EEG-based features and appropriate classification methods that can categorise brainwave patterns based on their level of activity or frequency, for mental state recognition useful in human-machine interaction. Using the Muse headband with four EEG sensors (TP9, AF7, AF8, TP10), we categorised three possible states, relaxing, neutral and concentrating, based on states of mind defined by cognitive behavioural studies. We created a dataset with five individuals and sessions lasting one minute for each class of mental state in order to train and test different methods. Given the proposed set of features extracted from the headband's five signals (alpha, beta, theta, delta, gamma), we tested combinations of different feature selection algorithms and classifier models to compare their performance in terms of recognition accuracy and number of features needed. Tests such as 10-fold cross validation were performed. Results show that only 44 features from a set of over 2100 are necessary when used with classical classifiers such as Bayesian Networks, Support Vector Machines and Random Forests, attaining an overall accuracy of over 87%.
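The feature selection step described above scores each candidate feature by how much it reduces the entropy of the class labels. A minimal sketch of the Information Gain score for one discrete feature might look like the following (toy data; real brainwave features would first be discretised, and the variable names are illustrative only):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(Y) of a label sequence, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for one discrete feature column."""
    n = len(labels)
    by_value = {}
    for x, y in zip(feature, labels):
        by_value.setdefault(x, []).append(y)
    conditional = sum(len(ys) / n * entropy(ys) for ys in by_value.values())
    return entropy(labels) - conditional

# Toy example: a binarised brainwave feature vs. emotional state labels.
feature = ["high", "high", "low", "low", "high", "low"]
labels  = ["POS",  "POS",  "NEG", "NEG", "POS",  "NEG"]
ig = information_gain(feature, labels)  # 1.0: feature perfectly predicts label
```

Ranking every feature by this score and keeping the top-scoring subset is the selection strategy the paper's "Information Gain" experiments refer to; an uninformative (e.g. constant) feature scores 0.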
Conference Paper
Full-text available
In this paper we propose an approach to chatbot software that is able to learn from interaction via text messaging, both human-bot and bot-bot. The bot listens to a user and decides whether or not it knows how to reply to the message accurately based on current knowledge; otherwise, it sets about learning a meaningful response to the message through pattern matching based on its previous experience. Similar methods are used to detect offensive messages, and prove effective at overcoming the issues that other chatbots have experienced in the open domain. Given the failure of Microsoft Tay, a philosophy of preferring too much censorship to too little is employed. A layered approach is devised to conduct each process, leaving the architecture open to improvement with more advanced methods in the future. Preliminary results show an improvement over time in which the bot learns more responses. A novel message-simplification step is added to the bot's architecture, and the results suggest that this algorithm improves the bot's conversational performance by a factor of three.
Conference Paper
Full-text available
Many image classification models have been introduced to help tackle the foremost issue of recognition accuracy. Image classification is one of the core problems in the field of Computer Vision, with a large variety of practical applications; examples include object recognition for robotic manipulation and pedestrian or obstacle detection for autonomous vehicles. A lot of attention has been paid to Machine Learning, specifically neural networks such as the Convolutional Neural Network (CNN), which has won image classification competitions. This work proposes the study and investigation of such a CNN architecture (Inception-v3) to establish whether it works best in terms of accuracy and efficiency on new image datasets via Transfer Learning. The retrained model is evaluated, and the results are compared to some state-of-the-art approaches.
Conference Paper
Full-text available
Previous studies that involve measuring EEG, or electroencephalograms, have mainly been experimentally-driven projects; for instance, EEG has long been used in research to help identify and elucidate our understanding of many neuroscientific, cognitive, and clinical issues (e.g., sleep, seizures, memory). However, advances in technology have made EEG more accessible to the population. This opens up lines for EEG to provide more information about brain activity in everyday life, rather than in a laboratory setting. To take advantage of the technological advances that have allowed for this, we introduce the Brain-EE system, a method for evaluating user engaged enjoyment that uses a commercially available EEG tool (Muse). During testing, fifteen participants engaged in two tasks (playing two different video games via tablet), and their EEG data were recorded. The Brain-EE system supported much of the previous literature on enjoyment; increases in frontal theta activity strongly and reliably predicted which game each individual participant preferred. We hope to develop the Brain-EE system further in order to contribute to a wide variety of applications (e.g., usability testing, clinical or experimental applications, evaluation methods, etc.).
Conference Paper
Full-text available
In this work, we present a human-centered robot application in the scope of daily activity recognition towards robot-assisted living. Our approach consists of a probabilistic ensemble of classifiers as a dynamic mixture model considering the Bayesian probability, where each base classifier contributes to the inference in proportion to its posterior belief. The classification model relies on the confidence obtained from an uncertainty measure that assigns a weight for each base classifier to counterbalance the joint posterior probability. Spatio-temporal 3D skeleton-based features extracted from RGB-D sensor data are modeled in order to characterize daily activities, including risk situations (e.g.: falling down, running or jumping in a room). To assess our proposed approach, state-of-the-art datasets such as MSR-Action3D Dataset and MSR-Activity3D Dataset [1] are used to compare the results with other recent methods. Reported results on test datasets show that our proposed approach outperforms state-of-the-art methods in terms of precision, recall, and overall accuracy. Moreover, we also validated our framework running on-the-fly in a mobile robot with an RGB-D sensor to identify daily activities for a robot-assisted living application.
Conference Paper
Full-text available
In recent years, there have been many great successes in using deep architectures for unsupervised feature learning from data, especially for images and speech. In this paper, we introduce recent advanced deep learning models to classify two emotional categories (positive and negative) from EEG data. We train a deep belief network (DBN) with differential entropy features extracted from multichannel EEG as input. A hidden Markov model (HMM) is integrated to capture more reliable emotional stage switching. We also compare the performance of the deep models to KNN, SVM and the Graph regularized Extreme Learning Machine (GELM). The average accuracies of DBN-HMM, DBN, GELM, SVM, and KNN in our experiments are 87.62%, 86.91%, 85.67%, 84.08%, and 69.66%, respectively. Our experimental results show that the DBN and DBN-HMM models improve the accuracy of EEG-based emotion classification in comparison with state-of-the-art methods.
Article
The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimensional feature space, in which a linear decision surface is constructed. Special properties of the decision surface ensure the high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors; we here extend this result to non-separable training data. High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
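The core idea in this abstract, a non-linear map into a feature space where a linear surface separates the classes, can be shown with a toy sketch. The data, the map x → (x, x²), and the hand-picked weights below are illustrative assumptions, not the paper's polynomial kernels:

```python
def phi(x):
    """Non-linear feature map from 1-D input to 2-D space: x -> (x, x^2)."""
    return (x, x * x)

def linear_classify(x, w=(0.0, 1.0), b=-2.5):
    """Linear decision surface w . phi(x) + b, evaluated in the mapped space."""
    fx = phi(x)
    score = w[0] * fx[0] + w[1] * fx[1] + b
    return 1 if score > 0 else 0

# Class 1 sits at both ends of the line, so no single threshold on x works
# in 1-D; after mapping, the line x^2 = 2.5 separates the classes exactly.
samples = [(-3, 1), (-1, 0), (0, 0), (1, 0), (3, 1)]
correct = all(linear_classify(x) == y for x, y in samples)  # True
```

A support-vector machine applies the same trick in much higher dimensions, and chooses the separating surface with maximum margin rather than by hand.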
Article
Objectives: To develop a decision support system (DSS), myGRaCE, that integrates service user (SU) and practitioner expertise about mental health and associated risks of suicide, self-harm, harm to others, self-neglect, and vulnerability. The intention is to help SUs assess and manage their own mental health collaboratively with practitioners. Methods: An iterative process involving interviews, focus groups, and agile software development with 115 SUs, to elicit and implement myGRaCE requirements. Results: Findings highlight shared understanding of mental health risk between SUs and practitioners that can be integrated within a single model. However, important differences were revealed in SUs' preferred process of assessing risks and safety, which are reflected in the distinctive interface, navigation, tool functionality and language developed for myGRaCE. A challenge was how to provide flexible access without overwhelming and confusing users. Conclusion: The methods show that practitioner expertise can be reformulated in a format that simultaneously captures SU expertise, to provide a tool highly valued by SUs. A stepped process adds necessary structure to the assessment, each step with its own feedback and guidance. Practice Implications: The GRiST web-based DSS (www.egrist.org) links and integrates myGRaCE self-assessments with GRiST practitioner assessments for supporting collaborative and self-managed healthcare.