
Mental Emotional Sentiment Classification with an EEG-based Brain-machine Interface


DISP '19, Oxford, United Kingdom
ISBN: 978-1-912532-09-4
Jordan J. Bird
School of Engineering and Applied Science
Aston University
Birmingham, UK
Christopher D. Buckingham
School of Engineering and Applied Science
Aston University
Birmingham, UK
Anikó Ekárt
School of Engineering and Applied Science
Aston University
Birmingham, UK
Diego R. Faria
School of Engineering and Applied Science
Aston University
Birmingham, UK
This paper explores single and ensemble methods to classify
emotional experiences based on EEG brainwave data. A
commercial MUSE EEG headband is used with a resolution of
four (TP9, AF7, AF8, TP10) electrodes. Positive and negative
emotional states are invoked using film clips with an obvious
valence, and neutral resting data is also recorded with no stimuli
involved, all for one minute per session. Statistical extraction of
the alpha, beta, theta, delta and gamma brainwaves is performed
to generate a large dataset that is then reduced to smaller datasets
by feature selection using scores from OneR, Bayes Network,
Information Gain, and Symmetrical Uncertainty. Of the set of
2548 features, a subset of 63 selected by their Information Gain
values was found to be best when used with ensemble classifiers
such as Random Forest. They attained an overall accuracy of
around 97.89%, outperforming the current state of the art by 2.99
percentage points. The best single classifier was a deep neural
network with an accuracy of 94.89%.
Emotion Classification, Brain-Machine Interface, Machine
Autonomous non-invasive detection of emotional states is
potentially useful in multiple domains such as human robot
interaction and mental healthcare. It can provide an extra
dimension of interaction between user and device, as well as
enabling tangible information to be derived that does not depend
on verbal communication [1]. With the increasing availability of
low-cost electroencephalography (EEG) devices, brainwave data
is becoming affordable for the consumer industry as well as for
research, introducing the need for autonomous classification
without the requirement of an expert on hand.
Due to the complexity, randomness, and non-stationary aspects of
brainwave data, classification is very difficult with a raw EEG
stream. For this reason, stationary techniques such as time
windowing must be introduced alongside feature extraction of the
data within a window. There are many statistics that can be
derived from such EEG windows, each of which has varying
classification efficacy depending on the goal. Feature selection
must be performed to identify useful statistics and reduce the
complexity of the model generation process, saving both time and
computational resources during the training and classification
The main contributions of this work are as follows:
- Exploration of single and ensemble methods for the
classification of emotions.
- A high-performing data mining strategy reaching
97.89% accuracy.
- The inclusion of facial EMG signals as part of the
classification process.
- A resolution of three emotional classes (positive,
neutral, negative) to allow for real-world use on mental
states that are not defined by prominent emotions.
- One Rule classification demonstrating how accurately
the AF7 electrode’s mean value classifies mental states.
The remainder of this paper will explore related state-of-the-art
research and provide the main inspiration and influences for the
study. It will explain the methodology of data collection, feature
generation, feature selection and prediction methods. The results
will be presented and discussed alongside comparable work,
followed by conclusions and future work.
Figure 1. Diagram to show Lövheim’s Cube of
Emotional Categorization
Statistics derived from a time-windowing technique with feature
selection have been found to be effective for classifying mental
states such as relaxed, neutral, and concentrating [2]. An ensemble
method of Random Forest had an observed classification accuracy
of 87% when performed with a dataset which was pre-processed
with the OneR classifier as a feature selector. These promising
results suggested a study on classification of emotional states
using a similar exploration method would be similarly successful.
The best current state-of-the-art solution for classification of
emotional EEG data from a low-resolution, low-cost EEG setup
used Fisher’s Discriminant Analysis to produce an accuracy of
95% [3]. That study tried to prevent participants from becoming
tense and discouraged blinking, but the previous study [2] found
that EMG data from these activities helped classification because,
for example, blink rates are a factor in concentration. Hence the
new study described in this paper will explore classification of
emotions in EEG data when unconscious movements are neither
encouraged nor discouraged. Conscious extraneous movements
such as taking a sip of water will not be allowed because they just
form outlying or masking points in the data. For example, if the
people experiencing positive emotions are also drinking water, the
model will simply classify the electrical data that has been
generated by those movements. Music [4] and film [5] are often
found to be the most effective stimuli for evoking emotions in
EEG-based studies. This paper thus focuses on film clips that have
audio tracks (speech and/or music) to evoke emotions, similarly to
a related study that used music videos [6].
Common Spatial Patterns have proved extremely effective for
emotion classification, attaining an overall best solution at 93.5%
[7]. A MUSE EEG headband was successfully used to classify
high resolutions of valence through differing levels of enjoyment
during a certain task [8]. Deep Belief Network (DBN), Artificial
Neural Network (ANN), and Support Vector Machine (SVM)
methods have all been able to classify emotions from EEG data,
and were found to be very effective when considering binary
classes of positive and negative [9]. This study will build on all
these results using similar methods as well as an ensemble, to
exploit their differing strengths and weaknesses. The study also
supports the usage of a neutral class, for transition into real-world
use, to provide a platform for emotional classification where
emotions are not prominent. It adds valence or perceived
sentiment because this was previously found to be helpful in the
learning processes for a web-based chatbot [10].
3.1 Electroencephalography
Electroencephalography is the process of using applied electrodes to
derive electrophysiological data and signals produced by the brain
[11] [12]. Electrodes can be subdural [13], i.e. under the skull,
placed on and within the brain itself. Noninvasive techniques
require either wet or dry electrodes to be placed around the
cranium [14]. Raw electrical data is measured in microvolts (µV)
at observed time t, producing wave patterns from t to t+n.
3.2 Human Emotion
Human emotions are varied and complex but can be generalized
into positive and negative categories [15]. Some emotions overlap,
such as ‘hope’ and ‘anguish’, which are considered positive and
negative respectively but are often experienced
contemporaneously: e.g. the clearly doomed hope and
accompanying anguish for a character’s survival in a film. This
study will concentrate on those emotions that do not overlap, to
help correctly classify what is and is not a positive experience.
Table 1. Table to show Lövheim categories and their
encapsulated emotions with a valence label
Shame (Negative)        Humiliation (Negative)
Contempt (Negative)     Disgust (Negative)
Fear (Negative)         Terror (Negative)
Enjoyment (Positive)    Joy (Positive)
Distress (Negative)     Anguish (Negative)
Surprise (Negative)     (Lack of Dopamine)
Anger (Negative)        Rage (Negative)
Interest (Positive)     Excitement (Positive)
Lövheim’s three-dimensional emotional model maps brain
chemical composition to generalised states of positive and
negative valence [16]. This is shown in Fig. 1 with emotion
categories A-H from each of the model’s vertices, further detailed
in Table I. Various chemical compositions can be mapped to
emotions with positive and negative classes. Furthermore, studies
show that chemical composition influences nervous oscillation
and thus the generation of electrical brainwaves [17]. Since
emotions are encoded within chemical composition that directly
influences electrical brain activity, this study proposes that they
can be classed using statistical features of the produced
Figure 2. A simplified diagram of a fully-connected feed
forward deep neural network.
3.3 Machine Learning Algorithms
The study in this paper applies a number of machine learning
algorithms. One Rule (OneR) classification is a simplistic
probabilistic process of selecting one attribute from the dataset
and generating logical rules based upon it. For example:
"IF temperature LESS THAN 5.56 THEN December"
"IF temperature MORE THAN 23.43 THEN July"
are rules generated based on a temperature attribute to predict the
month (class). This model will identify the strongest attribute
within the dataset for classifying emotions.
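The One Rule idea can be sketched in a few lines. This is a minimal illustration, not the implementation used in the study: it handles nominal attribute values only (numeric attributes such as temperature would first need to be binned), and the toy data and the function name `one_r` are assumptions made for the example. For each attribute it maps every value to its most frequent class and keeps the attribute whose rules make the fewest training errors.

```python
from collections import Counter, defaultdict

def one_r(rows, labels):
    """Return (best_attribute_index, rules, training_accuracy) for nominal data."""
    best = None
    for attr in range(len(rows[0])):
        # Count how often each class occurs for each value of this attribute.
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[attr]][label] += 1
        # One rule per value: predict the most frequent class for that value.
        rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        correct = sum(c.most_common(1)[0][1] for c in counts.values())
        accuracy = correct / len(rows)
        if best is None or accuracy > best[2]:
            best = (attr, rules, accuracy)
    return best

# Hypothetical toy data: (temperature band, sky) -> month
rows = [("cold", "grey"), ("cold", "clear"), ("hot", "clear"), ("hot", "grey")]
labels = ["December", "December", "July", "July"]
attr, rules, acc = one_r(rows, labels)
print(attr, rules, acc)
```

In this toy case the temperature band (attribute 0) separates the two months perfectly, so it is selected over the sky attribute.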
Decision Trees follow a linear process of conditional control
statements based on attributes, through a tree-like structure where
each node is a rule based decision that will further lead to other
nodes. Finally, an end node is reached, and a class is given to the
data object. The level of randomness or entropy on all end nodes
is used to measure the classification ability of the tree. The
calculation of entropy is given as:
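$$E(S) = -\sum_{j=1}^{c} p_j \log_2 p_j \qquad (1)$$
where p_j is the proportion of data objects in S that belong to class j, out of c classes.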
Entropic models are compared by their difference in entropy
which is information gain. A positive value would be a better
model, whereas a negative value shows information loss versus
the comparative model. This is given as:
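$$IG(S, a) = E(S) - E(S \mid a) \qquad (2)$$
where E is the entropy calculated by Equation 1, S is the set of data objects, and a is the attribute used to split it.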
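As a small worked sketch of entropy and information gain, assuming the standard Shannon entropy with a base-2 logarithm and a split on a single nominal attribute, the two quantities can be computed as follows. The function names and the toy labels are illustrative assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(labels, attribute_values):
    """Entropy of the labels minus the weighted entropy after splitting on an attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(attribute_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

labels = ["positive", "positive", "negative", "negative"]
perfect_split = ["a", "a", "b", "b"]   # attribute that separates the classes
useless_split = ["a", "b", "a", "b"]   # attribute unrelated to the classes

print(entropy(labels))                          # 1.0
print(information_gain(labels, perfect_split))  # 1.0
print(information_gain(labels, useless_split))  # 0.0
```

An attribute that separates the classes perfectly recovers the full one bit of entropy, while an unrelated attribute gains nothing.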
Support Vector Machines (SVM) classify data points by
generating and optimising a hyperplane to separate them and
classifying based on their position in comparison to the
hyperplane [18]. A model is considered optimised when the
average margin between the points and the separator is at its
maximum value. Sequential Minimal Optimisation (SMO) is a
high-performing algorithm to generate and implement an SVM
classifier [19]. The large optimisation problem is broken down
into smaller subproblems that can then be solved linearly.
Bayes’ Theorem [20] uses conditional probabilities to determine
the likelihood of Class A based on Evidence, B, as follows:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} \qquad (3)$$
For this study, evidence consists of attribute values (EEG
time-window statistics) and ground-truth training for determining
their most likely classes. A simpler version is known as
Naive Bayes, which assumes independence of attribute values
whether or not they are really unrelated. Classification of
Naive Bayes is adapted from Equation 3 as follows:
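$$\hat{y} = \arg\max_{y} P(y) \prod_{i=1}^{n} P(k_i \mid y) \qquad (4)$$
where y is the class, k is the data object (row) that is being
classified, and k_i is the value of its i-th attribute out of n.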
Logistic Regression is a symmetric statistical model used for
mapping a numerical value to a probability, e.g. using hours of
study to predict a student’s exam outcome [21]. For a binary
classification problem with attributes x_i and model parameters
β_i, the log odds l is given as
$l = \beta_0 + \sum_{i} \beta_i x_i$
and thus the corresponding odds of outcome are given as
$o = e^{\beta_0 + \sum_{i} \beta_i x_i}$
which can be used to predict a model outcome based on previous data.
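A short numerical sketch may help here. It assumes natural-logarithm log odds (so the odds are e raised to the log odds) and uses made-up parameter values; the function names, the intercept of -4.0 and the coefficient of 1.5 are purely illustrative and are not taken from the study.

```python
import math

def log_odds(x, beta0, betas):
    """Linear combination of the attributes: beta0 + sum(beta_i * x_i)."""
    return beta0 + sum(b * xi for b, xi in zip(betas, x))

def probability(x, beta0, betas):
    """Convert log odds to odds, then odds to a probability."""
    odds = math.exp(log_odds(x, beta0, betas))
    return odds / (1.0 + odds)

# Hypothetical model with one attribute: hours of study
beta0, betas = -4.0, [1.5]
print(log_odds([2.0], beta0, betas), probability([2.0], beta0, betas))
print(log_odds([4.0], beta0, betas), probability([4.0], beta0, betas))
```

With these assumed parameters, two hours of study gives negative log odds (a probability below 0.5) and four hours gives positive log odds (a probability above 0.5).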
A Multilayer Perceptron is a type of Artificial Neural Network
(ANN) that predicts a class by taking input parameters and
computing them through a series of hidden layers to one or more
nodes on the final output layer. More than one hidden layer forms
a deep neural network and output layers can be different classes
or, if there is just one, a regression output. A simplified diagram
of a fully connected feed forward deep neural network can be seen
in Fig. 2. Learning is performed for a defined time and follows the
process of backpropagation [22], which is the process of deriving
a gradient that is further used to calculate weights for each node
(neuron) in the network. Training is based on reducing the error
rate given by the error function, i.e. the performance of a network
in terms of correct and incorrect classifications or total Euclidean
distance from the real numerical values. An error is calculated at
output and fed backwards from outputs to inputs.
3.4 Model Ensemble Methods
An ensemble combines two or more prediction models into a
single process. A method of fusion takes place to increase the
success rate of a prediction process by treating the models as a
sum of their parts.
Voting is a simple ensemble process of combining models and
allowing them to vote through a democratic or elitist process.
Each of the models is trained and then, for prediction, awards
a vote v to one or more classes via a specified method:
- Average of probabilities; v = confidence
- Majority vote; v = 1
- Min/Max probability; v = average confidence of all
Following the selected process, a democracy will produce an
outcome prediction as that of the class that has received the
strongest vote or set of votes.
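Two of the voting schemes can be sketched as follows. This is a minimal illustration under the assumption that every model outputs a probability for each class; the function names and the three example probability tables are hypothetical.

```python
from collections import Counter

def average_of_probabilities(model_probs):
    """Each model votes with its confidence; the class with the highest mean wins."""
    classes = model_probs[0].keys()
    averaged = {c: sum(p[c] for p in model_probs) / len(model_probs) for c in classes}
    return max(averaged, key=averaged.get)

def majority_vote(model_probs):
    """Each model casts a single vote (v = 1) for its most probable class."""
    votes = Counter(max(p, key=p.get) for p in model_probs)
    return votes.most_common(1)[0][0]

# Hypothetical outputs of three models for one data object
probs = [
    {"positive": 0.90, "neutral": 0.05, "negative": 0.05},
    {"positive": 0.40, "neutral": 0.45, "negative": 0.15},
    {"positive": 0.35, "neutral": 0.50, "negative": 0.15},
]
print(average_of_probabilities(probs))  # positive
print(majority_vote(probs))             # neutral
```

The example is chosen so that the two schemes disagree: one very confident model outweighs two marginal ones when confidences are averaged, but is outvoted when every model counts equally.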
Figure 3. EEG sensors TP9, AF7, AF8 and TP10 of the
Muse headband on the international standard EEG
placement system [26]
Random Forest forms a voting ensemble from Decision Trees
[23]. Multiple trees are generated on randomly generated subsets
of the input data (Bootstrap Aggregation) and then those trees, the
random forest, will all vote on their predicted outcome and a
prediction is derived. Adaptive Boosting is the process of creating
multiple unique instances of one type of model prediction to
effectively improve the model in situations where selected
parameters may prove ineffective [24]. Classification predictions
are combined and weighted after a process of using a random data
subset to improve on a previous iteration of a model. Combination
is given as:
$$F_T(x) = \sum_{t=1}^{T} f_t(x) \qquad (5)$$
where F_T is the combined output of the set of T models f_t and x is
the data object with an unknown class [25].
The study employs four dry extra-cranial electrodes via a
commercially available MUSE EEG headband. Microvoltage
measurements are recorded from the TP9, AF7, AF8, and TP10
electrodes, as seen in figure 3. Sixty seconds of data were
recorded from two subjects (1 male, 1 female, aged 20-22) for
each of the 6 film clips found in Table II producing 12 minutes
(720 seconds) of brain activity data (6 minutes for each emotional
state). Six minutes of neutral brainwave data were also collected
resulting in a grand total of 36 minutes of EEG data recorded from
subjects. With a variable frequency resampled to 150Hz, this
resulted in a dataset of 324,000 data points collected from the
waves produced by the brain. Activities were exclusively stimuli
that would evoke emotional responses from the set of emotions
found in Table I and were considered by their valence labels of
positive and negative rather than the emotions themselves. Neutral
data were also collected, without stimuli and before any of the
emotions data (to avoid contamination by the latter), for a third
class that would be the resting emotional state of the subject.
Three minutes of data were collected per day to reduce the
interference of a resting emotional state.
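The quoted totals can be checked with simple arithmetic. The sketch below only uses figures stated in the text (two subjects, six clips, sixty seconds per clip, 36 minutes in total, 150Hz); the variable names are illustrative.

```python
subjects = 2
film_clips = 6
seconds_per_clip = 60
sample_rate_hz = 150
total_minutes = 36

# Stimulus recordings: one 60-second session per subject per clip.
stimulus_seconds = subjects * film_clips * seconds_per_clip
print(stimulus_seconds)  # 720

# Size of the resampled dataset for the stated 36-minute total.
data_points = total_minutes * 60 * sample_rate_hz
print(data_points)  # 324000
```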
Table 2. Source of Film Clips used as Stimuli for EEG
Brainwave Data Collection
Marley and Me
My Girl
La La Land
Slow Life
Funny Dogs
Table 3. Attribute Evaluation Methods used to Generate
Datasets for Model Training
Ranker Cutoff
No. Attributes
Participants were asked to watch the film without making any
conscious movements (e.g. drinking coffee) to prevent the
influence of Electromyographic (EMG) signals on the data, due to
their prominence over brainwaves in terms of signal strength. A
previous study that suggested blinking patterns are useful for
classifying mental states [2] inspired this study to neither
encourage nor discourage unconscious movements. Observations
of the experiment showed a participant smiling for a few
seconds during the ‘funny dogs’ compilation clip, as well as
becoming visibly upset during the ‘Marley and Me’ film clip (death
scene). These facial expressions will influence the recorded data
but are factored into the classification model because they
accurately reflect behaviour in the real world, where these
emotional responses would also occur. Hence, to accurately model
realistic situations, both EEG and facial EMG signals are
considered as informative. To generate a dataset of statistical
features, an effective methodology from a previous study [2] was
used to extract 2400 features through overlapping sliding windows
of 1 second, beginning at t=0 and t=0.5. Downsampling was set to
the minimum observed frequency of 150Hz.
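The windowing step can be sketched as below, assuming a single channel sampled at 150Hz, 1-second windows and a new window starting every 0.5 seconds. The four statistics computed per window are an illustrative subset only; the study's full feature set follows [2] and is much larger. The function names and the synthetic ramp signal are assumptions for the example.

```python
import statistics

def sliding_windows(signal, sample_rate_hz=150, window_s=1.0, step_s=0.5):
    """Yield overlapping windows of window_s seconds, one every step_s seconds."""
    size = int(window_s * sample_rate_hz)
    step = int(step_s * sample_rate_hz)
    for start in range(0, len(signal) - size + 1, step):
        yield signal[start:start + size]

def window_features(window):
    """A few simple statistics of one window (illustrative subset)."""
    return {
        "mean": statistics.fmean(window),
        "stdev": statistics.pstdev(window),
        "min": min(window),
        "max": max(window),
    }

# Hypothetical 3-second single-channel signal: a simple ramp of 450 samples
signal = [float(i) for i in range(450)]
features = [window_features(w) for w in sliding_windows(signal)]
print(len(features))        # 5
print(features[0]["mean"])  # 74.5
```

Three seconds of data yield five windows (starting at 0, 0.5, 1.0, 1.5 and 2.0 seconds), each contributing one row of features.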
Feature selection algorithms were run to generate a reduced
dataset from the 2,549 source attributes. Chosen methods ranked
attributes based on their effectiveness when used in classification,
and a manual cutoff point was tuned where the score began to
drop off, therefore retaining only the strongest attributes. Details
of attribute numbers generated by each method can be seen in
Table III. The reduced dimensionality makes the classification
experiments more tractable and within the remit of
given computational resources.
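The rank-and-cutoff step can be illustrated with a short sketch. It assumes that each attribute has already been given a score by one of the evaluators; the attribute names, scores and cutoff value below are invented for the example and are not results from the study.

```python
def select_by_cutoff(scores, cutoff):
    """Rank attributes by score (highest first) and keep those at or above the cutoff."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [name for name, score in ranked if score >= cutoff]

# Hypothetical attribute scores
scores = {"mean_AF7": 0.92, "max_TP9": 0.61, "min_AF8": 0.58, "std_TP10": 0.07}
selected = select_by_cutoff(scores, cutoff=0.5)
print(selected)  # ['mean_AF7', 'max_TP9', 'min_AF8']
```

Here the cutoff is placed in the gap between 0.58 and 0.07, which mirrors the idea of cutting where the score begins to drop off.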
Model training for each method was performed on every dataset
generated by the four methods shown in Table III. The
parameters, where required, were set to the following:
- 10-fold cross validation for training models (average of
10 models on 10 folds of data).
- A manually tuned deep neural network of two layers, with 30
and 20 neurons on each layer respectively, trained by backward
propagation of errors for 500 epochs.
- All random numbers generated by the Java Virtual
Machine with a seed of 0.
- Ensemble voting based on Average Probabilities.
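The 10-fold cross validation setting can be sketched as follows. This is a generic illustration rather than the tooling used in the study: Python's `random` module stands in for the Java Virtual Machine's generator, so the same seed will not reproduce the same folds, and the function names are assumptions.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def cross_validate(n_samples, evaluate, k=10, seed=0):
    """Average the score returned by evaluate(train, test) over all k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)

splits = list(k_fold_indices(100))
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # 10 90 10
```

Each data object appears in exactly one test fold, so every object is used for testing once and for training nine times; `evaluate` is a placeholder for training a model on the training indices and scoring it on the test indices.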
After downsampling, there were slightly more data points for the
neutral state, so the benchmark Zero Rules (‘most common
class’) classifier would classify all points as neutral. This gave a
baseline accuracy of 33.58%, and therefore any result above this shows useful rule
Models for ensemble were selected manually based on best
performance. Voting was performed on average probabilities
using the Random Tree, SMO, BayesNet, Logistic Regression,
and MLP models. Because of the impressive classification ability
of Random Forests, an attempt was also made to optimise them
with the AdaBoost algorithm.
Results of both single and ensemble classifiers can be seen in
Table IV. The best model, a Random Forest with the Infogain
dataset, achieved a high accuracy of 97.89%. The small number
of classification errors came from a few seconds of the
half-hour dataset, meaning that errors could be almost completely
mitigated when classifying in real time due to the sliding window
technique used for small timeframes t-n. Adaptive boosting was
promising for all Random Forest models but could not achieve a
higher score, pointing towards the possibility of outlying points.
For single classification, the multilayer perceptron was the most
consistently accurate model, showing the effectiveness of neural
networks for this particular problem.
The effectiveness of OneR classification showed that a certain
best attribute (mean value of AF7) existed that alone had a
classification ability of 85.27%. The rule is specified in Fig. 4.
Inspection of the normalised mean value of the time windows
extracted from the AF7 electrode shows that minimum and
maximum values most commonly map to negative emotions,
whereas positive and neutral are very closely related, with rules
overlapping one another. One Rule classification improved over
the Zero Rule benchmark by over 50 points, and therefore would
have been an effective attribute to consider over others when it
came to utilising more than one of the attributes in the other
Table 4. Classification Accuracy of Single and Ensemble Methods on the Four Generated Datasets
Single Model Accuracy
Ensemble Model Accuracy
Table 5. An Indirect Comparison of this Study to Similar
Works Performed on Different Datasets
Study                   Method
This study              InfoGain, RandomForest
Bos, et al. [3]         Fisher’s Discriminant
This study              InfoGain, MLP
Li, et al. [7]          Common Spatial Patterns
Li, et al.              Linear SVM
Zheng, et al. [9]       Deep Belief Network
Koelstra, et al. [6]    Common Spatial Patterns
Normalised mean value of the AF7 electrode:
< -460.0 -> NEGATIVE
< -436.5 -> POSITIVE
< -101.5 -> NEGATIVE
< 25.45 -> POSITIVE
< 25.85 -> NEUTRAL
< 26.25 -> POSITIVE
< 37.7 -> NEUTRAL
< 39.05 -> POSITIVE
< 43.599999999999994 -> NEUTRAL
< 63.95 -> POSITIVE
< 97.7 -> NEUTRAL
< 423.0 -> POSITIVE
>= 423.0 -> NEGATIVE
Figure 4. The most effective single rule for
The two best models in our study are compared to the state of the
art alternatives in Table V. The method of generating attributes,
attribute selection via info gain and finally classification with a
Random Forest outperforms an FDA model by 2.99 points.
Further work should be carried out to identify whether this
improved result was due to the methods chosen or the attribute
generation and selection, or possibly both.
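For reference, the One Rule thresholds listed in Fig. 4 can be written as a small lookup function. The threshold values and labels are transcribed from the figure; the function name and the use of a binary search are assumptions made for this sketch, and the input is expected to be the normalised mean AF7 value of one time window.

```python
from bisect import bisect_right

# Upper bounds of each "< threshold" rule, in the order listed in Fig. 4.
THRESHOLDS = [-460.0, -436.5, -101.5, 25.45, 25.85, 26.25, 37.7, 39.05,
              43.599999999999994, 63.95, 97.7, 423.0]
# One label per rule, plus the final ">= 423.0" rule.
LABELS = ["NEGATIVE", "POSITIVE", "NEGATIVE", "POSITIVE", "NEUTRAL", "POSITIVE",
          "NEUTRAL", "POSITIVE", "NEUTRAL", "POSITIVE", "NEUTRAL", "POSITIVE",
          "NEGATIVE"]

def classify_af7_mean(value):
    """Return the label of the first rule '< threshold' that the value satisfies."""
    # bisect_right counts the thresholds that are <= value, which is the
    # index of the first threshold strictly greater than the value.
    return LABELS[bisect_right(THRESHOLDS, value)]

print(classify_af7_mean(-500.0))  # NEGATIVE
print(classify_af7_mean(0.0))     # POSITIVE
print(classify_af7_mean(30.0))    # NEUTRAL
print(classify_af7_mean(423.0))   # NEGATIVE
```

Written this way, the extreme values at both ends of the range map to the negative class, while the positive and neutral intervals interleave in the middle.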
The high performance of simple multilayer perceptrons suggests
neural network models can be effective, especially more complex
ones such as Convolutional Neural Networks (CNNs) that have
performed well in various classification experiments [27].
Similarly, ensemble and Bayesian models are promising
avenues that could perform better with more advanced models,
such as Dynamic Bayesian Mixture Models (DBMM) [28] that
have previously been applied to statistical data extracted from
EEG brainwave signals.
Being able to recognise emotions autonomously would be
valuable for mental-health decision support systems such as
GRiST which is a risk and safety management system used by
mental-health practitioners and by people for assessing
themselves [29], [30]. Evaluations of emotions independent of
self-reporting would help calibrate the advice as well as guiding
more sensitive interactions. The measurement of brainwaves used
in this paper is too intrusive but would be useful for providing a
benchmark for finding other more appropriate methods.
This paper explored the application of single and ensemble
methods of classification to take windowed data from four points
on the scalp and quantify that data into an emotional
representation of what the participant was feeling at that time. The
methods showed that using a low resolution, commercially
available EEG headband can be effective for classifying a
participant’s emotional state. There is considerable potential for
producing classification algorithms that have practical value for
real-world decision support systems. Responding to emotional
states can improve interaction and, for mental-health systems,
contribute to the overall assessment of issues and how to resolve
This work was partially supported by the European Commission
through the H2020 project EXCELL (https://www.excell-, grant number 691829 (A. Ekart) and by the EIT
Health GRaCEAGE grant number 18429 awarded to C. D.
[1] M. S. El-Nasr, J. Yen, and T. R. Ioerger, “Flame - fuzzy
logic adaptive model of emotions,” Autonomous Agents and
Multi-agent systems, vol. 3, no. 3, pp. 219-257, 2000.
[2] J. J. Bird, L. J. Manso, E. P. Ribiero, A. Ekart, and D. R.
Faria, “A study on mental state classification using EEG-based
brain-machine interface,” in 9th International Conference on
Intelligent Systems, IEEE, 2018.
[3] D. O. Bos et al., “EEG-based emotion recognition,” The
Influence of Visual and Auditory Stimuli, pp. 1-17, 2006.
[4] Y.-P. Lin, C.-H. Wang, T.-P. Jung, T.-L. Wu, S.-K. Jeng, J.-R.
Duann, and J.-H. Chen, “EEG-based emotion recognition
in music listening,” IEEE Transactions on Biomedical
Engineering, vol. 57, no. 7, pp. 1798-1806, 2010.
[5] X.-W. Wang, D. Nie, and B.-L. Lu, “Emotional state
classification from EEG data using machine learning
approach,” Neurocomputing, vol. 129, pp. 94-106, 2014.
[6] S. Koelstra, A. Yazdani, M. Soleymani, C. Mühl, J.-S. Lee,
A. Nijholt, T. Pun, T. Ebrahimi, and I. Patras, “Single trial
classification of EEG and peripheral physiological signals for
recognition of emotions induced by music videos,” in Int.
Conf. on Brain Informatics, pp. 89-100, Springer, 2010.
[7] M. Li and B.-L. Lu, “Emotion classification based on
gamma-band EEG,” in Engineering in medicine and biology
society, 2009. EMBC 2009. Annual international conference
of the IEEE, pp. 1223-1226, IEEE, 2009.
[8] M. Abujelala, C. Abellanoza, A. Sharma, and F. Makedon,
“Brainee: Brain enjoyment evaluation using commercial EEG
headband,” in Proceedings of the 9th ACM international
conference on pervasive technologies related to assistive
environments, p. 33, ACM, 2016.
[9] W.-L. Zheng, J.-Y. Zhu, Y. Peng, and B.-L. Lu, “EEG-based
emotion classification using deep belief networks,” in
Multimedia and Expo (ICME), 2014 IEEE International
Conference on, pp. 1-6, IEEE, 2014.
[10] J. J. Bird, A. Ekárt, and D. R. Faria, “Learning from
interaction: An intelligent networked-based human-bot and
bot-bot chatbot system,” in UK Workshop on Computational
Intelligence, pp. 179-190, Springer, 2018.
[11] B. E. Swartz, “The advantages of digital over analog
recording techniques,” Electroencephalography and clinical
neurophysiology, vol. 106, no. 2, pp. 113-117, 1998.
[12] A. Coenen, E. Fine, and O. Zayachkivska, “Adolf Beck: A
forgotten pioneer in electroencephalography,” Journal of the
History of the Neurosciences, vol. 23, pp. 276-286, 2014.
[13] A. K. Shah and S. Mittal, “Invasive electroencephalography
monitoring: Indications and presurgical planning,” Annals of
Indian Academy of Neurology, vol. 17, pp. S89, 2014.
[14] B. A. Taheri, R. T. Knight, and R. L. Smith, “A dry electrode
for EEG recording,” Electroencephalography and clinical
neurophysiology, vol. 90, no. 5, pp. 376-383, 1994.
[15] K. Oatley and J. M. Jenkins, Understanding emotions.
Blackwell publishing, 1996.
[16] H. Lövheim, “A new three-dimensional model for emotions
and monoamine neurotransmitters,” Medical hypotheses, vol.
78, no. 2, pp. 341-348, 2012.
[17] J. Gruzelier, “A theory of alpha/theta neurofeedback, creative
performance enhancement, long distance functional
connectivity and psychological integration,” Cognitive
processing, vol. 10, no. 1, pp. 101-109, 2009.
[18] C. Cortes and V. Vapnik, “Support-vector networks,”
Machine learning, vol. 20, no. 3, pp. 273-297, 1995.
[19] J. Platt, “Sequential minimal optimization: A fast algorithm
for training support vector machines,” 1998.
[20] T. Bayes, R. Price, and J. Canton, “An essay towards solving
a problem in the doctrine of chances,” 1763.
[21] S. H. Walker and D. B. Duncan, “Estimation of the
probability of an event as a function of several independent
variables,” Biometrika, vol. 54, no. 1-2, pp. 167-179, 1967.
[22] Y. Bengio, I. J. Goodfellow, and A. Courville, “Deep
learning,” Nature, vol. 521, no. 7553, pp. 436-444, 2015.
[23] T. K. Ho, “Random decision forests,” in Document analysis
and recognition, 1995., proceedings of the third international
conference on, vol. 1, pp. 278-282, IEEE, 1995.
[24] Y. Freund and R. E. Schapire, “A decision-theoretic
generalization of on-line learning and an application to
boosting,” Journal of computer and system sciences, vol. 55,
no. 1, pp. 119-139, 1997.
[25] R. Rojas, “Adaboost and the super bowl of classifiers: a
tutorial introduction to adaptive boosting,” Freie University,
Berlin, Tech. Rep, 2009.
[26] H. H. Jasper, “The ten-twenty electrode system of the
international federation,” Electroencephalogr. Clin.
Neurophysiol., vol. 10, pp. 370-375, 1958.
[27] M. Hussain, J. J. Bird, and D. R. Faria, “A study on CNN
transfer learning for image classification,” in UK Workshop
on Computational Intelligence, pp. 191-202, Springer, 2018.
[28] D. R. Faria, M. Vieira, C. Premebida, and U. Nunes,
“Probabilistic human daily activity recognition towards
robot-assisted living,” in Robot and Human Interactive
Communication (RO-MAN), 2015 24th IEEE International
Symposium on, pp. 582-587, IEEE, 2015.
[29] C. D. Buckingham, A. Ahmed, and A. Adams, “Designing
multiple user perspectives and functionality for clinical
decision support systems,” pp. 211-218, 2013.
[30] C. D. Buckingham, A. Adams, L. Vail, A. Kumar, A.
Ahmed, A. Whelan, and E. Karasouli, “Integrating service
user and practitioner expertise within a web-based system for
collaborative mental-health risk and safety management,”
Patient Education and Counseling, pp. 1189-1196, 2015.
... Over time, EEG has become one of the best sources of data for conducting affective classification tasks with the study from Chatterjee and Byun (2022) currently showing the most accurate performance (99.55%). Other noteworthy studies with associated performance includes Ashford et al. (2020) with 89.38%, Jain et al. (2021) with 97.0%, Rahman et al. (2022) with 95.36%, Nanthini et al. (2022) with 97.18%, and Bird et al. (2019a) with 97.89%. ...
... Possible reasons for EEG data shortage for experimental purposes may include the effort required to collect data (Seal et al. 2020), accessibility and cost (Courellis et al. 2016) and limitations of low-cost commercial devices (Dadebayev, Goh, and Tan 2022). Additionally, in the few research studies that performed affective classification on a benchmark EEG dataset 1 (Ashford et al. 2020;Bird et al. 2019bBird et al. , 2019aChatterjee and Byun 2022;Jain et al. 2021;Nanthini et al. 2022;Njoku et al. 2022;Rahman et al. 2022), the task has been conducted on the entire set of data using a one-size-fits all approach where the innate differences that exist between subjects (Lehnertz, Rings, and Bröhl 2021) are not taken into account. In other words, the experiments were done through training models on the whole set of data, meaning that the differences that naturally exist from person to person are indistinguishable from changes in signal due to emotional state. ...
... To address the issues of generic approach and differing evaluation methods, we replicated the state-of-the-art experiments (Chatterjee and Byun 2022) performed on the benchmark EEG dataset that was originally used in Bird et al. (2019bBird et al. ( , 2019a (commonly called MUSE dataset because it was collected with a MUSE 2 EEG device). The experiments were exploratory using six machine learning (ML) algorithms predominantly used in similar studies. ...
Background and Objectives Declining mental health is a prominent and concerning issue. Affective classification, which employs machine learning on brain signals captured from electroencephalogram (EEG), is a prevalent approach to address this issue. However, many existing studies have adopted a one-size-fits-all approach, where data from multiple individuals are combined to create a single “generic” classification model. This overlooks individual differences and may not accurately capture the unique emotional patterns of each person. Methods This study explored the performance of six machine learning algorithms in classifying a benchmark EEG dataset (collected with a MUSE device) for affective research. We replicated the best performing models on the dataset found in the literature and present a comparative analysis of performance between existing studies and our personalised approach. We also adapted another EEG dataset (commonly called DEAP) to validate the personalised approach. Evaluation was based on accuracy and significance test using McNemar statistics. Model runtime was also used as an efficiency metric. Results The personalised approach consistently outperformed the generalised method across both datasets. McNemar’s test revealed significant improvements in all but one machine learning algorithm. Notably, the Decision Tree algorithm consistently excelled in the personalised mode, achieving an accuracy improvement of 0.85% (p < 0.001) on the MUSE dataset and a 4.30% improvement on the DEAP dataset, which was also statistically significant (p = 0.004). Both Decision Tree models were more efficient than their generalised counterpart with 1.270 and 23.020-s efficiency gain on the MUSE and DEAP datasets, respectively.
Conclusions This research concludes that smaller, personalised models are a far more effective way of conducting affective classification, and this was validated with both small (MUSE) and large (DEAP) datasets consisting of EEG samples from 4 to 32 subjects, respectively.
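The abstract above evaluates paired models with McNemar's test but does not say which variant was used. As a minimal sketch, assuming the exact (binomial) form of the test and two sets of predictions on the same samples, the comparison could look like the following. The function names and the toy labels are hypothetical and are not taken from the study.

```python
from math import comb


def discordant_counts(y_true, pred_a, pred_b):
    """Count samples where exactly one of the two models is correct."""
    only_a = only_b = 0
    for truth, a, b in zip(y_true, pred_a, pred_b):
        a_ok, b_ok = a == truth, b == truth
        if a_ok and not b_ok:
            only_a += 1
        elif b_ok and not a_ok:
            only_b += 1
    return only_a, only_b


def mcnemar_exact(only_a, only_b):
    """Two-sided exact McNemar p-value from the two discordant counts.

    Under the null hypothesis each discordant sample is equally likely
    to favour either model, so the smaller count follows Binomial(n, 0.5).
    """
    n = only_a + only_b
    if n == 0:
        return 1.0  # the models never disagree in correctness
    k = min(only_a, only_b)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy data (invented): model A alone is right on 9 samples, model B alone on 1.
y_true = [1] * 10
pred_a = [1] * 9 + [0]
pred_b = [0] * 9 + [1]
p_value = mcnemar_exact(*discordant_counts(y_true, pred_a, pred_b))
print(p_value)  # 22/1024 = 0.021484375
```

Only the samples on which the two models disagree in correctness enter the statistic, which is why the test suits a paired comparison on one shared test set.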
... Additionally, such analysis provides numerical consequences of the EEG but these techniques are computationally expensive for recovering additional EEG information related to the specific patterns. Therefore, some other kind of technique is required to describe different behavior of the EEG [19]. The analysis and classification of EEG signals are essential for different real-world applications such as Brain Computer Interface (BCI), Neurology, Neuroscience Research, Biometrics, Neuro-marketing etc. Proper preprocessing of EEG signals significantly impacts the classification performance. ...
... The analysis focuses on seizure detection using both conventional and deep learning methods, with a particular emphasis on CNN-RNN architecture [20]. The study presents MBEEGSE, reaching high accuracy in EEG motor imagery classification tasks [19]. This analysis offers MHCNN, which estimates spatial cognitive capacity using multispectral EEG images and achieves impressive accuracy [18]. ...
... The algorithm achieves 82.87% accuracy in the BCI-IV2a dataset and 96.15% in the gamma dataset, surpassing existing motor imagery classification models. [19] Review MHCNN for spatial cognitive ability assessment was proposed, for binary classification of EEG signals. ...
This study examines the influence of EEG signal processing techniques on novel classification algorithms. Through a review of recent EEG research, we analyze datasets, features, and classification methods. We investigate data scaling and feature selection, utilizing a Convolutional Neural Network. Our findings favor standardization over normalization, with PCA and standardization providing optimal training results, and standardization excelling during validation. This research has implications for future studies in EEG signal processing and classification.
... Selecting variables from the input will influence the chosen characteristics in the output. Rear has proven a useful way to simulate genetic changes in other work [13]. ...
... Learning the Ropes of EEG Delta (0.5-4 Hz), theta (4-8 Hz), alpha (8-13 Hz), beta (13-30 Hz), and gamma (30 Hz) seem to be the 5 types into which the wavelength ranges of EEG recordings fall (Figure 1). The prefrontal cortex is the typical location for recording electrodes with amplitudes between 20 and 200 μV. ...
The objective of this study is to address the profound impact of emotions on human well-being, particularly their influence on factors such as motivation, perception, cognition, creativity, focus, awareness, understanding, and decision-making. This research is motivated by the significant challenge in comprehending and evaluating the complex nature of emotions. It is evident that negative or positive emotional responses correspond to distinct patterns of electrical activity in the brain. The efficacy of feature extraction algorithms, techniques, and classification processes is critical for the functionality of emotion detection systems based on brain signals, particularly electroencephalography (EEG). With the increasing availability of cost-effective, high-quality biomedical signal recording systems, including wireless options, EEG studies have gained momentum. The study aims to present an algorithmic model for sentiment classification utilizing EEG data. The advantages of this research extend to individuals, businesses, schools, and government entities, who can leverage deep learning methods to identify emotional states, fostering an environment where individuals may find it easier to express their concerns to others. In terms of future prospects, this field promises real-time applications, cross-cultural investigations, ethical considerations, healthcare applications, and the exploration of multi-modal approaches, all contributing to an evolving and promising domain.
... This technique involves placing electrodes on the scalp to detect and measure voltage fluctuations resulting from ionic flow within brain neurons. EEG signals are categorized into different frequency bands: delta (0.5-4 Hz), theta (4-8 Hz), alpha (8-13 Hz), beta (13-30 Hz), and gamma, each associated with different brain states and functions. Compared to other tests like fMRI, EEG offers advantages such as low cost, portability, and noninvasive measurement, reducing the burden on subjects and minimizing side effects. ...
... We used the EEG Emotion dataset provided by J. J. Bird et al. [9] and the DEAP dataset by S. Koelstra et al. [39] for this study. The EEG Emotion dataset was collected from two people (1 male and 1 female) for 3 min per state-positive, neutral, and negative. ...
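The band boundaries quoted in the excerpts above (delta 0.5-4 Hz, theta 4-8 Hz, alpha 8-13 Hz, beta 13-30 Hz) can be captured in a small lookup. This is an illustrative sketch only. Two choices here are assumptions, not statements from the cited text: each interval is treated as half-open, and gamma is treated as everything from 30 Hz upward, because the excerpts give no upper bound for it. The name `band_of` is hypothetical.

```python
# (name, lower bound in Hz inclusive, upper bound in Hz exclusive)
BANDS = [
    ("delta", 0.5, 4.0),
    ("theta", 4.0, 8.0),
    ("alpha", 8.0, 13.0),
    ("beta", 13.0, 30.0),
]


def band_of(freq_hz):
    """Return the EEG band name for a frequency, or None if below delta."""
    for name, low, high in BANDS:
        if low <= freq_hz < high:
            return name
    if freq_hz >= 30.0:
        return "gamma"  # assumption: open-ended upper range
    return None


for f in (2.0, 6.0, 10.0, 20.0, 45.0):
    print(f, band_of(f))
```

Shared edges such as 4 Hz or 13 Hz fall into the higher band under the half-open convention; a different convention would only change those boundary cases.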
Emotion classification is a challenge in affective computing, with applications ranging from human–computer interaction to mental health monitoring. In this study, the classification of emotional states using electroencephalography (EEG) data were investigated. Specifically, the efficacy of the combination of various feature selection methods and hyperparameter tuning of machine learning algorithms for accurate and robust emotion recognition was studied. The following feature selection methods were explored: filter (SelectKBest with analysis of variance (ANOVA) F-test), embedded (least absolute shrinkage and selection operator (LASSO) tuned using Bayesian optimization (BO)), and wrapper (genetic algorithm (GA)) methods. We also executed hyperparameter tuning of machine learning algorithms using BO. The performance of each method was assessed. Two different EEG datasets, EEG Emotion and DEAP Dataset, containing 2548 and 160 features, respectively, were evaluated using random forest (RF), logistic regression, XGBoost, and support vector machine (SVM). For both datasets, the experimented three feature selection methods consistently improved the accuracy of the models. For EEG Emotion dataset, RF with LASSO achieved the best result among all the experimented methods increasing the accuracy from 98.78% to 99.39%. In the DEAP dataset experiment, XGBoost with GA showed the best result, increasing the accuracy by 1.59% and 2.84% for valence and arousal. We also show that these results are superior to those by the previous other methods in the literature.
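The filter method named in the abstract above (SelectKBest with an ANOVA F-test) ranks each feature by how well its values separate the classes. As a rough sketch of that idea, assuming plain Python lists in place of the library implementation the authors used, the one-way ANOVA F-score and a top-k selection can be written as below. The helper names and the toy rows are hypothetical and invented for illustration.

```python
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F-score of a single feature against class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    grand = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        # Degenerate case: no spread inside any class.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_k_best(rows, labels, k):
    """Return the indices of the k highest-scoring features and all scores."""
    n_features = len(rows[0])
    scores = [anova_f([r[j] for r in rows], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores


# Toy data (invented): feature 0 separates the classes, feature 1 does not.
labels = ["pos", "pos", "neg", "neg"]
rows = [[1.0, 5.0], [1.2, 3.0], [3.0, 4.9], [3.2, 3.1]]
selected, scores = select_k_best(rows, labels, k=1)
print(selected)
```

A filter of this kind scores every feature independently, so it is cheap on wide datasets such as the 2548-feature one, but it cannot detect features that are informative only in combination.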
Emotion recognition from Electroencephalogram (EEG) signals has emerged as a promising method for understanding human affective states. However, deep learning-based emotion recognition models suffer from overfitting and generalisation due to the variability in EEG signals and the scarcity of labelled data, which impede their performance. In this work, a Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP) based architecture was adopted for efficient EEG data augmentation. The publicly available “EEG Brainwave” dataset was used to train the WGAN-GP model to generate synthetic EEG data. The generated synthetic data was mixed with the real data in different proportions to determine the optimum ratio of data augmentation for efficient emotion classification. The efficacy of the data augmentation was evaluated by proposing an LSTM-based classifier that efficiently classifies the three emotional states: positive, neutral, and negative. The experimental results show that the maximum classification accuracy of 99.14% was achieved with a precision of 0.9915, recall of 0.9914, and F1 score of 0.9914 when an equal quantity of real and synthetically generated EEG data was mixed to train the classifier. Our WGAN-GP-LSTM method not only enhances the robustness of emotion recognition models by utilizing data augmentation but also significantly improves the classification accuracy with limited labelled data and outperforms all other state-of-the-art techniques.
Mental stress poses a widespread societal challenge, impacting daily routines and contributing to severe health problems. Earlier studies have utilized Electroencephalograms (EEG) for stress classification; however, the computational demands of processing data from numerous channels often hinder the translation of these models to wearable devices. This paper proposes KRAFS-ANet, a novel framework designed for enhanced stress classification using EEG data on wearable devices. The KRAFS-ANet framework incorporates two major novel components to achieve high accuracy with a lightweight design: (1) it strategically employs channel selection using Normal Mutual Information and Recursive Feature Elimination (NMI+RFE) to identify the most informative channels and (2) it uses ensemble stacking techniques to integrate bagging K-Nearest Neighbour (KNN), bagging Random Forest (RF), and bagging Support Vector Machine (SVM) with an Artificial Neural Network (ANN) meta-classifier. The study conducts comprehensive experiments on the stress-based MAT dataset and further validates the framework on the stress-based SAM40 and anxiety-based DASPS datasets to demonstrate its effectiveness. KRAFS-ANet achieved the highest accuracies and F1-scores of 98.63% and 98.82% on the MAT dataset, 97.25% and 97.24% on the SAM40 dataset, and 94.92% and 95.15% on the DASPS dataset, respectively. This framework advances the practical application of EEG-based stress detection for portable devices such as wearables and smartphones. It enables real-time monitoring and interventions to enhance mental health in daily life, thus proving its efficacy in real-world scenarios.
Cognitive fatigue (CF) is a complex disorder that affects human efficiency in work and daily activities. Researchers are actively exploring various physiological signals to monitor fatigue, with electroencephalography (EEG) being considered a top indicator for recognizing cognitive fatigue. Traditionally, fatigue patterns are visually marked in EEG recordings, which is considered a time-consuming and error-prone task. Therefore, developing an efficient cognitive fatigue monitoring system, particularly employing machine learning (ML) methods, holds substantial potential. In this study, our primary objective is to conduct a comprehensive analysis of three ML approaches dedicated to recognizing CF. To achieve this, we utilize advanced multidomain features and a recently published EEG dataset labeled with two levels of fatigue. The ML techniques under scrutiny include support vector machines (SVMs), multilayer perceptron (MLP), and Gaussian Naive Bayes (GNB). Through a meticulous examination of real EEG data, in conjunction with our proposed features, we have drawn a significant conclusion. Notably, the GNB classifier has emerged as the most efficient and accurate among the tested classifiers, boasting a Balanced Classification Rate (BCR) of 82.94%. Additionally, our selected features have demonstrated remarkable effectiveness when compared to previous studies that utilized the same database and benchmark.
This work aims to find discriminative EEG-based features and appropriate classification methods that can categorise brainwave patterns based on their level of activity or frequency for mental state recognition useful for human-machine interaction. By using the Muse headband with four EEG sensors (TP9, AF7, AF8, TP10), we categorised three possible states such as relaxing, neutral and concentrating based on a few states of mind defined by cognitive behavioural studies. We have created a dataset with five individuals and sessions lasting one minute for each class of mental state in order to train and test different methods. Given the proposed set of features extracted from the EEG headband five signals (alpha, beta, theta, delta, gamma), we have tested a combination of different features selection algorithms and classifier models to compare their performance in terms of recognition accuracy and number of features needed. Different tests such as 10-fold cross validation were performed. Results show that only 44 features from a set of over 2100 features are necessary when used with classical classifiers such as Bayesian Networks, Support Vector Machines and Random Forests, attaining an overall accuracy over 87%.
In this paper we propose an approach to a chatbot software that is able to learn from interaction via text messaging between human-bot and bot-bot. The bot listens to a user and decides whether or not it knows how to reply to the message accurately based on current knowledge, otherwise it will set about to learn a meaningful response to the message through pattern matching based on its previous experience. Similar methods are used to detect offensive messages, and are proved to be effective at overcoming the issues that other chatbots have experienced in the open domain. A philosophy of giving preference to too much censorship rather than too little is employed given the failure of Microsoft Tay. In this work, a layered approach is devised to conduct each process, and leave the architecture open to improvement with more advanced methods in the future. Preliminary results show an improvement over time in which the bot learns more responses. A novel approach of message simplification is added to the bot’s architecture, the results suggest that the algorithm has a substantial improvement on the bot’s conversational performance at a factor of three.
Many image classification models have been introduced to help tackle the foremost issue of recognition accuracy. Image classification is one of the core problems in Computer Vision field with a large variety of practical applications. Examples include: object recognition for robotic manipulation, pedestrian or obstacle detection for autonomous vehicles, among others. A lot of attention has been associated with Machine Learning, specifically neural networks such as the Convolutional Neural Network (CNN) winning image classification competitions. This work proposes the study and investigation of such a CNN architecture model (i.e. Inception-v3) to establish whether it works best in terms of accuracy and efficiency with new image datasets via Transfer Learning. The retrained model is evaluated, and the results are compared to some state-of-the-art approaches.
Previous studies that involve measuring EEG, or electroencephalograms, have mainly been experimentally-driven projects; for instance, EEG has long been used in research to help identify and elucidate our understanding of many neuroscientific, cognitive, and clinical issues (e.g., sleep, seizures, memory). However, advances in technology have made EEG more accessible to the population. This opens up lines for EEG to provide more information about brain activity in everyday life, rather than in a laboratory setting. To take advantage of the technological advances that have allowed for this, we introduce the Brain-EE system, a method for evaluating user engaged enjoyment that uses a commercially available EEG tool (Muse). During testing, fifteen participants engaged in two tasks (playing two different video games via tablet), and their EEG data were recorded. The Brain-EE system supported much of the previous literature on enjoyment; increases in frontal theta activity strongly and reliably predicted which game each individual participant preferred. We hope to develop the Brain-EE system further in order to contribute to a wide variety of applications (e.g., usability testing, clinical or experimental applications, evaluation methods, etc.).
In this work, we present a human-centered robot application in the scope of daily activity recognition towards robot-assisted living. Our approach consists of a probabilistic ensemble of classifiers as a dynamic mixture model considering the Bayesian probability, where each base classifier contributes to the inference in proportion to its posterior belief. The classification model relies on the confidence obtained from an uncertainty measure that assigns a weight for each base classifier to counterbalance the joint posterior probability. Spatio-temporal 3D skeleton-based features extracted from RGB-D sensor data are modeled in order to characterize daily activities, including risk situations (e.g.: falling down, running or jumping in a room). To assess our proposed approach, state-of-the-art datasets such as MSR-Action3D Dataset and MSR-Activity3D Dataset [1] are used to compare the results with other recent methods. Reported results on test datasets show that our proposed approach outperforms state-of-the-art methods in terms of precision, recall, and overall accuracy. Moreover, we also validated our framework running on-the-fly in a mobile robot with an RGB-D sensor to identify daily activities for a robot-assisted living application.
In recent years, there have been many great successes in using deep architectures for unsupervised feature learning from data, especially for images and speech. In this paper, we introduce recent advanced deep learning models to classify two emotional categories (positive and negative) from EEG data. We train a deep belief network (DBN) with differential entropy features extracted from multichannel EEG as input. A hidden Markov model (HMM) is integrated to accurately capture a more reliable emotional stage switching. We also compare the performance of the deep models to KNN, SVM and Graph regularized Extreme Learning Machine (GELM). The average accuracies of DBN-HMM, DBN, GELM, SVM, and KNN in our experiments are 87.62%, 86.91%, 85.67%, 84.08%, and 69.66%, respectively. Our experimental results show that the DBN and DBN-HMM models improve the accuracy of EEG-based emotion classification in comparison with the state-of-the-art methods.
The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimension feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensure high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data. High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
Objectives: To develop a decision support system (DSS), myGRaCE, that integrates service user (SU) and practitioner expertise about mental health and associated risks of suicide, self-harm, harm to others, self-neglect, and vulnerability. The intention is to help SUs assess and manage their own mental health collaboratively with practitioners. Methods: An iterative process involving interviews, focus groups, and agile software development with 115 SUs, to elicit and implement myGRaCE requirements. Results: Findings highlight shared understanding of mental health risk between SUs and practitioners that can be integrated within a single model. However, important differences were revealed in SUs' preferred process of assessing risks and safety, which are reflected in the distinctive interface, navigation, tool functionality and language developed for myGRaCE. A challenge was how to provide flexible access without overwhelming and confusing users. Conclusion: The methods show that practitioner expertise can be reformulated in a format that simultaneously captures SU expertise, to provide a tool highly valued by SUs. A stepped process adds necessary structure to the assessment, each step with its own feedback and guidance. Practice Implications: The GRiST web-based DSS links and integrates myGRaCE self-assessments with GRiST practitioner assessments for supporting collaborative and self-managed healthcare.