2018 International Conference on Intelligent Systems (IS)
978-1-5386-7097-2/18/$31.00 ©2018 IEEE
A Study on Mental State Classification
using EEG-based Brain-Machine Interface
Jordan J. Bird
School of Engineering & Applied Science
Aston University
Birmingham, UK
Anikó Ekárt
School of Engineering & Applied Science
Aston University
Birmingham, UK
Luis J. Manso
School of Engineering & Applied Science
Aston University
Birmingham, UK
Diego R. Faria
School of Engineering & Applied Science
Aston University
Birmingham, UK
Eduardo P. Ribeiro
Department of Electrical Engineering
Federal University of Parana
Curitiba, Brazil
Abstract: This work aims to find discriminative EEG-based features and appropriate classification methods that can categorise brainwave patterns based on their level of activity or frequency for mental state recognition useful for human-machine interaction. Using the Muse headband with four EEG sensors (TP9, AF7, AF8, TP10), we categorised three possible states (relaxed, neutral and concentrating) based on states of mind defined by cognitive behavioural studies. We created a dataset with five individuals and sessions lasting one minute for each class of mental state in order to train and test different methods. Given the proposed set of features extracted from the EEG headband's five signals (alpha, beta, theta, delta, gamma), we tested combinations of different feature selection algorithms and classifier models to compare their performance in terms of recognition accuracy and number of features needed. Different tests, such as 10-fold cross validation, were performed. Results show that only 44 features from a set of over 2100 are necessary when used with classical classifiers such as Bayesian Networks, Support Vector Machines and Random Forests, attaining an overall accuracy of over 87%.
Keywords: EEG, brain-machine interface, machine learning, mental state classification
I. INTRODUCTION
The ability to autonomously detect mental states, whether
cognitive or affective, is useful for multiple purposes in many
domains such as robotics, health care, education, neuroscience,
etc. The importance of efficient human-machine interaction
mechanisms increases with the number of real life scenarios
where smart devices, including autonomous robots, can be
applied. One of the many alternatives that can be used to
interact with machines is through superficial brain activity
signals. These signals, called electroencephalograms or EEG
for short, convey information regarding the voltage measured
by electrodes (dry or wet) placed around the scalp of an
individual. In addition to regular non-invasive electroencephalography, there are also invasive alternatives that monitor brain activity by placing the electrodes directly inside the skull of the subject [35]. This technique is known as intracranial electroencephalography (iEEG). Although iEEG can yield better signal acquisition, it is invasive and therefore more complex to apply. Extracranial
electroencephalography techniques include wearable and non-
wearable technologies. The fact that extracranial devices used
to acquire EEG signals are non-invasive, are becoming easier
to wear, and their price is decreasing widens the range of
applications for which they are suitable.
A major challenge in brain-machine interface applications is
inferring how momentary mental states are mapped into a
particular pattern of brain activity. One of the main issues of
classifying EEG signals is the amount of data needed to
properly describe the different states, since the signals are
complex, non-linear, non-stationary, and random in nature.
The signals are considered stationary only within short intervals, which is why best practice is to apply a short-time windowing technique in order to detect local discriminative features. The paper at hand focuses on selecting a subset of highly discriminative features and comparing state-of-the-art classification methods that can categorise EEG signals into different mental states, taking into
consideration the performance in terms of accuracy and
computational cost. The application considered herein is to
distinguish among three different mental states (e.g. relaxed,
neutral and highly concentrated) of an individual using an
EEG device with dry electrodes that can interface a range of
applications, such as to control the movement of a robot.
The remainder of the paper proceeds as follows. Related works
are summarised in section II. The experimental setup,
including information regarding the device used, and details
about the data acquisition are described in section III. The
methods tested to perform feature selection and the criteria
used to compare the different classifiers are presented in
section IV. Preliminary results are presented in section V. A
discussion on the conclusions drawn from the experimental
results is provided in section VI.
II. RELATED WORK
Statistical features derived from EEG data are commonly used
alongside machine learning techniques to classify mental
states [18], [19]. These nominal states can then be used for
finite points of control as a Brain-Computer Interface. The Muse headband has been recognised by neuroscientists for its effectiveness and relatively low cost, as well as its accuracy when its signals are classified with Bayesian methods [8]. Using these signals, two tasks were recognised with 95% accuracy, though it is worth noting that tasks were classified rather than mental states, and said tasks were in binary distinction to one another.
Using a Muse headband, researchers accurately measured a
user’s enjoyment [11], [12] of an activity from brain signals
alone using the stimuli of two videogames, one measurably
more enjoyable than the other. With the use of a high
resolution 32-channel EEG and statistical feature extraction, a
model was developed to control a robot’s movement [9].
Using statistics focused on the signals produced by the motor
cortex which is thought to control muscles for movement [10],
researchers classified various states which successfully
resulted in a model that could direct a robot’s movement. EEG data has been used extensively to detect abnormal brain activity related to ill-health such as stroke [13]: specifically, when ischemia is present in the brain, brain activity points to abnormalities prior to the stroke occurring. As well as stroke
detection, neuroscientists found that upper extremities in
motor function post-stroke could be rehabilitated using EEG
data with robotics feedback [14] in the form of a brain-
machine interface. Results were promising in terms of the
effectiveness of the system’s ability to rehabilitate. Also
studied extensively is the ability to use EEG data to detect
seizures both in adults suffering with epilepsy [15] and notably
in new-born infants [16]. A Spiking Neural Network was developed for seizure detection based on statistics extracted from EEG streams, with a high accuracy of 92.5% [17]. Random Forest classification of extracted EEG features was used to identify mental states during stages of sleep with a high accuracy of 82% [20], and a Bayesian classifier was trained on more general awake, sleep and REM sleep states with accuracies ranging between 92-97% in both humans and rats [21]. Neural Networks have been observed to have an
accuracy of 64% when classifying emotional states based on
EEG data [7].
Differently from the aforementioned works, this work presents a study on feature selection and classification models given a set of proposed features (statistical, entropy-based, derivative and time-frequency features) extracted from short temporal lapses of EEG data. Multiple datasets of the same data points are then generated, their original contribution being their differing selections of attributes, chosen by various machine learning models. The primary goal is to find a suitable model that can categorise mental states based on EEG data from the TP9, AF7, AF8 and TP10 electrodes.
III. EXPERIMENTAL SETUP
A. EEG Data Acquisition
The Muse headband sensor was used for data collection. The Muse is a commercial EEG sensing device with five dry-application sensors: one used as a reference point (NZ) and four (TP9, AF7, AF8, TP10) to record brainwave activity.
Fig. 1. The International 10-20 EEG Electrode Placement Standard [4]
Highlighted in yellow are the sensors of the Muse Headband. The NZ
placement (green) is used as a reference point for calibration.
Fig. 2. Example of a live EEG stream from the four Muse sensors. The Right AUX channel had no device attached and was discarded as pure noise. The Y-axis of this live feed graph shows the measured microvolts at t=0 on each sensor, and the X-axis the time of the reading.
To prevent the interference of electromyographic signals,
nonverbal tasks that required little to no movement were set.
Blinking, though providing interference to the AF7 and AF8
sensors, was neither encouraged nor discouraged to retain a
natural state. This was due to the dynamicity of blink rate
being linked to tasks requiring differing levels of
concentration [1], and as such the classification algorithms
would take these patterns of signal spikes into account. In
addition, subjects were asked not to close their eyes during any
of the tasks. Three stimuli were devised to cover the three
mental states available from the Muse Headband - relaxed,
neutral, and concentrating. The relaxed task had the subjects
listening to low-tempo music and sound effects designed to aid
in meditation whilst being instructed on relaxing their muscles
and resting. For the neutral mental state, a similar test was carried out, but with no stimulus at all; this test was carried out prior to any others to prevent lasting effects of a relaxed or concentrative mental state. Finally, for concentration, the subjects were instructed to follow the shell game, in which a ball was hidden under one of three cups which were then switched; the task was to follow which cup hid the ball. Future work will implement a standard experiment for each state, allowing proper comparison with similar experiments. A short time after each stimulus started, so as not to gather data with an inaccurate class, the EEG data from the Muse headband was automatically recorded for sixty seconds. The data was observed to stream at a variable frequency within the range of 150-270 Hz.
BlueMuse [5] was used for interfacing the device to a
computer, and Muselsl [6] was used to convert the Muse
signals to MicroVolts and record the data into a preliminary
dataset ready for feature extraction. Fig. 2 shows a live stream of EEG data; blinking can be seen in the troughs of the TP9 and TP10 sensors. At each point in the data stream (150-270 Hz), all signals were recorded along with a UNIX timestamp, which was later used for downsampling the data to produce a uniform stream frequency. The measured voltages on the graph can be mapped to the EEG placements seen in Fig. 1. Before feature extraction we downsampled the data: the sampling rate was decimated to 200 Hz using fast Fourier transforms along the time axis. The resampled signal starts at the same value as x but is sampled with a spacing of len(x) / num * (spacing of x). Because a Fourier method is used, the signal is assumed to be periodic. This is a realistic downsampling, as the dominant energy is concentrated in the range of 20-500 Hz, even though the frequency range of the EEG sensor is wider.
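The decimation described above matches SciPy's Fourier-domain `resample`. A minimal sketch; the 256 Hz input rate and the 10 Hz test sine are illustrative stand-ins, since the headband streamed at a variable 150-270 Hz:

```python
import numpy as np
from scipy.signal import resample

raw_rate = 256                                 # assumed raw sampling rate
target_rate = 200                              # decimation target from the text
duration = 60                                  # one-minute session

t = np.arange(0, duration, 1 / raw_rate)
x = np.sin(2 * np.pi * 10 * t)                 # alpha-band-like stand-in wave

num = int(len(x) * target_rate / raw_rate)     # new sample count
y = resample(x, num)                           # FFT-based; assumes x is periodic

print(len(x), len(y))                          # 15360 12000
```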
IV. METHODS
A. Proposed Set of Features for EEG Signals
Feature extraction and classification of EEG signals are core
issues in brain computer interface (BCI) applications. One
challenging problem when it comes to EEG feature extraction
is the complexity of the signal, since it is non-linear, non-
stationary, and random in nature. The signals are considered stationary only within short intervals, which is why best practice is to apply a short-time windowing technique to meet this requirement. However, stationarity is still an assumption that holds only during a normal brain condition. Non-stationary
signals can be observed during the change in alertness and
wakefulness, during eye blinking, and also during transitions
of mental states. Thus, this subsection describes the set of
features considered in this work to adequately discriminate
different classes of mental states. These features rely on
statistical techniques, time-frequency based on fast Fourier
transform (FFT), Shannon entropy, max-min features in
temporal sequences, log-covariance and others. All features
proposed to classify the mental states are computed in terms of
the temporal distribution of the signal in a given time window.
This sliding window is defined as a period of 1 second at 250 Hz, i.e. all features are computed within this time interval. An overlap of 0.5 seconds is used when moving the window, i.e. temporal window 1 (w1) starts at 0 sec. and finishes at 1 sec.; w2 starts at 0.5 sec. and finishes at 1.5 sec.; w3 starts at 1 sec. and finishes at 2 sec.; w4 starts at 1.5 sec. and finishes at 2.5 sec., and so on. Another important point when computing the features is that the EEG Muse headband returns five types of signal frequency bands {α, β, θ, δ, γ}, so we compute the full proposed set of features for each signal. Thus, the total number of feature values extracted from these signals is 2147.
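The windowing scheme above (1 s windows with 0.5 s overlap) can be sketched as a simple generator; the 250 Hz rate follows the text, and the dummy input signal is illustrative:

```python
import numpy as np

def sliding_windows(x, fs=250, win_s=1.0, overlap_s=0.5):
    """Yield windows of win_s seconds, overlapping by overlap_s seconds."""
    win = int(win_s * fs)                      # samples per window
    hop = int((win_s - overlap_s) * fs)        # step between window starts
    for start in range(0, len(x) - win + 1, hop):
        yield x[start:start + win]

x = np.arange(1000)                            # 4 s of dummy samples at 250 Hz
wins = list(sliding_windows(x))
print(len(wins), len(wins[0]))                 # 7 windows of 250 samples each
```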
Statistical Features: In order to have a compact representation of the raw sensor data in a given time range, we use a set of classical statistical features, which have proven efficient in complementing other features to recognise patterns in time series. The statistical features are: (i) given the set of data values {x1, x2, ..., xN} acquired in each temporal window, the mean value

μ = (1/N) Σ_{i=1}^{N} x_i

of that sequence is computed; (ii) the standard deviation

σ = sqrt( (1/N) Σ_{i=1}^{N} (x_i − μ)² );

(iii) statistical moments of 3rd and 4th order, which give us the skewness to measure the asymmetry of the data and the kurtosis to measure the peakedness of its probability distribution, respectively. The statistical moments employed are computed as follows:

m_k = (1/N) Σ_{i=1}^{N} (x_i − μ)^k, (1)

y_k = m_k / σ^k, (2)

where m_k is the k = {3rd, 4th} moment about the mean and y_k = {skewness, k = 3; kurtosis, k = 4}. Another type of statistical feature computed was the autocorrelation of the signals at each time window for each of the five signals from the EEG. The correlation of the signal with a delayed copy of itself, as a function of delay, was employed similarly to [22] and [23], where the implementation details and parameters are described.
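A minimal sketch of the statistical block, using SciPy's `skew`/`kurtosis` as stand-ins for Eqs. (1)-(2) and a lag-1 autocorrelation; the exact delays used for the autocorrelation feature in [22], [23] are not specified here:

```python
import numpy as np
from scipy.stats import skew, kurtosis

def statistical_features(w):
    """Mean, standard deviation, skewness, kurtosis and lag-1
    autocorrelation of one 1 s window of samples."""
    mu, sigma = np.mean(w), np.std(w)
    w0 = w - mu
    # correlation of the signal with a copy of itself delayed by one sample
    ac1 = np.dot(w0[:-1], w0[1:]) / np.dot(w0, w0)
    return np.array([mu, sigma, skew(w), kurtosis(w), ac1])

rng = np.random.default_rng(0)
feats = statistical_features(rng.normal(size=250))
print(feats.shape)                             # (5,)
```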
Max, Min and Derivatives: Given a time window of 1 sec., the maximum and minimum values are computed to increase the diversity of the feature types. Derivatives are also computed as temporal features. For each time window, we split the window in two, such that w/2 = 0.5 sec. and w = 1 sec., resulting in two sequences of data at ~125 Hz; we then compute the slope between the two half-window means:

d_μ = (μ_w − μ_{w/2}) / 0.5, (3)

where w and w/2 indicate the first and second half of the sequence of data in a time window of 1 sec. The same strategy is employed to obtain the derivatives of the max and min features in the sub time windows:

d_max = (max_w − max_{w/2}) / 0.5, (4)

d_min = (min_w − min_{w/2}) / 0.5. (5)
The next temporal features are extracted after splitting the initial time window of one second into 4 batches of 0.25 sec. each. We then computed the mean, max and min values of each batch, {μ1, μ2, μ3, μ4}, {max1, max2, max3, max4} and {min1, min2, min3, min4}, along with the 1D Euclidean distances among all mean values, d12 = |μ1 − μ2|, d13 = |μ1 − μ3|, d14 = |μ1 − μ4|, d23 = |μ2 − μ3|, d24 = |μ2 − μ4|, d34 = |μ3 − μ4|, and likewise for the minimum and maximum values, so that in the end we obtained 18 distance-based features. Using the four mean values, the four max and four min values, and adding the previous 18, we obtain 30 features for each signal in the short time window, so that counting the 5 signals we have 150 temporal features per second.
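The 30 per-signal batch features can be sketched as follows (batch splitting and pairwise absolute differences as described above; the function name is illustrative):

```python
import numpy as np
from itertools import combinations

def batch_distance_features(w):
    """30 temporal features per window: mean/max/min of four 0.25 s
    batches (12 values) plus their 18 pairwise absolute differences."""
    batches = np.array_split(w, 4)
    mus = [np.mean(b) for b in batches]
    maxs = [np.max(b) for b in batches]
    mins = [np.min(b) for b in batches]
    feats = mus + maxs + mins                  # 12 raw batch statistics
    for stat in (mus, maxs, mins):             # 6 pairs each -> 18 distances
        feats += [abs(a - b) for a, b in combinations(stat, 2)]
    return np.array(feats)

feats = batch_distance_features(np.sin(np.linspace(0, 8, 250)))
print(feats.shape)                             # (30,)
```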
Log-covariance features: Given the previous 150 temporal features, we discard the last 6 in order to attain 144 features, so that we can build a 12 × 12 square matrix and compute the log-covariance as follows:

lcM = U(logm(C)), (6)

where lcM is a resulting vector containing the upper triangular elements (78 features) of the matrix obtained by computing the matrix logarithm of the covariance matrix C; U(.) is a function returning the upper triangular elements; logm(.) is the matrix logarithm function; and the covariance matrix is given by C = (1/(N−1)) Σ_{i=1}^{N} (x_i − μ)(x_i − μ)^T. The rationale behind the log-covariance is to map the convex cone of covariance matrices to a vector space by using the matrix logarithm, since covariance matrices do not lie in a Euclidean space, i.e. the covariance matrix space is not closed under multiplication with negative scalars.
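A sketch of Eq. (6) using `scipy.linalg.logm`; how the 144 temporal features are arranged into the 12 × 12 matrix is an assumption here:

```python
import numpy as np
from scipy.linalg import logm

def log_covariance_features(F):
    """Upper-triangular elements of the matrix logarithm of the
    covariance of a 12 x N feature matrix, yielding 78 values."""
    C = np.cov(F)                              # 12 x 12 covariance matrix
    L = np.real(logm(C))                       # matrix logarithm
    iu = np.triu_indices(L.shape[0])           # upper triangle incl. diagonal
    return L[iu]                               # 12 * 13 / 2 = 78 features

rng = np.random.default_rng(1)
feats = log_covariance_features(rng.normal(size=(12, 50)))
print(feats.shape)                             # (78,)
```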
Shannon entropy and log-energy entropy: Non-linear analyses such as Shannon entropy have proven efficient in signal processing and time series, since the randomness of non-linear data is well captured by computing entropies over the time series. Entropy is an uncertainty measure; in brain-machine interface applications it is used to measure the level of chaos in the system, as it is a non-linear measure quantifying the degree of complexity of the data. In information theory, the Shannon entropy is given by:

h = − Σ_j s_j log(s_j), (7)

where h is a feature computed in every time window of 1 sec. and s_j is each (normalised) element of this temporal window. Then, given the same time window, we split it in two to compute the log-energy entropy as follows:

h_LE = Σ_i log(s_i²) + Σ_j log(s_j²), (8)

where i indexes the elements of the first sub window (0 - 0.5 sec.) and j indexes the elements of the second sub window (0.5 - 1 sec.).
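Both entropies can be sketched as below; normalising the window elements into a probability-like distribution for Eq. (7) is an assumption of this sketch:

```python
import numpy as np

def shannon_entropy(w):
    """Shannon entropy over window elements normalised into a
    probability-like distribution (normalisation is an assumption)."""
    p = np.abs(w) / np.sum(np.abs(w))
    p = p[p > 0]                               # drop zero-probability terms
    return -np.sum(p * np.log2(p))

def log_energy_entropy(w):
    """Sum of log-energies over the two half windows, as in Eq. (8)."""
    h1, h2 = np.array_split(w, 2)
    return np.sum(np.log(h1 ** 2)) + np.sum(np.log(h2 ** 2))

w = np.sin(np.linspace(0.1, 6.0, 250)) + 2.0   # strictly positive test window
print(shannon_entropy(w), log_energy_entropy(w))
```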
Frequency domain: The FFT is an advantageous method to analyse the spectrum of a given time series. At every time window we compute it as follows:

X_k = Σ_{n=0}^{N−1} x_n e^{−i2πkn/N}, k = 0, ..., N − 1. (9)
Accumulative features as energy model: An accumulative value is obtained frame-by-frame within a time window for each individual feature, duplicating the number of features. We compute the difference between the values of the current frame and the previous frame and accumulate it over time as

e_i^z = e_i^{z−1} + | f_i^z − f_i^{z−1} |, (10)

where e_i^z is the resulting energy model for the current time instant given a specific type of feature f_i, i = {1, ..., N}, at a time instant z representing a specific frame within a time window.
B. Feature Selection Algorithms
Feature selection aims to remove data which has no useful application and only serves to needlessly increase the demand for resources. Five datasets were generated using different algorithms; each retained the same data points but with a reduced number of attributes selected by the algorithm. The evaluators used were as follows:
1. OneR: calculates error rate of each prediction based on
one rule and selects the lowest risk classification [24].
2. Information Gain: assigns a worth to each individual
attribute by measuring the information gain with
respect to the class (difference of entropy) [25].
3. Correlation: measures the correlation between the attribute and the class via their Pearson's coefficient, which is used to rank each attribute's worth relative to all others [26].
4. Symmetrical Uncertainty: measures the uncertainty of
an attribute with respect to the class and bases selection
on lower uncertainties [27].
5. Evolutionary Algorithm: creates a population of
attribute subsets and ranks their effectiveness with a
fitness function to measure their predictive ability of
the class. At each generation, solutions are bred to
create offspring, and weakest solutions are killed off in
a tournament of fitness [34].
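WEKA's Information Gain evaluator has no direct scikit-learn equivalent; mutual information is a close analogue and can illustrate the ranking idea (synthetic data, illustrative setup):

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(2)
X = rng.normal(size=(120, 10))                 # 10 candidate attributes
y = (X[:, 3] > 0).astype(int)                  # only attribute 3 is informative

scores = mutual_info_classif(X, y, random_state=0)
ranking = np.argsort(scores)[::-1]             # best-scoring attribute first
print(ranking[0])                              # attribute 3 ranks first
```

A cut-off on the sorted scores, as described above, would then keep only the top-ranked attributes.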
C. Machine Learning Algorithms
As a benchmark, a ZeroR classifier was first run on each dataset. This simplistic classifier chooses one single class to apply to all of the data; with a fair distribution of the three mental states, an accuracy of roughly one third is expected. Two models were
trained on Bayes' theorem, a formula of conditional probability based on hypothesis H and evidence E. The theorem states that the probability of the hypothesis being true before evidence, P(H), is related to the probability of the hypothesis after observing the evidence, P(H | E), as follows [29]:

P(H | E) = P(E | H) P(H) / P(E). (11)

In the Naive Bayes classifier, naivety arises from the unverified assumption of conditional independence between the attributes. A Bayesian Network (Bayes Net) model was also
trained. This method generates a probabilistic graphical model by representing the probabilities of variables given classes in a Directed Acyclic Graph (DAG) [28] as follows:

P(X1, ..., Xn) = ∏_{i=1}^{n} P(Xi | parents(Xi)). (12)
[Table I: feature selection algorithms (OneR, Information Gain, Correlation, Symmetrical Uncertainty, Evolutionary Algorithm) with the number of attributes each retained. Table II: classification accuracy (%, 2 d.p.) of the Naive Bayes, Bayes Net, Random Tree and Random Forest models on each selected dataset.]
The goal is to infer the current time value of Ct given the
data Xt:t-T = {Xt, Xt-1,...,Xt-T} and the prior knowledge of the
class, which is attained by the a-posteriori probability
P(Ct |Ct-1:t-T, Xt:t-T). The superscript notation denotes the set
of values over a time interval.
Three decision trees were developed. Generated by the C4.5 algorithm [2], a J48 tree splits each decision based on information gain, measured via the entropy at a leaf. A Random Tree is generated through a stochastic process that considers a random number of attributes at each node. A Random Forest is an ensemble of multiple Random Trees [3]. A Multilayer Perceptron (MLP) model was also generated: a feedforward Neural Network, in which cycles are not formed between neurons. An MLP was implemented due
to its ability to classify data points that are not linearly
separable in Euclidean space [30]. A model was also trained
using a Support Vector Machine (SVM), which classifies
labelled data through a process of supervised learning,
where examples are mapped out in space and classification
is performed by the closest area in which the unknown class
data falls [31]. In particular, an improved version of Platt’s
Sequential Minimal Optimization (SMO) was used to train
the SVM [32], [33].
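The evaluation protocol described later (10-fold cross validation over three classes) can be sketched with scikit-learn stand-ins for two of the WEKA models; the synthetic data is only a placeholder for the selected EEG feature sets:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

# Synthetic three-class data stands in for the 44-attribute EEG datasets.
X, y = make_classification(n_samples=300, n_features=44, n_informative=10,
                           n_classes=3, random_state=1)

for name, clf in [("Naive Bayes", GaussianNB()),
                  ("Random Forest", RandomForestClassifier(random_state=1))]:
    acc = cross_val_score(clf, X, y, cv=10).mean()   # 10-fold CV accuracy
    print(f"{name}: {acc:.2%}")
```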
V. RESULTS
The five sets generated from the original dataset are shown in Table I. Five different algorithms were chosen, and their results were ranked by their individual scores. Arbitrary cut-off points were applied where the scores approached either 0 or, if there were no zero values, the lowest score present. The values given are not comparable between algorithms due to their unique scoring methods. The
MLP was given 2000 epochs to train with the number of
nodes on layers set to the default “a” setting, dynamically
calculated by n = (attributes + classes)/2
for each dataset it was trained on. A Zero Rules classifier was run as a benchmark and, with close to equally distributed data, set a baseline accuracy of 33.36% on all datasets for comparison. The most effective model was a Random Forest classifier on the dataset created by the OneR attribute selector, which achieved a high accuracy of 87.16% when classifying the data into one of the three mental states. Preliminary results for each of the datasets and their trained models are presented in Table II. For each test, 10-fold cross validation was used to train the model. All random seeds were set to their default value of 1. Table II shows that all of the models far outperformed the benchmark set by the Zero Rules classifier, the lowest being 51.49% (Symmetrical Uncertainty dataset with a Naive Bayes classifier). It is reasonable to assume that the naivety in not considering attribute relationships led to the poorer results.
VI. CONCLUSION
This paper presented a study on mental state classification based on EEG signals. It proposed a set of features, extracted with short-term windowing from the five signals of an EEG sensor, to categorise three different states: neutral, relaxed and concentrated. A dataset was created using data from five individuals in sessions lasting one minute for each state. The primary goal of this work was to find an appropriate set of features, by testing multiple feature selection algorithms and classification models, that provides acceptable accuracy on the dataset and can be useful for human-machine interaction. From the multiple feature sets and models produced, the most accurate is a Random Forest classifier on the attribute set selected by OneR, with a prediction accuracy of 87.16%. Future work will focus on comparing our best results with deep learning strategies and on implementing a real-time application to: (i) control devices, such as robots; and (ii) detect positive and negative moods useful for applications in mental health care.
REFERENCES
[1] Himebaugh, N.L., Begley, C.G., Bradley, A. and Wilkinson, J.A.,
2009. Blinking and tear break-up during four visual tasks. Optometry
and Vision Science, 86(2), pp. E106-E114.
[2] Quinlan, R., 1993. C4.5: Programs for Machine Learning. Morgan
Kaufmann Publishers, San Mateo, CA.
[3] Breiman, L., 2001. Random forests. Machine learning, 45(1), pp.5-
[4] Jasper, Herbert H. 1958. "The ten-twenty electrode system of the
International Federation." Electroenceph. Clin. Neurophysiol. 370-
[5] Kowaleski, J. (2017). BlueMuse.
[6] Barachant, A. (2017). Muselsl.
[7] Bos, D.O., 2006. EEG-based emotion recognition. The Influence of
Visual and Auditory Stimuli, 56(3), pp.1-17.
[8] Krigolson, O.E., Williams, C.C., Norton, A., Hassall, C.D. and
Colino, F.L., 2017. Choosing MUSE: Validation of a low-cost,
portable EEG system for ERP research. Frontiers in neuroscience, 11,
[9] Li, W., Jaramillo, C. and Li, Y., 2012, January. Development of mind
control system for humanoid robot through a brain computer
interface. In 2012 International Conference on Intelligent System
Design and Engineering Application (pp. 679-682). IEEE.
[10] Rosenzweig, M.R., Breedlove, S.M. and Leiman, A.L., 2002.
Biological psychology: An introduction to behavioral, cognitive, and
clinical neuroscience. Sinauer Associates.
[11] Abujelala, M., Abellanoza, C., Sharma, A. and Makedon, F., 2016,
June. Brain-ee: Brain enjoyment evaluation using commercial eeg
headband. In Proceedings of the 9th acm international conference on
pervasive technologies related to assistive environments (p. 33).
[12] Plotnikov, A., Stakheika, N., De Gloria, A., Schatten, C., Bellotti, F.,
Berta, R., Fiorini, C. and Ansovini, F., 2012, July. Exploiting real-
time EEG analysis for assessing flow in games. In 2012 IEEE 12th
International Conference on Advanced Learning Technologies (pp.
688-689). IEEE.
[13] Jordan, K.G., 2004. Emergency EEG and continuous EEG monitoring
in acute ischemic stroke. J. of Clinical Neurophys., 21(5), pp.341-352.
[14] Ang, K.K., Guan, C., Chua, K.S.G., Ang, B.T., Kuah, C., Wang, C.,
Phua, K.S., Chin, Z.Y. and Zhang, H., 2010, August. Clinical study of
neurorehabilitation in stroke using EEG-based motor imagery brain-
computer interface with robotic feedback. 2010 Annual International
Conference of the IEEE (pp. 5549-5552).
[15] Tzallas, A.T., Tsipouras, M.G. and Fotiadis, D.I., 2009. Epileptic
seizure detection in EEGs using time-frequency analysis. IEEE
transactions on information technology in biomedicine, 13(5), pp.703-
[16] Aarabi, A., Grebe, R. and Wallois, F., 2007. A multistage knowledge-
based system for EEG seizure detection in newborn infants. Clinical
Neurophysiology, 118(12), pp.2781-2797.
[17] Ghosh-Dastidar, S. and Adeli, H., 2007. Improved spiking neural
networks for EEG classification and epilepsy and seizure detection.
Integrated Computer-Aided Engineering, 14(3), pp.187-212.
[18] Chai, T.Y., Woo, S.S., Rizon, M. and Tan, C.S., 2010. Classification
of human emotions from EEG signals using statistical features and
neural network. In International (Vol. 1, No. 3, pp. 1-6). Penerbit
[19] Tanaka, H., Hayashi, M. and Hori, T., 1996. Statistical features of
hypnagogic EEG measured by a new scoring system. Sleep, 19(9),
[20] Fraiwan, L., Lweesy, K., Khasawneh, N., Wenz, H. and Dickhaus, H.,
2012. Automated sleep stage identification system based on time
frequency analysis of a single EEG channel and random forest
classifier. Computer methods and programs in biomedicine, 108(1),
[21] Rytkönen, K.M., Zitting, J. and Porkka-Heiskanen, T., 2011.
Automated sleep scoring in rats and mice using the naive Bayes
classifier. Journal of neuroscience methods, 202(1), pp.60-64.
[22] Vital, J.P., Faria, D.R., Dias, G., Couceiro, M.S., Coutinho, F. and
Ferreira, N.M., 2017. Combining discriminative spatiotemporal
features for daily life activity recognition using wearable motion
sensing suit. Pattern Analysis and Applications, 20(4), pp.1179-1194.
[23] Faria, D.R., Vieira, M., Premebida, C. and Nunes, U., 2015, August.
Probabilistic human daily activity recognition towards robot-assisted
living. In Robot and Human Interactive Communication (RO-MAN),
2015 24th IEEE International Symposium on (pp. 582-587). IEEE.
[24] University of Waikato. 2018. OneR. [online]
Available at:
[Accessed 9 Aug. 2018].
[25] University of Waikato. 2018. InfoGainAttributeEval. [online] Available at:
AttributeEval.html [Accessed 9 Aug. 2018].
[26] Pearson, K., 1895. Note on regression and inheritance in the case of
two parents. Proceedings of the Royal Society of London, 58, pp.240-
[27] Witten, I.H., Frank, E., Hall, M.A. and Pal, C.J., 2016. Data Mining:
Practical machine learning tools and techniques. Morgan Kaufmann.
[28] Pearl, Judea 2000. Causality: Models, Reasoning, and Inference.
Cambridge University Press. ISBN 0-521-77362-8.
[29] Bayes, T., Price, R. and Canton, J., 1763. An essay towards solving a
problem in the doctrine of chances.
[30] Rosenblatt, F., 1961. Principles of neurodynamics. perceptrons and
the theory of brain mechanisms (No. VG-1196-G-8). CORNELL
[31] Cortes, C. and Vapnik, V., 1995. Support-vector networks. Machine
learning, 20(3), pp.273-297.
[32] Platt, J.C., 1999. Fast training of support vector machines using sequential minimal optimization. Advances in kernel methods, pp.185-208.
[33] Keerthi, S.S., Shevade, S.K., Bhattacharyya, C. and Murthy, K.R.K.,
2001. Improvements to Platt's SMO algorithm for SVM classifier
design. Neural computation, 13(3), pp.637-649.
[34] Back, T., 1996. Evolutionary algorithms in theory and practice:
evolution strategies, evolutionary programming, genetic algorithms.
Oxford university press.
[35] Shenoy, P., Miller, K.J., Ojemann, J.G. and Rao, R.P.N., 2007. Generalized features for electrocorticographic BCIs. IEEE Transactions on Biomedical Engineering, 55(1), pp.273-280.
... The power spectral density was calculated by averaging the power for the frequency band in each epoch and then averaging it for all epochs. The frequency bands were divided into delta (0.5-4 Hz), theta (4-8 Hz), alpha (8-12 Hz), beta (12)(13)(14)(15)(16)(17)(18)(19)(20)(21)(22)(23)(24)(25)(26)(27)(28)(29)(30), and gamma . Twenty frequency power features, that is, five frequency bands for the four electrodes, were extracted. ...
... In summary, the EEG frequency power analysis was performed using a fast Fourier transform. Frequency bands were divided into delta (0-4 Hz), theta (4-7 Hz), alpha (8-12 Hz), beta (12-30 Hz), and gamma. The absolute power of each frequency band was estimated for each EEG channel. ...
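The band-power extraction described in the excerpts above can be sketched as follows. This is an illustrative sketch, not the cited papers' exact pipeline: the 256 Hz sampling rate, the rectangular-window FFT, and the 30-100 Hz gamma cap are assumptions.

```python
import numpy as np

# Hedged sketch: per-band mean spectral power for one EEG channel.
# Band edges follow the excerpts above; the gamma upper edge is assumed.
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 12),
         "beta": (12, 30), "gamma": (30, 100)}

def band_powers(signal, fs=256.0):
    """Return mean FFT power per frequency band (fs assumed, in Hz)."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)
    return {name: psd[(freqs >= lo) & (freqs < hi)].mean()
            for name, (lo, hi) in BANDS.items()}

# A pure 10 Hz sine should concentrate its power in the alpha band.
t = np.arange(0, 2.0, 1.0 / 256.0)
powers = band_powers(np.sin(2 * np.pi * 10 * t))
```

With five such band powers per electrode and four electrodes, one obtains the twenty frequency power features mentioned above.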
... (a) Muse2 headset band. (b) EEG montage of Muse 2 headset band based on the International 10-20 EEG electrode placement standard [27,28]. ...
Classifying emotional states is critical for brain–computer interfaces and psychology-related domains. In previous studies, researchers have tried to identify emotions using neural data such as electroencephalography (EEG) signals or brain functional magnetic resonance imaging (fMRI). In this study, we propose a machine learning framework for emotion state classification using EEG signals in virtual reality (VR) environments. To arouse emotional neural states in brain signals, we provided three VR stimuli scenarios to 15 participants. Fifty-four features were extracted from the collected EEG signals under each scenario. To find the optimal classification in our research design, three machine learning algorithms (XGBoost classifier, support vector classifier, and logistic regression) were applied. Additionally, various class conditions were used in machine learning classifiers to validate the performance of our framework. To evaluate the classification performance, we utilized five evaluation metrics (precision, recall, f1-score, accuracy, and AUROC). Among the three classifiers, the XGBoost classifier showed the best performance under all experimental conditions. Furthermore, the usability of features, including differential asymmetry and frequency band pass categories, was checked from the feature importance of the XGBoost classifiers. We expect that our framework can be applied widely not only to psychological research but also to mental health-related issues.
... Skewness and Kurtosis are statistical parameters that measure the degree of asymmetry or peakedness of a data distribution. The Shannon entropy and log-energy entropy are used to measure how much information is being carried by a signal [2]. Details of these features are explained in [2], [11], [14], [15]. The MEMD features extracted from the IMFs are ranked based on their individual performance in the classification task. ...
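The statistical features named in this excerpt can be computed as below. This is a sketch under common definitions (energy-distribution Shannon entropy and the Coifman-style log-energy entropy); the cited papers may use slightly different formulations.

```python
import numpy as np

# Hedged sketch of four common EEG signal features: skewness, excess
# kurtosis, Shannon entropy of the normalised energy distribution, and
# log-energy entropy. Definitions are assumptions, not the papers' exact ones.
def feature_vector(x):
    x = np.asarray(x, dtype=float)
    mu, sigma = x.mean(), x.std()
    skew = np.mean(((x - mu) / sigma) ** 3)          # asymmetry
    kurt = np.mean(((x - mu) / sigma) ** 4) - 3.0    # excess peakedness
    p = x ** 2 / np.sum(x ** 2)                      # energy distribution
    p = p[p > 0]
    shannon = -np.sum(p * np.log(p))                 # Shannon entropy
    log_energy = np.sum(np.log(x[x != 0] ** 2))      # log-energy entropy
    return {"skewness": skew, "kurtosis": kurt,
            "shannon": shannon, "log_energy": log_energy}

# A sampled sine wave is symmetric (skew near 0) and flat-topped
# relative to a Gaussian (negative excess kurtosis).
feats = feature_vector(np.sin(np.linspace(0, 4 * np.pi, 200)))
```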
In this study, the Multivariate Empirical Mode Decomposition (MEMD) approach is applied to extract features from multi-channel EEG signals for mental state classification. MEMD is a data-adaptive analysis approach which is suitable particularly for multi-dimensional non-linear signals like EEG. Applying MEMD results in a set of oscillatory modes called intrinsic mode functions (IMFs). As the decomposition process is data-dependent, the IMFs vary in accordance with signal variation caused by functional brain activity. Among the extracted IMFs, it is found that those corresponding to high-oscillation modes are most useful for detecting different mental states. Non-linear features are computed from the IMFs that contribute most to mental state detection. These MEMD features show a significant performance gain over the conventional tempo-spectral features obtained by Fourier transform and Wavelet transform. The dominance of specific brain region is observed by analysing the MEMD features extracted from associated EEG channels. The frontal region is found to be most significant with a classification accuracy of 98.06%. This multi-dimensional decomposition approach upholds joint channel properties and produces most discriminative features for EEG based mental state detection.
... They captured EEG signals from different EEG channels and, by employing techniques like kNN and linear discriminant analysis (LDA) algorithms, attained maximum classification accuracies of 83.26% and 75.21%, respectively. Zhang et al. [15] employed Principal Component Analysis (PCA) for feature extraction. Two channels (F3 and F4) were used to extract the characteristics. The researchers attained a classification accuracy of 73%. ...
... The publicly available database relating to emotional states was used in this investigation. Data was obtained from two participants (one male and one female) for three minutes per state (positive, neutral, and negative) [15]. The Muse EEG headband is used to record the EEG placements of the TP9, AF7, AF8, and TP10 through dry electrodes. ...
Emotion is crucial in human interaction. Attributes like words, voice intonation, facial expressions, and kinesics can all be used to portray one's feelings. However, brain-computer interface (BCI) devices have not yet reached the level required for emotion interpretation. With the rapid development of machine learning algorithms, dry electrode techniques, and different real-world applications of the brain-computer interface for normal individuals, emotion categorization from EEG data has recently received a lot of attention. Electroencephalogram (EEG) signals are a critical resource for these systems. The primary benefit of employing EEG signals is that they reflect true emotion and are easily resolved by computer systems. In this work, EEG signals associated with positive, neutral, and negative emotions were identified using channel selection preprocessing. However, researchers had a limited grasp of the specifics of the link between various emotional states until now. To identify EEG signals, we used the discrete wavelet transform and machine learning techniques such as the recurrent neural network (RNN) and k-nearest neighbor (kNN) algorithms. Initially, the classifier methods were utilized for channel selection. As a result, final feature vectors were created by integrating the features of EEG segments from these channels. Using the RNN and kNN algorithms, the final feature vectors with associated positive, neutral, and negative emotions were categorized independently. The classification performance of both techniques was computed and compared. Using RNN and kNN, the average overall accuracies were 94.844% and 93.438%, respectively.
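The kNN stage of such a pipeline amounts to a majority vote over the nearest training feature vectors. The sketch below uses toy 2-D points; in the study above the inputs would instead be DWT features from the selected EEG channels, so the data and labels here are purely illustrative.

```python
import numpy as np

# Minimal k-nearest-neighbour classifier (majority vote over the k
# closest training points by Euclidean distance). Toy data, not EEG.
def knn_predict(X_train, y_train, x, k=3):
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = y_train[np.argsort(dists)[:k]]
    labels, counts = np.unique(nearest, return_counts=True)
    return labels[np.argmax(counts)]

X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array(["negative", "negative", "positive", "positive"])
pred = knn_predict(X, y, np.array([0.95, 1.0]), k=3)
```

A query point near the "positive" cluster is assigned that label by two of its three nearest neighbours.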
... In this section of our experiment, we used a publicly available dataset [43] that contains EEG brainwave data collected by the MUSE EEG headband via the TP9, AF7, AF8 and TP10 electrodes. Labeling was performed by film clips with an obvious valence, including positive and negative emotional states as well as neutral resting data [44,45]. The participants were one male and one female individual, and the data were collected for 3 minutes per state. ...
Clustering is an attractive method to handle large-scale data which are explosively generated through digitization. This approach is specifically appropriate when labeling is very costly. In this paper, we constructed an unsupervised learning algorithm and focused on a finite mixture model based on multivariate Beta distribution. Our motivation is the flexibility and high potential that this distribution offers in modeling data. To learn this mixture model, we used an expectation propagation inference framework in which the parameters and the complexity of the model were evaluated concurrently in a single optimization framework. We evaluated the performance of our framework on publicly available datasets related to forgery detection, EEG-based sentiment analysis and human activity recognition. Our proposed model demonstrates comparable results to similar alternatives.
There has been a sudden increase in demand for algorithms or models to correctly and accurately identify human emotions. The conformity for machines has come a long way from when smart machines capable of reaching a decision on their own were all that was expected of them, to machines capable of understanding what goes on in a person’s brain. Such autonomous agents can prove to be helpful not only in developing smarter machines but also in the field of medicine. Early prediction or recognizing brainwave patterns for epilepsy, seizures, manic depression, etc. is a key to achieve faster aid responses or prevention. In our work, we are limiting our focus to the most common practices used by researchers in this field, which is to obtain the electroencephalogram or EEG data, extract features and implement a classification algorithm. However, we are also trying to capitalize upon the massive improvements made in the field of supervised and unsupervised learning. The robust depth-wise separable convolution architecture called Xception has been implemented in this study to observe its performance as a feature extractor to the notoriously mutating EEG data. The EEG dataset being used in this study is open source. It is available in Kaggle and has three classes, namely positive, negative and neutral. We are implementing wavelet transform along with the Xception architecture to extract features from the dataset which are then classified using support vector machine. We achieve stellar results as a performance score of 98% can be observed for the measures accuracy, precision, recall as well as F1 score.
Fraud and abuse in insurance and health care claims have received major attention as they can cause increasing losses of revenue. Processing health care claims requires extensive workloads because staff have to investigate the legitimacy of the report. For the investigation of brain injury claims, the related insurance company may request medical images of the brain from the hospital and subsequently get opinions from the medical staff. Conventionally, computed tomography (CT) or magnetic resonance imaging (MRI) is utilized for this purpose. However, performing a CT scan or an MRI scan for every patient that requests medical claims is impractical due to the limited resources. Thus, we proposed a screening approach that uses resting-state electroencephalogram (EEG) recordings as the input to a long short-term memory (LSTM) network. This LSTM architecture can classify the resting-state EEG into two classes: either a moderate traumatic brain injury (TBI) patient or a healthy person. Experimental results show that the proposed approach is able to outperform two similar recent works by achieving a classification accuracy of 74.33%.
In modern Human-Robot Interaction, much thought has been given to accessibility regarding robotic locomotion, specifically the enhancement of awareness and lowering of cognitive load. On the other hand, with social Human-Robot Interaction considered, published research is far sparser given that the problem is less explored than pathfinding and locomotion. This thesis studies how one can endow a robot with affective perception for social awareness in verbal and non-verbal communication. This is possible by the creation of a Human-Robot Interaction framework which abstracts machine learning and artificial intelligence technologies which allow for further accessibility to non-technical users compared to the current State-of-the-Art in the field. These studies thus initially focus on individual robotic abilities in the verbal, non-verbal and multimodality domains. Multimodality studies show that late data fusion of image and sound can improve environment recognition, and similarly that late fusion of Leap Motion Controller and image data can improve sign language recognition ability. To alleviate several of the open issues currently faced by researchers in the field, guidelines are reviewed from the relevant literature and met by the design and structure of the framework that this thesis ultimately presents. The framework recognises a user's request for a task through a chatbot-like architecture. Through research in this thesis that recognises human data augmentation (paraphrasing) and subsequent classification via language transformers, the robot's more advanced Natural Language Processing abilities allow for a wider range of recognised inputs. That is, as examples show, phrases that could be expected to be uttered during a natural human-human interaction are easily recognised by the robot. 
This allows for accessibility to robotics without the need to physically interact with a computer or write any code, with only the ability of natural interaction (an ability which most humans have) required for access to all the modular machine learning and artificial intelligence technologies embedded within the architecture. Following the research on individual abilities, this thesis then unifies all of the technologies into a deliberative interaction framework, wherein abilities are accessed from long-term memory modules and short-term memory information such as the user's tasks, sensor data, retrieved models, and finally output information. In addition, algorithms for model improvement are also explored, such as through transfer learning and synthetic data augmentation and so the framework performs autonomous learning to these extents to constantly improve its learning abilities. It is found that transfer learning between electroencephalographic and electromyographic biological signals improves the classification of one another given their slight physical similarities. Transfer learning also aids in environment recognition, when transferring knowledge from virtual environments to the real world. In another example of non-verbal communication, it is found that learning from a scarce dataset of American Sign Language for recognition can be improved by multi-modality transfer learning from hand features and images taken from a larger British Sign Language dataset. Data augmentation is shown to aid in electroencephalographic signal classification by learning from synthetic signals generated by a GPT-2 transformer model, and, in addition, augmenting training with synthetic data also shows improvements when performing speaker recognition from human speech. 
Given the importance of platform independence due to the growing range of available consumer robots, four use cases are detailed, and examples of behaviour are given by the Pepper, Nao, and Romeo robots as well as a computer terminal. The use cases involve a user requesting their electroencephalographic brainwave data to be classified by simply asking the robot whether or not they are concentrating. In a subsequent use case, the user asks if a given text is positive or negative, to which the robot correctly recognises the task of natural language processing at hand and then classifies the text, this is output and the physical robots react accordingly by showing emotion. The third use case has a request for sign language recognition, to which the robot recognises and thus switches from listening to watching the user communicate with them. The final use case focuses on a request for environment recognition, which has the robot perform multimodality recognition of its surroundings and note them accordingly. The results presented by this thesis show that several of the open issues in the field are alleviated through the technologies within, structuring of, and examples of interaction with the framework. The results also show the achievement of the three main goals set out by the research questions; the endowment of a robot with affective perception and social awareness for verbal and non-verbal communication, whether we can create a Human-Robot Interaction framework to abstract machine learning and artificial intelligence technologies which allow for the accessibility of non-technical users, and, as previously noted, which current issues in the field can be alleviated by the framework presented and to what extent.
Stress, either physical or mental, is experienced by almost every person at some point in his lifetime. Stress is one of the leading causes of various diseases and burdens society globally. Stress badly affects an individual's well-being. Thus, stress-related study is an emerging field, and in the past decade, a lot of attention has been given to the detection and classification of stress. The estimation of stress in the individual helps in stress management before it invades the human mind and body. In this paper, we proposed a system for the detection and classification of stress. We compared the various machine learning algorithms for stress classification using EEG signal recordings. Interaxon Muse device having four dry electrodes has been used for data collection. We have collected the EEG data from 20 subjects. The stress was induced in these volunteers by showing stressful videos to them, and the EEG signal was then acquired. The frequency-domain features such as absolute band powers were extracted from EEG signals. The data were then classified into stress and non-stressed using different machine learning methods - Random Forest, Support Vector Machine, Logistic Regression, Naive Bayes, K-Nearest Neighbors, and Gradient Boosting. We performed 10-fold cross-validation, and the average classification accuracy of 95.65% was obtained using the gradient boosting method.
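The 10-fold cross-validation protocol used in the stress study above can be sketched in plain Python. The classifier here is a deliberate stand-in (a majority-class predictor), not the gradient-boosting model the study used; the point is the fold construction and accuracy averaging.

```python
import random

# Hedged sketch of k-fold cross-validation: shuffle indices, split into
# k folds, train on k-1 folds and score on the held-out fold each time.
# The "model" is a trivial majority-class predictor for illustration only.
def k_fold_accuracy(samples, labels, k=10, seed=0):
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accs = []
    for fold in folds:
        train = [i for i in idx if i not in fold]
        train_labels = [labels[i] for i in train]
        majority = max(set(train_labels), key=train_labels.count)
        hits = sum(labels[i] == majority for i in fold)
        accs.append(hits / len(fold))
    return sum(accs) / k

# Toy dataset: 70/30 class split, so the majority predictor scores 0.70.
data = list(range(100))
labs = ["stress"] * 70 + ["non-stress"] * 30
acc = k_fold_accuracy(data, labs)
```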
In recent years there has been an increase in the number of portable low-cost electroencephalographic (EEG) systems available to researchers. However, to date the validation of the use of low-cost EEG systems has focused on continuous recording of EEG data and/or the replication of large system EEG setups reliant on event-markers to afford examination of event-related brain potentials (ERP). Here, we demonstrate that it is possible to conduct ERP research without being reliant on event markers using a portable MUSE EEG system and a single computer. Specifically, we report the results of two experiments using data collected with the MUSE EEG system—one using the well-known visual oddball paradigm and the other using a standard reward-learning task. Our results demonstrate that we could observe and quantify the N200 and P300 ERP components in the visual oddball task and the reward positivity (the mirror opposite component to the feedback-related negativity) in the reward-learning task. Specifically, single sample t-tests of component existence (all p's < 0.05), computation of Bayesian credible intervals, and 95% confidence intervals all statistically verified the existence of the N200, P300, and reward positivity in all analyses. We provide with this research paper an open source website with all the instructions, methods, and software to replicate our findings and to provide researchers with an easy way to use the MUSE EEG system for ERP research. Importantly, our work highlights that with a single computer and a portable EEG system such as the MUSE one can conduct ERP research with ease thus greatly extending the possible use of the ERP methodology to a variety of novel contexts.
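The core of the ERP methodology described above is epoch averaging: segments time-locked to stimulus onset are averaged so that random background EEG cancels while the event-related component survives. The sketch below uses synthetic data; the 100-sample epoch length, the Gaussian "P300-like" bump at sample 60, and the noise level are assumptions for illustration.

```python
import numpy as np

# Hedged sketch of ERP extraction by epoch averaging. Each "trial" is
# a weak event-locked bump buried in strong Gaussian noise; averaging
# across trials recovers the bump (the ERP component).
rng = np.random.default_rng(1)
n_epochs, epoch_len = 200, 100
erp = np.exp(-0.5 * ((np.arange(epoch_len) - 60) / 8.0) ** 2)  # bump at 60
epochs = erp + rng.normal(0, 2.0, size=(n_epochs, epoch_len))  # noisy trials
average = epochs.mean(axis=0)                                  # ERP emerges
```

Averaging 200 trials reduces the noise standard deviation by a factor of about 14, which is why the component is invisible in single trials yet clear in the average.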
Motion sensing plays an important role in the study of human movements, motivated by a wide range of applications in different fields, such as sports, health care, daily activity, action recognition for surveillance, assisted living and the entertainment industry. In this paper, we describe how to classify a set of human movements comprising daily activities using a wearable motion capture suit, denoted as FatoXtract. A probabilistic integration of different classifiers recently proposed is employed herein, considering several spatiotemporal features, in order to classify daily activities. The classification model relies on the computed confidence belief from base classifiers, combining multiple likelihoods from three different classifiers, namely Naïve Bayes, artificial neural networks and support vector machines, into a single form, by assigning weights from an uncertainty measure to counterbalance the posterior probability. In order to attain an improved performance on the overall classification accuracy, multiple features in time domain (e.g., velocity) and frequency domain (e.g., fast Fourier transform), combined with geometrical features (joint rotations), were considered. A dataset from five daily activities performed by six participants was acquired using FatoXtract. The dataset provided in this work was designed to be extremely challenging since there are high intra-class variations, the duration of the action clips varies dramatically, and some of the actions are quite similar (e.g., brushing teeth and waving, or walking and step). Reported results, in terms of both precision and recall, remained around 85 %, showing that the proposed framework is able to successfully classify different human activities.
Previous studies that involve measuring EEG, or electroencephalograms, have mainly been experimentally-driven projects; for instance, EEG has long been used in research to help identify and elucidate our understanding of many neuroscientific, cognitive, and clinical issues (e.g., sleep, seizures, memory). However, advances in technology have made EEG more accessible to the population. This opens up lines for EEG to provide more information about brain activity in everyday life, rather than in a laboratory setting. To take advantage of the technological advances that have allowed for this, we introduce the Brain-EE system, a method for evaluating user engaged enjoyment that uses a commercially available EEG tool (Muse). During testing, fifteen participants engaged in two tasks (playing two different video games via tablet), and their EEG data were recorded. The Brain-EE system supported much of the previous literature on enjoyment; increases in frontal theta activity strongly and reliably predicted which game each individual participant preferred. We hope to develop the Brain-EE system further in order to contribute to a wide variety of applications (e.g., usability testing, clinical or experimental applications, evaluation methods, etc.).
A statistics-based system for human emotion classification using the electroencephalogram (EEG) is proposed in this paper. The data used in this study were acquired using EEG, and the emotions were elicited from six human subjects under the effect of emotion stimuli. This paper also proposes an emotion stimulation experiment using visual stimuli. From the EEG data, a total of six statistical features are computed, and a back-propagation neural network is applied for the classification of human emotions. The experiment classifies five types of emotions: anger, sadness, surprise, happiness, and neutral. As a result, an overall classification rate as high as 95% is achieved.
Data Mining: Practical Machine Learning Tools and Techniques, Fourth Edition, offers a thorough grounding in machine learning concepts, along with practical advice on applying these tools and techniques in real-world data mining situations. This highly anticipated fourth edition of the most acclaimed work on data mining and machine learning teaches readers everything they need to know to get going, from preparing inputs, interpreting outputs, and evaluating results, to the algorithmic methods at the heart of successful data mining approaches. Extensive updates reflect the technical changes and modernizations that have taken place in the field since the last edition, including substantial new chapters on probabilistic methods and on deep learning. Accompanying the book is a new version of the popular WEKA machine learning software from the University of Waikato. Authors Witten, Frank, Hall, and Pal include today's techniques coupled with the methods at the leading edge of contemporary research. The book companion website contains PowerPoint slides for Chapters 1-12 (a very comprehensive teaching resource covering each chapter of the book), an online appendix on the WEKA workbench (a comprehensive learning aid for the open-source software that goes with the book), and the table of contents, highlighting the many new sections in the 4th edition, along with reviews of the 1st edition, errata, etc.
Provides a thorough grounding in machine learning concepts, as well as practical advice on applying the tools and techniques to data mining projects Presents concrete tips and techniques for performance improvement that work by transforming the input or output in machine learning methods Includes a downloadable WEKA software toolkit, a comprehensive collection of machine learning algorithms for data mining tasks-in an easy-to-use interactive interface Includes open-access online courses that introduce practical applications of the material in the book.
The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimension feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensure high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data. High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
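The linear decision surface described in this abstract can be illustrated with a toy SVM trained by sub-gradient descent on the regularised hinge loss. This is a sketch of the underlying idea only: the original work (and the SMO solver the paper itself uses) optimises the dual problem, and the learning rate, regularisation constant, and toy data below are assumptions.

```python
import numpy as np

# Hedged sketch: a linear SVM fitted by sub-gradient descent on the
# hinge loss  max(0, 1 - y(w.x + b)) + lam * ||w||^2 / 2,  y in {-1, +1}.
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) < 1:        # point inside the margin
                w += lr * (yi * xi - lam * w)
                b += lr * yi
            else:                            # only the regulariser acts
                w -= lr * lam * w
    return w, b

# Toy linearly separable two-class problem.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, b = train_linear_svm(X, y)
preds = np.sign(X @ w + b)
```

For separable data the sub-gradient updates settle on a separating hyperplane; the hinge loss pushes points outside the margin, mirroring the "special properties of the decision surface" the abstract refers to.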