2018 International Conference on Intelligent Systems (IS)
978-1-5386-7097-2/18/$31.00 ©2018 IEEE
A Study on Mental State Classification
using EEG-based Brain-Machine Interface
Jordan J. Bird
School of Engineering & Applied Science
Aston University
Birmingham, UK
Anikó Ekárt
School of Engineering & Applied Science
Aston University
Birmingham, UK
Luis J. Manso
School of Engineering & Applied Science
Aston University
Birmingham, UK
Diego R. Faria
School of Engineering & Applied Science
Aston University
Birmingham, UK
Eduardo P. Ribeiro
Department of Electrical Engineering
Federal University of Parana
Curitiba, Brazil
Abstract—This work aims to find discriminative EEG-based features and appropriate classification methods that can categorise brainwave patterns based on their level of activity or frequency for mental state recognition useful for human-machine interaction. Using the Muse headband with four EEG sensors (TP9, AF7, AF8, TP10), we categorised three possible states, namely relaxing, neutral and concentrating, based on states of mind defined by cognitive behavioural studies. We created a dataset with five individuals and sessions lasting one minute for each class of mental state in order to train and test different methods. Given the proposed set of features extracted from the five signals of the EEG headband (alpha, beta, theta, delta, gamma), we tested combinations of different feature selection algorithms and classifier models to compare their performance in terms of recognition accuracy and number of features needed. Tests such as 10-fold cross validation were performed. Results show that only 44 features from a set of over 2100 are necessary when used with classical classifiers such as Bayesian Networks, Support Vector Machines and Random Forests, attaining an overall accuracy of over 87%.
Keywords—EEG, brain-machine interface, machine learning, mental state classification
I. INTRODUCTION
The ability to autonomously detect mental states, whether cognitive or affective, is useful for multiple purposes in many domains, such as robotics, health care, education and neuroscience. The importance of efficient human-machine interaction mechanisms increases with the number of real-life scenarios where smart devices, including autonomous robots, can be applied. One of the many alternatives for interacting with machines is through superficial brain activity signals. These signals, called electroencephalograms (EEG), convey information regarding the voltage measured by electrodes (dry or wet) placed around the scalp of an individual. In addition to regular non-invasive electroencephalography, there are also invasive alternatives that monitor brain activity with electrodes placed directly inside the skull of the subject [35]. This technique is known as intracranial electroencephalography (iEEG). Although iEEG can yield better signal acquisition, it is invasive and therefore more complex to apply. Extracranial electroencephalography techniques include wearable and non-wearable technologies. The fact that extracranial devices used to acquire EEG signals are non-invasive, are becoming easier to wear, and are decreasing in price widens the range of applications for which they are suitable.
A major challenge in brain-machine interface applications is inferring how momentary mental states are mapped into a particular pattern of brain activity. One of the main issues in classifying EEG signals is the amount of data needed to properly describe the different states, since the signals are complex, non-linear, non-stationary, and random in nature. Since the signals are considered stationary only within short intervals, the best practice is to apply a short-time windowing technique to detect local discriminative features. The paper at hand focuses on selecting a subset of highly discriminative features and comparing state-of-the-art classification methods that can categorise EEG signals into different mental states, taking into consideration the performance in terms of accuracy and computational cost. The application considered herein is to distinguish among three different mental states (relaxed, neutral and highly concentrated) of an individual using an EEG device with dry electrodes that can interface with a range of applications, such as controlling the movement of a robot.
The remainder of the paper proceeds as follows. Related works
are summarised in section II. The experimental setup,
including information regarding the device used, and details
about the data acquisition are described in section III. The
methods tested to perform feature selection and the criteria
used to compare the different classifiers are presented in
section IV. Preliminary results are presented in section V. A
discussion on the conclusions drawn from the experimental
results is provided in section VI.
II. RELATED WORK
Statistical features derived from EEG data are commonly used alongside machine learning techniques to classify mental states [18], [19]. These nominal states can then be used as finite points of control in a brain-computer interface. The Muse headband has been recognised by neuroscientists for its effectiveness and relatively low cost, as well as its accuracy when its signals are classified with Bayesian methods [8]: two tasks were recognised with 95% accuracy, though it is worth noting that tasks were classified rather than mental states, and said tasks were in binary distinction to one another.
Using a Muse headband, researchers accurately measured a
user’s enjoyment [11], [12] of an activity from brain signals
alone using the stimuli of two videogames, one measurably
more enjoyable than the other. With the use of a high
resolution 32-channel EEG and statistical feature extraction, a
model was developed to control a robot’s movement [9].
Using statistics focused on the signals produced by the motor
cortex which is thought to control muscles for movement [10],
researchers classified various states which successfully
resulted in a model that could direct a robot’s movement. EEG data has been used extensively to detect abnormal brain activity related to ill-health, such as stroke [13]: specifically, when ischemia is present in the brain, brain activity points to abnormalities prior to the stroke occurring. As well as stroke detection, neuroscientists found that upper-extremity motor function post-stroke could be rehabilitated using EEG data with robotic feedback [14] in the form of a brain-machine interface, with promising results in terms of the system’s ability to rehabilitate. Also
studied extensively is the ability to use EEG data to detect
seizures both in adults suffering with epilepsy [15] and notably
in new-born infants [16]. A spiking neural network was developed to detect seizures based on statistics extracted from EEG streams, reaching a high accuracy of 92.5% [17]. Random Forest classification of extracted EEG features was used to identify mental states during stages of sleep with a high accuracy of 82% [20], and a Bayesian classifier was trained on more general awake, sleep and REM-sleep states with accuracies ranging between 92% and 97% in both humans and rats [21]. Neural networks have been observed to reach an
accuracy of 64% when classifying emotional states based on
EEG data [7].
Differently from the aforementioned works, this work focuses on feature selection and classification models given a set of proposed features (statistical, entropy-based, derivative and time-frequency features) extracted from short temporal lapses of EEG data. From the same data points, multiple datasets are generated whose original contribution lies in their differing selections of attributes, which are in turn selected by various machine learning models. The primary goal is to find a suitable model that can categorise mental states based on EEG data from the TP9, AF7, AF8 and TP10 electrodes.
III. EXPERIMENTAL SETUP
A. EEG Data Acquisition
The Muse headband was used for data collection. The Muse is a commercial EEG sensing device with five dry-application sensors, one used as a reference point (NZ) and four (TP9, AF7, AF8, TP10) used to record brain wave activity.
Fig. 1. The International 10-20 EEG electrode placement standard [4]. Highlighted in yellow are the sensors of the Muse headband. The NZ placement (green) is used as a reference point for calibration.
Fig. 2. Example of a live EEG stream from the four Muse sensors. The Right AUX channel had no device attached and was discarded as it contained only noise. The Y-axis shows the microvolts measured on each sensor and the X-axis the time of the reading.
To prevent the interference of electromyographic signals,
nonverbal tasks that required little to no movement were set.
Blinking, though providing interference to the AF7 and AF8
sensors, was neither encouraged nor discouraged to retain a
natural state. This was due to the dynamicity of blink rate
being linked to tasks requiring differing levels of
concentration [1], and as such the classification algorithms
would take these patterns of signal spikes into account. In
addition, subjects were asked not to close their eyes during any
of the tasks. Three stimuli were devised to cover the three
mental states available from the Muse Headband - relaxed,
neutral, and concentrating. For the relaxed task, the subjects listened to low-tempo music and sound effects designed to aid meditation while being instructed to relax their muscles and rest. For a neutral mental state, a similar test was carried out but with no stimulus at all; this test was performed before any others to prevent lasting effects of a relaxed or concentrative mental state. Finally, for concentration, the subjects were instructed to follow the "shell game", in which a ball is hidden under one of three cups that are then switched; the task was to try to follow which cup hid the ball. Future work includes the implementation of a standard experiment for each state, to allow proper comparison with similar experiments. A short time after each stimulus started, so as not to gather data with an inaccurate class, the EEG data from the Muse headband was automatically recorded for sixty seconds. The data was observed to stream at a variable frequency within the range of 150-270 Hz.
BlueMuse [5] was used for interfacing the device to a
computer, and Muselsl [6] was used to convert the Muse
signals to MicroVolts and record the data into a preliminary
dataset ready for feature extraction. Fig. 2 shows a live stream of EEG data; blinking can be seen as troughs in the AF7 and AF8 (forehead) sensors. At each point in the data stream (150-270 Hz), all signals were recorded along with a UNIX timestamp, which was later used for down-sampling the data to produce a uniform stream frequency. The measured voltages on the graph can be mapped to the EEG placements seen in Fig. 1. Before feature extraction we down-sampled the data: the sampling rate was decimated to 200 Hz using fast Fourier transforms along the time axis. The resampled signal starts at the same value as the input x, but is sampled with a spacing of len(x) / num * (spacing of x). Because a Fourier method is used, the signal is assumed to be periodic. This is a realistic down-sampling, as the dominant energy is concentrated in the range of 20-500 Hz, even though the frequency range of the EEG sensor is higher.
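The down-sampling described above matches the behaviour of the Fourier-based `scipy.signal.resample` routine. A minimal sketch, assuming a one-second chunk arriving at roughly 250 Hz is decimated to a uniform 200 Hz (the 250 Hz test tone below is a stand-in for real EEG, not data from the paper):

```python
import numpy as np
from scipy.signal import resample

fs_in, fs_out = 250, 200              # approximate input rate, target rate
t = np.arange(fs_in) / fs_in          # 1 s of timestamps
x = np.sin(2 * np.pi * 10 * t)        # 10 Hz test tone standing in for EEG

# Fourier-based resampling: the signal is assumed periodic, and the output
# starts at the same value as x, spaced len(x)/num * (spacing of x).
num = int(len(x) * fs_out / fs_in)
y = resample(x, num)
print(len(x), len(y))                 # 250 -> 200 samples for the same 1 s
```

Because the method works in the frequency domain, any energy above the new 100 Hz Nyquist limit is discarded rather than aliased.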
IV. METHODS
A. Proposed Set of Features for EEG Signals
Feature extraction and classification of EEG signals are core issues in brain-computer interface (BCI) applications. One challenging problem in EEG feature extraction is the complexity of the signal, since it is non-linear, non-stationary, and random in nature. The signals are considered stationary only within short intervals, which is why the best practice is to apply a short-time windowing technique to meet this requirement. However, this is still an assumption that only holds during a normal brain condition: non-stationary behaviour can be observed during changes in alertness and wakefulness, during eye blinking, and during transitions of mental states. Thus, this subsection describes the set of
features considered in this work to adequately discriminate
different classes of mental states. These features rely on
statistical techniques, time-frequency based on fast Fourier
transform (FFT), Shannon entropy, max-min features in
temporal sequences, log-covariance and others. All features
proposed to classify the mental states are computed in terms of
the temporal distribution of the signal in a given time window.
This sliding window is defined as a period of 1 second at 250 Hz, i.e. all features are computed within this time span. An overlap of 0.5 second is used when moving the window, i.e. the temporal window w1 starts at 0 sec. and finishes at 1 sec.; w2 starts at 0.5 sec. and finishes at 1.5 sec.; w3 starts at 1 sec. and finishes at 2 sec.; w4 starts at 1.5 sec. and finishes at 2.5 sec., and so on. Another important point for computing the features is the set of signals returned by the EEG Muse headband. Since it returns five types of signal frequencies {α, β, θ, δ, γ}, we compute the whole proposed set of features for each signal. Thus, the total number of feature values extracted from these signals is 2147.
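The sliding-window scheme above can be sketched as a simple slicing loop. This is a sketch assuming a uniform sampling rate after resampling; the function name and its defaults are illustrative, not from the paper:

```python
import numpy as np

def sliding_windows(signal, fs=200, win_s=1.0, stride_s=0.5):
    """Yield (start_time, samples) for 1 s windows moved in 0.5 s steps."""
    win, stride = int(win_s * fs), int(stride_s * fs)
    for start in range(0, len(signal) - win + 1, stride):
        yield start / fs, signal[start:start + win]

x = np.arange(3 * 200)                 # 3 s of dummy samples at 200 Hz
starts = [t for t, w in sliding_windows(x)]
print(starts)                          # [0.0, 0.5, 1.0, 1.5, 2.0]
```

Each yielded window is then handed to the feature extractors described below, once per band signal.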
Statistical features: In order to have a compact representation of the raw sensor data in a given time range, we use a set of classical statistical features, since they have proven efficiency in complementing sets of multiple features for recognising patterns in time series. The statistical features are: (i) given a set of data values {x1, x2, ..., xN} acquired in each temporal window, the mean value μ = (1/N) Σ_i x_i of that sequence; (ii) the standard deviation σ = sqrt((1/N) Σ_i (x_i − μ)²); (iii) statistical moments of 3rd and 4th order, which give us the skewness to measure the asymmetry of the data and the kurtosis to measure the peakedness of its probability distribution, respectively. The statistical moments employed are computed as follows:

m_k = (1/N) Σ_i (x_i − μ)^k, (1)
y_k = m_k / σ^k, (2)

where m_k is the k = {3rd, 4th} moment about the mean and y_k = {skewness, k = 3; kurtosis, k = 4}. Another type of statistical feature computed was the autocorrelation of the signals at each time window for each of the five signals from the EEG: the correlation of the signal with a delayed copy of itself as a function of delay was employed similarly to [22] and [23], where the implementation details and parameters are described.
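The autocorrelation feature (the signal correlated with a delayed copy of itself as a function of delay) can be sketched with numpy. The normalisation and maximum lag below are hypothetical choices; the actual parameters are those of [22] and [23]:

```python
import numpy as np

def autocorr(window, max_lag=50):
    """Autocorrelation of a 1 s window as a function of delay (lag)."""
    x = window - window.mean()                    # remove the DC component
    full = np.correlate(x, x, mode="full")        # all lags, -(N-1) .. N-1
    mid = len(full) // 2                          # index of the zero-lag term
    ac = full[mid:mid + max_lag + 1]              # keep non-negative lags
    return ac / ac[0]                             # normalise so lag 0 == 1

rng = np.random.default_rng(0)
ac = autocorr(rng.normal(size=200))
print(ac[0])                                      # 1.0 at zero delay
```

For white noise the values decay towards zero at non-zero lags, whereas oscillatory EEG bands leave periodic structure in the lag profile.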
Max, min and derivatives: Given a time window of 1 sec., the maximum and minimum values are computed to increase the diversity of the feature types. Derivatives are also computed as temporal features. For each time window, we split the window in two, such that w/2 = 0.5 sec. and w = 1 sec., resulting in two sequences of ~125 samples each, and then compute the difference between the mean values of the two halves:

d_μ = μ_w1 − μ_w2, (3)

where w1 and w2 indicate the first and second half of the sequence of data in a time window of 1 sec. The same strategy is employed to obtain the derivatives given the max and min features in the sub time windows:

d_max = max_w1 − max_w2, (4)
d_min = min_w1 − min_w2. (5)
The next temporal features are extracted after splitting the initial one-second time window into 4 batches of 0.25 sec. each. We then compute the mean, max and min values of each batch, {μ1, μ2, μ3, μ4}, {max1, max2, max3, max4} and {min1, min2, min3, min4}, followed by the 1D Euclidean distances among all mean values, d12 = |μ1 − μ2|, d13 = |μ1 − μ3|, d14 = |μ1 − μ4|, d23 = |μ2 − μ3|, d24 = |μ2 − μ4|, d34 = |μ3 − μ4|, and likewise for the minimum and maximum values, yielding 18 distance-based features. Using the four mean values, the four max and four min values, and adding the previous 18, we obtain 30 features for each signal in the short time window, so that counting the 5 signals we have 150 temporal features per second.
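The 30 batch-based features per signal (four means, four maxima, four minima, and the 18 pairwise absolute differences) can be sketched as follows; the function name and feature ordering are illustrative, not prescribed by the paper:

```python
import numpy as np
from itertools import combinations

def batch_features(window):
    """Split a 1 s window into 4 batches of 0.25 s; return 30 features:
    the 4 means, 4 maxima and 4 minima, plus the 6 pairwise absolute
    differences within each of those three groups (18 distances)."""
    batches = np.array_split(window, 4)
    feats = []
    for stat in (np.mean, np.max, np.min):
        vals = [stat(b) for b in batches]                       # 4 values
        feats.extend(vals)
        feats.extend(abs(a - b) for a, b in combinations(vals, 2))  # 6 dists
    return np.array(feats)

f = batch_features(np.arange(200, dtype=float))
print(len(f))   # 30 features per signal -> 150 across the 5 band signals
```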
Log-covariance features: Given the previous 150 temporal features, we discard the last 6 in order to obtain 144 features, from which we build a 12 × 12 square matrix to compute the log-covariance as follows:

lcM = U(logm(cov(M))), (6)

where lcM is a resulting vector containing the upper triangular elements (78 features) of the matrix obtained by computing the matrix logarithm of the covariance matrix; U(.) is a function returning the upper triangular elements; logm(.) is the matrix logarithm function; and the covariance matrix is given by cov(M) = E[(M − E[M])(M − E[M])ᵀ]. The rationale behind the log-covariance is that the space of covariance matrices forms a convex cone rather than a vector space, i.e. it is not closed under multiplication with negative scalars and so does not lie in a Euclidean space; the matrix logarithm maps this cone to a vector space where Euclidean operations are valid.
Shannon entropy and log-energy entropy: Non-linear analysis such as Shannon entropy has proven its efficiency in signal processing and time-series analysis, since the randomness of non-linear data is well captured by computing entropies over the time series. Entropy is an uncertainty measure; in brain-machine interface applications it is used to measure the level of chaos of the system, since it is a non-linear measure quantifying the degree of complexity of the data. In information theory, the Shannon entropy is given by:

h = −Σ_j S_j log(S_j), (7)

where h is a feature computed in every time window of 1 sec. and S_j is each (normalised) element of this temporal window. Then, given the same time window, we split it in two to compute the log-energy entropy as follows:

h_LE = Σ_i log(S_i²) + Σ_j log(S_j²), (8)

where i indexes the elements of the first sub-window (0-0.5 sec.) and j the elements of the second sub-window (0.5-1 sec.).
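Equations (7) and (8) can be sketched as below. Normalising the absolute amplitudes to sum to one (so the S_j form a probability-like distribution) is an assumption of this sketch; the paper does not specify its normalisation:

```python
import numpy as np

def shannon_entropy(window):
    """Eq. (7): h = -sum_j S_j * log(S_j) over a normalised 1 s window."""
    s = np.abs(window)
    s = s / s.sum()                     # normalise so the S_j sum to 1
    s = s[s > 0]                        # zero terms contribute nothing
    return -np.sum(s * np.log(s))

def log_energy_entropy(window):
    """Eq. (8): sum of log-energies over the two 0.5 s sub-windows."""
    half = len(window) // 2
    a, b = window[:half], window[half:]
    return np.sum(np.log(a ** 2)) + np.sum(np.log(b ** 2))

x = np.ones(200)                        # flat window: maximal Shannon entropy
print(shannon_entropy(x))               # log(200), about 5.298
print(log_energy_entropy(x))            # 0.0, since log(1) = 0
```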
Frequency domain: The FFT is an advantageous method to analyse the spectrum of a given time series. At every time window we compute it as follows:

X_k = Σ_{n=0}^{N−1} x_n e^{−i 2π k n / N}, k = 0, ..., N − 1. (9)
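Equation (9) is the standard discrete Fourier transform, computed efficiently by numpy's FFT. A sketch on a synthetic one-second window (the 10 Hz test tone is illustrative, not data from the study):

```python
import numpy as np

fs = 200
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 10 * t)          # 10 Hz tone in a 1 s window

X = np.fft.rfft(x)                      # eq. (9) for real input, k = 0..N/2
freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
peak = freqs[np.argmax(np.abs(X))]
print(peak)                             # 10.0 -> the dominant component
```

The magnitudes |X_k| over a window give the spectral features used alongside the temporal ones.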
Accumulative features as energy model: An accumulative value was obtained frame-by-frame within a time window for each individual feature, duplicating the number of features. We compute the difference between the value of the current frame and that of the previous frame and accumulate it over time as

E_i^z = E_i^{z−1} + (f_i^z − f_i^{z−1}), with E_i^0 = 0, (10)

where E_i^z is the resulting energy model for the current time instant given a specific type of feature f_i, i = {1, ..., N}, at a time instant z representing a specific frame within a time window.
B. Feature Selection Algorithms
Feature selection aims to remove data that has no useful application and only serves to unnecessarily increase the demand for resources. Five datasets were generated using different algorithms; each retained the same data points but with a reduced number of attributes, selected by the respective algorithm. The evaluators used were as follows:
1. OneR: calculates error rate of each prediction based on
one rule and selects the lowest risk classification [24].
2. Information Gain: assigns a worth to each individual
attribute by measuring the information gain with
respect to the class (difference of entropy) [25].
3. Correlation: measures the correlation between the attribute and the class via Pearson's coefficient, which is used to rank each attribute's worth against all others [26].
4. Symmetrical Uncertainty: measures the uncertainty of
an attribute with respect to the class and bases selection
on lower uncertainties [27].
5. Evolutionary Algorithm: creates a population of
attribute subsets and ranks their effectiveness with a
fitness function to measure their predictive ability of
the class. At each generation, solutions are bred to
create offspring, and weakest solutions are killed off in
a tournament of fitness [34].
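The OneR evaluator (item 1) can be sketched in pure Python: an attribute is scored by the error rate of the single best rule built on it alone. Discrete attribute values are assumed here for brevity (WEKA additionally bins numeric attributes, which this sketch omits); the toy data is illustrative:

```python
from collections import Counter, defaultdict

def one_r_error(attribute_values, labels):
    """Error rate of the one-rule that maps each attribute value to the
    majority class seen with it (lower error -> more useful attribute)."""
    by_value = defaultdict(Counter)
    for v, y in zip(attribute_values, labels):
        by_value[v][y] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in by_value.values())
    return 1.0 - correct / len(labels)

attr = ["low", "low", "high", "high", "high"]
labels = ["relaxed", "relaxed", "focused", "focused", "relaxed"]
print(f"{one_r_error(attr, labels):.2f}")   # 0.20: one of five misclassified
```

Ranking attributes by this error and keeping those below a cut-off reproduces the selection style described above.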
C. Machine Learning Algorithms
As a benchmark, a ZeroR classifier was first run on each dataset. This simplistic classifier predicts the single most frequent class for all of the data; with a fair distribution of the three mental states, an accuracy of roughly one third is expected. Two models were trained on Bayes' theorem, a formula of conditional probability based on a hypothesis H and evidence E. The theorem states that the probability P(H) of the hypothesis being true before evidence is related to the probability of the hypothesis after observing the evidence, P(H | E), and is given as follows [29]:

P(H | E) = P(E | H) P(H) / P(E). (11)
In the Naive Bayes classifier, naivety arises from the unverified assumption that the attributes are conditionally independent of one another given the class. A Bayesian Network (Bayes Net) model was also trained. This method generates a probabilistic graphical model representing the conditional dependencies between variables and classes on a Directed Acyclic Graph (DAG) [28], whose joint distribution factorises as:

P(X1, ..., Xn) = Π_i P(X_i | parents(X_i)). (12)
TABLE I. Attribute selection algorithms (e.g. Information Gain) with their ranker search cut-off and the number of attributes selected.
TABLE II. Model accuracy (%, 2 d.p.) of Naive Bayes, Bayes Net, Random Tree and Random Forest on each dataset.
The goal is to infer the current time value of Ct given the
data Xt:t-T = {Xt, Xt-1,...,Xt-T} and the prior knowledge of the
class, which is attained by the a-posteriori probability
P(Ct |Ct-1:t-T, Xt:t-T). The superscript notation denotes the set
of values over a time interval.
Three decision trees were developed. Generated by the C4.5 algorithm [2], a J48 tree splits each decision based on information gain, measured via the entropy at each leaf node. A Random Tree is generated through a stochastic process that considers a random subset of attributes at each node, and a Random Forest is an ensemble of multiple Random Trees [3]. A Multilayer Perceptron (MLP) model was also generated: a feedforward neural network, i.e. one in which the connections between neurons do not form cycles. An MLP was implemented due to its ability to classify data points that are not linearly separable in Euclidean space [30]. A model was also trained
using a Support Vector Machine (SVM), which classifies
labelled data through a process of supervised learning,
where examples are mapped out in space and classification
is performed by the closest area in which the unknown class
data falls [31]. In particular, an improved version of Platt’s
Sequential Minimal Optimization (SMO) was used to train
the SVM [32], [33].
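The evaluation protocol used throughout (10-fold cross-validation with a fixed random seed) can be sketched with scikit-learn standing in for the WEKA implementations used in the paper; the synthetic 44-feature dataset and the injected class separation are assumptions of this sketch, not the study's data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 44))        # stand-in for 44 selected features
y = np.repeat([0, 1, 2], 100)         # three balanced mental-state classes
X[y == 1] += 2.0                      # inject separability so CV is meaningful
X[y == 2] -= 2.0

clf = RandomForestClassifier(n_estimators=100, random_state=1)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)
scores = cross_val_score(clf, X, y, cv=cv)
print(scores.mean())                  # mean accuracy across the 10 folds
```

Swapping the estimator for an SVM or a Bayesian model while keeping the same `cv` object mirrors the comparison performed in Table II.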
V. RESULTS
The five generated sets from the original dataset are shown in Table I. Five different algorithms were chosen, and their results ranked by their individual scores. Arbitrary cut-off points were set where the scores approached either 0 or, if no zero values were present, the lowest score observed. The scores are not comparable between algorithms due to their unique scoring methods. The MLP was given 2000 epochs to train, with the number of nodes per layer set to the default "a" setting, dynamically calculated as n = (attributes + classes) / 2
for each dataset it was trained on. A Zero Rules classifier was run as a benchmark and, with close to equally distributed data, set a baseline accuracy of 33.36% on all datasets for comparison. The most effective model was a Random Forest classifier together with the dataset created by the OneR attribute selector, which reached a high accuracy of 87.16% when classifying the data into one of the three mental states. Preliminary results for each of the datasets and their trained models are presented in Table II. For each test, 10-fold cross validation was used to train the model, and all random seeds were set to their default value of 1. Table II shows that all of the models far outperformed the benchmark set by the Zero Rules classifier, the lowest being 51.49% (Symmetrical Uncertainty dataset with a Naive Bayes classifier); compared to all other, non-naive classifiers, it is reasonable to assume that the naivety of not considering attribute relationships led to these poorer results.
VI. CONCLUSION
This paper presented a study on mental state classification based on EEG signals. It proposed a set of features, extracted with short-term windowing from the five signals of an EEG sensor, to categorise three different states: neutral, relaxed and concentrating. A dataset was created using data from five individuals in sessions lasting one minute for each state. The primary goal of this work was to find an appropriate set of features by testing multiple feature selection algorithms and classification models that provide acceptable accuracy on the dataset and can be useful for human-machine interaction. From the multiple feature sets and models produced, the most accurate is a Random Forest classifier on the attribute set selected by OneR, with a prediction accuracy of 87.16%. Future work will focus on comparing our best results with deep learning strategies and on implementing a real-time application to: (i) control devices, such as robots; and (ii) detect positive and negative moods useful for applications in mental health care.
REFERENCES
[1] Himebaugh, N.L., Begley, C.G., Bradley, A. and Wilkinson, J.A., 2009. Blinking and tear break-up during four visual tasks. Optometry and Vision Science, 86(2), pp. E106-E114.
[2] Quinlan, R., 1993. C4.5: Programs for Machine Learning. Morgan
Kaufmann Publishers, San Mateo, CA.
[3] Breiman, L., 2001. Random forests. Machine Learning, 45(1), pp.5-32.
[4] Jasper, Herbert H. 1958. "The ten-twenty electrode system of the
International Federation." Electroenceph. Clin. Neurophysiol. 370-
[5] Kowaleski, J. (2017). BlueMuse.
[6] Barachant, A. (2017). Muselsl.
[7] Bos, D.O., 2006. EEG-based emotion recognition. The Influence of
Visual and Auditory Stimuli, 56(3), pp.1-17.
[8] Krigolson, O.E., Williams, C.C., Norton, A., Hassall, C.D. and
Colino, F.L., 2017. Choosing MUSE: Validation of a low-cost,
portable EEG system for ERP research. Frontiers in neuroscience, 11,
[9] Li, W., Jaramillo, C. and Li, Y., 2012, January. Development of mind
control system for humanoid robot through a brain computer
interface. In 2012 International Conference on Intelligent System
Design and Engineering Application (pp. 679-682). IEEE.
[10] Rosenzweig, M.R., Breedlove, S.M. and Leiman, A.L., 2002.
Biological psychology: An introduction to behavioral, cognitive, and
clinical neuroscience. Sinauer Associates.
[11] Abujelala, M., Abellanoza, C., Sharma, A. and Makedon, F., 2016,
June. Brain-ee: Brain enjoyment evaluation using commercial eeg
headband. In Proceedings of the 9th acm international conference on
pervasive technologies related to assistive environments (p. 33).
[12] Plotnikov, A., Stakheika, N., De Gloria, A., Schatten, C., Bellotti, F.,
Berta, R., Fiorini, C. and Ansovini, F., 2012, July. Exploiting real-
time EEG analysis for assessing flow in games. In 2012 IEEE 12th
International Conference on Advanced Learning Technologies (pp.
688-689). IEEE.
[13] Jordan, K.G., 2004. Emergency EEG and continuous EEG monitoring
in acute ischemic stroke. J. of Clinical Neurophys., 21(5), pp.341-352.
[14] Ang, K.K., Guan, C., Chua, K.S.G., Ang, B.T., Kuah, C., Wang, C.,
Phua, K.S., Chin, Z.Y. and Zhang, H., 2010, August. Clinical study of
neurorehabilitation in stroke using EEG-based motor imagery brain-
computer interface with robotic feedback. 2010 Annual International
Conference of the IEEE (pp. 5549-5552).
[15] Tzallas, A.T., Tsipouras, M.G. and Fotiadis, D.I., 2009. Epileptic seizure detection in EEGs using time-frequency analysis. IEEE Transactions on Information Technology in Biomedicine, 13(5), pp.703-710.
[16] Aarabi, A., Grebe, R. and Wallois, F., 2007. A multistage knowledge-
based system for EEG seizure detection in newborn infants. Clinical
Neurophysiology, 118(12), pp.2781-2797.
[17] Ghosh-Dastidar, S. and Adeli, H., 2007. Improved spiking neural
networks for EEG classification and epilepsy and seizure detection.
Integrated Computer-Aided Engineering, 14(3), pp.187-212.
[18] Chai, T.Y., Woo, S.S., Rizon, M. and Tan, C.S., 2010. Classification
of human emotions from EEG signals using statistical features and
neural network. In International (Vol. 1, No. 3, pp. 1-6). Penerbit
[19] Tanaka, H., Hayashi, M. and Hori, T., 1996. Statistical features of
hypnagogic EEG measured by a new scoring system. Sleep, 19(9),
[20] Fraiwan, L., Lweesy, K., Khasawneh, N., Wenz, H. and Dickhaus, H.,
2012. Automated sleep stage identification system based on time
frequency analysis of a single EEG channel and random forest
classifier. Computer methods and programs in biomedicine, 108(1),
[21] Rytkönen, K.M., Zitting, J. and Porkka-Heiskanen, T., 2011.
Automated sleep scoring in rats and mice using the naive Bayes
classifier. Journal of neuroscience methods, 202(1), pp.60-64.
[22] Vital, J.P., Faria, D.R., Dias, G., Couceiro, M.S., Coutinho, F. and
Ferreira, N.M., 2017. Combining discriminative spatiotemporal
features for daily life activity recognition using wearable motion
sensing suit. Pattern Analysis and Applications, 20(4), pp.1179-1194.
[23] Faria, D.R., Vieira, M., Premebida, C. and Nunes, U., 2015, August.
Probabilistic human daily activity recognition towards robot-assisted
living. In Robot and Human Interactive Communication (RO-MAN),
2015 24th IEEE International Symposium on (pp. 582-587). IEEE.
[24] University of Waikato. 2018. OneR. [online]
Available at:
[Accessed 9 Aug. 2018].
[25] University of Waikato. 2018. InfoGainAttributeEval. [online] Available at:
AttributeEval.html [Accessed 9 Aug. 2018].
[26] Pearson, K., 1895. Note on regression and inheritance in the case of
two parents. Proceedings of the Royal Society of London, 58, pp.240-
[27] Witten, I.H., Frank, E., Hall, M.A. and Pal, C.J., 2016. Data Mining:
Practical machine learning tools and techniques. Morgan Kaufmann.
[28] Pearl, Judea 2000. Causality: Models, Reasoning, and Inference.
Cambridge University Press. ISBN 0-521-77362-8.
[29] Bayes, T., Price, R. and Canton, J., 1763. An essay towards solving a
problem in the doctrine of chances.
[30] Rosenblatt, F., 1961. Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms (Report No. VG-1196-G-8). Cornell Aeronautical Laboratory, Buffalo, NY.
[31] Cortes, C. and Vapnik, V., 1995. Support-vector networks. Machine
learning, 20(3), pp.273-297.
[32] Platt, J.C., 1999. Fast training of support vector machines using sequential minimal optimization. Advances in Kernel Methods, pp.185-208.
[33] Keerthi, S.S., Shevade, S.K., Bhattacharyya, C. and Murthy, K.R.K.,
2001. Improvements to Platt's SMO algorithm for SVM classifier
design. Neural computation, 13(3), pp.637-649.
[34] Back, T., 1996. Evolutionary algorithms in theory and practice:
evolution strategies, evolutionary programming, genetic algorithms.
Oxford university press.
[35] Shenoy, P., Miller, K.J., Ojemann, J.G. and Rao, R.P.N., 2007. Generalized features for electrocorticographic BCIs. IEEE Transactions on Biomedical Engineering, 55(1), pp.273-280.
... The electrodes that rest on the forehead are known as frontal electrodes (AF7 and AF8), and those that rest behind the ears are temporal electrodes (TP9 and TP10). Three different states, such as relaxing, neutral, and focused, can be classified using the Muse headband with four EEG sensors (TP9, AF7, AF8, TP10), based on cognitive behavioural studies [5]. ...
... The first brainwaves to be identified were alpha waves (8-13 Hz), among the easiest to examine. Theta brainwaves (4-8 Hz) are typically strongly discernible when sleeping and dreaming [7]. The slowest brainwaves, known as delta waves (0.5-4 Hz), are highest while we sleep deeply and dreamlessly [8]. ...
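The band definitions quoted above can be made concrete with a small sketch. This is our own illustration, not code from the paper or any citing work: the band edges follow the ranges above, and the naive DFT (adequate for short windows) is an assumption for demonstration.

```python
import math

# Illustrative band edges in Hz, matching the ranges quoted in the text.
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13)}

def band_powers(signal, fs):
    """Return total spectral power per band for a real-valued signal,
    using a naive DFT over the positive-frequency bins (DC excluded)."""
    n = len(signal)
    powers = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2):
        freq = k * fs / n
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        p = (re * re + im * im) / n
        for name, (lo, hi) in BANDS.items():
            if lo <= freq < hi:
                powers[name] += p
    return powers

# A 10 Hz sine should concentrate its power in the alpha band.
fs = 128
sig = [math.sin(2 * math.pi * 10 * i / fs) for i in range(fs)]
p = band_powers(sig, fs)
assert p["alpha"] > p["theta"] and p["alpha"] > p["delta"]
```

In practice a windowed FFT or Welch estimate would replace the quadratic-time DFT loop, but the band-summing logic stays the same.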
... In [5], an effort was made to identify discriminative EEG-based features and acceptable classification methods that can categorize brainwave rhythms depending on their level of activity or frequency, in order to recognize mental states for use in human-machine interaction. To categorize three different states (neutral, relaxed, and concentrated), this research presented a study on the classification of mental states using EEG signals. ...
Medical data are increasing drastically due to the vast development of the medical sciences. The security of this immense amount of data is also a challenge of the present era. Image watermarking is a technique to secure medical data from alteration. Authentication of patient records is also necessary in the case of medical data transmission. In this paper, an optimized electroencephalogram watermarking technique with dual authentication, using the Advanced Encryption Standard (AES) and speeded-up robust features, is proposed. The scaling factor plays an important role in balancing the properties of a watermarking algorithm; cuckoo search optimization is used to obtain the optimized scaling factor. Henon encryption (HE) is used to enhance the security of the sub-band obtained from the patient identity image used as the watermark. The diagonalized Hessenberg decomposition (HD) is used for embedding the watermark, while the secure hash algorithm (SHA-256) protects the watermark against malicious attacks. A detailed security analysis has been performed for the AES encryption technique, and various performance metrics are computed to estimate the effectiveness of the proposed watermarking system.
... This research considers two datasets, SEED [19] and EEG Brainwave [20], to extract optimised features. SEED (2015) is an emotion dataset in which subjects were shown film clips and their EEG recordings were collected, after which each participant reported their ratings in terms of positive, negative and neutral emotions. ...
... The SEED [19] and EEG Brainwave [20] datasets are investigated in this research. The section that follows provides a full description of these datasets. ...
Abstract Purpose Human emotion recognition using electroencephalograms (EEG) is a critical area of research in human–machine interfaces. Furthermore, EEG data are convoluted and diverse; thus, acquiring consistent results from these signals remains challenging. As such, the authors felt compelled to investigate EEG signals to identify different emotions. Methods A novel deep learning (DL) model, stacked long short-term memory with attention (S-LSTM-ATT), is proposed for emotion recognition (ER) in EEG signals. Long short-term memory (LSTM) and attention networks effectively handle time-series EEG data and recognise intrinsic connections and patterns. Therefore, the model combined the strengths of the LSTM model and incorporated an attention network to enhance its effectiveness. Optimal features were extracted with the metaheuristic-based firefly optimisation algorithm (FFOA) to identify different emotions efficiently. Results The proposed approach recognised emotions in two publicly available standard datasets: SEED and EEG Brainwave. An outstanding accuracy of 97.83% on the SEED and 98.36% on the EEG Brainwave datasets was obtained for three emotion indices: positive, neutral and negative. Aside from accuracy, a comprehensive comparison of the proposed model's precision, recall, F1 score and kappa score was performed to determine the model's applicability. When applied to the SEED and EEG Brainwave datasets, the proposed S-LSTM-ATT achieved superior results to baseline models such as convolutional neural networks (CNN), gated recurrent units (GRU) and LSTM. Conclusion Combining an FFOA-based feature selection (FS) and an S-LSTM-ATT-based classification model demonstrated promising results with high accuracy. Other metrics like precision, recall, F1 score and kappa score proved the suitability of the proposed model for ER in EEG signals.
... The high-intensity BCIT, with a training duration of between 3 and 6 weeks, might provide an efficient approach for clinical practice [15][16]. In [17][18][19] the authors used BCIs for disabled people, where they classified the most distinctive features with an accuracy of 88% and lowered the FPR (false positive rate). However, none of the existing studies has developed an EEG-based model for emotion recognition with 96% accuracy. ...
... The dataset was collected using the Muse headband sensor [2]. It is a commercial sensing device used to record brain activity with five dry application sensors, where four sensors (AF7, AF8, TP9, TP10) [18][19] record the brain activity and one serves as the reference point (NZ) above the forehead. AF7 and AF8 are the frontal electrodes that rest on the forehead, and TP9 and TP10 are the temporal electrodes that rest behind the ears [20][21][22][23][24]. ...
A Brain Computer Interface (BCI) is a communication channel between the brain and a machine that allows instructions to be passed to the machine. BCIs help in situations where patients are unable to move or to utter a word. Many devices have been built that are used by people with disabilities and help them carry out their work. Brain computer interfaces have also been made useful for patients with disorders such as stroke as well as other physical disorders. A BCI first acquires and analyzes the brain's electroencephalography (EEG) signals, translating them into the actions needed to fulfill the command given by the brain. Recent advances in brain computer interface technology are exciting for scientists, engineers and clinicians, further fuelling rapid development and growing research. Classification and detection on BCI systems for patients have been demonstrated. This study proposes a novel stochastic model that classifies emotions as positive, negative and neutral with an accuracy of 96% using an ensemble model (multiclass logistic regression, Light Gradient Boosting Machine (LGBM), random forest classifier and decision tree classifier).
... A statistical dataset of 2,548 source attributes [4,5] was derived from this EEG dataset by applying various feature extraction methods, such as accumulative features (e.g., an energy model), Shannon entropy, frequency-domain features, log-covariance features, log-energy entropy, statistical features, maxima, minima and derivatives, over the recorded EEG dataset. ...
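A few of the statistical features listed above can be sketched for a single EEG window. This is our own illustration, not the cited paper's exact recipe: the function name and the coarse 8-bin histogram used for Shannon entropy are assumptions for demonstration.

```python
import math
from statistics import mean, stdev

def window_features(window):
    """Compute a handful of illustrative statistical features
    (min, max, mean, std, mean first derivative, Shannon entropy)
    over one window of samples."""
    diffs = [b - a for a, b in zip(window, window[1:])]   # first derivative
    # Shannon entropy over a coarse 8-bin histogram of the window values
    lo, hi = min(window), max(window)
    width = (hi - lo) / 8 or 1.0                          # guard constant windows
    counts = [0] * 8
    for x in window:
        counts[min(int((x - lo) / width), 7)] += 1
    probs = [c / len(window) for c in counts if c]
    entropy = -sum(p * math.log2(p) for p in probs)
    return {
        "min": lo, "max": hi, "mean": mean(window), "std": stdev(window),
        "mean_derivative": mean(diffs), "shannon_entropy": entropy,
    }

feats = window_features([0.1, 0.4, 0.35, 0.9, 0.2, 0.6, 0.55, 0.8])
assert feats["max"] == 0.9 and feats["min"] == 0.1
```

Applying such a function over sliding windows of each channel, together with frequency-domain and covariance-based features, is how feature counts in the thousands arise.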
Throughout the years, major advancements have been made in the field of EEG-based emotion classification. Implementing deep architectures for supervised and unsupervised learning from data has come a long way. This study aims to capitalize on these advancements to classify emotions from EEG signals accurately. It still is, however, a challenging task. The fact that the data we are reliant on changes from person to person calls for an elaborate machine-learning solution that can achieve high degrees of abstraction without sacrificing accuracy and legibility. In this study, the Xception model from Keras API was utilized, as well as wavelet transform for feature extraction, which was then used for classification using different classifiers. These features were classified into three distinct categories: NEGATIVE, POSITIVE and NEUTRAL. To examine the effectiveness of the Xception deep neural net, we compare the results of different classifiers like Support Vector Machine, Random Forest, AdaBoostM1, LogitBoost, Naïve Bayes Updateable and Non-Nested Generalization Exemplars. The random forest ensemble achieved the best results from all the classifiers implemented in this study. It had higher accuracy scores than existing models without compromising on areas like precision, F1 score, and recall value.
... Bird et al., the creators of the dataset utilised in this study, tested a variety of feature selection methods and classifier models. In the end, they obtained an accuracy of over 87% on the mental state sets [30]. The main goal of the current paper is to use 10 different ML algorithms to conduct an investigation and identify three different mental states. As can be observed from the preceding studies, there is more potential for research into a subject's state of consciousness and focus than there is for investigating other goals like emotion classification. ...
One of the most exciting areas of computer science right now is brain-computer interface (BCI) research. A conduit for data flow between both the brain as well as an electronic device is the brain-computer interface (BCI). Researchers in several disciplines have benefited from the advancements made possible by brain-computer interfaces. Primary fields of study include healthcare and neuroergonomics. Brain signals could be used in a variety of ways to improve healthcare at every stage, from diagnosis to rehabilitation to eventual restoration. In this research, we demonstrate how to classify EEG signals of brain waves using machine learning algorithms for predicting mental health states. The XGBoost algorithm's results have an accuracy of 99.62%, which is higher than that of any other study of its kind and the best result to date for diagnosing people's mental states from their EEG signals. This discovery will aid in taking efforts [1] to predict mental state using EEG signals to the next level.
This paper presents the basics of non-invasive electroencephalogram (EEG)-based brain-computer interfaces (BCIs) and a system that could support the learning process in an e-learning environment. More specifically, we focus on the user's mental state and how this can be detected and captured using EEG in order to provide a positive user experience through appropriate adaptation. Mental state is a multidimensional term that includes among else relaxation and concentration, which, in this paper, are captured using a low-cost EEG device that the users wear throughout the interaction with the online environment. The accurate identification of these states is made with the aid of Machine Learning (ML) and enables notifications and proper adaptation to automatically match the detected mental state. Finally, the results from the user experience during the system usage are presented assuming 10 participants. The evaluation showed that such a system would provide a positive user experience.
This short chapter is dedicated to a research approach that is promising but still in its infancy and questionable from an ethical point of view: direct connection to the brain. In this chapter, we present the main techniques of brain–computer interfaces and their role in virtual reality.
Attention is a complex process that is important in achieving optimal task performance. EEG-based attention recognition involves calibration tasks that subjects need to perform, as well as training of the attention model for reliable detection of attention levels. However, these calibration paradigms might not be ideal for inducing desirable attention states, as they are simple and monotonous. There is also limited evaluation of the different calibration tasks to assess their attention detection performance. To fill this gap, we designed different calibration tasks and compared their effectiveness in inducing attention. We collected EEG data from 29 subjects using a consumer-grade EEG headband according to our experimental protocol. We used six bandpower features per channel to classify binary attention states (attention vs. inattention). Based on our evaluation results, we discovered that two modified calibration tasks (Modified Flanker and Colour Flanker) achieved higher classification accuracies compared to the baseline task (Baseline Flanker). The Modified Flanker task achieved the highest mean subject-independent accuracy of 66.45 ± 16.38% across subjects. Our findings showed that different subjects need unique calibration paradigms to achieve high attention classification performance. In comparison with qualitative survey data analysis, we also found that the subjects' personality and learner types do not correlate highly with attention detection performance. Keywords: EEG, attention, calibration tasks.
In recent years there has been an increase in the number of portable low-cost electroencephalographic (EEG) systems available to researchers. However, to date the validation of the use of low-cost EEG systems has focused on continuous recording of EEG data and/or the replication of large system EEG setups reliant on event-markers to afford examination of event-related brain potentials (ERP). Here, we demonstrate that it is possible to conduct ERP research without being reliant on event markers using a portable MUSE EEG system and a single computer. Specifically, we report the results of two experiments using data collected with the MUSE EEG system—one using the well-known visual oddball paradigm and the other using a standard reward-learning task. Our results demonstrate that we could observe and quantify the N200 and P300 ERP components in the visual oddball task and the reward positivity (the mirror opposite component to the feedback-related negativity) in the reward-learning task. Specifically, single sample t-tests of component existence (all p's < 0.05), computation of Bayesian credible intervals, and 95% confidence intervals all statistically verified the existence of the N200, P300, and reward positivity in all analyses. We provide with this research paper an open source website with all the instructions, methods, and software to replicate our findings and to provide researchers with an easy way to use the MUSE EEG system for ERP research. Importantly, our work highlights that with a single computer and a portable EEG system such as the MUSE one can conduct ERP research with ease thus greatly extending the possible use of the ERP methodology to a variety of novel contexts.
Motion sensing plays an important role in the study of human movements, motivated by a wide range of applications in different fields, such as sports, health care, daily activity, action recognition for surveillance, assisted living and the entertainment industry. In this paper, we describe how to classify a set of human movements comprising daily activities using a wearable motion capture suit, denoted as FatoXtract. A probabilistic integration of different classifiers recently proposed is employed herein, considering several spatiotemporal features, in order to classify daily activities. The classification model relies on the computed confidence belief from base classifiers, combining multiple likelihoods from three different classifiers, namely Naïve Bayes, artificial neural networks and support vector machines, into a single form, by assigning weights from an uncertainty measure to counterbalance the posterior probability. In order to attain an improved performance on the overall classification accuracy, multiple features in time domain (e.g., velocity) and frequency domain (e.g., fast Fourier transform), combined with geometrical features (joint rotations), were considered. A dataset from five daily activities performed by six participants was acquired using FatoXtract. The dataset provided in this work was designed to be extremely challenging since there are high intra-class variations, the duration of the action clips varies dramatically, and some of the actions are quite similar (e.g., brushing teeth and waving, or walking and step). Reported results, in terms of both precision and recall, remained around 85 %, showing that the proposed framework is able to successfully classify different human activities.
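The confidence-weighted combination of Naïve Bayes, neural network and SVM posteriors described above can be sketched roughly as follows. This is our own simplified stand-in, not the paper's exact formulation: the entropy-based weights play the role of the uncertainty measure that counterbalances the posterior probabilities.

```python
import math

def entropy(dist):
    """Shannon entropy (nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)

def fuse(posteriors):
    """Weighted average of per-classifier class-probability lists,
    where more certain (lower-entropy) classifiers get higher weight."""
    n_classes = len(posteriors[0])
    max_h = math.log(n_classes)
    weights = [1.0 - entropy(d) / max_h for d in posteriors]
    total = sum(weights) or 1.0
    return [sum(w * d[c] for w, d in zip(weights, posteriors)) / total
            for c in range(n_classes)]

# Hypothetical outputs of Naïve Bayes, an ANN and an SVM over 3 classes.
nb, ann, svm = [0.7, 0.2, 0.1], [0.5, 0.3, 0.2], [0.6, 0.25, 0.15]
fused = fuse([nb, ann, svm])
assert max(range(3), key=lambda c: fused[c]) == 0
```

Because each input distribution sums to one, the fused output is again a valid distribution, and a confident classifier pulls the ensemble toward its prediction more strongly than an uncertain one.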
Previous studies that involve measuring EEG, or electroencephalograms, have mainly been experimentally-driven projects; for instance, EEG has long been used in research to help identify and elucidate our understanding of many neuroscientific, cognitive, and clinical issues (e.g., sleep, seizures, memory). However, advances in technology have made EEG more accessible to the population. This opens up lines for EEG to provide more information about brain activity in everyday life, rather than in a laboratory setting. To take advantage of the technological advances that have allowed for this, we introduce the Brain-EE system, a method for evaluating user engaged enjoyment that uses a commercially available EEG tool (Muse). During testing, fifteen participants engaged in two tasks (playing two different video games via tablet), and their EEG data were recorded. The Brain-EE system supported much of the previous literature on enjoyment; increases in frontal theta activity strongly and reliably predicted which game each individual participant preferred. We hope to develop the Brain-EE system further in order to contribute to a wide variety of applications (e.g., usability testing, clinical or experimental applications, evaluation methods, etc.).
A statistics-based system for human emotion classification using electroencephalograms (EEG) is proposed in this paper. The data used in this study are acquired using EEG, and the emotions are elicited from six human subjects under the effect of emotion stimuli. This paper also proposes an emotion stimulation experiment using visual stimuli. From the EEG data, a total of six statistical features are computed, and a back-propagation neural network is applied for the classification of human emotions. In an experiment classifying five types of emotions (anger, sadness, surprise, happiness, and neutral), an overall classification rate as high as 95% is achieved.
Data Mining: Practical Machine Learning Tools and Techniques, Fourth Edition, offers a thorough grounding in machine learning concepts, along with practical advice on applying these tools and techniques in real-world data mining situations. This highly anticipated fourth edition of the most acclaimed work on data mining and machine learning teaches readers everything they need to know to get going, from preparing inputs, interpreting outputs, and evaluating results, to the algorithmic methods at the heart of successful data mining approaches. Extensive updates reflect the technical changes and modernizations that have taken place in the field since the last edition, including substantial new chapters on probabilistic methods and on deep learning. Accompanying the book is a new version of the popular WEKA machine learning software from the University of Waikato. Authors Witten, Frank, Hall, and Pal include today's techniques coupled with the methods at the leading edge of contemporary research. The book companion website contains PowerPoint slides covering each chapter of the book, an online appendix on the WEKA workbench, and a table of contents highlighting the many new sections in the fourth edition, along with reviews of the first edition and errata.
The book provides a thorough grounding in machine learning concepts, as well as practical advice on applying the tools and techniques to data mining projects; presents concrete tips and techniques for performance improvement that work by transforming the input or output in machine learning methods; includes a downloadable WEKA software toolkit, a comprehensive collection of machine learning algorithms for data mining tasks in an easy-to-use interactive interface; and includes open-access online courses that introduce practical applications of the material in the book.
The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimension feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensure high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data. High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
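The idea of a non-linear map to a high-dimensional feature space combined with a linear decision surface can be illustrated without the full SVM machinery. The sketch below uses a kernel perceptron, a simpler relative of the SVM (not the SVM training procedure itself), where a polynomial kernel stands in for the implicit feature map; the data and function names are our own illustrative choices.

```python
def poly_kernel(x, y, degree=2):
    """Polynomial kernel: inner product in an implicit high-dim feature space."""
    return (1 + sum(a * b for a, b in zip(x, y))) ** degree

def train(X, y, epochs=20):
    """Kernel perceptron: learn dual coefficients by updating on mistakes."""
    alpha = [0.0] * len(X)
    for _ in range(epochs):
        for i, (xi, yi) in enumerate(zip(X, y)):
            pred = sum(a * yj * poly_kernel(xj, xi)
                       for a, yj, xj in zip(alpha, y, X))
            if yi * pred <= 0:          # misclassified: strengthen this example
                alpha[i] += 1.0
    return alpha

def predict(alpha, X, y, x):
    s = sum(a * yj * poly_kernel(xj, x) for a, yj, xj in zip(alpha, y, X))
    return 1 if s >= 0 else -1

# XOR-like data: not linearly separable in the input space,
# but separable by a linear surface in the degree-2 feature space.
X = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
y = [-1, 1, 1, -1]
alpha = train(X, y)
assert all(predict(alpha, X, y, x) == t for x, t in zip(X, y))
```

The SVM replaces this mistake-driven update with a constrained optimization (e.g. SMO, refs. [32][33]) that picks the maximum-margin surface, but the kernel trick shown here is the same.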