Hidden Markov models for online classification of single trial EEG data
B. Obermaier (b, c, *), C. Guger (a), C. Neuper (b), G. Pfurtscheller (a, b)
(a) Department of Medical Informatics, Institute for Biomedical Engineering, Graz University of Technology, Graz, Austria
(b) Ludwig Boltzmann-Institute for Medical Informatics and Neuroinformatics, Graz University of Technology, Inffeldgasse 16a/II, A-8010, Graz, Austria
(c) Instituto Superior Tecnico, ISR-LaSEEB, Lisbon, Portugal
Abstract
Hidden Markov models (HMMs) are presented for the online classification of single trial EEG data during imagination of a left or right hand movement. The classification shows an improvement of the online experiment and the temporal determination of minimal classification error compared to linear classification methods. © 2001 Elsevier Science B.V. All rights reserved.
Keywords: Brain-computer interface (BCI); Hidden Markov models; EEG classification; Event-related desynchronisation (ERD)
1. Introduction
It has been shown that the imagination of either a left or right hand movement results in an amplitude attenuation (desynchronisation) of mu and central beta rhythms at the contralateral sensorimotor representation area and, in some cases, in an amplitude increase (synchronisation) at the ipsilateral hemisphere (Neuper and Pfurtscheller, 1999). The event-related (de)synchronisation (ERD, ERS) (Pfurtscheller, 1998) characterises brain states with localised patterns of cortical activation and deactivation, respectively, and is the basis of an EEG-based brain-computer interface (BCI), where brain states associated with motor imagery are transformed into control signals (Pfurtscheller et al., 1997). A BCI has to perform two tasks: the parameter estimation task, which attempts to describe the properties of the EEG signal, and the classification task, which separates the different EEG patterns based on the estimated parameters. Different groups have used different classification methods, such as Fisher's linear discriminant (LD) (Guger et al., 2000a,b), neural networks (Kalcher et al., 1996; Pregenzer et al., 1996; Anderson et al., 1998; Haselsteiner and Pfurtscheller, 2000) and a linear threshold (McFarland et al., 1997; Kuebler et al., 1998; Birbaumer et al., 1999). One of the BCI systems presently used by the research group in Graz estimates adaptive autoregressive (AAR) parameters derived from two bipolar EEG channels (electrode positions close to C3 and C4) and classifies the patterns with LD (Guger et al., 2000b). Another BCI system is based on the method of common spatial patterns derived from 27 electrodes placed over the primary sensorimotor cortex and likewise classifies these patterns with LD (Ramoser et al., 1999; Guger et al., 2000a). The spatio-temporal EEG patterns associated with motor imagery are not always stable, but often demonstrate a dynamic behaviour. For example, the mu rhythm displays a relatively early onset of desynchronisation and a slow recovery, whereas the central beta rhythm shows a short-latency ERD often followed by a fast rebound or beta ERS (Pfurtscheller et al., 1999; Pfurtscheller et al., 1998). These brain oscillatory dynamics during motor imagery prompted us to introduce a classification method which also uses information on the change of brain states over time. This additional information should result in an improved BCI system. To describe these temporal changes, the Hidden Markov model (HMM), which is well known in the area of speech recognition (Rabiner and Juang, 1993), is used. The Hjorth parameters (Hjorth, 1970), which describe the properties of the dynamic EEG changes (ERD/ERS) in a simple and compact manner, were chosen to serve as the EEG parameters in this BCI system. Our preference for Hjorth over AAR parameters for an HMM-based classifier is due to their lower dimensionality. A linear classifier like the LD can be trained sufficiently well on the limited amount of available training data, whereas training the non-linear HMM-based classifier on the same amount of data seemed critical, hence the need for a dimension reduction. Offline analysis of BCI data recorded during earlier sessions revealed that HMMs in combination with Hjorth parameters are suitable for classifying EEG signals related to imagination of either a left or right hand movement (Obermaier et al., 1999). Preliminary results of online classification of EEG patterns in an HMM-based system were presented in Obermaier et al. (2000).
Pattern Recognition Letters 22 (2001) 1299-1309, www.elsevier.com/locate/patrec
* Corresponding author. Tel: +43-316-873-5311; fax: +43-316-873-5349. E-mail address: obermai@dpmi.tu-graz.ac.at (B. Obermaier).
0167-8655/01/$ - see front matter © 2001 Elsevier Science B.V. All rights reserved. PII: S0167-8655(01)00075-7
The aim of this paper is to introduce a new BCI system based on an HMM classifier for EEG patterns obtained during right and left motor imagery. In addition, a comparison between the presently used BCI system based on AAR parameter estimation and LD, and the new BCI system using Hjorth parameters and HMM is presented. The fact that the classification methods as well as the parameter estimations are different makes a comparison difficult. In order to still be able to draw conclusions about the different classification methods independently of the parameter estimation method, we present the upper bound of the Bayes error for classification with the AAR and Hjorth parameters. The experiment was designed to answer the following questions:
1. Is the application of HMM suitable for the online classification of EEG patterns?
2. Does the HMM-based BCI result in lower error rates than the BCI based on LD? (The error rate refers to the number of misclassified trials divided by the overall number of trials.) And if yes: is this due to the different classifiers or due to the different parameter estimation?
3. Is the continuous feedback provided by HMM beneficial, so that it supports the subject's control of his brain states?
4. How reliable are both classification methods when the classifier trained on EEG patterns is used again after a break and with a new electrode montage?
2. Experimental setup and data acquisition
2.1. Subjects
Four male subjects (age 17-26 years) took part in the study; three of them (S1, S2, S4) were familiar with the BCI, one (S3) was naive. They were paid for their participation and free from medication and central nervous system abnormalities.
2.2. Experimental procedure
The subjects were sitting in a comfortable armchair looking at the centre of a monitor placed approximately 2 m in front of them. Each trial (Fig. 1) started with the presentation of a fixation cross at the centre of the monitor, followed by a short warning tone at 2 s. At 3 s an arrow was displayed at the centre of the monitor for 1.25 s. Depending on the direction of the arrow presented (left or right) the subject was instructed to imagine a movement of either the left or the right hand. In case of a feedback session, a horizontal bar extending to the currently classified direction was presented from 4.25 to 8.0 s. Continuous feedback can enhance the differences between left and right motor imagery, as was shown recently by Neuper et al. (1999). The subject's task was to extend the bar via motor imagery to the side indicated by the cue stimulus. Fig. 1(A) shows the screen contents for a correct left trial and Fig. 1(B) for a right one, respectively. In the case of a session without feedback, the fixation cross was presented again, replacing the feedback bar. The trial ended after 8 s and a blank screen was shown until the beginning of the next trial. The inter-trial period (ranging from 0.5 to 2.5 s) and the order of trials with left and right motor imagery were randomised to avoid adaptation.
In the following discussion the abbreviation BCI-HMM is used for the HMM-based BCI system with feedback and BCI-LD for the one based on LD with feedback. The procedure of the study is given in Fig. 2: the number of runs (each run contains 20 left and 20 right trials in random order) performed in one session is given in brackets. The relations between sessions are indicated by arrows pointing from the sessions used to set up a BCI system to the session using this BCI system.
On the first day the experiment started with session 1 in order to record subject-specific data of the motor imagery. Since the information about specific EEG patterns was not yet available, no feedback was provided. Based on the information of session 1 a BCI-LD was used in session 2. Session 2 was used to train both the BCI-HMM (session 3) and the BCI-LD (session 4), so the two different BCI systems were trained on the same session. Please note that, unlike the BCI-LD, the BCI-HMM has to be trained on a feedback session. Training a BCI-HMM on a non-feedback session fails because the temporal behaviour of the EEG patterns changes due to the feedback influence (Neuper et al., 1999). While a system like the BCI-LD, which classifies the EEG patterns at one time point only, is insensitive to these temporal changes, this is not the case for the BCI-HMM, which models the temporal changes of the EEG patterns. The underlying assumption is that the feedback given by the BCI-LD produces the same spatio-temporal EEG pattern as that produced by a BCI-HMM feedback, so that a BCI-LD feedback session can be used as the training session.
Fig. 1. Course of the experimental trial and the screen contents of a correctly classified left (A) and right trial (B). From 0.0 to 3.0 s a fixation cross was presented, followed by the cue indicating either the left or right motor imagery. During the feedback period (from 4.25 to 8.0 s) a bar is displayed indicating the currently classified direction (to the right or to the left).
Fig. 2. Setting of sessions to investigate the capabilities of the BCI-LD and the BCI-HMM. The arrows indicate which session was used to determine a classifier, and which session was classified using this classifier. The number of runs per session is given in brackets.
The experiment on the second day started with session 5 using a BCI-HMM trained on session 3, followed by session 6, a BCI-LD session trained on session 4. One run in sessions 5 and 6 was sufficient to draw conclusions as to whether the BCI systems were still able to distinguish the EEG patterns or not. Sessions 7, 8, 9 and 10 were identical to sessions 1-4 conducted on the first day. To answer the questions raised in the introduction, sessions 3, 5 and 9 are used to assess the abilities of a BCI-HMM, whereas sessions 4, 6 and 10 are used to compare the results of the two different approaches (BCI-LD and BCI-HMM) based on the same experimental conditions (electrode settings, condition of the subjects). Sessions 5 and 6 serve to investigate how well the BCI-HMM and BCI-LD trained on EEG patterns obtained on the first day could classify the data recorded on the second day.
2.3. EEG recordings
The bipolar EEG signals were recorded from four Ag/AgCl electrodes placed 2.5 cm anterior and posterior to electrode positions C3 and C4, respectively. The ground electrode was located on the forehead. The EEG signals were band-pass filtered between 0.5 and 30 Hz and sampled at 128 Hz. Training data for the set-up of the classifier were visually checked for EMG or EOG artefacts, while no artefact detection was performed during online sessions.
2.4. Data pre-processing
The BCI-LD classifies the EEG patterns based on the pth order AAR parameter estimation that describes an EEG signal $y_t$ by:
$$y_t = a_{1,t}\,y_{t-1} + a_{2,t}\,y_{t-2} + \cdots + a_{p,t}\,y_{t-p} + E_t.$$
Here, $a_{i,t}$ denotes the ith AAR parameter at time point $t$ and $E_t$ is considered as white noise with zero mean and finite variance. A model order of $p = 6$ (Schloegl et al., 1997b) was used, resulting in a feature space dimension of $\dim(\mathbf{v}) = 12$ of the feature vector
$$\mathbf{v}_t = \left[ (a_{1,t}, a_{2,t}, \ldots, a_{6,t})_{\text{Electrode C3}},\; (a_{1,t}, a_{2,t}, \ldots, a_{6,t})_{\text{Electrode C4}} \right].$$
The Hjorth parameters, namely activity, mobility and complexity, describing the properties of the EEG signal $y_t$ were used in the BCI-HMM:
$$\mathrm{Activity}(y_t) = \mathrm{VAR}(y_t),$$
$$\mathrm{Mobility}(y_t) = \sqrt{\frac{\mathrm{Activity}\left(\frac{dy_t}{dt}\right)}{\mathrm{Activity}(y_t)}},$$
$$\mathrm{Complexity}(y_t) = \frac{\mathrm{Mobility}\left(\frac{dy_t}{dt}\right)}{\mathrm{Mobility}(y_t)},$$
combined to
$$\mathbf{v} = \left[ (\mathrm{Activity}, \mathrm{Mobility}, \mathrm{Complexity})_{\text{Electrode C3}},\; (\mathrm{Activity}, \mathrm{Mobility}, \mathrm{Complexity})_{\text{Electrode C4}} \right]$$
of dimension $\dim(\mathbf{v}) = 6$.
The AAR parameters and the Hjorth parameters were both calculated sample by sample based on $y_t$, which is derived from the EEG signal $x_t$ using an exponential window given as $y_t = x_t\,C + y_{t-1}\,(1 - C)$, $C = 0.99219$ (Guger et al., 2000b).
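The Hjorth definitions above can be illustrated with a short sketch. It is not the authors' implementation: it assumes a plain rectangular window and finite-difference derivatives instead of the sample-by-sample exponential window, and the function name `hjorth_parameters` and the default sampling rate argument are hypothetical.

```python
import numpy as np

def hjorth_parameters(y, fs=128.0):
    """Return (activity, mobility, complexity) for one window of a 1-D signal.

    Assumption: derivatives are approximated by finite differences scaled
    by the sampling rate fs (Hz); the window is rectangular.
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y) * fs      # first derivative dy/dt
    ddy = np.diff(dy) * fs    # second derivative, needed for Mobility(dy/dt)

    activity = np.var(y)                          # Activity(y) = VAR(y)
    mobility = np.sqrt(np.var(dy) / activity)     # sqrt(Activity(dy/dt) / Activity(y))
    mobility_dy = np.sqrt(np.var(ddy) / np.var(dy))
    complexity = mobility_dy / mobility           # Mobility(dy/dt) / Mobility(y)
    return activity, mobility, complexity
```

For a pure sinusoid the mobility approaches the angular frequency and the complexity approaches 1, which gives a quick sanity check of the three quantities. Computing the parameters for the C3 and C4 channels and concatenating them would give the six-dimensional feature vector described above.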
3. Bhattacharyya distance
In order to make comparisons possible between two BCI systems in which both the classifier and the parameter estimation differ, the features were compared separately. Based on the Bhattacharyya distance (Bhattacharyya, 1943) $\mu(1/2)$, an upper bound $\varepsilon_u$ of the Bayes error (the minimum achievable error) is given for normal distributions as:
$$\varepsilon_u = \frac{1}{2}\,e^{-\mu(1/2)}$$
and
$$\mu(1/2) = \frac{1}{8}\,(M_R - M_L)^T \left[\frac{\Sigma_L + \Sigma_R}{2}\right]^{-1} (M_R - M_L) + \frac{1}{2}\ln\frac{\left|\frac{\Sigma_L + \Sigma_R}{2}\right|}{\sqrt{|\Sigma_L|\,|\Sigma_R|}}.$$
$M_L$, $\Sigma_L$ are the sample mean matrix and the covariance matrix of the features corresponding to the left motor imagery, and $M_R$, $\Sigma_R$ those corresponding to the right motor imagery, respectively. The covariance matrices are in Toeplitz form, so that the number of parameters to be estimated surpasses the dimension of the features by one. This allows a robust estimation of the covariance matrix given the available training data.
$\varepsilon_u$ was calculated for both types of feature sets for all available feedback sessions (BCI-HMM (3, 5, 9) and BCI-LD (4, 6, 10)) to suppress effects due to the different classification methods. For comparison, $\varepsilon_u$ was calculated for the AAR parameters ($\varepsilon_u^{\mathrm{AAR}}$) and for the Hjorth parameters ($\varepsilon_u^{\mathrm{Hjorth}}$), respectively. The ratio $k = \varepsilon_u^{\mathrm{Hjorth}} / \varepsilon_u^{\mathrm{AAR}}$ was calculated for 1 s windows starting from 3 to 8 s with a 0.5 s overlap.
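As a rough illustration of the bound above, the following sketch evaluates the Bhattacharyya distance and the resulting upper bound for two Gaussian classes. It assumes equal class priors of 1/2 (as implied by the factor 1/2 in the bound) and takes unconstrained mean vectors and covariance matrices; the Toeplitz constraint used in the study is not modelled, and the function name is hypothetical.

```python
import numpy as np

def bhattacharyya_bound(m_l, s_l, m_r, s_r):
    """Return (mu, eps_u): Bhattacharyya distance and upper bound of the
    Bayes error for two normal distributions with equal priors."""
    m_l = np.asarray(m_l, dtype=float)
    m_r = np.asarray(m_r, dtype=float)
    s_l = np.atleast_2d(s_l).astype(float)
    s_r = np.atleast_2d(s_r).astype(float)

    s_avg = 0.5 * (s_l + s_r)
    diff = m_r - m_l

    # First term: (1/8) (M_R - M_L)^T [(S_L + S_R)/2]^-1 (M_R - M_L)
    term_mean = 0.125 * diff @ np.linalg.solve(s_avg, diff)

    # Second term: (1/2) ln( |S_avg| / sqrt(|S_L| |S_R|) ), via log-determinants
    _, logdet_avg = np.linalg.slogdet(s_avg)
    _, logdet_l = np.linalg.slogdet(s_l)
    _, logdet_r = np.linalg.slogdet(s_r)
    term_cov = 0.5 * (logdet_avg - 0.5 * (logdet_l + logdet_r))

    mu = term_mean + term_cov
    return mu, 0.5 * np.exp(-mu)
```

Evaluating this once per feature set on the same window and dividing the two bounds would give a ratio of the kind reported as $k$; identical class distributions give a distance of 0 and a bound of 0.5.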
4. Classification methods
The BCI-LD classifies the spatio-temporal EEG patterns using the LD (Bishop, 1995). The LD separates two classes represented by feature vectors $\mathbf{v}$ by a linear transformation from the $\dim(\mathbf{v})$-dimensional feature space into a scalar $d$:
$$d = \mathbf{w}^T \mathbf{v} + w_0, \qquad (1)$$
where $\mathbf{w}^T$ is a vector of adjustable weights and $w_0$ is called bias or threshold. In order to find the most discriminating time point $T_w$ during the trial, a 10 times 10-fold cross-validation test on every half-second (starting from 3 to 8 s of the trial) was performed. During an online session with feedback the length and the direction of the feedback bar were calculated using Eq. (1), with $\mathbf{w}^T$ calculated at $T_w$, multiplied by a scaling factor to keep the length of the bar within the screen boundaries. The threshold $w_0$ was estimated in such a way that $d$ results in values below zero for a left and values above zero for a right imagination.
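A minimal sketch of a two-class Fisher linear discriminant matching the sign convention of Eq. (1) is given below. It is an illustration under assumptions, not the study's code: the weights are taken as the pooled within-class scatter inverse times the mean difference, and the bias is placed at the midpoint of the projected class means; the function names are hypothetical.

```python
import numpy as np

def fit_fisher_ld(x_left, x_right):
    """Fit w and w0 so that d = w @ v + w0 is negative for 'left' and
    positive for 'right'. Inputs have shape (n_trials, n_features)."""
    x_left = np.asarray(x_left, dtype=float)
    x_right = np.asarray(x_right, dtype=float)
    m_l = x_left.mean(axis=0)
    m_r = x_right.mean(axis=0)

    # Pooled within-class scatter matrix
    s_w = (np.cov(x_left, rowvar=False) * (len(x_left) - 1)
           + np.cov(x_right, rowvar=False) * (len(x_right) - 1))
    s_w = np.atleast_2d(s_w)

    w = np.linalg.solve(s_w, m_r - m_l)   # direction pointing from left to right
    w0 = -0.5 * w @ (m_l + m_r)           # assumed bias: midpoint between class means
    return w, w0

def ld_output(w, w0, v):
    """Evaluate Eq. (1) for one feature vector v."""
    return float(w @ np.asarray(v, dtype=float) + w0)
```

In an online setting the output of `ld_output` would be multiplied by a scaling factor before being drawn as the feedback bar.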
The basic principles of HMMs are briefly discussed in this section; a detailed description can be found in Rabiner and Juang (1993) and Deller et al. (1993). The HMM itself can be seen as a finite automaton containing $s$ discrete states, which emits a feature vector at every time point that depends on the current state (see Fig. 3). Each feature vector is modelled using $m$ Gaussian mixtures per state. The transition probabilities between states are described using a transition matrix.
During the training phase the expectation maximisation (EM) algorithm (Dempster et al., 1977) was used to estimate the transition matrix and the Gaussian mixtures. The EM algorithm was started from randomly selected values for the transition matrix (an upper triangular matrix) and an initial estimate of the mixtures. The estimation formulas guarantee a monotonic increase of the likelihood $P(V \mid \mathrm{HMM})$ until a local or global maximum is reached, which ends the training phase. The number of states ranged from 1 to 5, which corresponds to physiological changes in the spatio-temporal patterns in a 1 s range (Schloegl et al., 1997a). The number of mixtures was limited to eight, following earlier studies by the authors (Obermaier et al., 1999).
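The random upper triangular starting point for the transition matrix can be sketched as follows. The function name, the uniform random draw and the small positive offset are assumptions made for illustration only.

```python
import numpy as np

def init_left_to_right_transitions(n_states, seed=None):
    """Random row-stochastic upper triangular transition matrix.

    Entry [i, j] is the probability of moving from state i to state j.
    Entries below the diagonal are zero, so a state can only be kept or
    left towards a state with a higher index.
    """
    rng = np.random.default_rng(seed)
    # Small offset keeps every allowed transition strictly positive
    a = np.triu(rng.random((n_states, n_states)) + 1e-3)
    return a / a.sum(axis=1, keepdims=True)
```

Because EM re-estimation leaves transition probabilities that start at zero at zero, an upper triangular start of this kind keeps the left-to-right structure throughout training.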
The Gaussian mixtures were estimated from a k-means clustering of the feature vectors. The clustering was performed using the Euclidean distance, which requires feature vector components with a mean and variance within the same numerical range. The mean and variance of all feature vectors belonging to one cluster were then used to model the Gaussian mixtures with a diagonal covariance matrix. This modelling is feasible only for uncorrelated feature vector components. In order to meet both requirements of normalised and uncorrelated data, the whitening transformation (Fukunaga, 1990) was performed. The original data $V = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_T\}$ of length $T$ were transformed into $\tilde{V} = \{\tilde{\mathbf{v}}_1, \tilde{\mathbf{v}}_2, \ldots, \tilde{\mathbf{v}}_T\}$ using:
$$\tilde{V} = U D^{-1/2}\, V, \qquad (2)$$
where $U$ and $D$ are the eigenvector and eigenvalue matrices of the covariance matrix of $V$, respectively. Two HMMs, one representing the left imagination ($\mathrm{HMM}_L$) and one the right imagination ($\mathrm{HMM}_R$), were trained using the Hjorth parameters calculated for the period from 4.25 s (beginning of feedback) to 8 s of artefact-free trials. Both HMMs used the same values of $s$ and $m$ because it was assumed that the spatio-temporal EEG patterns during left or right motor imagery are mirrored. Furthermore, the transition matrix was chosen in such a way that only transitions from left to right were allowed. The optimum values of $s$ and $m$, resulting in the lowest error rate at the end of the trial, were evaluated based on a 3 times 3-fold cross-validation for various combinations of $s$ and $m$. To force the subjects to produce EEG patterns during the experiment which belong clearly to one of the two kinds of imagination, another step was performed during training. Preliminary models were estimated on the given training data and were then used to classify the same training data. Finally, $\mathrm{HMM}_L$ and $\mathrm{HMM}_R$ were estimated using the correctly classified trials.
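The whitening step can be sketched as below. This is an assumption-laden illustration: it uses the common textbook form $D^{-1/2} U^T$ applied to mean-centred data, which may differ in notation from Eq. (2), and the function names are hypothetical.

```python
import numpy as np

def fit_whitening(v):
    """Estimate a whitening transform from data v of shape (n_features, T).

    Returns (mean, w) where w = D^{-1/2} U^T, with U and D the eigenvectors
    and eigenvalues of the sample covariance matrix of v.
    """
    v = np.asarray(v, dtype=float)
    mean = v.mean(axis=1, keepdims=True)
    cov = np.cov(v)                          # rows are feature components
    eigval, eigvec = np.linalg.eigh(cov)     # symmetric matrix, real eigenvalues
    w = np.diag(1.0 / np.sqrt(eigval)) @ eigvec.T
    return mean, w

def apply_whitening(v, mean, w):
    """Centre and whiten data using a transform fitted on training data."""
    return w @ (np.asarray(v, dtype=float) - mean)
```

After the transform the components are uncorrelated with unit variance, which is what the Euclidean k-means clustering and the diagonal covariance matrices require. The transform would be fitted once on training data and then reused unchanged online.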
An unknown trial was classified by selecting the larger of the single best path probabilities $P_p(\tilde{V} \mid \mathrm{HMM}_L)$ and $P_p(\tilde{V} \mid \mathrm{HMM}_R)$, calculated via the Viterbi algorithm (Rabiner and Juang, 1993). The continuous feedback was calculated as the difference between the probabilities $P_p(\tilde{V} \mid \mathrm{HMM}_L)$ and $P_p(\tilde{V} \mid \mathrm{HMM}_R)$ on a sample by sample basis starting from 4.25 s. The resulting difference had to be scaled to keep the length of the feedback bar within the screen boundaries.
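The classification rule can be illustrated with a compact Viterbi sketch. Several simplifications are assumed here and are not taken from the study: each state emits from a single diagonal Gaussian rather than a mixture, all quantities are kept in the log domain, and the function names and dictionary keys are hypothetical.

```python
import numpy as np

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal Gaussian, evaluated for each row of x."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var, axis=-1)

def viterbi_log_prob(obs, log_pi, log_a, means, variances):
    """Log probability of the single best state path.

    obs: (T, d) feature sequence; log_pi: (s,) initial log probabilities;
    log_a: (s, s) log transition matrix (-inf for forbidden transitions);
    means, variances: (s, d) emission parameters, one Gaussian per state.
    """
    n_states = len(means)
    log_b = np.stack(
        [log_gauss_diag(obs, means[j], variances[j]) for j in range(n_states)],
        axis=1,
    )
    delta = log_pi + log_b[0]
    for t in range(1, len(obs)):
        # Best predecessor for every state, then add the emission term
        delta = np.max(delta[:, None] + log_a, axis=0) + log_b[t]
    return float(np.max(delta))

def classify_trial(obs, hmm_left, hmm_right):
    """Return the winning label and the log-probability difference (left - right)."""
    lp_left = viterbi_log_prob(obs, **hmm_left)
    lp_right = viterbi_log_prob(obs, **hmm_right)
    return ("left" if lp_left > lp_right else "right"), lp_left - lp_right
```

The loop body is also the update that would run once per incoming sample in an online system, so the running difference between the two models is available at every time point.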
4.1. The BCI-HMM system
The parameter estimation and classification were embedded into a real-time Simulink (MathWorks, Natick, USA) model which samples two EEG channels at a frequency of 128 Hz (see Fig. 4). The Hjorth parameters were calculated sample by sample using a window size of 1 s. These feature vectors were then normalised using Eq. (2). Furthermore, a third channel was sampled providing the trigger information, which was a pulse lasting 0.5 s with the rising edge at 2 s of a trial. This trigger was used to generate a reset signal at 4.25 s that resets $P_p(\tilde{V} \mid \mathrm{HMM}_L)$ and $P_p(\tilde{V} \mid \mathrm{HMM}_R)$. In the period from 4.25 until 8.0 s, $P_p(\tilde{V} \mid \mathrm{HMM}_L)$ and $P_p(\tilde{V} \mid \mathrm{HMM}_R)$ were calculated sample by sample. The length of the feedback bar was calculated based on the difference of these two probabilities.
Fig. 3. The HMM used in the BCI-HMM consists of $s = 3$ states. The arrows indicate the allowed transitions; a feature vector comprising $m = 3$ mixtures is emitted at every time point. The HMM is designed as a left-to-right model, because transitions are allowed from a state to itself and to any right neighbour state.
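The mapping from the two path probabilities to a bar length might look like the sketch below. The use of log probabilities, the scaling constant, the clipping to the screen range and the sign convention (negative for left, positive for right) are all assumptions for illustration; the study only states that the difference was scaled to stay within the screen boundaries.

```python
def feedback_bar(log_p_left, log_p_right, scale=0.05, max_len=1.0):
    """Signed bar length in [-max_len, max_len].

    Negative values extend the bar to the left, positive values to the
    right. scale and max_len are hypothetical tuning constants.
    """
    raw = scale * (log_p_right - log_p_left)
    return max(-max_len, min(max_len, raw))
```

Called once per sample between 4.25 and 8.0 s, this would yield a bar that grows towards the side whose model currently explains the feature sequence better.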
5. Results
The online error rates of the two BCI systems under investigation are presented in Fig. 5 for the BCI-HMM sessions (3, 5, 9) and the BCI-LD sessions (4, 6, 10) for all subjects. Sessions 1, 2, 7 and 8 were excluded from the results because they were used only to set up the systems and were therefore not of further interest.
In 11 out of 12 corresponding sessions (3-4, 5-6, 9-10), the exception being subject S3 in sessions 5-6, the online error rates of the BCI-HMM are lower than those of the BCI-LD. The average decrease of the error rate over all sessions and subjects is 9.1 ± 6.9%.
In Table 1 the number of states $s$ and mixtures $m$ of each BCI-HMM session and $T_w$ (in seconds from the beginning of the trial) of each BCI-LD session are given.
The values of $s$ and $m$ stay constant for just two sessions and for two subjects (S3/3-S3/5 and S4/3-S4/9); in all other sessions the structure of the BCI-HMM changed. $T_w$ used in BCI-LD sessions never stayed the same for two sessions of a subject; the earliest time point was at 4.5 s and the latest at 8.0 s.
The propagation of the error rates during the feedback period is given in Fig. 6 for one subject (S2) in order to underpin the different behaviours of BCI-HMM and BCI-LD. The sessions of the other subjects follow the same trends as those presented. It can be seen that the BCI-HMM showed a steady decrease of the error rate in all sessions; the lowest error rate was achieved at the end of the trial. In contrast, the BCI-LD achieved the lowest error rate around $T_w$.
Fig. 4. The BCI-HMM system, realised by a real-time Simulink model. The Hjorth parameters of two channels (C3 and C4) were classified using an HMM classifier. The single best path probabilities for both models are calculated sample by sample and the difference is used to calculate the feedback bar. A device driver for the RTI800a (DAQ board from Analog Devices) is added to the model to make the connection to the real world.
Fig. 5. Online error rates in percent for sessions 3, 4, 5, 6, 9 and 10 for all subjects.
In Table 2 the average online error rates achieved with the BCI-HMM system are compared to the averaged offline error rates achieved with the optimal BCI systems for every session. The optimal BCI systems were selected as described in Section 4, except that for both systems a 5 times 5-fold cross-validation was chosen. The same procedure was performed for the BCI-LD and those results are also listed in Table 2.
The ratio $k$ calculated for all feedback sessions (3, 4, 5, 6, 9 and 10) averaged 1.036 ± 0.21. The upper bound of the Bayes error for classification is therefore slightly lower for the AAR parameters than for the Hjorth parameters.
6. Discussion
The discussion of the results is organised so as to answer the questions posed in the introduction. Based on the results of the four subjects it can be stated that the HMM-based classifier can be used for online classification of spatio-temporal EEG patterns during motor imagery.
Question two, whether the use of an HMM-based classifier can lower the online error rate, has to be answered prudently. To find out why the online recognition errors using the BCI-HMM were lower than those obtained by the BCI-LD in 11 out of 12 sessions (in sessions S3/5 and S3/6 neither of the two systems was able to discriminate), two additional offline analyses of the recorded EEG were performed.
(1) Effects due to the use of different parameter estimations: The averaged result of $k = 1.036 \pm 0.21$ for all sessions and subjects showed that the upper bounds of the Bayes error for classification of the two motor imagery tasks are almost identical for the AAR parameters and the Hjorth parameters. Because $k$ was calculated for sessions recorded using both types of BCI systems, it can be concluded that the lower error rates are due to the different classification methods and not due to the use of different features.
(2) Cross-validation test: In order to investigate how well the two BCI systems were set up, the online error rates of the two different BCI systems were compared to those of the offline cross-validation results (see Table 2). The closer the online results are to the offline result, the better the classifier can generalise the spatio-temporal brain patterns captured in different sessions. In the case of the BCI-HMM, this online versus offline difference was 3.5%, whereas the difference for the BCI-LD was 11.4%. The weak generalisation capabilities of the BCI-LD could be caused by a change of the EEG patterns due to the intermediate BCI-HMM session, by confusion of the subject due to the change of BCI systems, or by a bad set-up of the BCI-LD system. A change of EEG patterns can be excluded because the offline classification of the EEG signals recorded during the two BCI-LD sessions 4 and 10 using the BCI-HMM classifier of sessions 3 and 9 resulted in an average error of 20.3 ± 8.5%. The difference of 4.1% to the cross-validation results of the BCI-LD system shows that the EEG patterns did not change due to the intermediate BCI-HMM sessions. This leads to the conclusion that the BCI-LD system was not able to find generalised representations of the spatio-temporal patterns. This is in contrast to earlier studies by Guger et al. (2000b), where the BCI-LD system was successfully used for online EEG classification during motor imagery. That study also reports that the BCI-LD system is very sensitive to the selection of the bias $w_0$, and a bad selection of $w_0$ seems to be the reason for the poor generalisation abilities of the BCI-LD system. An answer to question two cannot be given based on these first results, because the BCI-LD system was not configured properly.
Fig. 6. Error rates in percent for sessions 3, 4, 5, 6, 9 and 10 of subject S2. In the case of a BCI-HMM session the number of states and mixtures is presented in brackets. In the case of a BCI-LD session the time point at which the classification takes place is indicated.
Table 1
For BCI-HMM sessions the number of states s and mixtures m is given as (s, m). For BCI-LD sessions T_w is given in seconds from the beginning of the trial.
Subject   Session
          3        4      5        6      9        10
S1        (1, 5)   6.5    (1, 4)   8.0    (2, 5)   7.5
S2        (1, 4)   5.0    (5, 4)   5.5    (2, 3)   7.0
S3        (3, 1)   8.0    (3, 1)   7.0    (2, 2)   5.5
S4        (2, 3)   8.0    (1, 1)   7.0    (2, 3)   4.5
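The differences quoted in this comparison follow directly from the mean error rates reported in Table 2 and from the 20.3% cross-classification error; written out as a worked check:

```latex
\begin{align*}
\text{BCI-HMM, online} - \text{offline} &= 18.6\% - 15.1\% = 3.5\% \\
\text{BCI-LD, online} - \text{offline}  &= 27.6\% - 16.2\% = 11.4\% \\
\text{HMM classifier on LD sessions} - \text{BCI-LD offline} &= 20.3\% - 16.2\% = 4.1\%
\end{align*}
```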
Nevertheless, it has to be noted that the BCI-HMM has some advantages which are due to the principle of the classifier and can be stated independently of the limited number of results presented in this study. Classification in the BCI-LD system takes place at $T_w$. However, offline analysis of the BCI-LD sessions revealed that in 10 out of 12 sessions this time point of classification was not the time point where the error rate was minimal. $T_w$ was chosen for classification because the optimal time point was not known in advance. This problem does not exist with the BCI-HMM, because the lowest error rate was always achieved at 8 s and the classification therefore took place at the end of the trial. This can be explained by the fact that the BCI-HMM was trained to recognise a feature sequence from 4.25 until 8.0 s; the lowest error rate was achieved when the complete feature sequence was presented. Moreover, the results showed the sensitivity of the BCI-LD system with respect to $w_0$, which makes the system unsuitable for automatic classifier set-up without manual adjustment.
The proposed calculation of the continuous feedback bar proportional to the difference of $P_p(\tilde{V} \mid \mathrm{HMM}_L)$ and $P_p(\tilde{V} \mid \mathrm{HMM}_R)$ was shown to extend the bar in the correct direction at the end of the trial. This is not the case with the BCI-LD system, where the feedback bar is proportional to Eq. (1): the feedback bar extends in the correct direction around $T_w$ but is unreliable elsewhere. This might confuse the subject, especially when $T_w$ is at the beginning of the feedback period (S2/4, S2/6, S3/10 and S4/10) (see Table 1).
The results of sessions 5 and 6, performed after a break of at least one week using a new electrode set-up, showed that the spatio-temporal EEG patterns of the subjects familiar with the BCI could still be distinguished after a break. The variation of the results with respect to the results of the sessions used for training could be caused by a slightly different electrode setting or different conditions of the subjects. In S3/5 and S3/6 neither of the two BCI systems was able to classify the data. Offline analysis based on a 5 times 5-fold cross-validation test of these sessions showed that there is no class information inherent in the data. Various reasons might have caused the subject not to be concentrated during these sessions. Bad electrode set-up as a cause can be neglected, because settings were done with a high accuracy, and also the following sessions (9 and 10) could be classified with an error rate of 15.3% (BCI-HMM) and 21.8% (BCI-LD).
Table 2
In column 2 the averaged online errors ± standard deviation for all performed sessions and all subjects are given. Column 3 shows the averaged offline errors based on the optimal classifier for every session.
BCI system   Session type
             Online          Offline
BCI-HMM      18.6 ± 12.8%    15.1 ± 8.6%
BCI-LD       27.6 ± 8.6%     16.2 ± 8.9%
The determination of the BCI-HMM classifier (described in Section 4) based on the cross-validation test takes approximately 15 min for 120 trials using a Pentium K6, 300 MHz. One worrisome fact is the change of $s$ and $m$ between sessions: for only two subjects did two BCI-HMM sessions use the same classifier structure (see Table 1). This makes it more difficult to interpret what kind of brain phenomena are modelled by the HMMs. Further studies have to be performed to address that issue. Furthermore, it would be interesting to see how a classifier with constant $s$ and $m$ would perform in various sessions.
To summarise this study, we can conclude that the HMM-based BCI system can be used for online classification of EEG patterns during motor imagery. The fact that classification using the BCI-HMM system is optimal at the end of the trial is a major advantage compared to the BCI-LD system, where the optimal time point of classification is not known in advance. This has an impact on the classification error and also on the reliability of the feedback. Furthermore, because no further adjustment is needed, it is possible to perform an automated set-up of the classifier. Further studies should evaluate the performance of the HMM-based BCI in more detail, e.g. the effect of prolonging the trials. It should also be examined what kind of brain phenomena are modelled by the HMM.
Acknowledgements
This work was supported in part by the Austrian "Fonds zur Förderung der wissenschaftlichen Forschung", project P11208MED. Furthermore, we would like to thank Alois Schlögl and Martin Pregenzer for their helpful suggestions, and Gunther Schweitzer and Stewart MacMillan for their proof-reading.
References
Anderson, C., Stolz, E., Shamsunder, S., 1998. Multivariate autoregressive models for classification of spontaneous electroencephalogram during mental tasks. IEEE Trans. Biomed. Eng. 45 (3), 277-286.
Bhattacharyya, A., 1943. On a measure of divergence between two statistical populations defined by their probability distribution. Bull. Calcutta Math. Soc. 35, 99-110.
Birbaumer, N., Ghanayim, N., Hinterberger, T., Iversen, I., Kotchoubey, B., Kübler, A., Perelmouter, J., Taub, E., Flor, H., 1999. A spelling device for the paralysed. Nature 398, 297-298.
Bishop, Ch.M., 1995. Neural Networks for Pattern Recognition. Clarendon Press, Oxford.
Deller, J.R., Proakis, J.G., Hansen, J.H.L., 1993. Discrete-Time Processing of Speech Signals. Macmillan, New York.
Dempster, A.P., Laird, N.M., Rubin, D.B., 1977. Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc., Ser. B (Methodological) 39 (1), 1-38.
Fukunaga, K., 1990. Introduction to Statistical Pattern Recognition. Academic Press, New York.
Guger, C., Ramoser, H., Pfurtscheller, G., 2000a. Real-time EEG analysis with subject-specific spatial patterns for a brain-computer interface (BCI). IEEE Trans. Rehab. Eng. 447-456.
Guger, C., Schlögl, A., Neuper, C., Walterspacher, D., Strein, T., Pfurtscheller, G., 2000b. Rapid prototyping of an EEG-based brain-computer interface (BCI). IEEE Trans. Rehab. Eng. 49-58.
Haselsteiner, E., Pfurtscheller, G., 2000. Using time dependent neural networks for EEG classification. IEEE Trans. Rehab. Eng. 457-463.
Hjorth, B., 1970. EEG analysis based on time domain
properties. Electroencephalogr. Clin. Neurophysiol. 29, 306–310.
Kalcher, J., Flotzinger, D., Neuper, C., Gölly, S., Pfurtscheller,
G., 1996. Graz brain–computer interface II: towards
communication between humans and computers based on online
classification of three different EEG patterns. Med. Biol.
Eng. Comput. 34, 382–388.
Kuebler, A., Kotchoubey, B., Salzmann, H.P., Ghanayim, N.,
Perelmouter, J., Hörnberg, V., Birbaumer, N., 1998.
Self-regulation of slow cortical potentials in completely
paralyzed human patients. Neurosci. Lett. 252, 171–174.
McFarland, D.J., Lefkowicz, A.T., Wolpaw, J.R., 1997. Design
and operation of an EEG-based brain–computer interface
with digital signal processing technology. Behav. Res.
Meth., Instr. Comput. 29, 337–345.
Neuper, C., Pfurtscheller, G., 1999. Motor imagery and ERD.
In: Pfurtscheller, G., Lopes da Silva, F.H. (Eds.),
Event-Related Desynchronization. Handbook of
Electroencephalography and Clinical Neurophysiology (Revised Edition)
Vol. 6. Elsevier, Amsterdam, pp. 303–325.
Neuper, C., Schlögl, A., Pfurtscheller, G., 1999. Enhancement
of left–right sensorimotor EEG differences during feedback-
regulated motor imagery. Clin. Neurophysiol. 16, 373–382.
B. Obermaier et al. / Pattern Recognition Letters 22 (2001) 1299–1309
Obermaier, B., Guger, C., Pfurtscheller, G., 1999. Hidden
Markov models used for the offline classification of EEG
data. Biomed. Tech. 44 (6), 158–162.
Obermaier, B., Guger, C., Pfurtscheller, G., 2000. Online
classification of single trial EEG data using hidden Markov
models. In: Proc. RECPAD2000, Portuguese Association
for Pattern Recognition, pp. 251–255.
Pfurtscheller, G., 1998. EEG event-related desynchronization
(ERD) and event-related synchronization (ERS). In:
Niedermeyer, E., Lopes da Silva, F.H. (Eds.),
Electroencephalography: Basic Principles, Clinical Applications and
Related Fields, fourth ed. Williams and Wilkins, Baltimore,
pp. 958–967.
Pfurtscheller, G., Neuper, C., Flotzinger, D., Pregenzer, M.,
1997. EEG-based discrimination between imagination of
right and left hand movement. Electroencephalogr. Clin.
Neurophysiol. 103 (5), 1–10.
Pfurtscheller, G., Neuper, C., Ramoser, H., Müller-Gerking, J.,
1999. Visually guided motor imagery activates sensorimotor
areas in humans. Neurosci. Lett. 269, 153–156.
Pfurtscheller, G., Neuper, C., Schloegl, A., Lugger, K., 1998.
Separability of EEG signals recorded during right and left
motor imagery using adaptive autoregressive parameters.
IEEE Trans. Rehab. Eng. 316–325.
Pregenzer, M., Pfurtscheller, G., Flotzinger, D., 1996.
Automated feature selection with a distinction sensitive learning
vector quantizer. Neurocomputing 11, 19–29.
Rabiner, L., Juang, B.H., 1993. Fundamentals of Speech
Recognition. Prentice-Hall, Englewood Cliffs, NJ.
Ramoser, H., Müller-Gerking, J., Pfurtscheller, G., 1999.
Optimal spatial filtering of single trial EEG during imagined
hand movement. IEEE Trans. Rehab. Eng. 441–446.
Schloegl, A., Flotzinger, D., Pfurtscheller, G., 1997a. Adaptive
autoregressive modelling used for single-trial EEG
classification. Biomed. Tech. 42, 162–167.
Schloegl, A., Neuper, C., Pfurtscheller, G., 1997b. Subject-
specific EEG patterns during motor imagery. In: Proc. 19th
Internat. Conf. on IEEE/EMBS, pp. 1530–1532.