Recognition of Motor Imagery Electroencephalography Using Independent Component Analysis and Machine Classifiers.

Chih-I Hung1,4, Po-Lei Lee4, Yu-Te Wu1,4,*, Hui-Yun Chen1,4, Li-Fen Chen3,4,
Tzu-Chen Yeh2,3,4, Jen-Chuen Hsieh2,3,4
1 Institute of Radiological Sciences, 2 Institute of Neuroscience, 3 Center for Neuroscience,
National Yang-Ming University,
No. 155, Sec. 2, Linong St., Beitou District, 112, Taipei, Taiwan
4 Integrated Brain Research Laboratory, Dept. of Medical Research and Education,
Taipei Veterans General Hospital,
No. 201, Sec. 2, Shihpai Rd., Beitou District, 112, Taipei, Taiwan
email: runtothewater@pie.com.tw; pllee2@vghtpe.gov.tw; ytwu@ym.edu.tw;
airrb@pchome.com.tw; lfchen3@vghtpe.gov.tw; tcyeh@vghtpe.gov.tw; jchsieh@vghtpe.gov.tw
ABSTRACT
Motor imagery electroencephalography (EEG), which embodies cortical potentials during mental simulation of left or right finger lifting tasks, can be used as neural input signals to activate a brain-computer interface (BCI). The effectiveness of such an EEG-based BCI system relies on two indispensable features: distinguishable patterns of brain signals and accurate classifiers. This work aims to extract a reliable neural feature, termed the beta rebound map, from motor imagery EEG by means of independent component analysis (ICA), and employs four classifiers to investigate the efficacy of the beta rebound map. Results demonstrated that, with the use of ICA, the recognition rates of the four classifiers, linear discriminant analysis (LDA), back-propagation neural network (BPNN), radial-basis function neural network (RBFNN), and support vector machine (SVM), improved significantly from 54%, 54%, 57.3% and 55% to 69.8%, 75.5%, 76.5% and 77.3%, respectively. In addition, the areas under the ROC curve, which assess the quality of classification over a wide range of misclassification costs, also improved greatly from .65, .60, .62, and .64 to .78, .73, .77 and .75, respectively.
Keywords
Electroencephalography (EEG), independent component analysis (ICA), brain-computer interface (BCI), beta rebound, linear discriminant analysis (LDA), back-propagation neural network (BPNN), radial-basis function neural network (RBFNN), support vector machine (SVM)
1. INTRODUCTION
In recent years, great progress in neuroscience has inspired studies on the brain-computer interface (BCI) [Mul99a] [Pfu98a] [Pfu00a] [Pol98a], a novel technique that helps people communicate with the external environment or trigger surrounding devices by means of their brain signals. These systems are particularly useful for those who suffer from amyotrophic lateral sclerosis or locked-in syndrome and are unable to produce any motor activity. Their cognitive or sensory functions, however, may be intact, so they can be trained to perform mental tasks, for example, simulating right or left hand or foot movements without any overt motor output. The success of BCI systems relies on two integral parts: distinguishable neural patterns and effective classifiers. This work aims to extract a reliably distinguishable feature from the motor imagery EEG recording by means of independent component analysis and to employ machine classifiers to investigate the efficacy of the extracted pattern.
Permission to make digital or hard copies of all or part of
this work for personal or classroom use is granted without
fee provided that copies are not made or distributed for
profit or commercial advantage and that copies bear this
notice and the full citation on the first page. To copy
otherwise, or republish, to post on servers or to redistribute
to lists, requires prior specific permission and/or a fee.
WSCG’2004, February 2-6, 2004, Plzen, Czech Republic.
Copyright UNION Agency – Science Press
It has been pointed out that imagination of hand movement elicits rhythmic EEG patterns in the primary sensorimotor areas similar to those from a real hand movement [Pfu96a]. A specific movement, whether performed or imagined, comprises three phases: planning, execution and recovery. Planning and execution result in localized amplitude attenuation in the alpha and lower beta bands, or event-related desynchronization (ERD), which can be viewed as an EEG correlate of an activated cortical motor network, while the recovery phase produces focal mu and beta amplitude enhancement, or event-related synchronization (ERS), which may reflect deactivation/inhibition in the underlying cortical network.
Several BCI systems have been proposed based on the ERD induced when subjects performed imagery hand or foot movements [Pfu98a] [Pfu00a]. Pfurtscheller et al. used learning vector quantization to classify ERD signals online in a subject-specific band, which was determined by distinction sensitive learning vector quantization. They also adopted an adaptive autoregressive model to analyze ERD signals offline and applied linear discriminant analysis to improve the detection of imagined left and right hand movements. The reported error rates varied between 5.8% and 32.8%. Muller-Gerking et al. applied a common spatial filter to detect real (not imagined) left hand, right hand or right foot movements in single trials and reported 84%, 90% and 94% accuracies for three subjects, respectively [Mul99a].
Although the ERD elicited by imagined movement has been extensively used as a feature pattern in BCI systems, we have observed that not every subject can produce discernible ERD during the imagery movement, whereas the beta ERS appeared consistently in every subject. This motivated us to adopt ERS, rather than ERD, as the feature pattern. The peaked ERS of imagined left or right hand movement, referred to as beta rebound, appears over bilateral sensorimotor areas but with distinct patterns. When right hand movement is imagined, the beta rebound over the left hemisphere has a stronger amplitude than that over the right hemisphere, and vice versa.
The recorded EEG signals were inevitably contaminated by system noise, artifacts, spontaneous EEG, etc. Following our previous work on MEG/EEG denoising [Lee03a], we employed the Independent Component Analysis (ICA) technique to decompose each preprocessed epoch into a set of temporally independent components along with corresponding spatial maps, and selected the task-related components by matching designed spatial templates with the decomposed spatial maps. As a result, the signal-to-noise ratio of each EEG single trial was improved, which led to better classifier performance.
This paper is organized as follows. Section 2 reports our experimental paradigm for the motor imagery task and the EEG recording configuration. Section 3 presents the extracted features, termed beta rebound maps, which are based on the peaked beta ERS and computed with and without ICA. Section 4 reviews the four classifiers used in this study. Section 5 summarizes the classification results and Section 6 concludes this study.
2. EXPERIMENTAL PARADIGM FOR
MOTOR IMAGERY
Four right-handed healthy subjects (two males and two females), aged between 20 and 28, participated in this study. Each subject was naive to the experiment and trained for only twenty minutes prior to the first session. During each session, the subject was asked to perform 100 trials of imagery right index finger lifting, followed by another 100 trials of imagery left index finger lifting. The length of each trial was ten seconds. Each trial began with a one-second presentation of random noise during which the subject was allowed to blink his/her eyes (A in Fig. 1). The subject was then instructed to stare at the fixation cross in the center of the monitor from 2 s and to start imagining right or left index finger lifting right after he/she heard an acoustic cue “beep” (with frequency 1 kHz and 10 ms duration) at 5 s (B in Fig. 1). The inter-stimulus interval was 10 seconds.
Figure 1. Timing of two consecutive trials of the
motor imagery task.
A 64-channel electroencephalography (EEG) 10-20 system (with an electro-cap) was used to record the cortical potentials. The configuration of the standard 10-20 system is shown in Fig. 2. The vertical and horizontal electrooculograms (VEOG and HEOG) were used to reject bad epochs induced by eye blinking during the recording. The data were digitized at 250 Hz. Since we focused on beta activities, the signals were further band-pass filtered at 6-50 Hz to remove the dc drifts and 60 Hz noise.
Throughout the recordings, the surface electromyogram (EMG) was monitored from the m. extensor digitorum communis (digitized at 2 kHz) for the detection of motion status. Data from four sessions were collected for each subject. Signals from 3 s to 10 s (C in Fig. 1) in each trial (excluding bad epochs) were extracted for subsequent classifier training and testing. Figure 3 exhibits such a preprocessed epoch from the sensorimotor area (channel C3 in the 10-20 system).
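As a quick consistency check on the recording parameters, the sketch below computes how many samples one extracted epoch contains at the stated sampling rate. The function name `epoch_samples` is ours and is introduced only for illustration.

```python
def epoch_samples(start_s, end_s, fs_hz):
    """Number of samples in an epoch spanning [start_s, end_s) at fs_hz."""
    return round((end_s - start_s) * fs_hz)


# Epoch C in Fig. 1: 3 s to 10 s of each trial, digitized at 250 Hz.
n = epoch_samples(3.0, 10.0, 250.0)
print(n)  # 1750
```

The result agrees with the n = 1750 sampled points per epoch quoted in Step 1 of Section 3.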
Figure 2. The configuration of the standard 10-20 system with 64 channels.
Figure 3. A preprocessed epoch recorded at C3.
3. FEATURE EXTRACTION WITH
AND WITHOUT ICA
Extraction of reliable features from measured data is vital in facilitating the subsequent classification procedure. Since the measured signals were inevitably contaminated by system noise, artifacts, spontaneous EEG, etc., we employed the ICA technique to decompose each preprocessed epoch into a set of temporally independent components along with corresponding spatial maps, and selected the task-related components by matching designed spatial templates with the decomposed spatial maps. Two types of feature, one using ICA to extract task-related components and the other without using ICA, were created from the preprocessed data in order to compare their efficacies. The detailed steps for feature extraction with ICA are described as follows:
Step 1: Signal decomposition by using ICA. We first arranged each preprocessed epoch across m channels (m=62) and n sampled points (n=1750) into an $m \times n$ matrix X. The ith row contains the observed signal from the ith EEG channel, and the jth column vector contains the observed samples at the jth time point across all channels. In the present study, all calculations were performed using the FastICA algorithm [Cov65a] [Cov98a]. The FastICA technique first removed the means of the row vectors in the X matrix, followed by a whitening procedure to transform the covariance matrix of the zero-mean data into an identity matrix. The whitening process was implemented using Principal Component Analysis. Only the first N most significant eigenvectors (N=15 in our analysis) were preserved in the subsequent ICA calculation. In the next step, FastICA searched for a matrix to further separate the whitened data into a set of components which were as mutually independent as possible. Combined with the previous whitening process, the matrix X can be transformed into a matrix S via an unmixing matrix W, i.e.,

$$S = WX \tag{1}$$

in which the rows of S were mutually independent. Each column of $W^{-1}$, i.e. the mixing matrix, represents a spatial map describing the relative projection weights of the corresponding temporal components at each of the EEG channels. They will be referred to as IC spatial maps henceforth. Figure 4 shows 12 IC spatial maps of 12 independent components (not shown) decomposed from a single-trial imagery right hand movement. The maps IC3, IC5, IC7 and IC9 were highly related to the motor imagery task and categorized as task-related components, while the IC4 and IC6 maps were associated with the occipital alpha rhythm, and the IC1 map was noise emanating from a bad channel.
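The whitening and unmixing chain of Step 1 can be written out explicitly. The following is a sketch in standard ICA notation rather than a transcription of the authors' implementation; the symbols $\bar{X}$, $E_N$, $D_N$, $Z$, $B$ and $A$ are introduced here for illustration only. When only N = 15 of the m = 62 principal directions are kept, W is an $N \times m$ matrix, so the inverse of W is best read as a pseudo-inverse.

```latex
\begin{aligned}
\bar{X} &= X - \tfrac{1}{n} X \mathbf{1}\mathbf{1}^{T}
  && \text{(remove the mean of each row)} \\
\tfrac{1}{n}\bar{X}\bar{X}^{T} &\approx E_N D_N E_N^{T}
  && \text{(PCA, keeping the $N$ largest eigenvalues)} \\
Z &= D_N^{-1/2} E_N^{T} \bar{X},
  \qquad \tfrac{1}{n} Z Z^{T} = I_N
  && \text{(whitening)} \\
S &= B Z, \qquad B B^{T} = I_N
  && \text{(rotation found by FastICA)} \\
W &= B D_N^{-1/2} E_N^{T}, \qquad A = E_N D_N^{1/2} B^{T}
  && \text{(unmixing and mixing matrices)}
\end{aligned}
```

Here $WA = I_N$, and each of the N columns of A (of length m, one weight per channel) plays the role of an IC spatial map.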
Step 2: Correlating the IC spatial maps with predefined spatial templates to select task-related components. Since the motor imagery task elicits bilateral activation in the vicinity of sensorimotor areas, four spatial patterns encompassing the C3, C4, Cz, and both C3 and C4 areas, respectively, were considered as spatial templates (see Fig. 5) in selecting the task-related spatial maps. Note that four spatial templates, rather than a single template covering C3 and C4, were taken into account because the task-related activities can be separated by ICA and exhibited in multiple IC spatial maps. Each template was correlated with the 12 IC spatial maps of a single trial and the best two matches were selected. For example, the spatial maps IC3, IC5, IC7 and IC9 in Figure 4 were selected automatically due to their high similarity to the templates. The task-related IC spatial maps as well as the corresponding temporal components were used to reconstruct the signal X by means of equation (1).
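The template-matching rule of Step 2 can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the toy data, and the use of the absolute Pearson correlation (chosen because the sign of an IC map is arbitrary) are assumptions.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0  # a constant map carries no spatial pattern
    return cov / math.sqrt(var_a * var_b)


def select_task_related(templates, ic_maps, per_template=2):
    """Return sorted indices of the IC maps that best match any template.

    Assumption: |r| is used as the similarity score.
    """
    selected = set()
    for template in templates:
        scores = [abs(pearson(template, ic_map)) for ic_map in ic_maps]
        ranked = sorted(range(len(ic_maps)), key=lambda k: scores[k], reverse=True)
        selected.update(ranked[:per_template])
    return sorted(selected)


# Toy example with 4 "channels" and 3 IC maps (made-up numbers).
templates = [[1.0, 0.0, 0.0, 0.0]]
ic_maps = [
    [0.9, 0.1, 0.0, 0.1],   # close to the template
    [0.0, 0.1, 1.0, 0.2],   # unrelated
    [-0.8, 0.0, 0.1, 0.0],  # close to the template up to sign
]
print(select_task_related(templates, ic_maps))  # [0, 2]
```

With the four templates of Fig. 5 and two matches per template, at most eight maps would be kept; fewer are kept when several templates pick the same map.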
Figure 4. The normalized IC spatial maps of a single-trial imagery right hand movement.
Figure 5. Spatial templates used to select task-related IC spatial maps.
Step 3: Computing the envelopes of beta reactivity from reconstructed signals using the Amplitude Modulation method. The optimal beta frequency band encompassing the prominent and relevant brain activities may vary across subjects. To tackle this problem, we divided the beta band into five sub-frequency bands, 8~12, 12~16, 16~20, 20~24, and 24~28 Hz, and used them together with an additional wide band of 8~30 Hz to band-pass filter the reconstructed signals. The Amplitude Modulation (AM) method based on the Hilbert transform was applied to detect the envelope of the filtered EEG signals and quantify the event-related oscillatory activities [Clo96a]. Each envelope, referred to as an AM waveform, was computed by (see Figure 6(a))

$$m(t) = \sqrt{M_{BP}(t)^2 + H(M_{BP}(t))^2} \tag{2}$$

where $M_{BP}(t)$ is the single-trial band-passed EEG signal, and $H(M_{BP}(t))$ is its Hilbert transform. Contrary to the classical measurement of ERS reactivity and the original AM approach, in which a relative percentage indexed to the initial baseline was used [Clo96a], we computed the beta ERS reactivity (termed beta rebound) using the amplitude difference between the maximum values of beta ERD and beta ERS of the AM envelope.
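Equation (2) is the magnitude of the analytic signal, which can be illustrated with a short standard-library sketch. The function names and the toy signal are assumptions, band-pass filtering is taken as already done, and the plain O(n^2) DFT is chosen for readability rather than speed.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal x(t) + j*H(x)(t) via a plain DFT (O(n^2), for clarity)."""
    n = len(x)
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Keep DC (and Nyquist for even n), double the positive frequencies,
    # and zero the negative frequencies.
    gain = [0.0] * n
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
    for k in range(1, (n + 1) // 2):
        gain[k] = 2.0
    return [
        sum(gain[k] * spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
            for k in range(n)) / n
        for t in range(n)
    ]


def am_envelope(x):
    """AM waveform m(t) = sqrt(x(t)^2 + H(x)(t)^2), as in equation (2)."""
    return [abs(z) for z in analytic_signal(x)]


# Toy signal: one second sampled at 64 Hz, a 20 Hz carrier whose amplitude
# follows 1 + 0.5*cos(2*pi*t).
fs = 64
signal = [
    (1.0 + 0.5 * math.cos(2 * math.pi * t / fs))
    * math.cos(2 * math.pi * 20 * t / fs)
    for t in range(fs)
]
envelope = am_envelope(signal)
```

For this toy input the envelope recovers the slow amplitude term 1 + 0.5 cos(2πt), which is the quantity whose peak is read out as the beta rebound.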
Step 4: Extracting the beta rebound maps. The imagery finger lifting task, similar to real finger movement, induced a larger beta rebound in the contralateral sensorimotor area than in the ipsilateral one. In addition, the contralateral beta rebound appeared earlier than the ipsilateral one. The coexistence of prominent beta rebounds at C3 and C4 and the constrained time lag between them suggested that the topographical maps with maximum rebounds at C3 and C4 were reliable features. Specifically, we looked for two time points at which the AM waveforms of C3 and C4 have their maximum peaks with a time lag ($\Delta T$ in Figure 6(a)) of less than 0.5 second. The topographical maps at these two time points, referred to as beta rebound maps (Figure 6(c)), were concatenated into a $124 \times 1$ column vector and used as a feature vector.
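The construction of the feature vector in Step 4 can be sketched as below. This is an illustrative reading, not the authors' procedure: the function name, the channel layout, the use of the global maximum of each envelope as its peak, and the order of concatenation are all assumptions.

```python
def beta_rebound_feature(am, c3, c4, fs=250.0, max_lag_s=0.5):
    """Concatenate the topographic maps at the C3 and C4 envelope peaks.

    am     : list of channels, each a list of AM-envelope samples
    c3, c4 : row indices of channels C3 and C4 in `am` (hypothetical layout)
    Returns a flat list of length 2 * len(am), or None when the two peaks
    are max_lag_s or more apart.
    """
    t3 = max(range(len(am[c3])), key=lambda t: am[c3][t])
    t4 = max(range(len(am[c4])), key=lambda t: am[c4][t])
    if abs(t3 - t4) / fs >= max_lag_s:
        return None
    map_at_c3_peak = [channel[t3] for channel in am]
    map_at_c4_peak = [channel[t4] for channel in am]
    return map_at_c3_peak + map_at_c4_peak


# Toy data: 3 channels, 6 samples at 4 Hz (made-up values).
am = [
    [0.1, 0.2, 0.9, 0.3, 0.2, 0.1],  # "C3", peak at sample 2
    [0.1, 0.1, 0.4, 0.8, 0.2, 0.1],  # "C4", peak at sample 3
    [0.2, 0.2, 0.3, 0.3, 0.2, 0.2],  # another channel
]
feature = beta_rebound_feature(am, c3=0, c4=1, fs=4.0)
print(feature)  # [0.9, 0.4, 0.3, 0.3, 0.8, 0.3]
```

With 62 channels the returned list has 2 × 62 = 124 entries, matching the dimension of the feature vector described above.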
Using the same time points of the peaked beta rebound obtained from steps 1 to 4, we also processed the data using step 3 only, i.e. without using ICA. Figure 7 depicts the resulting beta rebound maps, which appear contaminated by noise compared with those in Fig. 6.
4. TWO-CLASS SUPERVISED CLASSIFICATION
In this section, the four two-category classifiers used in our study are briefly reviewed: linear discriminant analysis (LDA), back-propagation neural network (BPNN), radial-basis function neural network (RBFNN) and support vector machine (SVM). The beta rebound maps of imagery right and left hand movement, denoted by $\vec{x}_i$, each of which is a $124 \times 1$ column vector, were divided into two data sets, one for training and the other for testing the classifiers. The numbers of beta rebound maps used in the training and testing phases for each subject at each session were 60 and 30, respectively. These beta rebound maps were randomized before being used. For the sake of simplicity, we use the notation R and L to denote the categories of imagery right and left hand movement, respectively, in the following discussion.
Figure 6. Computation of the beta rebound maps. (a) The AM waveforms of C3 and C4. $\Delta T$ was the time lag between prominent beta rebounds at C3 and C4. (b) Reconstructed signals of 62 channels (HEOG and VEOG excluded), which were used to calculate the AM waveforms in (a). (c) The beta rebound maps created from reconstructed signals on 62 channels, indexed to the time points of peaked beta rebounds at C3 and C4.
Figure 7. The beta rebound maps computed using only step 3, without applying ICA.
4.1 Classifiers
4.1.1 LDA
The idea of LDA is to seek a vector $\vec{w}$ such that the two projected clusters of R and L feature vectors $\vec{x}_i$'s on $\vec{w}$ are well separated from each other while the variance of each cluster is kept small. This can be done by maximizing the so-called Fisher's criterion

$$J(\vec{w}) = \frac{\vec{w}' S_b \vec{w}}{\vec{w}' S_w \vec{w}}$$

with respect to $\vec{w}$, where $S_b$ is the between-class scatter matrix:

$$S_b = (\vec{m}_R - \vec{m}_L)(\vec{m}_R - \vec{m}_L)'$$

and $S_w$ is the within-class scatter matrix:

$$S_w = \sum_{\vec{x} \in R} (\vec{x} - \vec{m}_R)(\vec{x} - \vec{m}_R)' + \sum_{\vec{x} \in L} (\vec{x} - \vec{m}_L)(\vec{x} - \vec{m}_L)'$$

in which the two summations run over all the training samples of classes R and L, respectively, and $\vec{m}_R$ and $\vec{m}_L$ represent the group means of classes R and L, respectively. The optimal $\vec{w}$ is the eigenvector corresponding to the largest eigenvalue of $S_w^{-1} S_b$. After $\vec{w}$ was obtained by means of the training data, we projected the test samples on it, and then classified the projected points by the k-nearest-neighbor decision rule.
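The LDA procedure can be sketched on two-dimensional toy data. For two classes, $S_b$ has rank one, so the leading eigenvector of $S_w^{-1} S_b$ is proportional to $S_w^{-1}(\vec{m}_R - \vec{m}_L)$; the code below uses that closed form. The function names, the 2-D restriction, the toy samples and the choice k = 3 are assumptions made for illustration.

```python
def mean_vec(samples):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(samples)
    return [sum(s[k] for s in samples) / n for k in range(len(samples[0]))]


def within_scatter_2d(class_r, class_l):
    """S_w for 2-D samples: summed outer products of deviations from class means."""
    sw = [[0.0, 0.0], [0.0, 0.0]]
    for samples in (class_r, class_l):
        m = mean_vec(samples)
        for s in samples:
            d = [s[0] - m[0], s[1] - m[1]]
            for a in range(2):
                for b in range(2):
                    sw[a][b] += d[a] * d[b]
    return sw


def fisher_direction_2d(class_r, class_l):
    """w = S_w^{-1} (m_R - m_L), the two-class Fisher solution up to scale."""
    sw = within_scatter_2d(class_r, class_l)
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    m_r, m_l = mean_vec(class_r), mean_vec(class_l)
    diff = [m_r[0] - m_l[0], m_r[1] - m_l[1]]
    return [
        (sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
        (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det,
    ]


def project(w, x):
    """Scalar projection of x onto the direction w."""
    return w[0] * x[0] + w[1] * x[1]


def knn_1d(train_points, train_labels, query, k=3):
    """Majority vote among the k nearest projected training points."""
    order = sorted(range(len(train_points)),
                   key=lambda i: abs(train_points[i] - query))
    votes = [train_labels[i] for i in order[:k]]
    return max(set(votes), key=votes.count)


# Made-up training samples for the two classes.
class_r = [[2.0, 2.0], [3.0, 2.5], [2.5, 3.5], [3.5, 3.0]]
class_l = [[0.0, 0.5], [1.0, 0.0], [0.5, 1.5], [1.5, 1.0]]
w = fisher_direction_2d(class_r, class_l)
points = [project(w, x) for x in class_r + class_l]
labels = ["R"] * len(class_r) + ["L"] * len(class_l)
print(knn_1d(points, labels, project(w, [3.0, 3.0])))  # R
print(knn_1d(points, labels, project(w, [0.5, 0.5])))  # L
```

An odd k avoids ties in the two-class vote; for the 124-dimensional feature vectors a general linear solver would replace the 2 × 2 inverse used here.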
4.1.2 BPNN
The BPNN was trained in a supervised manner based on the error-correction learning rule. The hierarchy of the BPNN in our implementation is depicted in Figure 8; it consists of one input layer, one hidden layer, and one output layer. The training phase was accomplished by iterating two passes: the forward pass and the backward pass. In the forward pass of the back-propagation learning, as shown in Figure 8, the output of the BPNN at iteration n was computed by

$$y(n) = \varphi(v(n))$$

where $\varphi(\cdot)$ was the activation function and $v(n)$ was the induced local field of the output neuron

$$v(n) = \sum_{i=1}^{m} w_i(n)\, o_i(n)$$

in which m was the total number of inputs applied to the output neuron, $w_i$ was the weight connecting neuron i to the output neuron, and $o_i(n)$ was the output signal of neuron i. The error signal, $e(n)$, between $y(n)$ and the desired output, $d(n)$, was computed at each iteration. If the error met the stopping criterion, the training procedure was terminated. Otherwise, it was minimized in the subsequent backward pass to update the synaptic weighting $w_i(n)$:

$$w_i(n+1) = w_i(n) + \alpha\,[w_i(n) - w_i(n-1)] + \eta\,\delta(n)\,o_i(n)$$

where $\alpha$ was the momentum constant, $\eta$ was the learning-rate parameter, and $\delta(n)$ was the local gradient of the output layer in the network, given by $\delta(n) = e(n)\,\varphi'(v(n))$. In the testing phase, input feature vectors, $\vec{x}$'s, can be linearly classified according to the value of $y(n)$ in the output layer.
Figure 8. The hierarchy of BP neural network.
4.1.3 RBFNN
The RBF neural network [Hay94a] uses a nonlinear function to map the input data into a high-dimensional space so that they are more likely to be linearly separable than in the low-dimensional space [Cov65a] [Cov91a] [Cov88a]. The hierarchy of the (regularization) RBF neural network is depicted in Figure 9; it consists of one input layer, one hidden layer, and one output layer.
Each RBF network is designed to have a nonlinear transformation from the input layer to the hidden layer, followed by a linear mapping from the hidden layer to the output layer. The mapping between the input and output space is expressed by:

$$F(\vec{x}) = \sum_{i=1}^{N} w_i\, \varphi(\|\vec{x} - \vec{x}_i\|) \tag{4.1}$$

where $\varphi(\|\vec{x} - \vec{x}_i\|) = e^{-\|\vec{x} - \vec{x}_i\|^2}$, $w_i$ represents the weighting from the ith hidden neuron to the output neuron, and $\vec{x}_i$ represents the ith known feature vector with dimension m, i = 1, 2, …, N. The distance between the input vector, $\vec{x}$, and the center, $\vec{x}_i$, is mapped into the high-dimensional space by means of a Gaussian function $\varphi(\|\vec{x} - \vec{x}_i\|)$ in this study. In the phase of supervised learning, the training feature vectors $\vec{x}_i$, i = 1, 2, …, N, and the desired outputs $d_i = F(\vec{x}_i)$, each of which is either 1 or −1 in our design, are given. For the sake of simplicity, the training feature vectors are used as centers. With the known N input feature vectors and the corresponding desired outputs, the weighting $w_i$ can be computed from the input-output relationship in equation (4.1):

$$G\vec{w} = \vec{d} \tag{4.2}$$

where

$$G = \begin{bmatrix} \varphi(\|\vec{x}_1 - \vec{x}_1\|) & \varphi(\|\vec{x}_1 - \vec{x}_2\|) & \cdots & \varphi(\|\vec{x}_1 - \vec{x}_N\|) \\ \varphi(\|\vec{x}_2 - \vec{x}_1\|) & \varphi(\|\vec{x}_2 - \vec{x}_2\|) & \cdots & \varphi(\|\vec{x}_2 - \vec{x}_N\|) \\ \vdots & \vdots & \ddots & \vdots \\ \varphi(\|\vec{x}_N - \vec{x}_1\|) & \varphi(\|\vec{x}_N - \vec{x}_2\|) & \cdots & \varphi(\|\vec{x}_N - \vec{x}_N\|) \end{bmatrix}, \quad \vec{w} = \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_N \end{bmatrix}, \quad \vec{d} = \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_N \end{bmatrix}$$

By solving the linear system (4.2), the resultant weighting vector $\vec{w}$ is

$$\vec{w} = G^{+}\vec{d} \tag{4.3}$$

where $G^{+} = (G^{T}G)^{-1}G^{T}$ is the pseudo-inverse matrix of G. Compared with other neural networks that use a gradient-based optimization process to estimate the weightings, for example, the back-propagation recurrent neural network, the RBF neural network solves a set of linear equations, which avoids trapping in a local minimum and greatly reduces the training time. In the testing phase, input feature vectors, $\vec{x}$'s, can be linearly classified based on the values of $F(\vec{x})$'s.
Figure 9. The hierarchy of RBF neural network.
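The training rule of equations (4.1) to (4.3) can be sketched with a small interpolation example. Because the training vectors serve as centers, G is square; when it is also nonsingular the pseudo-inverse coincides with the ordinary inverse, so this sketch swaps in a direct linear solve (Gaussian elimination) for the pseudo-inverse. Function names and the XOR-like toy data are assumptions.

```python
import math


def gaussian(x, center):
    """phi(||x - center||) = exp(-||x - center||^2)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, center)))


def solve_linear(a, b):
    """Solve a w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (aug[r][n] - tail) / aug[r][r]
    return w


def train_rbf(centers, targets):
    """Build G from the training vectors and solve G w = d (equation 4.2)."""
    g = [[gaussian(xi, xj) for xj in centers] for xi in centers]
    return solve_linear(g, targets)


def rbf_output(x, centers, weights):
    """F(x) = sum_i w_i * phi(||x - x_i||) (equation 4.1)."""
    return sum(w * gaussian(x, c) for w, c in zip(weights, centers))


# XOR-like toy problem: not linearly separable in the input space.
centers = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
targets = [1.0, 1.0, -1.0, -1.0]
weights = train_rbf(centers, targets)
print(rbf_output([0.1, 0.1], centers, weights) > 0)  # True
print(rbf_output([0.9, 0.1], centers, weights) < 0)  # True
```

The sign of F(x) then serves as the class decision, and the network reproduces the desired outputs at the training vectors themselves.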
4.1.4 SVM
The basic idea of the support vector machine hinges on two mathematical operations: (1) with an appropriate nonlinear mapping $\varphi(\cdot)$ of an input vector into a high-dimensional feature space, data from two categories can be linearly separated by a hyperplane [Cov65a]; (2) construction of an optimal hyperplane for separating the features in (1). Let $\vec{x}$ denote a vector drawn from the input space, assumed to be of dimension $m_0$, and let $\{\varphi_j(\vec{x})\}_{j=1}^{m_1}$ denote a set of nonlinear transformations from the input space to the feature space, where $m_1$ is the dimension of the feature space. Given such a set of nonlinear transformations, we may define a hyperplane acting as the decision surface as follows:

$$\sum_{j=0}^{m_1} w_j\, \varphi_j(\vec{x}) = 0 \tag{4.4}$$

where $\{w_0, w_1, \ldots, w_{m_1}\}$ denotes a set of linear weights connecting the feature space to the output space. It is assumed that $\varphi_0(\vec{x}) = 1$ for all $\vec{x}$, so that $w_0$ denotes the bias. Equation (4.4) defines the decision surface computed in the feature space in terms of the linear weights of the machine. Defining the vectors $\vec{\varphi}(\vec{x}) = [\varphi_0(\vec{x}), \varphi_1(\vec{x}), \ldots, \varphi_{m_1}(\vec{x})]^T$ and $\vec{w} = [w_0, w_1, \ldots, w_{m_1}]^T$, we rewrite the decision surface in the compact form:

$$\vec{w}^T \vec{\varphi}(\vec{x}) = 0 \tag{4.5}$$

Given the training feature samples, where $\vec{\varphi}(\vec{x}_i)$ corresponds to the input pattern $\vec{x}_i$, and the corresponding desired responses $d_i$, $i = 1, \ldots, N$, each of which is either 1 or −1 in our design, it has been shown [Hay94a] that the optimal weight vector $\vec{w}$ can be expressed as

$$\vec{w} = \sum_{i=1}^{N} \alpha_i\, d_i\, \vec{\varphi}(\vec{x}_i) \tag{4.6}$$

where $\{\alpha_i\}_{i=1}^{N}$ are the optimal Lagrange multipliers resulting from maximizing the objective function

$$Q(\alpha) = \sum_{i=1}^{N} \alpha_i - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j d_i d_j\, \vec{\varphi}^T(\vec{x}_i)\, \vec{\varphi}(\vec{x}_j) \tag{4.7}$$

subject to the constraints (1) $\sum_{i=1}^{N} \alpha_i d_i = 0$, and (2) $0 \le \alpha_i \le C$, where C is a user-specified constant. Substituting equation (4.6) into (4.5), we obtain the optimal hyperplane

$$\sum_{i=1}^{N} \alpha_i\, d_i\, \vec{\varphi}^T(\vec{x}_i)\, \vec{\varphi}(\vec{x}) = 0 \tag{4.8}$$

which will be used for linearly separating the testing data, i.e. for any testing sample $\vec{x}$, if

$$\sum_{i=1}^{N} \alpha_i\, d_i\, \vec{\varphi}^T(\vec{x}_i)\, \vec{\varphi}(\vec{x}) \ge 0$$

then $\vec{x}$ is classified into the subset having the training response $d_i = 1$; otherwise it is classified into the other subset with $d_i = -1$. In our implementation, we chose the radial-basis function in defining the inner product kernel $\vec{\varphi}^T(\vec{x})\, \vec{\varphi}(\vec{x}_i)$ as follows:

$$K(\vec{x}, \vec{x}_i) \equiv \vec{\varphi}^T(\vec{x})\, \vec{\varphi}(\vec{x}_i) = \exp(-0.0005\, \|\vec{x} - \vec{x}_i\|^2).$$

According to equation (4.8), once the number of nonzero Lagrange multipliers, $\alpha_i$, is determined, the number of radial-basis functions and their centers are determined automatically. This differs from the design of conventional neural networks, for example, the back-propagation neural network or the radial-basis function network [Hay94a], where the numbers of hidden layers or of hidden neurons are usually determined heuristically.
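The decision rule of equation (4.8) with the radial-basis kernel can be sketched as follows. Only the testing-phase rule is shown: the support vectors and multipliers below are made-up values rather than the result of maximizing (4.7), which would require a quadratic-programming solver. Function names are assumptions.

```python
import math


def rbf_kernel(x, xi, gamma=0.0005):
    """K(x, x_i) = exp(-gamma * ||x - x_i||^2), with gamma = 0.0005 as in the text."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, xi)))


def svm_decide(x, support_vectors, labels, alphas):
    """Return +1 if sum_i alpha_i * d_i * K(x_i, x) >= 0, otherwise -1."""
    score = sum(
        a * d * rbf_kernel(x, sv)
        for a, d, sv in zip(alphas, labels, support_vectors)
    )
    return 1 if score >= 0 else -1


# Hypothetical support vectors and multipliers (not trained).
support_vectors = [[10.0, 0.0], [-10.0, 0.0]]
labels = [1, -1]
alphas = [0.5, 0.5]  # satisfies the constraint sum_i alpha_i * d_i = 0

print(svm_decide([8.0, 1.0], support_vectors, labels, alphas))   # 1
print(svm_decide([-6.0, 2.0], support_vectors, labels, alphas))  # -1
```

Only training vectors with nonzero multipliers enter the sum, which is why the number of radial-basis functions and their centers follow from the solution rather than being chosen by hand.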
5. RESULTS
Table 1 summarizes the averaged recognition results for detecting the right and left imagined finger lifting in four subjects (denoted by s1 ~ s4). With the use of ICA in the extraction of the beta rebound maps, each classifier performed better for every subject, and the overall averaged recognition rate improved significantly from 55.0% to 74.8%. In addition, the SVM outperformed the other classifiers.
Classifier  ICA      s1  s2  s3  s4  mean
LDA         without  58  55  57  51  54
            with     63  79  74  63  69.8
BPNN        without  63  52  50  51  54
            with     72  84  79  67  75.5
RBFNN       without  66  59  54  50  57.3
            with     75  86  79  66  76.5
SVM         without  66  53  50  51  55
            with     72  87  77  73  77.3
Table 1. Averaged recognition rates (in percentages) over four sessions resulting from different classifiers with and without using ICA for feature extraction.
The receiver operating characteristic (ROC) curve, a plot of true-positive rate versus false-positive rate, provides another way to evaluate the performance of binary classifiers. The area under the ROC curve, which can be interpreted as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative sample, assesses the quality of classification over a range of misclassification costs.
Table 2 reports that the use of ICA improved the
performance of each classifier and the overall
averaged ROC area increased from 0.63 to 0.75.
Classifier  ICA      s1   s2   s3   s4   mean
LDA         without  .71  .64  .58  .67  .65
            with     .75  .86  .74  .68  .78
BPNN        without  .65  .56  .61  .58  .60
            with     .68  .78  .74  .71  .73
RBFNN       without  .73  .60  .54  .62  .62
            with     .65  .91  .77  .74  .77
SVM         without  .64  .61  .66  .63  .64
            with     .69  .87  .77  .65  .75
Table 2. Averaged ROC areas over four sessions resulting from different classifiers with and without using ICA for feature extraction. The numbers of beta rebound maps used for training and testing for each subject at each session were 60 and 30, respectively.
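The area under the ROC curve can be computed directly from classifier scores using its pairwise-ranking interpretation. The sketch below is illustrative: the function name and the toy scores are assumptions, and ties between a positive and a negative score are counted as one half.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y != 1]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up classifier outputs for three positive and three negative samples.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
labels = [1, 1, 1, -1, -1, -1]
auc = roc_auc(scores, labels)  # 8 of the 9 pairs are ordered correctly
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the entries of Table 2.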
6. CONCLUSIONS
We have presented a novel method that uses ICA to extract a reliable feature, the beta rebound map, from the peaked ERS of motor imagery EEG. With minimal training for each subject (20 minutes only), satisfactory classification rates were achieved with four classifiers. This demonstrates the suitability of the beta rebound map as a neural input signal in BCI applications.
7. ACKNOWLEDGMENTS
The study was funded by the Taipei Veterans General Hospital, Taiwan 91380, the Ministry of Education of Taiwan (89BFA221401), and the National Science Council, Taiwan (NSC-92-2218-E-010-016).
8. REFERENCES
[Clo96a] Clochon, P., Fontbonne, J. M., Etevenon, P. A new method for quantifying EEG event-related desynchronization: amplitude envelope analysis, Electroencephalography & Clinical Neurophysiology, 98: 126-129, 1996.
[Cov65a] Cover, T. M. Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition, IEEE Transactions on Electronic Computers, EC-14: 326-334, 1965.
[Cov91a] Cover, T. M., Thomas, J. A. Elements of Information Theory. New York: Wiley, 1991.
[Cov88a] Cover, T. M. Capacity problems for linear machines. Washington, DC: Thompson Book, Pattern Recognition: 293-289, 1988.
[Hay94a] Haykin, S. Neural Networks: A Comprehensive Foundation. New York: Macmillan College Publishing Company, 1994.
[Lee03a] Lee, P.-L., Wu, Y.-T., Chen, L.-F., Chen, Y.-S., Cheng, C.-M., Yeh, T.-C., Ho, L.-T., Chang, M.-S., Hsieh, J.-C. ICA-based spatiotemporal approach for single-trial analysis of post-movement MEG beta synchronization, NeuroImage, in press, 2003.
[Mul99a] Muller-Gerking, J., Pfurtscheller, G., Flyvbjerg, H. Designing optimal spatial filters for single-trial EEG classification in a movement task, Clinical Neurophysiology, 110: 787-798, 1999.
[Pfu96a] Pfurtscheller, G., Stancak Jr, A., Neuper, C. Post-movement beta synchronization. A correlate of an idling motor area?, Electroencephalography & Clinical Neurophysiology, 98: 281-293, 1996.
[Pfu98a] Pfurtscheller, G., Neuper, C., Schlogl, A., Lugger, K. Separability of EEG Signals Recorded During Right and Left Motor Imagery Using Adaptive Autoregressive Parameters, IEEE Transactions on Rehabilitation Engineering, Vol. 6, No. 3: 316-325, 1998.
[Pfu00a] Pfurtscheller, G., Guger, C., Muller, G., Krausz, G., Neuper, C. Brain oscillations control hand orthosis in a tetraplegic, Neuroscience Letters, 292: 211-214, 2000.