Single-trial detection in EEG and MEG: Keeping it linear.
ABSTRACT Conventional electroencephalography (EEG) and magnetoencephalography (MEG) analysis often relies on averaging over multiple trials to extract statistically relevant differences between two or more experimental conditions. We demonstrate that by linearly integrating information over multiple spatially distributed sensors within a predefined time window, one can discriminate conditions on a trial-by-trial basis with high accuracy. We restrict ourselves to a linear integration as it allows the computation of a spatial distribution of the discriminating source activity. In the present set of experiments the resulting source activity distributions correspond to functional neuroanatomy consistent with the task (e.g. contralateral sensory-motor cortex and anterior cingulate).
Neurocomputing 52–54 (2003) 177–183
Single-trial detection in EEG and MEG:
Keeping it linear
Lucas Parra (a), Chris Alvino (a), Akaysha Tang (b), Barak Pearlmutter (b), Nick Yeung (c), Allen Osman (d), Paul Sajda (e,*)
(a) Vision Technologies Laboratory, Sarnoff Corporation, Princeton, NJ 08540, USA
(b) Department of Psychology and Department of Computer Science, University of New Mexico, Albuquerque, NM 87131, USA
(c) Department of Psychology, Princeton University, Princeton, NJ 08544, USA
(d) Department of Psychology, University of Pennsylvania, Philadelphia, PA 19104, USA
(e) Department of Biomedical Engineering, 351 Engineering Terrace, MC 8904, Columbia University, New York, NY 10027, USA
© 2003 Elsevier Science B.V. All rights reserved.
Keywords: Linear integration; High-density electroencephalography (EEG); Magnetoencephalography
(MEG); Single-trial analysis; Brain–computer interface (BCI)
1. Introduction

Trial averaging is often used to increase the signal-to-interference ratio (SIR), for example in the analysis of event-related potentials (ERPs). With the large number of sensors in high-density EEG and MEG, an alternative approach is to integrate information over space rather than across trials. For example, in blind source separation independent signals are extracted so that noise components and artifacts can be removed [8,13,14]. However, blind methods do not exploit the timing information of external events that is often available.
In the context of a BCI system, many methods have applied linear and non-linear classification to a set of features extracted from the EEG. For example, autoregressive models have been used to extract features across a limited number of electrodes, with these features combined using either linear or non-linear classifiers to identify the activity from the time course of individual sensors. Others have proposed to combine sensors in space by computing maximum and minimum eigenvalues of the sensor covariance matrices to obtain a non-linear binary classification. Though many of these methods show promising performance in terms of classifying covert (purely mental) processes, their neurological interpretation remains unclear.
We use conventional linear discrimination to compute the optimal spatial integration of a large array of sensors. We exploit timing information by discriminating and averaging within a short time window relative to a given external event. We demonstrate the method on three different electromagnetic brain imaging data sets: (1) predicting overt motor action (left/right button push) from 122 MEG sensors, (2) classifying covert or imagined motor activity (left/right taps) using 59 EEG sensors, (3) detecting decision errors in a binary discrimination task from 64 EEG sensors.
2. Materials and methods
2.1. Linear discrimination
Denoting $x(t)$ as the $M$ sensor values sampled at time instance $t$, we compute the spatial weighting coefficients $v$ such that

$$y(t) = v^T x(t) \qquad (1)$$

is maximally discriminating between the times $t$ corresponding to two different experimental conditions. For example, in the prediction of explicit motor response experiments (described below) the times correspond to a number of samples prior to an overt button push. The samples corresponding to a left button push are to be discriminated from samples of a right button push. For each of $N$ trials we may have $T$ samples, totaling $NT$ training examples. We use conventional logistic regression to find $v$. After finding the optimal $v$ we average over the $T$ dependent samples of the $k$th trial to obtain a more robust result, $\bar{y}_k = T^{-1} \sum_{t \in T_k} y(t)$, where $T_k$ denotes the set of sample times corresponding to trial $k$. Receiver operating characteristic (ROC) analysis is done using these single-trial short-time averaged discrimination activities ($\bar{y}_k$).
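The procedure of Section 2.1 can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the gradient-descent fit, the bias term `b`, the small L2 penalty, and the synthetic data are all assumptions made for illustration, and samples are stored as rows here for convenience. The demo scores the training data only, so it shows the mechanics rather than generalization performance.

```python
import numpy as np

def fit_logistic(X, labels, n_iter=500, lr=0.1, l2=1e-3):
    """Fit logistic regression by plain gradient descent (illustrative choice).

    X: (n_samples, M) sensor samples, one row per time sample.
    labels: (n_samples,) condition labels in {0, 1}.
    Returns spatial weights v (M,) and a bias b (the bias is an assumption;
    Eq. (1) in the text has no explicit offset).
    """
    n, m = X.shape
    v = np.zeros(m)
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ v + b)))   # predicted P(label = 1)
        v -= lr * (X.T @ (p - labels) / n + l2 * v)
        b -= lr * np.mean(p - labels)
    return v, b

def trial_average(y, trial_ids):
    """Average y(t) over the samples of each trial: y_bar_k = mean_{t in T_k} y(t)."""
    return np.array([y[trial_ids == k].mean() for k in np.unique(trial_ids)])

# Synthetic stand-in data: N trials, T samples per trial, M sensors.
rng = np.random.default_rng(0)
N, T, M = 40, 30, 8
trial_labels = np.repeat([0, 1], N // 2)          # one label per trial
sample_labels = np.repeat(trial_labels, T)        # every sample inherits its trial label
trial_ids = np.repeat(np.arange(N), T)
pattern = rng.normal(size=M)                      # hypothetical class-dependent topography
X = rng.normal(size=(N * T, M)) + 0.5 * np.outer(2 * sample_labels - 1, pattern)

v, b = fit_logistic(X, sample_labels)
y = X @ v + b                                     # y(t) = v^T x(t) (+ bias)
y_bar = trial_average(y, trial_ids)               # one discriminating value per trial
```

Averaging the $T$ outputs of a trial reduces the variance of the per-trial score, which is why `y_bar` separates the two conditions more cleanly than the individual `y` values.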
2.2. Localization of discriminating sources
In order to provide a functional neuroanatomical interpretation of the resultant spatial weighting, we treat $y(t)$ as a source and visualize the coupling coefficients of the source with the sensors. The strength of the coupling roughly indicates the closeness of the source to the sensor. The coupling $a$ is defined as the coefficients that multiply the putative source $y(t)$ to give its additive contribution $x_y(t)$ to the sensor readings, $x_y(t) = a\,y(t)$. Unfortunately, $x_y(t)$ is not observable in isolation; instead we observe $x(t) = x_y(t) + x_{y'}(t)$, where $x_{y'}(t)$ represents the activity that is not due to the discriminating source. If the contributions $x_{y'}(t)$ of other sources are uncorrelated with $y(t)$, we obtain the coupling coefficients by the least-squares solution

$$a = \|y\|^{-2} X y,$$

where the samples $x(t)$ for different $t$ have been arranged as columns in the matrix $X$, and $y(t)$ as a column vector $y$. In general, other sources are not guaranteed to be uncorrelated with the discriminating source. Therefore, the "sensor projection" $a$ represents the coupling of all source activity that is correlated with the discriminating source $y(t)$. Our approach relies on the linearity of $y(t)$ and the fact that different sources in EEG and MEG add linearly.
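The least-squares sensor projection can be checked numerically with a short NumPy sketch. The function name `sensor_projection` and the synthetic data are assumptions for illustration; the formula is the one stated in the text, with samples arranged as columns of `X`.

```python
import numpy as np

def sensor_projection(X, y):
    """Least-squares coupling a = X y / (y^T y).

    X: (M, n) sensor data, one column per time sample.
    y: (n,) discriminating source activity y(t).
    """
    y = np.asarray(y, dtype=float)
    return X @ y / (y @ y)

# Synthetic check: a known coupling plus activity uncorrelated with y.
rng = np.random.default_rng(1)
M, n = 5, 2000
a_true = np.array([1.0, -0.5, 0.0, 2.0, 0.3])    # hypothetical coupling coefficients
y = rng.normal(size=n)
other = rng.normal(size=(M, n))                  # contributions of other sources
X = np.outer(a_true, y) + other                  # x(t) = a y(t) + other activity

a_hat = sensor_projection(X, y)                  # close to a_true when 'other' is uncorrelated with y
```

If the other sources were correlated with `y`, `a_hat` would absorb their coupling as well, which is the caveat the text raises about interpreting the projection.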
2.3. Datasets for analysis
2.3.1. Prediction of motor response from MEG
This data set was provided by AT and BP. Four subjects performed a visual-motor integration task. Subjects were simultaneously presented with two visual stimuli on a CRT. Subjects were instructed to push a left-hand or right-hand button, depending on which side a target stimulus was present. The subject was to discover the target by trial and error using auditory feedback. Each trial began with visual stimulus onset, followed by a button push, followed by auditory feedback indicating whether the subject responded correctly. The interval between the motor response and the next stimulus presentation was 3.0 ± 0.5 s. Each subject performed 90 trials, which took approximately 10 minutes. MEG data were recorded using 122 sensors at a sampling rate of 300 Hz and high-pass filtered to remove DC drifts. Dipole fits were done using the Neuromag xfit tools, which assume a spherical head model to find a single equivalent current dipole.
2.3.2. Classification of explicit and imagined motor response from EEG
This data set was provided by AO. Nine subjects performed a visual-stimulus-driven finger (L/R) tapping task. Subjects were asked to synchronize an explicit or imagined tap to the presentation of a brief, temporally predictable signal. Subjects were trained until their explicit taps occurred consistently within 100 ms of the synchronization signal. After training, each subject received 10 blocks of trials. Each 72-trial block consisted of nine replications of the eight trial types (Explicit vs. Imagined × Left vs. Right vs. Both vs. No Tap) presented in a random order. Trials with noise due to eye blinks were not considered in the EEG analysis. The electromyogram (EMG) was recorded to detect muscle activity during imagined movements. The 59 EEG channels were sampled at 100 Hz and high-pass filtered to remove DC drift.
2.3.3. Detection of decision errors from EEG
This data set was provided by NY. Seven subjects performed a task requiring detection of a visual target amongst distractors. On each trial, subjects were presented with a stimulus for 100 ms. There were four possible stimuli, each consisting of a row of five arrows. Subjects were told to respond by pressing a key on the side indicated by the center arrow. They were to ignore the four flanking arrows. On half of the trials, the flanking arrows pointed in the same direction as the target (e.g. <<<<<); on the other half the flankers pointed in the opposite direction (e.g. <<><<). The interval between the motor response and the next stimulus presentation was 1.5 s. Subjects performed 12 blocks of 68 trials each. The activity during a 100 ms interval prior to the response was used as the baseline. The sampling rate was 250 Hz. Following baselining, trials were manually edited to remove those with blinks, large eye movements, instrument artifacts and amplifier saturation.
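The baselining step mentioned above can be sketched as follows. The text states only that a 100 ms pre-response interval was used as the baseline; subtracting the per-channel mean of that interval is the usual convention and is assumed here. The function name, array layout, and toy data are hypothetical.

```python
import numpy as np

def baseline_correct(epoch, response_idx, fs=250.0, baseline_ms=100.0):
    """Subtract the per-channel mean of the pre-response baseline interval.

    epoch: (n_channels, n_samples) single-trial data.
    response_idx: sample index of the response within the epoch.
    fs: sampling rate in Hz (250 Hz for this data set).
    """
    n_base = int(round(baseline_ms * fs / 1000.0))   # 25 samples at 250 Hz
    start = response_idx - n_base
    if start < 0:
        raise ValueError("epoch does not contain the full baseline interval")
    baseline = epoch[:, start:response_idx].mean(axis=1, keepdims=True)
    return epoch - baseline

# Toy epoch: two channels with constant offsets and a unit step after the response.
epoch = np.tile(np.array([[5.0], [-2.0]]), (1, 100))
epoch[:, 50:] += 1.0
corrected = baseline_correct(epoch, response_idx=50)
```

After correction the channel offsets are removed, so the pre-response samples sit at zero and only the post-response change remains.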
3. Results and discussion
Single-trial discrimination results are shown for the three different data sets and include sensor projections $a$ and detection/prediction performance using the single-trial, short-time averaged $\bar{y}_k$. Performance is reported using an ROC curve computed with a leave-one-out training and testing procedure. Overall performance is quantified with $A_z$ (the area under the ROC curve).
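A minimal sketch of the evaluation described above: the area under the ROC curve computed from its rank (Mann-Whitney) form, and a leave-one-trial-out loop. The helper names are hypothetical, and the demo uses a simple difference-of-class-means discriminator as a stand-in for logistic regression to keep the example short.

```python
import numpy as np

def roc_area(scores, labels):
    """Area under the ROC curve: P(score_pos > score_neg) + 0.5 * P(tie)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]               # all positive/negative pairs
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (pos.size * neg.size)

def leave_one_out_scores(features, labels, fit, score):
    """Score each trial with a model trained on all other trials."""
    n = len(labels)
    out = np.empty(n)
    for k in range(n):
        keep = np.arange(n) != k
        model = fit(features[keep], labels[keep])
        out[k] = score(model, features[k])
    return out

# Stand-in linear discriminator (NOT logistic regression): difference of class means.
def fit_mean_diff(X_train, y_train):
    return X_train[y_train == 1].mean(axis=0) - X_train[y_train == 0].mean(axis=0)

def score_linear(w, x):
    return float(w @ x)

rng = np.random.default_rng(2)
labels = np.repeat([0, 1], 30)
features = rng.normal(size=(60, 4)) + 1.0 * labels[:, None]   # synthetic per-trial features
loo = leave_one_out_scores(features, labels, fit_mean_diff, score_linear)
az = roc_area(loo, labels)

small = roc_area([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])   # 3 of 4 pairs ordered correctly -> 0.75
```

Because each held-out trial is scored by a model that never saw it, `az` estimates performance on new trials rather than the fit to the training data.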
Fig. 1 shows results for the data set used to predict whether a subject will press a button with their left or right hand by analyzing the MEG signals in a window prior to the button push. We use an analysis window 100 ms wide centered at 83 ms prior to the button event, which at 300 Hz corresponds to $T = 30$. Fig. 1 shows the results for one subject (AT). Single-trial discrimination is shown in the ROC curve, which for this subject shows good discriminability ($A_z = 0.93$). Fig. 1 also shows the sensor projection $a$ and the location of a dipole fit for this projection. When considered with respect to the motor-sensory homunculus, these results indicate that the discriminating source activity originates in the sensory-motor cortex corresponding to the left hand.
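The window arithmetic above (100 ms at 300 Hz gives $T = 30$ samples) can be made concrete with a small helper. The function name and the rounding convention are assumptions; the text does not say how the window edges were aligned to sample indices.

```python
def window_samples(event_idx, center_ms, width_ms, fs):
    """Sample indices of a window placed relative to an event.

    event_idx: sample index of the event (e.g. the button push).
    center_ms: window center relative to the event (negative = before the event).
    width_ms: window width in milliseconds.
    fs: sampling rate in Hz.
    """
    n = int(round(width_ms * fs / 1000.0))                     # number of samples T
    center = event_idx + int(round(center_ms * fs / 1000.0))   # nearest sample to the center
    start = center - n // 2
    return list(range(start, start + n))

# 100 ms window centered 83 ms before a button push at sample 1000, sampled at 300 Hz.
idx = window_samples(event_idx=1000, center_ms=-83.0, width_ms=100.0, fs=300.0)
```

Under this rounding convention the window covers samples 960 to 989, that is, 30 samples ending shortly before the button event.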
Fig. 2 shows results for the second data set, where the goal is to detect activity associated with purely imagined motor response, a situation more realistic for a BCI system. Subjects are trained to imagine a tap with the left or right index finger synchronized to a brief, temporally predictable signal. We selected a 0.8 s time window around the time where the task is to be performed. 90 left and 90 right trials were available to train the coefficients of the 59 EEG sensors. For the nine subjects we obtain a leave-one-out performance of $A_z = 0.77 \pm 0.1$. The result for the best-performing subject is $A_z = 0.90$, shown in Fig. 2. The sensor projection of the 59 EEG sensors shows a clear left-right polarization over the motor area. The result, across the nine subjects, for predicting explicit finger taps from a window 300 ms to 100 ms prior to the taps is $A_z = 0.87 \pm 0.1$. As shown in Fig. 2, sensor projections of the discrimination vector for explicit motor response coincide with those for the imagined motor response. This is an important experimental finding supporting the development of an intuitive BCI system: signals arising from the cortical areas that encode an explicit movement can also be used for predicting the imagined movement.

Fig. 1. MEG left/right button push prediction. (A) Sensor projections $a$ for discrimination vector. (B) ROC curve for left vs. right discrimination. Area under the curve $A_z = 0.93$. (C) Dipole fit of $a$ overlaid on MRI.

Fig. 2. Discrimination of imagined left/right finger taps. (A) Dorsal view of sensor projections $a$. (B) ROC curve for left vs. right discrimination. (C) Sensor projection of discriminating source for explicit finger tap.
Fig. 3 shows the results for the experiments where the goal is to detect the Error Related Negativity (ERN) on a single-trial basis. The ERN is a negative deflection in the EEG signals following perceived incorrect responses in a discrimination task. The detection of error potentials has been proposed as a means of correcting communication errors in a BCI system. The ERN has a fronto-central distribution, suggesting a source in the anterior cingulate. It begins around the time of the incorrect response and lasts roughly 100 ms thereafter. We use this time window for detection. Forty to eighty error trials and 300 correct trials were used for training 64 coefficients. The sensor projection, shown in Fig. 3 for one subject, is representative of the results obtained for other subjects and is consistent with the scalp topography of the ERN. The detection performance for this subject was $A_z = 0.84$, compared to $A_z = 0.63$ when detecting the ERN from the center electrode where maximal activity is expected. ROC performance across all seven subjects was $A_z = 0.79 \pm 0.05$.

Fig. 3. Detection of decision errors with EEG. (A) Dorsal view of sensor projections. (B) ROC curve for error vs. correct trials. Solid curve corresponds to discrimination using Eq. (1) and dotted curve to discrimination with center electrode (#12).

4. Conclusion

Our results demonstrate the utility of linear analysis methods for discriminating between different events in single-trial, stimulus-driven experimental paradigms using EEG and MEG. A particularly important aspect of our approach is that linearity enables the computation of sensor projections for the optimally discriminating weighting. This localization can be compared to the functional neuroanatomy, serving as a validation of the data-driven linear methods. In all three cases presented, we find that the activity distribution correlated with the source that optimizes single-trial discrimination localizes to a region that is consistent with the functional neuroanatomy. This is important, for instance, in order to determine whether the discrimination model is capturing information directly related to the underlying task-dependent cortical activity, or is instead exploiting an indirect cortical response or other physiological signals correlated with the task (e.g. correlations with the stimulus, eye movements, etc.). Localization of the discriminating source activity also enables one to determine the neuroanatomical correlations between different discrimination tasks, as was demonstrated for explicit and imagined motor responses in EEG. Finally, this method is applicable to other encephalographic modalities with linear superposition of activity, such as near-infrared imaging.

Acknowledgements

This work was supported in part by grants from the Defense Advanced Research Projects Agency, the National Foundation for Functional Brain Imaging, and the National Institutes of Health (P50 MH62196 and R02 NS37528). We thank Clay Spence, Adam Gerson, and Zuohua Zhang for fruitful discussions and assistance in the data analysis, and Mimi Duvall for assistance with the figures.

References

[1] S. Baillet, J.C. Mosher, R.M. Leahy, Electromagnetic brain mapping, IEEE Signal Process. Mag. 18 (6) (2001) 14–30.
[2] D.A. Boas, et al., Imaging the body with diffuse optical tomography, IEEE Signal Process. Mag. 18 (6) (2001) 57–75.
[3] M.G.H. Coles, M.D. Rugg, Event-related brain potentials: an introduction, in: M.D. Rugg, M.G.H. Coles (Eds.), Electrophysiology of Mind, Oxford University Press, Oxford, 1995.
[4] S. Dehaene, M. Posner, D. Tucker, Localization of a neural system for error detection and compensation, Psychol. Sci. 5 (1994) 303–305.
[5] R. Duda, P. Hart, D. Stork, Pattern Classification, Wiley, New York, 2001.
[6] M. Falkenstein, J. Hoorman, S. Christ, J. Hohnsbein, ERP components on reaction errors and their functional significance: a tutorial, Biol. Psychol. 51 (2000) 87–107.
[7] S. Haykin, Adaptive Filter Theory, Prentice-Hall, Englewood Cliffs, NJ, 1996.
[8] S. Makeig, A. Bell, T. Jung, T. Sejnowski, Independent component analysis of electroencephalographic data, in: Advances in Neural Information Processing Systems, Vol. 8, MIT Press, Cambridge, MA, 1996, pp. 145–151.
[9] G. Pfurtscheller, C. Neuper, Motor imagery and direct brain-computer communication, Proc. IEEE 89 (7) (2001) 1123–1134.
[10] H. Ramoser, J. Mueller-Gerking, G. Pfurtscheller, Optimal spatial filtering of single trial EEG during imagined hand movement, IEEE Trans. Rehab. Eng. 8 (4) (2000) 441–446.
[11] G. Schalk, J. Wolpaw, D. McFarland, G. Pfurtscheller, EEG-based communication: presence of an error potential, Clin. Neurophysiol. 111 (2000) 2138–2144.
[12] J.A. Swets, ROC analysis applied to the evaluation of medical imaging techniques, Invest. Radiol. 14
[13] A. Tang, B. Pearlmutter, D. Phung, S. Carter, Independent components of magnetoencephalography, Part I: Localization, Neural Comput. 14 (8) (2002) 1827–1858.
[14] R. Vigario, J. Sarela, V. Jousmaki, M. Hamalainen, E. Oja, Independent component approach to the analysis of EEG and MEG recordings, IEEE Trans. Biomed. Eng. 47 (5) (2000) 589–593.