Page 1
Fast accurate MEG source localization using a multilayer
perceptron trained with real brain noise
Sung Chan Jun, Barak A. Pearlmutter, Guido Nolte
Dept. of Computer Science, University of New Mexico, Albuquerque, NM 87131, U.S.A.
E-mail: junsc@cs.unm.edu, bap@cs.unm.edu, nolte@cs.unm.edu
Abstract. Iterative gradient methods like Levenberg-Marquardt (LM) are in widespread use
for source localization from electroencephalographic (EEG) and magnetoencephalographic
(MEG) signals. Unfortunately LM depends sensitively on the initial guess, necessitating
repeated runs. This, combined with LM’s high per-step cost, makes its computational burden
quite high. To reduce this burden, we trained a multilayer perceptron (MLP) as a real-time
localizer. We used an analytical model of quasistatic electromagnetic propagation through a
spherical head to map randomly chosen dipoles to sensor activities according to the sensor
geometry of a 4D Neuroimaging Neuromag-122 MEG system, and trained a MLP to invert
this mapping in the absence of noise or in the presence of various sorts of noise such as white
Gaussian noise, correlated noise, or real brain noise. A MLP structure was chosen to trade
off computation and accuracy. This MLP was trained four times, with each type of noise. We
measured the effects of initial guesses on LM performance, which motivated a hybrid MLP-
start-LM method, in which the trained MLP initializes LM. We also compared the localization
performance of LM, MLPs, and hybrid MLP-start-LMs for realistic brain signals. Trained
MLPs are much faster than other methods, while the hybrid MLP-start-LMs are faster and
more accurate than fixed-4-start-LM. In particular, the hybrid MLP-start-LM initialized by a
MLP trained with the real brain noise dataset is 60 times faster and is comparable in accuracy
to random-20-start-LM, and this hybrid system (localization error: 0.28 cm, computation time:
36 ms) shows almost as good performance as optimal-1-start-LM (localization error: 0.23 cm,
computation time: 22 ms), which initializes LM with the correct dipole location. MLPs trained
with noise perform better than the MLP trained without noise, and the MLP trained with real
brain noise is almost as good an initial guessor for LM as the correct dipole location.
Physics in Medicine and Biology, 47(14):2547–2560, June 21 2002.
Fast MEG source localizer 2
1. Introduction
Source localization using EEG and MEG signals is important in medical diagnosis of
conditions like epilepsy, in surgical planning, and in neuroscience research. The goal of
this localization is to identify electrically active brain regions, which emit signals measured
by EEG and MEG. There are a number of popular localization methods (Hämäläinen et al.,
1993) most of which assume a dipolar source. The approach taken by most methods to solve
the dipole source localization problem is:
• Calculate the sensor activations $B_c(x)$ for dipole parameters $x$ through a forward model.
• Calculate the cost function $c(x)$, a measure of the difference between the measured
sensor activations $B_m$ and the calculated sensor activations,
\[ c(x) = |B_c(x) - B_m|^2. \]
• Adjust the dipole parameters x in order to reduce c(x).
• Repeat to convergence.
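The four steps above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure used in this paper: `toy_forward` is a hypothetical stand-in for the forward model (inverse-square falloff to made-up sensor positions), the gradient is estimated by finite differences, and the update is plain gradient descent with step halving rather than LM.

```python
# Hypothetical sensor positions in arbitrary units (NOT the Neuromag-122 layout).
SENSORS = [(1.0, 0.0, 1.0), (-1.0, 0.0, 1.0), (0.0, 1.0, 1.0),
           (0.0, -1.0, 1.0), (0.7, 0.7, 1.2), (-0.7, -0.7, 1.2)]


def toy_forward(x):
    """Toy stand-in for B_c(x): inverse-square falloff from x to each sensor."""
    return [1.0 / sum((s_i - x_i) ** 2 for s_i, x_i in zip(s, x)) for s in SENSORS]


def cost(x, b_meas):
    """c(x) = |B_c(x) - B_m|^2."""
    return sum((bc - bm) ** 2 for bc, bm in zip(toy_forward(x), b_meas))


def num_grad(x, b_meas, h=1e-6):
    """Central finite-difference estimate of the gradient of c at x."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((cost(x_plus, b_meas) - cost(x_minus, b_meas)) / (2 * h))
    return grad


def fit(b_meas, x0, iters=200):
    """Adjust x to reduce c(x); each accepted step strictly lowers the cost."""
    x = list(x0)
    c = cost(x, b_meas)
    for _ in range(iters):
        g = num_grad(x, b_meas)
        step = 1.0
        while step > 1e-12:
            trial = [xi - step * gi for xi, gi in zip(x, g)]
            c_trial = cost(trial, b_meas)
            if c_trial < c:
                x, c = trial, c_trial
                break
            step *= 0.5  # halve the step until the cost goes down
        else:
            break  # no improving step found: treat as converged
    return x, c


# Usage: synthesize noise-free "measurements" and fit from a fixed initial guess.
x_true = [0.1, -0.2, 0.3]
b_meas = toy_forward(x_true)
x_fit, c_fit = fit(b_meas, [0.0, 0.0, 0.0])
```

The result depends on the initial guess `x0`, which is the sensitivity that motivates the rest of the paper.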
Gradient methods, which calculate the gradient $\nabla_x c(x)$ in choosing a change to $x$, seem
superior in terms of accuracy and computational burden (Press et al., 1988).
However, gradient methods require both a differentiable forward model and an initial
guess. As we shall see, the efficiency and accuracy of the most popular gradient method
for this problem, LM (Levenberg, 1944; Marquardt, 1963), depends sensitively on the initial
guess. There is therefore motivation to build faster and more accurate source localizers. This
is particularly important for our real-time MEG brain-computer interface system, as we need
to localize components separated by blind source separation (BSS) in real time.
Since it is easy to create synthetic data consisting of pairs of corresponding dipole
locations and sensor signals, it is tempting to train a universal approximator to solve the
inverse problem directly, i.e. to map sensor signals directly to the dipole location. The
multilayer perceptron (MLP) of Rumelhart et al. (1986) has been popular for this purpose.
MLPs were first used for EEG dipole source localization and presented as feasible source
localizers by Abeyratne et al. (1991), and a MLP structure composed of six separate networks
was later used for EEG dipole localization by Zhang et al. (1998). Kinouchi et al. (1996) first
used MLPs for MEG source localization by training on a noise-free dataset of near-surface
dipoles, and Yuasa et al. (1998) studied the two-dipole case for EEG dipole source localization
while restricting each source dipole to a small region. Hoey et al. (2000) investigated EEG
measurements for both spherical and realistic head models, trained on a randomly generated
noise-free dataset, and presented a comparison between a MLP and an iterative method for
localization with noisy signals at three fixed dipole locations. Sun and Sclabassi (2000)
adapted a MLP to calculate forward EEG solutions for a spheroidal head model from simple
EEG solutions for a spherical head model. Recently, Kamijo et al. (2001) proposed an
integrated approach to EEG dipole source localization in which a MLP trained with noise-
free data is used as an initializer for Powell’s method.
The human skull phantom study of Leahy et al. (1998) shows that the fitted spherical
head model for MEG localization is slightly inferior in accuracy to the realistic head model
numerically calculated by a boundary element method (BEM). In forward calculation, a
spherical head model has some advantages: it is more easily implemented, a forward
calculation through the model is much faster, and the model parameters can be fit to a
subject much more easily. Despite its inferiority in terms of localization accuracy, we use
a spherical head model in this work.‡ We train a MLP to localize dipoles from noisy MEG
measurements for a spherical head, and measure the efficacy of the resulting network under a
variety of conditions. In Section 2 the forward model and noise model are explained in detail.
Section 3 explores the tradeoff between accuracy and computation time in terms of MLP size,
and presents MLP training and learning curves for various datasets. Section 3 continues by
exploring how S/N and other system parameters affect localization accuracy. The effects of
initial guesses on the performance of LM are simulated in Section 4, and Section 5 compares
the performance of LM, MLP and hybrid MLP-start-LM. We conclude that MLPs can serve
as real-time MEG localizers, MLPs should be trained with noise, and the MLP trained with
real brain noise is almost as good an initial guessor for LM as the correct dipole location.
2. Synthetic data
The synthetic data used in our experiments consisted of corresponding pairs of dipole
locations and sensor activations, as generated by our forward model. Given a dipole
location and a set of sensor activations, the minimum error dipole moment can be calculated
analytically (see Section 4). Therefore, although the dipoles used in generating the data set
have both location and moment, we discarded the moment in all the experiments below.
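Because the forward model of Section 2.1 is linear in the dipole moment, this step reduces to linear least squares. The following is a sketch only: the lead-field notation $L(\mathbf{x})$ is introduced here for illustration and is not taken from the paper, whose own derivation is in Section 4.

```latex
\[
B_c(\mathbf{x},\mathbf{Q}) = L(\mathbf{x})\,\mathbf{Q},
\qquad
\hat{\mathbf{Q}}(\mathbf{x})
  = \arg\min_{\mathbf{Q}} \left| L(\mathbf{x})\,\mathbf{Q} - B_m \right|^2
  = L(\mathbf{x})^{+} B_m ,
\]
```

Here $L(\mathbf{x})$ is the $122 \times 3$ matrix whose columns are the sensor activations for unit moments along each axis, and $L^{+}$ is its Moore-Penrose pseudoinverse. The pseudoinverse is written instead of $(L^{T}L)^{-1}L^{T}$ because, in a spherical head, a radially oriented dipole produces no external magnetic field, so $L(\mathbf{x})$ has rank at most two.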
We made two datasets, one for training and the other for testing. Dipoles in the
training and testing sets were drawn uniformly from truncated spherical regions, as shown
in Figure 1. Their moments were drawn uniformly from vectors of strength ≤ 100 nAm. The
corresponding sensor activations were calculated by adding the results of a forward model
and a noise model. To allow the network to interpolate rather than extrapolate, thus improving
performance, the training set used dipoles from the larger region, while the test set contained
only dipoles from the smaller inner region.
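A minimal sketch of this sampling step follows. It makes several assumptions that are not from the paper: the region is modelled as a ball cut by a horizontal plane (one plausible reading of "truncated spherical region"), the default sizes `radius_cm` and `z_min_cm` are placeholders rather than the dimensions in Figure 1, and "uniformly from vectors of strength ≤ 100 nAm" is read as uniform over the ball of radius 100 nAm.

```python
import random


def sample_in_truncated_ball(radius, z_min, rng):
    """Rejection-sample a point uniformly from {p : |p| <= radius, p_z >= z_min}."""
    while True:
        p = [rng.uniform(-radius, radius) for _ in range(3)]
        if sum(c * c for c in p) <= radius ** 2 and p[2] >= z_min:
            return p


def sample_dipole(rng, radius_cm=8.0, z_min_cm=2.0, max_moment_nam=100.0):
    """Draw one (location, moment) pair; all default sizes are hypothetical."""
    location = sample_in_truncated_ball(radius_cm, z_min_cm, rng)
    # z_min = -radius removes the cut, so the moment is uniform over the full ball.
    moment = sample_in_truncated_ball(max_moment_nam, -max_moment_nam, rng)
    return location, moment


# Usage: a small batch of dipoles with a fixed seed for reproducibility.
rng = random.Random(0)
dipoles = [sample_dipole(rng) for _ in range(200)]
```

Using a larger region for training than for testing only requires calling `sample_dipole` with different size arguments for the two sets.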
2.1. Forward model
We used a standard analytic forward model of quasistatic electromagnetic propagation in
a spherical head (Sarvas, 1987; Mosher et al., 1999), with the sensor geometry of a 4D
‡ This work could be easily extended to a more realistic forward model. One can expect that a more complex
forward model leads to more local optima, and can therefore degrade the performance of gradient-based methods.
However, it should not much affect the performance of a trained MLP. For this reason, we would expect a more
sophisticated head model to give a comparatively greater advantage to the MLP-based approach advocated in
this paper. For realistic volume conductors, algorithms like seed-based Simplex are common. Using the MLP to
get an initial guess does not require optimal accuracy, and we expect a spherical approximation for the training
of the MLP to be sufficient for this application.
[Figure 1 image: sagittal and coronal views (axes x, y, z) of the sensor surface and the spherical head model, with the training region and the testing region marked. Dimension labels in the image: 11.26 cm, 11.85 cm, 13.18 cm, 3.05 cm, 10 cm, 2 cm, 3 cm, 9 cm, 11.90 cm.]
Figure 1. Sensor surface and spherical head model. Training and testing regions for a spherical
head model. Diamonds denote sensors.
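Neuroimaging Neuromag-122 gradiometer:
\[
B_s(\mathbf{x},\mathbf{Q}) =
\frac{\left[\mathbf{M}(\mathbf{x},\mathbf{Q};\mathbf{t})\big|_{\mathbf{t}=\mathbf{x}_s^1}
 - \mathbf{M}(\mathbf{x},\mathbf{Q};\mathbf{t})\big|_{\mathbf{t}=\mathbf{x}_s^2}\right]\cdot\mathbf{r}_s}
{|\mathbf{x}_s^1-\mathbf{x}_s^2|},
\qquad s = 1,\ldots,122,
\]
\[
\mathbf{M}(\mathbf{x},\mathbf{Q};\mathbf{x}_s) =
\frac{\mu_0}{4\pi}\,
\frac{F\,\mathbf{Q}\times\mathbf{x} - (\mathbf{Q}\times\mathbf{x}\cdot\mathbf{x}_s)\,\nabla F_x}{F^2},
\]
\[
F(\mathbf{x},\mathbf{x}_s) = d\,\bigl(x_s d + x_s^2 - (\mathbf{x}_s\cdot\mathbf{x})\bigr),
\]
\[
\nabla F_x(\mathbf{x},\mathbf{x}_s) =
\left(\frac{d^2}{x_s} + \frac{\mathbf{d}\cdot\mathbf{x}_s}{d} + 2d + 2x_s\right)\mathbf{x}_s
- \left(d + 2x_s + \frac{\mathbf{d}\cdot\mathbf{x}_s}{d}\right)\mathbf{x},
\]
\[
\mathbf{d} = \mathbf{x}_s - \mathbf{x}, \qquad d = |\mathbf{x}_s - \mathbf{x}|, \qquad x_s = |\mathbf{x}_s|,
\]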
where $\mathbf{x}$ and $\mathbf{Q}$ denote a source dipole location vector and a source dipole moment vector,
respectively. The vectors $\mathbf{x}_s^1$ and $\mathbf{x}_s^2$ denote the positions of the centers of the first and second
coils of the s-th gradiometer sensor, and $\mathbf{r}_s$ denotes the orientation vector of the s-th sensor.
$B_s(\mathbf{x},\mathbf{Q})$ is the sensor activation of the s-th sensor through the forward model, and $\mu_0$ is the
permeability constant of air.
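The forward model above can be transcribed almost term by term. The sketch below is an illustration, not the authors' code. It assumes SI units (metres, ampere-metres, tesla), a head sphere centred at the origin, and hypothetical coil positions in the usage lines. The function names `field` and `gradiometer` are chosen here.

```python
import math

MU0_OVER_4PI = 1e-7  # mu_0 / (4 pi) in T m / A


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def field(x, q, xs):
    """M(x, Q; x_s): field vector at coil position xs for a dipole q located at x."""
    d_vec = [s - p for s, p in zip(xs, x)]
    d = math.sqrt(dot(d_vec, d_vec))
    r = math.sqrt(dot(xs, xs))  # x_s = |x_s|
    f = d * (r * d + r * r - dot(xs, x))
    coef_xs = d * d / r + dot(d_vec, xs) / d + 2 * d + 2 * r
    coef_x = d + 2 * r + dot(d_vec, xs) / d
    grad_f = [coef_xs * s - coef_x * p for s, p in zip(xs, x)]
    q_cross_x = cross(q, x)
    k = dot(q_cross_x, xs)
    return [MU0_OVER_4PI * (f * c - k * g) / f ** 2
            for c, g in zip(q_cross_x, grad_f)]


def gradiometer(x, q, coil1, coil2, orientation):
    """B_s(x, Q): difference of the two coil fields along r_s, per unit baseline."""
    diff = [u - v for u, v in zip(field(x, q, coil1), field(x, q, coil2))]
    baseline = math.sqrt(sum((u - v) ** 2 for u, v in zip(coil1, coil2)))
    return dot(diff, orientation) / baseline


# Usage with made-up geometry: a tangential dipole 5 cm above the sphere centre.
x = [0.0, 0.0, 0.05]
q = [1e-8, 0.0, 0.0]           # 10 nAm along the x axis
coil1 = [0.030, 0.02, 0.10]    # hypothetical coil centres
coil2 = [0.046, 0.02, 0.10]
reading = gradiometer(x, q, coil1, coil2, [0.0, 0.0, 1.0])
```

Two properties make convenient sanity checks for such a transcription: a radial dipole gives exactly zero field, and the radial component of the field equals that of the Biot-Savart field of the dipole alone.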
2.2. Noise model
For single-trial data, the sensors in MEG systems have poor S/N ratios because MEG data
is strongly contaminated not only by intrinsic sensor noise, but also by external fields, fields
generated by various parts of the body (heart, eye muscles, retina), and signals from parts of
the brain not under study. Blind source separation of MEG data can drastically improve the
situation by segregating noise from signal (Vigário et al., 1998; Tang, Pearlmutter, Zibulevsky
and Carter, 2000), and the sensor attenuation vectors of the BSS-separated components can
be well localized to equivalent current dipoles (Tang, Phung, Pearlmutter and Christner, 2000;
Tang et al., 2002). However, the recovered field maps can be quite noisy, and conventional
localization techniques require manual interaction.
In order to compare the performance of various localizers, we need a dataset for which
we know the ground truth, but which contains the sorts of noise encountered in actual
MEG recordings. To this end, we created three noise processes with which to additively
contaminate synthetic sensor readings (Kwon et al., 2000). These are: white Gaussian noise,
correlated noise, and real brain noise. By using a variety of noise models, we achieve a rough
measurement of the robustness of the system to a mismatch between the noise model used
in training and the noise encountered in testing. The white Gaussian noise is generated by
simply drawing a zero mean Gaussian-distributed random number for each sensor. Correlated
noise is made using the method of Lütkenhöner (1994):
• Distribute 871 dipoles uniformly on a spherical surface, with dipole moments drawn
from a zero-mean spherical Gaussian.
• Calculate a sensor activation through the analytic forward model for each dipole for each
sensor and sum over all dipoles at each sensor.
• Scale the resultant sensor activation vector to yield a suitable RMS power.
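The three steps above might be sketched as follows. Several details are assumptions rather than facts from the paper: the dipoles are placed at random (the original may use a regular arrangement), `radius` and `sigma` are placeholder values, and `forward(location, moment, s)` is a hypothetical callable returning the activation of sensor `s`, standing in for the analytic forward model.

```python
import math
import random


def rms(values):
    """Square root of the mean square."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def correlated_noise(forward, n_sensors, target_rms, rng,
                     n_dipoles=871, radius=0.08, sigma=1.0):
    """Sum the fields of many random dipoles on a sphere, then rescale to target_rms."""
    total = [0.0] * n_sensors
    for _ in range(n_dipoles):
        # A normalized Gaussian vector is uniformly distributed on the sphere.
        g = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in g))
        location = [radius * c / norm for c in g]
        # Zero-mean spherical Gaussian moment.
        moment = [rng.gauss(0.0, sigma) for _ in range(3)]
        for s in range(n_sensors):
            total[s] += forward(location, moment, s)
    scale = target_rms / rms(total)
    return [scale * v for v in total]


def toy_forward(location, moment, s):
    """Hypothetical linear stand-in for the real forward model (illustration only)."""
    return moment[s % 3] * (1.0 + location[0])


# Usage: six toy sensors, noise scaled to an RMS of 75 (arbitrary units).
noise = correlated_noise(toy_forward, 6, 75.0, random.Random(1))
```

Because every sensor sees the same set of sources, the resulting noise is correlated across sensors, unlike the white Gaussian noise drawn independently per sensor.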
Real brain noise was taken from MEG recordings during periods in which the brain region
of interest in the experiment was quiescent. These signals were not averaged. The real brain
noise has an RMS power $P_n$ of roughly 50–100 fT/cm. We measured the S/N ratio of a dataset
using the ratio of the powers in the signal and the noise: S/N (in dB) $= 10 \log_{10}(P_s/P_n)$, where
$P_s$ is the RMS (square root of mean square) of the sensor readings from the dipole and $P_n$ is
the RMS power of the sensor readings from the noise.
The noisy datasets were made by adding noise to synthetic sensor activations generated
by the forward model. Exemplars whose resulting S/N ratio was under 0 dB were rejected.
Real brain noise taken from MEG recordings was added without scaling, while the white
Gaussian noise and the correlated noise were scaled to make the RMS power of the noise
equal to that of the real brain noise.
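The S/N definition, the noise scaling and the 0 dB rejection rule can be illustrated together. This is a minimal sketch with hypothetical function names and toy numbers. It follows the definition as written in the text, $10 \log_{10}$ of the ratio of RMS values.

```python
import math


def rms(values):
    """Square root of the mean square."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def snr_db(signal, noise):
    """S/N in dB as defined in the text: 10 log10(P_s / P_n), with RMS values."""
    return 10.0 * math.log10(rms(signal) / rms(noise))


def scale_to_rms(noise, target_rms):
    """Rescale synthetic noise so that its RMS equals target_rms."""
    factor = target_rms / rms(noise)
    return [factor * n for n in noise]


def contaminate(signal, noise, min_snr_db=0.0):
    """Return signal + noise, or None when the exemplar's S/N is below min_snr_db."""
    if snr_db(signal, noise) < min_snr_db:
        return None
    return [s + n for s, n in zip(signal, noise)]


# Usage with toy four-sensor vectors (arbitrary units).
signal = [3.0, -3.0, 3.0, -3.0]
noise = [1.0, -1.0, 1.0, -1.0]
kept = contaminate(signal, noise)      # S/N is above 0 dB, so the exemplar is kept
rejected = contaminate(noise, signal)  # roles swapped: S/N is below 0 dB
```

Real brain noise would be passed to `contaminate` unchanged, while white and correlated noise would first go through `scale_to_rms` with the RMS of the real brain noise as the target.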
3. Multilayer Perceptron
It is well known that a MLP with one or more hidden layers is a universal approximator
(Hornik et al., 1989). As in Hoey et al. (2000), our experiments showed that a MLP with two