Fast robust MEG source localization using MLPs
Sung Chan Jun, Barak A. Pearlmutter, and Guido Nolte
Department of Computer Science, University of New Mexico, Albuquerque, NM 87131, USA
Abstract
Source localization from MEG data in real time requires algorithms which are robust, fully automatic, and very fast. We
present two neural network systems which are able to localize a single dipole to reasonable accuracy within a fraction of
a millisecond, even when the signals are contaminated by considerable noise. The first network is a multilayer perceptron
(MLP) which takes the sensor measurements as inputs, uses two hidden layers, and outputs source location in Cartesian
coordinates. After training with random dipolar sources contaminated by real noise, localization of a single dipole could
be performed within 300 microseconds on an 800 MHz Athlon workstation, with an average localization error of 1.15 cm.
To improve the accuracy to 0.28 cm, one can apply a few iterations of conventional Levenberg-Marquardt (LM) mini-
mization using the MLP output as the initial guess. The combined method is about twenty times faster than multistart LM
localization with comparable accuracy. In a second network with only one hidden layer, the outputs were the amplitudes
of 193 evenly distributed Gaussian functions holding a soft distributed representation of the dipole location. We trained
this network on dipolar sources with real noise, and externally converted the network’s output into an explicit Cartesian
coordinate representation of the dipole location. This new network had an improved localization accuracy of 0.87 cm,
while localization time was lengthened to about 800 microseconds.
1 Introduction
There are a number of popular localization methods [1],
most of which assume a dipolar source. Among them,
the multilayer perceptron (MLP) [2] has been popular for
building fast robust dipole localizers. In particular, fast
dipole localizers are important in brain-computer inter-
face systems. Since MLPs were first used for EEG dipole
source localization and presented as feasible source local-
izers by Abeyratne et al. [3], various approaches using
MLPs to localize sources of EEG or MEG signals have
been attempted. Related works may be found in [4, 5] and
references therein.
In this study we propose two MLPs which are able to
localize a single dipole within a millisecond, to reasonable
accuracy, from MEG signals contaminated by noise. The
first network is a conventional Cartesian-MLP [5] which
takes the sensor measurements as inputs, uses two hidden
layers, and outputs the source location in Cartesian co-
ordinates. The second network is a novel Soft-MLP with
only one hidden layer, whose outputs are the amplitudes of
evenly distributed Gaussian functions holding a soft dis-
tributed representation of the dipole location. An exter-
nal decoder converts the network’s output into an explicit
Cartesian coordinate representation.
We use an analytical forward model of quasi-static elec-
tromagnetic propagation through a spherical head to map
randomly chosen dipoles to sensor activities, and train
MLPs to invert this mapping in the presence of real brain
noise. A performance comparison of the Cartesian-MLP,
the Soft-MLP, and a hybrid method (LM initialized by an
MLP) is presented.
2 Methods
2.1 Data
Our synthetic data consisted of corresponding pairs of
dipole locations and sensor activations, as generated by
our forward model. Given a dipole location and a set of
sensor activations, the minimum error dipole moment can
be calculated analytically. Therefore, we discarded the
moment in all the experiments below.
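Because the sensor activations are linear in the dipole moment once the location is fixed, the minimum-error moment is an ordinary linear least-squares fit. The following is a minimal sketch of that step, not the authors' code. The function name `fit_moment`, the `gain` matrix, and the toy numbers are all hypothetical. Each column of `gain` is assumed to hold the sensor pattern of one unit basis moment; for a spherical head only the two tangential directions produce an external field, so the example uses two columns.

```python
def fit_moment(gain, b):
    """Least-squares moment q minimizing ||gain @ q - b||^2.

    gain: list of n rows with k entries each (hypothetical gain matrix;
          column j is the sensor pattern of the j-th unit basis moment).
    b:    list of n sensor readings.
    """
    n, k = len(gain), len(gain[0])
    # Normal equations (G^T G) q = G^T b as an augmented k x (k+1) matrix.
    aug = [[sum(gain[r][i] * gain[r][j] for r in range(n)) for j in range(k)]
           + [sum(gain[r][i] * b[r] for r in range(n))] for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            f = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= f * aug[col][c]
    # Back substitution.
    q = [0.0] * k
    for i in reversed(range(k)):
        s = aug[i][k] - sum(aug[i][j] * q[j] for j in range(i + 1, k))
        q[i] = s / aug[i][i]
    return q


# Toy example: 5 sensors, 2 tangential basis moments (made-up numbers, in nAm).
gain = [[1.0, 0.2], [0.5, -1.0], [-0.3, 0.8], [0.9, 0.4], [-0.7, -0.6]]
q_true = [30.0, -45.0]
b = [row[0] * q_true[0] + row[1] * q_true[1] for row in gain]
q_hat = fit_moment(gain, b)
```

With noise-free readings the fit recovers `q_true` up to rounding; with noisy readings it returns the moment that minimizes the squared residual at the given location.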
We made two datasets, one for training and the other for
testing. Dipoles in the training and testing sets were drawn
uniformly from truncated spherical regions, as shown in
Figure 1. The dipole moments were drawn uniformly
from vectors of strength ≤100 nAm. The corresponding
sensor activations were the results of a forward model plus
a noise model. To allow the network to interpolate rather
than extrapolate, thus improving performance, the training
set used dipoles from the larger region, while the test set
contained only dipoles from the smaller inner region. We
used a standard analytic forward model of quasistatic elec-
tromagnetic propagation in a spherical head [1, 6], with
the sensor geometry of a 4D Neuroimaging Neuromag-
122 gradiometer.
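One widely used closed form for the magnetic field of a current dipole in a spherically symmetric conductor is Sarvas's formula. The sketch below evaluates it at a single point outside the sphere. That the forward model used here takes exactly this form is an assumption, and the sketch returns only the field vector, not the planar-gradiometer outputs of the Neuromag-122. The helper names and the example coordinates are hypothetical.

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability in T*m/A


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def norm(u):
    return math.sqrt(dot(u, u))


def sarvas_field(r, r0, q):
    """Field B (tesla) at point r for a dipole q (A*m) located at r0.

    Positions are in meters relative to the sphere center; r must lie
    outside the conductor and r0 inside it.
    """
    a_vec = [ri - r0i for ri, r0i in zip(r, r0)]
    a, rn = norm(a_vec), norm(r)
    ar = dot(a_vec, r)
    F = a * (rn * a + rn ** 2 - dot(r0, r))
    # grad F = c_r * r - c_r0 * r0
    c_r = a ** 2 / rn + ar / a + 2 * a + 2 * rn
    c_r0 = a + 2 * rn + ar / a
    grad_F = [c_r * ri - c_r0 * r0i for ri, r0i in zip(r, r0)]
    q_x_r0 = cross(q, r0)
    s = dot(q_x_r0, r)
    k = MU0 / (4 * math.pi * F ** 2)
    return [k * (F * qi - s * gi) for qi, gi in zip(q_x_r0, grad_F)]


# Hypothetical geometry: dipole 6.3 cm from the center, sensor point about 11 cm out.
r0 = [0.0, 0.02, 0.06]
r = [0.03, 0.04, 0.10]
q_tan = [1e-8, 0.0, 0.0]                       # 10 nAm, tangential (q . r0 = 0)
q_rad = [1e-8 * c / norm(r0) for c in r0]      # 10 nAm, radial
B_tan = sarvas_field(r, r0, q_tan)
B_rad = sarvas_field(r, r0, q_rad)
```

A radially oriented dipole produces no external field in this model, so `B_rad` vanishes up to rounding while `B_tan` does not.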
In order to properly compare the performance of local-
izers, we need a dataset for which we know the ground
truth, but which contains the sorts of noise encountered
in actual MEG recordings [5]. The real brain noise was
taken from MEG recordings during periods in which the
brain region of interest was quiescent. These signals were
not averaged. The real brain noise has an RMS of roughly
Pn = 50–100 fT/cm. We measured the S/N ratio of a
dataset using the ratio of the powers in the signal and
the noise: S/N (in dB) = 10 log10(Ps/Pn), where Ps is the
RMS (square root of mean square) of the sensor readings
from the dipole and Pn is the RMS of the sensor readings
from the noise. The dataset was made by adding real brain
noise (without scaling) to synthetic sensor activations
generated by the forward model. Exemplars whose resulting
S/N ratio was under 0 dB were rejected.
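The exemplar construction can be sketched as follows. The code implements the S/N definition exactly as written in the text (10 log10 of the ratio of the two RMS values) and rejects exemplars under 0 dB. The function names and the sample readings are hypothetical, and real sensor vectors would have 122 entries.

```python
import math


def rms(values):
    """Square root of the mean square of a list of readings."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def snr_db(signal, noise):
    """S/N in dB as stated in the text: 10 * log10(Ps / Pn), with Ps, Pn RMS values."""
    return 10.0 * math.log10(rms(signal) / rms(noise))


def make_exemplar(signal, noise, min_snr_db=0.0):
    """Add unscaled noise to the clean signal; return None if S/N is too low."""
    if snr_db(signal, noise) < min_snr_db:
        return None
    return [s + n for s, n in zip(signal, noise)]


# Made-up readings in fT/cm for a 4-sensor toy case.
clean = [200.0, -100.0, 150.0, -50.0]
noise = [60.0, -80.0, 70.0, -90.0]
kept = make_exemplar(clean, noise)                          # S/N above 0 dB: kept
dropped = make_exemplar([v * 0.1 for v in clean], noise)    # S/N below 0 dB: rejected
```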
2.2 MLP structures
The Cartesian-MLP and the Soft-MLP charged with ap-
proximating the inverse mapping had an input layer of 122
units, one for each sensor. The Cartesian-MLP had two
hidden layers with N1 and N2 units and an output layer of
three units representing the dipole location (x, y, z). The
Soft-MLP consists of one hidden layer with N units, and
an output layer of 193 units representing the amplitudes
of 193 uniformly distributed three-dimensional Gaussian
functions in the training region of the head model [2].
These Gaussian functions are defined by
$$G_i(x) = \exp\left(-\frac{|x - x_i|^2}{2\sigma^2}\right) \quad \text{for } i = 1, \ldots, 193$$
where x_i is the center of the i-th Gaussian function and σ is a fixed
width parameter. The centers are homogeneously distributed, with
adjacent centers 3 cm apart, and the width parameter is σ = 1.8 cm.
The decoding
strategy to convert the activations into a Cartesian coordi-
nate representation was:
• For the 193 output values (a_i ≈ G_i(x)), find the
index i* of maximum amplitude, i* = arg max_i a_i.
• For some neighborhood I of x_{i*}, estimate the dipole
location by linear interpolation,
$$\hat{x} = \frac{\sum_{x_i \in I} a_i x_i}{\sum_{x_i \in I} a_i}.$$
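The soft code and its decoder can be sketched as below. The 3 cm spacing and σ = 1.8 cm come from the text; everything else is an assumption made for illustration. In particular, the lattice built by `make_centers` does not reproduce the paper's 193 centers, and the neighborhood I is taken to be the centers within one lattice step of the peak along every axis, a choice the text leaves open.

```python
import itertools
import math

SPACING = 3.0  # cm between adjacent centers (from the text)
SIGMA = 1.8    # cm, Gaussian width parameter (from the text)


def make_centers(radius):
    """Cubic lattice of centers clipped to a ball (radius is a hypothetical choice)."""
    n = int(radius // SPACING)
    ticks = [SPACING * k for k in range(-n, n + 1)]
    return [c for c in itertools.product(ticks, repeat=3)
            if math.dist(c, (0.0, 0.0, 0.0)) <= radius]


def encode(x, centers):
    """Target amplitudes a_i = G_i(x) for a dipole at location x."""
    return [math.exp(-math.dist(x, c) ** 2 / (2 * SIGMA ** 2)) for c in centers]


def decode(a, centers):
    """Peak picking followed by an amplitude-weighted average of nearby centers."""
    i_star = max(range(len(a)), key=a.__getitem__)
    peak = centers[i_star]
    # Assumed neighborhood I: centers within one lattice step on every axis.
    idx = [i for i, c in enumerate(centers)
           if max(abs(ci - pi) for ci, pi in zip(c, peak)) <= SPACING + 1e-9]
    total = sum(a[i] for i in idx)
    return tuple(sum(a[i] * centers[i][d] for i in idx) / total for d in range(3))


centers = make_centers(9.0)
x = (1.0, -0.5, 0.7)                      # hypothetical dipole location in cm
x_hat = decode(encode(x, centers), centers)
error_cm = math.dist(x, x_hat)
```

Even with exact amplitudes the decoder is only approximate, since the weighted average over a finite neighborhood is slightly biased toward the peak center; in this toy case the error stays well under the grid spacing.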
Output units had linear activation functions, while hidden
units used the hyperbolic tangent activation function to
accelerate training [7]. All units had bias inputs, adjacent
layers were fully connected, and there were no cut-through
connections, as shown in Figure 2. The 122 MEG sensor
activations were scaled so that their RMS value was 0.5.
The network weights were initialized with uniformly
distributed random values in ±0.1. Backpropagation was used
to calculate the gradient, and online stochastic gradient
descent was used for the optimization. No momentum was used,
and the learning rate was chosen empirically.
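The training setup described above can be illustrated with a small pure-Python sketch: tanh hidden units, linear outputs, bias inputs, weights drawn uniformly from ±0.1, inputs rescaled to an RMS of 0.5, and one online gradient step per exemplar without momentum. The squared-error loss, the learning rate of 0.01, the random input, and the target values are assumptions chosen only for illustration; they are not taken from the paper.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Weights uniform in [-0.1, 0.1]; the last entry of each row is the bias weight."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]


def scale_rms(b, target=0.5):
    """Rescale a sensor vector so that its RMS value equals target."""
    r = math.sqrt(sum(v * v for v in b) / len(b))
    return [v * target / r for v in b]


def forward(layers, x):
    """Activations of every layer: tanh for hidden layers, linear for the output."""
    acts = [x]
    for k, W in enumerate(layers):
        inp = acts[-1] + [1.0]  # append the bias input
        s = [sum(w * v for w, v in zip(row, inp)) for row in W]
        acts.append(s if k == len(layers) - 1 else [math.tanh(v) for v in s])
    return acts


def sgd_step(layers, x, target, lr):
    """One online backprop update on 0.5*||y - target||^2; returns the pre-update loss."""
    acts = forward(layers, x)
    delta = [y - t for y, t in zip(acts[-1], target)]  # linear output units
    loss = 0.5 * sum(d * d for d in delta)
    for k in reversed(range(len(layers))):
        W, inp = layers[k], acts[k] + [1.0]
        if k > 0:
            # Propagate the error through W (before updating it) and the tanh below.
            back = [sum(W[j][i] * delta[j] for j in range(len(W))) * (1 - acts[k][i] ** 2)
                    for i in range(len(acts[k]))]
        for j, row in enumerate(W):
            for i in range(len(row)):
                row[i] -= lr * delta[j] * inp[i]
        if k > 0:
            delta = back
    return loss


rng = random.Random(0)
layers = [init_layer(122, 60, rng), init_layer(60, 30, rng), init_layer(30, 3, rng)]
x = scale_rms([rng.gauss(0.0, 1.0) for _ in range(122)])  # stand-in for sensor data
target = [0.3, -0.2, 0.5]                                  # hypothetical scaled location
losses = [sgd_step(layers, x, target, lr=0.01) for _ in range(50)]
```

Repeating the step on a single exemplar only shows that the update reduces the error; actual training would cycle through all exemplars in each epoch.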
To empirically determine the number of hidden units,
we trained both MLPs with various numbers of hidden
units and measured the tradeoff between approximation
accuracy and computation time. Finally, we chose
122–60–30–3 and 122–80–193 as the Cartesian-MLP size
and the Soft-MLP size, respectively.
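As a rough proxy for the cost of a forward pass, one can count the weights of each chosen architecture, assuming full connectivity between adjacent layers and one bias input per layer as described above. The helper name `n_weights` is hypothetical.

```python
def n_weights(sizes):
    """Number of weights in a fully connected MLP with one bias input per layer."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(sizes, sizes[1:]))


cartesian = n_weights([122, 60, 30, 3])  # 123*60 + 61*30 + 31*3 = 9303
soft = n_weights([122, 80, 193])         # 123*80 + 81*193 = 25473
ratio = soft / cartesian                 # about 2.74
```

The Soft-MLP has roughly 2.7 times as many weights, which is of the same order as the ratio of the localization times quoted in the abstract (about 800 versus 300 microseconds); the decoder adds a further cost that this count ignores.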
Figure 1: Training and testing regions for a spherical head
model. [Diagram labels: Spherical Head Model, Training Region,
Testing Region; marked dimensions: 10 cm, 9 cm, 3 cm, 2 cm.]
Figure 2: The Cartesian-MLP and Soft-MLP structures trained on
the training dataset. [Diagram panels: Cartesian-MLP structure
(MEG signals B1, ..., B122 plus bias units; layers of 122, 60, 30,
and 3 units; output dipole location x, y, z); Soft-MLP structure
(MEG signals B1, ..., B122 plus bias units; layers of 122, 80, and
193 units, followed by a decoder giving the dipole location);
Activation Functions Model.]
3 Results
We trained both MLPs on the same training dataset of 20,000
exemplars contaminated by real brain noise, and tested them on
4,500 MEG signal patterns, also contaminated by real brain noise.
Training ran for up to 500 epochs, which took four to twelve hours
per network on an 800 MHz AMD Athlon. Each of the MLP localizers
was also used as an LM initializer, giving the MLP-start-LM
localizer. For comparison with the MLPs and their hybrid systems,
LM was also started with n randomly chosen restarts, which is
called “random-n-start-LM.” We checked how many restarts of LM
were needed to match the accuracy of the hybrid systems; over 20
restarts were required.

Figure 3: Mean localization error versus S/N for multistart
LM, the Cartesian-MLP, the Soft-MLP and their hybrid
methods. [Plot axes: S/N (dB), 0 to 10, against mean localization
error (cm), 0 to 2.5. Curves: Cartesian-MLP, Soft-MLP,
Cartesian-MLP-start-LM, Soft-MLP-start-LM, random-20-start-LM.]

algorithm                 localization time (ms)   mean localization error (cm)
random-20-start-LM                2175.0                      0.31
Cartesian-MLP                        0.3                      1.15
Soft-MLP                             0.8                      0.87
Cartesian-MLP-start-LM              36.0                      0.28
Soft-MLP-start-LM                   31.0                      0.30

Table 1: Comparison of performance on the real brain noise
test set of LM, the trained MLPs, and the hybrid systems for the
Cartesian-MLP and Soft-MLP structures. Each number is an
average over 4,500 localizations.
The performance of the two MLP structures, their hybrid
MLP-start-LM localizers, and random-20-start-LM is shown
as a function of S/N in Figure 3. Overall, the Soft-MLP
shows better localization performance than the Cartesian-MLP.
At high S/N ratios the Soft-MLP-start-LM is slightly more
accurate than the Cartesian-MLP-start-LM, but it is less
accurate at low S/N ratios.
A grand summary, averaged across the various S/N conditions,
is shown in Table 1. Relative to the Cartesian-MLP, the
Soft-MLP reduces the localization error from 1.15 cm to
0.87 cm, while the computation time increases from 0.3 ms
to 0.8 ms. The hybrid method using the Soft-MLP is slightly
faster than that using the Cartesian-MLP (31 ms versus 36 ms),
while both hybrid methods are comparable in localization
accuracy. It is interesting to note that the initial guess of
the Soft-MLP is closer to the optimum, so the LM optimization
needs fewer iterations.
4 Discussion
We presented the Cartesian-MLP and Soft-MLP structures,
which are able to localize a single dipole within a
millisecond, together with their hybrid methods, and
compared their performance. Experiments show that the
Cartesian-MLP and Soft-MLP are feasible fast robust
dipole source localizers. The Soft-MLP is more accurate
than the Cartesian-MLP, at the expense of slightly greater
computation time.
The hybrid methods, MLP-start-LM, are real-time localizers
which dramatically improve the localization accuracy beyond
that of the original MLP. MLP-start-LM using the Soft-MLP is
slightly faster than, and comparable in accuracy to,
MLP-start-LM using the Cartesian-MLP.
Acknowledgements
Supported in part by NSF CAREER award 97-02-311, the
National Foundation for Functional Brain Imaging, and a
gift from the NEC Research Institute.
References
[1] M. Hämäläinen, R. Hari, R. J. Ilmoniemi, J. Knuutila,
and O. V. Lounasmaa, “Magnetoencephalography —
theory, instrumentation, and applications to noninva-
sive studies of the working human brain,” Reviews of
Modern Physics, vol. 65, pp. 413–497, 1993.
[2] D. E. Rumelhart, G. E. Hinton, and R. J. Williams,
“Learning internal representations by error propaga-
tion,” in Parallel distributed processing: Explorations
in the microstructure of cognition, Volume 1: Foundations
(D. E. Rumelhart, J. L. McClelland, and the PDP Research
Group, eds.), MIT Press, 1986.
[3] U. R. Abeyratne, Y. Kinouchi, H. Oki, J. Okada,
F. Shichijo, and K. Matsumoto, “Artificial neural net-
works for source localization in the human brain,”
Brain Topography, vol. 4, pp. 3–21, 1991.
[4] G. Van Hoey, J. De Clercq, B. Vanrumste, R. Van
de Walle, I. Lemahieu, M. D’Havé, and P. Boon,
“EEG dipole source localization using artificial neural
networks,” Phys. Med. Biol., vol. 45, pp. 997–1011,
2000.
[5] S. C. Jun, B. A. Pearlmutter, and G. Nolte, “Fast accu-
rate MEG source localization using MLP trained with
realistic noise,” 2002. In Submission.
[6] J. C. Mosher, R. M. Leahy, and P. S. Lewis, “EEG and
MEG: Forward solutions for inverse methods,” IEEE
Transactions on Biomedical Engineering, vol. 46,
pp. 245–259, 1999.
[7] Y. LeCun, I. Kanter, and S. A. Solla, “Second order
properties of error surfaces: Learning time and gen-
eralization,” in Advances in Neural Information Pro-
cessing Systems 3, pp. 918–924, Morgan Kaufmann,
1991.