Research Article
Spatial and Time Domain Feature of ERP Speller System
Extracted via Convolutional Neural Network
Jaehong Yoon,1 Jungnyun Lee,2 and Mincheol Whang2
1Department of Biomedical Engineering, Duke University, Durham, NC 27708, USA
2Department of Digital Media Engineering, Sangmyung University, Seoul 03016, Republic of Korea
Correspondence should be addressed to Jaehong Yoon; jaehong.ryan.yoon@gmail.com
Received 6 January 2018; Revised 5 March 2018; Accepted 1 April 2018; Published 15 May 2018
Academic Editor: Victor H. C. de Albuquerque
Copyright © 2018 Jaehong Yoon et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The features of event-related potential (ERP) have not been completely understood, and the illiteracy problem remains unsolved. To this end, the P300 peak has been used as the feature of ERP in most brain–computer interface applications, but subjects who do not show such a peak are common. Recent development of the convolutional neural network provides a way to analyze the spatial and temporal features of ERP. Here, we train a convolutional neural network with 2 convolutional layers whose feature maps represent the spatial and temporal features of the event-related potential. We have found that nonilliterate subjects' ERP shows high correlation between the occipital and parietal lobes, whereas illiterate subjects only show correlation between neural activities from the frontal and central lobes. The nonilliterates showed peaks at P300, P500, and P700, whereas illiterates mostly showed peaks around P700. P700 was strong in both groups. We found that the P700 peak may be the key feature of ERP, as it appears in both illiterate and nonilliterate subjects.
1. Introduction
A brain–computer interface (BCI) is a system which provides a communication method by utilizing biophysiological signals []. A BCI system enables users to communicate with the external world through measurements of biological signals and mostly does not require voluntary muscle movement. The system has been utilized as a means of communication for severe locked-in syndrome (LIS) patients who lack motor ability, such as amyotrophic lateral sclerosis (ALS) and Guillain–Barré syndrome patients [–]. Of the many biophysiological signals, electroencephalography (EEG) has been the most widely used in the BCI field for its ease and low cost of measurement [, ].
Among the different applications of BCI, the event-related potential (ERP) based speller system has been one of the most widely used paradigms. The system was pioneered by Farwell and Donchin [] in 1988, utilizing the oddball paradigm to induce a visual evoked potential (VEP), especially the P300 response. However, there are still illiteracy problems associated with the ERP speller system [, ]. There have been reports of ERP features other than P300 [, ] which may be a key feature for identifying illiterates.
One of the most prominent classification methods for the ERP system is the support vector machine (SVM) [–]. SVM is mathematically simple and, with sufficient knowledge of the feature matrix, the experimenter can tailor the kernel to the target problem. Unfortunately, the kernel of SVM is sensitive to overfitting []. As EEG is measured from multiple electrodes [–], the feature matrix can have high dimension with possible duplicates, which increases the possibility of overfitting. As most ERP system paradigms depend on the P300 peak, the information (peak magnitude and latency) from each electrode should be similar. Moreover, it is hard to extract the temporal and spatial information of EEG with a single kernel. Although multiple kernel learning (MKL) has been suggested [], it is hard to extract intuition about the given problem through the method.
Recent development of deep learning provides an alternative approach. The convolutional neural network (CNN) can extract the feature from a given feature vector by using convolution. When an optimal filter is applied, the convolution will magnify the feature of interest and suppress the others []. CNN has been used in pattern recognition, especially in image recognition and speech recognition, as it preserves topological information within the extracted feature [–].
Hindawi
Computational Intelligence and Neuroscience
Volume 2018, Article ID 6058065, 11 pages
https://doi.org/10.1155/2018/6058065
Therefore, data with sequential or topological information can be recognized more efficiently, as CNN enables extracting both temporal and spatial information from the raw data. As the ERP shows a sequence of rises and falls in response to visual stimuli, a pattern recognition technique such as CNN can be applied. Moreover, the convolution kernel of CNN can be used as a tool for interpreting the spatial correlation among EEG electrodes.
In this paper, we explore the performance of CNN on ERP data to identify the key features that distinguish illiterates of the ERP speller system. The convolution kernels of the trained model are explored to analyze the spatial correlation between cortices and the pattern within the ERP of each electrode. The subjects were grouped as either strong (nonilliterate) or weak (illiterate) depending on the clarity of their ERP signals. Results of the two groups were compared to analyze the difference in features.
2. Methods
2.1. ERP Speller Design. The icons shown in Figure 2 were used as visual stimuli for the speller system of this paper. A rapid serial visual presentation (RSVP) panel design was adopted for the speller system to avoid the gaze effect. During the experiment, the icons appeared at the center of the monitor in a random sequence []. The oddball paradigm was implemented by presenting the target icon with distractors in a random sequence []. Each icon appeared  times per trial. The interstimulus interval (ISI) between icon appearances was set to  ms.
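The randomized oddball sequence described above can be sketched in a few lines. The six icon labels follow Figure 2; the per-trial repetition count `repeats` is a hypothetical parameter, since the exact number is elided here.

```python
import random

# Icon labels taken from Figure 2 (target and distractors are treated alike
# when building the presentation order).
ICONS = ["turn_on", "vol_up", "ch_up", "turn_off", "vol_down", "ch_down"]

def rsvp_sequence(icons, repeats, rng=None):
    """Return a randomized RSVP presentation order in which every icon
    appears `repeats` times, as the oddball design requires."""
    rng = rng or random.Random()
    order = [icon for icon in icons for _ in range(repeats)]
    rng.shuffle(order)
    return order

# Example: one trial with a hypothetical repetition count of 10 per icon.
seq = rsvp_sequence(ICONS, repeats=10, rng=random.Random(0))
```

Because the shuffle covers the whole trial at once, every icon occurs equally often while the target's position stays unpredictable, which is what makes the rare-target counting task an oddball paradigm.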
2.2. Data Acquisition. For this paper, 33 subjects ( female,  male) participated in the experiment. The subjects' ages ranged from  to  (mean = ., std = ±.). During the experiment, subjects were asked to sit upright on a chair and instructed to keep still. No straps or ties were attached. Subjects were asked to self-report any inconvenience that might disturb their concentration.
Each trial was initiated with an acoustic cue announcing the target of the given trial in the subjects' mother tongue (Korean).  seconds after the acoustic cue, the icons appeared on the monitor according to the RSVP design in random sequence. The subjects were instructed to mentally count the target occurrences during each trial (Figure 1(b)). Each session consisted of 12 trials; each icon was selected as the target twice per session in random sequence.
All subjects were naive, so a – minute preexperiment session was given to get the subjects used to the procedure. The subjects were asked to self-report when they felt confident with the procedure. After the preexperiment session ended, the measurements of EEG were made. During the experiment, one training session and one online session were conducted as a pair. To minimize the subjects' stress and fatigue, a -minute break was given between the training and online sessions. Each subject completed a minimum of  pairs of training and online sessions, and no subject participated in more than  pairs of sessions.
EEG was collected by a B-Alert X headset from Advanced Brain Monitoring (ABM) with a sampling rate of  Hz. The recorded EEG electrodes followed the international 10/20 system [] as shown in Figure 1(a). All experiments were held in accordance with the Declaration of Helsinki, and the protocol was approved by the Ethics Committee of Sangmyung University.
2.3. Convolutional Neural Network. The architecture of the CNN for this paper is shown in Figure 1(c). The CNN consisted of 2 convolutional layers, 2 max-pooling layers, and 2 fully connected layers. The rectified linear unit (ReLU) function was applied as the activation function for each convolutional layer since its performance has been proven elsewhere []. A softmax function was applied to the last layer to regularize the final output to be between 0 and 1. The output of the CNN was a vector of 2 elements, where the elements represented the scores of target and nontarget, respectively.
The CNN was designed to perform both spatial and temporal filtering. The feature maps of each layer were used to assess the correlation between adjacent electrodes and the temporal feature of the target ERP. In the 1st convolutional layer (L1), a filter of size 6 × 20 was applied to extract the correlation of EEG recorded at adjacent electrodes. The row number of the filter was set to 6, as 3 electrodes were placed on each lobe (except for the occipital lobe, where two electrodes were placed). This filter size enables analyzing the correlation of all 6 electrodes from two adjacent lobes. For analysis of the temporal feature of the feature map from L1 among different lobes, a filter of size 1 × 12 was applied in the 2nd convolutional layer (L2), whose window size was approximately  ms in time scale.
To reduce the receptive field size for ease of calculation and to prevent overfitting, max-pooling layers (M1 and M2) were inserted after each convolutional layer [, ]. The max-pooling layers downsample the feature map by applying a sliding window without overlap; as the name implies, the maximum value within each window is extracted. As max-pooling introduces a downsampling effect, a generalization of the feature map was achieved, which prevented overfitting of the model. The sliding window sizes of M1 and M2 were 2 × 2 and 1 × 10, respectively.
To further reduce the possibility of overfitting while training the model, the drop-out technique was applied to the first fully connected layer (F1). The drop-out technique padded zeros to randomly selected rows in the given feature map. By intentionally discarding part of the data within the feature map, generalization was achieved, which prevented the model from being overfitted to the training data [, ].
The size of the input matrix fed into the CNN was 14 × 300, where each row corresponded to the EEG collected from each electrode in Figure 1.
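A minimal sketch of assembling that input, assuming the 14 rows are the 11 recorded electrodes plus the first 3 (frontal) rows appended at the end, as the Figure 4 caption describes; this wrap-around lets a spatial filter sliding over rows also pair the occipital electrodes with the frontal ones.

```python
# 11 recorded electrode rows x 300 samples (placeholder zeros stand in for
# the actual averaged target-ERP values).
N_SAMPLES = 300
recorded = [[0.0] * N_SAMPLES for _ in range(11)]

# Copy the first 3 (frontal) rows and paste them after the last row,
# yielding the 14 x 300 matrix fed to the CNN.
input_matrix = recorded + [row[:] for row in recorded[:3]]
```

The count of three copied rows is an inference from 14 = 11 + 3, not stated explicitly in this section.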
The CNN architecture was implemented in Python via TensorFlow [, ]. The Adam optimizer, which adapts the learning rate to allow larger step sizes, was used to train the CNN. , iterations were conducted to train the model on each subject's data.
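The feature-map sizes reported in Figure 1(c) can be checked with a few lines of shape arithmetic, assuming 'same'-padded convolutions (spatial size unchanged) and non-overlapping max-pooling (each dimension divided by the window), both consistent with the sizes shown in the figure:

```python
def pooled(shape, window):
    """Non-overlapping max-pooling divides each dimension by the window."""
    rows, cols = shape
    return rows // window[0], cols // window[1]

shape = (14, 300)              # input: 14 electrode rows x 300 samples
# L1: 6 x 20 convolution with 32 filters ('same' padding) -> 14 x 300 x 32
shape = pooled(shape, (2, 2))  # M1: 2 x 2 pooling -> 7 x 150
# L2: 1 x 12 convolution with 62 filters ('same' padding) -> 7 x 150 x 62
shape = pooled(shape, (1, 10)) # M2: 1 x 10 pooling -> 7 x 15
flattened = shape[0] * shape[1] * 62   # input to the fully connected layers
```

The flattened size works out to 6510, matching the fully connected input reported in Figure 1(c).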
2.4. Tie Breaking. Ideally, if the model were perfect, exactly one icon would be identified as the target of a given trial. However, the system identified multiple icons as targets in several trials. At the other extreme, the system failed to identify any target icon in some trials. For each case, a tie-breaking rule was applied as follows.
Computational Intelligence and Neuroscience
(i) Multiple-icon case: when multiple icons were judged to be the target of a given trial by the CNN, the tie-breaking rule was applied to select the target among these candidates. Since the first element of the output vector represents the icon's affiliation to the target-ERP property, the icon with the greatest value of that element was selected as the target of the trial.
(ii) No-target case: when the system failed to find an association between the ERP of any icon and the target-ERP property, that is, no icon was identified as the target, the same rule as in the multiple-icon case was applied to select the target for the given trial. In this case, the first element of the output vector of all icons was compared, and the icon whose first element was the greatest was selected as the target of the trial.
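Both cases reduce to the same selection rule, which can be sketched as follows (the icon names and scores are hypothetical):

```python
def pick_target(outputs):
    """Tie-breaking rule: `outputs` maps each icon to the CNN's 2-element
    score vector, whose first element scores the target-ERP property.
    Whether zero, one, or several icons were flagged as the target, the
    icon with the greatest first element is selected."""
    return max(outputs, key=lambda icon: outputs[icon][0])

# Hypothetical trial where two icons ("on" and "off") exceed threshold:
scores = {"on": [0.9, 0.1], "off": [0.7, 0.3], "vol_up": [0.2, 0.8]}
chosen = pick_target(scores)
```

The same call resolves the no-target case, since the argmax over first elements is defined regardless of how many icons crossed the decision boundary.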
2.5. Analysis. Both qualitative and quantitative analyses were performed to characterize the filters of each convolutional layer. The subjects were divided into two groups according to the relative strength of their ERP as follows:
(i) ERP detection: if the target icon was detected as positive in a given trial, the ERP was considered detected. The subjects were divided accordingly into either the H or the L (high or low) ERP detection group. The threshold between the H and L groups was %.
(ii) Feature map: feature maps from L1 and L2 were drawn as color maps. As higher weights of a feature map denote higher discriminant power, the color map can qualitatively give insight into how each electrode is correlated and at which time the main peak is formed.
(iii) Statistical analysis: for quantitative analysis of performance, the accuracy, sensitivity, precision, F measure, and ROC were calculated for each subject, and an ANOVA test was held to compare the mean differences. The accuracy is defined as the ratio of the number of correctly identified trials to the total number of trials. With TP, FP, TN, and FN denoting true positives, false positives, true negatives, and false negatives, the classical statistical measurements for quantitative evaluation are as follows:

Sensitivity = TP / (TP + FN),
Precision = TP / (TP + FP),
F1 measure = 2 × Sensitivity × Precision / (Sensitivity + Precision). (1)
(iv) Receiver operating characteristic: the receiver operating characteristic (ROC), which plots the sensitivity against 1 − specificity, is a widely used statistical measurement of the diagnostic ability of a binary classifier. As the CNN of this paper is a binary classifier, the ROC information is provided to compare the performance of the CNN between the H and L groups.
(v) Peak signal-to-noise ratio: the peak signal-to-noise ratio (PSNR) is used as a measurement of the quality of reconstruction for lossy compression codes []. As the performance of a filter depends on how many core features are extracted from the raw ERP, the PSNR of L1's feature map was calculated as a measurement of performance. A greater PSNR shows the presence of a significantly high weight inside the feature map, whereas a lower PSNR indicates that only low weights are present in the given feature map and that the discriminant power of the filter is low.
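The measures in (iii) and (v) can be sketched as follows. The exact PSNR formula is not spelled out in the text, so a common definition (peak power over mean squared deviation, in dB) is assumed here:

```python
import math

def metrics(tp, tn, fp, fn):
    """Classical measures from equation (1), plus the accuracy."""
    sens = tp / (tp + fn)
    prec = tp / (tp + fp)
    f1 = 2 * sens * prec / (sens + prec)
    acc = (tp + tn) / (tp + tn + fp + fn)
    return sens, prec, f1, acc

def psnr(weights, peak):
    """Assumed PSNR form: 10 * log10(peak^2 / MSE), where the MSE is the
    mean squared deviation of the feature-map weights from the peak.
    A high value signals one dominant weight against a flat background."""
    mse = sum((peak - w) ** 2 for w in weights) / len(weights)
    return 10.0 * math.log10(peak ** 2 / mse)
```

Under this definition a feature map whose weights are nearly all far below its peak yields a large PSNR, matching the interpretation given in (v).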
3. Results
3.1. ERP Detection. Of the 33 subjects, 19 were identified as the H group. In Figure 3, the time course of the learning curve and other statistical measurements over the training iterations for an H and an L subject are presented. The learning curve of the L subject shown in Figure 3(a) indicates that although the false negative rate (FN) drops with training iteration, eventually reaching 0, the false positive rate (FP) becomes 1. Although the learning curve shows sharp increases at the st and th iterations, it mostly remains around .. This indicates that the CNN becomes overtrained to positives (targets). Moreover, as the CNN identifies most of the ERP as positive (high FP and low FN), the result indicates that the discriminant feature of the target ERP was not found. On the other hand, both FN and FP of the H subject drop to around 0 and ., respectively. The learning curve saturates around ., indicating no overtraining of the CNN (Figure 3(b)).
The errors shown in Figures 3(c) and 3(d) are defined as follows for the training and online data:

error = 1 − (TP + TN) / (TP + TN + FP + FN). (2)

Although both the H and L subjects show a drop in both training and validation error as training iteration continues, the validation error of the L subject is higher than that of the H subject.
The ROCs of the H and L subjects shown in Figure 3(e) indicate that the performance of the CNN for the H subject is greater than that for the L subject.
3.2. Spatial and Temporal Features. The feature maps of the convolutional layers did not contain negative weights associated with negative peaks, such as N [], as the activation function was set to ReLU [].
The target ERP and the L1 feature maps of a sample L subject and a sample H subject are shown in Figures 4 and 6. The target ERP shown in both figures is the target ERP averaged over all trials. To analyze the correlation of the frontal and occipital lobe electrodes, the first 3 electrodes (first 3 rows of the averaged target ERP matrix) were copied and pasted at the end of the ERP matrix. As shown in Figure 4(a), the target ERP of the L group subject shows
T : Results of the CNN classication. Data are sorted according to the ERP group. Accuracy (Acc.), sensitivity (Sens.), precision (Prec.),
F measure, ROC, PSNR, and peak time of nd layer (PeT.) are given for comparison.
Subject number Type Acc. Sens. Prec. F measure ROC PSNR PeT.
H . . . . . . .
H . . . . . . .
H . . . . . . .
H . . . . . . .
H . . . . . . .
H . . . . . . .
H . . . . . . .
H . . . . . . .
H . . . . . . .
 H . . . . . . .
 H . . . . . . .
 H . . . . . . .
 H . . . . . . .
 H . . . . . . .
 H . . . . . . .
 H . . . . . . .
 H . . . . . . .
 H . . . . . . .
 H . . . . . . .
 L . . . . . . .
 L . . . . . . .
 L . . . . . . .
 L . . . . . . .
 L . . . . . . .
 L . . . . . . .
 L . . . . . . .
 L . . . . . . .
 L . . . . . . .
 L . . . . . . .
 L . . . . . . .
 L . . . . . . .
 L . . . . . . .
 L . . . . . . .
a broad peak around the P700 range at F and Cz. The ERP of the other lobes did not show any significant positive weight, indicating that no significant feature associated with the target was observed and that the signal was flat. The feature maps shown in Figures 4(b) through 4(i) show high correlation between the ERP from the central and parietal lobe electrodes.
On the other hand, the correlation of ERP among adjacent electrodes for the H group subject, shown in Figure 6, indicates that the correlation is restricted to a specific time range. Most of the high weights in the feature maps shown in Figures 6(b), 6(d), 6(e), and 6(f) show significant positive values around the P300 and P500 range for the frontal and central lobe electrodes. The correlation between the central and parietal lobes is shown in Figure 6(c) around the P range. Some features around the P700 region were found to show high correlation among all electrodes. Unlike that of the L group subjects, the feature map of L1 for the H group subject showed high correlation among all electrodes, where each case shows specific temporal characteristics.
The temporal features shown in the feature maps in Figure 5 indicate that temporal features associated with the P700 peak are present for L group subjects, as expected. In Figures 5(a), 5(b), and 5(c), high positive weights were found around the P700 range (rows  and ). However, most of the feature maps either did not show significant weights or were flat, as in Figure 5(i). The temporal features of the H group subjects showed more variety. Some feature maps showed high positive weights around the P300 and P500 range, as shown in Figures 7(a), 7(b), 7(c), and 7(d), whereas the others indicated significant positive weights around the P700 range, as in Figures 7(e)–7(i). However, the weights associated with the P700 range are more broadly distributed than those associated with P300 and P500.
3.3. Statistical Analysis. A comparison of the classical statistical measurements and the other measurements is shown in Table 1. The accuracy, sensitivity, and precision showed significant mean differences between the H and L groups (p values were
[Figure 1: (a) 10/20 electrode montage; (b) experimental setup; (c) CNN architecture: input 14 × 300 → convolution [6 × 20] → feature map 14 × 300 × 32 → max-pooling [2 × 2] → feature map 7 × 150 × 32 → convolution [1 × 12] → feature map 7 × 150 × 62 → max-pooling [1 × 10] → feature map 7 × 15 × 62 → fully connected (6510) → output: 2.]
F : Experimental paradigm. (a) e position of EEG channels in / system. e EEG were collected from F, Fz,F,C,Cz,C,
P, Pz, P, O, and O positions as indicated by red circle. (b) Experimental setting schematics. Subjects were sat on a chair and were asked
to mentally count the occurrence of target icon. e ERP speller system for this paper was implemented with RSVP. e icon appeared on
the center of the monitor. (c) Schematics of CNN architecture. e architecture consisted of  convolutional layers,  max-pool layers, and 
fully connected layers. e number on top of each layer indicates size of feature map.
., 0.8.88e−05, and ., resp.). A signicant mean
dierence in F measure did not exist between H and L
group. e accuracy of H and L group was . and .,
respectively. e sensitivity of H group was higher than that of
L group, but the precision of H group was signicantly lower
than that of L group. e area under ROC of H group was
signicantly higher than that of L group (𝑝value = .).
e PSNR for L1of H group was signicantly lower than
that of L group. As all PSNR measured were negative, the ab-
solutevalueofPSNRofHgroupwasgreaterthanthatofL
group. On the other hand, no mean dierence of the peak
time (PeT.) between H and L group was found (𝑝value =
.).
4. Discussion
In this study, a CNN has been used to investigate the spatial and temporal characteristics of ERP that distinguish the performance difference between illiterates and nonilliterates (L and H groups). For the comparison of performance, classical statistical measurements as well as filter comparison measurements were collected to compare the correlation of ERP taken from different EEG electrodes and to identify the characteristic temporal features associated with each group.
The statistical measurements show that the mean performance of the CNN on H and L group data differed significantly. The accuracy on H group data was higher than that on L group data. Interestingly, although the sensitivity of the H group was higher than that of the L group, the precision of the H group was significantly lower than that of the L group. This reflects the fact that the ERP of the L group was not identified as target in most cases, and the CNN identified the ERP from all 6 icons as nontarget in more than half of the trials.
The learning curve and errors in Figure 3 demonstrate how the statistical measurements affect the performance of the CNN. Although the false negative rate remains mostly near
F : Schematics of icons used for rapid serial visual presentation (RSVP) panel. e design of icons was taken from television remote
controller. (a) Turn on. (b) Volume up. (c) Channel up. (d) Turn o. (e) Volume down. (f) Channel down.
0 and the false positive rate close to 1, the learning curve remains stable around . for the L group subject. This again reflects the characteristic of the L group ERP, which was mostly identified as nontarget. The ERP that were identified as target were mostly from nontarget icons, indicating a lack of distinctive features associated with the target ERP. However, both the false negative and false positive rates drop as training continues for the H group subject's data, leading to an increase of learning accuracy over the iterations. As the ERP of the L group does not have sufficiently distinctive features, the model becomes slightly more overtrained than the model of the H group subject, as shown in the validation error plots in Figures 3(c) and 3(d). The comparison of ROC validates this analysis, as the ROC of the H group was significantly higher than that of the L group (p value = .).
As shown in Figure 4, the ERP collected from the L group were flat in most of the channels. Most of the positive weights in the target ERP were observed at the frontal and central lobe electrodes (the st and th rows of Figure 4(a)), which was contrary to expectation, as previous research indicated that positive peaks associated with the target event are mostly observed in the parietal or occipital lobe [, ]. The correlation of ERP collected from adjacent electrodes did not show a significant correlation between occipital and parietal lobe data in L group subjects. On the other hand, the ERP of the H group were more invigorated, showing stronger activity in the P300 area, as shown in Figure 6(a). The ERP correlation indicated in the feature maps also showed stronger correlation of the ERP data collected from the occipital and parietal lobes with the other lobes. The spatial correlation shown in the feature maps of the H group also indicated that the correlation was restricted to specific time ranges corresponding to either P300, P500, or P700.
The feature maps of the 2nd convolutional layer demonstrated the difference in temporal features between the H and L group subjects. In most of the L group subjects, the feature map did not show strong positive weights and was flat; such indications of positive weights as there were remained mostly restricted to the P700 region. On the other hand, the positive weights of the H group were distributed around P300, P500, and P700, and the positive weights found near the P300 and P500 range were sharper than those found around the P700 range. Previous research has indicated the possible existence of features other than P300 [, , ]. The result of this paper also supports the idea that P300 may not be the only key feature of the ERP speller system. Rather, the P700, which was identified among both L and H group subjects, may represent a more universal ERP feature. However, the ERP from the central lobe area observed in the L group indicates a possible effect of stimulus probability [] (Figure 4(a)).
The PSNR indicated that the lack of activity in the occipital/parietal lobes and the broad peak found at P700 affect the
[Figure 3 panels: (a), (b) Type I error (FN), Type II error (FP), and learning curve versus iteration number (0–250); (c), (d) training and validation error versus iteration number; (e) ROC, true positive rate versus false positive rate, for the L and H subjects.]
F : Learning curve and receiver operating characteristic curve (ROC) of L and H subject. (a) False negative rate (FN) and learning
curve of L subject saturates near  and ., respectively, whereas false positive rate (FP) increase to . (b) Both FP and FN drop over the time
course for H subject and learning curve saturates near .. (c) Training and validation error of drops over the time course for both L subject
and (d) H subject. Both validation and training error are lower for H subject. (e) ROC curve of H and L subjects.
F : ERP averaged over all trials and feature map of L1of L subject. e ERP from frontal lobe was copied and pasted on last three rows.
(a) Grand average ERP over all trials. Feature maps from L1shownin(b),(c),(d),(e),(f),(g),(h),and(i).Strongcorrelationbetweenfrontal
and central lobe and between central and parietal lobe was found. Spatial correlation among other electrodes is not well dened.
F : Feature map of L2of L subject data. Temporal feature associated with P peak is found as shown in (a), (b), and (c).
F : ERP averaged over all trials and feature map of L1of H group. e format is the same as shown in Figure . (a) e grand averaged
ERP of H group shows signicant peak around P and P (rows , , , and ). Correlation between ERP from adjacent electrodes shows
high correlation related to specic time rage (P and P) in (b), (d), (f), and (e).
F : Feature map of L2of H subject. Format is the same as Figure . High positive weight around P and P range were found in
(a), (b), (c), and (d). (e)–(i) Moderate positive weight around P were also found.
performance of the spatial filter in L1 as well. As the PSNR measures the ratio of the maximum power of a signal to the power of corrupting noise [], the result indicates that the filter was not able to extract a distinctive target-ERP signal from the background noise for the L group subjects' data. This may be because the peaks near P700 were broad and fluctuating. On the other hand, the P300 and P500 peaks found in the H group subjects were sharper, which let the filter extract relevant features more precisely without being affected by background noise. Interestingly, the major peak of L2 of the H and L group subjects did not differ significantly (p value = .). As the major peak was found by averaging the feature maps from L2, the differences in individual feature maps may have been overshadowed. Further statistical analysis to assess the temporal features within each feature map must be applied to validate the results found in this study.
5. Conclusions
This study has investigated the differences in spatial and temporal features of ERP between a high performance group (H group) and a low performance group (L group). The results indicated that the major difference arises from the spatial correlation of ERP among the lobes rather than from temporal features. Although the temporal feature difference was not quantitatively established in this study, the qualitative analysis indicated a lack of P300 in the low performance group. Interestingly, both the low and high performance groups showed activity near P700, which may be the key activity of the ERP speller system instead of the traditional P300 peak. Further analysis of individual feature maps will be needed to investigate the key temporal feature of the ERP speller system.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was partly supported by an Institute for Information & Communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (no. --, the development of technology for social life logging based on analyzing social emotion and intelligence of convergence contents) and a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (no. -).
References
[] J. R. Wolpaw, N. Birbaumer, D. J. McFarland, G. Pfurtscheller, and T. M. Vaughan, "Brain-computer interfaces for communication and control," Clinical Neurophysiology, vol. , no. , pp. –, .
[] T. Fomina, G. Lohmann, M. Erb, T. Ethofer, B. Schölkopf, and M. Grosse-Wentrup, "Self-regulation of brain rhythms in the precuneus: a novel BCI paradigm for patients with ALS," Journal of Neural Engineering, vol. , no. , Article ID , .
[] W. Speier, N. Chandravadia, D. Roberts, S. Pendekanti, and N. Pouratian, "Online BCI typing using language model classifiers by ALS patients in their homes," Brain-Computer Interfaces, vol. , no. -, pp. –, .
[] L. Botrel, E. M. Holz, and A. Kübler, "Using brain painting at home for  years: stability of the P during prolonged BCI usage by two end-users with ALS," Lecture Notes in Computer Science, vol. , pp. –, .
[] S. Saeedi, R. Chavarriaga, R. Leeb, and J. d. Millan, "Adaptive assistance for brain-computer interfaces by online prediction of command reliability," IEEE Computational Intelligence Magazine, vol. , no. , pp. –, .
[] S. Saeedi, R. Chavarriaga, and J. D. R. Millan, "Long-term stable control of motor-imagery BCI by a locked-in user through adaptive assistance," IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. , no. , pp. –, .
[] R. Swaminathan and S. Prasad, "Brain computer interface used in health care technologies," SpringerBriefs in Applied Sciences and Technology, vol. , pp. –, .
[] C. Reichert, S. Dürschmid, H.-J. Heinze, and H. Hinrichs, "A comparative study on the detection of covert attention in event-related EEG and MEG signals to control a BCI," Frontiers in Neuroscience, vol. , article no. , .
[] D. McFarland and J. Wolpaw, "EEG-based brain–computer interfaces," Current Opinion in Biomedical Engineering, vol. , pp. –, .
[] L. A. Farwell and E. Donchin, "Talking off the top of your head: toward a mental prosthesis utilizing event-related brain potentials," Electroencephalography and Clinical Neurophysiology, vol. , no. , pp. –, .
[] J. Yoon, M. Whang, and J. Lee, "Methodology of improving illiteracy in P speller system with ICA blind detection," Proceedings of HCI Korea, pp. –, .
[] R. Carabalona, "The role of the interplay between stimulus type and timing in explaining BCI-illiteracy for visual P-based brain-computer interfaces," Frontiers in Neuroscience, vol. , article no. , .
[] S. L. Shishkin, I. P. Ganin, I. A. Basyul, A. Y. Zhigalov, and A. Y. Kaplan, "N wave in the P BCI is not sensitive to the physical characteristics of stimuli," Journal of Integrative Neuroscience, vol. , no. , pp. –, .
[] L. Bianchi, S. Sami, A. Hillebrand, I. P. Fawcett, L. R. Quitadamo, and S. Seri, "Which physiological components are more suitable for visual ERP based brain-computer interface? A preliminary MEG/EEG study," Brain Topography, vol. , no. , pp. –, .
[] K. Yoon and K. Kim, "Multiple kernel learning based on three discriminant features for a P speller BCI," Neurocomputing, vol. , pp. –, .
[] D. B. Ryan, G. Townsend, N. A. Gates, K. Colwell, and E. W. Sellers, "Evaluating brain-computer interface performance using color in the P checkerboard speller," Clinical Neurophysiology, vol. , no. , pp. –, .
[] V. Guy, M.-H. Soriani, M. Bruno, T. Papadopoulo, C. Desnuelle, and M. Clerc, "Brain computer interface with the P speller: usability for disabled people with amyotrophic lateral sclerosis," Annals of Physical and Rehabilitation Medicine, .
[] Q. Li, K. Shi, S. Ma, and N. Gao, "Improving classification accuracy of SVM ensemble using random training set for BCI P speller," in Proceedings of the 13th IEEE International Conference on Mechatronics and Automation (IEEE ICMA 2016), pp. –, China, August .
[] G. C. Cawley and N. L. Talbot, "On over-fitting in model selection and subsequent selection bias in performance evaluation," Journal of Machine Learning Research, vol. , pp. –, .
[] D. J. Krusienski, E. W. Sellers, F. Cabestaing et al., "A comparison of classification techniques for the P speller," Journal of Neural Engineering, vol. , no. , article , pp. –, .
[] A. Rakotomamonjy and V. Guigue, "BCI competition III: dataset II - ensemble of SVMs for BCI P speller," IEEE Transactions on Biomedical Engineering, vol. , no. , pp. –, .
[] Y. Yu, Z. Zhou, J. Jiang et al., "Toward a hybrid BCI: self-paced operation of a P-based speller by merging a motor imagery-based 'brain switch' into a P spelling approach," International Journal of Human-Computer Interaction, vol. , no. , pp. –, .
[] Y. Yu, J. Jiang, Z. Zhou et al., "A self-paced brain-computer interface speller by combining motor imagery and P potential," in Proceedings of the 8th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC 2016), pp. –, China, September .
[] S. Sonnenburg, G. Rätsch, C. Schäfer, and B. Schölkopf, "Large scale multiple kernel learning," Journal of Machine Learning Research, vol. , pp. –, .
[] Q. V. Le, J. Ngiam, A. Coates, A. Lahiri, B. Prochnow, and A. Y. Ng, "On optimization methods for deep learning," in Proceedings of the 28th International Conference on Machine Learning (ICML '11), pp. –, Bellevue, Wash, USA, July .
[] S. Lawrence, C. L. Giles, A. C. Tsoi, and A. D. Back, "Face recognition: a convolutional neural-network approach," IEEE Transactions on Neural Networks and Learning Systems, vol. , no. , pp. –, .
[] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Proceedings of the 26th Annual Conference on Neural Information Processing Systems (NIPS '12), pp. –, Lake Tahoe, Nev, USA, December .
[] P. Y. Simard, D. Steinkraus, and J. C. Platt, "Best practices for convolutional neural networks applied to visual document analysis," in Proceedings of the 7th International Conference on Document Analysis and Recognition, vol. , pp. –, IEEE Computer Society, Edinburgh, UK, August .
[] O. Abdel-Hamid, A.-R. Mohamed, H. Jiang, L. Deng, G. Penn, and D. Yu, "Convolutional neural networks for speech recognition," IEEE Transactions on Audio, Speech and Language Processing, vol. , no. , pp. –, .
[] Y. LeCun and Y. Bengio, "Convolutional networks for images, speech, and time series," The Handbook of Brain Theory and Neural Networks, vol. , no. , p. , .
[] M. S. Treder, N. M. Schmidt, and B. Blankertz, "Gaze-independent brain-computer interfaces based on covert attention and feature attention," Journal of Neural Engineering, vol. , no. , Article ID , .
[] R. W. Homan, J. Herman, and P. Purdy, "Cerebral location of international – system electrode placement," Electroencephalography and Clinical Neurophysiology, vol. , no. , pp. –, .
[] V. Nair and G. E. Hinton, "Rectified linear units improve restricted Boltzmann machines," in Proceedings of the 27th International Conference on Machine Learning (ICML '10), pp. –, Haifa, Israel, June .
[] B. Graham, "Fractional max-pooling," , https://arxiv.org/abs/..
[] G. E. Dahl, T. N. Sainath, and G. E. Hinton, "Improving deep neural networks for LVCSR using rectified linear units and dropout," in Proceedings of the 38th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '13), pp. –, May .
[] N. Srivastava, "Improving neural networks with dropout," University of Toronto, vol. , .
[] M. Abadi, P. Barham, C. Jianmin et al., "TensorFlow: a system for large-scale machine learning," in 12th USENIX Symposium on Operating Systems Design and Implementation, vol. , pp. –, .
[] A. Géron, Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems, .
[] Q. Huynh-Thu and M. Ghanbari, "Scope of validity of PSNR in image/video quality assessment," IEEE Electronics Letters, vol. , no. , pp. -, .
[] R. Näätänen, Attention and Brain Function, Psychology Press, .
[] K. Takano, H. Ora, K. Sekihara, S. Iwaki, and K. Kansaku, "Coherent activity in bilateral parieto-occipital cortices during P-BCI operation," Frontiers in Neurology, vol. , article , .
[] F. A. Capati, R. P. Bechelli, and M. C. F. Castro, "Hybrid SSVEP/P BCI keyboard: controlled by visual evoked potential," in Proceedings of the 9th International Conference on Bio-Inspired Systems and Signal Processing (BIOSIGNALS 2016), part of the 9th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2016), pp. –, February .
[] S. Ikegami, K. Takano, M. Wada, N. Saeki, and K. Kansaku, "Effect of the green/blue flicker matrix for P-based brain-computer interface: an EEG-fMRI study," Frontiers in Neurology, .
[] W. Speier, A. Deshpande, and N. Pouratian, "A method for optimizing EEG electrode number and configuration for signal acquisition in P speller systems," Clinical Neurophysiology, vol. , no. , pp. –, .
[] C.-Y. Chen, C.-H. Chen, C.-H. Chen, and K.-P. Lin, "An automatic filtering convergence method for iterative impulse noise filters based on PSNR checking and filtered pixels detection," Expert Systems with Applications, vol. , pp. –, .
... The most frequent model inspection techniques involved the analysis of the trained model's weights [135,211,86,34,87,200,182,122,170,228,164,109,204]. This often requires focusing on the weights of the first layer only, as their interpretation in regard to the input data is straightforward. ...
... Occlusion sensitivity techniques [92,26,175] use a similar idea, by which the decisions of the network when different parts of the input are occluded are analyzed. [135,211,86,34,87,200,182,122,170,228,164,109,204,85,25] Analysis of activations [212,194,87,83,208,167,154,109] Input-perturbation network-prediction correlation maps [149,191,67,16,150] Generating input to maximize activation [188,144,160,15] Occlusion of input [92,26,175] Several studies used backpropagation-based techniques to generate input maps that maximize activations of specific units [188,144,160,15]. These maps can then be used to infer the role of specific neurons, or the kind of input they are sensitive to. ...
Article
Full-text available
Context: Electroencephalography (EEG) is a complex signal and can require several years of training, as well as advanced signal processing and feature extraction methodologies to be correctly interpreted. Recently, deep learning (DL) has shown great promise in helping make sense of EEG signals due to its capacity to learn good feature representations from raw data. Whether DL truly presents advantages as compared to more traditional EEG processing approaches, however, remains an open question. Objective: In this work, we review 154 papers that apply DL to EEG, published between January 2010 and July 2018, and spanning different application domains such as epilepsy, sleep, brain-computer interfacing, and cognitive and affective monitoring. We extract trends and highlight interesting approaches from this large body of literature in order to inform future research and formulate recommendations. Methods: Major databases spanning the fields of science and engineering were queried to identify relevant studies published in scientific journals, conferences, and electronic preprint repositories. Various data items were extracted for each study pertaining to (1) the data, (2) the preprocessing methodology, (3) the DL design choices, (4) the results, and (5) the reproducibility of the experiments. These items were then analyzed one by one to uncover trends. Results: Our analysis reveals that the amount of EEG data used across studies varies from less than ten minutes to thousands of hours, while the number of samples seen during training by a network varies from a few dozens to several millions, depending on how epochs are extracted. Interestingly, we saw that more than half the studies used publicly available data and that there has also been a clear shift from intra-subject to inter-subject approaches over the last few years. 
About [Formula: see text] of the studies used convolutional neural networks (CNNs), while [Formula: see text] used recurrent neural networks (RNNs), most often with a total of 3-10 layers. Moreover, almost one-half of the studies trained their models on raw or preprocessed EEG time series. Finally, the median gain in accuracy of DL approaches over traditional baselines was [Formula: see text] across all relevant studies. More importantly, however, we noticed studies often suffer from poor reproducibility: a majority of papers would be hard or impossible to reproduce given the unavailability of their data and code. Significance: To help the community progress and share work more effectively, we provide a list of recommendations for future studies and emphasize the need for more reproducible research. We also make our summary table of DL and EEG papers available and invite authors of published work to contribute to it directly. A planned follow-up to this work will be an online public benchmarking portal listing reproducible results.
... In [55], a CNNtransfer learning approach was taken and [56] used CNN in a Devanagari script-based P300 speller application. Yoon et al. [57] used CNN's capability to identify the key features of ERPs that distinguish the illiterates of the BCI speller system. Liu et al. [58] studied the effect of batch normalization in the input and convolutional layers to alleviate overfitting in the ERP decoding task. ...
Article
Predicting attention-modulated brain responses is a major area of investigation in brain-computer interface (BCI) research that aims to translate neural activities into useful control and communication commands. Such studies involve collecting electroencephalographic (EEG) data from subjects to train classifiers for decoding users' mental states. However, various sources of inter or intrasubject variabilities in brain signals render training classifiers in BCI systems challenging. From a machine learning perspective, this model training generally follows a common methodology: 1) apply some type of feature extraction, which can be time-consuming and may require domain knowledge and 2) train a classifier using extracted features. The advent of deep learning technologies has offered unprecedented opportunities to not only construct remarkably accurate classifiers but also to integrate the feature extraction stage into the classifier construction. Although integrating feature extraction, which is generally domain-dependent, into the classifier construction is a considerable advantage of deep learning models, the process of architecture selection for BCIs generally depends on domain knowledge. In this study, we examine the feasibility of conducting a systematic model selection combined with mainstream deep learning architectures to construct accurate classifiers for decoding P300 event-related potentials. In particular, we present the results of 232 convolutional neural networks (CNNs) (4 datasets x 58 structures), 36 long short-term memory cells (LSTMs) (4 datasets x 9 structures), and 320 hybrid CNN-LSTM models (4 datasets x 80 structures) of varying complexity. Our empirical results show that in the classification of P300 waveforms, the constructed predictive models can outperform the current state-of-the-art deep learning architectures, which are partially or entirely inspired by domain knowledge. 
The source codes and constructed models are available at https://github.com/berdakh/P3Net.
... Others [135,136] [137] [138] [139], [140], [141], [142], [138], [143], [144], [145,146] RSVP [174,175], [176], [177], [178], [179], [180,181], [182,175] [ 183,184] [ 185,12] [186] [181,175] [12] ...
Article
Full-text available
Brain signals refer to the biometric information collected from the human brain. The research on brain signals aims to discover the underlying neurological or physical status of the individuals by signal decoding. The emerging deep learning techniques have improved the study of brain signals significantly in recent years. In this work, we first present a taxonomy of non-invasive brain signals and the basics of deep learning algorithms. Then, we provide a comprehensive survey of the frontiers of applying deep learning for non-invasive brain signals analysis, by summarizing a large number of recent publications. Moreover, upon the deep learning-powered brain signal studies, we report the potential real-world applications which benefit not only disabled people but also normal individuals. Finally, we discuss the opening challenges and future directions.
... In future studies, more categories of visual objects will be used to support and to expand the neural mechanisms of perceptual closure found in this study. In related works, P300 ERP component was used for brain computer interface study [25][26][27], the ERP components in this study also can be used to build an affective brain computer interface. BCI also can be very useful for the elder people [28][29][30][31]. ...
Article
Full-text available
Perceptual organization is an important part of visual and auditory information processing. In the case of visual occlusion, whether the loss of information in images could be recovered and thus perceptually closed affects object recognition. In particular, many elderly subjects have defects in object recognition ability, which may be closely related to the abnormalities of perceptual functions. This phenomenon even can be observed in the early stage of dementia. Therefore, studying the neural mechanism of perceptual closure and its relationship with sensory and cognitive processing is important for understanding how the human brain recognizes objects, inspiring the development of neuromorphic intelligent algorithms of object recognition. In this study, a new experiment was designed to explore the realistic process of perceptual closure under occlusion and intact conditions of faces and building. The analysis of the differences in ERP components P1, N1, and Ncl indicated that the subjective awareness of perceptual closure mainly occurs in Ncl, but incomplete information has been processed and showed different manners compared to complete stimuli in N170 for facial materials. Although occluded, faces, but not buildings, still maintain the specificity of perceptual processing. The Ncl by faces and buildings did not show significant differences in both amplitude and latency, suggesting a "completing" process regardless of categorical features.
... e proposed model ooered an AUC of 86.1%. Yoon et al. [238] provided a way to analyze the spatial and temporal features of ERP. e authors trained a CNN with two convolutional layers whose feature maps represented spatial and temporal features of the event-related potential. e results demonstrated that literate subjects' ERP shows a high correlation between the occipital lobe and parietal lobe, whereas illiterate subjects only show the correlation between neural activities from the frontal lobe and central lobe. ...
... e proposed model o ered an AUC of 86.1%. Yoon et al. [238] provided a way to analyze the spatial and temporal features of ERP. e authors trained a CNN with two convolutional layers whose feature maps represented spatial and temporal features of the event-related potential. e results demonstrated that literate subjects' ERP shows a high correlation between the occipital lobe and parietal lobe, whereas illiterate subjects only show the correlation between neural activities from the frontal lobe and central lobe. ...
Preprint
Full-text available
Brain-Computer Interface (BCI) bridges the human's neural world and the outer physical world by decoding individuals' brain signals into commands recognizable by computer devices. Deep learning has enhanced the performance of brain-computer interface systems significantly in recent years. In this article, we systematically investigate brain signal types for BCI and related deep learning concepts for brain signal analysis. We then present a comprehensive survey of deep learning techniques used for BCI, by summarizing over 230 contributions, most published in the past five years. Finally, we discuss the applied areas, emerging challenges, and future directions for deep learning-based BCI.
... e proposed model o ered an AUC of 86.1%. Yoon et al. [238] provided a way to analyze the spatial and temporal features of ERP. e authors trained a CNN with two convolutional layers whose feature maps represented spatial and temporal features of the event-related potential. e results demonstrated that nonilliterate subjects' ERP shows a high correlation between the occipital lobe and parietal lobe, whereas illiterate subjects only show the correlation between neural activities from the frontal lobe and central lobe. ...
Preprint
Full-text available
Brain-Computer Interface (BCI) bridges the human's neural world and the outer physical world by decoding individuals' brain signals into commands recognizable by computer devices. Deep learning has lifted the performance of brain-computer interface systems significantly in recent years. In this article, we systematically investigate brain signal types for BCI and related deep learning concepts for brain signal analysis. We then present a comprehensive survey of deep learning techniques used for BCI, by summarizing over 230 contributions most published in the past five years. Finally, we discuss the applied areas, opening challenges, and future directions for deep learning-based BCI.
... e proposed model o ered an AUC of 86.1%. Yoon et al. [238] provided a way to analyze the spatial and temporal features of ERP. e authors trained a CNN with two convolutional layers whose feature maps represented spatial and temporal features of the event-related potential. e results demonstrated that nonilliterate subjects' ERP shows a high correlation between the occipital lobe and parietal lobe, whereas illiterate subjects only show the correlation between neural activities from the frontal lobe and central lobe. ...
Preprint
Full-text available
Brain-Computer Interface (BCI) bridges human's neural world and the outer physical world by decoding individuals' brain signals into commands recognizable by computer devices. Deep learning has liied the performance of brain-computer interface systems signiicantly in recent years. In this article, we systematically investigate brain signal types for BCI and related deep learning concepts for brain signal analysis. We then present a comprehensive survey of deep learning techniques used for BCI, by summarizing over 230 contributions most published in the past ve years. Finally, we discuss the applied areas, opening challenges, and future directions for deep learning-based BCI.
... e proposed model o ered an AUC of 86.1%. Yoon et al. [238] provided a way to analyze the spatial and temporal features of ERP. e authors trained a CNN with two convolutional layers whose feature maps represented spatial and temporal features of the event-related potential. e results demonstrated that nonilliterate subjects' ERP shows a high correlation between the occipital lobe and parietal lobe, whereas illiterate subjects only show the correlation between neural activities from the frontal lobe and central lobe. ...
Preprint
Full-text available
Brain-Computer Interface (BCI) bridges human's neural world and the outer physical world by decoding individuals' brain signals into commands recognizable by computer devices. Deep learning has liied the performance of brain-computer interface systems signiicantly in recent years. In this article, we systematically investigate brain signal types for BCI and related deep learning concepts for brain signal analysis. We then present a comprehensive survey of deep learning techniques used for BCI, by summarizing over 230 contributions most published in the past ve years. Finally, we discuss the applied areas, opening challenges, and future directions for deep learning-based BCI. 1 INTRODUCTION Brain-Computer Interface (BCI) 1 is a system that translates activity paaerns of the human brain into messages or commands to communicate with the outer world [119]. BCI underpins many novel applications that are important to people's daily life, especially to people with psychological/physical deceases or disabilities. For example, ordinary individuals can enjoy enhanced entertainment and security when brain waves-based techniques are applied for high fake-resistant user identiication [249]. Another example is that BCI can assist the disabled, elders and people with limited motion ability (e.g., people with muscle diseases) in controlling wheelchairs, home appliances, and robots. e key challenge of BCI is to recognize human intents accurately given the meager Signal-to-Noise Ratio (SNR) of brain signals. Both low classiication accuracy and poor generalization ability limit the real-world application of BCI. To overcome the above challenges, deep learning techniques, i.e., deep neural networks, have been investigated to deal with the brain information in the past f ew years. Deep Learning is a sub-eld of machine learning inspired by the structure and function of the brain. 
It has shown excellent representation learning ability since 2006 [42] and therefore been impacting a wide range of information-processing domains such as computer version, natural language processing, activity recognition, and logic reasoning [217]. Diiering from traditional machine learning algorithms,
Chapter
The study on Amyotrophic Lateral Sclerosis (ALS) patient to identify the non-target or target stimulus based on event-related potential provide a way to improve P300 speller based Brain–Computer Interface (BCI). In the current work channel wise EEG data taken for the research. Feature extraction and Feature selection techniques based on Fourier and Wavelet transform and Statistics have been implemented to get the required features among the redundant one. By classifying the features categorized in 3 labels stated above by using support vector machine (SVM). The study reveals that the classification accuracy is improved in Morlet wavelet-based feature than statistical features for different channels taken in consideration.
Article
Full-text available
In brain-computer interface (BCI) applications the detection of neural processing as revealed by event-related potentials (ERPs) is a frequently used approach to regain communication for people unable to interact through any peripheral muscle control. However, the commonly used electroencephalography (EEG) provides signals of low signal-to-noise ratio, making the systems slow and inaccurate. As an alternative noninvasive recording technique, the magnetoencephalography (MEG) could provide more advantageous electrophysiological signals due to a higher number of sensors and the magnetic fields not being influenced by volume conduction. We investigated whether MEG provides higher accuracy in detecting event-related fields (ERFs) compared to detecting ERPs in simultaneously recorded EEG, both evoked by a covert attention task, and whether a combination of the modalities is advantageous. In our approach, a detection algorithm based on spatial filtering is used to identify ERP/ERF components in a data-driven manner. We found that MEG achieves higher decoding accuracy (DA) compared to EEG and that the combination of both further improves the performance significantly. However, MEG data showed poor performance in cross-subject classification, indicating that the algorithm's ability for transfer learning across subjects is better in EEG. Here we show that BCI control by covert attention is feasible with EEG and MEG using a data-driven spatial filter approach with a clear advantage of the MEG regarding DA but with a better transfer learning in EEG.
Article
Full-text available
Objective: Current Brain-Computer Interface (BCI) systems typically flash an array of items from grey to white (GW). The objective of this study was to evaluate BCI performance using uniquely colored stimuli. Methods: In addition to the GW stimuli, the current study tested two types of color stimuli (grey to color [GC] and color intensification [CI]). The main hypotheses were that in a checkboard paradigm, unique color stimuli will: (1) increase BCI performance over the standard GW paradigm; (2) elicit larger event-related potentials (ERPs); and, (3) improve offline performance with an electrode selection algorithm (i.e., Jumpwise). Results: Online results (n=36) showed that GC provides higher accuracy and information transfer rate than the CI and GW conditions. Waveform analysis showed that GC produced higher amplitude ERPs than CI and GW. Information transfer rate was improved by the Jumpwise-selected channel locations in all conditions. Conclusions: Unique color stimuli (GC) improved BCI performance and enhanced ERPs. Jumpwise-selected electrode locations improved offline performance. Significance: These results show that in a checkerboard paradigm, unique color stimuli increase BCI performance, are preferred by participants, and are important to the design of end-user applications; thus, could lead to an increase in end-user performance and acceptance of BCI technology.
Article
Full-text available
Objectives: Amyotrophic lateral sclerosis (ALS), a progressive neurodegenerative disease, restricts patients' communication capacity a few years after onset. Proof-of-concept brain-computer interface (BCI) systems have shown promise in ALS and "locked-in" patients, mostly in preclinical studies or with only a few patients, but performance was judged too low to support adoption by people whose speech is physically limited. Here, we evaluated a visual BCI device in a clinical study to determine whether disabled people with multiple deficiencies related to ALS could use a BCI to communicate in a daily environment. Methods: After clinical evaluation of physical, cognitive and language capacities, 20 patients with ALS were included. The P300-speller BCI system consisted of electroencephalography acquisition connected to real-time processing software and separate keyboard-display control software. It was equipped with original features such as optimal stopping of flashes and word prediction. The study consisted of two 3-block sessions (copy spelling, free spelling and free use) with the system in several modes of operation to evaluate its usability in terms of effectiveness, efficiency and satisfaction. Results: The system was effective in that all participants successfully achieved all spelling tasks, and efficient in that 65% of participants selected more than 95% of the correct symbols. The mean number of correct symbols selected per minute ranged from 3.6 (without word prediction) to 5.04 (with word prediction). Participants expressed satisfaction: the mean score was 8.7 on a 10-point visual analog scale assessing comfort, ease of use and utility. Patients quickly learned how to operate the system, which did not require much learning effort. Conclusion: With its word prediction and optimal stopping of flashes, which improve information transfer rate, the BCI system may be competitive with alternative communication systems such as eye trackers. Work on the remaining requirements for ergonomic use of the device is in progress.
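The throughput figures above (correct symbols per minute at a given accuracy) map onto the standard Wolpaw information transfer rate. A minimal sketch, assuming a 6x6 (36-symbol) speller matrix and 95% accuracy (both assumptions for illustration; the abstract does not state them):

```python
from math import log2

def wolpaw_itr(n_symbols: int, accuracy: float, selections_per_min: float) -> float:
    """Information transfer rate in bits/min via the Wolpaw formula."""
    if accuracy >= 1.0:
        bits = log2(n_symbols)  # perfect accuracy: full log2(N) bits/selection
    else:
        bits = (log2(n_symbols)
                + accuracy * log2(accuracy)
                + (1 - accuracy) * log2((1 - accuracy) / (n_symbols - 1)))
    return bits * selections_per_min

# 5.04 correct symbols/min (the with-word-prediction figure from the abstract),
# with the assumed 36-symbol matrix and 95% accuracy.
print(round(wolpaw_itr(36, 0.95, 5.04), 1))  # → 23.3 bits/min
```

Under these assumptions the system would deliver roughly 23 bits/min; the real figure depends on the actual matrix size and per-selection accuracy.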
Article
Visual P300-based Brain-Computer Interface (BCI) spellers enable communication or interaction with the environment by flashing elements in a matrix and exploiting consequent changes in the end-user's brain activity. Despite research efforts, performance variability and BCI-illiteracy are still critical issues for real-world applications. Moreover, there is a largely unaddressed kind of BCI-illiteracy, which becomes apparent when the same end-user operates BCI spellers intended for different applications: our aim is to understand why some well performers can become BCI-illiterate depending on speller type. We manipulated stimulus type (factor STIM: either characters or icons), color (factor COLOR: white, green) and timing (factor SPEED: fast, slow). Each BCI session consisted of training (without feedback) and a performance phase (with feedback), both in copy spelling. For fast-flashing spellers, we observed a performance worsening for the white icon-speller. Our findings are consistent with existing results reported on end-users using identical white×fast spellers, indicating that the worsening trend is independent of the user group. The use of slow stimulation timing shed new light on the perceptual and cognitive phenomena related to the use of a BCI speller during both the training and the performance phase. We found a significant STIM main effect for the N1 component on Pz and PO7 during the training phase and on PO8 during the performance phase, whereas in both phases neither the STIM×COLOR interaction nor the COLOR main effect was statistically significant. After collapsing data for factor COLOR, a statistically significant modulation of N1 amplitude emerged, depending on the phase of the BCI session: N1 was more negative for icons than for characters on both Pz and PO7 (training), whereas the opposite modulation was observed for PO8 (performance). Results indicate that both feedback and expertise with respect to the stimulus type can modulate the N1 component and that icons require more perceptual analysis. Therefore, fast flashing is likely to be more detrimental to end-users' performance in the case of icon-spellers. In conclusion, the interplay between stimulus type and timing seems relevant for a satisfactory and efficient end-user BCI experience.
Article
Brain–Computer Interfaces (BCIs) are real-time computer-based systems that translate brain signals into useful commands. To date, most applications have been demonstrations of proof-of-principle; widespread use by people who could benefit from this technology requires further development. Improvements in current EEG recording technology are needed. Better sensors would be easier to apply, more comfortable for the user, and produce higher-quality and more stable signals. Although considerable effort has been devoted to evaluating classifiers using public datasets, more attention to real-time signal processing issues and to optimizing the mutually adaptive interaction between the brain and the BCI is essential for improving BCI performance. Further development of applications is also needed, particularly applications of BCI technology to rehabilitation. The design of rehabilitation applications hinges on the nature of BCI control and how it might be used to induce and guide beneficial plasticity in the brain.
Conference Paper
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called dropout that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
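Two of the training tricks named in this abstract, non-saturating neurons (ReLU) and dropout, can be illustrated in a few lines of plain Python. This is a pedagogical sketch, not the paper's GPU implementation; the function names are illustrative:

```python
import random
from math import exp

def relu(x):
    """Non-saturating activation: gradient is 1 for any positive input."""
    return max(0.0, x)

def sigmoid_grad(x):
    """Gradient of the saturating sigmoid: vanishes for large |x|."""
    s = 1.0 / (1.0 + exp(-x))
    return s * (1.0 - s)

# For a large pre-activation, the sigmoid gradient has all but vanished,
# while ReLU still passes the signal (and its gradient) through unchanged.
print(sigmoid_grad(10.0))  # on the order of 1e-5
print(relu(10.0))          # 10.0

def dropout(activations, p_drop, rng):
    """Inverted dropout: zero each unit with probability p_drop,
    rescale survivors by 1/(1 - p_drop) so the expected sum is preserved."""
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

print(dropout([1.0, 1.0, 1.0, 1.0], 0.5, random.Random(0)))
```

At test time, dropout in this "inverted" form needs no rescaling: the layer is simply used as-is with all units active.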
Conference Paper
Brain painting (BP) is a non-invasive electroencephalography (EEG) based Brain-Computer Interface (BCI) for creative expression based on a P300 matrix. The technology was transferred into a home setup for two patients with amyotrophic lateral sclerosis (ALS), who used the system for several years while being evaluated on performance and satisfaction. Holz and colleagues found that the use of BP increased quality of life. Additionally, they described that changes in the amplitude of the P300 ERPs could be observed between recalibrations of the BCI. In this paper, we quantified the evolution of the P300 peaks in the two BCI end-users (HP and JT). For HP, the P300 peak amplitude increased during 9 months, then progressively decreased over the following 51 months, but the BCI accuracy remained stable. JT's P300 peak amplitude did not significantly decrease during the 32 months that separated the calibrations. Yet JT's BCI accuracy declined, which we may attribute to a decline in physical functioning due to ALS. The painters used the online BCI for hundreds of hours (HP 755, JT 223) and both finished more than 50 named brain paintings. HP could use BP autonomously and regularly at home for 33 months without recalibration of the system, and JT for 10 months, suggesting the long-term stability of P300 and SWLDA online classifiers, and demonstrating the feasibility of a home P300-based system that requires little involvement from BCI experts.
Article
This study presents the self-paced operation of a brain–computer interface (BCI) speller, which can be voluntarily turned on/off by merging a motor imagery (MI)-based brain switch into a P300-based BCI speller. From an off state (idle state), the user can generate a "control signal" by consciously shifting the cognitive state away from the idle state to turn on the P300-based spelling system when he or she wants to spell words. With the system turned on, the user can spell words, and the spelling system can then be voluntarily turned off and switched back to the initial state by a command. In this paradigm, the participants performed the two different cognitive tasks sequentially, rather than simultaneously, and multiple EEG components were processed sequentially. The practicability and effectiveness of the proposed approach were validated by eleven participants, and all of them achieved satisfactory performance. For the P300 speller, they achieved an average PITR of 42.61 bits/min. These preliminary results indicate that the proposed hybrid BCI system, with different mental strategies operating sequentially, is feasible and has potential for practical self-paced control.
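The control flow described in this abstract, a motor-imagery brain switch gating a P300 speller between an idle state and a spelling state, amounts to a small state machine. A minimal sketch, with the class and event names being illustrative rather than taken from the paper:

```python
class SelfPacedSpeller:
    """Toggle between an idle and a spelling state via an MI brain switch;
    P300 symbol selections are honored only while in the spelling state."""

    def __init__(self):
        self.spelling = False  # start in the idle (off) state
        self.text = ""

    def on_event(self, event: str) -> None:
        if event == "MI_SWITCH":               # motor-imagery control signal
            self.spelling = not self.spelling  # turn speller on or off
        elif self.spelling and event.startswith("P300:"):
            self.text += event.split(":", 1)[1]  # append selected symbol
        # P300 detections arriving in the idle state are ignored

speller = SelfPacedSpeller()
events = ["P300:X", "MI_SWITCH", "P300:H", "P300:I", "MI_SWITCH", "P300:Y"]
for ev in events:
    speller.on_event(ev)
print(speller.text)  # "X" and "Y" arrive while idle, so only "HI" is kept
```

Processing the two EEG components sequentially, as the paradigm prescribes, corresponds here to the fact that at any moment only one event type can change the output text.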