
Abstract

GOAL: The P300 Speller is probably the best known application in BCI [1]. Over the years, many improvements over the pioneering systems have been made and some performance comparisons exist [2]. To contribute to the improvement process, we offer open access to a large database obtained from first-time users of the P300 speller application implemented within the BCI2000 platform [3]. The database is documented with associated classifier designs and objective performance measures, readily available for comparison and reference. We also provide a set of Matlab functions that help prepare the data for alternative classifier design and testing. The database includes recordings from 30 healthy subjects (18 males, 12 females, aged 21-25), with various conditions controlled (sleep duration, drugs, etc.). Available at:
akimpech server: http://akimpech.izt.uam.mx/p300db/p300db.html
Dropbox copy: https://www.dropbox.com/sh/q2g3wfwtzkjbwvi/AACwj3tBYVBtCwi_w7F1ZkO-a?dl=0
Kaggle: https://www.kaggle.com/electrototo/akimpech
An Open-Access P300 Speller Database
Claudia Ledesma-Ramirez (1), Erik Bojorges-Valdez (1), Oscar Yáñez-Suarez (1), Carolina Saavedra (2), Laurent Bougrain (2), Gerardo Gabriel Gentiletti (1,3)
(1) Laboratorio de Neuroimagenología, Universidad Autónoma Metropolitana (UAM), Mexico
(2) Cortex team-project, Nancy University/INRIA Nancy Grand Est, France
(3) Laboratorio de Ingeniería en Rehabilitación e Investigaciones Neuromusculares y Sensoriales, Universidad Nacional de Entre Ríos (UNER), Argentina
GOAL:
The P300 Speller is probably the best known application in BCI [1]. Over the years, many improvements over the pioneering systems have been made and some performance comparisons exist [2]. To contribute to the improvement process, we offer open access to a large database obtained from first-time users of the P300 speller application implemented within the BCI2000 platform [3] (Figure 1). The database is documented with associated classifier designs and objective performance measures, readily available for comparison and reference. We also provide a set of Matlab functions that help prepare the data for alternative classifier design and testing.
Figure 1. The database website.
REFERENCES:
[1] Farwell L. A., Donchin E. “Talking off the top of your head: toward a mental prosthesis utilizing event-related brain potentials.” Electroenceph. Clin. Neurophysiol. Vol. 70, pp. 510-523 (1988).
[2] Krusienski D. J., Sellers E. W., Cabestaing F. “A comparison of classification techniques for the P300 Speller.” Journal of Neural Engineering. Vol. 3, pp. 299-305 (2006).
[3] Schalk G., McFarland D., Hinterberger T., Birbaumer N., Wolpaw J. “BCI2000: A General-Purpose Brain-Computer Interface (BCI) System.” IEEE Trans. Biomed. Eng. Vol. 51, pp. 1034-1043 (2004).
DATABASE:
The database includes recordings from 30 healthy subjects (18 males, 12 females, aged 21-25), with various conditions controlled (sleep duration, drugs, etc.).
Each subject participated in 4 sessions with 15 sequences:
1) Three copy-spelling runs.
2) One copy-spelling run with feedback, using a classifier trained on data from session one.
3) Three free-spelling runs (user-selected words, around 15 characters per subject).
4) A variable number of free-spelling runs with a reduced number of sequences, as indicated by bit-rate analysis.
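The text does not say which bit-rate definition guided the reduced sequence count in session 4. As a rough sketch only, the following assumes the Wolpaw bits-per-selection formula that is common in BCI work and a hypothetical 36-symbol matrix; the function names and the selection time in the usage note are illustrative and not taken from the database.

```python
import math


def bits_per_selection(p, n_symbols=36):
    """Wolpaw bits per selection for accuracy p over n_symbols choices.

    n_symbols=36 is an assumption (a 6x6 matrix); the source does not state it.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    bits = math.log2(n_symbols)
    if p > 0.0:
        bits += p * math.log2(p)
    if p < 1.0:
        bits += (1.0 - p) * math.log2((1.0 - p) / (n_symbols - 1))
    return bits


def bits_per_minute(p, seconds_per_selection, n_symbols=36):
    """Scale bits per selection by the number of selections per minute."""
    return bits_per_selection(p, n_symbols) * 60.0 / seconds_per_selection
```

Fewer sequences shorten `seconds_per_selection` but usually lower `p`, so the bit rate peaks at some intermediate sequence count; comparing `bits_per_minute` across counts is one way such a count could be chosen.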
Ten channels (Fz, C3, Cz, C4, P3, Pz, P4, PO7, PO8, Oz) were recorded at 256 samples per second using the g.tec gUSBamp, with the acquisition characteristics shown in Figure 2. Each stimulus is highlighted for 62.5 ms, with an inter-stimulus interval of 125 ms.
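When segmenting the recordings it helps to express these timings in samples. The short sketch below converts the two durations stated above at the stated 256 samples per second; the helper name `ms_to_samples` is hypothetical.

```python
FS = 256  # samples per second, as stated in the text


def ms_to_samples(duration_ms, fs=FS):
    """Convert a duration in milliseconds to a whole number of samples."""
    samples = duration_ms * fs / 1000.0
    if samples != int(samples):
        raise ValueError("duration does not fall on a sample boundary")
    return int(samples)


highlight_samples = ms_to_samples(62.5)  # stimulus highlight duration
isi_samples = ms_to_samples(125)         # inter-stimulus interval
```

Both durations fall exactly on sample boundaries (16 and 32 samples), so no rounding is needed at this sampling rate.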
We also provide a set of Matlab functions to extract and average target and non-target responses. The user can specify, for example, the number of sequences to average and the duration of the response, and save the result in Matlab or ASCII format.
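The Matlab functions themselves are not reproduced here. As an illustration of the idea only, the following Python sketch cuts fixed-length windows from a single-channel signal at stimulus onsets and averages them separately for targets and non-targets. All names are hypothetical, and the `n_average` cap is a simplification of "number of sequences to average".

```python
def average_epochs(epochs):
    """Sample-by-sample mean across epochs of equal length."""
    if not epochs:
        raise ValueError("no epochs to average")
    n = len(epochs)
    return [sum(column) / n for column in zip(*epochs)]


def average_by_class(signal, onsets, is_target, n_samples, n_average=None):
    """Return (target_mean, nontarget_mean) for one channel.

    signal     : list of samples for one channel
    onsets     : sample index of each stimulus onset
    is_target  : one boolean per onset
    n_samples  : response duration in samples
    n_average  : optional cap on the number of epochs used per class
    """
    targets, nontargets = [], []
    for onset, flag in zip(onsets, is_target):
        if onset + n_samples > len(signal):
            continue  # skip windows that run past the end of the recording
        epoch = signal[onset:onset + n_samples]
        (targets if flag else nontargets).append(epoch)
    if n_average is not None:
        targets, nontargets = targets[:n_average], nontargets[:n_average]
    return average_epochs(targets), average_epochs(nontargets)
```

Averaging more epochs reduces background EEG relative to the time-locked response, which is why the number of averaged epochs appears as a parameter in the performance results.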
The database, a complete description of the parameters used for the speller, and the code are available at: http://akimpech.izt.uam.mx/p300db
4th International BCI Meeting
Filter      Type        Order   Band
Notch       Chebyshev   4th     58-62 Hz
Band-pass   Chebyshev   8th     0.1-60 Hz
Table 1. Distribution of database cases as a function of classifier accuracy and number of averaged epochs (ne).
Figure 2. Recorded EEG channels and filter parameters.
Figure 3. Mean ROC area for all cases using SWLDA analysis. Each x represents an individual case; blue lines are standard deviation.
REFERENCE CLASSIFIERS AND PERFORMANCE
SWLDA (step-wise linear discriminant analysis) classifiers have been trained for each subject. To give users of the database an objective, comparable measure of performance, one that takes into account the choice of features and is independent of the training/testing set, the relative (receiver) operating characteristic (ROC) curve was selected. Summarized by the area under the curve (Az), the ROC reflects intrinsic class separability: higher values of Az correspond to better classifier designs.
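Az can be read as the probability that a randomly chosen target epoch receives a higher classifier score than a randomly chosen non-target epoch, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name and the score lists are hypothetical and this is not the procedure used to produce the reported values.

```python
def roc_area(target_scores, nontarget_scores):
    """Area under the ROC curve by pairwise comparison of classifier scores."""
    if not target_scores or not nontarget_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for t in target_scores:
        for n in nontarget_scores:
            if t > n:
                wins += 1.0   # target ranked above non-target
            elif t == n:
                wins += 0.5   # ties count as half
    return wins / (len(target_scores) * len(nontarget_scores))
```

A value of 1.0 means the two score distributions are perfectly separated, and 0.5 means the scores carry no class information.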
As a reference, results for each subject are available on the website. With SWLDA and 15 training sequences, 86.7% of the participants achieved 100% correct spelling, while the lowest percentage of correctly detected characters among the remaining participants was 85%. ROC areas above 0.95 were reached by 76.7% of the population in about 10 sequences. Thus, for 15 sequences the general performance is very good. Classifier features were selected mainly from the PO8, Oz, PO7 and Pz electrodes and within the 100-290 ms window. This shows that evoked potentials (EPs) related to visual stimulation and its recognition play an important role in the high accuracy of the classifier (see Table 1 and Figure 3).
DISCUSSION:
This open-access P300 database includes recordings from 30 healthy subjects. Data are available in BCI2000 and Matlab formats. A set of Matlab functions for extracting the information needed for a given application is also included.
The database website provides, together with the data, a description of the recording conditions for each subject. Individual results (accuracy, ROC area, and performance for every sequence count) are also reported. Given the individual accuracies and ROC areas for the reference SWLDA classifier, it could be argued that overall data quality is high.
We hope this work will help compare classification techniques for P300 detection and its applications by providing fair comparison grounds and reference data.
DATABASE APPLICATION:
Figure 4. Impact of different preprocessing schemes (none, coiflet decomposition, B-spline decomposition) on SVM classifier accuracies.
ne    accuracy
      100%   95%-100%   90%-95%   85%-90%   ≤ 85%
15    26     1          2         0         1
14    22     4          3         1         0
13    25     2          2         1         0
11    19     3          5         0         3
10    20     3          3         4         0
8     18     3          2         3         4
5     8      2          7         6         7
3     3      0          3         6         18
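The counts in Table 1 can be checked against the figures quoted in the text. The sketch below assumes the five columns are the accuracy ranges 100%, 95%-100%, 90%-95%, 85%-90% and ≤ 85% (a reading of the flattened header, not something the source states explicitly); `TABLE_1` and `percent_perfect` are hypothetical names.

```python
# Rows of Table 1: ne -> cases per accuracy range
# (100%, 95%-100%, 90%-95%, 85%-90%, <= 85%); column meaning is an assumption.
TABLE_1 = {
    15: (26, 1, 2, 0, 1),
    14: (22, 4, 3, 1, 0),
    13: (25, 2, 2, 1, 0),
    11: (19, 3, 5, 0, 3),
    10: (20, 3, 3, 4, 0),
    8: (18, 3, 2, 3, 4),
    5: (8, 2, 7, 6, 7),
    3: (3, 0, 3, 6, 18),
}


def percent_perfect(ne):
    """Percentage of cases with 100% accuracy for a given number of averaged epochs."""
    row = TABLE_1[ne]
    return 100.0 * row[0] / sum(row)


row_totals = {ne: sum(row) for ne, row in TABLE_1.items()}
```

Every row sums to the 30 subjects in the database, and `percent_perfect(15)` rounds to the 86.7% quoted for 15 sequences.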