Towards an API for EEG-Based Imagined
Speech classification
Luis Alfredo Moctezuma1, Marta Molinas1, A. A. Torres-García2, L. Villaseñor-Pineda2, and Maya Carrillo3
1Department of Engineering Cybernetics, Norwegian University of Science and
Technology. Trondheim, Norway
2Computer Science Department, Instituto Nacional de Astrofísica, Óptica y
Electrónica. Puebla, Mexico
{alejandro.torres, villasen}
3Faculty of Computer Science, Benemérita Universidad Autónoma de Puebla.
Puebla, Mexico
Abstract. In this paper, imagined speech classification is performed
with an implementation in Python using the scikit-learn library, to
create a toolbox intended for real-time classification. To this aim, the
Discrete Wavelet Transform with the mother function Biorthogonal 2.2
is used to compute the instantaneous and Teager energy distributions
for feature extraction. Then, random forest is implemented as a
classifier with 10-fold cross-validation. The set of experiments consists
of imagined speech classification, linguistic activity and inactivity
classification, and subjects identification. The experiments were
performed using a dataset of 27 subjects who imagined 33 repetitions of
5 words in Spanish: up, down, left, right and select. The accuracies
obtained with the models were 0.77, 0.78 and 0.98, respectively, for each
task. The high accuracy rates obtained attest to the feasibility of the
proposed method for subject identification.
Keywords: Imagined Speech, Linguistic activity, Subjects identifica-
tion, Discrete Wavelet Transform, Brain Computer Interfaces, Electroen-
cephalograms, Application Programming Interface
1 Introduction
In recent years, exploration into the identification of various brain activities
has increased considerably, motivated by its potential use as a new way of
communication, with applications ranging from games to medicine and, in general,
to augmented human capabilities. A brain computer interface (BCI) is a
communication system that monitors brain activity and translates features,
corresponding to the user's intentions/thoughts/movements, into control commands.
When thought features are identified and extracted, this can become a new
way of human-machine interaction that allows users to employ their thoughts to
control/operate external devices. BCI techniques can be classified into invasive
and non-invasive; the first requires surgical procedures but yields very clear
brain signals, because the measurements are not attenuated by the skull and the
scalp. Non-invasive BCI does not require any surgery, but the signals obtained
are weaker. Non-invasive BCIs are the most used due to their relatively low cost
and easy setup.
Electroencephalography (EEG) is the neurophysiological exploration based on the
bioelectrical brain activity (oscillations of brain electric potentials, with a
frequency spectrum in Hz) registered during unconstrained rest, sleep, or
different activation tasks. Electrophysiological sources refer to the
neurological mechanisms (also known as neuro-paradigms) used by a BCI user to
generate control signals [1]. Wolpaw et al. [2] and Bashashati et al. [1]
separated these electrophysiological sources into categories based on the
neuronal mechanisms and the recording technology they use. There are some
invasive categories, but those of interest for this work are the non-invasive
ones: sensory-motor activity, the P300 potential, visual evoked potentials (VEP),
slow cortical potentials (SCP), and responses to mental and multiple-task
neuromechanisms.
Later, [3] added imagined or internal speech, which refers to the internal
or imagined pronunciation of words, without uttering sounds and without
articulating gestures. Imagined speech as an electrophysiological source has
advantages over others because it requires little training. In this research,
imagined speech from EEG signals is employed; the dataset consists of EEG
signals from 27 subjects captured while imagining 33 repetitions of five words
in Spanish: up, down, left, right and select.
The state-of-the-art reports the use of imagined speech as an
electrophysiological source, but in the majority of cases the studies are
limited to reduced vocabularies (yes or no, etc.) or to syllables/phonemes,
which limits the possible applications. If complete words are used instead of
syllables/phonemes, the applications can range from medical to biometric
systems. The work in this paper is aimed at a new way of real-time
communication for people who cannot produce sounds or who have specific
diseases such as Amyotrophic Lateral Sclerosis [4,5].
Although the reported literature includes research employing imagined speech,
the on-line/real-time process is not considered. For a real-time implementation
it is first important to identify if the signals correspond to linguistic activity
and once the linguistic segment is identified in a signal, another process using a
multi-class classifier can determine the imagined word associated to the respec-
tive signal.
In this area, efforts are needed to create a method for transfer learning
[6], because in real applications people in need of BCI solutions may have
difficulties training a model anew. The idea of transfer learning is to create a
classifier trained on one group of subjects and use it on a different group of
subjects. In [7], the authors reported experiments and proposed that a first
calibration stage is needed, because the signals are sufficiently different
between subjects and between sessions. Other experiments also suggest the use of imagined
speech from EEG data as a biometric measurement for subject identification
[8], but their experiments are limited to syllables. Security systems used by
organizations to manage access to facilities, equipment or resources and to
protect against theft or espionage by denying unauthorized access are one of
the first applications envisioned by this BCI concept. Different types of safety
measures have been proposed and used for a long time, ranging from standard
systems (security guards, smart cards, etc.) to biometric measurements
(fingerprint, palm-print, etc.). A biometric recognition system is able to
perform automatic recognition of subjects based on their physiological and/or
behavioral features [9]. Any human physiological and/or behavioral
characteristic can be used as a biometric characteristic as long as it satisfies
the following requirements: universality, permanence, collectability,
performance, acceptability and circumvention.
Biometric systems are advantageous compared to generic systems because they are
more difficult to steal, compromise or duplicate, and they can be more
convenient for users, since a single biometric trait can be used for access to
several accounts. However, current biometric systems are vulnerable to attacks
aimed at undermining the integrity of the authentication process [10]. For
example, an intruder may fraudulently obtain the latent fingerprints of a user
and later use them to construct a digital or physical artifact of the user's
finger [11].
Building on the above existing knowledge and on the need for inviolable
methods for subject identification, this paper proposes to advance the research
by employing imagined speech from EEG data to identify subjects in real-time.
For the experiments, an on-line environment (simulated real-time) was used, as
it is explained in the following sections.
The paper is organized as follows: first, the proposed method for the different
tasks is presented and briefly explained. Next, experiments showing the
application of the proposed method to imagined speech classification, linguistic
distinction and subjects identification are described, to finally discuss the
feasibility of the method in real applications and possible future improvements.
2 Proposed method
In general, the method can be summarized in 3 fundamental steps: pre-processing,
feature extraction and classification, as described in the flowchart in figure 1.
To ensure that instances in the training data (off-line) are not used in the
test data (on-line), 30% of the instances were first separated as the test set
and the remaining 70% were used for training, creating the model with 10-fold
cross-validation.
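This off-line/on-line split can be sketched with scikit-learn. The following is a minimal sketch using synthetic stand-in data: the array shapes mirror one subject's 115 instances of 70 features, but the variable names and the random data are ours, not the project's code.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-ins for the real data: one 70-dimensional feature vector
# per EEG instance, with labels for the 5 imagined words (0..4).
rng = np.random.RandomState(0)
X = rng.randn(115, 70)           # 23 repetitions x 5 words
y = np.repeat(np.arange(5), 23)  # word label per instance

# Separate 30% first as the on-line test set, so no test instance
# leaks into the 10-fold cross-validation done on the remaining 70%.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0)

print(X_train.shape, X_test.shape)  # (80, 70) (35, 70)
```

Stratifying keeps the five words equally represented on both sides of the split.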
To implement the proposed method and use it in real time, an Application
Programming Interface (API) was created using Django with Python 2.7; to store
the models for each task and for all experiments, a MySQL database was used.
In this version of the project, the API4 consists of 2 EndPoints, dwt/training
and dwt/{model_id}, both using the POST method according to the HTTP methods [12].
Fig. 1. Flowchart summarizing the steps of the proposed method.
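Stripped of the Django and MySQL plumbing, the contract of these two endpoints can be illustrated in plain Python. The handler names, payload shapes and the in-memory store below are assumptions for illustration, not the project's actual code.

```python
import pickle

# In-memory stand-in for the MySQL table that stores one model per task.
MODEL_STORE = {}

def dwt_training(classifier):
    """POST dwt/training: persist a trained model and return its id."""
    model_id = len(MODEL_STORE) + 1
    MODEL_STORE[model_id] = pickle.dumps(classifier)
    return {"model_id": model_id}

def dwt_predict(model_id, feature_vector):
    """POST dwt/{model_id}: classify one new unlabeled feature vector."""
    clf = pickle.loads(MODEL_STORE[model_id])
    return {"prediction": int(clf.predict([feature_vector])[0])}
```

Keeping the stored model behind a model_id is what lets the same API serve several applications (imagined speech, linguistic activity, subject identification) by switching ids.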
2.1 Dataset
The dataset consists of EEG signals from 27 subjects captured using the EMOTIV
EPOC headset while imagining 33 repetitions of five words in Spanish: up, down,
left, right and select (corresponding to arriba, abajo, izquierda, derecha and
selección). Each repetition of the imagined words was separated by a state of
rest, as shown in figure 2 and described in [13].
EEG signals were recorded from 14 high-resolution channels (AF3, F7, F3, FC5,
T7, P7, O1, O2, P8, T8, FC6, F4, F8 and AF4; see figure 3) with a sampling
frequency of 128 Hz; the electrodes were placed according to the 10-20
international system [14].
2.2 Pre-processing
In order to improve the signal-to-noise ratio, the common average reference (CAR)
method was used [15,16]. As formula 1 shows, the CAR method removes the data
common to all electrodes recorded simultaneously.
4For more information about this public project, visit:
Fig. 2. Protocol designed in [13] for EEG signal acquisition using EMOTIV EPOC.
Fig. 3. 10-20 international system for 14 channels [14].
$V_i^{CAR} = V_i^{ER} - \frac{1}{n}\sum_{j=1}^{n} V_j^{ER}$ (1)
where $V_i^{ER}$ is the potential between the i-th electrode and the reference,
and n is the number of electrodes.
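In code, the CAR re-referencing of formula 1 amounts to subtracting, at each sample, the mean over all electrodes. A minimal NumPy sketch (assuming the recording is stored as a channels-by-samples array; the function name is ours):

```python
import numpy as np

def common_average_reference(eeg):
    """Apply CAR to an (n_channels, n_samples) EEG array: subtract
    from each electrode the instantaneous mean over all electrodes."""
    return eeg - eeg.mean(axis=0, keepdims=True)
```

After re-referencing, the mean across channels is zero at every sample, which is exactly the common component that formula 1 removes.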
2.3 Feature extraction
The EEG signals are usually non-stationary, they change rapidly over time and
patterns of brain activity contain information related to specific variations over
time. A representation of the signal that considers this behavior is necessary for
a proper feature extraction.
When the DWT is applied to a signal S with decomposition level j = 4, it gives a
structure with vectors of approximation (cAj) and detail (cDj) coefficients:
[cA4, cD4, cD3, cD2, cD1], as shown in figure 4; table 1 shows the related
frequencies per level of decomposition.
According to the average size of the dataset, 4 decomposition levels were
applied. However, the wavelet coefficients for each decomposition level will
vary depending on the signal size (the duration of the imagined pronunciation
of the word varies between imagined words from the same subject and between
imagined words from different subjects).
Fig. 4. Coefficient vectors in the 4th decomposition level of DWT for a signal S.
Table 1. DWT with 4 decomposition levels, frequency ranges and related brain rhythms.
Level Frequency range (Hz) Brain rhythm
cD1 32-64 Gamma
cD2 16-32 Beta (16-30 Hz) and Gamma (30-32 Hz)
cD3 8-16 Alpha (8-12 Hz) and Beta (12-16 Hz)
cD4 4-8 Theta
cA4 0-4 Delta
To deal with this problem, the instantaneous and Teager energy distributions
were calculated [17]. These energy distributions were used since they have
shown the best results for imagined speech [18,7]. When energy coefficients are
calculated, it is possible to have the same number of features for all instances.
In this work, the feature vector for each instance was represented with energy
coefficients calculated for each decomposition level of the DWT Biorthogonal
2.2 (bior2.2) and for each channel, which were then concatenated in order to
have a single feature vector. The expressions for these energy distributions
are shown below:
Instantaneous: gives the energy distribution in each band [17]:
$f_j = \log_{10}\left(\frac{1}{N_j}\sum_{r=1}^{N_j} (w_j(r))^2\right)$ (2)
Teager: this energy operator reflects variations in both amplitude and
frequency of the signal, and it is a robust parameter for speech recognition as
it attenuates auditory noise [19,17]:
$f_j = \log_{10}\left(\frac{1}{N_j}\sum_{r=1}^{N_j} |(w_j(r))^2 - w_j(r-1)\, w_j(r+1)|\right)$ (3)
where $w_j(r)$ is the r-th wavelet coefficient of the j-th decomposition level
and $N_j$ is the number of coefficients at that level.
At this point, instead of having a feature vector for each decomposition level,
we have a single value for each one, and the process is repeated for each
channel. After this process, we have 5 values per channel (cA4, cD4, cD3, cD2,
cD1), and the values of all 14 channels are concatenated in order to have a
single feature vector with 70 coefficients representing each instance of the
EEG signal.
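Under the settings above (bior2.2, 4 decomposition levels, one energy value per sub-band per channel), the feature extraction can be sketched with PyWavelets and NumPy. The helper names are ours, and the two energy expressions are implemented directly:

```python
import numpy as np
import pywt

def instantaneous_energy(w):
    # log10 of the mean squared wavelet coefficient in the sub-band.
    return np.log10(np.mean(w ** 2))

def teager_energy(w):
    # log10 of the mean absolute Teager operator
    # |w(r)^2 - w(r-1)*w(r+1)| over the interior coefficients.
    return np.log10(np.mean(np.abs(w[1:-1] ** 2 - w[:-2] * w[2:])))

def feature_vector(eeg, wavelet="bior2.2", level=4,
                   energy=instantaneous_energy):
    """eeg: (n_channels, n_samples) array. Returns one energy value
    per sub-band [cA4, cD4, cD3, cD2, cD1] per channel, concatenated:
    5 sub-bands x 14 channels = 70 features."""
    feats = []
    for channel in eeg:
        for coeffs in pywt.wavedec(channel, wavelet, level=level):
            feats.append(energy(coeffs))
    return np.array(feats)
```

Because each sub-band is collapsed to a single energy value, signals of different durations always yield a 70-dimensional feature vector, which is the point of this step.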
2.4 Classification
Once the feature vectors are computed for each instance of the EEG signal,
random forest is used for automatic classification, with an implementation in
Python 2.7 using the scikit-learn library [20]. For all experiments, the
parameters for random forest in scikit-learn were max_depth=5, random_state=0
and criterion='gini', selected after testing the possible combinations. This
classifier was selected because of the good results reported in the authors'
previous work on imagined speech classification using EEG signals [13,18].
To evaluate the classifier performance with 10-fold cross-validation, an
accuracy index was defined and calculated.
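With scikit-learn, the classification stage is compact. A sketch using the stated parameters follows; the feature matrix X and labels y stand for the 70-coefficient vectors produced by the extraction step, but here they are synthetic:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in: 115 instances (23 per word) of 70 features.
X, y = make_classification(n_samples=115, n_features=70, n_classes=5,
                           n_informative=10, random_state=0)

# Random forest with the stated parameters, scored by accuracy
# over 10-fold cross-validation.
clf = RandomForestClassifier(max_depth=5, random_state=0, criterion="gini")
scores = cross_val_score(clf, X, y, cv=10, scoring="accuracy")
print("mean accuracy: %.2f" % scores.mean())
```

cross_val_score handles the stratified 10-fold splitting, so the per-fold accuracies can be averaged directly into the reported accuracy index.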
3 Experiments and results
In this paper an implementation of the proposed method towards on-line BCI
for identification of imagined speech from EEG signals is developed.
The results from several experiments using DWT bior2.2 with Instantaneous
and Teager energy distribution are reported here. For all described experiments
the models were created off-line and tested on-line, using the API created. The
experiments were separated into 3 different groups. First, we present the
results of using the proposed method for imagined speech classification. Then,
we separate the imagined speech into a class named linguistic activity and the
states of rest into a class named linguistic inactivity, in order to check
whether the proposed method can distinguish imagined speech from other
activities (rest). In addition, the signals were separated and tagged with a
subject ID (S1, S2, ..., S27) to create a machine learning model for subjects
identification and then use it in real time. This was done with subject-level
and subject-word-level analyses, as shown in the next experiments.
3.1 Imagined speech classification
Once the feature extraction was applied to the EEG signals, a classifier for each
subject using random forest was created. For each subject, the classifier consists
of 5 classes corresponding to the 5 imagined words (up, down, left, right and
select), and for each imagined word 23 instances were used.
The average accuracy for imagined speech classification task is presented in
figure 5, where it can be noted that when the Teager energy distribution was
used, the highest accuracy was reached (0.77).
Fig. 5. Average accuracy and standard deviation obtained with 27 subjects for imag-
ined speech classification.
Fig. 6. Procedure to separate instances
into linguistic activity and inactivity.
Fig. 7. Average accuracy and standard de-
viation obtained with 27 subjects for Lin-
guistic activity and inactivity distinction.
In this case, the model saved in the database for real-time use was the one
trained with instantaneous energy values. Using the trained model in the
on-line environment with the 30% of instances held out, the average accuracy
obtained was 0.85; it can be noted that the on-line accuracy is highly related
to the accuracy of the model.
3.2 Linguistic activity and linguistic inactivity
In a complete real-time application (a continuous recording without
restrictions), it is necessary to distinguish the brain activity generated by
the subject when imagining a word (linguistic activity) from any other brain
activity (non-linguistic activity, or inactivity). Here, "complete real-time
application" refers to identifying the linguistic activity segment and then
using another classifier to detect the specific imagined word.
In this experiment the EEG signals were separated into 2 classes: the set of
imagined words forming the linguistic activity class, and the states of rest
(unconstrained rest) serving as examples of other brain activity. In this work,
the latter are called linguistic inactivity. The process of separating the
instances into the 2 classes is shown in figure 6, where R_Up, R_Down, R_Left,
R_Right and R_Select correspond to the rest states between repetitions.
The model saved in the database was the one trained with the instantaneous
energy distribution; in the test stage, the average accuracy obtained was 0.78.
3.3 Subjects identification
This experiment was carried out to check whether there is sufficient information
in the EEG data for this task and to create a model for subject identification.
For this, the 115 instances of imagined words per subject (corresponding to 23
repetitions of each of the 5 imagined words) were considered as a single class,
tagged with a subject ID (S1, S2, ..., S27).
This experiment was performed with 27 subjects using the instantaneous and
Teager energy distributions based on the DWT bior2.2. The results obtained in
the classification step with 10-fold cross-validation with random forest are
shown in figure 8.
Fig. 8. Accuracy obtained when a classifier with all imagined words as a single class
was created with 27 subjects using 10-fold cross-validation.
In figure 8 it can be observed that the best accuracy, 0.95, is obtained when
using the instantaneous energy distribution; the accuracy with the Teager
energy is similar, so either can be used. The saved model was the one with
instantaneous energy, and using the test set the average accuracy obtained
was 0.92.
This result suggests that it is possible to identify subjects regardless of the
word the subject is imagining; that is, a subject can be identified using
different words. This brings up the question of whether there is a specific
word best suited for subjects identification.
To answer this question, the following experiment was carried out, which
consists of testing the classification with random forest with all 27 subjects but
with a classifier per imagined word. For this, the 23 repetitions of each imagined
word were used separately and the experiment was repeated for the 5 words from
the dataset of EEG signals. The classification for each of the imagined words
was done with the two feature extraction approaches used in the previous
experiment (instantaneous and Teager) in order to compare their strengths. The results
obtained in the classification step with 10-fold cross-validation using random
forest are shown in figure 9.
Fig. 9. Accuracies obtained when classifiers for each word separately were created with
27 subjects using 10-fold cross-validation.
Figure 9 again shows that the accuracy is higher when using instantaneous
energy. The highest accuracy is obtained with the imagined word Down. From
this, it can preliminarily be asserted that the most suitable word for the
subjects identification task is the imagined word Down. However, when using
the other imagined words, the results are not much different, and in all cases
the accuracy is above 0.92.
For the subjects identification task, the model saved for each imagined word
was in all cases the one using the instantaneous energy distribution. As
described in the proposed method, 30% of the instances were used for the test
stage in the on-line environment (in this case 270 instances per experiment,
corresponding to 10 imagined words for each of the 27 subjects). The average
accuracies per imagined word were 0.92, 0.91, 0.92, 0.95 and 0.90, respectively.
4 Discussion and Conclusions
In this work, an EEG dataset recorded using the neurological source imagined
speech was used to create machine learning-based models. To use the method in
real time, an API using Django in Python was designed and created.
The accuracy obtained in the on-line environment is highly related to the model
created. In the best case, the average accuracy obtained with the test set was
0.95. This accuracy level was obtained in the subjects identification task
using the imagined word Down.
The time needed to train and save a machine learning model is not critical,
since it is an off-line process. Using the API for real-time classification,
the time to process a new unlabeled entry depends on the size of the signal and
the computational complexity of the DWT ($O(N \log_2 N)$) [21]; however, for
fast responses it is necessary to implement techniques to distribute the
memory/work.
In general, as the experiments in this paper report, EEG signals can enable a
new way of communication, and the work of processing new signals can be
distributed using an API (useful for several predictions at the same time).
In addition, EEG signals can be used as a password or as a measure for a
biometric security system in several environments. The first experiment on
subjects identification shows that subjects can be identified independently of
the imagined word. This suggests that it is possible to use one classifier to
identify subjects and then use the classified imagined word as a control
command (calling the API with the model_id corresponding to the application) in
a real application. For example, it can be used as a 2-step verification or to
send 2 or more commands at the same time (e.g., give access and call the
police, give access and turn on the lights, etc.); in summary, for domestic
security applications.
Distributing the work through an API has benefits and will contribute to the
use of a single machine learning model for several applications; for example,
for subjects identification, the same model could be used to manage access to
2 or more places, for several users, and for different tasks. In addition, it
could accelerate the classification process because there are fewer
restrictions on the computers used (e.g., high-performance servers and
supercomputers).
Future research efforts will be dedicated to exploring the extent to which
specific channels can provide more information for these tasks, in order to
reduce the number of channels for real-time applications and decrease the
processing time for a new unlabeled entry. In addition, implementations in a
real environment (with additional noise) and with new feature extraction
techniques will be tested.
Acknowledgments. This work was done under partial support of CONACYT
(scholarship #591475), and the project “David versus Goliath: single-channel
EEG unravels its power through adaptive signal analysis - FlexEEG” which is
supported by Enabling Technologies - NTNU.
References
1. Bashashati, Ali, Mehrdad Fatourechi, Rabab K. Ward, and Gary E. Birch: A survey
of signal processing algorithms in brain-computer interfaces based on electrical
brain signals. Journal of Neural Engineering 4, no. 2 (2007): R32.
2. Wolpaw, Jonathan R., Niels Birbaumer, Dennis J. McFarland, Gert Pfurtscheller,
and Theresa M. Vaughan: Brain-computer interfaces for communication and control.
Clinical neurophysiology 113, no. 6 (2002): 767-791.
3. Desain, Peter, Jason Farquhar, Pim Haselager, Christian Hesse, and R. S. Schaefer:
What BCI research needs. In Proc. ACM CHI 2008 Conf. on Human Factors in
Computing Systems (Venice, Italy) (2008):.
4. Elman, Lauren B., and L. McCluskey: Clinical features of amyotrophic lateral sclero-
sis and other forms of motor neuron disease. Up-to-date. Waltham: Wolters Kluwer
Health (2012): 23.
5. Feller, T. G., R. E. Jones, and M. G. Netsky: Amyotrophic lateral sclerosis and
sensory changes. Virginia medical monthly 93 , no. 6 (1966): 328.
6. Jayaram, Vinay, Morteza Alamgir, Yasemin Altun, Bernhard Scholkopf, and Moritz
Grosse-Wentrup: Transfer learning in brain-computer interfaces. IEEE Computa-
tional Intelligence Magazine 11, no. 1 (2016): 20-31
7. Moctezuma, Luis Alfredo: Distinci´on de estados de actividad e inactividad
ling¨ıstica para interfaces cerebro computadora. Thesis project of master degree
8. Brigham, Katharine, and BVK Vijaya Kumar: Subject identification from electroen-
cephalogram (EEG) signals during imagined speech. In Biometrics: Theory Applica-
tions and Systems (BTAS), 2010 Fourth IEEE International Conference on (2010):
9. Jain, Anil K., Arun Ross, and Salil Prabhakar: An introduction to biometric recog-
nition. IEEE Transactions on circuits and systems for video technology 14, no. 1
(2004): 4-20.
10. Jain, Anil K., Arun Ross, and Umut Uludag: Biometric template security: Chal-
lenges and solutions. Signal Processing Conference 13th European IEEE (2005):
11. Uludag, Umut, and Anil K. Jain: Attacks on biometric systems: a case study in
fingerprints. In Security, Steganography, and Watermarking of Multimedia Contents
VI, vol. 5306, International Society for Optics and Photonics (2004): 622-634
12. Fielding, Roy, Jim Gettys, Jeffrey Mogul, Henrik Frystyk, Larry Masinter, Paul
Leach, and Tim Berners-Lee: Hypertext transfer protocol–HTTP/1.1. No. RFC
2616. (1999):.
13. Torres-García, A. A., C. A. Reyes-García, L. Villaseñor-Pineda, and J. M.
Ramírez-Cortés: Análisis de señales electroencefalográficas para la clasificación
de habla imaginada. Revista Mexicana de Ingeniería Biomédica 34, no. 1 (2013): 23-39.
14. Jasper, Herbert: Report of the committee on methods of clinical examination in
electroencephalography. Electroencephalogr Clin Neurophysiol 10 (1958): 370-375.
15. Bertrand, O., F. Perrin, and J. Pernier: A theoretical justification of the aver-
age reference in topographic evoked potential studies. Electroencephalography and
Clinical Neurophysiology/Evoked Potentials Section 62, no. 6 (1985): 462-464.
16. Alhaddad, Mohammed J: Common average reference (car) improves p300 speller.
International Journal of Engineering and Technology 2, no. 3 (2012): 21.
17. Didiot, Emmanuel, Irina Illina, Dominique Fohr, and Odile Mella: A wavelet-based
parameterization for speech/music discrimination. Computer Speech & Language 24,
no. 2 (2010): 341-357.
18. Moctezuma, Luis Alfredo, Maya Carrillo, Luis Villaseñor-Pineda, and Alejandro A.
Torres-García: Hacia la clasificación de actividad e inactividad lingüística a partir
de señales de electroencefalogramas (EEG). Research in Computing Science 140
(2017): 135-149.
19. Jabloun, Firas, and A. Enis Cetin: The Teager energy based feature parameters for
robust speech recognition in car noise. In Acoustics, Speech, and Signal Processing.
1999 IEEE International Conference on, vol. 1. (1999): 273-276
20. Pedregosa, Fabian, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel,
Bertrand Thirion, Olivier Grisel, Mathieu Blondel et al.: Scikit-learn: Machine
learning in Python. Journal of Machine Learning Research 12 (2011): 2825-2830.
21. Averbuch, Amir Z., and Valery A. Zheludev: Construction of biorthogonal dis-
crete wavelet transforms using interpolatory splines. Applied and Computational
Harmonic Analysis 12, no. 1 (2002): 25-56.
... Empirical Mode Decomposition (EMD) [4] and Discrete Wavelet Transform (DWT) [5][6][7] have been applied to transform and analyze brain signals while different mental tasks are performed. Both, EMD and DWT, have shown to be effective in decomposing non-stationary/non-linear time series. ...
... In more recent works, Subject identification methods based on imagined speech using DWT [5] and EMD [4] for feature extraction, were presented and the results obtained suggest that EEG of imagined speech can be a good candidate as a biometric marker. In the present work, a new conceptual proposition using resting-states (unconstrained rest) in conjunction with fewer EEG channels/instances, is explored. ...
... Then, for each level of decomposition the instantaneous energy was obtained. The flowchart describing the method is shown in figure 4 and is detailed first in [6] and then used for Subject identification using imagined speech in [5]. ...
Full-text available
A new concept of low-density electroencephalograms-based (EEG) Subject identification is proposed in this paper. To that aim, EEG recordings of resting-states were analyzed with 3 different classifiers (SVM, k-NN, and naive Bayes) using Empirical Mode Decomposition (EMD) and Discrete Wavelet Transform (DWT) for feature extraction and their accuracies were estimated to compare their performances. To explore the feasibility of using fewer channels with minimum loss of accuracy, the methods were applied to a dataset of 27 Subjects (From 5 sessions of 30 instances per Subject) recorded using the EMOTIV EPOC device with 1 set of 14 channels and 4 subsets (8, 4, 2 and 1 channel) that were selected using a greedy algorithm. The experiments were reproduced using fewer instances each time to observe the evolution of the accuracy using both; fewer channels and fewer instances. The results of this experiments suggest that EMD compared with DWT is a more robust technique for feature extraction from brain signals to identify Subjects during resting-states, particularly when the amount of information is reduced: e.g., using Linear SVM and 30 instances per Subject, the accuracies obtained using 14 channels were 0.91 and 0.95, with 8 channels were 0.87 and 0.89 with EMD and DWT repectively but were reversed in favor of EMD when the number of channels was reduced to 4 channels (0.76 and 0.74), 2 (0.64 and 0.56) and 1 channel (0.46 and 0.31). The general observed trend is that, Linear SVM exhibits higher accuracy rates using high-density EEG (0.91 with 14 channels) while Gaussian naive Bayes exhibits better accuracies when using low-density EEG in comparison with the other classifiers (with EMD 0.88, 0.81, 0.76 and 0.61 respectively for 8, 4, 2 and 1 channel). The findings of these experiments reveal an important insight for continuing the exploration of low-density EEG for Subject identification.
... Software/Theoretical Pillars: a) Neuromagnetic Inverse Problem solutions for brain source localization with scanning map will be based on the state-of-the-art regularized observers and Bayesian frameworks to solve the EEG inverse problem [7], [10] and on the new developments by the project team on partial brain model and optimal electrode configurations [14], [15], [17], [22], [23] [26]. b) Algorithms for real-time continuous EEG monitoring, signal analysis and classification techniques, based on state-of-the-art machine learning algorithms for classification, techniques for feature extraction of event related potentials and non-stationary signal analysis like HHT [29] and existing work by the team in [24], [25] to develop tailored BCI solutions and cEEG systems for WP3 and WP4 exploring the needs reported in [27], [28]. Hardware Pillars a) and b) will be used in WP1 and WP2 while Theoretical Pillars a) in WP2 and b) in WP3 and WP4. ...
Research Proposal
Full-text available
FlexEEG anticipates a new low-density EEG scanning concept based on dry electrodes that will bring real-time brain imaging from scalp signals into the hands of the user. This will materialize into a real-time Brain Computer Interface (BCI) with brain mapping capabilities. FlexEEG will address the hardware and software challenges together in an embedded design solution that will merge the dry electrode-amplifier with the brain mapping tool into a wireless digital EEG sensor. To achieve this, it will exploit methods from inverse problems, path tracking and integrated circuit design for EEG scanning that can attain quality comparable to high-density EEG, to be tested on infants and in intensive care units. FlexEEG will have significant impact in expanding the use of EEG brain mapping from research to daily clinical use and to the domains of cognitive development, intensive care medicine and rehabilitation.
... Several different machine learning approaches have been used to classify imagined speech data, among them SVM [6], RF [15], and linear discriminant analysis (LDA) [16]. SVM has been the most frequently used method, but none of them has proven to be superior to the others. ...
Imagined speech is a relatively new electroencephalography (EEG) neuro-paradigm, which has seen little use in Brain-Computer Interface (BCI) applications. Imagined speech can allow physically impaired patients to communicate and to use smart devices by imagining desired commands, which are then detected and executed by the smart device. The goal of this research is to verify previous classification attempts and then design a new, more efficient neural network that is noticeably less complex (fewer layers) while still achieving a comparable classification accuracy. The classifiers are designed to distinguish between EEG signal patterns corresponding to imagined speech of different vowels and words. This research uses a dataset consisting of 15 subjects imagining saying the five main vowels (a, e, i, o, u) and six different words. Two previous studies on imagined speech classification are verified, as those studies used the same dataset used here, and the replicated results are compared. The main goal of this study is to take the proposed convolutional neural network (CNN) model from one of the replicated studies and make it much simpler and less complex, while attempting to retain a similar accuracy. The pre-processing of the data is described, and a new CNN classifier with three different transfer learning methods is described and used to classify the EEG signals. Classification accuracy is used as the performance metric. The new proposed CNN, which uses half as many layers and less complex pre-processing methods, achieved a considerably lower accuracy, but still managed to outperform the initial model proposed by the authors of the dataset by a considerable margin. It is recommended that further studies investigating the classification of imagined speech use more data and more powerful machine learning techniques. Transfer learning proved beneficial and should be used to improve the effectiveness of neural networks.
... We implement a method to verify that we are really capturing flicker information, so we analyze data from the Muse headband (.muse) and convert it to a MATLAB file (.mat). In order to improve the signal-to-noise ratio, the common average reference (CAR) method was used [15]. ...
... In the work presented in [21], the Common Average Reference (CAR) [22] was used to improve the signal-to-noise ratio. Then, feature extraction was based on the instantaneous and Teager energy distributions of 4 wavelet decomposition levels using the mother function Biorthogonal 2.2, and random forest was used for classification. ...
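A minimal sketch of the two energy features mentioned above, assuming the usual log-energy definitions over the coefficients of each decomposition level (the exact normalisation in [21] may differ). Obtaining the Biorthogonal 2.2 coefficients themselves would be done with a wavelet library and is not shown:

```python
import numpy as np

def instantaneous_energy(c):
    """Log of the mean squared wavelet coefficient at one level."""
    c = np.asarray(c, dtype=float)
    return np.log10(np.mean(c ** 2))

def teager_energy(c):
    """Log of the mean discrete Teager energy psi[n] = c[n]^2 - c[n-1]*c[n+1]."""
    c = np.asarray(c, dtype=float)
    psi = c[1:-1] ** 2 - c[:-2] * c[2:]
    return np.log10(np.abs(np.mean(psi)))

def energy_features(coeffs):
    """Two features per decomposition level, concatenated into one vector."""
    return np.array([f(c) for c in coeffs
                     for f in (instantaneous_energy, teager_energy)])
```

With 4 decomposition levels per channel this yields an 8-dimensional feature vector per channel, which would then feed the random forest classifier.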
Brain activity holds promising potential for augmenting human capacities. In this paper, EMD is used to decompose EEG signals during imagined speech in order to use them as a biometric marker for creating a biometric recognition system. For each EEG channel, the most relevant Intrinsic Mode Functions (IMFs) are selected based on the Minkowski distance, and for each IMF 4 features are computed: instantaneous and Teager energy distributions and the Higuchi and Petrosian fractal dimensions. To test the proposed method, a dataset with 20 subjects who imagined 30 repetitions of 5 words in Spanish is used. Four classifiers are used for this task - random forest, SVM, naive Bayes, and k-NN - and their performances are compared. The accuracy obtained (up to 0.92 using Linear SVM) after 10-fold cross-validation suggests that the proposed method based on EMD can be valuable for creating EEG-based biometrics of imagined speech for subject identification.
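Two pieces of the pipeline above can be sketched as follows, assuming "most relevant" means the smallest Minkowski distance between an IMF and its channel signal, and using the standard Petrosian fractal dimension formula; the EMD step itself and the Higuchi dimension are omitted:

```python
import numpy as np

def minkowski(a, b, p=2):
    """Minkowski distance of order p between two equal-length signals."""
    return np.sum(np.abs(np.asarray(a) - np.asarray(b)) ** p) ** (1.0 / p)

def most_relevant_imfs(signal, imfs, p=2, k=2):
    """Indices of the k IMFs closest to the original signal (assumed criterion)."""
    dists = [minkowski(signal, imf, p) for imf in imfs]
    return list(np.argsort(dists)[:k])

def petrosian_fd(x):
    """Petrosian fractal dimension; n_delta counts sign changes of the derivative."""
    x = np.asarray(x, dtype=float)
    diff = np.diff(x)
    n_delta = np.sum(diff[1:] * diff[:-1] < 0)
    n = len(x)
    return np.log10(n) / (np.log10(n) + np.log10(n / (n + 0.4 * n_delta)))
```

As a sanity check, the Petrosian dimension of white noise comes out higher than that of a smooth sinusoid, since noise has far more sign changes in its derivative.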
The ability to exchange messages between two or more people is well known; it requires a shared system of signs and semantic rules. Forms of communication can commonly be combined: for example, someone who uses spoken language can communicate with someone who only uses sign language. However, a significant percentage of the world's population cannot use traditional models of communication. In the case of Mexico, the Instituto Nacional de Estadística y Geografía reports that 6% of the population has some disability, of which 18% are related to disabilities in speaking or communicating, among them amyotrophic lateral sclerosis, multiple sclerosis, and spinal cord or brain injuries. People who suffer from these disabilities face several risks and limitations in their daily interactions with the people around them: they cannot perform most tasks and cannot communicate what they think or feel. A widely accepted answer is the use of interfaces that connect the human brain to a computer (BCI). Non-invasive BCIs based on electroencephalograms (EEG) are the most widely used owing to their relatively low cost. One type of BCI, known as silent speech interfaces, aims to develop systems capable of enabling spoken communication when it is impossible to emit an audible acoustic signal. To achieve this, a neurological mechanism is used that consists of imagining the diction of words, commonly known as imagined speech, and then identifying which word it was by computational methods.
The task of identifying imagined words has been approached as a classification problem, and to solve it researchers have sought ways of characterizing the signals and algorithms capable of performing the classification, since it is impractical to use all the information generated by the human brain. In this work, two ways of characterizing EEG signals to identify imagined speech were evaluated. The EEG signals were separated into 2 classes: a set of imagined words considered the linguistic-activity class, and the resting or pause states as examples of other brain activity, the latter called, for simplicity, linguistic inactivity. The goal is thus to characterize and identify linguistic activity versus inactivity, since this is an indispensable preprocessing step for the development of a BCI based on imagined speech. The idea behind this classification is to direct the work towards detecting imagined speech in real time: a first process must detect whether linguistic activity or inactivity is present, and a subsequent process must identify the imagined word. The first characterization computed the instantaneous, relative, hierarchical, and Teager energy distributions based on the discrete wavelet transform (DWT). The second characterization was based on computing statistical values directly on the signal. For the classification stage, random forest, SVM, and naive Bayes were used. The experiments were performed on two imagined-speech datasets, one with 27 subjects and another with 20. The results show that when using the statistical values and the random forest algorithm, linguistic activity and inactivity are distinguished better than when using the DWT; however, it should be noted that both characterizations achieve high accuracy rates.
The experiments described were performed offline; however, to build an online BCI it is necessary to have fast responses and precise processes to identify when linguistic activity and inactivity begin and end. This work analyzed whether small segments of the EEG signal contain information that makes it possible to distinguish linguistic activity from inactivity: when different window sizes are used, and as the window size decreases, the accuracy varies. With small windows, the beginning and end of linguistic activity can be identified in less time, making real-time recognition of imagined speech possible.
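The sliding-window analysis described above can be sketched as follows; the window and step sizes shown are illustrative choices, not values from the study:

```python
import numpy as np

def sliding_windows(signal, win_len, step):
    """Split one EEG channel into (possibly overlapping) windows.

    Returns an array of shape (n_windows, win_len); each window can then be
    classified independently as linguistic activity or inactivity.
    """
    signal = np.asarray(signal)
    starts = range(0, len(signal) - win_len + 1, step)
    return np.stack([signal[s:s + win_len] for s in starts])
```

Shrinking `win_len` shortens the detection latency at the cost of giving the classifier less signal per decision, which is exactly the trade-off the study examines.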
In this paper a comprehensive comparative study of different re-referencing techniques is given, as well as their applications to the P300 speller in both offline and online settings. Twelve different re-referencing techniques were applied to three different datasets and their results were compared with each other. The results showed that the Common Average Reference (CAR) is the best suited reference technique.
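CAR itself is a one-line operation: at each time sample, subtract the average over all channels, so that components common to every electrode cancel out. A minimal sketch:

```python
import numpy as np

def common_average_reference(eeg):
    """Re-reference an EEG array of shape (n_channels, n_samples).

    Subtracts, at every time sample, the mean over all channels, removing
    signal components shared by every electrode (e.g. common-mode noise).
    """
    eeg = np.asarray(eeg, dtype=float)
    return eeg - eeg.mean(axis=0, keepdims=True)
```

After re-referencing, the channel mean at every sample is exactly zero by construction.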
In this paper, a new set of speech feature parameters based on multirate signal processing and the Teager energy operator is developed. The speech signal is first divided into nonuniform subbands on the mel scale using a multirate filter-bank, then the Teager energies of the subsignals are estimated. Finally, the feature vector is constructed by log-compression and inverse DCT computation. The new feature parameters achieve robust speech recognition performance in car engine noise, which is low-pass in nature.
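The discrete Teager energy operator at the core of this front end is psi[n] = x[n]^2 - x[n-1]*x[n+1]; for a pure tone A*sin(w*n) it evaluates exactly to A^2*sin(w)^2, so it tracks amplitude and frequency jointly. A minimal sketch of the operator alone (the mel filter-bank, log-compression and inverse DCT stages are omitted):

```python
import numpy as np

def teager_energy_operator(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1]*x[n+1].

    Input of length N yields output of length N-2 (the endpoints have no
    neighbours on both sides).
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]
```

On a constant-amplitude tone the output is a constant, which is what makes per-subband Teager energies usable as compact features.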
Conference Paper
We investigate the potential of using electrical brainwave signals during imagined speech to identify which subject the signals originated from. Electroencephalogram (EEG) signals were recorded at the University of California, Irvine (UCI) from 6 volunteer subjects imagining speaking one of two syllables, /ba/ and /ku/, at different rhythms without performing any overt actions. In this work, we assess the degree of subject-to-subject variation and the feasibility of using imagined speech for subject identification. The EEG data are first preprocessed to reduce the effects of artifacts and noise, and autoregressive (AR) coefficients are extracted from each electrode's signal and concatenated for subject identification using a linear SVM classifier. The subjects were identifiable to a 99.76% accuracy, which indicates a clear potential for using imagined speech EEG data for biometric identification due to its strong inter-subject variation. Furthermore, the subject identification appears to be tolerant to differing conditions such as different imagined syllables and rhythms (as it is expected that the subjects will not imagine speaking the syllables at exactly the same rhythms from trial to trial). The proposed approach was also tested on a publicly available database consisting of EEG signals corresponding to Visual Evoked Potentials (VEPs) to test the applicability of the proposed method on a larger number of subjects, and it was able to classify 120 subjects with 98.96% accuracy.
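The AR feature extraction described above can be sketched as follows; the AR order and the least-squares estimator are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

def ar_coefficients(x, order=6):
    """Least-squares fit of x[n] ~ sum_k a_k * x[n-k]; returns (a_1..a_order)."""
    x = np.asarray(x, dtype=float)
    y = x[order:]
    # Column k holds the lag-(k+1) samples aligned with y.
    lags = np.column_stack([x[order - k: len(x) - k]
                            for k in range(1, order + 1)])
    coeffs, *_ = np.linalg.lstsq(lags, y, rcond=None)
    return coeffs

def ar_feature_vector(trial, order=6):
    """trial: (n_channels, n_samples); concatenated per-channel AR coefficients."""
    return np.concatenate([ar_coefficients(ch, order) for ch in trial])
```

The concatenated vector (n_channels * order values per trial) would then be fed to the linear SVM for subject identification.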
It is often a problem in various fields that one runs into a series of tasks that appear - to a human - to be highly related to each other, yet applying the optimal machine learning solution of one problem to another results in poor performance. Specifically in the field of brain-computer interfaces (BCIs), it has long been known that a subject with good classification of some brain signal today could come into the experimental setup tomorrow and perform terribly using the exact same classifier. One initial approach to get over this problem was to fix the classification rule beforehand and train the patient to force brain activity to conform to this rule.
This work aims to interpret EEG signals recorded during the imagined pronunciation of words from a reduced vocabulary, without emitting sounds or articulating movements (imagined or unspoken speech), with the intention of controlling a device. Specifically, the vocabulary would allow controlling the computer cursor, and consists of the Spanish words "arriba" (up), "abajo" (down), "izquierda" (left), "derecha" (right), and "seleccionar" (select). To this end, the EEG signals of 27 individuals were recorded using a basic protocol in order to know a priori in which segments of the signal the person imagines the pronunciation of the indicated word. Subsequently, the discrete wavelet transform (DWT) is used to extract features from those segments, which are used to compute the relative wavelet energy (RWE) at each of the levels into which the signal is decomposed, and a subset of RWE values from the frequency ranges below 32 Hz is selected. These are then concatenated in two different configurations: 14 channels (full) and 4 channels (those closest to Broca's and Wernicke's areas). For both configurations, three classifiers are trained: naive Bayes (NB), random forest (RF), and support vector machine (SVM). The best accuracy rates were obtained with RF, whose averages were 60.11% and 47.93% using the 14-channel and 4-channel configurations, respectively. Although the results are still preliminary, they are above 20%, i.e., above chance for five classes. It can therefore be conjectured that EEG signals may contain information that makes it possible to classify the imagined pronunciations of the words in the reduced vocabulary.
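The relative wavelet energy used above is simply each decomposition level's coefficient energy normalised by the total energy across levels. A minimal sketch (computing the DWT coefficients themselves is left to a wavelet library):

```python
import numpy as np

def relative_wavelet_energy(coeffs):
    """coeffs: list of per-level wavelet coefficient arrays.

    Returns one RWE value per level; the values describe how signal energy
    is distributed across the frequency bands and sum to 1.
    """
    energies = np.array([np.sum(np.asarray(c, dtype=float) ** 2)
                         for c in coeffs])
    return energies / energies.sum()
```

Which levels fall below 32 Hz then depends on the sampling rate: at 128 Hz, for instance, the first detail level covers 32-64 Hz, so all deeper levels lie below 32 Hz.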
We present a new family of biorthogonal wavelet and wavelet packet transforms for discrete periodic signals and a related library of biorthogonal periodic symmetric waveforms. The construction is based on the superconvergence property of the interpolatory polynomial splines of even degrees. The construction of the transforms is performed in a “lifting” manner that allows more efficient implementation and provides tools for custom design of the filters and wavelets. As is common in lifting schemes, the computations can be carried out “in place” and the inverse transform is performed in a reverse order. The difference with the conventional lifting scheme is that all the transforms are implemented in the frequency domain with the use of the fast Fourier transform. Our algorithm allows a stable construction of filters with many vanishing moments. The computational complexity of the algorithm is comparable with the complexity of the standard wavelet transform. Our scheme is based on interpolation and, as such, it involves only samples of signals and it does not require any use of quadrature formulas. In addition, these filters yield perfect frequency resolution.
This paper addresses the problem of parameterization for speech/music discrimination. The current successful parameterization based on cepstral coefficients uses the Fourier transform (FT), which is well adapted to stationary signals. In order to take into account the non-stationarity of music/speech signals, this work proposes to study wavelet-based signal decomposition instead of the FT. Three wavelet families and several numbers of vanishing moments have been evaluated. Different types of energy, calculated for each frequency band obtained from the wavelet decomposition, are studied. Static, dynamic and long-term parameters were evaluated. The proposed parameterizations are integrated into two class/non-class classifiers: one for speech/non-speech, one for music/non-music. Different experiments on realistic corpora, including different styles of speech and music (Broadcast News, Entertainment, Scheirer), illustrate the performance of the proposed parameterization, especially for music/non-music discrimination. Our parameterization yielded a significant reduction of the error rate: more than 30% relative improvement was obtained for the envisaged tasks compared to the MFCC parameterization.