3498 IEEE ROBOTICS AND AUTOMATION LETTERS, VOL. 6, NO. 2, APRIL 2021
Synthetic Biological Signals Machine-Generated by
GPT-2 Improve the Classification of EEG and EMG
Through Data Augmentation
Jordan J. Bird, Michael Pritchard, Antonio Fratini, Anikó Ekárt, and Diego R. Faria
Abstract—Synthetic data augmentation is of paramount impor-
tance for machine learning classification, particularly for biological
data, which tend to be high dimensional and with a scarcity of train-
ing samples. The applications of robotic control and augmentation
in disabled and able-bodied subjects still rely mainly on subject-
specific analyses. Those can rarely be generalised to the whole
population and appear to over complicate simple action recognition
such as grasp and release (standard actions in robotic prosthetics
and manipulators). We show for the first time that multiple GPT-2
models can machine-generate synthetic biological signals (EMG
and EEG) and improve real data classification. Models trained
solely on GPT-2 generated EEG data can classify a real EEG dataset
at 74.71% accuracy and models trained on GPT-2 EMG data can
classify real EMG data at 78.24% accuracy. Synthetic and calibra-
tion data are then introduced within each cross validation fold when
benchmarking EEG and EMG models. Results show algorithms are
improved when either or both additional data are used. A Random
Forest achieves a mean 95.81% (1.46) classification accuracy of
EEG data, which increases to 96.69% (1.12) when synthetic GPT-2
EEG signals are introduced during training. Similarly, the Random
Forest classifying EMG data increases from 93.62% (0.8) to 93.9%
(0.59) when training data is augmented by synthetic EMG signals.
Additionally, as predicted, augmentation with synthetic biological
signals also increases the classification accuracy of data from new
subjects that were not observed during training. A Robotiq 2F-85
Gripper was finally used for real-time gesture-based control, with
synthetic EMG data augmentation remarkably improving gesture
recognition accuracy, from 68.29% to 89.5%.
Index Terms—Biological signal processing, data augmentation,
electroencephalography, electromyography, synthetic data.
I. INTRODUCTION
WHEN presenting their Generative Pretrained
Transformer (GPT) model, researchers at OpenAI
hypothesised that language models are unsupervised multitask
Manuscript received October 14, 2020; accepted January 21, 2021. Date of
publication February 2, 2021; date of current version March 23, 2021. This letter
was recommended for publication by Associate Editor S. Leonard and Editor
E. Marchand upon evaluation of the reviewers’ comments. (Jordan J. Bird and
Michael Pritchard are co-first authors). (Corresponding author: Jordan J. Bird.)
Jordan J. Bird, Michael Pritchard, and Diego R. Faria are with the
Aston Robotics, Vision, and Intelligent Systems Lab (ARVIS Lab), As-
ton University, Birmingham, B4 7ET, U.K. (e-mail: birdj1@aston.ac.uk;
pritcham@aston.ac.uk; fariadiego@gmail.com).
Antonio Fratini is with the Optometry & Vision Science Research Group
(OVSRG) at The School of Life and Health Sciences, Aston University, Birm-
ingham, B4 7ET, U.K. (e-mail: a.fratini@aston.ac.uk).
Anikó Ekárt is with the School of Engineering and Applied Science, Aston
University, Birmingham, B4 7ET, U.K. (e-mail: a.ekart@aston.ac.uk).
Digital Object Identifier 10.1109/LRA.2021.3056355
learners [1]. This claim has since been supported by
applications such as fake news identification [2], patent claim
generation [3], and stock market analysis [4], to name just a few
in a rapidly growing area of research. In this
work, we follow those before us in exploring the capabilities of
these models in a brand new field of application: the generation
of bio-synthetic signals (in our case Electroencephalographic
(EEG) and Electromyographic (EMG) activity). In detail, we
aimed at exploring whether or not GPT-2’s self-attention based
architecture was capable of creating synthetic signals, and if
those signals could improve the performance of classification
models used on real datasets. Deducing a physical action or
mental state more accurately allows for a higher degree of
certainty when classifying data from an unseen subject. For
example, in electromyographically controlled robotic prosthetic
limbs, this translates into an improved experience for the user of
such a robotic device. Our scientific contributions and results
suggest that:
1) It is possible to generate synthetic biological signals by
tuning a language transformation model.
2) Classifiers trained on either real or synthetic data can
classify one another with relatively high accuracy.
3) Synthetic data improves the classification of the real data
both in terms of model benchmarking and classification
of unseen samples.
II. RELATED WORK AND BACKGROUND
In this section, we describe how previous work has demon-
strated the benefits of augmenting biological signal datasets to
improve classification results, since it has been noted that aug-
mentation is a useful technique to overcome data scarcity in such
domains [5]. A common approach is to generate synthetic signals
by re-arranging components of real data. Lotte [6] proposed
a method of “Artificial Trial Generation Based on Analogy”
where three data examples $x_1$, $x_2$, $x_3$ provide examples and an
artificial $x_{synthetic}$ is formed which is to $x_3$ what $x_2$ is to $x_1$.
A transformation is applied to $x_1$ to make it more similar to $x_2$,
the same transformation is then applied to $x_3$ which generates
$x_{synthetic}$.¹ This approach was shown to improve performance
of a Linear Discriminant Analysis classifier on three different
datasets. Dai et al. [7] performed similar rearrangements of
¹ Equations for Lotte’s EEG generation technique can be found in [6].
2377-3766 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: ASTON UNIVERSITY. Downloaded on May 06,2021 at 17:25:47 UTC from IEEE Xplore. Restrictions apply.
BIRD et al.: SYNTHETIC BIOLOGICAL SIGNALS MACHINE-GENERATED BY GPT-2 IMPROVE THE CLASSIFICATION 3499
waveform components in both the time and frequency domains
to add three times the amount of initially collected EEG data,
finding that this approach could improve the classification ac-
curacy of a Hybrid Scale Convolutional Neural Network. This
work showed that data augmentation allowed the model to
improve the classification of data for individual subjects that
were specifically challenging in terms of the model’s classification
ability. Dinarès-Ferran et al. [8] decomposed EEG signals into
Intrinsic Mode Functions and constructed synthetic data frames
by arranging these IMFs into new combinations, demonstrating
improvements of classification performance of motor imagery
based BCIs while including these new signals. Other researchers
have applied data augmentation techniques commonly used
in other domains, such as image classification, with
positive results. As an example, Shovon et al. [9] applied
conventional image augmentation techniques (e.g., rotation, zoom,
and brightness) to spectral images formed from EEG analysis to
increase the size of a public EEG dataset. This ultimately led
to an improvement over the state-of-the-art. Current research
shows great impact can be derived from relatively simple techniques.
For example, Freer and Yang [10] observed that introducing noise
into gathered data to form additional data points improved the
learning ability of several models which otherwise performed
relatively poorly. Tsinganos et al. [11] studied the approaches
of magnitude warping, wavelet decomposition, and synthetic
surface EMG models (generative approaches) for hand gesture
recognition, finding classification performance increases of up
to +16% when augmented data was introduced during training.
More recently, data augmentation studies have begun to focus
on the field of deep learning, more specifically on the ability
of generative models to create artificial data which is then
introduced during the classification model training process. In
2018, Luo et al. [12] observed that useful EEG signal data could
be generated by Conditional Wasserstein Generative Adversarial
Networks (GANs) which was then introduced to the training set
in a classical train-test learning framework. The authors found
classification performance was improved when such techniques
were introduced. Likewise, Zhang and Liu [13] applied similar
Deep Convolutional GANs (DC-GAN) to EEG signals given
that training examples are often scarce in related works. As
with the previous work, the authors found success when aug-
menting training data with DC-GAN generated data. Zanini
and Colombini [14] provided a state-of-the-art solution in the
field of EMG studies when using a DC-GAN to successfully
perform style transfer of Parkinson’s Disease to bio-electrical
signals, noting the scarcity of Parkinson’s Disease EMG data
available to researchers as an open issue in the field [14]. Many
of the studies surveyed follow a relatively simple train/test approach
to benchmarking models.
A limitation of many techniques is that they are not temporal
in their generative nature. Each block of signal output has no
influence on the next and, as such, a continuous synthetic signal
of unlimited length cannot be generated. Our approach
allows for infinite generation of temporal wave data given the
nature of GPT-2; a continuous synthetic raw signal is generated
by presenting some of the previous outputs as input for the next
generation. We then benchmark the models through k-fold cross
validation, where each fold has synthetic data introduced as
additional training data. Moreover, for the first time in the field,
we show the effectiveness of attention-based models at the signal
level rather than generative based models at the feature-level
for both training and unseen data. We then finally show that
real-time gesture classification towards direct control of a robotic
arm is improved following our data augmentation framework.
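The continuous generation described above can be sketched as a simple loop in which part of each generated block is fed back as the prompt for the next one. This is a minimal illustration, not the authors' implementation: `generate_continuous`, `generate_fn`, and `toy_generator` are hypothetical names, and the toy generator merely continues a ramp so that the loop can be run without a trained model. In practice `generate_fn` would wrap a fine-tuned GPT-2 model for one class.

```python
from typing import Callable, List

Row = List[float]  # one time step of a (possibly multi-channel) signal


def generate_continuous(
    generate_fn: Callable[[List[Row], int], List[Row]],
    seed: List[Row],
    total_rows: int,
    block_rows: int,
) -> List[Row]:
    """Build a long signal by repeatedly feeding recent output back as the prompt."""
    signal: List[Row] = []
    prefix = list(seed)
    while len(signal) < total_rows:
        block = generate_fn(prefix, block_rows)
        if not block:
            raise ValueError("generator returned no rows")
        signal.extend(block)
        # The latter half of the newest block becomes the prompt for the next call.
        prefix = block[len(block) // 2:]
    return signal[:total_rows]


def toy_generator(prefix: List[Row], n_rows: int) -> List[Row]:
    """Hypothetical stand-in for a trained model: continues a ramp from the last prefix row."""
    last = prefix[-1][0]
    return [[last + i + 1] for i in range(n_rows)]


synthetic = generate_continuous(toy_generator, seed=[[0.0]], total_rows=10, block_rows=4)
print(synthetic)
```

Because each call only sees the tail of the previous block, the signal can be extended for as long as required while every block remains conditioned on what came before it.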
A. GPT-2 and Self-Attention Transformers
Self-Attention Transformers are based on calculating scaled
dot-product attention units, and generate new data by learning
to pay attention to previously generated data [15]. Scaled
dot-product attention is calculated for each unit within the input
vector, e.g. words in a sentence, or, in this case, signals in a
stream. The attention units take a sequence as input and output
embeddings of relevant tokens. Query ($W_q$), key ($W_k$), and value
($W_v$) weights are calculated as:
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V, \qquad (1)$$
where the query is an entity within the sequence, keys are
vector representations of the input, the values are derived
by querying against keys, and $d_k$ is the dimensionality of the
keys. The term self-attention comes from
the fact that $Q$, $K$ and $V$ are received from the same source, and
generation is unsupervised. The GPT-2 architecture follows the
concept of Multi-headed Attention:
$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)W^{O},$$
$$\mathrm{head}_i = \mathrm{Attention}(QW_i^{Q}, KW_i^{K}, VW_i^{V}). \qquad (2)$$
That is, a deep structure of $h_i$ attention heads in order to
interconnect multiple attention units. Fundamentally, the GPT and
GPT-2 algorithms do not differ. The main advantages of GPT-2
stem from it being many times more complex than GPT,
with 1.5 billion parameters, and from being trained on a large dataset
of 8 million web pages.
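Equations (1) and (2) can be illustrated with a short NumPy sketch. It is a didactic example only: the dimensions and random weights are arbitrary assumptions, the function names are illustrative, and the causal mask that GPT-2 applies so that a position cannot attend to later positions is omitted for brevity.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(Q, K, V):
    """Scaled dot-product attention, Eq. (1)."""
    d_k = K.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))
    return weights @ V


def multi_head(X, W_q, W_k, W_v, W_o):
    """Multi-head self-attention, Eq. (2); Q, K and V all come from the same X."""
    heads = [attention(X @ wq, X @ wk, X @ wv) for wq, wk, wv in zip(W_q, W_k, W_v)]
    return np.concatenate(heads, axis=-1) @ W_o


rng = np.random.default_rng(0)
seq_len, d_model, h = 5, 8, 2          # arbitrary toy dimensions
d_head = d_model // h
X = rng.normal(size=(seq_len, d_model))
W_q = rng.normal(size=(h, d_model, d_head))
W_k = rng.normal(size=(h, d_model, d_head))
W_v = rng.normal(size=(h, d_model, d_head))
W_o = rng.normal(size=(h * d_head, d_model))
out = multi_head(X, W_q, W_k, W_v, W_o)
print(out.shape)
```

Each head projects the same input with its own weights, which is what makes the mechanism "self" attention; the concatenated heads are mixed by $W^{O}$ back to the model dimension.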
III. METHOD
A. Data Collection, Pre-Processing and Feature Extraction
The EMG dataset used in this study was initially acquired by
Dolopikos et al. in [16]. EMG data corresponding to the opening
and closing movements of the right hand were collected from
fifteen able-bodied participants (9 male, 6 female, mean age 26)
using a Thalmic Labs Myo armband. The participants performed
the gestures after a cue from an instructor. The recorded data
corresponding to the time before the onset of physical activity
(muscular background tone) was extracted and compiled into
a third “neutral” class. To assess contraction and relaxation of
muscles, information can be extracted by the simple analysis
of an EMG signal’s smoothed rectified envelope [17]. Accordingly, the data
was first rectified and then low-pass filtered using a peak
detection algorithm [18], interpolating between local maxima
with a separation of at least 20 samples (equivalent to 0.1 seconds
at the Myo’s natural sample rate of 200 Hz). The EEG dataset
used was initially acquired for a previous study [19]. A total
of 5 participants were presented with stimuli while wearing the
InteraXon Muse headband to collect EEG data from the TP9,
AF7, AF8, and TP10 electrodes. EEG data corresponding to
three mental states was collected from each participant: a neutral
class with no stimulus present, relaxation enabled by classical
music, and concentration induced by a video of the “shell game”
(wherein they had to follow a ball placed underneath one of three
shuffled upturned cups).
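The EMG envelope extraction described above (rectification followed by interpolation between local maxima at least 20 samples apart) might be sketched as follows. This is an approximation under stated assumptions: the function name `peak_envelope` is illustrative, linear interpolation is used for simplicity and may differ from the toolbox routine cited in [18], and the test signal is a synthetic amplitude-modulated sine rather than recorded EMG.

```python
import numpy as np


def peak_envelope(x, min_separation=20):
    """Rectify x, then interpolate between local maxima at least min_separation apart."""
    rectified = np.abs(np.asarray(x, dtype=float))
    # Candidate peaks: not smaller than the left neighbour and larger than the right one.
    interior = np.arange(1, len(rectified) - 1)
    is_peak = (rectified[interior] >= rectified[interior - 1]) & (
        rectified[interior] > rectified[interior + 1]
    )
    candidates = interior[is_peak]
    # Greedily keep the tallest peaks, skipping any that fall too close to a kept one.
    kept = []
    for idx in candidates[np.argsort(rectified[candidates])[::-1]]:
        if all(abs(idx - k) >= min_separation for k in kept):
            kept.append(idx)
    kept = np.sort(np.array(kept, dtype=int))
    if kept.size == 0:
        return rectified
    # Linear interpolation between kept peaks (held flat before the first and after the last).
    return np.interp(np.arange(len(rectified)), kept, rectified[kept])


fs = 200                                   # Myo sample rate stated in the text
t = np.arange(0, 2.0, 1 / fs)
emg_like = np.sin(2 * np.pi * 40 * t) * (0.2 + 0.8 * (t > 1.0))  # weak, then strong burst
env = peak_envelope(emg_like, min_separation=20)
```

With `min_separation=20` at 200 Hz, peaks closer than 0.1 s are merged, so the envelope follows the slow change in contraction strength rather than individual oscillations.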
Whilst the data was provided to GPT-2 in its raw format, an
ensemble of features was extracted from each dataset to enable
classification. The feature set has previously proven effective,
providing sufficient information to discriminate both between
focused, relaxed, and neutral brains [19], and closed, open, and
neutral hands [16]. Features are extracted from a sliding window
of 1 s in length, at an overlap of 0.5 seconds. These windows are
further sub-divided into halves and quarters, enabling extraction
of the following ensemble of statistical features:²
• 1-second window
  – Mean $\bar{y}_k = \frac{1}{N}\sum_{i=1}^{N} y_{ki}$, and standard deviation $s_k = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}(y_{ki}-\bar{y}_k)^2}$ of the waveform
  – Skewness $g_{1,k} = \frac{\sum_{i=1}^{N}(y_{ki}-\bar{y}_k)^3}{N s_k^3}$, and kurtosis $g_{2,k} = \frac{\sum_{i=1}^{N}(y_{ki}-\bar{y}_k)^4}{N s_k^4} - 3$ of each waveform
  – Maximum and minimum values over the given period
  – Sample variances of each wave, and sample covariances of all unique pairs of waves $s_{k\ell} = \frac{1}{N-1}\sum_{i=1}^{N}(y_{ki}-\bar{y}_k)(y_{\ell i}-\bar{y}_\ell)$; $\forall k, \ell \in [1, K]$
  – Eigenvalues of the covariance matrix $\det(S - \lambda I_K) = 0$
  – Upper triangular elements of the matrix logarithm of the covariance matrix $e^{B} = I_K + \sum_{n=1}^{\infty} \frac{S^n}{n!}$
  – The magnitude of each signal’s frequency components, obtained via Fast Fourier Transform (FFT)
• 0.5-second windows
  – The change between the first and second sliding window in the sample mean and standard deviation and also the maximum and minimum values
• 0.25-second quarter windows produced due to offset
  – The mean of each signal in the 0.25-second window
  – All paired differences of means between the windows
• Maximum and minimum values and their paired differences
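The 1-second and half-window features listed above can be sketched in NumPy as follows. This is an independent illustration, not the feature extraction code referenced in the footnote: function names are illustrative, the quarter-window features are omitted, the matrix logarithm is computed through an eigendecomposition on the assumption that the covariance matrix is positive definite, and the input is random noise standing in for a recorded signal.

```python
import numpy as np


def window_features(w):
    """Statistical features for one window w of shape (N samples, K channels)."""
    n = w.shape[0]
    mean = w.mean(axis=0)
    std = w.std(axis=0, ddof=1)                      # N - 1 in the denominator
    centred = w - mean
    skew = (centred ** 3).sum(axis=0) / (n * std ** 3)
    kurt = (centred ** 4).sum(axis=0) / (n * std ** 4) - 3.0
    cov = np.cov(w, rowvar=False)                    # K x K sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)           # cov is symmetric
    # Matrix logarithm via the eigendecomposition (assumes cov is positive definite).
    log_cov = eigvecs @ np.diag(np.log(eigvals)) @ eigvecs.T
    upper = np.triu_indices(cov.shape[0])            # variances and unique covariances
    fft_mag = np.abs(np.fft.rfft(w, axis=0)).ravel()
    # Change in mean, std, max and min between the two half-windows.
    first, second = w[: n // 2], w[n // 2:]
    half_delta = np.concatenate([
        second.mean(axis=0) - first.mean(axis=0),
        second.std(axis=0, ddof=1) - first.std(axis=0, ddof=1),
        second.max(axis=0) - first.max(axis=0),
        second.min(axis=0) - first.min(axis=0),
    ])
    return np.concatenate([mean, std, skew, kurt, w.max(axis=0), w.min(axis=0),
                           cov[upper], eigvals, log_cov[upper], fft_mag, half_delta])


def sliding_features(signal, fs, window_s=1.0, overlap_s=0.5):
    """Apply window_features over 1 s windows with 0.5 s overlap."""
    size = int(round(window_s * fs))
    step = int(round((window_s - overlap_s) * fs))
    starts = range(0, signal.shape[0] - size + 1, step)
    return np.vstack([window_features(signal[s:s + size]) for s in starts])


rng = np.random.default_rng(0)
demo = rng.normal(size=(600, 4))         # 3 s of 4-channel noise at an assumed 200 Hz
features = sliding_features(demo, fs=200)
print(features.shape)
```

Each row of `features` describes one window, so the same routine can be applied to real and synthetic raw signals before classification.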
B. Generating and Learning From GPT-2 Generated Signals
GPT-2 models are initially trained on each class of data for
1000 steps each. Then, for $n$ classes, $n$ GPT-2 models are tasked
with generating synthetic data, and the class label is finally manually
added to the generated data. This process can be observed
in Fig. 1, where the generative loop is prefixed by the latter
half of the previously generated data.³ The synthetic equivalent
of 60 seconds of data per class is generated (30 000 rows
² Feature extraction code available at https://github.com/jordan-bird/eeg-feature-generation
³ Example code can be found at: https://github.com/jordan-bird/Generational-Loop-GPT2
Fig. 1. Initial training of the GPT-2 model and then generating a dataset of
synthetic biological signals.
per class of raw signal data). To benchmark machine learning
models, a K-fold cross validated learning process is followed
and compared to the process observed in Fig. 2 where training
data is augmented by the synthetically-derived data at each
fold of learning. The testing set does not contain any of the
artificial signal data. This process is performed for both the EEG
and EMG experiments for six different models: Support Vector
Machine (SVM), Random Forest (RF), K-Nearest Neighbours
(KNN, K=10), Linear Discriminant Analysis (LDA), Logistic
Regression (LR), and Gaussian Naïve Bayes (GNB). These
statistical models are selected due to their differing nature, to
explore the hypothesis with a mixed range of approaches. As
shown in [20], unseen signal classification
can be improved through calibration via inductive and supervised
transductive transfer learning, that is, tuning a model by
adding a small amount of calibration data to the training set.
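The augmented cross-validation procedure described above, in which synthetic data joins the training split of every fold while the test split stays purely real, can be sketched as below. This is a minimal illustration under stated assumptions: `augmented_kfold` and `NearestCentroid` are hypothetical names, the tiny classifier merely stands in for the six models benchmarked in this work (any estimator with `fit` and `predict` methods could be passed instead), and the data are random Gaussian blobs rather than extracted EEG or EMG features.

```python
import numpy as np


class NearestCentroid:
    """Tiny stand-in classifier exposing a fit/predict interface."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.vstack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        dists = ((X[:, None, :] - self.centroids_[None, :, :]) ** 2).sum(axis=2)
        return self.classes_[dists.argmin(axis=1)]


def augmented_kfold(make_model, X_real, y_real, X_syn, y_syn, k=10, seed=0):
    """K-fold CV in which synthetic rows are added to the training split only."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X_real)), k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        X_train = np.vstack([X_real[train_idx], X_syn])
        y_train = np.concatenate([y_real[train_idx], y_syn])
        model = make_model().fit(X_train, y_train)
        # The held-out fold contains real data only.
        scores.append(np.mean(model.predict(X_real[test_idx]) == y_real[test_idx]))
    return float(np.mean(scores)), float(np.std(scores))


rng = np.random.default_rng(1)


def toy_data(n):
    """Three well-separated Gaussian blobs standing in for feature vectors."""
    y = rng.integers(0, 3, size=n)
    return rng.normal(size=(n, 5)) + 4.0 * y[:, None], y


X_real, y_real = toy_data(150)
X_syn, y_syn = toy_data(60)
mean_acc, std_acc = augmented_kfold(NearestCentroid, X_real, y_real, X_syn, y_syn, k=10)
print(mean_acc, std_acc)
```

Calibration data from a new subject could be appended to the training split in the same way as the synthetic rows; keeping both out of the test split ensures that the reported accuracy reflects real, unseen data.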
IV. OBSERVATIONS AND RESULTS
On comparison, it was noted that all of the synthetic data was unique,
that is, none of it duplicated the real data. A sample of real and synthetic EEG
data can be observed in Fig. 3. Interestingly, natural behaviours,
such as the presence of characteristic oscillations, can be ob-
served within data, showing that complex natural patterns have
been generalised by the GPT-2 model. It is noted that in the
real data, some spikes are observed in the signals from all
electrodes, but those are likely due to involuntary (and unwanted)
eye blinks. It is worth noting that GPT-2 does not replicate
similar patterns, most likely as a filtering side-effect of data
generalisation, since such occurrences are random and unrelated
to the underlying EEG data. The Power Spectral Densities of the
GPT-2 generated data were computed with Welch’s method [21]
and compared with those computed from real human data as can
be seen in Fig. 5. In observing the frequency domain plots of
the genuine data, there is a clear 50 Hz component in all classes
likely due to power-line interference. Interestingly, there has
been a clear attempt by GPT-2 to mimic this feature, albeit with
a much shallower roll-off. Fig. 4 shows the same process for
EMG data, where the GPT-2 generated waves are seemingly less
natural than their human counterparts; although natural wave
patterns do emerge, they are more erratic and prone to spiking
unlike the signals recorded from a human forearm. The Power
Spectral Densities presented in Fig. 6 indicate that across all
classes the synthetic data has significantly more power in its
high frequency components than the real data. Despite the real
EMG dataset having been low-pass filtered before being used
Fig. 2. The standard K-Fold cross validation process with the GPT-2 generated synthetic data being introduced as additional training data for each fold.
Fig. 3. Comparison of GPT-2 generated (Left) and genuine recorded (Right)
EEG data across “Concentrating,” “Relaxed,” and “Neutral” mental state classes.
AF8 electrode readings are omitted for readability purposes.
to train GPT-2, this phenomenon is more notable in the EMG
domain, likely due in part to the aforementioned erratic nature
of the synthetic EMG signals.
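The Power Spectral Density comparison above relies on Welch's method [21], which averages the periodograms of overlapping, windowed segments. A minimal NumPy sketch is given below for illustration; the function name `welch_psd`, the segment length, the 50% overlap, and the Hann window are assumptions rather than the settings used for Figs. 5 and 6, and the input is a toy 50 Hz sine with added noise rather than recorded data.

```python
import numpy as np


def welch_psd(x, fs, nperseg=256):
    """One-sided PSD estimate: average of Hann-windowed, 50%-overlapping periodograms."""
    x = np.asarray(x, dtype=float)
    step = nperseg // 2
    window = np.hanning(nperseg)
    scale = fs * np.sum(window ** 2)
    segments = [x[s:s + nperseg] for s in range(0, len(x) - nperseg + 1, step)]
    periodograms = []
    for seg in segments:
        seg = (seg - seg.mean()) * window            # remove the mean, then taper
        periodograms.append(np.abs(np.fft.rfft(seg)) ** 2 / scale)
    psd = np.mean(periodograms, axis=0)
    psd[1:-1] *= 2.0                                 # fold negative frequencies (nperseg even)
    return np.fft.rfftfreq(nperseg, d=1.0 / fs), psd


fs = 200.0                                           # assumed sample rate for the toy signal
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 50 * t) + 0.1 * rng.normal(size=t.size)
freqs, psd = welch_psd(x, fs)
print(freqs[np.argmax(psd)])
```

Applying the same estimator to real and generated signals makes features such as a power-line component, or excess high-frequency power, directly comparable between the two.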
A. Classification of Real-to-Synthetic Data and Vice-Versa
Table I shows the effects of training models on the real and
synthetic EEG data and then attempting to classify the other
data. Interestingly, the Support Vector Machine when trained on
real data can classify the synthetic data with 90.84% accuracy.
Likewise, the Gaussian Naïve Bayes approach when trained on
the synthetic data can then classify the real data with 74.71%
accuracy. Table II similarly shows the ability to classify real data
by learning from synthetic data and vice versa for EMG. The Gaussian
Naïve Bayes model, when trained on only real data, can classify the
synthetic data with 62.36% accuracy, whereas the KNN model can classify
the real dataset with 78.24% accuracy when trained on only
synthetic data.
Fig. 4. Comparison of GPT-2 generated (Left) and genuine recorded (Right)
EMG data across “Closed,” “Open,” and “Neutral” hand classes.
Fig. 5. Comparison of Power Spectral Densities of GPT-2 generated (Left)
and genuine recorded (Right) EEG data. For readability, only the PSD computed
from electrode TP9 is shown.
Fig. 6. Comparison of Power Spectral Densities of GPT-2 generated (Left) and
genuine recorded (Right) EMG data. For readability, only the PSD computed
from electrode EMG1 is shown.
TABLE I
CLASSIFICATION RESULTS WHEN TRAINING ON REAL OR SYNTHETIC EEG
DATA AND ATTEMPTING TO PREDICT THE CLASS LABELS OF THE OTHER
(SORTED FOR REAL TO SYNTHETIC)
TABLE II
CLASSIFICATION RESULTS WHEN TRAINING ON REAL OR SYNTHETIC EMG
DATA AND ATTEMPTING TO PREDICT THE CLASS LABELS OF THE OTHER
(SORTED FOR REAL TO SYNTHETIC)
B. EEG Classification
The results for EEG classification can be seen in Table III. The
best result overall for the dataset was the k-fold training process
with additional training data in the form of GPT-2 generated
synthetic brainwaves, using a Random Forest. This achieved a
mean accuracy of 96.69% at a deviance of 1.12%. Table IV
shows the classification abilities of the models when given
TABLE III
COMPARISON OF THE 10-FOLD CLASSIFICATION OF EEG DATA AND 10-FOLD
CLASSIFICATION OF EEG DATA ALONGSIDE SYNTHETIC DATA AS
ADDITIONAL TRAINING DATA
TABLE IV
EEG CLASSIFICATION ABILITIES OF THE MODELS ON COMPLETELY UNSEEN
DATA WITH REGARDS TO BOTH WITH AND WITHOUT SYNTHETIC
GPT-2 DATA AS WELL AS PRIOR CALIBRATION
TABLE V
COMPARISON OF THE 10-FOLD CLASSIFICATION OF EMG DATA AND 10-FOLD
CLASSIFICATION OF EMG DATA ALONGSIDE SYNTHETIC DATA AS
ADDITIONAL TRAINING DATA
completely unseen data from three new subjects. The results
show the difficulty of the classification problem faced, with
many scoring relatively low for the three-class problem. The best
result was found to be the Linear Discriminant Analysis model
when trained with both calibration and synthetic GPT-2 data
alongside the dataset, which then scored 66.02% classification
accuracy on the unseen data.
C. EMG Classification
Table V shows the results for EMG classification. The
best model was the Random Forest which scored 93.9% (de-
viance 0.59) during the k-fold benchmarking process in which
GPT-2 synthetic data was introduced as additional training
data. Table VI shows the abilities of the models when predicting
the class label of completely unseen EMG data. Interestingly,
the Gaussian Naïve Bayes model outperformed all others con-
sistently. The best Gaussian Naïve Bayes model at predicting
completely unseen data was when it was also trained with
calibration and GPT-2 synthetic data alongside the dataset at
an accuracy of 97.03%.
TABLE VI
EMG CLASSIFICATION ABILITIES OF THE MODELS ON COMPLETELY UNSEEN
DATA WITH REGARDS TO BOTH WITH AND WITHOUT SYNTHETIC GPT-2 DATA
AS WELL AS PRIOR CALIBRATION
Fig. 7. Real-time predictions of EMG signals enacted by the Robotiq 2F-85
Gripper.
D. Real-Time EMG Prediction for the Control of a
Robotic Manipulator
The overall process followed for robotic enaction of predicted
hand gestures by a Robotiq 2F-85 Gripper can be seen in Fig. 7.
The results in Fig. 8 show the process of a user performing
hand gestures for three minutes (124 data objects). The best-
performing EMG prediction model was applied (Gaussian Naïve
Bayes + GPT-2), which predicted real-time data with 89.5%
accuracy. All of the erroneous predictions occurred during state
transitions, which was expected given that models were trained
on concrete gestures and had not been exposed to transitional
behaviours of the arm muscles when shifting between gestures.
The best predictive model on the dataset without GPT-2 aug-
mentation scored 68.29% accuracy. The 95% Wilson confidence
interval for the augmented model’s accuracy was [82.89, 93.77],
and for the non-augmented model was [59.62, 75.86]. No calibration
was performed, that is, the models were never exposed
to data from this user. Thus, GPT-2 biosignal data augmentation
leads to a model which can classify data from unseen subjects
with a higher rate of success. Fig. 9 shows the confusion matrix
for this experiment. An application of the approach is shown in
Fig. 10, where a pick-and-place routine and the EMG classifier
control a UR3 Manipulator’s Robotiq 2F-85 gripper [22]. The
device mimics the user and allows for teleoperation in order
Fig. 8. Real-time execution of gestures for three minutes predicted with the
augmented EMG model (89.5%) and non-augmented EMG model (68.29%).
Fig. 9. Confusion Matrix for real-time EMG classification.
Fig. 10. A Universal Robots UR3 Manipulator and Robotiq 2F-85 gripper
picking up and then releasing an object.
to pick up (grip) and place (release) an object. If the operator
keeps their hand in a neutral position, then no movements are
commanded to the artificial arm.
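The Wilson confidence intervals quoted in this section can be reproduced with the standard Wilson score formula, sketched below using only the Python standard library. The count of 111 correct predictions out of 124 is an assumption inferred from the reported 89.5% accuracy; it is not stated explicitly in the text.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    p = successes / n
    denom = 1.0 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# 111 correct out of 124 is an assumed count, chosen because 111 / 124 = 89.5%.
low, high = wilson_interval(111, 124)
print(f"[{100 * low:.2f}, {100 * high:.2f}]")
```

Under this assumption the sketch yields the interval [82.89, 93.77] reported for the augmented model. The interval reported for the non-augmented model is similarly consistent with 84 correct predictions out of 123, although that count is likewise an inference rather than a figure given in the text.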
V. CONCLUSION
To conclude, this study has presented multiple experiments
with real and synthetic biological signals in order to ascertain
whether classification algorithms can be improved by consid-
ering data generated by the GPT-2 model. Although the data
are different, i.e., real and synthetic data were unique, a model
trained on one of the two sets of signals can strongly classify
the other and thus the GPT-2 model is able to generate relatively
realistic data which holds useful information that can be learnt
and applied to real signals. For EEG, a Gaussian Naïve Bayes model trained on
synthetic data could classify real data at 74.71% accuracy and a
KNN algorithm could do the same for real EMG classification
at 78.24% accuracy, training on only synthetic data. We then
showed that several learning algorithms were improved for
both EMG and EEG classification when the training data was
augmented by GPT-2. The main argument of this work is that
synthetic biosignals generated by an attention-based transformer
hold useful information towards improving several learning
algorithms for classification of real biological signal data. In
future, larger datasets could be used and thus deep learning
would be a realistic possibility for classification following the
same process. Given that this work showed promise in terms
of the model architecture itself, similar models could also be
benchmarked in terms of their ability to create augmented train-
ing datasets e.g. BART, CTRL, Transformer-XL and XLNet.
Another unoptimised level of detail is the amount of synthetic
data that is added to the training set for augmentation; future
work could explore the level of data needed for apt improvements
to the models.
Our suggested model for EMG, the GNB approach trained
with human-sourced GPT-2 generated synthetic signals, was
powerful in terms of predictive ability and required relatively
little computational resources given its simplistic nature. Addi-
tionally, the approach did not require further calibration, as many
state-of-the-art approaches do (including the Myo software it-
self), instead correctly predicting the behaviours of a new subject
from the point of wearing the device. Given these attributes, the
model is apt for usage on-board within wearable EMG devices
for real-time prediction of gesture.
REFERENCES
[1] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever,
“Language models are unsupervised multitask learners,” OpenAI Blog,
vol. 1, no. 8, pp. 1–9, 2019.
[2] J. C. B. Cruz, J. A. Tan, and C. Cheng, “Localization of fake news detection
via multitask transfer learning,” in Proc. 12th Lang. Resources Eval. Conf.,
2020, pp. 2596–2604.
[3] J.-S. Lee and J. Hsiang, “Patent claim generation by fine-tuning OpenAI
GPT-2,” World Patent Inf., vol. 62, 2020, Art. no. 101983.
[4] Y. Nishi, A. Suge, and H. Takahashi, “Text analysis on the stock market in
the automotive industry through fake news generated by GPT-2,” in Proc.
Artif. Intell. for Bus., 2020, pp. 103–114.
[5] E. Lashgari, D. Liang, and U. Maoz, “Data augmentation for deep-
learning-based electroencephalography,” J. Neurosci. Methods, 2020,
Art. no. 108885.
[6] F. Lotte, “Signal processing approaches to minimize or suppress calibra-
tion time in oscillatory activity-based brain-computer interfaces,” in Proc.
IEEE, vol. 103, no. 6, pp. 871–890, 2015.
[7] G. Dai, J. Zhou, J. Huang, and N. Wang, “HS-CNN: A CNN with hybrid
convolution scale for EEG motor imagery classification,” J. Neural Eng.,
vol. 17, Jan. 2020.
[8] J. Dinarès-Ferran, R. Ortner, C. Guger, and J. Solé-Casals, “A new method
to generate artificial frames using the empirical mode decomposition for
an eeg-based motor imagery bci,” Front. Neurosci., vol. 12, pp. 1–308,
2018.
[9] T. H. Shovon, Z. A. Nazi, S. Dash, and M. F. Hossain, “Classification of
motor imagery eeg signals with multi-input convolutional neural network
by augmenting stft,” in Proc. 5th Int. Conf. Adv. Elect. Eng., 2019,
pp. 398–403.
[10] D. Freer and G.-Z. Yang, “Data augmentation for self-paced motor imagery
classification with c-LSTM,” J. Neural Eng., vol. 17, no. 1, Jan. 2020,
Art. no. 016041.
[11] P. Tsinganos, B. Cornelis, J. Cornelis, B. Jansen, and A. Skodras, “Data
augmentation of surface electromyography for hand gesture recognition,”
Sensors, vol. 20, no. 17, 2020, Art. no. 4892.
[12] Y. Luo and B.-L. Lu, “Eeg data augmentation for emotion recognition
using a conditional wasserstein gan,” in Proc. 40th Annu. Int. Conf. IEEE
Eng. Med. Biol. Soc., 2018, pp. 2535–2538.
[13] Q. Zhang and Y. Liu, “Improving brain computer interface performance
by data augmentation with conditional deep convolutional generative
adversarial networks,” 2018, arXiv:1806.07108.
[14] R. A. Zanini and E. Luna Colombini, “Parkinson’s disease emg data
augmentation and simulation with DCGANS and style transfer,” Sensors,
vol. 20, no. 9, 2020, Art. no. 2605.
[15] A. Vaswani et al., “Attention is all you need,” in Proc. Adv. Neural Inf.
Process. Syst., 2017, pp. 5998–6008.
[16] C. Dolopikos, M. Pritchard, J. J. Bird, and D. R. Faria, “Electromyography
signal-based gesture recognition for human-machine interaction in real-
time through model calibration,” in Proc. SAI Future Inf. Commun. Conf.,
2021, pp. 1–18.
[17] C. J. De Luca, “The use of surface electromyography in biomechanics,” J.
Appl. Biomech., vol. 13, no. 2, pp. 135–163, 1997.
[18] The MathWorks, Inc., Signal Processing Toolbox, Natick, MA, USA:
The MathWorks, 2020, pp. 489–503.
[19] J. J. Bird, L. J. Manso, E. P. Ribeiro, A. Ekárt, and D. R. Faria, “A study
on mental state classification using EEG-based brain-machine interface,” in
Proc. Int. Conf. Intell. Syst., 2018, pp. 795–800.
[20] J. Kobylarz, J. J. Bird, D. R. Faria, E. P. Ribeiro, and A. Ekárt, “Thumbs up,
thumbs down: Non-verbal human-robot interaction through real-time EMG
classification via inductive and supervised transductive transfer learning,”
J. Ambient Intell. Humanized Comput., vol. 11, no. 12, pp. 6021–6031,
2020.
[21] P. Welch, “The use of fast Fourier transform for the estimation of
power spectra: A method based on time averaging over short, modified
periodograms,” IEEE Trans. Audio Electroacoust., vol. AE-15, no. 2,
pp. 70–73, Jun. 1967.
[22] Universal Robots, “UR3 technical specifications,” Technical
Specifications, 2015. Accessed: Feb. 2021. [Online]. Available:
https://www.universal-robots.com/media/240742/ur3_gb.pdf
... In this work we concentrate on a subset of our EEG data set that includes 184 recordings distributed over 4 classes. The presence of two channels for EEG data acquisition and the wide bandwidth of our EEG data (sampled at 2 kHz) has resulted in high training and testing accuracies, which eliminated the need for data augmentation [7]. ...
Article
Electroencephalography (EEG) is utilised to analyse faint brain signals, which vary in amplitude and frequency depending on brain state during emotions, movement and motor effects. EEG and Brain-Computer Interface (BCI) technology combined with machine learning is deployed in prosthesis control to help amputees and people suffering from severe injuries. In this work, we utilise the Wavelet Transform (WT) to extract EEG features for optimal prosthetic arm control. Unlike most research work, our work is based upon a high-resolution EEG data set (with 2-kHz sampling frequency), a dual-channel EEG acquisition module and large-size scalograms, which accounts for the need of data augmentation techniques necessary in deep learning models. We present our results of performance evaluation of 1-D and scalograms classifiers for optimal prosthetic BCI arm control system. We designed our optimal control system using five 1-D Wavelet classifiers including Linear Discriminant Analysis (LDA) and Multi-Layer Perceptron (MLP) as well as 2-D representations of EEG signals (scalograms). The EEG data set was accumulated with the help of 7 subjects who performed 4 different mental activities during each recording session. Each mental activity was recorded for 8 seconds using a dual-channel EEG acquisition module, which was set up at 2 kHz sampling frequency. The scalograms are generated during training using resampled EEG data at 500 Hz, which produced scalograms with sizes of 1500 × 300 pixels. Our performance evaluation results showed high training and testing accuracies for 1-D Wavelet classifiers and scalograms as well. The optimisation results have shown that with the scalograms and 2-D CNN classifier the optimal performance was determined (where training accuracy was 98%) after discarding 2 seconds of the 8-second EEG data and resizing the resulting scalograms to 22% of their maximum size.
On the other hand, 1-D Wavelet classifiers showed optimal performance with 95% training accuracy and trimming of 4 seconds. The overall performance of 1-D Wavelet classifiers, MLP in particular, is advantageous in the context of prosthetic arm control due to the high training speed and reduction of EEG data length by half (4 seconds). The complete designed system consists of EEG acquisition, BCI and control modules hosted on a Raspberry Pi 4 single-board computer. The designed BCI control system is operated through a comprehensive Graphical User Interface (GUI) using Python. Our contributions in this field include the design of an optimal BCI prosthetic arm control system with emphasis on a wide-bandwidth EEG data set, a low number of EEG channels, high accuracy and the affordable processing hardware of the Raspberry Pi 4. The application of the designed performance optimisation procedure for prosthetic arm control is beneficial to other similar BCI applications.
... In the particular case of few existing relevant data in datasheets, transfer learning [53][54][55][56], data augmentation [57][58][59][60], or synthetic data [61][62][63][64] techniques are usually applied to generate improved machine learning training and models. Transfer learning is based on the knowledge acquired from another existing learned task to improve the performance of a new machine learning model; thus, reducing the amount of required training data. ...
Article
In this paper, a general overview regarding neural recording, classical signal processing techniques and machine learning classification algorithms applied to monitor brain activity is presented. Currently, several approaches classified as electrical, magnetic, neuroimaging recordings and brain stimulations are available to obtain neural activity of the human brain. Among them, non-invasive methods like electroencephalography (EEG) are commonly employed, as they can provide a high degree of temporal resolution (on the order of milliseconds) and acceptable spatial resolution. In addition, it is simple, quick, and does not create any physical harm or stress to patients. Concerning signal processing, once the neural signals are acquired, different procedures can be applied for feature extraction. In particular, brain signals are normally processed in time, frequency, and/or space domains. The features extracted are then used for signal classification depending on its characteristics such as the mean, variance or band power. The role of machine learning in this regard has become of key importance in recent years due to its high capacity to analyze complex amounts of data. The algorithms employed are generally classified into supervised, unsupervised and reinforcement techniques. A deep review of the most used machine learning algorithms and the advantages/drawbacks of most used methods is presented. Finally, a study of these procedures utilized in a very specific and novel research field of electroencephalography, i.e., autobiographical memory deficits in schizophrenia, is outlined.
... [11] achieve new state-of-the-art results on E2E and WebNLG benchmarks by using GPT-2 to augment training data. There has also been work in using synthesized datasets for counterfactual generation [57], causal inference evaluation [58], clinical entity recognition [26], and more [8,3,5]. ...
Preprint
NLP researchers need more, higher-quality text datasets. Human-labeled datasets are expensive to collect, while datasets collected via automatic retrieval from the web such as WikiBio are noisy and can include undesired biases. Moreover, data sourced from the web is often included in datasets used to pretrain models, leading to inadvertent cross-contamination of training and test sets. In this work we introduce a novel method for efficient dataset curation: we use a large language model to provide seed generations to human raters, thereby changing dataset authoring from a writing task to an editing task. We use our method to curate SynthBio - a new evaluation set for WikiBio - composed of structured attribute lists describing fictional individuals, mapped to natural language biographies. We show that our dataset of fictional biographies is less noisy than WikiBio, and also more balanced with respect to gender and nationality.
... Image recognition tasks for Convolutional Neural Network image classification are affected by data scarcity due to their data requirements (Andriyanov and Andriyanov, 2020; Bloice et al., 2017), where many generative models have been recommended to alleviate such issues (Nalepa et al., 2019; Tran et al., 2021). Generative models have also been noted to positively impact biological signal classification (Anicet Zanini and Luna Colombini, 2020; Bird et al., 2021), semantic Image-to-Image Translation (Arantes et al., 2020), speech processing (Bird et al., 2020; Qian et al., 2019), and Human Activity Recognition (Alnujaim et al., 2019; Erol et al., 2019) among many others. In this work, we use a Conditional GAN for data augmentation, which is described in the following section. ...
Article
Contemporary Artificial Intelligence technologies allow for the employment of Computer Vision to discern good crops from bad, providing a step in the pipeline of selecting healthy fruit from undesirable fruit, such as those which are mouldy or damaged. State-of-the-art works in the field report high accuracy results on small datasets (<1000 images), which are not representative of the population regarding real-world usage. The goals of this study are to further enable real-world usage by improving generalisation with data augmentation as well as to reduce overfitting and energy usage through model pruning. In this work, we suggest a machine learning pipeline that combines the ideas of fine-tuning, transfer learning, and generative model-based training data augmentation towards improving fruit quality image classification. A linear network topology search is performed to tune a VGG16 lemon quality classification model using a publicly-available dataset of 2690 images. We find that appending a 4096 neuron fully connected layer to the convolutional layers leads to an image classification accuracy of 83.77%. We then train a Conditional Generative Adversarial Network on the training data for 2000 epochs, and it learns to generate relatively realistic images. Grad-CAM analysis of the model trained on real photographs shows that the synthetic images can exhibit classifiable characteristics such as shape, mould, and gangrene. A higher image classification accuracy of 88.75% is then attained by augmenting the training with synthetic images, arguing that Conditional Generative Adversarial Networks have the ability to produce new data to alleviate issues of data scarcity. Finally, model pruning is performed via polynomial decay, where we find that the Conditional GAN-augmented classification network can retain 81.16% classification accuracy when compressed to 50% of its original size.
... Image recognition tasks for Convolutional Neural Network image classification are affected by data scarcity due to their data requirements [26,27], where many generative models have been recommended to alleviate such issues [28,29]. Generative models have also been noted to positively impact biological signal classification [30,31], semantic Image-to-Image Translation [32], speech processing [33,34], and Human Activity Recognition [35,36] among many others. ...
Preprint
Contemporary Artificial Intelligence technologies allow for the employment of Computer Vision to discern good crops from bad, providing a step in the pipeline of selecting healthy fruit from undesirable fruit, such as those which are mouldy or gangrenous. State-of-the-art works in the field report high accuracy results on small datasets (<1000 images), which are not representative of the population regarding real-world usage. The goals of this study are to further enable real-world usage by improving generalisation with data augmentation as well as to reduce overfitting and energy usage through model pruning. In this work, we suggest a machine learning pipeline that combines the ideas of fine-tuning, transfer learning, and generative model-based training data augmentation towards improving fruit quality image classification. A linear network topology search is performed to tune a VGG16 lemon quality classification model using a publicly-available dataset of 2690 images. We find that appending a 4096 neuron fully connected layer to the convolutional layers leads to an image classification accuracy of 83.77%. We then train a Conditional Generative Adversarial Network on the training data for 2000 epochs, and it learns to generate relatively realistic images. Grad-CAM analysis of the model trained on real photographs shows that the synthetic images can exhibit classifiable characteristics such as shape, mould, and gangrene. A higher image classification accuracy of 88.75% is then attained by augmenting the training with synthetic images, arguing that Conditional Generative Adversarial Networks have the ability to produce new data to alleviate issues of data scarcity. Finally, model pruning is performed via polynomial decay, where we find that the Conditional GAN-augmented classification network can retain 81.16% classification accuracy when compressed to 50% of its original size.
... It is similar to decoder-only transformers, with the difference lying in the immense model scale and training data. As an unsupervised learning fashion, this method also applies to predict pixels without structure knowledge [6], label data for the graph neural networks [15], generate synthetic biological signals [2], and so forth. Its success may lie in that instances of downstream tasks appear in the succeeding inputs, which facilitates prediction. ...
Preprint
Acquiring dynamics is an essential topic in robot learning, but up-to-date methods, such as dynamics randomization, need to restart to check nominal parameters, generate simulation data, and train networks whenever they face different robots. To improve it, we novelly investigate general robot dynamics, its inverse models, and Gen2Real, which means transferring to reality. Our motivations are to build a model that learns the intrinsic dynamics of various robots and lower the threshold of dynamics learning by enabling an amateur to obtain robot models without being trapped in details. This paper achieves the "generality" by randomizing dynamics parameters, topology configurations, and model dimensions, which in sequence cover the property, the connection, and the number of robot links. A structure modified from GPT is applied to access the pre-training model of general dynamics. We also study various inverse models of dynamics to facilitate different applications. We step further to investigate a new concept, "Gen2Real", to transfer simulated, general models to physical, specific robots. Simulation and experiment results demonstrate the validity of the proposed models and method.
Thesis
In modern Human-Robot Interaction, much thought has been given to accessibility regarding robotic locomotion, specifically the enhancement of awareness and lowering of cognitive load. On the other hand, with social Human-Robot Interaction considered, published research is far sparser given that the problem is less explored than pathfinding and locomotion. This thesis studies how one can endow a robot with affective perception for social awareness in verbal and non-verbal communication. This is possible by the creation of a Human-Robot Interaction framework which abstracts machine learning and artificial intelligence technologies which allow for further accessibility to non-technical users compared to the current State-of-the-Art in the field. These studies thus initially focus on individual robotic abilities in the verbal, non-verbal and multimodality domains. Multimodality studies show that late data fusion of image and sound can improve environment recognition, and similarly that late fusion of Leap Motion Controller and image data can improve sign language recognition ability. To alleviate several of the open issues currently faced by researchers in the field, guidelines are reviewed from the relevant literature and met by the design and structure of the framework that this thesis ultimately presents. The framework recognises a user's request for a task through a chatbot-like architecture. Through research in this thesis that recognises human data augmentation (paraphrasing) and subsequent classification via language transformers, the robot's more advanced Natural Language Processing abilities allow for a wider range of recognised inputs. That is, as examples show, phrases that could be expected to be uttered during a natural human-human interaction are easily recognised by the robot. 
This allows for accessibility to robotics without the need to physically interact with a computer or write any code, with only the ability of natural interaction (an ability which most humans have) required for access to all the modular machine learning and artificial intelligence technologies embedded within the architecture. Following the research on individual abilities, this thesis then unifies all of the technologies into a deliberative interaction framework, wherein abilities are accessed from long-term memory modules and short-term memory information such as the user's tasks, sensor data, retrieved models, and finally output information. In addition, algorithms for model improvement are also explored, such as through transfer learning and synthetic data augmentation and so the framework performs autonomous learning to these extents to constantly improve its learning abilities. It is found that transfer learning between electroencephalographic and electromyographic biological signals improves the classification of one another given their slight physical similarities. Transfer learning also aids in environment recognition, when transferring knowledge from virtual environments to the real world. In another example of non-verbal communication, it is found that learning from a scarce dataset of American Sign Language for recognition can be improved by multi-modality transfer learning from hand features and images taken from a larger British Sign Language dataset. Data augmentation is shown to aid in electroencephalographic signal classification by learning from synthetic signals generated by a GPT-2 transformer model, and, in addition, augmenting training with synthetic data also shows improvements when performing speaker recognition from human speech. 
Given the importance of platform independence due to the growing range of available consumer robots, four use cases are detailed, and examples of behaviour are given by the Pepper, Nao, and Romeo robots as well as a computer terminal. The use cases involve a user requesting their electroencephalographic brainwave data to be classified by simply asking the robot whether or not they are concentrating. In a subsequent use case, the user asks if a given text is positive or negative, to which the robot correctly recognises the task of natural language processing at hand and then classifies the text, this is output and the physical robots react accordingly by showing emotion. The third use case has a request for sign language recognition, to which the robot recognises and thus switches from listening to watching the user communicate with them. The final use case focuses on a request for environment recognition, which has the robot perform multimodality recognition of its surroundings and note them accordingly. The results presented by this thesis show that several of the open issues in the field are alleviated through the technologies within, structuring of, and examples of interaction with the framework. The results also show the achievement of the three main goals set out by the research questions; the endowment of a robot with affective perception and social awareness for verbal and non-verbal communication, whether we can create a Human-Robot Interaction framework to abstract machine learning and artificial intelligence technologies which allow for the accessibility of non-technical users, and, as previously noted, which current issues in the field can be alleviated by the framework presented and to what extent.
Article
The Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) will produce several billion photometric redshifts (photo-z's), enabling cosmological analyses to select a subset of galaxies with the most accurate photo-z. We perform initial redshift fits on Subaru Strategic Program galaxies with deep grizy photometry using Trees for Photo-Z (TPZ) before applying a custom neural network classifier (NNC) tuned to select galaxies with (z_phot − z_spec)/(1 + z_spec) < 0.10. We consider four cases of training and test sets ranging from an idealized case to using data augmentation to increase the representation of dim galaxies in the training set. Selections made using the NNC yield significant further improvements in outlier fraction and photo-z scatter (σ_z) over those made with typical photo-z uncertainties. As an example, when selecting the best third of the galaxy sample, the NNC achieves a 35% improvement in outlier rate and a 23% improvement in σ_z compared to using uncertainties from TPZ. For cosmology and galaxy evolution studies, this method can be tuned to retain a particular sample size or to achieve a desired photo-z accuracy; our results show that it is possible to retain more than a third of an LSST-like galaxy sample while reducing σ_z by a factor of 2 compared to the full sample, with one-fifth as many photo-z outliers. For surveys like LSST that are not limited by shot noise, this method enables a larger number of tomographic redshift bins and hence a significant increase in the total signal to noise of galaxy angular power spectra.
Conference Paper
The use of the internet as a fast medium of spreading fake news reinforces the need for computational tools that combat it. Techniques that train fake news classifiers exist, but they all assume an abundance of resources including large labeled datasets and expert-curated corpora, which low-resource languages may not have. In this work, we make two main contributions: First, we alleviate resource scarcity by constructing the first expertly-curated benchmark dataset for fake news detection in Filipino, which we call "Fake News Filipino." Second, we benchmark Transfer Learning (TL) techniques and show that they can be used to train robust fake news classifiers from little data, achieving 91% accuracy on our fake news dataset, reducing the error by 14% compared to established few-shot baselines. Furthermore, lifting ideas from multitask learning, we show that augmenting transformer-based transfer techniques with auxiliary language modeling losses improves their performance by adapting to writing style. Using this, we improve TL performance by 4-6%, achieving an accuracy of 96% on our best model. Lastly, we show that our method generalizes well to different types of news articles, including political news, entertainment news, and opinion articles.
Chapter
In this work, we achieve up to 92% classification accuracy of electromyographic data between five gestures in pseudo-real-time. Most current state-of-the-art methods in electromyographical signal processing are unable to classify real-time data in a post-learning environment, that is, after the model is trained and results are analysed. In this work we show that a process of model calibration is able to lead models from 67.87% real-time classification accuracy to 91.93%, an increase of 24.06 percentage points. We also show that an ensemble of classical machine learning models can outperform a Deep Neural Network. An original dataset of EMG data is collected from 15 subjects for 4 gestures (Open-Fingers, Wave-Out, Wave-In, Close-Fist) using a Myo Armband for measurement of forearm muscle activity. The dataset is cleaned between gesture performances on a per-subject basis and a sliding temporal window algorithm is used to perform statistical analysis of EMG signals and extract meaningful mathematical features as input to the learning paradigms. The classifiers used in this paper include a Random Forest, a Support Vector Machine, a Multilayer Perceptron, and a Deep Neural Network. The three classical classifiers are combined into a single model through an ensemble voting system which scores 91.93% compared to the Deep Neural Network which achieves a performance of 88.68%, both after calibrating to a subject and performing real-time classification (pre-calibration scores for the two being 67.87% and 74.27%, respectively).
Article
The range of applications of electromyography-based gesture recognition has increased over the last years. A common problem regularly encountered in literature is the inadequate data availability. Data augmentation, which aims at generating new synthetic data from the existing ones, is the most common approach to deal with this data shortage in other research domains. In the case of surface electromyography (sEMG) signals, there is limited research in augmentation methods and quite regularly the results differ between available studies. In this work, we provide a detailed evaluation of existing (i.e., additive noise, overlapping windows) and novel (i.e., magnitude warping, wavelet decomposition, synthetic sEMG models) strategies of data augmentation for electromyography signals. A set of metrics (i.e., classification accuracy, silhouette score, and Davies–Bouldin index) and visualizations help with the assessment and provides insights about their performance. Methods like signal magnitude warping and wavelet decomposition yield considerable increase (up to 16%) in classification accuracy across two benchmark datasets. Particularly, a significant improvement of 1% in the classification accuracy of the state-of-the-art model in hand gesture recognition is achieved.
Article
Background: Data augmentation (DA) has recently been demonstrated to achieve considerable performance gains for deep learning (DL): increased accuracy and stability and reduced overfitting. Some electroencephalography (EEG) tasks suffer from low samples-to-features ratio, severely reducing DL effectiveness. DA with DL thus holds transformative promise for EEG processing, possibly like DL revolutionized computer vision, etc. New method: We review trends and approaches to DA for DL in EEG to address: Which DA approaches exist and are common for which EEG tasks? What input features are used? And, what kind of accuracy gain can be expected? Results: DA for DL on EEG began 5 years ago and is steadily used more. We grouped DA techniques (noise addition, generative adversarial networks, sliding windows, sampling, Fourier transform, recombination of segmentation, and others) and EEG tasks (into seizure detection, sleep stages, motor imagery, mental workload, emotion recognition, motor tasks, and visual tasks). DA efficacy across techniques varied considerably. Noise addition and sliding windows provided the highest accuracy boost; mental workload most benefitted from DA. Sliding window, noise addition, and sampling methods were most common for seizure detection, mental workload, and sleep stages, respectively. Comparison with existing methods: Percent of decoding accuracy explained by DA beyond unaugmented accuracy varied between 8% for recombination of segmentation and 36% for noise addition and from 14% for motor imagery to 56% for mental workload, 29% on average. Conclusions: DA is increasingly used and considerably improved DL decoding accuracy on EEG. Additional publications, if adhering to our reporting guidelines, will facilitate more detailed analysis.
Article
This paper proposes two new data augmentation approaches based on Deep Convolutional Generative Adversarial Networks (DCGANs) and Style Transfer for augmenting Parkinson’s Disease (PD) electromyography (EMG) signals. The experimental results indicate that the proposed models can adapt to different frequencies and amplitudes of tremor, simulating each patient’s tremor patterns and extending them to different sets of movement protocols. Therefore, one could use these models for extending the existing patient dataset and generating tremor simulations for validating treatment approaches on different movement scenarios.
Article
In this study, we present a transfer learning method for gesture classification via an inductive and supervised transductive approach with an electromyographic dataset gathered via the Myo armband. A ternary gesture classification problem is presented by states of ‘thumbs up’, ‘thumbs down’, and ‘relax’ in order to communicate in the affirmative or negative in a non-verbal fashion to a machine. Of the nine statistical learning paradigms benchmarked over 10-fold cross validation (with three methods of feature selection), an ensemble of Random Forest and Support Vector Machine through voting achieves the best score of 91.74% with a rule-based feature selection method. When new subjects are considered, this machine learning approach fails to generalise to new data, and thus the processes of Inductive and Supervised Transductive Transfer Learning are introduced with a short calibration exercise (15 s). Failure of generalisation shows that 5 s of data per class is the strongest for classification (versus one through seven seconds) with only an accuracy of 55%, but when a short 5 s per class calibration task is introduced via the suggested transfer method, a Random Forest can then classify unseen data from the calibrated subject at an accuracy of around 97%, outperforming the 83% accuracy boasted by the proprietary Myo system. Finally, a preliminary application is presented through social interaction with a humanoid Pepper robot, where the use of our approach and a most-common-class metaclassifier achieves 100% accuracy for all trials of a ‘20 Questions’ game.
Chapter
News articles have great impacts on asset prices in the financial markets. Many attempts have been reported to ascertain how news influences stock prices. Stock price fluctuations of highly influential companies can have a major impact on the economy as a whole. In particular, the automobile industry is a colossal industry that leads the Japanese industry. However, the limitations in the number of available data sets usually become the hurdle for the model accuracy. In this study, we constructed a news evaluation model utilizing GPT-2. A news evaluation model is a model that evaluates news articles distributed to financial markets based on price fluctuation rates and predicts fluctuations in stock prices. We have added news articles generated by GPT-2 as data for analysis. Besides, we used a co-occurrence network analysis to review the overview of the news articles. News articles were classified through Long Short-Term Memory (LSTM). The results showed that the accuracy of the news evaluation model improved by generating news articles using a language generation model through GPT-2. More detailed analyses are planned for the future.
Article
In this work, we focus on fine-tuning an OpenAI GPT-2 pre-trained model for generating patent claims. GPT-2 has demonstrated impressive efficacy of pre-trained language models on various tasks, particularly coherent text generation. Patent claim language itself has rarely been explored in the past and poses a unique challenge. We are motivated to generate coherent patent claims automatically so that augmented inventing might be viable someday. In our implementation, we identified a unique language structure in patent claims and leveraged its implicit human annotations. We investigated the fine-tuning process by probing the first 100 steps and observing the generated text at each step. Based on both conditional and unconditional random sampling, we analyze the overall quality of generated patent claims. Our contributions include: (1) being the first to generate patent claims by machines and being the first to apply GPT-2 to patent claim generation, (2) providing various experiment results for qualitative analysis and future research, (3) proposing a new sampling approach for text generation, and (4) publishing our fine-tuned GPT-2 model and sample code for future researchers to run on Colab.
Article
Objective: Brain Computer Interfaces (BCI) are becoming important tools for assistive technology, particularly through the use of Motor Imagery (MI) for aiding task completion. However, most existing methods of MI classification have been applied in a trial-wise fashion, with window sizes of approximately 2 seconds or more. Applying this type of classifier could cause a delay when switching between MI events. Approach: In this study, state-of-the-art classification methods for motor imagery are assessed with consideration for real-time and self-paced control, and a Convolutional Long-Short Term Memory (C-LSTM) network based on Filter Bank Common Spatial Patterns (FBCSP) is proposed. In addition, the effects of several methods of data augmentation on different classifiers are explored. Main results: The results of this study show that the proposed network achieves adequate results in distinguishing between different control classes, but both deep learning models considered are still less reliable than a Riemannian Minimum Distance to the Mean (MDM) classifier. In addition, controlled skewing of the data and the explored data augmentation methods improved the average overall accuracy of the classifiers by 14.0% and 5.3%, respectively. Significance: This manuscript is among the first to attempt combining convolutional and recurrent neural network layers for the purpose of MI classification, and is also one of the first to provide an in-depth comparison of various data augmentation methods for MI classification. In addition, all of these methods are applied to smaller windows of data and with consideration of ambient data, which provides a more realistic test bed for real-time and self-paced control.
Article
Objective: EEG motor imagery classification has been widely used in healthcare applications such as mobile assistive robots and post-stroke rehabilitation. Recently, CNN-based EEG motor imagery classification methods have been proposed and achieve relatively high classification accuracy. However, these methods use a single convolution scale in the CNN, while the best convolution scale differs from subject to subject, which limits the classification accuracy. Another issue is that the classification accuracy degrades when the training data is limited. Approach: To address these issues, we propose a hybrid-scale CNN architecture with a data augmentation method for EEG motor imagery classification. Main results: The proposed method achieves an average classification accuracy of 87.6% with 0.2% deviation, which outperforms several state-of-the-art EEG motor imagery classification methods. Significance: The proposed method effectively addresses the issues of existing CNN-based EEG motor imagery classification methods and improves the classification accuracy.