Journal of Ambient Intelligence and Humanized Computing (2020) 11:6021–6031
Thumbs up, thumbs down: non-verbal human-robot interaction through real-time EMG classification via inductive and supervised transductive transfer learning
JhonatanKobylarz1· JordanJ.Bird2· DiegoR.Faria2· EduardoParenteRibeiro1· AnikóEkárt2
Received: 11 October 2019 / Accepted: 27 February 2020 / Published online: 7 March 2020
© The Author(s) 2020
In this study, we present a transfer learning method for gesture classification via an inductive and supervised transductive
approach with an electromyographic dataset gathered via the Myo armband. A ternary gesture classification problem is
presented by states of ’thumbs up’, ’thumbs down’, and ’relax’ in order to communicate in the affirmative or negative in a
non-verbal fashion to a machine. Of the nine statistical learning paradigms benchmarked over 10-fold cross validation (with
three methods of feature selection), an ensemble of Random Forest and Support Vector Machine through voting achieves the
best score of 91.74% with a rule-based feature selection method. When new subjects are considered, this machine learning
approach fails to generalise new data, and thus the processes of Inductive and Supervised Transductive Transfer Learning are
introduced with a short calibration exercise (15 s). Failure of generalisation shows that 5 s of data per-class is the strongest
for classification (versus one through seven seconds) with only an accuracy of 55%, but when a short 5 s per class calibra-
tion task is introduced via the suggested transfer method, a Random Forest can then classify unseen data from the calibrated
subject at an accuracy of around 97%, outperforming the 83% accuracy boasted by the proprietary Myo system. Finally, a
preliminary application is presented through social interaction with a humanoid Pepper robot, where the use of our approach
and a most-common-class metaclassifier achieves 100% accuracy for all trials of a ‘20 Questions’ game.
Keywords Gesture classification · Human-robot interaction · Electromyography · Machine learning · Transfer learning · Inductive transfer learning · Supervised transductive transfer learning · Myo armband · Pepper robot
1 Introduction
Within a social context, the current state of Human-Robot
Interaction is arguably most often concerned with the
domain of verbal, spoken communication. That is, the tran-
scription of spoken language to text, and further Natural
Language Processing (NLP) in order to extract meaning;
this framework is oftentimes multi-modally combined with
other data, such as the tone of voice, which too carries useful
information. With this in mind, a recent National GP Survey
carried out in the United Kingdom found that 125,000 adults
and 20,000 children had the ability to converse in British
Sign Language (BSL) (Ipsos 2016), and of those surveyed,
15,000 people reported it as their primary language. These
statistics show that those 15,000 people only have the
ability to directly converse with approximately 0.22% of the
UK population. This argues for the importance of non-verbal
communication, such as through gesture.
Jhonatan Kobylarz and Jordan J. Bird are co-first authors.
* Jordan J. Bird
Jhonatan Kobylarz
Diego R. Faria
Eduardo Parente Ribeiro
Anikó Ekárt
1 Department ofElectrical Engineering, Federal University
ofParana, Curitiba, Brazil
2 School ofEngineering andApplied Science, Aston
University, Birmingham, UK
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
6022 J.Kobylarz et al.
1 3
To answer in the affirmative, negative, or to not answer
at all are three very important responses when it comes to
meaningful conversation, especially in a goal-based sce-
nario. In this study, a ternary classification experiment is
performed towards the domain of non-verbal communication
with robots; the electromyographic signals produced when
performing a thumbs up, thumbs down, and resting state
with either the left or right arms are considered, and statis-
tical classification techniques are benchmarked in terms of
validation, generalisation to new data, and transfer learning
to better generalise to new data in order to increase reli-
ability to within the realms of classical speech recognition.
That is, to reach interchangeable accuracies between the two
domains and thus enable those who do not have the ability of
speech to effectively communicate with machines.
The main contributions of this work are as follows:
• An original dataset is collected from five subjects for three-class gesture classification.¹ A ternary classification problem is thus presented: thumbs up, thumbs down, and relaxed.
• A feature extraction process from previous work is used to extract features from electromyographic waves. Prior to this, the process had only been explored in electroencephalography (EEG); in this work it is adapted for electromyographic gesture classification.²
• Multiple feature selection algorithms and statistical/ensemble classifiers are benchmarked in order to derive a best statistical classifier for the ground truth data.
• Multiple best-performing models attempt to predict new and unseen data towards the exploration of generalisation, which ultimately fails. Findings during this experiment show that 15 s (5 s per class) performs considerably better than 3, 6, 9, 12, 18, and 21 s of data. Model generalisation only slightly outperforms random guessing.
• Failure of generalisation is then remedied through the suggestion of a calibration framework via inductive and supervised transductive transfer learning. Inspired by the findings of the experiment described in the previous point, models are then able to reach extremely high classification ability on further unseen data presented post-calibration. Findings show that although a confidence-weighted Vote of Random Forest and Support Vector Machine performed better on the original, full dataset, the Random Forest alone outperforms this method for calibration and classification of unseen data (97% vs. 95.7% respectively).
• Finally, a real-time application of the work is preliminarily explored. Social interaction is enabled with a humanoid robot (Softbank's Pepper) in the form of a game, through gestural interaction and subsequent EMG classification of the gestures in order to answer yes/no questions while playing 20 Questions.
In order to present the aforementioned findings in a struc-
tured manner, exploration and results are presented in chron-
ological order, since a failed generalisation experiment is
then remedied with the aid of the findings through limita-
tion. The remainder of this article is structured as follows:
firstly, important state-of-the-art work within the field of gesture recognition and electromyography is presented in Sect. 2, along with important background information regarding Feature Selection and Machine Learning techniques explored within this study. Section 3 then outlines the processes followed towards dataset acquisition, feature extraction, and experimental methodologies, as well as important hyperparameters and hardware information required for replicability of the experiments. Results and discussion are then presented in Sect. 4, followed by a preliminary application of the findings in Sect. 5. Finally, possible future works are discussed in Sect. 6 with regards to the limitations of this work, together with a final conclusion of the findings presented.
2 Background
In this section, state-of-the-art literature in electromyographic gesture classification is considered. Additionally, a short overview of the statistical techniques is given.
Fig. 1 The MYO EMG Armband (Thalmic Labs)
1 Available online, https ://www.kaggl 654/emg-gestu re-
class ifica tion-thumb s-up-and-down/ Last Accessed: 25/02/2020.
2 Available online, https ://githu n-bird/eeg-featu re-gener
ation / Last Accessed: 25/02/2020.
2.1 EMG gesture classification and calibration
The MYO Armband, as shown in Fig. 1, is a device comprised of 8 electrodes ergonomically designed to read electromyographic data from on and around the arm by an embedded chip within the device. Researchers have noted the MYO's quality as well as its ease of availability to both researchers and consumers (Rawat et al. 2016), and it is thus recognised as having great potential in EMG-signal based experiments. In this section, notable state-of-the-art literature is presented within which the MYO armband has successfully provided EMG data for experimentation.
The Myo Armband was found to be accurate enough to control a robotic arm with 6 Degrees of Freedom (DoF) with similar speed and precision to the controlling subject's movements (Widodo et al. 2018). In this work, researchers found an effective method of classification through the training of a novel Convolutional Neural Network (CNN) architecture at a mean accuracy of 97.81%. A related study, also performing classification with a CNN, successfully classified 9 physical movements from 9 subjects at a mean accuracy of 94.18% (Mendez et al. 2017); it must be noted that in this work, the model was not tested for generalisation ability. This has been shown to be important in this study, since the strongest method for classification of the dataset was ultimately weaker than another model when it came to transfer of ability to unseen data.
Researchers have noted that gesture classification with Myo has real-world application and benefits (Kaur et al. 2016), showing that physiotherapy patients often exhibit much higher levels of satisfaction when interfacing via EMG and receiving digital feedback (Sathiyanarayanan and Rajan 2016). Likewise in the medical field, Myo has been shown to be competitively effective with far more expensive methods of non-invasive electromyography in the rehabilitation of amputation patients (Abduo and Galster 2015), and following this, much work has explored the application of gesture classification for the control of a robotic hand (Ganiev et al. 2016; Tatarian et al. 2018). Since the armband is worn on the lower arm, the goal of the robotic hand is to be teleoperated by non-amputees and likewise to be operated by amputation patients in place of the amputated hand. Work from the United States has also shown that EMG classification is useful for exercises designed to strengthen the glenohumeral
muscles towards rehabilitation in Baseball(Townsend etal.
Recently, work in Brazilian Sign Language classification via the Myo armband found high classification ability through a Support Vector Machine on a 20-class problem (Abreu et al. 2016). Researchers noted substantial limitations in the form of real-time classification application and generalisation, with models performing sub-par on unseen data. For example, letters A, T, and U had very poor classification abilities of 4%, 4%, and 5% respectively. This work aims both to train models and to explore methods of generalisation to new, unseen data in real-time. The Myo armband's proprietary framework, through a short exercise, boasts up to an 83% real-time classification ability. Although seemingly relatively high, the remaining 17% error rate prevents the Myo from being deployed in situations where such a rate of error is unacceptable and considered critical. Though it may be considered acceptable to possibly miscommunicate 17% of the time in sign language dictation, this error rate would be unacceptable, for example, for the control of a drone where a physical risk is presented. Thus, the goal of many works is to improve this ability. In terms of real-time classification, there are limited works, and many of them suggest a system of calibration during short exercises (similarly to the Myo framework) in order to fine-tune a Machine Learning model. In (Benalcázar et al. 2017), authors suggested a solution of a ten second exercise (five 2 s activities) in order to gain 89.5% real-time classification accuracy. This was performed through the K-Nearest Neighbour (KNN) and Dynamic Time Warping (DTW) algorithms. EMG has also been applied to other bodily surfaces for classification, for example, to the face in order to classify emotional response based on muscular activity (Tan et al. 2012).
In 2017, researchers found that certain early layers of a CNN could be applied to unseen subjects when further training is performed on subsequent layers of the network on new subject data (Côté-Allard et al. 2019). This study showed not only that a physical task ('pick up the cube') could be completed on average in less time than with joystick hardware, but that the transfer learning process allowed for 97.81% classification accuracy of the EMG data produced by the movements of 17 individual subjects. It must be noted that this deep learning technique (along with some of those aforementioned) is heavy in terms of resource usage (Shi et al. 2016), and thus, in this study, classical statistical methods are explored which require far fewer resources to train and classify data. This paradigm is followed in order to allow autonomous machines (usually operating a single CPU) to perform training, calibration, and classification without the need for comparatively more expensive GPU capabilities, or access to a cloud system with similar means.
Discrimination of affirmative and negative responses in the form of thumbs up and thumbs down was shown to be possible in a related study (Huang et al. 2015b), within which the two actions were part of a larger eight-class dataset which achieved 87.6% on average for four individual subjects. Linear Discriminant Analysis (LDA) was used to classify features generated by a sliding window of 200 ms in size with a 50 ms overlap, a technique similar to that followed in this work; the features were mean absolute value, waveform length, zero crossing and sign slope change for the
EMG itself and mean value and standard deviation observed by the accelerometer. In (Huang et al. 2015a), researchers followed a similar process for the classification of minute thumb movements when using an Android mobile phone. Results showed that accuracies of 89.2% and 82.9% are achieved for a subject holding a phone and not holding a phone respectively when 2 s of EMG data is classified with a K-Nearest Neighbour (KNN) classification algorithm. A more recent work explored the preliminary applications of image enhancement to surface electromyographs, showing their potential to improve the classification of muscle characteristics (ul Islam et al. 2019).
Calibration in the related works, where performed, is through the processes of Inductive Transfer Learning (ITL) and Supervised Transductive Transfer Learning (STTL). According to (Pan and Yang 2009) and (Arnold et al. 2007), ITL is the process satisfied when the source domain labels are available as well as the target labels; this is leveraged in the calibration stage, in which the gesture being performed is known. STTL is the process in which the source domain labels are available but the target labels are not; this is the validation stage in this study, when a calibrated model is benchmarked on further unknown data. Transfer learning is the process of knowledge transfer from one learned task to another (Zhuang et al. 2019). In this study, it is shown to be difficult to generalise a model to new subjects, and thus application of a model to new data is considered a task to be solved by transfer learning; transfer learning often shows strong results in the application of gesture classification in related state-of-the-art works (Liu et al. 2010; Goussies et al. 2014; Costante et al. 2014; Yang et al. 2018; Demir et al. 2019).
Numerous open issues arising from this literature review can be observed, and this experiment seeks to address said issues:
1. Often, only one method of Machine Learning is applied,
and thus different statistical techniques are rarely com-
pared as benchmarks on the same dataset.
In this work, many statistical techniques of feature
selection and machine learning are applied in order
to explore the abilities of each in EMG classification.
2. Very little exploration of generalisation has been performed; researchers usually opt to present classification ability on a dataset and there is a distinct lack of exploration when unseen subjects are concerned. This is important for real-world application.
In this work, models attempt to classify data gathered from new subjects and experience failure. This is further remedied by the suggestion of a short calibration task, in which the generalisation then succeeds through the process of inductive transfer learning and transductive transfer learning.
3. When applications are presented, there is often a lack of
exposition in the real-time results for that application.
In this work, where real-world, real-time applications
are concerned, classification abilities are given at each
step where required. This is important for exploration
of ability, and thus, exploration of areas for future
2.2 Selected feature selection algorithms
Feature selection is the process of reducing a dataset's dimensionality in order to reduce the complexity of machine learning algorithms while still maintaining effective classification ability (Dash and Liu 1997; Guyon and Elisseeff 2003). Thus, the main goal of feature selection is to disregard worthless attributes that have no bearing on class and, if stricter rules are in place, to also disregard those whose very little classification ability is not considered worth their contribution to model complexity. In this section, the chosen feature selection algorithms employed within this study are described.³
Information Gain is the scoring of an attribute's classification ability by comparing the change in entropy when said attribute is used for classification (Kullback and Leibler 1951). The entropy measured for a specific attribute is given as:

$$E(s) = -\sum_{j} p_j \log(p_j)$$

That is, the Entropy E is the sum, over the values j, of the probability mass p_j of each value multiplied by its negative logarithm. The change in entropy (Information Gain) when different attributes are observed for classification thus allows for scoring of ability.
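The entropy and Information Gain scoring described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the attribute is taken to be already discretised, the logarithm is base 2 (the text does not fix a base), and the toy values and the function names `entropy` and `information_gain` are hypothetical rather than taken from the study's Weka pipeline.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2, an assumption) of a list of discrete values."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels):
    """Reduction in class entropy after grouping instances by a discrete attribute."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the class labels that remain inside each attribute value.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: two binned attributes and the gesture labels.
labels = ["up", "up", "down", "down", "relax", "relax"]
informative = ["high", "high", "low", "low", "mid", "mid"]
uninformative = ["a", "b", "a", "b", "a", "b"]

print(information_gain(informative, labels))    # equals the full class entropy
print(information_gain(uninformative, labels))  # no reduction in entropy
```

An attribute that separates the classes perfectly recovers the whole class entropy, whereas one that is independent of the class scores zero, which is the basis on which attributes are ranked.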
Symmetrical Uncertainty is a method of dimensionality reduction by comparison of two attributes in regards to classification entropy and Information Gain given a pair (Gel'Fand and Yaglom 1959; Piao et al. 2019). This allows for comparative scores to be applied to attributes within the vector. For attributes X and Y, Symmetrical Uncertainty is given as:

$$SU(X, Y) = 2 \times \frac{IG(X \mid Y)}{E(X) + E(Y)}$$

where Entropy E and Information Gain IG are calculated as previously described.
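A small sketch of Symmetrical Uncertainty for two discrete attributes follows. It assumes the commonly used normalisation of twice the Information Gain divided by the sum of the two entropies, base-2 logarithms, and a convention of returning 0 when both attributes are constant; the helper names and toy vectors are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2, an assumption) of a list of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())


def information_gain(x, y):
    """IG(X|Y): entropy of X minus the entropy of X that remains once Y is known."""
    total = len(x)
    groups = {}
    for xv, yv in zip(x, y):
        groups.setdefault(yv, []).append(xv)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(x) - conditional


def symmetrical_uncertainty(x, y):
    """Information Gain normalised by the two entropies, giving a score in [0, 1]."""
    denominator = entropy(x) + entropy(y)
    if denominator == 0:
        return 0.0  # both attributes are constant; convention chosen for this sketch
    return 2.0 * information_gain(x, y) / denominator


x = ["a", "a", "b", "b"]
identical = ["a", "a", "b", "b"]
independent = ["p", "q", "p", "q"]

print(symmetrical_uncertainty(x, identical))    # fully redundant pair
print(symmetrical_uncertainty(x, independent))  # unrelated pair
```

The normalisation is what makes scores comparable across attributes with different numbers of values, which raw Information Gain alone does not provide.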
3 For the One Rule Feature Selection process, please see Sect. 2.3.
2.3 Selected machine learning algorithms
A Machine Learning (ML) algorithm, in general terms, is the process of building an analytical or predictive model with inspiration from labelled (known) data (Bishop 2006; Michie et al. 1994). The process of classification is to develop rules to label unseen (validation) data based on seen (training) data. This section details the general background of the learning models selected in this study. A wide range of models is chosen in order to explore the differing abilities of multiple statistical techniques.
One Rule classification is an extremely simplistic process which generates a best-fit ruleset based on one attribute. A single attribute is identified as the best for classification, and rules are generated based upon it, that is, effective splits to discriminate between the data objects (e.g. for an attribute a and a split value t, IF a > t, Class = Y; IF a ≤ t, Class = Z).
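To make the idea concrete, the sketch below searches every attribute for the single threshold split that misclassifies the fewest training instances. It is a deliberate simplification: each attribute gets one binary threshold with the majority class on either side, whereas practical One Rule implementations typically bucket numeric attributes into several intervals. The function name `one_rule` and the toy data are hypothetical.

```python
from collections import Counter


def one_rule(rows, labels):
    """Return the single (attribute, threshold) rule with the fewest training errors."""
    best = None
    for a in range(len(rows[0])):
        values = sorted(set(r[a] for r in rows))
        # Candidate thresholds lie midway between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[a] <= t]
            right = [y for r, y in zip(rows, labels) if r[a] > t]
            left_cls = Counter(left).most_common(1)[0][0]
            right_cls = Counter(right).most_common(1)[0][0]
            errors = sum(y != left_cls for y in left) + sum(y != right_cls for y in right)
            if best is None or errors < best["errors"]:
                best = {"attr": a, "threshold": t, "le": left_cls,
                        "gt": right_cls, "errors": errors}
    return best


# Hypothetical two-attribute data; only attribute 0 separates the classes.
rows = [(0.1, 5.0), (0.2, 1.0), (0.8, 4.0), (0.9, 2.0)]
labels = ["Z", "Z", "Y", "Y"]

rule = one_rule(rows, labels)
print(rule)
```

The returned dictionary reads directly as a rule: IF attribute `attr` is greater than `threshold` predict `gt`, otherwise predict `le`.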
Decision Trees are tree-like branched data structures, where at each node, a conditional control statement is used to provide a rule based on attribute values, and where an end node without connections represents a class (Pal 2005). Classification follows a process of cascading the data objects from start to end of the tree, and their predicted class is given as the one reached. Fitness of a tree layout is given as the entropy within the end nodes and their classified instances.⁴
A Random Decision Tree (RDT) with parameter K will select K random attributes at each node and develop splitting rules based on them (Prasad et al. 2006). The model is simple since no pruning is performed and thus an overfitted tree is produced to classify all input data points; therefore cross-validation is used to create an average of the best performing random trees, or with a testing set of unseen data.
Support Vector Machines (SVM) classify data points by optimising a data-dimensional hyperplane to most aptly separate them, and then classifying based on the distance vector measured from the hyperplane (Cortes and Vapnik 1995). Optimisation follows the goal of maximising the average margin between the points and the separator. Generation of an SVM is performed through Sequential Minimal Optimisation (SMO), a high-performing algorithm to generate and implement an SVM classifier (Platt 1998). To perform this, the large optimisation problem is broken down into smaller sub-problems, which can then be solved linearly. For multipliers a, reduced constraints are given as:

$$0 \leq a_1, a_2 \leq C, \qquad y_1 a_1 + y_2 a_2 = k$$

where y are the data classes, C is the SVM regularisation hyperparameter, and k is the negative of the sum over the remaining terms of the equality constraint.
Naive Bayes is a probabilistic model given by Bayes' Theorem which aims to find the posterior probability for a number of different hypotheses, then select the hypothesis with the highest probability. The posterior probability is given by:

$$P(h \mid d) = \frac{P(d \mid h)\, P(h)}{P(d)}$$

where P(h|d) is the probability of hypothesis h given the data d, and P(d|h) is the probability of data d given that the hypothesis h is true. P(h) is the probability of hypothesis h being true and P(d) is the probability of the data. The algorithm assumes each probability value as conditionally independent for a given target (ergo naive), calculated as P(d1|h)P(d2|h) and so on. Despite its simplicity, related work has shown its effectiveness in some complex problems (Wood et al. 2019), showing that Naive Bayes classification achieves 96% in negative predicted value with the Wisconsin breast cancer data set.
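A short worked example of the posterior calculation under the naive independence assumption is given below. All probabilities are invented for illustration, the attributes are treated as discrete, and the function name `naive_bayes_posterior` is hypothetical; it is not the Weka implementation used in the experiments.

```python
def naive_bayes_posterior(priors, likelihoods, observation):
    """Return P(h | d) for every hypothesis h under the naive independence assumption.

    priors:      {h: P(h)}
    likelihoods: {h: [{value: P(d_i = value | h)} for each attribute i]}
    observation: observed attribute values d_1, d_2, ...
    """
    joint = {}
    for h, prior in priors.items():
        p = prior
        for i, value in enumerate(observation):
            p *= likelihoods[h][i][value]  # multiply in P(d_i | h)
        joint[h] = p                       # P(d | h) * P(h)
    evidence = sum(joint.values())         # P(d), summed over all hypotheses
    return {h: p / evidence for h, p in joint.items()}


# Hypothetical two-class, two-attribute problem.
priors = {"up": 0.5, "down": 0.5}
likelihoods = {
    "up": [{"high": 0.8, "low": 0.2}, {"pos": 0.6, "neg": 0.4}],
    "down": [{"high": 0.3, "low": 0.7}, {"pos": 0.1, "neg": 0.9}],
}

posterior = naive_bayes_posterior(priors, likelihoods, ["high", "pos"])
print(posterior)
print(max(posterior, key=posterior.get))  # hypothesis with the highest posterior
```

Here the joint terms are 0.5 × 0.8 × 0.6 = 0.24 for "up" and 0.5 × 0.3 × 0.1 = 0.015 for "down", so dividing by their sum selects "up".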
Bayesian Networks are graphic probabilistic models that satisfy the local Markov property, and are used for computation of probability. This network is a Directed Acyclic Graph (DAG) in which each edge is a conditional dependency, and each node corresponds to a unique random variable and is conditionally independent of its non-descendants. Thus the probability of an arbitrary event $X = (X_1, \ldots, X_k)$ can be computed as $P(X) = \prod_{i=1}^{k} P(X_i \mid X_1, \ldots, X_{i-1})$.
Logistic Regression is a process of symmetric statistics where a numerical value is linked to the probability of an event occurring, e.g. the number of driving lessons to predict pass or fail (Walker and Duncan 1967). In a two class problem within a dataset containing i number of attributes and $\beta$ model parameters, the log odds l is derived via $l = \beta_0 + \sum_{i} \beta_i x_i$ and, taking the natural logarithm as the base, the odds of an outcome are shown through $o = e^{l}$, which can be used to predict an outcome based on previous observation.
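The relationship between log odds, odds and probability can be illustrated with the driving-lesson example. The coefficients below are invented purely for illustration, natural logarithms are assumed, and the function names are hypothetical.

```python
import math


def log_odds(beta0, betas, x):
    """Linear combination of the attributes: the log odds l."""
    return beta0 + sum(b * xi for b, xi in zip(betas, x))


def probability(beta0, betas, x):
    """Convert log odds to odds, then odds to a probability."""
    odds = math.exp(log_odds(beta0, betas, x))
    return odds / (1.0 + odds)


# Hypothetical model: intercept -4.0 and a weight of 0.2 per driving lesson.
beta0, betas = -4.0, [0.2]

for lessons in (5, 20, 35):
    print(lessons, round(probability(beta0, betas, [lessons]), 3))
```

With these made-up parameters the log odds reach zero at 20 lessons, so the predicted probability of passing is one half there and rises with further lessons.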
Voting allows for multiple trained models to act as an
ensemble through democratic or weighted voting. Each
model will vote on their outcome (prediction) by way of
methods such as simply applying a single vote or voting by
weight of probability experienced from training and valida-
tion. The final decision of the model is the class receiv-
ing the highest number of votes or weighted votes, and is
given as the outcome prediction. A Random Decision Forest
(RDF) is an example of a voting model. A specified number
of n RDTs are generated on randomly selected subsets of the
input data (Bootstrap Aggregation), and produce an overall
prediction by presenting the majority vote (Ho 1995).
4 For details on Information Gain, please see Sect. 2.2.
3 Method
In this section, the methodology of the experiments in this study is described. Initially, data is acquired prior to the generation of a full dataset through feature extraction. Machine Learning paradigms are then benchmarked on the dataset, before the exploration of real-time classification of unseen data.
The experiments performed in this study were executed on an AMD FX-8520 eight-core processor with a clock speed of 3.8 GHz. In terms of software, the algorithms are executed via the Weka API (implemented in Java). The machine learning algorithms are validated through a process of k-fold cross validation, where k is set to 10 folds. The voting process is to vote by average probabilities of the models, since two models are considered and thus a democratic voting process would result in a tie should the two models disagree.
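The reason for voting by average probabilities rather than by simple majority can be seen in a small sketch. The class distributions below are invented, and the function name `vote_average_probabilities` is hypothetical; it mirrors the idea of the averaging rule rather than reproducing the Weka API.

```python
def vote_average_probabilities(distributions):
    """Average the per-model class probabilities and return the most probable class."""
    classes = distributions[0].keys()
    averaged = {c: sum(d[c] for d in distributions) / len(distributions) for c in classes}
    return max(averaged, key=averaged.get), averaged


# Hypothetical outputs of the two ensemble members for one instance.
random_forest = {"up": 0.55, "down": 0.40, "relax": 0.05}
svm = {"up": 0.20, "down": 0.75, "relax": 0.05}

# A one-model-one-vote scheme would be tied here: one vote for "up", one for "down".
label, averaged = vote_average_probabilities([random_forest, svm])
print(label, averaged)
```

Averaging lets the more confident model break the disagreement, so in this example the ensemble predicts "down".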
3.1 Data acquisition
The Myo Armband records EMG data at a rate of 200 Hz via 8 dry sensors worn on the arm, and it also has a 9-axis Inertial Measurement Unit (IMU) performing at a sample rate of 50 Hz. For this study, data acquisition is performed with 5 subjects, three males and two females (aged 22–40). For model generalisation, 4 more subjects were taken into account, of which two are new subjects and two are performing the movements again. The gestures performed were thumbs up, thumbs down, and resting (a neutral gesture in which the subject is asked to rest their hand). For training, 60 s of forearm muscle activity data was recorded for each arm (two minutes, per subject, per gesture). In the case of benchmark data, the muscle waves were recorded in intervals of 1–7 s each.
3.2 Feature extraction
In this study, time series are considered through a sliding window technique in order to generate statistics and thus extract features or attributes from the 8-dimensional data. Related work in biological signal processing argues for the need of feature extraction prior to data mining (Mendoza-Palechor et al. 2019; Seo et al. 2019). This is performed because wave data is complex and temporal in nature, and thus single points are difficult to classify (since they depend on both past and future events). The feature extraction process in this study is based on previous works with electroencephalographic signals (Bird et al. 2018, 2019)⁵, which have been noted to bear some similarity to EMG signals (Grosse et al. 2002). A general overview of the process is as follows:
• Initially, a sliding window of length 1 s at an overlap of 0.5 s divides the data into short wave segments.
• For each time window, the following is performed:
  – Considering the full time window, the following statistics are measured:
    – The mean and standard deviation of the wave.
    – The skewness and kurtosis of each signal (Zwillinger and Kokoska 2000).
    – The maximum and minimum values.
    – The sample variances of each signal, plus the sample covariances of all pairs of waves (Montgomery and Runger 2010).
    – The eigenvalues of the covariance matrix (Strang
    – The upper triangular elements of the matrix logarithm of the covariance matrix (Chiu et al. 1996).
    – The magnitude of the frequency components of each signal by Fast Fourier Transform (FFT) (Van Loan
    – The frequency values of the ten most energetic components of the FFT, for each signal.
  – Considering the two 0.5 s windows produced due to offset (overlap of two 1 s windows resulting in 0.5 s windows):
    – The change in both the sample means and in the sample standard deviations between the first and second 0.5 s windows.
    – The change in both the maximum and minimum values between the first and second 0.5 s windows.
  – Considering the two 0.25 s quarter windows produced due to offset:
    – The mean of each quarter-window.
    – All paired differences of means between the quarter-windows.
    – The maximum (minimum) values of each quarter-window, plus all paired differences of maximum (minimum) values between the quarter-windows.
Change in attributes is also treated as a feature, in which
each window is passed the previous extracted value vector
sans maximum, mean, and minimum values of quarter win-
dows. The first window does not receive this vector since no
window preceded it.
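As a rough illustration of the windowing described in this section, the sketch below slides a 1 s window with 0.5 s overlap over an 8-channel recording and computes a small subset of the listed statistics. It is not the feature generation script referenced in the footnotes: only the mean, sample standard deviation, maximum, minimum and the change in mean between the two half-windows are computed, the signal is synthetic, and the names `window_features` and `extract` are hypothetical.

```python
import math
import statistics

SAMPLE_RATE_HZ = 200     # EMG sampling rate stated for the Myo armband
WINDOW = SAMPLE_RATE_HZ  # 1 s window, i.e. 200 samples
STEP = WINDOW // 2       # 0.5 s overlap, i.e. advance by 100 samples


def window_features(window):
    """window: list of samples, each a list of per-electrode readings."""
    features = []
    half = len(window) // 2
    for channel in zip(*window):  # iterate electrode by electrode
        first, second = channel[:half], channel[half:]
        features += [
            statistics.fmean(channel),
            statistics.stdev(channel),  # sample standard deviation
            max(channel),
            min(channel),
            statistics.fmean(second) - statistics.fmean(first),  # change between halves
        ]
    return features


def extract(signal):
    """Return one feature vector per overlapping window of the recording."""
    return [window_features(signal[start:start + WINDOW])
            for start in range(0, len(signal) - WINDOW + 1, STEP)]


# Synthetic 3 s, 8-channel recording standing in for real armband data.
signal = [[math.sin(0.1 * t + c) for c in range(8)] for t in range(3 * SAMPLE_RATE_HZ)]
rows = extract(signal)
print(len(rows), len(rows[0]))  # 5 windows, 40 features each
```

Three seconds of data yield five overlapping windows, and five statistics over eight electrodes yield 40 values per window; the full process described in the text produces far more attributes per window.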
Feature extraction thus produced a dataset of 2040
numerical attributes from the 8 electrodes, of which there
are 159 megabytes of data produced from the five subjects.
A minor original contribution is also presented in the form
5 Available online,
https ://githu n-bird/eeg-featu re-gener ation /
Last Accessed: 25/02/2020
of the application of these features to EMG data, since
they have only been shown to be effective thus far in EEG
signal processing.
3.3 Machine learning and benchmarking towards real-time classification
Following data acquisition and feature extraction, multiple
ML models are benchmarked in order to compare their
classification abilities on the EMG data. The particularly
strong models are then considered for generalisation and
real-time classification.
In this work, two approaches towards real-time classification are explored. Small datasets are recorded sequentially from four subjects, with recording lengths increasing in steps of 1 s from 1 to 7 s per class. These then constitute seven datasets per person, of {3, 6, ..., 21} s in total across the three classes.
Initially, the best four models observed by the previous
experiments are used to classify these datasets in order
to derive the ideal amount of time that an action must be
observed before the most accurate classification can be
Following this, a method of calibration through transfer
learning is also explored. The result from the aforemen-
tioned experiment (the ideal amount of observation time)
is taken forward and, for each person, appended to the full
dataset recorded for the classification experiments. Each
of the chosen ML techniques is then retrained and used
to classify further unseen data from said subject.
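The calibration idea above (append a new subject's labelled data to the
full dataset, retrain, then classify further unseen data from that
subject) can be sketched as follows. A nearest-centroid classifier is
swapped in purely as a stand-in so that the example needs only the
standard library; it is not one of the models benchmarked in this work,
and all values are hypothetical one-dimensional toy features.

```python
from collections import defaultdict


def fit_centroids(samples, labels):
    """Stand-in classifier: store one mean feature vector per class."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(model, x):
    """Return the class whose centroid is nearest to x."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], x)))


def accuracy(model, samples, labels):
    hits = sum(predict(model, x) == y for x, y in zip(samples, labels))
    return hits / len(labels)


# Hypothetical data: the new subject's signals are shifted relative to the base set.
base_X = [[0.0], [0.0], [10.0], [10.0], [-10.0], [-10.0]]
base_y = ["rest", "rest", "up", "up", "down", "down"]
calib_X = [[6.0], [6.0], [16.0], [16.0], [-4.0], [-4.0]]  # short labelled calibration
calib_y = ["rest", "rest", "up", "up", "down", "down"]
unseen_X = [[6.5], [15.5], [-4.5]]  # further unseen data from the same subject
unseen_y = ["rest", "up", "down"]

base_model = fit_centroids(base_X, base_y)
calibrated_model = fit_centroids(base_X + calib_X, base_y + calib_y)

print("before calibration:", accuracy(base_model, unseen_X, unseen_y))
print("after calibration:", accuracy(calibrated_model, unseen_X, unseen_y))
```

The toy numbers are chosen only to show the mechanism; they say nothing
about the accuracies reported in Sect. 4.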
4 Results
In this section, the preliminary results from the experiments
are given. Firstly, the chosen machine learning techniques are
benchmarked in order to select the most promising method
for the problem presented in this study. Secondly, generalisa-
tion of models to unseen data is benchmarked before a similar
experiment is performed within which transfer learning is lev-
eraged to enable generalisation of models to new data through
calibration to a subject.
4.1 Feature selection andmachine learning
Table1 shows the results of attribute selection performed on
the full dataset of 2040 numerical attributes. One Rule fea-
ture selection found that the majority of attributes held strong
One Rule classification ability, as is often expected(Ali and
Smith 2006). Information Gain and Symmetrical Uncertainty
produced slightly smaller datasets both of 1898, and it must
be noted that the two datasets are comprised of differing
In Table2, the full matrix of benchmarking results are pre-
sented. An interesting pattern occurs throughout all datasets,
both reduced and full; an SVM is always the best single classi-
fier, scoring between 87.11 and 87.14%. Additionally, a voting
ensemble of Random Forest and SVM always produce the
strongest classifiers at results of between 91.3 and 91.74%.
Interestingly, the One Rule dataset is slightly less complex
than the full dataset but produces a slightly superior result. The
Information Gain and Symmetrical Uncertainty datasets are far
less complex, and yet are only behind the best One Rule score
by 0.44% and 0.34% respectively. Logistic Regression on the
whole dataset fails due to its high resource requirements, but
is observed to be viable on the datasets that have been reduced.
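For readers unfamiliar with the two entropy-based scores used above, a
minimal sketch of Information Gain and Symmetrical Uncertainty for a
single discrete attribute follows. The toy attributes and labels are
hypothetical, and the continuous EMG features would first need to be
discretised, which is not shown here. The selection thresholds used in
this study are not stated, so none is assumed.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in Counter(values).values())


def information_gain(attribute, labels):
    """H(labels) - H(labels | attribute) for a discrete attribute."""
    total = len(labels)
    conditional = 0.0
    for value in set(attribute):
        subset = [y for a, y in zip(attribute, labels) if a == value]
        conditional += (len(subset) / total) * entropy(subset)
    return entropy(labels) - conditional


def symmetrical_uncertainty(attribute, labels):
    """2 * IG / (H(attribute) + H(labels)); normalises IG to the range 0..1."""
    denominator = entropy(attribute) + entropy(labels)
    if denominator == 0:
        return 0.0
    return 2.0 * information_gain(attribute, labels) / denominator


# Hypothetical toy example: one attribute separates the classes, one does not.
labels = ["rest", "rest", "up", "up"]
informative = ["low", "low", "high", "high"]
uninformative = ["a", "b", "a", "b"]

ig_good = information_gain(informative, labels)
ig_bad = information_gain(uninformative, labels)
su_good = symmetrical_uncertainty(informative, labels)
print(ig_good, ig_bad, su_good)
```

Attributes would then be ranked by either score, which is why the two
methods can keep the same number of attributes yet select different ones.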
Table 1 A comparison of the three attribute selection experiments
Note that scoring methods are unique and thus not comparable
between the three

Method                   No. attributes  Max score  Min score
One rule                 2000            64.39      30.51
Information gain         1898            0.62       0.004
Symmetrical uncertainty  1898            0.32       0.003
Table 2 10-fold classification ability of both single and ensemble methods on the datasets
Voting does not include random tree due to the inclusion of random forest
Accuracy (%). Single models: OneR, RT, SVM, NB, BN, LR. Ensemble models: RF, Vote (best two), Vote (best three)

Dataset                  OneR   RT     SVM    NB     BN    LR     RF     Vote (best two)  Vote (best three)
OneR                     61.33  74.03  87.14  64.32  69.9  60.76  91.30  91.74            74.67
InfoGain                 61.49  75.39  87.11  64.13  69.9  61.45  91.7   91.30            75.13
Symmetrical uncertainty  61.48  74.37  87.11  64.13  69.9  61.55  91.36  91.4             75.16
Whole dataset            61.33  74.09  87.14  64.32  69.9  x      91.3   91.71            74.72
4.2 Benchmarking requirements forrealtime
In this section, very short segments of unseen data are
collected from four subjects in order to apply the previously
generated models to new data, that is, to test the
generalisation ability (or lack thereof) of the models trained
on the 5-subject dataset. Generalisation initially fails, so
the least catastrophic model is identified and the focus turns
to calibrating a ’user’ in an ideally short amount of time via
transfer learning.
When the best model from Table2 is used, the ensemble
vote of average probabilities between a Random Forest and
SVM fails in being able to classify unseen data. Observe
Fig.2, in which 15 s of unseen data performs, on average, in
excess of any other amount of data, but yet still only reaches
a mean classification ability of 55.12% (which is unaccepta-
ble for a ternary classification problem).
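The "vote of average probabilities" rule mentioned above can be
sketched in a few lines. The probability values below are hypothetical
outputs invented for illustration, not results from this study, and the
function name is an assumption.

```python
def vote_average_probabilities(*prob_dicts):
    """Average class-probability dicts from several models.

    Returns the winning label and the averaged probabilities.
    """
    labels = set().union(*prob_dicts)
    averaged = {
        label: sum(p.get(label, 0.0) for p in prob_dicts) / len(prob_dicts)
        for label in labels
    }
    return max(averaged, key=averaged.get), averaged


# Hypothetical per-class probabilities for one data object.
rf_probs = {"rest": 0.2, "up": 0.5, "down": 0.3}   # assumed Random Forest output
svm_probs = {"rest": 0.1, "up": 0.3, "down": 0.6}  # assumed SVM output

label, averaged = vote_average_probabilities(rf_probs, svm_probs)
print(label, averaged)
```

In this toy case the two models disagree, and the more confident SVM
prediction wins once the probabilities are averaged.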
In Fig.3, the mean classification ability of other highly
performing models from the previous experiment are given
when unseen data are attemptedly classified. Likewise to the
Vote model observed in Fig.2, generalisation has failed for
all models. Two interesting insights emerge from the failed
experiments; firstly, 15 s of data (5 s per class) most often
leads to the best limited generalisation as opposed to both
shorter and longer experiments. Furthermore, the ability of
the Random Forest can be seen to exceed all of the other
three methods, suggesting that it is superior (albeit limited)
when generalisation is considered.
As previously described, calibration is attempted through
a short experiment. Due to the aforementioned findings, 15 s
of known data (that is, requested during ’setup’) is collected.
Fig. 2 Benchmarking of vote (Best Two) model generalisation ability
for unseen data segments per subject, in which generalisation has
failed due to low classification accuracies (x-axis: Seconds of Data;
y-axis: Classification Accuracy (%); series: Subjects 1 to 4)
Fig. 3 Initial pre-calibration mean generalisation ability of models
on unseen data from four subjects in a three-class scenario. Time is
given for total data observed equally for three classes. Generalisation
has failed (x-axis: Seconds of Data, 3 to 21 in steps of 3; y-axis:
Classification Accuracy (%); recoverable legend entries: Vote (RF, SVM, BN),
Vote (RF, SVM))
Table 3 Results of the models’ generalisation ability to 15 s of unseen
data once calibration has been performed

Model                Generalisation ability (%)
Single models
OneR                 63
RT                   91.86
SVM                  94
NB                   53.35
BN                   66.05
LR                   90.1
Ensemble models
RF                   97
Vote (RF, SVM)       95.7
Vote (RF, SVM, BN)   87.8
Table 4 Confusion matrix for the random forest once calibrated by
the subject for 15 s when used to predict unseen data
Counts have been compiled from all subjects. Class imbalance occurs
in real-time due to Bluetooth sampling rate

              Prediction
Ground truth  Rest  Up   Down
Rest          300   0    1
Up            0     324  1
Down          0     19   376
These labelled data are then added to the training data in
order to expand knowledge at a personal level. Once this is
performed and the models are retrained, they are benchmarked
with a further unseen dataset of 15 s of data, again
5 s per class. No further training of the models is performed;
they simply attempt to classify this unseen data. Table 3
shows the abilities of all previously benchmarked models
once the short calibration process is followed, with far
greater success than observed in the previous failed
experiments. As was conjectured from said failed experiments,
the Random Forest proved to be the most successful model
for generalisation towards a new subject once calibrated.
The error matrix for the best model is given in Table 4. The
most difficult task was the prediction of ’thumbs down’, which,
when a subject had a particularly small arm, would sometimes
be classified as a resting state. Observed errors are extremely
low, and thus future work to explore this is suggested in Sect. 6.
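As a worked check, overall accuracy and per-class recall can be
computed directly from the counts in Table 4. The sketch below assumes
that rows are ground truth and columns are predictions, as the table
layout suggests; overall accuracy does not depend on that orientation.

```python
labels = ["rest", "up", "down"]

# Counts from Table 4. Assumed orientation: rows = ground truth, columns = prediction.
confusion = [
    [300, 0, 1],
    [0, 324, 1],
    [0, 19, 376],
]

total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(len(labels)))
accuracy = correct / total

# Recall per class: correct predictions divided by the row total.
recall = {labels[i]: confusion[i][i] / sum(confusion[i]) for i in range(len(labels))}

print(f"correct: {correct} of {total}")
print(f"overall accuracy: {accuracy:.4f}")
for label, value in recall.items():
    print(f"recall[{label}]: {value:.4f}")
```

The diagonal sums to 1000 of 1021 data objects, which is roughly in
line with the Random Forest score reported in Table 3.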
5 Applications inhuman‑robot interaction
In this section, an application of the framework is presented
in an HRI context. The Random Forest model observed to be
the best model for generalisation in Sect. 4.2 is calibrated
for 5 s per class, in line with the benchmark results,
enabling the subject to interact non-verbally with machines
via EMG gesture classification. Note that only preliminary
benchmarks are presented and that Sect. 6 details potential
future work in this regard; these preliminary activities
are not considered the main contributions of this work,
which were presented in Sect. 4.
5.1 20 Questions withahumanoid robot opponent
20Q, or 20 Questions, is a digital game developed by Robin
Burgener based on the 20th Century American parlor
game of the same name and rules; it is a situational puzzle.
Through Burgener’s algorithm, computer opponents play
via the dissemination and subsequent strategy presented by
an Artificial Neural Network(Burgener 2006, 2003). In the
game between man and machine, the player thinks of an
entity and the opponent is able to ask 20 yes/no questions.
Through elimination of potential answers, the opponent is
free to guess the entity that the player is thinking of. If the
opponent cannot guess the entity by the end of the 20 ques-
tions, then the player has won.
In this application the 20 Questions game is played with
a humanoid robot, Softbank Robotics’ Pepper. Initially, the
subject is calibrated with 15 s of data (5 s per class) added to
the full dataset, due to the findings in this work. Following
this, for every round of questioning, the robot listens
to 5 s of data from the player, performs feature generation,
and finally considers the most commonly predicted class
from all data objects produced in order to derive the player’s
answer. This process can be seen in Fig. 4, in which feedback
is given during data classification. Two players each play
two games with the robot. Thus, the model used is a
calibrated Random Forest (through inductive and transductive
transfer learning) and a simple meta-approach of the
most common class.
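The most-common-class meta-approach described above amounts to a
majority vote over the per-data-object predictions gathered during the
5 s answer window. A minimal sketch follows; the prediction list is
hypothetical, and treating a winning ’rest’ class as "no answer" is an
assumption for illustration rather than documented behaviour.

```python
from collections import Counter


def most_common_class(window_predictions):
    """Return the label predicted most often across all data objects.

    Ties are resolved in favour of the label seen first, which is how
    Counter.most_common orders equal counts.
    """
    label, _count = Counter(window_predictions).most_common(1)[0]
    return label


# Hypothetical per-data-object predictions from one 5 s answer window.
predictions = ["up", "up", "rest", "up", "down", "up"]

gesture = most_common_class(predictions)
# Thumbs up is read as the affirmative, thumbs down as the negative.
answer = {"up": "yes", "down": "no"}.get(gesture, "no answer")
print(gesture, answer)
```

A few misclassified data objects therefore do not change the final
answer, provided the correct class remains the most frequent one.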
As can be seen in Table5, results from the four games
are given as average accuracy on a per-data-object basis, but
the results of the game operate on the final column, EMG
Predictions Accuracy, this is the measure of correct predic-
tions of thumb states by the most common prediction of all
data objects generated over the course of data collection and
feature generation. As can be observed, the high accuracies
of per-object classification contribute towards perfect clas-
sification of player answers, all of which were at 100%.
6 Future work andconclusion
In the calibration experiment, error rates were found to
be extremely low. Accuracy measurements exceeded the
original benchmarks and thus further experimentation is
required to explore this. Calibration was performed for a
limited group of four subjects; further experimentation
should explore a more general effect when a larger group of
participants is considered.
Fig. 4 Softbank Robotics’ Pepper robot playing 20 Questions with a
human through real-time EMG signal classification
Table 5 Statistics from two games played by two subjects each
Average accuracy is given as per-data-object, correct EMG predictions
are given as overall decisions

Subject  Yes avg. accuracy (%)  No avg. accuracy (%)  Avg. accuracy (%)  EMG predictions accuracy (%)
1        96.9                   96.5                  96.7               100
2        97                     97                    97                 100
Towards the end of this work, preliminary benchmarks
are presented for potential application of the inductive and
supervised transductive transfer learning calibration process.
The 20 Questions game with a Pepper robot was possible
with 15 s of calibration data and 5 s of answering time per
question, and predictions were at 100% for two subjects in
two different experimental runs. Further work could both
explore more subjects and attempt to perform this
task with a shorter answering time, i.e. a deeper exploration
into how much data is enough for a confident prediction. For
example, rather than the simplistic most-common-class Random
Forest approach, a more complex system of meta-classification
could prove more useful, as the pattern of error may
also be useful for prediction; if this were so, then it stands to
reason that confident classification could be enabled sooner
than the 5 s mark. Additionally, when a best-case paradigm
is confirmed, the method could then be compared to
other sensory techniques such as image/video classification
for gesture recognition. Furthermore, should said method
also be viable, then a multi-modal approach could be
explored in order to fuse both visual and EMG data.
This article shows that the proposed transfer learning
system is viable to be applied to the ternary classification
problem presented. Future work could explore the robust-
ness of this approach to problems of additional classes and
gestures in order to compare how results are affected when
more problems are introduced.
To finally conclude, this experiment firstly found that a
voting ensemble was a strong performer for classification of
gesture but failed to generalise to new data. With the induc-
tive and transductive transfer learning calibration approach,
the best model for generalisation of new data was a Random
Forest technique which achieved very high accuracy. After
gathering data from a subject for only 5 s, the model could
confidently classify the gesture at 100% accuracy through
the most common class Random Forest classifier. Since
very high accuracies were achieved by the transfer learning
approach in this work when compared to the state-of-the-
art related works and the proprietary MYO system, future
applications could be enabled with our approach towards a
much higher resolution of input than is currently available
with the MYO system.
Open Access This article is licensed under a Creative Commons Attri-
bution 4.0 International License, which permits use, sharing, adapta-
tion, distribution and reproduction in any medium or format, as long
as you give appropriate credit to the original author(s) and the source,
provide a link to the Creative Commons licence, and indicate if changes
were made. The images or other third party material in this article are
included in the article’s Creative Commons licence, unless indicated
otherwise in a credit line to the material. If material is not included in
the article’s Creative Commons licence and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will
need to obtain permission directly from the copyright holder. To view a
copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
References
Abduo M, Galster M (2015) Myo gesture control armband for medical
applications. https://www.semanticscholar.org/paper/Myo-Gesture-Control-Armband-for-Medical-Abduo-Galster/3b5ed355b09beecb7b2b6bbd23fead44b50374c6
Abreu JG, Teixeira JM, Figueiredo LS, Teichrieb V (2016) Evaluating
sign language recognition using the myo armband. In: 2016 XVIII
Symposium on Virtual and Augmented Reality (SVR), IEEE, pp
Ali S, Smith KA (2006) On learning algorithm selection for classifica-
tion. Applied Soft Computing 6(2):119–138
Arnold A, Nallapati R, Cohen WW (2007) A comparative study of
methods for transductive transfer learning. In: ICDM Work-
shops, pp 77–82
Benalcázar ME, Motoche C, Zea JA, Jaramillo AG, Anchundia CE,
Zambrano P, Segura M, Palacios FB, Pérez M (2017) Real-time
hand gesture recognition using the myo armband and muscle
activity detection. In: 2017 IEEE Second Ecuador Technical
Chapters Meeting (ETCM), IEEE, pp 1–6
Bird JJ, Manso LJ, Ribeiro EP, Ekárt A, Faria DR (2018) A study on
mental state classification using eeg-based brain-machine inter-
face. In: 2018 International Conference on Intelligent Systems
(IS), IEEE, pp 795–800
Bird JJ, Faria DR, Manso LJ, Ekárt A, Buckingham CD (2019) A
deep evolutionary approach to bioinspired classifier optimisation
for brain-machine interaction. Complexity.
https://doi.org/10.1155/2019/4316548
Bishop CM (2006) Pattern recognition and machine learning.
Springer, Berlin
Burgener R (2003) 20q twenty questions
Burgener R (2006) Artificial neural network guessing method and
game. US Patent App. 11/102,105
Chiu TY, Leonard T, Tsui KW (1996) The matrix-logarithmic covar-
iance model. Journal of the American Statistical Association
Cortes C, Vapnik V (1995) Support-vector networks. Machine learn-
ing 20(3):273–297
Costante G, Galieni V, Yan Y, Fravolini ML, Ricci E, Valigi P (2014)
Exploiting transfer learning for personalized view invariant ges-
ture recognition. In: 2014 IEEE International Conference on
Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp
Côté-Allard U, Fall CL, Drouin A, Campeau-Lecours A, Gosselin
C, Glette K, Laviolette F, Gosselin B (2019) Deep learning
for electromyographic hand gesture signal classification using
transfer learning. IEEE Transactions on Neural Systems and
Rehabilitation Engineering 27(4):760–771
Dash M, Liu H (1997) Feature selection for classification. Intelligent
data analysis 1(1–4):131–156
Demir F, Bajaj V, Ince MC, Taran S, Şengür A (2019) Surface emg
signals and deep transfer learning-based physical action classi-
fication. Neural Computing and Applications 31(12):8455–8462
Ganiev A, Shin HS, Lee KH (2016) Study on virtual control of a
robotic arm via a myo armband for the selfmanipulation of a
hand amputee. Int J Appl Eng Res 11(2):775–782
Gel’Fand I, Yaglom A (1959) Calculation of amount of information
about a random function contained in another such function.
Eleven Papers on Analysis, Probability and Topology 12:199
Goussies NA, Ubalde S, Mejail M (2014) Transfer learning decision
forests for gesture recognition. The Journal of Machine Learn-
ing Research 15(1):3667–3690
Grosse P, Cassidy M, Brown P (2002) Eeg-emg, meg-emg and emg-
emg frequency analysis: physiological principles and clinical
applications. Clinical Neurophysiology 113(10):1523–1531
Guyon I, Elisseeff A (2003) An introduction to variable and feature
selection. J Mach Learn Res 3:1157–1182
Ho TK (1995) Random decision forests. In: Proceedings of the third
international conference on document analysis and recognition,
IEEE, vol1, pp 278–282
Huang D, Zhang X, Saponas TS, Fogarty J, Gollakota S (2015a)
Leveraging dual-observable input for fine-grained thumb inter-
action using forearm emg. In: Proceedings of the 28th annual
ACM symposium on user interface software and technology,
ACM, pp 523–528
Huang Y, Guo W, Liu J, He J, Xia H, Sheng X, Wang H, Feng X,
Shull PB (2015b) Preliminary testing of a hand gesture recog-
nition wristband based on emg and inertial sensor fusion. In:
International conference on intelligent robotics and applica-
tions, Springer, pp 359–367
Ipsos M (2016) Gp patient survey-national summary report. NHS Eng-
land, London
ul Islam I, Ullah K, Afaq M, Chaudary MH, Hanif MK (2019) Spatio-
temporal semg image enhancement and motor unit action potential
(muap) detection: algorithms and their analysis. J Ambient Intell
Humaniz Comput 10(10):3809–3819
Kaur M, Singh S, Shaw D (2016) Advancements in soft comput-
ing methods for emg classification. Int J Biomed Eng Technol
Kullback S, Leibler RA (1951) On information and sufficiency. Ann
Math Stat 22(1):79–86
Liu J, Yu K, Zhang Y, Huang Y (2010) Training conditional random
fields using transfer learning for gesture recognition. In: 2010
IEEE international conference on data mining, IEEE, pp 314–323
Mendez I, Hansen BW, Grabow CM, Smedegaard EJL, Skogberg NB,
Uth XJ, Bruhn A, Geng B, Kamavuako EN (2017) Evaluation
of the myo armband for the classification of hand motions. In:
2017 International conference on rehabilitation robotics (ICORR),
IEEE, pp 1211–1214
Mendoza-Palechor F, Menezes ML, SantAnna A, Ortiz-Barrios M,
Samara A, Galway L (2019) Affective recognition from eeg
signals: an integrated data-mining approach. J Ambient Intell
Humaniz Comput 10(10):3955–3974
Michie D, Spiegelhalter DJ, Taylor C et al (1994) Machine learning.
Neural Stat Classif 13:1–298
Montgomery DC, Runger GC (2010) Applied statistics and probability
for engineers. Wiley, New York
Pal M (2005) Random forest classifier for remote sensing classification.
Int J Remote Sens 26(1):217–222
Pan SJ, Yang Q (2009) A survey on transfer learning. IEEE Trans
Knowl Data Eng 22(10):1345–1359
Piao M, Piao Y, Lee JY (2019) Symmetrical uncertainty-based feature
subset generation and ensemble learning for electricity customer
classification. Symmetry 11(4):498
Platt J (1998) Sequential minimal optimization: a fast algorithm for
training support vector machines. https ://www.micro
en-us/resea rch/wp-conte nt/uploa ds/2016/02/tr-98-14.pdf
Prasad AM, Iverson LR, Liaw A (2006) Newer classification and
regression tree techniques: bagging and random forests for eco-
logical prediction. Ecosystems 9(2):181–199
Rawat S, Vats S, Kumar P (2016) Evaluating and exploring the myo
armband. In: 2016 International conference system modeling and
advancement in research trends (SMART), IEEE, pp 115–120
Sathiyanarayanan M, Rajan S (2016) Myo armband for physiotherapy
healthcare: a case study using gesture recognition application. In:
2016 8th International conference on communication systems and
networks (COMSNETS), IEEE, pp 1–6
Seo J, Laine TH, Sohn KA (2019) Machine learning approaches for
boredom classification using eeg. J Ambient Intell Humaniz Com-
put 10(10):3831–3846
Shi S, Wang Q, Xu P, Chu X (2016) Benchmarking state-of-the-art
deep learning software tools. In: 2016 7th international confer-
ence on cloud computing and big data (CCBD), IEEE, pp 99–104
Strang G (2006) Linear algebra and its applications. Brooks Cole,
Tan JW, Walter S, Scheck A, Hrabal D, Hoffmann H, Kessler H, Traue
HC (2012) Repeatability of facial electromyography (emg) activ-
ity over corrugator supercilii and zygomaticus major on differ-
entiating various emotions. J Ambient Intell Humaniz Comput
Tatarian K, Couceiro MS, Ribeiro EP, Faria DR (2018) Stepping-stones
to transhumanism: An emg-controlled low-cost prosthetic hand for
academia. In: 2018 International conference on intelligent systems
(IS), IEEE, pp 807–812
Townsend H, Jobe FW, Pink M, Perry J (1991) Electromyographic
analysis of the glenohumeral muscles during a baseball rehabilita-
tion program. Am J Sports Med 19(3):264–272
Van Loan C (1992) Computational frameworks for the fast Fourier
transform. SIAM 10:10
Walker SH, Duncan DB (1967) Estimation of the probability of an
event as a function of several independent variables. Biometrika
Widodo MS, Zikky M, Nurindiyani AK (2018) Guide gesture applica-
tion of hand exercises for post-stroke rehabilitation using myo
armband. In: 2018 international electronics symposium on knowl-
edge creation and intelligent computing (IES-KCIC), IEEE, pp
Wood A, Shpilrain V, Najarian K, Kahrobaei D (2019) Private naive
bayes classification of personal biomedical data: application in
cancer data analysis. Comput Biol Med 105:144–150
Yang S, Lee S, Byun Y (2018) Gesture recognition for home automa-
tion using transfer learning. In: 2018 International conference on
intelligent informatics and biomedical sciences (ICIIBMS), IEEE,
vol3, pp 136–138
Zhuang F, Qi Z, Duan K, Xi D, Zhu Y, Zhu H, Xiong H, He Q (2019)
A comprehensive survey on transfer learning. arXiv:1911.02685
Zwillinger D, Kokoska S (2000) CRC standard probability and statis-
tics tables and formulae. Chapman and Hall, Boca Raton
Publisher’s Note Springer Nature remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.
Content courtesy of Springer Nature, terms of use apply. Rights reserved.
Terms and Conditions
Springer Nature journal content, brought to you courtesy of Springer Nature Customer Service Center GmbH (“Springer Nature”).
Springer Nature supports a reasonable amount of sharing of research papers by authors, subscribers and authorised users (“Users”), for small-
scale personal, non-commercial use provided that all copyright, trade and service marks and other proprietary notices are maintained. By
accessing, sharing, receiving or otherwise using the Springer Nature journal content you agree to these terms of use (“Terms”). For these
purposes, Springer Nature considers academic use (by researchers and students) to be non-commercial.
These Terms are supplementary and will apply in addition to any applicable website terms and conditions, a relevant site licence or a personal
subscription. These Terms will prevail over any conflict or ambiguity with regards to the relevant terms, a site licence or a personal subscription
(to the extent of the conflict or ambiguity only). For Creative Commons-licensed articles, the terms of the Creative Commons license used will
We collect and use personal data to provide access to the Springer Nature journal content. We may also use these personal data internally within
ResearchGate and Springer Nature and as agreed share it, in an anonymised way, for purposes of tracking, analysis and reporting. We will not
otherwise disclose your personal data outside the ResearchGate or the Springer Nature group of companies unless we have your permission as
detailed in the Privacy Policy.
While Users may use the Springer Nature journal content for small scale, personal non-commercial use, it is important to note that Users may
use such content for the purpose of providing other users with access on a regular or large scale basis or as a means to circumvent access
use such content where to do so would be considered a criminal or statutory offence in any jurisdiction, or gives rise to civil liability, or is
otherwise unlawful;
falsely or misleadingly imply or suggest endorsement, approval , sponsorship, or association unless explicitly agreed to by Springer Nature in
use bots or other automated methods to access the content or redirect messages
override any security feature or exclusionary protocol; or
share the content in order to create substitute for Springer Nature products or services or a systematic database of Springer Nature journal
In line with the restriction against commercial use, Springer Nature does not permit the creation of a product or service that creates revenue,
royalties, rent or income from our content or its inclusion as part of a paid for service or for other commercial gain. Springer Nature journal
content cannot be used for inter-library loans and librarians may not upload Springer Nature journal content on a large scale into their, or any
other, institutional repository.
These terms of use are reviewed regularly and may be amended at any time. Springer Nature is not obligated to publish any information or
content on this website and may remove it or features or functionality at our sole discretion, at any time with or without notice. Springer Nature
may revoke this licence to you at any time and remove access to any copies of the Springer Nature journal content which have been saved.
To the fullest extent permitted by law, Springer Nature makes no warranties, representations or guarantees to Users, either express or implied
with respect to the Springer nature journal content and all parties disclaim and waive any implied warranties or warranties imposed by law,
including merchantability or fitness for any particular purpose.
Please note that these rights do not automatically extend to content, data or other material published by Springer Nature that may be licensed
from third parties.
If you would like to use or distribute our Springer Nature journal content to a wider audience or on a regular basis or in any other manner not
expressly permitted by these Terms, please contact Springer Nature at
... Simultaneously we load the unseen dataset, from which a portion is extracted and incorporated into the training dataset for calibration purposes. This mimics the calibration period that would typically take place in a physical system [17]. This extended training dataset goes through the feature extraction process and is passed to the classification models to train upon. ...
... Conversely, Bird [4] and Kobaylarz [17] both extracted an ensemble of statistical features from the time-series EMG data provided by the Myo to be used as model attributes. Bird et al. demonstrated the effectiveness of a feature ensemble previously developed for use with electroencephalography (EEG) in the EMG domain, using an optimised deep neural network topology to classify four gestures from ten subjects at an accuracy of 85%. ...
... The connection is achieved using Myo Hub which enables the device's SDK and allows Python to gain access to the device as well. A Python script is used to record forearm EMG data produced by muscle activity when executing a gesture for 60 seconds [17]. Each of the Myo's eight EMG sensors produces an EMG reading and the IMU produces a single reading; the resulting data file is hence ten columns wide: one stating unix timestamps, eight for the EMG sensors and one for the IMU readings. ...
Full-text available
In this work, we achieve up to 92% classification accuracy of electromyographic data between five gestures in pseudo-real-time. Most current state-of-the-art methods in electromyographical signal processing are unable to classify real-time data in a post-learning environment, that is, after the model is trained and results are analysed. In this work we show that a process of model calibration is able to lead models from 67.87% real-time classification accuracy to 91.93%, an increase of 24.06%. We also show that an ensemble of classical machine learning models can outperform a Deep Neural Network. An original dataset of EMG data is collected from 15 subjects for 4 gestures (Open-Fingers, Wave-Out, Wave-in, Close-fist) using a Myo Armband for measurement of forearm muscle activity. The dataset is cleaned between gesture performances on a per-subject basis and a sliding temporal window algorithm is used to perform statistical analysis of EMG signals and extract meaningful mathematical features as input to the learning paradigms. The classifiers used in this paper include a Random Forest, a Support Vector Machine, a Multilayer Perceptron, and a Deep Neural Network. The three classical classifiers are combined into a single model through an ensemble voting system which scores 91.93% compared to the Deep Neural Network which achieves a performance of 88.68%, both after calibrating to a subject and performing real-time classification (pre-calibration scores for the two being 67.87% and 74.27%, respectively).
... Hand gestures are a very expressive medium of communication that occasionally can be used as a valid substitute for speech and conventional input devices, as it happens, for instance, in Sign Language and human-robot interaction applications [1,12,20,21]. Due to the complicated nature of hand, the task of recognizing dynamic hand gestures is a crucial yet challenging task in the computer vision community. The core challenge of this task lies in the effective extraction of discriminative spatiotemporal features to represent the different dependencies of each gesture. ...
... Vision-based dynamic hand gesture recognition is the task of automatically interpreting the hand action performed by the subject from a sequence of images. It is an active research direction in computer vision, artificial intelligence, and machine learning due to its widespread potential applications in sign language recognition, visual reality, human-robot action recognition, and visualization application (Avola et al. 2019;Mohammed et al. 2019b;Kobylarz et al. 2020). Early works on vision-based gesture recognition were mainly focused on both RGB and depth modalities as inputs and relied on image-level features for recognition (Ohn-Bar and Trivedi 2013;Oreifej and Liu 2013;). ...
Full-text available
Hand gesture and action recognition have been extensively researched in the past two decades due to the emerging advanced acquisition and interaction technologies, which open the floodgates for a vast range of potential applications. Particularly, many spatial–temporal feature extractors have been proposed, such as RNNs-based models, temporal convolutional network (TCN), and 3D convolutional neural networks (3DCNN) for modeling long-term dependencies in sequential data. However, it remains challenging to obtain a high recognition rate because of the difficulty of effectively extracting spatial–temporal features and efficiently classifying them with noisy and complex skeleton sequences. Therefore, this paper proposes a deep ensemble framework called multi-model ensemble gesture recognition network (MMEGRN) for skeleton-based hand gesture recognition. Specifically, to establish effective feature extraction and accurate gesture recognition, we propose an architecture consisting of four sub-networks, three spatio-temporal features classifiers to leverage their various capabilities of extracting and classifying skeleton sequences. Through late feature fusion, the features resulted from the feature extractors of each sub-network are fused into a new fusion classifier. Each subnetwork is trained independently to perform the task of gesture recognition using only skeleton joints. The training is performed using the cyclic annealing learning rate to generate a series of models that are combined in an ensemble using the optimized weighted ensemble (OWE) method. The proposed framework combines deep learning and ensemble strengths to establish a new deep-learning network architecture for more accurate and efficient hand gesture recognition. Extensive experiments on three skeleton-based hand gesture recognition datasets have shown the effectiveness of the proposed framework and the superiority over other models in terms of recognition accuracy.
... For many years, work has been underway to develop a loading scheme corresponding to the actual state that is also possible to implement for the needs of analyses [40–44]. For this purpose, in vivo tests, for example, use advanced systems based on electromyography (EMG), which is the measurement of an electric signal related to muscle activation [45,46]. In conjunction with the cross-sectional analysis of the muscle surface, this enables the interaction of individual muscles to be reduced to force vectors. ...
The aim of the study was to develop a new FEM (finite element method) model of a mandible with the temporal joint, which can be used in the numerical verification of the work of bonding elements used in surgical operations on patients with mandibular fractures or defects. Most numerical models of this type are dedicated to a specific case. The authors set out to build a model that can be relatively easily adapted to various types of tasks, allowing the stiffness, strength and durability of the bonded fragments to be assessed, taking into account operational loads and a fatigue limit that vary in time. The data on which the model was built came from DICOM (digital imaging and communications in medicine) files from medical imaging using computed tomography. On their basis, using the 3D Slicer program and algorithms based on the Hounsfield scale, a 3D model was created in the STL (standard triangle language) format. A CAD (computer-aided design) model was created using VRMesh and SolidWorks. An FEM model was built using HyperWorks and Abaqus/CAE. The Abaqus solver was used for FEM analyses. A model meeting the adopted assumptions was built. The verification was conducted by analyzing the influence of the simplifications of the temporomandibular joint on the assessment of mandibular strain. The work of an undamaged mandible and the work of the bonded fracture of the mandible were simulated.
... EMG is not only used in medical diagnostic procedures. It is also utilized as a gesture recognition tool that enables human physical activities to be entered into a computer, that is, as a form of human-computer interaction [16]. Moreover, there are attempts to use EMG as a control signal for electronic mobile devices [17,18], prostheses [19], and even flight control systems [20,21]. ...
This work deals with electromyography (EMG) signal processing for the diagnosis and therapy of different muscles. Because correctly measuring muscle activity from very noisy EMG signals is the major hurdle in medical applications, a raw measured EMG signal should be cleaned of disturbances such as power network interference and ECG heartbeat. Unfortunately, there are no completed studies showing full multistage signal processing of EMG recordings. In this article, the authors propose an original algorithm to perform muscle activity measurements based on raw measurements. The effectiveness of the proposed algorithm for EMG signal measurement was validated using a portable EMG system developed as part of an EU research project and sets of raw EMG measurements. Examples of removing the parasitic interferences are presented for each stage of signal processing. Finally, it is shown that the proposed processing of EMG signals enables cleaning of the EMG signal with minimal loss of the diagnostic content.
In modern Human-Robot Interaction, much thought has been given to accessibility regarding robotic locomotion, specifically the enhancement of awareness and lowering of cognitive load. On the other hand, with social Human-Robot Interaction considered, published research is far sparser given that the problem is less explored than pathfinding and locomotion. This thesis studies how one can endow a robot with affective perception for social awareness in verbal and non-verbal communication. This is possible by the creation of a Human-Robot Interaction framework which abstracts machine learning and artificial intelligence technologies which allow for further accessibility to non-technical users compared to the current State-of-the-Art in the field. These studies thus initially focus on individual robotic abilities in the verbal, non-verbal and multimodality domains. Multimodality studies show that late data fusion of image and sound can improve environment recognition, and similarly that late fusion of Leap Motion Controller and image data can improve sign language recognition ability. To alleviate several of the open issues currently faced by researchers in the field, guidelines are reviewed from the relevant literature and met by the design and structure of the framework that this thesis ultimately presents. The framework recognises a user's request for a task through a chatbot-like architecture. Through research in this thesis that recognises human data augmentation (paraphrasing) and subsequent classification via language transformers, the robot's more advanced Natural Language Processing abilities allow for a wider range of recognised inputs. That is, as examples show, phrases that could be expected to be uttered during a natural human-human interaction are easily recognised by the robot. 
This allows for accessibility to robotics without the need to physically interact with a computer or write any code, with only the ability of natural interaction (an ability which most humans have) required for access to all the modular machine learning and artificial intelligence technologies embedded within the architecture. Following the research on individual abilities, this thesis then unifies all of the technologies into a deliberative interaction framework, wherein abilities are accessed from long-term memory modules and short-term memory information such as the user's tasks, sensor data, retrieved models, and finally output information. In addition, algorithms for model improvement are also explored, such as through transfer learning and synthetic data augmentation and so the framework performs autonomous learning to these extents to constantly improve its learning abilities. It is found that transfer learning between electroencephalographic and electromyographic biological signals improves the classification of one another given their slight physical similarities. Transfer learning also aids in environment recognition, when transferring knowledge from virtual environments to the real world. In another example of non-verbal communication, it is found that learning from a scarce dataset of American Sign Language for recognition can be improved by multi-modality transfer learning from hand features and images taken from a larger British Sign Language dataset. Data augmentation is shown to aid in electroencephalographic signal classification by learning from synthetic signals generated by a GPT-2 transformer model, and, in addition, augmenting training with synthetic data also shows improvements when performing speaker recognition from human speech. 
Given the importance of platform independence due to the growing range of available consumer robots, four use cases are detailed, and examples of behaviour are given by the Pepper, Nao, and Romeo robots as well as a computer terminal. The first use case involves a user requesting their electroencephalographic brainwave data to be classified by simply asking the robot whether or not they are concentrating. In a subsequent use case, the user asks if a given text is positive or negative, to which the robot correctly recognises the task of natural language processing at hand and then classifies the text; the result is output and the physical robots react accordingly by showing emotion. The third use case has a request for sign language recognition, which the robot recognises and thus switches from listening to watching the user communicate with it. The final use case focuses on a request for environment recognition, which has the robot perform multimodality recognition of its surroundings and note them accordingly. The results presented by this thesis show that several of the open issues in the field are alleviated through the technologies within, structuring of, and examples of interaction with the framework. The results also show the achievement of the three main goals set out by the research questions: the endowment of a robot with affective perception and social awareness for verbal and non-verbal communication; the creation of a Human-Robot Interaction framework that abstracts machine learning and artificial intelligence technologies to make them accessible to non-technical users; and, as previously noted, the identification of which current issues in the field can be alleviated by the framework presented and to what extent.
Estimating applied force using the force myography (FMG) technique can be effective in human-robot interactions (HRI) using data-driven models. A model predicts well when adequate training and evaluation are carried out in the same session, which is sometimes time consuming and impractical. In real scenarios, a pretrained transfer learning model that predicts forces quickly once fine-tuned to the target distribution would be a favorable choice and hence needs to be examined. Therefore, in this study a unified supervised FMG-based deep transfer learner (SFMG-DTL) model using a CNN architecture was pretrained with multi-session FMG source data (Ds, Ts) and evaluated in estimating forces in separate target domains (Dt, Tt) via supervised domain adaptation (SDA) and supervised domain generalization (SDG). For SDA, case (i) intra-subject evaluation (Ds ≠ Dt-SDA, Ts ≈ Tt-SDA) was examined, while for SDG, case (ii) cross-subject evaluation (Ds ≠ Dt-SDG, Ts ≠ Tt-SDG) was examined. Fine-tuning with a small amount of “target training data” calibrated the model effectively towards target adaptation. The proposed SFMG-DTL model performed better with higher estimation accuracies and lower errors (R² ≥ 88%, NRMSE ≤ 0.6) in both cases. These results reveal that interactive force estimations via transfer learning will improve daily HRI experiences where “target training data” is limited, or faster adaptation is required.
Electromyogram (EMG) classification is a key technique in EMG-based control systems. Existing EMG classification methods, which do not consider EMG features that have distribution with skewness and kurtosis, have limitations such as the requirement to tune hyperparameters. In this paper, we propose a neural network based on the Johnson SU translation system that is capable of representing distributions with skewness and kurtosis. The Johnson system is a normalizing translation that transforms non-normal distribution data into normal distribution data, thereby enabling the representation of a wide range of distributions. In this study, a discriminative model based on the multivariate Johnson SU translation system is transformed into a linear combination of coefficients and input vectors using log-linearization; then, it is incorporated into a neural network structure. This allows the calculation of the posterior probability of each class given the input vectors and the determination of model parameters as weight coefficients of the network. The uniqueness of convergence of the network learning is theoretically guaranteed. In the experiments, the suitability of the proposed network for distributions including skewness and kurtosis was evaluated using artificially generated data. Its applicability to real biological data was also evaluated via EMG classification experiments. The results showed that the proposed network achieved high classification performance (e.g., 99.973% accuracy using Khushaba’s dataset) without the need for hyperparameter optimization.
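For orientation, the univariate Johnson SU translation mentioned in the abstract above is commonly written as follows. This is the standard textbook form, not necessarily the multivariate parameterisation used in that paper, and the symbol names (γ, δ, ξ, λ) are the conventional ones rather than anything stated in the abstract:

```latex
z = \gamma + \delta \,\sinh^{-1}\!\left(\frac{x - \xi}{\lambda}\right),
\qquad z \sim \mathcal{N}(0, 1), \quad \delta > 0, \ \lambda > 0
```

Here ξ and λ act as location and scale parameters, while γ and δ control the shape of the distribution of x, which is how skewness and kurtosis can be represented.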
Electromyography (EMG) signal detection and analysis finds numerous clinical and nonclinical applications. Several EMG acquisition and monitoring arrangements have been developed, and research continues on making such systems compact, cost-effective, and low-power, with inbuilt signal processing chips to eliminate noise and substantial mathematical capability for analysis and automated decision making. An attempt has been made in this work to deploy an economical and user-friendly computer-based home health monitoring system that can capture human EMG signals in real time using a simple interface arrangement. Muscle contractions could be detected from different sites using amplifier and filter hardware designed around a TL-084C op-amp. The EMG information acquired in real time was further protected against noise interference by a digital filter algorithm implemented in MATLAB®, which can filter the bio-signal on-line. Digital signal processing methods for feature extraction and classification of EMG signals have also been discussed at length. The experiment is extended to explore the possibility of acquiring two bio-signals simultaneously to establish a correlation between EMG data and carotid pulsation, which can be very useful in reaching diagnostic inferences. A standalone executable app has also been generated using MATLAB compilers, so that the algorithm can become platform independent.
Human physical action classification is an emerging area of research for human-to-machine interaction, which can help disabled people interact with the real world, and for robotics applications. EMG signals measure the electrical activity of the muscular systems involved in human physical action, and therefore provide information related to that action. In this paper, we propose a deep transfer learning-based approach to human action classification using surface EMG signals. The surface EMG signals are represented as time–frequency images (TFI) by using the short-time Fourier transform. The TFI is used as input to pre-trained convolutional neural network models, namely AlexNet and VGG16, for deep feature extraction, and a support vector machine (SVM) classifier is used for classification of the physical actions from the EMG signals. Fine-tuning of the pre-trained AlexNet model is also considered. The experimental results show that both the deep feature extraction and SVM classification method and the fine-tuning improve the classification accuracy when compared with various results from the literature. An accuracy of 99.04% is obtained with AlexNet fc6 + AlexNet fc7 + VGG16 fc6 + VGG16 fc7 deep feature concatenation and SVM classification, and an accuracy of 98.65% is achieved by fine-tuning the AlexNet model. We also compare the obtained results with some of the existing methods. The comparisons show that the deep feature concatenation and SVM classification method provides better classification accuracy than the compared methods.
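The time–frequency representation described in the abstract above can be illustrated with a minimal short-time Fourier transform. This is only a sketch in pure Python under stated assumptions: the function name `stft_magnitude`, the Hann window, and the window and hop sizes are illustrative choices, not details taken from that paper.

```python
import cmath
import math


def stft_magnitude(signal, window_size=16, hop=4):
    """Return a list of magnitude spectra, one per windowed frame.

    Hypothetical helper: window type and sizes are assumptions for illustration.
    """
    # Symmetric Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (window_size - 1))
              for n in range(window_size)]
    frames = []
    for start in range(0, len(signal) - window_size + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(window_size)]
        spectrum = []
        # Only the non-negative frequency bins are kept for a real-valued signal.
        for k in range(window_size // 2 + 1):
            coeff = sum(segment[n] * cmath.exp(-2j * math.pi * k * n / window_size)
                        for n in range(window_size))
            spectrum.append(abs(coeff))
        frames.append(spectrum)
    return frames


# Synthetic test signal: a sinusoid that completes 2 cycles per 16 samples.
samples = [math.sin(2 * math.pi * 2 * n / 16) for n in range(64)]
image = stft_magnitude(samples)
```

Each row of `image` is one time step and each column one frequency bin, so the nested list can be rendered as a grey-level image before being passed to an image classifier.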
In spatiotemporal multi-channel surface electromyogram (EMG) images where the x-axis is time, the y-axis is EMG channels and the gray level is EMG amplitude, the motor unit action potential (MUAP) appears as a linear Gaussian structure. The appearance of this MUAP pattern in the spatiotemporal images is mostly distorted either by the destructive superposition of other MUAPs occurring in the conducting volume or by various noise sources such as power-line interference, poor electrode-skin contact and movement artifacts. For accurate automatic detection of MUAPs, EMG image enhancement is needed to suppress the background noise and enhance the line-like MUAP propagation patterns. This study presents several candidate filters to enhance the MUAP propagation pattern in spatiotemporal EMG images. Filters that can detect and enhance line-like structures in digital images are used. Specifically, the Hermite shape filter is used for EMG image enhancement and compared with the Gabor filter and steerable filters. The performance of the filters in terms of accuracy, specificity, and sensitivity is evaluated with real sEMG signals measured from different muscles and with computer-generated EMG signals. In the enhanced images the visibility of the MUAP region is improved. These results can help in better estimation of muscle characteristics from sEMG signals.
The use of actual electricity consumption data makes it possible to detect changes in customer class types, a task that can be carried out using classification techniques. However, these techniques face several computational challenges. The most important one is to efficiently handle a large number of dimensions in order to increase customer classification performance. In this paper, we propose a symmetrical uncertainty-based feature subset generation and ensemble learning method for electricity customer classification. Redundant and significant feature sets are generated according to symmetrical uncertainty. After that, a classifier ensemble is built based on the significant feature sets and the results are combined for the final decision. The results show that the proposed method can efficiently find useful feature subsets and improve classification performance.
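As a rough illustration of the measure named in the abstract above, symmetrical uncertainty is usually defined as SU(X, Y) = 2·I(X; Y) / (H(X) + H(Y)), where H is Shannon entropy and I is mutual information. The sketch below assumes discrete (already discretised) feature values; the function names are hypothetical, and it does not reproduce that paper's subset-generation procedure.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), ranging from 0 to 1."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        # Both variables are constant, so there is nothing to share.
        return 0.0
    # Mutual information via the joint entropy: I = H(X) + H(Y) - H(X, Y).
    h_xy = entropy(list(zip(x, y)))
    return 2 * (h_x + h_y - h_xy) / (h_x + h_y)


identical = symmetrical_uncertainty([0, 1, 0, 1], [0, 1, 0, 1])
independent = symmetrical_uncertainty([0, 0, 1, 1], [0, 1, 0, 1])
```

A feature with a high score against the class label would count as significant, while two features with a high score against each other would count as redundant.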
This study suggests a new approach to EEG data classification by exploring the idea of using evolutionary computation to both select useful discriminative EEG features and optimise the topology of Artificial Neural Networks. An evolutionary algorithm is applied to select the most informative features from an initial set of 2550 EEG statistical features. Optimisation of a Multilayer Perceptron (MLP) is performed with an evolutionary approach before classification to estimate the best hyperparameters of the network. Deep learning and tuning with Long Short-Term Memory (LSTM) are also explored, and Adaptive Boosting of the two types of models is tested for each problem. Three experiments are provided for comparison using different classifiers: one for attention state classification, one for emotional sentiment classification, and a third experiment in which the goal is to guess the number a subject is thinking of. The obtained results show that an Adaptive Boosted LSTM can achieve an accuracy of 84.44%, 97.06%, and 9.94% on the attentional, emotional, and number datasets, respectively. An evolutionary-optimised MLP achieves results close to the Adaptive Boosted LSTM for the first two experiments and significantly higher for the number-guessing experiment, with an Adaptive Boosted DEvo MLP reaching 31.35%, while being significantly quicker to train and classify. In particular, the accuracy of the nonboosted DEvo MLP was 79.81%, 96.11%, and 27.07% in the same benchmarks. Two datasets for the experiments were gathered using a Muse EEG headband with four electrodes corresponding to the TP9, AF7, AF8, and TP10 locations of the international EEG placement standard. The EEG MindBigData digits dataset was gathered from the TP9, FP1, FP2, and TP10 locations.
Recently, commercial physiological sensors and computing devices have become cheaper and more accessible, while computer systems have become increasingly aware of their contexts, including but not limited to users’ emotions. Consequently, many studies on emotion recognition have been conducted. However, boredom has received relatively little attention as a target emotion due to its diverse nature. Moreover, only a few researchers have tried classifying boredom using electroencephalogram (EEG). In this study, to perform this classification, we first reviewed studies that tried classifying emotions using EEG. Further, we designed and executed an experiment, which used a video stimulus to evoke boredom and non-boredom, and collected EEG data from 28 Korean adult participants. After collecting the data, we extracted absolute band power, normalized absolute band power, differential entropy, differential asymmetry, and rational asymmetry from the EEG, and trained three machine learning algorithms on these features: support vector machine, random forest, and k-nearest neighbors (k-NN). We validated the performance of each trained model with 10-fold cross validation. As a result, we achieved the highest accuracy of 86.73% using k-NN. The findings of this study can be of interest to researchers working on emotion recognition, physiological signal processing, machine learning, and emotion-aware system development.
Clinicians would benefit from access to predictive models for diagnosis, such as classification of tumors as malignant or benign, without compromising patients’ privacy. In addition, the medical institutions and companies who own these medical information systems wish to keep their models private when in use by outside parties. Fully homomorphic encryption (FHE) enables computation over encrypted medical data while ensuring data privacy. In this paper we use private-key fully homomorphic encryption to design a cryptographic protocol for private Naive Bayes classification. This protocol allows a data owner to privately classify his or her information without direct access to the learned model. We apply this protocol to the task of privacy-preserving classification of breast cancer data as benign or malignant. Our results show that private-key fully homomorphic encryption is able to provide fast and accurate results for privacy-preserving medical classification.
Over the past decades, humans have redesigned themselves with the intent to evolve beyond their current physical and mental limitations. This phenomenon has been known as transhumanism, wherein robotics and biomimetics have been exploiting the unique designs of the human body with the intent to develop disruptive anthropomorphic artificial appendages. Nevertheless, while lower extremity prosthetics have evolved to the point at which lower leg amputees may be competitive with professional runners, there is still a gap between upper extremity prosthetics and real hands. This work intends to pioneer the development of a low-cost multipurpose robotic hand for research and academia. This paper describes the robotic hand, including its electromechanical development and full ROS integration. Moreover, the paper also presents a MATLAB framework designed to introduce sequence data classification, namely providing the ability to control the robotic hand using electromyography (EMG) signals from the forearm. This paper expects to contribute to an ever-increasing human-robot symbiosis by motivating students to engage in transhumanism studies using more sophisticated technologies and methods.