Electromyography Signal-based Gesture
Recognition for Human-Machine Interaction in
Real-Time through Model Calibration
Christos Dolopikos1, Michael Pritchard2, Jordan J. Bird3, and Diego R. Faria4
ARVIS Lab (Aston Robotics, Vision, and Intelligent Systems Laboratory), Computer
Science, Aston University, B4 7ET, Birmingham, United Kingdom
Abstract. In this work we achieve up to 92% classiﬁcation accuracy of
electromyographic data between ﬁve gestures in pseudo-real-time. Most
current state-of-the-art methods in electromyographical signal processing
are unable to classify real-time data in a post-learning environment, that
is, after the model is trained and results are analysed. In this work we
show that a process of model calibration can take models from 67.87%
real-time classification accuracy to 91.93%, an increase of 24.06 percentage points.
We also show that an ensemble of classical machine learning models can
outperform a Deep Neural Network. An original dataset of EMG data
is collected from 15 subjects for 4 gestures (Open-Fingers, Wave-Out,
Wave-In, Close-Fist) using a Myo Armband to measure forearm
muscle activity. The dataset is cleaned between gesture performances
on a per-subject basis and a sliding temporal window algorithm is used
to perform statistical analysis of EMG signals and extract meaningful
mathematical features as input to the learning paradigms. The classiﬁers
used in this paper include a Random Forest, a Support Vector Machine,
a Multilayer Perceptron, and a Deep Neural Network. The three classical
classiﬁers are combined into a single model through an ensemble voting
system which scores 91.93% compared to the Deep Neural Network which
achieves a performance of 88.68%, both after calibrating to a subject
and performing real-time classiﬁcation (pre-calibration scores for the two
being 67.87% and 74.27% respectively).
Keywords: Real-time Gesture Classiﬁcation, Machine Learning, Deep
Learning, Biosignal Processing, EMG
This work is partially supported by EPSRC-UK InDex project (EU CHIST-
ERA programme), with reference EP/S032355/1 and by the Royal Society (UK)
through the project “Sim2Real” with grant number RGS\R2\192498.
Human-Machine interfaces have been relatively static in recent years, despite
many technological advancements in both hardware and software. The eﬃciency
and productivity of conventional devices for interacting with machines has peaked,
and therefore alternative means of interaction have to be further explored. The
realization that human beings are the only source of input in this relation is
important, as it encourages us to shift the balance of Human-Machine interaction
closer to the human aspect. As humans we use our hands to interact with devices
and machines every day; in order to perform actions such as typing, bioelectrical
signals are carried through the spinal cord from the brain to the muscle motor
neurons to generate muscular activity. Thalmic Labs’ Myo Armband features
a set of Electromyography (EMG) sensors and an inertial measurement
unit (IMU), and can provide an alternative interface by reading that forearm
bioelectricity. In this work we classify between four distinct hand gestures (and
a ﬁfth class for the neutral, or resting, position) using electromyographic data
obtained with a Myo band. Although human beings are biologically similar,
there are small diﬀerences among them which can signiﬁcantly aﬀect the per-
formance of biosignal sensing devices; we hence collected a novel EMG dataset
from a diverse population of ﬁfteen able-bodied adults.
Fig. 1. Real-time emulation system.
The main scientiﬁc contributions of this work are as follows:
1. Collection of an original EMG dataset of 4 gestures from 15 individuals.
(Available at Kaggle1)
2. Achievement of better performance of voting classiﬁers over deep learning
3. Results support the importance of model calibration for real-time classiﬁca-
tion through classical, deep, and transfer learning.
4. Success in real-time classiﬁcation of unseen data
As part of this work we intended to verify our classiﬁcation system in real-
time; however, due to the ongoing 2020 pandemic of the SARS-CoV-2 virus,
we were unable to safely access the necessary equipment. We hence devised a
system to emulate real-time reading of pre-recorded EMG data for classiﬁcation.
The real-time classiﬁcation process starts by loading the required models and
the training dataset. Simultaneously we load the unseen dataset, from which a
portion is extracted and incorporated into the training dataset for calibration
purposes. This mimics the calibration period that would typically take place in
a physical system . This extended training dataset goes through the feature
extraction process and is passed to the classiﬁcation models to train upon. The
remainder of the unseen data meanwhile undergoes the same feature extraction
process as the training set, and awaits the completion of model training. When
the models are ready, single instances of the unseen testing dataset are provided
in turn for classiﬁcation. Figure 1 depicts an overview of our real-time emulation
approach for electromyographic gesture recognition.
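The emulation procedure above can be sketched as follows. The function and variable names are hypothetical placeholders rather than the authors' actual implementation, and any classifier exposing `fit`/`predict` can stand in for `model`:

```python
# Sketch of the real-time emulation loop: a calibration slice of the
# unseen data is folded into the training set, then the remaining
# unseen instances are classified one at a time as a live stream.
import numpy as np

def emulate_realtime(train_X, train_y, unseen_X, unseen_y, model,
                     calib_frac=1/3):
    n_calib = int(len(unseen_X) * calib_frac)
    # Calibration portion joins the existing training data
    X = np.vstack([train_X, unseen_X[:n_calib]])
    y = np.concatenate([train_y, unseen_y[:n_calib]])
    model.fit(X, y)
    # Remaining instances are provided singly, emulating live input
    correct = 0
    stream_X, stream_y = unseen_X[n_calib:], unseen_y[n_calib:]
    for xi, yi in zip(stream_X, stream_y):
        pred = model.predict(xi.reshape(1, -1))[0]
        correct += int(pred == yi)
    return correct / len(stream_y)
```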
As the majority of movements of the hand and wrist are controlled by muscles
in the forearm, it is appropriate to measure electromyographic (EMG) activity
in these muscles to identify hand movement. Thalmic Labs’ Myo Armband
is a wearable human-electronics interface which combines muscle activity sensors
utilizing surface electromyography (sEMG) and inertial sensor signals in
order to provide a gesture recognition system for electronic device control. The
bioelectric signals generated by muscle activity have potential differences in the
order of μV–mV, which are recorded by the sEMG sensors and converted
by Thalmic’s software to a unitless measure of muscular “activation”. Surface
EMG sensors require direct contact with the skin in order to conduct the
bioelectric signals of the underlying muscles; the quality of the acquired signals
is affected by various factors including the condition of the skin and the fat and
muscle density of the forearm. To develop Machine Learning models which are
more robust to variations in such conditions, it is beneficial to collect EMG data
from a range of individuals. EMG recognition devices such as the Myo are widely
used for prosthetic limbs, controlling robotic arms, and wheelchairs, and
have even been integrated as part of smart home environments.
1Dataset available at https://www.kaggle.com/chrisdolopikos/eleectromyography-
2.1 Related Work
There is much precedent in the literature for using Thalmic’s Myo device to
collect electromyographic data for offline classification. Benalcázar et al. used
a k-nearest-neighbours model to classify ﬁve diﬀerent gestures from a group of
ten individuals, reaching up to 98.5% accuracy when incorporating an activity
detection system to remove any segments in the EMG signals which did not
correspond to a movement of the hand. The attributes provided to their kNN
were the levels of EMG activity detected at each of the Myo’s eight sensors over
a given period.
Conversely, Bird and Kobylarz both extracted an ensemble of statistical
features from the time-series EMG data provided by the Myo to be used
as model attributes. Bird et al. demonstrated the eﬀectiveness of a feature en-
semble previously developed for use with electroencephalography (EEG) in the
EMG domain, using an optimised deep neural network topology to classify four
gestures from ten subjects at an accuracy of 85%. They also made use of transfer
learning techniques, using network weights learned from one type of biological
signal as a starting point for the other (as opposed to random starting weights).
Kobylarz succeeded in using this feature ensemble to classify EMG data corre-
sponding to three hand gestures, classifying unseen data with a Random Forest
at up to 97% when incorporating some calibration of the model to the new par-
ticipant. Similar accuracies were also reached with a voting system comprising
both a Random Forest and a Support Vector Machine (SVM).
Fewer studies, however, have been able to achieve online, or real-time, gesture
classification. Kim et al. found success in real-time classification of four
gestures from single-channel EMG measurement. They similarly extracted sta-
tistical features from the data, using an ensemble that shares some similarities
with that of Bird and Kobylarz whilst also using some diﬀerent features, notably
the number of zero-crossings and the degree to which the signal was periodic. By
combining a kNN and a Bayesian model they reached 94% accuracy, and were
able in real-time to control a radio-controlled car through gesture.
One developing application of electromyographic gesture control is sign lan-
guage recognition. The British Deaf Association reports that over 85,000 in-
dividuals in the United Kingdom use British Sign Language as their primary
preferred form of communication. Communicating with computer systems
is an integral part of modern life; many devices are now “hands-free”, relying
on voice recognition, hence creating a need to ensure that such devices are still
made accessible for the deaf. Deja et al. found that sign language users responded
positively to an electromyographic human-computer-interface for signing using
a pair of Myo devices, although further work was needed in constructing a
system robust to syntax and grammar errors.
Kaya & Kumbasar used data from a Myo to classify gestures representing the
digits 0–9 in Turkish sign language. In a similar fashion to Bird, Kobylarz,
and Kim they used a sliding time-series window to extract statistical features
from the EMG data, successfully classifying the ten gestures with kNN, SVM,
and Artificial Neural Network models. Their SVM was able to reach an F-score of
0.866, with 10-fold cross validation, though it remained to be seen whether the
model would be capable of classifying unseen data. Abreu et al. developed
a 20-class system based on the Brazilian sign language alphabet Libras. They
rectiﬁed and averaged the electromyographic signals from the Myo device, and
used these averaged signals as the input to an ensemble of twenty SVMs, one per
letter, each trained using a one-vs-all approach. While very high cross-validation
accuracies were reached for all letters, these accuracies decreased signiﬁcantly
when the system was presented with real-time, unseen data.
Whilst our study does not set out to classify sign language gestures specifically,
it is hoped that with our successful exploration of real-time gesture
classification methods we can help lay the groundwork for furthering development
in this area.
3 Implementation: EMG Data Processing and
3.1 Data Acquisition
The experimental setup for data collection consists of a Myo Armband (Figure
2) connected to a computer via Bluetooth Low Energy (BLE). The connection is
achieved using Myo Hub which enables the device’s SDK and allows Python to
gain access to the device as well. A Python script is used to record forearm EMG
data produced by muscle activity when executing a gesture for 60 seconds.
Each of the Myo’s eight EMG sensors produces an EMG reading and the IMU
produces a single reading; the resulting data ﬁle is hence ten columns wide:
one containing Unix timestamps, eight for the EMG sensors, and one for the IMU
readings. The EMG data streams at 200 Hz whilst the IMU data (which was not
utilised in this study) streams at 50 Hz. Each single measurement session results
in a single .csv ﬁle.
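A session file of this shape could be loaded as below. The column names are illustrative assumptions, since the raw files simply contain ten unlabelled columns:

```python
# Load one recording session; only the eight 200 Hz EMG channels
# are used in this study (timestamp and IMU columns are dropped).
import pandas as pd

COLS = ["timestamp"] + [f"emg_{i}" for i in range(1, 9)] + ["imu"]

def load_session(path):
    df = pd.read_csv(path, names=COLS)
    return df[[f"emg_{i}" for i in range(1, 9)]]
```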
A single subject would wear the armband on their right forearm (data was
also collected from the participants’ left arms, which we did not use in our study
but still intend to make available for the research community). The measuring
procedure started after a brief demonstration of one of the required defined
gestures: a closed fist (finger flexion), spread open fingers (finger abduction),
waving inwards (wrist flexion), and waving outwards (wrist extension).
The subject was then instructed to perform this gesture repeatedly for 60
seconds. To minimize the impact of random errors, 5 diﬀerent repetitions per
gesture were executed. A single subject hence performed four diﬀerent gestures
for ﬁve minutes each.
One factor contributing to random errors is the participant themselves. Continuous
repetition of gestures is exhausting for the muscles. As time passes, they
start becoming stiffer and as a result the subject struggles to perform the gestures
correctly, directly affecting the quality of EMG readings. To mitigate this
each subject was given one to two minutes for muscle relaxation after the com-
pletion of each measurement task.

Fig. 2. Subject performing “Wave-in” gesture (left), and the resulting electromyographic
waveforms (right).

Different stress levels might also contribute to
different or random bioelectrical activity resulting in outliers. To mitigate this
and lower stress levels the data collection sessions were conducted in an environment
that was familiar to the subjects. Fifteen able-bodied subjects (9 male, 6
female) participated in this experiment. The mean age of the subjects was 26,
with the youngest participant being 20 and the oldest 52. Whilst a diverse pop-
ulation was sought, there would be merit in future work expanding the dataset
even further, both in number and to individuals from a more diverse range of backgrounds.
3.2 Data Preprocessing
To ensure the integrity of the collected data all ﬁles were initially checked for
missing values, NaNs, and other discrepancies of a similar nature. Any individual
measurement instances containing such errors were removed.
3.2.1 Data Cleaning As previously described the participants of the data
collection are themselves a source of possible errors. Observation of the produced
datasets led to the recognition of systematic errors; it was found that due to
variation in the participants’ reaction times in initiating muscle movement when
instructed, a portion at the start of each recording did not have any signiﬁcant
EMG activity. It would not be appropriate to consider this “dead” time part
of the gesture as no physical muscular activity was taking place. To address
this, all EMG readings were divided into two sub-files, one containing the actual
gesture and one containing the null data. This processing was done in MATLAB,
with the assessment of when EMG activity started being performed by
observation of graphical plots of the signals. The null data was used to form a
rejection class, resulting in a total of ﬁve deﬁned gestures.
3.2.2 Rectification Raw EMG readings are unitless 8-bit values, typically ranging
from −128 to +127 for each timestamp, and are highly oscillatory. This means that in some
cases, simple statistical measures such as the mean will result in values close to
zero, regardless of the intensity of electromyographic activity. The ensemble of
statistical features used in this study includes more complex measures which are
less liable to be influenced by this; however, to maximise the informativeness of
the featureset the EMG waveforms were rectified (i.e. the modulus of the values
was taken) before feature extraction was performed. Both the rectiﬁed and raw
datasets were analysed to investigate the impact of this process.
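The rectification step is a simple full-wave rectification, for example:

```python
# Full-wave rectification: take the modulus of each raw EMG sample
# before feature extraction, so that simple statistics such as the
# mean reflect activity intensity rather than cancelling to ~zero.
import numpy as np

def rectify(emg):
    """emg: array of signed EMG readings (any shape)."""
    return np.abs(emg)

raw = np.array([-120.0, 115.0, -110.0, 118.0])
raw.mean()           # close to zero despite strong activity
rectify(raw).mean()  # reflects the intensity of the activity
```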
3.2.3 Feature Extraction Using the collected EMG data raw is not beneficial
for machine learning due to the data’s stochastic nature. EMG data, as
previously mentioned, are bioelectric signals that are non-stationary, non-linear
and of random nature. This means that even when a subject replicates the ex-
act same gesture, at a given point in time during the gesture’s execution it is
highly likely that the activated muscles do not have identical bioelectric activity.
Therefore, some statistical analysis on the raw data is required to produce a
meaningful dataset for machine learning.
Electroencephalographic (EEG) signals have been reported to bear similari-
ties with EMG signals; both are relatively low-frequency, very low-amplitude
electrical waveforms. Previous research has performed statistical feature analysis
for EEG and EMG by introducing sliding windows of length of 1 second at an
overlap of 0.5 seconds to segment the data [4, 5, 17]; we make use of the same
feature extraction approach here. First, each sliding 1-second window is divided
into two 0.5-second windows, segmenting the initial wave into equally sized
portions. The feature extraction process then comprises three stages, each related
to the size of the window being processed:
– 1-second windows:
•The mean and standard deviation of the wave
•The skewness and kurtosis of each wave
•The maximum and minimum values
•The sample variances of each wave, plus the sample covariances of all pairs
of the waves
•The eigenvalues of the covariance matrix
•The upper triangular elements of the matrix logarithm of the covariance
matrix
•The magnitude of the frequency components of each signal, obtained using
a Fast Fourier Transform (FFT)
– 0.5-second windows:
•The change of (between the first and second sliding window):
∗The sample mean and the sample standard deviation
∗The maximum and minimum values
– 0.25-second windows, produced due to the offset:
•The mean of each 0.25-second window
•All paired differences of means between the windows
•The maximum and minimum values and their paired differences
The input to the feature extraction process is an 8-signal wide array (with one
column for each of the Myo’s 8 EMG sensors), from which a total of 1771 at-
tributes are generated. The resulting featureset is then shuffled, and before being
passed to a classifier is standardised such that all attributes are in the range [−1, 1].
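A heavily simplified sketch of this windowed extraction follows, showing only a handful of the per-channel statistics (the full ensemble, including the covariance-derived and FFT features listed above, yields the 1771 attributes):

```python
# Simplified sliding-window feature extraction for 8-channel EMG.
# Window lengths follow the text: 1 s windows at 0.5 s overlap,
# with the Myo streaming EMG at 200 Hz.
import numpy as np
from scipy.stats import skew, kurtosis

FS = 200                   # Myo EMG sampling rate (Hz)
WIN, STEP = FS, FS // 2    # 1 s windows, 0.5 s overlap

def window_features(sig):
    """Per-window statistics for one EMG channel."""
    return [sig.mean(), sig.std(), skew(sig), kurtosis(sig),
            sig.max(), sig.min()]

def extract(emg):
    """emg: (n_samples, 8) array -> (n_windows, n_features) array."""
    rows = []
    for start in range(0, len(emg) - WIN + 1, STEP):
        w = emg[start:start + WIN]
        feats = []
        for ch in range(w.shape[1]):
            feats.extend(window_features(w[:, ch]))
        rows.append(feats)
    return np.array(rows)
```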
3.3 Classiﬁer Model Optimisation & Tuning
With data collection and preprocessing stages being completed, the resultant
dataset is in a form suitable for use in the classiﬁers. Before the training of the
classiﬁers starts, the dataset is split into two chunks; one for training and one for
testing purposes. Details of machine learning tuning and results are presented
in the following section. The utilization of these sub-datasets allows for training
of a model classiﬁer and testing of its performance against a subset of data.
This is repeated as part of the parameter tuning process, since classifiers have sets of
parameters determining their learning ability which need to be tuned.
3.3.1 Implementation Equipment Experiments were developed in Python
3.7 with the Scikit-learn library. The Deep Learning weights produced
using the training dataset were trained on an Intel Core CPU
and a GTX 980 Ti GPU using the Keras library. The weights produced using the
calibration dataset, and the remaining models, were built on an Intel Core CPU.
The same configuration was used for the real-time emulation system described previously.
3.3.2 Support Vector Machine Support Vector Machines are kernelised
models, meaning that their learning abilities depend on the kernel utilized, alongside
the C and gamma parameters of a given model. For this classifier Scikit-learn
was used. Parameter tuning results suggested that in this case a higher C
parameter could enable the SVM to achieve higher levels of accuracy, whereas
for the gamma parameter the reverse was true; a lower gamma resulted in higher
accuracy. The best performing kernel for the datasets was the Radial Basis Func-
tion (RBF). Based on experimentation, the chosen set of parameters used for
our SVM is: kernel=“RBF”, C=100, gamma=0.00001.
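In scikit-learn this configuration corresponds to the following; enabling probability estimates is our assumption, needed later for the soft-voting ensemble:

```python
# The reported SVM configuration as a scikit-learn estimator.
from sklearn.svm import SVC

svm = SVC(kernel="rbf", C=100, gamma=0.00001,
          probability=True)  # probability output assumed, for soft voting
```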
3.3.3 Random Forest The implementation of this classiﬁer is also achieved
with the Scikit-learn library. The first parameter optimised in this classifier
is the number of trees in the forest (n_estimators). The search for this
parameter started from 200 and had a maximum value of 2000. The next optimised
parameter is the maximum number of features considered when looking for a split
(max_features), which can be square root, log2, or auto. The parameter governing
the depth of the constructed trees is max_depth. This parameter’s search
started from 100 with a maximum of 500. Utilizing the RandomizedSearchCV
method provided by the Scikit-learn library, the cross-validation parameter is set
to 3 and the number of iterations to 100 so as to ensure the validity of the findings.
The optimised Random Forest model has n_estimators, max_features and
max_depth set to 1600, auto and 220 respectively.
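The search described can be reproduced along these lines; note that in recent scikit-learn releases `max_features="auto"` has been removed, so `None` stands in for it here:

```python
# Sketch of the reported RandomizedSearchCV over the Random Forest;
# the best parameters reported were n_estimators=1600,
# max_features="auto", max_depth=220.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

param_dist = {
    "n_estimators": np.arange(200, 2001, 200),
    "max_features": ["sqrt", "log2", None],  # None replaces deprecated "auto"
    "max_depth": np.arange(100, 501, 40),
}
search = RandomizedSearchCV(RandomForestClassifier(),
                            param_distributions=param_dist,
                            n_iter=100, cv=3)
# search.fit(X_train, y_train); search.best_params_
```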
3.3.4 Artiﬁcial Neural Network The model used consists of a single hidden
layer. Based on experimental tuning results, 20 neurons for a single layer provided
a competitive advantage over a model with 10 neurons, however performance was
not signiﬁcantly aﬀected by an increase to 30 neurons. Therefore, to minimize
memory usage the 20-neuron configuration was the preferred choice. The learning
rate affects not only the performance of the models, both positively and negatively,
but also the memory usage: while the model is training,
high learning rates result in error messages indicating possible memory overflow.
The set of optimised parameters used for the Multilayer Perceptron was hence
an L-BFGS solver, 20 neurons, and a 0.1 learning rate.
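The reported configuration in scikit-learn is shown below. One caveat worth noting: with the L-BFGS solver, scikit-learn actually ignores `learning_rate_init` (it applies only to the SGD and Adam solvers), so it is kept here purely to mirror the reported settings:

```python
# The reported Multilayer Perceptron configuration.
from sklearn.neural_network import MLPClassifier

mlp = MLPClassifier(solver="lbfgs", hidden_layer_sizes=(20,),
                    learning_rate_init=0.1)  # ignored by lbfgs
```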
3.3.5 Deep Neural Network The DNN used for the on-the-ﬂy real-time
emulation consists of ﬁve hidden layers with the ReLU activation function. The
five layers consist of 206, 226, 298, 167, and 36 neurons respectively. As the
classification problem of this work is a multiclass one, a softmax activation
function is used at the last layer of the network, which has 5 neurons.
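The topology described can be expressed in Keras as below; the choice of optimiser and loss is an assumption, as the text does not state them:

```python
# Five hidden ReLU layers of 206, 226, 298, 167, and 36 neurons,
# followed by a 5-neuron softmax output. The optimiser and loss are
# assumptions; input width matches the 1771-attribute featureset.
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import Dense

def build_dnn(n_features=1771, n_classes=5):
    model = Sequential([
        Input(shape=(n_features,)),
        Dense(206, activation="relu"),
        Dense(226, activation="relu"),
        Dense(298, activation="relu"),
        Dense(167, activation="relu"),
        Dense(36, activation="relu"),
        Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```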
Another DNN is constructed for use with a set of weights found using EEG
data in prior work; in that study these weights were found to be a suitable starting point
for using with EMG data. The number of layers and neurons are the same as
those described above. Similarly, the output layer utilizes a softmax activation
function with 3 neurons.
The weights produced above are input to the ﬁrst DNN with EMG data to
create another set of weights, to be used for the emulated real-time classiﬁcation.
The ﬁrst model is also used to randomly produce some weights using the same
dataset and save them for later use.
3.4 Classiﬁer Implementation
3.4.1 Voting Ensemble An ensemble voting mechanism was constructed
from the SVM, Random Forest, and ANN models. When the models are provided
with an instance i they each predict the class of i, outputting an array of 5
probabilities, one per class, which total 1. The arrays produced by the SVM and
ANN are scaled by a factor of 1.5 whilst the output array of the Random Forest
is multiplied by a factor of 2. The Random Forest’s consistent performance
throughout the experiments, owed to its tree structure, motivated this bias:
it grants the Random Forest a confidence vote in case of disagreement.
The scaled arrays are then summed
to produce a single array of the same length. The index with the highest score
represents the predicted class of i.
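The weighting scheme just described amounts to weighted soft voting:

```python
# Weighted soft voting as described: SVM and ANN probability arrays
# are scaled by 1.5, the Random Forest's by 2, then summed; the index
# of the highest combined score is the predicted class.
import numpy as np

def vote(p_svm, p_ann, p_rf):
    """Each argument is an array of 5 class probabilities summing to 1."""
    combined = (1.5 * np.asarray(p_svm)
                + 1.5 * np.asarray(p_ann)
                + 2.0 * np.asarray(p_rf))
    return int(np.argmax(combined))
```

Because the SVM and ANN together carry a weight of 3.0 against the Random Forest's 2.0, the forest cannot overrule both when they agree confidently, but its extra weight breaks a disagreement between the other two in its favour.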
3.4.2 Real-time Classiﬁcation A user is typically expected to use the Myo
Armband to classify their gestures in real-time. Machine learning algorithms are
often expected to perform poorly given real-time data and sometimes require the
use of a small additional training dataset to recalibrate their models. Due to the
ongoing 2020 SARS-CoV-2 pandemic, we were unable to access the Myo armband
for genuine live classiﬁcation with a new subject. However, in anticipation of
this, data had been collected from an additional sixteenth participant (male,
aged 21) who was left out of the training set entirely. This unseen raw EMG
was cleansed and preprocessed as described above, including the formation of a
rectiﬁed version of the data. The calibration process starts simultaneously with
the emulation procedure, by splitting this previously unseen data of the sixteenth
participant into thirds. One third is used for calibration and the remaining two
thirds for emulating real-time data. The calibration process implements Platt’s
Scaling, a technique that fits a score-to-probability calibration curve
using the training set. Calibration data are then combined with the existing and
previously seen training data for retraining the classiﬁers. The emulation portion
of the data also underwent the same feature extraction as the training dataset.
Each instance was then passed in turn in near-real-time to the classiﬁers, thus
emulating live human input (this process is depicted in Figure 1).
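One way to realise the Platt-scaling step in scikit-learn is shown below; whether the authors used `CalibratedClassifierCV` specifically, or their own implementation, is not stated, so this is an illustrative assumption:

```python
# Platt scaling via scikit-learn's sigmoid calibration: a logistic
# curve is fitted mapping classifier scores to probabilities.
from sklearn.calibration import CalibratedClassifierCV
from sklearn.svm import SVC

base = SVC(kernel="rbf", C=100, gamma=0.00001)
calibrated = CalibratedClassifierCV(base, method="sigmoid", cv=3)
# calibrated.fit(X_train_plus_calibration, y_train_plus_calibration)
```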
The Random Forest, SVM and ANN are combined with the voting mecha-
nism described above to provide a single prediction in order to compete with
the DNN. The process for DNNs is slightly different for the real-time data; two
of them were required to load the previously produced weights. One DNN was
left to randomly produce weights and use those for on-the-fly predictions,
whereas the other DNN model uses fine-tune learning: the feature-extracted
calibration data is used to generate weights which are then loaded into the same
model for the emulation process.
This section presents our experimental results. In all cases both the rectiﬁed
dataset produced by section 3.2.2 and the dataset without rectiﬁcation as in
3.2.1 were trialled to investigate the impact of the rectiﬁcation. In all tables the
reported accuracy is the arithmetic mean of 5 separate runs, whilst the values
in brackets indicate 95% confidence intervals on that accuracy (using the
Wilson method). To enable further analysis of classification errors, we also
present confusion matrices, precision, and recall scores for the best-performing
model in each experiment.
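For reference, the Wilson score interval on an observed accuracy over a number of classified instances can be computed as:

```python
# 95% Wilson score interval for a classification accuracy.
import math

def wilson(p_hat, n, z=1.96):
    """Wilson interval for proportion p_hat observed over n trials."""
    denom = 1 + z ** 2 / n
    centre = (p_hat + z ** 2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p_hat * (1 - p_hat) / n
                                   + z ** 2 / (4 * n ** 2))
    return centre - half, centre + half
```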
4.1 Model Benchmarking
We here set a benchmark by using a randomised 66% of the oﬄine dataset for
training and the remaining 33% for testing, providing us a goal for comparison
with our real-time results.
Table 1. Model benchmark scores using dataset without rectiﬁcation
Classiﬁers Accuracy (%)
Multilayer Perceptron 92.16 [89.02, 94.58]
Support Vector Machine 87.75 [84.03, 90.79]
Random Forest 91.12 [87.75, 93.65]
Random Weights 86.84 [82.81, 89.82]
EEG Weights 86.84 [82.81, 89.82]
EMG Weights 85.78 [81.90, 89.09]
Table 2. Confusion Matrix of Table 1’s best performing model, MLP (rows: actual
class; columns: predicted class, in the same class order)
fist 1.00 0.00 0.00 0.00 0.00
open 0.02 0.80 0.04 0.00 0.16
wave-in 0.00 0.01 0.90 0.09 0.00
wave-out 0.04 0.02 0.09 0.85 0.00
null 0.01 0.06 0.01 0.00 0.91
Table 3. Model benchmark scores using dataset with rectiﬁcation
Classiﬁers Accuracy (%)
Multilayer Perceptron 92.16 [89.02, 94.58]
Support Vector Machine 88.61 [84.95, 91.52]
Random Forest 90.08 [86.50, 92.71]
Random Weights 85.26 [81.30, 88.60]
EEG Weights 82.63 [78.30, 86.12]
EMG Weights 83.68 [79.49, 87.11]
Table 4. Confusion Matrix for Table 3’s best performing model, MLP
ﬁst 1.00 0.00 0.00 0.00 0.00
open 0.06 0.79 0.04 0.00 0.10
wave-in 0.00 0.01 0.89 0.08 0.02
wave-out 0.03 0.04 0.09 0.83 0.01
null 0.01 0.05 0.02 0.02 0.89
Table 5. Accuracy, Precision, and Recall scores for the best performing benchmark model
Multilayer Perceptron Accuracy (%) Precision (%) Recall (%)
Without rectiﬁed Data 92.16 92.08 80.83
With rectiﬁed Data 92.16 96.66 79.51
Comparison of Tables 1 & 3 suggests that the rectiﬁcation process did not
provide a signiﬁcant competitive advantage to the models’ performances. They
also suggest that base classiﬁers individually are capable of outperforming the
developed deep neural networks, albeit by a relatively small degree.
4.2 Real-time Classiﬁcation of Unseen Data
In emulated real-time we were able to classify unseen data with accuracies com-
petitive with the above benchmark when the models were calibrated, though
even uncalibrated performance was strong for the Fist, Wave In, and Wave Out
gestures as seen in Tables 7 & 9. We again found base classiﬁers to individually
outperform the deep neural networks as seen in Tables 11 & 13, though the DNN
achieved the best performance when uncalibrated (Tables 6 & 8).
Table 6. Real-time model performance without calibration using unrectiﬁed dataset
(* denotes the use of calibration dataset for the production of weights)
Classiﬁers Accuracy (%)
Voting Mechanism 67.87 [62.88, 72.47]
Random Weights 66.80 [61.75, 71.42]
EEG Weights 66.59 [61.46, 71.16]
EMG Weights 69.96 [65.17, 74.57]
EMG Calibration Weights* 74.27 [69.49, 78.48]
Table 7. Confusion Matrix for Table 6’s best model, EMG calibrated-weights DNN
ﬁst 0.77 0.00 0.04 0.01 0.21
open 0.00 0.11 0.23 0.27 0.40
wave-in 0.00 0.00 0.95 0.00 0.05
wave-out 0.00 0.10 0.00 0.81 0.10
null 0.00 0.24 0.33 0.00 0.44
Table 8. Real-time model performance without calibration using rectiﬁed dataset (*
denotes the use of calibration dataset for the production of weights)
Classiﬁers Accuracy (%)
Voting Mechanism 64.84 [59.76, 69.57]
Random Weights 67.00 [62.03, 71.68]
EEG Weights 68.01 [63.17, 72.74]
EMG Weights 68.61 [63.45, 73.00]
EMG Calibration Weights* 70.63 [65.74, 75.10]
Table 9. Confusion Matrix of Table 8’s best performing model, DNN with calibrated weights
ﬁst 0.77 0.01 0.02 0.00 0.21
open 0.00 0.14 0.29 0.25 0.32
wave-in 0.00 0.00 0.88 0.00 0.12
wave-out 0.00 0.09 0.01 0.87 0.04
null 0.00 0.25 0.38 0.05 0.31
Table 10. Accuracy, Precision, and Recall scores for the best performing uncalibrated model
DNN Accuracy (%) Precision (%) Recall (%)
Without rectiﬁed Data 74.27 67.11 67.21
With rectiﬁed Data 70.63 64.38 61.34
Table 11. Real-time model performance with calibration using unrectiﬁed dataset
Classiﬁers Accuracy (%)
Voting Mechanism 90.24 [86.81, 92.95]
Random Weights 86.06 [82.20, 89.33]
EEG Weights 83.36 [80.99, 88.35]
EMG Weights 85.58 [81.60, 88.84]
EMG Calibration Weights 83.23 [78.89, 86.61]
Table 12. Confusion Matrix of Table 11’s best performing model, Voting Mechanism
ﬁst 0.97 0.01 0.01 0.00 0.01
open 0.00 0.63 0.12 0.07 0.18
wave-in 0.00 0.00 1.00 0.00 0.00
wave-out 0.00 0.00 0.00 0.99 0.01
null 0.00 0.27 0.67 0.00 0.07
Table 13. Real-time model performance with calibration using rectiﬁed dataset
Classiﬁers Accuracy (%)
Voting Mechanism 91.93 [88.70, 94.35]
Random Weights 87.23 [83.42, 90.31]
EEG Weights 81.54 [77.40, 85.36]
EMG Weights 88.68 [84.95, 91.52]
EMG Calibration Weights 85.58 [81.60, 88.84]
Table 14. Confusion Matrix of Table 13’s best performing model, Voting Mechanism
ﬁst 0.96 0.00 0.02 0.00 0.02
open 0.00 0.23 0.34 0.15 0.28
wave-in 0.00 0.00 1.00 0.00 0.00
wave-out 0.00 0.01 0.00 0.99 0.00
null 0.00 0.00 0.27 0.00 0.72
Table 15. Accuracy, Precision, and Recall scores for the best performing real-time
models with calibration
Voting Mechanism Accuracy (%) Precision (%) Recall (%)
Without rectiﬁed Data 90.24 83.83 77.18
With rectiﬁed Data 91.93 88.04 78.13
The confusion matrices presented in subsections 4.1 & 4.2 allow us to infer that
some of the classiﬁcation errors were in part aﬀected by human anatomy. The
“open ﬁngers” gesture was the most likely to be misclassiﬁed in our benchmark
experiment (Tables 2 & 4, with 21.56% & 21.27% misclassiﬁcation respectively).
For the real-time experiments presented in section 4.2, “open ﬁngers” misclassi-
fication percentages reach 89.06%, 85.62%, 36.56% & 77.18% in Tables 7, 9, 12
& 14, respectively. In all these cases the misclassiﬁcation percentages mentioned
are substantially higher than those of the other gestures’ classes.
It can be observed that the “open-ﬁngers” gesture was most frequently erro-
neously recognized as the “wave-in” & “wave-out” gestures. The “open-ﬁngers”
gesture comprises primarily an abduction of the fingers, involving the extensor
carpi radialis (brevis & longus), extensor digitorum communis, and flexor carpi
radialis muscles. The flexor carpi radialis is also utilised in combination
with the flexor digitorum superficialis and flexor digitorum profundus in
flexion of the wrist, i.e. the “wave-in” gesture. Similarly, wrist extension and
hyperextension, as performed in the “wave-out” gesture, also require activation
of the flexor digitorum profundus. This shared employment of certain
forearm muscles provides some insight as to why gestures which outwardly ap-
pear very distinct in fact have electromyographic similarities, hence leading to
misclassiﬁcation between these gestures.
5 Conclusion & Future Work
Future extensions to this study could consider a larger number of participants for
collecting the base dataset, since this would allow the algorithms to generalise to a
greater number of people, which may minimise the amount of data required for
calibration. Whilst in this work the results were based on processing
of EMG data collected from the right forearm, the process could be replicated
for the left and would be likely to achieve similar results. Future work may even
be able to develop an ambidextrous gesture classiﬁcation system. Additionally,
the relationship between a larger number of gestures and the required size of
calibration datasets could also be explored, to move beyond the four classes
(plus one neutral class) studied in this work.
To conclude, Human-Computer Interaction has remained somewhat static in
the state-of-the-art for many years compared to other fields of computer science,
since real-time classification is often neglected, with research focusing on maximising
the accuracy of models on static training and testing datasets. In this
work, we contributed towards methods for real-time gesture recognition, developing
approaches that would allow the technology to be used in the real world.
Connecting users with technology in this way would allow for a higher level of
multi-tasking and productivity without physically discriminating against its users,
laying a basis for bringing humans into closer unison with technology. It is evident
from these experiments that calibration processes are important for real-time
classification of hand gestures, improving the best model's performance from 67.87%
to 91.93% accuracy when a short calibration exercise is performed. Additionally,
this study shows that a combination of multiple classical models can outperform a
deep neural network, providing motivation for future investigations into the optimal
choice of classifiers for this kind of problem. The product of this study is a
process for successfully classifying hand gestures in real-time.
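The voting mechanism evaluated in this work can be sketched with scikit-learn [23]. The classifier choices mirror the paper (Random Forest, SVM, and MLP combined by voting), but every hyperparameter shown, the use of soft voting, and the stand-in data are illustrative assumptions, not the authors' actual settings:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data: real inputs would be statistical features extracted from
# sliding windows of 8-channel Myo EMG, labelled with the five gesture classes.
X, y = make_classification(n_samples=400, n_features=40, n_informative=20,
                           n_classes=5, random_state=0)

# Soft voting averages per-class probabilities across members; SVC needs
# probability=True (Platt scaling [24]) to expose probability estimates.
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("svm", make_pipeline(StandardScaler(), SVC(probability=True, random_state=0))),
        ("mlp", make_pipeline(StandardScaler(), MLPClassifier(max_iter=500, random_state=0))),
    ],
    voting="soft",
)
ensemble.fit(X, y)
probs = ensemble.predict_proba(X[:5])  # one 5-class distribution per window
```

Calibrating such an ensemble to a new subject could then amount to refitting (or re-weighting the members) on a short subject-specific recording before real-time use.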
References
1. J. G. Abreu, J. M. Teixeira, L. S. Figueiredo, and V. Teichrieb. Evaluating sign
language recognition using the Myo armband. In 2016 XVIII Symposium on Virtual
and Augmented Reality (SVR), pages 64–70, 2016.
2. I. S. Aleem, P. Ataee, and S. Lake. Systems, devices, and methods for wearable
electronic devices as state machines, Feb. 5 2019. US Patent 10,199,008.
3. M. E. Benalcázar, C. Motoche, J. A. Zea, A. G. Jaramillo, C. E. Anchundia,
P. Zambrano, M. Segura, F. Benalcázar Palacios, and M. Pérez. Real-time hand
gesture recognition using the Myo armband and muscle activity detection. In 2017
IEEE Second Ecuador Technical Chapters Meeting (ETCM), pages 1–6, 2017.
4. J. J. Bird, J. Kobylarz, D. R. Faria, A. Ekárt, and E. P. Ribeiro. Cross-domain
MLP and CNN transfer learning for biological signal processing: EEG and EMG. IEEE
Access, 8:54789–54801, 2020.
5. J. J. Bird, L. Manso, E. Ribeiro, A. Ekárt, and D. Faria. A study on mental state
classification using EEG-based brain-machine interface. In 2018 International Con-
ference on Intelligent Systems (IS), Funchal, Madeira Island, Portugal, September 2018.
6. A. Boyali, N. Hashimoto, and O. Matsumoto. Hand posture and gesture recognition
using Myo armband and spectral collaborative representation based classification.
In 2015 IEEE 4th Global Conference on Consumer Electronics (GCCE), pages
7. British Deaf Association. Help & resources, March 2017.
8. F. Chollet et al. Keras. https://keras.io, 2015.
9. S. W. Cummings, C. Tangen, B. Wood, and R. H. Crompton. Encyclopædia
Britannica, Apr 2018.
10. C. Darwin. On the origin of species: A facsimile of the ﬁrst edition. Harvard
University Press, 1964.
11. J. A. Deja, P. Arceo, D. G. David, P. L. Gan, and R. C. Roque. MyoSL: A
framework for measuring usability of two-arm gestural electromyography for sign
language. In M. Antona and C. Stephanidis, editors, Universal Access in Human-
Computer Interaction. Methods, Technologies, and Users, pages 146–159, Cham,
2018. Springer International Publishing.
12. C. Dolopikos, M. Pritchard, J. J. Bird, and D. R. Faria. Collection of original EMG
13. T. A. Ghebreyesus. WHO director-general’s opening remarks at the media brieﬁng
on COVID-19 - 11 March 2020, Mar 2020.
14. H. Gray. Anatomy of the human body. Lea & Febiger, Philadelphia, Pennsylvania,
USA, 20th edition, 1918.
15. E. Kaya and T. Kumbasar. Hand gesture recognition systems with the wearable
Myo armband. In 2018 6th International Conference on Control Engineering In-
formation Technology (CEIT), 10 2018.
16. J. Kim, S. Mastnik, and E. André. EMG-based hand gesture recognition for real-
time biosignal interfacing. In Proceedings of the 13th International Conference on
Intelligent User Interfaces, pages 30–39, 2008.
17. J. Kobylarz, J. J. Bird, D. R. Faria, E. P. Ribeiro, and A. Ekárt. Thumbs up,
thumbs down: non-verbal human-robot interaction through real-time EMG classi-
fication via inductive and supervised transductive transfer learning. Journal of
Ambient Intelligence and Humanized Computing, pages 1–11, 2020.
18. S. Masson, F. Fortuna, F. Moura, and D. Soriano. Integrating Myo armband for
the control of myoelectric upper limb prosthesis. In XXV Brazilian Congress on
Biomedical Engineering, 10 2016.
19. The Mathworks, Inc., Natick, Massachusetts, USA. MATLAB version
9.7.0.1296695 (R2019b) Update 4, 2019.
20. R. Merletti and H. Hermens. Detection and conditioning of the surface EMG
signal. Electromyography: physiology, engineering, and noninvasive applications,
pages 107–31, 2004.
21. R. Merletti and P. J. Parker. Electromyography: physiology, engineering, and non-
invasive applications, volume 11. John Wiley & Sons, 2004.
22. G. Morais, L. Neves, A. Masiero, and M. C. Castro. Application of Myo armband
system to control a robot interface. In Proceedings of the 9th International Joint
Conference on Biomedical Engineering Systems and Technologies, pages 227–231,
23. F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel,
M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos,
D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine
learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.
24. J. Platt et al. Probabilistic outputs for support vector machines and comparisons to
regularized likelihood methods. Advances in large margin classifiers, 10(3):61–74, 1999.
25. Thalmic Labs, Inc. How do I access the raw EMG data from the Myo armband?,
26. Thalmic Labs, Inc. Tech specs: Myo battery life, dimensions, compatibility, and
more. Web Archive, 2016.
27. M. Victorino, X. Jiang, and C. Menon. Wearable Technologies and Force Myography
for Healthcare, pages 135–152. Academic Press, 01 2018.
28. X. Yan and X. G. Su. Stratified Wilson and Newcombe confidence intervals for
multiple binomial proportions. Statistics in Biopharmaceutical Research, 2(3):329–335, 2010.
29. O. K. Zheng and C. Cheng. Interactive lighting performance system with Myo
gesture control armband. In 2018 1st IEEE International Conference on Knowledge
Innovation and Invention (ICKII), pages 381–383, 2018.