Flex. Print. Electron. 8(2023) 025012 https://doi.org/10.1088/2058-8585/acd2e8
Flexible and Printed Electronics
OPEN ACCESS
RECEIVED
26 October 2022
REVISED
20 April 2023
ACCEPTED FOR PUBLICATION
5 May 2023
PUBLISHED
7 June 2023
Original Content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
© 2023 The Author(s). Published by IOP Publishing Ltd
PAPER
Finger gesture recognition with smart skin technology and deep
learning
Liron Ben-Ari1,5, Adi Ben-Ari1,5, Cheni Hermon2 and Yael Hanein1,2,3,4,∗
1School of Electrical Engineering, Tel Aviv University, Tel Aviv, Israel
2X-trodes, Herzelia, Israel
3Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
4Tel Aviv University Center for Nanoscience and Nanotechnology, Tel Aviv University, Tel Aviv, Israel
5L Ben-Ari and A Ben-Ari contributed equally to this work.
∗Author to whom any correspondence should be addressed.
E-mail: yaelha@tauex.tau.ac.il
Keywords: EMG, gesture recognition, BCI, soft electrodes, skin electronics
Abstract
Finger gesture recognition (FGR) has been extensively studied in recent years for a wide range of human-machine interface applications. Surface electromyography (sEMG), in particular, is an attractive, enabling technique in the realm of FGR, and both low- and high-density sEMG have been studied previously. Despite the clear potential, cumbersome electrode wiring and electronic instrumentation force contemporary sEMG-based finger gesture recognition to be performed under unnatural conditions. Recent developments in smart skin technology provide an opportunity to collect sEMG data under more natural conditions. Here we report on a novel approach based on a soft 16-electrode array, a miniature wireless data acquisition unit and neural network analysis to achieve gesture recognition under natural conditions. FGR accuracy values as high as 93.1% were achieved for 8 gestures when the training and test data were from the same session. For the first time, high accuracy values are also reported for training and test data from different sessions, for three different hand positions. These results demonstrate an important step towards sEMG-based gesture recognition in non-laboratory settings, such as gaming or the Metaverse.
1. Introduction
Finger gesture recognition (FGR) is a widely studied domain in human-machine interfaces (HMIs). Applications include virtual games, where finger gestures can be used instead of a joystick to achieve an improved user experience, medical uses, where FGR can help distinguish between normal and abnormal movements [1], or sign-language translation [2, 3], to name just a few examples [4]. Several different approaches have been explored in recent decades for FGR, including video analysis [5, 6], smart gloves [7, 8], smart bands [9] and surface electromyography (sEMG) [10–15]. sEMG in particular is an attractive approach, as it records the electrical activity of arm muscles located away from the fingers, so that finger movements are not restricted. Moreover, it does not necessitate a visual pathway, allowing operation in a dark environment or during movement. sEMG is also sensitive to applied force, even without any apparent movement (isometric muscle activation).
Despite the great potential of sEMG, such measurements pose a number of technical and computational challenges. First, under dynamic activity, motion artifacts are very common. Second, electrode position may vary from session to session or from subject to subject, complicating the analysis. Owing to these challenges, current sEMG-based studies concerned with the detection of hand movements are mostly performed in a controlled environment, with the hand held at a fixed position [11–13]. Also, most studies report only on intra-session classification [12, 13, 15], or on inter-session classification with degraded performance [11, 16]. Furthermore, to ensure good electrical contact between the electrodes and the skin, wet electrodes are commonly used [11, 12]. These electrodes severely limit the usability of the technology, as they limit session duration and electrode number (and therefore separation capacity). Wires, cumbersome amplification and recording instrumentation, and relatively large electrode arrays [11, 13] further limit the technology, restricting it to clinical or laboratory use and mandating skilled personnel for electrode placement and system operation. Previous studies concerning the identification of finger gestures from sEMG signals were performed in a controlled environment, using relatively bulky electrode arrays that require the use of a conductive gel. Such measurements do not allow continuous real-time tracking, and are far from practical use in HMI applications. Flexible electrodes were recently demonstrated to have great potential in classifying hand gestures in a natural environment [17].
In this investigation, we demonstrate FGR under natural conditions for real-life applications by addressing the following requirements: First, the system should be compatible with hours of use. Second, recognition of finger gestures should be achieved regardless of the general position of the hand. Third, system performance must be invariant to precise electrode placement in repeated sessions (removal of the wearable device and re-application on a later day).
To achieve these requirements, we used novel printed electrode arrays. The electrodes are printed on a thin and soft substrate (figure 3(a)) and were studied previously for various applications [18, 19]. The electrode arrays used in this study were designed specifically to capture arm muscle activity. Moreover, the arrays were designed with an internal ground for simple and quick placement and for robustness against mechanical artifacts. The thinness and elasticity of the arrays allow excellent mechanical coupling to the skin. In this study we also used, for the first time, a new miniature wearable sensing system that allows continuous sEMG measurement, even during dynamic movement. Using such a small, convenient and non-invasive system is an important step towards hand gesture recognition in freely behaving humans. Finally, using deep learning, we demonstrate FGR under different hand positions and invariance to precise electrode placement (in repeated sessions).
2. Materials and methods
2.1. Wireless sEMG system
The electrode arrays tested in this study were purchased from X-trodes Inc. The dry carbon electrodes are 4.5 mm in diameter, are organized in a 4 by 4 arrangement, and can be quickly applied to the skin with a built-in adhesive film. Their fabrication is based on a technology described previously in [19]. Briefly, carbon electrodes and silver traces are screen printed on a thin and soft polyurethane (PU) film. A second double-sided adhesive PU film serves as passivation and as the skin-adhesive layer. Data were recorded with a miniature wireless data acquisition unit (DAU, X-trodes Inc.), which was developed to allow electrophysiological measurements under natural conditions. The DAU supports up to 16 unipolar channels (2 µV root-mean-square (RMS) noise, 0.5–700 Hz) with a sampling rate of 4000 S s−1, 16-bit resolution and an input range of ±12.5 mV. A 620 mAh battery supports DAU operation for up to 16 h. A Bluetooth (BT) module is used for continuous data transfer. The DAU is controlled by an Android application, and the data are stored on a built-in SD card and on the cloud for further analysis. The DAU also includes a 3-axis inertial sensor to measure the acceleration of the hand during the measurements.
2.2. Data collection
Eight healthy subjects (aged 18–30) completed two recording sessions with a good signal-to-noise ratio (SNR). Electrode arrays were placed over the region of the extensor digitorum muscle of the dominant hand. The muscle location was identified by applying strong abduction of the fingers. During the recording, each subject sat or stood in front of a table (depending on the hand position being examined). An instructional video displayed on a computer was used to guide the subjects.
The experiment consisted of two steps: First, the hand was supported on a table, followed by a second stage in which the hand was not supported. The protocol was structured as follows: First, subjects were shown a short video showing different hand gestures. The subjects were instructed to perform specific gestures, through both voice and visual instructions presented on a computer screen (for 3 s), and then to stop and rest (another 3 s), after which they were instructed to repeat the gesture. The instructions continued until each gesture had been performed 10 times. Altogether, sEMG was recorded for 10 different finger gestures: stretching two fingers, stretching three fingers, stretching all the fingers (abduction), making a fist, as well as six movements that represent letters in the Hebrew sign language: 'Bet', 'Gimel', 'Het', 'Tet', 'Kaf' and 'Nun'. During the entire process of performing the movements, a representative of the research team monitored that the process was conducted as planned. In addition, a Python script was used to send annotations to the Android application marking the times at which the subject was instructed to start and finish each gesture, to assist the analysis stage.
2.3. Data analysis
The data analysis flow is depicted in figure 1.
2.3.1. Filtering
Raw sEMG data were first filtered using 50 Hz and
100 Hz comb filters to reduce power-line interference.
A 20–400 Hz 4th-order Butterworth bandpass filter
was applied to attenuate non-sEMG components.
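For illustration, this filtering stage can be sketched in Python (the language the authors mention using elsewhere) with SciPy's standard filter-design routines. This is a minimal sketch under stated assumptions: the comb-filter quality factor Q and the use of zero-phase filtering are our choices, not specified in the paper.

```python
import numpy as np
from scipy.signal import iircomb, filtfilt, butter, sosfiltfilt

FS = 4000  # DAU sampling rate (S/s), from section 2.1

def preprocess(raw):
    """Filter raw sEMG of shape (channels, samples) as in section 2.3.1."""
    x = np.asarray(raw, dtype=float)
    # Comb notch filters at 50 Hz and 100 Hz against power-line interference.
    for f0 in (50.0, 100.0):
        b, a = iircomb(f0, Q=30.0, ftype='notch', fs=FS)  # Q is an assumption
        x = filtfilt(b, a, x, axis=-1)
    # 4th-order Butterworth band-pass, 20-400 Hz, to attenuate non-sEMG components.
    sos = butter(4, [20, 400], btype='bandpass', fs=FS, output='sos')
    return sosfiltfilt(sos, x, axis=-1)
```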
Figure 1. Data analysis flow chart: constructing training and test databases.
Figure 2. CNN architecture. Layers and input data sizes appear above the corresponding layers.
2.3.2. Segmentation
Segmentation into time intervals for each gesture (denoted as active time windows) was performed manually: the time window in which each gesture was performed was identified using the annotations made during the recording.
2.3.3. Classification
Each active time window identified in the segmentation stage was divided into 200 ms sub-windows. For each sub-window, the RMS value per channel was derived, resulting in 16 values per sub-window. The obtained 16 values were arranged on a grid according to the spatial locations of the electrodes in the array, resulting in an activation map. The sequence of maps for each active time window was then fed into a classification algorithm. Several algorithmic solutions were explored, as detailed below.
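A minimal sketch of this step, assuming a row-major mapping of the 16 channels onto the 4 by 4 electrode grid and max-normalization (the exact channel ordering and normalization are not specified in the paper):

```python
import numpy as np

FS = 4000
SUB = int(0.2 * FS)  # 200 ms sub-window = 800 samples at 4000 S/s

def activation_maps(active_window):
    """Turn one active time window (16 x T array) into a sequence of 4x4
    RMS activation maps, one map per 200 ms sub-window."""
    n_sub = active_window.shape[1] // SUB
    maps = []
    for i in range(n_sub):
        sub = active_window[:, i * SUB:(i + 1) * SUB]
        rms = np.sqrt(np.mean(sub ** 2, axis=1))  # one RMS value per channel
        rms /= rms.max()                          # normalized map (see section 3.1)
        maps.append(rms.reshape(4, 4))            # arrange on the electrode grid
    return np.stack(maps)                         # shape (n_sub, 4, 4)
```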
2.3.4. Convolutional neural network (CNN)
In this approach, each map is fed separately into a CNN, which outputs a classification. A final classification is obtained by majority voting among the classifications obtained for the different sub-windows of the same active window. The CNN architecture used in this work is depicted in figure 2. This neural network (NN) consists of two convolutional layers followed by three fully connected layers. This architecture was favored because it has a relatively small number of parameters, making it suitable for our relatively small data set and for the small size of the activation maps. Each layer, apart from the last fully connected one, was followed by a ReLU activation [20] and batch normalization [21]. The network was trained for 500–2000 epochs (the number required for the CNN training loss to converge in the different tasks), with a learning rate between 0.0005 and 0.001 (with the Adam optimizer [22]), a weight decay of 0.0001 and dropout [23] of up to 0.3.
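A hedged PyTorch sketch of this pipeline, including the majority-voting step; the kernel sizes, channel widths and hidden-layer sizes are our assumptions, since figure 2 is not reproduced here:

```python
import torch
import torch.nn as nn

class GestureCNN(nn.Module):
    """Two convolutional layers followed by three fully connected layers;
    every layer except the last is followed by ReLU and batch normalization."""
    def __init__(self, n_classes=10, drop=0.3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.BatchNorm2d(16),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.BatchNorm2d(32),
            nn.Flatten(),
            nn.Linear(32 * 4 * 4, 64), nn.ReLU(), nn.BatchNorm1d(64), nn.Dropout(drop),
            nn.Linear(64, 32), nn.ReLU(), nn.BatchNorm1d(32), nn.Dropout(drop),
            nn.Linear(32, n_classes),
        )

    def forward(self, x):  # x: (batch, 1, 4, 4) activation maps
        return self.net(x)

def classify_active_window(model, maps):
    """Majority vote over the per-sub-window predictions of one active window."""
    model.eval()
    with torch.no_grad():
        preds = model(maps.unsqueeze(1)).argmax(dim=1)  # maps: (n_sub, 4, 4)
    return preds.mode().values.item()                   # most frequent class

model = GestureCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
```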
2.3.5. Recurrent neural network (RNN)
In this approach, the sequence of maps for a certain action is treated as a time series of dimension 16, which is fed into an LSTM-based RNN. The RNN outputs a classification. The architecture of the RNN consisted of one LSTM layer of 12 units, followed by one fully connected layer. The small number of layers was again favored to match the relatively small data set. The network was trained for 1000–2000 epochs (the number required for the RNN training loss to converge in the different tasks), using a learning rate between 0.005 and 0.01 (with the Adam optimizer [22]), a weight decay of 0.0001 and dropout [23] of up to 0.1.
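A corresponding PyTorch sketch; feeding the flattened maps as a 16-dimensional sequence and classifying from the last hidden state is our reading of the description above:

```python
import torch
import torch.nn as nn

class GestureRNN(nn.Module):
    """One LSTM layer of 12 units followed by one fully connected layer."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.lstm = nn.LSTM(input_size=16, hidden_size=12, batch_first=True)
        self.fc = nn.Linear(12, n_classes)

    def forward(self, x):           # x: (batch, n_sub_windows, 16)
        _, (h_n, _) = self.lstm(x)  # h_n: (1, batch, 12)
        return self.fc(h_n[-1])     # classify from the last hidden state

model = GestureRNN()
optimizer = torch.optim.Adam(model.parameters(), lr=5e-3, weight_decay=1e-4)
```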
We note that the different hyper-parameter values relate to different tasks. For each task (see table 1), the hyper-parameters were fixed for all subjects.
2.3.6. Classical algorithms
In this approach, the classification pipeline follows the same steps as in CNN classification, except that the CNN is replaced by either a k-nearest neighbor (KNN) classifier or a multi-class support vector machine (SVM) classifier. For KNN, we set the number of nearest neighbors used for voting to k = 1. For SVM, we used a soft-margin SVM with a radial basis function (RBF) kernel and C = 100 (where C is the weight given to the slack variables in the SVM loss).
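A minimal scikit-learn sketch with these settings; the placeholder arrays below stand in for the flattened RMS maps and gesture labels:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Placeholder data: one flattened 4x4 RMS map (16 values) per sample.
X_train = np.random.rand(800, 16)
y_train = np.random.randint(0, 10, size=800)

knn = KNeighborsClassifier(n_neighbors=1)  # k = 1, as described above
svm = SVC(kernel='rbf', C=100.0)           # soft-margin SVM with an RBF kernel
knn.fit(X_train, y_train)
svm.fit(X_train, y_train)
```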
We used classical algorithms to examine which cases indeed require neural networks, and in which cases the simpler, classical algorithms can still achieve satisfactory results. Specifically, KNN was chosen as a reference model for being one of the simplest classification algorithms, and SVM for being an advanced and popular classical classification algorithm.
2.3.7. Enhancing training quality using hidden
Markov model (HMM)
To improve classification accuracy, the training data set was augmented with artificial data, generated as follows: For each of the 10 gestures, a Gaussian HMM with c components was defined. Each HMM was trained for I iterations, using the sequences of activation maps belonging to the active windows of the training data set. From each trained HMM, new sequences of activation maps were generated and added to the training data set. We empirically set the parameters to c = 4 and I = 10.
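A sketch of this augmentation, assuming the hmmlearn package; the number and length of the generated sequences are our assumptions:

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

def augment_gesture(train_seqs, n_new=10, c=4, iters=10, seed=0):
    """Fit a Gaussian HMM with c components to the activation-map sequences of
    one gesture (each sequence is a (T_i, 16) array of flattened maps) and
    sample artificial sequences to add to the training set."""
    X = np.concatenate(train_seqs)       # stack the sequences for fitting
    lengths = [len(s) for s in train_seqs]
    hmm = GaussianHMM(n_components=c, n_iter=iters, random_state=seed)
    hmm.fit(X, lengths)
    T = int(np.mean(lengths))                        # assumed sample length
    return [hmm.sample(T)[0] for _ in range(n_new)]  # list of (T, 16) arrays
```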
2.3.8. A note on the fixed-size 200 ms time windows
While gesture-dependent window sizes may allow better utilization of signal features, fixed-size windows provide the classification model with data of less inherent noise variability, and enable real-time operation. For random noise, the length of the time window determines the uniformity of the mean noise amplitude across multiple repetitions (the longer the window, the more uniform the mean noise amplitude across repetitions). Thus, a fixed time window provides classification inputs with similar noise amplitudes, facilitating the classification process. In addition, a gesture-dependent window size requires the processing to start only after the gesture is fully performed, enlarging the delay between gesture and recognition.
The length of the time window is a compromise between several considerations. First, the EMG frequency range is 20–400 Hz, requiring a window size larger than 50 ms in order to capture more than one cycle of the lowest-frequency component. However, considering the random nature of the signals, one cycle would not be sufficient. This implies a trade-off: a larger window decreases randomness, but provides fewer maps for each gesture. The latter makes the majority-voting procedure, applied at the end of the classification process for some algorithms (as described above), less efficient. After testing several window sizes on an initial dataset, we found 200 ms to be a good compromise.
3. Results
To demonstrate reliable FGR with the soft electrodes, we used sEMG data collected from the arms of healthy volunteers to train and test several different classification models. sEMG data were first collected and prepared following these steps: (1) sEMG recording during hand gesturing; (2) sEMG data segmentation; (3) constructing RMS maps for each segment; and finally (4) classification of each segment with a trained NN. Some of the collected data were used for the training (see figure 3 for a schematic presentation of the data flow).
Throughout the text, we use the following definitions: For each subject, there are two sessions, each recorded on a different date with a new electrode array placed at approximately the same location on the arm. For each of these sessions, there are three hand positions (see figure 3), and for each position there are 100 events, corresponding to ten different gestures repeated ten times each. In total, we collected 300 events for each session.
3.1. sEMG signals
Soft 16-electrode arrays were placed over the region of the extensor digitorum muscle (figure 3(a)). Healthy volunteers performed finger gestures (see section 2.2) while the sEMG activity of the muscle was recorded. The soft nature of the electrodes, along with the small dimensions of the wireless DAU, allowed subjects to perform natural gestures while recording almost artifact-free sEMG data. sEMG data were collected during ten different hand gestures (figure 3(b)), in three arm and body positions (figure 3(a)): (1) the arm placed on the table (position I); (2) the elbow placed on the table with the arm raised at 90 degrees (position II); and (3) the subject standing with the arm next to the body (position III). Typical filtered sEMG signals of five gestures at three different electrodes are presented in figure 3(c), demonstrating some degree of variability between different gestures. For each such segment, RMS activation maps were generated. These 4 by 4 matrices provide a normalized representation of sEMG activity in the electrode space and are used as input to the NN.
Figure 3. Data collection and analysis scheme. (a) The three hand positions examined in the experiment. (b) The ten hand gestures examined in the experiment. (c) sEMG data from channels 1–16. These data are segmented and RMS maps are derived.
We selected ten gestures based on their physiological link with the location of the electrode array. This link is already apparent in the filtered sEMG data (figure 3(c)), but it is particularly conspicuous when examining the RMS maps (for clarity, the RMS intensity for each electrode was calculated over the entire duration of each action) (figure 4). Consecutive activation maps of the same gesture appear consistent within the same hand position, while varying between gestures. Importantly, the same gesture appeared to have different maps when the arm position was changed. This result reflects the complexity of the sEMG data. From close examination of other maps, it is evident that small differences in electrode placement (arising either from repeated use by the same subject or from use by different subjects) result in different activation maps. It is therefore important that the classification be robust to these differences, so that network training does not have to be fully repeated for each electrode placement, especially when the system is used by the same individual.
3.2. Constructing train and test databases
Each session, at each hand position, contributed 100 events: ten repetitions of each of the ten hand gestures. From each event, we generated normalized RMS maps similar to those described in figure 4. For each subject, the maps obtained from eight of the ten events of each gesture were assigned to the training database, and the remaining maps were added to the test database.
Figure 4. RMS maps. An example of the RMS maps of one subject. Each row represents one hand gesture, and the columns represent different repetitions of the same gesture. The first five columns were recorded with the subject's arm placed on the table, and the next five columns were recorded with the elbow placed on the table and the arm bent to 90 degrees.
3.3. Classification models
Based on the results discussed above, we set out to realize an NN-based classification model that is not only accurate but also requires minimal tuning for newly acquired data (different recording sessions). We implemented both convolutional and recurrent NNs. For comparison, we also implemented two classical classification algorithms: KNN and SVM (for a detailed explanation of the algorithms and the classification pipeline, see section 2.3). In addition to the originally acquired training data, the training data set of the networks included artificial data, generated from the training data using hidden Markov models (see section 2.3, Enhancing training quality using HMM, for further elaboration).
3.4. Evaluating the classification models
We examined the performance of the models in classifying the sEMG data, focusing on the ability to overcome variability between hand positions and sessions. These capabilities are essential for FGR in natural conditions. To test them, we designed three classification tasks (see table 1). Each of these tasks was performed separately for each subject. In task 1, we trained the model with the events from the training database belonging to hand position II, and tested it on the events from the test database belonging to the same hand position. In task 2, we trained the model with the events from the training database belonging to hand positions I, II and III, and tested it on the events from the test database belonging to the same hand positions. In task 3, we first trained the model with all the events from the training database belonging to session 1, from all subjects (1440 events). After this training stage was completed, we used 20 events from hand position II of session 2 (i.e. only two repetitions of each gesture) to fine-tune the model. Then, we tested it with 60 events from session 2, hand position II.
Table 1. Classification tasks used to evaluate the proposed algorithms.
Task | Training data | Test data
1 | 160 events from hand position II (all sessions) | 40 events from hand position II (all sessions)
2 | 1440 events from hand positions I, II and III (all sessions) | 160 events from hand positions I, II and III (all sessions)
3 | 240 events from all hand positions from session 1, plus two repetitions of each gesture from session 2, hand position II (20 events) | 60 events from session 2, hand position II
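A hedged PyTorch sketch of the task 3 calibration step, reusing the GestureCNN sketch from section 2.3.4; the checkpoint path, calibration tensors, learning rate and epoch count are hypothetical placeholders:

```python
import torch

model = GestureCNN()                                  # sketch from section 2.3.4
model.load_state_dict(torch.load('cnn_session1.pt'))  # hypothetical session-1 weights
calib_maps = torch.rand(300, 1, 4, 4)                 # sub-window maps of the 20 events
calib_labels = torch.randint(0, 10, (300,))           # corresponding gesture labels

opt = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)
loss_fn = torch.nn.CrossEntropyLoss()
model.train()
for epoch in range(100):                              # brief fine-tuning pass
    opt.zero_grad()
    loss = loss_fn(model(calib_maps), calib_labels)
    loss.backward()
    opt.step()
```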
Overall, we evaluated four classification models (KNN, SVM, CNN and RNN). Each evaluation was repeated five times, where in each repetition the data were randomly divided into training, validation and test sets, and the model was trained and tested. The obtained accuracy values (averaged over N = 8 subjects and five repetitions per evaluation) are presented in table 2. For tasks 1 and 2, the best results were obtained with SVM, while for task 3, the best results were obtained with CNN. The best average accuracy values obtained for 10-gesture classification are 90.4% for task 1, 87.8% for task 2, and 78.2% for task 3. Reducing the number of gestures to 8 (by removing the two gestures classified with the lowest accuracy, 'Tet' and 'Nun') increases the best accuracy obtained for task 1 to 93.1%. In addition, confusion matrices for the three tasks using the CNN are shown in figure 5. They were obtained by accumulating the test-data confusion matrices over N = 8 subjects and five evaluation repetitions, as explained above.
Table 2. Accuracy (%) of different algorithms. N = 8.
Method | Task 1 | Task 2 | Task 3
KNN | 84.1 ± 7.9 | 81.1 ± 9.6 | 73.6 ± 11.9
SVM | 90.4 ± 5.7 | 87.8 ± 7.7 | 75.27 ± 11.9
CNN | 87.4 ± 7.3 | 79.7 ± 10.1 | 78.2 ± 12.4
RNN | 64.5 ± 11.7 | 60.0 ± 8.9 | 40.3 ± 11.6
Figure 5. Confusion matrices for CNN. Confusion matrices for tasks 1, 2 and 3 for the CNN classifier, accumulated over N = 8 subjects and 5 evaluation repetitions.
For the sake of completeness, we conducted another task dealing with inter-subject classification. The models we examined did not manage to classify gestures in that scenario, apparently due to the large inter-subject variability. Further elaboration on that task can be found in the appendix.
4. Discussion and conclusions
In this investigation, we demonstrated automated classification of sEMG data recorded using a novel, user-friendly wireless system. We presented an NN-based algorithm which can classify finger gestures under natural scenarios. Specifically, we demonstrated the ability to perform gesture classification which is insensitive to the position of the hand, with an accuracy of 87.8%, and to classify hand gestures from a new recording session with an accuracy of 78.2%, using only a short calibration step. For tasks 1 and 2 of the reported investigation, the SVM-based model outperforms the CNN- and RNN-based models, as well as the classical KNN model. However, for the more complex task, task 3, the CNN-based model outperforms the RNN and the classical KNN and SVM models. This might suggest that while simpler classification tasks can be performed by classical algorithms, a deep learning approach outperforms classical approaches when dealing with more complicated classification tasks.
An important element contributing to the high performance of the system described here is the soft electrode arrays. We have previously found that the sEMG SNR of these electrode arrays meets the criteria for recording high-quality sEMG signals. Implementing an internal ground and using a new miniature wireless system further contributed to our ability to perform sEMG recordings with almost no mechanical artifacts, even under natural conditions and in different hand positions.
The classification accuracy of the system described here is close to the state of the art, while using significantly fewer electrodes. Several recent studies reported on sEMG-based FGR (table 3). Atzori et al [12] established the NinaPro database for sEMG-based hand movement classification, and used linear discriminant analysis, KNNs, an SVM and a multi-layer perceptron (MLP) to classify gestures.
Table 3. Comparison between various classification models applied to position I.
Reference | Electrodes | Gestures | Method | Accuracy (%)
Atzori et al [12] | 10 OttoBock | 52 | SVM | 76.0
Amma et al [11] | 192 dry | 27 | Naive Bayes | 90.4
Geng et al [13] | 10 OttoBock | 52 | CNN | 77.8
Geng et al [13] | 192 dry | 27 | CNN | 96.8
Geng et al [13] | 128 dry | 8 | CNN | 99.5
Wei et al [15] | 10 OttoBock | 52 | MS CNN | 85.0
Wei et al [15] | 192 dry | 27 | MS CNN | 95.4
Wei et al [15] | 128 dry | 8 | MS CNN | 99.8
Padhy [16] | 192 dry | 27 | MLSVD | 98.0
Padhy [16] | 128 dry | 8 | MLSVD | 96.8
Moin et al [17] | 64 dry | 13 | HD computing | 97.1
Moin et al [17] | 16 dry | 13 | HD computing | ∼84.0
This study | 16 dry | 10 | SVM | 90.4
This study | 16 dry | 8 | SVM | 93.1
Using ten carefully placed electrodes, they were able to distinguish between 52 hand gestures with an accuracy of 76%. In a later study, a CNN was used for classification of the same database, achieving an accuracy of 66.59 ± 6.40% [14]. Other studies focused on high-density EMG (HD-EMG) recordings. Rojas-Martínez et al [10] used activation maps obtained from HD-EMG recordings of the forearm muscles to classify 12 hand gestures, achieving an accuracy of 90%. Amma et al [11] characterized another database for sEMG-based FGR (CSL-HDEMG), using a 192-electrode array and a naive Bayes classifier to discriminate 27 gestures, achieving an accuracy of up to 90%. Geng et al [13] introduced a new database (CapgMyo), consisting of 8 gestures recorded using a 128-electrode array. Using a CNN, they were able to reach an accuracy of up to 99.5%. They also achieved recognition accuracies of 96.8% and 77.8% on the CSL-HDEMG and NinaPro databases, respectively. Later studies achieved improved results using these databases. Wei et al [15] used a multi-stream CNN, reaching accuracies of 99.8%, 95.4% and 85% on the CapgMyo, CSL-HDEMG and NinaPro databases, respectively, while Padhy [16] proposed a multilinear singular value decomposition approach, resulting in accuracies of 98.0% and 98.6% for the CapgMyo and CSL-HDEMG databases, respectively. The first classification task introduced in this work is similar to classification tasks described in previous studies. A comparison is provided in table 3. From this comparison, it is apparent that our system achieves an accuracy similar to previous methods, while utilizing a much smaller and more convenient electrode array, without the necessity for careful electrode placement. Classification tasks equivalent to the three tasks presented in this work were also examined by Moin et al [17]. Using a flexible 64-electrode array and an adaptive machine learning approach, they presented classification of 13 gestures from the same hand position with an accuracy of 97.12%. They also showed the ability to classify gestures in new contexts, including a different hand position, a new-day recording, and a recording session performed after the device had been worn for two hours during daily activities. After the classification model was updated with some of the new-context data, the accuracy degraded on average by only 2.39%. However, the larger number of electrodes provides much more information for analysis. When applying their suggested method to data obtained from only 16 electrodes, Moin et al [17] showed that the accuracy for same-hand-position classification drops to around 84%, and the accuracy for different-hand-position classification drops below 84%.
In the investigation reported here, data were processed off-line. For many FGR applications, on-line analysis is desired, and can be achieved if data transfer and analysis times are fast enough. Although sEMG requires electrode placement in close proximity to the muscle, the use of 16-electrode arrays and CNN analysis negates the need for very precise placement, allowing accurate classification despite the variability across multiple sessions.
In this investigation we used 16-electrode arrays. These 4 by 4 arrays clearly provide more information than low-resolution sEMG, contributing to effective discrimination between gestures. A higher electrode density may contribute to improved discrimination, in particular if more gestures need to be distinguished. It is important to note, however, that increasing the electrode count may increase data analysis time and DAU dimensions, interfering with the ultimate goal of real-time analysis under natural conditions.
sEMG signals contain information on applied force. This information can be important in many applications. In the current study, we did not utilize this feature: RMS maps were normalized and, as such, amplitude information was discarded.
Moreover, spectral analysis was not implemented to
gain additional information about the applied force.
These topics remain for future investigations.
As demonstrated here, sEMG has many important benefits compared with video analysis and smart gloves. On the downside, sEMG gesture separation is limited to the specific degrees of freedom associated with the targeted muscle. sEMG may benefit from combination with other technologies, such as video or smart gloves, to improve network training and resolution. For example, in this study we did not exploit the three-dimensional acceleration data recorded by the 3-axis inertial sensor built into the wireless DAU.
To conclude, the results presented here demonstrate an important step towards achieving FGR in natural conditions using sEMG signals. The advantage over other studies lies in the combination of a minimally interfering wearable system (owing to the small number of dry electrodes and the wireless nature of the system), the real-life scenarios which were examined, and results which are close to the state of the art despite the challenges mentioned above. These advantages make the system a possible candidate for gaming or Metaverse applications.
Data availability statement
The data that support the findings of this study are openly available at the following URL/DOI: https://datadryad.org/stash/share/khYfwOcRgshRlaRgDivyUxlkznzcledB-fI-Ywwz8Fs.
Acknowledgments
The authors thank Liron Amihay, David Buzaglo,
Kerith Aginsky and Param Gandhi for advice and
assistance concerning sEMG data collection. Y H
thanks Anat Mirelman for many fruitful discussions.
Conflicts of interest
Adi Ben-Ari and Liron Ben-Ari report no financial or non-financial conflicts of interest. Cheni Hermon was an employee of X-trodes Ltd. Yael Hanein declares a financial interest in X-trodes Ltd, which commercialized the screen-printed electrode technology used in this paper. She has no other relevant financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.
Appendix. Inter-subject classification
In order to test the ability of the suggested models to perform inter-subject classification, we split the original training set into new training and test sets: the inter-subject training set, composed of all gestures from 6 randomly chosen subjects, and the inter-subject test set, composed of all gestures from the remaining 2 subjects. These training and test sets constitute the inter-subject task. We evaluated each model on this task in the same way as in the first three tasks (see section 3.4), and the obtained accuracy values (averaged over 5 repetitions per evaluation) are presented in table 4.
Table 4. Accuracy (%) of inter-subject classification.
Method | Inter-subject task
KNN | 7.00 ± 0.61
SVM | 9.1 ± 1.2
CNN | 16.1 ± 4.5
RNN | 12.1 ± 2.3
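A sketch of such a subject-wise split, assuming scikit-learn's GroupShuffleSplit; the arrays below are placeholders for the real features and labels:

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

X = np.random.rand(2400, 16)                # flattened RMS maps, all subjects
y = np.random.randint(0, 10, size=2400)     # gesture labels
subject_ids = np.repeat(np.arange(8), 300)  # 300 events per subject

# Hold out all events of 2 of the 8 subjects for testing.
splitter = GroupShuffleSplit(n_splits=1, test_size=2, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=subject_ids))
```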
As evident from these results (which are around chance level), the models examined here are not capable of performing inter-subject classification. The reason for this shortcoming is the physiological differences between subjects, which result in totally different RMS maps. These differences cannot be handled by the models.
ORCID iDs
Liron Ben-Ari https://orcid.org/0000-0002-1121-881X
Adi Ben-Ari https://orcid.org/0000-0001-9585-0267
Yael Hanein https://orcid.org/0000-0002-4213-9575
References
[1] Nizamis K, Rijken N H M, van Middelaar R, Neto J,
Koopman B F J M and Sartori M 2020 Characterization of
forearm muscle activation in Duchenne muscular dystrophy
via high-density electromyography: a case study on the
implications for myoelectric control Front. Neurol. 11 1–14
[2] Wu J, Sun L and Jafari R 2016 A wearable system for recognizing American sign language in real-time using IMU and surface EMG sensors IEEE J. Biomed. Health Inform. 20 1281–90
[3] Wen F, Zhang Z, He T and Lee C 2021 AI enabled sign language recognition and VR space bidirectional communication using triboelectric smart glove Nat. Commun. 12 5378
[4] Konar A and Saha S 2018 Introduction Gesture Recognition:
Principles, Techniques and Applications (Cham: Springer)
pp 1–33
[5] Chen Y, Zhao L, Peng Xi, Yuan J and Metaxas D N 2019
Construct dynamic graphs for hand gesture recognition via
spatial-temporal attention BMVC
[6] Min Y, Zhang Y, Chai X and Chen X 2020 An efficient
pointlstm for point clouds based gesture recognition Proc.
IEEE/CVF Conf. on Computer Vision and Pattern Recognition
(CVPR)
[7] Santos L, Carbonaro N, Tognetti A, Gonzalez J L, De la Fuente E, Fraile J C and Pérez-Turiel J 2018 Dynamic gesture recognition using a smart glove in hand-assisted laparoscopic surgery Technologies 6 8
[8] Primya T, Kanagaraj G, Muthulakshmi K, Chitra J and
Gowthami A 2021 Gesture recognition smart glove for
speech impaired people Mater. Today
[9] Ramalingame R, Barioul R, Li X, Sanseverino G, Krumm D,
Odenwald S and Kanoun O 2021 Wearable smart band for
american sign language recognition with polymer carbon
nanocomposite-based pressure sensors IEEE Sens.
Lett. 5 1–4
[10] Rojas-Martínez M, Mañanas M A and Alonso J F 2012 High-density surface EMG maps from upper-arm and forearm muscles J. Neuroeng. Rehabil. 9 85
[11] Amma C, Krings T, Böer J and Schultz T 2015 Advancing
muscle-computer interfaces with high-density
electromyography Proc. 33rd Annual ACM Conf. on Human
Factors in Computing Systems (CHI 2015) (ACM) pp 929–38
[12] Atzori M, Gijsberts A, Kuzborskij I, Elsig S, Hager A-G,
Deriaz O, Castellini C, Muller H and Caputo B 2015
Characterization of a benchmark database for myoelectric
movement classification IEEE Trans. Neural Syst. Rehabil.
Eng. 23 73–83
[13] Geng W, Du Y, Jin W, Wei W, Hu Y and Li J 2016 Gesture
recognition by instantaneous surface EMG images Sci. Rep.
6 36571
[14] Atzori M, Cognolato M and Müller H 2016 Deep learning
with convolutional neural networks applied to
electromyography data: a resource for the classification
of movements for prosthetic hands Front. Neurorobot.
10 9
[15] Wei W, Wong Y, Du Y, Hu Y, Kankanhalli M and Geng W
2019 A multi-stream convolutional neural network for
semg-based gesture recognition in muscle-computer
interface Pattern Recognit. Lett. 119 131–8
[16] Padhy S 2021 A tensor-based approach using multilinear
SVD for hand gesture recognition from semg signals IEEE
Sens. J. 21 6634–42
[17] Moin A et al 2021 A wearable biosensing system with
in-sensor adaptive machine learning for hand gesture
recognition Nat. Electron. 4 54–63
[18] Inzelberg L, Rand D, Steinberg S, David-Pur M and
Hanein Y 2018 A wearable high-resolution facial
electromyography for long term recordings in freely
behaving humans Sci. Rep. 8 2058
[19] Inzelberg L, David-Pur M, Gur E and Hanein Y 2020
Multi-channel electromyography-based mapping of
spontaneous smiles J. Neural Eng. 17 026025
[20] Krizhevsky A, Sutskever I and Hinton G E 2017 Imagenet
classification with deep convolutional neural networks
Commun. ACM 60 84–90
[21] Ioffe S and Szegedy C 2015 Batch normalization: accelerating
deep network training by reducing internal covariate shift
(arXiv:1502.03167)
[22] Kingma D P and Ba J 2014 Adam: a method for stochastic
optimization (arXiv:1412.6980)
[23] Srivastava N, Hinton G, Krizhevsky A, Sutskever I and
Salakhutdinov R 2014 Dropout: a simple way to prevent
neural networks from overfitting J. Mach. Learn. Res.
15 1929–58