DETECTION AND CLASSIFICATION OF ROBOTIC MANIPULATOR ANOMALIES USING
MLSTM-FCN MODELS
Yuliang Ma
University of Stuttgart
Germany
Philipp Grimmeisen
University of Stuttgart
Germany
Andrey Morozov
University of Stuttgart
Germany
ABSTRACT
Errors and failures occur inevitably in modern Cyber-
Physical Systems (CPS) due to their structural variability and
internal heterogeneity. This can cause economic losses or even
hazardous accidents. Currently, deep learning-based anomaly
detection methods, e.g., Transformer or LSTM-based detectors,
have shown tremendous results in terms of anomaly detection and
prevention. However, focusing solely on improving detection
performance without classification and interpretation of the
detected anomalies is not enough for many industrial scenarios.
Instead of only reporting an anomaly, the detection results should
be understandable and transparent for the users. The
interpretability can provide some guidance and help to identify
suitable countermeasures for different types of anomalies.
In this paper, we introduce a Multivariate Long Short Term
Memory Fully Convolutional Network (MLSTM-FCN) for
anomaly classification based on multivariate time-series data
generated from industrial robotic manipulators. Specifically, we
investigate several scenarios: no collision, collision with another
manipulator, and manually injected sensor faults. We collect
time-series data from the simulations of robotics arms using
CoppeliaSim software. We feed these data into the MLSTM-FCN
model and train it to be a multivariate time-series classifier. The
paper presents the simulative case study results that show that the
MLSTM-FCN model can efficiently classify different types of
anomalies.
Keywords: Anomaly Classification, MLSTM-FCN, Deep
Learning, Multivariate Time-Series Data, Cyber-Physical
Systems, Manipulators.
1. INTRODUCTION
In modern industry, Cyber-Physical Systems are playing an
increasingly important role in many scenarios such as flexible
production, smart factory, and industrial control systems.
Moreover, CPS are becoming increasingly complex: a wide variety of sensors, actuators, network hardware, embedded computers, and an enormous amount of software all contribute to this complexity. As a result, CPS are very
susceptible to abnormal situations due to the high degree of
structural variability and internal heterogeneity. This means
errors and failures will occur inevitably and these anomalies can
easily cause economic loss or even hazardous accidents.
Anomaly detection for CPS is attracting more attention and
during the last few years, Deep Learning-based Anomaly
Detection (DLAD) methods have shown high performance and
achieved good results in numerous industrial domains [1].
Normally, industrial time-series data is the main research
object of CPS anomaly detection. DLAD exploits different
approaches including classification, prediction, and
reconstruction. For example, in the prediction-based methods,
neural networks are trained as time-series data predictors using
error-free data and the residuals between the predicted data and
the real data are computed. If the residual is higher than a
threshold, then an anomaly is reported.
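As an illustration, the residual-thresholding step of such prediction-based detectors can be sketched as follows. This is a minimal example with hypothetical signal names, not the detectors of [3,4]:

```python
import numpy as np

def detect_anomalies(measured, predicted, threshold):
    """Report the time steps where the prediction residual exceeds
    a threshold, as in prediction-based DLAD schemes."""
    residuals = np.abs(measured - predicted)
    return np.where(residuals > threshold)[0]

# Toy example: the trained predictor tracks the signal everywhere
# except at one injected spike.
t = np.linspace(0, 4 * np.pi, 200)
measured = np.sin(t)
measured[120] += 1.5                 # injected fault
predicted = np.sin(t)                # error-free predictor output
print(detect_anomalies(measured, predicted, threshold=0.5))  # -> [120]
```

In practice the predictor is a trained network and the threshold is tuned on error-free validation data.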
However, although current DLAD methods can offer reliable
anomaly detection performance, focusing on this alone is still not
sufficient for many real industrial applications. Let us consider an
abstract example of two robotic manipulators that work together.
Anomalies that we detect in the time-series data can be caused by
different events such as sensor failures, network delays, or hacker
attacks, to name a few, as shown in Figure 1. Besides these
“internal” failures, “external” failures such as collisions or
intended contacts can also take place. It is reasonable to
distinguish internal from external failures in CPS because users
need to know what exactly happened in the system and to take the
right actions in different scenarios. Thus, instead of only
reporting an anomaly, the detection results should be more
understandable and transparent for users, and they should provide
guidance for the reaction when different types of anomalies are
encountered. In
this paper, we convert the anomaly classification and
interpretation problem into a multivariate time-series data
classification (MTC) problem using a Multivariate Long Short
Term Memory Fully Convolutional Network (MLSTM-FCN). In
addition, we conduct a large number of classification experiments
and investigate the effect of different input features on the
accuracy of the classification results based on the robot arm
kinetic equations. The experimental results confirm that
leveraging the MTC-based method can distinguish internal and
external anomalies. Furthermore, selecting input features based
on existing physical knowledge for the network also contributes
to the accuracy of classification results.
FIGURE 1: EXEMPLARY ANOMALY CLASSIFICATION
Contributions: There are two main contributions of this paper.
First, we designed and applied a new MLSTM-FCN-based model
for anomaly classification. This model is able to classify
multivariate industrial time-series data and provide the user with
the interpretation of the different types of anomalies, such as
sensor failures or collisions. Second, our experiment results show
that selecting input features for the neural network based on
existing physics knowledge can improve the accuracy of the
anomaly classification.
2. STATE OF THE ART
The periodical behaviors of systems and processes in
domains such as engineering, economy, or social sciences can be
reflected in time-series data [2]. In our previous papers, we have
shown that Long Short-Term Memory (LSTM) networks and
Transformer-based networks can be trained as predictors for
industrial time-series data [3,4]. In many research fields, the
Time-Series Classification (TSC) problem is in focus. TSC is a
challenging topic since normally it is necessary to consider not
only the temporal relationship in one single input channel but also
the spatial relationship among different input channels. In our
case study, for example, we choose robotic arms as our subjects
of study and model collision accidents between two arms using
the CoppeliaSim simulation software. On the one hand, it is
reasonable to believe that during collision accidents, the torque
sensor time-series data in each joint obtained from the simulation
will show outliers, which we consider external anomalies. On the
other hand, however, another set of anomalies caused by other
reasons, e.g. sensor failure or a cyber-attack, can potentially cause
the same form of torque sensor time-series data, which we
consider internal anomalies. It is meaningful to classify these two
groups of anomalies because the users can obtain much more
transparent and understandable detection results instead of
knowing whether there is an anomaly or not.
Collision Detection is an important sub-topic of physical
Human-Robot Interaction (pHRI) and Automatic control.
Currently, two main approaches are widely used in order to detect
collision accidents: (i) the exteroceptive sensors-based method
and (ii) the proprioceptive sensors-based method. Proprioceptive
sensors measure the state of the robot itself while exteroceptive
sensors measure the state of the environment. The first method is
based on the development of artificial skin and tactile sensors
which are used to cover all the links of robotic arms and serve as
a collision detector [5 - 7]. Although the exteroceptive sensors-
based method can provide reliable detection results and locate the
collision point precisely, it is still not feasible to cover the whole
body of manipulators due to the difficulty of developing
applicable tactile sensors. On the other hand, the proprioceptive
sensors-based method does not require any external sensor.
Generally, it considers the dynamic models of manipulators and
achieves collision detection through the observation of several
specific physical values, e.g., external torque [8], energy change
[9], or momentum observer [10]. Then, a threshold is set to
determine whether a collision has happened. While the dynamic
model-based method avoids expensive external sensors, the
deficiency and uncertainty of the dynamic model will inevitably
influence the collision detection performance. Furthermore, with
the development of artificial intelligence, many methods based
on deep neural networks are also adopted to implement collision
detection [11,12]. In our work we use a proprioceptive sensors-
based method.
Multivariate Time-series data Classification (MTC) is
applied in various domains including phoneme classification
[13], healthcare [14] and human activity recognition [15]. Many
algorithms, e.g., the distance-based method [16], the feature-
based method [17], dimensional reduction techniques [18], are
developed and adopted efficiently in various MTC problems.
Fisher kernel learning (FKL) [19] and the Naive Logistic model
(NL) are two representative traditional models that have shown
high performance on many time series classification problems.
Compared with traditional approaches, Deep Learning-based
models are also attracting interest. Multi-Channel Deep
Convolutional Neural Network (MC-DCNN) [20] was designed
for the MTC problem. MC-DCNN can take multi-channel input
and capture the latent features inside the time-series data. Then, a
Multi-Layer Perceptron (MLP) would serve as a classifier and
give the prediction. Furthermore, the Multivariate Long Short Term
Memory Fully Convolutional Network (MLSTM-FCN) [21] has also
shown good performance on 35 public datasets compared
with other models. According to current research, end-to-end
deep learning can achieve the current state-of-the-art
performance for MTC problems with architectures such as Deep
Residual Networks and Fully Convolutional Neural Networks
[22].
3. REFERENCE SIMULATION MODEL
In this paper, we use a CoppeliaSim model of the Franka Emika
Panda manipulator (referred to as Panda hereinafter). The Panda is
an increasingly popular manipulator in industry and in the robotics
research community due to its high usability and relatively low
price compared with other products of the same class [23].
CoppeliaSim is a robot simulator used in industry, education, and
research [24], and its ease of use and reliable performance make
it increasingly attractive.
We created three scenarios for our experiments including
Normal Movement (Figure 2 a), Collision 1 (Figure 2 b), and
Collision 2 (Figure 2 c). The Panda manipulator follows the same
trajectory in each scenario, but in Collision 1 and Collision 2 it
collides with a KUKA manipulator at different positions. The
movement in this case study is straightforward. The Panda
manipulator is placed vertically. We keep the first two links
stationary and the last, fourth link parallel to the ground. A
180-degree periodic rotational movement is achieved by controlling
only the third joint. We define the two collisions in this way
because, with a purely horizontal movement, the change in output
power due to gravitational potential energy can be ignored; this
means we can focus solely on changes in the kinetic energy of the
manipulator. On the other hand, Collision 1 and Collision 2 differ
in direction, time stamp, and power loss. By defining two different
collision scenarios, we want to further validate the classification
accuracy of the neural network.
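The joint command described above can be sketched as a simple reference trajectory. The cycle period below is an assumed value for illustration, not a parameter reported in the paper:

```python
import numpy as np

def joint3_reference(t, period=8.0):
    """Reference angle (rad) for the third joint: a periodic sweep
    from 0 to 180 degrees and back, while all other joints are held
    fixed. The cycle `period` is an assumed value."""
    phase = (t % period) / period                    # 0..1 within a cycle
    sweep = np.where(phase < 0.5, 2 * phase, 2 * (1 - phase))
    return np.pi * sweep                             # 0 -> pi -> 0

t = np.arange(0.0, 8.0, 0.05)   # 50 ms sampling, as in Section 5
q3 = joint3_reference(t)
print(round(q3.max(), 4))       # peak of the sweep, ~3.1416 (180 degrees)
```

In the simulation, this reference would be sent to the third joint's position controller at every sampling step.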
(a) Normal Movement
(b) Collision 1
(c) Collision 2
FIGURE 2: THREE SCENARIOS OF COPPELIASIM-BASED
SIMULATION.
The Panda manipulator is a 7-DOF robotic arm, and the torque
and velocity time-series data generated by each joint are
accessible during the simulation. We collect these data from the
Panda in each of the three aforementioned scenarios.
4. ANOMALY CLASSIFICATION METHOD
In this paper, we focus on two types of anomalies that are caused
by different reasons, and we roughly divide them into internal
anomalies and external anomalies. Internal anomalies, caused by a
sensor failure, a cyber-attack, or another reason, can be detected
by checking the values in the sensor data. It is also possible to
detect external anomalies caused by collision accidents by
observing the external torque values. Figure 3 shows
the external torque values from each joint of the Panda for
Normal Movement scenario. The red frame in the Figure 4 shows
a failure, however it is not possible to distinguish was it a
collision with another robot or an internal sensor failure through
the observation of torque sensor data only. Although the torque
sensor data is a direct and convenient choice to collect movement
information from robotic arms for anomaly detection in most real
applications, it is still not enough for distinguishing between
internal and external anomalies that can have very different
effects on CPS.
FIGURE 3: TORQUE SENSOR DATA DURING NORMAL
MOVEMENT
FIGURE 4: TORQUE SENSOR DATA DURING COLLISION OR
SENSOR FAILURE
4.1 Robot dynamic model
The common way in the robotics community to detect collision
accidents is to set a torque threshold: a collision is reported
directly if |τ_ext| ≥ threshold. The dynamic model of an n-DOF
manipulator [25] is as follows:

    M(q)q̈ + C(q, q̇)q̇ + g(q) = τ_m + τ_ext

where M(q), C(q, q̇)q̇, and g(q) represent the inertia matrix, the
vector of Coriolis and centrifugal forces, and the gravity vector,
respectively. q, q̇, and q̈ are the joint positions, velocities, and
accelerations, respectively. τ_m and τ_ext are the torques due to
the dynamics and to the external load, respectively.
Normally, for a working manipulator without load, a torque
sensor should report only the dynamic torque τ_m, and τ_ext
should be zero. However, |τ_ext| > 0 could be explained by
a sensor failure, a cyber-attack, or a network delay. Worse still,
it could also be caused by a collision accident. Our goal is to
classify the former and the latter. According to [26], the presence
of a collision accident will immediately change the energy level
of a manipulator. The total energy E of a manipulator is the sum of
the kinetic energy T and the potential energy U due to gravity [26]:

    E = T + U = ½ q̇ᵀ M(q) q̇ + U(q)
It is difficult to observe the real-time total energy of a
manipulator. However, the derivative of the total energy with
respect to time, which represents the power of the manipulator, can
easily be computed by [26]

    Ė = q̇ᵀ τ_m
When a collision happens, the change in the power can be
described as

    Ė = q̇ᵀ τ_m + q̇ᵀ τ_ext

This means that if a collision that impedes the movement of the
manipulator happens, q̇ᵀ τ_ext < 0 and the output power Ė will
decrease immediately. In our paper, we consider the power Ė as one
of the input features for the neural network. In fact, a collision can cause the
features for the neural networks. In fact, a collision can cause the
rotational speed of the motor inside the joint to decrease, or even
the motor to get stuck. This means that the current of the motor
will increase significantly, which will lead to an increase in motor
output power. However, in this paper, we only consider the output
power of the whole manipulator, while a large part of the power
output of the motor is converted into heat, mechanical energy lost
during collisions, etc. Only a small part of motor power is
converted into working power of the manipulator.
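The power feature Ė = q̇ᵀτ can be computed directly from the collected torque and velocity channels. A minimal sketch (the array shapes are assumptions of this illustration):

```python
import numpy as np

def output_power(tau, qdot):
    """Output power E_dot = qdot^T tau at every time step.

    tau, qdot: arrays of shape (T, n_joints) with the joint torques
    and joint velocities; returns an array of shape (T,).
    """
    return np.sum(tau * qdot, axis=1)

# Toy check with 2 joints over 3 time steps.
tau  = np.array([[1.0, 2.0], [0.5, 0.0], [1.0, 1.0]])
qdot = np.array([[2.0, 1.0], [2.0, 3.0], [0.0, 0.0]])
print(output_power(tau, qdot))   # -> [4. 1. 0.]
```

For the Panda, `n_joints` would be 7, and `tau` and `qdot` would come from the simulated joint sensors.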
4.2 Datasets
In this paper, we consider internal anomalies that do not
influence the current movement state of the manipulator, including
its velocity and power. External anomalies, however, will change
the velocity and power immediately, according to the
aforementioned physics knowledge.
Our next step is to construct the datasets. In this paper, we
select {τ, q̇, Ė} as our input time-series data, and a label
ℓ ∈ {0, 1, 2, 3, 4} is the output classification result. The index
represents the defined label for the different scenarios: Normal,
Collision1, Collision2, Anomaly1, and Anomaly2, respectively.
Anomaly1 and Anomaly2 have the same abnormal values as Collision1
and Collision2, but only in the {τ} channel. In this way, we can
generate collision-like anomalies in the torque sensors to simulate
sensor failure or cyber-attack situations. The {τ, q̇, Ė} signals of
the normal movement can be collected from the simulation, and our
focus is to inject the other abnormal scenarios into the normal
time-series data. The injection workflow can be described as
follows: First, we obtain time-series segments of collision events
in the {τ, q̇, Ė} channels. Second, we inject those segments into
the normal data at the right time stamps. Finally, anomaly events
are injected into the error-free data, but only in the {τ} channel.
The {τ, q̇, Ė} signals of Collision1 and Anomaly1 are shown in
Figure 5 and Figure 6. For the collision scenario, abnormal values
can be observed in all channels (pink box), which means the
collision influences the movement states immediately. For internal
anomalies, however, abnormal values can be observed only in the
torque sensor channel (orange box), because these anomalies do not
change the current movement states (green box), which are the
{q̇, Ė} channels in our case study.
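The injection workflow can be sketched as follows. The channel ordering and the helper names are assumptions of this illustration:

```python
import numpy as np

def inject_collision(data, segment, start):
    """External anomaly: overwrite ALL channels with a collision segment."""
    out = data.copy()
    out[start:start + len(segment), :] = segment
    return out

def inject_sensor_fault(data, segment, start, torque_ch=0):
    """Internal anomaly: overwrite ONLY the torque channel; the velocity
    and power channels keep their normal values."""
    out = data.copy()
    out[start:start + len(segment), torque_ch] = segment[:, torque_ch]
    return out

normal = np.zeros((10, 3))     # channels assumed ordered as {tau, qdot, Edot}
segment = np.ones((3, 3))      # a collision-like segment
fault = inject_sensor_fault(normal, segment, start=4)
print(fault[:, 0].sum(), fault[:, 1:].sum())   # -> 3.0 0.0
```

The printed sums confirm the intended asymmetry: the fault changes only the torque channel, whereas `inject_collision` would change all three.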
FIGURE 5: TORQUE, VELOCITY, AND POWER SIGNALS FOR
COLLISION 1.
FIGURE 6: TORQUE, VELOCITY, AND POWER SIGNALS FOR
ANOMALY 1.
As mentioned above, Collision1 and Anomaly1 can be distinguished
by capturing features in the three different channels {τ, q̇, Ė},
and the same applies to Collision2 and Anomaly2. Next, we use the
torque channel as an example to further illustrate the difference
in feature space between these four scenarios. As shown in
Figure 7, a collision or an anomaly both cause large changes in the
torque values. However, the torque signals from the third and
fourth joints differ significantly due to the different directions
of rotation. This is the reason why Anomaly1 and Anomaly2 can be
distinguished. For collision events, we add the signals from the
other two channels {q̇, Ė} to determine whether an event comes from
a collision, a sensor failure, or a cyber-attack. It is worth
noting that Collision1 and Collision2 have different data patterns
in the {q̇, Ė} channels as well.
FIGURE 7: TORQUE TIME-SERIES DATA FROM 7 JOINTS.
4.3 Neural network model
We used a Multivariate Long Short Term Memory Fully
Convolutional Network (MLSTM-FCN). Its architecture is
shown in Figure 8. The MLSTM-FCN consists of a fully convolutional
branch and an LSTM branch. The fully convolutional branch stacks
convolutional blocks, and the first two blocks end with a
squeeze-and-excite block.
In our paper, we use the same hyper-parameters as in [21],
including the number of filters (128, 256, and 128), the kernel
sizes (8, 5, and 3, respectively), and the dropout rate (0.8).
According to [21], MLSTM-FCN is applicable to multivariate
time-series classification problems.
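A minimal PyTorch sketch of such a two-branch network is given below. It follows the filter counts, kernel sizes, and dropout rate listed above, but simplifies some details of [21]: the LSTM branch here reads the raw time series without the dimension shuffle, and the LSTM width is an assumed value.

```python
import torch
import torch.nn as nn

class SqueezeExcite(nn.Module):
    """Squeeze-and-excitation block for 1D feature maps (N, C, T)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=2))     # squeeze over the time axis
        return x * w.unsqueeze(2)      # re-weight the channels

class MLSTMFCN(nn.Module):
    """Two-branch classifier: FCN branch (filters 128/256/128, kernels
    8/5/3, squeeze-and-excite after the first two blocks) in parallel
    with an LSTM branch; outputs are concatenated before the classifier."""
    def __init__(self, n_channels, n_classes, lstm_units=8, dropout=0.8):
        super().__init__()
        def conv_block(cin, cout, k):
            return nn.Sequential(nn.Conv1d(cin, cout, k, padding="same"),
                                 nn.BatchNorm1d(cout), nn.ReLU())
        self.conv1, self.se1 = conv_block(n_channels, 128, 8), SqueezeExcite(128)
        self.conv2, self.se2 = conv_block(128, 256, 5), SqueezeExcite(256)
        self.conv3 = conv_block(256, 128, 3)
        self.lstm = nn.LSTM(n_channels, lstm_units, batch_first=True)
        self.drop = nn.Dropout(dropout)
        self.head = nn.Linear(128 + lstm_units, n_classes)

    def forward(self, x):              # x: (N, T, C) time-series windows
        f = self.se1(self.conv1(x.transpose(1, 2)))
        f = self.se2(self.conv2(f))
        f = self.conv3(f).mean(dim=2)  # global average pooling over time
        _, (h, _) = self.lstm(x)       # last hidden state of the LSTM
        z = torch.cat([f, self.drop(h[-1])], dim=1)
        return self.head(z)            # class logits

model = MLSTMFCN(n_channels=3, n_classes=5)
logits = model(torch.randn(4, 120, 3))  # 4 windows, 120 steps, 3 channels
print(tuple(logits.shape))              # -> (4, 5)
```

With the {τ, q̇, Ė} input per joint flattened into channels, `n_channels` would change accordingly; 3 channels and 5 classes match the per-signal setup and label set of this paper.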
FIGURE 8: MLSTM-FCN NETWORK ARCHITECTURE [21]
5. EXPERIMENTS AND RESULTS
5.1 Experiments
The workflow of our experiments is shown in Figure 9. First,
we collect the data from the different scenarios, including
Collision1, Collision2, Anomaly1, and Anomaly2, and label the
relevant time-series segments. After that, we inject those labeled
data into the error-free data, which represents the Normal
movement scenario, according to their time stamps.
Each dataset has 29270 time steps with a sampling interval of
50 ms. The data is divided into training and test sets with a
split ratio of 80:20. Finally, we use the neural network to
predict the class of each sample from the aforementioned scenarios.
We select different features, including torque, velocity, and
power, as our input to investigate their effect on the final
classification accuracy. We implement three sets of experiments
for comparison, with the input features (a) {τ}, (b) {τ, q̇}, and
(c) {τ, q̇, Ė}, respectively. For each set of experiments, we run
100 predictions on the test set and use a confusion matrix to show
the average precision of the anomaly classification.
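The per-class precision reported in the confusion matrices can be accumulated over the test predictions as sketched below; the label values are hypothetical examples.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes=5):
    """Accumulate counts: rows are true classes, columns are predictions."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Hypothetical labels: 0..4 = Normal, Collision1, Collision2, Anomaly1, Anomaly2.
y_true = [0, 0, 1, 1, 2, 3, 4, 4]
y_pred = [0, 0, 1, 3, 2, 3, 4, 0]   # one Collision1 -> Anomaly1, one Anomaly2 -> Normal
cm = confusion_matrix(y_true, y_pred)
print(cm[1, 3], cm[4, 0])            # -> 1 1
```

Normalizing each row of `cm` by its sum yields the per-class percentages shown in Figure 10.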
FIGURE 9: THE WORKFLOW OF EXPERIMENTS
5.2 Results
The confusion matrix results of our classification experiments are
shown in Figure 10, which presents the average prediction
precision. Based on the results, we can draw the following
conclusions.
1) Collecting torque sensor data only, as we usually do in real
applications, is not suitable for distinguishing internal and
external anomalies. When we input only {τ} into the MLSTM-FCN,
the worst classification performance is obtained. For example,
Collision1 and Anomaly1 have the same data pattern in the {τ}
channel but carry different labels. As a result, the neural network
is confused: 40% of Collision1 is recognized as Anomaly1, 25% of
Collision2 is recognized as Normal, and another 33% of Collision2
is classified as Anomaly2. Moreover, 11% of Anomaly2 is classified
as Normal.
2) When we take {τ, q̇} as input features, the classification
precision improves significantly. Only 22% of Collision1 is
classified as Anomaly1 and 11% of Collision2 is classified as
Normal, but only 71% of Anomaly2 is classified correctly. This
indicates that adding additional information to the input data is
very helpful for improving classification accuracy. The neural
network can capture the different data patterns of different
events in the {τ} and {q̇} channels and make the correct
prediction.
3) The best performance is achieved when we input {τ, q̇, Ė}:
only 8% of Anomaly2 is classified incorrectly. This means that
adding another constraint based on prior physical knowledge helps
to distinguish internal anomalies from external anomalies.
(a) input {τ}
(b) input {τ, q̇}
(c) input {τ, q̇, Ė}
FIGURE 10: CONFUSION MATRICIES FOR ANOMALY
CLASSIFICATION WITH DIFFERENT INPUT FEATURES
In addition, the following metrics can also be used to evaluate
the performance of the classification model according to [27]:

    Precision = TP / (TP + FP)
    Recall = TP / (TP + FN)
    F-score = 2 · Precision · Recall / (Precision + Recall)

where TP, FP, and FN denote the numbers of true positives, false
positives, and false negatives, respectively.
The values of the metrics are shown in Table 1. The results
again indicate that adding additional information to the input
data is very helpful for improving classification accuracy.
Compared with using the {τ} channel only, all metrics are
significantly improved by inputting {τ, q̇} or {τ, q̇, Ė}. At the
same time, the results further validate the effectiveness of the
MLSTM-FCN model for multivariate time-series data
classification tasks.
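Per-class recall and F-score values such as those in Table 1 can be derived from a confusion matrix as sketched below; the two-class matrix is a toy example, not the paper's results.

```python
import numpy as np

def per_class_metrics(cm):
    """Per-class precision, recall, and F-score from a confusion matrix
    whose rows are true classes and columns are predicted classes."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp         # predicted as this class, but wrong
    fn = cm.sum(axis=1) - tp         # this class, but predicted otherwise
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score

# Toy two-class matrix for illustration only.
cm = np.array([[8, 2],
               [1, 9]])
precision, recall, f_score = per_class_metrics(cm)
print(np.round(recall, 2))           # -> [0.8 0.9]
```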
TABLE 1: PERFORMANCE OF ANOMALY CLASSIFICATION
WITH DIFFERENT INPUT FEATURES
Input          Class         Recall   F-score
{τ}            Normal        1.00     1.00
               Collision 1   0.60     0.75
               Collision 2   0.42     0.59
               Anomaly 1     1.00     0.89
               Anomaly 2     0.89     0.76
{τ, q̇}         Normal        1.00     1.00
               Collision 1   0.78     0.88
               Collision 2   0.89     0.94
               Anomaly 1     1.00     0.88
               Anomaly 2     0.71     0.83
{τ, q̇, Ė}      Normal        1.00     1.00
               Collision 1   1.00     1.00
               Collision 2   1.00     1.00
               Anomaly 1     1.00     1.00
               Anomaly 2     1.00     0.86
6. CONCLUSIONS
In this paper, we have implemented anomaly classification
for a robotic manipulator based on the MLSTM-FCN deep
learning model. We exploit the CoppeliaSim simulator to create
five different scenarios and collect time-series data. Then we
convert the anomaly classification problem into a multivariate
time-series classification problem by using the MLSTM-FCN
network. In our experiments, we compared the effect of different
input features on the classification precision. The results show
that observing torque sensor data only is not adequate for
distinguishing internal anomalies and external anomalies.
However, adding other constraints based on prior physics
knowledge of CPS, such as velocity and power of a manipulator,
can significantly improve classification precision. Based on this
result, we conclude that adding a physics-based constraint to a
neural network is a useful and natural choice for the analysis of
detected anomalies, which can provide more understandable and
intuitive detection results for users.
REFERENCES
[1] Chalapathy, R., S. Chawla. “Deep Learning for Anomaly
Detection: A Survey.” 2019.
[2] G. Box and G. Jenkins, Time Series Analysis:
Forecasting and Control, Holden Day, San Francisco,
1976.
[3] Ding, S., et al. “Model-Based Error Detection for Industrial
Automation Systems Using LSTM Networks.” Springer,
Cham, 2020.
[4] Ma Y, Morozov A, Ding S. Anomaly Detection for Cyber-
Physical Systems Using Transformers[C]//ASME
International Mechanical Engineering Congress and
Exposition. American Society of Mechanical Engineers,
2021, 85697: V013T14A038.
[5] J. Ulmen and M. Cutkosky, “A robust, low-cost and low-
noise artificial skin for human-friendly robots,” in Proc.
IEEE Int. Conf. Robot. Automat., 2010, pp. 4836–4841.
[6] R. S. Dahiya, P. Mittendorfer, M. Valle, G. Cheng, and V. J.
Lumelsky, “Directions toward effective utilization of tactile
skin: A review,” IEEE Sensors J., vol. 13, no. 11, pp. 4121–
4138, Nov. 2013.
[7] E. Cagatay, A. Abdellah, P. Lugli, P. Mittendorfer, and G.
Cheng, “Integrating CNT force sensors into a multimodal
modular electronic skin,” in Proc. IEEE 15th Int. Conf.
Nanotechnol., 2015, pp. 1299–1302.
[8] A. De Luca, D. Schroder and M. Thummel, "An
Acceleration-based State Observer for Robot Manipulators
with Elastic Joints," Proceedings 2007 IEEE International
Conference on Robotics and Automation, 2007, pp. 3817-
3823, doi: 10.1109/ROBOT.2007.364064.
[9] A. De Luca, A. Albu-Schäffer, S. Haddadin, and G.
Hirzinger, “Collision detection and safe reaction with the
DLR-III lightweight manipulator arm,” in Proc. IEEE/RSJ
Int. Conf. Intell. Robots Syst., 2006, pp. 1623–1630.
[10] A. de Luca and R. Mattone, "Sensorless Robot Collision
Detection and Hybrid Force/Motion Control," Proceedings
of the 2005 IEEE International Conference on Robotics and
Automation, 2005, pp. 999-1004, doi:
10.1109/ROBOT.2005.1570247.
[11] D. Lim, D. Kim and J. Park, "Momentum Observer-Based
Collision Detection Using LSTM for Model Uncertainty
Learning," 2021 IEEE International Conference on Robotics
and Automation (ICRA), 2021, pp. 4516-4522, doi:
10.1109/ICRA48506.2021.9561667.
[12] Y. J. Heo, D. Kim, W. Lee, H. Kim, J. Park and W. K. Chung,
"Collision Detection for Industrial Collaborative Robots: A
Deep Learning Approach," in IEEE Robotics and
Automation Letters, vol. 4, no. 2, pp. 740-746, April 2019,
doi: 10.1109/LRA.2019.2893400.
[13] Graves A, Schmidhuber J. Framewise phoneme
classification with bidirectional LSTM and other neural
network architectures[J]. Neural networks, 2005, 18(5-6):
602-610.
[14] Kang H, Choi S. Bayesian common spatial patterns for
multi-subject EEG classification[J]. Neural Networks, 2014,
57: 39-50.
[15] Fu Y. Human activity recognition and prediction[M].
Switzerland: Springer, 2016.
[16] Orsenigo C, Vercellis C. Combining discrete SVM and
fixed cardinality warping distances for multivariate time
series classification[J]. Pattern Recognition, 2010, 43(11):
3787-3794.
[17] Xing Z, Pei J, Keogh E. A brief survey on sequence
classification[J]. ACM Sigkdd Explorations Newsletter,
2010, 12(1): 40-48.
[18] Baydogan M G, Runger G. Learning a symbolic
representation for multivariate time series classification[J].
Data Mining and Knowledge Discovery, 2015, 29(2): 400-
422.
[19] Jaakkola T, Diekhans M, Haussler D. A discriminative
framework for detecting remote protein homologies[J].
Journal of computational biology, 2000, 7(1-2): 95-114.
[20] Zheng Y, Liu Q, Chen E, et al. Time series classification
using multi-channels deep convolutional neural
networks[C]//International conference on web-age
information management. Springer, Cham, 2014: 298-310.
[21] Karim F, Majumdar S, Darabi H, et al. Multivariate LSTM-
FCNs for time series classification[J]. Neural Networks,
2019, 116: 237-245.
[22] Ismail Fawaz H, Forestier G, Weber J, et al. Deep learning
for time series classification: a review[J]. Data mining and
knowledge discovery, 2019, 33(4): 917-963.
[23] Gaz C, Cognetti M, Oliva A, et al. Dynamic identification
of the franka emika panda robot with retrieval of feasible
parameters using penalty-based optimization[J]. IEEE
Robotics and Automation Letters, 2019, 4(4): 4147-4154.
[24] Rohmer E, Singh S P N, Freese M. V-REP: A versatile and
scalable robot simulation framework[C]//2013 IEEE/RSJ
International Conference on Intelligent Robots and Systems.
IEEE, 2013: 1321-1326.
[25] Popov D, Klimchik A, Mavridis N. Collision detection,
localization & classification for industrial robots with joint
torque sensors[C]//2017 26th IEEE International
Symposium on Robot and Human Interactive
Communication (RO-MAN). IEEE, 2017: 838-843.
[26] Haddadin S, De Luca A, Albu-Schäffer A. Robot collisions:
A survey on detection, isolation, and identification[J]. IEEE
Transactions on Robotics, 2017, 33(6): 1292-1312.
[27] Chen H, Leu M C, Yin Z. Real-Time Multi-modal Human-
Robot Collaboration Using Gestures and Speech[J]. Journal
of Manufacturing Science and Engineering, 2022: 1-22.
Article
Full-text available
Over the past decade, multivariate time series classification has been receiving a lot of attention. We propose augmenting the existing univariate time series classification models, LSTM-FCN and ALSTM-FCN with a squeeze and excitation block to further improve performance. Our proposed models outperform most of the state of the art models while requiring minimum preprocessing. The proposed models work efficiently on various complex multivariate time series classification tasks such as activity recognition or action recognition. Furthermore, the proposed models are highly efficient at test time and small enough to deploy on memory constrained systems.
Chapter
The increasing complexity of modern automation systems leads to inevitable faults. At the same time, structural variability and untrivial interaction of the sophisticated components makes it harder and harder to apply traditional fault detection methods. Consequently, the popularity of Deep Learning (DL) fault detection methods grows. Model-based system design tools such as Simulink allow the development of executable system models. Besides the design flexibility, these models can provide the training data for DL-based error detectors. This paper describes the application of an LSTM-based error detector for a system of two industrial robotic manipulators. A detailed Simulink model provides the training data for an LSTM predictor. Error detection is achieved via intelligent processing of the residual between the original signal and the LSTM prediction using two methods. The first method is based on the non-parametric dynamic thresholding. The second method exploits the Gaussian distribution of the residual. The paper presents the results of extensive model-based fault injection experiments that allow the comparison of these methods and the evaluation of the error detection performance for varying error magnitude.
Article
With increased human-robot interactions in industrial settings, a safe and reliable collision detection framework has become an indispensable element of collaborative robots. The conventional framework detects collisions by estimating collision monitoring signals with a particular type of observer, which is followed by collision decision processes. This results in unavoidable trade-off between sensitivity to collisions and robustness to false alarms. In this study, we propose a collision detection framework (CollisionNet) based on a deep learning approach. We designed a deep neural network model to learn robot collision signals and recognize any occurrence of a collision. This data-driven approach unifies feature extraction from high-dimensional signals and the decision processes. CollisionNet eliminates heuristic and cumbersome nature of the traditional decision processes, showing high detection performance and generalization capability in real time. We performed quantitative analysis and verified the performance of the proposed framework through various experiments.
Book
This book provides a unique view of human activity recognition, especially fine-grained human activity structure learning, human-interaction recognition, RGB-D data based action recognition, temporal decomposition, and causality learning in unconstrained human activity videos. The techniques discussed give readers tools that provide a significant improvement over existing methodologies of video content understanding by taking advantage of activity recognition. It links multiple popular research fields in computer vision, machine learning, human-centered computing, human-computer interaction, image classification, and pattern recognition. In addition, the book includes several key chapters covering multiple emerging topics in the field. Contributed by top experts and practitioners, the chapters present key topics from different angles and blend both methodology and application, composing a solid overview of the human activity recognition techniques.
Article
Robot assistants and professional coworkers are becoming a commodity in domestic and industrial settings. In order to enable robots to share their workspace with humans and physically interact with them, fast and reliable handling of possible collisions on the entire robot structure is needed, along with control strategies for safe robot reaction. The primary motivation is the prevention or limitation of possible human injury due to physical contacts. In this survey paper, based on our early work on the subject, we review, extend, compare, and evaluate experimentally model-based algorithms for real-time collision detection, isolation, and identification that use only proprioceptive sensors. This covers the context-independent phases of the collision event pipeline for robots interacting with the environment, as in physical human–robot interaction or manipulation tasks. The problem is addressed for rigid robots first and then extended to the presence of joint/transmission flexibility. The basic physically motivated solution has already been applied to numerous robotic systems worldwide, ranging from manipulators and humanoids to flying robots, and even to commercial products.