
Case Study: ROS-Based Fault Injection for Risk Analysis of Robotic Manipulator

Authors: Yuliang Ma, Philipp Grimmeisen, and Andrey Morozov

Abstract

Industrial cyber-physical systems (ICPS) are becoming more complex due to increasing behavioral and structural complexity. This increases the likelihood of faults, errors and failures. This can lead to economic losses and even hazardous events. Fault injection is an efficient method to estimate the potential risk of safety-critical ICPS. In this paper, we propose a new fault injection-based risk analysis method for Robot Operating System (ROS) and demonstrate its applicability with a robot manipulator case study. We conducted extensive fault injection experiments using a pick-and-place task. We injected two types of sensor signal faults: bias and noise. First, fault injections were implemented on a ROS/Gazebo model of the manipulator with randomly selected fault parameters such as fault type, location, magnitude and duration. The experiments helped to identify potential failure scenarios and to find critical fault locations. The most important factor contributing to system failures was the operational phase during which the faults were injected. We then tested our fault injection method on a real Franka Emika Panda collaborative manipulator to validate the effectiveness of the proposed ROS-based fault injection method. We observed that the digital model showed similar behavior to the real manipulator.
Index Terms—Industrial Cyber-Physical Systems, fault
injection, robot operating system, risk analysis
I. INTRODUCTION
Cyber-Physical Systems (CPS) integrate computational
and physical capabilities that enable interaction between the
cyber and physical worlds through computation,
communication and control [1]. In the industrial domain,
Industrial Cyber-Physical Systems (ICPS) play an important
role. However, due to the increasing behavioral and structural
complexity of internal components, different types of failures
are likely to occur in ICPS. Therefore, it is important to assess
the potential risks for ICPS, especially for safety-critical
applications, such as collaborative robot manipulators, which
are widely used in industrial Human-Robot Collaboration
(HRC) scenarios. Fault injection is a widely used and
promising risk analysis method. At the same time, the Robot
Operating System (ROS) [2] is a powerful open-source
software framework that helps to develop software for various
robotic applications, e.g. manipulators, mobile robots and
drones. An internal communication mechanism in ROS
provides convenient means to inject errors into target ROS
nodes. The ROS/Gazebo model allows users to simulate the
robot in any environment, making fault injection experiments
effective and safe.
*This work is funded by the Ministry of Science, Research and Arts of the
Federal State of Baden-Württemberg as part of the projects
within the InnovationCampus Future Mobility (ICM).
In this paper we follow the definitions of faults, errors and
failures given in [3]. A fault activation leads to an error. An
error in turn can lead to a failure if it propagates through
different components of a system. Fig. 1 outlines such a fault
propagation process. In modern ICPS, fault tolerance
mechanisms allow systems to continue to perform their tasks
even when a fault occurs. In other words, not every fault leads
to failure. Faults with different parameters can lead to different
failure scenarios. Therefore, it is important to study critical
faults and estimate their consequences for safety-critical
applications.
Figure 1. Visualization of the fault-error-failure chain.
Contribution: In this paper, we present a new automated
fault injection method for risk assessment of a ROS-based
robotic manipulator. A Franka Emika Panda (hereafter
referred to as Panda) manipulator is used in this case study.
Fault injection is implemented for a pick-and-place task. Two
types of sensor faults are injected with six different fault
parameters: fault type, fault location, fault magnitude, fault
duration, fault activity and fault phase. Our method enables
random fault selection, injection and monitoring based on
ROS communication mechanisms. First, we extensively tested
our method on a ROS/Gazebo model. We then used it on a real
Panda manipulator. The real Panda manipulator showed
similar behavior to the simulated model after fault injection.
Experiments proved the feasibility of our fault injection
method. The main results are summarized in this paper.
The rest of the paper is organized as follows: Section Ⅱ
discusses the relevant state-of-the-art methods. Section Ⅲ
introduces the proposed fault injection method. Section Ⅳ
presents the experiment results. Finally, conclusions and
future steps are discussed in Section Ⅴ.
Yuliang Ma, Philipp Grimmeisen, and Andrey Morozov are with the
Institute of Industrial Automation and Software Engineering, University of
Stuttgart, Germany (e-mail: yuliang.ma@ias.uni-stuttgart.de).
II. STATE OF THE ART
Collaborative robot applications require a comprehensive
risk assessment of both the robot systems and the workplace
(ISO/TS 15066:2016). Fault injection is a promising method
to estimate the potential risks of HRC scenarios and to evaluate
the fault tolerance of the robot system.
A. Fault Injection for ROS
Considerable work has been done on fault injection
based on ROS and the Gazebo simulation platform for
various safety-critical robotic applications. For autonomous
mobile robots, a ROS/Gazebo-based hierarchical fault-tolerant
framework that can inject faults and analyze error propagation
chains is proposed in [4]. In [5], sensor faults of mobile robot
localization systems are described, but with relatively simple
fault parameters. A fault diagnosis method for multiple mobile
robots is proposed in [6], and its fault injector module is
implemented using the ROS service. In [7], an end-to-end fault
analysis framework called MAVFI is proposed. It shows a
comprehensive analysis of fault injection for different
algorithms, error propagation and recovery strategy in a
simulation environment. MAVFI is based on a ROS node that
uses the ROS communication protocol and Linux system
commands to inject faults.
There are also ROS-based fault injection methods and
tools for robotic manipulators. In [8] and [9], faults are injected
into teleoperated surgical robots, and experiments are
implemented in a simulator. They focus on malicious control
commands generated by different types of attacks and a
model-based framework is used to estimate the consequences.
B. Fault Injection for other Systems
In addition to ROS-based fault injection, other tools are
available for other platforms and frameworks. In [10], model-
based fault injection experiments are performed on an
exoskeleton system using a highly customizable Simulink
block called FIBlock. In [11], a safety analysis platform called
xSAP is developed, which allows customizable definition of
fault modes and implements safety analysis. With the rapid
development of Artificial Intelligence (AI), some AI-based
fault injection paradigms are also attractive [12].
Fault injection is a popular method for testing the resilience
of a system. Although many ROS-based fault injection
methods have been introduced, most of them are used to
evaluate the impact of a specific fault. In this paper, we
perform extensive Monte Carlo fault injections for a
collaborative manipulator, which helps to identify important
failure scenarios and find the critical fault parameters. Our
method has the following distinctive features:
1) It is a ROS-based fault injection method designed
specifically for the Panda manipulator.
2) Our method enables Monte Carlo fault injection with
random and customizable fault parameter selection and
fault monitoring in the ROS/Gazebo model.
3) It has been applied to both the ROS/Gazebo model and
the real Panda manipulator.
III. FAULT INJECTION METHOD
A. Panda and Gazebo
The Panda manipulator has seven degrees of freedom and
is becoming increasingly popular due to its ease of use and
relatively low price [13]. The end-effector has two fingers that
can perform open and close actions. In our lab, we have a
demonstrator under development that helps to perform
chemical experiments: moving test tubes, pouring chemical
reagents and shaking the beaker. In this work, we focus on the
pick-and-place task. The Panda manipulator moves a test tube
from position A to position B. We have also built a simplified
ROS/Gazebo model. Gazebo is a simulator that integrates
precise physics and advanced 3D graphics, making it a popular
simulator in the ROS research community. Fig. 2 shows the
real Panda manipulator (a) and its ROS/Gazebo model (b).
Figure 2. Real-world manipulator and simulation models
B. Fault Configuration
We introduce errors into the Panda manipulator by
manipulating the original sensor data. The Panda manipulator
has seven joints. Each of them generates time series data about
positions, velocities and torques. Since a position-based
motion planner is used (rapidly exploring random tree), we
focus on the position signal channel to inject erroneous data in
real time.
We defined the fault space using the parameters listed in
Table I. These parameters are described below.
Fault type: We consider two common fault types: bias
and noise.
Fault location: The position signal to be corrupted.
We inject faults into one of the seven joint position signals.
Fault magnitude: For both fault types, the fault
injector randomly selects a value within a predefined
range.
Fault duration: How long the fault is injected.
We set the maximum duration to 1.0 s for bias and 3.0 s
for noise.
Fault activity: The activity during which a fault is injected.
The pick-and-place task consists of nine sequential
activities, such as PickHover, GoDown, CloseHand,
and GoUp.
Fault phase: For each activity, the system has a
planning period and an execution period. We define
the fault phase as execution fault when the fault
duration entirely falls into the execution period of a
certain activity. Otherwise, it will be defined as a
planning fault. It is worth noting that sometimes a
long-term fault could last for two planning periods or
more, and we only consider the first planning period
and its corresponding activity.
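The fault phase rule described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the `Activity` record, its field names, and the idea of passing activity timestamps as a list are assumptions made for the example. A fault is labeled an execution fault only when its whole interval lies inside the execution period of one activity; otherwise it is attributed to the first planning period it overlaps.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Activity:
    """One activity: a planning period followed by an execution period (hypothetical layout)."""
    name: str
    plan_start: float  # planning period is [plan_start, exec_start)
    exec_start: float  # execution period is [exec_start, exec_end]
    exec_end: float


def classify_fault(activities: List[Activity], start: float,
                   end: float) -> Optional[Tuple[str, str]]:
    """Return (activity name, phase) for a fault active on [start, end].

    `activities` must be sorted by time.
    """
    # Execution fault: the whole fault interval lies inside one execution period.
    for act in activities:
        if act.exec_start <= start and end <= act.exec_end:
            return act.name, "execution"
    # Otherwise it is a planning fault, attributed to the first planning
    # period that the fault interval overlaps.
    for act in activities:
        if start < act.exec_start and end > act.plan_start:
            return act.name, "planning"
    return None  # the fault lies outside the recorded task
```

With two activities whose planning periods start at 0 s and 3 s, a fault on [2.5, 3.5] starts during the first execution period but spills into the second planning period, so it is reported as a planning fault of the second activity.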
TABLE I. FAULT PARAMETERS
Fault parameter   Description                     Set notation
Type              Bias, Noise                     T = {1,2}
Location          Joint 1,2…7                     L = {1,2,3,4,5,6,7}
Magnitude         Intensity of fault              M = (0,1]
Duration          Time length of the fault        D = (0,1] for bias,
                                                  D = (0,3] for noise
Activity          PickHover, GoDown, CloseHand,   A = {1,2,3,4,5,6,7,8,9}
                  GoUp, PlaceHover, GoDown,
                  OpenHand, GoUp, BacktoInit
Phase             Planning period,                P = {1,2}
                  Execution period
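A random draw from the fault space of Table I could look like the sketch below. It is illustrative only: the `Fault` class, the function names, and the use of uniform distributions are assumptions, since the paper states only that parameters are selected randomly within predefined ranges. Activity and phase are not sampled here because, as described above, they follow from the start and end time of the injected fault.

```python
import random
from dataclasses import dataclass

FAULT_TYPES = ("bias", "noise")              # T = {1, 2}
JOINTS = (1, 2, 3, 4, 5, 6, 7)               # L = {1, ..., 7}
MAX_DURATION = {"bias": 1.0, "noise": 3.0}   # upper bound of D in seconds


@dataclass
class Fault:
    fault_type: str
    location: int
    magnitude: float
    duration: float


def half_open_unit(rng: random.Random) -> float:
    """Draw from (0, 1]: rng.random() lies in [0, 1), so 1 - x lies in (0, 1]."""
    return 1.0 - rng.random()


def sample_fault(rng: random.Random) -> Fault:
    """Draw one fault uniformly from the fault space (uniformity is an assumption)."""
    fault_type = rng.choice(FAULT_TYPES)
    return Fault(
        fault_type=fault_type,
        location=rng.choice(JOINTS),
        magnitude=half_open_unit(rng),                           # M = (0, 1]
        duration=MAX_DURATION[fault_type] * half_open_unit(rng)  # D = (0, 1] or (0, 3]
    )
```

Passing a seeded `random.Random` instance keeps a Monte Carlo campaign reproducible, which makes it easier to re-run a specific fault that produced a failure.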
C. Fault Injection Method
As an executable program unit in ROS, a ROS node can
perform computation, publish and subscribe to ROS topics,
and provide ROS services. Our fault injector is a ROS node
that communicates with other nodes through ROS
topics, a one-to-many communication mechanism. The
architecture of the fault injector is shown in Fig. 3. The fault
injection process consists of three steps:
Figure 3. Overview of fault injection process. Fault injector generates faults
and Failure monitor judges failure modes.
1) Fault Selection: The fault injector extracts a set of
fault parameters including fault type, fault location, fault
duration, and fault magnitude. For each round of pick and
place, only one selected fault is injected.
2) Fault Injection: After selecting fault parameters, the
fault injector randomly selects a start time to inject faults.
Specifically, the fault injector subscribes to the normal
Joint states data and manipulates them according to
selected fault parameters. It then publishes a new ROS
topic named Faulty joint states to subscribers. Meanwhile,
the fault activity and fault phase are determined by
checking the start and end time of the injected fault. It is
worth noting that sensor data are abnormal only during the
selected fault duration, and only one fault is injected
during each round of the pick-and-place process.
3) Failure Monitoring: The failure monitor is a ROS node
that subscribes to the Ros out, Joint states, and Model states
topics. The Ros out topic contains information about the start
and end times of the activities and all fault parameters.
In addition, the failure monitor determines different
failure modes by monitoring the status of the Panda
manipulator (according to Joint states) and the test tube
(according to Model states). Based on our experiments
and observations, five failure modes are defined:
1) Critical Acceleration (Crit. Accel.),
2) Pick Failure (P-Failure),
3) Release Failure (R-Failure),
4) Collision,
5) Drop.
For each failure mode, a hazard level is defined to indicate
the severity of the failure. A higher hazard level indicates
a more serious failure. In addition, some reasonable
constraints are defined for the failure monitor based on the
pick-and-place task. For example, a Drop failure is
possible only while the end effector is holding the
object. Table II shows the definition of all failure modes
and their trigger conditions. The final step of the failure
monitor is to store all fault parameters and hazard levels.
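The signal manipulation in step 2 can be illustrated with a small, ROS-independent function. This is a sketch under stated assumptions, not the authors' code: the paper does not give the exact fault models, so a bias is modeled here as a constant additive offset and noise as zero-mean Gaussian noise whose standard deviation equals the fault magnitude. The function name and arguments are hypothetical. In a ROS node, such a function would typically be called from the callback that receives the Joint states messages, and its result would be published on the Faulty joint states topic.

```python
import random
from typing import List, Sequence


def inject_fault(positions: Sequence[float], stamp: float, fault_type: str,
                 joint: int, magnitude: float, start: float, duration: float,
                 rng: random.Random) -> List[float]:
    """Return a copy of the joint positions with the fault applied if it is active."""
    faulty = list(positions)
    if not (start <= stamp <= start + duration):
        return faulty  # outside the fault window the data pass through unchanged
    idx = joint - 1  # joints are numbered 1..7
    if fault_type == "bias":
        faulty[idx] += magnitude                  # constant offset (assumed model)
    elif fault_type == "noise":
        faulty[idx] += rng.gauss(0.0, magnitude)  # Gaussian noise (assumed model)
    else:
        raise ValueError(f"unknown fault type: {fault_type}")
    return faulty
```

Only the selected joint is modified, and only while the message timestamp falls inside the fault window, which matches the statement that sensor data are abnormal only during the selected fault duration.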
TABLE II. FAILURE MODES, TRIGGERS, AND HAZARD LEVELS
Failure mode   Trigger condition                               Hazard level
Success        No failures (not a failure mode)                0
Crit. Accel.   Joint velocity is over a certain threshold      1
P-Failure      Object position never changed                   2
R-Failure      Object position is still changing after the     3
               release activity
Collision      Object position changed before the pick         4
               activity
Drop           Object velocity is over a certain threshold     5
               while the manipulator is holding it
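The trigger conditions of Table II can be expressed as a simple rule set. The sketch below is hypothetical: the `RunObservation` fields, the two threshold arguments, and the choice to report the triggered mode with the highest hazard level are assumptions, because the paper does not state the threshold values or how simultaneous triggers are resolved.

```python
from dataclasses import dataclass

# Hazard levels as listed in Table II.
HAZARD_LEVEL = {
    "Success": 0, "Crit. Accel.": 1, "P-Failure": 2,
    "R-Failure": 3, "Collision": 4, "Drop": 5,
}


@dataclass
class RunObservation:
    """Summary of one pick-and-place run (hypothetical field names)."""
    max_joint_velocity: float
    object_ever_moved: bool
    object_moved_before_pick: bool
    object_moving_after_release: bool
    max_object_velocity_while_held: float


def classify_failure(obs: RunObservation, joint_velocity_limit: float,
                     object_velocity_limit: float) -> str:
    """Return the triggered failure mode with the highest hazard level."""
    triggered = ["Success"]
    if obs.max_joint_velocity > joint_velocity_limit:
        triggered.append("Crit. Accel.")
    if not obs.object_ever_moved:
        triggered.append("P-Failure")
    if obs.object_moving_after_release:
        triggered.append("R-Failure")
    if obs.object_moved_before_pick:
        triggered.append("Collision")
    # Only velocities measured while the object is held count towards Drop.
    if obs.max_object_velocity_while_held > object_velocity_limit:
        triggered.append("Drop")
    return max(triggered, key=lambda mode: HAZARD_LEVEL[mode])
```

Restricting the Drop check to the velocity observed while the object is held encodes the constraint that a Drop is possible only while the end effector is holding the object.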
IV. EXPERIMENTS AND RESULTS
We have used two approaches to implement fault injection
experiments: 1) the complete fault space is covered for discrete
fault parameters; 2) fault parameters of continuous values are
investigated through a constrained fault space.
A. Fault injection for the complete fault space
2900 experiments were conducted to estimate levels of
risks caused by a particular fault (Bias: 1500 samples, Noise:
1400 samples). The results of the fault injection are shown in
Fig. 4, which ignores fault parameters defined by
continuous values, such as fault duration and fault magnitude.
For both configured fault types, bias and noise, the fault
phase is the most critical parameter in determining whether a
failure will occur. Specifically, if a fault is injected during the
planning period, the system has an extremely high percentage
of failures (bias fault: 84.00%, noise fault: 68.55%). On the
other hand, if a fault is injected during the execution period,
the failure rate is much lower (bias fault: 6.87%, noise fault:
11.11%). There is a logical explanation for this: If the fault is
injected during the planning period, the Motion planner node
receives corrupted information about the current Joint states
and implements incorrect motion planning. The Panda
manipulator then receives incorrect waypoints (due to a bias
fault) or is stuck at a certain activity (due to a noise fault) thus
leading to a failure. In contrast, the failure rate is much lower
in the case of an execution period fault since Motion planner
does not compute a new path for the current activity, even
though the erroneous position data are sent to the Motion
planner.
Figure 4. Bias and noise faults during different fault phases
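The per-phase failure rates reported above are the share of runs with a hazard level greater than zero within each group. A minimal sketch of this aggregation is given below; the record layout (dictionaries with `phase` and `hazard_level` keys) is an assumption about how the stored monitor output might look, and the records in the usage example are toy values, not the experimental data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def failure_rate_by(records: Iterable[Mapping], key: str) -> Dict[str, float]:
    """Percentage of runs with hazard level > 0, grouped by the value of `key`."""
    total = defaultdict(int)
    failed = defaultdict(int)
    for rec in records:
        total[rec[key]] += 1
        if rec["hazard_level"] > 0:
            failed[rec[key]] += 1
    return {group: 100.0 * failed[group] / total[group] for group in total}


# Toy records for illustration only.
toy_runs = [
    {"phase": "planning", "hazard_level": 5},
    {"phase": "planning", "hazard_level": 1},
    {"phase": "planning", "hazard_level": 2},
    {"phase": "planning", "hazard_level": 0},
    {"phase": "execution", "hazard_level": 0},
    {"phase": "execution", "hazard_level": 0},
    {"phase": "execution", "hazard_level": 1},
    {"phase": "execution", "hazard_level": 0},
]
rates = failure_rate_by(toy_runs, "phase")
```

The same function can group by fault type, location, or activity by changing the `key` argument, which corresponds to the breakdowns shown in the following figures.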
Fig. 5 shows the distribution of the failures caused by the
faults injected during the planning period. For bias faults, the
top two failure modes are Critical Acceleration (52.67%) and
Drop (22.33%). On the other hand, when noise faults are
injected during the planning period, the Panda manipulator has
a relatively high percentage of Pick Failure (36.57%) and
Drop (28.09%).
Figure 5. Failure modes distribution of different fault types for the planning
phase
Fig. 6 shows the distribution of failures for different values
of the fault location parameter during the planning period. For
bias faults, Critical Acceleration (average: 60.06%) and Drop
(average: 25.54%) are two failure modes with a relatively high
percentage of occurrence. This is because bias faults during
the planning period give the Motion planner node incorrect
current positions of the manipulator. As a result, the planner
generates incorrect waypoints that deviate from the normal
path. The manipulator must move at a relatively high
speed to pass through all the abnormal waypoints. This makes
Critical Acceleration failures more likely, and if the speed is
extremely high, Drop failures will occur. For noise faults, Pick
Failure (average: 38.79%) and Drop (average: 30.16%) are
two more common failure modes. When the Motion planner
node receives noisy sensor data, it cannot plan any waypoints
due to the unstable data, so the manipulator keeps refusing to
perform the following activities. When the fault is over, the
manipulator proceeds to the next activity directly. In other
words, the noise fault changes the original activity flow and
the failure modes are more dependent on the start and end time
of faults. For example, when the manipulator is moving an
object but is stuck at a hovering position due to the noisy data,
and the fault ends before the OpenHand activity, the
manipulator will perform OpenHand activity directly and
ignore the GoDown activity. As a result, the object is released
prematurely, and a Drop failure will occur in this situation.
Figure 6. Histogram results summary of the failure rate distribution for all
joints.
Finally, the fault activity parameter is selected as a
reference for the risk analysis with two types of faults during
the planning period. The results are illustrated in Fig. 7. We
focus only on the Drop scenario, which has the highest hazard
level. For bias faults, the most dangerous failure occurs only
during the CloseHand, GoUp, PlaceHover, and GoDown activities.
This is because the manipulator is holding the object
during these activities, so a fault injected at these times
carries more risk. As for the noise faults, during the PlaceHover
and GoDown activities, the rate of Drop failures is very high.
Figure 7. Histogram results summary of the failure rate distribution for all
activities.
B. Fault parameters of continuous values
To investigate the influence of continuous fault parameters
(fault magnitude and fault duration) on risk estimation, we
inject planning period faults (bias faults) into a given activity.
The experiment results are shown in Fig. 8. The second
activity, GoDown is the target of fault injection. When the
fault duration parameter is taken as a reference, there is no
clear relationship between the injected fault and the failure it
caused. In contrast, the distribution of failure modes
shows a strong dependence on the fault location parameter.
The most dangerous failure modes are often caused by faults
in the first three joints, which are closer to the manipulator root
and far away from the end effector. Due to the mechanical
structure of the Panda manipulator, the Critical Acceleration
failures that occur in the first three joints often result in a large
deviation from the correct position. As such, the more serious
failure modes are more likely to happen in this case. On the
other hand, when the fault magnitude parameter is taken into
account, faults from the first three joints are still riskier than
the others. In addition, a fault magnitude of around 0.25 is
likely to be a valid threshold for distinguishing the most severe
failure mode from other modes.
Figure 8. Monte Carlo fault injection results for the 1st GoDown activity
C. Key findings
Our three main observations are:
1) Fault phase is the most critical parameter: if a fault
is injected during the planning period, the failure rate is
much higher than for faults in the execution period.
2) The two fault types cause different abnormal behavior
after fault injection: bias faults lead to Critical Acceleration,
while noise faults cause the manipulator to get stuck and disrupt
the original activity flow of the pick-and-place task. As a result,
the failure distribution also differs between the two fault types.
3) When faults are injected into a particular activity,
the fault location parameter is the more important factor
contributing to a failure. In our case study, fault duration
does not have a strong relationship with the failure
distribution, while fault magnitude shows a visible
threshold for the most severe failure.
We tested our fault injection method on a real Panda
manipulator, and it showed similar behavior after fault
injection: https://youtu.be/LKtGLkaFTPo. For
safety reasons, the fault magnitude parameter was limited to 0.1 for
both fault types in the demo video.
V. CONCLUSION AND FUTURE WORK
In this paper, we presented the results of a case study in which
we injected noise and bias faults using ROS software to
identify potential failures of a robotic manipulator. First,
we performed extensive Monte Carlo fault injections on a
ROS/Gazebo model and identified the fault parameters that
contribute most to failures. Then, we tested our
fault injection method on a real Panda manipulator, which
showed very similar behavior to the simulated one after fault
injection. The method proved helpful for risk
assessment of ICPS: it helps to identify critical faults and
to develop fault tolerance mechanisms.
REFERENCES
[1] R. Baheti and H. Gill, “Cyber-physical systems,” The Impact of Control
Technology, vol. 12, pp. 161-166, 2011.
[2] M. Quigley, K. Conley, B. Gerkey, J. Faust, T. Foote, J. Leibs,
R. Wheeler, A. Y. Ng et al., “Ros: an open-source robot operating
system,” in ICRA workshop on open source software, vol. 3, no. 3.2.
2009, p. 5.
[3] A. Avizienis, J.-C. Laprie, B. Randell and C. Landwehr, “Basic
concepts and taxonomy of dependable and secure computing,” in IEEE
Transactions on Dependable and Secure Computing, vol. 1, no. 1, pp.
11-33, 2004.
[4] A. Favier, A. Messioux, J. Guiochet, J.-C. Fabre and C. Lesire, “A
hierarchical fault tolerant architecture for an autonomous robot,” in
2020 50th Annual IEEE/IFIP International Conference on Dependable
Systems and Networks Workshops (DSN-W), pp. 122-129.
[5] Z. Zhao, J. Wang, J. Cao, W. Gao and Q. Ren, “A Fault-tolerant
Architecture for Mobile Robot Localization,” in 2019 IEEE 15th
International Conference on Control and Automation (ICCA), 2019, pp.
584-589.
[6] M. G. Morais, F. R. Meneguzzi, R. H. Bordini and A. M. Amory,
“Distributed fault diagnosis for multiple mobile robots using an agent
programming language,” in 2015 International Conference on
Advanced Robotics (ICAR), 2015, pp. 395-400.
[7] Y. S. Hsiao, Z. Wan, T. Jia, R. Ghosal, A. Mahmoud, A. Raychowdhury,
D. Brooks, G.-Y. Wei and V. J. Reddi, “Mavfi: An end-to-end fault
analysis framework with anomaly detection and recovery for micro
aerial vehicles,” arXiv preprint arXiv:2105.12882, 2021.
[8] H. Alemzadeh, D. Chen, X. Li, T. Kesavadas, Z. T. Kalbarczyk and R.
K. Iyer, “Targeted Attacks on Teleoperated Surgical Robots: Dynamic
Model-Based Detection and Mitigation,” in 2016 46th Annual
IEEE/IFIP International Conference on Dependable Systems and
Networks (DSN), 2016, pp. 395-406.
[9] X. Li, H. Alemzadeh, D. Chen, Z. Kalbarczyk, R. K. Iyer and T.
Kesavadas, “A hardware-in-the-loop simulator for safety training in
robotic surgery,” in 2016 IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS). IEEE, 2016, pp. 5291-5296.
[10] T. Fabarisov, I. Mamaev, A. Morozov and K. Janschek, “Model-based
fault injection experiments for the safety analysis of exoskeleton
system,” arXiv preprint arXiv:2101.01283, 2021.
[11] B. Bittner, M. Bozzano, R. Cavada, A. Cimatti, M. Gario, A. Griggio,
C. Mattarei, A. Micheli and G. Zampedri, “The xSAP safety analysis
platform,” in 2016 Tools and Algorithms for the Construction and
Analysis of Systems (TACAS): 22nd International Conference, 2016, pp.
533-539.
[12] A. Sedaghatbaf, M. Moradi, J. Almasizadeh, B. Sangchoolie, B. Van
Acker and J. Denil, “DELFASE: A Deep Learning Method for Fault
Space Exploration,” in 2022 18th European Dependable Computing
Conference (EDCC), 2022, pp. 57-64.
[13] C. Gaz, M. Cognetti, A. Oliva, P. Robuffo Giordano and A. De Luca,
“Dynamic identification of the Franka Emika Panda robot with retrieval
of feasible parameters using penalty-based optimization,” in IEEE
Robotics and Automation Letters, vol. 4, no. 4, pp. 4147-4154, 2019.