Citation: Cho, A.D.; Carrasco, R.A.; Ruz, G.A. A RUL Estimation System from Clustered Run-to-Failure Degradation Signals. Sensors 2022, 22, 5323. https://doi.org/10.3390/s22145323
Academic Editors: Ningyun Lu,
Hamed Badihi and Tao Chen
Received: 30 May 2022
Accepted: 14 July 2022
Published: 16 July 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
sensors
Article
A RUL Estimation System from Clustered Run-to-Failure Degradation Signals
Anthony D. Cho 1,2 , Rodrigo A. Carrasco 1,3 and Gonzalo A. Ruz 1,4,5,*
1Faculty of Engineering and Sciences, Universidad Adolfo Ibáñez, Santiago 7941169, Chile;
acholo@alumnos.uai.cl (A.D.C.); rax@uai.cl (R.A.C.)
2Faculty of Sciences, Engineering and Technology, Universidad Mayor, Santiago 7500994, Chile
3School of Engineering, Pontiﬁcia Universidad Católica de Chile, Santiago 7820436, Chile
4Data Observatory Foundation, Santiago 7941169, Chile
5Center of Applied Ecology and Sustainability (CAPES), Santiago 8331150, Chile
*Correspondence: gonzalo.ruz@uai.cl
Abstract: The prognostics and health management disciplines provide an efficient way to improve a system’s durability by making the most of its functional lifespan before a failure appears. Prognostics are performed to estimate the remaining useful life (RUL) of a system or subsystem. This estimate can be used as an input for decision-making within maintenance plans and procedures. This work focuses on prognostics by developing recurrent neural networks and a forecasting method called Prophet, and by measuring their performance in RUL estimation. We apply this approach to degradation signals, which do not need to be monotonic. Finally, we test our system using data from new-generation telescopes in real-world applications.
Keywords: prognostics; fault detection; recurrent neural networks; prophet
1. Introduction
Modern industry has evolved significantly in the past decades, building more complex systems with greater functionality. This evolution has added many sensors for better control, higher system reliability, and information availability. Given this improvement in data availability, new and more adequate maintenance policies can be developed [1]. Thus, maintenance policies have evolved from waiting to fix the system when a failure appears (known as reactive maintenance) to predictive maintenance, where intervention is scheduled with the information obtained from fault detection methods.
Various researchers confirm that sensors play a crucial role in preserving the proper functioning of the system or subsystem [2,3], as they provide real-time information about the operating status, such as possible failure patterns, level of degradation, and abnormal states of operation. Taking this into account, various methodologies have been developed for fault detection [4], testability design for fault diagnosis [5,6], detection of fault and malfunction conditions using deep learning techniques [7,8], and test selection design for fault detection and isolation [9], to name a few. Most of them share the goal of increasing the reliability, availability, and performance of a system.
The two main extensions of predictive maintenance are Condition-Based Maintenance (CBM) and Prognostics and Health Management (PHM); both terms have been used as substitutes for predictive maintenance in the literature [10,11]. Jimenez et al. [11] aligned these terms by adopting predictive maintenance as the primary term for the maintenance strategy and CBM as an extension of predictive maintenance that adds alarms to warn when there is a fault in the system. Later, Vachtsevanos and Wang [12] introduced prognostics algorithms as tools for predicting the time-to-failure of components; from this insight emerged PHM [13] as an extension of CBM to improve the predictability and remaining useful life (RUL) estimation of a component after a fault appears. This information can then be used as an input for decision-making in maintenance scheduling [14].
It is necessary to highlight that fault detection and prognostics are not always exclusive.
Fault detection is usually an initial step in computing prognostics to estimate the future
behavior of the system or subsystem.
Generally, faults are generated by the degradation of the components that make up the system. Such degradation can be monitored through the signals collected from the sensors. Various types of degradation have been addressed in the literature. One of the most common is slow-decay degradation, which appears in different components: for example, an increase in the resistivity of fuses, or a reduction in the currents of frequency processors and in the mean resolution of a telescope’s camera. Considering these similarities, it is possible that an automatic fault detection framework that detects the degradation in a frequency processor could also effectively detect the degradation in the resolution of a camera, or vice versa. Similarly, it is possible that a good prediction of the RUL of a camera can be obtained using historical fault information from other components.
This work focuses on prognostics by developing recurrent neural networks (RNNs) and a forecasting method called Prophet, and by measuring their performance in RUL estimation. First, we apply this approach to degradation signals, which do not need to be monotonic, using the fault detection framework proposed in [15] with some improvements in the preprocessing and data cleaning step. Then, we apply our approach to similar degradation problems with different statistical characteristics.
Our research differs from other works in the scalability of the fault detection framework to other similar problems, which shows its effectiveness and robustness. In addition, the RNN models fitted with historical data of one type of fault to predict its RUL can also be used in other problems whose signals show similar degradation, such as the resolution of a telescope’s camera, which shows their power of generalization and their precision in RUL prediction.
Our work has the following contributions:
1. We improved the cleaning of spikes or possible outliers and the smoothing of time series in the data preprocessing step of the fault detection framework developed in [15], reducing the remaining noise level while maintaining relevant characteristics such as trends and stationarity.
2. We show that the fault detection framework in [15], together with our preprocessing method, is more robust and can be transferred to another problem with similar degradation but different statistical characteristics.
3. We built a strategy that clusters run-to-failure critical segments to define an appropriate failure threshold that improves the RUL estimation. Moreover, using this strategy, we predict the RUL of another problem with similar degradation.
The rest of this article is organized as follows. First, the background related to this work is presented in Section 2. In Section 3, we present the proposed method for data preprocessing (cleaning of spikes or outlier points and time series smoothing) and the prognostic process for RUL estimation. In Section 4, the details of the applications are given, as well as the results. Section 5 presents a discussion of the results and performances obtained for each application. Finally, the conclusion of the work is presented in Section 6 and future work in Section 7.
2. Background
The following subsections present a brief description of fault detection, prognostics,
performance measurements, and a method used for RUL estimation.
2.1. Fault Detection
Most modern industries are equipped with several sensors collecting process-related data to monitor the status of the process and discover faults arising in the system. Fault detection systems were developed around the 1970s [4,16] as an essential part of automatic control systems to maintain desirable performance. Fault detection can be defined as the process of determining whether a system or subsystem has entered a mode different from the normal operating condition [15]. A fault may appear at an unknown time, and its speed of appearance may vary [17,18].
In the literature, the wide variety of methods used for fault detection can be classified into signal processing approaches [18–23], model-based approaches [23–26], knowledge-based approaches [18,27–29], and data-driven approaches [18,23,30–36]. With advances in technology and computing methods, data-driven approaches have gained attention in the last decades; in these approaches, the data is expected to drive the identification of normal and faulty modes of operation. See [4] for a general description of fault detection and diagnosis systems.
Some recent developments have addressed this issue with deep learning to increase accuracy in fault detection. For example, Yao Li [37] presented a branched Long Short-Term Memory (LSTM) model with an attention mechanism to discriminate multiple states of a system, showing high prediction performance based on the F1-score metric. On the other hand, Liu et al. [38] proposed a failure prediction strategy that uses the LSTM model in a multi-stage regression model to predict the trend; the trend is then used to classify the level of degradation by similarity with established failure profiles, yielding more precise estimates.
Zhu et al. [39] addressed the problem of classifying multiple states of a system with a convolutional neural network (CNN) structure, specifically LeNet, optimized with Particle Swarm Optimization (PSO). Their results showed that this strategy achieves better performance and greater robustness compared to LeNet without PSO, VGG11, VGG13, VGG16, AlexNet, and GoogleNet. Another CNN-based approach is presented by Jana et al. [40], who use a suite of Convolutional Autoencoder (CAE) networks to detect each type of failure. Its design allows addressing multiple failures in multiple sensors, obtaining an accuracy of around 99%.
Among the approaches that are not fully supervised, Long et al. [41] developed a Self-Adaptation Graph Attention Network, one of the first models of this type able to use a few-shot learning approach, in which abundant data is available but very little is labeled, and to incorporate failure cases that rarely occur. Their results showed better accuracy compared to other models.
From an application perspective, fault detection systems have been developed in many areas such as rolling bearings, machines, industrial systems, mechatronic systems, industrial cyber-physical systems, and industrial-scale telescopes, to name a few [15,23–26,33–35,37,38,41,42].
Each of these works describes advantages and disadvantages of its methodology relative to the others. However, implementing fault detection methods in real industries remains difficult due to the properties of the data.
2.2. Prognostic
The prognosis task is mainly focused on estimating or predicting the RUL of a degrading system and reducing the system’s downtime [43]. Therefore, effective prognosis methods that anticipate the time of failure by estimating the RUL of a degrading system or subsystem are useful for decision-making in maintenance [44]. A failure refers to the event or inoperable behavior in which the system or subsystem does not perform correctly.
According to the literature, prognostics approaches can be classified into model-based approaches [18,45], hybrid approaches [18,46,47], and data-driven approaches [18,48–50]. Data-driven approaches offer some advantages over the other approaches, especially when obtaining large and reliable historical data is easier than constructing physical models that require a deeper understanding of the system degradation. They are also increasingly applied to industrial system prognostics [18,44]. Recent data-driven studies are divided into three branches: degradation state-based, regression-based, and pattern matching-based prognostics methods [51,52]. The first usually estimates the system’s health state and then uses a failure threshold to compute the RUL. The second predicts the evolution of a degradation signal, and the RUL is obtained when the prediction reaches the failure threshold. The last characterizes the signal and compares it with a run-to-failure repository to compute the RUL by similarity.
In recent years, various deep learning models have been introduced to address forecasting problems in RUL prediction. For example, Kang et al. [53] developed a multilayer perceptron (MLP) neural network model to predict the health index of a signal; this index is then used in a polynomial interpolation model to estimate the RUL. They indicate that their strategy outperforms direct prediction methods using Support Vector Regression (SVR), Linear Regression, and Random Forest. In an ensemble-type approach, Chen et al. [52] presented a hybrid method for RUL prediction using SVR and LSTM in which the results are functionally weighted; the method is more robust because it takes advantage of the benefits provided by both SVR and LSTM.
Among the most innovative methods, Ding and Jia [54] designed a convolutional Transformer network model that takes advantage of the attention mechanism and CNN to capture global information and local dependence of a signal, allowing the raw signal to be mapped directly to an estimated RUL and increasing effectiveness and accuracy in prediction. On the other hand, Zhang et al. [55] developed a model that evaluates health status and predicts RUL simultaneously using a dual-task network model based on the bidirectional gated recurrent unit (BiGRU) and multigate mixture-of-experts (MMoE), resulting in better performance than traditional popular models such as ANN, RNN, LSTM, CNN, GRU, and BiGRU, and in higher robustness.
Under the not fully supervised approach, He et al. [56] developed a semi-supervised model based on a generative adversarial network (GAN) in regression mode, considering historical data for prevention and scarce historical information for failures to predict the RUL. This approach avoids overfitting, thus increasing its power of generalization, and achieves satisfactory accuracy even when the amount of historical data per failure is limited.
To measure the performance of the prognosis method, Saxena et al. [57] introduced standard evaluation metrics and used them to evaluate several algorithms effectively in comparison with other conventional metrics. Such metrics can be used as a guideline for choosing one model over another. A description of these metrics can be found in Appendix A; they can be considered as the hierarchical validation approach for model selection described in [57]. The first step is to check whether a model gives a sufficient prognostic horizon (PH); if not, the other metrics are not computed. If the model passes the PH criterion, the next step is to compute the α–λ accuracy, which imposes the stricter requirement of staying within a converging cone of error margin as the system approaches End-of-Life (EoL). If this criterion is also met, we can quantify how well the method does by computing the accuracy levels relative to the actual RUL and, finally, measure how fast the method converges. This work focuses on the first two metrics since they provide a meaningful measure of the accuracy of the model in RUL estimation.
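The two metrics used in this work can be sketched as follows. This is a minimal illustration under stated assumptions, not the definitions from Appendix A: the prognostic horizon is taken as EoL minus the first time at which the predicted RUL enters a constant band of half-width α·EoL around the true RUL, and the α–λ accuracy is taken as a point-wise check that the prediction lies inside the cone (1 ± α) times the true RUL. The function names and the example values are hypothetical.

```python
import numpy as np

def prognostic_horizon(times, rul_pred, rul_true, eol, alpha=0.25):
    """EoL minus the first time the prediction lies inside the constant band
    rul_true +/- alpha * eol (simplified reading; None if it never does)."""
    times = np.asarray(times, dtype=float)
    err = np.abs(np.asarray(rul_pred, dtype=float) - np.asarray(rul_true, dtype=float))
    inside = err <= alpha * eol
    if not inside.any():
        return None
    first_time = times[np.argmax(inside)]  # argmax returns the first True
    return eol - first_time

def alpha_lambda_accuracy(rul_pred, rul_true, alpha=0.25):
    """Boolean mask: True where the prediction lies inside the shrinking cone
    (1 - alpha) * rul_true <= rul_pred <= (1 + alpha) * rul_true."""
    rul_pred = np.asarray(rul_pred, dtype=float)
    rul_true = np.asarray(rul_true, dtype=float)
    return (rul_pred >= (1 - alpha) * rul_true) & (rul_pred <= (1 + alpha) * rul_true)

# Hypothetical example: EoL occurs 500 days after the first prediction.
times = np.array([0, 100, 200, 300, 400])
rul_true = 500 - times
rul_pred = np.array([300, 250, 260, 190, 95])

ph = prognostic_horizon(times, rul_pred, rul_true, eol=500)
cone = alpha_lambda_accuracy(rul_pred, rul_true)
```

Because the cone narrows as the true RUL shrinks, a model can satisfy the prognostic horizon criterion early and still fail the α–λ check close to EoL.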
2.3. Recurrent Neural Networks (RNNs)
Among the data-driven techniques used for prognostics, RNNs have been widely studied in recent years and are one of the most powerful tools, as they can model significantly nonlinear dynamical time series. Their large dynamic memory preserves the temporal dynamics of complex sequential information, and they have been used with success in several prognostic applications [49]. Three types of RNN are chosen in this work to measure the performance of RUL estimation in the three problems described in Section 4: Echo State Networks (ESNs), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU). A description of these RNNs appears in Appendix B.
2.4. Prophet Model
The Prophet model was developed by Sean Taylor and Benjamin Letham [58] in 2018 to produce more confident forecasts. Its methodology uses a decomposable time series model with three main components: trend, seasonality, and holidays, which lets one look at each component of the forecast separately. These components are combined in an additive model of the following form:

$$y(t) = g(t) + s(t) + h(t) + e(t), \tag{1}$$

where $g(t)$ is the trend function that represents the non-periodic changes of the time series, $s(t)$ describes the periodic changes (daily, weekly, and yearly seasonality), $h(t)$ defines the effects of holidays that occur on potentially irregular calendar schedules over one or more days, and $e(t)$ represents the error term of any idiosyncratic changes which are not accommodated by the model. This method allows the analyst to make different assumptions about the trend, seasonality, and holidays if necessary, and the parameters of the model are easy to interpret.
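To make the additive structure of Equation (1) concrete, the following sketch composes a synthetic series from the four components. It does not use the Prophet library; the linear trend, the single sinusoidal yearly term, the holiday effect, and all numeric values are illustrative assumptions.

```python
import numpy as np

t = np.arange(730)  # two years of daily time steps (illustrative)

def trend(t, slope=-0.01, intercept=100.0):
    """g(t): non-periodic change, here a simple linear decay (assumption)."""
    return intercept + slope * t

def seasonality(t, period=365.25, amplitude=2.0):
    """s(t): periodic change, here a single yearly sinusoid (assumption)."""
    return amplitude * np.sin(2 * np.pi * t / period)

def holidays(t, holiday_days=(0, 365), effect=-5.0):
    """h(t): a fixed offset on hypothetical holiday days, zero elsewhere."""
    return np.where(np.isin(t, holiday_days), effect, 0.0)

rng = np.random.default_rng(0)
g, s, h = trend(t), seasonality(t), holidays(t)
e = rng.normal(0.0, 0.5, size=t.shape)  # e(t): idiosyncratic noise

y = g + s + h + e  # additive model of Equation (1)
```

Since the model is additive, subtracting any fitted components from the series leaves the remaining ones, which is what makes each component inspectable on its own.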
3. Methodology
3.1. PreProcessing Data
The data or signals collected from a system are, in most cases, noisy, and some outliers or spikes might be present. So, it is necessary to preprocess each signal before feeding it to the forecasting model. This process is shown in Figure 1, and it consists of the following steps:
[Figure 1 flow: Raw data (input) → Spikes cleaning → Double exponential smoothing → Convolutional smoothing → Smoothed data (output)]
Figure 1. Preprocessing flow chart.
1. Spikes cleaning: this step clears possible outliers and spike points by comparing time series values with the values of their preceding time window, identifying a time point as anomalous if the change of value from its preceding average or median is anomalously large.
An advantage of this outlier reduction strategy is that it considers the local dynamics of the signal through time windows. It therefore identifies as outliers the samples that fall outside the local range, which reduces the number of normal samples wrongly identified as outliers, as can happen with traditional methods that depend on the global mean and standard deviation. This method is implemented in the ADTK library [59].
2. Double exponential smoothing: this filter [26,60–64] is commonly used for forecasting in time series, but it can also be used for noise reduction. It is particularly useful for smoothing the behavior of a time series while preserving the trend and losing almost no information about the dynamics of the series. Also, the model is simple to implement and depends on two main parameters. For more details, see [15].
3. Convolutional smoothing: this step applies the Fourier transform with a fixed window size to smooth the signal while maintaining the trend. In other words, it applies a centered weighted moving average to the signal, which reduces short-term fluctuations and highlights long-term trends. It is implemented in the TSmoothie library [65].
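The three steps above can be sketched with NumPy alone. This is a simplified stand-in, not the ADTK or TSmoothie implementations: spikes are detected with Tukey-style fences on the jump from the median of the preceding window, double exponential smoothing follows Holt's linear recursion, and the convolutional step is a centered Hann-weighted moving average. The function names, window sizes, and parameter values are assumptions chosen for illustration.

```python
import numpy as np

def clean_spikes(x, window=5, c=3.0):
    """Replace samples whose jump from the median of the preceding window
    is anomalously large (outside fences of c * IQR on the jumps)."""
    x = np.asarray(x, dtype=float)
    ref = x.copy()
    for i in range(window, len(x)):
        ref[i] = np.median(x[i - window:i])  # local reference level
    jump = x - ref                            # zero for the first samples
    q1, q3 = np.percentile(jump[window:], [25, 75])
    iqr = q3 - q1
    is_spike = (jump < q1 - c * iqr) | (jump > q3 + c * iqr)
    is_spike[:window] = False                 # no preceding window there
    return np.where(is_spike, ref, x)

def double_exponential_smoothing(x, alpha=0.3, beta=0.1):
    """Holt's linear smoothing; alpha and beta are the two main parameters."""
    x = np.asarray(x, dtype=float)
    level, slope = x[0], x[1] - x[0]
    out = np.empty_like(x)
    out[0] = x[0]
    for i in range(1, len(x)):
        prev_level = level
        level = alpha * x[i] + (1 - alpha) * (level + slope)
        slope = beta * (level - prev_level) + (1 - beta) * slope
        out[i] = level
    return out

def convolutional_smoothing(x, window_len=11):
    """Centered weighted moving average with a Hann window (odd length)."""
    x = np.asarray(x, dtype=float)
    w = np.hanning(window_len + 2)[1:-1]      # drop the zero end points
    w /= w.sum()
    padded = np.pad(x, window_len // 2, mode="edge")
    return np.convolve(padded, w, mode="valid")

def preprocess(x):
    """Spikes cleaning, then double exponential, then convolutional smoothing."""
    return convolutional_smoothing(double_exponential_smoothing(clean_spikes(x)))

# Synthetic slow-decay signal with noise and two injected spikes.
rng = np.random.default_rng(0)
t = np.arange(200)
base = 100.0 - 0.05 * t
raw = base + rng.normal(0.0, 0.2, size=t.size)
raw[50] += 30.0
raw[120] -= 25.0

cleaned = clean_spikes(raw)
smoothed = preprocess(raw)
```

Running the smoothing steps after the spike cleaning matters in this sketch: applied to the raw signal, both smoothers would spread each spike over neighboring samples instead of removing it.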
Each of the methods that make up the preprocessing has strengths and weaknesses. To see their independent effects, each method was applied to a signal that presented outliers and a high level of noise, as shown in Figure 2.
Figure 2a shows the effect of the method in Step 1: it reduces the large jumps that are considered outliers, but some outliers with minor jumps remain. The noise reduction or smoothing methods in Steps 2 and 3 introduce artifacts in the signal dynamics due to the outliers, and the effects of these artifacts are unknown, as shown in Figure 2b,c.
For this reason, we combine the methods to use the advantages offered by each one: we first reduce large-jump outliers, then apply a noise reduction strategy that also reduces minor-jump outliers, and finally reduce possible remaining artifacts with a smoothing procedure, as presented in the preprocessing scheme of Figure 1. The effect of this combination is shown in Figure 2d, where the resulting signal has smoother dynamics and preserves the trend of the original signal.
[Figure 2 panels (a)–(d): x-axis: date (2002–2007); y-axis: Resolutionmedia [R] (60,000–95,000). Legends: (a) Raw, No outliers, Remaining outliers; (b) Raw, DES, Artifact; (c) Raw, Convolutional, Artifact; (d) Raw, Combined.]
Figure 2. Application of each method separately to the raw signal. (a) Outliers and spikes cleaning. (b) Double Exponential Smoothing. (c) Convolutional smoothing. (d) Proposed preprocessing method.
3.2. Run-to-Failure Critical Segments Clustering
The increase in processor speed, sensor monitoring, and the development of storage technologies allow real-world applications to easily record data that changes over time in the components of a system or subsystem [66]. It is necessary to highlight that components used in different environments reach different degradation levels, even for one type of component. Therefore, the failure threshold can be different in each situation. However, historical run-to-failure signals can be clustered so that the signals in a cluster behave similarly; thus, it is possible to define a failure threshold based on the signals that belong to a cluster. In other words, a failure threshold A can be defined for cluster A, a failure threshold B for cluster B, and so on.
Our clustering scheme does not consider the entire signal, from the start of operation until EoL; instead, we use the critical segment of the signal. We define the critical segment of a signal as the segment from where the degradation begins until EoL. From these critical segments, we build clusters so that each cluster contains signals with a relatively similar degradation level.
Clustering by critical segments allows us to define the different failure thresholds easily: for each cluster, we define an appropriate failure threshold based on the critical segments that belong to it. To increase effectiveness, each critical segment is centered with its own standard normal condition value before the clustering process, i.e., if $S$ is the complete signal and $S'$ is the critical segment, then $S'$ is centered by $S' - S_0 + k$, where $k$ is the standard normal condition value and $S_0$ is the first sample of $S$. Lastly, a threshold can be defined as the minimum degradation level reached by the critical signals in the cluster.
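A minimal sketch of the centering and of a per-cluster threshold follows. It rests on two assumptions that the text does not fix: degradation lowers the signal value, and "minimum degradation level reached" is read as the level that every critical segment in the cluster reaches, that is, the shallowest of the deepest points. Cluster labels are taken as given (the clustering algorithm itself is not sketched), and the function names and data are hypothetical.

```python
import numpy as np

def center_critical_segment(signal, start, k):
    """Return S' - S_0 + k, where S' = signal[start:] is the critical segment,
    S_0 is the first sample of the complete signal, and k is the standard
    normal condition value."""
    signal = np.asarray(signal, dtype=float)
    return signal[start:] - signal[0] + k

def cluster_thresholds(centered_segments, labels):
    """Per-cluster failure threshold under the assumption of decaying signals:
    the level reached by every segment of the cluster, i.e. the maximum over
    segments of each segment's minimum value."""
    thresholds = {}
    for seg, lab in zip(centered_segments, labels):
        deepest = float(np.min(seg))
        thresholds[lab] = max(deepest, thresholds.get(lab, -np.inf))
    return thresholds

# Hypothetical run-to-failure signals; degradation starts at index 2.
signals = [
    [10.0, 10.0, 9.0, 8.0, 6.0],
    [20.0, 20.0, 19.0, 17.0, 15.0],
    [5.0, 5.0, 4.0, 3.5],
]
labels = ["A", "A", "B"]  # cluster assignment assumed to be known

centered = [center_critical_segment(s, start=2, k=0.0) for s in signals]
thresholds = cluster_thresholds(centered, labels)
```

Centering on a common normal level removes the offset between signals, so the two cluster A signals above become comparable even though they operate around 10 and 20.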
3.3. Prognostic Method
Two strategies are proposed to estimate the RUL of components. For both strategies, we consider the fault date as the point in time $t_P$ at which the fault prediction starts [67]. We also assume that the collected data consists of daily samples, which were processed using the approach presented in Section 3.1. A description of these strategies follows.
3.3.1. Strategy A
This strategy is based on a regression model, similar to the prognostic approach proposed in [48]. In this strategy, we define a time window of $d$ days in which we analyze the data. Note that the number of samples in the time window can vary since data is not assumed to be available every day. Figure 3a shows an example with missing data, whereas Figure 3b shows an example where data is available throughout the whole time window.
[Figure 3 panels (a) and (b)]
Figure 3. Time-window examples. (a) Time-window with missing values. (b) Time-window without missing values.
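The difference between a window of $d$ days and a window of $d$ samples can be illustrated with a short sketch. It assumes a half-open window (end − d days, end], so that a complete window holds exactly $d$ daily samples; the function name and the synthetic dates are hypothetical.

```python
import numpy as np

def select_time_window(dates, values, end, d=365):
    """Return the samples whose date falls in the half-open window
    (end - d days, end]. The number of samples may be smaller than d
    when some days are missing."""
    dates = np.asarray(dates, dtype="datetime64[D]")
    values = np.asarray(values, dtype=float)
    end = np.datetime64(end, "D")
    start = end - np.timedelta64(d, "D")
    mask = (dates > start) & (dates <= end)
    return dates[mask], values[mask]

# Synthetic daily record with a gap of 25 missing days.
all_days = np.arange("2014-01-01", "2016-12-31", dtype="datetime64[D]")
keep = np.ones(all_days.size, dtype=bool)
keep[400:425] = False                      # simulate missing data
dates = all_days[keep]
values = np.linspace(0.58, 0.53, dates.size)

win_dates, win_values = select_time_window(dates, values, "2015-11-18", d=365)
full_dates, _ = select_time_window(dates, values, "2016-09-13", d=365)
```

In this synthetic record, the first window contains 340 samples because 25 days are missing, while the second window is complete and contains 365 samples.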
The data within the time window is used to train the model, which is then used to compute a forecast for the next $n$ days, following the structure shown in Figure 4. In this approach, $X(1:t)$ represents the first $t$ samples of $X$, the data used as input to train the model. The model then estimates $y(t+1)$, and the current window is updated by dropping the oldest value and adding the newly calculated one: $[X(2:t), y(t+1)]$. The forecasting process is similar to the P-method developed in [48].
Using the previous forecast, we verify if the failure threshold is crossed within the
time window, calculating the RUL if this occurs. This procedure is applied in a rolling
window fashion whenever new data arrives.
[Figure 4 diagram: Data X(1:t) → Training → Model → Forecast → y(t+1)]
Figure 4. Model training and forecast structure.
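The recursive forecast and the threshold check can be sketched as below. The one-step predictor is a plain least-squares line fitted to the current window, used only as a stand-in for the ESN, LSTM, GRU, or Prophet models; the function names, the decreasing-signal convention, and the numbers are assumptions for illustration.

```python
import numpy as np

def linear_next(window):
    """Stand-in one-step model: fit a line to the window and extrapolate."""
    t = np.arange(len(window))
    slope, intercept = np.polyfit(t, window, 1)
    return slope * len(window) + intercept

def recursive_forecast(window, predict_next, n_steps):
    """Forecast n_steps ahead; after each step the window drops its oldest
    value and appends the new estimate: [X(2:t), y(t+1)]."""
    window = [float(v) for v in window]
    forecast = []
    for _ in range(n_steps):
        y_next = float(predict_next(np.asarray(window)))
        forecast.append(y_next)
        window = window[1:] + [y_next]
    return np.asarray(forecast)

def estimate_rul(forecast, threshold, decreasing=True):
    """Number of steps after t_P until the forecast crosses the failure
    threshold, or None if it is not crossed within the forecast."""
    crossed = forecast <= threshold if decreasing else forecast >= threshold
    if not crossed.any():
        return None
    return int(np.argmax(crossed)) + 1

# Hypothetical degrading signal: 50 daily samples, failure threshold 60.2.
window = 100.0 - 0.5 * np.arange(50)
forecast = recursive_forecast(window, linear_next, n_steps=60)
rul = estimate_rul(forecast, threshold=60.2)
short = estimate_rul(recursive_forecast(window, linear_next, n_steps=10), 60.2)
```

Returning None when the threshold is not reached keeps "no failure within the forecast horizon" distinct from a numeric RUL, which matches computing the RUL only if the crossing occurs.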
Figure 5 shows an application example using a time window of 365 days. The result of the first iteration is shown in Figure 5a, with the time window between 18 November 2014 and 18 November 2015. Since some data is missing, we have 340 samples in this case. In this step, our approach estimates the RUL to be 384 days. Next, Figure 5b shows the results of the second iteration, where the time window lies between 14 September 2015 and 13 September 2016, containing 365 samples. In this step, the RUL is estimated to be 181 days. The black line represents the ground truth in both figures, and the blue line represents the obtained forecast. The green dashed line is $t_P$, the red dashed one is the failure threshold, and the RUL value is computed as the time between $t_P$ and the moment when the forecast crosses the failure threshold. Finally, the whole process is shown in the diagram in Figure 6.
[Figure 5 panels (a) and (b): x-axis: Date (January 2015 to 2018); y-axis: current [A] (0.53–0.58). Legend: RUL, Ground truth, forecast, Threshold, tP, Time-window.]
Figure 5. An example of RUL estimation using a time-window size of 365 days. (a) Time-window samples until fault date $t_P$. (b) Time-window shifted by 300 days.
[Figure 6 flow chart nodes: Raw data (input), Preprocessing, Time-window data, Model training, Forecast, Compute RUL (with Catastrophic threshold and Fault date), decision "Is there new data available?" (Yes/No), Update raw data, Wait for new available data.]
Figure 6. Prognostic process: strategy A.
3.3.2. Strategy B
Considering that one type of component could operate in vastly different environments, its degradation levels, and thus its failure thresholds, could be very different. For this reason, we adapt the previous strategy to account for this difference by combining pattern matching-based and regression-based methods. This technique consists of two steps:
• Cluster-Model stage: this stage uses the clustering described in Section 3.2 so that a regression model can be fitted for each cluster. The training data is defined by the critical signals, limited by the failure threshold defined for the cluster, together with their residual RUL. That is, for each critical signal $S$ with length $l(S)$ in cluster $C$, let $S' \subset S$ be such that $S'_0 = S_0$ and $S'_{l(S')} \approx \text{failure\_threshold}$. Then, each sample $S'_i \in S'$ has a residual RUL

$$r_i := \text{Normalize}(S'_i) \cdot l(S'),$$

where $l(S)$ is the length of the signal $S$,

$$\text{Normalize}(S_i) = \frac{S_i - \min(S)}{\max(S) - \min(S)},$$

and $S_0$ and $S'_0$ are the first samples of $S$ and $S'$, respectively.
• Prediction stage: this stage predicts the RUL of a component whose signal has been diagnosed as faulty, which means that a degradation behavior has started. In this step, we take a segment of the signal after a fault has been detected; it is preprocessed and submitted to a classifier that identifies the cluster to which it belongs and selects the related regression model, already fitted in the Cluster-Model stage, to predict the RUL. This procedure is executed whenever new samples are available.
The classifier matches the segment against all run-to-failure critical segments using Minimum Variance Matching (MVM) [68–70], a popular method for elastic matching of two sequences of different lengths that maps the problem of finding the best matching subsequence to the problem of finding the shortest path in a directed acyclic graph, which provides the minimum distance. The classification assigns the cluster by a voting criterion, i.e., the cluster with the largest number of signals close to the given segment is selected. A flow chart of this prognostic process is shown in Figure 7.
Figure 7. Prognostic process: strategy B.
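The two stages of strategy B can be sketched as follows. The residual RUL labels follow the expression $r_i = \text{Normalize}(S'_i) \cdot l(S')$, with the minimum and maximum taken over the truncated segment (an assumption). The classifier does not implement MVM: it swaps in a much simpler stand-in distance, the smallest mean squared difference between the query and any contiguous subsequence of the same length, followed by a majority vote among the k nearest segments. All names and data are hypothetical.

```python
import numpy as np
from collections import Counter

def normalize(s):
    """Min-max normalization of a segment to [0, 1]."""
    s = np.asarray(s, dtype=float)
    return (s - s.min()) / (s.max() - s.min())

def residual_rul(critical_segment):
    """Residual RUL label for each sample: Normalize(S'_i) * l(S')."""
    s = np.asarray(critical_segment, dtype=float)
    return normalize(s) * len(s)

def sliding_distance(query, reference):
    """Stand-in for MVM: smallest mean squared difference between the shorter
    sequence and any contiguous window of the longer one."""
    query = np.asarray(query, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if len(query) > len(reference):
        query, reference = reference, query
    windows = np.lib.stride_tricks.sliding_window_view(reference, len(query))
    return float(np.min(np.mean((windows - query) ** 2, axis=1)))

def classify_by_vote(query, segments, labels, k=3):
    """Assign the cluster that holds most of the k nearest segments."""
    distances = [sliding_distance(query, seg) for seg in segments]
    nearest = np.argsort(distances)[:k]
    votes = Counter(labels[int(i)] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical centered critical segments in two clusters.
slow = [np.linspace(0.0, -2.0, 50) + o for o in (0.0, 0.1, -0.1)]
fast = [np.linspace(0.0, -10.0, 50) + o for o in (0.0, 0.1, -0.1)]
segments = slow + fast
labels = ["slow"] * 3 + ["fast"] * 3

labels_rul = residual_rul([10.0, 8.0, 6.0, 4.0, 2.0])
predicted = classify_by_vote(fast[0][20:30], segments, labels)
```

Once the cluster is known, the regression model fitted for that cluster on pairs of samples and residual RUL labels would produce the RUL estimate; that model is left out of this sketch.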
The principal models used in this work for training and computing forecasts or RUL are those mentioned in Sections 2.3 and 2.4: ESN, LSTM, GRU, and Prophet (only for Prognostic Strategy A). To measure how well each model estimates the RUL, we use the prognostic horizon and the α–λ accuracy.
4. Application Setting
4.1. Crack Growth
The description of crack propagation is one of the most important components in the analysis of the life span of structural components, but it may require time and expense to investigate experimentally [71]. Hence, estimating the crack propagation and durability of a construction or structural component is useful for estimating its remaining life.
4.1.1. Problem Description
As described in [72–74], components subjected to fluctuating loads are found practically everywhere: vehicles and other machinery contain rotating axles and gears, pressure vessels and piping may be subjected to pressure fluctuations or repeated temperature changes, and structural members in bridges are subjected to traffic and wind loads, among other applications. If a component is subjected to a fluctuating load of a certain magnitude for a sufficient amount of time, small cracks may form in the material. Over time, the cracks propagate up to the point where the remaining cross-section of the component cannot carry the load, at which point the component suffers sudden fracture. This process is called fatigue and is one of the main causes of failures in structural and mechanical components.
The common Paris–Erdogan model [72] is adopted to describe the evolution of the crack length $x$ as a function of the load cycles $N$, summarized by the following discrete-time model:

$$x_{t+1} = x_t + C e^{\omega_t} \left(\beta \sqrt{x_t}\right)^n, \tag{2}$$

where $\omega_t \sim \mathcal{N}(0, \sigma_w^2)$ is a random variable depicting white Gaussian noise, and $C$, $\beta$, and $n$ are fixed constants. A set of 30 crack growth trajectories generated using Equation (2) is illustrated in Figure 8; each trajectory consists of 900 days of samples.
[Figure 8 plot: Crack growth; x-axis: date (1 January 2000 to 1 April 2002); y-axis: Crack length [mm] (0–350).]
Figure 8. 30 crack growth trajectories.
4.1.2. Prognostic
For practical purposes, we choose one trajectory from Figure 8 and estimate its RUL to
measure the performance of both strategies.
• Strategy A: following the methodology in Section 3.3.1, we estimate the RUL using a time window of 1 year and a forecast of 2 years, shifting the time window by 15 days in every iteration.
The results are shown in Figure 9. In the prognostic horizon, Figure 9b, we can see that all the models underestimate the RUL, with some exceptions such as the Dense neural network model. The neural network models estimate the RUL poorly and mostly fall outside the confidence interval. Only the Prophet model is relatively close to the ground truth RUL. Concerning the α–λ accuracy, only Prophet has a segment close to the ground truth, but it then falls outside the confidence interval, underestimating the RUL.
Figure 9. The crack growth prognostic. (a) Testing: a crack growth trajectory (crack length [mm] versus date, n_samples = 900, with the failure threshold and fault date marked). (b) Strategy A: the prognostic horizon metric. (c) Strategy A: the α–λ accuracy metric. (d) Strategy B: the prognostic horizon metric. (e) Strategy B: the α–λ accuracy metric. Panels (b)–(e) plot RUL [days] versus day for the ground truth and each model, with the bound parameter set to 0.25.
• Strategy B: when applying the technique proposed in Section 3.3.2 to this problem, we simplify some steps of the process. Given that all the degradation trajectories are similar, we can assume only one cluster, to which the classifier assigns every signal. Hence, the Cluster-Model stage has only one model, which is used to predict the RUL. Basically, this scheme becomes a simple regression model fitted with all the historical critical-segment trajectories, each limited by its failure threshold, together with their residual RUL. We use 100 trajectories generated from Equation (2) as run-to-failure signals to fit the model.
The performances can be seen in Figure 9d,e. All the models fall inside the confidence interval in the prognostic horizon and get closer to the ground truth as they reach the EoL, as illustrated in Figure 9d. Similar behavior is obtained for the α–λ accuracy, as shown in Figure 9e. Only occasionally do some methods, e.g., LSTM and GRU, leave the confidence interval and then return to it, but these behaviors are acceptable.
The results indicate a large difference in the estimation of the RUL between the two strategies. The models that use strategy A are more sensitive to small variations in the signal, which makes the EoL estimate highly variable and, most critically, they are unaware of the variation that the signal may present in the future. On the other hand, the models that use strategy B take advantage of historical information on how the signal could evolve, which reduces the sensitivity to small disturbances and yields a more precise mapping to the RUL.
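To make the rolling procedure of strategy A concrete, the sketch below refits a forecaster on a sliding 1-year window, projects it 2 years ahead, and reports the first forecast threshold crossing as the RUL. It is only an illustration under stated assumptions: a least-squares straight line stands in for the forecasting models (Prophet or the neural networks), the degradation is assumed to rise toward the threshold, and all function names are hypothetical.

```python
def fit_line(ys):
    """Least-squares slope and intercept of ys over indices 0..len(ys)-1."""
    n = len(ys)
    mean_t = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_t


def rul_by_forecast(signal, threshold, window=365, horizon=730, step=15):
    """Return (day, rul) pairs; rul is None when no crossing is forecast."""
    estimates = []
    for end in range(window, len(signal) + 1, step):
        slope, intercept = fit_line(signal[end - window:end])
        rul = None
        for h in range(1, horizon + 1):
            # The last observed sample sits at window index (window - 1).
            if intercept + slope * (window - 1 + h) >= threshold:
                rul = h
                break
        estimates.append((end - 1, rul))
    return estimates


# Synthetic linear degradation that reaches the threshold at day 1000.
demo = [float(t) for t in range(600)]
estimates = rul_by_forecast(demo, threshold=999.5)
```

With a noisy signal, the fitted trend changes at every 15-day update, which is one way to see why the EoL estimate of a pure forecasting approach can vary so much between iterations.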
4.2. Intermediate Frequency Processor Degradation Problem
The Atacama Large Millimeter/submillimeter Array (ALMA) is a revolutionary instrument operating in the very thin and dry air of northern Chile's Atacama desert at an altitude of 5200 m above sea level. ALMA is one of the first industrial-scale new generation telescopes, composed of an array of 66 high-precision antennas working together at millimeter and submillimeter wavelengths, corresponding to frequencies from about 30 to 950 GHz. Adding to the observatory's complexity, these 7 m and 12 m parabolic antennas, with extremely precise surfaces, can be moved around at the high altitude of the Chajnantor plateau to provide different array configurations, ranging in size from about 150 m up to 20 km. The ALMA Observatory is an international partnership between Europe, North America, and Japan, in cooperation with the Republic of Chile [75].
4.2.1. Problem Description
The Intermediate Frequency Processor (IFP) of the antennas of the ALMA telescope, as described in [25], is a critical component responsible for the second down-conversion, signal filtering, and amplification of the total power measurement of sidebands and basebands. This subsystem allows for effective communication of the captured data to the central correlator for processing, thus making it a central and critical component of each antenna. It is necessary to highlight that there are 2 IFPs per antenna, one for each polarization, and each IFP has sensors measuring currents at three different voltage levels: 6.5, 8, and 10 volts. For 6.5 and 8 volts, currents are read for four different basebands: A, B, C, and D, whereas, for 10 volts, the currents of sidebands USB and LSB and of switch matrices SW1 and SW2 are read. Each current is sampled every 10 min.
One of the diagnosed degradation problems that occur in the IFP module is due to hydrogen poisoning caused by hydrogen outgassing in tightly sealed packages [25]. This degradation can be tracked by monitoring the current signals collected from each module.
4.2.2. Prognostic
To measure the performance of both strategies, we selected one of the signals with a
fault detected in [15], and applied the data preprocessing. This is shown in Figure 10a.
• Strategy A: the performances of this method are illustrated in Figure 10b,c, in which we can see that none of these models gives good predictions of the RUL, not even when approaching the EoL.
Figure 10. The IFP prognostic. (a) Testing: a signal from an IFP (Antenna 13, Polarization 1, 8 Volts, Channel BBB; current [A] versus date, July 2012 to October 2014, with the failure threshold and fault date marked). (b) Strategy A: the prognostic horizon metric. (c) Strategy A: the α–λ accuracy metric. (d) Strategy B: the prognostic horizon metric. (e) Strategy B: the α–λ accuracy metric. Panels (b)–(e) plot RUL [days] versus day for the ground truth and each model, with the bound parameter set to 0.25.
• Strategy B: in the historical run-to-failure signals, different degradation levels appear in the currents of each IFP voltage. In this application, the signals of each voltage are grouped into a few clusters so that the signals in each cluster have similar degradation levels, which makes it easier to define an appropriate failure threshold for each cluster, as described in Section 3.2. This defines a total of 5 clusters for this problem: 2 clusters for 6.5 volts, 1 cluster for 8 volts, and 2 clusters for 10 volts. They are shown in Figure 11, where each cluster has its corresponding failure threshold value: 0.566 for cluster 1, 0.588 for cluster 2, 0.127 for cluster 3, 0.246 for cluster 4, and 0.275 for cluster 5. Equivalently, these thresholds correspond to degradation levels of 5.7%, 2%, 36%, 18%, and 8.3%, respectively. These clusters are used to classify the newly arriving preprocessed signal in order to select the appropriate failure threshold and predict the RUL.
The cluster generation criterion focuses mainly on the Minimum Variance Matching (MVM) similarity metric, which is obtained by solving a shortest path (SP) problem that measures the distance between pairs of signals. The principle is to fix a signal as a centroid and compute its distances to the other signals; these distances are ordered and, using the same fundamentals as the elbow method, a group of signals is selected to form a cluster $C_1$ and the rest form another group $C_2$. This process is repeated for the group $C_2$ to verify whether its signals are similar or whether another cluster is generated, and so on. Repeated runs were made, and in most cases 5 clusters were enough to separate these signals.
The performances under both metrics, Figure 10d,e, show that almost all models give relatively good predictions of the RUL that fall inside the confidence interval. Only ESN has some irregularities, but these underestimations are acceptable. The Dense neural network model slightly outperforms the others when it gets close to the EoL.
Analyzing the results, the models that used strategy A showed a problem similar to the one observed in the Crack Growth application in Section 4.1.2: the models remain sensitive to small variations, which generates great variability in the estimation of the EoL and therefore affects the prediction of the RUL.
Strategy B can suffer a related problem. If the historical run-to-failure signals vary greatly in their degradation behavior, unlike those used in Section 4.1.2, which are quite similar, this variation in the degradation level of the historical signals could impair the models' prediction of the RUL.
To avoid this, the signals were grouped by similar degradation level and each group was addressed separately. As a consequence, the different models manage to predict the RUL close to the real value.
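The recursive splitting described above (fix a centroid, order the distances, cut at the elbow, repeat on the remainder) can be sketched as follows. This is only a sketch under loud assumptions: a mean absolute difference stands in for the MVM distance, the elbow is approximated by the largest jump in the ordered distances, and the `min_gap` stopping parameter is hypothetical.

```python
def mean_abs_distance(a, b):
    """Stand-in distance (NOT MVM): mean absolute difference of equal-length signals."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def split_once(signals, distance, min_gap):
    """Split signals into (near, far) around the first signal, used as centroid."""
    centroid, others = signals[0], signals[1:]
    ranked = sorted(others, key=lambda s: distance(centroid, s))
    dists = [0.0] + [distance(centroid, s) for s in ranked]  # centroid at distance 0
    gaps = [b - a for a, b in zip(dists, dists[1:])]
    if not gaps or max(gaps) < min_gap:
        return signals, []  # all signals are similar: a single cluster
    cut = gaps.index(max(gaps))  # elbow proxy: largest jump in ordered distances
    return [centroid] + ranked[:cut], ranked[cut:]


def cluster(signals, distance=mean_abs_distance, min_gap=0.05):
    """Repeat the split on the remaining group until no clear jump is left."""
    clusters, remaining = [], list(signals)
    while remaining:
        near, remaining = split_once(remaining, distance, min_gap)
        clusters.append(near)
    return clusters


# Five flat toy signals around two current levels (hypothetical values).
demo = [[0.600] * 5, [0.601] * 5, [0.599] * 5, [0.300] * 5, [0.301] * 5]
groups = cluster(demo)
```

Swapping `mean_abs_distance` for an MVM implementation would leave the splitting logic unchanged, since the sketch only relies on pairwise distances.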
Figure 11. The IFP signals clustering (current [A] versus day); the red dashed lines represent the failure threshold defined for each cluster, and the continuous lines are the critical segments extracted from the run-to-failure IFP signals. (a) Class 1: 6.5 Volts (degradation type 1), threshold 0.566. (b) Class 2: 6.5 Volts (degradation type 2), threshold 0.588. (c) Class 3: 8 Volts, threshold 0.127. (d) Class 4: 10 Volts (degradation type 1), threshold 0.246. (e) Class 5: 10 Volts (degradation type 2), threshold 0.275.
4.3. Validation in a Different Setting
To validate our approach, we considered testing this methodology in a very different
setting. In particular, we used measurements of camera resolution information from an
important optical telescope.
4.3.1. Problem Description
One of the problems presented in the studied instrument is the Teﬂon wear in the
lens support, increasing the humidity level, which affects the camera resolution. This
degradation can be tracked through measurements collected from the camera’s CCDs.
An example of degradation over 18 years is shown in Figure 12, where it can be seen that the signal is noisy and has several spike points (large downward jumps that may be outliers). Some corrective or maintenance actions were taken along these records (the time indexes of the upward jumps). Therefore, a fault detection process would be valuable for anticipating an unacceptable deviation from the fault-free behavior, followed by a prognostic process to compute the RUL of the component accurately.
Figure 12. Resolution media signal obtained from a CCD (Resolution media [R] versus date, 2002 to 2020).
4.3.2. Fault Detection
Recently, Cho et al. [15] tackled similar noisy degradation signals using a fault detection framework based on ESNs applied to the IFPs of the antennas of the ALMA observatory; the authors highlighted that the noise level in the data significantly affected the detection performance. Unlike the ALMA IFP data, the camera resolution data contain larger spikes that distort the signal dynamics even after double exponential smoothing. For this reason, it is necessary to adopt a mechanism that efficiently reduces spikes in the time series, acting as an outlier cleaning method in the preprocessing stage of the framework proposed in [15]. With this insight, the modified data preprocessing method described in Section 3.1 was developed. The results of applying the proposed data preprocessing method are shown in Figure 13, where the red signal represents the preprocessed signal, which maintains the trend of the raw signal.
Figure 13. Raw and preprocessed signal of the resolution media obtained from a CCD (Resolution media [R] versus date, 2002 to 2020).
Once the preprocessing stage is done, the fault detection process remains almost the same as in [15]. The result is shown in Figure 14. The vertical dashed red lines are the time indexes of detected faults, and the vertical dashed green lines are the time indexes where corrective or maintenance actions were performed.
Figure 14. Fault detection in the resolution media signal obtained from a CCD (Resolution media [R] versus date, 2002 to 2020). Dates annotated on the plot: 31 March 2007, 25 January 2014, 11 April 2017, 10 February 2017, 04 October 2017, 16 December 2004, 07 December 2010, 29 March 2015, 15 June 2016, and 18 January 2018.
It is necessary to highlight that the framework designed in [15] deals with current signals sampled every 10 min and achieves high performance on real data. With the modification in the preprocessing, the robustness of the framework increases, and it can be applied to the camera problem, whose resolution signals are sampled daily, with the same effectiveness in fault detection. This is justified because the degradation characteristics are similar to those of the signals used during the design of the method.
4.3.3. Prognostic
For the prognostic application to the camera resolution signal, we took the first segment of the trajectory until the first maintenance, dated 2007-03-31, as the test signal for RUL estimation, Figure 15a. The remaining segments can be processed similarly by applying the methodology described in Section 3.
• Strategy A: applying this method, we can see in Figure 15b,c that the neural networks give poor predictions, whereas the Prophet model has some segments that fall inside the confidence interval, but it is not good enough because of its irregular behaviour.
• Strategy B: in this problem, there are no historical run-to-failure signals, so clustering over this component is not possible. However, given that the degradation behavior of this component is similar to that of the ALMA IFP, we can try to transfer the IFP clusters to this problem. To achieve this, the newly arriving preprocessed signal $Q$ must be transformed and scaled to every cluster described in Section 3.2; that is, for each cluster $i$, we define a transformed signal of $Q$ as follows:

$$S = \kappa \cdot Q, \qquad (3)$$

$$S' = S - S_0 + k_i, \qquad (4)$$

where

$$\kappa = \frac{k_i - k_i^*}{Q_0 - q^*} \qquad (5)$$

is the scaling constant, $k_i$ and $k_i^*$ are the standard normal condition and the failure threshold of cluster $i$, respectively, $Q_0$ is the first sample of the signal in this problem, and $q^*$ is its associated failure threshold. Here $S_0$ is read as the first sample of $S$, by analogy with $Q_0$, so that $S'$ starts at $k_i$ and reaches $k_i^*$ exactly when $Q$ reaches $q^*$.
The classifier result determines the cluster whose model is used to predict the RUL. In the prognostic horizon metric, Figure 15d, the GRU model outperforms the other models. However, the other models fall inside the confidence interval after 200 days, so all the models are acceptable under this metric. Concerning the α–λ accuracy, these models are outside the confidence interval most of the time, underestimating the RUL during the first 300 days ($\lambda = 3/4$). After that, they stay around the ground truth up to the EoL. In this case, the GRU model is close to the frontier of the confidence interval, which is not a bad result for an RUL computed using similar degradation signals from another system or component, such as those of the IFP problem.
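The transformation in Equations (3)–(5) amounts to a linear rescaling that maps the new signal onto the range of a given cluster. A minimal sketch follows; the function name and the numeric values are hypothetical (in particular, the normal-condition level 0.6 and the camera threshold 84,000 are assumptions made only for illustration).

```python
def transfer_signal(Q, q_star, k_i, k_i_star):
    """Map signal Q onto the scale of cluster i, following Equations (3)-(5)."""
    kappa = (k_i - k_i_star) / (Q[0] - q_star)  # Eq. (5): scaling constant
    S = [kappa * q for q in Q]                  # Eq. (3): scaled signal
    return [s - S[0] + k_i for s in S]          # Eq. (4): shift so S' starts at k_i


# Hypothetical camera segment falling from 90,000 to an assumed threshold of 84,000,
# mapped onto a cluster with assumed normal level 0.6 and failure threshold 0.566.
Q = [90000.0, 88000.0, 86000.0, 84000.0]
S_prime = transfer_signal(Q, q_star=84000.0, k_i=0.6, k_i_star=0.566)
```

The first transformed sample equals the cluster's normal condition and the sample at the original threshold equals the cluster's failure threshold, so the classifier and the fitted cluster model see the new signal on a familiar scale.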
Figure 15. The Camera Resolution prognostic. (a) Testing: Resolution media trajectory (Resolution media [R] versus date, 2002 to 2007, with the failure threshold and fault date marked). (b) Strategy A: the prognostic horizon metric. (c) Strategy A: the α–λ accuracy metric. (d) Strategy B: the prognostic horizon metric. (e) Strategy B: the α–λ accuracy metric. Panels (b)–(e) plot RUL [days] versus day for the ground truth and each model, with the bound parameter set to 0.25.
The way in which strategy B was approached in this application allows comparing the critical segment of the new preprocessed and transformed incoming signal with the clustered signals that have similar patterns in the degradation level. In addition, this helps to relate it to the possible trajectories of the signals in the most similar cluster and, thus, to approximate the RUL of the new signal when historical information is not available. As the resolution media signal has similar characteristics to some signals in one of the clusters, a relatively good RUL prediction is obtained.
5. Discussion
Several fault detection frameworks have been developed in the last decades, most of them for a specific degradation present in an application of interest. In this work, we are interested in a more general framework, transferable to many domains that present a similar degradation problem. In Section 4.3.2, we show that the fault detection framework developed in [15] can be transferred to other applications with similar degradation behavior, such as the one described in Section 4.3.1, without any adjustment to its structure and with only some improvement to the data preprocessing step. In particular, the preprocessing accounts for additional noise properties to obtain a better-smoothed signal, as in the example shown in Figure 13. This improvement slightly increases the performance of the framework even when applied to the IFP signals, which was the problem of interest in [15]. We obtained a smoothed signal while maintaining the relevant characteristics of the raw data, such as the degradation trend. This smoothed signal was then used as an input to verify whether a fault was present and to return the date on which it was detected, as illustrated in Figure 14, where the red dashed lines represent the dates of detected faults and the green ones represent the dates of the performed maintenance.
The parameters used in the preprocessing steps were as follows: the factor used to determine the bounds of the normal range, based on the historical interquartile range, was fixed at 3, and the window size was fixed at 20 for both the spike cleaner and the convolutional smoothing methods.
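The role of these two parameters can be illustrated with the sketch below. It is not the preprocessing method of Section 3.1: it assumes a trailing window of 20 already-cleaned samples, treats values outside [Q1 − 3·IQR, Q3 + 3·IQR] as spikes, replaces them with the window median (an assumption), and then applies a uniform moving average as the convolutional smoothing.

```python
import statistics


def clean_spikes(x, window=20, factor=3.0):
    """Replace samples outside the IQR-based normal range of the trailing window."""
    cleaned = list(x[:window])  # the first window has no history and is kept as-is
    for value in x[window:]:
        history = cleaned[-window:]
        q1, _, q3 = statistics.quantiles(history, n=4)
        iqr = q3 - q1
        low, high = q1 - factor * iqr, q3 + factor * iqr
        # Assumption: a spike is replaced by the median of the trailing window.
        cleaned.append(value if low <= value <= high else statistics.median(history))
    return cleaned


def smooth(x, window=20):
    """Moving average: convolution with a uniform kernel (fully overlapping part only)."""
    return [sum(x[i:i + window]) / window for i in range(len(x) - window + 1)]


# Synthetic slowly decreasing signal with one large downward spike at index 50.
raw = [100 - 0.1 * t + ((t % 5) - 2) * 0.2 for t in range(100)]
raw[50] = 20.0
cleaned = clean_spikes(raw)
smoothed = smooth(cleaned)
```

A larger factor makes the cleaner more tolerant of abrupt but genuine changes, while a larger window gives smoother output at the cost of a longer lag.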
It is necessary to highlight that our meaning of transferable is not the same as transfer learning in the context of deep learning. The framework learns from the data automatically and does not inherit insights from another problem, so it can be scaled and applied to other similar problems. Fault detection and prognostics are not mutually exclusive; in most cases, the former is considered the step that precedes the prognostic process. Additionally, the preprocessing method that we designed in Section 3.1 reduces, as far as possible, the outliers present in the signal before it is used for either fault prediction or forecasting. This increases the performance and reduces possible disturbances that affect the estimation.
For prognostic settings:
• Strategy A: the time-window size was 365 days, with 2 years of forecasting, a look-back of 19 samples (i.e., samples from time $t-19$ until time $t$, for a total of 20 samples) as input, and 20 epochs for fitting the neural networks. For simplicity, we assume for this method that new data are available every 15 days to update the RUL estimation. The model hyperparameters used for prognostics are summarized in Table 1.
• Strategy B: a look-back of 9 samples (i.e., samples from time $t-9$ until time $t$, for a total of 10 samples) as input, and 15 epochs for fitting the neural networks. The model hyperparameters used for prognostics are summarized in Table 2.
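The look-back format above can be sketched as follows. The helper name is hypothetical, and the targets are assumptions made for illustration: the next sample for strategy A (forecasting) and the number of samples remaining until the end of a run-to-failure segment for strategy B (regression on the RUL).

```python
def lookback_inputs(series, lookback):
    """One input per time t >= lookback: the samples from t - lookback to t."""
    return [series[t - lookback:t + 1] for t in range(lookback, len(series))]


series = list(range(30))  # placeholder signal

# Strategy A (assumed target: the next sample), 20 samples per input.
X_a = lookback_inputs(series, 19)[:-1]
y_a = series[20:]

# Strategy B (assumed target: samples left until the segment ends), 10 samples per input.
X_b = lookback_inputs(series, 9)
y_b = [len(series) - 1 - t for t in range(9, len(series))]
```

Each input vector then matches the `input_size` or `input_shape` entries listed in Tables 1 and 2.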
All the algorithms were implemented in Python version 3.8.5 and ran on a computer with an Intel® Core™ i5-3230M processor of 2.6 GHz × 4 cores, with 8 GB of RAM, and using Linux Mint 20.1 Ulyssa (64 bits) as the OS.
Two prognostic strategies were tested in three problems:
• Crack Growth in Section 4.1.2: this is a classical problem in the literature in which the degradation is a monotonically non-decreasing trajectory. The worst performances are given by strategy A, where only the Prophet model was relatively close to the ground truth RUL, whereas with strategy B all prediction models perform well on both metrics.
• IFP Degradation in Section 4.2.2: the historical degradation signals are not totally monotonic and have different degradation levels and speeds, resulting in different failure threshold values for different sets of signals. Hence, defining a unique failure threshold for all the signals and forecasting the dynamics of the signal until it reaches that threshold, as in strategy A, does not work well. Clustering the signals by degradation level helps to define the failure threshold appropriately, given the similarity to a set of historical run-to-failure signals from a cluster. Therefore, using strategy B improves the prediction of RULs, with ESN being less accurate than the other models tested.
Table 1. Model settings used for strategy A.

Model hyperparameters:
ESN                          GRU                          LSTM
input_size: 20               input_shape: (20, 1)         input_shape: (20, 1)
output_size: 1               units (GRU): 20              units (LSTM): 20
reservoir_size: 100          activation (GRU): ReLU       activation (LSTM): ReLU
spectralRadius: 0.75         units (Dense): 20            units (Dense): 20
noise_scale: 0.001           activation (Dense): ReLU     activation (Dense): ReLU
leaking_rate: 0.5            units (Dense): 1             units (Dense): 1
sparsity: 0.3                activation (Dense): linear   activation (Dense): linear
activation: tanh             optimizer: adam              optimizer: adam
feedback: True
regularizationType: Ridge
regularizationParam: auto

Prophet
changepoint_prior_scale: 0.05
seasonality_prior_scale: 0.01
daily_seasonality: False
Table 2. Model settings used for strategy B.

Model hyperparameters:
ESN                          GRU                              Dense
input_size: 10               input_shape: (10, 1)             input_shape: 10
output_size: 1               units (GRU): 15                  units (Dense): 50
reservoir_size: 250          activation (GRU): ReLU           activation (Dense): ReLU
spectralRadius: 1.0          recurrent_dropout (GRU): 0.5     dropout: 0.5
noise_scale: 0.001           units (GRU): 15                  units (Dense): 25
leaking_rate: 0.7            activation (GRU): ReLU           activation (Dense): ReLU
sparsity: 0.2                recurrent_dropout (GRU): 0.5     dropout: 0.5
activation: tanh             units (Dense): 1                 units (Dense): 1
feedback: False              activation (Dense): linear       activation (Dense): linear
regularizationType: Ridge    optimizer: adam                  optimizer: adam
regularizationParam: 0.01
• Camera Resolution Degradation in Section 4.3.3: the degradation trajectory showed irregularities similar to those of the IFP signals, in which some segments increase and then decrease, and vice versa. Therefore, the degradation trajectory is also not completely monotonic. Addressing this problem with strategy A showed some difficulties, particularly when forecasting the dynamics or trend of the signal while the trend of the segment changes in the opposite direction to the degradation, which yields an overestimation of the RUL. With this strategy, only Prophet approximates the ground truth, but it is still not good enough to be acceptable. Strategy B, using the RUL predictive model transferred from the IFP setting, provided better results than the previous strategy, converging to the ground truth as it reaches the EoL with a few minor exceptions.
For the three problems addressed in this work, the degradation signals present irregularities that affect the forecast of the signal dynamics by a fitted model; even Prophet, which is based on time series decomposition, could not handle these irregularities well enough to give a trustworthy RUL prediction for all the degradation problems. In most cases, the RNN models provided an underestimated RUL, opposite to the results of a linear forecasting model such as Prophet. The time spent in the prognostic process using strategy A is shown in Table 3, where we can see that ESN is the fastest method because of its simplicity in training and forecasting, followed by Prophet, and finally LSTM and GRU, which took similar times.
Table 3. Time performance measured in seconds.

Problem                   Prophet    ESN       LSTM       GRU
Crack growth              252.40     109.49    2170.89    2197.84
Resolution Degradation    193.41     31.60     1995.64    1997.99
IFP Degradation           82.28      38.20     892.36     890.27
Concerning strategy B, the results showed that this strategy obtained better estimations of RULs. It seems to be robust to irregularities present in the signal, and it is helpful for problems with similar degradations and scarce historical run-to-failure signals. With this method, the models need to be fitted only once; the classifier then simply selects the most representative model to predict the RUL, so the time spent using the fitted model to calculate the RUL is almost negligible.
Finally, two main points must be highlighted. First, the fault detection framework defined in our previous work [15] was designed from the historical fault information of a pair of IFPs, out of the 132 available across the 66 ALMA antennas, and was validated on other IFP data, achieving good detection performance. By updating the preprocessing module in this work, it was possible to improve the robustness by reducing the sensitivity to the existing noise level. This was validated on data from other IFPs, preserving the same performance, and we also found that the same effect can be obtained for other signals similar to those of the IFPs, such as the average resolution of the camera. Second, the signals in the clusters do not include all the historical signals of the IFPs: for validation purposes, some signals were excluded and used to verify the effectiveness of the RUL prediction. One of them is shown in Figure 10a, and the other signals showed very similar results. Most interestingly, using the models fitted with the IFP data, it is possible to obtain a good approximation of the RUL for other components whose signals have similar degradations, in this case, the camera resolution signal. This indicates the power of generalization that the fitted models have for other similar problems.
6. Conclusions
This work shows a fault detection framework that can be transferable or scalable to
other applications with similar degradation behaviors but not necessarily with the same
statistical characteristics as the particular problem for which it was developed initially.
Hence, it is a helpful tool because it can be used in many applications to detect faults in the
system of interest without any changes in the method.
We also tested the performance of RNN models and a time series decomposition model called Prophet to measure the precision of the RUL estimation using the standard metrics proposed in [57], which allow a systematic evaluation and provide a level of confidence for model selection. Through this performance measurement scheme, one could eventually ask which model is the best. We argue that the best model is the one with the largest PH value and the lowest $t_\lambda$ and, additionally, one that underestimates the RUL while staying close to the ground truth. Future works could use this as a guideline for model testing and for measuring the quality of the model used for prognostics in RUL estimation.
One of the weaknesses of this proposal in forecasting is that it depends on a catastrophic failure threshold to estimate the RUL of a component. Furthermore, it considers a deterministic threshold that could be a bit conservative if it is chosen as the worst-case scenario.
7. Future Work
Our approach has been shown to work effectively in different settings with slow degradation faults, adapting to each environment. This method, together with several others that have been developed in the literature, will help organizations transform data into information. The challenge then becomes transforming this vast new information into actionable decisions. Hence, as part of our future work, we will work on:
• Improving the computation of uncertainty measurements of RUL predictions. This computation will help develop new prescriptive maintenance approaches that support the decision-making process of maintenance procedures.
• Testing this approach on other problems with similar degradation faults to continue evaluating the robustness of this run-to-failure critical segment clustering approach for predicting a component's RUL value.
Author Contributions: Conceptualization and validation, A.D.C., R.A.C. and G.A.R.; methodology,
A.D.C. and G.A.R.; software, analysis, visualization, and writing—original draft preparation, A.D.C.;
supervision and writing—review and editing, R.A.C. and G.A.R.; funding acquisition, R.A.C. and
G.A.R. All authors have read and agreed to the published version of the manuscript.
Funding:
This research was partially funded by FONDECYT 1180706, PIA/BASAL FB0002, and
ASTRO200058 grants from ANID, Chile.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement:
The data presented in this study are available on request from the
corresponding author.
Acknowledgments:
The Atacama Large Millimeter/submillimeter Array (ALMA), an international
astronomy facility, is a partnership of the European Organisation for Astronomical Research in the
Southern Hemisphere (ESO), the U.S. National Science Foundation (NSF), and the National Institutes
of Natural Sciences (NINS) of Japan in cooperation with the Republic of Chile. ALMA is funded by
ESO on behalf of its Member States, by NSF in cooperation with the National Research Council of
Canada (NRC) and the National Science Council of Taiwan (NSC) and by NINS in cooperation with
the Academia Sinica (AS) in Taiwan and the Korea Astronomy and Space Science Institute (KASI).
ALMA construction and operations are led by ESO on behalf of its Member States; by the National
Radio Astronomy Observatory (NRAO), managed by Associated Universities, Inc. (AUI), on behalf
of North America; and by the National Astronomical Observatory of Japan (NAOJ) on behalf of East
Asia. The Joint ALMA Observatory (JAO) provides the uniﬁed leadership and management of the
construction, commissioning, and operation of ALMA. The authors would like to thank José Luis
Ortiz, from ALMA, for his support with the relevant data.
Conﬂicts of Interest: The authors declare no conﬂict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
RUL Remaining Useful Life
RNN Recurrent Neural Network
ALMA Atacama Large Millimeter/submillimeter Array
CBM Condition-Based Maintenance
PHM Prognostics and Health Management
PH Prognostic Horizon
EoL End-of-Life
ESN Echo State Network
LSTM Long Short-Term Memory
GRU Gated Recurrent Unit
ADTK Anomaly Detection Toolkit
MVM Minimum Variance Matching
IFP Intermediate Frequency Processor
LSB Lower Sideband
USB Upper Sideband
SW Switch matrix current
UT Unit Telescope
CCD Charge-Coupled Device
EoP End-of-Prediction
ANN Artiﬁcial Neural Network
SP Shortest Path
Appendix A. Evaluation Metrics
Let $J$ be the set of all time indexes at which a prediction is made, $r^{*}$ the ground-truth remaining useful life (RUL), $\alpha$ the allowable error bound, $t_P$ the time when the first prediction is made, $t_i$ the time at time index $i$, and EoP the End-of-Prediction of the RUL.
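• Prognostic Horizon (PH): it identifies whether a method predicts within specified limits around the ground-truth End-of-Life (EoL), so that the predictions can be considered trustworthy, and, if it does, how much time it allows for a maintenance action to be taken. The longer the PH, the better the model and the more time there is to act on a prediction with the desired credibility. This metric is defined as:
$$\mathrm{PH} = t_{EoL} - t_i, \tag{A1}$$
where $i = \min\{\, j \mid (j \in J) \wedge (\varpi^{-}_{\alpha} \le r(j) \le \varpi^{+}_{\alpha}) \,\}$, with $r(j)$ the predicted RUL at time index $j$, $\varpi^{-}_{\alpha} = r^{*} - \alpha \cdot t_{EoL}$, and $\varpi^{+}_{\alpha} = r^{*} + \alpha \cdot t_{EoL}$.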
• $\alpha$–$\lambda$ Accuracy: this metric quantifies prediction quality by identifying whether the prediction falls within specified limits at a particular time. It is a more stringent requirement than PH because it requires predictions to stay within a cone of accuracy. Its output is binary, since it evaluates whether the following condition is met:
$$(1-\alpha)\cdot r^{*}(t) \le r(t_\lambda) \le (1+\alpha)\cdot r^{*}(t), \tag{A2}$$
where $t_\lambda = t_P + \lambda\cdot(t_{EoL} - t_P)$.
• Relative Accuracy: a notion similar to $\alpha$–$\lambda$ accuracy where, instead of only finding out whether the predictions fall within given accuracy levels at a given time $t_\lambda$, we also quantitatively measure the accuracy by
$$RA_\lambda = 1 - \frac{\left|r^{*}(t_\lambda) - r(t_\lambda)\right|}{r^{*}(t_\lambda)}, \tag{A3}$$
where $t_\lambda$ is defined as in the $\alpha$–$\lambda$ accuracy. To measure the general behavior of the algorithm over time, the Cumulative Relative Accuracy (CRA) can be used, defined as
$$CRA_\lambda = \frac{1}{|J_\lambda|} \sum_{i=1}^{|J_\lambda|} w(r)\, RA_\lambda, \tag{A4}$$
where $w(r)$ is a weight factor as a function of the RUL at all time indices, $J_\lambda$ is the set of all time indexes before $t_\lambda$ at which a prediction is made, and $|\cdot|$ is the cardinality of a set. These metrics capture the expectation that, as more information becomes available, prognostic performance improves and the prediction converges to the ground-truth RUL.
• Convergence: it is a useful metric since we expect a prognostics algorithm to converge to the true value as more information accumulates over time. Convergence is quantified by the distance between the point $(t_P, 0)$ and the centroid of the area under the curve of a metric $M(i)$. Faster convergence is desired to achieve high confidence while keeping the prediction horizon as large as possible, and a lower distance means faster convergence. Let $(x_c, y_c)$ be the center of mass of the area under the curve $M(i)$. Then, the convergence $C_M$ is the Euclidean distance between the center of mass and $(t_P, 0)$:
$$C_M = \sqrt{(x_c - t_P)^2 + y_c^2},$$
$$x_c = \frac{\frac{1}{2}\sum_{i=P}^{EoP} \left(t_{i+1}^2 - t_i^2\right) M(i)}{\sum_{i=P}^{EoP} \left(t_{i+1} - t_i\right) M(i)}, \qquad y_c = \frac{\frac{1}{2}\sum_{i=P}^{EoP} \left(t_{i+1} - t_i\right) M(i)^2}{\sum_{i=P}^{EoP} \left(t_{i+1} - t_i\right) M(i)},$$
where $M(i)$ is a non-negative prediction error, accuracy, or precision metric. In other words, this metric measures how fast a method converges.
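The metrics in this appendix can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the array-based interface, and the toy data are assumptions made for illustration, and the weight $w(r)$ defaults to 1 because the text does not specify it.

```python
import numpy as np

def prognostic_horizon(t, r_pred, r_true, t_eol, alpha):
    """PH (A1): t_EoL minus the first prediction time at which the predicted
    RUL lies within r* -/+ alpha * t_EoL. Returns None if it never does."""
    t, r_pred, r_true = (np.asarray(a, dtype=float) for a in (t, r_pred, r_true))
    inside = np.abs(r_pred - r_true) <= alpha * t_eol
    if not inside.any():
        return None
    return float(t_eol - t[np.argmax(inside)])  # argmax picks the first True

def t_lambda(t_p, t_eol, lam):
    """Evaluation time t_lambda = t_P + lambda * (t_EoL - t_P)."""
    return t_p + lam * (t_eol - t_p)

def alpha_lambda_accuracy(r_pred_l, r_true_l, alpha):
    """Binary test (A2): is the prediction inside the accuracy cone?"""
    return bool((1 - alpha) * r_true_l <= r_pred_l <= (1 + alpha) * r_true_l)

def relative_accuracy(r_pred_l, r_true_l):
    """RA (A3) for a single prediction time."""
    return 1.0 - abs(r_true_l - r_pred_l) / r_true_l

def cumulative_relative_accuracy(r_pred, r_true, weights=None):
    """CRA (A4): weighted mean of RA over the prediction times in J_lambda.
    Unit weights are an assumption when none are given."""
    r_pred = np.asarray(r_pred, dtype=float)
    r_true = np.asarray(r_true, dtype=float)
    ra = 1.0 - np.abs(r_true - r_pred) / r_true
    w = np.ones_like(ra) if weights is None else np.asarray(weights, dtype=float)
    return float(np.mean(w * ra))

def convergence(t, m, t_p):
    """C_M: distance from (t_P, 0) to the centroid of the area under M(i).
    `t` holds one more entry than `m` so that t[i+1] - t[i] is defined."""
    t, m = np.asarray(t, dtype=float), np.asarray(m, dtype=float)
    dt = np.diff(t)
    area = np.sum(dt * m)
    x_c = 0.5 * np.sum(np.diff(t ** 2) * m) / area
    y_c = 0.5 * np.sum(dt * m ** 2) / area
    return float(np.hypot(x_c - t_p, y_c))

# Toy run-to-failure example (made-up numbers): failure at t = 50.
t = [0, 10, 20, 30, 40]
r_true = [50, 40, 30, 20, 10]
r_pred = [70, 52, 33, 21, 10]
ph = prognostic_horizon(t, r_pred, r_true, t_eol=50, alpha=0.1)
```

In the toy example the prediction first enters the band of half-width $\alpha \cdot t_{EoL} = 5$ at $t = 20$, so the prognostic horizon is $50 - 20 = 30$ time units.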
Appendix B. Recurrent Neural Networks
Appendix B.1. Echo State Networks (ESNs)
ESNs are a type of recurrent neural network developed by Herbert Jaeger [76] that has a dynamical memory to preserve in its internal state a nonlinear transformation of the input's history. Hence, they have been shown to be exceedingly good at modeling nonlinear systems. Another advantage of ESNs is that they are easy to train because they do not need to backpropagate gradients as classical ANNs do.
An ESN can be defined as follows: consider a discrete-time neural network, as in [76–79], with $N_u$ input units, $N_x$ internal units (also called reservoir units), and $N_y$ output units. The activations at time step $t$ are $u(t) \in \mathbb{R}^{N_u}$ for the input units, $x(t) \in \mathbb{R}^{N_x}$ for the internal units, and $y(t) \in \mathbb{R}^{N_y}$ for the output units. The connection weight matrices are $W_{in} \in \mathbb{R}^{N_x \times (1+N_u)}$ for the input weights, $W \in \mathbb{R}^{N_x \times N_x}$ for the reservoir connections, $W_{out} \in \mathbb{R}^{N_y \times (1+N_u+N_x)}$ for the connections to the output units, and $W_{fb} \in \mathbb{R}^{N_x \times N_y}$ for the connections projected back (also called feedback) from the output to the internal units. Connections directly from input to output units and connections between output units are allowed. Figure A1 shows the basic network architecture.
The activations of the reservoir units are represented by
$$\tilde{x}(t+1) = \tanh\left(W_{in}[1; u(t+1)] + W x(t) + W_{fb}\, y(t)\right), \tag{A5}$$
and are updated according to
$$x(t+1) = (1-\delta)\, x(t) + \delta\, \tilde{x}(t+1), \tag{A6}$$
where $\delta \in (0, 1]$ is the leaky integrator rate. The output is calculated by
$$y(t+1) = W_{out}[1; u(t+1); x(t+1)], \tag{A7}$$
where $[\cdot\,;\cdot]$ denotes the vertical vector concatenation. The coefficients in $W_{out}$ are computed by using ridge regression, solving the following equation,
$$Y_{target} = W_{out} X, \tag{A8}$$
where $X \in \mathbb{R}^{(1+N_u+N_x)\times T}$ has columns $[1; u(t); x(t)]$ for $t = 1, \ldots, T$, with all $x(t)$ produced by presenting the reservoir with $u(t)$, and $Y_{target} \in \mathbb{R}^{N_y \times T}$.
Figure A1. The basic echo state network architecture.
Finally, the solution can be represented by
$$W_{out} = Y_{target} X^{T} \left(X X^{T} + \tau I\right)^{-1}, \tag{A9}$$
where $I \in \mathbb{R}^{(1+N_u+N_x)\times(1+N_u+N_x)}$ is the identity matrix and $\tau$ is a regularization factor (ridge constant). The ridge constant is estimated using grid search and time series cross-validation methods.
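The reservoir update and the ridge-regression readout described in this appendix can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: it assumes no output feedback ($W_{fb} = 0$), a zero initial state, and a fixed ridge constant instead of the grid search mentioned above. The function names, the spectral-radius rescaling, and the toy sine-wave task are also assumptions.

```python
import numpy as np

def run_reservoir(U, W_in, W, delta):
    """Collect reservoir states with the leaky update (A5)-(A6).
    U has shape (T, N_u). Feedback is omitted (W_fb = 0, an assumption)."""
    x = np.zeros(W.shape[0])
    states = np.zeros((U.shape[0], W.shape[0]))
    for t in range(U.shape[0]):
        x_tilde = np.tanh(W_in @ np.concatenate(([1.0], U[t])) + W @ x)
        x = (1.0 - delta) * x + delta * x_tilde
        states[t] = x
    return states

def design_matrix(U, states):
    """Stack the columns [1; u(t); x(t)] into X of shape (1+N_u+N_x, T)."""
    return np.vstack([np.ones((1, U.shape[0])), U.T, states.T])

def train_readout(U, states, Y_target, tau):
    """Ridge solution (A9). A linear solve replaces the explicit inverse,
    which is equivalent here because X X^T + tau*I is symmetric."""
    X = design_matrix(U, states)
    A = X @ X.T + tau * np.eye(X.shape[0])
    return np.linalg.solve(A, X @ Y_target.T).T

def readout(U, states, W_out):
    """Output equation (A7) applied to all time steps at once."""
    return W_out @ design_matrix(U, states)

# Toy task: one-step-ahead prediction of a sine wave.
rng = np.random.default_rng(0)
N_u, N_x, T = 1, 20, 200
W_in = rng.uniform(-0.5, 0.5, size=(N_x, 1 + N_u))
W = rng.uniform(-0.5, 0.5, size=(N_x, N_x))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # rescale spectral radius to 0.9
series = np.sin(0.1 * np.arange(T + 1))
U = series[:-1].reshape(T, N_u)
Y_target = series[1:].reshape(1, T)

states = run_reservoir(U, W_in, W, delta=0.5)
W_out = train_readout(U, states, Y_target, tau=1e-4)
mse = float(np.mean((readout(U, states, W_out) - Y_target) ** 2))
```

Only `W_out` is fitted. `W_in` and `W` stay at their random initial values, which is why no gradient backpropagation is needed.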
Appendix B.2. Long Short-Term Memory (LSTM)
LSTM is another type of artificial recurrent neural network (RNN) architecture, proposed by Hochreiter and Schmidhuber [80], that deals with the vanishing gradient problem. One LSTM unit is composed essentially of three gates (an input gate, an output gate, and a forget gate) and a memory cell that remembers values over arbitrary time intervals; the three gates regulate the flow of information into and out of the cell. This type of RNN has been found extremely successful in many applications [81] and is regarded as one of the most popular and efficient RNN models trained with backpropagation. A typical LSTM [82] is illustrated in Figure A2 and can be formulated as follows.
Let $u(t) \in \mathbb{R}^{N_u}$ be an input vector at time $t$, and consider $M$ LSTM units. Then:
• Block input: it combines the input $u(t)$ and the previous output of the LSTM units $h(t-1)$ at each time step $t$, and it is defined as
$$z(t) = \phi\left(W_z u(t) + R_z h(t-1) + b_z\right). \tag{A10}$$
• Input gate: this gate decides which values of the cell state need to be updated with new information. It is computed as a combination of the input $u(t)$, the previous output of the LSTM units $h(t-1)$, and the previous cell state $c(t-1)$ at each time step $t$,
$$i(t) = \sigma\left(W_i u(t) + R_i h(t-1) + p_i \odot c(t-1) + b_i\right). \tag{A11}$$
• Forget gate: it decides what information needs to be removed from the LSTM memory, and it is calculated similarly to the input gate,
$$f(t) = \sigma\left(W_f u(t) + R_f h(t-1) + p_f \odot c(t-1) + b_f\right). \tag{A12}$$
• Cell state: this step updates the LSTM memory; the current value is given by the combination of the block input $z(t)$, the input gate $i(t)$, the forget gate $f(t)$, and the previous cell state $c(t-1)$,
$$c(t) = z(t) \odot i(t) + c(t-1) \odot f(t). \tag{A13}$$
• Output gate: this gate decides what part of the LSTM memory contributes to the output, and it is related to the current input vector $u(t)$, the previous output $h(t-1)$, and the current cell state $c(t)$,
$$o(t) = \sigma\left(W_o u(t) + R_o h(t-1) + p_o \odot c(t) + b_o\right). \tag{A14}$$
• Block output: finally, this step computes the output $h(t)$, which combines the current cell state $c(t)$ and the current output gate $o(t)$,
$$h(t) = \psi(c(t)) \odot o(t). \tag{A15}$$
Figure A2. The basic LSTM architecture.
In the above description, $W_k \in \mathbb{R}^{M \times N_u}$, $R_k \in \mathbb{R}^{M \times M}$, $p_k \in \mathbb{R}^{M}$, and $b_k \in \mathbb{R}^{M}$, for $k \in \{z, i, f, o\}$, are input weights, recurrent weights, peephole weights, and bias weights, respectively. The operator $\odot$ represents the pointwise multiplication of two vectors, $\sigma(x) = \frac{1}{1+e^{-x}}$, and $\phi(x) = \psi(x) = \tanh(x)$.
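One time step of the peephole LSTM in Equations (A10)–(A15) can be sketched as follows. This is an illustrative NumPy transcription under stated assumptions, not the authors' implementation: the parameter dictionary `P`, the helper `init_lstm_params`, and the random initialization scale are hypothetical, and only the forward pass is shown.

```python
import numpy as np

def sigmoid(x):
    """Logistic function sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(u, h_prev, c_prev, P):
    """One forward step of a peephole LSTM layer with M units."""
    z = np.tanh(P["Wz"] @ u + P["Rz"] @ h_prev + P["bz"])                      # (A10)
    i = sigmoid(P["Wi"] @ u + P["Ri"] @ h_prev + P["pi"] * c_prev + P["bi"])   # (A11)
    f = sigmoid(P["Wf"] @ u + P["Rf"] @ h_prev + P["pf"] * c_prev + P["bf"])   # (A12)
    c = z * i + c_prev * f                                                     # (A13)
    o = sigmoid(P["Wo"] @ u + P["Ro"] @ h_prev + P["po"] * c + P["bo"])        # (A14)
    h = np.tanh(c) * o                                                         # (A15)
    return h, c

def init_lstm_params(M, N_u, rng, scale=0.1):
    """Hypothetical helper: small random weights and zero biases."""
    P = {}
    for k in "zifo":
        P["W" + k] = rng.normal(scale=scale, size=(M, N_u))  # input weights
        P["R" + k] = rng.normal(scale=scale, size=(M, M))    # recurrent weights
        P["b" + k] = np.zeros(M)                             # bias weights
    for k in "ifo":
        P["p" + k] = rng.normal(scale=scale, size=M)         # peephole weights
    return P

# Run a short random input sequence through the layer.
rng = np.random.default_rng(0)
M, N_u = 4, 3
P = init_lstm_params(M, N_u, rng)
h, c = np.zeros(M), np.zeros(M)
for u in rng.normal(size=(5, N_u)):
    h, c = lstm_step(u, h, c, P)
```

The peephole terms use elementwise products (`*`) because $p_k$ is a vector, whereas the input and recurrent terms are matrix-vector products (`@`). The sketch creates peephole weights only for the three gates, since the block input in (A10) has no peephole term.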
Appendix B.3. Gated Recurrent Unit (GRU)
The GRU model was introduced by Cho et al. [83], who proposed a new type of hidden unit inspired by the LSTM unit. Basically, it combines the input gate and the forget gate into a single update gate and merges some operations into the computation of the cell state update, which makes this model simpler and gives it fewer variables than the basic LSTM model, as shown in Figure A3. It can be formulated as follows.
Figure A3. The basic GRU architecture.
Let $u(t) \in \mathbb{R}^{N_u}$ be an input vector at time $t$, and consider $M$ GRU units. Then:
• Update gate: this gate determines how much previously learned information should be passed on to the future,
$$z(t) = \sigma\left(W_z u(t) + R_z h(t-1) + b_z\right). \tag{A16}$$
• Reset gate: this gate decides how much previously learned information to forget,
$$r(t) = \sigma\left(W_r u(t) + R_r h(t-1) + b_r\right). \tag{A17}$$
• Cell state: it stores the relevant information from the past, using the reset gate to affect the memory content,
$$c(t) = \tanh\left(W_c u(t) + R_c \left(h(t-1) \odot r(t)\right) + b_c\right). \tag{A18}$$
• Block output: finally, compute the output $h(t)$,
$$h(t) = c(t) \odot z(t) + h(t-1) \odot (1 - z(t)). \tag{A19}$$
In the above description, $W_k \in \mathbb{R}^{M \times N_u}$, $R_k \in \mathbb{R}^{M \times M}$, and $b_k \in \mathbb{R}^{M}$, for $k \in \{z, r, c\}$, are the input weights, recurrent weights, and bias weights of the update gate, reset gate, and cell state, respectively. The operator $\odot$ represents the pointwise multiplication of two vectors, and $\sigma(x) = \frac{1}{1+e^{-x}}$.
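A single GRU step following Equations (A16)–(A19) can be sketched in the same way. This is again a hedged illustration rather than the authors' code: the parameter dictionary `P` and the helper `init_gru_params` are hypothetical, and the reset gate is applied to $h(t-1)$ before the recurrent matrix $R_c$, which is one reading of (A18) and matches the original GRU formulation of Cho et al. [83].

```python
import numpy as np

def sigmoid(x):
    """Logistic function sigma(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(u, h_prev, P):
    """One forward step of a GRU layer with M units."""
    z = sigmoid(P["Wz"] @ u + P["Rz"] @ h_prev + P["bz"])        # (A16) update gate
    r = sigmoid(P["Wr"] @ u + P["Rr"] @ h_prev + P["br"])        # (A17) reset gate
    c = np.tanh(P["Wc"] @ u + P["Rc"] @ (h_prev * r) + P["bc"])  # (A18) cell state
    return c * z + h_prev * (1.0 - z)                            # (A19) block output

def init_gru_params(M, N_u, rng, scale=0.1):
    """Hypothetical helper: small random weights and zero biases."""
    P = {}
    for k in "zrc":
        P["W" + k] = rng.normal(scale=scale, size=(M, N_u))  # input weights
        P["R" + k] = rng.normal(scale=scale, size=(M, M))    # recurrent weights
        P["b" + k] = np.zeros(M)                             # bias weights
    return P

# Run a short random input sequence through the layer.
rng = np.random.default_rng(0)
M, N_u = 4, 3
P = init_gru_params(M, N_u, rng)
h = np.zeros(M)
for u in rng.normal(size=(5, N_u)):
    h = gru_step(u, h, P)
```

The output is a convex combination of the candidate state $c(t)$ and the previous output $h(t-1)$, so a GRU needs three weight sets where the LSTM above needs four plus peepholes.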
References
1. Bougacha, O.; Varnier, C.; Zerhouni, N. A Review of Post-Prognostics Decision-Making in Prognostics and Health Management. Int. J. Progn. Health Manag. 2020, 11, 31. [CrossRef]
2. Patan, K. Artificial Neural Networks for the Modelling and Fault Diagnosis of Technical Processes; Springer: Berlin/Heidelberg, Germany, 2008. [CrossRef]
3. Li, Y.; Wang, X.; Lu, N.; Jiang, B. Conditional Joint Distribution-Based Test Selection for Fault Detection and Isolation. IEEE Trans. Cybern. 2021, 1–13. [CrossRef] [PubMed]
4. Isermann, R. Fault-Diagnosis Systems; Springer: Berlin/Heidelberg, Germany, 2006. [CrossRef]
5. Shi, J.; He, Q.; Wang, Z. Integrated Stateflow-based simulation modelling and testability evaluation for electronic built-in-test (BIT) systems. Reliab. Eng. Syst. Saf. 2020, 202, 107066. [CrossRef]
6. Shi, J.; Deng, Y.; Wang, Z. Novel testability modelling and diagnosis method considering the supporting relation between faults and tests. Microelectron. Reliab. 2022, 129, 114463. [CrossRef]
7. Bindi, M.; Corti, F.; Aizenberg, I.; Grasso, F.; Lozito, G.M.; Luchetta, A.; Piccirilli, M.C.; Reatti, A. Machine Learning-Based Monitoring of DC-DC Converters in Photovoltaic Applications. Algorithms 2022, 15, 74. [CrossRef]
8. Bindi, M.; Piccirilli, M.C.; Luchetta, A.; Grasso, F.; Manetti, S. Testability Evaluation in Time-Variant Circuits: A New Graphical Method. Electronics 2022, 11, 1589. [CrossRef]
9. Li, Y.; Chen, H.; Lu, N.; Jiang, B.; Zio, E. Data-Driven Optimal Test Selection Design for Fault Detection and Isolation Based on CCVKL Method and PSO. IEEE Trans. Instrum. Meas. 2022, 71, 1–10. [CrossRef]
10. Tinga, T.; Loendersloot, R. Aligning PHM, SHM and CBM by understanding the physical system failure behaviour. In Proceedings of the 2nd European Conference of the Prognostics and Health Management Society, PHME 2014, Nantes, France, 8–10 July 2014; pp. 162–171.
11. Montero Jimenez, J.J.; Schwartz, S.; Vingerhoeds, R.; Grabot, B.; Salaün, M. Towards multi-model approaches to predictive maintenance: A systematic literature survey on diagnostics and prognostics. J. Manuf. Syst. 2020, 56, 539–557. [CrossRef]
12. Vachtsevanos, G.; Wang, P. Fault prognosis using dynamic wavelet neural networks. In Proceedings of the 2001 IEEE Autotestcon Proceedings, IEEE Systems Readiness Technology Conference, Valley Forge, PA, USA, 20–23 August 2001; pp. 857–870. [CrossRef]
13. Byington, C.S.; Roemer, M.J.; Galie, T. Prognostic enhancements to diagnostic systems for improved condition-based maintenance [military aircraft]. In Proceedings of the IEEE Aerospace Conference, Big Sky, MT, USA, 9–16 March 2002; Volume 6, p. 6. [CrossRef]
14. Cho, A.D.; Carrasco, R.A.; Ruz, G.A. Improving prescriptive maintenance by incorporating post-prognostic information through chance constraints. IEEE Access 2022, 10, 55924–55932. [CrossRef]
15. Cho, A.D.; Carrasco, R.A.; Ruz, G.A.; Ortiz, J.L. Slow Degradation Fault Detection in a Harsh Environment. IEEE Access 2020, 8, 175904–175920. [CrossRef]
16. Carrasco, R.A.; Núñez, F.; Cipriano, A. Fault detection and isolation in cooperative mobile robots using multilayer architecture and dynamic observers. Robotica 2011, 29, 555–562. [CrossRef]
17. Isermann, R. Process fault detection based on modeling and estimation methods - A survey. Automatica 1984, 20, 387–404. [CrossRef]
18. Park, Y.J.; Fan, S.K.S.; Hsu, C.Y. A Review on Fault Detection and Process Diagnostics in Industrial Processes. Processes 2020, 8, 1123. [CrossRef]
19. Tuan Do, V.; Chong, U.P. Signal Model-Based Fault Detection and Diagnosis for Induction Motors Using Features of Vibration Signal in Two-Dimension Domain. Stroj. Vestn. 2011, 57, 655–666. [CrossRef]
20. Meinguet, F.; Sandulescu, P.; Aslan, B.; Lu, L.; Nguyen, N.K.; Kestelyn, X.; Semail, E. A signal-based technique for fault detection and isolation of inverter faults in multi-phase drives. In Proceedings of the 2012 IEEE International Conference on Power Electronics, Drives and Energy Systems (PEDES), Bengaluru, India, 16–19 December 2012; pp. 1–6.
21. Germán-Salló, Z.; Strnad, G. Signal processing methods in fault detection in manufacturing systems. In Proceedings of the 11th International Conference Interdisciplinarity in Engineering, INTERENG 2017, Tirgu Mures, Romania, 5–6 October 2017; Volume 22, pp. 613–620.
22. Duan, J.; Shi, T.; Zhou, H.; Xuan, J.; Zhang, Y. Multiband Envelope Spectra Extraction for Fault Diagnosis of Rolling Element Bearings. Sensors 2018, 18, 1466. [CrossRef]
23. Abid, A.; Khan, M.; Iqbal, J. A review on fault detection and diagnosis techniques: Basics and beyond. Artif. Intell. Rev. 2021, 54, 3639–3664. [CrossRef]
24. Khorasgani, H.; Jung, D.E.; Biswas, G.; Frisk, E.; Krysander, M. Robust residual selection for fault detection. In Proceedings of the 53rd IEEE Conference on Decision and Control, Los Angeles, CA, USA, 15–17 December 2014; pp. 5764–5769.
25. Ortiz, J.L.; Carrasco, R.A. Model-based fault detection and diagnosis in ALMA subsystems. In Observatory Operations: Strategies, Processes, and Systems VI; Peck, A.B., Benn, C.R., Seaman, R.L., Eds.; SPIE: Bellingham, WA, USA, 2016; pp. 919–929. [CrossRef]
26. Ortiz, J.L.; Carrasco, R.A. ALMA engineering fault detection framework. In Observatory Operations: Strategies, Processes, and Systems VII; Peck, A.B., Benn, C.R., Seaman, R.L., Eds.; SPIE: Bellingham, WA, USA, 2018; p. 94. [CrossRef]
27. Gómez, M.; Ezquerra, J.; Aranguren, G. Expert System Hardware for Fault Detection. Appl. Intell. 1998, 9, 245–262. [CrossRef]
28. Fuessel, D.; Isermann, R. Hierarchical motor diagnosis utilizing structural knowledge and a self-learning neuro-fuzzy scheme. IEEE Trans. Ind. Electron. 2000, 47, 1070–1077. [CrossRef]
29. He, Q.; Zhao, X.; Du, D. A novel expert system of fault diagnosis based on vibration for rotating machinery. J. Meas. Eng. 2013, 1, 219–227.
30. Napolitano, M.R.; An, Y.; Seanor, B.A. A fault tolerant flight control system for sensor and actuator failure using neural networks. Aircr. Des. 2000, 3, 103–128. [CrossRef]
31. Cork, L.; Walker, R.; Dunn, S. Fault detection, identification and accommodation techniques for unmanned airborne vehicles. In Proceedings of the Australian International Aerospace Congress, Fuduoka, Japan, 13–17 March 2005; AIAC, Ed.; AIAC: Australia, Melbourne, 2005; pp. 1–18.
32. Masrur, M.A.; Chen, Z.; Zhang, B.; Murphey, Y.L. Model-Based Fault Diagnosis in Electric Drive Inverters Using Artificial Neural Network. In Proceedings of the 2007 IEEE Power Engineering Society General Meeting, Tampa, FL, USA, 24–28 June 2007; pp. 1–7.
33. Wootton, A.; Day, C.; Haycock, P. Echo State Network applications in structural health monitoring. In Proceedings of the 2015 International Joint Conference on Neural Networks (IJCNN), Killarney, Ireland, 12–17 July 2015; pp. 1–7. [CrossRef]
34. Morando, S.; Marion-Péra, M.C.; Yousfi Steiner, N.; Jemei, S.; Hissel, D.; Larger, L. Fuel Cells Fault Diagnosis under Dynamic Load Profile Using Reservoir Computing. In Proceedings of the 2016 IEEE Vehicle Power and Propulsion Conference (VPPC), Hangzhou, China, 17–20 October 2016; pp. 1–6. [CrossRef]
35. Fan, Y.; Nowaczyk, S.; Rögnvaldsson, T.; Antonelo, E.A. Predicting Air Compressor Failures with Echo State Networks. In Proceedings of the Third European Conference of the Prognostics and Health Management Society 2016, PHME 2016, Bilbao, Spain, 5–8 July 2016; PHM Society: Nashville, TN, USA, 2016; pp. 568–578.
36. Westholm, J. Event Detection and Predictive Maintenance Using Component Echo State Networks. Master's Thesis, Lund University, Lund, Sweden, 2018.
37. Li, Y. A Fault Prediction and Cause Identification Approach in Complex Industrial Processes Based on Deep Learning. Comput. Intell. Neurosci. 2021, 2021, 6612342. [CrossRef] [PubMed]
38. Liu, J.; Pan, C.; Lei, F.; Hu, D.; Zuo, H. Fault prediction of bearings based on LSTM and statistical process analysis. Reliab. Eng. Syst. Saf. 2021, 214, 107646. [CrossRef]
39. Zhu, Y.; Li, G.; Tang, S.; Wang, R.; Su, H.; Wang, C. Acoustic signal-based fault detection of hydraulic piston pump using a particle swarm optimization enhancement CNN. Appl. Acoust. 2022, 192, 108718. [CrossRef]
40. Jana, D.; Patil, J.; Herkal, S.; Nagarajaiah, S.; Duenas-Osorio, L. CNN and Convolutional Autoencoder (CAE) based real-time sensor fault detection, localization, and correction. Mech. Syst. Signal Process. 2022, 169, 108723. [CrossRef]
41. Long, J.; Zhang, R.; Yang, Z.; Huang, Y.; Liu, Y.; Li, C. Self-Adaptation Graph Attention Network via Meta-Learning for Machinery Fault Diagnosis With Few Labeled Data. IEEE Trans. Instrum. Meas. 2022, 71, 1–11. [CrossRef]
42. Czajkowski, A.; Patan, K. Robust Fault Detection by Means of Echo State Neural Network. In Advanced and Intelligent Computations in Diagnosis and Control; Kowalczuk, Z., Ed.; Springer International Publishing: Cham, Switzerland, 2016; pp. 341–352.
43. Liu, C.; Yao, R.; Zhang, L.; Liao, Y. Attention Based Echo State Network: A Novel Approach for Fault Prognosis. In Proceedings of the 2019 11th International Conference on Machine Learning and Computing, ICMLC '19, Zhuhai, China, 22–24 February 2019; Association for Computing Machinery: New York, NY, USA, 2019; pp. 489–493. [CrossRef]
44. Ben Salah, S.; Fliss, I.; Tagina, M. Echo State Network and Particle Swarm Optimization for Prognostics of a Complex System. In Proceedings of the 2017 IEEE/ACS 14th International Conference on Computer Systems and Applications (AICCSA), Hammamet, Tunisia, 30 October–3 November 2017; pp. 1027–1034. [CrossRef]
45.
Luo, J.; Namburu, M.; Pattipati, K.;<