This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2022.3177537, IEEE Access
Improving prescriptive maintenance by incorporating post-prognostic information through chance constraints
ANTHONY D. CHO¹, RODRIGO A. CARRASCO¹,² (Member, IEEE), and GONZALO A. RUZ¹,²,³ (Senior Member, IEEE)
¹ Faculty of Engineering and Sciences, Universidad Adolfo Ibáñez, Santiago 7941169, Chile
(e-mail: acholo@alumnos.uai.cl, rax@uai.cl, gonzalo.ruz@uai.cl)
² Data Observatory Foundation, Santiago 7941169, Chile
³ Center of Applied Ecology and Sustainability (CAPES), Santiago 8331150, Chile
Corresponding author: Gonzalo A. Ruz (e-mail: gonzalo.ruz@uai.cl).
This research was partially funded by ANID FONDECYT 1180706, ANID PIA/BASAL FB0002, and ASTRO20-0058 grants from ANID,
Chile.
ABSTRACT Maintenance is one of the critical areas in operations in which a careful balance between
preventive costs and the effect of failures is required. Thanks to the increasing data availability, decision-
makers can now use models to better estimate, evaluate, and achieve this balance.
This work presents a maintenance scheduling model which considers prognostic information provided by
a predictive system. In particular, we developed a prescriptive maintenance system based on run-to-failure
signal segmentation and a Long Short Term Memory (LSTM) neural network. The LSTM network returns
the prediction of the remaining useful life when a fault is present in a component. We incorporate such
predictions and their inherent errors in a decision support system based on a stochastic optimization model,
incorporating them via chance constraints. These constraints control the number of failed components and
consider the physical distance between them to reduce sparsity and minimize the total maintenance cost.
We show through experimental results that this approach can compute solutions for relatively large instances
in reasonable computational time. Furthermore, the decision-maker can identify the correct operating point
depending on the balance between costs and failure probability.
INDEX TERMS prescriptive maintenance, chance constraints, remaining useful life.
I. INTRODUCTION
Operational areas within organizations are under ever-increasing
pressure to improve their performance. Social and political
pressures and competitors are just some of the drivers pushing
companies to be more efficient and effective with their
resources and assets. This pressure, in turn, has added a
tremendous burden to maintenance, an area that must keep
a delicate balance between the effects of failures and the
cost of preventive measures. Furthermore, the increase in
complexity of current production systems makes this balance
even more challenging, making condition-based maintenance
policies hard to define and implement. To deal with these
difficulties, maintenance areas have turned to operational
data to get an answer, taking advantage of many sensors and
telemetry systems that are now available. Here, predictive
analytics tools have helped convert data into information,
transforming the constant flow from sensors and actuators to
detect and even predict changes in the state of the system
[1], [2]. The development of frameworks such as Prognostics
and Health Management (PHM) [3], [4] has further
increased the need for fault prediction [5]–[8] as well as
for estimating the remaining useful life (RUL) of a component
after a fault appears [9]–[12]. However, this is only a partial
solution. As systems grow, so will the number of detections
and diagnoses, and what maintenance areas need is to have
reliable plans that help them balance the cost of preventive
measures with the ones caused by undetected or untreated
failures [13], [14]. In this setting, prescriptive analytics tools
might hold the key to improving the efficiency and efficacy
of these complex systems, taking advantage of the plethora
of operational data sources that are now available, if these
systems can handle the uncertainties inherent in prognostic
procedures.
FIGURE 1: ALMA radio telescopes. From the image gallery of almaobservatory.org, by Liam Young, Juan Carlos Rojas, and Sergio Otárola.
Researchers have recently been dealing with uncertainty
and component connections in maintenance planning from
the decision-making perspective [14]–[16]. Covering both
aspects will be essential for large and complex production
facilities like wind farms [17], solar generators, and even
scientific instruments such as the ALMA radio telescope
[5]. In this work, we will focus on the last step of PHM
for decision-making in maintenance, which covers the two
aspects mentioned before.
A. OUR CONTRIBUTION
Our work has the following novel contributions:
1) We propose a stochastic model with chance constraints
to handle unexpected failures and address components
with different levels of uncertainty in decision-making
for maintenance to minimize the total cost. In addition,
the model considers the distance between components
in each maintenance period and the total residual RUL.
2) We study and describe the effect of varying the
chance-constraint parameter on the resulting schedule.
II. PROBLEM DESCRIPTION
Let $N$ be the set of components distributed over $K$ machines,
which might be in different sites, as shown in Figure 1. Ad-
ditionally, each machine has a list of components on which a
predictive system, like the one described in [5], has detected
a degradation fault. Furthermore, each component has a pre-
dicted RUL distribution provided by this predictive system.
The machines are not necessarily identical, and we assume
that their components are independent between machines and
within each machine. If one of the machine’s components
fails, we consider that the machine fails. This type of setting
arises in several applications like manufacturing [18], offshore
wind farms [17], and scientific instruments like the
ALMA radio telescope [19], among others.
Our goal is to arrange this set of components to minimize
the maintenance cost considering the distance between ma-
chines and balancing the machines’ availability. We consider
a one-year planning horizon with maintenance decisions per
month in our work.
A. PREDICTIVE SYSTEM
LSTM networks are a type of artificial recurrent neural
network (RNN) architecture proposed by Hochreiter and
Schmidhuber [20] to deal with the vanishing gradient prob-
lem. One LSTM unit comprises three gates: an input gate,
an output gate, and a forget gate. It also has a memory cell
that remembers values over arbitrary time intervals, while the
three gates regulate the flow of information into and out of the
cell. This type of RNN has been found extremely successful
in many applications [21]. A typical LSTM [22] is illustrated
in Figure 2.
We have developed an RUL prediction system based on
LSTM neural networks. This network was pre-trained using
run-to-failure data with degradation faults such as the ones
described in [5]. The data for each component was analyzed
and clustered, with each cluster having a catastrophic
failure threshold. The system is in charge of identifying
which cluster best represents the detected fault, after which
it uses the corresponding analytical model to predict the
RUL distribution. As a result, we have available the mean
$\hat{r}_i$ and the standard deviation $\hat{\sigma}_i$ of the RUL estimation for
each component $i$. A general diagram of the developed
prediction system is shown in Figure 3.
FIGURE 2: LSTM unit.
III. PROPOSED MAINTENANCE SCHEDULING MODEL
The scheduling model formulation is based on the ideas
developed in [23]. However, unlike that work, instead of
accomplishing the given demands, our approach aims to use
the components as much as possible before the end of their
respective RUL.
A. DYNAMIC MAINTENANCE COST
A dynamic maintenance cost function models the trade-off
between the cost of preventive maintenance $C^{p}$ (early repair
before failure) and the corrective maintenance cost $C^{c}$ that
deals with unexpected failures [24]. Typically, corrective
maintenance costs are higher than preventive maintenance
ones. The dynamic cost function is therefore directly related
to the RUL of each component, and it is defined in [23]–[25]
as:
$$C_{i,t_{i,0}}(t) = \frac{C^{p}\, P\left(R_{i,t_{i,0}} > t\right) + C^{c}\, P\left(R_{i,t_{i,0}} \le t\right)}{\int_{0}^{t} P\left(R_{i,t_{i,0}} > z\right) dz + t_{i,0}} \tag{1}$$
where $R_{i,t_{i,0}}$ is the residual RUL of component $i$, which
started at time $t_{i,0}$.
FIGURE 3: Predictive system. (Diagram blocks: Signal; Fault prediction; Classifier; Model Selection; RUL prediction; Cluster 1, Cluster 2, ..., Cluster k, ..., Cluster N; Model 1, Model 2, ..., Model k, ..., Model N.)
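As a minimal sketch of how equation (1) could be evaluated numerically, the Python function below assumes that the RUL of a component is normally distributed with mean `r_hat` and standard deviation `sigma_hat` (the distribution used later for scenario generation) and that $P(R_{i,t_{i,0}} \le t) = P(r_i \le t_{i,0} + t)$. The function names, arguments, and the trapezoidal integration are illustrative choices and not part of the original implementation; the default costs are the values reported in the experimental settings.

```python
import math


def normal_cdf(x, mean, std):
    """CDF of a normal distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def dynamic_cost(t, r_hat, sigma_hat, t0=0.0,
                 c_p=100_000.0, c_c=400_000.0, steps=2000):
    """Dynamic maintenance cost of equation (1) at time t > 0.

    Assumption (not taken from the paper's code): RUL ~ Normal(r_hat, sigma_hat^2),
    so that P(R <= z) = P(RUL <= t0 + z).
    """
    if t <= 0:
        raise ValueError("t must be positive")

    def p_fail(z):
        return normal_cdf(t0 + z, r_hat, sigma_hat)

    # Trapezoidal rule for the integral of the survival function P(R > z) on [0, t].
    h = t / steps
    total = 0.5 * ((1.0 - p_fail(0.0)) + (1.0 - p_fail(t)))
    total += sum(1.0 - p_fail(k * h) for k in range(1, steps))
    integral = h * total

    numerator = c_p * (1.0 - p_fail(t)) + c_c * p_fail(t)
    return numerator / (integral + t0)


if __name__ == "__main__":
    for t in (50, 80, 100):
        print(t, round(dynamic_cost(t, r_hat=100.0, sigma_hat=10.0), 1))
```

Under these assumptions, with a mean RUL of 100 days and a standard deviation of 10 days, the cost rate decreases as maintenance is postponed from day 50 to day 80 and rises again by day 100, when the failure probability reaches one half. This is the trade-off between early preventive repair and late corrective repair that the function is meant to capture.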
B. SCHEDULING MODEL
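The prescriptive maintenance problem is modeled as the
following optimization problem:
$$\min \; \sum_{i \in G} \sum_{t \in T} C_{i,t_{i,0}}(t)\, z_{i,t} - \sum_{i \in G} \sum_{t \in T} V_{i,t}\, z_{i,t} + \sum_{t \in T} C^{+} \gamma_t + \sum_{t \in T} \left( C^{d}_{cl} D_t + C^{r+}_{cl} \bar{\Delta}_t + C^{r-}_{cl} \Delta_t \right), \tag{2}$$
such that,
$$\sum_{t \in T} z_{i,t} = 1, \quad \forall i \in G, \tag{3}$$
$$P\left( \sum_{i \in G} \sum_{t \in T} \zeta_{i,t}\, z_{i,t} \le \rho \right) \ge 1 - \epsilon, \tag{4}$$
$$\sum_{i \in G} f_i\, z_{i,t} \le \bar{M} + \gamma_t, \quad \forall t \in T, \tag{5}$$
$$D_t \ge \varphi_{i,j} \left( z_{i,t} + z_{j,t} - 1 \right), \quad \forall i, j \in G,\ i \ne j;\ \forall t \in T, \tag{6}$$
$$\bar{\Delta}_t = \sum_{i \in G} z_{i,t} \max\{0,\ r_i - O_p \cdot t\}, \quad \forall t \in T, \tag{7}$$
$$\Delta_t = \sum_{i \in G} z_{i,t} \max\{0,\ O_p \cdot t - r_i\}, \quad \forall t \in T, \tag{8}$$
$$z_{i,t} \in \{0,1\}, \quad \gamma_t, D_t, \bar{\Delta}_t, \Delta_t \ge 0, \quad \forall i \in G,\ t \in T, \tag{9}$$
where $C_{i,t_{i,0}}(t)$ is the dynamic maintenance cost defined
in Section III-A. The parameters and decision variables are
summarized in Table 1.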
The objective function, given by equation (2), minimizes
the total maintenance cost of a set of $|G|$ components.
It comprises each component's dynamic maintenance cost and nominal
functional cost, the cost of additional repair work-hours, the cost of
the distance between components, and the costs related to residual
RUL.
Constraint (3) guarantees that each component enters
maintenance exactly once in the planning horizon. The chance
constraint (4), in turn, restricts the number of components
that run out of RUL before their scheduled maintenance to
a threshold $\rho$ with a probability of at least $1 - \epsilon$. In that constraint,
the Bernoulli random variable $\zeta_{i,t}$ is 1 if $R_{i,t_{i,0}} < t$ and
0 otherwise, and $\rho$ sets an upper bound on the number of
components with catastrophic failure. The probability of not
achieving the bound set by $\rho$ is at most $\epsilon$.
Constraint (5) ensures that at most $\bar{M} + \gamma_t$ work-hours are
needed for maintenance in each period $t$. If additional work-hours
are needed, then additional costs are added to the total
maintenance cost.
TABLE 1: Summary of the sets, decision variables and parameters of the scheduling model.
Sets
$G$: Set of components.
$T$: Set of maintenance periods in the planning horizon.
Decision & State Variables
$z_{i,t}$: 1 if component $i$ enters maintenance in maintenance period $t$, and 0 otherwise.
$\gamma_t$: Additional maintenance hours added to maintenance period $t$.
$D_t$: Maximum distance between components in maintenance period $t$.
$\bar{\Delta}_t$: Total residual RUL of components in maintenance period $t$.
$\Delta_t$: Total time after the end of RUL at which components enter maintenance in maintenance period $t$.
Parameters
$C^{c}$: Per unit cost of corrective maintenance.
$C^{p}$: Per unit cost of preventive maintenance.
$C^{+}$: Per unit cost of maintenance work-hours added.
$V_{i,t}$: Nominal functional cost of component $i$ in period $t$.
$H$: Planning horizon length of maintenance, i.e., total maintenance periods.
$\bar{M}$: Maximum work-hours for maintenance in each maintenance period.
$O_p$: Operational period length.
$C^{d}_{cl}$: Per unit cost of distance in each period.
$C^{r+}_{cl}$: Per unit cost of residual RUL of components in each period.
$C^{r-}_{cl}$: Per unit cost of time after the end of RUL of components in each period.
$\varphi_{i,j}$: Distance between components $i$ and $j$.
$f_i$: Mean hours to repair component $i$.
$r_i$: RUL of component $i$.
Constraints (6) determine the maximum distance between
components planned for maintenance in period t; this allows
for reducing the dispersion of the components in each main-
tenance period. Finally, constraints (7)-(8) represent the total
number of days before and after the end of the RUL in which
components require maintenance in period t.
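To make constraints (5)-(8) concrete, the following sketch computes, for a fixed assignment of components to periods, the smallest values of the state variables that satisfy them. It is an illustrative helper under stated assumptions, not the authors' code: the function name, its arguments, and the convention that periods are numbered from 1 are assumptions, while the defaults `op=30` and `max_hours=160` are the values reported in the experimental settings.

```python
def evaluate_schedule(period_of, repair_hours, rul, dist, n_periods,
                      op=30.0, max_hours=160.0):
    """State variables implied by constraints (5)-(8) for a fixed schedule.

    period_of[i]    : period (1..n_periods) in which component i is maintained
    repair_hours[i] : f_i, work-hours needed to repair component i
    rul[i]          : r_i, RUL of component i in days
    dist[i][j]      : phi_{i,j}, distance between components i and j
    """
    result = {}
    for t in range(1, n_periods + 1):
        group = [i for i, p in enumerate(period_of) if p == t]
        hours = sum(repair_hours[i] for i in group)
        result[t] = {
            # (5): smallest gamma_t such that sum_i f_i z_it <= M + gamma_t
            "gamma": max(0.0, hours - max_hours),
            # (6): smallest D_t covering every pair scheduled in the same period
            "D": max((dist[i][j] for i in group for j in group if i != j),
                     default=0.0),
            # (7): residual RUL left unused at maintenance time
            "res_before": sum(max(0.0, rul[i] - op * t) for i in group),
            # (8): time elapsed after the end of the RUL
            "res_after": sum(max(0.0, op * t - rul[i]) for i in group),
        }
    return result


if __name__ == "__main__":
    out = evaluate_schedule(
        period_of=[1, 1, 2],
        repair_hours=[100.0, 80.0, 50.0],
        rul=[40.0, 25.0, 70.0],
        dist=[[0.0, 5.0, 9.0], [5.0, 0.0, 4.0], [9.0, 4.0, 0.0]],
        n_periods=2,
    )
    print(out)
```

Because $\gamma_t$ and $D_t$ carry positive costs in the objective, an optimal solution sets them to these smallest feasible values, which is why the helper can compute them directly from the assignment.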
C. SAFE APPROXIMATION OF CHANCE CONSTRAINT
The usage of chance constraint (4) in a decision-making
model makes it computationally challenging. In order to
make this constraint tractable, an upper bound can be computed
on the left-hand side of this inequality using the Markov
and generalized Bernstein inequalities, as proposed in Proposition 1
in [25]. Hence, given $z \in \{0,1\}^{|T| \times |G|}$ satisfying
$$\sum_{i \in G} \sum_{t \in T} E[\zeta_{i,t}]\, z_{i,t} \le \max\left\{ \rho\epsilon,\ \max_{\delta > 0} \left[ \frac{|G| \left( \left(\epsilon e^{\delta\rho}\right)^{1/|G|} - 1 \right)}{e^{\delta} - 1} \right] \right\} = \rho_{\epsilon}, \tag{10}$$
it will also satisfy constraint (4).
Figure 4 shows the behavior of $\rho_{\epsilon}$ as a function of $\epsilon$, for $\rho = 11$
and 150 components. In this figure, we can notice that for
$\epsilon$ very close to 0, the bound is smaller, implying that the
restriction on the number of components with
corrective maintenance is tightened, as long as it does not exceed the
amount $\rho$. As the value of $\epsilon$ increases, this condition becomes
less strict.
FIGURE 4: Safe approximation curve using $\rho = 11$ and 150 components.
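The constant $\rho_{\epsilon}$ in (10) has no closed form because of the inner maximization over $\delta$, but it can be approximated numerically. The sketch below uses a plain grid search over $\delta$; the function name, the grid limits, and the grid search itself are assumptions made for illustration and are not taken from the original implementation.

```python
import math


def safe_bound(eps, rho, n, delta_max=20.0, grid=20000):
    """Approximate rho_eps from (10) with a grid search over delta > 0.

    eps : risk level of the chance constraint (0 < eps < 1)
    rho : bound on the number of failed components
    n   : number of components, |G|
    """
    markov = rho * eps  # Markov term of the maximum
    best = -math.inf
    for k in range(1, grid + 1):
        delta = delta_max * k / grid
        # Bernstein term: |G| * ((eps * e^{delta*rho})^{1/|G|} - 1) / (e^delta - 1),
        # written with expm1 for numerical stability.
        value = n * math.expm1((math.log(eps) + delta * rho) / n) / math.expm1(delta)
        best = max(best, value)
    return max(markov, best)


if __name__ == "__main__":
    for eps in (1e-8, 0.1, 0.2):
        print(eps, round(safe_bound(eps, rho=11, n=150), 2))
```

For $\rho = 11$ and 150 components, this search should return a value of roughly 5.4 at $\epsilon = 0.1$, below $\rho$, and smaller values as $\epsilon$ approaches 0, in line with the qualitative behavior described for Figure 4.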
D. STOCHASTIC MIP SCHEDULING MODEL
To deal with non-linear terms, we linearize the chance
constraint and dynamic cost using safe approximation and
approach the stochastic optimization model with a set of
scenarios sampled from the prediction distribution of the
RUL of each component.
1) Chance constraint linearization
Using the safe approximation defined in Section III-C, we
can reformulate chance constraint (4) taking the same strategy
proposed in [23] by defining an auxiliary decision variable
as follows
$$P_{i,t} := E(\zeta_{i,t}) = P\left(R_{i,t_{i,0}} \le t\right), \quad \forall i \in G,\ t \in T. \tag{11}$$
Considering $\bar{P}_{i,t}$ as an upper bound of $P_{i,t}$ and $0 \le P_{i,t} \le \bar{P}_{i,t} \le 1$, we can rewrite
$$u_{i,t} = P_{i,t}\, z_{i,t}, \quad \forall i \in G,\ t \in T, \tag{12}$$
in the form of a safe approximation of the chance constraint
as follows:
$$\sum_{t \in T} \sum_{i \in G} u_{i,t} \le \rho_{\epsilon}, \tag{13}$$
$$0 \le u_{i,t} \le P_{i,t}, \quad \forall i \in G,\ t \in T, \tag{14}$$
$$P_{i,t} - (1 - z_{i,t})\, \bar{P}_{i,t} \le u_{i,t} \le \bar{P}_{i,t}\, z_{i,t}, \quad \forall i \in G,\ t \in T. \tag{15}$$
Analogously, we apply the linearization to the non-linear
term $C_{i,t_{i,0}}(t)\, z_{i,t}$ of the objective function by defining
$$w_{i,t} := \theta_{i,t}\, z_{i,t}, \tag{16}$$
where $\theta_{i,t} = C_{i,t_{i,0}}(t)$, and $0 \le \theta_{i,t} \le \bar{\theta}_{i,t} \le C^{c}$. Therefore,
the linearization of $w_{i,t}$ is given as follows:
$$0 \le w_{i,t} \le \theta_{i,t}, \quad \forall i \in G,\ t \in T, \tag{17}$$
$$\theta_{i,t} - (1 - z_{i,t})\, \bar{\theta}_{i,t} \le w_{i,t} \le \bar{\theta}_{i,t}\, z_{i,t}, \quad \forall i \in G,\ t \in T, \tag{18}$$
$$\bar{\theta}_{i,t} = \max_{0 \le t' \le t_{i,0} + t} C_{i,t_{i,0}}(t'), \quad \forall i \in G,\ t \in T. \tag{19}$$
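The effect of the linear constraints (14)-(15) can be checked directly: for a binary $z_{i,t}$ they leave a single feasible value for $u_{i,t}$, namely the product $P_{i,t} z_{i,t}$ of (12). The short sketch below computes the feasible interval for $u_{i,t}$; the function name is a hypothetical helper introduced only for this check.

```python
def u_interval(p, p_bar, z):
    """Feasible interval for u implied by constraints (14)-(15).

    p     : probability P_{i,t}
    p_bar : upper bound on P_{i,t}, with 0 <= p <= p_bar <= 1
    z     : binary decision z_{i,t} (0 or 1)
    """
    lower = max(0.0, p - (1 - z) * p_bar)  # left sides of (14) and (15)
    upper = min(p, p_bar * z)              # right sides of (14) and (15)
    return lower, upper


if __name__ == "__main__":
    print(u_interval(0.3, 1.0, 1))  # interval collapses to p
    print(u_interval(0.3, 1.0, 0))  # interval collapses to 0
```

The same argument applies to constraints (17)-(18), with $\theta_{i,t}$ and $\bar{\theta}_{i,t}$ in place of $P_{i,t}$ and $\bar{P}_{i,t}$.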
2) Scenarios
Since each component has its RUL distribution provided by
the predictive system, we create a set of scenarios, $S$, such
that each scenario is generated from each component's RUL
distribution, i.e.,
$$S = \{s_1, s_2, \ldots, s_{|S|}\} \tag{20}$$
with
$$s_k = (r_{s_k,1}, r_{s_k,2}, \ldots, r_{s_k,|G|}), \quad k \in \{1, \ldots, |S|\}, \tag{21}$$
$$r_{s_k,i} \sim \mathcal{N}\left(\hat{r}_i, \hat{\sigma}_i^2\right), \quad \forall i \in G, \tag{22}$$
where $\hat{r}_i$, $\hat{\sigma}_i$ represent the mean and standard deviation of the
RUL estimate of component $i$, respectively.
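Scenario generation as in (20)-(22) can be sketched with the Python standard library as follows. The function name, the seeding, and the example values are assumptions for illustration; the paper does not state how sampling was implemented.

```python
import random


def sample_scenarios(r_hat, sigma_hat, n_scenarios, seed=0):
    """Draw scenarios s_k = (r_{s_k,1}, ..., r_{s_k,|G|}) as in (20)-(22).

    r_hat[i], sigma_hat[i] : mean and standard deviation of the RUL of component i
    Returns a list of n_scenarios lists, one sampled RUL per component.
    """
    rng = random.Random(seed)  # fixed seed so that runs are reproducible
    return [
        [rng.gauss(mu, sd) for mu, sd in zip(r_hat, sigma_hat)]
        for _ in range(n_scenarios)
    ]


if __name__ == "__main__":
    scenarios = sample_scenarios([100.0, 60.0, 200.0], [10.0, 5.0, 20.0], 100)
    print(len(scenarios), len(scenarios[0]))
```

Note that a normal distribution can produce negative samples when $\hat{\sigma}_i$ is large relative to $\hat{r}_i$; how such draws should be handled is not specified in the text, so this sketch leaves them unchanged.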
3) Optimization model
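Considering the information on the distribution of the RUL
of each component and the linearization of the non-linear
terms of both the chance constraint and the dynamic cost
function described in Section III-D1, we can formulate our
prescriptive maintenance problem into a stochastic mixed-integer
model as follows,
$$\min \; \frac{1}{|S|} \sum_{s \in S} \Phi_s, \tag{23}$$
such that,
$$\Phi_s = \left( \sum_{t \in T} \sum_{i \in G} w_{i,t} + \sum_{t \in T} C^{+} \gamma_t \right) + \sum_{t \in T} \left( C^{d}_{cl} D_t + C^{r+}_{cl} \bar{\Delta}_{s,t} + C^{r-}_{cl} \Delta_{s,t} \right) - \sum_{t \in T} \sum_{i \in G} V_{i,t} \cdot z_{i,t}, \quad \forall s \in S, \tag{24}$$
$$\sum_{t \in T} z_{i,t} = 1, \quad \forall i \in G, \tag{25}$$
$$\sum_{i \in G} f_i\, z_{i,t} \le \bar{M} + \gamma_t, \quad \forall t \in T, \tag{26}$$
$$D_t \ge \varphi_{i,j} \left( z_{i,t} + z_{j,t} - 1 \right), \quad \forall i, j \in G,\ i \ne j;\ \forall t \in T, \tag{27}$$
$$\bar{\Delta}_{s,t} = \sum_{i \in G} z_{i,t} \max(0,\ r_{s,i} - O_p \cdot t), \quad \forall t \in T,\ s \in S, \tag{28}$$
$$\Delta_{s,t} = \sum_{i \in G} z_{i,t} \max(0,\ O_p \cdot t - r_{s,i}), \quad \forall t \in T,\ s \in S, \tag{29}$$
$$\sum_{t \in T} \sum_{i \in G} u_{i,t} \le \rho_{\epsilon}, \tag{30}$$
$$0 \le u_{i,t} \le P_{i,t}, \quad \forall i \in G,\ t \in T, \tag{31}$$
$$P_{i,t} - (1 - z_{i,t})\, \bar{P}_{i,t} \le u_{i,t} \le \bar{P}_{i,t}\, z_{i,t}, \quad \forall i \in G,\ t \in T, \tag{32}$$
$$0 \le w_{i,t} \le \theta_{i,t}, \quad \forall i \in G,\ t \in T, \tag{33}$$
$$\theta_{i,t} - (1 - z_{i,t})\, \bar{\theta}_{i,t} \le w_{i,t} \le \bar{\theta}_{i,t}\, z_{i,t}, \quad \forall i \in G,\ t \in T, \tag{34}$$
$$z_{i,t} \in \{0,1\}; \quad w_{i,t}, \gamma_t, D_t, \bar{\Delta}_{s,t}, \Delta_{s,t} \ge 0; \quad \forall i \in G,\ t \in T,\ s \in S, \tag{35}$$
where
$$\bar{\theta}_{i,t} = \max_{0 \le t' \le t_{i,0} + t} C_{i,t_{i,0}}(t'), \tag{36}$$
$$P_{i,t} = P\left(R_{i,t_{i,0}} \le t\right) = P\left(r_i \le t_{i,0} + t\right), \tag{37}$$
and $r_i$ has the same distribution as defined in (22). $t_{i,0}$ represents
the days elapsed since the last emission of the predictive
information until the moment the scheduling process is carried
out, and $\rho_{\epsilon}$ is the safe approximation constant defined in
(10). For simplicity, we consider $\bar{P}_{i,t} = 1$, $\forall i \in G,\ t \in T$.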
The model aims to minimize the average cost over
all the scenarios, which is described by equations
(23) and (24). Constraints (25)-(29) guarantee that every
component enters maintenance exactly once during the planning
horizon and ensure that at most $\bar{M} + \gamma_t$ work-hours
are needed for maintenance in each period $t$. These constraints also
reduce the geographical dispersion between the components
attended in each period, considering the distance between
them. In addition, the model aims to use each component as much as
possible and to reduce the days by which each one enters maintenance after
the end of its RUL, in each period $t$ and each scenario.
Constraints (30)-(32) represent the linearization of
the chance constraint (4), whereas the linearization of the
dynamic cost is given by equations (33)-(34).
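The scenario-dependent part of the objective can be illustrated in isolation. The sketch below takes a fixed schedule and averages, over the scenarios, only the residual-RUL terms of (24) computed through (28)-(29); it also reports the share of (scenario, component) cases in which maintenance falls after the sampled RUL. It deliberately omits the remaining cost terms and does not solve the optimization model. The function name and arguments are assumptions, and the default costs are the values of $C^{r+}_{cl}$ and $C^{r-}_{cl}$ reported in the experimental settings.

```python
def average_residual_cost(period_of, scenarios, op=30.0,
                          c_before=11_000.0, c_after=22_000.0):
    """Scenario average of the residual-RUL terms of (24) for a fixed schedule.

    period_of[i] : period (numbered from 1) in which component i is maintained
    scenarios    : list of sampled RUL vectors, one value per component
    Returns (average cost over scenarios, fraction of late cases).
    """
    total = 0.0
    late = 0
    for rul in scenarios:
        for i, t in enumerate(period_of):
            before = max(0.0, rul[i] - op * t)  # contribution to (28)
            after = max(0.0, op * t - rul[i])   # contribution to (29)
            total += c_before * before + c_after * after
            late += after > 0.0
    n_cases = len(scenarios) * len(period_of)
    return total / len(scenarios), late / n_cases


if __name__ == "__main__":
    cost, late_share = average_residual_cost(
        period_of=[1, 2],
        scenarios=[[40.0, 55.0], [25.0, 70.0]],
    )
    print(cost, late_share)
```

Averaging per-scenario costs in this way mirrors the sample-average structure of (23), where every scenario receives the same weight $1/|S|$.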
IV. EXPERIMENTAL SETTINGS
The proposed prescriptive maintenance system was imple-
mented in Python 3.8.10 using Gurobi 9.1.1 as a mixed-
integer optimization solver. The experiments were done on
a computer with an Intel® Core™ i5-3230M processor at 2.6
GHz x 4 cores, with 8 GB RAM, and Linux Mint 20.1 Ulyssa
(64 bits) as OS.
The model settings were as follows: the planning horizon
for maintenance was set to one year, i.e., $H = |T| = 12$,
with each month as a period with an operational length of 30
days, $O_p = 30$. The preventive and corrective costs were
$C^{p} = 100000$ and $C^{c} = 400000$, respectively. Other related
costs were: $C^{+} = 10000$, $C^{d}_{cl} = 10000$, $C^{r+}_{cl} = 11000$,
$C^{r-}_{cl} = 22000$, and $V_{i,t} = 5000$. The maximum work-hours
was set to $\bar{M} = 160$, and 100 scenarios were generated.
These cost values were set with the objective of evaluating
both the dynamic cost and the performance of the proposed
model.
A public repository with all the benchmark instances tested
with our methodology can be found at [26].
FIGURE 5: Case study: 150 faulty components distributed in 9 machines. (a) Distribution of maintenance costs of all scenarios. (b) Number of components per machine. (c) Maintenance before or after RUL ($\hat{r}_i$, $\forall i \in G$). (d) Maintenance before or after RUL of all scenarios. (e) Pareto curve: varying $\epsilon$-value.
V. COMPUTATIONAL RESULTS
A simulated problem of $|G| = 150$ components distributed
over $K = 9$ machines, as shown in Fig. 5b, was used
as one of the instances to test the model's performance.
Furthermore, we set $\epsilon = 0.1$ and $\rho = 11$; this implies that
at most about 5% of the components enter corrective maintenance
due to a catastrophic failure with a probability of at least
$1 - \epsilon$. We solved the model using multiple scenarios sampled
from the RUL distribution, described in Section III-D2, and we
assumed that the predictive system provided us with the
information on the same day that the scheduling model was
executed; therefore, we set $t_{i,0} = 0$. The
minimum maintenance cost of each scenario is shown in
Figure 5a, where the red dashed line represents the average
maintenance cost over all the scenarios.
In the resulting recommendation, all components enter
maintenance before the end of their RUL, with maintenance
planned fewer than 12 days before they fail with respect
to the estimated RUL $\hat{r}_i$, $\forall i \in G$, as illustrated in Figure
5c (which presents no After RUL orange bars). Looking at
all scenarios, in 3.41% of the cases a component
goes into maintenance after the end of its sampled
RUL, $r_{s,i}$, $\forall s \in S,\ i \in G$. This verifies that less
than 5% of the components go into corrective maintenance,
as we had previously set, and is reflected in the orange
bars in Figure 5d.
Constraint (4) introduces a tuning parameter, $\epsilon$, that helps
the decision-maker balance the different costs. Figure 5e
shows that for smaller values of $\epsilon$, a higher maintenance cost
is needed since the model tries to increase the machines'
availability by scheduling maintenance procedures earlier. For
the case study, if $\epsilon \ge 0.1$, the maintenance cost decreases
almost linearly, showing slight changes in some periods.
Figures 6a and 6b show the effect on the schedule of increasing
$\epsilon$ from $10^{-8}$ to 0.1. The analysis shows significant
changes, with several grouping modifications in each period.
On the other hand, when we increased $\epsilon$ from 0.1 to
0.2, there were only small changes in the movement of some
components: one component from period 2 to period 1, two
components from period 9 to period 10, and two components
from period 10 to the next period. Varying $\epsilon$ thus changes
the schedule as well as the maintenance cost and the
computational effort. In the figures, for each period, the red
box indicates a component with a residual RUL of less than
ten days, the orange box one with a residual RUL between 11 and 20 days,
and the green box one with a residual RUL greater than 20 days.
We also tested the performance in instances with 500 and
1000 components distributed over 20 machines, measuring
the time required to solve them. Our instances with 1000
components were solved in around 12 minutes. The results
are summarized in Table 2.
VI. CONCLUSION
The increasing complexity of systems has made it harder
for the operational areas to develop well-balanced policies in
maintenance. The availability of data has helped significantly
get better information, but decision support tools are crucial
to help improve efficiency and the effective use of resources
and assets. Furthermore, these tools need to embrace the
uncertainty inherent with predictive analytics tools such as
RUL predictions to be helpful.
Our work shows an initial approach to doing this. Our
model performed well in the tested instances, solving cases with
up to 1000 components in around 12 minutes, even when there are
different levels of uncertainty in the predicted RUL. This
approach complements predictive systems, taking advantage
of their information. Furthermore, the scheduling model can
handle a more extensive set of components, reduce processing
time, and give robust recommendations to the decision-makers.
ACKNOWLEDGMENT
This research was partially funded by ANID FONDECYT
1180706, ANID PIA/BASAL FB0002, and ASTRO20-0058
grants from ANID, Chile.
REFERENCES
[1] E. Balaban, M. J. Kochenderfer, J. J. Alonso, M. Schwager, S. U. D.
of Aeronautics, and Astronautics, Health-aware Decision Making Under
Uncertainty for Complex Systems. Stanford University, 2020. [Online].
Available: https://books.google.cl/books?id=sjSuzQEACAAJ
[2] J. J. Montero Jimenez, S. Schwartz, R. Vingerhoeds, B. Grabot, and
M. Salaün, “Towards multi-model approaches to predictive maintenance:
A systematic literature survey on diagnostics and prognostics,” Journal of
Manufacturing Systems, vol. 56, pp. 539–557, 2020. [Online]. Available:
https://www.sciencedirect.com/science/article/pii/S0278612520301187
[3] C. S. Byington, M. J. Roemer, and T. Galie, “Prognostic enhancements
to diagnostic systems for improved condition-based maintenance [military
aircraft],” in Proceedings, IEEE Aerospace Conference, vol. 6, 3 2002, pp.
6–6.
[4] T. Tinga and R. Loendersloot, “Aligning PHM, SHM and CBM by understanding
the physical system failure behaviour,” in Proceedings of the 2nd European
Conference of the Prognostics and Health Management Society (PHME 2014),
A. Bregon and M. Daigle, Eds. PHM Society, 7 2014, pp. 162–171.
[5] A. D. Cho, R. A. Carrasco, G. A. Ruz, and J. L. Ortiz,
“Slow degradation fault detection in a harsh environment,” IEEE
Access, vol. 8, pp. 175 904–175 920, 2020. [Online]. Available:
https://ieeexplore.ieee.org/document/9205191
[6] G. P. Bhandari and R. Gupta, “Machine learning based software fault
prediction utilizing source code metrics,” in 2018 IEEE 3rd International
Conference on Computing, Communication and Security (ICCCS), 2018,
pp. 40–45.
[7] J. Liu, C. Pan, F. Lei, D. Hu, and H. Zuo, “Fault prediction of
bearings based on LSTM and statistical process analysis,” Reliability
Engineering and System Safety, vol. 214, no. C, 2021. [Online]. Available:
https://ideas.repec.org/a/eee/reensy/v214y2021ics0951832021001873.html
[8] Y. Li, “A fault prediction and cause identification approach in complex
industrial processes based on deep learning,” Computational Intelligence
and Neuroscience, vol. 2021, pp. 1–13, 03 2021.
[9] G. Vachtsevanos and P. Wang, “Fault prognosis using dynamic wavelet
neural networks,” in 2001 IEEE Autotestcon Proceedings. IEEE Systems
Readiness Technology Conference. (Cat. No.01CH37237), 8 2001, pp.
857–870.
[10] N. Z. Gebraeel, M. A. Lawley, R. Li, and J. K. Ryan, “Residual-life
distributions from component degradation signals: A Bayesian approach,”
IIE Transactions, vol. 37, no. 6, pp. 543–557, 2005. [Online]. Available:
https://doi.org/10.1080/07408170590929018
[11] O. E. Yurek and D. Birant, “Remaining useful life estimation for predictive
maintenance using feature engineering,” in 2019 Innovations in Intelligent
Systems and Applications Conference (ASYU), 10 2019, pp. 1–5.
[12] H. Rozas, F. Jaramillo, A. Perez, D. Jimenez, M. E. Orchard, and
K. Medjaher, “A method for the reduction of the computational
cost associated with the implementation of particle-filter-based
failure prognostic algorithms,” Mechanical Systems and Signal
Processing, vol. 135, p. 106421, 2020. [Online]. Available:
https://www.sciencedirect.com/science/article/pii/S0888327019306429
[13] K. Benaggoune, S. Meraghni, J. Ma, L. H. Mouss, and N. Zerhouni, “Post
prognostic decision for predictive maintenance planning with remaining
useful life uncertainty,” in 2020 Prognostics and Health Management
Conference (PHM-Besançon), 5 2020, pp. 194–199.
[14] O. Bougacha, C. Varnier, and N. Zerhouni, “A review of post-prognostics
decision-making in prognostics and health management,” International
Journal of Prognostics and Health Management, vol. 11, p. 31, 12 2020.
[15] S. Meraghni, L. Terrissa, S. Ayad, N. Zerhouni, and C. Varnier,
“Post-prognostics decision in Cyber-Physical Systems,” in International
Conference on Advanced Systems and Electric Technologies, Hammamet,
Tunisia, 3 2018. [Online]. Available: https://hal.archives-ouvertes.fr/hal-
02399256
[16] J. I. Aizpurua, V. M. Catterson, Y. Papadopoulos, F. Chi-
acchio, and D. D’Urso, “Supporting group maintenance
through prognostics-enhanced dynamic dependability prediction,”
Reliability Engineering & System Safety, vol. 168, pp.
171–188, 2017, maintenance Modelling. [Online]. Available:
https://www.sciencedirect.com/science/article/pii/S0951832016308626
[17] F. Camci, “Maintenance scheduling of geographically distributed
assets with prognostics information,” European Journal of Operational
Research, vol. 245, no. 2, pp. 506–516, 2015. [Online]. Available:
https://www.sciencedirect.com/science/article/pii/S0377221715002301
[18] C. Jin, D. Djurdjanovic, H. D. Ardakani, K. Wang, M. Buzza, B. Begheri,
P. Brown, and J. Lee, “A comprehensive framework of factory-to-factory
dynamic fleet-level prognostics and operation management for geograph-
ically distributed assets,” in 2015 IEEE International Conference on Au-
tomation Science and Engineering (CASE), 8 2015, pp. 225–230.
[19] J. L. Ortiz and R. A. Carrasco, “Model-based fault detection and
diagnosis in ALMA subsystems,” in Observatory Operations: Strategies,
Processes, and Systems VI, A. B. Peck, C. R. Benn, and R. L.
Seaman, Eds. SPIE, 7 2016, pp. 919 – 929. [Online]. Available:
https://doi.org/10.1117/12.2233204
[20] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural
computation, vol. 9, pp. 1735–80, 12 1997.
[21] G. V. Houdt, C. Mosquera, and G. Nápoles, “A review on the long short-
term memory model,” Artificial Intelligence Review, pp. 1–27, 2020.
[Online]. Available: https://link.springer.com/article/10.1007/s10462-020-
09838-1
[22] K. Greff, R. Kumar Srivastava, J. Koutník, B. R. Steunebrink, and
J. Schmidhuber, “Lstm: A search space odyssey,” IEEE transactions on
neural networks and learning systems, vol. 28, no. 10, pp. 2222–2232, 03
2017.
[23] B. Basciftci, S. Ahmed, and N. Z. Gebraeel, “Data-driven maintenance
and operations scheduling in power systems under decision-dependent
uncertainty,” IISE Transactions, vol. 52, no. 6, pp. 589–602, 2019.
[Online]. Available: https://doi.org/10.1080/24725854.2019.1660831
[24] M. Yildirim, X. A. Sun, and N. Z. Gebraeel, “Sensor-driven condition-
based generator maintenance scheduling—Part I: Maintenance problem,”
IEEE Transactions on Power Systems, vol. 31, no. 6, pp. 4253–4262, 11
2016.
TABLE 2: Runtime in seconds for each step.
Components   Data loading [s]   Decision vars. [s]   Constraints [s]   Solver [s]   Total [s]
150          20.48              0.03                 2.75              17.40        40.68
500          64.68              0.08                 23.28             97.44        185.53
1000         128.98             0.20                 87.39             503.37       719.99
(a) Scheduling (= 108).
(b) Scheduling (= 0.1).
(c) Scheduling (= 0.2). The boxes highlight the changes from (b)
FIGURE 6: Effect of scheduling: varying -value in the chance constraint
[25] B. Basciftci, S. Ahmed, N. Z. Gebraeel, and M. Yildirim, “Stochastic
optimization of maintenance and operations schedules under unexpected
failures,” IEEE Transactions on Power Systems, vol. 33, no. 6,
pp. 6755–6765, 11 2018.
[26] A. D. Cho, R. A. Carrasco, and G. A. Ruz, “Benchmarks
for maintenance problems v. 1.0,” 2022. [Online]. Available:
https://github.com/raxlab/maintenanceBenchmarks
ANTHONY D. CHO received the B.S. degree in
mathematics from the Universidad de Carabobo,
Valencia, Venezuela, in 2008. He is currently pur-
suing the Ph.D. degree in Industrial Engineer-
ing and Operations Research at the Universidad
Adolfo Ibáñez, Santiago, Chile.
His research interests include machine learning,
evolutionary algorithms, operations research, prescriptive
analytics, and image processing.
TABLE 3: Complementary sets and parameters.
$S$                Set of scenarios of RULs sampled from the RUL distribution.
$r_{s,i}$          RUL sampled from the distribution of component $i$ for scenario $s$.
$\hat{r}_i$        Estimated mean RUL of component $i$.
$\hat{\sigma}_i$   Estimated standard deviation of the RUL of component $i$.
$R_{i,t_{i,0}}$    Residual RUL of component $i$ starting at time $t_{i,0}$.
TABLE 4: Abbreviations.
RUL    Remaining Useful Life
RNN    Recurrent Neural Network
ALMA   Atacama Large Millimeter Array
PHM    Prognostics and Health Management
LSTM   Long Short-Term Memory
IFP    Intermediate Frequency Processor
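The scenario set $S$ and the samples $r_{s,i}$ listed in Table 3 can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes, purely for illustration, that each component's RUL is drawn independently from a normal distribution with the estimated mean $\hat{r}_i$ and standard deviation $\hat{\sigma}_i$, clipped at zero. The function names, the numeric values, and the planned maintenance times are hypothetical.

```python
import random


def sample_rul_scenarios(mean_rul, std_rul, n_scenarios, seed=0):
    """Return scenarios[s][i]: sampled RUL of component i in scenario s.

    Assumption (not from the paper): independent normal draws with the
    estimated mean and standard deviation, clipped at zero.
    """
    rng = random.Random(seed)
    return [
        [max(0.0, rng.gauss(m, sd)) for m, sd in zip(mean_rul, std_rul)]
        for _ in range(n_scenarios)
    ]


def empirical_failure_probability(scenarios, maintenance_time):
    """For each component, return the fraction of scenarios in which the
    sampled RUL is shorter than the planned maintenance time."""
    n = len(scenarios)
    return [
        sum(1 for row in scenarios if row[i] < t) / n
        for i, t in enumerate(maintenance_time)
    ]


# Hypothetical estimates for three components (arbitrary time units).
mean_rul = [30.0, 12.0, 50.0]
std_rul = [5.0, 3.0, 10.0]

scenarios = sample_rul_scenarios(mean_rul, std_rul, n_scenarios=1000)
probs = empirical_failure_probability(
    scenarios, maintenance_time=[20.0, 12.0, 80.0]
)
print(probs)
```

Under these assumptions, a sampled estimate of this kind is what a chance constraint would compare against a tolerated failure probability. Maintenance planned well before the mean RUL gives a small estimate, and maintenance planned well after it gives an estimate close to one.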
RODRIGO A. CARRASCO (M’2002) is a pro-
fessor at the School of Engineering and Sciences
at Universidad Adolfo Ibáñez, and Academic Di-
rector of the Master in Industrial Engineering pro-
gram. He also founded and was the initial director
of the UAI Systems Center, a center dedicated
to technology transfer and solving complex real-
life problems using operations research tools. His
research is focused on the design and development
of decision support tools and algorithms.
Before joining UAI, he was a researcher at Siemens Corporate Research
in Princeton, NJ, developing decision support algorithms for smart grids
and energy management. Prior to this, he worked at Booz Allen Hamilton,
leading operations research projects in Chile, Argentina, Brazil, Peru, and
Canada.
Rodrigo holds an electrical engineering degree and a master of science
in engineering, focused on control systems, from Universidad Católica de
Chile, and an M.Phil. and a Ph.D. from Columbia University in industrial
engineering and operations research.
GONZALO A. RUZ received his B.Sc. (2002),
P.E. and M.Sc. (2003) degrees in Electrical En-
gineering from Universidad de Chile, Santiago,
Chile. He then completed his Ph.D. degree (2008)
at Cardiff University, UK. Currently, he is a Pro-
fessor at the Faculty of Engineering and Sciences,
Universidad Adolfo Ibáñez, Santiago, Chile. His
research interests include machine learning, evolu-
tionary computation, data science, gene regulatory
network modeling, and complex systems.