Efficient Probabilistic Model Checking of Smart Building Maintenance using Fault Maintenance Trees

Nathalie Cauchi∗

Department of Computer Science, University of Oxford

Oxford, United Kingdom

nathalie.cauchi@cs.ox.ac.uk

Khaza Anuarul Hoque

Department of Computer Science, University of Oxford

Oxford, United Kingdom

khaza.hoque@cs.ox.ac.uk

Alessandro Abate

Department of Computer Science, University of Oxford

Oxford, United Kingdom

aabate@cs.ox.ac.uk

Mariëlle Stoelinga

FMT Group, University of Twente

Twente, The Netherlands

marielle@cs.utwente.nl

ABSTRACT

Cyber-physical systems, like Smart Buildings and power plants, have to meet high standards, both in terms of reliability and availability. Such metrics are typically evaluated using Fault Trees (FTs), which do not consider maintenance strategies that can significantly improve lifespan and reliability. Fault Maintenance Trees (FMTs), an extension of FTs that also incorporates maintenance and degradation models, are a novel technique that serves as a good planning platform for balancing the total costs and dependability of a system. In this work, we apply the FMT formalism to a Smart Building application. We propose a framework for modelling FMTs using probabilistic model checking and present an algorithm for performing abstraction of the FMT in order to reduce the size of its equivalent Continuous Time Markov Chain. This allows us to apply probabilistic model checking more efficiently. We demonstrate the applicability of our proposed approach by evaluating various dependability metrics and maintenance strategies of a Heating, Ventilation and Air-Conditioning system’s FMT.

CCS CONCEPTS

• Computer systems organization → Maintainability and maintenance;

KEYWORDS

Fault Maintenance Trees, Formal modelling, Probabilistic Model

checking, Reliability, Building Automation Systems, PRISM

ACM Reference format:

Nathalie Cauchi, Khaza Anuarul Hoque, Alessandro Abate, and Mariëlle Stoelinga. 2017. Efficient Probabilistic Model Checking of Smart Building Maintenance using Fault Maintenance Trees. In Proceedings of BuildSys ’17, Delft, Netherlands, November 8–9, 2017, 10 pages.

DOI: 10.1145/3137133.3137138

∗The corresponding author

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

BuildSys ’17, Delft, Netherlands

©2017 ACM. 978-1-4503-5544-5/17/11...$15.00

DOI: 10.1145/3137133.3137138

1 INTRODUCTION

Worldwide, buildings account for approximately 40% of the total energy consumption and 20% of the total CO2 emissions annually [5]. Efficient Building Automation Systems (BAS) can reduce energy consumption by up to 30% through their optimal operation, continuous commissioning and maintenance [5]. Constructions employing such technologies are termed Smart Buildings. Such technologies have to adhere to high standards, both in terms of reliability and availability. One way of achieving this is by employing methods to perform preventative and predictive maintenance actions. Diagnostic and fault detection techniques for Smart Building applications have been developed in [2, 15]. Predictive and preventative maintenance strategies are devised in [1, 4]. However, these techniques do not provide availability and reliability measurements and focus only on the synthesis of maintenance policies in the presence of degradation and on fault finding. Reliability and availability are typically tackled using Fault Trees (FTs), where the focus is on finding the root causes of a system failure using a top-down approach. FTs do not include maintenance strategies in the analysis, although these are a key element in reducing component failures. The Fault Maintenance Tree (FMT) is presented in [14] as an extension of the FT encompassing both degradation and maintenance models. The degradation models represent the different levels of component degradation and are known as Extended Basic Events (EBE). The maintenance models incorporate the undertaken maintenance policy, which includes both inspections and repairs. These are modelled using Repair and Inspection modules in the FMT framework.

In the literature, FMTs are analysed using the Statistical Model Checking (SMC) technique [14], which provides statistical guarantees. In contrast, Probabilistic Model Checking (PMC), based on numerical analysis, provides formal guarantees with higher accuracy when compared with SMC [17]. However, numerical methods are far more memory intensive and may result in a state space explosion. This limitation of PMC often leaves SMC as the last resort [17]. In this paper we tackle the FMT analysis using PMC. Our contributions can be summarised as follows:

(1) We formalise the FMT framework using Continuous Time Markov Chains (CTMCs).

(2) We formalise the dependability metrics using the extended Continuous Stochastic Logic (CSL) formalism such that they can be computed using the PRISM model checker [12].


(3) To mitigate the state space explosion problem, we present an FMT abstraction technique which decomposes a large FMT into an equivalent abstract FMT based on our proposed graph decomposition algorithm. Using our framework, we are able to achieve a 67% reduction in the state space size.

(4) Finally, we construct an FMT that identifies failure of a Heating, Ventilation and Air-conditioning (HVAC) system. We apply the developed framework to the built FMT and evaluate relevant dependability metrics, together with different maintenance strategies, using the PRISM model checker.

To the best of our knowledge, this is the first attempt to analyse FMTs using Probabilistic Model Checking and also the first application to Smart Building systems.

This article has the following structure: Section 2 introduces the fault maintenance trees and probabilistic model checking frameworks. This is followed by the developed methodology for modelling FMTs using CTMCs and performing model checking in Section 3. The framework is applied to a heating, ventilation and air-conditioning (HVAC) case study which is presented in Section 4.

2 PRELIMINARIES

2.1 Fault maintenance trees framework

Fault trees are directed acyclic graphs (DAGs) describing the combinations of component failures that lead to system failures. The leaves in the fault trees are called basic events and denote the system failures. The internal nodes of the graph are called gates and describe the different ways that failures can interact to cause other components to fail. The gates in a fault tree can be of several types, including the AND gate, the OR gate and the k/N-gate [14].

Fault maintenance trees (FMTs) extend fault trees by including maintenance (all the standard FT gates are also employed by the FMTs). This is achieved by making use of:

(1) Extended Basic Events - The basic events are modified to incorporate degradation models of the component the leaf represents. The degradation models represent the different discrete levels of degradation the components can be in and are a function of time.

(2) Rate Dependency Events - A new gate introduced in [14], labelled RDEP, that accelerates the degradation rates of dependent child nodes and is depicted in Figure 1. When the component connected to the input of the RDEP fails, the degradation rate of the dependent components is accelerated with an acceleration factor γ.

Figure 1: RDEP gate with 1 input (in) and n dependent components, also known as children.

(3) Repair and Inspection modules - The repair module (RM) performs cleaning or replacement actions. These actions can be carried out either using fixed time schedules or when enabled by the inspection module (IM). The IM performs periodic inspections and, when components fall below a certain degradation threshold, a repair or partial replacement is initiated by the IM to be performed by the RM.

2.2 Probabilistic model checking

Model checking is a well-established formal verification technique used to verify the correctness of finite-state systems. Given a formal model of the system to be verified in terms of labelled state transitions and the properties to be verified in terms of temporal logic, the model checking algorithm exhaustively and automatically explores all the possible states in a system to verify whether the property is satisfied. Probabilistic model checking deals with systems that exhibit stochastic behaviour and is based on the construction and analysis of a probabilistic model of the system. We make use of CTMCs, having both transition and state labels, to perform stochastic modelling. Properties are expressed in the form of extended Continuous Stochastic Logic (CSL) [11].

Definition 2.1. The tuple $C = (S, s_0, TL, AP, L, R)$ defines a CTMC which is composed of a set of states $S$, the initial state $s_0$, a finite set of transition labels $TL$, a finite set of atomic propositions $AP$, a labelling function $L : S \to 2^{AP}$ and the transition rate matrix $R : S \times S \to \mathbb{R}_{\geq 0}$. The rate $R(s, s')$ defines the delay before which a transition between states $s$ and $s'$ takes place. If $R(s, s') \neq 0$ then the probability that a transition between the states $s$ and $s'$ is triggered within $t$ time units is defined as $1 - e^{-R(s, s') t}$, where $t$ is time. No transitions will trigger if $R(s, s') = 0$.
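As a minimal sketch of Definition 2.1, the Python snippet below evaluates the probability $1 - e^{-R(s,s')t}$ that a transition with a given rate has triggered within $t$ time units. The function name and the example rate are illustrative assumptions and do not come from the original framework.

```python
import math

def transition_probability(rate: float, t: float) -> float:
    """Probability that a transition with the given rate triggers within time t."""
    if rate < 0 or t < 0:
        raise ValueError("rate and t must be non-negative")
    # A zero rate yields probability 0: the transition never triggers.
    return 1.0 - math.exp(-rate * t)

# Hypothetical example: rate 0.5 per time unit, observed over 2 time units.
p = transition_probability(0.5, 2.0)
```

Larger rates therefore correspond to shorter expected delays, which is why a large fixed rate can be used to model a negligible delay.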

The logic of CSL specifies state-based properties for CTMCs, built out of propositional logic, a steady-state operator that refers to the stationary probabilities, and a probabilistic operator for reasoning about transient state probabilities. The state formulas are interpreted over states of a CTMC, whereas the path formulas are interpreted over paths in a CTMC. For details about the syntax and semantics of CSL (which also includes reward formulae), we refer the interested readers to [11]. Examples of CSL properties with their natural language translations are: (i) $P_{\geq 0.95}[F\ \mathit{complete}]$ - “The probability of the system eventually completing its execution successfully is at least 0.95”. (ii) $R_{=?}[F\ \mathit{success}]$ - “What is the expected reward accumulated before the system successfully terminates?”

3 FORMALIZING FMTS USING CTMCS

In this section, we first formalise the FMT framework by presenting the formal syntax and semantics for modelling FMTs using CTMCs. Next, we list the set of metrics used to analyse the FMT. Finally, we present the developed framework which allows us to analyse large FMTs using probabilistic model checking (PMC).

3.1 FMT Syntax

To formalise the syntax of FMTs using CTMCs, we first define the set $\mathcal{F}$, characterizing each FMT element by type, inputs and rates. We introduce a new element called DELAY which will be used to model the deterministic time delays required by the extended basic events (EBE), repair module (RM) and inspection module (IM). We restrict the set $\mathcal{F}$ to contain the EBE, RDEP gate, OR gate, DELAY, RM and IM modules since these will be the components used in the case study presented in Section 4.


Definition 3.1. The set $\mathcal{F}$ of FMT elements consists of the following tuples. Here, $n, N \in \mathbb{N}$ are natural numbers, $thresh, in, trig \in \{0, 1\}$ take binary values, $T_{deg}, T_{cln}, T_{rplc}, T_{rep}, T_{oh}, T_{insp} \in \mathbb{R}_{\geq 0}$ are deterministic delays, and $\gamma \in \mathbb{R}_{\geq 0}$ is a rate.

• $(EBE, T_{deg}, T_{cln}, T_{rplc}, N)$ represents the extended basic events with $N$ discrete degradation levels, each of which degrades with a time delay equal to $T_{deg}$. It also takes as inputs the time $T_{cln}$ taken to restore the EBE to the previous degradation level when cleaning is performed and the time $T_{rplc}$ taken to restore the EBE to its initial state following a replacement action.

• $(RDEP, n, \gamma, in, T_{deg})$ represents the RDEP gate with $n$ dependent children, acceleration rate $\gamma$, the input $in$ which activates the gate and $T_{deg}$, the degradation rate of the dependent children.

• $(OR, n)$ represents the OR gate with $n$ inputs.

• $(RM, n, T_{rep}, T_{oh}, T_{insp}, T_{cln}, T_{rplc}, thresh, trig)$ represents the RM module which acts on $n$ EBEs (in our case, this corresponds to all the EBEs in the FMT). The RM can be triggered periodically to perform a cleaning action, every $T_{rep}$ delay, or a replacement action, every $T_{oh}$ delay, or it can be triggered by the IM when the delay $T_{insp}$ has elapsed and the $thresh$ condition is met. The time to perform a cleaning action is $T_{cln}$, while the time taken to perform a replacement is $T_{rplc}$. The $trig$ signal ensures that when the component is not in the degraded states, no unnecessary maintenance actions are carried out.

• $(IM, n, T_{insp}, T_{cln}, T_{rplc}, thresh)$ represents the IM module which acts on $n$ EBEs (in our case, this corresponds to all the EBEs in the FMT). The IM initiates a repair depending on the current state of the EBE. Inspections are performed in a periodic manner, every $T_{insp}$. If during an inspection the current state of the EBE does not correspond to the new or failed state (i.e. the degradation level of the inspected EBE is below a certain threshold), the $thresh$ signal is activated and is sent to the RM. Once a repair action is performed the IM moves back to the initial state with a delay equal to $T_{cln}$ or $T_{rplc}$, depending on the maintenance action performed.

• $(DELAY, T, N)$ represents the DELAY module which takes two inputs: the deterministic delay $T \in \{T_{deg}, T_{cln}, T_{rplc}, T_{rep}, T_{oh}, T_{insp}\}$ and the number of states $N$ of the Erlang distribution used to approximate it. This DELAY module can be extended by inclusion of a reset transition label which, when triggered, restarts the approximation of the deterministic delay before it has elapsed. The extended DELAY module is referred to as $(DELAY, T, N)_{ext}$.

The FMT is defined as a special type of directed acyclic graph $G = (V, E)$ where the vertices $V$ represent the gates and the events, which represent an occurrence within the system (typically the failure of a subsystem down to an individual component level), and the edges $E$ represent the connections between vertices. Events can either represent the EBEs or intermediate events which are caused by one or more other events. The event at the top of the FMT is the top event (TE) and corresponds to the event being analysed, modelling the failure of the (sub)system under consideration. The EBEs are the leaves of the DAG. For $G$ to be a well-formed FMT, we make the following assumptions: (i) vertices are composed of the OR and RDEP gates, (ii) there is only one top event, (iii) RDEP can only be triggered by EBEs and (iv) RM and IM are not part of the DAG but are modelled separately (see footnote 1). This DAG formulation allows us to propose a framework in Subsection 3.5 such that we can efficiently perform probabilistic model checking.

Definition 3.2. A fault maintenance tree is a directed acyclic graph $G = (V, E)$ composed of vertices $V$ and edges $E$.

3.2 Semantics of FMT elements

Next, we provide the CTMC semantics for each FMT element $f \in \mathcal{F}$. These elements are then instantiated based on the underlying FMT structure to form the semantics of the whole FMT in CTMC form.

DELAY. We define the semantics for the $(DELAY, T, N)$ element using Figure 2(a) and describe the corresponding CTMC using the set of states given by $D = \{d_0, d_1, \ldots, d_{N+1}\}$, the initial state $d_0$, the set of transition labels $TL = \{trigger, move\}$, the set of atomic propositions $AP = \{T\}$ with $L(d_0) = \cdots = L(d_N) = \emptyset$, and $L(d_{N+1}) = \{T\}$. The rate matrix $R$ becomes clear from Figure 2(a) and

\[
R_{ij} =
\begin{cases}
\mu & i = 0 \wedge j = 1, \\
\frac{N}{T} & ((i \geq 1 \wedge i < N+1) \wedge j = i+1) \vee (i = N+1 \wedge j = 1), \\
0 & \text{otherwise},
\end{cases}
\tag{1}
\]

with $i$ representing the current state, $j$ the next state and $\mu$ a fixed large value corresponding to introducing a negligible delay, which is used to trigger all the DELAY modules at the same time (cf. Definition 2.1). In Figure 2(b) we define the semantics of $(DELAY, T, N)_{ext}$. This results in the CTMC described using the state space $D = \{d_0, d_1, \ldots, d_{N+1}\}$, the initial state $d_0$, the set of transition labels $TL = \{trigger, move, reset\}$, the set of atomic propositions $AP = \{T\}$, the labelling function $L(d_0) = L(d_1) = \cdots = L(d_N) = \emptyset$ and $L(d_{N+1}) = \{T\}$, and the rate matrix $R$ where

\[
R_{ij} =
\begin{cases}
\mu & i = 0 \wedge j = 1, \\
1 & (i \geq 2 \wedge i < N+1) \wedge j = 1, \\
\frac{N}{T} & ((i \geq 1 \wedge i < N+1) \wedge j = i+1) \vee (i = N+1 \wedge j = 1), \\
0 & \text{otherwise},
\end{cases}
\tag{2}
\]

with $i$ representing the current state and $j$ the next state. In both instances, the deterministic delay is approximated using an Erlang distribution [8] and all DELAY modules are synchronised to start together using the trigger transition label. The extended DELAY module has the transition label reset, which restarts the Erlang distribution approximation whenever the guard condition is met, at a rate of $1 \times R_{sync}$, where $R_{sync}$ is the rate coming from the use of synchronisation with other modules causing the reset to occur (as explained in Subsection 3.3). This is required when a maintenance action is performed which restores the EBE’s state back to the original state and thus restarts the degradation process before the degradation time has elapsed.
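The rate matrices of the DELAY element and of its extended variant can be sketched in Python as follows. This is only an illustration under stated assumptions: the state ranges in (1) and (2) are read as $1 \leq i < N+1$ and $2 \leq i < N+1$ respectively, the default value chosen for $\mu$ is arbitrary, and the function name `delay_rate_matrix` is hypothetical.

```python
def delay_rate_matrix(n, t, mu=1e6, extended=False):
    """Rate matrix over the states d_0 .. d_{N+1} of a DELAY element (a sketch)."""
    size = n + 2
    r = [[0.0] * size for _ in range(size)]
    r[0][1] = mu                  # trigger: d_0 -> d_1 with a negligible delay
    for i in range(1, n + 1):
        r[i][i + 1] = n / t       # move: d_i -> d_{i+1} at rate N/T
    r[n + 1][1] = n / t           # d_{N+1} -> d_1, the case i = N+1, j = 1
    if extended:
        for i in range(2, n + 1):
            r[i][1] = 1.0         # reset: d_i -> d_1 at rate 1 (assumed range)
    return r

# Hypothetical instance: N = 3 stages approximating a delay of T = 6 time units.
plain = delay_rate_matrix(3, 6.0)
with_reset = delay_rate_matrix(3, 6.0, extended=True)
```

The reset rate of 1 is deliberate: under full synchronisation the effective rate is the product of the synchronising rates, so the partner module supplies the actual value.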

1 Note that the same RM and IM modules are used for different FMT structures; thus, the RM and IM modules are independent of the FMT structure.


Remark 1. The basic properties of an Erlang distribution: a random variable $Z \in \mathbb{R}_+$ has an Erlang distribution with $k \in \mathbb{N}$ stages and a rate $\lambda \in \mathbb{R}_+$, $Z \sim Erlang(k, \lambda)$, if $Z = Y_1 + Y_2 + \ldots + Y_k$ where each $Y_i$ is exponentially distributed with rate $\lambda$. The cumulative distribution function of the Erlang distribution is characterised using

\[
f(t; k, \lambda) = 1 - \sum_{n=0}^{k-1} \frac{1}{n!} \exp(-\lambda t) (\lambda t)^n \quad \text{for } t, \lambda \geq 0 \tag{3}
\]

and for $k = 1$, the Erlang distribution simplifies to the exponential distribution. In particular, the sequence $Z_k \sim Erlang(k, \lambda k)$ converges to the deterministic value $\frac{1}{\lambda}$ for large $k$. Thus, we can approximate a deterministic delay $T$ with a random variable $Z_k \sim Erlang(k, \frac{k}{T})$ [3]. Note that there is a trade-off between the accuracy and the resulting blow-up in size of the CTMC model for larger values of $k$ (a factor of $k$ increase in the model size) [8, 9]. In this work, the Erlang distribution will be used to model the fixed degradation rates and the maintenance and inspection signals. This is similar to the approach taken in [14], where degradation phases are approximated by a $(k, \lambda)$-Erlang distribution.
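To make the accuracy side of this trade-off concrete, the following Python sketch evaluates the cumulative distribution function in (3) and measures how much probability mass of $Erlang(k, \frac{k}{T})$ lies within a band around the target delay $T$. The helper names and the band width `eps` are assumptions introduced for illustration only.

```python
import math

def erlang_cdf(t, k, lam):
    """Cumulative distribution function of Erlang(k, lam), following (3)."""
    partial = sum(math.exp(-lam * t) * (lam * t) ** n / math.factorial(n)
                  for n in range(k))
    return 1.0 - partial

def mass_near_delay(delay, k, eps=0.2):
    """Probability that Erlang(k, k/delay) falls within delay * (1 +/- eps)."""
    lam = k / delay
    return erlang_cdf(delay * (1 + eps), k, lam) - erlang_cdf(delay * (1 - eps), k, lam)

# Hypothetical delay T = 10 approximated with 1, 10 and 100 stages.
masses = [mass_near_delay(10.0, k) for k in (1, 10, 100)]
```

The mass concentrated around $T$ grows with the number of stages, at the price of one additional CTMC state per stage.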

Figure 2(a): CTMC representing DELAY with $N$ states used to approximate a delay equal to $T$, approximated using $Erlang(N, \frac{N}{T})$. The transition labels $TL = \{trigger, move\}$ are shown on each of the transitions (trigger with rate $\mu$, move with rate $\frac{N}{T}$). The state labels are not shown and the initial state $d_0$ of the CTMC is pointed to using an arrow labelled start.

Figure 2(b): CTMC representing the extended DELAY with $N$ states used to approximate a delay equal to $T$, approximated using $Erlang(N, \frac{N}{T})$. The transition labels $TL = \{trigger, move, reset\}$ are shown on each of the state transitions (reset with rate 1), while the state labels are not shown.

Figure 2: CTMC for (a) DELAY and (b) DELAY with reset guard.

Extended Basic Events (EBE). The EBEs are the leaves of the FMT and incorporate the component’s degradation model. EBEs are a function of the total number of degradation steps $N$ considered. Figure 3 shows the semantics of $(EBE, T_{deg}, T_{cln}, T_{rplc}, N = 3)$. The corresponding CTMC is described by the tuple $(\{s_0, s_1, s_2, s_3\}, s_0, TL_{EBE}, AP_{EBE}, L_{EBE}, R_{EBE})$ where $s_0$ is the initial state, $TL_{EBE} = \{degrade_i, i \in \{0, \ldots, N\}, perform\_clean, perform\_replace\}$, the atomic propositions $AP_{EBE} = \{new, thresh, failed\}$, the labelling function $L(s_0) = \{new\}$, $L(s_1) = L(s_2) = \{thresh\}$, $L(s_3) = \{failed\}$ and

\[
R_{EBE} =
\begin{bmatrix}
0 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 \\
1 & 1 & 0 & 1 \\
1 & 0 & 1 & 0
\end{bmatrix}.
\]

The deterministic time delays taken as inputs are modelled using three different DELAY modules:

Figure 3: CTMC representing the EBE with $N = 3$ with the transition labels $TL_{EBE} = \{degrade_i, i \in \{1, 2, 3\}, perform\_clean, perform\_replace\}$ on each of the state transitions (each with rate 1). The state labels are not shown and the initial state is pointed to by the arrow labelled start.

(1) an extended DELAY module approximating $T_{deg}$ with the transition label move replaced with $degrade_N$ such that synchronisation between the two CTMCs is performed (explained in Subsection 3.3). When $T_{deg}$ has elapsed the transition labelled with $degrade_N$ is triggered and the EBE moves to the next state at a rate equal to $\frac{N}{T_{deg}} \times 1$ (see footnote 2). The reset transition label and corresponding transitions are replicated in the extended DELAY module and replaced with perform_clean and perform_replace. When the corresponding maintenance action is performed, one of the transition labels is triggered and the state of the EBE moves to the previous state (if a cleaning action is carried out) or to the initial state (if a replace action is performed).

(2) a DELAY module approximating $T_{cln}$ with the transition label move replaced with perform_clean. When $T_{cln}$ has elapsed the transition with transition label perform_clean is triggered and the EBE moves to the previous state at a rate equal to $\frac{N}{T_{cln}}$.

(3) a DELAY module approximating $T_{rplc}$ with the transition label move replaced with perform_replace. When $T_{rplc}$ has elapsed the transition having the transition label perform_replace is triggered and the EBE moves to the initial state at a rate equal to $\frac{N}{T_{rplc}}$.

The transition labels perform_clean and perform_replace cannot be triggered at the same time and it is assumed that $T_{cln} \neq T_{rplc}$. This is a realistic assumption as only one maintenance action is performed at any one time.
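The unit-rate pattern of $R_{EBE}$ follows directly from the three kinds of transitions described above (degrade to the next level, clean back to the previous level, replace back to the initial state). A minimal Python sketch that generates this pattern is given below; the function name is hypothetical, and generalising beyond the $N = 3$ instance shown in the text is an assumption.

```python
def ebe_rate_matrix(n):
    """Unit-rate matrix of an EBE with degradation levels s_0 .. s_n (a sketch)."""
    size = n + 1
    r = [[0] * size for _ in range(size)]
    for i in range(size):
        if i < n:
            r[i][i + 1] = 1   # degrade_i: move to the next degradation level
        if i > 0:
            r[i][i - 1] = 1   # perform_clean: back to the previous level
            r[i][0] = 1       # perform_replace: back to the initial state
    return r

# The N = 3 instance reproduces the matrix R_EBE given in the text.
r_ebe = ebe_rate_matrix(3)
```

All entries are 1 because the actual rates are supplied by the synchronising DELAY modules.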

RDEP gate. The RDEP gate has static semantics and is used in combination with the semantics of its $n$ dependent EBEs. When triggered ($in = 1$), i.e. when the associated EBE reaches the state labelled failed, the degradation rate of the $n$ dependent children is accelerated by a factor $\gamma$. We model the $in$ signal using

\[
in =
\begin{cases}
1 & L(s) = failed, \\
0 & \text{otherwise},
\end{cases}
\tag{4}
\]

where $L(s)$ is the label of the current state of the associated EBE. Similarly, we map the RDEP gate function using

\[
RA =
\begin{cases}
\gamma T_{deg_1}, \ldots, \gamma T_{deg_n} & in = 1, \\
T_{deg_1}, \ldots, T_{deg_n} & \text{otherwise},
\end{cases}
\tag{5}
\]

2 This is a direct consequence of synchronisation and corresponds to $R \times R_{EBE}$. Refer to Subsection 3.3.


where $T_{deg_i}$, $i \in 1, \ldots, n$, corresponds to the degradation rate of the $n$ dependent children (see footnote 3).

OR gate. The OR gate indicates a failure when any of its input nodes has failed. It also does not have semantics itself but is used in combination with the semantics of its $n$ dependent input events (EBEs or intermediate events). We use

\[
FAIL =
\begin{cases}
0 & E_1 = 0 \wedge \cdots \wedge E_n = 0, \\
1 & \text{otherwise},
\end{cases}
\tag{6}
\]

where $E_i = 1$, $i \in 1 \ldots n$, corresponds to when the $i$-th of the $n$ events connected to the OR gate represents a failure in the system. In the case of EBEs, $E_i = 1$ occurs when the EBE reaches the failed state.
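Since neither gate has a CTMC of its own, both can be viewed as plain functions of the current EBE labels. The Python sketch below illustrates the $in$ signal (4), the RDEP mapping (5) and the OR gate, following its textual description (a failure is indicated as soon as any input has failed). All function names are hypothetical and the numeric values are invented for the example.

```python
def rdep_input(label):
    """The in signal of (4): 1 when the triggering EBE is labelled failed."""
    return 1 if label == "failed" else 0

def rdep_accelerate(t_deg, gamma, in_signal):
    """The mapping RA of (5): scale every child's T_deg by gamma when in = 1."""
    return [gamma * t for t in t_deg] if in_signal == 1 else list(t_deg)

def or_gate(events):
    """FAIL signal of the OR gate: 1 as soon as any input event equals 1."""
    return 1 if any(e == 1 for e in events) else 0

# Hypothetical values: two dependent children and an acceleration factor of 0.5.
accelerated = rdep_accelerate([2.0, 4.0], 0.5, rdep_input("failed"))
unchanged = rdep_accelerate([2.0, 4.0], 0.5, rdep_input("thresh"))
```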

Repair module (RM). Figure 4(a) shows the semantics of $(RM, n, T_{rep}, T_{oh}, T_{insp}, T_{cln}, T_{rplc}, thresh, trig)$. The CTMC is described using the state space $\{rm_0, rm_1\}$, the initial state $rm_0$, the transition labels $TL_{RM} = \{inspect, check\_clean, check\_replace, trigger\_clean, trigger\_replace\}$, the atomic propositions $AP = \{maintenance\}$, the labelling function $L(rm_0) = \emptyset$, $L(rm_1) = \{maintenance\}$ and the rate matrix $R_{RM} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$. For the sake of clarity, in Figure 4(a) we used the transition labels check_maintenance and trigger_maintenance. The transition label check_maintenance and corresponding transitions are replicated and the transition labels replaced by check_clean or check_replace to allow for both types of maintenance checks. Similarly, the transition label trigger_maintenance and corresponding transitions are duplicated and the transition labels replaced by trigger_clean or trigger_replace to allow the initiation of both types of maintenance actions. Due to synchronisation, only one of the transitions may trigger at any time instance (as explained in Subsection 3.3). The transition labels trigger_clean and trigger_replace correspond to the transition label trigger within the DELAY modules approximating the deterministic delays $T_{cln}$ and $T_{rplc}$ respectively. The transitions inspect, check_clean and check_replace are triggered when the time delays $T_{insp}$, $T_{rep}$ and $T_{oh}$ respectively have elapsed. All these signals are generated using individual DELAY modules with the move transition label of each module replaced by inspect, check_clean or check_replace respectively. The $thresh$ signal is modelled using

\[
thresh =
\begin{cases}
1 & L(s_{j,1}) = thresh \vee \cdots \vee L(s_{j,n}) = thresh, \\
0 & \text{otherwise},
\end{cases}
\tag{7}
\]

where $L(s_{j,i})$, $j \in 0 \ldots N$, $i \in 1 \ldots n$, corresponds to the label of the current state $j$ of each of the $n$ EBEs. Similarly, we model the $trig$ signal using

\[
trig =
\begin{cases}
1 & L(s_{j,1}) \neq new \vee \cdots \vee L(s_{j,n}) \neq new, \\
0 & \text{otherwise}.
\end{cases}
\tag{8}
\]

Both signals act as guards which, when triggered, determine which transition to perform (cf. Fig. 4(a)).
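The two guards in (7) and (8) depend only on the labels of the current states of the $n$ EBEs, so they can be sketched as small Python functions. The function names and the example label lists are assumptions made for illustration.

```python
def thresh_signal(labels):
    """Guard (7): 1 if at least one EBE is currently labelled thresh."""
    return 1 if any(label == "thresh" for label in labels) else 0

def trig_signal(labels):
    """Guard (8): 1 if at least one EBE is no longer labelled new."""
    return 1 if any(label != "new" for label in labels) else 0

# Hypothetical snapshot of three EBEs: one is degraded, two are still new.
snapshot = ["new", "thresh", "new"]
guards = (thresh_signal(snapshot), trig_signal(snapshot))
```

With every EBE in the new state both guards evaluate to 0, so neither an inspection nor a periodic check initiates maintenance.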

3 Note that this effectively results in changing the deterministic delay being modelled by the DELAY module to a new value if $in = 1$.

Inspection module (IM). The semantics of $(IM, n, T_{insp}, T_{cln}, T_{rplc}, thresh)$ is depicted in Figure 4(b). The CTMC is defined using the tuple $(\{im_0, im_1\}, im_0, TL_{IM}, AP_{IM}, L_{IM}, R_{IM})$. Here, $TL_{IM} = \{inspect, perform\_clean, perform\_replace\}$, $AP_{IM} = \emptyset$, $L(im_0) = L(im_1) = \emptyset$ and $R_{IM} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$. The $thresh$ signal corresponds to the same signal used by the RM, given by (7). In Figure 4(b), for clarity, we use the transition label perform_maintenance. This transition label and corresponding transitions are duplicated and the transition labels are replaced by either perform_clean or perform_replace to allow for both types of maintenance actions to be performed when one of them is triggered using synchronisation. The same DELAY modules used in the RM and EBE to represent the deterministic delays are used by the IM. The DELAY modules used to represent the deterministic delays $T_{cln}$ and $T_{rplc}$ trigger the transition labels perform_clean or perform_replace. This represents that the maintenance action has completed.

Figure 4(a): CTMC representing the RM, with states $rm_0$ (initial, marked start) and $rm_1$ and with $TL_{RM} = \{inspect, check\_maintenance, perform\_maintenance\}$ shown on the state transitions. The transitions drawn are inspect (guards $thresh = 0$ and $thresh = 1$), check_maintenance (guards $trig = 0$ and $trig = 1$) and trigger_maintenance, each with rate 1. The guard condition $trig = 0/1$ or $thresh = 0/1$ must be satisfied for the corresponding transition to trigger when it is activated via synchronisation with the transition label.

Figure 4(b): CTMC representing the IM, with states $im_0$ (initial, marked start) and $im_1$ and with $TL_{IM} = \{inspect, perform\_maintenance\}$ shown on the state transitions. The transitions drawn are inspect (guards $thresh = 0$ and $thresh = 1$) and perform_maintenance, each with rate 1. The guard condition $trig = 0$ and $thresh = 1$ must be satisfied for the corresponding transition to trigger when it is activated via synchronisation with the transition label.

Figure 4: CTMC for (a) RM and (b) IM.

3.3 Semantics of FMT

Next, we show how to obtain the semantics of an FMT from the semantics of its elements using the FMT syntax introduced in Subsection 3.1. We define the DAG $G$ by defining the vertices $V$ and the corresponding events $E$. The leaves of the DAG are the events corresponding to the EBEs. The events $E$ are connected to the vertices $V$, which trigger the corresponding auxiliary function used to represent the semantics of the gates. The events connected to the RM and IM are initiated by triggering the auxiliary functions $thresh$ and $trig$ given by (7) and (8) respectively. Based on the structure of $G$, we compute the corresponding CTMC by applying parallel composition of the individual CTMCs representing the elements of the FMT. The parallel composition formulae are derived from [7] and defined as follows.


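Definition 3.3 (Interleaving Synchronization). The interleaving synchronous product of $C_1 = (S_1, s_{01}, TL_1, AP_1, L_1, R_1)$ and $C_2 = (S_2, s_{02}, TL_2, AP_2, L_2, R_2)$ is $C_1 || C_2 = (S_1 \times S_2, (s_{01}, s_{02}), TL_1 \cup TL_2, AP_1 \cup AP_2, L_1 \cup L_2, R)$ where $R$ is given by:

\[
\frac{s_1 \xrightarrow{\alpha_1, \lambda_1} s_1'}{(s_1, s_2) \xrightarrow{\alpha_1, \lambda_1} (s_1', s_2)}, \quad \text{and} \quad \frac{s_2 \xrightarrow{\alpha_2, \lambda_2} s_2'}{(s_1, s_2) \xrightarrow{\alpha_2, \lambda_2} (s_1, s_2')},
\]

and $s_1, s_1' \in S_1$, $\alpha_1 \in TL_1$, $R_1(s_1, s_1') = \lambda_1$, $s_2, s_2' \in S_2$, $\alpha_2 \in TL_2$, $R_2(s_2, s_2') = \lambda_2$.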

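Definition 3.4 (Full Synchronization). The full synchronous product of $C_1 = (S_1, s_{01}, TL_1, AP_1, L_1, R_1)$ and $C_2 = (S_2, s_{02}, TL_2, AP_2, L_2, R_2)$ is $C_1 || C_2 = (S_1 \times S_2, (s_{01}, s_{02}), TL_1 \cup TL_2, AP_1 \cup AP_2, L_1 \cup L_2, R)$ where $R$ is given by:

\[
\frac{s_1 \xrightarrow{\alpha, \lambda_1} s_1' \quad \text{and} \quad s_2 \xrightarrow{\alpha, \lambda_2} s_2'}{(s_1, s_2) \xrightarrow{\alpha, \lambda_1 \times \lambda_2} (s_1', s_2')}
\]

and $s_1, s_1' \in S_1$, $\alpha \in TL_1 \cap TL_2$, $R_1(s_1, s_1') = \lambda_1$, $s_2, s_2' \in S_2$, $R_2(s_2, s_2') = \lambda_2$.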

For any pair of states, synchronisation is performed using either interleaving or full synchronisation. For full synchronisation, as in Definition 3.4, the rate of a synchronous transition is defined as the product of the rates of each transition. The intended rate is specified in one transition and the rate of the other transition(s) is specified as 1. For instance, the RM synchronises using full synchronisation with the DELAY modules representing $T_{insp}$, $T_{rep}$ and $T_{rplc}$. Therefore, to perform synchronisation between the RM and the DELAY modules, the rates of all the transitions of the RM should have a value of 1 (cf. Fig. 4(a)), while the rates of the DELAY modules represent the actual rates (cf. Fig. 2). The same principle holds for the EBEs and the IM. We refer the reader to Table 1, which further elucidates the synchronisation between the FMT components and the method employed during the parallel composition.
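The two composition rules can be sketched in Python on CTMCs given as lists of labelled transitions. This is a simplified illustration, not the authors' implementation: it assumes that transitions whose label occurs in both components synchronise fully (rates multiply) while all other transitions interleave, and the function name `compose` and the example fragment are hypothetical.

```python
def compose(states1, trans1, states2, trans2):
    """Parallel composition of two CTMCs given as transition lists.

    Each transition is a tuple (source, label, rate, target).
    """
    shared = {a for _, a, _, _ in trans1} & {a for _, a, _, _ in trans2}
    composed = []
    # Interleaving: one component moves while the other keeps its state.
    for s, a, lam, s_next in trans1:
        if a not in shared:
            composed += [((s, q), a, lam, (s_next, q)) for q in states2]
    for q, a, lam, q_next in trans2:
        if a not in shared:
            composed += [((s, q), a, lam, (s, q_next)) for s in states1]
    # Full synchronisation: both components move and the rates are multiplied.
    for s, a, lam1, s_next in trans1:
        for q, b, lam2, q_next in trans2:
            if a == b:
                composed.append(((s, q), a, lam1 * lam2, (s_next, q_next)))
    return composed

# Hypothetical fragment: the DELAY stage carries the actual rate N/T = 0.5,
# while the matching EBE transition is specified with rate 1.
delay_states = ["d0", "d1", "d2"]
delay_trans = [("d0", "trigger", 1e6, "d1"), ("d1", "degrade", 0.5, "d2")]
ebe_states = ["s0", "s1"]
ebe_trans = [("s0", "degrade", 1.0, "s1")]
product = compose(delay_states, delay_trans, ebe_states, ebe_trans)
```

In this fragment the synchronised degrade transition inherits the rate 0.5 of the DELAY module, which mirrors the convention of fixing the partner rate to 1.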

Example. Consider a simple example showing the time signals and synchronisations required for modelling an EBE and the RM and IM. The EBE has a degradation rate equal to $T_{deg}$ and we limit the functionality of the RM and IM by allowing only the maintenance action to perform cleaning. We also need the corresponding DELAY modules generating the degradation rate $T_{deg}$ and the maintenance rates $T_{cln}, T_{insp}, T_{rep}$. The resulting CTMC is obtained by performing a parallel composition of the components, $C_{all} = C_{EBE} || C_{T_{deg}} || C_{RM} || C_{IM} || C_{T_{cln}} || C_{T_{insp}} || C_{T_{rep}}$. The resulting state space is then $S_{all} = S_{EBE} \times S_{T_{deg}} \times S_{RM} \times S_{IM} \times S_{T_{cln}} \times S_{T_{insp}} \times S_{T_{rep}}$. The synchronisation between the different components is shown in Figure 5 and proceeds as follows:

(1) All the DELAY modules (except $T_{cln}$) start at the same time using the trigger transition label.

(2) When the extended DELAY module generating the $T_{deg}$ time delay elapses, the corresponding EBE moves to the next state through synchronisation with the transition label $degrade_N$.

(3) The clock signals $T_{rep}, T_{insp}$ represent periodic maintenance and inspection actions. When the deterministic delay is reached, the RM or IM module is triggered through synchronisation with the transition label check_clean or inspect (cf. Fig. 4(a) and 4(b)). If the RM triggers a maintenance action, the DELAY module representing $T_{cln}$ is triggered using the synchronisation label trigger_clean. Once the deterministic delay $T_{cln}$ elapses, the EBE, the extended DELAY module representing $T_{deg}$ (where the reset transition label within the extended DELAY module is replaced with perform_clean) and the IM are reset using the transition label perform_clean.

Figure 5: Block diagram showing the synchronisation connections between the components, together with the corresponding transition labels which trigger synchronisation.

Remark 2. One should note that this results in the requirement of a large state space, which is a function of the number of states used to approximate the deterministic delays. Thus, to counteract this effect we propose an abstraction framework in Subsection 3.5.
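To give a feel for the growth noted in Remark 2, the following sketch computes the size of the product state space $S_{all}$ for different numbers of Erlang phases. The component sizes are hypothetical (they are not taken from the paper), and a DELAY module with $N$ phases is assumed to have $N + 1$ states. The product is only an upper bound, since the reachable state space of the composed CTMC is typically smaller.

```python
from math import prod
from typing import Dict


def delay_states(phases: int) -> int:
    """Assumed number of states of a DELAY module with `phases` Erlang phases."""
    return phases + 1


def product_state_space(component_sizes: Dict[str, int]) -> int:
    """Upper bound on |S_all|: product of the component state-space sizes."""
    return prod(component_sizes.values())


def example_sizes(phases: int) -> Dict[str, int]:
    # Hypothetical sizes for the EBE, RM and IM; only the DELAY modules
    # depend on the number of phases used to approximate the delays.
    return {
        "EBE": 5,
        "T_deg": delay_states(phases),
        "RM": 3,
        "IM": 2,
        "T_cln": delay_states(phases),
        "T_insp": delay_states(phases),
        "T_rep": delay_states(phases),
    }


for n in (3, 5, 10):
    print(n, product_state_space(example_sizes(n)))
```

With four DELAY modules the bound grows with the fourth power of the number of phases, which is why a more accurate approximation of the deterministic delays quickly becomes expensive.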

3.4 Metrics

We use PRISM to compute the metrics of the model described in Subsection 2.1. The metrics can be expressed using the extended Continuous Stochastic Logic (CSL) as follows:

(1) Reliability: This can be expressed as the complement of the probability of failure over the time $T$, $1 - P_{=?}[F^{\leq T}\ \mathit{failed}]$.

(2) Availability: This can be expressed as $R_{=?}[C^{\leq T}]/T$, which corresponds to the cumulative reward of the total time spent in states labelled with okay and thresh during the time $T$.

(3) Expected cost: This can be expressed using $R_{=?}[C^{\leq T}]$, which corresponds to the cumulative reward of the total costs (operational, maintenance and failure) within the time $T$.

(4) Expected number of failures: This can be expressed using $R_{=?}[C^{\leq T}]$, which corresponds to the cumulative transition reward that counts the number of times the top event enters the failed state within the time $T$.
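As an illustration of what the reliability query evaluates, the sketch below computes $1 - P[F^{\leq T}\ \mathit{failed}]$ for a small CTMC by uniformisation, a standard numerical method for transient analysis of CTMCs. It is a toy implementation for small values of $qT$ and not the PRISM engine. The three-state chain (okay, thresh, failed) and the phase rate of 0.1 per year are assumptions made for the example only.

```python
import math
from typing import List


def transient_distribution(Q: List[List[float]], p0: List[float], t: float,
                           eps: float = 1e-12,
                           max_terms: int = 100_000) -> List[float]:
    """State distribution of a CTMC with generator Q at time t (uniformisation).

    Suitable for small q*t only: exp(-q*t) underflows for very large q*t.
    """
    n = len(Q)
    q = max(-Q[i][i] for i in range(n))  # uniformisation rate
    if q == 0.0 or t == 0.0:
        return list(p0)
    # Transition matrix of the uniformised DTMC: P = I + Q / q.
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / q for j in range(n)]
         for i in range(n)]
    weight = math.exp(-q * t)  # Poisson(q*t) probability of zero jumps
    vec = list(p0)
    result = [weight * v for v in vec]
    mass = weight
    for k in range(1, max_terms):
        if 1.0 - mass <= eps:
            break
        vec = [sum(vec[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= q * t / k  # Poisson probability of k jumps
        mass += weight
        result = [r + weight * v for r, v in zip(result, vec)]
    return result


def reliability(Q: List[List[float]], p0: List[float],
                failed: int, t: float) -> float:
    """1 - P[F<=t failed]; the failed state is made absorbing first."""
    Q_abs = [row[:] for row in Q]
    Q_abs[failed] = [0.0] * len(Q)
    return 1.0 - transient_distribution(Q_abs, p0, t)[failed]


# Assumed example: two degradation phases with rate 0.1 per year each.
# States: 0 = okay, 1 = thresh, 2 = failed.
Q = [[-0.1, 0.1, 0.0],
     [0.0, -0.1, 0.1],
     [0.0, 0.0, 0.0]]
print(reliability(Q, [1.0, 0.0, 0.0], failed=2, t=5.0))
```

For this chain the result can be checked against the closed-form Erlang-2 survival function $e^{-rT}(1 + rT)$ with $r = 0.1$ and $T = 5$.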

3.5 Decomposition of FMTs

The use of CTMCs and deterministic time delays results in the requirement of a large state space for modelling the whole FMT (cf. Remark 2). We therefore propose an approach which decomposes the large FMT into an equivalent abstract CTMC which can be analysed using PRISM. The process involves two transformation steps. First, we convert the FMT into the equivalent directed acyclic graph (DAG) and split this graph into a set of smaller sub-graphs. Second, we transform the sub-graphs into the equivalent CTMC by making use of the developed FMT component semantics (cf. Subsec. 3.2), and performing parallel composition of the individual

Efficient PMC of Smart Building Maintenance using FMTs. BuildSys ’17, November 8–9, 2017, Delft, Netherlands

| Component | Synchronised with component | Transition label | Synchronisation method |
|---|---|---|---|
| DELAY representing $T_{deg}$ | DELAY modules representing $T_{cln}$, $T_{rplc}$, $T_{insp}$ | trigger | Full synchronisation |
| RM | DELAY module representing $T_{rep}$ | trigger clean | Full synchronisation |
| RM | DELAY module representing $T_{oh}$ | trigger replace | Full synchronisation |
| EBE | DELAY representing $T_{deg}$ | degradeN | Full synchronisation |
| DELAY representing $T_{cln}$ | RM, EBE | check clean | Full synchronisation |
| DELAY representing $T_{rplc}$ | RM, EBE | check replace | Full synchronisation |
| DELAY representing $T_{insp}$ | RM, IM | inspect | Full synchronisation |
| DELAY representing $T_{rep}$ | RM, IM, EBE | perform clean | Full synchronisation |
| DELAY representing $T_{oh}$ | RM, IM, EBE | perform replace | Full synchronisation |
| EBE | RM, IM, all DELAY modules, other EBEs | - | Interleave synchronisation |

Table 1: Performing synchronisation between the different FMT components and the synchronisation method used.

Figure 6: Overall developed framework for decomposition of FMTs into the equivalent abstract CTMCs. [Stages shown in the diagram: original FMT; equivalent graph; graph decomposition; PMC of sub-graphs (library of CTMC models, parallel composition, CSL properties), yielding MTTF(G2) and MTTF(G3); final CTMC; PMC of final CTMC (CSL properties; dependability, costs trade-offs).]

FMT components based on the underlying structure of the sub-graph. The smaller sub-graphs are then sequentially recomposed to generate the higher-level abstract FMT. Figure 6 depicts a high-level diagram of the decomposition procedure.

Conversion of original FMT to the equivalent graph. The FMT is a DAG (cf. Subsection 3) and in this framework we need to apply a transformation to the DAG in the presence of an RDEP gate, such that we can perform the decomposition. The RDEP causes an acceleration of events on dependent child nodes when the input node fails. In order to capture this feature in a DAG, we need to duplicate the input node such that it is connected directly to the RDEP vertex. This allows us to capture when the failure of the input occurs and the corresponding acceleration of the children. This is reasonable as the same RM and IM are used irrespective of the underlying FMT structure.

Graph decomposition. We define modules within the DAG as sub-trees composed of at least two events which have no inputs from the rest of the tree and no outputs to the rest except from their output event [13]. We can divide the graph into multiple partitions based on the number of modules making up the DAG. We define the following notation to ease the description of the algorithm:

• $V_o$ indicates whether the node is the top node of the DAG.

• $V_g$ indicates the node where the graph split is performed.

• Modules correspond to sub-graphs in the DAG.

We set $V_o$ when we construct the DAG from the FMT and then proceed with executing Algorithm 1. We first identify all the sub-graphs within the whole DAG and label the top node of each sub-graph $i$ as $V_{T_i}$. We loop through each sub-graph and its immediate child (the sub-graph at the immediate lower level) and, at the point where the sub-graph and child are connected, the two graphs are split and a new node $V_g$ is introduced. Thus, executing Algorithm 1 results in a set of sub-graphs linked together by the labelled nodes $V_g$. For each of the lower-level sub-graphs we now proceed to compute the mean time to failure (MTTF). This will serve as an input to the higher-level sub-graphs such that metrics for the abstract equivalent CTMC can be computed.

PMC of sub-graphs. We start from the bottom-level sub-graphs and perform the conversion to CTMC using the formal models presented in Subsection 3.2. The formal models have been built into a library of PRISM modules and, based on the underlying components and structure making up the sub-graph, the corresponding individual formal models are converted into the sub-graph's equivalent CTMC by performing parallel composition (cf. Subsec. 3.3). For each sub-graph, we compute the probability of failure $De(T)$ at time $T$, from which we calculate the MTTF using,

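$$\mathrm{MTTF} = \frac{-T}{\ln(1 - De(T))}.$$

This is the value of $1/\lambda$ for which an exponential failure distribution, $De(T) = 1 - e^{-\lambda T}$, reproduces the computed probability of failure.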
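The conversion from a failure probability to an MTTF can be sketched as follows. The sketch assumes an exponential fit, $De(T) = 1 - e^{-T/\mathrm{MTTF}}$, consistent with $\mathrm{MTTF} = 1/\lambda$ and with the use of the result as a time delay; the function name and the example values are hypothetical.

```python
import math


def mttf_from_failure_probability(de_t: float, horizon: float) -> float:
    """MTTF of the exponential distribution whose failure probability at
    time `horizon` equals `de_t` (same time unit as `horizon`)."""
    if not 0.0 < de_t < 1.0:
        raise ValueError("failure probability must lie strictly between 0 and 1")
    if horizon <= 0.0:
        raise ValueError("time horizon must be positive")
    return -horizon / math.log(1.0 - de_t)


# Round trip with assumed values: MTTF of 20 years, horizon of 5 years.
de_5 = 1.0 - math.exp(-5.0 / 20.0)
print(de_5, mttf_from_failure_probability(de_5, 5.0))
```

Since the fitted value depends on the horizon whenever the true failure distribution is not exponential, the MTTF has to be recomputed for each time horizon of interest.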


Algorithm 1: DAG decomposition algorithm

input: DAG $G = (V, E)$

output: Set of sub-graphs with one of the end nodes labelled as $V_g$.

Identify sub-graphs using 'depth-first' traversal
Label the top node of each sub-graph $i$ as $V_{T_i}$
forall the top nodes of every sub-graph and their immediate child defined at the immediate lower level do
    if label $V_T$ already found in one of the leaf nodes of the sub-graph then
        Split sub-graph
        Insert new node $V_g$ which will be used as input from the connected sub-graph
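One possible reading of Algorithm 1 is sketched below. It is not the authors' implementation: the graph encoding (a dictionary from node to children), the placeholder naming `Vg[...]` and the example tree are assumptions, and for brevity every gate that forms a module is treated as a split point (the "at least two events" condition is not checked).

```python
from typing import Dict, List, Set

Graph = Dict[str, List[str]]  # node -> children; leaves map to []


def descendants(g: Graph, root: str) -> Set[str]:
    """All nodes below `root`, found by depth-first traversal."""
    seen: Set[str] = set()
    stack = list(g[root])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(g[node])
    return seen


def module_tops(g: Graph) -> List[str]:
    """Gates whose sub-tree receives no input from the rest of the graph."""
    parents: Dict[str, Set[str]] = {n: set() for n in g}
    for node, children in g.items():
        for child in children:
            parents[child].add(node)
    tops = []
    for node, children in g.items():
        if not children:
            continue  # leaves are not modules
        inside = descendants(g, node) | {node}
        if all(parents[d] <= inside for d in inside - {node}):
            tops.append(node)
    return tops


def decompose(g: Graph) -> Dict[str, Graph]:
    """Split the DAG at every module top; a lower-level module appears in
    its parent sub-graph as a placeholder leaf Vg[<top>]."""
    tops = set(module_tops(g))
    subgraphs: Dict[str, Graph] = {}
    for top in sorted(tops):
        sub: Graph = {}
        stack = [top]
        while stack:
            node = stack.pop()
            if node in sub:
                continue
            children = []
            for child in g[node]:
                if child in tops:
                    placeholder = f"Vg[{child}]"
                    sub[placeholder] = []  # input from the connected sub-graph
                    children.append(placeholder)
                else:
                    children.append(child)
                    stack.append(child)
            sub[node] = children
        subgraphs[top] = sub
    return subgraphs


# Hypothetical tree: top gate G1 over gate G2 and two basic events.
fmt = {"G1": ["G2", "B1", "B2"], "G2": ["B3", "B4"],
       "B1": [], "B2": [], "B3": [], "B4": []}
for top, sub in decompose(fmt).items():
    print(top, sub)
```

In this example the sub-graph rooted at G2 is analysed first, and its result enters the sub-graph rooted at G1 through the placeholder leaf `Vg[G2]`. A basic event shared between two gates prevents those gates from being split apart, because the shared event has a parent outside either sub-tree.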

The MTTF serves as the input to the higher-level sub-graph at time $T$. The new node in the higher-level sub-graph now degrades with a new time delay $T_{deg} = \mathrm{MTTF}$, which is fed into the corresponding DELAY component. This process is repeated for all the different sub-graphs until the top-level node $V_o$ is reached.

PMC of final equivalent abstract CTMC. On reaching the top-level node $V_o$, we compute the metrics for the equivalent abstract CTMC for a specific time horizon $T$. For different horizons, the previous step of computing the MTTF for the underlying lower-level sub-graphs needs to be repeated. Using this technique, we can formally verify larger FMTs, while using less memory and computational time due to the significantly smaller state space of the underlying CTMCs. Next, we proceed with an illustrative example comparing the process of directly modelling the large FMT using CTMCs versus the de-compositional modelling procedure. Figure 7 presents the FMT composed of two modules and the corresponding abstracted FMT. The abstract FMT is a pictorial representation of the model represented by the equivalent abstract CTMC obtained using the developed decomposition framework (cf. Fig. 6). For

Figure 7: The original FMT and the abstract FMT corresponding to the equivalent abstract CTMC generated by the developed framework. The MTTF for F' is computed based on the probability of failure of the heating coil.

both the large FMT and the equivalent abstract FMT, we compare the total number of states of the resulting CTMC models, the total time to compute the reliability metric and the resulting reliability metric. All computations are run on a 2.3 GHz Intel Core i5 processor with 8 GB of RAM and the resulting statistics are listed in Table 2. The original FMT has a state space with 193543 states, while the equivalent abstract CTMC has a state space with 63937 states. This corresponds to a 67% reduction in the state space size. The total time to compute the reliability metric is a function of the final time horizon and a maximal 73% reduction in computation time is achieved. The accuracy of the reliability metric of the abstract model is also a function of the time horizon: the reliability computed using the abstract FMT differs from that of the original FMT by at most 0.61%.

| Time horizon (years) | Original FMT: time to compute metric (mins) | Original FMT: reliability | Abstracted FMT: time to compute MTTF (mins) | Abstracted FMT: time to compute metric (mins) | Abstracted FMT: total time (mins) | Abstracted FMT: reliability |
|---|---|---|---|---|---|---|
| 5 | 0.727 | 0.9842 | 0.142 | 0.181 | 0.223 | 0.9842 |
| 10 | 1.406 | 0.8761 | 0.219 | 0.309 | 0.528 | 0.8769 |
| 15 | 2.489 | 0.3290 | 0.292 | 0.622 | 0.914 | 0.3270 |

Table 2: Comparison between the original large FMT and the abstracted FMT.
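The state-space reduction and the maximal deviation in reliability quoted in the text can be recomputed from the reported figures with a few lines of arithmetic. The sketch below uses only the state counts and the reliability columns of Table 2; the helper name is hypothetical.

```python
def relative_change(original: float, other: float) -> float:
    """Relative difference |original - other| / original."""
    return abs(original - other) / original


# State-space sizes reported for the original and the abstract CTMC.
state_reduction = relative_change(193543, 63937)

# Reliability per time horizon (years): (original FMT, abstracted FMT).
reliability = {5: (0.9842, 0.9842), 10: (0.8761, 0.8769), 15: (0.3290, 0.3270)}
max_deviation = max(relative_change(orig, abst)
                    for orig, abst in reliability.values())

print(f"state-space reduction: {100 * state_reduction:.0f}%")
print(f"maximal reliability deviation: {100 * max_deviation:.2f}%")
```

The largest deviation occurs at the 15-year horizon, where the reliability itself is smallest.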

4 CASE STUDY

We apply the FMT framework to a Heating, Ventilation and Air-conditioning (HVAC) system used to regulate a building's internal environment. The HVAC system under consideration for the FMT analysis is presented in Figure 8. It is composed of two circuits: the air flow circuit and the water circuit. The gas boiler heats up the supply water which is fed into the heat pump. The heat pump transfers the supply water into two sections, the supply air heating and cooling coils and the radiators, via the splitter. The rate of water flowing in the heating coil is controlled using a heating coil valve, while the rate of water flow in the radiator is controlled using a separate valve. The outside air is mixed with the extracted room air via the mixer. This is fed into the heating coil, which warms up the input air to the desired supply air temperature. This air is supplied back into the zone via the supply fan, at a rate controlled by the Air Handling Unit (AHU) dampers. The radiators are directly connected to the water circuit and transfer the heat from the water into the zone. The return water is then passed through the collector and is returned to the boiler.

Based on this HVAC system we construct the corresponding FMT shown in Figure 9. The leaves of the tree are EBEs with discrete degradation rates computed using Table 3, approximated by the Erlang distribution, where $N$ is the number of degradation phases ($k = N$ for the Erlang distribution) and MTTF is the expected time to failure with $\mathrm{MTTF} = 1/\lambda$ (cf. Remark 1). We choose an acceleration factor $\gamma = 2$ for the RDEP gate. The system is periodically repaired every 6 months ($T_{rep} = 182$ days) and a major overhaul with a complete replacement of all components is carried out once every 20 years ($T_{oh} = 20 \times 365$ days). Weekly inspections are performed ($T_{insp} = 7$ days) which return the components back to the previous state. Only cleaning actions are performed when inspections are carried out. The total time to perform a cleaning action is 1 day ($T_{cln} = 1$ day), while performing a total replacement of components takes 7 days ($T_{rplc} = 7$ days). The timing signals $\{T_{rep}, T_{oh}, T_{insp}, T_{cln}, T_{rplc}\}$ are all approximated using the Erlang distribution with $N = 3$. All maintenance actions are performed simultaneously on all components.
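The Erlang approximation of a timing signal can be sketched as follows. The parametrisation is an assumption made for this example: each of the $N$ phases is given the rate $N/T$, so that the mean of the Erlang distribution equals the target delay $T$. The class name is hypothetical, and the parametrisation used in the paper (cf. Remark 1) may differ.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ErlangDelay:
    phases: int   # N, the number of exponential phases (k in Erlang notation)
    mean: float   # target delay T, in any consistent time unit

    @property
    def phase_rate(self) -> float:
        """Rate of each phase so that the overall mean equals `mean`."""
        return self.phases / self.mean

    @property
    def std_dev(self) -> float:
        """Standard deviation of the Erlang distribution: mean / sqrt(N)."""
        return self.mean / math.sqrt(self.phases)

    def cdf(self, t: float) -> float:
        """Probability that the delay has elapsed by time t."""
        x = self.phase_rate * t
        partial = sum(x ** n / math.factorial(n) for n in range(self.phases))
        return 1.0 - math.exp(-x) * partial


# Weekly inspection signal, T_insp = 7 days approximated with N = 3 phases.
inspection = ErlangDelay(phases=3, mean=7.0)
print(inspection.phase_rate, inspection.std_dev, inspection.cdf(7.0))
```

The standard deviation shrinks with $1/\sqrt{N}$, so more phases give a closer approximation of a deterministic delay at the price of more states per DELAY module.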

Figure 8: High-level schematic of an HVAC system.

[Event labels in the diagram: Failure of HVAC component; Insufficient Radiator Pout; Failure in Heating coil; Failure of Supply Fan; No heating / cooling; Reduced Capacity; an RDEP gate; leaves 1 to 9.]

Figure 9: FMT for failure in the HVAC system with leaves represented using EBEs (associated RM and IM not shown in the figure). The EBEs are labelled to correspond to the component failure they represent using the fault index presented in Table 3.

| Fault index | Failure mode | N | MTTF (years) |
|---|---|---|---|
| 1 | Failure in cooling coil | 4 | 20 |
| 2 | Broken AHU damper | 2 | 20 |
| 3 | Fan motor failure | 3 | 35 |
| 4 | Obstructed supply fan | 4 | 31 |
| 5 | Fan bearing failure | 6 | 17 |
| 6 | Radiator failure | 4 | 25 |
| 7 | Radiator stuck valve | 2 | 10 |
| 8 | Heater stuck valve | 2 | 10 |
| 9 | Failure in heat pump | 4 | 20 |

Table 3: Extended Basic Events in the FMT with associated degradation rates (N, MTTF) obtained from [6, 10].

4.1 Quantitative results

We make use of the developed framework (cf. Subsec. 3.5) and convert the FMT representing the failure of the HVAC system (cf. Fig. 9) into the equivalent abstract CTMC. The abstracted CTMC has a state space of 62779 states. Using our current computing set-up, the complex CTMC representing the whole FMT was not computable as it results in a state space explosion, which highlights the advantage of the developed framework. The process is performed over six time horizons $N_r = \{0, 5, 10, 15, 20, 25\}$ years with the maintenance policy consisting of periodic cleaning every 6 months, a major overhaul every 20 years and inspections on a weekly basis. For this set-up, the metrics corresponding to the reliability and availability of the HVAC system over the time horizon are computed and are shown in Figures 10(a) and 10(b). The maximal time taken to compute a metric using the abstract FMT is 1.47 minutes. It is deduced that both the reliability and availability reduce over time and there is a saturation in the number of maintenance actions which one can perform before the system no longer achieves higher performance in reliability and availability. Next, we compare the total cost of

Figure 10: Reliability and availability of HVAC over time horizon $N_r$. (a) Reliability of HVAC system against time (years). (b) Availability of HVAC system against time (years).

maintenance and the expected number of failures over the time horizon $N_r = \{0, 5, 10, 15, 20, 25\}$ years when considering different maintenance strategies, such that we can identify the maintenance strategy that minimises cost and the number of failures over time. We consider six different maintenance strategies, which are listed in Table 4. The total maintenance cost to perform a repair is 100 [GBP], while a replacement costs 5000 [GBP]. We now compute the total expected maintenance costs and the total expected number of failures for each strategy. These are shown in Figure 11. The most effective strategy, which offers a good trade-off between maintenance costs and the expected number of failures, is achieved when repairs are carried out on a yearly basis, replacements are carried out every 20 years and inspections are carried out weekly (corresponding to


strategy $M_1$). Furthermore, it can be seen that the frequency of inspections has a large effect on the total number of failures. When the frequency of inspection is low (as in $M_4$ and $M_5$), the expected number of component failures increases significantly. Note that reducing the frequency of repairs, as in the case of maintenance strategy $M_2$, also results in an increase in the expected number of failures.

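| Strategy index | $T_{rep}$ | $T_{oh}$ | $T_{insp}$ |
|---|---|---|---|
| $M_0$ | 6 months | 20 years | 1 week |
| $M_1$ | 12 months | 20 years | 1 week |
| $M_2$ | 48 months | 20 years | 1 week |
| $M_3$ | 6 months | 10 years | 1 week |
| $M_4$ | 6 months | 20 years | 2 years |
| $M_5$ | 6 months | 20 years | 5 years |

Table 4: Implemented maintenance strategies.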
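As a rough sanity check on the order of magnitude of the maintenance costs, the sketch below counts the scheduled repairs and overhauls of each strategy over a horizon and prices them at 100 GBP and 5000 GBP respectively. This is a back-of-the-envelope calculation under the assumption that every scheduled action is carried out exactly on schedule. It is not the expected cumulative reward computed on the CTMC: it ignores inspection-triggered cleaning, failures and the Erlang approximation of the delays. The class and function names are hypothetical.

```python
from dataclasses import dataclass

REPAIR_COST_GBP = 100
REPLACEMENT_COST_GBP = 5000


@dataclass(frozen=True)
class Strategy:
    name: str
    repair_period_months: int
    overhaul_period_years: int


def scheduled_cost(strategy: Strategy, horizon_years: int) -> int:
    """Cost of all repairs and overhauls scheduled within the horizon."""
    repairs = (horizon_years * 12) // strategy.repair_period_months
    overhauls = horizon_years // strategy.overhaul_period_years
    return repairs * REPAIR_COST_GBP + overhauls * REPLACEMENT_COST_GBP


# Repair and overhaul periods from Table 4 (inspection periods do not
# enter this simplified count).
strategies = [
    Strategy("M0", 6, 20), Strategy("M1", 12, 20), Strategy("M2", 48, 20),
    Strategy("M3", 6, 10), Strategy("M4", 6, 20), Strategy("M5", 6, 20),
]
for s in strategies:
    print(s.name, scheduled_cost(s, 25))
```

Strategies that differ only in their inspection period ($M_0$, $M_4$ and $M_5$) are indistinguishable in this simplified count, which is one reason the model-based analysis is needed to compare them.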

Figure 11: Comparison between different maintenance strategies ($M_0$ to $M_5$) for an HVAC system. (a) Maintenance costs against time (years). (b) Expected number of failures against time (years).

5 CONCLUSION AND FUTURE WORKS

The paper has presented a methodology for applying probabilistic model checking to FMTs. The FMTs are modelled in the form of CTMCs, which simplifies the transformation of FMTs into formal models that can be analysed using PRISM. A novel technique for abstracting the equivalent CTMC model is also presented. The novel decomposition procedure tackles the issue of state space explosion and results in a significant reduction in both the state space size and the total time required to compute metrics. The framework has been applied to an HVAC system and the effect of applying different maintenance strategies has been presented. The presented framework can be further enhanced by adding more gates to the PRISM modules library, such as the Priority-AND, INHIBIT and k/N gates, and by incorporating lumping of states as in [16], such that the state space can be further reduced.

ACKNOWLEDGMENTS

This work has been funded by the AMBI project under Grant No.: 324432, by the Alan Turing Institute, UK, by a post-doctoral research grant from Fonds de Recherche du Québec - Nature et Technologies (FRQNT) and by Malta's ENDEAVOUR Scholarships Scheme.

REFERENCES

[1] Vladimir Babishin and Sharareh Taghipour. 2016. Optimal maintenance policy for multicomponent systems with periodic and opportunistic inspections and preventive replacements. Applied Mathematical Modelling 40, 24 (2016), 10480–10505.

[2] Francesca Boem, Riccardo MG Ferrari, Christodoulos Keliris, Thomas Parisini, and Marios M Polycarpou. 2017. A distributed networked approach for fault detection of large-scale systems. IEEE Trans. Automat. Control 62, 1 (2017), 18–33.

[3] Luca Bortolussi and Jane Hillston. 2012. Fluid approximation of CTMC with deterministic delays. In Quantitative Evaluation of Systems (QEST), 2012 Ninth International Conference on. IEEE, 53–62.

[4] Nathalie Cauchi, Karel Macek, and Alessandro Abate. 2017. Model-based predictive maintenance in building automation systems with user discomfort. Energy (2017).

[5] European Parliament and Council of the European Union. 2010. Directive 2010/31/EU. (2010).

[6] ASHRAE Handbook. 1996. HVAC systems and equipment. American Society of Heating, Refrigerating, and Air Conditioning Engineers, Atlanta, GA (1996).

[7] Holger Hermanns and Lijun Zhang. 2011. From Concurrency Models to Numbers. In Nato Science for Peace and Security Series. IOS Press.

[8] Khaza Anuarul Hoque, Otmane Ait Mohamed, and Yvon Savaria. 2015. Towards an accurate reliability, availability and maintainability analysis approach for satellite systems based on probabilistic model checking. In Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition. EDA Consortium, 1635–1640.

[9] Khaza Anuarul Hoque, O Ait Mohamed, Yvon Savaria, and Claude Thibeault. 2014. Probabilistic model checking based DAL analysis to optimize a combined TMR-blind-scrubbing mitigation technique for FPGA-based aerospace applications. In Formal Methods and Models for Codesign (MEMOCODE), 2014 Twelfth ACM/IEEE International Conference on. IEEE, 175–184.

[10] Faisal I Khan and Mahmoud M Haddara. 2003. Risk-based maintenance (RBM): a quantitative approach for maintenance/inspection scheduling and planning. Journal of Loss Prevention in the Process Industries 16, 6 (2003), 561–573.

[11] Marta Kwiatkowska, Gethin Norman, and David Parker. 2007. Stochastic model checking. In International School on Formal Methods for the Design of Computer, Communication and Software Systems. Springer, 220–270.

[12] Marta Kwiatkowska, Gethin Norman, and David Parker. 2011. PRISM 4.0: Verification of Probabilistic Real-time Systems. In Proc. 23rd International Conference on Computer Aided Verification (CAV'11) (LNCS), G. Gopalakrishnan and S. Qadeer (Eds.), Vol. 6806. Springer, 585–591.

[13] ZF Li, Yi Ren, LL Liu, and ZL Wang. 2015. Parallel algorithm for finding modules of large-scale coherent fault trees. Microelectronics Reliability 55, 10 (2015), 1400–1403. Proceedings of the 26th European Symposium on Reliability of Electron Devices, Failure Physics and Analysis (ESREF 2015).

[14] Enno Ruijters, Dennis Guck, Peter Drolenga, and Mariëlle Stoelinga. 2016. Fault maintenance trees: reliability centered maintenance via statistical model checking. In Reliability and Maintainability Symposium (RAMS), 2016 Annual. IEEE, 1–6.

[15] Ying Yan, Peter B Luh, and Krishna R Pattipati. 2017. Fault Diagnosis of HVAC Air-Handling Systems Considering Fault Propagation Impacts Among Components. IEEE Transactions on Automation Science and Engineering 14, 2 (April 2017), 705–717.

[16] Olexandr Yevkin. 2015. An efficient approximate Markov chain method in dynamic fault tree analysis. Quality and Reliability Engineering International (2015).

[17] Håkan LS Younes, Marta Kwiatkowska, Gethin Norman, and David Parker. 2006. Numerical vs. statistical probabilistic model checking. International Journal on Software Tools for Technology Transfer 8, 3 (2006), 216–228.