
Distributed Fault Diagnosis for Process and Sensor Faults in a Class of Interconnected Input-Output Nonlinear Discrete-Time Systems

This paper presents a distributed fault diagnosis scheme able to deal with process and sensor faults in an integrated way for a class of interconnected input–output nonlinear uncertain discrete-time systems. A robust distributed fault detection scheme is designed, where each interconnected subsystem is monitored by its respective fault detection agent, and according to the decisions of these agents, further information regarding the type of the fault can be deduced. As it is shown, a process fault occurring in one subsystem can only be detected by its corresponding detection agent whereas a sensor fault in a subsystem can be detected by either its corresponding detection agent or the detection agent of another subsystem that is affected by the subsystem where the sensor fault occurred. This discriminating factor is exploited for the derivation of a high-level isolation scheme. Moreover, process and sensor fault detectability conditions characterising quantitatively the class of detectable faults are derived. Finally, a simulation example is used to illustrate the effectiveness of the proposed distributed fault detection scheme.
Christodoulos Keliris (a), Marios M. Polycarpou (a), Thomas Parisini (b)
(a) KIOS Research Center for Intelligent Systems and Networks, Dept. of Electrical and Computer Engineering, University of Cyprus, Nicosia 1678, Cyprus
(b) Dept. of Electrical and Electronic Engineering, Imperial College London, UK, and Dept. of Engineering and Architecture, University of Trieste, Italy
(Received 00 Month 20XX; accepted 00 Month 20XX)
Keywords: nonlinear systems; fault diagnosis; filtering; fault propagation; process and sensor faults
1. Introduction
In many applications involving large-scale systems, collaboration and information exchange among several subsystems are of crucial importance. Examples include power systems, communication networks and water systems. If the problem of real-time monitoring is not properly addressed, the operation of such systems may create life-threatening situations and cause significant economic damage. Therefore, the development of robust fault detection approaches for promptly identifying any abnormal system behavior is a primary task for achieving safe and reliable system operation.
The area of fault diagnosis has been at the forefront of technological evolution for several decades. Many survey papers (Frank, 1990; Venkatasubramanian, Rengaswamy, Yin, & Kavuri, 2003) and books (Blanke, Kinnaert, Lunze, & Staroswiecki, 2010; Chen & Patton, 1999; R. Patton & Frank, 1989) exist on the topic. The problem of fault detection and isolation (FDI) for linear systems is well investigated and the interested reader is directed to the aforementioned books and survey papers.
In the case of FDI for nonlinear systems, until recently, centralized fault diagnosis approaches were the main topic of investigation and a variety of methods were employed, e.g. nonlinear observer design (Hammouri, Kinnaert, & El Yaagoubi, 1999; Rajamani & Ganguli, 2004), adaptive estimation methods (Reppa, Polycarpou, & Panayiotou, 2014; X. Zhang, Polycarpou, & Parisini, 2002), change detection methods (Basseville & Nikiforov, 1993; Q. Zhang, Basseville, & Benveniste, 1998) and differential-geometric approaches (De Persis & Isidori, 2002). In recent years though, mainly due to advances in computing and communications, research has focused mostly on the development of hierarchical (Klinkhieo & Patton, 2009; R. J. Patton et al., 2007), decentralized (Ferdowsi, Raja, & Jagannathan, 2012; Léchevin & Rabbath, 2009; Stankovic, Ilic, Djurovic, Stankovic, & Johansson, 2010; Wei, Gui, Xie, & Ding, 2009; X. Zhang, Polycarpou, & Parisini, 2009) and distributed (Boem, Ferrari, & Parisini, 2011; Ferrari, Parisini, & Polycarpou, 2012; Keliris, Polycarpou, & Parisini, 2013a; Yan, Tian, & Shi, 2008) schemes for process or sensor faults. In our earlier work (Keliris et al., 2013a), a distributed fault detection approach in continuous time with full state measurements was presented for the case of process faults, along with a thorough detectability analysis. The approach made use of filtering in order to dampen the effects of measurement noise and aid in the derivation of tight detection thresholds. In Keliris, Polycarpou, and Parisini (2013b), the approach of Keliris et al. (2013a) was extended by relaxing the assumption that all the state measurements are available, but it still dealt only with process faults in continuous time. In this work, we maintain the use of filtering and further investigate the properties of the filtering approach in discrete time, by considering the input-output case and by dealing with process and sensor faults in an integrated way.
Corresponding author. Email: t.parisini@gmail.com
In many cases, the architecture of the underlying subsystems, which are inherently decentralized or distributed, makes the development of a distributed FDI framework a necessity. Many factors contribute to the need for a distributed FDI formulation, such as the large-scale nature of the system to be monitored, its spatial distribution, and the inability to access certain parts of the system centrally. As a result, local diagnosis should be performed. However, the increasing complexity of large-scale interconnected systems creates additional difficulties for the fault diagnosis problem, especially regarding fault propagation, where a fault that occurs in one subsystem also affects neighboring subsystems. Therefore, there is a need to better understand the fault propagation issues involved, the behavior of the system and the corresponding FDI scheme. This paper contributes in these directions by proposing a distributed fault detection scheme and addressing fault propagation among interconnected subsystems.
In the research literature there is a tendency to deal with the problem of fault diagnosis for process and sensor faults separately, which poses a significant limitation for real-world applications. For example, in the problem of fault diagnosis for process faults the sensors are assumed healthy. However, apart from erroneous detection results, a faulty sensor may also lead to degraded tracking or regulation performance or even endanger the stability of the control system. Acknowledging that sensors are prone to faults and utilizing sensor validation approaches is crucial to the overall system stability and reliability. Similarly, in the problem of fault diagnosis for sensor faults, it is typically assumed that there are no process faults. Dealing with the process and sensor fault problems separately incurs the danger of false alarms, because one fault type is monitored while the other is ignored, which results in unnecessary component replacement and increased maintenance costs.
The research conducted on the fault diagnosis problem that deals simultaneously with process and sensor faults is limited (Dunia & Joe Qin, 1998; Salahshoor, Mosallaei, & Bayat, 2008; Q. Zhang & Zhang, 2012; X. Zhang, Polycarpou, & Parisini, 2008). Some earlier results dealing jointly with actuator and sensor faults can also be found in Kinnaert and Peng (1995); Massoumnia and Vander Velder (1988); Viswanadham and Srichander (1987). In Salahshoor et al. (2008) the sensor and process fault detection problem is addressed using multi-sensor data fusion techniques based on the adaptive extended Kalman filter algorithm, whereas in Dunia and Joe Qin (1998) a unified framework for the joint diagnosis of process and sensor faults is proposed, along with fault identification and reconstruction via principal component analysis. In the context of analytical redundancy methods, X. Zhang et al. (2008) develop a fault isolation approach to determine which process or sensor fault, among two respective fault classes, has occurred. Based on the assumption that only a single fault occurs (either a process fault or a single sensor fault), adaptive approximation methods are used to build a fault detection estimator and suitable fault isolation estimators, corresponding to the process and sensor faults, that are able to determine which fault has occurred. In Talebi, Khorasani, and Tafazoli (2009) a recurrent neural-network based fault detection scheme for nonlinear systems is proposed, which employs two nonlinear-in-parameters neural networks to isolate actuator and sensor faults; the fault is determined when one of the neural networks produces a non-zero output, indicating the faulty condition. In Thumati and Halligan (2013) a nonlinear observer-based fault diagnostics scheme, dealing with process and sensor faults, is proposed for nonlinear systems in discrete time. The scheme uses an artificial immune system as an online approximator and identifies the fault type by monitoring the magnitude of the outputs of the two online approximators (state and output), as in Talebi et al. (2009). In Q. Zhang and Zhang (2012) a distributed detection scheme for process and sensor faults is proposed for a class of input-output interconnected systems in continuous time, but the estimator design is conducted under some potentially restrictive conditions and deals only with the fault detectability issue. This paper contributes to the limited research available by developing a distributed fault detection scheme for process and sensor faults and investigating the propagation of the fault effects to neighboring subsystems.
The primary objective and main contribution of this paper is the derivation of a distributed
fault diagnosis approach dealing with process and sensor faults in an integrated way, utilizing a
specifically designed scheme that encompasses important characteristics regarding fault propagation
among subsystems. Detectability conditions for process and sensor faults are derived, characterizing
quantitatively the class of detectable faults. The scheme is comprised of a set of interacting fault
detection agents, in which each subsystem is monitored by its respective detection agent. As shown,
a process fault that occurs in a subsystem can only be detected by its respective detection agent,
whilst a sensor fault that occurs in a subsystem may also be detected by the detection agents of
neighboring subsystems that are affected by the subsystem where the sensor fault occurred. This
differentiating element is exploited in order to derive a high-level fault isolation scheme, able to
provide information regarding the type and location of the fault that has occurred. Therefore, the
proposed distributed fault detection approach encompasses significant benefits for the fault isolation
task which can be exploited by a more sophisticated isolation scheme to pinpoint the exact fault
that occurred.
The paper is organized as follows: the problem formulation is given in Section 2 and the detailed design of the distributed fault detection scheme is presented in Section 3. In Section 4, the detectability conditions that characterize the class of detectable process and sensor faults are derived. In Section 5, fault propagation issues are investigated, and in Section 6, a high-level fault isolation scheme is proposed. In Section 7, a simulation example demonstrating the effectiveness of the scheme is presented and, finally, in Section 8 some concluding remarks are given.
2. Problem formulation
We consider an interconnected nonlinear dynamic system comprised of $N$ subsystems $\Sigma_I$, $I \in \{1, \ldots, N\}$. The discrete-time dynamics of each subsystem are described by:
$$\Sigma_I:\quad x_I(t+1) = A_I x_I(t) + g_I\big(y_I^0(t), \bar{y}_I^0(t), u_I(t)\big) + \eta_I\big(x_I(t), \bar{x}_I(t), u_I(t), t\big) + \beta_I^x(t - T_0^x)\,\phi_I\big(x(t), u_I(t)\big) \tag{1}$$
$$y_I(t) = C_I x_I(t) + \xi_I(t) + \beta_I^y(t - T_0^y)\,\theta_I(t) \tag{2}$$
where $t \in \mathbb{N}$ is the discrete time instant, $x_I \in \mathbb{R}^{n_I}$, $u_I \in \mathbb{R}^{m_I}$ and $y_I \in \mathbb{R}^{p_I}$ are the state, input and measured output vectors of the $I$-th subsystem, respectively, and $x \triangleq [x_1^\top, x_2^\top, \ldots, x_N^\top]^\top \in \mathbb{R}^n$ is the state vector of the overall system. Note that the distributed fault diagnosis scheme (to be presented) is composed of $N$ local fault detection agents $F_I$, $I \in \{1, \ldots, N\}$, one for each subsystem $\Sigma_I$, and that the structure of the diagnosis agents exactly mirrors the decomposition (1), (2).
The matrix $A_I \in \mathbb{R}^{n_I \times n_I}$ and the function $g_I: \mathbb{R}^{p_I} \times \mathbb{R}^{\bar{p}_I} \times \mathbb{R}^{m_I} \mapsto \mathbb{R}^{n_I}$ are the known nominal function dynamics and the matrix $C_I \in \mathbb{R}^{p_I \times n_I}$ is the known nominal output matrix of the $I$-th subsystem. The function $g_I$ contains the known part of the interconnection function between the $I$-th and its neighboring subsystems. More specifically, the vectors $y_I^0(t)$, $\bar{y}_I^0(t)$ are defined as $y_I^0(t) \triangleq C_I x_I(t)$ and $\bar{y}_I^0(t) \triangleq \bar{C}_I \bar{x}_I(t)$, where $\bar{x}_I \in \mathbb{R}^{\bar{n}_I}$ and $\bar{C}_I \bar{x}_I \in \mathbb{R}^{\bar{p}_I}$ denote the state variables and the corresponding output variables, respectively, of the neighboring subsystems that affect the $I$-th subsystem. This indicates that $g_I$ is a function of local variables $y_I^0(t)$ and interconnection variables $\bar{y}_I^0(t)$ that are measurable as $y_I(t)$, $\bar{y}_I(t)$, respectively. The superscript $0$ in $y_I^0(t)$, $\bar{y}_I^0(t)$ simply indicates the noiseless and sensor fault-free counterparts of the measurements $y_I(t)$, $\bar{y}_I(t)$, respectively. The vector function $\eta_I: \mathbb{R}^{n_I} \times \mathbb{R}^{\bar{n}_I} \times \mathbb{R}^{m_I} \times \mathbb{R}^+ \mapsto \mathbb{R}^{n_I}$ denotes the modeling uncertainty associated with the nominal dynamics and $\xi_I \in \mathbb{R}^{p_I}$ represents the measurement noise. The term $\beta_I^x(t - T_0^x)\,\phi_I(x, u_I)$ characterizes the process fault function dynamics affecting the $I$-th subsystem, including its time evolution. More specifically, the term $\phi_I: \mathbb{R}^n \times \mathbb{R}^{m_I} \mapsto \mathbb{R}^{n_I}$ represents the unknown fault function and the term $\beta_I^x(t - T_0^x): \mathbb{R} \mapsto \mathbb{R}^+$ models the time evolution of the fault, where $T_0^x$ is the unknown time of the fault occurrence. Note that the fault function $\phi_I$ may depend on the global state vector $x$ and not only on the local state vector $x_I$, allowing faults to be functions of the overall state vector and not only of the states that are available to the $I$-th subsystem. The term $\beta_I^y(t - T_0^y)\,\theta_I(t)$ characterizes the sensor fault, where $\beta_I^y: \mathbb{R} \mapsto \mathbb{R}^+$ models the time profile of the sensor fault, which occurs at some unknown time $T_0^y$, and $\theta_I \in \mathbb{R}^{p_I}$ represents the unknown time-varying sensor fault.
In this work, no particular modeling is considered for the time profile $\beta_I^x(t - T_0^x)$ of the process fault and $\beta_I^y(t - T_0^y)$ of the sensor fault. Generally, the time profiles can be used to model both abrupt and incipient faults. In this work we consider them to be zero prior to the respective fault occurrence and do not make any modeling assumptions regarding the fault evolution after their occurrence, i.e. we only consider $\beta_I^x(t - T_0^x) = 0$ for all $t < T_0^x$ and $\beta_I^y(t - T_0^y) = 0$ for all $t < T_0^y$. In fact, the faults can be permanent, temporary or even intermittent.
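To make the role of the two time profiles concrete, the following minimal Python sketch simulates a scalar, isolated instance of (1), (2). Everything specific in it is an assumption made for illustration only: the abrupt (step) profile `beta_abrupt`, the choice $g_I = u_I$, the constant fault magnitudes `phi` and `theta`, the numerical values, and the omission of noise and modeling uncertainty. The paper itself places no restriction on the profiles after the fault occurrence.

```python
def beta_abrupt(t, t0):
    """Hypothetical abrupt profile: zero before the fault time t0, one afterwards."""
    return 0.0 if t0 is None or t < t0 else 1.0


def step(x, u, t, a=0.5, c=1.0, t0_x=None, t0_y=None, phi=0.3, theta=0.2):
    """One step of a scalar instance of (1)-(2), with g = u, no noise, no uncertainty."""
    y = c * x + beta_abrupt(t, t0_y) * theta           # output equation (2)
    x_next = a * x + u + beta_abrupt(t, t0_x) * phi    # state equation (1)
    return x_next, y


def simulate(steps, x0=1.0, u=0.1, **fault):
    """Return the state and output trajectories over the given number of steps."""
    xs, ys, x = [], [], x0
    for t in range(steps):
        x_next, y = step(x, u, t, **fault)
        xs.append(x)
        ys.append(y)
        x = x_next
    return xs, ys
```

In this sketch a sensor fault (`t0_y`) changes only the measured output from the fault time onwards and leaves the state trajectory untouched, whereas a process fault (`t0_x`) enters the state update and first becomes visible in the output one step later.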
In this work, subsystem $\Sigma_J$ is said to affect subsystem $\Sigma_I$ (or, in other words, $\Sigma_J$ is a neighbor of $\Sigma_I$) if the interconnection variables of $\Sigma_I$, i.e. $\bar{y}_I^0(t)$, contain at least one of the measurable output variables of $\Sigma_J$, i.e. $y_J^0(t)$.
The objective is to design and analyze a distributed fault detection approach in which a local fault detection agent $F_I$ is associated with each subsystem $\Sigma_I$ and receives the local measurements $u_I$, $y_I$ and partial information from neighboring fault detection agents $F_J$. Each fault detection agent $F_I$ is not connected to all other agents, but only to a subset of neighboring agents, thus constituting a distributed fault detection scheme. The analysis aims not only at the derivation of suitable detection thresholds but also at the investigation of fault propagation issues. In this work, the notion of fault propagation does not mean the creation of new faults in interconnected subsystems as a result of the faulty behavior of a subsystem. Instead, it means the way a particular fault occurring in one subsystem affects neighboring interconnected subsystems (in other words, it is simply the propagation of the fault effects from a faulty subsystem to the interconnected subsystems it affects). More specifically, the objective is to design a robust distributed fault detection scheme for process and sensor faults, with enhanced fault detectability characteristics and inherent fault isolation characteristics, that is able to provide information regarding the type and location of the fault that has occurred. The fault detectability enhancement is achieved through filtering, by exploiting the noise suppression properties of filters in order to obtain tight detection thresholds. The enhanced fault isolation characteristics follow from the design of the scheme, which utilizes the measurements instead of the state estimates. The purpose is to derive a high-level isolation scheme, which does not necessarily pinpoint the specific fault that has occurred, but rather draws some conclusions about it by taking into consideration the decisions of the fault detection agents that monitor each subsystem. This information, which in some special cases could even lead to the identification of a faulty sensor, can be used by a more advanced fault isolation scheme in order to greatly improve its performance by excluding potential fault scenarios. In the sequel, the terms fault isolation/diagnosis will be used in this sense.
Each fault detection agent contains an estimation model, based on its subsystem's nominal dynamics, that provides the state estimates, and utilizes filtering to derive the residual and threshold signals. Finally, each detection agent provides a binary decision regarding the detection of a fault in the subsystem it monitors. The decisions of all the fault detection agents are then exploited by the high-level isolation scheme to infer some information regarding the type/location of the fault that has occurred.
Figure 1. Distributed fault diagnosis approach for the case of three subsystems $\Sigma_1$, $\Sigma_2$, $\Sigma_3$ where $\Sigma_1$ affects $\Sigma_2$ and $\Sigma_2$ affects $\Sigma_3$. (Block diagram: each local FD agent comprises an estimation model, residuals and adaptive thresholds, and issues a yes/no fault decision for its subsystem; the three decisions feed the high-level fault isolation logic.)
Generally, the distributed fault diagnosis scheme is composed of $N$ local fault detection agents $F_I$, $I = 1, \ldots, N$, one for each subsystem $\Sigma_I$. Each local fault detection agent $F_I$ requires the input and output measurements of the subsystem $\Sigma_I$ that it is monitoring and also the measurements of all interconnected subsystems $\Sigma_J$ that affect $\Sigma_I$. Note that these last measurements are communicated by the neighboring fault detection agents $F_J$, and not by the subsystems $\Sigma_J$. Therefore, communication is needed between the fault detection agents according to their interconnections, which is what makes the scheme distributed. Note that the information exchanged among the subsystems consists only of quantities ($\bar{y}_I^0(t)$) that are measurable with some uncertainty ($\bar{y}_I(t)$). Figure 1 illustrates the distributed fault diagnosis approach for the case of three subsystems $\Sigma_1$, $\Sigma_2$, $\Sigma_3$ where $\Sigma_1$ affects $\Sigma_2$ and $\Sigma_2$ affects $\Sigma_3$.
The fault diagnosis structure can also be considered as a hierarchical multi-agent diagnostic system composed of two layers: a lower and an upper layer. In the lower layer, a diagnostic agent $F_I$ is associated with each subsystem $\Sigma_I$ with the aim of detecting process and sensor faults. Each fault detection agent $F_I$ consists of a residual generator of the form (4), (5), (6) (to be given in the sequel), together with a fault detection decision logic relying on the comparison of the residual (6) to an adaptive threshold (9) (to be designed). In the upper layer, a fault isolation logic combines the decisions of the diagnostic agents, with the aim of inferring further information regarding the type of the fault that has occurred (distinguishing process and sensor faults) and its location.
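The kind of inference the upper layer can draw may be sketched in Python as follows. This is only an illustration of the discriminating property stated in the text (a process fault is detectable only by its own agent, while a sensor fault may also be detected by agents of affected subsystems), under the additional assumption that a single fault is present. It is not the isolation scheme of Section 6, and the function name `consistent_hypotheses` and the dictionary `affects` are hypothetical.

```python
def consistent_hypotheses(alarms, affects):
    """Return the single-fault hypotheses consistent with the set of alarming agents.

    alarms:  set of agent indices whose residual exceeded its threshold.
    affects: dict mapping each subsystem J to the set of subsystems that J affects.
    """
    subsystems = set(affects) | {i for targets in affects.values() for i in targets}
    hypotheses = []
    if not alarms:
        return hypotheses
    for i in sorted(subsystems):
        # A process fault in subsystem i can only be detected by agent i.
        if alarms <= {i}:
            hypotheses.append(("process", i))
        # A sensor fault in subsystem i may be detected by agent i and by the
        # agents of the subsystems that subsystem i affects.
        if alarms <= {i} | set(affects.get(i, ())):
            hypotheses.append(("sensor", i))
    return hypotheses
```

For the topology of Figure 1, `affects = {1: {2}, 2: {3}, 3: set()}`. If agents 1 and 2 both raise an alarm, the only consistent single-fault hypothesis is a sensor fault in subsystem 1, whereas an alarm from agent 2 alone leaves three candidates. An empty result for a non-empty alarm set indicates that the single-fault assumption of the sketch does not hold.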
In this work, we do not deal explicitly with the design of feedback controllers for selecting $u_I$. Instead, we consider the fault detection issue in the presence of process faults $\phi_I$, sensor faults $\theta_I$, modeling uncertainties $\eta_I$ and measurement noise $\xi_I$. The proposed formulation allows for any controllers that achieve some desired control objectives under healthy conditions and does not depend on their structure. It is assumed that the controllers are able to retain the uniform boundedness of the state variables before and after the occurrence of a fault.
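The following assumptions are used throughout the paper:
Assumption 1: For each subsystem $\Sigma_I$, $I \in \{1, \ldots, N\}$, the local state variables $x_I(t)$ and the local inputs $u_I(t)$ belong to known compact regions $D_{x_I}$ and $D_{u_I}$, respectively, before and after the occurrence of a fault, i.e. $x_I(t) \in D_{x_I}$, $u_I(t) \in D_{u_I}$ for all $t \geq 0$.
Assumption 2: For each subsystem $\Sigma_I$, $I \in \{1, \ldots, N\}$, the pair $(A_I, C_I)$ is detectable.
Assumption 3: The modeling uncertainty $\eta_I$ in each subsystem is an unstructured and unknown nonlinear function of $x_I$, $\bar{x}_I$, $u_I$ and $t$, but bounded by a known positive functional $\bar{\eta}_I$ under sensor fault-free operation, i.e.,
$$\|\eta_I(x_I, \bar{x}_I, u_I, t)\| \leq \bar{\eta}_I(y_I, \bar{y}_I, u_I), \tag{3}$$
for all $t \in \mathbb{N}$ and for all $(x_I, \bar{x}_I, u_I) \in D_I$, where $\bar{y}_I \in \mathbb{R}^{\bar{p}_I}$ is the noisy counterpart of $\bar{y}_I^0(t)$, i.e. $\bar{y}_I = \bar{y}_I^0(t) + \bar{\xi}_I$, $\bar{\xi}_I \in \mathbb{R}^{\bar{p}_I}$, and $\bar{\eta}_I(y_I, \bar{y}_I, u_I) \geq 0$ is a known bounding function in some compact region of interest $D_I = D_{x_I} \times D_{\bar{x}_I} \times D_{u_I} \subset \mathbb{R}^{n_I} \times \mathbb{R}^{\bar{n}_I} \times \mathbb{R}^{m_I}$.
Assumption 4: The measurement noise belongs to a known compact region, i.e. $\xi_I(t) \in D_{\xi_I} \subset \mathbb{R}^{p_I}$, $\bar{\xi}_I \in D_{\bar{\xi}_I} \subset \mathbb{R}^{\bar{p}_I}$.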
Assumption 1 is required for well-posedness since in this work we do not address the control design
and fault accommodation problem, but instead the fault detection problem. Assumption 2 is required
for the design of a suitable observer to be used for the residual signal generation. Assumption 3
characterizes the class of modeling uncertainties being considered. The bound ¯ηIis required in order
to distinguish the effects between modeling uncertainty and faults. Assumption 4 is required in order
to distinguish the effects between noise and faults.
3. Distributed Fault Detection
In this Section, the details of the proposed distributed scheme regarding the design of the residual
and threshold signals are given, along with some practical considerations.
3.1 Residual Signal Generation
In this part, the residual signal generation in each fault detection agent is addressed by making use of filtering. For each subsystem $\Sigma_I$, we consider an estimation model, based on the known components of (1) under healthy mode of operation:
$$\hat{x}_I(t+1) = A_I \hat{x}_I(t) + g_I\big(y_I(t), \bar{y}_I(t), u_I(t)\big) + L_I\big(y_I(t) - \hat{y}_I(t)\big) \tag{4}$$
$$\hat{y}_I(t) = C_I \hat{x}_I(t), \tag{5}$$
where the gain matrix $L_I$ is computed so that $(A_I - L_I C_I)$ is Schur stable, i.e. its eigenvalues lie in the open unit disc. Note that, since the pair $(A_I, C_I)$ is detectable according to Assumption 2, such an $L_I$ can always be determined. In order to simplify the presentation of the mathematical calculations, the initial condition $\hat{x}_I(0)$ is considered known, with $\hat{x}_I(0) = x_I(0)$. In the case that $x_I(0)$ is not exactly known, the discrepancy $x_I(0) - \hat{x}_I(0)$ will appear in the calculations; as will be shown later, it is multiplied by exponentially decaying functions, and therefore it does not substantially affect the subsequent analysis.
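A minimal scalar sketch of the estimation model (4), (5) is given below in Python. The nonlinearity `g`, the numerical values of `a`, `c`, `l` and the absence of interconnection variables are assumptions chosen only for illustration. The sketch starts from $\hat{x}_I(0) \neq x_I(0)$ with noiseless, fault-free measurements, so that the estimation error obeys $e(t+1) = (a - lc)\,e(t)$ and decays geometrically, as claimed in the text.

```python
import math


def g(y, u):
    """Hypothetical known nonlinearity driven by the measured output and the input."""
    return 0.2 * math.sin(y) + u


def estimator_step(x_hat, y, u, a, c, l):
    """One step of (4)-(5): returns the next state estimate."""
    y_hat = c * x_hat                                 # output estimate (5)
    return a * x_hat + g(y, u) + l * (y - y_hat)      # state estimate update (4)


def run(steps, a=0.9, c=1.0, l=0.6, x0=1.0, x_hat0=0.0, u=0.1):
    """Return the estimation errors x(t) - x_hat(t) for t = 0..steps."""
    assert abs(a - l * c) < 1.0, "a - l*c must lie in the open unit disc"
    x, x_hat, errors = x0, x_hat0, []
    for _ in range(steps + 1):
        errors.append(x - x_hat)
        y = c * x                                     # noiseless, fault-free measurement
        x, x_hat = a * x + g(y, u), estimator_step(x_hat, y, u, a, c, l)
    return errors
```

Because the measurement, rather than the estimate, enters `g` in both the plant and the estimator, the nonlinear terms cancel in the error dynamics of this noiseless sketch and only the factor `a - l*c` remains.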
In this work, the residual signal $r_I(t)$ to be used for fault detection in each subsystem $\Sigma_I$ is given by
$$r_I(t) \triangleq H(z)\big[y_I(t) - \hat{y}_I(t)\big], \tag{6}$$
where $H(z)$ is a $p$-th order, asymptotically stable filter with proper transfer function
$$H(z) = \frac{d_0 + d_1 z^{-1} + d_2 z^{-2} + \ldots + d_p z^{-p}}{1 + c_1 z^{-1} + \ldots + c_p z^{-p}}. \tag{7}$$
Note that the form of $H(z)$ allows both IIR and FIR types of digital filters. In addition, note that, for the residual generation, each measured variable $y_I^{(j)}$ ($j$-th component of $y_I$) is filtered by $H(z)$ in order to dampen the effect of the measurement uncertainty $\xi_I(t)$, so that tighter detection thresholds can be obtained. Specifically, for each measured variable $y_I^{(j)}$, a corresponding residual $r_I^{(j)}$ and threshold $\bar{r}_I^{(j)}$ are generated and hence all the measurements need to be filtered for the residual generation. On the other hand, the estimation model given by (4), (5) relies on the unfiltered measurements. In the proposed approach, the same filter $H(z)$ must be used within the fault detection agent $F_I$ for filtering all the measurements $y_I$, whereas different filters $H(z)$ can be used by different fault detection agents. In this work, without loss of generality, we have considered that the same filter $H(z)$ is used by all detection agents.
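As an illustration of how a filter of the form (7) acts on a measured sequence, the following Python sketch evaluates the corresponding difference equation with zero initial conditions. The function name `apply_filter` and the coefficient layout are assumptions of this sketch, not notation from the paper.

```python
def apply_filter(d, c, u):
    """Filter the sequence u with H(z) = (d[0] + d[1] z^-1 + ...) / (1 + c[0] z^-1 + ...).

    d holds d_0..d_p and c holds c_1..c_p (an empty c gives an FIR filter).
    Zero initial conditions are assumed.
    """
    y = []
    for t in range(len(u)):
        # Feed-forward part: d_0 u(t) + d_1 u(t-1) + ...
        acc = sum(d[k] * u[t - k] for k in range(len(d)) if t - k >= 0)
        # Feedback part: - c_1 y(t-1) - c_2 y(t-2) - ...
        acc -= sum(c[k - 1] * y[t - k] for k in range(1, len(c) + 1) if t - k >= 0)
        y.append(acc)
    return y
```

For example, `apply_filter([0.5, 0.5], [], u)` is a two-tap moving average (FIR), and `apply_filter([0.2], [-0.8], u)` is a first-order low-pass IIR filter with unit DC gain.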
The choice of a particular type of filter is application dependent and is made according to the available a priori knowledge of the noise properties. Usually, measurement noise consists of high-frequency components and therefore the use of a low-pass filter for dampening noise is well justified. On other occasions, one may want to focus the fault detectability on a prescribed frequency band of the measurement signals and hence choose the filter accordingly. The particular selection criteria for choosing a suitable filter and the associated trade-offs are outside the scope of the present paper; the reader is referred to the continuous-time case in Keliris et al. (2013a), where a rigorous investigation of the impact of filtering (according to the pole locations and the filter order) on the detection time is presented.
Since the filter $H(z)$ is asymptotically stable, for bounded measurement noise $\xi_I(t)$, the filtered measurement noise $\epsilon_{\xi_I}(t) \triangleq H(z)[\xi_I(t)]$ is bounded as follows:
$$\|\epsilon_{\xi_I}(t)\| \leq \bar{\epsilon}_{\xi_I}(t), \tag{8}$$
where $\bar{\epsilon}_{\xi_I}$ is a computable bounding function. Depending on the noise characteristics, $H(z)$ can be selected to reduce the bounding function $\bar{\epsilon}_{\xi_I}$. It is important to note that filtering is primarily used to mitigate the effect of measurement noise and aid in the derivation of tighter thresholds, thus enhancing fault detectability (see Keliris et al. (2013a)).
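One possible way to compute a bounding function for the filtered noise is sketched below in Python for a scalar measurement. It rests on the standard convolution estimate $|\sum_{k=0}^{t} h(k)\,\xi(t-k)| \leq \xi_{\max} \sum_{k=0}^{t} |h(k)|$, where $h$ is the impulse response of $H(z)$ and $\xi_{\max}$ is a known bound on $|\xi(t)|$. The paper does not state which bounding function it uses, so this particular choice and the name `noise_bound` are assumptions.

```python
def noise_bound(h, xi_max):
    """Bound on |H(z)[xi](t)| for |xi(t)| <= xi_max, given impulse response samples h(0..T).

    Returns the list of bounds for t = 0..T, where the bound at time t is
    xi_max times the sum of |h(k)| for k = 0..t.
    """
    bounds, acc = [], 0.0
    for hk in h:
        acc += abs(hk)
        bounds.append(xi_max * acc)
    return bounds
```

For a low-pass filter with non-negative impulse response and unit DC gain, this bound never exceeds `xi_max`, and it is attained by a constant noise sequence equal to `xi_max`.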
Remark 1: It is important to note that in the nonlinear function $g_I$, the measurements $y_I$ and $\bar{y}_I$ are used instead of the estimates $C_I \hat{x}_I$ and $\bar{C}_I \hat{\bar{x}}_I$. This is crucial for the derivation of the high-level fault isolation scheme, since in the case of a process fault its effects are contained in the measurements and, as will be shown, the fault can only be detected by the agent monitoring the subsystem in which the fault has occurred. On the other hand, when a sensor fault occurs, it can affect neighboring detection agents through the communicated measurements of the interconnection variables, i.e. $\bar{y}_I$, which contain the sensor fault, and hence the fault can also be detected by interconnected detection agents. In X. Zhang et al. (2009), which investigated a decentralized fault detection scheme for process faults, the estimation model used the estimates in the interconnection functions among subsystems (instead of the measurements as in this work) in order to enhance fault detectability by allowing the interconnected agents to also be able to detect the fault. In this work, the measurements are used instead of the estimates, in order to enhance fault isolability.
3.2 Adaptive Detection Threshold
In this part, suitable detection thresholds that guarantee no false alarms are derived. Filtering is also integrated in the design in order to attenuate the measurement noise effects and aid in the derivation of tighter detection thresholds. This is achieved by treating the filter as a linear state transformation, which allows the mathematical expressions to be manipulated in terms of their filtered versions.
In the following, we will use the filtered state variable $x_{I,f}(t) \triangleq H(z)[x_I(t)]$ and the filtered state estimate $\hat{x}_{I,f}(t) \triangleq H(z)[\hat{x}_I(t)]$. The threshold design is based on the derivation of a suitable bound on the filtered state estimation error $x_{I,f}(t) - \hat{x}_{I,f}(t)$. Also, let $h(t)$ be the impulse response associated with $H(z)$, i.e. $h(t) \triangleq Z^{-1}[H(z)]$, so that $x_{I,f}(t)$ can be written as
$$x_{I,f}(t) = \sum_{k=0}^{t} h(k)\, x_I(t-k).$$
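In this work, the detection decision of a fault in the overall system is made when $|r_I^{(j)}(t)| > \bar{r}_I^{(j)}(t)$ at some time $t$ for at least one component $j \in \{1, 2, \ldots, p_I\}$ in any local subsystem $\Sigma_I$, where $\bar{r}_I^{(j)}(t)$ is the detection threshold given by
$$\bar{r}_I^{(j)}(t) \triangleq \sum_{k=0}^{t-1} \alpha_{I,j}\, \delta_{I,j}^{\,t-1-k}\, \bar{\chi}_I(k) + \bar{\epsilon}_{\xi_I}(t), \tag{9}$$
where $\bar{\epsilon}_{\xi_I}(t)$ is the bound (8) on the filtered measurement noise $\epsilon_{\xi_I}(t) \triangleq H(z)[\xi_I(t)]$,
$$\bar{\chi}_I(t) \triangleq \bar{H}(z)\Big[\bar{\eta}_I\big(y_I(t), \bar{y}_I(t), u_I(t)\big) + \overline{\Delta g}_I\Big] + \|L_I\|\, \bar{\epsilon}_{\xi_I}(t), \tag{10}$$
$\bar{H}(z)$ is a filter with impulse response $\bar{h}(t) \geq |h(t)|$ for all $t \geq 0$,
$$\overline{\Delta g}_I \triangleq \sup_{\substack{(x_I, \bar{x}_I, u_I) \in D_I \\ (\xi_I, \bar{\xi}_I) \in D_{\xi_I} \times D_{\bar{\xi}_I}}} \big\| g_I\big(C_I x_I, \bar{C}_I \bar{x}_I, u_I\big) - g_I\big(C_I x_I + \xi_I, \bar{C}_I \bar{x}_I + \bar{\xi}_I, u_I\big) \big\|, \tag{11}$$
and where the constants $\alpha_{I,j} > 0$ and $0 < \delta_{I,j} < 1$ are selected so that the following inequality holds:
$$\|C_I^{(j)} A_{I,0}^{t}\| \leq \alpha_{I,j}\, \delta_{I,j}^{t} \leq \|C_I^{(j)}\|\, \|A_{I,0}\|^{t}, \tag{12}$$
with $A_{I,0} \triangleq A_I - L_I C_I$. Finally, $C_I^{(j)}$ denotes the $j$-th row of the matrix $C_I$. Note that, since $A_{I,0}$ is Schur stable, suitable constants $\alpha_{I,j}$, $\delta_{I,j}$ do exist (Ferrari, Parisini, & Polycarpou, 2008).
Note that the threshold (9) can be implemented using linear filtering techniques as:
$$\bar{r}_I^{(j)}(t) = \frac{\alpha_{I,j}}{1 - \delta_{I,j} z^{-1}}\big[\bar{\chi}_I(t-1)\big] + \bar{\epsilon}_{\xi_I}(t). \tag{13}$$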
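The equivalence between the summation form (9) and the filtered form (13) of the threshold can be checked numerically with the following Python sketch. The first-order filter in (13) is realized by the recursion $s(t) = \delta\, s(t-1) + \alpha\, \bar{\chi}(t-1)$ with $s(0) = 0$. The function names and the sample data used to exercise them are hypothetical.

```python
def threshold_sum(alpha, delta, chi_bar, eps_bar):
    """Direct evaluation of (9) for t = 0..len(eps_bar)-1."""
    return [
        sum(alpha * delta ** (t - 1 - k) * chi_bar[k] for k in range(t)) + eps_bar[t]
        for t in range(len(eps_bar))
    ]


def threshold_recursive(alpha, delta, chi_bar, eps_bar):
    """Recursive evaluation of (13): s(t) = delta*s(t-1) + alpha*chi_bar(t-1), s(0) = 0."""
    out, s = [], 0.0
    for t in range(len(eps_bar)):
        if t > 0:
            s = delta * s + alpha * chi_bar[t - 1]
        out.append(s + eps_bar[t])
    return out
```

The recursive form needs only one stored value per residual component, which is why it is the natural choice for an online implementation, while the summation form is the one that appears in the analysis.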
Figure 2 illustrates the implementation of the fault detection scheme for the detection agent $F_I$ resulting from equations (4), (5), (6), (7) and (13).
In the absence of any faults, the residual signal $r_I^{(j)}(t)$ given in (6) is always bounded by the detection threshold $\bar{r}_I^{(j)}(t)$ given by (9). The fault detection concept is formalized in the following Lemma.
Lemma 1: Consider a distributed system made of $N$ subsystems $\Sigma_I$ given by (1), (2). In the absence of any faults, the residuals $r_I^{(j)}(t)$ given by (6), where $\hat{y}_I^{(j)}$ are given by (4) and (5), are bounded by the detection thresholds $\bar{r}_I^{(j)}(t)$, given by (9), thus guaranteeing that no false alarms are issued by the fault detection scheme.
Figure 2. Fault detection scheme of the detection agent $F_I$.
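The statement of the Lemma can be illustrated with a small end-to-end simulation of a single detection agent. The Python sketch below uses a scalar subsystem without neighbors, and every concrete choice in it is an assumption made for illustration: the nonlinearity `g`, the first-order low-pass filter $H(z) = (1-b)/(1 - b z^{-1})$ (whose impulse response is non-negative, so that $\bar{H} = H$), the constant noise bound, the uncertainty signal, the numerical values and the constant sensor offset `theta`.

```python
import math
import random


def first_detection(steps=120, fault_time=None, theta=0.5, seed=0):
    """Return the first time |r(t)| > rbar(t), or None if no alarm is raised."""
    a, l, b = 0.8, 0.5, 0.5              # plant pole, observer gain, filter pole (assumed)
    eta_bar, xi_max = 0.01, 0.02         # bounds on uncertainty and noise (assumed)

    def g(y):
        """Hypothetical known nonlinearity with Lipschitz constant 0.5."""
        return 0.5 * math.sin(y)

    dg_bar = 0.5 * xi_max                # upper bound on the supremum in (11)
    alpha, delta = 1.0, abs(a - l)       # satisfy (12) in the scalar case with C = 1
    eps_bar = xi_max                     # valid because sum_k |h(k)| <= 1 for this filter
    rng = random.Random(seed)
    x = x_hat = 1.0                      # x_hat(0) = x(0), as assumed in the text
    r = hbar = s = 0.0                   # filter states, zero initial conditions
    chi_prev = None
    for t in range(steps):
        xi = rng.uniform(-xi_max, xi_max)
        fault = theta if fault_time is not None and t >= fault_time else 0.0
        y = x + xi + fault                               # output (2) with C = 1
        r = b * r + (1 - b) * (y - x_hat)                # filtered residual (6)
        hbar = b * hbar + (1 - b) * (eta_bar + dg_bar)   # Hbar(z)[eta_bar + dg_bar]
        if chi_prev is not None:
            s = delta * s + alpha * chi_prev             # recursion behind (13)
        chi_prev = hbar + abs(l) * eps_bar               # chi_bar(t) as in (10)
        if abs(r) > s + eps_bar:                         # compare with threshold (9)
            return t
        eta = eta_bar * math.sin(0.3 * t)                # one admissible uncertainty
        x, x_hat = (a * x + g(x) + eta,                  # healthy process, as in (1)
                    a * x_hat + g(y) + l * (y - x_hat))  # estimator (4)
    return None
```

In the fault-free run the residual stays below the threshold for all simulated times, in line with the Lemma. With the assumed sensor offset of 0.5 injected at `fault_time=60`, the filtered residual jumps by 0.25 at that instant, which exceeds the threshold of this sketch, so the alarm is raised at the fault time itself. Smaller offsets may go undetected, which is the kind of question addressed by the detectability conditions of Section 4.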
Proof. The filtered state $x_{I,f}(t+1)$ can be written as:
$$x_{I,f}(t+1) = \sum_{k=0}^{t+1} h(t+1-k)\, x_I(k) = \sum_{k=1}^{t+1} h(t+1-k)\, x_I(k) + h(t+1)\, x_I(0),$$
and, by using the change of variables $k = i + 1$, it becomes
$$x_{I,f}(t+1) = \sum_{i=0}^{t} h(t-i)\, x_I(i+1) + h(t+1)\, x_I(0). \tag{14}$$
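The index shift leading to (14) can be verified numerically. The following Python sketch compares the convolution form of the filtered signal at time $t+1$ with the right-hand side of (14) for a scalar sequence; the function names are hypothetical.

```python
def filtered(h, x, t):
    """x_f(t) = sum_{k=0}^{t} h(k) x(t-k), the convolution form of H(z)[x(t)]."""
    return sum(h[k] * x[t - k] for k in range(t + 1))


def shifted_form(h, x, t):
    """Right-hand side of (14): sum_{i=0}^{t} h(t-i) x(i+1) + h(t+1) x(0)."""
    return sum(h[t - i] * x[i + 1] for i in range(t + 1)) + h[t + 1] * x[0]
```

The term $h(t+1)\,x_I(0)$ is what remains of the initial condition after the shift; it is the source of the exponentially decaying terms that appear in the subsequent equations.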
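i) Prior to the possible occurrence of a process fault, by using (1), (14) can be written as:
\[
\begin{aligned}
x_{I,f}(t+1) &= \sum_{i=0}^{t} h(t-i)\Big[A_I x_I(i) + g_I\big(y_I^0(i),\bar{y}_I^0(i),u_I(i)\big) + \eta_I\big(x_I(i),\bar{x}_I(i),u_I(i),i\big)\Big] + h(t+1)\,x_I(0) \\
&= A_I x_{I,f}(t) + H(z)\big[g_I\big(y_I^0(t),\bar{y}_I^0(t),u_I(t)\big)\big] + H(z)\big[\eta_I\big(x_I(t),\bar{x}_I(t),u_I(t),t\big)\big] + h(t+1)\,x_I(0).
\end{aligned} \tag{15}
\]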
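Similarly to the derivation of (15), by using (4), the filtered state estimate $\hat{x}_{I,f}(t+1)$ satisfies:
\[
\hat{x}_{I,f}(t+1) = A_I\hat{x}_{I,f}(t) + H(z)\big[g_I\big(y_I(t),\bar{y}_I(t),u_I(t)\big)\big] + L_I\big(y_{I,f}(t)-\hat{y}_{I,f}(t)\big) + h(t+1)\,\hat{x}_I(0), \tag{16}
\]
where $y_{I,f}(t) \triangleq H(z)[y_I(t)]$ and $\hat{y}_{I,f}(t) \triangleq H(z)[\hat{y}_I(t)] = C_I\hat{x}_{I,f}(t)$.
ii) Prior to the possible occurrence of a sensor fault, $y_{I,f}(t) = C_I x_{I,f}(t) + \epsilon_{\xi_I}(t)$ and, by using (15), (16) and some algebra, the filtered state estimation error $\tilde{x}_{I,f}(t) \triangleq x_{I,f}(t) - \hat{x}_{I,f}(t)$ satisfies
\[
\tilde{x}_{I,f}(t+1) = A_{I,0}\,\tilde{x}_{I,f}(t) + \chi_I(t), \tag{17}
\]
where
\[
\chi_I(t) \triangleq H(z)\big[\eta_I\big(x_I(t),\bar{x}_I(t),u_I(t),t\big) + \Delta g_I(t)\big] - L_I\,\epsilon_{\xi_I}(t) + h(t+1)\big(x_I(0)-\hat{x}_I(0)\big), \tag{18}
\]
\[
\Delta g_I(t) \triangleq g_I\big(y_I^0(t),\bar{y}_I^0(t),u_I(t)\big) - g_I\big(y_I^0(t)+\xi_I(t),\bar{y}_I^0(t)+\bar{\xi}_I(t),u_I(t)\big). \tag{19}
\]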
The solution of (17) is
\[
\tilde{x}_{I,f}(t) = A_{I,0}^{t}\,\tilde{x}_{I,f}(0) + \sum_{k=0}^{t-1} A_{I,0}^{t-1-k}\chi_I(k). \tag{20}
\]
Note that in (18), $h(t+1)(x_I(0)-\hat{x}_I(0)) = 0$ because $x_I(0) = \hat{x}_I(0)$, and that the term $\tilde{x}_{I,f}(0)$ in (20) is also zero since $\tilde{x}_{I,f}(0) = x_{I,f}(0) - \hat{x}_{I,f}(0) = h(0)(x_I(0)-\hat{x}_I(0)) = 0$. In the case $\hat{x}_I(0) \neq x_I(0)$, the aforementioned terms decay exponentially to zero and hence do not substantially affect the subsequent analysis. More specifically, the term $A_{I,0}^{t}\tilde{x}_{I,f}(0)$ in (20) decays exponentially to zero because $A_{I,0}$ is Schur stable, and the term $h(t+1)(x_I(0)-\hat{x}_I(0))$ in (18) also decays exponentially to zero because the impulse response $h(t)$ of an asymptotically stable IIR filter is exponentially decaying (the FIR case is trivial, since $h(t)$ is zero after finitely many steps).
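Now, by using (2) and (5), the residual (6) prior to any fault satisfies $r_I(t) = C_I\tilde{x}_{I,f}(t) + \epsilon_{\xi_I}(t)$ and, by using (20) and $\tilde{x}_{I,f}(0) = 0$, it becomes
\[
r_I(t) = \sum_{k=0}^{t-1} C_I A_{I,0}^{t-1-k}\chi_I(k) + \epsilon_{\xi_I}(t). \tag{21}
\]
By taking the absolute value component-wise and using the triangle inequality, the $j$-th element of $r_I(t)$, i.e. $r_I^{(j)}(t)$, satisfies
\[
|r_I^{(j)}(t)| \le \Big|\sum_{k=0}^{t-1} C_I^{(j)} A_{I,0}^{t-1-k}\chi_I(k)\Big| + |\epsilon_{\xi_I}^{(j)}(t)| \le \sum_{k=0}^{t-1} \|C_I^{(j)} A_{I,0}^{t-1-k}\|\,\|\chi_I(k)\| + |\epsilon_{\xi_I}^{(j)}(t)|. \tag{22}
\]
Moreover, using (12) and the fact that $|\epsilon_{\xi_I}^{(j)}(t)| \le \|\epsilon_{\xi_I}(t)\| \le \bar{\epsilon}_{\xi_I}(t)$, (22) becomes
\[
|r_I^{(j)}(t)| \le \sum_{k=0}^{t-1} \alpha_{I,j}\,\delta_{I,j}^{t-1-k}\,\|\chi_I(k)\| + \bar{\epsilon}_{\xi_I}(t). \tag{23}
\]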
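Now, consider the term $\chi_I(t)$, which satisfies
\[
\begin{aligned}
\|\chi_I(t)\| &= \big\|H(z)\big[\eta_I(x_I,\bar{x}_I,u_I,t) + \Delta g_I(t)\big] - L_I\,\epsilon_{\xi_I}(t)\big\| \\
&\le \big\|H(z)\big[\eta_I(x_I,\bar{x}_I,u_I,t) + \Delta g_I(t)\big]\big\| + \|L_I\,\epsilon_{\xi_I}(t)\| \\
&\le \sum_{k=0}^{t} |h(t-k)|\,\big\|\eta_I\big(x_I(k),\bar{x}_I(k),u_I(k),k\big)\big\| + \sum_{k=0}^{t} |h(t-k)|\,\|\Delta g_I(k)\| + \|L_I\|\,\|\epsilon_{\xi_I}(t)\| \\
&\le \bar{\chi}_I(t),
\end{aligned} \tag{24}
\]
where $\bar{\chi}_I(t)$ is the bounding function given by (10).
Finally, by using (23), (24) and the bound $\bar{\chi}_I(t)$, we obtain $|r_I^{(j)}(t)| \le \bar{r}_I^{(j)}(t)$, where the detection threshold $\bar{r}_I^{(j)}(t)$ is given by (9), thus concluding the proof.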
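From a practical viewpoint, the implementation of the threshold $\bar{r}_I$ requires the bound $\overline{\Delta g}_I$ given in (11). One approach to derive this bound is to consider a local Lipschitz condition, i.e.:
\[
\big\|g_I(C_Ix_I,\bar{C}_I\bar{x}_I,u_I) - g_I(C_Ix_I+\xi_I,\bar{C}_I\bar{x}_I+\bar{\xi}_I,u_I)\big\| \le L_{g_I}\,\big\|[\xi_I\ \ \bar{\xi}_I]^{\top}\big\|,
\]
where $L_{g_I}$ is the Lipschitz constant for the function $g_I$ with respect to $(x_I,\bar{x}_I)$ in the region $\mathcal{D}_{x_I}\times\mathcal{D}_{\bar{x}_I}$. Therefore, by using a uniform bound on the measurement noise (see Assumption 4), i.e. $\|\xi_I(t)\| \le \xi_{I,b}$, $\|\bar{\xi}_I(t)\| \le \bar{\xi}_{I,b}$ for all $t\in\mathbb{N}$, we can derive the bound $\overline{\Delta g}_I$.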
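As a worked instance of the last step, and under the additional assumption (not stated in the text) that $\|\cdot\|$ is the Euclidean norm, stacking the two noise vectors and applying the uniform bounds gives one admissible choice for the constant bound:

```latex
\[
\big\|[\xi_I^{\top}\ \ \bar{\xi}_I^{\top}]^{\top}\big\|_2
  = \sqrt{\|\xi_I\|_2^2 + \|\bar{\xi}_I\|_2^2}
  \le \sqrt{\xi_{I,b}^2 + \bar{\xi}_{I,b}^2}
\quad\Longrightarrow\quad
\overline{\Delta g}_I = L_{g_I}\sqrt{\xi_{I,b}^2 + \bar{\xi}_{I,b}^2}.
\]
```

With a different norm the constant changes accordingly; for example, with the sum of the two norms one would obtain $L_{g_I}(\xi_{I,b}+\bar{\xi}_{I,b})$, which is more conservative.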
Filtering is primarily used to dampen the measurement noise and allow the derivation of tighter detection thresholds. Filtering can also prove beneficial for dampening the mismatch function $\Delta g_I(t)$, which results from the measurement noise, and can therefore further enhance fault detectability. Among the various filters $H(z)$ one can select, some may lead to less conservative detection thresholds. Potentially tighter thresholds are obtained by writing the total uncertainty term $\chi_I(t)$ given by (18) as
\[
\chi_I(t) = H(z)\big[\eta_I(x_I,\bar{x}_I,u_I,t)\big] + \epsilon_{\Delta g_I}(t) - L_I\,\epsilon_{\xi_I}(t), \tag{25}
\]
where $\epsilon_{\Delta g_I}(t) \triangleq H(z)[\Delta g_I(t)]$, and by making the following assumption.
Assumption 5: In the absence of a sensor fault, the filtered function mismatch term $\epsilon_{\Delta g_I}(t)$ is bounded by a computable positive function $\bar{\epsilon}_{\Delta g_I}(t)$; i.e., for all $t\in\mathbb{N}$,
\[
\|\epsilon_{\Delta g_I}(t)\| \le \bar{\epsilon}_{\Delta g_I}(t). \tag{26}
\]
Assumption 5 is based on the fact that filtering dampens the effect of the measurement noise present in the function mismatch term $\Delta g_I(t)$. A suitable selection of $\bar{\epsilon}_{\Delta g_I}$ can be made through the use of simulations (e.g. Monte Carlo methods) by filtering the function mismatch term $\Delta g_I(t)$ using the known nominal function dynamics and the available noise characteristics (recall that the measurement noise is assumed to take values in a compact set, see Assumption 4).
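A minimal Python sketch of such a Monte Carlo procedure is given below. It is illustrative only: the scalar nominal function `g`, the first-order filter $H(z) = (1-a)/(1-az^{-1})$, the uniform noise model, the number of runs and the `safety` factor are all assumptions of this example, and the resulting envelope is an empirical estimate rather than a guaranteed bound.

```python
import numpy as np

def lowpass(u, a):
    """Apply H(z) = (1 - a) / (1 - a z^-1) along the last axis (zero initial state)."""
    out = np.zeros_like(u)
    state = np.zeros(u.shape[:-1])
    for t in range(u.shape[-1]):
        state = a * state + (1.0 - a) * u[..., t]
        out[..., t] = state
    return out

def mc_filtered_mismatch_bound(g, y0, noise_bound, a, runs=500, safety=1.2, seed=0):
    """Empirical envelope of |H(z)[g(y0) - g(y0 + xi)]| over random noise draws."""
    rng = np.random.default_rng(seed)
    xi = rng.uniform(-noise_bound, noise_bound, size=(runs, len(y0)))
    delta_g = g(y0) - g(y0 + xi)      # mismatch term per run, shape (runs, T)
    eps = lowpass(delta_g, a)         # filtered mismatch per run
    return safety * np.abs(eps).max(axis=0)

# Example: hypothetical nominal output trajectory and g = sin.
y0_example = 0.2 * np.sin(np.linspace(0.0, 6.0, 300))
bound_example = mc_filtered_mismatch_bound(np.sin, y0_example, 0.01, a=0.9)
```

Because the envelope is only as good as the sampled noise realizations, the `safety` factor and the number of runs would need to be chosen with the no-false-alarm requirement in mind.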
In this case, the detection threshold $\bar{r}_I^{(j)}(t)$ is still given by (9), but $\bar{\chi}_I(t)$ is now given by:
\[
\bar{\chi}_I(t) = \bar{H}(z)\big[\bar{\eta}_I\big(y_I(t),\bar{y}_I(t),u_I(t)\big)\big] + \bar{\epsilon}_{\Delta g_I}(t) + \|L_I\|\,\bar{\epsilon}_{\xi_I}(t). \tag{27}
\]
As a result, the detection threshold obtained using (27) can be less conservative than the one obtained using (10). More information regarding this can be found in Keliris et al. (2013a).
3.3 Selection of filter $\bar{H}(z)$
In this part, we give more details on the selection of a suitable filter $\bar{H}(z)$, which is required for the implementation of the detection threshold. As stated before, its impulse response must satisfy $|h(t)| \le \bar{h}(t)$ for all $t \ge 0$. In the case where the impulse response $h(t)$ is non-negative, the selection $\bar{H}(z) = H(z)$ is trivial. Sufficient conditions for a non-negative impulse response for a class of discrete-time transfer functions are given in Liu and Bauer (2008). In the following, we briefly illustrate two simple methods for choosing $\bar{H}(z)$, one considering $H(z)$ as a digital IIR filter and the other as an FIR filter.
First we consider the case where $H(z)$ is an IIR filter. As stated earlier, the impulse response $h(t)$ of a proper and asymptotically stable transfer function $H(z)$ converges to zero exponentially fast. Therefore, there exist $\kappa > 0$, $\lambda \in [0,1)$ such that for all $t \in \mathbb{N}$ the following inequality holds: $|h(t)| \le \kappa\lambda^{t}$. Since $|h(t)| \le \bar{h}(t)$ must hold, the impulse response $\bar{h}(t)$ can be selected as $\bar{h}(t) = \kappa\lambda^{t}$ and thus $\bar{H}(z) = \dfrac{\kappa}{1-\lambda z^{-1}}$.
Now, let us consider the case in which $H(z)$ is an FIR filter. Let $H(z)$ be a $p$-th order FIR filter given by $H(z) = \sum_{k=0}^{p} d_k z^{-k}$. Then $\bar{h}(t)$ can be selected as $\bar{h}(t) = |h(t)|$, which leads to the FIR filter $\bar{H}(z) = \sum_{k=0}^{p} |d_k| z^{-k}$.
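The two constructions can be sketched in Python as follows. The function names are hypothetical, and the IIR helper only checks the envelope $|h(t)| \le \kappa\lambda^{t}$ on the finitely many impulse-response samples it is given, for a user-supplied $\lambda$; it is therefore a numerical aid rather than a proof that the envelope holds for all $t$.

```python
import numpy as np

def fir_bounding_filter(d):
    """Coefficients of H_bar(z) = sum_k |d_k| z^-k for the FIR filter with taps d."""
    return np.abs(np.asarray(d, dtype=float))

def iir_envelope(h, lam):
    """Smallest kappa such that |h(t)| <= kappa * lam**t on the given samples of h."""
    h = np.asarray(h, dtype=float)
    t = np.arange(len(h))
    return float(np.max(np.abs(h) / lam ** t))

# Example: H(z) = 1 / (1 + 0.5 z^-1) has the sign-alternating response h(t) = (-0.5)^t,
# so H_bar(z) = kappa / (1 - 0.5 z^-1) with the kappa computed below.
h_example = np.array([(-0.5) ** k for k in range(30)])
kappa_example = iir_envelope(h_example, 0.5)
```

For an impulse response that changes sign, $\bar{H}(z)$ differs from $H(z)$ and some conservativeness is introduced; a filter with non-negative impulse response avoids this.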
Generally, in fault detection schemes it is difficult to strike a good balance when selecting the threshold. If the threshold is too high (conservative threshold), then some faults may go undetected. If the threshold is too low, then false alarms may result. The following Remark discusses the sources of possible conservativeness in the proposed scheme.
Remark 2: The scheme presented in this paper guarantees no false alarms and, at the same time, utilizes filtering to dampen the measurement noise and hence obtain tighter detection thresholds for enhanced fault detectability. Specifically, the designed fault detection scheme guarantees that, in the absence of a fault, the residual $r_I(t)$ given by (6) is uniformly bounded by the detection threshold $\bar{r}_I(t)$ given by (9), which requires the bound $\bar{\chi}_I(t)$. The main sources of conservativeness in the designed threshold are included in the overall bound $\bar{\chi}_I(t)$, which can be broken up into the following four components: a) the filter $\bar{H}(z)$, b) the bound on the modeling uncertainty $\eta_I(x_I,\bar{x}_I,u_I)$, c) the bound on the filtered noise $\epsilon_{\xi_I}(t)$ and d) the bound on the function mismatch term $\Delta g_I(t)$. First, the filter $\bar{H}(z)$, which is required to satisfy $|h(t)| \le \bar{h}(t)$ for all $t \ge 0$, may impose some conservativeness, but this can be avoided by selecting the filter $H(z)$ used for filtering the measurements to have a non-negative impulse response, so that the same filter can be used for the threshold derivation, i.e. $\bar{H}(z) = H(z)$. In this case, no conservativeness is added to the threshold. The second source of conservativeness stems from the bound on $\eta_I(x_I,\bar{x}_I,u_I)$, which, according to Assumption 3, is bounded by a known function $\bar{\eta}_I(y_I,\bar{y}_I,u_I)$. This is required in order to distinguish the effects of modeling uncertainty from those of faults, so that no false alarms are introduced. In practice, the system can be modeled more accurately in certain regions of the state space and therefore the fact that the bound $\bar{\eta}_I$ is a function of $y_I$, $\bar{y}_I$ and $u_I$ provides more flexibility by allowing the designer to take into consideration any prior knowledge of the system. Regarding the conservativeness imposed by the use of the bound on the filtered noise $\epsilon_{\xi_I}(t)$, note that it is the bound $\bar{\epsilon}_{\xi_I}(t)$ that is multiplied by $\|L_I\|$ in (10), and hence the conservativeness is significantly reduced in comparison to the case in which no filtering is used (in that case, the bound $\bar{\xi}_I(t)$ on the unfiltered noise, i.e. $\|\xi_I(t)\| \le \bar{\xi}_I(t)$, would be multiplied by $\|L_I\|$, leading to more conservative thresholds). Finally, the last source of conservativeness in the threshold is introduced by the bound on the mismatch function $\Delta g_I$ given by (19), for which the bound $\overline{\Delta g}_I$ given by (11) is required. As discussed earlier, one way to derive this bound is through the use of the Lipschitz assumption. The use of filtering can help eliminate (some of) the conservativeness imposed by the bound $\overline{\Delta g}_I$, by exploiting the noise suppression properties through Assumption 5 and by using the bound $\bar{\chi}_I(t)$ given by (27).
In Section 4, fault detectability conditions for the aforementioned fault detection scheme are addressed. The conditions given in Section 4 refer to the case of a fault occurring in subsystem $\Sigma_I$ and being detected by its respective local fault detection agent $\mathcal{F}_I$. In Section 5, fault propagation from one subsystem to another is investigated by examining the way the fault effects appear in and affect neighboring interconnected subsystems.
4. Local fault detectability analysis
The fault detectability analysis constitutes a theoretical result that characterizes quantitatively (and implicitly) the class of faults detectable by the proposed scheme. In order to derive the fault detectability conditions, we consider separately the occurrence of a process fault $\phi_I$ at an unknown time $t = T_0^x$ and the occurrence of a sensor fault $\theta_I$ at an unknown time $t = T_0^y$.
Theorem 1 (Local Process Fault Detectability): Consider the nonlinear interconnected system (1), (2) with the distributed fault detection scheme described in (4), (5), (6), (9) and (10). A process fault in the $I$-th subsystem occurring at $t = T_0^x$ is detectable by the respective local fault detection agent $\mathcal{F}_I$ if the filtered process fault function $\phi_{I,f}\big(x(t),u_I(t),t\big) \triangleq H(z)\big[\beta_I^x(t-T_0^x)\,\phi_I\big(x(t),u_I(t)\big)\big]$ satisfies the following inequality at some time $t > T_0^x$, for some $j = 1,2,\ldots,p_I$:
\[
\Big|\sum_{k=T_0^x}^{t-1} C_I^{(j)} A_{I,0}^{t-1-k}\,\phi_{I,f}\big(x(k),u_I(k),k\big)\Big| > 2\,\bar{r}_I^{(j)}(t). \tag{28}
\]
Proof. In the presence of a process fault that occurs in the $I$-th subsystem at $t = T_0^x$, (20) becomes:
\[
\tilde{x}_{I,f}(t) = A_{I,0}^{t}\,\tilde{x}_{I,f}(0) + \sum_{k=0}^{t-1} A_{I,0}^{t-1-k}\Big(\chi_I(k) + \phi_{I,f}\big(x(k),u_I(k),k\big)\Big),
\]
and by using $\tilde{x}_{I,f}(0) = 0$, the residual $r_I(t) = C_I\tilde{x}_{I,f}(t) + \epsilon_{\xi_I}(t)$ becomes (similarly to (21))
\[
r_I(t) = \sum_{k=0}^{t-1} C_I A_{I,0}^{t-1-k}\Big(\chi_I(k) + \phi_{I,f}\big(x(k),u_I(k),k\big)\Big) + \epsilon_{\xi_I}(t).
\]
By using the triangle inequality, the $j$-th element of $r_I(t)$ for $t > T_0^x$ satisfies:
\[
|r_I^{(j)}(t)| \ge -\Big|\sum_{k=0}^{t-1} C_I^{(j)} A_{I,0}^{t-1-k}\chi_I(k)\Big| - |\epsilon_{\xi_I}^{(j)}(t)| + \Big|\sum_{k=T_0^x}^{t-1} C_I^{(j)} A_{I,0}^{t-1-k}\,\phi_{I,f}\big(x(k),u_I(k),k\big)\Big|. \tag{29}
\]
Following a similar procedure as in the derivation of the detection threshold (9), (29) becomes
\[
|r_I^{(j)}(t)| \ge -\bar{r}_I^{(j)}(t) + \Big|\sum_{k=T_0^x}^{t-1} C_I^{(j)} A_{I,0}^{t-1-k}\,\phi_{I,f}\big(x(k),u_I(k),k\big)\Big|.
\]
For fault detection, the inequality $|r_I^{(j)}(t)| > \bar{r}_I^{(j)}(t)$ must hold for some $j = 1,2,\ldots,p_I$, so the final fault detectability condition given in (28) is obtained.
Theorem 2 (Local Sensor Fault Detectability): Consider the nonlinear interconnected system (1), (2) with the distributed fault detection scheme described in (4), (5), (6), (9) and (10). A sensor fault in the $I$-th subsystem occurring at $t = T_0^y$ is detectable by the respective local fault detection agent $\mathcal{F}_I$ if the filtered sensor fault function
\[
\theta_{I,f}(t) \triangleq H(z)\big[\beta_I^y(t-T_0^y)\,\theta_I(t)\big],
\]
and the following mismatch function $\Delta g_I'$ due to the sensor fault
\[
\Delta g_I'(t) \triangleq g_I\big(y_I^0(t),\bar{y}_I^0(t),u_I(t)\big) - g_I\big(y_I^0(t)+\xi_I(t)+\beta_I^y(t-T_0^y)\,\theta_I(t),\ \bar{y}_I^0(t)+\bar{\xi}_I(t),\ u_I(t)\big),
\]
satisfy the following inequality at some time $t > T_0^y$, for some $j = 1,2,\ldots,p_I$:
\[
\begin{aligned}
\Big|\theta_{I,f}^{(j)}(t) + \sum_{k=0}^{t-1} C_I^{(j)} A_{I,0}^{t-1-k}\Big(H(z)[\Delta g_I'(k)] - L_I\,\theta_{I,f}(k)\Big)\Big| > {}& \bar{r}_I^{(j)}(t) + \bar{\epsilon}_{\xi_I}(t) \\
&+ \sum_{k=0}^{t-1} \alpha_{I,j}\,\delta_{I,j}^{t-1-k}\Big(\big\|H(z)[\eta_I(x_I,\bar{x}_I,u_I,k)]\big\| + \|L_I\|\,\bar{\epsilon}_{\xi_I}(k)\Big).
\end{aligned} \tag{30}
\]
Proof. See Appendix A.
Theorems 1 and 2 provide sufficient conditions for the implicit characterization of certain classes of faults that can be detected by the proposed fault detection scheme. Clearly, the fault functions $\phi_I(x,u_I)$ and $\theta_I$ are typically unknown and therefore these conditions cannot be checked a priori.
Remark 3: The use of filtering is of crucial importance in order to derive tight detection thresholds that guarantee no false alarms (see Keliris et al. (2013a)). As can be seen in the detectability conditions given by (28) and (30), the detection of the fault depends on the filtered process fault function $\phi_{I,f}$ and the filtered sensor fault $\theta_{I,f}$, and as a result the selection of the filter is very important. In particular, some filter selections may lead to less conservative thresholds than others.
5. Fault propagation
In this section, fault propagation from one subsystem to another is investigated and further intuition regarding the isolation properties of the proposed fault diagnosis scheme is obtained. The notion of fault propagation does not mean the creation of additional faults in neighboring subsystems as a result of faulty behavior in one of them. Instead, it refers to the way the fault effects in one subsystem appear in and affect its neighboring subsystems. More specifically, we consider a fault that occurs in subsystem $\Sigma_J$, which affects $\Sigma_I$, and investigate the possibility of fault detection, not by the local fault detection agent $\mathcal{F}_J$ (which may obviously detect the fault), but by the agent $\mathcal{F}_I$.
The following lemma summarizes the main findings of this analysis.
Lemma 2: Consider a distributed system made of $N$ subsystems $\Sigma_I$ given by (1), (2). The distributed fault detection scheme described by the estimation model (4), (5), the residual signals $r_I(t)$ given by (6) and the detection thresholds $\bar{r}_I(t)$ given by (9) guarantees that:
(a) a process fault occurring in subsystem $\Sigma_J$ which affects $\Sigma_I$ can only be detected by its corresponding fault detection agent $\mathcal{F}_J$ and not by the detection agent $\mathcal{F}_I$.
(b) a sensor fault occurring in $\Sigma_J$ which affects $\Sigma_I$ can be detected by either the corresponding detection agent $\mathcal{F}_J$ or the detection agent $\mathcal{F}_I$.
Proof. The case of a fault (process or sensor) occurring in subsystem $\Sigma_J$ and being detectable by its corresponding detection agent $\mathcal{F}_J$ has been investigated in Section 3 and the respective detectability conditions were given in Section 4. In the sequel, we investigate the possibility of detection, by the detection agent $\mathcal{F}_I$, of a fault that occurs in $\Sigma_J$ which affects $\Sigma_I$.
(a) First, let us consider fault propagation in the case of a process fault. In this case, the process fault effects in $\Sigma_J$ are propagated to $\Sigma_I$ through the interconnection variables $\bar{C}_I\bar{x}_I$ (see (1)) and to $\mathcal{F}_I$ through the measurements $\bar{y}_I$ (communicated by $\mathcal{F}_J$, see (4)). For easier visual indication of the process fault effects that are contained in the interconnection variables $\bar{x}_I$ and in the measurements of the output interconnection variables $\bar{y}_I$, we denote them as $\bar{x}_{I,P}$ and $\bar{y}_{I,P}$ respectively. Note that $\bar{y}_{I,P}(t) = \bar{C}_I\bar{x}_{I,P}(t) + \bar{\xi}_I(t)$. In the case of a process fault occurring in $\Sigma_J$ which affects $\Sigma_I$, the dynamics of $\Sigma_I$ are given by (1), (2) and the estimation model of the local fault detection agent $\mathcal{F}_I$ by (4), (5). In the aforementioned equations, $\bar{x}_I$ and $\bar{y}_I$ are now indicated by $\bar{x}_{I,P}$ and $\bar{y}_{I,P}$ respectively.
Note that both the dynamics of $\Sigma_I$ and the estimation model of the local fault detection agent $\mathcal{F}_I$ are affected by the process fault effects that occurred in $\Sigma_J$. Following the same analysis as in the proof of Lemma 1, the filtered estimation error still satisfies (17)-(19) (the fault effects enter implicitly through the interconnection variables and their measurements), which are rewritten below with explicit indication of the fault effect through $\bar{x}_{I,P}$ ($\bar{y}_{I,P}$ is expressed in terms of $\bar{x}_{I,P}$):
\[
\begin{aligned}
\tilde{x}_{I,f}(t+1) &= A_{I,0}\,\tilde{x}_{I,f}(t) + \chi_{I,P}(t), \\
\chi_{I,P}(t) &\triangleq H(z)\big[\eta_I\big(x_I(t),\bar{x}_{I,P}(t),u_I(t),t\big) + \Delta g_{I,P}(t)\big] - L_I\,\epsilon_{\xi_I}(t), \\
\Delta g_{I,P}(t) &\triangleq g_I\big(C_Ix_I(t),\bar{C}_I\bar{x}_{I,P}(t),u_I(t)\big) - g_I\big(C_Ix_I(t)+\xi_I(t),\bar{C}_I\bar{x}_{I,P}(t)+\bar{\xi}_I(t),u_I(t)\big).
\end{aligned}
\]
Now, the residual $r_I(t)$ is “contaminated” with the fault effects that occurred in subsystem $\Sigma_J$, and we need to investigate whether or not this residual is bounded by the detection threshold $\bar{r}_I(t)$ given by (9) that is used by the detection agent $\mathcal{F}_I$. Following the same mathematical calculations as in the derivation of (23), the residual $r_I^{(j)}$ satisfies
\[
|r_I^{(j)}(t)| \le \sum_{k=0}^{t-1} \alpha_{I,j}\,\delta_{I,j}^{t-1-k}\,\|\chi_{I,P}(k)\| + \bar{\epsilon}_{\xi_I}(t), \tag{31}
\]
where
\[
\begin{aligned}
\|\chi_{I,P}(t)\| &\le \bar{H}(z)\big[\big\|\eta_I\big(x_I(t),\bar{x}_{I,P}(t),u_I(t),t\big)\big\| + \|\Delta g_{I,P}(t)\|\big] + \|L_I\|\,\bar{\epsilon}_{\xi_I}(t) \\
&\le \bar{H}(z)\big[\bar{\eta}_I\big(y_I(t),\bar{y}_{I,P}(t),u_I(t)\big) + \overline{\Delta g}_I\big] + \|L_I\|\,\bar{\epsilon}_{\xi_I}(t).
\end{aligned} \tag{32}
\]
The last inequality is derived by using the bounds $\|\eta_I(x_I,\bar{x}_{I,P},u_I,t)\| \le \bar{\eta}_I\big(y_I(t),\bar{y}_{I,P}(t),u_I(t)\big)$ (see Assumption 3) and $\|\Delta g_{I,P}(t)\| \le \overline{\Delta g}_I$ (see Assumption 1 and (11)). Note that Assumption 3 is stated for sensor fault-free operation because the bound on the modeling uncertainty $\bar{\eta}_I$ makes use of the measurements. In the event of a process fault which changes the state variables, Assumption 3 is still valid since the measurements are essentially these altered state variables (or a linear combination of them, contaminated with the process fault effects) but with some uncertainty due to the measurement noise. In addition, note that the right-hand side of (32) is actually the term $\bar{\chi}_I(t)$ given by (10) which is used by the local fault detection agent $\mathcal{F}_I$, and hence $\|\chi_{I,P}(t)\| \le \bar{\chi}_I(t)$. Therefore, from (31), it can be seen that the residual still satisfies $|r_I^{(j)}(t)| \le \bar{r}_I^{(j)}(t)$ for all $j = 1,\ldots,p_I$ and hence the fault is not detected by $\mathcal{F}_I$. In other words, a process fault that occurs in a subsystem can only be detected by its respective fault detection agent.
(b) In this part, we consider fault propagation in the case of a sensor fault that occurs in $\Sigma_J$ and affects $\Sigma_I$. As in the case of a process fault, the local fault detection agent $\mathcal{F}_J$ may detect the fault, but here we investigate the possibility of fault detection by the detection agent $\mathcal{F}_I$. In this case the sensor faults in $\Sigma_J$ are propagated to $\mathcal{F}_I$ through the measurements $\bar{y}_I$ (communicated by $\mathcal{F}_J$, see (4)). For easier visual indication of the measurements $\bar{y}_I$ that contain the sensor faults, we denote them as $\bar{y}_{I,S}$. Therefore, in the case of a sensor fault occurring in $\Sigma_J$ which affects $\Sigma_I$, the dynamics of $\Sigma_I$ remain unaffected by the sensor fault and are given by (1), (2), whereas the estimation model of the local fault detection agent $\mathcal{F}_I$ is affected by the sensor fault and is given by (4), (5), where $\bar{y}_I$ is now indicated by $\bar{y}_{I,S}$ (only (4) is affected). Note that $\bar{y}_{I,S}(t) = \bar{y}_I^0(t) + \bar{\xi}_I(t) + \bar{\beta}_I^y(t-T_0^y)\,\bar{\theta}_I(t)$, where the term $\bar{\beta}_I^y(t-T_0^y)\,\bar{\theta}_I(t)$ indicates the sensor faults that occur in neighboring subsystems and affect $\Sigma_I$. Hence, the filtered estimation error still satisfies (17)-(19) (the fault effects enter implicitly through the measurements of the interconnection variables), which are rewritten below with explicit indication of the fault effect through $\bar{y}_{I,S}$:
\[
\begin{aligned}
\tilde{x}_{I,f}(t+1) &= A_{I,0}\,\tilde{x}_{I,f}(t) + \chi_{I,S}(t), \\
\chi_{I,S}(t) &\triangleq H(z)\big[\eta_I\big(x_I(t),\bar{x}_I(t),u_I(t),t\big) + \Delta g_{I,S}(t)\big] - L_I\,\epsilon_{\xi_I}(t), \\
\Delta g_{I,S}(t) &\triangleq g_I\big(y_I^0(t),\bar{y}_I^0(t),u_I(t)\big) - g_I\big(y_I^0(t)+\xi_I(t),\bar{y}_{I,S}(t),u_I(t)\big).
\end{aligned}
\]
Similarly to the derivation of (23), the residual $r_I^{(j)}$ satisfies
\[
|r_I^{(j)}(t)| \le \sum_{k=0}^{t-1} \alpha_{I,j}\,\delta_{I,j}^{t-1-k}\,\|\chi_{I,S}(k)\| + \bar{\epsilon}_{\xi_I}(t),
\]
where
\[
\|\chi_{I,S}(t)\| \le \bar{H}(z)\big[\big\|\eta_I\big(x_I(t),\bar{x}_I(t),u_I(t),t\big)\big\| + \|\Delta g_{I,S}(t)\|\big] + \|L_I\|\,\bar{\epsilon}_{\xi_I}(t).
\]
In this case, it cannot be guaranteed that $\|\eta_I(x_I,\bar{x}_I,u_I,t)\| \le \bar{\eta}_I\big(y_I(t),\bar{y}_{I,S}(t),u_I(t)\big)$ (since Assumption 3 might not hold due to the sensor fault) or that $\|\Delta g_{I,S}(t)\| \le \overline{\Delta g}_I$ holds, and as a result $\|\chi_{I,S}(t)\|$ may exceed $\bar{\chi}_I(t)$ given by (10), which is used by the local fault detection agent $\mathcal{F}_I$ (note that in (10) the first term is actually $\bar{\eta}_I\big(y_I(t),\bar{y}_{I,S}(t),u_I(t)\big)$ due to the use of the faulty measurements). Therefore, the residual may exceed its corresponding detection threshold, i.e. $|r_I^{(j)}(t)| > \bar{r}_I^{(j)}(t)$ for some $j = 1,\ldots,p_I$, which means that the sensor fault that occurred in $\Sigma_J$ can be detected by the local fault detection agent $\mathcal{F}_I$.
Remark 4: A qualitative explanation of Lemma 2 can be given as follows. In the case of a process fault that occurs in $\Sigma_J$, the fault affects its states, which in turn affect other subsystems through the interconnection variables. So, the states of $\Sigma_J$ are “contaminated” by the process fault and the measurements of (some of) these states also contain the process fault effects. Therefore, a subsystem $\Sigma_I$ that is affected by $\Sigma_J$ is affected by the process fault that occurred in $\Sigma_J$ through the interconnection variables $\bar{C}_I\bar{x}_I$, and the detection agent $\mathcal{F}_I$ makes use of the measurements $\bar{y}_I$, which are also “contaminated” by the same fault. Hence, the effect of the process fault that occurred in $\Sigma_J$ is “canceled out” in the detection agent $\mathcal{F}_I$, which is therefore unable to detect the fault. Consequently, a process fault occurring in subsystem $\Sigma_J$ is detectable only by its respective detection agent $\mathcal{F}_J$ and not by any other detection agent $\mathcal{F}_I$. On the other hand, assume that a fault is detected by the detection agent $\mathcal{F}_I$ and we know that it is a sensor fault. Then the faulty sensor may be among those providing the measurements $y_I$ of $\Sigma_I$ or among those providing the interconnection measurements $\bar{y}_I$ of the other subsystems. This is because a sensor fault occurring in one subsystem affects the estimation model of its respective detection agent $\mathcal{F}_I$ through $y_I$ and also the estimation models of other agents $\mathcal{F}_J$ through the communicated interconnection measurements $\bar{y}_J$, whereas the actual subsystems are influenced by the fault-free states (and not the faulty measurements).
Theorem 2 established a detectability condition for fault detection by the agent $\mathcal{F}_I$ when a sensor fault occurs in $\Sigma_I$. The following Theorem gives a detectability condition for fault detection by the agent $\mathcal{F}_I$ when a sensor fault occurs in $\Sigma_J$ which affects $\Sigma_I$.
Theorem 3 (Propagation Sensor Fault Detectability): Consider the nonlinear interconnected system (1), (2) with the distributed fault detection scheme described in (4), (5), (6), (9) and (10). A sensor fault that occurs at $t = T_0^y$ in $\Sigma_J$, which affects $\Sigma_I$, is detectable by the local fault detection agent $\mathcal{F}_I$ if the following mismatch function $\Delta g_I''$ due to the sensor fault
\[
\Delta g_I''(t) \triangleq g_I\big(y_I^0(t),\bar{y}_I^0(t),u_I(t)\big) - g_I\big(y_I^0(t)+\xi_I(t),\ \bar{y}_I^0(t)+\bar{\xi}_I(t)+\bar{\beta}_I^y(t-T_0^y)\,\bar{\theta}_I(t),\ u_I(t)\big)
\]
satisfies the following inequality at some time $t$, for some $j = 1,2,\ldots,p_I$:
\[
\begin{aligned}
\Big|\sum_{k=0}^{t-1} C_I^{(j)} A_{I,0}^{t-1-k} H(z)[\Delta g_I''(k)]\Big| > {}& \bar{r}_I^{(j)}(t) + \bar{\epsilon}_{\xi_I}(t) \\
&+ \sum_{k=0}^{t-1} \alpha_{I,j}\,\delta_{I,j}^{t-1-k}\Big(\big\|H(z)\big[\eta_I\big(x_I(k),\bar{x}_I(k),u_I(k),k\big)\big]\big\| + \|L_I\|\,\bar{\epsilon}_{\xi_I}(k)\Big).
\end{aligned} \tag{33}
\]
Proof. See Appendix B.
As shown in this section, when a process fault occurs in subsystem $\Sigma_I$, it can only be detected by its respective fault detection agent $\mathcal{F}_I$, whereas when a sensor fault occurs in subsystem $\Sigma_I$, it can be detected either by its respective detection agent $\mathcal{F}_I$ or by any other interconnected agent $\mathcal{F}_J$ (monitoring $\Sigma_J$, which is affected by $\Sigma_I$). This discriminating factor is exploited in the following section for devising a high-level fault isolation scheme.
6. High-level fault isolation
In this section, we exploit the findings of the previous analysis for the derivation of a high-level fault isolation scheme (see Figure 1). The purpose of this high-level isolation scheme is to infer some conclusions regarding the type and/or location of the fault that has occurred in the overall interconnected system according to the decisions of the detection agents, although this does not necessarily mean that exact fault identification/isolation can be achieved. The high-level isolation scheme can nevertheless provide valuable information that a more advanced fault isolation scheme can use to improve its performance by excluding potential fault scenarios.
In the analysis so far, we have considered the cases of process and sensor faults separately in order to gain some intuition about how these faults affect the local and neighboring detection agents. The main conclusion of this analysis was given in Lemma 2, which also constitutes the basis of the subsequent high-level isolation scheme. Although the proposed fault detection scheme may handle multiple faults, which can be both process and sensor faults, for the sake of the proposed high-level isolation scheme it is assumed that only one fault can occur among all subsystems, which may either be a process fault or a single sensor fault. Ideally, we would like to identify the type of fault that has occurred, that is, whether it is a process or a sensor fault, and furthermore, in the case of a sensor fault, to identify the faulty sensor.
In order to identify the sensors let us consider the following:
Definition 1: Let $\mathcal{S}\{y_I\}$ be the set of $p_I$ sensors that measure $y_I \in \mathbb{R}^{p_I}$, $\mathcal{S}\{y_I\} \cap \mathcal{S}\{y_J\}$ be the set of common sensors among $y_I$ and $y_J$, and $\mathcal{S}\{y_I\} \cup \mathcal{S}\{y_J\}$ be the union of the sets of sensors that measure $y_I$ and $y_J$.
In addition, let $\mathcal{M}$ be the set of indices of the local fault detection agents that have detected a fault, i.e. $\mathcal{M} \triangleq \{I \in \{1,\ldots,N\} : \mathcal{F}_I \text{ detects fault}\}$, let $\mathcal{M}_i$ indicate the $i$-th index of the set $\mathcal{M}$ and let $m \in \{0,1,\ldots,N\}$ be the cardinality of the set $\mathcal{M}$, i.e. $m \triangleq \mathrm{card}(\mathcal{M})$.
Then the following high-level isolation facts can be deduced:
If $m = 0$, then all subsystems are considered as potentially non-faulty (the possibility of a fault not yet detected cannot be excluded).
If $m = 1$, then the fault may be one of the following:
a) a process fault that has occurred in $\Sigma_{\mathcal{M}_1}$, OR
b) a single sensor fault in $\mathcal{S}\{y_{\mathcal{M}_1}\}$, OR
c) a single sensor fault in $\mathcal{S}\{\bar{y}_{\mathcal{M}_1}\}$.
If $m \ge 2$, then the possibility of a process fault can be excluded and hence it is guaranteed that a sensor fault has occurred. The faulty sensor can be isolated within the set given by
\[
\bigcap_{i=1}^{m}\Big(\mathcal{S}\{y_{\mathcal{M}_i}\} \cup \mathcal{S}\{\bar{y}_{\mathcal{M}_i}\}\Big). \tag{34}
\]
If we consider that multiple faults can occur in the subsystems (i.e. multiple process and/or multiple sensor faults), then the isolation logic is modified as follows:
If $m = 0$, then all subsystems are considered as potentially non-faulty.
If $m \ge 1$, then the fault(s) may be:
a) a single/multiple process fault(s) that occurred in $\Sigma_{\mathcal{M}_i}$, $i \in \{1,\ldots,m\}$, AND/OR
b) a single/multiple sensor fault(s) within the set given by
\[
\bigcup_{i=1}^{m}\Big(\mathcal{S}\{y_{\mathcal{M}_i}\} \cup \mathcal{S}\{\bar{y}_{\mathcal{M}_i}\}\Big). \tag{35}
\]
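The isolation logic above can be sketched as a small decision function. The following Python example is a minimal illustration, not the authors' implementation; the function name `isolate`, the dictionary-based sensor sets and the returned keys are hypothetical choices made for this sketch.

```python
def isolate(detecting, own_sensors, neighbor_sensors, single_fault=True):
    """High-level isolation from the set M of agents that raised an alarm.

    detecting:        iterable of agent indices I whose residual crossed its threshold
    own_sensors[I]:   set of sensors measuring y_I
    neighbor_sensors[I]: set of sensors measuring the interconnection outputs ybar_I
    """
    M = sorted(set(detecting))
    if not M:  # m = 0: no conclusion, all subsystems potentially non-faulty
        return {"process_fault_in": set(), "sensor_candidates": set()}
    groups = [set(own_sensors[I]) | set(neighbor_sensors[I]) for I in M]
    if single_fault:
        sensors = set.intersection(*groups)          # set (34); for m = 1 this is cases b), c)
        process = set(M) if len(M) == 1 else set()   # process fault excluded when m >= 2
    else:
        sensors = set.union(*groups)                 # set (35)
        process = set(M)
    return {"process_fault_in": process, "sensor_candidates": sensors}

# Example: a hypothetical chain Sigma_1 -> Sigma_2 -> Sigma_3 with one sensor each.
own = {1: {"s1"}, 2: {"s2"}, 3: {"s3"}}
neighbors = {1: set(), 2: {"s1"}, 3: {"s2"}}
result = isolate({1, 2}, own, neighbors)
```

In this example both agents 1 and 2 raise an alarm, so under the single-fault assumption a process fault is excluded and the only sensor common to both candidate sets is `s1`.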
The proposed high-level fault isolation scheme in the case of joint process and sensor faults, assuming that only one fault can occur, results in more constrained fault possibilities in comparison to the multiple fault case, in the sense that if two or more fault detection agents detect a fault then the occurrence of a process fault is excluded, and hence the fault is guaranteed to be a sensor fault contained in the set of sensors described by (34). If only one fault detection agent detects a fault, then the fault type cannot be determined, apart from the fact that the fault can be either a process or a sensor fault in the agent's respective subsystem, or a sensor fault in neighboring subsystems that affect the subsystem in which the fault has been detected. In the case we consider multiple faults, the isolation results include more possibilities about fault occurrences but still provide useful information about the fault type and the set in which the faulty sensors are contained.
These results prove to be more valuable in the case we consider that only one type of fault can occur (process or sensor), as is assumed in the majority of the research in the literature. By considering that only process faults can occur, that is, by assuming that all sensors are healthy, the detection of a fault by any fault detection agent guarantees that the fault has occurred in the respective subsystem that the particular agent is monitoring. Hence, partial fault isolation is achieved in the sense that the faulty subsystem is identified. Even if two or more agents detect a fault, this means that a process fault has occurred in each of the respective subsystems. In the case we consider that only sensor faults can occur (no process faults), the faulty sensor(s) can be isolated within the set given by (34) in the case of a single sensor fault or by (35) in the case of multiple sensor faults. This additional information can be used by a more sophisticated fault isolation scheme in order to enhance its performance by excluding potential fault scenarios. Moreover, the proposed distributed fault detection approach has significant benefits in comparison to a centralized approach since it encompasses important fault isolation characteristics.
In the proposed scheme, each detection agent provides a binary decision regarding the detection of a fault in the subsystem it monitors and, according to the decisions of all the agents, the high-level isolation scheme provides some information regarding the type and location of the fault that occurred. Moreover, some hypotheses can be stated regarding the persistence of the fault, i.e. whether the fault is permanent, intermittent or temporary. For instance, successive threshold crossings after fault detection can be interpreted as “the fault(s) is still present”, but the scheme cannot determine precisely the status of the fault. Hypotheses can nevertheless be formed according to the rate of successive threshold crossings and the frequency with which this behavior is observed. For instance, permanent faults, given that they are sufficiently large, will most probably cause the residuals to exceed their thresholds almost all the time after the initial detection, or at least to demonstrate rapid threshold crossings. On the other hand, temporary faults can be identified if, some time after the initial fault detection, the residuals fall and stay below their corresponding thresholds, indicating potentially healthy operation. Finally, intermittent faults will exhibit a mixture of the behavior of permanent and temporary faults, with these two phases repeating successively. Specifically, during a period of time the residual will exceed its threshold as in the case of a permanent fault, and this will be followed by a period of time where the residuals stay below their thresholds. In general, the issue of distinguishing between permanent, temporary and intermittent faults requires further investigation and is out of the scope of the present paper.
7. Simulation Results
In this section, we consider a numerical example based on a system of two inverted pendulums connected by a spring. The discrete-time models of the two subsystems $I = 1,2$ are obtained from a modified version of the continuous-time model in Spooner and Passino (1999) by using a forward Euler discretization with a time step $T_s = 0.0001$ s and are given by
\[
\begin{aligned}
x_I^{(1)}(t+1) &= x_I^{(1)}(t) + T_s\,x_I^{(2)}(t), \\
x_I^{(2)}(t+1) &= x_I^{(2)}(t) + T_s\big(f_I^{(2)}(t) + w_I^{(2)}(t) + \eta_I^{(2)}(t)\big), \\
y_I(t) &= x_I^{(1)}(t) + \xi_I(t),
\end{aligned}
\]
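where for the first subsystem the nominal and interconnection functions are given by:
\[
\begin{aligned}
f_1^{(2)}(t) &= \Big(\frac{m_1 g r}{J_1} - \frac{k r^2}{4J_1}\Big)\sin\big(x_1^{(1)}(t)\big) + \frac{k r}{2J_1}(l-b) + \frac{u_1}{J_1}, \\
w_1^{(2)}(t) &= \frac{k r^2}{4J_1}\sin\big(x_2^{(1)}(t)\big),
\end{aligned}
\]
and for the second subsystem the respective functions are
\[
\begin{aligned}
f_2^{(2)}(t) &= \Big(\frac{m_2 g r}{J_2} - \frac{k r^2}{4J_2}\Big)\sin\big(x_2^{(1)}(t)\big) + \frac{k r}{2J_2}(l-b) + \frac{u_2}{J_2}, \\
w_2^{(2)}(t) &= \frac{k r^2}{4J_2}\sin\big(x_1^{(1)}(t)\big).
\end{aligned}
\]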
The modification of this model with respect to Spooner and Passino (1999) concerns the availability of the state variables for measurement and the presence of modeling uncertainty and measurement noise. Specifically, in Spooner and Passino (1999) full state measurement is considered, whereas in this example it is considered that only $x_I^{(1)}(t)$, $I = 1,2$, can be measured (with some uncertainty).
The parameters used in the simulation are $m_1 = 2$ kg, $m_2 = 2.5$ kg, $J_1 = 0.5$ kg·m², $J_2 = 0.625$ kg·m², $k = 30$ N/m, $l = 0.5$ m, $b = 0.4$ m and $g = 9.81$ m/s². The modeling uncertainties of the subsystems are assumed to be $\eta_I^{(2)}(x_I^{(1)}, t) = 0.05\sin(10t) + 0.05\sin(x_I^{(1)}(t))$, $I = 1, 2$, in which the term $0.05\sin(10t)$ corresponds to the uncertainty associated with time variations or general inaccuracies, whereas the term $0.05\sin(x_I^{(1)}(t))$ corresponds to the uncertainty of the nominal function due to errors in some model parameters (e.g. the mass). Note that the modeling uncertainty is a function of both the time $t$ and the state $x_I^{(1)}$. The bound on the modeling uncertainty is given by $\bar{\eta}_I^{(2)}(y_I^{(1)}, t) = 0.1 + 0.05|\sin(y_I^{(1)}(t))|$, $I = 1, 2$, which satisfies Assumption 3. The inputs $u_I$ are generated by a simple decentralized proportional feedback controller that stabilizes each subsystem and are given by $u_I = 20e_I$, $I = 1, 2$, where $e_I = y_I^{(1)}$ is the tracking error.

In this example, we consider two cases, one for a process fault and one for a sensor fault. In the case of a process fault, we consider an abrupt multiplicative actuator fault in subsystem 1 where the input changes to $u_1 = (1 + \beta_1)\bar{u}_1$, where $\bar{u}_1$ is the nominal control input in the non-faulty case and $\beta_1 \in [-1, 0]$ is the parameter characterizing the magnitude of the fault. The actuator fault in this case can be considered as a process fault affecting the dynamics of the system. The fault occurs at $T_0^x = 5$ sec with a magnitude $\beta_1 = -0.1$. In the case of a sensor fault, we consider that the sensor in the first subsystem measuring $y_1$ measures the signal's amplitude with a 20% deviation, and the sensor fault occurs at $T_0^y = 5$ sec. The measurement noise $\xi_I$ is implemented as a uniform random number in the range $[-0.01, 0.01]$.

Figure 3: Residual signal and fault detection threshold for measurements $y_1$, $y_2$ in the case of a process fault occurring in $\Sigma_1$. Panels: (a) fault detection agent monitoring measurement $y_1$; (b) fault detection agent monitoring measurement $y_2$. Axes: time $t$ (sec) versus residual and threshold.

Figure 4: Measurements $y_1$, $y_2$ and their corresponding estimates $\hat{y}_1$, $\hat{y}_2$ in the case of a process fault occurring in $\Sigma_1$. Panels: (a) measurement $y_1$ and estimate $\hat{y}_1$; (b) measurement $y_2$ and estimate $\hat{y}_2$. Axes: time $t$ (sec) versus measurement and estimate.
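The uncertainty model, its bound and the two fault scenarios described above can be sketched in a few lines of Python. The function names are hypothetical. Two points are assumptions rather than statements from the text: $t$ in $0.05\sin(10t)$ is treated as time in seconds, and the 20% sensor deviation is modeled as a positive additive fault $\theta(t) = 0.2\,x_1^{(1)}(t)$. The loop at the end is a simple sampling check that the stated bound dominates the uncertainty when the measurement noise lies in $[-0.01, 0.01]$.

```python
import math
import random

T_FAULT = 5.0  # fault occurrence time in seconds, as stated in the text


def eta(x1, t):
    """Modeling uncertainty; t is taken to be time in seconds (assumption)."""
    return 0.05 * math.sin(10.0 * t) + 0.05 * math.sin(x1)


def eta_bar(y1):
    """Bound on the modeling uncertainty, evaluated at the noisy measurement."""
    return 0.1 + 0.05 * abs(math.sin(y1))


def faulty_input(u_nominal, t, beta=-0.1):
    """Abrupt multiplicative actuator fault: u = (1 + beta) * u_nominal."""
    return (1.0 + beta) * u_nominal if t >= T_FAULT else u_nominal


def sensor_fault(x1, t, deviation=0.2):
    """Additive sensor fault equal to a fraction of the true signal.

    ASSUMPTION: a positive deviation; the text does not give its sign.
    """
    return deviation * x1 if t >= T_FAULT else 0.0


# Sampling check that |eta| <= eta_bar when y = x + xi with |xi| <= 0.01.
rng = random.Random(0)
bound_holds = True
for _ in range(20000):
    x1 = rng.uniform(-math.pi, math.pi)
    t = rng.uniform(0.0, 15.0)
    y1 = x1 + rng.uniform(-0.01, 0.01)
    if abs(eta(x1, t)) > eta_bar(y1):
        bound_holds = False

print("bound holds on all samples:", bound_holds)
```

The check succeeds for any sample because $|\eta| \le 0.05 + 0.05|\sin x|$ and $|\sin x| \le |\sin y| + 0.01$, which leaves a margin of roughly 0.05 below the bound.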
The proposed fault detection scheme is implemented using an FIR filter for $H(z)$. Specifically, the
filter $H(z)$ is designed as a 10th-order FIR lowpass filter with normalized cutoff frequency 0.2,
utilizing a Hamming window (using the fir1 command in Matlab). The transfer function of
$H(z)$ is given by $H(z) = \sum_{k=0}^{10} d_k z^{-k}$ and, as explained in Section 3.3, the filter $\bar{H}(z)$ is given by
$\bar{H}(z) = \sum_{k=0}^{10} |d_k| z^{-k}$. Using the aforementioned filter $H(z)$, the bounds on the filtered noise are
found through simulation as $\bar{\xi}_I = 4 \times 10^{-6}$ and $\bar{g}_I = 1.2 \times 10^{-4}$ for $I = 1, 2$.
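The filter construction can be illustrated with a small standard-library Python sketch of the window method. It is meant to mirror the kind of design that fir1 performs (windowed sinc, Hamming window, unit DC gain), but it is not claimed to reproduce the authors' coefficients; the function names are hypothetical. The sketch also forms the taps $|d_k|$ of $\bar{H}(z)$ and checks the elementary bound $|H(z)[\xi]| \le \bar{\xi}\sum_k |d_k|$ for noise with $|\xi| \le \bar{\xi} = 0.01$.

```python
import math
import random


def hamming_lowpass(order, cutoff):
    """Windowed-sinc lowpass taps; cutoff is normalized so that 1.0 is Nyquist."""
    taps = []
    for n in range(order + 1):
        m = n - order / 2.0
        if m == 0:
            ideal = cutoff
        else:
            ideal = math.sin(math.pi * cutoff * m) / (math.pi * m)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / order)
        taps.append(ideal * window)
    dc_gain = sum(taps)
    return [c / dc_gain for c in taps]  # scale to unit gain at DC


def fir_filter(taps, signal):
    """y(t) = sum_k taps[k] * signal[t - k], with zero initial conditions."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, c in enumerate(taps):
            if t - k >= 0:
                acc += c * signal[t - k]
        out.append(acc)
    return out


d = hamming_lowpass(10, 0.2)      # taps d_k of H(z)
d_bar = [abs(c) for c in d]       # taps |d_k| of H_bar(z)

rng = random.Random(1)
xi = [rng.uniform(-0.01, 0.01) for _ in range(2000)]
filtered = fir_filter(d, xi)
worst_case = 0.01 * sum(d_bar)    # H_bar(z) applied to the constant noise bound

print("largest filtered noise sample:", max(abs(v) for v in filtered))
print("worst-case bound:", worst_case)
```

The worst-case bound is conservative by construction, which is one reason a bound measured in simulation, as done in the text, can be considerably tighter.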
According to the proposed fault detection scheme, two fault detection agents are designed, one
for each subsystem. In each detection agent, the estimation model is given by (4), (5), the residual
is generated according to (6), and the detection threshold is generated according to (9) with
(27) to fully exploit the filtering benefits. For the threshold implementation, the constants
$\alpha_{I,1} = 1$, $\delta_{I,1} = 0.7$, $I = 1, 2$, are used so that (12) is satisfied. The simulation results for the fault
detection agents that monitor the first subsystem (measured variable $y_1$) and the second subsystem
(measured variable $y_2$) are shown in Figure 3 for the process fault and in Figure 5 for the
sensor fault.
First, consider the case of the process fault. The measurements and their corresponding estimates for the two subsystems are shown in Figure 4, where it can be seen that the estimation model tracks the measurement, although no significant discrepancy due to the process fault is visible. The discrepancy is of course present, and it causes the residual signal in Figure 3a to exceed its threshold. Specifically, the simulation results for the detection agent that monitors the measurement $y_1$ (1st subsystem) are shown in Figure 3a, where the residual clearly exceeds its detection threshold after the fault occurs, and the fault is detected at around $t = 5.09$ sec. Note that there are no false alarms prior to the fault occurrence in either case. The corresponding results for the detection agent that monitors $y_2$ (2nd subsystem) are shown in Figure 3b, where the residual is always below its threshold and therefore no fault is detected. Moreover, the residual of the second agent in Figure 3b exhibits the same behavior before and after the occurrence of the fault, indicating that the process fault in the first subsystem does not seem to impact the residual of the second agent.

Figure 5: Residual signal and fault detection threshold for measurements $y_1$, $y_2$ in the case of a sensor fault occurring in $\Sigma_1$. Panels: (a) fault detection agent monitoring measurement $y_1$; (b) fault detection agent monitoring measurement $y_2$. Axes: time $t$ (sec) versus residual and threshold.

Figure 6: Measurements $y_1$, $y_2$ and their corresponding estimates $\hat{y}_1$, $\hat{y}_2$ in the case of a sensor fault occurring in $\Sigma_1$. Panels: (a) measurement $y_1$ and estimate $\hat{y}_1$; (b) measurement $y_2$ and estimate $\hat{y}_2$. Axes: time $t$ (sec) versus measurement and estimate.
Now let us consider the case of the sensor fault. The measurements and their corresponding
estimates for the two subsystems are shown in Figure 6, and the results for the fault detection
agents that monitor the measurements $y_1$ and $y_2$ are shown in Figures 5a and 5b, respectively.
In this case, the sensor fault is detected very quickly, at around $t = 5.01$ sec, by the detection
agent that monitors $y_1$. Most importantly, the sensor fault is also detected by the detection
agent that monitors $y_2$, at around $t = 5.39$ sec. As in the previous case, no false alarms occur,
since the residuals are always bounded by their thresholds prior to the sensor fault occurrence.
Note that the residual of the second agent in Figure 5b changes after the occurrence of the fault,
indicating that the sensor fault in the first subsystem significantly affects the residual of the
second agent.
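The detection times quoted above correspond to the first time instant at which the magnitude of a residual exceeds its threshold. A minimal sketch of that decision rule is given below; the function name and the sample values are made up for illustration and are not the simulation data.

```python
def first_detection_index(residual, threshold):
    """Return the first index t with |residual[t]| > threshold[t], or None."""
    for t, (r, r_bar) in enumerate(zip(residual, threshold)):
        if abs(r) > r_bar:
            return t
    return None


# Made-up sample values, used only to exercise the rule.
residual = [0.1, -0.2, 0.15, -0.9, 1.2]
threshold = [0.5, 0.5, 0.5, 0.5, 0.5]

print(first_detection_index(residual, threshold))
print(first_detection_index(residual[:3], threshold[:3]))
```

Multiplying the returned index by the sampling time $T_s$ gives the detection time in seconds.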
The simulation results are in line with the findings of the analysis conducted in Section 4 and more
specifically with Lemma 2, according to which a process fault occurring in a subsystem can only
be detected by its respective fault detection agent, whereas a sensor fault occurring in a subsystem
can be detected by its respective fault detection agent and also by neighboring interconnected
detection agents. According to the high-level fault isolation scheme described in Section 6, in the
case of the process fault no conclusive decision can be reached, since only one detection agent
detects a fault and therefore the fault can be either a process fault in $\Sigma_1$ (which is actually
the case) or a fault in the sensors measuring $y_1$ or $y_2$. In the case of the sensor fault, because
both fault detection agents detect a fault, the case of a process fault is excluded and hence it is
guaranteed that a sensor fault has occurred and the faulty sensor is either $y_1$ (which is actually
the case) or $y_2$. Further actions can then be taken to identify precisely the type of the fault that
has occurred.
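For the two-subsystem example, the reasoning above can be written as a small lookup. This is only a sketch of the logic as described in this paragraph for two mutually interconnected subsystems; it is not the general isolation scheme of Section 6, and the function name and the tuple labels are hypothetical.

```python
def candidate_faults(alarms):
    """Map the set of alarming agents (subset of {1, 2}) to fault candidates."""
    alarms = frozenset(alarms)
    if not alarms <= {1, 2}:
        raise ValueError("agent indices must be 1 or 2")
    if not alarms:
        return []  # no alarm: no fault has been detected
    sensor_candidates = [("sensor", 1), ("sensor", 2)]
    if alarms == {1, 2}:
        # Both agents alarm: a process fault is excluded.
        return sensor_candidates
    (i,) = alarms
    # One agent alarms: a process fault in its own subsystem or a sensor fault.
    return [("process", i)] + sensor_candidates


print(candidate_faults({1}))
print(candidate_faults({1, 2}))
```

The first call corresponds to the process-fault scenario (inconclusive, three candidates) and the second to the sensor-fault scenario (process fault excluded).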
8. Conclusion
In this paper, a distributed fault diagnosis approach for the detection of process and sensor faults in
a class of interconnected input-output discrete-time, nonlinear systems with modeling uncertainties
and measurement noise is presented. By utilizing a filtering approach which is incorporated in the
fault detection framework to mitigate the measurement noise effects, robust adaptive thresholds are
designed that guarantee no false alarms. Furthermore, the propagation of a fault that occurs in one
subsystem and affects neighboring subsystems is investigated, leading to some key properties of fault
propagation among subsystems. More specifically, the fault detection scheme is designed in such a
way that a process fault occurring in a subsystem can only be detected by its corresponding detection
agent, whereas a sensor fault occurring in a subsystem can also be detected by the detection agents
of the neighboring subsystems it affects. This discriminating element is exploited to deduce
further information regarding the type of fault that has occurred and constitutes the basis of the
derived high-level isolation scheme. Furthermore, detectability conditions have been derived that
characterize quantitatively the class of process and sensor faults that can be detected by the proposed
scheme. Future research efforts will be devoted to the development of a comprehensive fault isolation
methodology and the integration of filtering with learning techniques in order to derive tighter
detection thresholds by dampening the measurement noise and by learning the modeling uncertainty.
Appendix A. Proof of Theorem 2
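In the presence of a sensor fault that occurs in the $I$-th subsystem at $t = T_0^y$, (20) becomes
$$
\tilde{x}_{I,f}(t) = A_{I,0}^{t}\,\tilde{x}_{I,f}(0) + \sum_{k=0}^{t-1} A_{I,0}^{t-1-k}\,\chi'_I(k), \tag{A1}
$$
where
$$
\chi'_I(t) \triangleq H(z)\big[\eta_I(x_I(t), \bar{x}_I(t), u_I(t), t) + \Delta g'_I(t)\big] - L_I\xi_I(t) - L_I\theta_{I,f}(t).
$$
Let's define $v_{I,f}(t) \triangleq H(z)[\Delta g'_I(t)] - L_I\theta_{I,f}(t)$. After the occurrence of a sensor fault, the residual
(6) becomes $r_I(t) = C_I\tilde{x}_{I,f}(t) + \xi_I(t) + \theta_{I,f}(t)$ and by using (A1) with $\tilde{x}_{I,f}(0) = 0$, the residual is
written as:
$$
\begin{aligned}
r_I(t) &= \sum_{k=0}^{t-1} C_I A_{I,0}^{t-1-k}\,\chi'_I(k) + \xi_I(t) + \theta_{I,f}(t) \\
&= \sum_{k=0}^{t-1} C_I A_{I,0}^{t-1-k}\Big(H(z)[\eta_I(x_I(k), \bar{x}_I(k), u_I(k), k)] - L_I\xi_I(k)\Big) \\
&\quad + \sum_{k=0}^{t-1} C_I A_{I,0}^{t-1-k}\,v_{I,f}(k) + \xi_I(t) + \theta_{I,f}(t).
\end{aligned}
$$
By using the triangle inequality, the $j$-th element of $r_I(t)$ for $t > T_0^y$ satisfies:
$$
\begin{aligned}
|r_I^{(j)}(t)| &\ge \left|\theta_{I,f}^{(j)}(t) + \sum_{k=0}^{t-1} C_I^{(j)} A_{I,0}^{t-1-k}\,v_{I,f}(k)\right| - |\xi_I^{(j)}(t)| \\
&\quad - \left|\sum_{k=0}^{t-1} C_I^{(j)} A_{I,0}^{t-1-k}\Big(H(z)[\eta_I(x_I(k), \bar{x}_I(k), u_I(k), k)] - L_I\xi_I(k)\Big)\right| \\
&\ge \left|\theta_{I,f}^{(j)}(t) + \sum_{k=0}^{t-1} C_I^{(j)} A_{I,0}^{t-1-k}\,v_{I,f}(k)\right| - \bar{\xi}_I(t) \\
&\quad - \sum_{k=0}^{t-1} \alpha_{I,j}\,\delta_{I,j}^{t-1-k}\Big(\big\|H(z)[\eta_I(x_I(k), \bar{x}_I(k), u_I(k), k)]\big\| + \|L_I\|\,\bar{\xi}_I(k)\Big).
\end{aligned}
$$
For fault detection, the inequality $|r_I^{(j)}(t)| > \bar{r}_I^{(j)}(t)$ must hold for some $j = 1, 2, \ldots, p_I$, so the final
fault detectability condition given in (30) is obtained.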
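The exponentially weighted sums of the form $\sum_{k=0}^{t-1} \alpha\,\delta^{t-1-k}\,b(k)$ that appear in the bound above do not need to be recomputed from scratch at every time step: they satisfy the recursion $s(t+1) = \delta\,s(t) + \alpha\,b(t)$ with $s(0) = 0$. The sketch below checks this identity numerically with the constants $\alpha = 1$, $\delta = 0.7$ quoted in the simulation section; the sequence `b` is a made-up placeholder for the bracketed bound term and the function names are hypothetical.

```python
def weighted_sum_direct(b, alpha, delta, t):
    """Evaluate sum_{k=0}^{t-1} alpha * delta**(t-1-k) * b[k] term by term."""
    return sum(alpha * delta ** (t - 1 - k) * b[k] for k in range(t))


def weighted_sum_recursive(b, alpha, delta):
    """Return [s(0), ..., s(len(b))] using s(t+1) = delta*s(t) + alpha*b(t)."""
    s = [0.0]
    for value in b:
        s.append(delta * s[-1] + alpha * value)
    return s


ALPHA, DELTA = 1.0, 0.7
b = [1.0] * 50  # placeholder sequence, constant for simplicity

s = weighted_sum_recursive(b, ALPHA, DELTA)
max_gap = max(
    abs(s[t] - weighted_sum_direct(b, ALPHA, DELTA, t)) for t in range(len(s))
)
print("largest difference between the two evaluations:", max_gap)
print("value after 50 steps:", s[-1])
```

For a constant input the sum converges to $\alpha/(1-\delta)$, which is a quick sanity check on an implementation.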
Appendix B. Proof of Theorem 3
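In the presence of a sensor fault that occurs at $t = T_0^y$ in $\Sigma_J$ which affects $\Sigma_I$, the state estimation
error (20) becomes
$$
\tilde{x}_{I,f}(t) = A_{I,0}^{t}\,\tilde{x}_{I,f}(0) + \sum_{k=0}^{t-1} A_{I,0}^{t-1-k}\,\chi''_I(k), \tag{B1}
$$
where
$$
\chi''_I(t) \triangleq H(z)\big[\eta_I(x_I(t), \bar{x}_I(t), u_I(t), t) + \Delta g''_I(t)\big] - L_I\xi_I(t).
$$
After the occurrence of a sensor fault in $\Sigma_J$, the residual (6) becomes $r_I(t) = C_I\tilde{x}_{I,f}(t) + \xi_I(t)$ and
by using (B1) with $\tilde{x}_{I,f}(0) = 0$ due to the filters' initial condition, the residual is written as:
$$
\begin{aligned}
r_I(t) &= \sum_{k=0}^{t-1} C_I A_{I,0}^{t-1-k}\,\chi''_I(k) + \xi_I(t) \\
&= \sum_{k=0}^{t-1} C_I A_{I,0}^{t-1-k}\,H(z)[\Delta g''_I(k)] + \xi_I(t) \\
&\quad + \sum_{k=0}^{t-1} C_I A_{I,0}^{t-1-k}\Big(H(z)[\eta_I(x_I(k), \bar{x}_I(k), u_I(k), k)] - L_I\xi_I(k)\Big).
\end{aligned}
$$
By using the triangle inequality, the $j$-th element of $r_I(t)$ for $t > T_0^y$ satisfies:
$$
\begin{aligned}
|r_I^{(j)}(t)| &\ge \left|\sum_{k=0}^{t-1} C_I^{(j)} A_{I,0}^{t-1-k}\,H(z)[\Delta g''_I(k)]\right| - |\xi_I^{(j)}(t)| \\
&\quad - \left|\sum_{k=0}^{t-1} C_I^{(j)} A_{I,0}^{t-1-k}\Big(H(z)[\eta_I(x_I(k), \bar{x}_I(k), u_I(k), k)] - L_I\xi_I(k)\Big)\right| \\
&\ge \left|\sum_{k=0}^{t-1} C_I^{(j)} A_{I,0}^{t-1-k}\,H(z)[\Delta g''_I(k)]\right| - \bar{\xi}_I(t) \\
&\quad - \sum_{k=0}^{t-1} \alpha_{I,j}\,\delta_{I,j}^{t-1-k}\Big(\big\|H(z)[\eta_I(x_I(k), \bar{x}_I(k), u_I(k), k)]\big\| + \|L_I\|\,\bar{\xi}_I(k)\Big).
\end{aligned}
$$
For fault detection, the inequality $|r_I^{(j)}(t)| > \bar{r}_I^{(j)}(t)$ must hold for some $j = 1, 2, \ldots, p_I$, so the final
fault detectability condition given in (33) is obtained.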
Acknowledgements
This work was supported by funding from the European Research Council under the ERC Advanced
Grant (FAULT-ADAPTIVE).
References
Basseville, M., & Nikiforov, I. (1993). Detection of abrupt changes: theory and application. Prentice-Hall.
Blanke, M., Kinnaert, M., Lunze, J., & Staroswiecki, M. (2010). Diagnosis and Fault-Tolerant Control (2nd ed.). Springer Verlag.
Boem, F., Ferrari, R. M., & Parisini, T. (2011). Distributed Fault Detection and Isolation of Continuous-Time Nonlinear Systems. European Journal of Control, 5-6, 603–620.
Chen, J., & Patton, R. J. (1999). Robust Model-Based Fault Diagnosis for Dynamic Systems. Kluwer Academic Publishers, Norwell, MA, USA.
De Persis, C., & Isidori, A. (2002). On the design of fault detection filters with game-theoretic-optimal sensitivity. International Journal of Robust and Nonlinear Control, 12(8), 729–747.
Dunia, R., & Joe Qin, S. (1998). Joint diagnosis of process and sensor faults using principal component analysis. Control Engineering Practice, 6(4), 457–469.
Ferdowsi, H., Raja, D., & Jagannathan, S. (2012). A decentralized fault prognosis scheme for nonlinear interconnected discrete-time systems. In American Control Conference (pp. 5900–5905).
Ferrari, R. M., Parisini, T., & Polycarpou, M. M. (2008). A robust fault detection and isolation scheme for a class of uncertain input-output discrete-time nonlinear systems. In American Control Conference (pp. 2804–2809).
Ferrari, R. M., Parisini, T., & Polycarpou, M. M. (2012). Distributed fault detection and isolation of large-scale nonlinear systems: an adaptive approximation approach. IEEE Transactions on Automatic Control, 57(2), 275–290.
Frank, P. (1990). Fault diagnosis in dynamic systems using analytical and knowledge-based redundancy: A survey and some new results. Automatica, 26(3), 459–474.
Hammouri, H., Kinnaert, M., & El Yaagoubi, E. (1999). Observer-based approach to fault detection and isolation for nonlinear systems. IEEE Transactions on Automatic Control, 44(10), 1879–1884.
Keliris, C., Polycarpou, M. M., & Parisini, T. (2013a). A Distributed Fault Detection Filtering Approach for a Class of Interconnected Continuous-Time Nonlinear Systems. IEEE Transactions on Automatic Control, 58(8), 2032–2047.
Keliris, C., Polycarpou, M. M., & Parisini, T. (2013b). A Distributed Fault Detection Filtering Approach for a Class of Interconnected Input-Output Nonlinear Systems. In European Control Conference (pp. 422–427). Zurich.
Kinnaert, M., & Peng, Y. (1995). Residual generator for sensor and actuator fault detection and isolation: a frequency domain approach. International Journal of Control, 61(6), 1423–1435.
Klinkhieo, S., & Patton, R. J. (2009). A Two-Level Approach to Fault-Tolerant Control of Distributed Systems Based on the Sliding Mode. In 7th IFAC Symposium on Fault Detection, Supervision and Safety of Technical Processes, Barcelona, Spain (pp. 1043–1048).
Léchevin, N., & Rabbath, C. (2009). Decentralized Detection of a Class of Non-Abrupt Faults With Application to Formations of Unmanned Airships. IEEE Transactions on Control Systems Technology, 17(2), 484–493.
Liu, Y., & Bauer, P. (2008). Sufficient conditions for non-negative impulse response of arbitrary-order systems. In IEEE Asia Pacific Conference on Circuits and Systems (pp. 1410–1413).
Massoumnia, M.-A., & Vander Velde, W. E. (1988). Generating parity relations for detecting and identifying control system component failures. Journal of Guidance, Control, and Dynamics, 11(1), 65.
Patton, R., & Frank, P. M. (1989). Fault diagnosis in dynamic systems: theory and applications. Prentice Hall.
Patton, R. J., Kambhampati, C., Casavola, A., Zhang, P., Ding, S. X., & Sauter, D. (2007). A generic strategy for fault-tolerance in control systems distributed over a network. European Journal of Control, 13(2-3), 280–296.
Rajamani, R., & Ganguli, A. (2004). Sensor fault diagnostics for a class of non-linear systems using linear matrix inequalities. International Journal of Control, 77(10), 920–930.
Reppa, V., Polycarpou, M., & Panayiotou, C. (2014). Adaptive Approximation for Multiple Sensor Fault Detection and Isolation of Nonlinear Uncertain Systems. IEEE Transactions on Neural Networks and Learning Systems, 25(1), 137–153.
Salahshoor, K., Mosallaei, M., & Bayat, M. (2008). Centralized and decentralized process and sensor fault monitoring using data fusion based on adaptive extended Kalman filter algorithm. Measurement, 41(10), 1059–1076.
Spooner, J., & Passino, K. (1999). Decentralized adaptive control of nonlinear systems using radial basis neural networks. IEEE Transactions on Automatic Control, 44(11), 2050–2057.
Stankovic, S., Ilic, N., Djurovic, Z., Stankovic, M., & Johansson, K. (2010). Consensus based overlapping decentralized fault detection and isolation. In Conference on Control and Fault-Tolerant Systems (SysTol'10) (pp. 570–575).
Talebi, H. A., Khorasani, K., & Tafazoli, S. (2009). A recurrent neural-network-based sensor and actuator fault detection and isolation for nonlinear systems with application to the satellite's attitude control subsystem. IEEE Transactions on Neural Networks, 20(1), 45–60.
Thumati, B., & Halligan, G. (2013). A Novel Fault Diagnostics and Prediction Scheme Using a Nonlinear Observer With Artificial Immune System as an Online Approximator. IEEE Transactions on Control Systems Technology, 21(3), 569–578.
Venkatasubramanian, V., Rengaswamy, R., Yin, K., & Kavuri, S. (2003, March). A review of process fault detection and diagnosis Part I: Quantitative model-based methods. Computers & Chemical Engineering, 27(3), 293–311.
Viswanadham, N., & Srichander, R. (1987). Fault detection using unknown-input observers. Control Theory and Advanced Technology, 3(2), 91–101.
Wei, L., Gui, W., Xie, Y., & Ding, S. X. (2009). Decentralized Fault Detection System Design for Large-Scale Interconnected Systems. In 7th IFAC Symposium on Fault Detection, Supervision and Safety of Technical Processes, Barcelona, Spain (pp. 816–821).
Yan, B., Tian, Z., & Shi, S. (2008). A novel distributed approach to robust fault detection and identification. International Journal of Electrical Power & Energy Systems, 30(5), 343–360.
Zhang, Q., Basseville, M., & Benveniste, A. (1998). Fault detection and isolation in nonlinear dynamic systems: A combined input-output and local approach. Automatica, 34(11), 1359–1373.
Zhang, Q., & Zhang, X. (2012). A distributed detection scheme for process faults and sensor faults in a class of interconnected nonlinear uncertain systems. In IEEE 51st Annual Conference on Decision and Control (pp. 586–591).
Zhang, X., Polycarpou, M. M., & Parisini, T. (2002). A robust detection and isolation scheme for abrupt and incipient faults in nonlinear systems. IEEE Transactions on Automatic Control, 47(4), 576–593.
Zhang, X., Polycarpou, M. M., & Parisini, T. (2008). Design and analysis of a fault isolation scheme for a class of uncertain nonlinear systems. Annual Reviews in Control, 32(1), 107–121.
Zhang, X., Polycarpou, M. M., & Parisini, T. (2009). Decentralized fault detection for a class of large-scale nonlinear uncertain systems. In 48th IEEE Conference on Decision and Control and 28th Chinese Control Conference (pp. 6988–6993).