FAULT DIAGNOSIS IN HVAC SYSTEMS BASED ON THE HEAT FLOW MODEL
Alexander Schiendorfer1, Gerhard Zimmermann2, Yan Lu1 and George Lo1
1Siemens Corporation, Corporate Research and Technology, Princeton, NJ
2Informatik, University of Kaiserslautern, Germany
ABSTRACT
Fault Detection and Diagnosis based on the Heat Flow
Model (HFM) provides a generic and extensible frame-
work for monitoring HVAC systems. It supports the find-
ing and fixing of faulty components. During the fault
detection phase, measured sensor and control values are
used to perform estimations based on the physical prop-
erties of the system. Discrepancies of estimated and mea-
sured values are collected as a detection failure vector. Di-
agnosis seeks to find the most probable cause for the ob-
served failures. In HVAC systems, the failures and faults
form an m-n relation. Our proposed diagnosis is per-
formed with an associative network to map the relations
among failures and faults using the inherent fault simula-
tion capabilities of the HFM nodes at runtime. The simi-
larity of the detection failure vector to the simulated fail-
ure vector indicates the probability of the corresponding
fault. To find the best method of fault diagnosis, this pa-
per examines different similarity metrics for HFM based
FDD, including Euclidean distances, Manhattan distance,
root of sum of products, Jaccard index, and a table based
metric. The effectiveness of the proposed diagnosis ap-
proaches is presented with a case study based on a refer-
ence implementation using Simulink and Java.
INTRODUCTION
Buildings consume a significant amount of energy and resources. In the United States, buildings use 72% of electricity, 54% of natural gas and 38.9% of the total energy consumption according to the U.S. Department of Energy (U.S. Department of Energy 2009). HVAC systems are designed to provide comfortable air quality, temperature and humidity for people working in the buildings. However, many HVAC systems cannot meet energy consumption expectations due to multiple faults that can occur throughout their lifecycle. Physical problems such as stuck dampers or leaking valves, and wrongly configured controllers, are examples of HVAC faults that need to be detected and diagnosed by commissioning. Some faulty HVAC systems can introduce 20% energy waste (Roth and Quartararo 2005), which motivates the development of automatic fault detection and diagnosis (FDD) systems.
Related Work
There are two main categories of detection systems (Wu and Sun 2010): statistical and model-based FDD. Statistical FDD systems try to detect abnormalities by comparing real-time data to fault-free data gained after recommissioning of a building or from a detailed building model. Morisot and Marchio (1999) pointed out that the quality of FDD highly depends on the input data when using artificial neural networks for detection of failures. The system needs to be provided with correct data for as many circumstances as possible, including different air temperatures and humidity levels. Real data covering important cases are not always available.
Model-based FDD systems, on the other hand, simulate the expected physical values of an HVAC system by building a model of it. They then calculate the outcome for given outside air temperature values and other parameters. A model-based approach to fault detection and diagnosis consists of a mathematical description of the system using equations of the thermodynamic processes that happen in the HVAC components (Salsbury and Diamond 2008).
Motivation
Most existing FDD systems are hardware or level de-
pendent and require configuration and calibration (Wu
and Sun 2010). The proposed Heat Flow Model is a
model based approach using a simplified physical model
designed for fast reconfiguration.
An HFM based simulation engine estimates the proper-
ties of mass flows based on physical models and compares
them with measured values from sensors. Using object
oriented programming, HVAC components are modeled
on an abstract level as classes which can be parameterized
for concrete components. Program libraries are thus de-
veloped and reconfigured at runtime. By simulating pos-
sible faults within the scope of one component’s data, the
programmed components do not require reference data.
In (Zimmermann, Lu, and Lo 2011) we presented re-
sults for different faults and their diagnosis rate for one
particular recognition strategy. In this paper we try to de-
termine more efficient metrics based on the performance
for all possible simulated faults.
SimBuild
2012
Fifth National Conference of IBPSA-USA
Madison, Wisconsin
August 1-3, 2012
440
HFM BASED FAULT DETECTION
The Heat-Flow-Model for FDD
The proposed method for fault diagnosis uses the Heat Flow Model (HFM), which is described in (Lu, Zimmermann, and Lo 2010). HFM is a component-level FDD approach based on first principles, i.e. the analysis of the underlying physical processes involved with mass air flows (Have 1997). Heat flow here refers to a generalization of all kinds of flows that are related to the energy or heat flow, such as temperature, humidity or air pressure. In essence, a heat flow model is a directed graph with its nodes representing the components of the HVAC system and arcs representing the mass flow connections among them, such as ducts or pipes. An existing BIM, such as an IFC model, eases the creation of the HFM graph. HFM nodes contain estimation functions that predict the changes to flow properties such as air temperature or air pressure in order to perform detection and diagnosis in steady state. A flow variable refers to a physical value such as temperature or humidity and is sent towards adjacent HFM components.
A node consists of three major elements:
• Ports in upstream and downstream mass flow directions to connect nodes
• Estimation and simulation formulas to calculate the changes to an incoming flow variable
• Rule definitions to define which flow variables are compared to detect failures
Figure 1 depicts the internal structures of a modeled flow node. It contains flow ports for both directions (upstream and downstream). The connected ports of two adjacent nodes thus refer to the same value in reality. Take for instance an HVAC system with a temperature sensor lying upstream of a heating coil followed by a cooling coil, which would result in an HFM as shown in Figure 2. Sensor nodes trigger the propagation and estimation of the flow variables. Hence, the particular sensor node sends its incoming temperature flow variable to the forwardIn port of the heating coil, which adds an estimated temperature rise based on its current control value and sends it to the cooling coil.
Logically in parallel, the cooling coil adds the estimated temperature drop to a value coming from a sensor lying downstream in a reverse propagation. Thus, the values being sent forward by the heating coil and the ones sent backward by the cooling coil can be compared.
Sensor and control data are sent to their HFM com-
ponents such that the estimation and propagation can be
done with the information available to a single node. Each
sensor node initiates a propagation of a flow variable in
one timestep.
The FDD engine works in three phases which will be
explained in the following sections:
1. Fault Detection: Sending sensor and control data
to the HFM components, estimating their changes,
propagating them to neighbors and storing a detec-
tion failure vector of comparisons.
2. Fault Simulation: Simulating faults to create sim-
ulated failure vectors that are compared with the de-
tected values.
3. Fault Diagnosis: Finding the closest vector among the simulated ones and "blaming" the corresponding fault for causing the observed data.
Figure 1: Example of an HFM node with ports fwdIn, fwdOut, revIn and revOut as well as rule definitions r1, r2 and r3 that combine the values of ports. Tol1 and tol2 stand for added sensor tolerances. (Diagram labels: FwdIn, FwdOut, RevIn, RevOut, XSens, XSetp, Rules r1 to r3, tol1, tol2.)
Figure 2: Exemplary simplified HFM graph of an HVAC system (nodes: Mixing Box (MB), Temperature Sensor (TS), Heating Coil (HC), Cooling Coil (CC), Supply Air Temperature Sensor (SAS), Space (SP))
Rule Evaluations, Intervals and Uncertainty
To account for the uncertainty associated with measured values and with estimations, intervals are used for flow variables. A generic interval comparison rule is used throughout the HFM graph, as shown in Equation 1. Concrete instantiations of this rule consist of the definition of the originating ports and an identifier. Each rule is evaluated to get a failure value. Typical examples are the pair of sensor setpoint interval and actual sensor input, or the values of adjacent nodes in the flow as mentioned in the example earlier. If there is any overlap between the two intervals, failure value 0 is reported. If, however, the intervals are disjoint in either direction, a nonzero failure value is stored for this rule.
For instance, the rule "SupplySensDuctTempSens-FwdFail" compares the interval of the forward outgoing temperature flow of the supply sensor duct with that of the reverse incoming temperature flow. Vectors of failure values are collected for detected and simulated faults; they will be referred to as the "detected" and "simulated" failure vectors. A more rigorous formal definition of the objects involved is given in Section HFM FAULT DIAGNOSIS.
Faults and Failures
Faults and detected failure values are generally in an
m:n relationship as illustrated in the form of an associative
network in Figure 3. One fault causes multiple failure
values (Baer 1996). In turn, those failure values can be
caused by multiple faults. For instance a too high supply
air temperature might indicate a malfunctioning cooling
or heating coil or a drifting sensor.
Figure 3: Faults are related to multiple failure values and vice versa (associative network linking failures r1 to r6 with faults f1 to f6)
A failure value fv is the result of an evaluation of Equation 1. Moreover, failure values have to be normalized using typical value ranges. Equation 1 reflects a semantic decision built into the FDD engine: a failure value is nonzero if and only if there is no overlap of the compared intervals. Thus a partial overlap is counted as no failure, due to the large intervals resulting from estimation and simulation.
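\[
fv(I_1, I_2) =
\begin{cases}
0 & \text{if } I_1, I_2 \text{ overlapping} \\
I_2.\min - I_1.\max & \text{if } I_2.\min > I_1.\max \\
I_2.\max - I_1.\min & \text{if } I_2.\max < I_1.\min
\end{cases}
\tag{1}
\]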
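The interval rule of Equation 1 can be sketched in a few lines of Python. This is only an illustration, not the reference implementation: the names `Interval`, `lo`, `hi` and `failure_value` are hypothetical, and touching intervals are treated as overlapping.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Closed interval of a flow variable (hypothetical helper type)."""
    lo: float
    hi: float


def failure_value(i1: Interval, i2: Interval) -> float:
    """Equation 1: 0 on any overlap, otherwise the signed gap between i1 and i2."""
    if i2.lo > i1.hi:        # i2 lies entirely above i1: positive failure value
        return i2.lo - i1.hi
    if i2.hi < i1.lo:        # i2 lies entirely below i1: negative failure value
        return i2.hi - i1.lo
    return 0.0               # overlapping (or touching) intervals: no failure


print(failure_value(Interval(10.3, 14.5), Interval(19.5, 20.5)))  # 5.0
print(failure_value(Interval(18.5, 19.5), Interval(19.5, 20.5)))  # 0.0
```

The sign of the result records the direction of the gap, which the sign-sensitive metrics discussed later rely on.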
Local Fault Simulation Capabilities of Nodes
After the collection of the detected failure vector the
flow is simulated with one fault inserted in one compo-
nent at a time (single fault assumption) in the fault simu-
lation phase. The modeled flow proceeds as in the detec-
tion phase except for sensor and other input values which
are replaced by simulated values. The FDD engine ac-
tivates every possible fault in any component to create a
simulated failure vector.
Assume that the changes of the air heat flow when passing a cooling coil can be approximated using the following equation.
\[ T_{out} = T_{in} - u_{cc} \cdot T_{avgDrop} \tag{2} \]
If the engine orders the cooling coil to simulate a cooling coil stuck at 0, this equation changes to:
\[ T_{out} = T_{in} - 0 \cdot T_{avgDrop} \tag{3} \]
Because of these component-local simulation capabilities, the FDD system needs no measured history data for fault diagnosis, in contrast to learning-based diagnosis methods such as artificial neural networks. This also enables reuse of components for different HVAC systems, since the components are kept independent of each other.
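A minimal sketch of such a component-local fault simulation, assuming a single average temperature drop per coil as in Equations 2 and 3. The class name `CoolingCoil`, the field `stuck_at` and the value 13.0 °C are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CoolingCoil:
    t_avg_drop: float                  # average temperature drop at full cooling, in °C
    stuck_at: Optional[float] = None   # simulated fault: control value frozen here

    def estimate_t_out(self, t_in: float, u_cc: float) -> float:
        """Equation 2, or Equation 3 when a stuck fault is being simulated."""
        u_effective = u_cc if self.stuck_at is None else self.stuck_at
        return t_in - u_effective * self.t_avg_drop


healthy = CoolingCoil(t_avg_drop=13.0)
stuck_closed = CoolingCoil(t_avg_drop=13.0, stuck_at=0.0)
print(healthy.estimate_t_out(23.0, 0.8))       # roughly 12.6
print(stuck_closed.estimate_t_out(23.0, 0.8))  # 23.0, the coil no longer cools
```

The fault lives entirely inside the component, so the same class can be reused in a different HFM graph without any recorded reference data.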
HFM FAULT DIAGNOSIS
Fault diagnosis aims to find the most probable fault or faults with patterns similar to the one observed. More concretely, the detected failure vector and the simulated failure vector for an inserted fault need to match for a diagnosis decision. The proposed elementary diagnosis algorithm assigns a score to every possible fault and ranks the faults accordingly. Users will be most concerned with the highest ranked fault, since the system recommends checking the involved component for faults.
Definitions
For n defined rules in the HFM graph, the detected failure vector D contains n components. For every fault i that can be simulated there exists a simulated failure vector S_i, also containing n components for the same rules.
\[ D = \langle d_1, d_2, \ldots, d_n \rangle \tag{4} \]
\[ S_i = \langle s_{i1}, s_{i2}, \ldots, s_{in} \rangle \tag{5} \]
A scoring metric sm is a function that returns a real number sc_i (score) for a detection failure vector D and a simulation failure vector S_i, denoting the similarity of these vectors. All metrics assign a rank, r_i, based on sc_i using the precedes-predicate in Equation 8. This predicate needs to be defined together with the metric used, since it arranges ranks in either ascending or descending order (e.g. with a distance based metric, a small score leads to a high rank, whereas in an explicit score assignment a high score leads to a high rank). The rank r_i indicates the likelihood of fault i having occurred given the pattern of detected values. The diagnosis algorithm returns a top-list, tl, containing all simulated faults ordered by the rank of their simulation vector. The relation i \prec_{tl} j in Equation 9 means that the simulated fault i is ranked higher than the simulated fault j.
\[ sm : \mathbb{R}^n \times \mathbb{R}^n \rightarrow \mathbb{R} \tag{6} \]
\[ sc_i = sm(D, S_i) \tag{7} \]
\[ r_i < r_j \Leftrightarrow \mathrm{precedes}(sm, sc_i, sc_j) \tag{8} \]
\[ \forall i, j \in \mathit{simFaults} : i \prec_{tl} j \Leftrightarrow r_i < r_j \tag{9} \]
Diagnosis algorithm
In the current version the algorithm used for diagnosis
works as shown abstractly in Figure 4.
Require: HFM has been run in detection and set detection failure vector D
Ensure: returns fault i with highest probability
procedure SIMULATE-FAULTS(HFM, D)
    for all t in timesteps do
        tl ← ∅                                  ▷ tl is the top-list of faults
        for all i such that i can be simulated do
            co ← FIND-COMPONENT-FOR(i)
            APPLY-FAULTY-STATE(co, i)
            SIMULATE-FLOW(HFM)
            S_i ← GET-RESULT(HFM)
            sc_i ← ASSIGN-SCORE(S_i, D)
            INSERT(tl, S_i, sc_i)
        end for
        return sorted(tl)                       ▷ order by score rank
    end for
end procedure
Figure 4: Diagnosis Algorithm based on simulation
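For a single time step, the loop of Figure 4 can be sketched as follows. This is an assumption-laden illustration: `diagnose`, `simulate_fault` and the fault names are hypothetical, and `simulate_fault` merely stands in for applying the faulty state, simulating the flow and reading back the simulated failure vector.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def diagnose(
    detected: Vector,
    faults: Sequence[str],
    simulate_fault: Callable[[str], Vector],
    metric: Callable[[Vector, Vector], float],
    descending: bool,
) -> List[Tuple[str, float]]:
    """Return the top-list for one time step as (fault, score) pairs, best first."""
    top_list = []
    for fault in faults:
        simulated = simulate_fault(fault)   # placeholder for the HFM fault simulation
        top_list.append((fault, metric(detected, simulated)))
    # 'descending' plays the role of the precedes-predicate of the metric
    return sorted(top_list, key=lambda entry: entry[1], reverse=descending)


# Hypothetical precomputed simulation results instead of a real HFM run
simulated_vectors = {
    "stuck outside air damper": (4.0, 4.0, 0.0),
    "supply air sensor drift": (-3.6, -3.6, 2.4),
}


def manhattan(d: Vector, s: Vector) -> float:
    return sum(abs(a - b) for a, b in zip(d, s))


ranking = diagnose((5.0, 5.0, 0.0), list(simulated_vectors),
                   simulated_vectors.__getitem__, manhattan, descending=False)
print(ranking[0][0])  # stuck outside air damper
```

Passing the sort direction alongside the metric keeps distance-based and score-based metrics interchangeable.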
Evaluation of similarity metrics for diagnosis
During the fault diagnosis phase, several simulated result vectors are calculated to serve as the basis for comparison. Finding a good similarity metric is a heuristic task, since plain vector distance does not perform precisely enough in terms of diagnosis matching.
Since partial overlap leads to 0 values, the gap between the failure values -0.2, 0 and 0.2 is more relevant than their numeric value indicates. Failure values 4.1 and 4.3, on the other hand, do not characterize a failure vector pattern as much. Consequently, smaller failure values tend to have emerged from estimation tolerances.
A good similarity metric fulfills the following require-
ments:
1. If the failure values are large and close, the resulting
score should be high.
2. A larger absolute failure value indicates a particular fault more significantly. Smaller failure values are more likely to have resulted from estimation. Therefore large values should influence the similarity more strongly than small ones.
3. The FDD engine should be able to diagnose the ac-
tual fault and list it high in the top list.
Different Scoring Metrics
In order to find a good metric for the proposed diag-
nosis algorithm, several approaches are discussed and im-
plemented:
1. Component-wise Multiplication (CWM): The score is calculated by multiplying d_j with s_ij, taking the square root of the absolute value of the product and multiplying it with the sign of the product. Here, the largest score is ranked highest (descending metric).
\[ sc_i = \sum_{j=1}^{n} \operatorname{sig}(d_j s_{ij}) \sqrt{|d_j s_{ij}|} \]
2. Euclidean Distance (ED): Since taking the square root on both sides does not affect the order relation between two vectors, only the sum of squared distances is used. This metric does not treat 0 values any differently, and the absolute failure value does not influence the rating either. Close small values yield a better score than farther large values even though the large values indicate better matching (e.g. 0.1 and 0.2 lead to a better result than 5.7 and 6.3). Here, the smallest score is ranked highest (ascending metric).
\[ sc_i = \sum_{j=1}^{n} (d_j - s_{ij})^2 \]
3. Weighted Euclidean Distance (WED): Using weights helps to overcome the problems of the plain Euclidean distance. This metric takes the detected failure value (d_j) as an indicator for congruence. The squared distance is divided by the absolute value of d_j, which leads to smaller values if the absolute failure values are high. Small values mean low distance and thus high ranking. Alternatively, concrete weights could be assigned to the rule outputs by experts or machine learning algorithms.
\[ sc_i = \sum_{j=1}^{n} \frac{(d_j - s_{ij})^2}{f(|d_j|)} \quad \text{where} \quad f(x) = \begin{cases} 1 & \text{if } x = 0 \\ x & \text{otherwise} \end{cases} \]
4. Explicit Scoring Matrix (ESM): A matrix assigns explicit score values to failure values based on their size and sign. Figure 5 shows the score of the failure values: small ones hardly change the score, whereas large values have a strong influence. If one of the two compared values is zero, the score for this comparison is zero. Two values with opposite signs reduce the overall score. A function toIndex scales failure values relative to the overall maximum or minimum failure value of the detection result to obtain a normalized value, which is the lookup index for the scoring matrix. The fault with the highest score is ranked best (descending metric).
\[ sc_i = \sum_{j=1}^{n} \mathrm{scoringMatrix}[\mathrm{toIndex}(d_j), \mathrm{toIndex}(s_{ij})] \]
5. Jaccard Index (JCI): This pattern recognition metric relies on the ratio of intersecting rules to all rule evaluations. Failure values are separated into the classes positive, neutral and negative. Then the number of rules classified equally in detection and simulation is counted to return the similarity coefficient.
\[ sc_i = \frac{|D \cap S_i|}{|D|} \quad \text{where} \quad D \cap S_i := \{ j \mid \operatorname{sig}(d_j) = \operatorname{sig}(s_{ij}) \} \]
Here sig(x) = -1 iff x < 0, sig(0) = 0 and sig(x) = 1 iff x > 0.
6. Manhattan Distance (MHD): This simple metric approximates distances. Like Manhattan's street structure, distances are given in blocks, hence the sum of the distances in every dimension is taken.
\[ sc_i = \sum_{j=1}^{n} |d_j - s_{ij}| \]
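The formula-based metrics above translate almost directly into code. The sketch below is an illustration only; the function names are hypothetical, and ESM is left out because the concrete scoring matrix and toIndex mapping are not given numerically here. The vectors in the usage lines are those of the small diagnosis example in this paper (D, S1 and S2).

```python
import math
from typing import Sequence

Vector = Sequence[float]


def sig(x: float) -> int:
    """Sign function: -1, 0 or 1."""
    return (x > 0) - (x < 0)


def cwm(d: Vector, s: Vector) -> float:
    """Component-wise multiplication (descending: larger is better)."""
    return sum(sig(dj * sj) * math.sqrt(abs(dj * sj)) for dj, sj in zip(d, s))


def ed(d: Vector, s: Vector) -> float:
    """Squared Euclidean distance (ascending: smaller is better)."""
    return sum((dj - sj) ** 2 for dj, sj in zip(d, s))


def wed(d: Vector, s: Vector) -> float:
    """Euclidean distance weighted by |d_j|, with weight 1 where d_j is 0."""
    return sum((dj - sj) ** 2 / (abs(dj) if dj != 0 else 1.0) for dj, sj in zip(d, s))


def jci(d: Vector, s: Vector) -> float:
    """Share of rules whose failure values have the same sign class."""
    return sum(sig(dj) == sig(sj) for dj, sj in zip(d, s)) / len(d)


def mhd(d: Vector, s: Vector) -> float:
    """Manhattan distance (ascending: smaller is better)."""
    return sum(abs(dj - sj) for dj, sj in zip(d, s))


D, S1, S2 = (5.0, 5.0, 0.0), (4.0, 4.0, 0.0), (-3.6, -3.6, 2.4)
for metric in (cwm, ed, wed, jci, mhd):
    print(metric.__name__, round(metric(D, S1), 2), round(metric(D, S2), 2))
```

Rounded to two decimals, these functions give 8.94 and -8.49 for CWM, 2.0 and 153.68 for ED, 0.4 and 35.34 for WED, 1.0 and 0.0 for JCI, and 2.0 and 19.6 for MHD.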
Figure 5: Explicit Scoring Matrix for two failure values A and B (axes: failure value A_i, failure value B_i, scoring value)
Small Diagnosis Example
Figure 2 shows a small example for demonstrating diagnosis. It is an HFM graph of a simplified HVAC system consisting of a mixing box responsible for mixing outdoor and return air, a heating coil for preheating outdoor air, a cooling coil to maintain the supply air temperature at a setpoint, and one space representing different zones with reheat VAVs. For simplicity, we do not consider mass air flow rates and other physical values, and we generally assume a sensor tolerance of 0.5 °C.
Detection Phase
In this context, sending a sensor value means that the corresponding HFM sensor node receives the value at its XSens port (tolerances are added by tol definitions). Propagation refers to estimating an interval and passing it to adjacent nodes using forward and reverse ports. Assume that the building information system provides the following data:
• T_oa: Outdoor air temperature sensor data that is sent to the mixing box
• T_sa: Supply air temperature sensor data
• T_ra: Return air temperature sensor data sent to the space
• u_hc, u_cc, u_mb: Heating coil, cooling coil, and mixing box control data
• SP_Tsa: Supply air setpoint temperature
Moreover, the following rules are defined for detection:
• CCForw-SASRev: Cooling Coil Forward - Supply Air Sensor Reverse
• MBForw-HCRev: Mixing Box Forward - Heating Coil Reverse
• SASSP-SASDIn: Supply Air Sensor Setpoint - Supply Air Sensor Data In
Hence, the length of the detection failure vector and of all simulated failure vectors will be 3. We will have a look at the effects of diagnosis for one time step.
First, the sensor and control data from Table 1 are sent
to the components. In cooling mode, no additional pre-
heating is needed. Thus uhc is set to 0.0, umb is set to 0.1
to reuse most of return air with a temperature based econ-
omizer controller (Taylor and Cheng 2010) and additional
cooling (ucc =0.8) is required. Due to occupancy, return
air rises to 25 °C. Readers might examine Table 1 to guess
which component fails in the scenario. A structured anal-
ysis based on HFM FDD will lead to possible faults.
Table 1: Sensor data for one time step
Data      Value
T_oa      30 °C
T_ra      22 °C
T_sa      20 °C
SP_Tsa    19 °C
u_hc      0.0
u_cc      0.8
u_mb      0.1
In the following steps the estimations needed for rule
evaluations are calculated. Sensor nodes invoke the prop-
agation by first adding tolerances and sending the result-
ing intervals in both directions. Transformation nodes
such as a cooling coil change the intervals during prop-
agation. For estimation, the modeler of the HFM graph
needed to add a minimal and maximal temperature drop
for the cooling coil, e.g. 11 °C and 15 °C, respectively.
We will focus on the ”path” the intervals take from the
sensor input to the port involved in a comparison and show
the process in detail for the rule CCForw-SASRev:
1. MB: Reads sensor value 30 °C with tolerances → [29.5 °C; 30.5 °C] → forward
2. MB: Mixes with return air ([21.5 °C; 22.5 °C]) at rate u_mb → [22.3 °C; 23.3 °C] (e.g. 22.5 · 0.9 + 0.1 · 30.5 = 23.3) → forward
3. HC: u_hc is set to 0, no changes in HC → forward to CC
4. CC: Using u_cc, the interval is calculated as max = 23.3 - 11.0 · 0.8 = 14.5, min = 22.3 - 15 · 0.8 = 10.3 → [10.3 °C; 14.5 °C] → forward
5. SAS: Reads sensor value 20 °C with tolerances → [19.5 °C; 20.5 °C] → reverse
Eventually, a failure value is calculated from both in-
tervals using Equation 1 to yield 5 as the failure value for
CCForw-SASRev.
Similarly, the propagation in reverse direction is started from SAS.
1. SAS: Reads sensor value 20 °C with tolerances → [19.5 °C; 20.5 °C] → reverse
2. CC: Using u_cc for transformation → [28.3 °C; 32.5 °C] → reverse
3. HC: u_hc is 0, no changes in HC → reverse to MB
4. MB: Comparing HC-rev [28.3 °C; 32.5 °C] with MB-forw [22.3 °C; 23.3 °C] → failure value of MBForw-HCRev is 5
SAS also calculates the rule evaluation SASSP-SASDIn for setpoint and sensor value. In this time step, comparing [18.5 °C; 19.5 °C] for the setpoint with [19.5 °C; 20.5 °C] for the sensor does not lead to a failure value for SASSP-SASDIn, since 19.5 °C lies in both intervals.
After the detection phase, the detection failure vector D = <5, 5, 0> is stored for comparison with simulated failure vectors. It indicates that the system is not in a fully working state.
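The detection walk-through above can be retraced numerically. The following sketch is a simplification under stated assumptions: intervals are plain (min, max) tuples, the helper names are hypothetical, and the sensor values, tolerance of 0.5 °C, control values and the 11 °C to 15 °C cooling coil drop are taken from the example.

```python
def with_tolerance(value, tol=0.5):
    """Turn a sensor reading into an interval."""
    return (value - tol, value + tol)


def mix(return_air, outdoor_air, u_mb):
    """Mixing box: linear blend of return and outdoor air intervals."""
    return tuple(r * (1 - u_mb) + o * u_mb for r, o in zip(return_air, outdoor_air))


def cool_forward(interval, u_cc, drop_min=11.0, drop_max=15.0):
    """Cooling coil estimate in flow direction (temperature drops)."""
    return (interval[0] - drop_max * u_cc, interval[1] - drop_min * u_cc)


def cool_reverse(interval, u_cc, drop_min=11.0, drop_max=15.0):
    """Cooling coil estimate against flow direction (drop is added back)."""
    return (interval[0] + drop_min * u_cc, interval[1] + drop_max * u_cc)


def failure_value(i1, i2):
    """Equation 1 on (min, max) tuples."""
    if i2[0] > i1[1]:
        return i2[0] - i1[1]
    if i2[1] < i1[0]:
        return i2[1] - i1[0]
    return 0.0


t_oa, t_ra, t_sa, sp_tsa = (with_tolerance(v) for v in (30.0, 22.0, 20.0, 19.0))
u_cc, u_mb = 0.8, 0.1                      # u_hc is 0, so the heating coil is a pass-through

mb_forward = mix(t_ra, t_oa, u_mb)         # about [22.3; 23.3]
cc_forward = cool_forward(mb_forward, u_cc)  # about [10.3; 14.5]
hc_reverse = cool_reverse(t_sa, u_cc)      # about [28.3; 32.5]

detected = [
    round(failure_value(cc_forward, t_sa), 1),   # CCForw-SASRev
    round(failure_value(mb_forward, hc_reverse), 1),  # MBForw-HCRev
    round(failure_value(sp_tsa, t_sa), 1),       # SASSP-SASDIn
]
print(detected)  # [5.0, 5.0, 0.0]
```

Rounding to one decimal only hides floating-point noise; the interval arithmetic itself follows the steps listed above.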
Simulation Phase
During the simulation phase, the engine calculates con-
trol and sensor values using feedback loops. Only the
outside air temperature and supply air setpoint tempera-
ture are taken from the measurements which requires spe-
cial treatment of outside air temperature sensor faults as
mentioned in (Zimmermann, Lu, and Lo 2011).
In this scenario, only two possible faults from the reference list of (Castro 2003) are studied. A low return air temperature compared to the supply air setpoint temperature, combined with a still high amount of cooling (0.8), indicates problems with the mixing box, so a simulated stuck outside air damper is considered.
The HFM graph in Figure 2 shows an air flow loop. In addition, feedback loops for the control variables u_mb, u_hc, and u_cc exist. Steady state values for all variables are achieved by iteratively executing the air flow loop and the control loops. Otherwise, a set of differential equations would have to be solved. Equation 11 shows the details of MB simulating a stuck valve, where T' stands for the simulated temperature used to calculate the expected simulated sensor temperatures. Equation 10 introduces u'_mb as an intermediate control value using vf ∈ {−1, 1} for the simulated valve fault.
\[ u'_{mb} = \operatorname{limit}(0, 1, u_{mb} + vf) \tag{10} \]
\[ T'_{out} = T'_{ra} (1 - u'_{mb}) + T'_{oa} u'_{mb} \tag{11} \]
Note that u_mb rather than u'_mb would be identified as the simulated control value. u'_mb is only used to simulate the effects a stuck valve has on the temperature. MB sends T'_out with the effects of a fully open outside air damper, yet still preserves the used u_mb, which would be minimized due to the controller strategy.
Equation 11 assumes a linear damper characteristic and
100 % damper authority. More complex damper models
can be introduced if necessary.
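Equations 10 and 11 can be sketched as a small function. The names `limit` and `simulated_mixing_temperature` are hypothetical, and the sketch assumes the linear damper characteristic stated above.

```python
def limit(lower: float, upper: float, value: float) -> float:
    """Clamp value to the range [lower, upper]."""
    return max(lower, min(upper, value))


def simulated_mixing_temperature(t_ra: float, t_oa: float, u_mb: float, vf: int) -> float:
    """Mixing box output under a simulated stuck fault, vf in {-1, 1}."""
    u_sim = limit(0.0, 1.0, u_mb + vf)           # Equation 10: forced fully closed or open
    return t_ra * (1.0 - u_sim) + t_oa * u_sim   # Equation 11: linear mixing


print(simulated_mixing_temperature(22.0, 30.0, 0.1, vf=1))   # 30.0, stuck fully open
print(simulated_mixing_temperature(22.0, 30.0, 0.1, vf=-1))  # 22.0, stuck fully closed
```

Because vf is either -1 or 1, the clamp always lands on 0 or 1, so the commanded u_mb itself stays untouched and can still be reported as the control value.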
Assume that the simulation process starting with 30 °C outside air temperature returned the control values u_cc = 0.9, u_mb = 0.15 and u_hc = 1 and a supply air temperature of [17.8; 18.8]. Performing the same propagation steps as during detection leads to a similar outcome: CCForw-SASRev: 4, MBForw-HCRev: 4 and SASSP-SASDIn: 0, so S_1 = <4, 4, 0>.
Another possible explanation is a sensor drift in the supply air temperature sensor, i.e. a measured value that is too low. In that simulation, SAS adds an offset (e.g. -5) to T'_out in its simulation propagation. This would return the simulation vector S_2 = <-3.6, -3.6, 2.4>.
To sum up, all failure value vectors (detection and simulation) are shown in Table 2.
Table 2: Failure vectors for detection and simulation in one time step (values in °C)
Vector    CCForw-SASRev    MBForw-HCRev    SASSP-SASDIn
D         5                5               0
S_1       4                4               0
S_2       -3.6             -3.6            2.4
Diagnosis Phase
In diagnosis, D is compared with S_1 and S_2 using the presented metrics, and the faults are ranked according to their scores. All metrics classified fault 1, the stuck outside air damper, correctly due to the already similar vectors. Results are shown in Table 3.
Table 3: Diagnosis results for the small example: scores and ranks of the simulated faults in different metrics
Metric    sc_1     sc_2      r_1    r_2
ESM       16.0     -13.0     1      2
JCI       1.0      0.0       1      2
CWM       8.94     -8.49     1      2
MHD       2.0      19.6      1      2
WED       0.4      35.34     1      2
ED        2.0      153.68    1      2
After diagnosis, the engine suggests checking the outside air damper first. In this small example, mechanical cooling compensates for the lost return air, as the supply air setpoint temperature can still be reached. The fault in the mixing box would not only lead to higher energy demands and costs, it would also remain unnoticed by people working inside the building. Diagnosis helps to point to the source of failures and to suggest a fast repair.
EXPERIMENT SETUP
The case study
For the experiment’s purposes the same case study used
in (Zimmermann, Lu, and Lo 2011) is considered. The
so called ”Small bank” example consists of three spaces
with individually controlled reheat VAVs and one cen-
tral AHU with economizer. It was used for first exper-
iments because a basic IFC building information model
existed. This was then compiled into an HFM graph and
augmented with additional required information such as
sensor tolerances. Features of the experimental HVAC
system of the Iowa Energy Center Energy Resource Sta-
tion (ERS) were modeled to make comparison to other
HVAC FDD research projects possible. To achieve valid
input data resulting from a faulty state of the HVAC sys-
tem a fault simulator has been built in Simulink. Faults
such as sensor drifts can interactively be inserted to pro-
duce reference output.
The setup
To test the quality of the FDD diagnosis engine, an experiment comparing the different scoring metrics was set up. 12 components were capable of simulating faults in the HFM model itself, including heating coil valve stuck and leaking errors or sensor drifts. The faults have been taken from the reference list of (Castro 2003). In total, 48 faults and a reference state without failures were simulated. For every fault that can be simulated in the Simulink fault simulator, the engine has been started for detection and diagnosis. Sets of sensor and control data are simulated in intervals of 6 minutes for 5 days, which results in 1200 time steps.
Temperatures range from -14 °C to 40 °C in total, with a daily sinusoidal variation of 14 °C between 4 am and 4 pm. There is regular occupancy between 7 am and 6 pm in all modeled spaces. Space temperature setpoint ranges are (20 °C, 22 °C) during occupancy and (17 °C, 25 °C) otherwise. Occupancy also determines the heat load in the spaces (G. Zimmermann and G. Lo 2011).
At each time step every simulated fault is ranked as described earlier, and overall the following quantities are aggregated and calculated. Note that f stands for the intended fault introduced through simulation, which should be diagnosed by the engine.
• Average Rank: The rank f was assigned on average over all time steps. A good metric results in a lower average rank.
• Occurrences in Top 5: Lists how often f was ranked among the top 5 faults by diagnosis. This compensates for possible outliers regarding scores and ranks, since a good metric repeatedly lists f in its top 5 list.
• Occurrences as top-diagnosed fault: Lists how often f was ranked as the top fault candidate.
• Relative Occurrences: The percentage of the occurrences as top and in top 5 is presented relative to the total number of time steps. This gives a tractable idea of how the rankings are perceived by a mechanical operator.
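The aggregation of these quantities can be sketched as follows, assuming the rank of the intended fault f has been recorded for every time step. The function name `summarize_ranks` and the dictionary keys are hypothetical.

```python
from typing import Dict, Sequence


def summarize_ranks(ranks: Sequence[int], top_k: int = 5) -> Dict[str, float]:
    """Aggregate the per-time-step ranks of the intended fault f."""
    steps = len(ranks)
    occ_top_k = sum(rank <= top_k for rank in ranks)   # occurrences in top 5
    occ_top_1 = sum(rank == 1 for rank in ranks)       # occurrences as top fault
    return {
        "avg_rank": sum(ranks) / steps,
        "occ_top_k": occ_top_k,
        "p_top_k": occ_top_k / steps,
        "occ_top_1": occ_top_1,
        "p_top_1": occ_top_1 / steps,
    }


# Four illustrative time steps (made-up ranks, not experiment data)
summary = summarize_ranks([1, 1, 2, 7])
print(summary["avg_rank"], summary["p_top_k"], summary["p_top_1"])  # 2.75 0.75 0.5
```

With 1200 time steps, an occurrence count of 1192 in the top 5 corresponds to a relative occurrence of about 0.99.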
RESULTS
Overall results show the average values of the criteria
discussed earlier that every metric returned. Furthermore
exemplary results are shown for two simulated faults.
The results of a simulated leaking cooling coil valve led
to the results in Table 4. Whereas the explicit scoring ma-
trix yielded the better average rank, Jaccard index showed
the intended fault as top candidate more often. Hence the
measured categories do not strictly influence each other.
Table 4: Results for Cooling Coil Valve Leak. Ravg: average
rank, Occ5: occurrences in top 5 out of 1200 time steps,
P5: percentage in top 5 (given as a fraction), Occ1:
occurrences as top listed fault, P1: percentage as top
(given as a fraction)

Metric   Ravg     Occ5    P5    Occ1    P1
CWM      1.91  1192.00  0.99  525.00  0.44
ESM      2.34  1121.00  0.93  210.00  0.18
JCI      3.52   793.00  0.66  432.00  0.36
ED      13.57   788.00  0.66  608.00  0.51
WED     13.59   783.00  0.65  609.00  0.51
MHD     15.61   728.00  0.61  585.00  0.49

Table 5 shows that the choice of metric notably influences
the quality of the diagnosis. ESM correctly listed the
sensor drift of the supply air sensor as the top fault in
66% of all time steps. Numeric metrics such as the
Euclidean distance yielded worse results, probably because
they treat 0 and other failure values uniformly and ignore
large failure values.
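One way the uniform treatment of zeros can hurt a numeric metric is illustrated below. This is only a sketch: it uses the textbook forms of the Euclidean distance, the Manhattan distance, and the Jaccard index, which may differ from the exact variants evaluated in this paper, and the failure vectors are hypothetical.

```python
import math


def euclidean(a, b):
    """Textbook Euclidean distance between two failure vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Textbook Manhattan distance between two failure vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def jaccard(a, b):
    """Jaccard index over the sets of nonzero (reported) failures."""
    active_a = {i for i, x in enumerate(a) if x != 0}
    active_b = {i for i, x in enumerate(b) if x != 0}
    union = active_a | active_b
    return len(active_a & active_b) / len(union) if union else 1.0


# Hypothetical vectors: one entry per failure rule, 0 means "no failure".
detected = [1, 0, 0, 0, 0]  # detection failure vector
sim_a = [1, 1, 1, 0, 0]     # simulated fault A: explains the detected failure
sim_b = [0, 0, 0, 0, 1]     # simulated fault B: shares no failure with it

for name, sim in (("A", sim_a), ("B", sim_b)):
    print(name, euclidean(detected, sim), manhattan(detected, sim),
          jaccard(detected, sim))
```

Both distances rate faults A and B as equally far from the detected vector, although only A reproduces the observed failure. The set-based Jaccard index separates the two because it only considers which failures are present.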
Table 5: Results for Supply Air Sensor Temperature too
High. Ravg: average rank, Occ5: occurrences in top 5 out
of 1200 time steps, P5: percentage in top 5 (given as a
fraction), Occ1: occurrences as top listed fault, P1:
percentage as top (given as a fraction)

Metric   Ravg     Occ5    P5    Occ1    P1
ESM      1.40  1193.00  0.99  788.00  0.66
CWM      1.41  1193.00  0.99  895.00  0.75
JCI      2.60  1127.00  0.94  349.00  0.29
WED      4.41  1018.00  0.85  909.00  0.76
ED       5.96   847.00  0.71  773.00  0.64
MHD      7.25   786.00  0.66  701.00  0.58
The overall results across all simulated faults for every
tested metric are shown in Table 6. Figure 7 depicts the
average ranks returned by the metrics, and Figure 6 shows
the percentages of the intended fault being listed as top
cause or in the top 5 list.
The explicit scoring matrix returned the lowest rank on
average, the Jaccard index listed the intended fault in the
top 5 more often than any other metric, and CWM returned
the actual fault in the top position most often. A user of
the system would consequently have seen the "true" cause at
the top of the fault list many times and could check the
possibly faulty component.
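The comparison above can be reproduced from the Table 6 values. The sketch below copies Ravg, Occ5, and Occ1 from Table 6, picks the best metric per criterion, and recomputes the relative occurrences as occurrences divided by the 1200 time steps. The variable names are illustrative only.

```python
TIME_STEPS = 1200

# (metric, Ravg, Occ5, Occ1), copied from Table 6.
table6 = [
    ("ESM", 3.21, 1012.78, 296.54),
    ("JCI", 3.28, 1072.24, 242.98),
    ("CWM", 4.39, 828.71, 372.78),
    ("MHD", 14.42, 460.02, 248.66),
    ("WED", 14.84, 464.27, 252.90),
    ("ED", 16.49, 482.85, 265.41),
]

best_rank = min(table6, key=lambda row: row[1])[0]  # lowest average rank
best_top5 = max(table6, key=lambda row: row[2])[0]  # most often in top 5
best_top1 = max(table6, key=lambda row: row[3])[0]  # most often top fault

# Relative occurrences (P5, P1) as fractions of all time steps.
relative = {
    name: (occ5 / TIME_STEPS, occ1 / TIME_STEPS)
    for name, _ravg, occ5, occ1 in table6
}

print(best_rank, best_top5, best_top1)
for name, (p5, p1) in relative.items():
    print(f"{name}: P5 = {p5:.2f}, P1 = {p1:.2f}")
```

No single metric wins on all three criteria, which matches the earlier observation that the criteria do not strictly determine each other.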
CONCLUSION
This paper describes in detail the algorithmic strategies
pursued in HFM-based fault diagnosis. Different metrics
were used in combination with a simulation-based
comparison procedure to identify the most probable faults.
Currently the HFM system is being applied to a new
building in Berkeley, CA, with 7 levels and 157 zones that
are air-conditioned by 2 AHUs. Insights from the continuous
commissioning project will be used to improve the
diagnosis capability.
SimBuild
2012
Fifth National Conference of IBPSA-USA
Madison, Wisconsin
August 1-3, 2012
446
Table 6: Average over all tested faults for each metric.
Ravg: average rank, Occ5: occurrences in top 5 out of
1200 time steps, P5: percentage in top 5 (given as a
fraction), Occ1: occurrences as top listed fault, P1:
percentage as top (given as a fraction)

Metric   Ravg     Occ5    P5    Occ1    P1
ESM      3.21  1012.78  0.84  296.54  0.25
JCI      3.28  1072.24  0.89  242.98  0.20
CWM      4.39   828.71  0.69  372.78  0.31
MHD     14.42   460.02  0.38  248.66  0.21
WED     14.84   464.27  0.39  252.90  0.21
ED      16.49   482.85  0.40  265.41  0.22
Figure 6: Percentages as top and among top 5 of different
metrics (x-axis: Metric, in the order ESM, JCI, CWM, MHD,
ED, WED; y-axis: Percentage, 0 to 1; series: Perc Top and
Perc Top 5)
Figure 7: Average rank of different metrics (x-axis:
Metrics, in the order ESM, JCI, CWM, MHD, ED, WED; y-axis:
Rank, 0 to 18)
More information in BIM is expected to become available
so that more precise rules can be defined. Ongoing
research includes the detection and diagnosis of control
errors as well as the proposal of more efficient control
strategies.
First experiments showed that an explicit scoring matrix,
used for the interval comparisons that arose with the
artificial setup, worked well. The exact weights have yet
to be defined and tested with data from the current
application in Berkeley.