Proceedings of the Eighth Conference on Computational Stochastic Mechanics
Paros, Greece, June 10--13, 2018
PROBABILISTIC STRUCTURAL PERFORMANCE
ASSESSMENT IN HIDDEN DAMAGE SPACES
C.P. ANDRIOTIS and K.G. PAPAKONSTANTINOU
Department of Civil & Environmental Engineering, The Pennsylvania State University,
University Park, 16802.
E-mail: cxa5246@psu.edu, kpapakon@psu.edu
Extended and generalized fragility functions support estimation of multiple damage state
probabilities, based on intensity measure spaces of arbitrary dimensions and longitudinal state
dependencies in time. The softmax function provides a consistent mathematical formulation for
fragility analysis; fragility functions are therefore developed herein on the basis of softmax
regression. In this context, the assumption that a lognormal or any other cumulative distribution
function should be used to represent fragility functions is eliminated, multivariate data can be
easily handled, and fragility crossings are avoided without the need for any parametric constraints.
Adding to the above attributes, generalized fragility functions also provide probabilistic
transitions among possible damage states, which can be either hidden or explicitly defined, thus
allowing for long-term performance predictions. Long-term considerations enable the study and
probabilistic quantification of the cumulative deterioration effects caused by multiple sequential
events, while hidden damage states are described as states that are either not deterministically
observed or determined, or that are initially even completely unknown and undefined based on
relevant engineering demand parameters. Although hidden damage state cases are, therefore,
frequently encountered in structural performance assessments, methods to untangle their
longitudinal dynamics are elusive in the literature. In this work, various techniques are developed
for fragility analysis with hidden damage states and long-term deterioration effects, from
Markovian probabilistic graphical models to more flexible deep learning architectures with
recurrent units.
Keywords: generalized fragility functions, multiple damage states, multivariate intensity
measures, longitudinal data dependencies, dependent hidden Markov models, recurrent neural
networks.
1 Introduction
Fragility analysis is employed in structural
engineering to assess and predict impacts
caused by hazardous events of diverse nature.
Standard fragility analysis focuses on
evaluating one-step-ahead predictions,
meaning that in the structural analyses
conducted for determining fragility functions,
the system initiates each time from the same
structural configuration (Shinozuka, et al.,
2003; Shinozuka, et al., 2000; Jalayer, et al.,
2007; Baker, 2015). This methodology is
practical for evaluating the risks of
immediately successive events; however, it is
not sufficient for rigorous life-cycle
evaluations. Immediate future assessments
provided by classic fragility methods, although
valuable for decision-making, remain
significantly myopic in the long run, and hence
incorporating them in long-term decision
support systems is impractical without
important modeling simplifications.
Generalized fragility analysis, accounting for the cumulative effect of sequential damaging events, extends the base concept to the quantification of future Damage State (DS) exceedance probabilities given any previous state of the system (Andriotis & Papakonstantinou, 2018a, 2018b), as conceptually illustrated in Fig. 1(a). Dependent Markov Models (DMMs) and Dependent Hidden Markov Models (DHMMs), equipped with the softmax function for transition probabilities, have been shown to provide sound probabilistic frameworks for sequential predictions (Bengio & Frasconi, 1996; Visser & Speekenbrink, 2010). Featuring, in a more arbitrary setting, hidden states that are not completely identifiable by Response Metrics (RMs), such dynamic models can provide transition probabilities among states given an Intensity Measure (IM) of interest, while at the same time preserving the Markovian property that is central in structural performance prediction frameworks and in optimal maintenance and inspection planning (Papakonstantinou & Shinozuka, 2014a, 2014b; Papakonstantinou, et al., 2016, 2018; Andriotis & Papakonstantinou, 2018c).
Jie Li, Giovanni Solari and Pol Spanos (Eds)
Figure 1. (a) Long-term structural damage evolution due to sequential seismic events, for t = 0, 1, …, T. Possible transitions between damage states (unshaded arrows) and actual transitions for a given scenario sequence (shaded arrows); (b) Structural properties and responses, with two sample sequential earthquake scenarios highlighted in red and green.
The intuition behind the adoption of
hidden states is that RMs are only noisy
damage indicators, thus being insufficient to
strictly define DSs and, consequently,
transitions among them. Typically selected
low-dimensional responses do not produce
monotonic mappings from the space of
sequential IMs to the space of DSs. Take for
example a typical structure subjected to 20
sequential earthquake events of arbitrary Peak
Ground Acceleration (PGA) intensities, as
depicted in Fig. 1(b), which also illustrates the
dataset utilized in this work, without any loss
of generality. It is rare to observe a
monotonically increasing trend in the
maximum drift sequence, which would be the
case if drift response were a sufficient RM
for damage characterization.
Among the available simulations, this
noisy pattern can be seen for instance in the two
highlighted scenarios of Fig. 1(b) (red and
green lines). There is obviously a good reason
for this non-monotonicity, since IMs only
capture low-dimensional features of the
original input and maximum drifts merely track
peak responses, but in its presence valid
description of damage remains elusive.
Apart from DMMs and DHMMs that
apply standard dynamic Bayesian network
principles, recent advances in artificial
intelligence and machine learning apply deep
architectures with multiple hidden processing
units, combining conventional neural networks
with convolutional and recurrent layers
(Goodfellow, et al., 2016). Deep neural
networks, by virtue of their structure, have the
capacity to elicit significant information
existing in the input and transform it through
highly nonlinear hidden spaces that are able to
perform complex input-output mappings. In
cases of sequence learning, Recurrent Neural
Networks (RNNs) have showcased impressive
results from machine translation to
grammatical inference problems, e.g.
(Bahdanau, et al., 2014; Wang, et al., 2018;
Cho, et al., 2014). Although deep learning
concepts in sequence learning are data-type
invariant to an important extent, their
implementation in sequential structural
response data for performance assessment and
prediction has not been studied.
In this work, proper probabilistic graphical
models supporting DMMs and DHMMs are
developed for generalized structural fragility.
Moreover, RNN architectures employing
different nonlinear neural activations and
advanced processing mechanisms, like Gated
Recurrent Units (GRU) and Long Short-Term
Memory (LSTM), are introduced and trained
(Cho, et al., 2014; Hochreiter & Schmidhuber,
1997). Training is implemented in a standard
earthquake engineering setting, as compactly
described in Fig. 1(b), with additional details
about modeling specifications available in
(Andriotis & Papakonstantinou, 2018a). In the
case of DHMMs, parametric forms for the
hidden state spaces as well as for the
underlying interstate transition dynamics are
given explicitly by the defined probabilistic
models before training, during the design of the
Bayesian network. In the case of RNNs, the
hidden space transformations are not
probabilistic, but rather defined by nonlinear
function activations, thus any probabilistic
semantics can only be discovered after training.
Based on the trained models, a coherent
methodology is elaborated here that enables
hidden DS transition dynamics to be extracted.
The derived methodologies can reveal deep-in-
time generalized structural fragility, and
provide a powerful quantitative tool for
decision-making and life-cycle assessments of
enhanced accuracy.
2 Softmax Fragility Functions
Softmax Regression (SR) has been shown to
have favorable qualities for determining
fragility functions, providing a sound
probabilistic methodology that avoids
commonly encountered theoretical
inconsistencies (Andriotis & Papakonstantinou, 2018a). In an SR fragility analysis setting, given a set of N data points having arbitrarily known discrete DS characterization, and lying in an m-dimensional feature space of IMs, $\{z^{(i)}, x_1^{(i)}, x_2^{(i)}, \dots, x_m^{(i)}\}_{i=1}^{N} = \{z^{(i)}, \mathbf{x}^{(i)}\}_{i=1}^{N}$, the goal is to estimate the probability of a DS z, given x, $P(z = j \mid x_1, x_2, \dots, x_m)$, with $z \in S = \{1, 2, \dots, K\}$ and $\mathbf{x} \in \mathbb{R}^m$, where K is the total number of
discrete DSs. Due to the positivity of the
typically used IMs, analysis is not usually
conducted in the original feature space, but
rather in the logarithmic space of IMs. In SR,
the labels are often given in a one-zero vector
format, meaning that if x corresponds to DS j,
its label is a one-zero vector with only its j-th
entry equal to 1:
$$\mathbf{z} = [\,0 \;\; 0 \;\; \cdots \;\; \underset{j}{1} \;\; \cdots \;\; 0\,] \tag{1}$$
This vectorized representation of the states also
allows for a relaxation of the strict one-zero
requirement, thus allowing for a probability
distribution over the states, in cases where the
actual DS is unlikely to be known with
certainty, due to uncertainties related to the
observation accuracy or the fidelity of the
structural models employed. The probability of
a DS z = j given x can be directly modeled by the softmax function, as follows (Andriotis & Papakonstantinou, 2018a):
$$P(z = j \mid x_1, x_2, \dots, x_m) = p_j(\mathbf{x}) = \frac{e^{g_j(\mathbf{x})}}{\sum_{i \in S} e^{g_i(\mathbf{x})}} \tag{2}$$
where $g_i$ is typically an affine function of $\mathbf{x}$, for all $i \in S$, and $\mathbf{x}$ lies in the IM space, or most commonly the log-IM space:
$$g_j(\mathbf{x}) = a_{0j} + a_{1j} x_1 + \dots + a_{mj} x_m \tag{3}$$
Predictor functions, g, may also be enriched with nonlinear polynomial terms, if required by the structure of the dataset. It is clear from Eq. (2) that the probabilities of all individual states sum up to 1 for all x, and are, of course, positive. Although positivity is self-evident, it is precisely what prevents crossings among the fragility functions of different DSs, since the exceedance probabilities of two successive DSs differ by exactly one positive state probability. The total number of coefficients to be determined is (m+1)K. The optimal coefficients are obtained by minimizing the cross-entropy of the dataset, defined by the loss function:
$$L = -\sum_{i=1}^{N} \sum_{j=1}^{K} z_j^{(i)} \ln p_j(\mathbf{x}^{(i)}) \tag{4}$$
Figure 2. (a) Dependent Hidden Markov Model
probabilistic graphical model; (b) Generic RNN
sequence-to-sequence architecture.
Note that minimizing Eq. (4) is essentially
equivalent to maximizing the respective log-
likelihood using Eq. (2) and assuming i.i.d.
observed data.
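As a minimal sketch of Eqs. (2)-(4), the softmax probabilities and the cross-entropy loss can be evaluated with NumPy as follows. The function names, the coefficient layout (intercepts in the first row of `A`), and the three-event toy dataset are illustrative assumptions, not part of the original study.

```python
import numpy as np

def softmax_probs(X, A):
    """Eq. (2). X: (N, m) log-IM features; A: (m + 1, K), row 0 holds the intercepts."""
    G = A[0] + X @ A[1:]                      # affine predictors g_j(x), Eq. (3)
    G = G - G.max(axis=1, keepdims=True)      # shift for numerical stability
    E = np.exp(G)
    return E / E.sum(axis=1, keepdims=True)

def cross_entropy(Z, P):
    """Eq. (4). Z: (N, K) one-zero (or soft) labels; P: (N, K) probabilities."""
    return -np.sum(Z * np.log(P))

# Illustrative data only: three PGA values and K = 3 damage states.
X = np.log(np.array([[0.1], [0.4], [0.9]]))   # log-IM space, m = 1
Z = np.eye(3)                                 # event i is labelled with DS i + 1
A = np.zeros((2, 3))                          # untrained coefficients, (m + 1) x K
P = softmax_probs(X, A)
loss = cross_entropy(Z, P)
```

With all coefficients at zero every state is equally likely, so the loss equals N ln K; any gradient-based optimizer can then be applied to `A` to minimize it.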
As soon as the optimal coefficients are
obtained, all DS probabilities are known for all
IM values, using Eq. (2). In order to derive the fragility functions we then need to compute the probability of exceeding a certain state, $F_{Z|X}$, which is given by merely combining the corresponding probabilities for the different DS levels of interest:
$$F_{Z|X}(z > j \mid \mathbf{x}) = P(z > j \mid \mathbf{x}) = \sum_{i=j+1}^{K} p_i(\mathbf{x}) \tag{5}$$
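The additive construction of the fragility functions can be sketched as below, assuming that the state probabilities are already available as rows of an array and that exceedance of state j means P(z > j | x). The function name and the sample probabilities are hypothetical.

```python
import numpy as np

def exceedance(P):
    """P: (N, K) damage-state probabilities.
    Returns (N, K - 1); column j - 1 holds P(z > j | x) for j = 1, ..., K - 1."""
    tail = np.cumsum(P[:, ::-1], axis=1)[:, ::-1]   # tail[:, j] = sum of P[:, j:]
    return tail[:, 1:]

# Illustrative probabilities for two IM values and K = 3 damage states.
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.3, 0.6]])
F = exceedance(P)
```

Because every column of `F` is obtained from the previous one by removing a nonnegative probability, the curves of successive damage states cannot cross.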
There also exist different variants of the
softmax regression that either adopt ordinal or
hierarchical considerations to leverage the
special hierarchy in the dataset, as discussed in
detail in (Andriotis & Papakonstantinou,
2018a). In these cases, the respective loss
functions remain exactly the same as in Eq. (4),
with only the function models being slightly
altered compared to Eq. (2), to incorporate any
additional assumptions required. In the ordinal approach, however, predictors g are assumed to share the same gradient for all structural states of damage, essentially imposing parallel separating boundaries, whereas in the hierarchical approach, optimal parameters are derived through sequential binary logistic regression tasks on reduced subsets of the original dataset. Being the least restricted of the three variants, the standard softmax regression approach is employed for the remainder of this work, as the basis of the proposed generalized framework.
3 Dependent Markov Models
Traditional fragility analysis frameworks
focus, as mentioned, on one-step probabilistic
predictions, with the structural system
initiating from only one condition, usually the
intact one, i.e. accounting for time-steps 0 and
1 in Fig. 1(a). However, this practice is
inadequate for assessing long-term responses
with accuracy, since transitions from the intact
state, or any sole condition state, are not
indicative of transitions encountered later in
time, as damage increases. The base concept of
softmax fragility functions is thus generalized
here in order to capture the longitudinal
dependencies among different states and
structural responses.
The developed Markovian models for
deriving the generalized fragility functions are
shown in Fig. 2(a). Excluding the O-nodes
(observation nodes), the remaining DMM
network features a direct generalization of
softmax fragility, consisting of X- and Z-nodes,
denoting IMs and DSs respectively. In this
case, DSs are considered to be fully observable
at each time step t (Andriotis &
Papakonstantinou, 2017). The entire network
on the other hand, including the O-nodes,
defines a DHMM representation that does not
necessarily require complete information over
the states (Andriotis & Papakonstantinou,
2018b). As shown in the figure, in the latter
case the additional set of O-nodes of the
Markovian network correspond to some
Response Metrics (RMs) conditional to the X-
and Z-nodes. RMs are directly observed and
can be provided by some low-dimensional
response data, such as displacements, drifts,
forces or strains. Hence, at each time step an observation $\mathbf{o}_t$ is obtained that depends on the occurred $\mathbf{x}_t$ and the actual $z_t$ of the system, which is in general only partially observable through $\mathbf{o}_t$. The joint distribution reads (Rabiner, 1989):
$$f(z_{0:T}, \mathbf{o}_{1:T} \mid \mathbf{x}_{1:T}) = \prod_{t=1}^{T} p(z_t \mid z_{t-1}, \mathbf{x}_t) \prod_{t=1}^{T} p(\mathbf{o}_t \mid z_t, \mathbf{x}_t) \tag{6}$$
Figure 3. (a) Damage State transition probability matrix, conditional to the Intensity Measure and (b)
corresponding generalized fragility functions using the Dependent Markov Model representation, with four
Damage States of increasing severity, deterministically determined based on maximum drifts.
Table 1. Maximum inter-storey drift damage characterization and corresponding color indicators.
DS   Damage severity   Maximum drift    Color
1    No                < 0.75 %
2    Minor             0.75 – 1.50 %
3    Major             1.50 – 3.00 %
4    Severe            > 3.00 %
Note that for the Markovian network without the O-nodes, Eq. (6) is modified by eliminating the $p(\mathbf{o}_t \mid z_t, \mathbf{x}_t)$ terms. State transitions are considered in this case stationary, thus $p_{j|k}(\mathbf{x}_t) = p(z_t = j \mid z_{t-1} = k, \mathbf{x}_t)$ is used in Eq. (6), whereas for the observation matrices stationary conditions are again considered, $q_j(\mathbf{x}_t) = p(\mathbf{o}_t \mid z_t = j, \mathbf{x}_t)$. The softmax function is used to represent $p_{j|k}(\mathbf{x}_t)$, whereas $q_j(\mathbf{x}_t)$ can be any continuous or discrete distribution, with known parameters or not. Considering the above, the negative log-likelihood is (Visser & Speekenbrink, 2010):
$$L = -\sum_{i=1}^{N} \sum_{t=1}^{T} \sum_{k=1}^{K} \sum_{j=1}^{K} \mathbb{I}(z_t^{(i)} = j, z_{t-1}^{(i)} = k) \ln p_{j|k}(\mathbf{x}_t^{(i)}) - \sum_{i=1}^{N} \sum_{t=1}^{T} \sum_{j=1}^{K} \mathbb{I}(z_t^{(i)} = j) \ln q_j(\mathbf{x}_t^{(i)}) \tag{7}$$
where $\mathbb{I}$ is the indicator function. In general, if state $z_t$ is observable, Eq. (7) can be directly minimized. Otherwise, the indicator functions cannot be exactly evaluated and only their expected values can be estimated, given the collected sequence of observations. As a consequence, a new expected negative log-likelihood function is computed based on Eq. (7):
$$L = -\sum_{i=1}^{N} \sum_{t=1}^{T} \sum_{k=1}^{K} \sum_{j=1}^{K} E[\mathbb{I}(z_t^{(i)} = j, z_{t-1}^{(i)} = k)] \ln p_{j|k}(\mathbf{x}_t^{(i)}) - \sum_{i=1}^{N} \sum_{t=1}^{T} \sum_{j=1}^{K} E[\mathbb{I}(z_t^{(i)} = j)] \ln q_j(\mathbf{x}_t^{(i)}) \tag{8}$$
Following the structure of the entire network in Fig. 2(a) and applying Bayes' rule, the expected values of the indicator functions can be evaluated as:
$$E[\mathbb{I}(z_t^{(i)} = j)] = \frac{\alpha_t^{(i)}(j)\, \beta_t^{(i)}(j)}{\sum_{k \in S} \alpha_t^{(i)}(k)\, \beta_t^{(i)}(k)},$$
$$E[\mathbb{I}(z_t^{(i)} = j, z_{t-1}^{(i)} = k)] = \frac{\alpha_{t-1}^{(i)}(k)\, p_{j|k}(\mathbf{x}_t^{(i)})\, q_j(\mathbf{x}_t^{(i)})\, \beta_t^{(i)}(j)}{\sum_{l, m \in S} \alpha_{t-1}^{(i)}(l)\, p_{m|l}(\mathbf{x}_t^{(i)})\, q_m(\mathbf{x}_t^{(i)})\, \beta_t^{(i)}(m)} \tag{9}$$
Figure 4. (a) Damage State transition probability
matrix, conditional to the Intensity Measure and (b)
corresponding generalized fragility functions using
the Dependent Hidden Markov Model
representation, with three Damage States of
increasing severity, probabilistically inferred based
on maximum drifts.
where the involved conditional probabilities $\alpha_t^{(i)}(j) = P(z_t^{(i)} = j, \mathbf{o}_{1:t}^{(i)} \mid \mathbf{x}_{1:t}^{(i)})$ and $\beta_t^{(i)}(j) = P(\mathbf{o}_{t+1:T}^{(i)} \mid z_t^{(i)} = j, \mathbf{x}_{t+1:T}^{(i)})$ are estimated using an initial guess of the transition and observation models, combined with the set of observations and the forward-backward algorithm for hidden Markov models:
$$\alpha_t^{(i)}(j) = q_j(\mathbf{x}_t^{(i)}) \sum_{k \in S} \alpha_{t-1}^{(i)}(k)\, p_{j|k}(\mathbf{x}_t^{(i)}), \qquad \beta_t^{(i)}(k) = \sum_{j \in S} \beta_{t+1}^{(i)}(j)\, p_{j|k}(\mathbf{x}_{t+1}^{(i)})\, q_j(\mathbf{x}_{t+1}^{(i)}) \tag{10}$$
As a result of the calculations in Eqs. (9) and
(10), the loss function of Eq. (8) only contains
the model parameters as unknowns and it can
now be easily minimized using any applicable
nonlinear programming algorithm. The
obtained optimal parameter updates are then
used to re-evaluate the expected indicator
functions. This two-step scheme is repeated
until convergence, defining the Expectation-
Maximization (EM) algorithm for hidden
Markov models
(Ghahramani, 2001).
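The E-step for a single scenario can be sketched as follows, assuming that the transition probabilities and the observation likelihoods have already been evaluated at every time step under the current parameter guess. The array layout, the initial distribution `a0` of the starting state, and the random test inputs are illustrative assumptions; the recursions are left unscaled, which is adequate only for short sequences.

```python
import numpy as np

def forward_backward(P, Q, a0):
    """Unscaled forward-backward recursions for one sequence.
    P:  (T, K, K), P[t, k, j] = transition probability from state k to j at event t
    Q:  (T, K),    Q[t, j]    = likelihood of the observed RM at event t in state j
    a0: (K,)       distribution of the initial state z_0
    Returns alpha, beta and the smoothed state probabilities gamma."""
    T, K = Q.shape
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))
    prev = np.asarray(a0, dtype=float)
    for t in range(T):                         # forward pass
        alpha[t] = Q[t] * (prev @ P[t])
        prev = alpha[t]
    for t in range(T - 2, -1, -1):             # backward pass
        beta[t] = P[t + 1] @ (Q[t + 1] * beta[t + 1])
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)  # expected state indicators
    return alpha, beta, gamma

# Random inputs, for illustration only.
rng = np.random.default_rng(0)
T, K = 5, 3
P = rng.random((T, K, K))
P /= P.sum(axis=2, keepdims=True)              # rows are valid distributions
Q = rng.random((T, K)) + 0.1
alpha, beta, gamma = forward_backward(P, Q, np.array([1.0, 0.0, 0.0]))
```

The M-step would then treat `gamma`, and the analogous pairwise expectations, as fixed weights in the loss function.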
Eq. (8) implies a decomposition of the
optimization problem, for both network
architectures considered. Specifically, for the
DHMM network, in the M-steps the expected
indicators are constants and Eq. (8) can be
decomposed into K independent subproblems
pertaining to the transition model and K
independent subproblems pertaining to the
observation model. For the DMM, i.e. without
the O-nodes and the hidden state
considerations, the problem is again
decomposed in K softmax regression tasks,
which can be processed independently based
on the resulting conditional datasets for the
different DSs involved. In the degenerate case of the simpler dependent Markov model, where $t \in \{0, 1\}$, we end up with the previously described softmax fragility functions framework. This also holds for the dependent hidden Markov model, where, however, DS labels are hidden and not assigned to the data points, but rather inferred in an unsupervised manner, based on the EM algorithm. This powerful and practical formulation can thus be applied in numerous cases where the DSs are unknown, for example when the RMs are not measured with certainty, or when the exact quantification of DSs is otherwise vague, e.g. due to limitations in the available measuring instruments.
In the DMM case, drift is considered here
to be sufficient to strictly define the DSs, as
shown in Table 1. The drift space is
accordingly divided in four discrete DSs,
indicating ‘no damage’, ‘minor damage’,
‘major damage’, and ‘severe damage’. After
minimization of the loss function in Eq. (7), the
transition probability matrix from each DS to
all others as a function of PGA is provided
here, as shown in Fig. 3(a). The transition
matrix in the figure contains all the required
probabilistic information for the next DS
(columns) given a previous one (rows), when
an event with a given IM occurs. As such, the
horizontal axes in the relevant plots represent
PGA values and the vertical axes transition
probabilities, with DSs 1 to 4 describing DSs
of increasing severity according to Table 1.
From this conditional matrix of transitions, we
can readily assess the generalized structural
fragility in an additive fashion, combining Eqs.
(2) and (5), as shown in Fig. 3(b).
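To illustrate how such an IM-conditional transition matrix supports long-term predictions, the sketch below propagates a damage-state distribution through a sequence of events. It assumes a scalar log-IM and a hypothetical coefficient array `A` holding one intercept and one slope per pair of states; rows index the previous state and columns the next one, as in Fig. 3(a).

```python
import numpy as np

def transition_matrix(x, A):
    """Row-wise softmax: entry (k, j) is the probability of moving from k to j.
    A: (K, K, 2) hypothetical coefficients (intercept, slope); x: scalar log-IM."""
    G = A[..., 0] + A[..., 1] * x
    G -= G.max(axis=1, keepdims=True)          # numerical stability
    E = np.exp(G)
    return E / E.sum(axis=1, keepdims=True)

def propagate(b0, xs, A):
    """Push the state distribution b0 through the events with log-IMs xs."""
    b = np.asarray(b0, dtype=float)
    for x in xs:
        b = b @ transition_matrix(x, A)
    return b

# Illustrative coefficients only, not fitted values.
rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4, 2))
b = propagate([1.0, 0.0, 0.0, 0.0], np.log([0.2, 0.5, 0.3]), A)
b_uniform = propagate([1.0, 0.0, 0.0, 0.0], [0.0], np.zeros((4, 4, 2)))
```

Starting from the intact state, `b` holds the probabilities of the four damage states after three sequential events.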
Although this DMM establishes intuitive
transitions between damage states and
evolution of drift dynamics, it is not entirely
adequate in describing structural deterioration,
since it does not provide a triangular transition
matrix. Triangularity of DS transition matrices
is a property that assures irreversibility of
transitions from current lower damage states to
future states of greater damage. As shown in
Fig. 3(a), the DS transition probabilities form
in this case an almost upper triangular matrix,
having some non-zero lower triangular terms.
This matrix can be practically triangularized by adding the small lower-triangular outliers of each row to the corresponding diagonal term, thus enforcing irreversibility empirically. Alternatively, we can make use of
the more advanced DHMM model presented
herein, which does not require any explicit
prior association of drifts with damage, as
imposed by Table 1 for the DMM.
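The empirical triangularization can be sketched as follows for a transition matrix evaluated at one IM value, assuming that the lower-triangular mass of each row is moved to that row's diagonal entry so that rows still sum to one. The function name and the sample matrix are illustrative.

```python
import numpy as np

def enforce_irreversibility(P):
    """Fold the lower-triangular mass of a row-stochastic (K, K) matrix
    into the diagonal, leaving an upper triangular matrix."""
    P = np.asarray(P, dtype=float)
    lower = np.tril(P, k=-1)                   # transitions to less severe states
    return np.triu(P) + np.diag(lower.sum(axis=1))

# Illustrative matrix with small reversal probabilities.
P = np.array([[0.70, 0.20, 0.10],
              [0.05, 0.80, 0.15],
              [0.02, 0.03, 0.95]])
P_tri = enforce_irreversibility(P)
```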
In the DHMM case, damage is thus not
completely defined by the structural drift as
before. Drift is considered to be a RM that can
only suggest damage but does not
deterministically describe it. As described earlier
in this section, this is accomplished by
considering the maximum inter-storey drifts to
form a space of observations that
probabilistically depend on the actual DSs and
IMs. Thereby, RMs are insufficient now to
reveal DSs with certainty and only entail noisy
information about the actual state of the
system. Three DSs are considered in this case,
whereas structural drifts are deemed to be
continuous, following unknown normal
distributions in the logarithmic space of
responses and having their means linearly
parametrized in the log-IM space. The
parameters for the observation and transition
probabilities are obtained in this case by
applying the EM algorithm to the loss function
of Eq. (8). In Fig. 4, the transition probability
matrix among DSs and the corresponding
generalized fragility functions are shown,
following the same steps as previously, for the
simpler DMM case. Apart from a small non-
zero region, the IM-dependent transitions form
an upper triangular matrix, ensuring
irreversible damage and highlighting the
effectiveness of the DHMM to capture
consistent deterioration trends.
4 Recurrent Neural Networks
RNNs transform the original input through a
nonlinear hidden space, mapping it to the
output space. By virtue of their recurrent
properties, they have the potential to unroll
deep-in-time, thus enabling detection of long-
term dependencies
(Goodfellow, et al., 2016).
Three different RNN architectures (RNN-ReLU,
GRU, LSTM) have been implemented in this
work in a 4D hidden space (4 neural nodes per
hidden unit), featuring different activation
functions and processing units. The structure of
the networks is illustrated in Fig. 2(b).
Although this has notable resemblance with the
structure of DHMM, now the hidden units are
not probabilistic but deterministic functions.
Selection of hidden units and activation
functions is very important in building a robust
and efficient model, and generally different
accuracy is expected depending on this choice.
A simple RNN features fully connected
activations between neurons. Herein, hidden
activations are Rectified Linear Units (ReLU),
which have been seen to converge faster than
other activation functions in this particular
problem:
$$\mathbf{h}_t = \mathrm{ReLU}(\mathbf{W}_h[\mathbf{h}_{t-1}, \mathbf{x}_t] + \mathbf{b}_h) \tag{11}$$
Figure 5. 2D embedding of a 4D hidden space using
t-SNE and sequential irreversible transitions for two
sample scenarios (red and green lines).
Table 2. Hidden DSs and damage characterization.
DS   Damage severity   Max. drift median   Max. drift c.o.v.   Color
1    No                1.037 %             0.7953
2    Minor             1.629 %             0.7843
3    Major             2.073 %             0.7917
4    Severe            2.897 %             0.8505
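where $\mathbf{W}_h$ denotes the network weights and $\mathbf{b}_h$ the bias. The respective LSTM nonlinear transformations and activations read:
$$\begin{aligned}
\mathbf{f}_t &= \sigma(\mathbf{W}_f[\mathbf{h}_{t-1}, \mathbf{x}_t] + \mathbf{b}_f) \\
\mathbf{i}_t &= \sigma(\mathbf{W}_i[\mathbf{h}_{t-1}, \mathbf{x}_t] + \mathbf{b}_i) \\
\mathbf{c}_t &= \mathbf{f}_t \odot \mathbf{c}_{t-1} + \mathbf{i}_t \odot \tanh(\mathbf{W}_c[\mathbf{h}_{t-1}, \mathbf{x}_t] + \mathbf{b}_c) \\
\mathbf{y}_t &= \tanh(\mathbf{W}_y[\mathbf{h}_{t-1}, \mathbf{x}_t] + \mathbf{b}_y) \\
\mathbf{h}_t &= \mathbf{y}_t \odot \tanh(\mathbf{c}_t)
\end{aligned} \tag{12}$$
where $\sigma$ is the logistic sigmoid and $\odot$ denotes the Hadamard product. Finally, the set of equations governing GRU is:
$$\begin{aligned}
\mathbf{z}_t &= \sigma(\mathbf{W}_z[\mathbf{h}_{t-1}, \mathbf{x}_t] + \mathbf{b}_z) \\
\mathbf{r}_t &= \sigma(\mathbf{W}_r[\mathbf{h}_{t-1}, \mathbf{x}_t] + \mathbf{b}_r) \\
\mathbf{h}_t &= (1 - \mathbf{z}_t) \odot \mathbf{h}_{t-1} + \mathbf{z}_t \odot \tanh(\mathbf{W}_h[\mathbf{r}_t \odot \mathbf{h}_{t-1}, \mathbf{x}_t] + \mathbf{b}_h)
\end{aligned} \tag{13}$$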
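A single GRU update can be sketched in NumPy as below, following the commonly used gate form with an update gate, a reset gate, and a tanh candidate state. The 4D hidden state matches the hidden size reported in this work, while the scalar input, the zero weights, and the function names are illustrative assumptions.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(h_prev, x, W_z, b_z, W_r, b_r, W_h, b_h):
    """One GRU update; every weight matrix acts on the concatenation [h, x]."""
    hx = np.concatenate([h_prev, x])
    z = sigmoid(W_z @ hx + b_z)                               # update gate
    r = sigmoid(W_r @ hx + b_r)                               # reset gate
    cand = np.tanh(W_h @ np.concatenate([r * h_prev, x]) + b_h)
    return (1.0 - z) * h_prev + z * cand

# Untrained (all-zero) parameters, for illustration only.
n_h, n_x = 4, 1
W = np.zeros((n_h, n_h + n_x))
b = np.zeros(n_h)
h_prev = np.array([1.0, -2.0, 0.5, 0.0])
h = gru_step(h_prev, np.array([0.3]), W, b, W, b, W, b)
```

With zero parameters both gates equal 0.5 and the candidate state vanishes, so the update simply halves the previous hidden state.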
For the purpose of training the models, tuning their hyperparameters, and assessing their generalization capacity, a 6-fold validation is applied. The dataset of 600 20-event scenarios is partitioned into 500 samples, which define the training set, and 100 samples, which define the validation set. Training is performed only on the training set, and the validation error is tracked on the validation set, which is unseen by the model. Training is executed by minimizing the loss function:
$$L = \sum_{i=1}^{N} \sum_{t=1}^{T} (\mathbf{o}_t^{(i)} - \hat{\mathbf{o}}_t^{(i)})^{\mathrm{T}} (\mathbf{o}_t^{(i)} - \hat{\mathbf{o}}_t^{(i)}) \tag{14}$$
where $\mathbf{o}_t$ is the actual RM at each time step, whereas $\hat{\mathbf{o}}_t$ is the estimated one based on the RNN, linearly parametrized in the hidden space. The loss function in Eq. (14) is similar to its DHMM counterpart, if Gaussian observations are assumed, except that it does not contain any terms related to the probabilities of hidden states. The Adam optimizer, a popular and robust variant of stochastic gradient descent (Kingma & Ba, 2014), is used for training with a learning rate of 1e-3. The batch size for gradient calculation is set equal to 256, whereas the 6-fold validation is executed based on 1e4 training epochs.
The RNN model with the lowest validation
error was the LSTM one. The LSTM model is
thus trained again based on all available data.
Training is terminated based on the epoch
corresponding to the lowest point of the
validation error, as this is derived by the
validation curves of the 6-fold process. As
the parametric dimensionality increases, the model
becomes more prone to overfitting. Hence,
although the training error keeps decreasing
monotonically up to epoch 1e4, the lowest
validation error is attained earlier. As a
consequence, the termination criterion is an
important hyperparameter that needs to be
tuned, preventing convergence to models with
poor generalization capacity. Another
important hyperparameter is the selection of
regularizers and their penalty multipliers.
Typical L2 and L1 regularizers are often
essential in large models.
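The choice of the termination epoch can be sketched as follows, under the assumption that the validation curves of the individual folds are averaged before locating their minimum; the aggregation rule is not specified in the text, and the function name and sample curves are hypothetical.

```python
import numpy as np

def best_epoch(val_curves):
    """val_curves: (n_folds, n_epochs) validation errors.
    Returns the 1-based epoch with the lowest fold-averaged validation error."""
    mean_curve = np.mean(val_curves, axis=0)
    return int(np.argmin(mean_curve)) + 1

# Illustrative curves for two folds and four epochs.
val_curves = np.array([[3.0, 2.0, 1.5, 1.7],
                       [2.8, 2.1, 1.4, 1.9]])
stop_at = best_epoch(val_curves)
```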
4.1 Extracting Hidden State Dynamics
The trained model provides a full predictive
tool for sequential earthquakes. However,
instead of merely exploiting predictions of drift
output, we can also exploit the rich and
structured representation of the hidden space,
to obtain the underlying state dynamics that
produce the observed drifts. The first step is to
cluster the continuous high-dimensional
hidden space into discrete states. Clustering of
hidden RNN spaces has been successfully
applied, for extracting deterministic finite state
automata in grammar models, using the k-means algorithm (Wang, et al., 2018). Herein, k-means is applied to the hidden space of the LSTM network, with k = 4 states. For visualization purposes, the 4D hidden space of the RNN is embedded in a 2D space using t-distributed Stochastic Neighbor Embedding (t-SNE) (Maaten & Hinton, 2008), and the results are shown in Fig. 5. The two sample sequences highlighted in Fig. 1(b) are again tracked in Fig. 5, revealing that in both scenarios, as soon as one DS is reached, only higher DSs are attainable in the future. The formed DSs correspond to RM distributions of increasing severity, as indicated by their median values, presented in Table 2.
Figure 6. Damage State transition probability matrix, conditional to the Intensity Measure, based on learned transition dynamics through the LSTM RNN.
Clustering the hidden space thus yields the
DSs. Results indicate that
transitions from lower to higher DSs are highly
irreversible. This is a very important property
assuring that the interpretation of a hidden
space as a damage space is valid, thus
showcasing the efficiency of the presented
methodology in extracting consistent hidden
damage transition patterns. Assuming
stationary transitions among DSs and using
softmax regression, the corresponding
generalized fragility functions in the form of
IM-conditional transition probabilities are
plotted in Fig. 6, where the shown matrix
comprises again all transitional information
between states, as in previous cases. Thus, DSs
can be predicted based on the current state and
IM, whereas estimated drifts are also shown in
Table 2.
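The clustering and relabelling steps can be sketched with a plain NumPy implementation of Lloyd's k-means, followed by a count of the observed state transitions as a quick irreversibility check. All function names and the two-cluster toy data are illustrative assumptions; the softmax regression that yields the IM-conditional transition probabilities is not repeated here.

```python
import numpy as np

def kmeans(H, k, n_iter=50, seed=0):
    """Plain Lloyd iterations on hidden vectors H of shape (n, d)."""
    rng = np.random.default_rng(seed)
    centers = H[rng.choice(len(H), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        d = ((H[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):            # keep the old center if a cluster empties
                centers[c] = H[labels == c].mean(axis=0)
    d = ((H[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def order_by_severity(labels, drift, k):
    """Relabel clusters as 0, ..., k - 1 by increasing median drift."""
    med = np.array([np.median(drift[labels == c]) for c in range(k)])
    rank = np.argsort(np.argsort(med))
    return rank[labels]

def transition_counts(states, k):
    """states: (N, T) integer DS sequences; C[a, b] counts moves from a to b."""
    C = np.zeros((k, k), dtype=int)
    np.add.at(C, (states[:, :-1].ravel(), states[:, 1:].ravel()), 1)
    return C

# Toy hidden vectors: two well-separated groups in a 4D hidden space.
low = np.arange(5)[:, None] * 0.01 * np.ones((1, 4))
H = np.vstack([low, 10.0 + low])
drift = np.array([1.0] * 5 + [3.0] * 5)        # illustrative drifts (%)
states = order_by_severity(kmeans(H, k=2), drift, k=2).reshape(2, 5)
C = transition_counts(states, k=2)
```

Nonzero entries below the diagonal of `C` would indicate transitions from higher to lower damage states.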
5 Conclusions
This work presents a methodology for
generalized fragility analysis, augmenting the
concept of softmax fragility functions with
DMMs, DHMMs, and deep-in-time RNNs, thus
allowing accurate long-term prediction of
damage evolution. The developed methods
allow for inference of hidden DSs and their
transition dynamics due to sequential seismic
events. In the case of DHMM, training of the
underlying probabilistic graphical model
directly provides the generalized transitions. In
the case of RNNs, solution proceeds in
successive steps of (i) RNN training, (ii)
hidden space clustering, and (iii) softmax
regression for obtaining inter-state transitions
given any previous state and any occurred IM.
Results in a standard earthquake engineering
setting indicate that RNNs have particular
qualities in learning seismic input to structural
output mappings. LSTMs, having sufficient
mechanisms to retain or forget, if necessary,
information from past events, are shown to
have better generalization performance. Most
importantly, herein, the sequential flow of
information, as this is encoded in the hidden
space of the RNNs, is exploited and is shown
to reveal consistent deterioration dynamics.
Overall, in both the DHMM and RNN methods, transitions among states remain practically irreversible, without any externally pre-determined damage definition, culminating in a final absorbing state of severe damage. Thereby, generalized fragility in hidden damage spaces is derived, which, owing to its Markovian properties, can be incorporated in advanced decision-support systems, among other applications.
References
Andriotis, C. P. & Papakonstantinou, K. G., 2017.
Generalized multivariate fragility functions
with multiple damage states. Proceedings of the
12th International Conference on Structural
Safety & Reliability (ICOSSAR), Vienna.
Andriotis, C. P. & Papakonstantinou, K. G., 2018a.
Extended and generalized fragility functions.
Journal of Engineering Mechanics, 144(9), p.
04018087.
Andriotis, C. P. & Papakonstantinou, K. G., 2018b.
Dependent Markov models for long-term
structural fragility. Proceedings of the 11th
National Conference on Earthquake
Engineering (NCEE), Los Angeles, CA.
Andriotis, C. P. & Papakonstantinou, K. G., 2018c.
Managing engineering systems with large state
and action spaces through deep reinforcement
learning. arXiv preprint, arXiv:1811.02052.
Bahdanau, D., Cho, K. & Bengio, Y., 2014. Neural
machine translation by jointly learning to align
and translate. arXiv preprint arXiv:1409.0473.
Baker, J. W., 2015. Efficient analytical fragility
function fitting using dynamic structural
analysis. Earthquake Spectra, 31(1), pp. 579-
599.
Bengio, Y. & Frasconi, P., 1996. Input-output
HMMs for sequence processing. IEEE
Transactions on Neural Networks, 7(5), pp.
1231-1249.
Cho, K., van Merriënboer, B., Bahdanau, D. &
Bengio, Y., 2014. On the properties of neural
machine translation: Encoder-decoder
approaches. arXiv preprint arXiv:1409.1259.
Ghahramani, Z., 2001. An introduction to hidden
Markov models and Bayesian networks.
International Journal of Pattern Recognition
and Artificial Intelligence, 15(1), pp. 9-42.
Goodfellow, I., Bengio, Y. & Courville, A., 2016.
Deep learning. Cambridge: MIT Press.
Hochreiter, S. & Schmidhuber, J., 1997. Long short-
term memory. Neural Computation, 9(8), pp.
1735-1780.
Jalayer, F., Franchin, P. & Pinto, P., 2007. A scalar
damage measure for seismic reliability analysis
of RC frames. Earthquake Engineering &
Structural Dynamics, 36(13), pp. 2059-2079.
Kingma, D. P. & Ba, J., 2014. Adam: A method for
stochastic optimization. arXiv preprint
arXiv:1412.6980.
Maaten, L. V. D. & Hinton, G., 2008. Visualizing
data using t-SNE. Journal of Machine Learning
Research, 9, pp. 2579-2605.
Papakonstantinou, K. G., Andriotis, C. P. &
Shinozuka, M., 2016. Point-based POMDP
solvers for life-cycle cost minimization of
deteriorating structures. Proceedings of the 5th
International Symposium on Life-Cycle Civil
Engineering (IALCCE), Delft.
Papakonstantinou, K. G. & Shinozuka, M., 2014a.
Planning structural inspection and maintenance
policies via dynamic programming and Markov
processes. Part I: Theory. Reliability
Engineering & System Safety, 130, pp.
202-213.
Papakonstantinou, K. G. & Shinozuka, M., 2014b.
Planning structural inspection and maintenance
policies via dynamic programming and Markov
processes. Part II: POMDP implementation.
Reliability Engineering & System Safety,
130, pp. 214-224.
Papakonstantinou, K. G., Andriotis, C. P. &
Shinozuka, M., 2018. POMDP and MOMDP
solutions for structural life-cycle cost
minimization under partial and mixed
observability. Structure and Infrastructure
Engineering, 14(7), pp. 869-882.
Rabiner, L. R., 1989. A tutorial on hidden Markov
models and selected applications in speech
recognition. Proceedings of the IEEE, 77(2),
pp. 257-286.
Shinozuka, M. et al., 2003. Statistical analysis of
fragility curves, Technical Report, MCEER-
03-002.
Shinozuka, M., Feng, M. Q., Lee, J. & Naganuma,
T., 2000. Statistical analysis of fragility curves.
Journal of Engineering Mechanics, 126(12),
pp. 1224-1231.
Visser, I. & Speekenbrink, M., 2010. depmixS4: An
R-package for hidden Markov models. Journal
of Statistical Software, 36(7), pp. 1-21.
Wang, Q. et al., 2018. A comparison of rule
extraction for different recurrent neural
network models and grammatical complexity.
arXiv preprint arXiv:1801.05420.