Generalized multivariate fragility functions with multiple
damage states
C. P. Andriotis and K.G. Papakonstantinou
Department of Civil & Environmental Engineering
The Pennsylvania State University, University Park, PA, 16802, USA
1 Introduction
Fragility analysis of structures is a practical mathematical and engineering tool for parametrizing the inherent uncertainties due to earthquake events, enabling engineering decision-making based on a few informative metrics. Fragility functions quantify the probabilities of a structure exceeding certain Damage (or limit) States (DSs) Z, given some Engineering Demand Parameters (EDPs) Y, $f_{Z|Y}$. Through these probabilities we can eventually estimate the Mean Annual Frequency (MAF) of DS exceedance, $\lambda_{DS}$, and other relevant MAF responses related to economic, societal, or environmental impacts. A strong assumption made is that given some Intensity Measures (IMs) X we can sufficiently define the probabilistic response of the EDPs, $f_{Y|X}$. Following the premises of these conditional independencies of the uncertain measures and responses, we obtain [2]:
$$\lambda_{DS} = \iint_{x,y} f_{Z|Y}(z=1 \mid y)\, df_{Y|X}(y \mid x)\, d\lambda_{IM}(x) \qquad (1)$$
where $z \in \{0,1\}$, with $z=1$ for exceeding the DS and $z=0$ otherwise, and $\lambda_{IM}$ is the corresponding seismic hazard function.
Abstract: Fragility functions are widely used in performance-based analysis and risk assessment of structures, readily addressing the earthquake and structural engineering needs for uncertainty quantification. Fragility functions indicate the probability of a system exceeding certain damage states given some appropriate intensity measures characterizing recorded or simulated data series. Formally, these intensity measures are characteristic features of the data series, which can then be probabilistically mapped to a label state space, through presumed structural models and engineering demand parameters. In this sense, the development of fragility functions is a learning task, which has to preserve the statistical information of the labeled data. In this work, fragility functions are derived in their utmost generality, accounting for both multivariate intensity measures and multiple damage states, and are further expanded to cases with multiple transitions among different states, called herein generalized fragility functions. As shown in this work, the framework of softmax regression proves to be the appropriate one for such learning tasks, for several theoretical and practical reasons. Different variants of the methodology applicable in fragility analysis are discussed, and their underlying implementation details, statistical properties and assumptions are provided.
IASSAR
Safety, Reliability, Risk, Resilience and Sustainability of Structures and Infrastructure
12th Int. Conf. on Structural Safety and Reliability, Vienna, Austria, 6–10 August 2017
Christian Bucher, Bruce R. Ellingwood, Dan M. Frangopol (Editors)
© 2017 TU-Verlag Vienna, ISBN 9783903024281
A favorable simplification for alleviating a part of the computational effort is to assume that DSs are precisely defined by EDPs. That is, $f_{Z|Y}$ is just a step function from the EDP space to the discrete Z space. Thereby, fragility functions are given conditionally on IMs, instead of EDPs, and this is the format considered in this work. There are various methods in the literature for selecting and handling recorded or simulated ground motions to extract the information required for fragility analysis [4,5,7,9,14]. In this work a stochastic ground motion model is utilized, which can also be integrated in (1).
Several formulations for conducting fragility analysis exist, combining either univariate IMs with binary DSs [14], univariate IMs with multiple DSs [13], or multivariate IMs with binary DSs [17]. In general, fragility functions can be expressed as multivariate models with multiple DSs. While the multivariate extension is straightforward under fair assumptions, the multidimensional one, i.e. through multiple DSs, often brings out inconsistencies. Crossings of fragility curves, for instance, are indicative of modeling flaws, since what they essentially imply is negative state probabilities. Some techniques for circumventing this issue have been proposed in the literature, e.g. [12], yet without always providing clear theoretical justifications.
In this paper, multivariate fragility functions with multiple DSs are analyzed within the context of Softmax Regression (SR) [11]. SR has strong theoretical connections with generalized linear models [3], featuring the special case of logit links. In the binary case, SR can be regarded as binary logistic regression, which has been effectively used for fragility analysis, e.g. [8,17]. The development of fragility functions based on data and SR is seen as a learning problem in this work, where the probability distribution over multiple discrete structural DSs is to be inferred, given some multivariate data attributes.
In current fragility analysis frameworks, the lognormal distribution is favorably utilized for data belonging to certain DSs, mainly for the practical reason of the non-negativity of the used IMs, x. Quite often though, the posterior, $f_{Z|X}$, is misconceived as the likelihood, $f_{X|Z}$, frequently enabling, among other issues, DS probabilities with negative values, i.e. crossings of fragility functions. In this work, we show that SR, in the log-space of IMs, is a mathematically accurate modeling choice for this posterior, when the DS conditional distribution is lognormal. Besides the lognormal, the results are generalizable to the entire exponential family of distributions, which technically implies that a softmax fragility function assumption is invariant to this large family of distributions. Another significant feature of fragility analysis is the fact that states commonly follow a certain order, meaning that, for example, a "minor damage" state is a subset of an "up-to-major-damage" state. This nested structure is analyzed, investigated and leveraged along the lines of nominal, ordinal and hierarchical regression approaches.
Classic fragility analysis is mainly focused on the DS probabilities given that the structure is in one given initial configuration, usually the intact state. However, it would have broad implications to model state transitions from every DS to all others. Such generalized fragility functions are particularly useful in life-cycle applications that necessitate computation of failure probabilities from multiple initial structural configurations. This generalized approach, which accounts for transition probabilities among all DSs and is capable of describing long-term structural behavior, is presented here, assuming Markovian properties for the evolution dynamics of DSs. In this regard, the current state of the structure is a sufficient statistic over a history of state transitions that does not need to be tracked.
2 Statistical learning approach
2.1 Theoretical implications
To shed some light on the choice of SR in fragility analysis, we examine, without loss of generality, a binary state case with one IM. The scope of fragility analysis is to model the posterior probability of a DS exceedance given an IM, $f_{Z|X}$, which based on Bayes' rule can be written:
$$f_{Z|X}(z=1 \mid x) = \frac{f_{X|Z}(x \mid z=1)\, f_Z(z=1)}{f_{X|Z}(x \mid z=1)\, f_Z(z=1) + f_{X|Z}(x \mid z=0)\, f_Z(z=0)} \qquad (2)$$
A common misconception in fragility analysis is to model $f_{X|Z}$ instead of $f_{Z|X}$. To elaborate on this, once the structural analysis results have been obtained, usually only the points that indicate damage are kept, and their derived distribution is eventually treated as being the posterior, $f_{Z|X}$, when in fact it is merely the DS conditional likelihood $f_{X|Z}$. From relation (2) we can further derive:
$$f_{Z|X}(z=1 \mid x) = \frac{1}{1 + \dfrac{f_{X|Z}(x \mid z=0)\, f_Z(z=0)}{f_{X|Z}(x \mid z=1)\, f_Z(z=1)}} \qquad (3)$$
The state z is binary, and the random variable Z follows a Bernoulli distribution, with probability mass function:

$$f_Z(z; \theta) = \theta^z (1-\theta)^{1-z} \qquad (4)$$
A popular distribution for modeling $f_{X|Z}$ is the lognormal, mainly due to its positive support domain. Adopting this assumption we have:

$$f_{X|Z}(x \mid z=i; \mu_i, \sigma_i) = \frac{1}{x \sigma_i \sqrt{2\pi}} \exp\left( -\frac{(\ln x - \mu_i)^2}{2\sigma_i^2} \right) \qquad (5)$$
Substituting (4) and (5) in (3), and after some tedious but trivial algebraic steps, we obtain:
$$f_{Z|X}(z=1 \mid x) = \frac{1}{1 + \exp\left( -A \ln^2 x - B \ln x - C \right)} \qquad (6)$$
where A, B, C are constants, functions of the parameters of (4) and (5). Under the assumption that $\sigma_0 = \sigma_1$, it can be shown that $A = 0$, thus $f_{Z|X}$ is a logistic function in the log-space of x. From a classification perspective, the affinity of the exponent in (6) implies that DSs can be discriminated by a linear function in the log-space of IMs. In case $A \neq 0$, the exponent is quadratic in the log-space of x, which suggests that a quadratic kernel can be integrated in the analysis. This important result in (6) also holds for any distribution in the exponential family, the lognormal being just one of them [1,6]. In addition, it can easily be shown that with multiple DSs, relation (6) becomes the softmax function, demonstrating that SR is the mathematically accurate modeling choice for fragility analysis, with the potential aid of nonlinear kernels, under the most general and diverse assumptions.
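The derivation above can be verified numerically: with lognormal class-conditional likelihoods sharing a common $\sigma$, the Bayes posterior of (2) coincides with a logistic function of $\ln x$, with B and C expressed through the parameters of (4) and (5). A minimal sketch, with illustrative (not fitted) parameter values:

```python
import numpy as np
from scipy.stats import lognorm

# Illustrative parameters: lognormal likelihoods f(x|z) with equal sigma,
# so the Bayes posterior in (2) reduces to a logistic in ln x (A = 0 in (6)).
mu0, mu1, sigma = 0.0, 1.0, 0.5      # log-means per state, common log-std
theta = 0.3                          # prior P(z = 1)

x = np.linspace(0.05, 10.0, 200)

# Bayes posterior from relation (2)
f1 = lognorm.pdf(x, s=sigma, scale=np.exp(mu1))
f0 = lognorm.pdf(x, s=sigma, scale=np.exp(mu0))
posterior = f1 * theta / (f1 * theta + f0 * (1 - theta))

# Closed-form logistic in log-space: 1 / (1 + exp(-B ln x - C))
B = (mu1 - mu0) / sigma**2
C = (mu0**2 - mu1**2) / (2 * sigma**2) + np.log(theta / (1 - theta))
logistic = 1.0 / (1.0 + np.exp(-B * np.log(x) - C))

assert np.allclose(posterior, logistic)
```

The final assertion holds identically for any choice of $\mu_0, \mu_1, \theta$, as long as the two log-standard deviations are equal; with unequal ones, the extra quadratic term $A \ln^2 x$ appears.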
2.2 Softmax regression
In an SR setting, given a set of n labeled data points lying in an m-dimensional feature space, namely $\{(\mathbf{x}^{(i)}, z^{(i)})\}_{i=1}^{n}$ with $\mathbf{x}^{(i)} = (x_1^{(i)}, x_2^{(i)}, \ldots, x_m^{(i)})$, we want to estimate the probability of a class $z = j$ given x, $P(z=j \mid x_1, x_2, \ldots, x_m)$, $z \in S = \{1, 2, \ldots, S\}$, where S is the total number of classes.
In the context of fragility analysis, the classes can designate DSs, whereas the data features are the various IMs. In SR, the labels are often given in a one-zero vector format, meaning that if x belongs to DS $z = j$, its label is a zero vector with only its j-th entry equal to 1:

$$\mathbf{z} = [\,0 \;\; 0 \;\; \ldots \;\; \underset{j}{1} \;\; \ldots \;\; 0\,] \qquad (7)$$
This vectorized representation of the classes also allows for a relaxation of the strict one-zero requirement, thus allowing for a softer probability distribution over the classes, in cases where the actual DS is only partially observable and is not known with certainty. To discern the two approaches, in the presence of deterministic classes (one-zero vectors) the method is referred to as sparse SR, which is quite similar to classical multi-class logistic regression. Along the premises discussed in section 2.1, the probability of a class j given x can be directly modeled by the softmax function as:
$$P(z=j \mid x_1, x_2, \ldots, x_m) = p_j = \frac{e^{g_j(\mathbf{x})}}{\sum_{i \in S} e^{g_i(\mathbf{x})}} \qquad (8)$$
where $g_i$ is an affine function of x, for all $i \in S$:

$$g_i(\mathbf{x}) = a_{0i} + a_{1i} x_1 + \ldots + a_{mi} x_m \qquad (9)$$
It is clear from relation (8) that the probabilities of all individual states sum up to 1 for all x, and are, of course, positive. Although the necessity of positivity is self-evident, its importance has to be underlined here, since this is the guarantee that resolves fragility function crossings. The total number of optimal coefficients to be determined is $(m+1)S$, whereas the loss function to be minimized is given by the cross-entropy:
$$L = -\sum_{i=1}^{n} \sum_{j=1}^{S} z_j^{(i)} \ln p_j^{(i)} \qquad (10)$$
Note that minimizing (10) is essentially equivalent to maximizing the log-likelihood of $P(z \mid x_1, x_2, \ldots, x_m)$, assuming i.i.d. observed data.
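The estimation of the $(m+1)S$ coefficients by minimizing (10) can be sketched with plain batch gradient descent on synthetic data; everything below (data generation, learning rate, iteration count) is illustrative, not part of the paper's implementation:

```python
import numpy as np

# Minimal sparse-SR sketch on synthetic data (all names hypothetical).
# Features play the role of log-IMs; labels are one-zero vectors, eq. (7).
rng = np.random.default_rng(0)
n, m, S = 300, 2, 3
X = rng.normal(size=(n, m))                      # e.g., [ln PGA, ln D5-95]
true_A = rng.normal(size=(m + 1, S))
Xb = np.hstack([np.ones((n, 1)), X])             # intercept -> (m+1) x S coefficients
# Gumbel-max trick: sample labels from the softmax of the true logits
Z = np.eye(S)[np.argmax(Xb @ true_A + rng.gumbel(size=(n, S)), axis=1)]

def softmax(G):
    G = G - G.max(axis=1, keepdims=True)         # numerical stability
    E = np.exp(G)
    return E / E.sum(axis=1, keepdims=True)

A = np.zeros((m + 1, S))
for _ in range(2000):                            # batch gradient descent on eq. (10)
    P = softmax(Xb @ A)                          # eq. (8): p_j = e^{g_j} / sum_i e^{g_i}
    A -= 0.1 * Xb.T @ (P - Z) / n                # gradient of the cross-entropy loss

loss = -np.mean(np.sum(Z * np.log(softmax(Xb @ A)), axis=1))
```

Since (10) is convex in the coefficients, the fitted loss drops below the uniform-prediction baseline $\ln S$; in practice, any maximum-likelihood multinomial logistic solver achieves the same result.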
2.2.1 Nominal approach
If we set $z = k$ as a reference state, by dividing the numerator and denominator of (8) with $e^{g_k(\mathbf{x})}$ we end up with the multinomial logistic function. This scheme eliminates one set of unknown coefficients. To differentiate it from the other types of SR, this reduced version is called the nominal one, and includes $(m+1)(S-1)$ coefficients. Accordingly, the probability of each DS now becomes:
$$p_j = \frac{e^{g_j(\mathbf{x})}}{1 + \sum_{i \in S \setminus \{k\}} e^{g_i(\mathbf{x})}}, \;\; j \neq k, \qquad p_k = \frac{1}{1 + \sum_{i \in S \setminus \{k\}} e^{g_i(\mathbf{x})}} \qquad (11)$$
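The nominal probabilities of (11), and the post-processing step that turns state probabilities into exceedance-type fragility functions, can be sketched as follows for S = 3 states and one log-IM; the coefficient values are illustrative placeholders, not fitted results:

```python
import numpy as np

# Nominal-SR sketch, eq. (11), with state k = 1 as the reference (g_1 = 0).
A = np.array([[-1.0, 1.2],     # intercepts a_{0i} for states 2 and 3
              [ 1.5, 2.0]])    # slopes on ln PGA

def nominal_probs(log_im):
    g = A[0] + A[1] * log_im                        # affine g_i in log-IM space, eq. (9)
    e = np.exp(g)
    denom = 1.0 + e.sum()
    return np.concatenate([[1.0 / denom], e / denom])   # [p_1, p_2, p_3]

p = nominal_probs(0.5)
# Fragility (exceedance) values: P(z > 1) = p_2 + p_3, P(z > 2) = p_3
exceed = np.array([p[1] + p[2], p[2]])
```

Because the $p_j$ are positive and sum to 1 by construction, the cumulative sums defining P(z > j) are automatically ordered, which is precisely why crossings cannot occur regardless of the individual coefficient values.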
2.2.2 Ordinal approach
The ordinal SR is a more restrictive version of nominal SR. In this case, the affine functions $g_i$ are constructed so as to directly take advantage of the ordered state structure, which is rather appealing in fragility analysis given the nature of DSs, as for example "minor damage", "major damage", etc. This assumption is true to some extent, but rather restrictive, requiring good prior knowledge of the data domain. In the ordinal case, the total number of optimal coefficients is reduced to $(S-1)+m$, since the probabilities of exceedance of a set of sequential DSs are now modeled as:

$$P(z > j \mid x_1, x_2, \ldots, x_m) = \frac{1}{1 + e^{g_j(\mathbf{x})}} \qquad (12)$$

$$g_i(\mathbf{x}) = a_{0i} + a_1 x_1 + \ldots + a_m x_m \qquad (13)$$
As seen, the gradient of $g_i$ is constant for all i, namely the respective separating hyperplanes, $g_i(\mathbf{x}) = 0$, as well as the corresponding fragility functions, are parallel to each other. As such, although fragility functions have different means, they share the same variance, as the only way to guarantee non-crossings in this case. It should be made clear at this point that equal variance is not required in the other SR formulations in order to avoid crossings. In all other cases, non-crossings are a priori guaranteed, even when different variances are used, due to the way the probabilities are formulated. In these cases, however, the modeled probabilities simply have to be post-processed in order to provide the typical fragility functions, which model the probability of exceedance of DSs. Along the ordinal assumptions, early and recent works in fragility analysis have subtly employed ordinal models, without explicitly addressing it, using either the probit or the logit link [13,17]. It can be noted that, with the probit link, equation (12) yields the classical maximum likelihood estimation formulation presented in [13]:
$$P(z > j \mid x_1, x_2, \ldots, x_m) = \Phi\left( \frac{x - \mu_j}{\sigma} \right) \qquad (14)$$
where, again, x is the log-IM and Φ the standard Gaussian CDF. The probit link is quite similar to the logit one, with the former turning sharper at the tails. However, as shown in section 2.1, the choice of a logit link is in general theoretically more consistent for DS conditional data distributed according to the exponential family.
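The shared-slope structure of (12)-(13), and the resulting parallelism of ordinal fragility curves, can be sketched directly; the intercept and slope values here are illustrative, not fitted:

```python
import numpy as np

# Ordinal-SR sketch, eqs. (12)-(13): one shared slope a_1 and per-threshold
# intercepts a_{0j}, so exceedance curves are parallel and cannot cross.
a0 = np.array([1.0, 3.0])     # intercepts for thresholds j = 1, 2 (increasing)
a1 = -2.0                     # shared slope on ln PGA (negative: P(z>j) grows with IM)

def exceedance(log_im):
    g = a0 + a1 * log_im                 # eq. (13): same gradient for every j
    return 1.0 / (1.0 + np.exp(g))       # eq. (12)

x = np.linspace(-4, 2, 100)
curves = np.array([exceedance(v) for v in x])   # column j holds P(z > j)
```

With increasing intercepts $a_{01} < a_{02}$ and a common slope, $g_1(\mathbf{x}) < g_2(\mathbf{x})$ everywhere, hence $P(z>1) \geq P(z>2)$ for every IM value: this is exactly the common-variance restriction discussed above.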
2.2.3 Hierarchical approach
The hierarchical approach reflects a nested logic. The probability of a DS is now given conditionally on the IMs and DSs, as:

$$P(z > j \mid z > j-1, x_1, x_2, \ldots, x_m) = \frac{1}{1 + e^{g_j(\mathbf{x})}} \qquad (15)$$
This is a formulation that combines features from the two previous approaches. The $g_i$ expressions are similar to the ones in (9), hence the same number of coefficients as in nominal SR should be determined, while concurrently the concept of DS ordering is explicitly enforced.
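The nested logic of (15) can be sketched as a chain of conditional binary models, with the unconditional fragility functions recovered through the chain rule, $P(z>j \mid \mathbf{x}) = \prod_{l \leq j} P(z>l \mid z>l-1, \mathbf{x})$; the coefficients below are illustrative, not fitted:

```python
import numpy as np

# Hierarchical-SR sketch, eq. (15): each threshold j gets its own full affine
# g_j as in eq. (9), conditional on having exceeded the previous threshold.
A = np.array([[ 1.0,  2.5],   # intercepts a_{0j} for thresholds j = 1, 2
              [-2.0, -1.2]])  # per-threshold slopes on ln PGA

def fragility(log_im):
    g = A[0] + A[1] * log_im
    cond = 1.0 / (1.0 + np.exp(g))    # conditional exceedances, eq. (15)
    return np.cumprod(cond)           # [P(z>1), P(z>2)] via the chain rule

p = fragility(0.5)
```

Since each factor lies in (0, 1), the cumulative product is non-increasing in j for every IM value, so non-crossing is again guaranteed even though each threshold has its own slope.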
2.3 Kernelized softmax regression
The linear SR formulation developed in the previous sections can be expanded to facilitate nonlinear data discrimination. To accomplish this, a nonlinear mapping should be defined from the original space of the dataset to a new space, where the linear model performs more efficiently. An elegant way towards this is to define the inner product in the new space by means of a kernel function. Technically, linear SR can be seen as a special case of the family of polynomially kernelized SR, which employs a kernel of the form $K(\mathbf{x},\mathbf{y}) = (1 + \langle \mathbf{x}, \mathbf{y} \rangle)^r$ for any $\mathbf{x}, \mathbf{y} \in \mathbb{R}^m$, with $r = 1$. The quadratic kernel is accordingly obtained for r = 2, the cubic kernel for r = 3, etc. Another versatile and useful inner product is defined by the Gaussian kernel, or radial basis function, $K(\mathbf{x},\mathbf{y}) = \exp(-\|\mathbf{x}-\mathbf{y}\|^2 / \gamma^2)$, $\gamma \in \mathbb{R}^+$. Relation (9) admits the following modification [10]:

$$g_j(\mathbf{x}) = \sum_{i=1}^{n} a_i^{(j)} K(\mathbf{x}, \mathbf{x}^{(i)}) \qquad (16)$$

Nonlinear kernels can be integrated in any of the SR variants discussed previously. To avoid overfitting when using nonlinear kernels, the loss function should often be supplied with L2- or L1-norm regularizers. For more details on the strengths and weaknesses of different regularization functions the interested reader may consult [11].
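The kernel substitution of (16) can be sketched with a Gaussian kernel; the stored training points and dual coefficients below are random placeholders standing in for a fitted model:

```python
import numpy as np

# Kernelized-SR sketch: replace the affine g_j of eq. (9) with the kernel
# expansion of eq. (16), here with a Gaussian (RBF) kernel.
rng = np.random.default_rng(1)
n, m, S = 40, 2, 3
X_train = rng.normal(size=(n, m))          # stored training log-IMs
a = rng.normal(scale=0.1, size=(n, S))     # dual coefficients a_i^{(j)} (placeholders)
gamma = 1.5

def rbf(U, V):
    d2 = ((U[:, None, :] - V[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / gamma**2)          # K(x, y) = exp(-||x - y||^2 / gamma^2)

def softmax_probs(X_new):
    G = rbf(X_new, X_train) @ a            # g_j(x), eq. (16)
    E = np.exp(G - G.max(axis=1, keepdims=True))
    return E / E.sum(axis=1, keepdims=True)

P = softmax_probs(rng.normal(size=(5, m)))   # rows sum to 1
```

Note that the softmax normalization is untouched by the kernel substitution, so the positivity and sum-to-one guarantees that preclude fragility crossings carry over to the nonlinear case.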
3 Generalized fragility functions
In order to capture the longitudinal data dependencies, a network with Markovian state evolution is introduced, shown in Figure 1. The network consists of two node categories, the X-nodes and the Z-nodes, corresponding to the IM and DS random variables, respectively. More formally, for all $i = 0, 1, 2, \ldots, T$, $Z_i, X_i$ are random variables such that $Z_i \in S$ and $X_i \in \Omega$, where $S = \{1, 2, \ldots, S\}$ and $\Omega \subseteq \mathbb{R}$. Assuming that the initial state $z_0$ is known, the joint probability mass function is [11]:
$$f(z_{1:T}, x_{1:T}) = f(z_1 \mid z_0, x_1)\, f(z_2 \mid z_1, x_2) \cdots f(z_T \mid z_{T-1}, x_T) = \prod_{t=1}^{T} \prod_{k=1}^{S} \prod_{j=1}^{S} p_{jk}^{\,II(z_t = j,\, z_{t-1} = k)} \qquad (17)$$
where II is the indicator function. Considering the negative log-likelihood of (17), the corresponding loss function reads:
Figure 1: Markovian network representation
$$L = -\sum_{i=1}^{n} \sum_{t=1}^{T} \sum_{k=1}^{S} \sum_{j=1}^{S} II\left(z_t^{(i)} = j,\, z_{t-1}^{(i)} = k\right) \ln p_{jk} = \sum_{k=1}^{S} L_k \qquad (18)$$
Table 1: Analysis and modeling parameters
Magnitude: 5.7 – 7.5
Distance (km): 5.0 – 40.0
Vs30 ground velocity (m/s): 305.0
Beam length (m): 6.5
Column length (m): 4.2
Beam section: W21x83
Column section: W24x84
Yield strength (MPa): 235.0
Elastic modulus (GPa): 200.0
Hardening (%): 0.5
Concrete slab height (cm): 20.0
The subscript k in the loss functions of (18) indicates the conditional state of the respective cross-entropy. In addition, the form of the loss function indicates that, regardless of the length of the sampled earthquake sequences, the parameter estimation process can be decomposed into S subproblems that can be processed in parallel. The transition probabilities from each state k to all j, $p_{jk}$, form the generalized fragility functions, modeled according to (8) or any of the other presented variations.
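The decomposition of (18) into S parallel subproblems can be sketched as follows: transitions are grouped by their previous state k, and one softmax model is fitted independently per group. The data generation and fitting routine below are synthetic placeholders for the paper's sequence simulations:

```python
import numpy as np

# Per-state decomposition of eq. (18): one SR fit per conditioning state k.
rng = np.random.default_rng(2)
S, n_seq, T, m = 3, 50, 5, 1
prev = rng.integers(1, S + 1, size=(n_seq, T))    # z_{t-1} for each transition
nxt = np.minimum(S, prev + rng.integers(0, 2, size=(n_seq, T)))  # damage never decreases
x = rng.normal(size=(n_seq, T, m))                # log-IM of each event

def fit_softmax(X, z, S, iters=500, lr=0.1):
    """Plain gradient-descent softmax fit (cross-entropy, eq. (10))."""
    Xb = np.hstack([np.ones((len(X), 1)), X])
    Z = np.eye(S)[z - 1]
    A = np.zeros((Xb.shape[1], S))
    for _ in range(iters):
        G = Xb @ A
        P = np.exp(G - G.max(1, keepdims=True))
        P /= P.sum(1, keepdims=True)
        A -= lr * Xb.T @ (P - Z) / len(X)
    return A

# S independent subproblems, one per conditioning state k, trivially parallel
models = {}
for k in range(1, S + 1):
    mask = prev == k
    models[k] = fit_softmax(x[mask], nxt[mask], S)
```

Each fitted model k yields the transition fragilities $p_{jk}(\mathbf{x})$ out of state k; because the groups share no parameters, the S fits can run concurrently, as noted above.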
4 Numerical results
In this section, the dynamic response of a one-bay, three-story internal moment resisting frame is considered, under simulated seismic excitations. Details on the implemented earthquake model can be found in [15,16]. Dynamic time-history analyses are conducted using OpenSees. All beams and columns are geometrically linear force-based elements with proper fiber discretization, modeled with a bilinear material law with kinematic hardening, simply simulating the chosen uniaxial steel constitutive behavior. In Table 1, the simulated ground motion and analysis parameters are shown in detail. For the earthquake magnitude and distance from the site, uniform distributions with the bounds shown in Table 1 are sampled. The DSs are based on different levels of maximum inter-story drift. For illustration purposes, only three states are chosen, shown in Table 2.

Fragility analysis for a 1D IM case is performed first, with the Peak Ground Acceleration (PGA) (in g) chosen as the IM. A total of 50 earthquakes and respective structural analyses are used in this example for deriving the fragility functions.
Table 2: Damage states based on maximum drift
1: Minor damage: < 0.5%
2: Major damage: 0.5 – 1.5%
3: Near collapse: > 1.5%
Figure 2: Moment resisting frame fragility curves for one IM and three DSs (left). Crossing of fragility functions
for classical maximum likelihood approach without common variance (middle). Crossing avoided without
common variance assumption using nominal SR (right).
Results for this example are presented in Figure 2. The nominal and hierarchical approaches are almost identical, whereas the ordinal fragility curves for different DS exceedances are only differentiated through a shift in their mean values. In all cases crossings are avoided and, as explained in section 2.2.1, in the nominal and hierarchical cases this is accomplished without the common variance assumption. In Figure 2, the remedy of fragility function crossing is also demonstrated: the same example is evaluated, and fragility functions based on one of the leading methodologies [13], without the common variance assumption, are compared with the presented formulation results.
The difference among the three approaches can also be seen in Figure 3, where fragility surfaces for 2 IMs are demonstrated, based on 500 analyses. In this example, the chosen D5-95 duration (in sec) is a typical measure of the significant duration of seismic excitations, denoting the time interval between 5% and 95% of the Arias intensity. In Figure 3, we can again observe that the nominal and hierarchical approaches are almost identical and different from the ordinal case. This difference is more obvious in the x-y plane, in Figure 4, where the boundaries, for $g_i = 0$, are shown. The parallel boundaries among DSs imposed by the ordinal assumption appear not to be accurate when the model is free to optimize all the coefficients.
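The D5-95 significant duration used as the second IM can be computed from an accelerogram as sketched below; the record here is a synthetic toy signal, since the paper's ground motions come from the stochastic model of [15,16]:

```python
import numpy as np

# D5-95 sketch: the interval between 5% and 95% of the Arias intensity,
# I_A(t) proportional to the cumulative integral of a(t)^2.
dt = 0.01
t = np.arange(0, 20, dt)
rng = np.random.default_rng(3)
acc = rng.normal(size=t.size) * np.exp(-0.5 * (t - 6) ** 2 / 4)  # toy envelope

ia = np.cumsum(acc**2) * dt          # Arias-type integral (constant factors cancel)
ia /= ia[-1]                         # normalize to [0, 1]
t5 = t[np.searchsorted(ia, 0.05)]
t95 = t[np.searchsorted(ia, 0.95)]
d595 = t95 - t5                      # significant duration, in seconds
```

Since only the normalized cumulative energy is used, the constant pi/(2g) factor of the Arias intensity cancels and the accelerogram's units do not affect D5-95.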
Finally, we show the analysis regarding the generalized fragility functions. The generalized formulation is applied based on 1000 event series, with each series consisting of 5 earthquakes, corresponding to 5000 analyses in total. In Figure 5, the corresponding plots are shown for every initial state, based on the nominal SR. In this figure, the diffusion of more severely damaged points into formerly less damaged regions is observed, and the corresponding linear boundaries defining the fragility functions are drawn.
Figure 3: Moment resisting frame fragility surfaces for two IMs and three DSs.
Nominal (left), Ordinal (middle), Hierarchical (right).
Figure 4: Separating boundaries of DSs based on fragility analysis.
Nominal (left), Ordinal (middle), Hierarchical (right).
Figure 5: Conditional separating boundaries of DSs based on generalized fragility analysis
with nominal SR.
Note that by determining all separating boundaries, all fragility functions have essentially been
obtained, since their optimal coefficients have been computed.
5 Conclusions
This work presents a complete and systematic framework for fragility analysis, based on SR. This choice is driven by the fact that the softmax function turns out to describe the probabilities of the DSs when the distributions of the DS conditional data belong to the exponential family of distributions. The presented methodology can be implemented in three alternative ways, either under (i) nominal, (ii) ordinal or (iii) hierarchical assumptions regarding the DSs, and in all cases fragility function crossings are avoided. The numerical investigation supports the fact that the nominal and hierarchical approaches are more flexible, obtain similar results, and do not need the common variance assumption to avoid crossings, as the ordinal approach does. Finally, a theoretical formulation for generalized fragility functions based on Markovian assumptions is derived. Generalized fragility functions provide the transition probabilities from every DS to all others, given the IMs, allowing for long-term structural damage predictions. Numerical examples and connections to current practice are analyzed and discussed for all presented formulations.
References
[1] C. M. Bishop. Neural networks for pattern recognition. Oxford University Press, 1995.
[2] C. A. Cornell and H. Krawinkler. "Progress and challenges in seismic performance assessment." PEER Center News 3.2: 1-3, 2000.
[3] P. McCullagh. "Generalized linear models." European Journal of Operational Research 16.3: 285-292, 1984.
[4] M. Grigoriu. "To scale or not to scale seismic ground-acceleration records." Journal of Engineering Mechanics 137.4: 284-293, 2010.
[5] F. Jalayer and C. A. Cornell. "Alternative non-linear demand estimation methods for probability-based seismic assessments." Earthquake Engineering & Structural Dynamics 38.8: 951-972, 2009.
[6] M. I. Jordan. "Why the logistic function? A tutorial discussion on probabilities and neural networks." Computational Cognitive Science, Technical Report 9503, MIT, 1995.
[7] N. S. Kwong, A. K. Chopra, and R. K. McGuire. "A framework for the evaluation of ground motion selection and modification procedures." Earthquake Engineering & Structural Dynamics 44.5: 795-815, 2015.
[8] D. Lallemant, A. Kiremidjian, and H. Burton. "Statistical procedures for developing earthquake damage fragility curves." Earthquake Engineering & Structural Dynamics 44.9: 1373-1389, 2015.
[9] N. Luco and P. Bazzurro. "Does amplitude scaling of ground motion records result in biased nonlinear structural drift responses?" Earthquake Engineering & Structural Dynamics 36.13: 1813-1835, 2007.
[10] S. Marsland. Machine learning: an algorithmic perspective. CRC Press, 2015.
[11] K. P. Murphy. Machine learning: a probabilistic perspective. MIT Press, 2012.
[12] K. Porter, R. Kennedy, and R. Bachman. "Creating fragility functions for performance-based earthquake engineering." Earthquake Spectra 23.2: 471-489, 2007.
[13] M. Shinozuka, M. Q. Feng, H. Kim, T. Uzawa, and T. Ueda. "Statistical analysis of fragility curves." Technical Report MCEER-03-002, 2003.
[14] D. Vamvatsikos. "Analytic fragility and limit states [P(EDP|IM)]: nonlinear dynamic procedures." Encyclopedia of Earthquake Engineering: 87-94, 2015.
[15] C. Vlachos, K. G. Papakonstantinou, and G. Deodatis. "A multi-modal analytical non-stationary spectral model for characterization and stochastic simulation of earthquake ground motions." Soil Dynamics and Earthquake Engineering 80: 177-191, 2016.
[16] C. Vlachos, K. G. Papakonstantinou, and G. Deodatis. "Predictive model for site specific simulation of ground motions based on earthquake scenarios." Earthquake Engineering & Structural Dynamics, under review, 2017.
[17] A. J. Yazdi, T. Haukaas, T. Yang, and P. Gardoni. "Multivariate fragility models for earthquake engineering." Earthquake Spectra 32.1: 441-461, 2016.