
Uncertainty Quantification for Seismic Risk Assessment using Latin Hypercube Sampling and Quasi Monte Carlo Simulation



In the insurance industry Seismic Risk Assessment is commonly used for modeling loss to a spatially distributed portfolio. Best practice not only involves the computation of expected loss, but also requires treatment of the uncertainty of all components of the modeling chain. Because the dimensionality is high, this is typically performed with a Monte Carlo simulation of a large number of scenario realizations. In this study, we first compare the computational efficiency of uncorrelated pseudo-random sampling to variance reduction techniques for scenario loss uncertainty treatment. We observe that Latin Hypercube sampling as well as Quasi Monte Carlo simulation using low-discrepancy sequences can improve the error convergence from O(n^{−0.5}) to O(n^{-1}) in many cases. We then perform a global sensitivity analysis to quantify the contribution of different modeling parameters and their uncertainties to the overall loss uncertainty. To this end we use three scenarios in Indonesia and explore uncertainty in the geographical distribution of portfolio items, structural properties such as building height and quality, as well as ground motion and damage models. We find that depending on the portfolio a significant fraction of the output variance can be attributed to uncertain factors in the exposure and vulnerability models, revealing the importance of their thorough treatment in seismic risk analysis.
, Martin KÄSER
Keywords: Seismic Risk Analysis; Uncertainty Quantification; LHC; QMC; Global Sensitivity Analysis.
Probabilistic Seismic Risk Assessment (PSRA), building upon Probabilistic Seismic Hazard
Assessment (PSHA; Cornell 1968, SSHAC 1997), is widely used in the insurance industry to model
the frequency and severity of losses to a geographically distributed portfolio from the occurrence of
earthquake events. In this context it is not sufficient to only compute expected loss, but the treatment
of uncertainty in all parts of the modeling chain is of immense importance (Crowley, 2014). For
practical purposes, model uncertainty can be categorized into being either epistemic or aleatory. The
term epistemic describes uncertainty due to limited knowledge or data and is commonly treated with a
logic tree combining multiple alternative models. Weights of the logic tree represent the degree of
belief in the correctness of a branch. Aleatory refers to variability inherent to natural processes which
is assumed to be irreducible and usually captured with a probability distribution.
Once uncertainty treatment is integrated, result uncertainty can be visualized and communicated to
decision makers. Moreover, the contribution of individual factors to the output variation can be
quantified using sensitivity analysis (SA). This makes it possible, for example, to identify areas where additional research
or effort to reduce the associated uncertainty might be worthwhile. In contrast to local methods, which
investigate the impact of incremental input perturbations at a base case, global SA aims at exploring
the entire space of uncertain input factors and can thereby take factor interactions into account.
Global SA is a computationally demanding technique, because in general a high-dimensional input
space needs to be sampled. This motivates our investigation of the efficiency of various sampling
schemes in the first part of this paper before we perform a global SA for seismic risk analysis using
three loss scenarios in Indonesia in the second part.
PhD student, Department of Earth and Environmental Sciences, Ludwig-Maximilians-Universität, Munich, Germany
Senior NatCat Risk Analyst, Corporate Underwriting, Munich Re, Munich, Germany
2.1 Modeling Framework and Uncertainty Treatment
2.1.1 Ground Motion Model
Ground Motion Prediction Equations (GMPEs) are used to model the distribution of ground motion at
a site, conditional on the occurrence of an earthquake event of a given magnitude. Ground motion levels are
expressed by intensity measures such as Peak Ground Acceleration, Peak Ground Velocity, or Pseudo-
Spectral Acceleration at a given frequency. A simplified form for expected ground motion is generally
given by
  (1)
with coefficients and .
Most GMPEs capture the associated aleatory variability using a log-normal probability distribution.
The standard deviation of this distribution is part of the GMPE. In this study, we use five different
GMPEs (see Section 2.4) and sample the total ground motion residual.
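As an illustration of how a total ground motion residual can be sampled, the following minimal Python sketch maps a uniform variate (as produced by any of the sampling schemes discussed in Section 2.2) to a log-normally distributed ground motion value. The median of 0.2 g and the logarithmic standard deviation of 0.6 are hypothetical placeholders, not values taken from any of the GMPEs used in this study, and the function name is our own.

```python
import math
from statistics import NormalDist


def ground_motion_from_uniform(u, median, sigma_ln):
    """Map a uniform variate u in (0, 1) to a log-normal ground motion value.

    median   -- median ground motion predicted by a GMPE (hypothetical input)
    sigma_ln -- total standard deviation of ln(ground motion) given by the GMPE
    """
    epsilon = NormalDist().inv_cdf(u)  # standard normal total residual
    return median * math.exp(sigma_ln * epsilon)


# Hypothetical example values: median PGA of 0.2 g, sigma of 0.6 in ln units.
low = ground_motion_from_uniform(0.1, median=0.2, sigma_ln=0.6)
mid = ground_motion_from_uniform(0.5, median=0.2, sigma_ln=0.6)
high = ground_motion_from_uniform(0.9, median=0.2, sigma_ln=0.6)
```

Working from a uniform variate rather than drawing a normal variate directly keeps the sketch compatible with stratified or low-discrepancy points, which are generated on the unit interval.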
2.1.2 Portfolio Location Uncertainty
In the context of Seismic Risk Assessment for insurance purposes, a wide range of portfolio
uncertainty exists. For many portfolios, risk items are only known to be located within some
administrative zone but exact coordinates are unknown. This can be caused by inaccurate geocoding
as well as reduced information accuracy between brokers and reinsurance companies or risk modelers.
We treat portfolio location uncertainty in a stochastic manner by sampling risk locations within their
respective administrative zone on a weighted irregular grid, which acts as a proxy for insured exposure
density so that e.g. residential buildings are preferentially located in areas of high population (see
Figure 1).
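The weighted sampling of risk locations can be sketched as follows. This is only a minimal illustration under stated assumptions: the three grid vertices, their coordinates and their weights are invented for the example, and the function name is hypothetical rather than part of the framework used in this study.

```python
import random


def sample_locations(grid_points, weights, n_items, rng):
    """Draw one grid vertex per risk item, with probability proportional to weight."""
    return rng.choices(grid_points, weights=weights, k=n_items)


# Hypothetical grid vertices (longitude, latitude) and exposure-density weights.
grid_points = [(104.75, -2.98), (104.10, -3.30), (103.60, -2.60)]
weights = [6.0, 3.0, 1.0]

rng = random.Random(1)
locations = sample_locations(grid_points, weights, n_items=10000, rng=rng)
share_first = locations.count(grid_points[0]) / len(locations)
```

With these weights, roughly 60% of the sampled risk items should fall on the first vertex, mirroring how densely populated cells attract more of the insured exposure.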
2.1.3 Vulnerability Function and Damage Uncertainty
In addition, it is common that building properties such as the number of stories as well as building age
and the related construction quality due to updated building codes remain unknown to the modeler.
For this study, we assume that building height and construction quality are unknown for all buildings.
The associated epistemic uncertainty is treated stochastically with a logic tree approach. To model
damage given a ground motion level at a site we use a zero-one-inflated Kumaraswamy distribution.
This is a mixture distribution of a Bernoulli distribution on $\{0, 1\}$ and a Kumaraswamy distribution in
the open interval $(0, 1)$, which allows the use of discrete probability masses $p_0$ and $p_1$ to denote the
likelihood that a building suffers no damage or is completely destroyed, respectively. The remaining
probability mass $1 - p_0 - p_1$ is then used to scale the Kumaraswamy distribution to model partial damage.
We employ the inverse transform method to sample building damage. Conveniently, the inverse
distribution function of the zero-one-inflated Kumaraswamy distribution has a closed-form expression.
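A minimal sketch of the inverse transform for a zero-one-inflated Kumaraswamy distribution is given below. It assumes the standard Kumaraswamy distribution function $F(x) = 1 - (1 - x^a)^b$ with shape parameters $a$ and $b$; the parameter names and the numerical values are illustrative assumptions and do not correspond to the calibrated parameters of the risk assessment framework.

```python
def sample_damage(u, p0, p1, a, b):
    """Inverse transform sampling of a zero-one-inflated Kumaraswamy distribution.

    u      -- uniform variate in [0, 1]
    p0, p1 -- probability masses at damage 0 and damage 1 (p0 + p1 <= 1)
    a, b   -- shape parameters of the Kumaraswamy distribution on (0, 1)
    """
    continuous_mass = 1.0 - p0 - p1
    if u < p0:
        return 0.0  # no damage
    if u >= 1.0 - p1 or continuous_mass <= 0.0:
        return 1.0  # total loss
    v = (u - p0) / continuous_mass  # rescale to the continuous part
    # Closed-form inverse of the Kumaraswamy CDF F(x) = 1 - (1 - x**a)**b
    return (1.0 - (1.0 - v) ** (1.0 / b)) ** (1.0 / a)


# Hypothetical parameters for illustration only.
no_damage = sample_damage(0.05, p0=0.1, p1=0.2, a=2.0, b=3.0)
partial = sample_damage(0.45, p0=0.1, p1=0.2, a=2.0, b=3.0)
total = sample_damage(0.90, p0=0.1, p1=0.2, a=2.0, b=3.0)
```

Because the mapping from the uniform variate to the damage ratio is monotone, the same function can be fed with pseudo-random, stratified or low-discrepancy points without modification.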
2.2 Sampling Strategies and Error Estimation
Because a large number of model evaluations is necessary for sensitivity analysis, it is worthwhile to
explore the efficiency of different sampling strategies for our study.
Figure 1. Sunda Strait with the urban areas of Palembang and Jakarta. Transparent black markers depict the
vertices of the weighted irregular grid employed in this study. Color represents population density (residents per
km²; Gaughan et al. 2015), which we use as insurance density proxy.
2.2.1 Simple Monte Carlo
Stochastic sampling was first used on an electronic computer (ENIAC) during the Manhattan Project
at Los Alamos Laboratory, where the still widely used name Monte Carlo (MC) was also coined. The
theory was further developed and first published by Metropolis and Ulam (1949). Simple MC is based
on uniform random sampling of the domain and can be used for numerical integration instead of
deterministic quadrature rules. We create uncorrelated pseudo-random numbers using the Mersenne
Twister pseudo-random number generator (Matsumoto and Nishimura 1998). Given a sample of size
$n$, the expectation of some function $f$, for example mean scenario loss or values along a
probabilistic loss curve, is given by the unbiased estimator
$$\hat{E}[f] = \frac{1}{n} \sum_{i=1}^{n} f(x_i)$$ (2)
Simple MC has a slow error convergence of $O(n^{-0.5})$. However, in contrast to deterministic
quadrature schemes such as the trapezoidal rule, the convergence order of simple MC is independent
of the number of parameters and only depends on the variance of the estimate. For this reason, the
method is well suited for high-dimensional integrals such as seismic risk assessment with a large
portfolio size.
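The estimator and its convergence behaviour can be illustrated with a small Python sketch. The integrand `toy_loss` is a hypothetical stand-in for a loss model, chosen only because it is cheap to evaluate; the number of repetitions and dimensions are likewise arbitrary choices for the example.

```python
import random
import statistics


def mc_mean(f, dim, n, rng):
    """Simple Monte Carlo estimate of E[f] over the dim-dimensional unit cube."""
    total = 0.0
    for _ in range(n):
        x = [rng.random() for _ in range(dim)]
        total += f(x)
    return total / n


def toy_loss(x):
    """Hypothetical stand-in for a loss model: average of the inputs."""
    return sum(x) / len(x)


def mc_standard_error(n, repetitions=200, dim=10, seed=0):
    """Standard deviation of the MC estimate over repeated simulations."""
    rng = random.Random(seed)
    estimates = [mc_mean(toy_loss, dim, n, rng) for _ in range(repetitions)]
    return statistics.stdev(estimates)


# Quadrupling n should roughly halve the standard error (order n^-0.5).
ratio = mc_standard_error(100) / mc_standard_error(400, seed=1)
```

The ratio of the two standard errors should be close to 2, independent of the chosen dimension.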
2.2.2 Latin Hypercube Sampling
Uncorrelated random sampling points tend to cluster, which is not ideal because there is little
information gain when sampling a point which is close to previous ones. Latin Hypercube
Sampling (LHS; McKay et al. 1979) is a variance reduction technique which aims to improve this by
stratifying the domain along each dimension. With a sample size of $n$, for each dimension $n$ strata are
created such that every projection to one of the dimensions is itself a stratified sample with $n$ strata.
This can result in better asymptotic error convergence, particularly when the function is additive or
dominated by one parameter (Owen 1994). While $O(n^{-1})$ convergence is often observed in practice,
theoretically this order could so far only be shown for special cases.
A problem with LHS is that while the domain is stratified in each separate dimension,
multidimensional combinations are not. This can be improved to some extent by additional design
criteria. We use the maximin distance criterion (Johnson et al. 1990), which maximizes the
minimum distance between two points. While LHS with sample size $n$ never performs worse than
simple MC with sample size $n - 1$ (Owen 1997), the advantage of LHS can decrease for high-dimensional
problems. Thus the performance of LHS for our study needs to be evaluated.
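The construction of a Latin Hypercube sample can be sketched in a few lines of Python. The maximin step shown here is a deliberately naive assumption: it simply keeps the best of several random candidate designs, which is not necessarily the optimization used in this study.

```python
import math
import random


def latin_hypercube(n, dim, rng):
    """Return n points in [0, 1)^dim with exactly one point per stratum and dimension."""
    columns = []
    for _ in range(dim):
        strata = list(range(n))
        rng.shuffle(strata)  # random pairing of strata across dimensions
        columns.append([(s + rng.random()) / n for s in strata])
    return [[columns[d][i] for d in range(dim)] for i in range(n)]


def min_pairwise_distance(points):
    """Smallest Euclidean distance between any two points (requires n >= 2)."""
    return min(math.dist(p, q)
               for i, p in enumerate(points) for q in points[i + 1:])


def maximin_lhs(n, dim, rng, candidates=20):
    """Naive maximin design: best of several random Latin Hypercube samples."""
    designs = [latin_hypercube(n, dim, rng) for _ in range(candidates)]
    return max(designs, key=min_pairwise_distance)


design = maximin_lhs(8, 2, random.Random(0))
```

Each one-dimensional projection of `design` contains exactly one point in each of the 8 strata, while the candidate selection spreads the points more evenly in the plane.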
2.2.3 Quasi Monte Carlo
With Quasi Monte Carlo (QMC) methods, deterministic low-discrepancy sequences are used instead
of pseudo-random numbers to generate sampling points. Low-discrepancy sequences are designed to
avoid previous points and fill space evenly. In this study, we use the sequence introduced by Sobol
(1967). The Koksma-Hlawka inequality states that the QMC integration error $\varepsilon$ of a function $f$ in
the $d$-dimensional unit cube is bounded by $|\varepsilon| \le V(f) \, D_n^*$, where $V(f)$ is the variation in
the Hardy-Krause sense, which is finite if the integrand is smooth (Moskowitz and Caflisch 1996).
$D_n^*$ is the discrepancy of the sequence, which is $O((\log n)^d / n)$ for large $n$, although it can be worse for
intermediate $n$ (Morokoff and Caflisch 1994). This can potentially be improved with randomized
QMC, where a deterministic sequence is scrambled randomly. We use scrambling as described in
Owen (1997) and Matoušek (1998).
In practice, many MC simulations involve decisions or functions that are not smooth, like the
epistemic uncertainties and the zero-one-inflated loss distribution in this study. This motivates our
investigation of empirical error convergence.
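To illustrate how a low-discrepancy sequence is used for integration, the sketch below implements the Halton sequence, which is much simpler to write down than the scrambled Sobol sequence employed in this study. It is a swapped-in technique for illustration only, without scrambling, and the toy integrand is an assumption.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse of a non-negative integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


PRIMES = [2, 3, 5, 7, 11, 13]  # one base per dimension (up to 6 dimensions here)


def halton(n, dim):
    """First n points of the Halton sequence in [0, 1)^dim, skipping index 0."""
    return [[radical_inverse(i, PRIMES[d]) for d in range(dim)]
            for i in range(1, n + 1)]


def qmc_mean(f, n, dim):
    """Equal-weight average of f over the first n Halton points."""
    return sum(f(x) for x in halton(n, dim)) / n


# Smooth toy integrand with exact mean 1.0 over the unit square.
estimate = qmc_mean(lambda x: x[0] + x[1], n=1024, dim=2)
```

For a smooth integrand like this one, the deterministic estimate is already close to the exact value at moderate sample sizes; for discontinuous integrands such as the zero-one-inflated damage model, the behaviour has to be checked empirically.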
2.2.4 Estimation of Standard Error and Confidence Intervals
We use repeated simulation for MC standard error estimation of the different sampling schemes.
Denoting a set of estimations of a statistic obtained from $r$ repeated simulations by
$\hat{\theta} = \{\hat{\theta}_1, \ldots, \hat{\theta}_r\}$ and its
variance by $\mathrm{Var}(\hat{\theta})$, the standard error $SE_{\hat{\theta}}$ is given by
$$SE_{\hat{\theta}} = \sqrt{\mathrm{Var}(\hat{\theta})}$$ (3)
In this study, we normalize the standard error of the mean by the sample mean to obtain the relative
standard error. For the estimation of confidence intervals of the relative standard error, we use bootstrapping. This
method assumes that the original sample holds all information about the underlying population, and
can be used to estimate the sampling distribution of a statistic by resampling with replacement. Specifically,
we employ the bias-corrected accelerated percentile method (Efron and Tibshirani 1986).
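The repeated-simulation estimate of the relative standard error and a bootstrapped upper confidence bound can be sketched as follows. Note that this sketch uses the plain percentile bootstrap rather than the bias-corrected accelerated variant, and that the confidence level, the number of bootstrap replicates and the synthetic estimates are all assumptions made for the example.

```python
import random
import statistics


def relative_standard_error(estimates):
    """Standard deviation of repeated estimates, normalized by their mean."""
    return statistics.stdev(estimates) / statistics.mean(estimates)


def bootstrap_upper_bound(estimates, statistic, rng, n_boot=1000, level=0.95):
    """Upper confidence bound from a plain percentile bootstrap (not BCa)."""
    n = len(estimates)
    replicates = sorted(
        statistic([estimates[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    index = min(n_boot - 1, int(level * n_boot))
    return replicates[index]


# Synthetic stand-in for 50 repeated loss estimates (hypothetical values).
rng = random.Random(3)
estimates = [rng.gauss(100.0, 5.0) for _ in range(50)]
rse = relative_standard_error(estimates)
upper = bootstrap_upper_bound(estimates, relative_standard_error, rng)
```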
2.3 Sensitivity Analysis
Seismic Risk Assessment relies on many uncertain parameters. Awareness of model uncertainties and
knowledge of the extent to which certain factors drive output uncertainty under specific circumstances
is important for risk modelers as well as end users. With sensitivity analysis (SA), it is possible to
quantify the influence of uncertain model input factors. Regulatory documents and official guidelines
of the European Commission and the United States Environmental Protection Agency recommend the
use of SA and stress the importance of considering factor interactions (Saltelli et al. 2010).
Local SA methods use first-order partial derivatives $\partial Y / \partial X_i$ to evaluate the sensitivity of model
output $Y$ against input $X_i$ at a predefined base case of the input space. If the modeling code itself does
not return derivatives, they can, for example, be estimated with finite differences. Another very powerful
approach is algorithmic differentiation, which is the automated differentiation of an entire model
source code via application of the chain rule. While local methods are relatively cheap, they give only
limited insight into the sensitivity of a model with respect to a variable, because they only provide
valid information close to the base case where non-linear response can be neglected. They provide no
information about other regions of the input space. By contrast, global SA methods explore the entire
input space. This makes it possible to quantify the overall sensitivity of the model output with respect to input
factors as well as interactions between factors.
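The limitation of local methods can be made concrete with a small finite-difference sketch. The model `toy_model` and both base cases are hypothetical; the point is only that the estimated derivatives change with the chosen base case as soon as the model is non-linear or contains interactions.

```python
def local_sensitivities(model, base, rel_step=1e-6):
    """Central finite-difference estimates of the partial derivatives at a base case."""
    gradients = []
    for i, xi in enumerate(base):
        h = rel_step * max(abs(xi), 1.0)
        up = list(base)
        down = list(base)
        up[i] = xi + h
        down[i] = xi - h
        gradients.append((model(up) - model(down)) / (2.0 * h))
    return gradients


def toy_model(x):
    """Hypothetical non-linear model with an interaction term."""
    return x[0] ** 2 + x[0] * x[1]


at_first_base = local_sensitivities(toy_model, [1.0, 2.0])
at_second_base = local_sensitivities(toy_model, [3.0, 0.0])
```

At the first base case the derivatives are approximately 4 and 1, at the second approximately 6 and 3, so the ranking and magnitude of local sensitivities is only valid near the point at which they were evaluated.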
For the global SA in this study, we use the variance decomposition introduced by Sobol (2001) based
on his Analysis of Variances (ANOVA) decomposition. The total variance $V(Y)$ of a scalar model
output $Y$ dependent on a model input vector $X$ can be decomposed into components
$$V(Y) = \sum_{i=1}^{k} V_i + \sum_{i=1}^{k} \sum_{j>i}^{k} V_{ij} + \ldots + V_{12 \ldots k}$$ (4)
where $k$ denotes the number of input factors, $V_i$ a first-order variance term dependent only on the $i$th
input factor and $V_{ij}$ a second-order term dependent on the $i$th and $j$th input. $V_{12 \ldots k}$ is the highest-order
variance term dependent on all input factors. Higher-order terms represent variance that cannot be
explained by lower-order terms, but is caused by some interaction of the involved factors. For
example, the variance in $V_{ij}$ cannot be expressed by $V_i + V_j$.
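As a worked instance of the decomposition in Equation 4, consider first the general case of three input factors and then a hypothetical toy model $Y = X_1 X_2$ with two independent inputs that are uniformly distributed on $[0, 1]$. The toy model is our own illustration and is not part of the seismic risk model; it is chosen because all variance terms can be computed by hand.

```latex
\begin{align*}
% Three input factors: 2^3 - 1 = 7 variance terms
V(Y) &= V_1 + V_2 + V_3 + V_{12} + V_{13} + V_{23} + V_{123} \\[1ex]
% Toy model Y = X_1 X_2 with independent uniform inputs on [0, 1]
V(Y) &= E[Y^2] - E[Y]^2 = \tfrac{1}{9} - \tfrac{1}{16} = \tfrac{7}{144} \\
V_1 = V_2 &= \mathrm{Var}\!\left(\tfrac{1}{2} X_i\right) = \tfrac{1}{48} = \tfrac{3}{144} \\
V_{12} &= V(Y) - V_1 - V_2 = \tfrac{1}{144}
\end{align*}
```

In the toy model, one seventh of the output variance is therefore a pure interaction effect that neither first-order term can explain.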
Sobol sensitivity indices express the fraction of the total variance due to a subset of the variance
components. In this study, we use the first order (or main effect) indices which quantify the fraction of
the variance caused by an input factor to the total output variance without interactions over the whole
input space:
$$S_i = \frac{V_{X_i}\left(E_{X_{\sim i}}(Y \mid X_i)\right)}{V(Y)}$$ (5)
$X_i$ denotes the $i$th input factor and $V_{X_i}$ is the partial variance taken over factor $X_i$. $X_{\sim i}$ denotes all input
factors but the $i$th, which can be thought of as the “non-$X_i$ direction” (Saltelli et al. 2010). $E_{X_{\sim i}}$ is the
conditional expectation taken over all factors but $X_i$, which means that variations in other parameters
are “averaged” and the variance is taken over these averages.
In addition, we use the total effect indices introduced by Homma and Saltelli (1996) to quantify the
contribution of the first order effect of the $i$th input factor together with all higher-order interactions
with other factors to the total output variance:
$$S_{Ti} = \frac{E_{X_{\sim i}}\left(V_{X_i}(Y \mid X_{\sim i})\right)}{V(Y)} = 1 - \frac{V_{X_{\sim i}}\left(E_{X_i}(Y \mid X_{\sim i})\right)}{V(Y)}$$ (6)
$S_{Ti}$ is called the total effect index of the $i$th input factor. Note that because individual interaction
components of the variance decomposition are reused for the computation of several total effect
indices (of all input factors involved in this interaction), the sum of all total effect indices exceeds 1
unless the model itself is purely additive. Because they are normalized by the total variance, the exact
values of $S_i$ and $S_{Ti}$ are in the interval $[0, 1]$. Estimating $S_i$ and $S_{Ti}$ is usually performed via MC
simulation, which can be computationally demanding because convergence of the indices is often only
achieved after a very large number of model evaluations (Sarrazin et al. 2016). Numerous studies are
devoted to deriving efficient sampling designs for the joint estimation of main and total effects. In this
study, the design proposed by Jansen (1999) is employed.
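The joint estimation of main and total effects can be sketched with the estimators attributed to Jansen (1999) in the form summarized by Saltelli et al. (2010). The code below is a minimal illustration with simple MC sampling and a hypothetical toy model $Y = X_1 X_2$ with independent uniform inputs, for which the exact indices are $S_1 = S_2 = 3/7$ and $S_{T1} = S_{T2} = 4/7$; it is not the implementation of the SAFE toolbox.

```python
import random
import statistics


def sobol_indices_jansen(model, k, n, rng):
    """Estimate main and total effect indices with n * (k + 2) model evaluations."""
    A = [[rng.random() for _ in range(k)] for _ in range(n)]
    B = [[rng.random() for _ in range(k)] for _ in range(n)]
    yA = [model(x) for x in A]
    yB = [model(x) for x in B]
    var_y = statistics.pvariance(yA + yB)
    main, total = [], []
    for i in range(k):
        # Matrix A with column i replaced by column i of B
        yABi = [model(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        v_main = var_y - sum((u - v) ** 2 for u, v in zip(yB, yABi)) / (2 * n)
        v_total = sum((u - v) ** 2 for u, v in zip(yA, yABi)) / (2 * n)
        main.append(v_main / var_y)
        total.append(v_total / var_y)
    return main, total


def product_model(x):
    """Hypothetical toy model with a pure interaction between two inputs."""
    return x[0] * x[1]


main, total = sobol_indices_jansen(product_model, k=2, n=20000, rng=random.Random(7))
```

For this toy model the total effect indices sum to more than 1, which reflects the shared interaction term being counted once for each factor involved.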
2.4 Earthquake Scenarios and Portfolios
We study scenario loss variations based on two hypothetical earthquake events and several synthetic
portfolios in Indonesia. For this purpose we use a proprietary seismic risk assessment framework
developed by Munich Re running on MATLAB R2016b.
We use three portfolio sets, each of which is only known to be distributed in a different administrative zone
corresponding to an Indonesian province. Each portfolio set consists of several portfolios with a
varying number of buildings (1, 100, 5000 and 10000). For simplicity we use a total sum insured (TSI)
of  and a flat value distribution for all portfolios, i.e. losses can be interpreted directly as
percentage of the TSI and the TSI is distributed equally among all risk items. For each model
evaluation, locations, heights and qualities are sampled independently for all buildings, but we assume
a fixed construction type (reinforced concrete with unreinforced masonry infill). To compute mean
damage ratios, our model then uses customized vulnerability functions reflecting different structural
performance due to varying building heights and construction qualities. On top of this, we sample
either the ground motion residuals or the damage residuals. Note that we do not sample ground motion
and damage residuals jointly, because in the reinsurance industry damage distributions are fitted to
include variation due to aleatory ground motion uncertainty. It might be possible to correct for this
effect in the future when more detailed loss and ground motion data become available, but currently
this approach is common practice. The aleatory ground motion uncertainty model is part of each
GMPE, while the parameters of the zero-one-inflated Kumaraswamy distribution used to treat aleatory
damage uncertainty (see Section 2.1.3) are part of the aforementioned risk assessment framework and
depend on mean damage ratios as well as building heights and qualities.
For the SA, we use the SAFE toolbox (Pianosi et al. 2015), which provides workflows for several SA
methods. For a variance-based global SA as performed in this study, SAFE provides functions to
approximate and visualize main and total effect indices.
Figure 2. Footprints of expected ground motion of two hypothetical scenarios employed for this study. Color and
isolines denote Peak Ground Acceleration (PGA) in . Figure 2a (left) shows the  event on the
Sumatra subduction fault near the urban areas of Palembang and Jakarta using the GMPE by Zhao et al. (2006).
The provinces of Sumatera Selatan on the island of Sumatra and Daerah Khusus Ibukota Jakarta on the island
of Java in which we sample location uncertainty are outlined by their boundaries in blue color. Figure 2b (right)
shows the  event on a northern segment of the Sumatra Fault Zone near the city of Medan using the
GMPE by Chiou and Youngs (2008). The province Sumatera Utara is outlined by its boundary in blue color.
2.4.1 Southern Sumatra Subduction Fault Event
The first scenario is a hypothetical  event on the Sumatra subduction fault near the urban areas
of Palembang and Jakarta on the islands of Sumatra and Java, respectively. For this event, we use a
three-dimensional representation of the subduction zone based on the Slab 1.0 model (Hayes et al.
2012) and sample from two GMPEs with equal weights: Zhao et al. (2006), and Youngs et al. (1997).
We use two sets of portfolios: the first, hereafter labeled “Palembang portfolio set”, is only known to
be distributed within the province Sumatera Selatan on Sumatra; the second, hereafter labeled “Jakarta
portfolio set”, in the province Daerah Khusus Ibukota Jakarta on Java. Portfolio locations are sampled
onto the weighted irregular grid inside their respective zones (see Section 2.1.2 and Figure 1).
Figure 2a shows a footprint of the event obtained using the GMPE by Zhao et al. (2006) and the
outline of the two administrative zones.
2.4.2 Northern Sumatra Fault Zone Event
The second scenario is a hypothetical  event on a northern segment of the Sumatra fault zone
near the urban area of Medan. For this event, we sample from three different GMPEs with equal
weights: Boore and Atkinson (2008), Campbell and Bozorgnia (2008), and Chiou and Youngs (2008).
We use one portfolio set for this event, hereafter referred to as “Medan portfolio set”, for which risk
items are only known to be located somewhere in the administrative zone Sumatera Utara. For each
model run, risk item locations are sampled onto the weighted irregular grid inside this area (see
Section 2.1.2). Figure 2b shows a footprint of this event obtained using the GMPE by Chiou and
Youngs (2008) and the outline of the administrative zone.
3.1 Error Convergence of Sampling Strategies
In this section, we analyze the performance of different sampling schemes described in Section 2.2 for
the scenarios and portfolios described in Section 2.4. The discontinuous zero-one-inflated damage
distribution (see Section 2.1.3) as well as the high dimensionality of large portfolios provide an
interesting and challenging test case.
Figure 3 shows estimated event losses for the Palembang portfolio with 1 and 5 risk items against
ten sample sizes, with 50 repeated simulations for each sample size. All three
sampling schemes converge to the same solution, but the estimations obtained with Latin Hypercube
Sampling (LHS) with the maximin design criterion and Quasi Monte Carlo with the scrambled Sobol
sequence (SSobol) converge faster with less variable loss estimates. The portfolio with 5 risk items has
less variation than the portfolio with 1 risk item due to the diversification induced by uncorrelated
sampling of individual building losses. A correlation model such as a spatial ground motion
correlation model (e.g. Jayaram and Baker 2009) with a spatially clustered portfolio or any type of
damage correlation model would act to lessen this effect. Portfolio value distributions other than flat
(see Section 2.4) would also show relatively higher variability.
To analyze the convergence order of different sampling schemes, Figure 4 shows logarithmic plots of
the relative standard error of the event loss estimate against sample size for the Medan portfolio set, obtained from
repeated simulations (with a different number of repetitions for the portfolio with 1 risk item, for 100 risk items,
and for 5000 and 10000 risk items). The thin blue and red lines indicate theoretical $O(n^{-0.5})$ and
$O(n^{-1})$ convergence, starting from the initial error at the smallest sample size. As expected, simple MC converges slowest with
$O(n^{-0.5})$ for all portfolios independently of the dimensionality. For the small portfolios with 1 and
100 risk items, SSobol and LHS perform about equally well and achieve $O(n^{-1})$ convergence. For the
larger portfolio sizes (5000 and 10000 risk items), LHS does not achieve $O(n^{-1})$ convergence but
retains some advantage over simple MC. We do not use the Sobol sequence for very large portfolios,
because the employed algorithm only supports up to 1111 dimensions (Joe and Kuo 2003).
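An empirical convergence order can be read off such a plot as the slope of a straight line fitted in log-log space. The following sketch shows this with an ordinary least-squares fit; the sample sizes and the synthetic error values are assumptions used only to check the fit and are not results of this study.

```python
import math


def convergence_order(sample_sizes, errors):
    """Least-squares slope of log(error) versus log(sample size)."""
    log_n = [math.log(n) for n in sample_sizes]
    log_e = [math.log(e) for e in errors]
    mean_n = sum(log_n) / len(log_n)
    mean_e = sum(log_e) / len(log_e)
    numerator = sum((a - mean_n) * (b - mean_e) for a, b in zip(log_n, log_e))
    denominator = sum((a - mean_n) ** 2 for a in log_n)
    return numerator / denominator


# Synthetic errors following exact power laws (illustration only).
sizes = [2 ** m for m in range(4, 14)]
slope_mc = convergence_order(sizes, [n ** -0.5 for n in sizes])
slope_fast = convergence_order(sizes, [1.0 / n for n in sizes])
```

A fitted slope near -0.5 corresponds to simple MC behaviour, while a slope near -1 corresponds to the faster convergence observed for LHS and SSobol on small portfolios.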
Figure 3. Event loss versus sample size for the Sumatra subduction zone event and the Palembang
portfolio set with 1 risk item (left) and 5 risk items (right). Semi-transparent circles depict 50 repeated
simulations for each sample size and sampling scheme, with solid lines highlighting one individual repetition.
The transparently shaded background indicates the entire range for each sampling scheme. Estimations obtained
using Latin Hypercube Sampling (LHS; red) and Quasi Monte Carlo using a scrambled Sobol sequence (SSobol;
green) scatter less than those obtained with simple MC (blue).
Figure 4. Logarithmic plot of relative standard errors versus sample size obtained from repeated
simulations and bootstrapped upper confidence intervals of event loss for the Medan portfolio set with 1,
100, 5000 and 10000 risk items and the Sumatra fault zone scenario. Latin Hypercube
Sampling (LHS; red) and Quasi Monte Carlo using a scrambled Sobol sequence (SSobol; green) achieve $O(n^{-1})$
convergence for the small portfolios. While $O(n^{-1})$ is not achieved for larger portfolios, LHS still retains some
advantage over simple sampling (blue).
3.2 Results of Global Sensitivity Analysis
Using the same hypothetical scenarios and synthetic portfolios in Indonesia, we performed a global
SA to investigate the effect of uncertain input factors on event loss estimation.
To obtain a first impression of sensitivities, scatter plots are a simple and powerful tool. This graphical
global SA technique makes it possible to quickly assess the first-order effect of varying each factor over its entire
range while also taking the global input space of other factors into account. Figure 5 shows scatter
plots of event losses versus five different input factors for the Sumatra subduction zone
event and a portfolio in DKI Jakarta with 1 building, obtained using LHS.
Each plot is a one-dimensional projection of the entire sample, in which one factor ($X_i$) is varied
systematically while all other factors ($X_{\sim i}$) are taken unconstrained over their full range. The red lines
approximate $E_{X_{\sim i}}(Y \mid X_i)$ by computing mean values of the event loss inside a sliding window whose length is a fixed fraction
of the range of the respective input factor. The steep slope and large range (i.e., large variance) of the
red lines of the building quality and damage residual suggest that these uncertainty types have a strong
effect for this scenario and portfolio.
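The sliding-window approximation of the conditional mean can be sketched as follows. The window length of 10% of the input range, the number of window centers and the synthetic two-factor model are assumptions made for this illustration and are not the settings used for Figure 5.

```python
import random


def sliding_window_mean(x, y, window_fraction=0.1, n_centers=50):
    """Approximate the conditional mean of y given x with a sliding window.

    Returns a list of (window center, mean of y inside the window) pairs.
    """
    lo, hi = min(x), max(x)
    half_width = 0.5 * window_fraction * (hi - lo)
    curve = []
    for j in range(n_centers):
        center = lo + (hi - lo) * (j + 0.5) / n_centers
        inside = [yi for xi, yi in zip(x, y) if abs(xi - center) <= half_width]
        if inside:  # skip empty windows
            curve.append((center, sum(inside) / len(inside)))
    return curve


# Synthetic example: y depends linearly on x1, while x2 only adds scatter.
rng = random.Random(5)
x1 = [rng.random() for _ in range(5000)]
x2 = [rng.random() for _ in range(5000)]
y = [2.0 * a + (b - 0.5) for a, b in zip(x1, x2)]
curve = sliding_window_mean(x1, y)
```

The resulting curve follows the trend $2 x_1$ because the contribution of the second factor averages out inside each window, which is the same averaging that the red lines in the scatter plots represent.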
Figure 5. Scatter plot of event loss versus individual input factors for the Sumatra subduction zone
event and a portfolio in DKI Jakarta with 1 building obtained using Latin Hypercube Sampling. In each plot all
other input factors are sampled over their entire range, which corresponds to a projection to one dimension.
Semi-transparent blue markers depict individual event loss realizations. The red curves correspond to the
conditional mean obtained using a sliding window whose length is a fixed fraction of the total input interval, thereby
approximating $E_{X_{\sim i}}(Y \mid X_i)$.
Figure 6 shows the results of a variance-based global SA (see Section 2.3) for the same portfolio. To
ensure convergence of the sensitivity indices, for this portfolio we computed the model
evaluations using simple MC, with the corresponding base sample size as defined in Sarrazin et
al. (2016, Equation 13). Main effects (first order Sobol sensitivity indices $S_i$) are depicted as orange
boxes and total effects (total sensitivity indices $S_{Ti}$) as blue boxes for each input factor. The MC
estimation of each effect is indicated by a thin black line inside the corresponding box, while the
extent of the boxes depicts confidence intervals obtained using bootstrapping. Confirming the
impression obtained from the scatter plot, the GMPE, the building quality and the damage distribution
all have an important first order effect. All three are subject to significant higher order interactions
with other factors, resulting in larger total sensitivity indices. This underlines the notion that
uncertainty quantification for seismic risk analysis should not merely consider first-order effects or
local sensitivities, but consider factor interactions and explore the global uncertainty space. For this
scenario location uncertainty has little effect, which can be explained by the relatively small extent of
the administrative zone DKI Jakarta (see Figure 2a).
Figure 7 shows the equivalent plot for the Sumatra fault zone event and the Medan portfolio
with 100 risk items. The main and total effects for the same uncertainty types as before are
investigated. However, in this case the 500 individual input factors are organized into five uncertainty
groups (one per uncertainty type), each containing 100 independently sampled factors corresponding
to the 100 risk items. Due to the larger portfolio size, more model evaluations were necessary
to achieve convergence of the indices.
For this scenario, location uncertainty has a stronger effect because the province of Sumatera Utara
has a large spatial extent (see Figure 2b) even though it is on the same administrative level as
DKI Jakarta.
Figure 6. Main effects (orange) and total effects (blue) for the Sumatra subduction zone event and
a portfolio in DKI Jakarta with 1 building. While in this case the building location and height account for a
negligible fraction of event loss variance, the GMPE, the building quality and the damage distribution all have a
sizable effect, in particular in interaction with other factors ($S_{Ti}$). The vertical extent of the boxes corresponds to
bootstrapped confidence intervals.
Figure 7. Main effects (orange) and total effects (blue) for the Sumatra fault zone event and a
portfolio in Medan consisting of 100 buildings. While building heights account for a small fraction of the
output variance, building locations and qualities, the GMPE as well as the damage distributions have substantial
influence. The vertical extent of the boxes corresponds to bootstrapped confidence intervals.
The results shown in this study indicate that LHS as well as QMC have the potential to increase the
computational efficiency of seismic risk analysis. We observe that error convergence is improved from
$O(n^{-0.5})$ to $O(n^{-1})$ for many loss scenarios. While $O(n^{-1})$ convergence is not fully achieved for very
large portfolios, it still remains advantageous to use advanced sampling strategies over simple MC.
We have furthermore investigated the impact of uncertainties in the ground motion model as well as in
the exposure and vulnerability models. Like many other uncertainty types in the exposure model,
uncertainties in building location and building properties are so far often neglected. This study has
shown that depending on the loss scenario a large fraction of the output variance can be attributed
to these factors. Although due care must be exercised when transferring the results to other models,
they highlight the importance of investigating the uncertainty associated with different factors.
Decision makers may then incorporate this knowledge into e.g. regulation, disaster management and
response plans, as well as risk mitigation measures and insurance pricing policies.
This work represents a progressive step towards a more comprehensive understanding of uncertainty
in seismic risk analysis. Nevertheless, the integration of more factors remains an important task. The
results of this study could also be tested using other methods than a variance-based SA, such as the
elementary effects test (Morris 1991) or density based methods (e.g. Pianosi and Wagener 2015).
Another powerful alternative is derivative based global SA (Sobol and Kucherenko 2009). This
approach is particularly efficient in combination with algorithmic differentiation, which has already
been successfully performed for PSHA (Molkenthin et al. 2017).
We thank Francesca Pianosi and Thorsten Wagener for providing access to the SAFE toolbox.
Boore DM, Atkinson GM (2008). Ground-motion prediction equations for the average horizontal component of
PGA, PGV, and 5%-damped PSA at spectral periods between 0.01 s and 10.0 s. Earthquake Spectra, 24(1): 99-
Campbell KW, Bozorgnia Y (2008). NGA ground motion model for the geometric mean horizontal component
of PGA, PGV, PGD and 5% damped linear elastic response spectra for periods ranging from 0.01 to 10
s. Earthquake Spectra, 24(1): 139-171.
Chiou BSJ, Youngs RR (2008). An NGA model for the average horizontal component of peak ground motion
and response spectra. Earthquake Spectra, 24(1): 173-215.
Cornell CA (1968). Engineering seismic risk analysis. Bulletin of the Seismological Society of America, 58(5):
Crowley H (2014). Earthquake risk assessment: Present shortcomings and future directions. In: Ansal A (ed)
Perspectives on European earthquake engineering and seismology. Geotechnical, Geological and Earthquake
Engineering 34, Springer, Cham, pp. 515-532.
Efron B, Tibshirani R (1986). Bootstrap methods for standard errors, confidence intervals, and other measures of
statistical accuracy. Statistical Science, 1: 54-75.
Gaughan AE, Stevens FR, Linard C, Jia P, Tatem AJ (2015). High resolution population distribution maps for
southeast Asia in 2010 and 2015. PLoS ONE, 8(2).
Hayes GP, Wald DJ, Johnson RL (2012). Slab1.0: A three-dimensional model of global subduction zone
geometries. Journal of Geophysical Research: Solid Earth, 117(B1).
Homma T, Saltelli A (1996). Importance measures in global sensitivity analysis of nonlinear models. Reliability
Engineering & System Safety, 52(1): 1-17.
Jansen MJ (1999). Analysis of variance designs for model output. Computer Physics Communications, 117(1-2):
Jayaram N, Baker JW (2009). Correlation model for spatially distributed ground-motion intensities. Earthquake
Engineering & Structural Dynamics, 38(15): 1687-1708.
Joe S, Kuo FY (2003). Remark on algorithm 659: Implementing Sobol's quasirandom sequence generator. ACM
Transactions on Mathematical Software, 29(1): 49-57.
Johnson M, Moore L, Ylvisaker D (1990). Minimax and maximin distance design. Journal of Statistical
Planning and Inference, 26(2): 131-148.
Matoušek J (1998). On the L2-discrepancy for anchored boxes. Journal of Complexity, 14(4): 527-556.
Matsumoto M, Nishimura T (1998). Mersenne twister: a 623-dimensionally equidistributed uniform
pseudo-random number generator. ACM Transactions on Modeling and Computer Simulation, 8(1): 3-30.
McKay MD, Beckman RJ, Conover WJ (1979). A comparison of three methods for selecting values of input
variables in the analysis of output from a computer code. Technometrics, 21(2): 239-245.
Metropolis N, Ulam S (1949). The Monte Carlo method. Journal of the American Statistical Association,
44(247): 335-341.
Molkenthin C, Scherbaum F, Griewank A, Leovey H, Kucherenko S, Cotton F (2017). Derivative-based global
sensitivity analysis: Upper bounding of sensitivities in seismic-hazard assessment using automatic
differentiation. Bulletin of the Seismological Society of America, 107(2): 984-1004.
Morokoff WJ, Caflisch RE (1994). Quasi-random sequences and their discrepancies. SIAM Journal on Scientific
Computing, 15(6): 1251-1279.
Morris MD (1991). Factorial sampling plans for preliminary computational experiments. Technometrics, 33(2):
Moskowitz B, Caflisch RE (1996). Smoothness and dimension reduction in Quasi-Monte Carlo methods.
Mathematical and Computer Modelling, 23(8-9): 37-54.
Owen AB (1994). Controlling correlations in Latin hypercube samples. Journal of the American Statistical
Association, 89(428): 1517-1522.
Owen AB (1997). Monte Carlo variance of scrambled net quadrature. SIAM Journal on Numerical Analysis,
34(5): 1884-1910.
Pianosi F, Sarrazin F, Wagener T (2015). A Matlab toolbox for global sensitivity analysis. Environmental
Modelling and Software, 70: 80-85.
Pianosi F, Wagener T (2015). A simple and efficient method for global sensitivity analysis based on cumulative
distribution functions. Environmental Modelling and Software, 67: 1-11.
Saltelli A, Annoni P, Azzini I, Campolongo F, Ratto M, Tarantola S (2010). Variance based sensitivity analysis
of model output. Design and estimator for the total sensitivity index. Computer Physics Communications,
181(2): 259-270.
Sarrazin F, Pianosi F, Wagener T (2016). Global sensitivity analysis of environmental models: convergence and
validation. Environmental Modelling and Software, 79: 135-152.
Senior Seismic Hazard Committee (SSHAC) (1997). Recommendations for probabilistic seismic hazard
analysis: guidance on uncertainty and use of experts. NUREG/CR-6372. Vol. 1.
Sobol’ IM (1967). On the distribution of points in a cube and the approximate evaluation of integrals. Zhurnal
Vychislitel’noi Matematiki i Matematicheskoi Fiziki, 7(4): 784-802.
Sobol’ IM (2001). Global sensitivity indices for nonlinear mathematical models and their Monte Carlo
estimates. Mathematics and Computers in Simulation, 55(1): 271-280.
Sobol’ IM, Kucherenko S (2009). Derivative based global sensitivity measures and their link with global
sensitivity indices. Mathematics and Computers in Simulation, 79(10): 3009-3017.
Youngs RR, Chiou SJ, Silva WJ, Humphrey JR (1997). Strong ground motion attenuation relationships for
subduction zone earthquakes. Seismological Research Letters, 68(1): 58-73.
Zhao JX, Zhang J, Asano A, Ohno Y, Oouchi T, Takahashi T, ..., Fukushima Y (2006). Attenuation relations of
strong ground motion in Japan using site classification based on predominant period. Bulletin of the
Seismological Society of America, 96(3): 898-913.