mfEGRA: Multifidelity Efficient Global Reliability Analysis
through Active Learning for Failure Boundary Location
Anirban Chaudhuri (Postdoctoral Associate, Department of Aeronautics and Astronautics, anirbanc@mit.edu), Alexandre N. Marques (Postdoctoral Associate, Department of Aeronautics and Astronautics, noll@mit.edu)
Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
Karen E. Willcox (Director, Oden Institute for Computational Engineering and Sciences, kwillcox@oden.utexas.edu)
University of Texas at Austin, Austin, TX, 78712, USA
Abstract
This paper develops mfEGRA, a multifidelity active learning method using data-driven adaptively
refined surrogates for failure boundary location in reliability analysis. This work addresses the issue of
prohibitive cost of reliability analysis using Monte Carlo sampling for expensive-to-evaluate high-fidelity
models by using cheaper-to-evaluate approximations of the high-fidelity model. The method builds on
the Efficient Global Reliability Analysis (EGRA) method, which is a surrogate-based method that uses
adaptive sampling for refining Gaussian process surrogates for failure boundary location using a single-
fidelity model. Our method introduces a two-stage adaptive sampling criterion that uses a multifidelity
Gaussian process surrogate to leverage multiple information sources with different fidelities. The method
combines expected feasibility criterion from EGRA with one-step lookahead information gain to refine
the surrogate around the failure boundary. The computational savings from mfEGRA depends on the
discrepancy between the different models, and the relative cost of evaluating the different models as
compared to the high-fidelity model. We show that accurate estimation of reliability using mfEGRA
leads to computational savings of ∼46% for an analytic multimodal test problem and 24% for a three-
dimensional acoustic horn problem, when compared to single-fidelity EGRA. We also show the effect of
using a priori drawn Monte Carlo samples in the implementation for the acoustic horn problem, where
mfEGRA leads to computational savings of 45% for the three-dimensional case and 48% for a rarer event
four-dimensional case as compared to single-fidelity EGRA.
Keywords: multi-fidelity, adaptive sampling, probability of failure, contour location, classification, Gaussian process, kriging, multiple information sources, EGRA, surrogate
1 Introduction
The presence of uncertainties in the manufacturing and operation of systems makes reliability analysis critical
for system safety. The reliability analysis of a system requires estimating the probability of failure, which
can be computationally prohibitive when the high fidelity model is expensive to evaluate. In this work,
we develop a method for efficient reliability estimation by leveraging multiple sources of information with
different fidelities to build a multifidelity approximation for the limit state function.
Reliability analysis for strongly nonlinear systems typically requires Monte Carlo sampling, which can incur substantial cost because of the numerous evaluations of expensive-to-evaluate high-fidelity models, as seen in Figure 1 (a). There are several methods that improve the convergence rate of Monte Carlo methods to decrease computational cost through Monte Carlo variance reduction, such as importance sampling [1, 2], the cross-entropy method [3], and subset simulation [4, 5]. However, such methods are outside the scope of this paper and will not be discussed further. Another class of methods reduces the computational cost by using approximations for the failure boundary or the entire limit state function. The popular methods that fall in the first category are first- and second-order reliability methods (FORM and SORM), which approximate
the failure boundary with linear and quadratic approximations around the most probable failure point [6, 7].
The FORM and SORM methods can be efficient for mildly nonlinear problems but cannot handle systems
with multiple failure regions. The methods that fall in the second category reduce computational cost by
replacing the high-fidelity model evaluations in the Monte Carlo simulation by cheaper evaluations from
adaptive surrogates for the limit state function as seen in Figure 1 (b).
Figure 1: Reliability analysis with (a) high-fidelity model, (b) single fidelity adaptive surrogate, and (c) multifidelity adaptive surrogate.
Estimating reliability requires accurately classifying samples as failed or safe, which requires surrogates that
accurately predict the limit state function around the failure boundary. Thus, the surrogates need to be
refined only in the region of interest (in this case, around the failure boundary) and do not require global
accuracy in prediction of the limit state function. The development of sequential active learning methods for
refining the surrogate around the failure boundary has been addressed in the literature using only a single
high-fidelity information source. Such methods fall in the same category as adaptively refining surrogates for
identifying stability boundaries, contour location, classification, sequential design of experiment (DOE) for
target region, etc. Typically, these methods are divided into using either Gaussian process (GP) surrogates or
support vector machines (SVM). Adaptive SVM methods have been implemented for reliability analysis and
contour location [8, 9, 10]. In this work, we focus on GP-based methods (sometimes referred to as kriging-
based) that use the GP prediction mean and prediction variance to develop greedy and lookahead adaptive
sampling methods. Efficient Global Reliability Analysis (EGRA) adaptively refines the GP surrogate around
the failure boundary by sequentially adding points that have maximum expected feasibility [11]. A weighted
integrated mean square criterion for refining the kriging surrogate was developed by Picheny et al. [12].
Echard et al. [13] proposed an adaptive Kriging method that refines the surrogate in the restricted set of
samples defined by a Monte Carlo simulation. Dubourg et al. [14] proposed a population-based adaptive
sampling technique for refining the kriging surrogate around the failure boundary. One-step lookahead
strategies for GP surrogate refinement for estimating the probability of failure were proposed by Bect et al. [15]
and Chevalier et al. [16]. A review of some surrogate-based methods for reliability analysis can be found
in Ref. [17]. However, all the methods mentioned above use a single source of information, which is the
high-fidelity model as illustrated in Figure 1 (b). This work presents a novel multifidelity active learning
method that adaptively refines the surrogate around the limit state function failure boundary using multiple
sources of information, thus, further reducing the active learning computational effort as seen in Figure 1
(c).
For several applications, in addition to an expensive high-fidelity model, there are potentially cheaper lower-fidelity models, such as simplified-physics models, coarse-grid models, data-fit models, and reduced-order models, that are readily available or can be built. This necessitates the development of multifidelity
methods that can take advantage of these multiple information sources [18]. Various multifidelity methods
have been developed in the context of GP-based Bayesian optimization [19, 20, 21]. While Bayesian opti-
mization also uses GP models and adaptive sampling [22, 23], we note that Bayesian optimization targets
a different problem from GP-based reliability analysis. In particular, the reliability analysis problem targets
the entire limit state function failure contour in the random variable space, whereas Bayesian optimization
targets finding the optimal design. Thus, the sampling criteria used for failure boundary location as com-
pared to optimization are different, and the corresponding needs and opportunities for multifidelity methods
are different. In the context of reliability analysis using active learning surrogates, there are few multifi-
delity methods available. Dribusch et al. [24] proposed a hierarchical bi-fidelity adaptive SVM method for
locating the failure boundary. The recently developed CLoVER [25] method is a multifidelity active learning
algorithm that uses a one-step lookahead entropy-reduction-based adaptive sampling strategy for refining
GP surrogates around the failure boundary. In this work, we develop a multifidelity extension of the EGRA
method [11] as EGRA has been rigorously tested on a wide range of reliability analysis problems.
We propose mfEGRA (multifidelity EGRA) that leverages multiple sources of information with different
fidelities and cost to accelerate active learning of surrogates for failure boundary identification. For single-
fidelity methods, the adaptive sampling criterion chooses where to sample next to refine the surrogate
around the failure boundary. The challenge in developing a multifidelity adaptive sampling criterion is that
we now have to answer two questions – (i) where to sample next, and (ii) what information source to use
for evaluating the next sample. This work proposes a new adaptive sampling criterion that allows the use
of multiple fidelity models. In our mfEGRA method, we combine the expected feasibility function used
in EGRA with a proposed weighted lookahead information gain to define the adaptive sampling criterion
for the multifidelity case. We use the Kullback-Leibler (KL) divergence to quantify the information gain and
derive a closed-form expression for the multifidelity GP case. The key advantage of the mfEGRA method
is the reduction in computational cost compared to single-fidelity active learning methods because it can
utilize additional information from multiple cheaper low-fidelity models along with the high-fidelity model
information. We demonstrate the computational efficiency of the proposed mfEGRA using a multimodal
analytic test problem and an acoustic horn problem with disjoint failure regions.
The rest of the paper is structured as follows. Section 2 provides the problem setup for reliability analysis
using multiple information sources. Section 3 describes the details of the proposed mfEGRA method along
with the complete algorithm. The effectiveness of mfEGRA is shown using an analytical multimodal test
problem and an acoustic horn problem in Section 4. The conclusions are presented in Section 5.
2 Problem Setup
The inputs to the system are the $N_z$ random variables $Z \in \Omega \subseteq \mathbb{R}^{N_z}$ with the probability density function $\pi$, where $\Omega$ denotes the random sample space. The vector of a realization of the random variables $Z$ is denoted by $z$.

The probability of failure of the system is $p_F = \mathbb{P}(g(Z) > 0)$, where $g: \Omega \mapsto \mathbb{R}$ is the limit state function. In this work, without loss of generality, the failure of the system is defined as $g(z) > 0$. The failure boundary is defined as the zero contour of the limit state function, $g(z) = 0$, and any other failure boundary, $g(z) = c$, can be reformulated as a zero contour (i.e., $g(z) - c = 0$).
One way to estimate the probability of failure for nonlinear systems is Monte Carlo simulation. The Monte Carlo estimate of the probability of failure $\hat{p}_F$ is
$$\hat{p}_F = \frac{1}{m}\sum_{i=1}^{m} \mathbb{I}_{\mathcal{G}}(z_i), \qquad (1)$$
where $z_i,\ i = 1, \dots, m$ are $m$ samples from the probability density $\pi$, $\mathcal{G} = \{z \mid z \in \Omega,\ g(z) > 0\}$ is the failure set, and $\mathbb{I}_{\mathcal{G}}: \Omega \mapsto \{0,1\}$ is the indicator function defined as
$$\mathbb{I}_{\mathcal{G}}(z) = \begin{cases} 1, & z \in \mathcal{G},\\ 0, & \text{else}. \end{cases} \qquad (2)$$
The probability of failure estimation requires many evaluations of the expensive-to-evaluate high-fidelity model for the limit state function $g$, which can make reliability analysis computationally prohibitive. The computational cost can be substantially reduced by replacing the high-fidelity model evaluations with cheap-to-evaluate surrogate model evaluations. However, to make accurate estimates of $\hat{p}_F$ using a surrogate model, the zero-contour of the surrogate model needs to approximate the failure boundary well. Adaptively
refining the surrogate around the failure boundary, while trading-off global accuracy, is an efficient way of
addressing the above.
The goal of this work is to make the adaptive refinement of surrogate models around the failure boundary
more efficient by using multiple models with different fidelities and costs instead of only using the high-
fidelity model. We develop a multifidelity active learning method that utilizes multiple information sources
to efficiently refine the surrogate to accurately locate the failure boundary. Let $g_l: \Omega \mapsto \mathbb{R}$, $l \in \{0, \dots, k\}$ be a collection of $k+1$ models for $g$ with associated cost $c_l(z)$ at location $z$, where the subscript $l$ denotes the information source. We define the model $g_0$ to be the high-fidelity model for the limit state function. The $k$ low-fidelity models of $g$ are denoted by $g_l$, $l = 1, \dots, k$. We use a multifidelity surrogate to simultaneously approximate all information sources while encoding the correlations between them. The adaptively refined multifidelity surrogate model predictions are used for the probability of failure estimation. The Monte Carlo estimate of the probability of failure is then estimated using the refined multifidelity surrogate and is denoted here by $\hat{p}_F^{\text{MF}}$. Next we describe the multifidelity surrogate model used in this work and the multifidelity active learning method used to sequentially refine the surrogate around the failure boundary.
3 mfEGRA: Multifidelity EGRA with Information Gain
In this section, we introduce multifidelity EGRA (mfEGRA) that leverages the $k+1$ information sources to efficiently build an adaptively refined multifidelity surrogate to locate the failure boundary.
3.1 mfEGRA method overview
The proposed mfEGRA method is a multifidelity extension to the EGRA method [11]. Section 3.2 briefly
describes the multifidelity GP surrogate used in this work to combine the different information sources. The
multifidelity GP surrogate is built using an initial DOE and then the mfEGRA method refines the surrogate
using a two-stage adaptive sampling criterion that:
1. selects the next location to be sampled using an expected feasibility function as described in Section 3.3;
2. selects the information source to be used to evaluate the next sample using a weighted lookahead
information gain criterion as described in Section 3.4.
The adaptive sampling criterion developed in this work enables us to use the surrogate prediction mean and
the surrogate prediction variance to make the decision of where and which information source to sample
next. Note that both of these quantities are available from the multifidelity GP surrogate used in this
work. Section 3.5 provides the implementation details and the algorithm for the proposed mfEGRA method.
Figure 2 shows a flowchart outlining the mfEGRA method.
3.2 Multifidelity Gaussian process
We use the multifidelity GP surrogate introduced by Poloczek et al. [19], which built on earlier work by Lam et al. [20], to combine information from the $k+1$ information sources into a single GP surrogate, $\hat{g}(l, z)$, that can simultaneously approximate all the information sources. The multifidelity GP surrogate can provide predictions for any information source $l$ and random variable realization $z$.

The multifidelity GP is built by making two modeling choices: (1) a GP approximation for the high-fidelity model $g_0$ as given by $\hat{g}(0,z) \sim \mathcal{GP}(\mu_0, \Sigma_0)$, and (2) independent GP approximations for the model discrepancy between the high-fidelity and the lower-fidelity models as given by $\delta_l \sim \mathcal{GP}(\mu_l, \Sigma_l)$ for $l = 1, \dots, k$. Here, $\mu_l$ denotes the mean function and $\Sigma_l$ denotes the covariance kernel for $l = 0, \dots, k$. Then the surrogate for model $l$ is constructed by using the definition $\hat{g}(l, z) = \hat{g}(0,z) + \delta_l(z)$. These modeling choices lead to the surrogate model $\hat{g} \sim \mathcal{GP}(\mu_{\text{pr}}, \Sigma_{\text{pr}})$ with prior mean function $\mu_{\text{pr}}$ and prior covariance kernel $\Sigma_{\text{pr}}$. The priors for $l = 0$ are
$$\mu_{\text{pr}}(0,z) = \mathbb{E}[\hat{g}(0,z)] = \mu_0(z), \qquad \Sigma_{\text{pr}}((0,z),(l',z')) = \operatorname{Cov}\left(\hat{g}(0,z), \hat{g}(l',z')\right) = \Sigma_0(z,z'), \qquad (3)$$
Figure 2: Flowchart showing the mfEGRA method.
and the priors for $l = 1, \dots, k$ are
$$
\begin{aligned}
\mu_{\text{pr}}(l, z) &= \mathbb{E}[\hat{g}(l, z)] = \mathbb{E}[\hat{g}(0,z)] + \mathbb{E}[\delta_l(z)] = \mu_0(z) + \mu_l(z),\\
\Sigma_{\text{pr}}((l, z),(l',z')) &= \operatorname{Cov}\left(\hat{g}(0,z) + \delta_l(z),\ \hat{g}(0,z') + \delta_{l'}(z')\right) = \Sigma_0(z,z') + \mathbb{1}_{l,l'}\,\Sigma_l(z,z'),
\end{aligned} \qquad (4)
$$
where $l' \in \{0, \dots, k\}$ and $\mathbb{1}_{l,l'}$ denotes the Kronecker delta. Once the prior mean function and the prior covariance kernels are defined using Equations (3) and (4), we can compute the posterior using standard rules of GP regression [26]. A more detailed description of the assumptions and the implementation of the multifidelity GP surrogate can be found in Ref. [19].
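For illustration, the prior mean and covariance of Equations (3) and (4) can be assembled as follows; this is a minimal sketch (not the authors' code), assuming user-supplied mean functions mu[l] and covariance kernels Sigma[l] for l = 0, ..., k:

def prior_mean(l, z, mu):
    # mu[0]: high-fidelity mean; mu[l], l >= 1: discrepancy mean (Equations (3)-(4))
    return mu[0](z) if l == 0 else mu[0](z) + mu[l](z)

def prior_cov(l, z, lp, zp, Sigma):
    # Sigma_pr((l,z),(l',z')) = Sigma_0(z,z') + kron(l,l') * Sigma_l(z,z') for l >= 1;
    # any pair involving the high-fidelity index l = 0 shares only Sigma_0.
    cov = Sigma[0](z, zp)
    if l == lp and l > 0:
        cov = cov + Sigma[l](z, zp)
    return cov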
At any given $z$, the surrogate model posterior distribution of $\hat{g}(l, z)$ is defined by the normal distribution with posterior mean $\mu(l, z)$ and posterior variance $\sigma^2(l, z) = \Sigma((l, z),(l, z))$. Consider that $n$ samples $\{[l_i, z_i]\}_{i=1}^{n}$ have been evaluated and these samples are used to fit the present multifidelity GP surrogate. Note that $[l, z]$ is the augmented vector of inputs to the multifidelity GP. Then the surrogate is refined around the failure boundary by sequentially adding samples. The next sample $z_{n+1}$ and the next information source $l_{n+1}$ used to refine the surrogate are found using the two-stage adaptive sampling method mfEGRA as described below.
3.3 Location selection: Maximize expected feasibility function
The first stage of mfEGRA involves selecting the next location $z_{n+1}$ to be sampled. The expected feasibility function (EFF), which was used as the adaptive sampling criterion in EGRA [11], is used in this work to select the location of the next sample $z_{n+1}$. The EFF defines the expectation of the sample lying within a band around the failure boundary (here, $\pm\epsilon(z)$ around the zero contour of the limit state function). The prediction mean $\mu(0,z)$ and the prediction standard deviation $\sigma(0,z)$ at any $z$ are provided by the multifidelity GP for the high-fidelity surrogate model. The multifidelity GP surrogate prediction at $z$ is the normal distribution $Y_z \sim \mathcal{N}(\mu(0,z), \sigma^2(0,z))$. Then the feasibility function at any $z$ is defined as being positive within the $\epsilon$-band around the failure boundary and zero otherwise, as given by
$$F(z) = \epsilon(z) - \min(|y|, \epsilon(z)), \qquad (5)$$
where $y$ is a realization of $Y_z$. The EFF is defined as the expectation of being within the $\epsilon$-band around the failure boundary, as given by
$$\mathbb{E}_{Y_z}[F(z)] = \int_{-\epsilon(z)}^{\epsilon(z)} \left(\epsilon(z) - |y|\right) f_{Y_z}(y)\, \mathrm{d}y. \qquad (6)$$
We will use $\mathbb{E}[F(z)]$ to denote $\mathbb{E}_{Y_z}[F(z)]$ in the rest of the paper. The integration in Equation (6) can be solved analytically to obtain [11]
$$
\begin{aligned}
\mathbb{E}[F(z)] ={}& \mu(0,z)\left[2\Phi\!\left(\frac{-\mu(0,z)}{\sigma(0,z)}\right) - \Phi\!\left(\frac{-\epsilon(z)-\mu(0,z)}{\sigma(0,z)}\right) - \Phi\!\left(\frac{\epsilon(z)-\mu(0,z)}{\sigma(0,z)}\right)\right]\\
&- \sigma(0,z)\left[2\phi\!\left(\frac{-\mu(0,z)}{\sigma(0,z)}\right) - \phi\!\left(\frac{-\epsilon(z)-\mu(0,z)}{\sigma(0,z)}\right) - \phi\!\left(\frac{\epsilon(z)-\mu(0,z)}{\sigma(0,z)}\right)\right]\\
&+ \epsilon(z)\left[\Phi\!\left(\frac{\epsilon(z)-\mu(0,z)}{\sigma(0,z)}\right) - \Phi\!\left(\frac{-\epsilon(z)-\mu(0,z)}{\sigma(0,z)}\right)\right],
\end{aligned} \qquad (7)
$$
where $\Phi$ is the cumulative distribution function and $\phi$ is the probability density function of the standard normal distribution. Similar to EGRA [11], we define $\epsilon(z) = 2\sigma(0,z)$ to balance exploration and exploitation. As noted before, we describe the method considering the zero contour as the failure boundary for convenience, but the proposed method can be used for locating the failure boundary at any contour level.

The location of the next sample is selected by maximizing the EFF, as given by
$$z_{n+1} = \underset{z}{\arg\max}\ \mathbb{E}[F(z)]. \qquad (8)$$
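The following Python sketch (an illustrative helper, not the authors' MATLAB implementation) evaluates the closed-form EFF of Equation (7) with $\epsilon(z) = 2\sigma(0,z)$ and selects $z_{n+1}$ by maximizing the EFF over a discrete candidate set rather than with the continuous optimizer used in the paper:

import numpy as np
from scipy.stats import norm

def expected_feasibility(mu, sigma, eps=None):
    # mu, sigma: posterior mean and standard deviation of the high-fidelity surrogate
    mu = np.asarray(mu, dtype=float)
    sigma = np.maximum(np.asarray(sigma, dtype=float), 1e-12)
    if eps is None:
        eps = 2.0 * sigma                        # epsilon(z) = 2 sigma(0,z)
    t0 = -mu / sigma
    tm = (-eps - mu) / sigma
    tp = (eps - mu) / sigma
    return (mu * (2.0 * norm.cdf(t0) - norm.cdf(tm) - norm.cdf(tp))
            - sigma * (2.0 * norm.pdf(t0) - norm.pdf(tm) - norm.pdf(tp))
            + eps * (norm.cdf(tp) - norm.cdf(tm)))   # Equation (7)

def select_location(candidates, mu, sigma):
    eff = expected_feasibility(mu, sigma)
    i = int(np.argmax(eff))
    return candidates[i], eff[i]                 # z_{n+1} from Equation (8), max EFF value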
3.4 Information source selection: Maximize weighted lookahead information gain
Given the location of the next sample at $z_{n+1}$ obtained using Equation (8), the second stage of mfEGRA selects the information source $l_{n+1}$ to be used for simulating the next sample by maximizing the information
gain. Information-gain-based approaches have been used previously for global optimization [27, 28, 21, 29],
optimal experimental design [30, 31], and uncertainty propagation in coupled systems [32]. Ref. [21] used
an information-gain-based approach for selecting the location and the information source for improving the
global accuracy of the multifidelity GP approximation for the constraints in global optimization using a
double-loop Monte Carlo sample estimate of the information gain. Our work differs from previous efforts
in that we develop a weighted information-gain-based sampling strategy for failure boundary identification
utilizing multiple fidelity models. In the context of information gain criterion for multifidelity GP, the
two specific contributions of our work are: (1) deriving a closed-form expression for KL divergence for the
multifidelity GP, which does not require double-loop Monte Carlo sampling, thus improving the robustness
and decreasing the cost of estimation of the acquisition function, and (2) using weighting strategies to address
the goal of failure boundary location for reliability analysis using multiple fidelity models.
The next information source is selected by using a weighted one-step lookahead information gain criterion.
This adaptive sampling strategy selects the information source that maximizes the information gain in the GP
surrogate prediction defined by the Gaussian distribution at any z. We measure the KL divergence between
the present surrogate predicted GP and a hypothetical future surrogate predicted GP when a particular information source is used to simulate the sample at $z_{n+1}$ to quantify the information gain.
We represent the present GP surrogate built using the $n$ available training samples by the subscript $P$ for convenience, as given by $\hat{g}_P(l, z) = \hat{g}(l, z \mid \{l_i, z_i\}_{i=1}^{n})$. Then the present surrogate predicted Gaussian distribution at any $z$ is
$$G_P(z) \sim \mathcal{N}(\mu_P(0,z), \sigma^2_P(0,z)),$$
where $\mu_P(0,z)$ is the posterior mean and $\sigma^2_P(0,z)$ is the posterior prediction variance of the present GP surrogate for the high-fidelity model built using the available training data till iteration $n$.

A hypothetical future GP surrogate can be understood as a surrogate built using the current GP as a generative model to create hypothetical future simulated data. The hypothetical future simulated data $y_F \sim \mathcal{N}(\mu_P(l_F, z_{n+1}), \sigma^2_P(l_F, z_{n+1}))$ is obtained from the present GP surrogate prediction at the location $z_{n+1}$ using a possible future information source $l_F \in \{0, \dots, k\}$. We represent a hypothetical future GP surrogate by the subscript $F$. Then a hypothetical future surrogate predicted Gaussian distribution at any $z$ is
$$G_F(z \mid z_{n+1}, l_F, y_F) \sim \mathcal{N}\left(\mu_F(0,z \mid z_{n+1}, l_F, y_F),\ \sigma^2_F(0,z \mid z_{n+1}, l_F, y_F)\right).$$
The posterior mean of the hypothetical future GP is affine with respect to $y_F$ and thus is distributed normally as given by
$$\mu_F(0,z \mid z_{n+1}, l_F, y_F) \sim \mathcal{N}\left(\mu_P(0,z),\ \bar{\sigma}^2(z \mid z_{n+1}, l_F)\right),$$
where $\bar{\sigma}^2(z \mid z_{n+1}, l_F) = \left(\Sigma_P((0,z),(l_F, z_{n+1}))\right)^2 / \Sigma_P((l_F, z_{n+1}),(l_F, z_{n+1}))$ [19]. The posterior variance of the hypothetical future GP surrogate $\sigma^2_F(0,z \mid z_{n+1}, l_F, y_F)$ depends only on the location $z_{n+1}$ and the source $l_F$, and can be replaced with $\sigma^2_F(0,z \mid z_{n+1}, l_F)$. Note that we do not need any new evaluations of the information source for constructing the future GP. The total lookahead information gain is obtained by integrating over all possible values of $y_F$ as described below.
Since both $G_P$ and $G_F$ are Gaussian distributions, we can write the KL divergence between them explicitly. The KL divergence between $G_P$ and $G_F$ for any $z$ is
$$D_{\text{KL}}\left(G_P(z)\,\|\,G_F(z \mid z_{n+1}, l_F, y_F)\right) = \log\frac{\sigma_F(0,z \mid z_{n+1}, l_F)}{\sigma_P(0,z)} + \frac{\sigma^2_P(0,z) + \left(\mu_P(0,z) - \mu_F(0,z \mid z_{n+1}, l_F, y_F)\right)^2}{2\sigma^2_F(0,z \mid z_{n+1}, l_F)} - \frac{1}{2}. \qquad (9)$$
The total KL divergence can then be calculated by integrating $D_{\text{KL}}(G_P(z)\,\|\,G_F(z \mid z_{n+1}, l_F, y_F))$ over the entire random variable space $\Omega$, as given by
$$\int_\Omega D_{\text{KL}}\left(G_P(z)\,\|\,G_F(z \mid z_{n+1}, l_F, y_F)\right) \mathrm{d}z = \int_\Omega \left[\log\frac{\sigma_F(0,z \mid z_{n+1}, l_F)}{\sigma_P(0,z)} + \frac{\sigma^2_P(0,z) + \left(\mu_P(0,z) - \mu_F(0,z \mid z_{n+1}, l_F, y_F)\right)^2}{2\sigma^2_F(0,z \mid z_{n+1}, l_F)} - \frac{1}{2}\right] \mathrm{d}z. \qquad (10)$$
The total lookahead information gain can then be calculated by taking the expectation of Equation (10) over all possible values of $y_F$, as given by
$$
\begin{aligned}
D_{\text{IG}}(z_{n+1}, l_F) &= \mathbb{E}_{y_F}\left[\int_\Omega D_{\text{KL}}\left(G_P(z)\,\|\,G_F(z \mid z_{n+1}, l_F, y_F)\right) \mathrm{d}z\right]\\
&= \int_\Omega \left[\log\frac{\sigma_F(0,z \mid z_{n+1}, l_F)}{\sigma_P(0,z)} + \frac{\sigma^2_P(0,z) + \mathbb{E}_{y_F}\!\left[\left(\mu_P(0,z) - \mu_F(0,z \mid z_{n+1}, l_F, y_F)\right)^2\right]}{2\sigma^2_F(0,z \mid z_{n+1}, l_F)} - \frac{1}{2}\right] \mathrm{d}z\\
&= \int_\Omega \left[\log\frac{\sigma_F(0,z \mid z_{n+1}, l_F)}{\sigma_P(0,z)} + \frac{\sigma^2_P(0,z) + \bar{\sigma}^2(z \mid z_{n+1}, l_F)}{2\sigma^2_F(0,z \mid z_{n+1}, l_F)} - \frac{1}{2}\right] \mathrm{d}z\\
&= \int_\Omega D(z \mid z_{n+1}, l_F)\, \mathrm{d}z,
\end{aligned} \qquad (11)
$$
where
$$D(z \mid z_{n+1}, l_F) = \log\frac{\sigma_F(0,z \mid z_{n+1}, l_F)}{\sigma_P(0,z)} + \frac{\sigma^2_P(0,z) + \bar{\sigma}^2(z \mid z_{n+1}, l_F)}{2\sigma^2_F(0,z \mid z_{n+1}, l_F)} - \frac{1}{2}.$$
In practice, we choose a discrete set $\mathcal{Z} \subset \Omega$ via Monte Carlo sampling to numerically integrate Equation (11), as given by
$$D_{\text{IG}}(z_{n+1}, l_F) = \int_\Omega D(z \mid z_{n+1}, l_F)\, \mathrm{d}z \approx \sum_{z \in \mathcal{Z}} D(z \mid z_{n+1}, l_F). \qquad (12)$$
The total information gain for the multifidelity GP can be estimated using single-loop Monte Carlo sampling instead of double-loop Monte Carlo sampling because of the closed-form expression derived in Equation (11). This improves the robustness and decreases the cost of estimating the acquisition function.
The total lookahead information gain evaluated using Equation (12) gives a metric of global information gain over the entire random variable space. However, we are interested in gaining more information around the failure boundary. In order to give more importance to gaining information around the failure boundary, we use a weighted version of the lookahead information gain normalized by the cost of the information source. In this work, we explore three different weighting strategies: (i) no weights, $w(z) = 1$, (ii) weights defined by the EFF, $w(z) = \mathbb{E}[F(z)]$, and (iii) weights defined by the probability of feasibility (PF), $w(z) = P[F(z)]$. The PF of the sample to lie within the $\pm\epsilon(z)$ bounds around the zero contour is
$$P[F(z)] = \Phi\!\left(\frac{\epsilon(z) - \mu(0,z)}{\sigma(0,z)}\right) - \Phi\!\left(\frac{-\epsilon(z) - \mu(0,z)}{\sigma(0,z)}\right). \qquad (13)$$
Weighting the information gain by either expected feasibility or probability of feasibility gives more importance to gaining information around the target region, in this case, the failure boundary.

The next information source $l_{n+1}$ is selected by maximizing the weighted lookahead information gain normalized by the cost of the information source, as given by
$$l_{n+1} = \underset{l \in \{0, \dots, k\}}{\arg\max}\ \sum_{z \in \mathcal{Z}} \frac{1}{c_l(z)}\, w(z)\, D(z \mid z_{n+1}, l_F = l). \qquad (14)$$
Note that the optimization problem in Equation (14) is a one-dimensional discrete variable problem. In this case, we only need $k+1$ (the number of available models) evaluations of the objective function to solve the optimization problem exactly, and typically $k$ is a small number.
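A minimal sketch of this source-selection step is given below (it is not the authors' implementation). It assumes a multifidelity GP object gp exposing the present posterior standard deviation gp.std and posterior covariance gp.cov on the augmented input (l, z), per-source costs that are constant over z as in the test problems of Section 4, and precomputed weights w(z) (e.g., the EFF of Equation (7)); the future-variance update sigma_F^2 = sigma_P^2 - sigma_bar^2 is the standard noise-free GP conditioning step, stated here as an assumption of the sketch:

import numpy as np

def lookahead_D(sigma_P, sigma_bar2, sigma_F):
    # D(z | z_{n+1}, l_F) from Equation (11), evaluated pointwise on the set Z
    sigma_P = np.maximum(sigma_P, 1e-12)
    sigma_F = np.maximum(sigma_F, 1e-12)
    return np.log(sigma_F / sigma_P) + (sigma_P**2 + sigma_bar2) / (2.0 * sigma_F**2) - 0.5

def select_source(z_next, sources, cost, gp, Z, w):
    best_l, best_val = None, -np.inf
    for l in sources:                                  # only k+1 objective evaluations
        k_vec = gp.cov((0, Z), (l, z_next))            # Sigma_P((0,z),(l_F,z_{n+1})) for z in Z
        k_nn = gp.cov((l, z_next), (l, z_next))        # Sigma_P((l_F,z_{n+1}),(l_F,z_{n+1}))
        sigma_bar2 = k_vec**2 / k_nn                   # variance of the future posterior mean
        sigma_P = gp.std((0, Z))
        sigma_F = np.sqrt(np.maximum(sigma_P**2 - sigma_bar2, 0.0))  # assumed GP update
        val = np.sum(w * lookahead_D(sigma_P, sigma_bar2, sigma_F)) / cost[l]
        if val > best_val:
            best_l, best_val = l, val
    return best_l                                      # l_{n+1} from Equation (14)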
3.5 Algorithm and implementation details
An algorithm describing the mfEGRA method is given in Algorithm 1. In this work, we evaluate all the models at the initial DOE. We generate the initial samples $z$ using Latin hypercube sampling and run all the models at each of those samples to get the initial training set $\{z_i, l_i\}_{i=1}^{n}$. The initial number of samples, $n$, can be decided based on the user's preference (in this work, we use cross-validation error). The EFF maximization problem given by Equation (8) is solved using the patternsearch function followed by multiple starts of a local optimizer through the GlobalSearch function in MATLAB. In practice, we choose a fixed set of realizations $\mathcal{Z} \subset \Omega$ at which the information gain is evaluated, as shown in Equation (12), for all iterations of mfEGRA. Due to the typically high cost associated with the high-fidelity model, we chose to evaluate all the $k+1$ models when the high-fidelity model is selected as the information source and update the GP hyperparameters in our implementation. All the $k+1$ model evaluations can be done in parallel. The algorithm is stopped when the maximum value of the EFF goes below $10^{-10}$. However, other stopping criteria can also be explored.
Although in this work we did not encounter any case of failed model evaluations, numerical solvers can sometimes fail to provide a converged result. In the context of reliability analysis, a failed model evaluation can be treated as failure of the system (defined here as $g_l(z) > 0$) at the particular random variable realization $z$. One possibility to handle failed model evaluations would be to let the value of the limit state function $g_l(z)$ go to an upper limit in order to indicate failure of the system.
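One simple way to realize this suggestion (a sketch under the stated assumption that a non-converged solve raises an exception) is to cap the returned limit state value at a large positive number so the sample is classified as failed:

G_MAX = 1.0e6  # assumed upper limit indicating failure, g_l(z) > 0

def safe_evaluate(model, z):
    try:
        return model(z)
    except RuntimeError:       # assumed signal for a failed/non-converged model evaluation
        return G_MAX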
A potential limitation of any GP-based method is dealing with the curse of dimensionality for high-
dimensional problems, where the number of samples to cover the space grows exponentially and the cost of
training GPs scales as the cube of the number of samples. The multifidelity method presented here alleviates
the cost of exploring the space by using cheaper low-fidelity model evaluations and restricts its queries of
the high-fidelity model to lie mostly around the failure boundary. The issue of the cost of training GPs with
increasing number of samples is not addressed here but can be potentially tackled through GP sparsification
techniques [33, 34]. Another strategy for reducing cost of training is through adaptive sampling strategies
that exploit parallel computing. Advancements in parallel computing have led to several parallel adaptive
sampling strategies for global optimization [35] and some parallel adaptive sampling methods for contour
location [36, 16]. In addition, parallel methods for multifidelity adaptive sampling are more challenging and need to be explored in both the fields of global optimization and contour location.
Algorithm 1 Multifidelity EGRA
Input: Initial DOE $\mathcal{X}_0 = \{z_i, l_i\}_{i=1}^{n}$, cost of each information source $c_l$
Output: Refined multifidelity GP $\hat{g}$
1: procedure mfEGRA($\mathcal{X}_0$)
2:   $\mathcal{X} = \mathcal{X}_0$  (set of training samples)
3:   Build initial multifidelity GP $\hat{g}$ using the initial set of training samples $\mathcal{X}_0$
4:   while stopping criterion is not met do
5:     Select next sampling location $z_{n+1}$ using Equation (8)
6:     Select next information source $l_{n+1}$ using Equation (14)
7:     Evaluate at sample $z_{n+1}$ using information source $l_{n+1}$
8:     $\mathcal{X} = \mathcal{X} \cup \{z_{n+1}, l_{n+1}\}$
9:     Build updated multifidelity GP $\hat{g}$ using $\mathcal{X}$
10:    $n \leftarrow n + 1$
11:  end while
12:  return $\hat{g}$
13: end procedure
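For readers who prefer code to pseudocode, the loop of Algorithm 1 can be sketched in Python as follows; MultifidelityGP, select_location, select_source, and expected_feasibility are assumed helpers (for example, the sketches in Sections 3.2-3.4), models[l] evaluates information source $g_l$, and cost[l] is its query cost:

def mfegra(models, cost, X0, y0, candidates, Z, eff_tol=1e-10, max_iter=200):
    gp = MultifidelityGP().fit(X0, y0)                  # initial DOE {(l_i, z_i)} and outputs
    for _ in range(max_iter):
        mu, sigma = gp.predict(0, candidates)           # high-fidelity surrogate prediction
        z_next, eff_max = select_location(candidates, mu, sigma)             # Equation (8)
        if eff_max < eff_tol:                           # stopping criterion: max EFF < 1e-10
            break
        w = expected_feasibility(*gp.predict(0, Z))     # EFF weights on the set Z
        l_next = select_source(z_next, range(len(models)), cost, gp, Z, w)   # Equation (14)
        gp = gp.update((l_next, z_next), models[l_next](z_next))
    return gp                                           # refined multifidelity GP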
4 Results
In this section, we demonstrate the effectiveness of the proposed mfEGRA method on an analytic multimodal
test problem and two different cases for an acoustic horn application. The probability of failure is estimated
through Monte Carlo simulation using the adaptively refined multifidelity GP surrogate.
4.1 Analytic multimodal test problem
The analytic test problem used in this work has two inputs and three models with different fidelities and costs.
This test problem has been used before in the context of reliability analysis in Ref. [11]. The high-fidelity
model of the limit state function is
$$g_0(z) = \frac{(z_1^2 + 4)(z_2 - 1)}{20} - \sin\!\left(\frac{5z_1}{2}\right) - 2, \qquad (15)$$
where $z_1 \sim \mathcal{U}(-4, 7)$ and $z_2 \sim \mathcal{U}(-3, 8)$ are uniformly distributed random variables. The domain of the function is $\Omega = [-4, 7] \times [-3, 8]$. The two low-fidelity models are
$$g_1(z) = g_0(z) + \sin\!\left(\frac{5z_1}{22} + \frac{5z_2}{44} + \frac{5}{4}\right), \qquad (16)$$
$$g_2(z) = g_0(z) + 3\sin\!\left(\frac{5z_1}{11} + \frac{5z_2}{11} + \frac{35}{11}\right). \qquad (17)$$
The cost of each fidelity model is taken to be constant over the entire domain and is given by $c_0 = 1$, $c_1 = 0.01$, and $c_2 = 0.001$. In this case, there is no noise in the observations from the different fidelity models. The failure boundary is defined by the zero contour of the limit state function ($g_0(z) = 0$) and the failure of the system is defined by $g_0(z) > 0$. Figure 3 shows the contour plot of $g_l(z)$ for the three models used for the analytic test problem along with the failure boundary for each of them.
Figure 3: Contours of $g_l(z)$ using the three fidelity models for the analytic test problem. Solid red line represents the zero contour that denotes the failure boundary.
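For reference, the three fidelity models of Equations (15)-(17) and their constant query costs can be written directly in Python (a transcription of the equations above, not the authors' code):

import numpy as np

def g0(z1, z2):   # high-fidelity limit state; failure when g0 > 0
    return (z1**2 + 4.0) * (z2 - 1.0) / 20.0 - np.sin(5.0 * z1 / 2.0) - 2.0

def g1(z1, z2):   # low-fidelity model 1, cost c1 = 0.01
    return g0(z1, z2) + np.sin(5.0 * z1 / 22.0 + 5.0 * z2 / 44.0 + 5.0 / 4.0)

def g2(z1, z2):   # low-fidelity model 2, cost c2 = 0.001
    return g0(z1, z2) + 3.0 * np.sin(5.0 * z1 / 11.0 + 5.0 * z2 / 11.0 + 35.0 / 11.0)

cost = {0: 1.0, 1: 0.01, 2: 0.001}   # c0, c1, c2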
We use an initial DOE of size 10 generated using Latin hypercube sampling. All the models are evaluated at these 10 samples to build the initial multifidelity surrogate. The reference probability of failure is estimated to be $\hat{p}_F = 0.3021$ using $10^6$ Monte Carlo samples of the $g_0$ model. The relative error in the probability of failure estimate using the adaptively refined multifidelity GP surrogate, defined by $|\hat{p}_F - \hat{p}_F^{\text{MF}}|/\hat{p}_F$, is used to assess the accuracy and computational efficiency of the proposed method. We repeat the calculations for 100 different initial DOEs to get the confidence bands on the results.
We first compare the accuracy of the method when different weights are used for the information gain
criterion in mfEGRA as seen in Figure 4. We can see that using weighted information gain (both EFF
and PF) performs better than the case when no weights are used when comparing the error confidence
bands. EFF-weighted information gain leads to only marginally lower errors in this case as compared to
PF-weighted information gain. Since we do not see any significant advantage of using PF as weights, and since we already use the EFF-based criterion to select the sample location, we propose using EFF-weighted information gain to make the implementation more convenient. Note that for other problems, it is possible that PF-weighted information gain may perform better. From here on, mfEGRA is used with the EFF-weighted information gain.
Figure 4: Effect of different weights for the information gain criterion in mfEGRA for the analytic test problem in terms of convergence of relative error in $p_F$ prediction (shown in log-scale) for 100 different initial DOEs. Solid lines represent the median and dashed lines represent the 25 and 75 percentiles.
The comparison of mfEGRA with single-fidelity EGRA shows considerable improvement in accuracy at substantially lower computational cost, as seen in Figure 5. In this case, to reach a median relative error below $10^{-3}$ in the $p_F$ prediction, mfEGRA requires a computational cost of 26 compared to EGRA, which requires a computational cost of 48 (a reduction of ∼46%). Note that we start both cases with the same 100 sets of initial samples. We also note that the original paper for the EGRA method [11] reports a computational cost of 35.1 for the mean relative error from 20 different initial DOEs to reach below $5 \times 10^{-3}$. We report the computational cost for the EGRA algorithm to reach a median relative error from 100 different initial DOEs below $10^{-3}$ to be 48 (in our case, the computational cost for the median relative error for EGRA to reach below $5 \times 10^{-3}$ is 40). The difference in results can be attributed to the different sets of initial DOEs, the GP implementations, different statistics of reported results, and different probability distributions used for the random variables.
Figure 5: Comparison of mfEGRA vs single-fidelity EGRA for the analytic test problem in terms of convergence of relative error in $p_F$ prediction (shown in log-scale) for 100 different initial DOEs.
Figure 6 shows the evolution of the expected feasibility function and the weighted lookahead information gain, which are the two stages of the adaptive sampling criterion used in mfEGRA. These metrics, along with the relative error in the probability of failure estimate, can be used to define an efficient stopping criterion, specifically when the adaptive sampling needs to be repeated for different sets of parameters (e.g., in reliability-based design optimization). Figure 7 shows the progress of mfEGRA at several iterations for a particular initial DOE. mfEGRA explores most of the domain using the cheaper $g_1$ and $g_2$ models in this case. The algorithm is stopped after 69 iterations when the expected feasibility function reached below $10^{-10}$; we can see that the surrogate contour accurately traces the true failure boundary defined by the high-fidelity model. As noted before, we evaluate all three models when the high-fidelity model is selected as the information source. In this case, mfEGRA makes a total of 21 evaluations of $g_0$, 77 evaluations of $g_1$, and 23 evaluations of $g_2$, including the initial DOE, to reach a value of the EFF below $10^{-10}$.
4.2 Acoustic horn
We demonstrate the effectiveness of mfEGRA for the reliability analysis of an acoustic horn for a three-
dimensional case and a rarer event probability of failure four-dimensional case. The acoustic horn model
used in this work has been used in the context of robust optimization by Ng et al. [37]. An illustration of the
acoustic horn is shown in Figure 8.
Figure 6: Evolution of adaptive sampling criteria (a) expected feasibility function, and (b) weighted information gain used in mfEGRA for 100 different initial DOEs.
Figure 7: Progress of mfEGRA at several iterations showing the surrogate prediction and the samples from different models for a particular initial DOE. HF refers to high-fidelity model $g_0$, LF1 refers to low-fidelity model $g_1$, and LF2 refers to low-fidelity model $g_2$.
4.2.1 Three-dimensional case
The inputs to the system are the three random variables listed in Table 1.
Table 1: Random variables used in the three-dimensional acoustic horn problem.

Random variable   Description                  Distribution   Lower bound   Upper bound   Mean   Standard deviation
k                 wave number                  Uniform        1.3           1.5           –      –
Z_u               upper horn wall impedance    Normal         –             –             50     3
Z_l               lower horn wall impedance    Normal         –             –             50     3
Figure 8: Two-dimensional acoustic horn geometry with $a = 0.5$, $b = 3$, $L = 5$, and the shape of the horn flare described by six equally-spaced half-widths $b_1 = 0.8$, $b_2 = 1.2$, $b_3 = 1.6$, $b_4 = 2$, $b_5 = 2.3$, $b_6 = 2.65$ [37].
The output of the model is the reflection coefficient $s$, which is a measure of the horn's efficiency. We define the failure of the system to be $s(z) > 0.1$. The limit state function is defined as $g(z) = s(z) - 0.1$, which defines the failure boundary as $g(z) = 0$. We use a two-dimensional acoustic horn model governed by the non-dimensional Helmholtz equation. In this case, a finite element model of the Helmholtz equation is the high-fidelity model $g_0$ with 35895 nodal grid points. The low-fidelity model $g_1$ is a reduced basis model with $N = 100$ basis vectors [37, 38]. In this case, the low-fidelity model is 40 times cheaper to evaluate than the high-fidelity model. The cost of evaluating the different models is taken to be constant over the entire random variable space. A more detailed description of the acoustic horn models used in this work can be found in Ref. [37].

The reference probability of failure is estimated to be $p_F = 0.3812$ using $10^5$ Monte Carlo samples of the high-fidelity model. We repeat the mfEGRA and the single-fidelity EGRA results using 10 different initial DOEs with 10 samples in each (generated using Latin hypercube sampling) to get the confidence bands on the results. The comparison of convergence of the relative error in the probability of failure is shown in Figure 9 for mfEGRA and single-fidelity EGRA. In this case, mfEGRA needs 19 equivalent high-fidelity solves to reach a median relative error value of below $10^{-3}$ as compared to 25 required by single-fidelity EGRA, leading to a 24% reduction in computational cost. The reduction in computational cost using mfEGRA is driven by the discrepancy between the models and the relative cost of evaluating the models. In the acoustic horn case, we see computational savings of 24% as compared to around 46% seen in the analytic test problem in Section 4.1. This can be explained by the substantial difference in relative costs: the low-fidelity model is 40 times cheaper than the high-fidelity model for the acoustic horn problem, whereas the two low-fidelity models are 100-1000 times cheaper than the high-fidelity model for the analytic test problem. The evolution of the mfEGRA adaptive sampling criteria can be seen in Figure 10.
Figure 9: Comparing relative error in the estimate of probability of failure (shown in log-scale) using mfEGRA
and single-fidelity EGRA for the three-dimensional acoustic horn application with 10 different initial DOEs.
Figure 10: Evolution of adaptive sampling criteria (a) expected feasibility function, and (b) weighted information gain for the three-dimensional acoustic horn application with 10 different initial DOEs.
Figure 11 shows that classification of the Monte Carlo samples using the high-fidelity model and the adaptively refined surrogate model for a particular initial DOE leads to very similar results. It also shows that in the acoustic horn application there are two disjoint failure regions and that the method is able to accurately capture both failure regions. The location of the samples from the different models when mfEGRA is used to refine the multifidelity GP surrogate for a particular initial DOE can be seen in Figure 12. The figure shows that most of the high-fidelity samples are selected around the failure boundary. For this DOE, mfEGRA requires 31 evaluations of the high-fidelity model and 76 evaluations of the low-fidelity model to reach an EFF value below $10^{-10}$.
Figure 11: Classification of Monte Carlo samples using (a) high-fidelity model, and (b) the final refined multifidelity GP surrogate for a particular initial DOE using mfEGRA for the three-dimensional acoustic horn problem.

Figure 12: Location of samples from different fidelity models using mfEGRA for the three-dimensional acoustic horn problem for a particular initial DOE. The cloud of points are the high-fidelity Monte Carlo samples near the failure boundary.
Similar to the work in Refs. [13, 15], EGRA and mfEGRA can also be implemented by limiting the search space for the adaptive sampling location in Equation (8) to the set of Monte Carlo samples (here, $10^5$) drawn from the given random variable distribution. The convergence of the relative error in the probability of failure estimate using this method improves for both mfEGRA and single-fidelity EGRA, as can be seen in Figure 13. In this case, mfEGRA requires 12 equivalent high-fidelity solves as compared to 22 high-fidelity solves required by single-fidelity EGRA to reach a median relative error below $10^{-3}$, leading to computational savings of around 45%.
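A sketch of this variant is shown below (an illustration, not the authors' implementation): the continuous maximization in Equation (8) is simply replaced by a maximization over a fixed set of a priori drawn Monte Carlo samples, which can also serve as the probability-of-failure estimation set; gp.predict and expected_feasibility are the assumed helpers used earlier:

import numpy as np

def select_location_discrete(gp, mc_samples):
    mu, sigma = gp.predict(0, mc_samples)      # surrogate prediction at the a priori MC samples
    eff = expected_feasibility(mu, sigma)      # EFF of Equation (7)
    i = int(np.argmax(eff))
    return mc_samples[i], eff[i]               # restricted-search version of Equation (8)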
Figure 13: Comparing relative error in the estimate of probability of failure (shown in log-scale) using
mfEGRA and single-fidelity EGRA by limiting the search space for adaptive sampling location to a set of
Monte Carlo samples drawn from the given random variable distribution for the three-dimensional acoustic
horn application with 10 different initial DOEs.
4.2.2 Four-dimensional case
For the four-dimensional acoustic horn problem, the inputs to the system are the three random variables used before along with a random variable $\xi$ defined by a truncated normal distribution representing manufacturing uncertainty, as listed in Table 2. The parameters defining the geometry of the acoustic horn (see Figure 8) are now given by $b_i + \xi$, $i = 1, \dots, 6$ to account for manufacturing uncertainty. In this case, we define failure of the system to be $s(z) > 0.16$ to make the failure a rarer event. The limit state function is defined as $g(z) = s(z) - 0.16$, which defines the failure boundary as $g(z) = 0$. The reference probability of failure is estimated to be $p_F = 7.2 \times 10^{-3}$ using $10^5$ Monte Carlo samples of the high-fidelity model. Note that the $p_F$ in the four-dimensional case is two orders of magnitude lower than in the three-dimensional case. The complexity of the problem increases because of the higher dimensionality as well as the rarer event probability of failure to be estimated.
Table 2: Random variables used in the four-dimensional acoustic horn problem.

Random variable   Description                  Distribution       Lower bound   Upper bound   Mean   Standard deviation
k                 wave number                  Uniform            1.3           1.5           –      –
Z_u               upper horn wall impedance    Normal             –             –             50     3
Z_l               lower horn wall impedance    Normal             –             –             50     3
ξ                 manufacturing uncertainty    Truncated Normal   -0.1          0.1           0      0.05
In this case, we present the results for EGRA and mfEGRA implemented by limiting the search space to a priori Monte Carlo samples (here, $10^5$) drawn from the given random variable distribution. Note that the lower probability of failure to be estimated here necessitates the use of a priori drawn Monte Carlo samples to efficiently achieve the required accuracy. The computational efficiency can be further improved by combining EGRA and mfEGRA with Monte Carlo variance reduction techniques, especially for problems with even lower probabilities of failure. We repeat the mfEGRA and the single-fidelity EGRA results using 10 different initial DOEs with 15 samples in each (generated using Latin hypercube sampling) to get the confidence bands on the results. The comparison of convergence of the relative error in the probability of failure is shown in Figure 14 for mfEGRA and single-fidelity EGRA. In this case, mfEGRA requires 25 equivalent high-fidelity solves as compared to 48 high-fidelity solves required by single-fidelity EGRA to reach a median relative error below $10^{-3}$, leading to computational savings of around 48%.
Figure 14: Comparing relative error in the estimate of probability of failure (shown in log-scale) using
mfEGRA and single-fidelity EGRA by limiting the search space for adaptive sampling location to a set of
Monte Carlo samples drawn from the given random variable distribution for the four-dimensional acoustic
horn application with 10 different initial DOEs.
5 Concluding remarks
This paper introduces the mfEGRA (multifidelity EGRA) method that refines the surrogate to accurately
locate the limit state function failure boundary (or any contour) while leveraging multiple information sources
with different fidelities and costs. The method selects the next location based on the expected feasibility
function and the next information source based on a weighted one-step lookahead information gain criterion
to refine the multifidelity GP surrogate of the limit state function around the failure boundary.
We show through three numerical examples that mfEGRA efficiently combines information from different models to reduce computational cost. The mfEGRA method leads to computational savings of ∼46% for a multimodal test problem and 24% for a three-dimensional acoustic horn problem over the single-fidelity EGRA method when used for estimating the probability of failure. When implemented by restricting the search space to a priori drawn Monte Carlo samples, the mfEGRA method showed even greater computational efficiency, with a 45% reduction in computational cost compared to the single-fidelity method for the three-dimensional acoustic horn problem. We see that using a priori drawn Monte Carlo samples improves the efficiency of both EGRA and mfEGRA, and the importance is further highlighted through the four-dimensional implementation of the acoustic horn problem, which requires estimating a rarer event probability of failure. For the four-dimensional acoustic horn problem, mfEGRA leads to computational savings of 48% as compared to the single-fidelity method. The driving factors for the reduction in computational cost are the discrepancy between the high- and low-fidelity models, and the relative cost of the low-fidelity models compared to the high-fidelity model. This information is directly encoded in the mfEGRA adaptive sampling criterion, helping it make the most efficient decisions.
Acknowledgements
This work has been supported in part by the Air Force Office of Scientific Research (AFOSR) MURI on man-
aging multiple information sources of multi-physics systems award numbers FA9550-15-1-0038 and FA9550-
18-1-0023, the Air Force Center of Excellence on multi-fidelity modeling of rocket combustor dynamics award
FA9550-17-1-0195, and the Department of Energy Office of Science AEOLUS MMICC award DE-SC0019303.
References
[1] Melchers, R., “Importance sampling in structural systems,” Structural Safety, Vol. 6, No. 1, 1989,
pp. 3–10.
[2] Liu, J. S., Monte Carlo strategies in scientific computing, Springer Science & Business Media, 2008.
[3] Kroese, D. P., Rubinstein, R. Y., and Glynn, P. W., “The cross-entropy method for estimation,”
Handbook of Statistics, Vol. 31, Elsevier, 2013, pp. 19–34.
[4] Au, S.-K. and Beck, J. L., “Estimation of small failure probabilities in high dimensions by subset
simulation,” Probabilistic Engineering Mechanics, Vol. 16, No. 4, 2001, pp. 263–277.
[5] Papaioannou, I., Betz, W., Zwirglmaier, K., and Straub, D., “MCMC algorithms for subset simulation,”
Probabilistic Engineering Mechanics , Vol. 41, 2015, pp. 89–103.
[6] Hohenbichler, M., Gollwitzer, S., Kruse, W., and Rackwitz, R., “New light on first-and second-order
reliability methods,” Structural Safety, Vol. 4, No. 4, 1987, pp. 267–284.
[7] Rackwitz, R., “Reliability analysis–a review and some perspectives,” Structural Safety, Vol. 23, No. 4,
2001, pp. 365–395.
[8] Basudhar, A., Missoum, S., and Sanchez, A. H., “Limit state function identification using support
vector machines for discontinuous responses and disjoint failure domains,” Probabilistic Engineering
Mechanics, Vol. 23, No. 1, 2008, pp. 1–11.
[9] Basudhar, A. and Missoum, S., “Reliability assessment using probabilistic support vector machines,”
International Journal of Reliability and Safety, Vol. 7, No. 2, 2013, pp. 156–173.
[10] Lecerf, M., Allaire, D., and Willcox, K., “Methodology for dynamic data-driven online flight capability
estimation,” AIAA Journal, Vol. 53, No. 10, 2015, pp. 3073–3087.
[11] Bichon, B. J., Eldred, M. S., Swiler, L. P., Mahadevan, S., and McFarland, J. M., “Efficient global
reliability analysis for nonlinear implicit performance functions,” AIAA Journal, Vol. 46, No. 10, 2008,
pp. 2459–2468.
[12] Picheny, V., Ginsbourger, D., Roustant, O., Haftka, R. T., and Kim, N.-H., “Adaptive designs of
experiments for accurate approximation of a target region,” Journal of Mechanical Design , Vol. 132,
No. 7, 2010, pp. 071008.
[13] Echard, B., Gayton, N., and Lemaire, M., “AK-MCS: an active learning reliability method combining
Kriging and Monte Carlo simulation,” Structural Safety, Vol. 33, No. 2, 2011, pp. 145–154.
[14] Dubourg, V., Sudret, B., and Bourinet, J.-M., “Reliability-based design optimization using kriging
surrogates and subset simulation,” Structural and Multidisciplinary Optimization, Vol. 44, No. 5, 2011,
pp. 673–690.
[15] Bect, J., Ginsbourger, D., Li, L., Picheny, V., and Vazquez, E., “Sequential design of computer exper-
iments for the estimation of a probability of failure,” Statistics and Computing, Vol. 22, No. 3, 2012,
pp. 773–793.
[16] Chevalier, C., Bect, J., Ginsbourger, D., Vazquez, E., Picheny, V., and Richet, Y., “Fast parallel
kriging-based stepwise uncertainty reduction with application to the identification of an excursion set,”
Technometrics, Vol. 56, No. 4, 2014, pp. 455–465.
[17] Moustapha, M. and Sudret, B., “Surrogate-assisted reliability-based design optimization: a survey and
a unified modular framework,” Structural and Multidisciplinary Optimization, 2019, pp. 1–20.
[18] Peherstorfer, B., Willcox, K., and Gunzburger, M., “Survey of multifidelity methods in uncertainty
propagation, inference, and optimization,” SIAM Review, Vol. 60, No. 3, 2018, pp. 550–591.
[19] Poloczek, M., Wang, J., and Frazier, P., “Multi-information source optimization,” Advances in Neural
Information Processing Systems, 2017, pp. 4291–4301.
[20] Lam, R., Allaire, D., and Willcox, K., “Multifidelity optimization using statistical surrogate modeling for
non-hierarchical information sources,” 56th AIAA/ASCE/AHS/ASC Structures, Structural Dynamics,
and Materials Conference, 2015.
[21] Ghoreishi, S. F. and Allaire, D., “Multi-information source constrained Bayesian optimization,” Struc-
tural and Multidisciplinary Optimization, Vol. 59, No. 3, 2019, pp. 977–991.
[22] Frazier, P. I., “A tutorial on bayesian optimization,” arXiv preprint arXiv:1807.02811 , 2018.
[23] Jones, D. R., “A taxonomy of global optimization methods based on response surfaces,” Journal of
Global Optimization, Vol. 21, No. 4, 2001, pp. 345–383.
[24] Dribusch, C., Missoum, S., and Beran, P., “A multifidelity approach for the construction of explicit de-
cision boundaries: application to aeroelasticity,” Structural and Multidisciplinary Optimization, Vol. 42,
No. 5, 2010, pp. 693–705.
[25] Marques, A., Lam, R., and Willcox, K., “Contour location via entropy reduction leveraging multiple
information sources,” Advances in Neural Information Processing Systems, 2018, pp. 5217–5227.
[26] Rasmussen, C. E. and Nickisch, H., “Gaussian processes for machine learning (GPML) toolbox,” Journal
of Machine Learning Research, Vol. 11, No. Nov, 2010, pp. 3011–3015.
[27] Villemonteix, J., Vazquez, E., and Walter, E., “An informational approach to the global optimization
of expensive-to-evaluate functions,” Journal of Global Optimization , Vol. 44, No. 4, 2009, pp. 509–534.
[28] Hennig, P. and Schuler, C. J., “Entropy search for information-efficient global optimization,” The Jour-
nal of Machine Learning Research, Vol. 13, No. 1, 2012, pp. 1809–1837.
[29] Hernández-Lobato, J. M., Hoffman, M. W., and Ghahramani, Z., “Predictive entropy search for efficient
global optimization of black-box functions,” Advances in Neural Information Processing Systems, 2014,
pp. 918–926.
[30] Huan, X. and Marzouk, Y. M., “Simulation-based optimal Bayesian experimental design for nonlinear
systems,” Journal of Computational Physics, Vol. 232, No. 1, 2013, pp. 288–317.
[31] Villanueva, D. and Smarslok, B. P., “Using Expected Information Gain to Design Aerothermal Model
Calibration Experiments,” 17th AIAA Non-Deterministic Approaches Conference, Kissimmee, FL,
USA, 2015.
[32] Chaudhuri, A., Lam, R., and Willcox, K., “Multifidelity uncertainty propagation via adaptive surrogates
in coupled multidisciplinary systems,” AIAA Journal, 2018, pp. 235–249.
[33] Williams, C. K. and Rasmussen, C. E., Gaussian processes for machine learning, Vol. 2, MIT press
Cambridge, MA, 2006.
[34] Burt, D., Rasmussen, C. E., and Van Der Wilk, M., “Rates of Convergence for Sparse Variational
Gaussian Process Regression,” International Conference on Machine Learning, 2019, pp. 862–871.
[35] Haftka, R. T., Villanueva, D., and Chaudhuri, A., “Parallel surrogate-assisted global optimization with expensive functions–a survey,” Structural and Multidisciplinary Optimization, Vol. 54, No. 1, 2016, pp. 3–13.
[36] Viana, F. A., Haftka, R. T., and Watson, L. T., “Sequential sampling for contour estimation with concurrent function evaluations,” Structural and Multidisciplinary Optimization, Vol. 45, No. 4, 2012, pp. 615–618.
[37] Ng, L. W. and Willcox, K. E., “Multifidelity approaches for optimization under uncertainty,” International Journal for Numerical Methods in Engineering, Vol. 100, No. 10, 2014, pp. 746–772.
[38] Eftang, J. L., Huynh, D., Knezevic, D. J., and Patera, A. T., “A two-step certified reduced basis method,” Journal of Scientific Computing, Vol. 51, No. 1, 2012, pp. 28–58.
This paper presents a data-driven approach for the online updating of the flight envelope of an unmanned aerial vehicle subjected to structural degradation. The main contribution of the work is a general methodology that leverages both physics-based modeling and data to decompose tasks into two phases: expensive offline simulations to build an efficient characterization of the problem and rapid data-driven classification to support online decision making. In the approach, physics-based models at the wing and vehicle level run offline to generate libraries of information covering a range of damage scenarios. These libraries are queried online to estimate vehicle capability states. The state estimation and associated quantification of uncertainty are achieved by Bayesian classification using sensed strain data. The methodology is demonstrated on a conceptual unmanned aerial vehicle executing a pullup maneuver, in which the vehicle flight envelope is updated dynamically with onboard sensor information. During vehicle operation, the maximum maneuvering load factor is estimated using structural strain sensor measurements combined with physics-based information from precomputed damage scenarios that consider structural weakness. Compared to a baseline case that uses a static as-designed flight envelope, the self-aware vehicle achieves both an increase in probability of executing a successful maneuver and an increase in overall usage of the vehicle capability.