
G-DIF: A geospatial data integration framework to rapidly estimate post-earthquake damage

Sabine Loos a), M.EERI, David Lallemant b), M.EERI, Jack Baker a), M.EERI, Jamie McCaughey e), Sang-Ho Yun c), Nama Budhathoki d), Feroz Khan b), Ritika Singh d)
While unprecedented amounts of building damage data are now produced after earthquakes, stakeholders do not have a systematic method to synthesize and evaluate damage information, thus leaving many datasets unused. We propose a Geospatial Data Integration Framework (G-DIF) that employs regression kriging to combine a sparse sample of accurate field surveys with spatially exhaustive, though uncertain, damage data from forecasts or remote sensing. The framework can be implemented after an earthquake to produce a spatially-distributed estimate of damage and, importantly, its uncertainty. An example application with real data collected after the 2015 Nepal earthquake illustrates how regression kriging can combine a diversity of datasets, and downweight uninformative sources, reflecting its ability to accommodate context-specific variations in data type and quality. Through a sensitivity analysis on the number of field surveys, we demonstrate that with only a few surveys, this method can provide more accurate results than a standard engineering forecast.
INTRODUCTION
From rapid engineering forecasts to crowdsourced maps, unprecedented amounts of building
damage data are now being produced after earthquakes. The 2010 Haiti earthquake was the
first time that response and recovery stakeholders had access to this amount of damage data,
due to both technological advancements in remote sensing data acquisition and mandates to
make that data openly available after major disasters (Corbane et al., 2011; Kerle and Hoffman,
2013). In fact, after 2010 there was a spike in the number of damage-related maps posted on ReliefWeb, a global information sharing site devoted to humanitarian disasters, in response to major earthquakes, even though these events had estimated economic damages similar to those of earlier events (Figure 1).
a) Stanford University, Stanford, CA 94305
b) Earth Observatory of Singapore, Nanyang Technological University, Singapore
c) Jet Propulsion Laboratory, California Institute of Technology, 4800 Oak Grove Drive, Pasadena, CA 91109
d) Kathmandu Living Labs, Kathmandu, Nepal
e) Institute for Environmental Decisions, Dept. Environmental Systems Science, ETH Zürich, Zürich, Switzerland
Figure 1. The number of damage-related maps posted on ReliefWeb, a disaster information sharing site, has increased since the 2010 Haiti earthquake. We would expect a similar number of maps for major events with similar estimated economic damages (shown in 2019 USD). The map counts were scraped from ReliefWeb and economic damages were retrieved from EM-DAT (United Nations Office for the Coordination of Humanitarian Affairs, 2019; Université catholique de Louvain (UCL) - CRED and Guha-Sapir).
Counterintuitively, the increase in data is problematic since stakeholders, such as affected governments, multilateral donor organizations, and humanitarian organizations, receive a barrage of information and maps with unverified competing damage estimates (Kerle, 2013). Often, data from new and untested methods are left unused when decisions need to be made quickly (Hunt and Specht, 2019). Stakeholders do not have a systematic method to quickly assess the accuracy of these data sources or to synthesize them. Furthermore, it is common for damage to be quantified using metrics that stakeholders cannot use to make crucial decisions within weeks of an earthquake (Bhattacharjee et al., 2018). For example, in as little as two weeks, the affected government uses damage data to estimate total losses for the Post Disaster Needs Assessment (PDNA) to request recovery aid. It is unclear how to 1) translate multiple remotely-sensed damage maps that show damage intensity per pixel, like the maps shown in Kerle and Hoffman (2013), to usable metrics to estimate loss and 2) know which map is most accurate. If damage estimates are inaccurate in the PDNA, the affected government could under- or overestimate the amount of aid requested, and subsequently distributed, for recovery. Because of these issues, many damage data are left unused. This paper outlines a Geospatial Data Integration Framework (G-DIF) to systematically integrate multiple sources of damage data into a single spatially distributed estimate of damage with quantified uncertainty to ease decision-making and improve the accuracy of post-earthquake damage estimates.
Integrating post-earthquake damage data is challenging since they are produced at different times with varying geospatial coverages, formats, and levels of uncertainty. While a few research studies have attempted to improve the accuracy of remote sensing and crowdsourced damage data, none have developed generalized methods to combine multiple data sources into a single, high-resolution, and spatially distributed estimate of building damage. For example, Booth et al. (2011) used Bayesian analysis to update the ratio of collapsed buildings in an affected area from manual assessments of satellite imagery with additional satellite assessments and field surveys after the 2010 Haiti earthquake but produced collapse probability distributions for four low-resolution land-use classes rather than high-resolution spatial estimates. Alternatively, some studies treat post-earthquake damage data as inputs and validation for vulnerability curves within an engineering forecast (e.g. Gunasekera et al., 2018; Huyck, 2015), but do not update the final damage estimate itself. Rather than estimating damage, some studies have used multiple damage data to develop maps of shaking intensity (e.g. Monfort et al., 2019). Finally, Lallemant and Kiremidjian (2013) applied cokriging to integrate a crowdsourced assessment with a set of field surveys, but this method was not generalized to incorporate multiple damage data sources.
As opposed to existing methods, which rely on only one to two damage datasets, we propose a framework that is able to integrate multiple heterogeneous data sources to produce a single spatial damage prediction in the weeks after an earthquake. Specifically, the geostatistical model, regression kriging, implemented in G-DIF requires a limited sample of primary damage data from field surveys, which are accurate but have low spatial coverage, to predict damage using secondary damage data, which have lower accuracy but higher spatial coverage. Within this framework, we employ a geostatistical integration method, since damage to nearby buildings is likely correlated within the range of spatial correlation of ground motion because of similarities in construction age and material, local soil conditions, and multiple other factors (Shome et al., 2012). By modeling this spatial correlation parametrically, G-DIF does not rely on large field survey samples as training data, unlike most machine learning models. Therefore, instead of relying on a model that is built with training data from one location and may not transfer well between different built environments and different data sources, G-DIF can be developed after an event using its specific data, leading to locally calibrated damage estimates. Because of these features, similar geostatistical techniques have been previously applied to integrate data in other fields such as for mapping atmospheric optical thickness (e.g. Chatterjee et al., 2010) and soil properties (e.g. Hengl et al., 2004; Thompson et al., 2010).
In this paper, we illustrate the implementation of the framework with an example application using real damage data collected after the 2015 Nepal Earthquake. In this example, we show how G-DIF produces a single map of damage and a map of the estimation uncertainty, which can be used to model economic losses and guide further field surveying, respectively. Compared to traditional methods of rapidly estimating post-earthquake damage, G-DIF produces a damage estimate that has lower overall error and higher resolution and is specific to each context.
POST-EARTHQUAKE DAMAGE DATA SUITED FOR G-DIF
G-DIF makes use of two types of damage information: primary measurement data with high accuracy and sparse spatial coverage, plus secondary proxy data with low accuracy and dense spatial coverage. Examples of primary data include field surveys of damage, while secondary data include engineering forecasts, remotely-sensed proxies, or relevant geospatial covariates from before or after the event (e.g. intensity or elevation). All information is assumed to be numerical (e.g. collapse rate) rather than descriptive (e.g. social media posts). In this section, we outline the time of availability and format of the damage data suited for G-DIF, as shown in Figure 2.
FIELD SURVEYS
Field surveys of damaged buildings are often conducted following earthquakes. These include surveys conducted by reconnaissance teams to understand the scale and type of building damage, rapid engineering safety evaluations to inform people of the safety of reoccupying buildings, and detailed, recovery-oriented surveys as time progresses (Earthquake Engineering Research Institute, 2015; Lallemant et al., 2017). These field surveys include an evaluation of the level of damage for each inspected building. The two most prevalent methods to assign damage levels are the ATC-20 methodology and the EMS-98 grading system, where engineers classify building damage in damage states or grades, respectively, based on descriptive damage conditions (Applied Technology Council, 1989; Grünthal, 1998). Since engineers inspect each building from the ground, field survey assessments are the most accurate measurement of damage relative to other damage data. The timing of early field surveys varies between disasters; past examples from the REACH survey, the government, and reconnaissance teams have shown organized surveys to be conducted in the first 6 weeks (Shelter Cluster Nepal, 2015; Lallemant et al., 2017; Earthquake Engineering Research Institute, 2015). While full coverage of on-the-ground surveys takes months to even years after a major event, G-DIF leverages these early surveys to provide calibration of predictions and constraints at the survey locations.
Figure 2. Timeline of availability of post-earthquake damage data suited for G-DIF based on Lallemant et al. (2017)'s review of damage assessments. Data sources with lower accuracy but dense spatial coverage are available soonest after an earthquake. Once a limited sample of field surveys is collected, enough data are available for G-DIF. The time to collect a sufficient number of field surveys can vary by region (in Nepal, it could feasibly be done in a couple of weeks); however, a couple of weeks is sufficient for early recovery decisions.
ENGINEERING FORECASTS
Engineering forecasts are near-real-time predictions of regional impact available within hours, as soon as a map of shaking intensity can be derived from the magnitude and location of the earthquake source (Jaiswal et al., 2009). Multiple global systems exist, the most widely used being the Prompt Assessment of Global Earthquakes for Response (PAGER) system (Jaiswal and Wald, 2011). These systems typically use an analytical or empirical model that relates shaking intensity to impact measures such as building damage, casualties, or economic loss. These models usually rely on information on the earthquake shaking in terms of peak ground motion or intensity, building and population exposure, and fragility functions (Erdik et al., 2014). While systems like PAGER aggregate their models to country-level impact estimates, alternative systems, such as the Quake Loss Assessment for Response and Mitigation (QLARM), provide spatially distributed model predictions (Trendafiloski et al., 2009). Since engineering forecasts are model-based, rather than observation-based, these predictions are inherently uncertain, especially in regions with limited seismic stations and building inventory data (Wald et al., 2012; Erdik et al., 2014).
REMOTE SENSING-DERIVED DAMAGE DATA
Remote sensing-derived damage data are observations related to damage, retrieved from earth observation technologies such as sensors mounted on satellites, aircraft, or unmanned aerial vehicles. These signals can be interpreted automatically through computer algorithms or manually by humans, each with a range of formats (Dong and Shan, 2013; Kerle, 2013). Depending on the interpretation method, the data are either damage proxies, which provide an idea of damage intensity, or assessments, which provide direct measurements of damage. For example, the Advanced Rapid Imaging and Analysis project at NASA's Jet Propulsion Laboratory and California Institute of Technology produces damage proxy maps (DPM) for major disasters based on an automatic change detection between two pairs of images from interferometric synthetic-aperture radar (InSAR) data, thus providing a measure of intensity (Yun et al., 2015). Alternatively, digital humanitarian groups, such as Humanitarian OpenStreetMap Team (HOT) or the Global Earth Observation-Catastrophe Assessment Network (GEO-CAN), have manually identified damaged and collapsed buildings in optical satellite and aerial imagery, respectively (Westrope et al., 2014; Loos et al., 2018; Ghosh et al., 2011). The availability of remote sensing-derived damage data depends on the retrieval of the underlying remote sensing data, typically within a few days to a couple of weeks (Dong and Shan, 2013; Lallemant et al., 2017). While remote sensing-derived damage data have denser spatial coverage than field surveys, these estimates have varying accuracy depending on the type of imagery or interpretation used (Loos et al., 2018; Dong and Shan, 2013; Monfort et al., 2019).
GEOSPATIAL DATA INTEGRATION FRAMEWORK
Our goal is to estimate the true building damage, $Z$, which is the assigned damage grade for a building from a field survey. We formulate the true damage as a function of location, $s$, so $Z(s)$ is a continuous variable. The region is discretized into a grid, so that $Z(s)$ is defined at a countable number of locations. When the grid dimension encompasses multiple buildings, $Z$ can be defined as the average damage grade (hereafter referred to as mean damage) of the buildings or the fraction of buildings that fall within a given grade.
We consider the true damage as a random spatial process composed of two parts: 1) the mean surface, which is the average damage throughout space and 2) small-scale fluctuations around the mean surface. In the case of earthquake-induced building damage, the mean surface will exhibit a general trend in space, because of characteristics such as shaking intensity that have large-scale spatial variation. We model this trend parametrically. We expect the small-scale fluctuations (hereafter the residuals) to exist, resulting from smaller-scale similarities in, for example, construction characteristics and local soil conditions. Because of the small-scale similarities, we model the residuals as stochastic and spatially auto-correlated, that is, correlated with themselves between two locations.
The true building damage $Z$ at a single location $s$ can therefore be represented as the sum of the trend, $m(s)$, and stochastic residual, $\varepsilon(s)$,
$$Z(s) = m(s) + \varepsilon(s). \tag{1}$$
To illustrate, consider two communities $A$ and $B$: community $A$ is closer to the earthquake source and experienced greater shaking, and therefore damage, than the more distant community $B$. The average difference in damage between $A$ and $B$ is represented by the trend, $m(s)$. Beyond that, the buildings in the grids in and around $A$ are constructed similarly (built with the same material in the same year), causing similar damage. The local similarities in damage surrounding a grid are represented by the spatially correlated residual $\varepsilon(s)$.
Note that $Z(s)$ is defined as the true damage, since a field surveyed assessment is the most accurate measurement of damage available after an earthquake. Uncertainty in a field survey still exists due to the subjectivity of the surveyor, and the additional uncertainty introduced from aggregating the surveys to a grid. Here, however, we consider $Z(s)$ to be exact and only account for the uncertainty in the estimation of the trend and the spatially-correlated residuals.
G-DIF capitalizes on 1) the correlation between the sparse field surveys and secondary damage data to estimate the trend and 2) the auto-correlation between the field surveys to estimate the residuals. The geostatistical data integration model implemented in G-DIF is regression kriging (also known as residual kriging), a multivariate geostatistical regression technique, which consists of two separate models for the trend and the residuals (Odeh et al., 1994). Separate modeling of the trend and residuals allows for alternative regressions that consider nonlinear relationships between primary and secondary data and separate interpretation of each model's results. The main steps of the framework are in Figure 3.
Figure 3. G-DIF steps to produce spatial estimates of regional damage.
DATA PRE-PROCESSING
We separate the input data for G-DIF into two sets of locations. There are $p$ secondary datasets, $X_1, \ldots, X_p$, that are spatially exhaustive and available at all $n$ locations, with an additional set of primary field survey data at a subset of $n_{fs}$ locations. The collocated primary and secondary data at the $n_{fs}$ field survey locations are used for developing a regression function, which is then used to estimate the trend at all $n$ locations. Similarly, the spatial correlation model is developed using the $n_{fs}$ field locations. Generally, the set of field surveys should be large enough to build a regression model for the trend ($n_{fs} \gg p$) and have samples at each damage level and varying distances from each other. In this paper, we assume that the set of field surveys includes observations of the full range of damage levels and is carried out at random grids distributed throughout the spatial domain in order to produce unbiased estimates of the trend and variogram (this assumption has important implications for survey sampling, which we revisit in the sensitivity analysis and conclusion sections). The vector of field surveys ($\mathbf{Z}$) and matrix of secondary datasets ($\mathbf{X}$) for model development are
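$$
\mathbf{Z} = \begin{bmatrix} Z(s_1) \\ \vdots \\ Z(s_{n_{fs}}) \end{bmatrix}, \qquad
\mathbf{X} = \begin{bmatrix} X_1(s_1) & \cdots & X_p(s_1) \\ \vdots & \ddots & \vdots \\ X_1(s_{n_{fs}}) & \cdots & X_p(s_{n_{fs}}) \end{bmatrix}.
$$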
To model the trend, we develop a regression function, $f$, which predicts the true damage at the field survey locations, $\mathbf{Z}$, as a function of the damage from the secondary data, $\mathbf{X}$. We use the developed regression function to estimate the trend at a single, unknown location, $s_0$:
$$\hat{m}(s_0) = f(\mathbf{X}(s_0)). \tag{2}$$
TREND MODEL
The function $f$ is the modeler's choice and will generally be earthquake-specific. Because the choice of trend model is likely to be dependent on the data available, it is important to develop this function manually to obtain accurate estimates of the final damage. It is common to apply ordinary least squares (OLS) for trend estimation. Alternatively, generalized least squares (GLS), which weights observations by their spatial covariance, accounts for spatial correlation in the residuals and leads to an unbiased estimate of the coefficients. The use of GLS leads to results most similar to estimating the trend and residual simultaneously, as with universal kriging (Hengl et al., 2003; Chiles and Delfiner, 2012). In either formulation, both linear and nonlinear least-squares regression functions can be applied. Other functions such as generalized additive models, regression trees, and artificial neural networks have also been explored within this general approach (McBratney et al., 2000; Grujic, 2017; Motaghian and Mohammadi, 2011). In addition, separate trend models can be developed for different regions that have varying coverage of secondary data. This could be the case for imagery-based damage data that can be limited in geographical extent, which we demonstrate in our application to Nepal.
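The OLS and GLS estimators mentioned above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the data are synthetic, the function names (`fit_ols`, `fit_gls`, `predict_trend`) are hypothetical, and the exponential covariance with its sill and range is an arbitrary stand-in for a fitted spatial correlation model.

```python
import numpy as np

def fit_ols(X, z):
    """Ordinary least squares with an intercept column."""
    A = np.column_stack([np.ones(len(z)), X])
    beta, *_ = np.linalg.lstsq(A, z, rcond=None)
    return beta

def fit_gls(X, z, C):
    """Generalized least squares: beta = (A' C^-1 A)^-1 A' C^-1 z."""
    A = np.column_stack([np.ones(len(z)), X])
    Ci_A = np.linalg.solve(C, A)   # C^-1 A without forming the inverse
    Ci_z = np.linalg.solve(C, z)   # C^-1 z
    return np.linalg.solve(A.T @ Ci_A, A.T @ Ci_z)

def predict_trend(beta, X_new):
    """Evaluate the linear trend at new locations."""
    return np.column_stack([np.ones(len(X_new)), X_new]) @ beta

# Synthetic stand-in data (hypothetical, not the Nepal data).
rng = np.random.default_rng(0)
n_fs, p = 100, 3
coords = rng.uniform(0.0, 50.0, size=(n_fs, 2))   # survey locations, km
X = rng.normal(size=(n_fs, p))                    # secondary data
true_beta = np.array([3.0, 0.8, 0.4, 0.0])        # last dataset is uninformative
z = predict_trend(true_beta, X) + rng.normal(scale=0.3, size=n_fs)

# Assumed exponential covariance between survey locations (sill 0.09, range 5 km).
d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
C = 0.09 * np.exp(-d / 5.0) + 1e-8 * np.eye(n_fs)

beta_ols = fit_ols(X, z)
beta_gls = fit_gls(X, z, C)
```

With an identity covariance matrix the GLS estimate reduces to the OLS estimate, which is a quick sanity check on any implementation.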
SPATIAL CORRELATION MODEL
With the developed trend function we estimate the trend at all $n$ locations and calculate the residuals at each of the $n_{fs}$ field surveyed locations:
$$\varepsilon(s_\alpha) = Z(s_\alpha) - \hat{m}(s_\alpha), \quad \text{for } \alpha = 1, \ldots, n_{fs}. \tag{3}$$
Using the calculated residuals, we perform ordinary kriging to estimate the residuals at the unknown locations using a spatial correlation model. The estimated residual at a single, unknown location is the weighted sum of the known residuals from the field surveyed locations
$$\hat{\varepsilon}(s_0) = \sum_{\alpha=1}^{n_{fs}} \lambda_\alpha(s) \cdot \varepsilon(s_\alpha) \tag{4}$$
where $\lambda_\alpha$ are the kriging weights.
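We solve for the kriging weights, $\boldsymbol{\lambda} = (\lambda_1, \ldots, \lambda_{n_{fs}})$, by minimizing the estimation variance at the surveyed locations and placing a constraint on the sum of the weights to equal one to satisfy the unbiasedness conditions assumed with ordinary kriging (Chiles and Delfiner, 2012):
$$\min_{\lambda_1, \ldots, \lambda_{n_{fs}}} \operatorname{var}\left(\hat{\varepsilon}(s_\alpha) - \varepsilon(s_\alpha)\right) + 2\nu\left(\sum_{\alpha=1}^{n_{fs}} \lambda_\alpha - 1\right). \tag{5}$$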
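We obtain the $\boldsymbol{\lambda}$ that minimizes Equation 5 by introducing a Lagrange multiplier $\nu$ and setting the function's partial derivatives with respect to $\boldsymbol{\lambda}$ and $\nu$ equal to zero. This results in the following ordinary kriging system of $n_{fs} + 1$ equations with $n_{fs} + 1$ unknowns ($\boldsymbol{\lambda}$ and $\nu$):
$$
\begin{bmatrix} \underset{n_{fs} \times n_{fs}}{\mathbf{C}} & \underset{n_{fs} \times 1}{\mathbf{1}} \\ \underset{1 \times n_{fs}}{\mathbf{1}^{\top}} & 0 \end{bmatrix}
\begin{bmatrix} \underset{n_{fs} \times 1}{\boldsymbol{\lambda}} \\ \nu \end{bmatrix}
=
\begin{bmatrix} \underset{n_{fs} \times 1}{\mathbf{C}_0} \\ 1 \end{bmatrix}, \tag{6}
$$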
where $\mathbf{C}$ is the auto-covariance matrix between the known residuals and $\mathbf{C}_0$ is the covariance between the new estimation location and all field survey locations. Here, we assume second-order stationarity of the residuals, meaning the autocovariance is the same for any two points based on their separation distance, $h$, and irrespective of their location. The auto-covariance $\mathbf{C}$ is derived from a variogram, a concept similar to the correlation models used for ground-motion intensities (Boore et al., 2003; Goda and Hong, 2008; Jayaram and Baker, 2009). The variogram is a theoretical parametric model of spatial correlation that relates the separation distance $h$ between field surveyed locations and the dissimilarity of their residuals. Dissimilarity in the variogram is quantified using half the variance, or the empirical semivariance
$$\gamma(h) = \frac{1}{2}\operatorname{var}\left(\varepsilon(s) - \varepsilon(s+h)\right) = \frac{1}{2}E\left\{\left[\varepsilon(s) - \varepsilon(s+h)\right]^2\right\} \tag{7}$$
where $h$ is the Euclidean distance. A theoretical variogram is then fit through all $(\gamma, h)$ pairs. Selection of an appropriate theoretical variogram should be based on the lowest error from cross-validation (Oliver and Webster, 2014).
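As an illustration of Equation 7, the sketch below bins the pairwise half squared differences of the residuals by separation distance to obtain an empirical semivariogram. It is a minimal example under assumptions: the binning scheme, the function names, and the exponential model are choices made here for brevity (the exponential form is the Matérn model with smoothness 0.5), and the residuals are synthetic.

```python
import numpy as np

def empirical_semivariogram(coords, resid, bin_edges):
    """Mean of 0.5 * (eps_i - eps_j)^2 over all pairs, grouped by distance bin."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    half_sq = 0.5 * (resid[:, None] - resid[None, :]) ** 2
    iu = np.triu_indices(len(resid), k=1)          # count each pair once
    d, half_sq = d[iu], half_sq[iu]
    n_bins = len(bin_edges) - 1
    h = np.full(n_bins, np.nan)
    gamma = np.full(n_bins, np.nan)
    for b in range(n_bins):
        in_bin = (d >= bin_edges[b]) & (d < bin_edges[b + 1])
        if in_bin.any():
            h[b] = d[in_bin].mean()                # mean separation in the bin
            gamma[b] = half_sq[in_bin].mean()      # semivariance in the bin
    return h, gamma

def exponential_variogram(h, nugget, partial_sill, range_par):
    """One possible theoretical model to fit through the (h, gamma) pairs."""
    return nugget + partial_sill * (1.0 - np.exp(-np.asarray(h) / range_par))

# Synthetic residuals (hypothetical): uncorrelated, unit variance.
rng = np.random.default_rng(1)
coords = rng.uniform(0.0, 10.0, size=(400, 2))
resid = rng.normal(size=400)
bin_edges = np.linspace(0.0, 5.0, 11)
h, gamma = empirical_semivariogram(coords, resid, bin_edges)
```

For uncorrelated residuals like these, the semivariance is flat near the residual variance at every lag; spatially correlated residuals would instead show low semivariance at short distances that rises toward the sill.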
DAMAGE AND UNCERTAINTY ESTIMATE
The final damage estimate at a single location is obtained by adding together the estimated trend and residuals from Equations 2 and 4, respectively, as shown in Equation 1
$$\hat{Z}(s_0) = f(\mathbf{X}(s_0)) + \sum_{\alpha=1}^{n_{fs}} \lambda_\alpha(s) \cdot \varepsilon(s_\alpha). \tag{8}$$
Once we develop the final damage estimate for all locations, $\hat{\mathbf{Z}}$, it can be used to estimate further decision variables (e.g. the spatial distribution of economic losses).
In addition, this method provides the variance of the damage estimate, $\hat{\sigma}^2(s_0)$, which is the sum of the individual variances from estimating the trend, $\hat{\sigma}^2_m(s_0)$, and kriging the residuals, $\hat{\sigma}^2_\varepsilon(s_0)$:
$$\hat{\sigma}^2(s_0) = \hat{\sigma}^2_m(s_0) + \hat{\sigma}^2_\varepsilon(s_0). \tag{9}$$
The estimation variance can be used to propagate uncertainty in further loss estimates or to
guide where to carry out additional field surveys.
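To make the kriging step concrete, the following sketch solves the ordinary kriging system of Equation 6 for one prediction location and then combines the result with a trend value as in Equations 8 and 9. It is a sketch under assumptions: the exponential covariance (`exp_cov`) and its parameters, the synthetic residuals, and the trend value and trend variance at the prediction location are all hypothetical placeholders.

```python
import numpy as np

def exp_cov(h, sill=1.0, range_par=5.0):
    """Assumed exponential covariance as a function of separation distance."""
    return sill * np.exp(-np.asarray(h, dtype=float) / range_par)

def ordinary_kriging(coords, resid, s0, cov=exp_cov):
    """Solve [[C, 1], [1', 0]] [lam; nu] = [C0; 1] and return estimate, variance, weights."""
    n = len(resid)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    d0 = np.linalg.norm(coords - s0, axis=1)
    K = np.zeros((n + 1, n + 1))
    K[:n, :n] = cov(d)          # C: covariance among surveyed residuals
    K[:n, n] = 1.0              # column of ones
    K[n, :n] = 1.0              # row of ones (weights sum to one)
    c0 = cov(d0)                # C0: covariance to the prediction location
    sol = np.linalg.solve(K, np.append(c0, 1.0))
    lam, nu = sol[:n], sol[n]
    est = lam @ resid                       # Equation 4
    var = cov(0.0) - lam @ c0 - nu          # ordinary kriging variance
    return float(est), float(var), lam

# Synthetic residuals at 30 surveyed locations (hypothetical).
rng = np.random.default_rng(2)
coords = rng.uniform(0.0, 10.0, size=(30, 2))
resid = rng.normal(scale=0.5, size=30)

s0 = np.array([5.0, 5.0])
eps_hat, var_eps, lam = ordinary_kriging(coords, resid, s0)

m_hat, var_m = 3.2, 0.05        # hypothetical trend estimate and its variance at s0
z_hat = m_hat + eps_hat         # Equation 8
var_total = var_m + var_eps     # Equation 9
```

At a surveyed location the kriged residual reproduces the observed residual and the kriging variance drops to zero (when no nugget is assumed), which is why the uncertainty map can point to areas far from existing surveys.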
APPLICATION TO THE 2015 NEPAL EARTHQUAKE
In this section, we demonstrate the applicability of G-DIF by using real data produced after the 2015 $M_w$ 7.8 Nepal earthquake to estimate damage over the 11 heavily affected and mostly rural districts outside of Kathmandu Valley. We assume this model would have been applied approximately two to four weeks following an earthquake (i.e., the vertical line in Figure 2) when enough field surveys are available to implement G-DIF. For this example, we use field surveys at 100 random locations plus representative data sources for each type of secondary damage data. We present this case study in order of the flowchart of Figure 3.
1. DAMAGE DATA
The measurement unit and spatial support of each input dataset used in this case study are listed in Table 1.
Table 1. Data from the 2015 Nepal earthquake used in the application of G-DIF
| Damage data category | Dataset used in case study | Measurement unit | Spatial support |
| Field surveys | EMS-98 field surveys (Z) | Damage grade | Building-level |
| Engineering forecast | Self-developed (X1) | Mean damage ratio | 1 km grid |
| Remote sensing proxy | InSAR-based damage proxy map (X2) | Damage proxy map value | 30 m grid |
| Relevant geospatial covariates | ShakeMap (X3) | Modified Mercalli Intensity | 1.75 km grid |
| Relevant geospatial covariates | Digital Elevation Model (X4) | Elevation (m) | 90 m grid |
The damage survey data for this case study come from the Earthquake Housing Damage and
Characteristics Survey commissioned by the Government of Nepal and completed by July 2016
(http://eq2015.npc.gov.np/#/). The purpose of that survey was to identify rural households that
would be eligible beneficiaries for the Earthquake Housing Reconstruction Program and was
therefore carried out in the 11 rural most-affected districts, not including the three districts in
Kathmandu Valley (Nepal Earthquake Housing Reconstruction Multi-Donor Trust Fund, 2016).
In this survey, trained engineers used the EMS-98 damage grading system to classify a census
of 751,799 buildings in these districts into a damage grade from 1 (negligible to slight damage)
to 5 (collapse). While this exhaustive survey was completed a year after the earthquake, we
consider only a random sample of 100 locations in order to replicate what would be available
rapidly after an event.
We developed an engineering forecast dataset with similar methods and quality to engineering forecasts available after earthquakes in countries with limited building inventory data. We use fragility curves from Nepal's National Society for Earthquake Technology to relate the peak ground acceleration from the latest ShakeMap to damage ratios for masonry (mud and cement mortared), reinforced concrete, and wood structures (JICA, 2002; Worden et al., 2018). The exposure is defined using population estimates from the LandScan 2011 High Resolution Global Population Dataset and ratios of each construction type available at the district-level in Nepal's 2011 census (Bright et al., 2012). Given the estimated number of buildings, the estimated distribution of each construction type, and the fragility curve for each construction type, we compute the mean damage ratio per grid.
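The weighting step described above can be sketched as follows. This is a simplified illustration, not the authors' forecast: it assumes each construction type has a single lognormal curve mapping peak ground acceleration (PGA) to a damage ratio, and every median and dispersion in `CURVES` is a made-up placeholder rather than a value from JICA (2002).

```python
import math
import numpy as np

# Hypothetical (median PGA in g, lognormal dispersion) per construction type.
CURVES = {
    "masonry_mud": (0.25, 0.6),
    "masonry_cement": (0.45, 0.6),
    "reinforced_concrete": (0.90, 0.6),
    "wood": (0.70, 0.6),
}

def lognormal_cdf(x, median, beta):
    """Lognormal CDF, used here as a stand-in damage-ratio curve."""
    zscore = np.log(np.asarray(x, dtype=float) / median) / beta
    return 0.5 * (1.0 + np.vectorize(math.erf)(zscore / math.sqrt(2.0)))

def mean_damage_ratio(pga, type_shares):
    """Share-weighted average of the per-type damage ratios for each grid."""
    pga = np.asarray(pga, dtype=float)
    total = np.zeros_like(pga)
    for name, share in type_shares.items():
        median, beta = CURVES[name]
        total += share * lognormal_cdf(pga, median, beta)
    return total

# Example district mix (hypothetical shares that sum to one).
shares = {"masonry_mud": 0.7, "masonry_cement": 0.15,
          "reinforced_concrete": 0.05, "wood": 0.10}
pga_grid = np.array([0.1, 0.3, 0.6])      # PGA at three grids, in g
mdr = mean_damage_ratio(pga_grid, shares)
```

Because the construction shares are only available at the district level, every grid in a district receives the same mix, so spatial variation in this forecast comes from the shaking alone.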
For the remote sensing proxy, we use NASA's damage proxy map (DPM) (Yun et al., 2015). NASA has consistently produced a DPM after major disasters since the February 2011 M6.3 Christchurch earthquake, making it a relevant remote sensing proxy to include in this study. The DPM algorithm takes the difference between two InSAR coherence (or similarity) maps: one from before the earthquake and one spanning the earthquake. The DPM value in each pixel (which ranges from -1 to 1) represents anomalous change due to the earthquake, as opposed to background changes (noise) that existed in the pre-earthquake pair coherence.
We also consider two geospatial covariates that are available after earthquakes and relate to the trend in damage: the Modified Mercalli Intensity from the ShakeMap (Worden and Wald, 2016), and a Digital Elevation Model (DEM) derived from the Shuttle Radar Topography Mission (Jarvis et al., 2008; Farr et al., 2007). While elevation may not directly cause earthquake damage, it could serve as a proxy for other factors such as construction quality in remote areas or landslide occurrence. The use of elevation data for the application of G-DIF in Nepal demonstrates how the trend model down-weights secondary datasets that are poor proxies for damage, as shown in the modeling results for the trend (Section 4.3).
2. DATA PRE-PROCESSING AND EXPLORATION
We discretize each dataset to a common grid of 0.0028° × 0.0028° (290 m × 290 m), resulting in a study area with 80,200 grid points. We use this resolution to remove any personally identifiable information, ensuring that more than one building is within each grid. The 11 considered districts are mostly rural, so there are nine buildings per grid on average (though 0.25% have 100 or more buildings).
The true damage from the field surveys, $\mathbf{Z}$, is the mean damage grade of all buildings within each grid. Out of 80,200 grids that contain buildings, we randomly selected 100 grids (containing 1056 buildings) as the set of locations that engineers could survey in the field. From here on, we refer to the subset of grids as the field surveyed locations.
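The aggregation and sampling described above can be sketched as follows. The building coordinates and grades are synthetic, the function name `grid_mean_damage` is hypothetical, and snapping to a grid with `floor` is one simple choice among several; only the 0.0028° cell size and the sample of 100 grids come from the text.

```python
import numpy as np

CELL = 0.0028   # grid size in degrees, as stated in the text

def grid_mean_damage(lon, lat, grade, cell=CELL):
    """Assign each building to a grid cell and average the damage grades per cell."""
    ix = np.floor(np.asarray(lon) / cell).astype(np.int64)
    iy = np.floor(np.asarray(lat) / cell).astype(np.int64)
    cells, inverse = np.unique(np.column_stack([ix, iy]), axis=0, return_inverse=True)
    inverse = inverse.ravel()                       # 1-D cell index per building
    counts = np.bincount(inverse)                   # buildings per cell
    mean_grade = np.bincount(inverse, weights=grade) / counts
    return cells, mean_grade, counts

# Synthetic building-level survey (hypothetical locations and EMS-98 grades 1 to 5).
rng = np.random.default_rng(42)
n_buildings = 5000
lon = rng.uniform(85.0, 85.5, n_buildings)
lat = rng.uniform(27.5, 28.0, n_buildings)
grade = rng.integers(1, 6, n_buildings).astype(float)

cells, mean_grade, counts = grid_mean_damage(lon, lat, grade)

# Randomly select 100 grids to act as the field surveyed locations.
surveyed = rng.choice(len(cells), size=100, replace=False)
Z = mean_grade[surveyed]
```

Averaging within cells turns the ordinal building-level grades into a continuous grid-level variable, which is what the regression and kriging steps operate on.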
Exploratory analysis shows a positive relationship between the true damage and the secondary damage data as exhibited in the moving average curves in the left column of the matrix in Figure 4. Specifically, the engineering forecast, shaking intensity, and elevation are linearly related to the true damage, while the DPM shows a slightly nonlinear relationship. The form of these discovered relationships should be considered when deciding on which trend model to use.
Figure 4. Summary of true damage from primary field survey and secondary damage data at all locations ($n$ = 80,200) and the subset of field surveyed locations ($n_p$ = 100). The diagonal shows histograms of each dataset, the scatter plots show relationships between datasets (including a moving average estimate), and the bottom row maps the spatial patterns of each data set (with warm colors indicating larger values). The left column of scatter plots highlights relationships between primary and secondary data.
3. TREND MODEL
Based on our observations of linearity between variables, we used a linear least squares regression as the functional relationship between the true damage and each secondary damage dataset
$$\hat{m}(s_0) = \sum_{k=0}^{p} \hat{\beta}_k \cdot X_k(s_0) \tag{10}$$
where $X_0$ is a vector of ones to estimate the intercept. We estimate the coefficient for each secondary dataset, $\hat{\beta}_k$, through either ordinary least squares (OLS) regression or generalized least squares (GLS) regression. We select the regression function that results in the lowest root mean squared error, which is OLS regression in this example. We also build two trend models for areas with and without DPM values, since the DPM covers about 40% of the considered region (more details are in the Appendix). We compute the variance inflation factor (VIF) for each secondary data variable to assess whether multicollinearity exists (James et al., 2013), and find that the VIFs for all variables are below two, a low value that indicates multicollinearity is not a problem with these data.
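For reference, the VIF of a variable is $1/(1 - R^2_k)$, where $R^2_k$ comes from regressing that variable on all the others. The sketch below computes it with plain NumPy on synthetic data; the function name and the example variables are hypothetical and are not the case-study datasets.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF_k = 1 / (1 - R^2_k), regressing column k on the remaining columns."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for k in range(p):
        y = X[:, k]
        A = np.column_stack([np.ones(n), np.delete(X, k, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        vifs[k] = 1.0 / (1.0 - r2)
    return vifs

# Synthetic secondary data (hypothetical): three independent variables.
rng = np.random.default_rng(3)
X_indep = rng.normal(size=(500, 3))
vif_indep = variance_inflation_factors(X_indep)

# Adding a near-copy of the first variable makes multicollinearity visible.
near_copy = X_indep[:, 0] + 0.1 * rng.normal(size=500)
vif_collinear = variance_inflation_factors(np.column_stack([X_indep, near_copy]))
```

Independent variables give VIFs close to one, while the near-duplicate column pushes the VIFs of both copies far above common warning thresholds.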
Building a trend model with the data at the field surveyed locations has two advantages.
First, the function in Equation 10 translates the numerous secondary damage data with differing
units of measurement (e.g., shaking amplitude, elevation, an arbitrary numerical scale for DPM)
into a collective unit, the mean damage grade, that has value for regional loss estimates and other
decision-making.
The second advantage of the trend model is that the modeler does not need to subjectively
weight the importance of each secondary damage data, instead allowing the data to determine
the importance of each dataset through the model coefficients. By examining each $\hat{\beta}_k$ and its
standard error, we observe which secondary dataset provides additional value in modeling the
trend. For example, in Figure 5a, we see the digital elevation model (DEM) has close to a
zero coefficient in the estimated trend, signifying the DEM has little additional effect on the
trend estimate when we account for the other secondary damage data. These coefficients are
comparable since we normalize all variables before developing the trend model. If we estimate
a zero coefficient for all secondary damage data, then the trend reduces to a constant mean (the
intercept). Note that the estimated coefficients shown in blue in Figure 5a are dependent on
the set of field surveyed grids and therefore may differ from the true coefficient estimates as
shown in black. These coefficient estimates are also specific to the Nepal earthquake, which
was a largely rural disaster; they are not a comment on the general utility of each dataset among all
earthquakes. Because the parameters of the trend model are based on its data inputs at the field
surveyed locations, G-DIF is calibrated to the data available after each specific earthquake.
Figure 5. (a) The coefficient estimates (blue dots) from the trend model using ordinary least squares
regression in the area with Damage Proxy Map values. Horizontal lines show the standard error and
black stars are the coefficients estimated using 10,000 grids. (b) The spatial correlation model
using a Matern variogram, comparing the variogram of the true damage at the field surveyed grids
(before removing the trend) with the variogram of the residuals (after removing the trend). The
vertical dotted line at 9.4 km highlights the range of spatial autocorrelation.
4. SPATIAL CORRELATION MODEL
Similar to the trend model, the parameters of the spatial correlation model are calibrated to the
data rather than predetermined. In this case study, we estimate the parameters of a Matern
theoretical variogram model. Minimizing the residual sum of squares yields fitted parameters
that correspond to an exponential covariance:
$$C(h) = b \exp\left(-\frac{|h|}{r}\right), \qquad (11)$$
where $b$ is equal to the variance of the residuals and $r$ is the range of spatial autocorrelation. In
this example, the fitted parameters are $b = 0.83$ and $r = 9.4$ km.
The variogram is related to the covariance through
$$\gamma(h) = b - C(h). \qquad (12)$$
We verified the use of this model by comparing the variogram fitted with 100 field surveyed
grids shown in Figure 5b to the same model fit with 10,000 grids.
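The relationship between the covariance and the variogram can be checked numerically. The sketch below assumes the exponential covariance has the usual form C(h) = b exp(-|h|/r) and that the variogram is b minus the covariance; the function names are illustrative, and only the values of b and r come from the case study.

```python
import math


def exponential_covariance(h, b, r):
    """C(h) = b * exp(-|h| / r): covariance at separation distance h (km)."""
    return b * math.exp(-abs(h) / r)


def variogram(h, b, r):
    """gamma(h) = b - C(h): rises from 0 at h = 0 toward the sill b."""
    return b - exponential_covariance(h, b, r)


# Parameters fitted in the case study: residual variance and range (km).
b, r = 0.83, 9.4

for h in (0.0, 5.0, 9.4, 30.0):
    c = exponential_covariance(h, b, r)
    g = variogram(h, b, r)
    print(f"h = {h:5.1f} km   C(h) = {c:.3f}   gamma(h) = {g:.3f}")
```

Under this parameterization the covariance at h = r has decayed to b/e, and the variogram approaches the sill b as the separation distance grows.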
The shape of the variogram highlights spatial characteristics of the data. The vertical dotted
line in Figure 5b is the range of spatial autocorrelation, $r$, of the true damage at the field
surveyed locations after removing the trend. The range is the maximum distance at which two
locations are spatially autocorrelated with one another. The estimated range of 9.4 km is
specific to this earthquake and depends on the choice of variogram and fitting procedure; we
evaluate its statistical robustness in the sensitivity analysis in the following section. The
variogram also shows that we have successfully removed the preexisting trend from the data, since
the variogram of the true damage increases with distance, while the variogram of the residuals
plateaus. If the trend model were able to fully capture the spatial correlation in the true damage,
the variogram would reduce to a horizontal line with $\gamma(0) = \sigma^2(0)$ (i.e., the
nugget-effect model), and performing ordinary kriging would provide no additional effect on the
final mean damage estimate. Therefore, G-DIF adapts to allow varying levels of contribution from
the spatial correlation model, depending on how well the trend model estimates the true damage.
5. FRAMEWORK OUTPUTS
The implementation of G-DIF generates two main outputs: 1) a map of the mean damage
estimate for each of the 80,200 grids and 2) a map of uncertainty, or estimation variance, of
those estimates.
Damage estimate
The mean damage estimate map (Figure 6a) is the sum of the estimated trend and the estimated
residuals. The mean damage estimate reflects the trend model such that areas to the north exhibit
greater damage than areas to the south. This gradient in damage comes from the two most
important secondary damage data in the trend model, the shaking intensity and the engineering
forecast, which have higher values towards the north.
The mean damage estimate reflects the spatial correlation model through the similarity in
mean damage estimates surrounding the field surveyed locations shown in black in Figure 6a.
These spatial similarities are particularly visible to the northeast of Kathmandu, where there is
clustering of high damage around the field surveyed points. These similarities are due to the
variogram, which estimates small-scale fluctuations based on damage at nearby field surveyed
locations.
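The way nearby field surveys pull the estimate up or down can be sketched with a small ordinary kriging example. This is a minimal illustration, not the paper's implementation: it assumes an exponential covariance with the case study's fitted b = 0.83 and r = 9.4 km, and the coordinates, residuals, trend value, and function names are hypothetical.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(col + 1, n):
            factor = m[k][col] / m[col][col]
            for c in range(col, n + 1):
                m[k][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ordinary_kriging(coords, residuals, target, b, r):
    """Krige residuals to `target`; return (estimate, variance, weights)."""
    def cov(p, q):
        return b * math.exp(-math.dist(p, q) / r)

    n = len(coords)
    # Covariance matrix bordered by the constraint that the weights sum to one.
    lhs = [[cov(coords[i], coords[j]) for j in range(n)] + [1.0] for i in range(n)]
    lhs.append([1.0] * n + [0.0])
    rhs = [cov(c, target) for c in coords] + [1.0]
    sol = solve(lhs, rhs)
    weights, mu = sol[:n], sol[n]  # mu is the Lagrange multiplier
    estimate = sum(w * z for w, z in zip(weights, residuals))
    variance = b - sum(w * c for w, c in zip(weights, rhs[:n])) - mu
    return estimate, variance, weights


b, r = 0.83, 9.4                                  # fitted values from the case study
coords = [(0.0, 0.0), (5.0, 0.0), (0.0, 8.0), (12.0, 3.0)]  # hypothetical, in km
residuals = [0.4, -0.2, 0.1, -0.5]                # hypothetical trend residuals
trend_at_target = 2.4                             # hypothetical trend estimate

res_hat, res_var, weights = ordinary_kriging(coords, residuals, (3.0, 2.0), b, r)
mean_damage = trend_at_target + res_hat           # trend plus kriged residual
print(round(mean_damage, 3), round(res_var, 3))
```

Kriging at a surveyed location returns that location's residual with zero variance, which is consistent with treating the field measurements as exact.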
This map can then be used to estimate total costs of damage. Here, we assume the estimated
mean damage grade is the same for all buildings within a grid and that the number of buildings
per grid is known. By multiplying the estimated mean damage grade by the number of build-
ings, an assumed ratio of each type of construction material, and the repair or reconstruction
cost for each construction material, we obtain an estimate of 315 billion NPR (2.8 billion USD)
for the cost of repair and reconstruction. This estimate is almost the same as the total damages
to the housing sector of 303 billion NPR (2.7 billion USD) reported in the PDNA. While this
economic loss estimate is not a primary focus of this study, it is provided here to illustrate that
these damage predictions can be converted to regional economic loss estimates.
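The multiplication described above can be sketched as follows. Every number here is a hypothetical placeholder rather than a value from the study, and the sketch assumes the unit cost is expressed per building per unit of mean damage grade, which is one possible reading of the procedure.

```python
# Hypothetical grids: estimated mean damage grade and known building count.
grids = [
    {"mean_damage_grade": 3.2, "n_buildings": 40},
    {"mean_damage_grade": 1.5, "n_buildings": 25},
]

# Assumed share of each construction material (shares sum to one).
material_ratio = {"masonry": 0.8, "concrete": 0.2}

# Assumed repair or reconstruction cost in NPR per building per unit damage grade.
unit_cost_npr = {"masonry": 100_000.0, "concrete": 250_000.0}


def grid_loss(grid):
    """Loss for one grid: damage grade x buildings x material share x unit cost."""
    return sum(
        grid["mean_damage_grade"]
        * grid["n_buildings"]
        * material_ratio[material]
        * unit_cost_npr[material]
        for material in material_ratio
    )


total_loss_npr = sum(grid_loss(g) for g in grids)
print(f"Total loss: {total_loss_npr:,.0f} NPR")
```

Summing the grid losses over all grids in the study area gives a regional figure of the kind compared against the PDNA.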
Estimation variance
The map of the estimation variance is the sum of the variance from estimating both the trend and
residuals (Equation 9). In the case of least squares regression and ordinary kriging, we solve
for the trend coefficients ($\boldsymbol{\beta}$) and kriging weights ($\boldsymbol{\lambda}$) by
minimizing the variance of the error at the field surveyed locations. These two procedures result
in an estimation variance of the trend and residuals at all locations (specific equations are
included in the Electronic Supplement).
We can interpret the estimation variance, $\hat{\sigma}^2(s)$, as our uncertainty in the mean damage
estimate at each grid. The model assumes the uncertainty in the mean damage estimate follows a
Gaussian probability distribution with $\hat{Z}(s)$ as the mean and $\hat{\sigma}(s)$ as the
standard deviation, $Z(s) \sim N(\hat{Z}(s), \hat{\sigma}(s))$. The variance quantifies the
uncertainty in 1) the trend estimation due to the relationships between the primary and secondary
data and 2) spatial estimation of the residuals. The spatial uncertainty is visible in the
estimation variance map as shown by the higher variances at grids that are located further from the
field survey locations in Figure 6b.
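Under the Gaussian assumption, the estimate and its variance at a grid can be turned into an interval or an exceedance probability. The sketch below uses Python's `statistics.NormalDist`; the function names and the example values are hypothetical.

```python
from statistics import NormalDist


def prediction_interval(mean, variance, level=0.90):
    """Central interval of the Gaussian with the given mean and variance."""
    dist = NormalDist(mu=mean, sigma=variance ** 0.5)
    tail = (1.0 - level) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


def prob_exceeds(mean, variance, threshold):
    """Probability that the damage at a grid exceeds `threshold`."""
    return 1.0 - NormalDist(mu=mean, sigma=variance ** 0.5).cdf(threshold)


# Hypothetical grid: mean damage estimate 3.0 with estimation variance 0.25.
low, high = prediction_interval(3.0, 0.25, level=0.95)
p_heavy = prob_exceeds(3.0, 0.25, threshold=4.0)
print(f"95% interval: [{low:.2f}, {high:.2f}]   P(damage > 4) = {p_heavy:.3f}")
```

Since damage grades are bounded, an interval from an unbounded Gaussian can extend past the ends of the scale, so values near the limits should be read with care.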
The map of the estimation variance can be used to guide where future field surveys should
be carried out and to propagate uncertainty when estimating further losses. Since the variance
depends on the location of the field surveyed grids, surveyors could assess damage in areas with
higher variance to reduce the overall uncertainty.
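One simple way to act on the variance map is to rank grids by estimation variance and survey the highest ones first. The sketch below is a hypothetical helper with made-up grid identifiers; it is a greedy heuristic rather than a procedure described in the paper.

```python
def next_survey_grids(variance_by_grid, k):
    """Return the k grid ids with the largest estimation variance."""
    ranked = sorted(variance_by_grid, key=variance_by_grid.get, reverse=True)
    return ranked[:k]


# Hypothetical estimation variances for four grids.
variance_by_grid = {"g1": 0.2, "g2": 0.9, "g3": 0.5, "g4": 0.7}
print(next_survey_grids(variance_by_grid, 2))
```

This ranking ignores the fact that surveying one grid also lowers the variance of its neighbors, so in practice the variance map would be recomputed after each batch of new surveys.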
COMPARING G-DIF WITH A RAPID ENGINEERING FORECAST
In this section, we compare G-DIF's spatially varying damage estimate and variance to the
engineering forecast, which is the current standard of practice for estimating post-earthquake
damage. Visually, we can see that G-DIF's mean damage estimate from the example set of 100
field surveyed grids presented in the previous section (Figure 6a) resembles the true damage
(Figure 6c) more closely than the engineering forecast does (Figure 6d). Going further, we quantify
the performance of G-DIF's outputs to demonstrate that its mean damage estimate has lower total
error and improved uncertainty quantification. Since G-DIF heavily depends on the field survey data,
we also perform a sensitivity analysis of G-DIF's outputs to the number and placement of field
surveyed locations used to build the model.
Figure 6. Results of the framework for an example set of 100 field surveyed locations including (a) the
mean damage estimate and (b) the estimation variance. The results of (a) can be compared to (c) the true
damage from all field surveys and (d) the engineering forecast converted to mean damage grade.
PERFORMANCE OF THE CASE STUDY EXAMPLE
Using the mean damage estimate from the previous case study, we quantify the error between
the predicted and observed damage at all validation grids, as shown in Figure 7a. The distribution
of prediction error highlights 1) the bias, or mean error (whether the damage estimate
systematically under- or overestimates damage), and 2) the variance (how precise the damage
estimate is across all grids). Note that the engineering forecast only results in mean damage grade
values that are whole integers (1, 3, and 4) after binning the predicted mean damage ratio per
grid, leading to spikes in its errors at whole integers in Figure 7a. The lower bias and variance
of G-DIF's mean damage estimate lead to a mean squared error (MSE), a performance metric
that combines both bias and variance, of 0.853, which is 47% lower than the engineering
forecast's MSE of 1.62.
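The link between bias, variance, and MSE can be made concrete with a short sketch. The function name and the toy predictions are hypothetical; the two MSE values in the last step are the ones reported above.

```python
def error_summary(predicted, observed):
    """Return (bias, variance, mse) of the prediction errors."""
    errors = [p - o for p, o in zip(predicted, observed)]
    n = len(errors)
    bias = sum(errors) / n                               # mean error
    variance = sum((e - bias) ** 2 for e in errors) / n  # population variance
    mse = sum(e ** 2 for e in errors) / n
    return bias, variance, mse


# Hypothetical predicted and observed mean damage grades at five grids.
predicted = [2.1, 3.4, 1.2, 4.0, 2.8]
observed = [2.0, 3.0, 2.0, 4.5, 3.0]
bias, variance, mse = error_summary(predicted, observed)

# The MSE decomposes exactly into squared bias plus variance.
print(round(mse, 4), round(bias ** 2 + variance, 4))

# Relative reduction in MSE between the two reported values.
reduction = 1.0 - 0.853 / 1.62
print(f"MSE reduction: {reduction:.0%}")
```

The decomposition holds exactly when the variance is computed with the population formula (dividing by n), as it is here.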
Figure 7. Histogram of errors between the predicted and observed damage for a) all 11 considered
districts, b) Makawanpur district (southwest of Kathmandu Valley), and c) Nuwakot district (northwest
of Kathmandu Valley). Histograms highlight the lower bias, variance, and mean squared error for G-DIF
when using an example set of 100 field surveyed grids.
Although G-DIF's MSE over the full study area is already nearly half that of the engineering
forecast, the difference is even larger when looking at individual districts within the study area
(Figure 7b and c). Over the full study area, the engineering forecast captures the overall trend in
damage over large regions, but is limited by the resolution of the underlying building inventory
data, which is only available at the district level. G-DIF's advantage over the forecast is that it is
locally calibrated to field surveys within a district, so its mean damage estimates for smaller
regions have lower bias, variance, and MSE. For example, G-DIF's mean damage estimate has a
lower bias (bias = 0.038) and higher precision (standard deviation = 1.122) than that of the
engineering forecast (bias = 0.904, standard deviation = 1.477) when considering the errors only for
Makawanpur, the district directly southwest of Kathmandu Valley (Figure 7b). G-DIF is
more accurate at the local level for 9 out of the 11 districts, as seen in the error histogram
for Nuwakot in Figure 7c and the other districts depicted in the electronic supplement.
SENSITIVITY OF THE PERFORMANCE TO THE FIELD SURVEYED GRIDS
G-DIF's final mean damage estimate varies depending upon the sampled primary data, especially
with few field survey grids to build the trend and spatial correlation models or with secondary
data that are not strongly predictive of damage. The goal of this section is to quantify how
G-DIF's performance depends on the number and placement of the field surveyed locations
used to build the framework. For field survey sets ranging in size from 25 to 1000 locations, we
simulate G-DIF's mean damage estimate using 1000 random samples of different placements
and assess its performance. Figure 8 shows the distribution of the MSE and bias for each of
these simulations.
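The structure of such a sensitivity analysis can be sketched with a toy stand-in for G-DIF. The code below is an assumption-laden illustration: the data are synthetic, the estimator is a single-variable linear calibration of a forecast to the sampled surveys (no kriging), and errors are evaluated at all grids rather than at held-out validation grids.

```python
import random


def simple_fit(x, y):
    """Closed-form least squares for y ~ intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my - slope * mx, slope


def mse(pred, obs):
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)


def sensitivity(true_damage, forecast, sample_sizes, n_sims, seed=0):
    """Repeat random survey placements and compare estimator MSE to the forecast."""
    rng = random.Random(seed)
    n_grids = len(true_damage)
    baseline = mse(forecast, true_damage)
    results = {}
    for n in sample_sizes:
        mses = []
        for _ in range(n_sims):
            surveyed = rng.sample(range(n_grids), n)  # one random placement
            a, b = simple_fit([forecast[i] for i in surveyed],
                              [true_damage[i] for i in surveyed])
            estimate = [a + b * f for f in forecast]
            mses.append(mse(estimate, true_damage))
        results[n] = {
            "mean_mse": sum(mses) / n_sims,
            "share_better": sum(m < baseline for m in mses) / n_sims,
        }
    return baseline, results


# Synthetic data: a forecast on a 1-5 scale and a noisy "true" damage grade.
gen = random.Random(1)
forecast = [gen.uniform(1.0, 5.0) for _ in range(500)]
true_damage = [min(5.0, max(1.0, 0.5 + 0.6 * f + gen.gauss(0.0, 0.7)))
               for f in forecast]

baseline, results = sensitivity(true_damage, forecast, (25, 100), n_sims=200)
print(round(baseline, 3), results)
```

Replacing the toy estimator with the full trend and kriging steps, and scoring only the unsurveyed grids, would bring the sketch closer to the analysis summarized in Figure 8.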
Figure 8. Histograms of (a) the accuracy of the mean damage estimate (MSE) and (b) the performance
of the estimation variance from the sensitivity analysis of the number and placement of field surveyed
locations used to develop G-DIF. As more field surveys are collected, the accuracy improves and does
not depend as much on the placement of field surveyed locations.
As expected, as the number of field surveyed locations increases, the G-DIF MSE decreases
(accuracy increases) and is consistently lower than that of the engineering forecast, regardless
of the placement of the field surveyed locations. Given that the MSE can take values between 0
and 16, the MSE from G-DIF is relatively low. Figure 8a shows histograms of the MSE from repeated
analyses with different random samples of data, for five sample sizes. The G-DIF MSE is
lower than that of the engineering forecast for 99.7% of the simulations when using 50 field
surveyed grids, and the percentage is even higher when more survey locations are used.
When we separate the bias from the MSE, we see that G-DIF's distribution of bias is low
relative to the full range of possible bias (-4 to 4). The G-DIF damage prediction can nevertheless
be more biased than the engineering forecast, as shown by the areas of the distributions of
bias outside of the vertical bounds of the engineering forecast's mean error in Figure 8b. This
is partly because G-DIF's mean damage estimate depends on how representative the
field survey set is of the true distribution of damage. With more biased field survey sets, the final
estimate is more biased, but sample bias can be avoided with a sufficiently large field survey
if the field survey comes from a random sample. With 500 field survey locations, 86% of the
simulations are less biased than the engineering forecast.
G-DIF can also be more biased than the engineering forecast because the forecast has a low
mean error of -0.056 for the full study region, as discussed in the previous section. However,
the sensitivity analysis confirms G-DIF is more precise when considering sub-regions. Since
the MSE is the sum of the variance and squared bias, the reduction in MSE with more field
surveyed locations in Figure 8a is due to the reduction in variance of the error. This reduction
in variance means that grid-level estimates become more precise. So overall, there is lower
variation in the error, considering the high resolution of G-DIF's mean damage estimate.
We also evaluated the statistical robustness of the estimate of the range of spatial
autocorrelation. After 1000 simulations using 1000 field surveyed locations, the range of the
unexplained damage is on average 14 km. This range is consistent with the range of 15 to 20 km
reported for damage ratios from the 1994 Northridge earthquake (Shome et al., 2012) and plausible
given that the range of spatial correlation for ground motion intensities can vary between 10 and
60 km (Jayaram and Baker, 2009).
WHICH DAMAGE ESTIMATION APPROACH TO USE?
By comparing to the engineering forecast, we show that G-DIF provides a credible damage
estimate to support post-earthquake decisions. Whether G-DIF is advantageous over using tra-
ditional methods to rapidly estimate damage depends on the amount of primary field data, the
quality of the secondary data, and the scale at which decisions are made. In cases where a
well-calibrated engineering forecast is available, the study region is large, or there are few field
surveyed grids, the engineering forecast will provide reasonable damage estimates.
Often, however, the engineering forecast may be a general model rather than one calibrated
for the specific region, or the input inventory data may be of low quality and resolution. In such
cases, if there are sufficient field surveys available, G-DIF will likely provide a damage estimate
with accuracy comparable to or higher than that of an engineering forecast. This is because
of the approach's ability to calibrate an event-specific prediction and to perform spatial
interpolation between survey points. In this formulation, we consider measurements from the
field to be exact.
A main advantage of G-DIF is that it provides locally accurate damage estimates that can be
leveraged for loss estimates and higher resolution decisions. This means that within sub-regions,
G-DIF's damage estimate will calibrate the engineering forecast, and other secondary damage
data, to the field surveys within that region (as seen in the error histogram for Makawanpur
district in Figure 7b). To improve the local accuracy of the damage estimate, surveyors can use
the uncertainty estimate to guide the collection of additional damage assessments. By surveying
in areas with greater uncertainty, the overall uncertainty of the damage estimate will decrease.
CONCLUSION
In this study, we propose a geospatial data integration framework (G-DIF) to produce a spatial
damage prediction in the weeks after an earthquake. G-DIF uses a limited sample of local and
accurate field surveys to calibrate predictions based on heterogeneous and uncertain damage
data from engineering forecasts, remote sensing and other sources. The uncertain data can
arrive in varying formats, measurement units, and levels of accuracy.
The geostatistical technique, regression kriging, applied in G-DIF consists of two models.
The first is a trend model that estimates the mean damage, a deterministic value that varies
in space, using secondary damage data. The second is a spatial correlation model that esti-
mates the stochastic and spatially correlated residuals between the estimated trend and the true
damage. The separate modeling of these two components allows the framework to produce a
sophisticated trend model when the secondary data is strongly predictive, plus a spatial inter-
polation between observations when the secondary damage data has less predictive power. The
framework is flexible to implement—the modeler can choose the functional form of the trend
prediction (linear or nonlinear) and spatial correlation (variogram) model, depending on the
data available for the event of interest.
Data collected after the 2015 Nepal earthquake was used to demonstrate the implementation
of G-DIF. Out of 80,200 grids in our area of interest, we used a sample of 100 grids as an
example of field surveyed locations and found that the mean damage estimated at the other
80,100 grids had a higher accuracy (lower mean squared error) than a benchmark based on a
current engineering forecast. Moreover, G-DIF provides a mean damage estimate that is more
accurate for smaller regions than the engineering forecast used in this study, because it locally
calibrates all secondary data to field surveys. Modelers can then use this spatially varying mean
damage estimate to calculate costs of repair and reconstruction.
In addition to the mean damage estimate map, G-DIF creates a map of the estimation un-
certainty, which is important for interpreting results, and a significant addition to the current
state of practice for standard damage maps from engineering forecasts or remote sensing dam-
age data (e.g. Jaiswal and Wald, 2011; Yun et al., 2015; Copernicus Emergency Management
Service, 2019). Post-disaster modelers or decision-makers can use this estimation variance to
propagate uncertainty into further impact models or decide where to collect more field surveys
to reduce the uncertainty.
With this method we do not explicitly account for uncertainty in the field surveyed
assessment that results from survey subjectivity and aggregation per grid. The subjectivity in the
field surveyed measurement has the potential to be mitigated (Booth et al., 2011), so we have
made the assumption that its uncertainty is negligible relative to other data sources. This
framework can be extended to address the uncertainty due to aggregation through Bayesian updating
of the damage estimate per grid, similar to that presented in Booth et al. (2011), though this
would require estimates of prior and posterior distributions for each dataset and would be more
computationally intensive.
With even a small amount of field survey data, G-DIF predictions have improved accuracy
relative to standard engineering forecasts. Through Monte Carlo simulations of the number and
locations of field surveys, we found that G-DIF consistently resulted in a damage map with
lower mean squared error than an engineering forecast when using more than 50 field surveyed
locations. Given that we predict the damage at 80,150 grid locations using 50 field surveyed
locations, our framework required 0.06% of the grids to be surveyed to improve on the
estimate of an engineering forecast. In the case of Nepal, 50 field surveyed locations could
contain between 250-1150 buildings, which could be feasibly assessed within the first 2-4 weeks in
remote, mountainous contexts. While this timeframe may seem long, a few weeks is a sufficient
amount of time for this approach to inform important decisions, such as the PDNA which is a
major use case.
While our results show an improved mean damage estimate with a small percentage of
field surveyed buildings, the placement of field surveys influences these results. Through the
sensitivity analysis of the framework to the field surveyed locations, we found G-DIF’s mean
damage estimate depends on how well the field survey set represents the full damage distri-
bution. A biased set of field surveyed locations can lead to biased results—in the case of the
Nepal earthquake, sets of more than 500 grids were less likely to be biased. To develop the
spatial correlation model at low separation distances, the field survey set should also consist of
locations within the spatial correlation range. To collect field data suited for G-DIF, surveys
can be strategically placed to collect damage assessments for all buildings within selected grids
so the sample has the full distribution of damage and sufficient spatial coverage, similar to the
methods of the REACH survey (e.g. REACH, 2014).
The advantage of G-DIF over standard damage estimates, such as the engineering forecast,
is apparent from the Nepal case study. The Nepal earthquake affected a large, mostly rural,
region over multiple districts. In this case, secondary data was uncertain because the engineer-
ing forecast was developed using low-fidelity data and the damage proxy map was observing
changes to both the built environment and vegetation. We expect many future earthquakes to be
similar in that there will be a limited sample of accurate field data to calibrate damage predic-
tions from multiple uncertain data. Therefore, the framework presented here could be extended
through testing with earthquakes occurring in different built environments or even other types
of disasters, as suggested in Shome et al. (2012).
Overall, the outputs of this framework are useful for stakeholders involved in post-disaster
loss assessments (like the PDNA) or recovery aid allocation, such as the affected national gov-
ernment, multilateral or bilateral donor agencies, or civil society organizations. In post-disaster
settings, these stakeholders are often overloaded with making many decisions based on the
uncertain data that are available at that time. By combining multiple data, this framework auto-
matically weights those damage datasets according to their ability to predict damage observed
in the field surveys, and synthesizes them to develop one map of damage. Therefore, the frame-
work allows stakeholders to address the hurdle of weighing the reliability of input data versus
its availability, so they can ultimately make more informed decisions for a more effective
regional recovery.
ELECTRONIC SUPPLEMENT
The data and R code to develop all results for the Nepal case study example presented in this
paper are available at https://purl.stanford.edu/gn368cq4893 with an interactive notebook of the
code at https://sabineloos.github.io/GDIF-damageprediction/GDIF nb.html.
ACKNOWLEDGMENTS
We would like to thank Anna Michalak, David Wald, Kishor Jaiswal, Brendon Bradley, and
Robert Soden for their contributions and feedback developing this framework. We would like to
thank the Government of Nepal, especially the National Planning Commission, Central Bureau
of Statistics and National Reconstruction Authority, for collecting this groundtruth damage data
and making its anonymized version available for broader uses and Arogya Koirala and Roshan
Paudel for their assistance in preparing this data. Part of the research was carried out at the Jet
Propulsion Laboratory, California Institute of Technology, under a contract with the National
Aeronautics and Space Administration. This work is funded by the National Science Founda-
tion Graduate Research Fellowship Program, the National Research Foundation of Singapore
grant NRF-NRFF2018-06, and the World Bank's Trust Fund for Statistical Capacity Building
(TFSCB) with financing from the United Kingdom’s Department for International Development
(DFID), the Government of Korea, and the Department of Foreign Affairs and Trade of Ireland.
REFERENCES
Applied Technology Council, 1989. Procedures of Postearthquake Safety Evaluation of Buildings. Tech.
rep., Applied Technology Council, Redwood City, CA.
Bhattacharjee, G., Barns, K., Loos, S., Lallemant, D., Deierlein, G., and Soden, R., 2018. Developing
a User-Centric Understanding of Post-Disaster Building Damage Information Needs. In 11th U.S.
National Conference on Earthquake Engineering. Los Angeles, CA.
Boore, D. M., Gibbs, J. F., Joyner, W. B., Tinsley, J. C., and Ponti, D. J., 2003. Estimated Ground
Motion From the 1994 Northridge, California, Earthquake at the Site of the Interstate 10 and La
Cienega Boulevard, West Los Angeles, California. Bulletin of the Seismological Society of America
93, 2737–2751.
Booth, E., Saito, K., Spence, R., Madabhushi, G., and Eguchi, R. T., 2011. Validating Assessments of
Seismic Damage Made from Remote Sensing. Earthquake Spectra 27, S157–S177.
Bright, E. A., Coleman, P. R., Rose, A. N., and Urban, M. L., 2012. LandScan 2011. Oak Ridge National
Laboratory.
Chatterjee, A., Michalak, A. M., Kahn, R. a., Paradise, S. R., Braverman, A. J., and Miller, C. E., 2010.
A geostatistical data fusion technique for merging remote sensing and ground-based observations of
aerosol optical thickness. Journal of Geophysical Research 115, 1–12. doi:10.1029/2009JD013765.
Chiles, J.-P. and Delfiner, P., 2012. Geostatistics: Modeling Spatial Uncertainty. 2 edn. Wiley Series in
Probability and Statistics, New York, NY. ISBN 978-0471083153, 734 pp.
Copernicus Emergency Management Service, 2019. Rapid Mapping Portfolio.
Corbane, C., Saito, K., Dell'Oro, L., Bjorgo, E., Gill, S., Boby, P., Huyck, C., Kemper, T., Lemoine, G.,
Spence, R., Shankar, R., Senegas, O., Ghesquiere, F., Lallemant, D., Evans, G., Gartley, R., Toro, J.,
Ghosh, S., Svekla, W., Adams, B., and Eguchi, R. T., 2011. A Comprehensive Analysis of Building
Damage in the 12 January 2010 Mw7 Haiti Earthquake Using High-Resolution Satellite and Aerial
Imagery. Photogrammetric Engineering Remote Sensing 77, 997–1009.
Dong, L. and Shan, J., 2013. A comprehensive review of earthquake-induced building damage detection
with remote sensing techniques. ISPRS Journal of Photogrammetry and Remote Sensing 84, 85–99.
Earthquake Engineering Research Institute, 2015. Learning From Earthquake (LFE) Program. Tech.
rep., Earthquake Engineering Research Institute, Oakland, CA.
Erdik, M., Sesetyan, K., Demircioglu, M., Zulfikar, C., Hancilar, U., Tuzun, C., and Harman-
dar, E., 2014. Rapid Earthquake Loss Assessment After Damaging Earthquakes. In Geotechni-
cal, Geological and Earthquake Engineering, vol. 34, pp. 53–96. ISBN 9783319071176. doi:
10.1007/978-3-319-07118-3.
Farr, T. G., Rosen, P. A., Caro, E., Crippen, R., Duren, R., Hensley, S., Kobrick, M., Paller, M., Ro-
driguez, E., Roth, L., Seal, D., Shaffer, S., Shimada, J., Umland, J., Werner, M., Oskin, M., Burbank,
D., and Alsdorf, D. E., 2007. The shuttle radar topography mission. Reviews of Geophysics 45.
doi:10.1029/2005RG000183.
Ghosh, S., Huyck, C. K., Greene, M., Gill, S. P., Bevington, J., Svekla, W., DesRoches, R., and Eguchi,
R. T., 2011. Crowdsourcing for Rapid Damage Assessment: The Global Earth Observation Catastro-
phe Assessment Network (GEO-CAN). Earthquake Spectra 27, S179–S198.
Goda, K. and Hong, H. P., 2008. Spatial correlation of peak ground motions and response spectra.
Bulletin of the Seismological Society of America 98, 354–365. doi:10.1785/0120070078.
Grujic, O., 2017. Subsurface Modeling with Functional Data. Ph.D. thesis, Stanford University.
Grünthal, G., 1998. European Macroseismic Scale 1998, vol. 15. ISBN 2879770084, 100 pp.
Gunasekera, R., Daniell, J., Pomonis, A., Arias, R. A., Ishizawa, O., and Stone, H., 2018. Methodology
Note on the Global RApid post-disaster Damage Estimation (GRADE) approach. Tech. rep., Global
Facility for Disaster Reduction and Recovery, Washington, DC.
Hengl, T., Heuvelink, G., and Stein, A., 2003. Comparison of kriging with external drift and regression-
kriging. Technical note, ITC p. 17. doi:10.1016/S0016-7061(00)00042-2.
Hengl, T., Heuvelink, G. B. M., and Stein, A., 2004. A generic framework for spatial prediction of soil
variables based on regression-kriging. Geoderma 120, 75–93. doi:10.1016/j.geoderma.2003.08.018.
Hunt, A. and Specht, D., 2019. Crowdsourced mapping in crisis zones: collaboration, organisation and
impact. Journal of International Humanitarian Action 4, 1–11. doi:10.1186/s41018-018-0048-1.
Huyck, C. K., 2015. Gorkha (Nepal) Earthquake Response.
Jaiswal, K., Wald, D., and Hearne, M., 2009. Estimating casualties for large earthquakes worldwide
using an empirical approach: US geological survey open-file report, OF 2009-1136, 78 p. Tech. rep.
Jaiswal, K. and Wald, D. J., 2011. Rapid Estimation of the Economic Consequences of Global
Earthquakes. Tech. rep., USGS, Reston, VA.
James, G., Witten, D., Hastie, T., and Tibshirani, R. J., 2013. An Introduction to Statistical Learning.
Springer, New York, NY. ISBN 9781461471370, 1–440 pp.
Jarvis, A., Reuter, H. I., Nelson, A., and Guevara, E., 2008. Hole-filled seamless SRTM data V4.
Jayaram, N. and Baker, J., 2009. Correlation model for spatially distributed ground-motion intensities.
Earthquake Engineering & Structural Dynamics {...}.
JICA, 2002. The study on earthquake disaster mitigation in the Kathmandu Valley, Kingdom of Nepal.
Tech. rep., Japan International Cooperation Agency : Nippon Koei Co., Ltd. : Oyo Corp.
Kerle, N., 2013. Remote Sensing Based Post-Disaster Damage Mapping with Collaborative Methods.
Intelligent Systems for Crisis Management pp. 121–133. doi:10.1007/978-3-642-33218-0.
Kerle, N. and Hoffman, R. R., 2013. Collaborative damage mapping for emergency response : the role
of Cognitive Systems Engineering. Natural hazards and earth system sciences 13, 97–113.
Lallemant, D. and Kiremidjian, A., 2013. Rapid post-earthquake damage estimation using remote-
sensing and field-based damage data integration. In Safety, Reliability, Risk and Life-Cycle Perfor-
mance of Structures and Infrastructures, pp. 3399–3406. CRC Press.
Lallemant, D., Soden, R., Rubinyi, S., Loos, S., Barns, K., and Bhattacharjee, G., 2017. Post-
Disaster Damage Assessments as Catalysts for Recovery: A Look at Assessments Conducted in
the Wake of the 2015 Gorkha, Nepal, Earthquake. Earthquake Spectra 33, S435–S451. doi:
10.1193/120316EQS222M.
Loos, S., Barns, K., Bhattacharjee, G., Soden, R., Herfort, B., Eckle, M., Giovando, C., Girardot, B.,
Saito, K., Deierlein, G., Kiremidjian, A., Baker, J. W., and Lallemant, D., 2018. The Development and
Uses of Crowdsourced Building Damage Information based on Remote-Sensing. Tech. rep., Stanford,
CA.
McBratney, A. B., Odeh, I. O., Bishop, T. F., Dunbar, M. S., and Shatar, T. M., 2000. An overview of
pedometric techniques for use in soil survey. Geoderma 97, 293–327.
doi:10.1016/S0016-7061(00)00043-4.
Monfort, D., Negulescu, C., and Belvaux, M., 2019. Remote sensing vs. field survey data in a post-
earthquake context: Potentialities and limits of damaged building assessment datasets. Remote Sens-
ing Applications: Society and Environment 14, 46–59. doi:10.1016/j.rsase.2019.02.003.
Motaghian, H. R. and Mohammadi, J., 2011. Spatial estimation of saturated hydraulic conductivity from
terrain attributes using regression, kriging, and artificial neural networks. Pedosphere 21, 170–177.
doi:10.1016/S1002-0160(11)60115-X.
Nepal Earthquake Housing Reconstruction Multi-Donor Trust Fund, 2016. Nepal Earthquake Housing
Reconstruction Annual Report. Tech. rep., Nepal Earthquake Housing Reconstruction Multi-Donor
Trust Fund, Kathmandu, Nepal.
Odeh, I. O. A., McBratney, A. B., and Chittleborough, D. J., 1994. Spatial prediction of soil properties
from landform attributes derived from a digital elevation model. Geoderma 63, 197–214.
doi:10.1016/0016-7061(94)90063-9.
Oliver, M. A. and Webster, R., 2014. A tutorial guide to geostatistics: Computing and modelling vari-
ograms and kriging. Catena 113, 56–69. doi:10.1016/j.catena.2013.09.006.
REACH, 2014. Groundtruthing Open Street Map Building Damage Assessment: Haiyan Typhoon - The
Philippines. Tech. rep., April, REACH; American Red Cross; USAID.
Shelter Cluster Nepal, 2015. Shelter and Settlements Vulnerability Assessment: Nepal 25 April / 12 May
Earthquakes Response Nepal. Tech. rep., June, Shelter Cluster Nepal, Nepal.
Shome, N., Jayaram, N., and Rahnama, 2012. Uncertainty and Spatial Correlation Models for Earth-
quake Losses. In 15th World Conference on Earthquake Engineering (15WCEE), p. 10. Lisbon,
Portugal.
Thompson, E. M., Baise, L. G., Kayen, R. E., Tanaka, Y., and Tanaka, H., 2010. A geostatistical
approach to mapping site response spectral amplifications. Engineering Geology 114, 330–342.
Trendafiloski, G., Wyss, M., and Rosset, P., 2009. Loss Estimation Module in the Second Genera-
tion Software QLARM. In Second International Workshop on Disaster Casualties, June, pp. 1–10.
Cambridge, UK. ISBN 9789048194551. doi:10.1007/978-90-481-9455-1.
United Nations Office for the Coordination of Humanitarian Affairs, 2019. ReliefWeb - Informing
humanitarians worldwide.
Université catholique de Louvain (UCL) - CRED and Guha-Sapir, D., n.d. EM-DAT: The Emergency
Events Database.
Wald, D. J., Jaiswal, K. S., Marano, K. D., Garcia, D., So, E., and Hearne, M., 2012. Impact-Based
Earthquake Alerts with the U.S. Geological Survey’s PAGER System: What’s Next? In 15th World
Conference on Earthquake Engineering, Lisbon, Portugal.
Westrope, C., Banick, R., and Levine, M., 2014. Groundtruthing OpenStreetMap Building Damage
Assessment. Procedia Engineering 78, 29–39.
Worden, C. B., Thompson, E. M., Baker, J. W., Bradley, B. A., Luco, N., and Wald, D. J., 2018. Spatial
and Spectral Interpolation of Ground-Motion Intensity Measure Observations. Bulletin of the
Seismological Society of America. doi:10.1785/0120170201.
Worden, C. B. and Wald, D., 2016. ShakeMap Manual. Tech. rep.
Yun, S.-H., Hudnut, K., Owen, S., Webb, F., Sacco, P., Gurrola, E., Manipon, G., Liang, C., Fielding, E.,
Milillo, P., Hua, H., and Coletta, A., 2015. Rapid Damage Mapping for the 2015 Mw 7.8 Gorkha
Earthquake Using Synthetic Aperture Radar Data from COSMO-SkyMed and ALOS-2 Satellites.
Seismological Research Letters 86, 1549–1556. doi:10.1785/0220150152.