All content in this area was uploaded by Sabine Loos on Jun 05, 2020

G-DIF: A geospatial data integration framework to rapidly estimate post-earthquake damage

Sabine Loos a), M.EERI, David Lallemant b), M.EERI, Jack Baker a), M.EERI, Jamie McCaughey e), Sang-Ho Yun c), Nama Budhathoki d), Feroz Khan b), Ritika Singh d)

While unprecedented amounts of building damage data are now produced after earthquakes, stakeholders do not have a systematic method to synthesize and evaluate damage information, thus leaving many datasets unused. We propose a Geospatial Data Integration Framework (G-DIF) that employs regression kriging to combine a sparse sample of accurate field surveys with spatially exhaustive, though uncertain, damage data from forecasts or remote sensing. The framework can be implemented after an earthquake to produce a spatially distributed estimate of damage and, importantly, its uncertainty. An example application with real data collected after the 2015 Nepal earthquake illustrates how regression kriging can combine a diversity of datasets, and downweight uninformative sources, reflecting its ability to accommodate context-specific variations in data type and quality. Through a sensitivity analysis on the number of field surveys, we demonstrate that with only a few surveys, this method can provide more accurate results than a standard engineering forecast.

INTRODUCTION

From rapid engineering forecasts to crowdsourced maps, unprecedented amounts of building damage data are now being produced after earthquakes. The 2010 Haiti earthquake was the first time that response and recovery stakeholders had access to this amount of damage data, due to both technological advancements in remote sensing data acquisition and mandates to make that data openly available after major disasters (Corbane et al., 2011; Kerle and Hoffman, 2013). In fact, after 2010 there was a spike in the number of damage-related maps posted on ReliefWeb, a global information sharing site devoted to humanitarian disasters, in response to major earthquakes, even though these earthquakes had estimated economic damages similar to those of earlier events (Figure 1).

a) Stanford University, Stanford, CA 94305

b) Earth Observatory of Singapore, Nanyang Technological University, Singapore

c) Jet Propulsion Laboratory, California Institute of Technology, 4800 Oak Grove Drive, Pasadena, CA 91109

d) Kathmandu Living Labs, Kathmandu, Nepal

e) Institute for Environmental Decisions, Dept. Environmental Systems Science, ETH Zürich, Zürich, Switzerland

Figure 1. The number of damage-related maps posted on ReliefWeb, a disaster information sharing site, has increased since the 2010 Haiti earthquake. We would expect a similar number of maps for major events with similar estimated economic damages (shown in 2019 USD). The number of maps was scraped from ReliefWeb and economic damages were retrieved from EM-DAT (United Nations Office for the Coordination of Humanitarian Affairs, 2019; Université catholique de Louvain (UCL) - CRED and Guha-Sapir).

Counterintuitively, the increase in data is problematic since stakeholders, such as affected governments, multilateral donor organizations, and humanitarian organizations, receive a barrage of information and maps with unverified, competing damage estimates (Kerle, 2013). Often, data from new and untested methods are left unused when decisions need to be made quickly (Hunt and Specht, 2019). Stakeholders do not have a systematic method to quickly assess the accuracy of these data sources or to synthesize them. Furthermore, damage is commonly quantified using metrics that stakeholders cannot use to make crucial decisions within weeks of an earthquake (Bhattacharjee et al., 2018). For example, in as little as two weeks, the affected government uses damage data to estimate total losses for the Post Disaster Needs Assessment (PDNA) to request recovery aid. It is unclear how to 1) translate multiple remotely sensed damage maps that show damage intensity per pixel, like the maps shown in Kerle and Hoffman (2013), into usable metrics to estimate loss and 2) know which map is most accurate. If damage estimates in the PDNA are inaccurate, the affected government could under- or overestimate the amount of aid requested, and subsequently distributed, for recovery. Because of these issues, many damage data are left unused. This paper outlines a Geospatial Data Integration Framework (G-DIF) to systematically integrate multiple sources of damage data into a single spatially distributed estimate of damage with quantified uncertainty to ease decision-making and improve the accuracy of post-earthquake damage estimates.

Integrating post-earthquake damage data is challenging since they are produced at different times with varying geospatial coverages, formats, and levels of uncertainty. While a few research studies have attempted to improve the accuracy of remote sensing and crowdsourced damage data, none have developed generalized methods to combine multiple data sources into a single, high-resolution, and spatially distributed estimate of building damage. For example, Booth et al. (2011) used Bayesian analysis to update the ratio of collapsed buildings in an affected area from manual assessments of satellite imagery with additional satellite assessments and field surveys after the 2010 Haiti earthquake but produced collapse probability distributions for four low-resolution land-use classes rather than high-resolution spatial estimates. Alternatively, some studies treat post-earthquake damage data as inputs and validation for vulnerability curves within an engineering forecast (e.g. Gunasekera et al., 2018; Huyck, 2015), but do not update the final damage estimate itself. Rather than estimating damage, some studies have used multiple damage data to develop maps of shaking intensity (e.g. Monfort et al., 2019). Finally, Lallemant and Kiremidjian (2013) applied cokriging to integrate a crowdsourced assessment with a set of field surveys, but this method was not generalized to incorporate multiple damage data sources.

As opposed to existing methods, which rely on only one to two damage datasets, we propose a framework that is able to integrate multiple heterogeneous data sources to produce a single spatial damage prediction in the weeks after an earthquake. Specifically, the geostatistical model implemented in G-DIF, regression kriging, requires a limited sample of primary damage data from field surveys, which are accurate but have low spatial coverage, to predict damage using secondary damage data, which have lower accuracy but higher spatial coverage. Within this framework, we employ a geostatistical integration method, since damage to nearby buildings is likely correlated within the range of spatial correlation of ground motion because of similarities in construction age and material, local soil conditions, and multiple other factors (Shome et al., 2012). By modeling this spatial correlation parametrically, G-DIF does not rely on large field survey samples as training data, unlike most machine learning models. Therefore, instead of relying on a model that is built with training data from one location and may not transfer well between different built environments and different data sources, G-DIF can be developed after an event using its specific data, leading to locally calibrated damage estimates. Because of these features, similar geostatistical techniques have been previously applied to integrate data in other fields, such as for mapping atmospheric optical thickness (e.g. Chatterjee et al., 2010) and soil properties (e.g. Hengl et al., 2004; Thompson et al., 2010).

In this paper, we illustrate the implementation of the framework with an example application using real damage data collected after the 2015 Nepal earthquake. In this example, we show how G-DIF produces a single map of damage and a map of the estimation uncertainty, which can be used to model economic losses and guide further field surveying, respectively. Compared to traditional methods of rapidly estimating post-earthquake damage, G-DIF results in a damage estimate that has lower overall error and higher resolution and is specific to each context.

POST-EARTHQUAKE DAMAGE DATA SUITED FOR G-DIF

G-DIF makes use of two types of damage information: primary measurement data with high accuracy and sparse spatial coverage, plus secondary proxy data with low accuracy and dense spatial coverage. Examples of primary data include field surveys of damage, while secondary data include engineering forecasts, remotely sensed proxies, or relevant geospatial covariates from before or after the event (e.g. intensity or elevation). All information is assumed to be numerical (e.g. collapse rate) rather than descriptive (e.g. social media posts). In this section, we outline the time of availability and format of the damage data suited for G-DIF, as shown in Figure 2.

FIELD SURVEYS

Field surveys of damaged buildings are often conducted following earthquakes. These include surveys conducted by reconnaissance teams to understand the scale and type of building damage, rapid engineering safety evaluations to inform people of the safety of reoccupying buildings, and detailed, recovery-oriented surveys as time progresses (Earthquake Engineering Research Institute, 2015; Lallemant et al., 2017). These field surveys include an evaluation of the level of damage for each inspected building. The two most prevalent methods to assign damage levels are the ATC-20 methodology and the EMS-98 grading system, where engineers classify building damage in damage states or grades, respectively, based on descriptive damage conditions (Applied Technology Council, 1989; Grünthal, 1998). Since engineers inspect each building from the ground, field survey assessments are the most accurate measurement of damage relative to other damage data. The timing of early field surveys varies between disasters: past examples from the REACH survey, the government, and reconnaissance teams have shown organized surveys to be conducted in the first 6 weeks (Shelter Cluster Nepal, 2015; Lallemant et al., 2017; Earthquake Engineering Research Institute, 2015). While full coverage of on-the-ground surveys takes months to even years after a major event, G-DIF leverages these early surveys to provide calibration of predictions and constraints at the survey locations.

Figure 2. Timeline of availability of post-earthquake damage data suited for G-DIF, based on the review of damage assessments by Lallemant et al. (2017). Data sources with lower accuracy but dense spatial coverage are available soonest after an earthquake. Once a limited sample of field surveys is collected, enough data are available for G-DIF. The time to collect a sufficient number of field surveys can vary by region (in Nepal, it could feasibly be done in a couple of weeks); however, a couple of weeks is sufficient for early recovery decisions.

ENGINEERING FORECASTS

Engineering forecasts are near-real-time predictions of regional impact available within hours, as soon as a map of shaking intensity can be derived from the magnitude and location of the earthquake source (Jaiswal et al., 2009). Multiple global systems exist, the most widely used being the Prompt Assessment of Global Earthquakes for Response (PAGER) system (Jaiswal and Wald, 2011). These systems typically use an analytical or empirical model that relates shaking intensity to impact measures such as building damage, casualties, or economic loss. These models usually rely on information on the earthquake shaking in terms of peak ground motion or intensity, building and population exposure, and fragility functions (Erdik et al., 2014). While systems like PAGER aggregate their models to country-level impact estimates, alternative systems, such as the Quake Loss Assessment for Response and Mitigation (QLARM), provide spatially distributed model predictions (Trendafiloski et al., 2009). Since engineering forecasts are model-based, rather than observation-based, these predictions are inherently uncertain, especially in regions with limited seismic stations and building inventory data (Wald et al., 2012; Erdik et al., 2014).

REMOTE SENSING-DERIVED DAMAGE DATA

Remote sensing-derived damage data are observations related to damage, retrieved from earth observation technologies such as sensors mounted on satellites, aircraft, or unmanned aerial vehicles. These signals can be interpreted automatically through computer algorithms or manually by humans, each with a range of formats (Dong and Shan, 2013; Kerle, 2013). Depending on the interpretation method, the data are either damage proxies, which provide an idea of damage intensity, or assessments, which provide direct measurements of damage. For example, the Advanced Rapid Imaging and Analysis project at NASA's Jet Propulsion Laboratory and California Institute of Technology produces damage proxy maps (DPM) for major disasters based on automatic change detection between two pairs of images from interferometric synthetic-aperture radar (InSAR) data, thus providing a measure of intensity (Yun et al., 2015). Alternatively, digital humanitarian groups, such as Humanitarian OpenStreetMap Team (HOT) or the Global Earth Observation-Catastrophe Assessment Network (GEO-CAN), have manually identified damaged and collapsed buildings in optical satellite and aerial imagery, respectively (Westrope et al., 2014; Loos et al., 2018; Ghosh et al., 2011). The availability of remote sensing-derived damage data depends on the retrieval of the underlying remote sensing data, typically within a few days to a couple of weeks (Dong and Shan, 2013; Lallemant et al., 2017). While remote sensing-derived damage data have denser spatial coverage than field surveys, these estimates have varying accuracy depending on the type of imagery or interpretation used (Loos et al., 2018; Dong and Shan, 2013; Monfort et al., 2019).

GEOSPATIAL DATA INTEGRATION FRAMEWORK

Our goal is to estimate the true building damage, $Z$, which is the assigned damage grade for a building from a field survey. We formulate the true damage as a function of location, $s$, so $Z(s)$ is a continuous variable. The region is discretized into a grid, so that $Z(s)$ is defined at a countable number of locations. When the grid dimension encompasses multiple buildings, $Z$ can be defined as the average damage grade (hereon referred to as mean damage) of the buildings or the fraction of buildings that fall within a given grade.
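The two grid-level definitions of $Z$ can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `aggregate_damage`, the `(cell_id, grade)` input layout, and the sample grades are hypothetical and not taken from the study's data.

```python
from collections import defaultdict


def aggregate_damage(surveys, target_grade=5):
    """Aggregate building-level damage grades to grid cells.

    surveys: iterable of (cell_id, grade) pairs, with grade in 1..5.
    Returns {cell_id: (mean_grade, fraction_in_target_grade)}.
    """
    grades_by_cell = defaultdict(list)
    for cell_id, grade in surveys:
        grades_by_cell[cell_id].append(grade)

    summary = {}
    for cell_id, grades in grades_by_cell.items():
        mean_grade = sum(grades) / len(grades)                     # mean damage
        fraction = sum(g == target_grade for g in grades) / len(grades)
        summary[cell_id] = (mean_grade, fraction)
    return summary


# Hypothetical surveys: three buildings in cell "A", two in cell "B".
surveys = [("A", 5), ("A", 4), ("A", 2), ("B", 1), ("B", 3)]
result = aggregate_damage(surveys)
```

Either summary could serve as $Z(s)$ for a grid cell; the mean grade is the variant used later in the Nepal application.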

We consider the true damage as a random spatial process composed of two parts: 1) the mean surface, which is the average damage throughout space and 2) small-scale fluctuations around the mean surface. In the case of earthquake-induced building damage, the mean surface will exhibit a general trend in space, because of characteristics such as shaking intensity that have large-scale spatial variation. We model this trend parametrically. We expect the small-scale fluctuations (hereon the residuals) to exist, resulting from smaller-scale similarities in factors such as construction characteristics and local soil conditions. Because of these small-scale similarities, we model the residuals as stochastic and spatially auto-correlated, meaning that the residual at one location is correlated with the residuals at nearby locations.

The true building damage $Z$ at a single location $s$ can therefore be represented as the sum of the trend, $m(s)$, and stochastic residual, $\varepsilon(s)$,

$$ Z(s) = m(s) + \varepsilon(s). \tag{1} $$

To illustrate, consider two communities $A$ and $B$: community $A$ is closer to the earthquake source and experienced greater shaking, and therefore damage, than the more distant community $B$. The average difference in damage between $A$ and $B$ is represented by the trend, $m(s)$. Beyond that, the buildings in the grids in and around $A$ are constructed similarly (built with the same material in the same year), causing similar damage. The local similarities in damage surrounding a grid are represented by the spatially correlated residual $\varepsilon(s)$.

Note that $Z(s)$ is defined as the true damage, since a field surveyed assessment is the most accurate measurement of damage available after an earthquake. Uncertainty in a field survey still exists due to the subjectivity of the surveyor and the additional uncertainty introduced from aggregating the surveys to a grid. Here, however, we consider $Z(s)$ to be exact and only account for the uncertainty in the estimation of the trend and the spatially correlated residuals.

G-DIF capitalizes on 1) the correlation between the sparse field surveys and secondary damage data to estimate the trend and 2) the auto-correlation between the field surveys to estimate the residuals. The geostatistical data integration model implemented in G-DIF is regression kriging (also known as residual kriging), a multivariate geostatistical regression technique, which consists of two separate models for the trend and the residuals (Odeh et al., 1994). Separate modeling of the trend and residuals allows for alternative regressions that consider nonlinear relationships between primary and secondary data and separate interpretation of each model's results. The main steps of the framework are in Figure 3.

Figure 3. G-DIF steps to produce spatial estimates of regional damage.

DATA PRE-PROCESSING

We separate the input data for G-DIF into two sets of locations. There are $p$ secondary datasets, $X_1, \ldots, X_p$, that are spatially exhaustive and available at all $n$ locations, with an additional set of primary field survey data at a subset of $n_{fs}$ locations. The collocated primary and secondary data at the $n_{fs}$ field survey locations are used for developing a regression function, which is then used to estimate the trend at all $n$ locations. Similarly, the spatial correlation model is developed using the $n_{fs}$ field locations. Generally, the set of field surveys should be large enough to build a regression model for the trend ($n_{fs} \gg p$) and have samples at each damage level and at varying distances from each other. In this paper, we assume that the set of field surveys includes observations of the full range of damage levels and is carried out at random grids distributed throughout the spatial domain in order to produce unbiased estimates of the trend and variogram (this assumption has important implications for survey sampling, which we revisit in the sensitivity analysis and conclusion sections). The vector of field surveys ($\mathbf{Z}$) and matrix of secondary datasets ($\mathbf{X}$) for model development are

$$ \mathbf{Z} = \begin{bmatrix} Z(s_1) \\ \vdots \\ Z(s_{n_{fs}}) \end{bmatrix}, \qquad \mathbf{X} = \begin{bmatrix} X_1(s_1) & \cdots & X_p(s_1) \\ \vdots & \ddots & \vdots \\ X_1(s_{n_{fs}}) & \cdots & X_p(s_{n_{fs}}) \end{bmatrix}. $$

To model the trend, we develop a regression function, $f$, which predicts the true damage at the field survey locations, $\mathbf{Z}$, as a function of the damage from the secondary data, $\mathbf{X}$. We use the developed regression function to estimate the trend at a single, unknown location, $s_0$:

$$ \hat{m}(s_0) = f(X(s_0)). \tag{2} $$

TREND MODEL

The function $f$ is the modeler's choice and will generally be earthquake-specific. Because the choice of trend model is likely to be dependent on the data available, it is important to develop this function manually to obtain accurate estimates of the final damage. It is common to apply ordinary least squares (OLS) for trend estimation. Alternatively, generalized least squares (GLS), which weights observations by their spatial covariance, accounts for spatial correlation in the residuals and leads to an unbiased estimate of the coefficients. The use of GLS leads to results most similar to estimating the trend and residual simultaneously, as with universal kriging (Hengl et al., 2003; Chiles and Delfiner, 2012). In either formulation, both linear and nonlinear least-squares regression functions can be applied. Other functions such as generalized additive models, regression trees, and artificial neural networks have also been explored within this general approach (McBratney et al., 2000; Grujic, 2017; Motaghian and Mohammadi, 2011). In addition, separate trend models can be developed for different regions that have varying coverage of secondary data. This could be the case for imagery-based damage data that can be limited in geographical extent, which we demonstrate in our application to Nepal.
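As a minimal sketch of the OLS option for $f$, the following Python fits a linear trend by solving the normal equations and then evaluates it at a new location, as in Equation 2. The helper names (`solve`, `fit_ols`, `predict_trend`) and the covariate values are illustrative assumptions, not the study's code or data; a GLS fit would additionally weight the observations by the inverse of the residual covariance, which is omitted here.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_ols(X, z):
    """Return [intercept, beta_1, ..., beta_p] minimizing the squared error."""
    rows = [[1.0] + list(x) for x in X]            # prepend the intercept column
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xtz = [sum(r[i] * zi for r, zi in zip(rows, z)) for i in range(k)]
    return solve(XtX, Xtz)


def predict_trend(beta, x0):
    """Evaluate the fitted trend at one location with covariates x0."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], x0))


# Hypothetical covariates at five surveyed cells: (forecast ratio, intensity).
X = [(0.1, 6.0), (0.3, 7.0), (0.5, 7.5), (0.7, 8.0), (0.2, 8.5)]
z = [1.0 + 2.0 * a + 0.25 * b for a, b in X]       # noiseless synthetic damage
beta = fit_ols(X, z)
m_hat = predict_trend(beta, (0.4, 7.2))
```

Because the synthetic damage values are noiseless, the fit recovers the generating coefficients; with real surveys the coefficients would carry standard errors that feed into the trend variance.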

SPATIAL CORRELATION MODEL

With the developed trend function we estimate the trend at all $n$ locations and calculate the residuals at each of the $n_{fs}$ field surveyed locations:

$$ \varepsilon(s_\alpha) = Z(s_\alpha) - \hat{m}(s_\alpha), \quad \text{for } \alpha = 1, \ldots, n_{fs}. \tag{3} $$

Using the calculated residuals, we perform ordinary kriging to estimate the residuals at the unknown locations using a spatial correlation model. The estimated residual at a single, unknown location is the weighted sum of the known residuals from the field surveyed locations

$$ \hat{\varepsilon}(s_0) = \sum_{\alpha=1}^{n_{fs}} \lambda_\alpha(s) \cdot \varepsilon(s_\alpha) \tag{4} $$

where $\lambda_\alpha$ are the kriging weights.

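We solve for the kriging weights, $\boldsymbol{\lambda} = \lambda_1, \ldots, \lambda_{n_{fs}}$, by minimizing the estimation variance at the surveyed locations and placing a constraint on the sum of the weights to equal one to satisfy the unbiasedness conditions assumed with ordinary kriging (Chiles and Delfiner, 2012).

$$ \min_{\lambda_1, \ldots, \lambda_{n_{fs}}} \operatorname{var}\left(\hat{\varepsilon}(s_\alpha) - \varepsilon(s_\alpha)\right) + 2\nu\left(\sum_{\alpha=1}^{n_{fs}} \lambda_\alpha - 1\right). \tag{5} $$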

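We obtain the $\boldsymbol{\lambda}$ that minimizes Equation 5 by introducing a Lagrange multiplier $\nu$ and setting the function's partial derivatives with respect to $\boldsymbol{\lambda}$ and $\nu$ equal to zero. This results in the following ordinary kriging system of $n_{fs} + 1$ equations with $n_{fs} + 1$ unknowns ($\boldsymbol{\lambda}$ and $\nu$):

$$ \begin{bmatrix} \underset{n_{fs} \times n_{fs}}{\mathbf{C}} & \underset{n_{fs} \times 1}{\mathbf{1}} \\ \underset{1 \times n_{fs}}{\mathbf{1}^\top} & 0 \end{bmatrix} \begin{bmatrix} \underset{n_{fs} \times 1}{\boldsymbol{\lambda}} \\ \nu \end{bmatrix} = \begin{bmatrix} \underset{n_{fs} \times 1}{\mathbf{C}_0} \\ 1 \end{bmatrix}, \tag{6} $$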

where $\mathbf{C}$ is the auto-covariance matrix between the known residuals and $\mathbf{C}_0$ is the covariance between the new estimation location and all field survey locations. Here, we assume second-order stationarity of the residuals, meaning the autocovariance is the same for any two points based on their separation distance, $h$, and irrespective of their location. The auto-covariance $\mathbf{C}$ is derived from a variogram, a concept similar to the correlation models used for ground-motion intensities (Boore et al., 2003; Goda and Hong, 2008; Jayaram and Baker, 2009). The variogram is a theoretical parametric model of spatial correlation that relates the separation distance $h$ between field surveyed locations and the dissimilarity of their residuals. Dissimilarity in the variogram is quantified using half the variance of the difference between residuals, or the empirical semivariance

$$ \gamma(h) = \frac{1}{2}\operatorname{var}\left[\varepsilon(s) - \varepsilon(s+h)\right] = \frac{1}{2} E\left\{\left[\varepsilon(s) - \varepsilon(s+h)\right]^2\right\} \tag{7} $$

where $h$ is the Euclidean distance. A theoretical variogram is then fit through all $(\gamma, h)$ pairs. Selection of an appropriate theoretical variogram should be based on the lowest error from cross-validation (Oliver and Webster, 2014).
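The empirical semivariance in Equation 7 can be estimated from residual pairs grouped by separation distance. The sketch below is one possible implementation, assuming planar coordinates in kilometers and a nearest-lag binning rule; the function name, the `lag` parameter, and the sample residuals are hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def empirical_semivariogram(coords, residuals, lag):
    """Return {lag_distance: semivariance} from all pairs of points.

    Each pair is assigned to the nearest multiple of `lag`, and each lag class
    holds half the mean squared difference of its paired residuals.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(coords[i], coords[j])      # Euclidean separation
            k = round(h / lag)                       # index of the nearest lag
            sums[k] += (residuals[i] - residuals[j]) ** 2
            counts[k] += 1
    return {k * lag: sums[k] / (2 * counts[k]) for k in sorted(sums)}


# Hypothetical residuals at four surveyed cells along a line (coordinates in km).
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
residuals = [0.2, 0.1, -0.3, -0.4]
gamma = empirical_semivariogram(coords, residuals, lag=1.0)
```

In this toy example the semivariance grows with distance, which is the pattern a theoretical variogram model would then be fit to.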

DAMAGE AND UNCERTAINTY ESTIMATE

The final damage estimate at a single location is obtained by adding together the estimated trend and residuals from Equations 2 and 4, respectively, as shown in Equation 1

$$ \hat{Z}(s_0) = f(X(s_0)) + \sum_{\alpha=1}^{n_{fs}} \lambda_\alpha(s) \cdot \varepsilon(s_\alpha). \tag{8} $$

Once we develop the final damage estimate for all locations, $\hat{Z}$, it can be used to estimate further decision variables (e.g. the spatial distribution of economic losses).

In addition, this method provides the variance of the damage estimate, $\hat{\sigma}^2(s_0)$, which is the sum of the individual variances from estimating the trend, $\hat{\sigma}^2_m(s_0)$, and kriging the residuals, $\hat{\sigma}^2_\varepsilon(s_0)$.

$$ \hat{\sigma}^2(s_0) = \hat{\sigma}^2_m(s_0) + \hat{\sigma}^2_\varepsilon(s_0). \tag{9} $$

The estimation variance can be used to propagate uncertainty in further loss estimates or to guide where to carry out additional field surveys.
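The following Python sketch ties Equations 4, 6, 8, and 9 together for one prediction location. It is an illustration under several assumptions: an exponential covariance is used for the residuals, the sill, range, coordinates, residuals, and trend values are hypothetical numbers, and the small `solve` helper stands in for a linear algebra library. The kriging variance is computed with the standard ordinary kriging expression $C(0) - \boldsymbol{\lambda}^\top \mathbf{C}_0 - \nu$, which follows from the system in Equation 6.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def exp_cov(h, sill, rng):
    """Assumed exponential covariance as a function of separation distance."""
    return sill * math.exp(-abs(h) / rng)


def krige_residual(coords, residuals, s0, sill, rng):
    """Ordinary kriging at s0; returns (estimate, variance, weights)."""
    n = len(coords)
    # Left-hand side of the kriging system: [[C, 1], [1^T, 0]].
    A = [[exp_cov(math.dist(ci, cj), sill, rng) for cj in coords] + [1.0]
         for ci in coords]
    A.append([1.0] * n + [0.0])
    c0 = [exp_cov(math.dist(ci, s0), sill, rng) for ci in coords]
    solution = solve(A, c0 + [1.0])
    weights, nu = solution[:n], solution[n]
    estimate = sum(w * e for w, e in zip(weights, residuals))
    variance = sill - sum(w * c for w, c in zip(weights, c0)) - nu
    return estimate, variance, weights


# Hypothetical surveyed cells (km), residuals, and trend-model output at s0.
coords = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (5.0, 5.0)]
residuals = [0.4, -0.2, 0.1, -0.3]
trend_s0, trend_var_s0 = 3.1, 0.05
eps_hat, krig_var, weights = krige_residual(coords, residuals, (1.0, 1.0),
                                            sill=0.5, rng=9.4)
z_hat = trend_s0 + eps_hat            # final damage estimate (Equation 8)
total_var = trend_var_s0 + krig_var   # estimation variance (Equation 9)
```

Two properties are worth checking in any such implementation: the weights sum to one, and predicting at a surveyed location returns the observed residual with zero kriging variance.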

APPLICATION TO THE 2015 NEPAL EARTHQUAKE

In this section, we demonstrate the applicability of G-DIF by using real data produced after the 2015 Mw 7.8 Nepal earthquake to estimate damage over the 11 heavily affected and mostly rural districts outside of Kathmandu Valley. We assume this model would have been applied approximately two to four weeks following the earthquake (i.e., the vertical line in Figure 2), when enough field surveys are available to implement G-DIF. For this example, we use field surveys at 100 random locations plus representative data sources for each type of secondary damage data. We present this case study in the order of the flowchart in Figure 3.

1. DAMAGE DATA

The measurement unit and spatial support of each input dataset used in this case study are listed in Table 1.

Table 1. Data from the 2015 Nepal earthquake used in the application of G-DIF

Damage data category | Dataset used in case study | Measurement unit | Spatial support
Field surveys | EMS-98 field surveys (Z) | Damage grade | Building-level
Engineering forecast | Self-developed (X1) | Mean damage ratio | 1 km grid
Remote sensing proxy | InSAR-based damage proxy map (X2) | Damage proxy map value | 30 m grid
Relevant geospatial covariates | ShakeMap (X3) | Modified Mercalli Intensity | 1.75 km grid
Relevant geospatial covariates | Digital Elevation Model (X4) | Elevation (m) | 90 m grid

The damage survey data for this case study come from the Earthquake Housing Damage and Characteristics Survey commissioned by the Government of Nepal and completed by July 2016 (http://eq2015.npc.gov.np/#/). The purpose of that survey was to identify rural households that would be eligible beneficiaries for the Earthquake Housing Reconstruction Program, and it was therefore carried out in the 11 most-affected rural districts, not including the three districts in Kathmandu Valley (Nepal Earthquake Housing Reconstruction Multi-Donor Trust Fund, 2016). In this survey, trained engineers used the EMS-98 damage grading system to classify a census of 751,799 buildings in these districts into a damage grade from 1 (negligible to slight damage) to 5 (collapse). While this exhaustive survey was completed a year after the earthquake, we consider only a random sample of 100 locations in order to replicate what would be available rapidly after an event.

We developed an engineering forecast dataset with similar methods and quality to engineering forecasts available after earthquakes in countries with limited building inventory data. We use fragility curves from Nepal's National Society of Earthquake Technology to relate the peak ground acceleration from the latest ShakeMap to damage ratios for masonry (mud and cement mortared), reinforced concrete, and wood structures (JICA, 2002; Worden et al., 2018). The exposure is defined using population estimates from the LandScan 2011 High Resolution Global Population Dataset and ratios of each construction type available at the district level in Nepal's 2011 census (Bright et al., 2012). Given the estimated number of buildings, the estimated distribution of each construction type, and the fragility curve for each construction type, we compute the mean damage ratio per grid.
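The exposure-weighted calculation described above can be sketched as follows. Everything numerical here is an assumption for illustration: the lognormal curve shape, the median and dispersion parameters in `CURVES`, and the construction shares are hypothetical placeholders, not the JICA (2002) curves or census ratios used in the study.

```python
import math


def lognormal_curve(pga, median, beta):
    """Lognormal-CDF curve mapping peak ground acceleration (g) to a damage ratio."""
    return 0.5 * (1.0 + math.erf(math.log(pga / median) / (beta * math.sqrt(2.0))))


# Hypothetical curve parameters: (median PGA in g, log standard deviation).
CURVES = {
    "mud_masonry": (0.25, 0.6),
    "cement_masonry": (0.40, 0.6),
    "reinforced_concrete": (0.70, 0.6),
    "wood": (0.60, 0.6),
}


def mean_damage_ratio(pga, shares):
    """Exposure-weighted mean damage ratio for one grid cell.

    shares: {construction_type: fraction of buildings}, summing to one.
    """
    return sum(share * lognormal_curve(pga, *CURVES[ctype])
               for ctype, share in shares.items())


# Hypothetical district-level construction mix applied to one grid cell.
shares = {"mud_masonry": 0.7, "cement_masonry": 0.15,
          "reinforced_concrete": 0.05, "wood": 0.10}
ratio_low = mean_damage_ratio(0.2, shares)
ratio_high = mean_damage_ratio(0.6, shares)
```

Because the construction shares are only known at the district level, every grid cell in a district shares the same mix in this sketch, and differences between cells come from the shaking alone.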

For the remote sensing proxy, we use NASA's damage proxy map (DPM) (Yun et al., 2015). NASA has consistently produced a DPM after major disasters since the February 2011 M6.3 Christchurch earthquake, making it a relevant remote sensing proxy to include in this study. The DPM algorithm takes the difference between two InSAR coherence (or similarity) maps: one from before the earthquake and one spanning the earthquake. The DPM value in each pixel (which ranges from -1 to 1) represents anomalous change due to the earthquake, as opposed to background changes (noise) that existed in the pre-earthquake pair coherence.
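The differencing step alone can be sketched as below. This is a simplification under stated assumptions: the sign convention (pre-event minus co-event coherence) and the coherence values are assumed for illustration, and the operational DPM processing described by Yun et al. (2015) involves additional steps that are not reproduced here.

```python
def damage_proxy(coh_pre, coh_co):
    """Per-pixel difference between pre-event and co-event coherence maps.

    Both inputs are equally sized nested lists with values in [0, 1], so the
    output lies in [-1, 1]; larger values indicate a larger loss of coherence.
    """
    return [[pre - co for pre, co in zip(row_pre, row_co)]
            for row_pre, row_co in zip(coh_pre, coh_co)]


# Hypothetical 2 x 2 coherence maps.
coh_pre = [[0.90, 0.80], [0.70, 0.20]]
coh_co = [[0.85, 0.30], [0.70, 0.25]]
dpm = damage_proxy(coh_pre, coh_co)
```

In this toy example only the top-right pixel shows a large coherence loss; the other pixels stay near zero, which corresponds to background change.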

We also consider two geospatial covariates that are available after earthquakes and relate to the trend in damage: the Modified Mercalli Intensity from the ShakeMap (Worden and Wald, 2016), and a Digital Elevation Model (DEM) derived from the Shuttle Radar Topography Mission (Jarvis et al., 2008; Farr et al., 2007). While elevation may not directly cause earthquake damage, it could serve as a proxy for other factors such as construction quality in remote areas or landslide occurrence. The use of elevation data for the application of G-DIF in Nepal demonstrates how the trend model down-weights secondary datasets that are poor proxies for damage, as shown in the modeling results for the trend (Section 4.3).

2. DATA PRE-PROCESSING AND EXPLORATION

We discretize each dataset to a common grid of 0.0028° × 0.0028° (~290 m × 290 m), resulting in a study area with 80,200 grid points. We use this resolution to remove any personally identifiable information, ensuring that more than one building is within each grid. The 11 considered districts are mostly rural, so there are nine buildings per grid on average (though 0.25% have 100 or more buildings).

The true damage from the field surveys, $Z$, is the mean damage grade of all buildings within each grid. Out of the 80,200 grids that contain buildings, we randomly selected 100 grids (containing 1,056 buildings) as the set of locations that engineers could survey in the field. From here on, we refer to this subset of grids as the field surveyed locations.
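A minimal sketch of the gridding and random selection steps is given below. The grid origin, building coordinates, sample size, and random seed are hypothetical; only the 0.0028° cell size comes from the text.

```python
import math
import random

CELL = 0.0028  # grid size in degrees, as stated in the text


def cell_index(lon, lat, lon0, lat0, cell=CELL):
    """Map a coordinate to integer (column, row) indices of the common grid."""
    return (math.floor((lon - lon0) / cell), math.floor((lat - lat0) / cell))


def sample_cells(cells, n_fs, seed=0):
    """Randomly select n_fs distinct grid cells to act as surveyed locations."""
    rng = random.Random(seed)           # fixed seed for a reproducible sample
    return rng.sample(sorted(cells), n_fs)


# Hypothetical building coordinates (lon, lat) and grid origin.
buildings = [(85.0010, 28.0010), (85.0015, 28.0020),
             (85.0040, 28.0010), (85.0100, 28.0100)]
occupied = {cell_index(lon, lat, 85.0, 28.0) for lon, lat in buildings}
surveyed = sample_cells(occupied, n_fs=2)
```

Sampling is done over occupied cells rather than buildings, which mirrors the text: the survey unit is the grid, and every building inside a selected grid contributes to its mean damage grade.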

Exploratory analysis shows a positive relationship between the true damage and the secondary damage data, as exhibited in the moving average curves in the left column of the matrix in Figure 4. Specifically, the engineering forecast, shaking intensity, and elevation are linearly related to the true damage, while the DPM shows a slightly nonlinear relationship. The form of these discovered relationships should be considered when deciding on which trend model to use.

Figure 4. Summary of true damage from primary field survey and secondary damage data at all locations ($n$ = 80,200) and the subset of field surveyed locations ($n_{fs}$ = 100). The diagonal shows histograms of each dataset, the scatter plots show relationships between datasets (including a moving average estimate), and the bottom row maps the spatial patterns of each dataset (with warm colors indicating larger values). The left column of scatter plots highlights relationships between primary and secondary data.

3. TREND MODEL

Based on our observations of linearity between variables, we used a linear least squares regression as the functional relationship between the true damage and each secondary damage dataset

$$ \hat{m}(s_0) = \sum_{k=0}^{p} \hat{\beta}_k \cdot X_k(s_0) \tag{10} $$

where $X_0$ is a vector of ones to estimate the intercept. We estimate the coefficient for each secondary damage dataset, $\hat{\beta}_k$, through either ordinary least squares (OLS) regression or generalized least squares (GLS) regression. We select the regression function that results in the lowest root mean squared error, which is OLS regression in this example. We also build two trend models for areas with and without DPM values, since the DPM covers about 40% of the considered region (more details are in the Appendix). We compute the variance inflation factor (VIF) for each secondary data variable to assess whether multicollinearity exists (James et al., 2013), but find that the VIFs for all variables are below two, a low value that indicates multicollinearity is not a problem with these data.
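The VIF check can be sketched with the identity that the VIF of variable $k$ equals the $k$-th diagonal entry of the inverse correlation matrix of the predictors. The function names and the toy columns below are hypothetical, and the small `solve` helper again stands in for a linear algebra library.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def vif(columns):
    """Variance inflation factor for each predictor column."""
    n, p = len(columns[0]), len(columns)
    means = [sum(col) / n for col in columns]
    norms = [math.sqrt(sum((v - m) ** 2 for v in col))
             for col, m in zip(columns, means)]
    # Centered, unit-length columns: their inner products are correlations.
    z = [[(v - m) / s for v in col] for col, m, s in zip(columns, means, norms)]
    R = [[sum(a * b for a, b in zip(z[i], z[j])) for j in range(p)]
         for i in range(p)]
    factors = []
    for k in range(p):
        unit = [1.0 if i == k else 0.0 for i in range(p)]
        factors.append(solve(R, unit)[k])   # k-th diagonal entry of R^{-1}
    return factors


# Hypothetical predictors: x1 and x2 are correlated (r = 0.8), x3 is not.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 3.0, 2.0, 4.0]
x3 = [1.0, -1.0, -1.0, 1.0]
correlated = vif([x1, x2])
uncorrelated = vif([x1, x3])
```

For two predictors the result reduces to $1/(1 - r^2)$, so a correlation of 0.8 gives a VIF of about 2.8, while uncorrelated predictors give a VIF of one.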

Building a trend model with the data at the ﬁeld surveyed locations has two advantages.

First, the function in Equation 10 translates the numerous secondary damage data with differing

units of measurement (e.g., shaking amplitude, elevation, an arbitrary numerical scale for DPM)

into a collective unit, the mean damage grade, that has value for regional loss estimates and other

decision-making.

The second advantage of the trend model is that the modeler does not need to subjectively

weight the importance of each secondary damage data, instead allowing the data to determine

the importance of each dataset through the model coefﬁcients. By examining each ˆ

βkand its

standard error, we observe which secondary dataset provides additional value in modeling the

trend. For example, in Figure 5a, we see the digital elevation model (DEM) has close to a

zero coefﬁcient in the estimated trend, signifying the DEM has little additional effect on the

trend estimate when we account for the other secondary damage data. These coefﬁcients are

comparable since we normalize all variables before developing the trend model. If we estimate

a zero coefﬁcient for all secondary damage data, then the trend reduces to a constant mean (the

intercept). Note that the estimated coefﬁcients shown in blue in Figure 5a are dependent on

the set of ﬁeld surveyed grids and therefore may differ from the true coefﬁcient estimates as

shown in black. These coefﬁcient estimates are also speciﬁc to the Nepal earthquake, which

was a largely rural disaster; it is not a comment on the general utility of each dataset among all

earthquakes. Because the parameters of the trend model are based on its data inputs at the ﬁeld

surveyed locations, G-DIF is calibrated to the data available after each speciﬁc earthquake.
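The comparability of coefficients rests on the normalization step. The text does not state which normalization is used, so the sketch below assumes a z-score (zero mean, unit standard deviation); the variable names and values are hypothetical.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Rescale a variable to zero mean and unit standard deviation."""
    mean, sd = fmean(values), pstdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical raw values at five surveyed grids, in their native units.
shaking_mmi = [6.5, 7.0, 7.8, 8.1, 8.6]
elevation_m = [450.0, 1200.0, 980.0, 2100.0, 1650.0]

shaking_z = standardize(shaking_mmi)
elevation_z = standardize(elevation_m)
print([round(v, 2) for v in shaking_z])
print([round(v, 2) for v in elevation_z])
```

After this rescaling, a one-unit change means one standard deviation for every variable, so the magnitudes of the fitted coefficients can be compared directly across datasets.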


Figure 5. (a) The coefficient estimates (blue dots) from the trend model using ordinary least squares regression in the area with Damage Proxy Map values. Horizontal lines show the standard error and black stars are coefficients using 10,000 grids. (b) The spatial correlation model using a Matérn variogram, showing the variogram of the true damage at the field surveyed grids (before removing the trend) and the variogram of the residuals (after removing the trend). The vertical dotted line at 9.4 km highlights the range of spatial autocorrelation.

4. SPATIAL CORRELATION MODEL

Similar to the trend model, the parameters of the spatial correlation model are calibrated to the

data rather than predetermined. In this case study, we estimate the parameters of a Matérn theoretical variogram model. Minimizing the residual sum of squares yields fitted parameters that correspond to an exponential covariance:

$$C(h) = b \exp\left(-\frac{|h|}{r}\right), \tag{11}$$

where $b$ is equal to the variance of the residuals and $r$ is the range of spatial autocorrelation. In this example, the fitted parameters are $b = 0.83$ and $r = 9.4$ km.

The variogram is related to the covariance through

$$\gamma(h) = b - C(h). \tag{12}$$

We veriﬁed the use of this model by comparing the variogram ﬁtted with 100 ﬁeld surveyed

grids shown in Figure 5b to the same model ﬁt with 10,000 grids.
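Equations 11 and 12 can be evaluated directly. The short sketch below uses the fitted values reported in this example (b = 0.83, r = 9.4 km); the function names and the separation distances in the loop are illustrative choices.

```python
import math

B = 0.83     # variance of the residuals, as reported for this example
R_KM = 9.4   # fitted range parameter in km, as reported for this example


def covariance(h_km, b=B, r=R_KM):
    """Exponential covariance of Equation 11."""
    return b * math.exp(-abs(h_km) / r)


def variogram(h_km, b=B, r=R_KM):
    """Semivariance of Equation 12."""
    return b - covariance(h_km, b, r)


for h in (0.0, 1.0, 9.4, 30.0):
    print(h, round(covariance(h), 3), round(variogram(h), 3))
```

With this parameterization the covariance at a separation of h = r has decayed to b/e, and the variogram approaches the value b as the separation grows.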

The shape of the variogram highlights spatial characteristics of the data. The vertical dot-

ted line in Figure 5b is the range of spatial autocorrelation, r, of the true damage at the ﬁeld


surveyed locations after removing the trend. The range is the maximum distance at which two

locations are spatially autocorrelated with one another. The estimated range of 9.4 km is specific to this earthquake and depends on the choice of variogram and fitting procedure; we evaluate the statistical robustness of the estimated range in the sensitivity analysis in the following section. The variogram also shows that

we have successfully removed the preexisting trend from the data, since the variogram of the

true damage increases with distance, while the variogram of the residuals plateaus. If the trend

model were able to fully capture the spatial correlation in the true damage, the variogram would

reduce to a horizontal line with $\gamma(0) = \sigma^2(0)$ (i.e., the nugget-effect model), and performing

ordinary kriging would provide no additional effect on the ﬁnal mean damage estimate. There-

fore, G-DIF adapts to allow varying levels of contribution from the spatial correlation model,

depending on how well the trend model estimates the true damage.

5. FRAMEWORK OUTPUTS

The implementation of G-DIF generates two main outputs: 1) a map of the mean damage

estimate for each of the 80,200 grids and 2) a map of uncertainty, or estimation variance, of

those estimates.

Damage estimate

The mean damage estimate map (Figure 6a) is the sum of the estimated trend and the estimated residuals. The mean damage estimate reflects the trend model such that areas to the north exhibit

greater damage than areas to the south. This gradient in damage comes from the two most

important secondary damage data in the trend model, the shaking intensity and the engineering

forecast, which have higher values towards the north.

The mean damage estimate reﬂects the spatial correlation model through the similarity in

mean damage estimates surrounding the ﬁeld surveyed locations shown in black in Figure 6a.

These spatial similarities are particularly visible to the northeast of Kathmandu, where there is

clustering of high damage around the ﬁeld surveyed points. These similarities are due to the

variogram, which estimates small-scale ﬂuctuations based on damage at nearby ﬁeld surveyed

locations.

This map can then be used to estimate total costs of damage. Here, we assume the estimated

mean damage grade is the same for all buildings within a grid and that the number of buildings


per grid is known. By multiplying the estimated mean damage grade by the number of build-

ings, an assumed ratio of each type of construction material, and the repair or reconstruction

cost for each construction material, we obtain an estimate of 315 billion NPR (2.8 billion USD)

for the cost of repair and reconstruction. This estimate is within about 4% of the total damages

to the housing sector of 303 billion NPR (2.7 billion USD) reported in the PDNA. While this

economic loss estimate is not a primary focus of this study, it is provided here to illustrate that

these damage predictions can be converted to regional economic loss estimates.

Estimation variance

The map of the estimation variance is the sum of the variance from estimating both the trend and

residuals (Equation 9). In the case of least squares regression and ordinary kriging, we solve for the trend coefficients ($\boldsymbol{\beta}$) and kriging weights ($\boldsymbol{\lambda}$) by minimizing the variance of the error at

the ﬁeld surveyed locations. These two procedures result in an estimation variance of the trend

and residuals at all locations (speciﬁc equations are included in the Electronic Supplement).
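The residual part of this procedure can be sketched with the standard ordinary kriging system, in which the weights are constrained to sum to one through a Lagrange multiplier. The block assumes the exponential covariance reported earlier (b = 0.83, r = 9.4 km) with no nugget; the coordinates, residual values, and function names are hypothetical, and the variance returned here covers only the residual estimate, not the trend.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def exp_cov(h_km, b=0.83, r=9.4):
    """Exponential covariance with the parameters reported for this example."""
    return b * math.exp(-h_km / r)


def ordinary_kriging(points, residuals, target):
    """Return (estimate, variance, weights) for the residual at `target`."""
    n = len(points)
    dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    # Kriging system: covariances between surveyed points, bordered by ones
    # so that the weights sum to one (unbiasedness constraint).
    a = [[exp_cov(dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    rhs = [exp_cov(dist(p, target)) for p in points] + [1.0]
    sol = solve_linear(a, rhs)
    weights, mu = sol[:n], sol[n]
    estimate = sum(w * e for w, e in zip(weights, residuals))
    variance = exp_cov(0.0) - sum(w * c for w, c in zip(weights, rhs[:n])) - mu
    return estimate, variance, weights


# Hypothetical surveyed locations (km) and trend residuals at those locations.
points = [(0.0, 0.0), (5.0, 0.0), (0.0, 8.0), (12.0, 10.0)]
residuals = [0.6, -0.2, 0.4, -0.5]

est_at_survey, var_at_survey, _ = ordinary_kriging(points, residuals, (0.0, 0.0))
est_near, var_near, w_near = ordinary_kriging(points, residuals, (1.0, 1.0))
est_far, var_far, w_far = ordinary_kriging(points, residuals, (60.0, 60.0))
print(round(est_near, 3), round(var_near, 3), round(var_far, 3))
```

At a surveyed location the estimate reproduces the observed residual with zero variance, and the variance grows as the target moves away from the surveys, which is the behavior visible in the estimation variance map.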

We can interpret the estimation variance, $\hat{\sigma}^2(s)$, as our uncertainty in the mean damage estimate at each grid. The model assumes the uncertainty in the mean damage estimate varies according to a Gaussian probability distribution with $\hat{Z}(s)$ as the mean and $\hat{\sigma}(s)$, the square root of the estimation variance, as the standard deviation: $Z(s) \sim N(\hat{Z}(s), \hat{\sigma}(s))$. The variance quantifies the uncertainty in 1) the trend

estimation due to the relationships between the primary and secondary data and 2) spatial es-

timation of the residuals. The spatial uncertainty is visible in the estimation variance map as

shown by the higher variances at grids that are located further from the ﬁeld survey locations in

Figure 6b.
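Under the Gaussian assumption, the mean and variance at a grid translate into an interval for the mean damage grade. The sketch below uses Python's `statistics.NormalDist`; the grid values (mean 3.2, variance 0.64) and the 95% level are hypothetical choices for illustration.

```python
from statistics import NormalDist


def damage_interval(mean, variance, level=0.95):
    """Central interval of a Gaussian with the given mean and variance."""
    sd = variance ** 0.5  # standard deviation is the square root of the variance
    dist = NormalDist(mu=mean, sigma=sd)
    tail = (1.0 - level) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


# Hypothetical grid: mean damage grade 3.2 with estimation variance 0.64.
low, high = damage_interval(3.2, 0.64)
print(round(low, 2), round(high, 2))
```

Because damage grades are bounded, an interval computed this way can extend past the ends of the scale at grids with high variance; a user might clip it to the valid range.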

The map of the estimation variance can be used to guide where future ﬁeld surveys should

be carried out and to propagate uncertainty when estimating further losses. Since the variance

depends on the location of the ﬁeld surveyed grids, surveyors could assess damage in areas with

higher variance to reduce the overall uncertainty.
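One simple way to act on this guidance is to rank unsurveyed grids by their estimation variance. The sketch below is a greedy heuristic assumed for illustration, not a procedure from this study; the grid identifiers and variance values are hypothetical.

```python
def next_survey_grids(variance_by_grid, surveyed, k=3):
    """Return the k unsurveyed grid ids with the highest estimation variance."""
    candidates = {g: v for g, v in variance_by_grid.items() if g not in surveyed}
    return sorted(candidates, key=candidates.get, reverse=True)[:k]


# Hypothetical estimation variances keyed by grid id.
variance_by_grid = {"g1": 0.12, "g2": 0.95, "g3": 0.40, "g4": 1.10, "g5": 0.73}
surveyed = {"g1", "g4"}

to_survey = next_survey_grids(variance_by_grid, surveyed, k=2)
print(to_survey)
```

This ranking ignores the fact that surveying one grid also lowers the variance of its neighbors, so in practice the variance map would be recomputed after each batch of new surveys.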

COMPARING G-DIF WITH A RAPID ENGINEERING FORECAST

In this section, we compare G-DIF's spatially varying damage estimate and variance to the engineering forecast, which is the current standard of practice for estimating post-earthquake damage. Visually, we can see that G-DIF's mean damage estimate from the example set of 100 field surveyed grids presented in the previous section (Figure 6a) resembles the true damage (Figure 6c) more than the engineering forecast does (Figure 6d). Going further, we quantify the performance of G-DIF's outputs to demonstrate that its mean damage estimate has lower total error and improved uncertainty quantification. Since G-DIF heavily depends on the field survey data, we also perform a sensitivity analysis of G-DIF's outputs to the number and placement of field surveyed locations used to build the model.

Figure 6. Results of the framework for an example set of 100 field surveyed locations including (a) the mean damage estimate and (b) the estimation variance. The results of (a) can be compared to (c) the true damage from all field surveys and (d) the engineering forecast converted to mean damage grade.

PERFORMANCE OF THE CASE STUDY EXAMPLE

Using the mean damage estimate from the previous case study, we quantify the error between the predicted and observed damage at all validation grids, as shown in Figure 7a. The distribution of prediction error highlights 1) the bias, or the mean error, which shows whether the damage estimate is systematically under- or overestimating damage, and 2) the variance, which shows how precise the damage estimate is for all grids. Note that the engineering forecast only results in mean damage grade values that are whole integers (1, 3, and 4) after binning the predicted mean damage ratio per grid, leading to spikes in its errors at whole integers in Figure 7a. The lower bias and variance of G-DIF's mean damage estimate lead to a mean squared error (MSE), a performance metric that combines both bias and variance, of 0.853, which is 47% lower than the engineering forecast's MSE of 1.62.
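The relationship between these metrics can be checked numerically: the MSE equals the squared bias plus the variance of the errors. The sketch below assumes the error is defined as predicted minus observed and uses hypothetical damage values; only the two MSE values in the final calculation (0.853 and 1.62) come from the case study.

```python
from statistics import fmean, pvariance


def error_summary(predicted, observed):
    """Return (bias, variance, mse) of the prediction errors."""
    errors = [p - o for p, o in zip(predicted, observed)]
    bias = fmean(errors)               # mean error
    variance = pvariance(errors)       # spread of the errors around the bias
    mse = fmean([e * e for e in errors])
    return bias, variance, mse


# Hypothetical predicted and observed mean damage grades at five grids.
predicted = [2.1, 3.4, 1.8, 4.2, 2.9]
observed = [2.0, 3.0, 2.5, 4.0, 3.5]
bias, variance, mse = error_summary(predicted, observed)

# Relative reduction in MSE using the values reported for the case study.
reduction = 1 - 0.853 / 1.62
print(round(bias, 3), round(variance, 3), round(mse, 3), f"{reduction:.0%}")
```

The decomposition makes clear that a method can have a low MSE through either low bias or low variance, which is why the two are examined separately in the sensitivity analysis.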


Figure 7. Histogram of errors between the predicted and observed damage for a) all 11 considered

districts, b) Makawanpur district (southwest of Kathmandu Valley), and c) Nuwakot district (northwest

of Kathmandu Valley). Histograms highlight the lower bias, variance, and mean squared error for G-DIF

when using an example set of 100 ﬁeld surveyed grids.

Even though G-DIF's MSE is nearly half that of the engineering forecast, we see the difference is even larger when looking at individual districts within the study area (Figure 7b and c). Over the full study area, the engineering forecast will capture the overall trend in damage over large regions, but is limited by the resolution of the underlying building inventory data, which is only available at the district level. G-DIF's advantage over the forecast is that it is locally calibrated to field surveys within a district, so its mean damage estimates for smaller regions have lower bias, variance, and MSE. For example, G-DIF's mean damage estimate has a lower bias (bias = 0.038) and higher precision (standard deviation = 1.122) than that of the engineering forecast (bias = 0.904, standard deviation = 1.477) when considering the errors only for Makawanpur, the district directly southwest of Kathmandu Valley (Figure 7b). G-DIF is consistently more accurate at the local level for 9 out of the 11 districts, as seen in the error histogram for Nuwakot in Figure 7c and the other districts depicted in the electronic supplement.

SENSITIVITY OF THE PERFORMANCE TO THE FIELD SURVEYED GRIDS

G-DIF's final mean damage estimate varies depending upon the sampled primary data, especially with few field survey grids to build the trend and spatial correlation models or with secondary data that are not strongly predictive of damage. The goal of this section is to quantify how G-DIF's performance depends on the number and placement of the field surveyed locations used to build the framework. For numbers of field surveyed locations ranging from 25 to 1000, we simulate G-DIF's mean damage estimate using 1000 random samples of different placements and assess its performance. Figure 8 shows the distribution of the MSE and bias for each of these simulations.
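The structure of such a sensitivity analysis can be sketched as a Monte Carlo loop over sample sizes and random placements. To keep the example self-contained, the block below replaces G-DIF with a stand-in estimator (a one-variable least squares line) and uses synthetic data; the grid count, sample sizes, number of simulations, and all names are assumptions for illustration only.

```python
import random
from statistics import fmean, pstdev

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Synthetic stand-in for the study area: one secondary variable x per grid
# and a "true" damage value y = 1 + 3x + noise (purely illustrative).
N_GRIDS = 2000
x = [rng.random() for _ in range(N_GRIDS)]
y = [1.0 + 3.0 * xi + rng.gauss(0.0, 0.5) for xi in x]


def fit_line(xs, ys):
    """Least squares intercept and slope for a single predictor."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def validation_mse(sample_idx):
    """Fit on the sampled grids, then score on all remaining grids."""
    sampled = set(sample_idx)
    b0, b1 = fit_line([x[i] for i in sample_idx], [y[i] for i in sample_idx])
    errs = [(b0 + b1 * x[i] - y[i]) ** 2
            for i in range(N_GRIDS) if i not in sampled]
    return fmean(errs)


def sensitivity(sample_sizes, n_sims):
    """Distribution of validation MSE over random survey placements."""
    out = {}
    for n in sample_sizes:
        out[n] = [validation_mse(rng.sample(range(N_GRIDS), n))
                  for _ in range(n_sims)]
    return out


results = sensitivity(sample_sizes=(25, 100, 400), n_sims=200)
for n, mses in results.items():
    print(n, round(fmean(mses), 4), round(pstdev(mses), 4))
```

Each entry of `results` holds one MSE per random placement, so both the typical accuracy and its dependence on placement can be read from the mean and spread for each sample size.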


Figure 8. Histograms of (a) the accuracy of the mean damage estimate (MSE) and (b) the performance

of the estimation variance from the sensitivity analysis of the number and placement of ﬁeld surveyed

locations used to develop G-DIF. As more ﬁeld surveys are collected, the accuracy improves and does

not depend as much on the placement of ﬁeld surveyed locations.

As expected, as the number of ﬁeld surveyed locations increases, the G-DIF MSE decreases

(accuracy increases) and is consistently lower than that of the engineering forecast, regardless

of the placement of the ﬁeld surveyed locations. Given that the MSE can take values between 0

and 16, the MSE from G-DIF is relatively low. Figure 8a shows histograms of MSE for repeated

analyses using varying samples of data, and for ﬁve amounts of sampled data. G-DIF MSE is

lower than that of the engineering forecast for 99.7% of the simulations when using 50 field surveyed grids, and the percentage is even higher when more survey locations are used.

When we separate the bias from the MSE, we see that the distribution of G-DIF's bias is low relative to the full range of possible bias (-4 to 4). The G-DIF damage prediction can result in a more biased result than the engineering forecast, as shown by the areas of the distributions of bias outside of the vertical bounds of the engineering forecast's mean error in Figure 8b. This is partly due to the fact that G-DIF's mean damage estimate depends on how representative the field survey set is of the true distribution of damage. With more biased field survey sets, the final estimate is more biased, but sample bias can be avoided with a sufficiently large field survey if the field survey comes from a random sample. With 500 field survey locations, 86% of the simulations are less biased than the engineering forecast.

G-DIF can also be more biased than the engineering forecast because the forecast has a low mean error of -0.056 for the full study region, as discussed in the previous section. However,

the sensitivity analysis conﬁrms G-DIF is more precise when considering sub-regions. Since

the MSE is the sum of the variance and squared bias, the reduction in MSE with more ﬁeld

surveyed locations in Figure 8a is due to the reduction in variance of the error. This reduction

in variance means that grid-level estimates become more precise. So overall, there is lower

variation in the error considering the high resolution of G-DIF’s mean damage estimate.

We also evaluated the statistical robustness of the estimate of the range of spatial autocorre-

lation. After 1000 simulations using 1000 ﬁeld surveyed locations, the range of the unexplained

damage is on average 14 km. This range is consistent with the range of 15-20km reported for

damage ratios from the 1994 Northridge earthquake (Shome et al., 2012) and plausible given

that the range of spatial correlation for ground motion intensities can vary between 10-60km

(Jayaram and Baker, 2009).

WHICH DAMAGE ESTIMATION APPROACH TO USE?

By comparing to the engineering forecast, we show that G-DIF provides a credible damage

estimate to support post-earthquake decisions. Whether G-DIF is advantageous over using tra-

ditional methods to rapidly estimate damage depends on the amount of primary ﬁeld data, the

quality of the secondary data, and the scale at which decisions are made. In cases where a

well-calibrated engineering forecast is available, the study region is large, or there are few ﬁeld

surveyed grids, the engineering forecast will provide reasonable damage estimates.

Often, however, the engineering forecast may be a general model rather than one calibrated


for the speciﬁc region, or the input inventory data may be of low quality and resolution. In such

cases, if there are sufficient field surveys available, G-DIF will likely provide a damage estimate that is comparable to or more accurate than that of an engineering forecast. This is because of the approach's ability to calibrate an event-specific prediction and to also perform spatial interpolation between survey points. In this formulation, we consider measurements from the field to be exact.

A main advantage of G-DIF is that it provides locally accurate damage estimates that can be

leveraged for loss estimates and higher resolution decisions. This means that within sub-regions,

G-DIF's damage estimate will calibrate the engineering forecast, and other secondary damage

data, to the ﬁeld surveys within that region (as seen in the error histogram for Makawanpur

district in Figure 7b). To improve the local accuracy of the damage estimate, surveyors can use

the uncertainty estimate to guide the collection of additional damage assessments. By surveying

in areas with greater uncertainty, the overall uncertainty of the damage estimate will decrease.

CONCLUSION

In this study, we propose a geospatial data integration framework (G-DIF) to produce a spatial

damage prediction in the weeks after an earthquake. G-DIF uses a limited sample of local and

accurate ﬁeld surveys to calibrate predictions based on heterogeneous and uncertain damage

data from engineering forecasts, remote sensing and other sources. The uncertain data can

arrive in varying formats, measurement units, and levels of accuracy.

The geostatistical technique, regression kriging, applied in G-DIF consists of two models.

The ﬁrst is a trend model that estimates the mean damage, a deterministic value that varies

in space, using secondary damage data. The second is a spatial correlation model that esti-

mates the stochastic and spatially correlated residuals between the estimated trend and the true

damage. The separate modeling of these two components allows the framework to produce a

sophisticated trend model when the secondary data is strongly predictive, plus a spatial inter-

polation between observations when the secondary damage data has less predictive power. The

framework is ﬂexible to implement—the modeler can choose the functional form of the trend

prediction (linear or nonlinear) and spatial correlation (variogram) model, depending on the

data available for the event of interest.

Data collected after the 2015 Nepal earthquake was used to demonstrate the implementation

of G-DIF. Out of 80,200 grids in our area of interest, we used a sample of 100 grids as an


example of ﬁeld surveyed locations and found that the mean damage estimated at the other

80,100 grids had a higher accuracy (lower mean squared error) than a benchmark based on a

current engineering forecast. Moreover, G-DIF provides a mean damage estimate that is more

accurate for smaller regions than the engineering forecast used in this study, because it locally

calibrates all secondary data to ﬁeld surveys. Modelers can then use this spatially varying mean

damage estimate to calculate costs of repair and reconstruction.

In addition to the mean damage estimate map, G-DIF creates a map of the estimation un-

certainty, which is important for interpreting results, and a signiﬁcant addition to the current

state of practice for standard damage maps from engineering forecasts or remote sensing dam-

age data (e.g. Jaiswal and Wald, 2011; Yun et al., 2015; Copernicus Emergency Management

Service, 2019). Post-disaster modelers or decision-makers can use this estimation variance to

propagate uncertainty into further impact models or decide where to collect more ﬁeld surveys

to reduce the uncertainty.

With this method we do not explicitly account for uncertainty in the ﬁeld surveyed assess-

ment that results from survey subjectivity and aggregation per grid. The subjectivity in the

field surveyed measurement has the potential to be mitigated (Booth et al., 2011), so we have

made the assumption that its uncertainty is negligible relative to other data sources. This frame-

work can be extended to address the uncertainty due to aggregation through Bayesian updating

of the damage estimate per grid, similar to that presented in Booth et al. (2011), though this

would require estimates of prior and posterior distributions for each dataset and would be more

computationally intensive.

With even a small amount of ﬁeld survey data, G-DIF predictions have improved accuracy

relative to standard engineering forecasts. Through Monte Carlo simulations of the number and

locations of ﬁeld surveys, we found that G-DIF consistently resulted in a damage map with

lower mean squared error than an engineering forecast when using more than 50 ﬁeld surveyed

locations. Given that we predict the damage at 80,150 grid locations using 50 ﬁeld surveyed

locations, our framework required only 0.06% of the grids to be surveyed to improve on the

estimate of an engineering forecast. In the case of Nepal, 50 ﬁeld surveyed locations could con-

tain between 250-1150 buildings, which could be feasibly assessed within the ﬁrst 2-4 weeks in

remote, mountainous contexts. While this timeframe may seem long, a few weeks is a sufﬁcient

amount of time for this approach to inform important decisions, such as the PDNA which is a

major use case.


While our results show an improved mean damage estimate with a small percentage of

field surveyed buildings, the placement of field surveys influences these results. Through the

sensitivity analysis of the framework to the ﬁeld surveyed locations, we found G-DIF’s mean

damage estimate depends on how well the ﬁeld survey set represents the full damage distri-

bution. A biased set of ﬁeld surveyed locations can lead to biased results—in the case of the

Nepal earthquake, sets of more than 500 grids were less likely to be biased. To develop the

spatial correlation model at low separation distances, the ﬁeld survey set should also consist of

locations within the spatial correlation range. To collect ﬁeld data suited for G-DIF, surveys

can be strategically placed to collect damage assessments for all buildings within selected grids

so the sample has the full distribution of damage and sufﬁcient spatial coverage, similar to the

methods of the REACH survey (e.g. REACH, 2014).

The advantage of G-DIF over standard damage estimates, such as the engineering forecast,

is apparent from the Nepal case study. The Nepal earthquake affected a large, mostly rural,

region over multiple districts. In this case, secondary data was uncertain because the engineer-

ing forecast was developed using low-ﬁdelity data and the damage proxy map was observing

changes to both the built environment and vegetation. We expect many future earthquakes to be

similar in that there will be a limited sample of accurate ﬁeld data to calibrate damage predic-

tions from multiple uncertain data. Therefore, the framework presented here could be extended

through testing with earthquakes occurring in different built environments or even other types

of disasters, as suggested in Shome et al. (2012).

Overall, the outputs of this framework are useful for stakeholders involved in post-disaster

loss assessments (like the PDNA) or recovery aid allocation, such as the affected national gov-

ernment, multilateral or bilateral donor agencies, or civil society organizations. In post-disaster

settings, these stakeholders are often overloaded with making many decisions based on the

uncertain data that are available at that time. By combining multiple data, this framework auto-

matically weights those damage datasets according to their ability to predict damage observed

in the ﬁeld surveys, and synthesizes them to develop one map of damage. Therefore, the frame-

work allows stakeholders to address the hurdle of weighing the reliability of input data versus

its availability, so they can ultimately make more informed decisions for a more effective

regional recovery.


ELECTRONIC SUPPLEMENT

The data and R code to develop all results for the Nepal case study example presented in this

paper are available at https://purl.stanford.edu/gn368cq4893 with an interactive notebook of the

code at https://sabineloos.github.io/GDIF-damageprediction/GDIF nb.html.

ACKNOWLEDGMENTS

We would like to thank Anna Michalak, David Wald, Kishor Jaiswal, Brendon Bradley, and

Robert Soden for their contributions and feedback developing this framework. We would like to

thank the Government of Nepal, especially the National Planning Commission, Central Bureau

of Statistics and National Reconstruction Authority, for collecting this groundtruth damage data

and making its anonymized version available for broader uses and Arogya Koirala and Roshan

Paudel for their assistance in preparing this data. Part of the research was carried out at the Jet

Propulsion Laboratory, California Institute of Technology, under a contract with the National

Aeronautics and Space Administration. This work is funded by the National Science Founda-

tion Graduate Research Fellowship Program, the National Research Foundation of Singapore

grant NRF-NRFF2018-06, and the World Bank's Trust Fund for Statistical Capacity Building

(TFSCB) with ﬁnancing from the United Kingdom’s Department for International Development

(DFID), the Government of Korea, and the Department of Foreign Affairs and Trade of Ireland.

REFERENCES

Applied Technology Council, 1989. Procedures of Postearthquake Safety Evaluation of Buildings. Tech.

rep., Applied Technology Council, Redwood City, CA.

Bhattacharjee, G., Barns, K., Loos, S., Lallemant, D., Deierlein, G., and Soden, R., 2018. Developing

a User-Centric Understanding of Post-Disaster Building Damage Information Needs. In 11th U.S.

National Conference on Earthquake Engineering. Los Angeles, CA.

Boore, D. M., Gibbs, J. F., Joyner, W. B., Tinsley, J. C., and Ponti, D. J., 2003. Estimated Ground

Motion From the 1994 Northridge, California, Earthquake at the Site of the Interstate 10 and La

Cienega Boulevard, West Los Angeles, California. Bulletin of the Seismological Society of America

93, 2737–2751.

Booth, E., Saito, K., Spence, R., Madabhushi, G., and Eguchi, R. T., 2011. Validating Assessments of

Seismic Damage Made from Remote Sensing. Earthquake Spectra 27, S157–S177.

Bright, E. A., Coleman, P. R., Rose, A. N., and Urban, M. L., 2012. LandScan 2011. Oak Ridge National

Laboratory.

Chatterjee, A., Michalak, A. M., Kahn, R. A., Paradise, S. R., Braverman, A. J., and Miller, C. E., 2010.

A geostatistical data fusion technique for merging remote sensing and ground-based observations of

aerosol optical thickness. Journal of Geophysical Research 115, 1–12. doi:10.1029/2009JD013765.


Chiles, J.-P. and Delﬁner, P., 2012. Geostatistics: Modeling Spatial Uncertainty. 2 edn. Wiley Series in

Probability and Statistics, New York, NY. ISBN 978-0471083153, 734 pp.

Copernicus Emergency Management Service, 2019. Rapid Mapping Portfolio.

Corbane, C., Saito, K., Dell'Oro, L., Bjorgo, E., Gill, S., Boby, P., Huyck, C., Kemper, T., Lemoine, G., Spence, R., Shankar, R., Senegas, O., Ghesquiere, F., Lallemant, D., Evans, G., Gartley, R., Toro, J., Ghosh, S., Svekla, W., Adams, B., and Eguchi, R. T., 2011. A Comprehensive Analysis of Building Damage in the 12 January 2010 Mw7 Haiti Earthquake Using High-Resolution Satellite and Aerial Imagery. Photogrammetric Engineering & Remote Sensing 77, 997–1009.

Dong, L. and Shan, J., 2013. A comprehensive review of earthquake-induced building damage detection

with remote sensing techniques. ISPRS Journal of Photogrammetry and Remote Sensing 84, 85–99.

Earthquake Engineering Research Institute, 2015. Learning From Earthquake (LFE) Program. Tech.

rep., Earthquake Engineering Research Institute, Oakland, CA.

Erdik, M., Sesetyan, K., Demircioglu, M., Zulﬁkar, C., Hancilar, U., Tuzun, C., and Harman-

dar, E., 2014. Rapid Earthquake Loss Assessment After Damaging Earthquakes. In Geotechni-

cal, Geological and Earthquake Engineering, vol. 34, pp. 53–96. ISBN 9783319071176. doi:

10.1007/978-3-319-07118-3.

Farr, T. G., Rosen, P. A., Caro, E., Crippen, R., Duren, R., Hensley, S., Kobrick, M., Paller, M., Ro-

driguez, E., Roth, L., Seal, D., Shaffer, S., Shimada, J., Umland, J., Werner, M., Oskin, M., Burbank,

D., and Alsdorf, D. E., 2007. The shuttle radar topography mission. Reviews of Geophysics 45.

doi:10.1029/2005RG000183.

Ghosh, S., Huyck, C. K., Greene, M., Gill, S. P., Bevington, J., Svekla, W., DesRoches, R., and Eguchi,

R. T., 2011. Crowdsourcing for Rapid Damage Assessment: The Global Earth Observation Catastro-

phe Assessment Network (GEO-CAN). Earthquake Spectra 27, S179–S198.

Goda, K. and Hong, H. P., 2008. Spatial correlation of peak ground motions and response spectra.

Bulletin of the Seismological Society of America 98, 354–365. doi:10.1785/0120070078.

Grujic, O., 2017. Subsurface Modeling with Functional Data. Ph.D. thesis, Stanford University.

Grünthal, G., 1998. European Macroseismic Scale 1998, vol. 15. ISBN 2879770084, 100 pp.

Gunasekera, R., Daniell, J., Pomonis, A., Arias, R. A., Ishizawa, O., and Stone, H., 2018. Methodology

Note on the Global RApid post-disaster Damage Estimation (GRADE) approach. Tech. rep., Global

Facility for Disaster Reduction and Recovery, Washington, DC.

Hengl, T., Heuvelink, G., and Stein, A., 2003. Comparison of kriging with external drift and regression-

kriging. Technical note, ITC p. 17. doi:10.1016/S0016-7061(00)00042-2.

Hengl, T., Heuvelink, G. B. M., and Stein, A., 2004. A generic framework for spatial prediction of soil

variables based on regression-kriging. Geoderma 120, 75–93. doi:10.1016/j.geoderma.2003.08.018.

Hunt, A. and Specht, D., 2019. Crowdsourced mapping in crisis zones: collaboration, organisation and

impact. Journal of International Humanitarian Action 4, 1–11. doi:10.1186/s41018-018-0048-1.

Huyck, C. K., 2015. Gorkha (Nepal) Earthquake Response.

Jaiswal, K., Wald, D., and Hearne, M., 2009. Estimating casualties for large earthquakes worldwide

using an empirical approach: US geological survey open-file report, OF 2009-1136, 78 p. Tech. rep.

Jaiswal, K. and Wald, D. J., 2011. Rapid Estimation of the Economic Consequences of Global Earthquakes. Tech. rep., USGS, Reston, VA.

James, G., Witten, D., Hastie, T., and Tibshirani, R. J., 2013. An Introduction to Statistical Learning.

Springer, New York, NY. ISBN 9781461471370, 1–440 pp.


Jarvis, A., Reuter, H. I., Nelson, A., and Guevara, E., 2008. Hole-ﬁlled seamless SRTM data V4.

Jayaram, N. and Baker, J., 2009. Correlation model for spatially distributed ground-motion intensities.

Earthquake Engineering & Structural Dynamics {...}.

JICA, 2002. The study on earthquake disaster mitigation in the Kathmandu Valley, Kingdom of Nepal.

Tech. rep., Japan International Cooperation Agency : Nippon Koei Co., Ltd. : Oyo Corp.

Kerle, N., 2013. Remote Sensing Based Post-Disaster Damage Mapping with Collaborative Methods.

Intelligent Systems for Crisis Management pp. 121–133. doi:10.1007/978-3-642-33218-0.

Kerle, N. and Hoffman, R. R., 2013. Collaborative damage mapping for emergency response: the role

of Cognitive Systems Engineering. Natural hazards and earth system sciences 13, 97–113.

Lallemant, D. and Kiremidjian, A., 2013. Rapid post-earthquake damage estimation using remote-

sensing and ﬁeld-based damage data integration. In Safety, Reliability, Risk and Life-Cycle Perfor-

mance of Structures and Infrastructures, pp. 3399–3406. CRC Press.

Lallemant, D., Soden, R., Rubinyi, S., Loos, S., Barns, K., and Bhattacharjee, G., 2017. Post-

Disaster Damage Assessments as Catalysts for Recovery: A Look at Assessments Conducted in

the Wake of the 2015 Gorkha, Nepal, Earthquake. Earthquake Spectra 33, S435–S451. doi:

10.1193/120316EQS222M.

Loos, S., Barns, K., Bhattacharjee, G., Soden, R., Herfort, B., Eckle, M., Giovando, C., Girardot, B.,

Saito, K., Deierlein, G., Kiremidjian, A., Baker, J. W., and Lallemant, D., 2018. The Development and

Uses of Crowdsourced Building Damage Information based on Remote-Sensing. Tech. rep., Stanford,

CA.

McBratney, A. B., Odeh, I. O., Bishop, T. F., Dunbar, M. S., and Shatar, T. M., 2000. An overview of

pedometric techniques for use in soil survey, vol. 97. ISBN 0016-7061, 293–327 pp. doi:10.1016/

S0016-7061(00)00043-4.

Monfort, D., Negulescu, C., and Belvaux, M., 2019. Remote sensing vs. field survey data in a post-earthquake context: Potentialities and limits of damaged building assessment datasets. Remote Sensing Applications: Society and Environment 14, 46–59. doi:10.1016/j.rsase.2019.02.003.

Motaghian, H. R. and Mohammadi, J., 2011. Spatial estimation of saturated hydraulic conductivity from

terrain attributes using regression, kriging, and artificial neural networks. Pedosphere 21, 170–177.

doi:10.1016/S1002-0160(11)60115-X.

Nepal Earthquake Housing Reconstruction Multi-Donor Trust Fund, 2016. Nepal Earthquake Housing

Reconstruction Annual Report. Tech. rep., Nepal Earthquake Housing Reconstruction Multi-Donor

Trust Fund, Kathmandu, Nepal.

Odeh, I. O. A., McBratney, A. B., and Chittleborough, D. J., 1994. Spatial prediction of soil properties from landform attributes derived from a digital elevation model. Geoderma 63, 197–214. doi:10.1016/0016-7061(94)90063-9.

Oliver, M. A. and Webster, R., 2014. A tutorial guide to geostatistics: Computing and modelling variograms and kriging. Catena 113, 56–69. doi:10.1016/j.catena.2013.09.006.

REACH, 2014. Groundtruthing Open Street Map Building Damage Assessment: Haiyan Typhoon - The

Philippines. Tech. Rep. April, REACH; American Red Cross; USAID.

Shelter Cluster Nepal, 2015. Shelter and Settlements Vulnerability Assessment: Nepal 25 April / 12 May

Earthquakes Response Nepal. Tech. Rep. June, Shelter Cluster Nepal, Nepal.

Shome, N., Jayaram, N., and Rahnama, 2012. Uncertainty and Spatial Correlation Models for Earthquake Losses. In 15th World Conference on Earthquake Engineering (15WCEE), p. 10. Lisbon, Portugal.

Thompson, E. M., Baise, L. G., Kayen, R. E., Tanaka, Y., and Tanaka, H., 2010. A geostatistical

approach to mapping site response spectral amplifications. Engineering Geology 114, 330–342.

Trendafiloski, G., Wyss, M., and Rosset, P., 2009. Loss Estimation Module in the Second Generation Software QLARM. In Second International Workshop on Disaster Casualties, June, pp. 1–10. Cambridge, UK. ISBN 9789048194551. doi:10.1007/978-90-481-9455-1.

United Nations Office for the Coordination of Humanitarian Affairs, 2019. ReliefWeb - Informing

humanitarians worldwide.

Université catholique de Louvain (UCL) - CRED and Guha-Sapir, D., n.d. EM-DAT: The Emergency Events Database.

Wald, D. J., Jaiswal, K. S., Marano, K. D., Garcia, D., So, E., and Hearne, M., 2012. Impact-Based Earthquake Alerts with the U.S. Geological Survey’s PAGER System: What’s Next? In 15th World Conference on Earthquake Engineering, Lisbon, Portugal.

Westrope, C., Banick, R., and Levine, M., 2014. Groundtruthing OpenStreetMap Building Damage

Assessment. Procedia Engineering 78, 29–39.

Worden, C. B., Thompson, E. M., Baker, J. W., Bradley, B. A., Luco, N., and Wald, D. J., 2018. Spatial and Spectral Interpolation of Ground-Motion Intensity Measure Observations. Bulletin of the Seismological Society of America. doi:10.1785/0120170201.

Worden, C. B. and Wald, D., 2016. ShakeMap Manual. Tech. rep.

Yun, S.-h., Hudnut, K., Owen, S., Webb, F., Sacco, P., Gurrola, E., Manipon, G., Liang, C., Fielding, E.,

Milillo, P., Hua, H., and Coletta, A., 2015. Rapid Damage Mapping for the 2015 Mw 7.8 Gorkha

Earthquake Using Synthetic Aperture Radar Data from COSMO-SkyMed and ALOS-2 Satellites.

Seismological Research Letters 86, 1549–1556. doi:10.1785/0220150152.
