Proc. 7th U.S. National Conf. On Earthquake Engineering, CD-ROM, 2002
APPLICATION OF AUTOMATED DAMAGE DETECTION OF BUILDINGS
DUE TO EARTHQUAKES BY PANCHROMATIC TELEVISION IMAGES
H. Mitomi1, M. Matsuoka2 and F. Yamazaki3
ABSTRACT
The characteristics of collapsed buildings are examined by image processing of
aerial television images taken after the 1995 Hyogoken-Nanbu (Kobe) earthquake.
In image processing, not only variance and predominant direction of edge
intensity, but also some statistical textures derived from the co-occurrence matrix
of edge intensity are used for the extraction of the characteristics of collapsed
buildings. The proposed automated damage detection method is applicable to
high-resolution satellite imagery such as IKONOS with a panchromatic one-meter
spatial resolution as well as to aerial imagery. Using this approach, collapsed
buildings in the Kobe images are approximately identified.
Introduction
It is important to grasp damage information in stricken areas just after an earthquake in
order to perform quick rescue and recovery activities. Airborne remote sensing is one of the
techniques available for gaining disaster information at an early stage, because these images can
be obtained quickly with very high resolution. Recently, a new overlay method between pre- and
post-event images based on artificial neural networks was applied to detect natural disasters
using aerial photographs (Kosugi et al. 2000). However, it is often not realistic to obtain
images of the stricken areas taken before the disaster. Therefore, we are studying a method of
automated detection of damaged buildings due to earthquakes using only post-event images in
order to make use of the instantaneous acquisition ability of helicopters and airplanes (Aoki et al.
2001, Mitomi et al. 2001). In our previous study, severely damaged buildings were identified by
color indices and edge elements in an original RGB image. However, it was difficult to apply the
same threshold values used for color indices to other images, because of the differences in factors
such as the influence of sunshine and built environments (Mitomi et al. 2000). In this study, we
propose a method of detecting areas with building damage based only on edge information.
Because the method does not use color information, it can be expected to apply to other aerial
images and to panchromatic satellite images, such as IKONOS, QuickBird and OrbView, which have
one-meter spatial resolution on the ground surface (Gruen 2000).
1Earthquake Disaster Mitigation Research Center, NIED, Miki City, Hyogo, Japan 673-0433
2Deputy Team Leader, Earthquake Disaster Mitigation Research Center, NIED, Miki City, Hyogo, Japan 673-0433
3Team Leader, Earthquake Disaster Mitigation Research Center, NIED, Miki City, Hyogo, Japan 673-0433
Aerial HDTV Images and Training Data
Aerial shooting from helicopters of areas affected by the Kobe earthquake was performed
shortly after the event by the Japan Broadcasting Corporation (NHK). These images were taken
at a 30-45 degree angle from the vertical direction, from a height of about 300m using NHK’s
HDTV cameras. In this study, we used some of these images taken 10 days after the event. The
HDTV images were converted to RGB image data in bitmap format, and the panchromatic images
were generated with the formula used to obtain the brightness signal for NTSC, which is one
of the image transmission systems used for television. One of the images used in this study is
shown in Figure 1. The spatial resolution of this image is approximately 9 cm to 17 cm for near
to far distances from the camera, respectively.
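The conversion from RGB to a panchromatic image can be sketched as follows. This is a minimal illustration, not the authors' code: the weights 0.299, 0.587 and 0.114 are the standard NTSC (ITU-R BT.601) luminance coefficients and are an assumption here, since the paper does not list the values it used. The function names are hypothetical.

```python
# Sketch: RGB to panchromatic (brightness) conversion.
# ASSUMPTION: standard NTSC / ITU-R BT.601 luma weights; the paper does not list them.
LUMA_WEIGHTS = (0.299, 0.587, 0.114)


def rgb_to_luma(pixel):
    """Return the brightness of one (R, G, B) pixel with 0-255 channels."""
    r, g, b = pixel
    wr, wg, wb = LUMA_WEIGHTS
    return wr * r + wg * g + wb * b


def to_panchromatic(rgb_image):
    """Convert a 2-D list of (R, G, B) tuples to a 2-D list of brightness values."""
    return [[rgb_to_luma(p) for p in row] for row in rgb_image]


if __name__ == "__main__":
    tiny = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
    for row in to_panchromatic(tiny):
        print([round(v, 1) for v in row])
```

Green contributes most to the brightness value, so two objects that differ mainly in red or blue can end up with similar panchromatic values.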
Table 1. Training data used in this study, and extraction accuracy in cases of both pixel
level (Dpx) and spatial filtering level (Darea).
                                                          Dpx (%)              Darea (%)
     training data                                      *1     *2     *3     *1     *2     *3
c1   debris of collapsed wooden buildings             45.7   49.3   47.2   90.2   95.5   93.6
c2   brown roof of non-damaged low-rise buildings      0.0    0.0    0.1    0.0    0.0    0.0
c3   gray roof of non-damaged low-rise buildings       0.2    0.4    0.3    0.0    0.0    0.0
c4   big roof of a gymnasium                           0.0    0.0    0.0    0.0    0.0    0.0
c5   brown wall of non-damaged low-rise buildings     16.8   19.1   19.1    1.0    2.4    2.6
c6   white wall of non-damaged low-rise buildings     18.6   19.8   19.9    0.0    0.0    0.0
c7   blue vinyl canvas sheets                          8.2    9.2    9.4    0.0    0.0    0.0
c8   railways                                          2.1    2.7    2.5    0.0    0.0    0.0
c9   asphalt roads and parking lots                    8.2    9.3    9.3    2.8    5.3    4.4
c10  bare ground                                       4.4    4.9    5.1    0.0    0.0    0.0
c11  tennis court                                      1.5    1.6    1.6    0.0    0.0    0.0
c12  vegetation                                        0.9    1.2    0.9    1.0    1.9    1.4
*1 denotes the combination with four threshold values of Ev, Ed, Ta and Te.
*2 denotes the combination with three threshold values of Ev, Ed and Ta.
*3 denotes the combination with three threshold values of Ev, Ed and Te.
Figure 1. Aerial HDTV image.
Figure 2. Samples of training data
for collapsed buildings.
The outlines of undamaged buildings were clearly observed in the images while the
images of collapsed buildings were vague due to the presence of building debris, e.g., roof tiles,
soil under the roof tiles and exterior walls. Hence, in this study, the characteristics of the area
with collapsed buildings were examined using edge information. Typical areas with collapsed
buildings (c1), elements of undamaged buildings (c2-c6), and objects other than buildings (c7-
c12) were selected from the aerial image as listed in Table 1. These training data were designated
as the areas of inscribed circles. Figure 2 shows samples of training data for the collapsed
buildings.
Edge Intensity, Its Variance and Direction
Edge intensity (Ei), its variance (Ev) and a ratio of predominant direction of edge
intensity (Ed) were derived from a Prewitt filter, to detect the change in density among
neighboring pixels. The Prewitt filter used to detect edge elements has 3x3 matrices, and can
calculate edge intensities of eight directions (Takagi and Shimoda 1991). We enlarged this filter
to a 7x7 matrix (Aoki et al. 2001), because densities of neighboring pixels have gentle slopes in
images taken by television cameras. Ei was obtained from the maximum value in the templates
for eight directions on edge. An edge direction was defined as the direction of Ei, such as 0-180,
45-225, 90-270, and 135-315 degrees. Using the Ei value, Ev was calculated as a variance in a
7x7 pixel window.
Figure 3. Cumulative relative frequency of Ei for training data (c1-c12). Horizontal axis: edge
intensity in a 7x7 area, Ei (0-3000); vertical axis: cumulative relative frequency.
Figure 4. Cumulative relative frequency of Ev for training data (c1-c12). Horizontal axis:
variance of edge intensity in a 7x7 area, Ev (0 to 1.0E+06); vertical axis: cumulative relative
frequency.
Figure 5. Cumulative relative frequency of Ed for training data (c1-c12). Horizontal axis: ratio
of predominant direction in a 7x7 edge, Ed (0%-100%); vertical axis: cumulative relative
frequency.
Also, the ratio of the predominant direction of edge elements in a 7x7 pixel
window, Ed, was calculated. Figures 3, 4 and 5 show cumulative relative frequencies of Ei, Ev
and Ed, respectively, for each set of training data. The roofs of undamaged buildings (c2-c4),
which include many non-edge areas, have extremely small values of Ei and Ev. The exterior
walls of undamaged buildings (c5, c6) have stronger edge elements than other training data,
due to attachments such as balconies and windows, and to the outlines of buildings. For railways
(c8), the distribution of Ed shows more pixels sharing the same edge direction than for other
training data. In the case of collapsed buildings (c1), the distributions of the values for Ei and
Ev are broad, and the distribution of the value for Ed is similar to that of other training data,
except railways.
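The three edge measures can be sketched as below. This is only an illustration under stated assumptions: it uses the ordinary 3x3 Prewitt template rotated in 45-degree steps rather than the enlarged 7x7 templates of the paper (which are not given here), and it takes Ed as the share of the most frequent direction pair among the edge pixels of the window, which is one possible reading of "ratio of the predominant direction". All function names are hypothetical.

```python
from collections import Counter

# Ring of a 3x3 template, clockwise from the top-left corner.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
# Ring weights of a 3x3 Prewitt template (top row +1, bottom row -1, centre 0).
# ASSUMPTION: the paper enlarges the templates to 7x7; 3x3 is used here for brevity.
BASE = [1, 1, 1, 0, -1, -1, -1, 0]


def compass_templates():
    """Eight 3x3 templates obtained by rotating BASE in 45-degree steps."""
    templates = []
    for shift in range(8):
        t = [[0] * 3 for _ in range(3)]
        for i, (r, c) in enumerate(RING):
            t[r][c] = BASE[(i - shift) % 8]
        templates.append(t)
    return templates


TEMPLATES = compass_templates()


def edge_maps(img):
    """Return (Ei, axis) maps; axis is 0-3 for the four direction pairs, None if flat."""
    h, w = len(img), len(img[0])
    ei = [[0] * w for _ in range(h)]
    axis = [[None] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            responses = [
                sum(t[j][i] * img[y - 1 + j][x - 1 + i]
                    for j in range(3) for i in range(3))
                for t in TEMPLATES
            ]
            best = max(range(8), key=responses.__getitem__)
            if responses[best] > 0:
                ei[y][x] = responses[best]      # Ei: maximum over eight directions
                axis[y][x] = best % 4           # opposite directions share one axis
    return ei, axis


def window_features(ei, axis, cy, cx, half=3):
    """Ev and Ed in the (2*half+1)-square window centred on (cy, cx).

    The window must lie inside the image; half=3 gives the 7x7 window of the text.
    """
    values, axes = [], []
    for y in range(cy - half, cy + half + 1):
        for x in range(cx - half, cx + half + 1):
            values.append(ei[y][x])
            if axis[y][x] is not None:
                axes.append(axis[y][x])
    mean = sum(values) / len(values)
    ev = sum((v - mean) ** 2 for v in values) / len(values)
    # ASSUMPTION: Ed = share of the most frequent axis among edge pixels in the window.
    ed = Counter(axes).most_common(1)[0][1] / len(axes) if axes else 0.0
    return ev, ed
```

With this definition a straight feature such as a rail gives Ed close to 1, while debris with edges in many directions gives a lower Ed, which matches the behavior described for railways (c8) and collapsed buildings (c1).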
Statistical Textures due to Co-occurrence Matrix
An occurrence probability P(k, l) means the probability that pixel value l appears in a
relative position δ=(r, θ) from a reference pixel whose value is k, where r and θ of δ are the
relative distance and direction from the reference pixel, respectively. The occurrence probability
P(k, l) is calculated for all combinations of pixel values (k, l) against some constant δ. This
matrix is called a co-occurrence matrix (Takagi and Shimoda 1991), because column k line l in
the matrix represents a co-occurrence probability of pixel values (k, l). Some of the textures due
to the co-occurrence matrix are often used to classify land cover in urban areas (Zhang et al.
2001). In this study, characteristics of the collapsed buildings were investigated with edge
textures derived from a co-occurrence matrix based on edge intensity, Ei. Using the cumulative
relative frequency of Ei for the collapsed buildings (c1), Ei was converted to 4-bit data (16
grades) representing a condensed edge intensity, cEi, as shown in Figure 6. Figure 7 shows
relative frequencies of cEi for representative training data. By this approach, the collapsed
buildings have the same number of pixels in each of the 16 grades of cEi. On the other hand,
exterior walls (c5, c6) have many strong elements in cEi, and some of the other training data,
such as blue vinyl canvas sheets (c7), asphalt roads and parking lots (c9), and bare ground
(c10), contain many weak elements in cEi.
Using these characteristics, two textures were calculated for the condition of r=1, which
indicates neighboring pixels around a reference pixel, and four directions of 0-180, 45-225,
90-270, and 135-315 degrees. Figure 8 is a schematic diagram representing the relationship
between the reference pixel and other neighboring pixels of distance r and direction θ from the
pixel. The maximum value for the directions was defined as a representative value of the
texture. In addition, a 7x7 pixel area was used as the window size for the texture analysis.
Figure 6. Cumulative relative frequency of Ei for the collapsed buildings, and derivation of cEi
with 16 grades. Horizontal axis: edge intensity in a 7x7 area, Ei (0-3000); vertical axis:
cumulative relative frequency, with the grades 0-15 marked.
Figure 7. Relative frequency of cEi for some training data (c1, c5-c10). Horizontal axis:
condensed edge intensity, cEi (0-15); vertical axis: relative frequency (0%-20%).
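The condensation of Ei into 16 grades can be sketched as an equal-frequency quantization: cut points are read off the sorted Ei values of the collapsed-building training data so that each grade receives the same share of those pixels. This is a minimal sketch of that idea, not the authors' implementation; the function names and the handling of ties at cut points are assumptions.

```python
from bisect import bisect_right


def equal_frequency_cuts(samples, levels=16):
    """Cut points that split the sorted samples into `levels` equally filled grades."""
    ordered = sorted(samples)
    n = len(ordered)
    return [ordered[(i * n) // levels] for i in range(1, levels)]


def condense(value, cuts):
    """Map an edge intensity to its grade, 0 .. len(cuts)."""
    # ASSUMPTION: a value equal to a cut point goes to the higher grade.
    return bisect_right(cuts, value)


if __name__ == "__main__":
    c1_ei = list(range(160))            # made-up stand-in for Ei values of class c1
    cuts = equal_frequency_cuts(c1_ei)
    grades = [condense(v, cuts) for v in c1_ei]
    print([grades.count(g) for g in range(16)])
```

Applying the same cut points to the other classes shifts their pixels toward the low or high grades, which is what separates them from the flat cEi distribution of the collapsed buildings.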
$$ Ta = \sum_{k=0}^{m-1} \sum_{l=0}^{m-1} \{ P(k, l) \}^{2} \tag{1} $$
$$ Te = -\sum_{k=0}^{m-1} \sum_{l=0}^{m-1} P(k, l) \log \{ P(k, l) \} \tag{2} $$
Equations (1) and (2) describe the angular second moment (Ta) and entropy (Te) of textures,
respectively. Both textures represent the uniformity of the edge structure in the input
panchromatic image, but their trends are opposite. If P(k, l) is concentrated in a few elements
of the matrix, which represents a uniform texture, Ta becomes large and Te becomes small. As
mentioned above, the training data of the collapsed buildings consist of approximately the
same number of pixels in each of the 16 grades of cEi. Therefore, it can be expected that the
collapsed buildings have non-uniform textures as expressed by Ta and Te. Figures 9 and 10 show
cumulative relative frequencies of each set of training data for Ta and Te, respectively.
Figure 8. Relationship between a reference pixel and neighboring pixels with r=1 and eight
directions (0° to 315° in 45° steps; the areas of r = 1 and r = 2 are indicated).
Figure 9. Cumulative relative frequency of Ta for training data (c1-c12). Horizontal axis:
angular second moment, Ta (0-0.4); vertical axis: cumulative relative frequency.
Figure 10. Cumulative relative frequency of Te for training data (c1-c12). Horizontal axis:
entropy, Te (0.5-1.5); vertical axis: cumulative relative frequency.
The collapsed buildings have a large number of pixels representing the lowest range for Ta and the
highest range for Te, respectively. This means the collapsed buildings show the strongest trends
of non-uniformity. A slight tendency of non-uniformity is also seen for the exterior walls,
railways and bare ground. Most roofs (c2-c4) have a uniform component for the edge structure.
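The two textures can be sketched for one 7x7 window of cEi values as follows. The sketch makes several assumptions that the text does not settle: each direction pair is counted symmetrically, the matrix is normalized by the number of counted pairs, and the natural logarithm is used in the entropy (the base is not stated). Function names are hypothetical.

```python
import math

# (dy, dx) offsets for r = 1 along the 0-180, 45-225, 90-270 and 135-315 degree axes.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]


def cooccurrence(window, offset, levels=16):
    """Symmetric, normalized co-occurrence matrix P(k, l) for one direction pair."""
    dy, dx = offset
    h, w = len(window), len(window[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for y in range(h):
        for x in range(w):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                k, l = window[y][x], window[yy][xx]
                counts[k][l] += 1
                counts[l][k] += 1   # ASSUMPTION: both senses of the axis are counted
                total += 2
    return [[c / total for c in row] for row in counts]


def angular_second_moment(p):
    """Equation (1): sum of squared probabilities."""
    return sum(v * v for row in p for v in row)


def entropy(p):
    """Equation (2); ASSUMPTION: natural logarithm, empty cells contribute zero."""
    return -sum(v * math.log(v) for row in p for v in row if v > 0)


def textures(window, levels=16):
    """Ta and Te of a window, each taken as the maximum over the four directions."""
    mats = [cooccurrence(window, off, levels) for off in OFFSETS]
    ta = max(angular_second_moment(p) for p in mats)
    te = max(entropy(p) for p in mats)
    return ta, te
```

A window filled with a single grade gives Ta = 1 and Te = 0, the uniform extreme; a window whose grades are spread evenly pushes Ta toward zero and Te upward, the pattern reported for the collapsed buildings.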
Extraction of Pixels of Collapsed Buildings
The curve of each cumulative relative frequency for the collapsed buildings (c1) shown in
Figs. 3, 4, 5, 9 and 10 was approximated by a regression line fitted to the cumulative data
between 20% and 80% of the collapsed buildings. The values at which this line intersects 0% and
100% on the graph of the cumulative relative frequency were taken as the lower and upper
threshold values for each characteristic (Aoki et al. 2001). When each threshold value for Ei,
Ev, Ed, Ta and Te was applied, a ratio of pixels extracted as the collapsed buildings in each
training data was attained as shown in Figure 11.
Figure 11. Ratio of pixels extracted for collapsed buildings in each set of training data
(c1-c12): (a) edge intensity, Ei (series Ei and Ei_80%); (b) variance of edge intensity, Ev
(series Ev and Ev_80%); (c) ratio of predominant edge direction, Ed, and two statistical textures
of edge intensity, Ta and Te. Vertical axis: ratio of pixels extracted as "c1".
In both cases of Ei and Ev, most of the pixels of
not only the collapsed buildings but also several sets of training data were extracted. In particular,
each percentage of pixels extracted as collapsed buildings in blue vinyl canvas sheets (c7),
railways (c8), asphalt roads and parking lots (c9) and woods (c12) exceeded that of the collapsed
buildings for Ei. Railways were clearly distinguished from the collapsed buildings by Ed. Using
each threshold value of Ta and Te, about 40% of the pixels consisting of bare ground were
extracted, while the ratio of blue vinyl canvas sheets was about 30%. To decrease the ratio of
extracted pixels of bare ground, the values of Ei and Ev at which the cumulative relative
frequency for the bare ground (c10) reached 80% were taken as the lower limits of the threshold
values for Ei and Ev, as shown in Figs. 3 and 4. On the whole, pixels extracted in the
training data, except for the collapsed buildings, decreased in comparison with the result
obtained by the threshold values, which were determined by regression lines of the cumulative
relative frequencies. Extracted pixels for the roofs of undamaged buildings (c2-c4) disappeared,
and those on the bare ground and blue vinyl canvas sheets decreased significantly. However,
percentages of correctly extracted pixels for the collapsed buildings also decreased from 86.9%
to 48.7% for Ei, and from 85.4% to 61.4% for Ev. Therefore, Ei was not used, but Ev with a
constrained lower limit of threshold value was used as the parameter to detect building damage.
Table 2 shows the threshold values for all parameters used in this study. Pixels within the ranges
of all threshold values were regarded as corresponding to building damage.
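The derivation of a threshold range from the regression line can be sketched as follows. This is a minimal illustration of the procedure described above, assuming an ordinary least-squares fit of cumulative relative frequency against the characteristic value; the function name and the use of the empirical cumulative frequency (i + 1) / n are assumptions, and the lower-limit correction based on bare ground is not included.

```python
def regression_thresholds(samples, lo=0.2, hi=0.8):
    """Fit a line to the 20%-80% part of the cumulative curve; return its 0% and 100% crossings.

    Needs at least two distinct sample values inside the fitted part.
    """
    ordered = sorted(samples)
    n = len(ordered)
    # Empirical cumulative relative frequency at each sorted sample.
    points = [(v, (i + 1) / n) for i, v in enumerate(ordered)]
    points = [(v, f) for v, f in points if lo <= f <= hi]
    mean_v = sum(v for v, _ in points) / len(points)
    mean_f = sum(f for _, f in points) / len(points)
    sxx = sum((v - mean_v) ** 2 for v, _ in points)
    sxf = sum((v - mean_v) * (f - mean_f) for v, f in points)
    slope = sxf / sxx
    intercept = mean_f - slope * mean_v
    lower = (0.0 - intercept) / slope   # value where the line meets 0%
    upper = (1.0 - intercept) / slope   # value where the line meets 100%
    return lower, upper


if __name__ == "__main__":
    c1_values = [v * v for v in range(1, 101)]   # made-up skewed sample
    print(regression_thresholds(c1_values))
```

Because the line follows only the central part of the distribution, the long tails of the collapsed-building data fall outside the range, which is why fewer than 100% of the c1 pixels are extracted by any single threshold.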
In this study, accuracies of pixels extracted in combination with not only EvEdTaTe but
also EvEdTa and EvEdTe, were evaluated by Dpx, as shown in Table 1. Dpx is defined as a
percentage of extracted pixels in each set of training data, and Dpx of the collapsed buildings was
between 45% and 50% for the three combinations. Figure 12 shows the transitions of the ratio of
the extracted pixels in representative training data when four threshold values were combined in
the order of Ev, Ed, Ta and Te. Based on Ev, it was difficult for the collapsed buildings to be
distinguished from other objects, such as exterior walls (c5, c6), railways, asphalt roads and
parking lots, and the bare ground. However, the ratios of the number of pixels extracted as
collapsed buildings for the other training data, which are not shown in Fig. 12, could be
decreased to less than 5%.
Table 2. Threshold values.
characteristics               threshold
Ev: edge variance             2.0 x 10^5 - 6.8 x 10^5
Ed: edge direction            0.30 - 0.60
Ta: angular second moment     0.05 - 0.13
Te: entropy                   1.16 - 1.38
Figure 12. Changes of ratios of Ev, Ed, Ta and Te of extracted pixels in some training data
(c1, c5-c10). Horizontal axis: combination of threshold values (Ev, EvEd, EvEdTa, EvEdTaTe);
vertical axis: ratio of extracted pixel in each set of training data (0%-80%).
Adding the condition of Ed, railways were distinguished from the
collapsed buildings, because the cumulative relative frequency of Ed was significantly different
from the other imaged objects, as mentioned above. By using Ta and Te, roofs of undamaged
buildings (c2-c4) were clearly distinguished from the collapsed buildings without correcting
threshold values such as Ev, and most outlines of undamaged buildings and roads with strong
edge elements were distinguished from those of collapsed buildings.
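The combination of thresholds and the pixel-level accuracy Dpx can be sketched as below, using the ranges of Table 2. The sketch assumes the range bounds are inclusive and that each pixel's features are already available as a dictionary; the function names and the sample feature values are hypothetical.

```python
# Threshold ranges taken from Table 2.
THRESHOLDS = {
    "Ev": (2.0e5, 6.8e5),
    "Ed": (0.30, 0.60),
    "Ta": (0.05, 0.13),
    "Te": (1.16, 1.38),
}


def is_damage_pixel(features, names=("Ev", "Ed", "Ta", "Te")):
    """True if every selected feature lies within its threshold range."""
    # ASSUMPTION: both ends of each range are inclusive.
    return all(THRESHOLDS[n][0] <= features[n] <= THRESHOLDS[n][1] for n in names)


def dpx(pixels, names=("Ev", "Ed", "Ta", "Te")):
    """Percentage of pixels in one set of training data that are extracted."""
    return 100.0 * sum(is_damage_pixel(p, names) for p in pixels) / len(pixels)


if __name__ == "__main__":
    # Made-up feature values for two pixels, for illustration only.
    sample = [
        {"Ev": 3.0e5, "Ed": 0.45, "Ta": 0.08, "Te": 1.25},
        {"Ev": 3.0e5, "Ed": 0.90, "Ta": 0.08, "Te": 1.25},
    ]
    print(dpx(sample), dpx(sample, names=("Ev", "Ta", "Te")))
```

Passing a shorter tuple of names reproduces the three-threshold combinations EvEdTa and EvEdTe evaluated in Table 1.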
Detection of Areas with Building Damage
The extracted pixels corresponding to the collapsed buildings (c1) were further
aggregated to decrease surplus pixels and make areas with building damage easy to identify.
For this purpose, the local density of extracted pixels (Rpx) was calculated as in a
previous study (Aoki et al. 2001). Rpx is defined as the ratio of the pixels extracted by the four
threshold values versus the number of pixels in a window approximating one building size.
Windows of 31x31 to 63x63 pixels were selected to be proportional to the resolution of the
ground surface. Figure 13 shows the cumulative relative frequency of the training data. Next, a
regression line was derived from the same method as used for obtaining the threshold value on
edge information, and the threshold value of Rpx was defined as the intersection point between
the regression line of Rpx for the collapsed buildings and the horizontal axis. In all cases of
EvEdTaTe, EvEdTa, and EvEdTe, an Rpx of 30% was derived as the threshold value to distinguish
between the collapsed buildings and the other imaged objects. The percentage of areas detected
as the collapsed buildings in all training data is shown as Darea in Table 1. In all cases,
correctly detected pixels in the collapsed buildings were more than 90%. However, a small number
of pixels for asphalt roads and parking lots (c9), brown exterior walls (c5), and woods (c12) were
incorrectly detected by this approach. Therefore, the areas with building damage were estimated
using EvEdTaTe, because this combination gave the fewest incorrectly detected pixels of the
three cases.
Figure 13. Cumulative relative frequency of Rpx for each set of training data (c1-c12).
“EvEdTa” and “EvEdTe” are the distributions of c1 in EvEdTa and EvEdTe, respectively.
Horizontal axis: local density of selected pixels, Rpx (0%-60%); vertical axis: cumulative
relative frequency.
Figures 14 and 15 show the results estimated in this study and the results of ground survey and
visual inspection (Hasegawa et al. 2000), respectively. The black and white areas in Fig. 15 represent
the collapsed and the severely damaged buildings, respectively. In the dotted circle, parts of
the damaged buildings are covered with blue vinyl canvas sheets. Most of the building debris was
detected correctly, although pedestrian crossings, cars and parts of exterior walls were incorrectly
detected as collapsed buildings. Figures 16 and 17 show the result of this method applied to the
adjacent image and the distribution of severely damaged buildings obtained by ground survey and
visual inspection, respectively. The distribution of the detected area shown in Figure 16 roughly
agreed with that in Figure 17. These results indicate that it may be possible to apply the method
based only on edge information from a sample image to several similar images spanning large
areas.
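The local density filter can be sketched as follows. It is a minimal illustration, assuming a binary mask of extracted pixels, a square window clipped at the image border, and an inclusive 30% threshold; these details and the function names are assumptions, and the plain double loop is written for clarity rather than speed.

```python
def local_density(mask, size):
    """Rpx map: share of extracted pixels in a size x size window around each pixel."""
    half = size // 2
    h, w = len(mask), len(mask[0])
    rpx = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # ASSUMPTION: the window is clipped where it crosses the image border.
            y0, y1 = max(0, y - half), min(h, y + half + 1)
            x0, x1 = max(0, x - half), min(w, x + half + 1)
            hits = sum(mask[j][i] for j in range(y0, y1) for i in range(x0, x1))
            rpx[y][x] = hits / ((y1 - y0) * (x1 - x0))
    return rpx


def damage_area(mask, size=31, threshold=0.30):
    """Pixels whose local density reaches the Rpx threshold (30% in the text)."""
    return [[d >= threshold for d in row] for row in local_density(mask, size)]


if __name__ == "__main__":
    lone = [[0] * 5 for _ in range(5)]
    lone[2][2] = 1                      # one isolated extracted pixel
    print(any(any(row) for row in damage_area(lone, size=3)))
```

An isolated extracted pixel has a low Rpx and is dropped, while a dense cluster of extracted pixels is kept and filled in, which is how Dpx values near 50% for c1 become Darea values above 90%.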
Conclusions
The analysis of areas containing collapsed buildings was conducted using panchromatic
aerial images taken from a helicopter. The characteristics of collapsed buildings were examined
based on edge information of aerial images, such as the variance of edge intensity, the ratio of the
predominant edge direction, and two textures based on the co-occurrence matrix of edge intensity.
The threshold values were determined from the cumulative relative frequency of the collapsed
buildings in terms of edge information, and were combined in order to detect the pixels
representing collapsed buildings.
Figure 14. Result of estimated areas with building damage based on EvEdTaTe and Rpx30%.
Figure 15. Distributions of collapsed and severely damaged buildings in Fig.14 by ground survey.
Figure 16. Result of application of the threshold value for Fig.14 to an adjacent image.
Figure 17. Distributions of collapsed and severely damaged buildings in Fig.16 by ground survey.
The collapsed areas of buildings were roughly detected by this
approach, and these threshold values were applied to an adjacent image. In order to further
improve the technique for automated damage detection, we will examine the texture analysis
procedure using the co-occurrence matrix, such as appropriate sizes of the matrix, the pixel
window, and other textures instead of angular second moment and entropy.
Acknowledgment
We wish to thank the Japan Broadcasting Corporation (NHK) for HDTV images.
References
Aoki, H., M. Matsuoka, and F. Yamazaki (2001). Automated detection of damaged buildings due to
earthquakes using aerial HDTV and photographs, Journal of the Japan Society of
Photogrammetry and Remote Sensing, 40 (4), pp.27-36 (in Japanese).
Gruen, A. (2000). Potential and limitations of high-resolution satellite imagery, The 21st Asian
Conference on Remote Sensing, Keynote address, pp.1-14.
Hasegawa, H., F. Yamazaki, M. Matsuoka, and I. Sekimoto (2000). Determination of building damage
due to earthquakes using aerial television images, The 12th World Conference on Earthquake
Engineering, CD-ROM.
Kosugi, Y., T. Plamen, M. Fukunishi, S. Kakumoto, and T. Doihara (2000). An adaptive nonlinear
mapping technique for extracting geographical changes, Proceedings of GIS2000, CD-ROM.
Mitomi, H., F. Yamazaki, and M. Matsuoka (2000). Automated detection of building damage due to
recent earthquakes using aerial television images, Proceedings of the 21st Asian Conference on
Remote Sensing, pp.401-406.
Mitomi, H., J. Saita, M. Matsuoka, and F. Yamazaki (2001). Automated damage detection of buildings
from aerial television images of the 2001 Gujarat, India earthquake, IEEE 2001 International
Geoscience and Remote Sensing Symposium, CD-ROM.
Takagi, M., and H. Shimoda (1991). Handbook of image analysis, University of Tokyo Press (in
Japanese).
Zhang, Q., J. Wang, P. Gong, and P. Shi (2001). Texture analysis for urban spatial pattern study using
SPOT imagery, IEEE 2001 International Geoscience and Remote Sensing Symposium, CD-ROM.