International Journal of Engineering and Geosciences (IJEG),
Vol. 4, Issue 1, pp. 045-051, February 2019, ISSN 2548-0960, Turkey,
DOI: 10.26833/ijeg.440828
AIRBORNE LIDAR DATA CLASSIFICATION IN COMPLEX URBAN AREA
USING RANDOM FOREST: A CASE STUDY OF BERGAMA, TURKEY
Sibel Canaz Sevgen 1*
1Ankara University, Faculty of Applied Sciences, Department of Real Estate Development and Management, Ankara,
Turkey (ssevgen@ankara.edu.tr); ORCID 0000-0001-5552-6067
*Corresponding Author, Received: 05/07/2018, Accepted: 06/08/2018
ABSTRACT: Airborne Light Detection and Ranging (LiDAR) data have been increasingly used for the classification of urban
areas in recent decades. Classifying an urban area into classes is especially important for urban
planning, mapping, and change detection monitoring. In this study, airborne LiDAR data of a complex urban
area in Bergama District, İzmir, Turkey, were classified into four classes: buildings, trees, asphalt road, and ground.
The Random Forest (RF) supervised classification method was selected as the classification algorithm, and pixel-wise
classification was performed. Ground truth of the area was generated by digitizing the classes into features, which were
used to select training data and to validate the results. The selected study area is complex in its layout of buildings,
roads, and ground. The buildings are very close to each other, and trees are also very close to buildings and sometimes
cover their rooftops. The most challenging part of this study was generating ground truth in such a complex area.
According to the classification results, the overall accuracy is 70.20%. The experimental results showed
that the algorithm promises reliable results for classifying airborne LiDAR data in a complex urban area.
Keywords: Random Forest, LiDAR, Classification, Complex Urban Area
1. INTRODUCTION
Classification of objects in an urban area is a popular
subject in a variety of research areas, such as computer
vision, machine learning, pattern recognition,
photogrammetry, remote sensing, and urban planning. In
the literature, satellite and aerial images have been widely
used in urban area classification. In particular, studies of
land cover change over the years based on classified
satellite data are abundant. For instance, Yu
et al. (2013) monitored land cover changes and urban
sprawl dynamics in 1989, 1999, and 2009 in Yantai, China,
by classifying satellite images into five classes. Land cover
changes in Atlanta, Georgia, between 1973 and 1998 were
categorized into six different classes (Yang and Lo, 2002).
Canaz et al. (2017) classified Istanbul, Turkey, into four
different classes to monitor land cover change between
1986 and 2015. Alongside optical sensor data,
Light Detection and Ranging (LiDAR), a newer technology
for collecting remotely sensed data, has also become a
popular data source for classification studies. LiDAR
technology is capable of collecting 3-dimensional (3D)
point cloud data in a short time, day or night. Because of
this direct 3D data acquisition, LiDAR data have been
increasingly used for classifying urban areas.
While classical data-driven techniques have been
developed for urban area classification (Rottensteiner and
Briese, 2002; Charaniya et al., 2004), the recent trend is to
use machine learning techniques to classify LiDAR data
in urban areas (Lodha et al., 2006). Supervised machine
learning techniques are based on selected features and a
classifier algorithm. A variety of supervised classification
techniques, such as support vector machines and neural
networks, exist in the literature (Richards and Jia, 1999).
In this study, one of these supervised classification
techniques, Random Forest (RF), was selected
because of its stability and robustness to the
features.
RF classification of airborne LiDAR data has been
studied using different features in order to label different
classes. For instance, Niemeyer et al. (2012) classified
three different areas from the Vaihingen, Germany, LiDAR
dataset of the ‘ISPRS Test Project on Urban
Classification and 3D Building Reconstruction’. The
authors classified the data into five categories (building,
low vegetation, tree, terrain, and asphalt ground) using a
Conditional Random Field (CRF) approach. However,
they showed and evaluated the results only for the
building and tree classes. The average correctness of the
classification over the three subset areas was 73%
and 92% for the tree and building classes, respectively.
Guo et al. (2011) used a combination of optical
multispectral and LiDAR data to classify an
urban area into four classes using the RF
algorithm. Many other studies using the RF algorithm to
classify LiDAR data can be found in the literature
(Immitzer et al., 2012; Rodriguez-Galiano et al., 2012;
Guan et al., 2013).
Lodha et al. (2006) presented another LiDAR data
classification study. The authors used Support Vector
Machines (SVM) to classify LiDAR data into
buildings, trees, roads, and grass using five features:
height, height variation, normal variation, LiDAR return
intensity, and image intensity. To evaluate the result, they
compared the ground truth with the classification result and
observed 90% accuracy. Chen et al. (2014) classified
LiDAR data to detect landslides in the Three Gorges, China,
using the RF algorithm with the following features: the
mean aspect, Digital Terrain Model (DTM),
and slope textures based on four texture directions;
aspect, DTM, and slope textures based on aspect; and the
moving average and standard deviation (stdev) filters of
aspect, DTM, and slope. By combining a
feature selection method with the RF algorithm, they obtained
reliable results for classifying LiDAR data and detecting
landslides. Ma et al. (2017) compared
the SVM and RF algorithms for classifying LiDAR data.
The authors classified the data into four categories: trees,
buildings, farmland, and ground. According to their
findings, the RF algorithm gave a better result than the
SVM algorithm for the classification of the LiDAR data.
In this study, an area from the Bergama district of
İzmir province, Turkey, was chosen as the study area. The
study area is very complex in shape. The feature classes
of interest are located very close to each other, and some
buildings and trees are embedded in each other. Thus, the
originality of the study lies in the complexity of the
selected study area. Therefore, digitization and generation
of ground truth for the study area were carried out very
carefully. After creating the ground truth, 12 features
generated from the LiDAR data (such as intensity,
planarity, and DSM) were used to classify
the LiDAR data.
2. STUDY AREA AND DATA
The study area was chosen from the Bergama District of
İzmir. İzmir is one of the biggest provinces in Turkey and is
located in western Turkey. Bergama is the largest district
of İzmir by area, covering
1573 km². The population of the district in 2017 was
102,961.
The study area is located in the center of Bergama
district (Fig. 1). The boundary of İzmir province is shown
with the blue line, and the boundary of Bergama district
is shown with the red line in Fig. 1. The true orthophoto of the
study area is also shown in Fig. 1. Since the study area’s
land cover mainly consists of ground, roads, trees, and
buildings, the study area was divided into four classes for
classification: buildings, trees, ground, and asphalt road.
Figure 1. Location of the study area: (a) İzmir province
boundaries, (b) Bergama district boundaries (source:
Google Maps), and (c) the chosen study area
True orthophotos of the study area were generated by the
Directory of Geographic Information Systems. The
images were acquired in May 2016. The pixel size of the
images is 10 cm. LiDAR data of the study area were
collected with the Optech Pegasus HA-500 system by the
Turkish General Command of Mapping on 20-21 October 2014
(Kayı et al. 2015). Detailed information about the Optech
Pegasus HA-500 is given in Table 1 (Optech, 2018).
Table 1. Technical information of Optech Pegasus HA-500 (Kayı et al. 2015)
Feature                          Value
Height                           150-5000 m
Effective laser repetition rate  100-500 kHz
Scanning angle                   0-75º, adjustable
Accuracy (RMSE)                  ≤ 5-20 cm
Scanning mechanism               Oscillating
3. METHODOLOGY
The RF algorithm is often used in remote sensing
applications to classify data such as multispectral and
hyperspectral images, radar, LiDAR, and thermal data
sets. A literature review of these applications was
presented by Belgiu and Drăguț (2016). This study
applies RF to one type of remote sensing data, airborne
LiDAR, for a complex urban area. The flowchart of the
methodology of this study is given in Figure 2.
Figure 2. Flowchart of the methodology
The classification of the LiDAR data is pixel-based;
therefore, 12 features were generated
and rasterized to 50 cm images. Before generating the
feature images and classifying the study area, noisy and
duplicate points were removed from the airborne LiDAR data.
After preprocessing, feature images were generated in
four groups (Chehata et al., 2009; Dittrich et al., 2017):
intensity, height, eigenvalue, and echo based. The
intensity-based feature relies on the energy reflected by
the objects in the LiDAR dataset. It helps to separate
objects with different reflectance characteristics, such as
the asphalt road and ground classes. The intensity feature
image was created using the ArcGIS “Las to Raster” tool
with a 50 cm pixel size. Height-based features, on the other
hand, were generated from the height values of the points,
and they play an important role in separating ground from
non-ground classes, such as buildings and trees. The LiDAR
dataset was filtered to ground points, and from those points
a DTM with a 0.5 m pixel size was generated. In addition, a
50 cm Digital Surface Model (DSM) was generated from all
points in the LiDAR dataset. The normalized DSM (nDSM)
was obtained by subtracting the DTM from the DSM. Besides,
height features based on the local neighborhood help to
distinguish objects that lie at different levels above the
surface. minh, the minimum height value in the
neighborhood, and hd, the height difference between the
point of interest and minh, were generated for each point
in the LiDAR dataset (Table 2). From those features, 0.5 m
feature rasters were generated using the Python programming
language (Python, version 2.7).
Table 2. Height based features
Feature  Description
nDSM     Normalized Digital Surface Model
minh     Minimum height in the local neighborhood of a point
hd       Difference between the height of a point and the minimum height in its local neighborhood
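The height-based features of Table 2 can be sketched as follows. This is a minimal illustration, not the scripts used in the study: it assumes the DSM and DTM are already co-registered rasters held as NumPy arrays, that the local neighborhood is a horizontal radius around each point, and that a brute-force neighbor search is acceptable for a toy example. The function names and the 1 m radius are hypothetical.

```python
import numpy as np


def normalized_dsm(dsm, dtm):
    """nDSM = DSM - DTM: height of objects above the bare ground."""
    return dsm - dtm


def local_height_features(points, radius=1.0):
    """Return (minh, hd) for every point of an N x 3 array of x, y, z.

    minh: minimum z among the points within `radius` in the xy-plane
          (assumed neighborhood definition; the point itself is included).
    hd:   z of the point minus minh.
    Brute force, O(N^2); a spatial index would be needed for real point clouds.
    """
    xy = points[:, :2]
    z = points[:, 2]
    # Pairwise horizontal distances between all points.
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    inside = dist <= radius
    # Non-neighbors are set to +inf so that they never win the minimum.
    minh = np.where(inside, z[None, :], np.inf).min(axis=1)
    return minh, z - minh


# Toy rasters (hypothetical values, in metres).
dsm = np.array([[12.0, 15.5], [10.2, 10.0]])
dtm = np.array([[10.0, 10.5], [10.1, 10.0]])
ndsm = normalized_dsm(dsm, dtm)

# Toy point cloud: a ground point, a roof point next to it, an isolated point.
pts = np.array([[0.0, 0.0, 10.0], [0.5, 0.0, 16.0], [5.0, 5.0, 11.0]])
minh, hd = local_height_features(pts, radius=1.0)
print(ndsm)
print(minh, hd)
```

In this toy case the roof point receives a large hd because a ground point lies within its neighborhood, which is the behavior that makes hd useful for separating elevated objects from the surface.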
Eigenvalue-based features were obtained from the
eigenvalues calculated from the local
neighborhood covariance matrix. Eigenvalues describe
the shape of the object and thus give valuable
information about whether it is a plane, line, or
sphere; therefore, those features are a good indicator of a
tree or a building roof, depending on the feature (Table 3).
Sphericity, S, planarity, P, linearity, L, anisotropy, A, the
sum of eigenvalues, Sum, and change of curvature, C,
were calculated, and 0.5 m feature images for each feature
were generated using the Python programming language.
All geometric features were computed for each point from
the points within a 3 m neighborhood and then rasterized
using the mean values within a 1 m range.
Table 3. Eigenvalue-based features
Feature          Description
Anisotropy, A    (λ1 − λ3) / λ1
Planarity, P     (λ2 − λ3) / λ1
Sphericity, S    λ3 / λ1
Linearity, L     (λ1 − λ2) / λ1
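The eigenvalue-based features can be illustrated with a short sketch. It assumes the ratio definitions commonly given in the literature for these features (e.g. Chehata et al., 2009), with eigenvalues ordered λ1 ≥ λ2 ≥ λ3; the change of curvature is taken as λ3 / (λ1 + λ2 + λ3), which is an assumption since the text does not define it. The function name and the toy neighborhoods are hypothetical.

```python
import numpy as np


def eigen_features(neighbours):
    """Shape features from the covariance of one neighborhood (k x 3 array)."""
    cov = np.cov(neighbours, rowvar=False)        # 3 x 3 covariance matrix
    # eigvalsh returns eigenvalues in ascending order; reverse so l1 >= l2 >= l3.
    l1, l2, l3 = np.linalg.eigvalsh(cov)[::-1]
    total = l1 + l2 + l3
    return {
        "anisotropy": (l1 - l3) / l1,
        "planarity": (l2 - l3) / l1,
        "sphericity": l3 / l1,
        "linearity": (l1 - l2) / l1,
        "sum": total,
        "change_of_curvature": l3 / total,        # assumed definition
    }


# A flat, roof-like patch: a 5 x 5 grid of points at constant height.
gx, gy = np.meshgrid(np.arange(5.0), np.arange(5.0))
roof = np.column_stack([gx.ravel(), gy.ravel(), np.full(25, 20.0)])

# A line-like neighborhood: points along a straight cable.
wire = np.column_stack([np.arange(10.0), np.zeros(10), np.full(10, 8.0)])

roof_f = eigen_features(roof)
wire_f = eigen_features(wire)
print(roof_f)
print(wire_f)
```

The planar patch yields a planarity close to 1 and a linearity close to 0, while the line yields the opposite, which is why these ratios help to separate roofs from more scattered objects such as trees.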
The last feature set, echo-based features, helps to differentiate
objects that produce multiple returns. Therefore, the total
number of returns, n, and the ratio of the return number
to the total number of returns, t/n, were calculated for
each point and rasterized to 0.5 m images (Table 4).
Table 4. Echo based features
Feature  Description
n        Total number of returns
t/n      Return number over the total number of returns
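The echo-based features and their rasterization can be sketched as below. This is only an illustration under stated assumptions: per-point return number and number of returns are taken to be available as arrays (as stored in LAS files), each cell of the 0.5 m grid is filled with the mean of the points falling in it, and empty cells are set to NaN. The function name, grid origin, and toy values are hypothetical.

```python
import numpy as np


def rasterize_mean(x, y, values, x_min, y_max, cell, n_rows, n_cols):
    """Average per-point values into a north-up grid; empty cells become NaN."""
    cols = np.floor((x - x_min) / cell).astype(int)
    rows = np.floor((y_max - y) / cell).astype(int)
    # Keep only the points that fall inside the grid extent.
    ok = (rows >= 0) & (rows < n_rows) & (cols >= 0) & (cols < n_cols)
    flat = rows[ok] * n_cols + cols[ok]
    sums = np.bincount(flat, weights=values[ok], minlength=n_rows * n_cols)
    counts = np.bincount(flat, minlength=n_rows * n_cols)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = sums / counts                      # 0 / 0 gives NaN for empty cells
    return mean.reshape(n_rows, n_cols)


# Toy points: coordinates in metres, LAS-style return attributes.
x = np.array([0.1, 0.6, 0.7, 0.2])
y = np.array([0.9, 0.9, 0.8, 0.2])
return_number = np.array([1, 1, 2, 1])            # t
number_of_returns = np.array([1, 2, 2, 1])        # n

ratio = return_number / number_of_returns         # t / n per point

grid = dict(x_min=0.0, y_max=1.0, cell=0.5, n_rows=2, n_cols=2)
n_img = rasterize_mean(x, y, number_of_returns.astype(float), **grid)
ratio_img = rasterize_mean(x, y, ratio, **grid)
print(n_img)
print(ratio_img)
```

The same helper could be used for any other per-point feature, since only the `values` array changes.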
A total of twelve features were selected, and the feature
images were generated using the Python programming
language and its machine learning and geospatial
libraries, including scikit-learn (Pedregosa et al., 2011)
and GDAL (GDAL, 2018). Some of the features and the
orthophoto of a part of the study area are shown in Figure
3.
RF classification (Breiman, 2001) is an ensemble
method of decision trees, which relies on randomly
selecting a subset of features and creating multiple trees
in training, and on predicting new unlabeled data by a
vote of the trees in the ensemble. Two parameters are
required from the user: the number of trees, which defines
the size of the ensemble, and the number of features, which
determines how many randomly selected features are
considered when splitting a node in a tree.
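A pixel-wise RF classification of this kind can be sketched with scikit-learn as follows. The sketch uses synthetic data in place of the twelve feature images and the digitized labels, and the parameter values (100 trees, square root of the number of features per split) are assumptions for illustration, not the settings used in the study. The helper names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_features = 12


def synthetic_area(rows, cols):
    """Hypothetical stand-in for a 12-band feature stack and its digitized labels."""
    stack = rng.normal(size=(n_features, rows, cols))
    # Toy rule: the class (0..3) depends on the signs of the first two bands only.
    labels = (stack[0] > 0).astype(int) + 2 * (stack[1] > 0).astype(int)
    return stack, labels


def to_samples(stack):
    """Reshape (bands, rows, cols) into one row per pixel, one column per feature."""
    return stack.reshape(stack.shape[0], -1).T


train_stack, train_labels = synthetic_area(40, 40)   # training area
study_stack, study_truth = synthetic_area(30, 50)    # separate study area

clf = RandomForestClassifier(
    n_estimators=100,       # number of trees in the ensemble (assumed value)
    max_features="sqrt",    # features considered at each split (assumed value)
    random_state=0,
)
clf.fit(to_samples(train_stack), train_labels.ravel())

# predict() combines the outputs of all trees into one class per pixel.
class_map = clf.predict(to_samples(study_stack)).reshape(study_truth.shape)
agreement = (class_map == study_truth).mean()
print(class_map.shape, round(float(agreement), 3))
```

Training on one area and predicting on another mirrors the separation between training area and study area described in this section.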
Figure 3. (a) True orthophoto and examples of the generated feature images: (b) intensity, (c) nDSM, (d) sphericity, (e)
planarity, (f) linearity, (g) total number of returns, (h) anisotropy, (i) return number over total number of
returns.
Ground truth of the study area (red boundary) and the
training area (blue boundary) are shown in Figure 4. The
training area was chosen from a different
area than the study area. Following similar studies in the
literature, the size of the training area was chosen to be
no smaller than 0.3 × the size of the study area. The study
area was fully digitized so that it could be used for quality
control of the classification results. Pink, green, black, and
yellow features represent buildings, trees, asphalt road,
and ground, respectively.
Figure 4. Study area (red), training area (blue) and their manually digitized features
Using the manually digitized training area (blue
boundary in Fig. 4) and the twelve features, the
classification results were obtained with the RF algorithm.
The results are described in the following section.
4. RESULTS
Ground truth of the area was created by digitizing the
features from the orthophoto of the area. Buildings, trees,
asphalt road, and ground were carefully digitized
(Figure 5a). A part of the ground truth was used as a
training site for classification, while the ground truth of
the study area was used for quality control of the results.
The classification of the study area into four classes is
shown in Figure 5b. In the figure, red, green, gray, and blue
represent the building, tree, asphalt road, and ground
classes, respectively. As can be seen in
Figure 5, a visual comparison of the
classification results with the orthophoto of the study area
indicates that the classes were extracted well. The
qualitative analysis was carried out by comparing the
ground truth and the classification result. For this
purpose, the difference between the ground truth and the
classification result was computed and is illustrated in Fig.
5c.
Figure 5. (a) Ground truth, (b) Classification result, (c) Difference
Using the ground truth and the classification results,
quality control was carried out, and a confusion matrix
was calculated. According to the results, the accuracy is
77.90%, 58.37%, 72.90%, and 71.53% for
buildings, trees, asphalt road, and ground,
respectively. The overall accuracy is 70.20%.
Although the area is very complex, the classification
results are reliable. Only the tree class has lower accuracy
than the other classes. Some errors occurred
because the LiDAR data and the orthophotos, which were used
to create the ground truth, were acquired in different years
and seasons. Some of the trees might therefore have been
misclassified simply because, at the LiDAR data acquisition
time (October 2014), they might not have had leaves,
whereas the orthophotos were acquired
in May, when trees have leaves. Another factor
that affects the results is that some buildings present in
the LiDAR data had been demolished by the time the
orthophotos were acquired. One such building
is shown in Figure 5 with red circles. Finally,
there were cars on the roads in the LiDAR data
that are not present in the orthophoto. This
phenomenon also causes mismatches in the classification of asphalt roads.
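The quality control described above can be sketched as follows. The sketch assumes that the ground truth and the classification result are integer label rasters of the same shape, and that the per-class accuracy is the fraction of ground-truth pixels of each class that received the correct label; the text does not state which per-class measure was used, so this is an assumption. The toy rasters and the function name are hypothetical.

```python
import numpy as np


def confusion_matrix(truth, predicted, n_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    idx = truth.ravel() * n_classes + predicted.ravel()
    counts = np.bincount(idx, minlength=n_classes ** 2)
    return counts.reshape(n_classes, n_classes)


# 0 = building, 1 = tree, 2 = asphalt road, 3 = ground (toy label rasters).
truth = np.array([[0, 0, 1, 1],
                  [2, 2, 3, 3]])
predicted = np.array([[0, 0, 1, 3],
                      [2, 3, 3, 3]])

cm = confusion_matrix(truth, predicted, n_classes=4)
per_class = cm.diagonal() / cm.sum(axis=1)    # correct fraction of each true class
overall = cm.diagonal().sum() / cm.sum()      # correct fraction of all pixels
difference = truth != predicted               # True where the two maps disagree
print(cm)
print(per_class, overall, int(difference.sum()))
```

The boolean `difference` raster corresponds to the kind of difference image shown in Figure 5c, and the off-diagonal entries of the confusion matrix show which classes are confused with each other.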
5. CONCLUSION
In conclusion, in this study, LiDAR data of a complex
urban area from Bergama district, İzmir, Turkey, were
classified into four classes using the RF algorithm. The
classes are buildings, trees, asphalt road,
and ground. The area is very complex in terms of city
planning; for instance, the shapes of the buildings are
irregular. The most challenging part of this study was the
generation of the ground truth, since the area is very
complex in shape. Digitization of roads and buildings was
very difficult and was carried out very carefully. After
digitization of the area, twelve features were created from
the LiDAR data, and using the features and ground truth
together, the area was classified by the RF algorithm.
According to the results, the RF algorithm classified the
area reliably, with 70.20% overall accuracy. However, some
errors occurred because the LiDAR data were acquired in
October 2014 and the orthophoto used in this study was
acquired in May 2016. Because of the seasonal effect, some
of the trees were not classified correctly by the proposed
methodology. Moreover, in some cases, buildings and trees
that appear in the orthophoto images are not found in the
LiDAR data. Finally, for the asphalt road, there are cars on
the roads that may not be in the LiDAR data, or vice versa.
These differences affected the classification results.
Despite these limitations, the proposed methodology is able
to classify the complex urban area with an overall accuracy
of 70.20%.
ACKNOWLEDGEMENTS
The author is very thankful to Eray Sevgen, a Ph.D.
student at Hacettepe University, for sharing Python
scripts for the RF algorithm and for helping with feature
extraction and the digitizing of ground truth data. The author
is also thankful to the Turkish Directory of Geographic
Information Systems for providing true orthophoto
images of the study area and the Turkish General
Command of Mapping for providing the LiDAR data of
Bergama district.
REFERENCES
Belgiu, M. and Drăguț, L. (2016). Random forests in
remote sensing: a review of applications and future
directions. ISPRS J. Photogramm. Remote Sens., 114, pp.
24-31.
Breiman, L. (2001). “Random Forests.” Machine
Learning 45: 5–32.
Canaz S., Aliefendioğlu Y. and Tanrıvermiş H. (2017).
Change detection using Landsat images and an analysis
of the linkages between the change and property tax
values in the Istanbul Province of Turkey. Journal of
Environmental Management. Vol. 200:446-45.
Chehata, N., Li, G. and Mallet, C. (2009). Airborne
LIDAR feature selection for urban classification using
random forests. Geomat. Inform. Sci. Wuhan Univ. 38,
207–212.
Dittrich, A., Weinmann, M. and Hinz, S. (2017).
Analytical and numerical investigations on the accuracy
and robustness of geometric features extracted from 3D
point cloud data. ISPRS J. Photogramm. 126, 195–208.
Charaniya, A.P., Manduchi, R. and Lodha, S.K. (2004).
Supervised parametric classification of aerial LiDAR
data. In Proceedings of 2004 Conference on Computer
Vision and Pattern Recognition Workshop (CVPRW’04),
Washington, DC.
Chen, W., Li, X., Wang, Y., Chen, G. and Liu, S. (2014).
Forested landslide detection using LiDAR data and the
random forest algorithm: a case study of the Three Gorges,
China. Remote Sens. Environ. 152, pp. 291-301.
GDAL/OGR contributors (2018). GDAL/OGR
Geospatial Data Abstraction software Library. The Open
Source Geospatial Foundation. URL http://gdal.org
Guan, H., Li, J., Chapman, M., Deng, F., Ji, Z. and Yang,
X. (2013). Integration of orthoimagery and lidar data for
object-based urban thematic mapping using random
forests. International Journal of Remote Sensing, vol. 34,
issue 14, pp. 5166-5186.
Guo, L., Chehata, N., Mallet, C. and Boukir, S. (2011).
Relevance of airborne LiDAR and multispectral image
data for urban scene classification using Random Forests.
ISPRS Journal of Photogrammetry and Remote Sensing,
66 (1), pp. 56-66.
Kayı, A., Erdoğan, M. and Eker, O. (2015). Results of
LiDAR test performed by OPTECH HA-500 and RIEGL
LMS-Q1560. Harita Dergisi, Volume 153, pp. 42-46.
Lodha, S.K., Kreps, E.J., Helmbold, D.P. and Fitzpatrick,
D. (2006). Aerial LiDAR data classification using support
vector machines (SVM). The Third International
Symposium on 3D Data Processing, Visualization, and
Transmission pp. 567-574.
Ma L., Zhou M. and Li C. (2017). Land Covers
Classification Based On Random Forest Method Using
Features From Full-Waveform Lidar Data. The
International Archives of the Photogrammetry, Remote
Sensing and Spatial Information Sciences, Volume XLII-
2/W7, 2017 ISPRS Geospatial Week 2017, 18–22
September 2017, Wuhan, China
Niemeyer, J., Rottensteiner, F. and Soergel, U. (2012).
Conditional random fields for LiDAR point cloud
classification in complex urban areas. In: ISPRS Annals
of the Photogrammetry, Remote Sensing and Spatial
Information Sciences I-3, pp. 263-26.
Optech, (2018), http://www.Optech.Com/Specification
[Accessed on: 14 May 2018]
Richards, J.A. and Jia, X. (1999). Supervised
Classification Techniques. In: Remote Sensing Digital Image
Analysis, Springer-Verlag GmbH, Heidelberg, pp.
193–247.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V.,
Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P.,
Weiss, R., Dubourg, V., Vanderplas, J., Passos, A.,
Cournapeau, D., Brucher, M., Perrot, M. and Duchesnay,
E. (2011). Scikit-learn: Machine Learning in Python.
Journal of Machine Learning Research 12:2825-2830.
Python Software Foundation. Python Language
Reference, version 2.7. available at
http://www.python.org. Retrieved on 15.04.2018.
Rodriguez-Galiano, V., Ghimire, B., Rogan, J., Chica-
Olmo, M. and Rigol-Sanchez, J. (2012). An assessment
of the effectiveness of a random forest classifier for land-
cover classification. ISPRS Journal of Photogrammetry
and Remote Sensing, 67 (0), pp. 93-104.
Rottensteiner, F. and Briese, C. (2002). A new method for
building extraction in urban areas from high-resolution
LiDAR data. Int Arch Photogramm Remote Sens Spat Inf
Sci 34(3A):295–301.
Yang, X. and Lo, C.P. (2002). Using a time series of
satellite imagery to detect land use and land cover
changes in the Atlanta, Georgia metropolitan area.
International Journal of Remote Sensing, 23, pp. 1775-1798.
Yu, X., Zhang, A., Hou, X., Li, M., and Xia, Y. (2013).
Multi-temporal remote sensing of land cover change and
urban sprawl in the coastal city of Yantai, China.
International Journal of Digital Earth. Vol. 6, Supplement
2, 137-154.
... They serve as inputs for algorithms, allowing them to learn and classify points based on patterns in the feature data. The features encompass geometric properties such as point density, height, and shape, along with radiometric properties like reflectance and intensity [13][14][15]. ...
... To further evaluate the efficacy of XGBoost three other popular machine learning approaches are also considered: (a) RF, a widely used bagging approach,(b) Light Gradient Boosting Machine (LGBM), another boosting based approach and c) ridge classifier, a fast-learning approach for classification [13][14][15]18]. ...
Article
Full-text available
Semantic segmentation of aerial LiDAR dataset is a crucial step for accurate identification of urban objects for various applications pertaining to sustainable urban development. However, this task becomes more complex in urban areas characterised by the coexistence of modern developments and natural vegetation. The unstructured nature of point cloud data, along with data sparsity, irregular point distribution, and varying sizes of urban objects, presents challenges in point cloud classification. To address these challenges, development of robust algorithmic approach encompassing efficient feature sets and classification model are essential. This study incorporates point‐wise features to capture the local spatial context of points in datasets. Furthermore, an ensemble machine learning model based on extreme boosting is utilised, which integrates sequential training for weak learners, to enhance the model’s resilience. To thoroughly investigate the efficacy of the proposed approach, this study utilises three distinct datasets from diverse geographical locations, each presenting unique challenges related to class distribution, 3D terrain intricacies, and geographical variations. The Land‐cover Diversity Index is introduced to quantify the complexity of landcover in 3D by measuring the degree of class heterogeneity and the frequency of class variation in the dataset. The proposed approach achieved an accuracy of 90% on the regionally complex, higher landcover diversity dataset, Trivandrum Aerial LiDAR Dataset. Furthermore, the results of the study demonstrate improved overall predictive accuracy of 91% and 87% on data segments from two benchmark datasets, DALES and Vaihingen 3D.
... There are several topological objects and areas that need to be classified. These classifications can be conducted from different data types, and the reasons for classifying these objects and areas vary depending on the context [1]. The road is one of the significant areas requiring accurate classification for the following reasons [2][3][4]. ...
... To find the covariance matrix, the distances in x, y and z coordinates between the k closest points and the center point were found using equation (1). Please note that the center point is the point whose feature is desired to be calculated, ...
Article
Full-text available
Accurate road surface from a three-dimensional (3D) point cloud depends on various parameters. One crucial parameter is the set of point features. Point features enable classification by capturing characteristics of the surface on which the points are located. These features are calculated based on the closest points surrounding each point. In this study, the K-nearest neighbors algorithm (KNN) was applied to identify these closest points. The KNN algorithm requires only one input, the number of closest points (k). Eight different point features were developed using different k values, and their impact on road surface classification from the 3D point cloud was investigated. It was observed that there is no significant improvement in classification accuracy until a certain k value. However, better classification accuracy was achieved after a certain k value. The effect of different k values was also investigated under different training sample structures and machine learning (ML) algorithms. When training samples were selected from a single location as a large group, similar classification accuracy was obtained across different k values. Conversely, when training samples were chosen from various regions in smaller groups rather than a single large group, improved classification was observed as the k value increased. Additionally, it was noted that five different ML algorithms-random forest, support vector machine, generalized linear model, linear discriminant analysis, and robust linear discriminant analysis-have almost similar performance under different k values. Finally, using the optimum k value, improvements of up to 4.543% and 6.601% in accuracy and quality measures, respectively, were found.
... Light Detection and Ranging (LiDAR) data have been increasingly used for classification of areas in recent decades [25][26][27]. In order to measure the evolution of the volume of the sinkhole and track local and regional deformations, five LiDAR scanning campaigns of the sinkhole and its vicinity were performed between June 2009 and The records are color-coded by the different instruments that were used during that period. ...
... Light Detection and Ranging (LiDAR) data have been increasingly used for classification of areas in recent decades [25][26][27]. In order to measure the evolution of the volume of the sinkhole and track local and regional deformations, five LiDAR scanning campaigns of the sinkhole and its vicinity were performed between June 2009 and December 2009 ( Figure 4). ...
Article
Full-text available
Subsurface salt layer dissolution along the western shores of the Dead Sea is considered to be the primary cause for extensive large sinkhole formation in the past 40 years. Many of these sinkholes are arranged in clusters and are filled with water from nearby springs. The Mineral Beach resort was built in an area with a thermal spring with water emerging at around 40 °C at the Shalem sinkhole cluster. Unfortunately, the same spring was responsible for the destruction of the resort as it supplied water undersaturated with respect to halite, which promoted dissolution and sinkhole formation. The sinkholes in the Shalem cluster drain out in sudden catastrophic events and then slowly fill again. The drainage mechanisms of this phenomenon are studied in the Shalem-2 sinkhole cluster using leveling data collectors and ground-based LiDAR surveys over a period of 5 years, including thirty-five drainage events. Drainage volume and fluxes calculated using water level and topographic data obtained by LiDAR scans suggest that the formation of additional sinkholes beneath the pond’s bottom triggers drainage events. The subsequent flux shows that the evolution of the newly formed sinkholes either improves the hydraulic connection or temporarily seals the connection between the surface pond and deeper caverns/aquifers. The drainage event ends when either the hydraulic connection is sealed or when the level of water in the pond drops to the level of the newly formed sinkhole. The large volumes of drained water and drainage fluxes imply the existence of a well-developed active underground draining system.
... Further, the normalized eigenvalues 2 and 3 are computed. For ALS point cloud vegetation segmentation, the effective PCA-based geometric feature descriptors are eigenvalues ($\lambda_1$, $\lambda_2$, $\lambda_3$) [47], eigenvectors ($v_2$, $v_3$) [53], and eigenvalue-derived descriptors. The normalized eigenvalues are used to compute the eigenvalue sum ($E_{sum} = \lambda_1 + \lambda_2 + \lambda_3$) [48] and eigenentropy ...
... where $\hat{a}_z$ is the unit vector along the z-axis. These PCA-based geometric feature descriptors are summarized in Table 3; they play important roles in differentiating vegetation from non-vegetation objects [53]. ...
... 2024). Examples of these fields include forest research (Akay et al., 2009; Cetin & Yastikli, 2023; Gollob et al., 2021), geological surveying (Zeybek et al., 2015), urban planning (Sevgen, 2019), topographic map production over large areas, natural disaster monitoring (Huang et al., 2021), road geometry (Soilan et al., 2019; Suleymanoglu et al., 2023), indoor modeling and scanning (Yiğit et al., 2023), and many other disciplines (Seyfeli & Ok, 2022). ...
Article
Recently, the integration of light detection and ranging (LiDAR) sensors into smartphones has, in addition to improving photographic focus adjustment, established them as a brand-new alternative surveying tool for three-dimensional (3D) indoor and outdoor mapping. Although this new system opens scanning technology to civilian use across different disciplines, it is still too early for its data quality to reach the level of geodetic LiDAR surveying. Previous studies have shown that the system may not be fully reliable for tasks that require measuring fine details. However, this does not entirely preclude the use of smartphone LiDAR in map production. This article discusses whether the Apple 14 Pro smart device can support map production at certain levels in outdoor environments, particularly in the 3D modeling of infrastructure work trenches. In particular, 3D maps of infrastructure facilities such as sewerage and drinking water, and the reconstruction of trench structure scenes, are important for maintenance and repair work planned for later. As a case study, the determination of the geometric structure and relative position of the natural gas connection line of Selçuk University Güneysınır Vocational School is presented. Different structures and systems were identified in the infrastructure works, and this study is considered important for identifying these layers and providing information for subsequent excavation stages.
... For the above reasons, a new search has been started (Lagüela et al., 2018;Sanchez Diaz et al., 2022), and mobile mapping systems have been developed to reach the precision obtained from TLS with less data in a short time (Qian et al., 2019;Yaman & Yılmaz, 2017). Emerging WMLS systems have offered solutions to measurement, documentation and mapping processes in areas where measurement areas are complex and access is limited, or TLS measurement procedures are limited (Karasaka and Beg, 2021;Otero et al., 2020;Sevgen, 2019). This type of device simplifies and speeds up the measurement process compared to stationary devices, as taking measurements from multiple points is unnecessary. ...
... Sibel Canaz Sevgen employed pre-processing on LiDAR data to clean noise and duplicate values before generating 12 features and a subsequent random forest (RF) classification [56]. Ground truth data were subsequently obtained from aerial photographs of a complex urban area to classify buildings, trees, asphalt roads and the ground with respective accuracies of 77.90%, 58.37%, 72.90% and 71.53%. ...
Article
Light detection and ranging (LiDAR) sensors have accrued an ever-increasing presence in the agricultural sector due to their non-destructive mode of capturing data. LiDAR sensors emit pulsed light waves that return to the sensor upon bouncing off surrounding objects. The distances that the pulses travel are calculated by measuring the time for all pulses to return to the source. There are many reported applications of the data obtained from LiDAR in agricultural sectors. LiDAR sensors are widely used to measure agricultural landscaping and topography and the structural characteristics of trees such as leaf area index and canopy volume; they are also used for crop biomass estimation, phenotype characterisation, crop growth, etc. A LiDAR-based system and LiDAR data can also be used to measure spray drift and detect soil properties. It has also been proposed in the literature that crop damage detection and yield prediction can also be obtained with LiDAR data. This review focuses on different LiDAR-based system applications and data obtained from LiDAR in agricultural sectors. Comparisons of aspects of LiDAR data in different agricultural applications are also provided. Furthermore, future research directions based on this emerging technology are also presented in this review.
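The time-of-flight principle described above can be illustrated with a short sketch. It assumes the standard relation that the one-way range is half the round-trip distance travelled at the speed of light; the function name `range_from_round_trip` is hypothetical and is not taken from the review.

```python
# Speed of light in vacuum (m/s); the small slowdown in air is ignored here.
SPEED_OF_LIGHT = 299_792_458.0


def range_from_round_trip(seconds: float) -> float:
    """Return the one-way distance (m) to a target for a measured round-trip time (s)."""
    if seconds < 0:
        raise ValueError("round-trip time must be non-negative")
    # The pulse travels to the target and back, so the range is half the path length.
    return SPEED_OF_LIGHT * seconds / 2.0


if __name__ == "__main__":
    # A return received 1 microsecond after emission is roughly 150 m away.
    print(f"{range_from_round_trip(1e-6):.2f} m")
```

Real sensors also correct for atmospheric refraction and scan geometry, which this sketch leaves out.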
Article
The aim of this study, carried out in a forested area, is to investigate both the success of the object-based classification method and whether fieldwork is sufficient to meet the need for reference data before classification. In object-based classification, reference data such as aerial photographs, map sheets, stand maps, and field data are often needed before classification, both for choosing the segmentation parameters and for selecting the training areas used in the accuracy analysis. In this study, first, within a 12x12 km study area belonging to the Kastamonu Province Central Forest Sub-district Directorate, an object-based classification was carried out on a high-resolution GeoEye-1 satellite image using eCognition Developer 9.1 software to extract details for the class types "Coniferous, Deciduous, Agricultural area, Open area, and Building". After this assessment, fieldwork was conducted with the Differential Global Navigation System method using 150 points in total, 30 from each class, and the results were analyzed point by point against the object-based classification results. According to the findings, the field data were sufficiently consistent with the classified satellite image and can be used as reference data.
Article
Automatic building extraction plays an important role in many areas such as urban planning, disaster management, 3D building modeling, land valuation, and updating GIS databases. In these applications, building layouts are becoming increasingly complex, especially as cities grow and develop. This complexity makes it difficult to obtain and update these data with traditional methods. Clustering is a data analysis method that aims to find patterns and similar structures within data. It is generally used to simplify information extraction from large datasets. It is of great importance in data analysis processes, particularly in fields such as machine learning, data mining, and image analysis. Data analysis is a fundamental tool for extracting important information from data and understanding it. Lidar is a remote sensing method that uses a pulsed laser to measure the distance from its own position to the Earth's surface and provides three-dimensional information about the Earth's shape and form. Airborne Lidar data are used by many researchers, especially for applications such as feature extraction, terrain modeling, and Digital Surface Model generation. Lidar offers the opportunity to obtain three-dimensional data with less effort than traditional data sources. However, automatic building extraction from Lidar data is a complex problem due to the nature of the data. In this study, automatic building extraction from Lidar data was carried out with methods proposed for point cloud processing and analysis. Specifically, the K-Means and Fuzzy C-Means clustering methods were applied to datasets containing different numbers of buildings. The results show that the K-Means and Fuzzy C-Means methods produce similar results. The proximity and arrangement of the point data and the geometric structure were observed to be important factors in the accuracy of the clustering methods.
Article
The human population is constantly increasing throughout the world, and construction is increasing accordingly. As a result, irregular and unplanned urbanization is emerging. To prevent irregular and unplanned urbanization, cadastral borders must be monitored quickly. In this sense, a sensitive, up-to-date, object-based, 3D, and 4D (3D + time) cadastre has to be a priority. Continuously updating cadastral maps is therefore important for sustainability and intelligent urbanization. In addition, increasing urbanization has made it necessary to update the cadastral information system and produce 3D cadastral maps. However, since data collection is very difficult in urban areas where construction is rapid, different data-collection devices are constantly being applied. While these devices have proven themselves in terms of accuracy and precision, new technologies have started to be developed, especially for urban areas, owing to the increase in human population and the influence of environmental factors. For this reason, LiDAR data collection methods and the SLAM algorithm can offer a new perspective for producing cadastral maps in complex urban areas. In this study, 3D laser scanning data obtained from a portable sensor based on the SLAM algorithm, a relatively new approach for cadastral surveys in complex urban areas, are tested. At the end of this study, two different statistical comparisons and accuracy analyses of the proposed methodology against reference data were made. First, WMLS data were compared with GNSS data, and the RMSE values for X, Y, and Z were found to be 4.13, 4.91, and 7.77 cm, respectively. In addition, WMLS length data were compared with cadastral length data from total-station measurements, and the RMSE value was calculated as 4.76 cm.
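The per-axis RMSE comparison reported above can be sketched as follows. This is a minimal illustration of the usual root-mean-square error formula, not the authors' processing chain; the function name `rmse` and the sample coordinates are hypothetical.

```python
import math


def rmse(measured, reference):
    """Root-mean-square error between paired measurements and reference values."""
    if len(measured) != len(reference) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - r) ** 2 for m, r in zip(measured, reference)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical X coordinates (m) of check points from a scanner and from GNSS.
    scanner_x = [100.03, 205.51, 310.96]
    gnss_x = [100.00, 205.55, 311.00]
    print(f"RMSE X: {rmse(scanner_x, gnss_x) * 100:.2f} cm")
```

Applying the same function separately to the X, Y, and Z columns gives one RMSE value per axis, as in the abstract.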
Article
In this study, a Random Forest (RF) based land cover classification method is presented to predict the types of land cover in the Miyun area. The returned full waveforms, acquired by a LiteMapper 5600 airborne LiDAR system, were processed through waveform filtering, waveform decomposition, and feature extraction. The commonly used features of distance, intensity, Full Width at Half Maximum (FWHM), skewness, and kurtosis were extracted. These waveform features were used as attributes of the training data for generating the RF prediction model. The RF prediction model was applied to classify land cover in the Miyun area as trees, buildings, farmland, and ground. The classification results for these four land cover types were obtained with reference to ground truth information acquired from CCD image data of the same region. The RF classification results were compared with those of the SVM method and were better. The RF classification accuracy reached 89.73% and the Kappa coefficient was 0.8631.
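Overall accuracy and the Kappa coefficient, the two figures quoted above, are both derived from a confusion matrix. The sketch below uses the standard definitions, assuming rows are reference classes and columns are predicted classes. The function names and the sample matrix are hypothetical and do not reproduce the study's data.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohen_kappa(cm):
    """Cohen's kappa: agreement corrected for the agreement expected by chance."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    observed = overall_accuracy(cm)
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical two-class matrix: rows are reference, columns are predicted.
    cm = [[45, 5],
          [5, 45]]
    print(overall_accuracy(cm), cohen_kappa(cm))
```

Kappa is always at or below the overall accuracy because it discounts the matches that would occur by chance given the class totals.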
Article
In this paper, we investigate the potential of a Conditional Random Field (CRF) approach for the classification of an airborne LiDAR (Light Detection And Ranging) point cloud. This method enables the incorporation of contextual information and learning of specific relations of object classes within a training step. Thus, it is a powerful approach for obtaining reliable results even in complex urban scenes. Geometrical features as well as an intensity value are used to distinguish the five object classes building, low vegetation, tree, natural ground, and asphalt ground. The performance of our method is evaluated on the dataset of Vaihingen, Germany, in the context of the 'ISPRS Test Project on Urban Classification and 3D Building Reconstruction'. Therefore, the results of the 3D classification were submitted as a 2D binary label image for a subset of two classes, namely building and tree.
Article
The Three Gorges region of central western China is one of the most landslide-prone regions in the world. However, landslide detection based on field surveys and optical remote sensing and synthetic aperture radar (SAR) techniques remains difficult owing to the dense vegetation cover and mountain shadow. In the present study, an area of Zigui County in the Three Gorges region was selected to test the feasibility of detecting landslides by employing novel features extracted from a LiDAR-derived DTM. Additionally, two small sites—Site 1 and Site 2—were selected for training and were used to classify each other. In addition to the aspect, DTM, and slope images, the following feature sets were proposed to improve the accuracy of landslide detection: (1) the mean aspect, DTM, and slope textures based on four texture directions; (2) aspect, DTM, and slope textures based on aspect; and (3) the moving average and standard deviation (stdev) filter of aspect, DTM, and slope. By combining a feature selection method and the RF algorithm, the classification accuracy was evaluated and landslide boundaries were determined. The results can be summarized as follows. (1) The feature selection method demonstrated that the proposed features provided information useful for effective landslide identification. (2) Feature selection achieved an improvement of about 0.44% in the overall classification accuracy, with the feature set reduced by 74%, from 39 to 10; this can speed up the training of the RF model. (3) When fifty randomly selected 20% of landslide pixels (PLS) and 20% of non-landslide pixels (PNLS) (i.e., 20% of PLS and PNLS) were utilized in addition to the selected feature subsets for training, the test sets (i.e., the remaining 80% of PLS and PNLS) yielded an average overall classification accuracy of 78.24%. The cross training and classification for Site 1 and Site 2 provided overall classification accuracies of 62.65% and 64.50%, respectively. 
This shows that the random sampling design (which suffered some of the effects of spatial auto-correlation) and the proposed method in this present study contribute jointly to the classification accuracy. (4) Using the Canny operator to delineate landslide boundaries based on the classification results of PLS and PNLS, we obtained results consistent with the referenced landslide inventory maps. Thus, the proposed procedure, which combines LiDAR data, a feature selection method, and the RF algorithm, can identify forested landslides effectively in the Three Gorges region.
Article
In this paper, a new method for the automated generation of 3D building models from directly observed point clouds generated by LIDAR sensors is presented. By a hierarchic application of robust interpolation using a skew error distribution function, the LIDAR points being on the terrain are separated from points on buildings and other object classes, and a digital terrain model (DTM) can be computed. Points on buildings have to be separated from other points classified as off-terrain points, which is accomplished by an analysis of the height differences of a digital surface model passing through the original LIDAR points and a digital terrain model. Thus, a building mask is derived, and polyhedral building models are created in these candidate regions in a bottom-up procedure by applying curvature-based segmentation techniques. Intermediate results will be presented for a test site located in the City of Vienna.
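The height-difference analysis between the DSM and the DTM can be sketched as a simple threshold on their difference. This is only an illustration under stated assumptions: the function name `off_terrain_mask` and the 2.5 m threshold are hypothetical, and the paper's actual criteria are not given in the abstract.

```python
def off_terrain_mask(dsm, dtm, min_height=2.5):
    """Flag grid cells whose DSM height exceeds the DTM by at least min_height metres.

    dsm and dtm are equally shaped lists of rows holding heights in metres.
    The threshold value is an assumption chosen for illustration only.
    """
    if len(dsm) != len(dtm) or any(len(a) != len(b) for a, b in zip(dsm, dtm)):
        raise ValueError("DSM and DTM grids must have the same shape")
    return [[(s - t) >= min_height for s, t in zip(s_row, t_row)]
            for s_row, t_row in zip(dsm, dtm)]


if __name__ == "__main__":
    dsm = [[105.0, 101.0],
           [100.2, 108.5]]
    dtm = [[100.0, 100.5],
           [100.0, 100.0]]
    for row in off_terrain_mask(dsm, dtm):
        print(row)
```

A mask of this kind also captures trees and other tall objects, so further analysis is needed to isolate buildings, as the abstract notes.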
Article
In this study, the Istanbul Province was monitored using Landsat 5 TM, MSS, Landsat 7 ETM+, and Landsat 8 OLI imagery from the years 1986, 2000, 2009, 2011, 2013, and 2015 in order to assess land cover changes in the province. The aim of the study was to classify manmade structures, land, green, and water areas, and to observe the changes in the province using satellite images. After classification, the images from selected years were compared to observe land cover changes. Moreover, these changes were correlated with the property tax values of Istanbul by year. The findings of the study showed that manmade structure areas increased while vegetation areas decreased due to rapid population growth, urbanization, and industrial and commercial development in Istanbul. These changes also explain the transformation of land from rural and natural areas to residential use, and serve as a tool with which to assess land value increments. Land value capturing is critical for the analysis of the linkages between the changes in land cover, and for assessing land transformation and urban growth. Due to inadequate market data, real estate tax values were used to analyze the linkages between change detection, land cover, and taxation. In fact, the tax values declared by land owners are generally lower than the actual market values, and it is therefore not possible to transfer the increase in land value in urban areas from the owner to local and central governments through property taxation. The research results also show that integrating remote sensing results with real estate market data allows the tax base values of real estate to be determined more realistically.
Article
In photogrammetry, remote sensing, computer vision and robotics, a topic of major interest is represented by the automatic analysis of 3D point cloud data. This task often relies on the use of geometric features amongst which particularly the ones derived from the eigenvalues of the 3D structure tensor (e.g. the three dimensionality features of linearity, planarity and sphericity) have proven to be descriptive and are therefore commonly involved for classification tasks. Although these geometric features are meanwhile considered as standard, very little attention has been paid to their accuracy and robustness. In this paper, we hence focus on the influence of discretization and noise on the most commonly used geometric features. More specifically, we investigate the accuracy and robustness of the eigenvalues of the 3D structure tensor and also of the features derived from these eigenvalues. Thereby, we provide both analytical and numerical considerations which clearly reveal that certain features are more susceptible to discretization and noise whereas others are more robust.
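The three dimensionality features named above are commonly computed from the sorted eigenvalues of the 3D structure tensor. The sketch below uses the widely used definitions, with each feature normalized by the largest eigenvalue; the function name is hypothetical. The eigenvalues are assumed to be already available, for example from `numpy.linalg.eigvalsh` applied to the covariance matrix of a local point neighbourhood.

```python
def dimensionality_features(l1, l2, l3):
    """Linearity, planarity and sphericity from eigenvalues with l1 >= l2 >= l3 >= 0."""
    if not (l1 >= l2 >= l3 >= 0) or l1 == 0:
        raise ValueError("expected l1 >= l2 >= l3 >= 0 with l1 > 0")
    return {
        "linearity": (l1 - l2) / l1,   # high for line-like neighbourhoods
        "planarity": (l2 - l3) / l1,   # high for planar neighbourhoods
        "sphericity": l3 / l1,         # high for volumetric scatter
    }


if __name__ == "__main__":
    # Eigenvalues of a hypothetical, mostly linear neighbourhood.
    print(dimensionality_features(1.0, 0.5, 0.25))
```

With these definitions the three values always sum to 1. Because every feature is divided by the largest eigenvalue, noise in that one value affects all three.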
Article
A random forest (RF) classifier is an ensemble classifier that produces multiple decision trees, using a randomly selected subset of training samples and variables. This classifier has become popular within the remote sensing community due to the accuracy of its classifications. The overall objective of this work was to review the utilization of the RF classifier in remote sensing. This review has revealed that the RF classifier can successfully handle high data dimensionality and multicollinearity, being both fast and insensitive to overfitting. It is, however, sensitive to the sampling design. The variable importance (VI) measurement provided by the RF classifier has been extensively exploited in different scenarios, for example to reduce the number of dimensions of hyperspectral data, to identify the most relevant multisource remote sensing and geographic data, and to select the most suitable season to classify particular target classes. Further investigations are required into less commonly exploited uses of this classifier, such as for sample proximity analysis to detect and remove outliers in the training samples.
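The two sources of randomness mentioned above, and the way tree outputs are combined, can be sketched without fitting any trees. This is a partial illustration that omits tree construction entirely; all function names are hypothetical. In typical implementations the variable subset is redrawn at every split rather than once per tree.

```python
import random
from collections import Counter


def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement (one bootstrap sample per tree)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def random_feature_subset(n_features, k, rng):
    """Pick k distinct candidate feature indices, as done when searching for a split."""
    return sorted(rng.sample(range(n_features), k))


def majority_vote(predictions):
    """Combine per-tree class labels by taking the most frequent one."""
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    print(bootstrap_indices(8, rng))
    print(random_feature_subset(12, 3, rng))
    print(majority_vote(["building", "tree", "building", "ground"]))
```

Samples left out of a tree's bootstrap draw are the out-of-bag samples, which random forest implementations commonly use for an internal error estimate.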
Chapter
The principal purpose of this Chapter is to present the algorithms used regularly for the supervised classification of single sensor remote sensing image data. These are collected in Part I. When data from a variety of sensors or sources (such as found in the integrated spatial data base of a Geographical Information System) requires analysis, or when the spatial resolution of a sensor is sufficiently high to warrant attention being paid to neighbouring pixels when performing a classification, more sophisticated analysis tools may be required. A range of these is presented in Part II, along with a treatment of the neural network method for image analysis. These techniques are conceptually more difficult than the standard procedures and have been grouped separately for that reason. It is suggested that only Part I be covered on a first reading of the material of this book; Part II can be left safely until the need arises without affecting an understanding of the remaining chapters.