Segmentation of Digital Images of Aerial Photography

Authors:
Iryna Yurchuk
Vladyslav Kovdrya
Lolita Bilyanska
Applied Mathematics department
National Aviation University
Kyiv, Ukraine
Abstract: For digital images of aerial photography, the
parameters of the efficient graph-based and pyramidal
segmentation algorithms are analyzed. The RGB, Lab, and HSV
(BHS) color models are used.
Keywords—aerial photography; segmentation; graph-based
algorithm; pyramidal algorithm
I. INTRODUCTION
Image segmentation is a useful part of machine vision [1],
object detection [2], recognition tasks [3], etc. Its main goal
is to obtain a quotient space of an image with respect to the
equivalence relation pixel a “is similar to” pixel b. The term
“similar” can be defined in different ways, which is why many
algorithms exist. However, none of them is universal for all
types of digital images (faces, still life, landscape, etc.).
In this work, the efficient graph-based segmentation algorithm
(EGBSA) [4] and the pyramidal segmentation algorithm (PSA) [5]
are studied as possible effective instruments for segmenting
digital images obtained by aerial photography. The approaches
to segmentation of aerial photography studied in [6]-[9] rely
on pre-processing, which makes them computationally too
expensive for real-time use.
The application side of this research is the possibility of
creating information technology for unmanned aerial vehicles
(UAV), which can automatically detect and recognize objects
in real time.
II. BACKGROUND AND ALGORITHMS
A. Efficient Graph-based Algorithm
The main terms and formulas of EGBSA can be found in [4].
At the beginning, an image I is represented as a graph G=(V,E),
where V is a set of vertices and E is a set of edges. A brief
description of the algorithm is given below.
1) Sort E into $\pi = (o_1, \ldots, o_m)$ by non-decreasing edge
weight;
2) Start with the initial segmentation $S^0$, where each vertex
$v_i$ is in its own component;
3) Repeat step 4 for $q = 1, \ldots, m$;
4) Construct $S^q$ given $S^{q-1}$ as follows. Let $v_i$ and $v_j$
denote the vertices connected by the q-th edge in the ordering,
i.e., $o_q = (v_i, v_j)$. If $v_i$ and $v_j$ are in disjoint
components of $S^{q-1}$ and $w(o_q)$ is small compared to the
internal difference of both those components, then merge the
two components; otherwise do nothing. More formally, let
$C_i^{q-1}$ be the component of $S^{q-1}$ containing $v_i$ and
$C_j^{q-1}$ the component containing $v_j$. If
$C_i^{q-1} \neq C_j^{q-1}$ and
$w(o_q) \le MInt(C_i^{q-1}, C_j^{q-1})$, then $S^q$ is obtained
from $S^{q-1}$ by merging $C_i^{q-1}$ and $C_j^{q-1}$. Otherwise,
$S^q = S^{q-1}$;
5) Return $S = S^m$.
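The steps above can be sketched with a union-find structure. The sketch below is an illustration under stated assumptions, not the authors' C# implementation: the function name segment_graph and the edge-list input format are hypothetical, and MInt is taken as defined in [4], that is, the minimum over the two components of Int(C) + k/|C|, where Int(C) is the largest edge weight merged into C so far.

```python
def segment_graph(num_vertices, edges, k):
    """Merging loop of the graph-based algorithm (illustrative sketch).

    edges: list of (weight, i, j) tuples (assumed input format).
    k: scale parameter of the threshold tau(C) = k / |C| from [4].
    Returns one component label (root index) per vertex.
    """
    parent = list(range(num_vertices))  # S^0: each vertex is its own component
    size = [1] * num_vertices           # |C| stored at each component root
    internal = [0.0] * num_vertices     # Int(C) stored at each component root

    def find(v):
        # Locate the component root, shortening the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Step 1: process edges in non-decreasing order of weight.
    for w, i, j in sorted(edges):
        a, b = find(i), find(j)
        if a == b:
            continue  # same component, so S^q = S^{q-1}
        # MInt(C_a, C_b) = min(Int(C_a) + k/|C_a|, Int(C_b) + k/|C_b|)
        m_int = min(internal[a] + k / size[a], internal[b] + k / size[b])
        if w <= m_int:
            parent[b] = a
            size[a] += size[b]
            # Weights arrive in non-decreasing order, so w is the new maximum.
            internal[a] = w
    return [find(v) for v in range(num_vertices)]


# A chain of six vertices with one heavy edge in the middle.
chain = [(1, 0, 1), (1, 1, 2), (10, 2, 3), (1, 3, 4), (1, 4, 5)]
print(len(set(segment_graph(6, chain, k=2))))    # heavy edge is not merged
print(len(set(segment_graph(6, chain, k=100))))  # larger k merges everything
```

On this toy graph a larger k yields fewer components, which matches the role of k as a parameter controlling how readily components are merged.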
B. Pyramidal Segmentation Algorithm
Details of PSA can be found in [5]. A brief description of the
algorithm is given below.
The first stage of this algorithm converts the original image
into an initial set of small clusters. Each cluster is
characterized both by its own parameters and by the parameters
of its connections with neighboring clusters. This problem is
conveniently solved with a pyramid-recursive algorithm: the
pyramidal segmentation algorithm performs forward (up) and
reverse (down) passes through a quadtree.
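To make the quadtree idea concrete, the following sketch shows only a top-down (reverse) pass that splits a block until it is homogeneous. It is a simplified illustration and not the PSA of [5]: the function name quadtree_split, the grayscale list-of-rows image format, and the homogeneity criterion (brightness range within a threshold) are all assumptions made for the example.

```python
def quadtree_split(img, x, y, w, h, threshold, leaves):
    """Recursively split block (x, y, w, h) until it is homogeneous.

    img: list of rows of brightness values (assumed format).
    A block is treated as homogeneous when max - min <= threshold
    (an assumed criterion). Homogeneous blocks are appended to leaves.
    """
    values = [img[r][c] for r in range(y, y + h) for c in range(x, x + w)]
    if max(values) - min(values) <= threshold or (w == 1 and h == 1):
        leaves.append((x, y, w, h))
        return
    hw, hh = max(w // 2, 1), max(h // 2, 1)
    children = (
        (x, y, hw, hh),                    # top-left
        (x + hw, y, w - hw, hh),           # top-right
        (x, y + hh, hw, h - hh),           # bottom-left
        (x + hw, y + hh, w - hw, h - hh),  # bottom-right
    )
    for cx, cy, cw, ch in children:
        if cw > 0 and ch > 0:  # skip empty children of 1-pixel-wide blocks
            quadtree_split(img, cx, cy, cw, ch, threshold, leaves)


# A 4 x 4 image that is uniform except for one bright pixel.
image = [[10] * 4 for _ in range(4)]
image[0][0] = 100
blocks = []
quadtree_split(image, 0, 0, 4, 4, threshold=5, leaves=blocks)
print(len(blocks))
```

Only the quadrant containing the bright pixel is subdivided further, so detail is concentrated where the image is heterogeneous.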
If the goal of the first stage of segmentation is maximum
image detail, then the task of the second stage is the maximum
reduction in the number of segments with minimal loss (that
is, with minimal merging of heterogeneous objects).
The secondary segmentation procedure is performed in
several stages:
1) Merging of the main segments (initial merging of small
segments and subsequent merging of large segments). The
dynamic threshold depends on the iteration step and on the
segment sizes P and $P_0$, where P is the size of the smaller
segment in a pair and $P_0$ is the size of a segment that is
considered “small”; β limits the reduction of the threshold
for “large” segments and α is a general adjustment of the
threshold;
2) Merging of dark segments. This procedure processes only
segments with a low brightness value B;
3) Merging of texture segments. This procedure applies only to
multiple clusters with a high texture level.
C. Quality Assessment of Segmentation
In this work, the authors use the intercluster dispersion sum.
Let us recall its main formulas.
Assume that for an image I a set $S = \{S_1, \ldots, S_K\}$ of K
segments was obtained.
Then the intercluster dispersion sum is given by

$$Q(S) = \sum_{j=1}^{K} \sum_{l=1}^{N_j} d^2\left(I^{(j)}, I_l^j\right),$$

where $N_j$ is the number of pixels in $S_j$, $I^{(j)}$ is the value
of the centre of $S_j$, and $I_l^j$ is the value of the l-th
element of $S_j$.
There is no universal scale of $Q(S)$ for digital images. A
smaller value of $Q(S)$ corresponds to a better segmentation.
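As an illustration, Q(S) can be computed as a sum of squared distances from every pixel to the centre of its segment. The sketch below assumes that d is the Euclidean distance in the chosen color space and that the centre of a segment is the mean of its pixel values; the paper does not state either choice explicitly, and the function name dispersion_sum is hypothetical.

```python
def dispersion_sum(pixels, labels):
    """Sum over segments of squared distances from pixels to the segment centre.

    pixels: list of equal-length tuples, e.g. (R, G, B) values.
    labels: segment index of each pixel, same length as pixels.
    Assumes Euclidean d and the mean as the segment centre.
    """
    groups = {}
    for p, s in zip(pixels, labels):
        groups.setdefault(s, []).append(p)

    total = 0.0
    for members in groups.values():
        n = len(members)
        # Per-channel mean of the segment, used as its centre.
        centre = [sum(channel) / n for channel in zip(*members)]
        for p in members:
            total += sum((v - c) ** 2 for v, c in zip(p, centre))
    return total


gray = [(0,), (2,), (10,), (14,)]
print(dispersion_sum(gray, [0, 0, 1, 1]))  # two compact segments
print(dispersion_sum(gray, [0, 0, 0, 0]))  # one merged segment
```

Under these assumptions, merging the two groups into a single segment increases the sum, which is consistent with reading a smaller value as a better segmentation.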
III. ILLUSTRATIVE EXAMPLES AND TESTING
The authors designed software based on EGBSA and PSA. It is
implemented in C# (.NET Framework). The input data are digital
images of the following types: JPEG, BMP, GIF, PNG. The output
data are segmented digital images in which different colors
mark different segments. Additional options are switching
between the RGB, Lab and BHS color models and a quality
assessment of segmentation by the intercluster dispersion
sum.
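Switching between color models amounts to converting each pixel before segmentation. The sketch below converts an 8-bit RGB pixel to HSV with Python's standard colorsys module. It assumes that the BHS model corresponds to HSV, as the abstract suggests; the helper name rgb8_to_hsv is hypothetical and is not part of the authors' software.

```python
import colorsys


def rgb8_to_hsv(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation, value).

    Saturation and value are returned in the range [0, 1].
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s, v


print(rgb8_to_hsv(255, 0, 0))      # pure red
print(rgb8_to_hsv(128, 128, 128))  # gray: zero saturation
```

Conversion to Lab requires a reference white point and is not covered by the standard library, so it is omitted here.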
The main segmentation parameters of EGBSA are sigma, a blur
parameter; k, an indicator of differences between segments;
and minSize, the size of the minimum segment. For this
algorithm, an image can be represented in the RGB and Lab color
models.
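The role of minSize can be illustrated by a post-processing pass of the kind commonly used in implementations of [4]: after the main merging loop, any component smaller than minSize is joined to a neighbor across the lightest available edge. The sketch below assumes this interpretation; the function name enforce_min_size and the input format are hypothetical, and the authors' software may handle minSize differently.

```python
def enforce_min_size(num_vertices, edges, labels, min_size):
    """Merge components smaller than min_size into neighbouring components.

    edges: list of (weight, i, j) tuples (assumed input format).
    labels: initial component label of each vertex.
    Returns one component label (root index) per vertex.
    """
    parent = list(range(num_vertices))
    size = [0] * num_vertices
    first_vertex = {}
    # Rebuild a union-find forest from the given labels.
    for v, lab in enumerate(labels):
        root = first_vertex.setdefault(lab, v)
        parent[v] = root
        size[root] += 1

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Visit edges from lightest to heaviest and absorb small components.
    for w, i, j in sorted(edges):
        a, b = find(i), find(j)
        if a != b and (size[a] < min_size or size[b] < min_size):
            parent[b] = a
            size[a] += size[b]
    return [find(v) for v in range(num_vertices)]


chain = [(1, 0, 1), (1, 1, 2), (10, 2, 3), (1, 3, 4), (5, 4, 5)]
start = [0, 0, 0, 1, 1, 2]
for m in (1, 2, 4):
    print(m, len(set(enforce_min_size(6, chain, start, m))))
```

In this toy case the number of components drops as min_size grows, in line with the effect of minSize on the number of segments.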
The authors analyze the effect of these parameters on the
result of segmentation. Consider the digital image shown in
Fig.1. Its size is 2000 × 1500 pixels.
Fig.1. Example of a digital image of aerial photography.
There are fields, buildings and a road with trees on its
sides. To a human observer, the fields on both sides of the
road look similar, which means they should belong to the same
segment.
Fig.2 shows the result of its segmentation by EGBSA with the
following parameters: sigma = 0.83, k = 1200 and minSize = 600.
The number of segments equals 28. It is easy to see that the
similar fields on the two sides of the road belong to different
segments. Buildings are segmented with accurate integration of
small elements.
Fig.2. Result of segmentation of the image in Fig.1 using EGBSA: 28
segments, sigma = 0.83, k = 1200 and minSize = 600 in RGB format.
Fig.3. EGBSA: dependency of the number of segments on the parameters
k and minSize with fixed sigma = 0.83 for the image in Fig.1 in RGB
format.
The change in the number of segments depending on the values
of k and minSize with sigma = 0.83 is presented in Fig.3. In
general, as k and minSize increase, the number of segments
decreases.
Consider the dependency between the two parameters k and
minSize and the quality assessment of segmentation (Fig.4). If
k and minSize increase, the quality assessment becomes worse
(the larger the value, the worse the segmentation).
Fig.4. EGBSA: dependency of the quality assessment on the parameters
k and minSize with fixed sigma = 0.83 (RGB format).
Combining the previous results leads to the following
conclusion: as the parameters increase, the number of segments
decreases, but the quality assessment becomes worse.
The authors also studied the Lab format of images. According
to the obtained results, the Lab format produced more effective
segmentation (see Fig.5 and Fig.6). For the considered ranges
of parameters and a fixed image, the Lab format gives both a
smaller number of segments and a lower value of the quality
assessment.
For PSA, the authors present the results of testing with four
large scales and four gradient directions, which means that 16
texture features are analyzed. The coefficients for the metrics
equal 0.5, and the weight coefficients of brightness, hue, and
saturation equal 0.33, 0.34 and 0.33, respectively.
Additionally, there are two thresholds, for the first and
second phases, respectively. Such a number of parameters makes
the process slow but flexible with respect to different types
of image: rich texture (buildings, forests) and poor texture
(fields, grass, forest, etc.). Note that such a division is
very subjective and depends on the problem to be solved
(detection, recognition, etc.).
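The paper does not give the formula in which the brightness, hue and saturation weights are used. As one plausible illustration only, the sketch below combines per-channel differences into a weighted color distance, treating hue as a circular quantity. The function name bhs_distance, the value ranges and the linear combination are all assumptions, not the authors' metric.

```python
def bhs_distance(p, q, weights=(0.33, 0.34, 0.33)):
    """Weighted difference between two (brightness, hue, saturation) triples.

    Brightness and saturation are assumed to lie in [0, 1], hue in degrees.
    The weights follow the order brightness, hue, saturation.
    """
    wb, wh, ws = weights
    db = abs(p[0] - q[0])
    dh = abs(p[1] - q[1]) % 360.0
    dh = min(dh, 360.0 - dh) / 180.0  # circular hue difference scaled to [0, 1]
    ds = abs(p[2] - q[2])
    return wb * db + wh * dh + ws * ds


# Hues of 350 and 10 degrees are only 20 degrees apart on the color circle.
print(bhs_distance((0.5, 350.0, 0.5), (0.5, 10.0, 0.5)))
```

With nearly equal weights, as in the tested configuration, no single channel dominates the comparison.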
Fig.5. EGBSA: dependency of the number of segments on the parameters
k and minSize with fixed sigma = 0.83 for the image in Fig.1 in Lab
format.
Fig.7 shows the result of segmenting the image in Fig.1 using
PSA. With this method, the fields on both sides of the road
belong to the same segment. Note that no choice of EGBSA
parameters achieves this result.
Fig.6. EGBSA: dependency of the quality assessment on the parameters
k and minSize with fixed sigma = 0.83 in Lab format.
Consider an image with rich texture: fields, buildings,
roads and forest (Fig.8).
Recall that the primary segmentation stage depends on the
first threshold: the smaller it is, the more segments there
are. A large number of segments affects the second stage of
segmentation, see Fig.9.
Fig.7. Result of segmentation of the image in Fig.1 using PSA: the
primary stage has 14360 segments and the secondary stage has 14, in
BHS format.
Let us analyze how the result of the second stage and its
quality assessment depend on the value of the second threshold
with the first threshold fixed. The diagrams in Fig.10 and
Fig.11 share the same x-coordinate and illustrate qualitatively
different processes. As the value of the second threshold
increases, the number of segments decreases and the quality
assessment becomes worse.
For images with rich texture, the authors recommend the
following ranges of the first and second thresholds: 2.1-2.3
and 5-7, respectively. For images with poor texture, the
recommended ranges are 1.5-1.7 and 3-5, respectively.
For both EGBSA and PSA, a compromise between a “reasonable”
number of segments and a “fine” quality assessment of
segmentation has to be found.
A sample of fifty representative images (fields, buildings,
roads, forest, etc. and their combinations) was compiled. Using
the software, the dependency of the number of segments on the
parameters k and minSize with fixed sigma = 0.83 was
generalized (Fig.12).
All results were obtained on a computer with the following
characteristics: Intel(R) Core(TM) i5-8600K CPU 3.60 GHz and
16 GB of memory. For EGBSA, the processing time of one image
(2000 × 1500 pixels) equals 30 s. For PSA, the processing time
of one image (2000 × 1500 pixels) equals 300 s. There are
several ways to speed up these processes; one of them is
parallel computing.
Fig.8. Example of an image with rich texture (fields, buildings,
roads, and forest).
Fig.9. Primary segmentation of the image in Fig.8 with the first
threshold equal to 2.1: 20658 segments.
Fig.10. PSA: dependency of the number of segments on the value of the
secondary threshold with a fixed primary threshold for the image in
Fig.8 in the BHS model.
Fig.11. PSA: dependency of the quality assessment on the value of the
secondary threshold with a fixed primary threshold for the image in
Fig.8 in the BHS model.
Fig.12. EGBSA: dependency of the number of segments on the parameters
k and minSize with fixed sigma = 0.83 for a sample of 50 images.
IV. CONCLUSIONS
Two algorithms, EGBSA and PSA, were tested on real data
obtained by aerial photography. The first algorithm is faster,
but its segmentation results are less sensible. For example,
two blue lakes which are far from each other in an image
belong to different segments, whereas for PSA such lakes
belong to the same segment. The main disadvantage of PSA is
its long processing time.
The authors recommend the Lab and BHS formats of images,
since their segmentation results are more effective than, for
example, in RGB.
In further research, digital image segmentation based on
EGBSA and PSA will be adapted to video files with real-time
processing. Additional research is needed to decide what type
of process to use (parallel work of the two algorithms or a
hybrid based on them). The results of this work are used as
part of scientific research devoted to the recognition of
objects in aerial photography.
Acknowledgment
The authors thank Pylyp Prystavka, professor of the Applied
Mathematics Department of NAU, for useful ideas and
discussions.
References
[1] R.-Q. Meng, S.-G. Cui, Y.-L. Zang, X.-L. Wu, L. He, “Segmentation
of disease image of lettuce leaves based on machine vision,” 2018
Chinese Control and Decision Conference, pp. 6590-6594, July 2018.
[2] Z. Chen, B. Xu, B. Gao, “An image-segmentation-based urban DTM
generation method using airborne lidar data,” IEEE Journal of
Selected Topics in Applied Earth Observations and Remote Sensing,
vol. 9 (1), pp. 496-506, January 2016.
[3] L. Liu, G. Feng, D. Beautemps, “Automatic temporal segmentation of
hand movements for hand positions recognition in French cued
speech,” 2018 IEEE International Conference on Acoustics, Speech
and Signal Processing, pp. 3061-3065, 2018.
[4] P.F. Felzenszwalb and D.P. Huttenlocher, “Efficient graph-based
image segmentation,” International Journal of Computer Vision,
vol. 59 (2), pp. 167-181, September 2004.
[5] R. Marfil, L. Molina-Tanco, A. Bandera, J.A. Rodríguez and F.
Sandoval, “Pyramid segmentation algorithms revisited,” Pattern
Recognition, vol. 39 (8), pp. 1430-1451, August 2006.
[6] L. Ichim, D. Popescu, “Road detection and segmentation from aerial
images using a CNN based system,” 41st International Conference on
Telecommunications and Signal Processing (TSP), pp. 1-5, July 2018.
[7] X. Huang, H. Bai, S. Li, “Automatic aerial image segmentation using
a modified Chan-Vese algorithm,” 9th IEEE Conference on Industrial
Electronics and Applications, pp. 1091-1094, June 2014.
[8] M. Ghiasi, R. Amirfattahi, “Fast semantic segmentation of aerial
images based on color and texture,” 8th Iranian Conference on
Machine Vision and Image Processing (MVIP), pp. 324-327, September
2013.
[9] X. Liu, L. Hou, X. Ju, “A method for detecting power lines in UAV
aerial images,” 3rd IEEE International Conference on Computer and
Communications (ICCC), pp. 2132-2136, December 2017.