
A novel band selection architecture to propose a built-up index for hyperspectral sensor PRISMA


Abstract

Processing of hyperspectral remote sensing datasets poses challenges in terms of computational expense pertaining to data redundancy. As such, band selection becomes indispensable to address redundancy while preserving the optimal spectral information. This paper proposes a novel architecture using Genetic Algorithm (GA) optimizing technique with Random Forest (RF) classifier for efficient band selection with the Hyperspectral Precursor of the Application Mission (PRISMA) dataset. The optimal bands are BLUE (λ=492.69 nm), NIR (λ=959.52 nm), and SWIR 1 (λ=1626.78 nm). This paper also involves an application of the selected bands to accurately identify and quantify built-up pixels by means of a new spectral index named Hyperspectral Imagery-based Built-up Index (HIBI). The proposed index was used to map built-up pixels in six cities around the world namely Jaipur, Varanasi, Delhi, Tokyo, Moscow and Jakarta to establish its robustness. This analysis shows that the proposed index has an accuracy of 94.02%, higher than all the other indices considered for this study. Moreover, the spectral separability analysis also establishes the efficiency of the proposed index to differentiate built-up pixels from spectrally similar land use or land cover classes.
Earth Science Informatics
Land covers in urban areas tend to change more drastically over a short period than elsewhere because of incessant urbanization. From 1960 to 2018, the share of the global urban population increased from 33.61% to 55.27% (Zha et al. 2003; Mukherjee et al. 2020). India’s urban population share increased from 17.97% in 1961 to 31.16% in 2011 and is expected to reach 40% by 2030 (Kaur and Luthra 2018). Studies related to urbanization have lately been pursued with renewed enthusiasm by urban planners, economists, and researchers. Urban expansion has been quantified using economic, demographic, and geographical approaches. Quantification from an economic and demographic perspective measures the change in the ratio of urban to total population and the contribution of urban areas to gross domestic product (GDP). In comparison, the geographical aspect considers the estimation of the spatial extent of built-up infrastructure, including impervious surfaces such as concrete buildings and roads (Richards and Richards 1999). Encroachment of natural lands will transpire due to the expansion of urbanized areas. The transformation of these lands into impermeable built-up area will devastate that area’s hydrologic system, ecosystem, and biodiversity (Xu 2007). It also changes the wind circulation, albedo effect, and surface temperature of the surrounding cities (Mallick et al. 2008). The measurement of built-up area is significant because these land types can indicate environmental quality and urban development (As-syakur et al. 2012). The main problem encountered while measuring or mapping urban areas is the assessment of change in land usage from non-residential to residential (Xu 2008). The estimation of the built-up extent has conventionally been performed by measuring the spatial spread of the built-up footprint from ground surveying information (Mukherjee et al. 2020). Satellite imagery datasets have also been used for calculating the built-up extent, and these images are advantageous due to their historical availability and large-scale spatial coverage (Guindon et al. 2004).
Communicated by H. Babaie
Rajarshi Bhattacharjee
1 Department of Civil Engineering, IIT-BHU, Varanasi, India
Keywords Remote sensing · HIBI · Urban sprawl · Spectral index · Genetic algorithm
Received: 15 November 2022 / Accepted: 21 January 2023
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2023
ShishirGaur1· NilenduDas1· RajarshiBhattacharjee1· AnuragOhri1· DebanirmalyaPatra1
For the calculation of remote sensing-based indices, the scientific community has primarily used LANDSAT series satellite imagery due to its easy availability. The LANDSAT program began in 1972, and since its inception, these satellites have acquired multispectral images that measure the Earth’s reflected solar energy in the visible and non-visible ranges of the electromagnetic spectrum (Mukherjee et al. 2020).
The calculation of built-up spread can be performed using several remotely sensed datasets or different satellite imageries and spectral values based on the category of land use (Xu, 2008). These calculations and estimations can be analyzed with the help of several classification algorithms (Poyil and Misra, 2015; Rawat and Kumar, 2015). Among the numerous classification techniques, the index-based thresholding method has been frequently applied by researchers because of its computational efficiency and ease of implementation (Zha et al. 2003; He et al. 2010). The urban land-use class of the Pearl River Delta of China has been classified by Chen et al. (2006) using multiple remote sensing-based indices with high accuracy. For mapping bare land and built-up land in urban areas, several indices have been used in various studies, such as the Urban Index (UI) (Kawamura et al. 1996), Normalised Difference Built-up Index (NDBI) (Zha et al. 2003), Normalised Difference Bareness Index (NDBaI) (Zhao and Chen, 2005), Index-based Built-Up Index (IBI) (Xu, 2008), Enhanced Built-Up and Bareness Index (EBBI) (As-syakur et al. 2012), Dry Built-up Index (DBI) (Rasul et al. 2018), and Powered B1 Built-up Index (PB1BI) (Mukherjee et al. 2020). Indices like UI, NDBI, and EBBI are designed for rapid mapping of bare land or built-up areas. Nevertheless, these indices cannot reliably separate the built-up class from the bare land class (He et al. 2010; Ukhnaa et al. 2019). Some researchers attribute this inability to the difficulty of separating the spectral response patterns of built-up areas, vegetation, and bare land, predominantly in terms of the pixel groupings in areas with heterogeneous objects (He et al. 2010). All these built-up indices have been derived using LANDSAT imagery. The LANDSAT-series satellites carry multispectral sensors having only a few bands (Loveland and Irons, 2016). Compared with multispectral data, hyperspectral data provide more substantial information. The greater level of spectral detail provides better prospects for analyzing the Land use/Land cover (LU/LC) pattern (Jarocińska et al. 2022). In this work, the band combination best suited for built-up delineation has been obtained using a genetic algorithm (GA) based optimization technique with random forest (RF) as the classifier (Nagasubramanian et al., 2018).
This study uses hyperspectral data to derive a new index for built-up area delineation. The PRISMA sensor dataset has been used in this analysis. This sensor was built by the Italian Space Agency and launched in 2019. This hyperspectral sensor has approximately 250 bands in the spectral range of 400–2500 nm. PRISMA images have a spatial resolution of 30 m, similar to that of LANDSAT imagery (Vangi et al. 2021). The index has been named the ‘Hyperspectral Imagery-based Built-up Index (HIBI)’; it can properly delineate built-up features and distinguish between built-up and non-built-up features. The novelty of this work is that, for the first time, the GA-based RF classifier has been applied to PRISMA datasets. The optimal bands have also been identified, which can be used by the research community for urban pattern identification in future.
This index has been tested by creating classified maps of six cities around the world. The HIBI mapping outcomes have been compared to the results of several other existing built-up indices. The proposed index has also been compared with a machine learning (ML) classifier and a deep learning (DL) classifier.
Materials and methods
Study area description
For analysing the classification performance of the proposed index HIBI, three cities have been chosen from India, and three other cities have been chosen from outside India. The cities selected from India are Delhi, Jaipur, and Varanasi. These cities have a high population density. Delhi is
the national capital of India. It is the second most populous
city in India. The city of Jaipur is the state capital of Raj-
asthan and the most populous city of the state. It is also the
tenth most populous city in India (CensusInfo India 2011).
Varanasi is known as the spiritual capital of India and one of
the most famous cities in the world (Garg et al. 2020). The
cities that have been selected outside India are Tokyo (the
capital city of Japan), Moscow (the capital city of Russia),
and Jakarta (the capital city of Indonesia). The city of Tokyo
is known as the most populous city in the world. Moscow is
the second most populous city in Europe. Jakarta also fea-
tures in the top hundred most populous cities of the world
(UN, DESA, PD 2014; Hui et al. 2017). The location map
of the study site has been drawn in Fig. 1.
Datasets used
The remote sensing datasets comprise PRISMA satellite
images. All the datasets used in this analysis have been
Level-2D products. These datasets have been atmospheri-
cally corrected and geocoded (Vangi et al. 2021). Except
for the proposed index, all the other indices have been
estimated from LANDSAT-8 imageries. The PRISMA and
LANDSAT-8 images are from April 2021.
Methodology adopted for the study
The built-up indices UI, NDBI, IBI, EBBI, DBI, and PB1BI have been computed for the preprocessed LANDSAT-8 scene. An additional built-up index has been computed using multispectral data (LANDSAT-8) with band placement similar to that of the hyperspectral PRISMA data. This index is termed the ‘Multispectral Imagery-based Built-up Index (MIBI)’ in the manuscript. The output images (represented in the form of maps) have been generated for each built-up index by applying the optimum threshold. The proposed built-up index (HIBI) is then estimated using the PRISMA dataset, and a comparison is drawn with all the other built-up indices. Finally, HIBI is compared with a machine learning (ML) classifier, the Support Vector Machine (SVM), and a deep learning (DL) classifier, the Convolutional Neural Network (CNN).
Development of built-up index HIBI and its comparison
with ML and DL techniques
The spectral curve has been made for LU/LC features using the PRISMA dataset for the six selected cities. Seven PRISMA bands situated in the EM spectrum zone between 400 and 450 nm have not been considered for this analysis, as these bands lie in the aerosol domain. The median reflectance value of each feature class has been shown in the spectral graph. The spectral pattern of PRISMA bands of the study regions is represented in Fig. 2. Another major challenge while working with the hyperspectral dataset has been the selection of the best waveband combination. In this scenario, the major task has been to select the best possible band combination for built-up area delineation. To address this challenge, a genetic algorithm (GA) based optimization technique using random forest (RF) as a classifier has been implemented. The number of decision trees used is 100, bootstrap has been set to true to increase the computational efficiency, and the random state has been set to 2 to avoid different results across different executions. GA is a population-based stochastic search optimization method influenced by natural genetics and natural selection principles. Wavebands are represented by long bit strings known as chromosomes. A score is assigned to each of these chromosomes using a fitness function (Goldberg 2001). In this case, the fitness function evaluates how well these chromosomes (combinations of bands) discriminate between built-up and non-built-up regions. These chromosomes evolve over consecutive generations using mutation, selection, and crossover genetic operators to explore the solution space until the best solution has been achieved or the end criteria have been met. Chromosomes for reproduction can be selected in many ways. One of the ways is to pick chromosome pairs with better fitness scores from the whole population to execute crossover. The genetic details of the two chromosomes can be randomly combined using the crossover operator. The mutation operator modifies some part of a chromosome, which prevents the GA from settling on locally optimal solutions (Nagasubramanian et al. 2018). It is important to select an appropriate fitness function carefully. In this analysis, the F1 score of the
Fig. 1 Geolocation map of the
study regions
first term in both the numerator and denominator of the proposed index. The urban pixels have a low reflectance value compared to barren land in the NIR band. It may be noted that healthy vegetation reaches its reflectance peak in the NIR region, which makes this band an automatic inclusion in the index to differentiate built-up from vegetation. However, the spectral response of dry vegetation and bare land peaks in the SWIR1 band, leading to a spectral footprint similar to that of construction materials. Thus, these three bands have been used in the present study to considerably improve the extraction of built-up pixels by increasing the contrast between bare land and built-up areas. The built-up class is also separable from the water bodies (inland as well as sea) in these regions. In the GREEN, RED, and SWIR2 regions, the spectral curve of the built-up class overlaps at several places with some of the other classes. All the observations mentioned above have been taken into account. A generic built-up delineation index has been proposed to recognize the built-up pixels on the basis of a suitable threshold. The formula of the built-up index consisting of the BLUE, NIR, and SWIR1 bands is:
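$$\mathrm{HIBI} = \frac{\mathrm{BLUE}_{(\lambda = 492.69\,\mathrm{nm})} - \mathrm{NIR}_{(\lambda = 959.52\,\mathrm{nm})} - \mathrm{SWIR1}_{(\lambda = 1626.78\,\mathrm{nm})}}{\mathrm{BLUE}_{(\lambda = 492.69\,\mathrm{nm})} + \mathrm{NIR}_{(\lambda = 959.52\,\mathrm{nm})} + \mathrm{SWIR1}_{(\lambda = 1626.78\,\mathrm{nm})}}$$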
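As an illustration, the index and the threshold test can be sketched for a single pixel as below. This is a minimal sketch, not the authors' implementation: the subtraction signs in the numerator are an assumption (the text only states that BLUE is the first term of both numerator and denominator), and the function names and the zero-denominator convention are hypothetical.

```python
def hibi(blue, nir, swir1):
    """Three-band normalised index for one pixel.

    ASSUMPTION: the numerator is BLUE - NIR - SWIR1; only the three
    bands and the summed denominator are taken from the text.
    """
    denominator = blue + nir + swir1
    if denominator == 0:
        # Hypothetical convention for no-data pixels.
        return 0.0
    return (blue - nir - swir1) / denominator


def is_built_up(index_value, lower, upper):
    """A pixel is labelled built-up when its index lies inside [L, U]."""
    return lower <= index_value <= upper
```

For non-negative reflectances this form stays within [-1, 1], and pixels that are bright in BLUE relative to NIR and SWIR1 score higher.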
In this analysis, the GA has indicated which specific bands in the given regions need to be used to get the best result. Since the PRISMA image is a hyperspectral dataset, there are multiple bands in each region. For example, this dataset contains 10–12 bands in the BLUE region of the EM spectrum. So unless an optimization algorithm is implemented, it is impractical to select the best band out of the given 10–12 bands. A similar scenario arises in other regions of the EM spectrum with the PRISMA data. In the NIR region, this dataset contains around 60–65 bands. The SWIR 1 and SWIR 2 regions also include a large
built-up class has been chosen to assess the performance of the classifier. F1 is defined as the harmonic mean of the recall and precision values (Powers 2020). A good F1 score is indicative of good classification performance. The F1 score ranges from 0 to 1 (Fourure et al. 2021). F1 has been calculated using these formulas (Nagasubramanian et al. 2018):
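$$\mathrm{Precision} = \frac{\mathrm{TruePositive}}{\mathrm{TruePositive} + \mathrm{FalsePositive}}$$
$$\mathrm{Recall} = \frac{\mathrm{TruePositive}}{\mathrm{TruePositive} + \mathrm{FalseNegative}}$$
$$\mathrm{F1} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$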
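The three quantities can be computed directly from confusion counts. The sketch below is illustrative only; the function name and the choice to return 0.0 when a denominator is zero are assumptions, not part of the study.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) for the built-up class.

    tp, fp, fn are counts of true positives, false positives and
    false negatives. Empty denominators yield 0.0 (an assumed convention).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

Because F1 is a harmonic mean, it is pulled towards the smaller of precision and recall, which is why it suits a fitness function that must penalise both missed and spurious built-up pixels.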
10-fold cross-validation has been conducted to evaluate the fitness of the classifier. The GA gives the best possible band combination. In this case, it selected Band 13 (λ = 492.69 nm) from the BLUE region, Band 22 (λ = 562.73 nm) from the GREEN region, Band 34 (λ = 669.81 nm) from the RED region, Band 66 (λ = 959.52 nm) from the NIR region, Band 129 (λ = 1626.78 nm) from the SWIR1 region, and Band 197 (λ = 2229.75 nm) from the SWIR2 region. The flowchart of the GA-RF architecture for the selection of the best waveband combination is shown in Fig. 3.
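The selection, crossover, and mutation loop described above can be sketched with a pluggable fitness function. This is a simplified illustration under stated assumptions, not the authors' GA-RF code: each chromosome holds one band index per spectral region, selection keeps the fitter half with one elite copy, and the toy fitness below merely stands in for the 10-fold cross-validated F1 score of an RF classifier (100 trees, random state 2) used in the study. All names, band ranges, and GA settings are hypothetical.

```python
import random


def run_ga(regions, fitness, pop_size=20, generations=30,
           mutation_rate=0.1, seed=2):
    """Search for one band per spectral region that maximises `fitness`.

    regions: list of lists of candidate band indices (at least two regions).
    fitness: callable mapping a chromosome (list of bands) to a score.
    """
    rng = random.Random(seed)
    population = [[rng.choice(bands) for bands in regions]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]      # selection: keep the fitter half
        children = [ranked[0][:]]              # elitism: carry the best forward
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, len(regions))   # single-point crossover
            child = mother[:cut] + father[cut:]
            for i, bands in enumerate(regions):    # mutation within the region
                if rng.random() < mutation_rate:
                    child[i] = rng.choice(bands)
            children.append(child)
        population = children
    return max(population, key=fitness)


# Toy stand-in: three hypothetical regions and a made-up target combination.
toy_regions = [list(range(0, 10)), list(range(10, 20)), list(range(20, 30))]
toy_target = [3, 15, 27]


def toy_fitness(chromosome):
    # Higher is better; 0 means the chromosome equals the toy target.
    return -sum(abs(g - t) for g, t in zip(chromosome, toy_target))


best = run_ga(toy_regions, toy_fitness)
```

Constraining each gene to its own region mirrors the outcome reported in the text, where one band was chosen from each of the six EM regions.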
The GA architecture has given a combination of six bands from six different EM regions for built-up delineation. However, not all six bands can be used for built-up estimation. Careful analysis of the LU/LC spectral curve shows that the BLUE region depicts a high reflectance value for the built-up class compared to other classes. On the basis of this observation, the BLUE band has been kept as the
Fig. 2 Spectral profile for seven LU/LC classes
number of bands. The GA-RF architecture has been implemented using Google Colab (a free, cloud-based Python platform).
The thresholding technique can be helpful for assigning an upper cutoff (U) and a lower cutoff (L) for a single pixel (Mukherjee et al. 2020). The pixels inside the range [L, U] have been delineated as built-up pixels, and the other pixels in the image have been marked as non-built-up pixels. For accurate classification using any index, proper estimation of the U and L bounds must be done using statistical techniques
Fig. 3 GA-RF architecture for
optimal bands combination
(combining all the cities) and 385,235 non-built-up pixels (combining all the cities). A 7 × 7 window with a 7 × 7 stride has been used to generate a training image chipset for the CNN model. The large difference between the numbers of training and testing pixels arises because training data have been used only for the SVM and CNN methods, whereas testing data have been associated with every index used in this study. The stratified random sampling technique has been used for creating training and testing data. The testing dataset gives a reference for comparison of the delineated built-up results generated from the indices as well as from the SVM and CNN classifiers. In the SVM technique, the kernel function has been set to the Radial Basis Function (rbf), and the decision function shape has been set to one-vs-one (‘ovo’). The CNN classifier has been trained for 50 epochs with an initial learning rate of 0.001, a dropout value of 0.25, the ReLU activation function, and a batch size of 128. The Adam optimization function has been used. The values of the additional hyperparameters Beta1, Beta2, and epsilon have been 0.001, 0.009, and 1e-08, respectively. The binary map of the built-up area after LU/LC classification has been used for the generation of training and testing data. The datasets contain samples from both built-up and non-built-up pixels. The same set of training pixels has been used for threshold interval window generation for all the built-up delineation indices and for training the SVM and CNN classifiers. Similarly, the same set of testing pixels has been used to evaluate and compare the classification performance of the different techniques. Several parameters have been chosen for the accuracy measurements: Sensitivity, Specificity, Positive Prediction Value (PPV), Negative Prediction Value (NPV), Total accuracy, and Cohen’s Kappa (κ) (Mukherjee et al. 2020). These parameters are defined in Table 1.
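The 7 × 7 window with a 7 × 7 stride mentioned above yields non-overlapping image chips. A minimal sketch of such chipping on a single-band image stored as nested lists is given below; the function name is hypothetical, and dropping incomplete chips at the right and bottom edges is an assumption, since the text does not say how edges were handled.

```python
def extract_chips(image, window=7, stride=7):
    """Cut a 2D image (list of equal-length rows) into window x window chips.

    With stride == window the chips do not overlap. Partial chips at the
    image edges are discarded (assumed behaviour).
    """
    n_rows, n_cols = len(image), len(image[0])
    chips = []
    for r in range(0, n_rows - window + 1, stride):
        for c in range(0, n_cols - window + 1, stride):
            chips.append([row[c:c + window] for row in image[r:r + window]])
    return chips
```

For example, a 14 × 21 image produces 2 × 3 = 6 chips, while a 15 × 15 image produces 4 chips and ignores the last row and column.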
Performance of the built-up index HIBI
This work has analysed the mapping of built-up and non-built-up regions in the considered study regions using the HIBI transformation. The multispectral bands of the LANDSAT-8 satellite image have been used for computing all the other considered indices except HIBI. HIBI has been compared with the other indices (UI, NDBI, PB1BI, EBBI, DBI, MIBI, and IBI). The HIBI result has also been compared with the SVM and CNN classifiers to estimate its effectiveness and accuracy. Additionally, the built-up and non-built-up coverage area of the study region has been tabulated.
from the sample training set. This estimation is non-trivial, as the built-up sample pixels do not follow any parametric distribution such as the Gaussian distribution. The bootstrapping thresholding technique has been applied in this study to overcome this difficulty. The classification performance of the proposed index has been compared with other supervised classification algorithms along with ML and DL classifiers.
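One non-parametric way to estimate the L and U bounds is a percentile bootstrap over the index values of the built-up training pixels. The sketch below is an assumed variant for illustration; the study does not spell out its bootstrap procedure, and the resample count, percentile levels, and function names are all hypothetical.

```python
import random


def percentile(sorted_values, pct):
    """Linearly interpolated percentile of an ascending list (pct in 0..100)."""
    k = (len(sorted_values) - 1) * pct / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (k - lo)


def bootstrap_threshold(values, n_boot=1000, lower_pct=2.5,
                        upper_pct=97.5, seed=0):
    """Estimate (L, U) by averaging percentile bounds over bootstrap resamples.

    values: index values of built-up training pixels.
    ASSUMPTION: percentile levels and averaging are illustrative choices.
    """
    rng = random.Random(seed)
    n = len(values)
    lows, highs = [], []
    for _ in range(n_boot):
        # Resample with replacement, same size as the original sample.
        sample = sorted(rng.choice(values) for _ in range(n))
        lows.append(percentile(sample, lower_pct))
        highs.append(percentile(sample, upper_pct))
    return sum(lows) / n_boot, sum(highs) / n_boot
```

No distributional form is assumed here, which is the property the text asks for when the built-up samples are not Gaussian.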
Spectral separability measurement
One of the commonly used measures of spectral separability is the Jeffries-Matusita (JM) distance, which has been used in this study. This measure is reliable for spectral separability because it behaves like the probability of correct classification (Padma and Sanjeevi, 2014). The probability densities of the spectral vectors S1 and S2 for the bands (l = 1, 2, ..., L) are denoted as p_l and q_l, and the JM distance has been calculated following Ghiyamat et al. (2013). The JM distance ranges from 0 to 2, where 2 indicates the maximum separability (Rao et al. 2014). The LANDSAT-8 image has been chosen as the base image, and on that image, 50 pure pixels have been chosen for each of the built-up, cropland, vegetation, bare soil, sandbar, and waterbody classes. Then, these groups of pure pixels have been placed on the corresponding classified image generated from each index. Since the geo-coded location of the points is the same, the points lie at the same location in all the index-based classified images. The spectral distance between built-up and all the other classes has been calculated for all index-based classified images. This procedure has been carried out in ERDAS Imagine software. The entire process has been reproduced for the HIBI index by taking the PRISMA image as the base image.
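For illustration, one common discrete formulation of the JM distance sums the squared differences of the square roots of two normalised densities, which lies in the stated range of 0 to 2. Whether the study used exactly this form is an assumption; the sketch below is not the ERDAS Imagine routine, and the function name is hypothetical.

```python
import math


def jm_distance(p, q):
    """Squared-form JM distance between two discrete densities.

    p, q: non-negative weights over the same bands l = 1..L, each with a
    positive sum. Both are normalised to sum to 1 before comparison.
    ASSUMPTION: this squared form is chosen because it ranges from 0 to 2.
    """
    total_p, total_q = sum(p), sum(q)
    return sum(
        (math.sqrt(a / total_p) - math.sqrt(b / total_q)) ** 2
        for a, b in zip(p, q)
    )
```

Identical densities give 0, and densities with no overlap give 2, the value the text associates with maximum separability.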
Creation of training and testing data
The training datasets have been used to generate the threshold interval window of the built-up delineation indices and to train the Support Vector Machine (SVM) and CNN models. The training and testing datasets have been prepared from the LANDSAT-8 and PRISMA images of the study regions. Both training and testing data have been divided into built-up and non-built-up pixels. The training set comprises 250 built-up pixels (combining all the cities) and 1000 non-built-up pixels (combining all the cities). The test set consists of a total of 64,535 built-up pixels
above-mentioned cities. In Table 4, the accuracy parameters have been computed for all the considered indices along with the accuracies of the SVM and CNN classifiers. For all the accuracy parameters, the average value has been reported because each index, along with the ML and DL algorithms, has been applied to the six considered cities.
The built-up and non-built-up regions have been estimated using HIBI for the six considered cities. The results have been tabulated in Table 5. The GA has been implemented in this analysis to decide which individual bands need to be selected from each of the respective EM regions. The F1 score for this algorithm has been estimated as 0.92.
Qualitative assessment of the built-up indices
A visual comparison among all the indices along with SVM and CNN, in the form of classified images for one of the cities (Jaipur), is shown in Fig. 5. From the maps, it can be seen that all the indices except HIBI overestimate the built-up regions. From the spatial distribution pattern, it can be inferred that UI shows the maximum overestimation, followed by PB1BI and EBBI. Figure 6 represents the Standard false
Result of the spectral separability
The spectral separation between built-up and the non-built-up classes has been tabulated in Table 2. The HIBI index has the best spectral separability values between built-up and all the non-built-up classes.
Threshold window for built-up indices
The bootstrap thresholding has been applied, and the
threshold window, along with the range diagrams, has been
depicted in Fig. 4.
The HIBI index has the highest built-up range percentage among all the indices. This shows that the HIBI index is the most robust and dynamic index of all and can classify the built-up and non-built-up regions more effectively. The false positive (FP) value will be lower for HIBI than for the other considered built-up indices. The built-up threshold range percentage has been tabulated in Table 3. The percentage reported is the average over the six considered cities.
Quantitative accuracy assessment of the built-up indices
For each index, the accuracy estimation has been performed
by considering the testing pixels as a reference for the six
Table 1 Accuracy measures (parameter: definition; expression)
Sensitivity: how often the predicted pixel is built-up when the reference testing pixel is also built-up; TP / (TP + FN)
Specificity: how often the predicted pixel is non-built-up when the reference testing pixel is also non-built-up; TN / (TN + FP)
Positive Prediction Value (PPV): how often a predicted built-up pixel is a built-up pixel; TP / (TP + FP)
Negative Prediction Value (NPV): how often a predicted non-built-up pixel is a non-built-up pixel; TN / (TN + FN)
Total accuracy: overall accuracy of the classifier; (TP + TN) / (TP + TN + FP + FN)
Cohen’s Kappa (κ): agreement between classifier and ground truth (testing pixels); (Total accuracy - Exp. Acc.) / (1 - Exp. Acc.)
TP (True Positive), TN (True Negative), FP (False Positive), FN (False Negative), Exp. Acc. (Expected Accuracy)
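The accuracy measures listed in Table 1 follow directly from the four confusion counts. The sketch below uses the standard textbook expressions, with expected accuracy computed from the marginal totals; the function name and dictionary keys are hypothetical, and the code assumes both classes occur in the reference and predicted labels so that no denominator is zero.

```python
def accuracy_measures(tp, tn, fp, fn):
    """Compute the Table 1 measures from confusion counts.

    ASSUMPTION: every denominator is non-zero (both classes are present).
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Chance agreement from the row and column totals of the confusion matrix.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": accuracy,
        "kappa": (accuracy - expected) / (1 - expected),
    }
```

Kappa discounts the agreement that a random labelling with the same class proportions would reach, so it is always at or below the total accuracy when the classifier beats chance.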
Table 2 JM statistics measuring separability between Built-up and Non-Built-up pixels (JM distance for the separability)
Indices  Cropland-Built-up  Bare soil-Built-up  Inland waterbody-Built-up  Sandbar-Built-up  Vegetation-Built-up  Sea water body-Built-up
NDBI     0.71               0.81                1.42                       0.79              1.29                 1.43
UI       0.68               0.81                1.47                       0.61              1.15                 1.47
PB1BI    1.68               1.31                1.65                       1.09              1.53                 1.61
EBBI     0.73               0.83                1.52                       0.81              1.41                 1.49
DBI      1.04               0.89                1.56                       0.91              1.45                 1.51
IBI      0.69               0.86                1.49                       0.59              1.12                 1.48
MIBI     1.77               1.61                1.82                       1.48              1.69                 1.72
HIBI     1.92               1.85                1.95                       1.91              1.96                 1.93
The urban indices based on remote sensing technology have generally been used to differentiate between bare soil and built-up regions. These indices exhibit a low level
colour composite (SFCC) and true colour composite (TCC) image of the study stretch prepared from the PRISMA image.
Fig. 4 Threshold window of built-up indices using bootstrap threshold (a) NDBI (b) UI (c) PB1BI (d) EBBI (e) DBI (f) IBI (g) HIBI (h) MIBI
multispectral data, the associated accuracy of MIBI is lower than that of HIBI. The narrower bandwidth of hyperspectral data provides more accuracy compared to the wider bandwidth of multispectral datasets, so built-up delineation using hyperspectral data is more precise. The accuracy parameters have been tabulated for both HIBI and MIBI in Table 4. This index can also successfully distinguish between built-up and sandbar. Sandbar pixels get mixed with urban pixels, which can lead to a wrong classification result. Sandbar formation is a common phenomenon for a river like the Ganga (Jain and Singh 2020). This river is the lifeline of Varanasi city, and one side of the city is situated along the river bank. So for proper delineation of the city stretch, the sandbar needs to be categorized as a non-built-up class, and earlier indices have not been very capable of doing this. The HIBI index can be beneficial for delineating cities that lie near river banks with sandbars. Coastal cities with shorelines can also be appropriately demarcated by this index. In cities where tree canopy covers the buildings, this pixel spectra-based index can produce an error-prone result because of the heterogeneous landscape (As-syakur et al. 2012). If the spatial resolution of the satellite imagery is enhanced, it can retrieve better information in heterogeneous urban regions by capturing small-scale objects (Tran et al. 2011). The performance of HIBI has been reasonably accurate when compared with the SVM and CNN classifiers. Executing the SVM and especially the CNN algorithm is challenging, as they require substantial knowledge and skill in computer programming. Moreover, considerably high-end systems are required for the execution of such algorithms. Selection of perfectly homogeneous training samples or datasets also plays a key role in the accuracy of the SVM and CNN classifiers. The execution of HIBI is computationally inexpensive and much easier compared to ML and DL algorithms. This index can be executed easily in any open-source GIS software like QGIS. However, the performance of HIBI
of accuracy because these land-use categories possess a high degree of homogeneity. However, the application of HIBI has been found to be very effective in discriminating between bare soil and built-up areas, which has been a significant limitation of several pre-existing indices. Bootstrapping thresholding has been applied to determine the index range. HIBI has been created using hyperspectral data, whereas the other indices have been generated using multispectral data. This analysis also shows that hyperspectral images are better for pixel-based classification than multispectral images. From the HIBI images of Jaipur city, it can be clearly seen that this index can properly delineate and discriminate between built-up and bare soil classes. MIBI has also been estimated with band placement similar to that of HIBI, but since it has been using
Table 3 Built-up range percentage out of the total range window for the considered indices
Indices  Built-up range (in %)
NDBI     7.24
UI       5.04
PB1BI    8.12
EBBI     7.34
DBI      6.08
IBI      6.92
MIBI     22.07
HIBI     30.18
Table 4 Measurement of the accuracy parameters for the testing pixels
Indices  Sensitivity  Specificity  PPV    NPV    Accuracy (%)  Kappa (κ)
NDBI     0.862        0.768        0.787  0.887  75.33         0.717
UI       0.837        0.579        0.588  0.869  64.91         0.617
PB1BI    0.901        0.778        0.751  0.908  85.55         0.819
EBBI     0.801        0.582        0.646  0.889  73.29         0.692
DBI      0.485        0.576        0.571  0.502  46.83         0.438
IBI      0.771        0.628        0.651  0.798  66.94         0.638
MIBI     0.913        0.876        0.889  0.904  91.98         0.906
HIBI     0.922        0.937        0.914  0.952  94.02         0.919
ML technique
SVM      0.941        0.965        0.964  0.963  95.12         0.937
DL technique
CNN      0.970        0.982        0.978  0.962  97.27         0.951
Table 5 Built-up and non-built-up area calculation for each city using HIBI (cities and their surrounding regions)
Study area  Built-up (km2)  Non-built-up (km2)
Jakarta     377.05          661.16
Tokyo       1427.03         2197.31
Moscow      1105.06         2507.17
Delhi       920.7           1485.58
Jaipur      134.53          1571.36
Varanasi    92.93           749.88
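Areas of this kind can be derived from classified pixel counts and the pixel footprint. The sketch below assumes square pixels at the 30 m PRISMA resolution quoted earlier; how the study actually computed the areas is not stated, and the function name is hypothetical.

```python
def area_km2(pixel_count, pixel_size_m=30.0):
    """Convert a count of classified pixels to an area in square kilometres.

    ASSUMPTION: square pixels of side `pixel_size_m` metres.
    """
    return pixel_count * pixel_size_m ** 2 / 1_000_000
```

At 30 m resolution each pixel covers 900 m2, so 1000 built-up pixels correspond to 0.9 km2.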
three bands are the optimal bands and can be used by future researchers for built-up classification. This index provides better separability between the sandbar and built-up classes than the other indices considered. The other indices except HIBI overestimate the built-up area for this region. Like all supervised classification methods, the performance of HIBI is subject to the selection of training data, i.e., the training set needs to be selected cautiously to ensure optimal
depends highly on the appropriate selection of threshold
like any other remote sensing based index.
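Since the index result depends on the chosen threshold, a data-driven estimate is one way to reduce that sensitivity. The text does not detail the bootstrapping steps, so the following Python sketch shows only one generic possibility, not the procedure used in this study: labelled training pixels are resampled with replacement, the accuracy-maximising cut is found on each resample, and the cuts are averaged. The function names, the convention that index values at or above the cut mean built-up, and the sample values are all assumptions.

```python
import random
from statistics import mean


def best_threshold(values, labels):
    """Smallest cut that maximises accuracy when value >= cut means built-up."""
    best_cut, best_acc = None, -1.0
    for cut in sorted(set(values)):
        correct = sum((v >= cut) == bool(y) for v, y in zip(values, labels))
        acc = correct / len(values)
        if acc > best_acc:  # strict '>' keeps the first (smallest) best cut
            best_cut, best_acc = cut, acc
    return best_cut


def bootstrap_threshold(values, labels, n_rounds=200, seed=0):
    """Average the best cut over bootstrap resamples of the training pixels."""
    rng = random.Random(seed)
    n = len(values)
    cuts = []
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        cuts.append(best_threshold([values[i] for i in idx],
                                   [labels[i] for i in idx]))
    return mean(cuts)


# Hypothetical index values: 0 = non-built-up, 1 = built-up.
index_values = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
pixel_labels = [0, 0, 0, 0, 1, 1, 1, 1]
threshold = bootstrap_threshold(index_values, pixel_labels)
print(f"bootstrap threshold estimate: {threshold:.3f}")
```

Averaging over resamples makes the estimate less dependent on any single training pixel than a cut chosen from one pass over the data.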
A new pixel-spectra-based remote sensing index has been
presented and analysed for the delineation of the non-built
and built-up regions of several cities (three in India
and three outside India). The index has been calculated by
using the PRISMA images. Finally, three bands, namely
Blue (λ = 492.69 nm), NIR (λ = 959.52 nm), and SWIR1
(λ = 1626.78 nm), have been used for estimating the HIBI.
The analysis indicates that the proposed index can be a
more accurate alternative to map built-up pixels in comparison
to the existing indices considered in this study. These
Fig. 5 Classied built-up and non-built-up maps of the study region using built-up indices and advanced classiers (RF and CNN); (a) NDBI (b)
UI (c) PB1BI (d) EBBI (e) DBI (f) IBI (g) HIBI (h)MIBI (i) RF (j) CNN
Data availability LANDSAT-8 data were accessed through the
GEE platform (accessed on 9–10 May 2021). The PRISMA data
were analysed in ENVI 5.6 software (accessed on 12 May 2021).
Conflict of interest The authors report there are no competing interests
to declare.
Acknowledgements The authors would like to take this opportunity
to express their gratefulness towards Dr. Prabhat Kumar Singh Dixit,
HOD of the Civil Engineering Department, IIT (BHU), for constantly
motivating us to carry forward this study.
Authors’ contributions Dr. Shishir Gaur, Rajarshi Bhattacharjee and
Nilendu Das conceptualized the work. Data preparation and analysis
were done by Rajarshi Bhattacharjee and Debanirmalya Patra. The first
draft of the paper was prepared by Nilendu Das. Finally, Dr. Anurag
Ohri and Dr. Shishir Gaur revised the manuscript critically and
approved this current version. All authors reviewed the manuscript.
Funding This study has been funded by the Science and Engineering
Research Board (SERB), a statutory body of the Department of
Science and Technology (DST).
Fig. 6 (a) SFCC and (b) TCC
Nagasubramanian K, Jones S, Sarkar S, Singh AK, Singh A, Ganapathysubramanian B (2018) Hyperspectral band selection using genetic algorithm and support vector machines for early identification of charcoal rot disease in soybean stems. Plant methods
Powers DM (2020) Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation. arXiv preprint arXiv:2010.16061
Poyil RP, Misra AK (2015) Urban agglomeration impact analysis using remote sensing and GIS techniques in Malegaon city, India. Int J Sustainable Built Environ 4(1):136–144
Rao DS, Prasad AVV, Nair T (2014) Application of texture characteristics for urban feature extraction from Optical Satellite images. Int J Image Graphics Signal Process 7(1):16
Rasul A, Balzter H, Ibrahim GRF, Hameed HM, Wheeler J, Adamu B, Ibrahim S, Najmaddin PM (2018) Applying built-up and bare-soil indices from Landsat 8 to cities in dry climates. Land 7(3):81
Rawat JS, Kumar M (2015) Monitoring land use/cover change using remote sensing and GIS techniques: a case study of Hawalbagh block, district Almora, Uttarakhand, India. Egypt J Remote Sens Space Sci 18(1):77–84
Richards JA, Richards J (1999) Remote sensing digital image analysis, 5th edn. Heidelberg, Berlin. https://doi.
Tran TDB, Puissant A, Badariotti D, Weber C (2011) Optimizing spa-
tial resolution of imagery for urban form detection—the cases of
France and Vietnam. Remote Sens 3(10):2128–2147
Ukhnaa M, Huo X, Gaudel G (2019) February Modification of urban built-up area extraction method based on the thematic index-derived bands. In: IOP Conference Series: Earth and Environmental Science, vol. 227, no. 6. IOP Publishing, p 062009
UN DESA, Population Division (2014) World urbanization prospects: the 2014 revision. United Nations Department of Economic and Social Affairs (UN DESA) Population Division, New York
Vangi E, D’Amico G, Francini S, Giannetti F, Lasserre B, Marchetti M, Chirici G (2021) The new hyperspectral satellite PRISMA: imagery for forest types discrimination. Sensors 21(4):1182
Xu H (2007) Extraction of urban built-up land features from Landsat imagery using a thematic oriented index combination technique. Photogrammetric Eng Remote Sens 73(12):1381–1391
Xu H (2008) A new index for delineating built-up land features in satellite imagery. Int J Remote Sens 29(14):4269–4276
Zha Y, Gao J, Ni S (2003) Use of normalized difference built-up index in automatically mapping urban areas from TM imagery. Int J Remote Sens 24(3):583–594
Zhao H, Chen X (2005) July Use of normalized difference bareness index in quickly mapping bare areas from TM/ETM+. In: International Geoscience and Remote Sensing Symposium, vol. 3, p
Publisher’s Note Springer Nature remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.
Springer Nature or its licensor (e.g. a society or other partner) holds
exclusive rights to this article under a publishing agreement with the
author(s) or other rightsholder(s); author self-archiving of the accepted
manuscript version of this article is solely governed by the terms of
such publishing agreement and applicable law.
As-syakur A, Adnyana I, Arthana IW, Nuarsa IW (2012) Enhanced built-up and bareness index (EBBI) for mapping built-up and bare land in an urban area. Remote Sens 4(10):2957–2970
Census Info India (2011) Population report for India. http://www. Accessed 14 June
Chen XL, Zhao HM, Li PX, Yin ZY (2006) Remote sensing image-based analysis of the relationship between urban heat island and land use/cover changes. Remote Sens Environ 104(2):133–146
Fourure D, Javaid MU, Posocco N, Tihon S (2021) September Anomaly detection: how to artificially increase your f1-score with a biased evaluation protocol. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, Cham, pp 3–18
Garg V, Aggarwal SP, Chauhan P (2020) Changes in turbidity along Ganga River using Sentinel-2 satellite data during lockdown associated with COVID-19. Geomatics Nat Hazards Risk
Ghiyamat A, Shafri HZM, Mahdiraji GA, Shariff ARM, Mansor S (2013) Hyperspectral discrimination of tree species with different classifications using single-and multiple-endmember. Int J Appl Earth Obs Geoinf 23:177–191
Goldberg DE (2001) Genetic Algorithms in Search, Optimization, and Machine Learning, MA USA
Guindon B, Zhang Y, Dillabaugh C (2004) Landsat urban mapping based on a combined spectral–spatial methodology. Remote Sens Environ 92(2):218–232
He C, Shi P, Xie D, Zhao Y (2010) Improving the normalized difference built-up index to map urban built-up areas using a semiautomatic segmentation approach. Remote Sens Lett 1(4):213–221
Hui C, Richardson DM, Visser V (2017) Ranking of invasive spread through urban green areas in the world’s 100 most populous cities. Biol Invasions 19(12):3527–3539
Jain CK, Singh S (2020) Impact of climate change on the hydrological dynamics of River Ganga, India. J Water Clim change
Jarocińska A, Kopeć D, Kycko M, Piórkowski H, Błońska A (2022) Hyperspectral vs. multispectral data: comparison of the spectral differentiation capabilities of Natura 2000 non-forest habitats. ISPRS J Photogrammetry Remote Sens 184:148–164
Kaur RR, Luthra A (2018) Population growth, urbanization and electricity-Challenges and initiatives in the state of Punjab, India. Energy strategy reviews 21:50–61
Kawamura M (1996) Relation between social and environmental conditions in Colombo Sri Lanka and the urban index estimated by satellite remote sensing data. In: Proceedings 51st Annual Conference of the Japan Society of Civil Engineers, pp 190–191
Loveland TR, Irons JR (2016) Landsat 8: the plans, the reality, and the legacy. Remote Sens Environ 185:1–6
Mallick J, Kant Y, Bharath BD (2008) Estimation of land surface temperature over Delhi using Landsat-7 ETM+. J Ind Geophys Union
Mukherjee A, Kumar AA, Ramachandran P (2020) Development of new index-based methodology for extraction of built-up area from landsat7 imagery: comparison of performance with svm, ann, and existing indices. IEEE Trans Geosci Remote Sens