International Journal of Environmental Science and Technology
https://doi.org/10.1007/s13762-019-02504-2
ORIGINAL PAPER
The study of environmental and human factors affecting aquifer
depth changes using tree algorithm
S. H. Mirhashemi1 · P. Haghighatjou1 · F. Mirzaei2 · M. Panahi3
Received: 20 March 2019 / Revised: 26 June 2019 / Accepted: 3 August 2019
© Islamic Azad University (IAU) 2019
Abstract
In recent years, more attention has been paid to water resources because of agricultural development and the need for proper planning to better manage aquifers. In this study, human and environmental factors affecting aquifer depth changes in Qazvin plain were examined. The Classification And Regression Tree (CART) algorithm was adopted to investigate and predict changes in aquifer depth. According to the results, the highest probability of aquifer drop, observed in July, August and September, was 86.5%, and the highest probability of aquifer rise, observed in December, January, February and March, was 71.2%. According to the sensitivity analysis of the CART algorithm, the most important human and environmental factors affecting the magnitude of aquifer depth changes in Qazvin plain were groundwater withdrawal from agricultural abstraction wells and air temperature, respectively. Therefore, by predicting the magnitude of aquifer depth changes with the CART model, planning and management of groundwater resources become possible.
Keywords Air temperature · Aquifer management · Aquifer drop · Qazvin plain
Introduction
Groundwater is an essential natural resource that plays an important role in water supply. Finding and predicting the spatial distribution of potential groundwater locations is therefore an important topic for private, governmental and research institutes (Nampak et al. 2014a, b). However, over-discharge of groundwater, combined with low recharge due to low rainfall and aridity, lowers the groundwater level and results in a shortage of freshwater (Li et al. 2013). All over the world, groundwater is an important source of freshwater, especially in areas with scarcity of surface water (Moreaux and Reynaud 2006). Methods for modeling the groundwater surface mainly focus on physically based numerical simulation models (Feng et al. 2011; Praveena et al. 2012; Li et al. 2013), wavelet linear regression (WLR), artificial neural network (ANN) and dynamic autoregressive (DAR) models (Adamowski and Chan 2011; Taormina et al. 2012; Maheswaran and Khosa 2013). In addition, some scientists have used different methods for assessing the risk of groundwater exploration. However, these studies emphasize the assessment of groundwater pollution risk (Wang et al. 2012; Sener and Davraz 2013). Studies on risk assessment methods for determining groundwater levels are rare; for example, Reddy and Ganguli (2012) presented a bivariate-copula-based approach for assessing the risk of hydroclimatic variability on groundwater levels in an unconfined aquifer at the Manjara watershed in India. Dong et al. (2013) divided groundwater in Tianjin into seven task areas and used a comprehensive evaluation model of groundwater use risks to forecast the level of the third aquifer in 2015, 2020 and 2030.
Liu et al. (2018) showed that their proposed Gaussian process regression (GPR) model performed better than partial least squares, back-propagation artificial neural networks and least squares support vector regression (LSSVR). To
Communicated by Parveen Fatemeh Rupani.
* P. Haghighat jou
phjou40@gmail.com
1 Department of Water Engineering, Faculty of Water and Soil, University of Zabol, Zabol, Iran
2 Department of Irrigation and Drainage, Faculty of Agriculture and Natural Resources, University of Tehran, Tehran, Iran
3 Department of Water Engineering, Faculty of Agriculture, University of Zanjan, Zanjan, Iran
map landslide susceptibility, a data mining classification technique was used. The decision tree is a popular classification algorithm, although it had previously been difficult to apply to landslide susceptibility analysis, because it assumes a uniform class distribution, whereas landslide spatial event data are sometimes highly class-imbalanced (Yeon et al. 2010). The classification tree technique has also been applied to develop a prediction model, and the model proved very helpful in prediction (Hossain and Piantanakulchai 2013). Decision trees were relatively more accurate than neural networks and support vector machines, but they produced more rule nodes than desired; adjusting the minimum support led to a more tractable set of rules (Olson et al. 2012). The decision tree is one of the most popular classification techniques in data mining. It is represented as a tree that summarizes the classification procedure, and it is used to predict the membership of items in different classes. The ability to tune this technique makes it more usable than other data mining methods. The decision tree methodology consists of two main phases: (A) constructing the primary tree, in which the tree is grown from the training data until each leaf becomes homogeneous; and (B) pruning, in which the grown tree is pruned against the test data to increase the accuracy of the model (Cichosz 2015). Lee and Lee (2015) investigated groundwater productivity potential using the decision tree technique and a geographical information system in the cities of Boryeong and Pohang, Korea. Their results showed that decision tree models can be useful for the development and study of groundwater resources. Data mining and machine learning techniques are two effective tools for studying the similarity of watersheds from a hydrological viewpoint (Di Prinzio et al. 2011; Ley et al. 2011; Toth 2013).
CART is a powerful prediction tool with accurate results. The Chi-squared Automatic Interaction Detector (CHAID) algorithm, developed from a method called AID, uses the Chi-squared test as its splitting strategy, hence its name. It supports continuous and discrete input variables and can perform both regression and classification on the target variable (Bozkir and Sezer 2011). Decision trees also support groundwater managers and decision makers in implementing programs to conserve groundwater resources (Stumpp et al. 2016).
Tien Bui et al. (2018) also assessed the possibility of land subsidence in South Korea using four machine learning models: support vector machine (SVM), logistic model tree (LMT), Bayesian logistic regression (BLR) and alternate decision tree (ADTree). They found the BLR model to be more precise than the other methods, although the others also showed reasonable precision.
Most of Iran’s plains are arid and face water scarcity, so managing water resources in these plains is very important. In this research, the effects of various factors on aquifer depth changes were therefore studied in order to support better aquifer management. The questions addressed in this study are: which human and environmental factors affect aquifer depth changes, and how do the depth changes vary across the months of the year? The objective of the study is better management of the aquifers. The results of the CART algorithm were compared with those of the CHAID, SVM, Reduced Error Pruning Tree (REPTree) and neural net algorithms, and CART performed best among them.
Materials and methods
Study area
Qazvin plain, with an area of 440 thousand hectares, is located on the central plateau of Iran. Its climate is semi-arid, with hot summers and cold winters. One of the most important rivers supplying water to this plain is the Taleghan River, on which the Taleghan dam was built. The dam has a capacity of 460 million m3, but it has never stored more than 210 million m3 of water because of a severe reduction in inflow (Mohammadrezapour et al. 2019). The position of Qazvin plain is shown in Fig. 1.
The total agricultural area of Qazvin plain is about 313,608 hectares, of which approximately 82,070 hectares lie within the irrigation network. The water entering the irrigation network of Qazvin plain is supplied from the Taleghan Dam reservoir. The agricultural area of Qazvin plain contains about 1900 agricultural wells, with an average flow rate of about 30 L/s. To determine the volume of agricultural water demand, the water requirement of each agricultural product grown in the agricultural area of Qazvin plain was multiplied by its cultivation area. The volume of precipitation was obtained by multiplying the amount of precipitation by the area over which it fell.
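The two volume calculations above amount to multiplying a water depth by an area. The following minimal Python sketch illustrates the arithmetic, assuming depths in millimeters and areas in hectares (the text reports only areas in hectares and volumes in million cubic meters); the function names and the sample numbers are hypothetical.

```python
def depth_to_volume_mcm(depth_mm: float, area_ha: float) -> float:
    """Convert a water depth over an area to a volume in million cubic meters."""
    # 1 mm of water over 1 ha equals 10 cubic meters.
    cubic_meters = depth_mm * area_ha * 10.0
    return cubic_meters / 1e6


def agricultural_demand_mcm(crops: list[tuple[float, float]]) -> float:
    """Sum (water requirement in mm) x (cultivated area in ha) over all crops."""
    return sum(depth_to_volume_mcm(req_mm, area_ha) for req_mm, area_ha in crops)


# Hypothetical monthly values, for illustration only.
precip_volume = depth_to_volume_mcm(depth_mm=25.0, area_ha=440_000)
demand_volume = agricultural_demand_mcm([(120.0, 50_000), (80.0, 30_000)])
print(precip_volume, demand_volume)
```

Keeping both quantities in million cubic meters makes them directly comparable with the other volume inputs of the model.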
Fig. 1 Location of the Qazvin plain in Iran
Fig. 2 a, b The aquifer depth changes in the 15-year period
The monthly potential evapotranspiration of Qazvin plain was calculated with the Penman–Monteith formula. Following the expert consultation held in May 1990, the FAO Penman–Monteith method is recommended as the sole standard method for defining and calculating the reference evapotranspiration (Allen et al. 1998) (Eq. 1).
where ETo is the reference evapotranspiration (mm day−1), T the mean daily air temperature at 2 m height (°C), Rn the net radiation at the crop surface (MJ m−2 day−1), u2 the wind speed at 2 m height (m s−1), es the saturation vapor pressure (kPa), ∆ the slope of the vapor pressure curve (kPa °C−1), ea the actual vapor pressure (kPa), γ the psychrometric constant (kPa °C−1), es − ea the saturation vapor pressure deficit (kPa) and G the soil heat flux density (MJ m−2 day−1).
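As a rough illustration of how Eq. 1 can be evaluated, the sketch below implements the FAO Penman–Monteith formula in its standard FAO-56 form (Allen et al. 1998), using the symbols defined above. The function name and the sample inputs are hypothetical and are not taken from the Qazvin data.

```python
def fao_penman_monteith(t_mean, rn, g, u2, es, ea, delta, gamma):
    """Reference evapotranspiration ETo (mm/day), FAO-56 Penman-Monteith form.

    t_mean: mean daily air temperature at 2 m (deg C)
    rn, g:  net radiation and soil heat flux density (MJ m-2 day-1)
    u2:     wind speed at 2 m (m/s)
    es, ea: saturation and actual vapor pressure (kPa)
    delta:  slope of the vapor pressure curve (kPa/deg C)
    gamma:  psychrometric constant (kPa/deg C)
    """
    radiation_term = 0.408 * delta * (rn - g)
    aerodynamic_term = gamma * (900.0 / (t_mean + 273.0)) * u2 * (es - ea)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u2))


# Illustrative daily inputs (assumed values, not measurements from this study).
eto = fao_penman_monteith(t_mean=16.9, rn=13.28, g=0.0, u2=2.078,
                          es=1.997, ea=1.409, delta=0.122, gamma=0.0666)
print(round(eto, 2))
```

Passing the vapor pressures and the slope as arguments keeps the sketch short; in practice they would first be derived from temperature and humidity records.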
The data used in this study are monthly records from the Qazvin plain area for the years 2001 to 2015. The human factors were the volume of water discharged from agricultural wells (million cubic meters), the volume of water delivered to the irrigation network (million cubic meters) and the volume of agricultural water demand (million cubic meters). The environmental factors were the volume of precipitation (million cubic meters), air temperature (°C), air humidity (percent) and potential evapotranspiration (millimeters per day). The aquifer depth fluctuations were introduced into the model as the target data.
Figure 2a, b shows the aquifer depth changes, or fluctuations, over the 15-year period. Negative changes correspond to a fall in aquifer depth and positive changes to a rise. As shown in the figure, the largest fall and the largest rise of the aquifer occurred in August and April, respectively.
\[
ET_o = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T + 273}\,u_2\,(e_s - e_a)}{\Delta + \gamma\,(1 + 0.34\,u_2)} \tag{1}
\]
Data mining algorithms
The CART decision tree algorithm is a method for constructing prediction models from data. It partitions its input data recursively and can handle categorical predictor and target variables (Loh 2011).
CART stands for Classification And Regression Trees (Breiman et al. 1984). It is characterized by the fact that it builds binary trees, that is, each internal node has exactly two outgoing edges. The splits are selected using the twoing criterion, and the resulting tree is pruned by cost–complexity pruning. When they are provided, CART can take misclassification costs into account during tree induction. It also allows users to supply a prior probability distribution. An important feature of CART is its ability to produce regression trees, whose leaves predict a real number rather than a class. For regression, CART searches for splits that minimize the squared prediction error, and the prediction in each leaf is based on the weighted mean for the node (El Seddawy et al. 2013). The advantages and disadvantages of this method are as follows (Timofeev 2004):
Advantages CART can easily handle both numerical and categorical variables. The CART algorithm itself identifies the most significant variables and eliminates non-significant ones. CART can easily handle extreme values.
Disadvantages CART may produce unstable decision trees. A minor modification of the learning sample, such as removing several observations, can change the decision tree substantially: the tree complexity may increase or decrease, and the splitting variables and values may change. CART splits on only one variable at a time. Another problem of the CART model is its bias in variable selection. In addition, for qualitative variables with more than two levels, the results may be confusing: several levels of a variable can be assigned to one node, which makes the results difficult to interpret (Breiman et al. 1984).
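The split search that CART performs for regression, as described above, can be sketched as follows. This is a simplified single-predictor illustration with equal sample weights, not the software used in this study; the function names and the data are hypothetical.

```python
def sse(values):
    """Sum of squared deviations of the values from their mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_regression_split(x, y):
    """Return (threshold, total_sse) of the binary split x <= threshold
    that minimizes the summed squared error of the two child nodes."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        left_x, right_x = pairs[i - 1][0], pairs[i][0]
        if left_x == right_x:
            continue  # no threshold can separate identical predictor values
        threshold = (left_x + right_x) / 2.0
        left = [target for _, target in pairs[:i]]
        right = [target for _, target in pairs[i:]]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best


# Hypothetical air temperatures (deg C) and aquifer depth changes (m).
temps = [5, 8, 12, 22, 25, 28]
changes = [0.6, 0.5, 0.4, -0.7, -0.8, -0.9]
threshold, total_sse = best_regression_split(temps, changes)
print(threshold, round(total_sse, 3))
```

Each child node would then predict the mean of its target values, and the same search would be repeated recursively over all predictors to grow the tree.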
The CHAID algorithm finds the differences between samples and then generates the tree accordingly; the tree is pruned by finding similar differences (Chattamvelli 2011). The SVM method is a supervised nonparametric statistical approach that makes no assumption about the data distribution. Its main characteristics are its ability to work with less training data and its higher accuracy compared with the methods mentioned above (Mantero et al. 2005; Mountrakis et al. 2011). Neural networks have a layered structure in which each layer consists of a few nodes (neurons); processing begins with the data input and ends with the output (Richards 1999). The REPTree algorithm is a fast decision tree learner that builds a decision or regression tree using information gain or variance and prunes it by reduced error pruning. The algorithm sorts the values of numeric attributes only once (Daud and Corne 2007).
Table 1 Model denomination in data classification
Model name   Data range (m)     Data average (m)   Percentage of data
A            ··· < −1           −1.66              10.23
B            −1 ≤ ··· < −0.3    −0.56              25.79
C            −0.3 ≤ ··· < 0     −0.15              24.14
D            0 < ··· < 0.3      0.15               17.66
E            0.3 ≤ ··· < 1      0.55               16.78
F            1 ≤ ···            1.73               5.4
Classification of aquifer depth changes data
To make the results more usable for managers and to reduce the effect of errors, the values of groundwater depth fluctuations were categorized. When reading errors are smaller than 1 m, categorization reduces their effect, and it likewise mitigates the influence of outliers. The purpose of the method is to place the data in categories according to rules and to avoid categories that are very small. Categorization converts continuous attributes into discrete attributes. In this study, the data on aquifer depth fluctuations were divided into six models, and the boundaries between them were chosen so that each model is sufficiently represented in the data sample. After categorization, the data in each range were referred to by the model name, from A to F (Table 1): the magnitude of the groundwater drop decreases from model A to C, and the magnitude of the groundwater rise increases from D to F. A negative sign in the data range denotes aquifer drawdown.
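The categorization into models A to F can be expressed as a small lookup based on the ranges of Table 1. The sketch below is illustrative only; the function name is hypothetical, and because Table 1 assigns no model to a change of exactly zero, returning None in that case is an assumption.

```python
from typing import Optional


def classify_depth_change(change_m: float) -> Optional[str]:
    """Map an aquifer depth change (m) to a model name following Table 1."""
    if change_m < -1.0:
        return "A"
    if change_m < -0.3:
        return "B"
    if change_m < 0.0:
        return "C"
    if change_m == 0.0:
        return None  # assumption: Table 1 leaves a zero change unassigned
    if change_m < 0.3:
        return "D"
    if change_m < 1.0:
        return "E"
    return "F"


# The class averages reported in Table 1 fall into their own classes.
print([classify_depth_change(v) for v in (-1.66, -0.56, -0.15, 0.15, 0.55, 1.73)])
```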
As shown in Table 1, the highest and lowest shares of the aquifer depth changes belong to model B (a drop of 0.3 to 1 m) and model F (a rise of more than 1 m), with 25.8% and 5.4% of the data, respectively. The discrete values of aquifer depth changes were introduced into the CART tree algorithm as the target function. When the target function is discrete, the software provides, for each branch of the tree, information about the frequency of each range in addition to the initial rules. In this research, the frequency of each model in different months and some of the related initial rules were used.
Table 2 Evaluation of the results of the models by statistical indicators
Statistical indicator   CART   Neural net   CHAID   SVM    REPTree
TPR                     0.62   0.30         0.28    0.30   0.22
TNR                     0.78   0.62         0.52    0.53   0.41
PPV                     0.45   0.40         0.35    0.37   0.35
ACC                     0.75   0.71         0.67    0.74   0.73
FM                      0.30   0.25         0.13    0.25   0.14
GM                      0.69   0.43         0.38    0.39   0.30
Fig. 3 The results of tree graph derived from the CART algorithm
To validate the five algorithms, the data were divided into two parts: training data and test data. The models were built on the training data and then evaluated on the test data. The accuracy of a model is expressed as the percentage of test samples whose target class the model identifies correctly (Gupta 2011). For all models, 70% of the data were randomly selected as training data and the remaining 30% were used as test data. The statistical indices of Eqs. 2–7 were used to evaluate the results of the CART, CHAID, SVM, REPTree and neural net algorithms.
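A random 70%/30% partition of this kind can be sketched in a few lines. The function name, the fixed seed and the placeholder records below are assumptions made for illustration; the text does not state how the random selection was implemented.

```python
import random


def train_test_split(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of the records and cut it into training and test parts."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder records standing in for the observations of the study.
records = list(range(180))
train, test = train_test_split(records)
print(len(train), len(test))
```

Every record ends up in exactly one of the two parts, so the test data remain unseen while the models are being built.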
To evaluate the models and select the best one, the true positive rate (TPR), true negative rate (TNR), accuracy (ACC), positive predictive value or precision (PPV), F-measure (FM) and geometric mean (GM) were used. These indices are defined in Eqs. 2–7.
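\[
\mathrm{TPR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}} \tag{2}
\]
\[
\mathrm{TNR} = \frac{\mathrm{TN}}{\mathrm{FP} + \mathrm{TN}} \tag{3}
\]
\[
\mathrm{ACC} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{FN} + \mathrm{FP} + \mathrm{TN}} \tag{4}
\]
\[
\mathrm{FM} = \frac{2 \times \mathrm{TP}}{\mathrm{TP} + \mathrm{FP} + \mathrm{FN}} \tag{5}
\]
\[
\mathrm{GM} = \sqrt{\mathrm{TPR} \times \mathrm{TNR}} \tag{6}
\]
\[
\mathrm{PPV} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}} \tag{7}
\]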
Table 3 The frequencies of models in December, January, February and March
Model name   Model frequency   Percentage of model frequency
A            57                2.2
B            219               8.3
C            480               18.3
D            779               29.6
E            837               31.9
F            255               9.7
Table 4 Frequency probability of models in April and November
Model name   Model frequency   Percentage of model frequency
A            93                7.1
B            234               17.9
C            342               26.3
D            287               22.2
E            257               19.7
F            89                6.8
Table 5 Frequency probability of models in July, August, and September
Model name   Model frequency   Percentage of model frequency
A            167               15.3
B            482               44.0
C            298               27.2
D            79                7.2
E            50                4.6
F            19                1.7
Table 6 Initial rules for the dominant state of model B in July, August and September
Initial   Month name in [“August” “July” “June” “May” “October” “September”]
1         The temperature of air > 14
2         Potential evapotranspiration > 3.24
3         Percent of air humidity ≤ 45.5
4         The volume of agricultural water demand > 0.03
5         The volume of water entering irrigation network > 0.07
Table 7 Frequency probability of models in May, June, and October
Model name   Model frequency   Percentage of model frequency
A            118               9.28
B            436               28.27
C            440               40.59
D            139               10.93
E            101               7.94
F            38                2.99
Table 8 The initial rules for the dominant state of model C in May, June and October
Initial   Month name in [“June” “May” “October”]
1         The temperature of air ≤ 19.5
2         Potential evapotranspiration ≤ 3.2
3         Percent air humidity > 45.5
4         Volume of precipitation > 0.002
5         The volume of water entering to irrigation network ≤ 0.2
In Eqs. 2–7, TP is the number of positive samples classified correctly, FP the number of negative samples incorrectly classified as positive, FN the number of positive samples incorrectly classified as negative, and TN the number of negative samples classified correctly (Han et al. 2011).
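For illustration, the six indices can be computed from the four counts defined above. The sketch follows Eqs. 2–7 as they are written in this paper; the function name and the counts are hypothetical.

```python
import math


def classification_indices(tp, fp, fn, tn):
    """Evaluation indices of Eqs. 2-7 computed from confusion-matrix counts."""
    tpr = tp / (tp + fn)                      # Eq. 2, true positive rate
    tnr = tn / (fp + tn)                      # Eq. 3, true negative rate
    acc = (tp + tn) / (tp + fn + fp + tn)     # Eq. 4, accuracy
    # Eq. 5 as written in the text. Note that the commonly used F1 score
    # has 2*TP + FP + FN in the denominator instead.
    fm = 2 * tp / (tp + fp + fn)
    gm = math.sqrt(tpr * tnr)                 # Eq. 6, geometric mean
    ppv = tp / (tp + fp)                      # Eq. 7, precision
    return {"TPR": tpr, "TNR": tnr, "ACC": acc, "FM": fm, "GM": gm, "PPV": ppv}


# Hypothetical counts for one class treated as "positive" against all others.
indices = classification_indices(tp=30, fp=25, fn=20, tn=125)
print({name: round(value, 3) for name, value in indices.items()})
```

With six target models (A to F), such counts would be obtained per class in a one-versus-rest manner; how the per-class values were aggregated in Table 2 is not stated in the text.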
Results and discussion
Among the algorithms used, CART gave the best results, with the highest true positive rate (0.62), true negative rate (0.78), accuracy (0.75), positive predictive value or precision (0.45), F-measure (0.30) and geometric mean (0.69) (Table 2).
The CART tree diagram is shown in Fig. 3. The diagram is divided into two main branches: May, June, July, August, September and October fall in the first main branch, and November, December, January, February, March and April fall in the second. The predicted aquifer depth changes are shown at the end of each branch. Details of the tree graph are listed in Tables 3, 4, 5, 6, 7 and 8.
The first category
In December, January, February and March
Table 3 shows the probability of occurrence of each aquifer depth change model, determined from its frequency. In December, January, February and March, models E and A have the highest and lowest probabilities, 31.9% and 2.2%, respectively. According to Table 3, the summed probabilities of rise and drop of the aquifer depth are 71.2% and 28.8%, respectively. Therefore, during these four months, a rise is more probable than a drop in aquifer depth.
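The percentages and their sums follow directly from the model frequencies. As a brief check, the sketch below reproduces the figures for December to March from the frequencies in Table 3; the function and variable names are hypothetical.

```python
def class_probabilities(frequencies):
    """Convert model frequencies to percentages of the total."""
    total = sum(frequencies.values())
    return {name: 100.0 * count / total for name, count in frequencies.items()}


# Model frequencies for December, January, February and March (Table 3).
table3 = {"A": 57, "B": 219, "C": 480, "D": 779, "E": 837, "F": 255}
probs = class_probabilities(table3)

drop = probs["A"] + probs["B"] + probs["C"]   # models A to C denote a drop
rise = probs["D"] + probs["E"] + probs["F"]   # models D to F denote a rise
print(round(rise, 1), round(drop, 1))
```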
In April and November
According to the frequency table (Table 4), models C and F have the highest and lowest probabilities in April and November, 26.3% and 6.8%, respectively. The summed probabilities of drop (models A to C) and rise (models D to F) are 51.3% and 48.7%, respectively. Therefore, the probabilities of drop and rise of the aquifer depth differ little in these two months.
Fig. 4 The importance of input
parameters in predicting the
amount of aquifer depth changes
The second category
In July, August and September
As can be seen in Table 5, models B and F have the highest and lowest probabilities in July, August and September, 44% and 1.7%, respectively. The summed probabilities of drop (models A to C) and rise (models D to F) are 86.5% and 13.5%, respectively. Therefore, a drop is much more probable than a rise in these three months.
The conditions leading to model B were investigated through the initial rules in order to suggest ways of reducing the drop. The initial rules are specified for a state that ends in B (Table 6).
According to the initial rules in Table 6, two human factors contribute to the changes in aquifer depth when their values increase: the volume of water entering the irrigation network and the volume of agricultural water demand. More advanced irrigation systems, with higher efficiency and fewer losses, can therefore be recommended, since they reduce the volume of water demanded from water resources and ultimately the volume of water entering the irrigation network. Among the environmental factors, an increase in air temperature and a reduction in air humidity increase the aquifer drop, so when the air temperature rises and the air humidity falls, preventive measures should be taken to avoid undesirable conditions in the aquifer.
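The initial rules of Table 6 describe a path through the tree as a conjunction of threshold conditions. A minimal sketch of such a rule check is given below; the record field names are hypothetical, and the units are assumed to be those listed for the input data (°C, mm per day, percent and million cubic meters).

```python
MODEL_B_MONTHS = {"May", "June", "July", "August", "September", "October"}


def matches_model_b_rules(record: dict) -> bool:
    """Return True if a monthly record satisfies every rule of Table 6."""
    return (
        record["month"] in MODEL_B_MONTHS
        and record["air_temperature"] > 14
        and record["potential_et"] > 3.24
        and record["air_humidity"] <= 45.5
        and record["agri_water_demand"] > 0.03
        and record["network_inflow"] > 0.07
    )


# Hypothetical records, for illustration only.
hot_dry_month = {"month": "August", "air_temperature": 27.0, "potential_et": 6.1,
                 "air_humidity": 38.0, "agri_water_demand": 0.12, "network_inflow": 0.25}
humid_month = dict(hot_dry_month, air_humidity=60.0)
print(matches_model_b_rules(hot_dry_month), matches_model_b_rules(humid_month))
```

A record that satisfies all conditions falls in the leaf where model B is dominant; failing any single condition sends it down a different branch.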
In May, June and October
According to the frequency table (Table 7), models C and F have the highest and lowest probabilities in May, June and October, 40.6% and 3%, respectively. The summed probabilities of drop (models A to C) and rise (models D to F) are 78.1% and 21.9%, respectively. Therefore, a drop in aquifer depth is more probable than a rise during these three months.
The conditions of model C are investigated by initial rules
to suggest ways to reduce the drop based on their results.
The initial rules are determined for a state that leads to C
(Table8).
According to Table 8, the human factors affecting the magnitude of the drop are the volume of water entering the irrigation network and the volume of agricultural water demand. Because the volume of precipitation and the volume of water entering the irrigation network increase simultaneously, it can be assumed that an increase in precipitation (assuming the aquifer is in a desirable state) leads managers and farmers to wrongly choose a high-consumption crop pattern, which increases agricultural water consumption. This inference is supported by the increase in the volume of water entering the irrigation network and, eventually, the drop in the aquifer.
Figure 4 shows the results of the sensitivity analysis of the CART algorithm, which ranks the importance of the human and environmental factors in predicting aquifer depth changes. The volume of water withdrawn from agricultural wells and the air temperature were identified as the most important human and environmental factors, respectively, affecting the prediction of groundwater depth changes.
Conclusion
Predicting aquifer depth changes is important for better aquifer management. In this study, a model for predicting aquifer depth changes was presented using the CART tree algorithm. According to the results, air temperature is the environmental factor with the greatest effect on aquifer depth changes. According to the tree diagram, the maximum drop and rise of aquifer depth occur mainly in the hottest and coolest months of the year, respectively. It is therefore advisable to supervise irrigation water use and to avoid producing crops with high water requirements in warm months. Among the human factors, water discharged from agricultural wells has the greatest effect on aquifer depth changes. To manage aquifers better, it is thus recommended that water supplied through irrigation networks be used instead of groundwater. Using water supplied by irrigation networks leads to efficient application of water, so irrigation networks play an important role in optimizing the use of water resources (Burt 2007).
Finally, more attention should be paid to water resources management. Aquifer depth fluctuations should be measured regularly at more appropriate intervals, and all the human factors that affect them should be properly managed.
Acknowledgements The authors would like to thank the Coordinatorship of the Scientific Research Projects of the University of Zabol.
References
Adamowski J, Chan HF (2011) A wavelet neural network conjunction model for groundwater level forecasting. J Hydrol 407(1–4):28–40
Allen RG, Pereira LS, Raes D, Smith M (1998) Crop evapotranspiration-guidelines for computing crop water requirements-FAO irrigation and drainage paper 56. FAO, Rome, vol 300, no 9, pp D05109
Bozkir AS, Sezer EA (2011) Predicting food demand in food courts by decision tree approaches. Proc Comput Sci 3:759–763
Breiman L, Friedman J, Stone CJ, Olshen RA (1984) Classification and regression trees, 1st edn. Chapman and Hall/CRC, New York
Burt CM (2007) Volumetric irrigation water pricing considerations. Irrigat Drain Syst 21(2):133–144
Chattamvelli R (2011) Data mining algorithms, 1st edn. Alpha Science International, Oxford, pp 274–290
Cichosz P (2015) Data mining algorithms: explained using R. Wiley, New York
Daud MNR, Corne DW (2007) Human readable rule induction in medical data mining: a survey of existing algorithms. In: WSEAS European computing conference, Athens, Greece
Dong D, Sun W, Zhu Z, Xi S, Lin G (2013) Groundwater risk assessment of the third aquifer in Tianjin city, China. Water Resour Manag 27(8):3179–3190
El Seddawy AB, Sultan T, Khedr A (2013) Applying classification technique using DID3 algorithm to improve decision support under uncertain situations. Department of Business Information System, Arab Academy for Science and Technology and Department of Information System, Helwan University, Egypt. Int J Mod Eng Res 3(4):2139–2146
Feng S, Huo Z, Kang S, Tang Z, Wang F (2011) Groundwater simulation using a numerical model under different water resources management scenarios in an arid region of China. Environ Earth Sci 62(5):961–971
Gupta GK (2011) Introduction to data mining with case studies, 2nd edn. Prentice Hall, Upper Saddle River, pp 526–534
Han J, Kamber M, Pei J (2011) Data mining concepts and techniques, 3rd edn. Morgan Kaufmann, Burlington, pp 365–369
Hossain MM, Piantanakulchai M (2013) Groundwater arsenic contamination risk prediction using GIS and classification tree method. Eng Geol 156:37–45
Lee S, Lee CW (2015) Application of decision-tree model to groundwater productivity-potential mapping. Sustainability 7(10):13416–13432
Ley R, Casper MC, Hellebrand H, Merz R (2011) Catchment classification by runoff behaviour with self-organizing maps (SOM). Hydrol Earth Syst Sci 15(9):2947–2962
Li F, Feng P, Zhang W, Zhang T (2013) An integrated groundwater management mode based on control indexes of groundwater quantity and level. Water Resour Manag 27(9):3273–3292
Liu H, Yang C, Huang M, Wang D, Yoo C (2018) Modeling of subway indoor air quality using Gaussian process regression. J Hazard Mater 359:266–273
Loh WY (2011) Classification and regression trees. Wiley Interdiscip Rev Data Min Knowl Discov 1(1):14–23
Maheswaran R, Khosa R (2013) Long term forecasting of groundwater levels with evidence of non-stationary and nonlinear characteristics. Comput Geosci 52:422–436
Mantero P, Moser G, Serpico SB (2005) Partially supervised classification of remote sensing images through SVM-based probability density estimation. IEEE Trans Geosci Remote Sens 43:559–570
Mohammadrezapour O, Yoosefdoost I, Ebrahimi M (2019) Cuckoo optimization algorithm in optimal water allocation and crop planning under various weather conditions (case study: Qazvin plain, Iran). Neural Comput Appl 31(6):1879–1892
Moreaux M, Reynaud A (2006) Urban freshwater needs and spatial cost externalities for coastal aquifers: a theoretical approach. Reg Sci Urban Econ 36(2):163–186
Mountrakis G, Im J, Ogole C (2011) Support vector machines in remote sensing: a review. ISPRS J Photogramm Remote Sens 13:247–259
Nampak H, Pradhan B, Manap MA (2014) Application of GIS based data driven evidential belief function model to predict groundwater potential zonation. J Hydrol 513:283–300
Olson DL, Delen D, Meng Y (2012) Comparative analysis of data mining methods for bankruptcy prediction. Decis Support Syst 52(2):464–473
Praveena SM, Abdullah MH, Bidin K, Aris AZ (2012) Sustainable groundwater management on the small island of Manukan, Malaysia. Environ Earth Sci 66(3):719–728
Di Prinzio M, Castellarin A, Toth E (2011) Data-driven catchment classification: application to the PUB problem. Hydrol Earth Syst Sci 15(6):1921–1935
Reddy MJ, Ganguli P (2012) Risk assessment of hydroclimatic variability on groundwater levels in the Manjara basin aquifer in India using Archimedean copulas. J Hydrol Eng 17(12):1345–1357
Richards JA (1999) Remote sensing digital image analysis. Springer, Berlin, p 240
Sener E, Davraz A (2013) Assessment of groundwater vulnerability based on a modified DRASTIC model, GIS and an analytic hierarchy process (AHP) method: the case of Egirdir Lake basin (Isparta, Turkey). Hydrogeol J 21(3):701–714
Stumpp C, Żurek AJ, Wachniew P, Gargini A, Gemitzi A, Filippini M, Witczak S (2016) A decision tree tool supporting the assessment of groundwater vulnerability. Environ Earth Sci 75(13):1057
Taormina R, Chau KW, Sethi R (2012) Artificial neural network simulation of hourly groundwater levels in a coastal aquifer system of the Venice lagoon. Eng Appl Artif Intell 25(8):1670–1676
Tien Bui B, Shahabi H, Shirzadi A, Chapi K, Pradhan B, Chen W, Khosravi K, Panahi M, Ahmad B, Saro L (2018) Land subsidence susceptibility mapping in South Korea using machine learning algorithms. Sensors 18:2464
Timofeev R (2004) Classification and regression trees (CART) theory and applications. Humboldt University, Berlin
Toth E (2013) Catchment classification based on characterisation of streamflow and precipitation time series. Hydrol Earth Syst Sci 17(3):1149–1159
Wang J, He J, Chen H (2012) Assessment of groundwater contamination risk using hazard quantification, a modified DRASTIC model and groundwater value, Beijing Plain, China. Sci Total Environ 432:216–226
Yeon YK, Han JG, Ryu KH (2010) Landslide susceptibility mapping in Injae, Korea, using a decision tree. Eng Geol 116(3–4):274–283