Environmental Modeling &
Assessment
ISSN 1420-2026
Volume 18
Number 5
Environ Model Assess (2013) 18:605-614
DOI 10.1007/s10666-013-9366-2
Improving Estimation of Specific Surface
Area by Artificial Neural Network
Ensembles Using Fractal and Particle
Size Distribution Curve Parameters as
Predictors
Hossein Bayat, Sabit Ersahin & Estela N. Hepper
Received: 7 April 2012 / Accepted: 11 March 2013 / Published online: 26 March 2013
© Springer Science+Business Media Dordrecht 2013
Abstract Specific surface area (SSA) is one of the principal
soil properties used in modeling soil processes. In this study,
artificial neural network (ANN) ensembles were evaluated to
predict SSA. Complete soil particle-size distribution was esti-
mated from sand, silt, and clay fractions using the model by
Skaggs et al. and then the particle-size distribution curve
parameters (PSDCPs) and fractal parameters were calculated.
The PSDCPs were used to predict 20 particle-size classes for a
soil sample’s particle size distribution. Fractal parameters were
calculated by the model of Bird et al. In addition, total soil-
specific surface area (TSS) was calculated using the above 20
size classes. Pedotransfer functions were developed for SSA
and TSS using ANN ensembles from 63 SSA measurements
taken from the literature. Fractal parameters, PSDCPs, and
some other soil properties were used to predict SSA and
TSS. Introducing fractal parameters and PSDCPs improved
the SSA estimations by 12.5 and 11.1 %, respectively. The
improvements were even better for TSS estimations (27.7 and
27.0 %, respectively). The use of fractal parameters as estima-
tors described 44 and 92.8 % of the variation in SSA and TSS,
respectively, while PSDCPs explained 42 and 6.6 % of the
variation in SSA and TSS, respectively. The results suggested
that fractal parameters and PSDCPs could be successfully used
as predictors in ANN ensembles to predict SSA and TSS.
Keywords Artificial neural networks · Ensemble · Fractal parameters · Particle-size distribution · Specific surface area · Prediction
1 Introduction
Specific surface area (SSA) is one of the principal soil prop-
erties for agricultural, industrial, and environmental applica-
tions that are related to the physical and chemical properties of
a porous medium [1]. Previous studies have indicated that
SSA is closely related to and has a determining influence on
many soil properties such as grain-size distribution (i.e., the
clay-size fraction), mineralogical composition [2], consisten-
cy limits [3], swelling and shrinkage characteristics [4], com-
pressibility characteristics [5], cation exchange capacity, clay
content [6], frost heave [7], water retention characteristics [8],
water movement, soil aggregation [9], activity [2], sorption
and desorption characteristics [10], and angle of the internal
friction of soils [11]. Moreover, processes such as contaminant
accumulation, nutrient dynamics, and chemical transport are
greatly influenced by SSA [12].
Contrary to soil properties such as organic matter, pH, and
particle-size distribution (PSD), SSA is not measured routine-
ly [13]; therefore, data for surface area are unavailable in most
of the databases. In addition, direct measurement techniques
for SSA are time consuming, labor intensive, expensive, technically difficult, and require skilled personnel [14,15]. Moreover, methods to measure SSA are diverse and their results are generally not comparable [16], owing to errors from either inherent limitations of the instrument employed or the basic assumptions of the mathematical models used to compute SSA. Hence, a simple, economical, and rapid method that predicts reliable values of soil SSA is needed.
In the past 15 years, many pedotransfer functions (PTFs)
have been developed using artificial neural networks (ANNs)
Project supported by the Bu Ali Sina University, Hamadan, Iran.
H. Bayat (*)
Department of Soil Science, Faculty of Agriculture,
Bu Ali Sina University, Hamadan, Iran
e-mail: h.bayat@basu.ac.ir
S. Ersahin
Department of Forest Engineering, Faculty of Forestry,
Çankırı Karatekin University, 18100 Çankırı, Turkey
E. N. Hepper
Facultad de Agronomía, UNLPam, cc 300,
6300 Santa Rosa, Argentina
and they have performed at least as well as other tech-
niques and overcome the problem of introducing statistical
uncertainties into PTFs [17,18]. Although efforts have
been made to develop PTFs by regression methods, to
our knowledge, no research has been conducted to evaluate
the performance of ANNs, especially ANN ensembles in
estimating SSA from readily available soil data. This mo-
tivated us to use the ensemble method that combines a
number of individual ANNs.
SSA can be predicted by developing PTFs that relate it to soil survey variables that are relatively easy to measure. Even so, finding new input parameters that improve the estimation of SSA without additional time or cost remains a challenging issue.
The fractal theory has been widely used to characterize
soil properties [19]. Some authors [20] suggested that the
fractal dimension of the PSD can be useful to quantify the
relationships between soil texture and related soil properties
and processes. Ersahin et al. [21] have used fractal dimen-
sion to predict SSA; however, no one has investigated the
efficiency of fractal parameters (FPs) and conventional PSD
curve parameters (PSDCPs) to predict SSA by ANNs.
Therefore, it would be practical and economical to use
the ANN ensembles procedure together with calculated FPs
and PSDCPs to predict soil SSA. More importantly, these
models should improve the accuracy of predicted SSA data
and aid in populating the database, which will benefit all
users of soil survey data for agricultural, industrial, and
environmental applications.
The objectives of this study were: (1) to calculate the FPs
and PSDCPs using limited soil texture data and to evaluate
their relationship with measured soil SSA and calculated
total soil-specific surface area (TSS), (2) to develop the
PTFs by ANN ensembles to evaluate the utility of using
calculated FPs and PSDCPs to improve SSA and TSS pre-
dictions, and (3) to compare the efficiency of FPs and
PSDCPs in the prediction of soil SSA and TSS.
2 Materials and Methods
2.1 Datasets
We developed PTFs to predict measured soil SSA and calculated TSS using 17, 22, and 24 SSA measurements taken from Aringhieri et al. [1], Ersahin et al. [21], and Hepper et al. [15], respectively. Basic soil properties of sand, silt, clay,
and organic matter were used along with cation exchange
capacity and SSA to develop PTFs.
Ersahin et al. [21] measured the SSA of the soils by analyzing the retention of ethylene glycol monoethyl ether (EGME), a polar molecule that forms only one layer of molecules on the particle surfaces [22]. Hepper et al. [15] measured the SSA with EGME after sieving the samples (<0.5 mm), treating them with peroxide, saturating them with Ca$^{2+}$, and air drying [9]. Aringhieri et al. [1] used the method of Quirk [23] to measure SSA by water vapor adsorption. The SSA measured by water vapor (H$_2$O) or EGME is known to yield total SSA [1]. Hence, the SSA data used in this study were obtained by both methods. Detailed information about the site descriptions and the analytical methods for soil properties is given in the cited papers.
In this study, we predicted the entire PSD curve from soil texture data, and we calculated FPs from the extended PSD curve data to predict SSA and TSS.
2.2 Modeling PSD
Complete PSD for the 20 size classes (as suggested by Arya
and Paris [24]) were predicted from the sand, silt, and clay
contents, using the Skaggs et al. [25] model:
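$$P(r) = \frac{1}{1 + \left(\dfrac{1}{P(r_0)} - 1\right)\exp\left(-uR^{c}\right)} \tag{1}$$

$$R = \frac{r - r_0}{r_0}, \qquad r \ge r_0 > 0 \tag{2}$$

where $P(r)$ is the mass fraction of soil particles with radii less than $r$, $r_0$ is the lower bound on radii for which the model applies, and $c$ and $u$ are model parameters (hereafter, $c$ and $u$ are called PSDCPs) that can be calculated using the following equations:

$$c = \alpha \ln\frac{v}{w}, \qquad u = -v^{1-\beta}\, w^{\beta} \tag{3}$$

$$v = \ln\frac{\dfrac{1}{P(r_1)} - 1}{\dfrac{1}{P(r_0)} - 1}, \qquad w = \ln\frac{\dfrac{1}{P(r_2)} - 1}{\dfrac{1}{P(r_0)} - 1} \tag{4}$$

$$\alpha = \frac{1}{\ln\dfrac{r_1 - r_0}{r_2 - r_0}}, \qquad \beta = \alpha \ln\frac{r_1 - r_0}{r_0} \tag{5}$$

$$1 > P(r_2) > P(r_1) > P(r_0) > 0; \qquad r_2 > r_1 > r_0 > 0$$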
To implement the method described above, we must select values for $r_0$, $r_1$, and $r_2$. We used $r_0$ = 1 μm, $r_1$ = 25 μm, and $r_2$ = 999 μm. According to the USDA particle-size classification system, these radii specify that $P(r_0$ = 1 μm) is the clay mass fraction, $P(r_1$ = 25 μm) is the clay+silt fraction, and $P(r_2$ = 999 μm) is the clay+silt+sand fraction. These were the only data available in the datasets used in this study. Note that the $r_2$ = 999 μm used with the Skaggs et al. [25] model is not exactly the 1,000 μm that the USDA specifies for the clay+silt+sand mass fraction. However, as 999 μm is very close to the upper bound of the sand fraction (1,000 μm), we believe that this assumption does not cause much error in the fractal scaling.
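As an illustration of Eqs. 1 to 5, the following minimal Python sketch computes the curve parameters $c$ and $u$ from three cumulative mass fractions and then evaluates $P(r)$. The function names, the example soil, and the choice of $P(r_2)$ = 0.999 (a value just below 1, as required by the constraint $1 > P(r_2)$) are assumptions made for this example only and are not taken from the original study.

```python
import math

def skaggs_parameters(p0, p1, p2, r0=1.0, r1=25.0, r2=999.0):
    """Return the PSD curve parameters (c, u) of Eqs. 3-5.

    p0, p1, p2 are the cumulative mass fractions P(r0), P(r1), P(r2);
    radii are in micrometres.
    """
    if not (0.0 < p0 < p1 < p2 < 1.0):
        raise ValueError("need 0 < P(r0) < P(r1) < P(r2) < 1")
    if not (0.0 < r0 < r1 < r2):
        raise ValueError("need 0 < r0 < r1 < r2")
    v = math.log((1.0 / p1 - 1.0) / (1.0 / p0 - 1.0))
    w = math.log((1.0 / p2 - 1.0) / (1.0 / p0 - 1.0))
    alpha = 1.0 / math.log((r1 - r0) / (r2 - r0))
    beta = alpha * math.log((r1 - r0) / r0)
    c = alpha * math.log(v / w)
    # v and w are both negative, so u = -v^(1-beta) * w^beta is evaluated
    # in the algebraically equivalent form -v * (v / w)^(-beta), which
    # avoids raising a negative number to a fractional power.
    u = -v * (v / w) ** (-beta)
    return c, u

def skaggs_psd(r, p0, c, u, r0=1.0):
    """Cumulative mass fraction of particles with radius below r (Eq. 1)."""
    big_r = (r - r0) / r0
    return 1.0 / (1.0 + (1.0 / p0 - 1.0) * math.exp(-u * big_r ** c))

# Hypothetical soil: 20 % clay, 40 % silt; P(r2) assumed just below 1.
p0, p1, p2 = 0.20, 0.60, 0.999
c, u = skaggs_parameters(p0, p1, p2)
curve = [skaggs_psd(r, p0, c, u) for r in (1.0, 5.0, 25.0, 100.0, 999.0)]
```

By construction, the fitted curve passes through the three input points, so evaluating it at $r_0$, $r_1$, and $r_2$ returns the supplied fractions; intermediate radii give the interpolated size classes.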
The Pore–Solid Fractal model of Bird et al. [26] has been
applied to PSD data:
$$M_s(d \le d_i) = \mathit{cB}\, d_i^{\,3-D} \tag{6}$$

where $M_s(d \le d_i)$ is the cumulative mass of particles below an upper limit, $d_i$; $D$ is the fractal dimension of the PSD; and $\mathit{cB}$ is a composite scaling constant. Hereafter, $D$ and $\mathit{cB}$ will be referred to as fractal parameters (FPs).
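One common way to obtain $D$ and cB from Eq. 6 is a straight-line fit on log-log axes, since $\ln M_s = \ln cB + (3 - D)\ln d_i$. The sketch below assumes this fitting approach (the text does not state which fitting procedure was used), and the function name and synthetic data are hypothetical.

```python
import math

def fit_psf(sizes, cumulative_mass):
    """Fit M(d <= d_i) = cB * d_i**(3 - D) by least squares on log-log axes.

    Returns the pair (D, cB).
    """
    xs = [math.log(d) for d in sizes]
    ys = [math.log(m) for m in cumulative_mass]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                  # slope equals 3 - D
    intercept = mean_y - slope * mean_x  # intercept equals ln(cB)
    return 3.0 - slope, math.exp(intercept)

# Synthetic check: generate data from known parameters and recover them.
true_d, true_cb = 2.8, 0.25
sizes = [2.0, 5.0, 20.0, 50.0, 200.0, 1000.0]
masses = [true_cb * d ** (3.0 - true_d) for d in sizes]
d_hat, cb_hat = fit_psf(sizes, masses)
```

With real data, the cumulative masses would come from the 20 predicted size classes rather than from a known power law, so the fit would not be exact.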
2.3 Calculation of Soil’s TSS
TSS can be predicted by calculations based on the sizes,
shapes, and relative quantity of different types of soil parti-
cles [27]. We used predicted (extended) PSD classes to
calculate TSS of soil mineral fractions. Sand and silt parti-
cles are assumed to be spherical and clay particles are
assumed to be platy. For a sphere of radius r, the ratio of
surface to mass is
$$SS_i = \frac{3}{\rho_s r_i} \tag{7}$$

For a platy particle of thickness $x$, the ratio of surface to mass is

$$SS_j = \frac{2}{\rho_s x_j} \tag{8}$$

Here, $SS_i$ is the SSA of the $i$th class of spherical particles and $SS_j$ is the SSA of the $j$th class of platy particles, $\rho_s$ is the particle density, $r_i$ is the mean radius of the $i$th class of spherical particles, and $x_j$ is the mean thickness of the $j$th class of platy particles. The approximate TSS can be calculated by the summation equation:

$$\mathrm{TSS}\ (\mathrm{m^2/g}) = \sum_{i=1}^{n} c_i\, SS_i + \sum_{j=1}^{m} c_j\, SS_j \tag{9}$$

Here, $c_i$ and $c_j$ are the mass fractions of particles of average radius $r_i$ and average thickness $x_j$, respectively. The variables $n$ and $m$ are the numbers of classes of spherical and platy particles, respectively.
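Equations 7 to 9 can be sketched in a few lines of Python. The particle density of 2.65 g cm$^{-3}$ (2.65 × 10$^6$ g m$^{-3}$), the three size classes, and the plate thickness below are illustrative assumptions, not values reported in the study; lengths are in metres so that the result is in m$^2$ g$^{-1}$.

```python
def specific_surface_sphere(radius_m, density_g_m3):
    """Surface-to-mass ratio of a sphere (Eq. 7), in m^2 per g."""
    return 3.0 / (density_g_m3 * radius_m)

def specific_surface_plate(thickness_m, density_g_m3):
    """Surface-to-mass ratio of a thin plate (Eq. 8), in m^2 per g."""
    return 2.0 / (density_g_m3 * thickness_m)

def total_specific_surface(spherical, platy, density_g_m3=2.65e6):
    """Mass-weighted sum over size classes (Eq. 9).

    spherical: iterable of (mass_fraction, mean_radius_m)
    platy:     iterable of (mass_fraction, mean_thickness_m)
    """
    tss = sum(c * specific_surface_sphere(r, density_g_m3) for c, r in spherical)
    tss += sum(c * specific_surface_plate(x, density_g_m3) for c, x in platy)
    return tss

# Hypothetical three-class soil (assumed values for illustration).
spherical = [(0.40, 250e-6), (0.40, 10e-6)]  # sand-like and silt-like classes
platy = [(0.20, 1e-6)]                        # clay-like plates
tss = total_specific_surface(spherical, platy)
```

In this example the platy class contributes most of the total even though it holds the smallest mass fraction, which reflects the inverse dependence on particle size in Eqs. 7 and 8.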
Measured and calculated values of specific surface area
(SSA and TSS) were regressed against each of the FPs (D
and cB) to determine the relationship between them. To
assess the nature of the relationship between SSA and/or
TSS and FPs, different types of regression equations were
considered.
2.4 ANN Ensembles
The 63 samples taken from Aringhieri et al. [1], Ersahin et al. [21], and Hepper et al. [15] were randomly partitioned into three subsets: a training set of 37 samples, a cross-validation set of 10 samples, and a testing set of 16 samples. The PTFs were developed using ANN ensembles.
We selected input data randomly in all ANN models. For every PTF, 80 models were developed using two types of ANNs, feed-forward multilayer perceptrons and generalized regression neural networks, to predict the soil SSA and TSS. The performance of both types of ANNs was evaluated; each type was run with one hidden layer and with 3 to 12 hidden neurons. In developing the individual models, we followed the procedure of Minasny and McBratney [28], who developed the “neuropath” software to create PTFs. Accordingly, each of the 80 models for predicting the soil SSA and TSS combined an ANN with the bootstrap method [29]. The data were resampled randomly 50 times to obtain 50 bootstrap datasets of the same size as the training dataset. For each bootstrap dataset, a network was trained and the soil SSA and TSS were predicted. The final estimate of an individual model was taken as the mean of the 50 predictions [30].
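The bootstrap step can be sketched as follows. To keep the example self-contained, an ordinary least-squares line stands in for the neural network; this substitution, the function names, and the fixed random seed are assumptions for illustration only.

```python
import random

def fit_line(xs, ys):
    """Stand-in learner: ordinary least-squares line (not an ANN)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        # Degenerate resample (all x identical): predict the mean.
        return lambda x: mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return lambda x: mean_y + slope * (x - mean_x)

def bagged_predict(xs, ys, x_new, n_boot=50, seed=0):
    """Mean prediction over n_boot models, each trained on a bootstrap sample."""
    rng = random.Random(seed)
    n = len(xs)
    predictions = []
    for _ in range(n_boot):
        # Draw n indices with replacement: same size as the training set.
        idx = [rng.randrange(n) for _ in range(n)]
        model = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        predictions.append(model(x_new))
    return sum(predictions) / n_boot

# Hypothetical noise-free data, y = 2x + 1.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
estimate = bagged_predict(xs, ys, 4.5)
```

Replacing `fit_line` with a routine that trains a network on each resample would give the structure described in the text, with 50 resamples per individual model.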
Several transfer functions including tanh, exponential,
logistic, identity, and sine were tested in hidden and
output layers to achieve the greatest accuracy and reliabil-
ity. According to Baker and Ellison [31], the root mean
square error (RMSE) tends to steady state when the num-
ber of ANN members in the PTF is greater than 5 or 6.
We evaluated the effect of the number of ANN ensemble
members on the RMSE of the ensemble models, and
behaving conservatively, selected the 20 most successful
ANN models from 80 developed to create an ANN
ensemble model.
We combined the predictions of the individual ANN members by simple averaging (all members in the ensemble are assigned an equal weight) and by weighted averaging, weighted by the testing error. In weighting, the ANNs with the smallest errors are given more weight. The weighted average $X$ is:

$$X = \sum_{i=1}^{N} \frac{\sum_{j=1}^{N} E_j - E_i}{(N-1)\sum_{j=1}^{N} E_j}\; p_i \tag{10}$$

where $p_1, p_2, \ldots, p_i, \ldots, p_N$ ($N$ is the number of ensemble members) are several independent, unbiased estimates of SSA or TSS, and $E_i$ is the sum of squared error at optimization of the $i$th ANN.
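A short Python sketch of the weighting in Eq. 10 follows; the function names and the two-member example are hypothetical. Each weight is proportional to the total error minus the member's own error, so the weights sum to one and members with smaller errors count more.

```python
def ensemble_weights(errors):
    """Weights of Eq. 10: (sum(E) - E_i) / ((N - 1) * sum(E))."""
    n = len(errors)
    if n < 2:
        raise ValueError("at least two ensemble members are required")
    total = sum(errors)
    return [(total - e) / ((n - 1) * total) for e in errors]

def weighted_average(predictions, errors):
    """Error-weighted ensemble prediction for one sample."""
    weights = ensemble_weights(errors)
    return sum(w * p for w, p in zip(weights, predictions))

# Hypothetical example: the member with the smaller error dominates.
weights = ensemble_weights([1.0, 3.0])
combined = weighted_average([10.0, 20.0], [1.0, 3.0])
```

When all members have the same error, the weights reduce to $1/N$ and the result coincides with simple averaging.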
2.5 Development of Pedotransfer Functions
Since $D$, SSA, and TSS had non-normal distributions, $30^{D-2}$, log SSA, and log TSS were used to normalize them, respectively. All variables were standardized to have zero mean and unit variance. To predict SSA, six PTFs were developed. PTF1 was based on the basic soil properties (i.e., sand, silt, clay, and organic matter contents). PTF2 used FPs ($D$ and cB) as additional inputs. To build PTF3, PSDCPs (i.e., $c$ and $u$ in Eq. 3) were introduced to the model along with the basic soil properties. To develop PTF4, only FPs were used as inputs, and to develop PTF5, only PSDCPs were used as inputs. We used the soil cation exchange capacity as an input along with basic soil properties to develop PTF6. The same procedure was followed to develop another six PTFs to predict TSS. The performances of all the PTFs were evaluated and compared with each other.
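The transformations and standardization described above can be sketched as follows. The base-10 logarithm and the use of the sample standard deviation are assumptions (the text specifies neither), and the input values are hypothetical.

```python
import math

def transform_fractal_dimension(d):
    """Normalizing transform for D used in the text: 30 ** (D - 2)."""
    return 30.0 ** (d - 2.0)

def standardize(values):
    """Rescale to zero mean and unit (sample) variance."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

# Hypothetical inputs within the ranges of Table 1.
d_values = [2.42, 2.70, 2.80, 2.96]
ssa_values = [18.0, 95.0, 130.0, 524.0]

d_std = standardize([transform_fractal_dimension(d) for d in d_values])
ssa_std = standardize([math.log10(s) for s in ssa_values])  # log base assumed
```

Predictions made on the standardized log scale would need the inverse operations (multiply by the standard deviation, add the mean, then exponentiate) to return to m$^2$ g$^{-1}$.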
2.6 Sensitivity Analyses
According to Donigian and Rao [32], sensitivity analysis is the “degree to which the model result is affected by changes in a selected model input.” The sensitivity coefficient of an input variable can be calculated by making small changes in that input variable while keeping all the other inputs constant and then dividing the change in the output variable by the change in the input variable [33]. The basic idea is that the inputs of the network are shifted slightly and the corresponding change in the output is reported either as a percentage or as a raw difference [34].
The response of the output to a change of one standard deviation in an input is a good measure of sensitivity. Consequently, the sensitivity coefficient of the output variables (SSA and/or TSS) to a given input variable was approximated by varying that input variable within the range of mean ± standard deviation while keeping all the other input variables constant, and then dividing the resulting standard deviation of the output variable by the standard deviation of that input variable [34]. The sensitivity analyses were done to determine the relative importance of the input variables, and all the changes in the outputs were reported as percentages.
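One possible reading of this procedure is sketched below. It assumes that the other inputs are held at their means, that the chosen input is swept over an evenly spaced grid from mean − SD to mean + SD, and that the percentages are each input's share of the summed coefficients; none of these details is stated in the text, and the toy model is hypothetical.

```python
import statistics

def sensitivity_coefficient(model, rows, column, n_steps=21):
    """SD of the model output divided by SD of the perturbed input.

    rows:  list of input rows (lists of floats)
    model: callable mapping one row to a float
    """
    n_cols = len(rows[0])
    means = [statistics.mean([row[j] for row in rows]) for j in range(n_cols)]
    sd = statistics.stdev([row[column] for row in rows])
    # Grid spanning mean - SD to mean + SD for the chosen input.
    grid = [means[column] - sd + 2.0 * sd * k / (n_steps - 1)
            for k in range(n_steps)]
    outputs = []
    for value in grid:
        probe = list(means)   # all other inputs held at their means
        probe[column] = value
        outputs.append(model(probe))
    return statistics.stdev(outputs) / statistics.stdev(grid)

def toy_model(row):
    """Hypothetical stand-in for a trained PTF."""
    return 2.0 * row[0] - 0.5 * row[1]

data = [[1.0, 10.0], [2.0, 30.0], [4.0, 20.0], [5.0, 40.0]]
coeffs = [sensitivity_coefficient(toy_model, data, j) for j in range(2)]
shares = [100.0 * c / sum(coeffs) for c in coeffs]  # relative importance, %
```

For a linear model the coefficient equals the absolute slope of the perturbed input, which makes the toy example easy to check by hand.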
2.7 Evaluation Criteria
Three criteria, namely the Akaike information criterion
(AIC) [35], RMSE and relative improvement (RI) were used
to evaluate the reliability of PTFs.
We conducted the Morgan–Granger–Newbold (MGN) test (Eq. 11) [36] to determine whether the differences between the corresponding indices for the various PTFs (1–6) are significant and whether they can be regarded as an improvement from one PTF to the next. The equation applied is the following:

$$\mathrm{MGN} = \frac{\rho_{sd}}{\sqrt{\dfrac{1 - \rho_{sd}^{2}}{N - 1}}} \tag{11}$$

where $N$ is the number of samples; $\rho_{sd}$ is the correlation coefficient between $s_t = e_{1,t} + e_{2,t}$ and $d_t = e_{1,t} - e_{2,t}$; $e_{1,t}$ and $e_{2,t}$ represent the forecast errors from the first and second competing models, respectively; and $(N-1)$ is the degrees of freedom. If the forecasts are equally accurate, then $\rho_{sd}$ will be zero and, consequently, the MGN value will be zero. The greater the difference in accuracy between the forecasts of the two competing models, the greater the MGN value.
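The evaluation criteria can be sketched in Python as follows. The relative improvement is assumed here to be the percentage reduction in RMSE relative to a reference PTF, which is consistent with the tabulated values but is not defined explicitly in the text; the MGN statistic uses the standard form with $\rho_{sd}^2$ under the square root. AIC is omitted because its exact form is not given. Function names and example errors are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def relative_improvement(rmse_reference, rmse_candidate):
    """Assumed definition: percentage reduction in RMSE versus a reference."""
    return 100.0 * (rmse_reference - rmse_candidate) / rmse_reference

def pearson(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def mgn_statistic(errors_1, errors_2):
    """Morgan-Granger-Newbold statistic for two sets of forecast errors."""
    s = [a + b for a, b in zip(errors_1, errors_2)]
    d = [a - b for a, b in zip(errors_1, errors_2)]
    rho = pearson(s, d)
    n = len(s)
    return rho / math.sqrt((1.0 - rho ** 2) / (n - 1))

# Hypothetical forecast errors of two competing models.
equal = mgn_statistic([1.0, 1.0, -1.0, -1.0], [1.0, -1.0, 1.0, -1.0])
worse_first = mgn_statistic([2.0, -2.0, 2.0, -2.0], [1.0, -1.0, 1.0, -2.0])
ri = relative_improvement(0.203, 0.178)
```

The statistic is positive when the first model has the larger error variance and negative in the opposite case, so its sign indicates which model is more accurate.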
3 Results and Discussion
Soil samples used in this study had a high variation due to
variations in land use, parent material, and climate (Table 1).
Distribution of soil textures in the USDA textural triangle is
Table 1 Descriptive statistics of soils used for training and testing

        Sand   Silt   Clay   CEC              Organic      c      u      cB      D      SSA          TSS
        (%)    (%)    (%)    (cmol_c kg^-1)   matter (%)                                (m^2 g^-1)   (m^2 g^-1)
Train
Mean    32.7   39.1   28.3   21.1             2.1          0.42   0.65   13.70   2.80   130          0.37
SD      22.4   15.0   18.9   15.1             2.3          0.08   0.61   8.90    0.12   101          0.18
Min     0.2    15.0   2.3    4.1              0.1          0.17   0.11   0.78    2.42   18           0.06
Max     80.8   84.5   73.0   78.5             15.3         0.64   4.32   31.72   2.96   524          0.83
Test
Mean    37.5   38.0   24.4   22.2             4.9          0.43   0.56   11.54   2.76   120          0.32
SD      23.8   15.1   16.8   10.9             8.9          0.08   0.29   7.81    0.15   74           0.16
Min     9.0    9.7    2.4    6.2              0.1          0.33   0.14   0.58    2.39   25           0.05
Max     86.3   57.3   54.0   43.4             37.6         0.62   1.06   24.39   2.92   325          0.56

CEC cation exchange capacity, c and u coefficients of PSD curve model, D fractal dimension, cB constant of fractal model, SSA measured specific surface area, TSS calculated specific surface area
shown in Fig. 1. SSA ranged from 18 to 524 m$^2$ g$^{-1}$. The TSS values ranged from 0.05 to 0.83 m$^2$ g$^{-1}$, with means of 0.37 and 0.32 m$^2$ g$^{-1}$ for the training and testing data, respectively. The values of TSS are substantially lower than those of SSA (Table 1). This shows that SSA was underpredicted when it was calculated solely from the PSD obtained with the Skaggs et al. model. This result was expected to some extent, since several factors that influence SSA, such as particle shape [27], mineralogical composition [2], and soil organic matter [37], were not included in our calculation of TSS.
The relatively high correlation ($R^2$ = 0.705) between TSS and SSA (Fig. 2) is promising. SSA is an operationally defined concept that depends on the measurement technique and sample preparation [38], and its measurement is difficult, costly, and time consuming. It may therefore be easy, useful, and economical to calculate TSS from sand, silt, and clay contents with the Skaggs et al. model and then to use this calculated value in a suitable equation (i.e., an exponential equation) to predict SSA. In addition, TSS can be used as secondary information to predict other soil properties that are difficult to measure, such as the soil water retention curve and the cation exchange capacity [39].
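A two-step prediction of this kind could be sketched as follows: an exponential relation SSA = a·exp(b·TSS) is fitted by linear regression on ln(SSA) and then applied to a new TSS value. The coefficients and data below are synthetic and hypothetical; the fitted equation of Fig. 2 is not reproduced here.

```python
import math

def fit_exponential(x, y):
    """Fit y = a * exp(b * x) by least squares on ln(y); returns (a, b)."""
    ln_y = [math.log(v) for v in y]
    n = len(x)
    mean_x = sum(x) / n
    mean_ln_y = sum(ln_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_ln_y) for xi, yi in zip(x, ln_y))
    b = sxy / sxx
    a = math.exp(mean_ln_y - b * mean_x)
    return a, b

def predict_ssa(tss, a, b):
    """Apply the fitted exponential relation to a calculated TSS value."""
    return a * math.exp(b * tss)

# Synthetic data generated from assumed coefficients (illustration only).
a_true, b_true = 20.0, 4.0
tss_values = [0.05, 0.20, 0.40, 0.60, 0.83]
ssa_values = [a_true * math.exp(b_true * t) for t in tss_values]
a_hat, b_hat = fit_exponential(tss_values, ssa_values)
ssa_new = predict_ssa(0.30, a_hat, b_hat)
```

Fitting on the log scale minimizes relative rather than absolute errors, which may or may not match the regression used for Fig. 2.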
3.1 Correlation Analysis
Pearson correlation analysis was performed to evaluate the relations between SSA and/or TSS and the input variables (Table 2). The results suggested that there were strong correlations between FPs and SSA (r = 0.66 (p<0.001) for cB and r = 0.65 (p<0.001) for D). To investigate the relation between SSA and FPs, linear, exponential, logarithmic, polynomial, and power regression analyses were performed. All curvilinear regression types described the relations better than the linear regression (data not shown). Relations between SSA and D and/or cB were best described by second-degree polynomial and exponential equations, respectively (r = 0.827 and 0.833 for D and cB, respectively) (Fig. 3). Ersahin et al. [21] noted a similarly good correlation when investigating the relationship between SSA and D calculated from measured PSD. Stronger correlations were found between TSS and FPs (r = 0.90 (p<0.001) for cB and r = 0.96 (p<0.001) for D) (Table 2).
Fig. 1 Textural distribution of studied soils on the USDA soil textural triangle (axes: sand, silt, and clay contents, 0–100 %)
Fig. 2 Relationship between measured soil-specific surface area (SSA)
and calculated soil-specific surface area (TSS)
The correlations between SSA and PSDCPs (r = 0.25 (p<0.05) for u and r = −0.24 for c) were lower than the correlations between SSA and FPs, and the correlation was not significant for c (Table 2). We calculated D by the fractal model of Bird et al. [26] from the estimated PSD data and regressed our calculated D values against the D values obtained by Ersahin et al. [21] for 22 selected soils (Fig. 4). Our calculated D was highly correlated ($R^2$ = 0.925) with the D obtained by Ersahin et al. [21] (Fig. 4). This shows the efficiency of the Skaggs et al. model in simulating the entire PSD from sand, silt, and clay contents.
3.2 Number of ANN Ensemble Members
The mean training and testing RMSE of the ensemble models versus the number of members are plotted in Fig. 5. As the number of ANN members increased from 1 to 5, the RMSE of the ensemble models improved substantially. This shows that combining individual ANNs into an ensemble improved the results over a single ANN and increased the predictive ability of the model. However, the RMSE tends to a steady state when the number of ANN members in the model is higher than 5, which indicates that the best number of members in an ANN ensemble is 5. This is in agreement with Baker and Ellison [31], who showed that the best number of ANN ensemble members is 5 or 6.
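A curve of RMSE against ensemble size, as in Fig. 5, can be computed with a sketch like the one below. It uses simple averaging over the first $k$ members and tiny hypothetical predictions; the function name and the ordering of members are assumptions.

```python
import math

def ensemble_rmse_curve(member_predictions, observed):
    """RMSE of the simple-average ensemble built from the first k members,
    for k = 1 .. number of members."""
    n_obs = len(observed)
    curve = []
    for k in range(1, len(member_predictions) + 1):
        members = member_predictions[:k]
        mean_pred = [sum(m[i] for m in members) / k for i in range(n_obs)]
        squared = sum((p - o) ** 2 for p, o in zip(mean_pred, observed))
        curve.append(math.sqrt(squared / n_obs))
    return curve

# Hypothetical members whose errors cancel when averaged.
observed = [1.0, 2.0, 3.0]
members = [
    [2.0, 3.0, 4.0],  # overestimates by 1
    [0.0, 1.0, 2.0],  # underestimates by 1
    [1.0, 2.0, 3.0],  # exact
]
curve = ensemble_rmse_curve(members, observed)
```

The point at which the curve flattens indicates how many members are needed before adding more brings little further reduction in error.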
Table 2 Pearson correlation coefficient (r) between input and output variables

                Sand      Silt      Clay      CEC       Organic matter  c         u         D         cB        SSA
Silt            −0.58**   1.00
Clay            −0.76**   −0.08     1.00
CEC             −0.48**   0.01      0.59**    1.00
Organic matter  0.21      −0.06     −0.22     0.24      1.00
c               0.56**    −0.86**   −0.01     −0.15     −0.06           1.00
u               −0.40**   0.72**    −0.08     0.32**    0.23            −0.77**   1.00
D               −0.94**   0.30*     0.92**    0.57**    −0.22           −0.32*    0.24      1.00
cB              −0.91**   0.23      0.93**    0.64**    −0.16           −0.34**   0.26*     0.98**    1.00
SSA             −0.62**   0.21      0.59**    0.77**    0.26*           −0.24     0.25*     0.65**    0.66**    1.00
TSS             −0.98**   0.52**    0.78**    0.45**    −0.28*          −0.45**   0.35**    0.96**    0.90**    0.59**

CEC cation exchange capacity, c and u coefficients of PSD curve model, D fractal dimension, cB constant of fractal model, SSA measured specific surface area, TSS calculated specific surface area
*, ** Significant at the p = 0.05 and p = 0.01 levels, respectively
Fig. 3 Relationship between measured soil specific surface area (SSA) and (a) fractal dimension (D) and (b) constant of fractal model (cB)
Fig. 4 Relationship between measured and calculated fractal dimension
(D)
3.3 Comparing Combining Methods
The results (Tables 3 and 4) showed that combining individual models by weighted averaging always produced predictions at least as accurate as those of the ANN ensembles whose individual models were combined by simple averaging. This is in agreement with Perrone and Cooper [40] and Guber et al. [31]. However, Tables 3 and 4 show no considerable differences between the predictions of the two methods for most ANN ensembles, except for a substantial difference for PTF3 in the estimation of SSA (Table 3). In conclusion, unlike weighted averaging, simple averaging deteriorated the performance of PTF3 in the prediction of SSA.
3.4 Developing PTFs to Predict SSA and TSS
In this section, ANN ensembles were used to develop PTFs to estimate SSA and TSS. Six PTFs were developed for each; the results for the validation data are summarized in Tables 3 and 4.
In PTF2, FPs were used along with basic soil properties to estimate SSA. This improved the prediction of SSA significantly (Table 3). Introducing PSDCPs along with basic soil properties in PTF3 gave different results with the weighted averaging and simple averaging methods. With weighted averaging, introducing PSDCPs improved the SSA estimation by 11.1 %, although this improvement was not significant (Table 3). With simple averaging, however, the SSA estimation deteriorated by 24.8 %. The underlying reason for this result may be the significant correlations between PSDCPs and sand and silt contents (Table 2).
It should be noted that using FPs and PSDCPs as the only inputs in PTF4 and PTF5, respectively, resulted in deterioration of the PTFs' performance during validation (Table 3).
This result confirmed our hypothesis that SSA may be one
of the properties that is controlled by the fractal behaviour
of PSD. In fact, the fractal theory could develop a more
complete description of soil structure and processes, which
is impossible to be done by conventional methods based on
Euclidean geometry [41].
Bayat et al. [42] successfully used FPs to estimate the soil
water retention curve by ANNs and the multiobjective
group method of data handling. Ersahin et al. [21] related
the cation exchange capacity and SSA of the soils to the
fractal dimension of PSD using a regression method.
The prediction of SSA was improved significantly by the
use of the soil cation exchange capacity as a predictor along
with basic soil properties (Table 3). This is in agreement
with several authors [6,43] who found a close correlation
between cation exchange capacity and SSA.
In the same way as for SSA, we developed six PTFs to predict TSS; the results for the testing data are summarized in Table 4. The prediction of
Fig. 5 Mean RMSE versus
number of ANN ensemble
members for training and
testing. The bars show the
mean RMSE resulted from
ANN ensembles
Table 3 Results of SSA predictions using different input variables for the testing dataset

        Combined by weighted averaging          Combined by simple averaging
        AIC      RMSE     RI (%)   MGN          AIC      RMSE     RI (%)   MGN
PTF1    −49.0    0.203                          −49.0    0.203
PTF2    −53.3    0.178    12.5     2.21*        −53.2    0.178    12.5     2.15*
PTF3    −52.8    0.181    11.1     1.18         −41.9    0.254    −24.8    0.55
PTF4    −46.5    0.220    −8.1     0.59         −46.5    0.220    −8.0     0.59
PTF5    −38.1    0.285    −40.4    3.65*        −38.1    0.286    −40.4    3.65*
PTF6    −61.2    0.139    31.7     4.85*        −61.1    0.139    31.5     4.87*

SSA measured specific surface area, AIC Akaike information criterion, RMSE root mean square error, RI relative improvement, MGN Morgan–Granger–Newbold
*p<0.05, significant differences between PTF1 and other PTFs
TSS was improved significantly by using FPs and PSDCPs
as predictors along with basic soil properties in PTF2 and
PTF3, respectively (Table 4). These improvements were
more considerable than those for SSA. However, contrary
to TSS, use of PSDCPs did not improve the prediction of
SSA, significantly.
Contrary to predicting SSA, the prediction of TSS was improved significantly when FPs were the only predictors in PTF4 (Table 4). Depending on the method used, the measured SSA may show variations for a given soil [37]. In this study, various methods were used to measure SSA, while an identical method was used to calculate TSS. Consequently, the PTFs developed to estimate SSA were less reliable than the PTFs developed to estimate TSS. Similar to predicting SSA, the prediction of TSS was improved significantly by the use of cation exchange capacity as a predictor along with basic soil properties (Table 4).
3.5 Sensitivity Analysis
Sensitivity analysis was performed to investigate the potential importance of FPs and PSDCPs in PTF2 and PTF3, respectively, in the prediction of SSA and TSS. The results are shown in Fig. 6. In PTF2, FPs (D and cB as input variables) described 44 % of the variation in SSA, while these input variables described 92.8 % of the variation in TSS (Fig. 6). This shows the usefulness of FPs in describing the relation between SSA and/or TSS and soil texture. That FPs influence TSS more than SSA may be due to the superiority of fractal theory in describing the relation between soil texture and TSS.
In PTF3, PSDCPs (c and u as input variables) explained 42 % of the variation in SSA, while these input variables described only 6.6 % of the variation in TSS (Fig. 6). In comparison with fractal parameters, PSDCPs were less effective in the prediction of SSA and TSS.
The results indicated the efficiency of ANN ensembles in
developing the PTFs to predict SSA using preferable input
variables such as FPs. The fractal approach is a useful tool
to describe the processes of porous medium [20,44]. The
use of FPs and/or PSDCPs as predictors improved SSA
and/or TSS predictions without any cost. In this study, FPs
and PSDCPs were calculated only from sand, silt and clay
contents of the soils. Therefore, it could be a simple, eco-
nomic, and practical procedure to improve the SSA and/or
TSS predictions without additional measurements.
4 Conclusions
Prediction of SSA and/or TSS was improved significantly by introducing FPs and PSDCPs to an ANN ensemble. Strong correlations were found between FPs and SSA and/or TSS and, to a lesser degree, between PSDCPs and SSA and/or TSS. When FPs and PSDCPs were used as predictors along with basic soil properties, the TSS predictions improved significantly. The prediction for TSS was better than for SSA when FPs were used as the only predictors, and this may be attributed to the same technique being used to calculate TSS for all soils, in comparison with the different
Table 4 Results of TSS predictions using different input variables for the testing dataset

        Combined by weighted averaging          Combined by simple averaging
        AIC      RMSE     RI (%)   MGN          AIC      RMSE     RI (%)   MGN
PTF1    −73.3    0.095                          −72.2    0.098
PTF2    −83.7    0.069    27.7     10.12*       −80.0    0.077    21.6     8.32*
PTF3    −83.4    0.069    27.0     5.40*        −82.8    0.071    28.0     5.79*
PTF4    −98.2    0.044    54.0     6.01*        −98.0    0.044    55.3     6.29*
PTF5    −54.6    0.170    −79.6    3.30*        −53.4    0.177    −80.0    3.33*
PTF6    −86.0    0.064    32.6     13.23*       −83.6    0.069    29.8     12.50*

TSS calculated specific surface area, AIC Akaike information criterion, RMSE root mean square error, RI relative improvement, MGN Morgan–Granger–Newbold
*p<0.05, significant differences between PTF1 and other PTFs
Fig. 6 Sensitivity analysis of output variables, (a) SSA in PTF2, (b) TSS in PTF2, (c) SSA in PTF3, and (d) TSS in PTF3, to the input variables (OM organic matter, c and u coefficients of PSD curve model, D fractal dimension, cB constant of fractal model, SSA measured specific surface area, TSS calculated specific surface area)
methods used to obtain SSA data. The prediction of SSA
and/or TSS was improved significantly by using soil
cation exchange capacity as a predictor along with basic soil
properties. Using FPs as inputs explained 44 and 92.8 % of the
variation in SSA and TSS, respectively. Compared with FPs,
PSDCPs were less effective in predicting SSA and TSS.
We conclude that FPs are useful in quantifying the relations
between soil texture and related soil properties and
processes. The procedure proved to be an economical and
powerful technique for predicting SSA and/or TSS with ANN
ensembles, without additional measurements. Therefore, we
conclude that ANN ensembling can be used as a new
technique for predicting SSA and TSS values.
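As an illustration of the two combination schemes compared in Table 4, the following Python sketch combines the outputs of several ensemble members by simple and by weighted averaging and then computes RMSE and a relative improvement. Several points are assumptions rather than details taken from the study: the weights are taken proportional to the inverse of each member's mean square error, RI is taken as the percent reduction in RMSE relative to a reference model, and the function names and the small data set are hypothetical.

```python
import numpy as np


def rmse(observed, predicted):
    """Root mean square error between observations and predictions."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))


def simple_average(member_predictions):
    """Combine ensemble members (one row per member) by an unweighted mean."""
    return np.mean(np.asarray(member_predictions, dtype=float), axis=0)


def weighted_average(member_predictions, member_rmse):
    """Combine ensemble members with weights proportional to 1 / MSE.

    Assumption: inverse-MSE weighting is one common choice; the weights
    used in the study may have been derived differently.
    """
    preds = np.asarray(member_predictions, dtype=float)
    weights = 1.0 / np.square(np.asarray(member_rmse, dtype=float))
    weights /= weights.sum()  # normalise so the weights sum to one
    return weights @ preds


def relative_improvement(rmse_reference, rmse_model):
    """Percent reduction in RMSE relative to a reference model (assumed form)."""
    return 100.0 * (rmse_reference - rmse_model) / rmse_reference


# Hypothetical data: three ensemble members predicting four samples.
observed = np.array([0.10, 0.20, 0.30, 0.40])
members = np.array([
    [0.12, 0.18, 0.33, 0.41],
    [0.08, 0.23, 0.29, 0.37],
    [0.15, 0.25, 0.35, 0.45],
])
member_rmse = [rmse(observed, m) for m in members]
combined_simple = simple_average(members)
combined_weighted = weighted_average(members, member_rmse)

# RMSE values of PTF1 and PTF4 from Table 4 (weighted averaging).
ri_ptf4 = relative_improvement(0.095, 0.044)
```

With the rounded RMSE values from Table 4, this assumed formula gives about 53.7 for PTF4, close to the tabulated RI of 54.0; the small gap is of the size that rounding of the RMSE values alone could produce.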
Acknowledgments The authors are deeply grateful to the two anonymous
reviewers and the Associate Editor for their helpful comments, which
have considerably improved this paper. The authors thank Dr. Aringhieri
from Dipartimento di Informatica, Università degli Studi di Torino, Corso
Svizzera 185, I-10149 Torino, Italy, for the data he kindly provided.
References
1. Aringhieri, A., Pardini, G., Gispert, M., & Sole, A. (1992). Testing
a simple methylene blue method for surface area estimation in
soils. Agrochimica, XXXVI–N. 3, 224–232.
2. Mitchell, J. K. (1993). Fundamentals of soil behavior. New York:
Wiley.
3. Lutenegger, A. J., & Cerato, A. B. (2001). Surface area and
engineering properties of finegrained soils. 15th International
Conference on Soil Mechanics and Foundation Engineering (pp.
603–606). Istanbul, Turkey: A.A. Balkema Publishers.
4. Morgenstern, N.R., Balasubramanian, B.I. (1980). Effects of pore
fluid on the swelling of clay-shale. In: 4th International Conference
Expansive Soils. pp. 190–205.
5. Sridharan, A., Rao, S. M., & Murthy, N. S. (1986). Compressibil-
ity behavior of homoionized bentonites. Geotechnique, 36, 551–
564.
6. de Jong, E. (1999). Comparison of three methods of measuring
surface area of soils. Canadian Journal of Soil Science, 79, 345–
351.
7. Nixon, J. F. (1991). Discrete ice lens theory for frost heave in soils.
Canadian Geotechnical Journal., 28, 843–859.
8. Warkentin, B. P. (1972). Use of the liquid limit in characterizing
the clay soils. Canadian Journal of Soil Science, 52, 457–464.
9. D.L. Carter, M.M. Mortland, W.D. Kemper (1986). Specific sur-
face. In: A. Klute, (Ed.). Methods of soils analysis Part 1: physical
and mineralogical methods. 2nd ed. Serie Agronomy No. 9. ASA-
SSSA ed. Soil Science Society of America Madison, pp. 413–423.
10. Daniels, J. L., Inyang, H. I., & Brochu, M., Jr. (2004). Specific
surface area of barrier mixtures at various outgas temperatures.
Journal of Environmental Engineering, 130, 867–872.
11. Urai, J. L., Van Oort, E., & Van Der Zee, W. (1997). Correlations
to predict the mechanical properties of mudrocks from wireline
logs and drill cuttings. The EGS XXII General Assembly Vienna:
Annales Geophysica., 15, 143–155.
12. Koorevaar, P., Menelik, G., & Dirksen, C. (1983). Elements of soil
physics. Amsterdam: Elsevier.
13. Theng, B. K. G., Ristori, G. G., Santi, C. A., & Percival, H. J.
(1999). An improved method for determining the specific surface
areas of topsoils with varied organic matter content, texture and
clay mineral composition. European Journal of Soil Science, 50,
309–316.
14. Arnepalli, D. N., Shanthakumar, S., Hanumantha Rao, B., &
Singh, D. N. (2008). Comparison of methods for determining
specific-surface area of fine-grained soils. Geotechnical and Geo-
logical Engineering, 26, 121–132.
15. Hepper, E. N., Buschiazzo, D. E., Hevia, G. G., Urioste, A., &
Antón, L. (2006). Clay mineralogy, cation exchange capacity and
specific surface area of loess soils with different volcanic ash
contents. Geoderma, 135, 216–223.
16. Yukselen, Y., & Kaya, A. (2006). Comparison of methods for
determining specific surface area of soils. Journal of Geotechnical
and Geoenvironmental Engineering., 132, 931–936.
17. Schaap, M. G., & Bouten, W. (1996). Modeling water retention
curves of sandy soils using neural networks. Water Resources
Research, 32, 3033–3040.
18. Koekkoek, E. J. W., & Booltink, H. (1999). Neural network
models to predict soil water retention. European Journal of Soil
Science, 50, 489–495.
19. Crawford, J. W., Sleeman, B. D., & Young, I. M. (1993). On the
relation between number-size distributions and the fractal dimen-
sion of aggregates. Journal of Soil Science., 44, 555–565.
20. Hwanga Ii, S., Leeb, K. P., Leeb, D. S., & Powers, S. E. (2002).
Models for estimating soil particle-size distributions. Soil Science
Society of America Journal, 66, 1143–1150.
21. Ersahin, S., Gunal, H., Kutlu, T., Yetgin, B., & Coban, S. (2006).
Estimating specific surface area and cation exchange capacity in
soils using fractal dimension of particle-size distribution.
Geoderma, 136, 588–597.
22. Cerato, A. B., & Lutenegger, A. J. (2002). Determination of
surface area of fine-grained soils by the ethylene glycol monoethyl
ether (EGME) method. Geotechnical Testing Journal, 25, 315–
321.
23. .Quirk, J.P. (1955). Significance of surface area calculated from
water-vapour sorption isotherms by use of the B.E.T. equation.
Soil Science. 8:423.
24. Arya, L. M., & Paris, J. F. (1981). Physicoempirical model to
predict the soil moisture characteristic from particle-size distribu-
tion and bulk density data. Soil Science Society of America Jour-
nal, 45, 1023–1030.
25. Skaggs, T. H., Arya, L. M., Shouse, P. J., & Mohanty, B. P. (2001).
Estimating particle-size distribution from limited soil texture data.
Soil Science Society of America Journal, 65, 1038–1044.
26. Bird, N. R. A., Perrier, E., & Rieu, M. (2000). The water retention
function for a model of soil structure with pore and solid fractal
distributions. European Journal of Soil Science, 51,55–63.
27. Hillel, D. (2004). Introduction to environmental soil physics. Am-
sterdam: Academic.
28. Minasny, B., & McBratney, A. B. (2002). The neuro-M method for
fitting neural network parametric pedotransfer functions. Soil Sci-
ence Society of America Journal, 66, 352–361.
29. Efron, B., & Tibshirani, R. J. (1993). An introduction to the
bootstrap. Monographs on Statistics and Applied Probability.
London: Chapman and Hall.
30. Schaap, M. G., Leij, F. J., & Van Genuchten, M. T. (1998).
Neural network analysis for hierarchical prediction of soil
hydraulic properties. Soil Science Society of America Journal,
62,847–855.
31. Baker, L., & Ellison, D. (2008). Optimisation of pedotransfer
functions using an artificial neural network ensemble method.
Geoderma, 144, 212–224.
32. Donigian, A. S., & Rao, P. S. C. (1986). Overview of terrestrial
processes and modeling. In S. C. Hern & S. M. Melancon (Eds.),
Vadose zone modeling of organic pollutants (pp. 3–36). Chelsea:
Lewis Publishers.
Estimation of SSA by ANN Ensembles Using FPs as Predictors 613
Author's personal copy
33. Zheng, C., & Bennett, G. D. (1995). Applied contaminant trans-
port modeling: theory and practice. New York: Van Nostrand
Reinhold.
34. NeuroDimension, Inc. (2005). NeuroSolutions. Getting Started
Manual Version 4. Gainesville: NeuroDimension, Inc.
35. Akaike, H. (1974). New look at the statistical model identification.
IEEE Transactions on Automatic Control, AC-19, 716–723.
36. Diebold, F. X., & Mariano, R. S. (2002). Comparing predictive
accuracy. Journal of Business and Economic Statistics., 20, 134–
144.
37. Yukselen-Aksoy, Y., & Kaya, A. (2010). Method dependency of
relationships between specific surface area and soil physicochem-
ical properties. Applied Clay Science., 50, 182–190.
38. Pennell, K. D. (2002). Specific surface area. In J. H. Dane & G. C.
Topp (Eds.), Methods of soil analysis. Part 4: physical methods.
Madison: Soil Science Society of America.
39. H. Bayat, N. Davatgat, S. Moallemi. (2012). Using of specific
surface to improve the prediction of Soil CEC by Artificial Neural
Networks. Soil and Water Knowledge Journal, 21(4), 105–119.
40. Perrone, M. P., & Cooper, L. N. (1993). When networks disagree:
ensemble method for neural networks. In R. J. Mammone (Ed.),
Neural networks for speech and image processing (pp. 126–142).
London: Chapman and Hall.
41. Sokołowska, Z., Hajnos, M., Hoffmann, C., Renger, M., &
Sokołowski, S. (2001). Comparison of fractal dimensions of soils
estimated from adsorption isotherms, mercury intrusion, and par-
ticle size distribution. Journal of Plant Nutrition and Soil Science.,
164, 591–599.
42. Bayat, H., Neyshabouri, M. R., Mohammadi, K., & Nariman-
Zadeh, N. (2011). Estimating water retention with pedotransfer
functions using multi-objective group method of data handling
and ANNs. Pedosphere, 21, 107–114.
43. Curtin, D., & Smillie, G. W. (1976). Estimation of components of soil
cation exchange capacity from measurements of specific surface and
organic matter. Soil Science Society of America Journal, 40, 461–462.
44. Xu, Y. F., & Dong, P. (2004). Fractal approach to hydraulic
properties in unsaturated porous media. Chaos, Solitons and Frac-
tals, 19, 327–337.
614 H. Bayat et al.
Author's personal copy