Environmental Modeling & Assessment

ISSN 1420-2026

Volume 18, Number 5

Environ Model Assess (2013) 18:605-614

DOI 10.1007/s10666-013-9366-2

Improving Estimation of Specific Surface Area by Artificial Neural Network Ensembles Using Fractal and Particle Size Distribution Curve Parameters as Predictors

Hossein Bayat, Sabit Ersahin & Estela N. Hepper



Received: 7 April 2012 / Accepted: 11 March 2013 / Published online: 26 March 2013

© Springer Science+Business Media Dordrecht 2013

Abstract Specific surface area (SSA) is one of the principal

soil properties used in modeling soil processes. In this study,

artificial neural network (ANN) ensembles were evaluated to

predict SSA. Complete soil particle-size distribution was esti-

mated from sand, silt, and clay fractions using the model by

Skaggs et al. and then the particle-size distribution curve

parameters (PSDCPs) and fractal parameters were calculated.

The PSDCPs were used to predict 20 particle-size classes for a

soil sample’s particle size distribution. Fractal parameters were

calculated by the model of Bird et al. In addition, total soil-

specific surface area (TSS) was calculated using the above 20

size classes. Pedotransfer functions were developed for SSA

and TSS using ANN ensembles from 63 pieces of SSA data

taken from the literature. Fractal parameters, PSDCPs, and

some other soil properties were used to predict SSA and

TSS. Introducing fractal parameters and PSDCPs improved

the SSA estimations by 12.5 and 11.1 %, respectively. The

improvements were even better for TSS estimations (27.7 and

27.0 %, respectively). The use of fractal parameters as estima-

tors described 44 and 92.8 % of the variation in SSA and TSS,

respectively, while PSDCPs explained 42 and 6.6 % of the

variation in SSA and TSS, respectively. The results suggested

that fractal parameters and PSDCPs could be successfully used

as predictors in ANN ensembles to predict SSA and TSS.

Keywords Artificial neural networks · Ensemble · Fractal parameters · Particle-size distribution · Specific surface area · Prediction

1 Introduction

Specific surface area (SSA) is one of the principal soil prop-

erties for agricultural, industrial, and environmental applica-

tions that are related to the physical and chemical properties of

a porous medium [1]. Previous studies have indicated that

SSA is closely related to and has a determining influence on

many soil properties such as grain-size distribution (i.e., the

clay-size fraction), mineralogical composition [2], consisten-

cy limits [3], swelling and shrinkage characteristics [4], com-

pressibility characteristics [5], cation exchange capacity, clay

content [6], frost heave [7], water retention characteristics [8],

water movement, soil aggregation [9], activity [2], sorption

and desorption characteristics [10], and angle of the internal

friction of soils [11]. Moreover, processes such as contaminant

accumulation, nutrient dynamics, and chemical transport are

greatly influenced by SSA [12].

Contrary to soil properties such as organic matter, pH, and

particle-size distribution (PSD), SSA is not measured routine-

ly [13]; therefore, data for surface area are unavailable in most

of the databases. In addition, direct measurement techniques

for SSA are time consuming, labor intensive, expensive, technically difficult, and require skilled personnel [14,15]. Moreover, methods to measure SSA are diverse and their results are generally not comparable [16], owing to errors arising from either inherent limitations of the instrument employed or the basic assumptions applied in the mathematical models that are used for computing SSA. Hence, finding a simple, economical, and rapid methodology that predicts reliable values for soil SSA is essential.

Project supported by the Bu Ali Sina University, Hamadan, Iran.

H. Bayat (*)
Department of Soil Science, Faculty of Agriculture, Bu Ali Sina University, Hamadan, Iran
e-mail: h.bayat@basu.ac.ir

S. Ersahin
Department of Forest Engineering, Faculty of Forestry, Çankırı Karatekin University, 18100 Çankırı, Turkey

E. N. Hepper
Facultad de Agronomía, UNLPam, cc 300, 6300 Santa Rosa, Argentina

In the past 15 years, many pedotransfer functions (PTFs) have been developed using artificial neural networks (ANNs), and they have performed at least as well as other techniques and overcome the problem of introducing statistical

uncertainties into PTFs [17,18]. Although efforts have

been made to develop PTFs by regression methods, to

our knowledge, no research has been conducted to evaluate

the performance of ANNs, especially ANN ensembles in

estimating SSA from readily available soil data. This mo-

tivated us to use the ensemble method that combines a

number of individual ANNs.

SSA can be predicted by developing PTFs that relate relatively easy-to-measure soil survey variables to SSA. Even so, finding new input parameters that improve the estimation of SSA without additional time or cost remains a challenging issue.

The fractal theory has been widely used to characterize

soil properties [19]. Some authors [20] suggested that the

fractal dimension of the PSD can be useful to quantify the

relationships between soil texture and related soil properties

and processes. Ersahin et al. [21] have used fractal dimen-

sion to predict SSA; however, no one has investigated the

efficiency of fractal parameters (FPs) and conventional PSD

curve parameters (PSDCPs) to predict SSA by ANNs.

Therefore, it would be practical and economical to use

the ANN ensembles procedure together with calculated FPs

and PSDCPs to predict soil SSA. More importantly, these

models should improve the accuracy of predicted SSA data

and aid in populating the database, which will benefit all

users of soil survey data for agricultural, industrial, and

environmental applications.

The objectives of this study were: (1) to calculate the FPs

and PSDCPs using limited soil texture data and to evaluate

their relationship with measured soil SSA and calculated

total soil-specific surface area (TSS), (2) to develop the

PTFs by ANN ensembles to evaluate the utility of using

calculated FPs and PSDCPs to improve SSA and TSS pre-

dictions, and (3) to compare the efficiency of FPs and

PSDCPs in the prediction of soil SSA and TSS.

2 Materials and Methods

2.1 Datasets

We developed PTFs to predict measured soil SSA and calculated TSS using 17, 22, and 24 SSA measurements taken from Aringhieri et al. [1], Ersahin et al. [21], and Hepper et al. [15], respectively. Basic soil properties of sand, silt, clay,

and organic matter were used along with cation exchange

capacity and SSA to develop PTFs.

Ersahin et al. [21] measured the SSA of the soils by analyzing the retention of ethylene glycol monoethyl ether (EGME), which is a polar molecule that forms only one layer of molecules on the particle surfaces [22]. Hepper et al. [15] measured the SSA with EGME after sieving the samples (<0.5 mm), treating them with peroxide, saturating them with Ca²⁺, and air drying [9]. Aringhieri et al. [1] used the method of Quirk [23] to measure SSA by water vapor adsorption. The SSA measured by water vapor (H₂O) or EGME is known to yield total SSA [1]. Hence, SSA values obtained by both methods were used together in this study. Detailed information about the site descriptions and the analytical methods for soil properties is given in the related papers.

In this study, we predicted the entire PSD curve from soil

texture data, and we calculated FPs from the extended PSD

curve data to predict SSA and TSS.

2.2 Modeling PSD

Complete PSD for the 20 size classes (as suggested by Arya

and Paris [24]) were predicted from the sand, silt, and clay

contents, using the Skaggs et al. [25] model:

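$$P(r) = \frac{1}{1 + \left(\frac{1}{P(r_0)} - 1\right)\exp\left(-uR^{c}\right)} \tag{1}$$

$$R = \frac{r - r_0}{r_0}, \quad r \ge r_0 > 0 \tag{2}$$

where P(r) is the mass fraction of soil particles with radii less than r, $r_0$ is the lower bound on radii for which the model applies, and c and u are model parameters (hereafter called PSDCPs) that can be calculated using the following equations:

$$c = \alpha \ln\frac{v}{w}, \quad u = -v^{1-\beta} w^{\beta} \tag{3}$$

$$v = \ln\frac{\frac{1}{P(r_1)} - 1}{\frac{1}{P(r_0)} - 1}, \quad w = \ln\frac{\frac{1}{P(r_2)} - 1}{\frac{1}{P(r_0)} - 1} \tag{4}$$

$$\alpha = \frac{1}{\ln\frac{r_1 - r_0}{r_2 - r_0}}, \quad \beta = \alpha \ln\frac{r_1 - r_0}{r_0} \tag{5}$$

$$1 > P(r_2) > P(r_1) > P(r_0) > 0, \quad r_2 > r_1 > r_0 > 0$$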

To implement the method described above, we must select values for $r_0$, $r_1$, and $r_2$. We used $r_0$ = 1 μm, $r_1$ = 25 μm, and $r_2$ = 999 μm. According to the USDA particle-size classification system, these radii specify that P($r_0$ = 1 μm) is the clay mass fraction, P($r_1$ = 25 μm) is the clay+silt fraction, and P($r_2$ = 999 μm) is the clay+silt+sand fraction. These were the only data available in the datasets used in this study. However, the radius $r_2$ = 999 μm used by the Skaggs et al. [25] model is not exactly the same as the 1,000 μm specified by the USDA for the clay+silt+sand mass fraction. Because 999 μm is very close to the upper bound of the sand fraction (1,000 μm), we believe that this assumption does not cause much error in the fractal scaling.
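As an illustration of Eqs. 1 to 5, the following minimal Python sketch computes c and u from three cumulative fractions and then evaluates P(r). The function names are hypothetical, and the value P(r2) = 0.999 in the usage note is an assumption made only so that the constraint 1 > P(r2) holds; the text does not state which value was used. Because v and w are both negative, the sketch evaluates u through the algebraically equivalent form u = −v / R1^c (with R1 = (r1 − r0)/r0) rather than raising negative numbers to real powers.

```python
import math

def skaggs_parameters(p0, p1, p2, r0=1.0, r1=25.0, r2=999.0):
    """Return (c, u) of the PSD curve model from P(r0), P(r1), P(r2).

    Radii are in micrometres; fractions are cumulative mass fractions.
    """
    if not (0.0 < p0 < p1 < p2 < 1.0 and 0.0 < r0 < r1 < r2):
        raise ValueError("need 0 < P(r0) < P(r1) < P(r2) < 1 and 0 < r0 < r1 < r2")
    base = 1.0 / p0 - 1.0
    v = math.log((1.0 / p1 - 1.0) / base)            # Eq. 4 (negative)
    w = math.log((1.0 / p2 - 1.0) / base)            # Eq. 4 (negative)
    alpha = 1.0 / math.log((r1 - r0) / (r2 - r0))    # Eq. 5
    c = alpha * math.log(v / w)                      # Eq. 3
    # Eq. 3 gives u = -v^(1-beta) * w^beta; the line below is the same
    # quantity written as -v / R1^c, which stays in real arithmetic.
    u = -v / ((r1 - r0) / r0) ** c
    return c, u

def skaggs_psd(r, p0, c, u, r0=1.0):
    """Cumulative mass fraction of particles with radius below r (Eqs. 1-2)."""
    big_r = (r - r0) / r0
    return 1.0 / (1.0 + (1.0 / p0 - 1.0) * math.exp(-u * big_r ** c))
```

For example, `skaggs_parameters(0.283, 0.674, 0.999)` uses the mean training-set clay and clay+silt fractions from Table 1 together with the assumed P(r2); the fitted curve then passes through the three supplied points by construction.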

The Pore–Solid Fractal model of Bird et al. [26] has been

applied to PSD data:

$$M_s(d \le d_i) = \mathit{cB}\, d_i^{\,3-D} \tag{6}$$

where $M_s(d \le d_i)$ is the cumulative mass of particles below an upper limit $d_i$, D is the fractal dimension of the PSD, and cB is a composite scaling constant. Hereafter, D and cB will be referred to as fractal parameters (FPs).
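Taking logarithms of Eq. 6 gives log M = log cB + (3 − D) log d, so D and cB can be estimated from a straight-line fit. The sketch below assumes an ordinary least-squares fit in log-log space; the text does not state which fitting procedure was used, and the function name is hypothetical.

```python
import math

def fit_fractal_parameters(diameters, cumulative_mass):
    """Estimate (D, cB) of Eq. 6 by least squares on log-transformed data."""
    xs = [math.log(d) for d in diameters]
    ys = [math.log(m) for m in cumulative_mass]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # slope equals 3 - D
    intercept = mean_y - slope * mean_x  # intercept equals log(cB)
    return 3.0 - slope, math.exp(intercept)
```

Applied to the cumulative masses of the 20 predicted size classes, this would return one (D, cB) pair per soil sample.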

2.3 Calculation of Soil’s TSS

TSS can be predicted by calculations based on the sizes,

shapes, and relative quantity of different types of soil parti-

cles [27]. We used predicted (extended) PSD classes to

calculate TSS of soil mineral fractions. Sand and silt parti-

cles are assumed to be spherical and clay particles are

assumed to be platy. For a sphere of radius r, the ratio of

surface to mass is

$$\mathrm{SS}_i = \frac{3}{\rho_s r_i} \tag{7}$$

For a platy particle of thickness x, the ratio of surface to mass is

$$\mathrm{SS}_j = \frac{2}{\rho_s x_j} \tag{8}$$

Here, $\mathrm{SS}_i$ is the SSA of the ith class of spherical particles and $\mathrm{SS}_j$ is the SSA of the jth class of platy particles, $\rho_s$ is the particle density, $r_i$ is the mean radius of the ith class of spherical particles, and $x_j$ is the mean thickness of the jth class of platy particles. The approximate TSS can be calculated by the summation equation:

$$\mathrm{TSS}\ (\mathrm{m^2/g}) = \sum_{i=1}^{n} c_i \mathrm{SS}_i + \sum_{j=1}^{m} c_j \mathrm{SS}_j \tag{9}$$

Here, $c_i$ and $c_j$ are the mass fractions of particles of average radius $r_i$ and average thickness $x_j$, respectively. The variables n and m are the numbers of classes of spherical and platy particles, respectively.
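Equations 7 to 9 amount to a mass-weighted sum over size classes, which can be sketched as follows. The particle density of 2.65 g cm⁻³ (2.65 × 10⁶ g m⁻³) is an assumed default, not a value given in the text, and the function name and the class sizes in the usage note are hypothetical.

```python
def total_specific_surface(spherical, platy, particle_density=2.65e6):
    """Approximate TSS in m^2/g from Eqs. 7-9.

    spherical: iterable of (mass_fraction, mean_radius_m) pairs
    platy:     iterable of (mass_fraction, mean_thickness_m) pairs
    particle_density: g per m^3 (2.65e6 is an assumed value)
    """
    tss = sum(c * 3.0 / (particle_density * r) for c, r in spherical)   # Eq. 7
    tss += sum(c * 2.0 / (particle_density * x) for c, x in platy)      # Eq. 8
    return tss
```

For instance, `total_specific_surface([(0.7, 1e-4)], [(0.3, 1e-6)])` treats 70 % of the mass as spheres of 100 μm radius and 30 % as plates 1 μm thick; the platy class dominates the result because surface per unit mass scales inversely with particle size.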

Measured and calculated values of specific surface area

(SSA and TSS) were regressed against each of the FPs (D

and cB) to determine the relationship between them. To

assess the nature of the relationship between SSA and/or

TSS and FPs, different types of regression equations were

considered.

2.4 ANN Ensembles

The 63 samples taken from Aringhieri et al. [1], Ersahin et al. [21], and Hepper et al. [15] were partitioned randomly into three subsets: a training set of 37 samples, a cross-validation set of 10 samples, and a testing set of 16 samples. The PTFs were developed using ANN ensembles.

We selected input data randomly in all ANN models. For every PTF, 80 models were developed to predict the soil SSA and TSS using two types of ANNs: feed-forward multilayer perceptrons and generalized regression neural networks. The performance of both types was evaluated; each type was run with one hidden layer and with the number of hidden neurons ranging from 3 to 12. In developing the individual models, we followed the procedure of Minasny and McBratney [28], who developed the “neuropath” software to create PTFs. Accordingly, each of the 80 models was developed by combining an ANN with the bootstrap method [29]. The input data were resampled randomly 50 times to obtain 50 bootstrap datasets of the same size as the training dataset. For each bootstrap dataset, a network was trained and the soil SSA and TSS were predicted. The mean of the 50 predictions was taken as the final estimate of an individual model [30].
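The bootstrap step described above can be sketched in a few lines. To keep the example self-contained, a one-variable least-squares line stands in for the neural network; this substitution is purely illustrative and is not the learner used in the study. The function names and the fixed seed are likewise hypothetical.

```python
import random

def fit_line(xs, ys):
    """Stand-in learner (NOT an ANN): least-squares line, returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:  # degenerate resample in which every x is identical
        return 0.0, mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x

def bagged_predict(train_x, train_y, new_x, n_boot=50, seed=0):
    """Mean prediction over n_boot models, each trained on a bootstrap resample."""
    rng = random.Random(seed)
    n = len(train_x)
    predictions = []
    for _ in range(n_boot):
        # Draw n indices with replacement: same size as the training set.
        idx = [rng.randrange(n) for _ in range(n)]
        slope, intercept = fit_line([train_x[i] for i in idx],
                                    [train_y[i] for i in idx])
        predictions.append(slope * new_x + intercept)
    return sum(predictions) / n_boot
```

Replacing `fit_line` with a routine that trains a network on the resampled data would give the structure described in the text: 50 resamples, 50 trained models, and one averaged prediction per individual model.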

Several transfer functions, including tanh, exponential, logistic, identity, and sine, were tested in the hidden and output layers to achieve the greatest accuracy and reliability. According to Baker and Ellison [31], the root mean square error (RMSE) tends to a steady state when the number of ANN members in the PTF is greater than 5 or 6. We evaluated the effect of the number of ANN ensemble members on the RMSE of the ensemble models and, to be conservative, selected the 20 most successful ANN models from the 80 developed to create an ANN ensemble model.

We combined the predictions of the individual ANN members by simple averaging (all members in the ensemble are assigned an equal weight) and by weighted averaging, in which members are weighted by the testing error. In weighting, ANNs with the smallest errors are given more weight. The weighted average X is:

$$X = \frac{\sum_{i=1}^{N}\left(\sum_{k=1}^{N} E_k - E_i\right) P_i}{(N-1)\sum_{k=1}^{N} E_k} \tag{10}$$

where $P_1, P_2, \ldots, P_i, \ldots, P_N$ (N is the number of ensemble members) are several independent, unbiased estimates of SSA or TSS, and $E_i$ is the sum of squared errors at optimization of the ith ANN.
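The weighting in Eq. 10 gives member i the weight (ΣE − E_i) / ((N − 1) ΣE), and these weights sum to one. A minimal sketch follows, with a hypothetical function name; it assumes at least two members and a positive total error.

```python
def weighted_ensemble(predictions, errors):
    """Combine member predictions with the weights implied by Eq. 10."""
    n = len(predictions)
    total = sum(errors)
    if n < 2 or n != len(errors) or total <= 0.0:
        raise ValueError("need >= 2 members, matching lengths, positive total error")
    # Members with smaller error receive larger weights; weights sum to 1.
    weights = [(total - e) / ((n - 1) * total) for e in errors]
    return sum(w * p for w, p in zip(weights, predictions))
```

When all members have the same error, the weights are equal and the result reduces to the simple average.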

2.5 Development of Pedotransfer Functions

Since D, SSA, and TSS had non-normal distributions, $30^{D-2}$, log SSA, and log TSS were used to normalize them, respectively. All variables were standardized to have a zero mean and unit variance. To predict SSA, six PTFs were developed. PTF1 was based on the basic soil properties (i.e., sand, silt, clay, and organic matter contents). PTF2 used FPs (D and cB) as additional inputs. To build PTF3, PSDCPs (i.e., c and u in Eq. 3) were introduced to the model along with the basic soil properties. To develop PTF4, only FPs were used as inputs, and to develop PTF5, only PSDCPs were used as inputs. We used the soil cation exchange capacity as an input along with basic soil properties to develop PTF6. The same procedure was followed to develop the other six PTFs to predict TSS. The performances of all the PTFs were evaluated and compared with each other.
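The six input sets can be written down compactly, as in the sketch below. It is illustrative only: the column names (`sand`, `om`, `cec`, and so on) and the helper `standardize` are hypothetical, and the use of the sample standard deviation is an assumption, since the text states only that variables were standardized to zero mean and unit variance.

```python
import statistics

BASIC = ["sand", "silt", "clay", "om"]  # hypothetical column names

PTF_INPUTS = {
    "PTF1": BASIC,                # basic soil properties only
    "PTF2": BASIC + ["D", "cB"],  # basic properties plus fractal parameters
    "PTF3": BASIC + ["c", "u"],   # basic properties plus PSD curve parameters
    "PTF4": ["D", "cB"],          # fractal parameters only
    "PTF5": ["c", "u"],           # PSD curve parameters only
    "PTF6": BASIC + ["cec"],      # basic properties plus cation exchange capacity
}

def standardize(values):
    """Rescale a variable to zero mean and unit (sample) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD is an assumption
    return [(v - mean) / sd for v in values]
```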

2.6 Sensitivity Analyses

According to Donigian and Rao [32], sensitivity analysis is

the “degree to which the model result is affected by changes

in a selected model input.” The sensitivity coefficient of an

input variable can be calculated, making small changes in

the input variable of particular focus while keeping all the

other inputs constant and then dividing the change in the

output variable by the change in the input variable [33]. The

basic idea is that the inputs of the network are shifted

slightly and the corresponding change in the output is

reported either as a percentage or as a raw difference [34].

The response of the output to a change of one standard deviation in an input is a good measure in sensitivity analysis. Consequently, the sensitivity coefficient of the

output variables (SSA and/or TSS) to a given input variable

was approximated by causing changes in the specified input

variable within the range of mean±standard deviation values

while keeping all the other input variables constant, and then

dividing the resulting standard deviation of the output variable

by the standard deviation of that specified input variable [34].

The sensitivity analyses were done to determine the relative

importance of the input variables, and all the changes in the

outputs were reported as a percentage.
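The procedure can be sketched as follows for any prediction function. Several details are assumptions rather than statements from the text: the other inputs are held at their means, the varied input is swept over an evenly spaced grid, the standard deviation in the denominator is that of the swept values, and the percentages are obtained by normalizing the absolute coefficients so that they sum to 100. All names are hypothetical.

```python
import statistics

def sensitivity_coefficient(model, means, sds, index, steps=21):
    """SD of the output divided by SD of one input swept over mean +/- SD."""
    low = means[index] - sds[index]
    high = means[index] + sds[index]
    varied = [low + (high - low) * k / (steps - 1) for k in range(steps)]
    outputs = []
    for value in varied:
        inputs = list(means)   # all other inputs held at their means (assumption)
        inputs[index] = value
        outputs.append(model(inputs))
    return statistics.pstdev(outputs) / statistics.pstdev(varied)

def relative_importance(coefficients):
    """Express sensitivity coefficients as percentages of their total."""
    total = sum(abs(c) for c in coefficients)
    return [100.0 * abs(c) / total for c in coefficients]
```

For a linear model the coefficient recovered for each input equals the absolute value of its slope, which gives a quick sanity check before applying the routine to a trained ensemble.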

2.7 Evaluation Criteria

Three criteria, namely the Akaike information criterion

(AIC) [35], RMSE and relative improvement (RI) were used

to evaluate the reliability of PTFs.

We conducted the Morgan–Granger–Newbold (MGN) test (Eq. 11) [36] to determine whether the differences in the corresponding indices for the various PTFs (1–6) are significant and can be regarded as an improvement from one PTF to the next. The equation applied is the following:

$$\mathrm{MGN} = \frac{\rho_{sd}}{\sqrt{\dfrac{1-\rho_{sd}^{2}}{N-1}}} \tag{11}$$

where N is the number of samples; $\rho_{sd}$ is the correlation coefficient between $s_t = e_{1,t} + e_{2,t}$ and $d_t = e_{1,t} - e_{2,t}$; $e_{1,t}$ and $e_{2,t}$ represent the forecast errors from the first and second competing models, respectively; and (N−1) is the number of degrees of freedom. If the forecasts are equally accurate, then $\rho_{sd}$ will be zero and, consequently, the MGN value will be zero. The greater the difference in accuracy between the forecasts from two competing models, the greater the MGN value.
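A minimal sketch of the RMSE, RI, and MGN calculations is given below. The text does not define RI explicitly; the form used here, the percentage reduction in RMSE relative to a reference PTF, is an assumption that approximately reproduces the RI values in Table 3. The MGN statistic uses the standard form of the test with the squared correlation under the root. Function names are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def relative_improvement(rmse_reference, rmse_new):
    """Percentage reduction in RMSE relative to a reference PTF (assumed form)."""
    return 100.0 * (rmse_reference - rmse_new) / rmse_reference

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def mgn_statistic(errors_1, errors_2):
    """MGN statistic from the forecast errors of two competing models."""
    s = [e1 + e2 for e1, e2 in zip(errors_1, errors_2)]
    d = [e1 - e2 for e1, e2 in zip(errors_1, errors_2)]
    rho = pearson(s, d)
    n = len(s)
    return rho / math.sqrt((1.0 - rho ** 2) / (n - 1))
```

With the Table 3 RMSE values for PTF1 (0.203) and PTF6 (0.139), `relative_improvement` returns about 31.5, close to the tabulated 31.7; the small gap is consistent with rounding of the RMSE values.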

3 Results and Discussion

Soil samples used in this study had a high variation due to

variations in land use, parent material, and climate (Table 1).

Distribution of soil textures in the USDA textural triangle is shown in Fig. 1. SSA ranged from 18 to 524 m² g⁻¹. The TSS values ranged from 0.05 to 0.83 m² g⁻¹, with means of 0.37 and 0.32 m² g⁻¹ for the training and testing data, respectively. The values of TSS are substantially lower than those of SSA (Table 1). This showed that SSA was underpredicted when PSD alone was used with the model of Skaggs et al. [25]. This result was expected to some extent, since several factors influence SSA, such as particle shapes [27], mineralogical composition [2], and soil organic matter [37], that we did not include in our calculations of TSS.

Table 1 Descriptive statistics of soils used for training and testing

| Set | Statistic | Sand (%) | Silt (%) | Clay (%) | CEC (cmolc kg⁻¹) | Organic matter (%) | c | u | cB | D | SSA (m² g⁻¹) | TSS (m² g⁻¹) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Train | Mean | 32.7 | 39.1 | 28.3 | 21.1 | 2.1 | 0.42 | 0.65 | 13.70 | 2.80 | 130 | 0.37 |
| Train | SD | 22.4 | 15.0 | 18.9 | 15.1 | 2.3 | 0.08 | 0.61 | 8.90 | 0.12 | 101 | 0.18 |
| Train | Min | 0.2 | 15.0 | 2.3 | 4.1 | 0.1 | 0.17 | 0.11 | 0.78 | 2.42 | 18 | 0.06 |
| Train | Max | 80.8 | 84.5 | 73.0 | 78.5 | 15.3 | 0.64 | 4.32 | 31.72 | 2.96 | 524 | 0.83 |
| Test | Mean | 37.5 | 38.0 | 24.4 | 22.2 | 4.9 | 0.43 | 0.56 | 11.54 | 2.76 | 120 | 0.32 |
| Test | SD | 23.8 | 15.1 | 16.8 | 10.9 | 8.9 | 0.08 | 0.29 | 7.81 | 0.15 | 74 | 0.16 |
| Test | Min | 9.0 | 9.7 | 2.4 | 6.2 | 0.1 | 0.33 | 0.14 | 0.58 | 2.39 | 25 | 0.05 |
| Test | Max | 86.3 | 57.3 | 54.0 | 43.4 | 37.6 | 0.62 | 1.06 | 24.39 | 2.92 | 325 | 0.56 |

CEC cation exchange capacity, c and u coefficients of PSD curve model, D fractal dimension, cB constant of fractal model, SSA measured specific surface area, TSS calculated specific surface area

The relatively high correlation (R² = 0.705) between TSS and SSA (Fig. 2) is promising. Since SSA is an operationally defined concept that depends on the measurement technique and sample preparation [38], and since its measurement is difficult, costly, and time consuming, it may be easy, useful, and economical to calculate TSS from sand, silt, and clay contents by employing the model of Skaggs et al. [25] and then to use this calculated value with a proper equation (i.e., an exponential equation) to predict SSA. In addition, TSS can be used as secondary information to predict other soil properties that are difficult to measure, such as the soil water retention curve and the cation exchange capacity [39].

3.1 Correlation Analysis

Pearson correlation analysis was performed to evaluate the relations between SSA and/or TSS and the input variables (Table 2). The results suggested that there were strong correlations between FPs and SSA (r = 0.66 (p < 0.001) for cB and r = 0.65 (p < 0.001) for D). To investigate the relation between SSA and FPs, linear, exponential, logarithmic, polynomial, and power regression analyses were performed. All curvilinear regression types described the relations better than the linear regression (data not shown). Relations between SSA and D and cB were best described by second-degree polynomial and exponential equations, respectively (r = 0.827 and 0.833 for D and cB, respectively) (Fig. 3). Ersahin et al. [21] noted a similarly good correlation when investigating the relationship between SSA and D calculated from measured PSD. Stronger correlations were found between TSS and FPs (r = 0.90 (p < 0.001) for cB and r = 0.96 (p < 0.001) for D) (Table 2).

Fig. 1 Textural distribution of studied soils on the USDA soil textural triangle

Fig. 2 Relationship between measured soil-specific surface area (SSA) and calculated soil-specific surface area (TSS)

The correlations between SSA and PSDCPs (r = 0.25 (p < 0.05) for u and r = −0.24 for c) were lower than the correlations between SSA and FPs, and the correlation was not significant for c (Table 2). We calculated D by the fractal model of Bird et al. [26] with the estimated PSD data and regressed our calculated D values against the D values obtained by Ersahin et al. [21] for 22 selected soils (Fig. 4). The result showed that our calculated D was highly correlated (R² = 0.925) with the D obtained by Ersahin et al. [21] (Fig. 4). This shows the efficiency of the model of Skaggs et al. [25] in simulating the entire PSD from sand, silt, and clay contents (Fig. 4).

3.2 Number of ANN Ensemble Members

The means for training and testing RMSE of ensemble

models versus number of members are plotted in Fig. 5.

As the number of ANN members is increased from 1 to 5,

there is a substantial improvement in the RMSE of ensemble

models. This shows that combining individual ANNs into an ensemble improved the results over those of single ANNs and increased the predictive ability of the ensemble model. However, the RMSE tends to a steady state

when the number of ANN members in the model is higher

than 5, which shows that the best number of ensemble mem-

bers in an ANN ensemble is 5. This is in agreement with

Baker and Ellison [31] who showed that the best number of

ANN ensemble members is 5 or 6.

Table 2 Pearson correlation coefficient (r) between input and output variables

| | Sand | Silt | Clay | CEC | Organic matter | c | u | D | cB | SSA |
|---|---|---|---|---|---|---|---|---|---|---|
| Silt | −0.58** | 1.00 | | | | | | | | |
| Clay | −0.76** | −0.08 | 1.00 | | | | | | | |
| CEC | −0.48** | 0.01 | 0.59** | 1.00 | | | | | | |
| Organic matter | 0.21 | −0.06 | −0.22 | 0.24 | 1.00 | | | | | |
| c | 0.56** | −0.86** | −0.01 | −0.15 | −0.06 | 1.00 | | | | |
| u | −0.40** | 0.72** | −0.08 | 0.32** | 0.23 | −0.77** | 1.00 | | | |
| D | −0.94** | 0.30* | 0.92** | 0.57** | −0.22 | −0.32* | 0.24 | 1.00 | | |
| cB | −0.91** | 0.23 | 0.93** | 0.64** | −0.16 | −0.34** | 0.26* | 0.98** | 1.00 | |
| SSA | −0.62** | 0.21 | 0.59** | 0.77** | 0.26* | −0.24 | 0.25* | 0.65** | 0.66** | 1.00 |
| TSS | −0.98** | 0.52** | 0.78** | 0.45** | −0.28* | −0.45** | 0.35** | 0.96** | 0.90** | 0.59** |

CEC cation exchange capacity, c and u coefficients of PSD curve model, D fractal dimension, cB constant of fractal model, SSA measured specific surface area, TSS calculated specific surface area

*p = 0.05; **p = 0.01 (significance level)

Fig. 3 Relationship between measured soil specific surface area (SSA) and (a) fractal dimension (D) and (b) constant of fractal model (cB)

Fig. 4 Relationship between measured and calculated fractal dimension (D)

3.3 Comparing Combining Methods

The results (Tables 3 and 4) showed that combining individual models by weighted averaging always produced predictions that were at least as accurate as those of the ANN ensembles whose individual models were combined by simple averaging. This is in agreement with Perrone and Cooper [40] and Guber et al. [31]. However, Tables 3 and 4 show that there are no considerable differences between the predictions of the two methods for most ANN ensembles, except for a substantial difference for PTF3 in SSA estimation (Table 3). In conclusion, unlike weighted averaging, simple averaging deteriorated the performance of PTF3 in the prediction of SSA.

3.4 Developing PTFs to Predict SSA and TSS

In this section, ANN ensembles were used to develop PTFs to estimate SSA and TSS. Six PTFs were developed; the results for the validation data are summarized in Tables 3 and 4.

In PTF2, FPs were used along with basic soil properties to estimate SSA. This improved the prediction of SSA significantly (Table 3). Introducing PSDCPs along with basic soil properties in PTF3 gave different results with the weighted averaging and simple averaging methods. With weighted averaging, introducing PSDCPs improved the SSA estimation by 11.1 %; however, this improvement was not significant (Table 3). In contrast, combining by simple averaging deteriorated the SSA estimation by 24.8 %. The underlying reason for this result may be the significant correlations between PSDCPs and sand and silt contents (Table 2).

It should be noted that using FPs and PSDCPs as the only inputs in PTF4 and PTF5, respectively, resulted in deterioration of PTF performance during validation (Table 3). This result confirmed our hypothesis that SSA may be one of the properties controlled by the fractal behaviour of PSD. Indeed, fractal theory can provide a more complete description of soil structure and processes than is possible with conventional methods based on Euclidean geometry [41].

Bayat et al. [42] successfully used FPs to estimate the soil

water retention curve by ANNs and the multiobjective

group method of data handling. Ersahin et al. [21] related

the cation exchange capacity and SSA of the soils to the

fractal dimension of PSD using a regression method.

The prediction of SSA was improved significantly by the

use of the soil cation exchange capacity as a predictor along

with basic soil properties (Table 3). This is in agreement

with several authors [6,43] who found a close correlation

between cation exchange capacity and SSA.

Fig. 5 Mean RMSE versus number of ANN ensemble members for training and testing. The bars show the mean RMSE resulting from ANN ensembles

Table 3 Results of SSA predictions using different input variables for testing dataset

| PTF | Weighted averaging: AIC | Weighted averaging: RMSE | Weighted averaging: RI (%) | Weighted averaging: MGN | Simple averaging: AIC | Simple averaging: RMSE | Simple averaging: RI (%) | Simple averaging: MGN |
|---|---|---|---|---|---|---|---|---|
| PTF1 | −49.0 | 0.203 | | | −49.0 | 0.203 | | |
| PTF2 | −53.3 | 0.178 | 12.5 | 2.21* | −53.2 | 0.178 | 12.5 | 2.15* |
| PTF3 | −52.8 | 0.181 | 11.1 | 1.18 | −41.9 | 0.254 | −24.8 | 0.55 |
| PTF4 | −46.5 | 0.220 | −8.1 | 0.59 | −46.5 | 0.220 | −8.0 | 0.59 |
| PTF5 | −38.1 | 0.285 | −40.4 | 3.65* | −38.1 | 0.286 | −40.4 | 3.65* |
| PTF6 | −61.2 | 0.139 | 31.7 | 4.85* | −61.1 | 0.139 | 31.5 | 4.87* |

SSA measured specific surface area, AIC Akaike information criterion, RMSE root mean square error, RI relative improvement, MGN Morgan–Granger–Newbold

*p < 0.05, significant differences between PTF1 and other PTFs

In the same way as for SSA, we developed six PTFs to predict TSS; the results for the testing data are summarized in Table 4. The prediction of TSS was improved significantly by using FPs and PSDCPs as predictors along with basic soil properties in PTF2 and PTF3, respectively (Table 4). These improvements were more considerable than those for SSA. However, in contrast to TSS, the use of PSDCPs did not significantly improve the prediction of SSA.

In contrast to SSA, the prediction of TSS was improved significantly when FPs were the only predictors in PTF4 (Table 4). Depending on the method used, the measured SSA may show variations for a given soil [37]. In this study, various methods were used to measure SSA, whereas an identical method was used to calculate TSS. Consequently, the PTFs developed to estimate SSA were less reliable than the PTFs developed to estimate TSS. As with SSA, the prediction of TSS was improved significantly by the use of cation exchange capacity as a predictor along with basic soil properties (Table 4).

3.5 Sensitivity Analysis

Sensitivity analysis was performed to investigate the potential importance of FPs and PSDCPs in PTF2 and PTF3, respectively, in the prediction of SSA and TSS. The results are shown in Fig. 6. In PTF2, FPs (D and cB as input variables) described 44 % of the variation in SSA, while these input variables described 92.8 % of the variation in TSS (Fig. 6). This shows the usefulness of FPs in describing the relation between SSA and/or TSS and soil texture. That FPs influence TSS more than SSA may be due to the superiority of fractal theory in describing the relation between soil texture and TSS.

In PTF3, PSDCPs (c and u as input variables) explained 42 % of the variation in SSA, while these input variables described only 6.6 % of the variation in TSS (Fig. 6). In comparison with fractal parameters, PSDCPs were less effective in the prediction of SSA and TSS.

The results indicated the efficiency of ANN ensembles in

developing the PTFs to predict SSA using preferable input

variables such as FPs. The fractal approach is a useful tool to describe processes in porous media [20,44]. The

use of FPs and/or PSDCPs as predictors improved SSA

and/or TSS predictions without any cost. In this study, FPs

and PSDCPs were calculated only from sand, silt and clay

contents of the soils. Therefore, it could be a simple, eco-

nomic, and practical procedure to improve the SSA and/or

TSS predictions without additional measurements.

4 Conclusions

Prediction of SSA and/or TSS was improved significantly

by introducing FPs and PSDCPs to an ANN assemble.

Strong correlations were found between FPs and SSA

and/or TSS, and to a lower degree, between SSA and/or

TSS and PSDCPs. When FPs and PSDCPs were used as

predictors along with basic soil properties the TSS predic-

tions improved significantly. The prediction for TSS was

better than for SSA when FPs were used as only predictors

and this may attributed to that same technique used to

calculate TSS for all soils in comparison with the different

Table 4 Results of TSS predictions using different input variables for the testing dataset

        Combined by weighted averaging         Combined by simple averaging
        AIC      RMSE     RI (%)   MGN         AIC      RMSE     RI (%)   MGN
PTF1    −73.3    0.095                         −72.2    0.098
PTF2    −83.7    0.069    27.7     10.12*      −80.0    0.077    21.6     8.32*
PTF3    −83.4    0.069    27.0     5.40*       −82.8    0.071    28.0     5.79*
PTF4    −98.2    0.044    54.0     6.01*       −98.0    0.044    55.3     6.29*
PTF5    −54.6    0.170    −79.6    3.30*       −53.4    0.177    −80.0    3.33*
PTF6    −86.0    0.064    32.6     13.23*      −83.6    0.069    29.8     12.50*

TSS calculated specific surface area, AIC Akaike information criterion, RMSE root mean square error, RI relative improvement, MGN Morgan–Granger–Newbold
*p<0.05, significant differences between PTF1 and other PTFs

Fig. 6 Sensitivity analysis of output variables, (a) SSA in PTF2, (b) TSS in PTF2, (c) SSA in PTF3, and (d) TSS in PTF3, to the input variables (OM organic matter, c and u coefficients of PSD curve model, D fractal dimension, cB constant of fractal model, SSA measured specific surface area, TSS calculated specific surface area)


methods used to obtain SSA data. The prediction of SSA and/or TSS was improved significantly by the use of soil cation exchange capacity as a predictor along with basic soil properties. The use of FPs as inputs explained 44 and 92.8 % of the variation in SSA and TSS, respectively. In comparison with FPs, PSDCPs were less effective in the prediction of SSA and TSS. We concluded that FPs are useful in quantifying the relations between soil texture and related soil properties and processes. This procedure emerged as an economical and powerful technique for predicting SSA and/or TSS by ANN ensembles without additional measurements. Therefore, we concluded that ANN ensembling could be used as a new technique in the prediction of SSA and TSS values.
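The two ways of combining ensemble members compared in Table 4 (simple and weighted averaging) can be illustrated with a short Python sketch. It is not the authors' implementation: the inverse-MSE weighting scheme, the function names, and the synthetic "member predictions" (true values plus noise of different magnitudes, standing in for trained ANNs) are all assumptions made for illustration.

```python
import numpy as np


def inverse_mse_weights(val_preds, y_val):
    """Weights proportional to 1/MSE of each member on a validation set.

    This weighting scheme is an assumption for illustration only.
    val_preds has shape (n_members, n_samples).
    """
    mse = np.mean((val_preds - y_val) ** 2, axis=1)
    w = 1.0 / mse
    return w / w.sum()  # normalize so the weights sum to 1


def combine(member_preds, weights=None):
    """Combine member predictions; weights=None gives simple averaging."""
    member_preds = np.asarray(member_preds, dtype=float)
    if weights is None:
        return member_preds.mean(axis=0)
    return np.asarray(weights, dtype=float) @ member_preds


def rmse(y_pred, y_true):
    """Root mean square error between predictions and observations."""
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))


# Synthetic stand-in for three ensemble members of differing quality:
# each "member" is the true value plus Gaussian noise (hypothetical data).
rng = np.random.default_rng(0)
noise_sd = np.array([0.05, 0.10, 0.30])
y_val = rng.uniform(0.0, 1.0, size=200)
y_test = rng.uniform(0.0, 1.0, size=200)
val_preds = y_val + rng.normal(0.0, noise_sd[:, None], size=(3, 200))
test_preds = y_test + rng.normal(0.0, noise_sd[:, None], size=(3, 200))

# Weights are estimated on validation data, then applied to test data.
weights = inverse_mse_weights(val_preds, y_val)
rmse_simple = rmse(combine(test_preds), y_test)
rmse_weighted = rmse(combine(test_preds, weights), y_test)

print("weights:", np.round(weights, 3))
print("RMSE, simple averaging:  ", round(rmse_simple, 3))
print("RMSE, weighted averaging:", round(rmse_weighted, 3))
```

In this sketch the weights are estimated on data separate from the test set, so the test RMSE of the weighted combination is not biased by the weighting step. Whether weighting helps in practice depends on how much the members differ in accuracy.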

Acknowledgments The authors are deeply grateful to the two anonymous reviewers and the Associate Editor for their helpful comments, which have considerably improved this paper. The authors thank Dr. Aringhieri from Dipartimento di Informatica, Università degli Studi di Torino, Corso Svizzera 185, I-10149 Torino, Italy, for kindly providing his data.

References

1. Aringhieri, A., Pardini, G., Gispert, M., & Sole, A. (1992). Testing a simple methylene blue method for surface area estimation in soils. Agrochimica, 36(3), 224–232.

2. Mitchell, J. K. (1993). Fundamentals of soil behavior. New York:

Wiley.

3. Lutenegger, A. J., & Cerato, A. B. (2001). Surface area and engineering properties of fine-grained soils. 15th International Conference on Soil Mechanics and Foundation Engineering (pp. 603–606). Istanbul, Turkey: A.A. Balkema Publishers.

4. Morgenstern, N. R., & Balasubramanian, B. I. (1980). Effects of pore fluid on the swelling of clay-shale. In 4th International Conference on Expansive Soils (pp. 190–205).

5. Sridharan, A., Rao, S. M., & Murthy, N. S. (1986). Compressibility behavior of homoionized bentonites. Geotechnique, 36, 551–564.

6. de Jong, E. (1999). Comparison of three methods of measuring

surface area of soils. Canadian Journal of Soil Science, 79, 345–

351.

7. Nixon, J. F. (1991). Discrete ice lens theory for frost heave in soils. Canadian Geotechnical Journal, 28, 843–859.

8. Warkentin, B. P. (1972). Use of the liquid limit in characterizing

the clay soils. Canadian Journal of Soil Science, 52, 457–464.

9. Carter, D. L., Mortland, M. M., & Kemper, W. D. (1986). Specific surface. In A. Klute (Ed.), Methods of soil analysis. Part 1: physical and mineralogical methods (2nd ed., Agronomy Series No. 9, pp. 413–423). Madison: Soil Science Society of America.

10. Daniels, J. L., Inyang, H. I., & Brochu, M., Jr. (2004). Specific

surface area of barrier mixtures at various outgas temperatures.

Journal of Environmental Engineering, 130, 867–872.

11. Urai, J. L., Van Oort, E., & Van Der Zee, W. (1997). Correlations to predict the mechanical properties of mudrocks from wireline logs and drill cuttings. The EGS XXII General Assembly, Vienna. Annales Geophysicae, 15, 143–155.

12. Koorevaar, P., Menelik, G., & Dirksen, C. (1983). Elements of soil

physics. Amsterdam: Elsevier.

13. Theng, B. K. G., Ristori, G. G., Santi, C. A., & Percival, H. J.

(1999). An improved method for determining the specific surface

areas of topsoils with varied organic matter content, texture and

clay mineral composition. European Journal of Soil Science, 50,

309–316.

14. Arnepalli, D. N., Shanthakumar, S., Hanumantha Rao, B., & Singh, D. N. (2008). Comparison of methods for determining specific-surface area of fine-grained soils. Geotechnical and Geological Engineering, 26, 121–132.

15. Hepper, E. N., Buschiazzo, D. E., Hevia, G. G., Urioste, A., &

Antón, L. (2006). Clay mineralogy, cation exchange capacity and

specific surface area of loess soils with different volcanic ash

contents. Geoderma, 135, 216–223.

16. Yukselen, Y., & Kaya, A. (2006). Comparison of methods for determining specific surface area of soils. Journal of Geotechnical and Geoenvironmental Engineering, 132, 931–936.

17. Schaap, M. G., & Bouten, W. (1996). Modeling water retention

curves of sandy soils using neural networks. Water Resources

Research, 32, 3033–3040.

18. Koekkoek, E. J. W., & Booltink, H. (1999). Neural network

models to predict soil water retention. European Journal of Soil

Science, 50, 489–495.

19. Crawford, J. W., Sleeman, B. D., & Young, I. M. (1993). On the relation between number-size distributions and the fractal dimension of aggregates. Journal of Soil Science, 44, 555–565.

20. Hwang, S. I., Lee, K. P., Lee, D. S., & Powers, S. E. (2002). Models for estimating soil particle-size distributions. Soil Science Society of America Journal, 66, 1143–1150.

21. Ersahin, S., Gunal, H., Kutlu, T., Yetgin, B., & Coban, S. (2006).

Estimating specific surface area and cation exchange capacity in

soils using fractal dimension of particle-size distribution.

Geoderma, 136, 588–597.

22. Cerato, A. B., & Lutenegger, A. J. (2002). Determination of

surface area of fine-grained soils by the ethylene glycol monoethyl

ether (EGME) method. Geotechnical Testing Journal, 25, 315–

321.

23. Quirk, J. P. (1955). Significance of surface area calculated from water-vapour sorption isotherms by use of the B.E.T. equation. Soil Science, 8, 423.

24. Arya, L. M., & Paris, J. F. (1981). Physicoempirical model to predict the soil moisture characteristic from particle-size distribution and bulk density data. Soil Science Society of America Journal, 45, 1023–1030.

25. Skaggs, T. H., Arya, L. M., Shouse, P. J., & Mohanty, B. P. (2001).

Estimating particle-size distribution from limited soil texture data.

Soil Science Society of America Journal, 65, 1038–1044.

26. Bird, N. R. A., Perrier, E., & Rieu, M. (2000). The water retention function for a model of soil structure with pore and solid fractal distributions. European Journal of Soil Science, 51, 55–63.

27. Hillel, D. (2004). Introduction to environmental soil physics. Amsterdam: Academic.

28. Minasny, B., & McBratney, A. B. (2002). The neuro-M method for fitting neural network parametric pedotransfer functions. Soil Science Society of America Journal, 66, 352–361.

29. Efron, B., & Tibshirani, R. J. (1993). An introduction to the

bootstrap. Monographs on Statistics and Applied Probability.

London: Chapman and Hall.

30. Schaap, M. G., Leij, F. J., & Van Genuchten, M. T. (1998). Neural network analysis for hierarchical prediction of soil hydraulic properties. Soil Science Society of America Journal, 62, 847–855.

31. Baker, L., & Ellison, D. (2008). Optimisation of pedotransfer

functions using an artificial neural network ensemble method.

Geoderma, 144, 212–224.

32. Donigian, A. S., & Rao, P. S. C. (1986). Overview of terrestrial

processes and modeling. In S. C. Hern & S. M. Melancon (Eds.),

Vadose zone modeling of organic pollutants (pp. 3–36). Chelsea:

Lewis Publishers.


33. Zheng, C., & Bennett, G. D. (1995). Applied contaminant trans-

port modeling: theory and practice. New York: Van Nostrand

Reinhold.

34. NeuroDimension, Inc. (2005). NeuroSolutions. Getting Started

Manual Version 4. Gainesville: NeuroDimension, Inc.

35. Akaike, H. (1974). New look at the statistical model identification.

IEEE Transactions on Automatic Control, AC-19, 716–723.

36. Diebold, F. X., & Mariano, R. S. (2002). Comparing predictive accuracy. Journal of Business and Economic Statistics, 20, 134–144.

37. Yukselen-Aksoy, Y., & Kaya, A. (2010). Method dependency of relationships between specific surface area and soil physicochemical properties. Applied Clay Science, 50, 182–190.

38. Pennell, K. D. (2002). Specific surface area. In J. H. Dane & G. C.

Topp (Eds.), Methods of soil analysis. Part 4: physical methods.

Madison: Soil Science Society of America.

39. Bayat, H., Davatgat, N., & Moallemi, S. (2012). Using of specific surface to improve the prediction of soil CEC by artificial neural networks. Soil and Water Knowledge Journal, 21(4), 105–119.

40. Perrone, M. P., & Cooper, L. N. (1993). When networks disagree:

ensemble method for neural networks. In R. J. Mammone (Ed.),

Neural networks for speech and image processing (pp. 126–142).

London: Chapman and Hall.

41. Sokołowska, Z., Hajnos, M., Hoffmann, C., Renger, M., & Sokołowski, S. (2001). Comparison of fractal dimensions of soils estimated from adsorption isotherms, mercury intrusion, and particle size distribution. Journal of Plant Nutrition and Soil Science, 164, 591–599.

42. Bayat, H., Neyshabouri, M. R., Mohammadi, K., & Nariman-

Zadeh, N. (2011). Estimating water retention with pedotransfer

functions using multi-objective group method of data handling

and ANNs. Pedosphere, 21, 107–114.

43. Curtin, D., & Smillie, G. W. (1976). Estimation of components of soil

cation exchange capacity from measurements of specific surface and

organic matter. Soil Science Society of America Journal, 40, 461–462.

44. Xu, Y. F., & Dong, P. (2004). Fractal approach to hydraulic properties in unsaturated porous media. Chaos, Solitons and Fractals, 19, 327–337.
