Abstract

Machine Learning (ML) is a science dealing with the study and development of computational models of learning and discovery processes, along with building learning programs for specific applications. In the present study, ML techniques have been used to develop correlations for predicting geotechnical parameters for Civil Engineering design. For the determination of design values, strength and compressibility tests need to be carried out on undisturbed samples of soil. It is difficult to obtain an undisturbed sample every time, owing to disturbance during handling and transportation, the release of overburden pressure and poor laboratory conditions. ML techniques can predict fairly accurate values of geotechnical parameters such as in-situ density, compression index (Cc) and shear strength parameters (c and ϕ), provided that accurate datasets of laboratory and field results are used to develop the models. Several ML techniques, namely Linear Regression (LR), Artificial Neural Network (ANN), Support Vector Machine (SVM), Random Forest (RF) and M5 model tree (M5P), have been used for the analysis. Correlations have been developed for in-place density from SPT N-value, for compression index (Cc) from liquid limit (LL) and void ratio (e), and for cohesion (c) and angle of internal friction (ϕ) from SPT N-value. Geotechnical data up to a depth of 50 m from 1053 borehole locations covering almost every district in the state of Haryana have been considered to develop models and statistical correlations. A general trend has been identified in the observed data and the outliers have been excluded accordingly. Several models have been developed to establish functional correlations. These models have been ranked on the basis of their coefficient of determination (R²) and mean absolute error (MAE). Subsequently, the model with the highest R² value and the minimum MAE has been adopted for the development of correlations.
Sensitivity analysis has also been carried out for all the developed correlations to assess their individual performance. For this purpose, all the developed models have been evaluated by fitting a straight line between observed and modelled values; in all cases, a good value of R² has been observed. The R² values obtained for all the models range from 0.798 to 0.988. On comparison, the values of geotechnical parameters obtained are in close agreement with existing work.
ScienceDirect
Available online at www.sciencedirect.com
Procedia Computer Science 125 (2018) 509–517
1877-0509 © 2018 The Authors. Published by Elsevier B.V.
Peer-review under responsibility of the scientific committee of the 6th International Conference on Smart Computing and Communications
10.1016/j.procs.2017.12.066
Prediction of Geotechnical Parameters Using Machine Learning Techniques
Nitish Puri*, Harsh Deep Prasad, Ashwani Jain
Department of Civil Engineering, NIT Kurukshetra, Kurukshetra-136119, India
6th International Conference on Smart Computing and Communications, ICSCC 2017, 7-8 December 2017, Kurukshetra, India
Keywords: Geotechnical Properties; Machine Learning Techniques; Correlations; Haryana
* Corresponding author. E-mail address: nitishpuri.ce.89@gmail.com
1. Introduction
In Geotechnical Engineering, empirical correlations are frequently used to evaluate various engineering properties
of soils. Correlations are generally derived with the help of statistical methods using data from extensive laboratory
or field testing. Linear Regression (LR) Analysis, Artificial Neural Network (ANN), Support Vector Machine
(SVM), Random Forest (RF) and M5 model trees (M5P) are among the commonly used machine learning techniques.
These techniques learn from the data cases presented to them and capture the functional relationship among the data, even
if the fundamental relationships are unknown or their physical meaning is difficult to explain. This contrasts with most
traditional empirical and statistical methods, which need prior information about the nature of the relationships
among the data. ML is thus well suited to modeling the complex behaviour of most Geotechnical Engineering
materials, which, by their very nature, exhibit extreme variability. This modeling capability, together with the ability to
learn from experience, gives ML techniques an advantage over most traditional modeling approaches, since there is
no need to make assumptions about the primary rules that govern the problem at hand. These
techniques are being widely used to solve various civil engineering problems [1-10].
Geotechnical parameters like in-place density, compression index (Cc), coefficient of consolidation (Cv) and strength
characteristics (c, ϕ) are extensively used for the design of earthen dams, embankments, pavements, landfill liners
and foundations of various Civil Engineering structures. Most of these parameters are determined in the laboratory
and some are estimated in the field. Their determination requires specific laboratory equipment and an experienced
geotechnical engineer with a team of skilled technicians, which makes it costly and time
consuming. Also, soil is a highly variable material, since its behaviour depends on the processes by which it was
formed. Hence, correlations developed for one region may not be applicable to another. This establishes the need to
develop region-based correlations to predict geotechnical properties.
In the present study, engineering parameters like in-place density, compression index (Cc) and strength
characteristics, namely cohesion (c) and angle of internal friction (ϕ), have been correlated with soil parameters
determined in the laboratory and in the field. For this purpose, machine learning techniques like Linear Regression (LR)
Analysis, Artificial Neural Network (ANN), Support Vector Machine (SVM), Random Forest (RF) and M5 Tree
(M5P) have been used. Geotechnical data have been collected from various government and private organizations
across Haryana and screened for the development of more accurate models. The results indicate that the developed models
are accurate and provide a viable tool for site engineers and consultants to predict missing data and to
cross-check observed values.
2. Study Area, Data Collection and Methodology
Haryana is a non-coastal state in North India with its capital at Chandigarh. It is a moderately sized state having an
area of 44,212 km², which is 40 times the area of Delhi. It ranks 19th in terms of area in the country. It is surrounded
by the states of Uttarakhand and Himachal Pradesh and the Shiwalik hills on the North, Uttar Pradesh on the East, Punjab
on the West, and Delhi, Rajasthan and the Aravali hills on the South. It lies between 27°39' and 30°35' N latitude and
between 74°28' and 77°36' E longitude. The country’s capital Delhi is surrounded by Haryana on three sides, which form the
northern, western and southern borders of Delhi. Consequently, a large area of Haryana is included in the National
Capital Region (NCR) for the purposes of planning for development. Haryana is a leading state in the country on
both the industrial and agricultural fronts. The state has invested in the development of world class infrastructure
facilities such as special economic zones (SEZs), the Kundli-Manesar-Palwal (KMP) global corridor and the Delhi-Mumbai
industrial corridor (DMIC) [11].
Geotechnical data collected from Public Works Department (PWD), Delhi Metro Rail Corporation (DMRC),
Northern Railways (NR), Haryana Urban Development Authority (HUDA), Nuclear Power Corporation of India
Limited (NPCIL), Rail Vikas Nigam Limited (RVNL) and several geotechnical consultants have been used in the
study. The developed geotechnical database has information for 1053 distinct locations in the State of Haryana,
covering almost every district up to a depth of 50 m.
The observed values of geotechnical properties for 1053 borehole locations have been considered for
development of various models and statistical correlations. Relevant data have been sorted by
observing the recurring trend and deleting the outliers from the data sets. The models were then ranked based on
their coefficient of determination (R²) and Mean Absolute Error (MAE). Analysis has been carried out by plotting
the observed and modeled values on the ordinate and abscissa respectively [12] for all the models to assess their
individual performance. Figure 1 shows a typical performance analysis of the SVM model for predicting the angle of internal
friction of soil (ϕ) using SPT N-value.
Fig. 1 Performance analysis of SVM model for predicting angle of internal friction of soil (ϕ) using SPT N-value
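The error measures used to rank the models can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper does not state how each index was computed, so the textbook definitions of MAE, RMSE, relative absolute error and root relative squared error are assumed, and R² is taken as the squared correlation of a straight line fitted between observed and modeled values. The function name `error_metrics` and the sample values are hypothetical.

```python
import math


def error_metrics(observed, modeled):
    """Return R2, MAE, RMSE, RAE (%) and RRSE (%) for paired values.

    Assumed (textbook) definitions; the paper does not give formulas.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_mod = sum(modeled) / n

    abs_err = [abs(m - o) for o, m in zip(observed, modeled)]
    sq_err = [(m - o) ** 2 for o, m in zip(observed, modeled)]

    mae = sum(abs_err) / n
    rmse = math.sqrt(sum(sq_err) / n)

    # Relative errors compare the model with always predicting the mean.
    rae = 100.0 * sum(abs_err) / sum(abs(o - mean_obs) for o in observed)
    rrse = 100.0 * math.sqrt(
        sum(sq_err) / sum((o - mean_obs) ** 2 for o in observed)
    )

    # R2 of the least-squares straight line between observed and modeled
    # values equals the squared Pearson correlation coefficient.
    sxy = sum((o - mean_obs) * (m - mean_mod) for o, m in zip(observed, modeled))
    sxx = sum((o - mean_obs) ** 2 for o in observed)
    syy = sum((m - mean_mod) ** 2 for m in modeled)
    r2 = sxy ** 2 / (sxx * syy)

    return {"R2": r2, "MAE": mae, "RMSE": rmse, "RAE": rae, "RRSE": rrse}


# Hypothetical observed/modeled pairs, for illustration only.
metrics = error_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print(metrics)
```

The relative measures explain why a model can show a small MAE in absolute terms and still a relative absolute error of 20 to 50 %: the error is judged against the spread of the observed values, not against their magnitude.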
3. Results and Discussions
3.1 In-Situ Density
Prediction capabilities of all the developed models have been evaluated by fitting a straight line between
observed and modeled values. High values of R², ranging from 0.84 to 0.97, have been observed between modeled and
observed densities for all the models. Observed error estimates for the models developed for predicting in-place
density using SPT N-value are presented in Table 1. All 20 models have been ranked based on their overall
performance, including their prediction capability, the R² value of the correlation and the MAE of the correlation. For coarse
grained soils, models developed using M5P and linear regression have shown maximum accuracy in the estimation of
bulk density (ρb) and dry density (ρd) respectively. In the case of fine grained soils, M5P and ANN models have
shown maximum accuracy in the estimation of bulk and dry density respectively. Consequently, the best models, model
number 17 (R² value of 0.95) and number 2 (R² value of 0.96), have been adopted for determination of bulk and dry
density of coarse grained soils respectively. Model number 19 (R² value of 0.97) and number 8 (R² value of 0.90)
have been adopted for determination of bulk and dry density of fine grained soils respectively. The proposed
correlations are reported in Table 2.
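The correlations reported in Table 2 can be written as a small lookup. This is a sketch under stated assumptions: the coefficients and N-value ranges are transcribed from Table 2, while the function name `density_from_spt`, the string labels and the range check are hypothetical choices made for illustration. The N-value is treated as a whole blow count, so the switch between Eq. 1(a) and 1(b) is placed at N = 40.

```python
def density_from_spt(n_value, soil_type, density_type):
    """Estimate in-place density (g/cm3) from SPT N-value using Table 2.

    soil_type: "coarse" or "fine"; density_type: "bulk" or "dry".
    """
    # Table 2 reports the correlations for N-values from 1 to 50 only.
    if not 1 <= n_value <= 50:
        raise ValueError("Table 2 correlations cover N-values from 1 to 50")

    key = (soil_type, density_type)
    if key == ("coarse", "bulk"):
        if n_value < 40:
            return 0.0096 * n_value + 1.5001  # Eq. 1(a), N = 1-39, M5P
        return 0.0141 * n_value + 1.3726      # Eq. 1(b), N = 40-50, M5P
    if key == ("coarse", "dry"):
        return 0.0068 * n_value + 1.5554      # Eq. 2, LR
    if key == ("fine", "bulk"):
        return 0.0080 * n_value + 1.7202      # Eq. 3, M5P
    if key == ("fine", "dry"):
        return 0.0114 * n_value + 1.2488      # Eq. 4, ANN
    raise ValueError("unknown soil_type/density_type combination")


# Example: bulk density of a coarse grained soil at N = 10.
print(round(density_from_spt(10, "coarse", "bulk"), 4))
```

Raising an error outside N = 1-50 is a deliberate choice here: the correlations are regional fits, and extrapolating them beyond the data range is not supported by the study.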
Table 1. Observed error estimates of models for in-place density and SPT N-value
| Model No. | Technique | Soil Type | Soil Property | R² | Mean Absolute Error (g/cm³) | Root Mean Squared Error (g/cm³) | Relative Absolute Error (%) | Root Relative Squared Error (%) |
|---|---|---|---|---|---|---|---|---|
| 1 | Regression Analysis | Coarse | Bulk Density | 0.91 | 0.05 | 0.06 | 41.65 | 40.65 |
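Table 1 (continued)
| Model No. | Technique | Soil Type | Soil Property | R² | Mean Absolute Error (g/cm³) | Root Mean Squared Error (g/cm³) | Relative Absolute Error (%) | Root Relative Squared Error (%) |
|---|---|---|---|---|---|---|---|---|
| 2 | Regression Analysis | Coarse | Dry Density | 0.96 | 0.01 | 0.02 | 23.65 | 25.65 |
| 3 | Regression Analysis | Fine | Bulk Density | 0.97 | 0.01 | 0.02 | 19.50 | 23.39 |
| 4 | Regression Analysis | Fine | Dry Density | 0.90 | 0.04 | 0.06 | 41.35 | 42.55 |
| 5 | Artificial Neural Network | Coarse | Bulk Density | 0.90 | 0.05 | 0.06 | 45.57 | 44.46 |
| 6 | Artificial Neural Network | Coarse | Dry Density | 0.95 | 0.02 | 0.02 | 30.04 | 32.52 |
| 7 | Artificial Neural Network | Fine | Bulk Density | 0.97 | 0.02 | 0.03 | 21.98 | 25.31 |
| 8 | Artificial Neural Network | Fine | Dry Density | 0.90 | 0.04 | 0.05 | 39.34 | 38.13 |
| 9 | Support Vector Machine | Coarse | Bulk Density | 0.91 | 0.04 | 0.06 | 39.45 | 43.94 |
| 10 | Support Vector Machine | Coarse | Dry Density | 0.94 | 0.02 | 0.03 | 28.40 | 34.52 |
| 11 | Support Vector Machine | Fine | Bulk Density | 0.96 | 0.01 | 0.02 | 19.50 | 23.39 |
| 12 | Support Vector Machine | Fine | Dry Density | 0.89 | 0.04 | 0.05 | 39.34 | 38.13 |
| 13 | Random Forest | Coarse | Bulk Density | 0.92 | 0.04 | 0.06 | 37.34 | 39.24 |
| 14 | Random Forest | Coarse | Dry Density | 0.96 | 0.01 | 0.02 | 24.79 | 26.95 |
| 15 | Random Forest | Fine | Bulk Density | 0.95 | 0.02 | 0.02 | 25.93 | 30.67 |
| 16 | Random Forest | Fine | Dry Density | 0.84 | 0.05 | 0.072 | 50.90 | 55.02 |
| 17 | M5 Model Tree | Coarse | Bulk Density | 0.95 | 0.02 | 0.03 | 22.83 | 25.24 |
| 18 | M5 Model Tree | Coarse | Dry Density | 0.94 | 0.02 | 0.02 | 29.01 | 32.87 |
| 19 | M5 Model Tree | Fine | Bulk Density | 0.97 | 0.01 | 0.02 | 19.54 | 23.62 |
| 20 | M5 Model Tree | Fine | Dry Density | 0.90 | 0.04 | 0.06 | 41.35 | 42.55 |
Table 2. Proposed correlations between in-place density and SPT N-value
| Eq. No. | Soil Type | Soil Property | Correlation | Unit | N Value | Technique | R² |
|---|---|---|---|---|---|---|---|
| 1 (a) | Coarse Grained | Bulk Density (ρb) | ρb = 0.0096 * N + 1.5001 | g/cm³ | 1-39 | M5P | 0.95 |
| 1 (b) | Coarse Grained | Bulk Density (ρb) | ρb = 0.0141 * N + 1.3726 | g/cm³ | 40-50 | M5P | 0.95 |
| 2 | Coarse Grained | Dry Density (ρd) | ρd = 0.0068 * N + 1.5554 | g/cm³ | 1-50 | LR | 0.96 |
| 3 | Fine Grained | Bulk Density (ρb) | ρb = 0.0080 * N + 1.7202 | g/cm³ | 1-50 | M5P | 0.97 |
| 4 | Fine Grained | Dry Density (ρd) | ρd = 0.0114 * N + 1.2488 | g/cm³ | 1-50 | ANN | 0.90 |
3.2 Compression Index of Soil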
A total of 10 models have been developed and their prediction capabilities have been evaluated by fitting a
straight line between observed and modeled values. The closeness of the scatter plots (with R² ranging from 0.86 to
0.92) obtained between observed and modeled values indicates good predictive performance of the developed correlations. Models
have been ranked based on their overall performance. For estimation of Cc using liquid limit and void ratio, models
developed using the M5P technique (models 5 and 10 respectively) have shown maximum accuracy, with an R² value of
0.92 for determining Cc using liquid limit and an R² value of 0.95 for determining Cc using void ratio, as reported in
Tables 3-4. The established correlations are reported in Table 5. This study also concludes that machine learning
techniques offer distinct advantages over conventional hand calculations and laboratory tests. The developed
correlations for the determination of Cc from liquid limit have been compared with the correlations developed by
Skempton [13] and Terzaghi & Peck [14], as shown in Figure 2. It has been observed that the modeled values of
Cc are on the lower side of those of Terzaghi & Peck [14] and in close agreement with Skempton [13]. The correlations
developed for the determination of Cc from void ratio have been compared with the correlations
developed by Cozzolino [15], Azzouz et al. [16] and Kalantary and Kordnaeij [17]. It has been observed that the
values of Cc calculated in the present study are on the higher side of the values obtained by Azzouz et al. [16], but
closer to the values obtained by Cozzolino [15]. Compared with the values calculated using Kalantary and Kordnaeij [17], the
Cc values of the present study are in close proximity when the void ratio is less than 0.8; beyond a
void ratio of 0.8, the values modeled by the present study are on the higher side, as shown in Figure 3.
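As a worked check of the liquid limit comparison, the values at LL = 40 % can be computed by hand. The present-study value uses the LL ≥ 37.35 branch of Table 5. The forms shown for Skempton and for Terzaghi & Peck are the commonly quoted versions of those correlations; the paper does not reproduce the equations, so it is an assumption that these are the forms plotted in Figure 2.

```latex
\begin{align*}
C_c^{\text{present}} &= 0.0064\,LL - 0.05237 = 0.0064(40) - 0.05237 \approx 0.204 \\
C_c^{\text{Skempton}} &= 0.007\,(LL - 10) = 0.007(30) = 0.210 \\
C_c^{\text{Terzaghi \& Peck}} &= 0.009\,(LL - 10) = 0.009(30) = 0.270
\end{align*}
```

Under these assumptions the present-study value lies just below Skempton and well below Terzaghi & Peck, which is consistent with the trend described above.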
Table 3. Developed models and their performance indices for compression index (Cc) and liquid limit (LL)
| Model Number | Technique | Coefficient of Determination (R²) | Mean Absolute Error | Root Mean Squared Error | Relative Absolute Error (%) | Root Relative Squared Error (%) |
|---|---|---|---|---|---|---|
| 1 | Linear Regression | 0.86 | 0.012 | 0.015 | 49.78 | 51.50 |
| 2 | Artificial Neural Network | 0.81 | 0.014 | 0.177 | 59.56 | 60.57 |
| 3 | Support Vector Machine | 0.86 | 0.012 | 0.015 | 49.73 | 51.29 |
| 4 | Random Forest | 0.91 | 0.009 | 0.012 | 37.29 | 39.78 |
| 5 | M5 Tree | 0.92 | 0.009 | 0.012 | 38.62 | 40.54 |
Table 4. Developed models and their performance indices for compression index (Cc) and void ratio (e)
| Model Number | Technique | Coefficient of Determination (R²) | Mean Absolute Error | Root Mean Squared Error | Relative Absolute Error (%) | Root Relative Squared Error (%) |
|---|---|---|---|---|---|---|
| 6 | Linear Regression | 0.92 | 0.014 | 0.018 | 35.27 | 39.03 |
| 7 | Artificial Neural Network | 0.92 | 0.014 | 0.019 | 36.52 | 39.74 |
| 8 | Support Vector Machine | 0.92 | 0.013 | 0.021 | 34.29 | 45.62 |
| 9 | Random Forest | 0.94 | 0.010 | 0.016 | 24.83 | 33.26 |
| 10 | M5 Tree | 0.95 | 0.011 | 0.014 | 22.44 | 30.21 |
Table 5. Proposed correlations for compression index (Cc)
| Index Property | Equation Number | Correlation | Coefficient of Determination (R²) | Technique | Remarks |
|---|---|---|---|---|---|
| Liquid Limit (LL) | 1 | Cc = (0.0092 * LL) - 0.1091 | 0.92 | M5P | LL ≤ 29.25 |
| Liquid Limit (LL) | 2 | Cc = (0.0017 * LL) + 0.1235 | 0.92 | M5P | 29.25 < LL < 37.35 |
| Liquid Limit (LL) | 3 | Cc = (0.0064 * LL) - 0.05237 | 0.92 | M5P | LL ≥ 37.35 |
| Void Ratio (e) | 4 | Cc = (0.2945 * e) - 0.0774 | 0.95 | M5P | e ≤ 0.495 |
| Void Ratio (e) | 5 | Cc = (0.2534 * e) - 0.052 | 0.95 | M5P | 0.495 < e < 0.615 |
| Void Ratio (e) | 6 | Cc = (0.7071 * e) - 0.3471 | 0.95 | M5P | e ≥ 0.615 |
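The piecewise correlations of Table 5 translate directly into code. The sketch below transcribes the coefficients and break points from the table; the function names `cc_from_liquid_limit` and `cc_from_void_ratio` are hypothetical, and no validity range is enforced because the table does not state one.

```python
def cc_from_liquid_limit(ll):
    """Compression index Cc from liquid limit LL (%), Table 5, Eqs. 1-3."""
    if ll <= 29.25:
        return 0.0092 * ll - 0.1091   # Eq. 1
    if ll < 37.35:
        return 0.0017 * ll + 0.1235   # Eq. 2
    return 0.0064 * ll - 0.05237      # Eq. 3


def cc_from_void_ratio(e):
    """Compression index Cc from void ratio e, Table 5, Eqs. 4-6."""
    if e <= 0.495:
        return 0.2945 * e - 0.0774    # Eq. 4
    if e < 0.615:
        return 0.2534 * e - 0.052     # Eq. 5
    return 0.7071 * e - 0.3471        # Eq. 6


# Example: one value from each branch of both correlations.
for ll in (25.0, 30.0, 40.0):
    print("LL =", ll, "Cc =", round(cc_from_liquid_limit(ll), 4))
for e in (0.4, 0.5, 0.7):
    print("e =", e, "Cc =", round(cc_from_void_ratio(e), 4))
```

The three linear pieces per input reflect how an M5 model tree works: the tree splits the input range at learned thresholds and fits a separate linear model in each leaf.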
3.3 Strength Parameters of Soil
The overall performance of all 10 developed models has been assessed based on their R² values, MAE and
performance analysis. High values of R², ranging from 0.80 to 0.99, have been observed for the estimation of cohesion
and angle of internal friction for all the models in the performance analysis. For the estimation of the angle of
internal friction from SPT N-value, the SVM technique has been found to be the most useful, with an R² value of 0.98
(model number 3) (Table 6). For the estimation of cohesion using SPT N-value, the model developed using the M5P
technique has shown maximum accuracy, with an R² value of 0.93 (model number 10) (Table 7). The established
correlations are reported in Table 8. Also, model number 10, which predicts cohesion, has been compared
with the studies carried out by Terzaghi & Peck [18] and Kumar et al. [19], as shown in Figure 4. It has
been observed that the value of cohesion obtained in the present study is on the lower side of the other studies.
Figure 2. Comparison between prediction models for compression indices (Cc) using liquid limit (LL)
[Plot: Compression Index (Cc) versus Liquid Limit (%); series: Present Study, Skempton (1944), Terzaghi and Peck (1967)]
Figure 3. Comparison between prediction models for compression indices (Cc) using void ratio (e)
[Plot: Compression Index (Cc) versus Void Ratio (e); series: Present Study, Cozzolino (1961), Azzouz (1976), Kalantary et al. (2012)]
Similarly, model number 3, which predicts the angle of internal friction, has been compared with the studies carried out by
Shioi & Fukui [20] and Wolff [21]. It has been observed that the angle of internal friction obtained in the present
study is in close agreement with both, as shown in Figure 5.
Table 6. Developed models and their performance indices for angle of internal friction (ϕ) and SPT N-value
Model Number | Technique | Coefficient of Determination (R2) | Mean Absolute Error | Root Mean Squared Error | Relative Absolute Error (%) | Root Relative Squared Error (%)
1 | Linear Regression | 0.98 | 0.495 | 0.664 | 16.12 % | 17.87 %
2 | Artificial Neural Network | 0.98 | 0.596 | 0.784 | 19.40 % | 21.01 %
3 | Support Vector Machine | 0.98 | 0.246 | 0.454 | 7.99 % | 12.01 %
4 | Random Forest | 0.98 | 0.481 | 0.683 | 15.65 % | 18.36 %
5 | M5 Tree | 0.98 | 0.319 | 0.494 | 10.37 % | 13.30 %
Table 7. Developed models and their performance indices for cohesion (c) and SPT N-value
Model Number | Technique | Coefficient of Determination (R2) | Mean Absolute Error | Root Mean Squared Error | Relative Absolute Error (%) | Root Relative Squared Error (%)
6 | Linear Regression | 0.93 | 0.237 | 0.311 | 44.72 % | 37.64 %
7 | Artificial Neural Network | 0.89 | 0.283 | 0.375 | 53.27 % | 45.38 %
8 | Support Vector Machine | 0.92 | 0.176 | 0.315 | 33.20 % | 48.56 %
9 | Random Forest | 0.91 | 0.209 | 0.345 | 39.29 % | 41.72 %
10 | M5 Tree | 0.93 | 0.171 | 0.312 | 32.75 % | 37.73 %
Figure 4. Comparison between prediction models for cohesion (c) using SPT N-value.
[Plot: Cohesion (kg/cm2) versus Observed SPT N-Value; series: Present Study, Terzaghi and Peck (1982), Choudhury et al. (2016), Kumar et al. (2016)]
Figure 5. Comparison between prediction models for angle of internal friction (ϕ) using SPT N-value
Table 8. Proposed correlations for cohesion (c) and angle of internal friction (ϕ) using SPT N-value
4. Conclusion
An attempt has been made to develop several statistical correlations relating various geotechnical parameters of
soil using different machine learning techniques. Most of the correlations developed in this study are
comparable with the existing studies and show similar trends and predictions. These correlations,
based on geotechnical data measured in-situ, fit the observed data closely and can help reduce the errors that arise
when these parameters are assumed in geotechnical engineering problems. Given the general potential of these
techniques, their use in solving such problems has barely begun, and the published studies referenced in the present
work point to many opportunities. It can be concluded that these techniques can be turned into practical systems if
the application process is executed carefully, and that they can help solve various problems efficiently and precisely.
The results of this study can be used for cross-checking the values determined in the laboratory, or directly for design
purposes where equipment and expertise are not available. The correlations presented in this study are region
specific and hence should only be used for sites in the state of Haryana and nearby areas.
Acknowledgements
We acknowledge the help and assistance given by DST, India, for the study under the INSPIRE Fellowship scheme.
The authors would also like to thank Mr. Madan Mohan Puri, Senior Section Engineer, Ministry of Railways, Delhi, India,
for providing geotechnical reports of several construction projects located in the State of Haryana, without which this
study would not have been possible.
[Figure 5 plot: Angle of Internal Friction (ϕ) in degrees versus Observed SPT N-Value; series: Present Study, Shioi and Fukui (1982), Wolff (1989)]
Soil Property | Equation Number | Correlation | Coefficient of Determination (R2) | Technique | Units | N-Value
Cohesion (c) | 1 | c = 0.0464 * N + 0.0075 | 0.93 | M5P | kg/cm2 | 1-25
Cohesion (c) | 2 | c = 0.0702 * N - 0.5453 | 0.93 | M5P | kg/cm2 | 26-52
Angle of Internal Friction (ϕ) | 3 | ϕ = 0.3125 * N + 26.1261 | 0.99 | SVM | degree | 1-52
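As an illustration, the correlations above can be written as two small functions. This is a minimal sketch that transcribes equations 1 to 3 and their stated N-value ranges; the function names are hypothetical, and applying equation 2 to any value above 25 (rather than only to whole blow counts from 26) is an assumption.

```python
def cohesion_from_spt(n):
    """Cohesion c in kg/cm2 from SPT N-value, equations 1 and 2."""
    if not 1 <= n <= 52:
        raise ValueError("correlation is only stated for N-values from 1 to 52")
    if n <= 25:
        return 0.0464 * n + 0.0075
    # Assumption: N-values are whole blow counts, so anything above 25 uses equation 2.
    return 0.0702 * n - 0.5453


def friction_angle_from_spt(n):
    """Angle of internal friction in degrees from SPT N-value, equation 3."""
    if not 1 <= n <= 52:
        raise ValueError("correlation is only stated for N-values from 1 to 52")
    return 0.3125 * n + 26.1261


# Example inputs chosen for illustration only.
c_low = cohesion_from_spt(10)
c_high = cohesion_from_spt(30)
phi = friction_angle_from_spt(20)
```

The range check reflects the "N-Value" column; outside that range the correlations are not supported by the stated data.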
References
[1] A.T.C. Goh (1995). “Neural networks for evaluating CPT calibration chamber test data”. International Journal of Microcomputers in Civil Engineering 10(2): 47-51.
[2] B. Saini, V.K. Sehgal and M.L. Gambhir (2007). “Least-cost design of singly and doubly reinforced concrete beam using genetic algorithm optimized artificial neural network based on Levenberg–Marquardt and quasi-Newton backpropagation learning techniques”. Structural and Multidisciplinary Optimization 34(3): 243-260.
[3] R. Siddique, P. Aggarwal and Y. Aggarwal (2011). “Prediction of compressive strength of self-compacting concrete containing bottom ash using artificial neural networks”. Advances in Engineering Software 42(10): 780-786.
[4] M. Pal, N.K. Singh and N.K. Tiwari (2012). “M5 model tree for pier scour prediction using field dataset”. KSCE Journal of Civil Engineering 16(6): 107-124.
[5] N. Puri and A. Jain (2015). “Correlation between California bearing ratio and index properties of silt and clay of low compressibility”. Proc. Fifth Indian Young Geotechnical Engineers Conference, Vadodara.
[6] G. Singh, S.N. Sachdeva and M. Pal (2016). “M5 model tree based predictive modeling of road accidents on non-urban sections of highways in India”. Accident Analysis and Prevention 96: 108-117.
[7] P. Anbazhagan, A. Uday, S.S.R. Moustafa and S.N.A. Nassir (2016). “Correlation of densities with shear wave velocities and SPT N values”. J. Geophys. Engg. 13(3): 320-341.
[8] H.D. Prasad, N. Puri and A. Jain (2017). “Prediction of in-place density of soil using SPT N-value”. Proc. National Conference on Recent Advances in Mechanical Engineering, Roorkee.
[9] B. Singh, P. Sihag and K. Singh (2017). “Modelling of impact of water quality on infiltration rate of soil by random forest regression”. Modeling Earth Systems and Environment 3(3): 999-1004.
[10] H.D. Prasad, N. Puri and A. Jain (2017). “Prediction of compression index of clays using machine learning techniques”. Proc. National Conference on Numerical Modeling in Geomechanics, Kurukshetra.
[11] https://www.ibef.org/states/haryana.aspx, Accessed on 30-10-2017.
[12] G. Piñeiro, S. Perelman, J.P. Guerschman and J.M. Paruelo (2008). “How to evaluate models: observed vs. predicted or predicted vs. observed?”. Ecological Modelling 216: 316-322.
[13] A.W. Skempton (1944). “Notes on the compressibility of clays”. Quarterly J. Geological Soc. 100(1-4): 119-135.
[14] K. Terzaghi and R.B. Peck (1967). Soil Mechanics in Engineering Practice, 2nd Edition. John Wiley and Sons, New York.
[15] V.M. Cozzolino (1961). “Statistical Forecasting of Compression Index”. Proc. 5th Int. Conf. Soil Mech. Foundation Engg, Paris.
[16] A.S. Azzouz, R.J. Krizek and R.B. Corotis (1976). “Regression Analysis of Soil Compressibility”. Soils and Foundations 16(2): 19-29.
[17] F. Kalantary and A. Kordnaeij (2012). “Prediction of compression index using artificial neural network”. Scientific Research and Essays 7(31): 2835-2848.
[18] J.E. Bowles (1982). Foundation Analysis and Design, 3rd Edition. McGraw-Hill, Inc., New York.
[19] R. Kumar, K. Bhargava and D. Choudhury (2016). “Estimation of Engineering Properties of Soils from Field SPT Using Random Number Generation”. INAE Lett 1(3-4): 77-84.
[20] Y. Shioi and J. Fukui (1982). “Application of N-Value to Design of Foundation in Japan”. 2nd ESOPT 1: 40-93.
[21] T.F. Wolff (1989). “Pile capacity prediction using parameter function”. Proc. In Predicted and Observed Axial Behavior of Piles, Results of a Pile Prediction Symposium, sponsored by Geotechnical Engineering Division, ASCE, Evanston, Ill., June 1989, ASCE Geotechnical Special Publication No. 23, 96-106.
Nitish Puri et al. / Procedia Computer Science 125 (2018) 509–517 517
8 Nitish Puri / Procedia Computer Science 00 (2018) 000–000
Figure 5. Comparison between prediction models for angle of internal friction (ϕ) using SPT N-value
Table 8. Proposed correlations for cohesion (c) and angle of internal friction (ϕ) using SPT N-value
4. Conclusion
An attempt has been made to develop several statistical correlations relating various geotechnical parameters of
soil using different machine learning techniques. Most of the correlations developed in this study are quite
comparable with the existing studies and have shown a proximity in trend and prediction as well. These correlations
based on geotechnical data measured in-situ are very accurate and can help in reducing errors associated with their
assumption in geotechnical engineering problems. Given the general potential of these techniques, we have barely
started to use them in solving problems. Many published studies referenced in the present work point to many
opportunities. It can be concluded that these techniques can be materialized into practical systems if the application
process is executed carefully and it can help to solve various problems very efficiently and precisely. The results of
this study can be used for crosschecking the values determined in laboratory or can also be used directly for design
purpose, where equipment and expertise are not available. The correlations presented in this study are region
specific and hence should only be used for sites falling in state of Haryana and nearby areas.
Acknowledgements
We acknowledge the help and assistance given by DST, India for the study under INSPIRE Fellowship scheme.
Authors would also like thank Mr. Madan Mohan Puri, Senior Section Engineer, Ministry of Railways, Delhi, India
for providing geotechnical reports of several construction projects located in State of Haryana without which this
study would not have been possible.
25.0
27.0
29.0
31.0
33.0
35.0
37.0
39.0
41.0
43.0
0 5 10 15 20 25 30 35 40 45 50
Angle of Internal Friction (ϕ) in degrees
Observed SPT N-Value
Present Study Shioi and Fukui (1982) Wolff (1989)
Soil Property Equation
Number
Correlations Coefficient of
Determination (R2)
Technique Units N-Value
Cohesion (c) 1 c = 0.0464 * N + 0.0075 0.93 M5P kg/cm2 1-25
2 c = 0.0702 * N – 0.5453 kg/cm2 26-52
Angle of Internal
Friction (ϕ)
3 ϕ= 0.3125 * N + 26.1261 0.99 SVM degree 1-52
Nitish Puri / Procedia Computer Science 00 (2018) 000–000 9
References
[1] A.T.C Goh(1995). “Neural networks for evaluating CPT calibration chamber test data”.International Journal of Microcomputers in Civil
Engineering10(2): 47-51.
[2] B. Saini, V.K. Sehgal, M.L. Gambhir (2007). “Least-cost design of singly and doubly reinforced concrete beam using genetic algorithm
optimized artificial neural network based on Levenberg–Marquardt and quasi-Newton backpropagation learning techniques”. Structural and
Multidisciplinary Optimization 34 (3): 243-260.
[3] R. Siddique, P. Aggarwal and Y. Aggarwal (2011). “Prediction of compressive strength of self-compacting concrete containing bottom ash
using artificial neural networks”. Advances in Engineering Software42(10):780-786.
[4] M. Pal, N.K. Singh and N.K.Tiwari (2012).“M5 model tree for pier scour prediction using field dataset”.KSCE Journal of Civil
Engineering16(6): 107-124.
[5] N. Puri and A. Jain (2015).Correlation between california bearing ratio and index properties of silt and clay of low compressibility”. Proc.
Fifth Indian Young Geotechnical Engineers Conference, Vadodara.
[6] G. Singh, S.N. Sachdeva and M. Pal (2016). “M5 model tree based predictive modeling of road accidents on non-urban sections of
highways in India”. Accident; analysis and prevention96:108-117.
[7] P. Anbazhagan, A. Uday, S.S.R. Moustafa, and S.N.A. Nassir (2016). “Correlation of densities with shear wave velocities and SPT N
values”. J. Geophys. Engg.13(3): 320–341.
[8] H.D. Prasad, N. Puri and A. Jain (2017). “Prediction of in-place density of soil using SPT N-value”. Proc. National Conference on Recent
Advances in Mechanical Engineering, Roorkee.
[9] B. Singh, P. Sihag and K. Singh (2017). “Modelling of impact of water quality on infiltration rate of soil by random forest regression”.
Modeling Earth Systems and Environment3(3): 999-1004.
[10] H.D. Prasad, N. Puri and A. Jain (2017). “Prediction of compression index of clays using machine learning techniques” Proc. National
Conference on Numerical Modeling in Geomechanics, Kurukshetra.
[11] https://www.ibef.org/states/haryana.aspx, Accessed on 30-10-2017.
[12] G. Pin
͂eiro, S. Perelman, J.P. Guerschman and J. M. Paruelo (2008). “How to evaluate models: observed vs. predicted or predicted vs.
observed?”. Ecological Modelling 216: 316-322.
[13] A.W. Skempton (1944). “Notes on the compressibility of clays”. Quarterly J. Geological Soc.100(1-4): 119-135.
[14] K. Terzaghi and R.B. Peck, 2nd Edition: Soil Mechanics in Engineering Practice, John Wiley and Sons, New York, 1967.
[15] V.M. Cozzolino (1961). “Statistical Forecasting of Compression Index”. Proc. 5th Int. Conf. Soil Mech. Foundation Engg, Paris.
[16] A.S. Azzouz, R.J. Krizek, and R.B. Corotis (1976). “Regression Analysis of Soil Compressibility”.Soils and Foundations16(2): 19-29.
[17] F. Kalantary and A. Kordnaeij (2012). “Prediction of compression index using artificial neural network”. Scientific Research and
Essays7(31): 2835-2848.
[18] J.E. Bowles,3rd Edition, Foundation Analysis and Design, McGraw-Hill, Inc., New York, 1982.
[19] R. Kumar, K. Bhargava and D. Choudhury (2016). “Estimation of Engineering Properties of Soils from Field SPT Using Random Number
Generation”. INAE Lett1(3-4): 77-84.
[20] Y. Shioi and J. Fukui (1982). “Application of N-Value to Design of Foundation in Japan”. 2nd ESOPT 1: 40-93.
[21] T.F. Wolff(1989). “Pile capacity prediction using parameter function”. Proc. InPredicted and Observed Axial Behavior of Piles, Results of a
PilePrediction Symposium, sponsored by Geotechnical EngineeringDivision, ASCE, Evanston, Ill., June 1989, ASCE GeotechnicalSpecial
Publication No. 23, 96-106.
... Most geotechnical parameters which include relative density, compression index (Cu), and Atterberg Limits are determined within the laboratory and some are estimated in the field with some assumptions. Their calculations require specific laboratory equipment, and an experienced geotechnical engineer with a crew of skilled technicians [6]. Integrating AI into these methods through machine learning (ML) algorithms could help attain profitability, efficiency, safety, and accuracy. ...
... Machine learning algorithms can deal with non-linear and plastic issues of soils effectively and avoid the weakness that can be caused by traditional methods [7]. ML is thus well suited to model complex performances of most geotechnical engineering, which by its very nature, exhibits extreme erraticism [6]. ...
... The continuous advancement of computational capacity and the availability of a diverse dataset provides a platform for analyzing and developing relationships between spatially variable factors in geotechnical engineering using machine learning (ML) approaches (Pacheco et al. 2023;Suppakul et al. 2024). Linear regression (LR), analysis, artificial neural network (ANN), adaptive neuro fuzzy inference system (ANFIS), Support Vector Machine (SVM), random forest (RF), evolutionary polynomial regression (EPR) and gene expression programming (GEP) are the most commonly used machine learning techniques in analyzing geotechnical engineering problems (Ebid 2021;Goharzay et al. 2017;Huynh et al. 2022;Puri et al. 2018). Das (2013) delved into the application and various modeling challenges of artificial neural networks (ANN) in geotechnical engineering, alongside discussing future issues in the field. ...
Article
Full-text available
Standard Penetration Test (SPT) and the Cone Penetration Test (CPT) are employed in-situ to evaluate soil parameters. In geotechnical engineering practice, engineers often conduct in-situ tests either SPT or CPT to delineate soil profile and evaluate soil parameters for bearing capacity analysis. Most of the geotechnical parameters are correlated with SPT instead and widely employed. Since numerous soil parameters are correlated with SPT N-values, it is very beneficial to establish a correlation between CPT data and SPT N-values. To predict the SPT-N value from CPT data across various soil types such as silty sand, sandy silt, silty clay, and lean clay, this study has developed an empirical model using gene expression programming (GEP). Also a comprehensive GEP model encompassing all soil types has been proposed. The input parameter used in the GEP models are CPT tip resistance (qc), CPT-Sleeve friction (qf), and effective overburden pressure (σvʹ). The effectiveness of the models is evaluated through the implementation of statistical tests, employing a comprehensive index OBJ, and performing parametric analysis. Moreover, to test the reliability of the proposed GEP models, CPT-SPT data pairs that were not utilized in the model generation were employed. The results of the proposed models testing indicated that the models either under-predicts the targeted value by 3–9% or over-predicts by 3–12%. The OBJ values indicate that silty clay has the highest value of 4.985, making it the weakest model, while the all-soil model achieved the lowest value of 1.656, thus being considered the most effective model. The results indicated that the suggested models are precise and exhibit a strong potential for generalization and prediction.
... Therefore, RF has been one of the highly accurate and most robust algorithms in many studies. [37][38][39][40]. ...
... ML methods, in particular, can learn non-linear functional mappings and are capable of handling complex functions. Previous researchers have successfully applied ML methods to address tunneling issues and geotechnical problems (Chou et al. 2016, Zhang and Goh 2016, Pham et al. 2018, Puri et al. 2018. These algorithms enable computers to learn from data and make predictions or decisions without being explicitly programmed for each task (Lei et al. 2024, Shi et al. 2023a, b, Su et al. 2023b, Yin et al. 2023. ...
Article
Full-text available
This paper delves into the critical assessment of predicting sidewall displacement in underground caverns through the application of nine distinct machine learning techniques. The accurate prediction of sidewall displacement is essential for ensuring the structural safety and stability of underground caverns, which are prone to various geological challenges. The dataset utilized in this study comprises a total of 310 data points, each containing 13 relevant parameters extracted from 10 underground cavern projects located in Iran and other regions. To facilitate a comprehensive evaluation, the dataset is evenly divided into training and testing subset. The study employs a diverse array of machine learning models, including recurrent neural network, back-propagation neural network, K-nearest neighbors, normalized and ordinary radial basis function, support vector machine, weight estimation, feed-forward stepwise regression, and fuzzy inference system. These models are leveraged to develop predictive models that can accurately forecast sidewall displacement in underground caverns. The training phase involves utilizing 80% of the dataset (248 data points) to train the models, while the remaining 20% (62 data points) are used for testing and validation purposes. The findings of the study highlight the back-propagation neural network (BPNN) model as the most effective in providing accurate predictions. The BPNN model demonstrates a remarkably high correlation coefficient (R2 = 0.99) and a low error rate (RMSE = 4.27E-05), indicating its superior performance in predicting sidewall displacement in underground caverns. This research contributes valuable insights into the application of machine learning techniques for enhancing the safety and stability of underground structures.
... ML methods, in particular, can learn non-linear functional mappings and are capable of handling complex functions. Previous researchers have successfully applied ML methods to address tunneling issues and geotechnical problems (Chou et al. 2016, Zhang and Goh 2016, Pham et al. 2018, Puri et al. 2018. These algorithms enable computers to learn from data and make predictions or decisions without being explicitly programmed for each task (Lei et al. 2024, Shi et al. 2023a, b, Su et al. 2023b, Yin et al. 2023. ...
Article
Full-text available
Abstract. This paper delves into the critical assessment of predicting sidewall displacement in underground caverns through the application of nine distinct machine learning techniques. The accurate prediction of sidewall displacement is essential for ensuring the structural safety and stability of underground caverns, which are prone to various geological challenges. The dataset utilized in this study comprises a total of 310 data points, each containing 13 relevant parameters extracted from 10 underground cavern projects located in Iran and other regions. To facilitate a comprehensive evaluation, the dataset is evenly divided into training and testing subset. The study employs a diverse array of machine learning models, including recurrent neural network, back propagation neural network, K nearest neighbors, normalized and ordinary radial basis function, support vector machine, weight estimation, feed forward stepwise regression, and fuzzy inference system. These models are leveraged to develop predictive models that can accurately forecast sidewall displacement in underground caverns. The training phase involves utilizing 80% of the dataset (248 data points) to train the models, while the remaining 20% (62 data points) are used for testing and validation purposes. The findings of the study highlight the back propagation neural network (BPNN) model as the most effective in providing accurate predictions. The BPNN model demonstrates a remarkably high correlation coefficient (R2 = 0.99) and a low error rate (RMSE = 4.27E 05), indicating its superior performance in predicting sidewall displacement in underground caverns. This research contributes valuable insights into the application of machine learning techniques for enhancing the safety and stability of underground structures
... The trend toward automating dam monitoring devices is gaining momentum, enabling higher reading frequencies and an abundance of monitoring data [13]. Within the machine learning (ML) community, sophisticated tools have been developed for constructing data-driven prediction models in geotechnical engineering [14,15] such as to evaluate soil properties [16], investigate the reliability of excavations [17][18][19], analyze the deformations [20,21] and ground settlement [22], as well as predict geotechnical parameters [23]. The significant advantage of the ML methods is that they can fit highly non-linear behaviors. ...
Article
Full-text available
Pore water pressure (PWP) response is significant for evaluating the earth dams’ stability, and PWPs are, therefore, generally monitored. However, due to the soil heterogeneity and its non-linear behavior within earths, the PWP is usually difficult to estimate and predict accurately in order to detect a pathology or anomaly in the behavior of an embankment dam. This study endeavors to tackle this challenge through the application of diverse machine learning (ML) techniques in estimating the PWP within an existing earth dam. The methods employed include random forest (RF) combined with simulated annealing (SA), multilayer perceptron (MLP), standard recurrent neural networks (RNNs), and gated recurrent unit (GRU). The prediction capability of these techniques was gauged using metrics such as the coefficient of determination (R2), mean square error (MSE), and CPU training time. It was found that all the considered ML methods could give satisfactory results for the PWP estimation. Upon comparing these methods within the case study, the findings suggest that, in this study, multilayer perceptron (MLP) gives the most accurate PWP prediction, achieving the highest coefficient of determination (R2 = 0.99) and the lowest mean square error (MSE = 0.0087) metrics. A sensitivity analysis is then presented to evaluate the models’ robustness and the hyperparameter’s influence on the performance of the prediction model.
Article
Soil Liquefaction has a disastrous impact on structures and underground infrastructure. Therefore, an appropriate liquefaction vulnerability assessment strategy can help reduce the detrimental consequences of this hazard. In recent decades, machine learning has been studied more frequently to solve geotechnical issues, such as determining liquefaction susceptibility. Intending to improve the model’s learning ability to identify liquefaction vulnerability and to find the optimum training and testing data ratio, this research attempts to develop a machine learning model for liquefaction prediction utilizing relatively more varied data in different data ratios. In this study, liquefaction prediction models were developed using four supervised learning-based algorithms: Random Forest (RF), Naïve Bayes Classifier (NBC), Decision Tree (DT), and K-Nearest Neighbor (k-NN). Seven parameters were utilized to train the model using historical data on liquefaction. The model’s performance in predicting liquefaction was compared with various training and testing data ratios and validated using 5-fold cross-validation. The capability of the model was assessed using performance metrics. The results show that the RF model has the highest accuracy in predicting liquefaction among all the algorithms used. RF achieved an overall accuracy of 90.28%, followed by the k-NN (86.11%) and the DT (81.94%) on a training and testing data ratio of 80:20. The NBC algorithm obtained the highest accuracy of 78.44% on the 75:25 data ratio. In general, the machine learning approach is capable of predicting liquefaction susceptibility.
Article
Full-text available
The compression index (Cc) serves as a crucial parameter in predicting consolidation settlement in fine-grained soils, representing the slope of the void ratio logarithmic effective stress curve obtained from oedometer tests. However, traditional consolidation testing methods are notably time-consuming, typically spanning a 15-day period for preparation, execution, and parameter calculation, leading to significant delays in civil engineering projects. Therefore, there is an urgent need for effective methodologies to determine consolidation parameters within a shorter timeframe. Although various empirical formulas have been proposed over the years to correlate compressibility with soil parameters, none have reliably predicted the Cc across different datasets. In this study, to overcome this challenge, an alternative approach using artificial neural network (ANN) methodology to predict the compression index of fine-grained soils based on index properties is proposed. For this purpose, an ANN was trained and validated using a dataset consisting of 560 high and low- plasticity soil samples obtained from construction sites in various regions of Turkey over the last forty years, as well as soil borings in Istanbul. The modeling of artificial neural networks was performed using the Regression Learner program, which integrates with the Matlab 2023a software package and offers a user-friendly graphical interface for AI model development without coding. The data set, which was structured as a matrix with dimensions of 458 × 6, included input parameters such as the natural water content, liquid limit, plastic limit, plastic index and initial void ratio, as well as information on the compression index, which was the output variable. The developed ANN model showed an outstanding predictive performance when predicting the output of the test data, achieving an outstanding R2 score of 0.81. 
This underlines the potential of ANN methodologies to efficiently extract important data with fewer experiments and in less time, and offers promising applications in the field of geotechnical engineering.
Chapter
Reinforced concrete columns are vertical bearing elements that transfer loads from horizontal bearing elements such as slabs and beams to foundations. In this chapter, the cost minimization of a rectangular reinforced concrete column under uniaxial bending is carried out, and the resulting data are used for machine learning. Harmony Search (HS) was used for optimization and Categorical Boosting (CatBoost) was used for the machine learning process. The effect of the model parameters on the success of the model is investigated using the coefficient of determination (R2). The results show that the “iterations” parameter had the highest impact on the model.
Article
In this paper, the infiltration rate of soil is investigated using Random forest regression, and its performance is compared with that of Artificial neural network (ANN) and M5P model tree techniques. A dataset consisting of 132 field measurements was used. Of these, 88 randomly selected observations were used for training, and the remaining 44 were used for testing the models. The input variables were cumulative time (Tf), type of impurities (It), concentration of impurities (Ci), and moisture content (Wc), and the infiltration rate was the output. Correlation coefficient (CC), root mean square error (RMSE), mean absolute error (MAE), relative absolute error (RAE) and root relative square error (RRSE) were used to compare the performance of the modelling approaches. The evaluation results suggest that the Random forest regression approach performs better than the other two models (ANN and M5P model tree). The infiltration rates estimated using Random forest regression lie within the ±25% error lines. Sensitivity analysis suggests that cumulative time is an important parameter for predicting the infiltration rate of the soil.
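The five comparison measures named above can be computed from paired observed and predicted values. The sketch below assumes the commonly used definitions, in which RAE and RRSE are normalised by the deviations of the observations from their mean and are reported as fractions rather than percentages; the function name and the sample values are hypothetical and do not come from the cited paper.

```python
import math


def regression_metrics(observed, predicted):
    """Return CC, RMSE, MAE, RAE and RRSE for paired value lists."""
    n = len(observed)
    if n < 2 or n != len(predicted):
        raise ValueError("need two equal-length lists with at least 2 values")

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    abs_err = [abs(p - o) for o, p in zip(observed, predicted)]
    sq_err = [(p - o) ** 2 for o, p in zip(observed, predicted)]

    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_op = sum((o - mean_o) * (p - mean_p)
               for o, p in zip(observed, predicted))
    if s_oo == 0 or s_pp == 0:
        raise ValueError("observed and predicted values must not be constant")

    return {
        "CC": s_op / math.sqrt(s_oo * s_pp),          # Pearson correlation
        "RMSE": math.sqrt(sum(sq_err) / n),
        "MAE": sum(abs_err) / n,
        "RAE": sum(abs_err) / sum(abs(o - mean_o) for o in observed),
        "RRSE": math.sqrt(sum(sq_err) / s_oo),
    }


# Hypothetical test-set values, for illustration only.
metrics = regression_metrics([2.0, 4.0, 6.0, 8.0], [2.5, 3.5, 6.5, 7.5])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

A higher CC together with lower RMSE, MAE, RAE and RRSE indicates a better model, which is the basis on which the three techniques are ranked.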
Conference Paper
In-place density is an essential parameter in any geotechnical engineering design. Hence, a crude assumption of in-place density can lead to uncertainties and errors in design. In the present study, predictive models have been developed for the estimation of in-place densities using Standard Penetration Test (SPT) blow counts (N-values). A total of 2067 datasets from 1053 boreholes have been used in the study. Using various machine learning techniques, several models have been developed for fine- and coarse-grained soils for the estimation of bulk and dry in-place densities. For coarse-grained soils, it has been observed that the models developed using M5 Model Trees (M5P) and linear regression have the maximum coefficient of determination (R2) and the minimum error for the estimation of bulk and dry density respectively. In the case of fine-grained soils, M5P and Artificial Neural Networks (ANN) have shown the maximum R2 value with the minimum error for the estimation of bulk and dry density respectively. The results indicate that the developed models are very accurate and provide a viable tool for site engineers and consultants to predict missing data and to cross-check observed values.
Conference Paper
The California bearing ratio (CBR) value is a measure of the mechanical strength of a subgrade and is widely used in India for the design of flexible pavements. The test itself is very time consuming and requires skilled technicians. The accuracy of test results can be monitored and verified by developing valid correlations. In this paper, the authors suggest some correlations derived from regression analysis. For this purpose, CBR tests were conducted on a large scale in the Karnal region of Haryana, India. A total of 20 models were prepared, of which 12 were tested by Simple Linear Regression Analysis (SLRA) in order to choose the most influential soil properties. The selected soil properties were incorporated into 8 more models, which were tested by Multiple Linear Regression Analysis (MLRA) to generate the final correlations. Some correlations showed a very good relationship between soil index properties and CBR with minimum standard error, and are hence recommended by the authors for predicting CBR values.
Article
Site effects primarily depend on the shear modulus of subsurface layers, and this is generally estimated from the measured shear wave velocity (Vs) and an assumed density. Densities are very rarely measured for amplification estimation because drilling and sampling processes are time consuming and expensive. In this study, an attempt has been made to derive correlations between the density (dry and wet density) and Vs/SPT (standard penetration test) N values using measured data. A total of 354 measured Vs and density data sets and 364 SPT N value and density data sets from 23 boreholes have been used in the study. Separate relations have been developed for all soil types as well as for fine-grained and coarse-grained soil types. The correlations developed for bulk density were compared with the available data, and it was found that the proposed relation matched well with the existing data. A graphical comparison and validation based on the consistency ratio and cumulative frequency curves was performed, and the newly developed relations were found to demonstrate good prediction performance. An attempt has also been made to propose a relation between the bulk density and shear wave velocity applicable to a wide range of soil and rock by considering data from this study as well as from previous studies. These correlations will be useful for predicting the density (bulk and dry) at sites where the shear wave velocity and SPT N values have been measured.
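The dependence of the shear modulus on density mentioned above follows from the standard small-strain relation G = ρ·Vs². The sketch below only illustrates that relation; the function name and the input values are hypothetical and are not data from the cited study.

```python
def small_strain_shear_modulus(density_kg_m3, vs_m_s):
    """Small-strain shear modulus G = rho * Vs^2, returned in pascals."""
    if density_kg_m3 <= 0 or vs_m_s <= 0:
        raise ValueError("density and shear wave velocity must be positive")
    return density_kg_m3 * vs_m_s ** 2


# Hypothetical layer: bulk density 1800 kg/m^3, Vs = 200 m/s.
g_pa = small_strain_shear_modulus(1800.0, 200.0)
print(f"G = {g_pa / 1e6:.1f} MPa")

# Because G is linear in density, a 10% error in the assumed density
# carries over directly as a 10% error in G.
g_low = small_strain_shear_modulus(0.9 * 1800.0, 200.0)
print(f"relative change = {(g_low - g_pa) / g_pa:.2%}")
```

This is why a measured or well-correlated density, rather than an assumed one, matters when Vs is used to estimate the shear modulus.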
Article
Over the decades, a number of empirical correlations have been proposed to relate the compression index of normally consolidated soils to other soil parameters, such as the natural water content, liquid limit, plasticity index and void ratio. This article likewise attempts to establish a correlation between the compression index and physical properties for the clayey soils of the Mazandaran region. Because of the combined effects of various parameters, an Artificial Neural Network (ANN) has been adopted for predicting the compression index from more simply determined index properties. To develop the ANN model, four hundred consolidation tests on soils sampled at 125 construction sites in the province of Mazandaran, in the north of Iran, were collected; 90% of these were used to train the prediction model and the other 10% were used to test it. The experimentally measured compression indexes were compared with the predictions. Furthermore, the predictions of a number of previously proposed empirical correlations were obtained using the available data, and it has been shown that an improvement of 1-4% with respect to the other correlations has been achieved.
Conference Paper
The compression index (Cc) value is one of the most important engineering properties of soil. This characteristic is frequently used in the design of foundations for the estimation of settlement in clay layers. The calculation of secondary consolidation settlements requires the compression index of the soil, which is the slope of the void ratio versus logarithm of effective stress curve. It can be determined in the laboratory by carrying out an oedometer test; however, the test itself is very time consuming and involves many calculations. In the present study, various correlations between the index properties and consolidation parameters have been reviewed. In addition, the soil index parameters that influence the consolidation characteristics of soil have been studied using various machine learning techniques, namely Linear Regression (LR), Artificial Neural Network (ANN), Support Vector Machine (SVM), Random Forest (RF) and M5 Tree. The results indicate that these techniques are simpler and more efficient than conventional methods for determining consolidation parameters.
Article
For the design of foundations, engineering properties such as the strength and deformability characteristics of soils are very important parameters. Soil properties such as cohesion, angle of friction, shear wave velocity and Poisson’s ratio are important for evaluating vibration parameters by numerical modeling of soil. The manuals of various numerical modeling software packages specify ranges for these parameters. When any of these packages is used, the output results of a problem are mostly very sensitive to these input parameters. Hence, the selection or estimation of proper values of these engineering properties of soil is critical for the analysis of a geotechnical engineering problem. Twelve empirical correlations of soil properties in terms of the common field Standard Penetration Test (SPT) N value have been developed through a random number generation technique. The usefulness of the developed correlations is verified by validating them against experimental values available in the literature, so that they can be used for geotechnical engineering design problems.
Article
Statistical techniques are used to analyze and evaluate experimental data from more than 700 consolidation tests on a large variety of undisturbed soils, and regression equations are developed to estimate the compression index and the compression ratio from classification or index data. These regression equations are then examined within the framework of a variety of similar, but more restricted, empirical relationships that have been reported by other investigators. It is found that both the compression index and the compression ratio can be reasonably well approximated by use of a simple linear regression model involving only the initial void ratio.
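A simple linear regression of the kind described above, with the initial void ratio as the only predictor, can be sketched with ordinary least squares. The code below is a minimal illustration, not the authors' procedure; the function name is hypothetical and the data points are synthetic values invented for the example, so the fitted coefficients carry no engineering meaning.

```python
def fit_simple_linear(x, y):
    """Ordinary least squares fit y = slope * x + intercept, with R2."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need two equal-length lists with at least 2 values")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if s_xx == 0 or s_yy == 0:
        raise ValueError("x and y must not be constant")

    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    r2 = s_xy ** 2 / (s_xx * s_yy)  # equals R2 for a one-predictor fit
    return slope, intercept, r2


# Synthetic (made-up) pairs of initial void ratio and compression index.
e0 = [0.6, 0.8, 1.0, 1.2]
cc_measured = [0.15, 0.20, 0.30, 0.35]

slope, intercept, r2 = fit_simple_linear(e0, cc_measured)
print(f"Cc = {slope:.3f} * e0 + ({intercept:.3f}), R2 = {r2:.3f}")
```

With real consolidation data, the same fit would be repeated for each candidate index property and the candidates compared by R2 and error, as several of the studies listed here do.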