Predictions of Weekly Soil Movements Using Moving-Average and Support-Vector Methods: A Case-Study in Chamoli, India

Praveen Kumar1, Priyanka1, Ankush Pathania1, Shubham Agarwal1, Naresh Mali3,
Ravinder Singh4, Pratik Chaturvedi2, K. V. Uday3, Varun Dutt1
1 Applied Cognitive Science Lab, Indian Institute of Technology Mandi, Himachal Pradesh,
India
2 Defence Terrain Research Laboratory, Defence Research and Development Organisation
(DRDO), New Delhi, India
3 Geohazard Studies Laboratory, Indian Institute of Technology Mandi, Himachal Pradesh,
India
4 National Disaster Management Authority (NDMA), New Delhi, India
Abstract. Landslides plague the Himalayan region, and landslide occurrence is
widespread in hilly areas. Thus, it is important to predict soil movements and
associated landslide events in advance of their occurrence. A recent approach to
predicting soil movements is to use machine-learning techniques. In the
machine-learning literature, both moving-average-based methods (the Seasonal
Autoregressive Integrated Moving Average (SARIMA) model and the
Autoregressive (AR) model) and support-vector-based methods (Sequential
Minimal Optimization; SMO) have been proposed. However, the evaluation of
these methods on real-world landslide prediction has been little explored. The
primary objective of this paper is to compare the SARIMA, AR, and SMO methods
in their ability to predict soil movements (in degrees) recorded at a real-world
landslide site, the Tangni landslide in Chamoli, India. Time-series data about
soil movements from five sensors placed on the Tangni landslide hill were
collected daily over a 78-week period from July 2012 to July 2014. Different
model parameters were calibrated to the training data (the first 62 weeks) and
then made to predict the test data (the last 16 weeks). Results revealed that the
moving-average models (SARIMA and AR) performed better compared to the
support-vector model (SMO) during both training and test. Specifically, the
SARIMA model possessed the smallest error compared to the AR and SMO
models during test. We discuss the implications of using moving-average
methods in predicting soil movements and associated landslides at real-world
landslide locations.
Keywords: Landslides, SARIMA, SMO, Autoregression, Soil movements,
Tangni Landslide.
ICITG2019, 064, v2 (major): ’Predictions of weekly soil movements using moving-average . . . 1
1 Introduction
Landslides plague the Himalayan region, and landslide occurrence is widespread in the
hilly areas of India in the states of Himachal Pradesh and Uttarakhand [1]. These landslides
cause massive damage to life and property [2]. Thus, we need to monitor, predict, and
warn people about soil movements on hills prone to landslides [3]. One way of
predicting soil movements is via machine-learning (ML) algorithms [4]. Such ML
algorithms take certain data attributes as input and predict the value of other attributes
of interest [5].
Several ML algorithms have been proposed in the literature [6-17]. Two of the most
popular families are based upon moving averages and upon support vectors
[10-14, 16, 17]. For example, the Seasonal Autoregressive Integrated Moving Average
(SARIMA) algorithm and the Autoregressive (AR) algorithm are popular
moving-average-based methods, where the prior values in a time-series are used to predict
future values [16, 17]. Similarly, support-vector algorithms like Sequential Minimal
Optimization (SMO) have also been proposed in the literature, where a set of support
vectors close to the decision boundary is used to develop a prediction [10-14].
Prior research on ML algorithms for soil-movement prediction has used
neural-network-based methods [6-9], support-vector-based methods [10-14], and
moving-average-based methods [16, 17]. For example, references [10-14] used support
vector machine algorithms to predict soil movements in landslide-prone locations.
Similarly, references [16, 17] used certain moving-average algorithms to predict
soil movements.
Although several ML algorithms have been used in the literature to predict soil
movements in the recent past [6-17], the comparison of moving-average-based
methods with support-vector-based methods has been less explored. The
primary objective of this paper is to compare certain moving-average-based methods
(SARIMA and AR) with a support-vector-based method (SMO) in their ability
to predict soil movements at an active landslide site in the Himalayan mountains.
Specifically, we use two years of data on weekly soil movements from the Tangni
landslide in Chamoli, India, for our investigation.
In what follows, we first detail the background literature on the use of different
algorithms for predicting soil movements. Next, we detail the working of the SARIMA,
AR, and SMO algorithms and the method of calibrating these algorithms to data from
the Tangni landslide. Finally, we present our results from the different algorithms and
discuss the implications of using moving-average-based and support-vector-based
methods for soil-movement predictions.
2 Background
Several prior research studies have used neural network models [6-9] for predicting soil
movements as well as for finding the triggering factors of such movements. For
example, references [6, 7] proposed a novel neural network technique called ensemble
of extreme learning machine (E-ELM) and investigated the interactions of different
inducing factors affecting soil movements. Next, reference [8] improved the ELM
model by proposing a MEEMD-ELM model. Results revealed that the MEEMD-ELM
model was consistently better than the basic artificial neural networks (ANNs)
and the unmodified EEMD-ELM model in terms of the same measurements.
Furthermore, reference [9] used a multiple-ANNs switched prediction method on three
typical landslides in the Three Gorges Reservoir, namely the Baishuihe landslide, the
Bazimen landslide, and the Shiliushubao landslide. These authors found that the proposed
switched prediction method could significantly improve model generalization compared
to the best individual ANN predictor.
Some researchers have used support-vector-based models for soil-movement
predictions [10-14]. For example, reference [10] predicted the landslide displacement
in the Three Gorges Reservoir, China, using a coupled Particle Swarm Optimization and
Support Vector Machine (PSO-SVM) model. The PSO-SVM model was
based on the precipitation, the variation range of the reservoir, and the
displacements of the prior periods. Results revealed that the proposed PSO-SVM
model could better represent the response relationship between these factors and the
periodic landslide displacement. Similarly, reference [11] presented a comparative
study on landslide nonlinear displacement analysis and prediction using the support
vector machine (SVM), the relevance vector machine (RVM), and the Gaussian process
(GP). Results revealed that the Gaussian process performed better than the support
vector machine, the relevance vector machine, and a simple artificial neural network (ANN)
model. Furthermore, reference [12] used a case study of landslides in the Ecuadorian
Andes and compared the predictive power of logistic regression, support vector
machines, and bootstrap-aggregated classification trees (bagging, double-bagging).
Results revealed that logistic regression with stepwise backward variable selection
yielded the lowest error rates and demonstrated the best generalization capabilities.
Next, reference [13] compared a Least Square Support Vector Machines (LSSVM)
model optimized with a Genetic Algorithm (GA-LSSVM) against a model combining
Double Exponential Smoothing and LSSVM (DES-LSSVM) to empirically forecast landslide
displacement. Results indicated that both the GA-LSSVM and DES-LSSVM models were
suitable for accurately predicting the landslide displacement based on precipitation and
displacement observations. Some researchers have also found that Support Vector
Machine (SVM) regression predicts soil movements at the Baishuihe landslide in the Three
Gorges Reservoir Area, China, with a small error [14].
Some researchers have also used tree-based models for predicting soil movements
[15]. For example, reference [15] presented a methodology for prediction of landslide
movements using random forests, a machine learning algorithm based on regression
trees. The random forest method was established based on a time series consisting of 2
years of data on landslide movement, groundwater level, and precipitation gathered
from the Kostanjek landslide monitoring system and nearby meteorological stations in
Zagreb (Croatia). The validation results showed the capability of the random forest
model to predict the evolution of daily displacements for the period up to 30 days.
Furthermore, some researchers have found moving-average models to perform
accurately in predicting soil movements [16, 17]. For example, reference [16] employed
the Autoregressive Integrated Moving Average (ARIMA) model to forecast
the accumulative displacement of the Bazimen landslide. Results indicated that the
ARIMA method improved the mining result of traditional static data. Reference [17]
compared the ARIMA model and the considerable auto regressive (CAR) model
and found both these methods to yield good results.
Across all the above investigations, researchers have compared either different
support-vector models or different moving-average models. However, research has yet
to compare support-vector models with moving-average models. In this paper, we
address this literature gap by comparing moving-average models (SARIMA and AR)
with a support-vector model (SMO). We perform this investigation by considering the
prediction of soil movements on the Tangni landslide in Chamoli, India. To the best of the
authors’ knowledge, this study is the first of its kind to compare moving-average-based
methods with support-vector-based methods on the Tangni landslide.
3 Study Area
The study was performed on the Tangni landslide in the Chamoli district of Uttarakhand,
India. The study area covers 0.72 km². It is located in the northern Himalayan
region at latitude 30° 27' 54.3" N and longitude 79° 27' 26.3" E, at an altitude of
1450 m (Figure 1A and 1B). As seen in Figure 1B, the landslide is located on
National Highway 58, which connects Ghaziabad in Uttar Pradesh near New Delhi with
Badrinath and Mana Pass in Uttarakhand. The geology of this area consists of slate and
dolomite rocks [3]. The landslide slope is 30° above the road level and 42° below the
road level. The nearby area is a forest of oak and pinewood trees. There have been
several prior landslides in this area causing road blocks and economic losses to tourism
[18].
Fig. 1. (A) Location of the study area. (B) The Tangni landslide on Google Maps.
Data were collected from the Tangni landslide at a daily scale between 1st July 2012
and 1st July 2014 across five different boreholes. These five boreholes are represented
by different colours in Figure 1B (red: borehole 1; green: borehole 2; yellow:
borehole 3; blue: borehole 4; and pink: borehole 5). Each borehole contained five
sensors at different depths (3 m, 6 m, 9 m, 12 m, and 15 m). Data from one sensor in each
of these five boreholes were used for evaluating the moving-average and support-vector
methods.
4 Methodology
4.1 Data Pre-Processing
Data from the Tangni landslide in Chamoli, India, were obtained from the Defence Terrain
Research Laboratory, Defence Research and Development Organisation. The
monitoring system in each of the five boreholes at the Tangni landslide contains
inclinometer sensors at different depths (3 m, 6 m, 9 m, 12 m, and 15 m). These sensors
measure tilt in units of mm per m (essentially the angle by which the inclinometer tilts). Each
inclinometer sensor is 0.5 m long and is installed vertically at its depth
in a borehole. With five sensors per borehole across five boreholes, there are
25 sensors in total.
Figure 2 shows the inclinometer sensor installed in its casing at a certain depth. As
shown in Figure 2, if there is a tilting movement ($\theta$) of the inclinometer of length $L$,
then the horizontal displacement in the tilting direction is $L \sin\theta$. For better
understanding, we converted the displacement in mm per m units into an angle
(degrees), where 1 mm/m displacement equalled 0.05729°.
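The unit conversion described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, and the tilt in mm per m is treated as the sine of the tilt angle (for such small angles the sine, the tangent, and the angle in radians are practically identical).

```python
import math

def mm_per_m_to_degrees(tilt_mm_per_m: float) -> float:
    """Convert a tilt reading in mm per metre into an angle in degrees.

    Assumption: the reading is the horizontal offset per metre of sensor
    length, i.e. sin(theta) = tilt / 1000.
    """
    return math.degrees(math.asin(tilt_mm_per_m / 1000.0))

def horizontal_displacement(length_m: float, tilt_deg: float) -> float:
    """Horizontal displacement (in metres) of a sensor of the given length."""
    return length_m * math.sin(math.radians(tilt_deg))

# One unit of tilt (1 mm/m) expressed in degrees.
one_unit_deg = mm_per_m_to_degrees(1.0)

# Displacement of a 0.5 m sensor tilted by that angle (0.5 mm).
half_metre_offset = horizontal_displacement(0.5, one_unit_deg)
```

Under these assumptions, `one_unit_deg` agrees with the stated factor of 0.05729° to four decimal places.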
Fig. 2. Inclinometer sensor installed in its casing at a certain depth
First, we calculated the relative tilt angle of each sensor from its initial reading at the
time of installation. Second, we chose from each borehole only the sensor that gave
the maximum average tilt angle over the two-year period. Thus, the data was reduced to
five time-series, where each time-series represented the relative tilt per borehole from
the sensor that moved the most in that borehole across the two-year period. As the daily
data was sparse, we averaged the tilt over weeks to yield 78 weeks of average tilt data
per time-series. Figures 3A-3E show the average relative tilt per week from the five
sensors across five boreholes (one sensor per borehole) that showed the maximum
average tilt across 78 weeks. These five time-series were used to compare the
moving-average-based methods with the support-vector-based methods.
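The pre-processing steps above (relative tilt, sensor selection, weekly averaging) can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names and the toy readings are hypothetical, missing days are assumed to be marked with `None`, and "maximum average tilt" is assumed to mean the largest mean absolute relative tilt.

```python
from statistics import mean

def relative_tilt(readings):
    """Tilt relative to the first available (installation-time) reading."""
    first = next(r for r in readings if r is not None)
    return [None if r is None else r - first for r in readings]

def mean_abs_tilt(readings):
    """Mean absolute relative tilt, ignoring missing days."""
    return mean(abs(v) for v in relative_tilt(readings) if v is not None)

def most_active_sensor(borehole):
    """Depth label of the sensor that moved the most in one borehole."""
    return max(borehole, key=lambda depth: mean_abs_tilt(borehole[depth]))

def weekly_average(daily, days_per_week=7):
    """Average consecutive 7-day blocks, skipping missing days."""
    weeks = []
    for start in range(0, len(daily), days_per_week):
        block = [v for v in daily[start:start + days_per_week] if v is not None]
        weeks.append(mean(block) if block else None)
    return weeks

# Toy borehole with two sensors and eight days of readings (degrees).
borehole = {
    "3m": [0.10, 0.10, None, 0.12, 0.11, 0.10, 0.10, 0.11],
    "6m": [0.50, 0.40, 0.20, None, 0.10, 0.00, -0.10, -0.20],
}
depth = most_active_sensor(borehole)
weekly = weekly_average(relative_tilt(borehole[depth]))
```

In this toy example the 6 m sensor is selected, and its eight daily readings collapse into two weekly averages.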
Fig. 3. Average tilt angle in degrees across five sensors (one per borehole). (A) Borehole 1 and
3m sensor. (B) Borehole 2 and 12m sensor. (C) Borehole 3 and 6m sensor. (D) Borehole 4 and
15m sensor. (E) Borehole 5 and 15m sensor.
By convention, a negative tilt angle denoted downhill motion and a positive tilt angle
denoted uphill motion. As seen in Figure 3C, a downhill motion starts from -0.11° in the 73rd
week and suddenly becomes larger (-4.4°) in the last four weeks. The data was split in
an approximately 80:20 ratio (the first sixty-two weeks for training and the last sixteen
weeks for testing) for all machine-learning algorithms.
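The split described above is chronological rather than random, so the test weeks always come after the training weeks. A minimal sketch, assuming a hypothetical helper `chronological_split` and a placeholder series standing in for the 78 weekly averages:

```python
def chronological_split(series, n_train):
    """Split a time-series into an initial training part and a final test part."""
    if not 0 < n_train < len(series):
        raise ValueError("n_train must lie strictly between 0 and len(series)")
    return series[:n_train], series[n_train:]

# Placeholder values standing in for 78 weekly average tilt angles.
weekly_tilt = [float(week) for week in range(78)]

train, test = chronological_split(weekly_tilt, n_train=62)
train_fraction = len(train) / len(weekly_tilt)  # roughly 0.79
```

No shuffling is applied, because shuffling would let information from later weeks leak into the calibration of the models.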
4.2 Seasonal Auto-Regressive Integrated Moving-Average
Seasonal Auto-Regressive Integrated Moving-Average (SARIMA) is a statistical
forecasting method popular for univariate time-series data that may contain trend and
seasonal components. It predicts the time-series by describing the auto-correlations in
the data [19].
Stationarity of Time-Series: A time-series whose mean, variance, and auto-correlation
are constant over time is stationary. Most statistical forecasting methods
assume that a time-series can be made approximately stationary using mathematical
transformations such as differencing [20]. The first step of building a SARIMA model
is stationarizing the data.
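As an illustration of differencing, the sketch below subtracts from each value the value observed `lag` steps earlier. The function name and the toy series are hypothetical; a lag of 1 removes a linear trend, and a lag equal to the seasonal period removes a repeating seasonal pattern.

```python
def difference(series, lag=1):
    """Return the series of differences y[t] - y[t - lag]."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

# A series with a linear trend becomes constant after lag-1 differencing.
trending = [0.0, 0.5, 1.0, 1.5, 2.0]
first_difference = difference(trending)

# A series repeating every 2 steps becomes zero after lag-2 differencing.
seasonal = [1, 2, 1, 2, 1, 2]
seasonal_difference = difference(seasonal, lag=2)
```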
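Auto-Regressive (AR): An auto-regressive component in the SARIMA model
predicts a variable at the current time step from the past values of the same variable.
Thus, an auto-regressive model is defined as:
$$y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t \qquad (8)$$
where $p$ is the auto-regressive trend parameter, $c$ is a constant, $\phi_1, \ldots, \phi_p$ are the
auto-regressive coefficients, $\varepsilon_t$ is white noise, and $y_{t-1}, y_{t-2}, \ldots, y_{t-p}$
denote the movement data of the previous weeks [19].
Moving-Average (MA): A moving-average component in the SARIMA model uses
past prediction errors in a regression model. A moving-average model is defined as:
$$y_t = c + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \cdots + \theta_q \varepsilon_{t-q} \qquad (9)$$
where $q$ is the moving-average trend parameter, $\theta_1, \ldots, \theta_q$ are the moving-average
coefficients, $\varepsilon_t$ is white noise, and $\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots, \varepsilon_{t-q}$
are the error terms at previous weeks.
If we combine the auto-regressive (AR) model in equation (8) and the moving-average (MA)
model in equation (9) on stationary data, we obtain a non-seasonal ARIMA model, which
is defined as:
$$y'_t = c + \phi_1 y'_{t-1} + \cdots + \phi_p y'_{t-p} + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q} + \varepsilon_t \qquad (10)$$
where $y'_t$ denotes the differenced (stationary) series.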
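To make the combination of the AR and MA parts concrete, the sketch below computes a one-step-ahead forecast from given coefficients. It is an illustration only: the function name, the coefficient values, and the toy history are hypothetical, the series is assumed to be already stationary, and the expected value of the future error term is taken to be zero.

```python
def arma_one_step(history, residuals, c, phi, theta):
    """One-step-ahead forecast of an ARMA(p, q) model.

    history   -- past observations, most recent last
    residuals -- past one-step forecast errors, most recent last
    c         -- constant term
    phi       -- AR coefficients [phi_1, ..., phi_p]
    theta     -- MA coefficients [theta_1, ..., theta_q]
    """
    ar_part = sum(phi_i * history[-i] for i, phi_i in enumerate(phi, start=1))
    ma_part = sum(theta_j * residuals[-j] for j, theta_j in enumerate(theta, start=1))
    return c + ar_part + ma_part

# ARMA(2, 1) example: 0.1 + 0.5 * 2.0 + 0.25 * 1.0 + 0.2 * 0.5
forecast = arma_one_step(
    history=[1.0, 2.0],
    residuals=[0.5],
    c=0.1,
    phi=[0.5, 0.25],
    theta=[0.2],
)
```

Setting `theta=[]` reduces the forecast to a pure AR model, and setting `phi=[]` reduces it to a pure MA model.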
SARIMA builds upon an ARIMA model by adding seasonal parts to the ARIMA
model. The seasonal parts of the model can have an AR factor, an MA factor,
and an order-of-difference term. All these factors in seasonal data operate across
multiples of the number of lagged periods in a season. In SARIMA, the three trend
elements that require calibration are the trend AR order ‘p’, the trend difference order
‘d’, and the trend MA order ‘q’. The four additional seasonal elements that require
calibration are the seasonal AR order ‘P’, the seasonal difference order ‘D’, the seasonal
MA order ‘Q’, and the number of time steps ‘m’ in a single seasonal period. A SARIMA model
performs differencing of order D at a lag equal to the seasonal period ‘m’ to remove
additive seasonal effects. As with lag-1 differencing to remove the trend, the lag-‘m’
differencing introduces a moving-average term.
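For reference, the seven orders described above are commonly collected into a single equation using the backshift operator $B$, where $B y_t = y_{t-1}$. The form below is the standard textbook statement of a seasonal ARIMA model and is given here as a sketch; the symbols $\phi_i$, $\theta_j$, $\Phi_i$, $\Theta_j$, $c$, and $\varepsilon_t$ are notation assumed for this illustration.

```latex
\begin{align*}
\phi_p(B)\,\Phi_P(B^m)\,(1 - B)^d\,(1 - B^m)^D\, y_t
  &= c + \theta_q(B)\,\Theta_Q(B^m)\,\varepsilon_t, \\
\phi_p(B) &= 1 - \phi_1 B - \cdots - \phi_p B^p, \\
\theta_q(B) &= 1 + \theta_1 B + \cdots + \theta_q B^q, \\
\Phi_P(B^m) &= 1 - \Phi_1 B^{m} - \cdots - \Phi_P B^{Pm}, \\
\Theta_Q(B^m) &= 1 + \Theta_1 B^{m} + \cdots + \Theta_Q B^{Qm}.
\end{align*}
```

Here $(1 - B)^d$ is the trend differencing of order d and $(1 - B^m)^D$ is the seasonal differencing of order D at lag m.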
4.3 Sequential Minimal Optimization
John Platt invented sequential minimal optimization (SMO) in 1998 [21]. It is a widely
used algorithm for solving the quadratic programming (QP) problem that arises during
the training of support vector machines. The goal of the SMO algorithm is to return
the alpha parameters (Lagrange multipliers) that solve the following constrained
optimization problem:
$$\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} y_i y_j K(x_i, x_j)\, \alpha_i \alpha_j \qquad (11)$$
For a Kernel function:
$$K(x_i, x_j) \qquad (12)$$
and
$$0 \le \alpha_i \le C \;\; \text{for } i = 1, \ldots, n, \qquad \sum_{i=1}^{n} y_i \alpha_i = 0 \qquad (13)$$
where $x_i$ are the input vectors, $y_i$ are the corresponding targets, and $C$ is the
complexity parameter.
4.4 Auto Regression
Auto regression (AR) is a time-series model that uses observations from previous time
steps as input to a regression equation to predict the value at the next time step. The
input variables, taken as observations at previous time steps, are called lag variables.
For example, we can predict the value for
the next time step (t+1) given the observations at the last two time steps (t-1 and t-2).
As a regression model, this would look as follows:
$$y_{t+1} = \beta_0 + \beta_1 y_{t-1} + \beta_2 y_{t-2} \qquad (14)$$
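A regression of this kind can be calibrated by ordinary least squares. The sketch below is an assumption-laden illustration rather than the procedure used in this study: it uses the common AR(p) convention in which the value at time t is regressed on lags 1 to p, the helper names are hypothetical, and the normal equations are solved with a small Gaussian elimination so that only the standard library is needed.

```python
def solve_linear_system(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit_ar(series, p):
    """Least-squares estimate of [beta_0, beta_1, ..., beta_p] for an AR(p) model."""
    rows = [[1.0] + [series[t - i] for i in range(1, p + 1)]
            for t in range(p, len(series))]
    targets = series[p:]
    k = p + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict_next(series, beta):
    """Forecast the next value from the most recent len(beta) - 1 observations."""
    p = len(beta) - 1
    return beta[0] + sum(beta[i] * series[-i] for i in range(1, p + 1))

# Noise-free toy series generated by y[t] = 0.5 + 0.6 y[t-1] - 0.3 y[t-2].
series = [1.0, 2.0]
for _ in range(10):
    series.append(0.5 + 0.6 * series[-1] - 0.3 * series[-2])

beta = fit_ar(series, p=2)
next_value = predict_next(series, beta)
```

On this noise-free series the fit recovers the generating coefficients; on real tilt data the coefficients would only approximate the underlying dynamics.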
4.5 Optimization of Model Parameters
Sequential Minimal Optimization (SMO). The SMO algorithm has two parameters. The first
is the complexity parameter (C), which is used when building the 'hyperplane' that the
support-vector method fits for classification, regression, or other tasks. The C parameter
controls how many instances are used as 'support vectors' to draw the linear boundary
in the transformed Euclidean feature space. The second parameter of the SMO
algorithm is the exponent (E) of the kernel function. The kernel function
transforms the data into another dimension that has a clear dividing margin between
classes of data [21]. We varied the C and E parameters in SMO as follows:
C = 0, 1 and E = 1, 2, 3, 4 for the polynomial kernel; C = 0, 1 and E = 1, 2 for the normalized
polynomial kernel; and C = 0 and E = 1 for the RBF kernel.
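The three kernel families named above can be written down in their common textbook forms. The sketch below is illustrative only: the exact kernel definitions used by the SMO implementation in this study are not stated in the text, so these forms, the function names, and the `gamma` value of the RBF kernel are assumptions.

```python
import math

def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, exponent=1):
    """Polynomial kernel <x, y>^E; E = 1 gives the linear kernel."""
    return dot(x, y) ** exponent

def normalized_polynomial_kernel(x, y, exponent=2):
    """Polynomial kernel rescaled so that K(x, x) = 1 for every non-zero x."""
    scale = math.sqrt(polynomial_kernel(x, x, exponent)
                      * polynomial_kernel(y, y, exponent))
    return polynomial_kernel(x, y, exponent) / scale

def rbf_kernel(x, y, gamma=0.01):
    """Radial basis function kernel exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Two lagged-tilt feature vectors (hypothetical values).
u, v = [1.0, 2.0], [3.0, 4.0]
linear_value = polynomial_kernel(u, v, exponent=1)
quadratic_value = polynomial_kernel(u, v, exponent=2)
```

A larger exponent E lets the model fit more strongly curved relationships between lagged and future tilt values, at the cost of a higher risk of over-fitting.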
SARIMA. This model has eight parameters: p, d, q, P, D, Q, m, and Trend. Here, p is
the number of autoregressive (AR) terms, d is the order of differencing (I), and q is the
size of the moving-average (MA) window. Table 1 shows the range of values for the
different parameters in the SARIMA model. The trend parameter takes four different
values: absent means no trend, constant means a constant (horizontal) trend, linear
means a linear trend, and constant with linear trend means there is both a
constant and a linear trend. The m parameter is the number of time steps in a single
seasonal period. A parameter value of zero means we do not include that parameter in
the model. A reason for using the SARIMA model was that it allows one to account for
a seasonal trend present in the time-series.
Table 1. Parameters for SARIMA.
Parameters                      Range of Values
Trend Auto-Regressive (p)       [0, 1, 2]
Trend Differencing (d)          [0, 1]
Trend Moving-Average (q)        [0, 1, 2]
Trend                           [Absent, Constant, Linear, Constant with Linear Trend]
Seasonal Auto-Regressive (P)    [0, 1, 2]
Seasonal Differencing (D)       [0, 1]
Seasonal Moving-Average (Q)     [0, 1, 2]
Seasonal Periods (m)            [0, 1]
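The ranges in Table 1 define a finite grid of candidate SARIMA configurations. The sketch below enumerates that grid and shows how a best configuration could be picked. It is an illustration under assumptions: the dictionary keys, the trend labels, and the `select_best` helper are hypothetical, and the scoring function (for example, the training RMSE of a fitted model) is left to the caller because the fitting routine used in this study is not specified.

```python
from itertools import product

# Candidate values, copied from Table 1.
grid_values = {
    "p": [0, 1, 2],
    "d": [0, 1],
    "q": [0, 1, 2],
    "trend": ["absent", "constant", "linear", "constant+linear"],
    "P": [0, 1, 2],
    "D": [0, 1],
    "Q": [0, 1, 2],
    "m": [0, 1],
}

# Every combination of the eight parameters.
candidates = [dict(zip(grid_values, combo))
              for combo in product(*grid_values.values())]

def select_best(candidates, score):
    """Return the candidate with the lowest score (e.g. lowest RMSE)."""
    return min(candidates, key=score)

n_candidates = len(candidates)  # 3 * 2 * 3 * 4 * 3 * 2 * 3 * 2 combinations
```

An exhaustive search of this size is feasible for short weekly series; some combinations may be invalid for a given fitting routine and would need to be skipped.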
Autoregression. This algorithm has parameters corresponding to the beta coefficients
($\beta_0, \beta_1, \ldots, \beta_n$) and the last n lag terms (t-1, …, t-n).
5 Results
Each algorithm was calibrated to each time-series independently. Table 2 shows the
root-mean-squared error (RMSE) of the three algorithms, AR, SMO, and SARIMA,
on the training data across the five boreholes. As can be seen in
the table, the AR algorithm possessed the smallest average RMSE (0.314°), followed by
the SMO algorithm (0.353°) and the SARIMA algorithm (0.563°).
Table 2. The RMSE (in degrees of angle) of different algorithms in the training dataset.
Algorithm   Borehole 1   Borehole 2   Borehole 3   Borehole 4   Borehole 5   Average
            (3 m)        (12 m)       (6 m)        (15 m)       (15 m)
AR          0.844        0.000        0.002        0.015        0.708        0.314
SMO         0.951        0.000        0.002        0.017        0.797        0.353
SARIMA      1.047        0.421        0.016        0.502        0.830        0.563
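For clarity, the error measure reported in the tables can be sketched as follows. The `rmse` function is a standard definition with a hypothetical name; the second part recomputes the average for the AR row from the five per-borehole values listed in Table 2.

```python
import math

def rmse(observed, predicted):
    """Root-mean-squared error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Toy check: errors of 0 and 2 degrees give an RMSE of sqrt(2).
toy_rmse = rmse([1.0, 2.0], [1.0, 4.0])

# Per-borehole training RMSE of the AR algorithm, copied from Table 2.
ar_training_rmse = [0.844, 0.000, 0.002, 0.015, 0.708]
ar_average = round(sum(ar_training_rmse) / len(ar_training_rmse), 3)
```

Because the RMSE squares each error before averaging, a few weeks with large sudden movements can dominate the score of a borehole.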
Tables 3, 4, and 5 show the optimized values of the different parameters in the SMO,
SARIMA, and AR algorithms. For example, in Table 3, the best values of the C and E
parameters were 1.0 for a polynomial kernel function across all boreholes. Similarly,
most of the SARIMA models showed a non-zero seasonal Q parameter and a non-zero
non-seasonal q parameter. In certain cases, the seasonal P, the non-seasonal p, and the
non-seasonal d parameters possessed non-zero values. In the AR model, the lag terms
varied between 0 and 7 across different boreholes.
Table 3. Optimized hyper-parameters for SMO.
Parameters        Values
E                 1
C                 1
Kernel function   Polynomial
Table 4. Optimized parameters for SARIMA.
Borehole   Depth   Best Set of Parameters [(p, d, q), (P, D, Q, m), 'Trend']
1          3 m     [(0, 1, 0), (0, 0, 1, 0), 'Absent']
2          12 m    [(0, 0, 1), (1, 0, 1, 1), 'Constant']
3          6 m     [(0, 1, 1), (0, 0, 0, 0), 'Absent']
4          15 m    [(2, 0, 1), (0, 0, 1, 0), 'Absent']
5          15 m    [(1, 0, 0), (0, 0, 0, 0), 'Absent']
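As a worked reading of two rows of Table 4 that have no seasonal terms and no trend, the parameter sets for boreholes 3 and 5 correspond to the following model equations. The notation is assumed for this illustration: $y_t$ is the weekly tilt, $\varepsilon_t$ is white noise, and $\phi_1$ and $\theta_1$ are the fitted AR and MA coefficients, whose values are not reported in the text.

```latex
\begin{align*}
\text{Borehole 3, } (p, d, q) = (0, 1, 1):\quad
  & y_t - y_{t-1} = \varepsilon_t + \theta_1\,\varepsilon_{t-1}, \\
\text{Borehole 5, } (p, d, q) = (1, 0, 0):\quad
  & y_t = \phi_1\, y_{t-1} + \varepsilon_t .
\end{align*}
```

In words, the borehole 3 model forecasts the week-to-week change in tilt from the previous forecast error, while the borehole 5 model is a first-order auto-regression on the tilt itself.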
Table 5. Optimized parameters for Autoregression.
Borehole   Depth   Function
1          3 m     (equation not recoverable)
2          12 m    (equation not recoverable)
3          6 m     (equation not recoverable)
4          15 m    (equation not recoverable)
5          15 m    (equation not recoverable)
Table 6 shows the RMSEs from different models across different boreholes in the last
16-weeks of test data. As can be seen in the table, the SARIMA algorithm performed
best compared to the SMO and AR algorithms.
Table 6. The RMSE (in degrees of angle) of different algorithms in the test dataset.
Algorithm   Borehole 1   Borehole 2   Borehole 3   Borehole 4   Borehole 5   Average
            (3 m)        (12 m)       (6 m)        (15 m)       (15 m)
SARIMA      0.000        0.006        0.525        0.068        1.118        0.343
SMO         0.021        0.000        0.610        0.065        1.158        0.371
AR          0.632        0.000        0.572        0.067        1.419        0.538
Figure 4 shows the fits of the SARIMA algorithm to the time-series data across the
five boreholes in the training and test datasets. Overall, these fits are reasonably
good: the RMSE averaged 0.563° in training (Table 2) and 0.343° in test (Table 6).
Fig. 4. Relative angle (in degree) over training data (62 weeks) and test data (16 weeks) from
the best performing SARIMA algorithm. (A) and (B): borehole 1, 3m depth. (C) and (D):
borehole 2, 12m depth. (E) and (F): borehole 3, 6m depth. (G) and (H): borehole 4, 15m depth.
(I) and (J): borehole 5, 15m depth.
6 Discussion and Conclusions
One focus of machine-learning algorithms could be the prediction of soil movements
in advance to warn people in time about impending landslides. In this work, we applied
both moving-average-based models (SARIMA and AR) and a support-vector-based
model (SMO) to weekly soil-movement data from the Tangni landslide in Chamoli,
India. All models were calibrated on the first 80% of the data and tested on the last 20% of
the data. All models could predict the soil movements in the following week given the
history of movements in prior weeks. Our results revealed that the moving-average-based
models (SARIMA and AR) outperformed the support-vector-based model
(SMO) both during model training and testing. During model testing, among the
moving-average-based models, the SARIMA model outperformed the AR model as well.
One likely reason for the SARIMA model performing better than the SMO
and AR models is that this model has seasonal, auto-regressive, integrated,
and moving-average components built into its working. In contrast, the SMO attempts
to linearize the prediction problem using a kernel function, where the optimized kernel
function may not always be able to do so consistently across all boreholes. Similarly,
the AR model contains only the AR component and does not possess the
seasonal, integrated, and moving-average components needed to perform as well as the
SARIMA model.
In this paper, we were able to show that moving-average-based methods outperform
support-vector-based methods for real-world soil-movement predictions. As
part of our future research, we plan to extend these analyses to neural-network-based
methods, including both artificial neural networks and recurrent neural
networks (e.g., long short-term memory models). These ideas form the
immediate next steps in our program on soil-movement predictions using
machine-learning techniques.
7 References
1. Pande, R. K. (2006). Landslide problems in Uttaranchal, India: issues and
challenges. Disaster Prevention and Management: An International
Journal, 15(2), 247-255.
2. Parkash, Surya. (2011). Historical Records of Socio-economically Significant
Landslides in India. Journal of South Asia Disaster Studies.
3. Chaturvedi, P., Srivastava, S., & Kaur, P. B. (2017). Landslide Early Warning
System Development Using Statistical Analysis of Sensors’ Data at Tangni
Landslide, Uttarakhand, India. In Proceedings of Sixth International
Conference on Soft Computing for Problem Solving (pp. 259-270). Springer,
Singapore.
4. Korup, Oliver & Stolle, Amelie. (2014). Landslide prediction from machine
learning. Geology Today. 30. 10.1111/gto.12034.
5. Duda, R. O., Hart, P. E., & Stork, D. G. (2012). Pattern classification. John
Wiley & Sons.
6. Lian, C., Zeng, Z., Yao, W., & Tang, H. (2014). Ensemble of extreme learning
machine for landslide displacement prediction based on time series
analysis. Neural Computing and Applications, 24(1), 99-107.
7. Cao, Y., Yin, K., Alexander, D. E., & Zhou, C. (2016). Using an extreme
learning machine to predict the displacement of step-like landslides in relation
to controlling factors. Landslides, 13(4), 725-736.
8. Lian, C., Zeng, Z., Yao, W., & Tang, H. (2014). Ensemble of extreme learning
machine for landslide displacement prediction based on time series
analysis. Neural Computing and Applications, 24(1), 99-107.
9. Lian, C., Zeng, Z., Yao, W., & Tang, H. (2015). Multiple neural networks
switched prediction for landslide displacement. Engineering geology, 186, 91-
99.
10. Zhou, C., Yin, K., Cao, Y., & Ahmed, B. (2016). Application of time series
analysis and PSO-SVM model in predicting the Bazimen landslide in the
Three Gorges Reservoir, China. Engineering geology, 204, 108-120.
11. Liu, Z., Shao, J., Xu, W., Chen, H., & Shi, C. (2014). Comparison on landslide
nonlinear displacement analysis and prediction with computational
intelligence approaches. Landslides, 11(5), 889-896.
12. Brenning, A. (2005). Spatial prediction models for landslide hazards: review,
comparison and evaluation. Natural Hazards and Earth System Science, 5(6),
853-862.
13. Zhu, X., Xu, Q., Tang, M., Nie, W., Ma, S., & Xu, Z. (2017). Comparison of
two optimized machine learning models for predicting displacement of
rainfall-induced landslide: A case study in Sichuan Province,
China. Engineering geology, 218, 213-222.
14. Zhu, C. H., & Hu, G. D. (2013). Time series prediction of landslide
displacement using SVM model: application to Baishuihe landslide in Three
Gorges reservoir area, China. In Applied Mechanics and Materials (Vol. 239,
pp. 1413-1420). Trans Tech Publications.
15. Krkač, M., Špoljarić, D., Bernat, S., & Arbanas, S. M. (2017). Method for
prediction of landslide movements based on random
forests. Landslides, 14(3), 947-960.
16. Duan, G. H., & Niu, R. Q. (2013). A method of dynamic data mining for
landslide monitoring data. Journal of Yangtze River Scientific Research
Institute, (5), 10.
17. Qiang, L. I., & Duan-you, L. I. (2005). Research of Dynamic Predication
Technique for Landslide Displacement Monitoring. Journal of Yangtze
River Scientific Research Institute, 6.
18. India News (2013, August 13). Landslides near Badrinath in Uttarakhand.
Retrieved April 7, 2019,
from https://www.indiatvnews.com/news/india/landslides-near-badrinath-in-
uttarakhand-26296.html
19. Asteriou, D., & Hall, S. G. (2011). ARIMA models and the Box-Jenkins
methodology. Applied Econometrics, 2(2), 265-286.
20. Hyndman, R. J., & Athanasopoulos, G. (2018). Forecasting: principles and
practice. OTexts.
21. Platt, J. (1998). Sequential minimal optimization: A fast algorithm for training
support vector machines (Technical Report MSR-TR-98-14). Microsoft Research.
14 ICITG2019, 064, v2 (major): ’Predictions of weekly soil movements using moving-average . . .
... The forecasting performance of these methods is justified by the authors through small errors and obviously high overlaps between the forecast and the ground truth (e.g. [17]). Furthermore, artificial intelligence (AI) / ML methods are used by [13] to forecast TBM operational data and by [18] to forecast the Q-index [19] of different construction sites. ...
... SVR has already been used several times for geotechnical applications (e.g. [13,17,35]). For this study, we used the linear SVR model from the library scikit-learn [36] (version 0.21.3), as it scales better to bigger datasets than scikit-learn's standard SVR. ...
In tunneling, predictions of the rock mass conditions ahead of the face are of great interest because they allow appropriate countermeasures to be taken at the right time. Besides investigations such as drilling or geophysics, new approaches in mechanized tunneling aim at forecasting the geology ahead via machine-learning models. These models are trained to forecast tunnel boring machine (TBM) data by learning from data recorded in already excavated parts of the tunnel. Judged by their high accuracies alone, these results may look promising at first sight, but such forecasts are mostly just delayed and slightly altered versions of the input data, and no predictive value can result from them. This paper shows deficits in the current practice of data-driven forecasts ahead of the tunnel face and provides impetus for further research in this particular field and in TBM data analysis in general.
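The concern raised in this abstract, that a seemingly accurate forecast may be little more than a delayed copy of its input, can be checked by scoring a model against a naive persistence baseline. The following is a minimal Python sketch under stated assumptions: the function names, the choice of mean absolute error, and the toy series are illustrative and are not taken from the cited study.

```python
def mean_absolute_error(actual, predicted):
    """Mean absolute error between two equal-length, non-empty sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def persistence_forecast(series, horizon=1):
    """Naive baseline: the value `horizon` steps ahead equals the current value."""
    if horizon < 1 or horizon >= len(series):
        raise ValueError("horizon must be between 1 and len(series) - 1")
    return list(series[:-horizon])


def skill_vs_persistence(series, model_forecast, horizon=1):
    """Return 1 - MAE(model) / MAE(persistence).

    A value near 0 (or below) means the model adds nothing over repeating
    the last observed value; a value of 1 means a perfect forecast.
    """
    baseline = persistence_forecast(series, horizon)
    target = list(series[horizon:])
    baseline_error = mean_absolute_error(target, baseline)
    if baseline_error == 0:
        raise ValueError("persistence baseline is already perfect; skill is undefined")
    return 1.0 - mean_absolute_error(target, model_forecast) / baseline_error


# Toy series (hypothetical values, for illustration only).
series = [1.0, 2.0, 4.0, 7.0, 11.0]
lagged_copy = series[:-1]  # a "forecast" that merely repeats the previous value
perfect = series[1:]       # a forecast that matches the observations exactly

print(skill_vs_persistence(series, lagged_copy))  # 0.0
print(skill_vs_persistence(series, perfect))      # 1.0
```

A forecast that only shifts the input by one step scores zero skill here, however close it looks when plotted against the observations.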
Machine learning (ML) offers an extensive range of techniques that could be applied to forecasting soil movements using historical soil movements and other variables. For example, researchers have proposed recurrent ML techniques such as long short-term memory (LSTM) models for forecasting time-series variables. However, the application of novel LSTM models to forecasting time series involving soil movements is yet to be fully explored. The primary objective of this research is to develop and test a new ensemble LSTM technique (called "Bidirectional-Stacked-LSTM" or "BS-LSTM"). In the BS-LSTM model, forecasts of soil movements are derived from a bidirectional LSTM for a period. These forecasts are then fed into a stacked LSTM to derive the next period's forecast. For developing the BS-LSTM model, datasets from two real-world landslide sites in India were used: Tangni (Chamoli district) and Kumarhatti (Solan district). The initial 80% of soil movements in both datasets were used for model training and the last 20% of soil movements in both datasets were used for model testing. The BS-LSTM model's performance was compared to other LSTM variants, including a simple LSTM, a bidirectional LSTM, a stacked LSTM, a CNN-LSTM, and a Conv-LSTM, on both datasets. Results showed that the BS-LSTM model outperformed all other LSTM model variants during training and test in both the Tangni and Kumarhatti datasets. This research highlights the utility of developing recurrent ensemble models for forecasting soil movements ahead of time.
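The split described in this abstract (the initial 80% of the series for training, the last 20% for testing) is a chronological split, with no shuffling, so that the test period always follows the training period. A minimal Python sketch is given below; the function name and the placeholder data are assumptions made for illustration and do not reproduce the authors' code.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered sequence into (train, test) without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    split_index = int(len(series) * train_fraction)
    return list(series[:split_index]), list(series[split_index:])


# Ten placeholder observations in time order (hypothetical values).
observations = [0.1, 0.2, 0.2, 0.3, 0.5, 0.4, 0.6, 0.7, 0.9, 1.0]
train, test = chronological_split(observations)

print(len(train), len(test))  # 8 2
```

Keeping the order intact matters for time series: a shuffled split would let the model train on observations that come after those it is tested on.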
The Tangni landslide in Chamoli, India, has experienced several landslide incidents in the recent past. Due to the fatalities and injuries caused, it is essential to predict slope movements at this site. A recent approach to predicting slope movements is via machine-learning algorithms. In machine learning literature, recurrent neural networks (simple LSTMs, stacked LSTMs, bidirectional LSTMs, convolutional LSTMs, CNN-LSTMs, and encoder-decoder LSTMs) and non-recurrent neural networks (multilayer perceptrons) have been proposed. However, evaluating recurrent and non-recurrent neural networks for real-world slope movement prediction has been less explored. This research's primary objective is to develop and evaluate novel recurrent and non-recurrent neural network algorithms in their ability to predict slope movements. We used two years' weekly data of slope movements from the Tangni landslide site in Chamoli, India. Different recurrent and non-recurrent neural networks were calibrated on the training data and then predicted the test data. Different hyperparameters (epochs; packet shuffle; look-back period; the number of nodes per layer; and the number of layers) were calibrated to training data. Later, the developed models were evaluated on test data. Results revealed that, during training, the recurrent stacked LSTMs and bidirectional LSTMs performed the best and second-best, respectively, compared to other recurrent and non-recurrent neural networks. However, during the test, the recurrent CNN-LSTMs and simple LSTMs performed best and second best, respectively, compared to other recurrent and non-recurrent neural networks. We discuss the implications of our results for predicting slope movements at real-world landslide sites.
Natural disasters such as landslides cause a lot of damage to life and property. However, less is known about how one could generate accurate alerts against landslides sufficiently ahead of time. The primary objective of this research is to develop and cross-validate a new ensemble gradient boosting algorithm for generating specific alerts about impending soil movements at a real-world landslide site. Data about soil movements at 10-minute intervals were collected via a landslide monitoring system deployed at a real-world landslide site situated at the Gharpa hill, Mandi, India. A new ensemble support vector machine-extreme gradient boosting (SVM-XGBoost) algorithm was developed, where the alert predictions of an SVM algorithm were fed into an XGBoost classifier to predict the alert severity 10 minutes ahead of time. The performance of the SVM-XGBoost algorithm was compared to other algorithms, including Naïve Bayes (NB), decision trees (DTs), random forest (RF), SVMs, XGBoost, and different new XGBoost variants (NB-XGBoost, DT-XGBoost, and RF-XGBoost). Results revealed that the new SVM-XGBoost algorithm significantly outperformed the other algorithms in correctly predicting soil movement alerts 10 minutes ahead of time. We highlight the utility of developing newer ensemble-based machine-learning algorithms for alert generation against impending landslides in the real world.
Rainfall-induced landslides account for over 200 deaths and losses of over Rs. 550 crores annually in the Himalaya. The literature suggests that sensor-based, site-specific Early Warning Systems (EWS) are a feasible and economical way to curtail losses due to landslides in high-risk areas. The area selected for the current study is the Tangni landslide, located in the Chamoli district of Uttarakhand state, India, owing to the high anticipated risk to the local community residing nearby. To realize the EWS, a near-real-time instrumentation setup was installed on the slope. The setup measures pore water pressure, sub-surface deformations, and surface displacements along with rainfall. Regression models are developed using antecedent rainfall and deformation data, and these are then used to derive sensor thresholds based on z-scores. In future, using the results from the sensors installed in the field and laboratory characterizations, numerical analyses will be applied to develop a process-based model.
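The abstract does not spell out how the z-score thresholds are computed. One common reading, sketched below in Python purely as an assumption, is to set an alert threshold at the historical mean plus a chosen number of standard deviations and to flag any new reading above it. The function names, the cutoff value, and the sample readings are hypothetical.

```python
import statistics


def zscore_threshold(history, z_cutoff=3.0):
    """Reading that corresponds to a z-score of `z_cutoff` for the historical data."""
    mean = statistics.mean(history)
    std = statistics.pstdev(history)  # population standard deviation
    return mean + z_cutoff * std


def flag_exceedances(history, new_readings, z_cutoff=3.0):
    """Return True for each new reading that exceeds the z-score threshold."""
    threshold = zscore_threshold(history, z_cutoff)
    return [value > threshold for value in new_readings]


# Hypothetical sensor history with mean 5.0 and standard deviation 2.0.
history = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(zscore_threshold(history, z_cutoff=2.0))               # 9.0
print(flag_exceedances(history, [8.9, 9.5], z_cutoff=2.0))   # [False, True]
```

The cutoff trades missed events against false alarms, so in practice it would be tuned per sensor rather than fixed in advance.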
Using an extreme learning machine to predict the displacement of step-like landslides in relation to controlling factors
In the evolution of landslides, besides the geological conditions, displacement depends on the variation of the controlling factors. Due to the periodic fluctuation of the reservoir water level and the precipitation, the shape of cumulative displacement-time curves of the colluvial landslides in the Three Gorges Reservoir follows a step function. The Baijiabao landslide in the Three Gorges region was selected as a case study. By analysing the response relationship between the landslide deformation, the rainfall, the reservoir water level and the groundwater level, an extreme learning machine was proposed in order to establish the landslide displacement prediction model in relation to controlling factors. The result demonstrated that the curves of the predicted and measured values were very similar, with a correlation coefficient of 0.984. They showed a distinctive step-like deformation characteristic, which underlined the role of the influencing factors in the displacement of the landslide. In relation to controlling factors, the proposed extreme learning machine (ELM) model showed a great ability to predict the Baijiabao landslide and is thus an effective displacement prediction method for colluvial landslides with step-like deformation in the Three Gorges Reservoir region.
Landslides are widespread, frequent and sudden hazards that strike human lives, livestock, livelihood, living places and environment in an adverse manner, leading to colossal losses and damages directly or indirectly in a cumulative way. The present paper is an attempt to compile the information related to the incidents and impacts of landslides in terms of their location, time of occurrence and losses/damage caused to humans, habitations and highways. The list of landslide incidences comprises 371 socioeconomically significant events with respect to the above-said point of view. It is neither complete nor exhaustive but only a representation of the facts and figures obtained from different sources such as records of the revenue and disaster management departments in the hill states, reports of the Geological Survey of India and Department of Geology & Mines from the states, border roads organization, public works department, soil and water conservation departments, irrigation and flood control department, disaster management authorities, department of planning, statistics, space application centers, media reports and news archives, research publications from universities and research organizations etc. The National Institute of Disaster Management, in collaboration with the Geological Survey of India and the Central Statistical Organization, has been working on a national disaster statistical system for compilation of data related to landslide disasters since the year 2007 (Parkash and Nair, 2008). However, most of the data under this project has been obtained from GSI officials, media reports and state administration. But the initiative has given a good impetus to the efforts of data collection and compilation for landslides. The data reported in this paper clearly show a better trend of information on landslide disasters during the years 2007-2011 compared to that gathered before the year 2007.
The author has attempted to compile pertinent data on landslide disasters from the years 1800 to 2011 from all the above said sources of information / data. An examination of the available data indicates that 3971 people have been reported as killed in 248 fatal events out of 371 socioeconomically significant landslides over a period of about 300 years. It would be worthwhile to mention here that in the present study the information related to socioeconomically
Evaluation and prediction of displacement by specific models help in forecasting geo-hazards. Among the various available predictive tools, the Least Square Support Vector Machines (LSSVM) model optimized with a Genetic Algorithm, namely GA-LSSVM, is commonly used to empirically forecast landslide displacement due to its capability of processing non-linear complex systems. Another improved hybrid model, composed of Double Exponential Smoothing (DES) and LSSVM, considers measured displacement and precipitation time series to estimate the one-step-ahead displacement evolution of a rain-induced landslide. Here, the modelling process and accuracy of these two models are presented, and their predictive performances are evaluated by the root mean squared error (RMSE), mean absolute percentage error (MAPE), accuracy factor (AF), and correlation coefficient (R). A slowly moving landslide on a gently dipping rocky slope located in Sichuan Province of China was chosen as the case study for its deformation triggered by intense seasonal rainfall. The application results indicated that both the GA-LSSVM and DES-LSSVM models were suitable for accurately predicting the landslide displacement on the basis of precipitation and displacement observations. Furthermore, comparison results show that the DES-LSSVM model provides the better predictive accuracy, with RMSE and MAPE values of 0.059 mm and 0.004%, respectively.
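Three of the four evaluation metrics named in this abstract (RMSE, MAPE, and the correlation coefficient R) have standard textbook definitions, sketched below in Python. The accuracy factor (AF) is omitted because the abstract does not define it. The function names and the sample values are illustrative assumptions, not the authors' implementation or data.

```python
import math


def _validate(actual, predicted):
    """Both sequences must be non-empty and of equal length."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("sequences must be non-empty and of equal length")


def rmse(actual, predicted):
    """Root mean squared error, in the units of the data."""
    _validate(actual, predicted)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; undefined if any actual value is 0."""
    _validate(actual, predicted)
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    _validate(actual, predicted)
    mean_a = sum(actual) / len(actual)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_a * var_p)


# Hypothetical observed and predicted displacements (illustration only).
observed = [100.0, 200.0, 300.0]
predicted = [110.0, 180.0, 330.0]

print(rmse(observed, predicted), mape(observed, predicted), pearson_r(observed, predicted))
```

RMSE is scale-dependent while MAPE is relative, which is why studies often report both alongside R.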
Prediction of landslide movements with practical application for landslide risk mitigation is a challenge for scientists. This study presents a methodology for prediction of landslide movements using random forests, a machine learning algorithm based on regression trees. The prediction method was established based on a time series consisting of 2 years of data on landslide movement, groundwater level, and precipitation gathered from the Kostanjek landslide monitoring system and nearby meteorological stations in Zagreb (Croatia). Because of the complex relations between precipitation and groundwater levels, the process of landslide movement prediction is divided into two separate models: (1) a model for prediction of groundwater levels from precipitation data and (2) a model for prediction of landslide movements from groundwater level data. In the groundwater level prediction model, 75 parameters, calculated from precipitation and evapotranspiration data, were used as predictors. In the landslide movement prediction model, 10 parameters calculated from groundwater level data were used as predictors. Model validation was performed through the prediction of groundwater levels and prediction of landslide movements for periods from 10 to 90 days. The validation results show the capability of the model to predict the evolution of daily displacements, from predicted variations of groundwater levels, for periods of up to 30 days. Practical contributions of the developed method include the possibility of automated predictions, updated and improved on a daily basis, which would be an important source of information for decisions related to crisis management in the case of risky landslide movements.
Accurate prediction of landslide displacement is challenging and of great interest to governments and researchers. In order to reduce the risk involved in selecting the types of influencing factors and artificial neural networks (ANNs), a multiple-ANN switched prediction method is proposed for landslide displacement forecasting. In the first stage, a set of individual neural networks is developed based on different environmental factors and/or different training algorithms. In the second stage, a switched prediction method is used to select the appropriate individual neural network for prediction. For verification and testing, three typical landslides in the Three Gorges Reservoir, namely the Baishuihe, Bazimen and Shiliushubao landslides, are used to test the effectiveness of the method. Application results demonstrate that the proposed method can significantly improve model generalization and perform similarly to, or better than, the best individual ANN predictor.