All content in this area was uploaded by K. Nagamani on Jan 27, 2020
ARIMA Model Perfect Fit – A Case Study with
Kanchipuram District Rainfall Data Set
A. Amuthini Sambhavi¹, *K. Nagamani² and E. Priyadarshini³
¹Research Scholar, Sathyabama Institute of Science and Technology, Chennai.
²Scientist-C, Centre for Remote Sensing and Geo-informatics, Sathyabama Institute of Science and Technology, Chennai.
³Associate Professor, Department of Mathematics, Sathyabama Institute of Science and Technology, Chennai.
*Corresponding Author: nagamaniloganathan@gmail.com
Abstract
This paper forecasts the rainfall of Kanchipuram district from a data set spanning more
than a hundred years, from 1901 to 2002 (CGWB), using the ARIMA (Auto-Regressive
Integrated Moving Average) model. For this case study, an ARIMA (2,0,2)(1,0,1) model was
developed. Autocorrelation and partial autocorrelation analysis supported the choice of
model, and standard statistical techniques were used to check its validity. The rainfall of
Kanchipuram was forecast for the two subsequent years, that is, until 2004, with this
ARIMA model, and the MAPE (Mean Absolute Percentage Error) of the forecast values was
estimated. This traditional time series ARIMA model thus meets the requirements of
decision-makers, researchers and water resources departments concerned with the
construction of artificial recharge structures to manage and conserve water. The model
also serves applications in hydrology and environmental management.
Keywords: ARIMA, ACF, PACF, MAPE, forecast, rainfall
1. Introduction
Rainfall occurs when tiny droplets of water condensed from atmospheric water vapour
become heavy enough to fall under gravity. Rain, which sustains many ecosystems and
supplies water for
Journal of Information and Computational Science
Volume 9 Issue 12 - 2019
ISSN: 1548-7741
www.joics.org1064
hydroelectric power plants and crop cultivation, is a major part of the water cycle. Rainfall
is measured with a rain gauge, and the amount of rainfall can be estimated by weather radar.
The major cause of rainfall is moisture moving along three-dimensional zones of
temperature and moisture contrasts known as weather fronts. Raindrops range in size from
0.1 to 9 mm, and the larger drops tend to break up. Raindrops impact at their terminal
velocity. Global warming is also causing changes in the precipitation pattern globally,
including wetter conditions across eastern North America and drier conditions in the tropics.
In hot and dry climatic regions, precipitation falling from the clouds can evaporate or
sublime before reaching the ground; this is called virga. The slow ascent of air in synoptic
systems results in stratiform, or dynamic, precipitation. Convective clouds produce
convective rain. Orographic precipitation results when a large-scale flow of moist air rises
across a mountain ridge, leading to adiabatic cooling and condensation. In Hawaii, Mount
Waiʻaleʻale has an extreme rainfall of 9,500 mm. Human activities such as car exhaust and
other pollution produce cloud condensation nuclei, which increases the likelihood of
rainfall. A rise in temperature increases the rate of evaporation, leading to an increase in
precipitation. Cherrapunji, near Shillong, India, is one of the wettest places, with an average
annual rainfall of 11,430 mm.
In 2019, Terence C. Mills published his overview of the ARIMA model. Amran. A, Assis. K,
& Remali. Y published their research on forecasting cocoa bean prices using univariate
time series models. “Time series analysis: Forecasting and control” (San Francisco:
Holden-Day) was published by Box, G.E.P. and G.M. Jenkins (1970).
Forecasting of the short-term metro passenger flow with empirical mode decomposition and
neural networks was done by Chen. M.C & Wei. Y in 2012. Application of seasonal models
in modeling and forecasting the monthly price of Privileged Sadri rice in Guilan province was
done by Dinpanah. G, H. Alipoor, Ansari M.H, & Pargami P.A. Satellite rainfall estimation
techniques using visible and infrared imagery were studied by D’souza. G, Barrett. E.C,
Power. C.H (1990). Forecasting of coarse rice prices in Bangladesh was taken up for research
by Islam. M.A, Hassan. M.F, Imam. M.F, & Sayem. S.M. ARIMA modeling and forecasting
of three types of pulse prices in Bangladesh was done by Md Zakir Hossain, Quazi Abdus
Samad, Md Zulficar Ali (2006). Forecasting of the future values of mutual funds, foreign
exchange rates, gold rates and crude oil rates using ARIMA modeling was done by
Priyadarshini. E and Chandrababu. A. S.L. Ho and M. Xie (1998) published their research on
the use of ARIMA models for reliability forecasting and analysis of mechanical system failure.
2. ARIMA Methodology
ARIMA models are the most general class of models for forecasting a time series.
Notable special cases of ARIMA models are random-walk and random-trend models,
autoregressive models, and exponential smoothing models (i.e., exponentially weighted
moving averages).
A non-seasonal ARIMA model is classified as an "ARIMA (p,d,q)" model, where:
p - number of autoregressive terms,
d - number of non-seasonal differences, and
q - number of moving average terms.
An autoregressive model of order p is given by
$$Y_t = b_0 + b_1 Y_{t-1} + b_2 Y_{t-2} + \dots + b_p Y_{t-p} + e_t \qquad (1)$$
where $Y_t$ is the forecast variable, $Y_{t-1}, \dots, Y_{t-p}$ are the explanatory variables and $e_t$ is the error term.
Because the explanatory variables are time-lagged values of the forecast variable itself,
equation (1) is called an autoregression model. In a moving average model, the past errors
are used as the explanatory variables.
A moving average model of order q is represented as follows:
$$Y_t = b_0 + b_1 e_{t-1} + b_2 e_{t-2} + \dots + b_q e_{t-q} + e_t \qquad (2)$$
Similarly, a seasonal model can be represented as ARIMA (p,d,q) (P,D,Q), where P, D and Q
are the seasonal counterparts of p, d and q. The three steps involved in building this model
are model identification, parameter estimation and diagnostic checking.
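Equations (1) and (2) can be illustrated with a minimal simulation sketch. The function names, the coefficient values and the zero start-up values below are assumptions chosen only for illustration; they are not taken from the case study.

```python
import random

def simulate_ar(b0, b, n, sigma=1.0, seed=0):
    """Simulate Y_t = b0 + b1*Y_{t-1} + ... + bp*Y_{t-p} + e_t (equation 1)."""
    rng = random.Random(seed)
    p = len(b)
    y = [0.0] * p  # zero start-up values: an arbitrary illustrative choice
    for _ in range(n):
        e_t = rng.gauss(0.0, sigma)
        lagged = sum(b[i] * y[-1 - i] for i in range(p))
        y.append(b0 + lagged + e_t)
    return y[p:]

def simulate_ma(b0, b, n, sigma=1.0, seed=0):
    """Simulate Y_t = b0 + b1*e_{t-1} + ... + bq*e_{t-q} + e_t (equation 2)."""
    rng = random.Random(seed)
    q = len(b)
    e = [rng.gauss(0.0, sigma) for _ in range(n + q)]
    return [b0 + sum(b[i] * e[t - 1 - i] for i in range(q)) + e[t]
            for t in range(q, n + q)]

# Hypothetical coefficients, chosen only to give a stable example.
ar_series = simulate_ar(b0=10.0, b=[0.6, -0.2], n=500)
ma_series = simulate_ma(b0=10.0, b=[0.5, 0.3], n=500)
print(len(ar_series), len(ma_series))
```

In the AR series each value depends on earlier values of the series itself, whereas in the MA series it depends only on the current and past error terms.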
Step 1 - Model Identification
The variables to be forecast are first transformed into time-series format, which is an
important first step in ARIMA modeling. For a stationary series, the values vary over time
only around a constant mean and with constant variance. Generally, the time plot of the
data set is studied to check stationarity. Non-stationarity in the mean is corrected by
differencing the data. In this case study the series was stationary in the mean, so the order
of differencing was zero.
After examining various ARIMA models, the model with the minimum BIC (Bayesian
Information Criterion) was selected, guided by the autocorrelation function (ACF) and the
partial autocorrelation function (PACF). The refined order was identified from the ACF and
PACF computed at various orders.
This step ends with the identification of the values of p and q.
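The ACF and PACF used in the identification step can be computed as in the following minimal sketch. It uses the standard sample autocorrelation and the Durbin-Levinson recursion; the synthetic series with a 12-step cycle is a hypothetical stand-in for the rainfall data, which are not reproduced here.

```python
import math
import random

def sample_acf(x, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag of a series."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

def sample_pacf(x, max_lag):
    """Partial autocorrelations via the Durbin-Levinson recursion."""
    r = sample_acf(x, max_lag)
    pacf = [1.0]
    phi_prev = []  # phi_{k-1,1} .. phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi
    return pacf

# Hypothetical series: a 12-step cycle plus noise (not the Kanchipuram data).
rng = random.Random(1)
series = [math.sin(2 * math.pi * t / 12) + rng.gauss(0.0, 0.3) for t in range(240)]
acf = sample_acf(series, 16)
pacf = sample_pacf(series, 16)
band = 1.96 / math.sqrt(len(series))  # approximate 95% band for white noise
significant_lags = [k for k in range(1, 17) if abs(acf[k]) > band]
print(significant_lags)
```

Lags at which the ACF or PACF leaves the approximate band are the usual candidates when choosing p and q; a spike at the seasonal lag points to seasonal terms.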
Fig.1
Step 2 - Model estimation and verification
SPSS software was used to estimate the parameters of the ARIMA model. Model
verification consists of examining the autocorrelations and partial autocorrelations of the
residuals at different orders and noting whether the residuals show any systematic pattern
that could still be removed to make the ARIMA model a proper fit. For this purpose, the
residual autocorrelations up to 16 lags were computed along with their significance, which
is tested by the Box-Ljung test. The output indicates that none of the correlations differs
from zero at a reasonable significance level, so the residuals show no remaining pattern.
This supports the adequacy of this ARIMA model, and the corresponding ACF and PACF of
the residuals indicate a good fit of the model.
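The Box-Ljung statistic for the residual check can be sketched as follows. The residuals here are simulated white noise, since the model residuals are not listed in the paper, and the critical value 26.296 is the standard 95% chi-square value for 16 degrees of freedom; in practice the degrees of freedom are reduced by the number of estimated AR and MA parameters, so the threshold is an assumption for illustration.

```python
import random

def ljung_box_q(residuals, max_lag):
    """Q = n(n+2) * sum_{k=1..h} r_k^2 / (n-k) for residual autocorrelations r_k."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [v - mean for v in residuals]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total

# Simulated residuals (hypothetical; not the residuals of the fitted model).
rng = random.Random(42)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]

CHI2_95_DF16 = 26.296  # 95% chi-square critical value, 16 degrees of freedom
q_stat = ljung_box_q(white_noise, max_lag=16)
print("Q =", round(q_stat, 3), "exceeds critical value:", q_stat > CHI2_95_DF16)
```

A Q value below the critical value means the hypothesis of uncorrelated residuals is not rejected, which is the outcome described above for the fitted model.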
Rainfall forecasting using the ARIMA model
Table 1 describes the best-fit ARIMA model, which is the key step towards computing the
Kanchipuram rainfall forecast with this traditional time series ARIMA modeling. The model
fit is shown in Table 2, and the model parameters are given in Table 3.
Table 1: Model Description
Model ID   Series               Model Type
Model_1    Amount of rainfall   ARIMA(2,0,2)(1,0,1)
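For reference, a model of this order can be written out with the backshift operator $B$ (where $B Y_t = Y_{t-1}$). This is a sketch under stated assumptions: the symbols $\phi$, $\Phi$, $\theta$, $\Theta$ and $\mu$ are notation introduced here, the seasonal period $s = 12$ is assumed because the data run from January to December, and the minus signs on the moving average terms follow the Box-Jenkins convention, which may differ from the sign convention of a given software package.

```latex
(1 - \phi_1 B - \phi_2 B^2)\,(1 - \Phi_1 B^{s})\,(Y_t - \mu)
  = (1 - \theta_1 B - \theta_2 B^2)\,(1 - \Theta_1 B^{s})\, e_t ,
\qquad s = 12 \ \text{(assumed)}
```

Here $\phi_1, \phi_2$ and $\theta_1, \theta_2$ are the non-seasonal AR and MA coefficients (p = 2, q = 2), $\Phi_1$ and $\Theta_1$ are the seasonal AR and MA coefficients (P = 1, Q = 1), and no differencing is applied (d = 0, D = 0).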
Table 3: ARIMA Model Parameters
(Model: Amount of rainfall-Model_1; series: Amount of rainfall; No Transformation)
Parameter      Lag     Estimate   SE      t          Sig.
Constant               99.016     9.747   10.159     .000
AR             Lag 1   1.798      .030    59.284     .000
AR             Lag 2   -.798      .034    -23.756    .000
MA             Lag 1   1.377      .021    65.953     .000
MA             Lag 2   -.378      .026    -14.463    .000
AR, Seasonal   Lag 1   1.000      .000    5424.560   .000
MA, Seasonal   Lag 1   .991       .009    104.815    .000
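One quick check that can be run on the tabulated estimates is to compute the roots of the second-order lag polynomials. The sketch below assumes the convention 1 - c1*z - c2*z^2 for both the AR and MA parts (the software's sign convention is not stated in the paper) and works only from the rounded values in Table 3, so it is an arithmetic illustration rather than a re-estimation.

```python
import cmath

def lag_polynomial_roots(c1, c2):
    """Roots z of 1 - c1*z - c2*z**2 = 0 for a second-order lag polynomial."""
    # Rewrite as c2*z**2 + c1*z - 1 = 0 and apply the quadratic formula.
    disc = cmath.sqrt(c1 * c1 + 4.0 * c2)
    return ((-c1 + disc) / (2.0 * c2), (-c1 - disc) / (2.0 * c2))

# Rounded estimates copied from Table 3.
ar_roots = lag_polynomial_roots(1.798, -0.798)
ma_roots = lag_polynomial_roots(1.377, -0.378)

ar_moduli = sorted(abs(z) for z in ar_roots)
ma_moduli = sorted(abs(z) for z in ma_roots)
print("AR root moduli:", [round(m, 4) for m in ar_moduli])
print("MA root moduli:", [round(m, 4) for m in ma_moduli])
```

Under this convention, stationarity and invertibility require all roots to lie outside the unit circle. Roots with modulus very close to 1 in the rounded estimates would suggest examining the differencing orders more closely.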
Table 2: Model Fit
Fit Statistic         Mean     SE  Min      Max      Percentile
                                                     5        10       25       50       75       90       95
Stationary R-squared  .9409    .   .9409    .9409    .9409    .9409    .9409    .9409    .9409    .9409    .9409
R-squared             .9409    .   .9409    .9409    .9409    .9409    .9409    .9409    .9409    .9409    .9409
RMSE                  86.865   .   86.865   86.865   86.865   86.865   86.865   86.865   86.865   86.865   86.865
MAPE                  6.650    .   6.650    6.650    6.650    6.650    6.650    6.650    6.650    6.650    6.650
MaxAPE                63. 25   .   63. 25   63. 25   63. 25   63. 25   63. 25   63. 25   63. 25   63. 25   63. 25
MAE                   55.851   .   55.851   55.851   55.851   55.851   55.851   55.851   55.851   55.851   55.851
MaxAE                 455.675  .   455.675  455.675  455.675  455.675  455.675  455.675  455.675  455.675  455.675
Normalized BIC        8.969    .   8.969    8.969    8.969    8.969    8.969    8.969    8.969    8.969    8.969
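The Normalized BIC can be related to the RMSE in the same table with a short calculation. This sketch assumes the definition ln(MSE) + k*ln(n)/n, with k = 7 estimated parameters (the constant plus the six coefficients in Table 3) and n = 1224 observations (one value per month for 1901 to 2002); both k and n are assumptions, not values stated with the table.

```python
import math

def normalized_bic(rmse, n_params, n_obs):
    """Normalized BIC under the assumed definition ln(MSE) + k*ln(n)/n."""
    mse = rmse ** 2
    return math.log(mse) + n_params * math.log(n_obs) / n_obs

# RMSE from Table 2; k and n are assumptions described in the text above.
bic = normalized_bic(rmse=86.865, n_params=7, n_obs=1224)
print(round(bic, 3))
```

With these assumed inputs the result lands close to the tabulated value, which illustrates how the criterion penalises the error variance by the number of parameters when candidate models are compared.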
Fig.2
ARIMA Model Fit For Rainfall Forecast
The estimated parameters of the ARIMA (2,0,2)(1,0,1) model are given in Table 3. The
rainfall forecasting for Kanchipuram district with the ARIMA model was carried out with
SPSS software. Rain-model_1 is shown in Fig.1. The Normal P-P plot of the amount of
rainfall, obtained by plotting expected cumulative probability against observed cumulative
probability, is shown in Fig.2, the periodogram of rainfall by frequency in Fig.3 and the
spectral density of rainfall by frequency in Fig.4.
Fig.3
Fig.4
Fig.5
To assess the rainfall forecasting ability of this ARIMA model, important measures of the
sample period forecasts were calculated. The mean absolute percentage error (MAPE) of
the Kanchipuram rainfall forecast was calculated as 6.650. This value of MAPE indicates
low forecasting error.
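The error measures reported for the model can be computed as in the following minimal sketch. The observed and predicted values are hypothetical numbers used only to show the arithmetic, and skipping zero observations in the MAPE is a design choice made here because the percentage error is undefined for them.

```python
import math

def forecast_errors(observed, predicted):
    """Return MAE, RMSE and MAPE (in percent) for paired observations."""
    errors = [o - p for o, p in zip(observed, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Percentage error is undefined for zero observations, so they are skipped.
    pct = [abs(e) / abs(o) for e, o in zip(errors, observed) if o != 0]
    mape = 100.0 * sum(pct) / len(pct)
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape}

# Hypothetical rainfall values in mm (not taken from the data set).
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
metrics = forecast_errors(observed, predicted)
print(metrics)
```

MAE and RMSE are in the units of the series, while MAPE is scale-free, which is why it is convenient for comparing forecasts across series of different magnitude.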
Conclusion
1200 rainfall data sets of Kanchipuram, from January 1901 to December 2002 (CGWB), were
taken, and the rainfall forecast was made until 2004, i.e. for two additional years, with the
ARIMA (2,0,2)(1,0,1) model. The error of the rainfall forecast has been estimated and
tabulated. The MAE (Mean Absolute Error) was 55.851, the RMSE (Root Mean Square
Error) was 86.865 and the MAPE (Mean Absolute Percentage Error) was 6.650. Since the
rainfall forecasting error is low, this ARIMA model can be regarded as the best-fit model for
the given time series. The model also helps meteorologists, researchers and decision-makers
to give an accurate prediction of present and future rainfall. Any time series with a
changing pattern could be predicted with this model, provided it satisfies the only
limitation of being a very long time series.
References
1. Amran. A, Assis. K, & Remali. Y, Forecasting cocoa bean prices using univariate
time series models, Journal of Arts Science and Commerce, 1 (2010), 71-80.
2. Alaguraja Palanichamy, Assessment of Rainfall and Groundwater for Agriculture of
Tiruchirappalli District, Tamil Nadu, using Geospatial Technology, International
Journal of Latest Technology in Engineering, Management & Applied Science, Vol. V,
Issue VIII, August 2016, pp. 40-52.
3. Alaguraja Palanichamy, Weekly Analysis of Rainfall for Agriculture Planning in
Tiruchirappalli District, Tamil Nadu Using GIS, GE-International Journal of
Engineering Research, Vol. 4, Issue 9, September 2016, pp. 40-58.
4. Alaguraja Palanichamy, Rainfall Rhythm in Tiruchirappalli District, Tamil Nadu - A
GIS Approach, International Research Journal of Natural and Applied Sciences,
Vol. 3, Issue 9, September 2016, pp. 203-223.
5. Alaguraja. P, Manivel. M, Nagarathinam. S.R, Sakthivel. R and Yuvaraj. D (2010),
Rainfall Distribution Study in Coimbatore District Tamil Nadu Using GIS, Recent
Trends in Water Research: Remote Sensing and General Perspectives, I.K International
Publication, New Delhi, pp. 92-115.
6. Box, G.E.P. and G.M. Jenkins (1970). “Time series analysis: Forecasting and
control”, San Francisco: Holden-Day.
7. Chen. M.C & Wei. Y, Forecasting the short-term metro passenger flow with empirical
mode decomposition and neural networks, Transportation Research Part C, 21
(2012), 148-162.
8. Dinpanah. G, H. Alipoor, Ansari M.H, & Pargami P.A, Application of seasonal
models in modeling and forecasting the monthly price of Privileged Sadri rice in
Guilan province, ARPN Journal of Agricultural and Biological Science, 8 (2013),
283-290.
9. D’souza. G, Barrett. E.C, Power. C.H (1990), “Satellite rainfall estimation techniques
using visible and infrared imagery”, Remote Sensing Reviews, 4:2, 379-414.
10. Islam. M.A, Hassan. M.F, Imam. M.F, & Sayem. S.M, Forecasting coarse rice prices
in Bangladesh, Progress Agriculture, 22 (2011), 193-201.
11. Md Zakir Hossain, Quazi Abdus Samad, Md Zulficar Ali (2006), “ARIMA Model and
Forecasting with Three Types Of Pulse Prices In Bangladesh: A Case Study”,
International Journal of Social Economics Vol. 33 No. 4, pp 344-353.
12. Priyadarshini. E and Chandrababu. A (2010), " Forecasting Of Indian Mutual
Funds Using ARIMA Models", Proceedings of the National Conference on Recent
Advances in Statistics and Computer Applications, Bharathiyar University,
Coimbatore, pp.150-155.
13. Priyadarshini. E and Chandrababu. A (2011), "Modeling And Forecasting Of
Foreign Exchange Rates Using ARIMA Models", Proceedings of the National
Conference on Recent Developments in Mathematics and its Applications at, SRM
University, Chennai, Excel India Publishers, pp 379 -383.
14. Priyadarshini. E and Chandrababu. A (2011), "ARIMA Model For Forecasting Gold
Rates", Proceedings of the International Conference on Stochastic Modeling and
Simulation (ICSMS2011), Vel Tech Dr.RR & Dr.SR Technical University, Chennai,
pp. 233-236.
15. Priyadarshini. E and Chandrababu. A (2011), "Forecasting of Crude Oil Rates
using ARIMA Models", International Journal of Statistics and Systems(IJSS),
Research India Publications, ISSN: 0973-2675, Vol. 6, Number 3, pp. 287-293.
16. Priyadarshini. E (2014), "A Comparative Analysis Of Prediction Using Neural
Network and Auto-Regressive Integrated Moving", ARPN Journal of Engineering and
Applied Sciences, Vol. 10, No. 7, pp. 3078-3081, April 2015, ISSN: 1819-6608.
17. Rahul Amin. M. D and Razzaque, M. A (2000), “Autoregressive Integrated Moving
Average (ARIMA) Modelling for Monthly Potato Prices in Bangladesh”, Journal of
Financial Management and Analysis, Vol.13, No.1 (Jan –Jun), pp. 74-80.
18. Saeed Moshiri and Faezeh Foroutan (2006), “Forecasting Nonlinear Crude Oil
Futures Prices”, The Energy Journal, Vol. 27, No. 4. pp .81-95.
19. Radina P. Soebiyanto and Richard Kiang (2014), Meteorological parameters as
predictors for seasonal influenza, Geocarto International, Vol. 29, No. 1, pp. 39–47.
20. Uma Maheswari. V, Alaguraja. P and Yuvaraj. D, Rainfall Distribution Study in
Pudhukottai District, Tamilnadu, International Journal of Research and Analytical
Reviews, September 2018, Volume 5, Issue 3, ISSN NO: 2279-543X,
UGC Approved No: 43602, Impact Factor – 5.75, pp. 99-104.