Statistical downscaling of GCM outputs using wavelet based model

STATISTICAL DOWNSCALING OF GCM OUTPUTS USING
WAVELET BASED MODEL
Submitted By:
ANCHIT LAKHANPAL
2013CEW2201
Department Of Civil Engineering
Submitted in fulfillment of the requirements for the degree of
MASTER OF TECHNOLOGY
In
WATER RESOURCES ENGINEERING
Under the Guidance of
Prof. R. KHOSA & Dr. R. MAHESWARAN
to the
DEPARTMENT OF CIVIL ENGINEERING
INDIAN INSTITUTE OF TECHNOLOGY, DELHI
June, 2015
CERTIFICATE
This is to certify that the work being submitted by Mr. ANCHIT LAKHANPAL
in the report titled ‘STATISTICAL DOWNSCALING OF GCM OUTPUTS USING
WAVELET BASED MODEL’, in partial fulfillment of the requirements for the award of the degree
of Master of Technology in Water Resources Engineering, is an authentic record of bona fide work
carried out by him from July 2014 to June 2015 at the Department of Civil Engineering, Indian
Institute of Technology Delhi, under our supervision and guidance.
It is certified that this report is an original work and has not been submitted in part or in full to
any other institute or university.
Dr. R Maheswaran
Inspire Faculty,
Department of Civil Engineering,
Indian Institute of Technology, Delhi
Dr. Rakesh Khosa
Professor
Department of Civil Engineering,
Indian Institute of Technology, Delhi
ACKNOWLEDGEMENT
First and foremost, praise and thanks to God, the Almighty, for His showers of blessings
throughout my research work, which allowed me to complete it successfully within the stipulated time.
I would like to express my deep gratitude to my esteemed project guides, Dr. RAKESH KHOSA
(Professor) and Dr. R MAHESWARAN (Inspire Faculty), Department of Civil Engineering,
Indian Institute of Technology, Delhi, for giving me the opportunity to work under their
supervision and guidance. They have always been my motivation for carrying out the project. Their
constant encouragement at every step was a precious asset to me during my work.
I express my heartiest thanks to the faculty members of the Water Resources Engineering branch,
IIT Delhi, for their valuable suggestions, which helped bring this work to fruition.
I express my deep appreciation and sincere thanks to Vinit Sehgal (Project Associate) for
providing every possible support, guidance, help and encouragement during my project work.
I also extend my thanks to all the research scholars, especially Raktim Haldar, Himansgu Tyagi
and Vilakshna Parmar, and to my fellow classmates, especially Ankit Agarwal and Durga Prasad Panday,
for their immense cooperation and help whenever I needed it.
I wish to give special thanks to Siddharth Chaudhary for his direct and indirect contribution to
developing the project.
I am grateful to the staff members of the Department of Civil Engineering, Indian Institute of
Technology, especially Mr. Rajveer Agarwal, Mr. Neeraj Gehlot and Mr. Tikaram, for providing me
with all the facilities required for the project work.
A work of this nature could never have been attempted without reference to, and inspiration
from, the works of others, whose details are given in the reference section. I acknowledge my
indebtedness to all of them.
Last but not least, I am greatly indebted to my parents, Mr. Rakesh Sharma and Mrs. Suman
Sharma, who brought me up to this position, inspiring and supporting my pursuits, and who are always
my guides in stepping ahead. I also thank all my family members, whose contribution to whatever
I have achieved to date is beyond expression.
I would like to thank everybody who was important to the successful realization of my thesis, and
I apologize that I could not mention them all individually.
JUNE, 2015 ANCHIT LAKHANPAL
ABSTRACT
India being an agro-economy, water resources play a vital role in shaping policy and
economic development. Knowledge of future water resources availability is therefore of great
concern, because that availability is highly variable. Prior
knowledge of these vulnerable resources informs decision making and execution
processes. This has created a need for studies of the factors that influence water
resources availability.
This study attempts to deliver such information about precipitation and temperature for
the Krishna basin of India up to the year 2035. At the local or regional level this information can be used
for purposes such as rainfall-runoff analysis and hydrological model development.
The regional information has been derived from a General Circulation Model (GCM), which
is a climatic model. These models attempt to capture climate dynamics across the globe but
fail to represent some key attributes and provide climatic information only at low resolution.
Information at such coarse resolution cannot be used directly at the regional level and requires spatial
downscaling. Downscaling is a technique that bridges the mismatch between the
coarser and the finer resolution and translates global climatic information to a predefined local level.
This study uses wavelets, which form the most significant part of the thesis work.
Wavelets help capture underlying information that is often
overlooked by conventional techniques. In addition, other methods
such as Multi-scale Wavelet Entropy (MWE), K-means clustering and Principal Component Analysis (PCA)
have been explored and used to reach the end results.
A critical review of GCMs, which claim to capture reality, forms part of this work.
Their development has many shortcomings, which are addressed in detail.
The entire downscaling methodology used in this study has been programmed as a MATLAB
toolbox. The toolbox is robust, has been tested for other regions as well, and will soon
be available online.
TABLE OF CONTENTS
CERTIFICATE
ACKNOWLEDGEMENT
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS
1. INTRODUCTION
1.1 General
1.2 Study Objectives
1.3 Significance of the study
1.4 Research Questions
1.5 Structure of the Report
2. LITERATURE REVIEW
2.1 Models Used
2.2 Downscaling techniques
2.2.1 Statistical downscaling
2.2.2 Dynamical downscaling
2.3 Selection of GCM
2.4 Selection of Predictors
2.5 Wavelets
2.6 Assumptions involved in downscaling
3. REALITY OF CLIMATE VARIABILITY AND GCMs
3.1 Climate and Climate Change Revisited
3.2 General misapprehensions
3.3 Past studies contributions that questions the present specious definition of climate
3.3.1 Carbon Cycle
3.3.2 Warmer periods found in the earlier interglacial periods
3.3.3 Geomagnetism
3.3.4 Changes in the orientation of the earth axis of rotation and earth’s orbit
3.3.5 Salinity levels in the oceans
3.3.6 Political undertone
3.4 Supporting evidences
3.4.1 Great Barrier Reef Submergence
3.4.2 Dwarka Submergence
3.4.3 Strait of Bosporus
3.4.4 England was connected with rest of the Europe
3.4.5 Nomenclature of Greenland
3.5 Climatic Models
3.5.1 GCMs from alternate point of view
4. STUDY AREA AND DATA DESCRIPTION
4.1 Study Area
4.2 Data Description
5. METHODOLOGY
5.1 Continuous Wavelet transform and Multi-scale Wavelet Entropy
5.2 Discrete Wavelet Transform
5.3 K means clustering
5.4 Principal Component Analysis
5.5 Models used in the study
5.5.1 Multiple Linear Regression (MLR)
5.5.2 Second order Volterra (SoV) Model
5.5.3 Artificial Neural Networks (ANN)
5.6 Calibration and Validation of the models
5.7 Performance indices
5.7.1 Normalized Root Mean Square Error (NRMSE)
5.7.2 Normalized Mean Absolute Error (NMAE)
5.7.3 Coefficient of Correlation (CC)
5.7.4 Accuracy
6. MODEL ARCHITECTURE
6.1 Standardization
6.2 Interpolation of GCM grid points to NCEP
6.3 Variable averaging
6.4 Multi-scale wavelet entropy
6.5 K-means clustering and PCA
6.5.1 Davies-Bouldin (DB) Index
6.5.2 Dunn’s Index
6.6 Multi-resolution modeling using DWT
6.7 Development of wavelet based hybrid models
7. RESULTS AND DISCUSSIONS
7.1 Performance Evaluation of the Models
7.2 Discrepancy Ratio
7.3 Line plots
7.4 Downscaled Future Scenario
7.5 Statistical Analysis
8. SUMMARY AND CONCLUSION
REFERENCES
LIST OF FIGURES
Figure 2.1 Downscaling of global scale to local scale for Krishna Basin
Figure 3.1 Long Term Carbon cycle
Figure 3.2 Temperature change (blue) and carbon dioxide change (red) observed in ice core records by NOAA
Figure 3.3 Past temperature records for the four inter-glacial preceding Holocene period
Figure 3.4 Strait of Bosporus
Figure 4.1 Basin map of Krishna River (India-WRIS, 2015)
Figure 4.2 Future scenarios developed in AR5 report
Figure 4.3 Downscaling points for the precipitation on Krishna Basin
Figure 4.4 Downscaling points for the temperature on Krishna Basin
Figure 5.1 (a) Synthetic time series (b) Wavelet spectrum and (c) Multi-Scale entropy for synthetic time series
Figure 5.2 Wavelet Decomposition
Figure 6.1 Flowchart for the proposed downscaling framework
Figure 6.2 Interpolated GCM points location
Figure 6.3 Variations in Air Temperature at the vertices of the grid enclosing Station A
Figure 6.4 Variable averaging scheme
Figure 6.5 Value of DB and Dunn’s indices corresponding to each number of clusters
Figure 6.6 Detailed flowchart of input formation to the models
Figure 6.7 (a) Wavelet power spectrum of observed precipitation (b) Wavelet power spectrum of observed temperature
Figure 6.8 Number of NCEP variables with respective significant periods
Figure 6.9 Wavelet decomposition of TS10 (1) and PSL (6) upto three resolution levels
Figure 6.10 Nonlinear kernels for modeling D2 component for station B for Precipitation
Figure 6.11 Nonlinear kernels for modeling D2 component for station B for Temperature
Figure 7.1 DR Statistics of various models under study for the mean monthly precipitation
Figure 7.2 DR statistics of various models under study for the mean monthly temperature
Figure 7.3 Line plot of observed mean monthly precipitation and model simulations at Station A
Figure 7.4 Line plot of observed mean monthly temperature and model simulations at Station A
Figure 7.5 Line plot of observed mean monthly precipitation and model simulations at Station B
Figure 7.6 Line plot of observed mean monthly temperature and model simulations at Station B
Figure 7.7 Line plot of observed mean monthly precipitation and model simulations at Station C
Figure 7.8 Line plot of observed mean monthly temperature and model simulations at Station C
Figure 7.9 Line plot of observed mean monthly precipitation and model simulations at Station D
Figure 7.10 Line plot of observed mean monthly temperature and model simulations at Station D
Figure 7.11 Line plot of observed mean monthly precipitation and model simulations at Station E
Figure 7.12 Line plot of observed mean monthly temperature and model simulations at Station E
Figure 7.13 Future downscaled mean monthly precipitation and temperature for all the Stations
Figure 7.14 Mean of the precipitation time series of IMD (calibration and validation period) and downscaled future data for all the stations
Figure 7.15 Maximum of the precipitation time series of IMD (calibration and validation period) and downscaled future data for all the stations
Figure 7.16 Box plot for observed v/s simulated precipitation for June-October of the validation period
Figure 7.17 Cumulative distribution function plots of observed v/s simulated precipitation for validation period
Figure 7.18 Mean of the temperature time series of IMD (calibration and validation period) and downscaled future data for all the stations
Figure 7.19 Maximum of the precipitation time series of IMD (calibration and validation period) and downscaled future data for all the stations
Figure 7.20 Scatter plots of observed v/s downscaled temperature for validation period
LIST OF TABLES
Table 4.1 List of potential atmospheric predictors considered in the study
Table 4.2 Location of downscaling points for the precipitation
Table 4.3 Location of downscaling points for the temperature
Table 6.1 Optimum number of clusters formed at each downscaling location
Table 6.2 Combination of linear and non-linear input variables for each sub-time series model of all five stations for Precipitation
Table 6.3 Combination of linear and non-linear input variables for each sub-time series model of all five stations for Temperature
Table 7.1 Performance evaluation of the proposed model for the mean monthly precipitation
Table 7.2 Performance evaluation of the proposed model for the mean monthly temperature
Table 7.3 Descriptive statistics of observed mean monthly IMD precipitation (mm) for calibration and validation period
Table 7.4 Descriptive statistics of observed mean monthly downscaled precipitation (mm) for future
Table 7.5 Descriptive statistics of observed mean monthly IMD temperature (°C) for calibration and validation period
Table 7.6 Descriptive statistics of observed mean monthly downscaled temperature (°C) for future
LIST OF ABBREVIATIONS
ANN: Artificial Neural Network
AR5: Fifth Assessment Report
CanCM4: Canadian Centre for Climate Modelling and Analysis
CC: Coefficient of Correlation
CGCM1: Canadian Coupled GCM
CWT: Continuous Wavelet Transform
DB: Davies-Bouldin Index
DWC’s: Discrete Wavelet Components
DR: Discrepancy Ratio
DWT: Discrete Wavelet Transform
GAMLSS: Generalized Additive Models for Location, Scale and Shape
GCM: General Circulation Model
GHGs: Greenhouse Gases
HadCM3: Hadley Centre Climate Model
IMD: Indian Meteorological Data
KNN: K-Nearest Neighbor
MISO: Multiple Input Single Output
MLR: Multiple Linear Regression
MWE: Multi-scale Wavelet Entropy
NCEP/NCAR: National Centers for Environmental Prediction / National Center for Atmospheric Research
NMAE: Normalized Mean Absolute Error
NOAA: National Oceanic and Atmospheric Administration
NRMSE: Normalized Root Mean Square Error
OLS-ERR: Orthogonal Least Squares - Error Reduction Ratio
P-ANN: PCA ANN Hybrid Model
PCA: Principal Component Analysis
PDF: Probability Density Function
P-MLR: PCA MLR Hybrid Model
PSL: Sea Level Pressure
RCM: Regional Circulation Model
RCP’s: Representative Concentration Pathways
RVM: Relevance Vector Machine
SoV: Second order Volterra
SDSM: Statistical Downscaling Model
SSVM: Smooth Support Vector Machine
SVM: Support Vector Machine
TA: Air Temperature
THC: Thermohaline Circulation
TLFN: Time-Lagged Feed-Forward Neural Network
TS: Surface Temperature
UA: Eastward Wind
VA: Northward Wind
W-P-MLR: Wavelet-PCA-MLR Hybrid Model
WPS: Wavelet Power Spectrum
W-P-SoV: Wavelet-PCA-SoV Hybrid Model
ZG: Geo-potential Height
1. INTRODUCTION
1.1 General
The Earth system is a complex network of physical, chemical, geographic and biological processes
of the land, ocean and atmosphere that interact to shape the world in which we live. All these
processes are dynamic in nature, which adds to the difficulty of comprehending the
entire system. The interconnectivity and complexity of these processes limit a detailed
understanding of the system, and hence it deserves special consideration.
Within the Earth system exists the climatic system, which is sensitive to perturbations
in various processes such as the Earth’s radiation, the Earth’s geomagnetism, the Earth’s orbit, volcanic
eruptions, the carbon dioxide cycle, cloud physics, the hydrological cycle, solar activity, ocean currents
and many other internal feedbacks in the system at various scales.
The change of state of water between solid, liquid and gas involves transfer of heat and influences
atmospheric circulation and the global distribution of both water and heat (Asrar and Dozier, 1994).
The study of the climatic system requires a structured integration of scientific approaches that include
hydrological components and their interrelationships, which in turn encompass climatic variability,
land cover change, irrigation, flow regulation, etc. A comprehensive
understanding of the various components of the hydrological cycle is required in
order to study their effects. Examination of these atmospheric systems and
their linkages defines the critical questions that General Circulation Models (GCMs) are
attempting to answer.
All atmospheric processes are governed by laws of physics that can be
expressed mathematically as differential equations. These sets of equations are
put together to constitute a numerical model. Hence, a numerical model is a mathematical
representation of the physical system of atmospheric processes, and General Circulation
Models are numerical models of this kind.
General Circulation Models (GCMs) are mathematical models developed by expressing the
physics of land surface, ocean and atmospheric processes as a set of linear
and non-linear partial differential equations. They provide information on future
climatic variables for up to the coming 100 years, taking anthropogenic activities into account, but this
information is available at such a coarse resolution (about 300 km grid size) that it cannot be used
directly for any analysis at the local or regional level (Wigley et al., 1990; Carter et al., 1994).
However, GCMs do not project all climatic variables with the same level of accuracy. In general,
GCMs show good skill in projecting large-scale circulation patterns but perform poorly when it
comes to projections of rainfall (Ghosh and Mujumdar, 2006). GCMs do not incorporate sub-grid
features such as topography, land surface processes, land use patterns and cloud physics, and hence do not
represent the true picture of the prevailing climatic conditions. Various other processes
are missing from GCMs, which adds to their limitations in representing the global
climate. This issue is taken up in Chapter 3. The coarse, low-resolution information can
be used after downscaling it to the desired scale at the area of interest, for purposes such as
rainfall-runoff analysis and hydrological modelling.
Downscaling is a technique for taking climatic information available at coarse scales and using
it efficiently for further analysis at the finer scales needed by decision makers and
impact assessors. Downscaling addresses scale problems, both spatial
and temporal. There are two approaches to climatic downscaling: statistical and dynamic. Dynamic
downscaling is used to develop Regional Circulation Models (RCMs), which take input from
the GCM simulation as initial and boundary conditions, incorporate sub-grid features such as
topography and land use patterns, and produce very high resolution results. The advantage of an
RCM lies in compensating for the decreasing accuracy of a GCM at finer resolution for
impact studies. An RCM can resolve many surface features, such as mountains and
coastlines, more accurately. However, this technique is computationally expensive (Salvi et al., 2013).
Statistical downscaling is computationally cheaper than dynamic downscaling. It
is a data-driven approach in which empirical relations are established between the coarse-scale
predictors and the predictand at the finer scale. This approach takes input from GCMs for
a particular region to relate global climate aberrations to regional climate aberrations.
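As a minimal illustration of the empirical relation that statistical downscaling establishes, the sketch below fits a single coarse-scale predictor to a station-scale predictand by ordinary least squares and then applies the fitted relation to new predictor values. It is only a sketch under stated assumptions: the function names and all numbers are hypothetical, a single predictor stands in for the many predictors used in practice, and it is not the model developed in this thesis.

```python
from statistics import mean


def fit_linear_downscaling(predictor, predictand):
    """Fit predictand ~ intercept + slope * predictor by ordinary least squares."""
    if len(predictor) != len(predictand) or len(predictor) < 2:
        raise ValueError("need two equal-length series with at least two values")
    x_bar, y_bar = mean(predictor), mean(predictand)
    sxx = sum((x - x_bar) ** 2 for x in predictor)
    if sxx == 0:
        raise ValueError("predictor has no variance")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(predictor, predictand))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def downscale(predictor, intercept, slope):
    """Apply the fitted relation to coarse-scale predictor values."""
    return [intercept + slope * x for x in predictor]


# Hypothetical calibration data: a coarse-scale predictor and the
# station-scale predictand observed over the same period.
coarse_calibration = [0.0, 1.0, 2.0, 3.0]
station_calibration = [1.0, 3.0, 5.0, 7.0]

intercept, slope = fit_linear_downscaling(coarse_calibration, station_calibration)

# Hypothetical future predictor values taken from a GCM scenario.
future = downscale([4.0, 5.0], intercept, slope)
```

The relation is calibrated on a period where both series are available and then applied to GCM output for a period with no station observations, which is why the stationarity of the fitted relation is a standing assumption of this approach.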
Knowledge of precipitation and temperature obtained by downscaling GCM output is required
for estimating runoff, planning erosion control measures, planning irrigation, removing excess
water and conserving water in low rainfall regions. Knowledge of runoff is required for designing
the engineering structures needed to mitigate water scarcity in the drinking water and
irrigation sectors. Data on infiltration, evaporation and transpiration are required for planning
irrigation and drainage systems, moisture conservation practices, etc. The watershed at the
appropriate scale is generally the most logical geographical unit for stream-flow analysis and water
resources management.
The approach described in this study uses wavelet analysis. A wavelet is a
mathematical tool that captures the time-frequency (scale) features of a time series. Wavelets
can zoom in and out to analyse a time series at different scales. They are therefore
expected to be a useful tool for obtaining high resolution data from the available
low resolution time series. A wavelet transform decomposes the time series and efficiently captures
underlying information that is often not taken into account and is of great importance in
prediction. Wavelets have been used extensively in hydrological modeling and
forecasting; in this study their use is explored for downscaling, and wavelet-based
downscaling models are proposed to capture the low frequency events that are often not
captured by conventional methods.
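The idea of analysing a series at several scales can be sketched with a multi-level discrete wavelet decomposition. The example below uses the Haar wavelet purely because it is the simplest to write out by hand; this choice, the function names and the sample values are assumptions made for illustration and are not taken from the thesis. Each level splits the current approximation into a smoother approximation (low frequency) and a detail series (high frequency), and the original series can be rebuilt exactly from them.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Return the coarsest approximation and the detail series, finest first."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def haar_reconstruct(approx, details):
    """Invert haar_decompose, working from the coarsest level back to the finest."""
    s = math.sqrt(2.0)
    current = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(current, detail):
            rebuilt.extend([(a + d) / s, (a - d) / s])
        current = rebuilt
    return current


# Hypothetical eight-value series (length must be divisible by 2**levels).
series = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, details = haar_decompose(series, levels=2)
restored = haar_reconstruct(approx, details)
```

In a multi-resolution modelling setting, each component (the approximation and each detail series) could be modelled separately and the component predictions summed or reconstructed, which is what allows slowly varying behaviour to be treated apart from rapid fluctuations.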
1.2 Study Objectives
Formulation of wavelet-assisted models for downscaling GCM outputs to the regional
level, and their comparison at various predefined downscaling stations using statistical
performance indices.
A two-fold application of wavelet analysis: first for GCM variable reduction,
then for climatic downscaling.
Statistical downscaling of the GCM future scenario to local-level precipitation
and temperature.
Exploration of techniques such as Multi-scale Wavelet Entropy (MWE), K-means
clustering and Principal Component Analysis (PCA) to enhance the efficiency of the
models.
A comprehensive study of the reality of climate variability and how well
current General Circulation Models capture it.
1.3 Significance of the study
This study is a step towards a practical and implementable
wavelet-based framework for climatic downscaling using statistical models, and proposes a
wavelet-based multivariate second order model for downscaling GCM variables to
regional precipitation and temperature.
The study compares the performance of wavelet-based models with the
popular ANN and MLR based methodologies and establishes the improvement gained over these by
applying a multi-resolution modelling approach.
The study is also the first of its kind to use MWE and K-means clustering coupled with PCA for
variable reduction in climatic downscaling.
The study emphasizes the need for a rational and holistic approach to redefining
climate and addressing the limitations of GCMs.
1.4 Research Questions
Which climatic variables relate best to the regional predictands and can thus be considered
as the model inputs?
How well do the GCM simulations represent reality?
How efficient and reliable are the statistical downscaling models in translating coarse
climatic information to finer scales?
1.5 Structure of the Report
The report is structured as follows:
Chapter 1 Introduction: This chapter deals with the motivation for and need of the study, giving
the basic ideas behind downscaling.
Chapter 2 Literature Review: This chapter focuses on the details behind the climatic
downscaling and various literature works which provide an overview of downscaling and its
application in various domains.
Chapter 3 Credibility of Climate Variability and GCMs: An attempt has been made to bring
the GCMs and climate variability under scrutiny to check their efficacy.
Chapter 4 Study area and Data description: This chapter provides the detailed description of
study area and data used in the study.
Chapter 5 Methodology: This chapter provides an overview of the various techniques applied
in model development.
Chapter 6 Model Architecture: The whole framework of the statistical downscaling is
explained with the aid of a schematic diagram.
Chapter 7 Results and Discussions: It includes the discussion of model performance and a
comparison of the various models adopted.
Chapter 8 References: All those sources which were helpful and have been used in this study
are mentioned in this section.
2. LITERATURE REVIEW
2.1 Models Used
In the last decade a lot of work on downscaling of climatic variables has been done using various
linear and non-linear models, which include automated regression, Artificial Neural Network
(ANN), Support Vector Machine (SVM), Relevance Vector Machine (RVM) etc.
The most popular approach used in downscaling is to develop a transfer function, using some form
of regression, for the quantitative relationship between the local scale climate variable and the
climate variables (pressure, temperature, wind) at coarser scales (Cannon et al., 2002). Individual
downscaling schemes differ according to the choice of mathematical transfer function, predictor
variables and statistical fitting procedure. Linear and nonlinear regression, ANN, canonical
correlation, etc. have been used to derive the predictor-predictand relationship. Artificial
intelligence and machine learning approaches such as ANN and SVM based downscaling techniques
have gained wide recognition owing to their ability to capture nonlinear relationships between
predictors and predictand (Hewitson et al., 1996; Trigo et al., 1999; Tripathi et al., 2005;
Wilby et al., 1998).
Automated regressions, which were used in the initial stages, did not take non-linearity into
consideration, whereas ANN and SVM based downscaling techniques are capable of capturing
nonlinear relationships between predictors and predictand. These techniques are generally
computationally less intensive than the RCM based downscaling method and, further, ensembles of
high resolution climate scenarios can be produced relatively easily.
Coulibaly et al. (2004) used the time-lagged feed-forward neural network (TLFN) for
downscaling daily total precipitation and daily maximum and minimum temperature series for the
Serpent River watershed in northern Quebec, Canada. Anandhi et al. (2008) developed a
methodology to downscale monthly precipitation to the Malaprabha river basin using SVM, with a
separate downscaling model developed for each season to capture the relationship between the
predictor variables and the predictand. Hessami et al. (2008) applied an automated regression-based
statistical downscaling tool to the Canadian Coupled GCM (CGCM1) and the Hadley Centre Climate
Model (HadCM3) and compared the results. Xu et al. (2008) constructed a Smooth Support
Vector Machine (SSVM) method to predict daily precipitation under a changed climate in the
Hanjiang Basin. In SSVM, smoothing techniques are applied to solve the underlying mathematical
programming problems.
S. Ghosh and P.P. Mujumdar (2008), in their paper ‘Statistical downscaling of GCM simulations
to streamflow using relevance vector machine’, used the RVM technique and compared its results
with those of SVM. G. Burger et al. (2011) presented ‘Downscaling Extremes: An Intercomparison
of Multiple Statistical Methods for Present Climate’. Chadwick (2011) used an artificial neural
network technique for downscaling ECHAM5 GCM temperature (T) and rainfall (R) fields to the
RegCM3 regional model scale over Europe. Okan Fistikoglu (2011) presented statistical
downscaling of monthly precipitation based on an ANN model using NCEP/NCAR reanalysis data
for the Tahtali river basin in Turkey. Salvi et al. (2013) demonstrated high-resolution multisite
daily rainfall projections in India with statistical downscaling for climate change impact
assessment, using a statistical downscaling model based on classification and regression trees
and kernel regression. Shashikanth and Subimal Ghosh (2013), in their paper titled ‘Fine
Resolution Indian Summer Monsoon Rainfall Projection with Statistical Downscaling’, used a
nonparametric kernel regression model. Future changes in extreme temperature events in the
trans-boundary region of the Jhelum river basin were presented by Mahmood and Babel (2014)
using the statistical downscaling model (SDSM). Devak et al. (2015) proposed downscaling of
precipitation in the Mahanadi Basin, India, by SVM and K-Nearest Neighbor (KNN), using CanCM4
historical data and RCP 4.5 as the future scenario.
Despite a number of advantages, the traditional ANN models have several drawbacks, including
the possibility of getting trapped in local minima and subjectivity in the choice of model
architecture (Suykens et al., 2001).
SVM has the drawbacks of a rapid increase in the number of basis functions with the size of the
training data set and the absence of a probabilistic interpretation (Govindaraju and Rao, 2005).
2.2 Downscaling techniques
Downscaling methods are developed to obtain local-scale surface weather from global scale
atmospheric variables that are provided by GCMs. With downscaling, a low resolution image is
enhanced to a finer resolution using another higher resolution data product and a certain regression
procedure. Figure 2.1 depicts the spatial downscaling from global to the local level.
Figure 2.1 Downscaling of global scale to local scale for Krishna Basin
The most popular downscaling techniques are:
2.2.1 Statistical downscaling
In this type of downscaling a statistical or empirical relationship (regression equations) is
established from observations between large scale variables, like atmospheric surface pressure,
and a local variable, like the wind speed at a particular site. The relationship is then subsequently
used on the GCM data to obtain the local variables from the GCM output.
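The two steps described above, fitting a relationship on observations and then applying it to GCM data, can be sketched as follows. This is a minimal illustration that assumes a multiple linear regression as the transfer function; the function names, the two predictors and the synthetic numbers are hypothetical and do not represent the model developed in this study.

```python
import numpy as np

def fit_transfer_function(predictors, predictand):
    """Least-squares fit of predictand = b0 + b1*x1 + ... (illustrative)."""
    predictors = np.asarray(predictors, dtype=float)
    X = np.column_stack([np.ones(len(predictors)), predictors])
    coef, *_ = np.linalg.lstsq(X, np.asarray(predictand, dtype=float), rcond=None)
    return coef

def apply_transfer_function(coef, predictors):
    """Apply fitted coefficients to a new (e.g. GCM) predictor set."""
    predictors = np.asarray(predictors, dtype=float)
    X = np.column_stack([np.ones(len(predictors)), predictors])
    return X @ coef

# Step 1: calibrate on observed large-scale predictors and a local variable
rng = np.random.default_rng(42)
observed_large_scale = rng.normal(size=(120, 2))   # hypothetical: pressure, temperature
observed_local = (3.0 + 0.8 * observed_large_scale[:, 0]
                  - 0.4 * observed_large_scale[:, 1]
                  + rng.normal(scale=0.1, size=120))
coef = fit_transfer_function(observed_large_scale, observed_local)

# Step 2: apply the same relationship to GCM output for the same predictors
gcm_large_scale = rng.normal(size=(12, 2))
downscaled_local = apply_transfer_function(coef, gcm_large_scale)
print(coef.round(2), downscaled_local.shape)
```

The sketch makes the central assumption of the method visible: the coefficients calibrated on observations are reused unchanged on the GCM predictors.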
Statistical downscaling, also called empirical downscaling, is a tool for downscaling climate
information from coarse spatial scales to finer scales. It is less technically demanding than
regional modelling. It is thus possible to downscale from several GCMs and several different
emissions scenarios relatively quickly and inexpensively for many decades or even centuries,
rather than the brief “time slices” of the dynamical downscaling approach. It is even possible to
tailor scenarios for specific localities, scales, and problems. The spatial resolution applied in
regional climate modelling is still too coarse for many impact studies, and some variables are
either not available or not realistically reproduced by regional models. However, the major
weakness of statistical downscaling is the assumption that observed links between large-scale
predictors and local predictands will persist in a changed climate. A problem when applying
statistical downscaling techniques to daily values is that the observed autocorrelation between
the weather at consecutive time steps is not necessarily reproduced. Also, statistical downscaling
does not necessarily reproduce a physically sound relationship between different climate elements
(Lenart, 2008).
2.2.2 Dynamical downscaling
In dynamic downscaling the output from the GCM serves as initial and boundary
conditions to drive a physically based regional numerical model at higher spatial resolution,
which is therefore able to simulate local conditions in greater detail.
With dynamic downscaling, a low resolution image is enhanced to a finer resolution using output
from another higher resolution data product and a certain numerical procedure to drive a mesoscale
regional numerical model, which is therefore able to simulate local conditions at higher spatial
and temporal resolution (Kim et al., 1984 and Lenart, 2008).
It fits output from GCMs into regional meteorological models. Rather than using equations to bring
global-scale projections down to a regional level, dynamic downscaling involves using numerical
meteorological modelling to reflect how global patterns affect local weather conditions. The level
of detail involved strains computer capabilities, so computations can only tackle individual GCM
outputs and brief time slices. Yet climatologists generally consider three decades to be about the
minimum for deducing climatic conditions from the vagaries of weather (Lenart 2008; Bader et
al., 2008; Maurer and Hidalgo 2008).
2.3 Selection of GCM
With reference to the prior discussion in the introduction chapter, it has been observed that GCMs
better simulate global conditions, with spatial scales typically greater than 100 km, mean annual
and seasonal temporal scales and high vertical scales. The major working variables include wind,
temperature, geopotential height and pressure. To resolve regional physical phenomena occurring
at smaller scales, however, there is a need to produce well distributed regional details.
For instance, to simulate the hydrology of a region we need spatial scales of 0-50 km, mean
daily temporal values and near surface data, with evapotranspiration, runoff and soil moisture as
the principal working variables. The advantage of an RCM lies in compensating for the decreasing
accuracy of GCMs at the higher resolutions needed for impact studies. An RCM can resolve many
surface features, such as complex mountain topographies and coastlines, more accurately.
There are important differences between the real world and the world as represented by GCMs.
Small-scale effects (such as topography) that are important to local climate could be poorly
represented in the GCM; however, it is possible to produce detailed simulations for selected
regions by nesting a Regional Climate Model (RCM) into a global GCM. Large-scale fields from the
GCM are used to drive the initial and time-dependent lateral boundary conditions of the RCM.
The RCM is coupled to a global model which regularly provides boundary conditions to the RCM
during the model integration. The RCM produces better regional detail of temperature and
precipitation distribution; it is apt to simulate regional structures and better represents
orographic precipitation. In the vertical, the model-level data have to be interpolated on pressure
levels. In the horizontal, the model grid data are transferred to a latitude-longitude or to a polar
stereographic grid; in the case of spectral models, the spectral coefficients have to be transformed
into grid points.
Selecting a GCM from among the many available is an art in itself. No GCM is superior in
predicting temperature or precipitation for the whole world, although some GCMs score better in
particular regions. The performance of GCMs is assessed according to their ‘‘skill scores’’. Cai
et al. (2009) considered 17 GCMs, computed a skill score for each based on its RMSE value, and
ranked them for all parts of the world. For India, three GCMs perform well, and among them the
model of the “Canadian Centre for Climate Modelling and Analysis” performs best. In this study
the CanCM4 model developed by the “Canadian Centre for Climate Modelling and Analysis” is used
for the analysis.
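As an illustration of ranking models by an RMSE based score, a minimal sketch is given below. It assumes that the score is simply the RMSE between each simulated series and the observed series, with a lower value ranking higher; the exact skill score of Cai et al. (2009) is not reproduced here, and the model names and numbers are hypothetical.

```python
import numpy as np

def rmse(simulated, observed):
    """Root mean square error between two equally long series."""
    simulated = np.asarray(simulated, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((simulated - observed) ** 2)))

def rank_models(simulations, observed):
    """Return (name, rmse) pairs sorted from best (lowest RMSE) to worst."""
    scores = {name: rmse(series, observed) for name, series in simulations.items()}
    return sorted(scores.items(), key=lambda item: item[1])

# Hypothetical observed series and three hypothetical model simulations
observed = np.array([1.0, 2.0, 3.0, 4.0])
simulations = {
    "model_A": observed + 1.0,
    "model_B": observed + 0.2,
    "model_C": observed + np.array([2.0, -2.0, 2.0, -2.0]),
}
for name, score in rank_models(simulations, observed):
    print(name, round(score, 3))
```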
2.4 Selection of Predictors
All the GCMs produce different simulations for the future, considering the anthropogenic
activities and the emission rates of greenhouse gases (GHGs). They all have their own
assumptions and produce different variables or predictors, which include atmospheric, soil,
oceanic and ice variables. In this study only the atmospheric predictors have been used. Within
the atmospheric variables there is a wide range of variables depending on the pressure levels.
So there is a need to filter variables based on their relevance to our study to obtain the
probable predictors.
The variables selected should fulfil the following conditions:
Data should be available for the desired period.
The selected GCM should be capable of simulating the variable well.
The predictor must show a good relation with the predictand.
Anandhi et al. (2008) considered that the monsoon rain is dependent on dynamics, through
advection of water from the surrounding seas, and on thermodynamics, through effects of moisture
and temperature which can modify the local vertical static stability. In a changed climate
scenario, both the thermodynamic and dynamic parameters may undergo changes, and it is likely
that a predictor that is not significant for the present climate may become a key predictor for
the future.
Once the probable predictors are decided, the potential predictors that are fed as input into the
model are determined from them, so as to reduce the dimensionality and to take into account the
multi-collinearity. In this study the concepts of variable averaging and principal component
analysis (PCA) have been used to overcome the problems seen in past studies.
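A minimal sketch of how PCA reduces dimensionality and removes multi-collinearity among predictors is given below. It assumes standardized predictors and a retained-variance threshold; the function name `pca_reduce`, the threshold value and the synthetic predictors are illustrative assumptions, not the configuration used in this study.

```python
import numpy as np

def pca_reduce(X, variance_to_keep=0.95):
    """Project standardized predictors onto leading principal components.

    Assumes every column of X has non-zero variance.
    Returns (scores, explained_variance_ratio_of_kept_components).
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize each predictor
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)   # rows of Vt are the loadings
    explained = s ** 2 / np.sum(s ** 2)
    # Smallest number of components whose cumulative variance reaches the threshold
    k = int(np.searchsorted(np.cumsum(explained), variance_to_keep) + 1)
    k = min(k, s.size)
    scores = Z @ Vt[:k].T                              # mutually uncorrelated inputs
    return scores, explained[:k]

# Hypothetical predictors: the second column is collinear with the first
rng = np.random.default_rng(7)
p1 = rng.normal(size=200)
predictors = np.column_stack([p1, 2.0 * p1, rng.normal(size=200)])
scores, explained = pca_reduce(predictors, variance_to_keep=0.99)
print(scores.shape, explained.round(3))
```

Because the principal component scores are uncorrelated with one another, feeding them to a regression model avoids the instability that collinear predictors would otherwise cause.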
2.5 Wavelets
It is also to be noted that natural phenomena work at multiple spatial and temporal scales, and
it is necessary to capture this multi-scale variability through the downscaling method
(Johnson et al., 2009). In the past, wavelets have been widely used as a multiresolution technique
to capture the underlying multi-scale variability. Extensive literature is available on wavelet
based models for a diverse set of problems in hydrological modeling, such as monsoonal flood
forecasting (Sehgal et al., 2014), drought forecasting (Kim et al., 2003), streamflow analysis
(Maheswaran and Khosa 2012; Adamowski 2008; Coulibaly and Burn 2004; Kucuk and Agiralioglu 2006;
Smith et al., 1998), precipitation analysis (Kim 2004; Lu 2002; Partal and Kisi 2007; Xingang
et al., 2003), rainfall-runoff relationship (Labat et al., 2000), prediction of river discharge
(Zhou et al., 2008), analysis of suspended sediment load (Rajaee et al., 2010), estimation of unit
hydrographs (Chou and Wang, 2002) and various other hydrological predictions (Wang and
Ding, 2003; Maheswaran and Khosa 2012). A good illustration of wavelet analysis can be found
in the books of Loknath (2010) and Rao (2004).
Even though wavelets have been applied extensively in hydrological forecasting, they have seen
very little application in the area of spatial downscaling. In the past, Perica and
Foufoula-Georgiou (1996) used the concept of self-similarity in rainfall fields for spatial
downscaling. More recently, Foufoula-Georgiou and Ebtehaj (2013) developed a wavelet
based downscaling framework using the sparse nature of precipitation fields. Cai (2009)
applied wavelets as a denoising tool in climatic downscaling from GCMs. Rashid et al. (2014) used
wavelet coherence to identify predictor variables for hydro-climatic variables and proposed
wavelet coupled Generalized Additive Model in Location, Scale and Shape (GAMLSS) models
for downscaling rainfall. From the analysis of the literature it seems that there have not been
many investigations on simulating finer scale precipitation from the GCM climate variables
(pressure, wind velocity, temperature etc.).
So, in this study an attempt has been made to develop a wavelet based framework for
simulating/downscaling rainfall at finer scale using the GCM based climate variables. The
approach described in this study utilizes the concept of multi-resolution analysis using the
wavelet transform, which is useful in capturing the underlying information of a time series at
different time-frequency scales.
2.6 Assumptions involved in downscaling
Predictors are variables of relevance and are realistically modeled by the host GCM.
The empirical relationship is also valid under altered climatic conditions.
The predictors employed fully represent the climate change signal.
(Wilby and Wigley, 1997; von Storch et al., 2000)
3. REALITY OF CLIMATE VARIABILITY AND GCMs
The study presented here has some underlying assumptions and shortcomings, which are
discussed in this chapter. In the process of modelling climatic variables, a lot of pertinent
information and processes are unknown and are not incorporated in the models. There are instances
where the information or data used as an input to the study is not well founded or developed, and
is hence debatable. The research has been conducted in the knowledge of this fact, so the results
obtained from it carry a high degree of uncertainty. The very concepts of climate and climate
change are controversial in themselves and need to be addressed with a rational and pragmatic
approach. A wide range of interactions exists at various scales, and most of these interactions
are non-linear, which further adds to the complexity. It is impractical to have complete
knowledge of the climatic system, although some of its key components can be dealt with
proficiently. Even if the components of the climate are well understood individually, their
amalgamation and interaction are too complex to handle. As far as the GCMs are concerned, they
have their own issues regarding resolution and the fundamental processes on which they are
based, and hence they produce awry simulations. The whole concept of the climate that GCMs
endeavor to capture is enigmatic in nature, which engenders the need for a comprehensive
discussion; accordingly, an attempt at one has been made in this study. This chapter reviews
the state of the art and gives impetus to exercise prudence while dealing with climatic
downscaling.
3.1 Climate and Climate Change Revisited:
IPCC defines Climate as:
Climate in a narrow sense is usually defined as the ‘average weather’, or more rigorously, as the
statistical description in terms of the mean and variability of relevant quantities over a period of
time ranging from months to thousands or millions of years. The classical period of time is 30
years, as defined by the World Meteorological Organization (WMO).
A climate time series is a non-stationary series, and an average is not defined for a
non-stationary series. The IPCC definition of climate is therefore inappropriate and needs to be
corrected. Climate is a dynamic process that depends on many other dynamic processes and their
mutual interactions, such as the orbital parameters of the earth, solar output variation, the
solar internal motion cycle, ocean and ice dynamics, geological processes, atmospheric processes
and the biosphere, which occur at different scales. Thus it cannot be a stationary process, and
modelling such a process is difficult. Yet GCMs are still deified by the climate community and
valued as the panacea for all climate related issues, ignoring the flaws which they bring with
them.
Since climate is inappropriately defined, a measure of change in it is not justified. Climate
change is portrayed in a much obscured way which shrouds the reality. ‘Change’ is an intrinsic
attribute of climate and hence climate variability is inevitable. But the very fact that it is
associated with anthropogenic influences is an iniquitous concept. The way it has become
entrenched in our society, it mainly carries negative connotations. It has been established as a
sinister abstraction which propagates the notion that climate change is a recent, never before
seen phenomenon which, if not arrested, would lead to the end of this world. The advent of the
climate change concept in the literature is recent: it began in the early 19th century, when ice
ages and other natural changes in paleoclimate were first suspected and the natural greenhouse
effect was first identified. Climate change itself, on the contrary, is not new; the climate has
varied continuously in the past. We know that creatures such as dinosaurs (which lived from 231
million years ago until 66 million years ago), the woolly mammoth (which existed from 5 million
years ago until about 4,500 years ago) and the saber-toothed tiger (which appeared 42 million
years ago and went extinct 11,000 years ago) existed in the past and went extinct due to the
warming of the atmosphere, without any burning of fossil fuels or industrialization. But today
the World Wildlife Fund (WWF) attributes the extinction of animals such as green sea turtles, the
orangutan, the giant panda, the polar bear, Galapagos penguins and the tawny eagle to climate
change caused by human intervention.
Earth has experienced various alternating periods of glaciation marked by much higher levels of
atmospheric CO2 than today. The last cold era, 20,000 years ago, was followed by a
medieval warm period, when most civilizations flourished and temperatures were
warmer than today. This increase in the average temperature was not on account of any
industrialization. In the ensuing period up to the 1850s, a little ice age was noticed, resulting
in crop failures and famine; thereafter temperatures have on average been rising at a pace of
0.6°C per century with small oscillations.
Earth acts as a gyroscope and continuously endeavors to maintain equilibrium in its
atmospheric and oceanic processes or cycles, which are dynamic in character, by varying their
rates. Earth’s climate tends towards a stable or equilibrium state by making suitable
alterations in its dynamic processes through a self-regulating mechanism, which accounts for the
variability in the climate. This whole activity takes millions of years to reach the stable
state. It is a continuous process which keeps repeating itself and is marked by the
inter-glacial cycles.
But the leading world institutions define it in another way. The definitions given by them condemn
the process of development by making human activities responsible for climate change.
3.2 General misapprehensions
United Nations Framework Convention on Climate Change (UNFCCC) limits its definition of
climate change to the human causes “a change of climate which is attributed directly or indirectly
to human activity that alters the composition of the global atmosphere and which is in addition to
natural climate variability observed over considerable time periods.”
According to the UK Department of Energy and Climate Change (DECC), “Climate change is the
process of changing weather patterns caused by the increased number of greenhouse gases in the
global atmosphere as a result of human activity since the beginning of the Industrial Revolution”.
It appears that the DECC is only concerned with man-made climate change.
Climate change refers to any change in climate over time, whether due to natural variability or as
a result of human activity (IPCC). It is attributed largely to the increased levels of atmospheric
carbon dioxide produced by the use of fossil fuel.
As per the Summary for Policy Makers, Second Order Draft of AR5, “it is extremely likely that
human activities have caused more than half of the observed increase in global average surface
temperature since the 1950s”.
From the above definitions it is clear that human intervention has been considered the
main cause of the warming of the earth’s atmosphere, but there are others who stand at the
opposite pole. Their contributions have brought climate change under the
microscope.
There is no evidence that man’s production of carbon dioxide is causing more extreme weather
events. Any change caused by man will be gradual and there will be plenty of time to adapt, as
humans have always done. Most people will hardly notice it. (http://carbon-sense.com/)
The so called greenhouse gases (mainly water vapour and carbon dioxide) have the ability to
absorb radiant energy and transmit it to their surroundings (accepting the potential contribution
of CO2 to global warming). Carbon dioxide occurs in tiny trace amounts in the atmosphere, and any
surface heating it could do is already being done by water vapour (not rejecting but downplaying
the attribution claim), which is more abundant and affects far more energy wavelengths. But
additional carbon dioxide in the biosphere gives a major boost to all plants which feed all animals
(downplaying negative impacts). It is not a pollutant, anywhere (rhetorically indemnifying CO2).
3.3 Contributions of past studies that question the present specious definition of climate
The following studies suggest that a much more coherent and lucid outlook must be adopted in
dealing with such a sensitive issue as climate change, which aims to curtail the human
development process. These studies indicate that the above stated definitions of climate and
climate change are not consistent with their findings.
3.3.1 Carbon Cycle
A multitude of interaction pathways exist between the physical climate system and the different
carbon reservoirs. The most prominent is the concentration of CO2, an atmospheric trace gas,
which through its physical properties influences the radiation balance of the Earth. Our knowledge
is insufficient to describe the interactions between the components of the Earth system and the
relationship between the carbon cycle and other biogeochemical and climatological processes.
Geological processes contribute more carbon dioxide to the atmosphere than industry does.
The magnitude of the flux (millions of metric tons per year) due to human
activities is 8,000, and that due to natural processes is 61,000 (Falkowski et al., 2000). This
indicates that the credence given to the notion of climate variability due to anthropogenic
activities is arguable.
It has been established that neither the combustion of fossil fuels nor the removal of organic carbon
(deforestation) influenced atmospheric CO2 levels nearly as much as crustal processes, which
include the formation of limestone (corals) and other carbonates, which removed carbon dioxide
from the atmosphere, and the decomposition of silicates, which added CO2 to it (Hogbom, 1984).
He estimated that if 500 million tons of coal were converted to carbon dioxide, it would produce
1/1000th of the total atmospheric concentration.
The carbon cycle includes a variety of processes that take place over timescales ranging from hours
to millions of years. Processes occurring over shorter periods include photosynthesis, respiration,
air-sea exchange of carbon dioxide and humus accumulation in soils. However, it is the long-term
carbon cycle, occurring over millions of years, that is of interest when considering the origin of
fossil fuels. The long-term cycle, shown in Figure 3.1, is distinguished by the exchange of carbon
between rocks and the surficial system, which consists of the ocean, atmosphere, biosphere and
soils. The long-term carbon cycle is the main controller of the concentration of atmospheric carbon
dioxide over a geological timescale. There are various sources and sinks responsible for the
circulation of the CO2.
Figure 3.1 Long Term Carbon cycle
The total of dissolved inorganic carbon in the oceans is 50 times that of the atmosphere, and on
time scales of millennia, the oceans determine atmospheric CO2 concentrations, not vice versa.
Atmospheric CO2 continuously exchanges with oceanic CO2 at the surface. Carbon dioxide in the
atmosphere mixes with seawater to produce carbonic acid. Carbonic acid easily dissociates into
hydrogen ions (H+) and bicarbonate ions (HCO3-). (An increase in the concentration of hydrogen
ions increases the acidity of the water, i.e. lowers the pH.) Carbonate ions (CO3 2-) combine with
hydrogen ions to produce bicarbonate ions. Carbonate ions and calcium ions are used by marine
animals (such as corals, crustaceans, and zooplankton) to create hard shells and skeletons of
calcium carbonate. This is called “calcification,” or more generically, “biomineralization.”
Calcium carbonate dissolves into calcium ions and carbonate ions. The rate of dissolution depends
on pH, pressure and the amount of carbonate ions in seawater.
Past records show that the concentration of CO2 has at times been even higher than
today, as shown in Figure 3.2. Those higher concentrations were not on account of the
burning of fossil fuels and industrialization, but were due to the underlying processes which
took place on Earth in those times and governed the CO2 levels in the atmosphere. In the past CO2
record, a clear periodic trend with cycles of finite duration can be seen, which suggests that
over the coming millions of years this kind of cycle will maintain itself. Today we might be on
the rising limb of a cycle; after reaching the crest, a decline is an inevitable part of the
cycle. A similar trend is seen in the temperature, which is discussed in the next section.
Figure 3.2 Temperature change (blue) and carbon dioxide change (red) observed in ice core records by NOAA
3.3.2 Warmer periods found in the earlier interglacial periods
It is observed from the ice core recovered from the Russian Vostok drilling station in East
Antarctica that the four inter-glacial cycles that preceded the Holocene were, on average, more
than 2°C warmer than the one in which we currently live as shown in Figure 3.3 (Petit et al., 1999).
As judged from Vostok records, climate has almost always been in a state of change during the
past 420 kyr (kilo year) but within stable bounds (that is, there are maximum and minimum values
of climate properties between which climate oscillates). Significant features of the most recent
glacial-interglacial cycle are observed in earlier cycles.
Figure 3.3 Past temperature records for the four inter-glacial preceding Holocene period
3.3.3 Geomagnetism
Magnetism has seldom been invoked, and evidence for connections between climate and magnetic
field variations has received little attention. The potential correlation between changes in the
amplitude of geomagnetic variations of external origin, solar irradiance and global temperature is
stronger over a range of time scales from decades to hundreds of thousands of years (Courtillot et
al., 2007).
Geologic and atmospheric evidence both strongly indicate that the Martian climate has changed
dramatically over geologic time. The present-day Martian atmosphere is very thin and cold, with
an atmospheric pressure of only 6 millibars (more than 150 times lower than Earth's atmospheric
pressure) and an average surface temperature of 210 K. The loss of the Martian atmosphere has been
attributed to the inability of its global magnetic field to hold it. A substantial global magnetic
field causes the
solar wind to stand off from the planet, limiting its ability to strip off the atmosphere. A planetary
magnetic field can "protect" against sputtering losses in several ways. First, if the field deflects the
solar wind around the bulk of the atmosphere, it limits the ion production rate in the upper
atmosphere by eliminating solar-wind-induced ionization processes. Second, and perhaps more
important, the field shields any ions produced in the upper atmosphere (e.g., by photoionization)
from the solar wind magnetic field. Thus losses of atmospheric ions and atoms by direct sweeping
or collisional sputtering, respectively, are minimized (Hutchins et al., 1997).
In the case of Mars, where the atmosphere barely exists, defining a climate is a challenging task.
The atmosphere and the oceans are the cardinal contributors that govern and define the climate of a
planet, and on Mars both are effectively absent. This raises a very pertinent question: does Mars
have any climate of its own?
3.3.4 Changes in the orientation of the Earth’s axis of rotation and the Earth’s orbit
The Earth rotates around an axis that is not upright but leans at an angle. This angle changes
with time: over about 41,000 years it moves from 22.1 degrees to 24.5 degrees and back again.
When the angle increases, the summers become warmer and the winters colder, and vice versa,
thereby influencing the climate of the Earth (British Geological Survey).
Moreover, the Earth’s orbit around the Sun is an ellipse, not a circle, and the ellipse changes its
shape. Sometimes it is almost circular and the Earth stays at approximately the same distance from
the Sun as it progresses around its orbit. At other times the ellipse is more pronounced, so that the
Earth moves closer to and further away from the Sun as it orbits. When the Earth is closer to the
Sun, our climate is warmer.
These factors are rarely recognized in discussions of climate. They play a vital role in governing
the climate and its variability, so both must be considered when defining climate variability.
3.3.5 Salinity levels in the oceans
Salinity affects seawater density, which in turn governs ocean circulation and thus climate. It is
another factor that is often overlooked in deliberations on climate. Wind drives the upper ocean
currents; however, ocean currents can also flow deep below the surface. These deep-ocean currents
are driven by differences in seawater density. As seawater density is controlled by temperature and
salinity, these factors make the oceans very dynamic in nature. The term thermohaline circulation
(THC) (thermo: temperature; haline: salt content) refers to the part of the large-scale ocean
circulation that is driven by global density gradients created by surface heat, freshwater fluxes
and the salinity of the sea water (Munn, R. E., et al., 2002). The salinity level in the oceans also
affects evaporation rates, influencing the water cycle in the atmosphere.
3.3.6 Political undertone
The concept of climate change is driven by a political thrust, which has its own objectives and an
agenda to deceive. It portrays climate change as a threat to the environment caused mainly by
industrialization and the burning of fossil fuels, and proves it scientifically through false means.
This very idea of climate change has been propounded by world leaders who are anti-capitalists in
order to curtail further development by the developed nations. Policies that support the climate
change narrative are made and adopted to close industries and to make the capitalists pay for it.
The reports published by the IPCC have a political essence and lack scientific temper.
3.4 Supporting evidence
Many severe events have taken place in the distant past, up to millions of years ago, because of
variations in climatic conditions. They show that if something major is expected in the coming
times, it would not be unprecedented but rather part of a trend followed since the palaeo-ages.
Some of those events are described here.
3.4.1 Great Barrier Reef Submergence
15,000 years ago, the Earth was experiencing an ice age in which much of North America was covered
with glaciers and the oceans were 300-400 feet lower than today. Coral larvae in the ocean settled
along the edge of the coastal plain (now ocean), and the coral heads grew as a reef, similar to the
formation of a classical fringing reef. 6,500 years ago, the glaciers melted and the sea level rose.
As a result, the sea covered the coastal plain on Australia's northeast coast and many of the coral
heads grew upward.
3.4.2 Dwarka Submergence
The ancient city of Dwarka in Gujarat, on the Gulf of Cambay, which was once the dwelling place of
Lord Krishna around the 15th to 18th century B.C., was submerged due to a rise in sea level. It is
said that the modern city is the seventh city and that it was submerged under the sea six times
before. Archaeologists have recovered many artifacts whose age coincides with the mythological era
in which Lord Krishna lived. It has been found that Dwarka was one of the busiest ports on the west
coast of India in earlier times. If such a devastating event occurred in the past, why should
development be blamed for the rising sea levels of the present?
In addition to Dwarka, there are many other sites along the Gujarat shoreline where evidence of
sea level fluctuations has been found. Changes in the shoreline at any point could be due to
various reasons, such as tectonic disturbance or a shift in the sedimentological regime causing
erosion or deposition. Many scientific investigations of the palaeo-shoreline and sea level
fluctuations in India, based on numerous geological techniques, have indicated that at about
6000 years BP the sea level was approximately 6 m higher than at present and that at about 4000
years BP it stabilized at the present level with minor fluctuations (Gaur, A. S., and K. H. Vora, 1999).
3.4.3 Strait of Bosporus
During the latest Quaternary glaciation, the Black Sea became a giant freshwater lake. When the
Mediterranean rose to the Bosporus sill at 7,150 years BP, saltwater poured through this spillway
to refill the lake and submerge, catastrophically, more than 100,000 km² of its exposed continental
shelf (refer to Figure 3.4). The permanent drowning of a vast terrestrial landscape may possibly have
accelerated the dispersal of early Neolithic foragers and farmers into the interior of Europe at that
time (Ryan et al., 1997).
3.4.4 England was connected with the rest of Europe
About 26,000 calendar years ago, when global sea levels were 120 m below the present level, Great
Britain was part of the European mainland and was connected to Ireland through land bridges. After
the Pleistocene ice age, warming melted the glaciers and raised the sea level; Britain was cut off
from the mainland and became an island, and the land bridges became the modern sea floor
(Edwards et al., 2008).
Researchers have found many common features between the two regions, which further strengthens
this theory. When similar events are observed today at a much smaller scale, they are labelled as
repercussions of human activities.
3.4.5 Nomenclature of Greenland
There are various theories about the naming of Greenland. The most popular is based on Erik the
Red, one of the Vikings, who were settled in different parts of Northern Europe. He was exiled to
this place because it was uninhabitable due to its severe climatic conditions; later, when he was
set free, he wanted to stay there along with others. To convince people to join him, he portrayed
the place as a green land. But the scientific uncovering of the history of this landmass points to
a different outcome. There is evidence that this region was as warm as, if not warmer than, today
when Europeans started to settle there. This period coincides with the medieval climatic anomaly,
which was marked by a warmer period in some regions of the North Atlantic and parts of Europe;
this warming was not global. Therefore, the whole of Greenland was not covered by the ice sheet;
some portions were warmer, green (had vegetation cover) and habitable, although most of it was
under ice cover. This indicates that, within a span of 1000 years, the habitable land of Greenland
was covered by a white sheet.
Figure 3.4 Strait of Bosporus
Studies have revealed that ancient soil has been preserved in the basal ice for millions of years.
The extracted sediment is found to be a long-lived, mineral-rich, organic subsoil from a boreal
region. Fossils suggest that a partially forested tundra landscape (similar to the Alaskan tundra)
existed before the growth of the current ice sheet. Ancient preglacial regolith has also been found
on Baffin Island, Arctic Canada.
3.5 Climatic Models
Many climatic models have been developed by various organizations across the globe, and they claim
to incorporate the climate attributes along with the climate variability. But it has been found that
these models fail to capture reality, as they ignore fundamental and intrinsic processes that
dominate the climate. There is no single GCM that is superior in representing the entire climate.
These models provide future projections by taking the release of CO2 into the atmosphere by
anthropogenic activities as their fundamental basis. In the previous section it was seen that it is
not human activities that govern the carbon cycle but physical processes that affect it on a bigger
scale. The lacunae in the climatic models are discussed in the next section.
3.5.1 GCMs from alternate point of view
The Intergovernmental Panel on Climate Change (IPCC) claims that, with the knowledge of GCMs,
one can understand the entire climate system and its attributes. Moreover, both past and future
simulations can be made available for long durations with great confidence. GCMs have been
used extensively for hydrological studies after downscaling them to the desired level. But
the reality is that there are many loopholes in GCMs and they do not paint the true picture of
the global climate. They fail to capture the entire underlying physics and represent only a
minuscule part of reality.
Earlier, General Circulation Models were known for their ability to capture the characteristics of
the general circulation of the planetary atmosphere or ocean. But over the last few decades they
have been associated with the global climate and are claimed to be climatic models that are able to
portray the global climate efficiently. Since then they have been popularly known as “Global
Climatic Models”. This modified nomenclature took the whole concept of the GCM to another level.
They were not initially meant for forecasting. Things changed so drastically that, after this, the
notion of climate change was linked with them. It was asserted that these models can project
climate change. To account for climate change in the GCMs, various scenarios have been developed
in which anthropogenic activities are taken into consideration in terms of greenhouse gas emissions.
The extended use of GCMs by the climate community, beyond the purpose for which they were
initially designed, has been criticized (Wilby, 2010). This extended use has not been supported by
any peer review. Kundzewicz (2011) raised a pertinent question, “Are climatic models ready for
the prime time or more research is needed?”, i.e. can they be used in real applications in the water
management sector and in infrastructure planning and design? They are not capable of serving
the needs of adaptation-type activities.
Another issue of concern is that much of the literature currently available on GCM downscaling
stresses the determination of the impact of climate change based on downscaled information,
ignoring the fact that downscaling techniques, which act as a bridge between the coarse GCM scale
and locally observed information, were introduced just to deal with the scale issues for more
refined insights.
The study explores the credibility of General Circulation Models (GCMs), which are considered
to be the touchstone for climatic downscaling. There are many lacunae in these models, which
raise questions about the very concept of downscaling them.
GCMs do not incorporate sub-grid features such as topography, land surface processes, land use
patterns and cloud physics; downscaling techniques are used to counter these discrepancies and to
deal with the scales. These models overlook many physical phenomena and feedback mechanisms
that have a direct influence on the climate. Uncertainty about such poorly understood phenomena
further keeps them out of the ambit of the climatic models; hence GCMs fail to capture and portray
reality. Such physical phenomena also include processes that occur at small scales, like
thunderstorms, which are regional events producing a flow of energy into the atmosphere but
are difficult to account for. Even if it is possible to include the higher resolution processes, the
concern of boundary interaction arises, leading to distortion in the models.
The flow of energy at the interfaces between the land, atmosphere and oceans is too complex and
has not yet been completely understood by scientists; this needs to be addressed to avoid
questions about the credibility of GCMs. Its effect on the climate is not trivial and hence
cannot be ignored because of a lack of understanding (Idso et al., 2009).
There are various other essential inputs that need to be addressed in the formulation of the GCMs,
like the geomagnetism of the earth, the salinity level in the oceans, and changes in the orientation
of the earth’s axis of rotation and the earth’s orbit. They all contribute significantly and
influence the climate on the global scale.
The role of bio-physical-chemical cycles, which influence climate over long time periods, is still
very much under development in the models. Ice cover and vegetation cover are other components
that have only recently been incorporated in the models.
Inadequate information on the climatic variables makes it difficult to define the initial conditions
for the models. While defining the initial conditions, a small perturbation, even at the fifth
decimal place or so, can lead to bifurcations and chaotic behavior in a simple dynamic model. Here
we are dealing with the gargantuan dynamic phenomenon of climate, so these anomalies are even more
pronounced in character.
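The sensitivity to initial conditions described above can be illustrated with a very simple dynamic model. The sketch below uses the logistic map, which is an assumption chosen purely for illustration and is not part of this study; the function name and the parameter values are hypothetical.

```python
def logistic_trajectory(x0, r=4.0, steps=50):
    """Iterate the logistic map x -> r * x * (1 - x) starting from x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two initial conditions that differ only at the fifth decimal place.
a = logistic_trajectory(0.20000)
b = logistic_trajectory(0.20001)

# Absolute separation between the two trajectories at each step.
separation = [abs(p - q) for p, q in zip(a, b)]
max_separation = max(separation)

print(f"initial separation: {separation[0]:.1e}")
print(f"largest separation within 50 steps: {max_separation:.3f}")
```

The two runs start almost identically, yet their separation grows to the same order as the values themselves within a few dozen iterations.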
Computational errors also hamper the performance of these models. The mathematical equations
used in the formulation of these models are complex and often have a multiplier effect on the
errors.
In the evaluation of the climatic models, simulated data is compared with observed data, which
is not available for uninhabited areas. Wherever the data is available it is very limited, so certain
adjustments are made for the evaluation of the models, which creates an additional source of
uncertainty. Large differences are seen between the observed data and the data simulated by these
models.
Although many new processes have been included and improvements have been made to the models in
the AR5 report, they are still not in a state to deliver what is expected from them.
4. STUDY AREA AND DATA DESCRIPTION
4.1 Study Area
The study area is the catchment of the Krishna River, which is located between 73°17’ and 81°9’
east longitude and 13°10’ and 19°22’ north latitude. The location of the Krishna river basin is
shown in Figure 4.1.
Figure 4.1 Basin map of Krishna River (India-WRIS, 2015)
The Krishna Basin covers a total area of 258,948 sq km, of which 69,425 sq km lies in Maharashtra,
113,271 sq km in Karnataka and 76,252 sq km in Andhra Pradesh.
The Krishna River rises in the Western Ghats near Jor village in the Satara district of Maharashtra
and flows for about 1400 km before its outfall into the Bay of Bengal. The Ghataprabha, the
Malaprabha, the Bhima, the Tungabhadra, the Muneru and the Musi are the principal tributaries. The
major part of the basin is agricultural land, accounting for 75.86% of the total area, and 4.07% of
the basin is covered by water bodies.
The climate of the drainage basin is dominated by the southwest monsoon, which sets in in the
middle of June, withdraws by the middle of October and is the major source of precipitation for the
whole region. During this period, the basin receives about 80% of its total annual rainfall. The
annual rainfall in the Krishna basin varies from 600 mm to 3048 mm. About 90% of annual rainfall is
received during the Monsoon period, of which more than 70% occurs during July, August and
September (WRIS, 2015).
An average annual surface water potential of 78.1 km³ has been assessed in this basin, of which
58.0 km³ is utilisable. In addition, an average annual ground water potential of 26.41 km³ is
available. Cultivable land in the basin is about 203,000 km², which is 75.86% of the basin area and
10.4% of the total cultivable area of the country.
The major hydropower stations in the basin are Koyna, Tungabhadra, Srisailam, Nagarjuna
Sagar, Almatti, Narayanpur and Bhadra.
4.2 Data Description
The latest AR5 report of the IPCC provides various GCMs, out of which CanCM4, with a grid size of
2.8° X 2.8°, developed by the Canadian Centre for Climate Modelling and Analysis, has been chosen
based on the skill score as described earlier. It gives historical data from 1961 to 2005 as well as
future simulations under the RCP4.5 emission scenario from 2006 to 2035. The data was extracted
to cover the entire Krishna basin with 20 grid points.
GCM outputs are available at a coarse grid scale that ranges between 250 km and 600 km. GCMs are
mathematical models developed by considering the physics involved in land, ocean and atmospheric
processes in the form of a set of linear and nonlinear partial differential equations. They project
climatic variables globally at coarse resolution.
In the fifth assessment report (AR5) of the IPCC, four Representative Concentration Pathways (RCPs)
were given, defined by their total radiative forcing (a cumulative measure of human emissions
of GHGs from all sources, expressed in watts per square meter) pathway and level by 2100, as shown
in Figure 4.2. The Canadian GCM considers RCP4.5 as the future scenario, which represents a
stabilization-without-overshoot pathway to 4.5 W/m² at stabilization after 2100.
Figure 4.2 Future scenarios developed in AR5 report
The variables, or predictors, provided by the CanCM4 GCM include atmospheric, land, oceanic and
ice variables; among them, the atmospheric variables have been considered here. There are many
atmospheric variables, and each of them is available at 22 pressure levels: 1000, 925, 850, 700,
600, 500, 400, 300, 250, 200, 150, 100, 70, 50, 30, 20, 10, 7, 5, 3, 2 and 1 (in millibar).
Reanalysis data of the monthly mean atmospheric variables prepared by the National Centers for
Environmental Prediction-National Center for Atmospheric Research (NCEP-NCAR) is extracted
for the period from January 1969 to December 1993 (300 months) for twenty grid points, with
latitude ranging from 12.5°N to 20°N and longitude ranging from 72.5°E to 82.5°E, covering the
entire Krishna basin. The resolution of the NCEP data is 2.5° X 2.5°. The
spatial resolution is approximately 417 X 278 km.
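As a quick check of the extraction window described above, the following minimal sketch enumerates the 2.5° grid nodes and counts the months in the calibration record. The helper name `grid_axis` is hypothetical and the code is only an illustration of the arithmetic, not the extraction procedure used in the study.

```python
def grid_axis(start, stop, step):
    """Return evenly spaced coordinates from start to stop, both inclusive."""
    count = int(round((stop - start) / step)) + 1
    return [start + i * step for i in range(count)]


lats = grid_axis(12.5, 20.0, 2.5)   # degrees north
lons = grid_axis(72.5, 82.5, 2.5)   # degrees east

# Every (latitude, longitude) node of the NCEP window over the basin.
grid_points = [(lat, lon) for lat in lats for lon in lons]

# Monthly records from January 1969 to December 1993, inclusive.
n_months = (1993 - 1969 + 1) * 12

print(len(lats), "latitudes x", len(lons), "longitudes =", len(grid_points), "points")
print(n_months, "months")
```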
NCEP data is obtained by data assimilation from various sources such as:
Global rawinsonde data
Comprehensive ocean-atmospheric data set that includes surface marine data
Aircraft data
Surface land synoptic data
Satellite sounder data
Special sensing microwave/imager surface wind speeds
Satellite cloud drift winds
All this data is assimilated through spectral statistical interpolation, which is a 3D variational
analysis scheme. NCEP/NCAR data is used for bias correction, calibration and validation of the
downscaling model.
Predictors are available at 17 different pressure levels, 1000, 925, 850, 700, 600, 500, 400, 300,
250, 200, 150, 100, 70, 50, 30, 20 and 10 (in mb), and some of them are at the surface level, like
surface temperature and pressure at mean sea level.
The India Meteorological Department (IMD) gridded data for the basin is used for calibrating and
validating the models at finer resolution. The precipitation data, available at a grid size of
0.25° X 0.25° in daily format, was converted to monthly values. The IMD data is
extracted for January 1969 to December 2004. Similarly, the IMD temperature data, which is at a
grid size of 1° X 1°, has been used for the same duration.
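The daily-to-monthly conversion mentioned above can be sketched as follows. Whether the monthly precipitation was formed as a total or as a mean is not stated here, so both options are shown as assumptions; the function name `daily_to_monthly` and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def daily_to_monthly(dates, values, how="sum"):
    """Aggregate daily values into one value per (year, month)."""
    buckets = defaultdict(list)
    for d, v in zip(dates, values):
        buckets[(d.year, d.month)].append(v)
    if how == "sum":
        return {k: sum(v) for k, v in sorted(buckets.items())}
    if how == "mean":
        return {k: sum(v) / len(v) for k, v in sorted(buckets.items())}
    raise ValueError("how must be 'sum' or 'mean'")


# Hypothetical daily series: 1 mm of rain on every day of January-February 1969.
start = date(1969, 1, 1)
dates = [start + timedelta(days=i) for i in range(59)]
rain = [1.0] * 59

monthly_total = daily_to_monthly(dates, rain, how="sum")
monthly_mean = daily_to_monthly(dates, rain, how="mean")
print(monthly_total)
print(monthly_mean)
```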
Only variables common to both the GCM and the NCEP data are selected, so 70 variables are taken at
each grid point. Predictors at 17 different pressure levels, 1000, 925, 850, 700, 600, 500, 400,
300, 250, 200, 150, 100, 70, 50, 30, 20 and 10 (in millibar), are considered; these are
Air Temperature (TA), Eastward Wind (UA), Northward Wind (VA) and Geo-potential Height
(ZG). The surface-level predictors are Surface Temperature (TS) at 2 meters above the ground
level and Sea Level Pressure (PSL), as shown in Table 4.1.
Table 4.1 List of potential atmospheric predictors considered in the study
S.no   Predictor                      Pressure levels (millibar)
1      Atmospheric Temperature (TA)   1000, 925, 850, 700, 600, 500, 400, 300, 250, 200, 150, 100, 70, 50, 30, 20, 10
2      Eastward Wind (UA)             1000, 925, 850, 700, 600, 500, 400, 300, 250, 200, 150, 100, 70, 50, 30, 20, 10
3      Northward Wind (VA)            1000, 925, 850, 700, 600, 500, 400, 300, 250, 200, 150, 100, 70, 50, 30, 20, 10
4      Geo-potential Height (ZG)      1000, 925, 850, 700, 600, 500, 400, 300, 250, 200, 150, 100, 70, 50, 30, 20, 10
5      Sea Level Pressure (PSL)       At sea level
6      Surface Temperature (TS)       At 2 m from ground level
In this study, NCEP monthly data from January 1969 to December 1993 is used for calibrating the
models. Another 11 years of monthly GCM data from January 1994 to December 2004 is used for
validating the models. The models are developed based on the available NCEP variables and applied
to the corresponding GCM inputs to simulate mean monthly precipitation and temperature for the
validation period.
The downscaling of precipitation and temperature is performed at five locations (Table 4.2
and Table 4.3) coinciding with IMD stations (named A, B, C, D and E for convenience), and data only
for these points is extracted from the basin data, as shown in Figure 4.3 and Figure 4.4 respectively.
Downscaling has been carried out by considering the four NCEP points surrounding each
downscaling point. Therefore the total number of variables available for each downscaling point
is 280 (4 times 70).
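The predictor count above can be reproduced with a short sketch. It assumes that the "surrounding four NCEP points" are the corners of the 2.5° grid cell containing the downscaling point, which is one plausible reading and is not confirmed by the text; the function names and the predictor labels such as `TA850` are hypothetical.

```python
import math

LEVELS = [1000, 925, 850, 700, 600, 500, 400, 300, 250,
          200, 150, 100, 70, 50, 30, 20, 10]          # millibar
LEVEL_VARS = ["TA", "UA", "VA", "ZG"]                  # available at every level
SURFACE_VARS = ["PSL", "TS"]                           # single-level predictors


def predictor_names():
    """One label per predictor available at a single grid point."""
    names = [f"{var}{lev}" for var in LEVEL_VARS for lev in LEVELS]
    return names + SURFACE_VARS


def surrounding_points(lat, lon, step=2.5, lat0=12.5, lon0=72.5):
    """Corners of the grid cell containing (lat, lon); assumed selection rule."""
    i = math.floor((lat - lat0) / step)
    j = math.floor((lon - lon0) / step)
    return [(lat0 + (i + di) * step, lon0 + (j + dj) * step)
            for di in (0, 1) for dj in (0, 1)]


corners = surrounding_points(18.5, 74.0)       # precipitation point A
n_per_point = len(predictor_names())           # 4 variables x 17 levels + 2
n_total = n_per_point * len(corners)
print(corners)
print(n_per_point, "predictors per grid point,", n_total, "in total")
```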
Table 4.2 Location of downscaling points for the precipitation
Point   Latitude (North)   Longitude (East)
A       18.5°              74°
B       16.5°              74°
C       16.5°              76.5°
D       16.5°              79°
E       13.5°              76.5°
Table 4.3 Location of downscaling points for the temperature
Point   Latitude (North)   Longitude (East)
A       18.5°              73.5°
B       16.5°              74.5°
C       16.5°              76.5°
D       16.5°              79.5°
E       13.5°              76.5°
Figure 4.3 Downscaling points for the precipitation on Krishna Basin
Figure 4.4 Downscaling points for the temperature on Krishna Basin
5. METHODOLOGY
Several mathematical tools and techniques like Wavelet transform, entropy, K means clustering,
Principal Component Analysis (PCA) and Multiple Linear Regression (MLR) were employed in
this study to develop the downscaling models. This section provides a brief introduction of these
techniques.
5.1 Continuous Wavelet transform and Multi-scale Wavelet Entropy
A wavelet transform involves convolving the signal/ time series against particular instances of a
wavelet at various time scales and positions. A wavelet is a small wave which oscillates and decays
in the time domain and the wavelets which have strictly finite extent in the time domain are known
as discrete wavelets. There are a variety of wavelets that can be used but to be admissible as a
wavelet, the function should have zero mean and be localized in both time and frequency space.
The continuous wavelet transform (CWT) is implemented to perform these convolutions at every
position and scale. Mathematically, the CWT of a time series, $f(t)$, with respect to a mother
wavelet, $\psi(t)$, is defined as the sum over all time of the signal multiplied by the scaled and
shifted version of the mother wavelet $\psi(t)$:
$$W_{a,b} = \int_{-\infty}^{\infty} f(t)\, \psi^{*}_{a,b}(t)\, dt, \qquad \psi_{a,b}(t) = \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{t-b}{a}\right) \qquad (1)$$
Where $W_{a,b}$ is the CWT coefficient for scale $a$ and location $b$ of the function $f(t)$. The
conjugate wavelet basis functions, $\psi^{*}_{a,b}(t)$, are derived from a common mother wavelet
function, $\psi(t)$, by scaling (or dilating) it by $a$ and translating it by $b$. The squared
wavelet coefficients are plotted for each scale to facilitate a more comprehensive analysis of the
data. The plot of the amplitude squared spectrum $|W_{a,b}|^{2}$ is called the wavelet power spectrum.
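A minimal numerical sketch of the CWT and its power spectrum is given below. It replaces the integral by a direct sum over the samples and uses the Mexican hat wavelet, which is an assumption made for illustration since the mother wavelet is not specified at this point; the function names and the test signal are hypothetical.

```python
import math


def mexican_hat(u):
    """Mexican hat (Ricker) mother wavelet: real-valued with zero mean."""
    return (1.0 - u * u) * math.exp(-0.5 * u * u)


def cwt(signal, scales):
    """Approximate W[a][b] = a**-0.5 * sum_t f(t) * psi((t - b) / a)."""
    n = len(signal)
    coeffs = []
    for a in scales:
        row = []
        for b in range(n):
            total = sum(signal[t] * mexican_hat((t - b) / a) for t in range(n))
            row.append(total / math.sqrt(a))
        coeffs.append(row)
    return coeffs


# Hypothetical test signal: a sine wave with a period of 16 time steps.
signal = [math.sin(2.0 * math.pi * t / 16.0) for t in range(128)]
scales = [1, 2, 4, 8, 16]

W = cwt(signal, scales)
power = [[w * w for w in row] for row in W]            # |W_{a,b}|^2
mean_power = [sum(row) / len(row) for row in power]    # averaged over position b
best_scale = scales[mean_power.index(max(mean_power))]
print("scale with the largest mean power:", best_scale)
```

Because the wavelet is real-valued, the complex conjugate in Equation (1) has no effect here. The direct double loop is slow for long series; it is written this way only to mirror the definition.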
This study uses the concept of Multi-scale Wavelet Entropy (MWE) proposed by Agarwal et al.
(2015) as a substitute for the wavelet power spectrum to represent significant trends and their
variations in a time series. The concept is to calculate the complexity of a time series at
different time-frequency scales using the Shannon entropy measure (Shannon, 1948), which is defined as
$$H(x) = -\sum_{i=1}^{n} P(x_i) \log P(x_i) \qquad (2)$$
Where $P(x)$ is the probability density function (PDF) used to describe the random behavior of
variable $x$ of length $n$.
The CWT-based PDF $P_{a,b}$ can be defined as:
$$P_{a,b} = \frac{E_{a,b}}{E_{a}}, \qquad E_{a,b} = |W_{a,b}|^{2}, \qquad E_{a} = \sum_{b} E_{a,b} \qquad (3)$$
Where $E_{a,b}$ represents the wavelet energy at time position $b$ and time scale $a$, and $E_{a}$
represents the total wavelet energy of the time series at time scale $a$ (Sang et al., 2011; Cek et al.,
2010).
As the entropy is calculated for the CWT coefficients at multiple resolution levels, it is referred
to as Multi-scale Wavelet Entropy. In principle, entropy is a measure of the statistical variability
of the random variable $x$ as described by the PDF. $H(x)$ is a measure of the information content
in the signal at a given time-frequency resolution; more information corresponds to a lower entropy
value and vice versa. Therefore a high value of entropy represents high unpredictability and hence
a highly complicated and disordered system.
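The entropy calculation described above can be sketched as follows, assuming that the wavelet energy is the squared coefficient and that the natural logarithm is used; both are assumptions, as are the function name and the two hypothetical coefficient rows used for the demonstration.

```python
import math


def multiscale_wavelet_entropy(coeffs):
    """Shannon entropy of the normalised wavelet energy, one value per scale.

    coeffs[a][b] is the CWT coefficient at scale index a and position b.
    """
    entropies = []
    for row in coeffs:
        energy = [w * w for w in row]      # energy at each position
        total = sum(energy)                # total energy at this scale
        if total == 0.0:
            entropies.append(0.0)          # no energy, nothing to distribute
            continue
        probs = [e / total for e in energy if e > 0.0]
        entropies.append(-sum(p * math.log(p) for p in probs))
    return entropies


# Hypothetical coefficient rows standing in for two scales of a CWT.
uniform = [1.0, -1.0, 1.0, -1.0]   # energy spread evenly over all positions
peaked = [0.0, 2.0, 0.0, 0.0]      # energy concentrated at one position

mwe = multiscale_wavelet_entropy([uniform, peaked])
print(mwe)
```

The evenly spread row gives the largest possible entropy for four positions, while the concentrated row gives zero, which matches the reading of low entropy as orderliness.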
To illustrate the concept of multi-scale wavelet entropy, a synthetic time series is analyzed for
its wavelet power spectrum and its entropy at multiple time-frequency scales. The synthetic series
is obtained through a linear combination of a stationary time series, a linear component, a
non-linear signal and random noise in the range 0 to 10. These signals are described mathematically
by Equations (4) to (8).
The resultant synthetic time series, its wavelet power spectrum and its MWE are provided in
Figure 5.1 (a), (b) and (c) respectively. It is evident that MWE is sensitive to the features of the
time series. The MWE plot indicates that the signal has dominant, though inconsistent, features
around periods of 16, 32 and 128. The dips in the MWE plot around the 30th and 50th periods indicate
that strong and consistent features exist in the signal around these periods (each belonging to one
of the component signals). Lower values of entropy indicate orderliness and higher values of entropy
indicate variability.
Figure 5.1 (a) Synthetic time series (b) Wavelet spectrum and (c) Multi-Scale entropy for synthetic time series
5.2 Discrete Wavelet Transform
Determining wavelet coefficients at every possible scale is an enormous task. Normally, the discrete
wavelet transform (DWT) uses a dyadic scheme of wavelet decomposition in which alternate scales and
positions are adopted for calculating the transform coefficients, thereby reducing the computational
burden. The DWT achieves time-frequency localization and multi-scale resolution of a
signal by suitably focusing and zooming around the neighborhood of one's choice (Mallat, 1999).
For a discrete time series $f(t)$ with integer time steps, the DWT in the dyadic decomposition
scheme is defined as
$$W_{m,n} = 2^{-m/2} \sum_{t=0}^{N-1} f(t)\, \psi\!\left(2^{-m} t - n\right) \qquad (9)$$
Where $W_{m,n}$ is the discrete wavelet coefficient for scale $a = 2^{m}$ and location $b = 2^{m} n$,
$m$ and $n$ being integers; $N$ is the data length of the time series, which is an integer power of 2,
i.e., $N = 2^{M}$. This gives the ranges of $m$ and $n$ as $0 \le n \le 2^{M-m} - 1$ and
$1 \le m \le M$, respectively. This implies that only one wavelet is needed to cover the time
interval, producing only one coefficient, at the largest scale (i.e., $2^{m}$ where $m = M$). At the
next scale ($2^{m-1}$), two wavelets would cover the time interval, producing two coefficients, and
so on down to $m = 1$. Thus, the total number of coefficients generated by the DWT for a discrete
time series of length $N = 2^{M}$ is $1 + 2 + 4 + \dots + 2^{M-1} = N - 1$ (Nourani et al., 2009).
The process consists of a number of successive filtering steps in which the time series is
decomposed into an approximation (A) and detail sub-time series or wavelet components (D1,
D2, D3, etc.), as shown in Figure 5.2. The approximation component represents the slowly changing
coarse features of a time series and is obtained by correlating a stretched version (low-frequency
and high-scale) of a wavelet with the original time series, while the detail components
signify rapidly changing features of the time series and are obtained by correlating a compressed
wavelet (high-frequency and low-scale) with the original time series (Sehgal et al., 2014).
Figure 5.2 Wavelet Decomposition
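The dyadic decomposition into approximation and detail components can be sketched with the Haar wavelet, the simplest orthonormal choice. The Haar wavelet is an assumption made for illustration and is not necessarily the wavelet used in this study; the function name and the sample series are hypothetical.

```python
import math


def haar_dwt(series):
    """Full dyadic Haar decomposition of a series whose length is a power of 2.

    Returns (approximation, details), where details[0] is D1 (finest scale).
    """
    n = len(series)
    if n < 2 or n & (n - 1):
        raise ValueError("length must be a power of 2")
    approx = list(series)
    details = []
    root2 = math.sqrt(2.0)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(x - y) / root2 for x, y in pairs])   # high-pass: detail
        approx = [(x + y) / root2 for x, y in pairs]          # low-pass: approximation
    return approx, details


series = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]   # N = 8 = 2**3
approx, details = haar_dwt(series)

n_detail_coeffs = sum(len(d) for d in details)
print("coefficients per level:", [len(d) for d in details])
print("total detail coefficients:", n_detail_coeffs, "= N - 1")
```

For a series of length 8 the levels hold 4, 2 and 1 detail coefficients, which illustrates the N - 1 count derived above, with one remaining approximation coefficient.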
5.3 K means clustering
K means is one of the most popular clustering algorithms in the literature. The variables are
classified into K clusters, each of which is represented by its centroid, which is the mean
(weighted or otherwise) of the feature vectors within the cluster. If $N_k$ represents the number of
feature vectors in cluster $k$ and $\mu_k$ is the mean of cluster $k$, then the centroid of each
cluster is defined as
$$\mu_k = \frac{1}{N_k} \sum_{q=1}^{N_k} x_q \qquad (10)$$
where $x_q$ is the $q$th feature vector in cluster $k$.
The algorithm starts with a predefined number of clusters, whose initial centers are chosen
according to some criterion or heuristic procedure. In every iteration, each feature vector is
assigned to its nearest cluster center according to the Euclidean distance between the two, and the
cluster centers are then re-calculated (Rokach et al., 2005). The algorithm converges when a defined
stopping criterion is met, for example when re-allocating the cluster centroids no longer reduces
the partitioning error, which indicates that the solution is locally optimal. More detailed
information about K means clustering can be obtained from Ball and Hall (1967), MacQueen (1967)
and Kanungo et al. (2002).
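The assignment and update steps described above can be sketched as below. The stopping rule used here (stop when the assignments no longer change) is one common choice and is an assumption, as are the function name, the initial centers and the sample points.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Basic K means with Euclidean distance; `centroids` are the initial centers."""
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each feature vector goes to its nearest center.
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:       # assignments stable: locally optimal
            break
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical feature vectors forming two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```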
5.4 Principal Component Analysis
In PCA, an orthogonal transformation is applied to a set of correlated predictor variables to
produce principal components. The principal components are fewer in number than the original
variables and are uncorrelated with one another, i.e. PCA reduces dimensionality and
multi-collinearity. These components carry almost the same variability as the original data.
Although this approach works well and has been used widely, it has some limitations. The
components are completely different from the original variables, so it is not possible to make out
which of the original decomposed variables corresponds best with the observed data.
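The way a single principal component can carry most of the variability of correlated predictors is illustrated by the sketch below. It extracts only the leading component by power iteration on the covariance matrix, which is a simplification chosen to keep the example self-contained; the function names and the three hypothetical predictors are assumptions.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix and mean-centred data for rows of observations."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
           for i in range(p)]
    return cov, centred


def leading_component(cov, iters=500):
    """Power iteration: dominant eigenvector and eigenvalue of `cov`."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    return v, eigenvalue


# Hypothetical predictors: x2 is nearly 2 * x1, x3 is small independent scatter.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
x3 = [0.5, 0.4, 0.6, 0.5, 0.4, 0.6]
rows = list(zip(x1, x2, x3))

cov, centred = covariance_matrix(rows)
pc1, var1 = leading_component(cov)
explained = var1 / sum(cov[i][i] for i in range(len(cov)))
scores = [sum(c * w for c, w in zip(row, pc1)) for row in centred]
print(f"share of variance carried by the first component: {explained:.4f}")
```

The loadings in `pc1` mix all three original variables, which is the interpretability limitation noted above.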
5.5 Models used in the study
5.5.1 Multiple Linear Regression (MLR)
Multiple linear regression (MLR) is a statistical approach that is used to model a linear relationship
between a dependent variable (predictand) and one or more independent variables (predictors).
MLR is a least squares based method and assumes that the relationship between the variables is linear.
The MLR model can therefore be expressed as a linear function:
$$ y = C + \sum_{i=1}^{k} \beta_i x_i \tag{11} $$
where y is the value of the predictand, $x_i$ is the value of the ith predictor variable (i = 1, ..., k), C is
the intercept and $\beta_i$ is the adjustable coefficient of the ith predictor variable. Multiple linear
regression attempts to find a best-fit plane. The fit can be evaluated by the coefficient of multiple
determination ($R^2$). The multiple correlation coefficient (R) expresses the degree to which the
predictors, taken together, are related to the predictand.
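A least squares fit of this linear form, together with $R^2$, can be sketched as below. The helper name `fit_mlr` and the noise-free synthetic data are assumptions for illustration.

```python
import numpy as np

def fit_mlr(x, y):
    """Least squares fit of y = C + sum_i beta_i * x_i; returns (C, beta, r2)."""
    design = np.column_stack([np.ones(len(x)), x])  # leading column for the intercept
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return coef[0], coef[1:], 1.0 - ss_res / ss_tot

rng = np.random.default_rng(1)
x = rng.normal(size=(120, 2))
y = 1.5 + 2.0 * x[:, 0] - 0.5 * x[:, 1]  # synthetic predictand without noise
c, beta, r2 = fit_mlr(x, y)
```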
5.5.2 Second order Volterra (SoV) Model
The decomposed time series of the various atmospheric variables from the NCEP/GCM data form
the input variables for the model. From these input variables, those which have a significant lagged
cross correlation with the rainfall time series were identified and were then integrated using the
second order Multiple Input Single Output (MISO) Volterra model to provide the rainfall at the
lower resolution. For details about the MISO Volterra the readers are referred to Maheswaran and
Khosa (2012).
Let $D_i^{X}$, $i = 1, 2, \ldots, J$, denote the detail components of the wavelet decomposition of a certain
input variable X, and let $A_J^{X}$ denote the approximation component of the wavelet decomposition of
the same input variable X.
From the wavelet coefficients of the different input time series, the significant wavelet coefficients
are selected based on the lag correlation with the observed precipitation and temperature time
series. For example, some of these may be denoted by
$D_i^{pressure}(t), D_i^{windvel}(t), \ldots, D_i^{temperature}(t)$
and similarly the significant scaling coefficients at the decomposition level J of the different input
series may be denoted by
$A_J^{pressure}(t), A_J^{windvel}(t), \ldots, A_J^{temp}(t)$,
where i denotes the depth of decomposition, which varies from 1 to J.
Now, the significant wavelet coefficients and scaling coefficients of the different input series are
nonlinearly convolved using the second order Volterra representation within a multiple-input
single-output framework. For simplicity of notation, let these different series be denoted by
$u_1, u_2, \ldots, u_L$, where L is the total number of inputs.
If y(t) denotes the precipitation or temperature time series, L denotes the number of input variables,
N is the length of the time series, m denotes the memory of each input variable up to which there
is a significant lag relationship with the rainfall time series, and $\varepsilon_t$ represents the model noise,
including modelling errors and unobservable disturbances, the multi-scale nonlinear
relationship may be written as
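$$
\begin{aligned}
y(t) ={}& \sum_{n=1}^{L}\sum_{\tau=1}^{m} h_1^{(n)}(\tau)\,u_n(t-\tau) \\
&+ \sum_{n=1}^{L}\sum_{\tau_1=1}^{m}\sum_{\tau_2=1}^{m} h_{2s}^{(n)}(\tau_1,\tau_2)\,u_n(t-\tau_1)\,u_n(t-\tau_2) \\
&+ \sum_{n_1=1}^{L}\sum_{n_2=1}^{n_1-1}\sum_{\tau_1=1}^{m}\sum_{\tau_2=1}^{m} h_{2x}^{(n_1,n_2)}(\tau_1,\tau_2)\,u_{n_1}(t-\tau_1)\,u_{n_2}(t-\tau_2) + \varepsilon_t
\end{aligned}
\tag{12}
$$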
First order kernels $h_1^{(n)}$ describe the linear relationship between the nth input $u_n$ and y, the second
order self-kernels $h_{2s}^{(n)}$ describe the 2nd order nonlinear relation between the nth input $u_n$ and y,
and the second order cross-kernels $h_{2x}^{(n_1,n_2)}$ describe the 2nd order nonlinear
interactions between each unique pair of inputs ($u_{n_1}$ and $u_{n_2}$) as they affect y.
Eq. (12) can be simplified by combining the last two terms to yield Eq. (13) and it now remains
to estimate kernels h1 and h2.
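$$
\begin{aligned}
y(t) ={}& \sum_{n=1}^{L}\sum_{\tau=1}^{m} h_1^{(n)}(\tau)\,u_n(t-\tau) \\
&+ \sum_{n_1=1}^{L}\sum_{n_2=1}^{L}\sum_{\tau_1=1}^{m}\sum_{\tau_2=1}^{m} h_2^{(n_1,n_2)}(\tau_1,\tau_2)\,u_{n_1}(t-\tau_1)\,u_{n_2}(t-\tau_2) + \varepsilon_t
\end{aligned}
\tag{13}
$$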
The representation of Eq. (13) can be further simplified by considering each of the lagged
variables $u_1(t-1), \ldots, u_1(t-\tau), \ldots, u_2(t-1), \ldots, u_2(t-\tau), \ldots$ as separate variables
$d_1(t), d_2(t), d_3(t), \ldots, d_{N_l}(t)$; Eq. (13) can then be written as
$$ y(t) = \sum_{l=1}^{N_l} h_1(l)\,d_l(t) + \sum_{l_1=1}^{N_l}\sum_{l_2=1}^{N_l} h_2(l_1,l_2)\,d_{l_1}(t)\,d_{l_2}(t) \tag{14} $$
More clearly,
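$$ d_l(t) = x_k(t), \qquad 1 \le k \le L;\; 1 \le l \le L $$
$$ d_l(t) = x_k(t-\tau), \qquad 1 \le k \le L;\; L < l \le N_l;\; \tau = 1, 2, 3, \ldots, m $$
where $x_k$ is the kth predictor time series, $\tau$ is the lagged value, L is the total number of predictor
time series and $N_l$ is the total number of lagged variables.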
Using the Orthogonal Least Squares-Error Reduction Ratio (OLS-ERR) method of Chen and
Billings (1989), the significant regressor terms were selected and the corresponding kernels were
estimated. The complete mathematical derivation of the Wavelet Volterra coupled model can be
found in Maheswaran and Khosa (2011a,c). The programs were coded and executed in
MATLAB 7.6.0.
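The chain from lagged variables to selected second order terms can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the MATLAB code used in the study: the helper names (`lagged_variables`, `volterra_terms`, `ols_err_select`) are hypothetical, the selection is a plain greedy forward search using classical Gram-Schmidt orthogonalization and the error reduction ratio, and the kernels of the selected terms are estimated by ordinary least squares.

```python
import numpy as np
from itertools import combinations_with_replacement

def lagged_variables(inputs, m):
    """Stack u_n(t - tau), tau = 1..m, of every input column as d_1 ... d_Nl."""
    n_time = inputs.shape[0]
    cols = [inputs[m - tau:n_time - tau, n]
            for n in range(inputs.shape[1]) for tau in range(1, m + 1)]
    return np.column_stack(cols)  # rows correspond to t = m ... n_time - 1

def volterra_terms(d):
    """Linear terms d_l followed by second order products d_l1 * d_l2 (l1 <= l2)."""
    pairs = list(combinations_with_replacement(range(d.shape[1]), 2))
    products = np.column_stack([d[:, i] * d[:, j] for i, j in pairs])
    names = [(l,) for l in range(d.shape[1])] + pairs
    return np.column_stack([d, products]), names

def ols_err_select(candidates, y, n_terms):
    """Greedy forward selection of regressors by error reduction ratio (ERR)."""
    selected, basis, errs = [], [], []
    yy = y @ y
    for _ in range(n_terms):
        best_err, best_j, best_w = -1.0, None, None
        for j in range(candidates.shape[1]):
            if j in selected:
                continue
            w = candidates[:, j].astype(float)
            for q in basis:  # orthogonalize against the terms already chosen
                w = w - (q @ candidates[:, j]) / (q @ q) * q
            ww = w @ w
            if ww < 1e-12:
                continue  # candidate is numerically redundant
            err = (w @ y) ** 2 / (ww * yy)
            if err > best_err:
                best_err, best_j, best_w = err, j, w
        if best_j is None:
            break
        selected.append(best_j)
        basis.append(best_w)
        errs.append(best_err)
    return selected, errs

rng = np.random.default_rng(2)
u = rng.normal(size=(300, 2))            # two synthetic input series
d = lagged_variables(u, m=2)             # d columns: u1(t-1), u1(t-2), u2(t-1), u2(t-2)
y = 2.0 * d[:, 0] + 0.5 * d[:, 0] * d[:, 2]
terms, names = volterra_terms(d)
selected, errs = ols_err_select(terms, y, n_terms=2)
coef, *_ = np.linalg.lstsq(terms[:, selected], y, rcond=None)
```

In this synthetic case the search recovers the linear term in $d_1$ and the cross term $d_1 d_3$, and the ERR values of the selected terms sum to one because the target contains no noise.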
5.5.3 Artificial Neural Networks (ANN)
Artificial neural networks (ANNs) are information processing systems composed of simple
processing elements (nodes) linked by weighted synaptic connections (Muller and Reinhardt
1991). An important type of ANN, the multilayer feed-forward neural network consists of a set of
sensory units that constitute the input layer, one or more hidden layers of computation nodes and
an output layer of computation nodes. The input signal propagates through the network in a
forward direction, layer by layer. These neural networks are commonly referred to as multilayer
perceptrons. A detailed explanation of the different properties of ANNs is beyond the scope of this
study; interested readers can refer to Haykin (1994) and Bishop (1995) for information on
the properties of ANNs and to Maier and Dandy (2010) for an overview of different applications of
ANNs in water resources variable forecasting (Sehgal et al., 2014).
5.6 Calibration and Validation of the models
Calibration of the models uses the NCEP reanalysis data set. The calibration data for the
models are taken from 1st January 1969 through 31st December 1993, giving a total of 300
months for calibrating the models. Validation data are taken from 1st January 1994 through 31st
December 2004. The validation results are then compared with the observed IMD data.
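The lengths of the two periods follow directly from the dates given above, as this small check shows (the helper name `months_between` is an assumption for the example):

```python
from datetime import date

def months_between(start, end):
    """Number of calendar months from start to end, both inclusive."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1

calibration_months = months_between(date(1969, 1, 1), date(1993, 12, 31))
validation_months = months_between(date(1994, 1, 1), date(2004, 12, 31))
```

This gives 300 calibration months and 132 validation months.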
For projecting future temperature values, the model which is trained above is used. The predictor
set from RCP 4.5 scenario, the period of which starts from 1st January 2006 and ends at 31st
December 2035, is taken.
5.7 Performance indices
The performance indices which are used to evaluate the performance of the developed models in this
study are Normalized Root mean square error (NRMSE), Normalized Mean Absolute Error (NMAE),
Coefficient of Correlation (CC) and Accuracy.
5.7.1 Normalized Root Mean Square Error (NRMSE)
NRMSE is Root Mean Square Error (RMSE) normalized to a scale [0, 1].
RMSE is expressed as
$$ RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(O_i - P_i\right)^2} \tag{15} $$
where $O_i$ and $P_i$ are the observed and estimated precipitation or temperature, $\bar{O}$ is the mean of the
observed precipitation or temperature and n is the number of data points in the series. The closer
the value is to 0, the better the model performance. To facilitate easy comparison of model
performances across stations and models, NRMSE is adopted in this study. NRMSE is expressed
as follows:
$$ NRMSE = \frac{RMSE}{Range} \tag{16} $$
where Range is the difference between the maximum and minimum values of the observed dataset.
5.7.2 Normalized Mean Absolute Error (NMAE)
NMAE is Mean Absolute Error (MAE) normalized to a scale [0, 1].
MAE is expressed as
$$ MAE = \frac{1}{n}\sum_{i=1}^{n}\left|O_i - P_i\right| \tag{17} $$
The closer the value is to 0, the better the model performance. NMAE is expressed as follows:
$$ NMAE = \frac{MAE}{Range} \tag{18} $$
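The two normalized error measures can be computed as in this short sketch, which assumes the standard RMSE and MAE definitions and uses made-up numbers for the example:

```python
import numpy as np

def nrmse(observed, predicted):
    """RMSE divided by the range of the observed values."""
    o, p = np.asarray(observed, float), np.asarray(predicted, float)
    rmse = np.sqrt(np.mean((o - p) ** 2))
    return rmse / (o.max() - o.min())

def nmae(observed, predicted):
    """MAE divided by the range of the observed values."""
    o, p = np.asarray(observed, float), np.asarray(predicted, float)
    mae = np.mean(np.abs(o - p))
    return mae / (o.max() - o.min())

obs = [0.0, 2.0, 4.0]    # illustrative values only
pred = [1.0, 2.0, 3.0]
nrmse_value = nrmse(obs, pred)   # sqrt(2/3) / 4
nmae_value = nmae(obs, pred)     # (2/3) / 4
```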
5.7.3 Coefficient of Correlation (CC)
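CC is expressed as
$$ CC = \frac{\sum_{i=1}^{n}\left(O_i - \bar{O}\right)\left(P_i - \bar{P}\right)}{\sqrt{\sum_{i=1}^{n}\left(O_i - \bar{O}\right)^2 \sum_{i=1}^{n}\left(P_i - \bar{P}\right)^2}} \tag{19} $$
where $\bar{P}$ is the mean of the predicted values. CC ranges from -1 to 1, with values closer to 1 indicating
higher responsiveness of the predicted time series to the observed.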
5.7.4 Accuracy
Discrepancy Ratio (DR) is defined as the logarithm of the ratio of estimated to observed precipitation
or temperature:
$$ DR = \log\left(\frac{P_i}{O_i}\right) \tag{20} $$
It follows that DR = 0 indicates an exact match between predicted and measured values; otherwise,
there is either over-estimation ($O_i < P_i$) or under-estimation ($O_i > P_i$).
In this study, the estimation of precipitation is deemed precise if the simulated values fall within
10% deviation from the observed values (DR between -1 and 1).
In this study, the estimation of temperature is deemed precise if the simulated values fall within
2% deviation from the observed values (DR between -0.2 and 0.2).
Accuracy of the model for precipitation is defined as the percentage of estimated data points in the
validation period falling within the DR range of [-1, 1].
Accuracy of the model for temperature is defined as the percentage of estimated data points in the
validation period falling within the DR range of [-0.2, 0.2].
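A sketch of the correlation, DR and Accuracy calculations is given below. Two points are assumptions rather than statements from the text: the logarithm is taken to base 10, and all values are taken to be strictly positive (the treatment of months with zero observed precipitation is not described here). The sample numbers are made up.

```python
import numpy as np

def correlation(observed, predicted):
    """Pearson correlation between observed and predicted series."""
    return float(np.corrcoef(observed, predicted)[0, 1])

def discrepancy_ratio(observed, predicted):
    """DR = log(P / O); base 10 is an assumption, inputs must be positive."""
    o, p = np.asarray(observed, float), np.asarray(predicted, float)
    return np.log10(p / o)

def accuracy(observed, predicted, band):
    """Percentage of points whose DR lies inside [-band, band]."""
    dr = discrepancy_ratio(observed, predicted)
    return 100.0 * float(np.mean(np.abs(dr) <= band))

obs = np.array([1.0, 10.0, 100.0, 5.0])
pred = np.array([1.0, 100.0, 10.0, 5.0])
dr = discrepancy_ratio(obs, pred)            # 0 where the prediction is exact
acc_half = accuracy(obs, pred, band=0.5)     # two of four points inside the band
```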
6. MODEL ARCHITECTURE
Figure 6.1 shows the flowchart for the proposed models in this study. This section provides a
detailed description of the model development.
Figure 6.1 Flowchart for the proposed downscaling framework
6.1 Standardization
After selecting the potential predictors, the data need to be standardized. Standardization is used
prior to statistical downscaling to reduce systematic biases in the means and variances of GCM
outputs and NCEP/NCAR data (Wilby et al. 2004). The procedure typically involves subtracting
the mean and dividing by the standard deviation of the predictor variable over a predefined baseline
period, for both the NCEP/NCAR data and the GCM output. A major limitation of standardization is
that it considers the bias in only the mean and variance. The reanalysis data and GCM output may
deviate from a normal distribution, so bias may remain in other statistical
parameters.
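The standardization step can be sketched as below. The function name `standardize`, the use of the sample standard deviation (`ddof=1`) and the synthetic series are assumptions for illustration.

```python
import numpy as np

def standardize(series, baseline):
    """z-scores of `series` using the mean and standard deviation of series[baseline]."""
    reference = series[baseline]
    return (series - reference.mean()) / reference.std(ddof=1)

rng = np.random.default_rng(3)
raw = 280.0 + 5.0 * rng.normal(size=432)    # synthetic monthly predictor
z = standardize(raw, slice(0, 300))         # first 300 months taken as baseline
```

Only the first two moments are matched; the shape of the distribution is left unchanged.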
6.2 Interpolation of GCM grid points to NCEP
The GCM grid, which has a resolution of 2.8° x 2.8°, is brought to the NCEP resolution of 2.5° x 2.5°,
and the GCM predictors are estimated on this new grid using interpolation. The new locations of the
GCM grid points, along with their location indices, are shown in Figure 6.2.
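The interpolation method is not specified above. As one possible choice, the following sketch regrids a field bilinearly between two regular grids; the function name `regrid_bilinear` and the synthetic coordinates are assumptions, and the coordinate vectors must be in ascending order.

```python
import numpy as np

def regrid_bilinear(field, src_lat, src_lon, dst_lat, dst_lon):
    """Bilinear regridding of a (lat, lon) field between regular grids."""
    # Interpolate along longitude for every source latitude row ...
    along_lon = np.array([np.interp(dst_lon, src_lon, row) for row in field])
    # ... then along latitude for every target longitude column.
    return np.array([np.interp(dst_lat, src_lat, col) for col in along_lon.T]).T

src_lat = 2.8 * np.arange(6)            # synthetic 2.8 degree grid
src_lon = 70.0 + 2.8 * np.arange(6)
dst_lat = 2.5 * np.arange(6)            # synthetic 2.5 degree grid inside it
dst_lon = 70.0 + 2.5 * np.arange(6)
field = src_lat[:, None] + 2.0 * src_lon[None, :]   # planar test field
regridded = regrid_bilinear(field, src_lat, src_lon, dst_lat, dst_lon)
```

A planar field is reproduced exactly by bilinear interpolation, which makes it a convenient check.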
Figure 6.2 Interpolated GCM points location
6.3 Variable averaging
Each downscaling point is chosen to be close to the center of an NCEP grid cell. The values of each
variable at the 4 vertices of the NCEP grid cell enclosing the downscaling point are observed to be
close to each other. Hence, the values of each variable across the vertices of an NCEP grid cell are
averaged to obtain a single value of the variable for the cell at each pressure level. Figure 6.3 provides
values of TA at 10 millibar pressure level (TA10) at the vertices of the NCEP grid cell surrounding
point A. It can be observed from the figure that the values and the variations in TA10 at NCEP vertices
numbered 1 (TA10(1)), 2 (TA10(2)), 5 (TA10(5)) and 6 (TA10(6)) are similar and close to each other,
and that the average of these variables (TA10(A)) explains most of the variance in these variables
(refer to Figure 6.2 for locating said vertices on the basin), hence justifying variable averaging
for the purpose of variable reduction. Figure 6.4 provides a schematic of the proposed variable
averaging scheme.
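A small sketch of the averaging step, with synthetic values standing in for the four vertex series (the names and numbers are assumptions):

```python
import numpy as np

def average_vertices(vertex_series):
    """Average an (n_time, 4) array of vertex values into a single series."""
    return vertex_series.mean(axis=1)

rng = np.random.default_rng(4)
common = rng.normal(size=300)
# Four synthetic vertex series that share most of their variation.
vertices = common[:, None] + 0.1 * rng.normal(size=(300, 4))
averaged = average_vertices(vertices)
# Correlation of each vertex series with the average.
vertex_corr = [float(np.corrcoef(vertices[:, k], averaged)[0, 1]) for k in range(4)]
```

When the vertex series are as similar as described, each one correlates strongly with the average, so little information is lost.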
Figure 6.3 Variations in Air Temperature at the vertices of the grid enclosing Station A
Figure 6.4 Variable averaging scheme
6.4 Multi-scale wavelet entropy
CWT is applied to each input variable for the calibration period using the Morlet wavelet to obtain
wavelet coefficients at 7 dyadic time-frequency scales covering information up to a period of 128
months. These wavelet coefficients are then used to calculate the MWE of the variable at each scale,
hence providing a distinct multi-scale entropy signature for each input variable.
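The entropy calculation can be sketched as follows, assuming a Shannon entropy of the normalized wavelet energy at each scale. Whether this matches the MWE definition used in this study in every detail is an assumption; the coefficient array would come from a CWT routine, which is not shown.

```python
import numpy as np

scales = 2 ** np.arange(1, 8)   # 7 dyadic scales: 2, 4, ..., 128 months

def wavelet_entropy_per_scale(coeffs):
    """Shannon entropy of normalized wavelet energy, one value per scale.

    coeffs: (n_scales, n_time) array of wavelet coefficients.
    """
    energy = np.abs(coeffs) ** 2
    p = energy / energy.sum(axis=1, keepdims=True)
    # Treat 0 * log(0) as 0.
    log_p = np.log(p, where=p > 0, out=np.zeros_like(p))
    return -(p * log_p).sum(axis=1)

uniform = np.ones((7, 16))                      # energy spread evenly in time
spike = np.zeros((7, 16)); spike[:, 3] = 1.0    # energy concentrated at one time
entropy_uniform = wavelet_entropy_per_scale(uniform)
entropy_spike = wavelet_entropy_per_scale(spike)
```

The seven entropy values of a variable form the signature vector used for clustering.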
6.5 K-means clustering and PCA
The MWE of the input variables is used as a signature to cluster atmospheric variables into
homogeneous groups using the K-means methodology. Selection of a suitable number of clusters is an
important task when dealing with K-means clustering. Two validity indices, namely the
Davies-Bouldin (DB) index and Dunn’s index, were used to identify the best number of clusters for
the data.
6.5.1 Davies-Bouldin (DB) Index
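Davies-Bouldin (DB) index is defined as a function of the ratio of the sum of within-cluster scatter
to between-cluster separation:
$$ DB = \frac{1}{K}\sum_{i=1}^{K}\max_{j \neq i}\left\{\frac{\Delta(C_i) + \Delta(C_j)}{\delta(C_i, C_j)}\right\} \tag{18} $$
where K is the number of clusters, $\delta(C_i, C_j)$ is the distance between the centroids of clusters $C_i$ and
$C_j$, and the cluster diameter $\Delta(C_i)$ is defined as:
$$ \Delta(C_i) = \frac{1}{n_i}\sum_{x \in C_i}\lVert x - z_i \rVert \tag{19} $$
with $n_i$ the number of points and $z_i$ the centroid of cluster $C_i$. Since the objective is to obtain
clusters with minimum intra-cluster distances, a small value of DB is indicative of better clustering.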
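A sketch of the DB calculation is given below, assuming the cluster diameter is the mean distance of the members to their centroid and the separation is the distance between centroids. The function name and the toy data are assumptions.

```python
import numpy as np

def davies_bouldin(points, labels):
    """DB index: mean over clusters of the worst (scatter sum / separation) ratio."""
    ids = np.unique(labels)
    centroids = np.array([points[labels == c].mean(axis=0) for c in ids])
    # Diameter: mean distance of the members to their centroid.
    diam = np.array([
        np.linalg.norm(points[labels == c] - z, axis=1).mean()
        for c, z in zip(ids, centroids)
    ])
    worst = []
    for i in range(len(ids)):
        ratios = [
            (diam[i] + diam[j]) / np.linalg.norm(centroids[i] - centroids[j])
            for j in range(len(ids)) if j != i
        ]
        worst.append(max(ratios))
    return float(np.mean(worst))

toy_points = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
toy_labels = np.array([0, 0, 1, 1])
db_value = davies_bouldin(toy_points, toy_labels)   # (1 + 1) / 10
```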
6.5.2 Dunn’s Index
Dunn’s index is defined as the ratio of the minimal inter-cluster distance to the maximal intra-cluster
distance. The Dunn’s index for K clusters is defined as:
$$ D = \min_{1 \le i \le K}\left\{\min_{1 \le j \le K,\, j \neq i}\left\{\frac{diss(C_i, C_j)}{\max_{1 \le k \le K} diam(C_k)}\right\}\right\} \tag{20} $$
$$ diss(C_i, C_j) = \min_{x \in C_i,\, y \in C_j}\lVert x - y \rVert \tag{21} $$
$$ diam(C) = \max_{x,\, y \in C}\lVert x - y \rVert \tag{22} $$
where $diss(C_i, C_j)$ is the dissimilarity between clusters $C_i$ and $C_j$ and $diam(C)$ is the intra-
cluster function (or diameter) of the cluster. A large value of Dunn’s index is preferred as it
represents well-separated and compact clusters.
Detailed information on these indices is outside the scope of this study. Readers may refer to Davies
and Bouldin (1979), Kasturi et al. (2003), Dunn (1973), Bolshakova et al. (2003) and Halkidi et
al. (2001) for detailed information on these cluster validity indices.
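Dunn's index can be sketched in the same way, assuming the dissimilarity is the smallest distance between points of two clusters and the diameter is the largest distance within a cluster. The function name and the toy data are assumptions.

```python
import numpy as np

def dunn_index(points, labels):
    """Smallest between-cluster distance divided by the largest cluster diameter."""
    groups = [points[labels == c] for c in np.unique(labels)]

    def pairwise(a, b):
        return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)

    max_diam = max(pairwise(g, g).max() for g in groups)
    min_diss = min(
        pairwise(groups[i], groups[j]).min()
        for i in range(len(groups)) for j in range(i + 1, len(groups))
    )
    return float(min_diss / max_diam)

toy_points = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
toy_labels = np.array([0, 0, 1, 1])
dunn_value = dunn_index(toy_points, toy_labels)   # 10 / 2
```

Scanning the number of clusters over a range and recording both indices then amounts to repeating the clustering for each candidate number and looking for a low DB value together with a high Dunn value.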
Selection of the optimum number of clusters is carried out by segregating the data into 5 to 15 clusters
and evaluating the corresponding clusters for a low DB and a high Dunn’s index value. Figure 6.5
provides the values of the DB and Dunn’s indices corresponding to each number of clusters (5 to 15)
for all five stations. The optimum number corresponding to each station is shown in Table 6.1.
Figure 6.5 Value of DB and Dunn’s indices corresponding to each number of clusters
Table 6.1 Optimum number of clusters formed at each downscaling location
S.No.   Station   Optimum Number Of Clusters
1       A         12
2       B         10
3       C         11
4       D         9
5       E         8
Principal component analysis (PCA) is then applied to each cluster to obtain principal components
explaining 90-95% of the variance of each cluster. These principal components served as model
inputs to the P-ANN and P-MLR models.
The above steps can be categorized as data preparatory steps. A detailed schematic describing data
preparation for models is provided in Figure 6.6.
Figure 6.6 Detailed flowchart of input formation to the models
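The number of principal components needed to reach a given share of a cluster's variance can be found as in this sketch. The function name `n_components_for`, the 90% threshold used in the call and the synthetic cluster are assumptions for illustration.

```python
import numpy as np

def n_components_for(data, threshold=0.90):
    """Smallest number of principal components whose variance share reaches threshold."""
    centered = data - data.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    return int(np.searchsorted(cumulative, threshold) + 1)

rng = np.random.default_rng(5)
a = rng.normal(size=500)
b = rng.normal(size=500)
# Synthetic cluster: two nearly identical variables plus one independent variable.
cluster = np.column_stack([a, a + 0.01 * rng.normal(size=500), b])
n_keep = n_components_for(cluster, threshold=0.90)
```

Here two components are enough, because the first two variables carry essentially the same information.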
6.6 Multi- resolution modeling using DWT
Application of multi-resolution modeling using the wavelet transform requires clarity on the selection
of the mother wavelet function for carrying out DWT and on a suitable resolution level up to which the
data should be decomposed to obtain the Discrete Wavelet Components (DWCs) for modeling
purposes. If the level of decomposition is lower than the appropriate level,
models may miss out on some useful information regarding the process under study. On the other
hand, unnecessary inclusion of decomposition levels may lead to a larger number of redundant
variables and will add to the overall complexity of the model. Similarly, the choice of mother wavelet
function used for obtaining DWCs for model formation also plays a crucial role in extracting
information from the available variables. A comprehensive study has been carried out on the data
to understand the optimum resolution level for model formation and the selection of the mother wavelet.
Figure 6.7 shows the wavelet power spectrum (WPS) of observed precipitation and temperature
respectively at station A. It is clear from the figure that events around the 9-12 months period are
significant and have a distinct and consistent presence in the wavelet spectrum for both
precipitation and temperature. In addition, for temperature the 6 months period is also
remarkable. A similar analysis is carried out for all NCEP variables to identify their significant
periods. Figure 6.8 shows the number of NCEP variables with respective significant
periods for station A. Out of a total of 280 NCEP variables, the period of 1-4 months is found to
be significant for 16 variables, whereas 25 variables displayed significance around the 5-8 months
period. 210 variables displayed significance around 9-16 months, compared to 10 and 19 variables
for 64-128 and beyond 128 months respectively.
Clearly, the 9-12 months period can be considered the most significant in extracting
information from both the NCEP variables and the observed precipitation and temperature. A similar
pattern is observed for all other stations as well. As 9-12 months corresponds to the third level of
decomposition in the dyadic scheme, the third level of decomposition is chosen for multi-resolution
modeling using DWT.
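The link between a period and a dyadic decomposition level can be made explicit with a small helper. The convention used here, that detail level j covers periods from $2^j$ to $2^{j+1}$ time steps, is a common approximation and is stated as an assumption; the function names are hypothetical.

```python
import math

def dyadic_band(level, dt=1.0):
    """Approximate period band (in units of dt) covered by detail level `level`."""
    return (2 ** level * dt, 2 ** (level + 1) * dt)

def level_for_period(period, dt=1.0):
    """Detail level whose dyadic band contains `period`."""
    return int(math.floor(math.log2(period / dt)))

band_3 = dyadic_band(3)            # (8.0, 16.0) months for monthly data
level_9 = level_for_period(9)
level_12 = level_for_period(12)
```

With monthly data, both 9 and 12 months fall in the band of the third level.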
Figure 6.7 (a) Wavelet power spectrum of observed precipitation (b) Wavelet power spectrum of observed temperature
Figure 6.8 Number of NCEP variables with respective significant periods
Maheswaran and Khosa (2012) and Sehgal et al. (2014) have established that wavelets with
larger support width and higher vanishing moments are better suited to extracting information at
multiple time-frequency scales. However, in the case of climatic downscaling models, the input
variables are taken at various pressure levels and may represent different physical properties, which
can be difficult for any given wavelet to represent. Downscaling models typically contain a variety of
sets of input variables, each with unique physical characteristics and hence distinct multi-
resolution behavior. Hence, selection of a mother wavelet that explains the multi-resolution behavior
of all the variables with reasonable accuracy remains a major challenge. Using different wavelets to
decompose each set of physically similar variables could be an alternative approach, but that would
lead to scaling issues and also add to the overall complexity and computational effort of the model.
Selection of a suitable resolution level of input variables for model formation is another challenge
in wavelet based hybrid models. Some variables may show significant information at a scale which
could be significantly different from that of other variables. Hence, a comprehensive analysis is
required to ascertain the most suitable level of decomposition without adding to the complexity of the
models.
In this study a trade-off between complexity and accuracy is maintained and a generalized
modeling approach is applied. The models employ a single mother wavelet for decomposition of
all input variables, which is selected based on the performance of the models for the validation period
using wavelets from the Daubechies family with vanishing moments 1 through 45 for modelling at
each point of application. The third level of decomposition, pertaining to the 8-12 months scale, is
selected for model formation. Figure 6.9 provides the wavelet decomposition of variable TA (air
temperature) at 10 millibar pressure level (TA10) at NCEP vertex number 1 [TS10 (1)] and
variable PSL (sea level pressure) at NCEP vertex number 6 [PSL (6)] up to three resolution
levels using the db45 mother wavelet for illustration.
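A three-level decomposition into D1, D2, D3 and A3 can be illustrated with the Haar wavelet, which is the Daubechies wavelet with one vanishing moment. The sketch below is not the db45 decomposition shown in Figure 6.9; it uses the fact that the Haar approximation at level j is the block mean over $2^j$ samples, and the function name `haar_mra` and the synthetic series are assumptions.

```python
import numpy as np

def haar_mra(x, levels=3):
    """Additive Haar multi-resolution components D1..DJ and AJ of a series."""
    x = np.asarray(x, float)
    if len(x) % 2 ** levels:
        raise ValueError("series length must be divisible by 2**levels")
    components, previous = {}, x
    for j in range(1, levels + 1):
        block = 2 ** j
        # Haar approximation at level j: mean over blocks of 2**j samples.
        approx = np.repeat(x.reshape(-1, block).mean(axis=1), block)
        components[f"D{j}"] = previous - approx   # detail lost between levels
        previous = approx
    components[f"A{levels}"] = previous
    return components

rng = np.random.default_rng(6)
series = rng.normal(size=432)          # synthetic monthly series
parts = haar_mra(series, levels=3)
reconstructed = sum(parts.values())
```

The components add back to the original series, which is the property the hybrid models rely on when the sub-series outputs are recombined.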
Figure 6.9 Wavelet decomposition of TS10 (1) and PSL (6) upto three resolution levels
6.7 Development of wavelet based hybrid models
The concept of multi-resolution modeling is employed using DWT to develop Second Order
Volterra coupled with wavelet & PCA (W-P-SoV) models and MLR coupled with wavelet & PCA
(W-P-MLR) models. DWT is used to obtain the wavelet sub-time series of both the input
variables and the observed precipitation and temperature (model target), which are modeled using SoV
and MLR to simulate downscaled precipitation and temperature at the corresponding time-frequency
scale. The final downscaled values of precipitation and temperature are reconstructed from the
outputs of these models.
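The decompose, model and recombine idea can be sketched with a linear model per component. Several simplifications are assumptions of this sketch and not statements about the study: a Haar decomposition built from block means replaces the Daubechies wavelets, plain least squares MLR is used for every component, the calibration and validation periods are decomposed separately, and all names are hypothetical.

```python
import numpy as np

def haar_components(x, levels=3):
    """Additive Haar components [D1, ..., DJ, AJ]; length must divide by 2**levels."""
    x = np.asarray(x, float)
    parts, previous = [], x
    for j in range(1, levels + 1):
        approx = np.repeat(x.reshape(-1, 2 ** j).mean(axis=1), 2 ** j)
        parts.append(previous - approx)
        previous = approx
    return parts + [previous]

def fit_predict(x_cal, y_cal, x_val):
    """Least squares MLR fitted on calibration data, applied to validation data."""
    def design(x):
        return np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design(x_cal), y_cal, rcond=None)
    return design(x_val) @ coef

def multiresolution_mlr(pred_cal, target_cal, pred_val, levels=3):
    """One MLR per wavelet component; the component outputs are summed."""
    x_cal = [haar_components(col, levels) for col in pred_cal.T]
    x_val = [haar_components(col, levels) for col in pred_val.T]
    y_cal = haar_components(target_cal, levels)
    total = np.zeros(len(pred_val))
    for c in range(levels + 1):
        xc = np.column_stack([comps[c] for comps in x_cal])
        xv = np.column_stack([comps[c] for comps in x_val])
        total += fit_predict(xc, y_cal[c], xv)
    return total

rng = np.random.default_rng(7)
pred_cal = rng.normal(size=(64, 2))
pred_val = rng.normal(size=(32, 2))
target_cal = 3.0 * pred_cal[:, 0] - pred_cal[:, 1]   # synthetic linear target
simulated = multiresolution_mlr(pred_cal, target_cal, pred_val)
```

For a purely linear synthetic target the recombined output matches a single linear model; the benefit described in the text arises when the relationship differs from one scale to another.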
Estimating the best input combinations for the models of each DWC is a computationally challenging
task. Hence the total number of input variables for modeling each DWC of regional precipitation and
temperature was kept constant for a station and determined based on the performance of the
reconstructed precipitation and temperature time series respectively for the validation period from
these models. However, the number of linear and non-linear variables varied across the DWC
models. Tables 6.2 and 6.3 give detailed information on the combination of linear and non-linear
input variables for each sub-time series model of all five stations under study for precipitation and
temperature respectively. It is observed that the D1 component accounted for the largest number of
non-linear variables in the model across all stations. This is understandable as D1 corresponds to the
high frequency component of a time series and represents short and transient features. Figure 6.10 and
Figure 6.11 provide representations of the nonlinear kernels for one of the multi-resolution models for
precipitation and temperature respectively.
Figure 6.10 Nonlinear kernels for modeling D2 component for station B for Precipitation
Figure 6.11 Nonlinear kernels for modeling D2 component for station B for Temperature
Table 6.2 Combination of linear and non-linear input variables for each sub-time series model of all five stations for
Precipitation
Station  Total      D1                  D2                  D3                  A3
         variables  Linear  Non-Linear  Linear  Non-Linear  Linear  Non-Linear  Linear  Non-Linear
A        6          3       3           6       0           6       0           6       0
B        10         2       8           7       3           10      0           9       1
C        5          2       3           5       0           4       1           3       2
D        7          3       4           7       0           7       0           2       5
E        5          2       3           5       0           5       0           5       0
Table 6.3 Combination of linear and non-linear input variables for each sub-time series model of all five stations for
Temperature
Station  Total      D1                  D2                  D3                  A3
         variables  Linear  Non-Linear  Linear  Non-Linear  Linear  Non-Linear  Linear  Non-Linear
A        7          2       5           7       0           7       0           4       3
B        17         2       15          15      2           17      0           5       12
C        9          2       7           9       0           9       0           5       4
D        9          2       7           9       0           9       0           7       2
E        6          1       5           6       0           6       0           5       1
7. RESULTS AND DISCUSSIONS
7.1 Performance Evaluation of the Models
This study provides comparison between following four models for downscaling GCM variables
to regional precipitation and temperature.
I. Second order Volterra coupled with wavelet & PCA (W-P-SoV)
II. MLR coupled with wavelet & PCA (W-P-MLR)
III. ANN coupled with PCA (P-ANN) and
IV. MLR coupled with PCA (P-MLR) models
Table 7.1 and Table 7.2 summarize the performance of these models based on the selected
performance indices for mean monthly precipitation and temperature respectively.
Table 7.1 Performance evaluation of the proposed model for the mean monthly precipitation
Station  Model     CC     NRMSE  NMAE   Accuracy (%)
A        W-P-SoV   0.723  0.177  0.105  76.52
A        W-P-MLR   0.716  0.178  0.121  68.94
A        P-ANN     0.682  0.182  0.130  65.91
A        P-MLR     0.672  0.188  0.135  63.64
B        W-P-SoV   0.890  0.095  0.375  68.94
B        W-P-MLR   0.879  0.097  0.391  63.64
B        P-ANN     0.842  0.109  0.497  62.88
B        P-MLR     0.840  0.109  0.502  62.88
C        W-P-SoV   0.732  0.179  0.255  71.97
C        W-P-MLR   0.715  0.181  0.258  70.45
C        P-ANN     0.695  0.193  0.295  68.94
C        P-MLR     0.693  0.194  0.297  68.18
D        W-P-SoV   0.735  0.150  0.222  75.00
D        W-P-MLR   0.715  0.154  0.234  74.24
D        P-ANN     0.715  0.168  0.276  74.24
D        P-MLR     0.710  0.182  0.325  64.39
E        W-P-SoV   0.660  0.164  0.213  75.76
E        W-P-MLR   0.641  0.165  0.215  75.76
E        P-ANN     0.618  0.164  0.215  69.70
E        P-MLR     0.618  0.177  0.247  67.42
As observed from Table 7.1, Second order Volterra coupled with wavelet & PCA (W-P-SoV)
model and MLR coupled with wavelet & PCA (W-P-MLR) models outperform ANN coupled with
PCA (P-ANN) and MLR coupled with PCA (P-MLR) models for all 5 stations.
For Station A, W-P-SoV and W-P-MLR models provide CC of 0.723 and 0.716, NRMSE of 0.177
and 0.178, NMAE of 0.105 and 0.121 and Accuracy of 76.52% and 68.94% respectively compared
to CC of 0.682 and 0.672, NRMSE of 0.182 and 0.188, NMAE 0.130 and 0.135 and Accuracy of
65.91% and 63.64% using P-ANN and P-MLR models respectively.
A similar trend is observed for Station B, where the W-P-SoV model recorded improvements of 1.3%,
5.7% and 6.0% in correlation w.r.t. the W-P-MLR, P-ANN and P-MLR models respectively. Similarly,
the improvements w.r.t. W-P-MLR, P-ANN and P-MLR are 2.1%, 12.8% and 12.8% in NRMSE;
4.1%, 24.5% and 25.3% in NMAE; and 8.3%, 9.6% and 9.6% in Accuracy respectively.
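The percentages quoted for Station B can be reproduced from the Table 7.1 values if each improvement is expressed relative to the model being compared against. This convention is inferred from the numbers rather than stated in the text, and the helper name `improvement` is an assumption.

```python
def improvement(reference, candidate, higher_is_better):
    """Percentage improvement of `candidate` over `reference`."""
    change = candidate - reference if higher_is_better else reference - candidate
    return 100.0 * change / reference

# Station B, precipitation: W-P-SoV compared with the other three models.
cc_gain = [round(improvement(r, 0.890, True), 1) for r in (0.879, 0.842, 0.840)]
nrmse_gain = [round(improvement(r, 0.095, False), 1) for r in (0.097, 0.109, 0.109)]
nmae_gain = [round(improvement(r, 0.375, False), 1) for r in (0.391, 0.497, 0.502)]
accuracy_gain = [round(improvement(r, 68.94, True), 1) for r in (63.64, 62.88, 62.88)]
```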
A similar trend is observed for Stations C and E, with W-P-SoV and W-P-MLR performing
considerably better than the P-ANN and P-MLR models.
However for station D, performance of P-ANN models is observed to be close to that of W-P-
MLR model. Both the models provide CC of 0.715 and Accuracy of 74.24%. However, in NRMSE
and NMAE statistics, W-P-MLR takes the lead with NRMSE of 0.154 and NMAE of 0.234
compared to NRMSE of 0.168 and NMAE of 0.276 using P-ANN models.
Considering performance of the models based on the predefined four performance indices at all
the stations, overall W-P-SoV model outmatched the rest of the models.
As observed from Table 7.2, Second order Volterra coupled with wavelet & PCA (W-P-SoV)
models and MLR coupled with wavelet & PCA (W-P-MLR) models outperform ANN coupled
with PCA (P-ANN) and MLR coupled with PCA (P-MLR) models for all 5 stations.
For Station A, W-P-SoV and W-P-MLR models provide correlation coefficient (CC) of 0.938 and
0.934, Normalized Root Mean Square Error (NRMSE) of 0.081 and 0.084, Normalized Mean
Absolute Error (NMAE) of 0.066 and 0.070 and Accuracy of 87.12% and 87.88% respectively
compared to CC of 0.902 and 0.899, NRMSE of 0.105 and 0.106, NMAE of 0.084 and 0.086 and
Accuracy of 76.52% and 73.48% using P-ANN and P-MLR models respectively.
Table 7.2 Performance evaluation of the proposed model for the mean monthly temperature
Station  Model     CC     NRMSE  NMAE   Accuracy (%)
A        W-P-SoV   0.938  0.081  0.066  87.12
A        W-P-MLR   0.934  0.084  0.070  87.88
A        P-ANN     0.902  0.105  0.084  76.52
A        P-MLR     0.899  0.106  0.086  73.48
B        W-P-SoV   0.936  0.086  0.069  89.39
B        W-P-MLR   0.928  0.087  0.068  87.88
B        P-ANN     0.915  0.097  0.069  87.88
B        P-MLR     0.912  0.099  0.080  84.85
C        W-P-SoV   0.962  0.070  0.057  87.88
C        W-P-MLR   0.961  0.072  0.059  86.36
C        P-ANN     0.916  0.110  0.085  72.73
C        P-MLR     0.913  0.107  0.084  71.21
D        W-P-SoV   0.956  0.073  0.057  84.85
D        W-P-MLR   0.959  0.071  0.057  86.36
D        P-ANN     0.922  0.105  0.082  67.42
D        P-MLR     0.919  0.099  0.076  76.52
E        W-P-SoV   0.934  0.086  0.067  86.36
E        W-P-MLR   0.927  0.095  0.075  85.61
E        P-ANN     0.916  0.098  0.076  82.58
E        P-MLR     0.909  0.102  0.078  79.55
A similar trend is observed for Station B, where the W-P-SoV model recorded improvements of ~1%,
2.2% and 2.5% in correlation w.r.t. the W-P-MLR, P-ANN and P-MLR models respectively. Similarly,
the improvements w.r.t. W-P-MLR, P-ANN and P-MLR are ~1.2%, 12.8% and 15.9% in NRMSE;
1.45%, 1.45% and 15.9% in NMAE; and 1.67%, 1.67% and 5.07% in Accuracy respectively.
A similar trend is observed for Stations C, D and E, with W-P-SoV and W-P-MLR
performing considerably better than the P-ANN and P-MLR models.
7.2 Discrepancy Ratio
Figure 7.1 and Figure 7.2 compare the different models under study for precipitation
and temperature respectively based on DR statistics. The x-axis of the plots gives the DR
bracket and the y-axis gives the percentage of total validation data points falling into the DR
bracket. The dotted vertical lines mark the Accuracy band: the total percentage of data points
falling into the region demarcated by the dotted lines is termed the Accuracy of the model. It is
evident from the plots that the W-P-SoV models have a clear advantage over the other models under
study. A higher percentage of results falls within the accuracy band for the W-P-SoV models than
for any other model at all the stations. In general, the multi-resolution based models displayed less
bias and improved overall Accuracy for all five stations.
Accuracy of the model for precipitation is defined as percentage of estimated data points in
validation period falling within DR range of [-1, 1].
Accuracy of the model for temperature is defined as percentage of estimated data points in
validation period falling within DR range of [-0.2, 0.2].
Figure 7.1 DR Statistics of various models under study for the mean monthly precipitation
Figure 7.2 DR statistics of various models under study for the mean monthly temperature
7.3 Line plots
Figures 7.3 to 7.12 provide line plots of observed v/s simulated mean monthly precipitation and
temperature for the validation period for all the models under study. Again, it can be
observed that the W-P-SoV and W-P-MLR models are more sensitive to extreme values
than the P-ANN and P-MLR models for all five stations under study.
Figure 7.3 Line plot of observed mean monthly precipitation and model simulations at Station A
Figure 7.4 Line plot of observed mean monthly temperature and model simulations at Station A
[Plots for Figures 7.3 and 7.4, Station A. Vertical axes: Precipitation (mm) and Temperature (°C). Series: IMD, P-MLR, P-ANN, W-P-MLR, W-P-SoV.]
Figure 7.5 Line plot of observed mean monthly precipitation and model simulations at Station B
Figure 7.6 Line plot of observed mean monthly temperature and model simulations at Station B
[Plots for Figures 7.5 and 7.6, Station B. Vertical axes: Precipitation (mm) and Temperature (°C). Series: IMD, P-MLR, P-ANN, W-P-MLR, W-P-SoV.]
Figure 7.7 Line plot of observed mean monthly precipitation and model simulations at Station C
Figure 7.8 Line plot of observed mean monthly temperature and model simulations at Station C
[Plots for Figures 7.7 and 7.8, Station C. Vertical axes: Precipitation (mm) and Temperature (°C). Series: IMD, P-MLR, P-ANN, W-P-MLR, W-P-SoV.]
Figure 7.9 Line plot of observed mean monthly precipitation and model simulations at Station D
Figure 7.10 Line plot of observed mean monthly temperature and model simulations at Station D
[Plots for Figures 7.9 and 7.10, Station D. Vertical axes: Precipitation (mm) and Temperature (°C). Series: IMD, P-MLR, P-ANN, W-P-MLR, W-P-SoV.]
Figure 7.11 Line plot of observed mean monthly precipitation and model simulations at Station E
Figure 7.12 Line plot of observed mean monthly temperature and model simulations at Station E
[Plots for Figures 7.11 and 7.12, Station E. Vertical axes: Precipitation (mm) and Temperature (°C). Series: IMD, P-MLR, P-ANN, W-P-MLR, W-P-SoV.]
It is clear from the results that the wavelet based models outperform standalone models like P-ANN
and P-MLR. This is because wavelet analysis facilitates a multi-resolution modeling
approach to the input variables. Separate models were developed for each DWC of the mean
monthly precipitation and temperature time series. Hence the models were sensitive to both long-term
and transient trends in the input variables, providing an edge over the standalone models.
Moreover, SoV models have an advantage over MLR based models as they address the non-
linearity in the process. Transient processes, which are mostly captured by the D1 component of
a time series, are difficult to model using linear models due to the inherent complexity of
the process. Hence a non-linear model captures variations in these components more accurately.
It is observed from Table 6.2 that the D1 component accounted for the largest number of non-linear
variables in the model across all stations. This is understandable as D1 corresponds to the high
frequency component of a time series and represents short and transient features.
7.4 Downscaled Future Scenario
The downscaled scenarios of mean monthly precipitation and temperature for the coming
years have been generated for all the stations and are shown in Figure 7.13. The RCP 4.5 future
scenario was downscaled using the W-P-SoV model (as it gave the best performance among all the
models). The projection period starts in 2006 and lasts up to 2035. This downscaled
information can be further used in decision making processes. Adopting the proposed framework
may help decision makers explore future precipitation and temperature intensities, temporal
and spatial variability, regionalization studies etc. However, since planning and adaptation studies
should not rely on the outputs of a single climate model, future decision making should employ the
proposed framework with multiple GCMs and multiple scenarios to incorporate model and
scenario uncertainty.
Figure 7.13 Future downscaled mean monthly precipitation and temperature for all the Stations
7.5 Statistical Analysis
Table 7.3 provides the statistical properties of mean monthly IMD precipitation for the calibration
and validation periods, while Table 7.4 gives the statistical properties of mean monthly
downscaled precipitation for the future.
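Descriptive statistics of this kind can be computed as in the sketch below. The function name `describe`, the use of the sample standard deviation and the toy series are assumptions; skewness is left out because the estimator used for the tables is not stated.

```python
import numpy as np

def describe(series):
    """Basic descriptive statistics of a series."""
    x = np.asarray(series, float)
    sd = x.std(ddof=1)                       # sample standard deviation
    return {
        "mean": float(x.mean()),
        "median": float(np.median(x)),
        "standard_error": float(sd / np.sqrt(len(x))),
        "standard_deviation": float(sd),
        "minimum": float(x.min()),
        "maximum": float(x.max()),
        "range": float(x.max() - x.min()),
    }

stats = describe([0.0, 1.0, 2.0, 3.0, 4.0])

# Standard error = standard deviation / sqrt(n), with n = 300 calibration months
# and the Point A calibration standard deviation of 2.48 mm.
se_check = 2.48 / np.sqrt(300)
```

The value of `se_check` rounds to 0.14, which agrees with the standard error reported for Point A in the calibration period.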
Table 7.3 Descriptive statistics of observed mean monthly IMD precipitation (mm) for calibration and validation period
Statistics (mm)      Point A  Point B  Point C  Point D  Point E
Calibration period (Jan’69 to Dec’93)
Mean                 1.65     5.39     1.94     1.69     1.73
Median               0.47     0.97     0.82     0.70     0.91
Standard Error       0.14     0.48     0.16     0.14     0.13
Standard Deviation   2.48     8.28     2.78     2.37     2.18
Skewness             2.19     1.81     2.50     2.24     1.75
Minimum              0        0        0        0        0
Maximum              14.52    47.93    20.11    16.69    10.97
Range                14.52    47.93    20.11    16.69    10.97
Validation period (Jan’94 to Dec’04)
Mean                 1.60     5.33     1.48     1.84     1.18
Median               0.31     0.92     0.61     0.86     0.57
Standard Error       0.19     0.73     0.16     0.19     0.14
Standard Deviation   2.16     8.33     1.89     2.17     1.66
Skewness             1.36     1.89     1.53     1.37     2.08
Minimum              0        0        0        0        0
Maximum              8.67     41.88    7.91     9.83     7.90
Range                8.67     41.88    7.91     9.83     7.90
Table 7.4 Descriptive statistics of mean monthly downscaled precipitation (mm) for the future period
Statistics (mm)       Point A  Point B  Point C  Point D  Point E
Future downscaling period (Jan’05 to Dec’35)
Mean                     2.37     5.06     1.75        2     2.48
Median                   2.12     2.33     1.43     1.92     2.09
Standard Error           0.08     0.31     0.08     0.13     0.11
Standard Deviation       1.54     5.84     1.61     2.39     2.03
Skewness                 0.20     0.90     0.76     0.18     0.48
Minimum                     0        0        0        0        0
Maximum                  5.82    19.72     6.31    11.26     7.80
Range                    5.82    19.72     6.31    11.13     7.80
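The statistics listed in Tables 7.3 and 7.4 can be reproduced for any monthly series with a few lines of code. The following Python sketch is illustrative only: the function name `describe` is hypothetical, the sample values are made up, and the skewness estimator (adjusted Fisher-Pearson) is an assumption, since the estimator used for the tables is not stated here.

```python
import math
import statistics


def describe(series):
    """Return the descriptive statistics reported in the tables for one series."""
    n = len(series)
    mean = statistics.fmean(series)
    sd = statistics.stdev(series)  # sample standard deviation (divides by n - 1)
    # Assumed estimator: adjusted Fisher-Pearson skewness (needs n >= 3 and sd > 0)
    skew = (n / ((n - 1) * (n - 2))) * sum(((x - mean) / sd) ** 3 for x in series)
    return {
        "mean": mean,
        "median": statistics.median(series),
        "standard_error": sd / math.sqrt(n),
        "standard_deviation": sd,
        "skewness": skew,
        "minimum": min(series),
        "maximum": max(series),
        "range": max(series) - min(series),
    }


# Made-up monthly values (mm), not data from the study
sample = [0.0, 0.2, 0.5, 1.1, 3.4, 9.8, 6.2, 2.0, 0.7, 0.1, 0.0, 0.0]
summary = describe(sample)
```

The standard error is the standard deviation divided by the square root of the number of months. For the calibration period (300 months), 2.48/√300 ≈ 0.14, which agrees with the value tabulated for Point A.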
Figure 7.14 Mean of the precipitation time series of IMD (calibration and validation period) and downscaled future data for
all the stations
Figure 7.15 Maximum of the precipitation time series of IMD (calibration and validation period) and downscaled future
data for all the stations
It is evident from the above statistics and bar charts that station B behaves differently from
the other stations, while the remaining stations behave broadly alike. Although the IMD
precipitation data have a greater maximum value than the future downscaled data at all the
stations, the mean of the downscaled precipitation is higher than the IMD mean at most stations
(A, D and E in Tables 7.3 and 7.4).
The Krishna basin, like most parts of India, receives most of its rainfall from the monsoon. Hence, further
analysis has been carried out to assess the performance of the proposed models in capturing the
statistical properties of monsoonal precipitation. Figure 7.16 shows the box plots of observed
v/s simulated precipitation for the monsoon months (June to October) of the validation period. It
is observed that, in general, the wavelet based models capture the mean and extremes of
monsoonal precipitation better than the P-ANN and P-MLR models.
Figure 7.16 Box plot for observed v/s simulated precipitation for June-October of the validation period
Figure 7.17 provides cumulative distribution function (CDF) plots of observed v/s simulated
precipitation for the validation period for all the models under study. It can be observed that the
W-P-SoV and W-P-MLR models are more sensitive to extreme values than the P-ANN and
P-MLR models at all five stations under study. The CDFs of the wavelet based models
mimic the shape of the CDF of observed precipitation more accurately.
Figure 7.17 Cumulative distribution function plots of observed v/s simulated precipitation for validation period
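The comparison of CDF shapes is made visually from the plots. As a hedged illustration of how such curves are built and how their agreement could be summarised numerically, the Python sketch below computes empirical CDFs and the largest vertical gap between them. The gap measure and the names `ecdf_at` and `max_cdf_gap` are hypothetical additions and are not part of the thesis methodology; the sample values are made up.

```python
from bisect import bisect_right


def ecdf_at(sorted_values, x):
    """Empirical CDF: fraction of values less than or equal to x."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def max_cdf_gap(observed, simulated):
    """Largest vertical gap between the two empirical CDFs."""
    obs, sim = sorted(observed), sorted(simulated)
    # Both CDFs are step functions, so the largest gap occurs at a data value
    return max(abs(ecdf_at(obs, x) - ecdf_at(sim, x)) for x in obs + sim)


# Made-up monthly precipitation values (mm), for illustration only
observed = [0.0, 0.3, 1.2, 4.5, 9.8, 2.1]
simulated = [0.1, 0.4, 1.0, 3.9, 7.5, 2.4]
gap = max_cdf_gap(observed, simulated)
```

A smaller gap means that the simulated CDF follows the observed one more closely over the whole range, including the upper tail where the extremes lie.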
Table 7.5 provides the statistical properties of mean monthly IMD temperature for the calibration and
validation periods, while Table 7.6 gives the statistical properties of mean monthly downscaled
temperature for the future period.
Table 7.5 Descriptive statistics of observed mean monthly IMD temperature (°C) for calibration and validation period
Statistics (°C)       Point A  Point B  Point C  Point D  Point E
Calibration period (Jan’69 to Dec’93)
Mean                    25.59    25.99    27.02    28.05    24.08
Median                  25.33    25.50    26.53    28.09    23.64
Standard Error           0.12     0.11     0.17     0.18     0.11
Standard Deviation       2.13     1.94     3.02     3.11     1.98
Skewness                 0.27     0.60     0.34     0.06     0.54
Minimum                 21.46    22.05    20.86    21.19    19.98
Maximum                 30.03    30.69    33.85    35.32    28.82
Range                    8.56     8.63    12.99    14.13     8.84
Validation period (Jan’94 to Dec’04)
Mean                    25.78    26.12    27.17    28.21    24.26
Median                  25.68    25.84    26.63    28.18    23.88
Standard Error           0.18     0.17     0.26     0.27     0.16
Standard Deviation       2.07     1.90     2.99     3.09     1.89
Skewness                 0.23     0.54     0.32     0.07     0.39
Minimum                 21.49    22.49    21.57    22.45    20.68
Maximum                 30.73    30.71    33.32    34.84    28.84
Range                    9.24     8.22    11.75    12.38     8.15
Table 7.6 Descriptive statistics of mean monthly downscaled temperature (°C) for the future period
Statistics (°C)       Point A  Point B  Point C  Point D  Point E
Future downscaling period (Jan’05 to Dec’35)
Mean                    25.12    25.91    26.72    28.10    24.12
Median                  24.89    25.74    25.76    28.03    23.40
Standard Error           0.11     0.13     0.20     0.17     0.14
Standard Deviation       2.15     2.44     3.80     3.29     2.69
Skewness                 0.26     0.20     0.42     0.07     0.54
Minimum                 20.46    20.92    19.81    19.21    19.03
Maximum                 29.90    31.20    34.60    34.26    29.91
Range                    9.44    10.28    14.79    12.05    10.88
Figure 7.18 Mean of the temperature time series of IMD (calibration and validation period) and downscaled future data for
all the stations
Figure 7.19 Maximum of the temperature time series of IMD (calibration and validation period) and downscaled future
data for all the stations
The above statistics are consistent with the geographical and local climatic conditions. It is
observed that stations C and D, which lie in the central part of the basin and fall in the arid
and semi-arid zone, experience higher temperatures than the other stations. In future, a mixed
response is seen at different stations: the bar charts and Tables 7.5 and 7.6 show that the maximum
temperature increases at stations B, C and E, while it reduces at stations A and D.
Figure 7.20 Scatter plots of observed v/s downscaled temperature for validation period
Figure 7.20 depicts the scatter plots of observed v/s downscaled temperature for the validation period
for all four models. The scatter points of the W-P-SoV model are the most closely spaced and lie closest to
the 45° line in comparison with the other models. The P-MLR model performs poorly, as its scatter points are
the most widely dispersed at all the stations.
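How closely the scatter points hug the 45° line can also be expressed as a single number. The Python sketch below computes the root mean square of the vertical distances between the points and the line of perfect agreement (simulated = observed). It is offered as an assumed illustration; the function name `rmse_to_identity_line` is hypothetical and the values are made up.

```python
import math


def rmse_to_identity_line(observed, simulated):
    """Root mean square of (simulated - observed), i.e. spread about the 45° line."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated series must have equal length")
    squared = [(s - o) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up monthly temperatures (°C), for illustration only
observed = [24.1, 26.3, 29.8, 31.2, 27.5]
simulated = [24.4, 26.0, 29.1, 31.9, 27.3]
spread = rmse_to_identity_line(observed, simulated)
```

A model whose points lie exactly on the 45° line gives a value of zero, and larger values correspond to more widely dispersed points.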
8. SUMMARY AND CONCLUSION
This study proposes a wavelet based multi-resolution modeling approach for downscaling
of GCM variables to local mean monthly precipitation and temperature for the Krishna Basin,
India, using linear and non-linear models. Of these, the Second order Volterra (SoV) model
coupled with wavelet and PCA proved to be the most promising. The modeling
framework is applied to five stations spanning the study basin. The study employs an
MWE based K-means clustering technique to classify the variables into
homogeneous clusters. PCA is applied to each cluster to derive representative variables
explaining 90-95% of the variance of each cluster. Multi-resolution models are developed from
these variables using the SoV and MLR frameworks. P-ANN and P-MLR models are developed
by applying ANN and MLR to the same variables for the purpose of comparison. Based on
various performance indices, it is concluded that wavelet analysis is an important tool for
improving the performance of a downscaling model. The wavelet transform has been
successfully applied for variable reduction in combination with K-means and PCA. The multi-resolution
models (W-P-SoV and W-P-MLR) perform better than the P-ANN and P-MLR models, which
employ conventional ANN and MLR modeling methodology. Of W-P-SoV and
W-P-MLR, the former is observed to perform better at all stations. SoV has thus proved to
be a superior modeling methodology for climatic downscaling applications.
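The rule of retaining enough principal components to explain 90-95% of the variance of a cluster can be sketched as follows. This Python fragment assumes that the eigenvalues of the cluster's covariance (or correlation) matrix have already been computed; the function name `components_for_variance` and the example eigenvalues are hypothetical.

```python
def components_for_variance(eigenvalues, target=0.90):
    """Smallest number of leading components whose variance share reaches target."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= target:
            return count
    return len(ordered)


# Made-up eigenvalues for one cluster, for illustration only
eigenvalues = [5.0, 2.0, 0.6, 0.3, 0.1]
retained = components_for_variance(eigenvalues, target=0.90)  # 3 for these values
```

Here the first two components explain 87.5% of the variance and the first three about 95%, so three representative variables would be carried forward for this hypothetical cluster.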
The downscaling points were chosen so as to cover a wide geographical area of the Krishna basin.
Some of these points differ significantly from each other in climatology and rainfall
pattern. This ensured reproducibility of the proposed multi-resolution framework.
The study uses NCEP data for calibrating the models, based on past studies which state it
to be reliable, as it has been derived by up-scaling the observed climatic processes at
local scale. Another option for calibrating the models is to use the historical GCM
data, which are model-generated. Choosing between these two options requires fair justification,
and accordingly both options were taken up in the analysis. It was observed that,
on the basis of model performance, the NCEP reanalysis data gave comparatively more
promising results, although the difference in the performance
of the models was not large.
Separate models were developed for each DWC of the precipitation and temperature time
series. Hence the models were sensitive to both long-term and transient trends in the input
variables, providing an edge over standalone models. Moreover, SoV models have an
advantage over MLR based models as they address the non-linearity in the process.
Transient processes, which are mostly accounted for by the D1 component of a time series,
are difficult to model using linear models due to the inherent complexity of the
process. Hence a non-linear model captures variations in these components more
accurately.
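Modelling each wavelet component separately and then summing the component predictions relies on the decomposition being additive. The Python sketch below uses a Haar à trous scheme as an assumed stand-in for the transform used in the study; the function names, the boundary treatment (the first value is repeated) and the sample series are hypothetical. It splits a series into detail components D1 to D3 and an approximation A3, and checks that they add back to the original series.

```python
def haar_a_trous(series, levels):
    """Additive decomposition: returns ([D1, ..., DJ], AJ)."""
    approx = list(series)
    details = []
    for level in range(1, levels + 1):
        lag = 2 ** (level - 1)
        # Haar smoothing with holes; indices before the start reuse the first value
        smoother = [0.5 * (approx[t] + approx[max(t - lag, 0)])
                    for t in range(len(approx))]
        details.append([a - s for a, s in zip(approx, smoother)])
        approx = smoother
    return details, approx


def reconstruct(details, approx):
    """Sum all detail components and the approximation, time step by time step."""
    return [sum(values) for values in zip(approx, *details)]


# Made-up monthly series, for illustration only
series = [1.0, 4.0, 2.0, 8.0, 5.0, 7.0, 3.0, 6.0]
details, approx = haar_a_trous(series, levels=3)
rebuilt = reconstruct(details, approx)
```

Because the components sum to the original series, a prediction for the series can be formed by adding the predictions of the separate component models, with D1 carrying the fastest fluctuations.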
The proposed framework does not employ a hybrid ANN-wavelet model for downscaling.
Most of the studies available in the literature use a trial-and-error approach
to find the optimum parameters for an ANN. However, in the case of a multi-resolution
approach, as proposed in this study, determining the optimum ANN structure for each
resolution and each wavelet is a strenuous task. Models were developed for 3 resolution
levels, using 45 wavelets for each downscaling point. This means that the number of optimum
ANN models to be generated for each downscaling point would be 180 (= (3+1) x 45).
For all 5 points, the optimum ANN count would be 900 (= 180 x 5). Effectively, the models
would be susceptible to a high degree of uncertainty pertaining to the selection of suitable
parameters for the ANN. On the other hand, the SoV and MLR models were computationally
more efficient and deterministic. Moreover, the exact nature of the inter-relationship between
the climatic variables and downscaled precipitation or temperature can be obtained using
parametric techniques like SoV and MLR.
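The model count quoted above can be reproduced with a short calculation. In this Python sketch the "+1" is read as one approximation component in addition to the detail components at the 3 resolution levels, which is an assumption about how the (3+1) factor arises; the function name `ann_model_count` is hypothetical.

```python
def ann_model_count(resolution_levels, wavelets, points):
    """Return (models per downscaling point, models for all points)."""
    # Assumed: one model per detail level plus one for the approximation
    components = resolution_levels + 1
    per_point = components * wavelets
    return per_point, per_point * points


per_point, total = ann_model_count(resolution_levels=3, wavelets=45, points=5)
print(per_point, total)  # 180 900
```

Each of these models would need its own search over ANN structure and parameters, which is the source of the effort and uncertainty described above.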
MWE is employed in this study to cluster the variables based on their features up to a 128
month period. MWE proved to be an effective tool for clustering variables based on
similar features in the time-frequency domain. The downscaling models may be more
sensitive to variation in some clusters than in others. Hence a further analysis can be
carried out to determine the clusters of variables (and hence, individual variables at different
pressure levels) which play a comparatively important role in statistical downscaling of
GCM variables to local level. However, this analysis is outside the scope of this study and
is a subject for further research.
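As a rough illustration of an entropy measure built on wavelet energies, the Python sketch below computes the Shannon entropy of the relative energy at each scale. This is an assumed, simplified form and is not necessarily the MWE formulation used in the study; the function name `wavelet_entropy`, the seven dyadic scales (2 to 128 months) and the energy values are hypothetical.

```python
import math


def wavelet_entropy(scale_energies):
    """Shannon entropy of the relative wavelet energy across scales."""
    total = sum(scale_energies)
    # Scales with zero energy contribute nothing to the entropy
    shares = [energy / total for energy in scale_energies if energy > 0]
    return -sum(p * math.log(p) for p in shares)


# Made-up energies at scales of 2, 4, 8, 16, 32, 64 and 128 months
energies = [0.8, 1.5, 6.2, 3.1, 0.9, 0.4, 0.2]
entropy = wavelet_entropy(energies)
```

Variables whose energy is spread over the scales in a similar way yield similar entropy values, which is the kind of feature a K-means step can then group together.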
It is observed that the performance of the downscaling models varies significantly from point
to point. These models are heavily influenced by local factors like regional climatology,
land use, topography, etc. This is evident in the case of Station B, where the
performance of the models is significantly different from that of the respective models at other
downscaling points. This can be attributed to the fact that Station B is located on the
windward side of the Western Ghats and experiences heavy rainfall, in contrast to the other
stations, which lie in arid/semi-arid regions of the Southern Indian Peninsula. Hence, a
rational basis for interpreting the downscaled values obtained from the
proposed statistical framework is advised.
The future downscaled scenario has been generated using the RCP 4.5 scenario of the CanCM4
model for both mean monthly precipitation and temperature up to the year 2035. This
downscaled information can be used by decision makers for planning purposes and by
researchers in hydrological studies.
The study also examines the credibility of GCMs, which fail to capture the climate in an
efficient way and exhibit only partial attributes of it. There is a need to incorporate many
underlying natural processes into the GCMs to bring greater rationality to them.
This inability of GCMs introduces uncertainty into these models, which further propagates
into the adopted downscaling technique. Hence a detailed and thorough analysis is advised
before using the output of these models at local/regional scale for the future.
In order to facilitate application of the proposed approach to diverse datasets by a wider
audience, a MATLAB toolkit based on the proposed approach has been developed and is available
through the website
https://sites.google.com/site/climaticdownscalingtool/home.
REFERENCES
Adamowski, J. F. (2008). "River flow forecasting using wavelet and cross-wavelet transform
models." Hydrological processes 22(25): 4877-4891.
Agarwal, A. et al., (2015) “Hydrologic regionalization using Wavelet based Multi Scale
entropy technique.”
Anandhi, A., V. Srinivas, et al. (2008). "Downscaling precipitation to river basin in India for
IPCC SRES scenarios using support vector machine." International Journal of Climatology
28(3): 401-420.
Anandhi, A., V. Srinivas, et al. (2009). "Role of predictors in downscaling surface temperature
to river basin in India for IPCC SRES scenarios using support vector machine." International
Journal of Climatology 29(4): 583-603.
Asrar, G. and J. Dozier (1994). "EOS." Science Strategy for the Earth Observing System, 128
pp. AIP Press 1.
Bader, D., C. Covey, et al. (2008). "Climate models: an assessment of strengths and
limitations." US Department of Energy Publications: 8.
Ball, G. H. and D. J. Hall (1967). "A clustering technique for summarizing multivariate data."
Behavioral science 12(2): 153-155.
Beecham, S., M. Rashid, et al. (2014). "Statistical downscaling of multi-site daily rainfall in a
South Australian catchment using a Generalized Linear Model." International Journal of
Climatology 34(14): 3654-3670.
Bishop, C. M. (1995). "Neural networks for pattern recognition."
Bolshakova, N. and F. Azuaje (2003). "Machaon CVE: cluster validation for gene expression
data." Bioinformatics 19(18): 2494-2495.
British Geological Survey-
http://www.bgs.ac.uk/discoveringGeology/climateChange/general/causes.html?src=topNav
Bürger, G., T. Murdock, et al. "Downscaling extremes-an intercomparison of multiple
statistical methods for present climate." Journal of Climate 25(12): 4366-4388.
86
Cai, X., D. Wang, et al. (2009). "Assessing the regional variability of GCM simulations."
Geophysical Research Letters 36(2).
Cannon, A. J. and P. H. Whitfield (2002). "Downscaling recent streamflow conditions in
British Columbia, Canada using ensemble neural network models." Journal of Hydrology
259(1): 136-151.
Carter, T., M. Parry, et al. (1994). "IPCC technical guidelines for assessing climate change
impacts and adaptations with a summary for policy makers and a technical summary."
Cek, M. E., et al. Continuous time wavelet entropy of auditory evoked potentials, Computers
in biology and medicine, 40(1), 90-96.
Chadwick, R., E. Coppola, et al. (2011) "An artificial neural network technique for
downscaling GCM outputs to RCM spatial scale." Nonlinear Processes in Geophysics 18:
1013-1028.
Chen, S., et al. (1989), Orthogonal least squares methods and their application to non-linear
system identification, International Journal of control, 50(5), 1873-1896.
Chou, C.-M. and R.-Y. Wang (2002). "On-line estimation of unit hydrographs using the
wavelet-based LMS algorithm/Estimation en ligne des hydrogrammes unitaires grâce à
l'algorithme des moindres carrés moyens à base d'ondelettes." Hydrological sciences journal
47(5): 721-738.
Coulibaly, P. and D. H. Burn (2004). "Wavelet analysis of variability in annual Canadian
streamflows." Water Resources Research 40(3).
Coulibaly, P., Y. B. Dibike, et al. (2005). "Downscaling precipitation and temperature with
temporal neural networks." Journal of Hydrometeorology 6(4): 483-496.
Courtillot, V., et al. (2007), Are there connections between the Earth's magnetic field and
climate?, Earth and Planetary Science Letters, 253(3), 328-339.
Dai, X., P. Wang, et al. (2003). "Multiscale characteristics of the rainy season rainfall and
interdecadal decaying of summer monsoon in North China." Chinese Science Bulletin 48(24):
2730-2734.
Davies, D. L. and D. W. Bouldin (1979). "A cluster separation measure." Pattern Analysis and
Machine Intelligence, IEEE Transactions on (2): 224-227.
87
Department of Energy and Climate Change, 2009. Climate Change Act 2008: Impact
Assessment London: DECC. Available at
http://webarchive.nationalarchives.gov.uk/20090311095401/http://www.decc.gov.uk/Media/
viewfile.ashx?FilePath=85_20090310164124_e_@@_climatechangeactia.pdf&filetype=4
Devak, M., et al. Dynamic coupling of support vector machine and K-nearest neighbour for
downscaling daily rainfall, Journal of hydrology, 525, 286-301.
Dunn, J. C. (1973). "A fuzzy relative of the ISODATA process and its use in detecting compact
well-separated clusters."
Edwards, R., and A. Brooks (2008), The island of Ireland: Drowning the myth of an Irish land-
bridge? The Irish Naturalists' Journal, 19-34.
"Endangered Species- World Wildlife Fund." The Science of Birds. Ornithology, 18 Feb. 2009.
Web. 15 June 2015.
Falkowski, P., et al. (2000), The global carbon cycle: a test of our knowledge of earth as a
system, science, 290(5490), 291-296.
Fistikoglu, O. and U. Okkan "Statistical downscaling of monthly precipitation using
NCEP/NCAR reanalysis data for Tahtali River basin in Turkey." Journal of Hydrologic
Engineering 16(2): 157-164.
Foufoula-Georgiou, E., and M. Ebtehaj Variational Data Assimilation via Sparse
Regularization, paper presented at EGU General Assembly Conference Abstracts.
Gaur, A., and K. Vora (1999), Ancient shorelines of Gujarat, India, during the Indus
civilization (Late Mid-Holocene): A study based on archaeological evidences, Current
Science, 77, 180-185p.
Ghosh, S. and P. Mujumdar (2006). "Future rainfall scenario over Orissa with GCM
projections by statistical downscaling." Current Science 90(3): 396-404.
Ghosh, S. and P. Mujumdar (2007). "Nonparametric methods for modeling GCM and scenario
uncertainty in drought assessment." Water Resources Research 43(7).
88
Ghosh, S. and P. Mujumdar (2008). "Statistical downscaling of GCM simulations to
streamflow using relevance vector machine." Advances in Water Resources 31(1): 132-146.
Govindaraju, R. S. (2005). Bayesian learning and relevance vector machines for hydrologic
applications. 2nd Indian International Conference on Artificial Intelligence (IICAI-05), Pune,
India.
Grinsted, A., J. C. Moore, et al. (2004). "Application of the cross wavelet transform and
wavelet coherence to geophysical time series." Nonlinear processes in geophysics 11(5/6):
561-566.
Halkidi, M., Y. Batistakis, et al. (2001). "On clustering validation techniques." Journal of
Intelligent Information Systems 17(2-3): 107-145.
Haykin, S. and N. Network (2004). "A comprehensive foundation." Neural Networks 2(2004).
Hessami, M., P. Gachon, et al. (2008). "Automated regression-based statistical downscaling
tool." Environmental Modelling & Software 23(6): 813-834Hewitson, B., and R. Crane (1996),
Climate downscaling: Techniques and application, Climate Research, 7(2), 85-95.
Hewitson, B., and R. Crane (1996), Climate downscaling: Techniques and application, Climate
Research, 7(2), 85-95.
Högbom, Arvid. "Om sannolikheten för sekulära förändringar i atmosfärens
kolsyrehalt." Svensk kemisk tidskrift 4 (1894): 169-177.
Hutchins, K. S., et al. (1997), Impact of a paleomagnetic field on sputtering loss of Martian
atmospheric argon and neon, Journal of Geophysical Research: Planets (1991-2012), 102(E4),
9183-9189.
Idso, Craig, and S. Fred Singer. "Climate change reconsidered." 2009 Report of the (2009).
IPCC (Intergovernmental Panel on Climate Change) (2007) Climate Change 2007 Synthesis
Report: Contribution of Working Groups I, II, and III to the Fourth Assessment Report of the
Intergovernmental Panel on Climate Change. Cambridge: Cambridge University Press
Johnson, F., and A. Sharma (2009), Measurement of GCM skill in predicting variables relevant
for hydroclimatological assessments, Journal of Climate, 22(16), 4373-4382.
89
Kanungo, T., D. M. Mount, et al. (2002). "An efficient k-means clustering algorithm: Analysis
and implementation." Pattern Analysis and Machine Intelligence, IEEE Transactions on 24(7):
881-892.
Kasturi, J., R. Acharya, et al. (2003). "An information theoretic approach for analyzing
temporal patterns of gene expression." Bioinformatics 19(4): 449-458.
Kim, J., J. Chang, et al. (1984). "The statistical problem of climate inversion: Determination
of the relationship between local and large-scale climate." Monthly weather review 112(10):
2069-2077.
Kim, S. (2004). "Wavelet analysis of precipitation variability in northern California, USA."
KSCE Journal of Civil Engineering 8(4): 471-477.
Kim, T.-W. and J. B. Valdés (2003). "Nonlinear model for drought forecasting based on a
conjunction of wavelet transforms and neural networks." Journal of Hydrologic Engineering
8(6): 319-328.
Kucuk, Murat, and Necati Ağirali-super (2006). "Wavelet regression technique for streamflow
prediction." Journal of Applied Statistics 33(9): 943-960.
Kundzewicz, Z. W., and E. Z. Stakhiv Are climate models “ready for prime time” in water
resources management applications, or is more research needed?, Hydrological Sciences
Journal-Journal des Sciences Hydrologiques, 55(7), 1085-1089.
Labat, D., et al. (2000), Rainfall- runoff relations for karstic springs. Part II: continuous
wavelet and discrete orthogonal multiresolution analyses, Journal of hydrology, 238(3), 149-
178.
Lenart, M. (2008). "Downscaling techniques." The University of Arizona (Southwest Clomate
Change Network).
Lokenath Debnath (2010), Wavelet Transforms And Time-Frequency Signal Analysis.
Publisher: Birkhuser Basel, ISBN: 0817641041.
Lu, R. (2002). "Decomposition of interdecadal and interannual components for North China
rainfall in rainy season." Chinese Journal of Atmosphere (in Chinese) 26: 611-624.
90
MacQueen, J. (1967). Some methods for classification and analysis of multivariate
observations. Proceedings of the fifth Berkeley symposium on mathematical statistics and
probability, Oakland, CA, USA.
Maheswaran, R. and R. Khosa Multiscale nonlinear model for monthly streamflow forecasting:
a wavelet-based approach, Journal of Hydroinformatics, 14(2), 424-442.
Maheswaran, R. and R. Khosa "Comparative study of different wavelets for hydrologic
forecasting." Computers & Geosciences 46: 284-295.
Maheswaran, R. and R. Khosa "Wavelet-Volterra coupled model for monthly stream flow
forecasting." Journal of Hydrology 450: 320-335.
Mahmood, R. and M. S. Babel (2014) "Future changes in extreme temperature events using
the statistical downscaling model (SDSM) in the trans-boundary region of the Jhelum river
basin." Weather and Climate Extremes 5: 56-66.
Maier, H. R. and G. C. Dandy (2000). "Neural networks for the prediction and forecasting of
water resources variables: a review of modelling issues and applications." Environmental
modelling & software 15(1): 101-124.
Maimon, O. and L. Rokach Introduction to knowledge discovery and data mining. Data Mining
and Knowledge Discovery Handbook, Springer: 1-15.
Mallat, S. p. (1999). A wavelet tour of signal processing, Academic press.
Maurer, E. P. and H. G. Hidalgo (2008). "Utility of daily vs. monthly large-scale climate data:
an intercomparison of two statistical downscaling methods." Hydrology and Earth System
Sciences 12(2): 551-563.
MacQueen, J. (1967), Some methods for classification and analysis of multivariate
observations, paper presented at Proceedings of the fifth Berkeley symposium on mathematical
statistics and probability, Oakland, CA, USA.
Müller, B. (1995). Neural networks: an introduction, Springer Science & Business Media.
Munn, R. E., et al. (2002), Encyclopedia of global environmental change, Wiley Chichester.
Nourani, V., M. Komasi, et al. (2009). "A multivariate ANN-wavelet approach for rainfall-
runoff modeling." Water resources management 23(14): 2877-2894.
91
Pachauri, R. K., et al. Climate Change 2014: Synthesis Report. Contribution of Working
Groups I, II and III to the Fifth Assessment Report of the Intergovernmental Panel on Climate
Change.
Partal, T. and Özgür Kişi (2007). "Wavelet and neuro-fuzzy conjunction model for
precipitation forecasting." Journal of Hydrology 342(1): 199-212.
Perica, S., and E. Foufoula-Georgiou (1996), Model for multiscale disaggregation of spatial
rainfall based on coupling meteorological and scaling descriptions, Journal of Geophysical
Research: Atmospheres (1984-2012), 101(D21), 26347-26361.
Petit, J.-R., et al. (1999), Climate and atmospheric history of the past 420,000 years from the
Vostok ice core, Antarctica, Nature, 399(6735), 429-436.
Rajaee, T., S. Mirbagheri, et al. "Prediction of daily suspended sediment load using wavelet
and neurofuzzy combined model." International Journal of Environmental Science &
Technology 7(1): 93-110.
Rao, G. S. (2004), Wavelet Analysis And Applications. Publisher: New Age International (p)
Limited.
Rashid, M. M., S. Beecham, et al. "Statistical downscaling of rainfall: a non-stationary and
multi-resolution approach." Theoretical and Applied Climatology: 1-15.
Rokach, L., and O. Maimon (2005), Clustering methods, in Data mining and knowledge
discovery handbook, edited, pp. 321-352, Springer.
Ryan, W. B., et al. (1997), An abrupt drowning of the Black Sea shelf, Marine Geology, 138(1),
119-126.
Salvi, K. and S. Ghosh "High-resolution multisite daily rainfall projections in India with
statistical downscaling for climate change impacts assessment." Journal of Geophysical
Research: Atmospheres 118(9): 3557-3578.
Sang, Y.-F., D. Wang, et al. "Wavelet-based analysis on the complexity of hydrologic series
data under multi-temporal scales." Entropy 13(1): 195-210.
92
Sehgal, V., M. K. Tiwari, et al. "Wavelet bootstrap multiple linear regression based hybrid
modeling for daily river discharge forecasting." Water resources management 28(10): 2793-
2811.
Sehgal, V., R. R. Sahay, et al. "Effect of utilization of discrete wavelet components on flood
forecasting performance of wavelet based ANFIS models." Water resources management
28(6): 1733-1749.
Shannon, C. E. (1948). "A note on the concept of entropy." Bell System Tech. J 27: 379-423.
Shashikanth, K. and S. Ghosh (2013). "Fine Resolution Indian Summer Monsoon Rainfall
Projection with Statistical Downscaling."
Smith, L. C., D. L. Turcotte, et al. (1998). "Stream flow characterization and feature detection
using a discrete wavelet transform." Hydrological processes 12(2): 233-249.
Suykens, J. A. (2001). Nonlinear modelling and support vector machines. Instrumentation and
Measurement Technology Conference, 2001. IMTC 2001. Proceedings of the 18th IEEE,
IEEE.
Torrence, C. and G. P. Compo (1998). "A practical guide to wavelet analysis." Bulletin of the
American Meteorological society 79(1): 61-78.
Trigo, R. M. and J. P. Palutikof (1999). "Simulation of daily temperatures for climate change
scenarios over Portugal: a neural network model approach." Climate Research 13(1): 45-59.
Tripathi, S. and V. Srinivas (2005). Downscaling of general circulation models to assess the
impact of climate change on rainfall of India. Proceedings of International Conference on
Hydrological Perspectives for Sustainable Development (HYPESD-2005).
United Nations (1992) United Nations Framework Convention on Climate Change.
FCCC/INFORMAL/84, GE.05-62220 (E) 200705. Available at-
http://unfccc.int/resource/docs/convkp/conveng.pdf
Van Vuuren, D. P., J. Edmonds, et al. "The representative concentration pathways: an
overview." Climatic Change 109: 5-31.
von Storch, H., H. Langenberg, et al. (2000). "A spectral nudging technique for dynamical
downscaling purposes." Monthly weather review 128(10): 3664-3673.
93
Wang, W. and J. Ding (2003). "Wavelet network model and its application to the prediction of
hydrology." Nature and Science 1(1): 67-71.
Wang, Y., L. R. Leung, et al. (2004). "Regional climate modeling: progress, challenges, and
prospects." Journal of the Meteorological Society of Japan 82(6): 1599-1628.
Water resources information system of India (2015).
http://india-wris.nrsc.gov.in/wrpinfo/index.php?title=Krishna
Wigley, T., P. Jones, et al. (1990). "Obtaining sub-grid-scale information from coarse-
resolution general circulation model output." Journal of Geophysical Research: Atmospheres
(1984-2012) 95(D2): 1943-1953.
Wilby, R. L. and T. Wigley (1997). "Downscaling general circulation model output: a review
of methods and limitations." Progress in Physical Geography 21(4): 530-548
Wilby, R. L., T. Wigley, et al. (1998). "Statistical downscaling of general circulation model
output: a comparison of methods." Water Resources Research 34(11): 2995-3008.
Wilby, R., S. Charles, et al. (2004). "Guidelines for use of climate scenarios developed from
statistical downscaling methods."
Wilby, R. L., and S. Dessai Robust adaptation to climate change, Weather, 65(7), 180-185.
Xu, C.-Y. "Downscaling GCMs using the Smooth Support Vector Machine method to predict
daily precipitation in the Hanjiang Basin." Advances in Atmospheric Sciences 27(2): 274-284.
Zhou, H.-c., Y. Peng, et al. (2008). "The research of monthly discharge predictor-corrector
model based on wavelet decomposition." Water resources management 22(2): 217-227.