
Theoretical and Applied Climatology
https://doi.org/10.1007/s00704-019-03082-6
ORIGINAL PAPER
Spatio-temporal climate regionalization using a self-organized
clustering approach
Mihaela I. Chidean1 · Antonio J. Caamaño1 · Carlos Casanova-Mateo2 · Julio Ramiro-Bargueño1 · Sancho Salcedo-Sanz3
Received: 2 March 2018 / Accepted: 23 December 2019
©Springer-Verlag GmbH Austria, part of Springer Nature 2020
Abstract
The authors present a novel self-organized climate regionalization (CR) method that obtains a spatial clustering of
regions, based on the explained variance of physical measurements in their coverage. This method enables a microscopic
characterization of the probabilistic spatial extent of climate regions, using the statistics of the obtained clusters. It also
allows for the study of the macroscopic behaviour of climate regions through time by using the dissimilarity among different
cluster size probability histograms. The main advantages of the presented method, based on the Second-Order Data-Coupled
Clustering (SODCC) algorithm, are that SODCC is robust to the selection of tunable parameters and that it does not require a
regular or homogeneous grid to be applied. Moreover, the SODCC method has higher spatial resolution, lower computational
complexity, and allows for a more direct physical interpretation of the outputs than other existing CR methods, such as
Empirical Orthogonal Function (EOF) or Rotated Empirical Orthogonal Function (REOF). These facts are illustrated with
an example of winter wind speed regionalization in the Iberian Peninsula through the period (1979–2014). This study also
reveals that the North Atlantic Oscillation (NAO) has a high influence over the wind distribution in the Iberian Peninsula in
a subset of years in the considered period.
1 Introduction
Climate regionalization (CR) is defined as the process of
dividing a given area into smaller regions, in such a way that
they are somehow homogeneous with respect to a specified
climatic variable (Badr et al. 2015). CR is a key point
in climate studies, since it allows explaining small-scale
climate events in terms of the spatio-temporal mechanisms
which produce them. CR has been specifically applied to
palaeo-climatic problems (Knapp et al. 2002), precipitation
trends, floods and drought events (Comrie and Glenn 1998;
Baeriswyl and Rebetez 1997; Burn 1989), numerical models
improvement for climate studies (Argüeso et al. 2011;
Regonda et al. 2016), or climate change studies (Önol and
Semazzi 2009), among others.
Electronic supplementary material The online version of this
article (https://doi.org/10.1007/s00704-019-03082-6) contains
supplementary material, which is available to authorized users.
Antonio J. Caamaño
antonio.caamano@urjc.es
1 Department of Signal Theory and Communications,
Universidad Rey Juan Carlos, Madrid, Spain
2 Department of Civil Engineering: Construction,
Infrastructures and Transports, Universidad Politécnica
de Madrid, Madrid, Spain
3 Department of Signal Processing and Communications,
Universidad de Alcalá, Madrid, Spain
There are a number of well-known linear analysis techniques
for obtaining high-quality CR. Empirical Orthogonal
Function (EOF) analysis, also known as Principal Component
Analysis (PCA), is one of the most standard techniques
in climatology with direct application in CR. EOF analysis
tries to identify natural spatio-temporal variability of
observations (Jolliffe 2002). The idea behind EOF analysis
is to identify a set of orthogonal eigenfunctions which
accounts for most of the system’s total variance (von Storch
and Zwiers 1999). Thus, EOF analysis tries to obtain the
dominant modes of variability, in turn reducing the data
space by only considering those EOFs which cover a large
percentage of the total variance. EOF analysis has been
intensely used in CR (White et al. 1991; Comrie and Glenn
1998; Baeriswyl and Rebetez 1997). The basic idea is to use
EOF or Rotated Empirical Orthogonal Function (REOF) to
define and interpret clusters of different climatic variables,
(2020) 140:927–949 / Published online: 13 February 2020
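As an illustration of the EOF analysis described in the introduction above, the following is a minimal sketch, not the authors' implementation. It assumes the data are arranged as a (time × grid point) matrix and obtains the orthogonal modes from a singular value decomposition of the anomalies; the function name `eof_analysis` and the synthetic test field are hypothetical.

```python
import numpy as np


def eof_analysis(field, n_modes=3):
    """Leading EOFs of a (n_times, n_points) data matrix (illustrative sketch).

    Returns the spatial patterns (n_modes, n_points), the principal component
    time series (n_times, n_modes), and the fraction of total variance
    explained by each retained mode.
    """
    # Remove the time mean at every grid point to obtain anomalies.
    anomalies = field - field.mean(axis=0)
    # SVD of the anomaly matrix: rows of vt are orthonormal spatial patterns.
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    # Squared singular values are proportional to the variance of each mode.
    variance = s ** 2 / (anomalies.shape[0] - 1)
    explained = variance / variance.sum()
    eofs = vt[:n_modes]
    pcs = u[:, :n_modes] * s[:n_modes]
    return eofs, pcs, explained[:n_modes]


if __name__ == "__main__":
    # Hypothetical synthetic field: one dominant spatial pattern plus noise.
    rng = np.random.default_rng(0)
    pattern = np.ones(10) / np.sqrt(10)
    amplitude = rng.normal(0.0, 5.0, size=200)
    field = np.outer(amplitude, pattern) + rng.normal(0.0, 0.1, size=(200, 10))
    eofs, pcs, explained = eof_analysis(field)
    print("explained variance fractions:", explained)
```

Keeping only the modes whose explained variance fractions add up to a large share of the total is what reduces the data space in the sense described above.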
... The present study utilized the empirical orthogonal function (EOF) (Lorenz, 1956; Hannachi et al., 2007; Yosef et al., 2017; Chidean et al., 2020) to examine spatial modes (i.e., patterns) of precipitation variability at a spatial resolution of 0.5° × 0.5° during 1978-2017 and how they change with time. The EOF is analysed by calculating the eigenvalues and eigenvectors of a field's spatially weighted anomaly covariance matrix. ...
This study investigates the relationship between the Antarctic sea ice concentration (SIC) and the precipitation variability in Tanzania during March to May (MAM) season from 1978 to 2017. It is found that the MAM SIC over the Weddell (Ross) sea is negatively correlated with the MAM precipitation, particularly over northern (southern) Tanzania, signifying that the high (low) MAM SIC over the Weddell (Ross) sea is associated with suppressed (enhanced) rainfall in Tanzania. The atmospheric circulations related to the MAM SIC anomalies were further analysed. It is revealed that the positive MAM SIC anomalies in the Weddell sea and the Ross sea are associated with the upper‐level wave train patterns propagating from the high latitudes to the low latitudes of the Southern Hemisphere, which results in anomalous upper‐level cyclonic circulation over the western Indian Ocean and Tanzania. The upper‐level cyclonic circulation anomaly favours the low‐level divergence and results in the subsidence over Tanzania. Moreover, the low‐level wind outflow over Tanzania due to divergence reduces the water vapour supply to Tanzania. This background is unfavourable for the occurrence of precipitation and thus decreases the precipitation in Tanzania. The situation is reversed for the negative MAM SIC anomalies over the Weddell sea and the Ross sea, conducive to the increase of precipitation over Tanzania.
... This study employed the empirical orthogonal function (EOF) (Lorenz, 1956; Hannachi et al., 2007; Yosef et al., 2017; Chidean et al., 2020) to examine spatial modes (i.e., patterns) of OND rainfall variability at a spatial resolution of 0.5° × 0.5° and how they change with time during 1950-2017. The EOF is analyzed by calculating the eigenvalues and eigenvectors of a field's spatially weighted anomaly covariance matrix. ...
This study presents the relationship between Tanzania short rain variability and the sea surface temperature (SST) over the Southern Oceans from 1950 to 2017. It is found that the warm SST anomalies to the east of Australia (EA-SST) and the southern Atlantic Ocean (SA-SST) are significantly negatively correlated with the OND rainfall throughout Tanzania, signifying that the warmer (cooler) than normal EA-SST and SA-SST tend to cause a suppressed (enhanced) OND rainfall in Tanzania. Further investigation indicates that the above-normal SA-SST anomalies are linked to the changes of Walker-type circulation over the Atlantic Ocean, with the low-level (upper-level) divergence (convergence) occurring over the study region, which suppresses the in-situ convection and hence decreases the rainfall over Tanzania. The above-normal SA-SST anomalies are associated with the upper-level wave patterns propagating from the southern Atlantic Ocean, resulting in the formation of cyclonic anomalies over the target region. The upper-level cyclonic anomalies formed favor the subsidence of airflows over Tanzania and hence reduce rainfall. The local moisture and dynamical conditions also support the atmospheric circulations observed, whereby warm EA-SST and SA-SST anomalies are associated with the westerly moisture flux over the Indian Ocean moving away from Tanzania and the descending motion over Tanzania. Hence, close monitoring of SST anomalies over these regions might be useful in updating OND rainfall seasonal forecasts in Tanzania.
... EOF method has been used in meteorology studies since the early 1940s (Lorenz, 1956; Hannachi and Stephenson, 2007; Yosef et al., 2017; Chidean et al., 2020). This method is used by many scientists to identify the most dominant mode of variability associated with a set of variables. ...
Rainfall is the most important meteorological variable that influences the economic development of Rwanda. Changes in rainfall trends and variability over recent years have become a great concern to policymakers and scientists. This study aims at examining the spatiotemporal variability of rainfall over Rwanda and the teleconnections of rainfall with different large-scale ocean-atmospheric variables at different timescales. The study used rainfall data from the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS) and the Climatic Research Unit Time Series Version 4 (CRU) for the period 1981–2017. Several statistical methods, including standardized anomaly, Empirical Orthogonal Functions (EOF), Pearson Correlation, Mann-Kendall (MK), and Sen's gradient estimator, were used to assess the variability, trends, and teleconnections of rainfall with various driving factors. Results revealed a bimodal rainfall pattern in its annual cycle. The spatial distribution of annual and seasonal rainfall showed a southwest to northwest rainfall gradient. The MK test revealed a decreasing trend in annual rainfall in the southwest part of the country. Overall, the March to May (MAM) rainy season showed a decreasing trend and the September to December (SOND) rainy season an increasing trend over Rwanda. The EOF analysis revealed that the leading mode of variability for MAM rainfall displays a unimodal pattern with negative loadings that can explain 59.3% of the total rainfall variance. The dominant mode of variability of SOND rainfall revealed the same pattern but with positive loadings that can explain 58.1% of the total variance. Spatial correlation showed that the MAM (SOND) rainfall has a weak (strong) relationship with the Indian Ocean sea surface temperature (SST), which means a negative (positive) Indian Ocean Dipole can lead to anomalously wet (dry) conditions over Rwanda. A stronger influence of El Niño Southern Oscillation (ENSO) on SOND rainfall than on MAM rainfall was noticed. The results of this study are crucial in developing appropriate mitigation measures to curb the impacts of climate change on the agriculture and water resources of Rwanda.
We consider a spiked population model, proposed by Johnstone, whose population eigenvalues are all unit except for a few fixed eigenvalues. The question is to determine how the sample eigenvalues depend on the non-unit population ones when both sample size and population size become large. This paper completely determines the almost sure limits for a general class of samples.
Defining homogeneous precipitation regions is fundamental for hydrologic applications, yet nontrivial, particularly for regions with highly varied spatial-temporal patterns. Traditional approaches typically include aspects of subjective delineation around sparsely distributed precipitation stations. Here, hierarchical and non-hierarchical (k-means) clustering techniques on a gridded dataset for objective and automatic delineation are evaluated. Using a spatial sensitivity analysis test, the k-means clustering method is found to produce much more stable cluster boundaries. To identify a reasonable optimal k, various performance indicators, including the within-cluster sum of square errors (WSS) metric, intra- and intercluster correlations, and postvisualization are evaluated. Two new objective selection metrics (difference in minimum WSS and difference in difference) are developed based on the elbow method and gap statistics, respectively, to determine k within a desired range. Consequently, eight homogenous regions are defined with relatively clear and smooth boundaries, as well as low intercluster correlations and high intracluster correlations. The underlying physical mechanisms for the regionalization outcomes not only help justify the optimal number of clusters selected, but also prove informative in understanding the local- and large-scale climate factors affecting Ethiopian summertime precipitation. A principal component linear regression model to produce cluster-level seasonal forecasts also proves skillful.
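The elbow idea mentioned in the abstract above can be sketched by computing the within-cluster sum of squares (WSS) for several values of k and looking for the point where the decrease flattens. This is a minimal sketch under stated assumptions, not the selection metrics developed in that study: the function names `init_centroids` and `kmeans_wss`, the farthest-point initialization, and the synthetic data are all hypothetical choices made here for illustration.

```python
import numpy as np


def init_centroids(points, k, rng):
    """Farthest-point initialization (an assumption of this sketch)."""
    centroids = [points[rng.integers(len(points))]]
    for _ in range(k - 1):
        # Distance from every point to its nearest already-chosen centroid.
        diffs = points[:, None, :] - np.array(centroids)[None, :, :]
        nearest = np.linalg.norm(diffs, axis=2).min(axis=1)
        centroids.append(points[nearest.argmax()])
    return np.array(centroids)


def kmeans_wss(points, k, n_iter=100, seed=0):
    """Plain Lloyd iterations; returns the labels and the final WSS."""
    rng = np.random.default_rng(seed)
    centroids = init_centroids(points, k, rng)
    for _ in range(n_iter):
        diffs = points[:, None, :] - centroids[None, :, :]
        labels = np.linalg.norm(diffs, axis=2).argmin(axis=1)
        # Recompute each centroid; keep the old one if a cluster is empty.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    wss = float(((points - centroids[labels]) ** 2).sum())
    return labels, wss


if __name__ == "__main__":
    # Hypothetical data: three well-separated groups of 50 points each.
    rng = np.random.default_rng(1)
    centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
    points = np.vstack([c + rng.normal(0.0, 0.5, size=(50, 2)) for c in centers])
    for k in range(1, 6):
        _, wss = kmeans_wss(points, k)
        print(f"k={k}  WSS={wss:.1f}")
```

On data like this the WSS drops sharply up to the true number of groups and changes little afterwards, which is the "elbow" that objective selection metrics try to locate automatically.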
Natural variability is an essential component of observations of all geophysical and climate variables. In principal component analysis (PCA), also called empirical orthogonal function (EOF) analysis, a set of orthogonal eigenfunctions is found from a spatial covariance function. These empirical basis functions often lend useful insights into physical processes in the data and serve as a useful tool for developing statistical methods. The underlying assumption in PCA is the stationarity of the data analyzed; that is, the covariance function does not depend on the origin of time. The stationarity assumption is often not justifiable for geophysical and climate variables even after removing such cyclic components as the diurnal cycle or the annual cycle. As a result, physical and statistical inferences based on EOFs can be misleading. Some geophysical and climatic variables exhibit periodically time-dependent covariance statistics. Such a dataset is said to be periodically correlated or cyclostationary. A proper recognition of the time-dependent response characteristics is vital in accurately extracting physically meaningful modes and their space-time evolutions from data. This also has important implications in finding physically consistent evolutions and teleconnection patterns and in spectral analysis of variability, which are important goals in many climate and geophysical studies. In this study, the conceptual foundation of cyclostationary EOF (CSEOF) analysis is examined as an alternative to regular EOF analysis or other eigenanalysis techniques based on the stationarity assumption. Comparative examples and illustrations are given to elucidate the conceptual difference between the CSEOF technique and other techniques and the entailing ramification in physical and statistical inferences based on computational eigenfunctions.
We analyse the variability of the probability distribution of daily wind speed in wintertime over Northern and Central Europe in a series of global and regional climate simulations covering the last centuries, and in reanalysis products covering approximately the last 60 years. The focus of the study lies on identifying the link of the variations in the wind speed distribution to the regional near-surface temperature, to the meridional temperature gradient and to the North Atlantic Oscillation. Our main result is that the link between the daily wind distribution and the regional climate drivers is strongly model dependent. The global models tend to behave similarly, although they show some discrepancies. The two regional models also tend to behave similarly to each other, but surprisingly the results derived from each regional model strongly deviate from the results derived from its driving global model. In addition, considering multi-centennial timescales, we find in two global simulations a long-term tendency for the probability distribution of daily wind speed to widen through the last centuries. The cause for this widening is likely the effect of the deforestation prescribed in these simulations. We conclude that no clear systematic relationship between the mean temperature, the temperature gradient and/or the North Atlantic Oscillation, with the daily wind speed statistics can be inferred from these simulations. The understanding of past and future changes in the distribution of wind speeds, and thus of wind speed extremes, will require a detailed analysis of the representation of the interaction between large-scale and small-scale dynamics.
In this paper a spatio-temporal analysis of wind power resource in the Iberian Peninsula is presented. The study applies the Second-Order Data-Coupled Clustering (SODCC) algorithm to reanalysis data for the period 1979–2014. Several characteristics of the method are detailed, such as the data-coupled clustering approach of SODCC, which ensures the non-singularity of the signal subspace within each cluster. The performance of the proposed approach and the specific results obtained are discussed in a case study in the Iberian Peninsula. These results make it possible to identify different spatio-temporal patterns of the wind data statistics depending on the initialization year. Moreover, this work also shows that there is a close relationship between these spatio-temporal patterns and the wind energy production of the area under study, so the proposed analysis can be extended to the production efficiency of wind farms at the time scales considered.
Sea Level Pressure (SLP) data for the period 1950–2012 at 61 stations located in or around the Balkan Peninsula were used. The main concept is that the intra-annual course of SLP best represents the different air masses that are situated over the Balkan Peninsula during the year. The method for differentiation of climatic zones is cluster analysis. A hierarchical clustering technique (average linkage between groups, with Pearson correlation as the measure of intervals) was employed in the research. The climate of the Balkan Peninsula is transitional between oceanic and continental and also between subtropical and temperate climates. Several major changes in atmospheric circulation over the Balkan Peninsula have happened over the period 1950–2012. There is a serious increase of the influence of the Azores High in the period January–March, which leads to an increase of SLP and enhances oceanic influence. There is an increase of the influence of the north-west extension of the monsoonal low in the period June–September. This leads to a more continental climate, but also to more tropical air masses over the Balkan Peninsula. Accordingly, the extent of subtropical climate widens in a northern direction. There is an increase of the influence of the Siberian High in the period October–December. This influence covers the central and eastern parts of the peninsula in October and November, and it reaches the western parts in December. Thus, the climate becomes more continental.
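The clustering approach named in the abstract above (average linkage between groups with a Pearson-correlation measure) can be illustrated with a small sketch. This is not the code used in that study: the function name `average_linkage_correlation`, the use of 1 − r as the dissimilarity, and the synthetic "station" series are assumptions made here for illustration.

```python
import numpy as np


def average_linkage_correlation(series, n_clusters):
    """Agglomerative clustering of rows of `series` (n_stations, n_times).

    Dissimilarity between two stations is 1 - Pearson correlation (an
    assumption of this sketch); the dissimilarity between two groups is the
    average over all station pairs drawn from the two groups.
    """
    dist = 1.0 - np.corrcoef(series)
    clusters = [[i] for i in range(series.shape[0])]
    while len(clusters) > n_clusters:
        best = (np.inf, 0, 1)
        # Find the pair of groups with the smallest average dissimilarity.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        # Merge group b into group a.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


if __name__ == "__main__":
    # Hypothetical monthly series: two groups with different annual cycles.
    rng = np.random.default_rng(0)
    months = np.arange(120)
    cycle_a = np.sin(2 * np.pi * months / 12)
    cycle_b = np.cos(2 * np.pi * months / 12)
    series = np.array(
        [cycle_a + rng.normal(0.0, 0.05, size=120) for _ in range(3)]
        + [cycle_b + rng.normal(0.0, 0.05, size=120) for _ in range(3)]
    )
    print(average_linkage_correlation(series, n_clusters=2))
```

Because the dissimilarity depends only on correlation, stations are grouped by the shape of their intra-annual course rather than by its mean level or amplitude.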
Dynamically based seasonal forecasts are prone to systematic spatial biases due to imperfections in the underlying global climate model (GCM). This can result in low-forecast skill when the GCM misplaces teleconnections or fails to resolve geographic barriers, even if the prediction of large-scale dynamics is accurate. To characterize and address this issue, this study applies objective climate regionalization to identify discrepancies between the Climate Forecast System Version 2 (CFSv2) and precipitation observations across the Contiguous United States (CONUS). Regionalization shows that CFSv2 1 month forecasts capture the general spatial character of warm season precipitation variability but that forecast regions systematically differ from observation in some transition zones. CFSv2 predictive skill for these misclassified areas is systematically reduced relative to correctly regionalized areas and CONUS as a whole. In these incorrectly regionalized areas, higher skill can be obtained by using a regional-scale forecast in place of the local grid cell prediction.
In order to study climate change on a regional scale using Earth System Models, it is useful to partition the spatial domain into regions according to their climate changes. The aim of this work is to divide the European domain into regions of similar projected climate changes using a simulation of daily total precipitation, minimum and maximum temperatures for the recent-past (1986 – 2005) and long-term future (2081 – 2100) provided by the Coupled Model Intercomparison Project (CMIP5). The difference between the long-term future and recent-past daily climatologies of these three variables is determined. Aiming to objectively identify the grid points with coherent climate changes, a K-Mean Cluster Analysis is applied to these differences. This method is performed for each variable independently (univariate version) and for the aggregation of the three variables (multivariate version). A mathematical approach to determine the optimal number of clusters is pursued. However, due to the method characteristics, a sensitivity test to the number of clusters is performed by analysing the consistency of the results. This is a novel method, allowing for the determination of regions based on the climate change of multiple variables. Results from the univariate application of this method are in accordance with results found in the literature, showing overall similar regions of changes. The regions obtained for the multivariate version are mainly defined by latitude over European land, with some features of land-sea interaction. Furthermore, all regions have statistically different distributions of at least one of the variables, providing confidence to the regions obtained.
Optimal siting of wind farms based on a pre-assessment of the spatiotemporal variability of wind resources is considered a suitable method for reducing fluctuations in the delivered output. In this study, we explore the potential for balancing wind energy generation in the Iberian Peninsula using Principal Component Analysis (PCA). This technique permits the discovery of possibly new promising locations for wind power harvesting and an evaluation of the existing wind farm network in terms of reliability in energy generation. Data input to the PCA consists of the hourly wind capacity factor in a 5-km spatial resolution grid covering the entire peninsula. These data are derived from an equivalent wind farm power curve fed by modeled wind speed data from 80 m above ground level. PCA reveals three significant balancing patterns prevailing over the IP, where half of the currently operating wind farms in Spain are placed. Hence, among the many constituents of the existing wind farm network, these spots offer the best opportunity for stable power supply. The paper concludes by making proposals on an optimum wind capacity allocation based on the idea of equally distributing installed power between positive/negative dipoles emerging from balancing principal components.