Cluster Analysis of Time Series
via Kendall Distribution
Fabrizio Durante and Roberta Pappadà
School of Economics and Management,
Free University of Bozen–Bolzano, Bolzano, Italy
{fabrizio.durante,roberta.pappada}@unibz.it
Abstract. We present a method to cluster time series according to
the calculation of the pairwise Kendall distribution function between
them. A case study with environmental data illustrates the introduced
methodology.
Keywords: Cluster analysis, Copula, Kendall distribution, Tail depen-
dence.
1 Introduction
Cluster analysis plays an important role in extracting information from a group
of different time series. It can be used, for instance, to find some dependence
information, which is a key tool in geosciences and hydrology in order to un-
derstand the relationships between different variables. In general, a time series
clustering procedure involves the choice of an adequate metric between the univariate
time series, which allows one to group together series exhibiting common
trends occurring at different times or similar sub-patterns in the data, according
to the idea of similarity one has adopted (see [1]).
A widely used approach to measure similarity is to consider a Pearson-
correlation based distance metric. However, recent studies have underlined that
classical correlation measures are often inadequate to capture the real depen-
dence structure between individual risk factors, especially in a financial and
environmental context (see, for instance, [2], [3]). As such, several investigations
have been carried out in recent years from different perspectives, ranging
from tools of extreme-value analysis ([4], [5]) to the concept of tail copulas
(see, for instance, [6] and the references therein). In particular, many research
efforts have remarked on the usefulness of extreme value theory in assessing cli-
mate changes and detecting spatial clusters (see, for instance [7], [8]). Moreover,
recent developments in statistical hydrology have shown the great potential of
copulas for the construction of multivariate cumulative distribution functions
and for carrying out a multivariate frequency analysis ([9], [10]). Extreme value
copulas have been largely used to investigate the spatial dependencies between
This work was supported by Free University of Bozen-Bolzano via the project
MODEX.
© Springer International Publishing Switzerland 2015
P. Grzegorzewski et al. (eds.), Strengthening Links between Data Analysis & Soft Computing,
Advances in Intelligent Systems and Computing 315, DOI: 10.1007/978-3-319-10765-3_25
the involved variables, introducing a novel contribution to the interpretation of
meteorological and hydrological phenomena ([11], [12], [13]). From another per-
spective, methods have been recently proposed in order to cluster time series
observations according to a suitable copula-based dissimilarity measure, with
applications in the financial setting. Such an approach has been adopted, for
instance, in [14] focusing on the use of conditional Spearman’s correlation, and
in [15], [16] where the clustering procedure is based on the estimation of pairwise
tail dependence coefficients.
Management of environmental resources often requires the analysis of spatial
rainfall extremes which typically exhibit some form of dependence as a result
of the regional nature of hydrological phenomena. Reliable estimates of extreme
rainfall events are required for several hydrological purposes and their spatial
distribution is of both physical and practical interest, particularly in the case of
regional studies. Several approaches are available in the literature for the char-
acterization of spatial extremes, relying on a likelihood-based approach ([17],
[18]), a Bayesian approach ([19]) and cluster analysis for assessing the spatial
distribution of extremes ([20], [21]). In particular, the detection of spatial clus-
ters can help in summarizing available data, extracting useful information and
formulating hypotheses for further research. Clustering could be used in order to
identify homogeneous regions to be considered for regionalization procedures.
In the present contribution, we would like to use the Kendall distribution
function associated with a random vector in order to develop a novel clustering
procedure for grouping random vectors. We outline here briefly the possible
application of the proposed methodology to hydrological data by analysing time
series of maximum annual rainfall data collected at rain gauges of different sites
in the province of Bolzano-Bozen (Italy). Notice that according to the approach
in [22], homogeneity in the sense of Kendall’s distance implies homogeneity in
the sense of return period, a notion frequently used in environmental sciences
for the identification of dangerous events and risk assessment (see also [23],[24]).
2 Clustering via Kendall Distribution
We recall that a (bivariate) copula is a joint cumulative probability distribution
function with uniform univariate margins on $I = [0,1]$. If we consider a
random pair $(X, Y)$ with continuous cumulative distribution function $H$, then
the bivariate probability integral transform is the random variable defined by
$W = H(X, Y)$. It is known that $W$ depends only on the copula $C$ of $(X, Y)$ and
that it is equal in distribution to $C(U, V)$, where $U = F_X(X)$ and $V = F_Y(Y)$,
with $F_X$ and $F_Y$ the univariate marginals of $X$ and $Y$, respectively. First introduced in
[25] for inference on Archimedean copulas, the Kendall distribution function (see
also [26]) is simply the distribution function of $W$ and is given by
$$K(q) = P(W \le q),$$
where $q \in [0,1]$ is a probability level.
There are two important particular cases for the Kendall distribution. When
$X$ and $Y$ are comonotonic, one finds $K(q) = K_M(q) = q$ for all $0 \le q \le 1$, which
corresponds to $C(u, v) = M(u, v) = \min(u, v)$, where $M$ is the Fréchet-Hoeffding
upper bound copula. Under the hypothesis of independence between
$X$ and $Y$, which is equivalent to considering $C(u, v) = \Pi(u, v) = uv$, $K$ has the
form $K(q) = K_\Pi(q) = q - q\log(q)$, $0 < q \le 1$. Thus, on the graph of $K$ based
on pseudo-random samples from a positively dependent bivariate vector $(X, Y)$,
perfect positive dependence would translate into data points aligned on the line
$y = x$, while the plot will nearly match the curve $K_\Pi(q)$ as the data
become less and less dependent. Notice that, for each Kendall distribution $K$,
one has the lower bound $K \ge K_M$ on $I$. Starting with [27] (see also [28]), ordering
properties of Kendall distributions have been used to detect dependence in
copula models. Here we show how to use them to provide a clustering procedure
for time series.
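The two reference curves can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and the Monte Carlo check relies on the fact that, for independent uniforms $U$ and $V$, $P(UV \le q) = q + \int_q^1 (q/u)\,du = q - q\log(q)$.

```python
import math
import random


def kendall_comonotone(q: float) -> float:
    """K_M(q) = q, the Kendall distribution of a comonotone pair."""
    return q


def kendall_independent(q: float) -> float:
    """K_Pi(q) = q - q*log(q) for 0 < q <= 1, set to 0 at q = 0 by continuity."""
    if q <= 0.0:
        return 0.0
    return q - q * math.log(q)


def simulated_kendall_independent(q: float, n: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(U*V <= q) for independent uniforms U and V."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if rng.random() * rng.random() <= q)
    return hits / n


if __name__ == "__main__":
    # Columns: q, K_M(q), K_Pi(q), simulated K_Pi(q).
    for q in (0.1, 0.5, 0.9):
        print(q, kendall_comonotone(q), kendall_independent(q),
              simulated_kendall_independent(q))
```

The simulated values should lie close to the closed form, and both curves respect the bound $K \ge K_M$.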
Suppose that we have at our disposal a set of time series $X_1^t, \dots, X_n^t$, corresponding
to $n$ different measurements collected at time $t \in \{1, \dots, T\}$. Such time series
are assumed to be a random sample from an unknown vector $\mathbf{X} = (X_1, \dots, X_n)$.
In order to interpret the following results properly, it is also convenient to suppose
that all the pairs in $\mathbf{X}$ are positively quadrant dependent, i.e. their
copula is greater than or equal to $\Pi$. We would like to group the components of
$\mathbf{X}$ according to the strength of their inter-dependence. To do this, following the
general principle applied in [14], we may proceed as follows:
1. Calculate the Kendall distribution function $K(\cdot)$ for each pair $(X_i, X_j)$, and
denote it by $K_{ij}$.
2. Define a kind of distance between $X_i$ and $X_j$ in terms of the related Kendall
distribution $K_{ij} = K$ and the Kendall distribution $K_M$ of comonotone random
variables by one of the following definitions:
$$d_2(K, K_M) = \int_0^1 (q - K(q))^2 \, dq$$
$$d_\infty(K, K_M) = \sup_{q \in [0,1]} |q - K(q)|$$
Intuitively, two random variables have small distance if their Kendall distribution
is close to $K_M$ or, in other words, if they tend to be comonotone.
3. From these metrics, create a suitable dissimilarity matrix $D := (\delta_{ij})$, $i, j = 1, \dots, n$,
for instance by using $\delta_{ij} = d_2(K_{ij}, K_M)$. In fact, if the random
variables are comonotone, their dissimilarity is 0, while this number increases
as they become less and less dependent. Hence, in this construction,
the larger the distance, the weaker the dependence.
4. Apply classical cluster techniques to the obtained dissimilarity matrix. In
particular, agglomerative hierarchical methods with nearest distance (single
linkage), furthest distance (complete linkage) and average distance (average
linkage) can be used as grouping criteria.
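To get a feel for the scale of the two distances in step 2, they can be evaluated for the independence case $K_\Pi(q) = q - q\log(q)$, where the integrand is $(q\log q)^2$ and the closed forms are $\int_0^1 (q\log q)^2\,dq = 2/27 \approx 0.074$ and $\sup_q |q\log q| = 1/e \approx 0.368$. The following is a minimal sketch; the function names and the midpoint grid are illustrative choices, not part of the original procedure.

```python
import math
from typing import Callable, List

KendallFn = Callable[[float], float]


def midpoints(n_grid: int) -> List[float]:
    """Midpoints of n_grid equal cells of [0, 1]; they never hit q = 0."""
    h = 1.0 / n_grid
    return [(k + 0.5) * h for k in range(n_grid)]


def d2(kendall: KendallFn, n_grid: int = 10000) -> float:
    """Midpoint-rule approximation of the integral of (q - K(q))^2 over [0, 1]."""
    return sum((q - kendall(q)) ** 2 for q in midpoints(n_grid)) / n_grid


def d_sup(kendall: KendallFn, n_grid: int = 10000) -> float:
    """Grid approximation of the supremum over [0, 1] of |q - K(q)|."""
    return max(abs(q - kendall(q)) for q in midpoints(n_grid))


def kendall_independent(q: float) -> float:
    """K_Pi(q) = q - q*log(q), the Kendall distribution under independence."""
    return q - q * math.log(q) if q > 0.0 else 0.0


if __name__ == "__main__":
    print(d2(kendall_independent), 2 / 27)        # numerical vs closed form
    print(d_sup(kendall_independent), 1 / math.e)  # numerical vs closed form
```

Both distances vanish for the comonotone case $K = K_M$, in line with the remark in step 3.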
As for the estimation of the Kendall distribution function,
we rely on non-parametric estimation by using the empirical distribution
function computed as in [29]. Suppose that $(X_{11}, X_{12}), \dots, (X_{T1}, X_{T2})$ is a random
sample from a distribution $H$ with copula $C$. The empirical Kendall distribution
function $K_T$ is given, for all $q \in [0,1]$, by
$$K_T(q) = \frac{1}{T} \sum_{j=1}^{T} \mathbf{1}(W_j \le q),$$
where, for each $j \in \{1, \dots, T\}$,
$$W_j = \frac{1}{T+1} \sum_{t=1}^{T} \mathbf{1}(X_{t1} < X_{j1},\, X_{t2} < X_{j2}).$$
The limiting behaviour of the empirical process $\sqrt{T}(K_T - K)$ has been discussed
in [30], where the convergence in law to a centered Gaussian limit under mild
regularity conditions is proved.
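The estimator can be written down directly from the two formulas. The sketch below is a plain, quadratic-time illustration with hypothetical function names; it is not the implementation used by the authors.

```python
from typing import List, Sequence


def pseudo_observations(x: Sequence[float], y: Sequence[float]) -> List[float]:
    """W_j = (1 / (T + 1)) * #{t : x_t < x_j and y_t < y_j}, for j = 1, ..., T."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    T = len(x)
    return [
        sum(1 for t in range(T) if x[t] < x[j] and y[t] < y[j]) / (T + 1)
        for j in range(T)
    ]


def empirical_kendall(w: Sequence[float], q: float) -> float:
    """K_T(q) = (1 / T) * #{j : W_j <= q}."""
    return sum(1 for wj in w if wj <= q) / len(w)


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    y = [10, 20, 30, 40, 50]  # comonotone with x
    w = pseudo_observations(x, y)
    print(w)
    print([empirical_kendall(w, q) for q in (0.0, 0.25, 0.5, 0.75, 1.0)])
```

For a comonotone sample the values $W_j$ are $0, 1/(T+1), \dots, (T-1)/(T+1)$, so $K_T$ stays close to the diagonal; for a counter-monotone sample every $W_j$ is 0 and $K_T(q) = 1$ for all $q$.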
3 An Empirical Case Study
In order to briefly illustrate a possible application of the proposed methodology
we present here a case study from environmental data. The data were collected by
“Ufficio Idrografico” of the province of Bolzano-Bozen and are available online.
They are related to daily rainfall measurements recorded at 18 gauge stations
spread across the province of Bolzano-Bozen in North-Eastern Italy. This
results in a set of $d = 18$ time series originally formed by $T = 18262$ observations.
Tab. 1 reports the available information on the analysed rainfall records. From
these time series, we extracted annual maxima at each spatial location, resulting
in a $50 \times 18$ matrix of time series observations $\tilde{X}_1^m, \dots, \tilde{X}_d^m$, $m \in \{1, \dots, 50\}$,
summarized by Fig. 1. The selection of annual maxima has two main goals: it
transforms data with strong seasonality into data that can be assumed to be
independent and identically distributed; it transforms data that may have a
general dependence structure into data that are positively dependent (actually,
they are coupled by an extreme-value copula). For more details, see [5]. The
latter property is quite relevant since it allows us to apply the method described in
Section 2 in order to detect the presence of clusters of the analysed sites on the
basis of the componentwise maxima.
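The reduction from daily records to annual maxima can be sketched as follows. The function name and the record layout (pairs of date and rainfall amount) are assumptions made for illustration, and the source does not say how missing days were handled. The count of 18262 daily observations is consistent with the number of days from 1 January 1961 to 31 December 2010, the period named in the caption of Fig. 1.

```python
from datetime import date
from typing import Dict, Iterable, Tuple


def annual_maxima(records: Iterable[Tuple[date, float]]) -> Dict[int, float]:
    """Return the largest daily value observed in each calendar year."""
    maxima: Dict[int, float] = {}
    for day, value in records:
        if day.year not in maxima or value > maxima[day.year]:
            maxima[day.year] = value
    return maxima


if __name__ == "__main__":
    # Number of calendar days in 1961-2010, both end points included.
    print((date(2010, 12, 31) - date(1961, 1, 1)).days + 1)

    # Made-up values, only to show the call.
    toy = [
        (date(1961, 1, 1), 3.0),
        (date(1961, 7, 5), 42.5),
        (date(1962, 2, 1), 10.0),
        (date(1962, 11, 3), 7.2),
    ]
    print(annual_maxima(toy))
```

Applying the function to each station separately and stacking the results by year gives one column of 50 maxima per station.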
Specifically, we compute the dissimilarity matrix $D := (\delta_{ij})$, $i, j = 1, \dots, d$,
such that the dissimilarity between two time series is defined as the distance
$$\delta_{ij} = d_2(\hat{K}_{ij}, K_M) = \int_0^1 (q - \hat{K}_{ij}(q))^2 \, dq,$$
where $\hat{K}_{ij}$ is the empirical Kendall distribution function based on the maxima
observations $(\tilde{X}_i^m, \tilde{X}_j^m)$, $m \in \{1, \dots, 50\}$.
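Because the empirical Kendall distribution is a step function, the integral defining $\delta_{ij}$ can be evaluated exactly, piece by piece. The sketch below shows one way to do this; the function names are hypothetical, the diagonal of the matrix is set to 0 by convention, and nothing here is claimed to match the authors' own code.

```python
from typing import List, Sequence


def kendall_w(x: Sequence[float], y: Sequence[float]) -> List[float]:
    """W_j = (1 / (T + 1)) * #{t : x_t < x_j and y_t < y_j}."""
    T = len(x)
    return [
        sum(1 for t in range(T) if x[t] < x[j] and y[t] < y[j]) / (T + 1)
        for j in range(T)
    ]


def d2_to_comonotone(w: Sequence[float]) -> float:
    """Exact integral over [0, 1] of (q - K_T(q))^2 for the step function K_T."""
    T = len(w)
    points = [0.0] + sorted(w) + [1.0]
    total = 0.0
    for k in range(T + 1):
        # On [points[k], points[k + 1]) exactly k of the W_j are <= q.
        a, b, level = points[k], points[k + 1], k / T
        total += ((b - level) ** 3 - (a - level) ** 3) / 3.0
    return total


def dissimilarity_matrix(columns: Sequence[Sequence[float]]) -> List[List[float]]:
    """Symmetric matrix of d2 distances; the diagonal is 0 by convention."""
    d = len(columns)
    D = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1, d):
            D[i][j] = D[j][i] = d2_to_comonotone(kendall_w(columns[i], columns[j]))
    return D


if __name__ == "__main__":
    a = [1, 2, 3, 4, 5, 6]
    b = [2, 4, 6, 8, 10, 12]  # comonotone with a
    c = [6, 5, 4, 3, 2, 1]    # counter-monotone with a
    for row in dissimilarity_matrix([a, b, c]):
        print(row)
```

In the toy example the comonotone pair receives a small dissimilarity, while the counter-monotone pair (which violates the positive dependence assumption and is included only as a contrast) gives $\int_0^1 (q - 1)^2\,dq = 1/3$.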
The choice of this metric reflects the final goal of the clustering procedure
in the sense that two strongly dependent time series will give an extremely low
Table 1. Summary of the rainfall measurement stations
Code  Station                  Longitude (°E)  Latitude (°N)  Height (m)
0220  S.VALENTINO ALLA MUTA    10.5277         46.7745        1520
0310  TUBRE                    10.4775         46.6503        1119
2090  PLATA                    11.1783         46.8225        1147
3140  FLERES                   11.3477         46.9639        1246
3260  VIPITENO-CONVENTO        11.4295         46.8978         948
8320  BOLZANO                  11.3127         46.4976         254
9150  SESTO                    12.3477         46.7035        1310
0250  MONTE MARIA              10.5213         46.7057        1310
0480  MAZIA                    10.6175         46.6943        1570
1580  VERNAGO                  10.8493         46.7357        1700
2170  S.LEONARDO PASSIARIA     11.2471         46.8091         644
2670  PAVICOLO                 11.1093         46.6278        1400
3450  RIDANNA                  11.3068         46.9091        1350
4450  S.MADDALENA IN CASIES    12.2427         46.8353        1398
6650  FUNDRES                  11.7029         46.8872        1159
8570  BRONZOLO                 11.3111         46.4065         226
8730  REDAGNO                  11.3968         46.3465        1562
9100  ANTERIVO                 11.3678         46.2773        1209
Fig. 1. Boxplot of annual maxima at each station from 1961 to 2010. The station codes
are as in Tab. 1. On the y-axis the amount of rainfall is measured in millimeters.
value of their dissimilarity. The results of the clustering procedure are illustrated
by a tree diagram usually referred to as a dendrogram, which represents the arrangement
of the clusters produced by hierarchical agglomerative clustering. In
Fig. 2, the dendrogram based on complete linkage is displayed. The vertical axis
represents the distance at which two clusters are joined. From the dendrogram it
is possible to identify, e.g., four different groups, by cutting at about height 0.06.
Fig. 2. Dendrogram for the 18 rainfall measurement stations listed in Tab. 1 based on
the complete linkage method (vertical axis: Height, from 0.00 to 0.08)
Fig. 3. Map of the rainfall measurement stations marked according to the 4-clusters
solution in the province of Bolzano–Bozen (North-Eastern Italy)
The 4-clusters solution is visualized on the map in Fig. 3, where the stations are
marked according to their cluster.
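The grouping obtained by cutting a complete-linkage dendrogram at a given height can be reproduced with a few lines of code. The sketch below is a naive illustration with a hypothetical function name and a made-up dissimilarity matrix; it relies on the fact that complete-linkage merge heights never decrease, so merging until the next merge would exceed the cut height gives the same groups as cutting the full tree.

```python
from typing import List, Sequence


def complete_linkage_clusters(D: Sequence[Sequence[float]],
                              cut_height: float) -> List[List[int]]:
    """Merge clusters by complete linkage until the next merge exceeds cut_height."""
    clusters = [[i] for i in range(len(D))]
    while len(clusters) > 1:
        best = None  # (linkage distance, index a, index b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: largest pairwise dissimilarity between clusters.
                dist = max(D[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        if best[0] > cut_height:
            break
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return sorted(clusters)


if __name__ == "__main__":
    # Made-up dissimilarities for four series, only to show the call.
    D = [
        [0.000, 0.010, 0.070, 0.080],
        [0.010, 0.000, 0.075, 0.070],
        [0.070, 0.075, 0.000, 0.020],
        [0.080, 0.070, 0.020, 0.000],
    ]
    print(complete_linkage_clusters(D, cut_height=0.06))
```

In practice a library routine would normally be preferred, for instance SciPy's `scipy.cluster.hierarchy.linkage` with `method="complete"` followed by `fcluster` with `criterion="distance"`.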
For the hydrological interpretation of the results it seems that several factors
should be taken into account in order to determine correlated rainfall extremes.
In fact, not only the geographical proximity plays a role, but also the strong
heterogeneity in morphological and climatic features.
4 Conclusions
We have presented a procedure for grouping time series according to a copula-
based dependence function among them. In particular, we considered a dissimilarity
measure that is based on the Kendall distribution associated with two
continuous random variables, since such a function provides useful information
in terms of environmental risk, as shown in [22]. The proposed approach comple-
ments similar methods provided by the authors about copula-based clustering
of time series (see, e.g., [14], [16]).
References
1. Liao, T.W.: Clustering of time series data - a survey. Pattern Recogn. 38(11),
1857–1874 (2005)
2. Embrechts, P., McNeil, A.J., Straumann, D.: Correlation and Dependence in Risk
Management: Properties and Pitfalls. Cambridge Univ. Press, New York (2001)
3. Poulin, A., Huard, D., Favre, A.-C., Pugin, S.: Importance of tail dependence in
bivariate frequency analysis. J. Hydrol. Eng. 12, 394–403 (2007)
4. Gudendorf, G., Segers, J.: Extreme-value copulas. In: Jaworski, P., Durante, F.,
Härdle, W., Rychlik, T. (eds.) Copula Theory and its Applications. Lecture Notes
in Statistics - Proceedings, vol. 198, pp. 127–145. Springer, Heidelberg (2010)
5. Salvadori, G., De Michele, C., Kottegoda, N.T., Rosso, R.: Extremes in Nature. An
Approach Using Copulas. Water Sci. and Technology Library 56. Springer (2007)
6. Jaworski, P.: Tail behaviour of copulas. In: Jaworski, P., Durante, F., Härdle, W.,
Rychlik, T. (eds.) Copula Theory and its Applications. Lecture Notes in Statistics
- Proceedings, vol. 198, pp. 161–186. Springer, Heidelberg (2010)
7. Gaetan, C., Grigoletto, M.: A hierarchical model for the analysis of spatial rainfall
extremes. J. ABES 12(4), 434–449 (2007)
8. Scotto, M.G., Barbosa, S.M., Alonso, A.M.: Extreme value and cluster analysis of
European daily temperature series. Journal of Applied Statistics 38(12), 2793–2804
(2011)
9. Favre, A.-C., Adlouni, S.E., Perreault, L., Thiemonge, N., Bobee, B.: Multivariate
hydrological frequency analysis using copulas. Water Resour. Res. 40 (2004)
10. Salvadori, G., De Michele, C.: Frequency analysis via copulas: theoretical aspects
and applications to hydrological events. Water Resour. Res. 40 (2004)
11. Bárdossy, A.: Copula-based geostatistical models for groundwater quality parameters.
Water Resour. Res. 42(11) (2006)
12. Bonazzi, A., Cusack, S., Mitas, C., Jewson, S.: The spatial structure of European
wind storms as characterized by bivariate extreme-value Copulas. Nat. Hazards
Earth Syst. Sci. 12, 1769–1782 (2012)
13. Genest, C., Favre, A.-C.: Everything you always wanted to know about copula
modeling but were afraid to ask. J. Hydrologic Eng. 12(4), 347–368 (2007)
14. Durante, F., Pappadà, R., Torelli, N.: Clustering of financial time series in risky scenarios.
Adv. Data Anal. Classif. (2013) (in press), doi: 10.1007/s11634-013-0160-4
15. De Luca, G., Zuccolotto, P.: A tail dependence-based dissimilarity measure for
financial time series clustering. Adv. Data Anal. Classif. 5(4), 323–340 (2011)
16. Durante, F., Pappadà, R., Torelli, N.: Clustering of extreme observations via tail
dependence estimation. Statist. Papers (in press, 2014)
17. Buishand, T., de Haan, L., Zhou, C.: On spatial extremes: With application to a
rainfall problem. Ann. Appl. Statist. 2, 624–642 (2008)
18. Cooley, D., Naveau, P., Poncet, P.: Variograms for spatial max-stable random fields.
In: Dependence in Probability and Statistics. Lectures Notes in Statistics, pp. 373–
390. Springer, Heidelberg (2006)
19. Cooley, D., Nychka, D., Naveau, P.: Bayesian spatial modeling of extreme precipi-
tation return levels. J. Amer. Statist. Assoc. 102, 824–840 (2007)
20. Robeson, S.M., Doty, J.A.: Identifying rogue air temperature stations using cluster
analysis of percentile trends. J. Climate 18, 1275–1287 (2005)
21. Scotto, M.G., Alonso, A.M., Barbosa, S.M.: Clustering time series of sea levels:
Extreme value approach. J. Waterway, Port, Coastal, and Ocean Engrg. 136, 215–
225 (2010)
22. Salvadori, G., De Michele, C., Durante, F.: On the return period and design in a
multivariate framework. Hydrol. Earth Syst. Sci. 15, 3293–3305 (2011)
23. Salvadori, G., Durante, F., De Michele, C.: Multivariate return period calculation
via survival functions. Water Resour. Res. 49(4), 2308–2311 (2013)
24. Salvadori, G., Durante, F., Perrone, E.: Semi–parametric approximation of the
Kendall’s distribution and multivariate return periods. J. SFdS 154(1), 151–173
(2013)
25. Genest, C., Rivest, L.-P.: Statistical inference procedures for bivariate Archimedean
copulas. J. Amer. Statist. Assoc. 88(423), 1034–1043 (1993)
26. Genest, C., Rivest, L.-P.: On the multivariate probability integral transformation.
Statist. Probab. Lett. 53(4), 391–399 (2001)
27. Capéraà, P., Fougères, A.-L., Genest, C.: A stochastic ordering based on a decomposition
of Kendall’s tau. In: Beneš, V., Štěpán, J. (eds.) Distributions with Given
Marginals and Moment Problems, pp. 81–86. Kluwer Academic Publishers, Dordrecht
28. Nelsen, R.B., Quesada-Molina, J.J., Rodríguez-Lallena, J.A., Úbeda-Flores, M.:
Kendall distribution functions. Statist. Probab. Lett. 65, 263–268 (2003)
29. Genest, C., Nešlehová, J., Ziegel, J.: Inference in multivariate Archimedean
copula models. TEST 20, 223–256 (2011)
30. Barbe, P., Genest, C., Ghoudi, K., Rémillard, B.: On Kendall’s process. J. Multivar.
Anal. 58, 197–229 (1996)