Multivariate Statistical Modelling of Compound Events via Pair-Copula Constructions: Analysis of Floods in Ravenna

Emanuele Bevacqua (1), Douglas Maraun (1), Ingrid Hobæk Haff (2), Martin Widmann (3), and Mathieu Vrac (4)
(1) Wegener Center for Climate and Global Change, University of Graz, Graz, Austria
(2) Department of Mathematics, University of Oslo, Oslo, Norway
(3) School of Geography, Earth and Environmental Sciences, University of Birmingham, Birmingham, United Kingdom
(4) Laboratoire des Sciences du Climat et de l’Environnement, CEA Saclay, Gif-sur-Yvette, France
Correspondence to: Emanuele Bevacqua (emanuele.bevacqua@uni-graz.at)
Abstract. Compound events (CEs) are multivariate extreme events in which the individual contributing variables may not be extreme themselves, but their joint (dependent) occurrence causes an extreme impact. Conventional univariate statistical analysis cannot give accurate information regarding the multivariate nature of these events. We develop a conceptual model, implemented via pair-copula constructions, which allows for the quantification of the risk associated with compound events in present-day and future climate, as well as the uncertainty estimates around such risk. The model includes meteorological predictors which provide insight into both the involved physical processes and the temporal variability of CEs. Moreover, this model provides multivariate statistical downscaling of compound events. Downscaling of compound events is required to extend their risk assessment to the past or future climate, where climate models either do not simulate realistic values of the local variables driving the events, or do not simulate them at all. Based on the developed model, we study compound floods, i.e. joint storm surge and high river runoff, in Ravenna (Italy). To explicitly quantify the risk, we define the impact of compound floods as a function of sea and river levels. We use meteorological predictors to extend the analysis to the past, and get a more robust risk analysis. We quantify the uncertainties of the risk analysis, observing that they are very large due to the shortness of the available data, though this may also be the case in other studies where they have not been estimated. Ignoring the dependence between sea and river levels would result in an underestimation of risk; in particular, the expected return period of the highest compound flood observed increases from about 20 to 32 years when switching from the dependent to the independent case.
1 Introduction
On the 6th of February 2015, a low pressure system that developed over the north of Spain moved across the island of Corsica into Italy. The low pressure itself (Figure 1) and the associated southeasterly winds drove a storm surge to the Adriatic coast at Ravenna (Italy). Alongside the storm surge, large amounts of precipitation in the surrounding area caused high values of runoff in the small rivers near the coast. This runoff was obstructed by the storm surge and led to major flooding along the coast.
Such a compound flood is a typical example of a compound event (CE). CEs are multivariate extreme events in which the individual contributing variables may not be extreme themselves, but their joint (dependent) occurrence causes an extreme impact. The impact of CEs may be a climatic variable such as the gauge level (e.g. for compound floods), or fatalities or economic losses. CEs have received little attention so far, as underlined in the recent report of the Intergovernmental Panel on Climate Change on extreme events (Seneviratne et al., 2012).

Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-652, 2017
Manuscript under review for journal Hydrol. Earth Syst. Sci.
Published: 2 January 2017
© Author(s) 2017. CC-BY 3.0 License.

[Figure 1: map with axes Longitude (−5° W to 25° E) and Latitude (36° N to 52° N); colour scale: Precipitation (mm), 0 to 70; contour labels: sea level pressure, 1002 to 1036.]
Figure 1. Sea level pressure and total precipitation on 6th February 2015, when the coastal area of Ravenna (indicated by the yellow spot) was hit by a compound flood.

CEs are responsible for a very broad class of impacts on society. For example, heatwaves amplified by the lack of soil moisture, which reduces the latent cooling, may also be classed as CEs (Fischer et al., 2007; Seneviratne et al., 2010). The impact of drought cannot be fully described by a single variable (e.g. Shiau et al., 2007): analyses have been carried out which consider drought severity, duration (Shiau et al., 2007), maximum deficit (Saghafian and Mehdikhani, 2013), as well as the affected area (Serinaldi et al., 2009). Another example of a CE is a fluvial flood resulting from extreme rainfall occurring on a wet catchment (Pathiraja et al., 2012).
In recent literature, more attention has been given to the study of CEs through multivariate statistical methods (Seneviratne et al., 2012), which can offer more in-depth information regarding the multivariate nature of CEs than conventional univariate analysis. Combinations of univariate analyses are only sufficient for studying CEs when no dependence exists among the compound variables. However, this is usually not the case, so neglecting the dependence leads to systematic errors in the estimation of the risk associated with CEs.
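
The effect of neglecting dependence can be illustrated with a small simulation. The following is a minimal sketch, not part of the original analysis: it assumes two standard normal variables with a hypothetical correlation of 0.6 (a Gaussian dependence chosen only for simplicity) and compares the simulated probability that both exceed their 95th percentile with the product of the two univariate probabilities.

```python
import math
import random
from statistics import NormalDist


def joint_exceedance(rho, p=0.95, n=200_000, seed=1):
    """Estimate P(Z1 > q, Z2 > q) for two standard normals with correlation rho.

    q is the p-quantile of each marginal, so each variable alone exceeds q
    with probability 1 - p regardless of rho.
    """
    rng = random.Random(seed)
    q = NormalDist().inv_cdf(p)
    scale = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + scale * rng.gauss(0.0, 1.0)  # correlated with z1
        if z1 > q and z2 > q:
            hits += 1
    return hits / n


p_independent = (1 - 0.95) ** 2          # product of the univariate probabilities
p_dependent = joint_exceedance(rho=0.6)  # hypothetical correlation, illustration only
print(f"independent: {p_independent:.4f}, dependent: {p_dependent:.4f}")
```

With positive dependence, the simulated joint exceedance probability is several times larger than the product of the univariate probabilities, which is the kind of systematic underestimation described above.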
Modelling CEs is a complex undertaking (Leonard et al., 2013), and methods to adequately study them are required. Parametric multivariate statistical models allow one to constrain the dependencies between the contributing variables of CEs, as well as their marginal distributions (e.g. Hobæk Haff et al., 2015; Serinaldi, 2015; Aghakouchak et al., 2014; Saghafian and Mehdikhani, 2013; Serinaldi et al., 2009; Shiau et al., 2007; Shiau, 2003). The parametric structure reduces the uncertainties of the statistical properties we want to estimate from the data, compared to empirical estimates. As observed data are often limited, the remaining uncertainties might still be substantial and should thus be quantified (Serinaldi, 2015).
Due to the complex dependence structure between the contributing variables, advanced multivariate statistical models are necessary to model CEs. For example, modelling the multivariate probability distribution of the contributing variables with multivariate Gaussian distributions would usually not produce satisfactory results. A multivariate Gaussian distribution would assume that the dependencies between all the pairs are of the same type (homogeneity of the pair-dependencies), and that there is no dependence of the extreme events, also called tail dependence. Furthermore, a multivariate Gaussian distribution would assume that all of the marginal distributions are Gaussian. To overcome these problems, the use of copulas has been introduced in climate science (e.g. Schölzel and Friederichs, 2008). Through copulas, it is possible to model the dependence structure of variables separately from their marginal distributions. However, multivariate parametric copulas lack flexibility when modelling systems with high dimensionality, where heterogeneous dependencies exist among the different pairs (Aas et al., 2009). This lack of flexibility would therefore be a limitation for many types of compound events. Pair-copula constructions (PCCs) decompose the dependence structure into bivariate copulas (some of which are conditional) and give greater flexibility in modelling generic high-dimensional systems compared to multivariate parametric copulas (Aas et al., 2009; Acar et al., 2012; Bedford and Cooke, 2002; Hobæk Haff, 2012).
Here we develop a multivariate statistical model, based on PCCs, which allows for an adequate description of the dependencies between the contributing variables. The model provides a straightforward quantification of risk uncertainty, which is reduced with respect to the uncertainties obtained when computing the risk directly on the observed data of the impact. We extend the multivariate statistical model by including meteorological predictors for the contributing variables. The resulting increase in model complexity due to the additional variables is accommodated through the use of PCCs. The predictors allow us (1) to gain insight into the physical processes underlying CEs, as well as into the temporal variability of CEs, and (2) to statistically downscale CEs and their impacts. Downscaling may be used to statistically extend the risk assessment back in time to periods where observations of the predictors, but not of the contributing variables and impacts, are available, or to assess potential future changes in CEs based on climate models. Based on this model we study compound flooding in Ravenna.
In the context of compound floods, the dependence between rainfall and sea level has previously been studied for other regions (e.g., Wahl et al., 2015; Zheng et al., 2013; Kew et al., 2013; Svensson and Jones, 2002; Lian et al., 2013). Among these studies, Wahl et al. (2015) observed an increase in the risk of compound flooding in major US cities driven by an increasing dependence between storm surges and extreme rainfall. The impact of compound floods can be described as the gauge level in a river near the coast, which is driven both by the river discharge upstream and the sea level. Only a few studies have explicitly quantified the impact of compound floods and the associated risks (Zheng et al., 2015, 2014; Hurk et al., 2015; Brink et al., 2005). This might be due to difficulties in quantifying the impact due to a lack of data. For the Rotterdam case study, the impact has been explicitly quantified (Brink et al., 2005; Kew et al., 2013; Klerk et al., 2015). However, there is still debate as to whether the floods there are actually CEs, i.e. if surges and discharges can be treated independently or not when assessing the risk of flooding. As discussed in Klerk et al. (2015), a significant dependence is more likely in small catchments, such as those in mountainous areas by the coast, which have a quick response time to rainfall that may favour the coincidence of high river flows and storm surges driven by the same synoptic weather system.
[Figure 2: schematic of the hydraulic system with nodes $Y_1$ (Sea), $Y_2$ (River), $Y_3$ (River), the impact h and the predictor $X_{23}$ (River); the 3-Dim and 5-Dim model boundaries are marked.]
Figure 2. Hydraulic system for the Ravenna catchment. The area affected by compound floods is marked by the red point. The impact is the water level h, which is influenced by the contributing variables Y, i.e. sea and river levels. The variables inside the black rectangle are used to develop the 3-dimensional (stationary) model. The X are the meteorological predictors driving the contributing variables Y, which are incorporated into the 5-dimensional (non-stationary) model.
Here, we explicitly define the impact of compound floods as a function of sea and river levels in order to quantify the flooding risk and its related uncertainties. Moreover, we quantify the underestimation of the risk that occurs when the dependence between sea and river levels is not considered. We identify the meteorological predictors driving the river and sea levels. By incorporating such predictors into the statistical model, we extend the analysis of compound floods into the past, where data are available for predictors, but not for the river and sea level stations.
The paper is organized as follows. The case study and the conceptual model we develop are discussed in sections 2 and 3. The mathematical method we use to develop the model, i.e. pair-copula constructions, is introduced in section 4. Results are presented in section 5 and conclusions are provided in section 6. More technical details can be found in the appendices.
2 Compound flooding in the coastal area of Ravenna
In this study, we focus on the risk of compound floods in the coastal area of Ravenna. The choice of the case study was motivated by the extreme event that happened on the 6th of February 2015, as presented in the introduction. On the day prior to the event, values of up to approximately 80 mm of rain were recorded in the area surrounding Ravenna, and around 90 mm on the day of the event itself. The sea level recorded was the highest observed in the last 18 years. The high risk of flooding to the population in the Ravenna region has been underlined by the LIFE PRIMES project (Life Primes, a), recently financed by the European Commission, whose target is "to reduce the damages caused to the territory and population by events such as floods and storm surges" (Life Primes, b) in Ravenna and its surrounding areas. A schematic representation of the catchment on which we focus is shown in the black rectangle of Figure 2. The Y variables, river and sea levels, represent the contributing variables, and the water level h is the impact of the compound flood. The X variables are meteorological predictors of the contributing variables Y, which will be discussed in more detail later.
We develop a multivariate statistical model able to assess the risk of compound floods in Ravenna. Our research objectives
are the following:
1. Develop a statistical model to represent the dependencies between the contributing variables of the compound floods,
via pair-copula constructions.
2. Explicitly define the impact of compound floods as a function of the contributing variables. This allows us to estimate
the risk and the related uncertainty.
3. Identify the meteorological predictors for the contributing variables Y. Incorporate the meteorological predictors in the
model to gain insight into the physical mechanisms driving the compound floods and into their temporal variability.
4. Extend the analysis into the past (where data are available for the predictors, but not for the contributing variables Y).
2.1 Dataset
The data used here for the contributing variables Y and the impact h are water levels at a daily resolution (daily averages of hourly measurements). We use data for the extended winter season (November-March) of the period 2009-2015. Data sources are the Italian National Institute for Environmental Protection and Research (ISPRA) for the sea, and Arpae Emilia-Romagna for the rivers and the impact. River data were processed in order to mask periods of low quality, i.e. those suspected to be influenced by human activities such as the use of a dam. Moreover, we applied a procedure to homogenise the river data, whose details are given in appendix A. We do not filter out the astronomical tide component of the sea level, because the range of variation of the daily average sea level is about 1 m, while that of the astronomical tide is about 9 cm. To check this, we used the astronomical tide obtained through FES2012, which is a software produced by Noveltis, Legos and CLS Space Oceanography Division and distributed by Aviso, with support from Cnes (http://www.aviso.altimetry.fr/). Meteorological predictors were obtained from the ECMWF ERA-Interim reanalysis dataset (covering the period 1979-2015, with a resolution of 0.75 × 0.75 degrees; Dee et al., 2011). Specifically, for the river predictors we use daily data (sum of 12-hourly values) of total precipitation, evaporation, snow melt and snow fall, while for the sea level predictor we use daily data (average of 6-hourly values) of sea level pressure.
3 Modelling of compound events
Leonard et al. (2013) define a CE as "an extreme impact that depends on multiple statistically dependent variables or events". This definition stresses the extremeness of the impact rather than that of the individual contributing variables, which may not be extreme themselves, and the importance of the dependence between these contributing variables. The physical reasons for the dependence among the contributing variables can differ. For example, there can be a mutual reinforcement of one variable by the other and vice versa due to system feedbacks (Seneviratne et al., 2012). Alternatively, the probability of occurrence of the contributing variables can be influenced by a large-scale weather condition, as occurred in Ravenna (Figure 1), where the low pressure system caused coinciding extremes of river runoff and sea level. The dependence among the contributing variables is therefore a fundamental aspect of compound events, and it must be properly modelled to represent these extreme events well.
3.1 Non-stationary multivariate statistical model for CEs
Our non-stationary multivariate statistical model consists of three components: the contributing variables $Y_i$, including a model of their dependence structure, the impact h, and meteorological predictors $X_j$ of the contributing variables.
The contributing variables $Y_i$ and their multivariate dependence structure define the CE. For instance, in the case of compound floods, these are runoff and sea level. The impact h of a CE can be formalized via an impact function $h = h(Y_1, \dots, Y_n)$. In the case of compound flooding, we define the river gauge level in Ravenna as the impact, but in principle it can be any measurable variable such as, e.g., agricultural yield or economic loss. The predictors $X_j$ provide insight into the physical processes underlying CEs, including the temporal variability of CEs, and can be used to statistically downscale CEs (e.g. Maraun et al., 2010). The downscaling feature is particularly useful for compound events, which are not realistically simulated, or may not even be simulated at all, by available climate models. For instance, standard global and regional climate models do not simulate realistic runoff, and do not simulate sea surges. Here, our model can be used to downscale these contributing variables from simulated large-scale meteorological predictors. In particular, the model provides a simultaneous, i.e. multivariate, downscaling of the contributing variables $Y_i$, which allows for a realistic representation both of the dependencies between the $Y_i$ and of their marginal distributions. This is relevant because a separate downscaling of the contributing variables $Y_i$ may lead to unrealistic representations of the dependencies between the $Y_i$, which in turn would cause a poor estimation of the impact h. The downscaling feature can be useful to extend the risk analysis into the past, where observations of the predictors, but not of the contributing variables and impacts, are available.
More specifically, the model consists of:
1. An impact function to quantify the impact h:
$$
h = h(Y_1, \dots, Y_n). \tag{1}
$$
2. Meteorological predictors X for the contributing variables Y.
3. A conditional joint probability density function (pdf) $f_{Y|X}(Y|X)$ of the contributing variables Y, given the predictors X (which we describe through a parametric model, via pair-copula constructions). In particular, both the contributing variables Y and the predictors X are time dependent, i.e. $Y = Y(t)$ and $X = X(t)$.
A particular type of such a model is obtained when the predictors are not considered in the joint pdf, i.e. when considering $f_Y(Y)$. This stationary model does not allow the contributing variables Y and the impact to change due to a potential non-stationarity caused by the predictors X. It is conceptually similar to the model applied by Serinaldi (2015) to bivariate droughts.
An advantage of using a parametric statistical model is that it constrains the dependencies between the contributing variables, as well as their marginal distributions, and thereby reduces their uncertainties with respect to empirical estimates (Hobæk Haff et al., 2015). This reduction in turn reduces the uncertainty in the estimated physical quantity of interest, such as the impact of the CE.
4 Statistical method
Pair-copula constructions (PCCs) are mathematical decompositions of multivariate pdfs proposed by Joe (1996), which allow for the modelling of multivariate dependencies with high flexibility. We start by presenting the concept of a copula, and then we introduce PCCs. More technical details can be found in the appendices.
4.1 Copulas
Consider a vector $Y = (Y_1, \dots, Y_n)$ of random variables, with marginal pdfs $f_1(y_1), \dots, f_n(y_n)$ and marginal cumulative distribution functions (CDFs) $F_1(y_1), \dots, F_n(y_n)$, defined on $\mathbb{R} \cup \{-\infty, \infty\}$. We use the recurring definition $U_i := F_i(Y_i)$, where the name U indicates that these variables are uniformly distributed by construction. According to Sklar’s theorem (Sklar, 1959), the joint CDF $F(y_1, \dots, y_n)$ can be written as:
$$
F(y_1, \dots, y_n) = C(u_1, \dots, u_n) \tag{2}
$$
where C is an n-dimensional copula. C is a copula if $C: [0,1]^n \to [0,1]$ is a joint CDF of an n-dimensional random vector on the unit cube $[0,1]^n$ with uniform marginals.
Under the assumption that the marginal distributions $F_i$ are continuous, the copula C is unique and the multivariate pdf can be decomposed as:
$$
f(y_1, \dots, y_n) = f_1(y_1) \cdot \ldots \cdot f_n(y_n) \cdot c(u_1, \dots, u_n) \tag{3}
$$
where c is the copula density. Equation (3) explicitly represents the decomposition of the pdf as a product of the marginal distributions and the copula density, which describes the dependence among the variables independently of their marginals. Equation (3) has some important practical consequences: it allows us to generate a large number of joint pdfs. In fact, inserting any existing family for the marginal pdfs and copula density into eq. (3), it is possible to construct a valid joint pdf. The group of the existing parametric families of multivariate distributions (e.g. the multivariate normal distribution, which has normal marginals and copula) is only a part of the realizations which are possible via equation (3). Copulas therefore increase the number of available multivariate distributions.
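
As an illustration of eq. (3) for n = 2, the sketch below builds joint pdfs by combining marginal pdfs with a bivariate Gaussian copula density. The choice of the Gaussian copula, the exponential marginals and the parameter value rho = 0.6 are assumptions made only for this example; the function names are hypothetical and do not come from the paper.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def gaussian_copula_density(u1, u2, rho):
    """Density c(u1, u2) of the bivariate Gaussian copula with parameter rho."""
    z1, z2 = STD_NORMAL.inv_cdf(u1), STD_NORMAL.inv_cdf(u2)
    det = 1.0 - rho * rho
    exponent = (2.0 * rho * z1 * z2 - rho * rho * (z1 * z1 + z2 * z2)) / (2.0 * det)
    return math.exp(exponent) / math.sqrt(det)


def joint_pdf(y1, y2, marginal1, marginal2, rho):
    """Eq. (3) for n = 2: product of the marginal pdfs and the copula density.

    Each marginal is a (pdf, cdf) pair of callables.
    """
    (pdf1, cdf1), (pdf2, cdf2) = marginal1, marginal2
    return pdf1(y1) * pdf2(y2) * gaussian_copula_density(cdf1(y1), cdf2(y2), rho)


# Normal marginals with a Gaussian copula reproduce the bivariate normal pdf.
normal = (STD_NORMAL.pdf, STD_NORMAL.cdf)
# Exponential marginals (rate 1) with the same copula give a different joint pdf.
exponential = (lambda y: math.exp(-y), lambda y: 1.0 - math.exp(-y))

print(joint_pdf(0.5, -0.2, normal, normal, rho=0.6))
print(joint_pdf(0.5, 1.5, exponential, exponential, rho=0.6))
```

Swapping the marginals while keeping the copula fixed changes the joint pdf but not the dependence structure, which is the separation that eq. (3) expresses.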
4.2 Tail dependence
The dependence of extreme events cannot be measured by overall correlation coefficients such as Pearson's, Spearman's or Kendall's. Two random variables that are uncorrelated according to such overall dependence coefficients can still have a significant probability of concurrent extremes, i.e. a tail dependence (Hobæk Haff et al., 2015). Conversely, two random variables that are correlated according to an overall dependence coefficient are not necessarily tail dependent.
Mathematically, two random variables $(Y_1, Y_2)$ with marginal CDFs $(F_1(y_1), F_2(y_2))$ are upper tail dependent if:
$$
\lambda_U(Y_1, Y_2) = \lim_{u \to 1} P\left(Y_2 > F_2^{-1}(u) \mid Y_1 > F_1^{-1}(u)\right) > 0 \tag{4}
$$
where $P(A|B)$ indicates the conditional probability of occurrence of the event A given the event B. Similarly, the two variables are lower tail dependent if:
$$
\lambda_L(Y_1, Y_2) = \lim_{u \to 0} P\left(Y_2 < F_2^{-1}(u) \mid Y_1 < F_1^{-1}(u)\right) > 0. \tag{5}
$$
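
The difference between overall correlation and tail dependence can be checked numerically. The sketch below is an illustration under stated assumptions, not the method of the paper: it samples a Clayton copula (via the Marshall-Olkin frailty algorithm) and a Gaussian copula, both tuned to the same Kendall's tau of 0.5, and evaluates the conditional probability of eq. (5) at a small but finite u = 0.01. The copula families, parameter values and function names are chosen for the example only.

```python
import math
import random
from statistics import NormalDist


def sample_clayton(theta, n, rng):
    """Draw n pairs from a Clayton copula (Marshall-Olkin frailty algorithm)."""
    pairs = []
    for _ in range(n):
        v = rng.gammavariate(1.0 / theta, 1.0)  # shared frailty variable
        u1 = (1.0 + rng.expovariate(1.0) / v) ** (-1.0 / theta)
        u2 = (1.0 + rng.expovariate(1.0) / v) ** (-1.0 / theta)
        pairs.append((u1, u2))
    return pairs


def sample_gaussian(rho, n, rng):
    """Draw n pairs from a Gaussian copula with correlation rho."""
    phi = NormalDist().cdf
    scale = math.sqrt(1.0 - rho * rho)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + scale * rng.gauss(0.0, 1.0)
        pairs.append((phi(z1), phi(z2)))
    return pairs


def lower_tail_estimate(pairs, u=0.01):
    """Finite-u estimate of eq. (5): P(U2 < u | U1 < u)."""
    below = [u2 for u1, u2 in pairs if u1 < u]
    return sum(1 for u2 in below if u2 < u) / len(below)


rng = random.Random(42)
theta = 2.0                          # Clayton: Kendall's tau = theta / (theta + 2) = 0.5
rho = math.sin(math.pi * 0.5 / 2.0)  # Gaussian copula with the same Kendall's tau
lam_clayton = lower_tail_estimate(sample_clayton(theta, 200_000, rng))
lam_gauss = lower_tail_estimate(sample_gaussian(rho, 200_000, rng))
print(f"Clayton:  {lam_clayton:.2f} (limit 2**(-1/theta) = {2 ** (-1 / theta):.2f})")
print(f"Gaussian: {lam_gauss:.2f} (limit 0)")
```

Although both samples share the same Kendall's tau, the Clayton estimate stays close to its theoretical limit, while the Gaussian estimate is clearly smaller and would keep decreasing towards zero for smaller u.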
4.3 Pair-Copula Constructions (PCCs)
While the number of bivariate copula families is very large (Joe, 1997; Nelsen, 2006), building higher-dimensional copulas is generally recognised as a difficult problem (Aas et al., 2009). As a consequence, the set of copula families having dimension greater than or equal to 3 is rather limited, and they lack flexibility in modelling multivariate pdfs where heterogeneous dependencies exist among different pairs. For instance, they usually prescribe that all the pairs have the same type of dependence, e.g. they are either all tail dependent or not. Under the assumption that the joint CDF is absolutely continuous, with strictly increasing marginal CDFs, PCCs allow one to mathematically decompose an n-dimensional copula density into the product of $n(n-1)/2$ bivariate copulas, some of which are conditional. In practice, this provides high flexibility in building high-dimensional copulas. PCCs allow for the independent selection of the pair-copulas among the large set of families, providing higher flexibility in building high-dimensional joint pdfs than the existing multivariate parametric copulas (Aas et al., 2009).
When the dimension of the pdf is large, there can be many possible, mathematically equally valid decompositions of the copula density into a PCC. For example, for a 5-dimensional system there are 480 possible different decompositions. For this reason, Bedford and Cooke (2001, 2002) introduced the regular vine, a graphical model which helps to organize the possible decompositions. This is helpful for choosing which PCC to use to decompose the multivariate copula. In this study we concentrate on two subcategories of regular vines: canonical vines (also known as C-vines) and D-vines. Out of the 480 possible decompositions for a 5-dimensional copula density, 240 are regular vines (60 C-vines, 60 D-vines and 120 other types of vines) (Aas et al., 2009). The decomposition we use for the non-stationary model is the following D-vine:
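$$
\begin{aligned}
f_{12345}(y_1, y_2, y_3, y_4, y_5) ={} & f_4(y_4) \cdot f_5(y_5) \cdot f_3(y_3) \cdot f_1(y_1) \cdot f_2(y_2) \\
& \cdot c_{45}(u_4, u_5) \cdot c_{53}(u_5, u_3) \cdot c_{31}(u_3, u_1) \cdot c_{12}(u_1, u_2) \\
& \cdot c_{43|5}(u_{4|5}, u_{3|5}) \cdot c_{51|3}(u_{5|3}, u_{1|3}) \cdot c_{32|1}(u_{3|1}, u_{2|1}) \\
& \cdot c_{41|35}(u_{4|53}, u_{1|53}) \cdot c_{52|13}(u_{5|31}, u_{2|31}) \\
& \cdot c_{42|135}(u_{4|513}, u_{2|513})
\end{aligned}
\tag{6}
$$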
where $(Y_1, Y_2, Y_3)$ are the variables $(Y_{1\mathrm{Sea}}, Y_{2\mathrm{River}}, Y_{3\mathrm{River}})$, and $(Y_4, Y_5)$ are the predictors $(X_{1\mathrm{Sea}}, X_{23\mathrm{Rivers}})$ (details about the predictors are given in the next section). The graphical representation of this decomposition is shown in Figure 10a (appendix B1).
More details about vines and the decompositions used for the stationary model are in appendix B1. Details regarding the statistical inference of the joint pdf can be found in appendix C, while the sampling and conditional sampling procedure from vines (including the algorithm used) are presented in appendix B2.
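
The structure of a D-vine follows mechanically from the ordering of the variables in its first tree. The short sketch below (a hypothetical helper written for illustration, not code from the paper) lists the pair-copulas of the D-vine for the ordering 4, 5, 3, 1, 2 and reproduces the copula indices appearing in eq. (6).

```python
def dvine_pair_copulas(order):
    """List the pair-copulas of the D-vine whose first tree follows `order`.

    Returns a list of trees; each tree is a list of (a, b, conditioning)
    tuples standing for the pair-copula c_{ab|conditioning}.
    """
    n = len(order)
    trees = []
    for level in range(1, n):  # tree 1 ... tree n-1
        tree = []
        for start in range(n - level):
            a = order[start]
            b = order[start + level]
            # Variables lying between a and b in the ordering are conditioned on.
            conditioning = tuple(sorted(order[start + 1:start + level]))
            tree.append((a, b, conditioning))
        trees.append(tree)
    return trees


def label(a, b, conditioning):
    """Format a pair-copula as in eq. (6), e.g. c43|5."""
    cond = "|" + "".join(map(str, conditioning)) if conditioning else ""
    return f"c{a}{b}{cond}"


trees = dvine_pair_copulas([4, 5, 3, 1, 2])  # variable ordering of eq. (6)
for k, tree in enumerate(trees, start=1):
    print(f"tree {k}:", ", ".join(label(*edge) for edge in tree))
# tree 1: c45, c53, c31, c12
# tree 2: c43|5, c51|3, c32|1
# tree 3: c41|35, c52|13
# tree 4: c42|135
```

The four trees contain 4 + 3 + 2 + 1 = 10 pair-copulas, i.e. $n(n-1)/2$ for n = 5. Since an ordering and its reverse define the same D-vine, five variables give 5!/2 = 60 distinct D-vines, in agreement with the count quoted above.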
5 Results
The extreme impact of compound events may be driven by the joint occurrence of non-extreme contributing variables (Leonard et al., 2013; Seneviratne et al., 2012). This is the case for compound floods in Ravenna, where not all extreme values of the impact would be considered if selecting only extreme values of the contributing variables. Therefore, we model the contributing variables without focusing only on their extreme values. Below we show the steps we follow to study compound floods in Ravenna, based on the conceptual model described in section 3.1.
1. Define the impact function:
$$
h = h(Y_{1\mathrm{Sea}}, Y_{2\mathrm{River}}, Y_{3\mathrm{River}}). \tag{7}
$$
The contributing variables Y (sea and river levels) and the impact are shown in the black rectangle of Figure 2.
2. Find the meteorological predictors of the contributing variables Y. For each variable $Y_i$ we found more than one meteorological predictor, which we aggregated into a single variable $X_i$; from now on we refer to $X_i$ as the predictor of the variable $Y_i$. Moreover, we use an identical predictor for the two river levels because they are driven by a similar meteorological influence. The predictors are shown graphically in Figure 2, where we introduce $X_{1\mathrm{Sea}}$ (the predictor of $Y_{1\mathrm{Sea}}$) and $X_{23\mathrm{Rivers}}$ (the predictor of $Y_{2\mathrm{River}}$ and $Y_{3\mathrm{River}}$).
3. Fit the 5-dimensional conditional joint pdf $f_{Y|X}(Y_{1\mathrm{Sea}}, Y_{2\mathrm{River}}, Y_{3\mathrm{River}} \mid X_{1\mathrm{Sea}}, X_{23\mathrm{Rivers}})$ of the non-stationary model (modelled via PCC). Here, we use the model to extend the multivariate time series $Y(t)$ to the past (period 1979-2015), when only $X(t)$ is available. The 3-dimensional pdf of the stationary model is $f_Y(Y_{1\mathrm{Sea}}, Y_{2\mathrm{River}}, Y_{3\mathrm{River}})$ and includes only the contributing variables Y inside the black rectangle of Figure 2.
4. Given the complexity of the problem, an analytical derivation of the statistical properties of the impact is impracticable. Therefore, we apply a Monte Carlo procedure (this is also required to get the model uncertainty, as shown in appendix D). Specifically, we simulate the contributing variables Y from the fitted models, and then we define the simulated values of h via equation (7) as:
$$
h^{\mathrm{sim}} := h(Y^{\mathrm{sim}}_{1\mathrm{Sea}}, Y^{\mathrm{sim}}_{2\mathrm{River}}, Y^{\mathrm{sim}}_{3\mathrm{River}}) \tag{8}
$$
where $Y^{\mathrm{sim}}$ are the simulated values of Y.
5. Perform a statistical analysis of the values $h^{\mathrm{sim}}$, and assess the risk associated with the events. We also compute the model uncertainties, which is straightforward with such models. These uncertainties propagate into the risk assessment, and so they must be considered.
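
The Monte Carlo part of the procedure (steps 4 and 5) can be sketched as follows. This is a simplified illustration under loudly stated assumptions: a one-factor Gaussian model with standardised variables stands in for the fitted PCC, the impact function has the quadratic form of the regression model introduced in section 5.1, and all coefficients (`COEF`, `SIGMA_H`) are hypothetical values, not the fitted ones.

```python
import math
import random

# Hypothetical coefficients of the impact function; NOT the fitted values of the paper.
COEF = {"a1": 0.5, "a21": 0.2, "a22": 0.05, "a31": 0.3, "a32": 0.05, "c": 0.1}
SIGMA_H = 0.05  # hypothetical standard deviation of the regression noise


def impact(y_sea, y_river2, y_river3, noise=0.0):
    """Impact h: linear in the sea level, quadratic in each river level."""
    k = COEF
    return (k["a1"] * y_sea
            + k["a21"] * y_river2 + k["a22"] * y_river2 ** 2
            + k["a31"] * y_river3 + k["a32"] * y_river3 ** 2
            + k["c"] + noise)


def simulate_impacts(n, rho, seed=0):
    """Simulate the contributing variables, then map each draw to h_sim.

    A one-factor Gaussian model (pairwise correlation rho) stands in for
    the fitted pair-copula construction.
    """
    rng = random.Random(seed)
    w_common, w_own = math.sqrt(rho), math.sqrt(1.0 - rho)
    h_sim = []
    for _ in range(n):
        common = rng.gauss(0.0, 1.0)
        y = [w_common * common + w_own * rng.gauss(0.0, 1.0) for _ in range(3)]
        h_sim.append(impact(*y, noise=rng.gauss(0.0, SIGMA_H)))
    return h_sim


def quantile(values, p):
    """Empirical p-quantile (nearest-rank definition)."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(p * len(ordered)))]


q_dependent = quantile(simulate_impacts(100_000, rho=0.6), 0.99)
q_independent = quantile(simulate_impacts(100_000, rho=0.0), 0.99)
print(f"99% impact level: dependent {q_dependent:.2f}, independent {q_independent:.2f}")
```

In this toy setting the high quantile of the simulated impact is larger when the contributing variables are dependent, so a given impact level is reached more often than the independent case would suggest. Repeating the simulation with refitted parameters would be one way to obtain an uncertainty range; the procedure actually used is described in appendix D.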
5.1 Impact function
The water level his influenced by river (Y2River and Y3River ) and sea (Y1Sea ) levels (Figure 2). We describe such influence through5
the following multiple regression model:
h=a1Y1Sea +a21Y2River +a22Y2
2River +a31Y3River +a32Y2
3River +c+ηh(0,σh)(9)
where $\eta_h(0, \sigma_h)$ is Gaussian noise with standard deviation $\sigma_h$. The contribution of the rivers to the impact $h$ is expressed via quadratic polynomials, which gives a better fit of the model according to the Akaike Information Criterion (AIC). In particular, we defined the regression model as the best output of both a forward and a backward selection procedure, considering linear and quadratic terms for all of the $Y$ as candidate variables. The Q-Q plot of the model, i.e. the plot of the quantiles of the observed values against those of the mean predicted values from the model, is shown in Figure 3. The points lie along the line $y = x$, which indicates that the model is satisfactory. For the two reduced models that omit either the river or the sea variables from the regression, the Q-Q plots show larger deviations from the line $y = x$ (not shown), which underlines the compound nature of the impact $h$. The relative contribution of each contributing variable $Y_i$ to the impact is of similar magnitude, as shown by the product of each parameter and the standard deviation $\sigma$ of the corresponding variable: $a_1 \cdot \sigma(Y_{1\mathrm{Sea}}) = 0.15$, $a_{21} \cdot \sigma(Y_{2\mathrm{River}}) + a_{22} \cdot \sigma(Y_{2\mathrm{River}}^2) = 0.036$, $a_{31} \cdot \sigma(Y_{3\mathrm{River}}) + a_{32} \cdot \sigma(Y_{3\mathrm{River}}^2) = 0.10$. In particular, the sum of the relative contributions of the rivers is very similar to that of the sea. The parameters of this model (and of those in section 5.2) were estimated by maximum likelihood, which for Gaussian noise coincides with ordinary least squares, solved through QR decomposition (via the lm function of the R package stats).
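As a rough illustration of how a regression of the form of equation (9) can be fitted and compared with reduced models, the sketch below solves the least-squares problem through a QR decomposition in NumPy and computes a Gaussian AIC. It is not the authors' R code: the function names are hypothetical, the AIC counts the regression coefficients plus the noise variance as parameters, and the forward and backward selection procedure itself is not reproduced.

```python
import numpy as np


def design_matrix(y_sea, y_r2, y_r3):
    """Columns matching equation (9): sea, river 2 (linear, squared), river 3 (linear, squared), intercept."""
    return np.column_stack([y_sea, y_r2, y_r2**2, y_r3, y_r3**2, np.ones_like(y_sea)])


def fit_ols_qr(X, h):
    """Least-squares fit via reduced QR; returns coefficients, ML noise std and AIC."""
    q, r = np.linalg.qr(X)                # X = Q R, with R upper triangular
    coef = np.linalg.solve(r, q.T @ h)    # solve R coef = Q^T h
    resid = h - X @ coef
    n, k = X.shape
    sigma_ml = np.sqrt(resid @ resid / n)  # maximum-likelihood estimate of sigma_h
    loglik = -0.5 * n * (np.log(2.0 * np.pi * sigma_ml**2) + 1.0)
    aic = 2.0 * (k + 1) - 2.0 * loglik     # k coefficients plus the noise variance
    return coef, float(sigma_ml), float(aic)


def relative_contributions(coef, y_sea, y_r2, y_r3):
    """Parameter times standard deviation of the corresponding term, summed per variable."""
    a1, a21, a22, a31, a32, _ = coef
    return {
        "sea": a1 * np.std(y_sea),
        "river2": a21 * np.std(y_r2) + a22 * np.std(y_r2**2),
        "river3": a31 * np.std(y_r3) + a32 * np.std(y_r3**2),
    }
```

Fitting `design_matrix(...)` and a reduced matrix (for example the sea column plus an intercept) with `fit_ols_qr` and comparing the two AIC values mirrors, in spirit, the comparison between the full model and the models without rivers or sea.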
5.2 Meteorological Predictor Selection
We show in Figure 4 the resulting scatter plots of observed predictands ($Y^{\mathrm{obs}}$) and selected observed predictors ($X^{\mathrm{obs}}$). To fit the joint pdf of the non-stationary model, we use all time steps where data for all of the $X$ and $Y$ variables have been recorded. However, we calibrate the predictors of rivers and sea separately, so we use all available data for each $Y$ variable (during the period November-March). The procedure we use to identify the meteorological predictors is shown below.
5.2.1 River levels
The meteorological influence on the two rivers $Y_{2\mathrm{River}}$ and $Y_{3\mathrm{River}}$ is very similar because their catchments are small and close to each other (as a consequence, the Spearman correlation between the rivers is high, i.e. 0.79). Therefore, we use an identical predictor for the two river levels.
Figure 3. Q-Q plot between the observed impact (X-axis) and the modelled impact (Y-axis) from the regression model (equation (9)). Both axes are in metres.

Figure 4. Scatter plots of predictands $Y^{\mathrm{obs}}$ and predictors $X^{\mathrm{obs}}$ (panels: $Y_{1\mathrm{Sea}}$, $Y_{2\mathrm{River}}$, $Y_{3\mathrm{River}}$, $X_{1\mathrm{Sea}}$, $X_{23\mathrm{Rivers}}$). The numbers are Spearman correlation coefficients. The red lines (computed via LOWESS, i.e. Locally Weighted Scatterplot Smoothing) are shown to better visualize the relationship between pairs.

The river levels are influenced by the total input of water over the catchments, which is given by the positive contribution of precipitation and snow melt, and by evaporation, which results in a reduction of the river runoff. Specifically, we compute the input of water $w$ on day $t$ over the river catchments (one grid point) as:

$$w(t) = P_{\mathrm{total}}(t) - E(t) + S_{\mathrm{melt}}(t) - S_{\mathrm{fall}}(t) \tag{10}$$
where $P_{\mathrm{total}}$ is the total precipitation, $E$ is the evaporation, $S_{\mathrm{melt}}$ is the snow melt and $S_{\mathrm{fall}}$ is the snow fall. The snow fall accounts for the fraction of precipitation which does not immediately contribute to the input of water over the catchments because of its solid state. While a fraction of the water input over the catchment rapidly reaches the rivers as surface runoff, another fraction infiltrates the ground and contributes only later to the river discharge. Compared with the first fraction, the second has a slower response to precipitation and changes more gradually over time. This double effect underlines the compound nature of river runoff, whose response to precipitation falling at a given time is higher if additional precipitation fell in the river catchment in the previous period. To consider both of these effects we define the river predictor as:
$$X_{23\mathrm{Rivers}}(t) = a_R \sum_{t'=t-1}^{t} w(t') + b_R \sum_{t'=t-10}^{t} w(t') + c_R \tag{11}$$
where $c_R$ is a constant. We chose the parameters of equation (11) by fitting the right-hand side of this equation to the rivers (i.e. to the variable $Y_{23\mathrm{Rivers}}$). Specifically, $Y_{23\mathrm{Rivers}} := a_{21} Y_{2\mathrm{River}} + a_{22} Y_{2\mathrm{River}}^2 + a_{31} Y_{3\mathrm{River}} + a_{32} Y_{3\mathrm{River}}^2$ represents the contribution of the river levels to the impact (see equation (9)). The lags $n = 1$ and $n = 10$ days are those which maximise, respectively, the upper tail dependence and the Spearman correlation between $Y_{23\mathrm{Rivers}}(t)$ and the cumulated $w$ over the previous $n$ days, i.e. $\sum_{t'=t-n}^{t} w(t')$. Here, we use the upper tail dependence to get the typical river response time to the fraction of water which directly flows into the rivers as surface runoff. Similarly, the Spearman correlation is used to get the typical time required for the infiltrated water in the ground to flow into the rivers.
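The construction of the river predictor can be sketched in a few lines. The code below is an illustrative sketch, not the authors' implementation: function names are hypothetical, the Spearman coefficient is computed from ordinal ranks (ties are not averaged, which is adequate only for continuous data), and only the Spearman-based lag search is shown, since the estimator of upper tail dependence is not specified in this section.

```python
import numpy as np


def water_input(p_total, evap, s_melt, s_fall):
    """Equation (10): precipitation minus evaporation, plus snow melt, minus snow fall."""
    return p_total - evap + s_melt - s_fall


def cumulated(w, n):
    """Sum of w over days t-n, ..., t; NaN for the first n days, where the window is incomplete."""
    w = np.asarray(w, dtype=float)
    out = np.full(w.shape, np.nan)
    csum = np.concatenate(([0.0], np.cumsum(w)))
    out[n:] = csum[n + 1:] - csum[:-(n + 1)]
    return out


def spearman(x, y):
    """Spearman correlation via ordinal ranks (assumes no ties)."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    return float(np.corrcoef(rx, ry)[0, 1])


def best_lag_by_spearman(w, y, max_lag):
    """Lag n in 1..max_lag maximising the Spearman correlation between cumulated w and y."""
    y = np.asarray(y, dtype=float)
    scores = {}
    for n in range(1, max_lag + 1):
        c = cumulated(w, n)
        # Compare all lags on the same days, i.e. those with a complete window for max_lag.
        scores[n] = spearman(c[max_lag:], y[max_lag:])
    best = max(scores, key=scores.get)
    return best, scores


def river_predictor(w, a_r, b_r, c_r, short_lag=1, long_lag=10):
    """Equation (11) with the lags of 1 and 10 days as defaults."""
    return a_r * cumulated(w, short_lag) + b_r * cumulated(w, long_lag) + c_r
```

The coefficients `a_r`, `b_r` and `c_r` would then come from regressing the river contribution on the two cumulated sums, for instance with an ordinary least-squares fit.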
By defining the river predictor as in equation (11), we aggregate the different meteorological drivers of the rivers into the single predictor $X_{23\mathrm{Rivers}}(t)$. Such aggregation simplifies the system describing the compound floods by reducing the number of variables involved. Furthermore, this reduces the number of variables described by the joint pdf $f_{Y,X}(Y, X)$, whose numerical implementation errors can potentially increase with higher dimensionality (Hobæk Haff, 2012).
All of the terms involved in the multiple regression model (equation (11)) are statistically significant at level $\alpha = 2 \cdot 10^{-16}$. Moreover, the quality of the river predictor $X_{23\mathrm{Rivers}}$ improves (according to the likelihood and to the Spearman correlation between $X_{23\mathrm{Rivers}}$ and $Y_{23\mathrm{Rivers}}$) when we use all of the terms in equation (10), instead of only $P_{\mathrm{total}}(t)$. The presence of more terms in equation (10) does not increase the number of model parameters.
5.2.2 Sea level
Sea level can be modelled as the superposition of the barometric pressure effect, i.e. the force exerted by the atmospheric weight on the water, the wind-induced surge, and an overall annual cycle. As for the river predictor, we aggregate the different physical contributions into a single predictor. We define the sea level predictor on day $t$ as:

$$X_{1\mathrm{Sea}}(t) = a_S\, SLP_{\mathrm{Ravenna}}(t) + b_S\, SLP(t) \cdot RMAP + c_S \sin(\omega_{1\mathrm{Year}} t + \varphi) + d_S \tag{12}$$
Figure 5. Regression map $RMAP$ (equation (12)), shown as a function of longitude and latitude. The value of the regression map at the location $(i, j)$ is given by $RMAP(i, j) = \mathrm{var}(R')^{-1} \cdot \mathrm{cov}(R', SLP_{i,j})$, where $R'(t)$ is the residual of the barometric pressure effect obtained from the fit of the linear model $a'\, SLP_{\mathrm{Ravenna}}(t) + d'$ to $Y_{1\mathrm{Sea}}(t)$. The regression map is equivalent to a 1-dimensional maximum covariance analysis (Widmann, 2005). The red spot indicates Ravenna.
where $SLP_{\mathrm{Ravenna}}$ is the sea level pressure in Ravenna, $SLP \cdot RMAP$ is the wind contribution due to the sea level pressure field $SLP$, the harmonic term is the annual cycle and $d_S$ is a constant term. We chose the parameters of equation (12) by regressing the sea level $Y_{1\mathrm{Sea}}(t)$ on the right-hand side of this equation. A more detailed physical interpretation of the terms is given in the following.
1. $a_S\, SLP_{\mathrm{Ravenna}}$ accounts for the barometric pressure effect (Brink et al., 2004). The regression map $RMAP$ indicates which anomalies of the $SLP$ field are associated with high values of the residual of the barometric pressure effect (see Figure 5, where more details are also given). In particular, according to the geostrophic equation for wind, these pressure anomalies induce wind in the Adriatic Sea towards Ravenna's coast. Therefore, the projection of the $SLP$ field onto this regression map, i.e. the term $SLP(t) \cdot RMAP$, describes the wind-induced change in the sea level at time $t$.
2. $c_S \sin(\omega_{1\mathrm{Year}} t + \varphi)$ describes the remaining annual cycle of the sea level which is not described by the barometric pressure effect and the wind contribution. This harmonic term could be driven by the annual hydrological cycle (Tsimplis and Woodworth, 1994), i.e. by the cyclic runoff of rivers which flow into the Adriatic Sea. The astronomical tide may drive a minor fraction of this term. The range of variation of $c_S \sin(\omega_{1\mathrm{Year}} t + \varphi)$ is about 10% of that of the sea level. When we use the predictor to extend the analysis to the period 1979-2015, this term will be kept constant, assuming that the hydrological annual cycle has not drastically changed in past years.
All the terms involved in the multiple regression model are statistically significant at level $\alpha = 2 \cdot 10^{-16}$.
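The regression map of Figure 5 and the fit of equation (12) can be illustrated with the sketch below. It rests on several assumptions that are not taken from the source: the pressure field is stored as an array of shape (time, latitude, longitude), the projection $SLP(t) \cdot RMAP$ is taken as the sum over grid points of the raw field times the map (any constant offset is absorbed by $d_S$), and the harmonic term is fitted through the identity $c \sin(\omega t + \varphi) = \alpha \sin(\omega t) + \beta \cos(\omega t)$ so that the problem stays linear. All function names are hypothetical.

```python
import numpy as np


def regression_map(slp_field, slp_ravenna, y_sea):
    """RMAP(i, j) = cov(R', SLP_ij) / var(R'), with R' the residual of y_sea ~ a' * slp_ravenna + d'."""
    A = np.column_stack([slp_ravenna, np.ones_like(slp_ravenna)])
    (a0, d0), *_ = np.linalg.lstsq(A, y_sea, rcond=None)
    resid = y_sea - (a0 * slp_ravenna + d0)
    r = resid - resid.mean()
    anomalies = slp_field - slp_field.mean(axis=0)
    cov = np.tensordot(r, anomalies, axes=(0, 0)) / (len(r) - 1)
    return cov / r.var(ddof=1)


def project(slp_field, rmap):
    """Projection of the pressure field onto the regression map, one value per time step."""
    return np.tensordot(slp_field, rmap, axes=([1, 2], [0, 1]))


def fit_sea_predictor(y_sea, slp_ravenna, projection, t_days, period_days=365.25):
    """Least-squares fit of equation (12) with a linearised annual harmonic."""
    omega = 2.0 * np.pi / period_days
    A = np.column_stack([slp_ravenna, projection,
                         np.sin(omega * t_days), np.cos(omega * t_days),
                         np.ones_like(y_sea)])
    beta, *_ = np.linalg.lstsq(A, y_sea, rcond=None)
    a_s, b_s, alpha, beta_cos, d_s = beta
    params = {
        "a_S": a_s,
        "b_S": b_s,
        "c_S": float(np.hypot(alpha, beta_cos)),     # amplitude of the harmonic
        "phi": float(np.arctan2(beta_cos, alpha)),   # phase of the harmonic
        "d_S": d_s,
    }
    return params, A @ beta   # fitted parameters and the predictor X_1Sea(t)
```

The period of 365.25 days for $\omega_{1\mathrm{Year}}$ is an assumption of this sketch; the source only states that the harmonic term describes an annual cycle.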