Accepted for publication in Water Resources Research

On the exact distribution of correlated extremes in hydrology

F. Lombardo1,2, F. Napolitano1, F. Russo1, and D. Koutsoyiannis1,3

1 Dipartimento di Ingegneria Civile, Edile e Ambientale, Sapienza Università di Roma, Via Eudossiana, 18 – 00184 Rome, Italy.

2 Corpo Nazionale dei Vigili del Fuoco, Ministero dell’Interno, Piazza del Viminale, 1 – 00184 Rome, Italy.

3 Department of Water Resources and Environmental Engineering, National Technical University of Athens, Heroon Polytechneiou 5, GR-157 80 Zographou, Greece.

Corresponding author: Federico Lombardo (federico.lombardo@uniroma1.it)

Key Points:

We propose a non-asymptotic closed-form distribution for dependent maxima.

We introduce a new efficient generator of Markov chains with arbitrary marginals.

We contribute to developing more reliable data-rich-based analyses of extreme values.

Abstract

The analysis of hydrological hazards usually relies on asymptotic results of extreme value theory (EVT), which commonly deals with block maxima (BM) or peaks over threshold (POT) data series. However, data quality and quantity of BM and POT hydrological records do not usually fulfill the basic requirements of EVT, thus making its application questionable and results prone to high uncertainty and low reliability. An alternative approach to better exploit the available information of continuous time series and non-extreme records is to build the exact distribution of maxima (i.e., non-asymptotic extreme value distributions) from a sequence of low-threshold POT. Practical closed-form results for this approach exist only for independent high-threshold POT series with Poisson occurrences. This study introduces new closed-form equations of the exact distribution of maxima taken from low-threshold POT with magnitudes characterized by an arbitrary marginal distribution and first-order Markovian dependence, and negative binomial occurrences. The proposed model encompasses and generalizes the independent-Poisson model and allows for analyses relying on significantly larger samples of low-threshold POT values exhibiting dependence, temporal clustering and overdispersion. To check the analytical results, we also introduce a new generator (called Gen2Mp) of proper first-order Markov chains with arbitrary marginal distributions. An illustrative application to long-term rainfall and streamflow data series shows that our model for the distribution of extreme maxima under dependence takes a step forward in developing more reliable data-rich-based analyses of extreme values.

1 Introduction

The study of hydrological extremes has a long history in research applied to the design and management of water supply (e.g. Hazen, 1914) and flood protection works (e.g. Fuller, 1914). Almost half a century after the first pioneering empirical studies, Gumbel (1958) provided a general framework linking the theoretical properties of probabilities of extreme values (e.g. Fisher and Tippett, 1928) to the empirical basis of hydrological frequency curves. Since then, extreme value theory (EVT) applied to hydrological analyses has been a matter of primary concern in the literature (see e.g. Papalexiou and Koutsoyiannis, 2013; Serinaldi and Kilsby, 2014 for a detailed overview). EVT aims at modeling the extremal behavior of observed phenomena by asymptotic probability distributions, and observations to which such distributions are allegedly related should meet the following important conditions:

1. They should resemble the samples of independent and identically distributed (i.i.d.) random variables. Then, extreme events arise from a stationary distribution and are independent of one another.

2. Their number should be large. Defining how large their size should be depends on the characteristics of the parent distribution from which the extreme values are taken (e.g. the tail behavior) and the degree of precision we seek.

Most of these assumptions, commonly made in classical statistical analyses, are hardly ever realized in hydrological applications, especially when studying extremes. Specifically, the traditional analysis of hydrological extremes is based on statistical samples that are formed by selecting from the entire data series (e.g. at the daily scale) those values that can reasonably be considered as realizations of independent extremes, e.g. annual maxima or peaks over a certain high threshold. Thus, many observations are discarded and the reduction of the already small size of common hydrological records significantly affects the reliability of the estimates (Koutsoyiannis, 2004a,b; Volpi et al., 2019). In addition, Koutsoyiannis (2004a) showed that the convergence to the asymptotic distributions can be extremely slow and may require a huge number of events. Thus, a typical number of extreme hydrological events does not guarantee convergence in applications.

Furthermore, the long-term behavior of the hydrological cycle and its driving forces provide the context to understand that correlations between hydrological samples not only occur but can also persist for a long time (see O’Connell et al., 2016 for a recent review). While Leadbetter (1974, 1983) demonstrated that distributions based on dependent events (with limited long-term persistence at extreme levels) share the same asymptotic properties as distributions based on independent trials, there is evidence that correlation has a strong influence on the exact statistical properties of extreme values and slows down the already slow rate of convergence (e.g. Eichner et al., 2011; Bogachev and Bunde, 2012; Volpi et al., 2015; Serinaldi and Kilsby, 2016). In essence, correlation inflates the variability of the expected values and the width of confidence intervals (CIs) due to information redundancy, and a typical effect is reflected in the tendency of hydrological extremes to cluster in space and time (e.g. Serinaldi and Kilsby, 2018 and references therein). Moreover, focusing on extreme data values, such as annual maxima, hinders reliable retrieval of the dependence structure characterizing the underlying process because of sampling effects of data selection (Serinaldi et al., 2018; Iliopoulou and Koutsoyiannis, 2019). Then, correlation structures and variability of hydrological processes might easily be underestimated, further compromising the attempt to draw conclusions about trends spanning the period of records (see Serinaldi et al., 2018, for a detailed discussion). In other words, the lately growing body of publications examining “nonstationarity” in hydrological extremes (see Salas et al., 2018 and references therein) may likely reflect time dependence of such extremes within a stationary setting, as observed patterns are usually compatible with stationary correlated random processes (Koutsoyiannis and Montanari, 2015; Luke et al., 2017; Serinaldi and Kilsby, 2018).

In classical statistical analyses of hydrological extremes, to form data samples we commonly use two alternative strategies referred to as “block maxima” (BM) and “peaks over threshold” (POT) methods. The former is to choose the highest of all recorded values in each year (for a given time scale, e.g. daily rainfall) and form a sample with size equal to the number of years of the record. The POT method is to form a sample with all recorded values exceeding a certain threshold irrespective of the year in which they occurred, which increases the available information by using more than one extreme value per year (Coles, 2001; Claps and Laio, 2003).

The fact that observed hydrological extremes tend to cluster in time strengthens the case for the POT sampling method over block maxima approaches, which tend to hide dependence (Iliopoulou and Koutsoyiannis, 2019). Such clustering reflects dependence (at least) in the neighboring excesses of a threshold, invalidating the basic assumption of independence made in classical POT analyses. Therefore, the standard approach in case studies is to fix a (somewhat subjective) high threshold, and then filter the clusters of exceedances so as to obtain a set of observations that can be considered mutually independent. Such a declustering procedure involves using empirical rules to define clusters (e.g. setting a run length that represents a minimum timespan between consecutive clusters, meaning that a cluster ends when the separation between two consecutive threshold exceedances is greater than the fixed run length) and then selecting only the maximum excess within each cluster (Coles, 2001; Ferro and Segers, 2003; Bernardara et al., 2014; Bommier, 2014). Declustering results in significant loss of data that can potentially provide additional information about extreme values.
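
The run-length rule described above can be illustrated with a minimal Python sketch. The function name `runs_decluster` and its arguments are hypothetical, and the sketch assumes a regularly sampled series in which the separation between exceedances is measured in time steps; it is not taken from the cited declustering studies.

```python
def runs_decluster(values, threshold, run_length):
    """Return (index, value) of the largest exceedance in each cluster.

    Assumption: a cluster ends when two consecutive threshold exceedances
    are separated by more than `run_length` time steps.
    """
    peaks = []
    best = None       # (index, value) of the current cluster maximum
    last_idx = None   # index of the most recent exceedance
    for i, v in enumerate(values):
        if v <= threshold:
            continue
        if last_idx is not None and i - last_idx > run_length:
            peaks.append(best)   # the previous cluster is closed
            best = None
        if best is None or v > best[1]:
            best = (i, v)
        last_idx = i
    if best is not None:
        peaks.append(best)       # close the last open cluster
    return peaks
```

In this sketch every exceedance that is not a cluster maximum is dropped, which is the loss of information discussed above.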

In this paper, we aim to overcome these problems by investigating the exact distribution of correlated extremes. Hence, we can set considerably lower thresholds with respect to the standard POT analyses and avoid declustering procedures whose effectiveness is called into question if we do not account for the process characteristics. The proposed approach provides new insight into probabilistic methods devised for extreme value analysis taking into account the clustering dynamics of extremes, and it is consistent with the general principle of allowing maximal use of information (Volpi et al., 2019).

In summary, hydrological applications have made wide recourse to asymptotes or limiting extreme value distributions, while exact distributions for real-world finite-size samples are barely used in stochastic hydrology because their evaluation requires the parent distribution to be known. However, the small size of common hydrological records (e.g. a few tens of years) and the impact of correlations on the information content of observed extremes cannot provide sufficient empirical evidence to estimate limiting extreme value distributions with precision. Therefore, we believe that non-asymptotic analytical models for extremes arising from correlated processes should receive renewed research interest (Iliopoulou and Koutsoyiannis, 2019).

This paper is concerned with a theoretical approach to the exact distribution of high extremes based on the pioneering work by Todorovic and Zelenhasic (1970), who proposed a general stationary stochastic model to describe and predict the behavior of the maximum term among a random number of random variables in an interval of time assuming independence. As verified in several studies mentioned above, to make a realistic stochastic model of hydrological processes, we are forced to confront the fact that dependence should necessarily be taken into consideration. The dilemma is that dependence structures make for realistic models, but also reduce the possibility for explicit probability calculations (i.e., analytical derivations of joint probability distributions are more complicated than under independence). The challenge of this paper is to propose a stochastic model of extremes with dependencies allowing for acceptable realism, but also permitting sufficient mathematical tractability. In this context, short-range dependence structures, such as Pólya’s and Markov’s schemes, offer a good trade-off between these two demands, when hydrological maxima satisfy Leadbetter’s condition of the absence of long-range dependence (Koutsoyiannis, 2004a).

In the remainder of this paper, we first introduce a novel theoretical framework to model the exact distribution of correlated extremes in Section 2. In Section 3, we present a new generator, called Gen2Mp, of correlated processes with arbitrary marginal distributions and Markovian dependence, and use it to validate the theoretical reasoning described in Section 2. Then, Section 4 deals with case studies in order to test the capability of our model to reproduce the statistical behavior of extremes of long-term rainfall and streamflow time series from the real world. Concluding remarks are reported in Section 5.

2 Theoretical framework

We use herein the POT approach to analyze the extreme maxima, and assume the number of peaks (e.g., flood peak discharges or maximum rainfall depths) exceeding a certain threshold and their magnitudes to be random variables. The threshold simplifies the study and helps focus the attention on the distribution tails, as they are important to know in engineering design (Papalexiou et al., 2013). In the following, we use upper case letters for random variables or distribution functions, and lower case letters for values, parameters or constants.

If we consider only those peaks in exceeding , then we can define the strictly positive random variable

(1)

for all , where is the number of exceedances in . Clearly, is a non-increasing function of for a given , but we assume herein that is a fixed constant.

It is recalled from probability theory that, given a fixed number of i.i.d. random variables , the largest order statistic has a probability distribution fully dependent on the joint distribution function of that is

(2)

In hydrological applications, it may be assumed that the number of values of in (e.g. the number of storms or floods per year), whose maximum is the variable of interest (e.g. the maximum rainfall depth or flood discharge), is not constant but it is a realization of a random variable . Therefore, we are interested in the maximum term among a random number of a sequence of random variables in an interval of time .

In the following, we attempt to determine the one-dimensional distribution function of that is defined as . Since the magnitude of exceedances and their number are supposed to be random variables, Todorovic (1970) derived the distribution of the extreme maximum of such a particular class of stochastic processes as

(3)

which represents the probability that all exceedances in are less than or equal to . If , then is the probability that there are no exceedances in .
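
The relations in eqs. (1) to (3) can be sketched as follows under assumed notation, none of which is taken from the text: $Y_i$ for the peaks, $u$ for the threshold, $(0, t]$ for the time interval, $Z_i$ for the exceedances, $N$ for their random number, $F$ for the parent distribution and $H$ for the distribution of the maximum. Eq. (2) is written for the i.i.d. case stated in the text, and eq. (3) is the total-probability form that matches the verbal description; both are a plausible reading rather than a quotation.

```latex
% Sketch under assumed notation; every symbol name here is an assumption.
\begin{align}
Z_i &= Y_i - u, \qquad i = 1, 2, \dots, n \tag{1}\\
H_n(x) &= \Pr\{Z_1 \le x, \dots, Z_n \le x\} = [F(x)]^{n}
  \qquad \text{(i.i.d. case)} \tag{2}\\
H(x) &= \Pr\{N = 0\}
  + \sum_{k=1}^{\infty}
    \Pr\Big\{ \bigcap_{i=1}^{k} \{Z_i \le x\} \cap \{N = k\} \Big\} \tag{3}
\end{align}
```

With this reading, $H(0) = \Pr\{N = 0\}$ because the exceedances are strictly positive.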

Todorovic and Zelenhasic (1970) proposed the simplest form of the general model in eq. (3) for use in hydrological statistics, which is now the benchmark against which we measure frequency analysis of extreme events (e.g. Koutsoyiannis and Papalexiou, 2017). Its basic assumptions are that is a sequence of independent random variables with common parent distribution , and is a Poisson-distributed random variable independent of with mean , i.e. . Then, recalling that , eq. (3) becomes

(4)

It can be shown that with satisfactory approximation (Koutsoyiannis, 2004a).
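
As a numerical illustration of the independent-Poisson case, the following Python sketch compares a Monte Carlo estimate with the closed form exp(-λ(1 − F(x))), which is the standard result for the maximum of a Poisson number of i.i.d. variables. The symbol λ for the Poisson mean, the unit-rate exponential parent distribution, and all function names are assumptions made for the example.

```python
import math
import random


def poisson_draw(lam, rng):
    """One Poisson(lam) variate by Knuth's product method (adequate for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def exact_cdf_of_max(F_x, lam):
    """Independent-Poisson closed form: exp(-lam * (1 - F(x)))."""
    return math.exp(-lam * (1.0 - F_x))


def simulated_cdf_of_max(x, lam, n_intervals, seed=0):
    """Share of simulated intervals in which no exceedance is larger than x.

    Exceedances are unit-rate exponential (an illustrative choice);
    an interval with zero exceedances counts as 'maximum <= x'.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_intervals):
        n = poisson_draw(lam, rng)
        if all(rng.expovariate(1.0) <= x for _ in range(n)):
            hits += 1
    return hits / n_intervals
```

For example, `simulated_cdf_of_max(2.0, 3.0, 20000)` can be compared with `exact_cdf_of_max(1 - math.exp(-2.0), 3.0)`; the two should agree up to sampling error.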

As stated above, the derivation of eq. (4) includes strong assumptions, such as independence, and the purpose of this paper is to modify and test this equation under suitable dependence conditions.

Firstly, we suppose that is a sequence of random variables with common parent distribution and a particular Markovian dependence that gives rise to the two-state Markov-dependent process (2Mp, see next Section for further details). Specifically, we let the occurrences of the event evolve according to a Markov chain with two states, whose probabilities are:

(5)

and the transition probabilities (see also Lombardo et al., 2017, appendix C) are:

(6)

where is the lag-one autocorrelation coefficient of the Markov chain.
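
A two-state chain of this kind can be sketched in Python using the standard parametrization of a stationary binary Markov chain by its state probability and lag-one autocorrelation. The symbols `p` (probability of the state "value does not exceed x") and `rho`, as well as the function names, are assumptions for illustration and are not claimed to reproduce eqs. (5) and (6) term by term.

```python
def two_state_transition_matrix(p, rho):
    """Transition matrix of a stationary two-state Markov chain.

    State 0 has stationary probability p, state 1 has probability 1 - p,
    and rho is the lag-one autocorrelation of the state indicator.
    Row i holds the probabilities of moving from state i to states 0 and 1.
    """
    p00 = p + rho * (1.0 - p)   # stay in state 0
    p10 = p * (1.0 - rho)       # move from state 1 to state 0
    matrix = [[p00, 1.0 - p00], [p10, 1.0 - p10]]
    if any(q < 0.0 or q > 1.0 for row in matrix for q in row):
        raise ValueError("rho is not admissible for this value of p")
    return matrix


def lag_one_autocorrelation(matrix):
    """Lag-one autocorrelation of a two-state chain: Pr(0 | 0) - Pr(0 | 1)."""
    return matrix[0][0] - matrix[1][0]
```

With `rho = 0` both rows equal `[p, 1 - p]`, which is the independent case.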

It follows that, for the process , the probability of the state at a given time depends solely on the state at the previous time step . Then, for a fixed number of exceedances , the Markov property yields:

(7)

Applying the chain rule of probability theory to the distribution function of the maximum term , , we obtain

(8)

From the above it follows that can be determined in terms of the conditional probabilities and the parent univariate distribution function . As the random variables are identically distributed, they correspond to a stationary stochastic process, and then the function is invariant to a shift of the origin. In this case, is determined in terms of the second-order (bivariate) distribution and the first-order (univariate) parent distribution . Indeed, from eq. (8) we obtain

(9)

It can be easily shown that eq. (9) reduces to eq. (2) in case of independence, i.e. .
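
The chain of reasoning behind eqs. (7) to (9) can be written out as a sketch under assumed notation: $Z_i$ for the exceedances, $F(x) = \Pr\{Z_i \le x\}$ for the parent distribution, $F_2(x, x) = \Pr\{Z_{i-1} \le x, Z_i \le x\}$ for the bivariate distribution, and $H_n$ for the distribution of the maximum of $n$ terms. These symbol names are assumptions; the steps follow the Markov property, the chain rule and stationarity as described in the text.

```latex
% Sketch under assumed notation; every symbol name here is an assumption.
\begin{align}
\Pr\{Z_n \le x \mid Z_{n-1} \le x, \dots, Z_1 \le x\}
  &= \Pr\{Z_n \le x \mid Z_{n-1} \le x\} \tag{7}\\
H_n(x) &= \Pr\{Z_1 \le x\}
  \prod_{i=2}^{n} \Pr\{Z_i \le x \mid Z_{i-1} \le x\} \tag{8}\\
H_n(x) &= F(x) \left[ \frac{F_2(x, x)}{F(x)} \right]^{n-1}
  = \frac{[F_2(x, x)]^{\,n-1}}{[F(x)]^{\,n-2}} \tag{9}
\end{align}
% Independence: F_2(x, x) = [F(x)]^2 gives H_n(x) = [F(x)]^n.
```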

Secondly, we assume that exceedances have positively correlated occurrences causing a larger variance than if they were independent, i.e. the occurrences are overdispersed with respect to a Poisson distribution, for which the mean is equal to the variance. Therefore, we assume that the random number of occurrences in a specific interval of time follows the negative binomial distribution (e.g. Calenda et al., 1977; Eastoe and Tawn, 2010), which allows adjusting the variance independently of the mean. The negative binomial distribution (known as the limiting form of the Pólya distribution, cf. Feller, 1968, p. 143) is a compound probability distribution that results from assuming that the random variable is distributed according to a Poisson distribution whose mean varies randomly following a gamma distribution with shape parameter and scale parameter , so that its density is

(10)

Then, the probability distribution function of conditional on is

(11)

We can derive the unconditional distribution of by marginalizing over the distribution of , i.e., by integrating out the unknown parameter as

(12)

Substituting eqs. (10) and (11) into eq. (12), we have

(13)

Recalling that the gamma function is defined as , then multiplying and dividing eq. (13) by and integrating by substitution, we obtain after algebraic manipulations

(14)
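
The gamma-Poisson mixture can be checked numerically with a short Python sketch. It assumes the symbols `r` for the gamma shape and `beta` for the gamma scale, for which the mixture has the closed form Γ(k + r) / (k! Γ(r)) · (β / (1 + β))^k · (1 + β)^(−r), with mean rβ and variance rβ(1 + β). The symbol names and the function names are illustrative assumptions.

```python
import math


def neg_binomial_pmf(k, r, beta):
    """Pr{N = k} for a Poisson count whose mean is gamma(shape=r, scale=beta)."""
    log_p = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
             + k * math.log(beta / (1.0 + beta)) - r * math.log(1.0 + beta))
    return math.exp(log_p)


def mixture_pmf_numeric(k, r, beta, steps=20000, upper=200.0):
    """Same probability by midpoint integration of Poisson pmf times gamma density."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        log_poisson = k * math.log(lam) - lam - math.lgamma(k + 1)
        log_gamma = ((r - 1.0) * math.log(lam) - lam / beta
                     - math.lgamma(r) - r * math.log(beta))
        total += math.exp(log_poisson + log_gamma) * h
    return total
```

The two functions should agree to within the quadrature error, and the closed-form probabilities should sum to one.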

To summarize, we specialize the general model in eq. (3) for the following conditions:

1. is a sequence of correlated random variables with 2Mp dependence and common parent distribution .

2. is a negative binomial random variable independent of with mean and variance

Under the above assumptions, from eq. (3) we can derive the conditional distribution function of the maximum as

(15)

where for of 2Mp

(16)

Substituting eqs. (11) and (16) in eq. (15), we obtain

(17)

Then, adding and subtracting the term yields

(18)

and thus

(19)

which is the conditional distribution function of the maximum term among a Poisson-distributed random number with gamma-distributed mean of 2Mp random variables in an interval of time . It can be shown that eq. (4) is easily recovered assuming independence, i.e. and is a fixed constant.
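
The steps from eq. (15) to eq. (19) can be retraced as a sketch under assumed notation: $\lambda$ for the Poisson mean, $F = F(x)$ for the parent distribution and $F_2 = F_2(x, x)$ for the bivariate distribution of two consecutive exceedances. The symbols are assumptions; the algebra follows the verbal steps (substitution of the Poisson probabilities, adding and subtracting one term, and summing the exponential series).

```latex
% Sketch under assumed notation; every symbol name here is an assumption.
\begin{align}
H(x \mid \lambda) &= \Pr\{N = 0 \mid \lambda\}
  + \sum_{k=1}^{\infty}
    \Pr\Big\{ \bigcap_{i=1}^{k} \{Z_i \le x\} \Big\}
    \Pr\{N = k \mid \lambda\} \tag{15}\\
\Pr\Big\{ \bigcap_{i=1}^{k} \{Z_i \le x\} \Big\}
  &= \frac{F_2^{\,k-1}}{F^{\,k-2}} \tag{16}\\
H(x \mid \lambda) &= e^{-\lambda}
  + \frac{F^2}{F_2}\, e^{-\lambda}
    \sum_{k=1}^{\infty} \frac{1}{k!} \Big( \frac{\lambda F_2}{F} \Big)^{k} \tag{17}\\
H(x \mid \lambda) &= \Big( 1 - \frac{F^2}{F_2} \Big) e^{-\lambda}
  + \frac{F^2}{F_2}\, e^{-\lambda}
    \sum_{k=0}^{\infty} \frac{1}{k!} \Big( \frac{\lambda F_2}{F} \Big)^{k} \tag{18}\\
H(x \mid \lambda) &= \Big( 1 - \frac{F^2}{F_2} \Big) e^{-\lambda}
  + \frac{F^2}{F_2}
    \exp\Big\{ -\lambda \Big( 1 - \frac{F_2}{F} \Big) \Big\} \tag{19}
\end{align}
% Independence: F_2 = F^2 removes the first term and leaves exp{-lambda (1 - F)}.
```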

The unconditional distribution of is derived by substituting eqs. (14) and (16) into eq. (3) as follows

(20)

Then, adding and subtracting the term and denoting by Pochhammer’s symbol (Abramowitz and Stegun, 1972, p. 256) yields

(21)

Since and is a real number, then this series is known as a binomial series (Graham et al., 1994, p. 162), and, setting , it converges to , thus

(22)

which is the unconditional distribution of the extreme maximum . The parameters of the model in eq. (22) are and along with those of the models chosen for both the parent distribution, , and the bivariate distribution (see Sect. 4 for further details).
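
The derivation of the unconditional distribution can be sketched in the same spirit, assuming a gamma mixing distribution with shape $r$ and scale $\beta$, so that $\Pr\{N = k\} = \frac{(r)_k}{k!} \big(\frac{\beta}{1+\beta}\big)^{k} (1+\beta)^{-r}$, with $(r)_k = \Gamma(r+k)/\Gamma(r)$ Pochhammer's symbol, $F = F(x)$ and $F_2 = F_2(x, x)$. All symbol names are assumptions; the result follows from the binomial series $\sum_{k \ge 0} \frac{(r)_k}{k!} y^k = (1 - y)^{-r}$ for $|y| < 1$.

```latex
% Sketch under assumed notation; every symbol name here is an assumption.
\begin{align}
H(x) &= (1+\beta)^{-r}
  + \sum_{k=1}^{\infty} \frac{F_2^{\,k-1}}{F^{\,k-2}}\,
    \frac{(r)_k}{k!} \Big( \frac{\beta}{1+\beta} \Big)^{k} (1+\beta)^{-r} \tag{20}\\
H(x) &= \Big( 1 - \frac{F^2}{F_2} \Big) (1+\beta)^{-r}
  + \frac{F^2}{F_2}\, (1+\beta)^{-r}
    \sum_{k=0}^{\infty} \frac{(r)_k}{k!}\, y^{k},
  \qquad y = \frac{\beta F_2}{(1+\beta) F} \tag{21}\\
H(x) &= \Big( 1 - \frac{F^2}{F_2} \Big) (1+\beta)^{-r}
  + \frac{F^2}{F_2}
    \Big[ 1 + \beta \Big( 1 - \frac{F_2}{F} \Big) \Big]^{-r} \tag{22}
\end{align}
% Independence: F_2 = F^2 gives H(x) = [1 + beta (1 - F)]^{-r}.
```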

In the case of independence, where , eq. (22) reduces to

(23)

As shown in later examples and case studies, eq. (22) yields probabilities of non-exceedance that are systematically larger than those under independence, i.e. .
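
The comparison between the dependent and independent cases can be explored numerically with a small Python sketch. It assumes a gamma-mixed Poisson count with shape `r` and scale `beta`, a parent distribution value `F` = F(x) and a bivariate value `F2` = F2(x, x), and it evaluates the closed forms obtained by summing the binomial series under those assumptions. The function and argument names are hypothetical.

```python
def cdf_max_independent(F, r, beta):
    """Independent case under the assumed notation: [1 + beta * (1 - F)] ** (-r)."""
    return (1.0 + beta * (1.0 - F)) ** (-r)


def cdf_max_2mp(F, F2, r, beta):
    """Dependent case under the assumed notation.

    F  : parent distribution at x, Pr{Z <= x}
    F2 : bivariate distribution at (x, x) for two consecutive values
    """
    if not 0.0 < F2 <= F <= 1.0:
        raise ValueError("expected 0 < F2 <= F <= 1")
    c = F * F / F2        # weight of the series term
    g = F2 / F            # Pr{Z_i <= x | Z_{i-1} <= x}
    return ((1.0 - c) * (1.0 + beta) ** (-r)
            + c * (1.0 + beta * (1.0 - g)) ** (-r))
```

Setting `F2 = F * F` recovers the independent value, while any `F2` larger than `F * F` (positive dependence) gives a larger probability of non-exceedance, in line with the statement above.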

3 Gen2Mp: An Algorithm to Simulate the Two-State Markov-Dependent Process (2Mp) with Arbitrary Marginal Distribution

To check the performance of our stochastic model for correlated extremes, we need to simulate a random process with any marginal distribution and Markovian dependence. Nevertheless, we must better clarify what the “Markovian dependence” refers to here. As stated in the previous Section, we assume that a Markov chain with two states (which may represent for example flood or no flood, dry or wet year, etc.) governs the excursions above/below any level (threshold) of the process (see e.g. Fernández and Salas, 1999). We refer to this process as 2Mp (Volpi et al., 2015). For such a process, the Markov property is valid because the probability of the state at a given time depends solely on the state at the previous time step , i.e., .

One can be tempted to use the classical AR(1) (first-order autoregressive) model to simulate the 2Mp. However, this is not appropriate in general, as we show in the following by a numerical experiment that provides insights into an effective simulation strategy. Let us define the random variable in such a way that for , it is

(24)

Then, by definition of conditional probability, we may write e.g. for

(25)

In our case the Markov property yields

(26)

where because is stationary. From eqs. (25) and (26), it is easily understood that we seek a modelling framework for which the ratio should be constant for every , depending solely on the value of the threshold . In order to show that this is generally not valid for AR(1) processes, we compute such a ratio from a sequence of 100000 random numbers generated by a standard Gaussian AR(1) model with lag-one correlation equal to 0.85. In particular, we calculate four ratios () for various threshold values () selected randomly over the entire range of the standard Gaussian distribution. Then, as the ratio values depend on the threshold, for each we “standardize” the results by taking the absolute difference between each ratio and its mean computed over , i.e. , then dividing all by ; hence, we obtain the relative difference .
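
The experiment can be sketched in Python under one plausible reading of the ratio: the probability that j + 1 consecutive values stay at or below the threshold, divided by the probability that j consecutive values do so, for j = 1, ..., 4. This reading, the function names and the seed handling are assumptions; thresholds far in the lower tail are avoided because the empirical probabilities may then be zero.

```python
import math
import random


def gaussian_ar1(n, phi, rng):
    """Standard Gaussian AR(1) series with lag-one correlation phi."""
    z = [rng.gauss(0.0, 1.0)]
    scale = math.sqrt(1.0 - phi * phi)
    for _ in range(n - 1):
        z.append(phi * z[-1] + scale * rng.gauss(0.0, 1.0))
    return z


def run_probability(z, x, j):
    """Empirical probability that j consecutive values are all <= x."""
    count, run = 0, 0
    for v in z:
        run = run + 1 if v <= x else 0
        if run >= j:          # the window of length j ending here is all <= x
            count += 1
    return count / (len(z) - j + 1)


def conditional_ratios(z, x, n_ratios=4):
    """Ratios Pr{j+1 values <= x} / Pr{j values <= x} for j = 1..n_ratios."""
    probs = [run_probability(z, x, j) for j in range(1, n_ratios + 2)]
    return [probs[j] / probs[j - 1] for j in range(1, n_ratios + 1)]


def relative_differences(z, x, n_ratios=4):
    """Absolute deviation of each ratio from the mean ratio, divided by that mean."""
    ratios = conditional_ratios(z, x, n_ratios)
    mean = sum(ratios) / len(ratios)
    return [abs(q - mean) / mean for q in ratios]
```

For a process with the two-state Markov property the ratios would coincide for every j, so the relative differences would vanish up to sampling error; for the Gaussian AR(1) series the ratios drift with j.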

We seek a model with a particular Markovian dependence so that for all and . In Fig. 1, we show the boxplots depicting the variability of (percent) over all threshold values with . In the left panel, we display the results for the AR(1) model described above. In contrast, it can be noted that values are not only significantly different from zero (especially if compared with results shown in the right panel of Fig. 1, based on the simulation algorithm described below), but their variability also changes strongly with the index . Then, we conclude that AR(1) models are not appropriate for our purposes. As shown later, despite sharing similar dependence structures (see Fig. 2), Gen2Mp outperforms AR(1) in terms of .

Figure 1. Box plots of four () relative differences for various threshold values () selected at random from the parent (standard Gaussian) distribution, where and . The red line inside each box is the median and the box edges are the 25th and 75th percentiles of the samples. The left panel depicts results for the AR(1) model, while the right panel shows box plots of synthetic data from the Gen2Mp algorithm.

3.1 Description of the Gen2Mp simulation algorithm

We introduce herein a new generator, which enables the Monte Carlo materialization of a 2Mp with any arbitrary marginal distribution. It is worth stressing that the theoretical considerations discussed above result in a conceptually simple simulation algorithm, whose scheme consists of an iteration procedure with the following steps:

a) We start by generating two sequences and of independent random numbers with the same arbitrary distribution but conditional on being higher ( ) or lower ( ) than the median.

b) Then, we generate the series sampled from i.i.d. Bernoulli random variables taking values 1 and 0 with probability and , respectively.

c) The events in the Bernoulli series determine the alternation between the two states of our target process, i.e. higher (state 1) and lower (state 2) than the median. In other words, the series determines the “holding times” before our process switches (jumps) from one state to the other, because we assume that the state remains the same up to the “time” when there comes a state change . We can now simulate the state-of-generation sequence taking values 1 when the state of our process is higher than the median (i.e., ) and 2 otherwise (i.e., ).

d) Consequently, the sequence is a sample of a Markov chain with state space . Since the holding times of each state are completely random, the state probabilities are . On the other hand, as the jumps arrive randomly according to the Bernoulli process, the transition probabilities are and . Therefore, the dependence structure of is completely specified in terms of the lag-one autocorrelation coefficient (see e.g. Lombardo et al., 2017).

e) We can now obtain the target correlated sequence as follows:

(27)

f) As the resulting sequence generally does not satisfy the properties of the process we are interested in, we must subdivide each of the cases “> median” and “< median” into two subcases. Specifically, we generate the i.i.d. sequences , and , conditional on being, respectively, “> 75th percentile”, “(median, 75th percentile)”, “(25th percentile, median)” and “< 25th percentile”. Then we generate two further Bernoulli series and with the same parameter as above, and consequently derive the corresponding state-of-generation sequences (taking values 1 when the state of our process is higher than the 75th percentile, and 2 if it belongs to the interval (median, 75th percentile)) and (taking values 1 when the state belongs to the interval (25th percentile, median), and 2 if it is lower than the 25th percentile). We can now obtain the target correlated sequence as follows:

(27’)

g) We continue to subdivide until the relative difference converges to zero for any . In any subdivision step, we follow the same procedure as that described above with a fixed parameter , until a convergence threshold is achieved (here, a mean absolute error equal to 0.002 for is used in the numerical examples below, which is obtained after 9 subdivision steps for ).
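
Steps a) to g) can be illustrated by the following simplified Python sketch, which is one possible reading of the procedure and not the authors' implementation. It rests on several assumptions: a Bernoulli event equal to 1 forces a switch of state, with probability `p_switch`; each quantile interval at each subdivision level has its own independent two-state chain, run over the whole series; conditional sampling within the final interval is done by inverse transform; and the number of subdivision levels is fixed in advance instead of being chosen by the convergence criterion of step g). All names are hypothetical.

```python
import random
from statistics import NormalDist


def two_state_chain(n, p_switch, rng):
    """States 0/1; at each step the state switches with probability p_switch."""
    state = rng.randrange(2)
    out = []
    for _ in range(n):
        if rng.random() < p_switch:
            state = 1 - state
        out.append(state)
    return out


def gen2mp_sketch(n, p_switch, levels=6, inv_cdf=NormalDist().inv_cdf, seed=0):
    """Correlated series with marginal given by inv_cdf (standard Gaussian by default)."""
    rng = random.Random(seed)
    # One independent chain per interval of the binary subdivision tree:
    # level 0 splits at the median, level 1 at the quartiles, and so on.
    chains = [[two_state_chain(n, p_switch, rng) for _ in range(2 ** lev)]
              for lev in range(levels)]
    series = []
    for i in range(n):
        node = 0
        for lev in range(levels):
            # State 1 selects the upper half of the current interval, state 0 the lower.
            node = 2 * node + chains[lev][node][i]
        # node now indexes one of 2**levels equal-probability intervals.
        u = (node + rng.random()) / 2 ** levels
        u = min(max(u, 1e-12), 1.0 - 1e-12)   # keep inv_cdf away from 0 and 1
        series.append(inv_cdf(u))
    return series


def lag1_autocorr(x):
    """Sample lag-one autocorrelation coefficient."""
    m = sum(x) / len(x)
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(len(x) - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den
```

In this sketch each chain spends half of the time in each state, so every final interval is equally likely and the marginal distribution is preserved by construction; a value of `p_switch` below 0.5 gives positive correlation, and 0.5 gives an uncorrelated series.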

3.2 Numerical simulations

We show some Monte Carlo experiments assuming the standard Gaussian probability model as parent distribution, but it can be changed to any distribution function. We generate a correlated series of 100000 standard Gaussian random numbers using Gen2Mp with parameter . Such a parameter completely determines the dependence structure of the 2Mp process. For the process is positively correlated, while it reduces to white noise for . For we get an anticorrelated series. The particular value of is chosen in order to have the dependence structure of the generated series similar to that of the AR(1) model with lag-one correlation equal to 0.85 (see Fig. 2). Such a value of has been determined numerically exploiting the fact that the dependence structure of the generated series is closely related (showing slight downward bias) to that of the Markov chain defined above, whose lag-one autocorrelation is (see Fig. 2). Then, to a first approximation, we start assuming , and progressively increase it until the dependence structures of the 2Mp and AR(1) match.

Figure 2. Comparison of the empirical autocorrelation functions (EACFs) resulting from time series generated by Gen2Mp and the Markov chain with parameter , and by the AR(1) model with lag-one correlation equal to 0.85.

Then, even though Gen2Mp and the classical AR(1) algorithms generate time series exhibiting analogous dependence structures, the former significantly outperforms the latter in terms of , as shown in Fig. 1 (right panel). Furthermore, we generate an independent series of 100000 standard Gaussian random numbers as a benchmark using classical generators (e.g. Press et al., 2007). As can be seen from the probability-probability (PP) and quantile-quantile (QQ) plots in Fig. 3, the marginal distribution of the final dependent series (corresponding to a 2Mp) is the same as that of the benchmark series. In summary, the important achievement is that Gen2Mp does not alter the parent distribution, but it only induces time dependence in a Markov chain sense.

[Figure 2 plot: EACF (0 to 1) versus Lag (0 to 30); legend: Gen2Mp, AR(1), Markov chain Di.]

Figure 3. Probability–Probability plot (left) and Quantile–Quantile plot (right) comparing the marginal distribution of a benchmark series (i.i.d. standard Gaussian random numbers) to that of the correlated series generated using Gen2Mp.

Focusing on the frequency analysis of maxima, we investigate the distribution of the maximum term among a random number of a sequence of standard Gaussian random variables. Specifically, we assume that this random number follows the negative binomial distribution in eq. (14), while the variables form a 2Mp stochastic process. Based on these hypotheses, in the previous Section we derived the corresponding theoretical probability distribution function, given by eq. (22). To check this numerically, we generate the random numbers (where ) from the negative binomial distribution with parameters and , then we form the target sample by taking the maximum of non-overlapping sequences of consecutive random numbers . We allow two different dependence structures for the latter. In the first case, we assume that they are sampled from i.i.d. random variables, while in the second case they are sampled from a 2Mp stochastic process with parameter , which is simulated by Gen2Mp.
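The independent case of this numerical check can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that, for i.i.d. variables, the distribution of the maximum is the probability generating function of the count evaluated at the parent distribution function, which is the standard result E[F(x)^N]. The negative binomial parameters `r = 5` and `p = 0.2` are hypothetical (the values used in the paper are not available here), `r` is restricted to integers so that the count can be sampled with the standard library, and all function names are illustrative.

```python
import math
import random
from statistics import NormalDist

def sample_negbin(r, p, rng):
    """Number of failures before the r-th success (integer r), as a sum of geometric draws."""
    log_q = math.log(1.0 - p)
    return sum(int(math.log(1.0 - rng.random()) / log_q) for _ in range(r))

def max_cdf_independent(x, r, p):
    """P(max <= x) = G_N(F(x)), with G_N the negative binomial pgf and F the N(0,1) cdf."""
    F = NormalDist().cdf(x)
    return (p / (1.0 - (1.0 - p) * F)) ** r

def simulate_maxima(m, r, p, rng):
    """m maxima, each taken over a negative binomial number of i.i.d. N(0,1) draws."""
    maxima = []
    for _ in range(m):
        n = sample_negbin(r, p, rng)
        # An empty sequence has no maximum; -inf keeps it below every level x.
        maxima.append(max((rng.gauss(0.0, 1.0) for _ in range(n)), default=-math.inf))
    return maxima

rng = random.Random(7)
r, p = 5, 0.2                       # hypothetical values; mean count r(1-p)/p = 20
maxima = simulate_maxima(5000, r, p, rng)
levels = [0.5, 1.0, 1.5, 2.0, 2.5]
gaps = []
for x in levels:
    empirical = sum(v <= x for v in maxima) / len(maxima)
    theoretical = max_cdf_independent(x, r, p)
    gaps.append(abs(empirical - theoretical))
    print(f"x = {x:.1f}: empirical {empirical:.3f}, theoretical {theoretical:.3f}")
```

Plotting the empirical values against the theoretical ones gives the PP plot of the independent case; the dependent case would additionally require the joint probability of consecutive variables.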

Results in the form of PP plots are depicted in Fig. 4. The left panel shows the independent case: the empirical distribution of the maxima is closely matched by eq. (23), i.e. the PP plot (blue line) follows a straight line from (0, 0) to (1, 1). In other words, when the underlying variables are i.i.d., eq. (23) proves to be a good model for the distribution of the maxima.

In the right panel of Fig. 4, we show the dependent case, where the joint probability in eq. (22) is determined numerically. Clearly, if we apply eq. (23) to the correlated sample, the corresponding plot (blue line) shows a marked departure from the 45° line (i.e., the line of equality). By contrast, the theoretical distribution that we propose in eq. (22) models the empirical distribution of correlated maxima reasonably well in all respects (see black line). Therefore, when the underlying variables belong to a 2Mp, eq. (22) (black line) largely outperforms eq. (23) (blue line) in modelling the extreme maxima.

Figure 4. Probability–Probability plots of the maximum term among a (negative binomial) random number of a sequence of i.i.d. (left panel) and 2Mp (right panel) standard Gaussian random variables.

[Figure 4 panels: axes labelled "Theoretical distribution" and "Empirical distribution", both from 0 to 1; legend entries: independent, dependent, 45° line]


4 Applications to Rainfall and Streamflow Data

To assess the capability of the proposed methodology to reproduce the statistical pattern of observed hydrological extremes, we apply it to long-term daily rainfall and streamflow time series with no missing values, or as few as possible, so as to fulfil the requirements of POT analyses. In more detail, we use three daily precipitation time series recorded by rain gages located at Groningen (north-eastern Netherlands), Middelburg (south-western Netherlands) and Bologna (northern Italy), respectively ranging from 1847 to 2017 (171 years, no missing values), from 1855 to 2017 (163 years, no missing values) and from 1813 to 2018 (206 years, only three missing values). Raw data, retrieved through the Royal Netherlands Meteorological Institute (KNMI) Climate Explorer web site, are available at https://climexp.knmi.nl/data/bpeca147.dat (accessed on 26 October 2019) for Groningen station, at https://climexp.knmi.nl/data/bpeca2474.dat (accessed on 26 October 2019) for Middelburg station and at https://climexp.knmi.nl/data/pgdcnITE00100550.dat (accessed on 26 October 2019) for Bologna station in the period 1813-2007 (see Klein Tank et al., 2002; Menne et al., 2012). For the most recent period, 2008-2018, daily data for Bologna station are provided by the Dext3r public repository (http://www.smr.arpa.emr.it/dext3r/) (accessed on 26 October 2019) of the Regional Agency for Environmental Protection and Energy (Arpae) of Emilia Romagna, Italy (retrieved and processed by Koutsoyiannis for the book: Stochastics of Hydroclimatic Extremes, in preparation for 2020).

Furthermore, we analyze one daily streamflow time series of the Po River recorded at Pontelagoscuro, northern Italy (see Montanari, 2012 for further details). The data series, spanning from 1920 to 2017 (98 years, no missing values), is made publicly available by Prof. Alberto Montanari at https://distart119.ing.unibo.it/albertonew/sites/default/files/uploadedfiles/po-pontelagoscuro.txt (accessed on 26 October 2019) for the period 1920-2009, while the remainder (2010-2017) has been retrieved through the Dext3r repository.

Since it has been shown that seasonality affects the distribution of hydrological extremes (Allamano et al., 2011), our analyses are performed on a seasonal basis; we distinguish four seasons, each consisting of three months, such that autumn comprises September, October, and November. Winter, spring, and summer are defined similarly. We prefer not to use deseasonalization procedures, to avoid possible artifacts that may affect the results. Furthermore, as daily rainfall and streamflow processes exhibit very different marginal distributional properties, all recorded values exceeding a certain threshold are transformed to normality by normal quantile transformation (NQT) for the sake of comparison (Krzysztofowicz, 1997). In practice, each observed exceedance is transformed by applying the quantile function of the standard Gaussian distribution to its Weibull plotting position in the ordered sample. In addition, all datasets used in this study have been preprocessed by removing leap days, because February 29th had already been removed from all leap years of the 1920-2009 Po river discharge dataset.
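The normal quantile transformation described above can be sketched in a few lines. This is a minimal illustration assuming the Weibull plotting position rank/(n + 1); the function name and the sample values are hypothetical, and ties are broken by order of appearance, which is an assumption of this sketch rather than a choice stated in the text.

```python
from statistics import NormalDist

def normal_quantile_transform(values):
    """Map each value to the standard Gaussian quantile of its Weibull plotting position."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])   # indices from smallest to largest
    inverse_cdf = NormalDist().inv_cdf
    z = [0.0] * n
    for rank, i in enumerate(order, start=1):
        z[i] = inverse_cdf(rank / (n + 1))              # Weibull position: rank / (n + 1)
    return z

excesses = [12.4, 3.1, 7.8, 25.0, 5.5]    # hypothetical threshold exceedances
z = normal_quantile_transform(excesses)
for value, score in zip(excesses, z):
    print(f"{value:6.1f} -> {score:+.3f}")
```

The transformation preserves the ranks of the data, so rank-based dependence measures such as Kendall's tau are unaffected by it.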

We now turn to the frequency analysis of observed hydrological maxima. For each season of each dataset, we use, as an example, the threshold corresponding to the 5th percentile (excluding zeros for rainfall datasets for simplicity; we checked that results do not vary considerably if zeros are included), whose exceedances are normalized for each sample. As stated in Sect. 1, we are interested in the statistical behavior of the maximum term among a random number of equally distributed random variables (i.e., belonging to a certain season) in an interval of time (we assume one year). First, we form the POT samples for each year of the record, obtaining as many sequences of threshold excesses as there are years, each with its own size; second, we form the sample of annual extremes by taking the maximum of each POT series. In other words, the sample of annual maxima has size equal to the number of years of the given dataset, and its k-th element is taken from the annual POT series of the k-th year, whose size is the number of exceedances in that year for the considered season. It follows that the sample size used in classical BM analysis is the number of years, while that used in our approach is the total number of exceedances over all years. As detailed below, all parameter values (see, e.g., Tables 1 and 2) are estimated from the POT series by the maximum likelihood method.
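The construction of annual POT samples, annual counts and annual maxima can be sketched as follows. This is an illustrative sketch, not the authors' procedure: the record layout `(year, value)`, the nearest-rank percentile rule and the toy data are assumptions made here. The last line uses the fact that the maximum likelihood estimate of a Poisson mean is the sample mean of the counts.

```python
import math
from collections import defaultdict

def annual_pot(records, q=0.05):
    """Split (year, value) records of one season into annual POT samples."""
    nonzero = sorted(v for _, v in records if v > 0)
    threshold = nonzero[max(0, math.ceil(q * len(nonzero)) - 1)]   # nearest-rank percentile
    by_year = defaultdict(list)
    for year, value in records:
        if value > threshold:
            by_year[year].append(value)
    years = sorted({year for year, _ in records})
    counts = [len(by_year[y]) for y in years]                 # exceedances per year
    maxima = [max(by_year[y]) for y in years if by_year[y]]   # annual maxima
    return threshold, counts, maxima

# Toy seasonal record covering two years (hypothetical values).
records = [(2000, 0.0), (2000, 1.0), (2000, 5.0),
           (2001, 2.0), (2001, 0.0), (2001, 3.0), (2001, 4.0)]
threshold, counts, maxima = annual_pot(records)
poisson_rate = sum(counts) / len(counts)   # ML estimate of the Poisson mean
print(threshold, counts, maxima, poisson_rate)
```

In this toy case the BM sample has two values (one per year), whereas the POT-based approach uses all four exceedances.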

We compare the empirical distribution of the annual maxima to the theoretical probability distribution function given by eq. (4) (i.e., the classical method), which assumes Poisson occurrences of independent exceedances, and by eq. (22) (i.e., the proposed method), which assumes negative binomial occurrences of 2Mp exceedances. The parameters of the Poisson and negative binomial distributions are estimated by maximum likelihood from the annual counts of exceedances for each season of each dataset. To a first approximation, we assume that these annual counts are statistically independent, having checked that, for each dataset, the empirical autocorrelations between the numbers of exceedances of subsequent years are negligible (not shown). Furthermore, we assume that the joint probability of exceedances in eq. (22) can be written in terms of the univariate marginal distribution (which is the standard normal in the case of normal quantile transformation) and a bivariate copula that describes the dependence structure between the variables (Salvadori et al., 2007). Several bivariate families of copulas have been presented in the literature, allowing the selection of different dependence frameworks (Favre et al., 2004). For the sake of simplicity, we choose the following three types of copulas, which are in common use:

1. The Gaussian copula (Salvadori et al., 2007, pp. 254-256), which implies an elliptical shape of the isolines of the pairwise joint distribution. In our case, the latter is a bivariate normal distribution with zero mean and a covariance matrix with unit variances and off-diagonal elements equal to the copula parameter, which is the average (over years) lag-one autocorrelation coefficient of the annual POT series.

2. The Clayton copula (Salvadori et al., 2007, pp. 237-240), which exhibits upper tail independence and lower tail dependence (Salvadori et al., 2007, pp. 170-175), and in our case yields

(28)

where the parameter θ can be written in terms of Kendall's tau correlation coefficient τ as θ = 2τ/(1 − τ), with τ taken as the average (over years) of the lag-one Kendall's tau autocorrelation coefficient of the annual POT series.

3. The Gumbel-Hougaard copula (Salvadori et al., 2007, pp. 236-237), which exhibits upper tail dependence and lower tail independence, and in our case yields

(29)

where the parameter θ is again written in terms of Kendall's tau correlation coefficient as θ = 1/(1 − τ).

All parameter values for all seasons and datasets are reported in Table 1.
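The copula ingredients listed above can be illustrated with a short sketch. It is written under explicit assumptions: the standard one-parameter forms of the Clayton and Gumbel-Hougaard copulas are used, they are evaluated on the diagonal C(u, u) on the assumption that the pairwise joint probability needed is that of two consecutive variables not exceeding the same level, and the function names are hypothetical. The value 0.82 is the copula parameter reported in Table 1 for the Pontelagoscuro winter season, read here as Kendall's tau.

```python
def clayton_theta(tau):
    """Clayton parameter from Kendall's tau (valid here for 0 < tau < 1)."""
    return 2.0 * tau / (1.0 - tau)

def gumbel_theta(tau):
    """Gumbel-Hougaard parameter from Kendall's tau (valid for 0 <= tau < 1)."""
    return 1.0 / (1.0 - tau)

def clayton_diagonal(u, theta):
    """C(u, u) for the Clayton copula, theta > 0."""
    return (2.0 * u ** (-theta) - 1.0) ** (-1.0 / theta)

def gumbel_diagonal(u, theta):
    """C(u, u) for the Gumbel-Hougaard copula, theta >= 1."""
    return u ** (2.0 ** (1.0 / theta))

def lag_one_kendall_tau(series):
    """Kendall's tau (tau-a) between consecutive values of a series."""
    pairs = list(zip(series[:-1], series[1:]))
    score = 0
    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            dx = pairs[i][0] - pairs[j][0]
            dy = pairs[i][1] - pairs[j][1]
            score += (dx * dy > 0) - (dx * dy < 0)   # +1 concordant, -1 discordant
    n = len(pairs)
    return score / (n * (n - 1) / 2)

tau = 0.82   # Table 1, Pontelagoscuro, winter (interpreted as Kendall's tau)
u = 0.9      # marginal non-exceedance probability of a given level
print("Clayton C(u,u):", clayton_diagonal(u, clayton_theta(tau)))
print("Gumbel  C(u,u):", gumbel_diagonal(u, gumbel_theta(tau)))
```

Both values lie between u squared (independence) and u (perfect dependence). Negative estimates of tau, which appear for some rainfall series in Table 1, fall outside the parameter ranges assumed in this sketch.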

Figure 5. Probability–Probability plots of Groningen dataset of daily rainfall. The empirical distributions of maximum terms among annual exceedances of the 5th percentile threshold for winter (top left), spring (top right), summer (bottom left) and autumn (bottom right) seasons are compared to the corresponding theoretical distributions assuming both Poisson (P) occurrences of independent exceedances (eq. 4), and negative binomial occurrences of correlated exceedances (eq. 22) with pairwise joint distribution described by the Gaussian (N), Clayton (C, eq. 28) and Gumbel (G, eq. 29) copulas, with copula parameters as detailed in the text. All parameter values are reported in Table 1.

Figure 6. Same as Fig. 5 for Middelburg dataset of daily rainfall.

Figure 7. Same as Fig. 5 for Bologna dataset of daily rainfall.

In Figs. 5-7 we may observe that, for all daily rainfall datasets, the magnitudes of extreme events taken from excesses of a low threshold (the 5th percentile of the nonzero sample) can be considered independent and identically distributed. This is consistent with results reported in the literature using different approaches (see e.g. Marani and Ignaccolo, 2015; Zorzetto et al., 2016; De Michele and Avanzi, 2018). In addition, we may notice that the classical model of POT analyses assuming Poisson occurrences (see eq. (4)) seems to be appropriate to study rainfall extremes. Analogous considerations apply to higher thresholds (not shown). Our model of correlated extremes in eq. (22) is capable of capturing such behavior with precision.

After showing the results for daily rainfall, we also analyze rainfall records at a finer time resolution (hourly scale), whose correlation can be stronger than that of daily data. To this end, we use hourly rainfall data of the “Bologna idrografico” station for the period 1990-2013 provided by the Dext3r repository (23 years of full coverage, while the entire year 2008 is missing). We checked that these hourly rainfall data, aggregated at the daily scale, are consistent with the daily data recorded in the same period by the Bologna station above (not shown).

Figure 8. Same as Fig. 5 for Bologna dataset of hourly rainfall.

Comparing Figs. 7 and 8, we note that extremes of hourly rainfall data are more affected by correlation than those of daily data (see e.g. winter and autumn seasons, respectively top left and bottom right panels). This is also the case if we consider the same period of record (1990-2013) for both datasets (not shown). We may then conclude that low thresholds can be used for classical POT analyses (assuming independence) of rainfall time series at the daily scale (or above), while further investigations of different datasets are required to describe the impact of dependence on the extremal behavior of the rainfall process at finer time scales. Besides, other interesting future analyses could investigate the extremes of areal rainfall, as, for example, weather radar data will become more reliable and will accumulate in time, providing samples long enough to enable reliable investigation of the probability distribution of areal rainfall (Lombardo et al., 2006a,b; Lombardo et al., 2009).

By contrast, results change significantly when analyzing extremes of streamflow time series. In fact, we present a case study that shows how models assuming independence among magnitudes of extreme events prove to be inadequate to study the probability distribution of discharge maxima.

Figure 9. Same as Fig. 5 for the Po River dataset of daily discharge.

In Fig. 9, we show the PP plots of the distribution of extreme maxima taken from annual exceedances of the 5th percentile thresholds for the four seasons of the Po River discharge dataset, recorded at Pontelagoscuro station. Contrary to the rainfall case studies, the classical model assuming independent magnitudes with Poisson (P) occurrences shows marked departures from the 45° line. The theoretical distribution is usually much lower than its empirical counterpart, meaning that, under the popular assumption of independent extremes, the theoretical probability of an extreme event of given magnitude being exceeded is significantly higher than the corresponding observed frequency of exceedance. Fig. 9 shows that our 2Mp model of correlated extremes outperforms the widely used independent model. In particular, the distribution of maxima that has a Gumbel copula seems to be more consistent with observed extreme values, denoting dependence in the upper tail of the bivariate distribution (Schmidt, 2005). In summary, daily streamflow extremes may exhibit noteworthy departures from independence, which are consistent with a stochastic process characterized by 2Mp behavior and upper tail dependence.
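The effect described above, namely that positive dependence lowers the probability that the maximum exceeds a given level, can be reproduced with a small simulation. The sketch below is an illustration under stated assumptions: a Gaussian AR(1) process stands in for the dependent exceedances (it is not the 2Mp model with copulas used in the paper), the block size of 50 and the level of 2.0 are arbitrary, and the function names are hypothetical.

```python
import random

def ar1_block_maximum(n, rho, rng):
    """Maximum of n consecutive values of a stationary Gaussian AR(1) process."""
    x = rng.gauss(0.0, 1.0)
    best = x
    innovation_sd = (1.0 - rho * rho) ** 0.5
    for _ in range(n - 1):
        x = rho * x + innovation_sd * rng.gauss(0.0, 1.0)
        best = max(best, x)
    return best

def exceedance_frequency(level, n, rho, reps, rng):
    """Fraction of simulated blocks whose maximum exceeds the given level."""
    return sum(ar1_block_maximum(n, rho, rng) > level for _ in range(reps)) / reps

rng = random.Random(3)
independent = exceedance_frequency(2.0, 50, 0.0, 2000, rng)   # rho = 0: i.i.d. values
dependent = exceedance_frequency(2.0, 50, 0.9, 2000, rng)     # strong lag-one correlation
print(f"P(max > 2): independent {independent:.3f}, dependent {dependent:.3f}")
```

With the same marginal distribution and the same number of values, the dependent blocks exceed the level less often, so an independent model applied to such data overestimates exceedance probabilities.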

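Table 1. Parameter values for all normalized case studies detailed in the text. For each station, the rows list in order: the parameter for Poisson (P) occurrences (eq. 4); the two parameters for negative binomial occurrences (eq. 22); the parameter for Clayton (C) and Gumbel (G) copulas (eqs. 28-29); the parameter for the Gaussian copula.

| Station | Parameter | Winter | Spring | Summer | Autumn |
|---|---|---|---|---|---|
| Groningen | Poisson | 50.04 | 41.56 | 45.04 | 50.05 |
| | Negative binomial (1st) | 76.24 | 73.15 | 150.54 | 164.94 |
| | Negative binomial (2nd) | 0.66 | 0.57 | 0.30 | 0.30 |
| | Clayton and Gumbel copulas | 0.08 | 0.04 | 0.02 | 0.1 |
| | Gaussian copula | 0.10 | 0.05 | 0.04 | 0.13 |
| Middelburg | Poisson | 48.41 | 40.00 | 38.42 | 47.16 |
| | Negative binomial (1st) | 35.71 | 40.22 | 35.47 | 61.68 |
| | Negative binomial (2nd) | 1.36 | 0.99 | 1.08 | 0.76 |
| | Clayton and Gumbel copulas | 0.09 | 0.04 | 0.02 | 0.09 |
| | Gaussian copula | 0.12 | 0.06 | 0.02 | 0.14 |
| Bologna daily | Poisson | 20.92 | 25.39 | 16.59 | 24.67 |
| | Negative binomial (1st) | 7.20 | 22.57 | 20.98 | 21.14 |
| | Negative binomial (2nd) | 2.91 | 1.13 | 0.79 | 1.17 |
| | Clayton and Gumbel copulas | 0.03 | 0.02 | -0.05 | -0.01 |
| | Gaussian copula | 0.05 | 0.02 | -0.06 | 0.01 |
| Bologna hourly | Poisson | 127.59 | 128.87 | 54.09 | 129.74 |
| | Negative binomial (1st) | 5.27 | 14.14 | 4.55 | 12.32 |
| | Negative binomial (2nd) | 24.22 | 9.12 | 11.90 | 10.53 |
| | Clayton and Gumbel copulas | 0.43 | 0.30 | 0.17 | 0.33 |
| | Gaussian copula | 0.54 | 0.38 | 0.20 | 0.41 |
| Pontelagoscuro | Poisson | 85.48 | 87.40 | 87.39 | 86.41 |
| | Negative binomial (1st) | 67.02 | 136.59 | 81.95 | 245.15 |
| | Negative binomial (2nd) | 1.28 | 0.64 | 1.07 | 0.35 |
| | Clayton and Gumbel copulas | 0.82 | 0.81 | 0.84 | 0.84 |
| | Gaussian copula | 0.92 | 0.92 | 0.94 | 0.93 |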

The above results are also evident if we compare theoretical and empirical distributions of streamflow maxima by plotting their quantiles against each other. We use real values for this example (i.e., we do not apply the normal quantile transformation to the data series); therefore, empirical quantiles equal the observed annual maxima. Theoretical quantiles referring to eqs. (4) and (22) (the latter specialized for Gaussian, Clayton and Gumbel copulas) are computed by numerically finding the root of the equation that sets the theoretical distribution function equal to a given probability value (i.e., the Weibull plotting position of observed annual maxima), assuming the classical generalized Pareto distribution (GPD) with zero lower bound as parent distribution of threshold excesses:

(30)

where the shape and scale parameters are estimated by the maximum likelihood method applied to the entire POT series of each season.
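The root-finding step can be sketched for the classical independent case. The snippet is a hedged illustration, not the authors' implementation: it assumes the usual two-parameter generalized Pareto form F(y) = 1 − (1 + shape·y/scale)^(−1/shape) for the excess y over the threshold, and the standard Poisson-occurrence result H(x) = exp{−rate·[1 − F(x − threshold)]} as the independent model. The numerical values are those listed for the Q5 threshold in Table 2, with the row order (shape, scale, threshold) read from the caption of Fig. 10; function names are hypothetical. For this model a closed-form inverse exists, which is used only to check the bisection.

```python
import math

def gpd_cdf(y, shape, scale):
    """Generalized Pareto cdf of the excess y over the threshold (zero lower bound)."""
    if y <= 0.0:
        return 0.0
    if shape == 0.0:
        return 1.0 - math.exp(-y / scale)
    base = 1.0 + shape * y / scale
    if base <= 0.0:               # beyond the upper bound, which exists when shape < 0
        return 1.0
    return 1.0 - base ** (-1.0 / shape)

def max_cdf_poisson(x, rate, threshold, shape, scale):
    """Distribution of the annual maximum for Poisson occurrences of independent excesses."""
    return math.exp(-rate * (1.0 - gpd_cdf(x - threshold, shape, scale)))

def quantile_by_bisection(prob, rate, threshold, shape, scale, tol=1e-9):
    """Solve max_cdf_poisson(x) = prob for x by bracketing and bisection."""
    lo, hi = threshold, threshold + scale
    while max_cdf_poisson(hi, rate, threshold, shape, scale) < prob:
        hi = threshold + 2.0 * (hi - threshold)      # expand the bracket upwards
    while hi - lo > tol * max(1.0, abs(hi)):
        mid = 0.5 * (lo + hi)
        if max_cdf_poisson(mid, rate, threshold, shape, scale) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def quantile_closed_form(prob, rate, threshold, shape, scale):
    """Analytical inverse, valid for exp(-rate) < prob < 1 and shape != 0."""
    t = -math.log(prob) / rate                        # required value of 1 - F
    return threshold + scale / shape * (t ** (-shape) - 1.0)

rate, threshold, shape, scale = 87.40, 653.00, -0.10, 1220.16   # Table 2, Q5 column
n_years = 98
for k in (1, 49, 98):
    prob = k / (n_years + 1)                          # Weibull plotting position
    x = quantile_by_bisection(prob, rate, threshold, shape, scale)
    print(f"p = {prob:.3f}: quantile {x:.1f} m3/s")
```

For the dependent model no such closed-form inverse is assumed here, which is why a numerical root finder of this kind would be needed.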

In Fig. 10, QQ plots of Po river discharge for the spring season are shown when varying the threshold (from the 5th, Q5, to the 75th, Q75, percentiles) used to form the POT series. It can be noticed that for low thresholds there is a shift in variance between theoretical (i.e., derived from eq. (22) with Gumbel copula) and empirical quantiles, namely the variance of theoretical annual maxima underestimates its empirical counterpart. This can be due to the fitting performance of the marginal generalized Pareto, which does not reproduce well the tail behavior of observed data (not shown). Fig. 10 shows that increasing the threshold value helps focus the attention on the distribution tail to better capture the behavior of maxima. This is also the case if we compare streamflow quantiles resulting from our model with those estimated through the “classical” Generalized Extreme Value (GEV) distribution fitted to the observed annual maxima. All parameter values are reported in Table 2. We note that the three GEV parameters are estimated on data points, while the five parameters of our model in eq. (22) (the two negative binomial parameters, the copula parameter, and the two parameters of the GPD with zero lower bound) are estimated on data, which are , , , and for Q5, Q25, Q50 and Q75, respectively.

As the threshold increases, evidence of persistence is progressively reduced, as expected. However, we also note in Fig. 10 that the theoretical quantiles derived from the classical independent Poisson method always show a shift in mean with respect to observed maxima (i.e., under independence, theoretical streamflow quantiles systematically and significantly overestimate observed streamflow maxima).

Figure 10. Quantile–Quantile plots of Po river discharge (m³/s) for spring season. The observed maximum terms among annual peaks over the 5th percentile (top left), 25th percentile (top right), 50th percentile (bottom left) and 75th percentile (bottom right) thresholds are compared to the corresponding theoretical quantiles. In all cases, we assume the Generalized Pareto as parent distribution of daily streamflow (with shape, scale and threshold parameters), and compute quantiles specializing eq. (22) for Poisson (P) occurrences (eq. 4) of independent exceedances, and for negative binomial occurrences of correlated exceedances with pairwise joint distribution described by the Gaussian (N), Clayton (C) and Gumbel (G) copulas, with copula parameters as detailed in the text. We also plot theoretical quantiles from the GEV distribution (with shape, scale and location parameters) fitted to the observed annual maxima. All parameter values are reported in Table 2.

To summarize, our model provides a closed-form expression of the exact distribution for dependent hydrological maxima, which is capable of capturing the behavior of observed extremes of long-term hydrological records. In particular, rainfall extremes do not seem to be significantly affected by correlation at the daily scale, so that the classical Poisson model can be appropriate for use in POT analyses of daily rainfall time series. By contrast, the influence of correlation is prominent in the streamflow process at the daily scale, and it is important to preserve it in simulation and analysis of extremes.

Table 2. Parameter values for all models used in the QQ plots of Fig. 10.

| Model | Parameter | Q5 | Q25 | Q50 | Q75 |
|---|---|---|---|---|---|
| Generalized Pareto | shape | -0.10 | -0.03 | -0.05 | -0.03 |
| | scale | 1220.16 | 1044.03 | 1065.80 | 998.06 |
| | threshold | 653.00 | 998.00 | 1410.00 | 2133.00 |
| Poisson | | 87.40 | 68.97 | 45.89 | 22.99 |
| Negative Binomial | 1st | 136.59 | 5.89 | 1.74 | 0.71 |
| | 2nd | 0.64 | 11.71 | 26.45 | 32.22 |
| Clayton & Gumbel copulas | | 0.82 | 0.76 | 0.63 | 0.48 |
| Gaussian copula | | 0.91 | 0.86 | 0.75 | 0.61 |
| GEV | shape | -0.11 | -0.11 | -0.08 | -0.07 |
| | scale | 1463.94 | 1463.94 | 1399.01 | 1273.31 |
| | location | 3309.91 | 3309.91 | 3369.76 | 3739.46 |

5 Conclusions

The study of hydrological extremes faces a chronic lack of sufficient data to perform reliable analyses. This is partly related to the inherent nature of extreme values, which are rare by definition, and partly related to the relative shortness of systematic records from hydro-meteorological gauge networks. The limited availability of data poses serious problems for an effective and reliable use of asymptotic results provided by EVT.

Alternative methods focusing on the exact distribution of extreme maxima extracted from POT sequences of random size over fixed time windows have been proposed in the past. However, closed-form analytical results were developed only for independent data with Poisson occurrences. Even though these assumptions may be sufficiently reliable for high-threshold POT values, this type of data still yields relatively small sample sizes. In order to better exploit the available information, it can be convenient to consider lower thresholds. However, the effect of lower thresholds is twofold: on the one hand the sample size increases, but on the other hand the hypotheses of independent magnitudes and Poisson occurrences of POT values are no longer reliable.

In this study, we have introduced closed-form analytical formulae for the exact distribution of maxima from POT sequences that generalize the classical independent model, overcoming its limits and enabling the study of maxima taken from dependent low-threshold POT values with arbitrary marginal distribution, first-order Markov dependence structure, and negative binomial occurrences, and we have tested real data against this hypothesis. Even though the framework can be further generalized by introducing arbitrary dependence structures and models for POT occurrences, first-order Markov chains and negative binomial distributions provide a good compromise between flexibility and the possibility to obtain simple ready-to-use formulae. In this respect, it should be noted that our model of correlated extremes can cover a sufficient range of cases. We have shown that the modulation of the lag-one autocorrelation coefficient of the annual sequences of POT values (i.e. the Markov chain parameter) gives a set of extremal distributions that include the empirical distribution of maxima for rainfall data series, and for highly correlated low-threshold discharge POT series. On the other hand, the negative binomial model is a widely used and theoretically well-established model for occurrences exhibiting clustering and overdispersion, which are common characteristics of POT events resulting from persistent processes, such as river discharge.

The relationship between our model and its classical independent version (i.e. eqs. (22) and (4)), along with the results of the case studies, shows that the distribution of extreme maxima under dependence yields probabilities of exceedance that are systematically lower than those under independence, and are also consistent with traditional approaches (GEV), based on extreme value theory, applied to long annual maxima series.

Finally, we stress that our model of the exact distribution of correlated extremes requires knowledge or fitting of a bivariate distribution (and therefore its univariate marginal distribution). In particular, the extremal behavior of the rainfall process does not seem to be significantly affected by dependence at the daily scale, so that the classical Poisson model can be appropriate for use in POT analyses of daily rainfall time series. By contrast, the influence of correlation is prominent in the streamflow process at the daily scale, and it also appears in the rainfall process at the hourly scale. It is therefore important to account for such dependence in extreme value analyses, which are crucial to hydrological design and risk management, because critical values can be less extreme and more frequent than expected under the classical independent models.

Comparing the Gaussian, Clayton and Gumbel bivariate copulas, describing different dependence structures, and the standard Gaussian and Generalized Pareto marginal distributions, we found that the distribution of maxima that has a Gumbel copula seems to be more consistent with streamflow extreme values, denoting dependence in the upper tail of the bivariate distribution. However, these aspects require further investigation from both theoretical and empirical standpoints, and will be the subject of future research. In the spirit of the recent literature on the topic, we believe that the present study will contribute to developing more reliable data-rich-based analyses of extreme values.

Acknowledgments

All data used in this study are freely available online, as described in Section 4 above. The associate editor, an eponymous reviewer, Geoff Pegram, and two anonymous reviewers are gratefully acknowledged for their constructive comments that helped to substantially improve the paper. We also thank Alessio Domeneghetti for providing the first author with detailed information on the Dext3r public repository.

References

646

Abramowitz, M., and Stegun, I. A. (1972). Handbook of Mathematical Functions with Formulas,

647

Graphs, and Mathematical Tables, 9th printing. New York: Dover.

648

Allamano, P., Laio, F., and Claps, P. (2011). Effects of disregarding seasonality on the

649

distribution of hydrological extremes. Hydrology and Earth System Sciences, 15, 3207-

650

3215.

651

Bernardara, P., Mazas, F., Kergadallan, X., & Hamm, L. (2014). A two-step framework for over-

652

threshold modelling of environmental extremes. Natural Hazards and Earth System

653

Sciences, 14(3), 635-647.

654

Bogachev, M. I., and Bunde, A. (2012). Universality in the precipitation and river runoff. EPL

655

(Europhysics Letters), 97(4), 48011.

656

Bommier, E. (2014). Peaks-Over-Threshold Modelling of Environmental Data. U.U.D.M.

657

Project Report 2014:33, Department of Mathematics, Uppsala University.

658

Calenda, G., Petaccia, A., and Togna, A. (1977). Theoretical probability distribution of critical

659

hydrologic events by the partial-duration series method. Journal of Hydrology, 33(3-4),

660

233-245.

661

Accepted for publication in Water Resources Research

39

Claps, P., and Laio, F. (2003). Can continuous streamflow data support flood frequency

662

analysis? An alternative to the partial duration series approach. Water Resources

663

Research, 39(8), 1216.

664

Coles S. (2001). An Introduction to Statistical Modeling of Extreme Values, Springer Series in

665

Statistics, Springer, London.

666

De Michele, C., and Avanzi, F. (2018). Superstatistical distribution of daily precipitation

667

extremes: A worldwide assessment. Scientific reports, 8, 14204.

668

Eastoe, E. F., and Tawn, J. A. (2010). Statistical models for overdispersion in the frequency of

669

peaks over threshold data for a flow series. Water Resources Research, 46(2).

670

Eichner, J. F., Kantelhardt, J. W., Bunde, A., and Havlin, S. (2011). The statistics of return

671

intervals, maxima, and centennial events under the influence of long-term correlations. In

672

J. Kropp & H.-J. Schellnhuber (Eds.), Extremis (pp. 2–43). Berlin, Heidelberg: Springer.

673

Favre, A. C., El Adlouni, S., Perreault, L., Thiémonge, N., and Bobée, B. (2004). Multivariate

674

hydrological frequency analysis using copulas. Water Resources Research, 40(1).

675

Feller, W. (1968). An Introduction to Probability Theory and Its Applications, vol. I, 3rd edition,

676

London-New York-Sydney-Toronto, John Wiley & Sons.

677

Fernández, B., and Salas, J. D. (1999). Return period and risk of hydrologic events. I:

678

mathematical formulation. Journal of Hydrologic Engineering, 4(4), 297-307.

679

Ferro, C. A., and Segers, J. (2003). Inference for clusters of extreme values. Journal of the Royal

680

Statistical Society: Series B (Statistical Methodology), 65(2), 545-556.

681

Fisher, R., & Tippett, L. (1928). Limiting forms of the frequency distribution of the largest or

682

smallest member of a sample. Mathematical Proceedings of the Cambridge

683

Philosophical Society, 24(2), 180-190.

684

Accepted for publication in Water Resources Research

40

Fuller, W. E. (1914). Flood flows. Transactions of the American Society of Civil Engineers, 77,

685

564-617.

686

Graham, R. L., Knuth, D. E., and Patashnik, O. (1994). Concrete Mathematics: A Foundation for

687

Computer Science, 2nd ed. Reading, MA: Addison-Wesley.

688

Hazen, A. (1914). The storage to be provided in impounding reservoirs for municipal water

689

supply. Transactions of the American Society of Civil Engineers, 77, 1539-1669.

690

Klein Tank, A.M.G. et al. (2002). Daily dataset of 20thcentury surface air temperature and

691

precipitation series for the European Climate Assessment. International Journal of

692

Climatology, 22(12), 1441-1453.

693

Koutsoyiannis, D. (2004a). Statistics of extremes and estimation of extreme rainfall: I.

694

Theoretical investigation. Hydrological Sciences Journal, 49(4), 575–590.

695

Koutsoyiannis, D. (2004b). Statistics of extremes and estimation of extreme rainfall: II.

696

Empirical investigation of long rainfall records. Hydrological Sciences Journal, 49(4),

697

591–610.

698

Koutsoyiannis, D., and Montanari, A. (2015). Negligent killing of scientific concepts: the

699

stationarity case. Hydrological Sciences Journal, 60(7-8), 1174-1183.

700

Koutsoyiannis, D., and Papalexiou, S.M. (2017). Extreme rainfall: Global perspective, Handbook

701

of Applied Hydrology, Second Edition, edited by V.P. Singh, 74.1–74.16, McGraw-Hill,

702

New York.

703

Krzysztofowicz, R. (1997). Transformation and normalization of variates with specified

704

distributions. Journal of Hydrology, 197(1-4), 286-292.

705

Iliopoulou, T., and Koutsoyiannis, D. (2019). Revealing hidden persistence in maximum rainfall

706

records. Hydrological Sciences Journal, doi: 10.1080/02626667.2019.1657578.

707

Accepted for publication in Water Resources Research

41

Leadbetter, M. R. (1974). On extreme values in stationary sequences. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 28, 289–303.

Leadbetter, M. R. (1983). Extremes and local dependence in stationary sequences. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 65, 291–306.

Lombardo, F., Volpi, E., Koutsoyiannis, D., and Serinaldi, F. (2017). A theoretically consistent stochastic cascade for temporal disaggregation of intermittent rainfall. Water Resources Research, 53(6), 4586–4605.

Lombardo, F., Montesarchio, V., Napolitano, F., Russo, F., and Volpi, E. (2009). Operational applications of radar rainfall data in urban hydrology. In Proceedings of a symposium on the role of hydrology in water resources management, Capri, Italy, October 2008 (pp. 258–265). IAHS Press.

Lombardo, F., Napolitano, F., and Russo, F. (2006a). On the use of radar reflectivity for estimation of the areal reduction factor. Natural Hazards and Earth System Sciences, 6(3), 377–386.

Lombardo, F., Napolitano, F., Russo, F., Scialanga, G., Baldini, L., and Gorgucci, E. (2006b). Rainfall estimation and ground clutter rejection with dual polarization weather radar. Advances in Geosciences, 7, 127–130.

Luke, A., Vrugt, J. A., AghaKouchak, A., Matthew, R., and Sanders, B. F. (2017). Predicting nonstationary flood frequencies: Evidence supports an updated stationarity thesis in the United States. Water Resources Research, 53(7), 5469–5494.

Marani, M., and Ignaccolo, M. (2015). A metastatistical approach to rainfall extremes. Advances in Water Resources, 79, 121–126.

Menne, M. J., Durre, I., Vose, R. S., Gleason, B. E., and Houston, T. G. (2012). An overview of the global historical climatology network-daily database. Journal of Atmospheric and Oceanic Technology, 29(7), 897–910.

Montanari, A. (2012). Hydrology of the Po River: looking for changing patterns in river discharge. Hydrology and Earth System Sciences, 16, 3739–3747.

O’Connell, P. E., Koutsoyiannis, D., Lins, H. F., Markonis, Y., Montanari, A., and Cohn, T. (2016). The scientific legacy of Harold Edwin Hurst (1880–1978). Hydrological Sciences Journal, 61(9), 1571–1590.

Papalexiou, S. M., and Koutsoyiannis, D. (2013). Battle of extreme value distributions: A global survey on extreme daily rainfall. Water Resources Research, 49, 187–201.

Papalexiou, S. M., Koutsoyiannis, D., and Makropoulos, C. (2013). How extreme is extreme? An assessment of daily rainfall distribution tails. Hydrology and Earth System Sciences, 17, 851–862.

Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P. (2007). Numerical recipes 3rd edition: The art of scientific computing. Cambridge University Press.

Salas, J. D., Obeysekera, J., and Vogel, R. M. (2018). Techniques for assessing water infrastructure for nonstationary extreme events: a review. Hydrological Sciences Journal, 63(3), 325–352.

Salvadori, G., De Michele, C., Kottegoda, N. T., and Rosso, R. (2007). Extremes in nature: an approach using copulas. Vol. 56. Springer Science & Business Media.

Schmidt, R. (2005). Tail dependence. In Statistical Tools for Finance and Insurance (pp. 65–91). Springer, Berlin, Heidelberg.

Serinaldi, F., and Kilsby, C. G. (2014). Rainfall extremes: Toward reconciliation after the battle of distributions. Water Resources Research, 50(1), 336–352.

Serinaldi, F., and Kilsby, C. G. (2016). Understanding persistence to avoid underestimation of collective flood risk. Water, 8(4), 152.

Serinaldi, F., and Kilsby, C. G. (2018). Unsurprising Surprises: The Frequency of Record-breaking and Overthreshold Hydrological Extremes Under Spatial and Temporal Dependence. Water Resources Research, 54(9), 6460–6487.

Serinaldi, F., Kilsby, C. G., and Lombardo, F. (2018). Untenable nonstationarity: An assessment of the fitness for purpose of trend tests in hydrology. Advances in Water Resources, 111, 132–155.

Todorovic, P. (1970). On some problems involving random number of random variables. The Annals of Mathematical Statistics, 41(3), 1059–1063.

Todorovic, P., and Zelenhasic, E. (1970). A stochastic model for flood analysis. Water Resources Research, 6(6), 1641–1648.

Volpi, E., Fiori, A., Grimaldi, S., Lombardo, F., and Koutsoyiannis, D. (2015). One hundred years of return period: Strengths and limitations. Water Resources Research, 51(10), 8570–8585.

Volpi, E., Fiori, A., Grimaldi, S., Lombardo, F., and Koutsoyiannis, D. (2019). Save hydrological observations! Return period estimation without data decimation. Journal of Hydrology, 571, 782–792.

Zorzetto, E., Botter, G., and Marani, M. (2016). On the emergence of rainfall extremes from ordinary events. Geophysical Research Letters, 43(15), 8076–8082.
