Estimation of the extremal behavior of a process is often based on the fitting of asymptotic extreme value models to relatively short series of data. Maximum likelihood has emerged as a flexible and powerful modeling tool in such applications, but its performance with small samples has been shown to be poor relative to an alternative fitting procedure based on probability weighted moments. We argue here that the small-sample superiority of the probability weighted moments estimator is due to the assumption of a restricted parameter space, corresponding to finite population moments. To incorporate similar information in a likelihood-based analysis, we propose a penalized maximum likelihood estimator that retains the modeling flexibility and large-sample optimality of the maximum likelihood estimator, but improves on its small-sample properties. The properties of the penalized likelihood estimator are verified in a simulation study, and in application to sea-level data, which also enables the procedure to be evaluated in the context of structural models for extremes.

... The poor performance of the MLE with a small sample has also been observed in the GEV distribution when the shape parameter (k) is negative. For the GEV distribution, [5,15] proposed penalty functions (PFs) on the shape parameter. Their approaches can be applied to the two shape parameters in K4D. ...

... This study proposes an MPLE for K4D with PFs on two shape parameters, which are modified from those of [5,15], and compares it with existing estimation methods. Section 2 describes the K4D. ...

... Consequently, it causes a large bias and variance in extreme upper quantiles. To overcome this problem and for a likelihood-based inference, [5,15] proposed the MPLE, which assigns a penalty for a large negative value of k. To obtain the MPLE in GEVD, the penalized negative log-likelihood to be minimized is ...
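The snippet above breaks off before the penalized objective. As a rough illustration only, the following sketch adds the penalty form usually attributed to Coles and Dixon (1999) to a GEV negative log-likelihood. It assumes the parameterization in which the shape `xi` is positive for heavy upper tails, so the snippet's k would correspond to `-xi`; the function names and the defaults `lam = alpha = 1` are illustrative choices, not taken from the cited paper.

```python
import math


def gev_negloglik(mu, sigma, xi, data):
    """Negative log-likelihood of the GEV (xi > 0 means a heavy upper tail)."""
    if sigma <= 0.0:
        return math.inf
    total = 0.0
    for x in data:
        z = (x - mu) / sigma
        if abs(xi) < 1e-8:
            # Gumbel limit as xi -> 0
            total += math.log(sigma) + z + math.exp(-z)
        else:
            t = 1.0 + xi * z
            if t <= 0.0:
                return math.inf  # observation lies outside the support
            total += math.log(sigma) + (1.0 + 1.0 / xi) * math.log(t) + t ** (-1.0 / xi)
    return total


def penalized_negloglik(mu, sigma, xi, data, lam=1.0, alpha=1.0):
    """GEV negative log-likelihood minus log P(xi), with an assumed penalty P.

    P(xi) = 1 for xi <= 0, exp(-lam * (1/(1 - xi) - 1)**alpha) for 0 < xi < 1,
    and 0 for xi >= 1, so -log P is 0, finite and positive, or infinite.
    """
    if xi >= 1.0:
        return math.inf
    penalty = lam * (1.0 / (1.0 - xi) - 1.0) ** alpha if xi > 0.0 else 0.0
    return gev_negloglik(mu, sigma, xi, data) + penalty
```

Passing `penalized_negloglik` to any general-purpose minimizer over `(mu, sigma, xi)` would give a penalized estimate; for `xi <= 0` the objective coincides with the ordinary negative log-likelihood.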

The four-parameter kappa distribution (K4D) is a generalized form of some commonly used distributions such as generalized logistic, generalized Pareto, generalized Gumbel, and generalized extreme value (GEV) distributions. Owing to its flexibility, the K4D is widely applied in modeling in several fields such as hydrology and climatic change. For the estimation of the four parameters, the maximum likelihood approach and the method of L-moments are usually employed. The L-moment estimator (LME) method works well for some parameter spaces, with up to a moderate sample size, but it is sometimes not feasible in terms of computing the appropriate estimates. Meanwhile, the maximum likelihood estimator (MLE) performs substantially worse with small sample sizes, in the sense that the estimator has a large variance. We therefore propose a maximum penalized likelihood estimation (MPLE) of K4D by adjusting the existing penalty functions that restrict the parameter space. Eighteen combinations of penalties for two shape parameters are considered and compared. The MPLE retains modeling flexibility and large sample optimality while also improving on small sample properties. The properties of the proposed estimator are verified through a Monte Carlo simulation, and an application case is demonstrated using Thailand's annual maximum temperature data.

... The sample size may impact the fit of the GEV distributions, i.e. the maximum likelihood estimator might be unstable for sample sizes of n < 50 (e.g., Hosking and Wallis, 1997; Coles and Dixon, 1999; Martins and Stedinger, 2000). To quantify the uncertainties related to a reduced sample size n (a 20-year period in this PhD thesis), a permutation (bootstrap) test was developed and applied to check whether the parameters (shape, location, and scale) of the distribution fitted to a sample of size n = 20 come from the same distribution as those based on a larger sample of size 50 (the sample size indicated as sufficient for robust estimates). ...

... As shown in the figure, small sample sizes of n = 10 and n = 20 lead to non-robust estimates, while estimates for sample sizes above n = 50 converge (e.g. Hosking and Wallis, 1997; Coles and Dixon, 1999; Martins and Stedinger, 2000). The exact values of the relative differences with respect to a sample size of n = 50 for several selected return periods can be obtained from Table 5.8. ...

... Since larger periods are suggested to derive future high and low flows (e.g., Martins and Stedinger, 2000), further uncertainties can be expected due to potential non-robust estimates of the GEV distributions using the maximum likelihood method. By following a statistical bootstrap sampling strategy, these uncertainties can be roughly estimated (Coles and Dixon, 1999). It is found that the GEV parameter uncertainties have a stronger impact on the derived low flows, while only small impacts are identified for the high flows. ...
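The bootstrap strategy mentioned in this snippet can be sketched roughly as follows. This is a minimal illustration, not the thesis procedure: it resamples the annual extremes with replacement and refits a distribution each time, and for brevity it swaps in a Gumbel distribution fitted by the method of moments where the thesis uses a GEV maximum likelihood fit. The function names, the 90% level, and the number of resamples are assumptions made here.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329


def gumbel_return_level(sample, return_period):
    """Return level from a Gumbel fit by the method of moments (stand-in for a GEV fit)."""
    beta = statistics.stdev(sample) * math.sqrt(6.0) / math.pi
    mu = statistics.mean(sample) - EULER_GAMMA * beta
    # Quantile of the Gumbel distribution at probability 1 - 1/T
    return mu - beta * math.log(-math.log(1.0 - 1.0 / return_period))


def bootstrap_interval(sample, return_period, n_boot=2000, level=0.90, seed=0):
    """Percentile bootstrap interval for the return level."""
    rng = random.Random(seed)
    n = len(sample)
    levels = sorted(
        gumbel_return_level([rng.choice(sample) for _ in range(n)], return_period)
        for _ in range(n_boot)
    )
    lower = levels[int((1.0 - level) / 2.0 * n_boot)]
    upper = levels[int((1.0 + level) / 2.0 * n_boot) - 1]
    return lower, upper
```

Running `bootstrap_interval` on subsamples of different lengths (for example 20 versus 50 years) gives a rough picture of how much the interval widens as the record shortens.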

Central Vietnam is characterized by a complex climatology, which, in combination with the sparse hydrometeorological observation network, makes it challenging to quantify projected hydrological extremes under a changing climate. In the region, farmers report increasing damage to agriculture caused by extreme floods and drought conditions. Particularly during the summer-autumn rice season, water is often insufficient to irrigate the entire rice production area, which significantly affects rice productivity. Therefore, scientifically sound information on the expected future hydrological extremes as well as water-efficient agricultural strategies are urgently required for sustainable water resources management. In this thesis a complex hydrometeorological modelling chain is employed to investigate the impact of climate change on future hydrological extremes in the Vu Gia - Thu Bon (VGTB) river basin, Central Vietnam. The modelling chain consists of six Global Circulation Models (GCMs) (CCAM, CCSM, ECHAM3, ECHAM5, HadCM3Qs, and MRI), six Regional Climate Models (RCMs) (CCAM, MM5, RegCM, REMO, HadRM3P and MRI), six bias correction (BC) approaches (linear scaling, local intensity scaling, power law transform (monthly), empirical and gamma quantile mapping, and power law transform), the fully distributed hydrological Water Flow and Balance Simulation Model (WaSiM), which was calibrated for the VGTB basin using two different calibration approaches, and extreme value analysis. The nonlinear parameter estimation tool PEST, which is based on the Gauss-Marquardt-Levenberg method, was combined with the distributed hydrological model WaSiM. Confidence bounds for all estimated parameters of the WaSiM model were developed based on a covariance analysis. A reasonable quality of fit between modelled and observed runoff was achieved, showing the reasonable performance of the WaSiM model in this region. 
Both bias corrected and raw RCM data are used as input for the WaSiM to simulate flows for the VGTB basin. To derive high flow and low flow frequency curves for the control (baseline) period (1980-1999) and the future periods 2011-2030, 2031-2050, and 2080-2099, the generalized extreme value (GEV) distribution is fitted to the annual maxima/minima of the simulated continuous discharge series. Permutation tests are developed and applied to the observed discharge series (1980-1999) to quantify the uncertainties related to the relatively small sample size used to estimate the GEV distribution. Results show that the GEV fits based on a sample size of n = 20 can partially be considered as robust. Due to limitations in the performance of the BC methods, the delta change approach was applied to facilitate extreme flow analysis as required for hydrological decision support. The results exhibit a remarkable variation among the different climate scenarios. As indicated by the majority of the discharge projections, a tendency towards increased high flows and decreased low flows is concluded. The results highlight challenges in using current GCM/RCMs in combination with state-of-the-art BC methods for local impact studies on both high and low flows. A second central objective of this PhD dissertation was the development and application of an integrated hydrological-irrigation modelling system to optimize irrigation strategies for a typical rice irrigation system in Central Vietnam. The modelling system comprises WaSiM to simulate the inflow to a reservoir and an irrigation model, which optimizes the rice irrigation technology, i.e. Alternate Wetting and Drying (AWD) or Continuous Flooding (CF), the rice irrigation area and the irrigation scheduling under given water constraints. Irrigation strategies are derived based on different initial water levels in the reservoir at the beginning of the cropping season as well as different maximum water releases. 
The simulation results show that the initial level of water in the reservoir is crucial: water levels of less than 90% do not provide sufficient water to irrigate the entire cropping area, whereas a level of 70% restricts the cropping area to 75% under the current design maximum outflow of 0.3 m³/s. AWD is able to reduce the irrigation water input by 4% to 10% and to reduce the number of irrigation events compared to CF. The adoption of AWD, which has not been popular in Central Vietnam, therefore has the potential to save more water and may help to increase the profit of the farmers. However, the benefits of AWD can only be achieved after significant investment in the canal system and the reservoir outlet. The impact of different computing environments on the solutions of the integrated model is estimated, since the robustness of the optimization results (performance variability) is crucial for decision support. Only limited performance variability due to the computing environment is found, giving confidence in the robustness of the model for decision support. Prior to the application and the transfer of the model to similar irrigation schemes in other regions, the model must be further validated by field experiments under various conditions.

... As a possible alternative, a penalized version of the MLM, referred to as PMLM, can be used. This method takes advantage of a penalty function ($P_f$) to constrain the likelihood to a subset of possible values of ξ (Coles and Dixon, 1999). In the framework of met-ocean parameters, Mackay et al. (2011) considered a penalty function to derive return levels of significant wave heights, using the estimators proposed by Coles and Dixon (1999). ...

... This method takes advantage of a penalty function ($P_f$) to constrain the likelihood to a subset of possible values of ξ (Coles and Dixon, 1999). In the framework of met-ocean parameters, Mackay et al. (2011) considered a penalty function to derive return levels of significant wave heights, using the estimators proposed by Coles and Dixon (1999). In the present work, we present an alternative formulation for $P_f$, which avoids spurious estimates of ξ while leaving the N-GEV parameters unaffected in cases where the MLM can be reliably applied. ...

... Coastal Engineering 167 (2021) 103896 the use of a penalty function. The method to compute the Penalized Maximum Likelihood Estimators (PMLE) uses a penalty function ($P_f$) as follows (Coles and Dixon, 1999): ...

Non-stationary Extreme Value Analysis (NEVA) makes it possible to determine the probability of exceedance of extreme sea states while taking into account trends in the time series of data at hand. In this work, we analyse the reliability of NEVA of significant wave height (Hs) and peak period (Tp) under the assumption of a linear trend for time series of annual maxima (AM) Hs in the Mediterranean Sea. A methodology to assess the significance of the results of the non-stationary model employed is proposed. Both the univariate long-term extreme value distribution of Hs and the bivariate distribution of Hs and Tp are considered. For the former, a non-stationary Generalized Extreme Value (GEV) probability is used, and a methodology to compute the parameters of the distribution based on the use of a penalty function is explored. Then, the non-stationary GEV is taken as a reference to compute the Environmental Contours of Hs and Tp, assuming a conditional model for the latter parameter. Several methods to compute linear trends are analysed and cross-validated on the series of AM Hs at more than 20,000 hindcast nodes. Results show that the non-stationary analysis provides advantages over the stationary analysis only when all the considered metrics are consistent in indicating the presence of a trend. Moreover, both the univariate return levels of Hs and the bivariate return levels of Hs and Tp show a marked dependence on the time window considered in the GEV distribution formulation. Therefore, when applying NEVA for coastal and marine applications, the hypothesis of a linear trend and the length of the reference data used for the non-stationary distribution should be carefully considered.

... The regional scale parameter γ and the shape parameter k are estimated by applying penalized maximum likelihood estimation (PMLE) to the regional extreme data sample. Coles and Dixon (1999) recommend this method because it combines the efficiency of maximum likelihood estimators for large sample sizes with the reliability of the probability weighted moment estimators for small sample sizes by penalizing high values of the shape parameter. ...

... The estimation of the regional GPD parameters is performed by the Penalized Maximum Likelihood Estimator (PMLE). In general, the penalization of the likelihood enables the use of additional information into the inference compared with that supplied by the extreme data sample (Coles and Dixon, 1999). As suggested by Coles and Dixon (1999) and by Weiss (2014), a likelihood penalization is used to penalise positive values (< 1) of the shape parameter k as follows: ...

Protecting coastal areas against natural hazards from the sea, in particular the risk of marine flooding, is essential for securing coastal installations. This risk is managed through coastal defences that are designed and regularly checked, generally by means of the return level of a particular extreme event. The return level associated with a fairly long return period (1000 years or more) is estimated by statistical methods based on Extreme Value Theory (EVT). These statistical approaches are applied to time series of an observed extreme variable and give the probability of occurrence of that variable. In the past, return levels of extreme maritime hazards were most often estimated with statistical methods applied to local observation series. In general, local sea-level series are observed over a limited period (about 50 years for sea levels), whereas good estimates are sought for extremes associated with very long return periods. For this reason, many methodologies are used to increase the size of extreme-value samples and reduce the uncertainty of the estimates. In coastal engineering, one approach currently in fairly wide use is regional analysis. Weiss (2014) identifies regional analysis as a very effective way to reduce the uncertainty of extreme-event estimates. The principle of this methodology is to exploit the wide spatial availability of data observed at different sites in order to form homogeneous regions. This makes it possible to fit statistical distributions to larger regional samples that pool all the extreme events that have struck one or more sites in the region (...) 
This, together with the particular character of each historical event, prevents its use in a classical regional analysis. A statistical methodology called FAB, which enables a regional analysis that takes historical data into account, is developed in this manuscript. Designed for POT (Peaks Over Threshold) data, the method is based on a new definition of an observation duration, called the credible duration, both local and regional, and it can incorporate the three most common types of historical data (exact point values, values defined by an interval, and values above a lower bound) into the statistical analysis. In addition, an approach for determining an optimal sampling threshold is defined in this study. The FAB method is quite versatile and allows return levels to be estimated in either a frequentist or a Bayesian framework. The methodology is applied to a database of recorded skew surges (systematic data) and 14 historical skew surges collected at various sites along the French, English, Belgian and Spanish coasts of the Atlantic, the English Channel and the North Sea. Finally, this manuscript examines the problem of discovering and validating historical data.

... The unbiased PWM and biased PWM were the most unbiased and accurate (in terms of parameter estimation), being the least sensitive to sample size. Coles and Dixon (1999) proposed a penalized maximum likelihood (PML) approach with EVI < 1, which outperformed the PWM in terms of bias and MSE for parameter and quantile estimations (Coles and Dixon, 1999). Diebolt et al. (2008) stressed that the method presents limitations for strongly heavy-tailed distributions, noting that the asymptotic properties of PWM cannot be derived for GEV shape parameter values higher than 0.5. ...

... The Generalized (or Penalized) Maximum Likelihood Estimator (GML or PML) (Coles and Dixon, 1999) augments the maximum likelihood with a penalty function, which may contain prior knowledge on the parameters or discourage undesirable estimates. The penalty is full for EVI larger than 1, no penalty is given for EVI less than 0 and a continuous function is defined between 0 and 1. ...
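The three regimes described in this snippet can be written down as a small function. The sketch below assumes the exponential form usually attributed to Coles and Dixon (1999), P(ξ) = exp{−λ(1/(1 − ξ) − 1)^α} for 0 < ξ < 1, with the function name and the defaults `lam = alpha = 1` chosen here for illustration. P multiplies the likelihood, so P = 1 means no penalty and P = 0 excludes the value.

```python
import math


def shape_penalty(xi, lam=1.0, alpha=1.0):
    """Multiplicative likelihood penalty on the extreme value index xi."""
    if xi <= 0.0:
        return 1.0  # no penalty for a non-positive extreme value index
    if xi >= 1.0:
        return 0.0  # full penalty: such values are ruled out
    # Continuous decrease from 1 (xi -> 0) to 0 (xi -> 1)
    return math.exp(-lam * (1.0 / (1.0 - xi) - 1.0) ** alpha)
```

With the defaults, `shape_penalty(0.5)` equals exp(−1), and the function is continuous at both ξ = 0 and ξ = 1.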

Here we review methods used for probabilistic analysis of extreme events in Hydroclimatology. We focus on streamflow, precipitation, and temperature extremes at regional and global scales. The review has four thematic sections: (1) probability distributions used to describe hydroclimatic extremes, (2) comparative studies of parameter estimation methods, (3) non-stationarity approaches, and (4) model selection tools. Synthesis of the literature shows that: (1) recent studies, in general, agree that precipitation and streamflow extremes should be described by heavy-tailed distributions, (2) the Method of Moments (MOM) is typically the first choice in estimating distribution parameters but it is outperformed by methods such as L-Moments (LM), Maximum Likelihood (ML), Least Squares (LS), and Bayesian Markov Chain Monte Carlo (BMCMC), (3) there are less popular parameter estimation techniques such as the Maximum Product of Spacings (MPS), the Elemental Percentile (EP), and the Minimum Density Power Divergence Estimator (MDPDE) that have shown competitive performance in fitting extreme value distributions, and (4) non-stationary analyses of extreme events are gaining popularity; the ML is the typically used method, yet literature suggests that the Generalized Maximum Likelihood (GML) and the Weighted Least Squares (WLS) may be better alternatives. The review offers a synthesis of past and contemporary methods used in the analysis of hydroclimatic extremes, aiming to highlight their strengths and weaknesses. Finally, the comparative studies summary helps the reader identify the most suitable modeling framework for their analyses, based on the extreme hydroclimatic variables, sample sizes, locations, and evaluation metrics reviewed.

... provided that $1 + \xi (x_i - \mu)/\sigma > 0$, for $i = 1, \dots, n$. Coles and Dixon [31] proposed a penalty function for the maximum likelihood method for the GEV distribution. The penalty is in terms of both bias and variance magnitudes. ...

... Coles and Dixon [31] advise setting both penalty parameters to 1, which leads to reasonable performance. The Penalized Maximum Likelihood (PML) estimator appears to be slightly better than, or at least as good as, the PWM estimator. ...

... The Penalized Maximum Likelihood (PML) estimator appears to be slightly better than, or at least as good as, the PWM estimator. Following Coles and Dixon [31], the two approaches are applied for comparison, i.e., the penalized maximum likelihood (PML) and the maximum likelihood (ML) estimators. ...

This paper uses the Extreme Value Theory (EVT) to model the rare events that appear as delivery delays in road transport. Transport delivery delays occur stochastically. Therefore, modeling such events should be done using appropriate tools due to the economic consequences of these extreme events. Additionally, we provide the estimates of the extremal index and the return level with the confidence interval to describe the clustering behavior of rare events in deliveries. The Generalized Extreme Value Distribution (GEV) parameters are estimated using the maximum likelihood method and the penalized maximum likelihood method for better small-sample properties. The findings demonstrate the advantages of EVT-based prediction and its readiness for application.

... Various approaches have been proposed to find the values of these parameters, among which ML and PWM estimators are two commonly used methods. There are several studies comparing and evaluating the performances and estimations of PWM and ML methods (Brabson and Palutikof, 2000; Coles and Dixon, 1999; Hosking et al., 1985; Martins and Stedinger, 2000; Phien, 1987). ...

... Lastly, the ME method was found to be closer to the ML than the PWM method. Coles and Dixon (1999) argued that the better performance of the PWM method for small sample sizes is due to the assumption of "a restricted parameter space". Therefore, by incorporating similar information in a likelihood-based model, Coles and Dixon (1999) proposed a penalised maximum-likelihood estimator, which benefits from "the modelling flexibility and large-sample optimality" of the ML method, and also from the small-sample advantage of the PWM method. ...

... Coles and Dixon (1999) argued that the better performance of the PWM method for small sample sizes is due to the assumption of "a restricted parameter space". Therefore, by incorporating similar information in a likelihood-based model, Coles and Dixon (1999) proposed a penalised maximum-likelihood estimator, which benefits from "the modelling flexibility and large-sample optimality" of the ML method, and also from the small-sample advantage of the PWM method. In addition, Martins and Stedinger (2000) showed that restricting the value of k to a physically and statistically reasonable range, using a Bayesian prior distribution, eliminates the small-sample issue of the ML method. ...

The study aims to estimate design wind speeds and the associated directional and lee-zone multipliers for New Zealand by analysing historical wind data recorded at meteorological stations together with analyses from a high-resolution convection-resolving numerical weather prediction model, the New Zealand Convective-Scale Model (NZCSM). New Zealand’s historical wind data have not been analysed in the past two decades for design wind-load purposes. In addition, no attempt has been made to thoroughly homogenise the mean and gust wind speed data recorded prior to the 1990s and to convert them to equivalent Automatic Weather Stations (AWS) records. Furthermore, lee zones, areas affected by wind speed-up due to the presence of mountains, can significantly influence the design wind loads, so it is crucial to estimate the spatial extent and magnitude of the lee multiplier accurately. In this study, the wind data were first subjected to a robust homogenisation algorithm and then separated into synoptic and non-synoptic events to gain a better understanding of New Zealand’s gust climatology and sources of extreme events. It was demonstrated that synoptic events dominate the design wind speeds at most locations in New Zealand. For extreme value analysis, three different extreme value distributions were used, namely Type I (using Gumbel, Gringorten and BLUE fitting methods), Type III (using maximum likelihood and probability weighted moments methods), and the Peaks-Over-Threshold (POT) approach. In addition, the predictions of NZCSM along with historical wind speeds were used to identify the lee zones, which confirmed existing zones and provided evidence to support introducing new zones, and to obtain estimates of the lee multipliers. Substantial changes have been proposed for the next version of the Australian/New Zealand wind-loading standard (AS/NZS 1170.2) based on the results of this study. 
The changes include adding a new wind region to New Zealand, refinements of wind zone boundaries, revising all regional wind speeds and directional multipliers, and modifying the lee-zone regions and multipliers.

... The ML estimator is widely used because of its asymptotic properties, even though it has bias and robustness problems with small samples. Several attempts have been made to reduce ML bias using analytic [3] and bootstrap methods [4], [5]. Frery et al. [6] proposed a technique to correct its tendency to diverge with small samples. ...

... This law has been verified and studied in several fields because of its flexibility to model different phenomena. We take advantage of this fact by using estimators whose good properties (bias corrected [3], low computational cost and asymptotic efficiency [12]) have already been assessed, but that have not been used by the SAR community. ...

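... Penalized Maximum Likelihood (PML): This estimator is obtained by minimizing the penalized negative log-likelihood:

$$(\hat{\alpha}_{\mathrm{PML}}, \hat{\gamma}_{\mathrm{PML}}) = \arg\min_{(\alpha,\gamma)} \left\{ -\ln L(\alpha, \gamma) - \ln P(\alpha) \right\},$$

where $P(\alpha)$ is the penalty function. Coles and Dixon [3] proposed:

$$P(\alpha) = \begin{cases} \exp\left\{ -\lambda \left( \dfrac{-1}{1+\alpha} \right)^{\nu} \right\} & \text{if } \alpha < -1, \\ 0 & \text{if } \alpha \ge -1, \end{cases}$$

where $\lambda$ and $\nu$ are non-negative values; the authors suggested $\nu = \lambda = 1$.

C. Moments (Mom): The r-order moments of a $G_I^0(\alpha, \gamma, 1)$ distributed random variable are given by

$$E(Z^r) = \gamma^r \, \frac{\Gamma(-\alpha - r)}{\Gamma(-\alpha)} \, \Gamma(1 + r) \qquad (4)$$

if $\alpha < -r$ and infinite otherwise. ...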
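The single-look moment formula quoted in this snippet is easy to evaluate numerically. The sketch below is only a direct transcription of that formula using the standard-library gamma function; the function name is chosen here for illustration and does not come from the cited letter.

```python
import math


def g0_single_look_moment(r, alpha, gamma):
    """r-th moment of a G_I^0(alpha, gamma, 1) variable, following Eq. (4)."""
    if alpha >= -r:
        return math.inf  # the moment is infinite unless alpha < -r
    return gamma ** r * math.gamma(-alpha - r) * math.gamma(1 + r) / math.gamma(-alpha)
```

For r = 1 the expression reduces to γ/(−α − 1), because Γ(−α) = (−α − 1)Γ(−α − 1), which gives a quick sanity check of the transcription.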

The statistical properties of Synthetic Aperture Radar (SAR) image texture reveal useful target characteristics. It is well-known that these images are affected by speckle and prone to extreme values due to double bounce and corner reflectors. The GI0 distribution is flexible enough to model different degrees of texture in speckled data. It is indexed by three parameters: α, related to the texture, γ, a scale parameter, and L, the number of looks. Quality estimation of α is essential due to its immediate interpretability. In this letter, we exploit the connection between the GI0 and Pareto distributions. With this, we obtain six estimators that have not been previously used in the SAR literature. We compare their behavior with others in the noisiest case for monopolarized intensity data, namely the single-look case. We evaluate them using Monte Carlo methods for noncontaminated and contaminated data, considering convergence rate, bias, mean squared error, and computational time. We conclude that two of these estimators based on the Pareto law are the safest choices when dealing with actual data and small samples, as is the case of despeckling techniques and segmentation, to name just two applications. We verify the results with an actual SAR image.

... Robust methods for the EVD have been studied in Dupuis and Field (1998), who derived B-optimal robust M-estimators for the case that the observations follow an EVD. Modifications of the ML estimator are presented by Coles and Dixon (1999), who suggest penalised maximum likelihood (PML) estimators, showing that PML estimation improves the small-sample properties of a likelihood-based analysis. ...

... Castillo and Hadi (1997) have proposed estimators based on the elemental percentile method (EPM). PML estimators, containing a penalty function for the shape parameter, are presented by Coles and Dixon (1999) and Martins and Stedinger (2000). The PML estimator combines the flexibility of the ML estimator and the robustness of the PWM estimator. ...

In the last decades there has been a shift from the parametric statistics of extremes for IID random variables, based on the probabilistic asymptotic results in extreme value theory, towards a semi-parametric approach, where the estimation of the right tail-weight, under a quite general framework, is of major importance. After a brief presentation of classical Gumbel's block methodology and of later improvements in the parametric framework (multivariate and multi-dimensional extreme value models for largest observations and peaks over threshold approaches), we present a coordinated overview, over the last three decades, of the developments on the estimation of the extreme value index and testing of extreme value conditions under a semi-parametric framework. Laurens de Haan has been one of the leading scientists in the field, (co-)author of many seminal ideas, that he generously shared with dozens (literally) of colleagues and students, thus achieving one of the main goals in a scientist's life: he gathered around him a bunch of colleagues united in the endeavor of building knowledge. The last section is a personal tribute to Laurens, who fully lives his ideal that "co-operation is the heart of Science".

... Estimation of the shape parameter is notoriously challenging, and the maximization of the GPD likelihood may exhibit convergence problems for small sample sizes (Coles and Dixon, 1999). In general, penalization can help to reduce the variance of an estimator at the cost of higher bias (Hastie et al., 2009). ...

... In general, penalization can help to reduce the variance of an estimator at the cost of higher bias (Hastie et al., 2009). Coles and Dixon (1999) propose a penalty function that restricts the shape parameter to values below 1 and favors smaller values of the shape parameter. Several penalization schemes can be interpreted in a Bayesian sense by considering a prior distribution on the regularized parameter. ...

Classical methods for quantile regression fail in cases where the quantile of interest is extreme and only few or no training data points exceed it. Asymptotic results from extreme value theory can be used to extrapolate beyond the range of the data, and several approaches exist that use linear regression, kernel methods or generalized additive models. Most of these methods break down if the predictor space has more than a few dimensions or if the regression function of extreme quantiles is complex. We propose a method for extreme quantile regression that combines the flexibility of random forests with the theory of extrapolation. Our extremal random forest (ERF) estimates the parameters of a generalized Pareto distribution, conditional on the predictor vector, by maximizing a local likelihood with weights extracted from a quantile random forest. Under certain assumptions, we show consistency of the estimated parameters. Furthermore, we penalize the shape parameter in this likelihood to regularize its variability in the predictor space. Simulation studies show that our ERF outperforms both classical quantile regression methods and existing regression approaches from extreme value theory. We apply our methodology to extreme quantile prediction for U.S. wage data.

... For instance, probability weighted moments or L-moments have been proposed as alternatives to moment or maximum likelihood (ML) estimators. Indeed, the former show a superior performance in typical small sample cases (Hosking et al. 1985), which has been mainly attributed to their restricted parameter space (Coles and Dixon 1999). ...

... When including a non-zero penalty, the resulting estimators are therefore called penalized maximum likelihood (PML) estimators. Coles and Dixon (1999) and Martins and Stedinger (2000) propose two slightly different estimators of GEV parameters of this particular form (Eq. 3), with a regularizer Ω(θ) depending only on the shape ξ, thus aiming at ruling out unusual values of the shape parameter. However, no asymptotic theory is provided and it is unknown whether (and under what conditions) the estimators are consistent. ...
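A minimal sketch of such a penalized objective follows. Eq. 3 is not reproduced in the snippet, so the code assumes it has the form "negative log-likelihood plus Ω(θ)". The function names are hypothetical. The regularizer shown is the negative log of a Coles and Dixon type penalty with assumed constants alpha = lam = 1, and the shape convention is ξ > 0 for heavy tails.

```python
import math

def gev_nll(params, data):
    """Negative log-likelihood of a GEV(mu, sigma, xi) sample."""
    mu, sigma, xi = params
    if sigma <= 0:
        return math.inf
    total = len(data) * math.log(sigma)
    for x in data:
        z = (x - mu) / sigma
        if abs(xi) < 1e-8:
            # Gumbel limit as xi -> 0
            total += z + math.exp(-z)
        else:
            t = 1.0 + xi * z
            if t <= 0:
                # observation outside the support for these parameters
                return math.inf
            total += (1.0 + 1.0 / xi) * math.log(t) + t ** (-1.0 / xi)
    return total

def shape_penalty(xi, alpha=1.0, lam=1.0):
    """Omega(xi): zero for xi <= 0, growing without bound as xi -> 1."""
    if xi <= 0:
        return 0.0
    if xi >= 1:
        return math.inf
    return lam * (1.0 / (1.0 - xi) - 1.0) ** alpha

def penalized_nll(params, data, alpha=1.0, lam=1.0):
    """Objective to minimize: negative log-likelihood plus shape penalty."""
    return gev_nll(params, data) + shape_penalty(params[2], alpha, lam)
```

Any general-purpose minimizer can be applied to `penalized_nll`. Because the penalty depends only on ξ, the location and scale estimates are affected only indirectly, through their dependence on the shape.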

A common statistical problem in hydrology is the estimation of annual maximal river flow distributions and their quantiles, with the objective of evaluating flood protection systems. Typically, record lengths are short and estimators imprecise, so that it is advisable to exploit additional sources of information. However, there is often uncertainty about the adequacy of such information, and a strict decision on whether to use it is difficult. We propose penalized quasi-maximum likelihood estimators to overcome this dilemma, allowing one to push the model towards a reasonable direction defined a priori. We are particularly interested in regional settings, with river flow observations collected at multiple stations. To account for regional information, we introduce a penalization term inspired by the popular Index Flood assumption. Unlike in standard approaches, the degree of regionalization can be controlled gradually instead of deciding between a local or a regional estimator. Theoretical results on the consistency of the estimator are provided and extensive simulations are performed for comparison with other local and regional estimators. The proposed procedure yields very good results, both for homogeneous as well as for heterogeneous groups of sites. A case study consisting of sites in Saxony, Germany, illustrates the applicability to real data.

... We also show in appendix C that the uncertainty in estimating unprecedented events from observational records using MLE is dominated by uncertainty in the shape parameter. One fix for this is to put a subjectively-chosen prior (or "penalty function") on the shape parameter (Coles and Dixon, 1999; Martins and Stedinger, 2000). This method was used by CFB2018. ...

... For short record lengths, unconstrained maximum-likelihood estimation is known to give "noisy" and implausible shape parameters (Coles and Dixon, 1999; Martins and Stedinger, 2000), see also §3.3. We illustrate this in Fig. C1 with a GEV fit to Figure C1. ...

Our ability to quantify the likelihood of present-day extreme sea level (ESL) events is limited by the length of tide gauge records around the UK, and this results in substantial uncertainties in return level curves at many sites. In this work, we explore the potential for a state-of-the-art climate model, HadGEM3-GC3, to help refine our understanding of present-day coastal flood risk associated with extreme storm surges, which are the dominant driver of ESL events for the UK and wider European shelf seas. We use a 483-year present-day control simulation from HadGEM3-GC3-MM (1/4 degree ocean, approx 60 km atmosphere in mid-latitudes) to drive a northwest European shelf seas model and generate a new dataset of simulated UK storm surges. The variable analysed is the skew surge (the difference between the high water level and the predicted astronomical high tide), which is widely used in analysis of storm surge events. The modelling system can simulate skew surge events comparable to the catastrophic 1953 North Sea storm surge, which resulted in widespread flooding, evacuation of 32 thousand people and hundreds of fatalities across the UK alone, along with many hundreds more in mainland Europe. Our model simulations show good agreement with an independent re-analysis of the 1953 surge event and suggest that a skew surge event of this magnitude has an expected frequency of about 1 in 500 years at the mouth of the river Thames. For that site, we also revisit the assumption of skew surge/tide independence. Our model results suggest that at that site for the most extreme surges, tide/surge interaction significantly attenuates extreme skew surges on a spring tide compared to a neap tide. 
Around the UK coastline, the extreme tail shape parameters diagnosed from our simulation correlate very well (Pearson's r greater than 0.85), in terms of spatial variability, with those used in the UK government's current guidance (which are diagnosed from tide-gauge observations), but ours can be diagnosed without the use of a subjective prior. Despite the strong correlation, our diagnosed shape parameters are biased low relative to the current guidance. This bias is also seen when we replace HadGEM3-GC3-MM with a reanalysis, so we conclude that the bias is likely associated with limitations in the shelf sea model used here. Overall, the work suggests that climate model simulations may prove useful as an additional line of evidence to inform assessments of present-day coastal flood risk.

... A GPD distribution taking into account the seasonality is then fitted to the regional sample. The distribution parameters are estimated with the penalized maximum likelihood method (Coles and Dixon, 1999). The most adequate distributions are obtained with the AIC. ...

To withstand coastal flooding, protection of coastal facilities and structures must be designed with the most accurate estimate of extreme storm surge return levels (SSRLs). However, because of the paucity of data, local statistical analyses often lead to poor frequency estimations. The regional frequency analysis (RFA) reduces the uncertainties associated with these estimations by extending the dataset from local (only available data at the target site) to regional (data at all the neighboring sites including the target site) and by assuming, at the scale of a region, a similar extremal behavior. In this work, the empirical spatial extremogram (ESE) approach is used. This is a graph representing all the coefficients of extremal dependence between a given target site and all the other sites in the whole region. It allows quantifying the pairwise closeness between sites based on the extremal dependence. The ESE approach, which should help give more confidence in the physical homogeneity of the region of interest, is applied to a database of extreme skew storm surges (SSSs) and used to perform an RFA.

... Hosking and Wallis (1987) showed that MLE provides greater variance and bias for small samples compared to the Probability Weighted Moment (PWM) (Greenwood et al., 1979; Landwehr et al., 1979) and the Method of Moments (MOM) estimators. Coles and Dixon (1999) proposed a modified MLE which contains a penalty function for the shape parameter, i.e. the Maximum Penalized Likelihood estimator (MPLE). Zhang (2007) presented a hybrid Likelihood Moment estimator (LME) which provides feasible estimates and has high asymptotic efficiency. ...
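To make the PWM alternative concrete, here is a small sketch of the PWM estimator for the GPD. It assumes the standard moment relations a_s = E[X(1 - F(X))^s] = σ / ((s + 1)(s + 1 - ξ)), with ξ > 0 denoting a heavy tail, and the plotting positions p_i = (i - 0.35)/n. The function name and these choices are illustrative and are not taken from the citing paper.

```python
def gpd_pwm(excesses):
    """PWM estimates (sigma, xi) for a GPD fitted to threshold excesses."""
    x = sorted(excesses)
    n = len(x)
    # a0 estimates E[X]
    a0 = sum(x) / n
    # a1 estimates E[X (1 - F(X))] with plotting positions (i - 0.35)/n
    a1 = sum((1.0 - (i - 0.35) / n) * v for i, v in enumerate(x, start=1)) / n
    denom = a0 - 2.0 * a1
    xi = 2.0 - a0 / denom
    sigma = 2.0 * a0 * a1 / denom
    return sigma, xi
```

The estimator only exists when the mean is finite (ξ < 1), which is the implicit parameter restriction that the penalized likelihood approach is designed to mimic.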

This study investigated core components of an extreme value methodology for the estimation of high-flow frequencies from agricultural surface water runoff. The Generalized Pareto distribution (GPD) was used to model excesses in time-series data that resulted from the 'Peaks Over Threshold' (POT) method. First, the performance of eight different GPD parameter estimators was evaluated through a Monte Carlo experiment. Second, building on the estimator comparison, two existing automated GPD threshold selection methods were evaluated against a proposed approach that automates the threshold stability plots. For this second experiment, methods were applied to discharge measured at a highly-instrumented agricultural research facility in the UK. By averaging fine-resolution 15-minute data to hourly, 6-hourly and daily scales, we were also able to determine the effect of scale on threshold selection, as well as the performance of each method. The results demonstrate the advantages of the proposed threshold selection method over two commonly applied methods, while at the same time providing useful insights into the effect of the choice of the scale of measurement on threshold selection. The results can be generalised to similar water monitoring schemes and are important for improved characterisations of flood events and the design of associated disaster management protocols.

... Once the maximum is found in a parametric space with a dimension smaller than the original, we used these values as seeds or initial values to perform the approximation using the Newton-Raphson algorithm in the initial parametric space. We also agree with the results of Coles and Dixon [37], which found that estimators are improved using the maximum penalized likelihood method by restricting the range of κ. ...

This paper concerns the use and implementation of penalized maximum likelihood procedures to fitting smoothing functions of the generalized extreme value distribution parameters to analyze spatial extreme values of ultraviolet B (UVB) radiation across the Mexico City metropolitan area in the period 2000–2018. The model was fitted using a flexible semi-parametric approach and the parameters were estimated by the penalized maximum likelihood (PML) method. In order to investigate the performance of the model as well as the estimation method in the analysis of complex nonlinear trends for UVB radiation maxima, a simulation study was conducted. The results of the simulation study showed that penalized maximum likelihood yields better regularization to the model than the maximum likelihood estimates. We estimated return levels of extreme UVB radiation events through a nonstationary extreme value model using measurements of ozone (O3), nitrogen oxides (NOx), particles of 10 μm or less in diameter (PM10), carbon monoxide (CO), relative humidity (RH) and sulfur dioxide (SO2). The deviance statistics indicated that the nonstationary generalized extreme value (GEV) model adjusted was statistically better compared to the stationary model. The estimated smoothing functions of the location parameter of the GEV distribution on the spatial plane for different periods of time reveal the existence of well-defined trends in the maxima. In the temporal plane, a presence of temporal cyclic components oscillating over a weak linear component with a negative slope is noticed, while in the spatial plane, a weak nonlinear local trend is present on a plane with a positive slope towards the west, covering the entire study area. An explicit spatial estimate of the 25-year return period revealed that the more extreme risk levels are located in the western region of the study area.

... (1938 -2021) Traditionally, estimates for the parameters of the GEV distribution are computed and these are, in turn, used to make inferences. Methods for estimating the GEV parameters include maximum likelihood and moment based methods [Coles and Dixon, 1999], [Nerantzaki and Papalexiou, 2022]. In addition, sensitivity analysis is generally employed based on the estimates having asymptotic properties, such as normality, along with other approximations. ...

In this study, we examine a Bayesian approach to analyze extreme daily rainfall amounts and forecast return levels. Estimating the probability of occurrence and quantiles of future extreme events is important in many applications, including civil engineering and the design of public infrastructure. In contrast to traditional analyses, which use point estimates to accomplish this goal, the Bayesian method utilizes the complete posterior density derived from the observations. The Bayesian approach offers the benefit of well-defined credible (confidence) intervals, improved forecasting, and the ability to defend rigorous probabilistic assessments. We illustrate the Bayesian approach using extreme precipitation data from Long Island, NY, USA and show that current return levels, or precipitation risk, may be understated.

... The ML estimation is in accordance with the likelihood principle, which states that, in the process of the inference of θ, all the relevant information in the observed data is contained in the likelihood function (Pindyck and Rubinfeld, 1998; Coles and Dixon, 1999; El Adlouni et al., 2007). ...

The estimates of the 2,082 grid points return level indicate that the intensity of expected daily extreme precipitation depends on the seasonal period and the place of occurrence of precipitation. The east of the Northeast Brazil stood out as the region where the highest intensities of extreme precipitation are expected. This information is very worrying, as in this area occur many natural disasters such as floods and landslides, causing impacts for human society and the environment. This study aimed to estimate levels of return of extreme daily precipitation events, associating them with natural disasters in Northeast Brazil (NEB), a region characterized by different climatic conditions and low rates of social and economic development. For this, generalized Pareto distribution (GPD) models were adjusted to the daily extreme precipitation data estimated by the Tropical Rainfall Measuring Mission (TRMM) 3B42 product of the multisatellite precipitation analysis for a period of 16 years (2000–2015). In addition, the estimates of the GPD model were compared using two data sources, TRMM and pluviometer. The investigation showed that the results of the GPD model estimated by means of the extreme data from the rain gauge and the TRMM were statistically the same, with 95% confidence. Thus, using the data referring to the 2,082 grid points of the TRMM, it was possible to map the spatial distribution of the estimates of the levels of return of extreme precipitation to the return periods of 2, 5 and 10 years, per seasonal period. In general, the results indicated that the intensity of expected extreme precipitation depends on the seasonal period and the place of occurrence of precipitation. The eastern NEB stood out as the region where the highest intensities of extreme precipitation are expected. Extreme precipitation values of up to 178 mm are expected in 2 years. 
The areas where natural disasters occurred in the years 2016, 2017 and 2018 are similar to those in which the highest rainfall intensities are expected. The results of this study can allow the evaluation of the spatial distribution of risks related to extreme precipitation events, and therefore, support the planning of regional public policies and environmental management for the prevention of natural disasters in NEB.

... When ξ = 0, the distribution is Gumbel distribution (Coles, 2001). The parameters are estimated by L moments fitting which gives better estimates for small samples (Bílková, 2012;Coles & Dixon, 1999). ...

In this work, the occurrence probability of extreme geomagnetic storms is estimated by applying extreme value theory to the geomagnetic activity Aa index. The Aa index has an observation time span of 172 years, which is much longer than that of other geomagnetic indices, and is thus more suitable for the analysis of rarely occurring extreme geomagnetic storms. We use two newly developed extreme value theory methods, the block maxima method and the peak over threshold method, and find that an extreme geomagnetic storm like the one in March 1989 may happen once per century. This result implies that we should pay more attention to such extreme geomagnetic storms, which can cause space weather hazards.

... A GPD law taking into account the seasonality is then fitted to the regional sample. The distribution parameters are estimated with the penalized maximum likelihood method (Coles and Dixon, 1999). The most adequate distributions are obtained with the AIC criterion. ...

To resist marine submersion, coastal protection must be designed by taking into account the most accurate estimate of the return levels of extreme events, such as storm surges. However, because of the paucity of data, local statistical analyses often lead to poor frequency estimations. Regional Frequency Analysis (RFA) reduces the uncertainties associated with these estimations, by extending the dataset from local (only available data at the target site) to regional (data at all the neighboring sites including the target site) and by assuming, at the scale of a region, a similar extremal behavior. RFA, based on the index flood method, assumes that, in a homogeneous region, observations at sites, normalized by a local index, follow the same probability distribution. In this work, the spatial extremogram approach is used to form a physically homogeneous region centered on the target site. The approach is applied on a database of extreme skew storm surges and used to carry out a RFA.

... Besides the L-moment method, there are several other methods of parameter estimation, such as the method of moments and maximum likelihood estimation. As a general framework for extreme value modelling, the maximum likelihood estimation method has many advantages because it can be constructed for complex modelling situations, allowing for non-stationarity, covariate effects and regression modelling (Coles & Dixon 1999). However, previous studies showed that maximum likelihood estimation is unstable and can give unrealistic estimates of the shape parameter for small sample sizes (Hosking & Wallis 1997; Martins & Stedinger 2000). ...

... It is interesting to note the particularly wide confidence intervals seen for most of the velocity components with a high value of the GP shape parameter. Considering the physics of our problem, however, the most extreme intervals appear unrealistic; in such cases better estimates could probably be obtained by penalized maximum likelihood; see Coles and Dixon (1999). ...

Knowledge about extreme ocean currents and their vertical structure is important when designing offshore structures. We propose a method for statistical modelling of extreme vertical current velocity profiles, accounting for factors such as directionality, spatial and temporal dependence, and non-stationarity due to the tide. We first pre-process the data by resolving the observed (vector) currents at each of several water depths into orthogonal major and minor axis components by principal component analysis, and use harmonic analysis to decompose the total (observed) current into the sum of (deterministic) tidal and (stochastic) residual currents. A complete marginal model is then constructed for all residual current components, and the dependence structure between the components is characterized using the conditional extremes model by Heffernan and Tawn (2004). By simulating under this model, estimates of various extremal statistics can be acquired. A simple approach for deriving design current velocity profiles is also proposed. The method is tested using measured current profiles at two coastal locations in Norway, covering a period of 2.5 and 1.5 years. It is demonstrated that the method provides good extrapolations at both locations, and the estimated 10-year design current velocity profiles appear realistic compared to the most extreme velocity profiles observed in the measurements.

... Finally, in order to avoid an overestimate of the positive value of the shape parameter due to the small sample size (Lee et al., 2017), a modification of the maximum likelihood estimator using a penalty function is also applied for fitting the GEV. The penalty function penalizes estimates of ξ that are close to or greater than 1, following Coles and Dixon (1999). ...

Extreme cold weather events, such as the winter of 1962/63, the third coldest winter ever recorded in the Central England Temperature record, or more recently the winter of 2010/11, have significant consequences for the society and economy. This paper assesses the probability of such extreme cold weather across the United Kingdom (UK), as part of a probabilistic catastrophe model for insured losses caused by the bursting of pipes. A statistical model is developed in order to model the extremes of the Air Freezing Index (AFI), which is a common measure of the magnitude and duration of freezing temperatures. A novel approach in the modelling of the spatial dependence of the hazard has been followed which takes advantage of the vine copula methodology. The method allows complex dependencies to be modelled, especially between the tails of the AFI distributions, which is important to assess the extreme behaviour of such events. The influence of the North Atlantic Oscillation and of anthropogenic climate change on the frequency of UK cold winters has also been taken into account. According to the model, the occurrence of extreme cold events, such as the 1962/63 winter, has decreased approximately 2 times during the course of the 20th century as a result of anthropogenic climate change. Furthermore, the model predicts that such an event is expected to become more uncommon, about 2 times less frequent, by the year 2030. Extreme cold spells in the UK have been found to be heavily modulated by the North Atlantic Oscillation (NAO) as well. A cold event is estimated to be ≈3–4 times more likely to occur during its negative phase than its positive phase. However, considerable uncertainty exists in these results, owing mainly to the short record length and the large interannual variability of the AFI.

... Likelihood-based parameter estimates for the univariate GEV distribution are known to perform poorly, but penalized likelihoods can reduce estimation bias (Coles and Dixon 1999;Martins and Stedinger 2000). Penalized likelihoods have been incorporated into spatial models for marginal extremes (Opitz et al. 2018;Schliep et al. 2010). ...

Uncertainty in return level estimates for rare events, like the intensity of large rainfall events, makes it difficult to develop strategies to mitigate related hazards, like flooding. Latent spatial extremes models reduce the uncertainty by exploiting spatial dependence in statistical characteristics of extreme events to borrow strength across locations. However, these estimates can have poor properties due to model misspecification: Many latent spatial extremes models do not account for extremal dependence, which is spatial dependence in the extreme events themselves. We improve estimates from latent spatial extremes models that make conditional independence assumptions by proposing a weighted likelihood that uses the extremal coefficient to incorporate information about extremal dependence during estimation. This approach differs from, and is simpler than, directly modeling the spatial extremal dependence; for example, by fitting a max-stable process, which is challenging to fit to real, large datasets. We adopt a hierarchical Bayesian framework for inference, use simulation to show the weighted model provides improved estimates of high quantiles, and apply our model to improve return level estimates for Colorado rainfall events with 1% annual exceedance probability.

... In [3], [20], the selection of an appropriate threshold is identified as one of the important concerns of the POT approach; it remains an unsolved problem and an area of ongoing research, and can be of critical importance. In [6], [9], it is stated that threshold selection is always a trade-off between bias and variance. If too high a threshold is selected, the bias decreases while the variance increases, as there is not enough data above this threshold. ...
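The trade-off can be inspected numerically with a mean excess (mean residual life) table, a standard graphical diagnostic for threshold choice. The sketch below is illustrative only: the snippet does not say which diagnostics the cited works use, and the function name is hypothetical. For each candidate threshold it reports how many observations remain and their mean excess. Under a GPD tail the mean excess is approximately linear in the threshold, while the shrinking count shows where the variance starts to dominate.

```python
def mean_excess_table(data, thresholds):
    """Return (threshold, number of exceedances, mean excess) rows."""
    rows = []
    for u in thresholds:
        excesses = [x - u for x in data if x > u]
        if not excesses:
            # no data above this threshold, so nothing to estimate
            continue
        rows.append((u, len(excesses), sum(excesses) / len(excesses)))
    return rows
```

A threshold is typically chosen near the lowest value above which the mean excess looks roughly linear, so that as many exceedances as possible are retained.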

One of the major challenges in the Peak over Threshold (POT) model is the selection of the best threshold for fitting the Generalized Pareto Distribution (GPD), which is widely used in many applications. The choice of threshold must balance bias against variance. In this paper we compare two graphical methods for determining the best threshold in the POT model and estimating the tail index. Results are obtained from different estimators of the GPD shape parameter using maximum likelihood (ML). Finally, we use an application to real data to compare the properties of the different estimators of the tail index. The results show that the GPD model with the threshold from the threshold choice plot (TCP) is the better choice on the basis of the deviance and the Akaike information criterion. For the calculations, we use R with the packages POT and ismev for parameter estimation and diagnostic plots.

... With the TG regions established (Figure 2), the aggregated and normalized sets of TG threshold exceedances are fit with a GPD (Coles, 2001) using the penalized maximum likelihood method (Coles and Dixon, 1999; Frau et al., 2018) to estimate (median) regional ESL (RESL) probabilities and the 5th and 95th percentile levels (90% confidence interval) defined as: ...

A regional frequency analysis (RFA) of tide gauge (TG) data fit with a Generalized Pareto Distribution (GPD) is used to estimate contemporary extreme sea level (ESL) probabilities and the risk of a damaging flood along Pacific Basin coastlines. Methods to localize and spatially granulate the regional ESL (sub-annual to 500-year) probabilities and their uncertainties are presented to help planners of often-remote Pacific Basin communities assess (ocean) flood risk of various threshold severities under current and future sea levels. Downscaling methods include use of local TG observations of various record lengths (e.g., 1–19+ years), and if no in situ data exist, tide range information. Low-probability RFA ESLs localized at TG locations are higher than other recent assessments and generally more precise (narrower confidence intervals). This is due to increased rare-event sampling as measured by numerous TGs regionally. For example, the 100-year ESLs (1% annual chance event) are 0.15 m and 0.25 higher (median at-site difference) than a single-TG based analysis that is closely aligned to those supporting recent Intergovernmental Panel on Climate Change (IPCC) assessments and a third-generation global tide and surge model, respectively. Height thresholds for damaging flood levels along Pacific Basin coastlines are proposed. These floods vary between about 0.6–1.2 m or more above the average highest tide and are associated with warning levels of the U.S. National Oceanic and Atmospheric Administration (NOAA). The risk of a damaging flood assessed by the RFA ESL probabilities under contemporary sea levels have about a (median) 20–25-year return interval (4–5% annual chance) for TG locations along Pacific coastlines. 
Considering localized sea level rise projections of the IPCC associated with a global rise of about 0.5 m by 2100 under a reduced emissions scenario, damaging floods are projected to occur annually by 2055 and >10 times/year by 2100 at the majority of TG locations.

... Likelihood-based inference for univariate extreme value distributions is known to yield biased estimators for GEV parameters, but penalized likelihoods can reduce estimation bias (Coles and Dixon, 1999; Martins and Stedinger, 2000). Penalized likelihoods have been incorporated into spatial models for marginal extremes (Opitz et al., 2018; Schliep et al., 2010). ...

Uncertainty in return level estimates for rare events, like the intensity of large rainfall events, makes it difficult to develop strategies to mitigate related hazards, like flooding. Latent spatial extremes models reduce uncertainty by exploiting spatial dependence in statistical characteristics of extreme events to borrow strength across locations. However, these estimates can have poor properties due to model misspecification: many latent spatial extremes models do not account for extremal dependence, which is spatial dependence in the extreme events themselves. We improve estimates from latent spatial extremes models that make conditional independence assumptions by proposing a weighted likelihood that uses the extremal coefficient to incorporate information about extremal dependence during estimation. This approach differs from, and is simpler than, directly modeling the spatial extremal dependence; for example, by fitting a max-stable process, which is challenging to fit to real, large datasets. We adopt a hierarchical Bayesian framework for inference, use simulation to show the weighted model provides improved estimates of high quantiles, and apply our model to improve return level estimates for Colorado rainfall events with 1% annual exceedance probability.

... From (3.3) we see that errors in the estimated value of ξ_i may be magnified in the estimate of y_p, e.g., due to the dependence of y_p on 1/ξ_i and the fact that in many environmental applications, ξ_i is close to zero. When the sample size is small, the maximum likelihood estimator of ξ_i can have high bias, leading to absurd estimated return levels that are orders of magnitude beyond what would be deemed physically possible, and several authors (Coles and Dixon, 1999; Martins and Stedinger, 2000) have proposed adjustments to the log-likelihood function (3.2) to overcome this difficulty. ...
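The sensitivity described here can be illustrated with the standard GEV quantile formula. Equation (3.3) is not reproduced in the snippet, so the sketch assumes the usual form, return level = μ - (σ/ξ)[1 - (-log(1 - p))^(-ξ)], with the Gumbel limit as ξ → 0. The function and argument names are hypothetical.

```python
import math

def gev_return_level(p, mu, sigma, xi):
    """Level exceeded with probability p in any one block (e.g. one year)."""
    neg_log = -math.log(1.0 - p)
    if abs(xi) < 1e-8:
        # Gumbel limit as xi -> 0
        return mu - sigma * math.log(neg_log)
    return mu - (sigma / xi) * (1.0 - neg_log ** (-xi))

# Same location and scale, different shapes: the 100-year level (p = 0.01)
# grows quickly as the shape estimate moves upward.
levels = [gev_return_level(0.01, 0.0, 1.0, xi) for xi in (0.0, 0.1, 0.3)]
```

With location 0 and scale 1, moving ξ from 0 to 0.3 roughly doubles the 100-year level, which is why a noisy shape estimate from a short record can produce implausible return levels.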

In this study we consider the problem of detecting and quantifying changes in the distribution of the annual maximum daily maximum temperature (TXx) in a large gridded data set of European daily temperature during the years 1950-2018. Several statistical models are considered, each of which models TXx using a generalized extreme value (GEV) distribution with the GEV parameters varying smoothly over space. In contrast to several previous studies which fit independent GEV models at the grid box level, our models pull information from neighbouring grid boxes for more efficient parameter estimation. The GEV location and scale parameters are allowed to vary in time using the log of atmospheric CO2 as a covariate. Changes are detected most strongly in the GEV location parameter with the TXx distributions generally shifting towards hotter temperatures. Averaged across our spatial domain, the 100-year return level of TXx based on the 2018 climate is approximately 2 °C hotter than that based on the 1950 climate. Moreover, also averaging across our spatial domain, the 100-year return level of TXx based on the 1950 climate corresponds approximately to a 6-year return level in the 2018 climate.

... Based on both theory and empirical evidence (see e.g. AghaKouchak and Nasrollahi 2010; Blanchet et al. 2016; Coles 2001; Coles and Dixon 1999; Coles et al. 2003; El Adlouni and Ouarda 2010; Engeland et al. 2004; Gubareva and Gartsman 2010; Katz 2013; Katz et al. 2002; Koutsoyiannis 2004a, 2004b; Koutsoyiannis and Langousis 2011; Langousis et al. 2013, 2016b; Lucarini et al. 2016; Mélèse et al. 2018; Onibon et al. 2004; Papalexiou and Koutsoyiannis 2013; Tyralis and Langousis 2019; Veneziano et al. 2009, 2007; Villarini 2012; Villarini et al. 2011, 2012), in the present study we model annual maxima of daily rainfall using the Generalized Extreme Value (GEV) distribution model. In what follows, we recall the essentials of GEV formulation, while referring the interested reader to the wide literature on the topic for a more in-depth discussion (see e.g. ...

We investigate and discuss limitations of the approach based on homogeneous regions (hereafter referred to as regional approach) in describing the frequency distribution of annual rainfall maxima in space, and compare its performance with that of a boundaryless approach. The latter is based on geostatistical interpolation of the at-site estimates of all distribution parameters, using kriging for uncertain data. Both approaches are implemented using a generalized extreme value theoretical distribution model to describe the frequency of annual rainfall maxima at a daily resolution, obtained from a network of 256 raingauges in Sardinia (Italy) with more than 30 years of complete recordings, and approximate density of 1 gauge per 100 km². We show that the regional approach exhibits limitations in describing local precipitation features, especially in areas characterized by complex terrain, where sharp changes to the shape and scale parameters of the fitted distribution models may occur. We also emphasize limitations and possible ambiguities arising when inferring the distribution of annual rainfall maxima at locations close to the interface of contiguous homogeneous regions. Through implementation of a leave-one-out cross-validation procedure, we evaluate and compare the performances of the regional and boundaryless approaches miming ungauged conditions, clearly showing the superiority of the boundaryless approach in describing local precipitation features, while avoiding abrupt changes of distribution parameters and associated precipitation estimates, induced by splitting the study area into contiguous homogeneous regions.

... They establish the best fit distribution among five extreme value distributions by classical modelling. On the other side, Coles and Dixon (1999), Coles (2001), and Ahmad et al. (2019) used likelihood-based inference methods for modelling extreme value models. Researchers are more interested in Bayesian modelling than in a classical setup to obtain more valuable results about the uncertainty of extreme environmental events. ...

In this paper, the modeling of extreme rainfall is carried out in Pakistan by analyzing annual daily maximum rainfall data via frequentist and Bayesian approaches. In frequentist settings, the parameters and return levels of the best fitted probabilistic model (i.e. generalized extreme value) are estimated using the maximum likelihood and linear moments methods. On the other side, under the Bayesian framework, the parameters and return levels are calculated for both non‐informative and informative priors. This task is completed with the help of the Markov Chain Monte Carlo method using the Metropolis‐Hastings algorithm. This study also highlights a procedure to build an informative prior through historical records of the underlying processes from other nearby weather stations. The findings attained from the Bayesian paradigm demonstrate that the posterior inference could be affected by the choice of past knowledge used for the construction of informative priors. Additionally, the best method for the modeling of extreme rainfall over the country is decided with the support of assessment measures. In general, the Bayesian paradigm linked with the informative priors offers an adequate estimation scheme in terms of accuracy as compared to frequentist methods, accounting for uncertainty in parameters and return levels. Hence, these findings are very helpful in adopting accurate flood protection measures and designing infrastructure over the country.

... The maximum likelihood method and the L-moment method are normally used for parameter estimation in GEV. The L-moment method has obvious advantages in terms of computational efficiency and small-sample reliability (Coles & Dixon, 1999). However, there are many shortcomings for L-moment as the calculation is complicated, low precision leads to poor sensitivity, and parameter calculation may cause error accumulation (Jin, 2007). ...

Precipitation anomaly grades are usually defined by the percentage anomaly (Pa) or probability distribution (Pd) methods. However, differences between the two may lead to different estimates for the same events, making it difficult to judge the severity of the events. Here, we quantify the difference in measuring precipitation variability in China between the Pa and Pd methods and analyze the physical meaning and influencing factors of the difference. The results show that Pa tends to underestimate the domain of wetness (e.g., it underestimated 7.67% in June 2018) and overestimate/underestimate the severity of extreme wetness (>1.5σ)/dryness (<–1.3σ) compared to the Pd method. This is because precipitation has a positively skewed distribution, and precipitation maximum values have a larger influence on Pa than on Pd. In addition, uniform Pa thresholds for classifying drought grades at all stations are unreasonable: because the range of actual Pa values is asymmetrical, Pa fails to symmetrically reflect the degree of drought and flood. Spatially, large differences usually appear in areas with extreme precipitation. Therefore, the more stations with extreme precipitation, the greater the spatial dispersion of precipitation and the greater the total difference between Pa and Pd over the whole of China. We further find that the Pa-Pd difference is significantly related to a concurrent warming of the tropical Indian Ocean and the tropical Pacific sea surface in spring. Moreover, the Pa-Pd difference is rising at 0.022σ/10a with the increase of extreme events associated with the ocean warming, which deserves attention from decision-making departments.

... A wide range of estimates obtained by the principle of moment-type statistical estimating are proposed in (Kijko and Sellevoll, 1989; 1992; Kijko and Graham, 1998; Coles and Dixon, 1999; Kijko, 2004; Holschneider et al., 2011; Kijko and Singh, 2011; Lasocki and Urban, 2011; Zoller and Holschneider, 2016; Vermeulen and Kijko, 2017; Beirlant et al., 2019). To construct these estimates, initially, the formula is written out for the mean (or some higher moment) of some consistent estimator that converges in probability to the true value as n → ∞. ...

... It gives reliable results and is simple. It is frequently used for large datasets (as in our case) (Coles and Dixon 1999). The Q-Q plot is used to select the best fit function. ...

This paper presents an application of the L-moments and L-moment ratio diagrams (LMRD) to the analysis of hydrological data at regional (country) scale. Existing research focuses on two main areas of the analysis: statistical analysis using LMRD and regression analysis. Further research mixes both approaches, applying regression analysis to L-moments. Another direction of the research is clustering of the climatic and physiographic catchment properties and its validation using LMRD. However, LMRD plots can be separately used as the clustering domain. It is proposed to decompose the features into some classes, and then present these results on the LMRD. Such plots constitute the source for the clustering. The obtained clusters are then validated against k-means clustering performed in the LMRD domain. Results show that statistical L-moments analysis can be improved with data mining clustering algorithms. Such a combination delivers a new perspective for the interpretation of the results. It is shown that clustering in the LMRD domain is consistent with the k-means clustering. This is another argument showing that L-moment diagrams can be considered a very powerful and informative tool for hydrologists, enabling comparison on a regional basis with respect to various catchment properties. The method is validated on daily river flow data from 290 gauges covering the whole of Poland.

... where μ is the location parameter (the mean of the data), σ is the scale parameter (the variance of the data denoting the distribution variability), and ξ is the shape parameter (the "tail" of the GEV distribution). These parameters can be estimated by L-moments (Bílková, 2012;Coles & Dixon, 1999). ...
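As a rough illustration of how GEV parameters can be recovered from L-moments, the sketch below uses sample probability weighted moments together with Hosking's widely used rational approximation for the shape. It is not taken from the citing paper: the function names are hypothetical, and the shape is returned in the ξ sign convention (ξ = -k in Hosking's notation). In general the location and scale of a GEV differ from the sample mean and variance, which is why they are solved for from the L-moments rather than read off directly.

```python
import math

def sample_l_moments(data):
    """Return (l1, l2, t3): first two sample L-moments and the L-skewness."""
    x = sorted(data)
    n = len(x)
    # Unbiased probability weighted moments; j is the zero-based rank.
    b0 = sum(x) / n
    b1 = sum(j * x[j] for j in range(n)) / (n * (n - 1))
    b2 = sum(j * (j - 1) * x[j] for j in range(n)) / (n * (n - 1) * (n - 2))
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    return l1, l2, l3 / l2

def gev_from_l_moments(data):
    """Estimate (mu, sigma, xi) of a GEV by the method of L-moments."""
    l1, l2, t3 = sample_l_moments(data)
    # Hosking's approximation for k (note k = -xi).
    c = 2.0 / (3.0 + t3) - math.log(2) / math.log(3)
    k = 7.8590 * c + 2.9554 * c * c
    if abs(k) < 1e-6:
        # Gumbel limit (shape effectively zero).
        sigma = l2 / math.log(2)
        mu = l1 - 0.5772156649015329 * sigma
    else:
        g = math.gamma(1 + k)
        sigma = l2 * k / ((1 - 2 ** (-k)) * g)
        mu = l1 - sigma * (1 - g) / k
    return mu, sigma, -k
```

The approximation is intended for moderate shape values; for strongly heavy-tailed data a numerical solution of the L-skewness equation may be preferable.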

An explicit proxy of solar activity on the Earth is the auroral display. The auroral oval can extend to lower latitudes during geomagnetic storms triggered by explosive solar activity. The lower the latitude of the auroral oval, the stronger the solar activity. Systematic auroral records in Europe (mainly central Europe, with geomagnetic latitude lower than 55°) can be dated back to 1000 AD and can be used to characterize solar activity in the past millennium. However, the temporal distribution of the 6,262 auroral records between 1000 and 1900 AD is seriously uneven, with 85.6% of the records appearing after 1700 AD. Here we use extreme value theory (EVT) to evaluate the effectiveness of characterizing solar activity with the auroral records before 1700 AD. Due to the inhomogeneity of the auroral records, the EVT analysis has been conducted separately for the data before 1700 and after 1700, finding that the 100‐year auroral frequency is 25.66 [19.67, 33.67] based on the auroral data in 1000–1699, and 283.71 [183.14, 417.58] based on the auroral data in 1700–1900. The predictions of both the 50‐year and 100‐year auroral frequencies before 1700 AD agree well with the solar activity index.

... We used non-informative priors for the location and scale parameters (i.e. the location parameter and the log-transformed scale parameter were uniform). A normal distribution with standard deviation 0.2 and expectation 0.0 was used as the prior for the shape parameter k, inspired by Coles and Dixon (1999), Martins and Stedinger (2000), and Renard et al. (2013). ...
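The prior specification described in this snippet can be written as a penalized negative log-likelihood: flat priors on the location and on the log of the scale add no term, while the normal prior on the shape adds a quadratic penalty. The following is a minimal sketch under those assumptions; the function names and the symbol `xi` for the shape are illustrative choices, not the citing authors' code.

```python
import math

def gev_nll(mu, log_sigma, xi, data):
    """Negative log-likelihood of a GEV, parametrized by log(sigma)."""
    sigma = math.exp(log_sigma)
    n = len(data)
    if abs(xi) < 1e-8:
        # Gumbel limit of the GEV.
        z = [(x - mu) / sigma for x in data]
        return n * log_sigma + sum(z) + sum(math.exp(-zi) for zi in z)
    t = [1 + xi * (x - mu) / sigma for x in data]
    if min(t) <= 0:
        # At least one observation lies outside the support.
        return math.inf
    return (n * log_sigma
            + (1 + 1 / xi) * sum(math.log(ti) for ti in t)
            + sum(ti ** (-1 / xi) for ti in t))

def neg_log_posterior(mu, log_sigma, xi, data, prior_sd=0.2):
    """Negative log-posterior up to an additive constant.

    Flat priors on mu and log(sigma) contribute nothing; the
    Normal(0, prior_sd) prior on the shape adds xi^2 / (2 * prior_sd^2).
    """
    return gev_nll(mu, log_sigma, xi, data) + 0.5 * (xi / prior_sd) ** 2
```

Minimizing `neg_log_posterior` over the three parameters would give a posterior mode; because the prior is symmetric about zero, the penalty is the same under either sign convention for the shape.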

The Glomma River is the largest in Norway, with a catchment area of 154 450 km2. People living near the shores of this river are frequently exposed to destructive floods that impair local cities and communities. Unfortunately, design flood predictions are hampered by uncertainty since the standard flood records are much shorter than the requested return period and the climate is also expected to change in the coming decades. Here we combine systematic historical and paleo information in an effort to improve flood frequency analysis and better understand potential linkages to both climate and non-climatic forcing. Specifically, we (i) compile historical flood data from the existing literature, (ii) produce high-resolution X-ray fluorescence (XRF), magnetic susceptibility (MS), and computed tomography (CT) scanning data from a sediment core covering the last 10 300 years, and (iii) integrate these data sets in order to better estimate design floods and assess non-stationarities. Based on observations from Lake Flyginnsjøen, receiving sediments from Glomma only when it reaches a certain threshold, we can estimate flood frequency in a moving window of 50 years across millennia revealing that past flood frequency is non-stationary on different timescales. We observe that periods with increased flood activity (4000–2000 years ago and

The estimation of the end-to-end delay in modern communication networks is of high importance to support multiple services but also for the management of the network resources. In this paper, we propose an end-to-end delay model and estimation methodology for heterogeneous networks. The estimation method allows the computation of the end-to-end delay distribution parameters when a small number of end-to-end delay samples is available through probe packets. The proposed technique is evaluated for different datasets, including networks operating with multiple access technologies and different upper-layer protocols, so the heterogeneity of the networks and protocols can be taken into account. The evaluation results show that the Generalized Extreme Value distribution can be used to approximate the end-to-end delay and high accuracy is achieved even when only a small number of probe packets is available. Finally, the estimation error confirms the effectiveness of the proposed method for a broad range of network scenarios.

PREDICTION OF MAXIMUM PRECIPITATION IN THE MUNICIPALITY OF SILVIANÓPOLIS-MG: CLASSICAL AND BAYESIAN APPROACHES. THAÍS BRENDA MARTINS (1); GISELE CAROLINA ALMEIDA (2); FABRICIO GOEKING AVELAR (3) AND LUIZ ALBERTO BEIJO (4). (1, 2) Master's students, Graduate Program in Applied Statistics and Biometrics, Universidade Federal de Alfenas, Rua Gabriel Monteiro da Silva, 700, centro, Alfenas-MG, CEP: 37130-001, Brazil, thaismartins@outlook.com.br, giselealmeidac08@gmail.com; (3, 4) Professors, Department of Statistics, Universidade Federal de Alfenas, Rua Gabriel Monteiro da Silva, 700, centro, Alfenas-MG, CEP: 37130-001, Brazil, fabricio@unifal-mg.edu.br, luiz.beijo@unifal-mg.edu.br. ABSTRACT Extreme rainfall can cause damage such as soil erosion and floods, damage to hydraulic works, and rupture of dams and reservoirs, among others. Knowledge about the expected maximum rainfall in a given region can assist in the planning of agricultural activities and hydraulic constructions in order to avoid damage and losses. Aiming to predict the maximum annual rainfall of the city of Silvianópolis-MG for return periods of 5, 10, 25, 50 and 100 years, the generalized extreme value distribution was fitted to the historical rainfall data series. The accuracy and mean prediction error were analyzed to evaluate the estimates provided by the maximum likelihood method and Bayesian inference. Information about the maximum rainfall in the cities of Lavras-MG and Machado-MG was used to elicit the prior distribution. The application of Bayesian inference led to smaller prediction errors, showing the efficiency of incorporating prior knowledge in the study of maximum rainfall. The prior distribution based on information for Lavras gave the smallest prediction error for the maximum annual rainfall of Silvianópolis. Keywords: Extreme values, return levels, priors. MARTINS, T. B.; ALMEIDA, G. C.; AVELAR, F. G.; BEIJO, L. A.

With ongoing climate change, analysis of trends in maximum annual daily river flow is of interest. Flow magnitude and timing during the year were investigated in this study. Observations from 11 unregulated rivers in northern Sweden were analysed, using extreme‐value distributions with time‐dependent parameters. The Mann–Kendall test was used to investigate possible trends. The extreme‐value statistics revealed no significant trends for the stations considered, but the Mann–Kendall test showed a significant upward trend for some stations. For timing of maximum flow (day of the year), the Mann–Kendall test revealed significant downward trends for two stations (with the longest records). This implies that the day of the maximum flow is occurring earlier in the year in northern Sweden.
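For readers unfamiliar with the Mann–Kendall test mentioned in this abstract, the sketch below computes the S statistic, its normal approximation with continuity correction, and a two-sided p-value. It assumes no tied values (the variance formula omits the tie correction) and serially independent data; the function name is illustrative and is not from the study.

```python
import math

def mann_kendall(series):
    """Return (S, Z, p) for the Mann-Kendall trend test (no tie correction)."""
    n = len(series)
    s = 0
    # S counts concordant minus discordant pairs in time order.
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity-corrected standard normal score.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p
```

A negative Z for a day-of-year series would correspond to the maximum flow occurring earlier in the year, which is the kind of downward trend the abstract reports for two stations.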

Foodborne disease outbreaks are rare events that can be extremely costly in terms of public health as well as monetary losses for industry and government. These events can overwhelm the local public healthcare network and exceed the capacity of epidemiologists and local public health officials to investigate and manage the outbreak. Planning and allocation of sufficient resources requires an understanding of both the frequency and magnitude of large foodborne outbreaks. Describing these two characteristics is difficult because most statistical methods describe central tendencies of the phenomena under study. An exception is extreme value theory (EVT), which intends to estimate the size and frequency of adverse events as large as, or larger than, those previously observed. This study applies extreme value theory methods to foodborne disease outbreak data collected in the United States between 1973 and 2016. A brief summary of the data, including changes in the surveillance system and their effect on the outbreak data, is provided. Estimates of the outbreak size expected to be exceeded within time periods of 10, 20, 40 and 100 years, referred to as the return level, ranged from 2500 to 10,400. The estimated time between outbreaks (i.e., the return period) of at least 500, 5,000, 10,000 and 20,000 cases ranged from 1 to greater than 400 years.
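The relationship between return level and return period can be sketched as follows, assuming for illustration that annual maxima follow a GEV distribution. The abstract does not say which extreme value model the study fitted, so both the model choice and the function names here are assumptions.

```python
import math

def gev_return_level(T, mu, sigma, xi):
    """Level exceeded on average once every T years (T > 1)."""
    y = -math.log(1 - 1 / T)
    if abs(xi) < 1e-8:
        return mu - sigma * math.log(y)          # Gumbel limit
    return mu + sigma / xi * (y ** (-xi) - 1)

def gev_return_period(z, mu, sigma, xi):
    """Average number of years between annual maxima exceeding z."""
    if abs(xi) < 1e-8:
        cdf = math.exp(-math.exp(-(z - mu) / sigma))
    else:
        t = 1 + xi * (z - mu) / sigma
        if t <= 0:
            # z lies outside the support: below the lower bound when
            # xi > 0, above the upper bound when xi < 0.
            cdf = 0.0 if xi > 0 else 1.0
        else:
            cdf = math.exp(-t ** (-1 / xi))
    return math.inf if cdf >= 1.0 else 1 / (1 - cdf)
```

The two functions are inverses of each other on the support of the distribution, so a level computed for T = 100 maps back to a return period of 100 years.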

The problem of evaluating the maximum possible regional earthquake magnitude (Mmax) is reviewed and analyzed. Two aspects of this topic are specified: statistical, and historical and paleoseismic. The frequentist and the fiducial approaches used in the problem are analyzed and compared. General features of the Bayesian approach are discussed within the framework of the Mmax problem. A useful connection between the quantiles of a single event and those of the maximum event in a future time interval T is derived. Various estimators of Mmax used in seismological practice are considered and classified. Different methods of estimation are compared: the statistical moment method, the Bayesian method, the estimators based on extreme value theory (EVT), and the estimators using order statistics. A comparison of several well-known estimators of Mmax in the framework of the truncated Gutenberg–Richter law is made. As a more adequate and stable alternative to Mmax, the quantiles Qq(T) of the maximum earthquake in a future time horizon T are proposed and analyzed. These quantiles permit us to select a time horizon T and quantile level q for a reliable estimation of maximum possible magnitudes. The instability of Mmax-estimates compared to Qq(T)-estimates is demonstrated. The main steps of the Qq(T)-quantile estimation procedure are highlighted. The historical and paleoseismic data are used, and additional evidence of the low robustness of the Mmax-parameter is found. Evidence of the possibility of earthquake magnitudes well exceeding the Mmax-value obtained for the truncated Gutenberg–Richter law is also found. The present situation in the domain of Mmax-evaluation is discussed.

When considering future adaptation to climate change in UK fluvial flood alleviation schemes, the current recommendation by the Environment Agency (England) is to increase peak design flood flows by a preselected percentage. This allowance varies depending on the period for which the estimate is being made, the vulnerability of the development being considered and its location. Recently, questions have been raised as to whether these percentage uplifts should be kept the same, or whether change has already happened within the baseline period and so uplifts should be reduced. A complicating factor is that changes in flood frequency can occur for reasons in addition to climate change, such as land-use change or natural variability. This article describes current approaches taken by different stakeholders for catchments in England and Wales to account for climate change, and discusses these allowances where there is already an observed presence of trend in flood regimes. Theil–Sen estimators of trend were used in comparing non-stationary and stationary flood frequency curves with allowances applied, leading to a recommendation of evaluating non-stationary models at 1990, the end of the reference period. Examples were explored such as the Eden catchment, which was heavily affected by Storm Desmond in December 2015.

The existence of an upper limit for extremes of quantities in the earth sciences, e.g. for river discharge or wind speed, is sometimes suggested. Estimated parameters in extreme-value distributions can assist in interpreting the behaviour of the system. Using simulation, this study investigated how sample size influences the results of statistical tests and related interpretations. Commonly used estimation techniques (maximum likelihood and probability-weighted moments) were employed in a case study; the results were applied in judging time series of annual maximum river flow from two stations on the same river, but with different lengths of observation records. The results revealed that sample size is crucial for determining the existence of an upper bound.

Extreme surface winds and temperatures were estimated by the dynamical downscaling method combined with generalized extreme value theory for the construction of Hardanger Suspension Bridge and the maintenance of Sotra Bridge in southwestern Norway. The Weather Research and Forecasting Model was used to downscale the Norwegian Earth System Model data from 2.5° × 1.8° to 1 km × 1 km horizontal grids. Simulations were performed for the control period, the 1990s, and the projection period, the 2050s, under the RCP8.5 radiative forcing scenario. Monthly maximum winds were compared with observations at three nearby observation stations for the warm and the cold seasons as well as the annual period. The simulated extreme wind distributions are in good agreement with the observed distributions in the coastal area, but have systematic positive deviations on the mountain. An extrapolation method was used to project extreme winds for the early and late parts of this century. Comparison of the simulated extreme winds between the 1990s and the 2050s shows that future extreme winds are unlikely to change with statistical significance during the cold season, but tend to decrease in mountainous and coastal areas with statistical significance during the warm season. These changes possibly reflect shifts in regional storm activity associated with changes in the North Atlantic Oscillation and the effects of the local mountain topography. For surface maximum and minimum temperatures, the model reproduces the spread of the probability density functions well. Both distributions shift towards higher temperatures in the 2050s. Keywords: Regional climate, Dynamical downscaling, WRF model, Extreme wind, Extreme surface temperature, Complex terrain

Our ability to quantify the likelihood of present-day extreme sea level (ESL) events is limited by the length of tide gauge records around the UK, and this results in substantial uncertainties in return level curves at many sites. In this work, we explore the potential for a state-of-the-art climate model, HadGEM3-GC3, to help refine our understanding of present-day coastal flood risk associated with extreme storm surges, which are the dominant driver of ESL events for the UK and wider European shelf seas. We use a 483-year present-day control simulation from HadGEM3-GC3-MM (1/4° ocean, approx. 60 km atmosphere in mid-latitudes) to drive a north-west European shelf seas model and generate a new dataset of simulated UK storm surges. The variable analysed is the skew surge (the difference between the high water level and the predicted astronomical high tide), which is widely used in analysis of storm surge events. The modelling system can simulate skew surge events comparable to the catastrophic 1953 North Sea storm surge, which resulted in widespread flooding, evacuation of 32 000 people, and hundreds of fatalities across the UK alone, along with many hundreds more in mainland Europe. Our model simulations show good agreement with an independent re-analysis of the 1953 surge event at the mouth of the river Thames. For that site, we also revisit the assumption of skew surge and tide independence. Our model results suggest that at that site for the most extreme surges, tide–surge interaction significantly attenuates extreme skew surges on a spring tide compared to a neap tide. Around the UK coastline, the extreme tail shape parameters diagnosed from our simulation correlate very well (Pearson's r greater than 0.85), in terms of spatial variability, with those used in the UK government's current guidance (which are diagnosed from tide gauge observations), but ours have smaller uncertainties. 
Despite the strong correlation, our diagnosed shape parameters are biased low relative to the current guidance. This bias is also seen when we replace HadGEM3-GC3-MM with a reanalysis, so we conclude that the bias is likely associated with limitations in the shelf sea model used here. Overall, the work suggests that climate model simulations may prove useful as an additional line of evidence to inform assessments of present-day coastal flood risk.

The generalised extreme value (GEV) distribution is traditionally applied to model extreme events and their return periods. The GEV distribution has three parameters (location, scale and shape), which need to be determined before its application. Different techniques have been developed to estimate the parameters of the GEV distribution, but there is no specific guidance regarding the optimal method. This paper investigated the sensitivity of the different parameter estimation techniques that are commonly used in the application of the GEV distribution. A stationary GEV was adopted for the homogeneous data sets, whereas a non-stationary GEV was implemented for the non-homogeneous data sets. Four methods were applied in the estimation of the GEV distribution parameters for four different timescales. The methods were applied in extreme rainfall modelling using extreme rainfall data in Tasmania, Australia as a case study. It was found that the choice of GEV parameter estimation method does not change the GEV type for Tasmanian extreme rainfall. The length of the data series has a significant influence on the values of the GEV distribution parameters. The Fréchet type GEV distribution is suitable for most of the analysed rainfall stations in Tasmania.

The use of the generalised extreme value (GEV) distribution to model extreme climatic events and their return periods is widely popular. However, it is important to calculate the three parameters (location, scale and shape) of the GEV distribution before its application. To estimate the parameters of the GEV distribution, different parameter estimation techniques are available in the literature. Nevertheless, there are no set guidelines for adopting a specific parameter estimation technique for the application of the GEV distribution. The main objective of this study is a sensitivity analysis of the parameter estimation techniques commonly available for the GEV distribution. Extreme rainfall modelling in Tasmania, Australia was carried out using four different parameter estimation techniques of the GEV distribution. The homogeneity of the extreme data sets was tested using the Buishand Range Test. Based on the estimated errors (MSE and MAE), the L-moments parameter estimation technique is appropriate for data series in which outliers may be present. The GEV distribution parameters can vary considerably due to variation in the length of the data series. Finally, the Fréchet (type II) GEV distribution is the most appropriate distribution for most of the rainfall stations analysed in Tasmania.

This book is the first in French devoted exclusively to the quantitative study of snow avalanches. Their phenomenology, together with the computational foundations for characterizing and modelling them (snowfall hydrology, calibration and simulation techniques), is described in detail. The presentation is complemented by numerous practical topics (avalanche mapping, databases, and so on) and by a comprehensive account of current regulations on zoning and risk prevention. Particular emphasis is placed on the necessary complementarity between a naturalistic (qualitative) approach and a more mechanistic (quantitative) scientific view of snow science. This reference work was written by some of the leading specialists in the field. It is aimed particularly at practitioners in engineering consultancies and government agencies, as well as at anyone seeking information on current avalanche computation techniques.

When estimating design flood discharges, the practitioner often faces a limited data set. In our research, we proposed three new probabilistic models specifically designed to estimate flood regime characteristics in a partially gauged context. Two of these are so-called regional models, i.e. they incorporate information from stations whose behaviour is deemed similar to that of the study site. These models, based on Bayesian theory, proved highly robust to the degree of heterogeneity among the sites in the region. Likewise, for the estimation of high quantiles (T >= 50 years), the idea of a regional parameter controlling the extrapolation appeared relevant, but it must be incorporated flexibly rather than imposed within the likelihood. Since the most valuable information available to the practitioner is that from the study site, the third proposed model returns to estimation from the contemporary data at the study site alone. This new model uses richer information than that obtained from a classical sampling of i.i.d. maxima, since the whole time series is exploited. Consequently, even with only five years of records, and thanks to a model of the dependence between successive observations, the size of the samples used is much larger. We showed that, for estimating flood quantiles, this model very clearly outperforms the local approaches conventionally used in hydrology. This result holds all the more as return periods become large. Finally, by construction, this approach also yields a probabilistic estimate of flood dynamics.

The Glomma river is the largest in Norway, with a catchment area of 154 450 km2. People living near the shores of this river are frequently exposed to destructive floods that impair local cities and communities. Unfortunately, design flood predictions are hampered by uncertainty, since the standard flood records are much shorter than the requested return period and the climate is also expected to change in the coming decades. Here we combine systematic, historical and paleo information in an effort to improve flood frequency analysis and better understand potential linkages to both climate and non-climatic forcing. Specifically, we (i) compile historical flood data from the existing literature, (ii) produce high-resolution X-ray fluorescence (XRF), magnetic susceptibility (MS) and computed tomography (CT) scanning data from a sediment core covering the last 10 300 years, and (iii) integrate these data sets in order to better estimate design floods and assess non-stationarities. Based on observations from Lake Flyginnsjøen, receiving sediments from Glomma only when it reaches a certain threshold, we can estimate flood frequency in a moving window of 50 years across millennia, revealing that past flood frequency is non-stationary on different time scales. We observe that periods with increased flood activity (4000–2000 years ago and

The Generalised Extreme Value (GEV) distribution can be used effectively to model extreme climatic events such as bushfires. The main difficulty in using the GEV distribution is the accurate determination of its three parameters (location, scale, and shape); nevertheless, there are no specific guidelines for identifying the most suitable parameter estimation technique of the GEV distribution for bushfire studies. In this study, the influence of different GEV parameter estimation techniques was investigated in Victoria, Australia for extreme bushfire event modelling, using the annual maximum forest fire danger index (FFDI), which combines several climatic and fuel variables to indicate the potential for bushfires to propagate and withstand suppression. Four GEV parameter estimation methods, namely Maximum Likelihood Estimation (MLE), Generalised Maximum Likelihood Estimation (GMLE), Bayesian and L-moments, were applied to data at two different timescales (the full data set and the last 10 years of the full data set) to estimate the GEV distribution parameters. The return levels of FFDI for different Average Recurrence Intervals (ARI) were estimated using these four methods and two timescales. The study demonstrates that the Fréchet (type II) extreme value distribution is suitable for modelling the annual maximum FFDI for most of the selected stations; nonetheless, GEV distribution parameters can vary considerably due to variation in the length of the data series. Two error measures, the mean square error (MSE) and the mean absolute error (MAE), were used to identify the most suitable parameter estimation technique of the GEV distribution. The study reveals that L-moments can be used even with the smaller data set. In addition, L-moments is the most appropriate parameter estimation technique of the GEV distribution because it gives the lowest MSE and MAE values for most of the stations. 
The outcomes of this research are pivotal for Victorian public and private stakeholders to forecast the severity and intensity of imminent bushfire events due to recent bushfire events in fire prone areas.

Estimates of the parameters and quantiles of the Gumbel distribution by the methods of probability weighted moments, (conventional) moments, and maximum likelihood were compared. Results were derived from Monte Carlo experiments by using both independent and serially correlated Gumbel numbers. The method of probability weighted moments was seen to compare favorably with the other two techniques.
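As a rough illustration of the kind of comparison this abstract describes, the sketch below fits a Gumbel distribution by probability weighted moments and by conventional moments. It is not the authors' code: the function names are hypothetical, the unbiased sample estimator of the first probability weighted moment is an assumed choice, and only independent (not serially correlated) data are simulated.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def gumbel_sample(loc, scale, n, rng):
    """Draw n Gumbel variates by inverting F(x) = exp(-exp(-(x - loc) / scale))."""
    out = []
    for _ in range(n):
        # Keep u strictly inside (0, 1) so both logarithms are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        out.append(loc - scale * math.log(-math.log(u)))
    return out


def gumbel_pwm(sample):
    """Probability weighted moment estimates (loc, scale) for the Gumbel law."""
    xs = sorted(sample)
    n = len(xs)
    b0 = sum(xs) / n
    # Unbiased estimate of E[X F(X)]: weight (rank - 1) / (n - 1) on each order statistic.
    b1 = sum(i * x for i, x in enumerate(xs)) / (n * (n - 1))
    scale = (2.0 * b1 - b0) / math.log(2.0)  # second L-moment equals scale * ln 2
    loc = b0 - EULER_GAMMA * scale           # mean equals loc + gamma * scale
    return loc, scale


def gumbel_moments(sample):
    """Conventional moment estimates (loc, scale) for the Gumbel law."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    scale = math.sqrt(6.0 * var) / math.pi   # variance equals (pi * scale)^2 / 6
    loc = mean - EULER_GAMMA * scale
    return loc, scale


if __name__ == "__main__":
    data = gumbel_sample(10.0, 2.0, 50, random.Random(0))
    print("PWM:    ", gumbel_pwm(data))
    print("moments:", gumbel_moments(data))
```

Repeating the last three lines over many simulated samples and comparing the spread of the two sets of estimates would give a small Monte Carlo experiment in the spirit of the one summarised above.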

Maximum likelihood and Bayesian estimators are developed and compared for the three-parameter Weibull distribution. For the data analysed in the paper, the two sets of estimators are found to be very different. The reasons for this are explored, and ways of reducing the discrepancy, including reparametrization, are investigated. Our overall conclusion is that there are practical advantages to the Bayesian approach.

Let F be in the domain of attraction of the type I extreme value distribution. The behaviour of the maximum likelihood estimates when fitting the incorrect model Λ(ax+b) to the distribution of the maximum Fn(x) is investigated. The main result is that under widely applicable conditions, the estimated distribution function is consistent for an optimal type I distribution in the sense of [3] and that the parameter estimates of a and b are asymptotically normal. Some of the implications for statistical inference problems are discussed. Finally, similar results for the case where three-parameter extreme-value approximations are fitted are hypothesized. On this basis we give some general conclusions on the problem of choice of extrema model for Fn(x).

We argue that prediction intervals based on predictive likelihood do not correct for curvature with respect to the parameter value when they implicitly approximate an unknown probability density. Partly as a result of this difficulty, the order of coverage error associated with predictive intervals and predictive limits is equal to only the inverse of sample size. In this respect those methods do not improve on the simpler, ‘naive’ or ‘estimative’ approach. Moreover, in cases of practical importance the latter can be preferable, in terms of both the size and sign of coverage error. We show that bootstrap calibration of both naive and predictive-likelihood approaches increases coverage accuracy of prediction intervals by an order of magnitude, and, in the case of naive intervals preserves that method’s numerical and analytical simplicity. Therefore, we argue, the bootstrap-calibrated naive approach is a particularly competitive alternative to more conventional, but more complex, techniques based on predictive likelihood.

We consider the estimation of the three parameters of the lower tail of a distribution function based on the k smallest out of n observations. The likelihood function has a singularity but it is argued that a local maximum, when it exists, should be taken as the m.l.e. Asymptotic results as k → ∞, n → ∞, k/n → 0 show that such a local maximum does exist, and provides consistent estimators whenever the shape parameter is greater than one; otherwise there is no local maximum and likelihood inference fails. We also discuss interval estimation and propose a test to distinguish between the Type I and Weibull limit laws.

The NEAR(2) model is a model for nonlinear time series with exponential marginals. It formed the subject of a recent discussion paper by Lawrance and Lewis. We explore the possibility of maximum likelihood estimation with particular reference to the difficulties created by discontinuities in the likelihood function. Two exploratory techniques are proposed for resolving those difficulties. This note is an expanded version of a contribution presented orally at the meeting.

Estimation is considered for a class of models which are simple extensions of the generalized extreme value (GEV) distribution, suitable for introducing time dependence into models which are otherwise only spatially dependent. Maximum likelihood estimation and the method of probability weighted moment estimation are identified as most useful for fitting these models. The relative merits of these methods, and others, are discussed in the context of estimation for the GEV distribution, with particular reference to the non-regularity of the GEV distribution for particular parameter values. In the case of maximum likelihood estimation, first and second derivatives of the log likelihood are evaluated for the models.

We discuss the analysis of the extremes of data by modelling the sizes and occurrence of exceedances over high thresholds. The natural distribution for such exceedances, the generalized Pareto distribution, is described and its properties elucidated. Estimation and model-checking procedures for univariate and regression data are developed, and the influence of and information contained in the most extreme observations in a sample are studied. Models for seasonality and serial dependence in the point process of exceedances are described. Sets of data on river flows and wave heights are discussed, and an application to the siting of nuclear installations is described.

For the design of sea defences the main statistical issue is to estimate
quantiles of the distribution of annual maximum sea levels for all
coastal sites. Traditional procedures independently analyse data from
each individual site; thus known spatial properties of the
meteorological and astronomical tidal components of sea level are not
exploited. By spatial modelling of the marginal behaviour and inter-site
dependence of sea level annual maxima around the British coast we are
able to examine risk assessment for coastlines and the issue of
sensitivity to climatic change.

The generalized Pareto distribution is a two-parameter distribution that contains uniform, exponential, and Pareto distributions as special cases. It has applications in a number of fields, including reliability studies and the analysis of environmental extreme events. Maximum likelihood estimation of the generalized Pareto distribution has previously been considered in the literature, but we show, using computer simulation, that, unless the sample size is 500 or more, estimators derived by the method of moments or the method of probability-weighted moments are more reliable. We also use computer simulation to assess the accuracy of confidence intervals for the parameters and quantiles of the generalized Pareto distribution.
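The two alternative estimators mentioned in this abstract have simple closed forms, sketched below under stated assumptions. The parametrization F(x) = 1 − (1 − kx/α)^(1/k), in which k has the opposite sign to the shape parameter used in many later texts, is assumed here, as are the function names and the unbiased sample estimator of the first probability weighted moment.

```python
import random


def gpd_sample(k, alpha, n, rng):
    """Draw n variates from F(x) = 1 - (1 - k*x/alpha)**(1/k), for k != 0."""
    # 1 - rng.random() lies in (0, 1], so the power is defined for either sign of k.
    return [alpha * (1.0 - (1.0 - rng.random()) ** k) / k for _ in range(n)]


def gpd_moments(sample):
    """Method-of-moments estimates (k, alpha); needs a finite variance (k > -1/2)."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    ratio = mean * mean / var          # equals 1 + 2k in the population
    k = 0.5 * (ratio - 1.0)
    alpha = 0.5 * mean * (ratio + 1.0)
    return k, alpha


def gpd_pwm(sample):
    """Probability-weighted-moment estimates (k, alpha)."""
    xs = sorted(sample)
    n = len(xs)
    a0 = sum(xs) / n
    # Unbiased estimate of E[X (1 - F(X))]: weight (n - rank) / (n - 1).
    a1 = sum((n - 1 - i) * x for i, x in enumerate(xs)) / (n * (n - 1))
    k = a0 / (a0 - 2.0 * a1) - 2.0
    alpha = 2.0 * a0 * a1 / (a0 - 2.0 * a1)
    return k, alpha


if __name__ == "__main__":
    data = gpd_sample(0.2, 1.0, 50, random.Random(1))
    print("moments:", gpd_moments(data))
    print("PWM:    ", gpd_pwm(data))
```

Both estimators use only sample averages of the ordered data, so no iterative optimisation is needed, unlike maximum likelihood.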

We consider maximum likelihood estimation of the parameters of a probability density which is zero for x < θ and asymptotically
αc(x − θ)^(α − 1) as x ↓ θ. Here θ and other parameters, which may or may not include α and c, are unknown. The classical regularity conditions for the asymptotic properties of maximum likelihood estimators are not satisfied
but it is shown that, when α > 2, the information matrix is finite and the classical asymptotic properties continue to hold.
For α = 2 the maximum likelihood estimators are asymptotically efficient and normally distributed, but with a different rate
of convergence. For 1 < α < 2, the maximum likelihood estimators exist in general, but are not asymptotically normal, while
the question of asymptotic efficiency is still unsolved. For α ≤ 1, the maximum likelihood estimators may not exist at all,
but alternatives are proposed. All these results are already known for the case of a single unknown location parameter θ,
but are here extended to the case in which there are additional unknown parameters. The paper concludes with a discussion
of the applications in extreme value theory.
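The four regimes listed in this abstract can be summarised as a small lookup, sketched below. The function name and the wording of the returned labels are illustrative only; the boundaries are those stated in the abstract, with α the exponent governing the density near its endpoint θ.

```python
def mle_regime(alpha):
    """Return the asymptotic behaviour of the MLE for endpoint exponent alpha > 0."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if alpha > 2:
        return "regular: finite information matrix, classical asymptotics hold"
    if alpha == 2:
        return "efficient and asymptotically normal, but with a different convergence rate"
    if alpha > 1:
        return "MLE exists in general but is not asymptotically normal"
    return "MLE may not exist; alternatives are needed"


if __name__ == "__main__":
    for a in (3.0, 2.0, 1.5, 0.8):
        print(a, "->", mle_regime(a))
```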

We use the method of probability-weighted moments to derive estimators of the parameters and quantiles of the generalized extreme-value distribution. We investigate the properties of these estimators in large samples, via asymptotic theory, and in small and moderate samples, via computer simulation. Probability-weighted moment estimators have low variance and no severe bias, and they compare favorably with estimators obtained by the methods of maximum likelihood or sextiles. The method of probability-weighted moments also yields a convenient and powerful test of whether an extreme-value distribution is of Fisher-Tippett Type I, II, or III.
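A minimal sketch of probability-weighted moment estimation for the generalized extreme-value distribution follows. It assumes the parametrization F(x) = exp{−[1 − k(x − ξ)/α]^(1/k)} and the commonly quoted quadratic approximation for k in terms of the sample moments, which is stated to be accurate for roughly −0.5 < k < 0.5. The function names, and the use of unbiased sample moments rather than plotting-position estimates, are assumptions made for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def gev_sample(loc, scale, k, n, rng):
    """Draw n variates from F(x) = exp(-(1 - k*(x - loc)/scale)**(1/k)), for k != 0."""
    out = []
    for _ in range(n):
        # Keep u strictly inside (0, 1) so -log(u) is positive and finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        out.append(loc + scale * (1.0 - (-math.log(u)) ** k) / k)
    return out


def gev_pwm(sample):
    """Probability-weighted-moment estimates (loc, scale, k) for the GEV."""
    xs = sorted(sample)
    n = len(xs)
    # Unbiased estimates of b_r = E[X F(X)^r] for r = 0, 1, 2.
    b0 = sum(xs) / n
    b1 = sum(i * x for i, x in enumerate(xs)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x for i, x in enumerate(xs)) / (n * (n - 1) * (n - 2))
    c = (2.0 * b1 - b0) / (3.0 * b2 - b0) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c  # approximate solution of the shape equation
    if abs(k) < 1e-9:
        # Gumbel limit as k -> 0.
        scale = (2.0 * b1 - b0) / math.log(2.0)
        return b0 - EULER_GAMMA * scale, scale, 0.0
    g = math.gamma(1.0 + k)
    scale = (2.0 * b1 - b0) * k / (g * (1.0 - 2.0 ** (-k)))
    loc = b0 + scale * (g - 1.0) / k
    return loc, scale, k


if __name__ == "__main__":
    data = gev_sample(0.0, 1.0, 0.1, 50, random.Random(2))
    print(gev_pwm(data))
```

In this parametrization the sign of the estimated k indicates the extreme-value type: k = 0 corresponds to Type I, k < 0 to Type II and k > 0 to Type III.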

This article is concerned with modifications of both maximum likelihood and moment estimators for parameters of the three-parameter Weibull distribution. Modifications presented here are basically the same as those previously proposed by the authors (1980, 1981, 1982) in connection with the lognormal and the gamma distributions. Computer programs were prepared for the practical application of these estimators and an illustrative example is included. Results of a simulation study provide insight into the sampling behavior of the new estimators and include comparisons with the traditional moment and maximum likelihood estimators. For some combinations of parameter values, some of the modified estimators considered here enjoy advantages over both moment and maximum likelihood estimators with respect to bias, variance, and/or ease of calculation.

Maximum likelihood estimators of the parameters of the generalized extreme-value distribution are derived for complete, left censored, right censored or doubly censored samples. Explicit expressions are provided for the observed information matrix which forms the basis of the iterative procedure described. It is shown that the inverse of this information matrix evaluated at the maximum likelihood estimates provides a better estimate of the variance-covariance matrix of the estimators than the expected information matrix.

An n-stage splitting algorithm for the solution of maximum penalized likelihood estimation (MPLE) problems is compared to the one-step-late (OSL) algorithm. General conditions under which the asymptotic rate of convergence of this splitting algorithm exceeds that of the OSL algorithm are given. A one-dimensional positive data example illustrates the comparison of the rates of convergence of these two algorithms.

The limiting distribution, when n is large, of the greatest or least of a sample of n, must satisfy a functional equation which limits its form to one of two main types. Of these one has, apart from size and position, a single parameter h, while the other is the limit to which it tends when h tends to zero. The appropriate limiting distribution in any case may be found from the manner in which the probability of exceeding any value x tends to zero as x is increased. For the normal distribution the limiting distribution has h = 0. From the normal distribution the limiting distribution is approached with extreme slowness; the final series of forms passed through as the ultimate form is approached may be represented by the series of limiting distributions in which h tends to zero in a definite manner as n increases to infinity. Numerical values are given for the comparison of the actual with the penultimate distributions for samples of 60 to 1000, and of the penultimate with the ultimate distributions for larger samples.

The majority of studies of long-term sea-level change have concentrated on the trend in mean sea-level which is just one constituent of the trend in extreme sea-level. By fitting a spatial model to sea-level annual maxima from 62 UK sites, extreme sea-level trend estimates are obtained for the entire British coastline. These estimates exhibit smooth, but significant, spatial variation which arises from a combination of eustatic extreme sea-level change and local vertical land movements. Once the latter effect is removed, eustatic extreme sea-level trends are found to have no significant spatial variation and to be similar in value to trends in UK eustatic mean sea-level.

It is shown that the probability P that the annual maximum value of a meteorological element is less than a given value x can be expressed as exp
where a and k are constants such that ak is positive.
The parameters a and k can be evaluated from data of annual maximum values for a given period of years. Hence we derive the average highest and lowest in sets of T annual maxima, and also the maximum value to be expected once in T years.

Observed sea level maxima in the form of annual extremes have been analysed for 67 ports around the British Isles. The ports give a coverage of all principal estuarine and open coast regions. The average number of annual maxima per port is 45, ranging from a minimum of 10 to a maximum of over 130. The data are analysed by a standard method of extreme value analysis. It is shown that for certain ports the frequency distributions of the maxima are time dependent; this time dependency is most pronounced in estuarine regions. The results show the local and regional distribution of frequency statistics and the relative magnitude of their variation through time. The limitations of using annual sea level maxima as a basis for computing frequency statistics are examined.

Several methods of analyzing extreme values are now known, most based on the extreme value limit distributions or related families. This paper reviews these techniques and proposes some extensions based on the point-process view of high-level exceedances. These ideas are illustrated with a detailed analysis of ozone data collected in Houston, Texas. There is particular interest in whether there is any trend in the data. The analysis reveals no trend in the overall levels of the series, but a marked downward trend in the extreme values.

Explicit expressions are provided for the information matrix of the three-parameter generalized extreme-value distribution. A condition for regular estimation is established.

Dixon, M.J. and Tawn, J.A., Estimates Of Extreme Sea Conditions: Spatial Analyzes for the UK, Proudman Oceanographic Laboratory Internal document: in press, 210 pages, 1997.