
SCIENCE CHINA

Technological Sciences

© Science China Press and Springer-Verlag Berlin Heidelberg 2011 tech.scichina.com www.springerlink.com

*Corresponding author (email: zmliang@hhu.edu.cn)

• RESEARCH PAPER •

May 2011 Vol.54 No.5: 1183–1192

doi: 10.1007/s11431-010-4229-4

Application of Bayesian approach to hydrological frequency analysis

LIANG ZhongMin1*, LI BinQuan1, YU ZhongBo1,2 & CHANG WenJuan1

1 College of Hydrology and Water Resources, Hohai University, Nanjing 210098, China;

2 Department of Geosciences, University of Nevada Las Vegas, NV 89154, USA

Received September 26, 2010; accepted November 10, 2010; published online February 4, 2011

An existing Bayesian flood frequency analysis method is applied to quantile estimation for the Pearson type three (P-III) probability distribution. The method couples prior and sample information within the framework of the Bayesian formula, and the Markov Chain Monte Carlo (MCMC) sampling approach is used to estimate the posterior distributions of the parameters. In place of the original sampling algorithm (i.e. importance sampling) used in the existing approach, we use the adaptive Metropolis (AM) sampling technique to generate a large number of parameter sets from the Bayesian posterior distributions of the parameters. From these parameter sets, the sampling distributions of quantiles, i.e. the hydrological design values, are constructed. With these sampling distributions, the Bayesian method provides not only various kinds of point estimators for quantiles, e.g. the expectation estimator, but also a quantitative evaluation of the uncertainties of these point estimators. Therefore, the Bayesian method brings more useful information to hydrological frequency analysis. As an example, the flood extreme sample series at a gauge is used to demonstrate the application procedure.

Bayesian theory, hydrological frequency analysis, Markov Chain Monte Carlo, prior distribution, posterior distribution

Citation: Liang Z M, Li B Q, Yu Z B, et al. Application of Bayesian approach to hydrological frequency analysis. Sci China Tech Sci, 2011, 54: 1183–1192,

doi: 10.1007/s11431-010-4229-4

1 Introduction

The primary objective of hydrological frequency analysis is to provide a design value xp with an assigned frequency p (or return period T, with T=1/p) for engineering practice [1]. At present, most approaches to hydrological frequency analysis focus on estimating design values within the so-called parametric model framework. Under this framework, the hydrological variable is assumed to follow a specific population distribution, usually represented by a probability density function (PDF) or cumulative distribution function (CDF) with unknown parameters. The choice of PDF (or CDF) to be used for statistical inference is often based on subjective criteria, or it is treated as a matter of probabilistic hypothesis testing [2]. Besides selecting a model for the population distribution, hydrological frequency analysis must also develop statistical approaches to estimate these parameters from hydrological observations or samples.

Once the flood frequency distribution has been determined, parameter estimation becomes the next troublesome problem. Several general approaches are available for estimating the parameters of a frequency distribution from sample data, among which the method of moments (MOM) is the most popular. However, MOM generally yields biased estimates of the standard deviation and the coefficient of skewness [3]. The method of maximum likelihood (ML) has also been used in hydrology.


1184 Liang Z M, et al. Sci China Tech Sci May (2011) Vol.54 No.5

Griffis and Stedinger showed that MOM with a highly in-

formative regional skew performed as well as ML method

for the range of parameters of Log-Pearson Type 3 distribu-

tion through a Monte Carlo analysis [4]. However, ML

method is used to a lesser extent partly because its applica-

tion does not lend itself to easily manipulated algebraic ex-

pressions [5]. Greenwood et al. developed probability

weighted moments (PWM) for flood frequency studies [6].

The PWM method received great attention after it was pub-

lished, and many research reports have shown that this

method offers efficient and nearly unbiased parameter esti-

mates for several distributions such as the Gumbel, Wakeby,

and GEV distributions [5,7–10]. However, the scope of

PWM’s application was limited to such distributions whose

inverse forms must be explicitly defined until Song and

Ding [11] and Ding and Yang [12] developed a procedure

for estimating parameters of the P-III distribution whose

CDF does not have an explicit inverse form. The domain of application of PWM was largely extended thereafter, and there are also some modifications of PWM, e.g. Whalen et al. [13]. In order to develop a unified approach for the use of order

statistics for the statistical analysis of univariate probability

distributions, Hosking [14] developed a more easily inter-

preted technique, called L-Moments, which covers the

characterization of probability distributions, the summariza-

tion of observed data samples, the fitting of probability dis-

tributions to data and the testing of hypotheses about distri-

butional form (for more recent works see refs. [15,16]).

Since its appearance, the method of L-Moments has been

receiving increasing attention in hydrological frequency

analysis (e.g. ref. [17]).

In addition to the methods mentioned above, approaches based on Bayes' theorem have also been intensively studied and applied. One of the most attractive advantages of the Bayesian approach is that it couples prior information with sample information, providing a theoretically consistent framework for integrating systematic flow records with regional and hydrologic information within a unified framework, and allows explicit modeling of the estimation uncertainties arising from both the flood frequency model and its parameters [18–20]. Wood and Rodriguez-Iturbe developed

procedures for analyzing and accounting for both statistical

uncertainty of competing flood frequency models and pa-

rameter uncertainty for the individual models, and then con-

sidered the problem of uncertainty among flood frequency

models within a Bayesian analysis [21, 22]. Tang used a Bayesian regression analysis to evaluate the resulting probability distribution of the flood level; the weight value in this Bayesian approach varied dynamically depending on the goodness-of-fit between extreme values and frequencies, whose relationship was assumed to be linear [23]. van Gelder et al. extended Tang’s work to address the problem of how to select a suitable probability distribution, and found that the approach of Bayes factors seemed to perform better than Tang’s method [24]. Besides,

Kuczera also focused on this topic, and introduced an em-

pirical Bayes approach to infer hydrological quantities by

combining site-specific and regional information, and pre-

sented a Monte Carlo Bayesian method for computing the

expected probability distribution as well as quantile confi-

dence limits using data of gauged flows, possibly corrupted

by rating curve error, and data of censored flows [25, 26].

O’Connell et al. took a full Bayesian approach to incor-

porate historical information and measurement errors into

the flood frequency analysis [27]. Reis et al. developed a

Bayesian approach to analysis of a generalized least squares

(GLS) regression model for regional analyses of hydrologic

data [28]. Reis and Stedinger explored Bayesian MCMC

methods for evaluation of the posterior distributions [29].

They found that the Bayesian MCMC approach provided a computationally and conceptually simple way of appropriately incorporating into flood frequency analysis the joint distribution of possible errors within rating curves and individual observations. Seidou et al. proposed an alternative Bayesian method for combining local and regional

information to provide the full probability densities for pa-

rameters or quantiles [30]. Unlike the empirical Bayesian

approach their method worked even with a single local ob-

servation, and also relaxed the hypothesis of normality of

the local quantiles probability distribution. Ribatet et al.

developed a regional Bayesian POT model for flood fre-

quency analysis when the at-site streamflow data series

were not available [31]. Subsequently, they further

presented a model that allowed a non null probability for a

regional fixed shape parameter for regional flood frequency

analysis [32]. Their methodology was integrated within a Bayesian framework and used reversible jump techniques; their results indicated that the estimator was well suited for regional estimation when only a few data were available at the target site. More recently, Micevski and

Kuczera presented a general Bayesian approach for

inferring the GLS regional regression model and for com-

bining with any available at-site information to obtain the

most accurate flood quantile estimations [33].

Generally, bias or uncertainty inevitably exists in estimated design values or quantiles, owing to various causes including incomplete sample information, incorrect selection of statistical models and inefficient parameter estimation. However, there are few effective theoretical approaches for computing the quantile sampling error. Instead, in practice, empirical methods have been used to estimate the error approximately, and on that basis a safety factor or adjustment value is added to the design value in order to ensure engineering safety. In contrast, because they can provide sampling distributions for quantiles, Bayesian approaches offer an alternative for estimating the sampling errors of quantiles on a theoretical basis.

The remainder of this article is structured as follows. The next section briefly introduces the Monte Carlo Bayesian method put forward in ref. [26]. Formulas to estimate quantiles and confidence limits are then presented, followed by a description of the AM sampling technique for Bayesian MCMC, including its algorithm and convergence conditions. After that a case study is performed, and conclusions are drawn at the end.

2 Bayesian approach for parameter estimation

2.1 Bayesian formula

The Bayesian approach has been widely used in hydrology. The principal idea of Bayesian theory is to couple prior and sample information in order to obtain posterior knowledge of the unknown variables. It expresses the posterior probability density p(θ|x) of the parameter θ given the samples x as

$$p(\theta \mid x) = \frac{f(x \mid \theta)\, g(\theta)}{\int f(x \mid \theta)\, g(\theta)\, d\theta}, \tag{1}$$

where g(θ) is the prior probability density of the parameter θ, describing what is known about the unknown parameter prior to the analysis of the samples x; and f(x|θ) is the likelihood function of the samples x, that is, the sampling distribution of the samples given a chosen probability model and parameter θ.

2.2 Prior distribution

The determination of the prior distribution g(θ) is an essential step in any Bayesian analysis. In general, noninformative priors, subjective Bayes and empirical Bayes are the most commonly suggested rules for fixing a prior distribution. In this study a noninformative uniform prior was adopted, i.e. the range of each parameter was determined and the parameter was assumed to be uniformly distributed within that range.

2.3 Likelihood function

Samples used for hydrological frequency analysis in practice are almost always so-called “non-simple” samples, which include gauged series and investigated data (i.e. extraordinary flood data). Correspondingly, if the sample series contains only systematically recorded data, it is called a “simple” sample. In this paper we follow the definition of the likelihood function proposed in ref. [26] to describe the likelihood given a non-simple sample.

Suppose the hydrological frequency model is described by a PDF f(Q|θ), where Q is the hydrological variable of interest and θ is the parameter vector about which an inference is sought. Denote the “non-simple” sample as x={O, H}, where $O=\{Q_1^{(1)}, Q_2^{(1)}, \ldots, Q_n^{(1)}\}$ represents the observed records of n hydrological data, while $H=\{(Q_1^{(2)}, u_1, v_1), (Q_2^{(2)}, u_2, v_2), \ldots, (Q_m^{(2)}, u_m, v_m)\}$ represents m investigated or censored records. The symbol $(Q_j^{(2)}, u_j, v_j)$, j=1, 2, …, m, denotes that $v_j$ annual hydrological data in $u_j$ years exceeded the threshold discharge $Q_j^{(2)}$, which is the ordering used for the triples in the case study; the corresponding hydrological frequency model is f(H|θ).

In light of the information above, it is assumed that the gauged data series $Q_i^{(1)}$ (i=1, …, n) follow a P-III distribution and that the censored data series $(Q_j^{(2)}, u_j, v_j)$ (j=1, …, m) follow a Binomial distribution. Given the statistical independence of annual hydrological data, it follows that

$$f(O \mid \theta) = \prod_{i=1}^{n} f\left(Q_i^{(1)} \mid \theta\right), \tag{2}$$

$$f(H \mid \theta) = \prod_{j=1}^{m} f\left(Q_j^{(2)} \mid \theta, v_j, u_j\right). \tag{3}$$

Then the sampling distribution becomes

$$f\left(Q_i^{(1)} \mid \theta\right) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,(x - a_0)^{\alpha - 1}\, e^{-\beta (x - a_0)}, \quad x = Q_i^{(1)}, \tag{4}$$

$$f\left(Q_j^{(2)} \mid \theta, v_j, u_j\right) = C_{u_j}^{v_j}\left[1 - F\left(Q_j^{(2)}\right)\right]^{u_j - v_j}\left[F\left(Q_j^{(2)}\right)\right]^{v_j}, \tag{5}$$

where the coefficients a0, β and α are the parameters describing the location, scale and shape characteristics of a P-III distribution, defined as

$$a_0 = \bar{x}\left(1 - \frac{2 C_v}{C_s}\right), \quad \beta = \frac{2}{\bar{x}\, C_v C_s}, \quad \alpha = \frac{4}{C_s^2},$$

respectively (x̄ is the expectation of the P-III distribution, Cv is the coefficient of variation, and Cs is the coefficient of skewness). $F(Q_j^{(2)})$ is defined as the probability $P(Q > Q_j^{(2)} \mid \theta)$, i.e., the exceedance probability is used to define the cumulative probability function hereafter.

Hence, the likelihood function f(x|θ) in eq. (1) becomes

$$f(x \mid \theta) = f(O, H \mid \theta) = f(O \mid H, \theta)\, f(H \mid \theta) = f(O \mid \theta)\, f(H \mid \theta). \tag{6}$$

Consequently, the posterior probability density p(θ|x) in eq. (1) can be estimated from the prior distribution g(θ) and the likelihood function f(x|θ) defined by eq. (6).
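The likelihood of eqs. (2)–(6) can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names are hypothetical, positive skew (Cs > 0) is assumed, each censored record (Q, u, v) is read as v exceedances in u years (the layout of the case-study triples such as (2790, 109, 1)), and the incomplete gamma function is evaluated with a plain power series rather than a library routine.

```python
import math

def p3_params(mean, cv, cs):
    """Location a0, scale beta and shape alpha of P-III from (mean, Cv, Cs)."""
    a0 = mean * (1.0 - 2.0 * cv / cs)
    beta = 2.0 / (mean * cv * cs)
    alpha = 4.0 / cs ** 2
    return a0, beta, alpha

def reg_lower_gamma(a, x, max_terms=10000):
    """Regularized lower incomplete gamma function P(a, x) by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < 1e-16 * total:
            break
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))

def p3_logpdf(q, theta):
    """Log of the P-III density of eq. (4); -inf below the lower bound a0."""
    a0, beta, alpha = p3_params(*theta)
    if q <= a0:
        return -math.inf
    z = q - a0
    return (alpha * math.log(beta) - math.lgamma(alpha)
            + (alpha - 1.0) * math.log(z) - beta * z)

def p3_exceedance(q, theta):
    """F(q) = P(Q > q | theta), the exceedance probability used in eq. (5)."""
    a0, beta, alpha = p3_params(*theta)
    return 1.0 - reg_lower_gamma(alpha, beta * (q - a0))

def log_likelihood(theta, gauged, censored):
    """log f(x | theta) = log f(O | theta) + log f(H | theta), as in eq. (6)."""
    total = sum(p3_logpdf(q, theta) for q in gauged)        # eq. (2)
    for q, u, v in censored:                                # eq. (3)
        f = p3_exceedance(q, theta)
        if not 0.0 < f < 1.0:
            return -math.inf
        total += (math.log(math.comb(u, v))                 # eq. (5)
                  + v * math.log(f) + (u - v) * math.log(1.0 - f))
    return total
```

Working with the logarithm of the likelihood avoids underflow when many annual records are multiplied together; the log prior can simply be added to obtain the unnormalized log posterior needed by an MCMC sampler.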

3 Quantile and confidence limit estimations

The Bayesian distribution of an annual hydrological variable Q is obtained by applying the total probability theorem, which yields the PDF

$$I(Q \mid x) = \int f(Q \mid \theta)\, p(\theta \mid x)\, d\theta, \tag{7}$$

where I(Q|x) is the PDF of the variable Q given the samples x. Then the design value Qp, with the subscript p referring to the exceedance probability p (p=1/T), is defined by

$$p = P(Q \ge Q_p \mid x) = \int_{Q_p}^{\infty} I(Q \mid x)\, dQ. \tag{8}$$

Substituting eq. (7) into eq. (8) and changing the order of integration yields

$$P(Q \ge Q_p \mid x) = \int_{Q_p}^{\infty}\!\int f(Q \mid \theta)\, p(\theta \mid x)\, d\theta\, dQ = \int P(Q \ge Q_p \mid \theta)\, p(\theta \mid x)\, d\theta, \tag{9}$$

where P(Q≥Qp|θ) is the probability of Q exceeding Qp given that θ is the true parameter. Eq. (9) indicates that the Bayesian approach requires an integration, and for most of the probability distributions used in hydrological frequency analysis this integral is likely to be very complex, which brings extra difficulties in computation. In this study, the MCMC technique described in the next section was employed to solve the integral. Furthermore, from the sampling distribution of the quantile, several point estimates of the quantile can be obtained, e.g. the expectation of the quantile is defined as

$$E Q_p = \int Q_p\, dP(Q_p \mid x). \tag{10}$$

In the calculation, MCMC sampling is used to estimate the expected value EQp of Qp. In addition, a “maximum likelihood value” estimate of the parameters is also available. In this method, parameter sets θ=(EX, Cv, Cs) are sampled from the posterior distribution p(θ|x) via the AM algorithm (EX is the mathematical expectation, with X denoting the random variable of which the samples x are realizations, and the other two symbols are as defined previously). From the sampled sets, the mode estimate of the parameter θ, i.e. the value that maximizes p(θ|x), is obtained. Then the design value Qp is estimated from this mode.
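To make the step from a parameter set to a design value concrete, the sketch below computes Qp for one θ=(EX, Cv, Cs) by solving P(Q > Qp | θ) = p with bisection, and averages Qp over a list of posterior parameter sets to approximate the expectation estimator. It is a hedged illustration under stated assumptions: the function names are hypothetical, Cs > 0 is assumed, the P-III parameter relations are those given after eq. (5), and the source does not say which numerical method the authors used to invert the P-III distribution.

```python
import math

def reg_lower_gamma(a, x, max_terms=10000):
    """Regularized lower incomplete gamma function P(a, x) by power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < 1e-16 * total:
            break
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))

def p3_exceedance(q, mean, cv, cs):
    """P(Q > q) for a P-III distribution parameterized by (mean, Cv, Cs)."""
    alpha = 4.0 / cs ** 2
    beta = 2.0 / (mean * cv * cs)
    a0 = mean * (1.0 - 2.0 * cv / cs)
    return 1.0 - reg_lower_gamma(alpha, beta * (q - a0))

def p3_quantile(p, mean, cv, cs, tol=1e-9):
    """Design value Q_p with exceedance probability p, found by bisection."""
    lo = mean * (1.0 - 2.0 * cv / cs)        # lower bound a0, exceedance = 1
    hi = max(mean, lo + 1.0)
    while p3_exceedance(hi, mean, cv, cs) > p:
        hi = lo + 2.0 * (hi - lo)            # widen until exceedance <= p
    while hi - lo > tol * max(1.0, abs(hi)):
        mid = 0.5 * (lo + hi)
        if p3_exceedance(mid, mean, cv, cs) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def expected_quantile(p, parameter_sets):
    """Average of Q_p over posterior parameter sets (mean, Cv, Cs)."""
    values = [p3_quantile(p, *theta) for theta in parameter_sets]
    return sum(values) / len(values)
```

With Cs=2 and Cv=1 the P-III distribution reduces to an exponential distribution with mean EX, which gives a simple analytical check of the bisection.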

According to the sampling distribution of the design value, the interval estimate at an arbitrary confidence level (1−α) follows from

$$P\left(Q_p \in \left[Q_p^{L},\, Q_p^{U}\right]\right) = 1 - \alpha, \tag{11}$$

where $Q_p^{L}$ is the lower limit and $Q_p^{U}$ is the upper limit. Based on the confidence interval $[Q_p^{L}, Q_p^{U}]$, we can quantitatively evaluate the estimation uncertainty of the design value.
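Given a set of sampled design values, the point and interval estimates described above reduce to simple summary statistics. The following sketch assumes an equal-tailed interval built from empirical percentiles with linear interpolation; the source does not state how the limits in eq. (11) were computed, and the function name and dictionary keys are hypothetical.

```python
import math
import statistics

def summarize_quantile_samples(samples, level=0.90):
    """Mean, standard deviation and equal-tailed limits of sampled Q_p values."""
    ordered = sorted(samples)
    n = len(ordered)

    def percentile(frac):
        # Linear interpolation between neighbouring order statistics.
        pos = frac * (n - 1)
        low = int(math.floor(pos))
        high = min(low + 1, n - 1)
        weight = pos - low
        return ordered[low] * (1.0 - weight) + ordered[high] * weight

    tail = (1.0 - level) / 2.0
    mean = statistics.mean(ordered)
    std = statistics.stdev(ordered)
    return {
        "mean": mean,                     # expectation estimator, eq. (10)
        "std": std,                       # sampling standard deviation
        "dev_coeff": std / mean,          # coefficient of deviation
        "lower": percentile(tail),        # Q_p^L in eq. (11)
        "upper": percentile(1.0 - tail),  # Q_p^U in eq. (11)
    }
```

Other interval constructions, such as a highest-density interval, would give somewhat different limits for skewed sampling distributions.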

4 MCMC method

The basic idea of the MCMC method is to construct a Markov chain with a specified invariant distribution and to simulate the unknown variables with this chain repeatedly until it reaches its steady-state distribution. The MCMC method has been improved gradually and is used in a wide variety of applications. Some scholars have explored Bayesian MCMC methods for hydrological frequency analysis. Besides the papers mentioned in the introduction, Blasone et al. improved the computational efficiency of generalized likelihood uncertainty estimation (GLUE) by sampling the prior parameter space with an adaptive MCMC scheme, and proposed an alternative strategy for determining the cutoff threshold based on the appropriate coverage of the resulting uncertainty bounds [34]. Recently, Gallagher et al. presented an overview of the MCMC method and discussed how to determine the optimal model, model resolution and model choice for earth science problems using this sampling method [35].

4.1 AM algorithm

Clearly, different sampling approaches lead to different MCMC algorithms, among which the AM sampling method is widely applied. The AM algorithm, inspired by refs. [36, 37], is an improved sampler. Unlike traditional MCMC algorithms, it does not predetermine the proposal distribution; instead, it uses the covariance matrix of the sampled posterior parameters, which adapts continuously to the target distribution through successive iterations. The proposal distribution employed in the AM algorithm is a multivariate normal distribution with mean θi at the current point and covariance Ci at the i-th time step.

The crucial problem regarding the adaptation is how the covariance of the proposal distribution depends on the history of the chain. In the AM algorithm, Ci takes the form Ci=Sd Cov(θ0, …, θi−1)+Sd ε Id at the i-th time step after an initial period, where the parameter ε only ensures that Ci does not become singular; here we set ε=10⁻⁵. Sd is a scaling parameter that depends only on the dimension d and keeps the acceptance probability within advisable bounds; as a basic choice, Sd=(2.4)²/d is adopted. Id is the identity matrix of dimension d.

First, an arbitrary strictly positive definite initial covariance C0 is selected, chosen according to the best prior knowledge (which may be quite poor). With i0 denoting the length of the initial period, the expression is

$$C_i = \begin{cases} C_0, & i \le i_0, \\ S_d\, \mathrm{Cov}(\theta_0, \ldots, \theta_{i-1}) + S_d\, \varepsilon\, I_d, & i > i_0. \end{cases} \tag{12}$$

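For step i+1 the covariance satisfies the recursion formula

$$C_{i+1} = \frac{i-1}{i}\, C_i + \frac{S_d}{i}\left[\, i\, \bar{\theta}_{i-1} \bar{\theta}_{i-1}^{T} - (i+1)\, \bar{\theta}_i \bar{\theta}_i^{T} + \theta_i \theta_i^{T} + \varepsilon I_d \right], \tag{13}$$

where $\bar{\theta}_{i-1}$ and $\bar{\theta}_i$ represent the average parameter value of the previous i−1 iterations and of the previous i iterations, respectively. Eq. (13) allows one to calculate Ci without too much computational cost, since the mean $\bar{\theta}_i$ also satisfies an obvious recursion formula.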


The flowchart in Figure 1 illustrates how the AM algo-

rithm works.
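The adaptive proposal of eq. (12) can be illustrated with a compact, standard-library Python sketch. It is a simplified illustration rather than the authors' implementation: the function names are hypothetical, the chain covariance is maintained with a running mean and scatter matrix (a Welford-style update that plays the role of the recursion in eq. (13) without reproducing it term by term), and a symmetric random-walk Metropolis acceptance rule is assumed.

```python
import math
import random

def cholesky(a):
    """Lower-triangular factor L with L L^T = a for a small SPD matrix."""
    d = len(a)
    low = [[0.0] * d for _ in range(d)]
    for r in range(d):
        for c in range(r + 1):
            s = a[r][c] - sum(low[r][k] * low[c][k] for k in range(c))
            low[r][c] = math.sqrt(s) if r == c else s / low[c][c]
    return low

def adaptive_metropolis(log_post, theta0, c0, n_iter, i0=100, eps=1e-5, seed=0):
    """Return the chain theta_0..theta_n_iter sampled with an adaptive proposal."""
    rng = random.Random(seed)
    d = len(theta0)
    sd = 2.4 ** 2 / d                        # scaling parameter S_d
    current, current_lp = list(theta0), log_post(theta0)
    chain = [list(current)]
    count, mean = 1, list(current)           # running mean of the history
    scatter = [[0.0] * d for _ in range(d)]  # running sum of outer products
    for i in range(1, n_iter + 1):
        if i <= i0 or count < 2:
            cov = c0                         # initial period: fixed C_0
        else:                                # eq. (12): S_d Cov + S_d eps I_d
            cov = [[sd * scatter[r][c] / (count - 1)
                    + (sd * eps if r == c else 0.0)
                    for c in range(d)] for r in range(d)]
        low = cholesky(cov)
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        proposal = [current[r] + sum(low[r][k] * z[k] for k in range(r + 1))
                    for r in range(d)]
        lp = log_post(proposal)
        if math.log(1.0 - rng.random()) < lp - current_lp:
            current, current_lp = proposal, lp   # accept the proposal
        chain.append(list(current))
        count += 1                           # update history statistics
        delta = [current[k] - mean[k] for k in range(d)]
        mean = [mean[k] + delta[k] / count for k in range(d)]
        for r in range(d):
            for c in range(d):
                scatter[r][c] += delta[r] * (current[c] - mean[c])
    return chain
```

Here `log_post` is any function returning the unnormalized log posterior, for example a log prior plus a log-likelihood; returning negative infinity outside the prior range causes such proposals to be rejected. The starting point is assumed to have a finite log posterior.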

Figure 1 Flowchart of the AM algorithm.

4.2 Convergence diagnostic criterion

An effective MCMC algorithm must ensure that the sampling sequence converges to the posterior distribution. Theoretically, an isotropic sampler must converge as time→∞; in practical applications, however, the chain is finite. Instead, we have to determine the number of iterations that the AM algorithm needs to converge to the stationary posterior distribution, that is, the “burn in” period. Gelman and Rubin presented a quantitative convergence diagnostic indicator R, the scale reduction score, which takes the form [38]

$$R^{1/2} = \sqrt{\frac{k-1}{k} + \frac{q+1}{q} \cdot \frac{D}{k W}}, \tag{14}$$

where q is the number of simulated sequences (q≥2), whose starting points are drawn from an over-dispersed distribution. To diminish the effect of the starting distribution, we discard the initial iterations of each sequence and focus attention on the last k iterations. D/k is defined as the variance between the q sequence means, and W is the average of the q within-sequence variances, i.e.

$$\frac{D}{k} = \frac{1}{q-1} \sum_{j=1}^{q} \left( \mathrm{ave}(\theta)_j - \mathrm{ave}(\theta) \right)^2, \tag{15}$$

$$W = \frac{1}{q} \sum_{j=1}^{q} s_j^2, \tag{16}$$

where ave(θ)j is the average value of the parameter θ in the j-th sampling sequence, ave(θ) is the average value of the parameter θ over all the sampling sequences, and sj² is the variance of the j-th sampling sequence.

Now the scale reduction score is estimated for each scalar parameter. Once R^{1/2} is near 1 for all scalar estimators of interest, we conclude that each of the q sets of k simulated values has converged to the posterior distribution. Gelman and Rubin suggested taking R^{1/2}<1.2 as the convergence diagnostic condition for the multiple-sequence sampling process [38].
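The scale reduction score of eqs. (14)–(16) is straightforward to compute for one scalar parameter from q parallel sequences. The sketch below is illustrative: the function name is hypothetical, all sequences are assumed to have the same length k (burn-in already removed), and the within-sequence variances are taken with the k−1 denominator, which the text does not specify.

```python
import math
import statistics

def scale_reduction(sequences):
    """Scale reduction score R^(1/2) for one parameter, eqs. (14)-(16)."""
    q = len(sequences)
    k = len(sequences[0])
    means = [statistics.mean(seq) for seq in sequences]
    d_over_k = statistics.variance(means)                        # eq. (15)
    w = statistics.mean([statistics.variance(seq) for seq in sequences])  # eq. (16)
    return math.sqrt((k - 1) / k + (q + 1) / q * d_over_k / w)   # eq. (14)
```

Sequences that agree closely give a value near 1, while sequences stuck in different regions of the parameter space give a value far above the 1.2 threshold.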

5 Case study

A case study is presented to illustrate hydrological frequency analysis based on Bayesian theory. The real data series at Pingyuan gauge, a hydrological station in China, is shown in Table 1. It consists of 33 years of gauged peak flows, among which the peak that occurred in 1969 is the largest. In addition, historical flood information is available for this station: censored data for 2 years, i.e. 1877 and 1914, with peak flows of 2790 m³ s⁻¹ and 2130 m³ s⁻¹, respectively. These two censored values and the peak flow of 1969 are the three largest discharges in the past 109 years. According to the definition of the “non-simple sample” in this paper, they are classified as type H, i.e. (2790, 109, 1), (2180, 109, 2) and (2130, 109, 3), while the rest of the peak discharges belong to type O.
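The split of the Pingyuan record into type O and type H data can be written out explicitly. The sketch below uses the values of Table 1; the function name and the rule "take the three largest floods of the whole period as censored records" are illustrative assumptions that reproduce the triples quoted above.

```python
# Gauged annual peak flows (m^3/s) from Table 1, 1953-1985.
GAUGED = {
    1953: 1250, 1954: 1290, 1955: 527, 1956: 516, 1957: 1710, 1958: 534,
    1959: 554, 1960: 1720, 1961: 965, 1962: 776, 1963: 427, 1964: 718,
    1965: 524, 1966: 612, 1967: 773, 1968: 376, 1969: 2180, 1970: 1220,
    1971: 1120, 1972: 415, 1973: 1750, 1974: 545, 1975: 810, 1976: 718,
    1977: 1310, 1978: 217, 1979: 421, 1980: 860, 1981: 855, 1982: 750,
    1983: 1480, 1984: 1125, 1985: 640,
}
# Historical (investigated) floods from Table 1.
HISTORICAL = {1877: 2790, 1914: 2130}

def build_non_simple_sample(gauged, historical, n_largest=3):
    """Return (O, H): gauged flows and censored triples (Q, years, rank)."""
    period = max(gauged) - min(historical) + 1     # length of the record
    flows = sorted(list(gauged.values()) + list(historical.values()),
                   reverse=True)
    top = flows[:n_largest]
    censored = [(q, period, rank) for rank, q in enumerate(top, start=1)]
    remaining = list(gauged.values())
    for q in top:
        if q in remaining:
            remaining.remove(q)                    # e.g. the 1969 flood
    return remaining, censored

O, H = build_non_simple_sample(GAUGED, HISTORICAL)
```

The 1969 flood moves from the gauged series to type H, so 32 gauged values remain in type O.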

Samples were assumed to be from a P-III distribution

which has been widely used as the population distribution in

China for flood extremes frequency analysis. As for the

prior distributions of parameters, the uniform prior was used

in this study.

Table 1 Flood peak flow series (discharge in m³ s⁻¹)

Year   Peak flow    Year   Peak flow    Year   Peak flow
1877   2790         1963   427          1975   810
1914   2130         1964   718          1976   718
1953   1250         1965   524          1977   1310
1954   1290         1966   612          1978   217
1955   527          1967   773          1979   421
1956   516          1968   376          1980   860
1957   1710         1969   2180         1981   855
1958   534          1970   1220         1982   750
1959   554          1971   1120         1983   1480
1960   1720         1972   415          1984   1125
1961   965          1973   1750         1985   640
1962   776          1974   545


The AM algorithm was configured as follows.

1) The initial covariance C0 is a diagonal matrix, and the parameter variance is 1/20 of the given range.

2) The parameter ranges are limited to x̄∈[850, 950], Cv∈[0.3, 1.0] and Cs∈[0.6, 3.0], so that the prior distributions of these three parameters are

$$p(\bar{x}) = \begin{cases} 1/(950 - 850), & \bar{x} \in [850, 950], \\ 0, & \text{otherwise}, \end{cases} \tag{17a}$$

$$p(C_v) = \begin{cases} 1/(1.0 - 0.3), & C_v \in [0.3, 1.0], \\ 0, & \text{otherwise}, \end{cases} \tag{17b}$$

$$p(C_s) = \begin{cases} 1/(3.0 - 0.6), & C_s \in [0.6, 3.0], \\ 0, & \text{otherwise}. \end{cases} \tag{17c}$$

3) The initial iteration i0=100.
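The configuration above maps directly onto a log prior and an initial covariance. The sketch below is a hedged illustration: the names are hypothetical, the three parameters are treated as independent a priori, and item 1) is read literally as a variance equal to 1/20 of each range (it could also be intended as a standard deviation).

```python
import math

# Parameter ranges from eqs. (17a)-(17c), in the order (mean, Cv, Cs).
BOUNDS = {"mean": (850.0, 950.0), "cv": (0.3, 1.0), "cs": (0.6, 3.0)}

def log_uniform_prior(theta, bounds=BOUNDS):
    """Log of the product of the uniform priors; -inf outside the ranges."""
    total = 0.0
    for value, (low, high) in zip(theta, bounds.values()):
        if not low <= value <= high:
            return -math.inf
        total -= math.log(high - low)
    return total

def initial_covariance(bounds=BOUNDS, fraction=1.0 / 20.0):
    """Diagonal C_0 whose variances are a fixed fraction of each range."""
    widths = [high - low for low, high in bounds.values()]
    d = len(widths)
    return [[fraction * widths[r] if r == c else 0.0 for c in range(d)]
            for r in range(d)]
```

Because the prior is constant inside the ranges, the posterior is proportional to the likelihood there and zero elsewhere.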

5.1 Convergence diagnostics

Taking vague priors for all three parameters (x̄, Cv, Cs), we obtained the output shown in Figures 2(a)–(c) by running each sampler for 4000 iterations. The figures indicate that the sampled parameter values spread over the whole parameter space.

Figure 3 presents the traces of the posterior mean and variance of the parameter Cs. The figure shows that the mean and variance of Cs become essentially stable, from which we conclude that the single sequence has converged.

According to eq. (14), the evolution of the scale reduction score R^{1/2} is shown in Figure 4. The changes for the parameters (x̄, Cv, Cs) were dramatic during the initial iterations i<500; after iteration i>1000, the scale reduction score was stable with R^{1/2}→1, implying that the MCMC sampling sequences of the parameters converged stably to their posterior distributions and that the entire algorithm was convergent.

Therefore, for each sampling sequence we defined the first 1000 iterations as the “burn in” period and discarded them, and then generated a further 3000 samples. Five parallel sequences were run, giving a total of 15000 sampled values for each parameter of interest.

Figure 2 Sampling traces of three parameters.

Figure 3 Sampling traces for mean and variance of posterior distribution of Cs.
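The bookkeeping behind the 15000 retained samples can be sketched as follows, assuming each of the five sequences holds 4000 draws of which the first 1000 are discarded. The function name is hypothetical and the placeholder chains only stand in for real sampler output.

```python
def pool_after_burn_in(chains, burn_in):
    """Drop the first burn_in draws of every chain and pool the rest."""
    return [draw for chain in chains for draw in chain[burn_in:]]

# Placeholder chains: 5 sequences of 4000 draws, each draw tagged (chain, step).
chains = [[(j, i) for i in range(4000)] for j in range(5)]
pooled = pool_after_burn_in(chains, burn_in=1000)
```

Each pooled draw can then be converted to a design value to build the sampling distribution of the quantile.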


5.2 Design peak discharges estimation

From the 15000 parameter sets θ=(EX, Cv, Cs) sampled by MCMC, the posterior distributions of the parameters were obtained, as shown in Figures 5(a)–(c).

The MCMC sampling enables us to describe the sampling distributions of the design values or quantiles, from which estimates of the quantiles and their confidence limits can be obtained, as shown in Table 2 and Figure 6. In Figure 6, the curve-fitting method is a parameter estimation approach based on goodness-of-fit criteria and is a traditional technique widely used in China. The Bayesian expectation estimate is the expectation of the quantile sampling distribution defined by eq. (10). The 90% confidence limits of the design values are also obtained from their sampling distributions. Note that the coefficient of variation of the quantile sampling distribution increases as the exceedance probability decreases, i.e., a quantile with a smaller exceedance probability has a larger sampling variation, and the gap between the upper and lower confidence limits becomes larger.

Figure 4 Evolutions of the scale reduction scores R^{1/2} for three parameters.

Figure 5 Histograms of parameters’ posterior distributions.

Table 2 Design values or quantile estimates of flood peak flow

         Curve-fitting   Bayesian
p        Xp              Xp      Std Dev   Dev Coeff   L90%
0.1      3730            3304    409       0.12        [2891, 3824]
1        2700            2471    241       0.10        [2227, 2763]
2        2384            2210    199       0.09        [2013, 2440]
5        1955            1852    153       0.08        [1719, 2001]

Note. p denotes the exceedance probability (%); Xp is the estimate of the curve-fitting method or the expected value of the Bayesian estimate (m³ s⁻¹); Std Dev is the standard deviation of the Bayesian estimate; Dev Coeff is the coefficient of deviation of the Bayesian estimate; and L90% denotes the 90% confidence limits of the quantile from the Bayesian method.

Figure 6 Frequency curves of flood peak flow.

5.3 Estimation of safety amendments for quantiles

In engineering hydrology, a modified value or safety amendment (denoted by ΔXp) is added to the estimated design value for the sake of engineering safety. An empirical formula for calculating ΔXp is

$$\Delta X_p = \alpha\, \sigma_{x_p}, \tag{18}$$

where α is a reliability coefficient, usually taken as 0.7, and σxp is the standard deviation of the design value or quantile. The estimation of σxp is vital for the calculation of ΔXp; currently, it is approximately estimated by

$$\sigma_{x_p} = \frac{\bar{x}\, C_v}{\sqrt{n}}\, B, \tag{19}$$

where B is a function of Cs and the exceedance probability p, and for practical use a nomogram for B is always prepared beforehand.

It is known that eq. (19) is an approximate expression for computing the standard deviation of the design value σxp, and is only valid when the parameters x̄, Cv and Cs are estimated by the method of moments. In contrast, the Bayesian method provides estimates of the sampling distributions of quantiles through repeated sampling, so σxp can be obtained directly. The results of this study are compared with those of the traditional approach in Table 3. For the studied sample series, the safety amendments ΔXp for quantiles with exceedance probabilities p=0.1%, 1%, 2% and 5% derived from the Bayesian method are all smaller than those from the curve-fitting method.
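As a worked check of eq. (18), the Bayesian safety amendments can be recomputed from the expected values and standard deviations listed in Table 2, using the usual reliability coefficient of 0.7. The variable and function names below are hypothetical; the numbers are those of Tables 2 and 3.

```python
RELIABILITY_COEFF = 0.7

# Bayesian results from Table 2: p (%) -> (expected Xp, standard deviation).
BAYES = {0.1: (3304, 409), 1: (2471, 241), 2: (2210, 199), 5: (1852, 153)}

def safety_amendment(std_dev, coeff=RELIABILITY_COEFF):
    """Safety amendment of eq. (18): reliability coefficient times std dev."""
    return coeff * std_dev

# p (%) -> (Xp, rounded amendment, amended design value).
rows = {}
for p, (xp, sd) in BAYES.items():
    delta = round(safety_amendment(sd))
    rows[p] = (xp, delta, xp + delta)

# Coefficient of deviation (Std Dev / Xp), rounded as in Table 2.
dev_coeff = {p: round(sd / xp, 2) for p, (xp, sd) in BAYES.items()}
```

For p=0.1% this gives 0.7 × 409 ≈ 286 and an amended design value of 3304 + 286 = 3590 m³ s⁻¹, in agreement with the Bayesian columns of Table 3.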

Table 3 Results of safety amendments for quantiles (discharge in m³ s⁻¹)

         Bayesian method               Curve-fitting method
p (%)    Xp      ΔXp    Xp+ΔXp         Xp      ΔXp    Xp+ΔXp
0.1      3,304   286    3,590          3,730   408    4,138
1        2,471   169    2,640          2,700   232    2,932
2        2,210   139    2,349          2,384   196    2,580
5        1,852   107    1,959          1,955   130    2,085

6 Summary and conclusions

The aim of flood frequency analysis is to provide accurate estimates of quantiles or design values with rare exceedance probabilities for hydraulic engineering works. In the real world, such estimation is always carried out with limited flood information or data. The shortage of data diminishes the reliability of design values and brings extra risks to engineering works. Therefore, hydrologists have long sought to couple various sources of information to improve the reliability of estimation; for example, using regional flood data together with at-site flood extremes to estimate the parameters of a distribution is a frequently adopted approach. The Bayesian formula provides a specific theoretical framework for combining prior and sample information. Accordingly, as in other fields, Bayesian approaches have been studied and applied widely in hydrological frequency analysis.

This study describes the application of a Bayesian approach to hydrological frequency analysis for the P-III probability distribution. An MCMC sampling algorithm called adaptive Metropolis is used to obtain the posterior probability distributions of the parameters, from which the sampling distributions of quantiles or design values are estimated. Based on these sampling distributions, the Bayesian approach furnishes not only point estimators for quantiles, but also evaluations of the uncertainties of these estimators. When the Bayesian approach is used for flood frequency analysis, the posterior distribution is expressed in the form of an integral, and for most of the probability distributions used in hydrology the integral may be very complex and have no analytical solution, which brings extra difficulties in computation. Therefore, at present it seems that MCMC-based techniques must be employed to evaluate the integral. Various MCMC sampling algorithms have been developed, among which the AM sampling technique is adopted in this study. Unlike most other algorithms, the AM algorithm does not require the proposal distributions of the parameters to be assigned in advance; instead, it estimates the covariance matrices of the posterior parameters in an adaptive and successive manner. This merit of the AM algorithm guarantees the efficiency and convergence of the sampling process.

A number of sources of uncertainty inevitably exist in flood frequency analysis. They can be summarized into three categories [39]: natural uncertainty inherent in the natural generating process, estimation uncertainty due to limited data, and model uncertainty arising from the assignment of a specific statistical distribution. These uncertainties lead to uncertainty in the estimated quantiles or design values. To ensure the safety of engineering works, an additional amendment is needed in practical implementation, but approaches for computing the amendment value are far from mature. Because the Bayesian approach yields sampling distributions for design values, it provides a new means of estimating the safety amendment in hydraulic engineering design. It should be mentioned that this study deals only with the estimation uncertainty and ignores the other sources; therefore, the sampling distributions do not account for the entire uncertainty of the quantile estimates.
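How a sampling distribution of a design value might be summarized from posterior parameter sets can be sketched in Python as follows. Several elements here are assumptions rather than part of this study: the P-III distribution is parameterized by mean, Cv and Cs; the quantile uses the Wilson-Hilferty approximation of the frequency factor instead of an exact inverse; and the `margin` entry (upper bound minus expectation) is only one hypothetical way to express a safety amendment. The function names `p3_quantile` and `summarize_design_value` are likewise illustrative.

```python
from statistics import NormalDist, fmean


def p3_quantile(mean, cv, cs, p):
    """Approximate P-III quantile for non-exceedance probability p.

    Assumption of this sketch: Wilson-Hilferty approximation of the
    frequency factor, adequate only for moderate skewness.
    """
    z = NormalDist().inv_cdf(p)
    if abs(cs) < 1e-8:
        k = z                                   # zero skew: normal limit
    else:
        k = (2.0 / cs) * ((1.0 + cs * z / 6.0 - cs ** 2 / 36.0) ** 3 - 1.0)
    return mean * (1.0 + cv * k)


def summarize_design_value(param_sets, p, level=0.90):
    """Summarize the sampling distribution of a quantile.

    param_sets: iterable of (mean, cv, cs) tuples, e.g. posterior draws.
    Returns the expectation estimator and an equal-tailed interval.
    """
    q = sorted(p3_quantile(m, cv, cs, p) for m, cv, cs in param_sets)
    n = len(q)
    lower = q[round((1.0 - level) / 2.0 * (n - 1))]
    upper = q[round((1.0 + level) / 2.0 * (n - 1))]
    expectation = fmean(q)
    return {
        "expectation": expectation,
        "lower": lower,
        "upper": upper,
        "margin": upper - expectation,          # hypothetical safety amendment
    }
```

Given a list of posterior draws `draws`, a call such as `summarize_design_value(draws, 0.99)` would describe the design value with a 100-year return period. Because only estimation uncertainty enters the posterior draws, the resulting interval reflects that source alone.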

This work was financially supported by the National Basic Research Program of China (“973” Program) (Grant No. 2007CB714104) and the National Natural Science Foundation of China (Grant No. 50779013).

1 Stedinger J R, Vogel R M, Foufoula-Georgiou E. Frequency analysis of extreme events. In: Maidment D R, ed., Handbook of Hydrology. New York: McGraw-Hill, 1992, 18.1–18.66

2 Laio F, Di Baldassarre G, Montanari A. Model selection techniques for the frequency analysis of hydrological extremes. Water Resour Res, 2009, 45, W07416, doi: 10.1029/2007WR006666

3 Wallis J R, Matalas N C, Slack J R. Just a moment! Water Resour Res, 1974, 10(2): 211–221

4 Griffis V W, Stedinger J R. Log-Pearson type 3 distribution and its application in flood frequency analysis. II: Parameter estimation methods. J Hydrol Eng, 2007, 12(5): 492–500

5 Landwehr J M, Matalas N C, Wallis J R. Probability weighted moments compared with some traditional techniques in estimating Gumbel parameters and quantiles. Water Resour Res, 1979, 15(5): 1055–1064

6 Greenwood J A, Landwehr J M, Matalas N C, et al. Probability weighted moments: Definition and relation to parameters of several distributions expressable in inverse form. Water Resour Res, 1979, 15(5): 1049–1054, doi: 10.1029/WR015i005p01049

7 Landwehr J M, Matalas N C, Wallis J R. Estimation of parameters and quantiles of Wakeby distributions. Water Resour Res, 1979, 15(6): 1361–1379

8 Greis N P, Wood E F. Regional flood frequency estimation and network design. Water Resour Res, 1981, 17(4): 1167–1177

9 Hosking J R M, Wallis J R, Wood E F. Estimation of the generalized extreme-value distribution by the method of probability-weighted moments. Technometrics, 1985, 27(3): 251–261

10 Hosking J R M, Wallis J R. The value of historical data in flood frequency analysis. Water Resour Res, 1986, 22(11): 1606–1612

11 Song D, Ding J. The application of probability weighted moments in estimating the parameters of the Pearson type three distribution. J Hydrol, 1988, 101(1-4): 47–61

12 Ding J, Yang R. The determination of probability weighted moments with the incorporation of extraordinary values into sample data and their application to estimating parameters for the Pearson type three distribution. J Hydrol, 1988, 101(1-4): 63–81

13 Whalen T M, Savage G T, Jeong G D. The method of self-determined probability weighted moments revisited. J Hydrol, 2002, 268(1-4): 177–191

14 Hosking J R M. L-moments: Analysis and estimation of distributions using linear combinations of order statistics. J Roy Stat Soc B, 1990, 52(1): 105–124

15 Hosking J R M, Wallis J R. Regional Frequency Analysis. Cambridge: Cambridge University Press, 1997. 14–141

16 Hosking J R M. On the characterization of distributions by their L-moments. J Statist Plan Infer, 2006, 136(1): 193–198

17 Peel M C, Wang Q J, Vogel R M, et al. The utility of L-moment ratio diagrams for selecting a regional probability distribution. Hydrolog Sci J, 2001, 46(1): 147–156

18 Abramowitz M, Stegun I A. Handbook of mathematical functions, Appl Math Ser 55, National Bureau of Standards, U.S. Government Printing Office, Washington, D.C., 1964

19 Vicens G J, Rodriguez-Iturbe I, Schaake J C Jr. A Bayesian framework for the use of regional information in hydrology. Water Resour Res, 1975, 11(3): 405–414

20 Stedinger J R. Design events with specified flood risk. Water Resour Res, 1983, 19(2): 511–522

21 Wood E F, Rodriguez-Iturbe I. Bayesian inference and decision making for extreme hydrologic events. Water Resour Res, 1975, 11(4): 533–542

22 Wood E F, Rodriguez-Iturbe I. A Bayesian approach to analyzing uncertainty among flood frequency models. Water Resour Res, 1975, 11(6): 839–843

23 Tang W H. Bayesian frequency analysis. J Hydraul Div-ASCE, 1980, 106: 1203–1218

24 Van Gelder P H A J M, Van Noortwijk J M, Duits M T. Selection of probability distribution with a case study on extreme Oder River discharges. In: The Tenth European Conference on Safety and Reliability, Munich-Garching, Germany, 1999. 1475–1480

25 Kuczera G. Combining site-specific and regional information: An empirical Bayes approach. Water Resour Res, 1982, 18(2): 306–314

26 Kuczera G. Comprehensive at-site flood frequency analysis using Monte Carlo Bayesian inference. Water Resour Res, 1999, 35(5): 1551–1557

27 O'Connell D R H, Ostenaa D A, Levish D R, et al. Bayesian flood frequency analysis with paleohydrologic bound data. Water Resour Res, 2002, 38(5), 1058, doi: 10.1029/2000WR000028

28 Reis D S Jr, Stedinger J R, Martins E S. Bayesian generalized least squares regression with application to log Pearson type 3 regional skew estimation. Water Resour Res, 2005, 41, W10419, doi: 10.1029/2004WR003445

29 Reis D S Jr, Stedinger J R. Bayesian MCMC flood frequency analysis with historical information. J Hydrol, 2005, 313(1-2): 97–116

30 Seidou O, Ouarda T B M J, Barbet M, et al. A parametric Bayesian combination of local and regional information in flood frequency analysis. Water Resour Res, 2006, 42, W11408, doi: 10.1029/2005WR004397

31 Ribatet M, Sauquet E, Gresillon J M, et al. A regional Bayesian POT model for flood frequency analysis. Stoch Environ Res Risk Assess, 2007, 21(4): 327–339

32 Ribatet M, Sauquet E, Gresillon J M, et al. Usefulness of the reversible jump Markov Chain Monte Carlo model in regional flood frequency analysis. Water Resour Res, 2007, 43, W08403, doi: 10.1029/2006WR005525

33 Micevski T, Kuczera G. Combining site and regional flood information using a Bayesian Monte Carlo approach. Water Resour Res, 2009, 45, W04405, doi: 10.1029/2008WR007173

34 Blasone R S, Vrugt J A, Madsen H, et al. Generalized likelihood uncertainty estimation (GLUE) using adaptive Markov Chain Monte Carlo sampling. Adv Water Resour, 2008, 31(4): 630–648

35 Gallagher K, Charvin K, Nielsen S, et al. Markov chain Monte Carlo (MCMC) sampling methods to determine optimal models, model resolution and model choice for earth science problems. Mar Petrol Geol, 2009, 26(4): 525–535

36 Haario H, Saksman E, Tamminen J. An adaptive Metropolis algorithm. Bernoulli, 2001, 7(2): 223–242

37 Haario H, Laine M, Mira A, et al. DRAM: Efficient adaptive MCMC. Stat Comput, 2006, 16(4): 339–354

38 Gelman A, Rubin D B. Inference from iterative simulation using multiple sequences. Stat Sci, 1992, 7(4): 457–511

39 Benjamin J R, Cornell C A. Probability, Statistics and Decision for Civil Engineers. New York: McGraw-Hill, 1970