
A Nonlinear Disaggregation Method With a Reduced Parameter Set for Simulation of Hydrologic Series



A multivariate dynamic disaggregation model is developed as a stepwise approach to stochastic disaggregation problems, oriented toward hydrologic applications. The general idea of the approach is the conversion of a sequential stochastic simulation model, such as a seasonal AR(1), into a disaggregation model. Its structure includes two separate parts, a linear step-by-step moments determination procedure, based on the associated sequential model, and an independent nonlinear bivariate generation procedure (partition procedure). The model assures the preservation of the additive property of the actual (not transformed) variables. Its modular structure allows for various model configurations. Two different configurations (PAR(1) and PARX(1)), both associated with the sequential Markov model, are studied. Like the sequential Markov model, both configurations utilize the minimum set of second-order statistics and the marginal means and third moments of the lower-level variables. All these statistics are approximated by the model with the use of explicit relations. Both configurations perform well with regard to the correlation of consecutive lower-level variables each located in consecutive higher-level time steps. The PARX(1) configuration exhibits better behavior with regard to the correlation properties of lower-level variables with lagged higher-level variables.
Department of Civil Engineering, National Technical University of Athens, Zografou, Greece
Disaggregation models are widely used tools for the sto-
chastic simulation of hydrologic series. They differ from
sequential stochastic simulation models in that instead of
generating events sequentially, they divide known higher-
level values (e.g., annual) into lower-level ones (e.g., sea-
sonal) which add up to the given higher-level values. Thus,
they have the advantage of providing the ability to transform
a time series from a higher time scale to a lower one,
resulting in a multiple time scale preservation of the sto-
chastic structure of the time series. However, disaggregation
models are more complicated than sequential models and
they have some other drawbacks, as discussed below.
The linear disaggregation model in its initial structure was
developed by Valencia and Schaake [1972, 1973]. Their
model assures the resemblance of variance and covariance
properties between historical and generated lower-level vari-
ables in the interior of a higher-level time step. However, it
makes no effort to preserve covariances of the lower-level
variables belonging to consecutive higher-level time steps.
In fact, this model does not assume any connection between
lower-level variables of different higher-level time steps.
Different model structures and parameter estimation pro-
cedures intended to preserve the lagged covariance proper-
ties among lower-level variables belonging to consecutive
higher-level times steps have been suggested by Mejia and
Rousselle [1976], Hoshi and Burges [1979] and Stedinger
and Vogel [1984]. However, as shown by Stedinger and
Vogel [1984], the reproduction of the exact historical serial
correlations of the lower-level variables can be an impossible
task within a disaggregation framework because of structural
constraints imposed by that framework, thus giving rise to
an inconsistency.
Copyright 1992 by the American Geophysical Union.
Paper number 92WR01299.
0043-1397/92/92WR-01299$05.00
In cases where all variables are normally distributed, the
basic linear generating scheme of the Valencia-Schaake
model assures the preservation of higher-order moments and
distribution functions as well; otherwise, the model in its
primary formulation fails to maintain such properties. Sug-
gested modifications of the model, permitting the represen-
tation of nonnormal distributions, can be classified in two
categories. First are methods oriented toward the preservation
of merely the skewness of the lower-level variables [Tao
and Delleur, 1976; Todini, 1980]. They generate independent
random components having skewed distributions with the
skewness coefficients being determined in terms of the third
moments of the historical data. Second are methods that
utilize nonlinear transformations of the variables, so that the
transformed variables have normal distributions [Valencia
and Schaake, 1972; Hoshi and Burges, 1979; Stedinger and
Vogel, 1984]. However, as Todini [1980] notes, this means
that the additive property, which is one of the main at-
tributes of the original disaggregation scheme, is lost. To
overcome this problem a correction procedure is usually
suggested [Lane and Frevert, 1990; Stedinger and Vogel,
1984; Grygier and Stedinger, 1988, 1990].
Another drawback of disaggregation models is their excessive
number of parameters, because of the large number of
cross correlations that they attempt to reproduce. Two
different procedures to reduce the required number of
parameters have been developed: the staged disaggregation
models (SDMs) and the condensed disaggregation models
(CDMs). The former [Lane, 1979, 1982; Salas et al., 1980;
Stedinger and Vogel, 1984; Grygier and Stedinger, 1988;
Lane and Frevert, 1990] disaggregate higher-level variables
at one or more sites to lower-level variables at those and
other sites in two or more steps. The latter [Lane, 1979,
1982; Pereira et al., 1984; Oliveira et al., 1988; Stedinger and
Vogel, 1984; Stedinger et al., 1985; Grygier and Stedinger,
1988] reduce the number of required parameters by explicitly
modeling fewer of the correlations among the lower-level
variables.
Recently, another approach aiming mainly toward the
reduction of the parameter set required for a disaggregation
procedure, as well as toward providing the ability to repre-
sent non-Gaussian distributions, was introduced. The
model, called the dynamic disaggregation model (DDM)
[Koutsoyiannis, 1988; Koutsoyiannis and Xanthopoulos,
1990] has been formulated as a generalized stepwise ap-
proach to stochastic disaggregation problems, allowing for a
variety of configurations. DDM is closely connected with an
associated particular sequential generating model (e.g., a
PAR model) and uses the same parameter set as in this
sequential model. With the DDM approach, lower-level
variables are generated one at a time, given the total amount
to be allocated across the remaining periods. Thus the
disaggregation of a higher-level variable into its components
(lower-level variables) is split into equivalent sequential
steps, each corresponding to a specific lower-level time
period. The method involves two separate procedures in
each step. First the parameters of the generation equations
for this step are determined so as to preserve the specified
marginal and conditional moments given the total amount
remaining to be allocated and the already generated values at
previous steps. Second, these parameters are used to gener-
ate the step's amount (lower-level variable) and to update
the remaining amount to be allocated in next steps. These
two separate procedures are called the moments determina-
tion procedure and the partition procedure.
Each model configuration depends on the particular par-
tition procedure and moments determination procedure uti-
lized. The former is affected mainly by the marginal distri-
butions of consecutive lower-level variables, while the latter
is influenced by the type of their stochastic dependence, as
described by the associated sequential model. The first
studied forms of the model [Koutsoyiannis and Xanthopoulos,
1990] concerned only single-site problems described by
Markov sequences with Gaussian or gamma marginal distributions.
In this paper the model has been generalized for the
multivariate Markov case, i.e., it assumes a Markov depen-
dence of the lower-level variables. The Markov case is
selected because of its simplicity and the minimum set of
parameters that it involves. In addition, the explicit treat-
ment of third-order moments is supported by the present
model form. The model also approximates the lag one
correlation of the lower-level variables belonging to consecutive
higher-level time steps. Finally, a model configuration
capable of preserving correlations of lagged higher- and
lower-level variables, in light of the inconsistency reported
by Stedinger and Vogel [1984], is discussed.
The model has been used in several hydrologic applica-
tions, such as the disaggregation of annual flows, rainfall
depths and lake evaporation into monthly quantities, as well
as the disaggregation of monthly into hourly rainfall depths.
Consider a specific higher-level time step, or period (e.g.,
1 year), and a subdivision of the period in k lower-level time
steps or subperiods (e.g., 12 months), each denoted with a
time index $t = 1, \ldots, k$. We denote by
$Z = (Z_1, Z_2, \ldots, Z_n)^T$ the higher-level variables of that
period at n sites and by $X^t = (X_1^t, X_2^t, \ldots, X_n^t)^T$ the
lower-level variables of subperiod t at the same n sites.
Higher- and lower-level variables satisfy
$$X^1 + X^2 + \cdots + X^t + \cdots + X^k = Z \qquad (1)$$
We define a partial sum of lower-level variables, referred to
as amount still to go, by
$$S^t = X^t + X^{t+1} + \cdots + X^k \qquad (2)$$
also expressed by
$$S^t = Z - X^1 - X^2 - \cdots - X^{t-1} \qquad (3)$$
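The equivalence of the two expressions for the amount still to go can be illustrated with a minimal single-site sketch. The function names and the numerical values are hypothetical and serve only as an illustration of the definitions (1)-(3).

```python
from itertools import accumulate


def amounts_still_to_go(x):
    """Return [S^1, ..., S^k] from x = [X^1, ..., X^k] using definition (2)."""
    # Reverse cumulative sum: S^t = X^t + X^{t+1} + ... + X^k.
    return list(accumulate(reversed(x)))[::-1]


def amounts_from_total(z, x):
    """Return the same sequence using (3): S^t = Z - X^1 - ... - X^{t-1}."""
    s, remaining = [], z
    for value in x:
        s.append(remaining)   # amount still to go before allocating X^t
        remaining -= value
    return s


x = [3, 5, 2, 10]   # hypothetical lower-level values, k = 4
z = sum(x)          # additive property (1)
s_by_2 = amounts_still_to_go(x)
s_by_3 = amounts_from_total(z, x)
```

Both functions return the same list, with the first element equal to the higher-level value and the last element equal to the last lower-level value.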
With the help of the above notation we discuss now some
key elements of the multivariate dynamic disaggregation
model. These elements should not be considered as model
steps but rather as an introductory summary of the key
features of the proposed model to be discussed in detail in
the next sections.
1. The disaggregation of the higher-level variable, Z, into
its k components $X^1, X^2, \ldots, X^k$ is split into n(k - 1)
sequential steps, each referring to the generation of one
lower-level variable at a single site. (For one site we need
k - 1 steps, since the last lower-level variable can be
calculated from (1).)
2. At the beginning of step t at site j, we have at our
disposal knowledge of the previously generated information,
consisting of (1) the whole sequence of higher-level values
and (2) the lower-level values of the previous steps.
Consequently, from (3), the amount still to go, $s_j^t$, is also known.
3. Based on (2) and a specific assumption about the
stochastic dependence of the $X^t$ sequence in each step, the
distribution function of $(X_j^t, S_j^t)$, conditional on previously
generated information, is determined, or, in fact, approximated
via conditional moments. Thus, in each step, we have
a moments determination procedure. A mathematically convenient
assumption about the stochastic dependence of the
$X^t$, allowing for the calculation of conditional moments, can
be that $X^t$ is an autoregressive sequence. This specific
autoregressive sequence of $X^t$ will be referred to as the
associated sequential model of the disaggregation model. It
must be emphasized that, given an assumption for the
dependence of $X^t$, the dependence between $X^t$ and $Z$ (or $X^t$
and $S^t$) is fully determined by (1) (or (2)).
4. Given the joint and marginal moments of $(X_j^t, S_j^t)$,
conditional on previously generated information, the generation
of $X_j^t$ can be viewed as an entirely isolated problem
with three variables, $X_j^t$, $S_j^{t+1}$ and $S_j^t$, satisfying
$$X_j^t + S_j^{t+1} = S_j^t \qquad (4)$$
This isolated procedure, called the partition procedure,
ignores all other variables of the problem. It divides the
known amount $S_j^t$ into two parts $X_j^t$ and $S_j^{t+1}$. Its connection
to the other part of the model is that it is fed by the
conditional moments of $(X_j^t, S_j^t)$, which contain the stochastic
structure of the whole sequence of variables. The partition
procedure can take several forms, depending on the
particular marginal distribution of the lower-level variables
and the statistics that are to be preserved. This procedure is
based on a generating expression,
$$X_j^t = h(S_j^t, W_j^t) \qquad (5)$$
where h( , ) is generally a nonlinear function to be
discussed later, and $W_j^t$ is a random variable independent of
$S_j^t$ and not necessarily normal.
5. After the execution of the partition procedure the
previously generated information is updated with the known
$X_j^t$ and the remaining quantity $S_j^{t+1}$ is transferred to the next
step of the same site.
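The sequence of steps outlined above can be sketched for a single site as follows. This is only a structural illustration under simplifying assumptions: the moments determination procedure is omitted, and `toy_partition` is a hypothetical stand-in for the generating expression (5), not one of the partition procedures studied in this paper.

```python
import random


def disaggregate(z, k, partition, rng):
    """Split z into k lower-level values that add up to z (single site)."""
    x, s = [], z                      # s is the amount still to go, S^1 = Z
    for t in range(1, k):             # k - 1 partition steps
        x_t = partition(s, t, rng)    # generating expression (5): X^t = h(S^t, W^t)
        x.append(x_t)
        s -= x_t                      # S^{t+1} = S^t - X^t, from (4)
    x.append(s)                       # the last value follows from (1)
    return x


def toy_partition(s, t, rng):
    """Hypothetical h: allocate a uniform random fraction of s to step t."""
    return s * rng.random()


values = disaggregate(100.0, 12, toy_partition, random.Random(0))
```

Whatever the form of the partition function, the additive property is retained by construction, because the last lower-level value is obtained as the remaining amount.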
The main novelty of this approach is the different treat-
ment of the dependence across the sequence of low-level
variables and the dependence implied by the additive prop-
erty (1). The former is studied through the moments deter-
mination procedure and the latter with the partition proce-
dure. The resulting modular structure of the model is flexible
and can give rise to several model configurations.
The present model apparently has many differences from
other disaggregation models. Unlike the totally linear type of
disaggregation models [Valencia and Schaake, 1972, 1973;
Mejia and Rousselle, 1976; Tao and Delleur, 1976; Hoshi
and Burges, 1979; Todini, 1980; Stedinger and Vogel, 1984],
DDM is a combination of a linear part (moments determina-
tion procedure) and a nonlinear one (partition procedure,
based on (5)). Furthermore, the linear part is not at all like
the linear disaggregation equations used by other models.
Like Todini's [1980] model, DDM is capable of explicitly
treating the skewness of the lower-level variables, without
loss of the additive property, but there are no other similar-
ities between the two models. DDM maintains only a subset
of the joint second-order moments among the lower-level
variables, as do condensed disaggregation models (CDMs);
however, there are essential differences between DDM and
CDMs, to be discussed later.
General Considerations
Since the partition procedure is isolated from the other
parts of the disaggregation model, it is studied separately in
this section. For reasons of convenience the notation in this
section will be as simple as possible, avoiding time and site
indexes which are not necessary here. The partition proce-
dure may be considered as a simple disaggregation problem
with two lower-level variables at a single site. It must be
emphasized that the extension of the partition procedure to
the multivariable and multisite case is not necessary. What is
needed in order to have a multivariate disaggregation model
is the development of a separate moments determination
procedure and the connection of the two procedures.
Assume, therefore, the case of two lower-level variables,
X and Y adding up to the higher level variable S, i.e.,
$$X + Y = S \qquad (6)$$
In order to generate X we need to determine the following
generating expression, similar to (5):
$$X = h(S, W) \qquad (7)$$
where W is a random variable independent of S and not
necessarily Gaussian. Given a specific expression h( , )
and a specific distribution function F(w), X can be directly
generated from (7) and Y can then be obtained from (6). The
determination of h( , ) and F(w) aims toward the pres-
ervation of the joint distribution of (X, S). However, with
the exception of some special cases, the preservation of the
complete distribution is an impossible task. For practical
purposes though, the preservation of the first three moments
may suffice. The introduction of the third moment is neces-
sary if we want to cope with nonnormal distributions.
Since S is already known at the beginning of the partition
procedure, the preservation of its marginal moments, i.e.,
$\lambda_1' = E[S]$ and $\lambda_i = E[(S - \lambda_1')^i]$, $i = 2, 3, \ldots$, is presumed.
The complete preservation of first, second, and third moments
of X, Y and S requires that the following six quantities
are maintained:
Marginal moments of X
$$\eta_1' = E[X], \quad \eta_2 = \mathrm{Var}[X] = E[(X - \eta_1')^2], \quad \eta_3 = \mu_3[X] = E[(X - \eta_1')^3] \qquad (8a)$$
Joint moments of (X, S)
$$\xi_{11} = \mathrm{Cov}[X, S] = E[(X - \eta_1')(S - \lambda_1')] \qquad (8b)$$
$$\xi_{12} = \mu_{12}[X, S] = E[(X - \eta_1')(S - \lambda_1')^2] \qquad (8c)$$
$$\xi_{21} = \mu_{21}[X, S] = E[(X - \eta_1')^2 (S - \lambda_1')] \qquad (8d)$$
where $\mu$ with one or two subscripts denotes marginal or
cross central moments. It is easily shown that any marginal
or joint moment of order $\le 3$ between (X, Y, S) can be
determined in terms of the above six quantities and the
marginal moments of S, i.e., $\lambda_1'$, $\lambda_2$, $\lambda_3$ (see, as examples,
(21)-(24) below). Thus, given that the $\lambda$ are preserved outside of
the partition model and Y is derived from (6), the preservation
of the $\eta$ and $\xi$ implies the preservation of any moment of
order $\le 3$, marginal or joint. In cases where only the marginal
moments of X and Y are of interest, the preservation of
merely the difference $\xi_{12} - \xi_{21}$, instead of their particular
values, is required.
The Linear Scheme
Let us first examine the linear form of h( , ) (the linear
disaggregation scheme), defined by
$$X = aS + bW \qquad (9)$$
where a and b are parameters to be estimated. Assuming
with no loss of generality that Var [W] = 1, the other two
parameters of the model are E[W] and $\mu_3[W]$. This scheme
is capable of preserving first and second moments, but not
nonzero third moments of X and Y. Indeed, a, b and E[W]
can be determined so as to preserve the first and second
moments ($a = \xi_{11}/\lambda_2$, $b = (\eta_2 - \xi_{11}^2/\lambda_2)^{1/2}$,
$E[W] = (\eta_1' - a\lambda_1')/b$). Then $\mu_3[W]$ can be determined such that
$\eta_3 = \mu_3[X]$ is preserved. Since there is no other parameter, there
is no way to preserve $\xi_{12}$ and $\xi_{21}$. Consequently, $\mu_3[Y]$ could
not be preserved by this scheme. The same is pointed out
elsewhere [Koutsoyiannis and Xanthopoulos, 1990] and is
implied by Todini [1980]. In fact, Todini [1980, p. 203] uses a
linear disaggregation scheme with more than two lower-level
variables, where the situation is different, if only marginal
third moments are to be preserved.
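The parameter relations of the linear scheme can be checked with a short numerical sketch. The target statistics below are hypothetical values chosen for illustration; the check uses only the fact that W is independent of S with unit variance.

```python
import math


def linear_scheme_parameters(eta1, eta2, xi11, lam1, lam2):
    """Return (a, b, E[W]) of X = a S + b W with Var[W] = 1.

    eta1, eta2 : mean and variance of X
    xi11       : covariance of X and S
    lam1, lam2 : mean and variance of S
    """
    a = xi11 / lam2
    b = math.sqrt(eta2 - xi11**2 / lam2)   # requires eta2 * lam2 >= xi11**2
    mean_w = (eta1 - a * lam1) / b
    return a, b, mean_w


# Hypothetical target statistics.
eta1, eta2, xi11 = 4.0, 3.0, 2.0
lam1, lam2 = 10.0, 8.0
a, b, mean_w = linear_scheme_parameters(eta1, eta2, xi11, lam1, lam2)

# Moments implied by X = a S + b W for W independent of S, Var[W] = 1.
mean_x = a * lam1 + b * mean_w
var_x = a**2 * lam2 + b**2
cov_xs = a * lam2
```

The implied mean, variance and covariance coincide with the targets, which illustrates that three parameters suffice for the first- and second-order statistics, leaving only the third moment of W for the third-order ones.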
The preservation of first- and second-order moments is
probably the only task which can be faced in an exact way.
The linear model is sufficient for this task. For the special
case where all variables are normal the linear model can also
preserve the complete distributions of all variables. Below
some nonlinear schemes are examined, which can cope with
higher moments for the general case, but are not exact
schemes in that they do not preserve any statistic exactly but
only approximately, as will be discussed later.
The Use of Nonlinear Transformations
It is a common practice in stochastic models, even in
disaggregation models, to use nonlinear transformations of
variables with nonsymmetrical distributions in order to
achieve normal distributions, which are easier to handle. Let
us examine the logarithmic transformation, i.e.,
$$X' = \ln (X - c_X) \qquad (10)$$
$$Y' = \ln (Y - c_Y) \qquad (11)$$
$$S' = \ln (S - c_S) \qquad (12)$$
where $c_X$, $c_Y$ and $c_S$ are parameters to be estimated.
Apparently, because of (6) and since the lognormal distribution is
not regenerative, X', Y' and S' cannot all be normal. Consequently,
there is no case where a linear generating rule like
$$X' = a_X S' + b_X W_X' \qquad (13)$$
along with (6) could be exact with respect to the complete
distribution of the three variables. Furthermore, if we reduce
the requirements to the moments preservation, the situation
will be similar to that of the linear scheme. Indeed, the
scheme in (13) again has four parameters to be determined:
$a_X$, $b_X$, $c_X$ and $E[W_X']$. Note that $c_S$ is not a parameter of
the generating scheme itself, since it refers to the presumed
distribution of S. Furthermore, $\mu_3[W_X']$ is not a parameter;
it should be zero because $W_X'$ should be assumed normal;
otherwise there do not exist analytical relations between the
statistics of X and X'. Thus the above scheme, because of a
lack of a sufficient number of parameters, is not capable of
preserving the third-order joint moments $\xi_{12}$ and $\xi_{21}$.
Another method, which is mostly used in disaggregation
models, initially ignores (6) and generates independently X'
from (13) and Y' from a similar relation, i.e.,
$$Y' = a_Y S' + b_Y W_Y' \qquad (14)$$
By taking antilogarithms in (13) and (14) we find the following
first approximations of X and Y:
$$X^* = (S - c_S)^{a_X} W_X^{b_X} + c_X \qquad (15)$$
$$Y^* = (S - c_S)^{a_Y} W_Y^{b_Y} + c_Y \qquad (16)$$
where $W_X = \exp(W_X')$ and $W_Y = \exp(W_Y')$. Apparently,
X* and Y* do not add up to S. In order to regain the additive
property (6), X* and Y* must be corrected. Lane and
Frevert [1990, p. V-22], Stedinger and Vogel [1984] and
Grygier and Stedinger [1988, 1990] have developed several
adjusting procedures. The so-called proportional adjustments
define the simplest procedure, which is
$$X = S \frac{X^*}{X^* + Y^*}, \qquad Y = S \frac{Y^*}{X^* + Y^*} \qquad (17)$$
Combining (15), (16) and (17), we get the following final
solution for X, or the final generating rule:
$$X = \frac{S\,[(S - c_S)^{a_X} W_X^{b_X} + c_X]}{[(S - c_S)^{a_X} W_X^{b_X} + c_X] + [(S - c_S)^{a_Y} W_Y^{b_Y} + c_Y]} \qquad (18)$$
which can be combined with (6) for the generation of Y.
The last scheme has a sufficient number of independent
parameters (eight) in order to assure the preservation of the
statistics in (8). However, its complexity prohibits the ana-
lytical determination of any statistic, even of the mean. In
that respect the scheme could not be exact, since no exact
parameter estimation procedure can be established.
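The proportional adjustment itself is simple to state in code, even though its statistical properties are not tractable. The sketch below takes the first approximations X* and Y* as given; the numerical values are hypothetical.

```python
def proportional_adjustment(s, x_star, y_star):
    """Rescale the first approximations so that they add up to s."""
    total = x_star + y_star
    return s * x_star / total, s * y_star / total


# Hypothetical values: the first approximations add up to 8, not to S = 10.
x, y = proportional_adjustment(10.0, 3.0, 5.0)
```

The adjusted values keep the ratio of the first approximations and add up to the known amount.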
Somewhat simpler is the situation if an adjustment scheme
based on the standard deviations of the lower-level variables
is used [Lane and Frevert, 1990, p. V-22]:
$$X = X^* + (S - X^* - Y^*)\,\delta_X, \qquad Y = Y^* + (S - X^* - Y^*)(1 - \delta_X) \qquad (19)$$
where $\delta_X = \sigma_X/(\sigma_X + \sigma_Y)$ and $\sigma_X$ and $\sigma_Y$ are the standard
deviations of X and Y. Combining (15), (16) and (19) we get
the following explicit generating rule:
$$X = (1 - \delta_X)[(S - c_S)^{a_X} W_X^{b_X} + c_X] - \delta_X[(S - c_S)^{a_Y} W_Y^{b_Y} + c_Y] + \delta_X S \qquad (20)$$
which can be combined with (6) for the generation of Y. By
using (20) one can get explicit but too complicated relations
for the marginal moments of X ($\eta$ terms) and for the joint
moments of X and S ($\xi$ terms). These relations may be
solved for the unknown parameters only numerically.
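The adjustment based on standard deviations distributes the residual between the known amount and the sum of the first approximations in proportion to the standard deviations of the two variables. A minimal sketch follows, with hypothetical input values.

```python
def sd_adjustment(s, x_star, y_star, sigma_x, sigma_y):
    """Distribute the residual s - x_star - y_star by standard deviations."""
    delta_x = sigma_x / (sigma_x + sigma_y)
    residual = s - x_star - y_star
    x = x_star + residual * delta_x
    y = y_star + residual * (1.0 - delta_x)
    return x, y


# Hypothetical values: residual of 2 units, one quarter of it assigned to X.
x, y = sd_adjustment(10.0, 3.0, 5.0, sigma_x=1.0, sigma_y=3.0)

# Equivalent explicit form: X = (1 - delta) X* - delta Y* + delta S.
x_explicit = (1 - 0.25) * 3.0 - 0.25 * 5.0 + 0.25 * 10.0
```

Unlike the proportional adjustment, this rule is linear in the first approximations, which is why explicit moment relations can be written for it.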
Moreover, (18) and (20) have another source of inaccuracy
arising from the fact that the lognormal distribution is not
regenerative. Even if there would exist an exact parameter
estimation procedure, this would be based on an assumption
that S, $W_X$ and $W_Y$ are lognormal. But then, X and Y could
not be lognormal. Thus in the next execution of the partition
procedure, in which the updated S is equal to the previous
Y, the above assumption is no longer valid.
Consequently, the requirement for exactness involves
insurmountable difficulties. In order to establish an approx-
imate procedure, the usual assumption is that the statistics of
X equal those of X* and the statistics of Y equal those of Y*.
This approximation has performed well in many models.
However, besides the nonexactness of the procedure, another
problem arises: The number of parameters is no longer
sufficient to preserve all the statistics listed in (8). Indeed,
four parameters, namely $a_X$, $b_X$, $c_X$ and $E[W_X']$, are used to
preserve the three marginal moments of X* plus the covariance
of X* and S. The remaining parameters are $a_Y$, $b_Y$, $c_Y$
and $E[W_Y']$. Since the generation of Y* is independent of X*
($X^* + Y^* \ne S$), the first and second moments of Y* are not
automatically preserved, as they are in the case where Y is
generated from (6). Thus one has to use three parameters to
preserve these moments, namely E[Y*], Var [Y*] and Cov
[Y*, S], and there is one remaining parameter, which has to
be used to preserve the third marginal moment of Y*. The
result of this parameter allocation is that the difference
$\xi_{12} - \xi_{21}$ is preserved and not the values of $\xi_{12}$ and $\xi_{21}$. (Note that
$\mu_3[Y]$ is given by (23) below, $\lambda_3$ is preserved outside of the
partition model, $\eta_3$ is approximately preserved by the generation
of X*, and thus the preservation of $\mu_3[Y]$ is equivalent
to the preservation of $\xi_{12} - \xi_{21}$.) Of course, this
weakness is not major: The joint third moments of X and Y
are not preserved, but the marginal ones are preserved
approximately. It is emphasized that the above difficulties in
establishing an exact model are not specific to the present
model, but are also shared by other disaggregation models
that use the logarithmic transformation of variables.
The realization of an approximate parameter estimation
procedure suiting the requirements of this specific partition
procedure is simple. First, calculate the moments of Y in
terms of the moments in (8), by using (6), i.e.,
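$$E[Y] = \lambda_1' - \eta_1' \qquad (21)$$
$$\mathrm{Var}[Y] = \lambda_2 + \eta_2 - 2\xi_{11} \qquad (22)$$
$$\mu_3[Y] = \lambda_3 - \eta_3 - 3\xi_{12} + 3\xi_{21} \qquad (23)$$
$$\mathrm{Cov}[Y, S] = \lambda_2 - \xi_{11} \qquad (24)$$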
Second, assume equality of the moments of X* and Y* with
those of X and Y. Third, determine the moments of X' and
Y' (plus $c_X$ and $c_Y$) in terms of the moments of X* and Y*
by the method of moments [e.g., Charbeneau, 1978; Kottegoda,
1980, p. 137; Stedinger, 1980]. (Due to the moments
orientation of the procedure other estimation methods
discussed by Stedinger [1980] are not applicable here.) Fourth,
calculate the parameters of the linear schemes (13) and (14)
by standard methods.
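The first step of this procedure, the calculation of the moments of Y from those of X and S, can be verified on a small artificial population. The data below are hypothetical and the helper names are not part of the model; exact rational arithmetic is used so that the comparison is free of rounding error.

```python
from fractions import Fraction as F


def central(a_values, b_values, p, q):
    """Population cross central moment E[(A - E[A])^p (B - E[B])^q]."""
    n = len(a_values)
    mean_a, mean_b = sum(a_values) / n, sum(b_values) / n
    return sum((a - mean_a)**p * (b - mean_b)**q
               for a, b in zip(a_values, b_values)) / n


def moments_of_y(lam1, lam2, lam3, eta1, eta2, eta3, xi11, xi12, xi21):
    """Moments of Y = S - X from the moments of S, of X and of (X, S)."""
    return {
        "mean": lam1 - eta1,
        "var": lam2 + eta2 - 2 * xi11,
        "mu3": lam3 - eta3 - 3 * xi12 + 3 * xi21,
        "cov_ys": lam2 - xi11,
    }


# Hypothetical population of (X, Y) pairs, with S = X + Y.
xs = [F(1), F(2), F(4), F(7)]
ys = [F(3), F(1), F(6), F(2)]
ss = [x + y for x, y in zip(xs, ys)]

derived = moments_of_y(
    lam1=sum(ss) / len(ss), lam2=central(ss, ss, 2, 0), lam3=central(ss, ss, 3, 0),
    eta1=sum(xs) / len(xs), eta2=central(xs, xs, 2, 0), eta3=central(xs, xs, 3, 0),
    xi11=central(xs, ss, 1, 1), xi12=central(xs, ss, 1, 2), xi21=central(xs, ss, 2, 1),
)
```

The derived values agree exactly with the moments computed directly from the Y values, since the relations are algebraic identities following from the additive property.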
The generation part of the procedure consists of the
following: First, generate $W_X'$ and $W_Y'$ from the normal
distribution. Second, calculate X' and Y' by (13) and (14)
and take antilogarithms to determine X* and Y* (also, add
$c_X$ and $c_Y$). Third, use (17) or (19) or any other correction
procedure to determine X and Y.
It is noted that the above partition procedure based on the
logarithmic transformation of variables has not been used in
a real application in the framework of the proposed model.
In a similar way other procedures, based on different
nonlinear transformations, such as Wilson-Hilferty, can be
developed. The main problems of such procedures are the
same as those described for the case of the logarithmic transformation.
The Quadratic Scheme
Finally we shall examine a quadratic generation rule,
without transformations of variables, defined by
$$X = g(S) + f(S)\,W \qquad (25)$$
$$g(S) = a_0 + a_1 S + a_2 S^2 \qquad (26)$$
$$f(S) = b_0 + b_1 S + b_2 S^2 \qquad (27)$$
where $a_i$ and $b_i$, i = 0, 1, 2, are parameters to be estimated
and W is a random variable not necessarily normal, indepen-
dent of S, with zero mean and unit variance. This scheme,
though it seems not physically reasonable, is introduced as
the most convenient one with respect to the moments
calculation, having a sufficient number of parameters to
cover the relevant restrictions. However, as explained be-
low, it has certain structural problems. Due to these prob-
lems it cannot be an exact scheme, but an approximate one.
In the following analysis, for mathematical convenience
and without loss of generality it will be assumed that all
random variables have zero mean. This scheme contains
seven parameters, the six coefficients $a_i$ and $b_i$, plus the
skewness of W, $\theta_3 = E[W^3]$. So, the number of parameters
exceeds the number of constraints in (8), thus giving one
degree of freedom.
By making the proper transformations in (25) and then
taking expected values, the following equations can be easily
obtained:
$$E[X] = E[g(S)] \qquad (28)$$
$$E[XS] = E[S\,g(S)] \qquad (29)$$
$$E[X^2] = E[g^2(S)] + E[f^2(S)]\,E[W^2] \qquad (30)$$
$$E[XS^2] = E[S^2 g(S)] \qquad (31)$$
$$E[X^2 S] = E[S\,g^2(S)] + E[S f^2(S)]\,E[W^2] \qquad (32)$$
$$E[X^3] = E[g^3(S)] + 3E[g(S) f^2(S)]\,E[W^2] + E[f^3(S)]\,E[W^3] \qquad (33)$$
It is obvious that these equations are explicit relations
between the moments of X and S and the unknown model
parameters. In particular, (28), (29) and (31) determine
completely g(S), yielding simple expressions for the calculation
of the $a_i$. Somewhat more complicated is the derivation
of f(S), obtained from (30) and (32), by setting $E[W^2] = 1$.
Finally, $\theta_3$ is obtained from (33). All derivations and the
final equations are given in Appendix A.
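The moment relations of the quadratic scheme rest only on the independence of S and W and on E[W] = 0, so they can be checked exactly on small discrete distributions. In the sketch below the distributions of S and W and the coefficients of g and f are hypothetical, chosen so that S, W and X all have zero mean and W has unit variance.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical discrete distributions as (value, probability) pairs.
S_DIST = [(F(-2), F(1, 3)), (F(-1), F(1, 3)), (F(3), F(1, 3))]   # E[S] = 0
W_DIST = [(F(2), F(1, 5)), (F(-1, 2), F(4, 5))]                  # E[W] = 0, E[W^2] = 1


def g(s):
    """Hypothetical quadratic g(S); a_0 is set so that E[g(S)] = 0."""
    return F(-7, 15) + F(1, 4) * s + F(1, 10) * s**2


def f(s):
    """Hypothetical quadratic f(S)."""
    return F(1) + F(1, 5) * s + F(1, 20) * s**2


def expect_s(func):
    """E[func(S)]."""
    return sum(p * func(s) for s, p in S_DIST)


def expect_xs(func):
    """E[func(X, S)] with X = g(S) + f(S) W and S, W independent."""
    return sum(ps * pw * func(g(s) + f(s) * w, s)
               for (s, ps), (w, pw) in product(S_DIST, W_DIST))


EW2 = sum(p * w**2 for w, p in W_DIST)
EW3 = sum(p * w**3 for w, p in W_DIST)

checks = {
    28: expect_xs(lambda x, s: x) == expect_s(g),
    29: expect_xs(lambda x, s: x * s) == expect_s(lambda s: s * g(s)),
    30: expect_xs(lambda x, s: x**2)
        == expect_s(lambda s: g(s)**2) + expect_s(lambda s: f(s)**2) * EW2,
    31: expect_xs(lambda x, s: x * s**2) == expect_s(lambda s: s**2 * g(s)),
    32: expect_xs(lambda x, s: x**2 * s)
        == expect_s(lambda s: s * g(s)**2) + expect_s(lambda s: s * f(s)**2) * EW2,
    33: expect_xs(lambda x, s: x**3)
        == expect_s(lambda s: g(s)**3)
        + 3 * expect_s(lambda s: g(s) * f(s)**2) * EW2
        + expect_s(lambda s: f(s)**3) * EW3,
}
```

Each entry of `checks` compares the left side of the corresponding equation, computed by enumeration of the joint distribution, with its right side, computed from the moments of S alone.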
There are two limiting cases where this scheme is exact
with regard to the preservation of the complete distribution
functions (not only moments). First, if the variables X and Y
(or, equivalently, X and S) are jointly normal, then the
quadratic scheme downgrades to the linear one and (25)-(27)
reduce to (9), as is theoretically anticipated (see Appendix
A). Apparently, in the normal case W is also normal and the
statistical resemblance is extended to the complete distribution
of (X, Y). Second is the case where X and Y are
deviations from means of two independent, two-parameter
gamma-distributed variables $\tilde{X}$ and $\tilde{Y}$ with common scale
parameter. In this case if $\tilde{S} = \tilde{X} + \tilde{Y}$ then the variable
$$\tilde{W} = \tilde{X}/\tilde{S} \qquad (34)$$
is independent of $\tilde{S}$ and beta distributed [Johnson and Kotz,
1972, p. 234]. Consequently, the deviations from means are
related by
$$X = (-E[\tilde{X}] + E[\tilde{W}]E[\tilde{S}]) + E[\tilde{W}]\,S + (E[\tilde{S}] + S)\,W \qquad (35)$$
where the random variable $W = \tilde{W} - E[\tilde{W}]$ is independent
of S. Equation (35) is a special case of (25) where both g(S)
and f(S) have downgraded to linear functions ($a_2 = b_2 = 0$)
and the complete distributions of variables are preserved.
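The algebra behind this special case can be illustrated for a single realization. In the sketch below the shape parameters (2 and 3, with unit scale) and the realized values are hypothetical; the expected values follow from standard properties of the gamma and beta distributions, and the function name is an assumption made for the illustration.

```python
from fractions import Fraction as F

# Hypothetical gamma shapes with common unit scale.
shape_x, shape_y = F(2), F(3)
mean_x = shape_x                       # mean of the first gamma variable
mean_s = shape_x + shape_y             # mean of the sum of the two variables
mean_w = shape_x / (shape_x + shape_y) # mean of the beta-distributed ratio


def deviation_of_x(s_dev, w_dev):
    """Deviation of X from its mean as a function of the deviations of S and W."""
    return mean_w * s_dev + (mean_s + s_dev) * w_dev


# One hypothetical realization of the two original (non-centered) variables.
x_raw, y_raw = F(3), F(4)
s_raw = x_raw + y_raw
w_raw = x_raw / s_raw

x_dev = deviation_of_x(s_raw - mean_s, w_raw - mean_w)
```

The result equals the realized value minus its mean, which shows that in this case the generating rule is linear in S both in its deterministic part and in the coefficient of W.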
Let us now consider the problems resulting from the
quadratic scheme in the general case. A first problem results
from the quadratic forms of f(S) and g(S), which have a
single maximum or minimum. Due to this, not all combinations
of $\lambda$, $\eta$, and $\xi$ values yield solutions of (28)-(33). The
relevant constraints concerning the existence of functions
f(S) and g(S) are given in Appendix A (equations (A19)-(A21)).
The limitations arising from that problem are illustrated
further in section 6.
The second problem may be seen in (30), (32) and (33)
where high moments of S indirectly appear, i.e., $\lambda_4$, $\lambda_5$ and
$\lambda_6$. This is confirmed by the expressions of Appendix A used
for the computation of the a and b terms. These high
moments must be known in order to evaluate these expres-
sions and preserve lower moments of X and Y. If the
complete marginal distribution of S is known, then these
moments can be determined exactly in some way. However,
this is not the case here, given that the partition procedure is
iteratively executed across the disaggregation steps. More-
over, there is no way to compute exactly in each step the
high moments of the updated amount still to go (of the next
step, which here is represented by Y), since any calculation
would introduce even higher moments of S.
For practical purposes, one could assume a typical distri-
bution function for S and calculate high moments analyti-
cally for this distribution. Apparently, however, the exact
solution is then lost. The use of cumulants may be helpful for
numerical purposes, i.e.,
$$\lambda_4 = \kappa_4 + 3\lambda_2^2 \qquad (36)$$
$$\lambda_5 = \kappa_5 + 10\lambda_3\lambda_2 \qquad (37)$$
$$\lambda_6 = \kappa_6 + 15\kappa_4\lambda_2 + 10\lambda_3^2 + 15\lambda_2^3 \qquad (38)$$
where $\kappa_4$, $\kappa_5$ and $\kappa_6$ are the cumulants of S, of orders 4-6
[Kendall and Stuart, 1963, p. 70]. Two typical cases will be
emphasized (for both, see Kendall and Stuart [1963, p. 70]).
These are the normal distribution with
$$\kappa_4 = \kappa_5 = \kappa_6 = 0 \qquad (39)$$
and the gamma distribution with
$$\kappa_r = (r - 1)\,\kappa_{r-1}\,\lambda_3/(2\lambda_2), \qquad r > 3 \qquad (40)$$
where $\kappa_3 = \lambda_3$. Either (39) or (40), along with (36)-(38), may
also be used for estimation when the distribution of S is not
accurately determined, but can be approximated by the
Gauss or the gamma probability distribution function.
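The calculation of the high moments from the second and third central moments can be sketched as follows. The function names are hypothetical; the check values correspond to a gamma distribution with shape 2 and unit scale (variance 2, third central moment 4), for which the fourth to sixth central moments are known to be 24, 128 and 880.

```python
def gamma_cumulants(lam2, lam3):
    """Cumulants of orders 4-6 from the gamma recursion, starting at kappa_3 = lam3."""
    kappa = {3: lam3}
    for r in (4, 5, 6):
        kappa[r] = (r - 1) * kappa[r - 1] * lam3 / (2 * lam2)
    return kappa[4], kappa[5], kappa[6]


def high_moments(lam2, lam3, k4=0.0, k5=0.0, k6=0.0):
    """Central moments of orders 4-6 from lower moments and cumulants.

    With the default zero cumulants this is the normal-type assumption.
    """
    lam4 = k4 + 3 * lam2**2
    lam5 = k5 + 10 * lam3 * lam2
    lam6 = k6 + 15 * k4 * lam2 + 10 * lam3**2 + 15 * lam2**3
    return lam4, lam5, lam6


# Gamma assumption for variance 2 and third central moment 4.
gamma_case = high_moments(2.0, 4.0, *gamma_cumulants(2.0, 4.0))

# Zero high cumulants with the same variance and zero third moment.
normal_case = high_moments(2.0, 0.0)
```

The two assumptions differ only in the cumulants of orders 4-6, so switching between them in an application amounts to changing the arguments of a single call.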
To inspect the influence of an assumption about cumulants
on the moments of the generated series a numerical investigation
was made. A gamma-distributed S was assumed and
then disaggregated into two lower-level variables. Two different
assumptions for the cumulants were made: first, that
$\kappa_r$ is given by (40) (true assumption) and second, that $\kappa_r = 0$
as in the normal distribution (equation (39), false assumption).
The second assumption does not indicate normality; $\lambda_3$
in (36)-(38) was still nonzero. From this investigation it was
found that the results are quite similar in both cases (an
example is given in section 6 and Table 2). This is an
indication that the model, although not exact, is not very
sensitive to the assumption about the cumulants.
In the applications of this study the cumulants of the
gamma distribution (equation (40)) were adopted for vari-
ables with distribution approximately gamma. Furthermore,
the beta distribution has been used to approximate the
distribution of W (after a linear transformation).
General Considerations
A disaggregation method may be viewed as a technique for
generating lower-level variables satisfying the additive property
(1). The lower-level variables of the current period
$\mathbf{X}^1, \ldots, \mathbf{X}^k$ are a subset of a stochastic sequence expanded
in both time directions, i.e., $(\ldots, \mathbf{X}^{-1}, \mathbf{X}^0, \mathbf{X}^1, \ldots, \mathbf{X}^k, \mathbf{X}^{k+1}, \ldots)$.
Also the higher-level variable of the current
period $\mathbf{Z}$ is a term of another stochastic sequence
$(\ldots, \mathbf{Z}^{-1}, \mathbf{Z}^0, \mathbf{Z}^1 \equiv \mathbf{Z}, \mathbf{Z}^2, \ldots, \mathbf{Z}^\tau, \ldots)$. Note the different time
indexes used in the notation of $\mathbf{X}^t$ and $\mathbf{Z}^\tau$, in order to avoid
dual indexes; if the number of subperiods in each period
(i.e., k) is the same for all periods then
$\mathbf{Z}^\tau = \mathbf{X}^{(\tau-1)k+1} + \cdots + \mathbf{X}^{\tau k}$.
Also note the notational simplification $\mathbf{Z} \equiv \mathbf{Z}^1$.
Due to the independence of the generation procedures of
higher- and lower-level variables, it can be assumed that the
higher-level variables are all known
$(\ldots, \mathbf{Z}^0 = \mathbf{z}^0, \mathbf{Z}^1 = \mathbf{z}^1, \mathbf{Z}^2 = \mathbf{z}^2, \ldots)$ at the beginning of the disaggregation. Also, it
is assumed that the disaggregation procedure has already
been completed at the previous periods; thus all the previous
$\mathbf{X}^t$ have known values $(\ldots, \mathbf{X}^{-2} = \mathbf{x}^{-2}, \mathbf{X}^{-1} = \mathbf{x}^{-1}, \mathbf{X}^0 = \mathbf{x}^0)$.
Consider the disaggregation step t, at the current period
and at the site j. The generation of the lower-level variable
$X_j^t$ is characterized by
$X_j^t + S_j^{t+1} = S_j^t$ (41)
where the amount still to go, $S_j^t$, has a known value given by
(3), since the previous steps have been completed.
Due to the similarity of (41) to (6), the generation of $X_j^t$ can
be executed by using a specific scheme of the partition
procedure described in the previous section. From the point
of view of the main disaggregation model, the partition
procedure may be considered as a black box procedure. The
input from the main model to this procedure is the value of
$S_j^t$ and a list of statistical marginal and joint moments of (X,
S). The output received by the main model consists of the
values of $X_j^t$ and $S_j^{t+1}$. Each time the partition procedure is
executed, no information about previous lower-level variables
and their stochastic dependence is utilized. However,
there is a possibility of indirectly introducing the dependence
of previous lower-level variables into the black box partition
procedure without changing its structure. This could be done
by sending as input the conditional moments of (X, S) given
the previously generated information. Namely, these moments
and their corresponding moments of the partition
procedure are as follows:
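Conditional marginal moments of $S_j^t$
$\lambda_1 = E[S_j^t \mid \Omega_j^t], \quad \lambda_2 = \mathrm{Var}[S_j^t \mid \Omega_j^t], \quad \lambda_3 = \mu_3[S_j^t \mid \Omega_j^t]$ (42a)
Conditional marginal moments of $X_j^t$
$\eta_1 = E[X_j^t \mid \Omega_j^t], \quad \eta_2 = \mathrm{Var}[X_j^t \mid \Omega_j^t], \quad \eta_3 = \mu_3[X_j^t \mid \Omega_j^t]$ (42b)
Conditional joint moments of $(X_j^t, S_j^t)$
$\sigma_{11} = \mathrm{Cov}[X_j^t, S_j^t \mid \Omega_j^t], \quad \sigma_{12} = \mu_{12}[X_j^t, S_j^t \mid \Omega_j^t], \quad \sigma_{21} = \mu_{21}[X_j^t, S_j^t \mid \Omega_j^t]$ (42c)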
where $\Omega_j^t$ is an abbreviation for the already generated
information. Depending on the succession of time steps and
sites in each period, two different courses of the disaggregation
procedure may be considered: the horizontal course,
where each step follows the previous step at the same site,
and the vertical one, with each step following the same step
of the previous site. In the former case, which is adopted in
the present study, $\Omega_j^t$ generally consists of the following
items: the full sequence of higher-level variables
$(\ldots, \mathbf{Z}^2 = \mathbf{z}^2, \mathbf{Z}^1 = \mathbf{z}^1, \mathbf{Z}^0 = \mathbf{z}^0, \ldots)$; the lower-level variables of
previous periods at all sites $(\mathbf{X}^0 = \mathbf{x}^0, \mathbf{X}^{-1} = \mathbf{x}^{-1}, \ldots)$; the
lower-level variables of the current period and current
site/previous subperiods $(X_j^{t-1} = x_j^{t-1}, \ldots, X_j^1 = x_j^1)$,
previous site/all subperiods $(X_{j-1}^k = x_{j-1}^k, \ldots, X_{j-1}^1 = x_{j-1}^1)$,
and so forth through the first site/all subperiods
$(X_1^k = x_1^k, \ldots, X_1^1 = x_1^1)$.
The determination of conditional moments could be an
impossible task if all the information contained in $\Omega_j^t$ is
considered. Thus, simplifications are necessary. These can
be based on an assumption for merely the sequence of
lower-level variables. A convenient simplifying assumption
about this sequence could be a linear autoregressive structure.
This structure (or model) will be used here not for the
generation of lower-level variables but for the determination
of conditional moments.
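The stepwise course described above can be sketched for a single site as a loop that repeatedly calls the partition procedure and updates the amount still to go through (41). In this sketch the partition procedure is passed in as a callable; `disaggregate` and `toy_partition` are hypothetical names, and `toy_partition` is only a placeholder that splits off a random fraction of the remaining amount. It is not one of the partition schemes of the paper and preserves none of the moments in (42).

```python
import random

def disaggregate(z, k, partition, rng):
    """Split the higher-level value z into k lower-level values.

    partition(s, step, rng) stands in for the black box partition
    procedure: it receives the amount still to go s and returns x.
    """
    s = z                        # amount still to go at the first step
    xs = []
    for step in range(1, k):     # steps 1 .. k-1
        x = partition(s, step, rng)
        xs.append(x)
        s = s - x                # eq. (41): S^{t+1} = S^t - X^t
    xs.append(s)                 # the last variable takes what remains
    return xs

def toy_partition(s, step, rng):
    # Hypothetical placeholder: a random fraction of the amount still to go.
    return s * rng.uniform(0.0, 1.0)

monthly = disaggregate(100.0, 12, toy_partition, random.Random(1))
```

Whatever the partition callable returns, the generated values add up to the higher-level value, which is the point of carrying the amount still to go from step to step.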
The Associated Sequential Model
The selection of a sequential linear model for the lower-level
variables depends on which of their statistics are to be
maintained. Here, in order to build a parameter parsimonious
model, the minimum set of statistics is considered. This
set is the same as in the sequential Markov model [Matalas
and Wallis, 1976, pp. 60, 63], i.e., it contains the following
groups: (1) mean values of $X_j^t$ ($\xi_j^t$); (2) variances of $X_j^t$; (3)
skewness coefficients of $X_j^t$; (4) lag one autocorrelation
coefficients between $X_j^t$ and $X_j^{t-1}$ (same site); and (5) lag
zero cross-correlation coefficients between $X_j^t$ and $X_r^t$, for
$j \neq r$ (same period).
The third-order joint moments, which can be also sup-
ported by the partition procedure, are not independent
parameters in a linear model of lower-level variables. Their
values, which are required as input items to the partition
model, can be obtained from the assumed linear structure as
expressions of the above groups of statistics (see section 5
and Appendix B).
The preservation of the five groups of statistics enumerated
above is possible with a seasonal AR(1) (PAR(1)) model
for the process $\mathbf{X}^t$, which is the selected associated sequential
model (not the disaggregation model):
$\mathbf{X}^t = \mathbf{a}^t \mathbf{X}^{t-1} + \mathbf{b}^t \mathbf{V}^t$ (43)
or equivalently
$X_j^t = a_j^t X_j^{t-1} + \sum_{r=1}^{j} b_{jr}^t V_r^t, \quad j = 1, \ldots, n$ (44)
where $\mathbf{a}^t$ can be assumed (n x n) diagonal, i.e.,
$\mathbf{a}^t = \mathrm{diag}(a_1^t, a_2^t, \ldots, a_n^t)$, whereas $\mathbf{b}^t = [b_{jr}^t]$ is a (n x n) lower
triangular matrix of coefficients, and $\mathbf{V}^t = [V_j^t]$ is a vector of
n random variables, completely independent both in time
and location. We assume that $\boldsymbol{\xi}^t = E[\mathbf{X}^t]$ and $E[\mathbf{V}^t]$
are not necessarily zero; we set (for mathematical convenience)
Var [$\mathbf{V}^t$] = 1, and also denote $\boldsymbol{\gamma}^t = \mu_3[\mathbf{V}^t]$. The
process stationarity is not necessary and hence not implied
in the notation. However, the assumption of a seasonal
stationarity may be helpful in some of the following analyses.
Note that the model does not make any distinction for the
first lower-level variable $X_j^1$, where (44) is also applied with
$X_j^0$, the last lower-level variable of the previous period.
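The groups of parameters $\mathbf{a}^t$, $\mathbf{b}^t$ and $\boldsymbol{\gamma}^t$ are computed from
the groups of statistics 2-5 listed above by the following
relations which are extensions of those of the stationary
Markov model given by Matalas and Wallis [1976, p. 63]:
$a_j^t = \mathrm{Cov}[X_j^t, X_j^{t-1}] \, / \, \mathrm{Var}[X_j^{t-1}]$ (45)
$\mathbf{b}^t (\mathbf{b}^t)^T = \boldsymbol{\sigma}^t - \mathbf{a}^t \boldsymbol{\sigma}^{t-1} \mathbf{a}^t$ (46)
$\gamma_j^t = \left( \mu_3[X_j^t] - (a_j^t)^3 \mu_3[X_j^{t-1}] - \sum_{r=1}^{j-1} (b_{jr}^t)^3 \gamma_r^t \right) / (b_{jj}^t)^3$ (47)
where
$\boldsymbol{\sigma}^t = \mathrm{Cov}[\mathbf{X}^t, \mathbf{X}^t] = E[(\mathbf{X}^t - \boldsymbol{\xi}^t)(\mathbf{X}^t - \boldsymbol{\xi}^t)^T]$ (48)
and the lower triangular $\mathbf{b}^t$ is obtained from its Gramian
$\mathbf{b}^t (\mathbf{b}^t)^T$ by decomposition.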
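The relations (45)-(47) can be sketched in Python for one subperiod as follows. This is only an illustration under stated assumptions: the covariance matrices and third moments are supplied as plain lists, the decomposition is an ordinary Cholesky factorization that fails if the Gramian is not positive definite, and the names `cholesky_lower` and `par1_parameters` are hypothetical.

```python
import math

def cholesky_lower(c):
    """Lower triangular b with b b^T = c (c must be positive definite)."""
    n = len(c)
    b = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for r in range(j + 1):
            s = sum(b[j][m] * b[r][m] for m in range(r))
            if r == j:
                b[j][j] = math.sqrt(c[j][j] - s)   # ValueError if negative
            else:
                b[j][r] = (c[j][r] - s) / b[r][r]
    return b

def par1_parameters(sigma, sigma_prev, lag1_cov, mu3, mu3_prev):
    """Parameters a, b, gamma of the PAR(1) model for one subperiod.

    sigma, sigma_prev : covariance matrices of X^t and X^{t-1}
    lag1_cov[j]       : Cov[X_j^t, X_j^{t-1}]
    mu3, mu3_prev     : third central moments of X_j^t and X_j^{t-1}
    """
    n = len(sigma)
    # eq. (45)
    a = [lag1_cov[j] / sigma_prev[j][j] for j in range(n)]
    # eq. (46); with diagonal a, element (j, r) of a*sigma_prev*a is
    # a_j * a_r * sigma_prev[j][r]
    c = [[sigma[j][r] - a[j] * a[r] * sigma_prev[j][r] for r in range(n)]
         for j in range(n)]
    b = cholesky_lower(c)
    # eq. (47), solved site by site because gamma_j needs gamma_r for r < j
    gamma = []
    for j in range(n):
        s = sum(b[j][r] ** 3 * gamma[r] for r in range(j))
        gamma.append((mu3[j] - a[j] ** 3 * mu3_prev[j] - s) / b[j][j] ** 3)
    return a, b, gamma
```

The division by the cube of the diagonal element in the last step is where very large skewness coefficients can arise when that element is small.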
Connection of the Sequential Model
to the Disaggregation Model
It should be emphasized that the model (43) with param-
eters determined from (45)-(47) concerns only lower-level
variables and does not form a disaggregation model. How-
ever, (43) can be combined with (1) or (2) in order to describe
joint properties between lower-level and higher-level vari-
ables or, more generally, between lower-level variables and
amounts still to go. The required combination should be
oriented toward the determination of the moments listed in
(42) and not toward the generation of any variable, since the
generation is executed independently by the partition proce-
dure. Thus, what remains to complete the disaggregation
model is the development of the moments determination
procedure, which can be supported by the associated se-
quential model. This development is done in two different
versions in sections 5 and 7.
Finally, note that the set of independent parameters for
the disaggregation model is the same parameter set as for the
associated sequential model. No additional independent
parameter is required for the determination of the moments
listed in (42) (see sections 5 and 7) and the partition proce-
dure is not fed by any external parameter.
Consider the items of the already generated information in
the general case described in the previous section. First, we
note that higher-level variables of previous periods (i.e.,
$\mathbf{Z}^{-1} = \mathbf{z}^{-1}$, $\mathbf{Z}^{-2} = \mathbf{z}^{-2}$, ...) can be omitted because they are
contained (due to (1)) in the sequence of the corresponding
lower-level variables. With the approach followed, the item
($\mathbf{Z} = \mathbf{z}$) of the current period can be substituted by the known
amount still to go, given the known values of the previous
lower-level variables. Furthermore, the handling of the
amount still to go is left to the partition procedure and
ignored at the moments determination procedure. A major
simplification of the disaggregation model is obtained if the
information containing the higher-level variables of the next
periods (i.e., $\mathbf{Z}^2 = \mathbf{z}^2$, $\mathbf{Z}^3 = \mathbf{z}^3$, ...) is deleted from $\Omega_j^t$. This
is the case in all disaggregation models. The consequences of
this deletion are discussed in section 6.
Consequently, in this section all items concerning the
information of higher-level variables are omitted to simplify
the determination of conditional moments. Later, in section
7 the question of how the item ($\mathbf{Z}^2 = \mathbf{z}^2$) can be taken into
consideration will be discussed. In the Markovian case
examined, $\Omega_j^t$ is further simplified, since the information
of only one previous step is affected. Thus $\Omega_j^t$ reduces to
$\Omega_j^t = (X_j^{t-1} = x_j^{t-1}) \cup (X_{j-1}^k = x_{j-1}^k, \ldots, X_{j-1}^1 = x_{j-1}^1, X_{j-1}^0 = x_{j-1}^0) \cup \cdots \cup (X_1^k = x_1^k, \ldots, X_1^1 = x_1^1, X_1^0 = x_1^0)$
The following analysis is very helpful to the moments
determination procedure. Let $\mathbf{e}^t$ be the (n x n) diagonal
matrix, whose (j, j)th element is $e_j^t = b_{jj}^t$; it is then easily
shown that the (n x n) matrix $\mathbf{d}^t$ defined by
$\mathbf{d}^t = \mathbf{I} - \mathbf{e}^t (\mathbf{b}^t)^{-1}$ (49)
with $\mathbf{I}$ the unit matrix, is lower triangular with zeros on the
diagonal. By means of $\mathbf{d}^t$ and $\mathbf{e}^t$ (43) becomes
$\mathbf{X}^t = \mathbf{a}^t \mathbf{X}^{t-1} + \mathbf{d}^t (\mathbf{X}^t - \mathbf{a}^t \mathbf{X}^{t-1}) + \mathbf{e}^t \mathbf{V}^t$ (50)
and consequently
$X_j^t = a_j^t X_j^{t-1} + \sum_{r=1}^{j-1} d_{jr}^t (X_r^t - a_r^t X_r^{t-1}) + e_j^t V_j^t$ (51)
The latter is a modification of (44) in which only one
innovation term ($V_j^t$) appears. (It is similar to equations
(10)-(12) of Stedinger et al. [1985].) Equation (51) may be
also written as
$X_j^t = T_j^t + U_j^t + W_j^t$ (52)
where
$T_j^t = a_j^t X_j^{t-1}$ (53)
$U_j^t = \sum_{r=1}^{j-1} d_{jr}^t (X_r^t - a_r^t X_r^{t-1})$ (54)
$W_j^t = e_j^t V_j^t$ (55)
$T_j^t$ and $U_j^t$ are expressions of lower-level variables contained
in $\Omega_j^t$ while $W_j^t$ is a random variable independent of $T_j^t$ and
$U_j^t$. Specifically, $T_j^t$ is related to the information of the
current location, while $U_j^t$ is related to previous locations.
Thus, conditional moments of $X_j^t$ of order greater than 1,
given $\Omega_j^t$, are expressed in terms of the moments of $W_j^t$ only.
Furthermore, in each disaggregation step the conditional
mean of $X_j^t$ can be easily determined by using these equations.
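The single-innovation form can be illustrated with a short Python sketch. The first function builds the matrices of (49) from a lower triangular coefficient matrix; the second evaluates the conditional mean, variance and third central moment of the current lower-level variable from the three-term decomposition of (51), using the unit variance of the innovations. Both function names are hypothetical, sites are indexed from zero, and the formulas in the second function are a direct reading of the decomposition rather than a transcription of Appendix B.

```python
def single_innovation_terms(b):
    """Return (d, e) of eq. (49) for a lower triangular b (list of lists).

    e holds the diagonal of b. Since d = I - e b^{-1} = (b - e) b^{-1},
    each row of d is found from d b = b - e without forming the inverse.
    """
    n = len(b)
    e = [b[j][j] for j in range(n)]
    d = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for r in range(j - 1, -1, -1):
            s = sum(d[j][m] * b[m][r] for m in range(r + 1, j))
            d[j][r] = (b[j][r] - s) / b[r][r]
    return d, e

def conditional_moments_x(j, a, d, e, x_prev, x_curr, mean_v, gamma):
    """Conditional mean, variance and third moment of X_j^t.

    x_prev : values x_r^{t-1} for all sites
    x_curr : values x_r^t already generated for sites r < j
    mean_v : E[V_j^t] per site (not necessarily zero)
    gamma  : third central moment of V_j^t per site
    """
    t_term = a[j] * x_prev[j]                                  # current site
    u_term = sum(d[j][r] * (x_curr[r] - a[r] * x_prev[r])      # previous sites
                 for r in range(j))
    mean = t_term + u_term + e[j] * mean_v[j]
    var = e[j] ** 2                  # Var[V] = 1
    mu3 = e[j] ** 3 * gamma[j]
    return mean, var, mu3
```

For the two-site matrix `b = [[1.0, 0.0], [0.5, 2.0]]` the sketch gives `d = [[0.0, 0.0], [0.5, 0.0]]` and `e = [1.0, 2.0]`, that is, a strictly lower triangular `d` as stated in the text.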
By iteratively using (52) and (53) each $X_j^r$, r = t, ..., k
may be expressed in terms of $X_j^{t-1}$, $U_j^r$ (items contained in
$\Omega_j^t$) and $W_j^r$. Then by adding the expressions of each $X_j^r$ we
obtain
$S_j^t = \psi_j^t a_j^t X_j^{t-1} + \sum_{r=t}^{k} \psi_j^r U_j^r + \sum_{r=t}^{k} \psi_j^r W_j^r$ (56)
where
$\psi_j^r = 1 + a_j^{r+1} + a_j^{r+1} a_j^{r+2} + \cdots + a_j^{r+1} a_j^{r+2} \cdots a_j^k$ (57)
In a manner similar to the case of conditional moments of
$X_j^t$, described above, the conditional moments of $S_j^t$, given
$\Omega_j^t$, can be easily derived with the use of (56). Specifically,
the moments of order > 1 are expressions of the moments of
$W_j^r$ only. Furthermore, (52) and (56) may be combined by
multiplication to allow the calculation of conditional joint
moments of ($X_j^t$, $S_j^t$). A systematic algorithm for the calculation
of all needed moments is easily constructed from these
equations and given in Appendix B.
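A hedged sketch of how such moments could be computed for one site follows. It assumes that the weight attached to the innovation of step r in the amount still to go is 1 plus the autoregressive coefficient of step r+1 times the weight of step r+1, with a weight of 1 at the last step k, which is how (56) and (57) are read here. Because the innovations of different steps are independent with unit variance, the conditional variance and third moment of the amount still to go are weighted sums, and only the current innovation contributes to the covariance with the current lower-level variable. The function names are hypothetical and the code is not a transcription of Appendix B.

```python
def psi_weights(a_site):
    """Weights for steps t..k, given a_site = [a^t, ..., a^k] for one site."""
    k = len(a_site)
    psi = [1.0] * k                         # weight of the last step is 1
    for i in range(k - 2, -1, -1):
        psi[i] = 1.0 + a_site[i + 1] * psi[i + 1]
    return psi

def conditional_moments_s(a_site, e_site, gamma_site):
    """Conditional Var and mu3 of S^t, and Cov of (X^t, S^t), for one site.

    e_site[i], gamma_site[i] : diagonal coefficient and innovation skewness
                               of steps t, t+1, ..., k
    """
    psi = psi_weights(a_site)
    var_s = sum((p * e) ** 2 for p, e in zip(psi, e_site))
    mu3_s = sum((p * e) ** 3 * g for p, e, g in zip(psi, e_site, gamma_site))
    cov_xs = psi[0] * e_site[0] ** 2        # only W^t is shared by X^t and S^t
    return var_s, mu3_s, cov_xs
```

With three remaining steps, all coefficients equal to 0.5 and unit diagonal terms, the weights are 1.75, 1.5 and 1.0, which matches the direct expansion of the three lower-level variables in terms of the innovations.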
General Considerations
As described above, the proposed model consists of two
isolated procedures, a partition procedure and a moments
determination procedure. Three forms of the former and one
of the latter were studied above. Another improved form of
the moments determination procedure is studied in the next section.
Regardless of the configuration used, the model performance
is similar to that of the sequential PAR(1) (Markov)
model, except that the lower-level variables add up to the
already known values of the higher-level ones. This property
makes the disaggregation model superior to the sequential
one because there is no loss of statistical resemblance for the
higher-level variables. Instead of deriving the higher-level
variables as sums of the generated lower-level variables
(sequential way), the use of the disaggregation model per-
mits the separate generation of the former, based on the
preservation of their own statistics.
It is important to note that, with the exception of the
special cases discussed in section 3, the statistical resem-
blance achieved by the model is not strict. Theoretically, the
synthetic sequences generated by the model have moments
which may be good approximations of their theoretical
values but do not equal them. This nonexactness of the
model is due to the structural inconsistencies of the partition
procedure, described in section 3 and is encountered in other
disaggregation models as well.
The stepwise approach of the model permits the use of
parallel procedures adjusting properly the generated values
in each step, without loss of the additive property. The case
of positive lower-level variables is an example when a proper
parallel procedure may handle (reject or modify) the possibly
generated negative values. Note, though, that such parallel
procedures introduce additional bias to the simulated series
and, thus, may affect the validity of the procedure.
Due to the stepwise course of the dynamic model there are
no difficulties in handling the dependence between the
lower-level variables associated with higher-level ones of
consecutive periods. In fact, as shown in section 4, the
mathematical formulation is the same as within a particular
higher-level variable.
At first view, the model exhibits a performance similar to that
of the condensed disaggregation models, since both
schemes do not use the all-at-once generation approach of
the Valencia-Schaake class of models and utilize only a
subset of the parameters of the latter. However, there are
essential differences between the two schemes:
1. The stepwise course of DDM is applied not only to
different time steps (CDM case) but also to different sites.
2. In each step DDM generates not only the correspond-
ing lower-level variable, but also the amount still to go of the
next step. CDMs do not use the amount still to go.
3. In CDMs each lower-level variable is expressed as a
linear function of the higher-level variable. DDM uses the
amount still to go instead.
4. CDMs use nonlinear transformations of the actual
hydrologic quantities as lower-level variables while DDM
uses the actual quantities instead. If nonlinear transforma-
tions are utilized by DDM, these are internal to the partition
procedure and invisible from the point of view of the main
disaggregation model.
5. The third moments are treated by DDM with explicit
analytical relations while in CDMs they are not.
6. CDMs, like all Valencia-Schaake type models, are
totally linear. DDM eventually utilizes a nonlinear part
(partition procedure).
7. One element of a CDM's parameter set is the covari-
ance matrix between lower-level and higher-level variables.
DDM does not introduce this element as an independent
parameter, thus reducing its parameter set to that of a typical
sequential model (PAR). DDM, in order to evaluate covari-
ances between lower- and higher-level variables or, more
generally, covariances between lower-level variables and
amounts still to go, uses the properties of the associated
sequential model and computes such covariances using
analytical expressions.
Model Limitations
The essential limitations of the disaggregation model are
related to the evaluation of the parameters $\mathbf{b}^t$ and are met
also in all sequential and disaggregation models [Matalas and
Wallis, 1976, p. 69; Bras and Rodriguez-Iturbe, 1985, p. 150;
Grygier and Stedinger, 1990, p. 31].
The matrix $\mathbf{c}^t = \mathbf{b}^t (\mathbf{b}^t)^T$ may be written as
$\mathbf{c}^t = \mathrm{Cov}[(\mathbf{X}^t - \mathbf{a}^t \mathbf{X}^{t-1}), (\mathbf{X}^t - \mathbf{a}^t \mathbf{X}^{t-1})]$ (58)
and consequently its elements must satisfy the inequality
$-1 \le c_{jr}^t / (c_{jj}^t c_{rr}^t)^{1/2} \le 1, \quad \forall j, r$ (59)
In addition, the existence of $\mathbf{b}^t$ requires that $\mathbf{c}^t$ is positive
definite. It is possible that these two structural constraints
are not satisfied for certain hydrologic data, utilized to
determine $\mathbf{c}^t$ by means of (45) and (46). Furthermore, (47)
may yield unreasonable skewness coefficients, e.g., of the
magnitude reported by Todini [1980] ($\gamma$ > 30). In fact, there
is no limit for $\gamma_j^t$, since the denominator $(b_{jj}^t)^3$ in (47) may
take too small values. These problems are encountered
mostly in cases where the historical records do not refer to a
common time period.
It is emphasized that these problems are related to the
sequential PAR model [Matalas and Wallis, 1976, p. 69] and
not to the disaggregation model itself. A practical solution to
overcome the problems may be the reduction of the cross-
correlation or autocorrelation coefficients, or even the skew-
ness coefficients of the historical data. Such a reduction of
the characteristics of the historical data is a major problem.
However, in the case of a disaggregation model, such a
modification does not influence the characteristics of the
higher-level variables.
In a similar situation concerning records with different
lengths at different sites Grygier and Stedinger [1990, pp.
31-33] discuss practical solutions to relevant problems,
suggesting modifications to the items of a matrix to be
decomposed, if the decomposition fails. Here another simplified
technique was used. The decomposition of $\mathbf{c}^t = \mathbf{b}^t (\mathbf{b}^t)^T$
can fail at the point where a diagonal element $b_{jj}^t$ is
to be calculated by the equation
$(b_{jj}^t)^2 = c_{jj}^t - \sum_{k=1}^{j-1} (b_{jk}^t)^2$ (60)
If the right-hand side of this becomes negative there is no
real solution for $b_{jj}^t$. To avoid this, one can impose a lower
limit on each diagonal element $b_{jj}^t$ such as
$(b_{jj}^{t,\min})^2 = p \, c_{jj}^t$ (61)
where p is a constant (0 < p ≤ 1). If $b_{jj}^t$ is set equal to this
limit then the nondiagonal elements of the row ($b_{jk}^t$, k =
1, ..., j - 1) can be corrected by multiplication with a
single factor determined so as to regain the validity of (60).
In this way the preservation of Var [$X_j^t$] is assured while a
minimum positive Var [$W_j^t$] is imposed. An indirect beneficial
consequence is that the skewness of $W_j^t$ (or $V_j^t$) cannot
take unreasonably high values. However, the cross correlations
of $X_j^t$ with the lower-level variables of other sites are
apparently reduced with this algorithm.
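The correction just described can be sketched as a modified Cholesky decomposition. The text does not spell out the correcting factor, so the sketch assumes it is the value that makes the squared elements of the row add up again to the corresponding diagonal element of the matrix being decomposed once the diagonal element has been raised to its lower limit. The function name and the default value of p are illustrative only.

```python
import math

def floored_cholesky(c, p=0.05):
    """Lower triangular b approximating c = b b^T, with a floor on the diagonal.

    If the squared diagonal element from (60) falls below p * c[j][j] (61),
    it is set to that limit and the off-diagonal elements of the row are
    rescaled by one common factor so that the row still reproduces c[j][j].
    """
    n = len(c)
    b = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for r in range(j):
            s = sum(b[j][m] * b[r][m] for m in range(r))
            b[j][r] = (c[j][r] - s) / b[r][r]
        off = sum(b[j][m] ** 2 for m in range(j))
        diag_sq = c[j][j] - off              # right-hand side of (60)
        floor_sq = p * c[j][j]               # lower limit (61)
        if diag_sq < floor_sq:
            # off > 0 here, because diag_sq < floor_sq <= c[j][j]
            factor = math.sqrt((c[j][j] - floor_sq) / off)
            for m in range(j):
                b[j][m] *= factor
            diag_sq = floor_sq
        b[j][j] = math.sqrt(diag_sq)
    return b
```

For a positive definite matrix whose diagonal elements stay above the limit, the result coincides with the ordinary Cholesky factor. When the limit is activated, the diagonal of the product of the factor with its transpose is still reproduced, but the off-diagonal elements of the affected row are reduced, in line with the loss of cross correlation noted above.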
Another solution to the same problem could be the reduction
of the matrix $\mathbf{b}^t$ in a manner such that only the highest
concurrent cross-correlation coefficient is preserved, thus
having one off-diagonal nonzero element per row of $\mathbf{b}^t$. This
will have as a consequence the further reduction of the
model parameter set (see next subsection). Other methods of
parameter estimation for the PAR(1) model could be more
effective, but they have not been examined in the framework
of this study.
Some secondary limitations result from the constraints of
the quadratic partition procedure, i.e., inequalities (A19)-(A21),
if this particular procedure is selected for the model
configuration. A relevant illustration of the ranges of applicability
of the quadratic partition procedure is given in
Figure 1. This is done by numerical applications of the model
for several combinations of parameters and selection of
those combinations which yield a limiting fulfillment of the
constraints. One can observe that if the lower-level variables
are quite skewed and strongly correlated then a problem
might arise for the model. However, it was observed from
the applications that these constraints are generally fulfilled
if the skewness coefficients $\gamma_j^t$ have reasonable values. A
solution to situations of very large $\gamma_j^t$ could be the invocation
of the lognormal scheme described in section 3, which
theoretically has no limits of applicability.
Fig. 1. Illustration scheme of the ranges of applicability of the
quadratic partition procedure. The curves represent upper and
lower limits of correlation coefficients of lower-level variables (Corr
[X, Y]) that lead to real solutions of the equations. These limits are
given versus the ratio of the variances (Var [X]/Var [Y]) and for two
combinations of the skewness coefficients of the lower-level variables:
(a) $C_{s,X} = C_{s,Y}$ and (b) $C_{s,X} = 2C_{s,Y}$.
Model Parameters
As stated in section 4 the present configuration of DDM
uses the minimum set of parameters, equal to that of the
sequential Markov model. A further reduction of the parameter
set is possible (but not necessary) by defining the $\mathbf{b}^t$
matrices with only one nonzero off-diagonal element in each row,
in such a manner that only the highest concurrent cross-correlation
coefficient is preserved.
Table 1 summarizes the number of parameters needed for
the model, in comparison with other well-known models. It
is obvious that DDM has the fewest parameters. Note that
the results given in Table 1 concern only second-order
statistics. To these results the number of means and skewness
coefficients of the lower-level variables and of the
parameters of the model used for the generation of the
higher-level variables must be added.
Apparently, the parsimony of parameters of DDM in-
volves a cost (omission of terms of the full covariance
matrices), but this cost is not associated with the preserva-
tion of the additive property or of the correlation between
consecutive lower-level variables of consecutive periods.
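The general formulas of Table 1 can be evaluated for any number of subperiods k and sites n with a few lines of Python. The sketch below keys each count by the model class and formula as printed in the table; the function name and the dictionary keys are illustrative.

```python
def second_order_parameter_counts(k, n):
    """Number of second-order parameters for k subperiods and n sites,
    using the general formulas listed in Table 1."""
    kn = k * n
    return {
        "full all-at-once, kn(kn + 2n + 1)/2": kn * (kn + 2 * n + 1) // 2,
        "full all-at-once, kn(3kn + 2n + 1)/2": kn * (3 * kn + 2 * n + 1) // 2,
        "condensed, kn(5n + 1)/2 - n^2": kn * (5 * n + 1) // 2 - n ** 2,
        "condensed, kn(7n + 1)/2 - 3n^2": kn * (7 * n + 1) // 2 - 3 * n ** 2,
        "staged, kn(n + 5)/2 - n + 90": kn * (n + 5) // 2 - n + 90,
        "DDM full size, kn(n + 3)/2": kn * (n + 3) // 2,
        "DDM reduced size, k(3n - 1)": k * (3 * n - 1),
    }
```

For k = 12 and n = 3 the counts are 774, 2070, 279, 369, 231, 108 and 96, in the order listed. The two DDM formulas follow from counting, per subperiod, n variances and n lag one autocovariances plus either all n(n - 1)/2 lag zero cross covariances or only n - 1 of them.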
Preservation of Marginal Distributions
Figures 2a, 2b and 3 give examples for the model perfor-
mance concerning the preservation of characteristics of the
marginal distributions of lower-level variables. They origi-
nate from two applications of the model for the study of the
water supply system of Athens, Greece [Koutsoyiannis et
al., 1990]. Application 1 involves the simulation of concur-
rent rainfall and runoff in three basins supplying the water
system of Athens. Application 2 refers to the simulation of
lake evaporation of the three relevant reservoirs. In both
applications, synthetic annual data of 5000 years were dis-
aggregated by the model into monthly data. The gamma
distribution function and the quadratic partition procedure
were adopted for monthly rainfall and runoff data. The
Gauss distribution function and the linear partition proce-
dure were adopted for monthly evaporation data. The figures
indicate the good performance of the model in both cases. As
shown in Figure 2, though the model is a stepwise one,
errors in skewness do not build up as we progress through
the 12 months.
An application of a previous specific version of the model
in hourly rainfall generation, where the marginal distribution
of lower-level variables was J-shaped gamma or Weibull, can
be found elsewhere [Koutsoyiannis, 1988; Koutsoyiannis
and Xanthopoulos, 1990].
Table 2 illustrates a comparison of DDM to the Valencia-Schaake
model. A gamma higher-level variable, Z, is disaggregated
into two correlated exponential lower-level variables,
$X^1$ and $X^2$, and it is assumed that there is no
correlation between subsequent higher-level time steps.
DDM was directly applied to the actual (not transformed)
variables using the quadratic partition procedure with a
parallel procedure rejecting negative values (series 1 and 2).
Specifically, series 1 is generated by using the exact cumulants
of the known gamma distribution of Z. For the generation
of series 2 the false assumption that the cumulants
$\kappa_4 = \kappa_5 = \kappa_6 = 0$ was used in order to investigate the effect of
such a false assumption on the statistics of the generated
series. The results indicate that this effect is not significant.
For the application of the Valencia-Schaake model in this
example the Wilson-Hilferty transformation of the actual
variables was used, followed by the inverse transformation
of the generated data. Since there are only two lower-level
variables the Valencia-Schaake model is equivalent to a
partition procedure using the Wilson-Hilferty transformation,
in a way similar to the logarithmic transformation
described in section 3. In the case of the Valencia-Schaake
model (series 3) the additive property apparently is not
preserved. By the adjustment procedure described by (19)
and applied to the inverse-transformed data the additive
property was then regained and, in addition, negative values
were corrected (series 4). For reasons of comparison the
direct Wilson-Hilferty transformation was again applied to
the data of series 1, 2 and 4 and the calculated relevant
statistics are also shown in Table 2.

TABLE 1. Comparison of the Number of Second-Order Statistics Used in Various Disaggregation Models

Model types: LAST model (from Grygier and Stedinger [1988]); Stedinger-Pei-Cohn model [Stedinger et al., 1985]; Stedinger and Vogel [1984]; Full size (all lag zero covariances); Reduced size (one lag zero covariance)

Number of Second-Order Parameters
In General                 k = 12, n = 1   k = 12, n = 3   k = 12, n = 6   k = 12, n = 10
Full All-at-Once Disaggregation Models
kn(kn + 2n + 1)/2                90              774            3,060            8,460
kn(3kn + 2n + 1)/2              234            2,070            8,244           22,860
Condensed Disaggregation Models
kn(5n + 1)/2 - n^2               35              279            1,080            2,960
kn(7n + 1)/2 - 3n^2              45              369            1,440            3,960
Staged Disaggregation Models
kn(n + 5)/2 - n + 90            125              231              480              980
Dynamic Disaggregation Model (Based on a Sequential Markov Model)
kn(n + 3)/2                      24              108              324              780
k(3n - 1)                        24               96              204              348
One can observe the generally good performance of DDM
in the preservation of the statistics of the actual variables.
The Valencia-Schaake model exhibits larger deviations from
the anticipated values (mainly in the variance and third
moment of X2) but this is not necessarily a major flaw of the
model, since it was not supposed to reproduce these parameters.
Concerning the transformed variables, DDM does not
exactly fit the theoretical values and it was not supposed to.
The transformed data of series 3 of the Valencia-Schaake
model are in agreement with the anticipated values, as was
expected. However, this agreement disappears when the
correction to regain the additive property is made as seen in
series 4. Obviously, the transformed corrected data are no
longer Gauss distributed (they have nonzero third moments)
and they do not have unit variance as they should. The bias
has been introduced by the variance of the adjusting factors
for regaining the additive property.

Fig. 2. Comparison of monthly skewness and correlation coefficients
of generated data: (a) skewness coefficient, application 1,
location 3 (Mornos runoff), (b) skewness coefficient, application 2,
location 2 (Mornos reservoir evaporation), (c) lag one autocorrelation
coefficient of monthly values for application 1, location 3
(Mornos runoff).
Preservation of Correlations
Figure 2c, originating from the case study discussed in the
previous subsection, indicates the preservation of the lag
one correlation coefficients achieved by the model. A similar
level of performance was found for the preservation for the
concurrent correlation coefficients for different sites. Note in
Figure 2c that the good performance of the model is ex-
tended also to the correlation of the first monthly runoff of
the current period and the last monthly runoff of the past
period. A more specific view of this topic is obtained from
Table 3, which refers to all runoff and evaporation series of
applications 1 and 2 (the rainfall autocorrelation coefficients
were negligible). Recall from section 4 that to achieve these
results no special treatment is required, since the model is
applied without any modification for the first lower-level
variable $X_j^1$ using as previous information ($X_j^0 = x_j^0$) the last
lower-level variable of the previous period.

Fig. 3. Comparison of empirical and theoretical distribution
functions of monthly values for application 1, location 3, subperiod
2 (Mornos runoff in November). The two-parameter gamma distribution
was adopted for the simulation.

TABLE 2. An Example for the Comparison of the Dynamic Disaggregation Model (DDM) to the
Valencia-Schaake Model (VSM) for Non-Gaussian Variables

                                Simulated Values From DDM/        Simulated Values From VSM
                                Quadratic Partition Procedure
                 Theoretical    Gamma Cumulants  Gauss Cumulants  Raw Data     Corrected Data
Statistic        Values         (Series 1)       (Series 2)       (Series 3)   (Series 4)

Actual Variables
E[Z]               3.000          3.067            3.069            3.031        3.034
E[X^1]             1.000          1.024            1.024            1.006        1.006
E[X^2]             2.000          2.043            2.045            2.019        2.028
Var [Z]            8.000          8.242            7.998            8.508        8.489
Var [X^1]          1.000          1.053            1.023            0.997        0.958
Var [X^2]          4.000          4.037            3.939            4.351        4.427
μ3[Z]             40.503         44.910           41.890           65.665       65.667
μ3[X^1]            2.000          2.311            2.326            2.061        1.807
μ3[X^2]           16.000         16.417           14.744           31.148       29.723
Corr [X^1, X^2]    0.750          0.765            0.756            0.753        0.754
Corr [X^1, Z]      0.884          0.893            0.888            0.853        0.880
Corr [X^2, Z]      0.972          0.973            0.972            0.978        0.975

Transformed Variables (Wilson-Hilferty)
E[Z]               0.000          0.053            0.057            0.010        0.023
E[X^1]             0.000          0.060            0.073            0.008       -0.030
E[X^2]             0.000          0.030            0.019            0.010       -0.051
Var [Z]            1.000          0.882            0.873            1.004        0.948
Var [X^1]          1.000          0.866            0.827            0.996        1.180
Var [X^2]          1.000          0.984            1.036            0.998        1.269
μ3[Z]              0.000          0.271            0.237            0.045        0.191
μ3[X^1]            0.000          0.299            0.287            0.019       -0.505
μ3[X^2]            0.000          0.005           -0.114            0.071       -0.486
Corr [X^1, X^2]    0.782*         0.797            0.777            0.783        0.781
Corr [X^1, Z]      0.872*         0.881            0.866            0.873        0.892
Corr [X^2, Z]      0.981*         0.982            0.980            0.981        0.956

The Valencia-Schaake model uses the Wilson-Hilferty transformation and is combined with a
correction procedure for regaining the additive property. The size of all synthetic series is 16,000.
*Estimations obtained by simulation.
From a practical point of view the model performance
concerning the preservation of the correlations between the
consecutive lower-level variables is sufficient. However,
theoretically we cannot speak about exact preservation, but just an
approximation. A systematic investigation of the model's behavior
with regard to this is depicted in Figure 4, summarizing
the results of a series of simulations which concern the single-site
disaggregation of a higher-level variable into two components.
The correlation coefficient between consecutive lower-level
variables of consecutive periods ranges between -1 and
+1. The results of this figure are explained below.

TABLE 3. Correlation Coefficients of the Lower-Level
Variables Between the First Subperiod of the Present
Period and the Last Subperiod of the Previous Period,
for Applications 1 and 2

                        Distribution   Historical   Generated
Variable/Site           Function       Data         Data
Application 1
Runoff/Evinos           gamma          0.32         0.29
Runoff/Mornos           gamma          0.16         0.09
Runoff/Yliki            gamma          0.59         0.54
Application 2
Evaporation/Evinos      Gauss          0.75         0.76
Evaporation/Mornos      Gauss          0.02         0.03
Evaporation/Yliki       Gauss          0.07         0.07
Stedinger and Vogel [1984] showed that disaggregation
models do not preserve the correlation between the lower-
level variables of the previous period and the higher-level
variables of the current period, or equivalently and following
the notation of the present study, the quantities $\mathrm{Corr}[X^t, Z^2]$,
$t = 1, \ldots, k$. Since such quantities are used as
parameters in the Mejia-Rousselle model this model is totally
affected by this shortcoming. The original Valencia-Schaake
model does not use these quantities, but it does not repro-
duce at all the correlation between consecutive lower-level
variables of consecutive periods. The dynamic disaggrega-
tion model, like the Valencia-Schaake model, does not use
these quantities as parameters, but it performs well in
reproducing the latter correlation. However, its behavior is
similar to that of the Mejia-Rousselle model. In the examples
of Figure 4 the two models gave exactly the same curves
which are not distinguishable from each other (dashed lines
represent curves of both models). One can observe in Figure
4 significant deviations of the simulated $\mathrm{Corr}[X^t, Z^2]$ from
their anticipated values. The deviations of $\mathrm{Corr}[X^1, X^2]$ are
[Figure 4: three panels (a), (b), (c) plotting theoretical and simulated correlation coefficients against $a^1 = \mathrm{Corr}[X^0, X^1]$, over the range -1.0 to 1.0.]
Fig. 4. Illustration scheme for the performance of the PAR(1)
model configuration with regard to the preservation of the correlation
of (a) the lower-level variables of the current period with the
higher-level variables of the next period, (b) the lower-level and
higher-level variables of the current period and (c) consecutive
lower-level variables of the same or consecutive periods. The curves
of the Mejia-Rousselle model are exactly the same as those of DDM
(dashed lines). The assumptions for the construction of the scheme
are (1) disaggregation of a higher-level variable into two components,
(2) $\mathrm{Var}[X^1] = \mathrm{Var}[X^2] = 1$ and (3) $\mathrm{Cov}[X^0, X^1] = \mathrm{Cov}[X^1, X^2]$.
The size of the synthetic series is 16,000.
negligible, but there is a small difference in $\mathrm{Corr}[X^0, X^1]$,
which is maximum near $\pm 0.5$.
Consequently, the failure in the reproduction of the quantities
$\mathrm{Corr}[X^t, Z^2]$ affects to a degree the entire correlation
structure of the model, regardless of their presence in the
model formulation. But what is the reason for this influence,
if it is not the presence of quantities like $\mathrm{Corr}[X^t, Z^2]$ in the
model formulation? The answer may be expressed as follows:
Disaggregation models are techniques for "thickening"
sample functions of stochastic processes, already
known in lower resolution (determined by the higher-level
time step). The original Valencia-Schaake model does not
see outside of each higher-level step, and thus it is known to
be weak in the reproduction of the autocorrelation. The
Mejia-Rousselle model and the above configuration of DDM
both have a way to see to the past, but not to the future;
hence they exhibit a similar behavior. However, the future is
known at the higher-level resolution; thus a better configu-
ration capable of seeing in both directions is possible. This is
attempted in the configuration of the next section.
Retaining the Markovian structure of the sequence of
lower-level variables (equations (43)-(47)), we try to modify
the parameter determination procedure in order to take
account of the higher-level variables of the next period, $Z^2$.
This may be based on the following equation, where $Z^2$ is
an exogenous input:
$$X^t = f^t X^{t-1} + g^t Z^2 + h^t Q^t \tag{62}$$
or, equivalently,
$$X_j^t = f_j^t X_j^{t-1} + g_j^t Z_j^2 + \sum_{r=1}^{j} h_{jr}^t Q_r^t, \qquad j = 1, \ldots, n \tag{63}$$
where $f^t$ and $g^t$ are assumed $(n \times n)$ diagonal, i.e.,
$f^t = \mathrm{diag}(f_1^t, f_2^t, \ldots, f_n^t)$, $g^t = \mathrm{diag}(g_1^t, g_2^t, \ldots, g_n^t)$,
whereas $h^t = [h_{jr}^t]$ is an $(n \times n)$ lower triangular matrix of coefficients.
$Q^t = [Q_j^t]$ is a vector of n random variables independent in time
and location and also independent of $Z_j^2$. We denote $\xi^t = E[Q^t]$,
$\eta^t = \mu_3[Q^t]$, and set (for mathematical convenience)
$\mathrm{Var}[Q^t] = 1$.
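The parameters $f^t$, $g^t$ and $h^t$ can be computed from the
parameter set of the associated sequential PAR(1) model,
without introducing any other independent parameter, by
the following equations derived in Appendix C:
$$\sigma_{jj}^{t-1} f_j^t + \alpha_j^{t-1} \sigma_{jj}^{t-1} g_j^t = a_j^t \sigma_{jj}^{t-1} \tag{64}$$
$$\alpha_j^{t-1} \sigma_{jj}^{t-1} f_j^t + \nu_{jj} g_j^t = \alpha_j^t \sigma_{jj}^t \tag{65}$$
$$h^t (h^t)^T = \sigma^t - f^t \sigma^{t-1} a^t - g^t \alpha^t \sigma^t \tag{66}$$
$$\alpha^t = \mathrm{diag}(\alpha_1^t, \alpha_2^t, \ldots, \alpha_n^t) = \sum_{l=1}^{k} a^l \cdots a^1 \, a^k \cdots a^{t+1} \tag{67}$$
$$\nu = \mathrm{Cov}[Z, Z] \tag{68}$$
with $\sigma^t$ and $a^t$ defined by (48) and (45), respectively.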
In particular, (64) and (65) when solved simultaneously
give $f_j^t$ and $g_j^t$, while $h^t$ can be derived by using (66) as the
lower triangular matrix.
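The computation described by (64)-(67) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the function names `alpha_diag` and `parx_parameters` are hypothetical, the diagonal matrices are stored as vectors, and the use of a Cholesky factorization to obtain the lower triangular $h^t$ assumes that the right-hand side of (66) is symmetric positive definite. The example values (two subperiods, one site, unit variances, lag-one coefficient 0.5) follow the assumptions stated for Figure 4.

```python
import numpy as np

def alpha_diag(a, t):
    """Diagonal of alpha^t from (67).

    a : array of shape (k, n); row r-1 holds the diagonal of a^r (r = 1..k).
    t : subperiod index, 0 <= t <= k.
    """
    head = np.cumprod(a, axis=0).sum(axis=0)   # sum over l of a^l ... a^1
    tail = np.prod(a[t:], axis=0)              # a^k ... a^(t+1); empty product is 1
    return head * tail

def parx_parameters(sigma_prev, sigma_cur, a_cur, alpha_prev, alpha_cur, nu):
    """Solve (64)-(65) for each site, then factor (66).

    sigma_prev, sigma_cur : (n, n) covariance matrices of X^(t-1) and X^t.
    a_cur, alpha_prev, alpha_cur : length-n diagonals of a^t, alpha^(t-1), alpha^t.
    nu : (n, n) covariance matrix of the higher-level variables.
    """
    n = len(a_cur)
    f = np.empty(n)
    g = np.empty(n)
    for j in range(n):
        s_prev = sigma_prev[j, j]
        c = alpha_prev[j] * s_prev             # Cov[Z_j^2, X_j^(t-1)]
        lhs = np.array([[s_prev, c], [c, nu[j, j]]])
        rhs = np.array([a_cur[j] * s_prev, alpha_cur[j] * sigma_cur[j, j]])
        f[j], g[j] = np.linalg.solve(lhs, rhs)
    hh = (sigma_cur
          - np.diag(f) @ sigma_prev @ np.diag(a_cur)
          - np.diag(g) @ np.diag(alpha_cur) @ sigma_cur)
    h = np.linalg.cholesky(hh)                 # assumes hh is symmetric positive definite
    return f, g, h

# Example: k = 2 subperiods, one site, unit variances, a^1 = a^2 = 0.5.
a = np.full((2, 1), 0.5)
sigma = np.array([[1.0]])
nu = np.array([[3.0]])                         # Var[Z] = 1 + 1 + 2 * 0.5
f, g, h = parx_parameters(sigma, sigma, a[1],
                          alpha_diag(a, 1), alpha_diag(a, 2), nu)
```

In this single-site example the three parameters reproduce the unit variance of $X^2$ exactly, which is a convenient consistency check on (64)-(66).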
The remaining quantities for the complete evaluation of
the model are $\xi^t$ and $\eta^t$. The former can be easily determined
by taking expected values in (62). The computation of
the latter is complicated, since third-order joint moments of
$Z^2$ and $X^{t-1}$ are needed; this can be carried out in a way
analogous to that used for the derivation of similar second-order
moments in Appendix C. Then the following equation
resulting from (63) may be utilized:
$$\eta_j^t = \frac{1}{(h_{jj}^t)^3} \Big\{ \mu_3[X_j^t] - (f_j^t)^3 \mu_3[X_j^{t-1}] - (g_j^t)^3 \mu_3[Z_j^2] - 3 (f_j^t)^2 g_j^t \, \mu_{21}[X_j^{t-1}, Z_j^2] - 3 f_j^t (g_j^t)^2 \mu_{12}[X_j^{t-1}, Z_j^2] - \sum_{r=1}^{j-1} (h_{jr}^t)^3 \eta_r^t \Big\} \tag{69}$$
In a way similar to that used for the PAR(1) configuration
we can proceed to the equations that form the basis for the
moments determination in each step. Here the diagonal
matrix $e^t$ is defined by $e_j^t = h_{jj}^t$ and the lower triangular
matrix $d^t$ (with zeros on the diagonal, similar to (49)) by
$$d^t = I - e^t (h^t)^{-1} \tag{70}$$
By means of $d^t$ and $e^t$, (62) becomes
$$X^t = f^t X^{t-1} + g^t Z^2 + d^t (X^t - f^t X^{t-1} - g^t Z^2) + e^t Q^t \tag{71}$$
This form is similar to (50) and again results in (equation (52))
$$X_j^t = T_j^t + U_j^t + W_j^t$$
where $T_j^t$, $U_j^t$ and $W_j^t$ are now defined in a different way,
$$T_j^t = f_j^t X_j^{t-1} + g_j^t Z_j^2 \tag{72}$$
$$U_j^t = \sum_{r=1}^{j-1} d_{jr}^t (X_r^t - f_r^t X_r^{t-1} - g_r^t Z_r^2) \tag{73}$$
$$W_j^t = e_j^t Q_j^t \tag{74}$$
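A short numerical check may help in reading (70)-(71). The sketch below, which uses an arbitrary hypothetical lower triangular matrix in place of a fitted $h^t$, builds $e^t$ and $d^t$ and confirms that $d^t$ has zeros on its diagonal and that the rearranged form (71) returns the same innovation term $h^t Q^t$ as (62).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
# Hypothetical lower triangular h^t with nonzero diagonal (illustration only).
h = np.tril(rng.uniform(0.5, 1.5, size=(n, n)))
e = np.diag(np.diag(h))                  # e^t: diagonal matrix with e_j = h_jj
d = np.eye(n) - e @ np.linalg.inv(h)     # (70)

q = rng.standard_normal(n)               # one draw of Q^t
innovation = h @ q                       # X^t - f^t X^(t-1) - g^t Z^2, from (62)
recombined = d @ innovation + e @ q      # the last two terms of (71)
```

Because $d^t$ is strictly lower triangular, the term $U_j^t$ at site j depends only on sites 1 to j-1, which is what allows the sites to be processed one after another.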
By using (52) and (72)-(74), $S_j^t$ can be expressed by an
equation analogous to (56):
$$S_j^t = f_j^t \pi_j^t X_j^{t-1} + \sum_{r=t}^{k} \pi_j^r g_j^r Z_j^2 + \sum_{r=t}^{k} \pi_j^r U_j^r + \sum_{r=t}^{k} \pi_j^r W_j^r \tag{75}$$
where
$$\pi_j^t = 1 + f_j^{t+1} + f_j^{t+1} f_j^{t+2} + \cdots + f_j^{t+1} f_j^{t+2} \cdots f_j^k \tag{76}$$
Equations (52) and (75) may be applied for the calculation of
conditional marginal and joint moments of $(X_j^t, S_j^t)$ and a
systematic algorithm similar to the one given in Appendix B
for the PAR(1) configuration can be constructed from those equations.
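The relation between the step equation and the partial sum $S_j^t$ can be checked numerically. The following sketch assumes a single site, hypothetical coefficient values, and the reading of (75)-(76) in which $\pi_j^t$ collects the cumulative products of the $f_j^r$; the backward recursion used for $\pi$ is the analog of (B11) with $f$ in place of $a$, which is an assumption of this illustration.

```python
import numpy as np

def pi_coefficients(f):
    """Return pi with pi[t] = pi^t for t = 1..k (pi[0] unused, pi[k+1] = 0).

    f[r-1] holds f^r for r = 1..k at one site. The backward recursion
    pi^t = 1 + f^(t+1) * pi^(t+1) reproduces the explicit sum in (76).
    """
    k = len(f)
    pi = np.zeros(k + 2)
    for t in range(k, 0, -1):
        f_next = f[t] if t < k else 0.0   # f^(t+1); unused at t = k since pi^(k+1) = 0
        pi[t] = 1.0 + f_next * pi[t + 1]
    return pi

rng = np.random.default_rng(1)
k, t = 4, 2
f = rng.uniform(-0.8, 0.8, k)             # hypothetical f^r
g = rng.uniform(0.0, 0.3, k)              # hypothetical g^r
z2 = 5.0                                  # hypothetical value of Z^2
uw = rng.standard_normal(k)               # hypothetical values of U^r + W^r
x_prev = 1.3                              # value of X^(t-1)

# Direct evaluation: step through X^r = f^r X^(r-1) + g^r Z^2 + U^r + W^r.
x = x_prev
s_direct = 0.0
for r in range(t, k + 1):
    x = f[r - 1] * x + g[r - 1] * z2 + uw[r - 1]
    s_direct += x

# Closed form (75).
pi = pi_coefficients(f)
s_formula = f[t - 1] * pi[t] * x_prev + sum(
    pi[r] * (g[r - 1] * z2 + uw[r - 1]) for r in range(t, k + 1))
```

Both evaluations give the same sum of the remaining lower-level variables, so conditional moments of $S_j^t$ can be read off the closed form term by term.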
A minor problem related to the above configuration is the
existence of the term $\alpha_j^0$ in (64) when it is applied for t = 1.
This term represents the covariance of the lower-level variable
$X_j^0$, located in the previous period, with the higher-level
variable $Z_j^2$ of the next period. Since the model preserves
correlations only in consecutive periods, $\alpha_j^0$ will not be
preserved. However, the effect of this disparity is not
important. Practical solutions to overcome this problem
include the following: (1) Consider $\alpha^0 = 0$ for the typical
range of the correlation of higher-level variables. (2) For
significantly high values of this correlation calculate $\alpha^0$ from
(67). (3) An approximation making use of an average $\bar{\alpha}^0$
instead of $\alpha^0$ gave good results in the total range of the
correlation of higher-level variables; $\bar{\alpha}^0$ is defined by
$$\bar{\alpha}^0 = \frac{1}{k} \sum_{r=1-k}^{0} \alpha^r \tag{77}$$
[Figure 5: three panels (a), (b), (c) plotting theoretical and simulated correlation coefficients against $a^1 = \mathrm{Corr}[X^0, X^1]$, over the range -1.0 to 1.0.]
Fig. 5. Illustration scheme for the performance of the PARX(1)
model configuration with regard to the preservation of the correla-
tion of (a) the lower-level variables of the current period with the
higher-level variables of the next period, (b) the lower-level and
higher-level variables of the current period and (c) consecutive
lower-level variables of the same or consecutive periods. The
assumptions are similar to those of Figure 4.
where $\alpha^r$ is calculated from (67), with the observation that
$a^{m-k} = a^m$.
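The averaging in option (3) can be sketched as below for a single site. The code assumes that the average in (77) runs over the k subperiods of the previous period (r = 1-k, ..., 0) and that (67) is extended to nonpositive r through the periodicity $a^{m-k} = a^m$; the function names are hypothetical.

```python
import numpy as np

def alpha_general(a, r):
    """alpha^r from (67) for any integer r <= k at a single site.

    a[m-1] holds a^m for m = 1..k; indices outside 1..k are wrapped
    using the periodicity a^(m-k) = a^m.
    """
    k = len(a)
    head = np.cumprod(a).sum()             # sum over l of a^l ... a^1
    tail = 1.0
    for m in range(r + 1, k + 1):          # factors a^(r+1) ... a^k
        tail *= a[(m - 1) % k]
    return head * tail

def alpha_bar0(a):
    """Average of alpha^r over r = 1-k, ..., 0 (assumed reading of (77))."""
    k = len(a)
    return sum(alpha_general(a, r) for r in range(1 - k, 1)) / k

a = np.array([0.5, 0.5])                   # k = 2, a^1 = a^2 = 0.5
alpha0 = alpha_general(a, 0)               # option (2): alpha^0 from (67)
alpha0_avg = alpha_bar0(a)                 # option (3): averaged value
```

For these values the averaged term is smaller than $\alpha^0$ itself, since it also includes subperiods lying further back in the previous period.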
The PARX(1) configuration is more complicated than the
PAR(1), particularly in the handling of the skewness, though
they have a common background. However, it performs
well with all kinds of lagged covariances of higher-
and lower-level variables, as indicated in Figure 5 (similar to
Figure 4), which refers to the numerical investigation of the
previous section and was obtained by the use of the
PARX(1) configuration. Note that the developed method
does not require any assumption about the model used for
the generation of the sequence of higher-level variables. In
that respect it is structurally different from a recent method
by Lin [1990] which depends on the specific generating
model of the higher-level variables and aims toward the
modification of the Mejia-Rousselle parameter estimators for
preserving the covariance properties of lower-level variables.
Based on a nonlinear partition procedure, linked with an
appropriate implementation of the sequential Markov model,
a multivariate dynamic disaggregation model was developed.
Important features of the model are (1) the assurance of the
preservation of the additive property, (2) the stepwise ap-
proach, (3) the modular structure (composed of two parts
studied separately), and (4) the explicit analytical relations
composing its structure.
The model is parameter parsimonious, since the minimum
essential set of statistics of the lower-level variables is
maintained (first-, second-, and third-order marginal mo-
ments, lag zero cross-correlation coefficients and lag one
autocorrelation coefficients). The model parameters are es-
timated directly from the historical data with the usual
statistical methods of the sequential Markov model.
Various configurations of the model, resulting from the
use of either a different partition procedure, a different
moments determination procedure or a different stochastic
structure of the lower-level variables, are possible. Two
configurations of the model are studied, both based on a
Markovian structure of the lower-level variables. The first
(PAR(1)) configuration implements the associated Markov
model in its initial form, while the second (PARX(1)) uses
the known higher-level variable of the next period as an
exogenous input at the current period. The PAR(1) configuration
is simpler than the PARX(1), particularly in handling
the third-order moments, and gives good approximations of
the correlations of lower-level variables located at consecu-
tive higher-level time steps. However, the PARX(1) config-
uration exhibits better behavior with regard to the correla-
tion properties of lagged lower- and higher-level variables.
Obviously, the model has a strong restriction concerning
the Markovian structure assumed for the lower-level series;
this is the price paid for obtaining the parsimony of parameters.
However, this restriction may not be so important,
given that the errors are not accumulated, since the higher-
level series are generated independently. With the exception
of special cases, the model, similar to other disaggregation
models, is not exact in a strict statistical sense but gives good
approximations of the important statistics of interest.
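By expressing the right-hand sides of (28), (29) and (31) in
terms of the $a_i$ and then solving the system, we get
$$a_1 = \frac{(\lambda_4 - \lambda_2^2)\,\sigma_{11} - \lambda_3 \sigma_{12}}{(\lambda_4 - \lambda_2^2)\,\lambda_2 - \lambda_3^2} \tag{A1}$$
$$a_2 = \frac{\lambda_2 \sigma_{12} - \lambda_3 \sigma_{11}}{(\lambda_4 - \lambda_2^2)\,\lambda_2 - \lambda_3^2} \tag{A2}$$
$$a_0 = -a_2 \lambda_2 \tag{A3}$$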
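As an illustration of (A1)-(A3), the sketch below recovers the coefficients of a known quadratic from exact moments of a small discrete distribution. It assumes that $\lambda_i$ denotes the i-th central moment of S, that $\sigma_{11} = E[XS]$ and $\sigma_{12} = E[XS^2]$ for centered X and S, and that $g(S) = a_0 + a_1 S + a_2 S^2$; the function name and the three-point distribution are hypothetical.

```python
import numpy as np

def quadratic_mean_coefficients(lam2, lam3, lam4, sig11, sig12):
    """Coefficients (a0, a1, a2) from (A1)-(A3)."""
    det = (lam4 - lam2 ** 2) * lam2 - lam3 ** 2
    a1 = ((lam4 - lam2 ** 2) * sig11 - lam3 * sig12) / det
    a2 = (lam2 * sig12 - lam3 * sig11) / det
    a0 = -a2 * lam2
    return a0, a1, a2

# Hypothetical zero-mean, three-point distribution for S.
s = np.array([-1.0, 0.0, 3.0])
ps = np.array([0.6, 0.2, 0.2])
lam2, lam3, lam4 = (float(np.sum(ps * s ** i)) for i in (2, 3, 4))

# A known quadratic with zero mean: a0 = -a2 * lam2.
a_true = (-0.2 * lam2, 0.7, 0.2)
x = a_true[0] + a_true[1] * s + a_true[2] * s ** 2

# Joint moments computed directly from the distribution.
sig11 = float(np.sum(ps * x * s))
sig12 = float(np.sum(ps * x * s ** 2))

a0, a1, a2 = quadratic_mean_coefficients(lam2, lam3, lam4, sig11, sig12)
```

With $\lambda_3 = \sigma_{12} = 0$ the same function returns $a_2 = 0$ and $a_1 = \sigma_{11}/\lambda_2$, the linear case.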
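The equations for the $b_i$, determining $f(S)$, are obtained
from (30) and (32), after a number of operations. In the given
set of equations, the one degree of freedom has been given to
$b_2$, as explained further below.
$$b_0 = \left[ \frac{p}{\beta_2^2 \lambda_4 + 2 \beta_1 \beta_2 \lambda_3 + (\beta_1^2 + 2 \beta_2) \lambda_2 + 1} \right]^{1/2} \tag{A4}$$
$$b_1 = \beta_1 b_0 \tag{A5}$$
$$b_2 = \beta_2 b_0 \tag{A6}$$
$$\beta_2 = \begin{cases} 0, & \tau_0 \ge 0 \\ (-\tau_1 + \Delta_2^{1/2})/\tau_2, & \tau_0 < 0,\ \tau_1 \ge 0 \\ (-\tau_1 - \Delta_2^{1/2})/\tau_2, & \tau_0 < 0,\ \tau_1 < 0 \end{cases} \tag{A7}$$
$$\beta_1 = \begin{cases} \dfrac{-(\lambda_4 - q \lambda_3) \beta_2 - \lambda_2 \pm \Delta_1^{1/2}}{\lambda_3 - q \lambda_2}, & \lambda_3 - q \lambda_2 \ne 0 \\[2ex] \dfrac{1}{2} \, \dfrac{q - (\lambda_5 - q \lambda_4) \beta_2^2}{\lambda_2 + (\lambda_4 - q \lambda_3) \beta_2}, & \lambda_3 - q \lambda_2 = 0 \end{cases} \tag{A8}$$
$$\tau_0 = (\lambda_3 - q \lambda_2) q + \lambda_2^2 \tag{A9}$$
$$\tau_1 = (\lambda_4 - q \lambda_3) \lambda_2 - (\lambda_3 - q \lambda_2)^2 \tag{A10}$$
$$\tau_2 = (\lambda_4 - q \lambda_3)^2 - (\lambda_3 - q \lambda_2)(\lambda_5 - q \lambda_4) \tag{A11}$$
$$\Delta_2 = \tau_1^2 - \tau_0 \tau_2 \tag{A12}$$
$$\Delta_1 = \tau_0 + 2 \tau_1 \beta_2 + \tau_2 \beta_2^2 \tag{A13}$$
$$q = [\sigma_{21} - a_2^2 \lambda_5 - 2 a_1 a_2 (\lambda_4 - \lambda_2^2) - (a_1^2 - 2 a_2^2 \lambda_2) \lambda_3]/p \tag{A14}$$
$$p = \eta_2 - a_1^2 \lambda_2 - 2 a_1 a_2 \lambda_3 - a_2^2 (\lambda_4 - \lambda_2^2) \tag{A15}$$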
The meaning of (A7) is that $\beta_2$ (and hence $b_2$) is set to zero
if possible (that is, if it does not result in nonreal solutions for
$b_1$), thus downgrading the quadratic equation to a linear one.
Otherwise, it takes the absolutely minimum value which
assures $\Delta_1 \ge 0$ and hence a real solution of $b_1$.
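The selection rule in (A7), together with (A4)-(A13), can be sketched as follows. This is an illustration under several assumptions rather than the author's code: the $\lambda_i$ are central moments of S, $f(S) = b_0 + b_1 S + b_2 S^2$, the constant term of (A9) is read as $\lambda_2^2$, and p and q are taken as $E[f^2(S)]$ and $E[f^2(S)\,S]/p$ so that they can be computed directly from a hypothetical known $f$. The function name is hypothetical, and the degenerate cases $\tau_2 = 0$ or $\Delta_2 < 0$ are not handled.

```python
import numpy as np

def variance_function_coefficients(lam, q, p):
    """Candidate triplets (b0, b1, b2) from (A4)-(A13).

    lam : mapping with lam[i] = i-th central moment of S, i = 2..5.
    Returns one candidate per admissible root of (A8).
    """
    l2, l3, l4, l5 = lam[2], lam[3], lam[4], lam[5]
    tau0 = (l3 - q * l2) * q + l2 ** 2
    tau1 = (l4 - q * l3) * l2 - (l3 - q * l2) ** 2
    tau2 = (l4 - q * l3) ** 2 - (l3 - q * l2) * (l5 - q * l4)
    if tau0 >= 0:
        beta2 = 0.0                                    # first case of (A7)
    else:
        delta2 = tau1 ** 2 - tau0 * tau2               # (A12)
        sign = 1.0 if tau1 >= 0 else -1.0
        beta2 = (-tau1 + sign * np.sqrt(delta2)) / tau2
    delta1 = max(tau0 + 2 * tau1 * beta2 + tau2 * beta2 ** 2, 0.0)  # (A13), clipped for round-off
    c = l3 - q * l2
    if abs(c) > 1e-12:
        base = -(l4 - q * l3) * beta2 - l2
        beta1_roots = [(base + np.sqrt(delta1)) / c, (base - np.sqrt(delta1)) / c]
    else:
        beta1_roots = [0.5 * (q - (l5 - q * l4) * beta2 ** 2)
                       / (l2 + (l4 - q * l3) * beta2)]
    candidates = []
    for beta1 in beta1_roots:
        denom = beta2 ** 2 * l4 + 2 * beta1 * beta2 * l3 + (beta1 ** 2 + 2 * beta2) * l2 + 1
        b0 = np.sqrt(p / denom)                        # (A4)
        candidates.append((b0, beta1 * b0, beta2 * b0))
    return candidates

# Hypothetical check: S on three points, f(S) = 2 (1 + 0.1 S).
s = np.array([-1.0, 0.0, 3.0])
ps = np.array([0.6, 0.2, 0.2])
lam = {i: float(np.sum(ps * s ** i)) for i in range(2, 6)}
f_true = 2.0 * (1.0 + 0.1 * s)
p = float(np.sum(ps * f_true ** 2))
q = float(np.sum(ps * f_true ** 2 * s)) / p
candidates = variance_function_coefficients(lam, q, p)
```

In this example $\tau_0 \ge 0$, so $\beta_2 = 0$ and one of the two candidates returned reproduces the known coefficients (2.0, 0.2, 0.0); the other is the second valid root of (A8).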
It is noted that (A8) normally yields two different values of
$\beta_1$, and consequently two couples $(b_0, b_1)$, both valid; it
was assumed that the one resulting in the smallest absolute
value of skewness $\theta_3$ is finally selected. The value of $\theta_3$ is
obtained from (33), which may be written in the form
$$\theta_3 = [\eta_3 - \omega(a, a, a) - 3 \omega(a, b, b)]/\omega(b, b, b) \tag{A16}$$
where the symbol $\omega(\cdot, \cdot, \cdot)$ is an abbreviation for the
following expression, where k, l, m denote triplets of
parameters such as $k = (k_0, k_1, k_2)$, etc.:
$$\begin{aligned} \omega(k, l, m) ={} & k_0 l_0 m_0 + (k_0 l_0 m_2 + k_0 l_1 m_1 + k_0 l_2 m_0 + k_1 l_0 m_1 + k_1 l_1 m_0 + k_2 l_0 m_0) \lambda_2 \\ & + (k_0 l_1 m_2 + k_0 l_2 m_1 + k_1 l_0 m_2 + k_1 l_1 m_1 + k_1 l_2 m_0 + k_2 l_0 m_1 + k_2 l_1 m_0) \lambda_3 \\ & + (k_0 l_2 m_2 + k_1 l_1 m_2 + k_1 l_2 m_1 + k_2 l_0 m_2 + k_2 l_1 m_1 + k_2 l_2 m_0) \lambda_4 \\ & + (k_1 l_2 m_2 + k_2 l_1 m_2 + k_2 l_2 m_1) \lambda_5 + k_2 l_2 m_2 \lambda_6 \end{aligned} \tag{A17}$$
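Expression (A17) is the expected value of a product of three quadratics in S, so it can be coded compactly by grouping terms by total power. The sketch below assumes that $\lambda_i$ is the i-th central moment of S (with $\lambda_0 = 1$, $\lambda_1 = 0$) and that $X = g(S) + f(S)V$ with V independent of S, zero mean and unit variance; the names `omega` and `theta3` and the discrete test distributions are hypothetical.

```python
import numpy as np
from itertools import product

def omega(k, l, m, lam):
    """(A17): E[k(S) l(S) m(S)] for quadratics k0 + k1 S + k2 S^2, etc.

    lam[i] is the i-th central moment of S for i = 0..6.
    """
    return sum(k[i] * l[j] * m[n] * lam[i + j + n]
               for i, j, n in product(range(3), repeat=3))

def theta3(eta3, a, b, lam):
    """(A16): third moment of V implied by the third moment eta3 of X."""
    return (eta3 - omega(a, a, a, lam) - 3.0 * omega(a, b, b, lam)) / omega(b, b, b, lam)

# Hypothetical discrete distributions for S and V (V: mean 0, variance 1, third moment 1.5).
s = np.array([-1.0, 0.0, 3.0])
ps = np.array([0.6, 0.2, 0.2])
v = np.array([-0.5, 2.0])
pv = np.array([0.8, 0.2])
lam = [float(np.sum(ps * s ** i)) for i in range(7)]

a = (-0.2 * lam[2], 0.7, 0.2)          # g(S), with a0 = -a2 * lambda_2
b = (2.0, 0.2, 0.0)                    # f(S)
g_s = a[0] + a[1] * s + a[2] * s ** 2
f_s = b[0] + b[1] * s + b[2] * s ** 2

# Third moment of X = g(S) + f(S) V by direct enumeration.
x = g_s[:, None] + f_s[:, None] * v[None, :]
eta3 = float(np.sum(ps[:, None] * pv[None, :] * x ** 3))
theta = theta3(eta3, a, b, lam)
```

When (A8) yields two candidate couples, `theta3` can be evaluated for each and the one with the smaller absolute value kept, as described in the text.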
It can be proved that
$$p = E[X^2] - E[g^2(S)] \tag{A18}$$
and because of (30) it should be the case that
$$p > 0 \tag{A19}$$
It is possible that certain combinations of the initial parameters
listed in (8) yield values of the $a_i$ not obeying (A19). In
that case, though the $a_i$ can be determined, $g(S)$ as defined
with (25) and (26) does not exist; thus (A19) is a necessary
condition for the existence of $g(S)$. A similar necessary
condition for the existence of $f(S)$ is
$$\beta_2^2 \lambda_4 + 2 \beta_1 \beta_2 \lambda_3 + (\beta_1^2 + 2 \beta_2) \lambda_2 + 1 > 0 \tag{A20}$$
which follows from (A4). Further, the existence of $f(S)$
demands that
$$\tau_0 \ge 0 \quad \text{or} \quad \Delta_2 > 0 \tag{A21}$$
which is related to (A7).
If the variables X and Y (or, equivalently, X and S) are
jointly normal, then the quadratic scheme downgrades to the
linear one and (25)-(27) reduce to (9). Indeed, the above set
of equations reduces to
$$a_0 = a_2 = b_1 = b_2 = 0 \tag{A22}$$
$$a_1 = \sigma_{11}/\lambda_2 \tag{A23}$$
$$b_0 = (\eta_2 - \sigma_{11}^2/\lambda_2)^{1/2} \tag{A24}$$
as was theoretically anticipated. This is also true in any case
where $\lambda_3 = \sigma_{12} = \sigma_{21} = 0$.
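The stepwise algorithm involves the computation of the
conditional moments listed in (42) that are required for the
partition procedure. The computation is based on the following
relations, obtained from (52) and (56). The proofs are
given below at the end of this appendix.
$$E[S_j^t | \Omega_j^t] = a_j^t \pi_j^t x_j^{t-1} + \tau_j^t \tag{B1}$$
$$\mathrm{Var}[S_j^t | \Omega_j^t] = \varphi_j^t \tag{B2}$$
$$\mu_3[S_j^t | \Omega_j^t] = \psi_j^t \tag{B3}$$
$$E[X_j^t | \Omega_j^t] = a_j^t x_j^{t-1} + \delta_j^t \tag{B4}$$
$$\mathrm{Var}[X_j^t | \Omega_j^t] = (e_j^t)^2 \tag{B5}$$
$$\mu_3[X_j^t | \Omega_j^t] = (e_j^t)^3 \eta_j^t \tag{B6}$$
$$\mathrm{Cov}[X_j^t, S_j^t | \Omega_j^t] = \pi_j^t (e_j^t)^2 \tag{B7}$$
$$\mu_{21}[X_j^t, S_j^t | \Omega_j^t] = \pi_j^t (e_j^t)^3 \eta_j^t \tag{B8}$$
$$\mu_{12}[X_j^t, S_j^t | \Omega_j^t] = (\pi_j^t)^2 (e_j^t)^3 \eta_j^t \tag{B9}$$
where
$$\delta_j^t = E[U_j^t | \Omega_j^t] + E[W_j^t] = \sum_{r=1}^{j-1} d_{jr}^t (x_r^t - a_r^t x_r^{t-1}) + e_j^t \xi_j^t \tag{B10}$$
and the following symbols are defined by the backward
recursive relations:
$$\pi_j^t = 1 + a_j^{t+1} \pi_j^{t+1}, \qquad \pi_j^{k+1} = 0 \tag{B11}$$
$$\tau_j^t = \delta_j^t \pi_j^t + \tau_j^{t+1}, \qquad \tau_j^{k+1} = 0 \tag{B12}$$
$$\varphi_j^t = (\pi_j^t)^2 (e_j^t)^2 + \varphi_j^{t+1}, \qquad \varphi_j^{k+1} = 0 \tag{B13}$$
$$\psi_j^t = (\pi_j^t)^3 (e_j^t)^3 \eta_j^t + \psi_j^{t+1}, \qquad \psi_j^{k+1} = 0 \tag{B14}$$
Note that (B11) is equivalent to (57), but simpler for purposes
of application.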
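The backward recursions (B11)-(B14) lend themselves to a single loop from t = k down to t = 1. The sketch below assumes a single site and hypothetical input values; the argument names (`a`, `e`, `eta`, `delta`) stand for $a_j^t$, $e_j^t$, the third moment of the innovation, and $\delta_j^t$ from (B10), and are this illustration's reading of the symbols.

```python
import numpy as np

def backward_recursions(a, e, eta, delta):
    """Evaluate (B11)-(B14) at one site.

    Each input has length k, with element t-1 holding the value for
    subperiod t. Returns arrays (pi, tau, phi, psi) of length k.
    """
    k = len(a)
    pi = np.zeros(k + 1)     # extra slot holds the terminal value at t = k + 1
    tau = np.zeros(k + 1)
    phi = np.zeros(k + 1)
    psi = np.zeros(k + 1)
    for i in range(k - 1, -1, -1):                   # i = t - 1
        a_next = a[i + 1] if i + 1 < k else 0.0      # a^(t+1); unused at t = k
        pi[i] = 1.0 + a_next * pi[i + 1]             # (B11)
        tau[i] = delta[i] * pi[i] + tau[i + 1]       # (B12)
        phi[i] = (pi[i] * e[i]) ** 2 + phi[i + 1]    # (B13)
        psi[i] = (pi[i] * e[i]) ** 3 * eta[i] + psi[i + 1]   # (B14)
    return pi[:k], tau[:k], phi[:k], psi[:k]

# Hypothetical values for k = 3 subperiods.
a = np.array([0.5, 0.4, 0.2])
e = np.array([1.0, 2.0, 3.0])
eta = np.array([0.1, 0.2, 0.3])
delta = np.array([1.0, 1.0, 1.0])
pi, tau, phi, psi = backward_recursions(a, e, eta, delta)
```

Here `pi[0]` equals 1 + 0.4 + 0.4 * 0.2 = 1.48, in agreement with the explicit sum in (57), and `phi[0]` is the conditional variance of the whole-period sum given the previous period.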
Proof of (B11). From (57) we have $\pi_j^t = 1 + a_j^{t+1} + a_j^{t+1} a_j^{t+2} + \cdots + a_j^{t+1} a_j^{t+2} \cdots a_j^k = 1 + a_j^{t+1} (1 + a_j^{t+2} + \cdots + a_j^{t+2} \cdots a_j^k) = 1 + a_j^{t+1} \pi_j^{t+1}$. Also, from (57), $\pi_j^k = 1$. This is compatible with the backward recursive relation
if we set $\pi_j^{k+1} = 0$.
Proof of (B4). From (52) we have $E[X_j^t | \Omega_j^t] = E[T_j^t | \Omega_j^t] + E[U_j^t | \Omega_j^t] + E[W_j^t]$. Recall that $T_j^t$ and $U_j^t$ are completely
known because they are expressions of variables
contained in $\Omega_j^t$. From (53) we get $E[T_j^t | \Omega_j^t] = a_j^t x_j^{t-1}$, from
(54) $E[U_j^t | \Omega_j^t] = \sum_{r=1}^{j-1} d_{jr}^t (x_r^t - a_r^t x_r^{t-1})$ and from (55)
$E[W_j^t] = e_j^t \xi_j^t$. Hence, if we introduce $\delta_j^t$ defined by (B10)
we get (B4).
Proof of (B1) and (B12). From (56) we have $E[S_j^t | \Omega_j^t] = a_j^t \pi_j^t x_j^{t-1} + E[\sum_{r=t}^{k} \pi_j^r (U_j^r + W_j^r) | \Omega_j^t] = a_j^t \pi_j^t x_j^{t-1} + \sum_{r=t}^{k} \pi_j^r \delta_j^r$. Note that, because of the horizontal course followed,
all $U_j^r$ for r > t are known since they are expressions of
lower-level variables of previous sites. Also, note that $E[(U_j^r + W_j^r) | \Omega_j^t] = E[U_j^r | \Omega_j^t] + E[W_j^r] = \delta_j^r$. Thus, if we denote
$\sum_{r=t}^{k} \pi_j^r \delta_j^r = \tau_j^t$ we get (B1). Then it is obvious that this
definition of $\tau_j^t$ leads to the backward recursive relation (B12).
Proof of (B5) and (B6). By subtracting conditional means
from (52) (given $\Omega_j^t$) and then squaring and taking conditional
expected values we get $\mathrm{Var}[X_j^t | \Omega_j^t] = \mathrm{Var}[W_j^t] = (e_j^t)^2 \mathrm{Var}[V_j^t]$ (because of (55)) and since $\mathrm{Var}[V_j^t] = 1$,
(B5) has been proved. Note that $T_j^t$ and $U_j^t$ do not contribute
to the conditional variance because they are known. The
proof of (B6) is quite similar.
Proof of (B2), (B13), (B3) and (B14). By subtracting
conditional means from (56) (given $\Omega_j^t$) and then squaring
and taking conditional expected values we get $\mathrm{Var}[S_j^t | \Omega_j^t] = \sum_{r=t}^{k} (\pi_j^r)^2 \mathrm{Var}[W_j^r] = \sum_{r=t}^{k} (\pi_j^r)^2 (e_j^r)^2$. If we denote
the last sum by $\varphi_j^t$ then it is easily shown that $\varphi_j^t$ is also given
by the recursive relation (B13). Note that $X_j^{t-1}$ and $U_j^r$ in
(56) do not contribute to the conditional variance because
they are known. Also recall that $W_j^r$, $r = t, \ldots, k$, are
independent. The proof of (B3) and (B14) is quite similar.
Proof of (B7), (B8) and (B9). By subtracting conditional
means from each of (52) and (56) (given $\Omega_j^t$) and then
multiplying them and taking conditional expected values we
get $\mathrm{Cov}[X_j^t, S_j^t | \Omega_j^t] = \pi_j^t \mathrm{Var}[W_j^t] = \pi_j^t (e_j^t)^2$. The proof
of (B8) and (B9) is similar.
By subtracting means from (63) and then multiplying
successively by $(X_j^{t-1} - E[X_j^{t-1}])$ and $(Z_j^2 - E[Z_j^2])$ and taking expected values we obtain
$$\mathrm{Var}[X_j^{t-1}] f_j^t + \mathrm{Cov}[Z_j^2, X_j^{t-1}] g_j^t = \mathrm{Cov}[X_j^t, X_j^{t-1}] \tag{C1}$$
$$\mathrm{Cov}[Z_j^2, X_j^{t-1}] f_j^t + \mathrm{Var}[Z_j^2] g_j^t = \mathrm{Cov}[Z_j^2, X_j^t] \tag{C2}$$
By subtracting means from (62), then postmultiplying by
$(X^t - E[X^t])^T$, taking expected values, and also considering that
$\mathrm{Cov}[Q^t, X^t] = (h^t)^T$, we obtain
$$h^t (h^t)^T = \mathrm{Cov}[X^t, X^t] - f^t \mathrm{Cov}[X^{t-1}, X^t] - g^t \mathrm{Cov}[Z^2, X^t] \tag{C3}$$
The term $\mathrm{Cov}[X^{t-1}, X^t]$ in (C3) (also present in (C1))
equals $\sigma^{t-1} a^t$ as is easily derived from (43), which is still
valid. The presence of terms like $\mathrm{Cov}[Z^2, X^t]$ in (C1)-(C3)
does not append any parameter to the set discussed in
section 4. Indeed, the covariance matrix of $Z^2 = X^{k+1} + \cdots + X^{2k}$
and $X^t$ can be determined by iterative use of
(43) in terms of $a^l$, $l = 1, \ldots, k$, and $\sigma^t$ (note that $a^{t+k} = a^t$),
since it can be shown that
$$\mathrm{Cov}[Z^2, X^t] = (a^k \cdots a^1 + \cdots + a^2 a^1 + a^1) \, a^k \cdots a^{t+1} \sigma^t \tag{C4}$$
or, with the use of $\alpha^t$ defined in (67),
$$\mathrm{Cov}[Z^2, X^t] = \alpha^t \sigma^t \tag{C5}$$
By making notational simplifications to (C1), (C2), (C3),
with the use of (48), (68), and (C5), we get (64)-(66).
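As a worked instance of (C4), consider the smallest nontrivial case, k = 2 at a single site with t = 1. The derivation below is an illustration added for clarity; it assumes only the lag relation $\mathrm{Cov}[X^s, X^t] = a^s \cdots a^{t+1} \sigma^t$ for s > t, which follows from iterating (43), and the periodicity $a^{t+k} = a^t$.

```latex
\begin{aligned}
\mathrm{Cov}[Z^2, X^1]
  &= \mathrm{Cov}[X^3, X^1] + \mathrm{Cov}[X^4, X^1] \\
  &= a^3 a^2 \sigma^1 + a^4 a^3 a^2 \sigma^1 \\
  &= a^1 a^2 \sigma^1 + a^2 a^1 a^2 \sigma^1
     \qquad (a^3 = a^1,\ a^4 = a^2) \\
  &= (a^2 a^1 + a^1)\, a^2 \sigma^1 = \alpha^1 \sigma^1 .
\end{aligned}
```

The last line has the form of (C4) with k = 2: the bracket collects the products that carry the process across the next period, and the remaining factor $a^2$ carries it from subperiod 1 to the end of the current period.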
Acknowledgments. The research leading to this paper was per-
formed within the frame of the project Appraisal of Existing
Potential for Improving the Water Supply of Greater Athens, project
8576710, sponsored by the Greek Ministry of Environment, Plan-
ning and Public Works. The scientific director of the project, Th.
Xanthopoulos, and the members of the research team who assisted
in the preparation of the data are gratefully acknowledged. The
author wishes to thank the anonymous reviewers and the Associate
Editor and the Editor for their comments and their detailed reviews,
as well as E. Foufoula-Georgiou for the comments and the help she
Bras, R. L., and I. Rodriguez-Iturbe, Random Functions and
Hydrology, Addison-Wesley, Reading, Mass., 1985.
Charbeneau, R., Comparison of the two- and three-parameter
lognormal distributions used in streamflow synthesis, Water Re-
sour. Res., 14(1), 149-150, 1978.
Grygier, J. C., and J. R. Stedinger, Condensed disaggregation
procedures and conservation corrections for stochastic hydrol-
ogy, Water Resour. Res., 24(10), 1574-1584, 1988.
Grygier, J. C., and J. R. Stedinger, SPIGOT, A synthetic streamflow
generation software package, Technical description, version 2.5,
School of Civ. and Environ. Eng., Cornell Univ., Ithaca, N.Y.,
Hoshi, K., and S. J. Burges, Disaggregation of streamflow volumes,
J. Hydraul. Div. Am. Soc. Civ. Eng., 105(HY1), 27-41, 1979.
Johnson, N. L., and S. Kotz, Distributions in Statistics: Continuous
Multivariate Distributions, John Wiley, New York, 1972.
Kendall, M. G., and A. Stuart, The Advanced Theory of Statistics,
vol. 1, Distribution Theory, 2nd ed., C. Griffin, London, 1963.
Kottegoda, N. T., Stochastic Water Resources Technology, Mac-
millan, London, 1980.
Koutsoyiannis, D., A point rainfall disaggregation model (in Greek),
Ph.D. thesis, Natl. Tech. Univ. of Athens, 1988.
Koutsoyiannis, D., and T. Xanthopoulos, A dynamic model for
short-scale rainfall disaggregation, Hydrol. Sci. J., 35(3), 303-322,
Koutsoyiannis, D., N. Mamassis, and J. Nalbantis, Appraisal of
Existing Potential for Improving the Water Supply of Greater
Athens, vol. 13, Stochastic Simulation of Hydrologic Variables
(in Greek), National Technical University of Athens, 1990.
Lane, W. L., Applied stochastic techniques: User's manual, Eng.
and Res. Center, Bur. of Reclam., Denver, Colo., 1979.
Lane, W. L., Corrected parameter estimates for disaggregation
schemes, in Statistical Analysis of Rainfall and Runoff, edited by
V. P. Singh, Water Resources Publications, Littleton, Colo.,
Lane, W. L., and D. K. Frevert, Applied stochastic techniques:
User's manual, personal computer version, Eng. and Res. Center,
Bur. of Reclam., Denver, Colo., 1990.
Lin, G.-F., Parameter estimation for seasonal to subseasonal disag-
gregation, J. Hydrol., 120(1-4), 65-77, 1990.
Matalas, N. C., and J. R. Wallis, Generation of synthetic flow
sequences, in Systems Approach to Water Management, edited
by A. K. Biswas, McGraw-Hill, New York, 1976.
Mejia, J. M., and J. Rousselle, Disaggregation models in hydrology
revisited, Water Resour. Res., 12(2), 185-186, 1976.
Oliveira, G. C., J. Kelman, M. V. F. Pereira, and J. R. Stedinger,
Representation of spatial cross-correlation in a seasonal stream-
flow model, Water Resour. Res., 24(5), 781-785, 1988.
Pereira, M. V. F., G. C. Oliveira, C. C. G. Costa, and J. Kelman,
Stochastic streamflow models for hydroelectric systems, Water
Resour. Res., 20(3), 379-390, 1984.
Salas, J. D., J. W. Delleur, V. Yevjevich, and W. L. Lane, Applied
Modeling of Hydrologic Time Series, Water Resources Publica-
tions, Littleton, Colo., 1980.
Stedinger, J. R., Fitting lognormal distributions to hydrologic data,
Water Resour. Res., 16(4), 481-490, 1980.
Stedinger, J. R., and R. M. Vogel, Disaggregation procedures for
generating serially correlated flow vectors, Water Resour. Res.,
20(1), 47-56, 1984.
Stedinger, J. R., D. Pei, and T. A. Cohn, A condensed disaggrega-
tion model for incorporating parameter uncertainty into monthly
reservoir simulations, Water Resour. Res., 21(5), 665-675, 1985.
Tao, P. C., and J. W. Delleur, Multistation, multiyear synthesis of
hydrologic time series by disaggregation, Water Resour. Res.,
12(6), 1303-1312, 1976.
Todini, E., The preservation of skewness in linear disaggregation
schemes, J. Hydrol., 47, 199-214, 1980.
Valencia, D., and J. C. Schaake, A disaggregation model for time
series analysis and synthesis, Rep. 149, Ralph M. Parsons Lab.
for Water Resour. and Hydrodyn., Mass. Inst. of Technol.,
Cambridge, 1972.
Valencia, D., and J. C. Schaake, Disaggregation processes in
stochastic hydrology, Water Resour. Res., 9(3), 211-219, 1973.
D. Koutsoyiannis, Division of Water Resources, Department of
Civil Engineering, National Technical University of Athens, 5 Iroon
Polytechniou, GR-15700 Zografou, Greece.
(Received April 20, 1992;
revised May 20, 1992;
accepted June 2, 1992.)
... A univariate disaggregation model like Hyetos would generate a synthetic hourly series, fully consistent with the known daily series and, simultaneously, statistically consistent with the actual hourly rainfall series. Obviously, however, a synthetic series obtained by such a manner could not coincide with the actual one, but would be only a likely realization [7,8]. Unfortunately, there is no station having hourly rainfall in the study area so that the equation proposed by Sai Htun Thein was used for rainfall disaggregation. ...
Full-text available
Rainfall and Its Intensity is needed for planning and designing of various water resource projects including infrastructure such as the design of urban drainage works, Storm Sewers, Culverts and etc. The main aim of this research was to develop Rainfall intensity duration curve for the selected towns in western part of Ethiopia. Gumbel and the Log Pearson Type III Probability distribution (LPT III) were used to develop rainfall intensity duration curves for the selected towns in western part of Ethiopia. The IDF curves developed by Gumbel's Extreme value distribution shows, the pattern similarity for all return period, duration and all considered stations but the rainfall intensity shows an increasing with increase in the return period and decrease with rainfall duration increase in all return periods and also show high Rainfall intensity (mm/hr.) so that it was used to derive Empirical equation using Logarithmic transformation method to determine the constants (C, m, a) considered to derive the equation. Then the comparison was made between rainfall developed by using Gamble's Probability distribution and computed by Empirical equation. In all return period and duration of time it shows good relation which approximately equal to unity (R 2) but for 1000 return period differs which is still acceptable without any uncertainty for further application. So, the developed Rainfall intensity duration curves and derived empirical equations can be used for the planning and design of any Water Resources projects and infrastructure in the towns related to water resources.
... Also, non-stationarity trends owing to the underlying dynamics of the physical processes may not be captured effectively. Koutsoyiannis (1999) developed a parsimonious nonlinear multi-variate dynamic disaggregation model (DDM) that followed a two-step approach for simulation of hydrologic time series. Following this, a generalized mathematical framework for stochastic simulation and forecasting problems in hydrology was proposed by Koutsoyiannis (2000) for modeling stochastic processes with shortor long-term memory structure, in which a generalized autocovariance function was implemented within a generalized moving average generating scheme. ...
A simulation-optimization (S-O) framework is developed for the hybrid stochastic modeling of multi-site multi-season streamflows. The multi-objective optimization model formulated is the driver and the multi-site, multi-season hybrid matched block bootstrap model (MHMABB) is the simulation engine within this framework. The multi-site multi-season simulation model is the extension of the existing single-site multi-season simulation model. A robust and efficient evolutionary search based technique, namely, non-dominated sorting based genetic algorithm (NSGA - II) is employed as the solution technique for the multi-objective optimization within the S-O framework. The objective functions employed are related to the preservation of the multi-site critical deficit run sum and the constraints introduced are concerned with the hybrid model parameter space, and the preservation of certain statistics (such as inter-annual dependence and/or skewness of aggregated annual flows). The efficacy of the proposed S-O framework is brought out through a case example from the Colorado river basin. The proposed multi-site multi-season model AMHMABB (whose parameters are obtained from the proposed S-O framework) preserves the temporal as well as the spatial statistics of the historical flows. Also, the other multi-site deficit run characteristics namely, the number of runs, the maximum run length, the mean run sum and the mean run length are well preserved by the AMHMABB model. Overall, the proposed AMHMABB model is able to show better streamflow modeling performance when compared with the simulation based SMHMABB model, plausibly due to the significant role played by: (i) the objective functions related to the preservation of multi-site critical deficit run sum; (ii) the huge hybrid model parameter space available for the evolutionary search and (iii) the constraint on the preservation of the inter-annual dependence. 
Split-sample validation results indicate that the AMHMABB model is able to predict the characteristics of the multi-site multi-season streamflows under uncertain future. Also, the AMHMABB model is found to perform better than the linear multi-site disaggregation model (MDM) in preserving the statistical as well as the multi-site critical deficit run characteristics of the observed flows. However, a major drawback of the hybrid models persists in case of the AMHMABB model as well, of not being able to synthetically generate enough number of flows beyond the observed extreme flows, and not being able to generate values that are quite different from the observed flows.
... Given the general lack of climate, water quantity and quality data at time steps of less than one day, the model developments reported in this paper are based on a daily time step. One of the earliest streamflow disaggregation models was proposed by Valencia and Schaake (1973) for disaggregation of annual flows into seasonal, and since then various disaggregation algorithms have been proposed to disaggregate annual flows to seasonal or monthly flows (Mejia and Rousselle 1976, Grygier and Stedinger 1988, Santos and Salas 1992, Koutsoyiannis 1992, Koutsoyiannis and Manetas 1996. However, the disaggregation of monthly to daily flows remains a challenge. ...
A monthly to daily streamflow disaggregation method is presented as part of an emerging water quality model designed to link with established monthly hydrology and yield models. The daily time step is assumed necessary for simulating water quality dynamics. The method is tested on two catchments in South Africa where observed daily flow data are available. The model includes a volume correction process to ensure daily sums are equal to input monthly flows and this reduces the sensitivity of the results to some model parameters. The sequences of events in the input daily rainfall must be representative of the catchment. Model validation against observed flows achieved Nash-Sutcliffe efficiency values ranging from 0.75 to 0.94 and initial applications of a water quality component suggested little difference between using observed and disaggregated flows. The main practical advantages are simplicity and the fact that the method builds on the experience of existing monthly models. Editor D. Koutsoyiannis; Guest editor G. Mahé
... As noted by Holley and Waymire [5], the independent and identically distributed "bounded generators" give rise to non-ergodic cascades. Recent developments in stochastic rainfall analysis in this direction deal with the introduction of wavelet transforms and importantly, the use of Artificial Neural Network, diffusion model (e.g., [6]), Markovian type models (e.g., [7] [8]) and Disaggregation models (e.g., [9]). ...
Full-text available
The analysis of time series is essential for building mathematical models to generate synthetic hydrologic records, to forecast hydrologic events, to detect intrinsic stochastic characteristics of hydrologic variables as well to fill missing and extend records. To this end, this paper examined the stochastic characteristics of the monthly rainfall series of Ilorin, Nigeria vis-a-vis modelling of same using four modelling schemes. The Decomposition, Square root transformation-deseasonalisation, Composite, and Periodic Autoregressive (T-F) modelling schemes were adopted. Results of basic analysis of the stochastic characteristics revealed that the monthly series does not show any discernible presence of long-term trend, though there is a seeming inter-decadal annual variation. The series exhibits strong seasonality throughout its length, both in the moments and autocorrelation and significantly intermittent. Based on assessment of the respective models, the performance of the different modelling schemes can be expressed in this order: T-F > Composite > Square root transformation-Deseasonalised > Decomposition. Considering the results obtained, modelling of monthly rainfall series in the presence of serial correlation between months should be based on the establishment of conditional probability framework. On the other hand, in view of the inadequacy of these modelling schemes, because of the autoregressive model components in the coupling protocol, nonlinear deterministic methods such as Artificial Neural Network, Wavelet models could be viable complements to the linear stochastic framework.
... The model by Valencia and Schaake (1973) was modified and improved by several researchers (Mejia and Rousselle, 1976; Lane, 1982; Stedinger and Vogel, 1984). Besides linear stochastic disaggregation techniques, there have been alternate approaches, which allow representation of the non-Gaussian data directly in the disaggregation procedure (Tao and Delleur, 1976; Todini, 1980; Koutsoyiannis, 1992; Koutsoyiannis, 2001). These approaches do not require data transformation and preserve the additive property and higher order statistics of the aggregated and disaggregated data by performing a stepwise disaggregation. ...
A deterministic geometric approach, the fractal-multifractal (FM) method, is proposed to temporally downscale (disaggregate) rainfall and streamflow records. The applicability of the FM approach is tested on: (1) two sets of rainfall records—one from Laikakota, Bolivia and the other from Tinkham, Washington, USA; and (2) two distinct sets of streamflow records—for water years 2005 and 2008 from the Sacramento River, California, USA. For the purpose of validation, the available daily records are first aggregated into weekly, biweekly, and monthly records and then the FM method is applied to downscale such sets back into the daily scale. The results indicate that the FM method, coupled with a threshold to capture the high intermittency of rainfall and a smoothing parameter to get the milder texture of streamflow, readily generates daily series (over a year) based on weekly, biweekly, and monthly accumulated information, which reasonably preserves the time evolution of the records (especially for streamflow) and captures a variety of key statistical attributes (e.g., autocorrelation, histogram, and entropy). It is argued that the FM deterministic downscalings may enhance and/or supplement available stochastic disaggregation methods.
Advances in computational power, scientific concepts, and data measurements have led to the development of numerous nonlinear methods to study complex systems normally encountered in various scientific fields. These nonlinear methods often have very different conceptual bases and levels of sophistication and have been found suitable for studying many different types of systems and associated problems. Their relevance to hydrologic systems and ability to model and predict the salient characteristics of hydrologic systems have led to their extensive applications in hydrology over the past three decades or so. This chapter presents an overview of some of the very popular nonlinear methods that have found widespread applications in hydrology. The methods include: nonlinear stochastic methods, data-based mechanistic models, artificial neural networks, support vector machines, wavelets, evolutionary computing, fuzzy logic, entropy-based techniques, and chaos theory. For each method, the presentation includes a description of the conceptual basis and examples of applications in hydrology.
Hydrology was mainly dominated by deterministic approaches until the mid-twentieth century. However, the deterministic approaches suffered from our lack of knowledge on the exact nature of hydrologic system dynamics and, hence, the exact governing equations required for models. This led to the development and application of stochastic methods in hydrology, which are based on the concepts of probability and statistics. Since the 1950s–1960s, hydrology has witnessed the development of a large number of stochastic time series methods and their applications. The existing stochastic methods can be broadly grouped into two categories: parametric and nonparametric. In the parametric methods, the structure of the models is defined a priori and the number and nature of the parameters are generally fixed in advance. On the other hand, the nonparametric methods make no prior assumptions on the model structure, and it is essentially determined from the data themselves. This chapter presents an overview of stochastic time series methods in hydrology. First, a brief account of the history of development of stochastic methods is presented. Next, the concept of time series and relevant statistical characteristics and estimators are described. Finally, several popular parametric and nonparametric methods and their hydrologic applications are discussed.
We present a study performed in the framework of the EDF project on the use of seasonal meteorological forecasts to improve hydro-electrical resources management. Due to the difference of space and time scales, it is indispensable to downscale the (meso-scale) GCM rainfall to the (micro-) scale of the hydrological models. To preserve the scaling properties of the rainfall field, as well as its close interrelation with the dynamics at all scales, we develop a multifractal downscaling algorithm based on the idea that the rain rate cascades from large to small scale in a multiplicative manner: a scale invariant random multiplicative increment determines the rate fraction forwarded from a parent structure to a child one. Firstly, we proceeded to a rather exhaustive space and time analysis of the Météo-France PRECIP data base (about 10 years of high resolution data for 243 rain gages distributed over France territory) in order to estimate the universal multifractal parameters α, C1 as well as the exponent Ht of the scaling anisotropy of time versus space. The latter was empirically estimated to be in full agreement with its theoretical value: Ht = 1/3. Secondly, we develop a cascade model defined with these parameters from space-time pixels corresponding to 243 km × 243 km × 32 days, close to those of the GCM, which are of the order of 250 km × 250 km × 30 days, i.e. 3^5 km × 3^5 km × 2^5 days. This choice is done in agreement with the value of Ht. In conclusion, we discuss how to take into account the orographic effects.
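The multiplicative cascade idea in the abstract above, where a random fraction of each parent's amount is forwarded to each child, can be illustrated with a much simpler one-dimensional sketch. This is not the universal multifractal model of that study: it assumes a branching number of 2, a symmetric Beta-distributed weight, and exact mass conservation at every split. The function name `cascade_disaggregate` and the parameter `beta_shape` are hypothetical and chosen only for illustration.

```python
import random


def cascade_disaggregate(total, levels, rng=None, beta_shape=2.0):
    """Split `total` into 2**levels non-negative values that sum to `total`.

    Illustrative assumption: each parent passes a random fraction w to its
    first child and 1 - w to its second child, with w ~ Beta(beta_shape,
    beta_shape). This is a mass-conserving toy cascade, not the universal
    multifractal generator described in the abstract.
    """
    if rng is None:
        rng = random.Random()
    values = [float(total)]
    for _ in range(levels):
        children = []
        for parent in values:
            w = rng.betavariate(beta_shape, beta_shape)  # fraction forwarded
            children.append(parent * w)
            children.append(parent * (1.0 - w))
        values = children
    return values


# Example: disaggregate a 32-day total of 100 mm into 32 daily amounts.
series = cascade_disaggregate(100.0, levels=5, rng=random.Random(42))
```

Because the two weights at each split sum to one, the fine-scale values add up to the coarse-scale total (up to floating-point rounding), which is the additive property that disaggregation schemes aim to preserve.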
A disaggregation model, referred to as the HB model, which differs from those developed by Valencia and Schaake (VS) and Mejia and Rousselle (MR), was developed specifically to preserve correlations between seasons joining water years in addition to the properties preserved by the basic VS model. The HB model simultaneously disaggregates two consecutive annual flow volumes, then the next two, etc. The HB model is extended to be able to disaggregate skewed seasonal flows that follow three-parameter log-normal distributions (3PLN). - from ASCE Publications Abstracts
The mean square error of estimation of selected quantities is used to evaluate the efficiency of alternative methods for fitting the two- and three-parameter log normal distributions. Monte Carlo results show that use of maximum likelihood parameter estimation dominates other methods for fitting the two-parameter log normal distribution for samples of 25 or more log normal variates. For the three-parameter log normal distribution the standard moment method performs best for log normal distributions with low skew coefficients, while use of the sample mean, variance, and quantile estimate of the lower bound performs better for highly skewed log normal distributions; use of the unbiased standard deviation and skew coefficient is almost dominated by this new fitting procedure. The performance of the better fitting procedures is also evaluated when the observations are drawn from a log Pearson type 3 distribution.
Sections: Introduction and climate; Analysis of hydrologic time series; Probability functions and their use; Linear stochastic models; Special properties and models; Statistical treatment of floods; Probability theory applied to reservoir storage; Stochastic programming methods in systems engineering; Applied decision theory. (92 refs.)
The structure of disaggregation models places severe constraints on the feasible values of the lagged covariance of generated flow vectors. A new and simple class of disaggregation models is presented which employ the Valencia-Schaake disaggregation model structure but allow the models' innovations to be serially correlated. These models can reproduce (1) the covariance matrix of the disaggregated flows, (2) their covariance with the upper level flows, and (3) reasonable approximations to the lag one covariance of the disaggregated flow vectors given the constraints imposed by a disaggregation approach. The Mejia-Rousselle disaggregation model is shown, in general, to fail to reproduce the anticipated variances and covariances of the disaggregated flows because the model and its parameter estimators are not self-consistent. The paper closes with a discussion of practical modeling considerations and of staged disaggregation procedures which reduce the size of multisite multiseason models.
The need to preserve both long-term and short-term variance and covariance properties of hydrologic processes has led to the development of numerous stochastic models during the past few years. Among these are the models first proposed by Mandelbrot and Wallis for preserving the Hurst phenomenon. All these models use sequential generation schemes, and some require the process to be stationary without seasonal variations. The model presented herein preserves seasonal variations and does not necessarily generate data sequentially, although the autoregressive models of any order are special cases of this model. If an annual series has been generated by this or any other model, this model may be used to generate a parallel series of seasonal events from the given series of annual events. It may further be used to generate monthly, weekly, daily, and hourly events.
Methods for the preservation of skewness in linear disaggregation schemes are reviewed, and disadvantages of some recent approaches are identified. Methods are suggested whereby the preservation of skewness can be achieved theoretically, although in finite samples a bias is evident which, however, is independent of the proposed methods. The application of the methods is illustrated through an example.
Pereira et al. (1984) presented a special disaggregation procedure for generating cross-correlated monthly flows at many sites while using what are essentially univariate disaggregation models for the flows at each site. This was done by using a nonparametric procedure for constructing residual innovations or noise vectors with cross-correlated components. This note considers the theoretical underpinnings of that streamflow disaggregation procedure and a proposed variation and their ability to reproduce the observed historical cross correlations among concurrent monthly flows at nine Brazilian stations.
A condensed version of the Valencia-Schaake disaggregation model is developed which describes the distribution of monthly streamflow sequences using a set of coupled univariate regression models rather than a multivariate time series formulation. The condensed model has fewer parameters and is convenient for generating flow sequences which incorporate the intrinsic variability of streamflows and the uncertainty in the parameters of the annual and monthly streamflow models. The impact of parameter uncertainty on derived relationships between reservoir capacity and reservoir performance statistics is illustrated using required reservoir capacity (calculated with the sequent peak algorithm), system reliability, and the average total shortfall. Modeled sequences describe flows in the Rappahannock River in Virginia and the Boise River in Idaho. For high-reliability systems the results show that streamflow generation procedures which ignore model parameter uncertainty can grossly underestimate reservoir system failure rates and the severity of likely shortages, even if based on a 50-year record.
The disaggregation model proposed by Schaake et al. (1972) is revised to include linkages with the past at the different levels of aggregation. This modification produces a more realistic hydrologic model.