Onthepredictionofpersistentprocessesusingtheoutputofdeterministic
models
Hristos Tyralis
*
and Demetris Koutsoyiannis
Department of Water Resources and Environmental Engineering, School of Civil
Engineering, National Technical University of Athens, Heroon Polytechniou 5, 157 80
Zographou, Greece
*
Corresponding author, montchrister@gmail.com
Abstract: A problem frequently met in engineering hydrology is the forecasting of
hydrologic variables conditional on their historical observations and the hindcasts and
forecasts of a deterministic model. In contrast, it is a common practice for
climatologists to use the output of general circulation models (GCMs) for the prediction
of climatic variables despite their inability to quantify the uncertainty of the predictions.
Here we apply the well-established Bayesian Processor of Forecasts (BPF) for
forecasting hydroclimatic variables using stochastic models through coupling them with
GCMs. We extend the BPF to cases where long-term persistence appears, using the
Hurst-Kolmogorov process (HKp, also known as fractional Gaussian noise) and we
investigate analytically its properties. We apply the framework to calculate the
distributions of the mean annual temperature and precipitation stochastic processes for
the time period 2016-2100 in the United States of America conditional on historical
observations and the respective output of GCMs.
Keywords: Bayesian Processor of Forecasts; fractional Gaussian noise; general
circulation model; Hurst-Kolmogorov; hydroclimatic prediction; hydrological statistics
1. Introduction
1.1 Uncertainty in deterministic models in hydrological science
Recently, various studies regarding the prediction of hydrologic variables based on
stochastic models have been published. To mention some of them, Koutsoyiannis et al.
(2008b) proposed a stochastic model for the prediction of the Nile flow a month ahead.
On larger time scales, Koutsoyiannis et al. (2007) proposed a stochastic framework to
calculate future climatic uncertainties conditional on historic observations, while Tyralis
and Koutsoyiannis (2014) solved this problem using a Bayesian framework. Engineering
hydrologists frequently use stochastic models for the prediction of hydrologic variables,
whereas the climatologists focus on deterministic models (General Circulation Models,
GCMs) (Koutsoyiannis et al. 2008a). While it is true that deterministic models
incorporate knowledge of the climatic mechanisms expressed through deterministic
equations, they are not appropriate to quantify the uncertainty of predictions.
Consequently, climatologists have recently started reconsidering their approach,
introducing stochastic models in climate science (Macilwain 2014), while earlier
Schneider (2002) opened a debate on how and when to assign probabilities to future
projections of the GCMs, simultaneously expressing some concerns about their absence
in specific cases.
Estimating uncertainties of forecasted geophysical variables using information from
deterministic models is a problem frequently met in hydrological science, in particular in
rainfall-runoff modelling (e.g. Montanari and Grossi 2008, Wang et al. 2009, Zhao et al.
2011, Smith et al. 2012, Pokhrel et al. 2013, Zhao et al. 2015a and others). The Bayesian
Forecasting System (BFS) and its extensions in a series of papers (Krzysztofowicz
1999b, 2001, 2002, Krzysztofowicz and Maranzano 2004) is a primary tool for
estimating uncertainties in rainfall-runoff modelling. Another interesting tool for
quantifying uncertainties is the Bayesian Processor of Forecasts (BPF) introduced in
Krzysztofowicz (1985) and compared with the BFS in Krzysztofowicz (1999a). The BPF
combines a prior distribution, which describes the natural uncertainty about the
realization of a hydrologic process, with a likelihood function, which describes the
uncertainty in categorical forecasts of that process, and outputs a posterior distribution of
the process, conditional upon the forecasts (Krzysztofowicz 1985). It is mostly used for
weather forecasting and while it is a general algorithm, which can be applied to any
distribution and dependence pattern of the process, it has been investigated solely for
independent or Markov dependent variables (e.g. Krzysztofowicz 1999a, Krzysztofowicz
and Evans 2008, Chen et al. 2013). The term “Bayesian” refers to the use of Bayes'
theorem; however, the BPF does not use full Bayesian statistics. Consequently, the
parameter uncertainty (Montanari and Koutsoyiannis 2012) is not considered in the
model.
A frequent approach for modelling mean annual geophysical time series is the
implementation of the Hurst-Kolmogorov stochastic process (HKp) (also known as
Fractional Gaussian Noise, e.g. Koutsoyiannis 2002, 2003, 2006b, Koutsoyiannis and
Montanari 2007). The investigation of big geophysical data sets has confirmed the HK
behaviour of geophysical variables in the annual time scale (Fatichi et al. 2012,
Iliopoulou et al. 2016, Markonis and Koutsoyiannis 2016). The HK process is suitable for
modelling the variability observed in geophysical time series, and not only because it
can model the HK behaviour. Specifically, while it is stationary (for the benefits of using
stationary models see Koutsoyiannis and Montanari 2014, Montanari and Koutsoyiannis
2014), it can model higher variations of the observed time series unlike the Markovian
models. Thus, it can model observed trends (Koutsoyiannis 2006a) and it does not
underestimate uncertainties of the forecasted variable (Tyralis and Koutsoyiannis
2014).
1.2 General Circulation Models
The Coupled Model Intercomparison Project Phase 5 (CMIP5) includes GCMs, which
contain historical runs, i.e. simulations of the past forced by observed atmospheric
composition changes and time-evolving land cover (Taylor et al. 2012). Each historical
run is extended with a projection of the climate driven by concentration or emission
scenarios consistent with the representative concentration pathways (RCPs, Hibbard et
al. 2007, Moss et al. 2010). The evaluation of GCMs for reproducing the past has been
studied extensively with varying results, depending on the examined variable (usually
temperature and precipitation), time scale of the variable, statistic or parameter of
interest, region and the time-period. Most studies include comparisons with
observations, re-analysis data, satellite data or all (Koutsoyiannis et al. 2008a,
Anagnostopoulos et al. 2010, Santer et al. 2013, Sheffield et al. 2013a, 2013b, Xu et al.
2013, Koutroulis et al. 2015, Nasrollahi et al. 2015, Aloysius et al. 2016, Matthes et al.
2016), visualizations (Potter et al. 2009) and even comparisons between the models
themselves (Johnson and Sharma 2009). However, Notz (2015) points out that the direct
comparison of model simulations with observations allows for limited inferences about
the deficiencies of the model.
Of practical interest, and gaining ground in the literature, is the quantification of the
uncertainties of the GCM projections (Katz 2002) whose sources are measurement
errors, variations of the geophysical processes and model structure according to Katz
(2002) or internal variability, scenario and model uncertainties (e.g. Hawkins and
Sutton 2009). Therefore, it is apparent that we cannot consider raw projections as a
product, which we can use without further processing. A significant part of the literature
has been devoted to the quantification of the uncertainties (Hawkins et al. 2014,
Woldemeskel et al. 2014, Tian et al. 2015, Zhao et al. 2015b) and their partition
(Hawkins and Sutton 2009, 2011, Yip et al. 2011, Ylhäisi et al. 2015, Hewitt et al. 2016)
usually to internal, scenario and model uncertainties. Beyond what narrowly concerns
the climate science there is a discussion on the uncertainty attributed to human
behaviour, which seems not quantifiable. Consequently, the use of scenarios is proposed
(Dessai and Hulme 2004) to consider the human behaviour with the use of RCPs. There
is also a discussion on the potential of the reduction of uncertainties (Hawkins and
Sutton 2009, 2011) while Knutti and Sedláček (2013) conclude that the progress in
terms of narrowing uncertainties is too limited. An overview of methods to evaluate
uncertainty of deterministic models, not only in the climate science, is presented in
Uusitalo et al. (2015).
The limitations in reducing uncertainties are primarily due to the internal climate
variability (Knutti and Sedláček 2013); thus methods that are based on GCMs and
simulate the local weather (e.g. Groves et al. 2008) are gaining ground in
practical applications of the GCMs. While future climate is still projected based on single
GCM outputs (Maloney et al. 2014), combining multiple models for future projections is
proposed as an alternative for skilful climate predictions (Smith et al. 2009, Chowdhury
and Sharma 2011, Strobach and Bel 2015). However, Pirtle et al. (2010) claim that the
quality of analyses based on multiple models cannot be evaluated, while Kundzewicz
and Stakhiv (2010) mention that the spread of outcomes of the GCMs is incorrectly used
as a type of uncertainty analysis. The term “bias correction” refers to another group
of methods, which are used to improve the projections of GCMs through (a posteriori)
increasing the agreement between GCM outputs and observations. However, this
procedure is artificial and is criticized for hiding the uncertainty rather than reducing it
(Ehret et al. 2012).
1.3 On the proposed framework
It seems that the arsenal of methods to improve the GCMs projections and quantify their
forecasting uncertainty (mainly use of multiple models and “bias correction”) is
inadequate. In the present study, we propose using the BPF, which is based on solid
scientific foundation, i.e. the concept of conditional stochastic independence (de Finetti
1974, Krzysztofowicz 1985). Hence, it can be an appropriate alternative. Here we apply
the BPF to quantify the uncertainties of the forecasts of mean annual temperature and
precipitation. We model the variables of interest with the HKp, which we assume is the
prior distribution that describes the natural uncertainty about the realization of the
process. The deterministic forecasts of the process are the GCMs outputs, while the BPF
outputs the posterior distribution of the process conditional on the GCMs outputs and
the realization of the process.
The posterior distribution depends initially on the fitted HKp but eventually (and in a
determinative manner) on the agreement of the GCM output with the observations. The
model uses six parameters. The three parameters of the HKp are estimated from the
observations. The degree of agreement of the GCM with the observations is determined
by three parameters, estimated when fitting the model using a common period of
observations and GCMs output. The two fittings are performed independently. As a
result, the application of the BPF avoids the artificial improvement of the model (e.g.
“bias correction” and related methods), while the natural variability of the process is
modelled using a well-established stochastic model. Furthermore, uncertainties are
quantified using a single output of the model, while the human influence is modelled
through the selection of a single scenario. Finally, we avoid narrowing the uncertainty.
Instead, uncertainties are presented as they are, i.e. without reducing them artificially.
The theoretical contribution of the present study is the application of the BPF to
processes with more complicated dependence structure compared to the Markovian
model. We apply the BPF to the HKp, which results in posterior multivariate normal
distributions. We apply the framework to the mean annual temperature and total annual
precipitation in a large area, the contiguous part of the United States of America, while
we show whether and how a purely probabilistic forecast could be improved by using a
deterministic forecast.
2. Methods
In this section, we present the BPF, the definition of the HKp and the application of the
BPF to normal stationary stochastic processes and as a case study to the HKp. In the next
sections we use the Dutch convention for notation, according to which random variables
and stochastic processes are underlined (Hemelrijk 1966).
2.1 The Bayesian Processor of Forecasts
Let $\mathbf{x}_{2(1:(n_1+n_2))}$ be a geophysical process which we wish to forecast and $\mathbf{x}_{1(1:(n_1+n_2))}$ be its forecast given by a deterministic model. The respective time periods (in discrete time, denoted through the integers $n_1$ and $n_2$) for each variable are presented in Figure 1. We assume that $\mathbf{x}_{2(1:n_1)}$ denotes the observed (historical) values of the time series, while $\mathbf{x}_{1(1:(n_1+n_2))}$ and $\mathbf{x}_{2(1:(n_1+n_2))}$ are the stochastic processes which represent in stochastic terms the deterministic model and the geophysical process respectively, defined in eqs. (1), (2).

$$\mathbf{x}_{1(1:(n_1+n_2))} := (x_{11}, \ldots, x_{1n_1}, x_{1(n_1+1)}, \ldots, x_{1(n_1+n_2)})^{\mathrm{T}} \tag{1}$$

$$\mathbf{x}_{2(1:(n_1+n_2))} := (x_{21}, \ldots, x_{2n_1}, x_{2(n_1+1)}, \ldots, x_{2(n_1+n_2)})^{\mathrm{T}} \tag{2}$$
Figure 1. Time periods for the BPF data input and output. The prediction time period refers to the distribution of $\mathbf{y}_4 \mid \mathbf{y}_3, \mathbf{x}_1$. $\mathbf{y}_3$ and $\mathbf{y}_4$ are defined in eqs. (7) and (8).
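To shorten the equations used in the subsequent sections we use the notation of eqs. (3)-(8), in which we remove the time indexes.

$$\mathbf{x}_1 := \mathbf{x}_{1(1:(n_1+n_2))} \tag{3}$$

$$\mathbf{x}_2 := \mathbf{x}_{2(1:(n_1+n_2))} \tag{4}$$

$$\mathbf{y}_1 := (x_{11}, x_{12}, \ldots, x_{1n_1})^{\mathrm{T}}: n_1 \times 1 \tag{5}$$

$$\mathbf{y}_2 := (x_{1(n_1+1)}, \ldots, x_{1(n_1+n_2)})^{\mathrm{T}}: n_2 \times 1 \tag{6}$$

$$\mathbf{y}_3 := (x_{21}, \ldots, x_{2n_1})^{\mathrm{T}}: n_1 \times 1 \tag{7}$$

$$\mathbf{y}_4 := (x_{2(n_1+1)}, \ldots, x_{2(n_1+n_2)})^{\mathrm{T}}: n_2 \times 1 \tag{8}$$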
Henceforth, $\mathbf{y}_1$ will be called deterministic hindcast. The BPF is based on the fundamental eqs. (9) and (10), which exploit the concept of conditional stochastic independence (for intuitive explanations of the BPF the reader is referred to de Finetti 1974 and Krzysztofowicz 1985):

$$f_n(x_{11}, x_{12}, \ldots, x_{1n} \mid x_{21}, x_{22}, \ldots, x_{2n}) = \prod_{i=1}^{n} f_i(x_{1i} \mid x_{21}, x_{22}, \ldots, x_{2n}), \quad \forall\, n \in \{1, \ldots, n_1+n_2\} \tag{9}$$

$$f_i(x_{1i} \mid x_{21}, x_{22}, \ldots, x_{2n}) = f_i(x_{1i} \mid x_{2i}), \quad \forall\, i, n \in \{1, \ldots, n_1+n_2\} \tag{10}$$
The deterministic forecasts are independent of each other conditional on the observations according to eq. (9) (Krzysztofowicz 1985), while each forecast depends only on the concurrent observation according to eq. (10). Eqs. (9) and (10) combined result in

$$f_n(x_{11}, x_{12}, \ldots, x_{1n} \mid x_{21}, x_{22}, \ldots, x_{2n}) = \prod_{i=1}^{n} f_i(x_{1i} \mid x_{2i}), \quad \forall\, n \in \{1, \ldots, n_1+n_2\} \tag{11}$$
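Given an observation of $\mathbf{x}_2$, the distribution of $\mathbf{x}_1$ is determined by eqs. (9) and (10). The purpose of the BPF is to find the distribution of $\mathbf{y}_4$ conditional on $\mathbf{y}_3$ and $\mathbf{x}_1$, which is given by

$$h(\mathbf{y}_4 \mid \mathbf{y}_3, \mathbf{x}_1) = f(\mathbf{x}_1 \mid \mathbf{y}_3, \mathbf{y}_4)\, g(\mathbf{y}_3, \mathbf{y}_4) / \xi(\mathbf{y}_3, \mathbf{x}_1) \tag{12}$$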
where both g( ) and ξ( ) denote (joint) distributions (more precisely, probability
densities).
As proved in Appendix A, h can be simulated using the equation:
$$h(\mathbf{y}_4 \mid \mathbf{y}_3, \mathbf{x}_1) \propto f(\mathbf{y}_2 \mid \mathbf{y}_4)\, g(\mathbf{y}_4 \mid \mathbf{y}_3) \tag{13}$$
Consequently, to simulate from h we must calculate the distributions f and g.
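The step from eq. (12) to eq. (13) can be outlined as follows. This is only an informal sketch under the factorization of eq. (11); the complete proof is the one in Appendix A. Every factor that does not depend on $\mathbf{y}_4$ is treated as a constant:

$$
\begin{aligned}
h(\mathbf{y}_4 \mid \mathbf{y}_3, \mathbf{x}_1)
&\propto f(\mathbf{x}_1 \mid \mathbf{y}_3, \mathbf{y}_4)\, g(\mathbf{y}_3, \mathbf{y}_4) \\
&= f(\mathbf{y}_1 \mid \mathbf{y}_3)\, f(\mathbf{y}_2 \mid \mathbf{y}_4)\, g(\mathbf{y}_4 \mid \mathbf{y}_3)\, g(\mathbf{y}_3) \\
&\propto f(\mathbf{y}_2 \mid \mathbf{y}_4)\, g(\mathbf{y}_4 \mid \mathbf{y}_3)
\end{aligned}
$$

The second line uses eq. (11) to split the conditional density of $\mathbf{x}_1 = (\mathbf{y}_1^{\mathrm{T}}, \mathbf{y}_2^{\mathrm{T}})^{\mathrm{T}}$ into a hindcast part and a forecast part, together with $g(\mathbf{y}_3, \mathbf{y}_4) = g(\mathbf{y}_4 \mid \mathbf{y}_3)\, g(\mathbf{y}_3)$. The factors $f(\mathbf{y}_1 \mid \mathbf{y}_3)$, $g(\mathbf{y}_3)$ and $\xi(\mathbf{y}_3, \mathbf{x}_1)$ do not involve $\mathbf{y}_4$ and are absorbed into the proportionality constant.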
2.2 A normal stationary stochastic process in the Bayesian Processor of
Forecasts
Let $\mathbf{x}_2$ denote a normal stationary stochastic process (Wei 2006, p. 10) with parameters $\mu$, $\sigma$, $\rho_{i,j}$, given by:

$$\mu := \mathrm{E}[x_{2n}], \quad \forall\, n \in \{1, \ldots, n_1+n_2\} \tag{14}$$

$$\sigma^2 := \mathrm{Var}[x_{2n}], \quad \forall\, n \in \{1, \ldots, n_1+n_2\} \tag{15}$$

$$\rho_{i,j} := \rho_{|i-j|}, \quad \forall\, i, j \in \{1, \ldots, n_1+n_2\} \tag{16}$$

The joint distribution of $\mathbf{x}_2$ is multivariate normal with constant mean $\mu$ and autocovariance matrix $\boldsymbol{\Sigma}$ given by eq. (17). Furthermore, the joint distributions of $\mathbf{y}_3$, $\mathbf{y}_4$ and all subsets of $\mathbf{x}_2$ are also multivariate normal, with the same mean and autocovariance matrix given by extracting respective parts of $\boldsymbol{\Sigma}$. The proofs of the results of Section 2.2 are given in Appendix A.

$$\boldsymbol{\Sigma} = \sigma^2 [\rho_{i,j}], \quad i, j \in \{1, \ldots, n_1+n_2\} \tag{17}$$
The autocovariance matrix Σ can be partitioned in the following way:
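$$\boldsymbol{\Sigma} = \sigma^2 \begin{bmatrix} \mathbf{R}_{11} & \mathbf{R}_{12} \\ \mathbf{R}_{21} & \mathbf{R}_{22} \end{bmatrix} \tag{18}$$

where the dimensions of the matrices are: $\mathbf{R}_{11}$: $n_1 \times n_1$, $\mathbf{R}_{21}$: $n_2 \times n_1$, $\mathbf{R}_{12}$: $n_1 \times n_2$, $\mathbf{R}_{22}$: $n_2 \times n_2$.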
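Then the distribution of $\mathbf{y}_4 \mid \mathbf{y}_3$ is given by:

$$g(\mathbf{y}_4 \mid \mathbf{y}_3) = N(\mathbf{M}_1, \boldsymbol{\Lambda}_1) \tag{19}$$

where $N$ denotes the multivariate normal distribution and

$$\mathbf{M}_1 := \boldsymbol{\mu}_2 + \mathbf{R}_{21} \mathbf{R}_{11}^{-1} (\mathbf{y}_3 - \boldsymbol{\mu}_1) \tag{20}$$

$$\boldsymbol{\Lambda}_1 := \sigma^2 (\mathbf{R}_{22} - \mathbf{R}_{21} \mathbf{R}_{11}^{-1} \mathbf{R}_{12}) \tag{21}$$

$$\boldsymbol{\mu}_1 := (\mu, \ldots, \mu)^{\mathrm{T}}, \quad n_1 \times 1 \tag{22}$$

$$\boldsymbol{\mu}_2 := (\mu, \ldots, \mu)^{\mathrm{T}}, \quad n_2 \times 1 \tag{23}$$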
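As a minimal numerical sketch of eqs. (19)-(23) (not the authors' implementation), the conditional mean and covariance can be computed with NumPy. The function name `conditional_normal` and the convention of passing the autocorrelation as a Python callable `rho` are assumptions made for illustration.

```python
import numpy as np


def conditional_normal(y3, mu, sigma, rho, n2):
    """Return (M1, Lambda1) of eqs. (20)-(21).

    y3    : observed values, length n1
    mu    : mean of the stationary process
    sigma : standard deviation of the stationary process
    rho   : callable k -> lag-k autocorrelation (hypothetical interface)
    n2    : number of future time steps
    """
    y3 = np.asarray(y3, dtype=float)
    n1 = y3.size
    n = n1 + n2
    idx = np.arange(n)
    lags = np.abs(np.subtract.outer(idx, idx))
    rho_vals = np.array([rho(k) for k in range(n)], dtype=float)
    R = rho_vals[lags]                      # correlation matrix [rho_|i-j|]
    R11, R12 = R[:n1, :n1], R[:n1, n1:]     # partition of eq. (18)
    R21, R22 = R[n1:, :n1], R[n1:, n1:]
    # Solve linear systems rather than forming the inverse of R11 explicitly.
    M1 = mu + R21 @ np.linalg.solve(R11, y3 - mu)
    Lambda1 = sigma**2 * (R22 - R21 @ np.linalg.solve(R11, R12))
    return M1, Lambda1
```

A quick sanity check is a Markov correlation such as `rho = lambda k: 0.6 ** k`, for which the conditional mean of the first future value depends only on the last observation.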
An intuitive modelling of the relationship between $x_{1n}$ and $x_{2n}$ is given by the distribution (24) (e.g. Krzysztofowicz 1999a).

$$f(x_{1n} \mid x_{2n}) = N(q_n, \sigma_e^2), \quad \forall\, n \in \{1, \ldots, n_1+n_2\} \tag{24}$$

where

$$q_n := a x_{2n} + b, \quad \forall\, n \in \{1, \ldots, n_1+n_2\} \tag{25}$$
Eq. (24) means that the deterministic forecast $x_{1n}$ can be modelled as a linear function of the observation $x_{2n}$. Thus, the level of the deterministic forecast depends on the level of the observation, while its variation is modelled by a constant parameter, regardless of the level of $x_{2n}$. Given eqs. (9), (10) and (24) we prove in Appendix A that the distribution of $\mathbf{y}_4$ conditional on $\mathbf{y}_2$ is:

$$f(\mathbf{y}_4 \mid \mathbf{y}_2) = N((\mathbf{y}_2 - \mathbf{b}_2)/a, (\sigma_e/a)^2 \mathbf{I}_{n_2}) = N(\mathbf{M}_2, \boldsymbol{\Lambda}_2) \tag{26}$$

where

$$\mathbf{M}_2 := (\mathbf{y}_2 - \mathbf{b}_2)/a \tag{27}$$

$$\boldsymbol{\Lambda}_2 := (\sigma_e/a)^2 \mathbf{I}_{n_2} \tag{28}$$

$$\mathbf{b}_2 := (b, \ldots, b)^{\mathrm{T}}, \quad n_2 \times 1 \tag{29}$$
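Combining eqs. (19) and (26) we prove in Appendix A that the joint distribution of the future process of interest, given the historical observations and the deterministic forecast, is:

$$h(\mathbf{y}_4 \mid \mathbf{y}_3, \mathbf{x}_1) = N(\mathbf{M}, \boldsymbol{\Lambda}) \tag{30}$$

where

$$\boldsymbol{\Lambda}^{-1} = (1/\sigma^2)\,(\mathbf{R}_{22} - \mathbf{R}_{21} \mathbf{R}_{11}^{-1} \mathbf{R}_{12})^{-1} + (a/\sigma_e)^2\, \mathbf{I}_{n_2} \tag{31}$$

$$\mathbf{M} = \boldsymbol{\Lambda} \boldsymbol{\Lambda}_1^{-1} \mathbf{M}_1 + (a/\sigma_e^2)\, \boldsymbol{\Lambda}\, (\mathbf{y}_2 - \mathbf{b}_2) \tag{32}$$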
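Eqs. (31) and (32) say that the posterior precision is the sum of the prior precision and the precision contributed by the deterministic forecast. The following NumPy sketch illustrates this combination; it is not the authors' code, the function name `bpf_posterior` is hypothetical, and it assumes $\sigma_e > 0$ so that the divisions are defined.

```python
import numpy as np


def bpf_posterior(M1, Lambda1, y2, a, b, sigma_e):
    """Posterior mean M and covariance Lambda of eqs. (30)-(32).

    M1, Lambda1 : mean and covariance of g(y4 | y3), eqs. (20)-(21)
    y2          : deterministic forecast for the prediction period
    a, b        : slope and intercept of the linear link, eq. (25)
    sigma_e     : standard deviation of the linear link, eq. (24); must be > 0
    """
    M1 = np.asarray(M1, dtype=float)
    Lambda1 = np.asarray(Lambda1, dtype=float)
    y2 = np.asarray(y2, dtype=float)
    n2 = M1.size
    Lambda1_inv = np.linalg.inv(Lambda1)
    precision = Lambda1_inv + (a / sigma_e) ** 2 * np.eye(n2)   # eq. (31)
    Lambda = np.linalg.inv(precision)
    # eq. (32); subtracting the scalar b broadcasts like the vector b2
    M = Lambda @ (Lambda1_inv @ M1) + (a / sigma_e**2) * Lambda @ (y2 - b)
    return M, Lambda
```

With `a = 0` the second term vanishes and the function returns the prior `M1` and `Lambda1` unchanged, which matches the behaviour described for an uninformative deterministic model.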
2.3 Estimation of the parameters of the Bayesian Processor of Forecasts
The parameters of the BPF are $\mu$, $\sigma$, $\rho_{|i-j|}$ defined in eqs. (14)-(16) and $a$, $b$, $\sigma_e^2$ defined in eqs. (24) and (25). The parameters $\mu$, $\sigma$, $\rho_{|i-j|}$ can be estimated by fitting the joint distribution of the random vector $\mathbf{y}_3$ to its observed values. In the next sections, we will use the maximum likelihood estimator. The parameters $a$, $b$, $\sigma_e$ can be estimated from the linear regression of $x_{1n}$ on $x_{2n}$ over the time period $\{1, \ldots, n_1\}$. Figure 1 depicts the fitting periods.
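The regression step can be sketched in plain Python as an ordinary least-squares fit of the deterministic hindcast on the observations. The function name `fit_linear_link` is hypothetical, and the residual standard deviation uses the $n - 2$ denominator, which is an assumption because the text does not state which variant is used.

```python
def fit_linear_link(x1, x2):
    """Least-squares fit of x1 = a * x2 + b + error over a common period.

    x1 : deterministic hindcast values
    x2 : observations for the same time steps
    Returns (a, b, sigma_e); sigma_e uses the n - 2 denominator (assumption).
    """
    n = len(x1)
    if n != len(x2) or n < 3:
        raise ValueError("need two series of equal length >= 3")
    mean1 = sum(x1) / n
    mean2 = sum(x2) / n
    sxx = sum((v - mean2) ** 2 for v in x2)
    sxy = sum((u - mean1) * (v - mean2) for u, v in zip(x1, x2))
    a = sxy / sxx                      # slope of eq. (25)
    b = mean1 - a * mean2              # intercept of eq. (25)
    rss = sum((u - (a * v + b)) ** 2 for u, v in zip(x1, x2))
    sigma_e = (rss / (n - 2)) ** 0.5   # residual standard deviation
    return a, b, sigma_e
```

Note the direction of the regression: the deterministic output is the response and the observation is the predictor, in line with eq. (24).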
2.4 Distinct fitting periods and other special cases
Sometimes the periods of fitting of the normal stationary model to estimate the parameters $\mu$, $\sigma$, $\rho_{|i-j|}$ and fitting of the linear model to estimate the parameters $a$, $b$, $\sigma_e$ do not coincide. In such cases, the parameters can be estimated in distinct periods. For example, in Figure 2, we assume that the deterministic model has already used information from the historical observations to adjust the hindcast, therefore the $\{1, \ldots, n_1\}$ period cannot be used for the linear model fitting. However, the period of observations $\{1, \ldots, n_1+n_2\}$ can be used for fitting the normal stationary model. In such cases the intersection of the deterministic forecast period and the historical observations $\{n_1+1, \ldots, n_1+n_2\}$ can be used for fitting the linear model. We present the distributions of interest and the proofs in Appendix B.
Figure 2. Time periods for the BPF data input and output and the related periods for
model fitting in the case of distinct periods.
In cases where the geophysical process is nonnegative (e.g. precipitation), the modelling
framework should be adapted to truncated variables. The need for this arises when the
coefficient of variation of the process is high (so that the probability of
getting a negative value from the normal distribution is not negligible). For such cases,
the BPF can be extended to include truncated normal distributions (Horrace 2005).
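To see when truncation matters, note that for a normal variable with positive mean the probability of a negative value depends only on the coefficient of variation, $P(x < 0) = \Phi(-1/\mathrm{CV})$. The short sketch below evaluates this with the Python standard library; the function name `prob_negative` is hypothetical and the threshold for "negligible" is left to the user.

```python
from statistics import NormalDist


def prob_negative(cv):
    """P(X < 0) for X ~ N(mu, sigma^2) with mu > 0 and cv = sigma / mu."""
    if cv <= 0:
        raise ValueError("cv must be positive")
    # Standardize: P(X < 0) = Phi((0 - mu) / sigma) = Phi(-1 / cv)
    return NormalDist().cdf(-1.0 / cv)
```

For example, a coefficient of variation of 0.5 gives a probability of about 2.3%, while 0.2 gives a probability below one in a million.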
2.5 Hurst-Kolmogorov process
The model of interest for $\mathbf{x}_2$ is the HKp, as explained in Section 1.1. The HKp is a three-parameter normal stationary stochastic process in discrete time. Its parameters $\mu$ and $\sigma$ are defined by eqs. (14) and (15), while its parameter $H$ is defined by eq. (33) (Tyralis and Koutsoyiannis 2011):

$$\rho_k := \mathrm{Corr}[x_t, x_{t+k}] = |k+1|^{2H}/2 + |k-1|^{2H}/2 - |k|^{2H}, \quad k = 0, 1, \ldots \tag{33}$$
We use the maximum likelihood estimator to estimate μ, σ and H simultaneously, as
proposed in Tyralis and Koutsoyiannis (2011) while the estimator is implemented in the
R package HKprocess (Tyralis 2016).
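The autocorrelation of eq. (33) is simple to evaluate directly. The sketch below, in plain Python, computes $\rho_k$ and the correlation matrix $[\rho_{|i-j|}]$ that appears in eq. (17); the function names are hypothetical, and this is an illustration rather than the estimator implemented in HKprocess.

```python
def hk_autocorrelation(k, H):
    """Lag-k autocorrelation of the HKp, eq. (33)."""
    k = abs(k)
    return (0.5 * abs(k + 1) ** (2 * H)
            + 0.5 * abs(k - 1) ** (2 * H)
            - abs(k) ** (2 * H))


def hk_correlation_matrix(n, H):
    """n x n matrix [rho_|i-j|] as nested lists (eq. (17) without sigma^2)."""
    rho = [hk_autocorrelation(k, H) for k in range(n)]
    return [[rho[abs(i - j)] for j in range(n)] for i in range(n)]
```

For $H = 0.5$ all autocorrelations at nonzero lags vanish, so the process reduces to white noise, while values of $H$ above 0.5 give positive, slowly decaying autocorrelations.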
2.6 Investigation for various values of the parameters of the Bayesian Processor
of Forecasts
For specific values of the parameters of the BPF and in particular of the linear model, we can understand its behaviour in extreme cases. We present the proofs of the results of Section 2.6 in Appendix C. A similar investigation is presented in Krzysztofowicz (1985).
If $\sigma_e = 0$, i.e., the deterministic model is perfect, then the BPF prediction interval has zero width, while the BPF forecast is equal to the deterministic forecast (see eqs. (C.4) and (C.5)). When $a = 0$ (see eq. (25)), the deterministic forecast does not improve the BPF forecast, which is then equal to the forecast of the stochastic process (see eq. (C.8)). This problem has already been solved in Tyralis and Koutsoyiannis (2014), who also employed a Bayesian treatment of the parameters of the stochastic process. Intuitively, high values of $\sigma_e$ will result in high uncertainties. Furthermore, negative values of $a$ will result in BPF forecasts with inverse trends compared to the deterministic forecasts.
We can also assess the quality of the deterministic model using the sufficiency characteristic defined in Krzysztofowicz (1987) and the informativeness score defined in Krzysztofowicz (1992, 2010). The sufficiency characteristic (SC) and the informativeness score (IS), defined respectively by eqs. (34) and (35), summarize the information contained in the parameters $a$ and $\sigma_e$:

$$SC := |a| / \sigma_e \tag{34}$$

$$IS := \left( \left( \frac{SC}{1/\sigma} \right)^{-2} + 1 \right)^{-1/2} \tag{35}$$

Krzysztofowicz (2010) proved that:

$$r = \mathrm{sign}(a)\, IS \tag{36}$$

where $r$ is the Pearson’s r defined by

$$r := \mathrm{Corr}[\mathbf{y}_1, \mathbf{y}_3] \tag{37}$$
For an intuitive explanation of the SC and the IS the reader is referred to Krzysztofowicz (2010). In brief, the sufficiency characteristic is interpretable as a “signal-to-noise ratio”, with $|a|$ being the measure of signal, and $\sigma_e$ being the measure of noise, while the posterior variance depends on the SC. The SC ranges in the interval $[0, \infty]$ and the IS ranges in the interval $[0, 1]$. Higher values of both parameters imply a more informative deterministic model and lower posterior variance. For the perfect deterministic model we have SC = ∞ and IS = 1, while for a completely uninformative deterministic model we have SC = 0 and IS = 0.
Normal stationary stochastic processes have finite 1st and 2nd order moments, therefore $\mu$ and $\sigma$ defined in eqs. (14) and (15) are finite. Consequently, the results presented in Appendix C can be generalized using the SC and IS parameters. For instance, $a = 0$ implies SC = 0 and IS = 0, while $\sigma_e = 0$ implies SC = ∞ and IS = 1. Furthermore, the SC and the IS can be estimated from different samples, e.g. as in Figure 2. In such a case $|r| \neq IS$ (Krzysztofowicz 2010), and the two parameters provide different information. Further investigations using simulations for special (artificially designed) cases will be presented in Section 4.1.
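Both scores are one-line computations once $a$, $\sigma_e$ and $\sigma$ are available. The sketch below uses hypothetical function names. The IS is written as $|a|\sigma / \sqrt{a^2\sigma^2 + \sigma_e^2}$, which is algebraically equivalent to eq. (35) and is chosen here only because it stays finite when $\sigma_e = 0$ or $a = 0$.

```python
import math


def sufficiency_characteristic(a, sigma_e):
    """SC := |a| / sigma_e, eq. (34); infinite for a perfect model."""
    if a == 0 and sigma_e == 0:
        raise ValueError("SC is undefined when a = 0 and sigma_e = 0")
    if sigma_e == 0:
        return math.inf
    return abs(a) / sigma_e


def informativeness_score(a, sigma_e, sigma):
    """IS of eq. (35), rewritten as |a| sigma / sqrt(a^2 sigma^2 + sigma_e^2)."""
    if a == 0 and sigma_e == 0:
        raise ValueError("IS is undefined when a = 0 and sigma_e = 0")
    signal = abs(a) * sigma
    return signal / math.hypot(signal, sigma_e)
```

As a check against the document's own numbers, the estimates reported for case 3 in Table 3 ($a = 0.99$, $\sigma_e = 1.11$, $\sigma = 1.93$) give SC ≈ 0.89 and IS ≈ 0.86, the values listed there.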
3. Data
We apply the BPF to instrumental temperature and precipitation data, which we
aggregated on the annual time scale, and to the GCM projections, which we used as
deterministic forecasts.
3.1 Temperature data
We use monthly temperature data from the unadjusted version 3 of the Global Historical
Climatology Network-Monthly (GHCN-M) temperature dataset (Lawrimore et al. 2011).
The GHCN-M includes mean monthly temperatures observed in a large number of
stations, which cover the Earth's surface. We choose the stations with latitude in the
interval [25°, 50°] and longitude in the interval [−125°, −65°] (USA region).
Furthermore, we consider all monthly values in the time period 1916-2015, while we
exclude all stations with more than 12 missing values. We impute missing values using a
seasonal Kalman filter as implemented in the R package zoo (Zeileis and Grothendieck
2005). A total of 362 stations, depicted in Figure 3, remained after this procedure.
Figure 3. Map of locations for the 362 stations with temperature data (dots). Thiessen
polygons for each station within the convex hull of the stations are also depicted.
We used the Albers equal-area conic projection to map the data onto a flat plane and
perform all subsequent calculations. However, all map visualizations in the figures of the
manuscript are presented in an equirectangular map projection. After defining the
convex hull of the 362 stations, we defined all Thiessen polygons corresponding to each
station. The Thiessen (also known as Voronoi or Dirichlet) tessellation is computed by
functions in the spatstat and deldir R packages (Baddeley et al. 2015, Turner 2016
respectively) according to the second (iterative) algorithm of Lee and Schacter (1980).
The mean annual temperature in the convex hull for the time period 1916-2015 is
computed using the Thiessen polygon method.
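The Thiessen polygon method amounts to an area-weighted mean of the station values. The sketch below assumes that the polygon areas, clipped to the convex hull, have already been computed (for example with the R packages named above); the function name `thiessen_mean` and its inputs are hypothetical.

```python
def thiessen_mean(values, areas):
    """Area-weighted spatial mean of station values.

    values : one value per station (e.g. mean annual temperature for a year)
    areas  : area of the Thiessen polygon of each station, same order
    """
    if len(values) != len(areas) or len(values) == 0:
        raise ValueError("values and areas must be non-empty and of equal length")
    total_area = sum(areas)
    if total_area <= 0:
        raise ValueError("total area must be positive")
    return sum(v * w for v, w in zip(values, areas)) / total_area
```

Applying the function once per year to the station values of that year yields the annual series for the whole region.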
3.2 Precipitation data
We use daily precipitation data from the Global Historical Climatology Network (GHCN,
Menne et al. 2012a, 2012b). The initial dataset included time series with missing or
flagged (i.e. data of low quality for reasons explained in Menne et al., 2012a) values. We
choose the stations with latitude in the interval [25°, 50°] and longitude in the interval
[−125°, −65°] (USA region). We processed the dataset according to a sequence of actions
briefly described in Appendix D. The locations of the 319 stations, which remained after
the selection procedure, are depicted in Figure 4.
Figure 4. Map of locations for the 319 stations with precipitation data (dots). Thiessen
polygons for each station within the convex hull of the stations are also depicted.
The definition of the convex hull of the stations, the methodology for the Thiessen
polygons and the calculation of the spatial average precipitation over the convex hull are
the same as those described in Section 3.1 for temperature.
3.3 GCM data
By GCM data we mean the GCM outputs for monthly temperature and precipitation from
the CMIP5 experiment, which involves more than 50 GCMs developed by 20 modelling
groups (Taylor et al. 2012). Each model comes with its own spatial grid resolution. The
models used in the present study and the variables of interest are presented in Table 1.
Each GCM in Table 1 includes a simulation of the recent past (1850-2005) (historical
run) and a future projection (2006-2100) forced by the representative concentration
pathway 6.0 (RCP6). The RCP6 experiment represents a high concentration pathway in
which stabilization of the radiative forcing at 6.0 W m$^{-2}$ occurs around 2100 and then
forcing remains fixed (Masui et al., Meinshausen et al. 2011, Fig. 4). Most of the models
have multiple ensemble members. Here we use the ensemble member r1i1p1 for each
model.
Table 1. CMIP5 models acronyms, modelling groups and institutes, and variable of interest. The model outputs were downloaded from https://pcmdi.llnl.gov/search/cmip5/.

| Model Name | Temperature | Precipitation | Modelling Centre (or Group) | Institute ID |
|---|---|---|---|---|
| GISS-E2-H | | | NASA Goddard Institute for Space Studies | NASA GISS |
| GISS-E2-R | | | NASA Goddard Institute for Space Studies | NASA GISS |
| HadGEM2-AO | | | National Institute of Meteorological Research/Korea Meteorological Administration | NIMR/KMA |
| IPSL-CM5A-LR | | | Institut Pierre-Simon Laplace | IPSL |
| IPSL-CM5A-MR | | | Institut Pierre-Simon Laplace | IPSL |
| MIROC5 | | | Atmosphere and Ocean Research Institute (The University of Tokyo), National Institute for Environmental Studies, and Japan Agency for Marine-Earth Science and Technology | MIROC |
| MIROC-ESM | | | Japan Agency for Marine-Earth Science and Technology, Atmosphere and Ocean Research Institute (The University of Tokyo), and National Institute for Environmental Studies | MIROC |
| MIROC-ESM-CHEM | | | Japan Agency for Marine-Earth Science and Technology, Atmosphere and Ocean Research Institute (The University of Tokyo), and National Institute for Environmental Studies | MIROC |
| MRI-CGCM3 | | | Meteorological Research Institute | MRI |
| NOAA GFDL GFDL-CM3 | | | NOAA Geophysical Fluid Dynamics Laboratory | NOAA GFDL |
| NOAA GFDL GFDL-ESM2G | | | NOAA Geophysical Fluid Dynamics Laboratory | NOAA GFDL |
| NOAA GFDL GFDL-ESM2M | | | NOAA Geophysical Fluid Dynamics Laboratory | NOAA GFDL |
| NorESM1-M | | | Norwegian Climate Centre | NCC |
| NorESM1-ME | | | Norwegian Climate Centre | NCC |

We extract GCM grid data corresponding to points within the respective convex hulls
defined in Figure 3 and Figure 4 and to the time period 1916-2100. Two examples of the
Thiessen polygons formed from the GCMs points within the convex hull defined in
Figure 3, are presented in Figure 5. The methodology for aggregating the temperature
and precipitation over the convex hull is presented in Section 3.1.
Figure 5. Temperature (left) and precipitation (right) Thiessen polygons for each grid
centre point (dot) for the GISS-E2-H model within the convex hull of the stations.
4. Application
In Section 4, we present the results of applying the model of Section 2.2 to controlled
simulation data (for testing) and to the data of Section 3 (for prediction).
4.1 Framework testing using simulations
We test the performance of the BPF on simulated series with $n_1 = 100$ and $n_2 = 50$. The aim is to show the performance of the BPF even in extreme conditions. In Table 2, we present the types of simulated time series to which we applied the BPF. In Table 3, we present the estimated parameters of the BPF. Additionally we present the Pearson’s r of $\mathbf{x}_{1(1:100)}$ and $\mathbf{x}_{2(1:100)}$ and the respective values of the SC and the IS. In all examined cases we use the same simulated time series $\mathbf{x}_{2(1:100)}$, therefore the parameter $\sigma$ has a common value. Thus in all cases, the SC and IS provide the same amount of information.
Table 2. Simulated time series presented in the Figures of Section 4.1.
| Case | Figure | Variable | Simulation |
|------|--------|----------|------------|
| 1 | Figure 6 (top) | $x_1$ | HKp with μ = 0, σ = 0.40, H = 0.50 with added trend = 0.01 |
|   |                | $x_2$ | HKp with μ = 5, σ = 2, H = 0.70 |
| 2 | Figure 6 (bottom) | $x_1$ | Equal to $x_2$ of case 1 in the period 1-100. Linear trend = 0.50 with starting point equal to ($x_{2(100)}$ of case 1 + 0.50) in the period 101-150 |
|   |                   | $x_2$ | Equal to $x_2$ of case 1 |
| 3 | Figure 7 (top) | $x_1$ | Equal to $x_2$ of case 1 in the period 1-100. Linear trend = 0.10 with starting point equal to ($x_{2(100)}$ of case 1 + 0.10) in the period 101-150. To the resulting time series we add an HKp with μ = 0, σ = 1, H = 0.50 |
|   |                | $x_2$ | Equal to $x_2$ of case 1 |
| 4 | Figure 7 (bottom) | $x_1$ | Equal to $x_2$ of case 1 in the period 1-100. Linear trend = 0.10 with starting point equal to ($x_{2(100)}$ of case 1 + 0.10) in the period 101-150. To the resulting time series we add an HKp with μ = 0, σ = 1, H = 0.50 and we shift it up 5 units |
|   |                   | $x_2$ | Equal to $x_2$ of case 1 |
| 5 | Figure 8 (top) | $x_1$ | HKp with μ = 5, σ = 2, H = 0.50 in the period 1-100. Linear trend = 0.10 with starting point equal to ($x_{1(100)}$ of the present case + 0.10) in the period 101-150 |
|   |                | $x_2$ | Equal to $x_2$ of case 1 |
| 6 | Figure 8 (middle) | $x_1$ | Equal to $x_1$ of case 5 in the period 1-100. Linear trend = 0.50 with starting point equal to ($x_{1(100)}$ of case 5 + 0.50) in the period 101-150 |
|   |                   | $x_2$ | Equal to $x_2$ of case 1 |
| 7 | Figure 8 (bottom) | $x_1$ | HKp with μ = 5, σ = 2, H = 0.50 in the period 1-100. Linear trend = 0.50 with starting point equal to ($x_{1(100)}$ of the present case + 0.50) in the period 101-150 |
|   |                   | $x_2$ | Equal to $x_2$ of case 1 |
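The simulated series of Table 2 can be generated along the following lines. This is a minimal sketch, not the authors' code: it assumes the standard autocorrelation of the HKp (fractional Gaussian noise), ρ_k = ½(|k+1|^{2H} − 2|k|^{2H} + |k−1|^{2H}), uses a plain Cholesky factorisation, and reads "linear trend = 0.50" as a slope of 0.50 per time step. The function names and the random seed are illustrative choices.

```python
import numpy as np

def hk_autocorrelation(lags, hurst):
    """Autocorrelation of the HKp (fractional Gaussian noise) at integer lags."""
    k = np.abs(np.asarray(lags, dtype=float))
    return 0.5 * ((k + 1) ** (2 * hurst) - 2 * k ** (2 * hurst)
                  + np.abs(k - 1) ** (2 * hurst))

def simulate_hkp(n, mu, sigma, hurst, rng):
    """Draw one HKp sample of length n via Cholesky factorisation of its correlation matrix."""
    t = np.arange(n)
    corr = hk_autocorrelation(t[:, None] - t[None, :], hurst)
    chol = np.linalg.cholesky(corr)
    return mu + sigma * chol @ rng.standard_normal(n)

rng = np.random.default_rng(1)  # arbitrary seed, for reproducibility only
n1, n2 = 100, 50

# "Observations" x2 of case 1: HKp with mu = 5, sigma = 2, H = 0.70
x2 = simulate_hkp(n1 + n2, mu=5.0, sigma=2.0, hurst=0.70, rng=rng)

# Deterministic model x1 of case 2: perfect hindcast in 1-100, then a linear
# trend starting at x2(100) + 0.50 and rising by 0.50 per step (assumed reading)
x1 = x2.copy()
x1[n1:] = x2[n1 - 1] + 0.50 * np.arange(1, n2 + 1)
```

For H = 0.50 the autocorrelation vanishes at all nonzero lags, so the same function also produces the white-noise series used for the deterministic models of cases 5 and 7.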
Table 3. Estimates of the BPF parameters defined in eqs. (14)-(16), (24), (25), (33) for
the cases of Table 2. r is defined in eq. (37) and is estimated using the sample Pearson's r
of $x_{1(1:100)}$ and $x_{2(1:100)}$. SC and IS are defined in eqs. (34) and (35) and estimated by
substituting σ, a, $σ_e$ with their estimates. Cases with higher IS have a better ranking.

| Case | Figure | μ | σ | H | a | b | $σ_e$ | r | SC | IS | Ranking |
|------|--------|---|---|---|---|---|-------|---|----|----|---------|
| 1 | Figure 6 (top) | 4.63 | 1.93 | 0.67 | −0.05 | 0.81 | 0.54 | −0.16 | 0.08 | 0.16 | 4 |
| 2 | Figure 6 (bottom) | 4.63 | 1.93 | 0.67 | 1.00 | 0.00 | 0.00 | 1.00 |  | 1.00 | 1 |
| 3 | Figure 7 (top) | 4.63 | 1.93 | 0.67 | 0.99 | 0.03 | 1.11 | 0.86 | 0.89 | 0.86 | 3 |
| 4 | Figure 7 (bottom) | 4.63 | 1.93 | 0.67 | 1.07 | 4.57 | 0.97 | 0.91 | 1.11 | 0.91 | 2 |
| 5 | Figure 8 (top) | 4.63 | 1.93 | 0.67 | −0.02 | 4.92 | 1.90 | −0.02 | 0.01 | 0.02 | 6 |
| 6 | Figure 8 (middle) | 4.63 | 1.93 | 0.67 | −0.02 | 4.92 | 1.90 | −0.02 | 0.01 | 0.02 | 7 |
| 7 | Figure 8 (bottom) | 4.63 | 1.93 | 0.67 | −0.06 | 6.41 | 1.95 | −0.06 | 0.03 | 0.06 | 5 |
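As a cross-check of the last columns of Table 3, the SC and IS can be recomputed from the tabulated a, σ and σ_e. Eqs. (34) and (35) are not reproduced in this section, so the forms below are an assumption: SC = |a|/σ_e and IS = [1 + (σ_e/(aσ))²]^(−1/2), chosen because they agree with the tabulated values to within the rounding of the inputs. Case 2 (σ_e = 0) is left out because the ratio is undefined there.

```python
import math

def sufficiency_characteristic(a, sigma_e):
    """Assumed form of the SC: |a| / sigma_e (requires sigma_e > 0)."""
    return abs(a) / sigma_e

def informativeness_score(a, sigma, sigma_e):
    """Assumed form of the IS: [1 + (sigma_e / (a * sigma))^2]^(-1/2) (requires a != 0)."""
    return 1.0 / math.sqrt(1.0 + (sigma_e / (a * sigma)) ** 2)

sigma = 1.93  # common estimate of sigma in Table 3
# case -> (a, sigma_e), copied from Table 3
rows = {1: (-0.05, 0.54), 3: (0.99, 1.11), 4: (1.07, 0.97),
        5: (-0.02, 1.90), 7: (-0.06, 1.95)}

for case, (a, sigma_e) in rows.items():
    sc = sufficiency_characteristic(a, sigma_e)
    score = informativeness_score(a, sigma, sigma_e)
    print(f"case {case}: SC = {sc:.2f}, IS = {score:.2f}")
```

Small differences from the table (for example in case 1) are expected, because a and σ_e are only given to two decimals.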
Figure 6 (top) shows the results of the application assuming: (a) $x_2$ follows a HKp, (b)
a linear deterministic forecast model and (c) the deterministic forecast is of low quality
(a is almost equal to 0 and IS is low). Pearson's r is related to a and both are slightly
negative. Therefore, the influence of the deterministic forecast on the probabilistic
forecast is negligible.
Figure 6. 95% prediction intervals produced by the BPF for the case of a time series
(green) simulated from a HKp, when the deterministic model (blue) is of low quality
(top) and perfect (bottom). The mean is equal to the estimated μ of the HKp model fitted
to the observations of the period 1-100. The BPF is fitted on the period 1-100 and
predicts for the period 101-150. The characteristics of the simulated time series are
presented in Table 2, while the estimated parameters of the BPF are shown in Table 3.
In the test application of Figure 6 (bottom) the assumptions are radically different.
Here again $x_2$ follows a HKp, but the deterministic hindcast is assumed to be perfect
(zero error and IS = 1). Furthermore, the deterministic forecast is assumed to be a huge
linear trend. Because of the perfect performance of the deterministic model in the past,
the BPF forecast fully complies with the deterministic model forecast (it is exactly equal
to it and the 95% prediction interval has zero length at each time step). One could view
this case as model testing in nonstationary conditions, because at time 100 the
deterministic model fully changes its behaviour, yielding the huge linear trend, which did
not appear before. Even though the BPF framework is founded on a fully stationary setting,
it perfectly captures the assumed nonstationary behaviour. The reason is that the
deterministic model is found to behave very well in the past; had it behaved badly, the
probabilistic forecast (BPF) would disregard the linear trend and would be similar to
that in Figure 6 (top).
In the application depicted in Figure 7 (top), the deterministic hindcast is almost
perfect (with a small, nonzero, error and a high IS). Here again (as in Figure 6 (bottom)),
the BPF forecast is strongly influenced by the deterministic forecast; however, now
some prediction intervals of nonzero size appear. In the application depicted in Figure 7
(bottom), the deterministic hindcast and forecast of Figure 7 (top) have been shifted up.
However, the BPF forecast has not changed. This shows that the BPF is invariant under
a change of the mean of the deterministic model, which is a desirable property. It means
that if the deterministic model has a systematic bias, however high, the BPF framework
automatically removes it.
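The invariance to a constant shift can be checked numerically. The sketch below assumes that a, b and σ_e are obtained by ordinary least squares of the hindcast on the observations (the estimators of eqs. (24), (25), (33) are not shown in this section), and it uses white noise and a made-up forecast purely for illustration. A constant shift of the deterministic model changes only the intercept b, so the quantity (y_2 − b)/a of eq. (A.46), through which the forecast enters the BPF, is unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed
n1, n2 = 100, 50
x2 = 5.0 + 2.0 * rng.standard_normal(n1)   # stand-in observations (white noise for brevity)
x1 = x2 + rng.standard_normal(n1)          # stand-in hindcast: observations plus noise
y2 = 6.0 + 0.1 * np.arange(1, n2 + 1)      # hypothetical deterministic forecast

def fit_linear(x1_hindcast, x2_observed):
    """OLS fit of x1 = a * x2 + b + e; returns a, b and the residual standard deviation."""
    a, b = np.polyfit(x2_observed, x1_hindcast, 1)  # slope first, then intercept
    residuals = x1_hindcast - (a * x2_observed + b)
    return a, b, residuals.std(ddof=2)

shift = 5.0  # constant bias added to the deterministic model
a0, b0, s0 = fit_linear(x1, x2)
a1, b1, s1 = fit_linear(x1 + shift, x2)

# Forecast information as it enters the BPF, cf. eq. (A.46)
m2_original = (y2 - b0) / a0
m2_shifted = ((y2 + shift) - b1) / a1
print(np.allclose(m2_original, m2_shifted))
```

Because a and σ_e are also unchanged, the posterior mean and covariance of eqs. (A.50) and (A.54) are identical for the shifted and unshifted model.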
Figure 7. 95% prediction intervals produced by the BPF for the case of a time series
(green) simulated from a HKp with μ = 5, σ = 2 and H = 0.7, when the deterministic
model (blue) is almost perfect and varies a bit around the observations (top), or is
shifted up (bottom). The mean is equal to the estimated μ of the HKp model fitted to the
observations of the period 1-100. The BPF is fitted in the period 1-100 and predicts for
the period 101-150. The characteristics of the simulated time series are presented in
Table 2, while the estimated parameters of the BPF are shown in Table 3.
In the application depicted in Figure 8 (top), the deterministic hindcast is of low
quality. Furthermore, a is slightly negative. In this case, the deterministic forecast is
increasing while the BPF is slightly decreasing, which is reasonable because of the
negative a. In Figure 8 (middle) the deterministic hindcast and observations are equal to
those of Figure 8 (top). However, Figure 8 (middle) differs from Figure 8 (top) in that
the deterministic forecast in the period {101, …, 150} increases faster resulting in a
faster decrease of the BPF forecast. In Figure 8 (bottom), a is even more negative,
resulting in an even higher decrease of the BPF forecast. Finally the IS provides a
ranking of the models (cases) in terms of their informativeness (from the highest to the
lowest) which in the examined cases is 2, 4, 3, 1, 7, 5 and 6.
Figure 8. 95% prediction intervals produced by the BPF for the case of a time series
(green) simulated from a HKp with μ = 5, σ = 2 and H = 0.7. The deterministic model
(blue) is a simulated HKp with equal parameters, but of low quality in hindcast and a
slight linear trend in forecast (top) or high trend (middle) and more negative correlation
(bottom). The mean is equal to the estimated μ of the HKp model fitted to the
observations of the period 1-100. The BPF is fitted in the period 1-100 and predicts for
the period 101-150. The characteristics of the simulated time series are presented in
Table 2, while the estimated parameters of the BPF are shown in Table 3.
Overall, all testing experiments indicate an ideal performance of the BPF framework
in all cases, even the most extreme ones and those with huge nonstationary trends. The
methodology presented complies with the simple truth of the scientific method that
model predictions for the future are taken into account insofar as the models comply with
evidence from data of the past. Also, the methodology complies with the blueprint by
Montanari and Koutsoyiannis (2012) insofar as it takes a deterministic model and
incorporates it into a stochastic framework, thus converting the deterministic into
stochastic predictions. If the deterministic model is good, the final stochastic prediction
relies heavily on it. If the model is bad, it is almost automatically discarded.
4.2 Case studies
In Section 4.2, we present the application of the BPF to the data of Section 3. We present
two variants of the BPF, which are described in Sections 2.2 (Figure 1) and 2.4 (Figure 2)
and the respective fitting and forecasting periods in Figure 9. In the case of Figure 9
(top), the GCM historical runs have already been adjusted using information from the
observations, therefore using the time period 1916-2005 would use the same
information twice. Instead, in the case of Figure 9 (bottom), the fitting of the BPF linear
model in the time period 2006-2015 is based on a forecast with the assumptions of the
RCP6 experiment regarding the emissions scenario which have not been checked. In
both cases, we used the HKp to model the observations.
Figure 9. BPF fitting and predicting time periods. The fitting period is defined as the
period of the historical run (top) or the intersection of the historical observations and
the RCP4.5 time periods (bottom). The prediction period succeeds the fitting period and
extends to the year 2100.
We present the results in Figures 10-15. In all figures the mean of the observations is
equal to the maximum likelihood estimate of μ as given in Section 2.5 for the fitting time
period. While we examined all GCMs of Table 1, we present here in detail two of them,
i.e. the GISS-E2-H and the MRI-CGCM3, along with summary information for all results.
Figure 10 shows the prediction of the mean annual temperature in the USA for the two
GCMs when the fitting time-period is 1916-2005. In the case of the GISS-E2-H model the
forecasted increase is equal to 0.8 °C while the 95% prediction interval is 1.8 °C wide. In
the case of the MRI-CGCM3 model, the forecasted increase is negligible while the 95%
prediction interval is again 1.8 °C wide. In Figure 11, the prediction intervals for the
fitting time-period 2006-2015 indicate a mean increase in the annual temperature equal
to 1.4 °C and 0.9 °C for the two models respectively, while the respective prediction
intervals are 2.0 °C wide.
Figure 10. 95% prediction intervals of the mean annual temperature in the USA
produced by the BPF for the case of Figure 9 (top). The fitting time period is 1916-2005,
while the deterministic models are ensembles from the GISS-E2-H (top) and MRI-CGCM3
(bottom) models. The mean of the observations is equal to the maximum likelihood
estimate of μ in Section 2.5 for the fitting time period.
Figure 11. 95% prediction intervals of the mean annual temperature in the USA
produced by the BPF for the case of Figure 9 (bottom). The fitting time period is 2006-
2015, while the deterministic models are ensembles from the GISS-E2-H (top) and MRI-
CGCM3 (bottom) models.
Figures 12 and 13 depict similar results for the annual precipitation in the USA. In
particular, in Figure 12, which shows the results for the fitting time-period 1916-2005,
we observe a negligible increase of the annual precipitation while the 95% prediction
intervals are 200 mm wide. In Figure 13, where the fitting time-period is 2006-2015, we
observe an insignificant increase and a mean annual increase of 120 mm for the two models
respectively, while the respective 95% prediction intervals are 220 and 140 mm wide.
Figure 12. 95% prediction intervals of the annual precipitation in the USA produced by
the BPF for the case of Figure 9 (top). The fitting time period is 1916-2005, while the
deterministic models are ensembles from the GISS-E2-H (top) and MRI-CGCM3 (bottom)
models.
Figure 13. 95% prediction intervals of the annual precipitation in the USA produced by
the BPF for the case of Figure 9 (bottom). The fitting time period is 2006-2015, while the
deterministic models are ensembles from the GISS-E2-H (top) and MRI-CGCM3 (bottom)
models.
Figures 14 and 15 display the results for all models of Table 1. In particular, they
show the forecasted mean annual temperatures and annual precipitations for both
method variants defined in Figure 9. Furthermore, the graphs include the envelopes of
all 95% prediction intervals. In Figure 14, we observe an envelope of the mean annual
temperature 5.8 °C wide when the fitting time-period is 1916-2005 and an envelope
8.8 °C wide when the fitting time-period is 2006-2015. In the former case, the
temperature increase is centred around 2.5 °C for the year 2100, while in the latter case
the mean annual change seems to be negligible; the overall shape of the graph
could be called a “Bayesian thistle”. Regarding the precipitation, we observe in Figure 15
envelopes 270 and 330 mm wide for the fitting time-periods 1916-2005 and 2006-2015
respectively. The forecasted increase in precipitation is negligible in the former case,
while it is approximately equal to 50 mm for the year 2100 in the latter case.
Figure 14. Prediction intervals of the mean annual temperature in the USA produced by
the BPF. The GCM medians correspond to all GCMs of Table 1. The prediction quantiles
are the envelopes of all 95% prediction intervals of the GCMs of Table 1 produced by the
BPF. The fitting time period is 1916-2005 (top, corresponds to Figure 9 (top)) and 2006-
2015 (bottom, corresponds to Figure 9 (bottom)).
Figure 15. Prediction intervals of the annual precipitation in the USA produced by the
BPF. The GCM medians correspond to all GCMs of Table 1. The prediction quantiles are
the envelopes of all 95% prediction intervals of the GCMs of Table 1 produced by the
BPF. The fitting time period is 1916-2005 (top, corresponds to Figure 9 (top)) and 2006-
2015 (bottom, corresponds to Figure 9 (bottom)).
5. Conclusions
The aim of this paper is to probabilistically predict the future evolution of a normal
stationary stochastic process used to model a geophysical variable conditional on
historical observations of the variable and hindcasts and forecasts of the variable
produced by a deterministic model. To this end, we apply the Bayesian Processor of
Forecasts (BPF) to the data of interest. The BPF has previously been applied to
independent variables or Markovian processes. Here, we extend its use to include any
normal stationary stochastic processes and we present an application to the special case
of the Hurst-Kolmogorov process.
We investigate the properties of the BPF and test its performance using simulated
time series. We show that the influence of the deterministic forecast increases when
there is a good fitting of the deterministic model to the historical observations. Indeed,
when this fitting is perfect, the BPF forecast is equal to the deterministic forecast. In
contrast, when this fitting is insufficient, the forecast depends on the observations and
the stochastic model and not on the deterministic model. Furthermore, even if the
stochastic model is stationary, the BPF can incorporate changes, which can be attributed
to non-stationarity.
The BPF is applied to the mean annual temperature and annual precipitation in the
time period 1916-2005 in the USA. The GCMs (the historical and the RCP6 scenarios) are
used as deterministic models. Using the estimated BPF parameters, we probabilistically
forecast the mean annual temperature and annual precipitation until the year 2100. The
results are sensitive to the choice of the fitting period between the observations and the
deterministic forecast and the choice of the GCM model. Regarding the temperature the
overall results show increasing temperature when the fitting period is the intersection
of the data time period and the historical scenario, while the temperature remains
unchanged when the fitting period is the intersection of the data time period and the
RCP6 scenario. In both cases, the envelopes of the 95% prediction intervals for each
GCM model are significantly wide (5.8 °C and 8.8 °C respectively). Regarding the
precipitation, the deterministic models had negligible effect in improving the forecast of
the stochastic model, regardless of the fitting period.
We emphasize that the estimation of the stochastic model parameters should preferably
be performed using only data that were not used in the GCM fitting/tuning, i.e. for the
period after 2006. This corresponds to the so-called split-sample technique
(Klemeš 1986), which avoids possible model overfitting on the available data and thus
artificially good performance. The applications with this variant of the methodology
showed that the uncertainty of the forecasts increases considerably and practically
results in total neglect of the GCM predictions for both temperature and precipitation.
Finally, the inclusion of the uncertainty in a fully Bayesian setting, also considering the
uncertainty of parameters, would result in even higher uncertainties of the forecasted
variables.
Funding information: The authors received no funding for this research, which was
performed for scientific curiosity.
Acknowledgement: We acknowledge two anonymous reviewers whose suggestions as
well as comments, both positive and negative, helped to improve the manuscript. We
acknowledge the World Climate Research Programme's Working Group on Coupled
Modelling, which is responsible for CMIP, and we thank the climate modelling groups
(listed in Table 1 of this paper) for producing and making available their model outputs.
For CMIP the U.S. Department of Energy's Program for Climate Model Diagnosis and
Intercomparison provides coordinating support and led development of software
infrastructure in partnership with the Global Organization for Earth System Science
Portals.
AppendixA TheBayesianProcessorof Forecastsapplied tonormalstationary
stochasticprocesses
Here we prove the results presented in Section 2.2. Overall, we prefer to use techniques
typically met in the Bayesian statistics literature, such as proportionality of the
distributions, and avoid calculating integrals. For example, Marty et al. (2015) used
these techniques when they examined the Bayesian processor of output, while
Krzysztofowicz (1985) preferred direct integration when he examined the BPF.
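Let $x_1$ and $x_2$ and their subsets $y_1$, $y_2$, $y_3$, $y_4$ be defined as follows:
$$x_{1(1:(n_1+n_2))} := (x_{11}, \ldots, x_{1n_1}, x_{1(n_1+1)}, \ldots, x_{1(n_1+n_2)})^{\mathrm T}, \quad \text{deterministic forecast} \tag{A.1}$$
$$x_{2(1:(n_1+n_2))} := (x_{21}, \ldots, x_{2n_1}, x_{2(n_1+1)}, \ldots, x_{2(n_1+n_2)})^{\mathrm T}, \quad \text{observations} \tag{A.2}$$
$$x_1 := x_{1(1:(n_1+n_2))} \tag{A.3}$$
$$x_2 := x_{2(1:(n_1+n_2))} \tag{A.4}$$
$$y_1 := (x_{11}, x_{12}, \ldots, x_{1n_1})^{\mathrm T}: \ n_1 \times 1 \tag{A.5}$$
$$y_2 := (x_{1(n_1+1)}, \ldots, x_{1(n_1+n_2)})^{\mathrm T}: \ n_2 \times 1 \tag{A.6}$$
$$y_3 := (x_{21}, \ldots, x_{2n_1})^{\mathrm T}: \ n_1 \times 1 \tag{A.7}$$
$$y_4 := (x_{2(n_1+1)}, \ldots, x_{2(n_1+n_2)})^{\mathrm T}: \ n_2 \times 1 \tag{A.8}$$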
Then the conditional independence mentioned in Section 2.1 is defined by
$$f_n(x_{11}, x_{12}, \ldots, x_{1n} \mid x_{21}, x_{22}, \ldots, x_{2n}) := \prod_{i=1}^{n} f_i(x_{1i} \mid x_{21}, x_{22}, \ldots, x_{2n}), \quad \forall\, n \in \{1, \ldots, n_1+n_2\} \tag{A.9}$$
$$f_i(x_{1i} \mid x_{21}, x_{22}, \ldots, x_{2n}) := f_i(x_{1i} \mid x_{2i}), \quad \forall\, i, n \in \{1, \ldots, n_1+n_2\} \tag{A.10}$$
which results in
$$f_n(x_{11}, x_{12}, \ldots, x_{1n} \mid x_{21}, x_{22}, \ldots, x_{2n}) = \prod_{i=1}^{n} f_i(x_{1i} \mid x_{2i}), \quad \forall\, n \in \{1, \ldots, n_1+n_2\} \tag{A.11}$$
Hence,
$$h(y_4 \mid y_3, x_1) = f(x_1 \mid y_3, y_4)\, g(y_3, y_4) / \xi(y_3, x_1) \tag{A.12}$$
$$h(y_4 \mid y_3, x_1) \propto f(x_1 \mid y_3, y_4)\, g(y_3, y_4) \tag{A.13}$$
$$h(y_4 \mid y_3, x_1) \propto f(y_1, y_2 \mid y_3, y_4)\, g(y_3, y_4) \tag{A.14}$$
$$h(y_4 \mid y_3, x_1) \propto f(y_2 \mid y_4)\, g(y_3, y_4) \tag{A.15}$$
$$h(y_4 \mid y_3, x_1) \propto f(y_2 \mid y_4)\, g(y_4 \mid y_3)\, g(y_3) \tag{A.16}$$
$$h(y_4 \mid y_3, x_1) \propto f(y_2 \mid y_4)\, g(y_4 \mid y_3) \tag{A.17}$$
Equation (A.17) proves eq. (13).
We define the parameters of the normal stationary stochastic process used to model
the observations with eqs. (A.18)-(A.28). Matrices Σ, R and their submatrices are
symmetric Toeplitz positive definite matrices (Golub and Van Loan 1996, p.193). This
facilitates their handling using the Levinson or related algorithms (e.g. Tyralis and
Koutsoyiannis 2011). Consequently,
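$$\mu := \mathrm{E}[x_{2n}], \quad \forall\, n \in \{1, \ldots, n_1+n_2\} \tag{A.18}$$
$$\sigma^2 := \mathrm{Var}[x_{2n}], \quad \forall\, n \in \{1, \ldots, n_1+n_2\} \tag{A.19}$$
$$\mu_1 := (\mu, \ldots, \mu)^{\mathrm T}, \quad n_1 \times 1 \tag{A.20}$$
$$\mu_2 := (\mu, \ldots, \mu)^{\mathrm T}, \quad n_2 \times 1 \tag{A.21}$$
$$\Sigma := \sigma^2 [\rho_{i,j}], \quad i, j \in \{1, \ldots, n_1+n_2\} \tag{A.22}$$
$$\rho_{i,j} = \rho_{|i-j|}, \quad i, j \in \{1, \ldots, n_1+n_2\} \tag{A.23}$$
$$\Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} \tag{A.24}$$
$$\Sigma_{11}: n_1 \times n_1, \quad \Sigma_{21}: n_2 \times n_1, \quad \Sigma_{12}: n_1 \times n_2, \quad \Sigma_{22}: n_2 \times n_2 \tag{A.25}$$
$$\Sigma = \sigma^2 \begin{bmatrix} R_{11} & R_{12} \\ R_{21} & R_{22} \end{bmatrix} \tag{A.26}$$
$$R_{11}: n_1 \times n_1, \quad R_{21}: n_2 \times n_1, \quad R_{12}: n_1 \times n_2, \quad R_{22}: n_2 \times n_2 \tag{A.27}$$
$$\Sigma_{11} = \sigma^2 R_{11}, \quad \Sigma_{21} = \sigma^2 R_{21}, \quad \Sigma_{12} = \sigma^2 R_{12}, \quad \Sigma_{22} = \sigma^2 R_{22} \tag{A.28}$$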
Since we model $x_2$ with a multivariate normal distribution, the distribution of $y_4$
conditional on $y_3$ is given by (Eaton 2007, p.116),
$$g(y_4 \mid y_3) = N(\mu_2 + \Sigma_{21} \Sigma_{11}^{-1} (y_3 - \mu_1),\ \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}) \tag{A.29}$$
which can be written as
$$g(y_4 \mid y_3) = N(M_1, \Lambda_1) \tag{A.30}$$
where
$$M_1 := \mu_2 + R_{21} R_{11}^{-1} (y_3 - \mu_1) \tag{A.31}$$
$$\Lambda_1 := \sigma^2 (R_{22} - R_{21} R_{11}^{-1} R_{12}) \tag{A.32}$$
and the matrix in parentheses in eq. (A.32) is the Schur complement of $R_{11}$ (Horn and Zhang 2005).
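Equations (A.31) and (A.32) translate directly into a few lines of linear algebra. The sketch below is illustrative only: it assumes the HKp autocorrelation ρ_k = ½(|k+1|^{2H} − 2|k|^{2H} + |k−1|^{2H}) for the entries of R, and it uses a general dense solver instead of the Levinson-type algorithms mentioned above. The function names are hypothetical.

```python
import numpy as np

def hk_correlation_matrix(n, hurst):
    """Toeplitz correlation matrix R of an HKp of length n (assumed fGn autocorrelation)."""
    t = np.arange(n)
    k = np.abs(np.subtract.outer(t, t)).astype(float)
    return 0.5 * ((k + 1) ** (2 * hurst) - 2 * k ** (2 * hurst)
                  + np.abs(k - 1) ** (2 * hurst))

def conditional_normal(y3, mu, sigma, corr, n1):
    """Return M1 (eq. A.31) and Lambda1 (eq. A.32) of g(y4 | y3)."""
    r11, r12 = corr[:n1, :n1], corr[:n1, n1:]
    r21, r22 = corr[n1:, :n1], corr[n1:, n1:]
    # Solve linear systems with R11 instead of forming its inverse explicitly
    m1 = mu + r21 @ np.linalg.solve(r11, y3 - mu)
    lam1 = sigma ** 2 * (r22 - r21 @ np.linalg.solve(r11, r12))
    return m1, lam1

# Illustrative call with n1 = 100, n2 = 50 and made-up observations
n1, n2 = 100, 50
corr = hk_correlation_matrix(n1 + n2, hurst=0.70)
y3 = np.full(n1, 6.0)  # hypothetical observed values
m1, lam1 = conditional_normal(y3, mu=5.0, sigma=2.0, corr=corr, n1=n1)
```

With H = 0.50 the matrix R is the identity, so the function returns M_1 = μ_2 and Λ_1 = σ²I, i.e. the observations carry no information about the future.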
If the distribution of $x_{1n}$ conditional on $x_{2n}$ is given by
$$f(x_{1n} \mid x_{2n}) := N(q_n, \sigma_e^2), \quad \forall\, n \in \{1, \ldots, n_1+n_2\} \tag{A.33}$$
then, using eq. (A.11) and the properties of the product of normal distributions
(Bromiley 2014) we find:
$$q_n := a x_{2n} + b, \quad \forall\, n \in \{1, \ldots, n_1+n_2\} \tag{A.34}$$
$$f(y_2 \mid y_4) = \prod_{n=n_1+1}^{n_1+n_2} N(q_n, \sigma_e^2) \tag{A.35}$$
$$f(y_2 \mid y_4) = N(Q, V) \tag{A.36}$$
where
$$Q := (q_{n_1+1}, \ldots, q_{n_1+n_2})^{\mathrm T}, \quad n_2 \times 1 \tag{A.37}$$
$$V := \sigma_e^2 I_{n_2}, \quad n_2 \times n_2 \tag{A.38}$$
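However, in the Bayesian setting, $y_2$ is known while the distribution of interest is that
of $y_4 \mid y_2$. Therefore, eq. (A.36) is transformed to eq. (A.45), in which $y_4$ is the random
variable and $y_2$ is a value, after some algebraic manipulations:
$$b_2 := (b, \ldots, b)^{\mathrm T}, \quad n_2 \times 1 \tag{A.39}$$
$$f(y_4 \mid y_2) \propto \exp\!\left(-\tfrac{1}{2} (y_2 - a y_4 - b_2)^{\mathrm T} V^{-1} (y_2 - a y_4 - b_2)\right) \tag{A.40}$$
$$f(y_4 \mid y_2) \propto \exp\!\left(-\frac{a^2}{2\sigma_e^2} \left(y_4 - \frac{y_2 - b_2}{a}\right)^{\mathrm T} I_{n_2}^{-1} \left(y_4 - \frac{y_2 - b_2}{a}\right)\right) \tag{A.41}$$
$$f(y_4 \mid y_2) \propto \exp\!\left(-\tfrac{1}{2} \left(y_4 - \frac{y_2 - b_2}{a}\right)^{\mathrm T} \left(\frac{a}{\sigma_e}\right)^{2} I_{n_2}^{-1} \left(y_4 - \frac{y_2 - b_2}{a}\right)\right) \tag{A.42}$$
$$f(y_4 \mid y_2) \propto \exp\!\left(-\tfrac{1}{2} \left(y_4 - \frac{y_2 - b_2}{a}\right)^{\mathrm T} \left(\left(\frac{\sigma_e}{a}\right)^{2} I_{n_2}\right)^{-1} \left(y_4 - \frac{y_2 - b_2}{a}\right)\right) \tag{A.43}$$
$$f(y_4 \mid y_2) = N\!\left((y_2 - b_2)/a,\ (\sigma_e/a)^2 I_{n_2}\right) \tag{A.44}$$
$$f(y_4 \mid y_2) = N(M_2, \Lambda_2) \tag{A.45}$$
where,
$$M_2 := (y_2 - b_2)/a \tag{A.46}$$
$$\Lambda_2 := (\sigma_e/a)^2 I_{n_2} \tag{A.47}$$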
The distribution of $y_4 \mid y_3, x_1$ in eq. (A.17) is normal, i.e.,
$$h(y_4 \mid y_3, x_1) = N(M, \Lambda) \tag{A.48}$$
because it is proportional to the product of the two normal distributions (A.30) and
(A.45) (Bromiley 2014). Its parameters are given by eqs. (A.50) and (A.54) after the
following manipulations:
$$\Lambda^{-1} := \Lambda_1^{-1} + \Lambda_2^{-1} \tag{A.49}$$
$$\Lambda^{-1} = (1/\sigma^2) (R_{22} - R_{21} R_{11}^{-1} R_{12})^{-1} + (a/\sigma_e)^2 I_{n_2} \tag{A.50}$$
$$\Lambda^{-1} M = \Lambda_1^{-1} M_1 + \Lambda_2^{-1} M_2 \tag{A.51}$$
$$M = \Lambda (\Lambda_1^{-1} M_1 + \Lambda_2^{-1} M_2) \tag{A.52}$$
$$M = \Lambda \Lambda_1^{-1} M_1 + \Lambda (a/\sigma_e)^2 ((y_2 - b_2)/a) \tag{A.53}$$
$$M = \Lambda \Lambda_1^{-1} M_1 + (a/\sigma_e^2)\, \Lambda (y_2 - b_2) \tag{A.54}$$
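A minimal numerical sketch of eqs. (A.50) and (A.54) follows. It takes M_1 and Λ_1 as given, requires σ_e > 0, and uses explicit matrix inverses for readability rather than efficiency. The function names and the example numbers are hypothetical. The interval helper assumes that the 95% limits are computed marginally for each time step from the diagonal of Λ.

```python
import numpy as np

def bpf_posterior(m1, lam1, y2, a, b, sigma_e):
    """Mean M (eq. A.54) and covariance Lambda (eq. A.50) of h(y4 | y3, x1)."""
    n2 = len(y2)
    lam1_inv = np.linalg.inv(lam1)
    precision = lam1_inv + (a / sigma_e) ** 2 * np.eye(n2)   # eq. (A.50)
    lam = np.linalg.inv(precision)
    mean = lam @ (lam1_inv @ m1) + (a / sigma_e ** 2) * (lam @ (y2 - b))  # eq. (A.54)
    return mean, lam

def prediction_interval_95(mean, lam):
    """Marginal 95% limits for each time step (normal quantile 1.959964)."""
    half_width = 1.959964 * np.sqrt(np.diag(lam))
    return mean - half_width, mean + half_width

# Made-up inputs for a three-step forecast
m1 = np.array([5.0, 5.0, 5.0])
lam1 = np.array([[4.0, 1.0, 0.5],
                 [1.0, 4.0, 1.0],
                 [0.5, 1.0, 4.0]])
y2 = np.array([7.0, 8.0, 9.0])   # hypothetical deterministic forecast

mean, lam = bpf_posterior(m1, lam1, y2, a=1.0, b=0.0, sigma_e=1.0)
lower, upper = prediction_interval_95(mean, lam)
```

Writing the mean in the form of eq. (A.54) keeps the computation well defined for a = 0, in which case the function returns M_1 and Λ_1 unchanged.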
AppendixB TheBayesianProcessorofForecastsappliedtodistinctfitting
periods
Here we repeat the procedure of Appendix A but for distinct fitting periods. The time
period $\{1, \ldots, n_1+n_2+n_3\}$ is divided in three subperiods $\{1, \ldots, n_1\}$,
$\{n_1+1, \ldots, n_1+n_2\}$, $\{n_1+n_2+1, \ldots, n_1+n_2+n_3\}$. The processes of interest are
$x_1$ and $x_2$ and their subsets $y_1$, $y_2$, $y_3$, $y_4$, $y_5$, $y_6$ defined as:
$$x_{1((n_1+1):(n_1+n_2+n_3))} := (x_{1(n_1+1)}, \ldots, x_{1(n_1+n_2)}, x_{1(n_1+n_2+1)}, \ldots, x_{1(n_1+n_2+n_3)})^{\mathrm T} \tag{B.1}$$
$$x_{2(1:(n_1+n_2+n_3))} := (x_{21}, \ldots, x_{2n_1}, x_{2(n_1+1)}, \ldots, x_{2(n_1+n_2)}, x_{2(n_1+n_2+1)}, \ldots, x_{2(n_1+n_2+n_3)})^{\mathrm T} \tag{B.2}$$
$$x_1 := x_{1((n_1+1):(n_1+n_2+n_3))} \tag{B.3}$$
$$x_2 := x_{2(1:(n_1+n_2+n_3))} \tag{B.4}$$
$$y_1 := (x_{1(n_1+1)}, \ldots, x_{1(n_1+n_2)})^{\mathrm T}: \ n_2 \times 1 \tag{B.5}$$
$$y_2 := (x_{1(n_1+n_2+1)}, \ldots, x_{1(n_1+n_2+n_3)})^{\mathrm T}: \ n_3 \times 1 \tag{B.6}$$
$$y_3 := (x_{21}, \ldots, x_{2n_1})^{\mathrm T}: \ n_1 \times 1 \tag{B.7}$$
$$y_4 := (x_{2(n_1+1)}, \ldots, x_{2(n_1+n_2)})^{\mathrm T}: \ n_2 \times 1 \tag{B.8}$$
$$y_5 := (x_{2(n_1+n_2+1)}, \ldots, x_{2(n_1+n_2+n_3)})^{\mathrm T}: \ n_3 \times 1 \tag{B.9}$$
$$y_6 := (x_{21}, \ldots, x_{2(n_1+n_2)})^{\mathrm T}: \ (n_1+n_2) \times 1 \tag{B.10}$$
Then the conditional independence mentioned in Section 2.1 is defined by
$$f_n(x_{11}, x_{12}, \ldots, x_{1n} \mid x_{21}, x_{22}, \ldots, x_{2n}) := \prod_{i=1}^{n} f_i(x_{1i} \mid x_{21}, x_{22}, \ldots, x_{2n}), \quad \forall\, n \in \{1, \ldots, n_1+n_2\} \tag{B.11}$$
$$f_i(x_{1i} \mid x_{21}, x_{22}, \ldots, x_{2n}) := f_i(x_{1i} \mid x_{2i}), \quad \forall\, i, n \in \{1, \ldots, n_1+n_2\} \tag{B.12}$$
which result in
$$f_n(x_{11}, x_{12}, \ldots, x_{1n} \mid x_{21}, x_{22}, \ldots, x_{2n}) = \prod_{i=1}^{n} f_i(x_{1i} \mid x_{2i}), \quad \forall\, n \in \{1, \ldots, n_1+n_2\} \tag{B.13}$$
Hence,
$$h(y_5 \mid y_3, y_4, x_1) = f(x_1 \mid y_3, y_4, y_5)\, g(y_3, y_4, y_5) / \xi(y_3, y_4, x_1) \tag{B.14}$$
$$h(y_5 \mid y_3, y_4, x_1) \propto f(x_1 \mid y_3, y_4, y_5)\, g(y_3, y_4, y_5) \tag{B.15}$$
$$h(y_5 \mid y_3, y_4, x_1) \propto f(y_1, y_2 \mid y_3, y_4, y_5)\, g(y_3, y_4, y_5) \tag{B.16}$$
$$h(y_5 \mid y_3, y_4, x_1) \propto f(y_2 \mid y_5)\, g(y_3, y_4, y_5) \tag{B.17}$$
$$h(y_5 \mid y_3, y_4, x_1) \propto f(y_2 \mid y_5)\, g(y_5 \mid y_3, y_4)\, g(y_3, y_4) \tag{B.18}$$
$$h(y_5 \mid y_3, y_4, x_1) \propto f(y_2 \mid y_5)\, g(y_5 \mid y_3, y_4) \tag{B.19}$$
We define the parameters of the normal stationary stochastic process used to model
the observations through the following equations:
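$$\mu := \mathrm{E}[x_{2n}], \quad \forall\, n \in \{1, \ldots, n_1+n_2+n_3\} \tag{B.20}$$
$$\sigma^2 := \mathrm{Var}[x_{2n}], \quad \forall\, n \in \{1, \ldots, n_1+n_2+n_3\} \tag{B.21}$$
$$\mu_1 := (\mu, \ldots, \mu)^{\mathrm T}, \quad (n_1+n_2) \times 1 \tag{B.22}$$
$$\mu_2 := (\mu, \ldots, \mu)^{\mathrm T}, \quad n_3 \times 1 \tag{B.23}$$
$$\Sigma := \sigma^2 [\rho_{i,j}], \quad i, j \in \{1, \ldots, n_1+n_2+n_3\} \tag{B.24}$$
$$\rho_{i,j} = \rho_{|i-j|}, \quad i, j \in \{1, \ldots, n_1+n_2+n_3\} \tag{B.25}$$
$$\Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} \tag{B.26}$$
$$\Sigma_{11}: (n_1+n_2) \times (n_1+n_2), \quad \Sigma_{21}: n_3 \times (n_1+n_2), \quad \Sigma_{12}: (n_1+n_2) \times n_3, \quad \Sigma_{22}: n_3 \times n_3 \tag{B.27}$$
$$\Sigma = \sigma^2 \begin{bmatrix} R_{11} & R_{12} \\ R_{21} & R_{22} \end{bmatrix} \tag{B.28}$$
$$R_{11}: (n_1+n_2) \times (n_1+n_2), \quad R_{21}: n_3 \times (n_1+n_2), \quad R_{12}: (n_1+n_2) \times n_3, \quad R_{22}: n_3 \times n_3 \tag{B.29}$$
$$\Sigma_{11} = \sigma^2 R_{11}, \quad \Sigma_{21} = \sigma^2 R_{21}, \quad \Sigma_{12} = \sigma^2 R_{12}, \quad \Sigma_{22} = \sigma^2 R_{22} \tag{B.30}$$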
Since we model $x_2$ with a multivariate normal distribution, the distribution of $y_5$
conditional on $y_3$ and $y_4$ is given by (Eaton 2007, p.116)
$$g(y_5 \mid y_3, y_4) = N(\mu_2 + \Sigma_{21} \Sigma_{11}^{-1} (y_6 - \mu_1),\ \Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12}) \tag{B.31}$$
which can be written as
$$g(y_5 \mid y_3, y_4) = N(M_1, \Lambda_1) \tag{B.32}$$
where
$$M_1 := \mu_2 + R_{21} R_{11}^{-1} (y_6 - \mu_1) \tag{B.33}$$
$$\Lambda_1 := \sigma^2 (R_{22} - R_{21} R_{11}^{-1} R_{12}) \tag{B.34}$$
If the distribution of $x_{1n}$ conditional on $x_{2n}$ is given by
$$f(x_{1n} \mid x_{2n}) = N(q_n, \sigma_e^2), \quad \forall\, n \in \{1, \ldots, n_1+n_2\} \tag{B.35}$$
then, using eq. (B.13) and the properties of the product of normal distributions
(Bromiley 2014), we find
$$q_n := a x_{2n} + b, \quad \forall\, n \in \{1, \ldots, n_1+n_2\} \tag{B.36}$$
$$f(y_2 \mid y_5) = \prod_{n=n_1+n_2+1}^{n_1+n_2+n_3} N(q_n, \sigma_e^2) \tag{B.37}$$
$$f(y_2 \mid y_5) = N(Q, V) \tag{B.38}$$
where
$$Q = (q_{n_1+n_2+1}, \ldots, q_{n_1+n_2+n_3})^{\mathrm T}, \quad n_3 \times 1 \tag{B.39}$$
$$V = \sigma_e^2 I_{n_3}, \quad n_3 \times n_3 \tag{B.40}$$
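However, in the Bayesian setting $y_2$ is given while the distribution of interest is that of
$y_5 \mid y_2$. Therefore, eq. (B.38) is transformed to eq. (B.47), in which $y_5$ is the random
variable and $y_2$ is a value, after some algebraic manipulations.
$$b_2 := (b, \ldots, b)^{\mathrm T}, \quad n_3 \times 1 \tag{B.41}$$
$$f(y_5 \mid y_2) \propto \exp\!\left(-\tfrac{1}{2} (y_2 - a y_5 - b_2)^{\mathrm T} V^{-1} (y_2 - a y_5 - b_2)\right) \tag{B.42}$$
$$f(y_5 \mid y_2) \propto \exp\!\left(-\frac{a^2}{2\sigma_e^2} \left(y_5 - \frac{y_2 - b_2}{a}\right)^{\mathrm T} I_{n_3}^{-1} \left(y_5 - \frac{y_2 - b_2}{a}\right)\right) \tag{B.43}$$
$$f(y_5 \mid y_2) \propto \exp\!\left(-\tfrac{1}{2} \left(y_5 - \frac{y_2 - b_2}{a}\right)^{\mathrm T} \left(\frac{a}{\sigma_e}\right)^{2} I_{n_3}^{-1} \left(y_5 - \frac{y_2 - b_2}{a}\right)\right) \tag{B.44}$$
$$f(y_5 \mid y_2) \propto \exp\!\left(-\tfrac{1}{2} \left(y_5 - \frac{y_2 - b_2}{a}\right)^{\mathrm T} \left(\left(\frac{\sigma_e}{a}\right)^{2} I_{n_3}\right)^{-1} \left(y_5 - \frac{y_2 - b_2}{a}\right)\right) \tag{B.45}$$
$$f(y_5 \mid y_2) = N\!\left((y_2 - b_2)/a,\ (\sigma_e/a)^2 I_{n_3}\right) \tag{B.46}$$
$$f(y_5 \mid y_2) = N(M_2, \Lambda_2) \tag{B.47}$$
$$M_2 := (y_2 - b_2)/a \tag{B.48}$$
$$\Lambda_2 := (\sigma_e/a)^2 I_{n_3} \tag{B.49}$$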
The distribution of $y_5 \mid y_3, y_4, x_1$ in eq. (B.19) is normal, i.e.
$$h(y_5 \mid y_3, y_4, x_1) = N(M, \Lambda) \tag{B.50}$$
because it is proportional to the product of the two normal distributions (B.32) and
(B.47) (Bromiley 2014). Its parameters are given by eqs. (B.52) and (B.56) according to
the following manipulations:
$$\Lambda^{-1} := \Lambda_1^{-1} + \Lambda_2^{-1} \tag{B.51}$$
$$\Lambda^{-1} = (1/\sigma^2) (R_{22} - R_{21} R_{11}^{-1} R_{12})^{-1} + (a/\sigma_e)^2 I_{n_3} \tag{B.52}$$
$$\Lambda^{-1} M = \Lambda_1^{-1} M_1 + \Lambda_2^{-1} M_2 \tag{B.53}$$
$$M = \Lambda (\Lambda_1^{-1} M_1 + \Lambda_2^{-1} M_2) \tag{B.54}$$
$$M = \Lambda \Lambda_1^{-1} M_1 + \Lambda (a/\sigma_e)^2 ((y_2 - b_2)/a) \tag{B.55}$$
$$M = \Lambda \Lambda_1^{-1} M_1 + (a/\sigma_e^2)\, \Lambda (y_2 - b_2) \tag{B.56}$$
AppendixC InvestigationoftheBPFforvariousvaluesofitsparameters
From eq. (A.49) we obtain
$$\Lambda = \Lambda_2 (I_{n_2} + \Lambda_1^{-1} \Lambda_2)^{-1} \tag{C.1}$$
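For completeness, the step from eq. (A.49) to eq. (C.1) can be written out as follows. This is a short worked derivation added for the reader; it uses only the symbols already defined in Appendix A and assumes that Λ_1 and Λ_2 are invertible.

```latex
\begin{align*}
\Lambda^{-1} &= \Lambda_1^{-1} + \Lambda_2^{-1}
              = \left(\Lambda_1^{-1}\Lambda_2 + I_{n_2}\right)\Lambda_2^{-1}, \\
\Lambda      &= \left[\left(I_{n_2} + \Lambda_1^{-1}\Lambda_2\right)\Lambda_2^{-1}\right]^{-1}
              = \Lambda_2\left(I_{n_2} + \Lambda_1^{-1}\Lambda_2\right)^{-1}.
\end{align*}
```

Since Λ_2 = (σ_e/a)² I_{n_2} by eq. (A.47), this form makes the limit σ_e → 0 transparent: Λ_2 tends to the zero matrix, the factor in parentheses tends to the identity, and hence Λ tends to the zero matrix.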
while from eqs. (A.52) and (C.1) we obtain
$$M = \Lambda_2 (I_{n_2} + \Lambda_1^{-1} \Lambda_2)^{-1} \Lambda_1^{-1} M_1 + \Lambda_2 (I_{n_2} + \Lambda_1^{-1} \Lambda_2)^{-1} \Lambda_2^{-1} M_2 \tag{C.2}$$
In the case where
$$\sigma_e = 0 \tag{C.3}$$
using eqs. (C.1) and (C.2), we obtain:
$$M = M_2 \tag{C.4}$$
$$\Lambda = 0_{n_2} \tag{C.5}$$
In the case where
$$a = 0 \tag{C.6}$$
we find that
$$f(y_4 \mid y_2) = \text{constant} \tag{C.7}$$
Hence, from eqs. (A.17) and (A.30) we obtain
$$h(y_4 \mid y_3, x_1) = N(M_1, \Lambda_1) \tag{C.8}$$
AppendixD Precipitationdata
Here we present the sequence of steps to aggregate the precipitation from the daily to
the annual scale. This sequence is reproduced from Tyralis et al. (2017) who use the
same dataset and procedures.
A. Flagged values were considered as missing values.
B. Months with a percentage of recorded values higher than 0.83 (i.e. with more
than 25/30 or 26/31 daily observations) are considered good, while months with a
percentage of recorded values less than 0.34 (i.e. equal or less than 10/30 and 10/31
daily observations) are considered of poor quality. The reason for the differentiation is
that we first aggregate to the monthly time scale and then to the annual time scale. Thus
even if all values in a month are missing we can fill the monthly value after the first
aggregation as described in step C.
B1. Missing values within months with observed values more than 83% are filled
using linear interpolation.
B2. All values within months with observed values less than 34% were considered
as missing.
B3. For the rest of the months the missing values were filled in using linear
interpolation and then these months were considered as missing. The reason is
explained in step D.
C. Missing months corresponding to steps B2 and B3 (the latter after the
substitution with missing values) were filled in using a seasonal Kalman filter,
implemented in the R package zoo (Zeileis and Grothendieck 2005).
D. Mean monthly values for months in which both steps B3 and C (i.e. months with
missing values more than 34% and less than 83%) were applied, were calculated with
the mean of monthly values of steps B3 and C.
E. From the mean monthly values we obtained the mean annual values.
F. Finally we discarded annual time series if one of the following constraints was
satisfied:
F1. Two or more missing years.
F2. Hurst parameter estimate $\hat{H} \geq 0.95$, mean annual rainfall $\hat{\mu} \geq 3000$ mm, standard
deviation of annual rainfall $\hat{\sigma} \geq 750$ mm, coefficient of variation of annual rainfall
$\hat{c}_v \geq 0.8$. These constraints on the estimated parameters were justified from a preliminary
analysis, which showed that higher values were outliers.
F3. Four or more years with less than 60% of observed daily values.
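The month classification of step B can be sketched as a small helper. The thresholds 0.83 and 0.34 are taken from the text above; the function name and the label "intermediate" for the remaining months are hypothetical, and the gap filling of steps B1-D (interpolation, Kalman filter) is not reproduced here.

```python
GOOD_THRESHOLD = 0.83   # fraction of recorded daily values above which a month is good
POOR_THRESHOLD = 0.34   # fraction below which a month is of poor quality

def classify_month(n_recorded, n_days):
    """Classify a month by its fraction of recorded daily values (step B)."""
    fraction = n_recorded / n_days
    if fraction > GOOD_THRESHOLD:
        return "good"          # step B1: fill gaps by linear interpolation
    if fraction < POOR_THRESHOLD:
        return "poor"          # step B2: treat the whole month as missing
    return "intermediate"      # step B3: interpolate, then also treat as missing

for n_recorded, n_days in [(25, 30), (26, 31), (25, 31), (11, 30), (10, 30), (10, 31)]:
    print(n_recorded, n_days, classify_month(n_recorded, n_days))
```

With these thresholds, 25 of 30 and 26 of 31 recorded days count as good, while 10 recorded days count as poor in both 30-day and 31-day months, in line with the fractions quoted in step B.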
References
Aloysius, N.R., Sheffield, J., Saiers, J.E., Li, H., and Wood E.F., 2016. Evaluation of historical
and future simulations of precipitation and temperature in central Africa from
CMIP5 climate models. Journal of Geophysical Research, 121 (1), 130–152.
doi:10.1002/2015JD023656
Anagnostopoulos, G.G., Koutsoyiannis, D., Christofides, A., Efstratiadis, A., and Mamassis,
N., 2010. A comparison of local and aggregated climate model outputs with
observed data. Hydrological Sciences Journal, 55 (7), 1094–1110.
doi:10.1080/02626667.2010.513518
Baddeley, A., Rubak, E., and Turner, R., 2015. Spatial Point Patterns: Methodology and
Applications with R. London: Chapman and Hall/CRC Press.
Bromiley, P., 2014. Products and Convolutions of Gaussian Probability Density Functions.
University of Manchester, Tina Memo No. 2003-003 [online]. Available from:
http://tina.wiau.man.ac.uk/docs/memos/2003-003.pdf
Chen, F., Jiao, M., and Chen, J., 2013. The meta-Gaussian Bayesian Processor of forecasts
and associated preliminary experiments. Acta Meteorologica Sinica, 27 (2), 199–
210. doi:10.1007/s13351-013-0205-9
Chowdhury, S. and Sharma, A., 2011. Global Sea Surface Temperature Forecasts Using a
Pairwise Dynamic Combination Approach. Journal of Climate, 24, 1869–1877.
doi:10.1175/2010JCLI3632.1
Dessai, S. and Hulme, M., 2004. Does climate adaptation policy need probabilities?.
Climate Policy, 4 (2), 107-128.
Eaton, M.L., 2007. Multivariate Statistics: A Vector Space Approach. Lecture Notes-
Monograph Series Volume 53. Beachwood, Ohio: Institute of Mathematical
Statistics.
Ehret, U., Zehe, E., Wulfmeyer, V., Warrach-Sagi, K., and Liebert, J., 2012. HESS Opinions
"Should we apply bias correction to global and regional climate model data?".
Hydrology and Earth System Sciences, 16, 3391–3404. doi:10.5194/hess-16-3391-
2012
Fatichi, S., Ivanov, V.Y., and Caporali, E., 2012. Investigating Interannual Variability of
Precipitation at the Global Scale: Is There a Connection with Seasonality?. Journal
of Climate, 25, 5512–5523. doi:10.1175/JCLI-D-11-00356.1
de Finetti, B., 1974. Theory of Probability. New York: Wiley.
Golub, G.H. and Van Loan, C.F., 1996. Matrix Computations. 3rd ed. Baltimore: Johns Hopkins
University Press.
Groves, D.G., Yates, D., and Tebaldi, C., 2008. Developing and applying uncertain global
climate change projections for regional water management planning. Water
Resources Research, 44 (12). doi:10.1029/2008WR006964
Hawkins, E., Anderson, B., Diffenbaugh, N., Mahlstein, I., Betts, R., Hegerl, G., Joshi, M.,
Knutti, R., McNeall, D., Solomon, S., Sutton, R., Syktus, J., and Vecchi, G., 2014.
Uncertainties in the timing of unprecedented climates. Nature, 511 (E3–E5).
doi:10.1038/nature13523
Hawkins, E. and Sutton, R., 2009. The Potential to Narrow Uncertainty in Regional
Climate Predictions. Bulletin of the American Meteorological Society, 90, 1095–
1107. doi:10.1175/2009BAMS2607.1
Hawkins, E. and Sutton, R., 2011. The potential to narrow uncertainty in projections of
regional precipitation change. Climate Dynamics, 37 (1), 407–418.
doi:10.1007/s00382-010-0810-6
Hemelrijk, J., 1966. Underlining random variables. Statistica Neerlandica, 20 (1), 1–7.
doi:10.1111/j.1467-9574.1966.tb00488.x
Hewitt, A.J., Booth, B.B.B., Jones, C.D., Robertson, E.S., Wiltshire, A.J., Sansom, P.G.,
Stephenson, D.B., and Yip, S., 2016. Sources of Uncertainty in Future Projections of the
Carbon Cycle. Journal of Climate, 29, 7203–7213. doi:10.1175/JCLI-D-16-0161.1
Hibbard, K.A., Meehl, G.A., Cox, P., and Friedlingstein, P., 2007. A Strategy for Climate
Change Stabilization Experiments. Eos, 88 (20), 217–221.
doi:10.1029/2007EO200002
Horn, R.A. and Zhang, F., 2005. Basic Properties of the Schur Complement. In: F. Zhang,
ed. The Schur Complement and Its Applications. New York: Springer US.
doi:10.1007/b105056
Horrace, W.C., 2005. Some results on the multivariate truncated normal distribution.
Journal of Multivariate Analysis, 94 (1), 209–221. doi:10.1016/j.jmva.2004.10.007
Iliopoulou, T., Papalexiou, S.M., Markonis, Y., and Koutsoyiannis, D., 2016. Revisiting
long-range dependence in annual precipitation. Journal of Hydrology.
doi:10.1016/j.jhydrol.2016.04.015
Johnson, F. and Sharma, A., 2009. Measurement of GCM Skill in Predicting Variables
Relevant for Hydroclimatological Assessments. Journal of Climate, 22, 4373–4382.
doi:10.1175/2009JCLI2681.1
Katz, R.W., 2002. Techniques for estimating uncertainty in climate change scenarios and
impact studies. Climate Research, 20, 167–185. doi:10.3354/cr020167
Klemeš, V., 1986. Operational testing of hydrological simulation models. Hydrological
Sciences Journal, 31 (1), 13–24. doi:10.1080/02626668609491024
Knutti, R. and Sedláček, J., 2013. Robustness and uncertainties in the new CMIP5 climate
model projections. Nature Climate Change, 3, 369–373. doi:10.1038/nclimate1716
Koutroulis, A.G., Grillakis, M.G., Tsanis, I.K., and Papadimitriou, L., 2015. Evaluation of
precipitation and temperature simulation performance of the CMIP3 and CMIP5
historical experiments. Climate Dynamics, 47 (5), 1881–1898.
doi:10.1007/s00382-015-2938-x
Koutsoyiannis, D., 2002. The Hurst phenomenon and fractional Gaussian noise made
easy. Hydrological Sciences Journal, 47 (4), 573–595.
doi:10.1080/02626660209492961
Koutsoyiannis, D., 2003. Climate change, the Hurst phenomenon, and hydrological
statistics. Hydrological Sciences Journal, 48 (1), 3–24.
doi:10.1623/hysj.48.1.3.43481
Koutsoyiannis, D., 2006a. A toy model of climatic variability with scaling behaviour.
Journal of Hydrology, 322 (1–4), 25–48. doi:10.1016/j.jhydrol.2005.02.030
Koutsoyiannis, D., 2006b. Nonstationarity versus scaling in hydrology. Journal of
Hydrology, 324 (1–4), 239–254. doi:10.1016/j.jhydrol.2005.09.022
Koutsoyiannis, D. and Montanari, A., 2007. Statistical analysis of hydroclimatic time
series: Uncertainty and insights. Water Resources Research, 43 (5), W05429.
doi:10.1029/2006WR005592
Koutsoyiannis, D. and Montanari, A., 2014. Negligent killing of scientific concepts: the
stationarity case. Hydrological Sciences Journal, 60 (7–8), 1174–1183.
doi:10.1080/02626667.2014.959959
Koutsoyiannis, D., Efstratiadis, A., and Georgakakos, K.P., 2007. Uncertainty Assessment
of Future Hydroclimatic Predictions: A Comparison of Probabilistic and Scenario-
Based Approaches. Journal of Hydrometeorology, 8 (3), 261–281.
doi:10.1175/JHM576.1
Koutsoyiannis, D., Efstratiadis, A., Mamassis, N., and Christofides, A., 2008a. On the
credibility of climate predictions. Hydrological Sciences Journal, 53 (4), 671–684.
doi:10.1623/hysj.53.4.671
Koutsoyiannis, D., Yao, H., and Georgakakos, A., 2008b. Medium-range flow prediction
for the Nile: a comparison of stochastic and deterministic methods. Hydrological
Sciences Journal, 53 (1), 142–164. doi:10.1623/hysj.53.1.142
Krzysztofowicz, R., 1985. Bayesian models of forecasted time series. Journal of the
American Water Resources Association, 21 (5), 805–814.
doi:10.1111/j.1752-1688.1985.tb00174.x
Krzysztofowicz, R., 1987. Markovian Forecast Processes. Journal of the American
Statistical Association, 82 (397), 31–37. doi:10.2307/2289121
Krzysztofowicz, R., 1992. Bayesian Correlation Score: A Utilitarian Measure of Forecast
Skill. Monthly Weather Review, 120, 208–220.
doi:10.1175/1520-0493(1992)120<0208:BCSAUM>2.0.CO;2
Krzysztofowicz, R., 1999a. Bayesian Forecasting via Deterministic Model. Risk Analysis,
19 (4), 739–749. doi:10.1111/j.1539-6924.1999.tb00443.x
Krzysztofowicz, R., 1999b. Bayesian theory of probabilistic forecasting via deterministic
hydrologic model. Water Resources Research, 35 (9), 2739–2750.
doi:10.1029/1999WR900099
Krzysztofowicz, R., 2001. Integrator of uncertainties for probabilistic river stage
forecasting: precipitation-dependent model. Journal of Hydrology, 249 (1–4), 69–
85. doi:10.1016/S0022-1694(01)00413-9
Krzysztofowicz, R., 2002. Bayesian system for probabilistic river stage forecasting.
Journal of Hydrology, 268 (1–4), 16–40. doi:10.1016/S0022-1694(02)00106-3
Krzysztofowicz, R. and Evans, W.B., 2008. The role of climatic autocorrelation in
probabilistic forecasting. Monthly Weather Review, 136 (12), 4572–4592.
doi:10.1175/2008MWR2375.1
Krzysztofowicz, R., 2010. Decision criteria, data fusion and prediction calibration: a
Bayesian approach. Hydrological Sciences Journal, 55 (6), 1033–1050.
doi:10.1080/02626667.2010.505894
Krzysztofowicz, R. and Maranzano, C.J., 2004. Bayesian system for probabilistic stage
transition forecasting. Journal of Hydrology, 299 (1–2), 15–44.
doi:10.1016/j.jhydrol.2004.02.013
Kundzewicz, Z.W. and Stakhiv, E.Z., 2010. Are climate models “ready for prime time” in
water resources management applications, or is more research needed?.
Hydrological Sciences Journal, 55 (7), 1085–1089.
doi:10.1080/02626667.2010.513211
Lawrimore, J.H., Menne, M.J., Gleason, B.E., Williams, C.N., Wuertz, D.B., Vose, R.S., and
Rennie, J., 2011. An overview of the Global Historical Climatology Network
monthly mean temperature data set, version 3. Journal of Geophysical Research,
116, D19121. doi:10.1029/2011JD016187
Lee, D.T. and Schachter, B.J., 1980. Two algorithms for constructing a Delaunay
triangulation. International Journal of Computer & Information Sciences, 9 (3), 219–
242. doi:10.1007/BF00977785
Macilwain, C., 2014. A touch of the random. Science, 344 (6189), 1221–1223.
doi:10.1126/science.344.6189.1221
Maloney, E.D., Camargo, S.J., Chang, E., Colle, B., Fu, R., Geil, K.L., Hu, Q., Jiang, X., Johnson,
N., Karnauskas, K.B., Kinter, J., Kirtman, B., Kumar, S., Langenbrunner, B.,
Lombardo, K., Long, L.N., Mariotti, A., Meyerson, J.E., Mo, K.C., Neelin, J.D., Pan, Z.,
Seager, R., Serra, Y., Seth, A., Sheffield, J., Stroeve, J., Thibeault, J., Xie, S.P., Wang, C.,
Wyman, B., and Zhao, M., 2014. North American Climate in CMIP5 Experiments:
Part III: Assessment of Twenty-First-Century Projections. Journal of Climate, 27,
2230–2270. doi:10.1175/JCLI-D-13-00273.1
Markonis, Y. and Koutsoyiannis, D., 2016. Scale-dependence of persistence in
precipitation records. Nature Climate Change, 6, 399–401.
doi:10.1038/nclimate2894
Marty, R., Fortin, V., Kuswanto, H., Favre, A.C., and Parent, A., 2015. Combining the
Bayesian processor of output with Bayesian model averaging for reliable ensemble
forecasting. Journal of the Royal Statistical Society C, 64 (1), 75–92.
doi:10.1111/rssc.12062
Masui, T., Matsumoto, K., Hijioka, Y., Kinoshita, T., Nozawa, T., Ishiwatari, S., Kato, E.,
Shukla, P.R., Yamagata, Y., and Kainuma, M., 2011. An emission pathway for
stabilization at 6 Wm⁻² radiative forcing. Climatic Change, 109, 59–76.
doi:10.1007/s10584-011-0150-5
Matthes, J.H., Goring, S., Williams, J.W., and Dietze, M.C., 2016. Benchmarking historical
CMIP5 plant functional types across the Upper Midwest and Northeastern United
States. Journal of Geophysical Research, 121 (2), 523–535.
doi:10.1002/2015JG003175
Meinshausen, M., Smith, S.J., Calvin, K., Daniel, J.S., Kainuma, M.L.T., Lamarque, J.F.,
Matsumoto, K., Montzka, S.A., Raper, S.C.B., Riahi, K., Thomson, A., Velders, G.J.M.,
and van Vuuren, D.P.P., 2011. The RCP greenhouse gas concentrations and their
extensions from 1765 to 2300. Climatic Change, 109, 213–241.
doi:10.1007/s10584-011-0156-z
Menne, M.J., Durre, I., Korzeniewski, B., McNeal, S., Thomas, K., Yin, X., Anthony, S., Ray,
R., Vose, R.S., Gleason, B.E., and Houston, T.G., 2012a. Global Historical Climatology
Network - Daily (GHCN-Daily), Version 3.22. NOAA National Climatic Data Center.
doi:10.7289/V5D21VHZ [accessed: 2 September]
Menne, M.J., Durre, I., Vose, R.S., Gleason, B.E., and Houston, T.G., 2012b. An overview of
the Global Historical Climatology Network-Daily Database. Journal of Atmospheric
and Oceanic Technology, 29, 897–910. doi:10.1175/JTECH-D-11-00103.1
Montanari, A. and Grossi, G., 2008. Estimating the uncertainty of hydrological forecasts:
A statistical approach. Water Resources Research, 44 (12), W00B08.
doi:10.1029/2008WR006897
Montanari, A. and Koutsoyiannis, D., 2012. A blueprint for process-based modelling of
uncertain hydrological systems. Water Resources Research, 48 (9).
doi:10.1029/2011WR011412
Montanari, A. and Koutsoyiannis, D., 2014. Modeling and mitigating natural hazards:
Stationarity is immortal!. Water Resources Research, 50, 9748–9756.
doi:10.1002/2014WR016092
Moss, R.H., Edmonds, J.A., Hibbard, K.A., Manning, M.R., Rose, S.K., van Vuuren, D.P.,
Carter, T.R., Emori, S., Kainuma, M., Kram, T., Meehl, G.A., Mitchell, J.F.B.,
Nakicenovic, N., Riahi, K., Smith, S.J., Stouffer, R.J., Thomson, A.T., Weyant, J.P., and
Wilbanks, T.J., 2010. The next generation of scenarios for climate change research
and assessment. Nature, 463, 747–756. doi:10.1038/nature08823
Nasrollahi, N., AghaKouchak, A., Cheng, L., Damberg, L., Phillips, T.J., Miao, C., Hsu, K., and
Sorooshian, S., 2015. How well do CMIP5 climate simulations replicate historical
trends and patterns of meteorological droughts?. Water Resources Research, 51 (4),
2847–2864. doi:10.1002/2014WR016318
Notz, D., 2015. How well must climate models agree with observations?. Philosophical
Transactions of the Royal Society A, 373 (2052). doi:10.1098/rsta.2014.0164
Pirtle, Z., Meyer, R., and Hamilton, A., 2010. What does it mean when climate models
agree? A case for assessing independence among general circulation models.
Environmental Science & Policy, 13 (5), 351–361. doi:10.1016/j.envsci.2010.04.004
Pokhrel, P., Robertson, D.E., and Wang, Q.J., 2013. A Bayesian joint probability
post-processor for reducing errors and quantifying uncertainty in monthly streamflow
predictions. Hydrology and Earth System Sciences, 17, 795–804.
doi:10.5194/hess-17-795-2013
Potter, K., Wilson, A., Bremer, P.T., Williams, D., Doutriaux, C., Pascucci, V., and Johnson, C.,
2009. Visualization of uncertainty and ensemble data: Exploration of climate
modeling and weather forecast data with integrated ViSUS-CDAT systems. Journal
of Physics: Conference Series, 180, 012089.
doi:10.1088/1742-6596/180/1/012089
Santer, B.D., Painter, J.F., Mears, C.A., Doutriaux, C., Caldwell, P., Arblaster, J.M., Cameron-
Smith, P.J., Gillett, N.P., Gleckler, P.J., Lanzante, J., Perlwitz, J., Solomon, S., Stott, P.A.,
Taylor, K.E., Terray, L., Thorne, P.W., Wehner, M.F., Wentz, F.J., Wigley, T.M.L.,
Wilcox, L.J., and Zou, C.Z., 2013. Identifying human influences on atmospheric
temperature. Proceedings of the National Academy of Sciences of the United States of
America, 110 (1), 26–33. doi:10.1073/pnas.1210514109
Sheffield, J., Barrett, A.P., Colle, B., Fernando, D.N., Fu, R., Geil, K.L., Hu, Q., Kinter, J.,
Kumar, S., Langenbrunner, B., Lombardo, K., Long, L.N., Maloney, E., Mariotti, A.,
Meyerson, J.E., Mo, K.C., Neelin, J.D., Nigam, S., Pan, Z., Ren, T., Ruiz-Barradas, A.,
Serra, Y.L., Seth, A., Thibeault, J.M., Stroeve, J.C., Yang, Z., and Yin, L., 2013a. North
American Climate in CMIP5 Experiments. Part I: Evaluation of Historical
Simulations of Continental and Regional Climatology. Journal of Climate, 26, 9209–
9245. doi:10.1175/JCLI-D-12-00592.1
Sheffield, J., Camargo, S.J., Fu, R., Hu, Q., Jiang, X., Johnson, N., Karnauskas, K.B., Kim, S.T.,
Kinter, J., Kumar, S., Langenbrunner, B., Maloney, E., Mariotti, A., Meyerson, J.E.,
Neelin, J.D., Nigam, S., Pan, Z., Ruiz-Barradas, A., Seager, R., Serra, Y.L., Sun, D.Z.,
Wang, C., Xie, S.P., Yu, J.Y., Zhang, T., and Zhao, M., 2013b. North American Climate
in CMIP5 Experiments. Part II: Evaluation of Historical Simulations of
Intraseasonal to Decadal Variability. Journal of Climate, 26, 9247–9290.
doi:10.1175/JCLI-D-12-00593.1
Schneider, S.H., 2002. Can we Estimate the Likelihood of Climatic Changes at 2100?.
Climatic Change, 52 (4), 441–451. doi:10.1023/A:1014276210717
Smith, P.J., Beven, K.J., Weerts, A.H., and Leedal, D., 2012. Adaptive correction of
deterministic models to produce probabilistic forecasts. Hydrology and Earth
System Sciences, 16, 2783–2799. doi:10.5194/hess-16-2783-2012
Smith, R.L., Tebaldi, C., Nychka, D., and Mearns, L.O., 2009. Bayesian Modeling of
Uncertainty in Ensembles of Climate Models. Journal of the American Statistical
Association, 104 (485), 97–116. doi:10.1198/jasa.2009.0007
Strobach, E. and Bel, G., 2015. Improvement of climate predictions and reduction of their
uncertainties using learning algorithms. Atmospheric Chemistry and Physics, 15,
8631–8641. doi:10.5194/acp-15-8631-2015
Taylor, K.E., Stouffer, R.J., and Meehl, G.A., 2012. An Overview of CMIP5 and the
Experiment Design. Bulletin of the American Meteorological Society, 93, 485–498.
doi:10.1175/BAMS-D-11-00094.1
Tian, D., Guo, Y., and Dong, W., 2015. Future changes and uncertainties in temperature
and precipitation over China based on CMIP5 models. Advances in Atmospheric
Sciences, 32 (4), 487–496. doi:10.1007/s00376-014-4102-7
Turner, R., 2016. deldir: Delaunay Triangulation and Dirichlet (Voronoi) Tessellation. R
package version 0.1-12. https://CRAN.R-project.org/package=deldir
Tyralis, H., 2016. HKprocess: Hurst-Kolmogorov Process. R package version 0.0-2.
https://CRAN.R-project.org/package=HKprocess
Tyralis, H. and Koutsoyiannis, D., 2011. Simultaneous estimation of the parameters of
the Hurst-Kolmogorov stochastic process. Stochastic Environmental Research &
Risk Assessment, 25 (1), 21–33. doi:10.1007/s00477-010-0408-x
Tyralis, H. and Koutsoyiannis, D., 2014. A Bayesian statistical model for deriving the
predictive distribution of hydroclimatic variables. Climate Dynamics, 42 (11–12),
2867–2883. doi:10.1007/s00382-013-1804-y
Tyralis, H., Dimitriadis, P., Koutsoyiannis, D., O’Connell, P.E., Tzouka, K., and Iliopoulou,
T., 2017. On the long-term persistence properties of annual precipitation using a
global network of instrumental measurements. Advances in Water Resources. In
review.
Uusitalo, L., Lehikoinen, A., Helle, I., and Myrberg, K., 2015. An overview of methods to
evaluate uncertainty of deterministic models in decision support. Environmental
Modelling & Software, 63, 24–31. doi:10.1016/j.envsoft.2014.09.017
Wang, Q.J., Robertson, D.E., and Chiew, F.H.S., 2009. A Bayesian joint probability
modeling approach for seasonal forecasting of streamflows at multiple sites. Water
Resources Research, 45 (5), W05407. doi:10.1029/2008WR007355
Wei, W.W.S., 2006. Time Series Analysis, Univariate and Multivariate Methods. 2nd
edition. Pearson Addison Wesley.
Woldemeskel, F.M., Sharma, A., Sivakumar, B., and Mehrotra, R., 2014. A framework to
quantify GCM uncertainties for use in impact assessment studies. Journal of
Hydrology, 519 (Part B), 1453–1465. doi:10.1016/j.jhydrol.2014.09.025
Xu, J., Powell Jr, A.M., and Zhao, L., 2013. Intercomparison of temperature trends in IPCC
CMIP5 simulations with observations, reanalyses and CMIP3 models. Geoscientific
Model Development, 6, 1705–1714. doi:10.5194/gmd-6-1705-2013
Yip, S., Ferro, C.A.T., Stephenson, D.B., and Hawkins, E., 2011. A Simple, Coherent
Framework for Partitioning Uncertainty in Climate Predictions. Journal of Climate,
24, 4634–4643. doi:10.1175/2011JCLI4085.1
Ylhäisi, J.S., Garrè, L., Daron, J., and Räisänen, J., 2015. Quantifying sources of climate
uncertainty to inform risk analysis for climate change decision-making. Local
Environment, 20 (7), 811–835. doi:10.1080/13549839.2013.874987
Zeileis, A. and Grothendieck, G., 2005. zoo: S3 Infrastructure for Regular and Irregular
Time Series. Journal of Statistical Software, 14 (6), 1–27. doi:10.18637/jss.v014.i06
Zhao, L., Duan, Q., Schaake, J., Ye, A., and Xia, J., 2011. A hydrologic post-processor for
ensemble streamflow predictions. Advances in Geosciences, 29, 51–59.
doi:10.5194/adgeo-29-51-2011
Zhao, T., Wang, Q.J., Bennett, J.C., Robertson, D.E., Shao, Q., and Zhao, J., 2015a.
Quantifying predictive uncertainty of streamflow forecasts based on a Bayesian
joint probability model. Journal of Hydrology, 528, 329–340.
doi:10.1016/j.jhydrol.2015.06.043
Zhao, L., Xu, J., Powell Jr, A.M., and Jiang, Z., 2015b. Uncertainties of the global-to-regional
temperature and precipitation simulations in CMIP5 models for past and future
100 years. Theoretical and Applied Climatology, 122 (1), 259–270.
doi:10.1007/s00704-014-1293-x
... However, Strabo himself uses the term climate with a meaning close to the modern 71 one. Furthermore Strabo, defined the five climatic zones, one torrid, two temperate and two 72 frigid, as we use them to date (see also Appendix A). 77 founder of trigonometry and discoverer of the precession of the equinoxes, depicted in the back 78 facet of a coin of the Roman period. (Image sources: [3]). ...
... However, as a result of the "cart before the horse" approach, 590 which confuses or reverses roles and causality directions, those inputs are not credible. 591 As shown in a series of publications, the climate model outputs are irrelevant to reality 592 and thus not hydrologically useful for all time scales, from sub-annual to climatic, and for 593 a variety of spatial scales, local [73][74], sub-continental [75][76][77], continental and global [52]. 594 6. ...
Preprint
Full-text available
We revisit the notion of climate, along with its historical evolution, tracing the origin of the modern concerns about climate. The notion (and the scientific term) of climate has been established during the Greek antiquity in a geographical context and it acquired its statistical content (average weather) in modern times, after meteorological measurements had become common. Yet the modern definitions of climate are seriously affected by the wrong perception of the previous two centuries that climate should regularly be constant, unless an external agent acted. Therefore, we attempt to give a more rigorous definition of climate, consistent with the modern body of stochastics. We illustrate the definition by real-world data, which also exemplify the large climatic variability. Given this variability, the term “climate change” turns out to be scientifically unjustified. Specifically, it is a pleonasm as climate, like weather, has been ever-changing. Indeed, a historical investigation reveals that the aim in using that term is not scientific but political. Within the political aims, water issues have been greatly promoted by projecting future catastrophes while reversing the true roles and causality directions. For this reason, we provide arguments that water is the main element that drives climate and not the opposite.
... 53 In his famous book Meteorologica he describes the climates on Earth in connection with 54 latitude but he uses a different term, crasis (κρᾶσις), literally meaning mixing, blending of 55 things which form a compound, temperament. 56 The term climate (κλίμα, plural κλίματα) was coined as a geographical term by the 57 astronomer Hipparchus (Figure 1) in his Commentary on Aratus (Ἱππάρχου τῶν Ἀράτου 58 καὶ Εὐδόξου φαινομένων ἐξηγήσεως [2]). Hipparchus is also known in climatology for his 59 discovery and calculation of precession of the equinoxes (μετάπτωσις ἰσημεριών) by study- 60 ing measurements on several stars. ...
... However, as a result 485 of the "cart before the horse" approach, which confuses or reverses roles and causality 486 directions, those inputs are not credible. As shown in a series of publications, the climate 487 model outputs are irrelevant to reality and thus not hydrologically useful for all time 488 scales, from sub-annual to climatic, and for a variety of spatial scales, local [52][53], sub-489 continental [54][55][56], continental and global [36]. ...
Preprint
Full-text available
We revisit the notion of climate, along with its historical evolution, tracing the origin of the modern concerns about climate. The notion (and the scientific term) of climate has been established during the Greek antiquity in a geographical context and it acquired its statistical content (average weather) in modern times, after meteorological measurements had become common. Yet the modern definitions of climate are seriously affected by the wrong perception of the previous two centuries that climate should regularly be constant, unless an external agent acted. Therefore, we attempt to give a more rigorous definition of climate, consistent with the modern body of stochastics. We illustrate the definition by real-world data, which also exemplify the large climatic variability. Given this variability, the term “climate change” turns out to be scientifically unjustified. Specifi-cally, it is a pleonasm as climate, like weather, has been ever changing. Indeed, a historical inves-tigation reveals that the aim in using that term is not scientific but political. Within the political aims, water issues have been greatly promoted by projecting future catastrophes while reversing the true roles and causality directions. For this reason, we provide arguments that water is the main element that drives climate and not the opposite. (https://www.preprints.org/manuscript/202102.0180)
... For the above reasons, methods for extracting useful technical information from climate models are getting increasing attention. In this respect, Tyralis and Koutsoyiannis [11] have developed a Bayesian methodology for extracting such information and providing a stochastic framework of future climate based on the observations on the one hand and conditional on the climate model outputs on the other hand. ...
Preprint
Full-text available
Bluecat is a recently proposed methodology to upgrade a deterministic model (D-model) into stochastic (S-model), based on the hypothesis that the information contained in a time series of observations and the concurrent predictions by the D-model is sufficient to support this upgrade. Prominent characteristics of the methodology are its simplicity and transparency, which allow easy use in practical applications, without sophisticated computational means. Here we utilize the Bluecat methodology and expand it in order to be combined with climatic model outputs, which often require extrapolation out of the range of values covered by observations. We apply the expanded methodology to the precipitation and temperature processes in a large area, namely the entire territory of Italy. The results showcase the appropriateness of the method for hydroclimatic studies, as regards the assessment of the performance of the climatic projections, as well as their stochastic conversion with simultaneous bias correction and uncertainty quantification.
... For the above reasons, methods for extracting useful technical information from climate models are attracting increasing attention. In this respect, Tyralis and Koutsoyiannis [11] have developed a Bayesian methodology for extracting such information and providing a stochastic framework of the future climate based on observations, on the one hand, and conditional on climate model outputs, on the other hand. ...
Article
Full-text available
Bluecat is a recently proposed methodology to upgrade a deterministic model (D-model) into a stochastic one (S-model), based on the hypothesis that the information contained in a time series of observations and the concurrent predictions made by the D-model is sufficient to support this upgrade. The prominent characteristics of the methodology are its simplicity and transparency, which allow its easy use in practical applications, without sophisticated computational means. In this paper, we utilize the Bluecat methodology and expand it in order to be combined with climate model outputs, which often require extrapolation out of the range of values covered by observations. We apply the expanded methodology to the precipitation and temperature processes in a large area, namely the entire territory of Italy. The results showcase the appropriateness of the method for hydroclimatic studies, as regards the assessment of the performance of the climate projections, as well as their stochastic conversion with simultaneous bias correction and uncertainty quantification.
... Tyralis and Koutsoyiannis (2017) ...
Presentation
Full-text available
In the frame of the 2021 World Environmental & Water Resources Congress, Virtual Online, organized by the American Society of Civil Engineers, the Session 5-4 was a Panel Session entitled "Advancing New Methods for the Treatment of Climate Change and Extreme Events". The following scientists constituted the panel, while the audience made comments and remarks. (1) Moderator: Vijay P. Singh, Ph.D., D. Sc., P.E., P.H., Hon. Diplomate WRE – Texas A&M University. (2) Panelist: Michael L. Anderson, Ph. D., P.E. – Department of Water Resources. (3) Panelist: Chandra S. Pathak, Ph.D., P.E., F.ASCE – Headquarters, US Army Corps of Engineers. (4) Panelist: Hemant Chowdhary, PHD, AMASCE – AIR Worldwide. (5) Panelist: Subimal Ghosh, Ph.D. – Indian Institute of Technology Bombay. (6) Panelist: Demetris Koutsoyiannis – National Technical University of Athens. (7) Panelist: Edward McBean, D.WRE – University of Guleph. (8) Panelist: Ashish Sharma, Ph.D. – University of New South Wales. (9) Panelist: Richard M. Vogel, Ph.D., Dist.M.ASCE – Tufts University.
... A rainfall-runoff model was calibrated for the catchment using 60 years of discharge records using the GLUE methodology. Perhaps more importantly, very few have done so while allowing for the potential for persistent stochastic behaviour (though see Tyralis & Koutsoyiannis, 2017). Most authors of such studies will make the argument that the outcomes are only potential scenarios, and that the projections of climate models are the best indication of future boundary conditions that are available. ...
Article
Full-text available
This paper provides a historical review and critique of stochastic generating models for hydrological observables, from early generation of monthly discharge series, through flood frequency estimation by continuous simulation, to current weather generators. There are a number of issues that arise in such models, from uncertainties in the observational data on which such models must be based, to the potential persistence effects in hydroclimatic systems, the proper representation of tail behaviour in the underlying distributions, and the interpretation of future scenarios. This article is protected by copyright. All rights reserved.
... However, as a result of the "cart before the horse" approach, which confuses or reverses roles and causality directions, those inputs are not credible. As shown in a series of publications, the climate model outputs are irrelevant to reality and thus not hydrologically useful for all time-scales, from sub-annual to climatic, and for a variety of spatial scales, local [80,81], subcontinental [82][83][84], continental and global [59]. ...
Article
Full-text available
We revisit the notion of climate, along with its historical evolution, tracing the origin of the modern concerns about climate. The notion (and the scientific term) of climate was established during the Greek antiquity in a geographical context and it acquired its statistical content (average weather) in modern times after meteorological measurements had become common. Yet the modern definitions of climate are seriously affected by the wrong perception of the previous two centuries that climate should regularly be constant, unless an external agent acts upon it. Therefore, we attempt to give a more rigorous definition of climate, consistent with the modern body of stochastics. We illustrate the definition by real-world data, which also exemplify the large climatic variability. Given this variability, the term “climate change” turns out to be scientifically unjustified. Specifically, it is a pleonasm as climate, like weather, has been ever-changing. Indeed, a historical investigation reveals that the aim in using that term is not scientific but political. Within the political aims, water issues have been greatly promoted by projecting future catastrophes while reversing true roles and causality directions. For this reason, we provide arguments that water is the main element that drives climate, and not the opposite.
... If climate models were able to provide some useful information, it would be possible to incorporate it in a decent stochastic framework. Specifically, Tyralis and Koutsoyiannis (2017) have established a Bayesian framework to convert deterministic climate model predictions into formal stochastic ones. The underlining idea is that if the deterministic forecasts are good, then the Bayesian framework proposed takes them into account, otherwise they are automatically disregarded. ...
Article
Full-text available
The 21st century has been marked by a substantial progress in hydroclimatic data collection and access to them, accompanied by regression in methodologies to study and interpret the behaviour of natural processes and in particular of extremes thereof. The developing culture of prophesising the future, guided by deterministic climate modelling approaches, has seriously affected hydrology. Therefore, aspired advances are related to abandoning the certainties of deterministic approaches and returning to stochastic descriptions, seeking in the latter theoretical consistency and optimal use of available data.
... Representations and characterizations can take various formulations not only as the core of diagnostic and exploratory frameworks, but also as the basis for prediction methodologies of all possible types (see, e.g., Pechlivanidis et al., 2014;Tyralis and Koutsoyiannis, 2017;Tyralis et al., 2020), and as the basis for hydrological and hydroclimatic (stochastic) simulation frameworks (see, e.g., Perrin et al., 2003;Kumar et al., 2006;Langousis and Koutsoyiannis, 2006;Lee and Salas, 2011;Grimaldi et al., 2012;Langousis and Kaleris, 2014;Papalexiou, 2018). The candidate formulations may include (but are not limited to) statistical characterizations and representations in terms of marginal probability (or cumulative) density functions (see, e.g., Kroll et al., 2002;Papalexiou and Koutsoyiannis, 2013;Nerantzaki and Papalexiou, 2019), joint probability density functions and copulas (see, e.g., Serinaldi et al., 2009;Kuchment and Demidov, 2013;Wong et al., 2013), time series or regression models (see, e.g., Carlson et al., 1970;Koutsoyiannis, 2011;Khatami, 2013;Khazaei et al., 2019;Papalexiou and Montanari, 2019;Ghajarnia et al. 2020;Kagawa-Viviani and Giambelluca, 2020), process-based (including conceptual) representations (see, e.g., the reviews in Langousis and Koutsoyiannis, 2006;Koutsoyiannis and Langousis, 2011;Jaramillo and Destouni 2015;Langousis et al., 2016a;Davtalab et al., 2017;Széles et al., 2018;Khatami et al., 2019;Emmanouil et al., 2020;Khatami et al., 2020), and characterizations through feature extraction; see Section 1.2 below. ...
Article
Hydroclimatic time series analysis focuses on a few feature types (e.g., autocorrelations, trends, extremes), which describe a small portion of the entire information content of the observations. Aiming to exploit a larger part of the available information and, thus, to deliver more reliable results (e.g., in hydroclimatic time series clustering contexts), here we approach hydroclimatic time series analysis differently, i.e., by performing massive feature extraction. In this respect, we develop a big data framework for hydroclimatic variable behaviour characterization. This framework relies on approximately 60 diverse features and is completely automatic (in the sense that it does not depend on the hydroclimatic process at hand). We apply the new framework to characterize mean monthly temperature, total monthly precipitation and mean monthly river flow. The applications are conducted at the global scale by exploiting 40-year-long time series originating from over 13 000 stations. We extract interpretable knowledge on seasonality, trends, autocorrelation, long-range dependence and entropy, and on feature types that are met less frequently. We further compare the examined hydroclimatic variable types in terms of this knowledge and, identify patterns related to the spatial variability of the features. For this latter purpose, we also propose and exploit a hydroclimatic time series clustering methodology. This new methodology is based on Breiman’s random forests. The descriptive and exploratory insights gained by the global-scale applications prove the usefulness of the adopted feature compilation in hydroclimatic contexts. Moreover, the spatially coherent patterns characterizing the clusters delivered by the new methodology build confidence in its future exploitation. 
Given this spatial coherence and the scale-independent nature of the delivered feature values (which makes them particularly useful in forecasting and simulation contexts), we believe that this methodology could also be beneficial within regionalization frameworks, in which knowledge on hydrological similarity is exploited in technical and operative terms.
Article
This study presents the potential of hydrological ensemble forecasts over South Korea for medium-range forecast lead times (1-7 days). To generate hydrological forecasts, this study utilizes a framework based on stacking ensemble learning, an emerging machine learning technique that includes a two-level structure: base-learner and meta-learner models. In particular, the present research contributes to hydrological post-processing techniques by: (1) introducing a penalized quantile regression-based meta-learner to generate probabilistic predictions, (2) considering modeled climate predictions and antecedent hydrologic conditions simultaneously for regional hydrological forecast development, and (3) quantifying the skill enhancements from the multi-model forecasts under the stacking generalization. The proposed model is evaluated across 473 grid cells, along with nine additional simpler models, to test the specific hypotheses introduced in this study. Results indicate that our proposed forecasts can be used for relatively short lead times. In addition, results demonstrate that utilizing a penalized probabilistic meta-learner and antecedent conditions contributes to the forecast skill improvements. Lastly, we find that base-model diversity outperforms increased ensemble size alone in enhancing the forecast abilities under the stacking ensemble generalization. We conclude this paper with a discussion of possible forecast model improvements from an adaptation of additional information from input and model structures under the stacking generalization.
Article
Long-range dependence (LRD) is considered an inherent property of geophysical processes, whose presence increases uncertainty. Here we examine the spatial behaviour of LRD in precipitation by regressing the Hurst parameter estimate of mean annual precipitation instrumental data, which span 1916-2015 and cover a large area of the earth's surface, on location characteristics of the instrumental data stations. Furthermore, we apply the Mann-Kendall test under the LRD assumption (MKt-LRD) to reassess the significance of observed trends. To summarize the results, LRD is spatially clustered and seems to depend mostly on the location of the stations, while the predictive value of the regression model is good. Thus, when investigating LRD properties, we recommend that the local characteristics be considered. The application of the MKt-LRD suggests that no significant monotonic trend appears in global precipitation, excluding the climate type D (snow) regions in which positive significant trends appear. Supplementary information files are hosted at: https://doi.org/10.6084/m9.figshare.4892447.v1
Code
Methods to make inference about the Hurst-Kolmogorov and the AR(1) process.
Article
On the basis of the fifth Coupled Model Intercomparison Project (CMIP5) and the climate model simulations covering 1979 through 2005, the temperature trends and their uncertainties have been examined to note the similarities or differences compared to the radiosonde observations, reanalyses and the third Coupled Model Intercomparison Project (CMIP3) simulations. The results show noticeable discrepancies for the estimated temperature trends in the four data groups (radiosonde, reanalysis, CMIP3 and CMIP5), although similarities can be observed. Compared to the CMIP3 model simulations, the simulations in some of the CMIP5 models were improved. The CMIP5 models displayed a negative temperature trend in the stratosphere closer to the strong negative trend seen in the observations. However, the positive tropospheric trend in the tropics is overestimated by the CMIP5 models relative to CMIP3 models. While some of the models produce temperature trend patterns more highly correlated with the observed patterns in CMIP5, the other models (such as CCSM4 and IPSL_CM5A-LR) exhibit the reverse tendency. The CMIP5 temperature trend uncertainty was significantly reduced in most areas, especially in the Arctic and Antarctic stratosphere, compared to the CMIP3 simulations. Similar to the CMIP3, the CMIP5 simulations overestimated the tropospheric warming in the tropics and Southern Hemisphere and underestimated the stratospheric cooling. The crossover point where tropospheric warming changes into stratospheric cooling occurred near 100 hPa in the tropics, which is higher than in the radiosonde and reanalysis data. The result is likely related to the overestimation of convective activity over the tropical areas in both the CMIP3 and CMIP5 models. 
Generally, for the temperature trend estimates associated with the numerical models including the reanalyses and global climate models, the uncertainty in the stratosphere is much larger than that in the troposphere, and the uncertainty in the Antarctic is the largest. In addition, note that the reanalyses show the largest uncertainty in the lower tropical stratosphere, and the CMIP3 simulations show the largest uncertainty in both the south and north polar regions.
Article
Simulated climate dynamics, initialized with observed conditions, is expected to be synchronized, for several years, with the actual dynamics. However, the predictions of climate models are not sufficiently accurate. Moreover, there is a large variance between simulations initialized at different times and between different models. One way to improve climate predictions and to reduce the associated uncertainties is to use an ensemble of climate model predictions, weighted according to their past performances. Here, we show that skillful predictions, for a decadal time scale, of the 2 m temperature can be achieved by applying a sequential learning algorithm to an ensemble of decadal climate model simulations. The predictions generated by the learning algorithm are shown to be better than those of each of the models in the ensemble, the better performing simple average and a reference climatology. In addition, the uncertainties associated with the predictions are shown to be reduced relative to those derived from an equally weighted ensemble of bias-corrected predictions. The results show that learning algorithms can help to better assess future climate dynamics.
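The abstract above does not name the sequential learning algorithm it uses. As a hedged illustration of the general idea (weighting ensemble members by their past performance), the following minimal sketch implements an exponentially weighted average forecaster, a standard sequential learner chosen here as an assumption. The function name `ewa_forecast`, the squared-error loss, the learning rate `eta` and the toy data are all hypothetical.

```python
import numpy as np

def ewa_forecast(model_preds, observations, eta=1.0):
    """Combine model predictions with exponentially weighted averaging.

    model_preds: array of shape (T, M), one column per model (hypothetical data).
    observations: array of shape (T,), revealed after each forecast is issued.
    Returns the combined forecasts (T,) and the final model weights (M,).
    """
    model_preds = np.asarray(model_preds, dtype=float)
    observations = np.asarray(observations, dtype=float)
    n_steps, n_models = model_preds.shape
    weights = np.full(n_models, 1.0 / n_models)  # start from equal weights
    combined = np.empty(n_steps)
    for t in range(n_steps):
        # Issue the forecast using only the weights learned so far.
        combined[t] = weights @ model_preds[t]
        # Once the observation is known, penalize each model by its squared error.
        losses = (model_preds[t] - observations[t]) ** 2
        weights = weights * np.exp(-eta * losses)
        weights /= weights.sum()
    return combined, weights

# Toy example: model 0 has a small bias, model 1 a large one.
truth = np.zeros(50)
preds = np.column_stack([truth + 0.1, truth + 2.0])
combined, final_weights = ewa_forecast(preds, truth)
```

In this toy setting the weight shifts quickly toward the less biased model, so the combined forecast moves from the simple average toward the better member.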
Article
Hydrologic model predictions are often biased and subject to heteroscedastic errors originating from various sources including data, model structure and parameter calibration. Statistical post-processors are applied to reduce such errors and quantify uncertainty in the predictions. In this study, we investigate the use of a statistical post-processor based on the Bayesian joint probability (BJP) modelling approach to reduce errors and quantify uncertainty in streamflow predictions generated from a monthly water balance model. The BJP post-processor reduces errors through elimination of systematic bias and through transient errors updating. It uses a parametric transformation to normalize data and stabilize variance and allows for parameter uncertainty in the post-processor. We apply the BJP post-processor to 18 catchments located in eastern Australia and demonstrate its effectiveness in reducing prediction errors and quantifying prediction uncertainty.
Article
Long-range dependence (LRD), the so-called Hurst–Kolmogorov behaviour, is considered to be an intrinsic characteristic of most natural processes. This behaviour manifests itself in a slowly decaying autocorrelation function and calls into question the Markov assumption that is often habitually employed in time series analysis. Herein, we investigate the dependence structure of annual rainfall using a large set, comprising more than a thousand stations worldwide of length 100 years or more, as well as a smaller number of paleoclimatic reconstructions covering the last 12,000 years. Our findings suggest weak long-term persistence for instrumental data (average H = 0.59), which becomes stronger with scale, i.e. in the paleoclimatic reconstructions (average H = 0.75).
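The abstract above does not state which estimator produced its H values. As a rough illustration of what the Hurst parameter measures, the sketch below uses the aggregated-variance method, picked here as an assumption: for a Hurst-Kolmogorov process the variance of the mean over blocks of length k scales as k^(2H - 2), so H follows from a log-log slope. The function name `hurst_aggregated_variance`, the scales and the synthetic series are hypothetical, and this simple estimator is known to be biased for short records.

```python
import numpy as np

def hurst_aggregated_variance(x, scales=(1, 2, 4, 8, 16, 32)):
    """Estimate H from the slope of log Var(block mean) against log block length."""
    x = np.asarray(x, dtype=float)
    log_k, log_var = [], []
    for k in scales:
        n_blocks = len(x) // k
        if n_blocks < 2:
            continue  # too few blocks to estimate a variance at this scale
        # Average the series over non-overlapping blocks of length k.
        block_means = x[: n_blocks * k].reshape(n_blocks, k).mean(axis=1)
        log_k.append(np.log(k))
        log_var.append(np.log(block_means.var(ddof=1)))
    # Var(block mean) ~ k^(2H - 2), hence slope = 2H - 2.
    slope = np.polyfit(log_k, log_var, 1)[0]
    return 1.0 + slope / 2.0

# Sanity check on white noise, for which the theoretical value is H = 0.5.
rng = np.random.default_rng(0)
white_noise = rng.standard_normal(4096)
h_white = hurst_aggregated_variance(white_noise)
```

Values of H above 0.5, such as the averages quoted in the abstract, would correspond to block-mean variances that decay more slowly with scale than they do for independent data.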
Article
The inclusion of carbon cycle processes within CMIP5 Earth System Models provides the opportunity to explore the relative importance of differences in scenario and climate model representation to future land and ocean carbon fluxes. A two-way ANOVA approach was used to quantify the variability owing to differences between scenarios and between climate models at different lead times. For global ocean carbon fluxes, the variance attributed to differences between Representative Concentration Pathway scenarios exceeds the variance attributed to differences between climate models by around 2025, completely dominating by 2100. This contrasts with global land carbon fluxes, where the variance attributed to differences between climate models continues to dominate beyond 2100. This suggests that modelled processes that determine ocean fluxes are currently better constrained than those of land fluxes, thus we can be more confident in linking different future socio-economic pathways to consequences of ocean carbon uptake than for land carbon uptake. The contribution of internal variance is negligible for ocean fluxes and small for land fluxes, indicating that there is little dependence on the initial conditions. The apparent agreement in atmosphere-ocean carbon fluxes, globally, masks strong climate model differences at a regional level. The North Atlantic and Southern Ocean are key regions, where differences in modelled processes represent an important source of variability in projected regional fluxes.