Revisiting causality using stochastics: 2. Applications
Demetris Koutsoyiannis¹, Christian Onof², Antonis Christofidis¹ and Zbigniew W. Kundzewicz³
¹ Department of Water Resources and Environmental Engineering, School of Civil Engineering, National Technical University of Athens (dk@itia.ntua.gr)
² Department of Civil and Environmental Engineering, Faculty of Engineering, Imperial College London
³ Meteorology Lab, Department of Construction and Geoengineering, Faculty of Environmental Engineering and Mechanical Engineering, Poznan University of Life Sciences, Poznań, Poland
Abstract In a companion paper, we develop the theoretical background of a stochastic
approach to causality with the objective of formulating necessary conditions that are
operationally useful in identifying or falsifying causality claims. Starting from the idea of
stochastic causal systems, the approach extends it to the more general concept of hen-or-
egg causality, which includes as special cases the classic causal, and the potentially causal
and anticausal systems. The framework developed is applicable to large-scale open
systems, which are neither controllable nor repeatable. In this paper we illustrate and
showcase the proposed framework in a number of case studies. Some of them are
controlled synthetic examples and are conducted as a proof of applicability of the
theoretical concept, to test the methodology with a priori known system properties. Others
are real-world studies on interesting scientific problems in geophysics, and in particular
hydrology and climatology.
Keywords causality, causal systems, stochastics, impulse response function, geophysics,
hydrology, climate
疑惑深 智慧深 疑惑浅 智慧浅
(Deep doubts, deep wisdom; shallow doubts, shallow wisdom. Chinese proverb)
1 Introduction
The companion paper (Koutsoyiannis et al., 2022) studies theoretically the identification
and characterization of a causal link between two stochastic processes $x(t)$ and $y(t)$,
which represent two natural processes. Our proposal relies on the relationship:
$$y(t) = \int_{-\infty}^{\infty} g(h)\, x(t-h)\, \mathrm{d}h + v(t) \qquad (1)$$
where $v(t)$ is another stochastic process (not necessarily white noise), assumed
uncorrelated to $x(t)$, which represents the part of the process that is not explained by the
causal link. The function $g(h)$ is the impulse response function (IRF) of the system
$(x(t), y(t))$; to see this, we set $v(t) \equiv 0$ and $x(t) = \delta(t)$ (the Dirac delta function,
representing an impulse of infinite amplitude at $t = 0$ and attaining the value of 0 for
$t \neq 0$), and we readily get $y(t) = g(t)$.
For any two processes $x(t)$ and $y(t)$, equation (1) has infinitely many solutions in
terms of the function $g(h)$ and the process $v(t)$. An obvious and trivial one is
$g(h) \equiv 0$, $v(t) \equiv y(t)$. The sought solution is the one that corresponds to the minimum variance
of $v(t)$, called the least-squares solution. Assuming that this has been determined for the
system $(x(t), y(t))$, we call that system:
1. potentially causal if $g(h) = 0$ for any $h < 0$, while the explained variance is
non-negligible;
2. potentially anticausal if $g(h) = 0$ for any $h > 0$, while the explained variance is
non-negligible (this means that the system $(y(t), x(t))$ is potentially causal);
3. potentially hen-or-egg (HOE) causal if $g(h) \neq 0$ for some $h > 0$ and some $h < 0$,
while the explained variance is non-negligible;
4. noncausal if the explained variance is negligible.
In the above we use the adverb “potentially” to highlight the fact that the conditions
tested provide necessary but not sufficient conditions for causality. In a potentially causal
(or anticausal) system the time order is explicitly reflected in the definition. In a
potentially HOE causal system the time order needs to be clarified by defining the
principal direction. The most natural indices for this are: (a) the time lag $h_c$ maximizing
the absolute value of cross-covariance; (b) the mean (time average) $\mu_h$ of the function $g(h)$;
and (c) the median $h_{1/2}$ of the function $g(h)$.
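As a rough illustration of the three indices, the following Python sketch computes them for a discretized IRF. The function name `characteristic_lags`, its arguments and, in particular, the half-ordinate interpolation used for the median are assumptions made here for illustration; the precise definitions are those of the companion paper.

```python
import numpy as np

def characteristic_lags(g, lags, cross_cov):
    """Return (h_c, mu_h, h_half) for IRF ordinates g given on the lags `lags`.

    cross_cov holds the cross-covariance of y and x evaluated on the same lags.
    """
    g = np.asarray(g, dtype=float)
    lags = np.asarray(lags, dtype=float)
    cross_cov = np.asarray(cross_cov, dtype=float)

    # (a) lag maximizing the absolute value of the cross-covariance
    h_c = float(lags[np.argmax(np.abs(cross_cov))])

    # (b) mean (time average): each lag weighted by its IRF ordinate
    mu_h = float(np.sum(lags * g) / np.sum(g))

    # (c) median: lag splitting the area under the IRF in two equal parts.
    # Left undefined (None) when the IRF has negative ordinates.
    if np.any(g < 0):
        h_half = None
    else:
        p = g / g.sum()
        # assumed convention: half of each ordinate counts as lying before its lag
        cum_mid = np.cumsum(p) - 0.5 * p
        h_half = float(np.interp(0.5, cum_mid, lags))
    return h_c, mu_h, h_half
```

For a time-symmetric IRF all three indices are zero under this convention, whereas an IRF confined to positive lags gives a positive mean and median.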
Assuming that processes $x(t)$ and $y(t)$ are positively correlated (i.e. an increase in
$x(t)$ would result in an increase in $y(t)$; if not, we multiply one of the two by $-1$), we seek
an optimal solution for the IRF by minimising the variance of $v(t)$. We also set additional
desiderata for
(a) an adequate time span of $g(h)$;
(b) a nonnegative $g(h)$, i.e. $g(h) \geq 0$ for all $h$; and
(c) a smooth $g(h)$, with the smoothness expressed as a constraint $E \leq E_0$, where
$E$ is determined in terms of the second derivative of $g(h)$ and $E_0$ is a real
number.
The proposed theoretical framework is formulated in terms of natural, i.e.,
continuous time. On the other hand, as the estimation of the IRF relies on data in an
inductive manner, and data are only available in discrete time, conversion of the
continuous- to a discrete-time framework is necessary for the application. This results in
$$y_\tau = \sum_{j=-\infty}^{\infty} g_j\, x_{\tau-j} + v_\tau \qquad (2)$$
where the sequence $g_j$ is determined from the function $g(h)$ as detailed in the companion paper.
Furthermore, any data set is finite and allows only a finite number of terms to be
estimated. Therefore, in the applications the summation limits $\pm\infty$ in equation (2) are
replaced by $\pm J$, which amounts to assuming that $g_j = 0$ for $|j| > J$, where, in order to identify
$g_j$ from data, $J$ should be chosen much smaller than the length of the dataset.
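The truncated form of equation (2) is an ordinary linear regression of the output on lagged copies of the input. The sketch below illustrates the unconstrained least-squares fit with NumPy. The function names `lagged_matrix` and `estimate_irf` are hypothetical, and centring the series before the fit is an assumption made here so that minimizing the squared residuals coincides with minimizing the variance of the error term; it is not the authors' code.

```python
import numpy as np

def lagged_matrix(x, J):
    """Design matrix with one column per lag j = -J..J holding x[tau - j]."""
    x = np.asarray(x, dtype=float)
    taus = np.arange(J, len(x) - J)      # times at which all 2J+1 lags exist
    lags = np.arange(-J, J + 1)
    return x[taus[:, None] - lags[None, :]], taus, lags

def estimate_irf(x, y, J):
    """Unconstrained least-squares estimate of g_j, j = -J..J."""
    X, taus, lags = lagged_matrix(x, J)
    y_used = np.asarray(y, dtype=float)[taus]
    Xc = X - X.mean(axis=0)              # centre so that only variances matter
    yc = y_used - y_used.mean()
    g, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    residual = yc - Xc @ g
    e = 1.0 - residual.var() / yc.var()  # explained variance ratio
    return lags, g, e
```

Calling `estimate_irf(y, x, J)` instead of `estimate_irf(x, y, J)` interchanges the roles of the two series, which is how the reverse direction can be examined with the same routine.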
If we exclude the nonnegativity constraint, then the problem of identifying the IRF
has an analytical solution (equation (37) in the companion paper). However, a numerical
solution is always possible, simple and fast. The theoretical framework is illustrated,
tested and explored in a number of case studies which are presented below. The majority
of these are synthetic, with a priori known system properties, and they serve to test the
methodology developed. The remaining are real-world case studies that serve to illustrate
the usefulness of the method.
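One simple way to obtain a numerical solution that respects both the nonnegativity and the smoothness desiderata is projected gradient descent on a penalized least-squares objective. The sketch below is only an illustration under stated assumptions: the roughness is represented by the sum of squared second differences of $g_j$, the constraint on roughness is replaced by a penalty with a multiplier `lam`, and the function name `fit_irf_nonneg_smooth` is hypothetical. It is not the solver used by the authors.

```python
import numpy as np

def fit_irf_nonneg_smooth(x, y, J, lam=1.0, n_iter=2000):
    """Minimize ||y - X g||^2 + lam * ||D2 g||^2 subject to g >= 0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    taus = np.arange(J, len(x) - J)
    lags = np.arange(-J, J + 1)
    X = x[taus[:, None] - lags[None, :]]          # column j holds x[tau - j]
    Xc = X - X.mean(axis=0)
    yc = y[taus] - y[taus].mean()

    D2 = np.diff(np.eye(2 * J + 1), n=2, axis=0)  # second-difference operator
    H = Xc.T @ Xc + lam * (D2.T @ D2)             # half the Hessian of the objective
    b = Xc.T @ yc
    step = 1.0 / np.linalg.eigvalsh(H).max()      # safe step size for this quadratic

    g = np.zeros(2 * J + 1)
    for _ in range(n_iter):
        # gradient step followed by projection onto the nonnegative orthant
        g = np.maximum(0.0, g - step * (H @ g - b))

    e = 1.0 - np.var(yc - Xc @ g) / np.var(yc)    # explained variance ratio
    return lags, g, e
```

Setting `lam=0` and dropping the projection would reduce this to the ordinary least-squares problem, which is the case that admits an analytical solution.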
Evidently, the estimation of the IRF from data involves uncertainty. As explained
in the companion paper (Koutsoyiannis et al., 2022), the method of choice for the
uncertainty assessment is Monte Carlo simulation, because the complexity of the
calculations for optimizing the IRF fitting does not allow analytical solutions. To make the
study short, this task was kept out of its scope. However, a preliminary investigation and
some first results are provided in the Supplementary Information, section SI2.1, where it
is shown that (i) the uncertainty in the IRF ordinates per se can be large if no constraints
are used to determine it, (ii) the use of constraints decreases the uncertainty, and (iii) the
uncertainty in the key characteristics related to causality (time directionality, time lags,
explained variance) is small, irrespective of the constraints used.
Increased estimation uncertainty may also result in spurious causality claims. As
high autocorrelation increases uncertainty in the long term, this could be a major case
leading to false identification of causality. This is illustrated in the Supplementary
Information (Section SI2.2) by means of a synthetic example, in which the two processes
are, by construction, independent of one another, but each has high autocorrelation.
Techniques to handle such situations and avoid false conclusions are also discussed there.
As also explained in the companion paper (Koutsoyiannis et al., 2022), the proposed
method for the determination of the IRF ordinates $g_j$, based on the minimization of the
variance of the error process, is, by its construction, nonparametric. A well-known
weakness of determining numerous ordinates is that it is an over-parameterized problem.
Alternative techniques may overcome this problem using a parametric model (such as a
Box-Jenkins model or an autoregressive moving average exogenous (ARMAX) model;
Young, 2011, 2015). In our nonparametric approach the over-parameterization problem
can also be tackled, and here lies the usefulness of imposing constraints (a)-(c) discussed
above. For comparison, an additional parametric method, formulated in terms of
parameterizing the IRF per se in continuous time is also discussed and compared to the
proposed non-parametric method in the Supplementary Information, section SI2.3. This
method is also applied in one of the case studies in the Supplementary Information,
section SI2.4.
2 Case studies
Thirty case studies have been conducted, whose results are summarized in Table 1. In all
of them we started by assuming a potentially HOE causal model with a rather small
number of weights $g_j$, namely 41 (i.e. $J = 20$). Depending on the results of the estimation
procedure, the system is deemed potentially HOE causal if we have nonzero $g_j$ for both some
positive and some negative lags $j$, and potentially causal if $g_j = 0$ for all $j < 0$ (or
anticausal if this happens for all $j > 0$). If the explained variance ratio is close to 0, the
system would be deemed noncausal, but this case did not appear in the case studies.
Furthermore, if the IRF vanishes at the edges of the window ($g_{\pm J} = 0$),
we have a strong indication that the chosen $J$ (defining the window size) is sufficient
to recover the system dynamics in terms of causality. The opposite case would mean that
a larger time window with a greater $J$ is required. The optimal $J$ in this case is not
investigated as our scope here is not to construct an optimal model for a specific system,
but only to test the proposed methodology and seek some insights within it.
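The decision rule described above can be written down compactly. The following sketch is illustrative only: the function name `classify_system` and the two thresholds (`e_min` for a "negligible" explained variance and `rel_tol` for a "zero" ordinate) are assumptions introduced here, since the text does not prescribe numerical tolerances.

```python
import numpy as np

def classify_system(g, lags, e, e_min=0.05, rel_tol=1e-3):
    """Label a system from its estimated IRF ordinates and explained variance ratio."""
    g = np.asarray(g, dtype=float)
    lags = np.asarray(lags)
    if e < e_min or not np.any(g):
        return "noncausal"
    # ordinates below this (hypothetical) threshold are treated as zero
    thresh = rel_tol * np.max(np.abs(g))
    has_pos = bool(np.any(np.abs(g[lags > 0]) > thresh))
    has_neg = bool(np.any(np.abs(g[lags < 0]) > thresh))
    if has_pos and has_neg:
        return "potentially HOE causal"
    if has_neg:
        return "potentially anticausal"
    # also reached when only the lag-zero ordinate is nonzero, a borderline case
    return "potentially causal"
```

An analogous check on the ordinates at the window edges (lags $\pm J$) could be added to flag a window that is too narrow.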
The majority of the case studies are synthetic (#1 to #18) and are conducted as a
proof of concept, i.e., to test the methodology developed with a priori known system
properties. The remaining ones are real-world case studies divided into two subcategories.
Namely, studies #19 to #22 deal with the precipitation-runoff system, which is well
understood in hydrology. Here we try to investigate whether our framework is consistent
with the known fact that precipitation at a specific location is the cause and runoff at the
associated location the effect. In other words, here we know a priori that precipitation
causes runoff and we test our methodology (whether it can capture this known fact)
rather than try to find the actual causality direction. On the contrary, studies #23 to #30
deal with systems that are much more complex and not well understood. Namely, in
studies #23 to #28 we investigate the links between atmospheric temperature and CO₂
concentration (cf. Koutsoyiannis and Kundzewicz, 2020) and in studies #29 and #30 we
investigate the links between atmospheric temperature and El Niño-Southern Oscillation
(ENSO; cf. Kundzewicz et al., 2020).
Table 1 Summary indices for the results of all case studies elaborated. $h_c$ is the time lag maximizing the
cross-covariance $c_{yx}(h)$, or equivalently the cross-correlation $r_{yx}(h)$; $\mu_h$ is the
mean (time average) of the function $g(h)$; $h_{1/2}$ is the median of the function $g(h)$; $e$ is the explained variance
ratio; and $\varepsilon$ is the roughness ratio. In parentheses are the true values for the cases in which they are known.

| # | $h_c$ | $\mu_h$ | $h_{1/2}$ | $r_{yx}(h_c)$ | $e$ | $\varepsilon$ |
| 1 | 0 (0) | 0 (0) | 0 (0) | 0.954 | 1 | 3.7×105 * |
| 2 | 0 (0) | 0 (0) | § | 0.954 | 0.99 | 1.22 * |
| 3 | 0 (0) | 0.02 (0) | § | 0.954 | 0.99 | 3.2×105 * |
| 4 | 0 (0) | 0.11 (0) | § | 0.954 | 0.98 | 2.4×105 * |
| 5 | 0 (0) | 0 (0) | 0.2 (0) | 0.954 | 0.91 | 0 * |
| 6 | 6 (6) | 6.86 (6.84) | § (5.42) | 0.947 | 0.94 | 1.32 |
| 7 | 6 (6) | 6.50 | § | 0.947 | 0.97 | 1.4×103 |
| 8 | 6 (6) | 6.79 (6.84) | 5.49 (5.42) | 0.947 | 0.94 | 0.633 |
| 9 | 6 (6) | 5.73 | 5.62 | 0.947 | 0.94 | 0.0053 |
| 10 | 6 (6) | 6.86 (6.84) | 5.34 (5.42) | 0.947 | 0.94 | 1.9×105 * |
| 11 | 6 (6) | 5.15 | 4.49 | 0.947 | 0.93 | 3.2×105 * |
| 12 | 6 | 6.64 | 5.89 | 0.554 | 0.32 | 1.2×106 * |
| 13 | 6 | 3.34 | 4.62 | 0.554 | 0.43 | 9.8×107 |
| 14 | 8 (8) | 9.65 (298.3) | 8.87 (184.9) | 0.704 | 0.57 | 4.6×105 * |
| 15 | 8 (8) | 8.38 | 8.31 | 0.704 | 0.50 | 1.4×103 * |
| 16 | 0 (0) | 0.06 (0) | 0.06 (0) | 0.758 | 0.71 | 1.7×105 * |
| 17 | 0 (0) | 0.06 (0) | 0.08 (0) | 0.758 | 0.71 | 5.5×106 |
| 18 | 0 (0) | 0.28 | 0.26 | 0.758 | 0.57 | 4.7×104 |
| 19 | 6 | 10.24 | 9.83 | 0.198 | 0.17 | 1.4×104 * |
| 20 | 6 | 6.14 | 6.04 | 0.198 | 0.04 | 0.768 * |
| 21 | 8 | 10.84 | 10.72 | 0.173 | 0.68 | 1.5×104 * |
| 22 | 8 | 8.24 | 7.89 | 0.173 | 0.03 | 0.293 * |
| 23 | 5 | 7.70 | 6.35 | 0.480 | 0.31 | 1.3×105 * |
| 24 | 5 | 5.67 | 5.49 | 0.480 | 0.23 | 7.3×104 * |
| 25 | 0 | 0.79 | 1.11 | 0.404 | 0.17 | 9.0×104 |
| 26 | 0 | 0.56 | 0.82 | 0.404 | 0.17 | 1.6×103 |
| 27 | 1 | 1.74 | 1.87 | 0.875 | 0.86 | 7.9×105 |
| 28 | 1 | 1.10 | 1.03 | 0.875 | 0.77 | 0.018 |
| 29 | 7 | 6.31 | 5.84 | 0.525 | 0.39 | 4.2×105 * |
| 30 | 7 | 5.03 | 5.64 | 0.525 | 0.30 | 5.4×105 * |

* The roughness was calculated without considering the second derivative at zero.
§ We have not defined the median in cases that include negative values of $g_j$.
Notice that for each case, the directions $x \to y$ and $y \to x$ are investigated
separately, as there is no symmetry (or antisymmetry) in the produced IRFs in the two
directions and, hence, in the quantified measures, which are summarized in Table 1. When
we refer to direction $y \to x$ we mean that we interchange the time series $x_\tau$ and $y_\tau$ and still
estimate the IRF in the same way, as described in our equations (e.g. equation (1)), in
which the direction $x \to y$ is assumed.
The details of the case studies are given in the following subsections.
2.1 Synthetic examples
All synthetic examples are based on the same input series $x_\tau$, which was constructed by
the methodology in Koutsoyiannis (2020a), whose software is available online as
Supplementary Information of that paper.
In particular, to generate $x_\tau$ we use the moving average scheme:
$$x_\tau = \sum_{j} a_j\, w_{\tau-j} \qquad (3)$$
where $w_\tau$ is white noise, assumed standard Gaussian, and the sequence of
coefficients $a_j$ is assumed time symmetric, i.e., $a_{-j} = a_j$, and is determined assuming a
Filtered Hurst-Kolmogorov process with a generalized Cauchy-type climacogram (FHK-C;
Koutsoyiannis 2016, 2017):
$$\gamma(k) = \lambda \left(1 + (k/\alpha)^{2M}\right)^{\frac{H-1}{M}} \qquad (4)$$
The term climacogram denotes the variance $\gamma(k)$ of a stochastic process averaged at time
scale $k$, as a function of $k$; by specifying it, all second order properties of the process
(autocovariance, power spectrum, variogram) are also uniquely specified. In the last
equation α and λ are scale parameters with dimensions of $[t]$ and $[x^2]$, respectively, while
M (fractal parameter) and H (Hurst parameter) are dimensionless parameters
determining the dependence structure at a local level (smoothness or fractality) and the
global level (long-range dependence). The parameter values are chosen as 
. The chosen values of H and M suggest respectively (long-term)
persistence and (short-term) smoothness of the process $x_\tau$. We deliberately choose a high
value of Hurst parameter H, corresponding to a process with long-range dependence, in
order to make the case study more challenging and insightful, also noting that a high H
implies high uncertainty, also in the estimation process. The resulting autocorrelations
are high, but not prohibitively high in the sense described in Section SI2.2 of the
Supplementary Information of this paper.
We recall (Koutsoyiannis, 2010, 2016) that the discrete time autocovariance for
integer time lag η is the second discrete derivative of the climacogram multiplied by the
square of the time scale. Specifically, assuming a discretization time step $D$, the
autocovariance is
$$c_\eta = \frac{(\eta+1)^2\,\gamma((\eta+1)D) + (\eta-1)^2\,\gamma(|\eta-1|D)}{2} - \eta^2\,\gamma(\eta D) \qquad (5)$$
The series of coefficients $a_j$ is calculated from that of $c_\eta$ as described in detail in
Koutsoyiannis (2020a).
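To make the link between climacogram and autocovariance concrete, the sketch below evaluates an FHK-C climacogram of the form $\gamma(k) = \lambda (1 + (k/\alpha)^{2M})^{(H-1)/M}$ and the discrete-time autocovariance obtained as its second discrete derivative. The function names and the parameter values used in the example call are hypothetical placeholders, not the values adopted in the paper.

```python
import numpy as np

def climacogram_fhk_c(k, alpha, lam, M, H):
    """FHK-C climacogram: variance of the process averaged at time scale k."""
    k = np.asarray(k, dtype=float)
    return lam * (1.0 + (k / alpha) ** (2.0 * M)) ** ((H - 1.0) / M)

def autocovariance(eta, D, gamma):
    """Discrete-time autocovariance at integer lag eta for time step D.

    gamma is a callable climacogram; the result is the second discrete
    derivative of eta^2 * gamma(eta * D), halved.
    """
    eta = np.abs(np.asarray(eta, dtype=float))

    def big_gamma(m):
        # m^2 * gamma(m D): variance of the process cumulated over m steps, in units of D^2
        return m ** 2 * gamma(m * D)

    return 0.5 * (big_gamma(eta + 1.0) + big_gamma(np.abs(eta - 1.0))) - big_gamma(eta)

# Example with placeholder parameters (alpha, lam, M, H are illustrative only)
gamma = lambda k: climacogram_fhk_c(k, alpha=1.0, lam=1.0, M=0.5, H=0.8)
c = autocovariance(np.arange(0, 21), D=1.0, gamma=gamma)
```

At lag zero the expression reduces to $\gamma(D)$, i.e. the variance of the process averaged over one time step, which offers a quick sanity check.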
The system output $y_\tau$ is calculated in our synthetic case studies from an equation
similar to (3) but now replacing the white noise $w_\tau$ with the input $x_\tau$, and the latter with
the output $y_\tau$, while also adding some noise $v_\tau$:
$$y_\tau = \sum_{j=J_1}^{J_2} a_j\, x_{\tau-j} + v_\tau \qquad (6)$$
where the integers $J_1$ and $J_2$ differ in the various applications as specified below. The noise
$v_\tau$ is assumed Gaussian with standard deviation 0.5, except in one case marked as “pure”
(applications #1 and #2) where no noise is added. The length of the generated series is
8000 in all synthetic case studies. Here we assume that only these synthetic series are
known, while the generation equation (6) is unknown, and we will try to recover (or
approximate) it from the data by using equation (2).
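A minimal sketch of how such a pair of synthetic series could be generated is given below. It mimics the structure of the generation scheme (a symmetric moving average of white noise for the input, then a weighted sum of lagged inputs plus Gaussian noise for the output), but the exponentially decaying weights are a hypothetical stand-in for the FHK-C-derived coefficients, and the function name `make_synthetic` is an assumption.

```python
import numpy as np

def make_synthetic(n=8000, q=20, noise_sd=0.5, seed=0):
    """Generate input x, noisy output y, noise-free output and true causal weights."""
    rng = np.random.default_rng(seed)
    j = np.arange(-q, q + 1)
    a = np.exp(-np.abs(j) / 5.0)           # hypothetical symmetric weights, a[-j] == a[j]
    a /= np.sqrt(np.sum(a ** 2))           # scale so that x has unit variance

    w = rng.standard_normal(n + 3 * q)     # white noise, with spare values for the edges
    x_full = np.convolve(w, a, mode="valid")            # length n + q

    g_true = a[q:] / a[q:].sum()           # causal weights for lags j = 0..q
    y_clean = np.convolve(x_full, g_true, mode="valid") # length n
    x = x_full[q:]                         # align so y_clean[t] = sum_j g_true[j] * x[t - j]
    y = y_clean + noise_sd * rng.standard_normal(n)
    return x, y, y_clean, g_true

x, y, y_clean, g_true = make_synthetic()
```

Using weights on both negative and positive lags instead of `a[q:]` would give a symmetric HOE-type system, and setting `noise_sd=0` would give a noise-free ("pure") one.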
We note that, since the sequence of was determined assuming (long-term)
persistence (, all are nonnegative. Hence in case where , if our
method works well, we expect to find that the sequence of $g_j$ (which we assume unknown
and try to estimate by our method) is identical to the sequence of $a_j$. This, however, will
apparently not be the case if .
In case studies #1 and #2 we use 41 nonzero weights ($j = -20, \ldots, 20$) in equation (6),
without noise, thus building a symmetric HOE causal system without an error term. In this case,
assuming $J = 20$ (41 weights) without using the constraints for roughness and nonnegativity (i.e. using the
analytical solution of equation (37) in the companion paper, with λ = 0), we fully recover
the system dynamics in the direction $x \to y$ (case study #1), as shown in Figure 1 (left).
As seen in Table 1, the variance is fully explained by that dynamics ($e = 1$).
If we reverse the direction, i.e., $y \to x$, without using any constraint (case study #2),
the explained variance ratio remains very high, $e = 0.99$ (Table 1), but the IRF, depicted
in Figure 1 (right), becomes very rough. Clearly, the time symmetry is captured, but apart
from this, the shape of the IRF looks like a random pattern alternating
between positive and negative parts. Does such an alternating random pattern suggest
causality?
One may claim that the number of weights (41) is too high and regard the random
shape as an artefact of non-parsimonious modelling (even though the data size is quite
large, 8000, which should support the estimation of 41 parameters). For this reason, we
have also tried to recover the dynamics with fewer (21) weights ($J = 10$). The resulting
IRF is also plotted in Figure 1 (right) and again has a similar shape, with a symmetric, yet
random pattern alternating between positive and negative parts. The explained variance
ratio remains almost equally high, $e = 0.99$ (Table 1, case study #3), which certainly
suggests that the solution with 41 weights could become more parsimonious without
sacrificing predictability. But does this high predictability suggest causality,
given the rough and alternating pattern of weights?
Figure 1 IRFs (plotted against time lag) for synthetic applications #1 (left) and #2 and #3 (right) representing a HOE causal (symmetric)
system without an error term. By construction, the true IRF has 41 nonzero weights ($j = -20, \ldots, 20$) and the
system dynamics does not contain a random term. No constraints were used in the IRF estimation. For the
estimated IRF the number of weights is $2J + 1$ with $J = 20$ (applications #1 and #2) or $J = 10$ (application
#3).
Figure 2 IRFs (plotted against time lag) for synthetic applications #4 (left; $J = 10$, roughness constraint) and #5 (right;
nonnegativity constraint) representing a HOE causal (symmetric) system without an error term, the same
as in Figure 1.
To decrease the roughness, we have included case study #4 with direction $y \to x$, 21
weights and a roughness constraint. The solution is depicted in Figure 2 (left), where a
more reasonable pattern of the IRF is seen. This was achieved at an almost negligible cost in
predictability ($e = 0.98$ in case study #4 vs. 0.99 in case study #3; Table 1). Yet again we
have an alternation of negative and positive parts. Notably, , while  .
The alternation of signs for a change of lag of just 1 is fully understandable in terms of
improving predictability, but does it have any meaning in establishing causality?
Our final application with the same data set and direction is case study #5, in
which we enable the nonnegativity constraint (with or without the roughness constraint).
Here we get a reasonable solution (Figure 2, right), with only one nonzero weight at time
lag zero. The resulting explained variance ratio is somewhat smaller, $e = 0.91$, and is fully
due to the high lag-zero cross-correlation of $x_\tau$ and $y_\tau$ ($r = 0.954$). All characteristic
time lags are equal to 0, reflecting the full temporal symmetry, as also happens with the
direction $x \to y$ (case study #1). Yet in this fully symmetric case, even if we did not know
that $y_\tau$ was constructed from $x_\tau$, we would conclude that there is a preferential causality
direction, $x \to y$, as this results in higher explained variance and a more consistent and
prolonged IRF (a bigger time span).
In case studies #6 and #7 we use 21 nonzero weights ($j = 0, \ldots, 20$) in equation (6), thus making a typical causal system
(including an error term). If we do not use any constraint, we get the solutions shown in
Figure 3 (left for $x \to y$, right for $y \to x$). In both cases, the time directionality is
captured (see Table 1) but the rough shape in the direction $x \to y$ (case study #6) and the
alternating positive and negative IRF values in both directions do not have any
relationship with the true system dynamics. Interestingly, the explained variance ratio is
higher in the direction $y \to x$ ($e = 0.97$) than in $x \to y$ ($e = 0.94$), which would not be
expected. While causality can be inferred from characteristic time lags, the IRFs do not
help in understanding causality. At this stage the results seem rather puzzling but things
will become clearer with the next case studies.
Figure 3 IRFs for synthetic applications #6 (left) and #7 (right) representing a causal system. By
construction, the IRF has 21 nonzero weights ($j = 0, \ldots, 20$) and a random term in the system dynamics.
No constraints were used in the estimation of IRF.
In case studies #8 and #9 we use the same time series $x_\tau$ and $y_\tau$ as in applications
#6 and #7. Here we additionally apply the nonnegativity constraint for the IRF estimation,
but not the roughness one. As seen in Figure 4 (left) and Table 1, the estimated IRF
reproduces the zero for time lags $j < 0$, thus capturing the direction $x \to y$ and the correct
time lags, but the IRF is far from the true system dynamics because it is very rough. If we
reverse the direction, i.e., $y \to x$, the system becomes anticausal (zero for time lags
$j > 0$), thus again recovering the correct causality direction. In both cases the explained
variance ratio is very high, $e = 0.94$, while, as seen in Figure 4 (right), case #9 results in a
rather smooth IRF. Yet the clearly positive lags for $x \to y$ and negative ones for $y \to x$ do
not leave any doubt that the causality direction is the former.
Figure 4 IRFs for synthetic applications #8 (left) and #9 (right) representing a causal system, the same as
that of Figure 3. Only the nonnegativity constraint was used in the estimation of IRF.
Figure 5 IRFs for synthetic applications #10 (left) and #11 (right) representing the same causal system as
that in Figure 4, but using both the nonnegativity and the roughness constraint in the IRF estimation.
Case studies #10 and #11 again use the same time series $x_\tau$ and $y_\tau$ as in #6 to #9,
representing the same typical causal system. The difference is that we now use both
constraints, nonnegativity and small roughness. As seen in Figure 5 (left), now the system
dynamics is fully recovered, while, as seen in Table 1, this has been done at virtually no
cost in terms of explained variance ratio, which is as high as in case #8 ($e = 0.94$). If we
investigate the reverse direction, $y \to x$ (Figure 5, right), the system correctly appears as
anticausal (zero for time lags $j > 0$), thus again recovering the correct causality
direction.
The above illustrations of the IRF behaviour with and without the constraints demonstrate
that both constraints are essential in studying causality. Therefore, in what follows, we
always use both of them.
Another interesting question to investigate is the following: what happens if the system is
nonlinear in nature, while our entire framework is based on a linear relationship? To
study this question, we exponentiate the time series $y_\tau$ of case studies #8 to #11, while we
leave $x_\tau$ unchanged, and we apply the linear framework again. In other words, now the
actual system is
$$y_\tau = \exp\left(\sum_{j=0}^{20} a_j\, x_{\tau-j} + v_\tau\right) \qquad (7)$$
and we try to approximate it in a linear fashion and estimate it as
$$y_\tau = \sum_{j=-J}^{J} g_j\, x_{\tau-j} + v_\tau \qquad (8)$$
Now we should expect a larger variance of $v_\tau$ in equation (8) because of the disagreement between the
actual system and the model used for causality detection.
As seen in Figure 6 (left), referring to the direction $x \to y$, our framework correctly
detected that we have a potentially causal system. The characteristic time lags do not
show a noteworthy change in comparison with case #10, despite the fact that, as seen in
Table 1, the explained variance ratio has been substantially reduced from 0.94 to 0.32. If
we change the direction to $y \to x$ (Figure 6, right), the ratio increases to 0.43, and the
causality appears as HOE type, but with the correct principal direction, $x \to y$ (anticausal
for $y \to x$). Therefore, here we locate a potential problem of the methodology, as the actual
causality is not HOE.
It is not too difficult to resolve this ambiguity: if we produce a scatter plot of $y_\tau$ vs.
$x_\tau$ (Figure 7, right), it becomes evident that the relationship of $x_\tau$ and $y_\tau$ is not linear; for
comparison, Figure 7 (left) shows a similar plot for case studies #8 to #11, which is typical
for linear relationships. Once we are aware of the nonlinearity, we can perform a
nonlinear transform on $x_\tau$, $y_\tau$ or both, and reapply the methodology on the transformed
series. In this particular case, if we apply the logarithmic transformation on $y_\tau$, we will
switch to cases #8 to #11 and we will fully recover the system dynamics. Additional details
on the application of this technique are given in section 2.2.
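The effect of exponentiating the output, and of undoing it with a logarithm, can be reproduced on a toy system. The sketch below is a simplified illustration with hypothetical settings (a lag-zero linear system with unit weight and noise of standard deviation 0.5); it only compares linear correlations before and after the transformation and is not the multi-lag analysis of the case studies.

```python
import numpy as np

def lag0_correlation(x, y):
    """Sample correlation coefficient between two synchronous series."""
    return float(np.corrcoef(x, y)[0, 1])

rng = np.random.default_rng(3)
n = 8000
x = rng.standard_normal(n)
y_lin = x + 0.5 * rng.standard_normal(n)   # hypothetical linear system with noise
y_exp = np.exp(y_lin)                      # nonlinear (exponentiated) output

r_raw = lag0_correlation(x, y_exp)           # linear fit applied to the nonlinear system
r_log = lag0_correlation(x, np.log(y_exp))   # after the logarithmic transformation
```

Here `r_log` is expected to be clearly higher than `r_raw`, mirroring the drop in explained variance seen when a linear model is applied to the exponentiated series and its recovery after the transformation.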
Figure 6 IRFs for synthetic applications #12 (left) and #13 (right) representing the causal system of Figure
5, but with exponentiating the output of the latter.
Figure 7 Scatter plots of synchronous $y_\tau$ vs. $x_\tau$ for applications #8 to #11 (left) and #12 and #13 (right).
In the next case studies, #14 and #15, we have a causal model with long-range
cross-dependence, constructed by using 1025 nonzero weights ($j = 0, \ldots, 1024$) in equation (6). However, we keep
using $J = 20$ in our causality detection framework and thus we do not expect to fully
recover the true long-term system dynamics. We only test whether we can correctly
detect causality and its direction. Using only the roughness constraint we find the IRF
shown in Figure 8 (left). Notice that the true (theoretical) IRF curve exceeds the horizontal
axis span by far, going up to lag 1024, while the plotted area coincides with the time
window of the sought IRF estimate (i.e., it goes up to lag 20 only). As seen in Table 1 the
true characteristic lags ($\mu_h$ and $h_{1/2}$) are of the order of 200, while with our chosen time window
we can estimate lags up to 20, and most likely of the order of 10. Indeed, the
methodology captured the mono-directional causality, yet the estimated characteristic
time lags are, as expected, too small compared to the true ones of Table 1. Interestingly,
Figure 8 (left) shows increasing IRF magnitude beyond the estimated average lag time.
This may seem inconsistent, as the true IRF does not contain an increasing branch.
However, this is a reasonable behaviour reflecting the fact that the chosen time window
($J = 20$) is too narrow to contain the true IRF. It is thus reasonable for our framework to
increase the power of the most distant components (closer to lag 20) as a replacement of
the power of even more distant ones.
It is most important that the estimated IRF reproduces the zero for time lags
$j < 0$, thus capturing the correct causality direction. If we reverse the direction, i.e., $y \to x$, the
system becomes anticausal (zero for time lags $j > 0$), thus again recovering the correct
causality direction. In both cases the explained variance ratio is high, $e = 0.57$ and 0.50,
for $x \to y$ and $y \to x$, respectively. Notably, in the latter case if we considered only the
correlation with just one term, with time lag 8, the explained variance ratio would be
virtually the same. Overall, the results do not leave any doubt that the true causality
direction is $x \to y$.
Figure 8 IRFs for synthetic applications #14 (left) and #15 (right) representing a causal system. By
construction, the IRF has 1025 nonzero weights ($j = 0, \ldots, 1024$) and a random term in the system dynamics.
In our final synthetic applications, #16 to #18, we consider a symmetric HOE causal
system. This is similar to that in the first applications #1 and #2, except that we now use
long-range cross-dependence with 2049 nonzero weights ($j = -1024, \ldots, 1024$) and include a random term $v_\tau$.
Applications #16 and #17, depicted in Figure 9 (left), examine the causal direction $x \to y$
and they only differ in the roughness term in the optimization of the IRF. Namely, in application
#16 the roughness was calculated without considering the second derivative at zero, in
agreement with the fact that in the true system the second derivative is not defined at
zero due to discontinuity of the first derivative. In application #17 all roughness terms are
included and, as a result, a smoother IRF curve is produced. In both cases, our
methodology captures the time symmetry of the system. The increasing $g_j$ as $j$
approaches ±20 gives an alert that our time window is too narrow to capture the
long-range cross-dependence, something similar to what we observed in application #14.
If we reverse the direction, i.e. $y \to x$ (application #18), again the time symmetry of the
system is confirmed, but now the explained variance ratio (Table 1) decreases from
$e = 0.71$ (for $x \to y$) to 0.57 (for $y \to x$). Notably, in the latter case, if we considered only the
synchronous (lag 0) cross-correlation, the explained variance ratio would be virtually the
same. For this reason, we will again choose as preferential the direction $x \to y$ and try a
wider time window to determine the IRF.
Figure 9 IRFs for synthetic applications #16 and #17 (left), and #18 (right) representing a symmetric HOE
causal system. By construction, the IRF has 2049 nonzero weights ($j = -1024, \ldots, 1024$). In “estimation 1” the
roughness was calculated without considering the second derivative at zero (application #16), while in
“estimation 2” all roughness terms are included (application #17).
2.2 Precipitation-runoff
At the global scale, the hydrological cycle is obviously a family of processes that act in a
cyclical manner (Koutsoyiannis, 2020b), precipitation → runoff → precipitation → …, and
therefore can be thought of as a hen-or-egg case of causality. However, if we specify a
particular location on Earth, the situation is different and it is well known that at a local
scale runoff is caused by past precipitation upstream in the drainage basin in a
mono-directional fashion. We thus expect that precipitation (P) and runoff (R) data should
reflect this pattern.¹
To explore this, we use rainfall and streamflow data from the database of the U.S.
Geological Survey for the site USGS 01603000 North Branch Potomac River Near
Cumberland, MD (39°37'18.5"N, 78°46'24.3"W, catchment area 2271 km2). The data
series are for the period 2013-10-01 to 2021-02-25 for a time step of 15 min (a part of
these data was also used for a similar purpose in Koutsoyiannis, 2019). The discharge data
were converted to metric units (from cubic feet per second to m3/s and the precipitation
data from inches to mm). Both series have a small percentage of missing values, which
1
There is a common perception in the hydrological community that the precipitation runoff
transformation can be modelled in a deductive way. Epistemologically, this cannot hold, as a river basin is
a complex geophysical system (cf. Bode and Shannon, 1950, quoted in the companion paper; Koutsoyiannis
et al., 2022). It is true that some of the mechanisms of the transformation are described by differential
equations as dynamical systems. However, the modelling of the entire system cannot be reliably made
without data and without moving from a deterministic to a stochastic description (cf. Montanari and
Koutsoyiannis, 2012; Koutsoyiannis and Montanari, 2022). Therefore, induction is absolutely necessary.
This remains a demanding problem as reflected in the notion of equifinality (Beven, 2019).
0
0.001
0.002
0.003
0.004
0.005
0.006
0.007
0.008
0.009
0.01
-20 -10 010 20
IRF
Time lag
IRF
Mean
Median
0
0.05
0.1
0.15
0.2
0.25
0.3
0.35
-20 -10 010 20
IRF
Time lag
IRF, theoretical IRF, estimated 1
IRF, estimated 2 Mean
Median
15
were left unfilled. The series were aggregated to a time scale of 3 h in order for our time
window of length 40 to represent a period of a few days (120 h or 5 d).
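The pre-processing described in this paragraph can be sketched as follows. This is a minimal illustration rather than the code used in the study: the helper names, the use of `None` for missing values, and the choice to average whatever values are available within each 3-h block (12 steps of 15 min) are assumptions; only the conversion factors (1 ft = 0.3048 m, 1 in = 25.4 mm) are standard.

```python
# Sketch of the pre-processing: unit conversion and aggregation from 15 min to 3 h.
# Assumptions (not from the paper): missing values are None, and a 3-h block is
# averaged over the values it contains (None if it contains none).

CFS_TO_M3S = 0.3048 ** 3      # 1 cubic foot per second in m3/s (about 0.0283168)
INCH_TO_MM = 25.4             # 1 inch in mm
STEPS_PER_BLOCK = 12          # 12 steps of 15 min = 3 h


def convert(values, factor):
    """Multiply each available value by a unit-conversion factor."""
    return [None if v is None else v * factor for v in values]


def aggregate_blocks(values, block=STEPS_PER_BLOCK):
    """Average consecutive blocks, leaving gaps unfilled."""
    out = []
    for start in range(0, len(values) - block + 1, block):
        chunk = [v for v in values[start:start + block] if v is not None]
        out.append(sum(chunk) / len(chunk) if chunk else None)
    return out


# Hypothetical 6 h of discharge in cubic feet per second, with one missing value.
discharge_cfs = [1000.0] * 11 + [None] + [2000.0] * 12
discharge_3h = aggregate_blocks(convert(discharge_cfs, CFS_TO_M3S))

# Hypothetical 3 h of 15-min precipitation depths in inches.
precip_mm = convert([0.01] * 12, INCH_TO_MM)

print(discharge_3h, precip_mm[0])
```

With 3-h steps, a window of 40 steps spans 40 × 3 h = 120 h, which is the 5-day period mentioned above.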
Indeed, Figure 10 (left) and Table 1 (case study #19) suggest a causal system with
direction P → R, a time span > 20 time units (60 h) and characteristic time lags > 10 time
units (30 h). The increase of the IRF towards the right end of the window is clearly an artefact produced
by the fact that the time span of the causal relationship exceeds the size of our time
window. Had the latter been large enough, we would expect to see a shape like a unit
hydrograph, with a monotonically decreasing limb after the peak at time lag 6 (see below).
If we reverse the direction, i.e. R → P (Figure 10, right, and Table 1, case study #20), we
just confirm that the true causality direction is the opposite, i.e. P → R. The explained
variance ratio in the direction R → P is very low, not greater than implied by
merely the maximum cross-correlation value. In the correct direction, P → R, this
increases by a factor of 4 (e = 0.17). Yet the latter value would still be too low if our aim
were to capture the system dynamics. One reason for this low value is the narrow time
window, as already discussed. A second reason becomes obvious from the scatter plot of
runoff vs. precipitation in Figure 11 (left). The scatter plot does not suggest linearity and
we may expect a larger e if we transform the two variables.
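To make the comparison with the maximum cross-correlation concrete: when one series is regressed on the other at a single lag, the explained variance ratio equals the squared cross-correlation at that lag. The sketch below checks this on synthetic data. It assumes the explained variance ratio is defined as e = 1 − var(residual)/var(y); that definition, the function names and the data are assumptions made for illustration.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Sample covariance (divisor n) of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def single_lag_fit(x, y, lag):
    """Regress y[t + lag] on x[t]; return (r^2, explained variance ratio)."""
    xs, ys = x[:len(x) - lag], y[lag:]
    r2 = cov(xs, ys) ** 2 / (cov(xs, xs) * cov(ys, ys))
    slope = cov(xs, ys) / cov(xs, xs)
    intercept = mean(ys) - slope * mean(xs)
    resid = [b - (intercept + slope * a) for a, b in zip(xs, ys)]
    e = 1.0 - cov(resid, resid) / cov(ys, ys)
    return r2, e


random.seed(1)
x = [random.gauss(0.0, 1.0) for _ in range(500)]
# Synthetic response spread evenly over lags 1..4, plus noise.
y = [0.0] * len(x)
for t in range(4, len(x)):
    y[t] = sum(0.25 * x[t - j] for j in range(1, 5)) + random.gauss(0.0, 0.5)

r2, e = single_lag_fit(x, y, lag=2)
print(round(r2, 3), round(e, 3))
```

A response that is spread over several lags is therefore poorly summarized by any single-lag statistic, which is the motivation for fitting the IRF over a range of lags.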
Figure 10 IRFs for precipitation-runoff case studies #19 (left) and #20 (right).
We choose to apply to each of the variables a single-parameter (c) transformation
suggested in an entropy maximization framework (Koutsoyiannis, 2014), namely

 
(9)
where the two parameters were optimized simultaneously with the IRF and were found
to be 

. Practically, the low values of the parameters
mean that the transformations made are almost purely logarithmic. Yet the transformation
in equation (9) is more advantageous than the pure logarithmic one, because of its property
of keeping the smallest values for P or R virtually unchanged and the zero values
precisely unchanged. The scatter plot of the transformed variables is presented in Figure
11 (right). While a linear arrangement of the points is more visible in the transformed
variables than in the untransformed ones (Figure 11, left), there was no
improvement (actually there was slight worsening) in the achieved maximum
cross-correlation (Figure 11 and Table 1). Yet the transformation resulted in a substantial
improvement in the explained variance ratio: from e = 0.17 it increased fourfold, to
e = 0.68, for the direction P → R. In the opposite direction there was no change. This means
that a cross-correlation at a single time lag is a very poor representation of (potential)
causality and justifies our framework which is based on an appropriate range of time lags.
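The properties attributed to the transformation of equation (9) can be illustrated with a short sketch. It assumes the single-parameter form x′ = c ln(1 + x/c); this specific form and the parameter value used below are assumptions made for illustration, not values taken from the paper.

```python
import math


def transform(x, c):
    """Assumed single-parameter transformation x' = c * ln(1 + x / c)."""
    return c * math.log1p(x / c)


c = 0.01  # hypothetical parameter value, in the units of x

zero = transform(0.0, c)             # a zero value stays exactly zero
small = transform(1e-5, c)           # values much smaller than c barely change
large = transform(100.0, c)          # values much larger than c ...
pure_log = c * math.log(100.0 / c)   # ... behave like the pure logarithm

print(zero, small, large, pure_log)
```

Under this assumed form, a small c makes the transformation nearly logarithmic over most of the range while keeping zeros intact, which a pure logarithm cannot do for intermittent series such as precipitation.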
The IRFs of the transformed processes are shown in Figure 12, which is not
essentially different from Figure 10. This means that, while the transformation offers
explanatory power, our methodology, even applied to the untransformed processes, is
able to reveal the causal relationship (recall also case studies #12 and #13 and see the
theoretical explanation in Section SI1.2 of the Supplementary Information of the
companion paper; Koutsoyiannis et al., 2022).
Figure 11 Scatter plots of lagged runoff (R) vs. precipitation (P), where the time lag is the one that
maximizes cross-correlation. Left: untransformed variables, applications #19 and #20; right: transformed
variables, applications #21 and #22.
[Figure 11 panels. Left: R_{τ+6} (m³/s) vs. P_τ (mm/h), fitted line y = 16.699x + 39.154, R² = 0.0392. Right: R′_{τ+8} vs. P′_τ, fitted line y = 0.0018x + 0.0101, R² = 0.0337.]
Figure 12 IRFs for transformed precipitation-runoff case studies #21 (left) and #22 (right); for the IRFs
of the original data set see Figure 10.
To show that the increasing limb of the IRF at the largest time lags in applications
#19 and #21 is an artefact of the small time window, we repeated the calculations with a
time window twice as long, i.e. with maximum lag J = 40 instead of J = 20. The results are shown in Figure
13. Where there were increasing limbs in applications #19 and #21, now there is
continuation of the decreasing limbs, as expected. As a result, the characteristic lags
increased: in 3-h time units, the mean lag became 17.35 and 19.10 for untransformed and
transformed variables, respectively (from 10.24 and 10.84, respectively); and the median
lag increased to 15.64 and 18.54 (from 9.83 and 10.72, respectively). The explained
variance ratio increased to 0.26 and 0.71 for untransformed and transformed variables,
respectively (from 0.17 and 0.68, respectively). However, again there appear (smaller)
increasing limbs close to the right end of the wider window. This means that even J = 40
is still not enough to recover the dynamics and we should choose an even higher value of
J. But as we have repeatedly stated, the scope of the study is not to provide a model of the
process but to explore causality.
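The dependence of the characteristic lags on the window length can be illustrated with a small sketch. The definitions used here (mean lag as the IRF-weighted average of the lags; median lag as the smallest lag at which the cumulative IRF reaches half of its total) are assumptions made for illustration, and the hydrograph-like IRF is synthetic. The sketch only truncates a known IRF, whereas in the study the IRF is re-estimated for each window.

```python
import math


def mean_lag(lags, g):
    """IRF-weighted mean of the lags (assumes g >= 0 and not all zero)."""
    return sum(j * w for j, w in zip(lags, g)) / sum(g)


def median_lag(lags, g):
    """Smallest lag at which the cumulative IRF reaches half of its total."""
    half, cumulative = 0.5 * sum(g), 0.0
    for j, w in zip(lags, g):
        cumulative += w
        if cumulative >= half:
            return j
    return lags[-1]


def synthetic_irf(max_lag, scale=8.0):
    """Hypothetical hydrograph-like IRF on lags 0..max_lag: j * exp(-j / scale)."""
    lags = list(range(max_lag + 1))
    return lags, [j * math.exp(-j / scale) for j in lags]


lags20, g20 = synthetic_irf(20)   # narrow window: the tail is cut off
lags40, g40 = synthetic_irf(40)   # window twice as long

print(mean_lag(lags20, g20), median_lag(lags20, g20))
print(mean_lag(lags40, g40), median_lag(lags40, g40))
```

Cutting off the tail of a long-memory response biases both characteristic lags downward, which is the qualitative behaviour reported above when the window is doubled.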
Figure 13 IRFs for precipitation-runoff case studies similar to #19 (left; untransformed variables) and
#21 (right; transformed variables) but with a time window twice as long, i.e. J = 40 instead of
J = 20; results are plotted only for nonnegative lags.
2.3 Atmospheric temperature and CO₂ concentration
The problem related to the causal relationship between atmospheric temperature (T) and
concentration of carbon dioxide ([CO₂]) is regarded by many as part of a “settled science”,
yet it remains challenging and still debated. For example, the study by Koutsoyiannis and
Kundzewicz (2020) concluded, making use of the hen-or-egg causality concept and based
on the analysis of modern measurements of T and [CO₂], that the principal causality
direction is T → [CO₂], despite the common conviction that the opposite is true. In
addition, using palaeoclimatic proxy data from Vostok ice cores, Koutsoyiannis (2019)
and Koutsoyiannis and Kundzewicz (2020) found that [CO₂] lags T by a thousand
years. Here we re-examine both modern and palaeoclimatic data sets with our proposed causality
detection methodology.
As modern observations for global temperature we use the satellite dataset
developed at the University of Alabama in Huntsville (UAH). The temperature of three
broad levels of the troposphere is inferred from satellite measurements of the oxygen
radiance in the microwave band, using advanced (passive) microwave sounding units on
NOAA and NASA satellites (Spencer and Christy, 1990; Christy et al., 2007). The dataset
begins in 1979 and continues to date. It is publicly available at a monthly scale in the form of
time series of “anomalies” (defined as differences from long-term means) for several parts
of the Earth, as well as in map form. Here we use only the global average (noting that
Liang, 2022, disputes the use of averages as he claims that generally they conceal regional
patterns of change) on the monthly scale for the lowest level, referred to as the lower
troposphere. For the CO₂ concentration we use the most famous dataset, that of the Mauna
Loa Observatory (Keeling et al., 1976). The Observatory, located on the north flank of
Mauna Loa Volcano, on the Big Island of Hawaii, USA, at an elevation of 3397 m above sea
level, is a premier atmospheric research facility that has been continuously monitoring
and collecting data related to the atmosphere since 1958. Here we examine the common
42-year period of the two datasets (1979 to May 2021) at a monthly scale.
Both data sets were also used by Koutsoyiannis and Kundzewicz (2020) who also
examined a ground-based temperature data series (CRUTEM.4.6.0.0 global T2m land
temperature) and three additional CO₂ series (Barrow, Alaska, USA; South Pole; global
average). The results did not substantially differ for the different pairs of T and [CO₂] series
and therefore here we limit our analysis to one pair. We additionally note that we
also examined the temperature data of the other two satellite levels for the troposphere
and the results were very similar to those reported here for the lower troposphere.
Furthermore, in our analysis we follow the data pre-processing justified in
Koutsoyiannis and Kundzewicz (2020); namely:
1. We take the logarithms of [CO₂], based on Arrhenius’s (1896) rule that when [CO₂]
increases in geometric progression, T will increase nearly in arithmetic
progression.
2. We take the difference of each monthly value from that of the same month of the
previous year. This diminishes the effect of seasonality while eliminating possible
artificial effects of the convention of giving the temperature data as “anomalies”
(departures from changing monthly means) rather than actual values.
We note that, since we are differencing the process, taking the average of 12
consecutive monthly differences (which is a more established process) is equivalent, up to the
constant factor 1/12, to taking a difference for a time step of 12 months (notice that, for any
series $x_\tau$, the sum telescopes: $\sum_{i=0}^{11} (x_{\tau-i} - x_{\tau-i-1}) = x_\tau - x_{\tau-12}$).
We further note that Koutsoyiannis and Kundzewicz (2020) also examined the
non-differenced series and showed that they give spurious results due to the continuously
rising [CO₂] in the time window of modern observations, which results in autocorrelation
values that are virtually 1 for all lags. Here the case of spurious results due to very high
autocorrelation is explained more thoroughly; see Section SI2.2 in the Supplementary
Information, and notice the similarity of Figure SI2.3 with Figure 9 in Koutsoyiannis and
Kundzewicz (2020).
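The two pre-processing steps, and the equivalence between 12-month differences and summed monthly differences, can be checked numerically. The sketch below uses a synthetic monthly series (trend plus seasonal cycle) and hypothetical helper names, all of which are assumptions made for illustration.

```python
import math


def lagged_difference(series, step):
    """x[t] - x[t - step] for every t where both values exist."""
    return [series[t] - series[t - step] for t in range(step, len(series))]


# Synthetic monthly [CO2]-like series in ppm: linear trend plus a seasonal cycle.
co2 = [340.0 + 0.15 * t + 3.0 * math.sin(2.0 * math.pi * t / 12.0) for t in range(60)]
log_co2 = [math.log(v) for v in co2]          # step 1: logarithms

annual_diff = lagged_difference(log_co2, 12)  # step 2: 12-month differences
monthly_diff = lagged_difference(log_co2, 1)

# Sum of 12 consecutive monthly differences, aligned with annual_diff.
summed = [sum(monthly_diff[t:t + 12]) for t in range(len(monthly_diff) - 11)]
max_gap = max(abs(a - b) for a, b in zip(annual_diff, summed))

print(len(annual_diff), max_gap)
```

In this synthetic series the seasonal term has the same value 12 months apart, so it cancels in the 12-month difference and only the trend contribution remains.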
Hence the examined processes are ΔT and Δln[CO₂]. Figure 14 gives the obtained
IRFs in the directions ΔT → Δln[CO₂] (left panel) and Δln[CO₂] → ΔT (right panel).
Impressively, the results are not different from those in the precipitation-runoff case
study. Clearly, the results in Figure 14 suggest a (mono-directional) potentially causal
system with T as the cause and [CO₂] as the effect. Hence the common perception that
increasing [CO₂] causes increased T can be excluded as it violates the necessary condition
for this causality direction. Even the possibility of hen-or-egg causality, supported by
Koutsoyiannis and Kundzewicz (2020), is not confirmed by the new methodological
framework. The causality direction T → [CO₂] is further supported by all numerical
indices given in Table 1, as well as the graphical comparison of modelled and empirical
cross-correlation functions depicted in Figure 15. Namely:
• All characteristic time lags are positive in the direction ΔT → Δln[CO₂]
(ranging from 5 to about 8 months), and negative in the direction Δln[CO₂] → ΔT.
• The explained variance ratio is greater in the direction ΔT → Δln[CO₂]
than in the direction Δln[CO₂] → ΔT.
• In the direction ΔT → Δln[CO₂], the cross-correlation function (reconstructed from
the IRF and the autocorrelation function using the discretized version of equation
(12) in the companion paper; Koutsoyiannis et al., 2022) agrees impressively well
with the empirical cross-correlation function. In the direction Δln[CO₂] → ΔT the
proximity of the two is much lower.
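The reconstruction of a cross-correlation function from an IRF and an autocorrelation function amounts to a discrete convolution. The sketch below assumes the linear relationship y_τ = Σ_j g_j x_{τ−j} + v_τ with v uncorrelated with x, so that c_yx(h) = Σ_j g_j c_xx(h − j), where c_yx(h) = cov(y_{τ+h}, x_τ). The sign convention for the lag, the function names and the AR(1)-type autocovariance are assumptions made for illustration; this is not a transcription of equation (12) of the companion paper.

```python
def cross_covariance_from_irf(g, c_xx, h_values):
    """c_yx(h) = sum_j g[j] * c_xx(h - j), with g a dict {lag j: weight}
    and c_xx a function returning the autocovariance of x at a given lag."""
    return {h: sum(w * c_xx(h - j) for j, w in g.items()) for h in h_values}


def ar1_autocovariance(rho, variance=1.0):
    """Autocovariance of an AR(1)-type process: variance * rho**|h|."""
    return lambda h: variance * rho ** abs(h)


# Hypothetical causal IRF: nonzero weights only at positive lags.
g = {1: 0.2, 2: 0.5, 3: 0.3}

c_white = cross_covariance_from_irf(g, ar1_autocovariance(0.0), range(-5, 6))
c_ar1 = cross_covariance_from_irf(g, ar1_autocovariance(0.8), range(-5, 6))

print(c_white[2], c_ar1[2], c_ar1[-2])
```

For an uncorrelated input the cross-covariance reproduces the IRF itself. For an autocorrelated input it is nonzero at negative lags as well, even though the IRF is purely causal, so agreement between the reconstructed and empirical functions is a check on the fitted IRF rather than a direct reading of the causal direction.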
Remarkably, however, the explained variance ratio is low and suggests
that the two processes have a behaviour that is much more complex and affected by
additional geophysical processes. However, insofar as the relationship of these two
processes is concerned, this explained variance ratio is adequate to detect the main
characteristics, i.e. direction and time lags. Indications for this adequacy are provided by
the precipitation-runoff case study (section 2.2), in which the explained variance ratio for the
untransformed variables was also low (0.17), as well as by the
additional controlled (synthetic) case study that is provided in the Supplementary
Information (section SI2.1).
Having gathered strong indications that in the recent decades the increase of
temperature is potentially the cause of the increased CO₂ concentration, while the
opposite is not probable, it is interesting to examine if this is also the case for longer
periods. To this aim we use the datasets from the Vostok ice cores (Jouzel et al., 1987;
Petit et al., 1999; Caillon et al., 2003), which were originally given for an irregular time
step and were regularized in the study of Koutsoyiannis (2019) for a time resolution of
1000 years.
Figure 14 IRFs for temperature and CO₂ concentration based on the modern time series. (left) Case study
#23 (ΔT → Δln[CO₂]); (right) case study #24 (Δln[CO₂] → ΔT).
Figure 15 (upper) Autocorrelation function of (left) ΔT and (right) Δln[CO₂]. (lower) Cross-correlation
functions, empirical and reconstructed (marked as “modelled”) from the IRF and the autocorrelation
functions in the upper panels, using the discretized version of equation (12) in the companion paper, for
case studies (left) #23 and (right) #24.
We study again the processes ΔT and Δln[CO₂], where the differences are taken for
1 time step (1000 years). Figure 16 gives the obtained IRFs in the directions
ΔT → Δln[CO₂] (left panel) and Δln[CO₂] → ΔT (right panel). Here the results support a HOE
causality. Nonetheless, again the principal direction is T → [CO₂], with a time lag of
the order of 1000 years (0.79 to 1.11 time steps for ΔT → Δln[CO₂]; 0.56 to 0.82 time
steps for Δln[CO₂] → ΔT; see Table 1, case studies #25 and #26). The explained variance
ratio is small in both directions.
Figure 16 IRFs for temperature and CO₂ concentration based on the proxy time series from the Vostok ice
cores. (left) Case study #25 (ΔT → Δln[CO₂]); (right) case study #26 (Δln[CO₂] → ΔT).
As the proxy data sets are free of monotonic trends and produce reasonable
empirical autocorrelation functions (see Koutsoyiannis, 2019), here we could also apply
our framework for the non-differenced processes. We initially note that, if $\tilde{x}_\tau$ is a
differenced process (where the differences are taken for 1 time step) and $x_\tau$ the
non-differenced (original) one, then the two are related by
$$\tilde{x}_\tau = x_\tau - x_{\tau-1} \qquad (10)$$
and likewise for $\tilde{y}_\tau$ and $y_\tau$. In this case from equation (2) it follows
$$\tilde{y}_\tau = \sum_j g_j \tilde{x}_{\tau-j} + \tilde{v}_\tau, \qquad \tilde{v}_\tau = v_\tau - v_{\tau-1} \qquad (11)$$
This is identical to equation (2), which means that the original and differenced processes
are related with the same potential causality equation involving the same IRF.
Furthermore, as a result of aggregation, and the resulting smoothing of the noise, it is
expected that the explained variance ratio should be higher after aggregation.
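The statement that the original and differenced processes are related through the same IRF follows from the linearity of the relationship, and it can be verified numerically. In the sketch below the input series, the IRF weights and the helper names are hypothetical; the check is that differencing the filtered series gives the same result as filtering the differenced series (a noise term, omitted here, would simply be differenced in the same way).

```python
import random


def apply_irf(g, x):
    """y[t] = sum_j g[j] * x[t - j], j = 0..J, for all t where every term exists."""
    J = len(g) - 1
    return [sum(g[j] * x[t - j] for j in range(J + 1)) for t in range(J, len(x))]


def difference(series):
    """First differences for a time step of 1."""
    return [b - a for a, b in zip(series, series[1:])]


random.seed(0)
# Hypothetical non-differenced input: a random walk.
x, level = [], 0.0
for _ in range(200):
    level += random.gauss(0.0, 1.0)
    x.append(level)

g = [0.1, 0.4, 0.3, 0.2]  # hypothetical nonnegative IRF weights

diff_of_filtered = difference(apply_irf(g, x))
filtered_of_diff = apply_irf(g, difference(x))
max_gap = max(abs(a - b) for a, b in zip(diff_of_filtered, filtered_of_diff))

print(len(diff_of_filtered), max_gap)
```

The two results agree to rounding error, so an IRF fitted to either pair of series describes the same potential causal link; what changes between the two cases is the noise term and hence the explained variance ratio.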
We test if this happens in the case of the proxy T and [CO₂] data. Figure 17 gives the
obtained IRFs in the directions T → ln[CO₂] (left panel) and ln[CO₂] → T (right panel). The
shapes of the curves are quite similar to those of Figure 16. Again, we find HOE causality
with principal direction T → [CO₂] and with a time lag of the order of 1000 years (1.74
to 1.87 time steps for T → ln[CO₂]; 1.03 to 1.10 time steps for ln[CO₂] → T; see Table
1, case studies #27 and #28). The explained variance ratio increased substantially
and is greater in the direction T → [CO₂], thus confirming the latter as
the principal causality direction.
Figure 17 IRFs for temperature and CO₂ concentration based on the proxy time series from the Vostok ice
cores as in Figure 16 but for the aggregated (non-differenced) processes. (left) Case study #27
(T → ln[CO₂]); (right) case study #28 (ln[CO₂] → T).
2.4 Atmospheric temperature and ENSO
For a second case study related to climate, we examine the El Niño-Southern Oscillation
(ENSO) in relation to the global temperature. The ENSO is associated with irregular,
anti-persistent (at the time scale of a few years, else known as quasi-periodic) variation of sea
surface temperature and air pressure over the tropical Pacific Ocean. It is broadly
recognized as the principal climate variability mode (McPhaden et al., 2006). There exists
a plethora of indices related to ENSO, among which here we use the Southern Oscillation
Index (SOI) of the USA’s National Oceanic and Atmospheric Administration (NOAA) (see
also Kaplan, 2011).
In a recent paper, Kundzewicz et al. (2020) examined several ENSO indices, as well
as indices of similar phenomena in the Pacific and in the Atlantic, and found that they
meaningfully influence the global mean annual temperature. Here we examine a single
pair of processes, the UAH temperature, also used in case studies #23 to #28 (section 2.3),
and the SOI. While Kundzewicz et al. (2020) processed the residuals from 5-year running
averages to remove the influence of large-scale fluctuations, here we use the data without
averaging. However, in both time series we take the difference of each monthly value from
that of the same month of the previous year, a technique already described in section 2.3.
The results are depicted in Figure 18, which suggests a clear case of a potentially
causal system in the direction SOI → T. As seen in Table 1, the characteristic lags are close
to 6 months and the explained variance ratio is 0.39. We note that the technique used by
Kundzewicz et al. (2020) (residuals from 5-year running averages) explained a higher
percentage of the variance (>0.60), which means that there is a margin for improvement
of the results obtained here. However, as already mentioned, the scope of the current
study is the exploration of potential causality, rather than the building of a reliable model.
Figure 18 IRFs for the Southern Oscillation Index (SOI) and temperature (T). (left) Case study #29 (SOI → T);
(right) case study #30 (T → SOI).
3 Discussion and conclusions
The theoretical methodology proposed in the companion paper (Koutsoyiannis et al.,
2022) was applied to a range of case studies, both synthetic (artificial) and real world.
Since the system dynamics in the artificial cases is known, this enables the method to be
tested and validated. We showed that the method does not introduce any spurious
potential causation when none is present in the system dynamics, except in cases of very
high autocorrelation (see Section SI2.2 in Supplementary Information), which,
nonetheless, is easily handled by studying the changes in the time series (differenced time
series) instead of the original time series. Further, the requirement that the IRF
coefficients be nonnegative is, on its own, sufficient to enable the correct direction of
causality to be inferred. (Recall from the companion paper, Koutsoyiannis et al., 2022,
that we do not subsume oscillatory nonlinear systems, in which the sign of the IRF could
alternate, under the causality notion, which accords with Cox’s (1992) conditions for
causality and in particular the monotonic relation of the cause with the effect.) When the
roughness condition is added, this enables the correct system dynamics to be recovered.
We showed how to deal with the presence of nonlinearity (in which case a nonlinear
transform of the potential cause should be carried out) and of long-term persistence in
the time-series (which causes rising limbs in the IRF at the edges of its domain).
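How the nonnegativity and roughness requirements can enter an IRF estimate is sketched below on a controlled synthetic example. This is not the estimation procedure of the companion paper: here the roughness enters as a penalty on squared second differences with an arbitrary weight `lam`, and the constrained least-squares problem is solved by plain projected gradient descent. The function name, the parameter values and the data are assumptions made for illustration.

```python
import random


def estimate_irf(x, y, J, lam=1.0, iters=2000):
    """Nonnegative weights g for lags -J..J (list of length 2J+1) minimizing
    sum_t (y[t] - sum_j g[j] x[t-j])^2 + lam * sum (second differences of g)^2."""
    n, m = len(x), 2 * J + 1
    ts = range(J, n - J)                       # times where all lags are available

    def col(k, t):
        return x[t - (k - J)]                  # regressor for lag j = k - J

    # Normal equations A g = b of the unconstrained least-squares problem.
    A = [[sum(col(k, t) * col(l, t) for t in ts) for l in range(m)] for k in range(m)]
    b = [sum(col(k, t) * y[t] for t in ts) for k in range(m)]
    # Roughness penalty: add lam * D'D, with D the second-difference operator.
    for i in range(m - 2):
        stencil = ((i, 1.0), (i + 1, -2.0), (i + 2, 1.0))
        for k, dk in stencil:
            for l, dl in stencil:
                A[k][l] += lam * dk * dl
    # Step 1/L, with L an upper bound on the largest eigenvalue (max row sum).
    step = 1.0 / max(sum(abs(a) for a in row) for row in A)
    g = [0.0] * m
    for _ in range(iters):
        grad = [sum(A[k][l] * g[l] for l in range(m)) - b[k] for k in range(m)]
        g = [max(0.0, gk - step * dk) for gk, dk in zip(g, grad)]  # project on g >= 0
    return g


random.seed(42)
n, J = 400, 5
true_g = {1: 0.5, 2: 0.3, 3: 0.2}              # hypothetical causal IRF (positive lags)
x = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [sum(w * x[t - j] for j, w in true_g.items() if t - j >= 0)
     + random.gauss(0.0, 0.1) for t in range(n)]

g_xy = estimate_irf(x, y, J)                   # direction x -> y
positive_share = sum(g_xy[J + 1:]) / sum(g_xy)
print([round(w, 2) for w in g_xy], round(positive_share, 3))
```

In this example nearly all of the estimated IRF mass falls on positive lags, which is the kind of signature used throughout the paper to read off a potential causal direction.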
In addition to causality studies of synthetic examples, the theoretical framework
was also applied to three real-world geophysical examples. In the first case, the causing
of runoff by rainfall over a catchment is well established. The proposed framework
indicates the correct direction of causation, even while suggesting the need for a nonlinear
transformation and the need to extend the size of the domain of the IRF. The framework
was further validated in its ability to detect the clear causal connection between Southern
Oscillation Index and global mean annual temperature.
The remaining real-world case study led to an important side product of the current
research. This is the surprising finding that, while in general the causal relationship of
atmospheric T and CO₂ concentration, as obtained by proxy data, appears to be of
hen-or-egg type with principal direction T → [CO₂], in the recent decades the more accurate
modern data support a conclusion that this principal direction has become exclusive. In
other words, it is the increase of temperature that caused increased CO₂ concentration.
Though this conclusion may sound counterintuitive at first glance, because it contradicts
common perception (and for this reason we have assessed the case with an alternative
parametric methodology in the Supplementary Information, section SI2.4, with results
confirming those presented here), in fact it is reasonable. The temperature increase began
at the end of the Little Ice Period, in the early 19th century, when human CO₂ emissions
were negligible; hence other factors, such as the solar activity (measured by sunspot
numbers), as well as internal long-range mechanisms of the complex climatic systems had
to play their roles.
A possible physical mechanism for the [CO₂] increase, as a result of temperature
increase, was proposed by Koutsoyiannis and Kundzewicz (2020) and involves
biochemical reactions, as, at higher temperatures, soil respiration, and hence natural CO₂
emissions, are increasing. In addition, as pointed out by Liu et al. (2017), the influence of
El Niño on climate is accompanied by large changes to the carbon cycle, with the
pantropical biosphere releasing much more carbon into the atmosphere during large El
Niño occurrences. Notably, in a very recent paper, Goulet Coulombe and Göbel (2021)
seem to confirm the finding by Koutsoyiannis and Kundzewicz (2020), yet they deem it
an “apparently counterintuitive finding that GMTA [global mean surface temperature
anomalies] explains a larger portion of the forecast error variance of CO₂ than vice versa”.
To “resolve” it, they “explore a last avenue, that of using annual CO₂ emissions”. However,
using anthropogenic CO₂ emissions, which contribute only a small portion (3.8%) to the
global carbon cycle (Koutsoyiannis, 2021), as a principal variable is definitely less
meaningful than using the atmospheric CO₂ concentration.
We believe that counterintuitive results, such as those about the causal link between
temperature and CO₂ concentration conveyed in this paper, can indeed open up avenues
of research. However, these avenues of research might not resolve the issue in a way
compatible with what intuition dictates. In the history of science, such avenues were often
created when established ideas were overturned by new findings.
By letting the geophysical records speak for themselves, with the help of our original
methodology, we discovered a regularity that apparently contradicts common opinion.
Our innovative findings should be given considerable attention as well as careful and
critical scrutiny in the form of public discussion by the scientific community, which will
undoubtedly improve understanding. If the methodology we proposed in the companion
paper (Koutsoyiannis et al., 2022) stands up to scrutiny, then our novel, high-impact
results, i.e. those of cases #23 to #28 in the present paper, will have to be taken seriously
and interpreted. Further research on the regularities of the causal behaviour of the
climate system reported herein, being of considerable importance and relevance, is
urgently needed.
Acknowledgments: The constructive comments of two anonymous reviewers helped us to substantially
improve our work. We also thank the Board Member Graham Hughes for the processing of the paper and
the favourable decision, as well as Keith Beven for his advice on a preliminary version of the manuscript.
Funding: This research received no external funding but was motivated by the scientific curiosity of the
authors.
Conflicts of Interest: We declare no conflict of interest.
Data availability: All data used in the paper are available online for free. The rainfall and runoff data were
obtained from the U.S. Geological Survey database, publicly available at
https://nwis.waterdata.usgs.gov/md/nwis/uv/?site_no=01603000&agency_cd=USGS (accessed June
2021). The temperature time series and the Mauna Loa CO₂ time series are readily available on monthly
scale from http://climexp.knmi.nl (accessed June 2021). The palaeoclimatic data of Vostok CO₂ were
retrieved from http://cdiac.ess-dive.lbl.gov/ftp/trends/co2/vostok.icecore.co2 (dated January 2003,
accessed September 2018) and the temperature data from
http://cdiac.ess-dive.lbl.gov/ftp/trends/temp/vostok/vostok.1999.temp.dat (dated January 2000, accessed September
2018). The Southern Oscillation Index (SOI) data are provided by National Centers for Environmental
Information (NCEI) of the USA’s National Oceanic and Atmospheric Administration (NOAA)
https://www.ncdc.noaa.gov/teleconnections/enso/indicators/soi/.
References
Arrhenius, S. 1896 On the influence of carbonic acid in the air upon the temperature of the ground. Lond.
Edinb. Dublin Philos. Mag. J. Sci. 41, 237-276, doi:10.1080/14786449608620846.
Beven, K., 2019 Validation and equifinality. In Computer Simulation Validation Fundamental Concepts,
Methodological Frameworks, and Philosophical Perspectives, Beisbart, C. & Saam, N. J. (eds), 791-809,
Springer, Cham.
Bode, H. W. & Shannon, C. E. 1950 A simplified derivation of linear least square smoothing and prediction
theory. Proc. of the IRE, 38(4), 417-425.
Caillon, N., Severinghaus, J. P., Jouzel, J., Barnola, J. M., Kang, J. & Lipenkov, V. Y. 2003 Timing of atmospheric
CO₂ and Antarctic temperature changes across Termination III. Science 299, 1728-1731.
Christy, J. R., Norris, W. B., Spencer, R. W. & Hnilo, J. J. 2007 Tropospheric temperature change since 1979
from tropical radiosonde and satellite measurements. J. Geophys. Res. 112, D06102,
doi:10.1029/2005JD006881.
Cox, D. R. 1992 Causality: Some statistical aspects. J. Roy. Stat. Soc. A 155(2), 291-301.
Goulet Coulombe, P. & Göbel, M. 2021 On spurious causality, CO₂, and global temperature. Econometrics
9(3), 33, doi:10.3390/econometrics9030033.
Jouzel, J., Lorius, C., Petit, J. R., Genthon, C., Barkov, N. I., Kotlyakov, V. M. & Petrov, V. M. 1987 Vostok ice
core: a continuous isotope temperature record over the last climatic cycle (160,000 years). Nature 329,
403-408.
Kaplan, A. 2011 Patterns and indices of climate variability. In: State of the Climate in 2010. Bull. Amer.
Meteor. Soc. 92(6), S20-S26.
Keeling, C. D., Bacastow, R. B., Bainbridge, A. E., Ekdahl, C. A., Guenther, P. R. & Waterman, L. S. 1976
Atmospheric carbon dioxide variations at Mauna Loa Observatory, Hawaii. Tellus 28, 538-551.
Koutsoyiannis, D. 2010 A random walk on water. Hydrol. and Earth System Sci. 14, 585-601,
doi:10.5194/hess-14-585-2010.
Koutsoyiannis, D. 2014 Entropy: from thermodynamics to hydrology. Entropy 16(3), 1287-1314,
doi:10.3390/e16031287.
Koutsoyiannis, D. 2016 Generic and parsimonious stochastic modelling for hydrology and beyond. Hydrol.
Sci. J. 61(2), 225-244, doi:10.1080/02626667.2015.1016950.
Koutsoyiannis, D. 2017 Entropy production in stochastics. Entropy 19(11), 581, doi:10.3390/e19110581.
Koutsoyiannis, D. 2019 Time’s arrow in stochastic characterization and simulation of atmospheric and
hydrological processes. Hydrol. Sci. J. 64(9), 1013-1037, doi:10.1080/02626667.2019.1600700.
Koutsoyiannis, D. 2020a Simple stochastic simulation of time irreversible and reversible processes. Hydrol.
Sci. J. 65(4), 536-551, doi:10.1080/02626667.2019.1705302.
Koutsoyiannis, D. 2020b Revisiting the global hydrological cycle: is it intensifying? Hydrol. and Earth System
Sci. 24, 3899-3932, doi:10.5194/hess-24-3899-2020.
Koutsoyiannis, D. 2021 Rethinking climate, climate change, and their relationship with water. Water 13(6),
849, doi:10.3390/w13060849.
Koutsoyiannis, D. & Kundzewicz, Z. W. 2020 Atmospheric temperature and CO₂: Hen-or-egg causality? Sci
2(4), 83, doi:10.3390/sci2040083.
Koutsoyiannis, D. & Montanari, A. 2022 Bluecat: A local uncertainty estimator for deterministic simulations
and predictions. Water Resour. Res., 58(1), e2021WR031215, doi: 10.1029/2021WR031215.
Koutsoyiannis, D., Onof, C., Christofidis, A. & Kundzewicz, Z. W. 2022 Revisiting causality using stochastics:
1. Theory. Proceedings of the Royal Society A, this issue.
Kundzewicz, Z. W., Pińskwar, I. & Koutsoyiannis, D. 2020 Variability of global mean annual temperature is
significantly influenced by the rhythm of ocean-atmosphere oscillations. Science of the Total
Environment 747, 141256, doi:10.1016/j.scitotenv.2020.141256.
Liang, X.S. 2022 The causal interaction between complex subsystems. Entropy 24(1), 3, doi:
10.3390/e24010003.
Liu, J., Bowman, K. W., Schimel, D. S., Parazoo, N. C., Jiang, Z., Lee, M., Bloom, A. A., Wunch, D., Frankenberg, C.,
Sun, Y. & O’Dell, C. W. 2017 Contrasting carbon cycle responses of the tropical continents to the
2015-2016 El Niño. Science 358(6360), doi:10.1126/science.aam5690.
McPhaden, M. J., Zebiak, S. E. & Glantz, M. H. 2006 ENSO as an Integrating Concept in Earth Science. Science
314, 1740-1745.
Montanari, A. & Koutsoyiannis, D. 2012 A blueprint for process-based modeling of uncertain hydrological
systems. Water Resources Research, 48, W09555, doi:10.1029/2011WR011412.
Petit, J. R., Jouzel, J., Raynaud, D., Barkov, N. I., Barnola, J.-M. et al. 1999 Climate and atmospheric history of
the past 420,000 years from the Vostok ice core, Antarctica. Nature 399, 429–436.
Spencer, R. W. & Christy, J. R. 1990 Precise monitoring of global temperature trends from satellites. Science
247, 1558–1562.
Young, P. C. 2011 Recursive Estimation and Time Series Analysis. Berlin, Heidelberg: Springer-Verlag.
Young, P. C. 2015 Refined instrumental variable estimation: Maximum likelihood optimization of a unified
Box-Jenkins model. Automatica 52, 35–46.