All content in this area was uploaded by Pouya Manshour on May 02, 2022
arXiv:2204.11731v1 [physics.data-an] 25 Apr 2022
Compression-Complexity with Ordinal Patterns for Robust Causal Inference in
Irregularly-Sampled Time Series
Aditi Kathpalia,∗ Pouya Manshour, and Milan Paluš†
Department of Complex Systems,
Institute of Computer Science of the Czech Academy of Sciences,
Prague, Czech Republic
(Dated: April 26, 2022)
Distinguishing cause from effect is a scientific challenge resisting solutions from mathematics,
statistics, information theory and computer science. Compression-Complexity Causality (CCC) is
a recently proposed interventional measure of causality, inspired by Wiener-Granger’s idea. It es-
timates causality based on change in dynamical compression-complexity (or compressibility) of the
effect variable, given the cause variable. CCC works with minimal assumptions on given data and
is robust to irregular-sampling, missing-data and finite-length effects. However, it only works for
one-dimensional time series. We propose an ordinal pattern symbolization scheme to encode multi-
dimensional patterns into one-dimensional symbolic sequences, and thus introduce the Permutation
CCC (PCCC), which retains all advantages of the original CCC and can be applied to data from
multidimensional systems with potentially hidden variables. PCCC is tested on numerical simu-
lations and applied to paleoclimate data characterized by irregular and uncertain sampling and
limited numbers of samples.
I. INTRODUCTION
Unraveling systems’ dynamics from the analysis of observed data is one of the fundamental goals of many areas of natural and social sciences. In this respect, detecting the direction of interactions or inferring causal relationships among observables is of particular importance, as it can improve our ability to understand the underlying dynamics and to predict or even control such complex systems [1, 2].
In the roughly sixty years since the pioneering work of Wiener and Granger [3, 4] on quantifying linear ‘causality’ from observations, their approach has been widely applied not only in economics [5–7], for which it was first introduced, but also in various fields of natural sciences, from neurosciences [8] to Earth sciences [9–11]. A number of attempts have been made to generalize Granger Causality (GC) to nonlinear cases, using, e.g., an estimator based on the correlation integral [6], a non-parametric regression approach [12], local linear predictors [13], mutual nearest neighbors [14, 15] and kernel estimators [16], to name a few. Several other causality methods based on the GC principle, such as Partial Directed Coherence [17], Direct Transfer Function [18] and Modified Direct Transfer Function [19], have also been proposed.
Information theory has proved itself a powerful approach to causal inference. In this respect, Schreiber proposed a method for measuring information transfer among observables [20], known as Transfer Entropy (TE), which is based on the Kullback-Leibler distance between transition probabilities. Paluš et al. [21] introduced a causality measure based on mutual information, called Conditional Mutual Information (CMI). CMI has been shown to be equivalent to TE [22]. These tools have been applied in various research studies and have shown their power in extracting causal relationships between different systems [23–27].
∗ kathpalia@cs.cas.cz
† mp@cs.cas.cz
We usually work with time series x(t) and y(t) as realizations of m- and n-dimensional dynamical systems, X(t) and Y(t) respectively, evolving in measurable spaces. This means that x(t) and y(t) can be considered as components of these m- and n-dimensional vectors. In many cases, only one dimension of the phase space is observable; recordings or knowledge of variables which may have indirect effects or act as mediators in the causal interactions between observables may not be available. In this respect, phase-space reconstruction is a common and useful approach introduced by Takens [28], which reconstructs the dynamics of the entire system (including other unknown/unmeasurable variables) using time-delay embedding vectors, as follows: the manifold of an m-dimensional state vector X can be reconstructed as X(t) = {x(t), x(t−η), ..., x(t−(m−1)η)}. Here, η is the embedding delay, which can be obtained using the embedding construction procedure based on the first minimum of the mutual information [29]. Some causality estimators have applied this phase-space reconstruction procedure to improve their causal inference power, such as high-dimensional CMI [26] and TE [30]. Other causality measures, such as Convergent Cross Mapping (CCM) [31], Topological Causality [32] and Predictability Improvement [33], are based directly on the reconstruction of dynamical systems.
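The time-delay embedding X(t) = {x(t), x(t−η), ..., x(t−(m−1)η)} described above can be illustrated with a minimal sketch. The function name `delay_embed` and the toy series are hypothetical, chosen only for illustration; the selection of η via the first minimum of mutual information is not shown.

```python
def delay_embed(x, m, eta):
    """Return the delay vectors [x(t), x(t-eta), ..., x(t-(m-1)*eta)].

    One vector is produced for every time index t that has a full history,
    so the output has len(x) - (m - 1) * eta rows.
    """
    if m < 1 or eta < 1:
        raise ValueError("m and eta must be positive integers")
    start = (m - 1) * eta  # first index with m delayed samples available
    return [[x[t - k * eta] for k in range(m)] for t in range(start, len(x))]


# Toy series (illustrative values, not data from the paper).
series = [0.1, 0.5, 0.3, 0.9, 0.7, 0.2]
vectors = delay_embed(series, m=3, eta=2)
```

With m = 3 and η = 2, the first usable time index is 4, so a six-sample series yields two delay vectors.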
Vast amounts of data available in recent years have pushed some of the above-discussed GC extensions and information- and phase-space-reconstruction-based approaches forward, as they rely on joint probability density estimations, stationarity, Markovianity, topological or linear modeling. However, many temporal observations made in various domains such as climatology [34, 35], finance [36, 37] and sociology [38] are often short in length, have missing samples or are irregularly sampled. A significant challenge arises when we attempt to apply causality measures in such situations [11]. For instance, CMI or TE fail when applied to time series which are undersampled or have missing samples [39–41] and also in the case of time series with short lengths [41]. CCM and kernel-based non-linear GC also show poor performance even in the case of a few missing samples in bivariate simulated data [42].
Kathpalia and Nagaraj recently introduced a causal-
ity measure, called Compression-Complexity Causality
(CCC), which employs ‘complexity’ estimated using loss-
less data-compression algorithms for the purpose of
causality estimation. It has been shown to have the
strength to work well in case of missing samples in data
for bivariate systems of coupled autoregressive and tent
map processes. This has been shown to be the case for
samples which are missing in the two coupled time se-
ries either in a synchronous or asynchronous manner [41].
Also, it gives good performance for time series with short
lengths [41, 42]. These strengths of CCC arise from its
formulation as an interventional causality measure based
on the evolution of dynamical patterns in time series,
independence from joint probability density functions,
making minimal assumptions on the data and use of loss-
less compression based complexity approaches which in
turn show robust performance on short and noisy time
series [41, 43]. However, as discussed in [42], a direct mul-
tidimensional extension of CCC is not as straightforward
and so a measure of effective CCC has been formulated
and used on multidimensional systems of coupled autore-
gressive processes with limited number of variables.
On the other hand, a method for symbolization
of phase-space reconstructed (embedded) processes has
been used to improve the ability of info-theoretic causal-
ity measures for noisy data, such as symbolic transfer en-
tropy [44, 45], partial symbolic transfer entropy [46, 47],
permutation conditional mutual information (PCMI) [48]
and multidimensional PCMI [49]. The symbolization
technique used in these works is based on the Bandt
and Pompe scheme for estimation of Permutation En-
tropy [50], and often referred to as permutation or or-
dinal patterns coding. The scheme labels the embedded
values of time-series at each time point in ascending order
of their magnitude. Symbols are then assigned at each
time point depending on the ordering of values (or the
labelling sequence) at that point. Ordinal patterns have
been used extensively in the analysis and prediction of
chaotic dynamical systems and also shown to be robust
in applications to real world time series. By construc-
tion, this technique ignores the amplitude information
and thus decreases the effect of high fluctuations in data
on the obtained causal inference [51]. Other benefits of
permutation patterns are: they naturally emerge from
the time series and so the method is almost parameter-
free; are invariant to monotonic transformations of the
values; keep account of the causal order of temporal
values and the procedure is computationally inexpen-
sive [52–55]. Ordinal partition has been shown to have
the generating property under specific conditions, imply-
ing topological conjugacy between phase space of dynam-
ical systems and their ordinal symbolic dynamics [56].
Further, permutation entropy for certain sets of systems
has been shown to have a theoretical relationship to the
system’s Lyapunov exponents and Kolmogorov-Sinai Entropy [57, 58]. Given all these beneficial properties of permutation patterns, it is no wonder that the development of symbolic TE and PCMI made these information-theoretic measures more robust, giving better performance in the case of noisy measurements, simplifying the process of parameter selection and making fewer demands on the data.
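The ordinal pattern coding described above (label the embedded values at each time point by ascending magnitude, then assign a symbol to each ordering) can be sketched as follows. This is a minimal scalar illustration under stated assumptions, not the exact encoding used for PCCC: the function name `ordinal_symbols`, the lexicographic numbering of permutations and the tie-breaking rule are all choices made here for the example.

```python
from itertools import permutations


def ordinal_symbols(x, m=3, eta=1):
    """Encode a scalar series as integer symbols, one per delay vector."""
    # Illustrative symbol table (an assumption): each permutation of 0..m-1
    # is numbered by its position in lexicographic order.
    table = {p: i for i, p in enumerate(permutations(range(m)))}
    start = (m - 1) * eta  # first time index with a complete delay vector
    symbols = []
    for t in range(start, len(x)):
        window = [x[t - k * eta] for k in range(m)]
        # Positions of the window sorted by ascending value; ties are broken
        # by position, which is one of several possible conventions.
        order = tuple(sorted(range(m), key=lambda k: (window[k], k)))
        symbols.append(table[order])
    return symbols


series = [4, 7, 9, 10, 6, 11, 3]
symbols = ordinal_symbols(series, m=3, eta=1)
# A strictly increasing transformation leaves the symbols unchanged,
# because only the ordering of values enters the code.
same = ordinal_symbols([v ** 3 for v in series], m=3, eta=1) == symbols
```

The comparison at the end illustrates the invariance to monotonic transformations mentioned in the text: cubing the values changes the amplitudes but not the ordinal symbols.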
In this work, we propose the use of the CCC approach with reconstructed dynamical systems which are symbolized using ordinal patterns. The combination of the strengths of CCC and ordinal patterns not only makes CCC applicable to dynamical systems with multidimensional variables; we also observe that the proposed Permutation CCC (PCCC) measure performs well on datasets with very short lengths and high levels of missing samples. The performance of PCCC is compared with that of PCMI (which is identical to symbolic TE), bivariate CCC and CMI on simulated dynamical systems data. PCCC outperforms the existing approaches, and its estimates are found to be robust for short time series and high levels of missing data points.
This development opens up, for the first time, avenues for the use of causality estimation tools on real-world datasets from climate and paleoclimate science, finance and other fields where data with irregular and/or uncertain sampling times are prevalent. Determining the major drivers of climate is the need of the hour, as climate change poses a big challenge to humankind and our planet Earth [59]. Different studies have employed correlation/coherence, causality methods or modelling approaches to study the interaction between climatic processes. The results produced by different studies differ and are sometimes contradictory, presenting an ambiguous situation. We apply PCCC to analyse the causal relationship between the following sets of climatic processes: greenhouse gas concentrations – atmospheric temperature, El Niño Southern Oscillation – South Asian monsoon, and North Atlantic Oscillation – European temperatures, at different time-scales, and compare its performance with bivariate CCC, bivariate and multidimensional CMI, and PCMI. The time series available for most of these processes are short in length and sometimes have missing samples and/or are sampled at irregular intervals of time. We expect our estimates to be reliable and to help resolve the ambiguity presented by existing studies.
II. RESULTS
Simulation Experiments: Time series data from a pair of unidirectionally coupled Rössler systems were generated as per the following equations:

\begin{equation}
\begin{aligned}
\dot{x}_1 &= -\omega_1 y_1 - z_1, \\
\dot{y}_1 &= \omega_1 x_1 + a_1 y_1, \\
\dot{z}_1 &= b_1 + z_1 (x_1 - c_1),
\end{aligned}
\tag{1}
\end{equation}

for the autonomous or master system, and

\begin{equation}
\begin{aligned}
\dot{x}_2 &= -\omega_2 y_2 - z_2 + \epsilon (x_1 - x_2), \\
\dot{y}_2 &= \omega_2 x_2 + a_2 y_2, \\
\dot{z}_2 &= b_2 + z_2 (x_2 - c_2),
\end{aligned}
\tag{2}
\end{equation}

for the response or slave system. Parameters were set as: a1 = a2 = 0.15, b1 = b2 = 0.2, c1 = c2 = 10.0, and frequencies set as: ω1 = 1.015 and ω2 = 0.985. The coupling parameter, ϵ, was fixed to 0.09. The data were generated by numerical integration based on the adaptive Bulirsch-Stoer method [60] using a sampling interval of 0.314 for both the master and slave systems. This procedure gives 17–21 samples per period. 100 realizations of these systems were simulated and the initial 5000 transients were removed before using the data for testing experiments.
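A minimal sketch of generating such data is given below, using the parameter values stated above. Note the swapped-in integrator: the paper uses an adaptive Bulirsch-Stoer method, whereas this sketch uses a fixed-step fourth-order Runge-Kutta scheme with several substeps per sampling interval, purely to stay dependency-free. The function names, the initial condition and the number of substeps are assumptions made for illustration.

```python
import math


def coupled_rossler_rhs(s, eps=0.09, w1=1.015, w2=0.985, a=0.15, b=0.2, c=10.0):
    """Right-hand side of Eqs. (1) and (2); s = (x1, y1, z1, x2, y2, z2)."""
    x1, y1, z1, x2, y2, z2 = s
    return (
        -w1 * y1 - z1,
        w1 * x1 + a * y1,
        b + z1 * (x1 - c),
        -w2 * y2 - z2 + eps * (x1 - x2),  # coupling: x1 drives x2
        w2 * x2 + a * y2,
        b + z2 * (x2 - c),
    )


def rk4_step(f, s, h):
    """One classical Runge-Kutta step of size h (not Bulirsch-Stoer)."""
    k1 = f(s)
    k2 = f(tuple(v + 0.5 * h * k for v, k in zip(s, k1)))
    k3 = f(tuple(v + 0.5 * h * k for v, k in zip(s, k2)))
    k4 = f(tuple(v + h * k for v, k in zip(s, k3)))
    return tuple(
        v + h / 6.0 * (q1 + 2.0 * q2 + 2.0 * q3 + q4)
        for v, q1, q2, q3, q4 in zip(s, k1, k2, k3, k4)
    )


def simulate(n_samples, dt=0.314, substeps=20, n_transient=5000,
             s0=(1.0, 1.0, 0.1, 0.5, 1.5, 0.1)):
    """Return (x1, x2) sampled every dt after discarding the transient."""
    h = dt / substeps
    s = s0  # hypothetical initial condition
    x1, x2 = [], []
    for i in range(n_transient + n_samples):
        for _ in range(substeps):
            s = rk4_step(coupled_rossler_rhs, s, h)
        if i >= n_transient:
            x1.append(s[0])
            x2.append(s[3])
    return x1, x2


# Small run for illustration; the paper discards 5000 transient samples.
x1, x2 = simulate(n_samples=200, n_transient=300, substeps=10)
```

Different realizations could be obtained by varying the initial condition; how the paper's 100 realizations were initialized is not stated in this section.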
As can be seen from the equations, there is a coupling between x1 and x2, with x1 influencing x2. The analysis of the causal influence between the two systems was done using the causality estimation measures: bivariate or scalar CCC, CMI, PCCC and PCMI for the cases outlined in the following paragraphs. The estimation procedure for each of the methods is described in the ‘Methods’ section. The values of parameters used for each of the methods are also given in the ‘Methods’ section (Table II).
Finite length data: The length of time series, N, of x1 and x2 taken from the coupled Rössler systems was varied as shown in Fig. 1. The estimation for CMI and PCMI is done up to a higher value of length, as CMI did not give optimal performance until the length reached 32,768 samples. Fig. 1(c) shows scalar (simple bivariate) CMI or one-dimensional CMI (CMI1) between x1 and x2 (see Paluš and Vejmelka [22]). This method has high sensitivity but suffers from low specificity. This problem is solved by using conditional CMI or three-dimensional CMI (CMI3), where the information from the other variables (y1, z1, y2, z2) is incorporated in the estimation. Its performance is depicted in Fig. 1(e). However, it requires longer time series for optimal performance. Fig. 1(a) shows the performance of scalar (or simple bivariate) CCC, which is equivalent to the CMI1 case, considering dimensionality. Figs. 1(b) and 1(d) show the performance of PCCC and PCMI respectively. For each length level, all 100 realizations of the coupled systems were considered, and 100 surrogates were generated for each realization in order to perform significance analysis of the causality estimated (in both directions) from each realization of the coupled processes. These surrogates were generated for both processes using the Amplitude Adjusted Fourier Transform method [61], and significance testing was done using a standard one-sided z-test at a significance level of 0.05 (this was justified as the distributions of surrogates for the CCC and CMI methods implemented were found to be Gaussian). Based on this significance analysis, the true positive rate (TPR) and false positive rate (FPR) were computed at each length level. A true positive is counted for a particular realization of the coupled systems when causality estimated from x1 to x2 is found to be significant, and a false positive is counted when causality estimated from x2 to x1 is found to be significant.

FIG. 1. Specificity and sensitivity of methods with varying length. True positive rate (or rate of significant causality estimated from x1 → x2) and false positive rate (or rate of significant causality estimated from x2 → x1), using measures (a) scalar CCC (CCC), (b) permutation CCC (PCCC), (c) scalar CMI (CMI1), (d) permutation CMI (PCMI) and (e) three-dimensional CMI (CMI3), as the length of time series, N, is varied.
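The significance test and the TPR/FPR bookkeeping described above can be sketched as follows. This is an illustration under assumptions: the causality values and surrogate values are taken as already computed (the CCC/CMI estimators and the surrogate generator are not shown), and the names `is_significant` and `rates` are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def is_significant(value, surrogate_values, alpha=0.05):
    """One-sided z-test of a causality value against its surrogates.

    Assumes the surrogate values are approximately Gaussian, as reported
    in the text for the CCC and CMI implementations.
    """
    mu = mean(surrogate_values)
    sigma = stdev(surrogate_values)
    if sigma == 0.0:
        return value > mu  # degenerate surrogate distribution
    z = (value - mu) / sigma
    p_value = 1.0 - NormalDist().cdf(z)  # upper-tail probability
    return p_value < alpha


def rates(results):
    """results: one (significant x1->x2, significant x2->x1) pair per realization."""
    n = len(results)
    tpr = sum(forward for forward, _ in results) / n   # true coupling direction
    fpr = sum(backward for _, backward in results) / n  # reverse direction
    return tpr, fpr


# Toy numbers for illustration only (not results from the paper).
surrogates = [0.00, 0.01, 0.02, 0.03, 0.04]
strong = is_significant(0.06, surrogates)
weak = is_significant(0.03, surrogates)
tpr, fpr = rates([(True, False), (True, True), (False, False), (True, False)])
```

In the experiments, each realization contributes one pair of decisions, so TPR and FPR at a given length are averages over the 100 realizations.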
As can be seen from the plots, direct application of scalar CCC fails completely on multidimensional dynamical systems data, yielding low true positives and high false positives. Hence the method displays poor sensitivity as well as specificity. CMI1 also shows poor performance, yielding high false positives. CMI3, which is appropriate for multi-dimensional data, only begins to give good performance when the length of time series is greater than 32,768 samples. On the other hand, PCCC begins to give high true positives and low false positives as the length of time series is increased to 1024 time points, with TPR and FPR reaching almost 1 and 0 respectively as the length is increased to 2048 time points. The use of permutation patterns also improves the performance of CMI3 for short data: PCMI begins to show a TPR of 1 and an FPR of 0 for time series of length 2048.
We did further experiments with simulated Rössler data by varying the amount of noise and missing samples in the data. For these cases, the performance of PCCC and PCMI alone was evaluated, because it can be seen from the ‘varying length’ experiments that scalar CCC and CMI1 do not work for multidimensional dynamical systems data and CMI3 does not perform well for short data.
Noisy data: White Gaussian noise was added to the simulated Rössler data. The amount of noise added was relative to the standard deviation of the data: the noise standard deviation (σn) is expressed as a percentage of the standard deviation of the original data (σs). For example, 20% of noise means σn = 0.2σs, and 100% of noise means σn = σs. The length of time series taken for this experiment was fixed to 2048. For each realization of noisy data as well, 100 surrogate time series were generated using the Amplitude Adjusted Fourier Transform method, and significance testing was performed as before using the z-test. Figs. 2(a) and 2(b) show the results for varying noise in the data for the measures PCCC and PCMI respectively.
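The noise scaling described above, σn expressed as a percentage of σs, can be sketched as below. The function name `add_relative_noise`, the use of the population standard deviation and the sine-wave test signal are assumptions made for illustration.

```python
import math
import random
from statistics import pstdev


def add_relative_noise(x, percent, rng):
    """Add white Gaussian noise with sigma_n = (percent / 100) * sigma_s."""
    sigma_s = pstdev(x)  # standard deviation of the clean signal
    sigma_n = (percent / 100.0) * sigma_s
    return [v + rng.gauss(0.0, sigma_n) for v in x]


rng = random.Random(0)  # seeded for reproducibility
clean = [math.sin(0.1 * k) for k in range(5000)]  # stand-in signal
noisy = add_relative_noise(clean, 20, rng)

# Empirical check: the added component should have about 0.2 * sigma_s.
added = [n - c for n, c in zip(noisy, clean)]
ratio = pstdev(added) / pstdev(clean)
```

For a 20% noise level the ratio should come out close to 0.2, up to sampling fluctuation.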
It can be seen that PCCC performs well for low levels
of noise, up to 10%, but at higher levels of noise, its
performance begins to deteriorate. PCMI, on the other
hand, shows high TPR and low FPR even as the noise
level is increased to 50%.
Sparse data: We refer to time series with missing samples as sparse data. Sparsity, or non-uniformly missing samples, was introduced in the data in two ways: (1) synchronous sparsity and (2) asynchronous sparsity. In case (1), samples were missing from both x1 and x2 at randomly chosen time indices, and this set of time indices was the same for both x1 and x2. In case (2), samples were missing from both x1 and x2 based on two different sets of randomly chosen time indices, that is, the time indices of missing samples were different for x1 and x2. The amount of synchronous/asynchronous sparsity is expressed in terms of the percentage of missing samples relative to the original length of time series taken. αsync and αasync refer to the level of missing samples for the cases of synchronous and asynchronous sparsity respectively, and are given by m/N, where m is the number of missing samples and N is the original length of time series. N was fixed to 2048. The time series became shorter as the percentage of missing samples was increased. Causality estimation measures were applied to the data without any knowledge of whether any samples were missing or of the time stamps at which the samples were missing. Surrogate data generation for each realization in this case was done not after the introduction of missing samples but prior to that, using the original length time series. Sparsity was then introduced in the surrogate time series in a manner similar to that for the original time series.

FIG. 2. Specificity and sensitivity of methods with varying noise and sparsity. True positive rate (or rate of significant causality estimated from x1 → x2) and false positive rate (or rate of significant causality estimated from x2 → x1), using measures permutation CCC (PCCC) (left column) and permutation CMI (PCMI) (right column) as the level of noise: (a) and (b); level of synchronous sparsity: (c) and (d); and asynchronous sparsity: (e) and (f), are varied.
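The two ways of introducing missing samples can be sketched as follows, with α = m/N as defined above. The function names and the rounding of αN to an integer are assumptions for illustration; the remaining samples are simply concatenated, reflecting that the estimators receive no information about where samples were dropped.

```python
import random


def _drop(x, missing):
    """Return x with the samples at the given indices removed."""
    return [v for i, v in enumerate(x) if i not in missing]


def synchronous_sparsity(x1, x2, alpha, rng):
    """Remove the same randomly chosen time indices from both series."""
    n = len(x1)
    missing = set(rng.sample(range(n), round(alpha * n)))
    return _drop(x1, missing), _drop(x2, missing)


def asynchronous_sparsity(x1, x2, alpha, rng):
    """Remove independently chosen time indices from each series."""
    n = len(x1)
    m = round(alpha * n)
    missing1 = set(rng.sample(range(n), m))
    missing2 = set(rng.sample(range(n), m))
    return _drop(x1, missing1), _drop(x2, missing2)


rng = random.Random(0)
N = 2048
# Use the time index itself as the value so surviving indices are visible.
t1, t2 = list(range(N)), list(range(N))
s1, s2 = synchronous_sparsity(t1, t2, 0.25, rng)
a1, a2 = asynchronous_sparsity(t1, t2, 0.25, rng)
```

With N = 2048 and α = 0.25, each series keeps 1536 samples; in the synchronous case the surviving time indices coincide, while in the asynchronous case they generally differ.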
Figs. 2(c) and 2(d) show the results obtained using
PCCC and PCMI respectively for synchronous sparsity.
Figs. 2(e) and 2(f) show the same for asynchronous spar-
sity. It can be seen that PCCC is robust to the introduc-
tion of missing samples, showing high TPR and low FPR.
FPR begins to be greater than 0.2 only when the level
of synchronous sparsity is increased to 25% and asyn-
chronous sparsity is increased to 20%. PCMI is robust
to low levels of synchronous sparsity but deteriorates be-
yond 5% of missing samples, giving low true positives.
It performs very poorly even with low levels of asyn-
chronous sparsity.
Real Data Analysis: As discussed in the Introduc-
tion, a number of climate datasets are either sampled at
irregular intervals, have missing samples, are sampled af-
ter long intervals of time or have a combination of two or
more of these issues. In addition, their temporal record-
ings available are short in length. We apply the proposed
method, PCCC, to some such datasets described below.
We also compare the results obtained with existing mea-
sures: scalar CCC, scalar CMI and PCMI.
Millennial scale CO2-temperature recordings: Mills et al. have compiled independent estimates of global average surface temperature and atmospheric CO2 concentration for the Phanerozoic eon. These paleoclimate proxy records span the last 424 million years [62] and have been used and made available in the study by Wong et al. [63]. One data point for each of the CO2 and temperature recordings was available for each million-year period and was used in our analysis to check for causal interaction between the two.
CO2, CH4 and temperature recordings over the last 800,000 years: In [64], the Past Interglacials Working Group of PAGES has made available proxy records of atmospheric CO2, CH4 and deepwater temperatures over the last 800 ka (1 ka = 1000 years). Each of these time series was reconstructed by separate studies, so the recordings available are non-synchronous and also irregularly sampled for each variable. Further, some data points are missing in the temperature time series. Roughly one data point is available per ka for each of the three variables. CO2 proxy data are based on Antarctic ice core composites. These were first reported in [65], and the revised values were made available in a study by Bereiter et al. [66]. Reconstructed atmospheric CH4 concentrations, also based on ice cores, were as reported in [67] (on the AICC2012 age scale [68]). Deepwater temperature recordings, obtained using shallow-infaunal benthic foraminifera (Mg/Ca ratios) that became available from Ocean Drilling Program (ODP) site 1123 on the Chatham Rise, east of New Zealand, were reported in [69].
Causal influence was checked between CO2-temperature and separately between CH4-temperature. CO2 and CH4 data are taken beginning from the 6.5th ka on the AICC2012 scale and temperature data are taken beginning from the 7th ka. Since the number of data points available for temperature is 792, the CO2-temperature analysis was done based on these 792 samples, and as the number of samples of CH4 is limited to 756 beginning from the 6.5th ka, the CH4-temperature analysis was done using these 756 data points.
Monthly CO2-temperature dataset: Monthly mean CO2 data constructed from mean daily CO2 values, as well as the Northern Hemisphere’s combined land and ocean temperature anomalies for the monthly timescale, are available open source on the National Oceanic and Atmospheric Administration (NOAA) website. The CO2 measurements were made at the Mauna Loa Observatory, Hawaii. A part of the CO2 dataset (March 1958-April 1974) was originally obtained by C. David Keeling of the Scripps Institution of Oceanography and is available on the Scripps website. NOAA started its own CO2 measurements in May 1974. The temperature anomaly dataset is constructed from the Global Historical Climatology Network-Monthly data set [70] and the International Comprehensive Ocean-Atmosphere Data Set, also available on the NOAA website. These data from March 1958 to June 2021 (with 760 data points) were used to check for the causal influence between CO2 and temperature on the recent timescale. Both time series were differenced using consecutive values as they were highly non-stationary.
Yearly ENSO-SASM dataset: The 1,100 Year El Niño/Southern Oscillation (ENSO) Index Reconstruction dataset, made available open source on the NOAA website and originally published in [71], was used in this study. The South Asian Summer Monsoon (SASM) Index 1100 Year Reconstruction dataset, also available open source on the NOAA website and originally published in [72], was the second variable used here. The aim of our study was to check the causal dependence between these two sets of recordings taken from the year 900 AD to 2000 AD (with one data point being available for each year).
Monthly NINO-Indian monsoon dataset: Monthly
NINO 3.4 SST Index recordings from the year 1870 to
2021 are available open source on the NOAA website.
Its details are published in [73]. All India monthly rain-
fall dataset from 1871-2016, available on the official web-
site of World Meteorological Organization and originally
acquired from ‘Indian Institute of Tropical Meteorology’,
was used for analysis. These recordings are in the units of
mm/month. Causal influence was checked between these
two recordings using 1752 data points, ranging from the
month January, 1871 to December, 2016.
Monthly NAO-temperature recordings: Reconstructed
monthly North Atlantic Oscillation (NAO) index record-
ings from December 1658 to July 2001 are available
open source on the NOAA website. The reconstruc-
tions from December 1658 to November 1900 are taken
from [74, 75] and from December 1900 to July 2001 are
derived from [76]. Central European 500 year tempera-
ture reconstruction dataset, beginning from 1500 AD, is
made available open source by NOAA National Centers
for Environmental Information, under the World Data
Service for Paleoclimatology. These were derived in the
study [77]. We took winter only data points (months
December, January and February) starting from the De-
cember of 1658 to the February of 2001 as it is known that
the NAO influence is strongest in winter. This yielded a
total of 1029 data points. However, reconstruction based
on embedding was done for each year’s winter separately
(with a time delay of 1) and not in a continuous manner
as for the other datasets, reducing the length of ordinal
patterns encoded sequence to 343. Causal influence was
checked between NAO and temperature for the encoded
sequences using PCMI and PCCC and directly using one-
dimensional CMI and CCC for the 1029 length sequences.
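The per-winter embedding described above can be illustrated with a short sketch. The reduction from 1029 data points to 343 patterns is consistent with one ordinal pattern per three-month winter, i.e. an embedding dimension of 3 with a delay of 1; the dimension is inferred here from those counts, since the actual parameters are listed in Table II of the ‘Methods’ section. The function names and the synthetic winter values are hypothetical.

```python
def ordinal_pattern(window):
    """Positions of the window sorted by ascending value (ties by position)."""
    return tuple(sorted(range(len(window)), key=lambda k: (window[k], k)))


def encode_by_block(blocks, m, eta=1):
    """Embed and encode each block separately, never mixing two blocks."""
    patterns = []
    start = (m - 1) * eta
    for block in blocks:
        for t in range(start, len(block)):
            window = [block[t - k * eta] for k in range(m)]
            patterns.append(ordinal_pattern(window))
    return patterns


# 343 synthetic winters of three monthly values each (Dec, Jan, Feb).
winters = [[(17 * y + 5 * k) % 7 for k in range(3)] for y in range(343)]
per_winter = encode_by_block(winters, m=3)

# For contrast: embedding the concatenated series continuously would also
# create patterns that straddle two different winters.
flat = [v for w in winters for v in w]
continuous = encode_by_block([flat], m=3)
```

Block-wise encoding yields 343 patterns from 1029 points, whereas continuous embedding of the same points would yield 1027 patterns, most of which span two winters.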
Daily NAO-temperature recordings: Daily NAO records are available on the NOAA website and have been published in [78–80]. Daily mean surface air temperature data from the Frankfurt station in Germany were taken from the records made available online by the ECA&D project [81]. These data were taken from 1st January 1950 to 31st April 2021. Once again, daily values from the winter months alone (December, January and February), comprising 6390 data points, were extracted for the analysis. While embedding the two time series, care was taken not to embed the recordings of winter from one year along with those of winter from the next year. Causal influence was checked between the daily winter NAO and temperature time series.
For the analysis of causal interaction in each of these datasets, scalar CCC and CMI as well as PCCC and PCMI were computed as discussed in the ‘Methods’ section. Parameters used for each of the methods are also given in the ‘Methods’ section (Table II). In order to assess the significance of the causality value estimated using each measure, 100 surrogate realizations were generated using the stationary bootstrap method [82] for both the time series under consideration. In this method, surrogate time series are obtained by resampling blocks of observations of random length from the original time series. The length of each block has a geometric distribution. The probability parameter that determines the geometric probability distribution for the length of each block was set to 0.1 (as suggested in [82]). Significance testing of the causal interaction between the original time series was then done using a standard one-sided z-test at a significance level of 0.05. Table I shows whether causal influence between the considered variables was found to be significant using each of the causality measures. Fig. 3 depicts the value of PCCC between the original pair of time series with respect to the distribution of PCCC obtained using surrogate time series for two datasets: kilo-year scale CO2-temperature (Figs. 3(a) and 3(b)) and yearly scale ENSO-SASM (Figs. 3(c) and 3(d)) recordings. In the tables, Fig. 3 and in the following text, we use the notation ‘T’ to refer to temperature generically. Which of the temperature recordings is being referred to will be clear from context.
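The stationary bootstrap used for the real-data surrogates can be sketched as below: blocks start at uniformly random positions, and each block continues with probability 1 − p, so block lengths are geometrically distributed with mean 1/p (10 samples for p = 0.1). Circular wrap-around at the end of the series is the usual convention for this method and is assumed here; the function name is hypothetical.

```python
import random


def stationary_bootstrap(x, p, rng):
    """Return one surrogate of len(x) built from random-length blocks of x."""
    n = len(x)
    surrogate = []
    i = rng.randrange(n)  # start of the first block
    while len(surrogate) < n:
        surrogate.append(x[i])
        if rng.random() < p:
            i = rng.randrange(n)  # end this block, jump to a new start
        else:
            i = (i + 1) % n       # continue the block, wrapping circularly
    return surrogate


rng = random.Random(0)
original = list(range(1000))  # index-valued series makes blocks visible
surrogate = stationary_bootstrap(original, p=0.1, rng=rng)

# Fraction of consecutive surrogate samples that were also consecutive
# in the original series; this should be close to 1 - p.
kept = sum(
    1 for a, b in zip(surrogate, surrogate[1:]) if b == (a + 1) % len(original)
) / (len(surrogate) - 1)
```

Generating 100 such surrogates per series and feeding the resulting causality values to a one-sided z-test reproduces the structure of the significance analysis described above.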
III. DISCUSSION AND CONCLUSIONS
CCC has been proposed as an ‘interventional’ causality measure for time series. It does not require cause-effect separability in time series samples and is based on the dynamical evolution of processes, making it suitable for subsampled time series, time series in which cause and effect are acquired at slightly different spatio-temporal scales than the scales at which they naturally occur, and even when there are slight discrepancies in the spatio-temporal scales of the cause and effect time series. This results in its robust performance in the case of missing samples and of non-uniformly sampled, decimated and short data [41]. In this work, we have proposed the use of CCC in combination with ordinal pattern encoding. The latter preserves the dynamics of the systems of observed variables, allowing CCC to decipher causal relationships between variables of multi-dimensional systems while conditioning for the presence of other variables in these systems which might be unknown or unobserved.

FIG. 3. PCCC surrogate analysis results for: (a) Kilo-year scale CO2 → T, (b) Kilo-year scale T → CO2, (c) Yearly ENSO → SASM, (d) SASM → ENSO. Each panel plots probability (vertical axis) against PCCC value (horizontal axis). The dashed line indicates the PCCC value obtained for the original series. Its position is indicated with respect to the Gaussian curve fitted to the normalized histogram of surrogate PCCC values. PCCC for cases (b), (c), (d) is found to be significant.
Simulations of coupled Rössler systems illustrate how scalar CCC fails completely for observables of coupled multi-dimensional dynamical systems, while PCCC performs well in determining the correct direction of coupling. Comparison of PCCC with PCMI for these simulations shows that the former performs better on shorter time series. Further, while PCMI consistently gave superior performance for increasing noise in coupled Rössler systems, experiments with sparse data showed that PCCC outperforms PCMI. This was the case when samples were missing from the driver and response time series either in a synchronous or asynchronous manner.
As PCCC showed promising results for simulations
with high levels of missing samples and short length, we
have applied it to make causal inferences in datasets from
climatology and paleoclimatology which suffer from the
issues of irregular sampling, missing samples and (or)
have limited number of data points available. Many of
these datasets have been analyzed in previous studies.
However, different studies report different results prob-
ably due to the challenging nature of their recordings
available or the limitation of the inference methods ap-
plied to work on the data.
TABLE I. Causal inference obtained for real datasets using different causality measures. ✓ indicates significant causality and ✗ indicates non-significant causality.

System                         Direction        CCC  PCCC  CMI  PCMI
Millennial scale CO2-T         CO2 → T          ✓    ✗     ✗    ✗
                               T → CO2          ✓    ✓     ✗    ✗
Kilo-year scale CO2-T          CO2 → T          ✗    ✗     ✗    ✗
                               T → CO2          ✗    ✓     ✗    ✗
Kilo-year scale CH4-T          CH4 → T          ✗    ✓     ✗    ✗
                               T → CH4          ✗    ✗     ✗    ✗
Monthly scale CO2-T            CO2 → T          ✗    ✓     ✗    ✗
                               T → CO2          ✗    ✗     ✗    ✗
Yearly ENSO-SASM               ENSO → SASM      ✗    ✓     ✗    ✗
                               SASM → ENSO      ✓    ✓     ✗    ✗
Monthly NINO-Indian monsoon    NINO → Monsoon   ✗    ✓     ✓    ✓
                               Monsoon → NINO   ✓    ✗     ✓    ✓
Monthly NAO-European T         NAO → T          ✓    ✓     ✗    ✗
                               T → NAO          ✗    ✗     ✗    ✗
Daily NAO-Frankfurt T          NAO → T          ✓    ✗     ✓    ✗
                               T → NAO          ✗    ✗     ✗    ✗

For example, the relationship between CO2 concentrations and temperature of the atmosphere has been studied from the mid 1800s [83, 84], beginning when a strong link between the two was recognized. Relatively recently, with causal inference tools available, a number of studies have begun to look at the directionality of the relationship between the two on different temporal scales. To mention a few findings, Kodra et al. [85] found that CO2
Granger causes temperature. Their analysis was based
on data taken from 1860 to 2008. Attanasio [86] found
clear evidence of GC from CO2 to temperature using a
lag-augmented Wald test, for a similar time range. On
the other hand, Stern and Kaufmann [87] found
bidirectional GC between the two, again for a similar time
range. Kang and Larsson [88] also find bidirectional
causation between the two using GC, but using data
from ice cores for the last 800,000 years. Many of these
latter studies criticize the former. The drawbacks
of one or more of these studies are explicitly discussed
in [87, 89, 90], which highlight issues with the data and/or
the methodology employed. Other than GC and its
extensions, a couple of other measures have also been used
to study the CO2-T relationship. Stips et al. [91] have
applied a measure called Liang's information flow to CO2-T
recordings, both on recent (1850-2005) and paleoclimate
(800 ka ice core reconstructions) time-scales. The study
finds unidirectional causation from CO2 → T on the
recent time-scale and from T → CO2 on the paleoclimatic
scale. They have also analysed the CH4-T relationship
and found T to drive CH4 on the paleoclimate scale. This
study has been criticized by Goulet Coulombe and Göbel [92], who show
that an assumption of 'linearity' made by Liang's
information flow is nearly always rejected by the data.
Convergent cross mapping (CCM), which is applied to the 800 ka
recordings in another study, finds a bidirectional causal
influence for both CO2-T and CH4-T [93]. Another
recent study, which infers causation using lagged
cross-correlations between monthly CO2 and temperature
taken from the period 1980-2019, has found a
bidirectional relationship on the recent monthly scale, with
the dominant influence being from T → CO2 [94]. In the
light of the limitations of CCM [95, 96], especially for
irregularly sampled or missing data [42], and of the widely
known pitfalls of the correlation coefficient [97], it is difficult
to rely on the inferences of the latter two studies.
PCCC indicates unidirectional causality from T →
CO2 on the paleoclimatic scale, using both millennial and
kilo-year scale recordings. On the recent monthly scale,
the situation is reversed, with CO2 driving T. These
results are in line with some of the existing CO2-T causal
analysis studies, and PCCC does not suffer from the
limitations of the existing approaches discussed above. On the kilo-year scale,
PCCC suggests that CH4 drives T. While none of the
causality studies discussed above have found this
result, other works have suggested that methane
concentrations modulate millennial-scale climate variability
because of the sensitivity of methane to insolation [98, 99].
The other approaches implemented in this study (CCC,
CMI and PCMI) do not reproduce the results obtained
by PCCC because of their specific limitations, such as
the inability to work on multi-dimensional, short-length
or irregularly sampled data.
ENSO events and the Indian monsoon are other major
climatic processes of global importance [59]. The
relationship between the two has been studied extensively,
especially using correlation and coherence approaches [100–
105]. While ENSO is normally expected to play a
driving role, there is no clear consensus on the
directionality of the relationship between the two processes. More
recently, causal inference approaches have been used to
study the nature of their coupling. In [106] and [107],
both linear and non-linear GC versions were implemented
on monthly mean ENSO-Indian monsoon time series
covering the period 1871-2006, and bidirectional
coupling was inferred between the two processes. Other
studies have examined the causal relationship indirectly
by analyzing the ENSO-Indian Ocean Dipole link. For
example, in [108], this connection was studied by
applying GC to yearly reanalysis as well as model data
covering 1950-2014. The study found a robust causal
influence of the Indian Ocean Dipole on ENSO, while the
influence in the opposite direction had lower confidence.
Using PCCC, we find a bidirectional causal influence
between yearly recordings of ENSO-SASM. However, on
the shorter monthly scale, NINO is found to drive the Indian
monsoon and there is no significant effect in the opposite
direction.
Although the NAO is known to be a leading mode
of winter climate variability over Europe [109–111], the
directionality or feedback in NAO-related climate
effects has been examined by only a few causality
studies [9, 112, 113]. We investigate the NAO-European
temperature relationship on both monthly and daily time
scales using winter-only data. While PCCC indicates
that NAO drives central European temperatures with no
significant feedback on the longer monthly scale, on the
daily scale it shows no significant causation in either
direction. On the other hand, CCC and CMI, based on
one-dimensional time series, indicate a strong influence from
NAO to Frankfurt daily mean temperatures. This
result indicates that the NAO influence on European
winter temperature on the daily scale can be explained as
a simple time-delayed transfer of information between
scalar time series in which no role is played by
higher-dimensional patterns, potentially reflected in ordinal
coding. Such an information transfer in the atmosphere is
tied to the transfer of mass and energy, as indicated in
the study of climate networks by Hlinka et al. [114]. CMI
and PCMI estimates can be considered reliable for
this analysis as the time series analyzed are long, close
to 6000 time points.
CCC is free of the assumption of linearity and of the
requirement of long-term stationarity, and is extremely robust to
missing samples, irregular sampling and short data length;
its combination with permutation patterns allows
it to make reliable inferences for coupled systems with
multiple variables. Thus, we expect the analysis
and inferences presented here on some highly researched
and long-debated climatic interactions to be robust
and reliable. We also expect that the use of PCCC
on other challenging datasets from climatology and other
fields will help to shed light on the causal linkages
in the systems considered.
IV. METHODS
Compression-Complexity Causality (CCC) is
defined as the change in the dynamical compression-complexity
of time series $y$ when $\Delta y$ is seen to be generated
jointly by the dynamical evolution of both $y_{past}$
and $x_{past}$, as opposed to the reality in which it is generated
by the dynamical evolution of $y_{past}$ alone. Here $y_{past}$ and $x_{past}$ are windows of
length $L$ taken from contemporary time points
of the time series $y$ and $x$ respectively, and $\Delta y$ is a window
of length $w$ following $y_{past}$ [41]. Dynamical
compression-complexity (CC) is estimated using the
effort-to-compress (ETC) measure [115] and is given by:

$$CC(\Delta y \mid y_{past}) = ETC(y_{past} + \Delta y) - ETC(y_{past}), \tag{3}$$

$$CC(\Delta y \mid y_{past}, x_{past}) = ETC(y_{past} + \Delta y,\; x_{past} + \Delta y) - ETC(x_{past}, y_{past}). \tag{4}$$

Eq. (3) computes the dynamical compression-complexity
of $\Delta y$ as a dynamical evolution of $y_{past}$ alone.
Eq. (4) computes the dynamical compression-complexity
of $\Delta y$ as a dynamical evolution of both $y_{past}$ and $x_{past}$.
$CCC_{x_{past} \to \Delta y}$ is then estimated as:

$$CCC_{x_{past} \to \Delta y} = CC(\Delta y \mid y_{past}) - CC(\Delta y \mid y_{past}, x_{past}). \tag{5}$$

The averaged CCC from $x$ to $y$ over the entire length of the
time series, with the window $\Delta y$ being slid with a step
size of $\delta$, is estimated as

$$CCC_{x \to y} = \overline{CCC}_{x_{past} \to \Delta y} = \overline{CC}(\Delta y \mid y_{past}) - \overline{CC}(\Delta y \mid x_{past}, y_{past}), \tag{6}$$

where the overbar denotes the average over all windows.
If $\overline{CC}(\Delta y \mid y_{past}) \approx \overline{CC}(\Delta y \mid x_{past}, y_{past})$, there is no
causality from $x$ to $y$. Surrogate time series are
generated for both $x$ and $y$, and the $CCC_{x \to y}$ values of the
original and surrogate time series are compared. If the CCC
computed for the original time series is statistically different
from that of the surrogate time series, we can infer the
presence of a causal relation from $x \to y$ [42]. $CCC_{x \to y}$ can
be either negative or positive depending upon the nature or quality
of the causal relationship [41]. The magnitude indicates
the strength of causation.
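Equations (3)-(6) can be illustrated with a short Python sketch. It is not the released implementation: the `etc` function below is a simplified pair-substitution version of effort-to-compress, the joint ETC is assumed to be the ETC of the sequence of symbol pairs, no length normalization is applied, and all function names are hypothetical. The inputs are assumed to be already symbolized sequences of equal length.

```python
from collections import Counter


def etc(symbols):
    """Simplified effort-to-compress: number of pair-substitution passes
    needed to turn the sequence into a constant (or length-1) sequence."""
    seq = list(symbols)
    passes = 0
    while len(seq) > 1 and len(set(seq)) > 1:
        # Most frequent pair of adjacent symbols in the current sequence.
        target = Counter(zip(seq, seq[1:])).most_common(1)[0][0]
        new_symbol = ("etc-new", passes)  # fresh symbol for this pass
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == target:
                out.append(new_symbol)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
        passes += 1
    return passes


def joint_etc(a, b):
    """Assumed joint ETC: ETC of the sequence of pairs (a_i, b_i)."""
    return etc(list(zip(a, b)))


def ccc(x, y, L, w, delta):
    """Average CCC from x to y, following Eqs. (3)-(6) as written."""
    x, y = list(x), list(y)
    values = []
    for start in range(0, len(y) - (L + w) + 1, delta):
        y_past = y[start:start + L]
        x_past = x[start:start + L]
        dy = y[start + L:start + L + w]
        cc_y = etc(y_past + dy) - etc(y_past)                 # Eq. (3)
        cc_xy = (joint_etc(y_past + dy, x_past + dy)
                 - joint_etc(x_past, y_past))                 # Eq. (4)
        values.append(cc_y - cc_xy)                           # Eq. (5)
    if not values:
        raise ValueError("time series shorter than L + w")
    return sum(values) / len(values)                          # Eq. (6)
```

When `x` is an exact copy of `y`, the joint sequences carry no information beyond `y` itself, so this sketch returns a CCC of zero, in line with the no-causality condition stated after Eq. (6).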
The parameters $L$, $w$, $\delta$ and the number of
bins, $B$, used for symbolizing the time series by equidistant
binning (ETC is applied to symbolic sequences) are
selected using the parameter selection criteria given in the
supplementary text of [41].
Permutation Compression-Complexity Causality (PCCC)
is the causal inference technique proposed and
implemented in this work. Given a pair of time series $x_1$
and $x_2$ from dynamical systems in which causation is
to be checked from $x_1$ to $x_2$, we first embed the time
series of the potential driver ($x_1$ here) in the following
manner: $x_1(t), x_1(t+\eta), x_1(t+2\eta), \ldots, x_1(t+(m-1)\eta)$,
where $\eta$ is the time delay and $m$ is the embedding
dimension of $x_1$. $\eta$ is computed as the first minimum
of the auto mutual information function. The embedded
time series at each time point is then symbolized using
permutation or ordinal patterns binning. For example,
if $m = 3$, the embedding at time point $t$ is given as
$\hat{x}_1(t) = (x_1(t), x_1(t+\eta), x_1(t+2\eta))$. Symbols 0, 1, 2 are
then used for labelling the pattern for $\hat{x}_1(t)$ at each time
point by sorting the embedded values in ascending
order, with 2 being used for the highest value and 0 for the
lowest. If two or more values are exactly equal, they are
labelled differently depending on the order of their
occurrence. A total of $m! = 3! = 6$ patterns at time $t$ are possible in
this case. Thus, $\hat{x}_1(t)$ is symbolized to a one-dimensional
sequence consisting of $m!$ possible symbols or bins. CCC
is then estimated from $\hat{x}_1(t)$ to $x_2(t)$ using Eq. (6),
after symbolizing $x_2(t)$ using standard equidistant binning
with $m!$ bins. Thus,

$$PCCC_{x_1 \to x_2} = CCC_{\hat{x}_1 \to x_2}. \tag{7}$$
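The symbolization step described above can be sketched in Python as follows. This is an illustrative sketch, not the released code: the function names are hypothetical, and the mapping of each rank pattern to an integer between 0 and m! - 1 (via the enumeration order of `itertools.permutations`) is an arbitrary choice made here.

```python
from itertools import permutations


def ordinal_symbols(x, m, eta):
    """Map each delay vector (x[t], x[t+eta], ..., x[t+(m-1)eta]) to one
    of m! integer symbols identifying its ordinal (rank) pattern."""
    index_of = {p: k for k, p in enumerate(permutations(range(m)))}
    symbols = []
    for t in range(len(x) - (m - 1) * eta):
        window = [x[t + j * eta] for j in range(m)]
        # sorted() is stable, so equal values are ranked by order of occurrence.
        order = sorted(range(m), key=lambda j: window[j])
        ranks = [0] * m
        for rank, j in enumerate(order):
            ranks[j] = rank  # 0 = lowest value, m-1 = highest value
        symbols.append(index_of[tuple(ranks)])
    return symbols


def equidistant_bins(x, n_bins):
    """Equal-width binning of x into integer symbols 0..n_bins-1."""
    lo, hi = min(x), max(x)
    if hi == lo:
        return [0] * len(x)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in x]
```

For $m = 3$, the driver would be passed through `ordinal_symbols(x1, 3, eta)` and the response through `equidistant_bins(x2, 6)`, so that both sequences use $m! = 6$ symbols before CCC is estimated.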
Permutation binning is not done for the potential
response series, as it was found from simulation experiments
(Rössler data) that embedding the 'cause' alone works
better for the CCC measure. Full dimensionality of the
cause is necessary to predict the effect. Hence,
embedding only the cause helps to recover the causal
relationship. PCCC thus takes into account the
multidimensional nature of the coupled systems. Parameter selection
for PCCC is done in the same manner as for
CCC, using the symbolic sequences $\hat{x}_1(t)$ and $x_2(t)$ for
selection of the parameters. When PCCC is to be
estimated from $x_2 \to x_1$, $x_2$ is embedded and $x_1$ remains as
it is. Just like CCC, the PCCC measure can also take
negative values.
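The time delay $\eta$ is described above as the first minimum of the auto mutual information function. A minimal Python sketch of that selection rule is given below; it assumes the series has already been discretized into symbols, uses a simple plug-in (histogram) estimate of mutual information, and uses hypothetical function names. The estimator actually used for the results in this work may differ.

```python
from collections import Counter
from math import log2


def _entropy(items):
    """Plug-in Shannon entropy (bits) of a sequence of hashable items."""
    counts = Counter(items)
    n = sum(counts.values())
    return -sum(c / n * log2(c / n) for c in counts.values())


def auto_mutual_information(symbols, lag):
    """Plug-in mutual information between s(t) and s(t + lag), in bits."""
    a = symbols[:len(symbols) - lag]
    b = symbols[lag:]
    return _entropy(a) + _entropy(b) - _entropy(zip(a, b))


def first_minimum_lag(symbols, max_lag):
    """Smallest lag in 1..max_lag at which the auto mutual information
    has a local minimum; None if no such lag exists."""
    ami = [auto_mutual_information(symbols, lag) for lag in range(max_lag + 2)]
    for lag in range(1, max_lag + 1):
        if ami[lag] < ami[lag - 1] and ami[lag] <= ami[lag + 1]:
            return lag
    return None
```

For a binary square wave with period 8, for example, the value two steps ahead is nearly independent of the current value, so the rule selects a delay of 2.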
Conditional Mutual Information (CMI) of the
variables $X$ and $Y$ given the variable $Z$ is a common
information-theoretic functional used for causality
detection, and can be obtained as

$$I(X;Y \mid Z) = H(X \mid Z) + H(Y \mid Z) - H(X, Y \mid Z), \tag{8}$$

where $H(X_1, X_2, \ldots \mid Z) = H(X_1, X_2, \ldots, Z) - H(Z)$ is
the conditional entropy, and the joint Shannon entropy
$H(X_1, X_2, \ldots)$ is defined as:

$$H(X_1, X_2, \ldots) = -\sum_{x_1, x_2, \ldots} p(x_1, x_2, \ldots) \log p(x_1, x_2, \ldots), \tag{9}$$

where $p(x_1, x_2, \ldots) = \Pr[X_1 = x_1, X_2 = x_2, \ldots]$ is the
joint probability distribution function of the amplitudes
of the variables $\{X_1, X_2, \ldots\}$. In order to detect the
coupling direction between two dynamical variables $X$
and $Y$, Paluš et al. [21] used the conditional
mutual information $I(X(t); Y(t+\tau) \mid Y(t))$, which captures
the net information about the $\tau$-future of the process
$Y$ contained in the process $X$. As mentioned in the
Introduction, to estimate other unknown variables, an
$m$-dimensional state vector $X$ can be reconstructed as
$X(t) = \{x(t), x(t-\eta), \ldots, x(t-(m-1)\eta)\}$.
Accordingly, the CMI defined above can be represented by its
reconstructed version for all variables $X(t)$, $Y(t+\tau)$ and
$Y(t)$. However, extensive numerical studies [22]
demonstrated that CMI in the form

$$I(X(t); Y(t+\tau) \mid Y(t), Y(t-\eta), \ldots, Y(t-(m-1)\eta)) \tag{10}$$

is sufficient to infer the direction of coupling between the
dynamical variables $X(t)$ and $Y(t)$. We therefore
use this measure to detect causal relationships in this
article.
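Substituting the conditional entropy $H(\cdot \mid Z) = H(\cdot, Z) - H(Z)$ into Eq. (8) gives $I(X;Y \mid Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)$, which can be evaluated directly from joint entropies. The Python sketch below illustrates this for discrete sequences using a plug-in estimate of Eq. (9) with base-2 logarithms. It is a sketch under these assumptions, with hypothetical function names, and it conditions only on $Y(t)$ rather than on the full delay vector of Eq. (10).

```python
from collections import Counter
from math import log2


def joint_entropy(*variables):
    """Plug-in Shannon entropy (bits) of the joint distribution, Eq. (9)."""
    counts = Counter(zip(*variables))
    n = sum(counts.values())
    return -sum(c / n * log2(c / n) for c in counts.values())


def cmi(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z), equivalent to Eq. (8)."""
    return (joint_entropy(x, z) + joint_entropy(y, z)
            - joint_entropy(x, y, z) - joint_entropy(z))


def directional_cmi(x, y, tau):
    """I(X(t); Y(t+tau) | Y(t)) for discrete sequences of equal length."""
    n = len(y) - tau
    return cmi(x[:n], y[tau:tau + n], y[:n])
```

If `y` is simply `x` delayed by one step, `directional_cmi(x, y, 1)` is large because the future of `y` is fully determined by the present of `x`.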
Permutation Conditional Mutual Information
(PCMI) can be obtained based on the permutation
analysis described earlier in the PCCC definition. In this
approach, all marginal, joint or conditional probability
distribution functions of the amplitudes of the variables
are replaced by their symbolized versions; thus Eq. (9)
should be replaced by

$$H(\hat{X}_1, \hat{X}_2, \ldots) = -\sum_{\hat{x}_1, \hat{x}_2, \ldots} p(\hat{x}_1, \hat{x}_2, \ldots) \log p(\hat{x}_1, \hat{x}_2, \ldots), \tag{11}$$

where $p(\hat{x}_1, \hat{x}_2, \ldots) = \Pr[\hat{X}_1 = \hat{x}_1, \hat{X}_2 = \hat{x}_2, \ldots]$ is the
joint probability distribution function of the symbolized
variables $\hat{X}_i(t) = \{X_i(t), X_i(t+\eta), \ldots, X_i(t+(m-1)\eta)\}$.
By using Eqs. (8) and (11), permutation CMI can be
obtained as $I(\hat{X}(t); \hat{Y}(t+\tau) \mid \hat{Y}(t))$. Finally, one should
replace $\tau$ with $\tau + (m-1)\eta$ in order to avoid any
overlap between the past and future of the symbolized
variable $\hat{Y}$.
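The PCMI recipe above, including the shift of $\tau$ by $(m-1)\eta$, can be sketched in Python as follows. This is an illustrative sketch with hypothetical function names: it uses plug-in entropies in bits, represents each ordinal pattern by its tuple of ranks, and allows separate delays `eta_x` and `eta_y` for the two series, with the shift taken from the delay of the response.

```python
from collections import Counter
from math import log2


def ordinal_patterns(x, m, eta):
    """Rank pattern of (x[t], x[t+eta], ..., x[t+(m-1)eta]) as a tuple."""
    out = []
    for t in range(len(x) - (m - 1) * eta):
        window = [x[t + j * eta] for j in range(m)]
        order = sorted(range(m), key=lambda j: window[j])  # stable for ties
        ranks = [0] * m
        for rank, j in enumerate(order):
            ranks[j] = rank
        out.append(tuple(ranks))
    return out


def joint_entropy(*variables):
    """Plug-in Shannon entropy (bits) of the joint distribution, Eq. (11)."""
    counts = Counter(zip(*variables))
    n = sum(counts.values())
    return -sum(c / n * log2(c / n) for c in counts.values())


def pcmi(x, y, m, eta_x, eta_y, tau):
    """Permutation CMI from x to y with tau replaced by tau + (m-1)*eta_y."""
    xs = ordinal_patterns(x, m, eta_x)
    ys = ordinal_patterns(y, m, eta_y)
    shift = tau + (m - 1) * eta_y  # past and future patterns of y do not overlap
    n = min(len(xs), len(ys)) - shift
    if n <= 0:
        raise ValueError("time series too short for these parameters")
    x_now, y_now, y_future = xs[:n], ys[:n], ys[shift:shift + n]
    return (joint_entropy(x_now, y_now) + joint_entropy(y_future, y_now)
            - joint_entropy(x_now, y_future, y_now) - joint_entropy(y_now))
```

A monotonically increasing driver produces a single ordinal pattern and therefore carries no information about the response, so the sketch returns zero in that case.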
Parameters of the methods used were set as shown
in Table II for the different datasets.

TABLE II. Parameters corresponding to each method, used
for different datasets.

Dataset                       Embedding                      CCC                       PCCC                 CMI/PCMI
Rössler                       η_x1 = 5, η_x2 = 5, m = 3      L=300, w=30, δ=30, B=8    L=25, w=15, δ=20     τ = 20
Millennial CO2-T              η_CO2 = 11, η_T = 16, m = 3    L=60, w=15, δ=20, B=4     L=60, w=30, δ=20     τ = 1-30
Kilo-year CO2-T               η_CO2 = 24, η_T = 8, m = 3     L=60, w=15, δ=20, B=4     L=30, w=15, δ=20     τ = 1-30
Kilo-year CH4-T               η_CH4 = 10, η_T = 8, m = 3     L=60, w=15, δ=20, B=4     L=30, w=15, δ=20     τ = 1-30
Monthly CO2-T                 η_CO2 = 3, η_T = 2, m = 3      L=60, w=15, δ=20, B=4     L=30, w=15, δ=20     τ = 1-30
Yearly ENSO-SASM              η_ENSO = 1, η_SASM = 4, m = 3  L=60, w=15, δ=20, B=4     L=60, w=30, δ=30     τ = 1-30
Monthly NINO-India Monsoon    η_NINO = 10, η_mon = 3, m = 3  L=60, w=15, δ=20, B=4     L=30, w=15, δ=20     τ = 1-30
Monthly NAO-T                 η_NAO = 1, η_T = 1, m = 3      L=60, w=15, δ=20, B=4     L=30, w=15, δ=10     τ = 1-30
Daily NAO-T                   η_NAO = 15, η_T = 15, m = 3    L=40, w=15, δ=20, B=4     L=30, w=15, δ=20     τ = 1-30

DATA AVAILABILITY
The millennial scale CO2 and temperature
datasets are freely available at
https://zenodo.org/record/4562996#.YiDbTN_ML3A.
Kilo-year scale CO2, CH4 and temperature datasets
are available as supplementary files for [64] at
https://agupubs.onlinelibrary.wiley.com/doi/full/10.1002/2015RG000482.
Monthly CO2 recordings are taken from
the NOAA repository and are available at
https://gml.noaa.gov/ccgg/trends/. Monthly
Northern Hemisphere temperature anomaly recordings
are taken from the NOAA repository and are available at
https://www.ncdc.noaa.gov/cag/global/time-series.
The yearly El Niño/Southern Oscillation Index
Reconstruction dataset is taken from the NOAA repository,
https://www.ncei.noaa.gov/access/paleo-search/study/11194.
The yearly South Asian Summer Monsoon Index
Reconstruction dataset is taken from the NOAA repository,
https://www.ncei.noaa.gov/access/paleo-search/study/17369.
The monthly Niño 3.4 SST Index dataset is
taken from the NOAA repository, available at
https://psl.noaa.gov/gcos_wgsp/Timeseries/Nino34/.
The monthly all-India rainfall dataset is made
available by the World Meteorological Organization
at http://climexp.knmi.nl/data/pALLIN.dat.
The reconstructed monthly North Atlantic Oscillation
Index is available at the NOAA repository,
https://psl.noaa.gov/gcos_wgsp/Timeseries/RNAO/.
Monthly Central European 500 Year Temperature
Reconstructions are available at the NOAA repository,
https://www.ncei.noaa.gov/access/metadata/landing-page/bin/iso?id=noaa-recon-9970.
The daily North Atlantic Oscillation Index
is available at the NOAA repository,
https://www.cpc.ncep.noaa.gov/products/precip/CWlink/pna/nao.shtml.
Daily Frankfurt air temperatures are
made available by the ECA&D project at
https://www.ecad.eu/dailydata/predefinedseries.php.
CODE AVAILABILITY
The computer codes used in
this study are freely available at
https://github.com/AditiKathpalia/PermutationCCC
under the Apache 2.0 Open-source license.
ACKNOWLEDGMENTS
This study is supported by the Czech Science Foundation,
Project No. GA19-16066S and by the Czech
Academy of Sciences, Praemium Academiae awarded to
M. Paluš.
[1] J. Pearl and D. Mackenzie, The book of why: the new
science of cause and effect (Basic Books, 2018).
[2] A. Kathpalia and N. Nagaraj, Measuring causality, Res-
onance , 191 (2021).
[3] N. Wiener, The theory of prediction, Modern mathe-
matics for engineers 1, 125 (1956).
[4] C. Granger, Investigating causal relations by economet-
ric models and cross-spectral methods, Econometrica
37, 424 (1969).
[5] J. Geweke, Inference and causality in economic time se-
ries models, Handbook of econometrics 2, 1101 (1984).
[6] C. Hiemstra and J. D. Jones, Testing for linear and non-
linear granger causality in the stock price-volume rela-
tion, The Journal of Finance 49, 1639 (1994).
[7] S. Z. Chiou-Wei, C.-F. Chen, and Z. Zhu, Economic
growth and energy consumption revisited: evidence
from linear and nonlinear granger causality, Energy Eco-
nomics 30, 3063 (2008).
[8] A. K. Seth, A. B. Barrett, and L. Barnett, Granger
causality analysis in neuroscience and neuroimaging,
Journal of Neuroscience 35, 3293 (2015).
[9] T. J. Mosedale, D. B. Stephenson, M. Collins, and T. C.
Mills, Granger causality of coupled climate processes:
Ocean feedback on the north atlantic oscillation, Jour-
nal of climate 19, 1182 (2006).
[10] G. Tirabassi, C. Masoller, and M. Barreiro, A study of
the air–sea interaction in the south atlantic convergence
zone through granger causality, International Journal of
Climatology 35, 3440 (2015).
[11] J. Runge, S. Bathiany, E. Bollt, G. Camps-Valls,
D. Coumou, E. Deyle, C. Glymour, M. Kretschmer,
M. D. Mahecha, J. Muñoz-Marí, et al., Inferring causation
from time series in earth system sciences, Nature
communications 10, 1 (2019).
[12] D. Bell, J. Kay, and J. Malley, A non-parametric ap-
proach to non-linear causality testing, Economics Let-
ters 51, 7 (1996).
[13] Y. Chen, G. Rangarajan, J. Feng, and M. Ding, Analyz-
ing multiple nonlinear time series with extended granger
causality, Physics letters A 324, 26 (2004).
[14] S. J. Schiff, P. So, T. Chang, R. E. Burke, and T. Sauer,
Detecting dynamical interdependence and generalized
synchrony through mutual prediction in a neural en-
semble, Physical Review E 54, 6708 (1996).
[15] M. Le Van Quyen, J. Martinerie, C. Adam, and F. J.
Varela, Nonlinear analyses of interictal eeg map the
brain interdependences in human focal epilepsy, Physica
D: Nonlinear Phenomena 127, 250 (1999).
[16] D. Marinazzo, M. Pellicoro, and S. Stramaglia, Kernel
method for nonlinear granger causality, Physical Review
Letters 100, 144103 (2008).
[17] L. A. Baccalá and K. Sameshima, Partial directed coherence:
a new concept in neural structure determination,
Biological cybernetics 84, 463 (2001).
[18] M. Kamiński, M. Ding, W. A. Truccolo, and S. L.
Bressler, Evaluating causal relations in neural systems:
Granger causality, directed transfer function and statis-
tical assessment of significance, Biological cybernetics
85, 145 (2001).
[19] A. Korzeniewska, M. Mańczak, M. Kamiński, K. J. Blinowska,
and S. Kasicki, Determination of information
flow direction among brain structures by a modified
directed transfer function (dDTF) method, Journal of
neuroscience methods 125, 195 (2003).
[20] T. Schreiber, Measuring information transfer, Physical
Review Letters 85, 461 (2000).
[21] M. Paluš, V. Komárek, Z. Hrnčíř, and K. Štěrbová,
Synchronization as adjustment of information rates:
Detection from bivariate time series, Physical Review E 63,
046211 (2001).
[22] M. Paluš and M. Vejmelka, Directionality of coupling
from bivariate time series: How to avoid false causalities
and missed connections, Physical Review E 75, 056211
(2007).
[23] R. Vicente, M. Wibral, M. Lindner, and G. Pipa, Trans-
fer entropy: a model-free measure of effective connec-
tivity for the neurosciences, Journal of computational
neuroscience 30, 45 (2011).
[24] M. Bauer, J. W. Cox, M. H. Caveness, J. J. Downs,
and N. F. Thornhill, Finding the direction of distur-
bance propagation in a chemical process using transfer
entropy, IEEE transactions on control systems technol-
ogy 15, 12 (2007).
[25] T. Dimpfl and F. J. Peter, Using transfer entropy to
measure information flows between financial markets,
Studies in Nonlinear Dynamics & Econometrics 17, 85
(2013).
[26] M. Paluš, Multiscale atmospheric dynamics:
cross-frequency phase-amplitude coupling in the air
temperature, Physical review letters 112, 078702 (2014).
[27] N. Jajcay, S. Kravtsov, G. Sugihara, A. A. Tsonis, and
M. Paluš, Synchronization and causality across time
scales in el niño southern oscillation, npj Climate and
Atmospheric Science 1, 1 (2018).
[28] F. Takens, Detecting strange attractors in turbulence,
in Dynamical systems and turbulence, Warwick 1980
(Springer, 1981) pp. 366–381.
[29] A. M. Fraser and H. L. Swinney, Independent coordi-
nates for strange attractors from mutual information,
Physical review A 33, 1134 (1986).
[30] M. Wibral, N. Pampu, V. Priesemann, F. Siebenhühner,
H. Seiwert, M. Lindner, J. T. Lizier, and R. Vi-
cente, Measuring information-transfer delays, PloS one
8, e55809 (2013).
[31] G. Sugihara, R. May, H. Ye, C. Hsieh, and E. Deyle,
Detecting causality in complex ecosystems, Science 338,
496 (2012).
[32] D. Harnack, E. Laminski, M. Schünemann, and K. R.
Pawelzik, Topological causality in dynamical systems,
Physical review letters 119, 098301 (2017).
[33] A. Krakovská and F. Hanzely, Testing for causality in
reconstructed state spaces by an optimized mixed pre-
diction method, Physical Review E 94, 052203 (2016).
[34] A. Barrios, G. Trincado, and R. Garreaud, Alternative
approaches for estimating missing climate data: appli-
cation to monthly precipitation records in south-central
chile, Forest Ecosystems 5, 1 (2018).
[35] C. I. Anderson and W. A. Gough, Accounting for miss-
ing data in monthly temperature series: Testing rule-of-
thumb omission of months with missing values, Interna-
tional Journal of Climatology 38, 4990 (2018).
[36] G. DiCesare, Imputation, estimation and missing data
in finance, Ph.D. thesis, University of Waterloo (2006).
[37] C. John, E. J. Ekpenyong, and C. C. Nworu, Imputa-
tion of missing values in economic and financial time
series data using five principal component analysis ap-
proaches, CBN Journal of Applied Statistics (JAS) 10,
3 (2019).
[38] S. Gyimah, Missing data in quantitative social research,
PSC Discussion Papers Series 15, 1 (2001).
[39] C. Kulp and E. Tracy, The application of the transfer
entropy to gappy time series, Physics Letters A 373,
1261 (2009).
[40] D. Smirnov and B. Bezruchko, Spurious causalities due
to low temporal resolution: Towards detection of bidi-
rectional coupling from time series, EPL (Europhysics
Letters) 100, 10005 (2012).
[41] A. Kathpalia and N. Nagaraj, Data based intervention
approach for complexity-causality measure, PeerJ Com-
puter Science 5(2019).
[42] A. Kathpalia, Theoretical and Experimental Investiga-
tions into Causality, its Measures and Applications,
Ph.D. thesis, NIAS (2021).
[43] N. Nagaraj and K. Balasubramanian, Dynamical com-
plexity of short and noisy time series, The European
Physical Journal Special Topics , 1 (2017).
[44] M. Staniek and K. Lehnertz, Symbolic transfer entropy,
Physical Review Letters 100, 158101 (2008).
[45] M. Staniek and K. Lehnertz, Symbolic transfer entropy:
inferring directionality in biosignals, Biomedizinische
Technik 54, 323 (2009).
[46] D. Kugiumtzis, Partial transfer entropy on rank vectors,
The European Physical Journal Special Topics 222, 401
(2013).
[47] A. Papana, C. Kyrtsou, D. Kugiumtzis, and C. Diks,
Simulation study of direct causality measures in multi-
variate time series, Entropy 15, 2635 (2013).
[48] X. Li and G. Ouyang, Estimating coupling direction
between neuronal populations with permutation condi-
tional mutual information, NeuroImage 52, 497 (2010).
[49] D. Wen, P. Jia, S.-H. Hsu, Y. Zhou, X. Lan, D. Cui,
G. Li, S. Yin, and L. Wang, Estimating coupling
strength between multivariate neural series with mul-
tivariate permutation conditional mutual information,
Neural Networks 110, 159 (2019).
[50] C. Bandt and B. Pompe, Permutation entropy: a natu-
ral complexity measure for time series, Physical review
letters 88, 174102 (2002).
[51] B. Fadlallah, B. Chen, A. Keil, and J. Principe,
Weighted-permutation entropy: A complexity measure
for time series incorporating amplitude information,
Physical Review E 87, 022911 (2013).
[52] J. Amigó, Permutation complexity in dynamical systems:
ordinal patterns, permutation entropy and all that
(Springer Science & Business Media, 2010).
[53] M. Zanin, L. Zunino, O. A. Rosso, and D. Papo, Per-
mutation entropy and its main biomedical and econo-
physics applications: a review, Entropy 14, 1553 (2012).
[54] K. Keller, A. M. Unakafov, and V. A. Unakafova, Ordi-
nal patterns, entropy, and eeg, Entropy 16, 6212 (2014).
[55] M. Zanin and F. Olivares, Ordinal patterns-based
methodologies for distinguishing chaos from noise in dis-
crete time series, Communications Physics 4, 1 (2021).
[56] M. McCullough, M. Small, T. Stemler, and H. H.-C.
Iu, Time lagged ordinal partition networks for captur-
ing dynamics of continuous dynamical systems, Chaos:
An Interdisciplinary Journal of Nonlinear Science 25,
053101 (2015).
[57] C. Bandt, G. Keller, and B. Pompe, Entropy of interval
maps via permutations, Nonlinearity 15, 1595 (2002).
[58] J. M. Amigó, M. B. Kennel, and L. Kocarev, The permutation
entropy rate equals the metric entropy rate for
ergodic information sources and ergodic dynamical sys-
tems, Physica D: Nonlinear Phenomena 210, 77 (2005).
[59] S. Solomon, M. Manning, M. Marquis, D. Qin, et al.,
Climate change 2007-the physical science basis: Work-
ing group I contribution to the fourth assessment report
of the IPCC, Vol. 4 (Cambridge university press, 2007).
[60] W. H. Press, B. P. Flannery, S. A. Teukolsky, W. T.
Vetterling, and P. B. Kramer, Numerical recipes: the art
of scientific computing, Physics Today 40, 120 (1987).
[61] J. Theiler, S. Eubank, A. Longtin, B. Galdrikian, and
J. D. Farmer, Testing for nonlinearity in time series: the
method of surrogate data, Physica D: Nonlinear Phe-
nomena 58, 77 (1992).
[62] B. J. Mills, A. J. Krause, C. R. Scotese, D. J. Hill,
G. A. Shields, and T. M. Lenton, Modelling the long-
term carbon cycle, atmospheric co2, and earth surface
temperature from late neoproterozoic to present day,
Gondwana Research 67, 172 (2019).
[63] T. E. Wong, Y. Cui, D. L. Royer, and K. Keller, A
tighter constraint on earth-system sensitivity from long-
term temperature and carbon-cycle observations, Na-
ture communications 12, 1 (2021).
[64] P. I. W. G. of PAGES, Interglacials of the last 800,000
years, Reviews of Geophysics 54, 162 (2016).
[65] D. Lüthi, M. Le Floch, B. Bereiter, T. Blunier, J.-M.
Barnola, U. Siegenthaler, D. Raynaud, J. Jouzel, H. Fis-
cher, K. Kawamura, et al., High-resolution carbon diox-
ide concentration record 650,000–800,000 years before
present, nature 453, 379 (2008).
[66] B. Bereiter, S. Eggleston, J. Schmitt, C. Nehrbass-
Ahles, T. F. Stocker, H. Fischer, S. Kipfstuhl, and
J. Chappellaz, Revision of the epica dome c co2 record
from 800 to 600 kyr before present, Geophysical Re-
search Letters 42, 542 (2015).
[67] L. Loulergue, A. Schilt, R. Spahni, V. Masson-
Delmotte, T. Blunier, B. Lemieux, J.-M. Barnola,
D. Raynaud, T. F. Stocker, and J. Chappellaz, Orbital
and millennial-scale features of atmospheric ch 4 over
the past 800,000 years, Nature 453, 383 (2008).
[68] L. Bazin, A. Landais, B. Lemieux-Dudon, H. Toyé Mahamadou Kele,
D. Veres, F. Parrenin, P. Martinerie,
C. Ritz, E. Capron, V. Lipenkov, et al., An optimized
multi-proxy, multi-site antarctic ice and gas orbital
chronology (aicc2012): 120–800 ka, Climate of the Past
9, 1715 (2013).
[69] H. Elderfield, P. Ferretti, M. Greaves, S. Crowhurst,
I. N. McCave, D. Hodell, and A. M. Piotrowski, Evo-
lution of ocean temperature and ice volume through
the mid-pleistocene climate transition, science 337, 704
(2012).
[70] J. H. Lawrimore, M. J. Menne, B. E. Gleason, C. N.
Williams, D. B. Wuertz, R. S. Vose, and J. Rennie,
An overview of the global historical climatology network
monthly mean temperature data set, version 3, Journal
of Geophysical Research: Atmospheres 116 (2011).
[71] J. Li, S.-P. Xie, E. R. Cook, G. Huang, R. D’arrigo,
F. Liu, J. Ma, and X.-T. Zheng, Interdecadal modulation
of el niño amplitude during the past millennium,
Nature climate change 1, 114 (2011).
[72] F. Shi, J. Li, and R. J. Wilson, A tree-ring reconstruc-
tion of the south asian summer monsoon index over the
past millennium, Scientific Reports 4, 1 (2014).
[73] N. Rayner, D. E. Parker, E. Horton, C. K. Folland,
L. V. Alexander, D. Rowell, E. C. Kent, and A. Kaplan,
Global analyses of sea surface temperature, sea ice, and
night marine air temperature since the late nineteenth
century, Journal of Geophysical Research: Atmospheres
108 (2003).
[74] J. Luterbacher, C. Schmutz, D. Gyalistras, E. Xoplaki,
and H. Wanner, Reconstruction of monthly nao and eu
indices back to ad 1675, Geophysical Research Letters
26, 2745 (1999).
[75] J. Luterbacher, E. Xoplaki, D. Dietrich, P. Jones,
T. Davies, D. Portis, J. Gonzalez-Rouco, H. Von Storch,
D. Gyalistras, C. Casty, et al., Extending north atlantic
oscillation reconstructions back to 1500, Atmospheric
Science Letters 2, 114 (2001).
[76] K. E. Trenberth and D. A. Paolino Jr, The northern
hemisphere sea-level pressure data set: Trends, errors
and discontinuities, Monthly Weather Review 108, 855
(1980).
[77] P. Dobrovolný, A. Moberg, R. Brázdil, C. Pfister,
R. Glaser, R. Wilson, A. van Engelen, D. Limanówka,
A. Kiss, M. Halíčková, et al., Monthly, seasonal and
annual temperature reconstructions for central europe
derived from documentary evidence and instrumental
records since ad 1500, Climatic change 101, 69 (2010).
[78] A. G. Barnston and R. E. Livezey, Classification, sea-
sonality and persistence of low-frequency atmospheric
circulation patterns, Monthly weather review 115, 1083
(1987).
[79] W. Y. Chen and H. Van den Dool, Sensitivity of tele-
connection patterns to the sign of their primary action
center, Monthly weather review 131, 2885 (2003).
[80] H. Van den Dool, S. Saha, and A. Johansson, Empirical
orthogonal teleconnections, Journal of Climate 13, 1421
(2000).
[81] A. Klein Tank, J. Wijngaard, G. Können, R. Böhm,
G. Demarée, A. Gocheva, M. Mileta, S. Pashiardis,
L. Hejkrlik, C. Kern-Hansen, et al., Daily dataset of
20th-century surface air temperature and precipitation
series for the european climate assessment, International
Journal of Climatology: A Journal of the Royal Meteo-
rological Society 22, 1441 (2002).
[82] D. N. Politis and J. P. Romano, The stationary boot-
strap, Journal of the American Statistical association
89, 1303 (1994).
[83] E. Foote, Art. xxxi.–circumstances affecting the heat of
the sun’s rays, American Journal of Science and Arts
(1820-1879) 22, 382 (1856).
[84] S. Arrhenius, Xxxi. on the influence of carbonic acid
in the air upon the temperature of the ground, The
London, Edinburgh, and Dublin Philosophical Maga-
zine and Journal of Science 41, 237 (1896).
[85] E. Kodra, S. Chatterjee, and A. R. Ganguly, Exploring
granger causality between global average observed time
series of carbon dioxide and temperature, Theoretical
and applied climatology 104, 325 (2011).
[86] A. Attanasio, Testing for linear granger causality from
natural/anthropogenic forcings to global temperature
anomalies, Theoretical and applied climatology 110,
281 (2012).
[87] D. I. Stern and R. K. Kaufmann, Anthropogenic and
natural causes of climate change, Climatic change 122,
257 (2014).
[88] J. Kang and R. Larsson, What is the link between tem-
perature and carbon dioxide levels? a granger causality
analysis based on ice core data, Theoretical and applied
climatology 116, 537 (2014).
[89] U. Triacca, On the use of Granger causality to investigate
the human influence on climate, Theoretical and
Applied Climatology 69, 137 (2001).
[90] U. Triacca, Is Granger causality analysis appropriate to
investigate the relationship between atmospheric con-
centration of carbon dioxide and global surface air tem-
perature?, Theoretical and applied climatology 81, 133
(2005).
[91] A. Stips, D. Macias, C. Coughlan, E. Garcia-Gorriz, and
X. San Liang, On the causal structure between CO2 and
global temperature, Scientific reports 6, 1 (2016).
[92] P. Goulet Coulombe and M. Göbel, On spurious causality,
CO2, and global temperature, Econometrics 9, 33
(2021).
[93] E. H. Van Nes, M. Scheffer, V. Brovkin, T. M. Lenton,
H. Ye, E. Deyle, and G. Sugihara, Causal feedbacks in
climate change, Nature Climate Change 5, 445 (2015).
[94] D. Koutsoyiannis and Z. W. Kundzewicz, Atmospheric
temperature and CO2: Hen-or-egg causality?, Sci 2, 83
(2020).
[95] D. Mønster, R. Fusaroli, K. Tylén, A. Roepstorff, and
J. F. Sherson, Causal inference from noisy time-series
data - Testing the convergent cross-mapping algorithm
in the presence of noise and external influence, Future
Generation Computer Systems 73, 52 (2017).
[96] K. Schiecke, B. Pester, M. Feucht, L. Leistritz, and
H. Witte, Convergent cross mapping: Basic concept,
influence of estimation parameters and practical appli-
cation, in 2015 37th Annual International Conference of
the IEEE Engineering in Medicine and Biology Society
(EMBC) (IEEE, 2015) pp. 7418–7421.
[97] R. J. Janse, T. Hoekstra, K. J. Jager, C. Zoccali,
G. Tripepi, F. W. Dekker, and M. van Diepen, Con-
ducting correlation analysis: Important limitations and
pitfalls, Clinical Kidney Journal (2021).
[98] E. J. Brook, T. Sowers, and J. Orchardo, Rapid varia-
tions in atmospheric methane concentration during the
past 110,000 years, Science 273, 1087 (1996).
[99] K. Thirumalai, S. C. Clemens, and J. W. Partin,
Methane, monsoons, and modulation of millennial-
scale climate, Geophysical Research Letters 47,
e2020GL087613 (2020).
[100] R. H. Kripalani and A. Kulkarni, Rainfall variability
over South–East Asia—connections with Indian monsoon
and ENSO extremes: new perspectives, International
Journal of Climatology: A Journal of the Royal Meteo-
rological Society 17, 1155 (1997).
[101] K. K. Kumar, B. Rajagopalan, and M. A. Cane, On the
weakening relationship between the Indian monsoon and
ENSO, Science 284, 2156 (1999).
[102] V. Krishnamurthy and B. N. Goswami, Indian
monsoon–ENSO relationship on interdecadal timescale,
Journal of climate 13, 579 (2000).
[103] S. Sarkar, R. P. Singh, and M. Kafatos, Further evidences
for the weakening relationship of Indian rainfall
and ENSO over India, Geophysical research letters 31
(2004).
[104] D. Maraun and J. Kurths, Epochs of phase coherence
between El Niño/Southern Oscillation and Indian monsoon,
Geophysical Research Letters 32 (2005).
[105] L. Zubair and C. F. Ropelewski, The strengthening relationship
between ENSO and northeast monsoon rainfall
over Sri Lanka and southern India, Journal of Climate
19, 1567 (2006).
[106] I. I. Mokhov, D. A. Smirnov, P. I. Nakonechny, S. S.
Kozlenko, E. P. Seleznev, and J. Kurths, Alternating
mutual influence of El-Niño/Southern Oscillation and Indian
monsoon, Geophysical Research Letters 38 (2011).
[107] I. Mokhov, D. Smirnov, P. Nakonechny, S. Kozlenko,
and J. Kurths, Relationship between El-Niño/Southern
Oscillation and the Indian monsoon, Izvestiya, Atmospheric
and Oceanic Physics 48, 47 (2012).
[108] T. Le, K.-J. Ha, D.-H. Bae, and S.-H. Kim, Causal effects
of Indian Ocean Dipole on El Niño–Southern Oscillation
during 1950–2014 based on high-resolution models
and reanalysis data, Environmental Research Letters
15, 1040b6 (2020).
[109] H. Wanner, S. Brönnimann, C. Casty, D. Gyalistras,
J. Luterbacher, C. Schmutz, D. B. Stephenson, and
E. Xoplaki, North Atlantic Oscillation–concepts and
studies, Surveys in geophysics 22, 321 (2001).
[110] J. W. Hurrell and C. Deser, North Atlantic climate variability:
the role of the North Atlantic Oscillation, Journal
of marine systems 79, 231 (2010).
[111] C. Deser, J. W. Hurrell, and A. S. Phillips, The role of
the North Atlantic Oscillation in European climate projections,
Climate dynamics 49, 3141 (2017).
[112] W. Wang, B. T. Anderson, R. K. Kaufmann, and R. B.
Myneni, The relation between the North Atlantic Oscillation
and SSTs in the North Atlantic basin, Journal of
Climate 17, 4752 (2004).
[113] G. Wang, N. Zhang, K. Fan, and M. Paluš, Central European
air temperature: driving force analysis and causal
influence of NAO, Theoretical and Applied Climatology
137, 1421 (2019).
[114] J. Hlinka, N. Jajcay, D. Hartman, and M. Paluš, Smooth
information flow in temperature climate network reflects
mass transport, Chaos: An Interdisciplinary Journal of
Nonlinear Science 27, 035811 (2017).
[115] N. Nagaraj, K. Balasubramanian, and S. Dey, A new
complexity measure for time series analysis and classi-
fication, The European Physical Journal Special Topics
222, 847 (2013).