Available via license: CC BY 4.0

Content may be subject to copyright.

©2020 The Authors.Journal of the Royal Statistical Society: Series A (Statistics in Society)

Published by John Wiley & Sons Ltd on behalf of the Royal Statistical Society.This is an open access article under the

terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium,

provided the original work is properly cited.

0964–1998/20/183000

J. R. Statist. Soc. A (2020)

How does temperature vary over time?: evidence on

the stationary and fractal nature of temperature

ﬂuctuations

John K. Dagsvik,

Statistics Norway, Oslo, Norway

Mariachiara Fortuna

Vanlog, Turin, Italy

and Sigmund Hov Moen

Oslo, Norway

[Received September 2016. Final revision January 2020]

Summary. The paper analyses temperature data from 96 selected weather stations world

wide, and from reconstructed northern hemisphere temperature data over the last two millennia.

Using a non-parametric test, we ﬁnd that the stationarity hypothesis is not rejected by the data.

Subsequently, we investigate further properties of the data by means of a statistical model

known as the fractional Gaussian noise (FGN) model. Under stationarity FGN follows from the

fact that the observed data are obtained as temporal aggregates of data generated at a ﬁner

(basic) timescale where temporal aggregation is taken over a ‘large’ number of basic units.The

FGN process exhibits long-range dependence. Several tests show that both the reconstructed

and most of the observed data are consistent with the FGN model.

Keywords: Fractional Gaussian noise; Hurst index; Observed and reconstructed

temperatures; Self-similar processes;Temperature change

1. Introduction

Debate about whether or not a change in temperature levels is taking place has been ﬁerce

over the past few decades. It seems to be widely accepted now that there has been a systematic

change in the temperature process in recent years. However, this is not as easy to demonstrate

as might be expected. Recorded series of temperatures typically exhibit local trends and cycles

that sometimes seem to persist for up to several decades. The difﬁculty of settling this matter

relates to the fact that the lengths of the available observed time series of temperature data are

limited. Few of the recorded temperature series are longer than 250 years. Although 250 years

may seem a long time, it is not when it comes to examining properties such as stationarity and

long-range dependence.

Recently, there have been attempts to reconstruct temperature data from other sources. One

important data set has been obtained by Moberg et al. (2005a), who have reconstructed annual

temperatures from 1979 back to 1 AD based on information from ice core drillings, tree rings,

lake sediments and other sources. The advantage of the reconstructed data sets is that they

Address for correspondence: John K. Dagsvik, PO Box 2633, St Hanshaugen, NO-0131 Oslo, Norway.

E-mail: john.dagsvik@ssb.no

2J. K. Dagsvik, M. Fortuna and S. Hov Moen

cover quite a long time range. However, since they are reconstructions one cannot rule out the

possibility that the extent of measurement error might be substantial.

Whereas existing physical models provide good short-term temperature forecasts, they are not

reliable for forecasts more than a few days ahead. In the absence of a physical model that is ca-

pable of explaining weather dynamics precisely over a longer time range, an interesting question

is whether temperature ﬂuctuations are consistent with outcomes of an underlying stationary

stochastic mechanism (process), and what the features of such a stochastic process might be. In

this paper we address several challenges. First, we investigate whether the temperature process is

stationary by applying a non-parametric test on observed as well as reconstructed data. Second,

we apply theoretical arguments to justify a speciﬁc stochastic model for the temperature process.

Third, we estimate the parameters of the model by using both observed and reconstructed data.

Finally, we apply different approaches to test whether the observed and reconstructed data are

consistent with our model.

Statistical time series analysis based on actual recorded temperature time series includes

Baillie and Chung (2002), Bloomﬁeld (1992), Bloomﬁeld and Nychka (1992), Galbraith and

Green (1992), Koenker and Schorfheide (1994), Harvey and Mills (2001), K¨

arner (2002), Gil-

Alana (2003, 2008a,b), Gay-Garcia and Estrada (2009), Mills (2007, 2010), Estrada et al. (2010),

Kaufmann et al. (2010), Schmith et al. (2012), Davidson et al. (2016) and Holt and Ter¨

asvirta

(2020). Mills (2007) analysed the reconstructed data that were obtained by Moberg et al. (2005a).

With a few exceptions, those references applied univariate statistical tools for analysing time

series without any a priori restrictions derived from theoretical considerations. When the choice

between statistical models is based primarily on goodness-of-ﬁt criteria, choosing between spec-

iﬁcations that yield virtually the same ﬁt but entail different interpretations and produce sub-

stantially different out-of-sample predictions is highly problematic. This difﬁculty is illustrated

by discussions in the recent literature (Kaufmann et al. (2010) and Mills (2010), and their ref-

erences) about whether the temperature process exhibits systematic trends. This difﬁculty is not

likely to be resolved by statistical tests alone. If a precise physical theory were available one might

imagine applying a combination of a priori physical assumptions and statistical modelling tech-

niques, as has been done by, for example, Aldrin et al. (2012), Humlum et al. (2013), Schmith

et al. (2012) and Solheim et al. (2011). However, such a strategy has serious drawbacks because

existing physical theories are not sufﬁciently precise to provide reliable quantitative relationships

to explain observed temperature ﬂuctuations. Climate models are rough deterministic represen-

tations of very complex phenomena and so cannot be expected to give precise answers. Since

they are deterministic, they are also not very helpful for choosing between competing statistical

formulations.

Motivated by the results from the non-parametric stationarity tests, which show that station-

arity is not rejected in most cases, we take a theoretical approach that draws on recent advances

in the theory of stochastic processes to justify the model structure before the model is confronted

with the data. Our approach is simple in the sense that no explanatory variable is introduced and

no deeper explanations of the temperature variations are provided. The justiﬁcation is based on

a key feature of the data, namely that the observed temperature series are obtained as temporal

(or can be interpreted as) aggregates of data generated at a ﬁner (basic) timescale where tempo-

ral aggregation is taken over a ‘large’ number of basic units. For example, weekly temperatures

are obtained as averages of daily (average) temperatures, monthly temperatures as averages of

weekly (average) temperatures, etc. This feature is very important because it has a crucial bearing

on the structure of the statistical model which is supposed to have generated these data. Under

stationarity and speciﬁc regularity conditions, it follows from Giraitis et al. (2012), pages 77

and 79, that the observed temperature data have, approximately, the properties of a so-called

How Does Temperature Vary over Time? 3

fractional Gaussian noise (FGN) process (Mandelbrot and van Ness, 1968; Mandelbrot and

Wallis, 1968, 1969). Recall that the stationarity assumption is reasonable since the ﬁrst-stage

non-parametric tests show that stationarity is not rejected by the reconstructed—and most of

the observed—temperature series. As will be explained later, other tests of the properties of the

model (including stationarity) are also applied. As demonstrated by Mandelbrot and Wallis

(1969), the FGN model allows for highly irregular patterns of observations. It implies a speciﬁc

structure of the serial correlation pattern. Thus, our approach avoids ad hoc parameterizations

of the serial correlation structure that are typical in statistical time series analysis. In our case,

where the timescale is based on either ‘month’ or ‘year’ as the unit, although the basic timescale

may be ‘day’ (or fractions of a day), we can safely say that temporal aggregation is taken over

a large number of units at the basic scale.

We have used data from 96 weather stations and data reconstructed by Moberg et al. (2005a,

2006) to test the stationarity hypothesis and to estimate and test the FGN model. We have used

different estimation methods and tests. These tests seem to provide convincing evidence to sup-

port the assertion that the FGN model is consistent with the major part of the data. What is par-

ticularly striking is that the reconstructed data also appear to be consistent with the FGN model.

The paper is organized as follows. In Section 2 the data are described. Section 3 contains a

discussion on the justiﬁcation of the modelling assumptions. In Section 4 we discuss methods

for estimation and testing, whereas Section 5 contains results from the estimation and testing.

Section 6 is devoted to an analysis of the reconstructed data from the last two millennia (Moberg

et al., 2005b).

Additional results are given in four on-line supplementary appendices. The ﬁrst ﬁle contains

appendices with further descriptions of the data and estimation results and ﬁgures. The second

ﬁle contains the R code that was used to produce the results in the ﬁrst ﬁle. The third ﬁle contains

the estimation results of the paper together with the R code that was used to produce the results.

The fourth ﬁle contains all the functions that were developed for this paper by Mariachiara

Fortuna written as plain text.

2. Data

Data on observed temperatures were obtained from various sources (Hov Moen; www.

rimfrost.no). These sources are the National Aeronautics and Space Administration

Goddard Institute for Space Studies, European climate assessment and data and national me-

teorological institutes, such as the Swedish Meteorological and Hydrological Institute and the

Norwegian Meteorological Institute. The data, certiﬁed by the National Aeronautics and Space

Administration, comprise time series of temperatures for 1258 weather stations in cities or

towns in more than 100 countries. The time series are available as yearly and monthly ﬁgures.

The lengths of the time series vary greatly across stations. Some, such as Uppsala, contain data

for 290 years, with monthly data from 1722 to 2012, apart from some missing observations.

Other time series are shorter than two decades. Some of the time series have several periods of

missing data. After various selection criteria were applied we ended up with our sample consist-

ing of 96 time series from 32 countries. The time series were selected on the basis of the quality

of data, such as the length and number of missing observations. More information about the

selection process is available in appendix F in the ﬁrst on-line resource ﬁle.

The ﬁrst on-line resource ﬁle contains plots of the temperature data for nine selected weather

stations (Fig. B1) as well as summary information for the 96 weather stations (Table B1). The

nine weather stations were selected because they have the longest temperature series with the

fewest interruptions. We have used annual data as well as monthly data adjusted for seasonal

4J. K. Dagsvik, M. Fortuna and S. Hov Moen

variations. Under the Gaussian hypothesis, seasonality can affect only the monthly mean, the

variance and possibly the auto-correlation function. By assuming that the auto-correlation

function does not depend on seasonal variations, we have adjusted temperatures for seasonal

variations by subtracting the respective monthly means from the observations and dividing

the observations by the respective monthly standard deviations. The ﬁgures of the temperature

series show strong cycles and local trends (Fig. 1B and appendix E in the ﬁrst on-line resource

ﬁle). (It is likely that there are measurement errors in these temperature series. Some errors are

due to the location of the measurement sites, which initially were often to be found in central

urban areas. For example, the thermometer that was used for temperature recordings in New

York City was previously in Central Park and only recently moved outside the city. It has been

documented that the temperatures tend to rise with increasing urbanization. There may also

have been variations in the quality of the thermometers that were used over time.)

We have also analysed the data obtained by Moberg et al. (2005a, 2006). They reconstructed

temperatures for the northern hemisphere from 1 AD to 1979 by using data from borehole

drillings, tree rings and lake sediments: see Moberg et al. (2005a,b) for a detailed discussion and

description of their data; see also the discussion by Moberg (2012). These data show considerable

variations over time: there are several cycles, with a high swing occurring from 1000 to 1100 and

a low swing occurring from 1500 to 1600: see Fig. 4 in Section 6.

3. Fractional Brownian motion and fractional Gaussian noise

Let {X.t/,t0}denote the (observed) temperature process, which we view as a stochastic process

in continuous time t. In this section, we provide a more precise formulation of our model and

its justiﬁcation. In what follows, let Ndenote the set of integers, including 0, and let [t] denote

the integer part of t.

Property 1. The process {X.t/,t0}is generated as averages of observations at a ﬁner (basic)

timescale, i.e.

X.t/ =1

m

[mt]

[mt−m]+1˜

X.q/

where {˜

X.q/,q∈N}denotes the process deﬁned at the basic timescale.

Suppose, for example, that the unit of the basic timescale is ‘day’ and that tindexes ‘month’.

Then ˜

X.q/ is the temperature recorded in day q,m∼

=30, and X.t/ is deﬁned as the temperature

of month t. Here, it is understood that the enumeration of the days goes from 1 to mT where

Tis the length of the time series when using the timescale that corresponds to the observed

data, which in this case is month. Alternatively, if instead tindexes year, then X.t/ is deﬁned as

the temperature of year t. In our case property 1 is simply a formalized statement of how the

observed time series (suitably adjusted for seasonal variations) have been constructed.

Hypothesis 1. The process {˜

X.q/,q∈N/ deﬁned at the basic timescale is weakly stationary

with constant mean.

Note, however, that in this paper we do not assume at the outset that the temperature process

is weakly stationary. Thus, hypothesis 1 is a property which we shall test.

To provide an a priori theoretical justiﬁcation for our model, namely the FGN process, let

τ2

m=varm

q=1

[˜

X.q/ −E{˜

X.q/}]:

How Does Temperature Vary over Time? 5

If the process {˜

X.q/,q∈N}is stationary and satisﬁes some regularity conditions, then it follows

from proposition 4.4.1, page 77, and proposition 4.4.4, page 79, in Giraitis et al. (2012) that the

process {V.t/,t0}given by

V.t/ =lim

m→∞τ−1

m

[tm]

q=1

[˜

X.q/ −E{˜

X.q/}]

is fractional Brownian motion (FBM): see theorem 3 in Appendix A. The FBM process is

Gaussian and has the property known as self-similarity. Remember that a self-similar stochas-

tic process (in continuous time) {V.t/,t0}has the property that the joint distribution of

.V.bt1/,V.bt2/,:::,V.btn// is equal to the joint distribution of .bHV.t1/,bHV.t2/,:::,bHV.tn//

for any set of time periods .t1,t2,:::,tn/, for any integer n, and any positive constant bwhere

Hsatisﬁes, 0 <H<1 (Samorodnitsky and Taqqu (1994), chapter 7). The parameter His the

so-called Hurst index, named after the British engineer Harold Edwin Hurst (1880–1978). One

way of explaining the self-similar feature (or fractal feature in a stochastic context) is to say that

the distribution of the average process (normalized to have zero mean) up to time bt is, apart

from a change of scale, the same as the distribution of the average process up to time t. In other

words, the time span over which the average is taken is not essential for the qualitative properties

of the probability law of the process. Thus, under self-similarity, a change in the input scale by

a factor bwill leave the process invariant up to a change in the ‘output’ scale by the factor bH:

The FGN process is deﬁned by {V.t/ −V.t −1/,t1}where {V.t/,t0}is FBM. It follows

immediately from the deﬁnition above that

V.t/ −V.t −1/=lim

m→∞τ−1

m

[tm]

q=[mt−m]+1

[˜

X.q/ −E{˜

X.q/}]

so that, when mis ‘large’, property 1 implies that

τ−1

m

[tm]

q=[mt−m]+1

[˜

X.q/ −E{˜

X.q/}]

is approximately FGN provided that it is weakly stationary. The FGN process has autocovari-

ance function given by

cov{X.t/,X.s/}=0:5σ2{.|t−s|+1/2H−2|t−s|2H+t−s|−1|2H}.3:1/

where σ2=var{X.1/}:We see that the autocovariance function is determined by σand the Hurst

index H. Note that equation (3.1) implies that the auto-correlation function is determined by

only one unknown parameter H. It follows from equation (3.1) that when H=0:5 the auto-

covariance of the FGN is 0 and FGN reduces to a process with independent realizations. One

can prove (see Samorodnitsky and Taqqu (1994), proposition 7.2.10, page 335) that

cov{X.s/,X.t/}∼

=σ2H.2H−1/|t−s|2H−2.3:2/

when |t−s|is large. (When H∈[0:7, 0:9] the auto-correlations obtained from expression (3.2)

differ from the corresponding exact values by less than 0.01 when |t−s|4:) This restrictive

implication of our approach contrasts with the conventional approach to statistical time series

analysis, where no or few a priori theoretical principles are invoked. Furthermore, expression

6J. K. Dagsvik, M. Fortuna and S. Hov Moen

(a)

Temperature

Year

Month

Temperature

(b)

Fig. 1. Illustration of statistical self-similarity: Paris, 1757–2009

(3.2) shows that the auto-correlation function decreases slowly according to a power function

(asymptotically). This means that FGN exhibits long-range dependence. In contrast, conven-

tional time series models typically have exponential (or linear combinations of) auto-correlation

functions. Moreover, since expression (3.2) implies that the auto-correlation function becomes

independent of the timescale we conclude that the FGN process is approximately self-similar.

Fig. 1 provides an intuitive illustration of the self-similar feature of the temperature process

for Paris. Fig. 1(a) shows the total series of annual recorded temperatures for Paris, from 1757 to

2009 (253 years). Fig. 1(b) shows the monthly temperatures over 253 months, suitably rescaled.

We note that although the two graphs are different they nevertheless appear to have quite

similar patterns (fractal feature). (Beran et al. (2013) and Mandelbrot and van Ness (1968) have

discussed aspects of the fractal feature of self-similar processes and FGN.)

Now we turn to another property of the FGN process that will be useful for interpreting

the reconstructed temperature data. The reconstructed temperature data that were obtained by

Moberg et al. (2005a,b) are based on several sources of data. A key question is therefore whether

or not a linear combination of self-similar processes is approximately a self-similar process. For

this let W=r1W1+r2W2where W1and W2are independent self-similar processes with Hurst

indices Hjand rj,j=1, 2, are weights. We have the following result.

Theorem 1. Suppose that W=r1W1+r2W2where r1and r2are constants and W1and W2are

independent self-similar processes with Hurst indices H1and H2respectively, with H1H2.

Then {W.bt/b−H1,t0}converges weakly towards W1as b→∞:

The proof of theorem 1 is given in Appendix A. The result of theorem 1 means that a linear

combination of independent self-similar processes will be approximately self-similar with Hurst

index equal to the largest Hurst index of the processes. Accordingly, the sum of FGN processes

will also be approximately FGN with Hurst index equal to the largest Hurst index of the original

processes.

How Does Temperature Vary over Time? 7

4. Methods of statistical inference

In the previous section we presented theoretical support for FGN as a model representation

of the temperature process under the hypothesis of stationarity (hypothesis 1). Thus, it is of

fundamental importance to test hypothesis 1. The theoretical result in support for the FGN

model is valid only for large samples (asymptotic results) so it is of interest also to test the

properties of the FGN even if hypothesis 1 has been found to hold. There are several methods

that can be applied to analyse long memory processes; see Beran et al. (2013). In this paper

we have used a non-parametric test (Cho, 2016) to test for stationarity, a method based on the

empirical characteristic function, the Whittle method and the wavelet lifting method (Knight

et al., 2017).

4.1. Method based on the characteristic function

In this section we outline a speciﬁc regression approach based on the empirical characteristic

function that is used for estimation and for carrying out graphical tests of whether the ob-

served temperature process is consistent with the FGN model. This method is an extension of

Koutrouvelis (1980) and Koutrouvelis and Bauer (1982) to the time series setting. In the time

series setting an important question is whether the empirical characteristic function is consis-

tent for the corresponding theoretical characteristic function. Feuerverger (1990) has proved

strong consistency of the empirical characteristic function in stationary time series under a spe-

ciﬁc condition on the autocovariance function. This condition does not, however, hold for long

memory time series. It is therefore of interest to establish consistency of the empirical charac-

teristic function in our setting. The following theorem yields strong consistency of the empirical

characteristic function for long memory Gaussian processes.

Theorem 2. Assume that {Z.t/,t∈N}is a Gaussian process in discrete time with the property

that

|cov{Z.s/,Z.t/}|K|t−s|δ

for all s,t∈Nwhere K>0 and δ<0 are constants and let

ˆ

ψ.s,s+T;λ/=1

T

s+T

r=s+1

exp{iλZ.r/}

and

ψ.s,s+T;λ/=1

T

s+T

r=s+1

Eexp{iλZ.r/}

where i =√.−1/: Then

ˆ

ψ.s,s+T;λ/−ψ.s,s+T;λ/→0

with probability 1 as T→∞:

The proof of theorem 2 is given in Appendix A. Theorem 2 shows that the empirical character-

istic function for Gaussian processes is a strongly consistent estimator of the mean characteristic

function even when the process {Z.t/,t∈N}is non-stationary. Regarding the FGN process it

follows from expression (3.2) with δ=2H−2 that the condition of theorem 2 holds. In the

stationary case ψ.s,s+T;λ/reduces to ψ.s,s+T;λ/=E[exp{iλZ.s/}]:

Let {Y.t/,t∈N}denote the aggregate process in discrete time deﬁned by

8J. K. Dagsvik, M. Fortuna and S. Hov Moen

Y.t/ =

t

k=1

[X.k/ −E{X.k/}]

and let

ϕ.t −s;λ/=Eexpiλ{Y.t/ −Y.s/}

√|t−s|

be the characteristic function of Y.t/ −Y.s/ for real λ:The reason why we divide the exponent

above by √|t−s|is that this will enable us to calculate reliable estimates of the corresponding

empirical characteristic function for higher values of λthan otherwise. Deﬁne the corresponding

empirical counterpart of the characteristic function above by

ˆϕ.t −s;λ/=1

T−|t−s|

T−|t−s|

k=1

expiλ{Y.|t−s|+k/ −Y.k/}

√|t−s|:.4:1/

As explained in Appendix A, it follows from equation (4.1) that

|ˆϕ.t −s;λ/|={ˆϕ.t −s;λ/ˆϕ.t −s;−λ/}1=2

=1

T−|t−s|T−|t−s|

k=1

T−|t−s|

r=1

cosλ{Y.|t−s|+k/ −Y.k/ −Y.|t−s|+r/ +Y.r/}

√|t−s|1=2

:.4:2/

Formula (4.2) is more convenient than formula (4.1) for calculating numerical values.

Note next that if Zis a normally distributed random variable with mean band standard

deviation σit is well known that for any real or complex cwe have

E{exp.cZ/}=exp.bc +σ2c2=2/: .4:3/

When {Y.t/,t∈N}is FBM it follows by using equation (4.3) and the self-similarity property

that

ϕ.t −s;λ/=E

expiλY.|t−s|/

√|t−s|=E[exp{iλ|t−s|H−0:5Y.1/}]

=exp.−0:5σ2|t−s|2H−1λ2/: .4:4/

From equation (4.4) it follows that

log{−log |ϕ.t −s;λ/|}=.2H−1/log |t−s|+2log|λ|+log.0:5σ2/: .4:5/

By replacing the theoretical characteristic function with the corresponding empirical character-

istic function we obtain from equation (4.5) that

log{−log |ˆϕ.t −s;λ/|}=.2H−1/log |t−s|+2log|λ|+log.0:5σ2/+".s,t/ .4:6/

where ".s,t/ is an error term that is small when the number of observations is large. Note

that the right-hand side of equation (4.6) is linear in log |t−s|and in log |λ|:Thus, equation

(4.6) implies that Hand σcan be estimated by regression analysis by treating log|t−s|and

log |λ|as independent variables with suitable values of t−sand λ. Koutrovelis (1980) has

given a look-up table with recommended values of λand t−s: Kogon and Williams (1998)

proposed the use of 10 equally spaced points, ranging from λ=0:1toλ=1. Because the normal

distribution is symmetric around zero this is equivalent to using values from λ=1toλ=10:In

How Does Temperature Vary over Time? 9

our estimation we have used tj−sj=1, 2, :::, 20, for the nine selected cities that are reported

below and tj−sj=1, 2, :::, 10, for the remaining weather stations. As regards λwe have chosen

the points λj=0:1, 0:2, :::,1, for j=1, 2, :::, 10. If the temperature process is not stationary

log{−log |ˆϕ.t −s;λ/|}will in general not be approximately linear in log |t−s|and log |λ|:When

{Y.t/,t0}is a self-similar and stable process with stationary stable increments one can show

in a similar way that

log{−log |ˆϕ.t −s;λ/|}=α.H −0:5/log |t−s|+αlog |λ|+log.τα/+".s,t/ .4:7/

where τis the scale parameter and α∈.0,2] is the index parameter representing tail thickness in

the stable distribution. Thus, if the slope that is associated with log |λ|is found to be less than

2 it indicates that the process is not Gaussian but stable. Under the stationarity hypothesis let

μ=E{X.t/}. In Appendix A we derive the estimator ˆμof μ, given in equation (A.10), based on

the characteristic-function approach.

Kogon and Williams (1998) have examined the properties of the Koutrouvelis method further

in the context of estimating the parameters of stable distributions and found that it performs

quite well. However, to the best of our knowledge the characteristic-function-based method that

was outlined above has not been applied for estimation of self-similar processes. The character-

istic function approach can also be used to carry out graphical tests. This is done by plotting the

expression on the left-hand side of equation (4.6) (or (4.7)). If the FGN model is correct this plot

will be linear in log |t−s|and in log |λ|, with the coefﬁcient that is associated with log |λ|equal to

2. When producing the plots we have used the same values of λand t−sas during the estimation.

Recall that even if all one-dimensional marginal distributions are normal it does not neces-

sarily follow that the corresponding joint distribution is multivariate normal. The tests based on

the empirical characteristic function that was discussed above can be extended to test whether

the joint distribution of the temperatures at several points in time is multivariate normal.

We have conducted a series of bootstrap simulations to check whether or not the distributions

of the estimates ˆ

H,ˆσand ˆμthat are obtained by the characteristic function procedure are

asymptotically normal and unbiased. For this we have computed corresponding Q–Q-plots,

which are obtained by bootstrapping based on 1000 simulated time series with length 2000

(Figs D1 and D2 in the ﬁrst on-line resource ﬁle). These plots seem to be almost perfectly linear

and clearly indicate that the estimates are normally distributed. However, the estimates of H

seem to be downward biased when His greater than 0.8.

4.2. Test of goodness of ﬁt

We have also used a χ2-type statistic, here called the Q-statistic, to examine whether the data

are consistent with the FGN model. (Note that the Ljung–Box test cannot be used. According

to Chen and Deo (2004), the distribution of this test is not known when the data exhibit long-

range dependence.) Let {Z.t/,t∈N}denote a Gaussian stochastic process with zero mean

and let ZT=.Z.T/,:::,Z.2/,Z.1//be a random Gaussian vector. Let ΩT.H/ be the covari-

ance matrix of ZTwhen {Z.t/,t∈N}is FGN. In this case it is well known that Z

TΩ−1

T.H/ZTis

χ2distributed with Tdegrees of freedom and therefore Z

TΩ−1

T.H/ZThas the same distribution as

T

j=1

UjT

where UjT ,j=1, 2, :::,T, are independent χ2-distributed random variables with 1 degree of

freedom, E.UjT /=1 and var.UjT /=2:Consequently, the central limit theorem applies and

implies that the distribution of the statistic Qgiven by

10 J. K. Dagsvik, M. Fortuna and S. Hov Moen

QT.H/ :=Z

TΩ−1

T.H/ZT−T

√.2T/ .4:8/

converges to a standard normal distribution as T→∞:Consider the hypothesis H0that {Z.t/,t∈

N}is an FGN process against the alternative that the vector ZTis not Gaussian and/or the co-

variance matrix is different from ΩT.H/. Then, H0will be rejected if |QT.H/|>κwhere κis the

critical value that corresponds to the selected level of signiﬁcance.

To check whether QT.H/ is normally distributed in ﬁnite samples with high values of H

we have run bootstrap simulations based on the FGN model with sample size T=2000. The

results are displayed in Fig. D6 in appendix D in the ﬁrst on-line resource ﬁle. Similarly, we have

also used simulations to investigate whether the distribution of QT.H/ is affected by missing

observations. The simulations show that the distribution of QT.H/, even for high values of H,

is still very close to a standard normal distribution under the null hypothesis that the FGN

model is the true model and, furthermore, the occurrence of a moderate number of randomly

dispersed missing data does not affect the distribution of QT.H/ (Fig. D5 in appendix D in the

ﬁrst on-line resource ﬁle).

5. Empirical results based on the observed temperature data

5.1. Non-parametric testing of stationarity

Several non-parametric tests of stationarity are available. As mentioned above, we have applied

the test that was developed recently by Cho (2016). In this method the test statistic is obtained

from measuring and maximizing the difference in the second-order structure over pairs of ran-

domly drawn intervals. Cho (2016) proved asymptotic normality of the statistic under speciﬁc

conditions. These conditions are satisﬁed for Gaussian and many non-Gaussian processes. Cho

(2016) demonstrated that his test has good ﬁnite sample performance on both simulated and real

data. (We have chosen to apply the test of Cho (2016) because it seems to have good properties

and is easy to use. R code is available and is also easy to use.)

For the monthly time series, stationarity is rejected for between about 14 and 18 out of the

total of 96 series at a 5% signiﬁcance level. (The variability in the number of rejections stems

from the fact that different subsamples are selected each time that the test procedure is applied.)

The number of rejections might vary slightly from one run to another because the intervals are

drawn randomly. In contrast, for the annual time series, stationarity is rejected for only one

series (Djupivogur, Iceland) at 5% signiﬁcance level. Note that the monthly and annual data

cover the same time range. The reason why stationarity is rejected for some of the monthly series

(compared with the annual series) might be that those series are quite long, or, alternatively, that

the control for seasonality is insufﬁcient (Tables C6 and C7 in appendix C in the ﬁrst on-line

resource ﬁle).

We have also applied the stationary test of Cho (2016) to the Moberg data (Moberg et al.,

2005a, 2006). We ﬁnd that stationarity is not rejected at the 5% signiﬁcance level (Table C5 in

appendix C in the ﬁrst on-line resource ﬁle).

5.2. Inference based on the maximum likelihood, the characteristic function regression

and the wavelet lifting methods

In this section we report inference results for nine selected weather stations. These were chosen

because they are among those with the longest series and have the fewest missing observations

(appendix F in the ﬁrst on-line resource ﬁle). The estimates for the remaining stations are given

in Tables C1 and C2 in appendix C in the ﬁrst on-line resource ﬁle. Recall that when the time

How Does Temperature Vary over Time? 11

series are Gaussian processes the likelihood function is fully identiﬁed by the autocovariance

function and the mean. Recall that the FGN model is approximately invariant under choice of

timescale, i.e. whether one uses observations on annual or monthly data the structure of the

auto-correlation function is approximately the same (with the same H). In Table 1 we show

estimates of Hbased on the characteristic function approach, HCand estimates obtained by the

Whittle method HW: see Beran (1994), chapter 6.1. The Whittle method, which is based on an

approximation of the likelihood function, has been found to perform well (Rea et al., 2013). (For

long time series the determinant of the corresponding autocovariance matrix of the underlying

stochastic process may be close to zero. In such cases exact maximum likelihood estimation often

does not work, which is the case in our application. Rea et al. (2013) have studied the performance

of several estimators of long memory processes and have found that the Whittle method is among

the better estimators of those examined. Also, Abry et al. (2000) have found that the Whittle

estimator may be heavily affected by trends.) The estimates of the standard errors of the char-

acteristic function estimates are obtained by bootstrap simulations. For this we have used 1000

simulations of FGN time series of length 2000 (appendix D in the ﬁrst on-line resource ﬁle). A

striking feature of these estimates is that the Hurst index does not vary much across weather

stations.

The estimates based on monthly data are quite precise because we have long time series. We

note that the maximum likelihood estimates, and the estimates obtained by the characteristic

function regression procedure, are quite similar. Since our method for seasonal adjustment may

not be optimal, this may produce additional ‘noise’ in the data. One might therefore expect

estimates based on annual data to yield higher estimates of Hbecause the presence of random

noise might weaken the serial dependence in the seasonally adjusted monthly temperature series.

This is in fact what we ﬁnd, namely that the estimates of Hbased on annual data often are

signiﬁcantly higher than the estimates based on monthly data.

In Tables D1, D2 and D3 (appendix D in the ﬁrst on-line resource ﬁle) we also report results

from bootstrap simulations of the properties of various estimators. These results show that the

characteristic function regression estimator seems to be downward biased when His greater than

0.8, whereas the Whittle estimator appears to be unbiased even for large H, such as H=0:95.

We note that for values of Hgreater than 0.8 all the estimators seem to underestimate σ.

To test for normality of the time series, it is possible to use the Kolmogorov–Smirnov test

(Beran (1994), page 199). In this paper we have used the characteristic function approach that

was outlined above for this. Recall that under the assumptions of theorem 3 in appendix A it

Tab l e 1. Estimates of the Hurst parameter for selected cities based on characteristic

function regression and the Whittle maximum likelihood estimation method: monthly and

annual data†

City HCHWHCHW

Germany, Berlin 0.664 (0.009) 0.662 (0.012) 0.726 (0.019) 0.712 (0.041)

Switzerland, Geneva 0.693 (0.001) 0.667 (0.012) 0.845 (0.021) 0.818 (0.042)

Switzerland, Basel 0.625 (0.011) 0.622 (0.012) 0.664 (0.019) 0.720 (0.042)

France, Paris 0.733 (0.010) 0.672 (0.012) 0.873 (0.023) 0.802 (0.042)

Sweden, Stockholm 0.681 (0.015) 0.721 (0.012) 0.614 (0.017) 0.632 (0.041)

Italy, Milan 0.724 (0.019) 0.709 (0.012) 0.851 (0.023) 0.826 (0.043)

Czech Republic, Prague 0.684 (0.015) 0.670 (0.012) 0.745 (0.019) 0.716 (0.043)

Hungary, Budapest 0.627 (0.011) 0.645 (0.012) 0.682 (0.019) 0.663 (0.043)

Denmark, Copenhagen 0.755 (0.051) 0.758 (0.013) 0.817 (0.021) 0.753 (0.045)

†Standard errors are in parentheses.

12 J. K. Dagsvik, M. Fortuna and S. Hov Moen

follows that the temperature process is asymptotically Gaussian. It is still of interest to test for

normality since we do not know to what extent the assumptions are met in ﬁnite samples and

with ﬁnite m. Under the FGN hypothesis it follows from equation (4.7) that if we select values λj

as described above and plot the left-hand side of equation (4.7) against log.λj/the plot will be

linear with the slope (tail index α) close to 2. In Fig. 2 we show corresponding plots for selected

cities. From Fig. 2 we see that the plots are almost perfectly linear, with most slopes between

1.99 and 2.01. In one case (Basel) the plot is linear with the slope equal to 1.98, which indicates

a stable distribution (with slightly heavier tails than a normal distribution). More results are

given in appendix E in the ﬁrst on-line resource ﬁle.

As discussed above, the characteristic function regression approach can also be applied to

check whether data indicate that {Y.t/,t0}is self-similar with stationary increments, and

whether the increments are Gaussian. We have plotted the left-hand side of equation (4.7), for

t=1, 2, :::, 10, against log |t−s|:The resulting plots are shown in Fig. 2 for the cities selected.

Additional results are given in appendix E in the ﬁrst on-line resource ﬁle. We note that in most

cases the plots are approximately linear. (There is a problem with this test because the calcula-

tion of the empirical characteristic function yields imprecise results when λis large.) We have

subsequently applied the Q-statistics deﬁned in equation (4.8) to test whether the FGN model

is consistent with the temperature data. Recall that when His known the test statistic QT.H/ is,

asymptotically, standard normally distributed, which implies that the corresponding 5% conﬁ-

dence interval is equal to (−1:96, 1.96). In Table 2 we present results for the nine cities selected.

We see from Table 2 that when His estimated by the characteristic function regression method

the model passes the test for all nine weather stations, whereas in two cases (Paris and Milan)

models that were estimated by the Whittle maximum likelihood method are rejected. For the

total data set we have found that in only two cases (Buenos Aires, Argentina, and Cap Otway,

Australia) out of the 96 time series are the models that were estimated by the characteristic func-

tion regression method inconsistent with the data, whereas in eight cases the models that were

estimated by the Whittle maximum likelihood method are inconsistent with the data (Tables C1

and C2 in appendix C in the ﬁrst on-line resource ﬁle). When His replaced by its corresponding

estimate ˆ

H(say), then it is not known what the corresponding distribution of QT.ˆ

H/ is. For

this, we have conducted a series of simulation experiments under the assumption that the FGN

model is true. These simulations indicate that when His estimated by the Whittle method the

distribution of QT.ˆ

H/ is stable with zero mean and maximally skew to the right. It follows

that when we compute conﬁdence intervals based on the unconditional (stable) distribution of

QT.ˆ

H/ the model is rejected in only one case (Adelaide) (Fig. D3 in appendix D in the ﬁrst

on-line resource ﬁle).

We have also applied the wavelet lifting method to estimate Hon the basis of the annual

data. Table C8 in appendix C in the ﬁrst on-line resource ﬁle contains estimates of Hand of the

QT.ˆ

H/statistic. According to Table C8 the model is rejected for 36 out of the 96 time series. Fig.

C1 in appendix C in the ﬁrst on-line resource ﬁle shows that the wavelet lifting method tends to

produce lower estimates of Hthan does the Whittle method. This might be the reason for the

relatively high proportion of rejections.

Some of the observed time series suffer from missing observations. To investigate whether the

distribution of QT.H/ is affected by missing observations we have carried out 1000 simulations of

FGN processes with H=0:8 and length 2000 and with 70 missing and randomly dispersed data

points. This amount of missing data is typical for the 96 selected time series. The plots in Fig. D5

in appendix D in the ﬁrst on-line resource ﬁle show that missing data have no effect in this case.

When looking at some of the observed temperature graphs which exhibit local trends of

considerable length it may seem puzzling that a stationary model such as FGN with only three

How Does Temperature Vary over Time? 13

(a)(b)

tj

j

tj

(c) (d)

j

tj

(e) (f)

j

tj

tj

tj

tjtjtj

Fig. 2. Graphical tests of the FGN model: (a) self-similarity test, Germany, Berlin; (b) normality test, Ger-

many, Berlin (estimated αD1.99); (c) self-similarity test, Switzerland, Geneva; (d) normality test, Switzerland,

Geneva (estimated αD1.99); (e) self-similarity test, Switzerland, Basel; (f) normality test, Switzerland, Basel

(estimated αD1.98)

14 J. K. Dagsvik, M. Fortuna and S. Hov Moen

Tab l e 2. Q-statistic under the FGN hypothesis

for selected cities

City Q(HC)Q(HW)

Germany, Berlin −0.517 −0.625

Switzerland, Geneva −0.222 −1.621

Switzerland, Basel −0.408 −0.483

France, Paris 1.285 −2.790

Sweden, Stockholm −1.209 1.097

Italy, Milan −1.338 −2.334

Czech Republic, Prague −0.197 −0.855

Hungary, Budapest −0.819 −0.254

Denmark, Copenhagen −1.006 −0.693

parameters can ﬁt the data. The explanation is that, in the presence of long-range dependence,

such patterns are indeed possible. To illustrate the extent of local trends and cycles of the FGN

process we have carried out simulations of the FGN process for various values of H: see Fig. 3.

Recall that the auto-correlation function of FGN is approximately invariant under choice of

timescale: the timescale in Fig. 3 may be 1 year, 10 years or 100 years. However, the corresponding

amplitudes will be affected by a change in time unit. If, for example, the timescale is changed

from 1 year to 10 years the corresponding standard deviation is found by dividing the standard

deviation based on annual data by 101−H:Recall also that most of the estimates of Hbased on

annual data have orders of magnitude in the interval (0.7, 0.9). With H=0:7 we see from Fig. 3

that from about 625 to about 720 time units there seems to be a decreasing trend, whereas from

about 260 to about 330 time units there is an increasing trend. When H=0:8 and H=0:9 this

type of pattern seems to be more pronounced. In these cases we note that the local trends may

be several hundred time units long.

To investigate whether it is possible to detect departures from stationarity given that the ‘core’

stationary process is FGN, we have conducted a simulation experiment. We have simulated 180

years of the following process: during the ﬁrst 120 years the temperature process is assumed to

be FGN, with zero mean and unit variance; during the last 60 years the temperature process

is assumed to be FGN, plus a linear trend with positive slope, starting at 0 in year 120. We

used the test based on the Q-statistic to see how steep the trend must be before the FGN

hypothesis is rejected.It turns out that the trend must be equivalent to an increase of at least about

2.16 ◦C over 60 years before departure from stationarity can be seen, when H=0:7. When H>0:7

the slope had to be even steeper for departures from the FGN model to be revealed by the

Q-test. Thus, this result indicates that the Q-test might have low power for alternatives in the

class of models where each member of the class can be broken down into an FGN process plus

a deterministic trend. This is hardly surprising given the properties of the FGN process. This

stems from the fact that, as Hincreases, the FGN model exhibits increasingly complex patterns

with long stochastic trends and cycles, as seen from the graphs in Fig. 3. However, these are

local features that disappear in the long run.

6. Empirical analysis based on two millennia of temperature constructions

In this section we discuss how the FGN model ﬁts the reconstructed data (Moberg et al., 2005a,

b). Fortunately, the argument that was used to support the hypothesis that the temperature

How Does Temperature Vary over Time? 15

(a)

(b)

(c)

Temperature

Time unit

Time unit

Time unit

Temperature Temperature

Fig. 3. Simulated FGN process with zero mean and unit variance: (a) HD0.7; (b) HD0.8; (c) HD0.9

process is FGN under stationarity also applies in this case. Recall that the reconstructions of

Moberg et al. (2005a,b) were based on temperature proxies that were obtained from several

sources, such as tree rings, lake sediments and ice core borehole data. According to the technol-

ogy of temperature reconstructions from tree rings and ice core samples, these data yield annual

temperatures which may be interpreted as temporal aggregates of data generated at a ﬁner (con-

tinuous) timescale. Consequently, it seems reasonable to assume that property 1 continues to

hold also in this case. This may be true even if the observed data contain measurement errors.

16 J. K. Dagsvik, M. Fortuna and S. Hov Moen

Hence, provided that the observed reconstructed series is generated by a stationary process, it

follows that it must be approximately an FGN process.

Fig. 4 displays the reconstructed temperatures from 1 AD to 1979. Recall that when applying

Cho’s stationarity test the hypothesis of stationarity was not rejected. We have applied the

characteristic function regression approach and the Whittle maximum likelihood estimation

procedure, as well as tests based on the graphical method. Both the plots of the normality

test and the plots of the self-similarity test are close to being perfectly linear and are therefore

consistent with the FGN hypothesis: see Fig. 5. The estimate of Hby the characteristic function

regression method is ˆ

H=0:917:Recall that according to the simulations in Table D2 in appendix

D in the ﬁrst on-line resource ﬁle the characteristic function regression estimator underestimates

Hwhen H>0:8. The estimate that is based on the Whittle method yields ˆ

H=0:990 with standard

error 0.015. By using a test based on the unconditional Q-statistic we ﬁnd, however, that both

the estimate that is based on the characteristic function and the Whittle estimate imply that the

Temperature

Year

Fig. 4. Reconstructed temperature data by Moberg et al. (2005a)

(a)(b)

tj

tj

j

tj

Fig. 5. Tests of the FGN model: (a) self-similarity test (estimated Hby regression, 0.92); (b) normality test,

(estimated αD1.98)

How Does Temperature Vary over Time? 17

FGN model is rejected. It turns out that only when His close to 0.95 is the value of QT.H/ within

the conﬁdence interval (−1:96, 1.96). Thus, only when His close to 0.95 is the FGN model not

rejected, which means that ˆ

H=0:95 might be the better estimate. Recall that when His replaced

by the Whittle estimate ˆ

Hthe distribution of QT.ˆ

H/ is stable with zero mean, maximally skew

to the right with α=1:99 when H=0:7 and H=0:8, and α=1:96 when H=0:9(TableD3

in appendix D in the ﬁrst on-line resource ﬁle). When ˆ

H=0:95 the 95% conﬁdence interval

for QT.ˆ

H/ is approximately (−6, 12). Computing QforH=0:94, 0:95, 0:96 respectively yields

values equal to −5.27, −0.94 and 5.60, which are all within the conﬁdence interval (−6, 12).

In Fig. 6 we have plotted the empirical and the FGN auto-correlation function (with H=0:95)

by using the approximation formula (A.15). This formula is a modiﬁed version of the formula

that was provided by Hosking (1996). The formula that was given by Hosking (1996) does not

seem to perform well with high values of H, whereas we have found that the formula that is

given in expression (A.15) seems to work well when His close to 0.95 (appendix A and Fig. D4

in appendix D in the ﬁrst on-line resource ﬁle). We have also provided a conﬁdence band with 2

standard deviations on either side. We see from Fig. 6 that when H=0:95 the FGN model tends

to underpredict the empirical auto-correlations of the Moberg data but the differences between

the theoretical and empirical auto-correlations are less than 2 standard deviations except for

the auto-correlation of lag 1. The standard deviations are obtained by the bootstrap method,

based on 1000 simulated time series data from the FGN model with length 2000. From the

auto-correlation plot we see that the auto-correlation of lag 1 is signiﬁcantly higher than that

predicted by the FGN model with H=0:95. One possible explanation why the auto-correlation

of lag 1 is so high may be measurement error. To realize this, suppose that the reconstructed

temperature observations can be expressed as the true temperature values plus measurement

error. If the auto-correlation function of the true temperature process is denoted ρkand the auto-

correlation function of the measurement error process {ξ.t/,t0}(say) is denoted γkthen if the

Fig. 6. Empirical and theoretical auto-correlation plots: , empirical; ; FGN model

18 J. K. Dagsvik, M. Fortuna and S. Hov Moen

measurement error process is independent of the true temperature process the auto-correlation

of the observed process, ˆρk,isgivenby ˆρk=ρk+β.γk−ρk/where

β=var{ξ.t/}

var{ξ.t/}+var{X.t/}:

So if γ1is very high, whereas γk∼

=ρkfor k>1, then we realize that the auto-correlations of lag 1

of the true FGN model and the observed reconstructed temperature process will differ whereas

the auto-correlations for higher lags will be approximately equal.

The Whittle method may not be robust with respect to departures from the assumed model,

especially when the Hurst index is close to 1 (Abry et al., 2000). To check whether the Whittle

estimation method is sensitive to a very high auto-correlation of lag 1 we have estimated the

FGN model by using only data for half of the sample, namely for the even and odd years

respectively. Since the Whittle method is based on the spectral representation of the model we

therefore needed to transform the spectral density of the FGN model to the case where the time

unit for the observed series is twice the unit of the original series, whereas we maintain the model

that is deﬁned at the original time unit. By using the fact that the autocovariance function is the

Fourier transform of the spectral density it follows easily that, if g.ω/is the spectral density of

the FGN process, then 0:5g.0:5ω/will be the spectral density for the model in the case where

either the even or the odd years of observations are used. We found that when using the odd

years we obtained the estimate ˆ

H=0:919 with standard error equal to 0.021, and when using

the even years we obtained the estimate ˆ

H=0:932 with standard error equal to 0.022. We note

that the last estimate differs from H=0:95 by only about 1 standard error. These estimation

results clearly demonstrate that the Whittle method is not robust against this type of departure

from the maintained model.

Next, we shall put forward a possible explanation for why the estimated Hurst index is found

to be so high. The argument goes as follows: if the measurement error process and the true (la-

tent) temperature process are independent and additive stationary processes, then it follows from

property 1 as discussed above that they are both approximate FGN processes. Hence, theorem 1

implies that the observed process is also approximately FGN with the Hurst parameter equal to

the maximum of the Hurst parameters of the true process and the measurement error process.

The measurement error process might be highly persistent because of systematic biases in the ap-

plied physical methodology linking the observed tree ring and borehole data to the unobserved

temperature process. In addition, Moberg et al. (2005a) have used wavelet transform techniques

to standardize, smooth and interpolate the observed proxies that were obtained from the respec-

tive sources to obtain temperature reconstructions. As a result, it may be that the Hurst exponent

of the measurement error process is higher than the Hurst index of the latent temperature process

and therefore, according to theorem 1, the Hurst index of the reconstructed series will be higher

than the Hurst index of the observed temperature series. In addition, the use of interpolation

may be one reason why the auto-correlation of lag 1 is particularly high in these data.

Mills (2007) has also analysed the data set that was obtained by Moberg et al. (2005a). As

with our analysis, he found that these data are consistent with a model that has long memory

properties and can be represented by an auto-regressive fractionally integrated moving average

process that is both stationary and mean reverting. He also showed that, if we allow for a

smoothly varying underlying trend function, then long memory disappears, implying that recent

centuries have been characterized by trend temperatures moving upwards. Thus, his analysis

provides another example of the difﬁculty of discriminating between competing models by

statistical arguments alone: see Mills (2010).

How Does Temperature Vary over Time? 19

7. Concluding remarks

In this paper we have provided evidence for why the FGN process is a good representation

of the temperature process with the Hurst parameter ranging from about 0.63 to 0.95. Both

theoretical arguments and empirical tests that were carried out support our claim. A key feature

of the FGN model is long-range dependence. The implication of long-range dependence is that

temperature series may exhibit long (non-systematic) trends and cycles. The most extreme case

is found in the reconstructed data of Moberg et al. (2005a), where some trends last for several

hundred years (Fig. 4) but still turn out to be consistent with the FGN model. Furthermore,

long-range dependence implies that temperature levels are highly persistent. For example, in the

case where the Hurst parameter is as high as 0.9, the correlation between temperatures 100 years

apart will be about 0.29. A consequence of this feature of the temperature process is that we

cannot achieve reliable claims about systematic changes in temperature levels by using simple

statistical trend analyses or ad hoc statistical models.

However, it cannot be ruled out that temperature data coupled with other types of data or

information, and models based on geophysical processes, might result in a different picture.

Hence, it may be that a systematic change in the temperature levels is under way but that our

statistical methods are unable to distinguish such changes from natural temperature variation.

Finally, we have conducted a simulation experiment based on a modiﬁcation of the FGN

model allowing for a linear trend with a positive slope during the last 60 years. The goal was to

detect how steep the trend must be before stationarity is rejected when using the test based on

the Q-statistics.

Acknowledgements

We are grateful for help and support from Mads Greaker, for comments by Arvid Raknerud,

Terje Skjerpen and Bjart Holtsmark, and for technical assistance from Gabriele Orlando. We

also thank participants in seminars at Statistics Norway, the Frisch Centre for Economic Re-

search, the Department of Mathematics, University of Oslo, and the Department of Biostatistics,

University of Oslo.

Appendix A

To make this paper self-contained we restate a very important result which is given in Giraitis et al. (2012),

page 77 and page 79. For this let Nbe the set of integers including 0 and let {˜

X.q/,q∈N}be a weakly

stationary process. Deﬁne

τ2

m=varm

q=1

[˜

X.q/ −E{˜

X.q/}]

and

Sm.t/ =

[mt]

q=1

[˜

X.q/ −E{˜

X.q/}]:

Remember that according to Wold’s representation theorem every weakly stationary process {U.t/,t∈N}

can be expressed as the sum of a deterministic process and a causal linear process, i.e.

U.t/ =ξt+∞

k=0

ak"t−k

20 J. K. Dagsvik, M. Fortuna and S. Hov Moen

where {ξt,t∈N}is a deterministic process, {"t,t∈N}is a white noise process and {ak}is a sequence with

the property

∞

k=1

a2

k<∞:

If {U.t/,t∈N}is purely stochastic then ξtis equal to a constant. Recall also that a function L.x/ (say) is

slowly varying at ∞if it has the property that

L.tx/=L.x/ →

x→∞ 1;

see Resnick (1987), page 13.

Theorem 3. Suppose that {˜

X.q/,q∈N}is a causal linear process that satisﬁes

τ2

m→

m→∞ ∞

and

τ2

m=m2HL.m/

for some constant 0 <H<1 where Lis a slowly varying function at ∞. Then the ﬁnite dimensional

distributions of the process {τ−1

mSm.t/,t0}converge towards the ﬁnite dimensional distributions of

FBM with unit variance.

Theorem 3 is a slightly modiﬁed version of proposition 4.4.1, page 77, given in Giraitis et al. (2012),

where also the proof is given. Provided that H>0:5 (which is the case in our analysis) then proposition

4.4.4, page 79, in Giraitis et al. (2012) states that {τ−1

mSm.t/,t0}converges weakly towards the standard

FBM with the uniform metric. The FBM has autocovariance function given by

cov{V.t/,V.s/}=0:5σ2.t2H+s2H−|t−s|2H/: .A.1/

A.1. Proof of theorem 1

Let W.t/ =r1W1.t/ +r2W2.t/ where r1and r2are constants and let ϕj.λ1,λ2;t1,t2/be the characteristic

function of .Wj.t1/,Wj.t2// and ϕ.λ1,λ2;t1,t2/the characteristic function of .W.t1/,W.t2//: If H1=H2

then obviously Wwill be self-similar. Assume next that H1>H

2. Then it follows that the characteristic

function of .W.bt1/b−H1,W.bt2/b−H1/is equal to

ϕ.b−H1λ1,b−H1λ2;bt1,bt2/=E[exp{ir1b−H1λ1W1.bt1/+ir1b−H1λ2W1.bt2/}]E[exp{ir2b−H2λ1W2.bt1/

+ir2b−H2λ2W2.bt2/}]

=E[exp{ir1λ1W1.t1/+ir1λ2W1.t2/}]E[exp{ir2b−H1+H2λ1W2.t1/

+ir2b−H1+H2λ2W2.t2/}]

=ϕ1.r1λ1,r1λ2;t1,t2/ϕ2.r2bH2−H1λ1,bH2−H1λ2;t1,t2/:

When b→∞the last expression tends towards

ϕ1.r1λ1,r1λ2;t1,t2/ϕ2.0, 0; t1,t2/=ϕ1.r1λ1,r1λ2;t1,t2/:

Thus we have proved that .W.bt1/b−H1,W.bt2/b−H1/is approximately distributed as .W1.t1/,W1.t2// when b

is large. Above we have considered only the bivariate characteristic function of .W.t1/,W.t2//: The argument

in the corresponding multivariate case is similar.

A.2. Proof of theorem 2

Let μ.t/ =E{Z.t/}and cov{Z.s/,Z.t/}=c.s,t/: Let

Xr=1

2.exp{iλZ.r/}−E[exp{iλZ.r/}]/:

How Does Temperature Vary over Time? 21

It follows that

|Xr|=1

2|exp{iλZ.r/}−E[exp{iλZ.r/}|1

2|exp{iλZ.r/}|+E|exp{iλZ.r/}|1:.A.2/

Using equation (4.3) it follows that

E[exp{iλZ.s/+iλZ.t/}]=exp[iλ{μ.s/ +μ.t/}−0:5λ2{c.s,s/ +c.t ,t/ +2c.s,t/}]

implying that

|cov.Xr,Xt/|=1

4exp[−0:5λ2{c.r,r/ +c.t,t/}]|1−exp{−λ2c.r,t/}|

exp[−0:5λ2{c.r,r/ +c.t,t/ −2|c.r,t/|}]|1−exp{−λ2|c.r,t/|}|

exp[−0:5λ2var{Z.r/ −Z.t/}]|1−exp{−λ2|c.r,t/|}

1−exp{−λ2|c.r,t/|}:

.A.3/

It is easily veriﬁed that when xis positive then 1 −exp.−x/ x: Accordingly, inequality (A.3) and the

assumption of theorem 2 imply that

|cov.Xr,Xr+m/|=1

4|cov.exp[iλZ.r/,exp{iλZ.r +m/}]/|1−exp{−λ2|c.r,r+m/|}

λ2|c.r,r+m/|λ2K|m|δ:

.A.4/

Moreover, since δ<0 by assumption we have that

m>1

λ2Kmδ

m=λ2K

m>1

1

m1−δ<∞:.A.5/

Consequently, equations (A.2) and (A.5) imply that the conditions of corollary 4 in Lyons (1988) are

fulﬁlled and thus

Ps+T

r=s

Xr→

T→∞ 0=1,

which is the desired result.

A.3. Details of the regression approach based on the empirical characteristic function

Here, we provide the derivation and justiﬁcation of the regression procedure based on the empirical

characteristic function. Let {Y.t/,t0}be a self-similar process with stationary stable increments. Deﬁne

the characteristic function ϕ.t −s;λ/by

ϕ.t −s;λ/=E

expiλ{Y.t/ −Y.s/}

√|t−s|

for real λwhere i =√.−1/: Deﬁne the corresponding empirical counterpart of the characteristic function

above by

ˆϕ.t −s;λ/=1

T−|t−s|

T−|t−s|

k=1

expiλ{Y.|t−s|+k/ −Y.k/}

√|t−s|:.A.6/

Clearly, under self-similarity it follows that

E{ˆϕ.t −s;λ/}=1

T−|t−s|

T−|t−s|

k=1

E

expiλ{Y.|t−s|+k/ −Y.k/}

√|t−s|=ϕ.t −s;λ/

which shows that the empirical characteristic function deﬁned in equation (A.6) is an unbiased estimator

of the corresponding theoretical characteristic function.

22 J. K. Dagsvik, M. Fortuna and S. Hov Moen

From equation (A.6) it follows readily that

|ˆϕ.t −s;λ/|2=ˆϕ.t −s;λ/ˆϕ.t −s;−λ/

=1

.T −|t−s|/2

T−|t−s|

k=1

T−|t−s|

r=1expiλ{Y.|t−s|+k/ −Y.k/}

√|t−s|−iλ{Y.|t−s|+r/ −Y.r/}

√|t−s|

=1

.T −|t−s|/2T−|t−s|

k=1

T−|t−s|

r=1

cos [λ{Y.|t−s|+k/ −Y.k/}−Y.|t−s|+r/ +Y.r/]

√|t−s|:

Since {Y.t/,t0}is self-similar with stationary stable increments it follows that

ϕ.t −s;λ/=Eexp iλY.|t−s|/

√|t−s|=E[exp{iλ|t−s|H−0:5Y.1/}]

=exp.−τα|t−s|α.H −0:5/λα/

.A.7/

where τis a dispersion parameter. Equation (A.7) is equivalent to

log{−log |ϕ.t −s;λ/|}=α.H −0:5/log |t−s|+αlog |λ|+log.τα/: .A.8/

By replacing the theoretical characteristic function by the corresponding empirical characteristic function

in equation (A.8) we obtain that

log{−log |ˆϕ.t −s;λ/|}=α.H −0:5/log |t−s|+αlog |λ|+log.τα/+".s,t/

where ".s,t/ is an error term that is small when the number of observation is large.

A.4. Estimate of μDE

{

X(t)

}

when αD2

Let

S.λ/=E[sin{λX.t/}]

and

C.λ/=E[cos{λX.t/}]:

Since

C.λ/+iS.λ/=E[exp{iλX.t/}]=exp.−0:5λ2+iλμ/=exp.−0:5σ2λ2/{cos.μλ/+i sin.μλ/}

it follows that

C.λ/=exp.−0:5σ2λ2/cos.μλ/

and

S.λ/=exp.−0:5σ2λ2/sin.μλ/

which yield

tan−1{S.λ/=C.λ/}=λμ:.A.9/

Thus, an estimator of μcan be obtained as follows (Koutrouvelis, 1980). Let

ˆ

S.λ/=1

T

T

k=1

sin{λX.k/}

and

ˆ

C.λ/=1

T

T

k=1

cos{λX.k/}:

How Does Temperature Vary over Time? 23

Evidently, ˆ

S.λ/and ˆ

C.λ/are consistent estimators for respectively S.λ/and C.λ/: Hence, for a suitable

choice of λ, equation (A.9) implies that

ˆμC=1

λtan−1ˆ

S.λ/

ˆ

C.λ/.A.10/

is a consistent estimator for μ.

A.5. Maximum likelihood and the Whittle estimator

Under the assumption that the temperature series is a Gaussian process we can also apply the method of

maximum likelihood for estimation. Speciﬁcally, given the FGN model then the autocovariance function

will depend on only two parameters, namely Hand σ. Let Ω.H/ be the matrix with elements

Ωst .H/ =0:5.|t−s|+1/2H−2|t−s|2H+.t−s|−1|2H/:

Furthermore, let

XT=.X.1/,X.2/,:::,X.T//

and

1=.1, 1, :::,1/:

Then the covariance matrix of XTcan be expressed as σ2Ω.H/. The log-likelihood function can be written

(apart from an additive constant)

log{L.μ,σ,H/}=−.XT−1μ/Ω−1.H/.XT−1μ/=2σ2−0:5Tlog.σ2/−0:5log|det{Ω.H/}|.A.11/

where μ=E{X.t/}:

In the case where Tis large it may be complicated to compute the likelihood function. However, Whittle

has demonstrated (see Beran (1994)) that the likelihood can be approximated. This approximate likelihood

converges to the exact likelihood as Tincreases without bounds.

Given Hit follows readily from equation (A.11) that the corresponding conditional maximum likelihood

estimates of μand σaregivenby

ˆμML =

T

k=1

T

r=1

Ω−1

kr .H/X.r/

T

k=1

T

r=1

Ω−1

kr .H/

and

ˆσ2

ML =1

T

T

r=1

{X.r/ −ˆμ}2:

A.6. Estimation of the auto-correlation function

In the presence of long-range dependence the usual estimator for the auto-correlation function may be

seriously biased even if long time series data are available. Let rkbe the usual estimator for the auto-

correlation ρkas given by

rk=

T−k

t=1

{X.t +k/ −¯

XT}{X.t/ −¯

XT}

T

t=1

{X.t/ −¯

XT}2

24 J. K. Dagsvik, M. Fortuna and S. Hov Moen

where ¯

XTis the sample mean. Under the condition ρk∼

=ck−γ, for large values of kwhere cis a positive

constant and 0 <γ<1

2, Hosking (1996) has obtained that

E.rk/−ρk∼

=−2.1−ρk/c

.1−γ/.2−γ/T γ:.A.12/

In our setting it follows from equation (3.2) that Hosking’s condition given above is satisﬁed and that

c=σ2H.2H−1/and γ=2−2H: The approximation in expression (A.12) becomes better with increasing

sample size T: Hence, by inserting these values into approximation (A.12) we obtain that

E.rk/−ρk∼

=ρk−1

T2−2H:.A.13/

Expression (A.13) can be applied to obtain an asymptotic unbiased estimator for ρk:

Let

ˆρk=rk+T2H−2

1+T2H−2:.A.14/

Then it follows from expression (A.13) and (A.14) that E. ˆρk/∼

=ρk, which means that ˆρkis approximately

an unbiased estimator for ρk:However, by conducting simulation experiments we have found that formula

(A.14) does not produce unbiased estimates when His large (Fig. D in appendix D in the ﬁrst on-line

resource ﬁle). Speciﬁcally, we have found that when H=0:95 the estimator

ˆρk=rk+0:9

1:9.A.15/

yields approximately unbiased estimates.

A.7. Bootstrap estimation and graphical test for the distributed of QT(ˆ

H) when ˆ

His

stochastic

When ˆ

His stochastic it is not known what the distribution of QT.ˆ

H/ is. However, we can apply the

Koutrouvelis (1980) approach to conduct a graphical test for the hypothesis that QT.ˆ

H/ follows a stable

distribution when ˆ

His normally distributed. For this let ˆ

Hjbe the bootstrap estimate that is obtained from

the jth simulated sample, j=1, 2, :::,M. For given real λ, deﬁne the empirical characteristic function of

ˆ

ψ.λ/=1

M

M

j=1

exp{iλQT.ˆ

Hj/}:

Similarly to equation (4.7) it follows from Koutrouvelis (1980) that when QT.ˆ

Hj/,j=1, 2, :::,M,are

independent and identically distributed according to a stable distribution then

log{−log |ˆ

ψ.λ/|}=αlog |λ|+log.σα/+".λ/.A.16/

where ".λ/is an error term that vanishes when Mincreases without bounds, αis the index parameter, σ

is the scale parameter and |ˆ

ψ.λ/|can conveniently be computed by using the relationship

|ˆ

ψ.λ/|= 1

MM

k=1

M

r=1

cos{λ.QT.ˆ

Hk/−QT.ˆ

Hj/}1=2

:

Similarly to Section 4.1 we can use regression analysis to estimate αand σ:Furthermore, one can use

equation (A.16) to conduct graphical tests of the hypothesis that the Q-statistic is stable.

References

Abry, P., Flandrin, P., Taqqu, M. S. and Veitch, D. (2000) Wavelets for the analysis, estimation and synthesis of

scaling data. In Self-similar Network Trafﬁc and Performance Evaluation (eds K. Park and W. Willinger), pp.

39–88. Chichester: Wiley.

How Does Temperature Vary over Time? 25

Aldrin, M., Holden, M., Guttorp, P., Skeie, R. B., Myhre, G. and Berntsen, T. K. (2012) Bayesian estimation

of climate sensitivity based on a simple climate model ﬁtted to observations of hemispheric temperatures and

global ocean heat content. Environmetrics,23, 253–271.

Baillie, R. T. and Chung, S.-K. (2002) Modeling and forecasting from trend-stationary long memory models with

applications to climatology. Int. J. Forecast.,18, 215–236.

Beran, J. (1994) Statistics for Long-memory Processes. London: Chapman and Hall.

Beran, J., Feng, Y., Ghosh, S. and Kulik, R. (2013) Long-memory Processes. Berlin: Springer.

Bloomﬁeld, P. (1992) Trends in global temperatures. Clim. Change,21, 275–287.

Bloomﬁeld, P. and Nychka, D. (1992) Climate spectra and detecting climate change. Clim. Change,21, 1–16.

Chen, W. W. and Deo, R. S. (2004) A generalized portmanteau goodness-of-ﬁt test for time series models.Econmetr.

Theory,20, 382–416.

Cho, H. (2016) A test of second-order stationarity of time series based on unsystematic sub-samples. Stat,5,

262–277.

Davidson, J. E. H., Stephenson, D. B. and Turasie, A. A. (2016) Time series modeling of paleoclimate data.

Environmetrics,27, 55–65.

Estrada, F., Gay, G. and S´

anchez, A. (2010) A reply to “Does temperature contain a stochastic trend?: Evaluating

conﬂicting statistical results” by R. K. Kaufmann et al.Clim. Change,101, 407–414.

Feuerverger, A. (1990) An efﬁciency result for the empirical characteristic function in stationary time-series

models. Can. J. Statist.,18, 155–161.

Galbraith, J. and Green, C. (1992) Inference about trends in global temperature data. Clim. Change,22, 209–221.

Gay-Garcia, C. and Estrada, F. (2009) Global and hemispheric temperatures revisited. Clim. Change,94, 333–349.

Gil-Alana, L. A. (2003) Estimating the degree of dependence in the temperatures in the northern hemisphere

using semi-parametric techniques. J. Appl. Statist.,30, 1021–1031.

Gil-Alana, L. A. (2008a) Time trend estimation with breaks in temperature series. Clim. Change,89, 325–337.

Gil-Alana, L. A. (2008b) Warming break trends and fractional integration in the northern, southern, and global

temperature anomaly series. J. Atmos. Ocean. Technol.,25, 570–578.

Giraitis, L., Koul, H. L. and Surgailis, D. (2012) Large Sample Inferences for Long Memory Processes. London:

Imperial College Press.

Harvey, D. I. and Mills, T. C. (2001) Modelling global temperatures using cointegration and smooth transitions.

Statist. Modllng,1, 143–159.

Holt, M. T. and Ter¨

asvirta, T. (2020) Global hemisphere temperaturetrends and co-shifting: a vector shifting-mean

autoregressive analysis. J. Econmetr.,214, 198–215.

Hosking, J. R. M. (1996) Asymptotic distributions of the sample mean, autocovariances, and autocorrelations of

long-memory time series. J. Econmetr.,73, 261–284.

Humlum, O., Stordahl, K. and Solheim, J.-E. (2013) The phase relation between atmospheric carbon and global

temperature. Globl Planet. Change,100, 51–69.

K¨

arner, O. (2002) On stationarity and antipersistency in global temperature series. J. Geophys. Res. D, 107, 1–11.

Kaufmann, R. K., Kauppi, K. and Stock, J. H. (2010) Does temperature contain a stochastic trend?: Evaluating

conﬂicting statistical results. Clim. Change,101, 395–405.

Knight, M. I., Nason, G. P. and Nunes, M. A. (2017) A wavelet lifting approach to long-memory estimation.

Statist. Comput.,27, 1453–1471.

Koenker, R. and Schorfheide, F. (1994) Quantile spline models for global temperature change. Clim. Change,28,

394–404.

Kogon,S. M. and Williams, D.B. (1998) Characteristic function based estimation of stabledistribution parameters.

In A Practical Guide to Heavy Tails (eds R. J. Adler, R. E. Feldman and M. S. Taqqu). Boston: Birkh¨

auser.

Koutrouvelis, I. A. (1980) Regression-type estimation of the parameters of the stable laws. J. Am. Statist. Ass.,

75, 918–928.

Koutrouvelis, I. A. and Bauer, D. P. (1982) Asymptotic distribution of regression-type estimators of parameters

of stable laws. Communs Statist.Theory Meth.,11, 2715–2730.

Lyons, R. (1988) Strong laws of large numbers for weakly correlated random variables. Mich. Math. J.,35, 353–

359.

Mandelbrot, B. B. and van Ness, J. W. (1968) Fractional Brownian motions, fractional noises and applications.

SIAM Rev.,10, 422–437.

Mandelbrot, B. B. and Wallis, J. R. (1968) Noah, Joseph and operational hydrology. Wat. Resour. Res.,4, 909–918.

Mandelbrot, B. B. and Wallis, J. R. (1969) Some long-run properties of geophysical records. Wat. Resour. Res.,5,

321–340.

Mills, T. C. (2007) Time series modelling of two millennia of northern hemisphere temperatures: long memory or

shifting trends? J. R. Statist. Soc. A, 170, 83–94.

Mills, T. C. (2010) ‘Skinning a cat’: alternative models of representing temperature trends. Clim. Change,101,

415–426.

Moberg, A. (2012) Comments on “Reconstruction of the extratropical NH mean temperature over the last mil-

lennium with a method that preserves low-frequency variability”. J. Clim.,25, 7991–7997.

Moberg, A., Sonechkin, D. M., Holmgren, K., Datsenko, N. M. and Karl´

en, W. (2005a) Highly variable northern

hemisphere temperatures reconstructed from low- and high-resolution proxy data. Nature,433, 613–617.

26 J. K. Dagsvik, M. Fortuna and S. Hov Moen

Moberg, A., Sonechkin, D. M., Holmgren, K., Datsenko, N. M. and Karl´

en, W. (2005b) 2,000-year northern

hemisphere temperature reconstruction. Data Contribution Series 2005-019. World Data Center for Paleocli-

matology, Boulder.

Moberg, A., Sonechkin, D. M., Holmgren, K., Datsenko, N. M., Karl´

en, W. and Lauritzen, S. E. (2006) Corri-

gendum: Highly variable northern hemisphere temperatures reconstructed from low- and high-resolution proxy

data. Nature,439, 1014.

Rea, W., Oxley, L., Reale, M. and Brown, J. (2013) Not all estimators are born equal: the empirical properties of

some estimators of long memory. Math. Comput. Simulns,93, 29–42.

Resnick, S. I. (1987) Extreme Values, Regular Variation, and Point Processes. Berlin: Springer.

Samorodnitsky, G. and Taqqu, M. S. (1994) Stable Non-Gaussian Random Processes. London: Chapman and

Hall.

Schmith, T., Johansen, S. and Thejll, P. (2012) Statistical analysis of global surface temperature and sea level using

cointegration methods. J. Clim.,25, 7822–7833.

Solheim, J.-E., Stordahl, K. and Humlum, O. (2012) The long sunspot cycle 23 predicts a signiﬁcant temperature

decrease in cycle 24. J. Atmos. Sol. Terrestr. Phys.,80, 267–284.

Supporting information

Additional ‘supporting information’ may be found in the on-line version of this article:

‘Online resource 01; Online resource 02; Online resource 03; Online resource 04—supplementary appendices’.