
Comment on "Atlantic and Pacific multidecadal oscillations and Northern Hemisphere temperatures"



Steinman et al. (Reports, 27 February 2015, p. 988) argue that appropriately rescaled multimodel ensemble-mean time series provide an unbiased estimate of the forced climate response in individual model simulations. However, their procedure for demonstrating the validity of this assertion is flawed, and the residual intrinsic variability so defined is in fact dominated by the actual forced response of individual models.
S. Kravtsov,* M. G. Wyatt, J. A. Curry, A. A. Tsonis
The central result of Steinman et al.'s analysis (1) is the demonstration of an apparent consistency among the responses of different models to variable forcing in the 20th-century climate simulations. In particular, they claim that the regional multimodel ensemble-mean time series defines the universal forced signal, which can be linearly rescaled to provide unbiased estimates of the regional forced responses for individual models. Such a consistency is surprising because the models have different physical parameterizations and the simulations may use different forcing subsets. If their claim were true, it would add much confidence to the authors' semi-empirical attribution of the observed multidecadal climate variability to the forced and intrinsic sources. However, the implied uniqueness of the forced signal defined by their regional regression method is an artifact of their analysis procedure, and the actual uncertainty of the semi-empirical estimates of the observed multidecadal intrinsic variability is much larger than these authors have inferred.
Consider $M$ time series of length $T$, corresponding to $M$ different climate simulations: $x_m^{(t)}$; $m = 1, \ldots, M$; $t = 1, \ldots, T$. Let the bar denote averaging across the time dimension ($t$) and square brackets denote averaging across the ensemble member dimension ($m$). For example, the ensemble average time series $[x^{(t)}]$ is defined as follows:
$$[x^{(t)}] = \frac{1}{M} \sum_{m=1}^{M} x_m^{(t)}; \quad t = 1, \ldots, T \qquad (1)$$
Consider a decomposition of $x_m^{(t)}$ into the forced signal $f_m^{(t)}$ and residual intrinsic variability $\Delta_m^{(t)}$:
$$x_m^{(t)} = f_m^{(t)} + \Delta_m^{(t)} \qquad (3)$$
Without loss of generality, we can assume $\overline{x_m} = \overline{f_m} = 0$, hence $\overline{\Delta_m} = 0$. If the estimated forced signal $f_m^{(t)}$ is unbiased, then the time series $\Delta_{m_1}^{(t)}$ and $\Delta_{m_2}^{(t)}$ of residual intrinsic variability in any pair of simulations $m_1$ and $m_2$ must be uncorrelated (independent). Furthermore, if the distribution of $\Delta_m^{(t)}$ has mean 0 and variance $\sigma^2$, the ensemble mean residual time series $[\Delta^{(t)}]$ will have a distribution with mean 0 and variance $\sigma^2/M$. Hence, one can quantitatively assess the statistical independence of different realizations of simulated intrinsic variability by comparing the actual dispersion $\overline{[\Delta]^2}$ of the ensemble mean time series $[\Delta^{(t)}]$ with its theoretical prediction $\overline{\Delta^2}/M$, where we estimated $\sigma^2 \approx \overline{\Delta^2}$. Large values of $\overline{[\Delta]^2}$ would indicate that the assumption of statistical independence between different realizations of intrinsic variability $\Delta_m^{(t)}$ is violated due to biases in the estimated forced signal $f_m^{(t)}$, whereby a part of the common true forced signal manifests in the estimated intrinsic residuals $\Delta_m^{(t)}$.
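The dispersion comparison described above can be illustrated numerically. The following is only a sketch under stated assumptions: the residuals are synthetic white noise (not model output), the function name `dispersion_ratio` is hypothetical, and $\sigma^2$ is estimated as the variance averaged over all simulations and times.

```python
import numpy as np

def dispersion_ratio(delta):
    """Ratio of the actual variance of the ensemble-mean residual to sigma^2 / M.

    delta: array of shape (M, T), one residual time series per simulation.
    A ratio near 1 is consistent with independent residuals; a ratio much
    larger than 1 suggests a shared (biased) component.
    """
    delta = delta - delta.mean(axis=1, keepdims=True)  # remove time means
    M = delta.shape[0]
    actual = np.mean(delta.mean(axis=0) ** 2)   # time variance of [Delta]
    expected = np.mean(delta ** 2) / M          # sigma^2 / M, sigma^2 from the data
    return actual / expected

rng = np.random.default_rng(0)
M, T = 20, 2000
independent = rng.standard_normal((M, T))      # synthetic independent residuals
shared = rng.standard_normal(T)                # hypothetical common bias
biased = independent + 0.5 * shared            # residuals sharing that bias

ratio_independent = dispersion_ratio(independent)
ratio_biased = dispersion_ratio(biased)
print(ratio_independent, ratio_biased)
```

In this synthetic setting the first ratio should be close to 1 and the second well above 1, which is the signature of violated independence that the test is meant to detect.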
Steinman et al. considered, among others, the following two methods for estimating the forced signal, both based on the multimodel ensemble mean time series:
$$f_m^{(t)} = [x^{(t)}] \qquad (4A)$$
$$f_m^{(t)} = a_m [x^{(t)}] \qquad (4B)$$
The differencing method (Eqs. 3 and 4A) simply identifies the forced signal with the multimodel ensemble mean $[x^{(t)}]$. The regression method (Eqs. 3 and 4B) rescales the first-guess forced signal $[x^{(t)}]$ for a given simulation by finding $a_m$ via least squares to minimize $\overline{\Delta_m^2}$ in Eq. 3.
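As a rough illustration of the two estimators, the sketch below applies the differencing and regression methods to a synthetic ensemble. The data, the function names, and the no-intercept least-squares fit on time-centered series are assumptions made for this example; they are not taken from the analysis code of either paper.

```python
import numpy as np

def forced_signal_differencing(x):
    """Eq. 4A sketch: every simulation gets the ensemble mean as its forced signal.

    x: array of shape (M, T) with time means already removed.
    """
    ens_mean = x.mean(axis=0)
    return np.tile(ens_mean, (x.shape[0], 1))

def forced_signal_regression(x):
    """Eq. 4B sketch: rescale the ensemble mean by a least-squares factor a_m."""
    ens_mean = x.mean(axis=0)
    a = x @ ens_mean / (ens_mean @ ens_mean)   # minimizes the residual variance
    return a[:, None] * ens_mean[None, :], a

rng = np.random.default_rng(1)
M, T = 8, 150
t = np.linspace(-1.0, 1.0, T)
sensitivity = rng.uniform(0.5, 1.5, size=M)    # hypothetical model sensitivities
x = sensitivity[:, None] * t[None, :] + 0.2 * rng.standard_normal((M, T))
x = x - x.mean(axis=1, keepdims=True)          # enforce zero time means

f_a = forced_signal_differencing(x)
f_b, a = forced_signal_regression(x)
var_a = np.mean((x - f_a) ** 2, axis=1)        # residual variance, differencing
var_b = np.mean((x - f_b) ** 2, axis=1)        # residual variance, regression
print(a.round(2), var_a.round(3), var_b.round(3))
```

Because $a_m = 1$ is one of the candidates in the least-squares fit, the regression residual variance can never exceed the differencing one for any simulation.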
Steinman et al. further claimed that both of these methods provided independent realizations of residual intrinsic variability in climate-model simulations, based on the fact that the resulting variance $\overline{[\Delta]^2}$ of the ensemble mean residual time series was much smaller than the theoretical value of $\overline{\Delta^2}/M$. However, it is easy to show that, due to the choice of forcing derived using either Eq. 4A or Eq. 4B, this ensemble mean residual time series is identically zero,
$$[\Delta^{(t)}] = 0; \quad t = 1, \ldots, T \qquad (5)$$
and so is its variance, $\overline{[\Delta]^2} = 0$. Hence, the extreme smallness of the dispersion of ensemble average intrinsic variability attributed in (1) to the statistical independence of its different realizations is actually an artifact of the algebraic constraint (Eq. 5) [see (2–5)]. This flaw does not mean that the residuals are necessarily correlated (not independent), but a different test is required to determine that.
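The algebraic constraint can be spelled out in a few lines. The derivation below is a sketch that assumes the least-squares fit in Eq. 4B has no intercept and that all series have zero time mean, as stated earlier; the closed form for $a_m$ is the standard least-squares solution rather than a formula quoted from (1).

```latex
% Eq. 4A: the same ensemble mean is subtracted from every simulation
\Delta_m^{(t)} = x_m^{(t)} - [x^{(t)}]
\;\Rightarrow\;
[\Delta^{(t)}] = [x^{(t)}] - [x^{(t)}] = 0 .

% Eq. 4B: least-squares scaling factor (assumed: no intercept, zero time means)
a_m = \frac{\overline{x_m\,[x]}}{\overline{[x]^2}}
\;\Rightarrow\;
[a] = \frac{\overline{[x]\,[x]}}{\overline{[x]^2}} = 1 ,

% hence the ensemble mean of the regression residuals also vanishes
[\Delta^{(t)}] = [x^{(t)}] - [a]\,[x^{(t)}] = 0 .
```

In both cases the ensemble-mean residual vanishes at every time step regardless of whether the individual residuals are independent, so its small variance carries no information about independence.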
We now show directly that the regional regression approach (1) of defining the forced signal leads to correlated samples of residual intrinsic variability in the individual-model ensembles (subensembles of simulations using a single model with a fixed physics package and an identical forcing history). For these subensembles, it is the expression (Eq. 4A) that naturally gives an unbiased estimate of the forced variability. We considered 18 such subensembles from the Coupled Model Intercomparison Project Phase 5 simulations (6), totaling 116 individual simulations out of the 170 available simulations. The multimodel ensemble mean based on these 116 simulations is very close to that computed using all of the available 170 simulations.
We defined two alternative sets of the model-simulated intrinsic variability. In method A, we formed realizations of intrinsic variability by subtracting the 5-year low-pass-filtered ensemble mean of each model from this model's individual simulations (i.e., Eq. 4A applied separately to individual model ensembles). The second set (method B) defined the residual intrinsic variability using the forced signal estimated from regional multimodel regression (1) (i.e., Eq. 4B).
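The contrast between the two definitions can be sketched on synthetic data. Everything below is an assumption made for illustration: a 5-point running mean stands in for the 5-year low-pass filter, the "model" has a forced response that deliberately differs in shape from the multimodel mean, and all function names are hypothetical.

```python
import numpy as np

def running_mean(series, window=5):
    """Centered running mean (odd window) with reflected edges; a stand-in filter."""
    pad = window // 2
    padded = np.pad(series, pad, mode="reflect")
    return np.convolve(padded, np.ones(window) / window, mode="valid")

def method_a_residuals(x_model, window=5):
    """Subtract the smoothed single-model ensemble mean from each member."""
    forced = running_mean(x_model.mean(axis=0), window)
    return x_model - forced

def method_b_residuals(x_model, multimodel_mean):
    """Regress each member on the multimodel ensemble mean and subtract the fit."""
    e = multimodel_mean - multimodel_mean.mean()
    xc = x_model - x_model.mean(axis=1, keepdims=True)
    a = xc @ e / (e @ e)
    return xc - a[:, None] * e

def mean_pairwise_correlation(residuals):
    """Average correlation over all distinct pairs of ensemble members."""
    corr = np.corrcoef(residuals)
    return corr[np.triu_indices(corr.shape[0], k=1)].mean()

rng = np.random.default_rng(2)
n_members, T = 5, 150
t = np.linspace(0.0, 1.0, T)
multimodel_mean = t                                  # hypothetical multimodel signal
model_forced = t + 0.5 * np.sin(6 * np.pi * t)       # this model's own forced response
x_model = model_forced + 0.2 * rng.standard_normal((n_members, T))

corr_a = mean_pairwise_correlation(method_a_residuals(x_model))
corr_b = mean_pairwise_correlation(method_b_residuals(x_model, multimodel_mean))
print(corr_a, corr_b)
```

In this toy setup the method B residuals inherit the part of the model's forced response that the rescaled multimodel mean cannot represent, so they are strongly correlated across members, whereas the method A residuals are not.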
SCIENCE 11 DECEMBER 2015 VOL 350 ISSUE 6266 1326-b
Department of Mathematical Sciences, Atmospheric Science group, University of Wisconsin-Milwaukee, Post Office Box 413, Milwaukee, WI 53201, USA. Department of Geological Sciences, University of Colorado, Boulder, CO, USA. School of Earth and Atmospheric Sciences, Georgia Institute of Technology, Atlanta, GA, USA.
*Corresponding author:
To quantify independence of different realizations of intrinsic variability in the individual-model ensembles, we introduced an ensemble correlation measure $C$ by summing positive correlations among all possible pairs of an individual model's $M$ ensemble members,
$$C \propto \sum_{m<l} C_{ml}\, H(C_{ml}) \qquad (6)$$
where $H(x)$ is the Heaviside step function (7); the quantity $C$ ranges from 0 (no positive correlations between individual ensemble members) to 1 (all ensemble members are perfectly correlated). The correlation measure (Eq. 6) was computed for raw and low-pass-filtered intrinsic variability defined using methods A and B [Fig. 1, A to C, shows results for the Geophysical Fluid Dynamics Laboratory (GFDL) CM3 model; see (8)]. Method A produces intrinsic variability with $C$ values well within the range expected from random uncorrelated red-noise samples generated using an autoregressive model of order 3 (AR-3) (9). In contrast, Steinman et al.'s method B results in samples that are significantly correlated due to their systematic difference from the true forced signal.
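A possible implementation of the correlation measure and of a red-noise significance threshold is sketched below. Two points are assumptions rather than facts from the text: the normalization by the number of member pairs (chosen so that $C$ spans 0 to 1 as described) and the AR(3) coefficients, which are hypothetical values picked only to give a stationary process instead of being fitted to any model output.

```python
import numpy as np

def correlation_measure(delta):
    """Sum of positive pairwise correlations, divided by the number of pairs.

    delta: array of shape (M, T). The division by M(M-1)/2 is an assumed
    normalization that makes the measure range from 0 to 1.
    """
    corr = np.corrcoef(delta)
    pair_corr = corr[np.triu_indices(corr.shape[0], k=1)]
    return np.sum(pair_corr * (pair_corr > 0)) / pair_corr.size

def simulate_ar(coeffs, n_series, n_time, rng, burn_in=200):
    """Independent AR(p) series: x_t = sum_k coeffs[k] * x_{t-1-k} + white noise."""
    p = len(coeffs)
    total = n_time + burn_in
    x = np.zeros((n_series, total + p))
    noise = rng.standard_normal((n_series, total + p))
    for t in range(p, total + p):
        past = x[:, t - p:t][:, ::-1]          # most recent value first
        x[:, t] = past @ coeffs + noise[:, t]
    return x[:, -n_time:]                      # drop the spin-up segment

def surrogate_threshold(coeffs, M, T, n_surrogates=1000, q=99, seed=0):
    """Percentile of the measure over ensembles of independent AR(p) surrogates."""
    rng = np.random.default_rng(seed)
    values = [correlation_measure(simulate_ar(coeffs, M, T, rng))
              for _ in range(n_surrogates)]
    return np.percentile(values, q)

coeffs = np.array([0.5, 0.2, 0.1])             # hypothetical AR(3) coefficients
threshold = surrogate_threshold(coeffs, M=5, T=150, n_surrogates=200)

rng = np.random.default_rng(42)
members = simulate_ar(coeffs, 5, 150, rng)     # independent synthetic residuals
common = np.linspace(-4.0, 4.0, 150)           # hypothetical shared forced bias
c_independent = correlation_measure(members)
c_biased = correlation_measure(members + common)
print(threshold, c_independent, c_biased)
```

Comparing a measured value against `threshold` mirrors the dashed 99th-percentile lines described for Fig. 1, A to C, with the caveat that a real application would fit the autoregressive coefficients to the residuals being tested.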
We then used 18 versions of the forced signal, estimated by the unbiased method A, to isolate intrinsic variability in observed surface temperatures via Eq. 3 and Eq. 4B (Fig. 1, D to F). The spread among the 18 estimates of intrinsic variability in observations is much larger than the tight bootstrap-based error bounds on the semi-empirical estimates of the observed intrinsic variability in figure 3 in (1). Hence, the actual uncertainty of the semi-empirical attribution by Steinman et al. is also much larger (10), thereby preventing any clear inferences about the cause of the "false pause" in global warming (11, 12).
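The procedure of rescaling several candidate forced signals to the observations can be sketched as follows. The forced signals and the "observed" series here are synthetic placeholders, the function name is hypothetical, and the no-intercept least-squares scaling on centered series is an assumption consistent with Eqs. 3 and 4B rather than a reproduction of the original scripts.

```python
import numpy as np

def observed_intrinsic_estimates(obs, forced_signals):
    """Rescale each forced signal to the observations and return the residuals.

    obs: array of shape (T,); forced_signals: array of shape (K, T).
    Returns an array of shape (K, T), one intrinsic-variability estimate per signal.
    """
    obs_c = obs - obs.mean()
    f = forced_signals - forced_signals.mean(axis=1, keepdims=True)
    a = (f @ obs_c) / np.sum(f * f, axis=1)    # least-squares scaling per signal
    return obs_c[None, :] - a[:, None] * f

rng = np.random.default_rng(3)
T, K = 150, 18
t = np.linspace(0.0, 1.0, T)
phases = rng.uniform(0.0, 2.0 * np.pi, K)
# Hypothetical forced signals: a common trend plus model-specific departures
forced_signals = t[None, :] + 0.2 * np.sin(4.0 * np.pi * t[None, :] + phases[:, None])
obs = 0.8 * t + 0.1 * rng.standard_normal(T)   # synthetic stand-in for observations

estimates = observed_intrinsic_estimates(obs, forced_signals)
spread = estimates.std(axis=0)                 # spread across the K estimates
print(spread.mean())
```

The time series `spread` gives a simple picture of how much the inferred intrinsic variability depends on which forced signal is used.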
1. B. A. Steinman, M. E. Mann, S. K. Miller, Science 347, 988–991 (2015).
2. The standard deviation of intrinsic variability computed in Steinman et al. (1) is small, but not exactly zero, because of their use of a data-adaptive low-pass filter before averaging intrinsic variability among different simulations.
3. Steinman et al. also used weighted ensemble means to define a version of their model-based forced signal. In this case, the constraint (Eq. 5) would not be exact but would still be approximately valid, because the weighted and nonweighted estimates of the forced signal are in fact very close (not shown here).
4. Comment (3) above also applies to a possible variation of the regression method (Eq. 4B) in which, instead of scaling each individual simulation by its own factor $a_m$, one would estimate and use the single scaling factor for all simulations of each model; this scaling factor can be defined, for example, as the ensemble mean of $a_m$ estimates computed for individual simulations of a given model.
5. One way to try to alleviate constraint (Eq. 5) would be to estimate the forced signal for a given subset of models using the ensemble mean time series of the complement subset of models. However, this would only be effective if the sizes of these two subsets are comparable. Otherwise, the multimodel averaging over the much larger complement subset of models would also be very close to the all-model ensemble mean, and the algebraic constraint (Eq. 5) would still approximately hold.
6. There are 13 models with four or more 20th-century simulations in the CMIP5 data set, but considering separately the ensembles of the Goddard Institute for Space Studies (GISS) models with different physics packages makes up 18 independent ensembles.
7. The Heaviside step function is used here merely to streamline the mathematical notations in the multiple correlation measure (Eq. 6) by zeroing out negative terms in the sum of correlations, leaving positive terms unchanged.
8. Other models exhibit a similar behavior; see the corresponding images at
9. If one does not divide the large ensembles of the GISS models into the subensembles with different physics (6), the correlation-measure diagnosis does identify the dependency between the model realizations, because the true forced responses in these versions of the model are different from the grand ensemble mean response, and similar long-term biases across the same-physics model simulations ensue.
10. The bootstrap resampling used in (1) is equivalent to considering subensembles of about two-thirds of independent models (or simulations), thus effectively averaging out the intramodel uncertainty of the forced response emphasized in our Fig. 1, D to F.
11. This is exacerbated further by the unfortunate linear extrapolation of the CMIP5 runs from 2005 to 2012 used in (1) to estimate recent intrinsic trends.
12. B. Rajaratnam, J. Romano, M. Tsiang, N. S. Diffenbaugh, Clim. Change 133, 129–140 (2015).
We thank Steinman et al. for making their data and analysis code
publicly available. This research was supported by NSF grants
OCE-1243158 (S.K.) and AGS-1408897 (S.K. and A.A.T.). All data
and MATLAB (MathWorks, Natick, MA) scripts for this paper are
available for downloading from
15 April 2015; accepted 6 November 2015
Fig. 1. Intrinsic variability in the 20th-century model simulations with four or more ensemble members identified using two different methods for estimating the forced signal: the classical subtraction of the individual-model ensemble mean (method A) and the multimodel regional regression method (1) (method B). (A to C) The correlation measure (Eq. 6) of statistical independence between multiple realizations of the GFDL CM3 model (five realizations) for (A) Atlantic Multidecadal Oscillation (AMO), (B) Pacific Multidecadal Oscillation (PMO), and (C) Northern Hemisphere Multidecadal Oscillation (HMO) indices; these correlations were computed for running-mean low-pass-filtered residual time series (which characterize intrinsic variability) and are plotted here against the averaging window size. A low correlation measure indicates statistical independence of intrinsic residuals. Dashed lines show the 99th percentile of the correlation measure based on the 1000 simulations of the corresponding AR-3 red-noise model. (D to F) Estimates of the observed multidecadal intrinsic variability for (D) AMO, (E) PMO, and (F) HMO. The semi-empirical estimates (thin black lines) were computed as in (1) based on the forced signals obtained using method A for each of the 18 model ensembles considered, with the heavy red line indicating the average over these individual estimates. Additional heavy lines (see legend) are for results based on linear detrending. The distance between the black dashed lines in each plot shows the 95th percentile of the standard deviations for multidecadal intrinsic variability estimated using method A for each of 116 simulations considered.