Ensemble hindcasts of ENSO events over the past 120 years using a large number of ensembles

March 2009
Advances in Atmospheric Sciences 26(2):359-372

DOI:10.1007/s00376-009-0359-7

Authors:

Fei Zheng

Institute of Atmospheric Physics

Rong‐Hua Zhang

Chinese Academy of Sciences

ADVANCES IN ATMOSPHERIC SCIENCES, VOL. 26, NO. 2, 2009, 359–372

Ensemble Hindcasts of ENSO Events over the Past 120 Years Using a Large Number of Ensembles

ZHENG Fei¹, ZHU Jiang∗², WANG Hui³, and Rong-Hua ZHANG⁴

¹International Center for Climate and Environment Science (ICCES), Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029

²State Key Laboratory of Atmospheric Boundary Layer Physics and Atmospheric Chemistry (LAPC), Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029

³National Meteorological Center, Beijing 100081

⁴Earth System Science Interdisciplinary Center (ESSIC), University of Maryland, College Park, Maryland, USA

(Received 20 January 2008; revised 2 November 2008)

ABSTRACT

Based on an intermediate coupled model (ICM), a probabilistic ensemble prediction system (EPS) has been developed. The ensemble Kalman filter (EnKF) data assimilation approach is used for generating the initial ensemble conditions, and a linear, first-order Markov-Chain SST anomaly error model is embedded into the EPS to provide model-error perturbations. In this study, we perform ENSO retrospective forecasts over the 120-year period 1886–2005 using the EPS with 100 ensemble members and with initial conditions obtained by assimilating only historic SST anomaly observations.

Verification of the retrospective ensemble forecasts against available observations shows that the skill of the ensemble mean of the EPS is greater than that of a single deterministic forecast using the same ICM, with a distinct improvement in both the correlation and the root mean square (RMS) error of the ensemble-mean hindcast relative to the deterministic scheme over the 12-month prediction period. The RMS error of the ensemble mean is almost 0.2°C smaller than that of the deterministic forecast at a lead time of 12 months. The probabilistic skill of the EPS is also high, with the predicted ensemble following the SST observations well, and the areas under the relative operating characteristic (ROC) curves for three different ENSO states (warm events, cold events, and neutral events) are all above 0.55 out to 12 months lead time.

However, both deterministic and probabilistic prediction skills of the EPS show an interdecadal variation. For the deterministic skill, there is high skill in the late 19th century and in the middle-late 20th century (which includes some artificial skill due to the model training period), and low skill during the period from 1906 to 1961. The probabilistic skill for the three different ENSO states shows a similar interdecadal variation during the period 1886–2005: there is high skill in the late 19th century from 1886 to 1905, followed by a decline to a minimum of skill around the 1910s–1950s, beyond which skill rebounds and increases with time until the 2000s.

Key words: ENSO, ensemble prediction system, interdecadal predictability, hindcast

Citation: Zheng, F., J. Zhu, H. Wang, and R.-H. Zhang, 2009: Ensemble hindcasts of ENSO events over the past 120 years using a large number of ensembles. Adv. Atmos. Sci., 26(2), 359–372, doi: 10.1007/s00376-009-0359-7.

1. Introduction

Based on the intermediate coupled model (ICM) (Keenlyside and Kleeman, 2002; Zhang et al., 2005), a probabilistic ensemble prediction system (EPS) was developed. It has been demonstrated that El Niño simulations and predictions with this system can be improved through the use of the ensemble Kalman filter (EnKF; e.g., Evensen, 2003) data assimilation approach for generating the initial ensemble conditions, together with a linear, first-order Markov-Chain SST anomaly error model embedded into the EPS to provide model-error perturbations (Zheng et al., 2006a). However, the model performance was verified only over a relatively short period containing a relatively small number of events.

Fig. 1. Horizontal distributions of the normalized observation error (a) and the model initial uncertainty (b) of SST in January 1998. The model initial uncertainty is estimated from the EnKF analysis spread. The contour interval is 0.05°C in (a) and 0.02°C in (b).

∗Corresponding author: ZHU Jiang, jzhu@mail.iap.ac.cn

As pointed out by Chen et al. (2004), previous estimations of El Niño's predictability (e.g., Goswami and Shukla, 1991; Kirtman and Schopf, 1998; Latif et al., 1998) were mostly based on retrospective predictions for the last two or three decades (i.e., the hindcast period encompassed a relatively small number of events). With so few degrees of freedom and short hindcast periods, the statistical significance of those estimates is questionable. From available SST observations, Chen et al. (2004) used the Lamont ENSO prediction model to perform a retrospective forecast experiment of 148 years from 1856 to 2003, and found that ENSO predictability clearly had interdecadal variations. This was, to date, the first work that studied ENSO predictability by extending realistic forecasts to a period over 100 years. Also, Tang et al. (2008) compared ENSO predictabilities using three different models by performing 120-year retrospective forecasts, and confirmed that the interdecadal variations in ENSO predictability were not model dependent.

However, the ENSO predictability in these models was only verified in the deterministic sense. Indeed, as considered in classic theories, ENSO should be viewed as a chaotic or irregular interannual fluctuation in the tropical Pacific (e.g., Tziperman et al., 1994). So we need to discuss the ENSO predictability not only in a deterministic sense but also in a probabilistic sense. With a realistic ENSO EPS and newly developed SST assimilation approaches (Zheng et al., 2006a), we recently completed a long-term retrospective ensemble forecast from 1886 to 2005 with 100 members, and analyzed the ENSO predictability and its variations in both a deterministic and a probabilistic sense.

This paper is structured as follows: Section 2 describes the components of the EPS and the historic SST data in detail. Section 3 examines the deterministic and probabilistic prediction skills of the EPS for the whole period from 1886 to 2005. In section 4, the interdecadal variations of the ensemble prediction skills in the ENSO EPS are examined in both the deterministic and probabilistic sense. A summary and discussion are given in section 5.

2. Ensemble prediction system components and dataset

2.1 Basic deterministic model

Our ensemble prediction system contains three main components. First, the EPS is built on a deterministic model: the basic intermediate coupled model developed by Keenlyside and Kleeman (2002) and Zhang et al. (2003). Its dynamical component consists of both linear and non-linear components. The former was essentially a McCreary-type (McCreary, 1981) modal model, but was extended to include a horizontally-varying background stratification. In addition, ten baroclinic modes, along with a parameterization of the local Ekman-driven upwelling, were included. An SST anomaly model was embedded within this dynamical framework to simulate the evolution of the mixed-layer temperature anomalies. As demonstrated by Zhang et al. (2005), having a realistic parameterization for the temperature of the subsurface water entrained into the mixed layer ($T_e$) is crucial to the performance of SST simulations in the equatorial Pacific. An empirical $T_e$ model was constructed from historical data and was demonstrated to be effective in improving the SST simulations. The ocean model was coupled with a statistical atmospheric model, which specifically relates wind stress ($\tau$) to SST anomaly fields. The two empirical models (the $T_e$ model and the atmospheric model) were constructed based on the historic observations during the period 1963–96 (34 yr of data). All coupled-model components exchange simulated anomaly fields. Information concerning the interactions between the atmosphere ($\tau$) and the ocean (SST) was exchanged once a day.

2.2 Initial ensemble condition

Based on the ICM, a probabilistic EPS was developed by Zheng et al. (2006a). The initial ensemble conditions of the EPS were provided by the EnKF (e.g., Evensen, 2003, 2004) data assimilation approach through assimilating SST anomaly data into the model with 100 ensemble members (Zheng et al., 2006a). Figure 1 shows an example of horizontal distributions of the normalized observation error and the model initial uncertainty of SST at the initial time of January 1998. The distribution of the model uncertainty has the same shape as that of the normalized observation error. Thus, each initial ensemble member after assimilation represents an equally realistic initial condition. At the same time, the initial ensemble state variables are dynamically balanced within the model after a series of assimilation cycles. Thus, this ensemble initialization approach not only can generate accurate and dynamically consistent initial ensemble members, but also can provide reasonable surface initial stochastic uncertainties for the EPS by combining both background and observation errors during the assimilation cycles (Zheng and Zhu, 2008).
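To illustrate the kind of update that can produce such an initial ensemble, the following is a minimal sketch of a stochastic (perturbed-observation) EnKF analysis step in the spirit of Evensen (2003). It is not the implementation used in the EPS: the function name `enkf_analysis`, the linear observation operator, the diagonal observation-error covariance, and the array layout are all assumptions made for illustration.

```python
import numpy as np

def enkf_analysis(ensemble, obs, obs_operator, obs_err_std, rng):
    """One perturbed-observation EnKF update (illustrative sketch only).

    ensemble     : (n_state, n_members) forecast ensemble
    obs          : (n_obs,) observation vector (e.g., SST anomalies)
    obs_operator : (n_obs, n_state) linear observation operator H (assumed)
    obs_err_std  : scalar observation-error standard deviation (assumed)
    rng          : numpy.random.Generator
    """
    n_members = ensemble.shape[1]
    # Ensemble anomalies about the ensemble mean
    anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
    # Ensemble and anomalies mapped into observation space
    hx = obs_operator @ ensemble
    hx_anom = obs_operator @ anomalies
    # Sample covariances estimated from the ensemble
    p_hh = hx_anom @ hx_anom.T / (n_members - 1)
    p_xh = anomalies @ hx_anom.T / (n_members - 1)
    r = (obs_err_std ** 2) * np.eye(obs.size)
    # Kalman gain K = P H^T (H P H^T + R)^-1
    gain = p_xh @ np.linalg.inv(p_hh + r)
    # Each member assimilates its own randomly perturbed copy of the observations
    perturbed = obs[:, None] + obs_err_std * rng.standard_normal((obs.size, n_members))
    return ensemble + gain @ (perturbed - hx)
```

In this sketch the analysis spread at observed points shrinks toward the assumed observation error, which is the sense in which the analysis ensemble combines background and observation errors.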

2.3 Stochastic model-error perturbation

As described by Zheng et al. (2006a), due to simulation deficiencies for coupled air-sea interactions and subsurface thermal effects in the SST anomaly model, a linear, first-order Markov stochastic model is embedded within the SST anomaly model of the ICM to represent the model uncertainties of forecasted SST anomaly fields. This perturbation method was verified to be capable of effectively simulating the time evolution of model uncertainties during the ensemble forecasting procedure (Zheng et al., 2007). Here, we make further refinements and extensions to the model-error perturbation scheme by carefully analyzing the forecast errors (408 samples of the observation-minus-forecast values for 12-month lead time from 1963 to 1996, covering the same analysis period as the training period of the deterministic model) for the different lead times by an empirical orthogonal function (EOF) method, instead of the formulation used in Zheng et al. (2006a). With these refinements, the time evolution of the model errors at different lead times can be represented as

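$$
\begin{cases}
Q^{(t)} = \sum\limits_{i=1}^{M} \lambda_i^{(t)} \times \Psi_{i,j} + \xi_j^{(t)} \\[4pt]
\lambda_i^{(t)} = \alpha_{i,j} \times \lambda_i^{(t-1)} + \sqrt{1-\alpha_{i,j}^{2}} \times v_i^{(t)}
\end{cases}
\tag{1}
$$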

where $\Psi_{i,j}$ represents the spatial pattern of the $i$th EOF mode for the (SST anomaly) model error $Q$ at a lead time of $j$ months, which is a constant horizontal distribution for each mode. $\lambda_i^{(t)}$ represents the random normalized time coefficient of the $i$th mode at time $t$; the coefficient $\alpha_{i,j}$ is the time correlation of the stochastic forcing for the $i$th mode at a lead time of $j$ months; and $v_i^{(t)}$ is a random number for the $i$th mode at time $t$, with a mean equal to 0 and variance equal to 1. The correlations between the random vectors of different modes should be zero to maintain the orthogonality of the modes. Because $v_i^{(t)}$ is uncorrelated with $\lambda_i^{(t-1)}$, the variance of $\lambda_i^{(t)}$ is $\alpha_{i,j}^{2} + (1-\alpha_{i,j}^{2}) = 1$ as long as the variance of $\lambda_i^{(t-1)}$ is also equal to 1. The subscripts $i$ and $j$ indicate the EOF mode number and the lead time, respectively; $M$ is the number of EOF modes used in the stochastic model; and $\xi_j^{(t)}$ represents a residual random field for $Q$ at time $t$ that is obtained by removing the first $M$ EOF modes from the observation-minus-forecast values.
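The first-order Markov update of Eq. (1) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `step_coefficients` and `model_error_field` are hypothetical, the random numbers are drawn from a standard normal distribution, the residual field $\xi$ is omitted, and the spatial patterns are random placeholders rather than the EOFs of the model errors. The $\alpha$ values are taken from the first row of Table 1 (1-month lead, modes 1 to 3).

```python
import numpy as np

def step_coefficients(lam_prev, alpha, rng):
    """lambda_t = alpha * lambda_{t-1} + sqrt(1 - alpha^2) * v_t, with v_t ~ N(0, 1).

    The sqrt(1 - alpha^2) factor keeps the variance of lambda at 1
    whenever the variance of lam_prev is 1.
    """
    noise = rng.standard_normal(lam_prev.shape)
    return alpha * lam_prev + np.sqrt(1.0 - alpha ** 2) * noise

def model_error_field(lam, patterns):
    """Q = sum_i lambda_i * Psi_i (residual term xi omitted in this sketch).

    lam      : (n_modes,) time coefficients
    patterns : (n_modes, n_grid) spatial patterns Psi
    """
    return lam @ patterns

rng = np.random.default_rng(0)
alpha = np.array([0.695, 0.603, 0.604])   # Table 1, 1-month lead, modes 1-3
patterns = rng.standard_normal((3, 8))    # placeholder patterns, NOT the real EOFs
lam = rng.standard_normal(3)              # unit-variance starting coefficients
lam = step_coefficients(lam, alpha, rng)
q = model_error_field(lam, patterns)      # one realization of the model-error field
```

Because each mode is perturbed only through its scalar coefficient, one random number per mode and time step is enough to generate a spatially coherent error field.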

This simplified representation of the model errors has two advantages. First, because only the time coefficients are perturbed while the spatial pattern of each mode is held constant, there is no longer any need to calculate the spatial correlation scales of the model errors at each grid point, as was done in Zheng et al. (2006a). Second, the temporal correlation coefficients $\alpha$ for each mode in Eq. (1) can be easily obtained by calculating the lagged correlations of the series of time coefficients from the EOF analysis results.

This model-error analysis was performed on the twelve-month observation-minus-forecast values of the SST anomalies. The model errors were computed from 408 samples over a 34-year period (coinciding with the model training period) starting in 1963 and extending until 1996, without considering the errors inherent within the initial conditions. The forecasts were initialized with a nudging assimilation scheme, which was used to minimize the initial errors here (Zheng et al., 2006b). The details of the analysis process for estimating the model errors are as follows. First, to obtain approximately "perfect" initial fields, the observed SST anomaly data were nudged into the model at every time step and at each grid point, and this nudging process was started each month from December 1962 to November 1996 with a reasonable nudging intensity [i.e., 0.50 following Zheng et al. (2006b)] and a 12-month nudging time length. Second, twelve-month forecasts were initialized from the nudging results each month during the 34-yr period from 1963 to 1996. Third, twelve-month observation-minus-forecast values of the SST anomalies during this 34-yr period (408 samples) were obtained as model errors. Finally, the properties of the model errors, such as spatial patterns and their associated temporal variations, were analyzed through the EOF method.

Fig. 2. Spatial patterns of the first mode (left column), second mode (middle column), and their associated normalized time coefficients (right column) for the SST anomaly model errors at 3-, 6-, 9-, and 12-month lead times. The contour interval is 0.2°C for the first mode and 0.1°C for the second mode.
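The last step of this procedure, the EOF decomposition of the error samples and the estimation of a temporal correlation coefficient from the resulting time coefficients, might look like the sketch below. It assumes the observation-minus-forecast values are stored as a `(n_samples, n_grid)` array and uses a singular value decomposition to obtain the EOFs; the function names and the normalization convention (unit-variance time coefficients, patterns carrying the physical units) are illustrative choices, not details confirmed by the text.

```python
import numpy as np

def eof_analysis(errors, n_modes):
    """EOF decomposition of model-error samples via SVD (illustrative sketch).

    errors : (n_samples, n_grid) observation-minus-forecast SST anomalies
    Returns spatial patterns (n_modes, n_grid), normalized time coefficients
    (n_samples, n_modes), and the variance fraction explained by each mode.
    """
    centered = errors - errors.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    variance_fraction = s ** 2 / np.sum(s ** 2)
    n_samples = centered.shape[0]
    # Scale so the time coefficients have unit sample variance
    # and the patterns carry the amplitude.
    time_coeffs = u[:, :n_modes] * np.sqrt(n_samples - 1)
    patterns = s[:n_modes, None] * vt[:n_modes] / np.sqrt(n_samples - 1)
    return patterns, time_coeffs, variance_fraction[:n_modes]

def lag1_correlation(series):
    """One-step lagged correlation of a time series, used here as an estimate of alpha."""
    return float(np.corrcoef(series[:-1], series[1:])[0, 1])
```

With 408 monthly samples, `lag1_correlation(time_coeffs[:, i])` would give one coefficient per mode; repeating the analysis for each lead time would fill a table of coefficients indexed by mode and lead.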

Figure 2 shows the spatial patterns [i.e., $\Psi$ in Eq. (1)] of the first and second EOF modes for the SST anomaly model errors at lead times of 3, 6, 9, and 12 months, and the associated time series. The spatial structure indicates the regions that are not predicted well by the model. For the first mode, model uncertainties are mainly located over the eastern equatorial Pacific, and extend into the central basin with longer lead times. In contrast, the model uncertainties of the second mode are mainly located over the eastern coastal regions and the central equatorial Pacific. The proportion of the total variance explained by the first mode increases from 37.1% to 71.8% in the 12-month model-error analysis results, while the proportion of the second mode decreases to 8.3% at 12-month lead. These results indicate that the first several modes can explain and describe the variations of the model errors in the tropical Pacific, and the contributions of the first mode dominate the model-error simulations, especially at longer leads. The temporal correlation coefficients [i.e., $\alpha$ in Eq. (1)] for each mode were obtained by calculating the one-month lagged correlations for each EOF time series. Table 1 presents the temporal correlation coefficients that are used in the stochastic model. The temporal correlation coefficients generally increase with increasing lead time (monotonically for the first mode), which indicates a decreased randomness in the expansion time coefficients, and the temporal correlation coefficient $\alpha$ of the first mode exceeds 0.95 at 12-month lead time. Thus, the variations of the major modes in the model-error model are allowed to be more random at short lead times, but have more stable, bias-correction-like properties at longer lead times (e.g., Evensen, 2003).

Table 1. Time-correlated coefficients of the stochastic model for the first ten EOF modes from one-month to twelve-month lead times.

| Lead time (months) | 1st | 2nd | 3rd | 4th | 5th | 6th | 7th | 8th | 9th | 10th |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.695 | 0.603 | 0.604 | 0.397 | 0.408 | 0.416 | 0.274 | 0.403 | 0.403 | 0.253 |
| 2 | 0.799 | 0.840 | 0.703 | 0.721 | 0.689 | 0.747 | 0.630 | 0.714 | 0.621 | 0.651 |
| 3 | 0.858 | 0.863 | 0.767 | 0.745 | 0.754 | 0.780 | 0.659 | 0.732 | 0.666 | 0.693 |
| 4 | 0.879 | 0.875 | 0.796 | 0.746 | 0.787 | 0.784 | 0.670 | 0.696 | 0.753 | 0.698 |
| 5 | 0.891 | 0.877 | 0.816 | 0.750 | 0.795 | 0.782 | 0.688 | 0.686 | 0.783 | 0.691 |
| 6 | 0.902 | 0.888 | 0.804 | 0.768 | 0.792 | 0.779 | 0.697 | 0.699 | 0.784 | 0.704 |
| 7 | 0.919 | 0.888 | 0.797 | 0.784 | 0.774 | 0.780 | 0.712 | 0.700 | 0.780 | 0.740 |
| 8 | 0.928 | 0.879 | 0.809 | 0.802 | 0.756 | 0.776 | 0.718 | 0.694 | 0.785 | 0.740 |
| 9 | 0.935 | 0.877 | 0.810 | 0.818 | 0.737 | 0.785 | 0.726 | 0.703 | 0.781 | 0.753 |
| 10 | 0.941 | 0.875 | 0.817 | 0.835 | 0.714 | 0.795 | 0.728 | 0.704 | 0.783 | 0.768 |
| 11 | 0.948 | 0.872 | 0.828 | 0.849 | 0.710 | 0.790 | 0.732 | 0.704 | 0.781 | 0.770 |
| 12 | 0.954 | 0.868 | 0.836 | 0.859 | 0.720 | 0.791 | 0.728 | 0.714 | 0.775 | 0.779 |

After carefully building up a reasonable model-error model, we can use Eqs. (1) and (2) to provide a simple representation of a non-linear model, by embedding the above model-error system within the dynamical model to simulate the time evolution of the model errors during the ensemble forecast process:

$$
\psi_t = f(\psi_{t-1}) + Q_t \tag{2}
$$

where $\psi_t$ represents the model state at time $t$, and $f$ is the non-linear model operator. In order to achieve reasonable amplitudes, the first ten EOF modes were retained in simulating the random model errors, and the simulated random model errors of SST anomalies were generated and added into the physical model daily.
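As a rough illustration of how Eq. (2) combines a deterministic step with the stochastic model error, the sketch below advances a small ensemble by one step. The operator `toy_model` is a made-up stand-in for $f$ (it is not the ICM), the residual term of Eq. (1) is omitted, and all names and array shapes are assumptions.

```python
import numpy as np

def toy_model(state):
    """Hypothetical stand-in for the non-linear operator f (NOT the ICM)."""
    return 0.9 * state - 0.05 * state ** 3

def forecast_step(ensemble, lam, alpha, patterns, rng):
    """Advance every member by psi_t = f(psi_{t-1}) + Q_t.

    ensemble : (n_members, n_grid) model states
    lam      : (n_members, n_modes) time coefficients, one set per member
    alpha    : (n_modes,) temporal correlation coefficients
    patterns : (n_modes, n_grid) spatial patterns Psi
    """
    # First-order Markov update of the coefficients, as in Eq. (1)
    lam = alpha * lam + np.sqrt(1.0 - alpha ** 2) * rng.standard_normal(lam.shape)
    # Model-error field for each member (residual term omitted)
    q = lam @ patterns
    return toy_model(ensemble) + q, lam
```

Starting from identical members, repeated calls to `forecast_step` make the ensemble spread grow through the model-error term alone, which is the role the perturbation scheme plays alongside the initial-condition uncertainty.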

2.4 Dataset

The data used in this study are from the monthly extended global SST (ERSST) dataset from 1854 to 2006 reconstructed by Smith and Reynolds (2004), with 2° horizontal resolution. Because of the relatively poor quality of the dataset prior to 1880, the observed SST anomalies in that period lack annual and seasonal variations (Smith et al., 2008), so the initial conditions cannot trigger real annual oscillations and seasonal variations of the predicted signals (Tang et al., 2008). Thus we focus on the period from 1886 to 2005 in this study, and the data domain is configured as the tropical Pacific Ocean. A very important task in ENSO predictions is to optimize the oceanic initial conditions, and the assimilation of subsurface in-situ observations and satellite altimetry can significantly improve model skills (e.g., Tang and Hsieh, 2003; Zheng et al., 2007). However, the oceanic satellite altimetry and subsurface observation records are too short for our study. The only option is to initialize the forecasts by assimilating SST alone. Zheng et al. (2006a) used the EnKF data assimilation system to provide an initial condition ensemble for the ICM with 100 members, and this SST-only assimilation approach has been verified to be able to provide dynamically balanced initial fields and significantly improve El Niño predictions. In this study, only the observed monthly SST anomaly fields from Smith and Reynolds (2004) are assimilated into the ICM with the EnKF once a month. These observational data are also used for verifying the model predictions.

3. Retrospective forecast experiments

The retrospective forecast (or hindcast) experiments covering the period 1886–2005 are made and compared to available observations. A 12-month hindcast is initialized each month during this 120-yr period. For each initial month, an ensemble of 100 hindcasts is run, yielding a total of 144000 retrospective forecasts (120 yr × 12 start months × 100 members) to be verified. Figure 3 shows the predicted ensemble mean of the Niño-3.4 (5°S–5°N, 120°–170°W) SST anomalies and the prediction spread at 6-month lead time from 1886 to 2005. The variability of the ensemble mean follows the Niño-3.4 observations quite well. Apart from a few exceptions, the ensemble forecasts encompass the observations. This indicates that the EPS is able to predict most of the warm and cold events that occurred in the past 120 years at 6-month lead time, especially the relatively large El Niño and La Niña events. The skill of the hindcasts is examined from both a deterministic and a probabilistic perspective. The skill estimation in this section is based on the full hindcast period, 1886–2005, which corresponds to a total of 144000 members.
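For readers who want to reproduce this kind of diagnostic, the sketch below computes a Niño-3.4 index from gridded SST anomalies and then the ensemble mean and spread. It assumes longitudes in degrees east (so 170°W–120°W becomes 190°–240°), cosine-of-latitude area weighting, and the sample standard deviation as the measure of spread; none of these conventions is stated in the text, and the function names are hypothetical.

```python
import numpy as np

def nino34_index(sst_anom, lats, lons):
    """Area-weighted mean SST anomaly over 5S-5N, 170W-120W.

    sst_anom : (..., n_lat, n_lon) anomalies; leading axes (e.g., members) are kept
    lats     : (n_lat,) latitudes in degrees north
    lons     : (n_lon,) longitudes in degrees east, 0-360 (assumed convention)
    """
    lat_mask = (lats >= -5.0) & (lats <= 5.0)
    lon_mask = (lons >= 190.0) & (lons <= 240.0)
    box = sst_anom[..., lat_mask, :][..., lon_mask]
    # cos(latitude) weights approximate grid-cell area on a regular grid
    weights = np.cos(np.deg2rad(lats[lat_mask]))[:, None] * np.ones(lon_mask.sum())
    return (box * weights).sum(axis=(-2, -1)) / weights.sum()

def ensemble_mean_and_spread(index_members):
    """Mean and sample standard deviation across the member axis (axis 0)."""
    return index_members.mean(axis=0), index_members.std(axis=0, ddof=1)

# Size of the hindcast set described above: years x start months x members
n_hindcasts = 120 * 12 * 100
```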

3.1 Deterministic prediction skill

First, to check the deterministic predictability of the EPS for the large events, Fig. 4 shows long lead time deterministic retrospective forecast results for four of the largest warm episodes (as measured by the peak Niño-3.4 SST anomalies) of the past 120 years. In all cases, the EPS is able to predict the observed strong El Niño events twelve months in advance, although some errors still exist in the forecasted onset and development, and in the magnitude of these events. The implication is that the signal components present in initial fields play a critical role in determining ENSO prediction skills (e.g., Peng and Kumar, 2005; Moore et al., 2006; Zheng et al., 2009).

Fig. 3. Time series of observed and forecasted Niño-3.4 SST anomalies at 6-month lead time. The dashed line represents the observed SST anomalies, the solid line represents the ensemble mean, and the shaded area represents the prediction spread.

Fig. 4. Four of the largest El Niños since 1886. The thick black curves are observed Niño-3.4 SST anomalies, and the thin curves of red, green, blue and purple are ensemble mean predictions started respectively 12, 9, 6, and 3 months before the peak of each El Niño.

Fig. 5. Anomaly correlation (top) and RMS error (bottom) of the Niño-3.4 SST anomalies for the model ensemble mean hindcast (solid line with closed circle), the deterministic hindcast (solid line with open circle), and persistence (dot–dashed line) as functions of lead time.

Figure 5 shows the anomaly correlation and root mean square (RMS) error between observed and predicted SST anomalies averaged over the Niño-3.4 region of the tropical Pacific Ocean as a function of lead time. For comparison with the original deterministic prediction skill, we also perform a prediction experiment whose initialization procedure is briefly described here: the wind stress anomalies reconstructed from observed SST anomalies via a singular value decomposition (SVD) based model are used to integrate the ocean model over the whole forecast period to generate initial conditions for the dynamical component, and the SST anomaly model initial conditions are taken as the observed SST anomalies (Zhang et al., 2003). The skill scores for the ensemble mean hindcast are better than those of the original deterministic forecast scheme; both hindcast schemes have particularly high skill at short lead times and beat persistence at all lead times, with a correlation coefficient greater than 0.94 for the first month. Beyond 4-month lead time, there is a distinct difference in RMS error between the ensemble mean hindcast and the original scheme. The RMS error of the ensemble mean remains smaller than 0.94°C over the 12-month prediction period, and is almost 0.2°C smaller than that of the original deterministic forecast scheme at a lead time of 12 months. Over the whole period, this improvement occurs because the advanced assimilation method can provide more dynamically consistent and accurate initial conditions than the original initialization method, and because averaging over the ensemble can remove some unpredictable stochastic information.
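The two deterministic scores used here are simple to compute. The sketch below defines them and applies them to synthetic data in which every member is the same signal plus independent noise; it is meant only to show why averaging over members tends to lower the RMS error, and it does not reproduce the hindcast results. The use of the Pearson correlation as the anomaly correlation, and all names and numbers in the demonstration, are assumptions.

```python
import numpy as np

def anomaly_correlation(forecast, observed):
    """Pearson correlation between forecast and observed anomaly series."""
    return float(np.corrcoef(forecast, observed)[0, 1])

def rms_error(forecast, observed):
    """Root mean square difference between forecast and observed series."""
    diff = np.asarray(forecast) - np.asarray(observed)
    return float(np.sqrt(np.mean(diff ** 2)))

# Synthetic demonstration (made-up data, not the hindcast results)
rng = np.random.default_rng(0)
n_times, n_members = 1440, 100
signal = np.sin(2 * np.pi * np.arange(n_times) / 48.0)         # stand-in "observations"
members = signal + rng.standard_normal((n_members, n_times))   # signal + independent noise
ensemble_mean = members.mean(axis=0)
single_member = members[0]
```

In this idealized setting the noise in the ensemble mean is reduced by roughly the square root of the number of members, so `rms_error(ensemble_mean, signal)` is much smaller than `rms_error(single_member, signal)`. In a real system the gain is smaller because member errors are correlated.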

3.2 Probabilistic prediction skill

We use Talagrand diagrams (also known as rank histograms) to evaluate whether the hindcast and the verifying observation are sampled from the same probability distribution (e.g., Talagrand et al., 1998; Hamill, 2001). The Talagrand diagrams are generated by ordering, at each grid point, the forecast values from each of the ensemble members from smallest to largest. For our full ensemble, with 100 members, this creates 101 intervals, and the value of the verifying observation then falls into one of the 101 categories. Figure 6 shows the Talagrand diagram for the SST anomalies over the Niño-3.4 region, that is, the frequencies as a function of the category index. For the SST anomalies, the distribution is largely flat, although the two extreme categories are somewhat higher than their adjacent categories. The 12-month lead hindcast is better in this respect than the 3-month lead hindcast, however, which indicates that the ensemble spread at longer lead is more reasonable. Also, there is a small shift of frequencies of the verifying observation from the lower categories to the higher categories at all four lead times (i.e., the frequencies in the upper intervals decrease from shorter to longer lead time, while the frequencies in the lower and middle intervals increase at the same time). The Talagrand diagrams indicate that the probability distribution of observations can be represented by the ensemble approach.
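The construction of a Talagrand diagram described above can be sketched as follows. The function name and array layout are assumptions, and ties between the observation and a member are ignored, which is reasonable for continuous variables such as SST anomalies.

```python
import numpy as np

def rank_histogram(ensemble, observed):
    """Relative frequency of the observation's rank among the ensemble members.

    ensemble : (n_cases, n_members) forecast values
    observed : (n_cases,) verifying observations
    Returns an array of length n_members + 1 (101 categories for 100 members).
    """
    n_members = ensemble.shape[1]
    # Rank = number of members below the observation, from 0 to n_members
    ranks = np.sum(ensemble < observed[:, None], axis=1)
    counts = np.bincount(ranks, minlength=n_members + 1)
    return counts / counts.sum()
```

If the observation and the members are drawn from the same distribution, each category has an expected frequency of 1/(n_members + 1), which is the reference level of a perfectly reliable ensemble. An ensemble with too little spread places the observation outside the ensemble range too often and raises the two extreme categories.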

As described in section 2, our ensemble members are generated based on the hypothesis of a Gaussian distribution, but the standard normally distributed perturbations are processed at all model grid points (not over regions), and thus we need to check whether the probability distributions of the Niño-3.4 forecasted ensemble members accord with the Gaussian distribution. Figure 7 shows the normalized probability curve of the forecasted ensemble members over the Niño-3.4 region based on the entire 120-yr period. At different lead times, the forecasted ensemble members agree with the normal distribution quite well, and there are no double- or multi-modal peaks for the ensemble members. This indicates that the generation methods of the forecast ensemble members are reasonable, and the ensemble-mean forecast result is the most representative deterministic forecast, capable of illustrating the deterministic performance of the EPS.

Fig. 6. Talagrand diagram for the full ensemble Niño-3.4 SST anomaly hindcast over the whole 120-year period: (a) 3-month lead time, (b) 6-month lead time, (c) 9-month lead time, and (d) 12-month lead time hindcasts. The dashed line marks the theoretical frequency for a perfectly reliable EPS.

Fig. 7. Gaussian distribution diagram for the Niño-3.4 ensemble SST anomaly hindcast: (a) 3-month lead time, (b) 6-month lead time, (c) 9-month lead time, and (d) 12-month lead time. The dashed line marks a standard normal probability curve.

Fig. 8. Niño-3.4 ROC curves for a lead time of (a) 3 months, (b) 6 months, (c) 9 months, and (d) 12 months. Warm events (upper tercile) are denoted with closed squares, normal events (middle tercile) are denoted with open circles, and cold events (lower tercile) are denoted with asterisks.

To measure the probabilistic prediction skill more accurately, we use the relative operating characteristic (ROC; e.g., Mason and Graham, 1999), which measures ensemble forecast performance by comparing the fraction of events that were properly forewarned (i.e., the hit rate) with the fraction of nonevents that occurred after a warning was issued (i.e., the false alarm rate). The ratios are determined from contingency tables, and the events are predefined and expressed in binary terms. Given an ensemble of hindcasts, an ROC curve can be constructed that shows the different combinations of hit and false alarm rates obtained for different forecast probabilities. The ROC curve is useful for identifying optimum strategies for issuing warnings because it indicates the trade-off between false alarms and misses. Details and examples of the ROC calculation can be found in Mason and Graham (1999).
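The contingency-table calculation described above can be illustrated with a minimal Python sketch. It is not the code used in this study: the function name `roc_points`, the toy forecast probabilities, and the warning thresholds are all assumptions made for illustration. A warning is taken to be issued whenever the forecast probability reaches the threshold.

```python
from typing import List, Sequence, Tuple


def roc_points(forecast_probs: Sequence[float],
               occurred: Sequence[bool],
               thresholds: Sequence[float]) -> List[Tuple[float, float]]:
    """Return one (false_alarm_rate, hit_rate) pair per warning threshold."""
    n_events = sum(1 for o in occurred if o)
    n_nonevents = len(occurred) - n_events
    if n_events == 0 or n_nonevents == 0:
        raise ValueError("need at least one event and one nonevent")
    points = []
    for t in thresholds:
        # Contingency-table counts for warnings issued when probability >= t.
        hits = sum(1 for p, o in zip(forecast_probs, occurred) if p >= t and o)
        false_alarms = sum(1 for p, o in zip(forecast_probs, occurred)
                           if p >= t and not o)
        points.append((false_alarms / n_nonevents, hits / n_events))
    return points


# Hypothetical forecast probabilities and binary outcomes (not real data).
probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
obs = [True, True, False, True, False, False]
curve = roc_points(probs, obs, thresholds=[0.0, 0.5, 1.01])
```

Lowering the threshold moves a point towards the upper-right corner of the diagram (more hits, more false alarms), which is the trade-off the ROC curve makes visible.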

ROC curves for the Niño-3.4 hindcasts at lead times of 3, 6, 9, and 12 months are shown in Fig. 8. For each lead time, there are three curves representing three different event types: (i) warm events (upper tercile), (ii) cold events (lower tercile), and (iii) normal events (middle tercile), where both the retrospective forecasts and the observations have been normalized by their local standard deviation. An ideal probabilistic forecast system would have relatively large hit rates and small false alarm rates, so that all the points on the ROC curve would cluster in the upper-left corner of the diagram (e.g., Kirtman, 2003). For a relatively poor forecast system, all the points of the ROC curve would lie very close to the dashed diagonal line, indicating that the hit rate and the false alarm rate are nearly the same (i.e., no skill). As in previous studies (e.g., Kirtman, 2003; DeWitt, 2005), the EPS has relatively higher skill for the warm and cold events and relatively lower skill for the neutral events. For 3- and 6-month lead times, both warm and cold events are fairly well predicted: the false alarm rates are low and the hit rates are relatively high when the agreement among the ensemble members is relatively large. For normal events, the 3-month lead time forecast also has some skill, although less than for the extremes, whereas for 6-, 9-, and 12-month leads the ROC curve lies close to the diagonal, indicating little skill. These results indicate that the EPS can capture and predict large SST anomaly signals or extreme events over the Niño-3.4 region in different seasons quite well (Zhang et al., 2005). At 9- and 12-month lead times, there is a considerable drop in skill. High-confidence forecasts for warm and cold events are only marginally better than those for normal events, suggesting that a confident forecast for a warm or cold event at 12 months lead time is still not particularly useful. This also appears to be the case in earlier studies (e.g., Barnston et al., 1999; Kirtman, 2003).
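The event definitions used for the ROC curves can be sketched in Python as follows. This is only an illustration under stated assumptions: the tercile boundaries are taken here as empirical quantiles of the normalized observed series, the forecast probability of an event is taken as the fraction of ensemble members falling in that category, and all function names and numbers are hypothetical rather than taken from the study.

```python
import statistics
from typing import List, Sequence, Tuple


def normalize(series: Sequence[float]) -> List[float]:
    """Divide a series of anomalies by its own standard deviation."""
    sd = statistics.pstdev(series)
    return [x / sd for x in series]


def tercile_bounds(series: Sequence[float]) -> Tuple[float, float]:
    """Return the two cut points that split the series into three equal parts."""
    lower, upper = statistics.quantiles(series, n=3)
    return lower, upper


def classify(value: float, lower: float, upper: float) -> str:
    """Label a normalized anomaly as a warm, cold, or normal event."""
    if value > upper:
        return "warm"
    if value < lower:
        return "cold"
    return "normal"


def category_probability(members: Sequence[float], category: str,
                         lower: float, upper: float) -> float:
    """Fraction of ensemble members that fall in the given category."""
    count = sum(1 for m in members if classify(m, lower, upper) == category)
    return count / len(members)


# Hypothetical observed anomalies (deg C), normalized before classification.
obs = normalize([-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5])
lower, upper = tercile_bounds(obs)

# Hypothetical five-member ensemble, assumed to be normalized already.
members = [1.2, 0.9, 0.8, 0.1, -0.2]
p_warm = category_probability(members, "warm", lower, upper)
```

In this toy case three of the five members exceed the upper tercile boundary, so the warm-event forecast probability is 0.6.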

The ability to verify the hindcast skill of warm events and cold events separately is one of the advantages of the ROC calculation, and thus we further use the ROC area to verify the probabilistic skill of the EPS for the three different event types. The ROC area is the area under the ROC curve; a perfect forecast system would have an ROC area of 1, while a system with no ability to distinguish in advance between different events would have a score of 0.5. Figure 9 shows the ROC area of SST anomalies for warm events, normal events, and cold events over the Niño-3.4 region as a function of lead time. Consistent with the ROC curve analysis above for the 120-yr hindcast, the ROC areas for both the warm and cold events are clearly higher than that of the neutral events throughout the 12-month forecast period. This also indicates that a large (initial) signal can lead to a reliable prediction and high prediction skill, whereas for small predicted signals the evolution of predicted SST anomalies in our EPS might be more chaotic, which would degrade prediction skill and induce obvious decreases of predictability (e.g., Zheng et al., 2009).

Fig. 9. ROC area of SST anomalies for warm events (solid line), normal events (dot-dashed line), and cold events (dashed line) over the Niño-3.4 region, shown as a function of lead time.
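The two reference values of the ROC area quoted above (1 for a perfect system, 0.5 for no skill) can be checked with a short Python sketch. Trapezoidal integration is an assumption chosen here for simplicity, and the function name `roc_area` is hypothetical; the study does not state how the area was integrated.

```python
from typing import Iterable, Tuple


def roc_area(points: Iterable[Tuple[float, float]]) -> float:
    """Trapezoidal area under (false_alarm_rate, hit_rate) points.

    The end points (0, 0) and (1, 1) are always added, since warning
    never or always gives those rates by construction.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


perfect = roc_area([(0.0, 1.0)])    # all hits, no false alarms
no_skill = roc_area([(0.5, 0.5)])   # point on the diagonal
```

Curves that bow towards the upper-left corner give areas between these two limits, which is how the warm, cold, and normal event scores in Fig. 9 can be compared on a single scale.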

4. Variation of ENSO ensemble predictability

As in previous studies (e.g., Chen et al., 2004), Fig. 3 shows that the characteristics of the interannual variability have clearly changed with time. To examine the possible interdecadal variation of ENSO ensemble predictability, in this section we calculate both deterministic and probabilistic prediction skills for six sub-periods of 20 years each.

4.1 Deterministic predictability

ENSO's deterministic predictability depends on the time period in which it is estimated (Balmaseda et al., 1995; Kirtman and Schopf, 1998; Chen et al., 2004). This is also evident in Fig. 10. For the six sub-periods of 20 years each, both anomaly correlation and RMS error vary over significant ranges, especially at longer lead times. For example, high prediction skill appears in the late 19th century and the middle-late 20th century (i.e., 1886–1905, 1966–85, and 1986–2005); these periods are dominated by strong and regular ENSO events. The high scores for the 1966–85 and 1986–2005 periods might not be surprising because the model is trained using data from part of this period, whereas the high scores for the 1886–1905 period, which is free of artificial skill, indicate that large El Niño and La Niña events can be highly predictable, even when initialized with only SST anomaly data. In contrast, the periods 1906–25, 1926–45, and 1946–65 have relatively low prediction skill. The lower skill in these periods is consistent with there being fewer and smaller events to predict.

Fig. 10. Anomaly correlation (top) and RMS error (bottom) between the observed and the ensemble-mean predicted values of the Niño-3.4 index. These are shown as a function of lead time, for six consecutive 20-yr periods since 1886.

Fig. 11. The averaged correlation (top) and RMS error (bottom) between the observed and the predicted Niño-3.4 SST anomalies at 6-month lead time. The correlation and RMS error are computed for each running 20-yr window from 1886 to 2005. The shaded area represents the 95% confidence interval obtained via bootstrap procedures.
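The sub-period verification can be sketched in Python as below. This is a minimal illustration, not the verification code of the study: it assumes one value per year (the actual hindcasts are verified at several lead times), uses a centred Pearson correlation as the anomaly correlation, and runs on synthetic data in which the forecast differs from the observation by a constant 0.1 offset.

```python
import math
from typing import Dict, Sequence, Tuple


def correlation(obs: Sequence[float], fcst: Sequence[float]) -> float:
    """Centred Pearson correlation between two equally long series."""
    n = len(obs)
    mo, mf = sum(obs) / n, sum(fcst) / n
    cov = sum((o - mo) * (f - mf) for o, f in zip(obs, fcst))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_f = sum((f - mf) ** 2 for f in fcst)
    return cov / math.sqrt(var_o * var_f)


def rms_error(obs: Sequence[float], fcst: Sequence[float]) -> float:
    """Root mean square difference between forecast and observation."""
    return math.sqrt(sum((f - o) ** 2 for o, f in zip(obs, fcst)) / len(obs))


def skill_by_subperiod(years: Sequence[int], obs: Sequence[float],
                       fcst: Sequence[float], start: int = 1886,
                       length: int = 20, n_periods: int = 6
                       ) -> Dict[Tuple[int, int], Tuple[float, float]]:
    """Map (first_year, last_year) to (correlation, RMS error)."""
    skill = {}
    for k in range(n_periods):
        first = start + k * length
        last = first + length - 1
        idx = [i for i, y in enumerate(years) if first <= y <= last]
        o = [obs[i] for i in idx]
        f = [fcst[i] for i in idx]
        skill[(first, last)] = (correlation(o, f), rms_error(o, f))
    return skill


# Synthetic series (hypothetical): forecast = observation + 0.1.
years = list(range(1886, 2006))
obs = [math.sin(0.9 * i) for i in range(len(years))]
fcst = [o + 0.1 for o in obs]
skill = skill_by_subperiod(years, obs, fcst)
```

With a constant offset the correlation is 1 in every sub-period while the RMS error is 0.1, which shows why the two scores are reported side by side: each is blind to a kind of error that the other detects.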

The temporal variations of the deterministic prediction skill of the EPS are further displayed in Fig. 11, which shows the averaged correlation and RMS error at 6-month lead time measured over a running 20-yr window from 1886 to 2005 (i.e., 1886–1905, 1887–1906, ..., 1986–2005). For example, the skill at 1896 is calculated using the samples from 1886–1905. The 20-yr window is shifted forward by one year at a time across the period 1886–2005. There is a striking interdecadal variation of ENSO deterministic predictability (in both the correlation and RMS error) over the 120 years from 1886 to 2005 in the EPS. Generally, there is high predictability in the late 19th century and in the middle-late 20th century, and low predictability from 1906 to 1951 (correlation lower than 0.50).
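The running-window bookkeeping can be made explicit with a small Python sketch. The helper names (`running_windows`, `centre_year`, `running_skill`) and the toy metric are assumptions for illustration; only the window layout (20 years, shifted by one year, labelled as in the 1896 example) follows the text.

```python
from typing import Callable, List, Sequence, Tuple


def running_windows(first_year: int = 1886, last_year: int = 2005,
                    width: int = 20) -> List[Tuple[int, int]]:
    """All (start, end) windows of the given width, shifted by one year."""
    return [(s, s + width - 1)
            for s in range(first_year, last_year - width + 2)]


def centre_year(window: Tuple[int, int]) -> int:
    """Year used to label a window, e.g. 1896 for 1886-1905."""
    start, end = window
    return start + (end - start + 1) // 2


def running_skill(years: Sequence[int], obs: Sequence[float],
                  fcst: Sequence[float],
                  metric: Callable[[Sequence[float], Sequence[float]], float],
                  width: int = 20) -> List[Tuple[int, float]]:
    """Apply a skill metric to every running window of the record."""
    out = []
    for start, end in running_windows(years[0], years[-1], width):
        idx = [i for i, y in enumerate(years) if start <= y <= end]
        score = metric([obs[i] for i in idx], [fcst[i] for i in idx])
        out.append((centre_year((start, end)), score))
    return out


def mean_abs_error(obs: Sequence[float], fcst: Sequence[float]) -> float:
    """Toy metric used only to exercise the windowing."""
    return sum(abs(f - o) for o, f in zip(obs, fcst)) / len(obs)


# Hypothetical annual series: observation 0.0, forecast 0.5 everywhere.
years = list(range(1886, 2006))
series = running_skill(years, [0.0] * 120, [0.5] * 120, mean_abs_error)
```

A 120-year record yields 101 overlapping windows, so adjacent points on the running curve share 19 of their 20 years and are strongly dependent on each other.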

A bootstrap resampling procedure (Efron and Tibshirani, 1986) is also used to derive confidence limits for the skill scores, so that meaningful statistical conclusions can be drawn from these comparisons. The shaded area in Fig. 11 represents the 95% confidence interval computed using bootstrap procedures and indicates the uncertainty due to verification sampling. Taking the confidence interval into account, both the correlation and RMS error results show that verification sampling has a smaller influence on the forecast skill scores than the differences between the skills in different decades. This might be because the ensemble members match the Gaussian distribution quite well at different lead times (Fig. 7), so that the resampling process makes little adjustment to the distribution of the forecast ensembles.
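A percentile bootstrap of a skill score can be sketched in Python as follows. The details are assumptions, since the text does not specify them: forecast-observation pairs are resampled with replacement, 1000 resamples are drawn, and the 2.5th and 97.5th percentiles are taken as the 95% limits. The function names and toy data are hypothetical.

```python
import math
import random
from typing import Callable, Sequence, Tuple


def rms_error(obs: Sequence[float], fcst: Sequence[float]) -> float:
    """Root mean square difference between forecast and observation."""
    return math.sqrt(sum((f - o) ** 2 for o, f in zip(obs, fcst)) / len(obs))


def bootstrap_ci(obs: Sequence[float], fcst: Sequence[float],
                 metric: Callable[[Sequence[float], Sequence[float]], float],
                 n_boot: int = 1000, level: float = 0.95,
                 seed: int = 0) -> Tuple[float, float]:
    """Percentile confidence interval from resampled verification pairs."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(obs)
    scores = []
    for _ in range(n_boot):
        # Draw n forecast-observation pairs with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(metric([obs[i] for i in idx], [fcst[i] for i in idx]))
    scores.sort()
    tail = (1.0 - level) / 2.0
    lo = scores[int(math.floor(tail * (n_boot - 1)))]
    hi = scores[int(math.ceil((1.0 - tail) * (n_boot - 1)))]
    return lo, hi


# Hypothetical 20-yr window: errors alternate between 0.2 and 0.4 deg C.
obs = [0.0] * 20
fcst = [0.2 if i % 2 == 0 else 0.4 for i in range(20)]
point = rms_error(obs, fcst)
lo, hi = bootstrap_ci(obs, fcst, rms_error)
```

Comparing the width `hi - lo` with the spread of the score across windows is one way to judge whether differences between decades exceed the verification sampling uncertainty.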

4.2 Probabilistic predictability

To verify the variations of the probabilistic predictability of the EPS, we examine the temporal changes of the ROC area for the three different event types. Figure 12 shows the ROC area in the Niño-3.4 region for warm events, cold events, and normal events in six consecutive 20-yr periods since 1886. Clearly, the ENSO probabilistic predictability for different events also depends on the time period. For warm and cold events, high probabilistic prediction skill again appears in the late 19th century (i.e., 1886–1905, although the skill for warm events is then only a little higher than that in the early 20th century) and the middle-late 20th century (1966–85 and 1986–2005), with the highest skill for warm events in the sub-period 1966–85 and the highest skill for cold events in the sub-period 1986–2005.

In order to illustrate the interdecadal features of the probabilistic prediction skill more clearly, we further examine its temporal variations in the EPS. Figure 13 shows the ROC area for the three different event types at 6-month lead time measured over a running 20-yr window from 1886 to 2005. For the warm events, the highest skill appears in the late 20th century, and the lowest skill appears from 1910 to 1930. For the cold events, the highest skill also appears in the late 20th century, and the lowest skill emerges from 1920 to 1950. For the neutral events, the 20-yr averaged skill decreases from 1896 to 1910 and then increases approximately linearly from 1910 to 1995. The uncertainty due to verification sampling is also shown in Fig. 13 using the 95% confidence interval computed via bootstrap procedures. Taking the confidence interval into account, for the three different event types, the ROC analysis results also show that verification sampling has a smaller influence on the forecast skill scores than the differences between the skills in different decades. Compared to Fig. 11, the probabilistic verification uncertainties are larger than the deterministic verification uncertainties. Nevertheless, for the three different event types, there are still obvious interdecadal variations of ENSO probabilistic predictability over the 120 years from 1886 to 2005 in the EPS.

Fig. 12. ROC area in the Niño-3.4 region for (a) warm events (upper tercile), (b) cold events (lower tercile), and (c) normal events (middle tercile). These are shown as a function of lead time, for six consecutive 20-yr periods since 1886.

5. Discussions and conclusions

In this paper, long-term retrospective ensemble forecasts using 100 members and covering the past 120 years are performed with an EPS. With the assimilation of only a historic SST dataset, the prediction skill of the EPS is verified in both a deterministic and a probabilistic sense, and the EPS displays useful prediction skill. An interesting finding from the retrospective ensemble forecasts is that the EPS shows interdecadal variations in both deterministic and probabilistic prediction skill. Both are high in the late 19th century from 1886 to 1905 and then decline with time, reaching a minimum around 1910–50, beyond which skill rebounds and increases with time from the 1960s onward. The EPS has relatively high prediction skill (although this includes some artificial skill) from the 1960s onward, especially in the late 20th century from 1986 to 2005. These results are similar to those of previous studies (e.g., Chen et al., 2004; Tang et al., 2008), although there are still some differences in the prediction skill among different models [as also shown in Tang et al. (2008)]. However, the trends of the interdecadal variations in different models appear comparable (i.e., higher predictability in the late 19th century and in the middle-late 20th century, and lower predictability in the early 20th century). These results all indicate that the interdecadal variability of ENSO (deterministic and probabilistic) predictability exists generally and is not model dependent.

However, it should be noted that the theoretical framework discussed in this study is based on a relatively simple EPS, and one could argue that our analysis is not complete since we use only SST data. One serious question is whether the interdecadal variation in predictability discussed in this paper is due mainly to differences in the quality of the data in different periods. For example, one might guess that the high prediction skill for the period from 1966 to 2005 is due to better data quality, resulting from improvements in observation systems, and to the fact that the model was trained using data from part of this period.

To explore this, we can examine the simulation skill of the model forced by the reconstructed wind stress from the atmospheric τ model. The quality of the initial conditions of predictions and the model performance can both be indicated by the simulation skill, and both are tied to the data quality. Any impact of the data quality on the model's simulation skill will be felt mostly through the quality of the initial conditions, such as initial SST anomalies. Figure 14 shows the averaged correlation and RMS error between the observed and the simulated Niño-3.4 SST anomalies forced by the reconstructed wind stress anomalies for each running 20-yr window from 1886 to 2005, and indicates that the interdecadal difference of the simulation skill is not large in the model. The magnitude of variation is about 0.1 from maximum to minimum during the entire period for both correlation and RMS error (RMS error in °C). A comparison between Figs. 11 and 14 reveals that the interdecadal variation in predictability does not agree with that in the simulation skill. Thus, the interdecadal variation in predictability is not due to model performance associated with data quality. This is further suggested by the fact that noticeably higher prediction skill also occurs during the period from 1886 to 1905.

Fig. 13. The averaged ROC area of Niño-3.4 SST anomalies for (a) warm events, (b) normal events, and (c) cold events at 6-month lead time. The ROC areas are computed for each running 20-yr window from 1886 to 2005. The shaded area represents the 95% confidence interval obtained via bootstrap procedures.

Fig. 14. The averaged correlation (solid line) and RMS error (dashed line) between the observed and the simulated Niño-3.4 SST anomalies forced by the reconstructed wind stress anomalies. The correlation and RMS error are computed for each running 20-yr window from 1886 to 2005.

The results of the analyses in this paper motivate us to investigate further, and in detail, the possible reasons for and sources of limited ENSO predictability. These concerns need to be addressed in future work through more comprehensive analyses, and other possible factors (besides the ENSO signal) controlling ENSO predictability, such as nonlinearity and stochastic noise, also need to be discussed further. Nevertheless, this study is to date the first work to discuss both deterministic and probabilistic ENSO predictability using ensemble forecasts and long-term predictions. The results and conclusions obtained with this EPS might be helpful for the study of ENSO predictability.

Acknowledgements. The authors wish to thank the two anonymous reviewers for their very helpful comments and suggestions. This research is supported by the Chinese Academy of Sciences (Grant No. KZCX2-YW-202), the National Basic Research Program of China (2006CB403600), and the National Natural Science Foundation of China (Grant Nos. 40437017 and 40805033).

REFERENCES

Balmaseda, M. A., M. K. Davey, and D. L. T. Anderson, 1995: Decadal and seasonal dependence of ENSO prediction skill. J. Climate, 8, 2705–2715.

Barnston, A. G., M. Glantz, and Y. He, 1999: Predictive skill of statistical and dynamical climate models in SST forecasts during the 1997–98 El Niño and the 1998 La Niña onset. Bull. Amer. Meteor. Soc., 80, 217–243.

Chen, D., M. A. Cane, A. Kaplan, S. E. Zebiak, and D. Huang, 2004: Predictability of El Niño in the past 148 years. Nature, 428, 733–736.

DeWitt, D. G., 2005: Retrospective forecasts of interannual sea surface temperature anomalies from 1982 to present using a directly coupled atmosphere-ocean general circulation model. Mon. Wea. Rev., 133, 2972–2995.

Efron, B., and R. Tibshirani, 1986: Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy. Statistical Science, 1, 54–77.

Evensen, G., 2003: The ensemble Kalman filter: Theoretical formulation and practical implementation. Ocean Dynamics, 53, 343–367.

Evensen, G., 2004: Sampling strategies and square root analysis schemes for the EnKF. Ocean Dynamics, 54, 539–560.

Goswami, B. N., and J. Shukla, 1991: Predictability of a coupled ocean-atmosphere model. J. Climate, 4, 3–22.

Hamill, T. M., 2001: Interpretation of rank histograms for verifying ensemble forecasts. Mon. Wea. Rev., 129, 550–560.

Kirtman, B. P., and P. S. Schopf, 1998: Decadal variability in ENSO predictability and prediction. J. Climate, 11, 2804–2822.

Kirtman, B. P., 2003: The COLA anomaly coupled model: Ensemble ENSO prediction. Mon. Wea. Rev., 131, 2324–2341.

Keenlyside, N., and R. Kleeman, 2002: On the annual cycle of the zonal currents in the equatorial Pacific. J. Geophys. Res., 107, doi: 10.1029/2000JC0007111.

Latif, M., and Coauthors, 1998: A review of the predictability and prediction of ENSO. J. Geophys. Res., 103, 14,375–14,393.

Mason, S. J., and N. E. Graham, 1999: Conditional probabilities, relative operating characteristics, and relative operating levels. Wea. Forecasting, 14, 713–725.

McCreary, J. P., 1981: A linear stratified ocean model of the equatorial undercurrent. Philosophical Transactions of the Royal Society (London), 298, 603–635.

Moore, A., and Coauthors, 2006: Optimal forcing patterns for coupled models of ENSO. J. Climate, 19, 4683–4699.

Peng, P., and A. Kumar, 2005: A large ensemble analysis of the influence of tropical SSTs on seasonal atmospheric variability. J. Climate, 15, 1068–1085.

Smith, T. M., and R. W. Reynolds, 2004: Improved extended reconstruction of SST (1854–1997). J. Climate, 17, 2466–2477.

Smith, T. M., R. W. Reynolds, T. C. Peterson, and J. Lawrimore, 2008: Improvements to NOAA's historical merged land-ocean surface temperature analysis (1880–2006). J. Climate, 21, 2283–2296.

Talagrand, O., R. Vautard, and B. Strauss, 1998: Evaluation of probabilistic prediction systems. Proc. Seminar on Predictability, Reading, United Kingdom, ECMWF, 1–26.

Tang, Y., and W. W. Hsieh, 2003: ENSO simulation and predictions using a hybrid coupled model with data assimilation. J. Meteor. Soc. Japan, 81, 1–19.

Tang, Y., Z. Deng, X. Zhou, and Y. Cheng, 2008: Interdecadal variation of ENSO predictability in multiple models. J. Climate, 21, 4811–4833.

Tziperman, E., L. Stone, M. A. Cane, and H. Jarosh, 1994: El Niño chaos: Overlapping of resonances between the seasonal cycle and the Pacific ocean-atmosphere oscillator. Science, 264, 72–74.

Zhang, R.-H., S. E. Zebiak, R. Kleeman, and N. Keenlyside, 2003: A new intermediate coupled model for El Niño simulation and prediction. Geophys. Res. Lett., 30(19), 2012, doi: 10.1029/2003GL018010.

Zhang, R.-H., S. E. Zebiak, R. Kleeman, and N. Keenlyside, 2005: Retrospective El Niño forecast using an improved intermediate coupled model. Mon. Wea. Rev., 133, 2777–2802.

Zheng, F., J. Zhu, R.-H. Zhang, and G.-Q. Zhou, 2006a: Ensemble hindcasts of SST anomalies in the tropical Pacific using an intermediate coupled model. Geophys. Res. Lett., 33, L19604, doi: 10.1029/2006GL026994.

Zheng, F., J. Zhu, R.-H. Zhang, and G.-Q. Zhou, 2006b: Improved ENSO forecasts by assimilating sea surface temperature observations into an intermediate coupled model. Adv. Atmos. Sci., 23(4), 615–624, doi: 10.1007/s00376-006-0615-z.

Zheng, F., 2007: Research on ENSO ensemble predictions. Ph.D. dissertation, Institute of Atmospheric Physics, Chinese Academy of Sciences, 159 pp. (in Chinese)

Zheng, F., J. Zhu, and R.-H. Zhang, 2007: Impact of altimetry data on ENSO ensemble initializations and predictions. Geophys. Res. Lett., 34, L13611, doi: 10.1029/2007GL030451.

Zheng, F., and J. Zhu, 2008: Balanced multivariate model errors of an intermediate coupled model for ensemble Kalman filter data assimilation. J. Geophys. Res., 113, C07002, doi: 10.1029/2007JC004621.

Zheng, F., H. Wang, and J. Zhu, 2009: Impacts on ENSO ensemble prediction: Initial-error perturbations vs. model-error perturbations. Chinese Science Bulletin. (in press)

Probabilistic prediction of ENSO over the past 137 years using the CESM model

Article

Full-text available

Dec 2022

In this study, we investigate probabilistic predictability for the El Niño‐Southern Oscillation (ENSO) by assessing both actual prediction skill and potential predictability using a long‐term retrospective forecast from a complicated coupled general circulation model (CGCM). Our results indicate that above and below normal events are more predictable than neutral events. The probabilistic prediction skill suffers prominent “Spring Predictability Barrier” and undergoes notable interdecadal variation. For the above and below normal events, the lowest probabilistic prediction skills appear during 1920–1940 and the higher prediction skills occur after the 1960s. The seasonal and interdecadal variability of the probabilistic prediction skill stems mainly from the variability of the ENSO signal intensity. There is much room for improvement for the predictability of all three categories of ENSO events. At least an additional 1 or 2 months of skillful probabilistic predictions can be expected to progress in the future. To our knowledge, this is the first study to use a CGCM to evaluate probabilistic predictability for ENSO at various time scales.

The predictability study of the two flavors of ENSO in the CESM model from 1881 to 2017

Article

Full-text available

May 2022
CLIM DYNAM

In this study, we evaluated the predictability of the two flavors of the El Niño Southern Oscillation (ENSO) based on a long-term retrospective prediction from 1881 to 2017 with the Community Earth System Model. Specifically, the Central-Pacific (CP) ENSO has a more obvious Spring Predictability Barrier and lower deterministic prediction skill than the Eastern-Pacific (EP) ENSO. The potential predictability declines with lead time for both the two flavors of ENSO, and the EP ENSO has a higher upper limit of the prediction skill as compared with the CP ENSO. The predictability of the two flavors of ENSO shows distinct interdecadal variation for both actual skill and potential predictability; however, their trends in the predictability are not synchronized. The signal component controls the seasonal and interdecadal variations of predictability for the two flavors of ENSO, and has larger contribution to the CP ENSO than the EP ENSO. There is significant scope for improvement in predicting the two flavors of ENSO, especially for the CP ENSO.

Key Processes on Triggering the Moderate 2020/21 La Niña Event as Depicted by the Clustering Approach

Article

Full-text available

Feb 2022

The 2020/21 La Niña was not well predicted by most climate models when it started in early-mid 2020. This paper adopted an El Niño-Southern Oscillation (ENSO) ensemble prediction system to evaluate the key physical processes in the development of this cold event by performing a clustering analysis of 100 ensemble member predictions 1 year in advance. The abilities of two clustering approaches were first examined in regard to capturing the development of the 2020/21 La Niña event. One approach was index clustering, which adopted only the 12-month Niño3.4 indices in 2020 as an indicator, and the other was pattern clustering through contrasting the evolution of sea surface temperature (SST) anomalies over the tropical Pacific in 2020 for clustering. Pattern clustering surpasses index clustering in better describing the evolution over the off-equatorial and equatorial regions during the 2020/21 La Niña. Consequently, based on the pattern clustering approach, a comparison of the selected most (five best) and least (five worst) representative ensemble members illustrated that the predominance of anomalous southeasterly winds over the central equatorial Pacific in spring 2020 played a crucial role in initiating the moderate La Niña event in 2020/21, by preventing the development of westerly winds over the warm pool. Moreover, the inherent spring predictability barrier (SPB) was still a major challenge for improving the prediction skill of the 2020/21 La Niña event when the prediction occurred across the spring season.

A multi-model prediction system for ENSO

Article

Full-text available

May 2023

The El Niño and Southern Oscillation (ENSO) is the primary source of predictability for seasonal climate prediction. To improve the ENSO prediction skill, we established a multi-model ensemble (MME) prediction system, which consists of 5 dynamical coupled models with various complexities, parameterizations, resolutions, initializations and ensemble strategies, to account for the uncertainties as sufficiently as possible. Our results demonstrated the superiority of the MME over individual models, with dramatically reduced the root mean square error and improved the anomaly correlation skill, which can compete with, or even exceed the skill of the North American Multi-Model Ensemble. In addition, the MME suffered less from the spring predictability barrier and offered more reliable probabilistic prediction. The real-time MME prediction adequately captured the latest successive La Niña events and the secondary cooling trend six months ahead. Our MME prediction has, since April 2022, forecasted the possible occurrence of a third-year La Niña event. Overall, our MME prediction system offers better skill for both deterministic and probabilistic ENSO prediction than all participating models. These improvements are probably due to the complementary contributions of multiple models to provide additive predictive information, as well as the large ensemble size that covers a more reasonable uncertainty distribution.

The Predictability of Ocean Environments that Contributed to the 2020/21 Extreme Cold Events in China: 2020/21 La Niña and 2020 Arctic Sea Ice Loss

Article

Full-text available

Jan 2022

Several consecutive extreme cold events impacted China during the first half of winter 2020/21, breaking the low-temperature records in many cities. How to make accurate climate predictions of extreme cold events is still an urgent issue. The synergistic effect of the warm Arctic and cold tropical Pacific has been demonstrated to intensify the intrusions of cold air from polar regions into middle-high latitudes, further influencing the cold conditions in China. However, climate models failed to predict these two ocean environments at expected lead times. Most seasonal climate forecasts only predicted the 2020/21 La Niña after the signal had already become apparent and significantly underestimated the observed Arctic sea ice loss in autumn 2020 with a 1–2 month advancement. In this work, the corresponding physical factors that may help improve the accuracy of seasonal climate predictions are further explored. For the 2020/21 La Niña prediction, through sensitivity experiments involving different atmospheric-oceanic initial conditions, the predominant southeasterly wind anomalies over the equatorial Pacific in spring of 2020 are diagnosed to play an irreplaceable role in triggering this cold event. A reasonable inclusion of atmospheric surface winds into the initialization will help the model predict La Niña development from the early spring of 2020. For predicting the Arctic sea ice loss in autumn 2020, an anomalously cyclonic circulation from the central Arctic Ocean predicted by the model, which swept abnormally hot air over Siberia into the Arctic Ocean, is recognized as an important contributor to successfully predicting the minimum Arctic sea ice extent.

Reinitializing Sea Surface Temperature in the Ensemble Intermediate Coupled Model for Improved Forecasts

Article

Full-text available

Aug 2021

The Ensemble Intermediate Coupled Model (EICM) is a model used for studying the El Niño-Southern Oscillation (ENSO) phenomenon in the Pacific Ocean, which is anomalies in the Sea Surface Temperature (SST) are observed. This research aims to implement Cressman to improve SST forecasts. The simulation considers two cases in this work: the control case and the Cressman initialized case. These cases are simulations using different inputs where the two inputs differ in terms of their resolution and data source. The Cressman method is used to initialize the model with an analysis product based on satellite data and in situ data such as ships, buoys, and Argo floats, with a resolution of 0.25 × 0.25 degrees. The results of this inclusion are the Cressman Initialized Ensemble Intermediate Coupled Model (CIEICM). Forecasting of the sea surface temperature anomalies was conducted using both the EICM and the CIEICM. The results show that the calculation of SST field from the CIEICM was more accurate than that from the EICM. The forecast using the CIEICM initialization with the higher-resolution satellite-based analysis at a 6-month lead time improved the root mean square deviation to 0.794 from 0.808 and the correlation coefficient to 0.630 from 0.611, compared the control model that was directly initialized with the low-resolution in-situ-based analysis.

Atlantic Niño/Niña Prediction Skills in NMME Models

Article

Full-text available

Jun 2021

The Atlantic Niño/Niña, one of the dominant interannual variability in the equatorial Atlantic, exerts prominent influence on the Earth’s climate, but its prediction skill shown previously was unsatisfactory and limited to two to three months. By diagnosing the recently released North American Multimodel Ensemble (NMME) models, we find that the Atlantic Niño/Niña prediction skills are improved, with the multi-model ensemble (MME) reaching five months. The prediction skills are season-dependent. Specifically, they show a marked dip in boreal spring, suggesting that the Atlantic Niño/Niña prediction suffers a “spring predictability barrier” like ENSO. The prediction skill is higher for Atlantic Niña than for Atlantic Niño, and better in the developing phase than in the decaying phase. The amplitude bias of the Atlantic Niño/Niña is primarily attributed to the amplitude bias in the annual cycle of the equatorial sea surface temperature (SST). The anomaly correlation coefficient scores of the Atlantic Niño/Niña, to a large extent, depend on the prediction skill of the Niño3.4 index in the preceding boreal winter, implying that the precedent ENSO may greatly affect the development of Atlantic Niño/Niña in the following boreal summer.

一个ENSO多模式集合预报系统介绍

Article

May 2023

Modern Problems of Mathematical Physics and Their Applications

Book

Full-text available

Mar 2022

There are many applications of mathematical physics in several fields of basic science and engineering. Thus, we have tried to provide the Special Issue “Modern Problems of Mathematical Physics and Their Applications” to cover the new advances of mathematical physics and its applications. In this Special Issue, we have focused on some important and challenging topics, such as integral equations, ill-posed problems, ordinary differential equations, partial differential equations, system of equations, fractional problems, linear and nonlinear problems, fuzzy problems, numerical methods, analytical methods, semi-analytical methods, convergence analysis, error analysis and mathematical models. In response to our invitation, we received 31 papers from more than 17 countries (Russia, Uzbekistan, China, USA, Kuwait, Bosnia and Herzegovina, Thailand, Pakistan, Turkey, Nigeria, Jordan, Romania, India, Iran, Argentina, Israel, Canada, etc.), of which 19 were published and 12 rejected.

ENSO Predictability over the Past 137 Years Based on a CESM Ensemble Prediction System

Article

Dec 2021

In this study, we conducted an ensemble retrospective prediction from 1881 to 2017 using the Community Earth System Model to evaluate El Niño–Southern Oscillation (ENSO) predictability and its variability on different time scales. To our knowledge, this is the first assessment of ENSO predictability using a long-term ensemble hindcast with a complicated coupled general circulation model (CGCM). Our results indicate that both the dispersion component (DC) and signal component (SC) contribute to the interannual variation of ENSO predictability (measured by relative entropy). Specifically, the SC is more important for ENSO events, whereas the DC is of comparable importance for short lead times and in weak ENSO signal years. The SC dominates the seasonal variation of ENSO predictability, and an abrupt decrease in signal intensity results in the spring predictability barrier feature of ENSO. At the interdecadal scale, the SC controls the variability of ENSO predictability, while the magnitude of ENSO predictability is determined by the DC. The seasonal and interdecadal variations of ENSO predictability in the CGCM are generally consistent with results based on intermediate complexity and hybrid coupled models. However, the DC has a greater contribution in the CGCM than that in the intermediate complexity and hybrid coupled models.

Significance Statement: El Niño–Southern Oscillation (ENSO) is a prominent interannual signal in the global climate system with widespread climatic influence. Our current understanding of ENSO predictability is based mainly on long-term retrospective forecasts obtained from intermediate complexity and hybrid coupled models. Compared with those models, complicated coupled general circulation models (CGCMs) include more realistic physical processes and have the potential to reproduce the ENSO complexity. However, hindcast studies based on CGCMs have only focused on the last 20–60 years. In this study, we conducted an ensemble retrospective prediction from 1881 to 2017 using the Community Earth System Model in order to evaluate ENSO predictability and examine its variability on different time scales. To our knowledge, this is the first assessment of ENSO predictability using a long-term ensemble hindcast with a CGCM.
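The split of relative entropy into a signal component and a dispersion component can be illustrated with a minimal sketch. It assumes that both the forecast ensemble and the climatology are univariate Gaussian distributions, which is a common simplification and not necessarily the form used in the study above. The function name and argument names are hypothetical.

```python
import numpy as np

def relative_entropy_gaussian(mu_p, var_p, mu_c, var_c):
    """Relative entropy of a Gaussian forecast N(mu_p, var_p) with respect
    to a Gaussian climatology N(mu_c, var_c), split into two parts.

    Returns (total, signal, dispersion), in nats.
    """
    # Signal component: squared shift of the forecast mean away from the
    # climatological mean, scaled by the climatological variance.
    signal = 0.5 * (mu_p - mu_c) ** 2 / var_c
    # Dispersion component: depends only on the ratio of the variances.
    dispersion = 0.5 * (np.log(var_c / var_p) + var_p / var_c - 1.0)
    return signal + dispersion, signal, dispersion

# Illustrative numbers only: a forecast shifted by one climatological
# standard deviation whose spread is half the climatological spread.
total, sc, dc = relative_entropy_gaussian(mu_p=1.0, var_p=0.25, mu_c=0.0, var_c=1.0)
```

Under these assumptions, a forecast identical to climatology gives zero for both components, and a sharper or more strongly shifted forecast gives a larger value.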

Interdecadal Variation of ENSO Predictability in Multiple Models

Article

Sep 2008

In this study, El Niño–Southern Oscillation (ENSO) retrospective forecasts were performed for the 120 yr from 1881 to 2000 using three realistic models that assimilate the historic dataset of sea surface temperature (SST). By examining these retrospective forecasts and corresponding observations, as well as the oceanic analyses from which forecasts were initialized, several important issues related to ENSO predictability have been explored, including its interdecadal variability and the dominant factors that control the interdecadal variability. The prediction skill of the three models showed a very consistent interdecadal variation, with high skill in the late nineteenth century and in the middle–late twentieth century, and low skill during the period from 1900 to 1960. The interdecadal variation in ENSO predictability is in good agreement with that in the signal of interannual variability and in the degree of asymmetry of the ENSO system. A close relationship was also identified between the degree of asymmetry and the signal of interannual variability. Generally, high predictability is attained when ENSO signal strength and the degree of asymmetry are enhanced, and vice versa. The atmospheric noise generally degrades overall prediction skill, especially for the skill of mean square error, but is able to favor some individual prediction cases. The possible reasons why these factors control ENSO predictability were also discussed.

Balanced multivariate model errors of an intermediate coupled model for ensemble Kalman filter data assimilation

Article

Jul 2008

The ensemble Kalman filter (EnKF) depends on a set of ensemble forecasts to calculate the background error covariances. Without model error perturbations and the inflation of forecast ensembles, the spread of the ensemble forecasts can collapse rapidly. There are several ways to generate model perturbations, i.e., perturbations in model parameters/parameterizations, perturbations in the forcing fields of the model and adding some error terms to the right-hand side of the model equations. In this paper, we focus on the "adding model error terms" approach, which utilizes a first-order Markov chain model. This approach is suitable for unforced models, such as coupled atmosphere-ocean models. However, for a multivariate model, the balance between different model variables could be an important issue in building its model-error model. In this paper, we focus on building a balanced error model for an intermediate coupled model for El Niño-Southern Oscillation (ENSO) predictions. A simple approach to build such a model-error model is proposed on the basis of the multivariate empirical orthogonal functions method. EnKF data assimilation experiments with different configurations of multivariate model error treatments (no model errors, unbalanced and balanced model errors) are performed using realistic sea surface temperature (SST) and sea level (SL) observations. Results show that it is necessary to develop balanced, multivariate model-error models in order to successfully assimilate both SST and SL observations. The hindcasts initialized from these different assimilation experiment results also demonstrate that the balanced model errors can yield more balanced initial conditions that lead to improved predictions of ENSO events.
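As a rough illustration of the "adding model error terms" idea, the sketch below generates scalar first-order Markov (AR(1)) perturbations for each ensemble member. It is a minimal sketch under stated assumptions and does not reproduce the balanced multivariate error model of the paper: the function name, the memory coefficient `alpha`, and the target standard deviation `sigma` are all hypothetical.

```python
import numpy as np

def ar1_perturbations(n_steps, n_members, alpha, sigma, seed=0):
    """First-order Markov (AR(1)) perturbations, one series per member.

    q[t] = alpha * q[t-1] + sqrt(1 - alpha**2) * sigma * w[t],  w ~ N(0, 1)

    The sqrt(1 - alpha**2) factor keeps the stationary standard deviation
    of q equal to sigma for any 0 <= alpha < 1.
    """
    rng = np.random.default_rng(seed)
    q = np.empty((n_steps, n_members))
    # Start from the stationary distribution so there is no spin-up.
    q[0] = sigma * rng.standard_normal(n_members)
    noise_scale = np.sqrt(1.0 - alpha**2) * sigma
    for t in range(1, n_steps):
        q[t] = alpha * q[t - 1] + noise_scale * rng.standard_normal(n_members)
    return q

# Illustrative values only: 100 members, strong month-to-month memory.
q = ar1_perturbations(n_steps=2000, n_members=100, alpha=0.9, sigma=0.3)
```

In an ensemble system each member would add its own column of `q` to the model tendency or state at every step, so that the ensemble spread does not collapse between analyses.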

Impact of altimetry data on ENSO ensemble initializations and predictions

Article

Jul 2007

The El Niño/Southern Oscillation (ENSO) predictions strongly depend on the accuracy and dynamical consistency of the coupled initial conditions. Based on the proposed ensemble Kalman filter (EnKF), a new initialization scheme for the ENSO ensemble prediction system (EPS) was designed and tested in an intermediate coupled model (ICM). The inclusion of this scheme in the ICM leads to substantial improvements in ENSO prediction skill via the successful assimilation of both observed sea surface temperature (SST) and TOPEX/Poseidon/Jason-1 (T/P/J) altimeter data into the initial ensemble conditions. Comparisons with the original ensemble hindcast experiment show that the ensemble prediction skills were significantly improved out to a 12-month lead time by improving sea level (SL) initial conditions for better parameterization of subsurface thermal effects. It is clearly demonstrated that improvement in forecast skill can result from the multivariate and multi-observational ensemble data assimilation.

ENSO ensemble prediction: Initial error perturbations vs. model error perturbations

Article

Jul 2009

Based on our developed ENSO (El Niño-Southern Oscillation) ensemble prediction system (EPS), the impacts of stochastic initial-error and model-error perturbations on ENSO ensemble predictions are examined and discussed by performing four sets of 14-year retrospective forecast experiments in both a deterministic and probabilistic sense. These forecast schemes are differentiated by whether they considered the initial or model stochastic perturbations. The comparison results suggest that the stochastic model-error perturbations, which are added into the modeled physical fields mainly to represent the uncertainties of the physical model, have significant, positive impacts on the ensemble prediction skills during the entire 12-month forecast process. However, the stochastic initial-error perturbations have relatively small impacts on the ensemble prediction system, and their impacts are mainly concentrated in the first 3 months of the predictions.

Conditional Probabilities, Relative Operating Characteristics, and Relative Operating Levels

Article

Oct 1999

The relative operating characteristic (ROC) curve is a highly flexible method for representing the quality of dichotomous, categorical, continuous, and probabilistic forecasts. The method is based on ratios that measure the proportions of events and nonevents for which warnings were provided. These ratios provide estimates of the probabilities that an event will be forewarned and that an incorrect warning will be provided for a nonevent. Some guidelines for interpreting the ROC curve are provided. While the ROC curve is of direct interest to the user, the warning is provided in advance of the outcome and so there is additional value in knowing the probability of an event occurring contingent upon a warning being provided or not provided. An alternative method to the ROC curve is proposed that represents forecast quality when expressed in terms of probabilities of events occurring contingent upon the warnings provided. The ratios used provide estimates of the probability of an event occurring given the forecast that is issued. Some problems in constructing the curve in a manner that is directly analogous to that for the ROC curve are highlighted, and so an alternative approach is proposed. In the context of probabilistic forecasts, the ROC curve provides a means of identifying the forecast probability at which forecast value is optimized. In the context of continuous variables, the proposed relative operating levels curve indicates the exceedence threshold for defining an event at which forecast skill is optimized, and can enable the forecast user to estimate the probabilities of events other than that defined by the forecaster.
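The hit-rate and false-alarm-rate ratios described above can be sketched for an ensemble forecast as follows. This is a minimal illustration, not the procedure of any particular paper: the function names, the event definition (members exceeding a threshold), and the trapezoidal area estimate are assumptions, and the sample is assumed to contain both events and nonevents.

```python
import numpy as np

def event_probability(ensemble, event_threshold):
    """Forecast probability = fraction of members above the threshold.
    `ensemble` has shape (n_forecasts, n_members)."""
    return np.mean(np.asarray(ensemble, dtype=float) > event_threshold, axis=1)

def roc_points(prob_forecasts, event_observed, warning_thresholds):
    """False-alarm rate and hit rate for each warning threshold."""
    p = np.asarray(prob_forecasts, dtype=float)
    o = np.asarray(event_observed, dtype=bool)
    hit_rates, false_alarm_rates = [], []
    for thr in warning_thresholds:
        warn = p >= thr  # a warning is issued when probability >= thr
        hits = np.sum(warn & o)
        misses = np.sum(~warn & o)
        false_alarms = np.sum(warn & ~o)
        correct_rejections = np.sum(~warn & ~o)
        hit_rates.append(hits / (hits + misses))
        false_alarm_rates.append(false_alarms / (false_alarms + correct_rejections))
    return np.array(false_alarm_rates), np.array(hit_rates)

def roc_area(false_alarm_rates, hit_rates):
    """Trapezoidal area under the ROC curve, with (0, 0) and (1, 1) added."""
    far = np.concatenate(([0.0], false_alarm_rates, [1.0]))
    hr = np.concatenate(([0.0], hit_rates, [1.0]))
    order = np.lexsort((hr, far))  # sort by false-alarm rate, then hit rate
    far, hr = far[order], hr[order]
    return float(np.sum(0.5 * (hr[1:] + hr[:-1]) * np.diff(far)))

# Toy data: four forecasts, the first two verified as events.
probs = [0.9, 0.8, 0.1, 0.2]
observed = [1, 1, 0, 0]
far, hr = roc_points(probs, observed, [0.05, 0.25, 0.5, 0.75, 0.95])
area = roc_area(far, hr)
```

An area of 0.5 corresponds to a forecast with no ability to discriminate events from nonevents, so values above 0.5 indicate some probabilistic skill.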

Retrospective Forecasts of Interannual Sea Surface Temperature Anomalies from 1982 to Present Using a Directly Coupled Atmosphere Ocean General Circulation Model

Article

Oct 2005

David DeWitt

A large number of ensemble hindcasts (or retrospective forecasts) of tropical Pacific sea surface temperature (SST) have been made with a coupled atmosphere–ocean general circulation model (CGCM) that does not employ flux correction in order to evaluate the potential skill of the model as a seasonal forecasting tool. Oceanic initial conditions are provided by an ocean data assimilation system. Ensembles of seven forecasts of 6-month length are made starting each month in the 1982 to 2002 period. Skill of the coupled model is evaluated from both a deterministic and a probabilistic perspective. The skill metrics are calculated using both the bulk method, which includes all initial condition months together, and as a function of initial condition month. The latter method allows a more objective evaluation of how the model has performed in the context in which forecasts are actually made and applied. The deterministic metrics used are the anomaly correlation and the root-mean-square error. The coupled model deterministic skill metrics are compared with those from persistence and damped persistence reference forecasts. Despite the fact that the coupled model has a large cold bias in the central and eastern equatorial Pacific this coupled model is shown to have forecast skill that is competitive with other state-of-the-art forecasting techniques. Potential skill from probabilistic forecasts made using the coupled model ensemble members are evaluated using the relative operating characteristics method. This analysis indicates that for most initial condition months this coupled model has more skill at forecasting cold events than warm or neutral events in the central Pacific. In common with other forecasting systems, the coupled model forecast skill is found to be lowest for forecasts passing through the Northern Hemisphere (NH) spring. 
Diagnostics of this so-called spring predictability barrier in the context of this coupled model indicate that two factors likely contribute to this predictability barrier. First, the coupled model shows a too-weak coupling of the surface and subsurface temperature anomalies during NH spring. Second, the coupled-model-simulated signal-to-noise ratio for SST anomalies is much lower during NH spring than at other times of the year, indicating that the model’s potential predictability is low at this time.
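The two deterministic metrics named above, and the persistence reference forecast, can be sketched as follows. This is a minimal illustration with hypothetical function names. It assumes the inputs are already anomalies and uses a centered correlation, which may differ in detail from the definitions used in the paper.

```python
import numpy as np

def anomaly_correlation(forecast, observed):
    """Centered correlation between forecast and observed anomalies."""
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(observed, dtype=float)
    f = f - f.mean()
    o = o - o.mean()
    return float(np.sum(f * o) / np.sqrt(np.sum(f**2) * np.sum(o**2)))

def rmse(forecast, observed):
    """Root-mean-square error between forecast and observed anomalies."""
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((f - o) ** 2)))

def persistence_pairs(observed, lead):
    """Persistence reference: the forecast for time t is the observed
    anomaly at time t - lead. Returns (forecast, verification) arrays."""
    o = np.asarray(observed, dtype=float)
    return o[:-lead], o[lead:]

# Toy monthly anomaly series (illustrative values only).
obs = np.array([0.2, 0.5, 0.9, 1.1, 0.8, 0.3, -0.2, -0.6])
f_pers, verif = persistence_pairs(obs, lead=3)
skill_corr = anomaly_correlation(f_pers, verif)
skill_rmse = rmse(f_pers, verif)
```

A model forecast is usually judged useful at a given lead time only if its correlation is higher and its RMS error lower than those of such a reference.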

Decadal Variability in ENSO Predictability and Prediction

Article

Nov 1998

A simple coupled model is used to examine decadal variations in El Niño-Southern Oscillation (ENSO) prediction skill and predictability. Without any external forcing, the coupled model produces regular ENSO-like variability with a 5-yr period. Superimposed on the 5-yr oscillation is a relatively weak decadal amplitude modulation with a 20-yr period. External uncoupled atmospheric “weather noise” that is determined from observations is introduced into the coupled model. Including the weather noise leads to irregularity in the ENSO events, shifts the dominant period to 4 yr, and amplifies the decadal signal. The decadal signal results without any external prescribed changes to the mean climate of the model. Using the coupled simulation with weather noise as initial conditions and for verification, a large ensemble of prediction experiments was made. The forecast skill and predictability were examined and shown to have a strong decadal dependence. During decades when the amplitude of the interannual variability is large, the forecast skill is relatively high and the limit of predictability is relatively long. Conversely, during decades when the amplitude of the interannual variability is low, the forecast skill is relatively low and the limit of predictability is relatively short. During decades when the predictability is high, the delayed oscillator mechanism drives the sea surface temperature anomaly (SSTA), and during decades when the predictability is low, the atmospheric noise strongly influences the SSTA. Additional experiments indicate that the relative effectiveness of the delayed oscillator mechanism versus the external noise forcing in determining interannual SSTA variability is strongly influenced by much slower timescale (decadal) variations in the state of the coupled model.

The COLA anomaly coupled model: ensemble ENSO prediction

Article

Oct 2003

Ben Kirtman

Results are described from a large sample of coupled ocean-atmosphere retrospective forecasts during 1980-99. The prediction system includes a global anomaly coupled general circulation model and a state-of-the-art ocean data assimilation system. The retrospective forecasts are initialized each January, April, July, and October of each year, and ensembles of six forecasts are run for each initial month, yielding a total of 480 1-yr predictions. In generating the ensemble members, perturbations are added to the atmospheric initial state only. The skill of the prediction system is analyzed from both a deterministic and a probabilistic perspective. The probabilistic approach is used to quantify the uncertainty in any given forecast. The deterministic measures of skill for eastern tropical Pacific SST anomalies (SSTAs) suggest that the ensemble mean forecasts are useful up to lead times of 7-9 months. At somewhat shorter leads, the forecasts capture some aspects of the variability in the tropical Indian and Atlantic Oceans. The ensemble mean precipitation anomaly has disappointingly low correlation with observed rainfall. The probabilistic measures of skill (relative operating characteristics) indicate that the distribution of the ensemble provides useful forecast information that could not easily be gleaned from the ensemble mean. In particular, the prediction system has more skill at forecasting cold ENSO events compared to warm events. Despite the fact that the ensemble mean rainfall is not well correlated with the observed, the ensemble distribution does indicate significant regions where there is useful information in the forecast ensemble. In fact, it is possible to detect that droughts over land are more predictable than floods. It is argued that probabilistic verification is an important complement to any deterministic verification, and provides a useful and quantitative way to measure uncertainty.

ENSO Simulation and Prediction in a Hybrid Coupled Model with Data Assimilation

Article

Feb 2003

With a 3D-Var assimilation scheme, several types of observations—sea surface temperatures (SST), sea level height anomalies (SLHA), and the upper ocean 400-m depth-averaged heat content anomalies (HCA)—were assimilated into a hybrid coupled model of the tropical Pacific. The ocean analyses, and prediction skills of the SST anomalies (SSTA) from the assimilation of each type of observation, were presented for 1980–1998. SST assimilation, besides improving the simulation of SSTA, also slightly improved the HCA and SLHA simulations in the equatorial Pacific, especially in the east. The ocean analyses with the assimilation of SLHA improved the simulations of SSTA, SLHA and HCA in the equatorial Pacific, while the assimilation of HCA improved the SLHA and HCA simulations. For ENSO predictions, assimilating SST yielded the best prediction skills for the Niño3 region SSTA at lead times of 3 months or shorter, but severely degraded the predictions at longer lead times. The best Niño3 SSTA predictions for lead times longer than 3 months came from the initializations with the assimilation of HCA and SLHA data. Assimilating SLHA yielded prediction skills for the Niño3 SSTA almost as good as assimilating HCA, indicating considerable potential for improving ENSO predictions from altimetry data. In this study, a neural network (NN) approach was used to find the nonlinear statistical relations among model variables for the assimilation of HCA and SLHA. Using NN yielded better prediction skills than using multiple linear regression.

Improved Extended Reconstruction of SST (1854–1997)

Article

Jun 2004

An improved SST reconstruction for the 1854-1997 period is developed. Compared to the version 1 analysis, in the western tropical Pacific, the tropical Atlantic, and Indian Oceans, more variance is resolved in the new analysis. This improved analysis also uses sea ice concentrations to improve the high-latitude SST analysis and a modified historical bias correction for the 1939-41 period. In addition, the new analysis includes an improved error estimate. Analysis uncertainty is largest in the nineteenth century and during the two world wars due to sparse sampling. The near-global average SST in the new analysis is consistent with the version 1 reconstruction. The 95% confidence uncertainty for the near-global average is 0.4°C or more in the nineteenth century, near 0.2°C for the first half of the twentieth century, and 0.1°C or less after 1950.
