ISSN 1440771X
ISBN 0 7326 1091 5
Prediction Intervals for Exponential Smoothing State Space Models
Rob J Hyndman, Anne B Koehler, J. Keith Ord and Ralph D Snyder
Working Paper 11/2001
2001
DEPARTMENT OF ECONOMETRICS
AND BUSINESS STATISTICS
AUSTRALIA
Prediction intervals for exponential smoothing state space models
Rob J Hyndman^1, Anne B. Koehler^2, J. Keith Ord^3, Ralph D. Snyder^1
9 November 2001
Abstract: The main objective of this paper is to provide analytical expressions for forecast
variances that can be used in prediction intervals for the exponential smoothing methods. These
expressions are based on state space models with a single source of error that underlie the
exponential smoothing methods. Three general classes of the state space models are presented.
The ﬁrst class is the standard linear state space model with homoscedastic errors, the second
retains the linear structure but incorporates a dynamic form of heteroscedasticity, and the third
allows for nonlinear structure in the observation equation as well as heteroscedasticity. Exact
matrix formulas for the forecast variances are found for each of these three classes of models.
These formulas are specialized to nonmatrix formulas for ﬁfteen state space models that underlie
nine exponential smoothing methods, including all the widely used methods. In cases where an
ARIMA model also underlies an exponential smoothing method, there is an equivalent state
space model with the same variance expression. We also discuss relationships between these
new ideas and previous suggestions for ﬁnding forecast variances and prediction intervals for the
exponential smoothing methods.
Keywords: forecast distribution, forecast interval, forecast variance, Holt-Winters method,
structural models.
JEL classiﬁcation: C22, C53.
^1 Department of Econometrics and Business Statistics, Monash University, VIC 3800, Australia.
^2 Department of Decision Sciences and Management Information Systems, Miami University, Oxford, OH 45056, USA.
^3 320 Old North, Georgetown University, Washington, DC 20057, USA.
Corresponding author: Rob Hyndman (Rob.Hyndman@buseco.monash.edu.au).
Exponential smoothing methods were given a ﬁrm statistical foundation by the use of state space
models with a single source of error (Ord, Koehler, and Snyder, 1997). One of the important
contributions following from that work is the ability to provide a sound statistical basis for ﬁnding
prediction intervals for all the exponential smoothing methods. Traditionally, prediction intervals
for the exponential smoothing methods have been found through heuristic approaches or by
employing equivalent or approximate ARIMA models.
The major goal of this paper is to provide analytical expressions for the variances of the forecast
errors to be used in computing prediction intervals for many types of exponential smoothing,
including all of the widely used methods. In contrast, Ord, Koehler, and Snyder (1997) found prediction
intervals by using the model to simulate the entire prediction distributions for each future
time period. While simulating prediction intervals may be an excellent method for producing
them, many forecasters may prefer analytical formulas for their forecasting software. Hyndman
et al. (2001) describe a framework of 24 models for exponential smoothing, including all of the
usual methods as well as some extensions. The procedures in that paper also use simulation
to produce prediction intervals for the models. We will provide analytical expressions for the
forecast variances for some of those 24 models.
Where an equivalent ARIMA model exists (such as for simple exponential smoothing, Holt's linear
method, and the additive Holt-Winters method), our results provide identical forecast variances
to those from the ARIMA model. However, we also provide forecast variances for many
exponential smoothing methods where there is no equivalent ARIMA model.
State space models with multiple sources of error have also been used to ﬁnd forecast variances
for the simple and Holt exponential smoothing methods (Johnston and Harrison, 1986). In these
cases the variances are limiting values in models where the convergence is rapid. The variance
formulas in these two cases are the same as in our results.
Prediction intervals for the additive Holt-Winters method and the multiplicative Holt-Winters
method have previously been considered by Chatfield and Yar. For the additive Holt-Winters
method they found an exact formula for the forecast variance that can be computed directly from
the form of the smoothing method (Yar and Chatfield, 1990). For the multiplicative Holt-Winters
method, they provided an approximate formula (Chatfield and Yar, 1991). In both papers they
assumed that the one-period-ahead forecast errors are independent but they did not assume any
particular underlying model for the smoothing methods.
Using a single source of error state space model, Koehler, Snyder, and Ord (2001) derived an
approximate formula for the forecast variance for the multiplicative Holt-Winters method. Their
formula differs from that of Chatfield and Yar (1991) only in how the standard deviation of the
one-step-ahead forecast error is estimated. The variance formulas were given only for the first
year of forecasts in both of these papers (Chatfield and Yar, 1991; Koehler, Snyder, and Ord, 2001).
The results in this current paper include finding both an exact formula (ignoring the estimation
error for the smoothing parameters) for the forecast variance in all future time periods for the
multiplicative Holt-Winters method and a better approximation to this exact formula. Another
point of difference in our work is that Yar and Chatfield (1990) assumed that the variance of the
one-period-ahead forecast error is constant for the additive Holt-Winters method. We include a
class of models where this forecast variance is not constant but instead changes with the mean of
the time series.
Hyndman, Koehler, Ord and Snyder: 9 November 2001
In Section 1 we present the main results of the paper. We use the classiﬁcation of exponential
smoothing methods from Hyndman et al. (2001) and show the relationship to three general classes
of state space models for exponential smoothing. We present formulas for the h-period-ahead
means (i.e., forecasts) and forecast variances for fifteen specific exponential smoothing models
that correspond to nine exponential smoothing methods, including the most widely used ones.
In Sections 2–4, we examine each of the three general classes of models more closely. We provide
general matrix formulas for the means and variances and then specialize these formulas to
nonmatrix expressions for specific exponential smoothing models. Proofs for these results are provided in
appendices. For the Class 3 models, the nonmatrix expression is an approximation. Thus, we
devote Section 5 to the accuracy of this approximation.
Finally, we provide an example in Section 6 that gives forecasts and prediction intervals for the
multiplicative Holt-Winters method. Using this example, we compare our exact forecast variances
with approximations and compare prediction intervals obtained by using our exact expression
with ones obtained by simulating complete prediction distributions.
1. The main results
We describe the exponential smoothing methods using a similar framework to that proposed in
Hyndman et al. (2001). Each method is denoted by two letters: the ﬁrst letter denotes the type
of trend (none, additive, or damped) and the second letter denotes the type of seasonality (none,
additive or multiplicative).
                     Seasonal component
Trend component      N (none)    A (additive)    M (multiplicative)
N (none)             NN          NA              NM
A (additive)         AN          AA              AM
D (damped)           DN          DA              DM
Cell NN describes the simple exponential smoothing method, and cell AN describes Holt's linear
method. The additive Holt-Winters' method is given by cell AA and the multiplicative
Holt-Winters' method is given by cell AM. The other cells correspond to less commonly used but
analogous methods.
Hyndman et al. (2001) proposed two state space models for each of these methods: one with
additive errors and one with multiplicative errors. To distinguish these models, we will add a
third letter (A or M) before the letters denoting the type of trend and seasonality. For example,
MAN refers to a model with multiplicative errors, additive trend and no seasonality.
We consider three classes of state space models. In all cases, we use the Single Source of Error
(SSOE) model as formulated by Snyder (1985) and used in later work (e.g., Ord et al., 1997;
Hyndman et al., 2001). The first class is the usual state space form: we specify linear relationships in
both the observation and state equations and assume constant error variances. The second class
retains the linear structure but introduces dynamic heteroscedasticity among the errors in a way
that is natural for state space processes. Finally, in the third class, we allow a special form of
nonlinearity in the observation equation (additive and multiplicative relationships among the state
variables) as well as dynamic heteroscedasticity. The second and third classes are not contained
within the ARIMA class, although the second class could be formulated as a kind of GARCH
model. The third class is not covered by either ARIMA or GARCH structures, but is important
as a stochastic description of nonlinear forecasting schemes such as the Holt-Winters multiplicative
method (cf. Makridakis et al., 1998, pp. 161–69).
Let $Y_1, \dots, Y_n$ denote the time series of interest. The three classes of models may be defined as:

Class 1:
$$Y_t = H x_{t-1} + \varepsilon_t$$
$$x_t = F x_{t-1} + G \varepsilon_t$$

Class 2:
$$Y_t = H x_{t-1}(1 + \varepsilon_t)$$
$$x_t = (F + G \varepsilon_t)\, x_{t-1}$$

Class 3:
$$Y_t = H_1 x_{t-1}\, H_2 z_{t-1}(1 + \varepsilon_t)$$
$$x_t = (F_1 + G_1 \varepsilon_t)\, x_{t-1}$$
$$z_t = (F_2 + G_2 \varepsilon_t)\, z_{t-1}$$

where $F$, $G$, $H$, $F_1$, $F_2$, $G_1$, $G_2$, $H_1$ and $H_2$ are all matrix coefficients, and $x_t$ and $z_t$ are unobserved
state vectors at time $t$. In each case, $\{\varepsilon_t\}$ is iid $N(0, \sigma^2)$. Let $p$ be the length of vector $x_t$ and $q$ be
the length of vector $z_t$. Then the orders of the above matrices are as follows.

Class 1    $F$ (p × p)      $G$ (p × 1)      $H$ (1 × p)
Class 2    $F$ (p × p)      $G$ (p × p)      $H$ (1 × p)
Class 3    $F_1$ (p × p)    $G_1$ (p × p)    $H_1$ (1 × p)
           $F_2$ (q × q)    $G_2$ (q × q)    $H_2$ (1 × q)

Fifteen of the 18 models described above fall within the three state space model classes above:
Class 1 ANN AAN ADN ANA AAA ADA
Class 2 MNN MAN MDN MNA MAA MDA
Class 3 MNM MAM MDM
The remaining three models (ANM, AAM and ADM) do not ﬁt within one of these three classes,
and will not be considered further in this paper. Hyndman et al. (2001) also consider six additional
models with multiplicative trend which fall outside the three state space model classes deﬁned
above. Note that the above 15 models include two models for simple exponential smoothing, two
models for Holt's method, two models for the additive Holt-Winters' method and one model for
the multiplicative Holt-Winters' method.
Equations for the 15 models above are given in Table 1 using the same notation as in Hyndman et
al. (2001). As in that paper, we use the Single Source of Error (SSOE) model in our developments.
That is, all the observation and state variables are driven by the single error sequence $\varepsilon_t$. For
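Class 1

ANN:  $Y_t = \ell_{t-1} + \varepsilon_t$
      $\ell_t = \ell_{t-1} + \alpha\varepsilon_t$

ANA:  $Y_t = \ell_{t-1} + s_{t-m} + \varepsilon_t$
      $\ell_t = \ell_{t-1} + \alpha\varepsilon_t$
      $s_t = s_{t-m} + \gamma\varepsilon_t$

AAN:  $Y_t = \ell_{t-1} + b_{t-1} + \varepsilon_t$
      $\ell_t = \ell_{t-1} + b_{t-1} + \alpha\varepsilon_t$
      $b_t = b_{t-1} + \alpha\beta\varepsilon_t$

AAA:  $Y_t = \ell_{t-1} + b_{t-1} + s_{t-m} + \varepsilon_t$
      $\ell_t = \ell_{t-1} + b_{t-1} + \alpha\varepsilon_t$
      $b_t = b_{t-1} + \alpha\beta\varepsilon_t$
      $s_t = s_{t-m} + \gamma\varepsilon_t$

ADN:  $Y_t = \ell_{t-1} + b_{t-1} + \varepsilon_t$
      $\ell_t = \ell_{t-1} + b_{t-1} + \alpha\varepsilon_t$
      $b_t = \phi b_{t-1} + \alpha\beta\varepsilon_t$

ADA:  $Y_t = \ell_{t-1} + b_{t-1} + s_{t-m} + \varepsilon_t$
      $\ell_t = \ell_{t-1} + b_{t-1} + \alpha\varepsilon_t$
      $b_t = \phi b_{t-1} + \alpha\beta\varepsilon_t$
      $s_t = s_{t-m} + \gamma\varepsilon_t$

Class 2

MNN:  $Y_t = \ell_{t-1}(1 + \varepsilon_t)$
      $\ell_t = \ell_{t-1}(1 + \alpha\varepsilon_t)$

MNA:  $Y_t = (\ell_{t-1} + s_{t-m})(1 + \varepsilon_t)$
      $\ell_t = \ell_{t-1} + \alpha(\ell_{t-1} + s_{t-m})\varepsilon_t$
      $s_t = s_{t-m} + \gamma(\ell_{t-1} + s_{t-m})\varepsilon_t$

MAN:  $Y_t = (\ell_{t-1} + b_{t-1})(1 + \varepsilon_t)$
      $\ell_t = (\ell_{t-1} + b_{t-1})(1 + \alpha\varepsilon_t)$
      $b_t = b_{t-1} + \alpha\beta(\ell_{t-1} + b_{t-1})\varepsilon_t$

MAA:  $Y_t = (\ell_{t-1} + b_{t-1} + s_{t-m})(1 + \varepsilon_t)$
      $\ell_t = \ell_{t-1} + b_{t-1} + \alpha(\ell_{t-1} + b_{t-1} + s_{t-m})\varepsilon_t$
      $b_t = b_{t-1} + \alpha\beta(\ell_{t-1} + b_{t-1} + s_{t-m})\varepsilon_t$
      $s_t = s_{t-m} + \gamma(\ell_{t-1} + b_{t-1} + s_{t-m})\varepsilon_t$

MDN:  $Y_t = (\ell_{t-1} + b_{t-1})(1 + \varepsilon_t)$
      $\ell_t = (\ell_{t-1} + b_{t-1})(1 + \alpha\varepsilon_t)$
      $b_t = \phi b_{t-1} + \alpha\beta(\ell_{t-1} + b_{t-1})\varepsilon_t$

MDA:  $Y_t = (\ell_{t-1} + b_{t-1} + s_{t-m})(1 + \varepsilon_t)$
      $\ell_t = \ell_{t-1} + b_{t-1} + \alpha(\ell_{t-1} + b_{t-1} + s_{t-m})\varepsilon_t$
      $b_t = \phi b_{t-1} + \alpha\beta(\ell_{t-1} + b_{t-1} + s_{t-m})\varepsilon_t$
      $s_t = s_{t-m} + \gamma(\ell_{t-1} + b_{t-1} + s_{t-m})\varepsilon_t$

Class 3

MNM:  $Y_t = \ell_{t-1} s_{t-m}(1 + \varepsilon_t)$
      $\ell_t = \ell_{t-1}(1 + \alpha\varepsilon_t)$
      $s_t = s_{t-m}(1 + \gamma\varepsilon_t)$

MAM:  $Y_t = (\ell_{t-1} + b_{t-1}) s_{t-m}(1 + \varepsilon_t)$
      $\ell_t = (\ell_{t-1} + b_{t-1})(1 + \alpha\varepsilon_t)$
      $b_t = b_{t-1} + \alpha\beta(\ell_{t-1} + b_{t-1})\varepsilon_t$
      $s_t = s_{t-m}(1 + \gamma\varepsilon_t)$

MDM:  $Y_t = (\ell_{t-1} + b_{t-1}) s_{t-m}(1 + \varepsilon_t)$
      $\ell_t = (\ell_{t-1} + b_{t-1})(1 + \alpha\varepsilon_t)$
      $b_t = \phi b_{t-1} + \alpha\beta(\ell_{t-1} + b_{t-1})\varepsilon_t$
      $s_t = s_{t-m}(1 + \gamma\varepsilon_t)$

Table 1: Equations defining each of the 15 models.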
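
To make the recursions in Table 1 concrete, the following minimal Python sketch advances the MAM model by one period. The function name `mam_step` and the convention that the seasonal array is stored newest first (so its last entry is $s_{t-m}$) are assumptions made for illustration; they are not part of the paper.

```python
import numpy as np

def mam_step(level, trend, seasonals, eps, alpha, beta, gamma):
    """Advance the MAM equations of Table 1 by one period.

    `seasonals` holds (s_{t-1}, ..., s_{t-m}), newest first (an assumed layout).
    Returns the observation Y_t and the updated states.
    """
    s_old = seasonals[-1]                          # s_{t-m}
    base = level + trend                           # l_{t-1} + b_{t-1}
    y = base * s_old * (1.0 + eps)                 # observation equation
    new_level = base * (1.0 + alpha * eps)         # level update
    new_trend = trend + alpha * beta * base * eps  # trend update
    new_s = s_old * (1.0 + gamma * eps)            # seasonal update
    new_seasonals = np.concatenate(([new_s], seasonals[:-1]))
    return y, new_level, new_trend, new_seasonals

# With a zero error the step reduces to the deterministic one-step forecast.
y, lev, tr, seas = mam_step(100.0, 2.0, np.array([0.80, 1.20, 0.90, 1.10]),
                            eps=0.0, alpha=0.2, beta=0.3, gamma=0.1)
```

The other fourteen models follow the same pattern, with only the observation and update lines changed according to Table 1.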
Class 1    Mean: $\mu_h$
           Variance: $v_1 = \sigma^2$ and $v_h = \sigma^2\Big[1 + \sum_{j=1}^{h-1} c_j^2\Big]$

Class 2    Mean: $\mu_h$
           Variance: $v_h = (1 + \sigma^2)\theta_h - \mu_h^2$,
           where $\theta_1 = \mu_1^2$ and $\theta_h = \mu_h^2 + \sigma^2 \sum_{j=1}^{h-1} c_j^2\, \theta_{h-j}$

Class 3    Mean: $\mu_h$
           Variance: $v_h = s_{n-m+h}^2\Big[\theta_h (1 + \sigma^2)(1 + \gamma^2\sigma^2)^k - \tilde{\mu}_h^2\Big]$,
           where $\theta_1 = \tilde{\mu}_1^2$, $\theta_h = \tilde{\mu}_h^2 + \sigma^2 \sum_{j=1}^{h-1} c_j^2\, \theta_{h-j}$ and $k = \lfloor (h-1)/m \rfloor$.

Table 2: h-period-ahead forecast means and variances. Here $\lfloor u \rfloor$ denotes the largest integer less than or
equal to $u$ and $m$ denotes the period of seasonality. For Class 3, the expression is exact when $h \le m$ but
only approximate for $h > m$.
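Class 1 / Class 2

ANN/MNN    $\mu_h = \ell_n$                                          $c_j = \alpha$
AAN/MAN    $\mu_h = \ell_n + h b_n$                                  $c_j = \alpha(1 + j\beta)$
ADN/MDN    $\mu_h = \ell_n + \phi_{h-1} b_n$                         $c_j = \alpha(1 + \phi_{j-1}\beta)$
ANA/MNA    $\mu_h = \ell_n + s_{n-m+1+(h-1)^*}$                      $c_j = \alpha + \gamma d_{j,m}$
AAA/MAA    $\mu_h = \ell_n + h b_n + s_{n-m+1+(h-1)^*}$              $c_j = \alpha(1 + j\beta) + \gamma d_{j,m}$
ADA/MDA    $\mu_h = \ell_n + \phi_{h-1} b_n + s_{n-m+1+(h-1)^*}$     $c_j = \alpha(1 + \phi_{j-1}\beta) + \gamma d_{j,m}$

Class 3

MNM    $\mu_h = \ell_n\, s_{n-m+1+(h-1)^*}$                          $\tilde{\mu}_h = \ell_n$                       $c_j = \alpha$
MAM    $\mu_h = (\ell_n + h b_n)\, s_{n-m+1+(h-1)^*}$                $\tilde{\mu}_h = \ell_n + h b_n$               $c_j = \alpha(1 + j\beta)$
MDM    $\mu_h = (\ell_n + \phi_{h-1} b_n)\, s_{n-m+1+(h-1)^*}$       $\tilde{\mu}_h = \ell_n + \phi_{h-1} b_n$      $c_j = \alpha(1 + \phi_{j-1}\beta)$

Table 3: Values of $\mu_h$, $\tilde{\mu}_h$ and $c_j$ for the 15 models. Here $\phi_j = 1 + \phi + \cdots + \phi^j = (1 - \phi^{j+1})/(1 - \phi)$,
$d_{j,m} = 1$ if $j = m \pmod{m}$ and 0 otherwise, and $j^* = j \pmod{m}$.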
development of this approach, see Snyder (1985) and Ord et al. (1997). The variables $\ell_t$, $b_t$ and
$s_t$ are elements of the state vector and denote the level, slope and seasonal components respectively;
the parameters $\alpha$, $\beta$ and $\gamma$ are the usual smoothing parameters corresponding to the level
equation, trend equation and seasonal equation; $\phi$ is a damping coefficient used for the damped
trend models; and $m$ denotes the number of seasons in a year.
We derive the forecast means and variances for each of the three model classes, and speciﬁcally
for each of the 15 models. The forecast mean for $Y_{n+h}$ made $h$ steps ahead from forecast origin $n$ is
denoted by $\mu_h = E(Y_{n+h} \mid x_n)$ and the corresponding forecast variance is given by
$v_h = \mathrm{Var}(Y_{n+h} \mid x_n)$.
The main results are summarized in Tables 2 and 3.
Criteria such as maximum likelihood for selection of optimal estimates for the parameters can be
found in Hyndman et al. (2001) and Ord et al. (1997). It is important to notice that the estimate of
$\sigma^2$ is not computed in the same manner for all three classes. The estimate for $\sigma^2$ would be
$$\hat{\sigma}^2 = \sum_{t=1}^{n} \hat{\varepsilon}_t^2 / n$$
where
$$\hat{\varepsilon}_t = \begin{cases} Y_t - \hat{Y}_{t-1}(1) & \text{for Class 1;} \\ (Y_t - \hat{Y}_{t-1}(1)) / \hat{Y}_{t-1}(1) & \text{for Classes 2 and 3;} \end{cases}$$
and
$$Y_t(1) = E(Y_{t+1} \mid x_t) = \begin{cases} H x_t & \text{for Classes 1 and 2;} \\ H_1 x_t\, H_2 z_t & \text{for Class 3.} \end{cases}$$
For the special cases in Table 2, $Y_n(1) = \mu_1$.
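
The difference between the two residual definitions can be sketched in Python for the simplest pair of models, ANN (Class 1) and MNN (Class 2). This is an illustrative sketch only: the function names, and the assumption that $\alpha$ and the initial level are already known rather than estimated, are not from the paper.

```python
def one_step_residuals(y, alpha, level0, multiplicative):
    """Residuals for ANN (additive) or MNN (multiplicative), given alpha and l_0."""
    level = level0
    residuals = []
    for obs in y:
        forecast = level                 # one-step forecast is the current level
        err = obs - forecast             # Y_t - Yhat_{t-1}(1)
        # Class 1 uses the raw error; Classes 2 and 3 use the relative error.
        residuals.append(err / forecast if multiplicative else err)
        # ANN: l_t = l_{t-1} + alpha*eps_t.  MNN: l_t = l_{t-1}(1 + alpha*eps_t).
        # Both reduce to l_{t-1} + alpha*err because eps_t = err / l_{t-1} for MNN.
        level = level + alpha * err
    return residuals

def sigma2_hat(residuals):
    """Estimate of sigma^2 as the mean of squared residuals."""
    return sum(e * e for e in residuals) / len(residuals)

additive = sigma2_hat(one_step_residuals([10.0, 12.0], 0.5, 10.0, multiplicative=False))
relative = sigma2_hat(one_step_residuals([10.0, 12.0], 0.5, 10.0, multiplicative=True))
```

For this toy series the additive estimate is on the scale of the data, whereas the multiplicative estimate is on a relative (squared percentage) scale, which is why the two cannot be interchanged.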
More detail concerning the results for each class is given in the following sections. Derivations
of these results are given in the Appendices.
2. Class 1
Derivations are given in Appendix A.
In this case, the general results for the mean and variance are
$$\mu_h = H F^{h-1} x_n, \tag{1}$$
$$v_1 = \sigma^2, \tag{2}$$
$$\text{and}\quad v_h = \sigma^2\Big[1 + \sum_{j=1}^{h-1} c_j^2\Big], \quad h \ge 2, \tag{3}$$
where $c_j = H F^{j-1} G$. Specific values for $\mu_h$ and $c_j$ for the particular models in Class 1 are given in
Tables 2 and 3.
Note that point forecasts from ANN are equivalent to simple exponential smoothing (SES) and
AAN gives forecasts equivalent to Holt’s method. SES with drift is obtained from AAN by setting
$\beta = 0$ so that $b_n = b$ for all $n$. The additive Holt-Winters' method is equivalent to the point
forecasts from AAA. Furthermore, ANN is equivalent to an ARIMA(0,1,1) model where $\theta = 1 - \alpha$,
AAN is equivalent to an ARIMA(0,2,2) model and AAA is equivalent to an ARIMA[0, (1, m), m + 1]
model where (1, m) denotes differences of orders 1 and m (McKenzie, 1976; Roberts, 1982).
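
As a short worked check of the first equivalence, the ANN equations from Table 1 can be differenced directly. The sketch below assumes the moving-average sign convention $(1 - B)Y_t = (1 - \theta B)\varepsilon_t$, which the text does not state explicitly.

```latex
\begin{align*}
Y_t - Y_{t-1}
  &= (\ell_{t-1} + \varepsilon_t) - (\ell_{t-2} + \varepsilon_{t-1}) \\
  &= (\ell_{t-1} - \ell_{t-2}) + \varepsilon_t - \varepsilon_{t-1} \\
  &= \alpha\varepsilon_{t-1} + \varepsilon_t - \varepsilon_{t-1}
     && \text{since } \ell_{t-1} = \ell_{t-2} + \alpha\varepsilon_{t-1} \\
  &= \varepsilon_t - (1 - \alpha)\,\varepsilon_{t-1},
\end{align*}
```

so the first difference is an MA(1) process with $\theta = 1 - \alpha$ under that convention.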
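The expressions for $v_h$ can be simplified as shown below.

ANN: $v_h = \sigma^2\big[1 + \alpha^2(h-1)\big]$

AAN: $v_h = \sigma^2\Big[1 + \alpha^2(h-1)\big\{1 + \beta h + \tfrac{1}{6}\beta^2 h(2h-1)\big\}\Big]$

ADN: $v_h = \sigma^2\Big[1 + \alpha^2(h-1) + \alpha^2\beta \sum_{j=1}^{h-1} \phi_{j-1}(2 + \phi_{j-1}\beta)\Big]$

ANA: $v_h = \sigma^2\Big[1 + \alpha^2(h-1) + \gamma(2\alpha + \gamma)\lfloor (h-1)/m \rfloor\Big]$

AAA: $v_h = \sigma^2\Big[1 + \alpha^2(h-1)\big\{1 + \beta h + \tfrac{1}{6}\beta^2 h(2h-1)\big\} + \gamma k\big\{\gamma + \alpha[2 + \beta m(k+1)]\big\}\Big]$
     where $k = \lfloor (h-1)/m \rfloor$

ADA: $v_h = \sigma^2\Big[1 + \sum_{j=1}^{h-1}\big\{\alpha^2(1 + \phi_{j-1}\beta)^2 + \gamma d_{j,m}[\gamma + 2\alpha(1 + \phi_{j-1}\beta)]\big\}\Big]$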
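
The agreement between the general matrix result (3) and a simplified expression can be checked numerically. The sketch below does this for AAN, using the state $x_t = (\ell_t, b_t)'$ with $H = [1\ 1]$, $F = [[1, 1], [0, 1]]$ and $G = (\alpha, \alpha\beta)'$, read off from Table 1; the function names and parameter values are illustrative assumptions.

```python
import numpy as np

def class1_variance(H, F, G, sigma2, h):
    """Equation (3): v_h = sigma^2 * (1 + sum_{j=1}^{h-1} c_j^2), c_j = H F^{j-1} G."""
    total = 1.0
    F_power = np.eye(F.shape[0])          # holds F^{j-1}, starting at j = 1
    for _ in range(1, h):
        c_j = (H @ F_power @ G).item()
        total += c_j ** 2
        F_power = F @ F_power
    return sigma2 * total

def aan_variance(alpha, beta, sigma2, h):
    """Simplified AAN expression listed above."""
    return sigma2 * (1 + alpha**2 * (h - 1)
                     * (1 + beta * h + beta**2 * h * (2 * h - 1) / 6))

alpha, beta, sigma2 = 0.2, 0.3, 1.5       # arbitrary illustrative values
H = np.array([[1.0, 1.0]])
F = np.array([[1.0, 1.0], [0.0, 1.0]])
G = np.array([[alpha], [alpha * beta]])
matrix_values = [class1_variance(H, F, G, sigma2, h) for h in range(1, 13)]
closed_values = [aan_variance(alpha, beta, sigma2, h) for h in range(1, 13)]
```

The two lists agree to floating-point accuracy, which follows from $c_j = \alpha(1 + j\beta)$ and the standard sums of $j$ and $j^2$.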
3. Class 2
Derivations are given in Appendix B.
In this case, the general result for the forecast mean is the same as for Class 1, namely
$$\mu_h = H F^{h-1} x_n. \tag{4}$$
The forecast variance is given by
$$v_h = H V_{h-1} H' (1 + \sigma^2) + \sigma^2 \mu_h^2 \tag{5}$$
where
$$V_h = F V_{h-1} F' + \sigma^2 G V_{h-1} G' + \sigma^2 P_{h-1}, \quad h = 1, 2, \dots, \tag{6}$$
$V_0 = O$, and $P_j = G F^j x_n x_n' (F^j)' G'$.
For the six models we consider in this class, we obtain the following simpler expression
$$v_h = (1 + \sigma^2)\theta_h - \mu_h^2$$
where $\theta_1 = \mu_1^2$ and
$$\theta_h = \mu_h^2 + \sigma^2 \sum_{j=1}^{h-1} c_j^2\, \theta_{h-j} \tag{7}$$
and $c_j$ depends on the particular model. Note that $c_j$ is identical to that for the corresponding
additive error model from Class 1. Specific values for $\mu_h$ and $c_j$ for the particular models in
Class 2 are given in Tables 2 and 3.
Note that point forecasts from MNN are equivalent to simple exponential smoothing (SES) but
that the variances are different from ANN. Similarly, MAN gives point forecasts equivalent to
Holt's method but with different variances from AAN, and MAA gives point forecasts equivalent
to the additive Holt-Winters' method but with different variances from AAA.
In the case of MNN, a nonrecursive expression for $v_h$ can be obtained:
$$v_h = \ell_n^2\Big[(1 + \alpha^2\sigma^2)^{h-1}(1 + \sigma^2) - 1\Big].$$
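
The recursion (7) is straightforward to program. The following sketch implements it for any Class 2 model, given functions for $\mu_h$ and $c_j$ from Table 3, and compares the result with the nonrecursive MNN expression. The function names and the numerical values are illustrative assumptions.

```python
def class2_variance(mu, c, sigma2, h):
    """v_h = (1 + sigma^2) * theta_h - mu_h^2, with theta_h from recursion (7).

    `mu` and `c` are callables returning mu_i and c_j for integer arguments.
    """
    theta = {}
    for i in range(1, h + 1):
        # For i = 1 the sum is empty, so theta_1 = mu_1^2.
        theta[i] = mu(i) ** 2 + sigma2 * sum(c(j) ** 2 * theta[i - j]
                                             for j in range(1, i))
    return (1 + sigma2) * theta[h] - mu(h) ** 2

def mnn_variance(level, alpha, sigma2, h):
    """Nonrecursive MNN expression given above."""
    return level ** 2 * ((1 + alpha**2 * sigma2) ** (h - 1) * (1 + sigma2) - 1)

level, alpha, sigma2 = 100.0, 0.2, 0.05 ** 2   # arbitrary illustrative values
recursive = [class2_variance(lambda i: level, lambda j: alpha, sigma2, h)
             for h in range(1, 13)]
closed = [mnn_variance(level, alpha, sigma2, h) for h in range(1, 13)]
```

Each $\theta_h$ needs all earlier values, so the cost grows quadratically in $h$, which is negligible for typical forecast horizons.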
4. Class 3
Derivations are given in Appendix C.
For models in this class,
$$\mu_h = H_1 M_{h-1} H_2' \tag{8}$$
and
$$v_h = (1 + \sigma^2)(H_2 \otimes H_1) V_{h-1} (H_2 \otimes H_1)' + \sigma^2 \mu_h^2 \tag{9}$$
where $\otimes$ denotes a Kronecker product, $M_0 = x_n z_n'$, $V_0 = O_{2m}$, and for $h \ge 1$,
$$M_h = F_1 M_{h-1} F_2' + G_1 M_{h-1} G_2' \sigma^2 \tag{10}$$
and
$$\begin{aligned}
V_h ={}& (F_2 \otimes F_1) V_{h-1} (F_2 \otimes F_1)' \\
&+ \sigma^2 \Big[(F_2 \otimes F_1) V_{h-1} (G_2 \otimes G_1)' + (G_2 \otimes G_1) V_{h-1} (F_2 \otimes F_1)'\Big] \\
&+ \sigma^2 (G_2 \otimes F_1 + F_2 \otimes G_1)\Big[V_{h-1} + \mathrm{vec}M_{h-1}(\mathrm{vec}M_{h-1})'\Big](G_2 \otimes F_1 + F_2 \otimes G_1)' \\
&+ \sigma^4 (G_2 \otimes G_1)\Big[3V_{h-1} + 2\,\mathrm{vec}M_{h-1}(\mathrm{vec}M_{h-1})'\Big](G_2 \otimes G_1)'.
\end{aligned} \tag{11}$$
Note, in particular, that $\mu_1 = (H_1 x_n)(H_2 z_n)'$ and $v_1 = \sigma^2 \mu_1^2$.
Because $\sigma^2$ is usually small (much less than 1), approximate expressions for the mean and variance
can be obtained:
$$\mu_h = \mu_{1,h-1}\, \mu_{2,h-1} + O(\sigma^2)$$
$$v_h \approx (1 + \sigma^2)(v_{1,h-1} + \mu_{1,h-1}^2)(v_{2,h-1} + \mu_{2,h-1}^2) - \mu_{1,h-1}^2\, \mu_{2,h-1}^2$$
where $\mu_{1,h} = H_1 F_1^h x_n$, $\mu_{2,h} = H_2 F_2^h z_n$, $v_{1,h} = \mathrm{Var}(H_1 x_{n+h} \mid x_n)$ and $v_{2,h} = \mathrm{Var}(H_2 z_{n+h} \mid z_n)$.
In the three special cases we consider, these expressions can be written as
$$\mu_h = \tilde{\mu}_h\, s_{n-m+1+(h-1)^*} + O(\sigma^2) \tag{12}$$
$$\text{and}\quad v_h \approx s_{n-m+1+(h-1)^*}^2 \Big[\theta_h (1 + \sigma^2)(1 + \gamma^2\sigma^2)^k - \tilde{\mu}_h^2\Big] \tag{13}$$
where $k = \lfloor (h-1)/m \rfloor$, $\theta_1 = \tilde{\mu}_1^2$, and
$$\theta_h = \tilde{\mu}_h^2 + \sigma^2 \sum_{j=1}^{h-1} c_j^2\, \theta_{h-j}, \quad h \ge 2.$$
These expressions are exact for $h \le m$. Specific values for $\mu_h$, $\tilde{\mu}_h$ and $c_j$ for the particular models
in Class 3 are given in Tables 2 and 3.
Note that the usual point forecasts for these models are given by (12) rather than (8). Also, the
point forecasts from MAM are equivalent to the multiplicative Holt-Winters method.
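For the MNM model, a simpler expression for $v_h$ is available. Since $\tilde{\mu}_h = \ell_n$ and $c_j = \alpha$ give
$\theta_h = \ell_n^2 (1 + \alpha^2\sigma^2)^{h-1}$, expression (13) becomes
$$v_h \approx s_{n-m+1+(h-1)^*}^2 \Big[\ell_n^2 (1 + \alpha^2\sigma^2)^{h-1}(1 + \sigma^2)(1 + \gamma^2\sigma^2)^k - \ell_n^2\Big].$$
(The expression is exact for $h \le m$.)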
5. The accuracy of the approximations
In order to investigate the accuracy of the approximations for the mean (12) and standard deviation
(13) to the exact expressions in (8) and (9), we provide some comparisons for the MAM
model in Class 3.
These comparisons are done for quarterly data where the values for the components are assumed
to be the following: $\ell_n = 100$, $b_n = 2$, $s_n = 0.80$, $s_{n-1} = 1.20$, $s_{n-2} = 0.90$, $s_{n-3} = 1.10$. We use
the following base level values for the parameters: $\alpha = 0.2$, $\beta = 0.3$ (i.e., $\alpha\beta = 0.06$), $\gamma = 0.1$, and
$\sigma = 0.05$. We vary these parameters one at a time as shown in Table 4.
The results in Table 4 show that the mean and approximate mean are always very close and that
the percentage difference in the standard deviations only becomes substantial when we increase
$\gamma$. This result for the standard deviation is not surprising because the approximation is exact if
$\gamma = 0$. In fact, we recommend that the approximation not be used if the seasonal smoothing
parameter $\gamma$ exceeds 0.10.
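
The approximate columns of Table 4 can be reproduced from (12) and (13) with a few lines of Python. The sketch below is illustrative: the function name `mam_approx` and the convention that the seasonal states are passed as $(s_n, s_{n-1}, \dots, s_{n-m+1})$ are assumptions. With the base parameter values it returns an approximate mean of 121.00 and an approximate standard deviation of about 7.33 at $h = 5$, matching the first row of the table.

```python
from math import sqrt

def mam_approx(level, trend, seasonals, alpha, beta, gamma, sigma, h):
    """Approximate mean (12) and standard deviation (13) for the MAM model.

    `seasonals` is (s_n, s_{n-1}, ..., s_{n-m+1}), an assumed ordering.
    """
    m = len(seasonals)
    sigma2 = sigma ** 2
    mu_tilde = lambda i: level + i * trend        # Table 3, MAM row
    c = lambda j: alpha * (1 + j * beta)          # Table 3, MAM row
    theta = {}
    for i in range(1, h + 1):
        theta[i] = mu_tilde(i) ** 2 + sigma2 * sum(c(j) ** 2 * theta[i - j]
                                                   for j in range(1, i))
    k = (h - 1) // m
    # s_{n-m+1+(h-1)*} sits m-1-((h-1) mod m) places behind s_n.
    s = seasonals[m - 1 - (h - 1) % m]
    mean = mu_tilde(h) * s
    var = s ** 2 * (theta[h] * (1 + sigma2) * (1 + gamma**2 * sigma2) ** k
                    - mu_tilde(h) ** 2)
    return mean, sqrt(var)

states = (0.80, 1.20, 0.90, 1.10)                 # s_n, s_{n-1}, s_{n-2}, s_{n-3}
mean5, sd5 = mam_approx(100, 2, states, 0.2, 0.3, 0.1, 0.05, h=5)
mean8, sd8 = mam_approx(100, 2, states, 0.2, 0.3, 0.1, 0.05, h=8)
```

Changing `gamma` to 0.3 affects the result only through the factor $(1 + \gamma^2\sigma^2)^k$, which is why the approximation (13) responds so weakly to $\gamma$ compared with the exact variance.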
6. Example
As a numerical example, we consider the quarterly sales data given in Makridakis, Wheelwright
and Hyndman (1998, p. 162) and use the multiplicative Holt-Winters' method (model MAM).
Following the approach outlined in Hyndman et al. (2001), we estimate the parameters to be
$\alpha = 0.8$, $\beta = 0.1$, $\gamma = 0.1$ and $\sigma = 0.0384$ with the final states $\ell_n = 757.2$, $b_n = 17.6$,
$z_n = (0.873, 1.146, 1.031, 0.958)'$.
Figure 1 shows the forecast standard deviations calculated exactly using (9) and approximately
using (13). We also show the approximation suggested by Koehler, Snyder and Ord (2001) for
$1 \le h \le m$. Clearly, both approximations are very close to the exact values in this case (because
$\sigma^2$ is so small here).
Columns: period ahead $h$; mean $\mu_h$ from (8); approximate mean from (12); SD $\sqrt{v_h}$ from (9); approximate SD from (13); SD percent difference.
σ = 0.05, α = 0.2, αβ = 0.06, γ = 0.1
5 121.01 121.00 7.53 7.33 2.69
6 100.81 100.80 6.68 6.52 2.37
7 136.81 136.80 9.70 9.50 2.07
8 92.81 92.80 7.06 6.93 1.80
9 129.83 129.80 10.85 10.45 3.68
10 108.03 108.00 9.65 9.34 3.21
11 146.44 146.40 13.99 13.60 2.81
12 99.22 99.20 10.13 9.88 2.47
σ = 0.1, α = 0.2, αβ = 0.06, γ = 0.1
5 121.05 121.00 15.09 14.68 2.73
6 100.84 100.80 13.39 13.07 2.40
7 136.86 136.80 19.45 19.04 2.11
8 92.84 92.80 14.15 13.89 1.84
9 129.93 129.80 21.77 20.96 3.75
10 108.11 108.00 19.39 18.75 3.29
11 146.55 146.40 28.11 27.30 2.89
12 99.30 99.20 20.35 19.83 2.55
σ = 0.05, α = 0.6, αβ = 0.06, γ = 0.1
5 121.02 121.00 10.87 10.60 2.47
6 100.82 100.80 9.96 9.76 2.04
7 136.83 136.80 14.76 14.51 1.72
8 92.82 92.80 10.86 10.70 1.47
9 129.86 129.80 16.64 16.19 2.71
10 108.05 108.00 14.83 14.48 2.37
11 146.46 146.40 21.45 21.00 2.09
12 99.24 99.20 15.45 15.16 1.86
σ = 0.05, α = 0.2, αβ = 0.18, γ = 0.1
5 121.03 121.00 10.19 9.87 3.08
6 100.82 100.80 9.88 9.66 2.27
7 136.83 136.80 15.55 15.29 1.69
8 92.82 92.80 12.14 11.98 1.28
9 129.87 129.80 19.67 19.16 2.56
10 108.06 108.00 18.41 18.04 2.03
11 146.48 146.40 27.86 27.41 1.64
12 99.26 99.20 20.93 20.65 1.35
σ = 0.05, α = 0.2, αβ = 0.06, γ = 0.3
5 121.04 121.00 8.10 7.53 7.12
6 100.83 100.80 7.13 6.68 6.36
7 136.84 136.80 10.28 9.70 5.64
8 92.83 92.80 7.42 7.05 4.97
9 129.90 129.80 11.89 10.77 9.46
10 108.08 108.00 10.47 9.59 8.42
11 146.51 146.40 15.04 13.91 7.49
12 99.27 99.20 10.79 10.07 6.67
Table 4: Comparison of exact and approximate means and standard deviations for MAM model in Class 3
(i.e., (8) and (9) versus (12) and (13)).
[Figure 1: plot of forecast standard deviation against forecast horizon (1 to 12), with three series: Exact, Small sigma approximation, and KSO approximation.]
Figure 1: Forecast standard deviations calculated (a) exactly using (9); (b) approximately using (13); and
(c) using the approximation suggested by Koehler, Snyder and Ord (2001) for $1 \le h \le m$.
[Figure 2: plot of the series against quarter, with two sets of intervals: Percentile-based interval and Variance-based interval.]
Figure 2: Quarterly sales data with three years of forecasts. The solid lines show prediction intervals calculated
as $\mu_h \pm 1.96\sqrt{v_h}$ and the dotted lines show prediction intervals computed by generating 20,000 future
sample paths from the fitted model and finding the 2.5% and 97.5% quantiles at each forecast horizon.
The data with three years of forecasts are shown in Figure 2. In this case, the conditional mean
forecasts obtained from model MAM are virtually indistinguishable from the usual forecasts because
$\sigma$ is so small (they are identical up to $h = m$). The solid lines show prediction intervals
calculated as $\mu_h \pm 1.96\sqrt{v_h}$ and the dotted lines show prediction intervals computed by generating
20,000 future sample paths from the fitted model and finding the 2.5% and 97.5% quantiles at
each forecast horizon. Clearly, the variance-based intervals are a good approximation despite the
non-normality of the forecast distributions.
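
The percentile-based intervals can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes NumPy's default random generator, a fixed seed, and that the seasonal state vector $z_n$ is ordered as $(s_n, s_{n-1}, s_{n-2}, s_{n-3})$, which the text does not state explicitly.

```python
import numpy as np

def simulate_mam(level, trend, seasonals, alpha, beta, gamma, sigma,
                 horizon, n_paths, seed=0):
    """Simulate future sample paths from the MAM equations of Table 1.

    `seasonals` is assumed ordered (s_n, ..., s_{n-m+1}); returns an array of
    shape (n_paths, horizon).
    """
    rng = np.random.default_rng(seed)
    lev = np.full(n_paths, float(level))
    tr = np.full(n_paths, float(trend))
    seas = np.tile(np.asarray(seasonals, dtype=float), (n_paths, 1))
    paths = np.empty((n_paths, horizon))
    for h in range(horizon):
        eps = rng.normal(0.0, sigma, size=n_paths)
        s_old = seas[:, -1].copy()                 # s_{t-m} for every path
        base = lev + tr
        paths[:, h] = base * s_old * (1 + eps)
        lev = base * (1 + alpha * eps)
        tr = tr + alpha * beta * base * eps
        # New seasonal index goes to the front; the oldest one drops off.
        seas = np.column_stack((s_old * (1 + gamma * eps), seas[:, :-1]))
    return paths

paths = simulate_mam(757.2, 17.6, (0.873, 1.146, 1.031, 0.958),
                     alpha=0.8, beta=0.1, gamma=0.1, sigma=0.0384,
                     horizon=12, n_paths=20000)
lower, upper = np.quantile(paths, [0.025, 0.975], axis=0)
```

The arrays `lower` and `upper` can then be compared with $\mu_h \pm 1.96\sqrt{v_h}$ at each horizon; at $h = 1$ the simulated standard deviation should be close to $\sigma\mu_1$.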
7. Summary
For three general classes of state space models, we have provided derivations of exact matrix
expressions for the means and variances of prediction distributions. These general results are
presented separately in a section for each class with the derivations put in three separate appendices.
We relate these three classes of state space models to the commonly used exponential
smoothing methods (simple, Holt, and additive and multiplicative Holt-Winters) and to other
known exponential smoothing methods (Hyndman et al., 2001). We provide a summary of these
models and the corresponding nonmatrix expressions of the means and variances in Tables 1, 2
and 3. These means and variances may be used to construct analytical prediction intervals when
using the exponential smoothing methods for forecasting.
The nonmatrix formulas for the Class 3 models are not exact for h > m. In Table 4 we compare
our exact matrix formulas with our approximate formulas for the model that corresponds to the
multiplicative Holt-Winters method (MAM). We find that the approximation is very good as long
as the smoothing parameter for the seasonal component remains small (i.e., no greater than 0.1). We
also consider an example in which we compare our forecast standard deviations and prediction
intervals with the values from some of the previously used approaches.
In summary, we have provided, for the first time, exact analytical formulas for the variances of
prediction distributions for all the exponential smoothing methods. More generally, we have exact
formulas for variances of the general state space models of which the exponential smoothing
models are special cases. Where possible, we have presented both matrix and nonmatrix expressions.
Simulation methods have been the only comprehensive approach to handling the prediction distribution
problem for all exponential smoothing methods to date. Our formulas provide an effective
alternative, the advantage being that they involve much lower computational loads.
Appendix A: Proofs of results for Class 1
Let
$$m_h = E(x_{n+h} \mid x_n) \quad\text{and}\quad V_h = \mathrm{Var}(x_{n+h} \mid x_n).$$
Note that $m_0 = x_n$ and $V_0 = O$.
For Class 1
$$m_h = F m_{h-1} = F^2 m_{h-2} = \cdots = F^h m_0 = F^h x_n$$
and therefore
$$\mu_h = H m_{h-1} = H F^{h-1} x_n.$$
The state forecast variance is given by
$$V_h = F V_{h-1} F' + G G' \sigma^2$$
and therefore
$$V_h = \sigma^2 \sum_{j=0}^{h-1} F^j G G' (F^j)'.$$
Hence, the prediction variance for $h$ periods ahead is
$$v_h = H V_{h-1} H' + \sigma^2 = \begin{cases} \sigma^2 & \text{if } h = 1; \\ \sigma^2\Big[1 + \sum_{j=1}^{h-1} c_j^2\Big] & \text{if } h \ge 2; \end{cases}$$
where $c_j = H F^{j-1} G$.
We now consider the particular cases.
ADA
We first derive the results for the ADA case. Here the state is $x_n = (\ell_n, b_n, s_n, s_{n-1}, \dots, s_{n-m+1})'$,
$$H = [\,1 \;\; 1 \;\; 0'_{m-1} \;\; 1\,], \quad
F = \begin{bmatrix} 1 & 1 & 0'_{m-1} & 0 \\ 0 & \phi & 0'_{m-1} & 0 \\ 0 & 0 & 0'_{m-1} & 1 \\ 0_{m-1} & 0_{m-1} & I_{m-1} & 0_{m-1} \end{bmatrix}
\quad\text{and}\quad
G = \begin{bmatrix} \alpha \\ \alpha\beta \\ \gamma \\ 0_{m-1} \end{bmatrix}$$
where $I_n$ denotes the $n \times n$ identity matrix and $0_n$ denotes a zero vector of length $n$.
Therefore $H F^i = [1, \phi_i, d_{i+1,m}, d_{i+2,m}, \dots, d_{i+m,m}]$ where $\phi_i = 1 + \phi + \cdots + \phi^i$ and $d_{j,m} = 1$ if $j = m \pmod{m}$
and $d_{j,m} = 0$ otherwise. Hence we find $c_j = H F^{j-1} G = \alpha(1 + \phi_{j-1}\beta) + \gamma d_{j,m}$,
$$\mu_h = \ell_n + \phi_{h-1} b_n + s_{n-m+1+(h-1)^*}$$
and for $h \ge 2$,
$$\begin{aligned}
v_h &= \sigma^2 \Big\{1 + \sum_{j=1}^{h-1} \big[\alpha(1 + \phi_{j-1}\beta) + \gamma d_{j,m}\big]^2\Big\} \\
    &= \sigma^2 \Big[1 + \sum_{j=1}^{h-1} \big\{\alpha^2(1 + \phi_{j-1}\beta)^2 + \gamma d_{j,m}[\gamma + 2\alpha(1 + \phi_{j-1}\beta)]\big\}\Big].
\end{aligned}$$
These formulas agree with those of Yar and Chatfield (1990) except that we apply the damping
parameter $\phi$ beginning in the second forecast time period, $n + 2$, instead of in the first forecast time
period, $n + 1$.
Other cases
All other cases of Class 1 can be derived as special cases of ADA.
• For ADN, we use the results of ADA with $\gamma = 0$ and $s_t = 0$ for all $t$.
• For AAN, we use the results of ADN with $\phi = 1$.
• The results for ANN are obtained from AAN by further setting $\beta = 0$ and $b_t = 0$ for all $t$.
• For AAA, the results of ADA hold with $\phi = 1$.
• The results for ANA are obtained as a special case of AAA with $\beta = 0$ and $b_t = 0$ for all $t$.
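
The ADA matrices above, and the claim that $c_j = H F^{j-1} G = \alpha(1 + \phi_{j-1}\beta) + \gamma d_{j,m}$, can be checked numerically. The following Python sketch builds $H$, $F$ and $G$ for a given seasonal period; the function names and parameter values are illustrative assumptions.

```python
import numpy as np

def ada_matrices(alpha, beta, gamma, phi, m):
    """Build H, F, G for ADA with state (l, b, s_n, s_{n-1}, ..., s_{n-m+1})'."""
    p = m + 2
    H = np.zeros((1, p))
    H[0, 0] = H[0, 1] = H[0, p - 1] = 1.0
    F = np.zeros((p, p))
    F[0, 0] = F[0, 1] = 1.0            # level row: l + b
    F[1, 1] = phi                      # damped trend row
    F[2, p - 1] = 1.0                  # new seasonal copies s_{t-m}
    F[3:, 2:p - 1] = np.eye(m - 1)     # remaining seasonals shift down one place
    G = np.zeros((p, 1))
    G[0, 0], G[1, 0], G[2, 0] = alpha, alpha * beta, gamma
    return H, F, G

def c_formula(j, alpha, beta, gamma, phi, m):
    """Closed form alpha*(1 + phi_{j-1}*beta) + gamma*d_{j,m}."""
    phi_sum = sum(phi ** i for i in range(j))      # phi_{j-1} = 1 + ... + phi^{j-1}
    d = 1.0 if j % m == 0 else 0.0
    return alpha * (1 + phi_sum * beta) + gamma * d

alpha, beta, gamma, phi, m = 0.2, 0.3, 0.1, 0.9, 4
H, F, G = ada_matrices(alpha, beta, gamma, phi, m)
c_matrix = [(H @ np.linalg.matrix_power(F, j - 1) @ G).item() for j in range(1, 13)]
c_closed = [c_formula(j, alpha, beta, gamma, phi, m) for j in range(1, 13)]
```

Setting `phi = 1` or `gamma = 0` in the same code gives the AAA and ADN special cases listed above.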
Appendix B: Proofs of results for Class 2
Let $m_h$ and $V_h$ be defined as in Appendix A. The forecast means for Class 2 have the same form
as for Class 1, namely
$$\mu_h = H m_{h-1} = H F^{h-1} x_n.$$
To obtain $V_h$, first note that $V_h = F V_{h-1} F' + G\, \mathrm{Var}(x_{n+h-1}\varepsilon_{n+h})\, G'$ and that
$$\mathrm{Var}(x_{n+h-1}\varepsilon_{n+h}) = E[x_{n+h-1} x_{n+h-1}']\, E(\varepsilon_{n+h}^2) - 0 = \sigma^2 [V_{h-1} + m_{h-1} m_{h-1}'].$$
Therefore
$$V_h = F V_{h-1} F' + \sigma^2 G V_{h-1} G' + \sigma^2 P_{h-1}$$
where $P_j = G F^j x_n x_n' (F^j)' G'$.
The forecast variance is given by
$$\begin{aligned}
v_h &= H V_{h-1} H' (1 + \sigma^2) + \sigma^2 H m_{h-1} m_{h-1}' H' \\
    &= H V_{h-1} H' (1 + \sigma^2) + \sigma^2 \mu_h^2.
\end{aligned}$$
In the special case where $G = QH$ we obtain a simpler result. In this case, $x_t = F x_{t-1} + Q e_t$
where $e_t = Y_t - H x_{t-1} = H x_{t-1} \varepsilon_t$. Thus, we obtain the linear exponential smoothing updating
rule $x_t = F x_{t-1} + Q(Y_t - H x_{t-1})$. Define $\theta_h$ such that $\mathrm{Var}(e_{n+h} \mid x_n) = \theta_h \sigma^2$. Then it is readily seen
that $V_h = F V_{h-1} F' + Q Q'\, \mathrm{Var}(e_{n+h} \mid x_n)$ and so, by repeated substitution,
$$V_h = \sigma^2 \sum_{j=0}^{h-1} F^j Q Q' (F^j)'\, \theta_{h-j}$$
and
$$H V_{h-1} H' = \sigma^2 \sum_{j=1}^{h-1} c_j^2\, \theta_{h-j} \tag{14}$$
where $c_j = H F^{j-1} Q$. Now
$$e_{n+h} = \big(H(x_{n+h-1} - m_{h-1}) + H m_{h-1}\big)\, \varepsilon_{n+h}$$
which we square and take expectations to give $\theta_h = H V_{h-1} H' + \mu_h^2$. Substituting (14) into this
expression for $\theta_h$ gives
$$\theta_h = \sigma^2 \sum_{j=1}^{h-1} c_j^2\, \theta_{h-j} + \mu_h^2 \tag{15}$$
where $\theta_1 = \mu_1^2$. The forecast variance is then given by
$$v_h = (1 + \sigma^2)\theta_h - \mu_h^2. \tag{16}$$
We now consider the particular cases.
MDA
We first derive the results for the MDA case. Here the state is $x_t = (\ell_t, b_t, s_t, s_{t-1}, \dots, s_{t-m+1})'$, $H = [1, 1, 0, \dots, 0, 1]$,
$$F = \begin{bmatrix} 1 & 1 & 0'_{m-1} & 0 \\ 0 & \phi & 0'_{m-1} & 0 \\ 0 & 0 & 0'_{m-1} & 1 \\ 0_{m-1} & 0_{m-1} & I_{m-1} & 0_{m-1} \end{bmatrix} \quad\text{and}\quad G = \begin{bmatrix} \alpha & \alpha & 0'_{m-1} & \alpha \\ \alpha\beta & \alpha\beta & 0'_{m-1} & \alpha\beta \\ \gamma & \gamma & 0'_{m-1} & \gamma \\ 0_{m-1} & 0_{m-1} & O_{m-1} & 0_{m-1} \end{bmatrix}.$$
Then from (4) we obtain $\mu_h = \ell_n + \phi_{h-1} b_n + s_{n-m+1+(h-1)^*}$ where $\phi_i = 1 + \phi + \cdots + \phi^i$ and $j^* = j \bmod m$.
To obtain the expression for $v_h$, note that this model satisfies the special case $G = QH$ where $Q = [\alpha, \alpha\beta, \gamma, 0'_{m-1}]'$. Thus we can use the expression (16) where $c_j = HF^{j-1}Q = \alpha(1 + \phi_{j-1}\beta) + \gamma d_{j,m}$ (the same as $c_j$ for the corresponding model from Class 1).
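The MDA quantities $\mu_h$ and $c_j$ can be sketched in a few lines of Python. The function names are hypothetical, the seasonal states are assumed to be passed oldest first as $[s_{n-m+1}, \dots, s_n]$, and $d_{j,m}$ is taken to be 1 when $j$ is a multiple of $m$ and 0 otherwise, which is what the product $HF^{j-1}Q$ gives for the matrices above.

```python
def phi_sum(phi, i):
    """phi_i = 1 + phi + ... + phi^i."""
    return sum(phi ** k for k in range(i + 1))

def mda_mu(level, trend, seasonals, phi, h):
    """mu_h = l_n + phi_{h-1} b_n + s_{n-m+1+(h-1)*}.

    seasonals = [s_{n-m+1}, ..., s_n], oldest first (an assumed layout).
    """
    m = len(seasonals)
    return level + phi_sum(phi, h - 1) * trend + seasonals[(h - 1) % m]

def mda_c(alpha, beta, gamma, phi, m, j):
    """c_j = alpha (1 + phi_{j-1} beta) + gamma d_{j,m}."""
    d = 1 if j % m == 0 else 0      # assumed definition of d_{j,m}
    return alpha * (1 + phi_sum(phi, j - 1) * beta) + gamma * d
```

Setting `gamma = 0` with zero seasonal states, or `phi = 1`, reproduces the parameter restrictions used for the other Class 2 models.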
Other cases
All other cases of Class 2 can be derived as special cases of MDA.
• For MDN, we use the results of MDA with $\gamma = 0$ and $s_t = 0$ for all $t$.
• For MAN, we use the results of MDN with $\phi = 1$.
• For MAA, the results of MDA hold with $\phi = 1$.
• The results for MNA are obtained as a special case of MAA with $\beta = 0$ and $b_t = 0$ for all $t$.
• The results for MNN are obtained from MAN by further setting $\beta = 0$ and $b_t = 0$ for all $t$. In this case, a simpler expression for $v_h$ can be obtained. Note that $c_j = \alpha$, $\theta_1 = \ell_n^2$ and for $j \ge 2$,
$$\theta_j = \ell_n^2 + \sigma^2\alpha^2 \sum_{i=1}^{j-1} \theta_{j-i} = \ell_n^2 + \alpha^2\sigma^2(\theta_1 + \theta_2 + \cdots + \theta_{j-1}).$$
Hence
$$\theta_j = \ell_n^2 (1 + \alpha^2\sigma^2)^{j-1}$$
and
$$v_h = \ell_n^2 \bigl[(1 + \alpha^2\sigma^2)^{h-1}\bigr](1 + \sigma^2) - \ell_n^2 = \ell_n^2 \bigl[(1 + \alpha^2\sigma^2)^{h-1}(1 + \sigma^2) - 1\bigr].$$
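The closed form for MNN can be checked numerically against the recursion for $\theta_j$. The sketch below does this in plain Python; both function names are hypothetical.

```python
def mnn_variance_closed(level, alpha, sigma2, h):
    """v_h = l_n^2 [(1 + alpha^2 sigma^2)^(h-1) (1 + sigma^2) - 1]."""
    return level ** 2 * ((1 + alpha ** 2 * sigma2) ** (h - 1) * (1 + sigma2) - 1)

def mnn_variance_recursive(level, alpha, sigma2, h):
    """Same quantity via theta_j = l_n^2 + alpha^2 sigma^2 (theta_1 + ... + theta_{j-1})."""
    theta = [level ** 2]                         # theta_1 = l_n^2
    for _ in range(2, h + 1):
        theta.append(level ** 2 + alpha ** 2 * sigma2 * sum(theta))
    return (1 + sigma2) * theta[h - 1] - level ** 2   # equation (16) with mu_h = l_n
```

The two functions should agree up to rounding for any horizon, which is a convenient test when implementing the other Class 2 models through recursion (15).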
Appendix C: Proofs of results for Class 3
Note that we can write $Y_t$ as
$$Y_t = H_1 x_{t-1} z_{t-1}' H_2' (1 + \varepsilon_t).$$
So let $W_h = x_{n+h} z_{n+h}'$, $M_h = E(W_h \mid x_n, z_n)$ and $V_h = \mathrm{Var}(W_h \mid x_n, z_n)$ where (by standard definitions)
$$V_h = \mathrm{Var}(\mathrm{vec}\,W_h \mid x_n, z_n), \quad\text{and}\quad \mathrm{vec}\,A = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_r \end{bmatrix} \quad\text{where matrix } A = [a_1\ a_2\ \cdots\ a_r].$$
Note that
$$\begin{aligned} W_h &= (F_1 x_{n+h-1} + G_1 x_{n+h-1}\varepsilon_{n+h})(z_{n+h-1}' F_2' + z_{n+h-1}' G_2' \varepsilon_{n+h}) \\ &= F_1 W_{h-1} F_2' + (F_1 W_{h-1} G_2' + G_1 W_{h-1} F_2')\varepsilon_{n+h} + G_1 W_{h-1} G_2' \varepsilon_{n+h}^2. \end{aligned}$$
It follows that $M_0 = x_n z_n'$ and
$$M_h = F_1 M_{h-1} F_2' + G_1 M_{h-1} G_2' \sigma^2. \tag{17}$$
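For the variance of $W_h$, we find $V_0 = 0$, and
$$\begin{aligned} V_h &= \mathrm{Var}\{\mathrm{vec}(F_1 W_{h-1} F_2') + \mathrm{vec}(F_1 W_{h-1} G_2' + G_1 W_{h-1} F_2')\varepsilon_{n+h} + \mathrm{vec}(G_1 W_{h-1} G_2')\varepsilon_{n+h}^2\} \\ &= (F_2 \otimes F_1) V_{h-1} (F_2 \otimes F_1)' \\ &\quad + (G_2 \otimes F_1 + F_2 \otimes G_1)\,\mathrm{Var}(\mathrm{vec}\,W_{h-1}\varepsilon_{n+h})\,(G_2 \otimes F_1 + F_2 \otimes G_1)' \\ &\quad + (G_2 \otimes G_1)\,\mathrm{Var}(\mathrm{vec}\,W_{h-1}\varepsilon_{n+h}^2)\,(G_2 \otimes G_1)' \\ &\quad + (F_2 \otimes F_1)\,\mathrm{Cov}(\mathrm{vec}\,W_{h-1}, \mathrm{vec}\,W_{h-1}\varepsilon_{n+h}^2)\,(G_2 \otimes G_1)' \\ &\quad + (G_2 \otimes G_1)\,\mathrm{Cov}(\mathrm{vec}\,W_{h-1}\varepsilon_{n+h}^2, \mathrm{vec}\,W_{h-1})\,(F_2 \otimes F_1)'. \end{aligned}$$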
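Next we find that
$$\mathrm{Var}(\mathrm{vec}\,W_{h-1}\varepsilon_{n+h}) = E[\mathrm{vec}\,W_{h-1}(\mathrm{vec}\,W_{h-1})'\varepsilon_{n+h}^2] = \sigma^2\bigl(V_{h-1} + \mathrm{vec}\,M_{h-1}(\mathrm{vec}\,M_{h-1})'\bigr),$$
$$\begin{aligned} \mathrm{Var}(\mathrm{vec}\,W_{h-1}\varepsilon_{n+h}^2) &= E\bigl(\mathrm{vec}\,W_{h-1}(\mathrm{vec}\,W_{h-1})'\varepsilon_{n+h}^4\bigr) - E(\mathrm{vec}\,W_{h-1})E(\mathrm{vec}\,W_{h-1})'\sigma^4 \\ &= 3\sigma^4\bigl(V_{h-1} + \mathrm{vec}\,M_{h-1}(\mathrm{vec}\,M_{h-1})'\bigr) - \mathrm{vec}\,M_{h-1}(\mathrm{vec}\,M_{h-1})'\sigma^4 \\ &= \sigma^4\bigl(3V_{h-1} + 2\,\mathrm{vec}\,M_{h-1}(\mathrm{vec}\,M_{h-1})'\bigr), \end{aligned}$$
and
$$\begin{aligned} \mathrm{Cov}(\mathrm{vec}\,W_{h-1}, \mathrm{vec}\,W_{h-1}\varepsilon_{n+h}^2) &= E\bigl(\mathrm{vec}\,W_{h-1}(\mathrm{vec}\,W_{h-1})'\varepsilon_{n+h}^2\bigr) - E(\mathrm{vec}\,W_{h-1})E(\mathrm{vec}\,W_{h-1})'\sigma^2 \\ &= \sigma^2\bigl(V_{h-1} + \mathrm{vec}\,M_{h-1}(\mathrm{vec}\,M_{h-1})'\bigr) - \sigma^2\,\mathrm{vec}\,M_{h-1}(\mathrm{vec}\,M_{h-1})' \\ &= \sigma^2 V_{h-1}. \end{aligned}$$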
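The factor 3 in the second expression comes from the fourth moment of the error. As a brief check, assume (as that factor implies) that $\varepsilon_{n+h}$ is Gaussian with mean 0 and variance $\sigma^2$ and is independent of $W_{h-1}$. The shorthand $w = \mathrm{vec}\,W_{h-1}$ and $\bar{w} = \mathrm{vec}\,M_{h-1}$ is introduced here for illustration only.

```latex
\begin{align*}
E(\varepsilon_{n+h}^2) &= \sigma^2, \qquad E(\varepsilon_{n+h}^4) = 3\sigma^4, \\
E(ww') &= V_{h-1} + \bar{w}\bar{w}', \\
\mathrm{Var}(w\,\varepsilon_{n+h}^2)
  &= E(ww')\,E(\varepsilon_{n+h}^4) - \bar{w}\bar{w}'\,[E(\varepsilon_{n+h}^2)]^2 \\
  &= 3\sigma^4 (V_{h-1} + \bar{w}\bar{w}') - \sigma^4\,\bar{w}\bar{w}'
   = \sigma^4 (3V_{h-1} + 2\bar{w}\bar{w}').
\end{align*}
```

For a non-Gaussian error the constant 3 would be replaced by the corresponding standardized fourth moment.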
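It follows that
$$\begin{aligned} V_h &= (F_2 \otimes F_1) V_{h-1} (F_2 \otimes F_1)' + \sigma^2\bigl[(F_2 \otimes F_1) V_{h-1} (G_2 \otimes G_1)' + (G_2 \otimes G_1) V_{h-1} (F_2 \otimes F_1)'\bigr] \\ &\quad + \sigma^2 (G_2 \otimes F_1 + F_2 \otimes G_1)\bigl[V_{h-1} + \mathrm{vec}\,M_{h-1}(\mathrm{vec}\,M_{h-1})'\bigr](G_2 \otimes F_1 + F_2 \otimes G_1)' \\ &\quad + \sigma^4 (G_2 \otimes G_1)\bigl[3V_{h-1} + 2\,\mathrm{vec}\,M_{h-1}(\mathrm{vec}\,M_{h-1})'\bigr](G_2 \otimes G_1)'. \end{aligned}$$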
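The forecast mean and variance are given by
$$\mu_h = E(Y_{n+h} \mid x_n, z_n) = H_1 M_{h-1} H_2'$$
and
$$\begin{aligned} v_h &= \mathrm{Var}(Y_{n+h} \mid x_n, z_n) = \mathrm{Var}[\mathrm{vec}(H_1 W_{h-1} H_2' + H_1 W_{h-1} H_2' \varepsilon_{n+h})] \\ &= \mathrm{Var}[(H_2 \otimes H_1)\,\mathrm{vec}\,W_{h-1} + (H_2 \otimes H_1)\,\mathrm{vec}\,W_{h-1}\varepsilon_{n+h}] \\ &= (H_2 \otimes H_1)\bigl[V_{h-1}(1 + \sigma^2) + \sigma^2\,\mathrm{vec}\,M_{h-1}(\mathrm{vec}\,M_{h-1})'\bigr](H_2' \otimes H_1') \\ &= (1 + \sigma^2)(H_2 \otimes H_1) V_{h-1} (H_2 \otimes H_1)' + \sigma^2 \mu_h^2. \end{aligned}$$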
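The Class 3 recursions for $M_h$, $V_h$, $\mu_h$ and $v_h$ can be sketched in plain Python with small list-of-lists helpers. Everything named here (`class3_forecast`, `kron`, `vec`, `quad` and the argument layout) is hypothetical, and the sketch relies on the Gaussian fourth moment behind the factor 3 in the variance recursion. It is meant as a way to check the formulas on small examples, not as an efficient implementation.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def add(*Ms):
    """Elementwise sum of any number of equally sized matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*Ms)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

def kron(A, B):
    """Kronecker product A (x) B."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def vec(M):
    """Stack the columns of M into a single column."""
    return [[M[i][j]] for j in range(len(M[0])) for i in range(len(M))]

def quad(L, X, R):
    """L X R'."""
    return matmul(matmul(L, X), transpose(R))

def class3_forecast(H1, H2, F1, F2, G1, G2, x_n, z_n, sigma2, horizon):
    """Exact forecast means and variances for h = 1..horizon (Class 3)."""
    M = matmul(x_n, transpose(z_n))             # M_0 = x_n z_n'
    A = kron(F2, F1)
    B = add(kron(G2, F1), kron(F2, G1))
    C = kron(G2, G1)
    K = kron(H2, H1)
    V = [[0.0] * len(A) for _ in range(len(A))] # V_0 = 0
    mus, vs = [], []
    for _ in range(horizon):
        mu = quad(H1, M, H2)[0][0]              # mu_h = H1 M_{h-1} H2'
        mus.append(mu)
        vs.append((1 + sigma2) * quad(K, V, K)[0][0] + sigma2 * mu ** 2)
        m = vec(M)
        mm = matmul(m, transpose(m))            # vec M (vec M)'
        V = add(
            quad(A, V, A),
            scale(sigma2, add(quad(A, V, C), quad(C, V, A))),
            scale(sigma2, quad(B, add(V, mm), B)),
            scale(sigma2 ** 2, quad(C, add(scale(3, V), scale(2, mm)), C)),
        )
        M = add(quad(F1, M, F2), scale(sigma2, quad(G1, M, G2)))   # equation (17)
    return mus, vs
```

As a small check, take a level-only state $x_n = 10$ with $F_1 = H_1 = [1]$, $G_1 = [0.5]$, and two seasonal states $z_n = (1.2, 0.8)'$ with $F_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$, $G_2 = \begin{bmatrix} 0 & 0.3 \\ 0 & 0 \end{bmatrix}$, $H_2 = [0, 1]$ and $\sigma^2 = 0.01$. The first two means are 8 and 12, and the first two variances are approximately 0.64 and 1.8036.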
When $\sigma$ is small (much less than 1), it is possible to obtain some simpler but approximate expressions. The second term in (17) can be dropped to give $M_{h-1} = F_1^{h-1} M_0 (F_2^{h-1})'$ and so
$$\mu_h \approx H_1 F_1^{h-1} x_n (H_2 F_2^{h-1} z_n)'.$$
The order of this approximation can be obtained by noting that the observation equation may be written as $Y_t = U_{1,t} U_{2,t} U_{3,t}$ where $U_{1,t} = H_1 x_{t-1}$, $U_{2,t} = H_2 z_{t-1}$ and $U_{3,t} = 1 + \varepsilon_t$. Then
$$E(Y_t) = E(U_{1,t} U_{2,t} U_{3,t}) = E(U_{1,t} U_{2,t})\,E(U_{3,t})$$
since $U_{3,t}$ is independent of $U_{1,t}$ and $U_{2,t}$. Since $E(U_{1,t} U_{2,t}) = E(U_{1,t})E(U_{2,t}) + \mathrm{Cov}(U_{1,t}, U_{2,t})$, we have the approximation:
$$\mu_h = E(Y_{n+h} \mid x_n, z_n) = E(U_{1,n+h} \mid x_n)\,E(U_{2,n+h} \mid z_n)\,E(U_{3,n+h}) + O(\sigma^2).$$
When $U_{2,n+h}$ is constant the result is exact. Now let
$$\begin{aligned} \mu_{1,h} &= E(U_{1,n+h+1} \mid x_n) = E(H_1 x_{n+h} \mid x_n) = H_1 F_1^h x_n \\ \mu_{2,h} &= E(U_{2,n+h+1} \mid z_n) = E(H_2 z_{n+h} \mid z_n) = H_2 F_2^h z_n \\ v_{1,h} &= \mathrm{Var}(U_{1,n+h+1} \mid x_n) = \mathrm{Var}(H_1 x_{n+h} \mid x_n) \\ v_{2,h} &= \mathrm{Var}(U_{2,n+h+1} \mid z_n) = \mathrm{Var}(H_2 z_{n+h} \mid z_n) \\ \text{and}\quad v_{12,h} &= \mathrm{Cov}(U_{1,n+h+1}^2, U_{2,n+h+1}^2 \mid x_n, z_n) = \mathrm{Cov}([H_1 x_{n+h}]^2, [H_2 z_{n+h}]^2 \mid x_n, z_n). \end{aligned}$$
Then
$$\mu_h = \mu_{1,h-1}\,\mu_{2,h-1} + O(\sigma^2) = H_1 F_1^{h-1} x_n\, H_2 F_2^{h-1} z_n + O(\sigma^2).$$
By the same arguments, we have
$$E(Y_t^2) = E(U_{1,t}^2 U_{2,t}^2 U_{3,t}^2) = E(U_{1,t}^2 U_{2,t}^2)\,E(U_{3,t}^2)$$
and
$$\begin{aligned} E(Y_{n+h}^2 \mid z_n, x_n) &= E(U_{1,n+h}^2 U_{2,n+h}^2 \mid x_n, z_n)\,E(U_{3,n+h}^2) \\ &= \bigl[\mathrm{Cov}(U_{1,n+h}^2, U_{2,n+h}^2 \mid x_n, z_n) + E(U_{1,n+h}^2 \mid x_n)\,E(U_{2,n+h}^2 \mid z_n)\bigr]E(U_{3,n+h}^2) \\ &= (1 + \sigma^2)\bigl[v_{12,h-1} + (v_{1,h-1} + \mu_{1,h-1}^2)(v_{2,h-1} + \mu_{2,h-1}^2)\bigr]. \end{aligned}$$
Assuming the covariance $v_{12,h-1}$ is small compared to the other terms we obtain
$$v_h \approx (1 + \sigma^2)(v_{1,h-1} + \mu_{1,h-1}^2)(v_{2,h-1} + \mu_{2,h-1}^2) - \mu_{1,h-1}^2\,\mu_{2,h-1}^2.$$
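The small-$\sigma$ approximation combines the first two moments of the two factors as if they were uncorrelated. A minimal sketch of that final step follows; the function name and argument order are assumptions, and the inputs are the lag-$(h-1)$ quantities $\mu_{1,h-1}$, $v_{1,h-1}$, $\mu_{2,h-1}$ and $v_{2,h-1}$.

```python
def class3_approx_variance(mu1, v1, mu2, v2, sigma2):
    """v_h ~ (1 + sigma^2)(v1 + mu1^2)(v2 + mu2^2) - mu1^2 mu2^2.

    Drops the covariance term v_{12,h-1}, as in the text.
    """
    return (1 + sigma2) * (v1 + mu1 ** 2) * (v2 + mu2 ** 2) - mu1 ** 2 * mu2 ** 2
```

When both component variances are zero the expression reduces to $\sigma^2\mu_{1,h-1}^2\mu_{2,h-1}^2$, which is the exact one-step variance.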
We now consider the particular cases.
MDM
We first derive the results for the MDM case where $x_t = (\ell_t, b_t)'$ and $z_t = (s_t, \dots, s_{t-m+1})'$, and the matrix coefficients are $H_1 = [1, 1]$, $H_2 = [0, \dots, 0, 1]$,
$$F_1 = \begin{bmatrix} 1 & 1 \\ 0 & \phi \end{bmatrix}, \quad F_2 = \begin{bmatrix} 0'_{m-1} & 1 \\ I_{m-1} & 0_{m-1} \end{bmatrix}, \quad G_1 = \begin{bmatrix} \alpha & \alpha \\ \alpha\beta & \alpha\beta \end{bmatrix}, \quad\text{and}\quad G_2 = \begin{bmatrix} 0'_{m-1} & \gamma \\ O_{m-1} & 0_{m-1} \end{bmatrix}.$$
Many terms will be zero in the formulas for the expected value and the variance because of the following relationships: $G_2^2 = O_m$, $H_2 G_2 = 0'_m$, and $(H_2 \otimes H_1)(G_2 \otimes X) = 0'_{2m}$ where $X$ is any $2 \times 2$ matrix. For the terms that remain, $H_2 \otimes H_1$ and its transpose will only use the terms from the last two rows of the last two columns of the large matrices because $H_2 \otimes H_1 = [0'_{2m-2}, 1, 1]$.
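These identities are easy to confirm numerically for a given seasonal period. The sketch below builds $H_2$, $F_2$ and $G_2$ for the MDM model and checks each relationship; the helper names are hypothetical and the check assumes $m \ge 2$.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def kron(A, B):
    """Kronecker product A (x) B."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def is_zero(A):
    return all(a == 0 for row in A for a in row)

def mdm_seasonal_matrices(m, gamma):
    """Return H2, F2, G2 for seasonal period m (m >= 2)."""
    H2 = [[0.0] * (m - 1) + [1.0]]
    F2 = [[0.0] * (m - 1) + [1.0]] + [
        [1.0 if j == i else 0.0 for j in range(m)] for i in range(m - 1)
    ]
    G2 = [[0.0] * (m - 1) + [gamma]] + [[0.0] * m for _ in range(m - 1)]
    return H2, F2, G2

def check_mdm_identities(m, gamma):
    """True if all four relationships stated in the text hold."""
    H1 = [[1.0, 1.0]]
    X = [[1.0, 2.0], [3.0, 4.0]]          # an arbitrary 2 x 2 matrix
    H2, F2, G2 = mdm_seasonal_matrices(m, gamma)
    K = kron(H2, H1)
    return (
        is_zero(matmul(G2, G2))                       # G2^2 = O_m
        and is_zero(matmul(H2, G2))                   # H2 G2 = 0'_m
        and is_zero(matmul(K, kron(G2, X)))           # (H2 (x) H1)(G2 (x) X) = 0'_{2m}
        and K == [[0.0] * (2 * m - 2) + [1.0, 1.0]]   # H2 (x) H1 = [0'_{2m-2}, 1, 1]
    )
```

For example, `check_mdm_identities(4, 0.3)` evaluates to `True`.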
Using the small $\sigma$ approximations and exploiting the structure of the MDM model, we can obtain simpler expressions that approximate $\mu_h$ and $v_h$.
Note that $H_2 F_2^j G_2 = \gamma d_{j+1,m} H_2$. So for $h < m$, we have
$$H_2 z_{n+h} \mid z_n = H_2 \prod_{j=1}^{h} (F_2 + G_2 \varepsilon_{n+h-j+1})\, z_n = H_2 F_2^h z_n = s_{n-m+h+1}.$$
Furthermore,
$$\mu_{2,h} = s_{n-m+1+h^*} \quad\text{and}\quad v_{2,h} = s_{n-m+1+h^*}^2 \bigl[(1 + \gamma^2\sigma^2)^k - 1\bigr]$$
where $k = \lfloor (h-1)/m \rfloor$.
Also note that $x_n$ has the same properties as for MDN in Class 2. Thus
$$\mu_{1,h} = \ell_n + \phi_{h-1} b_n \quad\text{and}\quad v_{1,h} = (1 + \sigma^2)\theta_h - \mu_{1,h}^2.$$
Combining all the terms, we arrive at the approximations
$$\mu_h = \tilde{\mu}_h\, s_{n-m+1+(h-1)^*} + O(\sigma^2)$$
and
$$v_h \approx s_{n-m+1+(h-1)^*}^2 \bigl[\theta_h (1 + \sigma^2)(1 + \gamma^2\sigma^2)^k - \tilde{\mu}_h^2\bigr]$$
where $\tilde{\mu}_h = \ell_n + \phi_{h-1} b_n$, $\theta_1 = \tilde{\mu}_1^2$, and
$$\theta_h = \tilde{\mu}_h^2 + \sigma^2\alpha^2 \sum_{j=1}^{h-1} (1 + \phi_{j-1}\beta)^2\, \theta_{h-j}, \quad h \ge 2.$$
These expressions are exact for $h \le m$. Also for $h \le m$, the formulas agree with those in Koehler, Ord, and Snyder (2001) and Chatfield and Yar (1991), if the $O(\sigma^4)$ terms are dropped from the expression.
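The MDM approximations can be collected into one short routine. This is a sketch under stated assumptions: the function name `mdm_forecast` is hypothetical, the seasonal states are passed oldest first as $[s_{n-m+1}, \dots, s_n]$, and $k = \lfloor (h-1)/m \rfloor$ is used exactly as written above.

```python
def phi_sum(phi, i):
    """phi_i = 1 + phi + ... + phi^i."""
    return sum(phi ** k for k in range(i + 1))

def mdm_forecast(level, trend, seasonals, alpha, beta, gamma, phi, sigma2, horizon):
    """Approximate (mu_h, v_h) pairs for h = 1..horizon under MDM.

    seasonals = [s_{n-m+1}, ..., s_n], oldest first (an assumed layout).
    """
    m = len(seasonals)
    # mu~_h = l_n + phi_{h-1} b_n
    mu_t = [level + phi_sum(phi, h - 1) * trend for h in range(1, horizon + 1)]
    theta = []                                   # theta[i] stores theta_{i+1}
    for h in range(1, horizon + 1):
        s = sum((1 + phi_sum(phi, j - 1) * beta) ** 2 * theta[h - j - 1]
                for j in range(1, h))
        theta.append(mu_t[h - 1] ** 2 + sigma2 * alpha ** 2 * s)
    out = []
    for h in range(1, horizon + 1):
        k = (h - 1) // m
        seas = seasonals[(h - 1) % m]            # s_{n-m+1+(h-1)*}
        mu = mu_t[h - 1] * seas
        v = seas ** 2 * (theta[h - 1] * (1 + sigma2) * (1 + gamma ** 2 * sigma2) ** k
                         - mu_t[h - 1] ** 2)
        out.append((mu, v))
    return out
```

Setting `phi = 1` gives the MAM case, and additionally `beta = 0` with `trend = 0` gives MNM. For instance, `mdm_forecast(10.0, 0.0, [0.8, 1.2], 0.5, 0.0, 0.3, 1.0, 0.01, 2)` returns means 8 and 12 with variances of approximately 0.64 and 1.8036.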
Other cases
The other cases of Class 3 can be derived as special cases of MDM.
• For MAM, we use the results of MDM with φ = 1.
• The results for MNM are obtained as a special case of MAM with $\beta = 0$ and $b_t = 0$ for all $t$. The simpler expression for $v_h$ is obtained as for MNN in Class 2.
References
CHATFIELD, C. and M. YAR (1991) Prediction intervals for multiplicative Holt-Winters, Int. J. Forecasting, 7, 31–37.
HYNDMAN, R.J., A.B. KOEHLER, R.D. SNYDER and S. GROSE (2001) A state space framework for automatic forecasting using exponential smoothing methods, International J. Forecasting, to appear.
JOHNSTON, F.R. and P.J. HARRISON (1986) The variance of lead-time demand, J. Opl. Res. Soc., 37, 303–308.
KOEHLER, A.B., R.D. SNYDER and J.K. ORD (2001) Forecasting models and prediction intervals for the multiplicative Holt-Winters method, International J. Forecasting, 17, 269–286.
MAKRIDAKIS, S., S.C. WHEELWRIGHT and R.J. HYNDMAN (1998) Forecasting: methods and applications, 3rd edition, John Wiley & Sons: New York.
MCKENZIE, E. (1976) A comparison of some standard seasonal forecasting systems, The Statistician, 25, 3–14.
ORD, J.K., A.B. KOEHLER and R.D. SNYDER (1997) Estimation and prediction for a class of dynamic nonlinear statistical models, J. Amer. Statist. Assoc., 92, 1621–1629.
ROBERTS, S.A. (1982) A general class of Holt-Winters type forecasting models, Management Science, 28, 808–820.
YAR, M. and C. CHATFIELD (1990) Prediction intervals for the Holt-Winters forecasting procedure, Int. J. Forecasting, 6, 127–137.