ISSN 1440-771X
ISBN 0 7326 1091 5
Prediction Intervals for Exponential Smoothing State Space Models
Rob J Hyndman, Anne B Koehler, J. Keith Ord and Ralph D Snyder
Working Paper 11/2001
2001
DEPARTMENT OF ECONOMETRICS
AND BUSINESS STATISTICS
AUSTRALIA
Prediction intervals for exponential smoothing state space models
Rob J Hyndman¹, Anne B. Koehler², J. Keith Ord³, Ralph D. Snyder¹
9 November 2001
Abstract: The main objective of this paper is to provide analytical expressions for forecast
variances that can be used in prediction intervals for the exponential smoothing methods. These
expressions are based on state space models with a single source of error that underlie the
exponential smoothing methods. Three general classes of the state space models are presented.
The first class is the standard linear state space model with homoscedastic errors, the second
retains the linear structure but incorporates a dynamic form of heteroscedasticity, and the third
allows for non-linear structure in the observation equation as well as heteroscedasticity. Exact
matrix formulas for the forecast variances are found for each of these three classes of models.
These formulas are specialized to non-matrix formulas for fifteen state space models that underlie
nine exponential smoothing methods, including all the widely used methods. In cases where an
ARIMA model also underlies an exponential smoothing method, there is an equivalent state
space model with the same variance expression. We also discuss relationships between these
new ideas and previous suggestions for finding forecast variances and prediction intervals for the
exponential smoothing methods.
Keywords: forecast distribution, forecast interval, forecast variance, Holt-Winters method,
structural models.
JEL classification: C22, C53.
¹ Department of Econometrics and Business Statistics, Monash University, VIC 3800, Australia.
² Department of Decision Sciences and Management Information Systems, Miami University, Oxford, OH 45056, USA.
³ 320 Old North, Georgetown University, Washington, DC 20057, USA.
Corresponding author: Rob Hyndman (Rob.Hyndman@buseco.monash.edu.au).
Exponential smoothing methods were given a firm statistical foundation by the use of state space
models with a single source of error (Ord, Koehler, and Snyder, 1997). One of the important
contributions following from that work is the ability to provide a sound statistical basis for finding
prediction intervals for all the exponential smoothing methods. Traditionally, prediction intervals
for the exponential smoothing methods have been found through heuristic approaches or by
employing equivalent or approximate ARIMA models.
The major goal of this paper is to provide analytical expressions for the variances of the forecast
errors to be used in computing prediction intervals for many types of exponential smoothing,
including all of the widely used methods. In contrast, Ord, Koehler, and Snyder (1997) found
prediction intervals by using the model to simulate the entire prediction distributions for each future
time period. While simulating prediction intervals may be an excellent method for producing
them, many forecasters may prefer analytical formulas for their forecasting software. Hyndman
et al. (2001) describe a framework of 24 models for exponential smoothing, including all of the
usual methods as well as some extensions. The procedures in that paper also use simulation
to produce prediction intervals for the models. We will provide analytical expressions for the
forecast variances for some of those 24 models.
Where an equivalent ARIMA model exists (such as for simple exponential smoothing, Holt's linear
method, and the additive Holt-Winters method), our results provide identical forecast variances
to those from the ARIMA model. However, we also provide forecast variances for many
exponential smoothing methods where there is no equivalent ARIMA model.
State space models with multiple sources of error have also been used to find forecast variances
for the simple and Holt exponential smoothing methods (Johnston and Harrison, 1986). In these
cases the variances are limiting values in models where the convergence is rapid. The variance
formulas in these two cases are the same as in our results.
Prediction intervals for the additive Holt-Winters method and the multiplicative Holt-Winters
method have previously been considered by Chatfield and Yar. For the additive Holt-Winters
method they found an exact formula for the forecast variance that can be computed directly from
the form of the smoothing method (Yar and Chatfield, 1990). For the multiplicative Holt-Winters
method, they provided an approximate formula (Chatfield and Yar, 1991). In both papers they
assumed that the one-period ahead forecast errors are independent but they did not assume any
particular underlying model for the smoothing methods.
Using a single source of error state space model, Koehler, Ord, and Snyder (2001) derived an
approximate formula for the forecast variance for the multiplicative Holt-Winters method. Their
formula differs from that of Chatfield and Yar (1991) only in how the standard deviation of the
one-step-ahead forecast error is estimated. The variance formulas were given only for the first
year of forecasts in both of these papers (Chatfield and Yar, 1991; Koehler, Ord, and Snyder, 2001).
The results in this current paper include finding both an exact formula (ignoring the estimation
error for the smoothing parameters) for the forecast variance in all future time periods for the
multiplicative Holt-Winters method and a better approximation to this exact formula. Another
point of difference in our work is that Yar and Chatfield (1990) assumed that the variance of the
one-period-ahead forecast error is constant for the additive Holt-Winters method. We include a
class of models where this forecast variance is not constant but instead changes with the mean of
the time series.
Hyndman, Koehler, Ord and Snyder: 9 November 2001
In Section 1 we present the main results of the paper. We use the classification of exponential
smoothing methods from Hyndman et al. (2001) and show the relationship to three general classes
of state space models for exponential smoothing. We present formulas for the h-period-ahead
means (i.e., forecasts) and forecast variances for fifteen specific exponential smoothing models
that correspond to nine exponential smoothing methods, including the most widely used ones.
In Sections 2–4, we examine each of the three general classes of models more closely. We provide
general matrix formulas for the means and variances and then specialize these formulas to
non-matrix expressions for specific exponential models. Proofs for these results are provided in
appendices. For the Class 3 models, the non-matrix expression is an approximation. Thus, we
devote Section 5 to the accuracy of this approximation.
Finally, we provide an example in Section 6 that gives forecasts and prediction intervals for the
multiplicative Holt-Winters method. Using this example, we compare our exact forecast variances
with approximations and compare prediction intervals obtained by using our exact expression
with ones obtained by simulating complete prediction distributions.
1. The main results
We describe the exponential smoothing methods using a similar framework to that proposed in
Hyndman et al. (2001). Each method is denoted by two letters: the first letter denotes the type
of trend (none, additive, or damped) and the second letter denotes the type of seasonality (none,
additive or multiplicative).
                     Seasonal component
Trend component      N (none)    A (additive)    M (multiplicative)
N (none)             NN          NA              NM
A (additive)         AN          AA              AM
D (damped)           DN          DA              DM
Cell NN describes the simple exponential smoothing method, and cell AN describes Holt's linear
method. The additive Holt-Winters' method is given by cell AA and the multiplicative
Holt-Winters' method is given by cell AM. The other cells correspond to less commonly used but
analogous methods.
Hyndman et al. (2001) proposed two state space models for each of these methods: one with
additive errors and one with multiplicative errors. To distinguish these models, we will add a
third letter (A or M) before the letters denoting the type of trend and seasonality. For example,
MAN refers to a model with multiplicative errors, additive trend and no seasonality.
We consider three classes of state space models. In all cases, we use the Single Source of Error
(SSOE) model as formulated by Snyder (1985) and used in later work (e.g., Ord et al., 1997;
Hyndman et al., 2001). The first class is the usual state space form: we specify linear relationships in
both the observation and state equations and assume constant error variances. The second class
retains the linear structure but introduces dynamic heteroscedasticity among the errors in a way
that is natural for state space processes. Finally, in the third class, we allow a special form of
non-linearity in the observation equation (additive and multiplicative relationships among the state
variables) as well as dynamic heteroscedasticity. The second and third classes are not contained
within the ARIMA class, although the second class could be formulated as a kind of GARCH
model. The third class is not covered by either ARIMA or GARCH structures, but is important
as a stochastic description of non-linear forecasting schemes such as the multiplicative Holt-Winters
method (cf. Makridakis et al., 1998, pp. 161–69).
Let $Y_1, \dots, Y_n$ denote the time series of interest. The three classes of models may be defined as:

Class 1:
$$Y_t = Hx_{t-1} + \varepsilon_t, \qquad x_t = Fx_{t-1} + G\varepsilon_t$$

Class 2:
$$Y_t = Hx_{t-1}(1 + \varepsilon_t), \qquad x_t = (F + G\varepsilon_t)x_{t-1}$$

Class 3:
$$Y_t = H_1x_{t-1}\,H_2z_{t-1}(1 + \varepsilon_t), \qquad x_t = (F_1 + G_1\varepsilon_t)x_{t-1}, \qquad z_t = (F_2 + G_2\varepsilon_t)z_{t-1}$$

where $F$, $G$, $H$, $F_1$, $F_2$, $G_1$, $G_2$, $H_1$ and $H_2$ are all matrix coefficients, and $x_t$ and $z_t$ are unobserved
state vectors at time $t$. In each case, $\{\varepsilon_t\}$ is iid $N(0, \sigma^2)$. Let $p$ be the length of vector $x_t$ and $q$ be
the length of vector $z_t$. Then the orders of the above matrices are as follows.

Class 1    $F$ (p × p)      $G$ (p × 1)      $H$ (1 × p)
Class 2    $F$ (p × p)      $G$ (p × p)      $H$ (1 × p)
Class 3    $F_1$ (p × p)    $G_1$ (p × p)    $H_1$ (1 × p)
           $F_2$ (q × q)    $G_2$ (q × q)    $H_2$ (1 × q)

Fifteen of the 18 models described above fall within the three state space model classes above:
Class 1 ANN AAN ADN ANA AAA ADA
Class 2 MNN MAN MDN MNA MAA MDA
Class 3 MNM MAM MDM
The remaining three models (ANM, AAM and ADM) do not fit within one of these three classes,
and will not be considered further in this paper. Hyndman et al. (2001) also consider six additional
models with multiplicative trend which fall outside the three state space model classes defined
above. Note that the above 15 models include two models for simple exponential smoothing, two
models for Holt’s method, two models for the additive Holt-Winters’ method and one model for
the multiplicative Holt-Winters’ method.
Equations for the 15 models above are given in Table 1 using the same notation as in Hyndman et
al. (2001). As in that paper, we use the Single Source of Error (SSOE) model in our developments.
That is, all the observation and state variables are driven by the single error sequence $\varepsilon_t$. For
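Class 1

ANN: $Y_t = \ell_{t-1} + \varepsilon_t$
     $\ell_t = \ell_{t-1} + \alpha\varepsilon_t$

ANA: $Y_t = \ell_{t-1} + s_{t-m} + \varepsilon_t$
     $\ell_t = \ell_{t-1} + \alpha\varepsilon_t$
     $s_t = s_{t-m} + \gamma\varepsilon_t$

AAN: $Y_t = \ell_{t-1} + b_{t-1} + \varepsilon_t$
     $\ell_t = \ell_{t-1} + b_{t-1} + \alpha\varepsilon_t$
     $b_t = b_{t-1} + \alpha\beta\varepsilon_t$

AAA: $Y_t = \ell_{t-1} + b_{t-1} + s_{t-m} + \varepsilon_t$
     $\ell_t = \ell_{t-1} + b_{t-1} + \alpha\varepsilon_t$
     $b_t = b_{t-1} + \alpha\beta\varepsilon_t$
     $s_t = s_{t-m} + \gamma\varepsilon_t$

ADN: $Y_t = \ell_{t-1} + b_{t-1} + \varepsilon_t$
     $\ell_t = \ell_{t-1} + b_{t-1} + \alpha\varepsilon_t$
     $b_t = \phi b_{t-1} + \alpha\beta\varepsilon_t$

ADA: $Y_t = \ell_{t-1} + b_{t-1} + s_{t-m} + \varepsilon_t$
     $\ell_t = \ell_{t-1} + b_{t-1} + \alpha\varepsilon_t$
     $b_t = \phi b_{t-1} + \alpha\beta\varepsilon_t$
     $s_t = s_{t-m} + \gamma\varepsilon_t$

Class 2

MNN: $Y_t = \ell_{t-1}(1 + \varepsilon_t)$
     $\ell_t = \ell_{t-1}(1 + \alpha\varepsilon_t)$

MNA: $Y_t = (\ell_{t-1} + s_{t-m})(1 + \varepsilon_t)$
     $\ell_t = \ell_{t-1} + \alpha(\ell_{t-1} + s_{t-m})\varepsilon_t$
     $s_t = s_{t-m} + \gamma(\ell_{t-1} + s_{t-m})\varepsilon_t$

MAN: $Y_t = (\ell_{t-1} + b_{t-1})(1 + \varepsilon_t)$
     $\ell_t = (\ell_{t-1} + b_{t-1})(1 + \alpha\varepsilon_t)$
     $b_t = b_{t-1} + \alpha\beta(\ell_{t-1} + b_{t-1})\varepsilon_t$

MAA: $Y_t = (\ell_{t-1} + b_{t-1} + s_{t-m})(1 + \varepsilon_t)$
     $\ell_t = \ell_{t-1} + b_{t-1} + \alpha(\ell_{t-1} + b_{t-1} + s_{t-m})\varepsilon_t$
     $b_t = b_{t-1} + \alpha\beta(\ell_{t-1} + b_{t-1} + s_{t-m})\varepsilon_t$
     $s_t = s_{t-m} + \gamma(\ell_{t-1} + b_{t-1} + s_{t-m})\varepsilon_t$

MDN: $Y_t = (\ell_{t-1} + b_{t-1})(1 + \varepsilon_t)$
     $\ell_t = (\ell_{t-1} + b_{t-1})(1 + \alpha\varepsilon_t)$
     $b_t = \phi b_{t-1} + \alpha\beta(\ell_{t-1} + b_{t-1})\varepsilon_t$

MDA: $Y_t = (\ell_{t-1} + b_{t-1} + s_{t-m})(1 + \varepsilon_t)$
     $\ell_t = \ell_{t-1} + b_{t-1} + \alpha(\ell_{t-1} + b_{t-1} + s_{t-m})\varepsilon_t$
     $b_t = \phi b_{t-1} + \alpha\beta(\ell_{t-1} + b_{t-1} + s_{t-m})\varepsilon_t$
     $s_t = s_{t-m} + \gamma(\ell_{t-1} + b_{t-1} + s_{t-m})\varepsilon_t$

Class 3

MNM: $Y_t = \ell_{t-1}s_{t-m}(1 + \varepsilon_t)$
     $\ell_t = \ell_{t-1}(1 + \alpha\varepsilon_t)$
     $s_t = s_{t-m}(1 + \gamma\varepsilon_t)$

MAM: $Y_t = (\ell_{t-1} + b_{t-1})s_{t-m}(1 + \varepsilon_t)$
     $\ell_t = (\ell_{t-1} + b_{t-1})(1 + \alpha\varepsilon_t)$
     $b_t = b_{t-1} + \alpha\beta(\ell_{t-1} + b_{t-1})\varepsilon_t$
     $s_t = s_{t-m}(1 + \gamma\varepsilon_t)$

MDM: $Y_t = (\ell_{t-1} + b_{t-1})s_{t-m}(1 + \varepsilon_t)$
     $\ell_t = (\ell_{t-1} + b_{t-1})(1 + \alpha\varepsilon_t)$
     $b_t = \phi b_{t-1} + \alpha\beta(\ell_{t-1} + b_{t-1})\varepsilon_t$
     $s_t = s_{t-m}(1 + \gamma\varepsilon_t)$

Table 1: Equations defining each of the 15 models.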
Class 1: mean $\mu_h$; variance $v_1 = \sigma^2$ and
$$v_h = \sigma^2\Big[1 + \sum_{j=1}^{h-1} c_j^2\Big]$$

Class 2: mean $\mu_h$; variance
$$v_h = (1 + \sigma^2)\theta_h - \mu_h^2$$
where $\theta_1 = \mu_1^2$ and $\theta_h = \mu_h^2 + \sigma^2 \sum_{j=1}^{h-1} c_j^2\,\theta_{h-j}$.

Class 3: mean $\mu_h$; variance
$$v_h = s_{n-m+h}^2\Big[\theta_h(1 + \sigma^2)(1 + \gamma^2\sigma^2)^k - \tilde{\mu}_h^2\Big]$$
where $\theta_1 = \tilde{\mu}_1^2$, $\theta_h = \tilde{\mu}_h^2 + \sigma^2 \sum_{j=1}^{h-1} c_j^2\,\theta_{h-j}$ and $k = \lfloor (h-1)/m \rfloor$.

Table 2: h-period-ahead forecast means and variances. Here $\lfloor u \rfloor$ denotes the largest integer less than or
equal to $u$ and $m$ denotes the period of seasonality. For Class 3, the expression is exact when $h \le m$ but
only approximate for $h > m$.
Class 1/Class 2

Model      $\mu_h$                                               $c_j$
ANN/MNN    $\ell_n$                                              $\alpha$
AAN/MAN    $\ell_n + hb_n$                                       $\alpha(1 + j\beta)$
ADN/MDN    $\ell_n + \phi_{h-1}b_n$                              $\alpha(1 + \phi_{j-1}\beta)$
ANA/MNA    $\ell_n + s_{n-m+1+(h-1)^*}$                          $\alpha + \gamma d_{j,m}$
AAA/MAA    $\ell_n + hb_n + s_{n-m+1+(h-1)^*}$                   $\alpha(1 + j\beta) + \gamma d_{j,m}$
ADA/MDA    $\ell_n + \phi_{h-1}b_n + s_{n-m+1+(h-1)^*}$          $\alpha(1 + \phi_{j-1}\beta) + \gamma d_{j,m}$

Class 3

Model    $\mu_h$                                          $\tilde{\mu}_h$              $c_j$
MNM      $\ell_n s_{n-m+1+(h-1)^*}$                       $\ell_n$                     $\alpha$
MAM      $(\ell_n + hb_n)s_{n-m+1+(h-1)^*}$               $\ell_n + hb_n$              $\alpha(1 + j\beta)$
MDM      $(\ell_n + \phi_{h-1}b_n)s_{n-m+1+(h-1)^*}$      $\ell_n + \phi_{h-1}b_n$     $\alpha(1 + \phi_{j-1}\beta)$

Table 3: Values of $\mu_h$, $\tilde{\mu}_h$ and $c_j$ for the 15 models. Here $\phi_j = 1 + \phi + \dots + \phi^j = (1 - \phi^{j+1})/(1 - \phi)$,
$d_{j,m} = 1$ if $j = m \pmod m$ and 0 otherwise, and $j^* = j \pmod m$.
development of this approach, see Snyder (1985) and Ord et al. (1997). The variables $\ell_t$, $b_t$ and
$s_t$ are elements of the state vector and denote the level, slope and seasonal components respectively;
the parameters $\alpha$, $\beta$ and $\gamma$ are the usual smoothing parameters corresponding to the level
equation, trend equation and seasonal equation; $\phi$ is a damping coefficient used for the damped
trend models; and $m$ denotes the number of seasons in a year.
We derive the forecast means and variances for each of the three model classes, and specifically
for each of the 15 models. The forecast mean for $Y_{n+h}$ made $h$ steps ahead from forecast origin $n$ is
denoted by $\mu_h = E(Y_{n+h} \mid x_n)$ and the corresponding forecast variance is given by $v_h = \mathrm{Var}(Y_{n+h} \mid x_n)$.
The main results are summarized in Tables 2 and 3.
Criteria such as maximum likelihood for selection of optimal estimates for the parameters can be
found in Hyndman et al. (2001) and Ord et al. (1997). It is important to notice that estimates for
$\sigma^2$ are not done in the same manner for all three classes. The estimate for $\sigma^2$ would be
$$\hat{\sigma}^2 = \sum_{t=1}^{n} \hat{\varepsilon}_t^2 / n$$
where
$$\hat{\varepsilon}_t = \begin{cases} Y_t - \hat{Y}_{t-1}(1) & \text{for Class 1;} \\ (Y_t - \hat{Y}_{t-1}(1))/\hat{Y}_{t-1}(1) & \text{for Classes 2 and 3;} \end{cases}$$
and
$$Y_t(1) = E(Y_{t+1} \mid x_t) = \begin{cases} Hx_t & \text{for Classes 1 and 2;} \\ H_1x_t\,H_2z_t & \text{for Class 3.} \end{cases}$$
For the special cases in Table 2, $Y_n(1) = \mu_1$.
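The difference between the two residual definitions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `one_step_residuals` and `sigma2_hat`, the flag `multiplicative`, and the toy numbers are ours and do not come from the paper, and the one-step forecasts are taken as given rather than produced by a fitted model.

```python
def one_step_residuals(y, yhat, multiplicative):
    """Residuals from observations y_t and one-step forecasts yhat_t.

    yhat[t] plays the role of the one-step forecast made at time t-1.
    multiplicative=False gives the Class 1 (additive) residual;
    multiplicative=True gives the relative residual of Classes 2 and 3.
    """
    if multiplicative:
        return [(obs - f) / f for obs, f in zip(y, yhat)]
    return [obs - f for obs, f in zip(y, yhat)]


def sigma2_hat(residuals):
    """Estimate of sigma^2: the mean of the squared residuals."""
    return sum(e * e for e in residuals) / len(residuals)


# Hypothetical toy data, chosen only to make the arithmetic easy to follow.
y = [102.0, 98.0, 105.0, 110.0]
yhat = [100.0, 100.0, 100.0, 100.0]

additive = sigma2_hat(one_step_residuals(y, yhat, multiplicative=False))
relative = sigma2_hat(one_step_residuals(y, yhat, multiplicative=True))
print(additive, relative)
```

With forecasts near 100, the relative estimate is smaller than the additive one by a factor of roughly 100 squared, which is why the two estimates of $\sigma^2$ are not interchangeable across classes.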
More detail concerning the results for each class is given in the following sections. Derivations
of these results are given in the Appendices.
2. Class 1
Derivations are given in Appendix A.
In this case, the general results for the mean and variance are
$$\mu_h = HF^{h-1}x_n, \tag{1}$$
$$v_1 = \sigma^2, \tag{2}$$
$$\text{and}\quad v_h = \sigma^2\Big[1 + \sum_{j=1}^{h-1} c_j^2\Big], \quad h \ge 2, \tag{3}$$
where $c_j = HF^{j-1}G$. Specific values for $\mu_h$ and $c_j$ for the particular models in Class 1 are given in
Tables 2 and 3.
Note that point forecasts from ANN are equivalent to simple exponential smoothing (SES) and
AAN gives forecasts equivalent to Holt’s method. SES with drift is obtained from AAN by setting
$\beta = 0$ so that $b_n = b$ for all $n$. The additive Holt-Winters' method is equivalent to the point
forecasts from AAA. Furthermore, ANN is equivalent to an ARIMA(0,1,1) model where $\theta = 1 - \alpha$,
AAN is equivalent to an ARIMA(0,2,2) model and AAA is equivalent to an ARIMA[0, (1, m), m + 1]
model where (1, m) denotes differences of orders 1 and m (McKenzie, 1976; Roberts, 1982).
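The expressions for $v_h$ can be simplified as shown below.

ANN: $v_h = \sigma^2\big[1 + \alpha^2(h - 1)\big]$

AAN: $v_h = \sigma^2\Big[1 + \alpha^2(h - 1)\big\{1 + \beta h + \tfrac{1}{6}\beta^2 h(2h - 1)\big\}\Big]$

ADN: $v_h = \sigma^2\Big[1 + \alpha^2(h - 1) + \alpha^2\beta \sum_{j=1}^{h-1} \phi_{j-1}(2 + \phi_{j-1}\beta)\Big]$

ANA: $v_h = \sigma^2\Big[1 + \alpha^2(h - 1) + \gamma(2\alpha + \gamma)\lfloor (h - 1)/m \rfloor\Big]$

AAA: $v_h = \sigma^2\Big[1 + \alpha^2(h - 1)\big\{1 + \beta h + \tfrac{1}{6}\beta^2 h(2h - 1)\big\} + \gamma k\big\{\gamma + \alpha[2 + \beta m(k + 1)]\big\}\Big]$
     where $k = \lfloor (h - 1)/m \rfloor$

ADA: $v_h = \sigma^2\Big[1 + \sum_{j=1}^{h-1} \big\{\alpha^2(1 + \phi_{j-1}\beta)^2 + \gamma d_{j,m}[\gamma + 2\alpha(1 + \phi_{j-1}\beta)]\big\}\Big]$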
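As a numerical cross-check, the closed forms for AAN and AAA can be compared with the general sum in (3) using the $c_j$ values from Table 3. The sketch below is ours, not the authors' code; the function names and the parameter values in the loop are illustrative assumptions.

```python
import math


def c_class1(j, alpha, beta=0.0, gamma=0.0, phi=1.0, m=1):
    """c_j = alpha * (1 + phi_{j-1} * beta) + gamma * d_{j,m} (Table 3)."""
    phi_sum = sum(phi ** i for i in range(j))   # phi_{j-1} = 1 + phi + ... + phi^(j-1)
    d = 1.0 if j % m == 0 else 0.0              # d_{j,m}; irrelevant when gamma == 0
    return alpha * (1.0 + phi_sum * beta) + gamma * d


def v_general(h, sigma2, **params):
    """Equation (3): sigma^2 * (1 + sum of c_j^2 for j = 1..h-1)."""
    return sigma2 * (1.0 + sum(c_class1(j, **params) ** 2 for j in range(1, h)))


def v_aan(h, sigma2, alpha, beta):
    """Closed form for AAN."""
    return sigma2 * (1.0 + alpha ** 2 * (h - 1)
                     * (1.0 + beta * h + beta ** 2 * h * (2 * h - 1) / 6.0))


def v_aaa(h, sigma2, alpha, beta, gamma, m):
    """Closed form for AAA: the AAN term plus the seasonal contribution."""
    k = (h - 1) // m
    seasonal = gamma * k * (gamma + alpha * (2.0 + beta * m * (k + 1)))
    return v_aan(h, sigma2, alpha, beta) + sigma2 * seasonal


# Illustrative parameter values (assumed, not taken from the paper).
for h in range(1, 13):
    assert math.isclose(v_general(h, 4.0, alpha=0.3, beta=0.2),
                        v_aan(h, 4.0, 0.3, 0.2))
    assert math.isclose(v_general(h, 4.0, alpha=0.3, beta=0.2, gamma=0.1, m=4),
                        v_aaa(h, 4.0, 0.3, 0.2, 0.1, 4))
```

Setting `beta=0.0` in `v_aan` recovers the ANN expression, and `phi < 1` in `v_general` gives the damped cases, for which no shorter closed form is listed.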
3. Class 2
Derivations are given in Appendix B.
In this case, the general result for the forecast mean is the same as for Class 1, namely
$$\mu_h = HF^{h-1}x_n. \tag{4}$$
The forecast variance is given by
$$v_h = HV_{h-1}H'(1 + \sigma^2) + \sigma^2\mu_h^2 \tag{5}$$
where
$$V_h = FV_{h-1}F' + \sigma^2GV_{h-1}G' + \sigma^2P_{h-1}, \quad h = 1, 2, \dots, \tag{6}$$
$V_0 = O$, and $P_j = GF^jx_nx_n'(F^j)'G'$.
For the six models we consider in this class, we obtain the following simpler expression
$$v_h = (1 + \sigma^2)\theta_h - \mu_h^2$$
where $\theta_1 = \mu_1^2$ and
$$\theta_h = \mu_h^2 + \sigma^2 \sum_{j=1}^{h-1} c_j^2\,\theta_{h-j} \tag{7}$$
and $c_j$ depends on the particular model. Note that $c_j$ is identical to that for the corresponding
additive error model from Class 1. Specific values for $\mu_h$ and $c_j$ for the particular models in
Class 2 are given in Tables 2 and 3.
Note that point forecasts from MNN are equivalent to simple exponential smoothing (SES) but
that the variances are different from ANN. Similarly, MAN gives point forecasts equivalent to
Holt's method but with different variances from AAN, and MAA gives point forecasts equivalent
to the additive Holt-Winters' method but with different variances from AAA.
In the case of MNN, a non-recursive expression for $v_h$ can be obtained:
$$v_h = \ell_n^2\Big[(1 + \alpha^2\sigma^2)^{h-1}(1 + \sigma^2) - 1\Big].$$
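The recursion (7) is straightforward to evaluate directly. The following sketch, with function names and parameter values that are our own assumptions, computes $\theta_h$ and $v_h$ for any Class 2 model given $\mu_h$ and $c_j$, and checks the result against the non-recursive MNN expression.

```python
import math


def theta_sequence(mu, c, sigma2):
    """theta_1..theta_H from (7); mu[h-1] holds mu_h and c[j-1] holds c_j."""
    theta = []
    for h in range(1, len(mu) + 1):
        # sum over j = 1..h-1 of c_j^2 * theta_{h-j}; empty when h == 1
        acc = sum(c[j - 1] ** 2 * theta[h - j - 1] for j in range(1, h))
        theta.append(mu[h - 1] ** 2 + sigma2 * acc)
    return theta


def class2_variances(mu, c, sigma2):
    """v_h = (1 + sigma^2) * theta_h - mu_h^2 for h = 1..H."""
    theta = theta_sequence(mu, c, sigma2)
    return [(1.0 + sigma2) * t - m_h ** 2 for t, m_h in zip(theta, mu)]


def mnn_variance(h, level, alpha, sigma2):
    """Non-recursive MNN expression."""
    return level ** 2 * ((1.0 + alpha ** 2 * sigma2) ** (h - 1) * (1.0 + sigma2) - 1.0)


# Illustrative values (assumed): MNN has mu_h = level and c_j = alpha for all h, j.
level, alpha, sigma2, horizon = 50.0, 0.4, 0.01, 10
v = class2_variances([level] * horizon, [alpha] * horizon, sigma2)
for h in range(1, horizon + 1):
    assert math.isclose(v[h - 1], mnn_variance(h, level, alpha, sigma2))
```

For the other five Class 2 models only the inputs change: `mu` and `c` are filled from the corresponding rows of Table 3.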
4. Class 3
Derivations are given in Appendix C.
For models in this class,
$$\mu_h = H_1M_{h-1}H_2' \tag{8}$$
and
$$v_h = (1 + \sigma^2)(H_2 \otimes H_1)V_{h-1}(H_2 \otimes H_1)' + \sigma^2\mu_h^2 \tag{9}$$
where $\otimes$ denotes a Kronecker product, $M_0 = x_nz_n'$, $V_0 = O_{2m}$, and for $h \ge 1$,
$$M_h = F_1M_{h-1}F_2' + G_1M_{h-1}G_2'\sigma^2 \tag{10}$$
and
$$\begin{aligned}
V_h ={}& (F_2 \otimes F_1)V_{h-1}(F_2 \otimes F_1)' \\
&+ \sigma^2\Big[(F_2 \otimes F_1)V_{h-1}(G_2 \otimes G_1)' + (G_2 \otimes G_1)V_{h-1}(F_2 \otimes F_1)'\Big] \\
&+ \sigma^2(G_2 \otimes F_1 + F_2 \otimes G_1)\Big[V_{h-1} + \mathrm{vec}M_{h-1}(\mathrm{vec}M_{h-1})'\Big](G_2 \otimes F_1 + F_2 \otimes G_1)' \\
&+ \sigma^4(G_2 \otimes G_1)\Big[3V_{h-1} + 2\,\mathrm{vec}M_{h-1}(\mathrm{vec}M_{h-1})'\Big](G_2 \otimes G_1)'.
\end{aligned} \tag{11}$$
Note, in particular, that $\mu_1 = (H_1x_n)(H_2z_n)'$ and $v_1 = \sigma^2\mu_1^2$.
Because $\sigma^2$ is usually small (much less than 1), approximate expressions for the mean and variance
can be obtained:
$$\mu_h = \mu_{1,h-1}\,\mu_{2,h-1} + O(\sigma^2)$$
$$v_h \approx (1 + \sigma^2)(v_{1,h-1} + \mu_{1,h-1}^2)(v_{2,h-1} + \mu_{2,h-1}^2) - \mu_{1,h-1}^2\,\mu_{2,h-1}^2$$
where $\mu_{1,h} = H_1F_1^hx_n$, $\mu_{2,h} = H_2F_2^hz_n$, $v_{1,h} = \mathrm{Var}(H_1x_{n+h} \mid x_n)$ and $v_{2,h} = \mathrm{Var}(H_2z_{n+h} \mid z_n)$.
In the three special cases we consider, these expressions can be written as
$$\mu_h = \tilde{\mu}_h\,s_{n-m+1+(h-1)^*} + O(\sigma^2) \tag{12}$$
$$\text{and}\quad v_h \approx s_{n-m+1+(h-1)^*}^2\Big[\theta_h(1 + \sigma^2)(1 + \gamma^2\sigma^2)^k - \tilde{\mu}_h^2\Big] \tag{13}$$
where $k = \lfloor (h - 1)/m \rfloor$, $\theta_1 = \tilde{\mu}_1^2$, and
$$\theta_h = \tilde{\mu}_h^2 + \sigma^2 \sum_{j=1}^{h-1} c_j^2\,\theta_{h-j}, \quad h \ge 2.$$
These expressions are exact for $h \le m$. Specific values for $\mu_h$, $\tilde{\mu}_h$ and $c_j$ for the particular models
in Class 3 are given in Tables 2 and 3.
Note that the usual point forecasts for these models are given by (12) rather than (8). Also, the
point forecasts from MAM are equivalent to the multiplicative Holt-Winters method.
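For the MNM model, $\tilde{\mu}_h = \ell_n$ and $c_j = \alpha$, so that $\theta_h = \ell_n^2(1 + \alpha^2\sigma^2)^{h-1}$ and a simpler expression for $v_h$ is available:
$$v_h \approx s_{n-m+1+(h-1)^*}^2\,\ell_n^2\Big[(1 + \alpha^2\sigma^2)^{h-1}(1 + \sigma^2)(1 + \gamma^2\sigma^2)^k - 1\Big].$$
(The expression is exact for $h \le m$.)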
5. The accuracy of the approximations
To investigate the accuracy of the approximations for the mean (12) and standard
deviation (13) relative to the exact expressions in (8) and (9), we provide some comparisons for the MAM
model in Class 3.
These comparisons are done for quarterly data where the values for the components are assumed
to be the following: $\ell_n = 100$, $b_n = 2$, $s_n = 0.80$, $s_{n-1} = 1.20$, $s_{n-2} = 0.90$, $s_{n-3} = 1.10$. We use
the following base level values for the parameters: $\alpha = 0.2$, $\beta = 0.3$ (i.e., $\alpha\beta = 0.06$), $\gamma = 0.1$, and
$\sigma = 0.05$. We vary these parameters one at a time as shown in Table 4.
The results in Table 4 show that the mean and approximate mean are always very close and that
the percentage difference in the standard deviations only becomes substantial when we increase
$\gamma$. This result for the standard deviation is not surprising because the approximation is exact if
$\gamma = 0$. In fact, we recommend that the approximation not be used if the seasonal smoothing
parameter $\gamma$ exceeds 0.10.
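The approximate columns of Table 4 can be reproduced from (12) and (13) with a short script. The sketch below is an illustration under assumptions of ours: the function name `mam_approx` is hypothetical, and the seasonal states are taken, from oldest to newest, as 1.10, 0.90, 1.20, 0.80, which is the ordering implied by the approximate means in Table 4.

```python
import math


def mam_approx(level, slope, seasonals, alpha, beta, gamma, sigma, h):
    """Approximate h-step mean (12) and standard deviation (13) for MAM.

    seasonals lists s_{n-m+1}, ..., s_n (oldest first), so that
    seasonals[(h - 1) % m] is s_{n-m+1+(h-1)*}.
    """
    m = len(seasonals)
    sigma2 = sigma ** 2
    mu_tilde = [level + i * slope for i in range(1, h + 1)]   # l_n + i * b_n
    c = [alpha * (1.0 + j * beta) for j in range(1, h)]       # c_j from Table 3
    theta = []
    for i in range(1, h + 1):
        acc = sum(c[j - 1] ** 2 * theta[i - j - 1] for j in range(1, i))
        theta.append(mu_tilde[i - 1] ** 2 + sigma2 * acc)
    s = seasonals[(h - 1) % m]
    k = (h - 1) // m
    mean = mu_tilde[-1] * s
    var = s ** 2 * (theta[-1] * (1.0 + sigma2) * (1.0 + gamma ** 2 * sigma2) ** k
                    - mu_tilde[-1] ** 2)
    return mean, math.sqrt(var)


# Base-level settings of Section 5, five quarters ahead.
mean5, sd5 = mam_approx(100.0, 2.0, [1.10, 0.90, 1.20, 0.80],
                        alpha=0.2, beta=0.3, gamma=0.1, sigma=0.05, h=5)
print(round(mean5, 2), round(sd5, 2))
```

Under these assumptions the values for h = 5 should agree with the approximate mean and approximate SD in the first block of Table 4. The exact columns require the matrix recursions (10) and (11), which this sketch does not attempt.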
6. Example
As a numerical example, we consider the quarterly sales data given in Makridakis, Wheelwright
and Hyndman (1998, p. 162) and use the multiplicative Holt-Winters' method (model MAM).
Following the approach outlined in Hyndman et al. (2001), we estimate the parameters to be
$\alpha = 0.8$, $\beta = 0.1$, $\gamma = 0.1$ and $\sigma = 0.0384$ with the final states $\ell_n = 757.2$, $b_n = 17.6$,
$z_n = (0.873, 1.146, 1.031, 0.958)'$.
Figure 1 shows the forecast standard deviations calculated exactly using (9) and approximately
using (13). We also show the approximation suggested by Koehler, Snyder and Ord (2001) for
$1 \le h \le m$. Clearly, both approximations are very close to the exact values in this case (because
$\sigma^2$ is so small here).
Period ahead h   Mean (8)   Approximate mean (12)   SD (9)   Approximate SD (13)   SD percent difference
(The mean columns report $\mu_h$ and the SD columns report $\sqrt{v_h}$.)
σ = 0.05, α = 0.2, αβ = 0.06, γ = 0.1
5 121.01 121.00 7.53 7.33 2.69
6 100.81 100.80 6.68 6.52 2.37
7 136.81 136.80 9.70 9.50 2.07
8 92.81 92.80 7.06 6.93 1.80
9 129.83 129.80 10.85 10.45 3.68
10 108.03 108.00 9.65 9.34 3.21
11 146.44 146.40 13.99 13.60 2.81
12 99.22 99.20 10.13 9.88 2.47
σ = 0.1, α = 0.2, αβ = 0.06, γ = 0.1
5 121.05 121.00 15.09 14.68 2.73
6 100.84 100.80 13.39 13.07 2.40
7 136.86 136.80 19.45 19.04 2.11
8 92.84 92.80 14.15 13.89 1.84
9 129.93 129.80 21.77 20.96 3.75
10 108.11 108.00 19.39 18.75 3.29
11 146.55 146.40 28.11 27.30 2.89
12 99.30 99.20 20.35 19.83 2.55
σ = 0.05, α = 0.6, αβ = 0.06, γ = 0.1
5 121.02 121.00 10.87 10.60 2.47
6 100.82 100.80 9.96 9.76 2.04
7 136.83 136.80 14.76 14.51 1.72
8 92.82 92.80 10.86 10.70 1.47
9 129.86 129.80 16.64 16.19 2.71
10 108.05 108.00 14.83 14.48 2.37
11 146.46 146.40 21.45 21.00 2.09
12 99.24 99.20 15.45 15.16 1.86
σ = 0.05, α = 0.2, αβ = 0.18, γ = 0.1
5 121.03 121.00 10.19 9.87 3.08
6 100.82 100.80 9.88 9.66 2.27
7 136.83 136.80 15.55 15.29 1.69
8 92.82 92.80 12.14 11.98 1.28
9 129.87 129.80 19.67 19.16 2.56
10 108.06 108.00 18.41 18.04 2.03
11 146.48 146.40 27.86 27.41 1.64
12 99.26 99.20 20.93 20.65 1.35
σ = 0.05, α = 0.2, αβ = 0.06, γ = 0.3
5 121.04 121.00 8.10 7.53 7.12
6 100.83 100.80 7.13 6.68 6.36
7 136.84 136.80 10.28 9.70 5.64
8 92.83 92.80 7.42 7.05 4.97
9 129.90 129.80 11.89 10.77 9.46
10 108.08 108.00 10.47 9.59 8.42
11 146.51 146.40 15.04 13.91 7.49
12 99.27 99.20 10.79 10.07 6.67
Table 4: Comparison of exact and approximate means and standard deviations for MAM model in Class 3
(i.e., (8) and (9) versus (12) and (13)).
[Figure 1 (plot not reproduced): forecast standard deviation against forecast horizon, with three series: exact, small sigma approximation, and KSO approximation.]
Figure 1: Forecast standard deviations calculated (a) exactly using (9); (b) approximately using (13); and
(c) using the approximation suggested by Koehler, Snyder and Ord (2001) for $1 \le h \le m$.
[Figure 2 (plot not reproduced): quarterly sales against quarter, with percentile-based and variance-based prediction intervals.]
Figure 2: Quarterly sales data with three years of forecasts. The solid lines show prediction intervals
calculated as $\mu_h \pm 1.96\sqrt{v_h}$ and the dotted lines show prediction intervals computed by generating 20,000 future
sample paths from the fitted model and finding the 2.5% and 97.5% quantiles at each forecast horizon.
The data with three years of forecasts are shown in Figure 2. In this case, the conditional mean
forecasts obtained from model MAM are virtually indistinguishable from the usual forecasts
because $\sigma$ is so small (they are identical up to $h = m$). The solid lines show prediction intervals
calculated as $\mu_h \pm 1.96\sqrt{v_h}$ and the dotted lines show prediction intervals computed by generating
20,000 future sample paths from the fitted model and finding the 2.5% and 97.5% quantiles at
each forecast horizon. Clearly, the variance-based intervals are a good approximation despite the
non-normality of the forecast distributions.
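A variance-based interval of this kind needs only the forecast mean and variance. The helper below is a minimal sketch with a hypothetical name, `variance_based_interval`; it uses the normal quantile from Python's standard library rather than the rounded value 1.96, and the input numbers are illustrative rather than taken from the sales example.

```python
import math
from statistics import NormalDist


def variance_based_interval(mu_h, v_h, level=0.95):
    """Return (lower, upper) = mu_h -/+ z * sqrt(v_h) for a normal quantile z."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # about 1.96 when level = 0.95
    half_width = z * math.sqrt(v_h)
    return mu_h - half_width, mu_h + half_width


# Illustrative inputs (assumed): forecast mean 121.0 and forecast variance 53.7.
lower, upper = variance_based_interval(121.0, 53.7)
print(lower, upper)
```

Because the interval is symmetric about the mean, it cannot capture any skewness in the forecast distribution; the percentile-based intervals from simulation remain the reference when that matters.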
7. Summary
For three general classes of state space models, we have provided derivations of exact matrix
expressions for the means and variances of prediction distributions. These general results are
presented separately in a section for each class with the derivations put in three separate
appendices. We relate these three classes of state space models to the commonly used exponential
smoothing methods (simple, Holt, and additive and multiplicative Holt-Winters) and to other
known exponential smoothing methods (Hyndman et al., 2001). We provide a summary of these
models and the corresponding non-matrix expressions of the means and variances in Tables 1, 2
and 3. These means and variances may be used to construct analytical prediction intervals when
using the exponential smoothing methods for forecasting.
The non-matrix formulas for the Class 3 models are not exact for $h > m$. In Table 4 we compare
our exact matrix formulas with our approximate formulas for the model that corresponds to the
multiplicative Holt-Winters method (MAM). We find that the approximation is very good as long
as the smoothing parameter for the seasonal component remains small (i.e., no more than 0.1). We
also consider an example in which we compare our forecast standard deviations and prediction
intervals with the values from some of the previously used approaches.
In summary, we have provided, for the first time, exact analytical formulas for the variances of
prediction distributions for all the exponential smoothing methods. More generally, we have
exact formulas for variances of the general state space models of which the exponential smoothing
models are special cases. Where possible, we have presented both matrix and non-matrix
expressions.
Simulation methods have been the only comprehensive approach to handling the prediction
distribution problem for all exponential smoothing methods to date. Our formulas provide an
effective alternative, the advantage being that they involve much lower computational loads.
Appendix A: Proofs of results for Class 1
Let
$$m_h = E(x_{n+h} \mid x_n) \quad\text{and}\quad V_h = \mathrm{Var}(x_{n+h} \mid x_n).$$
Note that $m_0 = x_n$ and $V_0 = O$.
For Class 1
$$m_h = Fm_{h-1} = F^2m_{h-2} = \dots = F^hm_0 = F^hx_n$$
and therefore
$$\mu_h = Hm_{h-1} = HF^{h-1}x_n.$$
The state forecast variance is given by
$$V_h = FV_{h-1}F' + GG'\sigma^2$$
and therefore
$$V_h = \sigma^2 \sum_{j=0}^{h-1} F^jGG'(F^j)'.$$
Hence, the prediction variance for $h$ periods ahead is
$$v_h = HV_{h-1}H' + \sigma^2 = \begin{cases} \sigma^2 & \text{if } h = 1; \\ \sigma^2\Big[1 + \sum_{j=1}^{h-1} c_j^2\Big] & \text{if } h \ge 2; \end{cases}$$
where $c_j = HF^{j-1}G$.
We now consider the particular cases.
ADA
We first derive the results for the ADA case. Here the state is $x_n = (\ell_n, b_n, s_n, s_{n-1}, \dots, s_{n-m+1})'$,
$$H = [\,1 \;\; 1 \;\; 0_{m-1}' \;\; 1\,], \quad
F = \begin{bmatrix} 1 & 1 & 0_{m-1}' & 0 \\ 0 & \phi & 0_{m-1}' & 0 \\ 0 & 0 & 0_{m-1}' & 1 \\ 0_{m-1} & 0_{m-1} & I_{m-1} & 0_{m-1} \end{bmatrix}
\quad\text{and}\quad
G = \begin{bmatrix} \alpha \\ \alpha\beta \\ \gamma \\ 0_{m-1} \end{bmatrix}$$
where $I_n$ denotes the $n \times n$ identity matrix and $0_n$ denotes a zero vector of length $n$.
Therefore $HF^i = [1, \phi_i, d_{i+1,m}, d_{i+2,m}, \dots, d_{i+m,m}]'$ where $\phi_i = 1 + \phi + \dots + \phi^i$ and $d_{j,m} = 1$ if $j = m \pmod m$
and $d_{j,m} = 0$ otherwise. Hence we find $c_j = HF^{j-1}G = \alpha(1 + \phi_{j-1}\beta) + \gamma d_{j,m}$,
$$\mu_h = \ell_n + \phi_{h-1}b_n + s_{n-m+1+(h-1)^*}$$
and for $h \ge 2$,
$$\begin{aligned}
v_h &= \sigma^2\Big\{1 + \sum_{j=1}^{h-1} \big[\alpha(1 + \phi_{j-1}\beta) + \gamma d_{j,m}\big]^2\Big\} \\
&= \sigma^2\Big[1 + \sum_{j=1}^{h-1} \big\{\alpha^2(1 + \phi_{j-1}\beta)^2 + \gamma d_{j,m}[\gamma + 2\alpha(1 + \phi_{j-1}\beta)]\big\}\Big].
\end{aligned}$$
These formulas agree with those of Yar and Chatfield (1990) except that we apply the damping
parameter $\phi$ beginning in the second forecast time period, $n + 2$, instead of in the first forecast time
period, $n + 1$.
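The expression for $c_j$ can be checked numerically by building $F$, $G$ and $H$ for a small seasonal period and iterating the matrix product. The sketch below uses plain Python lists; the function names and the parameter values are our own illustrative assumptions, and the construction assumes $m \ge 2$.

```python
import math


def ada_matrices(alpha, beta, gamma, phi, m):
    """F, G, H for ADA with state (l, b, s_n, s_{n-1}, ..., s_{n-m+1})'."""
    n = m + 2
    F = [[0.0] * n for _ in range(n)]
    F[0][0] = F[0][1] = 1.0          # level:  l + b
    F[1][1] = phi                    # slope:  phi * b
    F[2][n - 1] = 1.0                # newest seasonal copies the oldest one
    for i in range(m - 1):           # remaining seasonals shift down one place
        F[3 + i][2 + i] = 1.0
    H = [1.0, 1.0] + [0.0] * (m - 1) + [1.0]
    G = [alpha, alpha * beta, gamma] + [0.0] * (m - 1)
    return F, G, H


def c_matrix(j, F, G, H):
    """c_j = H F^(j-1) G, computed by repeated matrix-vector products."""
    v = G[:]
    for _ in range(j - 1):
        v = [sum(F[r][k] * v[k] for k in range(len(v))) for r in range(len(v))]
    return sum(h_k * v_k for h_k, v_k in zip(H, v))


def c_closed(j, alpha, beta, gamma, phi, m):
    """c_j = alpha * (1 + phi_{j-1} * beta) + gamma * d_{j,m}."""
    phi_sum = sum(phi ** i for i in range(j))
    d = 1.0 if j % m == 0 else 0.0
    return alpha * (1.0 + phi_sum * beta) + gamma * d


# Illustrative parameters (assumed), quarterly seasonality.
params = dict(alpha=0.3, beta=0.2, gamma=0.1, phi=0.9, m=4)
F, G, H = ada_matrices(**params)
for j in range(1, 13):
    assert math.isclose(c_matrix(j, F, G, H), c_closed(j, **params))
```

Setting `phi=1.0` gives the AAA case, and `gamma=0.0` removes the seasonal term, which mirrors how the remaining Class 1 models are obtained as special cases.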
Other cases
All other cases of Class 1 can be derived as special cases of ADA.
• For ADN, we use the results of ADA with $\gamma = 0$ and $s_t = 0$ for all $t$.
• For AAN, we use the results of ADN with $\phi = 1$.
• The results for ANN are obtained from AAN by further setting $\beta = 0$ and $b_t = 0$ for all $t$.
• For AAA, the results of ADA hold with $\phi = 1$.
• The results for ANA are obtained as a special case of AAA with $\beta = 0$ and $b_t = 0$ for all $t$.
Appendix B: Proofs of results for Class 2
Let $m_h$ and $V_h$ be defined as in Appendix A. The forecast means for Class 2 have the same form
as for Class 1, namely
$$\mu_h = Hm_{h-1} = HF^{h-1}x_n.$$
To obtain $V_h$, first note that $V_h = FV_{h-1}F' + G\,\mathrm{Var}(x_{n+h-1}\varepsilon_{n+h})\,G'$ and that
$$\mathrm{Var}(x_{n+h-1}\varepsilon_{n+h}) = E[x_{n+h-1}x_{n+h-1}']\,E(\varepsilon_{n+h}^2) - 0 = \sigma^2[V_{h-1} + m_{h-1}m_{h-1}'].$$
Therefore
$$V_h = FV_{h-1}F' + \sigma^2GV_{h-1}G' + \sigma^2P_{h-1}$$
where $P_j = GF^jx_nx_n'(F^j)'G'$.
The forecast variance is given by
$$\begin{aligned}
v_h &= HV_{h-1}H'(1 + \sigma^2) + \sigma^2Hm_{h-1}m_{h-1}'H' \\
&= HV_{h-1}H'(1 + \sigma^2) + \sigma^2\mu_h^2.
\end{aligned}$$
In the special case where G = QH we obtain a simpler result. In this case, x
t
= Fx
t−1
+ Qe
t
where e
t
= y
t
− Hx
t−1
= Hx
t−1
ε
t
. Thus, we obtain the linear exponential smoothing updating
rule x
t
= Fx
t−1
+ Q(y
t
− Hx
t−1
). Define θ
h
such that Var(e
n+h
| x
n
) = θ
h
σ
2
. Then it is readily seen
that V
h
= FV
h−1
F
0
+ QQ
0
Var(e
n+h
| x
n
) and so, by repeated substitution,
V
h
= σ
2
h−1
∑
j=0
F
j
QQ
0
(F
j
)
0
θ
h−j
and
HV
h−1
H
0
= σ
2
h−1
∑
j=1
c
2
j
θ
h−j
(14)
where $c_j = H F^{j-1} Q$. Now
$$e_{n+h} = \bigl(H(x_{n+h-1} - m_{h-1}) + H m_{h-1}\bigr)\varepsilon_{n+h},$$
which we square and take expectations to give $\theta_h = H V_{h-1} H' + \mu_h^2$. Substituting (14) into this expression for $\theta_h$ gives
$$\theta_h = \sigma^2 \sum_{j=1}^{h-1} c_j^2 \theta_{h-j} + \mu_h^2 \qquad (15)$$
where $\theta_1 = \mu_1^2$. The forecast variance is then given by
$$v_h = (1 + \sigma^2)\theta_h - \mu_h^2. \qquad (16)$$
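Equations (15) and (16) translate into a short loop once the forecast means $\mu_h$ and the coefficients $c_j$ are available. The sketch below assumes they are supplied as Python lists with `mu[h-1]` holding $\mu_h$ and `c[j-1]` holding $c_j$; the function name `theta_recursion` is illustrative.

```python
def theta_recursion(mu, c, sigma2):
    """Return (theta, v) with theta[h-1] = theta_h and v[h-1] = v_h for h = 1..len(mu).

    mu[h-1] is the forecast mean mu_h and c[j-1] is the coefficient c_j.
    """
    theta, v = [], []
    for h in range(1, len(mu) + 1):
        # (15): theta_h = sigma^2 * sum_{j=1}^{h-1} c_j^2 theta_{h-j} + mu_h^2
        s = sum(c[j - 1] ** 2 * theta[h - j - 1] for j in range(1, h))
        theta_h = sigma2 * s + mu[h - 1] ** 2
        theta.append(theta_h)
        # (16): v_h = (1 + sigma^2) theta_h - mu_h^2
        v.append((1.0 + sigma2) * theta_h - mu[h - 1] ** 2)
    return theta, v
```

For $h = 1$ the sum is empty, so the loop returns $\theta_1 = \mu_1^2$ and $v_1 = \sigma^2 \mu_1^2$.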
We now consider the particular cases.
MDA
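We first derive the results for the MDA case. Here the state is $x_t = (\ell_t, b_t, s_t, s_{t-1}, \dots, s_{t-m+1})'$, $H = [1, 1, 0, \dots, 0, 1]$,
$$F = \begin{bmatrix} 1 & 1 & 0_{m-1}' & 0 \\ 0 & \phi & 0_{m-1}' & 0 \\ 0 & 0 & 0_{m-1}' & 1 \\ 0_{m-1} & 0_{m-1} & I_{m-1} & 0_{m-1} \end{bmatrix} \quad\text{and}\quad G = \begin{bmatrix} \alpha & \alpha & 0_{m-1}' & \alpha \\ \alpha\beta & \alpha\beta & 0_{m-1}' & \alpha\beta \\ \gamma & \gamma & 0_{m-1}' & \gamma \\ 0_{m-1} & 0_{m-1} & O_{m-1} & 0_{m-1} \end{bmatrix}.$$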
Then from (4) we obtain $\mu_h = \ell_n + \phi_{h-1} b_n + s_{n-m+1+(h-1)^*}$ where $\phi_i = 1 + \phi + \cdots + \phi^i$ and $j^* = j \bmod m$.
To obtain the expression for $v_h$, note that this model satisfies the special case $G = QH$ where $Q = [\alpha, \alpha\beta, \gamma, 0_{m-1}']'$. Thus we can use the expression (16) where $c_j = H F^{j-1} Q = \alpha(1 + \phi_{j-1}\beta) + \gamma d_{j,m}$ (the same as $c_j$ for the corresponding model from Class 1).
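As an illustration, the MDA forecast means and the coefficients $c_j$ can be computed as below. This is a sketch under two assumptions: the seasonal states are passed oldest first as `[s_{n-m+1}, ..., s_n]`, and $d_{j,m}$ is taken to be 1 when $j$ is a multiple of $m$ and 0 otherwise, which is what the product $H F^{j-1} Q$ gives for the matrices above. The names `phi_sum` and `mda_mean_and_c` are hypothetical.

```python
def phi_sum(phi, i):
    """phi_i = 1 + phi + ... + phi^i."""
    return sum(phi ** k for k in range(i + 1))

def mda_mean_and_c(level, slope, seasonals, alpha, beta, gamma, phi, horizon):
    """Return (mu, c) with mu[h-1] = mu_h and c[h-1] = c_h for h = 1..horizon.

    seasonals = [s_{n-m+1}, ..., s_n] (oldest first), so m = len(seasonals).
    """
    m = len(seasonals)
    mu, c = [], []
    for h in range(1, horizon + 1):
        # mu_h = l_n + phi_{h-1} b_n + s_{n-m+1+(h-1)*}, where (h-1)* = (h-1) mod m
        mu.append(level + phi_sum(phi, h - 1) * slope + seasonals[(h - 1) % m])
        # c_h = alpha (1 + phi_{h-1} beta) + gamma d_{h,m}
        d = 1.0 if h % m == 0 else 0.0   # assumed definition of d_{h,m}
        c.append(alpha * (1.0 + phi_sum(phi, h - 1) * beta) + gamma * d)
    return mu, c
```

The two returned lists are exactly the inputs needed to iterate (15) and (16).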
Other cases
All other cases of Class 2 can be derived as special cases of MDA.
• For MDN, we use the results of MDA with $\gamma = 0$ and $s_t = 0$ for all $t$.
• For MAN, we use the results of MDN with $\phi = 1$.
• For MAA, the results of MDA hold with $\phi = 1$.
• The results for MNA are obtained as a special case of MAA with $\beta = 0$ and $b_t = 0$ for all $t$.
• The results for MNN are obtained from MAN by further setting $\beta = 0$ and $b_t = 0$ for all $t$.
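In this case, a simpler expression for $v_h$ can be obtained. Note that $c_j = \alpha$, $\theta_1 = \ell_n^2$ and for $j \ge 2$,
$$\theta_j = \ell_n^2 + \sigma^2 \alpha^2 \sum_{i=1}^{j-1} \theta_{j-i} = \ell_n^2 + \alpha^2 \sigma^2 (\theta_1 + \theta_2 + \cdots + \theta_{j-1}).$$
Hence
$$\theta_j = \ell_n^2 (1 + \alpha^2 \sigma^2)^{j-1}$$
and
$$v_h = \ell_n^2 \bigl[(1 + \alpha^2 \sigma^2)^{h-1}\bigr](1 + \sigma^2) - \ell_n^2 = \ell_n^2 \bigl[(1 + \alpha^2 \sigma^2)^{h-1}(1 + \sigma^2) - 1\bigr].$$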
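The MNN closed form is simple enough to evaluate directly. The sketch below also turns it into an interval of the form $\mu_h \pm z\sqrt{v_h}$ with $\mu_h = \ell_n$; the use of a normal quantile such as $z = 1.96$ is an assumption made here for illustration, and the function names are hypothetical.

```python
import math

def mnn_forecast_variance(level, alpha, sigma2, h):
    """v_h = l_n^2 [ (1 + alpha^2 sigma^2)^(h-1) (1 + sigma^2) - 1 ]."""
    return level ** 2 * ((1.0 + alpha ** 2 * sigma2) ** (h - 1) * (1.0 + sigma2) - 1.0)

def mnn_interval(level, alpha, sigma2, h, z=1.96):
    """Approximate interval mu_h +/- z sqrt(v_h), with mu_h = l_n.

    The normal quantile z is an illustrative assumption, not a result of the derivation.
    """
    half_width = z * math.sqrt(mnn_forecast_variance(level, alpha, sigma2, h))
    return level - half_width, level + half_width
```

At $h = 1$ the variance reduces to $\ell_n^2 \sigma^2$, and it grows with the horizon whenever $\alpha > 0$.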
Appendix C: Proofs of results for Class 3
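Note that we can write $Y_t$ as
$$Y_t = H_1 x_{t-1} z_{t-1}' H_2' (1 + \varepsilon_t).$$
So let $W_h = x_{n+h} z_{n+h}'$, $M_h = E(W_h \mid x_n, z_n)$ and $V_h = \mathrm{Var}(W_h \mid x_n, z_n)$ where (by standard definitions)
$$V_h = \mathrm{Var}(\mathrm{vec}\,W_h \mid x_n, z_n), \quad\text{and}\quad \mathrm{vec}\,A = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_r \end{bmatrix} \quad\text{where matrix } A = [a_1 \; a_2 \; \cdots \; a_r].$$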
Note that
$$\begin{aligned}
W_h &= (F_1 x_{n+h-1} + G_1 x_{n+h-1}\varepsilon_{n+h})(z_{n+h-1}' F_2' + z_{n+h-1}' G_2' \varepsilon_{n+h}) \\
&= F_1 W_{h-1} F_2' + (F_1 W_{h-1} G_2' + G_1 W_{h-1} F_2')\varepsilon_{n+h} + G_1 W_{h-1} G_2' \varepsilon_{n+h}^2.
\end{aligned}$$
It follows that $M_0 = x_n z_n'$ and
$$M_h = F_1 M_{h-1} F_2' + G_1 M_{h-1} G_2' \sigma^2. \qquad (17)$$
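Recursion (17) can be iterated from $M_0 = x_n z_n'$. The following sketch assumes $x_n$ and $z_n$ are given as flat Python lists and the coefficient matrices as lists of rows; the function name `class3_mean_matrices` is illustrative.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def class3_mean_matrices(F1, G1, F2, G2, x_n, z_n, sigma2, horizon):
    """Return [M_0, M_1, ..., M_horizon] from recursion (17)."""
    M = [[x * z for z in z_n] for x in x_n]      # M_0 = x_n z_n' (outer product)
    out = [M]
    for _ in range(horizon):
        FMF = matmul(matmul(F1, M), transpose(F2))   # F1 M_{h-1} F2'
        GMG = matmul(matmul(G1, M), transpose(G2))   # G1 M_{h-1} G2'
        M = [[a + sigma2 * b for a, b in zip(ra, rb)] for ra, rb in zip(FMF, GMG)]
        out.append(M)
    return out
```

When $G_1$ or $G_2$ is zero the second term vanishes and the recursion reduces to repeated multiplication by $F_1$ and $F_2'$.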
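For the variance of $W_h$, we find $V_0 = 0$, and
$$\begin{aligned}
V_h &= \mathrm{Var}\{\mathrm{vec}(F_1 W_{h-1} F_2') + \mathrm{vec}(F_1 W_{h-1} G_2' + G_1 W_{h-1} F_2')\varepsilon_{n+h} + \mathrm{vec}(G_1 W_{h-1} G_2')\varepsilon_{n+h}^2\} \\
&= (F_2 \otimes F_1) V_{h-1} (F_2 \otimes F_1)' \\
&\quad + (G_2 \otimes F_1 + F_2 \otimes G_1)\,\mathrm{Var}(\mathrm{vec}W_{h-1}\,\varepsilon_{n+h})\,(G_2 \otimes F_1 + F_2 \otimes G_1)' \\
&\quad + (G_2 \otimes G_1)\,\mathrm{Var}(\mathrm{vec}W_{h-1}\,\varepsilon_{n+h}^2)\,(G_2 \otimes G_1)' \\
&\quad + (F_2 \otimes F_1)\,\mathrm{Cov}(\mathrm{vec}W_{h-1}, \mathrm{vec}W_{h-1}\,\varepsilon_{n+h}^2)\,(G_2 \otimes G_1)' \\
&\quad + (G_2 \otimes G_1)\,\mathrm{Cov}(\mathrm{vec}W_{h-1}\,\varepsilon_{n+h}^2, \mathrm{vec}W_{h-1})\,(F_2 \otimes F_1)'.
\end{aligned}$$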
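Next we find that
$$\mathrm{Var}(\mathrm{vec}W_{h-1}\,\varepsilon_{n+h}) = E[\mathrm{vec}W_{h-1}(\mathrm{vec}W_{h-1})'\varepsilon_{n+h}^2] = \sigma^2\bigl(V_{h-1} + \mathrm{vec}M_{h-1}(\mathrm{vec}M_{h-1})'\bigr),$$
$$\begin{aligned}
\mathrm{Var}(\mathrm{vec}W_{h-1}\,\varepsilon_{n+h}^2) &= E\bigl(\mathrm{vec}W_{h-1}(\mathrm{vec}W_{h-1})'\varepsilon_{n+h}^4\bigr) - E(\mathrm{vec}W_{h-1})E(\mathrm{vec}W_{h-1})'\sigma^4 \\
&= 3\sigma^4\bigl(V_{h-1} + \mathrm{vec}M_{h-1}(\mathrm{vec}M_{h-1})'\bigr) - \mathrm{vec}M_{h-1}(\mathrm{vec}M_{h-1})'\sigma^4 \\
&= \sigma^4\bigl(3V_{h-1} + 2\,\mathrm{vec}M_{h-1}(\mathrm{vec}M_{h-1})'\bigr),
\end{aligned}$$
and
$$\begin{aligned}
\mathrm{Cov}(\mathrm{vec}W_{h-1}, \mathrm{vec}W_{h-1}\,\varepsilon_{n+h}^2) &= E\bigl(\mathrm{vec}W_{h-1}(\mathrm{vec}W_{h-1})'\varepsilon_{n+h}^2\bigr) - E(\mathrm{vec}W_{h-1})E(\mathrm{vec}W_{h-1})'\sigma^2 \\
&= \sigma^2\bigl(V_{h-1} + \mathrm{vec}M_{h-1}(\mathrm{vec}M_{h-1})'\bigr) - \sigma^2\,\mathrm{vec}M_{h-1}(\mathrm{vec}M_{h-1})' \\
&= \sigma^2 V_{h-1}.
\end{aligned}$$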
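It follows that
$$\begin{aligned}
V_h &= (F_2 \otimes F_1) V_{h-1} (F_2 \otimes F_1)' \\
&\quad + \sigma^2\bigl[(F_2 \otimes F_1) V_{h-1} (G_2 \otimes G_1)' + (G_2 \otimes G_1) V_{h-1} (F_2 \otimes F_1)'\bigr] \\
&\quad + \sigma^2 (G_2 \otimes F_1 + F_2 \otimes G_1)\bigl[V_{h-1} + \mathrm{vec}M_{h-1}(\mathrm{vec}M_{h-1})'\bigr](G_2 \otimes F_1 + F_2 \otimes G_1)' \\
&\quad + \sigma^4 (G_2 \otimes G_1)\bigl[3V_{h-1} + 2\,\mathrm{vec}M_{h-1}(\mathrm{vec}M_{h-1})'\bigr](G_2 \otimes G_1)'.
\end{aligned}$$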
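The forecast mean and variance are given by
$$\mu_h = E(Y_{n+h} \mid x_n, z_n) = H_1 M_{h-1} H_2'$$
and
$$\begin{aligned}
v_h = \mathrm{Var}(Y_{n+h} \mid x_n, z_n) &= \mathrm{Var}[\mathrm{vec}(H_1 W_{h-1} H_2' + H_1 W_{h-1} H_2' \varepsilon_{n+h})] \\
&= \mathrm{Var}[(H_2 \otimes H_1)\mathrm{vec}W_{h-1} + (H_2 \otimes H_1)\mathrm{vec}W_{h-1}\,\varepsilon_{n+h}] \\
&= (H_2 \otimes H_1)\bigl[V_{h-1}(1 + \sigma^2) + \sigma^2\,\mathrm{vec}M_{h-1}(\mathrm{vec}M_{h-1})'\bigr](H_2' \otimes H_1') \\
&= (1 + \sigma^2)(H_2 \otimes H_1) V_{h-1} (H_2 \otimes H_1)' + \sigma^2 \mu_h^2.
\end{aligned}$$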
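The exact Class 3 recursions can be iterated numerically with Kronecker products. The sketch below uses plain Python lists, takes $x_n$ and $z_n$ as flat lists, and stacks columns for $\mathrm{vec}$; the $3\sigma^4$ term in the recursion corresponds to the fourth moment of a Gaussian error. All function names are illustrative rather than taken from the paper.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(r, c)) for c in zip(*B)] for r in A]

def T(A):
    return [list(c) for c in zip(*A)]

def add(*Ms):
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*Ms)]

def scale(k, A):
    return [[k * a for a in r] for r in A]

def kron(A, B):
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def vec(A):
    """Stack the columns of A into a single column."""
    return [[A[i][j]] for j in range(len(A[0])) for i in range(len(A))]

def class3_moments(F1, G1, F2, G2, H1, H2, x_n, z_n, sigma2, horizon):
    """Return (means, variances) with entries mu_h and v_h for h = 1..horizon."""
    M = [[x * z for z in z_n] for x in x_n]          # M_0 = x_n z_n'
    n = len(x_n) * len(z_n)
    V = [[0.0] * n for _ in range(n)]                # V_0 = 0
    FF, GG = kron(F2, F1), kron(G2, G1)
    GF = add(kron(G2, F1), kron(F2, G1))
    HH = kron(H2, H1)
    means, variances = [], []
    for _ in range(horizon):
        mu = matmul(matmul(H1, M), T(H2))[0][0]      # mu_h = H1 M_{h-1} H2'
        hvh = matmul(matmul(HH, V), T(HH))[0][0]
        means.append(mu)
        variances.append((1 + sigma2) * hvh + sigma2 * mu * mu)
        vm = vec(M)
        mm = matmul(vm, T(vm))                       # vecM_{h-1} (vecM_{h-1})'
        V = add(
            matmul(matmul(FF, V), T(FF)),
            scale(sigma2, add(matmul(matmul(FF, V), T(GG)),
                              matmul(matmul(GG, V), T(FF)))),
            scale(sigma2, matmul(matmul(GF, add(V, mm)), T(GF))),
            scale(sigma2 ** 2, matmul(matmul(GG, add(scale(3, V), scale(2, mm))), T(GG))),
        )
        # (17): M_h = F1 M_{h-1} F2' + sigma^2 G1 M_{h-1} G2'
        M = add(matmul(matmul(F1, M), T(F2)),
                scale(sigma2, matmul(matmul(G1, M), T(G2))))
    return means, variances
```

The matrix $V_h$ has dimension equal to the product of the lengths of $x_n$ and $z_n$, so this direct approach is practical mainly for short seasonal periods.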
When $\sigma$ is small (much less than 1), it is possible to obtain some simpler but approximate expressions. The second term in (17) can be dropped to give $M_{h-1} = F_1^{h-1} M_0 (F_2^{h-1})'$ and so
$$\mu_h \approx H_1 F_1^{h-1} x_n (H_2 F_2^{h-1} z_n)'.$$
The order of this approximation can be obtained by noting that the observation equation may be written as $Y_t = U_{1,t} U_{2,t} U_{3,t}$ where $U_{1,t} = H_1 x_{t-1}$, $U_{2,t} = H_2 z_{t-1}$ and $U_{3,t} = 1 + \varepsilon_t$. Then
$$E(Y_t) = E(U_{1,t} U_{2,t} U_{3,t}) = E(U_{1,t} U_{2,t}) E(U_{3,t})$$
since $U_{3,t}$ is independent of $U_{1,t}$ and $U_{2,t}$. Since $E(U_{1,t} U_{2,t}) = E(U_{1,t}) E(U_{2,t}) + \mathrm{Cov}(U_{1,t}, U_{2,t})$, we have the approximation:
$$\mu_h = E(Y_{n+h} \mid x_n, z_n) = E(U_{1,n+h} \mid x_n) E(U_{2,n+h} \mid z_n) E(U_{3,n+h}) + O(\sigma^2).$$
When $U_{2,n+h}$ is constant the result is exact. Now let
$$\begin{aligned}
\mu_{1,h} &= E(U_{1,n+h+1} \mid x_n) = E(H_1 x_{n+h} \mid x_n) = H_1 F_1^h x_n \\
\mu_{2,h} &= E(U_{2,n+h+1} \mid z_n) = E(H_2 z_{n+h} \mid z_n) = H_2 F_2^h z_n \\
v_{1,h} &= \mathrm{Var}(U_{1,n+h+1} \mid x_n) = \mathrm{Var}(H_1 x_{n+h} \mid x_n) \\
v_{2,h} &= \mathrm{Var}(U_{2,n+h+1} \mid z_n) = \mathrm{Var}(H_2 z_{n+h} \mid z_n)
\end{aligned}$$
and
$$v_{12,h} = \mathrm{Cov}(U_{1,n+h+1}^2, U_{2,n+h+1}^2 \mid x_n, z_n) = \mathrm{Cov}([H_1 x_{n+h}]^2, [H_2 z_{n+h}]^2 \mid x_n, z_n).$$
Then
$$\mu_h = \mu_{1,h-1}\,\mu_{2,h-1} + O(\sigma^2) = H_1 F_1^{h-1} x_n \, H_2 F_2^{h-1} z_n + O(\sigma^2).$$
By the same arguments, we have
$$E(Y_t^2) = E(U_{1,t}^2 U_{2,t}^2 U_{3,t}^2) = E(U_{1,t}^2 U_{2,t}^2) E(U_{3,t}^2)$$
and
$$\begin{aligned}
E(Y_{n+h}^2 \mid z_n, x_n) &= E(U_{1,n+h}^2 U_{2,n+h}^2 \mid x_n, z_n)\,E(U_{3,n+h}^2) \\
&= \bigl[\mathrm{Cov}(U_{1,n+h}^2, U_{2,n+h}^2 \mid x_n, z_n) + E(U_{1,n+h}^2 \mid x_n) E(U_{2,n+h}^2 \mid z_n)\bigr] E(U_{3,n+h}^2) \\
&= (1 + \sigma^2)\bigl[v_{12,h-1} + (v_{1,h-1} + \mu_{1,h-1}^2)(v_{2,h-1} + \mu_{2,h-1}^2)\bigr].
\end{aligned}$$
Assuming the covariance $v_{12,h-1}$ is small compared to the other terms we obtain
$$v_h \approx (1 + \sigma^2)(v_{1,h-1} + \mu_{1,h-1}^2)(v_{2,h-1} + \mu_{2,h-1}^2) - \mu_{1,h-1}^2 \mu_{2,h-1}^2.$$
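The small-$\sigma$ approximation is a one-line computation once the component means and variances at lead $h-1$ are known. The sketch below simply evaluates the formula; the function name and argument names are illustrative.

```python
def approx_class3_variance(mu1, v1, mu2, v2, sigma2):
    """Approximate v_h from mu_{1,h-1}, v_{1,h-1}, mu_{2,h-1}, v_{2,h-1}.

    v_h ~ (1 + sigma^2)(v1 + mu1^2)(v2 + mu2^2) - mu1^2 mu2^2,
    which ignores the covariance term v_{12,h-1}.
    """
    return (1.0 + sigma2) * (v1 + mu1 ** 2) * (v2 + mu2 ** 2) - mu1 ** 2 * mu2 ** 2
```

When the second component is the constant 1 (`mu2 = 1`, `v2 = 0`), the expression collapses to $(1 + \sigma^2)(v_{1,h-1} + \mu_{1,h-1}^2) - \mu_{1,h-1}^2$, which has the same shape as (16).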
We now consider the particular cases.
MDM
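We first derive the results for the MDM case where $x_t = (\ell_t, b_t)'$ and $z_t = (s_t, \dots, s_{t-m+1})'$, and the matrix coefficients are $H_1 = [1, 1]$, $H_2 = [0, \dots, 0, 1]$,
$$F_1 = \begin{bmatrix} 1 & 1 \\ 0 & \phi \end{bmatrix}, \quad F_2 = \begin{bmatrix} 0_{m-1}' & 1 \\ I_{m-1} & 0_{m-1} \end{bmatrix}, \quad G_1 = \begin{bmatrix} \alpha & \alpha \\ \alpha\beta & \alpha\beta \end{bmatrix}, \quad\text{and}\quad G_2 = \begin{bmatrix} 0_{m-1}' & \gamma \\ O_{m-1} & 0_{m-1} \end{bmatrix}.$$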
Many terms will be zero in the formulas for the expected value and the variance because of the following relationships: $G_2^2 = O_m$, $H_2 G_2 = 0_m'$, and $(H_2 \otimes H_1)(G_2 \otimes X) = 0_{2m}'$ where $X$ is any $2 \times 2$ matrix. For the terms that remain, $H_2 \otimes H_1$ and its transpose will only use the terms from the last two rows of the last two columns of the large matrices because $H_2 \otimes H_1 = [0_{2m-2}', 1, 1]$.
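These structural identities are easy to confirm numerically for a particular seasonal period. The sketch below builds $H_2$, $F_2$ and $G_2$ for $m = 4$ and checks each relationship; the helper `mdm_seasonal_matrices`, the value of $\gamma$ and the matrix `X` are arbitrary choices made for illustration.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(r, c)) for c in zip(*B)] for r in A]

def kron(A, B):
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def is_zero(A):
    return all(abs(a) < 1e-12 for row in A for a in row)

def mdm_seasonal_matrices(m, gamma):
    """Build H2, F2, G2 of the MDM model for seasonal period m (m >= 2)."""
    H2 = [[0.0] * (m - 1) + [1.0]]
    # First row [0', 1]; remaining rows [I_{m-1}, 0]
    F2 = [[0.0] * (m - 1) + [1.0]] + [
        [1.0 if j == i else 0.0 for j in range(m)] for i in range(m - 1)
    ]
    # Only the top-right entry of G2 is non-zero
    G2 = [[0.0] * (m - 1) + [gamma]] + [[0.0] * m for _ in range(m - 1)]
    return H2, F2, G2

m, gamma = 4, 0.1
H1 = [[1.0, 1.0]]
X = [[0.7, -1.2], [3.0, 0.4]]            # arbitrary 2 x 2 matrix
H2, F2, G2 = mdm_seasonal_matrices(m, gamma)
checks = {
    "G2^2 = O_m": is_zero(matmul(G2, G2)),
    "H2 G2 = 0'": is_zero(matmul(H2, G2)),
    "(H2 x H1)(G2 x X) = 0'": is_zero(matmul(kron(H2, H1), kron(G2, X))),
    "H2 x H1 = [0', 1, 1]": kron(H2, H1) == [[0.0] * (2 * m - 2) + [1.0, 1.0]],
}
```

Every entry of `checks` should be `True`; the third identity follows from the second because $(H_2 \otimes H_1)(G_2 \otimes X) = (H_2 G_2) \otimes (H_1 X)$.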
Using the small $\sigma$ approximations and exploiting the structure of the MDM model, we can obtain simpler expressions that approximate $\mu_h$ and $v_h$.
Note that $H_2 F_2^j G_2 = \gamma d_{j+1,m} H_2$. So for $h < m$, we have
$$H_2 z_{n+h} \mid z_n = H_2 \prod_{j=1}^{h} (F_2 + G_2 \varepsilon_{n+h-j+1}) z_n = H_2 F_2^h z_n = s_{n-m+h+1}.$$
Furthermore,
$$\mu_{2,h} = s_{n-m+1+h^*} \quad\text{and}\quad v_{2,h} = s_{n-m+1+h^*}^2 \bigl[(1 + \gamma^2 \sigma^2)^k - 1\bigr]$$
where $k = \lfloor (h-1)/m \rfloor$.
Also note that $x_n$ has the same properties as for MDN in Class 2. Thus
$$\mu_{1,h} = \ell_n + \phi_{h-1} b_n \quad\text{and}\quad v_{1,h} = (1 + \sigma^2)\theta_h - \mu_{1,h}^2.$$
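Combining all the terms, we arrive at the approximations
$$\mu_h = \tilde{\mu}_h \, s_{n-m+1+(h-1)^*} + O(\sigma^2)$$
and
$$v_h \approx s_{n-m+1+(h-1)^*}^2 \bigl[\theta_h (1 + \sigma^2)(1 + \gamma^2 \sigma^2)^k - \tilde{\mu}_h^2\bigr]$$
where $\tilde{\mu}_h = \ell_n + \phi_{h-1} b_n$, $\theta_1 = \tilde{\mu}_1^2$, and
$$\theta_h = \tilde{\mu}_h^2 + \sigma^2 \alpha^2 \sum_{j=1}^{h-1} (1 + \phi_{j-1}\beta)^2 \theta_{h-j}, \quad h \ge 2.$$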
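The MDM approximations can be evaluated with a short loop. The sketch below assumes the seasonal states are passed oldest first as `[s_{n-m+1}, ..., s_n]` and takes $k = \lfloor (h-1)/m \rfloor$ as stated above; the $O(\sigma^2)$ remainder in $\mu_h$ is ignored, and the function name `mdm_approx` is illustrative.

```python
def mdm_approx(level, slope, seasonals, alpha, beta, gamma, phi, sigma2, horizon):
    """Return (mu, v): approximate forecast means and variances for h = 1..horizon.

    seasonals = [s_{n-m+1}, ..., s_n] (oldest first), so m = len(seasonals).
    """
    m = len(seasonals)

    def phi_sum(i):
        # phi_i = 1 + phi + ... + phi^i
        return sum(phi ** r for r in range(i + 1))

    theta, mu, v = [], [], []
    for h in range(1, horizon + 1):
        mu_tilde = level + phi_sum(h - 1) * slope
        # theta_h = mu~_h^2 + sigma^2 alpha^2 sum_{j=1}^{h-1} (1 + phi_{j-1} beta)^2 theta_{h-j}
        th = mu_tilde ** 2 + sigma2 * alpha ** 2 * sum(
            (1.0 + phi_sum(j - 1) * beta) ** 2 * theta[h - j - 1] for j in range(1, h)
        )
        theta.append(th)
        s = seasonals[(h - 1) % m]       # s_{n-m+1+(h-1)*}
        k = (h - 1) // m                 # k = floor((h-1)/m)
        mu.append(mu_tilde * s)
        v.append(s ** 2 * (th * (1.0 + sigma2) * (1.0 + gamma ** 2 * sigma2) ** k
                           - mu_tilde ** 2))
    return mu, v
```

With all seasonal factors equal to 1 and $\beta = \gamma = 0$, $b_n = 0$, the variances coincide with the MNN closed form from Class 2.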
These expressions are exact for $h \le m$. Also for $h \le m$, the formulas agree with those in Koehler, Ord, and Snyder (2001) and Chatfield and Yar (1991), if the $O(\sigma^4)$ terms are dropped from the expression.
Other cases
The other cases of Class 3 can be derived as special cases of MDM.
• For MAM, we use the results of MDM with φ = 1.
• The results for MNM are obtained as a special case of MAM with $\beta = 0$ and $b_t = 0$ for all $t$. The simpler expression for $v_h$ is obtained as for MNN in Class 2.
References
CHATFIELD, C. and M. YAR (1991) Prediction intervals for multiplicative Holt-Winters, Int. J. Forecasting, 7, 31–37.
HYNDMAN, R.J., A.B. KOEHLER, R.D. SNYDER and S. GROSE (2001) A state space framework for automatic forecasting using exponential smoothing methods, International J. Forecasting, to appear.
JOHNSTON, F.R. and P.J. HARRISON (1986) The variance of leadtime demand, J. Opl. Res. Soc., 37, 303–308.
KOEHLER, A.B., R.D. SNYDER and J.K. ORD (2001) Forecasting models and prediction intervals for the multiplicative Holt-Winters method, International J. Forecasting, 17, 269–286.
MAKRIDAKIS, S., S.C. WHEELWRIGHT and R.J. HYNDMAN (1998) Forecasting: methods and applications, 3rd edition, John Wiley & Sons: New York.
MCKENZIE, E. (1976) A comparison of some standard seasonal forecasting systems, The Statistician, 25, 3–14.
ORD, J.K., A.B. KOEHLER and R.D. SNYDER (1997) Estimation and prediction for a class of dynamic nonlinear statistical models, J. Amer. Statist. Assoc., 92, 1621–1629.
ROBERTS, S.A. (1982) A general class of Holt-Winters type forecasting models, Management Science, 28, 808–820.
YAR, M. and C. CHATFIELD (1990) Prediction intervals for the Holt-Winters forecasting procedure, Int. J. Forecasting, 6, 127–137.