Please cite this paper as:
Svetunkov, I. and Kourentzes, N. (2018). Complex Exponential Smoothing for Seasonal Time Series. Management Science Working Paper 2018:1, Lancaster University Management School.
Complex Exponential Smoothing for
Seasonal Time Series
Ivan Svetunkov, Nikolaos Kourentzes
The Department of Management Science
Lancaster University Management School
Lancaster LA1 4YX
UK
© Ivan Svetunkov, Nikolaos Kourentzes
All rights reserved. Short sections of text, not to exceed
two paragraphs, may be quoted without explicit permission,
provided that full acknowledgment is given.
LUMS home page: http://www.lums.lancs.ac.uk
Complex Exponential Smoothing for Seasonal Time
Series
Ivan Svetunkova,∗, Nikolaos Kourentzesa
aLancaster University Management School
Department of Management Science, Lancaster, LA1 4YX, UK
Abstract
The general seasonal Complex Exponential Smoothing (CES) is presented
in this paper. CES is based on conventional exponential smoothing and the
notion of the “information potential”, an unobservable part of time series
that the classical models do not consider. The proposed seasonal CES can
capture known forms of seasonality, as well as new ones that are neither
strictly additive nor multiplicative. In contrast to exponential smoothing,
CES can capture both stationary and non-stationary processes, giving it
greater modelling flexibility. In order to choose between the seasonal and
non-seasonal CES a model selection procedure is discussed in the paper.
An empirical evaluation of the performance of the model against ETS and
ARIMA on simulated and real data is carried out. The findings suggest that
CES substantially simplifies model selection, and as a result the forecasting
process, while performing better than the benchmarks in terms of forecasting
accuracy.
Keywords: Exponential smoothing, model selection, complex variables,
seasonality, complex exponential smoothing, information potential
1. Introduction
Exponential smoothing (ETS) is one of the most popular families of models
used both in research and practice. The variety of the exponential smoothing
∗Correspondence: I. Svetunkov, Department of Management Science, Lancaster University Management School, Lancaster, Lancashire, LA1 4YX, UK.
Email address: i.svetunkov@lancaster.ac.uk (Ivan Svetunkov)
Preprint submitted to International Journal of Forecasting July 15, 2016
models is wide and allows modelling different types of time series components.
Hyndman et al. (2008) present a taxonomy which leads to 30 models with
different types of error, trend and seasonal components.
ETS is not free of modelling challenges. First, the large number of model
forms introduces a selection problem. Although the model variety allows
capturing different types of processes, at the same time it makes it difficult
to select a correct one for a time series. This is usually addressed by using an
information criterion, typically the Akaike Information Criterion (Hyndman
et al., 2002), though Billah et al. (2006) showed that using other informa-
tion criteria did not lead to significant differences in forecasting performance.
However, recent research has shown that choosing the single most appropriate exponential smoothing model for a time series may not lead to the most accurate forecast. As a result, various combination approaches have been proposed in the literature. For example, Kolassa (2011) investigated combinations of the different ETS forecasts using Akaike weights, with good results.
Second, it is assumed in the ETS framework that any time series may be
decomposed into level, trend and seasonal components, which in real life
are arbitrary and unobservable. For example, it may not always be easy
to identify whether a series exhibits a changing level or a trend (Hyndman
et al., 2008). Similar complications are relevant to identifying the seasonal
component and its nature: whether it is additive or multiplicative.
Third, the combination approaches highlight that there may be composite
ETS forms that are not captured by the known models. The combinations
described by Kolassa (2011) result in such non-customary trend and season-
ality forms. Kourentzes et al. (2014) argue that full identification of trend
and seasonality is not straightforward with conventional modelling, showing
the benefits of using multiple levels of temporal aggregation to that pur-
pose, demonstrating forecasting performance improvements. Kourentzes &
Petropoulos (2015) proceed to show that this problem is more acute under
the presence of extreme values, such as outliers due to promotional activities,
where again a similar combination approach leads to more accurate forecasts.
We argue that these composite forms of exponential smoothing perform well
because the type of time series components, as well as the interaction between
them, may be too restrictive under ETS for some time series.
In this paper we aim to overcome these problems by using a more general
exponential smoothing formulation that has been proposed by Svetunkov
& Kourentzes (2015). It assumes that any time series can be modelled as
a combination of the observed series value and an unobserved information potential. This
leads to a less restrictive model without an arbitrary decomposition of time
series. The authors implement this idea using complex variables, proposing
the Complex Exponential Smoothing (CES). Their investigation is limited to
non-seasonal time series, where they find that CES can accurately model time
series without arbitrarily separating them into level and trend ones. Another
crucial advantage of CES over conventional ETS is that it can capture both
stationary and non-stationary processes.
Here we extend CES for seasonal time series, which leads to a family of
CES models that can model all types of level, trend, seasonal and trend-seasonal
time series in the conventional ETS classification. In contrast to ETS,
CES has only two forms (non-seasonal and seasonal) and hence we deal with
a simpler model selection problem that allows capturing different types of
components and interactions between them. To test if the extended CES is
capable of modelling appropriate time series structures, we conduct a sim-
ulation study and find that the simplified selection problem leads to better
results in comparison with conventional ETS. We then evaluate the forecast-
ing performance of extended CES against established benchmarks and find
it to produce more accurate forecasts on the monthly M3 dataset. We argue
that even though the formulation of CES appears to be more complicated
than individual ETS models, the substantially simplified selection problem
and its good accuracy makes it appealing for practice.
The rest of the paper is organised as follows: section 2 introduces the
Complex Exponential Smoothing model and its seasonal extension. Section
3 presents the setup and provides the results of empirical evaluations on
simulated and real data. This section is then followed by concluding remarks.
2. Complex Exponential Smoothing
2.1. Information Potential
A fundamental idea behind CES is the Information Potential (Svetunkov
& Kourentzes, 2015). Any measured time series contains some information,
which may be less than the time series in its totality and potentially
unobservable due to sampling. For example, this might be due to some
unobserved long-memory process or other more exotic structures.
It was shown that there is a convenient way to write the measured series and the information potential using a complex variable: $y_t + ip_t$ (Svetunkov, 2012), where $y_t$ is the actual value of the series, $p_t$ is the information potential at observation $t$ and $i$ is the imaginary unit, the number that satisfies the equation $i^2 = -1$. Svetunkov & Kourentzes (2015) proposed using the value of the error term as a proxy for the information potential. This leads to desirable properties of CES that are discussed below.
2.2. Complex Exponential Smoothing
A complex counterpart to conventional exponential smoothing can be
developed as:
$$
\hat{y}_{t+1} + i\hat{p}_{t+1} = (\alpha_0 + i\alpha_1)(y_t + ip_t) + (1 - \alpha_0 + i - i\alpha_1)(\hat{y}_t + i\hat{p}_t), \qquad (1)
$$

where $\hat{y}_t$ is the forecast of the actual series, $\hat{p}_t$ is the estimate of the information potential and $\alpha_0 + i\alpha_1$ is the complex smoothing parameter. Using the aforementioned proxy $p_t = \varepsilon_t$, Svetunkov & Kourentzes (2015) derived the state-space model of CES that can be used to further explore its properties. It can be written in the following way:
$$
\begin{aligned}
y_t &= l_{t-1} + \varepsilon_t \\
l_t &= l_{t-1} - (1 - \alpha_1) c_{t-1} + (\alpha_0 - \alpha_1) \varepsilon_t \\
c_t &= l_{t-1} + (1 - \alpha_0) c_{t-1} + (\alpha_0 + \alpha_1) \varepsilon_t,
\end{aligned} \qquad (2)
$$

where $l_t$ is the level component, $c_t$ is the information component at observation $t$, and $\varepsilon_t \sim N(0, \sigma^2)$. This state-space form of CES can be written in a more general form:
$$
\begin{aligned}
y_t &= w' x_{t-1} + \varepsilon_t \\
x_t &= F x_{t-1} + g \varepsilon_t,
\end{aligned} \qquad (3)
$$

where $x_t = \begin{pmatrix} l_t \\ c_t \end{pmatrix}$ is the state vector, $F = \begin{pmatrix} 1 & \alpha_1 - 1 \\ 1 & 1 - \alpha_0 \end{pmatrix}$ is the transition matrix, $g = \begin{pmatrix} \alpha_0 - \alpha_1 \\ \alpha_0 + \alpha_1 \end{pmatrix}$ is the persistence vector and $w = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ is the measurement vector.
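To make the recursion concrete, the filtering form of (2) can be sketched in Python. This is our own minimal illustration, not the authors' `CES` R package implementation; the initial states are arbitrary:

```python
import numpy as np

def ces_filter(y, a0, a1, l0=0.0, c0=0.0):
    """One-step CES filter following the state-space recursion (2):
    the error is e_t = y_t - l_{t-1}, after which the level and the
    information components are updated simultaneously."""
    l, c = l0, c0
    errors, levels = [], []
    for yt in y:
        e = yt - l
        l, c = (l - (1 - a1) * c + (a0 - a1) * e,
                l + (1 - a0) * c + (a0 + a1) * e)
        errors.append(e)
        levels.append(l)
    return np.array(errors), np.array(levels)

# With a1 = 1 the level equation loses its dependence on c, so a constant
# series starting at the initial level yields zero one-step errors.
errs, levs = ces_filter([5.0] * 10, a0=1.2, a1=1.0, l0=5.0)
```

The simultaneous tuple assignment matters: both updates use the states from observation $t-1$, as in the transition equation.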
One of the main features of CES is that it does not contain an explicit
trend: its ltand ctcomponents are connected with each other and change in
time depending on the value of the complex smoothing parameter. It can be
shown that CES has an underlying ARIMA(2,0,2) model:
$$
\begin{aligned}
(1 - \varphi_1 B - \varphi_2 B^2) y_t &= (1 - \theta_{1,1} B - \theta_{1,2} B^2) \varepsilon_t \\
(1 - \varphi_1 B - \varphi_2 B^2) \xi_t &= (1 - \theta_{2,1} B - \theta_{2,2} B^2) \varepsilon_t,
\end{aligned} \qquad (4)
$$

where $B$ is the backshift operator, $\varphi_1 = 2 - \alpha_0$, $\varphi_2 = \alpha_0 + \alpha_1 - 2$, $\theta_{1,1} = 2 - 2\alpha_0 + \alpha_1$, $\theta_{1,2} = 3\alpha_0 + \alpha_1 - 2 - \alpha_0^2 - \alpha_1^2$, $\theta_{2,1} = 2 + \alpha_1$, $\theta_{2,2} = \alpha_0 - \alpha_1 - 2$, and $\xi_t = p_t - c_{t-1}$ is an information gap, the value showing the amount of information missing in the information component $c_t$.

Figure 1: CES stability (the black area) and stationarity (the light triangle) conditions.
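The mapping from the complex smoothing parameter to the AR coefficients of the reduced ARIMA(2,0,2) form can be checked numerically. The sketch below is our own Python illustration: it computes $\varphi_1$ and $\varphi_2$ and tests stationarity of the AR(2) part through the companion-matrix eigenvalues:

```python
import numpy as np

def ar_coefficients(a0, a1):
    """AR coefficients of the reduced form: phi_1 = 2 - a0, phi_2 = a0 + a1 - 2."""
    return 2 - a0, a0 + a1 - 2

def ar_part_stationary(a0, a1):
    """The AR(2) part is stationary when the eigenvalues of its companion
    matrix lie strictly inside the unit circle."""
    phi1, phi2 = ar_coefficients(a0, a1)
    companion = np.array([[phi1, phi2], [1.0, 0.0]])
    return bool(np.all(np.abs(np.linalg.eigvals(companion)) < 1))
```

For example, $\alpha_0 + i\alpha_1 = 1.3 + 0.5i$ gives $\varphi_1 = 0.7$, $\varphi_2 = -0.2$ and a stationary AR part, while $1 + 1.05i$ does not, in line with the stationarity conditions discussed next.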
However the parameter space for the autoregressive terms in this model
differs from the conventional AR(2) model due to the connection of AR and
MA terms via the complex smoothing parameter. It should be noted that the stationarity condition is not essential for CES, which means that the roots of the characteristic equation in (4) may lie inside the unit circle. This gives the model additional flexibility and allows it to slide between level and trend time series rather than switch between them. This in turn means that CES can be both stationary and non-stationary, depending on the value of the complex smoothing parameter, while all the conventional ETS models are strictly non-stationary.
Stability condition                               Stationarity condition
$(\alpha_0 - 2.5)^2 + \alpha_1^2 > 1.25$          $\alpha_1 < 5 - 2\alpha_0$
$(\alpha_0 - 0.5)^2 + (\alpha_1 - 1)^2 > 0.25$    $\alpha_1 < 1$
$(\alpha_0 - 1.5)^2 + (\alpha_1 - 0.5)^2 < 1.5$   $\alpha_1 > 1 - \alpha_0$

Table 1: Stability and stationarity conditions of CES.
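The inequalities in Table 1 translate directly into code. A small Python sketch (our own illustration):

```python
def ces_stable(a0, a1):
    """Stability region of Table 1: all three circle inequalities must hold."""
    return ((a0 - 2.5) ** 2 + a1 ** 2 > 1.25
            and (a0 - 0.5) ** 2 + (a1 - 1) ** 2 > 0.25
            and (a0 - 1.5) ** 2 + (a1 - 0.5) ** 2 < 1.5)

def ces_stationary(a0, a1):
    """Stationarity triangle of Table 1."""
    return a1 < 5 - 2 * a0 and a1 < 1 and a1 > 1 - a0
```

For example, $\alpha_0 + i\alpha_1 = 1.3 + 0.5i$ satisfies both sets of conditions, while $1 + 1.05i$ violates $\alpha_1 < 1$ and is therefore not stationary.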
The stability and stationarity conditions for CES on the plane of complex smoothing parameters are shown in Figure 1. They correspond to the inequalities shown in Table 1. Note that while we restrict the CES parameters to the stability region, we do not impose a similar restriction for stationarity, allowing the model to capture and produce long-term trends.

Figure 2: Data simulated using CES with different complex smoothing parameter values ($\alpha_0 \in \{0.2, 1, 1.8\}$, $\alpha_1 \in \{0.99, 1, 1.01\}$).
CES can produce different types of trajectories depending on the complex smoothing parameter value. Some example trajectories are shown in Figure 2. Note that the imaginary part of the complex smoothing parameter determines the direction of the trend (or its absence when $\alpha_1 = 1$), while the real part influences the steepness of the trend. When $\alpha_0 < 1$ the trend has an obvious exponential character. When $\alpha_0 = 1$ the trend slows down but still shows the features of the exponential function. Finally, when $\alpha_0 > 1$ the series reveals features of an additive trend with a possible change of level and slope. One of the interesting findings is that when $\alpha_0 + i\alpha_1 = 1 + i$ the generated series becomes stationary. Another feature of CES is that the behaviour of the trend with $\alpha_1 < 1$ resembles the damped trend model. Finally, if for some reason the level of the series becomes negative, then the same complex smoothing parameter values will result in the opposite trajectory direction (for example, $\alpha_0 < 1$ will result in an increase rather than a decline).
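These trajectory types can be reproduced with a noise-free simulation of the transition equations. The Python sketch below is our own illustration; the initial states are arbitrary choices:

```python
import numpy as np

def simulate_ces(a0, a1, l0, c0, n=50, errors=None):
    """Noise-free (or error-driven) trajectory from the CES transition
    equations; the observation is y_t = l_{t-1} + e_t."""
    errors = np.zeros(n) if errors is None else np.asarray(errors, dtype=float)
    l, c, path = l0, c0, []
    for e in errors:
        path.append(l + e)
        l, c = (l - (1 - a1) * c + (a0 - a1) * e,
                l + (1 - a0) * c + (a0 + a1) * e)
    return np.array(path)

flat = simulate_ces(a0=1.0, a1=1.0, l0=100.0, c0=50.0)       # a1 = 1: no trend
decline = simulate_ces(a0=1.0, a1=0.95, l0=100.0, c0=100.0)  # a1 < 1, c > 0: drift
```

With $\alpha_1 = 1$ the level carries itself forward unchanged, so the trajectory is flat; moving $\alpha_1$ away from 1 couples the level to the information component and a trend appears.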
We should also point out that CES is unable to produce trajectories with changing signs of long-term trends, because the imaginary part of the complex smoothing parameter defines the long-term direction of the trend in the data. This can be considered a drawback when an analyst expects a radical change of the long-term trend in the data. But it is also a benefit of the model, because it allows capturing long-term dependencies more clearly than many other commonly used models do and does not react rapidly to possible changes in short-term trends.
This example shows that CES is capable of capturing different types of
trends, as well as level series (in an ETS context). Svetunkov & Kourentzes (2015) showed empirically that a single CES model can be used instead of several ETS models with different types of trends for common forecasting tasks. This
is one of the major advantages of the model. When disregarding seasonal
time series no model selection is needed for CES, in contrast to conventional
exponential smoothing.
2.3. Seasonal CES
Here we extend the CES model presented above to cater for seasonality.
The simplest way to derive a seasonal model using CES is to take the values of the level and information components with a lag $m$, corresponding to the lag of seasonality, instead of $t-1$:

$$
\begin{aligned}
y_t &= l_{1,t-m} + \varepsilon_t \\
l_{1,t} &= l_{1,t-m} - (1 - \beta_1) c_{1,t-m} + (\beta_0 - \beta_1) \varepsilon_t \\
c_{1,t} &= l_{1,t-m} + (1 - \beta_0) c_{1,t-m} + (\beta_0 + \beta_1) \varepsilon_t,
\end{aligned} \qquad (5)
$$

where $l_{1,t}$ is the level component, $c_{1,t}$ is the information component at observation $t$, and $\beta_0$ and $\beta_1$ are smoothing parameters. This formulation follows similar ideas to the reduced seasonal exponential smoothing forms of Snyder & Shami (2001) and Ord & Fildes (2012).
The model (5) preserves all the features of the original CES (2) and pro-
duces all the possible CES trajectories with the same complex smoothing
parameter values. These trajectories, however, will be lagged and will ap-
pear for each seasonal element instead of each observation. This means that
the model (5) produces non-linear seasonality which can approximate both
additive and multiplicative seasonality depending on the complex smoothing
parameter and initial components values. Note that the seasonality produced
by the model may even have these features regardless of the value of the level,
thus generating seasonalities that cannot be classified as either additive or
multiplicative. Naturally, no known exponential smoothing model can pro-
duce similar patterns: multiplicative seasonality implies that the amplitude
of fluctuations increases with the increase of the level, while with seasonal
CES the amplitude may increase without it.
However, this model is not flexible enough: it demonstrates the full variety of possible seasonality changes only when the level of the series is equal to zero (meaning that part of the data lies in the positive and another part in the negative half-plane). To overcome this limitation we extend the original model (2) with the basic seasonal model (5) into one general seasonal CES model:
$$
\begin{aligned}
y_t &= l_{0,t-1} + l_{1,t-m} + \varepsilon_t \\
l_{0,t} &= l_{0,t-1} - (1 - \alpha_1) c_{0,t-1} + (\alpha_0 - \alpha_1) \varepsilon_t \\
c_{0,t} &= l_{0,t-1} + (1 - \alpha_0) c_{0,t-1} + (\alpha_0 + \alpha_1) \varepsilon_t \\
l_{1,t} &= l_{1,t-m} - (1 - \beta_1) c_{1,t-m} + (\beta_0 - \beta_1) \varepsilon_t \\
c_{1,t} &= l_{1,t-m} + (1 - \beta_0) c_{1,t-m} + (\beta_0 + \beta_1) \varepsilon_t.
\end{aligned} \qquad (6)
$$
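The recursion in (6) can be sketched in Python as follows. This is our own illustration, with arbitrary initial states; with zero errors and $\alpha_1 = \beta_1 = 1$ both level equations simply carry their previous values forward, so the generated pattern repeats with period $m$:

```python
import numpy as np

def seasonal_ces(a0, a1, b0, b1, l0, c0, l1_init, c1_init, n, errors=None):
    """General seasonal CES recursion (6); l1_init and c1_init hold the m
    initial lagged seasonal states."""
    m = len(l1_init)
    errors = np.zeros(n) if errors is None else np.asarray(errors, dtype=float)
    l0s, c0s = [l0], [c0]
    l1s, c1s = list(l1_init), list(c1_init)
    y = []
    for t, e in enumerate(errors):
        lag_l, lag_c = l1s[t], c1s[t]    # seasonal states from m steps back
        y.append(l0s[-1] + lag_l + e)
        prev_l0, prev_c0 = l0s[-1], c0s[-1]
        l0s.append(prev_l0 - (1 - a1) * prev_c0 + (a0 - a1) * e)
        c0s.append(prev_l0 + (1 - a0) * prev_c0 + (a0 + a1) * e)
        l1s.append(lag_l - (1 - b1) * lag_c + (b0 - b1) * e)
        c1s.append(lag_l + (1 - b0) * lag_c + (b0 + b1) * e)
    return np.array(y)

# Zero errors, a1 = b1 = 1: the non-seasonal level stays at 10 and the
# seasonal pattern [1, 2, 3, 4] repeats, giving 11, 12, 13, 14, 11, ...
pattern = seasonal_ces(1.0, 1.0, 1.0, 1.0, l0=10.0, c0=0.0,
                       l1_init=[1.0, 2.0, 3.0, 4.0], c1_init=[0.0] * 4, n=8)
```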
The model in (6) can still be written in the conventional state-space form (3). It exhibits several differences from the conventional seasonal smoothing models. First, the proposed seasonal CES in (6) does not have a set of the usual seasonal components as the ordinary exponential smoothing models do, which means that there is no need to renormalise them. The values of $l_{1,t}$ and $c_{1,t}$ correspond to estimates of the level and information components in the past, and have more in common with seasonal ARIMA (Box & Jenkins, 1976, p. 300) than with the conventional seasonal exponential smoothing models. Second, it can be shown that the general seasonal CES has an underlying model that corresponds to SARIMA$(2,0,2m+2)(2,0,0)_m$ (see Appendix A), which can be either stationary or not, depending on the complex smoothing parameter values.
The general seasonal CES preserves the properties of both models (2) and (5): it can produce non-linear seasonality and all the possible types of trends discussed above, as the original level component $l_{0,t}$ can become negative while the lagged level component $l_{1,t}$ may remain strictly positive.
Finally, this model retains the interesting property of independence between the original level and lagged level components, so a multiplicative (or other) shape of seasonality may appear in the data even when the level of the series
does not change. This could happen for example when the seasonality is
either nonlinear or some other variable is determining its evolution, as for
example is the case with solar power generation (Trapero et al., 2015).
2.4. Parameters estimation
During the estimation of the parameters of the general seasonal CES, some constraints should be introduced to achieve the stability of the model. The state-space model is stable when all the eigenvalues of the discount matrix $D = F - gw'$ lie inside the unit circle. Unfortunately, it is hard to derive the exact regions for the smoothing parameters of the general seasonal CES, but the stability condition can be checked during the optimisation. To do that, the eigenvalues of the following discount matrix should be calculated (see Appendix B):
$$
D = \begin{pmatrix}
1 - \alpha_0 + \alpha_1 & \alpha_1 - 1 & \alpha_1 - \alpha_0 & 0 \\
1 - \alpha_0 - \alpha_1 & 1 - \alpha_0 & -\alpha_0 - \alpha_1 & 0 \\
\beta_1 - \beta_0 & 0 & 1 - \beta_0 + \beta_1 & \beta_1 - 1 \\
-\beta_0 - \beta_1 & 0 & 1 - \beta_0 - \beta_1 & 1 - \beta_0
\end{pmatrix} \qquad (7)
$$
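Since $D = F - gw'$, the matrix in (7) need not be hard-coded: it can be assembled from the block transition matrix, the persistence vector and the measurement vector of the general seasonal model. The Python sketch below is our own illustration of that construction, together with the eigenvalue-based stability check:

```python
import numpy as np

def discount_matrix(a0, a1, b0, b1):
    """Build D = F - g w' for the general seasonal CES from its blocks and
    check whether all eigenvalues lie inside the unit circle."""
    F = np.array([[1.0, a1 - 1, 0.0, 0.0],
                  [1.0, 1 - a0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, b1 - 1],
                  [0.0, 0.0, 1.0, 1 - b0]])
    g = np.array([a0 - a1, a0 + a1, b0 - b1, b0 + b1])
    w = np.array([1.0, 0.0, 1.0, 0.0])   # y_t loads on the two level states
    D = F - np.outer(g, w)
    stable = bool(np.all(np.abs(np.linalg.eigvals(D)) < 1))
    return D, stable

D, stable = discount_matrix(1.3, 0.5, 1.2, 0.9)
```

The entries of the assembled matrix reproduce (7), e.g. $D_{1,1} = 1 - \alpha_0 + \alpha_1$ and $D_{2,3} = -\alpha_0 - \alpha_1$, while the second and fourth columns of the seasonal rows stay untouched by the rank-one correction.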
The estimation of the parameters of CES can be done using the likelihood function. This is possible due to the additive form of the error term in (6). The likelihood function turns out to be similar to the one used in ETS models (Hyndman et al., 2002):

$$
L(g, x_0, \sigma^2 \mid y) = \left( \frac{1}{\sigma\sqrt{2\pi}} \right)^T \exp\left( -\frac{1}{2} \sum_{t=1}^{T} \left( \frac{\varepsilon_t}{\sigma} \right)^2 \right). \qquad (8)
$$
The value of this likelihood function can be used in the calculation of information criteria. The double negative logarithm of (8), concentrated with respect to $\sigma^2$, can be calculated for that purpose:

$$
-2\log(L(g, x_0 \mid y)) = T\left( \log(2\pi e) + \log\left( \frac{1}{T}\sum_{t=1}^{T} \varepsilon_t^2 \right) \right). \qquad (9)
$$
2.5. Model selection
Using (9) the Akaike Information Criterion for both seasonal and non-
seasonal CES models can be calculated:
$$
\mathrm{AIC} = 2k - 2\log(L(g, x_0 \mid y)), \qquad (10)
$$

where $k$ is the number of coefficients and initial states of CES. For the non-seasonal model (2), $k$ is equal to 4 (the 2 real parts of the complex smoothing parameter and 2 initial states). For the seasonal model, the number of coefficients in (6) becomes much greater than in the original model: $k = 4 + 2m + 2$, which is 4 smoothing parameters, $2m$ initial lagged values and 2 initial values of the generic level and information components. Naturally, other information criteria can be constructed.
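A sketch of the likelihood-based criterion in Python (our own illustration; the concentrated Gaussian form is cross-checked below against the direct log-likelihood evaluated at the maximum-likelihood variance estimate):

```python
import numpy as np

def neg2_loglik(errors):
    """Concentrated -2 log-likelihood: T * (log(2*pi*e) + log(mean(err^2)))."""
    err = np.asarray(errors, dtype=float)
    return err.size * (np.log(2 * np.pi * np.e) + np.log(np.mean(err ** 2)))

def aic(errors, k):
    """AIC = 2k - 2 log L, with k counting coefficients and initial states."""
    return 2 * k + neg2_loglik(errors)

# Cross-check: the direct Gaussian -2 log-likelihood with sigma^2 = mean(err^2)
# equals the concentrated form, since sum(err^2)/sigma^2 = T.
err = np.array([0.5, -1.0, 0.3, 1.2, -0.7])
sigma2 = np.mean(err ** 2)
direct = err.size * np.log(2 * np.pi * sigma2) + np.sum(err ** 2) / sigma2
```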
Observe that the model selection problem for CES is reduced to choosing
only between non-seasonal and seasonal variants, instead of the multiple
model forms under conventional ETS.
3. Empirical evaluation
3.1. Model selection
To evaluate the performance of the model selection procedure we simu-
late series with known structure and attempt to model them with CES. Each
series is simulated using ETS at a monthly frequency and contains 120 observations. All the smoothing parameters are generated using a uniform distribution, restricted by the traditional bounds: $\alpha \in (0,1)$, $\beta \in (0,\alpha)$, $\gamma \in (0, 1-\alpha)$. A normal distribution with zero mean is used to generate the error term. We generate series with either additive or multiplicative errors, using an error standard deviation equal to 50 and 0.05 respectively. The initial value of the level is set to 5000, while the initial value of the trend is set to 0 for the additive cases and to 1 for the multiplicative cases. This allows the model used as the DGP to produce either growth or decline, depending on the error term and smoothing parameter values. All the initial values of the seasonal components are randomly generated and then normalised. For each of the 9 processes shown in Table 2, 1000 time series are generated.
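The described setup can be sketched for one of the DGPs. This is a hypothetical re-implementation of an ETS(AAA) generator in Python, not the authors' simulation code; the scale of the raw seasonal indices (100) is our own assumption:

```python
import numpy as np

rng = np.random.default_rng(42)

def draw_ets_parameters():
    """Smoothing parameters drawn inside the traditional bounds:
    alpha in (0,1), beta in (0,alpha), gamma in (0, 1-alpha)."""
    alpha = rng.uniform(0, 1)
    beta = rng.uniform(0, alpha)
    gamma = rng.uniform(0, 1 - alpha)
    return alpha, beta, gamma

def simulate_ets_aaa(n=120, m=12, level=5000.0, sd=50.0, seasonal_scale=100.0):
    """One ETS(AAA) series: initial level 5000, initial trend 0, randomly
    generated and normalised seasonal indices, additive N(0, sd^2) errors."""
    alpha, beta, gamma = draw_ets_parameters()
    s = rng.normal(0.0, seasonal_scale, m)
    s -= s.mean()                       # normalise the seasonal components
    s = list(s)
    l, b = level, 0.0
    y = []
    for t in range(n):
        e = rng.normal(0.0, sd)
        y.append(l + b + s[t % m] + e)
        l, b = l + b + alpha * e, b + beta * e
        s[t % m] += gamma * e           # seasonal index updated every m steps
    return np.array(y)

series = simulate_ets_aaa()
```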
We fit the two types of CES and choose the one with the smallest AIC corrected for small sample sizes (AICc). As benchmarks we fit ETS and ARIMA (both using AICc for model selection). The implementation of CES was done in R and is available as the "CES" package (on github: https://github.com/config-i1/CES). The benchmarks are produced using the "forecast" package for R (Hyndman & Khandakar, 2008).
The percentages of appropriate models chosen by CES, ETS and ARIMA are shown in Table 2. The values in the table are the percentage of time series characteristics successfully identified by each model. We note here that the definition of a successful identification of time series characteristics by the different models is subjective. This especially concerns the ARIMA model, for which we have developed a set of criteria (see the explanation below) to define whether it captured any type of trend or seasonality in different situations. However, the main objective of this experiment is to see whether CES is able to distinguish between seasonal and non-seasonal data, and to compare its performance in this task with the other models in a controlled environment.
Column “CES” shows in how many instances the appropriate seasonal
or non-seasonal model is chosen. We can observe that CES is very accurate
and managed to select the appropriate model in the majority of cases. It is
important to note that the fitted CES is not equivalent to the data generating process in each time series, but nevertheless it is able to approximate the time series structure.

DGP            CES     ETS                        ARIMA
                       Trend   Seasonal   Exact   Trend   Seasonal   Overall
N(5000, 50²)   99.9    97.3    99.8       97.1    96.5    45.7       44.1
ETS(ANN)       99.1    88.0    99.7       49.3    51.5    46.5       28.0
ETS(MNN)       99.3    85.1    99.6       50.9    59.7    47.3       30.0
ETS(AAN)       91.5    94.4    99.7       82.3    96.4    45.7       43.5
ETS(MMN)       98.9    91.6    99.7       68.9    92.3    35.2       32.2
ETS(ANA)       100     85.4    100        46.3    53.0    100        53.0
ETS(AAA)       100     92.1    100        79.1    86.3    100        86.3
ETS(MNM)       100     65.9    100        32.6    61.9    100        61.9
ETS(MMM)       98.2    88.4    100        52.7    70.3    100        70.3
Average        98.5    87.6    99.8       62.1    74.2    68.9       49.9

Table 2: The percentage of forecasting models chosen appropriately for each data generating process (DGP).
The column "ETS, Trend" shows in how many cases the trend or its absence is identified appropriately (not taking into account the type of trend). The "ETS, Seasonal" column shows the analogous accuracy for the identification of the seasonal components. The lowest trend identification accuracy of ETS is in the case of the ETS(MNM) process, with 65.9%, while the average accuracy in capturing trends appropriately is 87.6%. The accuracy of seasonal component identification in ETS is much better, with an average accuracy across all the DGPs of 99.8%. The column "Exact" shows in how many cases the exact DGP is identified by ETS. The average accuracy of ETS in this task is 62.1%. Note that, contrary to CES, ETS should have identified the exact model in the majority of cases, since it was used to generate the series. The reason behind this misidentification may lie in the information criterion used and the sample size of the generated time series.
Table 2 also provides the results for the ARIMA benchmark. The column "ARIMA, Trend" shows the number of cases where ARIMA identifies the trend or its absence appropriately. Although trend in ARIMA is not defined directly, we use three criteria to indicate that ARIMA captured a trend in the time series: ARIMA has a drift, or ARIMA has second differences, or ARIMA has first differences with a non-zero AR element. It can be noted that the lowest accuracy of ARIMA in trend identification is in the cases of the level ETS model DGPs. The average accuracy of ARIMA in this task is 74.2%. The column "ARIMA, Seasonal" shows the number of cases where ARIMA identifies the seasonality or its absence appropriately. The cases where either the seasonal AR, the seasonal MA or the seasonal differences contain non-zero values are identified as seasonal ARIMA models. The values in Table 2 indicate that ARIMA makes many mistakes in identifying seasonal models in time series generated using non-seasonal ETS models, which leads to an average accuracy of 68.9% in this category. Even though the generating process differs from the model used in this case, capturing seasonality in purely non-seasonal data makes no sense. The column "ARIMA, Overall" shows the number of cases where both trend and seasonality are identified appropriately by ARIMA using the criteria described above. Note that in the "Overall" column we only count the cases where ARIMA appropriately identifies both the presence of trend and season, so as to make it comparable with the CES results. ARIMA identifies the appropriate form mainly in the cases of the ETS(AAA) and ETS(MMM) models.
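The trend and seasonality criteria used for ARIMA can be encoded directly. The sketch below is our own encoding of the rules stated above, over the differencing order, a drift flag, a non-zero-AR flag and the seasonal orders $(P, D, Q)$:

```python
def arima_has_trend(d, drift, has_nonzero_ar):
    """Trend criteria from the text: a drift term, or second differences,
    or first differences combined with a non-zero AR element."""
    return drift or d >= 2 or (d == 1 and has_nonzero_ar)

def arima_is_seasonal(P, D, Q):
    """Seasonal criteria: non-zero seasonal AR, seasonal MA or seasonal
    differencing."""
    return P > 0 or D > 0 or Q > 0
```

For instance, an ARIMA(0,1,1) without drift is classified as trendless, while an ARIMA(1,1,0) with a non-zero AR coefficient is classified as trended.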
Obviously, the ETS and ARIMA results depend on the identification methodology employed. But the comparison done here retains its significance, since all three alternative models, CES, ETS and ARIMA, use the same information criterion to identify the final model form.
This simulation experiment shows that CES, having only two models to choose from, is more efficient in time series identification than the ETS and ARIMA models, which both have to choose from multiple alternatives. We argue that CES is capable of capturing both level and trend series (Svetunkov & Kourentzes, 2015) and its seasonal extension can produce both additive and multiplicative seasonality. This reduces the number of alternative models substantially. Furthermore, any seasonal components are highly penalised with CES during model selection. Therefore the seasonal model is chosen only in cases when it fits the data significantly better than its non-seasonal counterpart. This substantially simplifies the forecasting process as a whole.
3.2. Forecasting accuracy
The empirical out-of-sample accuracy evaluation is necessary to test the forecasting performance of CES and compare it against ETS and ARIMA. We conduct such an evaluation on the monthly data from the M3 Competition (Makridakis & Hibon, 2000), which contain both trend and seasonal time series. The forecasting horizon (18 periods ahead) is kept the same as in the original competition; however, a rolling origin evaluation scheme is used, with the last 24 observations withheld.

                CES      ETS      ARIMA
Minimum         0.134    0.084    0.098
25% quantile    0.665    0.664    0.703
Median          1.049    1.058    1.093
75% quantile    2.178    2.318    2.224
Maximum         28.440   53.330   59.343
Mean            1.922    1.934    1.967

Table 3: MASE values of the competing methods.
The Mean Absolute Scaled Error (MASE) is used to compare the performance of the models for each forecast horizon from each origin (Hyndman & Koehler, 2006):

$$
\mathrm{MASE} = \frac{h^{-1} \sum_{j=1}^{h} |e_{T+j}|}{(n-1)^{-1} \sum_{t=2}^{n} |y_t - y_{t-1}|}, \qquad (11)
$$

where $h$ is the forecast horizon, $n$ is the in-sample size and $e_{T+j}$ is the forecast error at horizon $j$. Using these errors we estimate the quantiles of the distribution of MASE along with the mean values from each origin. Table 3 presents the results of this experiment.
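A reference implementation of MASE along these lines (our own sketch; the scale is the in-sample mean absolute first difference, i.e. the error of the naive forecast):

```python
import numpy as np

def mase(insample, actuals, forecasts):
    """Mean Absolute Scaled Error: mean absolute forecast error over the
    horizon, scaled by the in-sample mean absolute first difference."""
    insample = np.asarray(insample, dtype=float)
    scale = np.mean(np.abs(np.diff(insample)))
    abs_errors = np.abs(np.asarray(actuals, dtype=float)
                        - np.asarray(forecasts, dtype=float))
    return np.mean(abs_errors) / scale

# In-sample differences are all 1, so unit forecast errors give MASE = 1.
value = mase([1, 2, 3, 4, 5], actuals=[6, 7], forecasts=[5, 6])
```

A MASE of 1 therefore means the forecast is, on average, as accurate as the in-sample one-step naive forecast.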
CES demonstrates the smallest mean and median MASE. It also has the smallest maximum error, which indicates that when all the models failed to forecast some time series accurately, CES was still closest to the actual values. In contrast, ETS had the smallest minimum MASE, while CES had the highest respective value among the competitors. This means that CES may not be as accurate as the other models in cases when the time series is relatively easy to forecast.
To see if the difference in the forecasting accuracy between CES and the
other methods is significant, we use the Multiple Comparisons with Best
(MCB) test (Koning et al., 2005). We note here that Hibon et al. (2012) demonstrated that MCB is a special case of the Nemenyi test. The results of this
test indicate that CES is significantly more accurate than ETS and ARIMA
(see Figure 3).
Figure 3: The results of the MCB test on the monthly data of M3 (Friedman test: 0.000, different; Nemenyi CD: 0.005). The dotted lines are the critical distances for the best model; both ETS(ZZZ) and ARIMA are found to be significantly different.

To investigate what could cause this difference in forecasting accuracy, the mean and median MASE are calculated for each forecast horizon, obtaining the one-step-ahead, two-steps-ahead, etc., mean and median MASE values. These values are shown in Figures 4a and 4b.
The analysis of Figure 4 shows that while the errors of the methods are very similar for short horizons (with ETS being slightly better), the difference between them increases for longer horizons. The difference in the mean values of MASE between the methods starts increasing approximately after a year (12 observations), with CES being the most accurate. The same feature is present in the median MASE values.
To summarise the results of this experiment, CES performs significantly better than ETS and ARIMA on the monthly M3 data. CES demonstrates that it can model various types of time series successfully and is particularly accurate relative to the benchmark models for longer horizons. We attribute this to the way that time series trends and seasonal components are modelled under CES.
4. Conclusions
We proposed a new type of seasonal exponential smoothing, based on
Complex Exponential Smoothing and discussed the model selection pro-
cedure that can be used to identify whether the non-seasonal or seasonal
CES should be selected for time series forecasting. While the non-seasonal
CES can produce different non-linear trajectories and approximate additive
and multiplicative trends, the seasonal model produces non-linear seasonality
Figure 4: Mean and median MASE values for CES, ETS and ARIMA at different horizons: (a) mean MASE; (b) median MASE.
that can approximate both additive and multiplicative seasonal time series
and even model new forms of seasonality that do not strictly lie under either
of these cases. The latter can be achieved as a result of the independence of
the seasonal and non-seasonal components and non-linearity of CES.
We also discussed the statistical properties of the general seasonal CES and showed that it has an underlying SARIMA$(2,0,2m+2)(2,0,0)_m$ model. This model can be either stationary or not, depending on the complex smoothing parameter values, which gives CES its flexibility and a clear advantage over the conventional exponential smoothing models, which are strictly non-stationary.
The simulation we conducted showed that the proposed model selection procedure allows choosing the appropriate CES effectively, making only a few mistakes in distinguishing seasonal from non-seasonal time series. In comparison, ETS and ARIMA were not as effective as CES in this task. We argue that this is a particularly useful feature of the CES model family. Although the models may initially appear more complicated than conventional ETS, the fact that a forecaster needs to consider only two CES variants, while being able to model a wide variety of trend and seasonal shapes, can greatly simplify the forecasting process.
The forecast competition between CES, ETS and ARIMA conducted on
the monthly data from M3 showed that CES achieved both the lowest mean
and median forecasting errors. The MCB test showed that the differences
in forecasting accuracy between these models were statistically significant.
Further analysis showed that the superior performance of CES was mainly
caused by the more accurate long-term forecasts. This is attributed to the
ability of CES to capture the long-term dependencies in time series, due
to its non-linear character that permits CES to obtain more flexible weight
distributions in time than ETS.
Overall, CES proved to be a good model that substantially simplifies the model selection procedure and improves forecasting accuracy.
We should note that we have used the error term as a proxy for the
information potential in the CES framework, but other proxies can be
suggested instead. For example, using first differences of the data instead
of the error term leads to an underlying ARMA(3,2) model and gives Complex
Exponential Smoothing different properties. The study of these properties
is, however, beyond the scope of this paper.
Finally, it would be interesting to compare CES with long-memory models,
such as ARARMA and ARFIMA. Such future research would highlight the
similarities and differences between these models, and would also provide
additional evidence on the forecasting performance of such models, which,
as in the case of ARARMA, may not have been explored to their full
potential.
Acknowledgements
We would like to thank the two anonymous reviewers and the associate
editor for their constructive communication that has helped improve the
paper.
Appendix A. General seasonal CES and SARIMA
The model (6) can be written in the following state-space form:
\[
\begin{aligned}
y_t &= w_0' x_{0,t-1} + w_1' x_{1,t-m} + \epsilon_t \\
x_{0,t} &= F_0 x_{0,t-1} + g_0 \epsilon_t \\
x_{1,t} &= F_1 x_{1,t-m} + g_1 \epsilon_t
\end{aligned}
\tag{A.1}
\]
where $x_{0,t} = (l_{0,t}, c_{0,t})'$ is the state vector of the non-seasonal part of CES, $x_{1,t} = (l_{1,t}, c_{1,t})'$ is the state vector of the seasonal part, $w_0 = w_1 = (1, 0)'$ are the measurement vectors,
\[
F_0 = \begin{pmatrix} 1 & \alpha_1 - 1 \\ 1 & 1 - \alpha_0 \end{pmatrix}, \qquad
F_1 = \begin{pmatrix} 1 & \beta_1 - 1 \\ 1 & 1 - \beta_0 \end{pmatrix}
\]
are the transition matrices and
\[
g_0 = \begin{pmatrix} \alpha_0 - \alpha_1 \\ \alpha_0 + \alpha_1 \end{pmatrix}, \qquad
g_1 = \begin{pmatrix} \beta_0 - \beta_1 \\ \beta_0 + \beta_1 \end{pmatrix}
\]
are the persistence vectors of the non-seasonal and seasonal parts respectively.
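The recursion in (A.1) can be sketched as a one-step-ahead filter. The following is a minimal numpy illustration assuming a crude data-based initialisation of the state vectors; the paper's actual implementation optimises the initial states and smoothing parameters rather than fixing them as below:

```python
import numpy as np

def ces_seasonal_filter(y, m, a0, a1, b0, b1):
    """One-step-ahead filter for the general seasonal CES, eq. (A.1).

    x0 = (l0, c0)' is the non-seasonal state (lagged by 1),
    x1 = (l1, c1)' is the seasonal state (lagged by m).
    Initial states are crude data-based guesses, for illustration only.
    """
    F0 = np.array([[1.0, a1 - 1.0], [1.0, 1.0 - a0]])
    F1 = np.array([[1.0, b1 - 1.0], [1.0, 1.0 - b0]])
    g0 = np.array([a0 - a1, a0 + a1])
    g1 = np.array([b0 - b1, b0 + b1])

    n = len(y)
    x0 = np.zeros((n + 1, 2))   # x0[t] is the state used to predict y[t]
    x1 = np.zeros((n + m, 2))   # x1[t] is the seasonal state used at time t
    x0[0] = [np.mean(y[:m]), np.mean(y[:m])]       # crude level start
    for j in range(m):          # crude seasonal start: deviations from level
        x1[j] = [y[j] - x0[0, 0], y[j] - x0[0, 0]]

    fitted = np.zeros(n)
    for t in range(n):
        fitted[t] = x0[t, 0] + x1[t, 0]   # w0'x0,{t-1} + w1'x1,{t-m}
        e = y[t] - fitted[t]
        x0[t + 1] = F0 @ x0[t] + g0 * e
        x1[t + m] = F1 @ x1[t] + g1 * e
    return fitted
```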
Observe that the lags of the non-seasonal and seasonal parts in (A.1)
differ, which is why the state-space model is split into two parts. Uniting
these parts into one larger system leads to the conventional state-space
model:
\[
\begin{aligned}
y_t &= w' x_{t-l} + \epsilon_t \\
x_t &= F x_{t-l} + g \epsilon_t
\end{aligned}
\tag{A.2}
\]
where $x_t = (x_{0,t}', x_{1,t}')'$, $x_{t-l} = (x_{0,t-1}', x_{1,t-m}')'$, $w = (w_0', w_1')'$, $F = \begin{pmatrix} F_0 & 0 \\ 0 & F_1 \end{pmatrix}$ and $g = (g_0', g_1')'$.
The lagged state vector $x_{t-l}$ can also be rewritten as
\[
x_{t-l} = \begin{pmatrix} B I_2 & 0 \\ 0 & B^m I_2 \end{pmatrix} \begin{pmatrix} x_{0,t} \\ x_{1,t} \end{pmatrix},
\]
where $B$ is the backshift operator. Making this substitution and denoting $L = \begin{pmatrix} B I_2 & 0 \\ 0 & B^m I_2 \end{pmatrix}$, the state-space model (A.2) can be transformed into:
\[
\begin{aligned}
y_t &= w' L x_t + \epsilon_t \\
x_t &= F L x_t + g \epsilon_t
\end{aligned}
\tag{A.3}
\]
The transition equation in (A.3) can also be rewritten as:
\[
(I_4 - F L) x_t = g \epsilon_t,
\tag{A.4}
\]
which after a simple manipulation leads to:
\[
x_t = (I_4 - F L)^{-1} g \epsilon_t.
\tag{A.5}
\]
Substituting (A.5) into the measurement equation in (A.3) gives:
\[
y_t = w' L (I_4 - F L)^{-1} g \epsilon_t + \epsilon_t.
\tag{A.6}
\]
Inserting the values of the vectors and multiplying the matrices leads to:
\[
y_t = \left(1 + w_0' (I_2 - F_0 B)^{-1} g_0 B + w_1' (I_2 - F_1 B^m)^{-1} g_1 B^m\right) \epsilon_t.
\tag{A.7}
\]
Substituting the matrices and vectors into (A.7) gives:
\[
y_t = \left(1 + w_0' \begin{pmatrix} 1 - B & (1-\alpha_1)B \\ -B & 1-(1-\alpha_0)B \end{pmatrix}^{-1} \begin{pmatrix} \alpha_0-\alpha_1 \\ \alpha_0+\alpha_1 \end{pmatrix} B + w_1' \begin{pmatrix} 1 - B^m & (1-\beta_1)B^m \\ -B^m & 1-(1-\beta_0)B^m \end{pmatrix}^{-1} \begin{pmatrix} \beta_0-\beta_1 \\ \beta_0+\beta_1 \end{pmatrix} B^m \right) \epsilon_t.
\tag{A.8}
\]
The inverse of the first matrix in (A.8) is equal to:
\[
(I_2 - F_0 B)^{-1} = \frac{1}{1-(2-\alpha_0)B-(\alpha_0+\alpha_1-2)B^2} \begin{pmatrix} 1-(1-\alpha_0)B & (\alpha_1-1)B \\ B & 1-B \end{pmatrix},
\tag{A.9}
\]
and similarly the inverse of the second matrix is:
\[
(I_2 - F_1 B^m)^{-1} = \frac{1}{1-(2-\beta_0)B^m-(\beta_0+\beta_1-2)B^{2m}} \begin{pmatrix} 1-(1-\beta_0)B^m & (\beta_1-1)B^m \\ B^m & 1-B^m \end{pmatrix}.
\tag{A.10}
\]
Inserting (A.9) and (A.10) into (A.8), after cancellations and regrouping of elements, leads to:
\[
\begin{aligned}
&\left(1-(2-\alpha_0)B-(\alpha_0+\alpha_1-2)B^2\right)\left(1-(2-\beta_0)B^m-(\beta_0+\beta_1-2)B^{2m}\right) y_t = \\
&\quad \left[\left(1-(2-\alpha_0)B-(\alpha_0+\alpha_1-2)B^2\right)\left(1-(2-\beta_0)B^m-(\beta_0+\beta_1-2)B^{2m}\right)\right. \\
&\quad + \left(1-(2-\beta_0)B^m-(\beta_0+\beta_1-2)B^{2m}\right)\left((\alpha_0-\alpha_1)B+(\alpha_0^2+\alpha_1^2-2\alpha_0)B^2\right) \\
&\quad + \left.\left(1-(2-\alpha_0)B-(\alpha_0+\alpha_1-2)B^2\right)\left((\beta_0-\beta_1)B^m+(\beta_0^2+\beta_1^2-2\beta_0)B^{2m}\right)\right] \epsilon_t
\end{aligned}
\tag{A.11}
\]
Unfortunately, there is no way to simplify (A.11) into a more compact
form, so the final model corresponds to SARIMA(2,0,2m+2)(2,0,0)_m.
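As a numerical sanity check of this reduced form, the ψ-weights implied by the operator expansion of the non-seasonal part, $w_0'(I_2 - F_0 B)^{-1} g_0 B = \sum_{j \geq 1} (w_0' F_0^{j-1} g_0) B^j$, can be compared with the weights obtained by pushing a unit impulse through the transition equation. A short numpy sketch with illustrative parameter values:

```python
import numpy as np

a0, a1 = 1.3, 1.0                      # illustrative, not from the paper
F0 = np.array([[1.0, a1 - 1.0], [1.0, 1.0 - a0]])
g0 = np.array([a0 - a1, a0 + a1])
w0 = np.array([1.0, 0.0])

# psi-weights from the operator expansion: psi_j = w0' F0^(j-1) g0, j >= 1
psi = [w0 @ np.linalg.matrix_power(F0, j - 1) @ g0 for j in range(1, 6)]

# the same weights from simulating the recursion with a unit impulse
x, e, sim = np.zeros(2), 1.0, []
for _ in range(5):
    x = F0 @ x + g0 * e                # transition equation
    sim.append(w0 @ x)                 # impulse contribution j steps ahead
    e = 0.0
assert np.allclose(psi, sim)
```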
Appendix B. Discount matrix of the general seasonal CES
Substituting the error term in the transition equation of (A.2) with its value from the measurement equation leads to:
\[
x_t = F x_{t-l} + g y_t - g w' x_{t-l} = (F - g w') x_{t-l} + g y_t,
\tag{B.1}
\]
which leads to the following discount matrix:
\[
D = F - g w' = \begin{pmatrix}
1-\alpha_0+\alpha_1 & \alpha_1-1 & \alpha_1-\alpha_0 & 0 \\
1-\alpha_0-\alpha_1 & 1-\alpha_0 & -\alpha_1-\alpha_0 & 0 \\
\beta_1-\beta_0 & 0 & 1-\beta_0+\beta_1 & \beta_1-1 \\
-\beta_1-\beta_0 & 0 & 1-\beta_0-\beta_1 & 1-\beta_0
\end{pmatrix}
\tag{B.2}
\]
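The closed form in (B.2) can be verified numerically by building D = F − gw' from the block matrices of Appendix A. A sketch with illustrative parameter values, which also checks the eigenvalues of D (in the exponential smoothing state-space literature, stability requires the eigenvalues of the discount matrix to lie inside the unit circle):

```python
import numpy as np

# Illustrative parameter values: alpha = a0 + i*a1, beta = b0 + i*b1.
a0, a1, b0, b1 = 1.2, 0.9, 1.1, 0.8

F0 = np.array([[1.0, a1 - 1.0], [1.0, 1.0 - a0]])
F1 = np.array([[1.0, b1 - 1.0], [1.0, 1.0 - b0]])
F = np.block([[F0, np.zeros((2, 2))], [np.zeros((2, 2)), F1]])
g = np.array([a0 - a1, a0 + a1, b0 - b1, b0 + b1])
w = np.array([1.0, 0.0, 1.0, 0.0])          # w = (w0', w1')'

D = F - np.outer(g, w)                      # discount matrix, eq. (B.2)

# Spot-check a few entries against the closed form:
assert np.isclose(D[0, 0], 1 - a0 + a1)
assert np.isclose(D[1, 2], -a1 - a0)
assert np.isclose(D[3, 3], 1 - b0)

stable = bool(np.all(np.abs(np.linalg.eigvals(D)) < 1))
```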