All content in this area was uploaded by Yuqian Zhao on Jun 12, 2020
Combination Forecasting of Energy Demand in the UK
Marco Barassi* and Yuqian Zhao**
ABSTRACT
In more deregulated markets such as the UK, demand forecasting is vital for the
electric industry, as it is used to set electricity generation and purchasing, to establish
electricity prices, and for load switching and demand response. In this paper we
produce improved short-term forecasts of the demand for energy produced from
five different sources in the UK by averaging over a set of six univariate and multivariate
models. The forecasts are averaged using six different weighting functions, including
Simple Model Averaging (SMA), Granger-Ramanathan Model Averaging
(GRMA), Bayesian Model Averaging (BMA), Smoothing Akaike (SAIC), Mallows
Weights (MMA) and Jackknife (JMA). Our results show that model averaging
always gives a lower Mean Square Forecast Error (MSFE) than the best/optimal
models within each class, however selected. For example, for Coal, Wind and
Hydro generated electricity, the forecasts produced with model averaging have an
MSFE about 12% lower than that obtained using the best selected individual models.
Among the individual models, the best forecasting performance comes from the Non-Linear
Artificial Neural Networks and the Vector Autoregression, and models selected
by the Jackknife often have superior performance. However, MMA averaged forecasts
almost always beat the predictions obtained from any of the individual models,
however selected, and those generated by other model averaging techniques.
Keywords: Demand for energy; forecasting; model averaging.
https://doi.org/10.5547/01956574.39.SI1.mbar
1. INTRODUCTION
Accurate and rigorous electricity demand modelling and forecasting is extremely important
for energy suppliers, Independent System Operators (ISOs), financial institutions, and other
participants in electric energy generation, transmission, distribution, and markets. Forecasting the
demand for energy is crucial for planning periodical operations and facility expansion in the
electricity sector at the various levels. For example, models for electric power load forecasting are
essential to the operation and planning of a utility company. At this level, load/demand forecasting
helps an electric utility to make important decisions, including purchasing and generating
electric power, load switching, and infrastructure development. On a broader and different level, models
and forecasts of a country's energy demand may provide useful information for the implementation
of specifically targeted energy policies. However, obtaining an appropriate forecasting model for
electricity networks is far from an easy task. In fact, although many modelling and forecasting
methods have been developed, none can be generalized to all demand patterns. This becomes an
even more pressing issue in deregulated markets such as that of the United Kingdom (UK hereafter)
* Corresponding author. Department of Economics, University of Birmingham, Birmingham, B15 2TT, U.K.; E-mail:
m.r.barassi@bham.ac.uk
** Department of statistics and actuarial science, University of Waterloo, ON, Canada; E-mail:yuqian.zhao@uwaterloo.ca
The Energy Journal, Vol. 39, SI1. Copyright © 2018 by the IAEE. All rights reserved.
where demand patterns have become even more complex. Depending on the available data, their
frequency, and the desired nature and detail of forecasting, methodologies to obtain forecasts of the
demand for energy can be broadly classified into short-term models, which use traditional time series
and similar day/machine learning approaches, and long-term models, which include end-use models,
structural econometric models and time series models with lower frequency data.
In this paper we aim to contribute to the literature on energy demand forecasting by
detailing a pseudo out-of-sample combination forecast design which uses high frequency time series
data to obtain improved short-term forecasts (up to 1 day) of the demand for energy produced from
five different sources in the UK, averaging over a set of six univariate and multivariate models. With
the growing deregulation of the energy industries, obtaining more accurate forecasts has gained
increasing appeal. The reason is that, since supply and demand experience wider fluctuations and
energy prices may increase by a factor of ten or more during peak situations, correct and well timed
demand forecasting has become vitally important for utilities. Short-term demand forecasts help to
estimate load flows and to make decisions that can prevent overloading. Timely implementation of
such decisions improves network reliability and reduces the occurrence
of equipment failures and blackouts. On the other hand, demand forecasting is also important for
contract evaluations and evaluations of various sophisticated financial products on energy pricing
offered by the market.
Short-term forecasting models, whose horizons usually range from one hour to one week, use historical
data and play a very important role in power systems' operating functions, such
as energy transactions, unit commitment, security analysis, fuel scheduling and load switching. The
techniques most commonly used include:
• Univariate time series models (Box-Jenkins ARIMA), Holt-Winters exponential smoothing,
time series regressions and multivariate time series models such as Vector autoregressions
(VARs), Bayesian VARs (BVAR) and Factor Augmented VARs (FAVAR)
• Similar day and machine learning approaches including Artificial Neural Networks
(ANN), Non-linear Autoregressive Neural Networks (NLANN), Fuzzy Logic, Support
Vector Machines.
Long-term forecasting models play an important role in policy formulation and supply
capacity expansion (they incorporate consumer behaviour and characteristics, technology, etc). They
include:
• Time series methods with lower frequency data. These include univariate time
series models such as the traditional Box-Jenkins ARIMA, Seasonal ARIMA
and models which include fractional integration such as ARFIMA, as well as the above
mentioned multivariate time series models (VAR, BVAR, FAVAR) and cointegrated
vector error correction models (VECM).
• End-use methods (electricity demand is derived from users' demand for individual
requirements):
  • Non-intrusive Load Monitoring Models
• Structural econometric models (which seek to establish the relationship between energy
consumption and the factors that influence it) include:
  • Conditional Demand Analysis (Multivariate Regressions, Stochastic Markov Chains)
Earlier attempts at short-term forecasting include Hagan and Behr (1987), Fan and
McDonald (1994), Amjady (2001) and Nogales et al. (2002), who used time series regressions
including temperature, humidity and past energy consumption to obtain demand projections. More
recently, Filik et al. (2011) predicted yearly, weekly and hourly electric energy demand through a
three stage model, showing how short-term predictions are usually more accurate and of immediate
application than medium and long-term ones.
Within the literature on the UK energy system, we find several studies mostly focussed on
long-term modelling and forecasting. Hunt et al. (2003) applied time series models to forecast the
UK energy demand on a sectoral basis. However, pure time series models have often been criticised
for not considering other important macroeconomic variables as predictors of long-run energy
demand. To overcome this issue, Haas and Schipper (1998) use price elasticities, income elasticities
and technical efficiency as explanatory variables to forecast energy demand in the UK and other
OECD countries. Similarly, in order to forecast oil, gas, coal and total energy demand in the UK and
Germany, McAvinchey and Yannopoulos (2003) constructed an econometric model incorporating
variables including the price of electricity and technological progress. Cointegration models have
also been used to model or forecast energy demand in the long run. Among them, Fouquet et al.
(1997) use a cointegrated VAR to investigate the long run relationship among fuel demand, the real
price level and economic activity, and Sadorsky (2009) adopted panel cointegration techniques to
model the long run relationship among GDP per capita, CO2 per capita and the demand for
renewable energy in the G7 countries.
Previous studies on energy demand forecasting can also be classified on the basis of their
parameterisation, that is, whether they employed univariate and/or multivariate (time series)
models. The most widely used univariate forecasting set of models is the ARMA, which can be extended
to ARIMA to account for non-stationarity, SARMA to consider seasonality and ARFIMA to
consider fractional integration. Ediger et al. (2006) and Ediger and Akar (2007) employed ARIMA and
SARIMA to forecast fossil fuel demand in Turkey, using goodness-of-fit and information criteria to select
the best forecasting models. However, Sumer et al. (2009) focussed on the importance of capturing
the seasonal effects contained in energy demand and used ARIMA, SARIMA and various regression
models, and found that the regression model with seasonal latent variables provides more accurate
forecasts than the class of ARMA models.
In the earlier literature, exponential smoothing also showed reasonable forecasting ability.
Badri et al. (1997) adopted several time-series models including exponential smoothing to forecast
electricity peak-load in the UAE. As an extension of the simple exponential smoothing technique,
Hong (2013) claimed that the Holt-Winters smoothing method shows better performance. More recently,
with the introduction of Artificial Neural Networks (ANN) and their application in forecasting, many
studies have used ANN-type models to forecast energy demand, also considering a number of
input variables, such as macroeconomic and environmental variables (Chow and Leung (1996),
Markham and Rakes (1998), Sözen et al. (2005), Ermis et al. (2007)). However, Maia et al. (2006)
showed that a hybrid of ARMA and ANN models has better predictive ability for energy
forecasts. Similar models have also been applied by Pao (2006) and Kurban and Filik (2009),
who used Non-linear Autoregressive Neural Networks (NARNN) to forecast electricity demand
in Taiwan and Turkey, respectively. Geem and Roper (2009) employed the NARNN model to
forecast energy demand for South Korea, and they indicated that NARNN produces more accurate
predictions than linear regressions and exponential smoothing.
Among multivariate time series models, vector autoregressions (VAR) are probably
the most popular for forecasting purposes. García-Ascanio and Maté (2010) used a VAR to forecast
electric power demand, but they argued that VARs show poor predictive ability compared with
more advanced multivariate models. Bayesian VARs (BVAR) have often been found to improve
the forecast accuracy of the basic VAR, also overcoming the problem of over-fitting. Crompton and
Wu (2005) forecast coal, oil, gas and hydro energy demand in China through a BVAR; Francis et al.
(2007) used the BVAR to study the relationship between real gross domestic product per capita and
energy demand in Caribbean countries, and obtained forecasts for energy demand in those countries.
Recently, VAR models have been further developed so as to include factors extracted from a set of
potentially numerous predictors, yielding a factor augmented vector autoregressive model (FAVAR)
(see Chudik and Pesaran (2011) among others). Baumeister et al. (2016) used a FAVAR model to
forecast gasoline prices for the US, showing that their predictions are more accurate than those of
standard VARs.
Clearly, there is a wide variety of models and parameterisations among which to choose for
the purpose of obtaining energy demand forecasts, and thus the choice of which criterion to use in
order to select the optimal forecasting model becomes another important issue (see Pao (2006) and
García-Ascanio and Maté (2010) for comparative studies). Lai et al. (2008) used the mean squared
error (MSE), the mean absolute percentage error (MAPE) and the mean squared percentage error
(MSPE) to compare the accuracy of rival forecasting models. Moreover, models can be selected
by minimisation of information criteria obtained in-sample for each individual estimated model
(e.g. the Akaike Information Criterion, AIC, and the Bayesian Information Criterion, BIC). Within this
stream of literature, Hansen (2007) and Hansen (2008) showed that Mallows' information criterion
may often select models that provide more precise forecasts, and later Hansen and Racine (2012)
and Hansen (2014) proposed a cross-validation criterion for model selection based on the Jackknife.
Since the seminal article of Bates and Granger (1969), model averaging has been widely
used to produce more accurate forecasts than the individual optimal models. Hendry and Clements
(2004) and Timmermann (2006) showed that simple combinations often give better performance
than more sophisticated approaches. Further, using a frequentist approach, Granger and Ramanathan
(1984) proposed the use of regression coefficients to determine the magnitude of the
weights of individual models in the averaging process. However, the most popular averaging method
is Bayesian model averaging (Madigan and Raftery, 1994), where the averaging weights are calculated
from empirical data using the Bayesian Information Criterion of the individual models
as weights in the averaging function. Anderson and Burnham (2002) suggested using the AIC
in place of the BIC in model averaging, while Hansen (2007) and Hansen (2008) have continued
this line of research, showing that model combination based on Mallows' criterion asymptotically
leads to forecasts with the smallest possible mean squared error. Guidolin and Timmermann (2007)
proposed a different, time varying weight combination scheme where weights have regime switching
dynamics. More recently, Hansen and Racine (2012) proposed a "jackknife model averaging"
(JMA) estimator which selects the weights by minimizing a cross-validation criterion, showing that
their method is asymptotically optimal.
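To make the difference between two of these weighting schemes concrete, the following is a minimal sketch in Python with NumPy, not the authors' implementation. It contrasts equal weights with regression-based weights in the spirit of Granger and Ramanathan (1984); the unconstrained, no-intercept regression variant, the simulated data and all function names are assumptions made for illustration.

```python
import numpy as np

def msfe(actual, forecast):
    """Mean squared forecast error between two equally long series."""
    return float(np.mean((np.asarray(actual) - np.asarray(forecast)) ** 2))

def sma_weights(n_models):
    """Simple model averaging: every model receives the same weight."""
    return np.full(n_models, 1.0 / n_models)

def granger_ramanathan_weights(actual, forecasts):
    """Regress the realised values on the individual forecasts (no intercept,
    weights unconstrained); the OLS coefficients are used as weights."""
    weights, *_ = np.linalg.lstsq(forecasts, actual, rcond=None)
    return weights

# Simulated example (hypothetical data): three forecasts of decreasing quality.
rng = np.random.default_rng(1)
actual = rng.normal(size=200)
forecasts = np.column_stack(
    [actual + rng.normal(scale=s, size=200) for s in (0.5, 1.0, 2.0)]
)

w_sma = sma_weights(forecasts.shape[1])
w_gr = granger_ramanathan_weights(actual, forecasts)
msfe_sma = msfe(actual, forecasts @ w_sma)
msfe_gr = msfe(actual, forecasts @ w_gr)
print(w_gr, msfe_sma, msfe_gr)
```

Because the regression weights here are fitted on the same observations they are evaluated on, their in-sample MSFE cannot exceed that of equal weights; this says nothing about out-of-sample performance, which is the comparison of interest in the paper.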
In this paper, in order to overcome some of the methodological issues above and produce
improved forecasts of the demand for energy in the UK, we use a forecasting approach based on
model averaging of several popular linear and non-linear, univariate and multivariate forecast models.
Specifically, we first obtain the forecasts from sets of ARMA, Holt-Winters Smoothing (HWS), Non
Linear Autoregressive Neural Networks (NLANN), Vector Autoregressions (VAR), Bayesian VAR
(BVAR), and Factor Augmented VAR (FAVAR) models. Forecasts are generated from each of these
sets of models and, within each set, they are then compared and selected using rankings based on four
different information criteria (AIC, BIC, Mallows' and Jackknife). The best models as selected by
each of the different information criteria within the individual model sets are then averaged using six
different combination weight metrics including Simple Model Averaging (SMA), Granger-Ramanathan
Model Averaging (GRMA), Bayesian Model Averaging (BMA), Smoothing Akaike (SAIC),
Mallows Weights (MMA) and Jackknife (JMA).
The paper is structured as follows: the next section presents the data and explains the
pre-analysis necessary to transform the raw series into workable variables. Section three outlines the
individual forecast models used and the metrics used for selection and weighting purposes.
Comparisons of the performance of individual models, forecasting methods, information criteria and averaging
functions are discussed in section four. Section five reports the forecasts of the level of demand for
energy in the UK, and a summary concludes.
2. DATA AND PRE-ANALYSIS
2.1 Data
We collected 30-minute data from the National Gridwatch website¹ ranging from 00:00:00
on 21 December 2013 to 00:00:00 on 21 March 2016, for a total of 39287 observations for
each energy demand process. The data relate to energy demand for the entire UK plus imports, minus
exports, less un-metered sources. Here, we shall assume that supply matches demand at all times. The
UK gridwatch provides energy demand data disaggregated by type of energy source, and
in this paper we concentrate on the five most important sources: coal, nuclear,
combined cycle gas turbines (CCGT hereafter), wind and hydro-power.
Note that energy providers in the UK differ quite substantially in their choice
of fuel sources used to provide energy, as can be seen from Table 1, which lists most of the
utilities in the UK. Note also that, besides the strong deregulation, given the differences in the fuel mix
used by energy suppliers in the UK, accurate short-term forecasting of the demand for energy
produced from the different sources becomes even more appealing, as it has an important impact on
the prices charged, the choice of production and, not least, general demand management and load
switching.
In this paper, using the data described above, we obtain 30-minute to one-day predictions
of the demand for energy as produced from these five energy sources; that is, we obtain forecasts
for the 22 March 2016, at the following times: 00:30, 01:00, 02:00, 04:00, 06:00, 08:00, 16:00 and
24:00.
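As a small illustration, the clock times above can be translated into numbers of half-hourly steps ahead. The sketch below assumes the forecast origin is midnight at the start of the forecast day; the function and variable names are hypothetical.

```python
# Map the forecast clock times quoted in the text to half-hourly steps ahead.
STEP_MINUTES = 30  # sampling interval of the data

def steps_ahead(clock_time: str) -> int:
    """Number of 30-minute steps from 00:00 to `clock_time` (format HH:MM)."""
    hours, minutes = (int(part) for part in clock_time.split(":"))
    return (hours * 60 + minutes) // STEP_MINUTES

forecast_times = ["00:30", "01:00", "02:00", "04:00", "06:00", "08:00", "16:00", "24:00"]
horizons = [steps_ahead(t) for t in forecast_times]
print(horizons)  # [1, 2, 4, 8, 12, 16, 32, 48]
```

Under this reading the longest horizon, 48 steps, corresponds to one full day of half-hourly observations.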
Figure 2 plots the levels of the demand for energy generated from each of the five sources.
It is noticeable how demand for CCGT and wind sourced energy is relatively more volatile, while
nuclear is steadier. The main reason is that CCGT plants, though an efficient way to use gas and
fast to bring online, use relatively expensive fuel. Thus, these cycling plants
ramp up and down during the day, and are usually used more during peak hours. As for wind
turbines, they are expensive but wind is cheap; however, the strength of the wind is not constant and
varies from zero to storm force. This means that wind turbines cannot produce the same amount
of electricity all the time, and there are times when they produce no electricity at all. On the other
hand, once on, nuclear power stations run flat out, as the cost of fuel is insignificant, and this explains
the much lower volatility in its demand.
Still, regardless of the source, there exist strong "seasonal" components in the data. These
periodicities vary in frequency: season of the year, "day-of-the-week" and "hour-of-the-day". Therefore, it
is necessary to remove these periodic effects before analysing the data and obtaining the forecasts for
each model of the model sets.
1. http://www.gridwatch.templar.co.uk/
Table 1: Fuel Mix of UK Energy Suppliers
Supplier             Coal   Gas    Nuclear  Renewable  Other  CO2    Nuclear Waste
British Gas          2      30     34       33         0.7    0.137  0.0024
Bulb                 0      0      0        100        0      0      0
Co-operative Energy  36.6   34.2   13.4     9.8        6      0.493  0.00094
E.On                 18.7   32.4   12.8     29         7.1    0.328  0.001
Ecotricity           0      0      0        100        0      0      0
EDF Energy           14.5   8.6    64.3     12.3       0.3    0.167  0.0045
Extra Energy         19     33     13       28         7      0.339  0.0009
First:Utility        18.9   32.7   12.9     28.3       7.2    0.33   0.0009
Flow Energy          18.9   32.7   12.9     28.3       7.2    0.331  0.0009
GnERGY               34     25.6   21.6     16.7       2.1    0.418  0.00151
Good Energy          0      0      0        100        0      0      0
Green Energy UK      0      70.6   0        29.4       0      0.134  0
Green Star Energy    0.1    0.1    0        99.8       0      0.002  0.00001
iSupplyEnergy        38.7   36.2   14.2     4.6        6.3    0.528  0.001
LoCO2 Energy         0      0      0        100        0      0      0
Npower/RWE           16     66     1        16         1      0.408  0.00008
Octopus Energy       1      1      1        97         0      0.013  0.00004
OVO Energy           0      46.9   0        53.1       0      0.183  0
Scottish Power       34     36     3        26         1      0.46   0.0002
So Energy            18.9   32.7   12.9     28.3       7.2    0.331  0.0009
Spark Energy         46.8   27.1   8.4      11.9       5.8    0.579  0.0007
SSE                  25     35     7        29         4      0.38   0.00047
Utilita              19     33     13       28         7      0.332  0.00091
UK Average           17     32.3   23.7     24.3       2.5    0.29   0.0017
Source: http://electricityinfo.org/fuel-mix-of-uk-domestic-electricity-suppliers/
Figure 2: The levels of energy demand in the UK
Note: This figure plots the level of demand for each energy source.
2.2 Removing the Deterministic Components
As highlighted above, energy demand data always contain significant periodic patterns in
both the short and long term. In this subsection, we describe in turn the three methods that we use to
remove these periodic patterns. In what follows we first fit a Chebyshev polynomial of degree up to 5
to remove the season-of-the-year component, then apply a moving average based technique
to remove the day-of-the-week effect, and finally take a 48th order difference to eliminate the day/night
(peak/off-peak hour) effect.
Cuestas and Gil-Alana (2016) recently showed the ability of Chebyshev polynomials to
fit long-term cyclical patterns. Chebyshev polynomials are based on orthogonal cosine functions
of time, such that a linear combination of these functions can flexibly approximate most cyclical
patterns. The higher the order of the polynomial, the more non-linear is the cyclical pattern that can
be approximated. Following [12] we define the polynomial as,
$$P_{i,n}(t) = \sqrt{2}\,\cos\left(i\pi(t-0.5)/n\right), \quad t = 1, 2, \ldots, n; \quad i = 1, 2, \ldots \tag{1}$$
where $i$ is the order of the Chebyshev polynomial. Specifically, when $i$ equals 0, it gives a
constant function with $P_{0,n}(t) = 1$. Since any empirical process $y_t$ can be decomposed into a
deterministic and a stochastic part, if the deterministic term is approximated by Chebyshev
polynomials, then we have,
$$y_t = \sum_{i=0}^{m} \theta_i P_{i,n}(t) + x_t, \quad t = 1, 2, 3, \ldots \tag{2}$$
where $x_t$ is assumed to be the stochastic part of the model, and the order of the Chebyshev polynomials
is determined by the significance of the parameters $\theta_i$. The parameters $\theta_i$ can be estimated by,
$$\hat{\theta} = \left(\sum_{t=1}^{n} P_n(t) P_n(t)'\right)^{-1} \left(\sum_{t=1}^{n} P_n(t)\, y_t\right) \tag{3}$$
Finally, the de-seasonalised process $y_t^*$ is,
$$y_t^* = y_t - \sum_{i=0}^{m} \hat{\theta}_i P_{i,n}(t) \tag{4}$$
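The detrending in Equations (1) to (4) can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the scaling constant $\sqrt{2}$ follows the usual definition of these cosine polynomials, the estimator in Equation (3) is computed as an ordinary least squares fit, and the simulated series and all function names are hypothetical.

```python
import numpy as np

def chebyshev_basis(n: int, m: int) -> np.ndarray:
    """n x (m+1) matrix whose columns are P_{0,n}(t), ..., P_{m,n}(t), t = 1..n."""
    t = np.arange(1, n + 1)
    columns = [np.ones(n)]  # P_{0,n}(t) = 1
    for i in range(1, m + 1):
        columns.append(np.sqrt(2.0) * np.cos(i * np.pi * (t - 0.5) / n))  # Eq. (1)
    return np.column_stack(columns)

def remove_chebyshev_trend(y: np.ndarray, m: int = 5):
    """Estimate theta by least squares (Eq. 3) and return the residual y* (Eq. 4)."""
    P = chebyshev_basis(len(y), m)
    theta_hat, *_ = np.linalg.lstsq(P, y, rcond=None)
    return y - P @ theta_hat, theta_hat

# Simulated series (hypothetical): a smooth deterministic pattern plus noise.
n = 400
P = chebyshev_basis(n, 5)
true_theta = np.array([10.0, 3.0, 0.0, -1.5, 0.0, 0.5])
rng = np.random.default_rng(0)
y = P @ true_theta + rng.normal(scale=0.1, size=n)

y_star, theta_hat = remove_chebyshev_trend(y, m=5)
print(np.round(theta_hat, 2))
```

Since the columns of the basis are mutually orthogonal over $t = 1, \ldots, n$, each coefficient can be estimated and tested without affecting the others, which is convenient when the order is chosen by coefficient significance.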
Because our data set spans 2 years and 3 months, it should contain two complete seasonal
cycles in the demand for each energy source. Figure 3 illustrates the seasonal patterns removed from
the demand for energy obtained from the five sources. From these plots, we can see that, except for CCGT, all energy
sources reach their peaks during the winter period and fall during summer. Once
the seasonal long-term cyclical pattern has been removed, it is necessary to remove shorter term
periodic patterns.
In the short run, there are two more periodic elements which need to be filtered out: week-day
and peak/off-peak (or day-night) effects. The week-day element can be removed by adopting a
specific moving average method. Since the data frequency is 30 minutes, following [50], we set the
moving average length $l$ equal to 336, which is the number of half-hours in one week. To obtain
the moving average at each point in the series, we need to make sure that the data before and after
any given data point are of the same length. Hence, we consider $l + 1 = 337$ observations
for each moving average window. The moving average component is calculated through,
$$\hat{m}_t = \frac{1}{l+1}\left(\sum_{i=1}^{l/2} y^*_{t-i} + y^*_t + \sum_{i=1}^{l/2} y^*_{t+i}\right) \tag{5}$$
where $\hat{m}_t$ is the moving average term and $y^*_t$ is the de-seasonalised series obtained in the first step.
Hence, the deviation from the moving average is given as,
$$w_k = y^*_{k+lj} - \hat{m}_{k+lj} \tag{6}$$
where $k$ is the index in one moving average window, and $j$ is the number of moving average
windows. Thus, the day-of-the-week cyclical component can be obtained as,
$$\hat{s}_k = w_k - \frac{1}{l}\sum_{i=1}^{l} w_i \tag{7}$$
Therefore, the day-of-the-week filtered data is defined as,
$$y^*_{t,l} = y^*_t - \hat{s}_t \tag{8}$$
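The moving average filter of Equations (5) to (8) can be sketched as below. This is a hedged illustration, not the authors' code: averaging the deviations over all windows $j$ for each position $k$ is an assumption in line with the classical decomposition method, the toy series uses a period of 48 rather than 336 to keep the example small, and the function name is hypothetical.

```python
import numpy as np

def remove_periodic_component(y_star: np.ndarray, l: int):
    """Estimate and remove a periodic component of even length l.

    Returns the filtered series (Eq. 8) and the estimated component s_hat (Eq. 7).
    """
    n = len(y_star)
    half = l // 2
    # Eq. (5): centred moving average over l + 1 points (undefined at the edges).
    m_hat = np.full(n, np.nan)
    for t in range(half, n - half):
        m_hat[t] = y_star[t - half : t + half + 1].mean()
    # Eq. (6): deviations from the moving average, averaged over the windows j
    # for each within-period position k (the averaging is an assumption).
    deviations = y_star - m_hat
    w = np.array([np.nanmean(deviations[k::l]) for k in range(l)])
    # Eq. (7): centre the component so that it sums to zero over one period.
    s_hat = w - w.mean()
    # Eq. (8): subtract the periodic component from every observation.
    return y_star - s_hat[np.arange(n) % l], s_hat

# Toy series (hypothetical): a pure periodic pattern of length 48, repeated 10 times.
l = 48
pattern = np.sin(2 * np.pi * np.arange(l) / l)
y_star = np.tile(pattern, 10)

filtered, s_hat = remove_periodic_component(y_star, l)
print(np.abs(filtered).max())
```

Because the window holds $l + 1$ points, the moving average of a purely periodic series is small but not exactly zero, so the pattern is recovered only approximately; the approximation error shrinks as $l$ grows.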
Figure 4 illustrates the fitted weekly component of the demand for energy obtained for each
source. From the plots, it is apparent that some energy sources have a more pronounced day-of-the-week
effect than others. Specifically, the demand for coal, nuclear and CCGT produced
energy is relatively higher on weekdays than at weekends. This pattern, instead, does not
appear so clearly in the demand for energy produced from wind and hydro sources. This is probably
because the first three are the main sources of electricity used in industrial processes and
businesses, whereas the other two are more strongly dependent on weather conditions and cannot
necessarily provide energy in a steady manner. Apart from the week-day effect, it is reasonable to
expect that a peak/off-peak or day/night effect exists too in the energy demand series, and this will
also be dealt with in what follows.
After removing the season-of-year and week-day components, we remove the peak/off-peak
(day/night) effect by taking the $s$-th order difference, with $s = 48$, of the demand for energy from
each source $\omega$:
Figure 3: The seasonal component in the demand for energy
Note: This figure plots the deterministic seasonal trend fitted by Chebyshev polynomials of order 5 for each energy source.
$$y^*_{\omega,t,s} = y^*_{\omega,t,l} - y^*_{\omega,t-s,l} \tag{9}$$
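Equation (9) is a seasonal difference at lag 48, that is, each half-hour is compared with the same half-hour one day earlier. A minimal NumPy sketch follows; the function name and the toy series are hypothetical.

```python
import numpy as np

def seasonal_difference(x, s: int = 48) -> np.ndarray:
    """Return x_t - x_{t-s}; the first s observations are lost."""
    x = np.asarray(x, dtype=float)
    return x[s:] - x[:-s]

# Toy series: an exact daily pattern (period 48) plus a linear trend of 0.5 per step.
x = np.tile(np.arange(48.0), 3) + 0.5 * np.arange(144)
d = seasonal_difference(x, s=48)
print(d[:3])  # the daily pattern cancels, leaving 0.5 * 48 = 24 everywhere
```

The example shows why the difference removes any pattern that repeats exactly every 48 observations, at the cost of shortening the sample by one day.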
After this preliminary analysis, we move forward to the second step of our forecast
procedure, where the obtained (now purely stochastic) energy demand series $y^*_{\omega,t,s}$ for each source $\omega$
will be modelled and forecast by univariate and multivariate time series models as well as neural network
models, as briefly described in the next section.
3. METHODOLOGY
In this section, we briefly outline the methodology that we use to obtain the
forecasts of the demand for all the energy sources. Firstly, we describe the six types of forecast
models, going from univariate to multivariate ones; secondly, we present the four types of
criteria used to select the optimal forecasting model within each of the model sets (or classes). Lastly,
we discuss the model averaging techniques and show how they are used to produce improved forecasts.
3.1 Forecast Models
In this subsection, we introduce the six classes of models which we use to fit and
forecast the filtered series. The univariate models are the autoregressive moving average (ARMA),
Holt-Winters Smoothing (HWS) and Non-linear Autoregressive Neural Network (NARNN) models, and the
multivariate models are the Vector Autoregression (VAR), Bayesian VAR and Factor Augmented VAR
models. Recall that $\omega$ denotes the type of fuel used as a source of energy.
1. ARMA
The first forecasting model is the traditional stationary ARMA model. It is the most commonly
used forecast model, including in the energy field, and usually constitutes a benchmark against
Figure 4: The day of the week component in the demand for energy
Notes: This figure plots the deterministic weekly trend fitted by the moving average method of Weron (2007) with cyclical length
equal to 336. Thus, the weekly trend contains 336 observations, which indicate the cyclical pattern for one week.
which other forecasting techniques are compared. Among others, Ediger et al. (2006) and Ediger
and Akar (2007) used ARIMA models to forecast Turkish fossil fuel demand. Also, Sumer et al.
(2009) employed the ARMA class of models to predict electricity demand. In this paper, since the
periodic components and non-stationarity have been removed, we use the following simple
ARMA(p,q) parameterisation:
$$y_{\omega,t+h-1} = \alpha + \sum_{i=1}^{p} \beta_i y_{\omega,t-i} + \sum_{j=1}^{q} \gamma_j \varepsilon_{\omega,t-j} + \varepsilon_{\omega,t+h-1} \tag{10}$$
where $h$ is the forecast horizon. The model is estimated by ordinary least squares and the optimal
lag lengths $p$ and $q$ are determined by information criteria. Once the model has been estimated,
we can use it to predict
$$f_{\omega,t+h-1} = \hat{\alpha} + \sum_{i=1}^{p} \hat{\beta}_i y_{\omega,t-i} + \sum_{j=1}^{q} \hat{\gamma}_j \varepsilon_{\omega,t-j} \tag{11}$$
(11)
The optimal forecasting model is selected from the set of ARMA(p,q) with p and
= 1, ,12q
giving a total of 144 models estimated for each energy source.
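The point forecast in Equation (11) and the size of the ARMA(p,q) grid can be illustrated with a short sketch. The coefficient values and histories below are made up for the example, the function name is hypothetical, and estimation of the coefficients is not shown.

```python
import numpy as np
from itertools import product

def arma_point_forecast(alpha, beta, gamma, y_hist, eps_hist):
    """Eq. (11): alpha + sum_i beta_i * y_{t-i} + sum_j gamma_j * eps_{t-j}.

    y_hist and eps_hist are ordered from oldest to newest, so their last
    elements are y_{t-1} and eps_{t-1}.
    """
    p, q = len(beta), len(gamma)
    y_lags = np.asarray(y_hist, dtype=float)[::-1][:p]      # y_{t-1}, ..., y_{t-p}
    eps_lags = np.asarray(eps_hist, dtype=float)[::-1][:q]  # eps_{t-1}, ..., eps_{t-q}
    return float(alpha + np.dot(beta, y_lags) + np.dot(gamma, eps_lags))

# Hypothetical ARMA(2,1) coefficients and histories.
forecast = arma_point_forecast(
    alpha=0.1, beta=[0.5, 0.2], gamma=[0.3], y_hist=[1.0, 2.0], eps_hist=[0.4]
)
print(forecast)  # 0.1 + 0.5*2.0 + 0.2*1.0 + 0.3*0.4 = 1.42

# The model set: every (p, q) pair with p, q = 1, ..., 12.
orders = list(product(range(1, 13), range(1, 13)))
print(len(orders))  # 144
```

In practice each of the 144 candidate orders would be estimated on the filtered series and ranked by the chosen information criterion.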
2. Holt-Winters Smoothing (HWS)
The HWS belongs to the class of exponentially weighted moving average methods. The
model fits the target process using its past smoothed values and gives more weight to the most recent
ones, such that it can be expressed as,
$$s_{\omega,t} = \alpha a_{\omega,t} + (1-\alpha)(s_{\omega,t-1} + b_{\omega,t-1}) \tag{12}$$
Figure 5: The Stochastic Component of Demand for Energy
Note: This figure plots the stochastic term after removing the deterministic non-linear patterns for each energy source.
such that at time $t$, the actual value of the process is denoted by $a_{\omega,t}$, the smoothed estimate is denoted
by $s_{\omega,t}$ and $b_{\omega,t}$ is the trend. In turn, the trend is formulated as,
$$b_{\omega,t} = \beta(s_{\omega,t} - s_{\omega,t-1}) + (1-\beta) b_{\omega,t-1} \tag{13}$$
where the parameter $\beta$ is the trend smoothing parameter. Therefore, the predicted value is obtained
from,
$$f_{\omega,t} = s_{\omega,t} + i\, b_{\omega,t} \tag{14}$$
Here we select the smoothing parameter $\alpha$ from $[0.7, 0.8, 0.9]$ while $\beta$ is selected from
$[0.1, 0.2, 0.3]$ (see Hong (2013)). Hence, we consider 9 types of HWS models.
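The recursions in Equations (12) to (14) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the initial values $s_0 = a_0$ and $b_0 = a_1 - a_0$ are a common convention that the text does not specify, $i$ in Equation (14) is read as the number of steps ahead, and the function name is hypothetical.

```python
import numpy as np
from itertools import product

def holt_forecast(a, alpha: float, beta: float, steps: int) -> np.ndarray:
    """Run the level/trend recursions over `a` and forecast `steps` periods ahead."""
    a = np.asarray(a, dtype=float)
    s, b = a[0], a[1] - a[0]  # initial level and trend (an assumption)
    for value in a[1:]:
        s_prev, b_prev = s, b
        s = alpha * value + (1 - alpha) * (s_prev + b_prev)  # Eq. (12)
        b = beta * (s - s_prev) + (1 - beta) * b_prev        # Eq. (13)
    return s + np.arange(1, steps + 1) * b                   # Eq. (14), i = 1..steps

# The 3 x 3 grid of smoothing parameters quoted in the text.
grid = list(product([0.7, 0.8, 0.9], [0.1, 0.2, 0.3]))

# Toy series: an exact linear trend, which the recursions track without error.
a = 2.0 * np.arange(10) + 1.0
for alpha, beta in grid:
    print(alpha, beta, holt_forecast(a, alpha, beta, steps=3))  # 21, 23, 25 each time
```

On real data the nine parameter pairs produce different forecasts, and the preferred pair would be chosen with the selection criteria described later in the paper.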
3. Non-linear Auto-Regressive Neural Networks (NARNN)
The restrictions imposed by linear forecast models such as ARMA and HWS are usually
overcome by adopting more general non-linear forecast models (De Gooijer and Kumar, 1992).
Artificial neural networks are a commonly used type of such non-linear models (see Sözen et al.
(2005), Pao (2006) and Kurban and Filik (2009), amongst others), and have been widely used for
the purpose of univariate series forecasting. Although it is criticised for lacking an
underlying economic foundation, the NARNN model (Chow and Leung (1996) and Markham and
Rakes (1998), amongst others) often provides better forecasting accuracy because it is able to
approximate a wide range of functions (Zhang, 2003).
In brief, the NARNN model is a dynamic neural network model which is built on a linear
autoregressive model with feedback across several layers. The model regresses the current
output signal on previous output signals, so that the model equation is defined as follows:
$$y_{\omega,t+h-1} = f(y_{\omega,t-1} + y_{\omega,t-2} + \cdots + y_{\omega,t-p}) \tag{15}$$
where $f$ is a non-linear function, and $p$ is the longest lag of the signal considered. Once the model has
completed training and validation, it can be used to forecast in the same fashion:
$$\hat{y}_{\omega,t+h} = \hat{f}(y_{\omega,t} + y_{\omega,t-1} + \cdots + y_{\omega,t-p+1}) \tag{16}$$
An example of the architecture of the NARNN model is shown in Figure 1. In our case, we
set the number of neurons in the hidden layer to 10, and apply a back-propagation method for training
as in Geem and Roper (2009).
The lag length considered in the NARNN ranges from 1 to 12, so that there are 12 models for
each energy source.
Figure 1: The Architecture of a NARNN model
Source: http://uk.mathworks.com/help/nnet/ref/narnet.html
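The structure of such a network can be sketched in NumPy as a lagged input matrix fed through one hidden layer. This is only a structural illustration: the tanh activation, the random untrained weights and the function names are assumptions, and the back-propagation training step used in the paper is not shown.

```python
import numpy as np

def lag_matrix(y, p: int):
    """Return (X, target) where each row of X is (y_{t-1}, ..., y_{t-p}) and target is y_t."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([y[p - i : len(y) - i] for i in range(1, p + 1)])
    return X, y[p:]

def nar_forward(X, W1, b1, w2, b2):
    """One hidden layer with tanh activation followed by a linear output neuron."""
    hidden = np.tanh(X @ W1 + b1)
    return hidden @ w2 + b2

# Toy input: six observations and p = 2 lags.
X, target = lag_matrix(np.arange(6.0), p=2)
print(X[0], target[0])  # [1. 0.] 2.0

# Ten hidden neurons, as in the text; the weights are random placeholders.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 10)), np.zeros(10)
w2, b2 = rng.normal(size=10), 0.0
predictions = nar_forward(X, W1, b1, w2, b2)
print(predictions.shape)  # (4,)
```

Varying `p` from 1 to 12 in `lag_matrix` would give the twelve candidate input sets mentioned in the text.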
4. Vector Autoregression (VAR)
The models introduced above are self-forecast models, where the predicted value is mainly
based on the serial correlation of historical data. From the fourth model onward, we forecast the
energy demand processes according to causal relationships. In these models, the energy demand
processes are modelled as a system, such that the predictions for each process are obtained
from the entire system. The first causal forecast model is the standard VAR. García-Ascanio and
Maté (2010) used a VAR model to forecast electric power demand in Spain.
The system $y_t$ is composed of the endogenous variables $y_{1,t}, y_{2,t}, \ldots, y_{K,t}$, where each
variable is the demand for energy from one fuel source, such that here $K = 5$. Thus, the VAR model
with lag length $p$ can be formulated as,
1
=1
=,
ε
+− −
Φ+
∑
p
th i ti t
i
yy
(17)
where each $\Phi_i$ is a $K \times K$ coefficient matrix, and $\varepsilon_t$ is a $K$-dimensional vector of error terms with zero mean vector and diagonal variance-covariance matrix $\Sigma$. We estimate each VAR by maximum likelihood (MLE). Then, we use the in-sample estimation results and iterate forward to obtain the out-of-sample predictions.
$$f_{t+h} = \sum_{i=0}^{p-1} \hat{\Phi}_i y_{t-i}, \tag{18}$$
Compared with univariate forecast models, the advantage of VAR models (and of all the VAR-type models which follow) is that they provide predictions based not only on the historical fit of the individual process but also on lags of the other endogenous variables in the system. Note that, in our in-sample estimation and model selection, the maximum lag length considered will be $p = 12$.
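To make Equation (18) concrete, the following Python sketch evaluates the forecast from a set of coefficient matrices that are assumed to have been estimated already. The function name `var_forecast`, the two-variable system and all numbers are illustrative assumptions, not estimates from this paper.

```python
def var_forecast(phi_hats, history):
    """Return sum_i phi_hats[i] * y_{t-i}, following Equation (18).

    phi_hats: list of K x K coefficient matrices (nested lists), one per lag.
    history:  list of K-vectors, oldest first; history[-1] is y_t.
    """
    k = len(history[-1])
    forecast = [0.0] * k
    for i, phi in enumerate(phi_hats):
        lagged = history[-1 - i]  # y_{t-i}
        for row in range(k):
            forecast[row] += sum(phi[row][col] * lagged[col] for col in range(k))
    return forecast


# Hypothetical two-variable system with p = 2 coefficient matrices.
phi_hats = [
    [[0.5, 0.1], [0.0, 0.4]],  # applied to y_t
    [[0.2, 0.0], [0.1, 0.1]],  # applied to y_{t-1}
]
history = [[1.0, 2.0], [2.0, 1.0]]  # y_{t-1}, then y_t
forecast = var_forecast(phi_hats, history)
print(forecast)
```

Each equation of the system draws on the lags of every variable, which is the feature that distinguishes the VAR-type models from the univariate ones.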
5. Bayesian Vector Autoregression (BVAR)
While it is common to use VARs to obtain forecasts, it has also been argued that VARs estimated by Bayesian methods provide better forecasts with more parsimonious models, because standard VARs often suffer from over-fitting problems (Spencer, 1993). Compared with standard estimation, the BVAR (see footnote 2) treats the model's parameters as random variables, and applies Bayesian estimation, imposing restrictions on the dynamics of the parameters according to a specific type of prior. Under this assumption, the coefficients on longer lags are more likely to be near zero, resulting in a more parsimonious estimation. We still use the model shown in Equation (17), but with the Minnesota prior (Del Negro and Schorfheide, 2004). In the VAR system, there are K equations, and each one can be expressed as,
$$y_{i,t+h-1} = \sum_{i=1}^{p} \sum_{j=1}^{K} \phi_{i,j} \cdot y_{j,t-i} + \varepsilon_{i,t}, \tag{19}$$
In this case, the prior beliefs about the coefficients are captured by the prior density function $g(\phi_{i,j})$. Then, using Bayes' theorem, the estimators are obtained from the posterior density functions $g(\phi_{i,j} \mid y_{i,t})$,
$$g(\phi_{i,j} \mid y_{i,t}) = \frac{g(y_{i,t} \mid \phi_{i,j})\, g(\phi_{i,j})}{g(y_{i,t})}, \tag{20}$$
2. For an application see Crompton and Wu (2005), who applied a BVAR model to predict energy consumption in China.
and the predictions of $y_{i,t}$ can be obtained from the following,
$$f_{i,t+h} = \sum_{i=0}^{p-1} \sum_{j=1}^{K} \hat{\phi}_{i,j} \cdot y_{j,t-i}, \tag{21}$$
where again, the maximum lag length considered will be $p = 12$.
6. Factor Augmented Vector Autoregression (FAVAR)
Last, we use a factor augmented VAR. The FAVAR model has been widely applied to large data sets, especially in macroeconomics (Bernanke and Boivin (2003) and Bernanke et al. (2004)). Chudik and Pesaran (2011) claim that a less parameterised VAR model augmented with factors will not lose any relevant information and would often produce better forecasts than standard VARs. In the energy related literature, among others, Baumeister et al. (2016) have adopted VAR, BVAR and FAVAR models to predict gasoline prices in the US market.
The FAVAR aims at modelling a system $x_t$ with $N$ variables. It assumes that $x_t$ contains a subset $y$ of $M$ variables, and that the dynamics of $y$ are driven by unobservable forces in $x_t$. These unobservable forces are factors extracted from $x_t$, containing most of the relevant information. The system can thus be formulated as follows:
$$x_t = \Lambda^{f} \cdot F_t + \Lambda^{y} \cdot y_t + \varepsilon_t, \tag{22}$$
where $\Lambda^{f}$ is an $N \times K$ coefficient matrix for $K$ factors, $\Lambda^{y}$ is an $N \times M$ coefficient matrix, and $\varepsilon_t$ is an $N \times 1$ vector of error terms.
In this paper, we classify the energy demand processes into two groups: one group is the objective observed process for a specific source, $y_t$, while the other group is made up of the energy processes obtained from the other sources, from which one factor is extracted. We denote this second group's data as $x_t$. The FAVAR model can be written in state-space form comprising two equations: the observation equation and the state equation. In the observation equation, the $K$ factors $F_t$, where $K = 1$ in this paper, are extracted from the variables in $x_t$ through principal components. The state equation is then,
$$z_{\omega,t+h-1} = \sum_{i=1}^{p} \Phi_i z_{\omega,t-i} + \varepsilon_{\omega,t+h-1}, \tag{23}$$
where $z_{\omega,t+h-1} = \begin{pmatrix} F_{\omega,t+h-1} \\ y_{\omega,t+h-1} \end{pmatrix}$. Again, $\Phi_i$ is a coefficient matrix, the dimension of which depends on the number of factors extracted from $x_t$, and if there is only one factor, the coefficient matrix will be $2 \times 2$. Therefore, the objective process $y_t$ can be predicted once the $\Phi_i$ have been estimated,
$$\hat{z}_{\omega,t+h} = \sum_{i=0}^{p-1} \hat{\Phi}_i z_{\omega,t-i}, \tag{24}$$
where $\hat{z}_{\omega,t+h} = \begin{pmatrix} \hat{F}_{\omega,t+h} \\ f_{\omega,t+h} \end{pmatrix}$. For each objective process, the predictions are obtained through the causal relationship with factors extracted from the remaining processes. In this paper, we extract one factor from the remaining four energy series. For each energy source, the optimal model is selected by considering lag length $k$ from 1 to 12.
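The factor extraction step can be sketched as follows. This is only an illustration of taking the first principal component of a group of series; it uses power iteration in place of a full eigen-decomposition, and the function name `first_principal_component` and the toy data are assumptions rather than the authors' implementation. The sketch also assumes the series are not constant and that a starting vector of ones is not orthogonal to the leading eigenvector.

```python
import math


def first_principal_component(series, iterations=200):
    """Extract one factor from several series by power iteration.

    series: list of N equally long lists (the variables in x_t).
    Returns (loadings, factor), where factor[t] is the factor value at time t.
    """
    n_obs = len(series[0])
    # Demean each series before computing the covariance matrix.
    centred = []
    for s in series:
        mean = sum(s) / n_obs
        centred.append([v - mean for v in s])
    n_var = len(centred)
    cov = [[sum(a * b for a, b in zip(centred[i], centred[j])) / n_obs
            for j in range(n_var)] for i in range(n_var)]
    # Power iteration converges to the leading eigenvector of cov.
    vec = [1.0] * n_var
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(n_var)) for i in range(n_var)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    factor = [sum(vec[i] * centred[i][t] for i in range(n_var)) for t in range(n_obs)]
    return vec, factor


# Hypothetical group of "other" energy series that move together.
others = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]]
loadings, factor = first_principal_component(others)
print(loadings, factor)
```

The resulting factor would then be stacked with the objective series to form $z$ in Equation (23).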
3.2 IMS and DMS forecast methods
For all the models in each class, we construct forecasts of the demand for energy using both iterated multi-step (IMS) and direct multi-step (DMS) methods. The IMS method provides $h$-steps-ahead predictions through a one-step-ahead predictor $f_1 = \hat{y}_{t+1}$ iterated forward $h$ times. In each iteration, we estimate Equation 25 below using the training sample, and then forecast one period ahead out of sample through Equation 26.
$$y_{t+1} = \Theta_{1,i}(y_t) + \varepsilon_{t+1}, \quad i = 1, \ldots, 6, \tag{25}$$
$$f_1 = \hat{y}_{t+1} = \hat{\Theta}_{1,i}(y_t), \tag{26}$$
where $i$ indexes the model, and $\Theta_{1,i}$ is the set of parameters of the $i$-th model.
The DMS method predicts $h$ steps ahead by forecasting $f_h = \hat{y}_{t+h}$ directly. This is achieved by estimating Equation 27 on the training sample, and then using Equation 28 to predict $f_h$ out of sample.
$$y_{t+h} = \Theta_{h,i}(y_t) + \varepsilon_{t+h}, \quad i = 1, \ldots, 6, \tag{27}$$
$$f_h = \hat{y}_{t+h} = \hat{\Theta}_{h,i}(y_t), \tag{28}$$
where $\Theta_{h,i}$ is the set of parameters of the $i$-th model for DMS.
According to Marcellino et al. (2006), the IMS method should provide lower forecast error once the one-period ahead model is well specified. However, the DMS method is relatively more robust to misspecification in the forecast model. Thus, in general, DMS has often been preferred to IMS in empirical studies. As there is no strong evidence to support a clear-cut choice between IMS and DMS, we shall obtain forecasts with both methods and will compare their respective forecasting ability.
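The difference between the two schemes can be illustrated with a deliberately simple candidate model. The sketch below assumes a first-order autoregression without intercept, fitted by least squares; the helper names `ols_slope`, `ims_forecast` and `dms_forecast` and the toy series are hypothetical and stand in for any of the six model classes.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x without an intercept."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def ims_forecast(series, h):
    """Iterated multi-step: fit the one-step model, then iterate it h times."""
    theta_1 = ols_slope(series[:-1], series[1:])  # y_{t+1} regressed on y_t
    forecast = series[-1]
    for _ in range(h):
        forecast = theta_1 * forecast
    return forecast


def dms_forecast(series, h):
    """Direct multi-step: fit y_{t+h} on y_t and forecast in a single step."""
    theta_h = ols_slope(series[:-h], series[h:])  # y_{t+h} regressed on y_t
    return theta_h * series[-1]


# Toy series that follows an exact AR(1) with coefficient 0.5.
series = [8.0, 4.0, 2.0, 1.0, 0.5]
print(ims_forecast(series, 2), dms_forecast(series, 2))  # prints 0.125 0.125
```

When the one-step model is correctly specified, as in this toy series, the two schemes coincide; they differ once the one-step model is misspecified, which is the case where DMS is expected to be more robust.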
3.3 Model Selection
For each model discussed above, we consider different parameter settings and lag lengths, and assume that the best forecasting model exists among those considered here. The optimal or best model is chosen in-sample by means of different information criteria, including the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), Mallows' Information Criterion (MIC) (Hansen (2007) and Hansen (2008)) and the Jackknife (JKC), a cross-validation criterion suggested by Hansen and Racine (2012) and Hansen (2014). Each information criterion is computed using the in-sample fitted error $\hat{\varepsilon}_t(m) = y_t - \hat{y}_t(m)$. Hence, the estimated fitted error variance equals $\hat{\sigma}^2(m) = \frac{1}{n} \sum_{i=1}^{n} \hat{\varepsilon}_i^2(m)$, where $n$ is the number of observations in-sample.
The AIC and BIC information criteria reward lower fitted errors but penalize a higher number of estimated parameters. With $\hat{\sigma}^2(m)$ the estimated error variance of model $m$, and $k(m)$ the number of parameters in each estimated model, the AIC and BIC can be respectively expressed as:
$$AIC = n \cdot \ln(\hat{\sigma}^2(m)) + 2k(m), \tag{29}$$
$$BIC = n \cdot \ln(\hat{\sigma}^2(m)) + k(m) \cdot \ln(n), \tag{30}$$
where n is the total number of observations in-sample.
The Mallows’ information criterion uses the estimated mean squared errors,
$$MIC = (y_t - \hat{y}_t(m))'(y_t - \hat{y}_t(m)) + 2 \cdot \hat{\sigma}^2(m) \cdot k(m), \tag{31}$$
where $\hat{y}_t(m)$ is the fitted value of $y_t$ from model $m$, and $\hat{\sigma}^2(m)$ and $k(m)$ are defined as above. Last, we use the Jackknife, a cross-validation criterion. To use this cross-validation method, we obtain a leave-one-out estimator for each in-sample point for every model, and then obtain the cross-validation fitted error $\varepsilon_{m,i}$ through the following equation.
$$\varepsilon_{m,i} = y_i - \hat{y}_{-i,m}, \tag{32}$$
where $\hat{y}_{-i,m}$ is the leave-one-out one-step estimate of $y_i$ based on the parameters estimated from the remaining observations. Then, the Jackknife criterion (JKC) can be formulated as,
$$JKC = \frac{1}{n} \cdot \sum_{i=1}^{n} \varepsilon_{m,i}^2, \tag{33}$$
Note that, for each criterion used, the optimal/best forecasting model will be the one which, within each class/set of models, minimizes the information criterion. If two different models within the same class have the same value of the information criterion, the more parsimonious model will be preferred.
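As a rough illustration of Equations (29) to (33), the sketch below computes the four criteria for a single candidate model. The function names and the toy numbers are assumptions made for this example, and the jackknife part uses a hypothetical no-intercept regression of y on x as the candidate model, re-estimated with each observation left out in turn.

```python
import math


def information_criteria(y, fitted, k):
    """AIC, BIC and MIC of Equations (29)-(31) for one candidate model.

    y, fitted: in-sample observations and fitted values; k: number of parameters.
    """
    n = len(y)
    residuals = [a - b for a, b in zip(y, fitted)]
    sse = sum(e * e for e in residuals)   # (y - y_hat)'(y - y_hat)
    sigma2 = sse / n                      # estimated fitted error variance
    aic = n * math.log(sigma2) + 2 * k
    bic = n * math.log(sigma2) + k * math.log(n)
    mic = sse + 2 * sigma2 * k
    return {"AIC": aic, "BIC": bic, "MIC": mic}


def jackknife_criterion(x, y):
    """JKC of Equation (33) for a no-intercept regression of y on x."""
    n = len(y)
    total = 0.0
    for i in range(n):
        # Re-estimate the slope without observation i, then predict y_i.
        sxy = sum(a * b for j, (a, b) in enumerate(zip(x, y)) if j != i)
        sxx = sum(a * a for j, a in enumerate(x) if j != i)
        total += (y[i] - (sxy / sxx) * x[i]) ** 2
    return total / n


ic = information_criteria([1.0, 2.0, 3.0, 4.0], [1.5, 1.5, 3.5, 3.5], k=1)
jkc = jackknife_criterion([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(ic, jkc)
```

Within a model class, the candidate with the smallest value of the chosen criterion would be retained.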
3.4 Model Averaging
In this subsection, we briefly outline the model averaging methods which we shall use to improve the accuracy of our forecasts. We denote the prediction from the $i$-th model as $f_t(i)$ for $1 \le i \le J$, and the average prediction $f_t$ is defined as
$$f_t = \sum_{i=1}^{J} w_i f_t(i), \tag{34}$$
where $f_t(i)$ is obtained from the forecasting model class/set $i$, and $w_i$ is the weight attached to the individual $f_t(i)$ obtained from the $J$ candidate forecasts. At this point, the major issue in model averaging becomes how to specify the weights $w_i$, as different weighting functions are likely to provide different levels of forecasting accuracy.
3.4.1 Simple Model Averaging (SMA)
The simple model averaging provides an equally weighted average of the predictions of all the best forecast models in each class, such that the weights in simple model averaging are just $w_i = \frac{1}{J}$. The SMA forecast is,
$$f_t = \frac{1}{J} \sum_{i=1}^{J} f_t(i), \tag{35}$$
Note that SMA is known to improve the accuracy of forecasts as long as the model candidates are well specified. However, once some of the candidates are not well specified, the accuracy of the averaged prediction will decrease significantly.
3.4.2 Granger-Ramanathan Model Averaging (GRMA)
Granger and Ramanathan (1984) proposed a model average weighting based on the coefficients of a regression model. The regression is made of an average forecast $f_t$ regressed on the candidate predictions $f_t(i)$ and is formulated as:
$$f_t = \beta_0 + \sum_{i=1}^{J} \beta_i f_t(i) + \varepsilon_t$$
Granger and Ramanathan (1984) impose three constraints on the coefficients of this regression. First, the intercept coefficient is equal to zero, $\beta_0 = 0$; second, the coefficients on each candidate prediction should be non-negative, $\beta_i \ge 0$ for all $i$; last, the sum of the coefficients of the regression must be equal to one, $\sum_{i=1}^{J} \beta_i = 1$. Having specified these constraints, they used the estimated coefficients as the weights for averaging, that is $w_i = \hat{\beta}_i$.
$$f_t = \sum_{i=1}^{J} \hat{\beta}_i f_t(i), \tag{36}$$
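For intuition, the three constraints can be imposed in closed form when only two candidate forecasts are combined. The sketch below is an illustrative simplification under that assumption (with more than two candidates a constrained least-squares solver would be needed); the function name `granger_ramanathan_two` and the data are hypothetical.

```python
def granger_ramanathan_two(y, f1, f2):
    """Constrained least-squares weights for two candidate forecasts.

    Minimises sum (y - w*f1 - (1 - w)*f2)^2 subject to 0 <= w <= 1, so the
    intercept is zero and the weights are non-negative and sum to one.
    """
    # Rewrite the residual as (y - f2) - w*(f1 - f2) and solve for w.
    num = sum((a - c) * (b - c) for a, b, c in zip(y, f1, f2))
    den = sum((b - c) ** 2 for b, c in zip(f1, f2))
    w = num / den if den > 0 else 0.5  # identical forecasts: any split works
    w = min(1.0, max(0.0, w))          # clip to enforce non-negativity
    return w, 1.0 - w


y = [1.0, 2.0, 3.0]
low = [0.0, 1.0, 2.0]    # under-predicts by one unit
high = [2.0, 3.0, 4.0]   # over-predicts by one unit
print(granger_ramanathan_two(y, low, high))  # prints (0.5, 0.5)
```

Because the objective is a convex quadratic in a single weight, clipping the unconstrained solution to the unit interval gives the constrained optimum.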
3.4.3 Bayesian Model Averaging (BMA)
Bayesian model averaging assumes that there always exists at least one well-specified model among all candidate models, and therefore one should give more weight to well or better specified candidates while less weight is attached to the rest of the models. The probability of each candidate model being well specified is given as a prior, and then the Bayesian posterior probability can be calculated conditional on the observed data. These Bayesian posterior probabilities for each candidate model are the weights for averaging all potential models. As the prior probability that a candidate model is well specified is not known, the weights can be approximated by using the Bayesian information criterion.
$$w_m^{b} = \frac{\exp(-\frac{1}{2} BIC_m)}{\sum_{j=1}^{M} \exp(-\frac{1}{2} BIC_j)}$$
Thus, the averaged forecast is given by:
$$f_t = \sum_{i=1}^{J} w_i^{b} f_t(i), \tag{37}$$
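The weighting function can be sketched in a few lines of Python. The function names `ic_weights` and `combine` and the criterion values are illustrative assumptions. Subtracting the smallest criterion value before exponentiating leaves the weights unchanged but avoids numerical underflow when the criterion values are large.

```python
import math


def ic_weights(criteria):
    """Weights proportional to exp(-IC/2), normalised to sum to one."""
    best = min(criteria)
    # Shifting by the minimum cancels in the ratio and keeps exp() well scaled.
    raw = [math.exp(-0.5 * (c - best)) for c in criteria]
    total = sum(raw)
    return [r / total for r in raw]


def combine(weights, forecasts):
    """Weighted average of the candidate forecasts, as in Equation (37)."""
    return sum(w * f for w, f in zip(weights, forecasts))


# Hypothetical BIC values for two candidate models and their point forecasts.
weights = ic_weights([100.0, 100.0 + 2.0 * math.log(3.0)])
print(weights, combine(weights, [4.0, 8.0]))
```

The same function can be fed any criterion that is to be minimised, since only differences between the criterion values affect the weights.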
3.4.4 Other Model Averaging Functions
Anderson and Burnham (2002) proposed replacing BIC in the weighting function with AIC, resulting in the smoothed AIC (SAIC) weighting function. In this case the weights $w_m^{a}$ will be:
$$w_m^{a} = \frac{\exp(-\frac{1}{2} AIC_m)}{\sum_{j=1}^{M} \exp(-\frac{1}{2} AIC_j)}$$
and the SAIC model averaging (AMA) forecasts are given by:
$$f_t = \sum_{i=1}^{J} w_i^{a} f_t(i), \tag{38}$$
Similarly, as suggested by Hansen (2007), Hansen (2008), Hansen and Racine (2012) and Hansen (2014), we can use Mallows' information criterion (MIC) or the Jackknife cross-validation criterion (JKC) to replace the BIC, thus obtaining weights $w_m^{m}$ and $w_m^{j}$, such that the two weighting functions will respectively be,
$$w_m^{m} = \frac{\exp(-\frac{1}{2} MIC_m)}{\sum_{j=1}^{M} \exp(-\frac{1}{2} MIC_j)}$$
with the Mallows' model averaging (MMA) forecast being,
$$f_t = \sum_{i=1}^{J} w_i^{m} f_t(i), \tag{39}$$
and
$$w_m^{j} = \frac{\exp(-\frac{1}{2} JKC_m)}{\sum_{j=1}^{M} \exp(-\frac{1}{2} JKC_j)}$$
while the Jackknife model averaging (JMA) forecast is,
$$f_t = \sum_{i=1}^{J} w_i^{j} f_t(i), \tag{40}$$
4. RESULTS AND COMPARISON OF THE FORECASTS
In this section, we provide our results and a comparison between forecasts at different time horizons according to the various criteria. Because of the heavy computational burden, we shall consider static forecasts obtained with both IMS and DMS methods. Our forecast methods can be extended to a dynamic type by constructing recursive or rolling out-of-sample forecasts iteratively; however, this would come at the expense of an even heavier computational burden and is not done here.
In the following, we report results of, and comparisons between, different forecast models, forecast methods, forecast model selection criteria and eventually model averaging methods. The accuracy of these predictors is measured by the Mean Squared Forecast Error (MSFE hereafter), and model averaging predictions are statistically compared using the tests provided by Diebold and Mariano (1995). We predict demand for each energy source with forecast horizons of 30 minutes, 1 hour, 2 hours, 4 hours, 8 hours, 12 hours, 18 hours and 24 hours.
4.1 Comparison based on Forecast Error
For any forecast model $i$, if $f_t(i)$ is the predicted value of the objective $y_t$, the forecast error $\hat{\varepsilon}_t(i)$ is expressed as,
$$\hat{\varepsilon}_t(i) = y_t - f_t(i), \tag{41}$$
Thus, the estimated forecast error variance is,
$$\hat{\sigma}^2(i) = \frac{1}{m} \sum_{t=1}^{m} \hat{\varepsilon}_t(i)^2, \tag{42}$$
where m is the number of out-of-sample predicted points. This paper uses the MSFE to measure the
accuracy of predictions.
$$MSFE(m) = E[(y_t - f_t(m))^2], \tag{43}$$
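A sample version of the MSFE, together with an equally weighted combination, can be sketched as follows. The function names and the numbers are made up for illustration and are not taken from the results tables.

```python
def msfe(actual, forecast):
    """Mean squared forecast error over the out-of-sample points."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return sum(e * e for e in errors) / len(errors)


def simple_average(forecasts):
    """Equal-weight combination of several forecast paths (Equation 35)."""
    j = len(forecasts)
    return [sum(values) / j for values in zip(*forecasts)]


actual = [1.0, 2.0, 3.0]
model_a = [2.0, 3.0, 4.0]  # over-predicts by one unit
model_b = [0.0, 1.0, 2.0]  # under-predicts by one unit
combined = simple_average([model_a, model_b])
print(msfe(actual, model_a), msfe(actual, model_b), msfe(actual, combined))
# prints 1.0 1.0 0.0
```

In this stylised case the errors of the two candidates offset each other exactly, so the combined forecast has a lower MSFE than either candidate.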
Table 2 displays the MSFE, averaged over IMS and DMS forecasts, of the best model of each model class. There are several points to note. In terms of the model selection criteria, we notice that the best/optimal models suggested by the JKC often tend to produce more precise forecasts, in terms of lower forecast error. Comparing different classes of models, NARNN and VAR appear to forecast better than the other model classes. Among the five energy sources, it is easier to obtain accurate predictions of the demand for nuclear energy. This result is to be expected, as nuclear plants, once on-line, run flat-out; it also confirms the preliminary analysis above, which showed less noise in the nuclear energy demand process. On the contrary, CCGT and wind energy demands are relatively more difficult to predict, and display larger values of the MSFE.
Table 3 reports the MSFE of the different model averaging forecasts. Although the DMS method reports lower MSFE in 16 out of the 30 cases, it is still hard to conclude whether IMS beats DMS or the reverse, because each method dominates the other depending on the source of energy
Table 2: MSFE of Individual Models
Source        Criterion  ARMA    HW      NARNN   VAR     BVAR    FAVAR
Coal          AIC        0.0428  1.3233  0.0284  0.0304  0.0438  0.0274
              BIC        0.0428  1.3233  0.0284  0.0305  0.0436  0.0334
              MIC        0.0428  1.3706  0.0284  0.0327  0.0429  0.0397
              JKC        0.0289  1.3233  0.0295  0.0304  0.0438  0.0274
Nuclear       AIC        0.0239  0.0005  0.0005  0.0004  0.0013  0.0007
              BIC        0.0239  0.0005  0.0004  0.0004  0.0012  0.0016
              MIC        0.0239  0.0005  0.0005  0.0004  0.0012  0.0007
              JKC        0.0004  0.0005  0.0005  0.0004  0.0013  0.0007
CCGT          AIC        0.1278  0.1527  0.0971  0.0935  0.1378  0.1017
              BIC        0.1278  0.1527  0.0963  0.0954  0.1404  0.1013
              MIC        0.1278  0.1565  0.1032  0.0955  0.1480  0.0962
              JKC        0.1083  0.1527  0.0971  0.0935  0.1378  0.1017
Wind          AIC        0.1564  0.1201  0.0442  0.0909  0.1254  0.0835
              BIC        0.1742  0.1201  0.0545  0.0866  0.1384  0.4522
              MIC        0.1564  0.0843  0.0442  0.0860  0.1898  0.0835
              JKC        0.1097  0.1201  0.0442  0.0909  0.1254  0.0835
Hydro-Power   AIC        0.0508  0.7546  0.0591  0.0574  0.0483  0.0576
              BIC        0.0508  0.7546  0.0571  0.0553  0.0490  0.0586
              MIC        0.0487  0.6072  0.0591  0.0547  0.0523  0.0588
              JKC        0.1790  0.7546  0.0591  0.0574  0.0573  0.0576
Note: This table reports the mean square forecast error of the best individual forecast models according to different information criteria. The MSFE averages those of IMS and DMS predictions from all forecast horizons.
for which demand is forecast. Among the six types of model averaging methods, BMA and MMA are generally superior to the others. Lastly, but most importantly, Table 3 shows that there always exists a model averaging method which gives a lower MSFE than the best/optimal model from any class reported in Table 2.
4.2 IMS vs DMS
In this subsection, we specifically compare the forecasting ability of IMS and DMS. Figure 6 shows the plots of the MSFEs obtained from the two forecasting methods. Generally, it seems preferable to use DMS, particularly if using ARMA and FAVAR models to forecast a longer horizon. An exception is the case of BVAR, for which the IMS beats the DMS in generating forecasts of the demand for energy produced by coal, CCGT, wind and hydro-power.
In more detail, the ARMA model with the DMS method produces better forecasts for coal, nuclear and hydro-power. With regard to CCGT and wind sources, the FAVAR shows better forecasting ability, but with IMS in one instance and with DMS in the other. Also, as a general result, we observe that the predictions are more accurate at the beginning and at the end of the forecast
Table 3: MSFE of Model Averaging Methods
Source        Method  SMA      GRMA     AMA      BMA      MMA      JMA
Coal          IMS     0.0440   0.0440   0.0297   0.0297   0.0231   0.0340
              DMS     0.0496   0.0330   0.0268   0.0268   0.0255   0.0430
Nuclear       IMS     0.0013   0.00022  0.0005   0.0004   0.00022  0.0007
              DMS     0.00038  0.00063  0.00043  0.00047  0.00036  0.00036
CCGT          IMS     0.1004   0.1275   0.0932   0.0927   0.0988   0.0996
              DMS     0.1113   0.1191   0.1017   0.1004   0.1175   0.1086
Wind          IMS     0.0805   0.2896   0.0429   0.0457   0.0821   0.0679
              DMS     0.0508   0.1366   0.0390   0.0423   0.0536   0.0584
Hydro-Power   IMS     0.0606   0.0958   0.0500   0.0474   0.0449   0.0543
              DMS     0.0635   0.0426   0.0508   0.0508   0.0483   0.0556
Note: This table documents the mean square forecast error of the model averaging methods. The MSFE averages predictors from all forecast horizons.
Figure 6: MSFE of IMS and DMS
Notes: The figure compares the precision of IMS and DMS forecasts for six classes of forecast models. The y axis is the value of MSFE, and the x axis is the forward prediction steps.
horizon, as shown by the MSFE, which is always low in forecasting 30 minutes, 1 hour and 1 day
ahead, but grows to higher levels in the mid-term, indicating that serial correlation or cross-serial
correlation (in multivariate models) is stronger in the short and long-term but weaker in the medium
term (relative to the frequency of the data). Therefore, for empirical purposes, forecasting with DMS
is advised for short and longer terms.
Figure 7: Comparison among Information Criteria
Notes: The figure compares model selection information criteria for six classes of forecast models. The y axis is the value of MSFE, and the x axis is the forward prediction steps.
4.3 The Comparison of Information Criteria
In section 3.3 we outlined four types of information criteria used to select the optimal in-sample forecast model. Here we compare their performance out-of-sample, computing the MSFE of the up-to-one-day forecasts of the demand for energy produced from the best models of each set as selected by each information criterion. Figure 7 illustrates the MSFE of the optimal forecast models in each class as selected by AIC, BIC, MIC and JKC respectively.
In brief, BIC and MIC consistently suggest the same optimal forecast model, while AIC and JKC are more likely to indicate similar forecast models. For all energy sources, the forecasting models suggested by AIC and JKC almost never underperform, in terms of MSFE, their counterparts suggested by BIC and MIC. In particular, in each sector the optimal models suggested by these four information criteria are more or less the same for the HW, NARNN, VAR and BVAR models, the only exception being the BVAR model for wind-produced energy. In terms of ARMA and FAVAR models, the AIC and JKC select better out-of-sample forecasting models, except for the short-horizon forecasts generated by the ARMA model for the coal-produced energy demand series, and the relatively longer horizons by ARMA for hydro-power. Now, since each combination method averages the optimal models from six types of forecasting candidates, and AIC and JKC select models that give lower MSFE, it is reasonable to expect that using these criteria in the weighting function for model averaging will also give more accurate predictions. Below we examine the issue in more detail.
4.4 Comparison of Model Averaging Methods
In this sub-section, we compare the forecasting performance of the different model averaging methods, namely: SMA, GRMA, AMA, BMA, MMA and JMA. Figure 8 displays the MSFEs obtained out-of-sample for the various model averaging methods. The multi-step predictions are computed with both IMS and DMS. Consistent with the results displayed in Table 3 and Figure 6 and already discussed, the DMS method provides slightly more accurate forecasts than the IMS.
Within the IMS-based forecasts, we find that, overall, SMA and AMA produce more accurate predictions for nuclear, CCGT and wind for most of the forecast horizons, while MMA is superior in generating predictions for coal and hydro. In more detail, the AMA, BMA and MMA methods produce good forecasts, although none of them clearly dominates the others. Also, the JMA generates better forecasts for coal but loses efficiency as the forecast horizon approaches 24 hours; the BMA method shows better predictive ability for nuclear and CCGT, while MMA is best for hydro-power sourced energy; the AMA provides better forecasts for the demand for wind-sourced energy. On the other hand, if one uses DMS, the AMA, BMA, MMA and JMA show very similar forecasting abilities, with slight differences for the demand produced by nuclear, CCGT and hydro-power sources. Although the performance of the model averaging methods in forecasting the demand for energy obtained using coal and hydro-power does not differ much regardless of the weighting function used, the MMA and AMA produce slightly better predictions for the demand for energy sourced from coal and wind, respectively.
In more depth, in the case of coal-based energy, the GRMA predictors obtain lower forecast error at the beginning of the predicted period and at its end, while JMA predictions are often superior in the middle of the forecast period. For nuclear-sourced energy, the BMA consistently outperforms the other model averaging methods in the accuracy of its predictions. For CCGT, all model averaging techniques perform more or less the same, and again, BMA is the one that is slightly
more accurate than the others. Regarding the demand for wind-sourced energy, the MMA provides the most accurate predictions in forecasting the longer term (one-day ahead), but AMA generally outperforms the others. Lastly, considering hydro-power produced energy, using IMS, the MMA shows the best forecasting ability while GRMA performs poorly. However, using DMS, both MMA and GRMA produce more accurate forecasts compared with the rest of the model averaging methods.
Next, we use the Diebold-Mariano (DM) and the Wilcoxon signed-rank (Sign) tests (Diebold and Mariano, 1995) to test the prediction equivalence of AMA, BMA, MMA and JMA (given the relatively poorer performance of SMA and GRMA). Table 4 displays the pair-wise results of both tests for IMS and DMS, respectively. Generally, both the DM and the Sign test provide similar results, except for a few cases which concern the predictions obtained by IMS. In Table 4, for a pair of ordered forecasts obtained by weight functions x and y, a positive (negative) and statistically significant value of the statistic implies superiority (inferiority) of the predictions obtained with weighting function y over those obtained using weighting function x. The results can be
Figure 8: Comparison among Model Averaging Methods
Notes: The figure compares six types of forecast model averaging methods within six types of forecast models. The y axis is the value of MSFE, and the x axis is the forward prediction steps.
Table 4: Prediction Equivalent Tests on Model Average Predictors
IMS
            Coal                      Nuclear                   CCGT                      Wind                      Hydro
            DM_test      Sign_test    DM_test      Sign_test    DM_test      Sign_test    DM_test      Sign_test    DM_test      Sign_test
AMA vs BMA  -0.78(0.42)  -0.42(0.67)  -0.83(0.34)  -1.40(0.18)  1.42(0.16)   2.10(0.04)   0.76(0.42)   1.52(0.14)   -0.93(0.35)  1.12(0.26)
AMA vs MMA  3.62(0.00)   2.52(0.01)   -0.48(0.63)  1.68(0.09)   1.35(0.18)   2.10(0.04)   1.56(0.12)   2.10(0.04)   1.93(0.05)   2.10(0.04)
AMA vs JMA  -1.65(0.10)  1.12(0.26)   1.74(0.08)   2.52(0.01)   -0.33(0.74)  1.68(0.09)   0.50(0.62)   2.10(0.04)   -0.97(0.33)  1.12(0.26)
BMA vs MMA  3.63(0.00)   2.52(0.01)   -0.27(0.78)  2.38(0.02)   -0.56(0.58)  1.68(0.09)   -1.21(0.23)  1.12(0.26)   5.36(0.00)   2.52(0.01)
BMA vs JMA  -1.65(0.10)  1.12(0.26)   2.76(0.01)   2.52(0.01)   -1.00(0.32)  1.12(0.26)   -1.89(0.06)  0.42(0.67)   -0.91(0.36)  1.12(0.26)
MMA vs JMA  -2.77(0.01)  -2.52(0.01)  0.71(0.48)   1.12(0.26)   -1.91(0.06)  -0.42(0.68)  -1.35(0.18)  0.42(0.67)   -1.20(0.23)  1.12(0.26)
DMS
            Coal                      Nuclear                   CCGT                      Wind                      Hydro
            DM_test      Sign_test    DM_test      Sign_test    DM_test      Sign_test    DM_test      Sign_test    DM_test      Sign_test
AMA vs BMA  -0.38(0.70)  2.10(0.04)   0.59(0.55)   0.42(0.67)   -0.59(0.55)  0.42(0.67)   1.57(0.12)   2.10(0.04)   -0.53(0.59)  1.12(0.26)
AMA vs MMA  6.12(0.00)   2.52(0.01)   -2.22(0.03)  -0.42(0.67)  2.54(0.01)   2.10(0.04)   2.31(0.02)   2.38(0.02)   0.91(0.36)   2.10(0.04)
AMA vs JMA  -2.01(0.04)  -0.42(0.67)  -2.15(0.03)  -1.40(0.18)  -0.74(0.46)  1.12(0.26)   2.94(0.00)   2.52(0.01)   -2.33(0.02)  -0.42(0.67)
BMA vs MMA  6.13(0.00)   2.52(0.01)   -2.20(0.03)  0.42(0.67)   2.57(0.01)   2.10(0.04)   1.99(0.05)   2.38(0.02)   1.84(0.07)   2.38(0.02)
BMA vs JMA  -2.01(0.04)  -0.42(0.67)  -2.07(0.04)  -1.40(0.18)  -0.57(0.57)  1.68(0.09)   2.83(0.00)   2.38(0.02)   -2.35(0.02)  -0.42(0.67)
MMA vs JMA  -4.03(0.00)  -2.52(0.01)  -0.91(0.36)  1.12(0.26)   -3.46(0.00)  -1.40(0.18)  -0.88(0.38)  1.12(0.26)   -2.79(0.01)  -0.42(0.67)
Notes: This table reports the prediction equivalence results for pair-wise model averaging methods using IMS and DMS. DM_test refers to the Diebold and Mariano test, and Sign_test refers to the Wilcoxon signed-rank test. Both tests use the critical values of the standard normal distribution.
roughly summarized as follows. AMA and BMA provide statistically equivalent forecasts for all energy sectors, regardless of whether the forecasts are obtained by means of IMS or DMS methods. When using IMS, the MMA forecasts are, in general, found to be statistically superior to those of AMA/BMA, except for nuclear energy where contrasting results are found. Also, JMA-produced forecasts, although often equivalent to AMA and BMA, are found in a few cases to be even slightly superior to them. For DMS forecasts, the MMA again outperforms all other methods, while the JMA loses its slight superiority compared to the AMA and BMA forecasts and is actually outperformed when producing forecasts for coal, nuclear and hydro generated energy. Finally, for both the IMS and DMS methods, MMA-obtained forecasts are found to be consistently superior to those given by the JMA weight function.
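For readers who wish to reproduce this kind of pair-wise comparison, the sketch below computes a basic Diebold-Mariano statistic under squared-error loss. It is a simplified version that ignores autocorrelation in the loss differential (for multi-step forecasts a long-run variance estimate would normally be used), and it assumes the loss differential is not constant. The function name `diebold_mariano` and the data are illustrative assumptions.

```python
import math


def diebold_mariano(actual, f_x, f_y):
    """Simplified DM statistic and two-sided normal p-value.

    The loss differential is loss(x) - loss(y), so a positive statistic means
    forecast y has the smaller average squared error, matching the sign
    convention described for Table 4.
    """
    d = [(a - x) ** 2 - (a - y) ** 2 for a, x, y in zip(actual, f_x, f_y)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((v - mean_d) ** 2 for v in d) / (n - 1)
    stat = mean_d / math.sqrt(var_d / n)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


# Hypothetical out-of-sample values and two competing forecast paths.
actual = [0.0, 0.0, 0.0, 0.0]
forecast_x = [1.0, 2.0, 1.0, 2.0]
forecast_y = [0.0, 1.0, 0.0, 1.0]
stat, p_value = diebold_mariano(actual, forecast_x, forecast_y)
print(stat, p_value)
```

Here forecast y is closer to the actual values at every point, so the statistic is positive and the p-value is small.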
5. FORECASTING THE DEMAND FOR ENERGY IN LEVELS
In this section, we finally obtain forecasts of the levels of the energy demand for the five energy sources on 22 March 2016. Specifically, we re-combine the IMS and DMS predictions obtained through all the model averaging methods with the deterministic/periodic terms captured by Equations 4, 8 and 9, thereby obtaining forecasts for the level of the UK demand for energy produced by the five sources considered. Figure 9 illustrates the predicted and actual values for coal, nuclear, CCGT, wind and hydro-power, respectively. Treating the forecasts from the individual forecast models as benchmarks, the figures compare the predictions obtained from model averaging to those of the individual forecasting models obtained by means of both IMS and DMS.
Among the six benchmark forecast models, the HWS model is the worst performing one, and the NARNN is the best, with a performance close to that of the model averaging methods. However, it is important to reiterate that there always exists a model averaging forecast that can beat the forecasts obtained from a benchmark model. Another notable fact is that the predictions become more accurate as the forecast horizon approaches 24 hours, and also that DMS outperforms IMS in longer-term forecasting.
In more detail, the model averaging predictions for coal, CCGT, wind and hydro-power remain relatively accurate after adding the periodic and deterministic components. This is not the case for the demand for nuclear-sourced energy, where the forecasts of the level are less accurate due to the inaccuracy in fitting the deterministic components.
Using the IMS method, predictions become more and more accurate approaching the 1 day horizon. For coal-sourced energy, the GRMA method produces the most precise forecasts at shorter horizons (30 minutes to 4 hours), while MMA becomes superior at longer horizons (6 to 24 hours). For nuclear energy, MMA and JMA allow us to obtain better predictions than the others. All model averaging methods give similar forecasts of CCGT-sourced energy. AMA predictions outperform the other forecasting methods for the demand for wind-fuelled energy, and lastly, in the hydro-power sector, the MMA method again shows better forecasting ability than the other model averaging methods.
6. CONCLUSIONS
In this paper we have produced more accurate short-term forecasts of the demand for energy in the UK using a forecasting approach based on model averaging of several popular linear or non-linear, univariate and multivariate forecast models. Specifically, we used an algorithm that, having obtained the forecasts from sets of ARMA, Holt-Winters, Non-Linear Autoregressive Neural Network, Vector Autoregression, Bayesian VAR and Factor Augmented VAR models, selects the best
forecasting model from each model-set according to four different information criteria (AIC, BIC, Mallows' and Jackknife). The best models as selected by each of the different information criteria within each model set are then averaged using six different combination weight metrics including Simple Model Averaging, Granger-Ramanathan Model Averaging, Bayesian Model Averaging, Akaike Model Averaging, Mallows Weights and the Jackknife.
Our results confirm the merits of combination forecasting as a superior forecasting strategy. Among the single forecasting models, NARNN and VAR forecasts are superior in terms of lower MSFE, whilst HWS performs worst. Unexpectedly, DMS forecasts outperform those obtained by IMS in terms of accuracy. For all energy sources, the forecasting models selected by AIC and JKC almost never underperform, in terms of MSFE, their counterparts suggested by BIC and MIC. Among the six types of model averaging methods, BMA and MMA are generally superior to the others. Lastly, but most importantly, there always exists a model averaging method which gives a lower MSFE than the best/optimal models within each class, however selected.
Figure 9: The Model Averaging vs Benchmark Models
Notes: The figure reports six types of model averaging forecasts from six sets of forecast models. The y axis is the value, and
the x axis is the forward prediction steps. We do not plot the results of the HW model given its poor performance.
As highlighted above, accurate forecasts are a precious resource for demand response and/or
load management. With timely and accurate prediction of demand, load management programs
facilitate system load balancing by avoiding peak occurrences. On the other hand, they can also
be crucial for demand response, which has been gaining prominence in recent years as an effective
and inexpensive tool for reducing overall utility peak demand while improving system-wide
energy efficiency. Through the curtailment of electricity consumed by end-users during periods of
high demand or electricity grid instability, demand response technology addresses unexpected
variances in electricity supply and demand levels. When wholesale electricity market prices are high or
when overall grid system reliability is compromised, demand response programs offer incentives
to end-users in order to affect time of use, instantaneous demand level, and/or aggregate electricity
consumption.
Given accurate forecasts with low error such as those we obtained using model averaging,
it is theoretically possible for network management to, for example, temporarily curtail a portion
of the network load in some areas of a city whenever a predetermined peak in demand is approached
in other areas. It is thus necessary for the network management to determine an acceptable peak
demand load maximum for the various areas. The real-time energy monitoring system provided
by smart metering, together with the model averaging forecast, can then signal an upcoming breach of the
predetermined peak demand load maximum. A curtailment policy then sheds unnecessary loads
during these events in order to control overall peak loading and prevent an unwanted peak demand
occurrence. Clearly, accurate model averaging forecasts would be particularly useful for efficient
and cost-effective peak demand energy management across city municipalities and other large
energy end-users. In this case there would be added benefit not only to the electric utility provider but
also to the environment through efficient and reduced power generation capacity. Such reduction
and efficient usage of power generation would undoubtedly contribute to the energy sustainability
of local municipalities and their communities.
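The curtailment logic described above can be sketched in a few lines of Python. This is only an illustration: the function names, the load figures and the peak limit are hypothetical, and a real system would combine the forecast with live smart-meter readings.

```python
def flag_peak_breaches(forecast_load, peak_limit):
    """Return the forecast steps (1-based) whose predicted load exceeds the limit."""
    return [step for step, load in enumerate(forecast_load, start=1)
            if load > peak_limit]


def curtailment_needed(forecast_load, peak_limit):
    """Load to shed at each step (same units as the forecast) to stay within the limit."""
    return [max(0.0, load - peak_limit) for load in forecast_load]


# Hypothetical half-hourly combined forecast for one area and its agreed peak maximum.
forecast_load = [41.0, 44.5, 48.5, 50.75, 47.5]
peak_limit = 48.0

breaches = flag_peak_breaches(forecast_load, peak_limit)
to_shed = curtailment_needed(forecast_load, peak_limit)
print("steps breaching the limit:", breaches)
print("load to shed per step:", to_shed)
```

The lower the forecast error, the smaller the safety margin that would have to be built into the peak limit, which is where the accuracy gain from model averaging would matter in practice.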
REFERENCES
Amjady, N. (2001), “Short-term hourly load forecasting using time-series modeling with peak load estimation capability,”
IEEE Transactions on Power Systems 16(3): 498–505. https://doi.org/10.1109/59.932287.
Anderson, D. R. and K. P. Burnham (2002). “Avoiding pitfalls when using information-theoretic methods,” The Journal of
Wildlife Management pp. 912–918. https://doi.org/10.2307/3803155.
Badri, M. A., A. Al-Mutawa, D. Davis and D. Davis (1997). “Edssf: a decision support system (dss) for electricity peak-load
forecasting,” Energy 22(6): 579–589. https://doi.org/10.1016/S0360-5442(96)00163-6.
Bates, John M and Clive W.J. Granger (1969). “The combination of forecasts,” Journal of the Operational Research Society
20(4): 451–468. https://doi.org/10.1057/jors.1969.103.
Baumeister, C., L. Kilian and T. K. Lee (2016). “Inside the crystal ball: New approaches to predicting the gasoline price at the
pump,” Journal of Applied Econometrics .
Bernanke, B. S. and J. Boivin (2003). “Monetary policy in a data-rich environment,” Journal of Monetary Economics 50(3):
525–546. https://doi.org/10.1016/S0304-3932(03)00024-2.
Bernanke, B. S., J. Boivin and P. Eliasz (2004). Measuring the effects of monetary policy: a factor-augmented vector
autoregressive (favar) approach, Technical report, National Bureau of Economic Research. https://doi.org/10.3386/w10220.
Chow, T.W.S. and C.T. Leung (1996). “Neural network based short-term load forecasting using weather compensation,” IEEE
Transactions on Power Systems 11(4): 1736–1742. https://doi.org/10.1109/59.544636.
Chudik, A. and M. Pesaran (2011). “Infinite-dimensional vars and factor models,” Journal of Econometrics 163(1): 4–22.
https://doi.org/10.1016/j.jeconom.2010.11.002.
Crompton, P. and Y. Wu (2005). “Energy consumption in China: past trends and future directions,” Energy economics 27(1):
195–208. https://doi.org/10.1016/j.eneco.2004.10.006.
Cuestas, J. C. and L. A. Gil-Alana (2016). “Testing for long memory in the presence of non-linear deterministic trends with
Chebyshev polynomials,” Studies in Nonlinear Dynamics & Econometrics 20(1): 57–74.
https://doi.org/10.1515/snde-2014-0005.
De Gooijer, Jan G. and Kuldeep Kumar (1992). “Some recent developments in non-linear time series modelling, testing, and
forecasting,” International Journal of Forecasting 8(2): 135–156. https://doi.org/10.1016/0169-2070(92)90115-P.
Del Negro, Marco and Frank Schorfheide (2004). “Priors from general equilibrium models for vars,” International Economic
Review 45(2): 643–673. https://doi.org/10.1111/j.1468-2354.2004.00139.x.
Diebold, Francis X. and Roberto S. Mariano (1995). “Comparing predictive accuracy,” Journal of Business & Economic
Statistics 13(3): 253–263.
Ediger, Volkan Ş. and Sertac Akar (2007). “Arima forecasting of primary energy demand by fuel in turkey,” Energy Policy 35(3):
1701–1708. https://doi.org/10.1016/j.enpol.2006.05.009.
Ediger, Volkan Ş., Sertaç Akar and Berkin Uğurlu (2006). “Forecasting production of fossil fuel sources in Turkey using a
comparative regression and arima model,” Energy Policy 34(18): 3836–3846. https://doi.org/10.1016/j.enpol.2005.08.023.
Ermis, K., A. Midilli, I. Dincer and M.A. Rosen (2007). “Artificial neural network analysis of world green energy use,”
Energy Policy 35(3): 1731–1743. https://doi.org/10.1016/j.enpol.2006.04.015.
Fan, J.Y. and J.D. McDonald (1994). “A real-time implementation of short-term load forecasting for distribution power sys-
tems,” IEEE Transactions on Power Systems 9(2): 988–994. https://doi.org/10.1109/59.317646.
Filik, Ümmühan Başaran, Ömer Nezih Gerek and Mehmet Kurban (2011). “A novel modeling approach for hourly forecasting
of long-term electric energy demand,” Energy Conversion and Management 52(1): 199–211.
https://doi.org/10.1016/j.enconman.2010.06.059.
Fouquet, Roger, Peter Pearson, David Hawdon, Colin Robinson and Paul Stevens (1997). “The future of UK final user energy
demand,” Energy Policy 25(2): 231–240. https://doi.org/10.1016/S0301-4215(96)00109-7.
Francis, Brian M., Leo Moseley and Sunday Osaretin Iyare (2007). “Energy consumption and projected growth in selected
Caribbean countries,” Energy Economics 29(6): 1224–1232. https://doi.org/10.1016/j.eneco.2007.01.009.
García-Ascanio, Carolina and Carlos Maté (2010). “Electric power demand forecasting using interval time series: A comparison
between var and imlp,” Energy Policy 38(2): 715–725. https://doi.org/10.1016/j.enpol.2009.10.007.
Geem, Zong Woo and William E. Roper (2009). “Energy demand estimation of South Korea using artificial neural network,”
Energy policy 37(10): 4049–4054. https://doi.org/10.1016/j.enpol.2009.04.049.
Granger, Clive W.J. and Ramu Ramanathan (1984). “Improved methods of combining forecasts,” Journal of Forecasting
3(2): 197–204. https://doi.org/10.1002/for.3980030207.
Guidolin, Massimo and Allan Timmermann (2007). “Asset allocation under multivariate regime switching,” Journal of Eco-
nomic Dynamics and Control 31(11): 3503–3544. https://doi.org/10.1016/j.jedc.2006.12.004.
Haas, Reinhard and Lee Schipper (1998). “Residential energy demand in oecd-countries and the role of irreversible efficiency
improvements,” Energy economics 20(4): 421–442. https://doi.org/10.1016/S0140-9883(98)00003-6.
Hagan, Martin T. and Suzanne M Behr (1987). “The time series approach to short term load forecasting,” IEEE Transactions
on Power Systems 2(3): 785–791. https://doi.org/10.1109/TPWRS.1987.4335210.
Hansen, Bruce E. (2007). “Least squares model averaging,” Econometrica 75(4): 1175–1189. https://doi.org/10.1111/j.1468-
0262.2007.00785.x.
Hansen, Bruce E. (2008). “Least-squares forecast averaging,” Journal of Econometrics 146(2): 342–350. https://doi.
org/10.1016/j.jeconom.2008.08.022.
Hansen, Bruce E. (2014). “Nonparametric sieve regression: Least squares, averaging least squares, and cross-validation,”
Handbook of Applied Nonparametric and Semiparametric Econometrics and Statistics, forthcoming .
Hansen, Bruce E. and Jeffrey S. Racine (2012). “Jackknife model averaging,” Journal of Econometrics 167(1): 38–46.
https://doi.org/10.1016/j.jeconom.2011.06.019.
Hendry, David F. and Michael P. Clements (2004). “Pooling of forecasts,” The Econometrics Journal 7(1): 1–31. https://doi.
org/10.1111/j.1368-423X.2004.00119.x.
Hong, Wei-Chiang (2013). Intelligent energy demand forecasting, Springer. https://doi.org/10.1007/978-1-4471-4968-2.
Hunt, Lester C., Guy Judge and Yasushi Ninomiya (2003). “Underlying trends and seasonality in UK energy demand: a sec-
toral analysis,” Energy Economics 25(1): 93–118. https://doi.org/10.1016/S0140-9883(02)00072-5.
Kurban, Mehmet and U. Basaran Filik (2009). “Next day load forecasting using artificial neural network models with
autoregression and weighted frequency bin blocks,” International Journal of Innovative Computing, Information and Control
5(4): 889–898.
Lai, T.M., W.M. To, W.C. Lo and Y.S. Choy (2008). “Modeling of electricity consumption in the Asian gaming and tourism
center-macao sar, People’s Republic of China,” Energy 33(5): 679–688. https://doi.org/10.1016/j.energy.2007.12.007.
Madigan, David and Adrian E. Raftery (1994). “Model selection and accounting for model uncertainty in graphical models
using occam’s window,” Journal of the American Statistical Association 89(428): 1535–1546. https://doi.org/10.1080/01
621459.1994.10476894.
Maia, Andre Luiz S., Francisco de A.T. de Carvalho and Teresa B. Ludermir (2006). Symbolic interval time series forecasting
using a hybrid model, in “2006 Ninth Brazilian Symposium on Neural Networks (SBRN’06),” IEEE, pp. 202–207.
Marcellino, Massimiliano, James H. Stock and Mark W. Watson (2006). “A comparison of direct and iterated multistep ar
methods for forecasting macroeconomic time series,” Journal of econometrics 135(1): 499–526. https://doi.org/10.1016/j.
jeconom.2005.07.020.
Markham, Ina S. and Terry R. Rakes (1998). “The effect of sample size and variability of data on the comparative performance
of artificial neural networks and regression,” Computers & operations research 25(4): 251–263.
https://doi.org/10.1016/S0305-0548(97)00074-9.
McAvinchey, Ian D. and Andreas Yannopoulos (2003). “Stationarity, structural change and specification in a demand system:
the case of energy,” Energy Economics 25(1): 65–92. https://doi.org/10.1016/S0140-9883(02)00035-X.
Nogales, Francisco Javier, Javier Contreras, Antonio J. Conejo and Rosario Espínola (2002). “Forecasting next-day
electricity prices by time series models,” IEEE Transactions on power systems 17(2): 342–348.
https://doi.org/10.1109/TPWRS.2002.1007902.
Pao, Hsiao-Tien (2006). “Comparing linear and nonlinear forecasts for taiwan’s electricity consumption,” Energy 31(12): 2129–
2141. https://doi.org/10.1016/j.energy.2005.08.010.
Sadorsky, Perry (2009). “Renewable energy consumption, CO2 emissions and oil prices in the g7 countries,” Energy Econom-
ics 31(3): 456–462. https://doi.org/10.1016/j.eneco.2008.12.010.
Sözen, Adnan, Erol Arcaklioğlu and Mehmet Özkaymak (2005). “Turkey’s net energy consumption,” Applied Energy 81(2):
209–221. https://doi.org/10.1016/j.apenergy.2004.07.001.
Spencer, David E. (1993). “Developing a bayesian vector autoregression forecasting model,” International Journal of
Forecasting 9(3): 407–421. https://doi.org/10.1016/0169-2070(93)90034-K.
Sumer, Kutluk Kagan, Ozlem Goktas and Aycan Hepsag (2009). “The application of seasonal latent variable in forecasting
electricity demand as an alternative method,” Energy policy 37(4): 1317–1322. https://doi.org/10.1016/j.enpol.2008.11.014.
Timmermann, Allan (2006). “Forecast combinations,” Handbook of economic forecasting 1: 135–196.
Weron, Rafal (2007). Modeling and forecasting electricity loads and prices: A statistical approach, Vol. 403, John Wiley & Sons.
Zhang, G. Peter (2003). “Time series forecasting using a hybrid arima and neural network model,” Neurocomputing 50:
159–175. https://doi.org/10.1016/S0925-2312(01)00702-0.
APPENDIX: PREDICTION EQUIVALENT TESTS
Diebold and Mariano (1995) proposed statistical tests to compare the forecasting errors
from pairs of models. In the present paper, we use two types of tests: the Diebold and Mariano
asymptotic test and Wilcoxon's signed-rank test. Both tests aim to distinguish the null
$$H_0: E[g(e_{i,t})] = E[g(e_{j,t})]$$
from the alternative
$$H_1: E[g(e_{i,t})] \neq E[g(e_{j,t})]$$
where $g(e_{i,t})$ is a forecasting loss function on model i. Also, define the loss differential series
$d_t \equiv [g(e_{i,t}) - g(e_{j,t})]$ for models i and j. Thus, the null hypothesis can also be understood as
$E[d_t] = 0$.
The first test used is the Diebold and Mariano asymptotic test, which holds under the mild assumption
that $d_t$ is a covariance stationary and short memory series. Then, we have,
$$S_1 = \frac{\bar{d}}{\sqrt{2\pi \hat{f}_d(0)/T}} \overset{a}{\sim} N(0,1) \qquad (44)$$
where $\bar{d} = \frac{1}{T}\sum_{t=1}^{T} [g(e_{i,t}) - g(e_{j,t})]$, and the variance term is
$$2\pi \hat{f}_d(0) = \sum_{\tau=-(T-1)}^{T-1} I\left(\frac{\tau}{S(T)}\right) \hat{\gamma}_d(\tau)$$
where $I(\tau/S(T))$ is the lag window and $S(T)$ is the truncation lag. Note that $I(\tau/S(T)) = 0$
for $\tau > h-1$, as the h-step-ahead forecast errors are at most $(h-1)$-dependent. The sample
autocovariance is
$$\hat{\gamma}_d(\tau) = \frac{1}{T}\sum_{t=\tau+1}^{T} (d_t - \bar{d})(d_{t-\tau} - \bar{d})$$
As the Diebold and Mariano test is a two-sided test, it not only tests the equivalence, but also
provides superior and inferior comparisons. A case in which the statistic $S_1$ falls outside the
right (left)-hand confidence interval implies that the forecasting error $e_{i,t}$ ($e_{j,t}$) is greater than
$e_{j,t}$ ($e_{i,t}$) under a measurable function $g(\cdot)$; thus, the predictor $f_{i,t}$ ($f_{j,t}$) is less
accurate than $f_{j,t}$ ($f_{i,t}$).
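A minimal Python sketch of the Diebold and Mariano statistic follows. It is an illustration rather than the code used for the reported results, and it assumes a squared-error loss by default and a rectangular lag window that gives weight one to the autocovariances up to lag h-1 and zero beyond. The function names and the numbers in the usage lines are hypothetical.

```python
import math


def autocovariance(d, lag):
    """Sample autocovariance of the loss differential d at a non-negative lag."""
    T = len(d)
    d_bar = sum(d) / T
    return sum((d[t] - d_bar) * (d[t - lag] - d_bar) for t in range(lag, T)) / T


def diebold_mariano(errors_i, errors_j, h=1, loss=lambda e: e * e):
    """Return the DM statistic and its two-sided asymptotic p-value.

    errors_i, errors_j: forecast errors of models i and j over the same periods.
    h: forecast horizon; autocovariances up to lag h-1 enter the variance.
    """
    d = [loss(ei) - loss(ej) for ei, ej in zip(errors_i, errors_j)]
    T = len(d)
    d_bar = sum(d) / T
    # Rectangular window: gamma(0) plus twice the autocovariances at lags 1..h-1.
    long_run_var = autocovariance(d, 0) + 2 * sum(
        autocovariance(d, tau) for tau in range(1, h)
    )
    if long_run_var <= 0:
        raise ValueError("non-positive long-run variance estimate")
    stat = d_bar / math.sqrt(long_run_var / T)
    # Two-sided p-value under the standard normal limit.
    p_value = math.erfc(abs(stat) / math.sqrt(2))
    return stat, p_value


# Hypothetical one-step-ahead forecast errors of two models.
errors_i = [1.0, -1.0, 2.0, -2.0]
errors_j = [0.5, -0.5, 1.0, -1.0]
stat, p_value = diebold_mariano(errors_i, errors_j, h=1)
print(stat, p_value)
```

A positive statistic indicates that model i has the larger average loss, so a value in the right-hand rejection region points to model j as the more accurate predictor, and conversely for the left-hand region. With a rectangular window the variance estimate can turn out non-positive in small samples, which is why the sketch raises an error in that case.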
The second test introduced is Wilcoxon's signed-rank test. The test statistic follows a
standard normal distribution asymptotically under the assumption that the loss differential series
$d_t$ is independent and identically distributed (i.i.d.). Since we compare predictors for different
forecasting horizons, it is reasonable to treat $d_t$ as i.i.d.
$$S_{2a} = \frac{S_2 - \frac{T(T+1)}{4}}{\sqrt{\frac{T(T+1)(2T+1)}{24}}} \overset{a}{\sim} N(0,1) \qquad (45)$$
where
$$S_2 = \sum_{t=1}^{T} I_{+}(d_t)\, \mathrm{rank}(|d_t|)$$
with $I_{+}(d_t) = 1$ if $d_t > 0$ and 0 otherwise, and $\mathrm{rank}(\cdot)$ denoting Wilcoxon's rank operator.
Wilcoxon's signed-rank test can also compare the superiority-inferiority through the sign.
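The signed-rank statistic and its normal approximation in equation (45) can be sketched in Python as follows. This is a minimal illustration with hypothetical function names and data. Averaging the ranks of tied absolute values is our assumption, since the text does not say how ties are handled.

```python
import math


def rank_abs(values):
    """Ranks (1 = smallest) of the absolute values, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda k: abs(values[k]))
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next absolute value ties with the current one.
        while end + 1 < len(order) and abs(values[order[end + 1]]) == abs(values[order[pos]]):
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = average_rank
        pos = end + 1
    return ranks


def wilcoxon_signed_rank(d):
    """Return the sum of ranks of positive d_t and its standardised value."""
    T = len(d)
    ranks = rank_abs(d)
    rank_sum = sum(r for x, r in zip(d, ranks) if x > 0)
    mean = T * (T + 1) / 4
    std = math.sqrt(T * (T + 1) * (2 * T + 1) / 24)
    return rank_sum, (rank_sum - mean) / std


# Hypothetical loss differentials between two models.
d = [0.5, -1.0, 2.0, 3.0, -1.5, 4.0]
rank_sum, z = wilcoxon_signed_rank(d)
print(rank_sum, z)
```

A standardised value well above zero indicates that the positive loss differentials dominate, that is, that model i tends to incur the larger loss. A value well below zero indicates the opposite.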
Copyright of Energy Journal is the property of International Association for Energy
Economics, Inc. and its content may not be copied or emailed to multiple sites or posted to a
listserv without the copyright holder's express written permission. However, users may print,
download, or email articles for individual use.