Journal of Physics: Conference Series
PAPER • OPEN ACCESS
Probabilistic Load Forecasting of Adaptive Multiple Polynomial
Regression considering Temperature Scenario and Dummy variables
To cite this article: Jiang Li et al 2020 J. Phys.: Conf. Ser. 1550 032117
Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution
of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Published under licence by IOP Publishing Ltd
IWAACE 2020
Journal of Physics: Conference Series 1550 (2020) 032117
IOP Publishing
doi:10.1088/1742-6596/1550/3/032117
Probabilistic Load Forecasting of Adaptive Multiple
Polynomial Regression considering Temperature Scenario
and Dummy variables
Jiang Li 1, Liyang Ren 1, Baocai Wang 1 and Guoqing Li1
1Northeast Electric Power University, Jilin, Jilin, 132000, China
2China Electric Power Research Institute, Beijing, 100000, China
*Liyang Ren: 2024468066@qq.com
Abstract. Low-accuracy monthly or yearly history data often lead to low prediction accuracy in load
forecasting. We use temperature data from Sydney, Australia, and the New South Wales Natural Load
Dataset. To improve data-based forecasting accuracy under time-related scenarios, this paper builds an
adaptive multiple polynomial regression model that considers temperature scenarios and dummy variables.
The variables are divided into three groups: trend variables, date variables and temperature variables.
Trend variables capture overall economic development and user habits. Date variables deal with the
characteristics of working days and holidays. A cubic function of temperature, fitted to the Australian
temperature and New South Wales electric load history data, describes the relationship between load and
temperature. Temperature scenarios are generated by considering the different loads of different seasons
and a probabilistic search over candidate scenarios. The load forecasting interval under different
scenarios is given and analyzed by using dummy variables. Finally, the method is validated on the history
data of a certain area. The prediction results are accurate, intuitive and easy to interpret, and can
provide a reliable decision basis for long-term load forecasting. In the simulation analysis, the
accuracy of load forecasting based on 3-year history increases by 3.8%.
1. Introduction
Long-term load forecasting is very important for the production, operation, planning and construction
of power systems, and it is both the basis and the value of history data mining [1, 2]. With load
diversification and the integration of large-scale distributed renewable energy sources, it is
increasingly difficult to give the load forecasting interval under complex scenarios.
In recent years, research has mainly focused on short-term load forecasting, and papers on
long-term load forecasting are relatively few [3]. In practice, forecasting is essentially a stochastic
problem. Exact forecasting of the future is therefore impossible, and forecasts over
long-term horizons can only serve as a reference for reducing the effect of uncertainty as far as possible
[4]. One way to deal with this limitation is scenario analysis, which looks into a selected scenario in the
future. Because of the uncertainty in weather and economic forecasts, the forecasting process is encouraged
to provide explicit forecast values based on different scenarios. Other load forecasting methods are
predictive modeling, weather normalization, and probabilistic forecasting [5]. There are many
traditional short-term load prediction methods, such as the regression prediction method and the gray
prediction method. There are also intelligent prediction algorithms, such as the support vector machine
method and the neural network method [6]. The gray prediction method requires few sample data and is easy
to implement. However, it is suited to demand load that follows an exponential trend [7-9]. The neural
network method gives effective prediction results, but as a black-box model it cannot explain the
relationships between input and output variables, which weakens its interpretability, and it is easily
trapped in a local optimal solution. It is therefore very difficult to initialize the model [10]. The
regression analysis method is simple in principle and has a clear solving algorithm. Its prediction is
fast, the model has strong explanatory power, and it was the earliest method used in load forecasting.
Literature [11] proposes a new approach to support the process of forecasting the hourly electric load
values for the next day. The adopted methodology, based on neural networks, relies only on detailed
information about consumers' typical behavior and climatic information. In [12], the case study was tested
on two real distribution substation outputs, demonstrating its effectiveness and practical applicability.
Literature [13, 14] provides new ideas for regression prediction. However, the method cannot reflect the
inherent mechanism of load fluctuation: it considers only quantitative factors such as gross national
product and population, and neglects the meteorological temperature, periodic load characteristics and
the special nature of the holiday load, which limits the adaptability of the method under different
scenarios.
As the economic level rises, the proportion of temperature-sensitive load in households is
increasing, so the load varies more and more noticeably with temperature. Because temperature is
uncertain, load forecasting is a stochastic problem. The main existing methods are point forecasts, which
cannot determine the interval of future load fluctuations, so it is unscientific to judge
long-term load forecasting only by comparing the predicted and true values at corresponding points
[15-16]. The low-accuracy features used by traditional prediction methods, such as the monthly maximum or
minimum temperature, provide very limited information for the prediction model. The prediction errors are
large and the interpretation ability is poor, because such features cannot indicate the specific moment
when the temperature occurs or the dynamic response of the load to temperature [17, 18]. Therefore,
this paper proposes a high-precision load forecasting method that adapts to different data quality to
solve such problems.
The main contribution of this paper is to generate temperature scenarios and apply them to the
probabilistic load forecasting problem by using dummy variables. The long-term load forecasting accuracy
is improved, and both the upper and lower boundaries are given by probabilistic forecasting [19].
Based on hourly history data, we first establish a regression model in which dummy variables
quantify the year, week and day. When the weekly history data are classified, the special nature of
holidays is taken into account, and the temperature effect is considered for periodic
load forecasting on working days and holidays. By comparing the forecasting errors of scenarios on
different time scales, the optimal scenario with high accuracy is generated. Probability is used to
optimize the load parameters, and the forecasting interval is used to bound the load change. The concept
of the temperature scene is explained in Section 3.3, and its construction is introduced through
simulation verification in Section 4.2.
The remainder of the paper is organized as follows. Section 2 establishes the generalized multivariate
linear regression model for load forecasting based on hourly history data and regression constants.
Section 3 describes the detailed model for probabilistic load forecasting using trend variables, date
dummy variables and temperature scenarios. Section 4 verifies the performance of the proposed method.
Section 5 concludes the paper and proposes future work.
2. Generalized Multiple Linear Regression
In this section, a multivariate linear regression model is firstly given and then the polynomial regression
model is proposed to solve uncertainties from working days and holidays.
The general form of a multivariate linear regression model:
$$Y_i = \beta_0 + \beta_1 X_{i1} + \cdots + \beta_p X_{ip} + e_i, \quad i = 1, 2, \ldots, n \tag{1}$$
Here, β0 is the regression constant and β1, …, βp are the partial regression coefficients.
Y is the explained variable (dependent variable), X1, X2, ⋅⋅⋅, Xp are the explanatory
variables (independent variables), and e_i is the random error [20-22]. Compared with other load
forecasting methods, the load forecasting method based on principal component regression effectively
retains most of the information in the original variables and reduces the correlation among the data,
which improves the accuracy of load forecasting [23].
In practical problems, the relationship between the explained variable Y and the explanatory variable
X is often not linear, but it can frequently be transformed into a linear relationship through a
function of the independent or dependent variables. Linear regression can then be used
to estimate the unknown parameters and perform regression diagnosis [24].
In polynomial regression, a regressor may be a power of an independent variable or the product of two
independent variables that have an interaction effect. An example regression equation is:
$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i1} X_{i2} + \beta_4 X_{i1}^3 + e_i \tag{2}$$
The polynomial regression is transformed into a linear regression of four variables:
$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \beta_3 X_{i3} + \beta_4 X_{i4} + e_i \tag{3}$$
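The change of variables from equation (2) to equation (3) can be illustrated with a minimal sketch. It assumes the derived regressors are X3 = X1·X2 and X4 = X1³ and uses synthetic, noise-free data; the function name `fit_polynomial_as_linear` and the coefficient values are hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_polynomial_as_linear(x1, x2, y):
    """Fit Y = b0 + b1*X1 + b2*X2 + b3*X1*X2 + b4*X1**3 by ordinary least squares."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    # Derived regressors X3 = X1*X2 and X4 = X1**3 make the model linear in b.
    design = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, x1 ** 3])
    coef, *_ = np.linalg.lstsq(design, np.asarray(y, dtype=float), rcond=None)
    return coef

# Synthetic check with made-up coefficients (assumption, not paper data).
rng = np.random.default_rng(0)
x1 = rng.uniform(-2, 2, 200)
x2 = rng.uniform(-2, 2, 200)
true_beta = np.array([1.0, 2.0, -1.0, 0.5, 0.3])
y = (true_beta[0] + true_beta[1] * x1 + true_beta[2] * x2
     + true_beta[3] * x1 * x2 + true_beta[4] * x1 ** 3)
beta_hat = fit_polynomial_as_linear(x1, x2, y)
```

Because the model is linear in the coefficients, the ordinary least-squares solver needs no modification; only the design matrix changes.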
In regression analysis, qualitative variables are quantified by introducing dummy variables that take
only the two values 0 and 1. When an attribute is present, the dummy variable takes 1, and otherwise 0.
If a qualitative variable has K categories, K-1 zero-one dummy variables need to be introduced. Taking
working days and holidays as an example (K = 2), the regression equation describing the load
characteristics is:
$$Y = \beta_0 + \beta_1 X_1 \tag{4}$$
For working days, X1=1 and the regression equation is E(Y)=β0+β1. For holidays, X1=0 and the
regression equation is E(Y)=β0 [25]. The resulting daily load characteristics are
described by the regression constants.
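As a small illustration of equation (4), the sketch below fits a single working-day dummy by least squares. The seven daily load values are hypothetical and only serve to show that β0 equals the holiday mean and β0+β1 equals the working-day mean.

```python
import numpy as np

# Hypothetical daily loads in MW; 1 marks a working day, 0 a holiday.
is_working_day = np.array([1, 1, 1, 0, 0, 1, 0])
load = np.array([5200.0, 5100.0, 5300.0, 4100.0, 4000.0, 5200.0, 4200.0])

# Design matrix: intercept column plus the single dummy X1 (K = 2, so K-1 = 1).
design = np.column_stack([np.ones(len(load)), is_working_day])
(beta0, beta1), *_ = np.linalg.lstsq(design, load, rcond=None)

holiday_mean = load[is_working_day == 0].mean()
working_mean = load[is_working_day == 1].mean()
```

With one dummy variable the least-squares fit reduces to two group means, which is why the regression constants alone describe the daily load characteristics.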
3. Building the forecasting model
In this section, dummy variables such as trend variables and date variables are first introduced. Then,
the interaction among different variables is modeled in the linear regression expression. Finally, two
methods for generating temperature scenes, the moving-day temperature method and the probabilistic
temperature scene creation method, are proposed, and the probabilistic prediction errors under different
time scenarios are analyzed.
3.1. Trend variables
The data are the temperature of Sydney, Australia, and the Natural Load Dataset of New South
Wales.
Figure 1 shows scatter plots of the hourly load and temperature for the region from 2006 to 2013, and
Table 1 lists the annual load. The annual load is relatively stable, without abrupt increases or
decreases, but it rises gradually over the period. This gradual growth may be caused by socio-economic
development and population growth, which increase electricity consumption.
Figure 1. Scatter plot of history data (2006-2013)
Table 1. Annual load (2006-2013)
| Year | Load (GW) |
|------|-----------|
| 2006 | 28286 |
| 2007 | 28434 |
| 2008 | 28579 |
| 2009 | 28511 |
| 2010 | 28741 |
| 2011 | 29068 |
| 2012 | 29415 |
| 2013 | 29734 |
To describe the trend of increasing load, we introduce the trend variable Tr into the
regression model and define it as a series of natural numbers that increases by one every hour. For
example, in the first hour of 2006 the trend variable is 1, in the second hour it is 2, and so on.
This trend variable is a linear approximation of the load growth sequence. The trend
of economic growth is expressed as:
$$Load = \beta_0 + \beta_1 T_r + e_i \tag{5}$$
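A rough sketch of the trend regression in equation (5) follows. The paper defines Tr per hour; since only the annual totals of Table 1 are listed here, the sketch assumes one trend step per year (Tr = 1, ..., 8), which is a simplification made for illustration only.

```python
import numpy as np

# Annual loads copied from Table 1 (2006-2013).
years = np.arange(2006, 2014)
annual_load = np.array([28286, 28434, 28579, 28511,
                        28741, 29068, 29415, 29734], dtype=float)

# Assumption: one trend step per year instead of one per hour.
trend = np.arange(1, len(years) + 1)

# Least-squares fit of Load = beta0 + beta1 * Tr.
slope, intercept = np.polyfit(trend, annual_load, deg=1)
fitted = intercept + slope * trend
```

A positive slope indicates that the linear trend variable picks up the gradual growth visible in Table 1.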
3.2. Date variable
Power consumption behavior is one of the main factors that affect load fluctuation. This section
describes the periodic daily, weekly and yearly load characteristics with date variables. As can be seen
from Fig. 1, the load fluctuates with an annual periodic pattern. The yearly component of the load
is closely related to seasonal climate characteristics: the load peaks in summer and winter and is
lowest in spring and autumn. This paper introduces the dummy independent
variable M with 12 categories, defined as follows.
[Figure 1 panels: (a) Hourly Load, Load/MW (2000 to 6000) against Time/h; (b) Hourly Temperature (-40 to 40) against Time/h; time axis from 2006/1/1 to 2013/1/1.]
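$$X_1 = \begin{cases} 1, & \text{January} \\ 0, & \text{others} \end{cases} \qquad X_2 = \begin{cases} 1, & \text{February} \\ 0, & \text{others} \end{cases} \qquad \cdots \qquad X_{11} = \begin{cases} 1, & \text{November} \\ 0, & \text{others} \end{cases} \qquad X_{12} = \begin{cases} 1, & \text{December} \\ 0, & \text{others} \end{cases}$$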
The regression equation describing the monthly load characteristics is:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_{12} X_{12}$$
$$Load_t = \beta_0 + \beta_1 M_t + \beta_2 D_t + \beta_3 H_t + \beta_4 D_t H_t + e \tag{6}$$
where β is the regression coefficient, Mt, Ht and Dt are the dummy variables, DtHt represents the
interaction between the dummy variables D (day) and H (hour), and e is the random error. When the load
in January is described, the variables in the regression equation are X1=1 and X2=X3=…=X11=X12=0.
The load also differs considerably between date types within a week, but it shows a clear periodic
pattern, as shown in Figure 2. On normal days there is a significant difference between weekdays and
weekends: the total weekend load is significantly lower than that of weekdays, while the daily cycle
remains.
[Figure 2 plot: Load/MW (4000 to 12000) against Week/h, Saturday through the following Sunday.]
Figure 2. Weekly load (2006/3/25-2006/4/2)
[Figure 3 plot: Load/MW (1500 to 6500) against Temperature (-30 to 40), with fitted curves labelled Section Function, Quadratic Function and Cubic Function.]
Figure 3. The fitting plot of hourly temperature-hourly load
To describe these load characteristics, we introduce the independent dummy variable D for
the load difference between date types. One week is divided into 7 categories, so 6
dummy variables are introduced, processed in the same way as the monthly variable M [23]. Within the
daily cycle, the load characteristics differ significantly at different times of the day, and nighttime
electricity consumption is significantly lower than daytime consumption because industrial electricity
consumption falls. The dummy independent variable H is introduced to describe these
characteristics; it has 24 categories and therefore 23 dummy variables.
The morning load on a working day is significantly higher than on a holiday morning, because people do
not have to get up early for work on a day off, which reduces electricity consumption. We therefore
introduce the interaction of H and D in the model. Each holiday generally occurs in a fixed period every
year, and its load components are different. During holidays, a large number of factories,
enterprises and institutions withdraw from the electricity load, and what remains is mainly residential
load, commercial load and non-stop industrial load, so the load is significantly lower than on a
normal day [24]. Under the flexible adjustment policy, current holidays may be converted into
extended holidays or working days, which raises the overall forecast level. In summary, the regression
equation can be expressed as follows:
$$Load_t = \beta_0 + \beta_1 M_t + \beta_2 D_t + \beta_3 H_t + \beta_4 D_t H_t + e \tag{7}$$
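The date part of the model can be sketched as a design matrix of zero-one columns. The sketch assumes K-1 coding for every qualitative variable, following Section 2 (11 month, 6 day and 23 hour dummies), whereas equation (6) lists twelve monthly dummies; the helper names `one_hot_drop_first` and `date_design_matrix` are hypothetical.

```python
import numpy as np
from datetime import datetime, timedelta

def one_hot_drop_first(codes, n_categories):
    """Encode K categories as K-1 zero/one columns; category 0 is the baseline."""
    codes = np.asarray(codes)
    return (codes[:, None] == np.arange(1, n_categories)[None, :]).astype(float)

def date_design_matrix(timestamps):
    month = np.array([ts.month - 1 for ts in timestamps])   # 0..11
    day = np.array([ts.weekday() for ts in timestamps])     # 0..6, Monday = 0
    hour = np.array([ts.hour for ts in timestamps])         # 0..23
    M = one_hot_drop_first(month, 12)   # 11 columns
    D = one_hot_drop_first(day, 7)      # 6 columns
    H = one_hot_drop_first(hour, 24)    # 23 columns
    # D x H interaction: product of every day dummy with every hour dummy.
    DH = (D[:, :, None] * H[:, None, :]).reshape(len(timestamps), -1)
    intercept = np.ones((len(timestamps), 1))
    return np.hstack([intercept, M, D, H, DH])

# Two weeks of hourly timestamps starting on 2006-01-01.
start = datetime(2006, 1, 1)
timestamps = [start + timedelta(hours=h) for h in range(24 * 14)]
X = date_design_matrix(timestamps)
```

The resulting matrix can be passed to any least-squares solver; dropping the first category of each variable avoids exact collinearity with the intercept.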
3.3. Temperature Scenarios Generation
1) Analysis of Temperature Variables
In this section, the cubic load-temperature function is introduced. Figure 3 gives an
hourly temperature-load scatter plot for the region from 2006 to 2011, together with its piecewise
linear, quadratic and cubic fitting functions. The temperature-load relationship is asymmetric, while a
quadratic function can only describe a symmetric relationship. Thus, the cubic function is better suited
than the quadratic function for load forecasting.
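The argument about asymmetry can be checked numerically with a minimal sketch. The data below are synthetic: the asymmetric load curve, its coefficients and the noise level are assumptions chosen for illustration and are not the New South Wales data.

```python
import numpy as np

rng = np.random.default_rng(1)
temp = rng.uniform(-20, 40, 500)   # synthetic hourly temperatures

# Assumed asymmetric load-temperature curve plus noise (not the paper's data).
load = (3000 + 0.9 * (temp - 15) ** 2 + 0.03 * (temp - 15) ** 3
        + rng.normal(0, 50, 500))

def rss(degree):
    """Residual sum of squares of a polynomial fit of the given degree."""
    coeffs = np.polyfit(temp, load, deg=degree)
    residual = load - np.polyval(coeffs, temp)
    return float(residual @ residual)

rss_quadratic = rss(2)
rss_cubic = rss(3)
```

Because the quadratic model is nested in the cubic one, the cubic fit can never have a larger residual sum of squares; on asymmetric data the reduction is substantial.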
2) Interaction of Temperature Variables
Summer temperatures are higher than winter temperatures, so the temperature differs between
months, and the interaction M×T between the month variable M and the temperature
variable T should be considered. During the day, the temperature also changes regularly across time
periods: the daytime temperature is higher than at night, so the interaction between the variables H
and T needs to be considered. The temperature function in the regression model is:
$$Load_t = \beta_0 + \beta_1 T_t + \beta_2 T_t^2 + \beta_3 T_t^3 + \beta_4 T_t H_t + \beta_5 T_t^2 H_t + \beta_6 T_t^3 H_t + \beta_7 T_t M_t + \beta_8 T_t^2 M_t + \beta_9 T_t^3 M_t + e \tag{8}$$
where β is the regression coefficient and Mt and Ht are dummy variables. H×T, H×T^2 and H×T^3 are the
interactions between the variables H and T, and M×T, M×T^2 and M×T^3 are the interactions between the
variables M and T. e is the random error.
3) Proximity of Temperature
Proximity (the recency effect) is a phenomenon in psychology: when people memorize
a series of items, the last items are remembered better than those in the middle.
A similar phenomenon exists between load and temperature, that is, the temperature in the hours before
the current time also affects the load. We therefore add temperature variables of the same form as T to
the model. T_{t-i} is the temperature i hours earlier (i = 1, 2, 3), giving the variables T_{t-i},
T_{t-i}^2, T_{t-i}^3, T_{t-i}H_t, T_{t-i}^2H_t, T_{t-i}^3H_t, T_{t-i}M_t, T_{t-i}^2M_t and T_{t-i}^3M_t.
T_a is the average temperature over the 24 hours before the current time, giving the proximity
variables T_a, T_a^2, T_a^3, T_aH_t, T_a^2H_t, T_a^3H_t, T_aM_t, T_a^2M_t and T_a^3M_t.
Considering the proximity effect, the function of load forecasting is:
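$$
\begin{aligned}
Load_t = {} & \beta_0 + \beta_1 T_t + \beta_2 T_t^2 + \beta_3 T_t^3 + \beta_4 T_t H_t + \beta_5 T_t^2 H_t + \beta_6 T_t^3 H_t + \beta_7 T_t M_t + \beta_8 T_t^2 M_t + \beta_9 T_t^3 M_t \\
& + \sum_{i=1}^{3} \left( \beta_{i1} T_{t-i} + \beta_{i2} T_{t-i}^2 + \beta_{i3} T_{t-i}^3 + \beta_{i4} T_{t-i} H_t + \beta_{i5} T_{t-i}^2 H_t + \beta_{i6} T_{t-i}^3 H_t + \beta_{i7} T_{t-i} M_t + \beta_{i8} T_{t-i}^2 M_t + \beta_{i9} T_{t-i}^3 M_t \right) \\
& + \beta_{a1} T_a + \beta_{a2} T_a^2 + \beta_{a3} T_a^3 + \beta_{a4} T_a H_t + \beta_{a5} T_a^2 H_t + \beta_{a6} T_a^3 H_t + \beta_{a7} T_a M_t + \beta_{a8} T_a^2 M_t + \beta_{a9} T_a^3 M_t + e
\end{aligned}
\tag{9}
$$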
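The lagged and averaged temperature inputs of equation (9) can be built as in the following sketch. It assumes an hourly temperature array without gaps and drops the first 24 hours, for which the 24-hour average is undefined; the function name `proximity_features` is hypothetical.

```python
import numpy as np

def proximity_features(temp, max_lag=3, window=24):
    """Return (rows, features) with columns [T_{t-1}, ..., T_{t-max_lag}, T_a].

    Only hours t >= window are returned, so every lag and the full
    averaging window are available.
    """
    temp = np.asarray(temp, dtype=float)
    rows = np.arange(window, len(temp))
    lags = np.column_stack([temp[rows - i] for i in range(1, max_lag + 1)])
    # T_a: mean temperature of the `window` hours preceding hour t.
    cumsum = np.concatenate([[0.0], np.cumsum(temp)])
    t_avg = (cumsum[rows] - cumsum[rows - window]) / window
    return rows, np.column_stack([lags, t_avg])

# Toy series 0, 1, ..., 47 so the result is easy to verify by hand.
temp = np.arange(48, dtype=float)
rows, feats = proximity_features(temp)
```

The squared, cubed and interaction terms of equation (9) follow by multiplying these columns with each other and with the H and M dummy columns.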
In summary, the multiple linear regression load forecasting model based on the time-temperature
proximity effect is:
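$$
\begin{aligned}
Load_t = {} & \beta_0 + \beta_1 Tr_t + \beta_2 M_t + \beta_3 D_t + \beta_4 H_t + \beta_5 D_t H_t + \beta_6 T_t + \beta_7 T_t^2 + \beta_8 T_t^3 + \beta_9 T_t H_t + \beta_{10} T_t^2 H_t + \beta_{11} T_t^3 H_t \\
& + \beta_{12} T_t M_t + \beta_{13} T_t^2 M_t + \beta_{14} T_t^3 M_t \\
& + \sum_{i=1}^{3} \left( \beta_{i6} T_{t-i} + \beta_{i7} T_{t-i}^2 + \beta_{i8} T_{t-i}^3 + \beta_{i9} T_{t-i} H_t + \beta_{i10} T_{t-i}^2 H_t + \beta_{i11} T_{t-i}^3 H_t + \beta_{i12} T_{t-i} M_t + \beta_{i13} T_{t-i}^2 M_t + \beta_{i14} T_{t-i}^3 M_t \right) \\
& + \beta_{a6} T_a + \beta_{a7} T_a^2 + \beta_{a8} T_a^3 + \beta_{a9} T_a H_t + \beta_{a10} T_a^2 H_t + \beta_{a11} T_a^3 H_t + \beta_{a12} T_a M_t + \beta_{a13} T_a^2 M_t + \beta_{a14} T_a^3 M_t + e
\end{aligned}
\tag{10}
$$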
4) Probabilistic Temperature Scenario Generation
The meteorological characteristics of the same period in each year are similar, but a given type of
weather may arrive a few days earlier or a few days later. For example, the temperature in May
2006 may be similar to the temperature in April or June 2009. There is a strong correlation between load
and temperature, and the load follows the temperature change. For instance, if high
temperatures last for a long time in summer, people keep using air conditioners to cool
down, which increases the load. Such temperature shifts lead to large load
differences, so they should be considered in long-term load
forecasting, and power planning and scheduling should be prepared to cope with this uncertain
meteorological variation.
In this section, a probabilistic temperature scenario generation method based on moving
temperature scenarios is proposed and compared with the fixed temperature scenario generation
method. If k history years are available, the fixed method generates k temperature scenarios. The
moving-day temperature method exploits the variability of temperature: the historical temperature
is moved forward or backward by up to n days to create more equal-probability historical temperature
scenes. Table 2 illustrates moving forward and backward by one day. If k history years are each moved
forward and backward by up to n days, (2n+1)k temperature scenes are generated.
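Table 2. Schematic diagram of shifted-data
| Scenario | Temperature entries |
|----------|---------------------|
| Base year | T_d(365, i-1), T_d(1, i), T_d(365, i) |
| Move forward for 1 day | T_d(1, i), T_d(2, i), T_d(1, i+1) |
| Moved 1 day later | T_d(364, i-1), T_d(365, i-1), T_d(364, i) |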
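A minimal sketch of the moving-day idea is given below. It assumes hourly temperature arrays per history year and uses a circular shift within each year, whereas Table 2 indicates that the paper borrows days from the neighbouring year; the function name `shifted_scenarios`, the sign convention for "forward" and the toy data are assumptions.

```python
import numpy as np

def shifted_scenarios(history, n):
    """Build (2n+1)*k temperature scenarios from k history years.

    history: dict mapping year -> hourly temperature array.
    A shift of +s days makes day s+1 of the original year the first day.
    """
    scenarios = {}
    for year, temps in history.items():
        temps = np.asarray(temps, dtype=float)
        for shift in range(-n, n + 1):
            # Simplification: np.roll wraps around the year boundary instead
            # of taking the missing days from the adjacent year.
            scenarios[(year, shift)] = np.roll(temps, -24 * shift)
    return scenarios

# Toy history: two years of 365 days with made-up hourly values.
history = {2011: np.arange(365 * 24), 2012: np.arange(365 * 24) + 0.5}
scen = shifted_scenarios(history, n=1)
```

With k = 2 and n = 1 the sketch yields (2·1+1)·2 = 6 scenarios, each of which would be fed to the regression model as one equally likely temperature input.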
The number of history years and the number of moving days are the key parameters that influence the
prediction accuracy. We optimize them with an error-based parameter search, where the error is:
$$
S(y_{t,q}, y_t, q) =
\begin{cases}
\left(1 - \dfrac{q}{100}\right)\left(y_{t,q} - y_t\right), & y_t < y_{t,q} \\
\dfrac{q}{100}\left(y_t - y_{t,q}\right), & y_t \ge y_{t,q}
\end{cases}
\tag{11}
$$
where q is the given quantile index (q = 1, 2, …, 99), y_t is the actual load at time t, and y_{t,q} is
the q-th quantile load forecast at time t. The smaller the value of S, the smaller the error.
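The error measure of equation (11) has the form of the pinball loss, which can be sketched as follows. Averaging over all hours and over q = 1, ..., 99 in `probabilistic_error` is an assumption about how the scores are aggregated; the text does not spell this step out, and both function names are hypothetical.

```python
import numpy as np

def pinball_loss(y_true, y_quantile, q):
    """Pinball loss of the q-th percentile forecast, q in 1..99."""
    y_true = np.asarray(y_true, dtype=float)
    y_quantile = np.asarray(y_quantile, dtype=float)
    tau = q / 100.0
    diff = y_true - y_quantile
    # diff >= 0: tau * (y - y_q); diff < 0: (1 - tau) * (y_q - y).
    return np.where(diff >= 0, tau * diff, (tau - 1.0) * diff)

def probabilistic_error(y_true, quantile_forecasts):
    """Mean pinball loss over all hours and q = 1..99 (assumed aggregation).

    quantile_forecasts has shape (99, n_hours); row q-1 holds percentile q.
    """
    losses = [pinball_loss(y_true, quantile_forecasts[q - 1], q)
              for q in range(1, 100)]
    return float(np.mean(losses))

under = pinball_loss([100.0], [90.0], 90)[0]   # actual above the forecast
over = pinball_loss([100.0], [110.0], 90)[0]   # actual below the forecast
```

For a high quantile such as q = 90, under-forecasting is penalized more heavily than over-forecasting, which is what pushes the 90% quantile curve upward.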
3.4. System flow chart
This section describes how the forecast proceeds for an entered date. First, it is judged whether the
date is a normal working day. If it is not, the forecast is corrected according to the temperature scene
and then enters the normal cycle. In the normal cycle, the load is predicted once per hour for 24 hours,
after which H returns to zero to start the forecast of a new day.
The load forecasting system flow chart is shown in Figure 4.
[Figure 4 flow chart, main elements: Start; establish the generalized linear regression equation; construct the prediction model by introducing the trend variable Tr, the date variables (D day, M month, H hour) and the temperature variable; load characteristic equation Loadt=β0+β1Mt+β2Dt+β3Ht+β4DtHt+e; consider the interaction between the date variables and T; choose the right temperature scene; obtain the temperature equation in the form of equation (8); enter the forecast date; determine whether it is a normal working day (Y/N); set H=0, D=0; calculate the predicted value; compare predicted and actual values; if H<23 then H=H+1, otherwise D=D+1 and H=0; check whether the end date is reached (Y/N); output the forecast result; End.]
Figure 4. Flow chart of load forecasting based on temperature scenario
4. Case study
In this section, we use temperature data from Sydney, Australia, and the New South Wales Natural Load
Dataset. The hourly data from 2006 to 2010 are used in the regression model to predict the load in 2011 and
analyze the error. The mean absolute percentage error (MAPE) is used to evaluate the prediction accuracy
of the model:
$$MAPE = \frac{1}{N} \sum_{t=1}^{N} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \times 100\% \tag{12}$$
where y_t is the real load and \hat{y}_t is the predicted value.
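Equation (12) translates directly into a few lines of code. The sketch below is a plain implementation of the formula; the two load values in the example are made up.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent, as in equation (12)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred) / np.abs(y_true)) * 100.0)

# Hypothetical example: both forecasts are off by 5% of the actual load.
example = mape([4000.0, 5000.0], [4200.0, 4750.0])
```

Since the actual load appears in the denominator, the measure is only meaningful when the load is strictly positive, which holds for system-level demand.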
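Table 3. Deviation for different models
| Model | R2 | MAPE (%) | Standard deviation (MW) |
|-------|-------|----------|-------------------------|
| 1 | 0.867 | 8.94 | 318.30 |
| 2 | 0.944 | 4.27 | 186.57 |
| 3 | 0.952 | 3.33 | 143.40 |
| 4 | 0.967 | 2.95 | 128.59 |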
Model 1 considers only single-factor variables (Tr, M, D, H, T, T2 and T3); its accuracy is
low because the interactions between variables are ignored. Model 2 adds the coupled temperature and
date variables (MT, MT2, MT3, DT, DT2, DT3, DH, HT, HT2 and HT3) to the regression model to
improve forecasting accuracy. Model 3 revises the coupled variables of Model 2. Model
4 adds the short-term effect of the temperature scenario: the corrected R2 reaches 0.967 and the MAPE is
reduced to 2.95%, which verifies that the short-term temperature effect improves prediction accuracy.
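The prediction model 4 considering temperature is
$$L_t = \beta_1 Tr_t + \beta_2 M_t + \beta_3 D_t + \beta_4 H_t + \beta_5 D_t H_t + f(T_t) \tag{13}$$
where f(T_t) is the temperature model, with the expression
$$
\begin{aligned}
f(T_t) = {} & \beta_6 T_t + \beta_7 T_t^2 + \beta_8 T_t^3 + \beta_9 M_t T_t + \beta_{10} M_t T_t^2 + \beta_{11} M_t T_t^3 \\
& + \beta_{12} H_t T_t + \beta_{13} H_t T_t^2 + \beta_{14} H_t T_t^3 + \beta_{15} M_t H_t T_t + \beta_{16} M_t H_t T_t^2 + \beta_{17} M_t H_t T_t^3
\end{aligned}
\tag{14}
$$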
During the year, summer temperatures are higher than winter temperatures, that is, the temperature
differs between months, so the interactions between the variables M and T (MT, MT2 and MT3) should be
considered. During the day, the temperature also changes periodically across time periods: the
temperature at noon is higher than that at night.
4.1. Data length for estimating the regression parameters
The length of the history data used in the regression model is a key factor affecting the forecasting
accuracy. Table 4 lists the error for different data lengths. For example, the second row of Table 4
uses the last 2 years of history data to forecast the load. The average values show that the minimum
error, 3.17%, is obtained when the parameters are estimated from 3 years of history data. This paper
therefore uses the data from the last 3 years to estimate the regression parameters.
Table 4. Error for different data length
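| Data length (years) | 2009 (%) | 2010 (%) | 2011 (%) | 2012 (%) | Average (%) |
|---------------------|----------|----------|----------|----------|-------------|
| 1 | 4.09 | 3.23 | 3.19 | 2.89 | 3.35 |
| 2 | 4.30 | 2.94 | 2.94 | 2.67 | 3.21 |
| 3 | 4.26 | 2.89 | 2.96 | 2.58 | 3.17 |
| 4 | 4.37 | 2.64 | 3.20 | 2.70 | 3.23 |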
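The choice of the data length can be reproduced from the yearly errors in Table 4. The sketch below recomputes the row averages and picks the length with the smallest average; the variable names are hypothetical, and the numbers are copied from the table.

```python
import numpy as np

# MAPE (%) for forecast years 2009-2012, keyed by history length in years.
mape_by_length = {
    1: [4.09, 3.23, 3.19, 2.89],
    2: [4.30, 2.94, 2.94, 2.67],
    3: [4.26, 2.89, 2.96, 2.58],
    4: [4.37, 2.64, 3.20, 2.70],
}

# Row averages, then the data length with the lowest average error.
averages = {length: float(np.mean(errors))
            for length, errors in mape_by_length.items()}
best_length = min(averages, key=averages.get)
```

The differences between the averages are small, so the selection of three years is a mild preference rather than a sharp optimum.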
4.2. Temperature scenario generation
The probabilistic load forecasting procedure is as follows. First, the probabilistic parameter
optimization is performed and the k-n parameter with the highest accuracy is selected. Then (2n+1)k
temperature scenes are created, and each scene is used as the input of the prediction model to simulate
the temperature of the forecast year. Forecasting each scene separately yields (2n+1)k prediction
results, from which the median or interval divisions can be derived. This is of great significance for
guiding medium- and long-term power grid planning and scheduling.
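The last step of this procedure, turning the (2n+1)k scenario forecasts into a median and an interval, can be sketched as below. The scenario forecasts are random placeholders rather than model output, the 10% and 90% levels are one possible interval division, and the function name `forecast_bands` is hypothetical.

```python
import numpy as np

def forecast_bands(scenario_forecasts, lower=10, upper=90):
    """Per-hour lower quantile, median and upper quantile across scenarios.

    scenario_forecasts: array of shape (n_scenarios, n_hours).
    """
    scenario_forecasts = np.asarray(scenario_forecasts, dtype=float)
    low, median, high = np.percentile(
        scenario_forecasts, [lower, 50, upper], axis=0)
    return low, median, high

# Placeholder forecasts for (2n+1)*k scenarios over 24 hours.
k, n = 8, 13
rng = np.random.default_rng(2)
forecasts = 4000 + rng.normal(0, 200, size=((2 * n + 1) * k, 24))
low, median, high = forecast_bands(forecasts)
```

Treating every shifted scenario as equally likely is what allows simple empirical percentiles to serve as the quantile forecasts.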
Temperature scenarios were generated from the history data of 2005-2012, and the load forecast
was based on the actual date types of 2013 (k = 1, 2, ..., 8). When k = 1, the scenario is generated
from the 2012 temperature data; when k = 2, the temperature scenario is based on 2011-2012, and so on.
This section searches for the optimal k-n parameter by moving the temperature data of k history year(s)
by n day(s). Table 5 shows the probabilistic error of the different temperature scenarios.
As can be seen in Table 5, the optimal k-n parameter is 8-13, that is, 8 history years and 13 moving
days. Its probabilistic error is 59.35%, which is more accurate than the 63.05% of the 8-year fixed-day
temperature scenario. For the 8-year fixed-day temperature scenario, the probabilistic error was 4.69%
and the median MAPE was 4.51%. Figure 5 shows the probabilistic error curves of temperature
scenes based on different numbers of history years (k = 1, 2, ..., 8). As the number of moving days
increases, the probabilistic error fluctuates. In the initial stage, the forecasting accuracy can be
significantly improved by increasing the number of moving days. For k = 1, moving the 2012 temperature
data by three days reduces the probabilistic error from 78.37% to 66.27% and the median
MAPE from 5.67 to 5.01, an accuracy improvement of 11.6%.
Table 5. Probabilistic error (%) of different temperature scenarios
| n \ k | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| 0 | 78.37 | 77.23 | 69.63 | 67.86 | 64.48 | 63.49 | 63.05 | 63.05 |
| 1 | 76.42 | 76.41 | 63.36 | 64.45 | 61.83 | 61.08 | 60.99 | 63.58 |
| 2 | 70.43 | 68.82 | 61.63 | 62.41 | 60.61 | 60.12 | 60.13 | 61.57 |
| 3 | 66.27 | 64.34 | 60.79 | 61.21 | 59.99 | 59.94 | 59.86 | 60.02 |
| 4 | 65.45 | 64.17 | 60.49 | 61.00 | 59.78 | 59.87 | 59.63 | 60.09 |
| 5 | 64.77 | 63.71 | 60.40 | 60.32 | 59.77 | 59.77 | 59.63 | 59.92 |
| 6 | 63.57 | 62.46 | 60.38 | 60.83 | 59.83 | 59.76 | 59.68 | 59.55 |
| 7 | 63.62 | 63.50 | 60.31 | 60.63 | 59.88 | 59.76 | 59.70 | 59.70 |
| 8 | 62.61 | 62.8 | 60.26 | 60.55 | 59.87 | 59.75 | 59.73 | 59.69 |
| 9 | 62.80 | 62.67 | 60.24 | 60.30 | 59.82 | 59.62 | 59.78 | 59.48 |
| 10 | 62.56 | 62.64 | 60.22 | 60.28 | 59.80 | 59.75 | 59.84 | 59.42 |
| 11 | 61.91 | 62.21 | 60.20 | 60.13 | 59.73 | 59.74 | 59.87 | 59.38 |
| 12 | 61.03 | 61.90 | 60.17 | 60.01 | 59.68 | 59.73 | 59.86 | 59.35 |
| 13 | 61.72 | 61.85 | 60.20 | 60.04 | 59.70 | 59.72 | 59.93 | 59.43 |
| 15 | 61.23 | 62.16 | 60.32 | 60.4 | 59.91 | 59.87 | 60.12 | 59.75 |
| 20 | 62.03 | 62.57 | 60.73 | 61.08 | 60.07 | 60.20 | 60.30 | 60.29 |
| 30 | 62.85 | 62.71 | 61.5 | 61.87 | 60.78 | 60.94 | 61.15 | 61.07 |
Figure 5. Curve of Probabilistic error
The regression model parameters are estimated by using the temperature scenarios, and the 2014 load
is forecast. The dashed lines in Figure 6 are the forecasting curves based on the temperature
scenarios, the black line is the actual load and the red line is the median load. Compared with
the fixed-day temperature scenario, the forecasting accuracy of the moving-day temperature
scenario is clearly improved [26].
Figure 6. Probabilistic load forecasts for different term (2014)
Table 6 summarizes the forecast results. For July 16 to July 22, 2014, the probabilistic error under the
moving-day temperature is 1.82, which is smaller than the 2.62 under the fixed-day
temperature. The median MAPE is 4.07, a reduction of 40.32% relative to the 6.82 under the fixed-day
temperature. The 2014 full-year median load MAPE drops from 4.90 to 4.76. It can be
seen that the accuracy of the probabilistic load forecast based on the moving-day temperature scenario
is significantly higher than that of the fixed-day forecast, especially when the historical temperature
5 10 15 20 25 30
Moving Days(n)
85
80
75
70
65
60
55
50
Error
7/16 7/17 7/18 7/19 7/20 7/21 7/22
Time/h
a Fixed-day Forecast
Load/MW
5000
4000
3000
2000
5000
4000
3000
2000
Load/MW
7/16 7/17 7/18 7/19 7/20 7/21 7/22
Time/h
(b) Mobile-day Forecast
IWAACE 2020
Journal of Physics: Conference Series 1550 (2020) 032117
IOP Publishing
doi:10.1088/1742-6596/1550/3/032117
13
data is limited, and it can also reach higher forecast level by moving daily temperature.
Table 6. Error statistics
Error                 2014                        7/16-7/22, 2014
                      Fixed-day    Mobile-day     Fixed-day    Mobile-day
Probability error     63.63        60.53          2.62         1.82
Median (MAPE)         4.90         4.76           6.82         4.07
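The relative changes implied by Table 6 can be checked with a few lines of arithmetic. The sketch below uses only the four pairs of values from the table; the helper name and the dictionary labels are illustrative.

```python
def relative_reduction(fixed, mobile):
    """Percentage by which the mobile-day error is lower than the fixed-day error."""
    return (fixed - mobile) / fixed * 100.0

# (fixed-day, mobile-day) pairs taken from Table 6.
table6 = {
    "probability error, 2014": (63.63, 60.53),
    "median MAPE, 2014": (4.90, 4.76),
    "probability error, 7/16-7/22": (2.62, 1.82),
    "median MAPE, 7/16-7/22": (6.82, 4.07),
}

reductions = {label: relative_reduction(fixed, mobile)
              for label, (fixed, mobile) in table6.items()}

for label, value in reductions.items():
    print(f"{label}: {value:.2f}% lower with the mobile-day scenario")
```

The weekly median MAPE gives (6.82 - 4.07) / 6.82 ≈ 40.32%, matching the figure quoted in the text, while the full-year median MAPE changes by about 2.86%.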
Based on the preferred parameter k-h and the actual 2014 date types, we forecast the 2014
monthly electricity consumption and monthly maximum load. At each forecast point, a 10% quantile,
a median and a 90% quantile load value are taken.
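Taking quantiles at each forecast point can be sketched as follows. The scenario forecasts here are synthetic random numbers standing in for the loads predicted under each temperature scenario; the array shape (200 scenarios by 12 months) and the load level are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical forecasts: one row per temperature scenario,
# one column per forecast point (here 12 monthly peak loads in MW).
scenario_loads = 4500 + 400 * rng.standard_normal((200, 12))

# Quantiles are taken across scenarios (axis 0), separately for each month.
q10, median, q90 = np.percentile(scenario_loads, [10, 50, 90], axis=0)

for m in range(12):
    print(f"month {m + 1:2d}: 10%={q10[m]:7.1f}  "
          f"median={median[m]:7.1f}  90%={q90[m]:7.1f}")
```

Each forecast point thus gets its own interval, so the interval width can differ from month to month.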
Figures 7 and 8 show the monthly maximum load and the monthly electricity consumption, respectively.
The broken line in each figure represents the predicted value based on the historical temperature and
the daily moving temperature data from 2005 to 2013. The three solid lines, from top to bottom, are the
90% quantile, the median and the 10% quantile, and the black solid points are the actual load values.
Figure 7. Monthly peak load (2014) (x-axis: time/month; y-axis: load/MW)
Figure 8. Monthly load (2014) (x-axis: time/month; y-axis: load/MW, ×10^6)
As can be seen from the figures, most of the actual load points fall near the median, and a few
points fall outside the range between the 10% and 90% quantile lines. The 10% and 90% quantile loads
indicate extreme cases with a lower probability of occurrence, but that does not mean they never happen.
The maximum load in February, October, November and December and the total electricity
consumption in June and October all lie on the 10% quantile line. In March and April, the electricity
consumption lies on the 90% quantile line. The maximum load in May was below the 10% quantile, and the
electricity consumption in January exceeded the 90% quantile. It can be seen that the defined prediction
interval reflects the real value of the load more accurately.
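A band between the 10% and 90% quantiles has a nominal coverage of 80%, so a few actual points outside it are expected rather than a sign of failure. The sketch below shows one way to count them. The function name and the four sample values are made up for illustration and are not the paper's data.

```python
import numpy as np

def interval_coverage(actual, lower, upper):
    """Return the share of points inside [lower, upper] and the indices outside it."""
    actual, lower, upper = (np.asarray(a, dtype=float)
                            for a in (actual, lower, upper))
    inside = (actual >= lower) & (actual <= upper)
    return inside.mean(), np.flatnonzero(~inside)

# Hypothetical values: the second point is above its band, the third below.
actual = [5.0, 12.0, 8.0, 20.0]
lower = [4.0, 10.0, 9.0, 15.0]
upper = [6.0, 11.0, 10.0, 25.0]

coverage, outside = interval_coverage(actual, lower, upper)
print(coverage, outside)
```

Comparing the empirical share with the nominal 80% over many forecast points indicates whether the quantile band is too narrow or too wide.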
Figures 9 and 10 show the forecasts of the monthly maximum load and monthly electricity consumption
for 2012-2014, obtained from temperature scenarios created with the data of 2005-2011. As can be seen
from the figures, the maximum load in February-May and the electricity consumption in August-November
of each year are small, even falling below the 10% quantile load value. During these periods the
relevant planning should follow the 10% quantile. In January of each year, it is reasonable to arrange
the power generation plan according to the median monthly maximum load. In July, a power generation
plan based on the 90% quantile load would be more reasonable.
Figure 9. Monthly peak load (2012-2014) (x-axis: time/month; y-axis: load/MW)
Figure 10. Monthly load (2012-2014) (x-axis: time/month; y-axis: load/MW, ×10^6)
In comparison with point load forecasting, the proposed probabilistic forecasting method provides
a range of possible load values, which reflects the fluctuation range and trend of the load more
accurately and allows different quantile intervals to be defined as needed. The width of the
forecast interval differs between time periods, which gives policy makers more useful information
than point forecasts can provide.
5. Conclusion
This paper extends the multiple linear regression model into an adaptive multiple polynomial
regression model. Trend variables, date variables and temperature variables are used as dummy
variables to describe the inherent characteristics of future load changes. Economic development,
the consumption habits of working days and holidays, the temperature effect and so on enter the
polynomial model as linear, quadratic and cubic terms. The proposed method uses 12 month, 7 day
and 24 hour categories as the main factors for scenario generation. Temperature scenario
optimization is applied to analyze the median error and interval bounds of the load forecast, and
the load forecasting accuracy based on 3 years of history is improved by 3.8%.
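The conclusion describes the model structure in words only. The sketch below shows one way a regression with a linear trend, month, weekday and hour dummies, and cubic temperature terms could be set up and fitted by least squares. The coding of the dummies (first category as baseline), the synthetic data and all function names are assumptions made for this example; it is not the authors' exact specification and omits any interaction terms the paper may use.

```python
import numpy as np

def dummies(values, levels):
    """One 0/1 column per category 1..levels-1; category 0 is the baseline."""
    values = np.asarray(values)
    return (values[:, None] == np.arange(1, levels)[None, :]).astype(float)

def design_matrix(trend, month, weekday, hour, temp):
    """Intercept, linear trend, date dummies and cubic temperature terms."""
    trend = np.asarray(trend, dtype=float)
    temp = np.asarray(temp, dtype=float)
    return np.column_stack([
        np.ones_like(trend),     # intercept
        trend,                   # trend variable
        dummies(month, 12),      # month coded 0..11
        dummies(weekday, 7),     # weekday coded 0..6
        dummies(hour, 24),       # hour coded 0..23
        temp, temp**2, temp**3,  # cubic temperature polynomial
    ])

# Synthetic hourly data for one year (hypothetical, for illustration only).
rng = np.random.default_rng(0)
t = np.arange(24 * 365)
day = t // 24
hour = t % 24
weekday = day % 7
month = np.minimum(day // 31, 11)  # rough 0..11 month index
trend = t / t.size
temp = 20 + 8 * np.sin(2 * np.pi * day / 365) + rng.normal(0, 3, t.size)

load = (3000 + 200 * trend - 300 * (weekday >= 5)
        + 400 * np.sin(2 * np.pi * hour / 24)
        + 0.05 * (temp - 20) ** 3
        + rng.normal(0, 50, t.size))

# Ordinary least squares fit of the load on the design matrix.
X = design_matrix(trend, month, weekday, hour, temp)
beta, *_ = np.linalg.lstsq(X, load, rcond=None)
rmse = np.sqrt(np.mean((X @ beta - load) ** 2))
print(X.shape, round(float(rmse), 1))
```

Once the coefficients are fitted, feeding each temperature scenario through the same design matrix gives one load forecast per scenario, from which quantiles can be taken.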
Case studies show that the proposed probability forecasting method can explain the trend of future
load changes more accurately, and it can provide more useful information for the long-term load
forecasting. It will help policy-makers estimate the possible uncertainties and risk factors of future loads.
This will lay a solid foundation for load forecasting in complex operations.
Acknowledgment
The authors thank the National Natural Science Foundation of China for supporting the project
"Probability prediction and active smoothing theory of renewable energy slope events in AC and DC
power grids" (Project No. 51977030).
Author Biographies
Jiang Li obtained a bachelor's degree in electrical engineering and automation from Shanghai
Electric Power College in 2003, a master's degree in electrical engineering and automation from
Northeastern Electric Power University in 2006, and a doctorate in electrical engineering and
automation from North China Electric Power University in 2010. He was a visiting scholar at
Cornell University in the United States in 2014 and at the American Energy System Research
Center in 2015.
Liyang Ren obtained a bachelor's degree in electrical engineering and automation from
Northeastern Electric Power University in 2017 and has been studying for a master's degree in
Power Systems and Automation at Northeastern Electric Power University since 2017.
References
[1] Kanggu Park; Seungwook Yoon; Euiseok Hwang. Hybrid Load Forecasting for Mixed-Use
Complex Based on the Characteristic Load Decomposition by Pilot Signals. IEEE Access.
December 2018; pp.12297-12306.
[2] Mohamed Reda Nezzar; Nadir Farah; Tarek Khadir. Mid-long term Algerian electric load
forecasting using regression approach. IEEE Transactions on Power Systems, July 2013;
pp.121-126.
[3] Weicong Kong; Zhao Yang Dong; et al. Short-Term Residential Load Forecasting Based on
Resident Behaviour Learning. 2018; pp. 1087-1088.
[4] T. Hong; J. Wilson; J. Xie. Long term probabilistic load forecasting and normalization with hourly
information. 2013; pp. 456-462.
[5] Qingshan Xu; Yifan Ding; Qingguo Yan; et al. Day-Ahead Load Peak Shedding/Shifting Scheme
Based on Potential Load Values Utilization: Theory and Practice of Policy-Driven Demand
Response in China. IEEE Access. August 2017; pp.22892-22901.
[6] Chen Y.; Kloft M.; Yang Y.; et al. Mixed kernel based extreme learning machine for electric load
forecasting. Neurocomputing, 2018.
[7] Zhang X.; Wang R.; Zhang T.; et al. Short-Term load forecasting using a novel deep learning
framework. Energies, 2018, 11, 1554.
[8] Shepero M.; Meer D. V. D.; Munkhammar J.; et al. Residential probabilistic load forecasting: A
method using Gaussian process designed for electric load data. Applied Energy, 2018; pp.159-
172.
[9] Bowen Li; Jing Zhang; Yu He; et al. Short-Term Load-Forecasting Method Based on Wavelet
Decomposition With Second-Order Gray Neural Network Model Combined With ADF Test.
IEEE Access. May 2017; pp.16324-16331.
[10] Li Y.; Huang Y.; Zhang M.; et al. Short-Term load forecasting for electric vehicle charging station
based on niche immunity lion algorithm and convolutional neural network. Energies, 2018.
[11] Wang Y.; Zhang N.; Chen Q.; et al. Data-driven probabilistic net load forecasting with high
penetration of invisible PV. IEEE Transactions on Power Systems, 2017; pp.1-1.
[12] Fan G.F.; Peng L.L.; Hong W.C.; Short term load forecasting based on phase space reconstruction
algorithm and bi-square kernel regression model. Applied Energy, 2018, 224, 13-33.
[13] Giuseppe Fenza; Mariacristina Gallo; Vincenzo Loia. Drift-Aware Methodology for Anomaly
Detection in Smart Grid. IEEE Access. December 2018; pp.9645-9657.
[14] Singh P.; Dwivedi P.; Integration of new evolutionary approach with artificial neural network for
solving short term load forecast problem. Applied Energy, 2018, 217, 537-549.
[15] Prakash A.; Xu S.; Rajagopal R.; et al. Robust building energy load forecasting using physically-
based kernel models. Energies, 2018, 11, 862.
[16] Yang Y.; Li S.; Li W.; et al. Power load probability density forecasting using Gaussian process
quantile regression. Applied Energy, 2018, 213.
[17] Barman M.; Choudhury N.B.D.; Sutradhar S. A regional hybrid GOA-SVM model based on similar
day approach for short-term load forecasting in Assam, India. Energy, 2018, 145.
[18] Karimi M.; Karami H.; Gholami M.; et al. Priority index considering temperature and date
proximity for selection of similar days in knowledge-based short term load forecasting method.
Energy, 2018, 144, 928-940.
[19] Simona Vasilica Oprea; Adela Bâra; Vlad Diaconta. Sliding Time Window Electricity
Consumption Optimization Algorithm for Communities in the Context of Big Data Processing.
IEEE Access. December 2018; pp. 13050-13067.
[20] Yang Z.C.; Discrete cosine transform-based predictive model extended in the least-squares sense
for hourly load forecasting. IET Generation Transmission & Distribution, 2016, 10, 3930-
3939.
[21] Kaur A.; Nonnenmacher L.; Coimbra C.F.M.; Net load forecasting for high renewable energy
penetration grids. Energy, 2016, 114, 1073-1084.
[22] Gu C.; Jirutitijaroen P. Dynamic state estimation under communication failure using kriging based
bus load forecasting. IEEE Transactions on Power Systems, 2015, 30, 2831-2840.
[23] Park H.; Baldick R.; Morton D.P.; A stochastic transmission planning model with dependent load
and wind forecasts. IEEE Transactions on Power Systems, 2015, 30, 3003-3011.
[24] Che J.X.; Wang J.Z.; Short-term load forecasting using a kernel-based support vector regression
combination model. Applied Energy, 2014, 132, 602-609.
[25] Hernández L.; Baladrón C.; Aguiar J.M.; et al. Artificial neural networks for short-term load
forecasting in microgrids environment. Energy, 2014, 75, 252-264.
[26] Vasudev Dehalwar; Akhtar Kalam; Mohan Lal Kolhe; et al. Electricity load forecasting for urban
area using weather forecast information. IEEE International Conference on Power and
Renewable Energy, Oct. 2016; pp. 21-23.
Due to the explosion in restructuring of power markets within a deregulated economy, competitive power market needs to minimize their required generation reserve gaps. Efficient load forecasting for future demands can minimize the gap which will help in economic power generation, power operations, power construction planning and power distribution. Nowadays, neural networks are widely used for solving load forecasting problem due to its non-linear characteristics. Consequently, neural network is successfully combined with optimization techniques for finding optimal network parameters in order to reduce the forecasting error. In this paper, firstly a novel evolutionary algorithm based on follow the leader concept is developed and thereafter its performance is validated by COmparing Continuous Optimizers experimental framework on the set of 24 Black-Box Optimization Benchmarking functions with 12 state-of-art algorithms in 2-D, 3-D, 5-D, 10-D, and 20-D. The proposed algorithm outperformed all state-of-art algorithms in 20-D and ranked second in other dimensions. Further, the proposed algorithm is integrated with neural network for the proper tuning of network parameters to solve the real world problem of short term load forecasting. Through experiments on three real-world electricity load data sets namely New Pool England, New South Wales and Electric Reliability Council of Texas, we compared our proposed hybrid approach to baseline approaches and demonstrated its effectiveness in terms of predictive accuracy measures.