All content in this area was uploaded by Nadeem Javaid on Dec 13, 2019
An Optimized Linear-Kernel Support Vector
Machine for Electricity Load and Price Forecasting
in Smart Grids
1st Junaid Masood
Department of Computer Science
COMSATS University Islamabad
Islamabad (44000), Pakistan
junaid.rajput033@gmail.com
2nd Sakeena Javaid
Department of Computer Science
COMSATS University Islamabad
Islamabad (44000), Pakistan
sakeenajavaid@gmail.com
3rd Sheeraz Ahmed
Department of Computer Science
Iqra National University
Peshawar, Pakistan
sheeraz.ahmad@inu.edu.pk
4th Sameeh Ullah
School of Information Technology
Illinois State University
United States of America
sullah@ilstu.edu
5th Nadeem Javaid
Department of Computer Science
COMSATS University Islamabad
Islamabad (44000), Pakistan
nadeemjavaidqau@gmail.com
Abstract—In smart grids, one of the key issues is accurate
forecasting of electricity load and price to reduce the gap
between generation and consumption of electricity. To address
this issue, a framework is proposed in which feature
selection is performed with the Random Forest (RF) technique on
both the load and price datasets. For prediction, RF, Support
Vector Machine (SVM), and SVM with an enhanced linear
kernel and tuned parameters are used. New York electricity
market data for load (MWh) and price ($) is used for
this purpose. Daily and weekly forecasts are produced
by the proposed system, and several performance metrics
are used to evaluate the prediction results. The
results show that the proposed technique performs better
(by 0.07% for load and 0.12% for price) than the default
linear-kernel SVR.
Index Terms—Data Analytics, Forecasting, Smart Grids,
Support Vector Machine, Support Vector Regression,
Short-Term
I. INTRODUCTION
Electricity is an essential need of the modern world.
A balance between generation and consumption (GC) of
electricity must be maintained for efficient usage of
resources. Traditional grids are used to manage the GC
of electricity. A traditional grid is a centralized system
in which power flows in one direction, from generation
to the consumer through transmission lines [1]. However,
traditional grids are not capable of balancing the GC
of electricity, which gives rise to a GC gap. Smart
Grids (SGs) are introduced to fill this GC gap. An SG is a
decentralized system in which power flows in both directions,
from generation to consumer and vice versa [1]. In SGs,
much work has been done on electricity management,
forecasting, monitoring, control, etc. [2]. This work
focuses on forecasting. Forecasting means making predictions
based on the analysis of trends in past and present data. In
this paper, the terms prediction and forecasting are used
interchangeably. The short-term electricity load and price are
forecasted in this work, so that the SG can better
manage upcoming electricity shortages and gaps.
Forecasting can be done in many ways, such as statistical surveys,
technology forecasting, scenario building, data analytics, etc.
[3]. We perform forecasting with data analytics.
Data analytics is the process of examining raw data with
the help of specialized systems and software to draw
conclusions about the information present in that raw data [4].
Much useful information can be extracted from historical
data through different techniques. In this work,
regression techniques are used to forecast load
and price. For regression, we use Support Vector
Regression (SVR), a type of Support Vector Machine (SVM).
The terms SVM and SVR are used interchangeably in this article.
A. Motivation and Problem Statement
Many existing studies perform different types of
load and price forecasting, either to improve the accuracy
of existing forecasting models or to reduce the complexity and
over-fitting of these models. In [5], the authors use the Gated
Recurrent Units (GRUs) technique to forecast price, whereas
in [6], the authors forecast load using a Convolutional
Neural Network (CNN) combined with Long Short Term Memory
(LSTM). Deep Neural Networks (DNNs) are used
in [7] to forecast both electricity load and price. Motivated
by these studies, both load and price forecasting are
performed in our work.
978-1-7281-4452-8/19/$31.00 ©2019 IEEE
In [8], the authors perform load forecasting with SVR.
However, feature redundancy is not discussed in their work,
and redundant features compromise accuracy. Moreover, the
kernel function is not used in an efficient manner. Therefore,
we define a kernel function and compare it with a simple SVM.
In this article, the problem of electricity load and price
forecasting is highlighted. The main objective of this work is
the accurate prediction of load and price using the historical
data of the SG. For this purpose, an SVM framework is applied.
An SVM classifier divides the data into categories by constructing
a hyperplane between them. SVM is an efficient method;
however, the following challenges need to be addressed in
order to achieve high accuracy of forecasting models.
• Feature Reduction: SVM has a high computational
complexity. To reduce it, the data features need to be
reduced according to their importance. Only features of
high importance should be used.
• Data Transformation: SVM uses a kernel function to
transform data into the required form. There are many
types of predefined kernel functions that can be used
for different purposes. However, a customized kernel
can be defined according to the type of data. The
correct transformation of data is very important for
better prediction and accuracy.
• Hyper Parameter Tuning: In forecasting with SVM, the
hyper parameters of SVM affect the performance. Some
of these parameters are the insensitive loss function, gamma,
cost penalty and kernel function. It is challenging to
find appropriate values of these parameters to obtain
higher accuracy.
B. Contributions
To address the abovementioned challenges, an Enhanced
Linear Kernel SVM (ELKSVM) model is proposed for the
accurate forecasting of electricity price and load. The
contributions of this paper are:
• A framework is proposed to accurately forecast load and
price using data analytics in SG, in which feature selection
and classification are integrated.
• Random Forest (RF) is used to compute feature importance,
and the Select From Model (SFM) feature selector is used
to select important features based on the computed
importance.
• An Enhanced Linear (ELinear) kernel with two additional
parameters is proposed for more accurate data
transformation, and the SVM model is combined with it.
• Two hyper parameters of SVM, cost penalty and gamma,
are tuned for optimal forecasting.
The rest of the paper is organized as follows: Sections II and
III describe related work and the proposed system model.
Section IV elaborates the simulation results and Section V
presents the performance evaluation. The conclusion and
future directions are discussed in Section VI.
II. RELATED WORK
In the literature, three types of forecasting (prediction)
techniques are used for electricity price and load forecasting:
data driven, classical and artificially intelligent.
Classifiers such as RF, naive Bayes and
AutoRegressive Integrated Moving Average (ARIMA) are used for
forecasting. In [9], [7], shallow neural networks suffer from the
over-fitting problem. In [8], the authors perform load forecasting
with SVR; however, feature redundancy is not discussed in that
work. In [10]-[14], multiple techniques are proposed; however,
their prediction accuracy is not evaluated accurately. In [11],
the authors study the existing feature engineering techniques for
extracting suitable features from data. A multi-variable Mutual
Information (MI) method is used for feature selection in
[15]. Interaction Gain (IG) and MI are applied to determine
the relevance of features [15]. The C4.5 algorithm is used for
feature selection in price forecasting and performs better
than the Iterative Dichotomiser 3 (ID3) algorithm in building a
Decision Tree (DT) [16]; however, DT faces the over-fitting
problem. To process time series data, symbolic aggregation
approximation is used for feature extraction. It performs better
in training; however, it does not perform well in prediction [17].
In [18], an SVM method is applied for the prediction
of load and price. Multiple solutions have already been
presented for managing demand response in residential areas.
Huang et al. in [19] and Ruelens et al. in [20] present
demand response mechanisms for residential
consumers. A rule mining method is used to classify
the interdependency between appliance usage and power
consumption [21]. However, it lacks a proper rule mining
process and appliance-to-appliance association.
III. PROPOSED SYSTEM MODEL
The proposed system model has six steps: data
preprocessing (splitting, noise reduction, float to integer
conversion), feature selection using RF, linear kernel
optimization with introduced parameters, model building
with SVR using an optimized kernel and tuned parameters,
load and price forecasting, and finally performance evaluation
as shown in Fig. 1.
A. Data Preprocessing
Hourly load and price data for the month of January over
four years (2014-2017) is used. The data is taken from the website
of NECA-ISO (New England Control Area Independent
System Operator), an on-line portal that provides
users with real time data [22]. In the first step, the data is
normalized by converting floats to integers and by removing
noise for better results. The data is split into two parts: (1)
70 percent of the data is used for training the model and (2)
the remaining 30 percent is used for testing.
Fig. 1: Proposed System Model
In this prediction model, RF is used for feature selection.
Initially, there are five features in both the load and price
datasets. After computing feature importance with the RF model,
we set a threshold of 0.15 for feature selection through the SFM
feature selector. The threshold value is chosen after many
simulations with different threshold values. After this selection,
features with low importance are eliminated and only the features
with high importance are retained.
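As an illustration of this selection step, the following minimal sketch assumes a scikit-learn implementation, which the text does not name explicitly. A Random Forest regressor supplies the feature importances and `SelectFromModel` keeps the features whose importance is at least 0.15. The synthetic data, the number of trees and all variable names are assumptions made only for this example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel

# Synthetic stand-in for one dataset: 600 samples with 5 candidate features.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
# Hypothetical target: driven equally by the first four features;
# the fifth feature is pure noise and should receive a low importance.
y = X[:, :4].sum(axis=1) + rng.normal(scale=0.1, size=600)

# Fit the forest inside the selector and apply the 0.15 threshold.
selector = SelectFromModel(
    RandomForestRegressor(n_estimators=50, random_state=0),
    threshold=0.15,
)
selector.fit(X, y)

importances = selector.estimator_.feature_importances_
mask = selector.get_support()        # True for the retained features
X_selected = selector.transform(X)   # columns with importance >= 0.15

print("importances:", np.round(importances, 3))
print("kept feature indices:", np.flatnonzero(mask))
```

The threshold is passed as a plain float, so other candidate values can be compared by refitting the selector with a different `threshold`.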
B. Model Building with Optimized Kernel
After transforming the data according to the selected features,
the kernel function is defined. In this kernel function, a linear
kernel is optimized through two new parameters ‘a’ and ‘b’
for efficient data transformation according to the type of
data. The SVR model is then built using this optimized kernel
function. Moreover, two other parameters of SVM, cost penalty
and gamma, are tuned through extensive simulations. The model
is then trained on the training data and predicts the hourly
load and price for the grid side.
1) Optimized Kernel: In this section, the optimization
process for kernel function is defined. Two new parameters:
‘a’ and ‘b’ are defined in linear kernel for better
transformation of data. The purpose of introducing these
parameters is to reduce over-fitting and to increase
generalization. The equations for linear kernel and ELinear
kernel are given below:
$$\mathrm{linear}(X, Y) = X Y^{\top} \tag{1}$$
$$\mathrm{ELinear}(X, Y) = b \left( \frac{X}{a} \right) Y^{\top} \tag{2}$$
Here, ‘b’ is used for generalization and ‘a’ is used to reduce
over-fitting.
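Equations (1) and (2) can be written directly with NumPy, as in the minimal sketch below. The function names and the random test matrices are assumptions for illustration; the default values a = 2 and b = 1.35 are the ones reported for the load data. Because the dot product is linear, the ELinear Gram matrix equals the linear Gram matrix multiplied by the scalar b / a, which the last lines check numerically.

```python
import numpy as np

def linear_kernel(X, Y):
    """Eq. (1): linear kernel, the Gram matrix dot(X, Y.T)."""
    return np.dot(X, Y.T)

def elinear_kernel(X, Y, a=2.0, b=1.35):
    """Eq. (2): ELinear kernel, b * dot(X / a, Y.T)."""
    return b * np.dot(X / a, Y.T)

# Small random matrices (rows are samples, columns are features).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
Y = rng.normal(size=(5, 3))

K_linear = linear_kernel(X, Y)
K_elinear = elinear_kernel(X, Y, a=2.0, b=1.35)

# ELinear is the linear kernel rescaled by the factor b / a.
same_up_to_scale = np.allclose(K_elinear, (1.35 / 2.0) * K_linear)
print(K_elinear.shape, same_up_to_scale)
```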
C. Performance Metrics
To evaluate the prediction model, several performance
metrics are used to measure the accuracy of the prediction
results: Mean Square Error (MSE), Root MSE (RMSE),
Mean Absolute Error (MAE) and
Mean Absolute Percentage Error (MAPE). Finally, accuracy
is calculated by subtracting MAPE from one hundred, because
MAPE is an absolute percentage and is easy to interpret.
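The metrics named above can be sketched in a few lines of plain Python. The function name and the sample values are hypothetical; the formulas are the standard definitions of MSE, RMSE, MAE and MAPE, with accuracy taken as 100 minus MAPE as described in the text. MAPE is undefined when an actual value is zero, so the sketch assumes strictly non-zero actual values.

```python
import math

def forecast_metrics(actual, predicted):
    """Return MSE, RMSE, MAE, MAPE (in percent) and accuracy = 100 - MAPE."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # Percentage error relative to each actual value (must be non-zero).
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": mae,
        "MAPE": mape,
        "accuracy": 100.0 - mape,
    }

# Hypothetical hourly values, for illustration only.
metrics = forecast_metrics([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
print(metrics)
```

For these sample values the absolute percentage errors are 10%, 5% and 0%, so MAPE is 5% and the derived accuracy is 95%.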
IV. RESULTS AND SIMULATIONS
To evaluate the proposed system model, several
experiments have been carried out. For these
experiments, preprocessing is performed on the datasets and
the forecasting model is constructed in Spyder.
A. Data Set Description
Hourly load and price data of NECA-ISO (New England Control
Area Independent System Operator) is used
for forecasting. Hourly data for the month of January over
four years (2014-2017) is used to predict electricity load and
price. The data is normalized by converting it from floats to
integers and by removing noise for better results. The data is
split into two parts: (1) training data and (2) testing data. As
a result of the split, 2084 of the 2976 data samples are used
for training and the remaining 892 are used for testing.
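The sample counts can be checked with simple arithmetic: four Januaries of 31 days with 24 hourly readings give 2976 samples, and 2084 of them correspond to roughly 70 percent. The sketch below reproduces this calculation. The chronological ordering of the split is an assumption made for illustration, since the text does not state whether the split is chronological or random.

```python
# Sample counts implied by the dataset description.
hours_per_day = 24
days_in_january = 31
years = 4  # January of 2014, 2015, 2016 and 2017

total_samples = years * days_in_january * hours_per_day
train_samples = 2084  # reported in the text
test_samples = total_samples - train_samples
train_share = 100 * train_samples / total_samples

# Assumption: chronological split, earliest hours go to training.
hour_index = list(range(total_samples))
train_index = hour_index[:train_samples]
test_index = hour_index[train_samples:]

print(total_samples, test_samples, round(train_share, 2))  # 2976 892 70.03
```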
B. Feature Selection
To reduce redundancy among features, RF is used
to calculate feature importance for both the load
and price datasets. Based on this importance, features are
selected by setting the threshold value to 0.15. SFM is used to
select the features according to the threshold value. In this
selection, four out of five features are selected for further
processing. Fig. 2 shows the feature importance for the load data
and Fig. 3 shows the feature importance for the price data.
Fig. 2: Features Importance For Load
Fig. 3: Features Importance For Price
SVM is not an actual machine; it is a collection
of predefined algorithms for classification and regression
tasks. Because the target attribute is continuous, SVR
is used. The ELinear kernel is used for data transformation.
Extensive simulations have been done to find optimal values
of a and b, the parameters of the ELinear kernel. Based on these
simulations, a is set to 2 for both load and price,
and b is set to 1.35 and 1.72 for load
and price, respectively. A grid search algorithm is applied to
choose the optimal set of values for the hyper parameters of SVM
from random sets of values. Out of 50 random sets of values,
36 and 27 are chosen for cost penalty and gamma,
respectively.
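A possible realization of this step is sketched below, assuming scikit-learn's `SVR` and `GridSearchCV`, which the text does not name. The ELinear kernel is passed to `SVR` as a callable and the cost penalty `C` is searched over a small hypothetical grid that includes the reported value 36. Gamma is left out of the grid because scikit-learn's `SVR` uses it only for the `rbf`, `poly` and `sigmoid` kernels. The synthetic data, the grid values and the scoring choice are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

def elinear_kernel(X, Y, a=2.0, b=1.35):
    """ELinear kernel from Eq. (2), with the values reported for load."""
    return b * np.dot(X / a, Y.T)

# Synthetic stand-in for the four selected features and the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X @ np.array([1.0, -2.0, 0.5, 3.0]) + rng.normal(scale=0.1, size=200)

# 70/30 split, as in the text (first 140 samples for training).
X_train, X_test = X[:140], X[140:]
y_train, y_test = y[:140], y[140:]

# Hypothetical candidate values for the cost penalty C.
param_grid = {"C": [1, 10, 36, 100]}
search = GridSearchCV(
    SVR(kernel=elinear_kernel),
    param_grid,
    cv=3,
    scoring="neg_mean_absolute_error",
)
search.fit(X_train, y_train)

predictions = search.best_estimator_.predict(X_test)
test_mae = float(np.mean(np.abs(predictions - y_test)))
print("best C:", search.best_params_["C"], "test MAE:", round(test_mae, 3))
```

For the price data, the same sketch would use b = 1.72 in `elinear_kernel`.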
C. Comparison of Forecasting Models
To validate the applicability and the performance of the
proposed forecasting model, two other classifier models, an RF
classifier and an SVR classifier with a linear kernel, are
used for comparison. The results show that the SVR with the
ELinear kernel performs better than the SVR with a simple linear
kernel for both load and price by 0.2%. However, the RF classifier
performs better than ELKSVM. Figs. 4, 5 and 6 show
the results of one-day load forecasting for January 26
with RF, SVM and ELKSVM, respectively.
Similarly, Figs. 7, 8 and 9 show the results of one-day
price forecasting for January 26 with the three
classifiers. Fig. 10 shows the one-week prediction of
load with ELKSVM, and Fig. 11 shows the one-week prediction
of price with ELKSVM.
V. PERFORMANCE EVALUATION
Fig. 4: RF Classifier
Fig. 5: SVR Classifier
For better understanding of the results, the MAPE
values for each classifier are visualized
in Fig. 12 for load and Fig. 13 for price.
The MAPE value for ELKSVM is less than the value for SVR
in both figures.
VI. CONCLUSION AND FUTURE WORK
Data analytics is used in this work to improve the performance
of forecasting techniques for load and price forecasting in
SG. For this purpose, a new forecasting model with three
stages is introduced. Feature selection is done by RF, which
has reduced the complexity of SVR. A linear kernel has
been enhanced with two parameters to reduce over-fitting
and to improve generality, which has a positive effect on
the overall model. Based on the ELinear kernel, an SVM model is
built with tuned hyper parameters for better forecasting.
Experimental results show that ELKSVM performs 0.2%
better than a simple SVM with a linear kernel for both load and
price forecasting.
Fig. 6: ELKSVM Classifier
Fig. 7: RF Classifier
Fig. 8: SVR Classifier
Fig. 9: ELKSVM Classifier
Fig. 10: One Week Load Prediction
Fig. 11: One Week Price Prediction
Fig. 12: MAPE For Load
Fig. 13: MAPE For Price
REFERENCES
[1] Smart Grid (6EE5A). Available online:
http://www.electrical-guru.com/Subject.aspx?id=3&code=6EE5A&unit
id=2&topicid=9, (accessed on: Nov. 09, 2018).
[2] Saleh, Ahmed I., Asmaa H. Rabie, and Khaled M. Abo-Al-Ez. ”A
data mining based load forecasting strategy for smart electrical grids.”
Advanced Engineering Informatics 30, no. 3 (2016): 422-448.
[3] Forecasting. Available online: https://en.wikipedia.org/wiki/Forecasting,
(last accessed on: Nov. 09, 2018)
[4] M. Rouse. Data Analytics (DA).
https://searchdatamanagement.techtarget.com/definition/data-analytics,
(last accessed at Nov 09, 2018)
[5] Zhao, JunHua, ZhaoYang Dong, and Xue Li. ”Electricity price
forecasting with effective feature preprocessing.” In 2006 IEEE Power
Engineering Society General Meeting, pp. 8-pp. IEEE, 2006.
[6] Moghaddass, Ramin, and Jianhui Wang. ”A hierarchical framework
for smart grid anomaly detection using large-scale smart meter data.”
IEEE Transactions on Smart Grid 9, no. 6 (2017): 5820-5830.
[7] Ghasemi, Ali, Hossien Shayeghi, Mohammad Moradzadeh, and
Mohammad Nooshyar. ”A novel hybrid algorithm for electricity price
and load forecasting in smart grids with demand-side management.”
Applied energy 177 (2016): 40-59.
[8] Vrablecová, Petra, Anna Bou Ezzeddine, Viera Rozinajová, Slavomír
Šárik, and Arun Kumar Sangaiah. ”Smart grid load forecasting using
online support vector regression.” Computers & Electrical Engineering
65 (2018): 102-117.
[9] Fan, Cheng, Fu Xiao, and Yang Zhao. ”A short-term building cooling
load prediction method using deep learning algorithms.” Applied
energy 195 (2017): 222-233.
[10] Abedinia, Oveis, Nima Amjady, and Hamidreza Zareipour. ”A new
feature selection technique for load and price forecast of electrical
power systems.” IEEE Transactions on Power Systems 32, no. 1
(2016): 62-74.
[11] Wang, Kun, Jun Yu, Yan Yu, Yirou Qian, Deze Zeng, Song Guo, Yong
Xiang, and Jinsong Wu. ”A survey on energy internet: Architecture,
approach, and emerging technologies.” IEEE Systems Journal 12, no.
3 (2017): 2403-2416.
[12] Ayub, Nasir, Adnan Ishaq, Mudabbir Ali, Muhammad Azeem Sarwar,
Basit Amin, and Nadeem Javaid. ”An efficient scheduling of
power and appliances using metaheuristic optimization technique.” In
International Conference on Intelligent Networking and Collaborative
Systems, pp. 178-190. Springer, Cham, 2017.
[13] Mujeeb, Sana, Nadeem Javaid, Mariam Akbar, Rabiya Khalid, Orooj
Nazeer, and Mahnoor Khan. ”Big data analytics for price and load
forecasting in smart grids.” In International Conference on Broadband
and Wireless Computing, Communication and Applications, pp. 77-87.
Springer, Cham, 2018.
[14] Ryu, Seunghyoung, Jaekoo Noh, and Hongseok Kim. ”Deep neural
network based demand side short term load forecasting.” Energies 10,
no. 1 (2016): 3.
[15] Wu, H. C., S. C. Chan, K. M. Tsui, and Yunhe Hou. ”A new
recursive dynamic factor analysis for point and interval forecast of
electricity price.” IEEE Transactions on Power Systems 28, no. 3
(2013): 2352-2365.
[16] Moon, Jihoon, Jinwoong Park, Sanghoon Han, and Eenjun Hwang.
”Power Consumption Forecasting Scheme for Educational Institutions
Based on Analysis of Similar Time Series Data.” Journal of KIISE 44,
no. 9 (2017): 954-965.
[17] Wang, Kun, Chenhan Xu, Yan Zhang, Song Guo, and Albert Y.
Zomaya. ”Robust big data analytics for electricity price forecasting
in the smart grid.” IEEE Transactions on Big Data 5, no. 1 (2017):
34-45.
[18] Kuo, Ping-Huan, and Chiou-Jye Huang. ”An electricity price
forecasting model by hybrid structured deep neural networks.”
Sustainability 10, no. 4 (2018): 1280.
[19] Huang, Hantao, Yuehua Cai, Hang Xu, and Hao Yu. ”A multiagent
minority-game-based demand-response management of smart buildings
toward peak load reduction.” IEEE Transactions on Computer-Aided
Design of Integrated Circuits and Systems 36, no. 4 (2016): 573-585.
[20] Ruelens, Frederik, Bert J. Claessens, Stijn Vandael, Bart De
Schutter, Robert Babuška, and Ronnie Belmans. ”Residential demand
response of thermostatically controlled loads using batch reinforcement
learning.” IEEE Transactions on Smart Grid 8, no. 5 (2016):
2149-2159.
[21] Liu, Jin-peng, and Chang-ling Li. ”The short-term power load
forecasting based on sperm whale algorithm and wavelet least
square support vector machine with DWT-IR for feature selection.”
Sustainability 9, no. 7 (2017): 1188.
[22] Price rates. Available online at: http://www.nyiso.com/public/index.jsp
(last accessed on: Sep. 25, 2018)