All content in this area was uploaded by Nadeem Javaid on Dec 13, 2019
Short-Term Electricity Price and Load Forecasting
using Enhanced Support Vector Machine and
K-Nearest Neighbor
Mughees Ali1, Zahoor Ali Khan2, Sana Mujeeb1, Shahid Abbas1 and Nadeem Javaid1,∗
1COMSATS University Islamabad, Islamabad 44000, Pakistan
nadeemjavaidqau@gmail.com∗
www.njavaid.com
2CIS, Higher Colleges of Technology, Fujairah 4114, United Arab Emirates
Abstract—Accurate load and price forecasting is one of the
crucial stages in a Smart Grid (SG). Efficient load and price
forecasting is required to minimize the large difference between
power generation and consumption. Accurate selection and extraction
of meaningful features from data are challenging. In this paper,
six months of load and price data from the New York Independent
System Operator (NYISO) is used for forecasting. Decision Tree (DT)
is used for feature selection and the Recursive Feature Elimination
(RFE) technique is used for feature extraction. RFE removes
redundancy from the selected features. After feature extraction,
two classifiers are used for forecasting: Support Vector Machine
(SVM) and K-Nearest Neighbor (KNN). These classifiers have
different parameters with default values, which are modified in
this work. Week ahead load and price forecasting is performed.
For load forecasting, the accuracy of the modified SVM is 89.5984%
and that of the modified KNN is 89.8605%. For price forecasting,
the accuracy of the modified SVM is 88.2740% and that of the
modified KNN is 85.5999%.
Index Terms—Smart Grid, Electricity load forecasting,
Electricity price forecasting, Support vector machine,
K-nearest neighbour.
I. INTRODUCTION
A Traditional Grid (TG) system is a combination of several power
system elements such as power generation, power transmission
lines and power transmission substations. These systems are
distant from the electricity utilization sectors. Electric power is
carried from the power generation sectors to the power consuming
areas through long transmission lines. In TG systems, power
flows in one direction, from generation to customers. The Smart
Grid (SG) system is a modernized form of the TG system which
provides more secure electrical service as compared to the old
TG system [1]. SG provides bidirectional communication between
the utility and electricity users. The SG system plays a very
important role in monitoring consumer power consumption, smart
meters, smart homes and smart substations. Power consumption
differs from area to area and depends on factors such as
holidays, weather conditions and many others; therefore,
forecasting is very important in order to predict the power
consumption and unit price of any area.
Forecasting is a process which takes historical data as input
in order to predict future trends or events. It indicates
the probability of what might happen in the future.
Forecasting techniques are used for the prediction of future
consumption and electricity price [2]–[13]. Data analytics
plays a very important role in SG. On the basis of data
analytics, consumers can easily understand the generation,
transmission and consumption of electric power.
Data analytics in SG also helps to monitor consumers' electric
appliance preferences, grid connected system information and
smart appliance usage at smart homes. Data analytics in SG
may also be used to check electricity theft at every level, i.e.,
home level, commercial level and industrial level. It also helps
to examine the economy or development of a specific region.
If electricity consumption and the usage of smart appliances
in a specific region are higher than in other regions, it
suggests that this region is more developed and that its
economy is better than that of the others.
Load and price forecasting is divided into four main categories:
very short term, short term, medium term and long term
forecasting. Very short term forecasting is for one hour ahead
prediction, short term forecasting is for hours to a week ahead
prediction, medium term forecasting is for months to a year
ahead prediction and long term forecasting is for prediction
longer than a year ahead.
Different classifiers are used in SG for prediction, such as
Support Vector Machine (SVM) [2], Artificial Neural Network
(ANN) [7], Recurrent Neural Network (RNN) [8], etc. In the
proposed model, two classifiers are used: SVM and KNN.
SVM is applied for both classification and regression purposes
in the literature; in this work, it is mainly used for
classification. SVM performs linear classification very well.
SVM uses the kernel trick to transform the data and, on the
basis of this transformation, finds an optimal boundary between
the possible outputs. It also performs non-linear classification
very efficiently. Non-linear SVM means that the boundary
calculated by the algorithm is not a straight line as in linear
SVM. Furthermore, K-Nearest Neighbor (KNN) is applied for
classification. It is a non-parametric classifier, which means
that the model structure is determined from the data. It is
based on feature similarity: prediction is performed
just-in-time by calculating the similarity between the input
and the training instances. Both SVM and KNN have different
parameters. The parameters of SVM include kernel, degree, gamma,
shrinking and probability. The parameters of KNN include
n-neighbors, weights, leaf-size and metric-parameters.
All parameters of both classifiers have default values.
Modifying these default values can enhance the classifiers'
efficiency.
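The just-in-time behaviour of KNN described above can be
illustrated with a minimal sketch. It assumes Euclidean distance
and majority voting; the function name knn_classify and the
sample readings are hypothetical and are not taken from the
implementation used in this paper.

```python
import math
from collections import Counter


def knn_classify(train_rows, train_labels, query, n_neighbors=5):
    """Label `query` by majority vote among its nearest training rows."""
    # No model is fitted in advance: all distances are computed
    # at prediction time, which is why KNN is called just-in-time.
    ranked = sorted(
        zip(train_rows, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:n_neighbors])
    return votes.most_common(1)[0][0]


# Hypothetical normalized (temperature, humidity) readings,
# each tagged with an illustrative load class.
rows = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.3),
        (0.8, 0.9), (0.9, 0.7), (0.85, 0.8)]
labels = ["low", "low", "low", "high", "high", "high"]

prediction = knn_classify(rows, labels, (0.75, 0.85), n_neighbors=3)
print(prediction)
```

The value of n_neighbors controls how many training instances
take part in the vote, which is one of the parameters this work
proposes to modify.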
A. Problem Statement and Contributions
In the literature, different techniques are used for forecasting.
In this paper, two techniques, SVM and KNN, are used for
classification and considered as benchmark schemes. In [2], the
authors use SVM as a benchmark scheme for load forecasting;
however, redundancy in features is not discussed. In [3], the
authors modify SVM and forecast price. In [4], the authors use
KNN for forecasting. In [5], the authors use KNN and naive Bayes
classifiers for energy efficiency. Both load and price
forecasting are discussed in this work. Accurate prediction of
future electricity consumption and price using smart grid data
is the major objective of the proposed work. The contributions
of this paper are as follows:
• Using a real world electricity load and price dataset of
NYISO for New York city, we apply different techniques
(i.e., Decision Tree (DT), RFE, SVM and KNN), which give
very good prediction results.
• Feature selection is performed on the dataset to select the
best features, which gives better accuracy. DT is used for
the selection of features.
• After feature selection, feature extraction is performed
using the RFE technique, which removes redundancy from
the selected features.
• For the prediction of load and price, two classifiers are
used: SVM and KNN. The parameters of both classifiers are
also modified for better accuracy using the grid search
technique.
II. PROPOSED SYSTEM MODEL
In this section, the proposed system and its working are
explained. Fig. 1 visualizes the flow of short term load and
price forecasting. The proposed model is based on the following
four main steps: normalization of the data by a preprocessing
procedure, training, testing and classification, which are
used for load and price forecasting.
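The paper does not state which normalization is applied. A
minimal sketch, assuming min-max scaling to the range [0, 1]
(a common choice, used here only for illustration), is given
below; the function name and the sample values are hypothetical.

```python
def min_max_normalize(values):
    """Rescale a sequence of numbers linearly to the range [0, 1]."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical hourly load readings.
hourly_load = [10.0, 20.0, 30.0]
normalized_load = min_max_normalize(hourly_load)
print(normalized_load)
```

Scaling of this kind matters for both SVM and KNN, since both
rely on distances between feature vectors.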
A. Dataset Description
Load and price data is acquired from NYISO. The NYISO dataset
contains data of different states of the United States of
America. New York city data is used in the proposed model. The
data is in hourly time series form. The hourly humidity,
pressure, temperature, wind direction, wind speed, load and
price data from 01-05-14 to 31-10-14 is available in the
dataset. The data of each day is divided on an hourly basis,
and all six months of data are arranged in the same hourly
manner.
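As a rough check of the dataset size implied by this
description, the hourly time index can be sketched as follows.
The sketch assumes that the dates are written as day-month-year,
that every hour of every day is present and that no daylight
saving adjustment is applied; the function name hourly_index is
hypothetical.

```python
from datetime import datetime, timedelta


def hourly_index(start, end):
    """Return one timestamp per hour from `start` 00:00 to `end` 23:00."""
    first = datetime.strptime(start, "%d-%m-%y")
    last = datetime.strptime(end, "%d-%m-%y") + timedelta(hours=23)
    n_hours = (last - first) // timedelta(hours=1) + 1
    return [first + timedelta(hours=h) for h in range(n_hours)]


index = hourly_index("01-05-14", "31-10-14")
print(len(index), index[0], index[-1])
```

Under these assumptions the six months cover 184 days, i.e.
184 x 24 hourly records per variable.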
B. Feature Engineering
Six months of load and price data is used for forecasting.
In the preprocessing of data, the dataset is divided into two
main parts: training data and testing data. The first part is
utilized for training our model while the second part is used
for testing it.
Feature selection is the most important step for choosing
relevant features. The dataset contains different values; some
are relevant to our requirement while others are not.
It is necessary to select the relevant features from the
dataset. For this purpose, a feature selection technique is
used. In the proposed model, DT is used for the feature
selection process.
After feature selection, feature extraction is performed using
the RFE technique. Redundancy in data increases computational
complexity. Therefore, the feature extraction technique is used
for the removal of data redundancy.
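The idea of recursive elimination can be sketched as follows:
a model is fitted, the feature with the smallest importance is
dropped, and the procedure is repeated on the remaining features.
The paper does not state which estimator is wrapped by RFE, so
this sketch assumes a plain linear model fitted by gradient
descent as a stand-in, with the absolute weight as importance.
All function names and the synthetic data are hypothetical.

```python
from statistics import mean, pstdev


def standardize(column):
    """Shift a column to zero mean and scale it to unit variance."""
    mu, sigma = mean(column), pstdev(column)
    return [(v - mu) / sigma for v in column]


def fit_linear(rows, target, steps=500, lr=0.1):
    """Fit target ~ w . x by batch gradient descent; return w."""
    n, d = len(rows), len(rows[0])
    w = [0.0] * d
    for _ in range(steps):
        grad = [0.0] * d
        for x, y in zip(rows, target):
            err = sum(wj * xj for wj, xj in zip(w, x)) - y
            for j in range(d):
                grad[j] += err * x[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def rfe(columns, target, n_keep):
    """Drop the feature with the smallest |weight| until n_keep remain."""
    kept = list(columns)  # names of the surviving features
    while len(kept) > n_keep:
        rows = list(zip(*(columns[name] for name in kept)))
        weights = fit_linear(rows, target)
        weakest = min(range(len(kept)), key=lambda j: abs(weights[j]))
        kept.pop(weakest)  # refit without the least important feature
    return kept


# Synthetic data: the target depends on temperature and humidity only.
n = 60
raw = {
    "temperature": [float(i % 24) for i in range(n)],
    "humidity": [float((i * 7) % 5) for i in range(n)],
    "wind_direction": [float((i * 3) % 11) for i in range(n)],
}
features = {name: standardize(col) for name, col in raw.items()}
load = [4 * t + 2 * h
        for t, h in zip(features["temperature"], features["humidity"])]

selected = rfe(features, load, n_keep=2)
print(selected)
```

Because the synthetic target does not depend on wind_direction,
that feature receives a near-zero weight and is eliminated first.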
C. Forecasting
For forecasting, historical data is used as input and future
trends are predicted on the basis of this data. In the proposed
model, two different forecasting techniques are used: SVM and
KNN. The parameters of SVM include kernel, degree, gamma,
shrinking and probability. The parameters of KNN include n
neighbors, weights, leaf size and metric params. All parameters
of both classifiers have default values. Modifying these default
values can enhance the classifiers' efficiency. Week ahead price
and load forecasting is based on the previous load and price
data: in order to forecast the next week's load and price, the
previous months' data is used. In the dataset, the column named
‘Load’ is set as the target for load forecasting and ‘Price’ is
set as the target for price forecasting. 75% of the dataset is
used for training the model while 25% is used for testing it.
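The 75%/25% division can be sketched as follows. The paper does
not state whether the records are shuffled before splitting;
this sketch assumes a chronological split, which keeps the test
period strictly after the training period. The function name
and the placeholder records are hypothetical.

```python
def chronological_split(rows, train_fraction=0.75):
    """Split time-ordered rows into an earlier and a later part."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Placeholder for 184 days of hourly records (indices stand in for rows).
hourly_rows = list(range(184 * 24))
train_rows, test_rows = chronological_split(hourly_rows)

# A week ahead horizon corresponds to the next 7 x 24 hourly records.
week_ahead = test_rows[: 7 * 24]
print(len(train_rows), len(test_rows), len(week_ahead))
```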
SVM and KNN are applied separately for load and price
forecasting. Both classifiers have different parameters with
different default values. Modifying these parameter values
enhances the accuracy of the classifiers; therefore, SVM and
KNN are subsequently modified for better accuracy of load and
price forecasting.
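The parameter modification by grid search can be sketched as
follows: every combination of candidate parameter values is
evaluated on held-out data and the best-scoring one is kept.
The sketch uses a small one-dimensional KNN with the parameters
n_neighbors and weights; the candidate values, the function
names and the toy data are hypothetical and are not the settings
used in this paper.

```python
from collections import Counter
from itertools import product


def knn_predict(train_x, train_y, query, n_neighbors, weights):
    """Classify a scalar query by (optionally distance-weighted) vote."""
    nearest = sorted((abs(x - query), y) for x, y in zip(train_x, train_y))
    votes = Counter()
    for dist, label in nearest[:n_neighbors]:
        votes[label] += 1.0 if weights == "uniform" else 1.0 / (dist + 1e-9)
    return votes.most_common(1)[0][0]


def accuracy(params, train, valid):
    """Fraction of validation points that are classified correctly."""
    (tx, ty), (vx, vy) = train, valid
    hits = sum(knn_predict(tx, ty, q, **params) == y for q, y in zip(vx, vy))
    return hits / len(vx)


def grid_search(grid, train, valid):
    """Try every parameter combination; keep the first best one."""
    best_params, best_score = None, -1.0
    for values in product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        score = accuracy(params, train, valid)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Toy normalized data with one noisy training label at 0.35.
train = ([0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90],
         ["low", "low", "low", "high", "low", "high", "high", "high", "high"])
valid = ([0.34, 0.15, 0.75, 0.65], ["low", "low", "high", "high"])
grid = {"n_neighbors": [1, 3, 5], "weights": ["uniform", "distance"]}

best_params, best_score = grid_search(grid, train, valid)
print(best_params, best_score)
```

In this toy case a single neighbor is misled by the noisy
label, whereas a vote among three neighbors is not, so the
search moves away from n_neighbors = 1.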
III. SIMULATION RESULTS
In this section, we discuss the results of load and
price forecasting, which are achieved through our proposed
system and the existing models.
A. Results of Price forecasting
All the simulations are performed using the Python platform on
a computer system with an Intel Core i7 processor, 8 GB RAM
and a 500 GB hard disk. Price forecasting results are shown
below.
Scheme 1: SVM for Price forecasting: The first technique used
for price forecasting is SVM. Before forecasting, feature
importance is evaluated. The feature importance is shown in
Fig. 2. The data is normalized for obtaining the prediction
results of the SVM model.
Fig. 1: Proposed Model (Input: NYISO New York dataset; Feature
Selection: Decision Tree; Feature Extraction: Recursive Feature
Elimination; Forecasting: SVM + KNN classifiers; Output: load
and price forecasting)
Fig. 2: Feature Importance
Scheme 2: KNN for Price Forecasting: Fig. 3 shows the
normalized price data of both SVM and KNN. Fig. 4 shows
the predictive price data of both SVM and KNN.
Fig. 3: Normalized Data by SVM and KNN
Fig. 4: Predictive Data by SVM and KNN
Modified scheme: SVM and KNN for Price Forecasting:
Parametric modification is now performed on both classifiers,
which are then applied to the dataset. The price data is
normalized for Modified SVM and KNN. Fig. 5 shows the
normalized price data of both Modified SVM and KNN. After
normalization, week ahead price prediction is performed using
Modified SVM and KNN. Fig. 6 shows the predictive price
data of both Modified SVM and KNN. Simulation results depict
that modification enhances the accuracy of both classifiers, as
Modified SVM shows 89.59% accuracy and Modified KNN shows
89.86% accuracy.
Fig. 5: Normalized Data for Modified SVM and KNN
Fig. 6: Prediction Data by Modified SVM and KNN
Mean Absolute Error (MAE), Root Mean Square Error
(RMSE) and Mean Absolute Percentage Error (MAPE) are used for
performance evaluation. Table I shows the performance
evaluators of Modified SVM and Modified KNN.
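The three evaluators follow their standard definitions, which
can be sketched as below. The function names and the sample
values are hypothetical, and MAPE is expressed in percent under
the assumption that no actual value is zero.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values non-zero)."""
    ratios = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * ratios / len(actual)


# Hypothetical actual and forecast values.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

Since RMSE squares the errors before averaging, it penalizes
large deviations more strongly than MAE.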
IV. RESULTS OF LOAD FORECASTING SCHEME
The load forecasting results of SVM, KNN, enhanced SVM
and enhanced KNN are shown below.
Scheme 1: SVM for Load forecasting: The first technique used
for load forecasting is SVM. Before forecasting, the feature
importance is calculated. After the feature importance is
obtained, as shown in Fig. 7, the load data is normalized for
the SVM model. Fig. 8 shows the normalized load data using SVM.
Scheme 2: KNN for Load Forecasting:
Modified scheme: SVM and KNN for load forecasting:
Parametric modification is now performed on both classifiers,
which are then applied to the dataset. Fig. 9 shows the
normalized load data of both Modified SVM and KNN. Fig. 10
shows the predictive load data of both Modified SVM and KNN.
Modification enhances the accuracy of both classifiers: 88.27%
is shown by Modified SVM and 85.55% by Modified KNN. The
performance evaluators MAE, RMSE and MAPE are used
to validate the performance of the proposed techniques. Table II
shows the performance evaluators of Modified SVM and KNN.
Table II shows that modification of the parameters of SVM
and KNN enhances their overall performance and accuracy.
Fig. 7: Normalized Data by SVM and KNN
Fig. 8: Predictive Data by SVM and KNN
Fig. 9: Normalized Data by Modified SVM and KNN.
Fig. 10: Predictive Data by Modified SVM and KNN.
TABLE I: Performance Evaluator values of modified SVM and KNN.
Evaluator   Modified SVM value   Modified KNN value
MAE         13.6340              11.0688
RMSE        11.68613             13.3133
MAPE        6.402                5.139
TABLE II: Performance evaluator of modified SVM and KNN
Evaluator   Modified SVM value   Modified KNN value
MAE         11.58297             12.8663
RMSE        14.5103              16.2073
MAPE        11.726               14.400
V. CONCLUSIONS
This work has proposed and implemented load and
price forecasting. The sole purpose of the proposed work is to
increase the forecasting accuracy. The NYISO dataset is used to
predict the price and load of electricity. Feature selection and
extraction are performed. Further, two enhanced classifiers
are used: SVM and KNN. These classifiers are enhanced
by modifying the above-mentioned parameters. For load
forecasting, the accuracy of the modified SVM is 89.59% and
that of the modified KNN is 89.86%.
REFERENCES
[1] Stephens, Jennie C., Elizabeth J. Wilson, and Tarla Rai Peterson.
“Smart grid (R)evolution.” Cambridge University Press, 2015.
[2] Kun Wang, Chenhan Xu, and Song Guo. “Big Data Analytics for Price
Forecasting in Smart Grids.” 2016 IEEE: 978-1-5090-1328-9/16.
[3] Yang Liu, Wei Wang, Noradin Ghadimi. “Electricity load forecasting
by an improved forecast engine for building level consumers.”
0360-5442/ 2017 Elsevier.
http://dx.doi.org/10.1016/j.energy.2017.07.150.
[4] Yinghao Chu, Carlos F.M. Coimbra. “Short-term probabilistic
forecasts for Direct Normal Irradiance.” 0960-1481/ 2016 Elsevier.
http://dx.doi.org/10.1016/j.renene.2016.09.012.
[5] Chuan Choong Yang, Chit Siang Soh, Vooi Voon Yap. “A systematic
approach in appliance disaggregation using k-nearest neighbours and
naive Bayes classifiers for energy efficiency.” Springer
Science+Business Media B.V. 2017.
[6] Fan, Cheng, Fu Xiao, and Yang Zhao. “A short-term building cooling
load prediction method using deep learning algorithms.” Applied
Energy 195 (2017): 222-233.
[7] Liu, Jin-peng, and Chang-ling Li. “The short-term power load
forecasting based on sperm whale algorithm and wavelet least square
support vector machine with DWT-IR for feature selection.”
Sustainability 9, no. 7 (2017): 1188.
[8] Kuo, Ping-Huan, and Chiou-Jye Huang. “An Electricity Price
Forecasting Model by Hybrid Structured Deep Neural Networks.”
Sustainability 10, no. 4 (2018): 1280.
[9] Moghaddass, Ramin, and Jianhui Wang. “A hierarchical framework for
smart grid anomaly detection using large-scale smart meter data.”
IEEE Transactions on Smart Grid, 2017.
[10] Amber KP, Aslam MW, Hussain SK. “Electricity consumption
forecasting models for administration buildings of the UK higher
education sector.” Energy and Buildings. 2015 Mar 1;90:127-36.
[11] Guangzhong Dong and Zonghai Chen. “Data Driven Energy Management
in a Home Microgrid Based on Bayesian Optimal Algorithm.”
1551-3203: 2018 IEEE.
[12] Xishuang Dong, Lijun Qian, Lei Huang. “Short-Term Load Forecasting
in Smart Grid: A Combined CNN and K-Means Clustering Approach.”
978-1-5090-3015-6/17 2017 IEEE.
[13] Peter Lusisa, Kaveh Rajab Khalilpour, Lachlan Andrew, Ariel
Liebman. “Short-term residential load forecasting: Impact of
calendar effects and forecast granularity.” 0306-2619/ 2017 Elsevier.