Short-Term Electricity Price and Load Forecasting
using Enhanced Support Vector Machine and
K-Nearest Neighbor
Mughees Ali1, Zahoor Ali Khan2, Sana Mujeeb1, Shahid Abbas1 and Nadeem Javaid1
1COMSATS University Islamabad, Islamabad 44000, Pakistan
2CIS, Higher Colleges of Technology, Fujairah 4114, United Arab Emirates
Abstract—Accurate load and price forecasting is one of the crucial stages in the Smart Grid (SG). Efficient load and price forecasting is required to minimize the large difference between power generation and consumption. Accurate selection and extraction of meaningful features from data are challenging. In this paper, six months of load and price data from the New York Independent System Operator (NYISO) are used for forecasting. A Decision Tree (DT) is used for feature selection and the Recursive Feature Elimination (RFE) technique is used for feature extraction; RFE removes redundancy from the selected features. After feature extraction, two classifiers are used for forecasting: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). Both classifiers have several parameters with default values, which are modified in this work. Week-ahead load and price forecasting is performed. For load forecasting, the modified SVM achieves an accuracy of 89.5984% and the modified KNN 89.8605%. For price forecasting, the accuracy of the modified SVM is 88.2740% and that of the modified KNN is 85.5999%.
Index Terms—Smart Grid, Electricity load forecasting, Electricity price forecasting, Support vector machine, K-nearest neighbor
The Traditional Grid (TG) system is a combination of several power system elements such as power generation, power transmission lines and power transmission substations. These systems are distant from the electricity utilization sectors. Electric power is carried from the power generation sectors to the power consuming areas through long transmission lines. In TG systems, power flows in one direction, from generation to customers. The Smart Grid (SG) system is a modernized form of the TG system which provides more secure electrical service than the old TG system [ ]. SG provides bidirectional communication between the utility and electricity users. The SG system plays a very important role in monitoring consumers' electrical power consumption, smart meters, smart homes and smart substations. Power consumption differs from area to area and depends on factors such as holidays, weather conditions and many others; therefore, forecasting is very important in order to predict the power consumption and unit price of any area.
Forecasting is a process that takes historical data as input in order to predict future trends or events. It shows the probability of what might happen in the future. Forecasting techniques are used to predict future consumption and electricity price [ ]. Data analytics plays a very important role in SG. On the basis of data analytics, consumers can easily understand the generation, transmission and consumption of electric power. Data analytics in SG also helps to monitor consumers' electric appliance preferences, grid-connected system information and smart appliance usage in smart homes. It may also be used to detect electricity theft at every level, such as the home, commercial and industrial levels. It also helps to examine the economy or development of a specific region: if electricity consumption and the usage of smart appliances in one region are higher than in other regions, this suggests that the region is more developed and its economy is stronger.
Load and price forecasting is divided into four main categories: very short term, short term, medium term and long term forecasting. Very short term forecasting is one hour ahead prediction, short term forecasting covers hours to a week ahead, medium term forecasting covers months to a year, and long term forecasting covers more than a year.
Different classifiers are used in SG for prediction, such as Support Vector Machine (SVM) [ ], Artificial Neural Network (ANN) [ ] and Recurrent Neural Network (RNN) [ ]. In the proposed model two classifiers are used: SVM and KNN. In the literature, SVM is applied for both classification and regression; in this work it is mostly used for classification. SVM performs linear classification very well. SVM uses the kernel trick to implicitly transform the data into a higher-dimensional feature space. On the basis of this transformation, SVM finds the optimal boundary between possible outputs. It therefore also performs non-linear classification very efficiently; non-linear SVM means that the boundary calculated by the algorithm is not a straight line as in linear SVM. Furthermore, K-Nearest Neighbor (KNN) is applied for classification. It is a non-parametric classifier, which means that the model structure is determined from the data. It is based on feature similarity: prediction is made just-in-time by calculating the similarity between the input and the training instances. Both SVM and KNN have different parameters. The parameters of SVM include kernel, degree, gamma, shrinking and probability. The parameters of KNN include n_neighbors, weights, leaf_size and metric_params. All parameters of both classifiers have default values. Modifying these default parameter values can enhance the classifiers' efficiency.
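To make the just-in-time behaviour of KNN concrete, the following is a minimal pure-Python sketch, not the authors' implementation. The function name `knn_predict`, the toy data and the choice of Euclidean distance are illustrative assumptions; `n_neighbors` and `weights` follow the parameter names mentioned in the text.

```python
import math
from collections import defaultdict

def knn_predict(train_x, train_y, query, n_neighbors=5, weights="uniform"):
    """Classify `query` by a vote among its nearest training instances.

    No model is fitted in advance: all work happens at prediction time.
    """
    # Distance from the query to every stored training instance.
    distances = [(math.dist(x, query), label) for x, label in zip(train_x, train_y)]
    nearest = sorted(distances, key=lambda pair: pair[0])[:n_neighbors]

    votes = defaultdict(float)
    for distance, label in nearest:
        if weights == "distance":
            # Closer neighbours count more; the small constant avoids division by zero.
            votes[label] += 1.0 / (distance + 1e-12)
        else:
            votes[label] += 1.0
    return max(votes, key=votes.get)

# Toy data (hypothetical): (temperature in C, hour of day) -> load class.
train_x = [(18, 3), (19, 4), (20, 5), (30, 14), (31, 15), (33, 16)]
train_y = ["low", "low", "low", "high", "high", "high"]

print(knn_predict(train_x, train_y, (29, 13), n_neighbors=3))
print(knn_predict(train_x, train_y, (21, 6), n_neighbors=3, weights="distance"))
```

Changing `n_neighbors` or `weights` changes which stored instances influence the vote, which is why tuning these values can alter accuracy.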
A. Problem Statement and Contributions
In the literature, different techniques have been used for forecasting. In this paper, two of these techniques, SVM and KNN, are used for classification and considered as benchmark schemes. In [ ], the authors use SVM as a benchmark scheme for load forecasting. In [ ], redundancy in features is not discussed. In [ ], the authors modify SVM and forecast price. In [ ], the authors use KNN for forecasting. In [ ], the authors use KNN and naive Bayes classifiers for energy efficiency. Both load and price forecasting are discussed in this work. Accurate prediction of future electricity consumption and price using smart grid data is the major objective of the proposed work. The contributions of this paper are as follows:
• Using the real-world electricity load and price dataset of NYISO for New York City, we apply different techniques (i.e., Decision Tree (DT), RFE, SVM and KNN), which give very good prediction results.
• Feature selection is performed on the dataset to select the best features, which give better accuracy. DT is used for the selection of features.
• After feature selection, feature extraction is performed using the RFE technique. RFE removes redundancy from the selected features.
• For prediction of load and price, two classifiers are used: SVM and KNN. The parameters of both classifiers are also modified for better accuracy using the grid search technique.
In this section, the proposed system and its working are explained, as visualised in Fig. 1, which shows the flow of short term load and price forecasting. The proposed model is based on the following four main steps: normalization of the data through a preprocessing procedure, training, testing and classification, which are used for load and price forecasting.
A. Dataset Description
Load and price data are acquired from NYISO. The NYISO dataset contains data of different states of the United States of America; New York City data is used in the proposed model. The data is in hourly time series form. Hourly humidity, pressure, temperature, wind direction, wind speed, load and price data from 01-05-14 to 31-10-14 are available in the dataset. The data is divided on an hourly basis for each day, and all six months of data are arranged in the same hourly manner.
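As a quick sanity check on the size of the dataset, the short sketch below counts the hourly records implied by the stated date range. It assumes the dates are written day-month-year, that both end dates are inclusive, and that no hours are missing or duplicated; none of this is confirmed by the paper beyond the date range itself.

```python
from datetime import datetime, timedelta

# Date range as given in the dataset description, read as day-month-year.
start = datetime.strptime("01-05-14", "%d-%m-%y")
end = datetime.strptime("31-10-14", "%d-%m-%y")

# Both end dates are inclusive, and each day contributes 24 hourly records.
n_days = (end - start).days + 1
n_hours = n_days * 24

# One timestamp per hourly record, assuming a complete, gap-free series.
timestamps = [start + timedelta(hours=h) for h in range(n_hours)]

print(n_days, n_hours)                 # 184 4416
print(timestamps[0], timestamps[-1])   # first and last hourly timestamp
```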
B. Feature Engineering
Six months of load and price data are used for forecasting. In preprocessing, the dataset is divided into two main parts: a training part and a testing part. The first part is used to train our model while the second part is used to test it.
Feature selection is the most important step for identifying relevant features. The dataset contains different values, some relevant to our requirement and some not, so it is necessary to select the relevant features from the dataset. For this purpose a feature selection technique is used; in the proposed model, DT is used for the feature selection process.
After feature selection, feature extraction is performed using the RFE technique. Redundancy in data increases computational complexity; therefore, the feature extraction technique is used to remove data redundancy.
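The elimination loop behind RFE can be sketched as follows: fit a model, drop the feature with the smallest importance, and repeat until the desired number of features remains. This is a minimal sketch under stated assumptions, not the authors' pipeline. The paper does not say which estimator RFE wraps, so an ordinary least-squares fit on standardized features is assumed here, and the helper names (`standardize`, `solve`, `fit_linear`, `rfe`) and toy values are hypothetical.

```python
from statistics import mean, pstdev

def standardize(column):
    """Scale a column to zero mean and unit variance so coefficients are comparable."""
    m, s = mean(column), pstdev(column)
    return [(v - m) / s if s else 0.0 for v in column]

def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_linear(columns, target):
    """Least-squares coefficients for standardized columns and a centred target."""
    gram = [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]
    for i in range(len(columns)):
        gram[i][i] += 1e-6  # tiny ridge term keeps the system solvable
    moments = [sum(a * b for a, b in zip(c, target)) for c in columns]
    return solve(gram, moments)

def rfe(features, target, n_keep):
    """Refit, drop the feature with the smallest |coefficient|, and repeat."""
    remaining = {name: standardize(col) for name, col in features.items()}
    centre = mean(target)
    centred = [t - centre for t in target]
    while len(remaining) > n_keep:
        names = list(remaining)
        coefs = fit_linear([remaining[n] for n in names], centred)
        weakest = min(zip(names, coefs), key=lambda pair: abs(pair[1]))[0]
        del remaining[weakest]
    return list(remaining)

# Toy data (hypothetical): load depends on temperature and humidity only.
features = {
    "temperature": [20, 22, 25, 27, 30, 32, 28, 24],
    "humidity": [60, 55, 65, 50, 70, 45, 58, 62],
    "wind_direction": [10, 200, 90, 300, 45, 180, 270, 120],
}
load = [100 + 5 * t + 0.5 * h
        for t, h in zip(features["temperature"], features["humidity"])]

print(rfe(features, load, n_keep=2))
```

Because the model is refitted after every removal, the ranking of the surviving features can change from round to round, which is what distinguishes RFE from a one-shot importance ranking.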
C. Forecasting
For forecasting, historical data is used as input and future trends are predicted on the basis of this data. In the proposed model, two different forecasting techniques are used: SVM and KNN. In SVM, kernel, degree, gamma, shrinking, probability and some other parameters are used. In KNN, n_neighbors, weights, leaf_size, metric_params and some other parameters are used. All parameters of both classifiers have default values, and modifying these default values can enhance the classifiers' efficiency. Week-ahead price and load forecasting is based on the previous load and price data: in order to forecast the next week's load and price, the previous months' data is used. In the dataset, the column named ‘Load’ is set as the target for load forecasting and ‘Price’ is set as the target for price forecasting. 75% of the dataset is used for training the model while 25% is used for testing it.
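The 75%/25% split and the normalization step can be sketched as below. The paper does not state whether the split is chronological or random, nor which normalization is applied; this sketch assumes a chronological split and min-max scaling fitted on the training part only. The function names and load values are hypothetical.

```python
def chronological_split(rows, train_fraction=0.75):
    """Split time-ordered rows so the test part follows the training part."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

def fit_min_max(values):
    """Return a scaler that maps the range of `values` onto [0, 1]."""
    low, high = min(values), max(values)
    span = (high - low) or 1.0  # avoid division by zero for a constant series
    return lambda v: (v - low) / span

# Hypothetical hourly load values in time order.
load = [5200, 5100, 5050, 5300, 5900, 6400, 6800, 7000]

train, test = chronological_split(load)
scale = fit_min_max(train)  # scaling constants come from the training data only
train_scaled = [scale(v) for v in train]
test_scaled = [scale(v) for v in test]

print(len(train), len(test))
print(train_scaled)
print(test_scaled)
```

Fitting the scaler on the training part alone keeps information from the test period out of training; as a consequence, scaled test values may fall outside [0, 1].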
SVM and KNN are then applied separately for load and price forecasting. Both classifiers have different parameters with different default values, and modifying these parameter values enhances the accuracy of the classifiers. SVM and KNN are therefore modified for better accuracy of load and price forecasting.
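The contributions list names grid search as the tuning technique. A minimal sketch of the idea is given below: every combination of candidate parameter values is evaluated and the best one is kept. For brevity it tunes a small one-dimensional KNN regressor on a single validation split and scores by mean absolute error; the candidate values, the scoring choice, the helper names and the data are all assumptions, since the paper does not list the grid it searched.

```python
from itertools import product

def knn_regress(train, query, n_neighbors, weights):
    """Predict a value for `query` from its nearest (x, y) training pairs."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - query))[:n_neighbors]
    if weights == "distance":
        w = [1.0 / (abs(x - query) + 1e-9) for x, _ in nearest]
    else:
        w = [1.0] * len(nearest)
    return sum(wi * y for wi, (_, y) in zip(w, nearest)) / sum(w)

def grid_search(train, validation, grid):
    """Try every parameter combination and keep the one with the lowest MAE."""
    best_params, best_mae = None, float("inf")
    for values in product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        errors = [abs(y - knn_regress(train, x, **params)) for x, y in validation]
        mae = sum(errors) / len(errors)
        if mae < best_mae:
            best_params, best_mae = params, mae
    return best_params, best_mae

# Hypothetical (temperature, load) pairs; values are illustrative only.
train = [(18, 510), (20, 530), (22, 560), (25, 610), (28, 680), (31, 760), (34, 850)]
validation = [(21, 545), (27, 655), (32, 790)]

grid = {"n_neighbors": [1, 2, 3, 5], "weights": ["uniform", "distance"]}
best_params, best_mae = grid_search(train, validation, grid)
print(best_params, round(best_mae, 2))
```

In practice the score for each combination is often averaged over several cross-validation folds rather than taken from one validation split, which makes the selected parameters less sensitive to how the data happened to be divided.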
In this section, we discuss the results of load and price forecasting achieved through our proposed system and the existing models.
A. Results of Price forecasting
All the simulations are performed using the Python platform on a computer system with an Intel Core i7 processor, 8 GB RAM and a 500 GB hard disk. Price forecasting results are shown below.
Scheme 1: SVM for Price forecasting: The first technique used for price forecasting is SVM. Before forecasting, feature importance is evaluated. The feature importance is shown in Fig. 2. The data is normalized for obtaining the prediction results of the SVM model.
Fig. 1: Proposed Model (blocks: Feature Selection, Load and Price, Feature Extraction)
Fig. 2: Feature Importance
Scheme 2: KNN for Price Forecasting: Fig. 3 shows the normalized price data of both SVM and KNN. Fig. 4 shows the predicted price data of both SVM and KNN.
Modified scheme: SVM and KNN for Price Forecasting: Parametric modification is now performed on both classifiers, and the modified classifiers are applied to the dataset. Normalization is performed on the price data using Modified SVM and KNN. Fig. 5 shows the normalized price data of both Modified SVM and KNN. After normalization, week-ahead price prediction is performed using Modified SVM and KNN. Fig. 6 shows the predicted price data of both Modified SVM and KNN. Simulation results show that the modification enhances the accuracy of both classifiers: Modified SVM shows 89.59% accuracy and Modified KNN shows 89.86% accuracy.
Fig. 3: Normalized Data by SVM and KNN
Fig. 4: Predictive Data by SVM and KNN
Fig. 5: Normalized Data for Modified SVM and KNN
Fig. 6: Prediction Data by Modified SVM and KNN
Mean Absolute Error (MAE), Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) are used for performance evaluation. Table I shows the performance evaluators of Modified SVM and Modified KNN.
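For reference, the three evaluators can be computed as in the sketch below, which uses their standard textbook definitions. The function names and sample values are illustrative and are not taken from the paper's experiments.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average size of the errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Square Error: penalizes large errors more heavily than MAE."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent; undefined if an actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Illustrative values only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 380.0]

print(round(mae(actual, predicted), 3))
print(round(rmse(actual, predicted), 3))
print(round(mape(actual, predicted), 3))
```

MAE and RMSE are expressed in the units of the forecast quantity, while MAPE is scale-free, which makes it easier to compare load and price errors.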
The load forecasting results of SVM, KNN, enhanced SVM and enhanced KNN are shown below.
Scheme 1: SVM for Load forecasting: The first technique used for load forecasting is SVM. Before forecasting, feature importance is calculated. After the feature importance is calculated, as shown in Fig. 7, normalization is performed on the load data using SVM. Fig. 8 shows the normalized load data using SVM.
Scheme 2: KNN for Load Forecasting:
Modified scheme: SVM and KNN for load forecasting: Parametric modification is now performed on both classifiers, which are then applied to the dataset. Fig. 9 shows the normalized load data of both Modified SVM and KNN. Fig. 10 shows the predicted load data of both Modified SVM and KNN. The modification enhances the accuracy of both classifiers: 88.27% is shown by Modified SVM and 85.55% by Modified KNN. The performance evaluators MAE, RMSE and MAPE are used to validate the performance of the proposed techniques. Table II shows the performance evaluators of Modified SVM and KNN. Table II shows that modifying the parameters of SVM and KNN enhances their overall performance and accuracy.
Fig. 7: Normalized Data by SVM and KNN
Fig. 8: Predictive Data by SVM and KNN
Fig. 9: Normalized Data by Modified SVM and KNN
TABLE I: Performance Evaluator values of modified SVM and KNN
Evaluator   Modified SVM value   Modified KNN value
MAE         13.6340              11.0688
RMSE        11.68613             13.3133
MAPE        6.402                5.139
Fig. 10: Predictive Data by Modified SVM and KNN
TABLE II: Performance evaluator of modified SVM and KNN
Evaluator   Modified SVM value   Modified KNN value
MAE         11.58297             12.8663
RMSE        14.5103              16.2073
MAPE        11.726               14.400
This work has proposed and implemented load and price forecasting. The sole purpose of the proposed work is to increase the forecasting accuracy. The NYISO dataset is used to predict the price and load of electricity. Feature selection and extraction are performed. Further, two enhanced classifiers are used: SVM and KNN. These classifiers are enhanced by modifying the above-mentioned parameters. Accuracy of
modified SVM is 89.59% and modified KNN is 89.86% for
[1] Stephens, Jennie C., Elizabeth J. Wilson, and Tarla Rai Peterson. “Smart grid (R)evolution.” Cambridge University Press, 2015.
[2] Kun Wang, Chenhan Xu, and Song Guo. “Big Data Analytics for Price Forecasting in Smart Grids.” 2016 IEEE: 978-1-5090-1328-9/16.
[3] Yang Liu, Wei Wang, Noradin Ghadimi. “Electricity load forecasting by an improved forecast engine for building level consumers.” 0360-5442/ 2017 Elsevier.
[4] Yinghao Chu, Carlos F.M. Coimbra. “Short-term probabilistic forecasts for Direct Normal Irradiance.” 0960-1481/ 2016 Elsevier.
[5] Chuan Choong Yang, Chit Siang Soh, Vooi Voon Yap. “A systematic approach in appliance disaggregation using k-nearest neighbours and naive Bayes classifiers for energy efficiency.” Springer Science+Business Media B.V. 2017.
[6] Fan, Cheng, Fu Xiao, and Yang Zhao. “A short-term building cooling load prediction method using deep learning algorithms.” Applied Energy 195 (2017): 222-233.
[7] Liu, Jin-peng, and Chang-ling Li. “The short-term power load forecasting based on sperm whale algorithm and wavelet least square support vector machine with DWT-IR for feature selection.” Sustainability 9, no. 7 (2017):
[8] Kuo, Ping-Huan, and Chiou-Jye Huang. “An Electricity Price Forecasting Model by Hybrid Structured Deep Neural Networks.” Sustainability 10, no. 4 (2018): 1280.
[9] Moghaddass, Ramin, and Jianhui Wang. “A hierarchical framework for smart grid anomaly detection using large-scale smart meter data.” IEEE Transactions on Smart Grid, 2017.
[10] Amber KP, Aslam MW, Hussain SK. “Electricity consumption forecasting models for administration buildings of the UK higher education sector.” Energy and Buildings. 2015 Mar 1;90:127-36.
[11] Guangzhong Dong and Zonghai Chen. “Data Driven Energy Management in a Home Microgrid Based on Bayesian Optimal Algorithm.” 1551-3203 : 2018 IEEE.
[12] Xishuang Dong, Lijun Qian, Lei Huang. “Short-Term Load Forecasting in Smart Grid: A Combined CNN and K-Means Clustering Approach.” 978-1-5090-3015-6/17 2017 IEEE.
[13] Peter Lusis, Kaveh Rajab Khalilpour, Lachlan Andrew, Ariel Liebman. “Short-term residential load forecasting: Impact of calendar effects and forecast granularity.” 0306-2619/ 2017 Elsevier.