
ESAENARX and DE-RELM: Novel schemes for big data predictive analytics of electricity load and price



Accurate forecasting of electricity price and load is an essential and challenging task in smart grids. Since electricity load and price are strongly correlated, forecast accuracy degrades when the bidirectional relation between price and load is not considered. Therefore, this paper considers the price and load relationship and proposes two Multiple Inputs Multiple Outputs (MIMO) Deep Recurrent Neural Network (DRNN) models for price and load forecasting. The first proposed model, Efficient Sparse Autoencoder Nonlinear Autoregressive Network with eXogenous inputs (ESAENARX), comprises feature engineering and forecasting. For feature engineering, we propose ESAE, and forecasting is performed using the existing method NARX. The second proposed model, Differential Evolution Recurrent Extreme Learning Machine (DE-RELM), is based on the RELM model and the meta-heuristic DE optimization technique. The descriptive and predictive analyses are performed on the big data of two well-known electricity markets, i.e., ISO NE and PJM. The proposed models outperform their sub-models and a benchmark model. The refined and informative features extracted by ESAE improve the forecasting accuracy of ESAENARX, and optimization improves the accuracy of DE-RELM. Compared to the cascade Elman network, ESAENARX reduces MAPE by up to 16% for load forecasting and 7% for price forecasting. DE-RELM reduces MAPE by 1% for both load and price forecasting.
Sana Mujeeb (a), Nadeem Javaid (a)
(a) COMSATS University Islamabad, Islamabad 44000, Pakistan
Dynamic programming
Multi-objective optimization
Pareto front
Bird swarm and Cuckoo search algo-
Hybrid technique
Demand side management
Demand response
Smart grid.
1. Introduction
The smart grid is a modern power supply network that
uses communication technology. It consists of automation,
control and technology that respond quickly to changes in
consumption. The smart grid provides energy in an efficient,
secure, reliable, economical and environment-friendly
manner. Renewable Energy Sources (RESs) of power generation
are integrated to reduce carbon emissions. The smart grid
allows two-way communication between the consumers and the
utility. With the emergence of smart metering infrastructure,
consumers are informed about the price per unit in advance.
Consumers can adjust their demand load economically according
to the price signals. They can reduce consumption cost by
shifting load to a low-price hour. Smart grids create a
price-responsive environment where the price varies with
a change in demand and vice versa.
In unidirectional grids, there is a one-way interaction from
the generation side to consumers. The consumers are not
able to respond to the price signal because they are not
informed of the price dynamically. The demand shows
very little or no elasticity to price variations in
unidirectional grids. However, with the advent of the smart
metering system, consumers are well aware of the price and
control their power consumption accordingly. Therefore,
price and demand are highly correlated and interdependent.
The market participants need reliable techniques to
maximize their profit, which depends on accurate load and
price forecasting. Price and demand forecasting also plays an
important role in energy systems planning, market design,
security of supply and operation planning for future power
consumption. An accurate forecast is very important. A 1%
reduction in the Mean Absolute Percentage Error (MAPE) of the
load forecast reduces the generation cost by 0.1% to 0.3% [1].
0.1% of the generation cost is approximately $1 million
annually in a large-scale smart grid. Due to the importance of
an accurate forecast of electricity price and load, researchers
continue to work on improving the forecast accuracy. Using
big data for predictive analytics improves the forecasting
accuracy [2]. Electricity data is big data, as the smart meters
record data at small time intervals [3]. In a large-sized smart
grid, approximately 220 million smart meter measurements
are recorded daily. Analytics of this energy big data helps
the power utilities to gain deeper insights into consumer
behavior [4]. As the volume of input data increases, training
classical forecasting methods becomes difficult. Processing
big data with classifier-based models is very difficult because
of their high space and time complexity. On the other
hand, Deep Neural Networks (DNNs) perform very well on
big data [5]. DNNs have an excellent ability for self-learning
and nonlinear approximation. They reduce memory requirements
by dividing the training data into mini-batches; the network is
then trained on the whole data batch by batch.
Corresponding author (N. Javaid)
ORCID(s): 0000-0003-3777-8249 (N. Javaid)
The rest of the paper is organized as follows: Section 2
reviews related work, the problem statement is given in
Section 3, the methods used are described in Section 4, the
proposed models are described in Section 12, with DE-RELM in
Section 13, Section 15 presents simulations and results, and
Section 22 concludes this article.
Sana et al.: Preprint submitted to Elsevier Page 1 of 16
Big Data Predictive Analytics of Electricity Load and Price
2. Related Work
With the advent of the smart metering system, energy-related
data is collected in very large volumes at a high velocity
from a variety of sources. This data is referred to as
energy big data. For making decisions regarding energy market
operation, predictive analytics is performed on this load
and price data. For maintaining the balance of demand and
supply, an accurate prediction of load is essential, whereas
price forecasting plays an important role in the bidding
process and energy trading. To ensure the reliability,
stability and security of the smart grid, accurate forecasts of
electricity load and price are essential. Electricity load and
price have a bi-directional relationship; therefore,
simultaneous prediction of load and price yields greater accuracy.
The authors of papers [6,7] have predicted price and load
simultaneously. The authors of [6] have proposed a hybrid model
for simultaneous forecasting of electricity load and price.
The proposed model consists of three stages, i.e., denoising,
feature engineering and forecasting. For denoising, the
authors propose a new Wavelet Packet Transform (WPT)
based method, Flexible WPT (FWPT). The features are
selected using adjacent features and Conditional Mutual
Information (CMI). In the forecasting step, Autoregressive
Integrated Moving Average (ARIMA) and Nonlinear Least
Square Support Vector Machine (NLSSVM) are employed
for linear and nonlinear modeling, respectively. The NLSSVM
is optimized using an enhanced optimization technique, Time
Varying Artificial Bee Colony (TV-ABC). This hybrid model
achieves reasonable forecasting accuracy; however, the model
is highly complex. Moreover, the optimization of the
forecasting model leads to over-fitting. In paper [7], the
authors predict load and price using a multi-stage forecasting
approach. The complex forecasting approach proposed in this
work comprises feature selection and a multi-stage forecast
engine. Features are selected through a modified Maximum
Relevancy Minimum Redundancy (MRMR) method. Electricity
load and price are forecasted using a multi-block Artificial
Neural Network (ANN) known as Elman Neural Network (ENN).
The forecasting model is optimized by a shark smell
optimization method. This method achieves reasonable
forecasting accuracy. However, it is computationally
very expensive. The feature engineering process and the
optimization of ENN increase complexity. Moreover, big data
is not considered in this method. The authors of paper [8]
have conducted a predictive analysis of electricity price
forecasting taking advantage of big data. The relevant features
for training the prediction model are selected through an
extensive feature engineering process. This process has three
steps. Firstly, correlated features are selected using Gray
Correlation Analysis (GCA). Secondly, a hybrid of two feature
selection methods, ReliefF and Random Forest (RF), is used
for further feature selection. Lastly, Kernel Principal
Component Analysis (KPCA) is applied for dimension reduction.
Price is predicted by SVM, and the hyper-parameters of SVM
are optimized through modified Differential Evolution (DE).
In paper [9], the authors forecast energy consumption on big
data. An analysis of frequent patterns is performed using a
supervised clustering method. Energy consumption is
forecasted by the Bayesian network.
The authors of paper [10] have utilized the computational power
of deep learning for Electricity Price Forecasting (EPF).
Stacked Denoising Autoencoder (SDA) and RANSAC-SDA
(RS-SDA) models are implemented for online and day-ahead
hourly EPF. Three years of data (i.e., January 2012 to
November 2014) are utilized in this paper. The data is collected
from the Texas, Arkansas, Nebraska, Indiana and Louisiana
ISO hubs in the USA. Comprehensive analyses of the
capabilities of the RS-SDA and SDA models in EPF are
performed. The effectiveness of the proposed models is
validated through comparative analyses with classical
ANN, SVM (Support Vector Machine) and MARS (Multivariate
Adaptive Regression Splines). Both the SDA and
RS-SDA models are able to predict the electricity
price with a considerably lower MAPE than the
aforementioned models.
A deep learning model for Short-term Load Forecasting
(STLF) is proposed by Tong et al. [11]. The features are
extracted using SDA from the historical electricity load and
corresponding temperature data. Support Vector Regressor
(SVR) model is trained for the day ahead STLF. The SDA
has effectively extracted the abstract features from the data.
SVR model trained on these extracted features forecasts elec-
tricity load with low errors. The proposed model outper-
forms simple SVR and ANN in terms of forecasting accu-
racy which validates its performance.
The Shallow ANN (SANN) is utilized for electricity load
forecasting in [12] and [13]. SANNs have the problem of
overfitting. To avoid overfitting, hyperparameter
optimization is required, which increases the complexity of the
forecasting model.
A hybrid deep learning method is applied to forecast price
in [14]. Two deep learning methods are combined in this
research work. Features are extracted by a Convolutional Neural
Network (CNN). The short-term energy price is predicted using
Long Short Term Memory (LSTM). Half-hourly price data of PJM
2017 is used for prediction. The price of the previous 24 hours
is used to predict the next 1-hour electricity price. The hybrid
DNN structure has 10 hidden layers: 2 convolution layers,
2 max-pooling layers, 3 Rectified Linear Unit (ReLU) layers,
1 batch normalization layer, 1 LSTM layer for prediction, and a
last hidden layer that is fully connected. The CNN feature
extractor has 7 hidden layers and the LSTM predictor has 3 hidden
layers. The output of the 7th hidden layer of the CNN feature
extractor becomes the input of the LSTM predictor. The proposed
method outperforms simple CNN, LSTM and various machine
learning methods.
Table 1
Related work of load and price forecasting.
Task | Forecast Horizon | Platform / Testbed | Dataset | Algorithms
Load and price forecasting [6] | Short-term | Hourly data of 6 states of USA | NYISO, 2015 | MRMR, Multi-block Elman ANN, Enhanced shark smell optimization
Price forecasting [8] | Short-term | Hourly electricity price of 6 states of USA | ISO NE, 2010-2015 | GCA, Random forest (RF), ReliefF,
Consumption forecasting [9] | Short and long- | 6 second resolution consumption of 5 homes with 109 domestic appliance | UK-Dale, 2012-2015 | Association rule mining, Incremental k-means clustering, Bayesian network
Price forecasting [10] | Short-term | Hourly price of 5 hubs of MISO | USA, 2012-2014 | Stacked Denoising Autoencoders
Consumption forecasting [11] | Short-term | Aggregated hourly load of four regions | Los Angeles, California, Florida, New York City, USA, August |
Consumption forecasting [12] | Short-term | Electricity market data of 3 grids: FE, | PJM, USA, 2015 | Mutual Information (MI), ANN
Consumption forecasting [13] | Short-term | Electricity market data of 2 grids: DAY- | PJM, USA, 2015 | Modified MI + ANN
Price forecasting [14] | Short-term | Half hourly price of PJM | Intercontinental Exchange | Long Short Term Memory (LSTM), Convolutional Neural Network (CNN)
Price forecasting [15] | Short-term | Turkish day-ahead market electricity prices | Turkey, 2013-2016 | Recurrent Neural Network (RNN)
Cooling load forecasting [16] | Short-term | HVAC Cooling load of an educational build- | Hong Kong, 2015 | Elastic Net (ELN), SAE, RF, MLR, Gradient Boosting Machines (GBM), Extreme GB tree, SVR
Consumption forecasting [17] | Short-term | Hourly load of Korea Electric Power Corporation (KEPCO) | South Korea, 2012-2014 | Restricted Boltzman Machine (RBM)
Consumption forecasting [18] | Short-term | Individual house consumption of 7km of | Individual household electric power consumption, France, | Conditional RBM (CRBM), Factored
Load forecasting [19] | Short-term | 15 minute resolution of one retail building | Fremont, CA | SAE, ELM
Load forecasting [20] | Short-term | 15 minutes cooling consumption of a commercial building in Shenzhen city | Guangdong province, South China, 2015 | Empirical Mode Decomposition (EMD), Deep Belief Networks
Load forecasting [21] | Short-term | Hourly consumption from Macedonian Transmission Network Operator (MEPSO) | Republic of Macedonia, 2008- |
Load forecasting [22] | Short-term | Hourly consumption from Australia | AEMO, 2013 | EMD, DBN
Load forecasting [23] | Medium to | Hourly consumption of a public safety building, Salt Lake City, Utah. Aggregated hourly consumption of residential buildings, Austin, Texas | USA, 2015, 2016 | LSTM
Load forecasting [24] | Medium-term | Half hourly metropolitan electricity con- | France, 2008-2016 | LSTM, GA
Load forecasting [25] | Short-term | Hourly aggregated consumption of 6 states | ISO NE, 2003-2016 | Xgboost weighted k-means, EMD-
Load forecasting [26] | Short-term | Ireland consumption | Smart meter database of load profile, Ireland | Pooling deep RNN
Load forecasting [27] | Short-term | Daily electricity consumption data | 3 Chinese cities, 2014 | Feed Forward DNN (FFDNN), Probability Density Estimation
Load and photovoltaic power forecasting [28] | Short-term | Hourly residential power load data | Dataport dataset, 2018 | Deep Recurrent Neural Network (DRNN) with LSTM units
Load forecasting [29] | Short-term | Hourly electricity market data | ISO NE, 2007–2012 | Deep RNN
Load forecasting [30] | Short-term | Hourly aggregated consumption of 6 states | ISO NE, USA, | DRNN, FFDNN
The authors of [15] have utilized Gated Recurrent Units
(GRU) in an RNN for EPF.
Recently, deep learning forecasting methods have shown
good performance in electricity price [14-16] and load
forecasting [17-30]. However, the interdependency of load and
price is not considered in these DNN forecasting models.
In [31], the author discusses the importance of big data
applications and analytics in the development of Smart
Sustainable Cities (SSCs). An IoT based framework is proposed
to improve the functionalities of SSCs. The importance of
accurate load and price forecasting for the smart grid's
stability is discussed. Stability of the grid improves the
sustainability of SSCs. An SSC uses Information and
Communication Technology (ICT) for improving quality of life,
services and urban operations. It ensures to fulfill the present
and future environmental, social, cultural and economic re-
The authors of [32] conduct an extensive literature review on
future SSCs. Besides other aspects of future SSCs, energy
efficiency is also discussed in this review. The authors describe
the SSC as an energy-efficient, eco-friendly and real-time
city. Load demand forecasting plays a key role in energy
management and efficiency.
The future trends, architecture and challenges of SSCs are
reviewed in [33]. The major aspects of a smart city are illus-
trated in this study. Smart grid is discussed as an important
component of a smart city. The role of load demand fore-
casting is emphasized in an energy efficient city. Six dimen-
sions of SSCs are explained in [34]. The authors present a
road map towards SSCs. The concept of SSC is elaborated
with the help of six dimensions; one of these dimensions is
energy efficiency.
The authors of [35] discuss the present services of smart
cities like load demand forecasting in order to achieve a
sustainable city. The short-term load of Girona University,
Spain is studied. The forecasting model consists of outlier
rejection, feature selection using auto correlation and pre-
diction using auto regression. First, outliers are removed
based on k nearest neighbors and Euclidean distance. Sec-
ondly, highly correlated features with the target class are se-
lected and features having high correlation with other fea-
tures and less correlation with target class are eliminated. Fi-
nally, a classical data-driven prediction model, auto regres-
sion is implemented for STLF. The services embedded in the
studied layered architecture are described in detail, aiming to
make it part of a sustainable city.
3. Problem Statement and Contributions
Authors of paper [8] and [9] have used big data for pre-
dictive analytics. However, the extensive feature engineer-
ing process increases the computational complexity. The
feature engineering involves denoising of inputs, feature se-
lection and dimension reduction. After the feature engineer-
ing step, another important step is the optimization of the
prediction method’s hyperparameters. This optimization is
crucial to achieving accurate forecast results. Feature en-
gineering and model optimization steps make forecasting
complex. To avoid the extensive feature engineering pro-
cess, the deep learning methods are proposed for electricity
price [10] and load [11] forecasting. The mentioned deep
learning based forecasting models have forecasted electric-
ity load and price separately.
The electricity load and price signals have a high correla-
tion. The incorporation of the inherent bi-directional rela-
tion of electricity load and price in prediction models’ inputs
results in high prediction accuracy. The correlation of elec-
tricity load and price is not taken into consideration in [10]
and [11]. A forecasting method is needed that accurately
forecasts the electricity load and price simultaneously. In
this article, a forecasting model is proposed that is based on
deep learning. The proposed method accurately forecasts
electricity load and price simultaneously taking advantage
of big data. The major contributions of this study are
listed below:
• The proposed models take advantage of big data. Big
  data analyses of electricity load and price are presented
  in this study. Data and forecasting models are
  analyzed statistically and graphically.
• A new feature extraction scheme based on Sparse
  Autoencoder (SAE) is introduced in the first proposed
  model. The performance of SAE is improved by using
  wavelet packet denoising as a decoding function
  that significantly improves the quality of extracted
  features. The extracted features are presented as refined
  information and smooth training input of the forecasting
  model Nonlinear Autoregressive Network with
  Exogenous variables (NARX).
• The second proposed model is an optimized Recurrent
  Extreme Learning Machine (RELM). The parameters
  of RELM are optimized using a meta-heuristic
  optimization technique, differential evolution. The
  proposed models outperform ELM, RELM, NARX, DE-ELM
  and Cascade Elman ANN (CEANN) [7].
4. Proposed Model
Before describing the proposed forecasting models, this
section gives a brief description of the methods they use.
5. Artificial Neural Network for Forecasting
ANNs are inspired by the learning process of biological
neural networks. ANNs are able to model the complex
patterns hidden in the data. The Multilayer Perceptron
(MLP) is the simplest and most fundamental ANN
architecture [36]. The MLP comprises neurons, biases and
weights. An ANN learns a mapping between the inputs $x_i$ and
their respective targets $t_i$. The weights $W_i$ are updated
while creating this mapping; the network learns as the weights
are updated.
$$y(t) = f(W_1 x_1 + W_2 x_2 + \dots + W_n x_n) \tag{1}$$
Where, $W_i$ are the weights and $f$ is the activation function.
The most common algorithm used for updating the weights
is gradient descent. It reduces the squared error $E$ using the
delta rule:
$$E = \| y(t) - t(t) \|^2 \tag{2}$$
Where, $t(t)$ is the target vector corresponding to the
training vector $x(t)$.
$$w_{ij}^{(\ell)}(t+1) = w_{ij}^{(\ell)}(t) - \alpha \frac{\partial E}{\partial w_{ij}^{(\ell)}(t)} \tag{3}$$
$$b_{j}^{(\ell)}(t+1) = b_{j}^{(\ell)}(t) - \alpha \frac{\partial E}{\partial b_{j}^{(\ell)}(t)} \tag{4}$$
Where, $w_{ij}^{(\ell)}(t+1)$ is the new modified weight,
$w_{ij}^{(\ell)}(t)$ is the weight that is required to be changed,
$b_{j}^{(\ell)}(t)$ is the bias and the learning rate is
$\alpha$ ($> 0$).
Deep Neural Network (DNN) is ANN with deeper architec-
ture, i.e., several numbers of hidden layers. DNN is compu-
tationally stronger as compared to Shallow ANN (SANN).
The proposed forecasting engines are based on Deep Recur-
rent Neural Networks (DRNN), i.e., NARX and LSTM.
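The weight and bias updates of Equations (1) to (4) can be illustrated with a minimal pure-Python sketch of a single sigmoid neuron trained by gradient descent. The function names (`sigmoid`, `train_neuron`), the toy OR data set, the learning rate and the number of epochs are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def sigmoid(z):
    """Logistic activation, used here as the function f of Equation (1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(samples, targets, alpha=0.5, epochs=2000, seed=0):
    """Fit one neuron y = f(sum_i W_i * x_i + b) with the delta rule.

    Returns the learned weights, the bias and the summed squared
    error of every epoch.
    """
    rng = random.Random(seed)
    n = len(samples[0])
    w = [rng.uniform(-0.5, 0.5) for _ in range(n)]
    b = 0.0
    history = []
    for _ in range(epochs):
        total_error = 0.0
        for x, t in zip(samples, targets):
            y = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            total_error += (y - t) ** 2            # Equation (2)
            # dE/dz for E = (y - t)^2 with a sigmoid activation
            delta = 2.0 * (y - t) * y * (1.0 - y)
            # Equations (3) and (4): step against the gradient
            w = [wi - alpha * delta * xi for wi, xi in zip(w, x)]
            b -= alpha * delta
        history.append(total_error)
    return w, b, history


# Toy data (an assumption for illustration): the logical OR of two inputs.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
T = [0.0, 1.0, 1.0, 1.0]
weights, bias, errors = train_neuron(X, T)
```

The error history falls as training proceeds, which is the behavior the delta rule is meant to produce; a deep network applies the same update to every layer through backpropagation.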
6. Sparse Autoencoder
The SAE neural network is an unsupervised learning algorithm
that applies backpropagation with the target values set
equal to the inputs, i.e., $y_i = x_i$. The SAE
attempts to learn a function $h_{W,b}(x) \approx x$, i.e., an
approximation function such that the output $\hat{x}$ is
similar to the input $x$. The network must reconstruct the
input data. By placing constraints on the network, i.e.,
limiting the number of hidden units and adding sparsity, an
interesting structure of the data is discovered. The network is
forced to learn a compressed representation of the input,
because it must reconstruct the input given only the vector of
hidden unit activations. Generally, sigmoid is the activation
function of the autoencoder, which is designed to obtain a
better representation of the input data:
$h(X, W, b) = \sigma(WX + b)$. A sparse penalty term is added
to the sparse autoencoder cost function to limit the average
activation value of the hidden-layer neurons. Normally, a
neuron is active when its output value is 1 and inactive when
its output value is 0. The purpose of enforcing sparsity is to
limit undesired activation. $a_j(x)$ is set as the $j$th
activation value. In the process of feature learning,
the activation value of the hidden-layer neurons is usually
expressed as $a = \sigma(WX + b)$, where $W$ is the weight
matrix and $b$ is the bias. The mean activation value of
the $j$th neuron in the hidden layer is defined as:
$$\rho_j = \frac{1}{m} \sum_{i=1}^{m} [a_j(x_i)] \tag{5}$$
where $m$ is the number of input samples $x_i$.
Figure 1: Proposed System model. (Diagram labels: Smart Grid, Historic Data, ESAE Feature Extractor with Layer 1, Layer 2 and Output Layer, MIMO Forecaster ESAENARX, Load Forecast, Price Forecast.)
The mean activation of the hidden layer is kept at a low
value: the desired average activation is defined by the sparsity
parameter $\rho$, and the penalty term is used to prevent
$\rho_j$ from deviating from $\rho$. The Kullback-Leibler (KL)
divergence [37] is used in this study to enforce this constraint.
The mathematical expression of KL divergence is as follows:
$$KL(\rho \| \rho_j) = \rho \ln \frac{\rho}{\rho_j} + (1 - \rho) \ln \frac{1 - \rho}{1 - \rho_j} \tag{6}$$
When $\rho_j$ does not deviate from parameter $\rho$, the KL
divergence value is 0; otherwise, the KL divergence value
gradually increases with the deviation. The cost function of
the neural network is set as $C(W, b)$. Then, the cost function
with the sparse penalty term added is:
$$C_{Sparse} = C(W, b) + \beta \sum_{j=1}^{S_2} KL(\rho \| \rho_j) \tag{7}$$
Where, $S_2$ is the number of neurons in the hidden layer and
$\beta$ is the weight of the sparse penalty term. The essence of
training a neural network is to find the appropriate weight
and threshold parameters $(W, b)$. After the sparse penalty
term is defined, the sparse expression can be obtained by
minimizing the sparse cost function.
An SAE can be transformed into a Sparse Denoising Autoencoder
(SDA). The data is corrupted in a stochastic manner by
introducing some noise into it. The network then attempts to
reconstruct the original data from the corrupted data.
An SAE is capable of discovering the correlation among the
features. A refined and highly relevant feature representation
is achieved using SAE.
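The sparsity penalty described in this section can be sketched in a few lines of pure Python. The function names are hypothetical. The default values `rho=0.05` and `beta=4.0` are an assumed reading of the "sparsity proportion" and "sparsity regularization" settings reported for ESAE later in the paper; the mapping of those settings to $\rho$ and $\beta$ is an assumption, not something the text states.

```python
import math


def mean_activation(activations):
    """Average activation of each hidden neuron over all samples.

    `activations[i][j]` holds a_j(x_i), the activation of hidden
    neuron j for input sample i.
    """
    m = len(activations)
    n_hidden = len(activations[0])
    return [sum(row[j] for row in activations) / m for j in range(n_hidden)]


def kl_divergence(rho, rho_j):
    """KL divergence between the target sparsity rho and the mean activation rho_j."""
    return (rho * math.log(rho / rho_j)
            + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - rho_j)))


def sparse_cost(base_cost, activations, rho=0.05, beta=4.0):
    """Base cost C(W, b) plus beta times the KL penalty summed over hidden neurons."""
    penalty = sum(kl_divergence(rho, r) for r in mean_activation(activations))
    return base_cost + beta * penalty


# Two samples, two hidden neurons: the first neuron matches the target
# sparsity and adds no penalty, the second is too active and is penalized.
example = [[0.05, 0.5], [0.05, 0.5]]
cost = sparse_cost(1.0, example)
```

The activations must lie strictly between 0 and 1 (as sigmoid outputs do), otherwise the logarithms are undefined.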
7. Efficient SAE (ESAE)
The Efficient SAE (ESAE) is proposed to create a better
representation of electricity data, that is useful for an accu-
rate forecast of price and load. In this section, the proposed
feature extractor Efficient SAE is discussed in detail.
8. Pre-training of ESAE
To initialize the weights and biases, an unsupervised
pre-training is applied, where the input of a hidden layer is
the output of its previous layer. In the pre-training step, the
initial biases and weights of the autoencoder are learned.
In the proposed method, the input data $X_t$ is corrupted by
introducing white noise [38]. The white noise is added to
a randomly selected 30% of the data points. A random process
$y(t)$ is known as white noise when its power spectral density
$S_y(f)$ is constant at all frequencies $f$:
$$S_y(f) = \frac{N_0}{2} \tag{8}$$
White noise describes random disturbances with small
correlation periods. The generalized correlation function of
white noise is defined by:
$$B(t) = \delta(t)\sigma^2 \tag{9}$$
Where, $\delta(t)$ is the delta function and $\sigma$ is a positive constant.
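The corruption step can be sketched as follows. This is a minimal illustration that assumes zero-mean Gaussian white noise; the function name `corrupt_with_white_noise`, the standard deviation `sigma` and the random seed are hypothetical choices, and only the 30% fraction comes from the text.

```python
import random


def corrupt_with_white_noise(series, fraction=0.3, sigma=0.1, seed=0):
    """Add zero-mean Gaussian noise to a randomly selected subset of points.

    Returns the corrupted copy of the series and the sorted indices
    of the points that were changed.
    """
    rng = random.Random(seed)
    n_corrupt = round(fraction * len(series))
    indices = sorted(rng.sample(range(len(series)), n_corrupt))
    corrupted = list(series)          # leave the input untouched
    for i in indices:
        corrupted[i] += rng.gauss(0.0, sigma)
    return corrupted, indices


# Illustrative input: 100 normalized readings (made-up values).
clean = [i / 100.0 for i in range(100)]
noisy, changed = corrupt_with_white_noise(clean)
```

The autoencoder is then trained to reconstruct `clean` from `noisy`, which is what makes the pre-training a denoising task.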
9. Fine-tuning of ESAE
The pre-training step is followed by the fine-tuning step.
In fine-tuning, wavelet denoising is proposed as the
encoding transfer function of the first hidden layer of ESAE.
The activation function of the second layer is sigmoid.
Wavelet denoising has two steps: (i) wavelet packet
decomposition and (ii) the reconstruction denoising operation.
Firstly, the input time series is decomposed into different
frequency bands by passing it through high-pass and low-pass
filters. Then the frequency band of the noise is set to zero.
The signal is then reconstructed using the wavelet
reconstruction function, which is the inverse of the wavelet
decomposition function [39].
Figure 2: Step by step flow of proposed model ESAENARX. (Stage 1, feature extraction: normalization of data, pre-training by corrupting input features with white noise and encoding with SAE, then fine-tuning with efficient SAE. Stage 2, prediction: forecasting from the extracted features, de-normalization, price and load.)
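Wavelet decomposition operation can be expressed by:
$$c_{j,k} = \sum_{n} c_{j-1,n} \, h_{n-2k}$$
$$d_{j,k} = \sum_{n} d_{j-1,n} \, g_{n-2k}, \quad k = (1, 2, \dots, N - 1)$$
Where, $c_{j,k}$ is the scale coefficient, $d_{j,k}$ is the wavelet
coefficient, $h$ and $g$ are the quadrature mirror filter banks,
$j$ is the level of decomposition and $N$ is the number of
sampling points. The wavelet reconstruction function, which is
the inverse of wavelet decomposition, is expressed as:
$$c_{j-1,n} = \sum_{k} c_{j,k} \, h_{k-2n} + \sum_{k} d_{j,k} \, g_{k-2n} \tag{10}$$
The denoising operation is shown by the equation below.
$$\hat{\omega}_{j,k} = \begin{cases} \operatorname{sign}(\omega_{j,k}) \, (|\omega_{j,k}| - \lambda), & |\omega_{j,k}| \geq \lambda, \\ 0, & |\omega_{j,k}| < \lambda. \end{cases}$$
Where, $\hat{\omega}_{j,k}$ is the denoised signal, $\omega_{j,k}$ is the
wavelet transformed signal and $\lambda$ is the threshold.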
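The decompose, threshold and reconstruct sequence can be sketched with a one-level Haar transform in pure Python. This is a simplification under stated assumptions: the paper uses wavelet packet decomposition and does not name the wavelet, so the Haar filters, the single decomposition level, the soft-thresholding rule (one common reading of the denoising operation above) and all function names are illustrative choices.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_decompose(signal):
    """One level of Haar decomposition into scale and wavelet coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / SQRT2 for a, b in pairs]   # low-pass filter h
    detail = [(a - b) / SQRT2 for a, b in pairs]   # high-pass filter g
    return approx, detail


def haar_reconstruct(approx, detail):
    """Inverse of haar_decompose: rebuild the signal from both bands."""
    signal = []
    for c, d in zip(approx, detail):
        signal.append((c + d) / SQRT2)
        signal.append((c - d) / SQRT2)
    return signal


def soft_threshold(coeffs, lam):
    """Shrink coefficients by lam; those smaller than lam in magnitude become 0."""
    return [math.copysign(abs(w) - lam, w) if abs(w) >= lam else 0.0
            for w in coeffs]


def denoise(signal, lam):
    """Decompose, threshold the wavelet coefficients, then reconstruct."""
    approx, detail = haar_decompose(signal)
    return haar_reconstruct(approx, soft_threshold(detail, lam))


# Small fluctuations inside each pair are removed, the level shift is kept.
smoothed = denoise([4.0, 4.2, 8.0, 7.9], lam=0.5)
```

With a threshold of 0 the signal is reconstructed unchanged, which is a quick check that reconstruction is the inverse of decomposition. For the example above, each pair is replaced by approximately its mean (about 4.1 and 7.95).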
In the ESAE feature extractor, the numbers of units in hidden
layers one and two are 400 and 300, respectively. The
coefficient that controls the layer 2 weight regularization is
set to 0.001. Sparsity regularization is 4 and sparsity
proportion is 0.05. The maximum number of epochs is 100. The
algorithm for learning the weights is scaled conjugate
gradient descent.
10. Non-linear Autoregressive Network with
Exogenous Variables
NARX is an autoregressive RNN. Its feedback connections
enclose several hidden layers of the network, but not
the input layer. NARX has a memory that is utilized for
creating a nonlinear mapping between inputs and outputs.
The network learns from the recurrence on the past values
of the time series and the past predicted values of the network
[40]. For predicting a value $y(t)$, the inputs of the NARX are
$y(t-1), y(t-2), \dots, y(t-d)$. NARX can be explained by
the following equation:
$$\hat{y}(t+1) = f(y(t), y(t-1), \dots, y(t-d), x(t+1), x(t), \dots, x(t-d)) + \varepsilon(t)$$
Where $\hat{y}(t+1)$ is the network's output at time $t$ (the
prediction for $t+1$), $f(\cdot)$ is the nonlinear mapping
function, $y(t), y(t-1), \dots, y(t-d)$ are the past observed
values, $x(t+1), x(t), \dots, x(t-d)$ are the network's inputs,
$d$ is the number of delays and $\varepsilon(t)$ is the error
term. In the proposed NARX, for simultaneous forecasting of
price and load, the number of delays is 2. The network has
10 hidden layers. The training function is Levenberg-Marquardt.
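To make the role of the delays concrete, the sketch below builds the lagged input vector of the NARX equation for every time step of a pair of series. It only prepares training samples and does not implement the network itself. The function name `build_narx_samples` and the toy numbers are hypothetical, and the example uses one scalar output series, whereas the proposed MIMO model forecasts price and load together.

```python
def build_narx_samples(y, x, d=2):
    """Build (regressor, target) pairs that follow the NARX equation.

    For each t, the regressor is
    [y(t), ..., y(t-d), x(t+1), x(t), ..., x(t-d)]
    and the target is y(t+1).
    """
    if len(y) != len(x):
        raise ValueError("y and x must have the same length")
    samples = []
    for t in range(d, len(y) - 1):
        past_outputs = [y[t - k] for k in range(d + 1)]
        exogenous = [x[t + 1 - k] for k in range(d + 2)]
        samples.append((past_outputs + exogenous, y[t + 1]))
    return samples


# Toy series (made-up values): y is the forecast target, x the exogenous input.
load = [10, 11, 12, 13, 14]
temperature = [1, 2, 3, 4, 5]
pairs = build_narx_samples(load, temperature, d=2)
```

With `d=2`, each regressor has 7 entries (3 past outputs and 4 exogenous values), and the first usable time step is `t = 2`, because earlier steps lack the required history.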
11. Long Short-term Memory
LSTM is a well-known sub-category of RNN. It is
widely used for modeling sequential data. Unlike
feed-forward ANNs, an LSTM uses its internal state to process
sequences of inputs and to remember longer dependencies in
the data. This structure allows it to learn the dynamic
temporal behavior of a time sequence, and LSTM is used to solve
many time sequence problems. An LSTM contains three gates: input
gate, forget gate and output gate. It has a memory cell that
keeps relevant information of the data as a memory. The
purpose of the forget gate is to flush out irrelevant data. LSTM
can be explained by the following equations:
Suppose an input time series $x = x_1, x_2, \ldots, x_n$. The LSTM models the input time series using recurrence (as shown in equation 12):
$h_t = f(x_t, h_{t-1})$ (12)
where $h_t$ is the hidden state at time $t$, $x_t$ is the input at time $t$ and $h_{t-1}$ is the previous hidden state, i.e., at time $t-1$. The
Sana et al.: Preprint submitted to Elsevier Page 6 of 16
Big Data Predictive Analytics of Electricity Load and Price
recurrence function $f()$ contains gated operations, as shown in equations 13 to 17:
$i_t = \sigma(w_i[x_t, h_{t-1}] + b_i)$ (13)
$f_t = \sigma(w_f[x_t, h_{t-1}] + b_f)$ (14)
$o_t = \sigma(w_o[x_t, h_{t-1}] + b_o)$ (15)
$\tilde{C}_t = \tanh(w_C[x_t, h_{t-1}] + b_C)$ (16)
$C_t = i_t \tilde{C}_t + f_t C_{t-1}$ (17)
where $i_t$, $f_t$ and $o_t$ are the input, forget and output gates, respectively; $w_i$, $w_f$ and $w_o$ are their respective weights; and $b_i$, $b_f$ and $b_o$ are their respective biases. $C_t$ is the current state of the memory cell and $\tilde{C}_t$ is the new value candidate for the memory cell. The sigmoid function $\sigma()$ maps the gates' values into the range of 0 to 1. The gates' decisions depend on the current input $x_t$ and the previous output $h_{t-1}$. An input signal is blocked if the gate's value is 0. The forget gate decides the amount of the previous state $h_{t-1}$ to be passed. The input gate defines the amount of new input to be added or updated to the previous cell state. Based on the cell state, the output gate determines which information is output. In this manner, the short and long-term sequence related information is learned in the LSTM.
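The gate equations can be traced step by step in a few lines of NumPy. The sketch below is a minimal illustration rather than the implementation used in this work: the parameter dictionary, the layer sizes and the random inputs are hypothetical, the cell update uses the standard form $C_t = i_t \tilde{C}_t + f_t C_{t-1}$, and the output rule $h_t = o_t \tanh(C_t)$ is the standard LSTM choice, which is assumed here because the text does not state it.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step following equations 13 to 17."""
    z = np.concatenate([x_t, h_prev])                       # [x_t, h_{t-1}]
    i_t = sigmoid(params["w_i"] @ z + params["b_i"])        # input gate (13)
    f_t = sigmoid(params["w_f"] @ z + params["b_f"])        # forget gate (14)
    o_t = sigmoid(params["w_o"] @ z + params["b_o"])        # output gate (15)
    c_tilde = np.tanh(params["w_c"] @ z + params["b_c"])    # candidate (16)
    c_t = i_t * c_tilde + f_t * c_prev                      # cell state (17)
    h_t = o_t * np.tanh(c_t)   # assumption: standard output rule, not given in the text
    return h_t, c_t, (i_t, f_t, o_t)

rng = np.random.default_rng(0)
n_in, n_hidden = 5, 4          # hypothetical sizes
params = {}
for name in ("i", "f", "o", "c"):
    params["w_" + name] = rng.normal(scale=0.1, size=(n_hidden, n_in + n_hidden))
    params["b_" + name] = np.zeros(n_hidden)

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(3, n_in)):   # three time steps of toy input
    h, c, gates = lstm_step(x_t, h, c, params)
```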
LSTM is superior to a conventional ANN because it can handle the problem of vanishing or exploding gradients. This problem arises while updating the weights. The weights are updated by the delta rule, in which the gradient of the error is taken with respect to the weights (as shown in equation 3). If the gradient becomes too small, the change in the updated weights is also small, resulting in no improvement in learning. If the gradient becomes too big, the updated weights change too much, resulting in non-convergence and instability of the network. LSTM overcomes this problem by using the memory cell $C_t$, which is able to preserve the state over a long period of time. The amount of information to be retained or discarded is controlled by changing the values of the forget gate, $f_t$, and the input gate, $i_t$. The dependency on individual inputs is also controlled. This increased regulation helps in overcoming the vanishing and exploding gradient problems.
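A scalar toy case makes the effect concrete. The sketch below is a deliberately simplified illustration (a linear recurrence $h_t = w\,h_{t-1}$, not the networks used in this work): backpropagating through $T$ steps multiplies the gradient by $w$ each time, so it shrinks or grows geometrically unless the factor stays close to 1, which is roughly what a memory cell with a forget gate near 1 provides.

```python
def gradient_through_time(w, steps):
    """Gradient d h_T / d h_0 for the scalar recurrence h_t = w * h_{t-1}."""
    grad = 1.0
    for _ in range(steps):
        grad *= w          # one multiplication per time step
    return grad

vanishing = gradient_through_time(0.5, 50)   # shrinks towards zero
exploding = gradient_through_time(1.5, 50)   # grows very large
preserved = gradient_through_time(1.0, 50)   # stays at 1
```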
12. ESAENARX Forecast Model
Deep learning is well known for its high-precision feature extraction. A sparse autoencoder deep neural network with dropout is proposed to extract useful features. This deep neural network can significantly reduce the adverse effect of overfitting, making the learned features more conducive to identification and forecasting. NARX is proposed for load and price forecasting.
A Multi Input Multi Output (MIMO) forecast model is proposed to predict the price and load simultaneously. Features are extracted using ESAE. Then the NARX network is trained for simultaneous forecasting of price and load. The system model is shown in Figure 1. The input features are: hour, temperature forecast, wind speed forecast, lagged load and lagged price. There are two targets, electricity load and price. The prediction process has the following five steps:
1. Inputs and targets are normalized using min-max normalization. Suppose an input vector $X = x_1, x_2, x_3, \ldots, x_n$, where $n$ is the number of instances in the vector. The min-max normalized value is obtained by:
$X_{nor} = \frac{x_i - X_{min}}{X_{max} - X_{min}}$ (19)
where $i = 1, 2, \ldots, n$.
2. The normalized inputs are fed to train the ESAE feature extractor. After the ESAE is trained, the input features are encoded using this trained ESAE. The output of ESAE is the encoded features.
3. The encoded features are given as input to train the NARX network. 80% of the data is used for training, 15% for validation and 5% for testing.
4. The price and load are predicted for 168 hours, i.e., one week.
5. The predicted values of load and price are de-normalized to obtain actual values. The NARX accurately predicts the price and load simultaneously.
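The normalization, data division and de-normalization steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the sample load values are hypothetical, and the division into contiguous blocks is an assumption, since the text gives only the 80/15/5 proportions and not how the samples are assigned.

```python
import numpy as np

def min_max_normalize(values):
    """Scale values into [0, 1]; also return the min and max for later inversion."""
    values = np.asarray(values, dtype=float)
    v_min, v_max = values.min(), values.max()
    return (values - v_min) / (v_max - v_min), v_min, v_max

def min_max_denormalize(normalized, v_min, v_max):
    """Invert min-max scaling to recover values in the original units."""
    return normalized * (v_max - v_min) + v_min

def split_indices(n, train=0.80, val=0.15):
    """Contiguous 80/15/5 split (the contiguous layout is an assumption)."""
    n_train = int(round(n * train))
    n_val = int(round(n * val))
    return slice(0, n_train), slice(n_train, n_train + n_val), slice(n_train + n_val, n)

load = np.array([12000.0, 15000.0, 18000.0, 13500.0])   # hypothetical MW values
load_nor, load_min, load_max = min_max_normalize(load)
restored = min_max_denormalize(load_nor, load_min, load_max)
train_idx, val_idx, test_idx = split_indices(1000)
```

The minimum and maximum are returned together with the scaled values because the same two numbers are needed later to map forecasts back to MW or $/MWh.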
The ESAE feature extractor uses wavelet packet denoising as a decoder function, which denoises the input features along with extracting them. A refined and rich representation of features is extracted by ESAE. Generally, an SAE has sigmoid decoder functions. The use of wavelet packet denoising enhances the extracted features and, consequently, the forecasting accuracy improves significantly. ESAENARX thus achieves good forecasting accuracy with the help of efficient feature extraction.
13. DE-RELM Forecast Model
The second proposed model is also a MIMO model, like ESAENARX. DE-RELM is an efficient method for electricity load and price forecasting. It has three stages. In the first stage, the parameters of the ELM are optimized by applying the DE algorithm. In the second stage, the ELM is trained. Both the inputs and the outputs of the ELM are the input features of load and price; with identical inputs and outputs, the ELM acts like an encoder. Once the optimized ELM is trained, the learned weights are set as the initial weights of the RNN that is used for forecasting. The learned weights of the ELM are the best representation of the input data. Setting these initial weights helps the RNN converge faster and
Figure 3: Flowchart of DE-RELM. (Flowchart stages: normalization of price and load data; Stage 1: ELM optimization, selecting weights and biases with DE and calculating the objective; Stage 2: training the ELM with the same inputs and outputs; Stage 3: prediction with DE-RELM, initialized with the learned weights.)
forecast accurately. This is the third and final stage of DE-RELM. The number of neurons in the hidden layer of the ELM and the RNN is kept the same: in order to use the learned weights of the ELM in the RNN, the dimensions of the weight vectors have to match. For the prediction of load and price, DE-RELM follows the steps shown in the flowchart, Figure 3.
1. The inputs and targets are normalized using min-max
normalization (as shown in equation 19).
2. The normalized inputs are given to the ELM network as both inputs and outputs. The network is trained.
3. The forecasting error is calculated by equation 22.
4. The DE algorithm is used to optimize the weights and
biases of ELM. The objective function of DE is the
minimization of the prediction error.
$Obj = \text{minimize}\ \frac{1}{n}\sum_{i=1}^{n}\left|X_i^{act} - y_i^{for}\right|$ (20)
where $X_i^{act}$ is the observed value, $y_i^{for}$ is the forecasted value and $n$ is the number of values.
5. When the forecasting error is reduced to the desired value, the optimized ELM network is trained.
6. The weights of the ELM are set as the initial weights of the RNN.
7. The RNN predicts the price and load simultaneously.
8. The predicted values are de-normalized by the inverse min-max function:
$X = [x_{for} \times (X_{max} - X_{min})] + X_{min}$ (21)
where $x_{for}$ is the forecasted value, $X_{max}$ is the maximum value of the actual target and $X_{min}$ is the minimum value of the actual target.
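The steps above rely on an ELM that is trained with identical inputs and outputs. A minimal sketch of such an ELM encoder is given below, assuming the usual ELM recipe of random, fixed hidden weights and output weights obtained with a pseudo-inverse. The function names, the uniform initialization range and the random feature matrix are assumptions for illustration only, and the transfer of the learned weights into the RNN is not shown.

```python
import numpy as np

def hidden_layer(X, W, b):
    """Sigmoid hidden layer of the ELM."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def train_elm_autoencoder(X, n_hidden, rng):
    """Fit an ELM whose targets equal its inputs (encoder-style setup)."""
    n_features = X.shape[1]
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))  # random, not trained
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = hidden_layer(X, W, b)
    beta = np.linalg.pinv(H) @ X          # least-squares output weights
    return W, b, beta

rng = np.random.default_rng(42)
X = rng.random((200, 5))                  # hypothetical normalized input features
W, b, beta = train_elm_autoencoder(X, n_hidden=100, rng=rng)
reconstruction_error = np.mean(np.abs(hidden_layer(X, W, b) @ beta - X))
```

In this sketch only `beta` is learned; `W` and `b` are the quantities that an optimizer such as DE would search over.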
In DE-RELM, the number of neurons in the hidden layer of
ELM and RNN is 100. ELM has 1 hidden layer. The acti-
vation function of ELM is sigmoid. DE has 100 iterations,
population size is 50, mutation factor is 0.5 and the crossover
rate is 1. The RNN network has 1 hidden layer. The transfer
function is logistic sigmoid.
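The DE settings listed above (100 iterations, population size 50, mutation factor 0.5, crossover rate 1) can be tried on a toy problem with the sketch below. The DE/rand/1/bin variant and the sphere objective are assumptions: the sphere function is only a stand-in for the forecast error, and in DE-RELM each candidate vector would instead hold ELM weights and biases.

```python
import numpy as np

def differential_evolution(objective, dim, rng, pop_size=50, iterations=100,
                           mutation=0.5, crossover=1.0):
    """Minimize `objective` with DE/rand/1/bin (the variant is an assumption)."""
    pop = rng.normal(size=(pop_size, dim))            # normally distributed population
    fitness = np.array([objective(p) for p in pop])
    for _ in range(iterations):
        for i in range(pop_size):
            candidates = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(candidates, size=3, replace=False)]
            mutant = a + mutation * (b - c)           # mutation
            mask = rng.random(dim) < crossover        # binomial crossover
            mask[rng.integers(dim)] = True            # keep at least one mutant gene
            trial = np.where(mask, mutant, pop[i])
            trial_fitness = objective(trial)
            if trial_fitness <= fitness[i]:           # greedy selection
                pop[i], fitness[i] = trial, trial_fitness
    best = int(np.argmin(fitness))
    return pop[best], fitness[best]

def sphere(w):
    """Toy objective standing in for the forecast error."""
    return float(np.sum(w ** 2))

rng = np.random.default_rng(1)
best_w, best_err = differential_evolution(sphere, dim=5, rng=rng)
```

Because selection is greedy, the best error in the population never increases from one iteration to the next.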
The proposed models have multiple inputs and outputs. In-
puts are: hour, temperature, wind speed, lagged price and
lagged load and outputs are: price and load. The forecast en-
gines create a mapping between inputs and targets. Hence,
a mapping of input hour, temperature, price and load is cre-
ated with target price and target load. The relation between
price and load is captured while creating this mapping. The
price and load are affected by past price and load, therefore,
lagged values are good features for prediction. The load is
affected by temperature. The temperature and lagged values
are the most relevant inputs for price and load prediction.
Moreover, the input is extracted with ESAE feature extractor
which further enhances the input to NARX forecaster. The
best mapping of relevant and informative inputs with targets,
results in improved forecast accuracy.
Both proposed models comprise neural network based encoders (ESAE and the ELM encoder) and deep RNN forecasters (NARX and LSTM). In the first model, ESAENARX, features are extracted by an efficient sparse autoencoder. In the second model, DE-RELM, an extreme learning machine is used as an encoder to learn the initial weights for the forecast engine.
This study is aimed at helping electricity market experts and
traders. Several market operations benefit from the load
forecasting, such as: formulation of demand-response pro-
grams, generation scheduling and planning new generation
sources. On the other hand, the traders take advantage of
price forecasting for making bidding strategies and market
experts make modified pricing schemes to control consump-
tion behaviors. No specific sector (i.e., residential, indus-
trial, commercial, etc.) is targeted in this study, instead ag-
gregated load and average regulation price of two power util-
ities are studied.
14. Applications of Proposed Models
The proposed models forecast electricity load and price.
Both price and load forecasts are useful in the case of smart
grids and micro grids. They help utility experts in understanding load and price correlation and dynamics. They have the following applications:
1. Minimize the risk of demand and supply imbalance. If
the generation of electricity is less than the demand,
the grids will not be able to fulfill the demands of con-
sumers. If generation is more than demand, the energy
will be wasted.
2. Enable the power utility companies to plan better since
they understand the future load demand.
3. Help to determine the required resources; such as, fu-
els required to operate the generating plants.
4. Maximize utilization of power generating plants. Load forecasting prevents under generation and over generation.
Several Independent System Operators (ISOs) take advantage of load forecasting. These ISOs publish day-ahead or month-ahead load forecasting data on their websites, e.g., NYISO [41], PJM [42], etc. The proposed forecasting models are applicable in the aforementioned real-world scenarios.
15. Simulations and Results
All the simulations are performed using MATLAB R2018a on a computer with a Core i3 processor and 8 GB RAM. In this section, the description of the datasets, the big data analysis and a discussion of the results are presented.
16. Data Description
The data used for forecasting is taken from the well-
known electricity utilities: ISO NE (Independent System
Operator New England) [43] and PJM [44], USA. Both
datasets are publicly available.
17. ISO NE Electricity Market
ISO NE is an independent system operator that provides
power to the six states of the USA, known as New England.
It serves Maine, Connecticut, Massachusetts, Rhode Island,
Vermont and New Hampshire. Approximately, every year
the transaction of $10 million is made by 400 electricity
market participants in ISO NE. It has almost 7 million con-
sumers: business and household. Hourly electricity market
data of almost 8 years is used for prediction purpose. Dura-
tion of data used in simulations is from January 2011 to June
2018. Total measurements are 65,616. The data utilized in
this paper is aggregated load and regulation capacity clear-
ing price of the ISO NE control area.
18. PJM Electricity Market
PJM Interconnection is a Regional Transmission Organi-
zation (RTO) in the USA. It is an electric transmission sys-
tem that is part of the Eastern Interconnection grid. It sup-
plies power to 14 regions, i.e., Illinois, Delaware, Kentucky,
Indiana, Maryland, New Jersey, Michigan, Ohio, North Car-
olina, Pennsylvania, Virginia, West Virginia, District of
Columbia and Tennessee. The data taken from PJM is hourly
consumption and price of thirteen years, i.e., January 2006
to October 2018. The data comprise 112,300 measurements of load and price each.
19. Performance Evaluation
To evaluate the performance of ESAENARX, three performance measures are used, i.e., MAPE, Root Mean Square Error (RMSE) and Normalized RMSE (NRMSE). A lower error value indicates better forecasting accuracy. MAPE is the average absolute percentage error between the forecasted and observed values, defined by the following equation:
$MAPE = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{X_i^{act} - y_i^{for}}{X_i^{act}}\right| \times 100$ (22)
NRMSE is the normalized root mean square error of the forecasted and observed values, defined by:
$NRMSE = \frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(X_i^{act} - y_i^{for}\right)^2}}{\max(X_i^{act}) - \min(X_i^{act})}$ (24)
where $X_i^{act}$ is the observed value, $y_i^{for}$ is the forecasted value and $n$ is the number of values.
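The three error measures can be computed with a few NumPy helpers. The sketch below uses the standard textbook definitions, which are assumed to match the ones used here (in particular, NRMSE is taken as RMSE divided by the range of the observed values); the function names and sample numbers are hypothetical.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100.0)

def rmse(actual, forecast):
    """Root mean square error, in the units of the data."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

def nrmse(actual, forecast):
    """RMSE normalized by the range of the observed values."""
    actual = np.asarray(actual, dtype=float)
    return rmse(actual, forecast) / float(actual.max() - actual.min())

actual = np.array([100.0, 200.0, 400.0])     # hypothetical observations
forecast = np.array([110.0, 180.0, 400.0])   # hypothetical forecasts
errors = (mape(actual, forecast), rmse(actual, forecast), nrmse(actual, forecast))
```

Note that MAPE is undefined when an observed value is zero, which matters more for price series than for aggregated load.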
20. Big Data Analytics of Electricity Price and Load
In this research study, the big data of load and price are deeply analyzed. Both visual and statistical analyses are performed. The visual analyses are presented in graphs. The visual analyses of ISO NE load are shown in Figures 4, 8, 11,
and 13, and those of PJM load are illustrated in Figures 14, 18, 21, and 23. The price analyses of ISO NE are presented in Figures 5, 9, 10, and 12, and PJM price is demonstrated by Figures 15, 19, 20, and 22. The price-demand relation of ISO NE is shown in Figures 6 and 7, and that of PJM is shown in Figures 16 and 17. The statistical analysis of the forecast error is shown in Table 2.
ISO NE price and load have daily and weekly seasonality. Price and load have a strong relation in the ISO NE market. The load of 8 years is shown in Figure 4 and the price is shown in Figure 5. The scatter plot in Figure 6 shows the directly proportional relation of price and demand; the correlation coefficient is also shown in the figure. The normalized load and price of one week are shown in Figure 7 for better visualization of their bidirectional relation. The price elasticity of demand is a factor that describes changes in demand with respect to changes in the price. Usually, the demand decreases if the price increases; however, the price elasticity of power demand is low. According to the analysis presented in [45], the price elasticity of demand is −0.1 or lesser within a year in the USA. The season affects the energy consumption and price. In the USA there are four seasons in a year: spring is from March to May, summer is from June to August, autumn (fall) is from September to November and winter is from December to February. The summer season has the highest electricity consumption of the year, as shown in Figure 8. The peak consumption hours of summer are from 1:00 pm to 5:00 pm on weekdays. In winter (December to January), the peak consumption hours are from 5:00 pm to 7:00 pm on weekdays. In ISO NE there are two peak load points in a day. The 1st peak point is around 11:00 am and the 2nd peak point is between 4:00 pm and 5:00 pm (as shown in Figure 8). The consumption of 1st January, 1st April, 1st July and 1st October is shown in Figures 8 and 18. These four days are from the four different seasons of a year.
Prices of the same four days are shown in Figures 9 and 19. Both consumption and price are higher in the summer season than in the rest of the year. Building cooling is required in the hot weather of summer, and air conditioners consume a lot of power; this is the major reason for the increase in energy consumption. Electricity prices are relatively higher in the winter too. The electricity price and load are lower in the spring season than in the rest of the year, because building heating or cooling is not required in moderate weather, which reduces consumption and ultimately price too. The electricity consumption pattern is fixed with the seasons and time of use. The electricity consumption is higher in working hours and lower in non-working hours. The load pattern has fewer variations than the price trend. Mostly, price and load increase and decrease at the same time. However, there are a few points in time where the energy price increases sharply in an unexpected manner, even if the load does not increase accordingly (as shown in Figure 7, between hours 75 and 82, and Figure 17, between hours 30 and 35). The unexpected change in the price is due to external influential factors other than consumption. The factors that influence energy price are: the available Renewable Energy Resources (RES), fuel prices, economic conditions, excessive use penalty and transmission contingency. The load is not much affected by most of these factors; it shows little or no variation in response to them. The energy consumption is majorly affected by weather conditions. The electricity consumption and price have continued to increase over the last 8 years, as is clear from Figures 4 and 5. The visual representation of past years' consumption enables utility experts to visualize the increasing demand, which helps in planning new generation plants to satisfy future power demand.
PJM load and price of 13 years (2006-2018) are shown in Figure 14 and Figure 15, respectively. The scatter plot in Figure 6 illustrates the relation of price and load in ISO NE, and Figure 16 shows the price-demand relation in the PJM electricity market. The direct proportionality of the load and price signals can be seen in these two figures. In Figures 7 and 17, the normalized load and price of the 1st week of January 2018 are plotted. The correlation of the price and load signals is demonstrated in these two figures.
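The correlation coefficient reported in the scatter plots can be computed as sketched below. This is an illustration only: the one-week load and price series are synthetic (hypothetical sinusoids, not ISO NE or PJM data), and the Pearson coefficient is assumed to be the measure used in the figures.

```python
import numpy as np

def pearson_correlation(a, b):
    """Pearson correlation coefficient of two equally long series."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_c, b_c = a - a.mean(), b - b.mean()
    return float(np.sum(a_c * b_c) / np.sqrt(np.sum(a_c ** 2) * np.sum(b_c ** 2)))

hours = np.arange(168)                                    # one week, hourly
load = 15000 + 3000 * np.sin(2 * np.pi * hours / 24)      # synthetic daily cycle (MW)
# Synthetic price: follows the load plus a slower weekly component.
price = 30 + 0.002 * (load - 15000) + 3 * np.sin(2 * np.pi * hours / 168)
r = pearson_correlation(load, price)
```

A value of `r` close to 1 indicates that the two signals rise and fall together; the weekly component added to the synthetic price is what keeps `r` below 1 here.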
The proposed models ESAENARX and DE-RELM are used for short-term load and price forecasting. The forecast period is one week, i.e., 168 hours. The results of the ISO NE price and load forecast for the 1st week of June 2018 are shown in Figure 10 and Figure 11. The PJM price and load forecasts for the 1st week of September 2018 are shown in Figures 20 and 21, respectively. The actual and forecasted values are plotted, and the forecasted values follow the trend of the actual values. The forecasted load trend is closer to the actual load trend than the forecasted price trend is to the actual price trend, i.e., the price forecast is slightly less accurate than the load forecast. This is because the load has a similar repetitive pattern, whereas the price pattern has a volatile nature.
Price data exhibit certain characteristics: volatility and sudden, sharp spikes and changes. The nature of price makes its forecasting difficult. Learning the pattern of price requires great effort. Only refined features learned with a good prediction method can produce an accurate price forecast. It is clear from the results of the experiments that ESAENARX forecasts price and load very well.
21. Comparison and Discussion
The proposed methods are compared with four ANN forecasting methods: NARX, ELM, DE-ELM and RELM. These methods are widely used in electricity load and price forecasting. The ESAENARX, ELM, enhanced ELM, NARX and RELM results for ISO NE price and load forecast are shown in Figure 12 and Figure 13, respectively. ESAENARX is able to follow the price and load trend better than the compared methods. The reason behind the better forecast accuracy is the representative features extracted by the proposed feature extractor, ESAE. The NARX forecaster is trained with the extracted features and performs very well. The proposed method takes advantage of the strengths of both SAE and NARX. The SAE is further made efficient for
Figure 4: Load of January 2011 to March 2018, ISO NE. (Axes: Hours ×10^4, Load (MW).)
Figure 5: Price of January 2011 to March 2018, ISO NE. (Axes: Hours ×10^4, Price ($/MWh).)
Figure 6: Price-demand signals relation of January 2018 to March 2018, ISO NE. (Axes: Load (MW) ×10^4, Price ($/MWh); correlation coefficient = 0.62.)
Figure 7: Normalized load and price of first week of June 2018, ISO NE. (Series: Load (MW), Price ($/MWh).)
better performance. A detailed comparison of all the compared methods is presented in this section. The results and reasoning are elaborated along with the comparative analysis. Moreover, the strengths and limitations of the compared methods are highlighted.
The effect of the proposed feature engineering is clear from the numerical results. The forecast accuracy of ESAENARX with extracted features is much better than that of simple NARX. The extracted features are informative; therefore, the forecaster is able to model the data in a better way and forecast with greater accuracy.
Figure 8: One day consumption of all four seasons, ISO NE. (Vertical axis: Load (MW).)
Figure 9: One day energy price of all four seasons, ISO NE. (Vertical axis: Price ($/MWh).)
Figure 10: Forecasted and observed price of first week of June 2018, ISO NE. (Vertical axis: Price ($/MWh).)
Figure 11: Forecasted and observed load of first week of June 2018, ISO NE. (Vertical axis: Load (MW).)
The proposed methods are compared with three types of
ELMs: ELM, DE-ELM and RELM. The comparative anal-
ysis of these methods is given below.
Figure 12: Comparison of ESAENARX and DE-RELM price prediction with NARX, ELM and DE-ELM, ISO NE. (Vertical axis: Price ($/MWh).)
Figure 13: Comparison of ESAENARX and DE-RELM load prediction with NARX, ELM and DE-ELM, ISO NE. (Vertical axis: Load (MW).)
Figure 14: Load of PJM from January 2010 to March 2018. (Axes: Hours ×10^4, Load (MW).)
Figure 15: Price of PJM from January 2010 to March 2018. (Axes: Hours ×10^4, Price ($/MWh).)
The ELM is optimized using a meta-heuristic optimization algorithm named differential evolution. The initial weights and biases of the ELM's hidden and output layers are optimized using DE. DE is an optimization method that iteratively improves the performance of an algorithm with respect to the optimization function. In the case of ELM, the performance is improved when the forecast accuracy improves. The objective function is to reduce the forecast error on validation data of electricity load and price. First of all, the population of weights and biases is generated. The population follows the normal distribution. For every selected weight combination, the NRMSE and MAE are calculated. The crossover and mutation operations are performed to generate new combinations of weights and biases. The optimized combination of weights and biases is achieved after multiple iterations of DE. The optimized weights and biases are used in the ELM for the price and load forecasting on test data. The DE-ELM has a lower error than the simple ELM. The accuracy of DE-ELM is improved due to the initial weights and biases being optimized according to the data. The accuracy of DE-ELM is better than ELM and slightly worse than RELM in load forecasting. However, for price forecasting, the performance of DE-ELM degrades. The price data has high nonlinearity and dependency on exogenous variables. Therefore, the relevant features of price are required to be extracted carefully. The proposed feature extractor ESAE is capable of extracting the fine details of relevant data. Therefore, the proposed method, ESAENARX, shows good accuracy for both price and load forecasting.
Figure 16: Price-demand signals relation of PJM from January 2018 to March 2018. (Axes: Load (MW) ×10^5, Price ($/MWh); correlation coefficient = 0.87.)
Figure 17: Normalized load and price of PJM first week of January 2018. (Series: Load (MW), Price ($/MWh).)
Figure 18: One day consumption of all four seasons, PJM. (Vertical axis: Load (MW).)
Figure 19: One day energy price of all four seasons, PJM. (Vertical axis: Price ($/MWh).)
Figure 20: Actual and predicted price of PJM. (Vertical axis: Price ($/MWh).)
Figure 21: Actual and predicted load of PJM. (Vertical axis: Load (MW).)
Figure 22: Comparison of ESAENARX and DE-RELM price prediction with NARX, ELM and DE-ELM, PJM. (Vertical axis: Price ($/MWh).)
Figure 23: Comparison of ESAENARX and DE-RELM load prediction with NARX, ELM and DE-ELM, PJM. (Vertical axis: Load (MW).)
RELM is a variant of the recurrent neural network. It is a combination of two methods, ELM and RNN. The ELM acts as an encoder, where the inputs and outputs of the network are the same, i.e., the input features. The learned weights of the ELM network are set as the initial weights of the RNN. By keeping the inputs and outputs of the ELM network identical, the learned weights are a good representation of the input features. The number of neurons in the hidden layer of the ELM and the RNN is kept the same. Two ELM encoders are trained, one for the hidden layer's weights of the RNN and a second for the output layer's weights of the RNN. The learned weights make the RNN converge faster and better. The results of RELM are slightly better than DE-ELM and comparable to NARX. Both RELM and NARX belong to the same category of neural network, known as a recurrent neural network.
The second proposed method, DE-RELM, performs reasonably well on load forecasting. Its load forecasting results are much better than those of the other techniques and comparable to ESAENARX. However, no significant improvement is seen in the price forecast, whereas ESAENARX performs equally well for both load and price. DE-RELM trains the forecaster on learned weights, and only a minor improvement is achieved, which is not comparable to ESAENARX. For the price forecast, only properly extracted features can improve accuracy. ESAE extracts the relevant and most informative features, which improves the forecast accuracy.
ELM has the worst forecast results among the six compared methods because it is a feed-forward network: its weights are learned once in a forward pass and never updated. Therefore, to achieve acceptable forecast results, the initial weights of the ELM have to be well optimized. NARX performs better than the ELM. However, its forecast results are not as accurate as those of the proposed methods ESAENARX and DE-RELM. The errors MAPE and
NRMSE are shown in Table 2. The forecast accuracy of all
six methods is in sequence: ESAENARX > DE-RELM >
The lesser error than compared methods verifies the good
performance of the ESAENARX forecast model. The PJM
results in Figure 22 and Figure 23, prove the better accu-
racy of ESAENARX and DE-RELM as compared to ELM,
and CEANN [7] are listed in Table 2. The efficiency of
ESAENARX and DE-RELM is confirmed by lesser MAPE
and RMSE compared to the mentioned methods.
The computational time of both proposed models is pre-
sented in Table 3. The computational time of ESAENARX is
Table 2
Comparison of forecasting errors.
Forecast         Method       MAPE     RMSE     NRMSE
Load Forecast    ELM          74.59    7.82     1.53
                 NARX         1.35     4.35     0.37
                 DE-ELM       21.73    5.23     0.41
                 RELM         18.78    4.62     0.37
                 CEANN [7]    8.62     3.75     0.57
                 DE-RELM      7.78     3.14     0.32
                 ESAENARX     1.13     2.27     0.03
Price Forecast   ELM          89.95    9.78     1.91
                 NARX         8.29     5.24     0.89
                 DE-ELM       28.06    6.92     0.32
                 RELM         21.06    5.62     0.28
                 CEANN [7]    19.96    4.45     0.96
                 DE-RELM      18.62    3.75     0.34
                 ESAENARX     3.32     2.85     0.08

Load Forecast    ELM          72.32    21.2     1.92
                 NARX         32       9.26     1.8
                 DE-ELM       6.52     9.18     0.08
                 RELM         1.14     9.04     0.032
                 CEANN [7]    3.87     8.96     0.64
                 DE-RELM      1.09     5.24     0.028
                 ESAENARX     1.08     3.86     0.03
Price Forecast   ELM          99       21.6     2.19
                 NARX         8.78     18.72    0.16
                 DE-ELM       18.49    21.76    0.35
                 RELM         11.09    18.96    0.52
                 CEANN [7]    10.74    8.76     0.2604
                 DE-RELM      10.56    7.24     0.18
                 ESAENARX     4.32     4.67     0.12
Table 3
Computational time of proposed algorithms.
Model    Dataset    Training Time (s)    Testing Time
         PJM        187                  53
         PJM        123                  29
higher than that of DE-RELM because the feature extractor ESAE involves pre-training and fine-tuning steps. Both models take more time for training on PJM data. The reason behind PJM's higher computational time is its larger size than ISO NE.
22. Conclusion
In this paper, electricity load and price forecasting is considered in order to take part in the ISO NE and PJM markets that regulate the price and demand in the power systems of the USA. The modeling of electricity load and price is addressed by two new deep learning based models: ESAENARX and DE-RELM. Descriptive and predictive analytics of electricity big data are performed. The proposed methods consider the bidirectional impacts of demand and prices on each other. These methods capture the load and price interdependencies in the past market data. The following conclusions are drawn from this study:
1. The big data analytics unveil insightful information about consumer behaviors and increasing demand. This information helps in the formulation of new demand-response programs and long-term decisions, such as upscaling of the grid to satisfy the future demand. Consequently, the grid stability is significantly improved.
2. The proposed feature extractor, ESAE, significantly improves the quality of the extracted features, resulting in accurate forecasting. The functionality of ESAE is improved by implementing the proposed combination of decoder functions.
3. The proposed models efficiently capture price-demand trends in energy big data. Numerical results show that the proposed forecasting models have lower MAPE and RMSE than the compared methods.
4. The feasibility and practicality of the proposed models are confirmed by their accuracy on well-known real electricity market data.
In future work, the SAE feature extractor will be enhanced
using multiple combinations of encoder and decoder func-
tions. The effect of each combination on the performance
of feature extractor will be examined. A comparative analy-
sis will be performed on enhanced feature extractor in order
to propose a generalized SAE that performs well on multi-
ple scenarios and datasets. Proposed models can be imple-
mented in real world scenario of smart grid or micro grid in
order to improve power system operations.
References
[1] Liu Y, Wang W, Ghadimi N. Electricity load forecasting
by an improved forecast engine for building level con-
sumers. Energy. 2017 Nov 15;139:18-30.
[2] Akhavan-Hejazi H, Mohsenian-Rad H. Power systems
big data analytics: An assessment of paradigm shift bar-
riers and prospects. Energy Reports. 2018 Nov 30;4:91-
[3] Jiang H, Wang K, Wang Y, Gao M, Zhang Y. Energy big
data: A survey. IEEE Access. 2016; 4:3844-61.
[4] Zhou K, Fu C, Yang S. Big data driven smart energy
management: From big data to big insights. Renewable
and Sustainable Energy Reviews. 2016 Apr 1;56:215-
[5] Zhang Q, Yang LT, Chen Z, Li P. A survey on deep
learning for big data. Information Fusion. 2018 Jul 31;
[6] Ghasemi A, Shayeghi H, Moradzadeh M, Nooshyar M.
A novel hybrid algorithm for electricity price and load
forecasting in smart grids with demand-side manage-
ment. Applied energy. 2016 Sep 1;177:40-59.
[7] Gao W, Darvishan A, Toghani M, Mohammadi M, Abe-
dinia O, Ghadimi N. Different states of multi-block
based forecast engine for price and load prediction. In-
ternational Journal of Electrical Power & Energy Sys-
tems. 2019 Jan 1;104:423-35.
[8] Wang K, Xu C, Zhang Y, Guo S, Zomaya A. Robust
big data analytics for electricity price forecasting in the
smart grid. IEEE Transactions on Big Data. 2017 Jul 5,
DOI: 10.1109/TBDATA.2017.2723563.
[9] Singh S, Yassine A. Big data mining of energy time
series for behavioral analytics and energy consumption
forecasting. Energies. 2018 Feb 20;11(2):452.
[10] Wang L, Zhang Z, Chen J. Short-term electricity price
forecasting with stacked denoising autoencoders. IEEE
Transactions on Power Systems. 2017 Jul;32(4):2673-
[11] Tong C, Li J, Lang C, Kong F, Niu J, Rodrigues JJ.
An efficient deep model for day-ahead electricity load
forecasting with stacked denoising autoencoders. Jour-
nal of Parallel and Distributed Computing. 2018 Jul
[12] Ahmad A, Javaid N, Guizani M, Alrajeh N, Khan ZA.
An accurate and fast converging short-term load fore-
casting model for industrial applications in a smart grid.
IEEE Transactions on Industrial Informatics. 2017 Oct
[13] Ahmad A, Javaid N, Alrajeh N, Khan ZA, Qasim U,
Khan A. A modified feature selection and artificial neu-
ral network-based day-ahead load forecasting model for
a smart grid. Applied Sciences. 2015 Dec 11;5(4):1756-
[14] Kuo PH, Huang CJ. An Electricity Price Forecasting
Model by Hybrid Structured Deep Neural Networks.
Sustainability. 2018 Apr 21;10(4):1280.
[15] Ugurlu U, Oksuz I, Tas O. Electricity Price Forecasting
Using Recurrent Neural Networks. Energies. 2018 Apr
[16] Fan C, Xiao F, Zhao Y. A short-term building cooling
load prediction method using deep learning algorithms.
Applied Energy. 2017 Jun 1;195:222-33.
[17] Ryu S, Noh J, Kim H. Deep neural network based de-
mand side short term load forecasting. Energies. 2016
Dec 22;10(1):3.
[18] Mocanu E, Nguyen PH, Gibescu M, Kling WL. Deep
learning for estimating building energy consumption.
Sustainable Energy, Grids and Networks. 2016 Jun
[19] Li C, Ding Z, Zhao D, Yi J, Zhang G. Building energy
consumption prediction: An extreme deep learning ap-
proach. Energies. 2017 Oct 7;10(10):1525.
[20] Fu G. Deep belief network based ensemble approach
for cooling load forecasting of air-conditioning system.
Energy. 2018 Apr 1;148:269-82.
[21] Dedinec A, Filiposka S, Dedinec A, Kocarev L.
Deep belief network based electricity load forecasting:
An analysis of Macedonian case. Energy. 2016 Nov
[22] Qiu X, Ren Y, Suganthan PN, Amaratunga GA. Empir-
ical mode decomposition based ensemble deep learning
for load demand time series forecasting. Applied Soft
Computing. 2017 May 1;54:246-55.
[23] Rahman A, Srikumar V, Smith AD. Predicting electric-
ity consumption for commercial and residential build-
ings using deep recurrent neural networks. Applied En-
ergy. 2018 Feb 15;212:372-85.
[24] Bouktif S, Fiaz A, Ouni A, Serhani M. Optimal deep
learning LSTM model for electric load forecasting using
feature selection and genetic algorithm: Comparison
with machine learning approaches. Energies. 2018 Jun
[25] Zheng H, Yuan J, Chen L. Short-term load forecast-
ing using EMD-LSTM neural networks with a Xgboost
algorithm for feature importance evaluation. Energies.
2017 Aug 8;10(8):1168.
[26] Shi H, Xu M, Li R. Deep learning for household load
forecasting-A novel pooling deep RNN. IEEE Transac-
tions on Smart Grid. 2018 Sep;9(5):5271-80.
[27] Guo Z, Zhou K, Zhang X, Yang S. A deep learning
model for short-term power load and probability density
forecasting. Energy. 2018 Oct 1;160:1186-200.
[28] Wen L, Zhou K, Yang S, Lu X. Optimal load dispatch
of community microgrid with deep learning based solar
power and load forecasting. Energy. 2019 Jan 16.
[29] Torres JF, Fernandez AM, Troncoso A, Martinez-
Alvarez F. Deep learning-based approach for time series
forecasting with application to electricity load. In In-
ternational Work-Conference on the Interplay Between
Natural and Artificial Computation 2017 Jun 19 (pp.
203-212). Springer, Cham.
[30] Din GM, Marnerides AK. Short term power load fore-
casting using deep neural networks. In 2017 Interna-
tional Conference on Computing, Networking and Com-
munications (ICNC) 2017 Jan 26 (pp. 594-598). IEEE.
[31] Bibri SE. The IoT for smart sustainable cities of the
future: An analytical framework for sensor-based big
data applications for environmental sustainability.
Sustainable Cities and Society. 2018 Apr 1;38:230-53.
[32] Bibri SE, Krogstie J. Smart sustainable cities of the
future: An extensive interdisciplinary literature review.
Sustainable Cities and Society. 2017 May 1;31:183-
[33] Silva BN, Khan M, Han K. Towards sustainable smart
cities: A review of trends, architectures, components,
and open challenges in smart cities. Sustainable Cities
and Society. 2018 Apr 1;38:697-713.
[34] Ibrahim M, El-Zaart A, Adams C. Smart sustainable
cities roadmap: Readiness for transformation towards
urban sustainability. Sustainable Cities and Society. 2018
Feb 1;37:530-40.
[35] Massana J, Pous C, Burgas L, Melendez J, Colomer J.
Identifying services for short-term load forecasting using
data driven models in a Smart City platform.
Sustainable Cities and Society. 2017 Jan 1;28:108-17.
[36] White, B.W. Principles of neurodynamics: Perceptrons
and the theory of brain mechanisms. Spartan Books,
Washington DC. 1963.
[37] Youssef A, Delpha C, Diallo D. An optimal fault
detection threshold for early detection using
Kullback–Leibler divergence for unknown distribution data.
Signal Processing. 2016 Mar 1;120:266-79.
[38] Hida T, Kuo HH, Potthoff J, Streit L. White noise: an
infinite dimensional calculus. Springer Science & Busi-
ness Media; 2013 Jun 29.
[39] Chen S, Billings SA, Grant PM. Non-linear system
identification using neural networks. International jour-
nal of control.
[40] Chen X, Li S, Wang W. New de-noising method for
speech signal based on wavelet entropy and adaptive
threshold. Journal of Information & Computational Sci-
ence. 2015;12(3):1257-65.
[41] NYISO Market Operation Data, https://www.nyiso.com/load-data
(Last visited on 16th March 2019)
[42] PJM Market Operation Data,
(Last visited on 16th March 2019)
[43] ISO NE Market Operations Data,
https://www.iso-ne.com/isoexpress/web/reports/pricing/-/tree/zone-info
(Last visited on 10th November 2018)
[44] PJM Market Operations Data, https://dataminer2.pjm.com
(Last visited on 10th November 2018)
[45] Burke PJ, Abayasekara A. The price elas-
ticity of electricity demand in the United
States: A three-dimensional analysis. Energy J.