
An Enhanced Convolutional Neural Network model based on weather parameters for short-term electricity supply and demand

Zeeshan Aslam, Nadeem Javaid, Muhammad Adil, Muhammad Tariq Ijaz, Atta ur Rahman, and Mohsin Ahmed

Abstract Short-term electricity supply and demand forecasting using weather parameters, including temperature, wind speed, and solar radiation, improves the operational efficiency and accuracy of power systems. Many weather parameters influence the supply and demand of electricity; among them, temperature, solar radiation, and wind speed are the most important. Our proposed time series model comprises preprocessing, feature extraction, data preparation, and an Enhanced Convolutional Neural Network (ECNN) module for short-term weather parameter forecasting up to 6 hours ahead. The proposed ECNN time series model is applied to data from 61 locations in the United States, collected from the National Solar Radiation Database (NSRDB). The model is trained on 15 years of data and validated on an additional two years of out-of-sample data. Simulation results show that our proposed model outperforms traditional benchmark models in terms of the Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Relative Root Mean Square Error (RMSE%) performance metrics. The results show that the proposed model is effective for short-term forecasting of temperature, solar radiation, and wind speed, and that it improves the accuracy and operational efficiency of power systems.

1 Introduction

The electric power industry plays a vital role in the well-being of a country; its efficient performance supports economic development. One of the main modes of electric power generation is thermal, which is costly and emits carbon in large amounts. With rapidly growing interest in renewable energy sources such as wind and solar power, and the decreasing cost of power generation, renewable energy generation has gained importance because it is less expensive and carbon-free. Globally, wind and solar power generation grew rapidly from 80 to 790 GW between 2006 and 2016. Accurate forecasting is important because a 1% reduction in Mean Absolute Percentage Error (MAPE) translates into a 0.1% to 0.3% reduction in electricity generation cost, which is approximately $1 million annually for a large-scale smart grid.

Zeeshan Aslam, Nadeem Javaid (Corresponding Author), Muhammad Adil, Muhammad Tariq Ijaz, Atta ur Rahman, and Mohsin Ahmed
COMSATS University Islamabad, Pakistan. Email: nadeemjavaidqau@gmail.com

The supply and demand of electricity are highly dependent on weather parameters; therefore, market participants need a useful and reliable technique to increase their profit by accurately forecasting these parameters. Among the large number of weather parameters, temperature, wind speed, and solar radiation are the most influential for the supply and demand of electricity. Temperature strongly affects the demand side, whereas wind speed and solar radiation affect the supply side. Accurate short-term forecasting of these weather parameters is important for several reasons: efficient supply management of electricity, reducing electricity consumption, improving the energy efficiency of stations, preparing effective production plans, improving operational efficiency, and adjusting and controlling power stations [1-2]. However, inaccurate forecasting of these weather parameters remains one of the most important challenges in electricity supply and demand.

Traditionally, short-term weather parameter forecasting techniques are based on statistical models, such as the Vector Autoregression (VAR) model and regression techniques, and on artificial intelligence models, such as Support Vector Machines (SVMs), Artificial Neural Networks (ANNs), and deep learning models. These models require extensive memory and computation time, converge slowly, are less accurate, and require weight adjustment and heavy preprocessing to refine the input, which makes them prone to overfitting. Moreover, as the size of the data increases, these models become complex and require more training time [2]. Given these limitations of traditional models, we propose the ECNN weather model for short-term forecasting of three weather parameters, namely temperature, wind speed, and solar radiation, using data from 61 locations in the United States.

In light of the weather parameter forecasting techniques mentioned above, this paper makes the following contributions. First, we identify and remove outliers from the weather parameter time series. Second, we apply Fast Independent Component Analysis (FastICA) to reduce the dimensionality of the features. Third, we apply Granger Causality (GCA) and Augmented Dickey-Fuller (ADF) tests to check the usefulness and stationarity of the time series; after that, we find and adjust trend and seasonal patterns in the time series. Fourth, we employ the ECNN time series model for short-term forecasting of the three weather parameters to overcome the limitations of the traditional models mentioned above. Finally, we evaluate the performance of our proposed model against traditional benchmark models using three accuracy metrics: RMSE, RMSE%, and MAE.

The rest of this paper is organized as follows. In Section 2, we investigate and summarize related work on short-term weather parameter forecasting. In Section 3, we present our proposed methodology. In Section 4, we examine the forecasting performance using simulation results. Finally, in Section 5, we draw conclusions and outline future work.

2 Related Work

Related work based on weather parameters has been extensively studied and can be categorized into four groups: solar radiation forecasting, wind speed forecasting, temperature or load forecasting, and multivariate weather parameter forecasting. In [26, 27, 28, 31, 32], the authors propose solutions for short-term to medium-term electricity price and load forecasting. In [29, 30], the authors perform wind and photovoltaic power forecasting. Table 1 summarizes further related work on short-term weather parameter forecasting.

3 Methodology

Modeling the three weather parameters for short-term forecasting first requires preprocessing of the input features. Afterward, it requires appropriate extraction and preparation of the input features to reduce dimensionality while keeping the necessary information. Then, a suitable ECNN structure is defined to perform short-term forecasting of the input weather parameters. The following sub-sections describe each module of our proposed system model; before describing these steps in detail, an overview of the methodology is presented.

3.1 Overview of methodological approach

In this paper, we propose an approach for short-term forecasting of multivariate time series based on three historical input features: wind speed, solar radiation, and temperature. In our proposed approach, we first apply preprocessing to the input features, identify and remove outliers using the Z score technique, and normalize the input features using min-max normalization (see Section 3.3). After preprocessing, the FastICA feature extraction technique is applied to reduce the dimensionality of the features (see Section 3.4). After preprocessing and extraction of the relevant data, we identify and adjust its specific characteristics and seasonal patterns (see Section 3.5) in order to prepare the data for forecasting. After the preparation of the input features, the ECNN model is developed and the input data are passed to it. In order to define the structure of the ECNN, its elements must be specified (see Section 3.6). An overview of the proposed approach is presented in Fig. 1.


Table 1: Summary of short-term weather parameters forecasting

VAR, 2018 [1]. Description: Authors employ a VAR model for weather parameters. Limitations: linear in nature; needs more memory; high computation time and preprocessing.

WKNNRW, 2018 [3]. Description: Authors propose a hybrid model for multivariate time series forecasting. Limitations: overfitting; extensive computation and memory requirements; random weight selection.

DNN, 2018 [4]. Description: Authors apply deep learning to forecast solar radiation. Limitations: overfitting; local optima; slow convergence.

CNN-LSTM, 2018 [5]. Description: Authors propose a hybrid model to forecast solar power. Limitations: overfitting; computationally expensive.

DNN-TPE, 2018 [6]. Description: Authors propose a global DNN model for solar irradiance forecasting. Limitations: overfitting; poor accuracy; TPE becomes expensive on large data sets.

VMD-RMWK, 2018 [7]. Description: Authors propose a hybrid model for solar irradiance forecasting. Limitations: computationally expensive; more training time and memory; less accurate on large data sets.

IEMD-ARIMA-WNN-FOA, 2018 [8]. Description: Authors propose a hybrid model for short-term load forecasting. Limitations: computationally expensive; overfitting; more training time; less accurate on large data sets.

PSR-BSK, 2018 [9]. Description: Authors propose a hybrid model for short-term load forecasting. Limitations: requires parameter tuning; overfitting; extensive memory and computation requirements; less accurate.

SVR, 2017 [10]. Description: Authors propose an SVR model for short-term forecasting of demand response. Limitations: overfitting; slow convergence; requires more memory; less accurate and computationally expensive on large data sets.

EWT-LSTM-RELM-IEWT, 2018 [11]. Description: Authors propose a hybrid model to forecast wind speed. Limitations: computationally expensive; overfitting; requires more memory; complex and expensive on large data sets.

RWT-ARIMA, 2019 [12]. Description: Authors propose a hybrid model for short-term forecasting of wind speed. Limitations: limited handling of non-linear characteristics; poor accuracy; expensive on large data sets.

CBA-ANN-SVM, 2019 [2]. Description: Authors propose a hybrid model to forecast load and weather parameters. Limitations: overfitting; local optima and slow convergence; requires more memory.

VMD-GSO-ELM, 2018 [13]. Description: Authors propose a hybrid model for wind speed forecasting. Limitations: overfitting; extensive memory and computation requirements; GSO becomes expensive on large data sets.

ICEEMDAN-GWO, 2018 [14]. Description: Authors propose a hybrid model to forecast wind speed. Limitations: overfitting; extensive memory and computation requirements; GWO becomes expensive on large data sets.

EEMD-AWNN, 2018 [15]. Description: Authors propose a hybrid model for short-term wind speed forecasting. Limitations: overfitting; slow convergence; extensive memory and computation requirements.

3.2 Weather Data

We apply our proposed model to data from 61 locations in the United States, collected from the National Solar Radiation Database (NSRDB), which is prepared by the National Renewable Energy Laboratory, the National Climatic Data Center, and other partners [16]. We use three input parameters from the data set, chosen because they have an influential effect on both the supply and demand sides of electricity: temperature in Kelvin (K), wind speed in meters per second (m/s), and global solar radiation in Watt-hours per square meter (Wh/m2). We use 15 years of data from 1991-2006 to train the model and an additional two years of data for validation.

Fig. 1: Overview of proposed system model

3.3 Data Preprocessing

In the preprocessing module, we perform three operations: data cleansing and analysis, outlier detection and removal, and data normalization. In data cleansing and analysis, we first analyze the data by identifying the type of each time series and checking it for null or incorrect values, and then clean those values by filling them with the mean of the time series. In outlier detection and removal, we use the Z score to identify values that lie outside the distribution and have no major influence on the final output. An outlier has a serious impact on the mean and standard deviation, and causes the data to skew. In [17], the authors define the Z score used to identify and remove outliers as:

Z_score = (Observation − Mean) / Standard Deviation.    (1)

After the first two operations of the preprocessing module, we normalize the time series using min-max normalization, which makes it easy for the ECNN to handle data on the same scale and increases its training speed. It scales each input feature between 0 and 1. In [18], the authors define the normalization formula as:

x' = (x_i − x_min) / (x_max − x_min),    (2)

where x' is set to 0 when (x_max − x_min) = 0, i.e., when the input feature has a constant value. The preprocessing module improves the accuracy of our proposed system and reduces the required memory, training time, model complexity, and overfitting.
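The two preprocessing steps above can be sketched in a few lines of Python. This is a minimal illustration on a made-up toy series; the cutoff of 2 standard deviations is an assumption for the example, since the paper does not state its threshold:

```python
import numpy as np

def remove_outliers_zscore(series, threshold=3.0):
    """Drop points whose absolute Z score (Eq. 1) exceeds the threshold."""
    z = (series - series.mean()) / series.std()
    return series[np.abs(z) <= threshold]

def min_max_normalize(series):
    """Scale a feature to [0, 1] (Eq. 2); a constant feature maps to 0."""
    span = series.max() - series.min()
    if span == 0:
        return np.zeros_like(series, dtype=float)
    return (series - series.min()) / span

# Toy hourly temperature series (K) with one obvious outlier at 350.0.
temps = np.array([290.1, 291.3, 289.8, 350.0, 290.7, 291.0])
clean = remove_outliers_zscore(temps, threshold=2.0)  # drops the 350.0 reading
scaled = min_max_normalize(clean)                     # values now in [0, 1]
```

After this step, every feature lies on the same [0, 1] scale, which is what makes the subsequent ECNN training faster.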

3.4 Feature Extraction

In the feature extraction module, we apply FastICA, a computationally efficient method for estimating independent components. It is 10-100 times faster than traditional independent component analysis methods, which are based on a gradient descent approach. FastICA is used as a feature extraction technique to reduce the dimensionality of the features while retaining key information; the compressed features take less memory to store [19]. By reducing the dimensionality of the features, it reduces the amount of memory required to store them, the complexity of the model, and its training time, and it improves data visualization, since data in high dimensions are very difficult to understand and visualize. FastICA is similar to the Principal Component Analysis (PCA) technique, which maps a collection of features to uncorrelated features; however, FastICA goes further by maximizing statistical independence (or minimizing mutual information) rather than merely producing uncorrelated features. FastICA performs better than PCA and is easy to use [19]. It reduces the number of variables in the time series.
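As an illustration of the idea (not the paper's implementation), the following is a minimal numpy sketch of the fixed-point FastICA iteration with the tanh nonlinearity and symmetric decorrelation; in practice a library routine such as scikit-learn's `FastICA` would typically be used. The mixing matrix and signals below are arbitrary toy choices:

```python
import numpy as np

def whiten(X, n_components):
    """Center X (samples x features) and whiten it, keeping n_components."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1][:n_components]     # largest eigenvalues first
    K = vecs[:, order] / np.sqrt(vals[order])         # whitening matrix
    return Xc @ K

def fast_ica(X, n_components, n_iter=200):
    """Symmetric FastICA: fixed-point update followed by decorrelation."""
    Z = whiten(X, n_components)                        # unit-covariance data
    rng = np.random.default_rng(0)
    W = rng.standard_normal((n_components, n_components))
    for _ in range(n_iter):
        G = np.tanh(Z @ W.T)                           # g(w^T z) for each unit
        # W+ = E[g(Wz) z^T] - E[g'(Wz)] W, with g' = 1 - tanh^2
        W = (G.T @ Z) / len(Z) - np.diag((1 - G ** 2).mean(axis=0)) @ W
        U, _, Vt = np.linalg.svd(W)                    # symmetric decorrelation:
        W = U @ Vt                                     # W <- (W W^T)^(-1/2) W
    return Z @ W.T                                     # estimated components

# Two independent toy sources observed through three mixed sensors (3 -> 2).
t = np.linspace(0, 8 * np.pi, 400)
sources = np.column_stack([np.sin(t), np.sign(np.sin(3 * t))])
X = sources @ np.array([[1.0, 0.5, 0.3], [0.4, 1.0, 0.8]])
S = fast_ica(X, n_components=2)
```

The returned components are mutually decorrelated by construction, and the dimensionality drops from three observed features to two, which is the memory and complexity reduction the module aims for.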

3.5 Data Preparation

In the data preparation module, we first apply the Granger Causality (GCA) test, a statistical hypothesis test used in time series analysis to check whether one time series is useful in forecasting another. If the probability outcome is less than a chosen α level, the hypothesis is rejected at that level [20]. In [20], the authors describe the equation for the GCA test as:

X(t) = Σ_{j=1}^{p} A_j X(t − j) + E_1(t),    (3)

where p is the number of lagged observations, the matrix A_j contains the coefficients, and E_1 represents the residuals (prediction errors) for each time series. After that, we find trend and seasonal patterns in the time series and adjust them. Time series data are mainly composed of trend and seasonal patterns, and can be decomposed into these major sub-components to check their effects on the data [21]. There are two decomposition models: additive decomposition and multiplicative decomposition. In this paper, we use the additive decomposition model to obtain the trend and seasonal components of the time series, defined as:

X_t = Trend(T_t) + Seasonal(S_t) + Random.    (4)

We use the additive decomposition model because it is useful for finding trend and seasonal components when the time series changes with the weather and does not vary much [21]; it also works more efficiently on our data than the multiplicative model. After finding the trend and seasonal components, we stabilize the series by differencing, the most popular method for making a series stable by reducing or eliminating trend and seasonality [22]:

x_t = y_t − y_{t−1},    (5)

where y_t is the original time series and y_{t−1} is its lagged version. We then apply the Augmented Dickey-Fuller (ADF) test to check whether the time series is stationary. The ADF test shows that the time series is stationary, with p-values less than 0.01 for all locations. The results of differencing and the ADF test suggest that stationarity is not an issue for our time series.
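The decomposition (Eq. 4) and differencing (Eq. 5) steps can be sketched in numpy. This is a simplified illustration on a synthetic series: the trend estimate here is a plain moving average rather than the conventional centered 2×m average, and the 24-hour period is an assumption for the example:

```python
import numpy as np

def additive_decompose(y, period):
    """Additive decomposition (Eq. 4): trend via moving average, seasonal
    component as the mean detrended value per position in the cycle."""
    kernel = np.ones(period) / period
    trend = np.convolve(y, kernel, mode="same")       # smooth out the cycle
    detrended = y - trend
    seasonal = np.array([detrended[i::period].mean() for i in range(period)])
    seasonal = np.tile(seasonal, len(y) // period + 1)[: len(y)]
    residual = y - trend - seasonal                   # the "random" term
    return trend, seasonal, residual

def difference(y, lag=1):
    """First-order differencing (Eq. 5): x_t = y_t - y_{t-lag}."""
    return y[lag:] - y[:-lag]

# Synthetic hourly series: linear trend plus a 24-hour cycle.
t = np.arange(24 * 10, dtype=float)
y = 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 24)
trend, seasonal, residual = additive_decompose(y, period=24)
stationary = difference(y)  # removes the linear trend component
```

In practice, the ADF and Granger causality tests themselves would be run with a statistics library (e.g. statsmodels provides `adfuller` and `grangercausalitytests`); the sketch above only shows the adjustment steps.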

3.6 Model Structure

For the structure of the ECNN, certain elements have to be defined: the number of hidden layers, the activation function, the type of optimizer, the padding size, the window size, and the regularization terms used to avoid overfitting and increase forecast accuracy. Since no optimal choice for these decisions is given in the literature, all of these elements of the proposed ECNN were found by a trial and error process. In the ECNN, we use the Leaky ReLU activation function, one of the most commonly used activation functions in deep neural networks; it improves the training process and reduces the vanishing gradient problem of the ECNN. For learning, we use the Adaptive Moment Estimation (Adam) optimizer, an extension of stochastic gradient descent. Moreover, the choice of hidden layers and activation function depends on the type of data and the problem to be solved. Fig. 2 shows the structure of the ECNN.

4 ECNN: Our Proposed Model

A CNN is a deep neural network that uses a multi-layered structure to represent information. The CNN was first proposed in [23] for automatic classification of digit images. One of its most important properties is that it continues to improve as the size of the data increases. It improves performance with modest memory requirements because of its local connectivity and parameter sharing properties. We propose an enhanced version of the CNN, named ECNN, in which we add some additional hidden layers and adjust the parameters of the hidden layers to avoid overfitting and improve model performance. The CNN works in two parts: in the first part, it learns high-level features from the given input features using weight sharing, and in the second part, it flattens the output of the preceding layers and performs the prediction. In the ECNN, we add one convolution layer with filter size two, using Leaky ReLU as the activation function. The convolution operation of the convolution layer is defined as in [4]:

O_{j,k} = f( Σ_{l=0}^{c} Σ_{m=0}^{c} w_{l,m} i_{j+l, k+m} ),    (6)

where f is the activation function, w contains the kernel weights, and i is the input feature map. In most deep neural networks, ReLU is used as the activation function. A key advantage of ReLU over other activation functions is that it does not activate all neurons at the same time. However, its limitation is that it saturates in the negative region, meaning the gradient there is zero; when the gradient is zero, the corresponding weights are not updated during back-propagation. To overcome this limitation, we use Leaky ReLU, which solves the dying ReLU problem. After the convolution layer, we add two dense layers. A dense layer represents a matrix-vector multiplication whose values are trainable parameters updated during back-propagation; it is a fully connected layer whose neurons receive input from all neurons of the previous layer. After the dense layers comes a dropout layer, which is used to prevent overfitting. During training, at each iteration, a number of neurons are temporarily dropped with some probability. Dropout prevents the network from depending on a small number of neurons and forces every neuron to operate independently, which increases accuracy, shortens training time, and combats overfitting. The purpose of the dropout layer is to learn different representations rather than relying on particular neurons or combinations of neurons.

Fig. 2: Proposed Enhanced Convolutional Neural Network

In the ECNN, we add two max pooling layers. A max pooling layer reduces the number of parameters, model computation, and dimensionality, and controls overfitting by reducing the spatial size of the network. The key advantage of the pooling operation is that it generates small feature maps that summarize the large input feature maps. After that, a flatten layer flattens the output of the preceding layers in order to feed the next fully connected layers, which perform the prediction. The output of the j-th hidden layer neuron can be calculated as:

o_j = f( b + Σ_{l=1}^{p} w_{j,l} o_l ),    (7)

where f is the activation function, w_{j,l} is the weight between neurons o_j and o_l, and p is the total number of neurons [4]. Fig. 2 shows the proposed ECNN model.
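The building blocks described by Eqs. (6) and (7) can be sketched in numpy for the one-dimensional case. This is a toy forward pass, not the trained ECNN; the filter weights, input values, and layer sizes are arbitrary illustrative choices:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: a small negative slope avoids the dying-ReLU problem."""
    return np.where(x > 0, x, alpha * x)

def conv1d(x, w, b=0.0):
    """Valid 1-D convolution (Eq. 6 in one dimension), filter size len(w)."""
    k = len(w)
    return np.array([np.dot(x[i:i + k], w) + b for i in range(len(x) - k + 1)])

def max_pool1d(x, size=2):
    """Non-overlapping max pooling: keeps the strongest activation per window."""
    n = len(x) // size
    return x[: n * size].reshape(n, size).max(axis=1)

def dense(x, W, b):
    """Fully connected layer (Eq. 7): o = f(b + W x)."""
    return leaky_relu(W @ x + b)

# Forward pass through a tiny ECNN-style stack on an 8-point input window.
x = np.array([0.1, 0.5, 0.3, 0.9, 0.2, 0.7, 0.4, 0.6])
h = leaky_relu(conv1d(x, w=np.array([1.0, -1.0])))   # filter size two
h = max_pool1d(h, size=2)                            # 7 activations -> 3
rng = np.random.default_rng(0)
out = dense(h, rng.standard_normal((1, len(h))), np.zeros(1))
```

The convolution with the [1, -1] filter simply computes local differences, and pooling then keeps the largest difference in each pair of windows, which is the summarizing behavior described above.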

The ECNN is trained using the Adam optimizer, which updates the model weights iteratively. Adam has some key advantages over traditional stochastic gradient descent optimization algorithms: it is straightforward to implement, computationally efficient, requires little memory, is well suited to problems that are large in terms of data or parameters, and is appropriate for noisy problems [24]. The key equations for updating the model parameters are:

m_t = β_1 m_{t−1} + (1 − β_1) g_t,    (8)
v_t = β_2 v_{t−1} + (1 − β_2) g_t^2,    (9)

where m and v are moving averages, g is the gradient at time step t, and β_1 and β_2 are hyper-parameters of the algorithm.
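Eqs. (8) and (9) are combined with the standard bias correction and parameter step of Adam [24]; a minimal numpy sketch on a one-dimensional quadratic follows. The learning rate and iteration count are illustrative choices, not the paper's settings:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.1,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: Eqs. (8)-(9), then the standard bias-corrected
    parameter step from Kingma & Ba [24]."""
    m = beta1 * m + (1 - beta1) * grad           # Eq. (8): first moment
    v = beta2 * v + (1 - beta2) * grad ** 2      # Eq. (9): second moment
    m_hat = m / (1 - beta1 ** t)                 # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3).
theta, m, v = 10.0, 0.0, 0.0
for t in range(1, 1001):
    grad = 2 * (theta - 3)
    theta, m, v = adam_step(theta, grad, m, v, t)
# theta approaches the minimizer 3 as the iterations proceed
```

The per-coordinate scaling by sqrt(v_hat) is what makes Adam robust to noisy gradients, which matters for weather series with irregular fluctuations.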

4.1 Performance metrics

In order to evaluate model performance, many evaluation metrics are available in the literature. In this paper, we consider three standard performance metrics, defined in [30]: MAE, RMSE, and RMSE%. MAE is defined as:

MAE = (1/N) Σ_{i=1}^{N} |F_i − O_i|,    (10)

RMSE is defined as:

RMSE = sqrt( (1/N) Σ_{i=1}^{N} (F_i − O_i)^2 ),    (11)

and RMSE% is defined as:

RMSE% = sqrt( (1/N) Σ_{i=1}^{N} (F_i − O_i)^2 ) / ( (1/N) Σ_{i=1}^{N} O_i ),    (12)

where N is the total number of input samples, F_i is the forecast value, and O_i is the observed value. In this paper, we compare our proposed model with existing benchmark models, including SVM, VAR, and ANN.
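The three metrics are straightforward to implement; the following numpy sketch uses toy forecast and observation vectors (the numbers are arbitrary, chosen only to make the arithmetic easy to check):

```python
import numpy as np

def mae(f, o):
    """Mean Absolute Error, Eq. (10)."""
    return np.mean(np.abs(f - o))

def rmse(f, o):
    """Root Mean Square Error, Eq. (11)."""
    return np.sqrt(np.mean((f - o) ** 2))

def rmse_pct(f, o):
    """Relative RMSE, Eq. (12): RMSE divided by the mean observation,
    expressed as a percentage."""
    return 100.0 * rmse(f, o) / np.mean(o)

observed = np.array([5.0, 6.0, 7.0, 8.0])
forecast = np.array([5.5, 5.5, 7.5, 7.5])
# every error is 0.5, so MAE = RMSE = 0.5 and RMSE% = 100 * 0.5 / 6.5
```

Because RMSE% normalizes by the mean observed value, it lets errors on differently scaled parameters (K, m/s, Wh/m2) be compared directly.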

5 Simulation Results

In this section, we discuss the results of our proposed ECNN time series model against existing benchmark models. We evaluate the performance of our proposed model in terms of the MAE, RMSE, and RMSE% performance metrics described in Sub-section 4.1. Fig. 3 shows that the proposed ECNN has much lower average (across the 61 locations) MAE and RMSE than the other existing models; in terms of accuracy, the ECNN outperforms the existing benchmark models. Fig. 4 shows the ECNN forecasting results for wind speed, solar radiation, and temperature for the state of California. Tables 2 and 3 summarize the ECNN forecasting results for wind speed, solar radiation, and temperature according to the three performance metrics (i.e., MAE, RMSE, and RMSE%); these tables show the average MAE, RMSE, and RMSE% for the three weather parameters. For weather parameter forecasting, our results clearly indicate that the proposed ECNN model has a much lower error rate and is more accurate than traditional models. Furthermore, the execution time of the ECNN is lower than that of the VAR benchmark model: the VAR execution time for one location is 6 minutes and 25 seconds, while the ECNN takes 3 minutes and 50 seconds. The results show that our proposed model performs better than the other benchmark models for electricity supply and demand forecasting.

6 Conclusion

In order to solve the problem of short-term weather parameter forecasting, we propose an ECNN weather model based on three weather parameters: temperature, wind speed, and solar radiation. The proposed model consists of four major modules. In the first module, we preprocess the time series data, including data analysis and cleansing, outlier detection and removal, and data normalization. In the second module, we perform feature extraction using FastICA, which reduces dimensionality and model complexity and improves model training. In the third module, we prepare the time series data by applying the GCA test to check the usefulness of the time series, finding and adjusting trend and seasonality, and applying the ADF


Fig. 3: Average 6-hour MAE, RMSE, and RMSE% of ECNN and other models

Table 2: ECNN average MAE, RMSE, and RMSE% of wind and solar forecasting

Wind speed:
Forecast horizon | MAE (m/s) | RMSE (m/s) | RMSE% (%)
1-hour ahead | 0.0084 | 0.0103 | 23.15
2-hour ahead | 0.0095 | 0.0109 | 26.17
3-hour ahead | 0.0098 | 0.0115 | 28.21
4-hour ahead | 0.0105 | 0.0121 | 35.62
5-hour ahead | 0.0107 | 0.0124 | 38.67
6-hour ahead | 0.0110 | 0.0126 | 38.90

Solar radiation:
Forecast horizon | MAE (Wh/m2) | RMSE (Wh/m2) | RMSE% (%)
1-hour ahead | 0.0114 | 0.0135 | 17.15
2-hour ahead | 0.0116 | 0.0138 | 20.17
3-hour ahead | 0.0116 | 0.0140 | 22.21
4-hour ahead | 0.0118 | 0.0142 | 25.12
5-hour ahead | 0.0124 | 0.0145 | 27.41
6-hour ahead | 0.0126 | 0.0146 | 30.90

test to check the stationarity of the time series. In the fourth module, we perform short-term forecasting by employing the proposed ECNN model, which thereby gains the advantages of deep neural networks. The simulation results, based on two years of real-world time series weather data, show that the proposed model has more


Fig. 4: ECNN wind speed, solar radiations, and temperature forecasting results

Table 3: ECNN average MAE, RMSE, and RMSE% of temperature forecasting

Forecast horizon | MAE (K) | RMSE (K) | RMSE% (%)
1-hour ahead | 0.0988 | 0.9886 | 0.15
2-hour ahead | 0.0988 | 0.9889 | 0.25
3-hour ahead | 0.0989 | 0.9890 | 0.28
4-hour ahead | 0.0989 | 0.9895 | 0.32
5-hour ahead | 0.0989 | 0.9898 | 0.35
6-hour ahead | 0.0990 | 0.9899 | 0.40

accurate and effective results than existing benchmark models. Altogether, it can be concluded that developing a deep model is a complex process, especially when it is applied to weather parameters. Deep model development requires extensive effort to determine the optimal number of hidden layers and their parameters, including hyper-parameter tuning, since no clear guidelines are available for this process. In future work, we will enhance the proposed system by incorporating more weather parameters and improving its performance. Furthermore, we will enhance the performance of the proposed model through optimization techniques for tuning its hyper-parameters.

References

1. L. Yixian, M. C. Roberts, and R. Sioshansi, ”A vector autoregression weather model for elec-

tricity supply and demand modeling,” Journal of Modern Power Systems and Clean Energy,

vol. 6, pp. 763-776, 2018.

2. M. Torabi, S. Hashemi, M. R. Saybani, S. Shamshirband, and A. Mosavi, "A hybrid clustering and classification technique for forecasting short-term energy consumption," Environmental Progress & Sustainable Energy, vol. 38, pp. 66-76, 2019.

3. K. Lang, M. Zhang, Y. Yuan, and X. Yue, ”Short-term load forecasting based on multivariate

time series prediction and weighted neural network with random weights and kernels,” Cluster

Computing, pp. 1-9, 2018.

4. K. Kaba, M. Sarıgül, M. Avcı, and H. M. Kandırmaz, "Estimation of daily global solar radiation using deep learning model," Energy, vol. 162, pp. 126-135, 2018.

5. W. Lee, K. Kim, J. Park, J. Kim, and Y. Kim, ”Forecasting Solar Power Using Long-Short

Term Memory and Convolutional Neural Networks,” IEEE Access, vol. 6, pp. 73068-73080,

2018.

6. J. Lago, K. De Brabandere, F. De Ridder, and B. De Schutter, ”Short-term forecasting of solar

irradiance without local telemetry: A generalized model using satellite data,” Solar Energy,

vol. 173, pp. 566-577, 2018.

7. I. Majumder, P. Dash, and R. Bisoi, ”Variational mode decomposition based low rank robust

kernel extreme learning machine for solar irradiation forecasting,” Energy conversion and

management, vol. 171, pp. 787-806, 2018.

8. J. Zhang, Y.-M. Wei, D. Li, Z. Tan, and J. Zhou, ”Short term electricity load forecasting using

a hybrid model,” Energy, vol. 158, pp. 774-781, 2018.

9. G.-F. Fan, L.-L. Peng, and W.-C. Hong, ”Short term load forecasting based on phase space

reconstruction algorithm and bi-square kernel regression model,” Applied energy, vol. 224,

pp. 13-33, 2018.

10. Y. Chen, P. Xu, Y. Chu, W. Li, Y. Wu, L. Ni, et al., ”Short-term electrical load forecasting

using the Support Vector Regression (SVR) model to calculate the demand response baseline

for ofﬁce buildings,” Applied Energy, vol. 195, pp. 659-670, 2017.

11. Y. Li, H. Wu, and H. Liu, ”Multi-step wind speed forecasting using EWT decomposition,

LSTM principal computing, RELM subordinate computing and IEWT reconstruction,” En-

ergy Conversion and Management, vol. 167, pp. 203-219, 2018.

12. S. Singh and A. Mohapatra, ”Repeated wavelet transform based ARIMA model for very

short-term wind speed forecasting,” Renewable energy, vol. 136, pp. 758-768, 2019.

13. C. Li, Z. Xiao, X. Xia, W. Zou, and C. Zhang, ”A hybrid model based on synchronous op-

timization for multi-step short-term wind speed forecasting,” Applied energy, vol. 215, pp.

131-144, 2018.

14. J. Song, J. Wang, and H. Lu, ”A novel combined model based on advanced optimization

algorithm for short-term wind speed forecasting,” Applied energy, vol. 215, pp. 643-658,

2018.

15. M. Santhosh, C. Venkaiah, and D. V. Kumar, ”Ensemble empirical mode decomposition based

adaptive wavelet neural network method for wind speed prediction,” Energy conversion and

management, vol. 168, pp. 482-493, 2018.

16. Wilcox, S. M. (2012). National solar radiation database 1991-2010 update: User’s manual

(No. NREL/TP-5500-54824). National Renewable Energy Lab.(NREL), Golden, CO (United

States).

17. P. J. Rousseeuw and M. Hubert, ”Robust statistics for outlier detection,” Wiley Interdisci-

plinary Reviews: Data Mining and Knowledge Discovery, vol. 1, pp. 73-79, 2011.


18. T. Jayalakshmi and A. Santhakumaran, ”Statistical normalization and back propagation for

classiﬁcation,” International Journal of Computer Theory and Engineering, vol. 3, pp. 1793-

8201, 2011.

19. D. Langlois, S. Chartier, and D. Gosselin, ”An introduction to independent component anal-

ysis: InfoMax and FastICA algorithms,” Tutorials in Quantitative Methods for Psychology,

vol. 6, pp. 31-38, 2010.

20. H. Lütkepohl, New Introduction to Multiple Time Series Analysis. Springer Science & Business Media, 2005.

21. V. Prema and K. U. Rao, ”Time series decomposition model for accurate wind speed forecast,”

Renewables: Wind, Water, and Solar, vol. 2, p. 18, 2015.

22. D. C. Montgomery, C. L. Jennings, and M. Kulahci, Introduction to time series analysis and

forecasting: John Wiley & Sons, 2015.

23. Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, et al., ”Back-

propagation applied to handwritten zip code recognition,” Neural computation, vol. 1, pp.

541-551, 1989.

24. D. P. Kingma and J. Ba, ”Adam: A method for stochastic optimization,” arXiv preprint

arXiv:1412.6980, 2014.

25. R. Marquez and C. F. Coimbra, ”Proposed metric for evaluation of solar forecasting models,”

Journal of solar energy engineering, vol. 135, p. 011016, 2013.

26. Samuel, O., Alzahrani, F.A., Hussen Khan, R.J.U., Farooq, H., Shaﬁq, M., Afzal, M.K. and

Javaid, N., 2020. Towards Modiﬁed Entropy Mutual Information Feature Selection to Fore-

cast Medium-Term Load Using a Deep Learning Model in Smart Homes. Entropy, 22(1),

2020.

27. Khalid, R., Javaid, N., Al-zahrani, F.A., Aurangzeb, K., Qazi, E.U.H. and Ashfaq, T., 2020.

Electricity Load and Price Forecasting Using Jaya-Long Short Term Memory (JLSTM) in

Smart Grids. Entropy, 22(1), 2020.

28. Mujeeb, S. and Javaid, N., 2019. ESAENARX and DE-RELM: Novel schemes for big data

predictive analytics of electricity load and price. Sustainable Cities and Society, 51.

29. Mujeeb, S., Alghamdi, T.A., Ullah, S., Fatima, A., Javaid, N. and Saba, T., 2019. Exploiting

Deep Learning for Wind Power Forecasting Based on Big Data Analytics. Applied Sciences,

9(20).

30. Naz, A., Javaid, N., Rasheed, M.B., Haseeb, A., Alhussein, M. and Aurangzeb, K., 2019.

Game Theoretical Energy Management with Storage Capacity Optimization and Photo-

Voltaic Cell Generated Power Forecasting in Micro Grid. Sustainability, 11(10).

31. Naz, A., Javed, M.U., Javaid, N., Saba, T., Alhussein, M. and Aurangzeb, K., 2019. Short-

term electric load and price forecasting using enhanced extreme learning machine optimiza-

tion in smart grids. Energies, 12(5).

32. Mujeeb, S., Javaid, N., Ilahi, M., Wadud, Z., Ishmanov, F. and Afzal, M.K., 2019. Deep long

short-term memory: A new price and load forecasting scheme for big data in smart cities.

Sustainability, 11(4), p.987.