
Short Term Load Forecasting Using XGBoost


Raza Abid Abbasi1, Nadeem Javaid1(B), Muhammad Nauman Javid Ghuman2,
Zahoor Ali Khan3, Shujat Ur Rehman2, and Amanullah2
1COMSATS Institute of Information Technology, Islamabad 44000, Pakistan
2Quaid-i-Azam University, Islamabad 44000, Pakistan
3Computer Information Science, Higher Colleges of Technology,
Fujairah 4114, UAE
Abstract. For efficient use of the smart grid, accurate prediction of the
upcoming load is of great importance to the utility. In the proposed
scheme, we first convert daily Australian Energy Market Operator load
data into a weekly time series. We then use eXtreme Gradient Boosting
(XGBoost) for selecting features from the data. After feature selection,
we use XGBoost to forecast the electricity load for a single time lag.
XGBoost performs extremely well for time series prediction, with
efficient use of computing time and memory. Our proposed scheme
outperformed other schemes on the mean absolute percentage error metric.
1 Introduction
Minimizing the difference between energy production and consumption is a
challenging task these days. Efficient consumption of electricity is one good
solution to this problem. Researchers have done a lot of work on introducing
cost-effective and efficient energy utilization systems. The next-generation
Smart Grid (SG) is the most attractive solution so far. SG is the integration of
information and communication technology into the traditional grid, which makes
it an intelligent power grid supporting real-time information exchange between
producer and consumer. SG enables the optimization of energy efficiency. More
precisely, the SG needs accurate forecasting of the energy load to be applied
productively.
An increasing effort to deregulate power markets, in order to shape a more
reliable, green, and cost-effective system by improving competition, has been
witnessed in the world's major economies [1,2]. In liberalized markets,
electricity is commoditized and consequently its price is dynamic. Because of
this price variation, pricing electricity correctly becomes important for
generating profits, scheduling power production, and planning load responses
[3–7]. Accurate electricity price forecasting helps to determine the electricity
price and is therefore valuable. As liberalized electricity markets include two
types, day-ahead and real-time [8], it is meaningful to discuss both day-ahead
and online forecasting of the electricity price. Electricity price forecasting
has been vigorously studied in the literature. From the application side,
forecasting of the electricity price in specific deregulated markets of major
economies around the world has been reported [9–15].
Springer Nature Switzerland AG 2019
L. Barolli et al. (Eds.): WAINA 2019, AISC 927, pp. 1120–1131, 2019.
The rest of the paper is structured as follows. The most commonly used practical
techniques for load forecasting are discussed in Sect. 2. Section 3 describes
the problems associated with load prediction. The proposed model for load
prediction and the evaluation metrics are explained in Sect. 4. Experimental
results are presented and discussed in Sect. 5. The conclusion of the work is
given in Sect. 6.
2 Related Work
Authors in [16] proposed a novel practical methodology that applies quantile
regression averaging to a set of sister point forecasts. Data from the
GEFCom2014 probabilistic load forecasting track was used to develop
probabilistic load forecasts. The suggested scheme has two advantages: it
builds on advances in the field of point load forecasting, and it does not
depend on high-quality expert predictions. The scheme produces better results
than the benchmark methods. “Recency effect”, a term from psychology, is used
by the authors in [17]. They exploited the fact that power demand is influenced
by the temperature of the preceding hours. The authors produced a comprehensive
study to show the recency effect with the help of big data. Modern computing
power is used to decide how many lagged temperatures are required to capture
the recency effect fully without compromising forecast accuracy.
In [18], the authors proposed a scheme for big data analytics in the smart grid
which aims at reducing the electricity cost for users. Moreover, they explored
the individual components needed for an improved decision support system for
energy saving. The presented framework has four layers in its architecture,
i.e., smart grid, data accumulation, an analytics counter, and a supporting web
portal. Future power consumption is forecasted and optimized through an
innovative hybrid nature-inspired metaheuristic prediction scheme. A versatile
optimization algorithm works as the backbone of the analytics counter and helps
in achieving accurate results. The proposed framework is the major contribution
of this work, and it supports the energy saving decision process. This
contribution is the basis for a full-scale Smart Decision Support System
(SDSS). SDSS can identify the usage pattern of an individual user, which helps
in enhancing the efficiency of energy usage while improving the accuracy of
forecasted energy demands. Authors in [19] used forecasting analytics while
focusing on the extraction of relevant external features. More explicitly, the
proposed scheme predicts spot prices in the German energy market in relation to
historical data on prices and weather features. The Least Absolute Shrinkage
and Selection Operator (LASSO) finds the relevant weather stations, while
implicit variable selection is performed by Random Forest (RF). This work
improved the prediction accuracy with respect to Mean Absolute Error (MAE) by
16.9%.
A novel modeling scheme for electricity price prediction is introduced in [20].
Four different deep learning models are suggested by Lago et al. for
forecasting electricity prices, leading to improved forecasting accuracy. The
authors argued that, despite the presence of a good number of electricity price
forecasting methods, a benchmark is still missing. This work compared and
evaluated 27 common techniques used for electricity price prediction and then
showed that the proposed models outperform the state-of-the-art techniques and
that the improvements are significant. Wang et al. used the Stacked Denoising
Autoencoder (SDA) and its Random Sample variant (RS-SDA) for live and next-day
hourly price prediction. In [21], short-term forecasting of the electricity
price is performed using a data-driven scheme. A type of deep neural network,
SDA, and its extended version, RS-SDA, are used to forecast the hourly
electricity price for data collected from different states of the United
States. This research focuses on next-day hourly prediction and live hourly
prediction. SDA models are assessed in comparison with a conventional neural
network and a support vector machine, while the accuracy of the next-day SDA
models is also assessed against an industrial model.
In [22], Lago et al. introduced two distinct schemes for incorporating market
integration into energy price prediction and enhancing forecasting performance.
The first scheme suggested a Deep Neural Network (DNN) that examines features
from connected markets to enhance the forecasting results in a local market.
Feature importance is calculated using an innovative feature selection scheme
based on optimization and functional analysis of variance. The second scheme
forecasts the prices of two adjacent markets simultaneously, which brings the
accuracy metric, Symmetric Mean Absolute Percentage Error (SMAPE), even lower.
Raviv et al. worked on predicting next-day energy prices while utilizing hourly
prices in [23]. This work shows that disaggregated hourly prices contain useful
predictive information for the daily average price in the Nord Pool market. It
is shown that multivariate models for the complete set of hourly prices
considerably outperform univariate models of the daily average price.
Multivariate models reduce the Root Mean Square Error (RMSE) by up to 16%. In
[24], the authors worked on electrical load forecasting on the basis of
pre-analysis and weight coefficient optimization. A novel scheme is introduced
that exploits the features of electrical load data, i.e., it can effectively
capture seasonality and nonlinearity. The proposed scheme can exploit the
advantages and avoid the disadvantages of the individual schemes. In the
suggested combined scheme, data pre-analysis is adopted so that interference in
the data can be minimized, and the weight factors of the combined model are
adjusted using cuckoo search. The newly proposed scheme outperforms the
individual forecasting models in forecast performance.
Singh et al. worked on predicting the amount of power consumed in [25]. An
intelligent data mining scheme is proposed that can analyze, predict, and
visualize electricity time series to reveal various temporal energy usage
patterns. These patterns help to identify the relationship of appliance usage
with time, i.e., hour of day, week, month, etc., and with the usage of other
appliances. This identification is the basis for understanding customer usage
behavior, energy load forecasting, and price forecasting. The authors proposed
Bayesian network forecasting, continuous analysis of data through data mining,
and unsupervised data clustering for electricity consumption prediction. In
[26], the authors proposed a short-term electricity load prediction scheme for
academic buildings. This work used 2-stage predictive analytics for the
efficient operation of their energy system. Energy consumption data is
collected from different universities, and the moving average method is used to
find the energy load pattern according to the day of the week. The Random
Forest (RF) technique is used for forecasting
the daily energy load. RF performance is assessed using cross-validation on time
González et al. predicted electricity prices using functional time series with
a new Hilbertian ARMAX model in [27]. The suggested scheme has a linear
regression structure in which functional variables are operated on by
functional parameters. These functional parameters are integral operators whose
kernels are linear combinations of sigmoid functions. A quasi-Newton method is
used to optimize the sigmoid parameters by minimizing the sum of squared
errors. Data integrity attacks affect the results of load prediction models,
i.e., artificial neural network, multiple linear regression, support vector
regression, and fuzzy interaction regression. Authors in [28] worked on
exposing the consequences of these attacks. They begin by simulating data
integrity attacks through the random injection of multipliers that follow a
normal or uniform distribution into the load series. Then, the same four load
forecasting models are used to generate one-year-ahead ex post point forecasts
in order to compare their forecast errors. The results show that the support
vector regression model is the most robust, followed closely by the multiple
linear regression model, whereas the fuzzy interaction regression model is the
least robust of the four. Nevertheless, all four models fail to provide
satisfying forecasts once the scale of the data integrity attacks becomes
large. This presents a serious challenge to both load forecasters and the
broader forecasting community: the generation of accurate forecasts under data
integrity attacks.
Dong et al. worked on energy management in a microgrid. A Bayesian optimization
algorithm (BOA) is used for a single home microgrid. Authors in [29] formulate
the optimization without a closed-form objective function and solve it using a
BOA-based data-driven technique. The suggested technique can be regarded as a
black-box function optimization technique as a whole. Furthermore, it has the
ability to handle uncertainty in microgrid operation and parameter forecasting.
3 Motivation and Problem Statement
Electricity load prediction is an important part of advanced power systems,
i.e., SG, effective power control, and improved energy operation engineering.
Therefore, highly accurate prediction is needed from different perspectives
related to control, dispatch, planning, and unit commitment in a grid.
Artificial Intelligence (AI) based schemes have a high capacity to handle
complicated mathematical problems; therefore, these techniques are widely
employed in a number of research areas.
The Artificial Neural Network (ANN) outperforms statistical schemes, as an ANN
is more efficient at mapping inputs to outputs without complicated mathematical
formulations. Diverse learning structures are used by ANNs for exploiting the
linear association among the inputs [30]. ANN schemes perform better than
analytical and time series techniques for prediction problems. The prediction
performance of a neural network is enhanced by pre-processing of the training
data, high equivalence impact, an optimal network structure, and a better
learning algorithm. Moreover, ANN brings rapid convergence, reduced computing
complexity, a minimal training period, and improved generalization [31].
Given a time series of 30-min electricity loads up to time t, X_1, ..., X_t,
our goal is to predict the load at time t+1, i.e., X_{t+1}.
4 Proposed System Model
Our proposed system forecasts the electricity load. In our proposed model, we
used a dataset from the Australian Energy Market Operator (AEMO). In this
dataset, electricity load recordings are taken every 30 min, producing 48
recordings for 48 time lags in a single day. If we want to predict the load of
a time lag, we have just 48 features to use. To increase the number of features
and improve prediction, we combine the daily records of the same week into a
single record, yielding 336 recordings per record, i.e., we have 48 × 7
features. The proposed system model is visualized in Fig. 1.
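The daily-to-weekly conversion described above can be illustrated with a minimal sketch. It assumes that weeks are counted from the first day in the input and that trailing days which do not fill a week are dropped; the function name `daily_to_weekly` and the synthetic values are illustrative only and are not taken from the paper or from AEMO data.

```python
from typing import List

SLOTS_PER_DAY = 48   # one load reading every 30 min
DAYS_PER_WEEK = 7


def daily_to_weekly(daily: List[List[float]]) -> List[List[float]]:
    """Concatenate each run of 7 consecutive daily records into one weekly record.

    Assumption: weeks start at the first day of `daily`; days at the end that
    do not form a complete week are dropped.
    """
    for day in daily:
        if len(day) != SLOTS_PER_DAY:
            raise ValueError("each daily record must hold 48 half-hourly loads")
    n_weeks = len(daily) // DAYS_PER_WEEK
    weekly = []
    for w in range(n_weeks):
        week_days = daily[w * DAYS_PER_WEEK:(w + 1) * DAYS_PER_WEEK]
        # Flatten 7 days x 48 slots into a single row of 336 values.
        weekly.append([load for day in week_days for load in day])
    return weekly


# Synthetic year: 365 days of 48 readings (placeholder numbers, not real loads).
year = [[float(d * SLOTS_PER_DAY + s) for s in range(SLOTS_PER_DAY)]
        for d in range(365)]
weeks = daily_to_weekly(year)
print(len(weeks), len(weeks[0]))  # 52 336
```

With 365 daily records, this yields 52 complete weekly rows of 336 values each, and one leftover day is discarded.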
We then calculate the importance of each feature in the updated dataset using
the XGBoost feature selection technique. It helps us to select the most
appropriate features for prediction. We consider the 35–40 features with the
highest importance values for training and testing of our proposed scheme.
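As a rough illustration of keeping only the highest-ranked features, the sketch below picks the k largest importance scores from a plain list. The scores are made-up numbers; in practice they might come from a fitted XGBoost model (for example the `feature_importances_` attribute of its scikit-learn style regressor), which is an assumption here rather than something the paper specifies. The helper names `top_k_features` and `project` are hypothetical.

```python
from typing import List, Sequence


def top_k_features(importances: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k largest importance scores, in index order."""
    if not 0 < k <= len(importances):
        raise ValueError("k must be between 1 and the number of features")
    ranked = sorted(range(len(importances)),
                    key=lambda i: importances[i], reverse=True)
    return sorted(ranked[:k])


def project(rows: Sequence[Sequence[float]], keep: Sequence[int]) -> List[List[float]]:
    """Keep only the selected feature columns in every row."""
    return [[row[i] for i in keep] for row in rows]


# Made-up importance scores for six features (illustrative only).
scores = [0.02, 0.40, 0.01, 0.30, 0.05, 0.22]
keep = top_k_features(scores, k=3)
print(keep)                                      # [1, 3, 5]
print(project([[10, 11, 12, 13, 14, 15]], keep))  # [[11, 13, 15]]
```

For the weekly dataset, the same call with k between 35 and 40 would reduce each 336-value row to the selected columns.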
We have 1 year of data for 12 different regions, giving 12 × 365 daily records.
We divide the data into 75% training data and 25% testing data. We train our
proposed model on the training data and test it on the testing data. The
algorithmic steps followed during load prediction using XGBoost are defined in
Algorithm 1.
Fig. 1. System model
Algorithm 1. Proposed Scheme Algorithm for Load Prediction.
1: Load daily records data
2: Convert daily records data to weekly records data
3: for i ← 1 to size(features) do
4: Calculate feature importance for feature i
5: end for
6: Select features with importance value greater than threshold
7: Divide data into training and testing data
8: Train model over training data
9: Predict load using trained model over testing data
10: Calculate accuracy
11: Calculate MAPE
12: Calculate MAE
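Step 7 of Algorithm 1 (the 75%/25% division) could be sketched as follows. The paper does not state whether the split is random or chronological; this sketch assumes a chronological split, which is a common choice for time series because it keeps the test rows after the training rows. The function name `split_ordered` is hypothetical.

```python
from typing import List, Sequence, Tuple


def split_ordered(rows: Sequence, train_fraction: float = 0.75) -> Tuple[List, List]:
    """Split rows into (train, test) without shuffling.

    Assumption: rows are already in time order, so the test set is the most
    recent 25% of the records.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return list(rows[:cut]), list(rows[cut:])


# 52 weekly records for one region, represented here by their week index.
weekly_rows = list(range(52))
train, test = split_ordered(weekly_rows)
print(len(train), len(test))  # 39 13
```

A shuffled split would be an equally valid reading of the text; the ordered variant is shown only because it avoids training on weeks that come after the test weeks.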
4.1 Evaluation Metrics
We used several measures to evaluate the performance of our proposed prediction
model. The two most commonly used metrics for measuring prediction accuracy are
Mean Absolute Percentage Error (MAPE) and Mean Absolute Error (MAE).
4.1.1 MAPE
The MAPE is a measure of the prediction accuracy of a forecasting method in
statistics, for example in trend estimation. It usually expresses accuracy as a
percentage of the error. Because this number is a percentage, it can be easier
to understand than other statistics. The MAPE is defined as shown in (1). Here,
A_t and F_t are the actual value and the forecast value, respectively, and n is
the number of observations.
$$\mathrm{MAPE} = \frac{100}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right| \qquad (1)$$
4.1.2 MAE
In statistics, the MAE is used to measure how close forecasts or predictions
are to the actual outcomes. It is calculated by taking the mean of the absolute
differences between the predicted values and the actual observed values. The
MAE is defined as shown in (2), where F_t is the predicted value, A_t is the
actual value, and n is the number of observations.
$$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \left| F_t - A_t \right| \qquad (2)$$
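Both metrics are short enough to compute by hand. The sketch below uses the standard definitions (mean of absolute percentage errors times 100, and mean of absolute errors) on made-up numbers; the function names and sample values are illustrative and are not results from the paper.

```python
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean Absolute Percentage Error, in percent.

    Note: undefined when any actual value is zero (division by zero).
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 / len(actual) * sum(abs((a - f) / a)
                                     for a, f in zip(actual, forecast))


def mae(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean Absolute Error, in the same unit as the load values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(f - a) for a, f in zip(actual, forecast)) / len(actual)


# Made-up loads: absolute errors are 10, 20 and 0.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
print(round(mape(actual, forecast), 2))  # 6.67
print(mae(actual, forecast))             # 10.0
```

Here the percentage errors are 10%, 10% and 0%, so the MAPE is 20/3 ≈ 6.67%, while the MAE is (10 + 20 + 0)/3 = 10 in load units.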
5 Simulation Results and Discussion
We evaluated the performance of our proposed forecasting technique for
predicting the load. We performed a large number of experiments. The results
obtained from these simulations are discussed in this section.
5.1 Data Set Description
We used AEMO load data for the year 2017. The dataset records the electricity
load every 30 min, making 48 lags daily. The dataset has records for 12
different areas for the same year, and we considered all 365 days of the year.
Figure 2 plots the load profile for individual weekdays, i.e., each weekday has
its own line. From the graph we can see that Saturday has the highest overall
load of the days in the selected week, while Wednesday has the lowest. Figure 3
displays the load of two consecutive weeks. It is clear from Figs. 2 and 3 that
the electricity load data has daily as well as weekly cycles. The load at a
specific time of day is more or less the same from one day to the next, and it
rises and falls much like on the previous and next day. Similarly, the load on
a given day of the week is close to the load on the same day in the previous
and next week. These cycles are due to the cycles in daily human activities. To
predict the load of a specific time lag, we combined the daily loads of the
same week to form a weekly dataset, i.e., we combined the data of the seven
days of a week into a single row. During conversion, we dropped the days at the
end of the year that did not form a complete week. This conversion provides
more features for forecasting the load of a time lag.
Fig. 2. Daily load by weekdays
Fig. 3. Two weeks load
5.2 Prediction Model Configurations
We used XGBoost, a gradient boosting framework introduced in 2014. XGBoost can
be used both for feature selection and for load prediction of a time lag. From
prediction to classification, XGBoost has proved its worth in terms of
performance.
When we convert the dataset to weekly form, we have 48 × 7 recordings for a
week. One row in the modified dataset represents a week with 336 features. To
understand the importance of these features, we used XGBoost to calculate the
Feature Importance (FI) of all of them.
Figure 4 depicts the FI of all these features. The greater the FI value, the
more the feature affects the load prediction. It is evident from Fig. 4 that
features with high importance values are few in number, while most of the
features have low importance values. We can see that the features close to the
predicted lag have high importance as well. It is also evident that features
from the same weekday as the one for which we are predicting the load, i.e.,
Sunday, have high importance values. We use only the features with high FI
values for prediction and eliminate all other features from the dataset. To
select the features, we set a threshold on feature importance; we chose this
threshold by repeating the experiment multiple times with different threshold
values. The value with the best results is taken as the threshold. Because
Fig. 4 shows that few features have a high FI value, this selection also saves
running time.
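The repeated-experiment threshold search can be pictured as a simple sweep: for each candidate threshold, keep the features whose importance exceeds it, score the resulting feature set, and retain the threshold with the lowest error. In the sketch below the `evaluate` callback stands in for training and testing a model on the selected features; the toy error function, the candidate values, and all names are assumptions for illustration, not the authors' procedure.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def select_by_threshold(importances: Sequence[float], threshold: float) -> List[int]:
    """Indices of features whose importance is strictly greater than threshold."""
    return [i for i, score in enumerate(importances) if score > threshold]


def best_threshold(importances: Sequence[float],
                   candidates: Iterable[float],
                   evaluate: Callable[[List[int]], float]) -> Tuple[float, float]:
    """Try each candidate threshold and return (threshold, error) with lowest error."""
    best = None
    for t in candidates:
        keep = select_by_threshold(importances, t)
        if not keep:          # a threshold that removes every feature is skipped
            continue
        err = evaluate(keep)  # e.g. test-set MAPE of a model trained on `keep`
        if best is None or err < best[1]:
            best = (t, err)
    if best is None:
        raise ValueError("no candidate threshold keeps any feature")
    return best


# Made-up importance scores and a toy error that is lowest for 3 features.
scores = [0.02, 0.40, 0.01, 0.30, 0.05, 0.22]
toy_error = lambda keep: float(abs(len(keep) - 3))
print(best_threshold(scores, [0.0, 0.04, 0.1, 0.25, 0.5], toy_error))  # (0.1, 0.0)
```

In a real run, `evaluate` would be replaced by a function that fits the forecasting model on the selected columns and returns its validation error.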
5.3 Forecasting Results
We used XGBoost to forecast the load for a specific time lag in a week using
weekly data. Figure 5 shows the real load in the dataset and the load
forecasted by XGBoost. Here the x-axis depicts the time lags and the y-axis the
load at that specific time lag. The actual load is represented by the blue
graph, and the forecasted load by the green graph. We can see that the XGBoost
load prediction follows the real load most of the time; however, at some
high-load instances XGBoost does not follow the real load exactly, i.e., it
does not predict well at high loads.
Fig. 4. XGB feature importance
Fig. 5. XGB predictions
The XGBoost load forecasting results for a time lag are displayed in Fig. 6.
We can see that the XGBoost forecasting technique results in a low Mean
Absolute Percentage Error (MAPE), high accuracy and high Mean Absolute Error
(MAE). XGBoost load prediction resulted in a 10.08% MAPE, 97.21% accuracy and
88.90% MAE.
Fig. 6. XGB results
6 Conclusion
In this paper, we proposed a new scheme for electricity load forecasting. We
converted daily electricity load information into weekly load information,
which increases the number of features available for predicting the load of a
lag variable. Then, we used XGBoost, a recently dominant machine learning
technique for time series prediction, for feature selection from the converted
data. Once the features are selected, we train the model using XGBoost. After
training, we use the trained model for load
References
1. Mathaba, T., Xia, X., Zhang, J.: Analysing the economic benefit of electricity price
forecast in industrial load scheduling. Electr. Power Syst. Res. 116, 158–165 (2014)
2. Sarada, K., Bapiraju, V.: Comparison of day-ahead price forecasting in energy
market using Neural Network and Genetic Algorithm. In: Proceeding of the Inter-
national Conference on Smart Electric Grid, pp. 1–5 (2014)
3. Shafie-Khah, M., Moghaddam, M.P., Sheikh-El-Eslami, M.: Price forecasting of
day-ahead electricity markets using a hybrid forecast method. Energy Convers.
Manag. 52(5), 2165–2169 (2011)
4. Garcia, R.C., Contreras, J., Van Akkeren, M., Garcia, J.B.C.: A GARCH forecast-
ing model to predict day-ahead electricity prices. IEEE Trans. Power Syst. 20(2),
867–874 (2005)
5. Shahidehpour, M., Yamin, H., Li, Z.: Market overview in electric power systems. In:
Market Operations in Electric Power Systems, pp. 1–20. Wiley, New York (2002)
6. Ci-wei, G., Bompard, E., Napoli, R., Cheng, H.: Price forecast in the competitive
electricity market by support vector machine. Phys. A: Stat. Mech. Appl. 382(1),
98–113 (2007)
7. Cai, Y., Lin, J., Wan, C., Song, Y.: A stochastic Bi-level trading model for an
active distribution company with distributed generation and interruptible loads.
IET Renew. Power Gener. 11(2), 278–288 (2017)
8. Weron, R.: Electricity price forecasting: a review of the state-of-the-art with a look
into the future. Int. J. Forecast. 30(4), 1030–1081 (2014)
9. Hu, L., Taylor, G.: A novel hybrid technique for short-term electricity price fore-
casting in UK electricity markets. J. Int. Counc. Electr. Eng. 4(2), 114–120 (2014)
10. Voronin, S., Partanen, J.: Forecasting electricity price and demand using a hybrid
approach based on wavelet transform, ARIMA and neural networks. Int. J. Energy
Res. 38(5), 626–637 (2014)
11. Kou, P., Liang, D., Gao, L., Lou, J.: Probabilistic electricity price forecasting with
variational heteroscedastic Gaussian process and active learning. Energy Convers.
Manag. 89, 298–308 (2015)
12. Shrivastava, N.A., Panigrahi, B.K.: A hybrid wavelet-ELM based short term price
forecasting for electricity markets. Int. J. Electr. Power Energy Syst. 55, 41–50
13. He, K., Xu, Y., Zou, Y., Tang, L.: Electricity price forecasts using a curvelet
denoising based approach. Phys. A: Stat. Mech. Appl. 425, 1–9 (2015)
14. Wan, C., Xu, Z., Wang, Y., Dong, Z.Y., Wong, K.P.: A hybrid approach for prob-
abilistic forecasting of electricity price. IEEE Trans. Smart Grid 5(1), 463–470
15. Wan, C., Niu, M., Song, Y., Xu, Z.: Pareto optimal prediction intervals of electricity
price. IEEE Trans. Power Syst. 32(1), 817–819 (2017)
16. Liu, B., Nowotarski, J., Hong, T., Weron, R.: Probabilistic load forecasting via
quantile regression averaging on sister forecasts. IEEE Trans. Smart Grid 8(2), 1
17. Wang, P., Liu, B., Hong, T.: Electric load forecasting with recency effect: a big
data approach. Int. J. Forecast. 32, 585–597 (2016)
18. Chou, J.-S., Ngo, N.-T.: Smart grid data analytics framework for increasing energy
savings in residential buildings. Autom. Constr. 72, 247–257 (2016)
19. Ludwig, N., Feuerriegel, S., Neumann, D.: Putting big data analytics to work:
feature selection for forecasting electricity prices using the LASSO and random
forests. ISSN 1246-0125 (Print) 2116-7052
20. Lago, J., De Ridder, F., De Schutter, B.: Forecasting spot electricity prices: deep
learning approaches and empirical comparison of traditional algorithms. Appl.
Energy 221, 386–405 (2018)
21. Wang, L., Zhang, Z., Chen, J.: Short-term electricity price forecasting with stacked
denoising autoencoders. IEEE Trans. Power Syst. 32(4), 2673–2681 (2017)
22. Lago, J., De Ridder, F., Vrancx, P., De Schutter, B.: Forecasting day-ahead elec-
tricity prices in Europe: the importance of considering market integration. Appl.
Energy 211(1), 890–903 (2018)
23. Raviv, E., Bouwman, K.E., van Dijk, D.: Forecasting day-ahead electricity prices:
utilizing hourly prices. Energy Econ. 50, 227–239 (2015)
24. Xiao, L., Jianzhou Wang, R., Hou, J.W.: A combined model based on data pre-
analysis and weight coefficients optimization for electrical load forecasting. Energy
82(15), 524–549 (2015)
25. Singh, S., Yassine, A.: Big data mining of energy time series for behavioral analytics
and energy consumption forecasting. Energies 11, 452 (2018)
26. Moon, J., Kim, K.-H., Kim, Y., Hwang, E.: A short-term electric load forecasting
scheme using 2-stage predictive analytics. In: 2018 IEEE International Conference
on Big Data and Smart Computing (2018)
27. González, J.P., San Roque, A.M., Perez, E.A.: Forecasting functional time series
with a new Hilbertian ARMAX model: application to electricity price forecasting.
IEEE Trans. Power Syst. 33(1), 545–556 (2018)
28. Luo, J., Hong, T., Fang, S.-C.: Benchmarking robustness of load forecasting models
under data integrity attacks. Int. J. Forecast. 34, 89–104 (2018)
29. Dong, G., Chen, Z.: Data driven energy management in a home microgrid based
on Bayesian optimal algorithm. IEEE Trans. Ind. Inform
30. Fausett, L.: Fundamentals of Neural Networks: Architectures, Algorithms, and
Applications. Pearson Education, Delhi (2006)
31. Shekhar, S., Amin, M.B.: Generalization by neural networks. IEEE Trans. Knowl.
Data Eng. 4, 177–185 (1992)
... In the following, we detail these categories. No × [6] No × [7] No × [8] No × [9] No × [10] Yes × Hybrid [11] No × × [12] Yes × × [13] No × × [14] No × × × [15] No × × [16] Yes × × ...
... In the same study, the authors mentioned that their improved algorithm enhanced the performance of SVR from three aspects, including prediction accuracy, robustness, and generalization capabilities. In another study, the XGBoost model has been used to construct a prediction model to forecast electricity load [9]. The authors outlined that their model can capture the non-linear relationships in a small-scale dataset provided by the Australian Energy Market Operator (AEMO). ...
Full-text available
With the steady growth of energy demands and resource depletion in today’s world, energy prediction models have gained more and more attention recently. Reducing energy consumption and carbon footprint are critical factors for achieving efficiency in sustainable cities. Unfortunately, traditional energy prediction models focus only on prediction performance. However, explainable models are essential to building trust and engaging users to accept AI-based systems. In this paper, we propose an explainable deep learning model, called Expect, to forecast energy consumption from time series effectively. Our results demonstrate our proposal’s robustness and accuracy when compared to the baseline methods.
... Based on the hybrid method of genetic algorithm and machine learning, Zhang, 2017 extracted the stress analysis characteristics of the plate and predicted the defects. The tree algorithm integrated with the XGBoost (Chen and Guestrin, 2016;Abbasi et al., 2019;Dong et al., 2020;Kim et al., 2020) was introduced into the regularization parameter, effectively avoiding the over-fitting phenomenon. The superposition of numerous decision trees improved the calculation accuracy, while the iterative efficiency was enhanced by the second-order Taylor expansion of the objective function. ...
Full-text available
This study conducted ten freeze-thaw cyclic tests to clarify the effect of freeze-thaw cycles on the forces acting on the buried oil pipeline. The stress evolution in the Q345 steel pipeline versus the number of freeze-thaw cycles was obtained. The test results were consistent with the COMSOL simulation of the effect of different moisture contents on the pipeline bottom stress. Besides the proposed XGBoost model, eleven machine-learning stress prediction models were also applied to 10–20 freeze-thaw cycling tests. The results showed that during the freeze-thaw process, the compressive stress at the pipeline bottom did not exceed −69.785 MPa. After eight freeze-thaw cycles, the extreme value of the principal stress of -252.437MPa, i.e., 73.17% of the yield stress, was reached. When the initial moisture content exceeded 20%, the eighth freeze-thaw cycle’s pipeline stress decreased remarkably. The XGBoost model effectively predicted the pipeline’s principal stress in each cycle of 10 freeze-thaw cyclic tests, with R2 = 0.978, MSE = 0.021, and MAE = 0.102. The above compressive stress fluctuated from −131.226 to −224.105 MPa. The predicted values well matched the experimental ones, being in concert with the “ratcheting effect” predicted by the freeze-thaw cycle theory. The results obtained provide references for the design, operation, and maintenance of buried oil pipelines.
... So far XGBoost has not been applied to a fault location problem in an electricity grid. Nevertheless, it presents multiple benefits showcased in a variety of applications [41][42][43] and its multifarious characteristics could provide a robust solution to a complex problem such as the fault location. Among its According to Eq. 5, the additive function of the boosting trees can be expressed by the prediction model equation as follows: ...
Full-text available
As the need for automatization of the electricity grid’s fault diagnosis schemes is rising, the application of technologies such as the artificial intelligence (AI) can provide practical solutions to the problem. AI can overcome the challenges that complex topologies like those of the low voltage (LV) smart grids pose and prove to be a powerful tool in the development of advanced fault diagnosis methods. An important parameter for the success of any AI-based method is the quality of data. Therefore, in this paper a data analysis is performed in order to evaluate the type of data produced by a small LV grid and an representative AI algorithm’s response to those. In the context of this analysis, the most important features and meters were identified. Furthermore, as a response to the large volume of available data, a data management strategy is proposed. The strategy combines original and reshaped features. For this purpose, five dimensionality reduction methods are tested and compared. Truncated-SVD is deemed the most appropriate and is subsequently utilized for the reshaping of the dataset that is introduced to the XGBoost fault location model. The integration of the dimensionality reduction technique in the algorithm results in the decrease of the computational time and the dataset’s size and in a higher generalizability of the algorithm. Thus, the application of the proposed method is not limited by the grid’s topology. The method’s robustness was verified against various influencing parameters such as the fault resistance, the size of the dataset, the loss of data and the photovoltaics’ penetration level. The overall algorithm achieved a mean squared error of 13.26 and a training and test accuracy of more than 99% when tested on the CIGRE LV benchmark grid.
... Finally, this method has seen multiple uses in various areas, such as energy [26], structural integrity [27], and health [28]. More specifically, in the forecast domain, XGBoost has been applied for short-term wind production forecast [29], outperforming traditional artificial neural network (ANN), long short-term memory (LSTM) recurrent neural network, and temporal convolutional networks (TCN) models and also presenting extremely good results for time-series forecast in terms of results, computational resources, and memory use for load forecasting [30]. In [31], XGBoost is used for photovoltaic generation. ...
The use of energy sharing models in smart grids has been widely addressed in the literature. However, feasible technical solutions that can deploy these models in practice, as well as the correct use of energy forecasts, have not been properly addressed. This paper proposes a simple, yet viable and feasible, solution to deploy energy management systems on the end-user side in order to enable not only energy forecasting but also a distributed discriminatory-price auction peer-to-peer energy transaction market. This work also analyses the impact of four energy forecasting models on energy transactions: a mathematical model, a support-vector machine model, an eXtreme Gradient Boosting model, and a TabNet model. To test the proposed solution and models, the system was deployed in five small offices and three residential households, achieving a maximum energy cost reduction of 10.89% within the community, ranging from 0.24% to 57.43% for each individual agent. The results demonstrated the potential of peer-to-peer energy transactions to promote energy cost reductions and enable the validation of auction-based energy transactions and the use of energy forecasting models in today’s buildings and end-users.
... The most popular ML models used in the field are NNs (Li, 2020) and, most recently, deep learning (Oreshkin et al., 2021). Nevertheless, regression-tree-based ML models, like XGBoost, have also become popular, showing promising results in various load forecasting applications (Aguilar Madrid & Antonio, 2021; Abbasi et al., 2019; Liu et al., 2018), while being relatively faster to compute and easier to parameterize than NNs. Finally, hybrid forecasting methods involve the integration of time series or ML models with the objective to mitigate model uncertainty (Bozkurt et al., 2017). ...
This study introduces an energy management method that smooths electricity consumption and shaves peaks by scheduling the operating hours of water pumping stations in a smart fashion. Machine learning models are first used to accurately forecast the electricity consumed and produced by renewable energy sources on an hourly level. Then, the forecasts are exploited by an algorithm that optimally allocates the operating hours of the pumps with the objective to minimize predicted peaks. Constraints related with the operation of the pumps are also considered. The performance of the proposed method is evaluated considering the case of a Greek remote island, Tilos. The island involves an energy management system that facilitates the monitoring and control of local water pumping stations that support residential water supply and irrigation. Results indicate that smart scheduling of water pumps in a small-scale island environment can reduce the daily and weekly deviation of electricity consumption by more than 15% at no monetary cost. It is also concluded that the potential gains of the proposed approach are strongly connected with the amount of load that can be shifted each day, the accuracy of the forecasts used, and the amount of electricity produced by renewable energy sources.
... In the experiments, the XGB model achieved better prediction performance than Bagging, RF, and CRF. Abbasi et al. [30] proposed a 30-minute-interval electrical load forecasting model based on XGB. They used variable importance to extract input variables from the historical load during a week and confirmed that the historical loads close to the prediction time point and from a week before the prediction time point had high importance for the model construction. ...
Daily peak load forecasting (DPLF) and total daily load forecasting (TDLF) are essential for optimal power system operation from one day to one week later. This study develops a Cubist-based incremental learning model to perform accurate and interpretable DPLF and TDLF. To this end, we employ time-series cross-validation to effectively reflect recent electrical load trends and patterns when constructing the model. We also analyze variable importance to identify the most crucial factors in the Cubist model. In the experiments, we used two publicly available building datasets and three educational building cluster datasets. The results showed that the proposed model yielded averages of 7.77 and 10.06 in mean absolute percentage error and coefficient of variation of the root mean square error, respectively. We also confirmed that temperature and holiday information are significant external factors, and electrical loads one day and one week ago are significant internal factors.
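The two error measures reported in this abstract can be illustrated with a minimal Python sketch. It assumes the common textbook definitions (MAPE as the mean of absolute relative errors, CV-RMSE as the RMSE divided by the mean of the actual values, both in percent); the abstract does not spell out its formulas, and the function names are illustrative.

```python
import math

def mape(actual, forecast):
    """Mean absolute percentage error in percent (assumes no actual value is zero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n

def cv_rmse(actual, forecast):
    """RMSE normalized by the mean of the actual values, in percent."""
    n = len(actual)
    rmse = math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)
    return 100.0 * rmse / (sum(actual) / n)

# Toy daily loads and forecasts (made-up numbers, not from any dataset).
actual = [100.0, 200.0]
forecast = [110.0, 180.0]
print(mape(actual, forecast), cv_rmse(actual, forecast))
```

MAPE weights each time step by its own relative error, while CV-RMSE penalizes large absolute misses more heavily, which is one reason studies often report both.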
... ANN achieved better accuracy than SVM models with daily data. Abbasi et al. [32] proposed an extreme gradient boosting (XGBoost) electrical load forecasting model, using feature importance to extract input variables from historical load over a week. They verified that historical loads close to or a week before the prediction time point had high importance for model construction. ...
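The idea of ranking historical loads as candidate inputs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it builds lagged inputs from a load series and ranks the lags by absolute Pearson correlation with the target, which is a simple stand-in and not the XGBoost feature importance used in the cited work. The helper names and the toy series are hypothetical.

```python
import math

def lag_matrix(series, n_lags):
    """Return (X, y): each row of X holds the n_lags values preceding the target y."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        # Column k-1 holds the value observed k steps before the target.
        X.append([series[t - k] for k in range(1, n_lags + 1)])
        y.append(series[t])
    return X, y

def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (v - mb) for x, v in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def rank_lags(series, n_lags):
    """Rank lags 1..n_lags by absolute correlation with the target (a proxy for importance)."""
    X, y = lag_matrix(series, n_lags)
    scores = [(k, abs(pearson([row[k - 1] for row in X], y))) for k in range(1, n_lags + 1)]
    return sorted(scores, key=lambda s: s[1], reverse=True)

# Toy series repeating every 4 steps, standing in for a weekly cycle.
series = [float((t % 4) * 10) for t in range(80)]
print(rank_lags(series, 6)[0])
```

On the toy series the lag equal to the cycle length ranks first, mirroring the reported finding that loads one cycle (one week) before the prediction point are highly informative. With 30-minute data, one week would correspond to 336 lags.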
Smart grids have attracted much attention recently for their potential to reduce power system operating and management costs. Smart grid core components include energy storage, renewable energy source(s), and smart meters. Smart meters collect diverse data regarding smart grid operation, which can lead to inefficient operation if the meter data are damaged or tampered with during collection or transmission. Therefore, it is important to identify abnormalities in smart grid data and process them accordingly. Various anomaly detection models have been proposed using statistical methods, but they cannot detect some anomaly patterns accurately, and the models generally did not consider repair strategies for the detected anomalies. Anomaly repair should be included with model training to improve forecasting performance. This paper proposes a robust sliding window-based LightGBM model for short-term load forecasting using anomaly detection and repair. We first show how to detect anomalies using a variational autoencoder and then how they can be repaired using a random forest method. Finally, we verify that the proposed sliding window-based LightGBM achieves superior forecasting performance in combination with anomaly detection and repair.
Load forecasting is an approach used to foresee future load demand based on physical parameters such as line loading, temperature, losses, pressure, and weather conditions. This study specifically aims to optimize the parameters of deep convolutional neural networks (CNN) to improve short-term load forecasting (STLF) and medium-term load forecasting (MTLF), i.e. one day, one week, one month and three months ahead. The models were tested on a real-world case by conducting detailed experiments to validate their stability and practicality. The performance was measured in terms of squared error, Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE) and Mean Absolute Error (MAE). We optimized the parameters using three different cases. In the first case, we used a single layer with a Rectified Linear Unit (ReLU) activation function. In the second case, we used a double layer with ReLU – ReLU activation functions. In the third case, we used a double layer with ReLU – Sigmoid activation functions. The number of neurons in each case was 2, 4, 6, 8, 10 and 12. For one-day-ahead forecasting, the lowest prediction error was obtained using the double layer with ReLU – Sigmoid activation functions. For one-week-ahead forecasting, the lowest error was obtained using the single layer with the ReLU activation function. Likewise, for one-month-ahead and three-month-ahead forecasting, the double layer with ReLU – Sigmoid activation functions produced the lowest prediction error. The results reveal that optimizing the parameters further improved the ahead prediction performance. The results also show that predicting the nonstationary and nonlinear dynamics of ahead forecasting requires a more complex activation function and a larger number of neurons. The results can be very useful in real-time implementation of this model to meet load demands and for further planning.
Electrical load forecasting has a fundamental role in the decision-making process of energy system operators. When many users are connected to the grid, high-performance forecasting models are required, posing several problems associated with the availability of historical energy consumption data for each end-user and training, deploying and maintaining a model for each user. Moreover, introducing new end-users to an existing network poses problems relating to their forecasting model. Global models, trained on all available data, are emerging as the best solution in several contexts, because they show higher generalization performance, being able to leverage the patterns that are similar across different time series. In this work, the lodging/residential electricity 1-h-ahead load forecasting of multiple time series for smart grid applications is addressed using global models, suggesting the effectiveness of such an approach also in the energy context. Results obtained on a subset of the Great Energy Predictor III dataset with several global models are compared to results obtained with local models based on the same methods, showing that global models can perform similarly to the local ones, while presenting simpler deployment and maintainability. In this work, the forecasting of a new time series, representing a new end-user introduced in the pre-existing network, is also approached under specific assumptions, by using a global model trained using data related to the existing end-users. Results reveal that the forecasting model pre-trained on data related to other end-users allows the attainment of good forecasting performance also for new end-users.
For efficient working of the power system, an accurate approach for short-term load forecasting (STLF) is suggested. To improve the accuracy of forecasting, various weather conditions, such as temperature, humidity, dew point, wind chill, and wind speed, are considered and their impact on the accuracy of load forecasting is studied in detail in terms of Mean Absolute Percentage Error (MAPE), Root Mean Square Error (RMSE), and Maximum Error (ME). The proposed hybrid approach combines Support Vector Regression (SVR) and a fuzzy system, because SVR can forecast from a small dataset and the fuzzy system can handle non-linear weather conditions and the uncertainty of load in forecasting. For load forecasting, time of the day, historical load (i.e. the previous one-month hourly load), weather conditions, calendar days for the last 10 days, sunny time, temperature at the same time on the previous day, and average temperature of the last three hours are taken into account. The proposed approach provides accurate load forecasting for a day regardless of whether it is a working day or a holiday, uses fewer days for load prediction (the previous one month), and takes no special care for weekends. The suggested approach is tested on standard electricity datasets: EUNITE network 1997 and New England of America of 2012 and 2019. Simulation results show the effectiveness and superiority of the proposed approach when compared with other existing methods for daily load forecasting, viz. ANN, Bayesian, and Least Square Support Vector Machine, etc.
In this paper, a novel modeling framework for forecasting electricity prices is proposed. While many predictive models have been already proposed to perform this task, the area of deep learning algorithms remains yet unexplored. To fill this scientific gap, we propose four different deep learning models for predicting electricity prices and we show how they lead to improvements in predictive accuracy. In addition, we also consider that, despite the large number of proposed methods for predicting electricity prices, an extensive benchmark is still missing. To tackle that, we compare and analyze the accuracy of 27 common approaches for electricity price forecasting. Based on the benchmark results, we show how the proposed deep learning models outperform the state-of-the-art methods and obtain results that are statistically significant. Finally, using the same results, we also show that: (i) machine learning methods yield, in general, a better accuracy than statistical models; (ii) moving average terms do not improve the predictive accuracy; (iii) hybrid models do not outperform their simpler counterparts.
Responsible, efficient and environmentally aware energy consumption behavior is becoming a necessity for the reliable modern electricity grid. In this paper, we present an intelligent data mining model to analyze, forecast and visualize energy time series to uncover various temporal energy consumption patterns. These patterns define the appliance usage in terms of association with time such as hour of the day, period of the day, weekday, week, month and season of the year as well as appliance-appliance associations in a household, which are key factors to infer and analyze the impact of consumers’ energy consumption behavior and energy forecasting trend. This is challenging since it is not trivial to determine the multiple relationships among different appliances usage from concurrent streams of data. Also, it is difficult to derive accurate relationships between interval-based events where multiple appliance usages persist for some duration. To overcome these challenges, we propose unsupervised data clustering and frequent pattern mining analysis on energy time series, and Bayesian network prediction for energy usage forecasting. We perform extensive experiments using real-world context-rich smart meter datasets. The accuracy results of identifying appliance usage patterns using the proposed model outperformed Support Vector Machine (SVM) and Multi-Layer Perceptron (MLP) at each stage while attaining a combined accuracy of 81.82%, 85.90%, 89.58% for 25%, 50% and 75% of the training data size respectively. Moreover, we achieved energy consumption forecast accuracies of 81.89% for short-term (hourly) and 75.88%, 79.23%, 74.74%, and 72.81% for the long-term; i.e., day, week, month, and season respectively.
Motivated by the increasing integration among electricity markets, in this paper we propose two different methods to incorporate market integration in electricity price forecasting and to improve the predictive performance. First, we propose a deep neural network that considers features from connected markets to improve the predictive accuracy in a local market. To measure the importance of these features, we propose a novel feature selection algorithm that, by using Bayesian optimization and functional analysis of variance, evaluates the effect of the features on the algorithm performance. In addition, using market integration, we propose a second model that, by simultaneously predicting prices from two markets, improves the forecasting accuracy even further. As a case study, we consider the electricity market in Belgium and the improvements in forecasting accuracy when using various French electricity features. We show that the two proposed models lead to improvements that are statistically significant. Particularly, due to market integration, the predictive accuracy is improved from 15.7% to 12.5% sMAPE (symmetric mean absolute percentage error). In addition, we show that the proposed feature selection algorithm is able to perform a correct assessment, i.e. to discard the irrelevant features.
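The sMAPE figures quoted in this abstract can be made concrete with a small Python sketch. Several sMAPE variants exist; the one below, which divides the absolute error by the mean of the absolute actual and forecast values, is assumed here for illustration and may differ from the exact definition used in the cited paper. The function name is illustrative.

```python
def smape(actual, forecast):
    """Symmetric MAPE in percent: mean of |a - f| / ((|a| + |f|) / 2)."""
    total = 0.0
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2.0
        # Treat a point where both values are zero as a perfect forecast.
        total += abs(a - f) / denom if denom else 0.0
    return 100.0 * total / len(actual)

# Toy hourly prices and forecasts (made-up numbers).
print(smape([100.0, 100.0], [100.0, 50.0]))
```

Unlike MAPE, this variant stays bounded when actual prices approach zero, which is a common reason for preferring it in electricity price forecasting.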
Short-term forecasting of the electricity price with data-driven algorithms is studied in this research. A Stacked Denoising Autoencoder (SDA) model, a class of deep Neural Networks (DNN), and its extended version are utilized to forecast the electricity price hourly. Data collected in the Nebraska, Arkansas, Louisiana, Texas, and Indiana hubs in the U.S. are utilized. Two types of forecasting, online hourly forecasting and day-ahead hourly forecasting, are examined. In online forecasting, SDA models are compared with data-driven approaches including the classical neural networks (NN), support vector machine (SVM), multivariate adaptive regression splines (MARS), and least absolute shrinkage and selection operator (Lasso). In day-ahead forecasting, the effectiveness of SDA models is further validated through comparison with industrial results and a recently reported method. Computational results demonstrate that SDA models are capable of accurately forecasting electricity prices and that the extended SDA model further improves the forecasting performance.
The microgrid is a key enabling solution for future smart grids, integrating distributed renewable generators and storage systems to efficiently serve the local demand. However, due to the intermittency and uncertainty of distributed renewable energy, the reliable and economic operation of microgrids faces increasing challenges. Traditionally, economic dispatch is treated as an offline or online optimization problem whose objective function is known a priori. However, an accurate and deterministic function expression is difficult to formulate, and a wrong expression may waste electricity cost and cause security issues. Thus, it is desirable to reformulate the economic dispatch problem and solve it in a data-driven way. This paper proposes a data-driven energy management solution based on the Bayesian optimization algorithm (BOA) for a single grid-connected home microgrid. The proposed solution formulates the optimization problem without a closed-form objective function expression and solves it using a BOA-based data-driven framework. The proposed solution is a kind of sequential global optimization strategy for black-box functions and does not require derivatives of the objective function. Besides, it can also handle the uncertainty in microgrid operation and parameter prediction. Simulation results demonstrate the effectiveness of the proposed solution.
As the internet's footprint continues to expand, cybersecurity is becoming a major concern for both governments and the private sector. One such cybersecurity issue relates to data integrity attacks. This paper focuses on the power industry, where the forecasting processes rely heavily on the quality of the data. Data integrity attacks are expected to harm the performances of forecasting systems, which will have a major impact on both the financial bottom line of power companies and the resilience of power grids. This paper reveals the effect of data integrity attacks on the accuracy of four representative load forecasting models (multiple linear regression, support vector regression, artificial neural networks, and fuzzy interaction regression). We begin by simulating some data integrity attacks through the random injection of some multipliers that follow a normal or uniform distribution into the load series. Then, the four aforementioned load forecasting models are used to generate one-year-ahead ex post point forecasts in order to provide a comparison of their forecast errors. The results show that the support vector regression model is most robust, followed closely by the multiple linear regression model, while the fuzzy interaction regression model is the least robust of the four. Nevertheless, all four models fail to provide satisfying forecasts when the scale of the data integrity attacks becomes large. This presents a serious challenge to both load forecasters and the broader forecasting community: the generation of accurate forecasts under data integrity attacks. We construct our case study using the publicly-available data from Global Energy Forecasting Competition 2012. At the end, we also offer an overview of potential research topics for future studies.
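The attack simulation described in this abstract, random injection of multipliers drawn from a normal or uniform distribution into the load series, can be sketched as follows. This is a minimal illustration, not the authors' procedure: the share of attacked points, the distribution parameters, and the function name `inject_attack` are all assumptions.

```python
import random

def inject_attack(load, fraction, dist="normal", mu=1.0, sigma=0.1,
                  low=0.5, high=1.5, seed=0):
    """Multiply a random subset of load values by random multipliers.

    Returns the attacked copy of the series and the sorted attacked indices.
    """
    rng = random.Random(seed)          # fixed seed for reproducibility
    attacked = list(load)              # leave the original series untouched
    n_attacked = int(round(fraction * len(load)))
    indices = rng.sample(range(len(load)), n_attacked)
    for i in indices:
        if dist == "normal":
            multiplier = rng.gauss(mu, sigma)
        elif dist == "uniform":
            multiplier = rng.uniform(low, high)
        else:
            raise ValueError("dist must be 'normal' or 'uniform'")
        attacked[i] = load[i] * multiplier
    return attacked, sorted(indices)

# Toy flat load series; attack 25% of the points with uniform multipliers.
load = [100.0] * 20
attacked, indices = inject_attack(load, 0.25, dist="uniform", low=1.2, high=1.5)
print(indices)
```

Refitting a forecasting model on the attacked series and comparing its error against the clean baseline gives the kind of robustness comparison the abstract describes.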
A functional time series is the realization of a stochastic process where each observation is a continuous function defined on a finite interval. These processes are commonly found in electricity markets and are gaining more importance as more market data become available and markets head toward continuous-time marginal pricing approaches. Forecasting these time series requires models that operate with continuous functions. This paper proposes a new functional forecasting method that attempts to generalize the standard seasonal ARMAX time series model to the L² Hilbert space. The structure of the proposed model is a linear regression where functional parameters operate on functional variables. The variables can be lagged values of the series (autoregressive terms), past observed innovations (moving average terms), or exogenous variables. In this approach, the functional parameters used are integral operators whose kernels are modeled as linear combinations of sigmoid functions. The parameters of each sigmoid are optimized using a Quasi-Newton algorithm that minimizes the sum of squared errors. This novel approach allows us to estimate the moving average terms in functional time series models. The new model is tested by forecasting the daily price profile of the Spanish and German electricity markets and it is compared to other functional reference models.
With the rapid development of distributed generations (DGs) and interruptible loads (ILs), a distribution network company can actively purchase electricity in the market instead of acting as a traditional passive purchaser. This study proposes a stochastic bi-level model-based strategic trading model for an active distribution company (ADisCo) which operates the active distribution network (ADN) to maximise its profit in the electricity market. Uncertainties pertaining to the bidding and offering prices of other market rivals, the imbalance prices in the balance market and the productions of DGs are considered via stochastic programming. Besides, a linear ADN operation model is proposed to ensure ADN operation security within the stochastic programming model. The proposed model is initially formulated as a stochastic bi-level model, where the upper-level problem represents the maximisation of the profit of the ADN operator, whereas the lower-level model represents the maximisation of the social welfare in market clearing from the perspective of the independent system operator. On the basis of the complementarity theory, the proposed model can be transformed into a mixed integer linear programming model. Case studies demonstrate the efficiency and effectiveness of the proposed strategic trading model for an ADisCo with DGs and ILs.