Daily Power Demand Prediction for Buildings at A Large Scale using A Hybrid of Physics-based Model and Generative Adversarial Network
Chenlu Tian^a, Yunyang Ye^b, Yingli Lou^c, Wangda Zuo^c,*, Guiqing Zhang^a, Chengdong Li^a
^a Shandong Key Laboratory of Intelligent Building Technology, School of Information and Electrical Engineering, Shandong Jianzhu University, Jinan 250101, China
^b Pacific Northwest National Laboratory, Richland, WA 99354, U.S.A
^c Department of Civil, Environmental and Architectural Engineering, University of Colorado Boulder, Boulder, CO 80309, U.S.A
Abstract
Power demand prediction for buildings at a large scale is required for power grid operation. The bottom-up prediction method using physics-based models is popular, but has some limitations such as a heavy workload on model creation and long computing time. Top-down methods based on data-driven models are fast, but less accurate. Considering the similarity of power demand patterns of single buildings and the superiority of Generative Adversarial Network (GAN), this paper proposes a new method (E-GAN), which combines a physics-based model (EnergyPlus) and a data-driven model (GAN), to predict the daily power demand for buildings at a large scale. The new E-GAN method selects a small number of typical buildings and utilizes EnergyPlus models to predict their power demands. Utilizing the prediction for those typical buildings, the GAN then is adopted to forecast the power demands of a large number of buildings. To verify the proposed method, the E-GAN is used to predict 24-hour power demands for a set of residential buildings. The results show that (1) 4.3% of physics-based models in each building category are required to ensure the prediction accuracy; (2) compared with the physics-based model, the E-GAN can predict power demand accurately with only 5% error (measured by Mean Absolute Percentage Error, MAPE) while using only approximately 9% of the computing time; and (3) compared with data-driven models (e.g., Support Vector Regression, Extreme Learning Machine, and polynomial regression model), E-GAN demonstrates at least 60% reduction in prediction error measured by MAPE.
Keywords: Large-scale Simulation; Power Demand; Generative Adversarial Networks; Building Energy Model
Corresponding author
Email addresses: chenlutian2017@sdjzu.edu.cn (Chenlu Tian), yunyang.ye@pnnl.gov (Yunyang
Ye), Yingli.Lou@colorado.edu (Yingli Lou), Wangda.Zuo@Colorado.edu (Wangda Zuo),
qqzhang@sdjzu.edu.cn (Guiqing Zhang), lichengdong@sdjzu.edu.cn (Chengdong Li)
Preprint submitted to Building Simulation February 4, 2022
C. Tian, Y. Ye, Y. Lou, W. Zuo, G. Zhang, C. Li 2022. "Daily Power Demand Prediction for Buildings at A Large Scale Using A Hybrid of Physics-based Model and Generative Adversarial Network." Building Simulation, https://doi.org/10.1007/s12273-022-0887-y
1. Introduction
The power grid is critical for social and economic development. It is required to provide stable, sufficient, and persistent electricity to all end users. During power grid operation, power production and consumption are not always balanced, which can cause unstable situations and increase the operation cost. To increase stability and reduce the cost, significant attention has been paid to energy planning and optimization of the power grid, in which power demand prediction is essential [1].
Buildings are the main contributor to power demand [2, 3]. In the U.S., buildings account for 74% of power consumption and up to 80% of peak demand [4]. The power demand in buildings is also more flexible than that of many other end users, such as transportation [5, 6]. Thus, a great potential for power demand adjustment lies in the building sector because of its flexibility [7]. As a result, building power demand prediction is indispensable for power grid operation.
In the past, many power demand prediction models for a single building have been established [8, 9]. Rather than single buildings, power grid system operators are more interested in the potential power demand of buildings at a large scale, such as the urban or community scale. Compared with power demand prediction for a single building, large-scale power demand forecasting is different because 1) the number of buildings is large, 2) the physical parameters can vary significantly between buildings, and 3) the power consumption of each building is influenced by many factors such as building physical parameters, heating plant, etc. [10]. Thus, exploring the relationship between the overall power demand of buildings at a large scale and its influencing factors is difficult.
To achieve power demand prediction for buildings at a large scale, researchers have proposed various methods [11]. Bottom-up prediction is a common method in current research. It predicts the power demand of each building separately, and the results are then aggregated to obtain the final predicted power demand for the buildings at a large scale. Physics-based models have been widely applied in the bottom-up prediction method [12]. For example, both Oak Ridge National Laboratory and National Renewable Energy Laboratory created platforms using physics-based models to evaluate building energy efficiency and power demand response strategies [13, 14]. Massachusetts Institute of Technology created an urban model for Boston based on physics-based phenomena to get the details of electrical and natural gas consumption [15]. The platforms introduced above adopt EnergyPlus, a popular physics-based building energy simulation tool, as their simulation engine [16]. In summary, the bottom-up method using physics-based models has gained much attention and shown its advantages. However, some challenges still exist in such methods. For example, the power demand prediction models of all single buildings need to be created separately. Collecting detailed data is a heavy workload. The computing time is often long, and the modeling process also requires expert knowledge [15].
To address these challenges, some researchers have tried to develop top-down prediction methods [17]. These methods explore the intrinsic relations between the overall power demand for buildings at a large scale and other parameters such as climate/weather conditions, economic factors, etc. For top-down prediction, data-driven models, including machine learning models, are often adopted to find the explicit relationship between the power demand for buildings at a large scale and other parameters. For example, Ahmed Gassar et al. [18] developed and compared four types of data-driven urban-scale energy consumption models which used different machine learning methods including multilayer neural network (MNN), polynomial regression, random forest (RF), and gradient boosting algorithms. Among these four types of data-driven models, the MNN was found to be the best model. Wang et al. [19] developed a new method that adopted the popular long short-term memory networks to capture the district-scale energy dynamics, in which the correlation between adjacent buildings was also studied. In [20], linear regression, RF, and support vector regression (SVR) algorithms were utilized to achieve city-scale energy consumption prediction, and it was found that linear regression had the best performance. Although data-driven models have gained noticeable achievements in current top-down prediction methods, they still have some limitations. Compared with the bottom-up method based on physics-based models, these data-driven top-down methods have difficulties in finding the complex relationships between the large-scale power demand and other parameters, so the prediction accuracy is not always acceptable [21].
It can be concluded from the research above that 1) bottom-up methods using physics-based models have been applied in many real cases. Even though such methods are widely applied and rely less on historical power demand data, the long computing time and heavy workload on model creation cannot be ignored. 2) Top-down methods using data-driven models have also been adopted and can achieve fast prediction. However, the prediction accuracy is not always acceptable. Thus, it is essential to create a new method that can achieve the prediction in a shorter time than physics-based models but has higher prediction accuracy than data-driven models. Since physics-based and data-driven models have their specific advantages, it is possible to combine them to solve the encountered problems and achieve better prediction.
Generative Adversarial Network (GAN) is a newly proposed generative deep learning method that can find the distribution of original data and generate data with a similar distribution to the original data via an adversarial process [22, 23]. GAN has been utilized in many domains including building energy simulation to achieve data enhancement, parallel prediction, data imputation, etc. [24, 25]. For example, in [24], a parallel prediction method using GAN was proposed for building energy consumption prediction. In the proposed method, the GAN was utilized to generate artificial energy consumption data. Wang [26] used GAN to generate the building electric load, and the generated electric load could capture both the general trend and stochastic dynamics. In [27], a conditional GAN was presented to generate daily building electric load and predict a daily profile (24 hours of the day).
Even though the number of buildings is large for large-scale building prediction, the power demand patterns of some buildings are similar due to the similarities in building physics, electrical equipment, etc. Considering the similar power demand patterns of buildings, GAN provides a new way to achieve power demand prediction for buildings at a large scale. It is possible that one only needs to create physics-based models for a small subset of typical buildings, and then based on the simulated power demand data of the typical buildings, the GAN can be utilized to predict the power demand of buildings at a large scale. Therefore, this paper proposes a new hybrid methodology utilizing both deep learning (GAN) and physics-based (EnergyPlus) models to achieve daily power demand prediction for buildings at a large scale, referred to as E-GAN in this paper. For E-GAN, firstly, the buildings are classified into different categories using Density-Based Spatial Clustering of Applications with Noise (DBSCAN). Then, a small number of buildings are selected from each category as the typical buildings. Next, EnergyPlus is adopted to predict the power demand of a small sample of typical buildings, and GAN is utilized to forecast power demand for all buildings via transforming the small power demand dataset to a large dataset. Finally, the predicted power demands from GAN are aggregated to be the predicted power demand for buildings at a large scale.
The contributions of this paper can be summarized as follows.
• This paper proposes a new method combining a physics-based model (EnergyPlus) and a data-driven model (GAN) to predict the power demand for buildings at a large scale.
• In the proposed method, only a small number of physics-based models of typical buildings need to be constructed, and GAN is adopted to forecast the power demands of all buildings by utilizing the power demands of typical buildings.
• The proposed method can significantly reduce the computing time compared with bottom-up methods using physics-based models and obtain more accurate prediction results compared with top-down methods using data-driven models.
• This methodology gives guidance to those who need to deal with power demand prediction for buildings at a large scale.
The remainder of this paper is structured as follows. Section 2 introduces the proposed method of predicting power demand. In Section 3, the residential building sector in the city of Colorado Springs is selected as an example to demonstrate the proposed method. Finally, Section 4 concludes the findings of this paper.
2. Methodology
Figure 1 shows the workflow of the proposed E-GAN method. The E-GAN includes model preparation and prediction. The model preparation part (Phases 1-3) aims to select a small sample of typical buildings and develop physics-based models for typical buildings and GAN models for large-scale prediction. Model preparation is composed of building power demand classification (Phase 1), pre-prediction for determining typical buildings (Phase 2), and physics-based model creation (Phase 3). After that, the prediction part (Phase 4) produces the daily (e.g., 24-hour) power demand prediction. It should be noted that the model preparation part only needs to be run once at the beginning of the workflow. After the typical buildings are determined, Phase 4 of the proposed method achieves the final prediction. EnergyPlus predicts the daily power demand of the typical buildings, then the predicted small sample of power demands is input to the GAN to predict the power demands of all buildings. Finally, the predicted power demands from the GAN are aggregated as the final predicted power demand. The details of the proposed E-GAN method are introduced in the following subsections.
Figure 1: The workflow of the proposed E-GAN method
2.1. Phase 1: Building Classification based on Power Demand
The GAN is vital in transforming small datasets into large datasets in the E-GAN. However, the performance of the GAN is significantly influenced by the input data. The GAN will be unstable and the quality of generated data cannot be guaranteed if the distributions of the input data differ greatly from each other. Thus, to guarantee the prediction accuracy, it is necessary to group buildings with similar power demand patterns into the same category, and then each category can utilize one GAN to achieve its overall power demand prediction. A popular method for building classification is to identify the key parameters with the most influence on the power demand, then divide the buildings into different categories according to the similarity in the key parameters [28, 29]. However, this method is complex and infeasible, especially when many parameters need to be considered. An alternative is clustering methods [29]. This study adopts one clustering algorithm, DBSCAN, for building classification [30]. In the building classification process, DBSCAN clusters the daily power demand profiles, then buildings are divided into different categories according to the clustering results. Buildings in the same category are deemed to have similar daily power demand profiles. This method concentrates on the power demand profiles directly and builds the relationships between building categories and the power demand patterns.
2.1.1. Data Preparation
The historical power demands of each building are normalized between 0 and 1 as
$$\hat{x}_i^t = \frac{x_i^t - \min_t x^t}{\max_t x^t - \min_t x^t}, \tag{1}$$
where $\hat{x}_i^t$ is the normalized power demand of building $i$ at time $t$, $x_i^t$ is the original power demand of building $i$ at time $t$, $\min_t x^t$ is the minimum value of the historical power demand of all buildings at time $t$, and $\max_t x^t$ is the maximum value of the historical power demand of all buildings at time $t$.
Assuming the prediction horizon is $n$ hours ($n = 24$ in this study), the average, maximum, and minimum of the normalized 24-hour power demands are extracted. The three features and the normalized 24-hour power demands are combined as the new sequence input to DBSCAN. The new data sequence on the $d$th day is as follows:
$$\hat{x}_{i,d} = [\hat{x}_{i,d}^{1}, \hat{x}_{i,d}^{2}, \cdots, \hat{x}_{i,d}^{24}, \hat{x}_{i,d}^{avr}, \hat{x}_{i,d}^{max}, \hat{x}_{i,d}^{min}], \tag{2}$$
where $\hat{x}_{i,d}^{avr}$ is the average value of the normalized 24-hour power demand on the $d$th day, $\hat{x}_{i,d}^{max}$ is the maximum of the normalized 24-hour power demand on the $d$th day, and $\hat{x}_{i,d}^{min}$ is the minimum of the normalized 24-hour power demand on the $d$th day.
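The data preparation in Eqs. (1) and (2) can be illustrated with a minimal Python sketch. It assumes one day of hourly demands per building stored as plain lists, reads Eq. (1) as a per-hour min-max scaling across all buildings, and maps an hour with zero spread to 0.0; the function names `normalize_by_hour` and `feature_sequence` are hypothetical and not taken from the paper.

```python
def normalize_by_hour(demands):
    """Min-max normalize each hour across all buildings, following Eq. (1).

    demands: list of per-building lists of hourly values (equal length).
    """
    n_hours = len(demands[0])
    normalized = [[0.0] * n_hours for _ in demands]
    for t in range(n_hours):
        column = [building[t] for building in demands]
        lo, hi = min(column), max(column)
        span = hi - lo
        for i, value in enumerate(column):
            # Assumption: an hour with no spread across buildings maps to 0.0.
            normalized[i][t] = (value - lo) / span if span else 0.0
    return normalized


def feature_sequence(profile):
    """Append average, maximum, and minimum to a normalized profile, following Eq. (2)."""
    return list(profile) + [sum(profile) / len(profile), max(profile), min(profile)]
```

For a 24-hour profile, `feature_sequence` returns the 27-element vector that would be passed to the clustering step.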
2.1.2. Power Demand Clustering using DBSCAN
Machine learning methods are often divided into supervised and unsupervised machine learning. The main difference between them is that supervised learning requires labels while unsupervised learning does not [31]. Because there are no labels for power demands in this research, DBSCAN, a typical unsupervised machine learning method, is selected to classify the buildings by clustering the power demands.
Compared with other clustering algorithms such as k-means, DBSCAN is rarely influenced by noise and initial settings [30]. DBSCAN clusters the data automatically without the need to specify the number of clusters, and data that do not belong to any cluster are considered noise. The detailed theory of DBSCAN has been illustrated by Luchi et al. [32]. Two parameters, epsilon-neighborhood and minpoint, need to be determined in DBSCAN. Epsilon-neighborhood defines the neighborhood threshold value of a point. Minpoint is the minimum number of points in the epsilon-neighborhood of the point. Additionally, there are two common evaluation indices for the DBSCAN model: silhouette coefficient and noise rate. The silhouette coefficient $S_i$ is computed as
$$S_i = \frac{b(i) - a(i)}{\max\{b(i), a(i)\}}, \tag{3}$$
where $a(i)$ is the average distance between $i$ and the other vectors which belong to the same cluster as $i$, and $b(i)$ is the average distance between $i$ and the vectors which belong to different clusters. A higher silhouette coefficient ($S_i$) signifies better clustering performance of DBSCAN.
The noise rate $P_{noise}$ is computed as
$$P_{noise} = \frac{m_{noise}}{m}, \tag{4}$$
where $m_{noise}$ is the amount of noise data, and $m$ is the total amount of data. A higher noise rate means there are more outliers which do not belong to any cluster.
The 24-hour power demands on the same day are input to DBSCAN to obtain the power demand clustering results. Then the buildings are classified into different categories according to the clustering results. Using the power demand on the same day to achieve the building classification avoids the influence of different weather conditions.
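The two evaluation indices in Eqs. (3) and (4) can be computed from any set of cluster labels. The sketch below is an illustration under stated assumptions: labels follow the common convention that -1 marks noise, distances are Euclidean, noise points are left out of the silhouette averages, and $b(i)$ follows the wording above (average distance to all vectors in other clusters), which differs from the more common nearest-cluster definition. The function names are hypothetical.

```python
import math


def noise_rate(labels):
    """Eq. (4): share of points labelled as noise (assumed label -1)."""
    return sum(1 for label in labels if label == -1) / len(labels)


def silhouette(points, labels, i):
    """Eq. (3) for point i, with a(i) and b(i) as defined in the text."""
    same, other = [], []
    for j, (point, label) in enumerate(zip(points, labels)):
        if j == i or label == -1:
            continue  # skip the point itself and noise points
        distance = math.dist(points[i], point)
        (same if label == labels[i] else other).append(distance)
    if not same or not other:
        # Assumption: the score is 0.0 when either average is undefined.
        return 0.0
    a = sum(same) / len(same)
    b = sum(other) / len(other)
    return (b - a) / max(a, b)
```

Averaging `silhouette` over all non-noise points gives a single score that can be compared across candidate settings of epsilon-neighborhood and minpoint.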
2.2. Phase 2: Pre-prediction for Determination of Typical Buildings
The typical buildings needed for physics-based model creation are determined via the algorithm in this phase. Firstly, the percentage of selected buildings in each category is set from 1% to $p$% at an interval of $x$%, and the golden section search (GSS) method [33] is adopted to search for the optimum percentage of typical buildings in each category. Then, for each selected percentage proposed by GSS, three steps are conducted. In step 2.2, a small sample of power demands is randomly selected from each category. Next, in step 2.3, the small sample of power demands is input to GANs to achieve the power demand prediction for the buildings at a large scale. Steps 2.2 and 2.3 iterate $N_e$ times with each proposed percentage, and the prediction performance with each specific percentage is evaluated according to these $N_e$ prediction results in step 2.4. The evaluation results of prediction performance are then input to GSS for the following search process. After the optimum percentage is determined, the typical buildings are randomly chosen from each category according to the optimum percentage.
2.2.1. Typical Building Searching via Golden Section Search (GSS) Method
In the selection process, the GAN is expected to iterate for each candidate of typical buildings. However, GAN relies on training two deep neural networks simultaneously to ensure the generated data is similar to the true collected data, so significant computing time is required when the GAN step is iterated for each candidate typical building. Thus, to speed up the determination process, the GSS method is adopted to search for the optimum percentage of typical buildings in each category. The GSS is an efficient and generic technique for optimizing a uni-modal objective function [33]. In this study, the Information and Error Criterion (IEC) is set as the objective function of the GSS. IEC is introduced in subsection 2.2.4. The value of Min(IEC) will influence the determination of the selected percentage of the chosen buildings. When IEC obtains the lowest value, the corresponding percentage is set to be the optimum percentage. To reduce the possibility of IEC falling into a local minimum point, the range of selected percentages in each category is first divided into $P$ parts. Then, the GSS method is utilized to determine the optimum selected percentage of typical buildings in each part. Finally, the IEC values of the optimum selected percentages in each part are compared, and the percentage whose IEC value is the lowest is selected to be the final percentage of selected buildings in each category. One restriction to guarantee GAN performance is that the number of selected buildings in each category should be no less than $q$. In other words, when the number of selected power demands obtained according to the optimum selected percentage is less than $q$, it is set as $q$. The value $q$ can be manually chosen or determined by experiments. Figure 2 illustrates the searching workflow of the optimum percentage of selected buildings via GSS.
Figure 2: The searching workflow of the optimum percentage of selected buildings via GSS
The detailed process of searching for the optimum percentage of selected buildings in each category is as follows.
a. Determine the range of the selected percentages of typical buildings in each category.
b. Divide the range of the selected percentages into $P$ parts.
c. Search the optimum percentage of selected power demands using the GSS method in each part.
d. Compare the IEC values of the selected percentage from each part and obtain the optimum percentage $p_s$, whose IEC is the lowest.
e. Determine $p_s$ as the percentage of selected buildings in each category.
Figure 3: Structure of GAN [34]
GSS will propose specific percentages during the searching process. For each specific percentage, steps 2.2 and 2.3 will iterate $N_e$ times to obtain different prediction results. These $N_e$ prediction results are utilized to evaluate the prediction performance of the specific percentage of typical buildings in step 2.4. The prediction evaluation results are then input to GSS to determine the typical building percentage in each category.
After the percentage of typical buildings in each category is determined to be $p_s$, the buildings whose prediction accuracy is best at that percentage are selected to be the typical buildings. The prediction accuracy evaluation index is also introduced in subsection 2.2.4.
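The partitioned search described in this subsection can be sketched in Python as follows. This is a generic golden section search applied to each of the $P$ parts, not the authors' implementation: the objective `iec` is a stand-in callable (in the paper each evaluation involves $N_e$ GAN runs), and the names `golden_section_search`, `search_optimum_percentage`, and the tolerance `tol` are assumptions made for illustration.

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2  # inverse golden ratio, about 0.618


def golden_section_search(f, lo, hi, tol=0.1):
    """Minimize a unimodal function f on [lo, hi] to within tol."""
    c = hi - INV_PHI * (hi - lo)
    d = lo + INV_PHI * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # Minimum lies in [lo, d]; the old c becomes the new d.
            hi, d, fd = d, c, fc
            c = hi - INV_PHI * (hi - lo)
            fc = f(c)
        else:
            # Minimum lies in [c, hi]; the old d becomes the new c.
            lo, c, fc = c, d, fd
            d = lo + INV_PHI * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2


def search_optimum_percentage(iec, lo, hi, parts, tol=0.1):
    """Run the search in each of `parts` equal sub-ranges and keep the lowest IEC."""
    width = (hi - lo) / parts
    candidates = [
        golden_section_search(iec, lo + k * width, lo + (k + 1) * width, tol)
        for k in range(parts)
    ]
    # Each candidate is evaluated once more here; a real run would cache
    # these values because every evaluation is expensive.
    return min(candidates, key=iec)
```

Each loop iteration reuses one of the two interior evaluations, so only one new objective evaluation is needed per step, which is what makes the method attractive when each evaluation requires training GANs.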
2.2.2. Small Sample Power Demand Selection
A small sample of 24-hour power demands is randomly selected from each category. The selected small samples of power demands are then input to the GAN to achieve the large-scale power demand prediction.
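A minimal sketch of the random selection, including the lower bound $q$ from subsection 2.2.1, is given below. The rounding rule (rounding the sample size up) and the function name `select_sample` are assumptions; the paper does not state how fractional counts are handled.

```python
import math
import random


def select_sample(category, percentage, q, rng=random):
    """Randomly pick `percentage` percent of the profiles in a category, at least q.

    category: list of power demand profiles (or building identifiers).
    rng: the random module or a random.Random instance for reproducibility.
    """
    # Assumption: fractional counts are rounded up.
    count = max(q, math.ceil(len(category) * percentage / 100.0))
    # Never request more items than the category holds.
    count = min(count, len(category))
    return rng.sample(category, count)
```

Passing a seeded `random.Random` instance makes the $N_e$ repeated draws reproducible.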
2.2.3. Large-scale Power Demand Prediction Using GAN
(1) The theory of GAN
GAN is a newly proposed deep learning method, which has the capability to grasp the potential distribution of original data and generate similar data via an adversarial process. In the E-GAN, GAN is utilized for energy modeling of buildings in the same category.
Figure 3 shows the structure of GAN. GAN is composed of one generator and one discriminator: the generator produces fake samples and aims to deceive the discriminator, while the discriminator tries its best to distinguish the fake samples produced by the generator from the real samples. The evaluation feedback $V(D, G)$ is given to the generator and discriminator to help them improve their performance. $V(D, G)$ is presented as
$$V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))], \tag{5}$$
where $x$ is the real data; $p_{data}(x)$ is the distribution of the real data $x$; $z$ is the noise; $p_z(z)$ is defined as a prior on the input noise variables; $D$ and $G$ are differentiable functions which are usually represented as multi-layer neural networks; $G(z)$ is the output of the generator; and $D(x)$ represents the probability that $x$ is real data.
The GAN generates data via an adversarial process. In the adversarial process, the generator tries its best to minimize the value function $V(D, G)$, while the discriminator tries to maximize $V(D, G)$. The adversarial process can also be illustrated as
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]. \tag{6}$$
To train the GAN, the discriminator and generator play a two-step game. First, the generator is fixed and the generated fake data and real data are utilized to train the discriminator. Next, the discriminator is fixed and the generator is updated. This two-step game continues iteratively until $p_g = p_{data}(x)$, where $p_g$ is the distribution of generated data from GAN. Under such conditions, the generated data from the generator has a similar distribution to the real data, and the discriminator cannot distinguish the fake data from real data [34].
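To make the value function in Eq. (5) concrete, the sketch below estimates it from discriminator outputs on a batch of real and generated samples. It is a plain Monte Carlo average with no training loop; the function name `value_function` and the list-based inputs are assumptions for illustration.

```python
import math


def value_function(d_real, d_fake):
    """Monte Carlo estimate of V(D, G) in Eq. (5).

    d_real: discriminator outputs D(x) on real samples, each in (0, 1).
    d_fake: discriminator outputs D(G(z)) on generated samples, each in (0, 1).
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term
```

When the discriminator can no longer separate real from generated data and outputs 0.5 for every sample, the estimate equals $-\log 4$; a discriminator that separates the two well pushes the value toward 0.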
(2) The prediction process of large-scale power demand prediction using GAN
To guarantee the quality of the generated power demand and promote accurate predictions, the power demands of typical buildings in each category are input to GAN separately to predict the 24-hour power demand of each individual building.
Suppose that there are $c_N$ buildings in category $c$. First, the power demands of all buildings are normalized, then $s_n$ typical buildings in category $c$ are selected, and the normalized power demands of the typical buildings are utilized to train the GAN to achieve the power demand prediction for the buildings at a large scale. The 24-hour normalized power demands of the selected buildings on the same day are expressed as
$$L_s = \begin{bmatrix} y_{s_1}^{0} & y_{s_1}^{1} & \cdots & y_{s_1}^{t} & \cdots & y_{s_1}^{24} \\ y_{s_2}^{0} & y_{s_2}^{1} & \cdots & y_{s_2}^{t} & \cdots & y_{s_2}^{24} \\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \\ y_{s_n}^{0} & y_{s_n}^{1} & \cdots & y_{s_n}^{t} & \cdots & y_{s_n}^{24} \end{bmatrix} \tag{7}$$
where $y_{s_i}^{t}$ is the real power load of building $s_i$ at time $t$, and $s_i$ is the index of the selected building in category $c$.
The detailed processes of power demand prediction of all buildings in category $c$ using GAN are listed below.
a. Input $L_s$ into the GAN.
b. Iterate GAN until loss(G) = loss(D), where loss(G) is the generator loss, loss(D) is the discriminator loss, and loss(G) and loss(D) are computed via Eq. (6).
c. Collect $c_N$ generated samples from GAN as the predicted 24-hour power loads of all buildings as
$$G_c = \begin{bmatrix} \hat{y}_{c_1}^{0} & \hat{y}_{c_1}^{1} & \cdots & \hat{y}_{c_1}^{t} & \cdots & \hat{y}_{c_1}^{24} \\ \hat{y}_{c_2}^{0} & \hat{y}_{c_2}^{1} & \cdots & \hat{y}_{c_2}^{t} & \cdots & \hat{y}_{c_2}^{24} \\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \\ \hat{y}_{c_N}^{0} & \hat{y}_{c_N}^{1} & \cdots & \hat{y}_{c_N}^{t} & \cdots & \hat{y}_{c_N}^{24} \end{bmatrix} \tag{8}$$
where $\hat{y}_{c_n}^{t}$ is the predicted power load of building $c_n$ at time $t$, and $c_n$ is the building index in category $c$.
Suppose there are $C$ building categories. To obtain the final predicted result of the power demand for buildings at a large scale, all of the predicted 24-hour power demand sequences are aggregated as
$$\hat{y}^{t} = \sum_{c=1}^{C} \sum_{i=c_1}^{c_N} \hat{y}_{i}^{t}, \tag{9}$$
where $\hat{y}^{t}$ is the predicted power demand for buildings at a large scale at time $t$, $\hat{y}_{i}^{t}$ is the predicted power demand of building $i$ at time $t$, and $N$ is the number of buildings in category $c$.
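The aggregation in Eq. (9) is a double sum over categories and buildings for each time step. A minimal sketch, assuming the per-category predictions are held as nested lists of equal-length profiles (the name `aggregate_demand` is hypothetical):

```python
def aggregate_demand(category_predictions):
    """Eq. (9): sum the predictions over all buildings in all categories.

    category_predictions: list over categories; each entry is a list of
    per-building profiles, all with the same number of time steps.
    """
    n_steps = len(category_predictions[0][0])
    total = [0.0] * n_steps
    for buildings in category_predictions:      # outer sum over c
        for profile in buildings:               # inner sum over buildings i
            for t, value in enumerate(profile):
                total[t] += value
    return total
```

The result has one aggregated value per time step, which is the large-scale demand profile compared against measurements in the evaluation step.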
2.2.4. Prediction Evaluation for Building Sample Determination
As illustrated above, for each selected percentage of power demands input into GAN, there will be $N_e$ predicted results. Mean Absolute Percentage Error (MAPE) is selected to evaluate the prediction accuracy. Therefore, there are $N_e$ MAPEs for each specific percentage of input data into GAN. MAPE is computed as
$$\mathrm{MAPE} = \frac{1}{T} \sum_{t=1}^{T} \frac{|\hat{y}^{t} - y^{t}|}{y^{t}} \times 100\%, \tag{10}$$
where $\hat{y}^{t}$ is the predicted value at time $t$, and $y^{t}$ is the real value at time $t$.
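Eq. (10) translates directly into a few lines of Python. The sketch assumes strictly positive real values, as is natural for power demand; the function name `mape` is chosen for illustration.

```python
def mape(predicted, actual):
    """Eq. (10): mean absolute percentage error, in percent.

    Assumes every value in `actual` is positive.
    """
    errors = [abs(p - a) / a for p, a in zip(predicted, actual)]
    return 100.0 * sum(errors) / len(errors)
```

For instance, predictions of 110 and 90 against real values of 100 and 100 are each 10% off, so the MAPE is 10%.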
To determine the proper percentage of physics-based building models in each category, a new evaluation index named Information and Error Criterion (IEC) is established. IEC is calculated as
$$IEC(n_s) = a \ln(n_s) + b \ln(\mu) + w \ln(CV), \tag{11}$$
where $a$, $b$, $w$ are weighting factors; $n_s$ is the average amount of input data to GAN of each category; $\mu$ is the average value of the $N_e$ MAPEs; and $CV$ is the Coefficient of Variation for the $N_e$ MAPEs when the average amount of input data is $n_s$.
$CV$ is presented as
$$CV = \frac{\sigma}{\mu}, \tag{12}$$
where $\sigma$ is the standard deviation of the $N_e$ MAPEs when the amount of input data is $n_s$.
The IEC considers the amount of data input to GAN, prediction accuracy, and stability. When $n_s$ is low, the prediction accuracy improves significantly with increasing $n_s$. Under such conditions, $\Delta IEC$ is influenced much more by the improving accuracy than by the increasing $n_s$, so IEC decreases. However, after $n_s$ surpasses a critical value, the prediction accuracy improves little with increasing $n_s$. $\Delta IEC$ is then influenced more by the increase of $n_s$, so $IEC(n_s)$ becomes higher with increasing $n_s$. A lower IEC signifies more accurate and stable prediction performance with fewer physics-based building models.
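A short sketch of Eqs. (11) and (12) shows how the three terms combine. The default weights of 1.0 are placeholders, not values from the paper, and the population standard deviation is an assumption because the text does not say which estimator is used; the function name `iec` is likewise hypothetical.

```python
import math
import statistics


def iec(n_s, mapes, a=1.0, b=1.0, w=1.0):
    """Eq. (11): Information and Error Criterion for an average sample size n_s.

    mapes: the N_e MAPE values obtained with that sample size.
    a, b, w: weighting factors (placeholder defaults, not from the paper).
    """
    mu = statistics.mean(mapes)
    # Assumption: population standard deviation; the paper does not specify.
    sigma = statistics.pstdev(mapes)
    cv = sigma / mu  # Eq. (12)
    return a * math.log(n_s) + b * math.log(mu) + w * math.log(cv)
```

Note that the logarithm of $CV$ is undefined when all $N_e$ MAPEs are identical, so a practical implementation would need to guard against a zero standard deviation.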
2.3. Phase 3: Physics-based Model Creation
EnergyPlus is utilized to achieve the 24-hour power demand prediction. In this part, model input information for the typical buildings identified by Phase 2 (pre-prediction) is first collected. Then, physics-based building models are created for these typical buildings. The method of creating physics-based building models was illustrated in the research of Ye et al. [16]. In this research, the physics-based building models are obtained by using residential prototype building models [35] as a starting point.
2.4. Phase 4: Large-scale Power Demand Prediction
This phase performs any 24-hour power demand prediction by running EnergyPlus and GAN. Two steps are included: (1) power demand prediction of typical buildings using EnergyPlus, and (2) large-scale power demand prediction using GAN.
In step 4.1, EnergyPlus is utilized to achieve the 24-hour power demand prediction of typical buildings. A new weather file and the physics-based building models created in Phase 3 are input to EnergyPlus to predict the 24-hour power demands of typical buildings.
In step 4.2, the power demands predicted by EnergyPlus in each category are utilized to train one GAN to realize the power demand prediction of all buildings for each category, and then all of the predicted power demands from the GAN models are aggregated to get the final predicted result as expressed in Eq. (9). In this phase, the physics-based model (EnergyPlus) needs new weather data (including humidity, wet bulb temperature, dry bulb temperature, solar altitude, etc.) and physical parameters (including building prototype, plug load density, lighting power density, U-value of window, etc.) as exogenous variables, and the GANs only need the daily power demands from EnergyPlus as exogenous variables.
3. Case Study
The residential building sector in the city of Colorado Springs, Colorado, United States was selected as a case study to demonstrate the proposed E-GAN method. In this section, the process of the power demand prediction using E-GAN is illustrated in detail, and the prediction results are compared and discussed.
Table 1: Residential Building Type Classification Method
Classification aspect | Categories
Building prototype | single-family (SF), multi-family (MF)
Heating system | electric resistance, gas furnace, oil furnace, and heat pump (for both SF and MF)
Foundation | slab, crawlspace, heated basement, and unheated basement (for both SF and MF)
Total floor area* | 152 m², 222 m², 297 m² (for SF); 56 m² per unit, 73 m² per unit, 92 m² per unit (for MF)
*: Residential energy consumption survey
3.1. Phase 1: Building Classification Based on Power Demand
3.1.1. Data Preparation
In practice, 24-hour power data can be collected directly. In this research, 24-hour power data was generated based on residential prototype building models [35] in three steps: (1) building category classification, (2) model preparation, and (3) power demand data generation.
The residential prototype building models categorize the residential buildings by building prototype, heating system type, and foundation type [35]. This research also considers building floor area in addition to these three aspects. As a result, residential buildings are categorized by four aspects in this study, as shown in Table 1. In total, there are 96 categories of residential buildings: 2 (prototypes) × 4 (heating systems) × 4 (foundations) × 3 (floor areas).
The number of buildings in each category was determined by referring to the residential energy consumption survey [36], as shown in Appendix Table A. Since the oil furnace and heat pump heating system types account for 0% of the heating systems in the city of Colorado Springs, the heating system types can be reduced to two and the total number of categories of residential buildings becomes 48.
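The category counts above can be checked by enumerating the combinations from Table 1. This is only a small sketch; the floor-area labels `small`, `medium`, and `large` are placeholders for the three floor areas of each prototype.

```python
from itertools import product

prototypes = ["SF", "MF"]
heating = ["electric resistance", "gas furnace", "oil furnace", "heat pump"]
foundations = ["slab", "crawlspace", "heated basement", "unheated basement"]
floor_areas = ["small", "medium", "large"]  # placeholder labels, three per prototype

all_categories = list(product(prototypes, heating, foundations, floor_areas))

# Oil furnace and heat pump have a 0% share in Colorado Springs, so drop them.
present = {"electric resistance", "gas furnace"}
reduced = [c for c in all_categories if c[1] in present]

print(len(all_categories), len(reduced))  # 96 48
```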
To prepare the 48 baseline models, the 16 residential prototype building models [35] were used as a starting point. Then, each prototype model was revised to represent the same building type, but with three different floor areas. To populate the 48 baseline models for different building characteristics, five model inputs were selected, and their values for a specific building were chosen randomly within the ranges defined in Table 2. Stochastic schedules were not considered in this case study, thus the schedule was not set as a modified variable. All buildings were given the same occupancy schedule. We assumed that the buildings were occupied all day. The occupancy fraction set in the schedule is larger at night than during the day.
Using the method described above, 1,992 building energy models were generated. Those models were simulated for 11 days selected randomly while considering seasons, temperatures, and day types (weekend, workday, holiday): January 7, January 9, January 24, February 2, April 4, June 4, June 6, September 30, September 17, October 25 and December 25. Finally, the 24-hour power demands of the 1,992 buildings were obtained.
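The random population of model inputs can be sketched as follows. The ranges are those listed in Table 2; the uniform distribution, the dictionary keys, and the helper name `sample_building_inputs` are assumptions of this example, since the text only states that values were chosen randomly within the ranges.

```python
import random

# Value ranges from Table 2 (default value -20% / +20%).
INPUT_RANGES = {
    "plug_load_density_w_m2": (1.97, 2.95),
    "lighting_power_density_w_m2": (1.36, 2.04),
    "wall_insulation_r_value": (2.70, 4.04),
    "window_u_value_w_m2k": (1.46, 2.18),
    "window_shgc": (0.70, 1.06),
}


def sample_building_inputs(rng):
    """Draw one set of the five model inputs (assumed uniform within range)."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in INPUT_RANGES.items()}


rng = random.Random(0)  # fixed seed so the example is repeatable
buildings = [sample_building_inputs(rng) for _ in range(5)]
for b in buildings:
    print({k: round(v, 2) for k, v in b.items()})
```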
Table 2: The Range of Model Input Values
Model Input | Unit | Default Value | Value Range¹
Plug Load Density | W/m² | 2.46 | (1.97, 2.95)
Lighting Power Density | W/m² | 1.7 | (1.36, 2.04)
Insulation R-value of Exterior Walls | - | 3.37 | (2.70, 4.04)
U-value of Window | W/m²-K | 1.82 | (1.46, 2.18)
SHGC of Window | - | 0.88 | (0.70, 1.06)
¹: The minimum value is 20% less than the default value; the maximum value is 20% more than the default value.
Table 3: Clustering results of daily power demands in various days
Day type | Date | Average temperature (°C) | Cluster number | Noise rate (%) | Silhouette coefficient
Workday | January 7 | -0.8 | 4 | 0.00 | 0.908
Workday | January 9 | -12.5 | 4 | 0.25 | 0.917
Workday | January 24 | -5.8 | 4 | 0.15 | 0.918
Weekend | February 2 | 0.55 | 4 | 0.15 | 0.915
Workday | April 4 | 1.15 | 5 | 0.70 | 0.887
Workday | June 4 | 12.88 | 5 | 0.10 | 0.902
Workday | June 6 | 18.42 | 5 | 0.20 | 0.907
Workday | September 10 | 17.18 | 4 | 1.05 | 0.889
Workday | September 30 | 19.92 | 4 | 0.50 | 0.902
Workday | October 25 | 12.98 | 3 | 0.30 | 0.908
Holiday | December 25 | -9.8 | 4 | 0.20 | 0.899
The 24-hour power demands from EnergyPlus were normalized via equation (1). The average, maximum, and minimum of the 24-hour power demands were then extracted and combined with the normalized 24-hour power demands to form new sequences, which were input to DBSCAN for building classification.
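A rough sketch of how such a clustering input sequence could be assembled is given below. Min-max scaling is used here only as a stand-in for equation (1), and taking the average, maximum, and minimum from the raw (not normalized) profile is likewise an assumption; the function name and the sample profile are hypothetical.

```python
def build_clustering_features(profile):
    """Return 24 normalized values followed by mean, max and min (27 values).

    Min-max scaling stands in for equation (1), which is an assumption here.
    """
    if len(profile) != 24:
        raise ValueError("expected 24 hourly values")
    lo, hi = min(profile), max(profile)
    span = hi - lo
    normalized = [(p - lo) / span if span else 0.0 for p in profile]
    return normalized + [sum(profile) / 24, hi, lo]


# Hypothetical hourly demand (kW) for one building.
demand = [3.4 + 0.1 * h for h in range(24)]
features = build_clustering_features(demand)
print(len(features))  # 27
```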
3.1.2. Power Demand Clustering using DBSCAN
In DBSCAN, two parameters, epsilon and minpoint, need to be determined. To find more power demand patterns, minpoint was varied from 2 to 20 and epsilon was varied from 0.1 to 2 at an interval of 0.1. The silhouette coefficient and noise rate were monitored as the two parameters changed. The values of epsilon and minpoint that gave the largest silhouette coefficient were chosen, and the classification results were obtained.
Table 3 shows the best clustering results of building power demands on different days according to DBSCAN. Among the best clustering results, the daily power demands were divided into four categories on 7 of 11 days (63.6%).
Figure 4 shows how the silhouette coefficient changes with the number of clusters, ranging from one to six, for different days. When the number of clusters is four, the silhouette coefficient is highest on most of the days and is higher than 0.85 on all days. Thus, the building power demand is recommended to be divided into four categories in this case.
Figure 4: The silhouette coefficient change with the number of clusters from one to six on different days.
Table 4: Detailed information of each category of 24-hour power demands on January 7
Category | Number of buildings | Average peak demand (kW) | Average valley demand (kW) | Peak time | Valley time | Building type | Heating source
1 | 72 | 198.0 | 64.5 | 8-10am | 2-4pm | MF | Electricity
2 | 216 | 58.6 | 31.5 | 7-9pm | 3-5am | MF | Gas
3 | 396 | 28.1 | 5.8 | 5-7am | 2-4pm | SF | Electricity
4 | 1308 | 6.5 | 3.4 | 8-10pm | 1-3pm | SF | Gas
Table 4 shows the detailed information of the clustering results on January 7. It includes the number of buildings, average peak demand, average valley demand, peak time, and valley time for the four categories recommended by DBSCAN. Here, the peak time and valley time are defined as the two hours when the power demands are the highest and lowest in one day, respectively. Table 4 shows that four categories of buildings are obtained according to the power demand patterns, and that different categories of buildings differ significantly in peak demand, valley demand, peak time, and valley time. Meanwhile, the buildings in the same category have some similarities in the physics-based parameters. It is found that heating source and total floor area are the two key parameters for building classification and highly impact the patterns of the power demands for residential buildings. This finding can accelerate the process of building classification: in future research, buildings can be divided into different categories directly according to the heating source and floor area.
Furthermore, some normalized 24-hour power load profiles of each category on January 7 are visualized in Figure 5. The shapes and value ranges of the 24-hour power load profiles differ greatly between categories.
Figure 5: Some normalized 24-hour power load profiles of each category on January 7.
3.2. Phase 2: Pre-prediction for Determination of Typical Buildings
The noise rate of DBSCAN on January 7 is the lowest when the number of clusters is four. To obtain the least required number of building samples for physics-based model creation, the simulated 24-hour power demands from EnergyPlus on January 7 were selected for the pre-prediction experiment. This section determines the structure of the GAN, the least number of input daily power demands in each category, the percentage of selected typical building power demands in each category, and the typical buildings in each category.
For the GAN, the generator was a fully connected recurrent network [37], and the discriminator was a long short-term memory network [38]. There were 24 nodes in each layer of the generator, and ReLU was set as the activation function. The number of nodes in the hidden layers of the discriminator was also set to 24; 24 nodes and one node were used for the input and output layers, respectively, and tanh was used as the activation function. The number of hidden layers of the generator and the discriminator was varied from one to five to determine the best structures. The losses of the generator and discriminator were monitored and compared, and the generated data was also compared with the average of the original daily power demands using the MAPE as the evaluation index. First, the number of hidden layers of the discriminator was set to two, and the number of hidden layers of the generator was varied from 1 to 5. When the generator had 2 hidden layers, the losses of the generator and discriminator balanced after more than 2500 iterations, and the MAPE was lower than 20%. The generator with two hidden layers performed better than the others, so the number of hidden layers in the generator was set to 2. After the generator was determined, the number of hidden layers of the discriminator was varied from 1 to 5. When the discriminator had one hidden layer, the losses of the generator and discriminator balanced after more than 2500 iterations, and the MAPE was lower than 20%. The MAPE did not improve with an increasing number of hidden layers in the discriminator, so the number of hidden layers for the discriminator was set to one. To obtain good-quality generated data for each type, data was collected after 2500 iterations of the GAN, and then every ten iterations. Finally, there were 72, 216, 396, and 1308 predicted 24-hour power demands for the buildings in categories one, two, three, and four, respectively, in each prediction process.
To determine the least amount of input data required in each category, the prediction performance of GAN using different amounts of daily power demands was monitored, and the MAPE and Pearson Correlation Coefficient (r) were chosen as the evaluation indices. The coefficient r is calculated as
$$r = \frac{\sum_{t=1}^{T} \left(\hat{y}_t - E(\hat{y}_t)\right)\left(y_t - E(y_t)\right)}{\sqrt{\sum_{t=1}^{T} \left(\hat{y}_t - E(\hat{y}_t)\right)^2}\,\sqrt{\sum_{t=1}^{T} \left(y_t - E(y_t)\right)^2}}, \tag{13}$$
where $\hat{y}_t$ is the predicted value using the E-GAN at time $t$; $y_t$ is the predicted value using EnergyPlus at time $t$; and $E(\hat{y}_t)$ and $E(y_t)$ are the average values of $\hat{y}_t$ and $y_t$. A lower MAPE indicates better prediction accuracy, and a larger r represents a stronger positive correlation.
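The two evaluation indices can be computed with a few lines of plain Python. This is a minimal sketch: `pearson_r` follows equation (13), while `mape` uses the standard definition of the Mean Absolute Percentage Error, assumed here to match equation (10). The sample values are hypothetical.

```python
import math


def mape(predicted, actual):
    """Mean Absolute Percentage Error in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for p, a in zip(predicted, actual)) / len(actual)


def pearson_r(predicted, actual):
    """Pearson correlation coefficient, equation (13)."""
    mean_p = sum(predicted) / len(predicted)
    mean_a = sum(actual) / len(actual)
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov / (math.sqrt(var_p) * math.sqrt(var_a))


# Hypothetical 4-hour excerpt: reference values and predictions.
reference = [100.0, 120.0, 150.0, 130.0]
predicted = [98.0, 123.0, 147.0, 134.0]
print(round(mape(predicted, reference), 2), round(pearson_r(predicted, reference), 3))
```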
The number of selected buildings was varied from 4 to 30 at an interval of one, and the prediction experiment using GAN was conducted 5 times for each group of selected buildings. Figure 6 shows how the MAPE and r change with the number of input daily power demands. The two panels show that the GAN predicts well when the number of selected buildings is more than 6, with MAPE less than 4.1% and r more than 0.96. Thus, the least number of selected buildings in each category was set to 7.
Figure 6: The performance of GAN with different numbers of input daily power demands on January 7: (a) the change in MAPE with the number of daily power demands; (b) the change in r with the number of daily power demands.
The percentage of selected typical building power demands in each category was varied from 1% to 15% at an interval of 0.1%. The percentage range was then divided into two halves, one from 1% to 7.5% and the other from 7.5% to 15%. When the number of power demands obtained from the selected percentage was less than 7, the number of selected power demands was set to 7. The GSS method was utilized to search for the optimum percentage in each half. min(IEC) was set as the value function, and the accuracy parameter $\epsilon$ of GSS was set to 0.1. The weighting factors $a$, $b$ and $w$ in the IEC function (equation (11)) were set to one. The prediction experiment using GAN was conducted ten times for each selected percentage ($N_e = 10$). Finally, the search process of GSS was conducted 7 times: three times for 1% to 7.5% and four times for 7.5% to 15%. In total, the GAN ran 5 × 10 × 4 = 200 times for 1% to 7.5% and 6 × 10 × 4 = 240 times for 7.5% to 15%, costing about 102 hours using multithreading. Two optimum percentages were obtained from the GSS method: 4.3% and 10%. The corresponding IEC values were -3.32 and -2.43, respectively, so 4.3% was chosen as the final optimal percentage. Thus, with the percentage of power loads in each category set to 4.3%, the buildings with the lowest prediction error rate (4.2% measured by MAPE) were selected as the typical buildings.
In this phase, GSS was adopted to speed up the search for the optimal percentage. Although this process seems time-consuming, it is conducted only once, to prepare the physical models of typical buildings at the beginning. Once the physical models of typical buildings are created, the E-GAN can predict any 24-hour power demand in much less time than the physics-based model.
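To illustrate the kind of search involved, here is a minimal sketch of a golden section search, assuming that GSS denotes this method. The stopping rule (interval width below `tol`) and the stand-in objective `hypothetical_iec` are assumptions of this example; in the study, each evaluation of the value function requires repeated GAN runs, and the exact stopping rule may differ.

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2  # inverse golden ratio, about 0.618


def golden_section_search(f, lo, hi, tol=0.1):
    """Return an approximate minimizer of a unimodal function f on [lo, hi]."""
    c = hi - INV_PHI * (hi - lo)
    d = lo + INV_PHI * (hi - lo)
    fc, fd = f(c), f(d)
    while (hi - lo) > tol:
        if fc < fd:
            # Minimum lies in [lo, d]; reuse c as the new right probe.
            hi, d, fd = d, c, fc
            c = hi - INV_PHI * (hi - lo)
            fc = f(c)
        else:
            # Minimum lies in [c, hi]; reuse d as the new left probe.
            lo, c, fc = c, d, fd
            d = lo + INV_PHI * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2


def hypothetical_iec(pct):
    # Stand-in for running the GAN N_e times at this percentage and computing IEC.
    return (pct - 4.3) ** 2


best = golden_section_search(hypothetical_iec, 1.0, 7.5, tol=0.1)
print(round(best, 2))
```

Each iteration reuses one of the two previous probe points, so only one new evaluation of the value function is needed per step, which matters when every evaluation is a batch of GAN training runs.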
3.3. Phase 3: Physics-based Model Creation
Four categories of buildings were identified in Section 3.1, and 92 physics-based building models were required according to Section 3.2. There are 7, 10, 18, and 57 selected buildings in categories one, two, three and four, respectively. Table 5 provides the basic information for these 92 building models.
Table 5: Basic information of physics-based models
No. | Building type | Heating source | Total floor area (m²) | Plug load density (W/m²) | Lighting power density (W/m²) | Insulation R-value of exterior wall | U-value of window (W/m²-K) | SHGC of window
1-1 to 1-7 | Multi-family | Electricity | 56, 73, 92 | (1.97, 2.95) | (1.36, 2.04) | (2.70, 4.04) | (1.46, 2.18) | (0.70, 1.06)
2-1 to 2-10 | Multi-family | Gas | 56, 73, 92 | (1.97, 2.95) | (1.36, 2.04) | (2.70, 4.04) | (1.46, 2.18) | (0.70, 1.06)
3-1 to 3-18 | Single-family | Electricity | 152, 222, 297 | (1.97, 2.95) | (1.36, 2.04) | (2.70, 4.04) | (1.46, 2.18) | (0.70, 1.06)
4-1 to 4-57 | Single-family | Gas | 152, 222, 297 | (1.97, 2.95) | (1.36, 2.04) | (2.70, 4.04) | (1.46, 2.18) | (0.70, 1.06)
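The per-category counts can be reproduced from the 4.3% share, the minimum of 7 buildings per category, and the category sizes in Table 4. Rounding up to the next integer is an assumption of this sketch; it is the rule that matches the reported counts of 7, 10, 18, and 57.

```python
import math

MIN_SAMPLES = 7      # least number of buildings per category (Section 3.2)
PERCENTAGE = 0.043   # optimal share found by the search, 4.3%
category_sizes = {1: 72, 2: 216, 3: 396, 4: 1308}  # Table 4

# Rounding up is an assumption; it reproduces the counts reported in the text.
typical = {
    c: max(MIN_SAMPLES, math.ceil(PERCENTAGE * n))
    for c, n in category_sizes.items()
}

print(typical)                # {1: 7, 2: 10, 3: 18, 4: 57}
print(sum(typical.values()))  # 92
```

Category one is the only one where the minimum applies: 4.3% of 72 buildings is about 3.1, which is below 7.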
3.4. Phase 4: Large-scale Power Demand Prediction
For the E-GAN, the 92 physics-based building models in subsection 3.3 were used to generate the small dataset of power demands. The small dataset was then input to the GAN to produce 24-hour power demands for 1,992 buildings. In this part, the settings of the GAN were the same as in the pre-prediction. The MAPE (equation (10)) and r (equation (13)) were utilized to evaluate the prediction accuracy of the E-GAN and other data-driven methods.
3.4.1. Results Validation
To validate the E-GAN, the power demands of two days in each of the four seasons were predicted. We ran the 1,992 physics-based building models with the new weather profiles in EnergyPlus to predict the 24-hour power demands, which were then aggregated to form the target power demand. The prediction performance of E-GAN is compared with that of the physics-based model EnergyPlus in terms of prediction accuracy, computing time, and the hourly power demand distribution of single buildings. In phase four, E-GAN and EnergyPlus were run on the same computer, whose settings are shown in Table 6.
Table 6: Computer Setting
Indicator | Value
System type | 64-bit operating system, x64-based processor
Installed RAM | 32.0 GB
CPU type | Intel(R) Xeon(R) W-2125 CPU @ 4.00GHz 4.01GHz
Number of cores | 8
Disk capacity | 1.8 TB
Three data-driven models, SVR [39], Extreme Learning Machine (ELM) [40], and polynomial regression [18], were selected as comparative models for the E-GAN. They predict power demand via the top-down method. For the comparative data-driven models, the input parameters were set to humidity, dry bulb temperature, direct normal radiation, and time, as recommended by [41]. The output of the data-driven models was set to the predicted hourly power demand for the 1,992 buildings.
To guarantee the prediction accuracy of the data-driven models, a separate data-driven model was trained for each building category, and the prediction results of each category were aggregated to obtain the final prediction results. The training dataset consisted of the three months of simulated data (generated via EnergyPlus) preceding the predicted 24-hour power demands. The data was first normalized, then used for data-driven model training. The humidity, dry bulb temperature, direct normal radiation, and power demands were normalized via equation (1), while the time was normalized as
$$\bar{t}_n = \sin\left(\frac{2\pi n}{24}\right), \tag{14}$$
where $n$ is the hour of the day.
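A short sketch of the time normalization in equation (14) is shown below. Indexing the hours from 0 to 23 is an assumption of this example, since the text does not state the index base.

```python
import math


def encode_hour(n):
    """Equation (14): map hour n of the day to sin(2*pi*n/24)."""
    return math.sin(2 * math.pi * n / 24)


# Assumption: hours are indexed 0..23; the text does not state the index base.
encoded = [round(encode_hour(n), 3) for n in range(24)]
print(encoded)
```

The encoding is periodic over 24 hours, so the last hour of one day and the first hour of the next map to nearby values. A sine alone maps some distinct hours to the same value (for example n = 3 and n = 9).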
The hyperparameters of these three data-driven models were also carefully chosen. For the ELM, the activation function was the radial basis function (RBF); the number of hidden nodes was varied from 20 to 200 at an interval of 20 while the prediction accuracy was monitored. ELM obtained the best prediction results with 100 hidden nodes. For SVR, RBF was set as the kernel function, the penalty coefficient was varied from 0.1 to 1 at an interval of 0.1, and gamma was also varied from 0.1 to 1 at an interval of 0.1. The prediction accuracy was monitored via MAPE, and SVR achieved the most accurate prediction results when the penalty coefficient was 0.5 and gamma was 0.8. For the polynomial regression, there were four input parameters, thus the highest dimension was set to four.
3.4.2. Final Prediction Results and Discussion
Table 7 shows the prediction accuracy and computing time of the E-GAN compared with the physics-based method using EnergyPlus. The computing time of E-GAN is much shorter than that of EnergyPlus: the final prediction needs only approximately 9% of the computing time of the physics-based method while giving similar prediction results, with an average error below 5% measured by MAPE.
Table 7: The prediction accuracy and computing time of the E-GAN compared with the physics-based method using EnergyPlus on different days
Date | MAPE (%) | r | Small dataset generation time (h) | GAN time (h) | E-GAN time (h) | EnergyPlus time (h)
January 5 | 3.67 | 0.98 | 0.90 | 1.15 | 2.15 | 24.01
January 7 | 4.62 | 0.96 | 0.93 | 1.10 | 2.03 | 23.84
March 5 | 4.03 | 0.91 | 0.88 | 1.14 | 2.02 | 23.32
April 4 | 2.67 | 0.98 | 0.92 | 1.12 | 2.04 | 23.61
June 5 | 3.97 | 0.98 | 0.85 | 1.13 | 1.98 | 22.06
August 4 | 5.78 | 0.96 | 0.87 | 1.15 | 2.02 | 22.61
September 10 | 3.25 | 0.96 | 0.92 | 0.98 | 2.11 | 22.40
September 11 | 4.25 | 0.98 | 0.96 | 1.15 | 2.11 | 21.75
Average | 4.03 | 0.96 | 0.94 | 1.11 | 2.05 | 22.95
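The computing-time comparison can be reproduced from the average values in the last row of Table 7. The short calculation below uses only those two averages; the variable names are chosen for this example.

```python
# Average computing times from Table 7, in hours.
e_gan_hours = 2.05
energyplus_hours = 22.95

share = e_gan_hours / energyplus_hours    # fraction of the EnergyPlus time
speedup = energyplus_hours / e_gan_hours  # how many times faster

print(f"{share:.1%} of the EnergyPlus time, about {speedup:.1f}x faster")
```

On these averages, E-GAN needs roughly 9% of the EnergyPlus computing time, which corresponds to a speedup of about 11 times.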
Figure 7 compares the distributions of some daily power demands predicted by the GANs with the average of the original daily power demands in each category on April 4. The predicted daily power demands have similar distributions to the average of the original daily power demands, which suggests that the GAN-predicted power demands can be representative of the real power demand.
Figure 7: The distribution comparison between some daily predicted power demands of GANs and the average of the original daily power demands in each category on April 4: (a) the first building category; (b) the second building category; (c) the third building category; (d) the fourth building category.
Furthermore, the distributions of the predicted power demands of single buildings via E-GAN and EnergyPlus on April 4 are compared in Figure 8. The distributions of power demands of single buildings predicted by E-GAN are similar to those from EnergyPlus. This distribution comparison further validates the E-GAN.
Figure 8: The distributions of the predicted power demands of single buildings via E-GAN and EnergyPlus on April 4.
Table 8 illustrates the prediction accuracy comparison between the E-GAN and three data-driven models on different days. The E-GAN predicts more accurately than the data-driven models. On average, E-GAN reduces the error (measured by MAPE) by 66.56% compared to SVR, 71.01% compared to ELM, and 70.04% compared to the polynomial regression model. For the linear relationship comparison, the average r-value of E-GAN is more than 0.95, much better than those of the other three data-driven models, whose average r values are lower than 0.80. The comparison results of MAPE and r illustrate the advantages and effectiveness of the proposed E-GAN.
Table 8: The prediction accuracy of the E-GAN and three data-driven models compared with physics-based method using EnergyPlus on different days
Date | Method | MAPE | r
January 5 | E-GAN | 3.66% | 0.98
January 5 | SVR | 7.86% | 0.85
January 5 | ELM | 9.31% | 0.83
January 5 | Polyfit | 8.62% | 0.82
January 7 | E-GAN | 4.62% | 0.96
January 7 | SVR | 9.53% | 0.75
January 7 | ELM | 9.72% | 0.76
January 7 | Polyfit | 9.88% | 0.72
March 5 | E-GAN | 4.03% | 0.91
March 5 | SVR | 5.39% | 0.83
March 5 | ELM | 5.69% | 0.89
March 5 | Polyfit | 6.17% | 0.87
April 4 | E-GAN | 2.67% | 0.98
April 4 | SVR | 17.03% | 0.44
April 4 | ELM | 18.74% | 0.36
April 4 | Polyfit | 18.50% | 0.36
June 5 | E-GAN | 3.97% | 0.98
June 5 | SVR | 9.12% | 0.92
June 5 | ELM | 9.11% | 0.90
June 5 | Polyfit | 9.54% | 0.91
August 4 | E-GAN | 5.78% | 0.96
August 4 | SVR | 22.55% | 0.80
August 4 | ELM | 26.29% | 0.77
August 4 | Polyfit | 22.33% | 0.74
September 10 | E-GAN | 3.25% | 0.96
September 10 | SVR | 10.26% | 0.75
September 10 | ELM | 9.57% | 0.83
September 10 | Polyfit | 16.56% | 0.62
September 11 | E-GAN | 4.25% | 0.98
September 11 | SVR | 14.65% | 0.72
September 11 | ELM | 22.76% | 0.72
September 11 | Polyfit | 17.55% | 0.87
Average | E-GAN | 4.03% | 0.96
Average | SVR | 12.05% | 0.76
Average | ELM | 13.90% | 0.76
Average | Polyfit | 13.64% | 0.74
The predicted results of the E-GAN and three data-driven models on four days are also visualized in Figure 9. The visualization also demonstrates the more accurate prediction performance of the E-GAN.
Figure 9: Total power demand prediction on (a) January 5, (b) April 4, (c) June 5, and (d) September 11.
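The average error reductions can be recomputed from the "Average" rows of Table 8. The following sketch assumes the reduction is defined as the relative difference in average MAPE between each data-driven model and the E-GAN; the helper name `error_reduction` is chosen for this example.

```python
# Average MAPE values (%) from Table 8.
avg_mape = {"E-GAN": 4.03, "SVR": 12.05, "ELM": 13.90, "Polyfit": 13.64}


def error_reduction(baseline, proposed):
    """Relative MAPE reduction of `proposed` against `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline


for method in ("SVR", "ELM", "Polyfit"):
    print(method, round(error_reduction(avg_mape[method], avg_mape["E-GAN"]), 2))
```

With the tabulated averages, this gives 66.56% for SVR and 71.01% for ELM. For the polynomial regression it gives about 70.45%, slightly different from the 70.04% quoted in the text.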
3.5. Comparison and Discussion
The case study demonstrated the effectiveness of the E-GAN. Compared with physics-based models such as EnergyPlus, E-GAN needs only about 9% of the computing time (Table 7) while obtaining similar prediction results. Compared with three data-driven models, the power demand prediction process (phase four) of E-GAN has much better prediction accuracy. It should also be noted that the first three phases of E-GAN are used for model preparation. In these three phases, historical data is required to determine the typical buildings and the hyperparameters of GAN. However, once the physical models are prepared, historical data is not required in the fourth phase. Thus, compared with the other data-driven models, the proposed E-GAN reduces the reliance on historical data to some extent. Besides, the optimum architectures for the discriminator and generator were found to have one and two hidden layers, respectively, which means that deep networks are not needed for the proposed method.
The good prediction results of the E-GAN also support the validity of the typical building searching process. It is acceptable to use one day to set the optimal value of the selected percentage, given the similarity of classification results on different days and the good final prediction results. The good final prediction results also suggest that the type of day and season of the year have little impact on the optimal value of the selected percentage in each category.
The proposed method still has some limitations. For example, the searching process of typical buildings still requires significant computing time, even with GSS to speed up the process. In addition, the typical building selection is data-driven and lacks interpretability for application. These limitations need to be addressed in the future.
4. Conclusion
Large-scale power demand prediction for buildings plays an important role in the stable operation and management of the grid. This paper proposed a new methodology named E-GAN to predict daily power demands for buildings at a large scale using a hybrid of a physics-based model (EnergyPlus) and a data-driven model (GAN). Four phases are included in E-GAN: (1) building classification based on power demand, (2) pre-prediction, (3) physics-based model creation, and (4) large-scale power demand prediction. The residential building sector in the city of Colorado Springs was selected as a case study to evaluate the E-GAN.
It is found that: (1) at least 4.3% of physics-based models in each building category are required to ensure the prediction accuracy; (2) phase 4 of the E-GAN needs only approximately 9% of the computing time of the physics-based method using EnergyPlus while having similar prediction accuracy; and (3) on average, E-GAN reduces the error (measured by MAPE) by 66.56% compared to SVR, 71.01% compared to ELM, and 70.04% compared to the polynomial regression model. It can be concluded that the E-GAN significantly reduces the computing time compared with the physics-based model, and achieves more accurate prediction performance than the other studied data-driven models.
The case study demonstrated the proposed E-GAN. Based on that, some follow-up research can be conducted. Firstly, the data used for building classification is from EnergyPlus, so the performance of E-GAN still needs to be evaluated using real data. Secondly, as in other research using GAN, instability is also encountered in the proposed E-GAN. In the future, other advanced GANs can be used to address this problem. Thirdly, this case study concentrates on residential buildings, but the proposed method is also applicable to commercial building types.
Acknowledgments
The Chinese team is supported by the National Natural Science Foundation of China (62076150, 61903226), the Taishan Scholar Project of Shandong Province (TSQN201812092), the Key Research and Development Program of Shandong Province (2019GGX101072, 2019JZZY010115), and the Youth Innovation Technology Project of Higher School in Shandong Province (2019KJN005).
The first author conducted this research at the University of Colorado Boulder as a visiting scholar.
References565
[1] H. Wang, S. Wang, R. Tang, Development of grid-responsive buildings: Opportunities, challenges,566
capabilities and applications of hvac systems in non-residential buildings in providing ancillary services567
by fast demand responses to smart grids, Applied Energy 250 (2019) 697 – 712. doi:https://doi.568
org/10.1016/j.apenergy.2019.04.159.569
[2] J. Li, X. Chen, P. Duan, J. Mou, Kmoea: A knowledge-based multi-objective algorithm for distributed570
hybrid flow shop in a prefabricated system, IEEE Transactions on Industrial Informatics (2021) 1–571
1doi:10.1109/TII.2021.3128405.572
[3] J. Li, Y. Du, K. Gao, D. Peiyong, D. Gong, Q. Pan, P. N. Suganthan, A hybrid iterated greedy algorithm573
for a crane transportation flexible job shop problem, IEEE Transactions on Automation Science and574
Engineering (2021) 1–18doi:10.1109/TASE.2021.3062979.575
[4] Prepared by: T. Eckman, L. Schwartz, and G. Leventis, Lawrence Berkeley National Laboratory,576
Determining utility system value of demand flexibility from grid-interactive efficient buildings, State577
and Local Energy Efficiency Action Network. (2020).578
URL https://www.energy.gov/sites/prod/files/2020/04/f74/bto-see-action-GEBs-valuation-20200410.579
pdf580
[5] T. M. Lawrence, M.-C. Boudreau, L. Helsen, G. Henze, J. Mohammadpour, D. Noonan, D. Patteeuw,581
S. Pless, R. T. Watson, Ten questions concerning integrating smart buildings into the smart grid, Building and Environment 108 (2016) 273–283. doi:https://doi.org/10.1016/j.buildenv.2016.08.022.
[6] J. Li, Z. Liu, C. Li, Z. Zheng, Improved artificial immune system algorithm for type-2 fuzzy flexible job shop scheduling problem, IEEE Transactions on Fuzzy Systems 29 (11) (2021) 3234–3248. doi:10.1109/TFUZZ.2020.3016225.
[7] A. Heydari, M. Majidi Nezhad, E. Pirshayan, D. Astiaso Garcia, F. Keynia, L. De Santoli, Short-term electricity price and load forecasting in isolated power grids based on composite neural network and gravitational search optimization algorithm, Applied Energy 277 (2020) 115503. doi:https://doi.org/10.1016/j.apenergy.2020.115503.
[8] K. Amasyali, N. M. El-Gohary, A review of data-driven building energy consumption prediction studies, Renewable & Sustainable Energy Reviews 81 (pt.1) (2018) 1192–1205.
[9] G. Zhang, C. Tian, C. Li, J. J. Zhang, W. Zuo, Accurate forecasting of building energy consumption via a novel ensembled deep learning method considering the cyclic feature, Energy 201 (2020) 117531. doi:https://doi.org/10.1016/j.energy.2020.117531.
[10] A. A. A. Gassar, S. H. Cha, Energy prediction techniques for large-scale buildings towards a sustainable built environment: A review, Energy and Buildings 224 (2020) 110238. doi:https://doi.org/10.1016/j.enbuild.2020.110238.
[11] T. Hong, Y. Chen, X. Luo, N. Luo, S. H. Lee, Ten questions on urban building energy modeling, Building and Environment 168 (2020) 106508. doi:https://doi.org/10.1016/j.buildenv.2019.106508.
[12] M. Ferrando, F. Causone, T. Hong, Y. Chen, Urban building energy modeling (UBEM) tools: A state-of-the-art review of bottom-up physics-based approaches, Sustainable Cities and Society 62 (2020) 102408. doi:https://doi.org/10.1016/j.scs.2020.102408.
[13] J. New, [link].
URL https://www.energy.gov/eere/buildings/virtual-epb
[14] [link].
URL https://www.nrel.gov/buildings/urbanopt.html
[15] C. C. Davila, C. F. Reinhart, J. L. Bemis, Modeling Boston: A workflow for the efficient generation and maintenance of urban building energy models from existing geospatial datasets, Energy 117 (2016) 237–250. doi:https://doi.org/10.1016/j.energy.2016.10.057.
[16] Y. Ye, K. Hinkelman, J. Zhang, W. Zuo, G. Wang, A methodology to create prototypical building energy models for existing buildings: A case study on US religious worship buildings, Energy and Buildings 194 (PNNL-SA-140941) (2019).
[17] N. Abbasabadi, M. Ashayeri, R. Azari, B. Stephens, M. Heidarinejad, An integrated data-driven framework for urban energy use modeling (UEUM), Applied Energy 253 (2019) 113550. doi:https://doi.org/10.1016/j.apenergy.2019.113550.
[18] A. A. Ahmed Gassar, G. Y. Yun, S. Kim, Data-driven approach to prediction of residential energy consumption at urban scales in London, Energy 187 (2019) 115973. doi:https://doi.org/10.1016/j.energy.2019.115973.
[19] W. Wang, T. Hong, X. Xu, J. Chen, Z. Liu, N. Xu, Forecasting district-scale energy dynamics through integrating building network and long short-term memory learning algorithm, Applied Energy 248 (2019) 217–230. doi:https://doi.org/10.1016/j.apenergy.2019.04.085.
[20] C. E. Kontokosta, C. Tull, A data-driven predictive model of city-scale energy use in buildings, Applied Energy 197 (2017) 303–317. doi:https://doi.org/10.1016/j.apenergy.2017.04.005.
[21] A. Roth, J. Reyna, Grid-interactive efficient buildings technical report series: Whole-building controls, sensors, modeling, and analytics (12 2019). doi:10.2172/1580329.
URL https://www.osti.gov/biblio/1580329
[22] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, Generative adversarial nets, in: Advances in neural information processing systems, 2014, pp. 2672–2680.
[23] A. Salazar, L. Vergara, G. Safont, Generative adversarial networks and Markov random fields for oversampling very small training sets, Expert Systems with Applications 163 (2021) 113819. doi:https://doi.org/10.1016/j.eswa.2020.113819.
[24] C. Tian, C. Li, G. Zhang, Y. Lv, Data driven parallel prediction of building energy consumption using generative adversarial nets, Energy and Buildings 186 (2019) 230–243. doi:https://doi.org/10.1016/j.enbuild.2019.01.034.
[25] X. Zhou, Z. Sun, H. Wu, Wireless signal enhancement based on generative adversarial networks, Ad Hoc Networks 103 (2020) 102151. doi:https://doi.org/10.1016/j.adhoc.2020.102151.
[26] Z. Wang, T. Hong, Generating realistic building electrical load profiles through the generative adversarial network (GAN), Energy and Buildings 224 (2020) 110299. doi:https://doi.org/10.1016/j.enbuild.2020.110299.
[27] N. M. M. Bendaoud, N. Farah, S. Ben Ahmed, Comparing generative adversarial networks architectures for electricity demand forecasting, Energy and Buildings 247 (2021) 111152. doi:https://doi.org/10.1016/j.enbuild.2021.111152.
[28] L. G. Swan, V. I. Ugursal, Modeling of end-use energy consumption in the residential sector: A review of modeling techniques, Renewable and Sustainable Energy Reviews 13 (8) (2009) 1819–1835. doi:https://doi.org/10.1016/j.rser.2008.09.033.
[29] G. Tardioli, R. Kerrigan, M. Oates, J. O’Donnell, D. P. Finn, Identification of representative buildings and building groups in urban datasets using a novel pre-processing, classification, clustering and predictive modelling approach, Building and Environment 140 (2018) 90–106. doi:https://doi.org/10.1016/j.buildenv.2018.05.035.
[30] Y. Zhao, C. Zhang, Y. Zhang, Z. Wang, J. Li, A review of data mining technologies in building energy systems: Load prediction, pattern identification, fault detection and diagnosis, Energy and Built Environment 1 (2) (2020) 149–164. doi:https://doi.org/10.1016/j.enbenv.2019.11.003.
[31] H. Liu, B. Xu, D. Lu, G. Zhang, A path planning approach for crowd evacuation in buildings based on improved artificial bee colony algorithm, Applied Soft Computing 68 (2018) 360–376. doi:https://doi.org/10.1016/j.asoc.2018.04.015.
[32] D. Luchi, A. Loureiros Rodrigues, F. Miguel Varejão, Sampling approaches for applying DBSCAN to large datasets, Pattern Recognition Letters 117 (2019) 90–96. doi:https://doi.org/10.1016/j.patrec.2018.12.010.
[33] J. A. Koupaei, S. Hosseini, F. M. Ghaini, A new optimization algorithm based on chaotic maps and golden section search method, Engineering Applications of Artificial Intelligence 50 (2016) 201–214. doi:https://doi.org/10.1016/j.engappai.2016.01.034.
[34] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, Generative adversarial nets, in: Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 27, Curran Associates, Inc., 2014, pp. 2672–2680.
URL http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf
[35] Pacific Northwest National Laboratory (PNNL), Residential prototype building models (2019).
URL https://www.energycodes.gov/development/residential/iecc_models
[36] Residential energy consumption survey (RECS), Tech. rep., U.S. Energy Information Administration (EIA) (2020).
URL https://www.eia.gov/consumption/residential/data/2015/
[37] A. Rahman, V. Srikumar, A. D. Smith, Predicting electricity consumption for commercial and residential buildings using deep recurrent neural networks, Applied Energy 212 (2018) 372–385. doi:https://doi.org/10.1016/j.apenergy.2017.12.051.
[38] N. Somu, G. R. M R, K. Ramamritham, A hybrid model for building energy consumption forecasting using long short term memory networks, Applied Energy 261 (2020) 114131. doi:https://doi.org/10.1016/j.apenergy.2019.114131.
[39] H. Zhong, J. Wang, H. Jia, Y. Mu, S. Lv, Vector field-based support vector regression for building energy consumption prediction, Applied Energy 242 (2019) 403–414. doi:https://doi.org/10.1016/j.apenergy.2019.03.078.
[40] S. Kumar, S. K. Pal, R. P. Singh, Intra ELM variants ensemble based model to predict energy performance in residential buildings, Sustainable Energy, Grids and Networks 16 (2018) 177–187. doi:https://doi.org/10.1016/j.segan.2018.07.001.
[41] S. Huang, W. Zuo, M. D. Sohn, A Bayesian network model for predicting cooling load of commercial buildings, Building Simulation 11 (2018) 87–101. doi:https://doi.org/10.1007/s12273-017-0382-z.
Appendix

Table 9: Percentage of different types of buildings in Colorado Springs

| No. | Building type (prototype, heating, foundation, floor area in m²) | Percentage | No. | Building type (prototype, heating, foundation, floor area in m² per unit) | Percentage |
|---|---|---|---|---|---|
| 1 | SF electric slab 152 | 1.65% | 25 | MF electric slab 56 | 0.27% |
| 2 | SF electric slab 222 | 1.65% | 26 | MF electric slab 73 | 0.27% |
| 3 | SF electric slab 297 | 1.65% | 27 | MF electric slab 92 | 0.27% |
| 4 | SF electric crawlspace 152 | 1.65% | 28 | MF electric crawlspace 56 | 0.27% |
| 5 | SF electric crawlspace 222 | 1.65% | 29 | MF electric crawlspace 73 | 0.27% |
| 6 | SF electric crawlspace 297 | 1.65% | 30 | MF electric crawlspace 92 | 0.27% |
| 7 | SF electric heatedBasement 152 | 1.65% | 31 | MF electric heatedBasement 56 | 0.27% |
| 8 | SF electric heatedBasement 222 | 1.65% | 32 | MF electric heatedBasement 73 | 0.27% |
| 9 | SF electric heatedBasement 297 | 1.65% | 33 | MF electric heatedBasement 92 | 0.27% |
| 10 | SF electric unheatedBasement 152 | 1.65% | 34 | MF electric unheatedBasement 56 | 0.27% |
| 11 | SF electric unheatedBasement 222 | 1.65% | 35 | MF electric unheatedBasement 73 | 0.27% |
| 12 | SF electric unheatedBasement 297 | 1.65% | 36 | MF electric unheatedBasement 92 | 0.27% |
| 13 | SF gas slab 152 | 5.46% | 37 | MF gas slab 56 | 0.90% |
| 14 | SF gas slab 222 | 5.46% | 38 | MF gas slab 73 | 0.90% |
| 15 | SF gas slab 297 | 5.46% | 39 | MF gas slab 92 | 0.90% |
| 16 | SF gas crawlspace 152 | 5.46% | 40 | MF gas crawlspace 56 | 0.90% |
| 17 | SF gas crawlspace 222 | 5.46% | 41 | MF gas crawlspace 73 | 0.90% |
| 18 | SF gas crawlspace 297 | 5.46% | 42 | MF gas crawlspace 92 | 0.90% |
| 19 | SF gas heatedBasement 152 | 5.46% | 43 | MF gas heatedBasement 56 | 0.90% |
| 20 | SF gas heatedBasement 222 | 5.46% | 44 | MF gas heatedBasement 73 | 0.90% |
| 21 | SF gas heatedBasement 297 | 5.46% | 45 | MF gas heatedBasement 92 | 0.90% |
| 22 | SF gas unheatedBasement 152 | 5.46% | 46 | MF gas unheatedBasement 56 | 0.90% |
| 23 | SF gas unheatedBasement 222 | 5.46% | 47 | MF gas unheatedBasement 73 | 0.90% |
| 24 | SF gas unheatedBasement 297 | 5.46% | 48 | MF gas unheatedBasement 92 | 0.90% |

Notes:
Building prototype: single-family (SF) (86%), multi-family (MF) (14%);
Heating system: electric resistance (23%), gas furnace (77%);
Foundation: slab (25%), crawlspace (25%), heated basement (25%), and unheated basement (25%);
Floor area for single-family: 152 m² (33%), 222 m² (33%), 297 m² (33%);
Floor area for multi-family: 56 m² per unit (33%), 73 m² per unit (33%), 92 m² per unit (33%).
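
The notes to Table 9 suggest that each building type's share is the product of the marginal shares of its four attributes. The following is a minimal Python sketch of that arithmetic, not the authors' procedure: treating the attributes as independent and taking each floor-area share as exactly one third are assumptions made here for illustration, and the names `PROTOTYPE`, `HEATING`, `FOUNDATION`, `FLOOR_AREA`, and `type_share` are hypothetical. Because the inputs are rounded, the computed values can differ slightly from the tabulated ones (for example, the product gives about 5.5% for the single-family gas types, against 5.46% in the table).

```python
from itertools import product

# Marginal shares taken from the notes to Table 9.
PROTOTYPE = {"SF": 0.86, "MF": 0.14}
HEATING = {"electric": 0.23, "gas": 0.77}
FOUNDATION = {
    "slab": 0.25,
    "crawlspace": 0.25,
    "heatedBasement": 0.25,
    "unheatedBasement": 0.25,
}
# Assumption: the three floor areas are equally likely (1/3 each, printed as 33%).
FLOOR_AREA = {"SF": (152, 222, 297), "MF": (56, 73, 92)}


def type_share(prototype, heating, foundation):
    """Share of one building type, assuming the attributes are independent."""
    area_share = 1.0 / len(FLOOR_AREA[prototype])
    return PROTOTYPE[prototype] * HEATING[heating] * FOUNDATION[foundation] * area_share


# Enumerate all 48 combinations in the same naming style as Table 9.
shares = {}
for p, h, f in product(PROTOTYPE, HEATING, FOUNDATION):
    for area in FLOOR_AREA[p]:
        shares[f"{p} {h} {f} {area}"] = type_share(p, h, f)

for name, share in shares.items():
    print(f"{name}: {share:.2%}")
```

Under these assumptions the 48 shares sum to one, which gives a quick consistency check on the marginal percentages listed in the notes.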
Nomenclature

$\mathbf{i}$  A vector
$\hat{x}$  Normalized power demand
$\hat{x}^{avr}_{i,d}$  The average value of the normalized 24-hour power demand of building $i$ on the $d$th day
$\hat{x}^{max}_{i,d}$  The maximum of the normalized 24-hour power demand of building $i$ on the $d$th day
$\hat{x}^{min}_{i,d}$  The minimum of the normalized 24-hour power demand of building $i$ on the $d$th day
$\hat{y}$  The predicted power load
$\hat{y}^{t}$  The predicted power load at time $t$
$\hat{y}^{t}_{i}$  The predicted power demand of building $i$ at time $t$
$\mu$  The average value of the $N_e$ MAPEs
$\sigma$  Standard deviation
$a(\mathbf{i})$  The average distance between $\mathbf{i}$ and the other vectors that belong to the same cluster as $\mathbf{i}$
$b(\mathbf{i})$  The average distance between $\mathbf{i}$ and the vectors that belong to different clusters
$c$  The index of the building category
$c_n$  The building index in category $c$
CV  Coefficient of Variation
$D(x)$  The output of the Discriminator, which represents the probability that $x$ is real data
$G(z)$  The output of the Generator
$i$  Index of a building
$m$  Total amount of data
$m_{noise}$  The amount of noise data
$N_e$  The number of iterations of step 2.2 and step 2.3 for each selected percentage of typical buildings
$p_g$  The distribution of the generated data from the GAN
$p_z(z)$  The prior on the input noise variables
$p_{data}(x)$  The distribution of the real data $x$
$P_{noise}$  Noise rate
$S_i$  Silhouette coefficient
$s_i$  The index of the selected building in category $c$
$t$  Time
$V(D, G)$  The evaluation feedback function of the GAN
$x$  Original power demand
$x^{t}_{i}$  Original power demand of building $i$ at time $t$
$y^{t}$  The real power load at time $t$
$y^{t}_{i}$  The real power load of building $i$ at time $t$
$a$  Weighting factor
$b$  Weighting factor
$D$  The differentiable function of the Discriminator, usually represented as a multi-layer neural network
DBSCAN  Density-Based Spatial Clustering of Applications with Noise
E-GAN  The proposed hybrid method combining EnergyPlus and Generative Adversarial Network
ELM  Extreme Learning Machine
$G$  The differentiable function of the Generator, usually represented as a multi-layer neural network
GAN  Generative Adversarial Network
IEC  Information and Error Criterion
MAPE  Mean Absolute Percentage Error
RF  Random Forest
SVR  Support Vector Regression
$w$  Weighting factor
$z$  Noise variable
$\hat{x}^{t}_{i}$  Normalized power demand of building $i$ at time $t$
$x^{t}$  Original power demand at time $t$
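
Two of the quantities listed in the nomenclature, MAPE and the silhouette coefficient, can be illustrated with a short Python sketch. It uses the textbook definitions, which may differ in detail from the formulas in the body of the paper: MAPE is the mean of |y − ŷ| / |y| expressed in percent, and the silhouette coefficient is (b − a) / max(a, b), where b is taken as the smallest mean distance to any other cluster (with only two clusters this coincides with the description of b above). The function names `mape` and `silhouette`, the Euclidean distance, and the toy data are assumptions made for illustration.

```python
import math


def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent; actual values must be non-zero."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((y - y_hat) / y) for y, y_hat in zip(actual, predicted))
    return 100.0 * total / len(actual)


def silhouette(index, vectors, labels):
    """Silhouette coefficient (b - a) / max(a, b) of the vector at `index`."""

    def mean_dist(members):
        # Mean Euclidean distance from the chosen vector to the given members.
        return sum(math.dist(vectors[index], vectors[j]) for j in members) / len(members)

    own = [j for j, lab in enumerate(labels) if lab == labels[index] and j != index]
    others = {lab for lab in labels if lab != labels[index]}
    if not others:
        raise ValueError("at least two clusters are required")
    if not own:
        return 0.0  # common convention for a cluster with a single member
    a = mean_dist(own)
    # Textbook choice: b is the smallest mean distance to any other cluster.
    b = min(
        mean_dist([j for j, lab in enumerate(labels) if lab == other])
        for other in others
    )
    return (b - a) / max(a, b)


# Toy data: two well-separated clusters of two vectors each.
vectors = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
labels = [0, 0, 1, 1]
print(f"silhouette of vector 0: {silhouette(0, vectors, labels):.3f}")
print(f"MAPE: {mape([100.0, 200.0], [90.0, 210.0]):.2f}%")
```

A silhouette value close to 1 indicates that a vector sits well inside its own cluster, while a value near 0 or below suggests it lies between clusters or is assigned to the wrong one.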