Grey-Box Method for Urban Building Energy Modelling: Advancements and Potentials

MDPI
Energies
Special Issue
Ultra-Low Energy Consumption and Zero-Energy Buildings in Response to Climate Change
Edited by
Prof. Dr. Fei Guo, Prof. Dr. Stephen Siu Yu Lau, Dr. Baojie He and Prof. Dr. Andreas Matzarakis
Review
https://doi.org/10.3390/en17215463
Citation: Guo, Y.; Shi, J.; Guo, T.; Guo, F.; Lu, F.; Su, L. Grey-Box Method for Urban Building Energy Modelling: Advancements and Potentials. Energies 2024, 17, 5463. https://doi.org/10.3390/en17215463
Academic Editor: Alessandro Cannavale
Received: 15 September 2024
Revised: 16 October 2024
Accepted: 30 October 2024
Published: 31 October 2024
Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yucheng Guo 1, Jie Shi 2,*, Tong Guo 3, Fei Guo 4,*, Feng Lu 5 and Lingqi Su 2
1 College of Architecture and Urban Planning, Tongji University, Shanghai 200092, China; 2230059@tongji.com
2 Sino-German College of Applied Sciences, Tongji University, Shanghai 200092, China; lingqi_su@tongji.edu.cn
3 Hermann Rietschel Institute, Technical University of Berlin, Marchstraße 4, 10587 Berlin, Germany; tong.guo@tu-berlin.de
4 School of Architecture and Fine Art, Dalian University of Technology, Dalian 116024, China
5 Integrale Planung GmbH, Pfingstweidstrasse 16, 8005 Zurich, Switzerland; lu-pagenkopf@intep.com
* Correspondence: shijie@tongji.edu.cn (J.S.); guofei@dlut.edu.cn (F.G.)
Abstract: Urban building energy modelling (UBEM) has consistently been a pivotal tool to evaluate
and control a building stock’s energy consumption. There are two main approaches to build up
UBEM: top-down and bottom-up. The latter is the most commonly used in engineering. The bottom-
up approach includes three methods: the physical-based method, the data-driven method, and the
grey-box method. The first two methods have previously received ample attention and research.
The grey-box method, which has emerged in recent years, combines the traditional physical
method with the data-driven method, aiming to avoid their problems and merge their
advantages. Several approaches to building grey-box models now exist.
However, the majority of existing reviews on grey-box methods concentrate on a specific technical
approach and thus lack a comprehensive overview of modelling method perspectives. Accordingly,
by conducting a comprehensive review of the literature on grey-box research in recent years, this
paper classifies grey-box models into three categories from the perspective of modelling methods and
provides a detailed summary of each, concluding with a synthesis of potential research opportunities
in this area. The aim of this paper is to provide a foundational understanding of grey-box modelling
methods for similar research, thereby removing potential barriers in the field of research methods.
Keywords: urban building energy modelling; hybrid model; modelling approach; building energy
efficiency; building performance simulation
1. Introduction
Building energy modelling (BEM) and urban building energy modelling (UBEM) are crucial instruments for regulating energy consumption and carbon emissions in buildings. Energy modelling makes it possible to evaluate a building’s energy performance. For existing buildings, it helps to characterise current energy performance and thereby informs the necessary renovation and repair work. For new construction, it can be used to optimise energy performance from the design phase onwards. Compared with building energy modelling, urban building energy modelling is more complex because it involves the interrelationships and microclimates among buildings [1]. Balancing computational overhead against accuracy is also a key issue in this field of research [2]. However, because urban building energy models are flexible in scale, they can represent the energy consumption characteristics of anywhere from several to thousands of buildings, which helps to reflect the synergistic effect of a complete solution and gives them high application value. UBEM is therefore a hot research topic in the field of building energy.
As Figure 1 shows, there are two approaches to UBEM: the top-down approach and the bottom-up approach [3]. The former predicts the energy characteristics of buildings by analysing a large set of statistics that reflect the energy performance of a certain type of building. It is a statistical method and is normally based on socio-econometric, technological, and physical factors [4,5]. The top-down approach usually does not require detailed building information; however, the resulting predictions are comparatively less precise and cannot account for sudden shifts brought about by technological advancements [6,7]. A further limitation of this approach is the limited accessibility of large and accurate datasets [8].
Figure 1. Classification of UBEM modelling methods [9,10].
The bottom-up methodology is more prevalent in the field of engineering. Its modelling methods include the physics-based dynamic simulation method (or white-box model), the data-driven method (or black-box model), and the reduced-order method (or grey-box model, hybrid model).
At present, the physics-based dynamic simulation method is the most developed. This method takes climate data, building geometric information, and non-geometric information (schedules, thermo-physical properties of the envelope, etc.) as input and calculates energy consumption with energy equations. The most common computational tools are EnergyPlus and the many derivatives that use EnergyPlus as the underlying computational engine, including CityBES [11], COFFEE [12], and UMI (1.0) [13], in addition to IDA-ICE and DOE2. This method is highly comprehensible, but the model’s level of development has a significant impact on the results, and accuracy is greatly reduced when complete modelling information is difficult to obtain [14]. To ensure accurate simulation results, a thorough validation process is essential [15], as the uncertainty introduced by the parameters involved may lead to discrepancies in the results [16]. For one simplified physics model, studies have reported errors of ±10% for cooling and ±20% for heating [17]. In addition, the physics-based model computes building energy consumption by abstracting building geometry as nodes in a thermodynamic network, which is cumbersome and time-consuming when modelling a large number of buildings because of the large number of nodes and equations to be solved and the correspondingly high demand for computing power [3,18,19].
The data-driven method is particularly well suited to rapid assessment or real-time evaluation. It does not require detailed geometric and non-geometric information about the building as input, and it can significantly increase the speed of model computation. Related research falls into two categories: statistical approaches and artificial intelligence approaches [20]. The statistical approach is common in early research [4,19]. Historical statistics can reflect the impact of economic and social factors and of people’s behaviour on building energy consumption, which is difficult to characterise in a white-box approach [21,22]. The algorithms used for the statistical approach include regression analysis, conditional demand analysis, and artificial neural networks [19]. The most prevalent of these is multiple regression analysis, particularly linear regression. A typical example is a multivariate linear regression (MLR) model developed for the New York City LL84 building dataset to predict the relationship between building end-use energy intensity and a range of building characteristics, including age and energy type [23]. The artificial intelligence approach is frequently combined with machine learning technology to mine the fundamental relationships and patterns within a dataset, which can then be used to develop a model of urban building energy consumption. Such a model reflects the mathematical relationship between energy use and the building characteristics related to energy consumption [20]. In addition to regression, the artificial intelligence approach can perform a range of tasks, including classification and clustering [24,25]. The performance of a UBEM developed with the data-driven method depends on the performance of the algorithms employed and the quality of the dataset. Currently, these approaches are widely used and have demonstrated accuracy and validity. Apart from several artificial neural networks [26,27], the algorithms used for UBEM include the extreme learning machine (ELM) [28], the support vector machine (SVM) [29], multiple gradient boosting algorithms based on decision trees [30], and hierarchical clustering methods in unsupervised learning [31]. The accessibility and quality of datasets are significant constraints on the development of black-box methods. At present, open-source urban data is an important basis for relevant research [32,33]. Chen et al. found that the level of detail of the available information could meet the demands of research, but that the standardisation of data still needs further improvement [34]. Data shareability and data platform development also need to be improved further. Furthermore, the scope of open data that can be employed for data-driven applications is limited. The locations reported in the literature where open urban building energy consumption data are available are mainly in developed countries or regions, including New York [23,35,36], San Francisco [37], Chicago [33], and Hong Kong (China) [30]. Because the large amount of non-geometric information implicit in a black-box model, such as user behaviour, is affected by regional economic development and climate, the generalisability of the model is limited. In addition, although data-driven models can predict the energy consumption characteristics of buildings from historical data, they suffer from a problem similar to that of top-down approaches: they cannot reflect the role of technological breakthroughs. Moreover, complex data-driven models may require significant computational resources in the training phase.
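To make the multivariate linear regression idea concrete, the following is a minimal sketch in plain Python. It is not the model from [23]: the data are synthetic and hypothetical (they are not taken from the LL84 dataset), the function name `fit_mlr` is an illustrative choice, and ordinary least squares via the normal equations is assumed as the fitting method.

```python
def fit_mlr(features, targets):
    """Fit y = b0 + b1*x1 + ... by ordinary least squares (normal equations)."""
    rows = [[1.0] + list(r) for r in features]  # prepend an intercept column
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(rows, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot_row = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for k in range(n):
            if k != col:
                factor = aug[k][col]
                aug[k] = [vk - factor * vc for vk, vc in zip(aug[k], aug[col])]
    return [aug[i][n] for i in range(n)]


# Hypothetical building sample: (age in years, floor area in 1000 m2)
buildings = [(10, 1.0), (30, 2.5), (50, 1.5), (70, 4.0), (90, 3.0)]
# Energy use intensity generated from a known linear rule so the fit can be checked
eui = [80.0 + 0.5 * age + 2.0 * area for age, area in buildings]

# Returned order: [intercept, coefficient for age, coefficient for area]
coefficients = fit_mlr(buildings, eui)
```

Because the synthetic targets follow an exact linear rule, the fitted coefficients recover that rule; with real stock data the residuals would indicate how much of the energy intensity the chosen characteristics explain.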
The grey-box model combines the advantages of the black-box and white-box models, which can improve overall accuracy and produce more reliable results with less input information [38]. Fewer model details are required than in white-box models, which reduces modelling time and accelerates computation, and the capability and generalisability of the model are improved compared with black-box models. Related studies have also demonstrated the advantages of grey-box models in the field of building energy systems and HVAC control strategies [39,40]. According to statistics from three SCI journals focusing on building energy (Energy and Buildings, Building Simulation, and Energies), more than 30% of the UBEM-related research papers published in the last five years developed a grey-box method or adopted grey-box tools to assess the energy performance of urban buildings, and in 2023 this proportion exceeded 40% (Figure 2), which confirms the application value and research potential of the grey-box method. However, compared with black-box and white-box methods, the grey-box method is more ambiguously defined and its modelling approaches are relatively diverse, which may confuse researchers at the method selection stage of a study. Most previous reviews of UBEM grey-box approaches have summarised one specific modelling approach [41–43], without explaining the different modelling paths or summarising their strengths and limitations. For such an important current research area [41], this does little to help researchers select a grey-box model at the beginning of a study. Therefore, the aim of this paper is to introduce the different modelling paths for the grey-box model of urban building energy consumption and their technical details, and to summarise their advantages and limitations.
Figure 2. Trends in research papers related to grey-box tools in the last five years. (Search Scope:
Energy and Buildings, Building Simulation, Energies).
Section 2 of this paper classifies and briefly explains several concepts and modelling approaches present in grey-box modelling. Section 3 explains the technical details of the different types of grey-box modelling approach and summarises their respective advantages and limitations. Sections 4 and 5 compare and summarise the grey-box modelling approaches of the different technical paths and discuss the limitations and shortcomings of this paper’s work.
2. Methodology
2.1. Literature Search Strategy
The field of grey-box methods is evolving rapidly, yet it also rests on a substantial corpus of original and classic studies that established its foundations. Accordingly, this study conducts a comprehensive literature search, initially focusing on literature published since 2018 and subsequently examining classical literature identified through the citation networks within these texts. During the initial search, the following search query was used:
(TS = (urban building energy modelling) OR TI = (urban building energy modelling)) AND (AK = (hybrid model) OR AK = (data-driven) OR AK = (grey-box)) AND PY = (2018–2024)
(TS: Topic; TI: Title; AK: Author Keywords; PY: Published Year)
The retrieved literature was then initially classified according to the modelling approach.
2.2. Preliminary Analysis for the Classification of the Modelling Approach
Some basic concepts are first clarified on the basis of the retrieved literature. A preliminary categorisation is then made according to the technical pathways followed in the modelling process.
Early researchers in this field first proposed the concept of simplified building models [43]. At this stage, there are three main technical paths to building simplified models: applying order reduction techniques to detailed building models [44], building simplified models from known building information [45], and building simplified models directly and adopting inverse methods to determine the model parameter values [46]. The objective of all three technical routes is to construct a lumped capacitance model (also referred to as a thermal resistance and heat capacity model, or RC model), thereby simplifying the calculation process.
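As an illustration of what a lumped capacitance model looks like in practice, the sketch below simulates the simplest possible case, a single-resistance, single-capacitance (1R1C) zone governed by C·dT/dt = (T_out − T)/R + Q. The 1R1C structure, the forward-Euler integration, and all parameter values are assumptions chosen for illustration; the reviewed studies use a variety of RC network orders.

```python
def simulate_1r1c(t_start, t_out, q_heat, r, c, dt):
    """Forward-Euler integration of C*dT/dt = (T_out - T)/R + Q.

    t_start : initial indoor temperature (degC)
    t_out   : outdoor temperature for each step (degC)
    q_heat  : heat input for each step (W)
    r, c    : lumped thermal resistance (K/W) and capacitance (J/K)
    dt      : time step (s)
    """
    temps = [t_start]
    for t_o, q in zip(t_out, q_heat):
        t = temps[-1]
        temps.append(t + dt / c * ((t_o - t) / r + q))
    return temps


# Hypothetical parameters: time constant R*C = 1e5 s (about 28 h), hourly steps
R, C, DT = 0.01, 1.0e7, 3600.0
steps = 500
temps = simulate_1r1c(20.0, [0.0] * steps, [1000.0] * steps, R, C, DT)

# With constant inputs the zone settles at T_out + Q*R
steady_state = 0.0 + 1000.0 * R
```

A single zone is described here by two parameters instead of a detailed node network, which is the source of the computational saving; the trade-off is that R and C must be chosen or identified so that the simplified dynamics match the real building.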
In 2013, Foucquier et al. proposed the concept of the “hybrid model” [47], which integrates the physics-based method with the data-driven method so that the model input is simplified while its physical interpretation is maintained. Three strategies have been put forward for constructing hybrid models. The first uses machine learning techniques to estimate the parameters of a physics-based model. The second creates a dataset through the physics-based method, which is then used to construct a black-box model. The third applies physical and statistical methods to separate segments of the model: for example, detailed physical modelling may be employed in specific thermal zones, while statistical data are used in the remaining thermal zones. The term “reduced order model” (ROM) has been used in the majority of recent reviews [1,8], as the objective of grey-box models is to reduce the complexity of the model as far as possible while maintaining accuracy.
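The first strategy, estimating the parameters of a physics-based model from data, can be sketched as follows. This is only an illustration under stated assumptions: a 1R1C zone model stands in for the physics-based model, ordinary least squares stands in for the machine learning estimator, and the "measurements" are synthetic values generated by the same model with hypothetical true parameters.

```python
import math


def simulate(t_start, t_out, q_heat, r, c, dt):
    """Forward-Euler 1R1C zone model: C*dT/dt = (T_out - T)/R + Q."""
    temps = [t_start]
    for t_o, q in zip(t_out, q_heat):
        t = temps[-1]
        temps.append(t + dt / c * ((t_o - t) / r + q))
    return temps


def estimate_rc(temps, t_out, q_heat, dt):
    """Least-squares fit of a = dt/(R*C) and b = dt/C in
    T[k+1] - T[k] = a*(T_out[k] - T[k]) + b*Q[k], then recover R and C."""
    s11 = s12 = s22 = g1 = g2 = 0.0
    for k in range(len(t_out)):
        x1 = t_out[k] - temps[k]
        x2 = q_heat[k]
        d = temps[k + 1] - temps[k]
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        g1 += x1 * d
        g2 += x2 * d
    det = s11 * s22 - s12 * s12
    a = (g1 * s22 - g2 * s12) / det
    b = (g2 * s11 - g1 * s12) / det
    return b / a, dt / b  # (R, C)


# Synthetic "measurements" from hypothetical true parameters
TRUE_R, TRUE_C, DT = 0.01, 1.0e7, 3600.0
hours = range(240)
t_out = [5.0 + 5.0 * math.sin(2.0 * math.pi * h / 24.0) for h in hours]
q_heat = [2000.0 if h % 24 < 8 else 0.0 for h in hours]
measured = simulate(20.0, t_out, q_heat, TRUE_R, TRUE_C, DT)

r_est, c_est = estimate_rc(measured, t_out, q_heat, DT)
```

Since the synthetic data contain no noise, the estimates match the true values closely; with metered data the same regression would return the R and C that best reproduce the observed temperature changes, which is the sense in which the model stays physically interpretable.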
In this paper, the grey-box modelling method is classified into three categories based on the technical approach of modelling. The portion of the modelling process on which each of the three approaches focuses is shown in Figure 3. Approach 1 aims to speed up the modelling process by using statistical knowledge to reduce the level of refinement needed in the input parameters. Approach 2 improves the calculation process in white-box models with the help of statistical knowledge, in order to speed up the calculation or increase its accuracy. Approach 3 is closer to black-box models, but compared with traditional black-box models, the datasets built through the white-box method may contain more technical parameters, thus providing more support for subsequent technical analyses. Depending on the mixing process, the three approaches can be referred to as the Simplified Parameter Approach, the Computational Optimisation Approach, and the Data Expansion Approach, respectively.
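The Data Expansion Approach can be illustrated with a toy example in which a white-box calculation generates the training data for a black-box surrogate. Everything here is an assumption made for illustration: the steady-state degree-day formula stands in for a full physics-based simulation, a one-variable linear regression stands in for the black-box model, and the building dimensions and climate values are hypothetical.

```python
AIR_HEAT_CAPACITY = 0.33  # Wh/(m3 K), approximate volumetric heat capacity of air


def physics_heating_demand(u_value, envelope_area, volume, air_changes, hdd):
    """Toy white-box model: steady-state degree-day heating demand in kWh/year."""
    # Heat loss coefficient in W/K: transmission plus ventilation
    loss = u_value * envelope_area + AIR_HEAT_CAPACITY * air_changes * volume
    return loss * hdd * 24.0 / 1000.0


def fit_line(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope


# Step 1: the white-box model produces the dataset (hypothetical building)
AREA, VOLUME, ACH, HDD = 600.0, 900.0, 0.5, 3000.0
u_samples = [0.2 + 0.1 * i for i in range(15)]
demand = [physics_heating_demand(u, AREA, VOLUME, ACH, HDD) for u in u_samples]

# Step 2: a black-box surrogate is trained on that dataset
intercept, slope = fit_line(u_samples, demand)


def surrogate(u_value):
    """Fast approximation of the white-box result for a new U-value."""
    return intercept + slope * u_value
```

Because the generated dataset carries physical parameters such as the U-value, the surrogate can be queried for technical what-if questions, which is the advantage over a black-box model trained on metered consumption alone.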
Figure 3. Classification of grey-box modelling approaches’ focus.
3. Results
3.1. Review of the Simplified Parameter Approach
The physics-based method requires input parameters such as meteorological data, building geometry data, building thermal performance data, and occupant behaviour and usage data. The difficulty of obtaining complete and accurate input parameters is a recurring issue highlighted in existing research and can significantly influence the precision of the final outcomes. Using statistics to define normative criteria for the calculation process and for the parameters of the building and its technical equipment is a viable strategy. The overall process of the method may therefore be regarded as a white-box model, but statistical knowledge is included in the parameter input process in order to overcome the difficulties of data accessibility and to increase the speed of modelling. The general approach is shown in Figure 4.
The simplified parameter approach usually involves creating a database of the white-box model input parameters required for a particular region, so that users can reduce the difficulty of the data collection phase by calling the parameters in the database. The database typically contains meteorological data, equipment performance information, constants for regional simplified calculation methods, and other pertinent information. This approach is typically accompanied by the development of software tools, and the development objective is then to reduce the size of the database or to make the tool easier to use. A further solution is the classification or clustering of buildings based on their energy characteristics. This involves analysing the characteristics of the buildings in question and treating those with similar characteristics as a single type. Prototype buildings are then selected from each group, and the modelling parameters of these prototypes are extended to the other buildings with similar characteristics. This is also an example of grey-box modelling with simplified parameters.
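The database lookup at the core of this approach can be sketched as follows. The library layout, the age band boundaries, and every parameter value are hypothetical and serve only to show how default white-box inputs might be attached to a building from its use type and construction year; real tools ship far richer, region-specific libraries.

```python
from bisect import bisect_right

# Hypothetical age band boundaries: band 0 is before 1950, band 3 is 2000 onwards
AGE_BAND_EDGES = [1950, 1980, 2000]

# (use type, age band) -> default inputs; all values are illustrative only
ARCHETYPE_LIBRARY = {
    ("residential", 0): {"u_wall": 1.7, "window_to_wall": 0.20},
    ("residential", 1): {"u_wall": 1.2, "window_to_wall": 0.25},
    ("residential", 2): {"u_wall": 0.6, "window_to_wall": 0.30},
    ("residential", 3): {"u_wall": 0.3, "window_to_wall": 0.30},
    ("office", 0): {"u_wall": 1.5, "window_to_wall": 0.30},
    ("office", 1): {"u_wall": 1.1, "window_to_wall": 0.40},
    ("office", 2): {"u_wall": 0.5, "window_to_wall": 0.50},
    ("office", 3): {"u_wall": 0.3, "window_to_wall": 0.60},
}


def age_band(year_built):
    """Index of the age band that contains the construction year."""
    return bisect_right(AGE_BAND_EDGES, year_built)


def assign_parameters(building):
    """Return the building record extended with library defaults."""
    key = (building["use"], age_band(building["year_built"]))
    return {**building, **ARCHETYPE_LIBRARY[key]}


stock = [
    {"id": "B1", "use": "residential", "year_built": 1965},
    {"id": "B2", "use": "office", "year_built": 2005},
]
enriched = [assign_parameters(b) for b in stock]
```

Only two attributes per building need to be surveyed in this sketch; the remaining inputs come from the library, which is where the reduction in data collection effort arises.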
Figure 4. General workflow of simplified parameter approach.
Research on establishing simplified parameter grey-box models through the development of databases is mainly concentrated in Europe. This is closely linked to the large amount of data collected in building stock surveys led by national governments after the release of the European Union’s Energy Performance of Buildings Directive (EPBD II). A large number of related integrated calculation tools have been developed, such as the EPA series of tools, SimStadt (version 0.2), City Energy Analyst, and TEASER.
Bart Poel et al. developed the software EPA-ED (version 1.4.10.30) (for residential buildings) and EPA-NR (for non-residential buildings) for analysing the energy performance of existing buildings in the EU. The software is equipped with country-specific databases containing local weather files, building libraries (covering the thermal performance of the envelope and the performance of systems and equipment, among other factors), and constants of simplified calculation methods for specific regions. Case studies were conducted in Austria, Denmark, the Netherlands, and Greece [48]. In producing Energy Atlas Berlin, Kaden et al. combined geographic information data in the open CityGML format with envelope heat transfer coefficients and window-to-wall ratios specified according to the age of the buildings, thus realising fast energy consumption calculations on a city-wide scale. However, this study is relatively rudimentary and only considers the effect of building age on the thermal performance of the envelope [49]. SimStadt also employs the open CityGML format to realise geometric modelling at the city, region, and building levels and to define the modelling parameters required at each level. The SimStadt database comprises three main libraries: the building type library, which includes physical parameters based on building type and age; the building use library, which primarily covers user occupancy and operational parameters; and the energy system and fuel type libraries. Additionally, probabilistic extrapolation of default parameters is available where insufficient parameters are available [50,51]. City Energy Analyst (CEA) is calculation software developed by ETH Zurich. Compared with EPA and SimStadt, its built-in library is more complete and detailed, initially requiring 26 parameters stored in five databases: a meteorological database; a city geographic information database (containing information on building characteristics and the surrounding topography); a building prototype database (thermal characteristics of the building envelope, parameters of HVAC system equipment, and specific annual energy consumption values of different buildings of different ages); a distribution database (mainly occupancy characteristics and HVAC system setpoints); and a measurement database (energy consumption values of non-standardised building types) [52]. In subsequent development, CEA was further expanded by the addition of a metrics database (mainly containing economic and technical indicators) and a decision database (target database, mainly containing key indicators and weights in the decision-making process), which can better guide urban design and regeneration [53]. Compared with EPA and SimStadt, CEA has subsequently been adopted by researchers in different parts of the world, further improving its generalisation performance. Another open-source tool is TEASER, based on the Modelica computational engine, which provides parametric inputs for energy modelling by integrating statistical information on existing buildings and relevant standard codes. Its database covers building form, interior zoning, use functions, and envelope material properties, and it additionally uses a simplified RC model to speed up computation (the simplified model will be discussed and reviewed in Section 4) [54]. Table 1 lists the library parameters collected and included in the relevant tools or projects and the areas where they have been applied.
Table 1. Projects using simplified parameter approach.
Name | Library Parameters | Study Case
EPA-ED, EPA-NR | Local meteorological files; envelope thermal performance; system equipment performance; local simplified calculation method factors | EU countries (Austria, Denmark, Netherlands, Greece) [48]
Energy Atlas Berlin | Envelope thermal performance; window-to-wall ratio | Berlin [49]
SimStadt | Building physical parameters; occupancy and operation; energy system and fuel type | Germany (Ludwigsburg [50]), Netherlands (Rotterdam [55])
City Energy Analyst (CEA) | Meteorological data; building characteristics and surrounding terrain information; thermal characteristics of the envelope; characteristic parameters of HVAC system equipment; specific annual energy consumption values; energy consumption values for non-standardised building types; economic and technical indicators of equipment and systems; decision-making key indicators and weights | Switzerland (Zug [53], Zurich [56]), Singapore [57], Netherlands [56,58], Norway [56]
TEASER | Architectural form; interior zoning; functions; envelope material properties | Germany (Bonn [54], Hamburg [59]), Austria (Graz [60])
The use of stock databases undoubtedly reduces the need for input data refinement while maintaining considerable modelling accuracy. However, the current process of simplifying input parameters with databases relies on a substantial quantity of open-source data; this modelling approach is therefore not applicable to research in areas where open data are not accessible. Furthermore, collecting and maintaining a large-scale database requires considerable time and effort.
In regions that lack widely available and detailed data, a potential solution is to use prototypes to reduce the survey work. The prototype method is also applied to reduce the size of the database. In the modelling processes of CEA and Energy Atlas Berlin, the prototype method is used to assign the same computational parameters to a class of similar buildings. This method usually requires classification or clustering of a certain number of building energy features during the extraction process. The primary objective of the prototype extraction process is to establish a set of criteria for classification. These criteria serve to evaluate the extent to which the prototype accurately represents the prevailing characteristics of the class of buildings in question. A common set of criteria includes date of construction, building function, and climate zone [61,62]. On this basis, some research projects have sought to refine the classification criteria in order to improve the precision of prototype extraction and classification. A typical example is the TABULA project [63], which provides a classification of residential buildings in 21 European countries (Austria, Bosnia and Herzegovina, Belgium, Bulgaria, Cyprus, Czech Republic, Germany, Denmark, Spain, France, United Kingdom, Greece, Hungary, Ireland, Italy, Netherlands, Norway, Poland, Serbia, Sweden, Slovenia). In TABULA, residential buildings are categorised according to region, age, building structure, building use, and form of energy system. The study of Davila et al. was developed with Boston’s open-source data: the buildings were classified by year of construction and function of use, assigned non-geometric information based on relevant standards and construction references, and further consolidated and classified by energy characteristics (peak energy use and end-use energy intensities for lighting, appliances, water heating, heating, and cooling) using statistics from the United States Energy Information Administration [64]. In contrast, the study by Pasichnyi et al. employed the type of heating source as the primary criterion for categorising the building prototypes, because the study area, Stockholm, is in a cold region where heating energy consumption is a dominant factor [65].
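Classification by pre-set criteria followed by prototype selection can be sketched as below. The grouping keys mirror the kinds of criteria named above, but the sample records, the field names, and in particular the rule for choosing the representative (the building whose energy use intensity is closest to the group median) are assumptions made for illustration, not the procedure of any cited study.

```python
from collections import defaultdict
from statistics import median


def extract_prototypes(buildings, criteria):
    """Group buildings by the given criteria and pick one prototype per group.

    Assumed selection rule: the member whose energy use intensity (EUI)
    is closest to the group median represents the group.
    """
    groups = defaultdict(list)
    for b in buildings:
        groups[tuple(b[c] for c in criteria)].append(b)

    prototypes = {}
    for key, members in groups.items():
        target = median(m["eui"] for m in members)
        prototypes[key] = min(members, key=lambda m: abs(m["eui"] - target))
    return prototypes


# Hypothetical building stock; EUI in kWh/(m2 a)
stock = [
    {"id": "A", "use": "residential", "period": "pre-1980", "eui": 180.0},
    {"id": "B", "use": "residential", "period": "pre-1980", "eui": 150.0},
    {"id": "C", "use": "residential", "period": "pre-1980", "eui": 210.0},
    {"id": "D", "use": "residential", "period": "post-1980", "eui": 95.0},
    {"id": "E", "use": "office", "period": "post-1980", "eui": 130.0},
    {"id": "F", "use": "office", "period": "post-1980", "eui": 120.0},
    {"id": "G", "use": "office", "period": "post-1980", "eui": 160.0},
]

prototypes = extract_prototypes(stock, ["use", "period"])
```

Seven buildings reduce to three prototypes in this sketch, so only three detailed models need to be built; how well each prototype represents its class depends on how homogeneous the class is under the chosen criteria.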
In addition to classification based on pre-set criteria, some studies have employed
unsupervised learning techniques to cluster building samples. In comparison with the
traditional method, this method can facilitate the discovery of objective data patterns and,
to a certain extent, mitigate the potential influence of human subjectivity in classification.
The most commonly used unsupervised learning algorithms can be divided into two cate-
gories: data transformation and clustering. Principal Component Analysis (PCA) is one
of the most common data transformation algorithms, and clustering algorithms can be
classified into distance-based clustering and density-based clustering. The former includes
K-means, K-Medoide, GMM, PAM, Mean-Shift, Hierarchical clustering algorithms, etc.
Current research on prototype extraction using machine learning methods usually starts
from building morphological features or building energy features. Among the studies that used
building form as a basis for classification, Li et al. studied residential building
prototypes in Yuzhong District, Chongqing, China [66]. Building height, aspect ratio, and
compactness were used as morphological indicators, and the K-means and K-medoids
algorithms were each used for clustering. The study compared two indicator combinations
(building height with aspect ratio, and aspect ratio with compactness), and compared the
energy consumption of 321 buildings calculated by prototype extraction with that obtained by
building-by-building simulation. It concluded that the K-medoids algorithm with building
height and aspect ratio as the extraction metrics performed better; in that case, three
prototypes were extracted with an error of only 0.03% compared to the building-by-building
simulation method. Among the studies using energy characteristics as clustering attributes,
Borges et al. studied Andorran building prototypes [67] with a two-stage approach: building
types were first classified by building function and heating form, a K-means clustering
algorithm was then executed on the collected electricity use intensity, and finally
18 prototypes were extracted to represent
1172 residential and commercial buildings. Some studies were conducted with the indicators
combining morphology and energy features. Ghiassi et al. proposed an indicator
system based on building form (building volume, envelope area, building height, compactness),
solar gain (weighted window-to-wall ratio), envelope performance (effective
envelope heat transfer coefficient, effective wall heat transfer coefficient, effective roof heat
transfer coefficient, effective floor heat transfer coefficient), and operational parameters
(year-round occupancy, daytime occupancy ratio, year-round daytime occupancy ratio,
year-round nighttime occupancy ratio, weighted internal heat gain intensity, heating season
internal heat gain intensity, weighted hourly air change ratio, daily air change ratio).
Under these criteria, seven building types were obtained in a study of a neighbourhood in
Vienna [68,69]. Tardioli et al. identified representative buildings in the
city of Geneva [70]; this study considered both morphological and energy characteristics
of buildings in order to avoid the influence of building age and function on morphology
and energy. Buildings were first classified according to their year
of construction and their function. The number of floors, the
building floor area, the building perimeter, the building heating area, the building height,
annual energy consumption, energy index, emissions, and the number
of meters were then used as input parameters for clustering, resulting in the extraction of
67 building prototypes.
The majority of extant studies adopt a two-step approach to the application of machine
learning methods for prototype extraction, comprising firstly the artificial classification
of building types, typically based on functional classification, energy form, and year of
completion, and secondly the implementation of unsupervised clustering. The classification
process can effectively reduce the size of the dataset. Furthermore, since the majority of
current research employs the same distance-based clustering method (similarity measure)
as the clustering algorithm, reducing the dataset size can enhance the speed and accuracy
of the clustering process.
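The two-step procedure described above can be sketched as follows. This is a minimal illustration on synthetic data, not the method of any cited study: the feature columns (building height and aspect ratio), the class labels, and the helper names `kmeans` and `extract_prototypes` are assumptions made here, and a plain Lloyd's K-means stands in for whichever distance-based algorithm a study actually uses.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means on the rows of X; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every building to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    # Final assignment against the final centroids.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1), centroids

def extract_prototypes(features, functions, k_per_class):
    """Step 1: split the stock by a pre-set class (here: building function).
    Step 2: cluster within each class; the prototype of a cluster is the
    real building closest to the cluster centroid."""
    prototypes = {}
    for cls in np.unique(functions):
        idx = np.flatnonzero(functions == cls)
        X = features[idx]
        # Standardise so that no single indicator dominates the distance.
        Z = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
        labels, centroids = kmeans(Z, k_per_class)
        for j in range(k_per_class):
            members = idx[labels == j]
            if len(members) == 0:
                continue
            d = np.linalg.norm(Z[labels == j] - centroids[j], axis=1)
            prototypes[(cls, j)] = int(members[d.argmin()])
    return prototypes

# Synthetic stock: columns are [building height (m), aspect ratio]; values are made up.
rng = np.random.default_rng(42)
low = rng.normal([12.0, 1.5], [2.0, 0.2], size=(40, 2))
high = rng.normal([60.0, 3.0], [5.0, 0.3], size=(40, 2))
features = np.vstack([low, high, low + 1.0, high + 1.0])
functions = np.array(["residential"] * 80 + ["commercial"] * 80)

prototypes = extract_prototypes(features, functions, k_per_class=2)
for key, building_index in prototypes.items():
    print(key, building_index, features[building_index])
```

Clustering inside each pre-set class keeps every distance matrix small, which is the dataset-size reduction the paragraph refers to.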
UBEM through the prototype method usually involves running simulations on a single
prototype building of each type and then summing the results according to the proportion
of each building type within a city or region, which greatly reduces the amount of work required
for modelling. For example, Shen et al. used the classifications in the
US Residential Energy Consumption Survey (RECS), together with related studies, to
build building form prototypes for five types of residential buildings
(mobile, single detached, single attached, 2–4 unit flat, and 5 or more unit flat), and analysed
the energy consumption and energy saving potential of residential buildings within New
York State [71]. Deng et al. used Changsha as an example, extracting building prototypes
based on building use and number of floors and assigning building envelope performance
parameters based on the age of construction. The energy performance simulation was run
on individual buildings using EnergyPlus, and the final summation was performed based
on the percentage of different types of buildings in the city [72]. A similar approach was
extended to Shanghai, where electricity and natural gas usage were visualised [32]. With the
prototype methods described above, the simulation process is easy
and fast. However, the quality of prototype extraction can have a significant impact on
the final results, and the key lies in selecting appropriate indicators and making suitable
fuzzy abstractions to build the prototype, which usually requires rich relevant experience.
In addition, since simulations are usually performed on individual buildings only, the
interactions between the building and the environment, e.g., microclimate and daylight
shading, are not taken into account, which may have a negative effect on the accuracy of
the results.
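The summation step of the prototype method can be illustrated with a short sketch. All numbers, type names, and the function name `aggregate_stock_energy` are hypothetical; the simulated energy use intensity (EUI) of each prototype is assumed to be already available from a simulation engine.

```python
import math

def aggregate_stock_energy(total_floor_area_m2, type_shares, prototype_eui):
    """Scale each prototype's simulated EUI (kWh/m2 per year) by the floor
    area that its building type occupies in the stock, then sum."""
    if not math.isclose(sum(type_shares.values()), 1.0):
        raise ValueError("building-type shares must sum to 1")
    missing = set(type_shares) - set(prototype_eui)
    if missing:
        raise KeyError(f"no prototype simulation for: {sorted(missing)}")
    return sum(
        total_floor_area_m2 * share * prototype_eui[building_type]
        for building_type, share in type_shares.items()
    )

# Hypothetical city: 1.5 million m2 of floor area split over two building types.
shares = {"residential": 0.6, "office": 0.4}
eui = {"residential": 100.0, "office": 200.0}   # made-up prototype results
total_kwh = aggregate_stock_energy(1_500_000.0, shares, eui)
print(f"{total_kwh:.0f} kWh per year")
```

Because each prototype is simulated in isolation, this aggregation inherits the limitation noted above: interactions between neighbouring buildings do not enter the sum.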
3.2. Review of the Computational Optimisation Approach
In contrast to the simplified parameter approach, the computational optimisation
model is typically a streamlined representation of the energy calculation process, which
accelerates the calculation. Furthermore, simplifying the model may also
reduce the number of input parameters required. The most prevalent
simplified model is the RC model, which employs the lumped parameter approach. This
model represents the building as an analogous electrical circuit, replacing components with heat
transfer capability with resistors, components with heat storage capability with capacitors,
the temperature difference between indoor and outdoor with a voltage source, and the
solar radiation and other indoor heat gains with a current source. An RC model is typically
named after the number of thermal resistances and heat capacities it contains and
is designated as an xRyC model (where x represents the number of thermal resistances and
y represents the number of thermal capacitances). Alternatively, the nomenclature of the
z-order model is employed, where z represents the number of thermal capacitances [42].
For the sake of clarity, the nomenclature of the xRyC model is employed uniformly in this
paper. In Table 2, some common RC models and related research are shown.
Table 2. Some common RC models.

| Model Name | Simplifying Assumptions and Descriptions | Study Case |
|---|---|---|
| 1R1C | The simplest RC model, built on the following four assumptions [73]: (1) same initial temperature inside and outside the building; (2) good mixing and uniform distribution of temperatures in the room; (3) the room thermal resistance and heat capacity are global parameters and are not calculated from the values of each material; (4) no additional heat fluxes from solar radiation, ventilation, infiltration, thermal bridges, etc. | Indoor air temperature in single buildings [74–76]; solar heat gain in single buildings [75] |
| 2R1C | Heat transfer between the internal and external surfaces of the wall is considered, and convective heat transfer is included in the thermal resistance term [42]. The model can also reflect the uneven heating of the external surfaces of the building (e.g., roofs fitted with solar collectors [74]). It follows two basic assumptions [77]: (1) heat transfer in building components is a one-dimensional heat transfer process; (2) the thermal resistance and heat capacity in the model can be calculated as global parameters. | Indoor air temperature in a single building [74,78]; concrete embedded tube radiant floor [79]; heat load calculations for single buildings [80] |
| 3R1C | The heat transfer between internal and external surfaces is explicitly considered. The three thermal resistances represent the external surface heat transfer, the wall heat transfer, and the internal surface heat transfer, respectively. This is also the model adopted by ISO [81] and VDI. | Usually for engineering applications |
| 3R2C | Based on the 3R1C model, the wall heat storage is split into external and internal surface heat storage. It is one of the most widely used models due to its moderate degree of simplification. | Indoor air temperature in single building [74,82–86]; calculation of annual cooling and heating loads and peak loads in buildings with multiple heat zones [87]; energy performance of urban buildings [88] |
| 5R1C | Ventilation heat transfer and door and window heat transfer are considered to have no heat storage capacity and are abstracted as two thermal resistances. The building envelope is abstracted as an external surface thermal resistance, an internal surface thermal resistance, and a heat storage capacity. The indoor heat transfer is also abstracted as one thermal resistance [89]. This is also the model recommended by the EN ISO 13790 standard [81]. | Heating and cooling loads for individual buildings [89–92]; peak heating and cooling loads for individual buildings [92]; internal air temperature in a single building [90] |
| 5R4C | It can be disassembled into a 3R2C model and a 2R2C model, with the 3R2C model being used for thermodynamic modelling of the building envelope and the 2R2C model being used for thermodynamic modelling of the building’s interior components and furnishings [93]. | Calculation of real-time heat loads in individual building thermal zones [94–97] |
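As a worked illustration of the circuit analogy, the 1R1C model in Table 2 reduces to a single first-order energy balance. The symbols below are notation assumed here for illustration rather than taken from the cited studies: $T_\mathrm{in}$ is the indoor air temperature, $T_\mathrm{out}$ the outdoor temperature, $R$ the lumped thermal resistance (K/W), $C$ the lumped heat capacity (J/K), and $\Phi$ any heat input supplied to the zone (W), which is zero under the strictest reading of the fourth 1R1C assumption.

```latex
% Energy balance at the single indoor node (current balance at the capacitor)
C \,\frac{\mathrm{d}T_\mathrm{in}}{\mathrm{d}t}
  = \frac{T_\mathrm{out}(t) - T_\mathrm{in}(t)}{R} + \Phi(t)

% With constant T_out and \Phi = 0 the indoor temperature relaxes
% exponentially with time constant \tau = RC
T_\mathrm{in}(t) = T_\mathrm{out}
  + \bigl(T_\mathrm{in}(0) - T_\mathrm{out}\bigr)\, e^{-t/(RC)}

% Explicit Euler discretisation with time step \Delta t
% (numerically stable for \Delta t < 2RC)
T_\mathrm{in}^{k+1} = T_\mathrm{in}^{k}
  + \frac{\Delta t}{C}\left(\frac{T_\mathrm{out}^{k} - T_\mathrm{in}^{k}}{R} + \Phi^{k}\right)
```

Higher-order xRyC models follow the same pattern, with one such balance per capacitance node.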
In addition to some of the more common RC models mentioned above, a large number
of other RC models have been proposed for specific studies, such as the 4R1C model [87],
the 4R4C model [98], the 6R2C model [99], the 6R4C model [74], the 7R2C model [92], and
the 8R3C model [76]. For an overview and summary of the different types of RC models,
Li et al. conducted a detailed systematic review [42], which will not be repeated here.
The modelling process of the RC model shown in Figure 5 usually consists of two
steps. Firstly, the building is reasonably simplified according to the research
objectives, and the equivalent circuit model of the heat transfer process is established.
Then, based on the established circuit model, the values of the thermal resistance and
heat capacity parameters in the RC model are calculated from test or simulation
data. The values of the parameters in the RC model determine the accuracy of the model.
In order to ascertain the optimal parameter values, optimisation algorithms or machine
learning methods are frequently employed. Optimisation is typically carried out with global
search methods. For example, Harb et al. defined the upper and lower limits of each
parameter and used global search algorithms to identify the parameter set with the highest
accuracy against actual indoor air temperature data; the accuracy of different types of RC
models was then computed and compared for different types of buildings [76]. Hossain et al.
employed Bayesian neural networks
(BNNs) to determine the parameter values of RC models,
utilising a dataset derived from real thermostat recording data [100].
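The second step, identifying R and C from temperature data within upper and lower limits, can be sketched as below. This is an illustration under stated assumptions, not the procedure of the cited studies: a 1R1C model is used for brevity, plain random search stands in for whichever global search algorithm a study employs, and the "measured" series is synthetic, generated from hypothetical true values `R_true` and `C_true`.

```python
import numpy as np

def simulate_1r1c(R, C, t_out, q_heat, t0, dt):
    """Explicit-Euler solution of C*dT/dt = (T_out - T)/R + Q for one or many
    (R, C) candidates. Returns an array of shape (time steps, candidates)."""
    R = np.atleast_1d(np.asarray(R, dtype=float))
    C = np.atleast_1d(np.asarray(C, dtype=float))
    t_in = np.empty((len(t_out), R.size))
    t_in[0] = t0
    for k in range(len(t_out) - 1):
        t_in[k + 1] = t_in[k] + dt / C * ((t_out[k] - t_in[k]) / R + q_heat[k])
    return t_in

def random_search(measured, t_out, q_heat, dt, bounds, n_samples=3000, seed=0):
    """Sample (R, C) log-uniformly inside the bounds and keep the candidate
    with the lowest RMSE against the measured indoor temperature."""
    rng = np.random.default_rng(seed)
    R = np.exp(rng.uniform(*np.log(bounds["R"]), size=n_samples))
    C = np.exp(rng.uniform(*np.log(bounds["C"]), size=n_samples))
    sim = simulate_1r1c(R, C, t_out, q_heat, measured[0], dt)
    rmse = np.sqrt(np.mean((sim - measured[:, None]) ** 2, axis=0))
    best = int(rmse.argmin())
    return R[best], C[best], float(rmse[best])

# Synthetic week of hourly data (all values are made up for illustration).
dt = 3600.0                                          # time step in seconds
hours = np.arange(24 * 7)
t_out = 5.0 + 5.0 * np.sin(2 * np.pi * hours / 24)   # outdoor temperature, degC
q_heat = np.where(hours % 24 < 12, 2000.0, 0.0)      # heater schedule, W
R_true, C_true = 5e-3, 2e7                           # K/W and J/K (hypothetical)
noise = np.random.default_rng(1).normal(0.0, 0.1, hours.size)
measured = simulate_1r1c(R_true, C_true, t_out, q_heat, 18.0, dt)[:, 0] + noise

# Upper and lower limits chosen so that dt/(R*C) stays well inside Euler stability.
bounds = {"R": (2e-3, 2e-2), "C": (5e6, 5e7)}
R_hat, C_hat, err = random_search(measured, t_out, q_heat, dt, bounds)
print(f"R = {R_hat:.2e} K/W, C = {C_hat:.2e} J/K, RMSE = {err:.2f} K")
```

The heater schedule matters for identifiability: with outdoor temperature as the only excitation, the data mainly constrain the product RC rather than R and C separately.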
The RC model is one of the most widely used grey-box models, and many packaged
tools have been developed, such as TEASER [49], CitySim [101,102], OpenIDEAS [103],
and the Modelica library [104], which have been shown to perform well in
the prediction of dynamic temperatures, the calculation of cooling and heating
loads for individual buildings, and other related tasks. Some current studies that adopted
TEASER or CitySim report results at the neighbourhood or city scale [54,105]. However, there are
still some challenges in extending the RC model from BEM to UBEM. One challenge
is that different types of buildings may suit different levels of simplification [100];
a potential solution is to combine the RC model with the prototype method, extending the modelling
calculations from a small number of prototype buildings. Another challenge is the
calculation of solar radiation, which is mentioned in almost all relevant studies. A further notable
drawback of the RC model is that when alterations are made to a building, or even to a
component of a building within the model’s scope, the values of resistance and capacitance in
the original model must be recalibrated. This limits the model’s adaptability.
Figure 5. General workflow of computational optimisation approach based on RC model.
Another approach to the grey-box model focuses on a computational process that combines
simulation with the data-driven approach, as Figures 6 and 7 show. The simulation and
data-driven processes may be either serial or parallel. Serial processes are typically employed
to bridge the gap between physical and data-driven approaches. In contrast, parallel
processes depend on which data are available: the white-box
approach is used to perform calculations in the parts where the physical approach can be
implemented, while the black-box approach is used in the other parts. The two parts are
then aggregated to obtain the final result. Li et al. refer to the model created by this method
as a hybrid model [106], which does not aim to create a black-box model as an alternative
to thermodynamic calculations, but rather to use the black-box model as an adjunct or
complement to the white-box model.
A representative study that has adopted a serial structure is the DUE-S series of
studies [18,38,107], in which a baseline model of building energy was established by
EnergyPlus simulation. The building information required for simulation was simplified,
with the degree of simplification based on the trade-off between the level of detail (LOD)
and the data availability (DA). The objective was to establish a baseline model for the
analysis of general patterns of building energy use, rather than precise energy use. Relevant
studies have reported that the
direct transfer of the baseline model may result in significant errors [108]. Subsequently,
a deep learning approach was employed to train a calibrated model, using building
energy consumption data obtained from simulations together with test or reference data as the
training dataset. The objective was not to optimise the computational process, but rather to enhance
the accuracy of the building energy consumption model with less precise input data. The
fundamental premise of this approach is to use artificial intelligence for
the calibration of simulation outcomes, thereby reducing the need for detailed modelling input
parameters by drawing on open data on building energy. Similarly, Chen et al.
developed a meta-modelling approach for calibrating the simulation data of a white-box
model. This approach effectively improves the calibration efficiency and accuracy of the
simulation results by building a black-box model as a calibrator [109].
Figure 6. General workflow of computational optimisation approach combining simulation and
data-driven method in serial structure.
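The serial structure can be sketched in a few lines. Everything here is synthetic and hypothetical: `baseline_simulation` is a stand-in for a simplified white-box run (it is not EnergyPlus), `age_factor` is an invented proxy for information that the baseline lacks, and ordinary least squares replaces the deep learning calibrator used in the cited studies.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
floor_area = rng.uniform(500.0, 5000.0, n)   # m2, synthetic
age_factor = rng.uniform(0.8, 1.5, n)        # hypothetical proxy, available as open data

def baseline_simulation(area_m2):
    """Stand-in for a low-detail white-box run: a fixed EUI of 120 kWh/m2."""
    return 120.0 * area_m2

# Stage 1: physics-based baseline with coarse inputs.
simulated = baseline_simulation(floor_area)

# Synthetic "metered" data that deviate from the baseline in a way it cannot see.
measured = 0.9 * simulated * age_factor + rng.normal(0.0, 5000.0, n)

def calibrator_features(sim, age):
    """Inputs of the data-driven calibrator: the baseline output plus open data."""
    return np.column_stack([np.ones_like(sim), sim, sim * age])

def cvrmse(y_true, y_pred):
    """Coefficient of variation of the RMSE."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)) / np.mean(y_true))

# Stage 2: fit the calibrator on one part of the stock, evaluate on the rest.
fit, hold = slice(0, 150), slice(150, None)
weights, *_ = np.linalg.lstsq(
    calibrator_features(simulated[fit], age_factor[fit]), measured[fit], rcond=None
)
calibrated = calibrator_features(simulated[hold], age_factor[hold]) @ weights

baseline_error = cvrmse(measured[hold], simulated[hold])
calibrated_error = cvrmse(measured[hold], calibrated)
print(f"CV(RMSE) baseline: {baseline_error:.3f}, calibrated: {calibrated_error:.3f}")
```

The calibrator never replaces the simulation; it only corrects its output, which is why the approach still needs some measured or reference data.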
Studies that have adopted a parallel structure include those conducted by Li and
Dong et al., in which a thermodynamic model was employed to calculate heating and
cooling energy consumption, and a machine learning approach was used to calculate
non-air-conditioning energy consumption [110,111]. This approach mitigates the impact
of high-uncertainty factors inherent to traditional physical models, including room occupancy rates,
ventilation, and indoor occupant activity. Furthermore, by using machine learning as a statistical
method, it reduces the number of input parameters required for physical modelling,
with the study reporting an improvement in accuracy of approximately 15%. The
objective of the parallel structure is analogous to that of the serial structure: both
seek to enhance the precision of the model while reducing the quantity of
input data.
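A parallel structure can be sketched as two independent branches whose outputs are added. The degree-day formula for the white-box branch, the linear regression for the black-box branch, and all numbers are assumptions chosen for brevity; the cited studies use a thermodynamic model and a machine learning model of their own.

```python
import numpy as np

def heating_demand_kwh(ua_w_per_k, heating_degree_days):
    """White-box branch: steady-state degree-day estimate,
    UA [W/K] x HDD [K day] x 24 h/day / 1000 = kWh."""
    return ua_w_per_k * heating_degree_days * 24.0 / 1000.0

def fit_linear(X, y):
    """Black-box branch: ordinary least squares with an intercept."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    X = np.atleast_2d(X)
    return np.column_stack([np.ones(len(X)), X]) @ coef

# Synthetic metered non-HVAC energy (lighting, appliances) for 100 buildings.
rng = np.random.default_rng(0)
area = rng.uniform(200.0, 2000.0, 100)       # m2
occupants = rng.uniform(5.0, 50.0, 100)
non_hvac_kwh = 30.0 * area + 400.0 * occupants + rng.normal(0.0, 500.0, 100)
coef = fit_linear(np.column_stack([area, occupants]), non_hvac_kwh)

# A new building: each branch handles the part it is suited to, then the parts are summed.
hvac_part = heating_demand_kwh(ua_w_per_k=500.0, heating_degree_days=3000.0)
data_part = float(predict_linear(coef, [1000.0, 20.0])[0])
total_kwh = hvac_part + data_part
print(f"heating {hvac_part:.0f} + non-HVAC {data_part:.0f} = {total_kwh:.0f} kWh")
```

Occupant-driven loads, which are the hardest inputs to specify for a physical model, are learned from data, while the envelope-driven load keeps its physical interpretation.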
The primary challenge associated with this approach remains the initial stage of
acquiring authentic building energy data. Nutkiewicz et al. have proposed a
solution for cases where genuine test data are absent, namely the use of officially published
guideline reference building energy levels as calibration data [38]. Nevertheless, the
precision of this method in comparison to authentic data remains a subject of debate.
Furthermore, the optimised calculation method still requires the partial or complete
calculation of the building energy consumption by physical methods, which presents a
significant computational overhead when applied to large-scale studies.
Figure 7. General workflow of computational optimisation approach combining simulation and
data-driven method in parallel structure.
3.3. Review of the Data Expansion Approach
Unlike the grey-box model for optimisation calculations, which combines simulation
and data-driven calculations as mentioned above, the data expansion approach to
modelling aims to develop a surrogate model (also called an agent model) [106] that can be
used as an alternative to
physical simulation methods. As shown in Figure 8 below, this modelling approach is also
usually a two-step process: the building energy performance dataset is first created
by a white-box approach, and a black-box approach is then applied to that dataset
to create a statistical model. In the creation of the black-box
model, the input parameters are usually those required in the modelling of the
physical approach (meteorological data, geometric and non-geometric information of the
building, etc.). Moreover, with the introduction of model interpretation algorithms such as
SHAP, the influencing factors of black-box models of building energy consumption can be
summarised in a meaningful way [35,112–115].
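The two-step process can be sketched as follows. The function `white_box_heating_kwh` is a hypothetical steady-state stand-in for a simulation engine, with made-up constants, and a quadratic least-squares fit stands in for the neural networks used in the studies reviewed below; only the workflow (parametric sweep, then statistical model trained on the simulated dataset) follows the text.

```python
import numpy as np

def white_box_heating_kwh(u_wall, wwr, setpoint_c):
    """Hypothetical stand-in for a physics engine: steady-state facade heat loss.
    Assumes 400 m2 of facade, a window U-value of 2.0 W/(m2 K), a mean outdoor
    temperature of 5 degC and 4000 heating hours per year (all made-up values)."""
    ua = 400.0 * ((1.0 - wwr) * u_wall + wwr * 2.0)     # W/K
    return ua * (setpoint_c - 5.0) * 4000.0 / 1000.0    # kWh per year

def poly2_features(X):
    """Intercept, linear, interaction and squared terms of three inputs."""
    x1, x2, x3 = np.asarray(X, dtype=float).T
    return np.column_stack([np.ones(len(x1)), x1, x2, x3,
                            x1 * x2, x1 * x3, x2 * x3, x1**2, x2**2, x3**2])

# Step 1: parametric sweep with the white-box model to build the dataset.
u_vals = np.linspace(0.2, 1.5, 8)        # wall U-value, W/(m2 K)
wwr_vals = np.linspace(0.1, 0.6, 6)      # window-to-wall ratio
sp_vals = np.linspace(18.0, 24.0, 7)     # heating set point, degC
grid = np.array(np.meshgrid(u_vals, wwr_vals, sp_vals)).reshape(3, -1).T
y_train = white_box_heating_kwh(grid[:, 0], grid[:, 1], grid[:, 2])

# Step 2: fit a statistical surrogate on the simulated dataset.
coef, *_ = np.linalg.lstsq(poly2_features(grid), y_train, rcond=None)

def surrogate(X):
    return poly2_features(np.atleast_2d(X)) @ coef

# Check the surrogate on unseen parameter combinations inside the swept ranges.
rng = np.random.default_rng(0)
X_new = np.column_stack([rng.uniform(0.2, 1.5, 200),
                         rng.uniform(0.1, 0.6, 200),
                         rng.uniform(18.0, 24.0, 200)])
y_true = white_box_heating_kwh(X_new[:, 0], X_new[:, 1], X_new[:, 2])
y_pred = surrogate(X_new)
r2 = 1.0 - np.sum((y_true - y_pred) ** 2) / np.sum((y_true - y_true.mean()) ** 2)
print(f"surrogate R^2 on unseen combinations: {r2:.4f}")
```

Because the surrogate takes technical parameters as inputs, its sensitivity to each of them can afterwards be examined with interpretation tools such as SHAP.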
The traditional black-box approach relies on open datasets that usually lack detailed
technical characteristics, so building age or use function is adopted in related studies to
roughly reflect the thermal attributes of buildings. Table 3 below
summarises some current statistical datasets, from which three limitations of
current open datasets can be seen. Spatial limitation: these datasets are usually concentrated
in developed countries or regions. Temporal limitation: the time granularity of the datasets
is coarse, usually monthly or yearly. Technical limitation: the datasets
usually contain few or no technically detailed parameters.
Figure 8. General workflow of data expansion approach.
Therefore, compared to the traditional black-box model, this way of building a dataset
through a white-box approach has the following three advantages:
It overcomes the difficulty of data accessibility, which is important for conducting
relevant research in domains where open-source data is scarce.
The input parameters of the model will be more focused on technical features, which
will facilitate a more comprehensive understanding of the impact of technical parameters in
the design process on the energy performance of buildings among architects and engineers.
The range of input parameters for technical characteristics can cover values that may
be reached in future developments, which can reflect the impact of technical developments
on energy performance.
The first advantage is obvious: in a considerable number of countries and regions,
particularly developing ones, there is an acute need to enhance the
energy performance of buildings, yet the lack of open statistical data can be a hindrance to
relevant research, and the use of modelled data is an alternative. The latter two points
are more reflective of the differences and advantages of hybrid modelling over black-box
modelling. Firstly, current open datasets are typically not compiled from a technical perspective,
and input parameters such as building function and year of construction are
less useful than technical parameters for the design of refurbished or new buildings.
Secondly, when datasets are constructed using a white-box approach, it is possible
to anticipate the development of technical parameters, such as the performance of the
envelope, which can reflect the role of technological development in the improvement of
energy performance. In addition, the temporal resolution of existing datasets is typically
annual or monthly, which is suitable for modelling energy consumption at the urban scale.
However, when the scale is refined to a regional or neighbourhood level, data with higher
temporal resolution, such as daily or hourly, are essential for the design of energy systems,
including energy storage facilities and microgrids.
Table 3. Open datasets used for black-box modelling [116,117].

| Country/City | Name of the Dataset | Temporal Resolution | Data Features |
|---|---|---|---|
| USA | Building Performance Dataset (BPD) [118] | Yearly | Building location and climate zone; building envelope and system type; building envelope thermal resistance; building energy intensity; energy intensity by type of itemised statistics. |
| New York, USA | LL 84 [119] | Monthly | Building codes; monthly building energy consumption. Often used in conjunction with the PLUTO dataset [120], which contains address information, geometric information, building function, and year of construction/renovation information for New York City buildings. |
| Chicago, USA | Chicago Energy Benchmarking [121] | Yearly | Year of data; building function; building energy rating; floor area; construction year. |
| Seattle, USA | Seattle’s Building Energy Benchmarking Program [122] | Yearly | Building code; building location; building type; year of completion; number of storeys; energy intensity; breakdown of energy consumption by type of building. |
| Singapore | Listing of Building Energy Performance Data [123] | Yearly | Name of the building; location of the building; function of the building; year of completion; form and technical details of the building air-conditioning system; energy intensity of the building. |
| England and Wales, UK | Energy Performance of Buildings Register [124] | Monthly | Building location; building function; floor area; building envelope level (level descriptions, not specific values) and technical details of equipment; breakdown of building energy consumption (including renewable energy) and potential for retrofitting. |
| Ireland | Building Energy Rating [125] | Yearly | Building location; building function; building area; building envelope level (heat transfer coefficient U for different parts); envelope area and technical details of equipment; building energy intensity and carbon emission intensity. |
| Hong Kong, China | Energy Audit Form [126] | Yearly | Building address; building name; energy use index. |
This type of study aims to reduce the computation cost and improve efficiency.
In the absence of a Chinese open building energy consumption dataset, Ding et al. extracted
building prototypes, established a database from EnergyPlus simulation data, and used linear
regression to analyse the effects of year of completion
and land price on building energy consumption [127]. Liu et al. developed a machine
learning model based on a MARS model with energy consumption data generated from
EnergyPlus simulations. They then analysed the importance of different simulation input
parameters in order to propose potential building energy efficiency strategies [128]. Miu
et al. determined the value ranges of building envelope performance, operating parameters
and indoor temperature based on a synthesis of relevant studies and standards.
They also collected meteorological data of Hong Kong since 1989 and adopted a parametric
approach to obtain 620,000 parameter combinations. Based on these combinations, they established a
dataset and trained a hybrid EP-ANN model for building energy consumption calculations.
They then analysed the effect of envelope performance, indoor temperature set point,
and outdoor temperature on building energy consumption in Hong Kong based on the
results of the calculations [129]. In a recent study, Zhang et al. developed an artificial
neural network agent model to replace the CFD computational process during coupled
energy simulations. The study reported a two-thirds reduction in simulation time, which can
significantly speed up microclimate simulation calculations and thus improve the computational
efficiency of coupled simulations [130]. Vazquez-Canteli et al. constructed two deep
neural networks for the prediction of solar heat gain and building heat loss, respectively,
on a larger scale. These were then aggregated to calculate building energy consumption.
The training data for the deep neural networks were obtained from CitySim, in which
simulations were performed on 2620 buildings of varying heights in Austin, USA. The
simulation results were formed into a dataset comprising 25 input variables for the building
heat loss model and 15 input variables for the solar heat gain model. The most notable
contribution of the method is its computational speed: results are
returned within 12 s, which greatly improves the computational
efficiency of large-scale building energy models [131]. A study by José et al. mixed real
and simulated data during the construction of the dataset and found that the addition of
the simulated data and variables can improve the prediction accuracy of NARX and RNN
neural networks [132]. Westermann et al. used EnergyPlus simulations to train a residual
neural network incorporating
a feature learning process to calculate building energy consumption,
and verified the generalisability of the model by applying it to meteorological
data from several different climatic zones in Canada, concluding that feature learning can
effectively improve the generalisation performance of the agent model [133].
The data expansion approach for grey-box modelling, although it also combines
thermodynamic and statistical knowledge, is more oriented towards black-box modelling.
This improves the computational efficiency in comparison to white-box methods and adds
technical features to the inputs that can be used to explain the role of building attributes
in relation to the building’s energy consumption. Nevertheless, the primary issue is the
absence of a calibration process. Additionally, it is essential to deliberate on the manner in
which the requisite data for AI modelling in the development phase can be balanced with
the computational overhead associated with the white-box approach for data acquisition.
4. Discussion
In the above sections, we have reviewed several modelling approaches for establishing
grey-box models through relevant research. As proposed in the methodology, the grey-box
model is divided into three approaches. However, the boundaries between these approaches are
in fact quite blurred; for example, the establishment of TEASER combines the parameter
simplification method and the optimisation calculation method.
The basic logic of the simplified parameter approach is still physics-based, assisted
by statistical methods to reduce the parameters needed in the modelling process
so that the efficiency of the energy consumption calculation of urban buildings can be
improved. With this goal in mind, there are two methods to realise this approach. One
method is to collect as much information as possible to build a database, which is usually
done with government support. The other is to carry out a typology study with
a small amount of open data and build a prototype of each type, so that
a small amount of survey work suffices to obtain the detailed parameters of each prototype. The
two approaches can also be combined to reduce the size of the database and the cost of
development and maintenance. Nevertheless, since this approach remains contingent upon
simulation, the process of reducing the parameters introduces an element of uncertainty
into the model. Consequently, it would be prudent to incorporate a validation step into this
type of research process, with a view to reducing errors and ensuring an accurate reflection
of the energy performance of the building.
The focus of the computational optimisation approach to modelling is on the development
of new computational methods to replace the original computational engine. The
process of building new computational methods may also reduce the need for input parameters,
thus further simplifying the parameter input stage. The most common
simplification is the lumped-parameter approach, which is based on a thermodynamic
heat transfer network with a clear physical interpretation and is classified as a white-box
model in some studies. However, a large number of studies determine the
values of the parameters in the network by training machine-learning
or optimisation algorithms to obtain more accurate parameter values in the RC model. This is
a statistical process, so the approach also falls into the grey-box modelling category. Another approach
is to perform simulation and data-driven steps sequentially in the computational process to
achieve hybridisation, which is suitable for modelling situations where public open-source
energy data are available but lack technical detail. This hybridisation
process reduces the need for input parameters and provides model calibration, thereby
improving the accuracy of the model.
The data expansion approach is fundamentally data-driven; its main difference from the
traditional black-box approach is the source of the dataset for modelling. The data expansion
approach enables researchers to construct datasets that are more aligned with the specific
requirements of their study. Advanced AI technologies are integrated in this approach; for
instance, they can be used to analyse technological pathways or crucial influencing factors that
enhance the energy performance of a building. Furthermore, this approach is not constrained
by data openness and exhibits enhanced generalisability, which are two additional factors
that contribute to its popularity among a significant number of researchers.
Of the above three approaches, the first two focus on
enhancing the performance of the UBEM by improving its efficiency or accuracy, whereas
the third focuses mainly on performing technology-oriented analyses. Thus, the first
two have a natural advantage in the development of software tools, while the third is
more suitable for individualised research.
5. Conclusions
Grey-box models have shown greater potential for application than single white- or
black-box models due to their combination of thermodynamic and statistical knowledge,
and there has been a considerable amount of varied research work in this area in
recent years. As proposed in the Methodology Section, this paper classifies grey-box
models into three categories based on the modelling approach, which provides a
useful framework for related research in this field. However, there are still certain gaps that
need to be filled by subsequent research, and these potential research opportunities lie in
the following four main points:
The development of separate databases for application in different regions. The
parameter simplification methodology usually focuses on operational parameters,
especially occupant behaviour, which are significantly affected by the level of economic and
social development, cultural customs and local climatic conditions. For example, energy
intensity in developing countries and regions is significantly lower than in developed
countries and regions, and people living in hot areas use air conditioners at a higher
operating temperature than those in cold areas. This limits the application of the tools, as
most of the existing parameter simplification tools have been developed based on data
from developed countries. However, developing countries and regions have more pressing
sustainable building development needs, so the development of complementary databases
or plug-ins for application in these countries and regions is of significant relevance. Such
databases would also help to validate the simplified parameter approach and support
future research on data-driven modelling.
Parametric simplification methods with higher generalisability. Existing
prototype extraction studies have used crude parameters and, to some extent, have not
considered design parameters, which may result in cognitive biases between the building
design and building energy domains. In addition, prototype extraction often targets
individual buildings and thus ignores the influence of the urban fabric on building energy
consumption in urban environments. This influence may first need to be classified and
defined at the neighbourhood scale, with prototypes subsequently extracted on the basis of
building morphology and energy characteristics.
Combining the selection of building input parameters in the white-box and
black-box optimisation calculation methods. The most important input parameters can be
selected in the machine learning step in conjunction with data processing methods, such as
feature selection. This can further reduce the workload of data collection in the input phase
and also provide objective selection criteria for the modelling choices, i.e., at what
level of detail a more accurate white-box model can be built, and under what circumstances
further data collection or alternative modelling approaches are required.
Extending the calibration process in dataset approaches. Existing studies usually lack
a model calibration step during the dataset building process, which may compromise the
quality of the dataset and thus the accuracy of the surrogate model. One possible approach
is to analyse and correct the model's own errors during the simulation process by using
samples with open test data (not the research object itself), and to apply the results of
this calibration to the dataset building process to improve the quality of the dataset.
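The calibration idea in the last point can be sketched as follows. This is only one possible reading, assuming that the simulation error can be approximated by a linear correction fitted on reference buildings with open measured data. The function names, the linear form of the correction, the CV(RMSE) metric and all numbers are hypothetical choices made for illustration.

```python
import math


def fit_linear_correction(simulated, measured):
    """Ordinary least squares fit of measured = alpha + beta * simulated + error."""
    n = len(simulated)
    mean_s = sum(simulated) / n
    mean_m = sum(measured) / n
    cov = sum((s - mean_s) * (m - mean_m) for s, m in zip(simulated, measured))
    var = sum((s - mean_s) ** 2 for s in simulated)
    beta = cov / var
    alpha = mean_m - beta * mean_s
    return alpha, beta


def cv_rmse(predicted, measured):
    """Root-mean-square error divided by the mean of the measured values."""
    n = len(measured)
    rmse = math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n)
    return rmse / (sum(measured) / n)


# Hypothetical reference buildings with open measured data (annual energy use
# intensity); these are not the buildings under study.
ref_simulated = [100.0, 150.0, 200.0, 250.0]
ref_measured = [115.0, 170.0, 225.0, 280.0]

alpha, beta = fit_linear_correction(ref_simulated, ref_measured)

# Apply the fitted correction to simulated samples generated for the dataset.
dataset_simulated = [120.0, 180.0, 240.0]
dataset_calibrated = [alpha + beta * s for s in dataset_simulated]

before = cv_rmse(ref_simulated, ref_measured)
after = cv_rmse([alpha + beta * s for s in ref_simulated], ref_measured)
print(f"CV(RMSE) on reference samples: {before:.3f} -> {after:.3f}")
```

In practice the correction would need to be validated on reference samples that were not used for fitting, since a correction evaluated only on its own training samples overstates the improvement.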
Funding: This research received no external funding.
Conflicts of Interest: Author Feng Lu was employed by the Integrale Planung GmbH. The remaining
authors declare that the research was conducted in the absence of any commercial or financial
relationships that could be construed as a potential conflict of interest.
References
1.
Hong, T.; Chen, Y.; Luo, X.; Luo, N.; Lee, S.H. Ten questions on urban building energy modeling. Build. Environ. 2020,168, 106508.
[CrossRef]
2.
Picco, M.; Marengo, M. On the Impact of Simplifications on Building Energy Simulation for Early Stage Building Design. J. Eng.
Archit. 2015,3, 66–78. [CrossRef]
3.
Reinhart, C.F.; Cerezo Davila, C. Urban building energy modeling—A review of a nascent field. Build. Environ. 2016,97, 196–202.
[CrossRef]
4.
Li, W.; Zhou, Y.; Cetin, K.; Eom, J.; Wang, Y.; Chen, G.; Zhang, X. Modeling urban building energy use: A review of modeling
approaches and procedures. Energy 2017,141, 2445–2457. [CrossRef]
5.
Kong, D.; Cheshmehzangi, A.; Zhang, Z.; Ardakani, S.P.; Gu, T. Urban building energy modeling (UBEM): A systematic review of
challenges and opportunities. Energy Effic. 2023,16, 69. [CrossRef]
6.
Howard, B.; Parshall, L.; Thompson, J.; Hammer, S.; Dickinson, J.; Modi, V. Spatial distribution of urban building energy
consumption by end use. Energy Build. 2012,45, 141–151. [CrossRef]
7.
Hong, S.-M.; Paterson, G.; Burman, E.; Steadman, P.; Mumovic, D. A comparative study of benchmarking approaches for
non-domestic buildings: Part 1—Top-down approach. Int. J. Sustain. Built Environ. 2013,2, 119–130. [CrossRef]
8.
Ali, U.; Shamsi, M.H.; Hoare, C.; Mangina, E.; O’Donnell, J. Review of urban building energy modeling (UBEM) approaches,
methods and tools using qualitative and quantitative analysis. Energy Build. 2021,246, 111073. [CrossRef]
9.
Ferrando, M.; Causone, F.; Hong, T.; Chen, Y. Urban building energy modeling (UBEM) tools: A state-of-the-art review of
bottom-up physics-based approaches. Sustain. Cities Soc. 2020,62, 102408. [CrossRef]
10.
Guo, T.; Bachmann, M.; Kersten, M.; Kriegel, M. A combined workflow to generate citywide building energy demand profiles
from low-level datasets. Sustain. Cities Soc. 2023,96, 104694. [CrossRef]
11.
Hong, T.; Chen, Y.; Lee, S.H.; Piette, M.A. CityBES: A web-based platform to support city-scale building energy efficiency. In
Proceedings of the 5th International Urban Computing Workshop, San Francisco, CA, USA, 14 August 2016.
12.
Brackney, L.J. Portfolio-Scale Optimization of Customer Energy Efficiency Incentive and Marketing: Cooperative Research and Development
Final Report; CRADA Number CRD-13-535; National Renewable Energy Lab. (NREL): Golden, CO, USA, 2016.
13.
Reinhart, C.; Dogan, T.; Jakubiec, A.; Rakha, T.; Sang, A. UMI–an urban simulation environment for building energy use,
daylighting and walkability. In Proceedings of the 13th International Conference of the International-Building-Performance-
Simulation-Association (IBPSA), Chambery, France, 25–28 August 2013.
14.
Klimczak, M.; Bojarski, J.; Ziembicki, P.; Kęskiewicz, P. Analysis of the impact of simulation model simplifications on the quality
of low-energy buildings simulation results. Energy Build. 2018,169, 141–147. [CrossRef]
15.
Yu, J.; Chang, W.-S.; Dong, Y. Building Energy Prediction Models and Related Uncertainties: A Review. Buildings 2022,12, 1284.
[CrossRef]
16.
Malhotra, A.; Bischof, J.; Nichersu, A.; Häfele, K.-H.; Exenberger, J.; Sood, D.; Allan, J.; Frisch, J.; van Treeck, C.; O’Donnell, J.; et al.
Information modelling for urban building energy simulation—A taxonomic review. Build. Environ. 2022,208, 108552. [CrossRef]
17.
Battini, F.; Pernigotto, G.; Gasparella, A. District-level validation of a shoeboxing simplification algorithm to speed-up Urban
Building Energy Modeling simulations. Appl. Energy 2023,349, 121570. [CrossRef]
18.
Nutkiewicz, A.; Choi, B.; Jain, R.K. Exploring the influence of urban context on building energy retrofit performance: A hybrid
simulation and data-driven approach. Adv. Appl. Energy 2021,3, 100038. [CrossRef]
19.
Swan, L.G.; Ugursal, V.I. Modeling of end-use energy consumption in the residential sector: A review of modeling techniques.
Renew. Sustain. Energy Rev. 2009,13, 1819–1835. [CrossRef]
20.
Abbasabadi, N.; Ashayeri, M. Urban energy use modeling methods and tools: A review and an outlook. Build. Environ. 2019,
161, 106270. [CrossRef]
21.
Douthitt, R.A. An economic analysis of the demand for residential space heating fuel in Canada. Energy 1989,14, 187–197.
[CrossRef]
22.
Lim, H.; Zhai, Z.J. Review on stochastic modeling methods for building stock energy prediction. Build. Simul. 2017,10, 607–624.
[CrossRef]
23.
Kontokosta, C.E. Predicting building energy efficiency using New York City benchmarking data. In Proceedings of the 2012
ACEEE Summer Study on Energy Efficiency in Buildings. American Council for an Energy-Efficient Economy, Washington, DC,
USA, 11 July 2012.
24.
Zhao, Y.; Zhang, C.; Zhang, Y.; Wang, Z.; Li, J. A review of data mining technologies in building energy systems: Load prediction,
pattern identification, fault detection and diagnosis. Energy Built Environ. 2020,1, 149–164. [CrossRef]
25.
Wei, Y.; Zhang, X.; Shi, Y.; Xia, L.; Pan, S.; Wu, J.; Han, M.; Zhao, X. A review of data-driven approaches for prediction and
classification of building energy consumption. Renew. Sustain. Energy Rev. 2018,82, 1027–1047. [CrossRef]
26.
Ascione, F.; Bianco, N.; De Stasio, C.; Mauro, G.M.; Vanoli, G.P. CASA, cost-optimal analysis by multi-objective optimisation and
artificial neural networks: A new framework for the robust assessment of cost-optimal energy retrofit, feasible for any building.
Energy Build. 2017,146, 200–219. [CrossRef]
27.
Reveshti, A.M.; Khosravirad, E.; Rouzbahani, A.K.; Fariman, S.K.; Najafi, H.; Peivandizadeh, A. Energy consumption prediction
in an office building by examining occupancy rates and weather parameters using the moving average method and artificial
neural network. Heliyon 2024,10, e25307. [CrossRef]
28.
Sajjadi, S.; Shamshirband, S.; Alizamir, M.; Yee, P.L.; Mansor, Z.; Manaf, A.A.; Altameem, T.A.; Mostafaeipour, A. Extreme learning
machine for prediction of heat load in district heating systems. Energy Build. 2016,122, 222–227. [CrossRef]
29.
Zhang, F.; Deb, C.; Lee, S.E.; Yang, J.; Shah, K.W. Time series forecasting for building energy consumption using weighted Support
Vector Regression with differential evolution optimization technique. Energy Build. 2016,126, 94–103. [CrossRef]
30.
Cui, X.; Lee, M.; Koo, C.; Hong, T. Energy consumption prediction and household feature analysis for different residential
building types using machine learning and SHAP: Toward energy-efficient buildings. Energy Build. 2024,309, 113997. [CrossRef]
31.
Choi, S.; Yoon, S. Change-point model-based clustering for urban building energy analysis. Renew. Sustain. Energy Rev. 2024,199,
114514. [CrossRef]
32.
Song, C.; Deng, Z.; Zhao, W.; Yuan, Y.; Liu, M.; Xu, S.; Chen, Y. Developing urban building energy models for shanghai city with
multi-source open data. Sustain. Cities Soc. 2024,106, 105425. [CrossRef]
33.
Quan, S.J. Comparing hyperparameter tuning methods in machine learning based urban building energy modeling: A study in
Chicago. Energy Build. 2024,317, 114353. [CrossRef]
34.
Chen, Y.; Hong, T.; Luo, X.; Hooper, B. Development of city buildings dataset for urban building energy modeling. Energy Build.
2019,183, 252–265. [CrossRef]
35.
Li, Z.; Ma, J.; Jiang, F.; Zhang, S.; Tan, Y. Assessing the impacts of urban morphological factors on urban building energy modeling
based on spatial proximity analysis and explainable machine learning. J. Build. Eng. 2024,85, 108675. [CrossRef]
36.
Kontokosta, C.; Tull, C.; Marulli, D.; Pingerra, R.; Yaqub, M. Web-based visualization and prediction of urban energy use
from building benchmarking data. In Proceedings of the Bloomberg Data for Good Exchange Conference, New York, NY, USA,
28 September 2015.
37.
Chen, Y.; Deng, Z.; Hong, T. Automatic and rapid calibration of urban building energy models by learning from energy
performance database. Appl. Energy 2020,277, 115584. [CrossRef]
38.
Nutkiewicz, A.; Yang, Z.; Jain, R.K. Data-driven Urban Energy Simulation (DUE-S): A framework for integrating engineering
simulation and machine learning methods in a multi-scale urban energy modeling workflow. Appl. Energy 2018,225, 1176–1189.
[CrossRef]
39.
Déqué, F.; Ollivier, F.; Poblador, A. Grey boxes used to represent buildings with a minimum number of geometric and thermal
parameters. Energy Build. 2000,31, 29–35. [CrossRef]
40.
Reynders, G.; Diriken, J.; Saelens, D. Quality of grey-box models and identified parameters as function of the accuracy of input
and observation signals. Energy Build. 2014,82, 263–274. [CrossRef]
41.
Serasinghe, R.; Long, N.; Clark, J.D. Parameter identification methods for low-order gray box building energy models: A critical
review. Energy Build. 2024,311, 114123. [CrossRef]
42.
Li, Y.; O’Neill, Z.; Zhang, L.; Chen, J.; Im, P.; DeGraw, J. Grey-box modeling and application for building energy simulations—A
critical review. Renew. Sustain. Energy Rev. 2021,146, 111174. [CrossRef]
43. Kramer, R.; van Schijndel, J.; Schellen, H. Simplified thermal and hygric building models: A literature review. Front. Archit. Res.
2012,1, 318–325. [CrossRef]
44.
Gouda, M.M.; Danaher, S.; Underwood, C.P. Building thermal model reduction using nonlinear constrained optimization. Build.
Environ. 2002,37, 1255–1265. [CrossRef]
45.
Nielsen, T.R. Simple tool to evaluate energy demand and indoor environment in the early stages of building design. Sol. Energy
2005,78, 73–83. [CrossRef]
46.
Bălan, R.; Cooper, J.; Chao, K.-M.; Stan, S.; Donca, R. Parameter identification and model based predictive control of temperature
inside a house. Energy Build. 2011,43, 748–758. [CrossRef]
47.
Foucquier, A.; Robert, S.; Suard, F.; Stéphan, L.; Jay, A. State of the art in building modelling and energy performances prediction:
A review. Renew. Sustain. Energy Rev. 2013,23, 272–288. [CrossRef]
48.
Poel, B.; van Cruchten, G.; Balaras, C.A. Energy performance assessment of existing dwellings. Energy Build. 2007,39, 393–403.
[CrossRef]
49.
Kaden, R.; Kolbe, T.H. City-wide total energy demand estimation of buildings using semantic 3D city models and statistical data.
ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2013,2, 163–171.
50.
Nouvel, R.; Brassel, K.-H.; Bruse, M.; Duminil, E.; Coors, V.; Eicker, U.; Robinson, D. SimStadt, a new workflow-driven urban
energy simulation platform for CityGML city models. In Proceedings of the International Conference CISBAT 2015 Future
Buildings and Districts Sustainability from Nano to Urban Scale, Lausanne, Switzerland, 9–11 September 2015; pp. 889–894.
51.
Monsalvete, P.; Robinson, D.; Eicker, U. Dynamic simulation methodologies for urban energy demand. Energy Procedia 2015,78,
3360–3365. [CrossRef]
52.
Fonseca, J.A.; Schlueter, A. Integrated model for characterization of spatiotemporal building energy consumption patterns in
neighborhoods and city districts. Appl. Energy 2015,142, 247–265. [CrossRef]
53.
Fonseca, J.A.; Nguyen, T.-A.; Schlueter, A.; Marechal, F. City Energy Analyst (CEA): Integrated framework for analysis and
optimization of building energy systems in neighborhoods and city districts. Energy Build. 2016,113, 202–226. [CrossRef]
54.
Remmen, P.; Lauster, M.; Mans, M.; Fuchs, M.; Osterhage, T.; Müller, D. TEASER: An open tool for urban energy modelling of
building stocks. J. Build. Perform. Simul. 2018,11, 84–98. [CrossRef]
55.
Nouvel, R.; Mastrucci, A.; Leopold, U.; Baume, O.; Coors, V.; Eicker, U. Combining GIS-based statistical and engineering urban
heat consumption models: Towards a new framework for multi-scale policy support. Energy Build. 2015,107, 204–212. [CrossRef]
56.
Mosteiro-Romero, M.; Hischier, I.; Fonseca, J.A.; Schlueter, A. A novel population-based occupancy modeling approach for
district-scale simulations compared to standard-based methods. Build. Environ. 2020,181, 107084. [CrossRef]
57.
Happle, G.; Fonseca, J.A.; Schlueter, A. Impacts of diversity in commercial building occupancy profiles on district energy demand
and supply. Appl. Energy 2020,277, 115594. [CrossRef]
58.
Mosteiro-Romero, M.; Maiullari, D.; Collins, F.; Schlueter, A.; Timmeren, A.V. District-scale energy demand modeling and urban
microclimate: A case study in The Netherlands. J. Phys. Conf. Ser. 2019,1343, 012003. [CrossRef]
59.
Malhotra, A.; Shamovich, M.; Frisch, J.; van Treeck, C. Urban energy simulations using open CityGML models: A comparative
analysis. Energy Build. 2022,255, 111658. [CrossRef]
60.
Boiger, T.; Schweiger, G. SHP2SIM: A python pipeline for Modelica based district and urban scale energy simulations. Int. J.
Sustain. Energy 2023,42, 1028–1041. [CrossRef]
61.
Dascalaki, E.G.; Droutsa, K.G.; Balaras, C.A.; Kontoyiannidis, S. Building typologies as a tool for assessing the energy performance
of residential buildings—A case study for the Hellenic building stock. Energy Build. 2011,43, 3400–3409. [CrossRef]
62.
Dascalaki, E.G.; Droutsa, K.; Gaglia, A.G.; Kontoyiannidis, S.; Balaras, C.A. Data collection and analysis of the building stock and
its energy performance—An example for Hellenic buildings. Energy Build. 2010,42, 1231–1237. [CrossRef]
63.
Loga, T.; Diefenbach, N.; Stein, B.; Balaras, C.; Villatoro, O.; Wittchen, K. Typology Approach for Building Stock Energy Assessment;
Main Results of the TABULA Project; Institut Wohnen und Umwelt GmbH: Darmstadt, Germany, 2012.
64.
Cerezo Davila, C.; Reinhart, C.F.; Bemis, J.L. Modeling Boston: A workflow for the efficient generation and maintenance of urban
building energy models from existing geospatial datasets. Energy 2016,117, 237–250. [CrossRef]
65.
Pasichnyi, O.; Wallin, J.; Kordas, O. Data-driven building archetypes for urban building energy modelling. Energy 2019,181,
360–377. [CrossRef]
66.
Li, X.; Yao, R.; Liu, M.; Costanzo, V.; Yu, W.; Wang, W.; Short, A.; Li, B. Developing urban residential reference buildings using
clustering analysis of satellite images. Energy Build. 2018,169, 417–429. [CrossRef]
67.
Borges, P.; Travesset-Baro, O.; Pages-Ramon, A. Hybrid approach to representative building archetypes development for urban
models—A case study in Andorra. Build. Environ. 2022,215, 108958. [CrossRef]
68.
Ghiassi, N.; Mahdavi, A. Reductive bottom-up urban energy computing supported by multivariate cluster analysis. Energy Build.
2017,144, 372–386. [CrossRef]
69.
Ghiassi, N.; Tahmasebi, F.; Mahdavi, A. Harnessing buildings’ operational diversity in a computational framework for high-
resolution urban energy modeling. Build. Simul. 2017,10, 1005–1021. [CrossRef]
70.
Tardioli, G.; Kerrigan, R.; Oates, M.; O’Donnell, J.; Finn, D.P. Identification of representative buildings and building groups in
urban datasets using a novel pre-processing, classification, clustering and predictive modelling approach. Build. Environ. 2018,
140, 90–106. [CrossRef]
71.
Shen, P.; Wang, Z.; Ji, Y. Exploring potential for residential energy saving in New York using developed lightweight prototypical
building models based on survey data in the past decades. Sustain. Cities Soc. 2021,66, 102659. [CrossRef]
72.
Deng, Z.; Chen, Y.; Yang, J.; Chen, Z. Archetype identification and urban building energy modeling for city-scale buildings based
on GIS datasets. Build. Simul. 2022,15, 1547–1559. [CrossRef]
73.
Park, H.; Ruellan, M.; Bouvet, A.; Monmasson, E.; Bennacer, R. Thermal Parameter Identification of Simplified Building Model
with Electric Appliance. In Proceedings of the 11th International Conference on Electrical Power Quality and Utilisation, Lisbon,
Portugal, 17–19 October 2011.
74.
Fux, S.F.; Ashouri, A.; Benz, M.J.; Guzzella, L. EKF based self-adaptive thermal model for a passive house. Energy Build. 2014,68,
811–817. [CrossRef]
75.
Omar, F.; Bushby, S.T.; Williams, R.D. A self-learning algorithm for estimating solar heat gain and temperature changes in a
single-Family residence. Energy Build. 2017,150, 100–110. [CrossRef]
76.
Harb, H.; Boyanov, N.; Hernandez, L.; Streblow, R.; Müller, D. Development and validation of grey-box models for forecasting
the thermal response of occupied buildings. Energy Build. 2016,117, 199–207. [CrossRef]
77.
Mathews, E.H.; Richards, P.G.; Lombard, C. A first-order thermal model for building design. Energy Build. 1994,21, 133–145.
[CrossRef]
78.
Wei, Z.; Ren, F.; Zhu, Y.; Yue, B.; Ding, Y.; Zheng, C.; Li, B.; Zhai, X. Data-driven two-step identification of building thermal
characteristics: A case study of office building. Appl. Energy 2022,326, 119949. [CrossRef]
79. Li, A.; Sun, Y.; Xu, X. Development of a simplified resistance and capacitance (RC)-network model for pipe-embedded concrete
radiant floors. Energy Build. 2017,150, 353–375. [CrossRef]
80.
Mirakhorli, A.; Dong, B. Model predictive control for building loads connected with a residential distribution grid. Appl. Energy
2018,230, 627–642. [CrossRef]
81.
International Standard Organization. Energy Performance of Buildings—Calculation of Energy Use for Space Heating and Cooling; ISO:
Geneva, Switzerland, 2008.
82.
Wang, S.; Xu, X. Simplified building model for transient thermal performance estimation using GA-based parameter identification.
Int. J. Therm. Sci. 2006,45, 419–432. [CrossRef]
83.
Dimitriou, V.; Firth, S.K.; Hassan, T.M.; Kane, T. The applicability of Lumped Parameter modelling in houses using in-situ
measurements. Energy Build. 2020,223, 110068. [CrossRef]
84.
Chan, K.; Bashash, S. Modeling and Energy Cost Optimization of Air Conditioning Loads in Smart Grid Environments. In
Proceedings of the ASME 2017 Dynamic Systems and Control Conference, Tysons, VA, USA, 11–13 October 2017.
85.
Wang, J.; Jiang, Y.; Tang, C.Y.; Song, L. Development and validation of a second-order thermal network model for residential
buildings. Appl. Energy 2022,306, 118124. [CrossRef]
86.
Brastein, O.M.; Ghaderi, A.; Pfeiffer, C.F.; Skeie, N.O. Analysing uncertainty in parameter estimation and prediction for grey-box
building thermal behaviour models. Energy Build. 2020,224, 110236. [CrossRef]
87.
Wang, X.; Tian, S.; Ren, J.; Jin, X.; Zhou, X.; Shi, X. A novel resistance-capacitance model for evaluating urban building energy
loads considering construction boundary heterogeneity. Appl. Energy 2024,361, 122896. [CrossRef]
88.
Bueno, B.; Norford, L.; Pigeon, G.; Britter, R. A resistance-capacitance network model for the analysis of the interactions between
the energy performance of buildings and the urban climate. Build. Environ. 2012,54, 116–125. [CrossRef]
89.
Michalak, P. The simple hourly method of EN ISO 13790 standard in Matlab/Simulink: A comparative study for the climatic
conditions of Poland. Energy 2014,75, 568–578. [CrossRef]
90.
Michalak, P. The development and validation of the linear time varying Simulink-based model for the dynamic simulation of the
thermal performance of buildings. Energy Build. 2017,141, 333–340. [CrossRef]
91.
Bruno, R.; Pizzuti, G.; Arcuri, N. The Prediction of Thermal Loads in Building by Means of the EN ISO 13790 Dynamic Model: A
Comparison with TRNSYS. Energy Procedia 2016,101, 192–199. [CrossRef]
92.
Vivian, J.; Zarrella, A.; Emmi, G.; De Carli, M. An evaluation of the suitability of lumped-capacitance models in calculating energy
needs and thermal behaviour of buildings. Energy Build. 2017,150, 447–465. [CrossRef]
93.
Xu, X. Model Based Building Evaluation and Diagnosis. Ph.D. Thesis, The Hong Kong Polytechnic University, Kowloon,
Hong Kong, 2005.
94.
Ogunsola, O.T.; Song, L.; Wang, G. Development and validation of a time-series model for real-time thermal load estimation.
Energy Build. 2014,76, 440–449. [CrossRef]
95.
Ogunsola, O.T.; Song, L. Application of a simplified thermal network model for real-time thermal load estimation. Energy Build.
2015,96, 309–318. [CrossRef]
96.
Lin, X.; Tian, Z.; Song, W.; Lu, Y.; Niu, J.; Sun, Q.; Wang, Y. Grey-box modeling for thermal dynamics of buildings under the
presence of unmeasured internal heat gains. Energy Build. 2024,314, 114229. [CrossRef]
97.
Zhou, Q.; Wang, S.; Xu, X.; Xiao, F. A grey-box model of next-day building thermal load prediction for energy-efficient control.
Int. J. Energy Res. 2008,32, 1418–1431. [CrossRef]
98.
Cui, B.; Fan, C.; Munk, J.; Mao, N.; Xiao, F.; Dong, J.; Kuruganti, T. A hybrid building thermal modeling approach for predicting
temperatures in typical, detached, two-story houses. Appl. Energy 2019,236, 101–116. [CrossRef]
99.
Berthou, T.; Stabat, P.; Salvazet, R.; Marchio, D. Development and validation of a gray box model to predict thermal behavior of
occupied office buildings. Energy Build. 2014,74, 91–100. [CrossRef]
100.
Hossain, M.M.; Zhang, T.; Ardakanian, O. Identifying grey-box thermal models with Bayesian neural networks. Energy Build.
2021,238, 110836. [CrossRef]
101.
Darren, R.; Frédéric, H.; Philippe, L.; Diane, P.; Adil, R.; Urs, W. CITYSIM: Comprehensive Micro-Simulation of Resource Flows
for Sustainable Urban Planning. In Proceedings of the Eleventh International IBPSA Conference, Glasgow, UK, 27–30 July 2009;
pp. 1083–1090.
102.
Emmanuel, W.; Jérôme Henri, K. A Verification of CitySim Results Using the BESTEST and Monitored Consumption Values. In
Proceedings of the 2nd Building Simulation Applications Conference, Bolzano, Italy, 4–6 February 2015; Bozen-Bolzano University
Press: Bolzano, Italy, 2015; pp. 215–222.
103.
Baetens, R.; De Coninck, R.; Jorissen, F.; Picard, D.; Helsen, L.; Saelens, D. Openideas-an Open Framework for Integrated District
Energy Simulations. In Proceedings of the Building Simulation, Hyderabad, India, 7–9 December 2015; pp. 345–354.
104.
Wetter, M.; Fuchs, M.; Grozman, P.; Helsen, L.; Jorissen, F.; Müller, D.; Nytsch-Geusen, C.; Picard, D.; Sahlin, P.; Thorade, M. IEA
EBC Annex 60 Modelica Library—An International Collaboration to Develop a Free Open-Source Model Library for Buildings
and Community Energy Systems. In Proceedings of the Building Simulation, Hyderabad, India, 7–9 December 2015; pp. 395–402.
105.
Vázquez-Canteli, J.R.; Kämpf, J. Massive 3D models and physical data for building simulation at the urban scale: A focus on
Geneva and climate change scenarios. WIT Trans. Ecol. Environ. 2016,204, 35–46.
106.
Li, Z.; Ma, J.; Tan, Y.; Guo, C.; Li, X. Combining physical approaches with deep learning techniques for urban building energy
modeling: A comprehensive review and future research prospects. Build. Environ. 2023,246, 110960. [CrossRef]
107.
Nutkiewicz, A.; Yang, Z.; Jain, R.K. Data-driven Urban Energy Simulation (DUE-S): Integrating machine learning into an urban
building energy simulation workflow. Energy Procedia 2017,142, 2114–2119. [CrossRef]
108.
Eggimann, S.; Fiorentini, M. Transferring energy signatures across space and time to assess their viability for rapid urban energy
demand estimation. Energy Build. 2024,316, 114348. [CrossRef]
109.
Chen, J.; Gao, X.; Hu, Y.; Zeng, Z.; Liu, Y. A meta-model-based optimization approach for fast and reliable calibration of building
energy models. Energy 2019,188, 116046. [CrossRef]
110.
Li, Z.; Dong, B.; Vega, R. A Hybrid Model for Electrical Load Forecasting—A New Approach Integrating Data-Mining with
Physics-Based Models. In Proceedings of the ASHRAE Atlanta Conference 2015, Atlanta, GA, USA, 30 September–2 October
2015.
111.
Dong, B.; Li, Z.; Rahman, S.M.M.; Vega, R. A hybrid model approach for forecasting future residential electricity consumption.
Energy Build. 2016,117, 341–351. [CrossRef]
112.
Lan, H.; Gou, Z.; Hou, C. Understanding the relationship between urban morphology and solar potential in mixed-use neighbor-
hoods using machine learning algorithms. Sustain. Cities Soc. 2022,87, 104225. [CrossRef]
113.
Zhang, W.; Liu, F.; Wen, Y.; Nee, B. Toward Explainable and Interpretable Building Energy Modelling: An Explainable Artificial
Intelligence Approach. In Proceedings of the 8th ACM International Conference on Systems for Energy-Efficient Buildings, Cities,
and Transportation, Association for Computing Machinery, Coimbra, Portugal, 17–18 November 2021; pp. 255–258.
114.
Mouakher, A.; Inoubli, W.; Ounoughi, C.; Ko, A. Expect: EXplainable Prediction Model for Energy ConsumpTion. Mathematics
2022,10, 248. [CrossRef]
115.
Fan, C.; Xiao, F.; Yan, C.; Liu, C.; Li, Z.; Wang, J. A novel methodology to explain and evaluate data-driven building energy
performance models based on interpretable machine learning. Appl. Energy 2019,235, 1551–1560. [CrossRef]
116.
Mims, N.; Schiller, S.; Stuart, E.; Schwartz, L.; Kramer, C.; Faesy, R. Evaluation of US Building Energy Benchmarking and Transparency
Programs: Attributes, Impacts, and Best Practices; Lawrence Berkeley National Laboratory: Berkeley, CA, USA, 2017.
117.
Jin, X.; Zhang, C.; Xiao, F.; Li, A.; Miller, C. A review and reflection on open datasets of city-level building energy use and their
applications. Energy Build. 2023,285, 112911. [CrossRef]
118.
Berkeley Lab, Building Performance Database. Available online: https://buildings.lbl.gov/cbs/bpd (accessed on
1 September 2024).
119.
Department of Buildings (DOB), Local Law 84 2021 (Monthly Data for Calendar Year 2020). Available online:
https://data.cityofnewyork.us/Environment/Local-Law-84-2021-Monthly-Data-for-Calendar-Year-2/in83-58q5/about_data
(accessed on 1 September 2024).
120.
Department of City Planning (DCP), Primary Land Use Tax Lot Output (PLUTO). Available online:
https://data.cityofnewyork.us/City-Government/Primary-Land-Use-Tax-Lot-Output-PLUTO-/64uk-42ks/about_data (accessed on 1 September 2024).
121.
City of Chicago Sustainability Program, Chicago Energy Benchmarking. Available online:
https://data.cityofchicago.org/Environment-Sustainable-Development/Chicago-Energy-Benchmarking/xq83-jr8c/about_data (accessed on 1 September 2024).
122.
Department of Sustainability and Environment, 2022 Building Energy Benchmarking. Available online:
https://data.seattle.gov/Permitting/2022-Building-Energy-Benchmarking/5sxi-iyiy/data (accessed on 1 September 2024).
123.
Building and Construction Authority, Listing of Building Energy Performance Data 2020. Available online:
https://beta.data.gov.sg/collections/22/view (accessed on 1 September 2024).
124.
Department for Levelling Up, Housing and Communities, Energy Performance of Buildings Data. Available online:
https://epc.opendatacommunities.org/ (accessed on 1 September 2024).
125.
Sustainable Energy Authority of Ireland, Building Energy Rating. Available online:
https://ndber.seai.ie/BERResearchTool/ber/search.aspx (accessed on 1 September 2024).
126.
Electrical and Mechanical Services Department, Register of Buildings Issued with Certificate of Compliance Registration (COCR).
Available online: https://www.emsd.gov.hk/beeo/en/register/search_cocr.php (accessed on 1 September 2024).
127.
Ding, C.; Zhou, N. Using Residential and Office Building Archetypes for Energy Efficiency Building Solutions in an Urban Scale:
A China Case Study. Energies 2020,13, 3210. [CrossRef]
128.
Liu, Y.; Tian, W.; Zhou, X. Energy and carbon performance of urban buildings using metamodeling variable importance techniques.
Build. Simul. 2021,14, 535–547. [CrossRef]
129.
Mui, K.W.; Wong, L.T.; Satheesan, M.K.; Balachandran, A. A Hybrid Simulation Model to Predict the Cooling Energy Consumption
for Residential Housing in Hong Kong. Energies 2021,14, 4850. [CrossRef]
130.
Zhang, R.; Mirzaei, P.A. Virtual dynamic coupling of computational fluid dynamics-building energy simulation-artificial
intelligence: Case study of urban neighbourhood effect on buildings’ energy demand. Build. Environ. 2021,195, 107728.
[CrossRef]
131.
Vazquez-Canteli, J.; Demir, A.D.; Brown, J.; Nagy, Z. Deep neural networks as surrogate models for urban energy simulations.
J. Phys. Conf. Ser. 2019,1343, 012002. [CrossRef]
132.
José, A.B.A.; Hugo, F.; Jimeno, F. Hybrid Model for Energy Consumption Forecasting in Buildings Stocks at Tropical Regions. In
Proceedings of the Building Simulation 2019: 16th Conference of IBPSA, IBPSA, Rome, Italy, 2–4 September 2019; Volume 16,
pp. 3578–3585.
133.
Westermann, P.; Welzel, M.; Evins, R. Using a deep temporal convolutional network as a building energy surrogate model that
spans multiple climate zones. Appl. Energy 2020, 278, 115563. [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.