
Balancing thermal comfort datasets: We GAN, but should we?


Abstract

Thermal comfort assessment for the built environment has become more available to analysts and researchers due to the proliferation of sensors and subjective feedback methods. These data can be used for modeling comfort behavior to support design and operations towards energy efficiency and well-being. By nature, occupant subjective feedback is imbalanced as indoor conditions are designed for comfort, and responses indicating otherwise are less common. This situation creates a scenario for the machine learning workflow where class balancing as a pre-processing step might be valuable for developing predictive thermal comfort classification models with high-performance. This paper investigates the various thermal comfort dataset class balancing techniques from the literature and proposes a modified conditional Generative Adversarial Network (GAN), comfortGAN, to address this imbalance scenario. These approaches are applied to three publicly available datasets, ranging from 30 and 67 participants to a global collection of thermal comfort datasets, with 1,474; 2,067; and 66,397 data points, respectively. This work finds that a classification model trained on a balanced dataset, comprised of real and generated samples from comfortGAN, has higher performance (increase between 4% and 17% in classification accuracy) than other augmentation methods tested. However, when classes representing discomfort are merged and reduced to three, better imbalanced performance is expected, and the additional increase in performance by comfortGAN shrinks to 1-2%. These results illustrate that class balancing for thermal comfort modeling is beneficial using advanced techniques such as GANs, but its value is diminished in certain scenarios. A discussion is provided to assist potential users in determining which scenarios this process is useful and which method works best.
Matias Quintana
School of Design and Environment
National University of Singapore
Stefano Schiavon
Center for the Built Environment
University of California, Berkeley
Kwok Wai Tham
School of Design and Environment
National University of Singapore
Clayton Miller
School of Design and Environment
National University of Singapore
CCS Concepts: Information systems → Information integration; Computing methodologies → Machine learning algorithms.
BuildSys ’20, November 16–19, 2020, Yokohama, Japan
©2020 Association for Computing Machinery.
Keywords: Data Augmentation, Class Balancing, Thermal Comfort, Generative Adversarial Network, Survey, Buildings, Machine Learning
ACM Reference Format:
Matias Quintana, Stefano Schiavon, Kwok Wai Tham, and Clayton Miller.
2020. Balancing thermal comfort datasets: We GAN, but should we?. In
BuildSys ’20: 7th ACM International Conference on Systems for Energy-Efficient
Built Environments, November 16–19, 2020, Yokohama, Japan. ACM, New
York, NY, USA, 11 pages.
1 Introduction
Thermal comfort modeling has come a long way since one of its main predecessors, the Predicted Mean Vote (PMV) model [ ], which was proposed in the 1970s. Even though PMV is still the most used model in industry standards, current approaches are shifting towards using direct feedback from the occupant, alongside relevant data, to build a model capable of predicting the feedback for future or not yet encountered scenarios [ ]. Thanks to the latest advancements in sensor and data management technologies, their reduction in cost, and the proliferation of interconnected devices known as the Internet of Things (IoT), building analysts and researchers are now able to collect more granular data, i.e., data with a high sampling frequency, within buildings. These data can be used for this purpose as well as for other building applications such as monitoring, control, and predictive maintenance [ ]. Current data-driven approaches employ environmental [ ], physiological [ ], and behavioral [ ] data as features for thermal comfort prediction and are predominant in recent thermal comfort modeling research.
Nevertheless, generalizability and high performance are hard to achieve in these data-driven models, especially given the dataset’s cohort and sample sizes, regardless of the algorithm chosen [ ]. These models rely on labeled data, i.e., data points where environmental and physiological parameters are paired with the occupant’s subjective thermal comfort feedback. Although the proliferation of IoT sensors in the built environment is vast [ ], sensing occupants has its challenges. In particular, getting occupants to provide their subjective comfort feedback is a tedious task. Recent studies use mobile applications, computer programs, or emails [ ] to facilitate the collection of occupant subjective feedback, but these methods are still limited by administrative overhead, cohort size, and users’ behavior during the experiments, i.e., survey fatigue [ ], which in turn could raise concerns about how accurate their responses are [ ]. This results in datasets where people only report certain discomfort (e.g., “warm”) and not other classes (e.g., “too hot”, “cool”, or “cold”). One way to address this is to conduct targeted surveys [ ]: e.g., if not enough “cold” responses are available, a survey is prompted to the user when the surrounding temperature is below 15 °C or another arbitrary low temperature. This approach, however, is only applicable to future human studies. While thermal comfort applications such as building operation controls avoid the need for comfort feedback by prioritizing a temperature range deemed “comfortable” [ ], most occupant-centric approaches still require user data collection.
arXiv:2009.13154v1 [cs.LG] 28 Sep 2020
1.1 Addressing imbalanced data
Datasets with a disproportionate ratio of samples in each class are referred to as imbalanced datasets. This issue is often ignored, or only a portion of the dataset is used [ ], which wastes part of the effort spent on data collection. While the number of choices, i.e., the number of different class values, may contribute to a more uneven number of samples per class, a binary scenario would limit the thermal comfort dataset’s versatility. This is particularly critical when we consider the usability of thermal comfort models and their outputs for designing better indoor environments and controlling existing heating or cooling mechanisms [ ]. Simulation results in [ ] found that an occupant’s thermal comfort characteristics are among the most impactful parameters in determining the energy savings of comfort-driven control strategies. In these scenarios, the direction of discomfort, i.e., “too cold” or “too hot”, is needed and can be directly captured when at least three classes (e.g., “cool”, “comfortable”, and “hot”) are present in the thermal comfort feedback.
One potential solution for imbalanced data is the use of generative models, especially Generative Adversarial Networks (GANs). GAN models have been widely deployed for many generative tasks using image datasets, and the results have been on par with the state of the art. While some applications of GANs in the built environment have shown their potential [43, 48, 49], efforts on numeric tabular datasets have mostly targeted datasets with a binary dependent variable and have failed to scale to multi-class variables [ ]. Their application to thermal comfort datasets is limited, and there has not been a thorough investigation of which class balancing technique works best for this application. This paper explores the application of GANs in this context, compares them to the methods from previous work, and ascertains in which situations balancing provides more or less value for the accuracy of the model.
2 Related work
To deal with imbalanced datasets, researchers make use of a part of the modeling workflow toolkit called data augmentation. This strategy pre-processes the data to increase its size and diversity; it also offers the additional benefit of making models trained on the pre-processed data more robust to transformations of the model’s input [ ]. As mentioned, thermal comfort researchers sometimes discard data points with the predominant class value, known as undersampling [ ], or they generate synthetic data points based on the real dataset to balance the number of samples per class, a way of augmenting the data known as oversampling [ ]. The latter approach often relies on standard algorithms such as the Synthetic Minority Over-sampling Technique (SMOTE) [ ] and the Adaptive Synthetic sampling approach (ADASYN) [ ]. Nevertheless, other fields familiar with imbalanced datasets, such as computer vision, have shown the potential of another family of algorithms for generating synthetic data points, known as generative models. This family of algorithms models the real data distribution $p_{data}(x)$ by learning a distribution $p_{\theta}(x)$, parameterized by $\theta$, that approximates it. Two approaches to achieve this are to learn $p_{\theta}(x)$ directly or to learn a function $G_{\theta}$ that transforms a common distribution $p_z$ (e.g., Uniform or Gaussian) such that $G_{\theta}(z) \sim p_{\theta}(x)$ for $z \sim p_z$. Since these models aim to capture the data distribution, after being trained on a dataset they can generate data points that were not originally in the dataset but are likely to come from the same distribution. Generative Adversarial Networks (GANs) [20] are one type of model from the latter approach.
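To make the oversampling idea concrete, the sketch below generates one synthetic minority-class point by linear interpolation between a sample and one of its nearest same-class neighbours, which is the core step of SMOTE. It is a simplified illustration and not the reference implementation: the function name `smote_like_sample`, the neighbour count `k`, and the toy values are assumptions made here.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def smote_like_sample(minority, k=3, rng=None):
    """Return one synthetic point interpolated between a minority-class
    sample and one of its k nearest minority-class neighbours.

    `minority` must contain at least two samples.
    """
    rng = rng or random.Random()
    base = rng.choice(minority)
    # Rank the remaining minority samples by distance to the chosen one.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: euclidean(base, p))
    neighbour = rng.choice(others[:k])
    # Move a random fraction of the way from `base` towards `neighbour`.
    lam = rng.random()
    return [b + lam * (n - b) for b, n in zip(base, neighbour)]


# Toy minority class: [air temperature in degrees C, relative humidity in %].
minority_class = [[26.0, 60.0], [27.0, 55.0], [28.5, 58.0], [27.5, 62.0]]
synthetic = smote_like_sample(minority_class, k=2, rng=random.Random(0))
```

Because every synthetic point lies on a segment between two real samples, this kind of method cannot produce values outside the range already covered by the minority class, which is one motivation for looking at generative models instead.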
GANs can be thought of as a game in which two models, most often neural networks, compete against each other. The Generator G takes a random noise vector as input and tries to generate an instance that looks as if it were drawn from the same distribution as the data. The Discriminator D, on the other hand, aims to distinguish between true samples from the data and samples generated by G. Variations of this initial setup have allowed researchers to tackle various image-based problems such as human pose generation [ ], image quality enhancement [ ], and balancing image datasets [ ], to name just a few. Furthermore, GANs have been shown to produce more realistic samples than other well-known generative models, such as Adversarial Autoencoders [ ]. Another popular and highly useful variant is the conditional GAN [ ]. In this case, the generation and discrimination of samples are conditioned on additional information, e.g., a class label.
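For reference, the two-player game described above is usually written as a minimax objective. The block below states the standard form and a conditional variant in which both players also receive the label; the symbols $p_{\text{data}}$, $p_z$, and $y$ are notation assumed here for illustration and are not taken from this paper.

```latex
% Standard GAN objective: the Discriminator D maximizes, the Generator G minimizes.
\min_G \max_D \;
  \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]

% Conditional variant: generation and discrimination both depend on the label y.
\min_G \max_D \;
  \mathbb{E}_{x, y \sim p_{\text{data}}}\big[\log D(x \mid y)\big]
  + \mathbb{E}_{z \sim p_z,\, y}\big[\log\big(1 - D(G(z \mid y) \mid y)\big)\big]
```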
A survey in [ ] found that applications outside computer vision have yielded promising results, but research on the application of GANs in other areas is still somewhat limited. To our knowledge, GANs have not yet been used in the context of thermal comfort studies, and there are few applications of them in the built environment. Work done in [ ] explores the use of GANs on imagery of urban areas to identify factors associated with cycling crashes, in an effort to help urban planners design safer environments. Moreover, GANs have also been incorporated into the fault detection and diagnosis of chillers to augment such datasets [ ], since they are more suitable for extremely imbalanced datasets [ ]. On the task of generating synthetic tabular data, current efforts deal with privacy concerns. Researchers aim to produce synthetic, or generated, datasets that share the same characteristics as the original datasets and can therefore be used interchangeably for modeling purposes or shared with third parties. The authors of [ ] show that GANs can generate synthetic data that preserves the descriptive statistics of the original data while enhancing the anonymization of the original users. These approaches have gained popularity within the healthcare domain [ ]. Another existing method is Tabular-GAN (TGAN) [ ], which focuses on the marginal distribution of the features using a more complex network architecture for the Generator G. Its authors’ subsequent work, CTGAN [ ], reduces the model complexity of their approach, but the main focus remains generating purely synthetic samples rather than correctly balancing the dataset. As mentioned earlier, these studies found that, in practice, GAN variants from image-based problems under-perform when the input data is a mixture of continuous and discrete features, which is the case for tabular data such as thermal comfort studies. Thus, an active research trend is finding empirical and theory-backed GAN variants for non-image datasets.
To bridge the gap in generative methods for imbalanced, numerical thermal comfort datasets, and building on previous work [ ], we propose comfortGAN, a conditional Wasserstein GAN with gradient penalty (cWGAN-GP), as a class balancing algorithm for data-driven thermal comfort modeling in place of commonly used methods. We assessed the performance of a multi-class classification model trained on a balanced thermal comfort dataset, composed of generated and real samples, in scenarios where comfort feedback can take as many as seven distinct values, as well as in a reduced version with only three possible values. The focus of this analysis was not only to show improvements but also to determine which thermal comfort scenarios are improved more or less, to guide the general application of class balancing in this context. We applied these techniques to three large open datasets that include comfort satisfaction, preference, and sensation objectives at different scales (3-, 5-, and 7-point scales).
3.1 Datasets
Three publicly available thermal comfort datasets were chosen: one laboratory-controlled experiment and two field experiments. Of the field experiments, one was collected by a single research group under homogeneous conditions, and the other is an ensemble of many studies. The dataset in [17] (hereafter named the controlled dataset) comes from a year-long controlled experiment with three-hour sessions in Pittsburgh, Pennsylvania, U.S.A., in which 77 participants’ environmental and physiological measurements were taken alongside subjective thermal comfort feedback, given via a mobile application, throughout a fixed temperature schedule. The dataset used in [23] (hereafter named the field dataset) consists of two weeks of intensive sampling of 30 participants in an educational building in Singapore. Occupants wore a smartwatch through which they gave a minimum of 100 subjective feedback responses, and the environmental measurements of the nearest fixed sensor in the building were paired with each response. Finally, the ASHRAE Global Thermal Comfort Database II (hereafter named the Comfort Database) [16] is a global joint effort to systematically collect and harmonize numerous thermal comfort field studies. Because of the treatment and data cleaning these studies went through, all of them can be considered one whole dataset.
3.2 Feature selection and pre-processing
Data-driven thermal comfort models rely on a handful of measurable features and often outperform industry standards such as the Predicted Mean Vote (PMV), especially when dealing with individual, personalized thermal comfort. While the empirical models in the related literature can incorporate a handful of these variables, the exclusion of the rest can cause significant errors. Therefore, the features selected from each dataset are the same as those chosen in the related literature that used these datasets for data-driven thermal comfort modeling. By doing this, we can evaluate how those models would have changed had such augmentation methods been used in the machine learning workflow.
For the controlled dataset [17], we chose the feature set defined as Featureset-1 in [ ]; these features are Air temperature, Skin temperature, Clothing insulation, Outside temperature, Relative humidity, and Thermal comfort feedback.
For the field dataset [23], we chose their second set of proposed features, with the same cyclical pre-processing representation of Hour of the day and Day of the week; the other features were Heart rate, Relative humidity, Luminous intensity level, Air temperature, and Thermal sensation feedback.
Finally, for the Comfort Database [16], we picked six of the top seven most significant variables for data-driven thermal comfort modeling proposed by [ ] for this dataset, which overlap with the features already chosen from the other datasets. The features are Standard Effective Temperature (SET), Clothing level, Air temperature, Relative humidity, Air velocity, and Thermal sensation feedback. We left out [...], the extra feature that [ ] chose, because the data points with this value make up only about 40% of the total data points in the database. The six features we chose resulted in roughly 66,000 data points (roughly 62% of the entire database), and the Thermal sensation feedback was rounded to the closest integer. Overall, we believe the chosen features for all three datasets are a subset of the common measurements researchers consider when modeling thermal comfort from longitudinal studies.
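The text does not spell out the cyclical representation used for the hour of the day and the day of the week. A common choice, assumed here, maps a periodic value onto the unit circle with a sine and cosine pair so that 23:00 and 00:00 end up next to each other. The function name `cyclical_encode` is hypothetical.

```python
import math


def cyclical_encode(value, period):
    """Map a periodic value (e.g. hour 0-23 with period 24) to a
    (sin, cos) pair on the unit circle."""
    angle = 2.0 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


# Hour of the day uses a period of 24, day of the week a period of 7.
late_evening = cyclical_encode(23, 24)
midnight = cyclical_encode(0, 24)
monday = cyclical_encode(0, 7)
```

With this encoding, the distance between 23:00 and 00:00 equals the distance between 00:00 and 01:00, which a plain integer encoding would not preserve.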
3.3 Customized GAN for thermal comfort
A survey of GANs by [ ] identifies two main families of GANs: architecture variants and loss variants. Architecture variants refer to the different types of neural networks used (e.g., fully connected networks, convolutional neural networks, recurrent neural networks) and to the way the data and the labels interact with the Generator and Discriminator. Loss variants, on the other hand, refer to the loss functions used to make the learning of the Generator more stable. For both families, the specific type and function to use usually depend on the application [ ]. As introduced in Section 2, conditional GANs (cGANs) are an architecture variant in which the generation and discrimination of samples are conditioned on the class label. By doing this, the Generator is exposed to a set of inputs for a specific class value or, in our case, a specific thermal comfort response.
Figure 1: Class distribution of the original and train split sets for the controlled dataset [17], field dataset [23], and the Comfort Database [16], from left to right. Row (a) shows the datasets with their original number of classes, and row (b) shows the datasets with their classes remapped to only three values. The original imbalanced dataset is colored grey and the train set, based on a 0.7 train-test random split, is colored blue. The horizontal black line indicates the number of data points of the predominant class in each dataset and represents the value up to which the other classes will be augmented.
Figure 2: Overview of the architecture used. The Generator samples from a Normal distribution Z, and the class label y is concatenated to the input of both the Generator and Discriminator (i.e., the conditional GAN architecture variant). The Discriminator takes real samples x or generated samples x̃. Both the Generator and Discriminator are Multi-layer Perceptrons, where each blue rectangle is a layer and the number on top indicates its number of units. Wasserstein loss with gradient penalty [21] was chosen for more stable training.
Figure 3: Loss value of both the Generator (G_Loss) and Discriminator (D_loss) during the convergence process on the training split of the controlled experiment [17]. Similar values were obtained for the other two datasets and can be found in the TensorBoard platform that accompanies this work.
In terms of the loss function, the originally proposed loss [ ] can lead to vanishing gradients, stopping the training process, unless the distributions of real and generated samples overlap significantly [ ]. In practice, however, it is very likely that the distributions of real and generated samples do not overlap or have negligible overlap [ ]. Therefore, a loss variant known as the Wasserstein GAN (WGAN) [ ] was introduced to facilitate the training and convergence of both the Generator and Discriminator. A subsequent modification of WGAN, known as WGAN with gradient penalty (WGAN-GP) [21], enhances training stability and has shown better results and convergence in practice than conventionally used image-based GAN variants (e.g., convolutional GANs), specifically on tabular data from other fields [ ]. Therefore, we move away from the vanilla architecture used in [ ], and in this work we use the WGAN-GP loss variant for comfortGAN.
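As a reminder of what the WGAN-GP loss variant optimizes, the block below writes out the usual critic and generator losses with the gradient penalty term. Conditioning both networks on the label $y$ is our reading of how the loss combines with the conditional architecture; the penalty weight $\lambda$ is left symbolic because its value is not stated in the text, and the notation is assumed for illustration.

```latex
% Critic (Discriminator) loss with gradient penalty, conditioned on the label y.
L_D = \mathbb{E}_{\tilde{x} \sim p_g}\big[D(\tilde{x} \mid y)\big]
    - \mathbb{E}_{x \sim p_{\text{data}}}\big[D(x \mid y)\big]
    + \lambda \, \mathbb{E}_{\hat{x}}\Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x} \mid y) \rVert_2 - 1\big)^2\Big]

% The penalty is evaluated on random interpolations of real and generated samples.
\hat{x} = \epsilon x + (1 - \epsilon)\,\tilde{x}, \qquad \epsilon \sim U[0, 1]

% Generator loss.
L_G = -\,\mathbb{E}_{z \sim p_z}\big[D(G(z \mid y) \mid y)\big]
```

The penalty pushes the norm of the critic gradient towards 1 along these interpolations, which is how WGAN-GP enforces the Lipschitz constraint without weight clipping.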
For categorical features as well as for the labels, we smoothed them by converting them to a one-hot encoding, adding uniform noise (Uniform(0, γ), for which we chose γ = 0.2) to each resulting binary variable, and re-normalizing each representation, similar to [ ]. Continuous features, on the other hand, are scaled to a fixed range. Recent research shows that the capacity and performance of GANs are related to the network size and batch size [ ]. With this in mind, we evaluated different combinations of parameters on the datasets. Figure 1 shows the distribution of class values for the original datasets and for their respective train sets, obtained with a 0.7 train-test split ratio and used for the experiments throughout this paper. Figure 1a shows the original distribution with all possible class values, whereas Figure 1b shows a reduced version in which the extremes (“cold” or “hot”) are each grouped into a single class so that only 3 classes remain. This reduction only takes place in the controlled dataset [17] and in the Comfort Database [16], which initially had 5 and 7 classes, respectively.
After trial and error, the architecture was set to five fully connected layers, with Rectified Linear Units (ReLU) as activation functions for the Generator and Leaky ReLU with a slope of 0.2 for the Discriminator. The number of units per layer is 128/256/128/64/32 for the Generator and 64/128/64/32/16 for the Discriminator, and the learning rate was set to 0.0002. Figure 2 shows the architecture used. Following the suggestions in [ ], the same optimizer was used for both the Generator and Discriminator, a regularization of 0.5 was used on the Discriminator, and Batch Normalization was used on the Generator.
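To show how the conditional input and the layer widths fit together, the sketch below lists the (input, output) size of each fully connected layer. The hidden widths come from the text; everything else is an assumption for illustration: that the label enters as a one-hot vector concatenated to the input, that a final output layer follows the five hidden layers, and the feature count `n_features`.

```python
def mlp_layer_shapes(input_dim, hidden_units, output_dim):
    """Return the (in_features, out_features) pair of every dense layer."""
    dims = [input_dim, *hidden_units, output_dim]
    return list(zip(dims[:-1], dims[1:]))


GENERATOR_UNITS = (128, 256, 128, 64, 32)     # widths given in the text
DISCRIMINATOR_UNITS = (64, 128, 64, 32, 16)   # widths given in the text

latent_dim = 20   # latent size reported for the controlled dataset
n_classes = 5     # original number of classes in the controlled dataset
n_features = 5    # hypothetical number of input features

# Generator: noise vector concatenated with the one-hot label -> one sample.
generator_shapes = mlp_layer_shapes(
    latent_dim + n_classes, GENERATOR_UNITS, n_features)
# Discriminator: sample concatenated with the one-hot label -> one score.
discriminator_shapes = mlp_layer_shapes(
    n_features + n_classes, DISCRIMINATOR_UNITS, 1)
```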
We tried different combinations of latent dimension size, batch size, and number of critics for the Discriminator (i.e., the number of times the Discriminator is trained per Generator training iteration) for each training set. Ultimately, we chose the parameters that visually showed convergence after 20,000 (controlled and field datasets) and 200,000 (Comfort Database) Generator iterations. More iterations were used for the Comfort Database to account for the size of the dataset. These numbers were chosen empirically after reviewing the number of iterations in the related work [ ], which ranges from 18,000 to 40,000 for medium-sized tabular datasets. While there is no explicit guarantee that this range of iterations is suitable for all thermal comfort datasets, existing work on GANs for tabular data suggests that these ranges are a reasonable place to start while theoretical explanations remain an open research topic.
The convergence plots of the Discriminator and Generator losses are available online via the TensorBoard platform, and one of these plots is displayed in Figure 3. This figure shows the Generator and Discriminator losses, G_Loss and D_loss respectively, for the controlled dataset [17]. The final values of batch size, number of critics, and latent dimension size are 128, 1, and 20 for the controlled dataset [17]; 128, 3, and 80 for the field dataset [23]; and 64, 1, and 100 for the Comfort Database [16]. Finally, while existing GAN architectures and loss functions were used, this work’s technical contribution extends to their appropriate selection and to the evaluation of the framework on three large open public datasets in the context of thermal comfort.
3.4 Evaluation
Evaluating generative models is rarely a straightforward process, especially when different metrics can yield substantially different results [ ]. If the goal is image generation, a subjective evaluation based on visual fidelity, e.g., a “real vs. fake” perceptual study, can be appropriate and can even be crowd-sourced on Amazon Mechanical Turk (AMT) [ ]. For numeric datasets, it is common to rely on the effect that generated samples, combined with real samples, have on the performance of a classification model [ ]; researchers often call this machine learning efficacy. Moreover, work done in [ ] added metrics to evaluate the generated samples themselves, i.e., how diverse they are and whether they look similar to the original training samples in an image dataset.
Figure 4: Diagrams for the different evaluation metrics used to evaluate the generated samples for all augmentation methods. (a) Variability of generated samples: difference between generated samples. (b) Diversity of generated samples with respect to the training set: difference between closest pairs from the generated and training sets. (c) Machine learning efficacy, or quality of generated samples on classification tasks: classification accuracy on a balanced dataset that mixes generated and training samples.
We want to make sure our GAN variant does not suffer from a well-known GAN-related problem called mode collapse, i.e., generating the same type of sample over and over again. Thus, we quantify the variability of generated samples by randomly drawing two generated samples and calculating their Euclidean distance. In image problems, this difference is usually measured with the Structural Similarity Index (SSIM) [ ]. Secondly, to make sure the algorithm is not just memorizing the training samples, we measure the diversity of generated samples with respect to the training set by randomly drawing a sample from the generated set and finding its closest matching sample, i.e., the one at minimum Euclidean distance, in the training set. Finally, as our goal is to increase the minority classes in the original dataset to make classification models more robust, we combine the training set and the generated set to form a balanced dataset and train a classification model. We then report the classification accuracy (F1-micro score for this multi-class scenario) on the test set; we refer to this as the quality of generated samples on classification tasks, or machine learning efficacy. Figure 4 shows a visual representation of these metrics.
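The two distance-based metrics can be sketched in a few lines. This is an illustrative reading of the description above, not the accompanying code: the function names, the default of 30 draws, and drawing the pair without replacement are assumptions, and in practice the functions would be applied per class value.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def variability(generated, draws=30, rng=None):
    """Average distance between two randomly drawn generated samples.
    A value near zero hints at mode collapse."""
    rng = rng or random.Random()
    total = 0.0
    for _ in range(draws):
        first, second = rng.sample(generated, 2)
        total += euclidean(first, second)
    return total / draws


def diversity(generated, training, draws=30, rng=None):
    """Average distance from a randomly drawn generated sample to its
    nearest training sample. A value near zero hints at memorization."""
    rng = rng or random.Random()
    total = 0.0
    for _ in range(draws):
        sample = rng.choice(generated)
        total += min(euclidean(sample, t) for t in training)
    return total / draws
```

A generator that always emits the same point scores a variability of zero, and one that copies training rows scores a diversity of zero, which are the two failure cases these metrics are meant to expose.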
The classification model chosen for all the evaluation metrics is the Random Decision Forest (RDF), based on its extensive use in the thermal comfort modeling literature and in the respective work that utilizes the datasets [ ]; it was also found to be the top-performing model for data-driven thermal comfort modeling on the Comfort Database [ ]. One RDF classifier was used per dataset, and its hyper-parameters were grid-searched with 10-fold cross-validation on each respective train set. Experiments showed that 100 trees consistently provided classifiers with the highest average cross-validation accuracy across all training sets, and this result is aligned with the models the authors used on the respective datasets. Finally, the optimal depth varied from 9 to 13, which coincides with what was found in [ ]. Thus, 100 trees and a tree depth of 10 were fixed for all classifiers; by fixing the classifier, we emphasize the role of the training samples in model performance. Furthermore, to imitate a conventional data-driven modeling approach, we used a random train-test split (Figure 1) and evaluate the classification models’ performance on the test split. Related work has opted to carefully pick a test set, composed of real samples, with balanced classes [ ]; however, in practice, thermal comfort researchers do not have the luxury of enough samples per class to build a balanced test set, because of the experiments’ cohort sizes or the number of data points per participant. Although the test set will inevitably also be imbalanced, we chose this approach to stay as close as possible to what thermal comfort researchers do in practice.
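Two points from the evaluation setup are easy to check numerically: in a single-label multi-class problem the micro-averaged F1 score equals plain accuracy, and on an imbalanced test set a model that always predicts the majority class already reaches an accuracy equal to that class share. The toy labels below are made up for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool true positives, false positives, and
    false negatives over all classes before computing F1."""
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            tp += t == label and p == label
            fp += t != label and p == label
            fn += t == label and p != label
    return 2 * tp / (2 * tp + fp + fn)


# Toy imbalanced test set: -1 = cool, 0 = comfortable, 1 = warm.
y_true = [-1, 0, 0, 0, 1, 0, 0, -1]
y_pred = [0, 0, 0, 0, 1, 0, 1, -1]
always_comfortable = [0] * len(y_true)

model_accuracy = accuracy(y_true, y_pred)
model_micro_f1 = micro_f1(y_true, y_pred)
majority_accuracy = accuracy(y_true, always_comfortable)
```

Here the model scores 0.75 on both metrics while the trivial majority predictor scores 0.625, which is why the imbalanced baseline deserves attention when reading accuracy gains.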
Each draw action is repeated 30 times for each class value, and the average distance is reported for both variability and diversity. The baselines for all three metrics are calculated with the original train set alone, without any generated samples; the baseline for machine learning efficacy is therefore the performance of the fixed RDF model trained on the original imbalanced train set. Furthermore, a total of four other augmentation models are evaluated for comparison: SMOTE [ ] and ADASYN [ ], as models regularly used in the literature for data-driven thermal comfort modeling [ ], and TGAN [ ] and CTGAN [ ], as GAN variants for data generation. We repeated all evaluation metrics 30 times for each augmentation model; each repetition involved generating a new dataset, and the average results are reported. The size of each generated dataset equals the number of samples per class required to balance the respective training set in Figure 1. We share our code base for reproducibility in a GitHub repository.
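The size of each generated dataset follows directly from the class counts of the train set: every class is topped up to the count of the predominant class. A minimal sketch, with a hypothetical function name and toy labels:

```python
from collections import Counter


def samples_needed(labels):
    """Number of synthetic samples each class needs so that all classes
    reach the size of the predominant class."""
    counts = Counter(labels)
    target = max(counts.values())
    return {label: target - count for label, count in counts.items()}


# Toy train labels: 6 comfortable (0), 3 warm (1), 1 cool (-1).
train_labels = [0] * 6 + [1] * 3 + [-1]
to_generate = samples_needed(train_labels)
```

The predominant class needs no synthetic samples, so the balanced set keeps all real data and only adds generated points to the minority classes.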
4.1 Experiment setup
The workstation used for the experiments in this study has the
following configuration: Intel Core (TM) i5-7600 @ 4.1GHz, NVIDIA
GeForce GTX 1070 graphics card, Python 3.7.3 (64-bit), and PyTorch
1.5.0. The hyper-parameter tuning for TGAN [47] was done using
its random hyper-parameter search; the best set of values
over 30 random searches was chosen for each dataset. CTGAN [46]
does not provide such an environment for hyper-parameter tuning,
so the default recommended hyper-parameters were used. As
mentioned, the experiments were repeated with two copies of all
datasets: one with the original number of classes (5, 3, and 7 for the
controlled dataset [17], field dataset [23], and Comfort Database
[16], respectively) and another with the classes remapped to
only three values (the field dataset [23] remained unchanged).
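A minimal sketch of the class remapping, assuming votes are coded as signed integers centered on neutral (e.g., -3 to 3 for a 7-point scale) and that all votes on the same side of neutral are merged. The exact mapping used for each dataset is not restated here, so this coding is an assumption.

```python
def remap_to_three_classes(vote):
    """Collapse a signed vote into -1 (cool side), 0 (neutral), or 1 (warm side)."""
    if vote < 0:
        return -1
    if vote > 0:
        return 1
    return 0


seven_point_votes = [-3, -2, -1, 0, 1, 2, 3]
remapped = [remap_to_three_classes(v) for v in seven_point_votes]
print(remapped)  # [-1, -1, -1, 0, 1, 1, 1]
```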
4.2 Variability of generated samples
The baseline value in the first column, Variability, of Table 1b gives
a reference of how different, on average, the original samples are
in the original training set. We want the generated samples to be
as variable as the original samples used for training, although
the main concern is to avoid a very low number. SMOTE [6] and
ADASYN [22] score slightly below the baseline, with the exception
of ADASYN [22] on the field dataset (middle column within
the Variability column). Both algorithms use linear interpolation
on the training data to generate new data points; thus, it is
reasonable for them to score close to the baseline for both groups of
datasets, with original and reduced classes. The GAN variant
models, CTGAN [46] and TGAN [47], outperform the latter models, with
some exceptions on the Comfort Database [16] for CTGAN [46]
and TGAN [47]. CTGAN [46] achieves the highest score for the
controlled dataset in the original number of classes scenario and
is the closest model to the baseline on the controlled dataset with
reduced classes. ComfortGAN achieves the second-highest value
for the controlled dataset with its original classes but drops to the last
place when the classes are reduced. Nevertheless, it surpasses all
the models for the other two datasets in both scenarios.
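The linear interpolation that SMOTE and ADASYN rely on can be illustrated with the simplified sketch below. It only shows the interpolation step between a minority sample and one of its neighbors; the neighbor search and ADASYN's density-based weighting are omitted, and the two-feature samples (air temperature in °C, relative humidity in %) are hypothetical.

```python
import random


def interpolate(sample, neighbor, rng):
    """Return a synthetic point on the segment between sample and neighbor."""
    gap = rng.random()  # uniform in [0, 1)
    return [a + gap * (b - a) for a, b in zip(sample, neighbor)]


rng = random.Random(42)
sample = [24.0, 50.0]    # hypothetical minority-class sample
neighbor = [26.0, 60.0]  # hypothetical nearest neighbor of the same class
synthetic = interpolate(sample, neighbor, rng)
print(synthetic)
```

Since every synthetic point lies on a segment between two existing samples, the generated data cannot move far from the training set, which is consistent with scores close to the baseline.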
4.3 Diversity of generated samples with respect to training set
The second column, Diversity, of Table 1b shows the average distance
between a generated sample and the closest sample from
the training dataset. The baseline for this metric is a reference for
the average difference between similar samples; as with the
previous metric, the main concern is to avoid a very low number.
SMOTE [6] scores close to but below the baseline for almost all cases
(except on the controlled dataset [17] with the original classes).
Although ADASYN [22] also scores below the baseline for almost all
cases (except on the controlled dataset [17]), it surpasses SMOTE [6] in
all cases (with a tie on the Comfort Database [16] with reduced classes).
This result highlights the main difference between them: even if both
rely on linear interpolation on the training samples, ADASYN [22]
has a small random coefficient that allows it to produce samples that
are slightly more different from the original samples in the training
Balancing thermal comfort datasets: We GAN, but should we? BuildSys ’20, November 16–19, 2020, Yokohama, Japan
Table 1: Experiment results for all data augmentation models and datasets. The F1-micro score was used for classification
evaluations (Machine learning efficacy), and Euclidean distance was used for distance calculation (Variability and Diversity).
The dataset columns under each metric (first row in the column header) correspond to the controlled dataset [17], field dataset
[23], and the Comfort Database [16], in that order. The percentage of improvement over the baseline is shown in
parentheses for the Machine learning efficacy case. For all metrics, the higher the value the better.
Model        Variability              Diversity                Machine learning efficacy
             [17]    [23]    [16]     [17]    [23]    [16]     [17]        [23]        [16]
Baseline     52.50           18.12    1.90    5.66    0.55     0.60        0.65        0.26
SMOTE [6]    49.30           17.90    1.94    2.77    0.29     0.47(-13%)  0.53(-12%)  0.30(+4%)
ADASYN [22]  49.46           17.76    2.52    3.26    0.30     0.41(-19%)  0.51(-14%)  0.30(+4%)
TGAN [47]    42.80           17.99    15.58   12.10   0.89     0.62(+2%)   0.66(+1%)   0.38(+12%)
CTGAN [46]   62.85           17.40    23.92   19.80   1.17     0.48(-12%)  0.63(-2%)   0.40(+14%)
comfortGAN   56.67   324.07  38.95    34.21   21.27   17.26    0.64(+4%)   0.67(+2%)   0.43(+17%)
(a) Original number of classes
Model        Variability              Diversity                Machine learning efficacy
             [17]    [23]    [16]     [17]    [23]    [16]     [17]        [23]        [16]
Baseline     53.73           18.16    1.69    5.66    0.48     0.66        0.65        0.49
SMOTE [6]    50.19           18.28    1.18    2.77    0.20     0.60(-6%)   0.53(-12%)  0.50(+1%)
ADASYN [22]  50.52           17.12    2.13    3.26    0.20     0.53(-13%)  0.51(-14%)  0.50(+1%)
TGAN [47]    39.91           17.69    18.75   12.10   0.55     0.69(+3%)   0.66(+1%)   0.50(+1%)
CTGAN [46]   52.65           17.76    19.57   19.80   0.74     0.59(-7%)   0.63(-2%)   0.50(+1%)
comfortGAN   43.84   324.07  37.07    37.75   21.27   13.13    0.72(+6%)   0.67(+2%)   0.51(+2%)
(b) Classes remapped to only three values
set. On the other hand, the GAN variants achieve values one
order of magnitude higher on all datasets with original and reduced
classes. Although comfortGAN surpasses CTGAN [46] and TGAN
[47] on the controlled dataset (original and reduced classes),
CTGAN [46] and comfortGAN perform similarly on the field dataset
[23]. On the Comfort Database [16], comfortGAN seems to generate
sparser data points, which allows for a higher distance; visual
inspection of the generated samples suggests that the numerical
values nonetheless remain within the physical boundaries of the features.
Nevertheless, the higher values of all GAN variants in terms of distance
between generated and original samples are expected, given that the
objective of these methods is to learn the underlying data
distribution instead of making only minor changes to the original
training data points, as traditional approaches like SMOTE
[6] and ADASYN [22] do.
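A minimal sketch of the Diversity metric as described in this section: the average Euclidean distance from each generated sample to its closest training sample. The repeated draws per class used in the experiments are left out, and the toy points are hypothetical.

```python
import math


def diversity(generated, training):
    """Mean distance from each generated sample to its nearest training sample."""
    nearest = [min(math.dist(g, t) for t in training) for g in generated]
    return sum(nearest) / len(nearest)


training = [[0.0, 0.0], [3.0, 4.0]]
generated = [[0.0, 1.0], [3.0, 0.0]]
# Nearest distances are 1.0 and 3.0, so the mean is 2.0.
print(diversity(generated, training))  # 2.0
```

A value near zero would indicate that the generator is mostly copying training points, which is why a very low number is the main concern for this metric.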
4.4 Machine learning efficacy
Finally, we assess the accuracy of a classifier trained on the balanced
dataset, i.e., a dataset with both the training samples and the
generated ones such that all classes have the same number of data
points. Given that the datasets used in this work are multi-class, the
accuracy metric is calculated as an F1-micro score, as it was also the
metric reported in related work on these datasets [ ]. The last
column, Machine learning efficacy, in Table 1b shows that generated
samples from SMOTE [6] and ADASYN [22] end up affecting the
classification performance on all datasets and only barely increase
the performance on the Comfort Database [16] with reduced
classes. When the datasets have more than three classes (controlled
dataset [17] and Comfort Database [16], Table 1a), comfortGAN
increases the classifier performance by 4% and 17%, respectively. TGAN
[47] increases the performance on the controlled dataset [17] by
2%, and both TGAN and CTGAN [46] increase the performance
on the Comfort Database [16] by 12% and 14%, respectively. However,
when the number of classes is reduced to only 3 (Table 1b), the
increase in performance changes.
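For single-label multi-class problems, the F1-micro score pools true positives, false positives, and false negatives over all classes, which makes it equal to plain accuracy. The sketch below computes it from scratch on hypothetical labels.

```python
def f1_micro(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP, and FN over all classes before scoring."""
    pairs = list(zip(y_true, y_pred))
    classes = set(y_true) | set(y_pred)
    tp = sum(1 for c in classes for t, p in pairs if t == c and p == c)
    fp = sum(1 for c in classes for t, p in pairs if t != c and p == c)
    fn = sum(1 for c in classes for t, p in pairs if t == c and p != c)
    return 2 * tp / (2 * tp + fp + fn)


y_true = [0, 0, 0, 1, 2]  # hypothetical true classes
y_pred = [0, 0, 1, 1, 0]  # hypothetical predictions
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f1_micro(y_true, y_pred), accuracy)  # 0.6 0.6
```

Every misclassification counts once as a false positive (for the predicted class) and once as a false negative (for the true class), so the pooled score reduces to the fraction of correct predictions.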
For the controlled dataset [17], where classes are reduced from
5 to 3, the increase in performance by comfortGAN is roughly
maintained (6% increase), but for the Comfort Database [16], where
classes are reduced from 7 to 3, the comfortGAN performance increase
is barely 2%, while all the other methods provide a 1% increase. The
field dataset [23], which had three classes to begin with,
also shows a minimal increase of 2% in performance by comfortGAN,
while the other GAN variants achieve at most roughly the
same performance as the baseline. Figure 5 summarises this
performance, in terms of F1-micro score, of the baseline and comfortGAN
approaches for all datasets with their original and reduced classes.
Though GAN variants, particularly our proposed method comfortGAN,
surpassed the traditionally used augmentation algorithms
in thermal comfort classification performance, they are
not ready to be a one-size-fits-all, easily implementable solution.
CTGAN [46], TGAN [47], and our proposed comfortGAN models
required some preliminary fine-tuning and longer training time,
Figure 5: Machine learning efficacy (F1-micro score) of the
baseline and comfortGAN approaches for all datasets with
their original number of classes and with reduced classes (-
unlike the traditionally used SMOTE [6] and ADASYN [22], which are
available to building analysts and researchers for immediate use.
However, even after going through the process of using these
more complex models, the results in Figure 5 show that class
balancing for thermal comfort modeling achieves only a small performance
increase compared to the baseline.
5.1 Evaluation and effect on classification
The assessment of generated samples with Euclidean distance (Variability
and Diversity in Figure 4) may not be as reliable a similarity
metric as the SSIM score is for images in the field of computer
vision. The GAN-related literature usually prefers metrics such as
Inception Score or Fréchet Inception Distance for comparing
different GAN variants [ ]; the former measures the ability of the
generative model to retain the class ratio, and the latter measures
the distribution of features and labels, but it is biased when there is
a small number of samples per class. Based on this, we opted to
retain the Euclidean distance-based metrics, which can also be used with
the traditionally used methods (SMOTE [6] and ADASYN [22]).
Moreover, unlike other studies of GANs in the built environment
[ ] that construct a balanced test set for evaluation, we decided
to randomly split the datasets into train and test splits, which led
to an imbalanced test set as well, in order to mimic what is done in
practice by thermal comfort researchers. However, in this scenario,
a classification model still gives more importance to the predominant
class in the test set, which holds back a wider evaluation of the
generated under-represented classes. Also, it is possible that a model
tuned on each balanced dataset would perform better than the
current results in Table 1b, but we fixed an RDF for each dataset to
assess the impact of the generated samples alone. As mentioned
in [ ], a proper assessment of generative model performance is
only possible in the context of the application. For the thermal
comfort model context, if the goal is to use the model’s predictions
for building operations control, the GAN performance assessment
is tied to the model’s prediction performance. However, quantitative
evaluation metrics for GANs are currently a critical research
direction [3, 36, 42].
Additionally, as we show in Figure 5, when the dataset has only
three classes, the classification performance on a balanced dataset,
even with the top-performing method (which, based on our results
in Table 1b, is comfortGAN), barely increases. This result raises the
question of whether class balancing is needed at all. As mentioned
in Section 1, thermal comfort models are ultimately used for better
design of indoor spaces or to modify current control strategies in
indoor environments [ ]. Therefore, from a human perspective,
the difference between predicting “slightly cool” instead of “cool”
is smaller than that of predicting “slightly cool” instead of “neutral”. The
baseline performance increase on the controlled dataset [17] and
the Comfort Database [16] from reducing the number of classes is 6% and
23%, respectively, surpassing what the augmentation methods could
provide (4% and 17%, respectively). Although using augmentation
methods on the datasets with reduced classes can still give a further
performance improvement, this increase seems to be
minimal in two datasets: 1% on the field dataset [23] and 2% on the
Comfort Database [16]. Even if an increase of 6% can be achieved
on a dataset with reduced classes (comfortGAN on the controlled
dataset [17], Table 1b), an alternative in a real-world application to
improve the performance of a thermal comfort classification model
is to reduce the number of classes. By doing this as a first step,
we simplify the classification problem and might even be able to
skip the class-balancing pre-processing to focus on other, more
promising aspects (e.g., better models themselves) to increase
classification performance. This idea can be further explored by comparing
the energy and thermal comfort repercussions of a control strategy that
relies on a 5- or 7-class thermal comfort model against those of one that
relies on a 3-class thermal comfort model. We leave this direction
as future work.
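The comparison above can be reproduced from the F1-micro scores in Table 1 with a few lines of arithmetic. The sketch below takes the baseline and comfortGAN scores for the controlled dataset and the Comfort Database and reports gains in percentage points; the dictionary keys and function name are illustrative.

```python
# F1-micro scores copied from Table 1 (original vs. reduced classes).
baseline_original = {"controlled": 0.60, "comfort_db": 0.26}
baseline_reduced = {"controlled": 0.66, "comfort_db": 0.49}
comfortgan_original = {"controlled": 0.64, "comfort_db": 0.43}


def gain_points(after, before):
    """Difference between two scores, expressed in whole percentage points."""
    return round((after - before) * 100)


gains = {}
for name in baseline_original:
    merge_gain = gain_points(baseline_reduced[name], baseline_original[name])
    gan_gain = gain_points(comfortgan_original[name], baseline_original[name])
    gains[name] = (merge_gain, gan_gain)
    print(name, "merging classes:", merge_gain, "comfortGAN:", gan_gain)
# controlled merging classes: 6 comfortGAN: 4
# comfort_db merging classes: 23 comfortGAN: 17
```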
5.2 Model applicability
Despite the limitations and downsides of GANs when deployed
in practice (e.g., training stability and hyper-parameter tuning),
GANs are still a popular approach for data generation, particularly
when dealing with human-generated samples. The Generator does
not have access to the real data during the entire training process,
making privacy more achievable than with other commonly used methods like
Variational Autoencoders [ ]. In the built environment, privacy
and anonymization of thermal comfort responses are not necessarily
a concern, since occupants willingly share this information with
their peers or the facilities manager. While the use of GANs
for class balancing can be circumvented by reducing the number
of classes in the dataset, their upper hand in implicitly learning
distributions can be used for other tasks like transfer learning (e.g.,
learning the representation of a bigger thermal comfort experiment
to be applied to a smaller one), self-labeling of samples (e.g., using
the data points with subjective feedback to label the measurements
without feedback), or missing data imputation (based on learning
the data distribution first). We leave the exploration of these options
as future work.
5.3 Class balancing
Finally, the main assumption made in this balancing scenario is that
the data available to us is, to some extent, not fully representative
of reality, and that the data is disproportionate due to the collection
methods. If the real data is actually balanced in nature, researchers
could collect more data or augment the already collected data. However,
if this is not the case and the real underlying distribution is,
in fact, disproportionate, then by augmenting or re-balancing it we
are changing its nature. In the thermal comfort context, one of the
main reasons we obtain imbalanced datasets, in field experiments at
least, is that occupants are expected to find their thermal
environment acceptable in most situations. In fact, a balanced dataset
from a field experiment would indicate that the building is very
poorly managed, since it allows such a range of situations.
Thus, a modification of the data collection process, mentioned
in Section 1, is to conduct targeted surveys that purposely query the
user for subjective comfort feedback in unseen scenarios.
For example, if not enough “cold” responses are available, a survey is
prompted to the user when the temperature surrounding them is below
C or another arbitrary low temperature. Initial results of this
approach in [
] show that this type of surveying method decreases
user disturbance and data redundancy. Furthermore, relying on peer
information can also alleviate the need to collect longitudinal data
from every single building occupant [23].
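The targeted-survey rule can be sketched as a simple predicate. The function name, the minimum number of responses, and the temperature threshold passed in the usage lines are all hypothetical; the threshold is left as a required argument to be chosen by the analyst.

```python
def should_prompt_survey(temperature_c, response_counts, threshold_c,
                         target_label="cold", min_responses=10):
    """Return True when the target class is under-sampled and conditions favor it."""
    too_few = response_counts.get(target_label, 0) < min_responses
    return too_few and temperature_c < threshold_c


counts = {"cold": 2, "neutral": 140, "warm": 31}  # hypothetical responses so far
hypothetical_threshold_c = 19.0                   # illustrative value only
print(should_prompt_survey(18.0, counts, hypothetical_threshold_c))  # True
print(should_prompt_survey(23.5, counts, hypothetical_threshold_c))  # False
```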
Finally, class balancing is only one alternative for addressing imbalanced
datasets. As mentioned in Section 5.1, in the thermal comfort context,
the difference between the class values (e.g., “slightly cool”,
“cool”, and “warm”) might be better captured as a cost-sensitive
classification problem or by reformulating how these labels are seen
by the loss function of a given model. These other approaches,
however, are out of the scope of this work.
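One way to picture the cost-sensitive formulation is an ordinal cost that grows with the distance between the true and predicted votes on the 7-point thermal sensation scale. The absolute-difference cost below is an illustrative assumption, not a formulation evaluated in this work.

```python
# 7-point thermal sensation scale, ordered from coldest to hottest.
LABELS = ["cold", "cool", "slightly cool", "neutral",
          "slightly warm", "warm", "hot"]


def ordinal_cost(true_label, predicted_label):
    """Misclassification cost equal to the number of scale steps between labels."""
    return abs(LABELS.index(true_label) - LABELS.index(predicted_label))


def mean_cost(y_true, y_pred):
    """Average ordinal cost over a set of predictions."""
    return sum(ordinal_cost(t, p) for t, p in zip(y_true, y_pred)) / len(y_true)


print(ordinal_cost("slightly cool", "cool"))  # 1
print(ordinal_cost("slightly cool", "warm"))  # 3
print(mean_cost(["cool", "neutral"], ["cold", "neutral"]))  # 0.5
```

Under such a cost, confusing adjacent votes is penalized less than confusing votes on opposite sides of neutral, which plain accuracy or F1-micro cannot express.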
In this work, we present comfortGAN, a conditional Wasserstein
GAN-based approach for data augmentation, specifically class
balancing, in thermal comfort datasets. This approach’s main
contribution is the appropriate combination of GAN architecture and loss
function, together with an extensive evaluation on three publicly
available thermal comfort datasets. The datasets cover controlled
and field experiments as well as the Comfort Database (the biggest
thermal comfort database available to date). We evaluated the effect
that a balanced dataset, comprised of generated and
real samples, has on the performance of the thermal comfort classification
task. Mimicking the modeling setup of building analysts and thermal comfort
researchers, we evaluated samples generated from comfortGAN, two
traditionally used methods (SMOTE [6] and ADASYN [22]), and two
other GAN variants for data synthesis (CTGAN [46] and TGAN [47]).
We found that GAN variants consistently outperform the traditionally
used methods (Table 1b), and comfortGAN achieves the
highest increase in performance: from 4% in the controlled dataset
[17] to 17% in the Comfort Database [16]. However, reducing the
number of classes in the dataset by merging similar values (for example,
“cold”, “cool”, and “slightly cool”) already increases the baseline
classification performance (6% in the controlled dataset [17] and
23% in the Comfort Database [16]), reducing the performance
increase of augmentation methods to 1% or 2% (Table 1b). Ultimately,
the choice of using a class-balancing algorithm depends on the end
goal of the thermal comfort research. For control purposes, a 3-class
dataset and model should provide enough information to control
building and HVAC systems. Therefore, reducing the number of
classes should be the first attempt at obtaining more robust and better
classification models, which in some cases might still benefit from class
balancing (6% increase in the controlled dataset, Table 1b).
Our results open the door for further research into generative
models and thermal comfort data. Given that the environmental
and physiological parameters are measured at a much higher fre-
quency than the subjective comfort feedback, self-labeling of these
data points could be explored with GAN-based approaches or other
generative methods as well. Transfer learning on thermal comfort
datasets is also a promising avenue that could allow researchers
to leverage their human studies’ cohort size or the number of re-
sponses per participant.
We described comfortGAN as a tool that shows the potential to
be part of data-driven thermal comfort modeling, particularly
when compared to the traditional methods commonly used. Even
if other practical approaches, such as reducing the number of classes,
achieve an equal or better performance increase in most cases, we
believe the decision rests with the researcher or practitioner and their
use case. If their assumptions about the data allow for class balancing,
augmentation methods like comfortGAN can be used as part of their
modeling pipeline. On the other hand, if the resulting model will be
incorporated into a control strategy, a reduction in the number of
classes should be considered first, before diving into augmentation.
This research was funded by the Republic of Singapore’s National
Research Foundation through a grant to the Berkeley Education
Alliance for Research in Singapore (BEARS) for the Singapore-Berkeley
Building Efficiency and Sustainability in the Tropics (SinBerBEST)
Program. BEARS has been established by the University
of California, Berkeley as a center for intellectual excellence in
research and education in Singapore.
REFERENCES
[1] Martin Arjovsky and Léon Bottou. 2019. Towards principled methods for training generative adversarial networks. 5th International Conference on Learning Representations, ICLR 2017 - Conference Track Proceedings (2019), 1–17. arXiv:1701.04862
[2] Martin Arjovsky, Soumith Chintala, and Léon Bottou. 2017. Wasserstein GAN. (2017). arXiv:1701.07875
[3] Sanjeev Arora, Andrej Risteski, and Yi Zhang. 2018. Do GANs Learn the Distribution? Some Theory and Empirics. ICLR 2018 (2018), 1–16.
[4] Liliana Barrios and Wilhelm Kleiminger. 2017. The Comfstat - Automatically sensing thermal comfort for smart thermostats. In IEEE International Conference on Pervasive Computing and Communications (PerCom). 257–266. https://doi.
[5] Andrew Brock, Jeff Donahue, and Karen Simonyan. 2019. Large scale GAN training for high fidelity natural image synthesis. 7th International Conference on Learning Representations, ICLR 2019 (2019), 1–35. arXiv:1809.11096
[6] Nitesh V. Chawla, Kevin W. Bowyer, Lawrence O. Hall, and W. Philip Kegelmeyer. 2002. SMOTE: Synthetic Minority Over-sampling Technique. Journal of Artificial Intelligence Research 16 (2002), 321–357.
[7] Zhengping Che, Yu Cheng, Shuangfei Zhai, Zhaonan Sun, and Yan Liu. 2017. Boosting deep learning risk prediction with generative adversarial networks for electronic health records. In IEEE International Conference on Data Mining (ICDM), Vol. November. 787–792. arXiv:1709.01648
[8] Toby C.T. Cheung, Stefano Schiavon, Elliott T. Gall, Ming Jin, and William W. Nazaroff. 2017. Longitudinal assessment of thermal and perceived air quality acceptability in relation to temperature, humidity, and CO2 exposure in Singapore. Building and Environment 115 (2017), 80–90.
[9] Hafedh Chourabi, Taewoo Nam, Shawn Walker, J. Ramon Gil-Garcia, Sehl Mellouli, Karine Nahon, Theresa A. Pardo, and Hans Jochen Scholl. 2012. Understanding smart cities: An integrative framework. Proceedings of the Annual Hawaii International Conference on System Sciences (2012), 2289–2297. arXiv:arXiv:1011.1669v3
[10] Adrian K. Clear, Sam Mitchell Finnigan, Patrick Olivier, and Rob Comber. 2018. ThermoKiosk: Investigating Roles for Digital Surveys of Thermal Experience in Workplace Comfort Management. Proc. of CHI (2018), 1–12.
[11] Georgios Douzas and Fernando Bacao. 2018. Effective data generation for imbalanced learning using conditional generative adversarial networks. Expert Systems with Applications 91, January 2018 (2018), 464–471.
[12] Carlos Duarte Roa, Stefano Schiavon, and Thomas Parkinson. 2020. Targeted occupant surveys: A novel method to effectively relate occupant feedback with environmental conditions. Building and Environment 184, April (2020), 107129.
[13] Diana Enescu. 2017. A review of thermal comfort models and indicators for indoor environments. Renewable and Sustainable Energy Reviews 79 (2017), 1353–1379.
[14] P.O. Fanger. 1973. Assessment of thermal comfort practice. British Journal of Industrial Medicine 30 (1973), 313–324.
[15] Damien Fay, Liam O’Toole, and Kenneth N. Brown. 2017. Gaussian Process models for ubiquitous user comfort preference sampling; global priors, active sampling and outlier rejection. Pervasive and Mobile Computing 39 (2017), 135–
[16] Veronika Földváry Ličina, Toby Cheung, Hui Zhang, Richard de Dear, Thomas Parkinson, Edward Arens, Chungyoon Chun, Stefano Schiavon, Maohui Luo, Gail Brager, Peixian Li, and Soazig Kaam. 2018. ASHRAE Global Thermal Comfort Database II. Dataset v4 (2018), 1–4.
[17] Jonathan Francis, Matias Quintana, Nadine Von Frankenberg, and Mario Bergés. 2019. Dataset: Inferring Thermal Comfort using Body Shape Information Utilizing Depth Sensors. In DATA’19 Proceedings of the 2nd Workshop on Data Acquisition To Analysis, ACM (Ed.). New York, NY, USA, 13–15.
[18] Jonathan Francis, Matias Quintana, Nadine Von Frankenberg, and Mario Bergés. 2019. OccuTherm: Occupant Thermal Comfort Inference using Body Shape Information. In BuildSys ’19 Proceedings of the 6th ACM International Conference on Systems for Energy-Efficient Built Environments. New York, NY, USA. https:
[19] Peter Xiang Gao and Srinivasan Keshav. 2013. Optimal Personal Comfort Management Using SPOT+. In Proceedings of the 5th ACM Workshop on Embedded Systems For Energy-Efficient Buildings (BuildSys’13). ACM, New York, NY, USA, 1–8.
[20] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative Adversarial Nets. Corrosion 3, January (2014), iii. 0-408- 00109-0.50001-8 arXiv:arXiv:1011.1669v3
[21] Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron Courville. 2017. Improved training of Wasserstein GANs. Advances in Neural Information Processing Systems 2017-Decem (2017), 5768–5778. arXiv:1704.00028
[22] Haibo He, Yang Bai, Edwardo A Garcia, and Shutao Li. 2008. ADASYN: Adaptive synthetic sampling approach for imbalanced learning. In IEEE International Joint Conference on Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence) 3 (2008), 1322–1328.
[23] Prageeth Jayathissa, Matias Quintana, Mahmoud Abdelrahman, and Clayton Miller. 2020. Humans-as-a-sensor for buildings: Intensive longitudinal indoor comfort models. (2020). arXiv:2007.02014
[24] James Jordon, Jinsung Yoon, and Mihaela Van Der Schaar. 2019. PATE-GAN: Generating synthetic data with differential privacy guarantees. 7th International Conference on Learning Representations, ICLR 2019 (2019), 1–21.
[25] Wooyoung Jung and Farrokh Jazizadeh. 2020. Energy saving potentials of integrating personal thermal comfort models for control of building systems: Comprehensive quantification through combinatorial consideration of influential parameters. Applied Energy 268, November 2019 (2020), 114882. https:
[26] Joyce Kim, Stefano Schiavon, and Gail Brager. 2018. Personal comfort models – A new paradigm in thermal comfort for occupant-centric environmental control. Building and Environment 132, November 2017 (2018), 114–124. https:
[27] Joyce Kim, Yuxun Zhou, Stefano Schiavon, Paul Raftery, and Gail Brager. 2018. Personal comfort models: predicting individuals’ thermal preference using occupant heating and cooling behavior and machine learning. Building and Environment 129, February 2018 (2018), 96–106.
[28] Aleksandra Lipczynska, Stefano Schiavon, and Lindsay T. Graham. 2018. Thermal comfort and self-reported productivity in an office with ceiling fans in the tropics. Building and Environment 135, January (2018), 202–212.
[29] Shichao Liu, Stefano Schiavon, Hari Prasanna Das, Costas J. Spanos, Ming Jin, and Costas J. Spanos. 2019. Personal thermal comfort models with wearable sensors. Building and Environment 162, July (2019), 106281.
[30] Mario Lucic, Karol Kurach, Marcin Michalski, Sylvain Gelly, and Olivier Bousquet. 2017. Are GANs Created Equal? A Large-Scale Study. Nips (2017).
[31] Maohui Luo, Jiaqing Xie, Yichen Yan, Zhihao Ke, Peiran Yu, Zi Wang, and Jingsi Zhang. 2020. Comparing machine learning algorithms in predicting thermal sensation using ASHRAE Comfort Database II. Energy and Buildings 210 (2020),
[32] Liqian Ma, Xu Jia, Qianru Sun, Bernt Schiele, Tinne Tuytelaars, and Luc Van Gool. 2017. Pose Guided Person Image Generation. Nips (2017), 1–11. arXiv:1705.09368
[33] Giovanni Mariani, Florian Scheidegger, Roxana Istrate, Costas Bekas, and Cristiano Malossi. 2018. BAGAN: Data Augmentation with Balancing GAN. (2018), 1–9. arXiv:1803.09655
[34] Mehdi Mirza and Simon Osindero. 2014. Conditional Generative Adversarial Nets. (2014), 1–7. arXiv:1411.1784
[35] Sergey I. Nikolenko. 2019. Synthetic Data for Deep Learning. (2019), 1–156.
[36] Augustus Odena. 2019. Open Questions about Generative Adversarial Networks. Distill (2019).
[37] June Young Park and Zoltan Nagy. 2018. Comprehensive analysis of the relationship between thermal comfort and building control research - A data-driven literature review. Renewable and Sustainable Energy Reviews 82, July (2018),
[38] Noseong Park, Mahmoud Mohammadi, Kshitij Gorde, Sushil Jajodia, Hongkyu Park, and Youngmin Kim. 2018. Data Synthesis based on Generative Adversarial Networks. 11, 10 (2018), 1071–1083.
[39] Guim Perarnau, Joost van de Weijer, Bogdan Raducanu, and Jose M. Álvarez. 2016. Invertible Conditional GANs for image editing. arXiv preprint arXiv:1611.06355 (2016). arXiv:1611.06355
[40] Stephen R. Porter, Michael E. Whitcomb, and William H. Weitzer. 2004. Multiple surveys of students and survey fatigue. New Directions for Institutional Research 2004, 121 (2004), 63–73.
[41] Matias Quintana and Clayton Miller. 2019. Poster Abstract: Towards Class-Balancing Human Comfort Datasets with GANs. In BuildSys ’19 Proceedings of the 6th ACM International Conference on Systems for Energy-Efficient Built Environments. New York, NY, USA.
[42] Mehdi S.M. Sajjadi, Olivier Bousquet, Olivier Bachem, Mario Lucic, and Sylvain Gelly. 2018. Assessing generative models via precision and recall. Advances in Neural Information Processing Systems 2018-Decem, NeurIPS (2018), 5228–5237.
[43] F Stinner, Y Yang, T Schreiber, G Bode, M Baranski, and D Müller. 2019. Generating generic data sets for machine learning applications in building services using standardized time series data. Proceedings of the 36th International Symposium on Automation and Robotics in Construction, ISARC 2019 Isarc (2019), 226–233.
[44] Lucas Theis, Aäron van den Oord, and Matthias Bethge. 2015. A note on the evaluation of generative models. (2015), 1–10. arXiv:1511.01844
[45] Zhengwei Wang, Qi She, and Tomas E. Ward. 2019. Generative Adversarial Networks: A Survey and Taxonomy. 2 (2019), 1–16. arXiv:1906.01529 http:
[46] Lei Xu, Maria Skoularidou, Alfredo Cuesta-Infante, and Kalyan Veeramachaneni. 2019. Modeling Tabular data using Conditional GAN. In 33rd Conference on Neural Information Processing Systems (NeurIPS). arXiv:1907.00503 http://arxiv.
[47] Lei Xu and Kalyan Veeramachaneni. 2018. Synthesizing Tabular Data using Generative Adversarial Networks. (2018). arXiv:1811.11264
[48] Ke Yan, Adrian Chong, and Yuchang Mo. 2020. Generative adversarial network for fault detection diagnosis of chillers. Building and Environment 172, December 2019 (2020), 106698.
[49] Haifeng Zhao, Jasper S Wijnands, Kerry A Nice, Jason Thompson, Gideon D P A Aschwanden, Mark Stevenson, and Jingqiu Guo. 2019. Unsupervised Deep Learning to Explore Streetscape Factors Associated with Urban Cyclist Safety Haifeng. Vol. 149. Springer Singapore. 155–164 pages.
[50] Jun Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros. 2017. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks. Proceedings of the IEEE International Conference on Computer Vision 2017-Octob (2017), 2242–2251.
... To that end, the same core of researchers in [55] expanded the scope of their study in [64], including more training datasets and hyper-parametrizing a bespoke model, called ComfortGAN. ComfortGAN relies on the CGAN architecture with Wasserstein loss and a gradient penalty. ...
... Subsequently, the group reprised its experiments, now adjusting the datasets to consolidate the number of predictable classes down to 3. Once again, ComfortGAN retained its statistical advantage, but the margin was less pronounced. This led [64] to speculate that for the research problem in question, the complexity and computational load of a GAN-based approach may not be justified, particularly when, as in the case with human comfort survey data, classes can be combined, and greater emphasis can be placed on fine-tuning the machine learning models themselves rather than data augmentation. ...
... As suggested by Johnson and Khoshgoftaar [6], the ability of a deep learning scheme to recognize patterns may not be solely affected by the ratio of class imbalance, but also by the raw total of minority samples. Even with severe class imbalance, if the quantity of minority cases is sufficient, a pattern may be detected; however, in analogous case Lee and Park [39] Wang et al. [44] Wang et al. [43] Vu et al. [36] Financial Transactions Engelmann and Lessmann [54] Other Disciplines Quintana and Miller [55] Quintana and Miller [64] Dos Santos Tanakha et al. [65] CGAN with other modules Financial Transactions Lei et al. [51] Other Disciplines Deepshikha and Naman [69] Page 22 of 37 Sauber-Cole and Khoshgoftaar Journal of Big Data (2022) 9:98 Therefore, domains where the total number of minority cases is lower (such as in financial fraud detection) may be more conducive to the use of AnoGAN architecture relative to CGAN architecture. The preponderance of Wasserstein GANs and Wasserstein loss as a loss function in the domain of financial transactions relative to the domain of cybersecurity is related to the complexity of the respective datasets in these two domains. ...
The existence of class imbalance in a dataset can greatly bias the classifier towards majority classification. This discrepancy can pose a serious problem for deep learning models, which require copious and diverse amounts of data to learn patterns and output classifications. Traditionally, data-level and algorithm-level techniques have been instrumental in mitigating the adverse effect of class imbalance. With the recent development and proliferation of Generative Adversarial Networks (GANs), researchers across a variety of disciplines have adapted the architecture of GANs and implemented them on imbalanced datasets to generate instances of the underrepresented class(es). Though the bulk of research has been centered on the application of this methodology in computer vision tasks, GANs are likewise being appropriated for use in tabular data, or data consisting of rows and columns with traditional structured data types. In this survey paper, we assess the methodology and efficacy of these modifications on tabular datasets, across domains such as network traffic classification and financial transactions over the past seven years. We examine what methodologies and experimental factors have resulted in the greatest machine learning efficacy, as well as the research works and frameworks which have proven most influential in the development of the application of GANs in tabular data settings. Specifically, we note the prevalence of the CGAN architecture, the optimality of novel methods with CNN learners and minority-class sensitive measures such as F1 score, the popularity of SMOTE as a baseline technique, and the improved performance in the year-over-year use of GANs in imbalanced tabular datasets.
... Generating synthetic data to augment the training dataset is one of the approaches that has been proposed for tackling the above challenges. The synthetic data generation can be done using classical methods such as SMOTE [49,50] or using advanced neural network-based generative models [50,51,52,53]. ...
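The core step of SMOTE, mentioned above as a classical method, is interpolation between a minority sample and a nearby minority sample. The sketch below is a simplified, single-neighbor illustration under stated assumptions: the feature values are hypothetical, and a full SMOTE implementation picks randomly among the k nearest minority neighbors rather than always using the closest one.

```python
import random

def nearest_neighbor(x, others):
    """Return the sample in `others` with the smallest squared Euclidean distance to x."""
    return min(others, key=lambda o: sum((a - b) ** 2 for a, b in zip(x, o)))

def smote_like_sample(x, neighbor, rng):
    """Create a synthetic point on the segment between x and its neighbor."""
    u = rng.random()  # interpolation weight in [0, 1)
    return [a + u * (b - a) for a, b in zip(x, neighbor)]

rng = random.Random(0)

# Hypothetical minority-class rows: [air temperature in C, relative humidity in %]
minority = [[30.0, 60.0], [31.0, 58.0], [29.5, 65.0]]

base = minority[0]
neighbor = nearest_neighbor(base, minority[1:])
synthetic = smote_like_sample(base, neighbor, rng)

print(neighbor)
print(synthetic)
```

Because the synthetic row lies on the line segment between two real minority rows, it stays inside the range those rows span in every feature.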
Energy consumption in buildings, both residential and commercial, accounts for approximately 40% of all energy usage in the U.S., and similar numbers are being reported from countries around the world. This significant amount of energy is used to maintain a comfortable, secure, and productive environment for the occupants. So, it is crucial that the energy consumption in buildings must be optimized, all the while maintaining satisfactory levels of occupant comfort, health, and safety. Recently, Machine Learning has been proven to be an invaluable tool in deriving important insights from data and optimizing various systems. In this work, we review the ways in which machine learning has been leveraged to make buildings smart and energy-efficient. For the convenience of readers, we provide a brief introduction of several machine learning paradigms and the components and functioning of each smart building system we cover. Finally, we discuss challenges faced while implementing machine learning algorithms in smart buildings and provide future avenues for research at the intersection of smart buildings and machine learning.
... Besides generating data, GANs are also useful in data upsampling, data privacy protection, and data extrapolation as they excel at learning and reproducing the ground truth. This trait has made GANs applicable in various domains such as medicine (Beaulieu-Jones et al., 2019; Litjens et al., 2019; Bowles et al., 2018) and the built environment (Quintana et al., 2020; Yan et al., 2020a; Rachele et al., 2021). ...
Generative Adversarial Network (GAN) is widely used in many generative problems, including in spatial information sciences and urban systems. The data generated by GANs can achieve high quality to augment downstream training or to complete missing entries in a dataset. GANs can also be used to learn the relationship between two datasets and translate one into another, e.g. road network data into building footprint data. However, such approach has not been developed in the geospatial and urban data science context, its usability remains unknown, and the methods are not fully developed. We develop a new Geographical Data Translation algorithm based on GAN to generate high-resolution vector building data solely from street networks, which may be used to predict the urban morphology in absence of building data, also enabling studies in unmapped or undermapped urban geographies, among other advantages. Experiments on 16 cities around the world demonstrate that the generated datasets are largely successful in resembling ground truth morphologies. Thus, the approach may be used in lieu of traditional data for tasks that are often hampered by lack of data, e.g. urban form studies, simulation of urban morphologies in new contexts, and spatial data quality assessment. Our work proposes a novel rapid approach to generate building footprints in replacement of procedural methods and it introduces a new intrinsic method for large-scale spatial data quality control, which we test on OpenStreetMap by predicting missing buildings and suggesting the completeness of data without the usually required authoritative counterparts. The code, sample model, and dataset are available openly.
... vey [33]. To build upon this foundation and help solve the issue of collecting perception data from people, our team has contributed to the development of the micro-EMA Cozie project that targets indoor occupant data collection [34,35]. Cozie is an open-source application that one can install on Fitbit (Versa 2 and Ionic) or Apple smartwatches. The platform has been utilized in previous studies to test the implementation and modelling of smartwatch-based subjective data collection [36-38], study thermal preference, imbalanced classes [39], and create personal comfort models using building information model components as inputs [40]. One can find more information about Cozie and the official documentation at and https://cozie ...
Personal thermal comfort models are a paradigm shift in predicting how building occupants perceive their thermal environment. Previous work has critical limitations related to the length of the data collected and the diversity of spaces. This paper outlines a longitudinal field study comprising 20 participants who answered Right‐Here‐Right‐Now surveys using a smartwatch for 180 days. We collected more than 1080 field‐based surveys per participant. Surveys were matched with environmental and physiological measured variables collected indoors in their homes and offices. We then trained and tested seven machine learning models per participant to predict their thermal preferences. Participants indicated wanting no change in their thermal environment 58% of the time, despite completing 75% of these surveys at temperatures higher than 26.6°C. All but one personal comfort model had a median prediction accuracy of 0.78 (F1‐score). Skin, indoor, near body temperatures, and heart rate were the most valuable variables for accurate prediction. We found that ≈250–300 data points per participant were needed for accurate prediction. We, however, identified strategies to significantly reduce this number. Our study provides quantitative evidence on how to improve the accuracy of personal comfort models, proves the benefits of using wearable devices to predict thermal preference, and validates results from previous studies.
... This approach does not address the inherent class balancing issue in thermal preference datasets, i.e., having an unequal number of data points for each thermal preference category [66]. However, we decided to have the same number of training data points per participant. ...
Cohort Comfort Models (CCM) are introduced as a technique for creating a personalized thermal prediction for a new building occupant without the need to collect large amounts of individual comfort-related data. This approach leverages historical data collected from a sample population, who have some underlying preference similarity to the new occupant. The method uses background information such as physical and demographic characteristics and one-time onboarding surveys (satisfaction with life scale, highly sensitive person scale, personality traits) from the new occupant, as well as physiological and environmental sensor measurements paired with a few thermal preference responses. The framework was implemented using two personal comfort datasets containing longitudinal data from 55 people. The datasets comprise more than 6,000 unique right-here-right-now thermal comfort surveys. The results show that a CCM that uses only the one-time onboarding survey information of an individual occupant has generally as good or better performance as compared to conventional general-purpose models, but uses no historical longitudinal data as compared to personalized models. If up to ten historical personal preference data points are used, CCM increased the thermal preference prediction by 8% on average and up to 36% for half of the occupants in the first of the tested datasets. In the second dataset, one-third of the occupants increased their thermal preference prediction by 5% on average and up to 46%. CCM can be an important step toward the development of personalized thermal comfort models without the need to collect a large number of datapoints per person.
... This data-agnostic nature has expanded the potential applications of GANs to a wider range of generative problems and has already established the state-of-the-art in many applications such as speech enhancement, music composition, fault detection, and graph-based prediction (Pascual et al., 2017; Dong et al., 2018; Lee et al., 2017; Wang et al., 2018a). In addition to generating data, GANs can also be applied to data upsampling, data privacy protection, and data augmentation, as they excel in learning and reproducing the data distribution of the target dataset (Beaulieu-Jones et al., 2019; Litjens et al., 2019; Bowles et al., 2018; Quintana et al., 2020; Yan et al., 2020; Rachele et al., 2021; Wu and Biljecki, 2022). ...
Generative Adversarial Networks (GANs) are a type of deep neural network that have achieved many state-of-the-art results for generative tasks. GANs can be useful in the built environment, from processing large-scale urban mobility data and remote sensing images at the regional level, to performance analysis and design generation at the building level. We analyzed 100 articles to provide a comprehensive state-of-the-art review on how GANs are currently applied to solve challenging tasks in the built environment. Our results show that: (i) GANs are replacing older methods in some problems and setting state-of-the-art performances; (ii) GANs are opening new frontiers in previously overlooked problems, such as automatically generating spatially accurate floorplan layouts; (iii) GANs can be applied to different scales in the built environment, from entire cities to neighborhoods and buildings; and (iv) GANs are being used in a variety of problems and data types, from remote sensing data augmentation, vector data generation, spatio-temporal data privacy protection, to building design generation. In total, there are 26 unique application domains enabled by GANs; (v) however, one common challenge in this field currently is the lack of high-quality datasets curated specifically for problems in the built environment. With more data in the future, GANs could potentially produce even better results than today.
... In computer vision, Das et al. (2021a) propose synthetic data generation across multiple domains. In smart buildings, Quintana et al. (2020) used a conditional tabular GAN-based model for thermal comfort synthetic data generation. We use a state-of-the-art conditional synthetic data generation model that has shown improved results over all baselines to generate thermal comfort synthetic data. ...
Personal thermal comfort models aim to predict an individual's thermal comfort response, instead of the average response of a large group. Recently, machine learning algorithms have shown enormous potential as candidates for personal thermal comfort models. However, within the normal settings of a building, personal thermal comfort data obtained via experiments are often heavily class-imbalanced. There are a disproportionately high number of data samples for the "Prefer No Change" class, as compared with the "Prefer Warmer" and "Prefer Cooler" classes. Machine learning algorithms trained on such class-imbalanced data perform sub-optimally when deployed in the real world. To develop robust machine learning-based applications using the above class-imbalanced data, as well as for privacy-preserving data sharing, we propose to implement a state-of-the-art conditional synthetic data generator to generate synthetic data corresponding to the low-frequency classes. Via experiments, we show that the synthetic data generated has a distribution that mimics the real data distribution. The proposed method can be extended for use by other smart building datasets/use-cases.
... Generative adversarial networks (GANs) are a type of generative model introduced by Goodfellow et al. (2014), which have rapidly gained currency in a variety of application domains, such as thermal comfort, energy, and design (Quintana et al., 2020; Yan et al., 2020a; Rachele et al., 2021). Using a generator-discriminator model pair in the training process, the generator in a GAN gradually learns to create data distributions that pass the checks by the discriminator, therefore producing patterns that closely resemble the original dataset. ...
We present a new method to create spatial data using a generative adversarial network (GAN). Our contribution uses coarse and widely available geospatial data to create maps of less available features at the finer scale in the built environment, bypassing their traditional acquisition techniques (e.g. satellite imagery or land surveying). In the work, we employ land use data and road networks as input to generate building footprints and conduct experiments in 9 cities around the world. The method, which we implement in a tool we release openly, enables the translation of one geospatial dataset to another with high fidelity and morphological accuracy. It may be especially useful in locations missing detailed and high-resolution data and those that are mapped with uncertain or heterogeneous quality, such as much of OpenStreetMap. The quality of the results is influenced by the urban form and scale. In most cases, the experiments suggest promising performance as the method tends to truthfully indicate the locations, amount, and shape of buildings. The work has the potential to support several applications, such as energy, climate, and urban morphology studies in areas previously lacking required data or inpainting geospatial data in regions with incomplete data.
... By undersampling and oversampling the training set, the unbalanced data set can be transformed into a balanced data set. To generate data, some researchers use the data synthesis method, specifically the Generative Adversarial Network (GAN) method [53][54][55]. Hence, re-training produces much better results than using only the initial data. ...
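The oversampling half of the rebalancing described above can be sketched as random duplication of minority-class rows until every class matches the largest one. This is a minimal illustration with hypothetical data; the function name `random_oversample` is introduced here, and in practice such resampling is applied to the training split only so that duplicated rows do not leak into evaluation.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class up to the largest class size."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)

    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw with replacement to fill the gap to the target size.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels

# Hypothetical training set: row ids and thermal preference labels.
samples = list(range(9))
labels = ["no change"] * 6 + ["warmer"] * 2 + ["cooler"]

balanced_samples, balanced_labels = random_oversample(samples, labels)
print(Counter(labels))
print(Counter(balanced_labels))
```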
Machine learning-based human thermal comfort prediction is becoming increasingly popular as artificial intelligence (AI) technologies advance. Human skin temperature is a critical physiological factor in thermal comfort research. In winter, we developed a thermal comfort prediction model based on skin temperature and environmental factors. During the experimental phase, the superior performance of the proposed method is demonstrated through a comparative study that includes four different state-of-the-art models, including Support Vector Machine, Decision Tree, Ensemble Algorithms, and K-Nearest Neighbor. With all variables as inputs, the actual accuracy of the proposed thermal sensation vote (TSV) model prediction is 95.8%. In addition, the hyperparameters of machine learning algorithms were tuned using a personal classification model based on the Bayesian Optimization technique. This study demonstrates the model’s capability of predicting individual thermal comfort.
Growing research on data-driven methods to optimize energy systems and power grid operation requires large amounts of building energy consumption profile data. Load data are difficult to obtain: collection is time-consuming and raises privacy issues, and previous research on total load generation cannot meet the requirements of refined energy control and optimization. In this study, we propose a novel approach based on the conditional generative adversarial network (CGAN) and the moving average method to generate sub-item load profiles of building energy consumption to solve the aforementioned problems. Sub-item load profiles include light and socket load, HVAC (Heating, Ventilation and Air Conditioning) load, impetus load and special load. The CGAN algorithm is employed to generate sub-item load profiles considering specific conditions (multiple labels), e.g. time, weather, and load shape labels. In addition, the moving average method was used to reduce noise in the generated profiles. The case study was conducted based on real-world sub-item load data collected from office buildings, commercial buildings, and hospitals in Shenzhen, China. We validated the generation performance of the sub-item load profile of CGAN-MA by comparing it with the traditional load profile generation methods GAN and variational autoencoder based on three aspects: similarity, variability and diversity. Compared with the traditional models, the proposed model improves the similarity and variability by about 5.7% to 64.8% and 76.7% to 135.5%, respectively, and can satisfy the requirements of diversity, with diversity indicators of 1.36, 1.93, 1.81 and 2.08 for the four generated sub-item loads. Furthermore, we compared the generated load and real load probability distributions under the selected conditions.
The results show that the load generated by CGAN-MA is higher on working days, rainy days and hot days than on non-working days, sunny days and cool days, which correspond to the real circumstances, and sub-item B (HVAC) is the most sensitive one to different conditions. The proposed model can be applied to effectively generate sub-load profiles under the required conditions and further help in studies related to the development of data-driven methods for energy consumption prediction, demand-side management and the optimization of power grid operation.
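The noise-reduction step mentioned in the abstract above relies on a moving average. A minimal sketch follows, assuming a centered window that shrinks at the ends of the series; the abstract does not specify the window variant or length, and the load values are hypothetical.

```python
def moving_average(values, window=3):
    """Centered moving average; the window is truncated at the series boundaries."""
    if window < 1:
        raise ValueError("window must be at least 1")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Hypothetical noisy generated load profile (kW).
profile = [10.0, 14.0, 9.0, 15.0, 11.0]
smoothed = moving_average(profile, window=3)
print(smoothed)
```

The smoothed series has the same length as the input and a narrower spread, which is the intended effect on a noisy generated profile.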
Evaluating and optimising human comfort within the built environment is challenging due to the large number of physiological, psychological and environmental variables that affect occupant comfort preference. Human perception could be helpful in capturing these disparate phenomena and interpreting their impact; the challenge is collecting spatially and temporally diverse subjective feedback in a scalable way. This paper presents a methodology to collect intensive longitudinal subjective feedback of comfort-based preference using micro ecological momentary assessments on a smartwatch platform. An experiment with 30 occupants over two weeks produced 4378 field-based surveys for thermal, light, and noise preference. The occupants and the spaces in which they left feedback were then clustered according to these preference tendencies. These groups were used to create different feature sets with combinations of environmental and physiological variables, for use in a multi-class classification task. These classification models were trained on a feature set that was developed from time-series attributes, environmental and near-body sensors, heart rate, and the historical preferences of both the individual and the comfort group assigned. The most accurate model had multi-class classification F1 micro scores of 64%, 80% and 86% for thermal, light, and noise preference, respectively. The discussion outlines how these models can enhance comfort preference prediction when supplementing data from installed sensors. The approach presented prompts reflection on how the building analysis community evaluates, controls, and designs indoor environments through balancing the measurement of variables with occupant preferences in an intensive longitudinal way.
Human comfort datasets are widely used in smart buildings. From thermal comfort prediction to personalized indoor environments, labelled subjective responses from participants in an experiment are required to feed different machine learning models. However, many of these datasets are small in samples per participant or in number of participants, or suffer from class imbalance in their subjective responses. In this work we explore the use of Generative Adversarial Networks to generate synthetic samples to be used in combination with real ones for data-driven applications in the built environment.
Automatic fault detection and diagnosis (AFDD) for chillers has significant impacts on energy saving, indoor environment comfort and systematic building management. Recent works show that the artificial intelligence (AI) enhanced techniques outperform most of the traditional fault detection and diagnosis methods. However, one serious issue has been raised in recent studies, which shows that insufficient number of fault training samples in the training phase of AI techniques can significantly influence the final classification accuracy. The insufficient number of fault samples refers to the imbalanced-class classification problem, which is a hot topic in the field of machine learning. In this study, we re-visit the imbalanced-class problem for fault detection and diagnosis of chiller in the heating, ventilation and air-conditioning (HVAC) system. The generative adversarial network is employed and customized to re-balance the training dataset for chiller AFDD. Experimental results demonstrate the effectiveness of the proposed GAN-integrated framework compared with traditional chiller AFDD methods.
Conference Paper
Thermal comfort is a decisive factor for the well-being, productivity, and overall satisfaction of commercial building occupants. Many commercial building automation systems either use a fixed zone-wide temperature set-point for all occupants or they rely on extensive sensor deployments with frequent online interaction with occupants. This results in inadequate comfort levels or significant training effort from users, respectively. However, the increasing ubiquity of cheap, depth-based occupancy tracking systems has enabled an improvement in inferential capabilities. We propose the novel system OccuTherm to model thermal comfort of occupants. We conducted a laboratory study with 77 participants to collect data for the implementation of a thermal comfort model that derives thermal comfort using the human body shape. Based on the comparison with model baselines and ablations, we show that our approach infers thermal comfort of individuals with 60% accuracy when body shape information is taken into account; 6% more than state-of-the-art approaches. We make our code, mobile app, datasets, and models freely available.
Conference Paper
Thermal comfort is very important for the well-being and productivity of building occupants. It has been shown that body shape is a useful feature to determine the thermal comfort of individuals [2]. This is because the heat dissipation rate of individuals depends on the body surface area. As a result, a tall and skinny person can tolerate a higher room temperature than a person with a rounded body shape [5]. In order to test this hypothesis, we performed a year-long experiment in 2017, where we recruited 77 participants and put each of them in a thermally controlled conference room at CMU for 3 hours and recorded their subjective responses regarding thermal comfort at different temperatures ranging from 60°F to 80°F. In addition, we collected depth data of individuals using a vertically mounted Microsoft Kinect for XBOX One at the entrance of the conference room to capture their body shape. We also collected biometric features (e.g., Galvanic Skin Response (GSR), skin temperature) using a Microsoft Health Band worn by the subjects. The resulting dataset provides rich information regarding how different features can be used to infer the thermal comfort of individuals.
A personal comfort model is an approach to thermal comfort modeling, for thermal environmental design and control, that predicts an individual's thermal comfort response, instead of the average response of a large population. We developed personal thermal comfort models using lab-grade wearables during normal daily activities. We collected physiological signals (e.g., skin temperature, heart rate) of 14 subjects (6 female and 8 male adults) and environmental parameters (e.g., air temperature, relative humidity) for 2–4 weeks (at least 20 h per day). Then we trained 14 models for each subject with different machine-learning algorithms to predict their thermal preference. The results show that the median prediction power could be up to 24%/78%/0.79 (Cohen's kappa/accuracy/AUC) with all features considered. The median prediction power reaches 21%/71%/0.7 after 200 subjective votes. We explored the importance of different features on the prediction performance by considering all subjects in one dataset. When all features are included for the entire dataset, personal comfort models can generate the highest performance of 35%/76%/0.80 by the most predictive algorithm. Personal comfort models display the highest prediction power when occupants' thermal sensation is outside thermal neutrality. Skin temperature measured at the ankle is more predictive than that measured at the wrist. We suggest that Cohen's kappa or AUC should be employed to assess the performance of personal thermal comfort models for imbalanced datasets due to their capacity to exclude random success.
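The recommendation above to prefer Cohen's kappa over accuracy on imbalanced data can be illustrated with a classifier that always predicts the majority class. The labels below are hypothetical; the kappa formula is the standard one, $\kappa = (p_o - p_e) / (1 - p_e)$, where $p_o$ is observed agreement and $p_e$ is agreement expected by chance.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cohens_kappa(y_true, y_pred):
    """Agreement between predictions and labels, corrected for chance agreement."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal class frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return 0.0  # kappa is undefined here; 0.0 is returned by convention in this sketch
    return (p_o - p_e) / (1 - p_e)

# Hypothetical imbalanced labels and a majority-class-only predictor.
y_true = ["no change"] * 8 + ["warmer", "cooler"]
y_pred = ["no change"] * 10

print(accuracy(y_true, y_pred))
print(cohens_kappa(y_true, y_pred))
```

The predictor scores 80% accuracy without learning anything, while its kappa is zero because all of that agreement is what chance alone would produce.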
This is the first book on synthetic data for deep learning, and its breadth of coverage may render this book as the default reference on synthetic data for years to come. The book can also serve as an introduction to several other important subfields of machine learning that are seldom touched upon in other books. Machine learning as a discipline would not be possible without the inner workings of optimization at hand. The book includes the necessary sinews of optimization though the crux of the discussion centers on the increasingly popular tool for training deep learning models, namely synthetic data. It is expected that the field of synthetic data will undergo exponential growth in the near future. This book serves as a comprehensive survey of the field. In the simplest case, synthetic data refers to computer-generated graphics used to train computer vision models. There are many more facets of synthetic data to consider. In the section on basic computer vision, the book discusses fundamental computer vision problems, both low-level (e.g., optical flow estimation) and high-level (e.g., object detection and semantic segmentation), synthetic environments and datasets for outdoor and urban scenes (autonomous driving), indoor scenes (indoor navigation), aerial navigation, and simulation environments for robotics. Additionally, it touches upon applications of synthetic data outside computer vision (in neural programming, bioinformatics, NLP, and more). It also surveys the work on improving synthetic data development and alternative ways to produce it such as GANs. The book introduces and reviews several different approaches to synthetic data in various domains of machine learning, most notably the following fields: domain adaptation for making synthetic data more realistic and/or adapting the models to be trained on synthetic data and differential privacy for generating synthetic data with privacy guarantees. 
This discussion is accompanied by an introduction into generative adversarial networks (GAN) and an introduction to differential privacy.
Occupant satisfaction surveys are widely used in laboratory and field research studies of indoor environmental quality. Field studies pose several challenges because researchers usually have no control over the indoor environments experienced by building occupants, it is difficult to recruit and retain participants, and data collection methods can be cumbersome. With this in mind, we developed a survey platform that uses real-time feedback to send targeted occupant surveys (TOS) at specific indoor environmental conditions and stops sending survey requests when collected responses reach the maximum surveys required. We performed a pilot study of the TOS platform with occupants of a radiant heated and cooled building to target survey responses at 16 radiant slab surface (infrared) temperatures evenly distributed from 15 to 30 °C. We developed metrics and ideal datasets to compare the TOS platform against other occupant survey distribution methods. The results show that this novel method has a higher approximation to characteristics of an ideal dataset; 41% compared to 23%, 19%, and 12% of other datasets in previous field studies. Our TOS method minimizes the number of times occupants are surveyed and ensures a more complete and balanced dataset. This allows researchers to more efficiently and reliably collect subjective data for occupant satisfaction studies.
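The targeted-survey logic described above (send a request only at target conditions, and stop once a target has enough responses) can be sketched as follows. The 16 targets spaced evenly from 15 to 30 °C come from the abstract; the tolerance, the per-target cap, and all function names are assumptions made for illustration and do not describe the actual TOS platform.

```python
# 16 target slab temperatures, evenly spaced from 15 to 30 degrees C.
TARGETS = [15.0 + i for i in range(16)]

def nearest_target(temp, tolerance=0.5):
    """Return the closest target temperature, or None if none is within tolerance."""
    best = min(TARGETS, key=lambda t: abs(t - temp))
    return best if abs(best - temp) <= tolerance else None

def should_send_survey(temp, counts, max_per_target=2):
    """Send a survey only if the matching target still needs responses."""
    target = nearest_target(temp)
    if target is None:
        return False
    return counts.get(target, 0) < max_per_target

# Hypothetical response counts collected so far, keyed by target temperature.
counts = {22.0: 2, 23.0: 1}

print(should_send_survey(22.3, counts))  # target 22.0 is already full
print(should_send_survey(23.2, counts))  # target 23.0 still needs a response
print(should_send_survey(35.0, counts))  # outside every target's tolerance
```

Capping responses per target is what keeps the collected dataset close to balanced across conditions while limiting how often occupants are interrupted.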
Research studies provided evidence on the energy efficiency of integrating personal thermal comfort profiles into the control loop of Heating, Ventilation, and Air-Conditioning (HVAC) systems (i.e., comfort-driven control). However, some conflicting cases with increased energy consumption were also reported. Addressing the limited and focused nature of those demonstrations, in this study, we have presented a comprehensive assessment of the energy efficiency implications of comfort-driven control to (i) understand the impact of a wide range of contextual factors and their combinatorial effect and (ii) identify the operational conditions that benefit from personal comfort integration. In doing so, we have proposed an agent-based modeling framework, coupled with EnergyPlus simulations. We considered five potentially influential parameters and their combinatorial arrangements including occupants’ thermal comfort characteristics, diverse multi-occupancy scenarios, number of occupants in thermal zones, control strategies, and climate. We identified the most influencing factor to be the variations across occupants’ thermal comfort characteristics - reflected in probabilistic models of personal thermal comfort - followed by the number of occupants that share a thermal zone, and the control strategy in driving the collective setpoint in a zone. In thermal zones, shared by fewer than six occupants, we observed potentials for average energy efficiency gain in a range between −3.5% and 21.4% from comfort-driven control. Accounting for a wide range of personal comfort profiles and number of occupants, the average (±standard deviation) energy savings for a single zone and multiple zones were in ranges of [−3.7 ± 4.8%, 5.3 ± 5.6%] and [−3.1 ± 4.9%, 9.1 ± 5.1%], respectively. Across all multi-occupancy scenarios, a range between 0.0% and 96.0% of combinations resulted in energy savings.
Predicting building occupants’ thermal comfort via machine learning (ML) is a hot research topic. Many algorithms and data processing methods have been applied to predict thermal comfort indices in different contexts. But few studies have systematically investigated how different algorithms and data processing methods can influence the prediction accuracy. In this study, we first summarized the recent literature from perspectives of predicted comfort indices, algorithms applied, input features, data sources, sample size, training proportion, predicting accuracy, etc. Then, we applied nine ML algorithms and three data sampling methods to predict the 3-point and 7-point thermal sensation vote (TSV) in ASHRAE Comfort Database II. The results show that with an accuracy of 66.3% and 61.1% for 3-point and 7-point TSV respectively, Random Forest (RF) has the best performance among the tested algorithms. Compared to the Predicted Mean Vote (PMV) model, ML TSV models generally have higher accuracy in TSV prediction. Based on feature importance analysis, the air temperature, humidity, clothing, air velocity, age, and metabolic rate are the top six important features for TSV prediction. The RF algorithm can achieve 63.6% overall accuracy in TSV prediction with the top three features, which is only 2.6% lower than involving 12 input features. Further, this paper addressed other common considerations in ML comfort model establishment such as tuning hyperparameters, splitting of training and testing data, and encoding methods. We also provided Python and R programming codes and packages as appendixes, which can be a good reference for future studies.