Cohort comfort models - Using occupants’ similarity to predict personal thermal
preference with less data
Matias Quintana 1, Stefano Schiavon 2, Federico Tartarini 2, Joyce Kim 3, Clayton Miller1,∗
1College of Design and Engineering, National University of Singapore (NUS), Singapore
2Center for the Built Environment, University of California Berkeley, U.S.
3Department of Civil and Environmental Engineering, University of Waterloo, Canada
∗Corresponding Author: clayton@nus.edu.sg, +65 81602452
Abstract
We introduce Cohort Comfort Models, a new framework for predicting how new occupants would perceive their thermal environment.
Cohort Comfort Models leverage historical data collected from a sample population, who have some underlying preference similarity,
to predict thermal preference responses of new occupants. Our framework is capable of exploiting available background information
such as physical characteristics and one-time on-boarding surveys (satisfaction with life scale, highly sensitive person scale, the
Big Five personality traits) from the new occupant as well as physiological and environmental sensor measurements paired with
thermal preference responses. We evaluated our framework on two publicly available datasets containing longitudinal data from
55 people, comprising more than 6,000 individual thermal comfort surveys. We observed that a Cohort Comfort Model that uses only
background information yielded little change in thermal preference prediction performance, although it requires no historical data. On
the other hand, for half and one third of the occupants in each dataset, respectively, Cohort Comfort Models that use a reduced amount
of historical data from target occupants increased thermal preference prediction performance by 8 % and 5 % on average, and by up
to 36 % and 46 % for some occupants, when compared to general-purpose models trained on the whole population of occupants.
The framework is presented in a data- and site-agnostic manner, with its different components easily tailored to the data availability
of the occupants and the buildings. Cohort Comfort Models can be an important step towards personalization without the need to
develop a personalized model for each new occupant.
Keywords:
Thermal comfort, Clustering, Personalized environments, Cold start, Warm start, Recommender systems
Nomenclature
PMV Predicted Mean Vote
PCM Personal Comfort Model
PCS Personal Comfort System
CCM Cohort Comfort Model
RHRH Right-Here-Right-Now
HSPS Highly Sensitive Person Scale
SWLS Satisfaction With Life Scale
RF Random Forest
JS Jensen-Shannon
RBF Radial Basis Function
B5P Big Five Personality
1. Introduction
In the built environment, thermal comfort has a considerable influence on well-being, performance, and the overall satisfaction of occupants [1, 2, 3]. A recent analysis done by [4] based on approximately 90,000 occupant satisfaction survey responses found that roughly 40 % of occupants are dissatisfied with their thermal environment. Thermal comfort affects health [5, 6], office work performance [7, 8, 9, 2], learning performance [10, 11, 12], and well-being [13, 14].
The thermal comfort index Predicted Mean Vote (PMV) [15] is the most widely used in industry and research. Although Fanger [15] was aware of people's different thermal preferences, he developed the PMV model for a general population, since many buildings are not designed to provide personalized cooling and heating to occupants. Hence, the PMV aims to find the optimal indoor conditions to ensure that most of the occupants are comfortable. Another popular model, added to ASHRAE 55 in 2004, is the adaptive model [16]. This model is based on outside air temperature and is thus more suitable for naturally conditioned environments. Both models consider human factors, rather than using specific temperature set-points as fixed comfort thresholds, but they average the individual occupants' responses. A recent study on the ASHRAE Global Thermal Comfort Database II [17] has shown evidence of the low accuracy of PMV, i.e., 33 % [18], in predicting the thermal sensation of individual occupants. This, combined with the proliferation of Internet-of-Things (IoT) sensors and wearable devices, has pushed researchers to look into alternative solutions to predict how occupants perceive their thermal environment. Data-driven approaches leverage direct feedback from occupants paired with environmental and physiological data to develop personalized thermal comfort models for each person [19]. These models can then be used to optimize the operation of heating, ventilation, and air conditioning (HVAC) systems, or to quantify the quality of the thermal environment in buildings. Based on the framework of Personal Comfort Models (PCMs) [20], researchers can use environmental, physiological, and behavioural data to predict thermal sensation or preference. To investigate relationships between the above-mentioned variables, field data collection experiments offer a more realistic context of what a person experiences compared to most climate chamber studies [21]. Collecting field data poses several challenges, since people are required to complete surveys while performing day-to-day activities. This may disrupt their activities, and accurately monitoring and logging environmental variables with high spatial and temporal resolution is expensive and nontrivial. Moreover, even when there is enough available data to develop a PCM for every occupant, it is still problematic, from the control strategy perspective, to ensure that all occupants will find the thermal environment comfortable [22]. To make the most of the available data, another common approach is to develop a general-purpose model. This model is trained on the whole population of occupants so that the available data for training is maximized, at the expense of personalization. In open shared spaces where the HVAC system cannot be designed to tailor to individual needs, one option is to control environmental parameters to maximize the number of occupants that are comfortable and at the same time provide Personal Comfort Systems (PCS) to each occupant, e.g., heated and cooled chairs, desk fans, and foot warmers, so each of them can adjust the environment based on their needs and preferences.

Preprint submitted to Building and Environment, August 8, 2022
arXiv:2208.03078v1 [cs.LG] 5 Aug 2022
We seek to investigate whether we can leverage historical information gathered from building occupants to predict the thermal comfort preferences of a new person from whom little individual data has been collected. By individual data, we refer to longitudinal subjective responses to Right-Here-Right-Now (RHRH) surveys about thermal preference. In particular, our research hypothesis is that the thermal comfort preference of a building occupant can be estimated by analyzing their psychological and behavioral traits and comparing them to those of other occupants from whom historical subjective feedback was previously obtained. This way, we leverage the personalization aspect of PCMs with less data.
In previous studies, the predominant value of subjective responses alone was used to group occupants (i.e., the proportion of subjective feedback that is "prefer cooler", "no change", or "prefer warmer") [23], whereas in [24] a comfort profile was first determined based on some historical data, and then new occupants were compared against the available group profiles. In both works, once similarities are found between occupants, the group's data is used in the prediction model for new occupants. Environmental measurements (e.g., air temperature, relative humidity) coupled with physiological measurements (e.g., height and weight) were used as features in the model [25]. However, in both approaches, some amount of data is still required from the new occupant to correctly assign them to a respective group. In this work we introduce Cohort Comfort Models (CCMs), a framework that uses similarities between occupants to predict how new occupants with very little or no historical data would perceive their environment. More details are explained in Section 2. With this approach we try to answer the following questions:
1. Do CCMs increase the prediction performance for new occupants when compared to a data-driven general-purpose model, i.e., one in which all data are merged to develop a single comfort model?
2. Can we pinpoint which occupants benefited from using CCMs and, if so, which features contributed most to improving the prediction accuracy?
2. Cohort Comfort Models
2.1. Definition
Cohort Comfort Models (CCMs) are a new framework, built on Personal Comfort Models (PCMs), that leverages historical data collected from a sample population, who have some underlying preference similarity, to predict the thermal preference responses of new occupants. By doing this, CCMs leverage the power of both one-size-fits-all models and PCMs. The key characteristic of CCMs is that they are trained on a group of occupants who share features and subjective preferences with the new person we are trying to predict. The comparison of data-driven thermal comfort modeling alternatives against more established ones like PMV and the adaptive model is shown in Figure 1a. An overview of the proposed framework is shown in Figure 1b.
One of the main contributions of this work is to determine ways in which characteristics of occupants relevant to thermal comfort can be compared. Firstly, a longitudinal data collection experiment involving various building occupants is required, since a high number of participants allows the framework to find occupants with both similar and different characteristics. As for the required number of labeled data points per occupant (n in Figure 1b), previous relevant works suggest that 50 [26], 60 [19], 90 [27], and 200 [28] data points per occupant are needed for acceptable accuracy and stable prediction. However, the current landscape of thermal comfort longitudinal field data collection experiments is very limited in terms of public datasets [29], and only a handful of them contain a high number of data points (n) per participant. It should also be noted that the number of data points needed for each participant varies as a function of the range of environmental conditions a participant may experience. While a high n would be ideal, our proposed framework is not restricted to a threshold of available data. Our framework then utilizes the different data streams from the collected dataset (top gray segment, Figure 1b): on-boarding survey, sensor data, and RHRH surveys. The on-boarding survey refers to a one-time background survey that occupants fill out only once, before the data collection experiment starts. This survey must gather personal data (e.g., sex, height, weight) and should preferably also include psychological tests, such as the Big Five personality test [30], which has also been used for occupant similarity in the healthcare monitoring domain for well-being label prediction [31]. Sensor measurements encompass all environmental and physiological data collected when the participant completed the RHRH survey. Subjective thermal comfort comprises the questions asked in the RHRH survey. These three main data streams are used to define the CCMs. The CCMs are groups or clusters capable of reflecting preference patterns extrapolated from the relationship between sensor measurements and subjective thermal comfort, which are longitudinal time-series data (which we will refer to as warm start), and from on-boarding survey responses (which we will refer to as cold start).

(a) Diagram of thermal comfort modeling alternatives. The darker blue region highlights thermal comfort models at the population level (Predicted Mean Vote (PMV) and adaptive). The lighter blue area comprises personalized thermal comfort models trained on the person themselves (Personal Comfort Model (PCM)) or on a group of similar people (Cohort Comfort Model (CCM)).
(b) The cohort comfort modeling framework, built on Personal Comfort Models (PCMs), leverages historical data collected from a sample population, who have some underlying preference similarity, to predict thermal preference responses of new occupants. First, the comfort cohorts are created, as shown in the top segment, following a one-time data collection effort. Comfort cohorts are determined using data from the on-boarding surveys, sensor measurements, and subjective thermal comfort data. The middle dark gray segment shows a new occupant from whom only the on-boarding survey is acquired; this is used to assign the occupant to one of the previously created cohorts and is also known as "cold start". Finally, the bottom segment shows another new occupant from whom m data points are collected in order to be assigned to an existing cohort, with the possibility of using these m data points to fine-tune the existing cohort model; this is also known as "warm start" when n >> m. For both cold and warm start, the model corresponding to the assigned cohort is used to predict the occupant's thermal comfort, with the addition that the latter can use the already collected information from the occupant to fine-tune the cohort model.
Figure 1: Comparison of current thermal comfort modeling alternatives (1a) and overview of our Cohort Comfort Model (CCM) framework (1b).
Once the cohorts are determined and existing occupants are allocated into one, new occupants can also be assigned to a cohort. The cohort assignment is coupled with how the cohort was created and will vary depending on the data availability for the new occupant. When a new occupant has not been through a longitudinal data collection phase, i.e., only on-boarding survey data is available, they can still be assigned to a cohort using a "cold start" assignment procedure (middle dark gray section in Figure 1b). On the other hand, when a new occupant has been through a longitudinal data collection phase and a subset of RHRH data points is available (bottom light gray section in Figure 1b), the assignment to a cohort can be done using a "warm start" approach. It should be noted that, in the latter case, the new occupant can still be assigned to their respective cohort without leveraging the collected labeled data points, by using a "cold start" assignment procedure with the on-boarding survey responses. In addition, the availability of the RHRH data for this occupant makes it possible to fine-tune the respective cohort model to the new occupant if required. The details of cold start and warm start cohort creation and assignment are further explained in Section 2.3.
2.2. Current Related Work
2.2.1. Data-driven thermal comfort prediction
Data-driven thermal comfort models rely on a handful of measured features and often outperform the PMV and adaptive comfort models [20], specifically when trying to predict the thermal preferences of individuals [19]. Recent literature employs machine learning models to contextualize environmental data [32, 33, 34, 35] as well as thermoregulation from human skin through video [36] and physiological data [28, 37, 38]. In addition, some work has also analyzed the transition time it takes for an occupant to change their thermal preferences in the same indoor environment [39]. At the individual level, [19] showed that data-driven PCMs perform better than conventional models, such as the PMV, on a cohort of 38 occupants. These PCMs are tailored to an occupant's specific needs and can be considered one of the best approaches to achieve high-performing models for a particular individual. As for the predicted variable, the 3-point thermal preference scale is a well-established scale used in thermal comfort research as the label or target variable for data-driven thermal comfort prediction modeling [19]. Aggregated data-driven models, meaning models that use data from a large group of people to predict another person's thermal preference, face the challenge that different occupants have different thermal comfort preferences [40, 41, 1, 42].
2.2.2. Cold start prediction
Built environment research is moving towards using a small amount of data from new occupants in order to successfully predict their thermal comfort [24]. This scenario is well known when dealing with personalization, particularly in the field of recommender systems, where it is known as "cold start prediction". The cold start personalization problem refers to providing relevant information to the user, or a system, as fast and with as little effort as possible [43]. This is common practice in smart agents (e.g., Cortana, Google Assistant, Siri) and services (e.g., Amazon, Netflix), where users can be clustered based on limited knowledge about them [44, 43]. Alternatively, information about the users can be obtained by analysing historical data and inferring other relevant features [45, 46]. These two techniques can be used together to improve model prediction performance.
In the context of health monitoring and well-being label (i.e., mood, health, stress) prediction, [47] provided a framework to quantify similarity between people based on individual behavioural patterns and then cluster them, so that group models can be used for newcomers. [31] used the Big Five personality trait survey as the one-time general preference survey to determine groups in a data-driven manner and then used group-level information for personalization. Both works rely on physiological, environmental, and behavioural data paired with mood, health, and well-being labels from more than 200 different participants. However, to the best of our knowledge, this is the first work that explores these techniques in the context of thermal comfort prediction.
2.3. Cohort creation framework
2.3.1. Cold start
Personal characteristics captured during the one-time on-boarding survey include categorical, ordinal, and scalar data that can directly be used to group occupants. In addition, some standardized surveys allow their question scores to be aggregated, e.g., the Highly Sensitive Person Scale (HSPS) [48] and the Satisfaction With Life Scale (SWLS) [49].
The simplest approach to create cohorts is to use individual one-time survey responses. Cohorts are created in a straightforward manner by dividing occupants based on a simple statistic, e.g., the median, or by creating as many cohorts as there are unique response values. Multiple one-time survey responses can also be combined in a data-driven setup, where each survey response is a feature and a clustering algorithm automatically groups occupants into cohorts based on the patterns discovered in the question responses. We determined the optimal number of cohorts, k, using the Silhouette score [50], a clustering metric which ranges from 1 (best score) to -1 (worst score). The appropriate k is the one with the highest mean Silhouette score.
Once the cohorts exist, new occupants can be directly assigned to one. If the cohorts were created in a data-driven manner, the trained clustering algorithm can predict the corresponding cohort from the new occupant's one-time survey responses. On the other hand, new occupants can be assigned to cohorts created based on specific values by simply matching their respective one-time survey response: e.g., if two cohorts are created based on sex, a new person is assigned to either the Female or the Male cohort.
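As an illustration, the median-split cohort creation and matching-value assignment described above can be sketched in plain Python (a minimal sketch; the occupant IDs and HSPS scores below are hypothetical):

```python
from statistics import median

def create_median_cohorts(scores):
    """Split existing occupants into two cohorts by the median of a
    one-time survey score (e.g., an aggregated HSPS score)."""
    cut = median(scores.values())
    low = {o for o, s in scores.items() if s <= cut}
    high = {o for o, s in scores.items() if s > cut}
    return cut, low, high

def assign_cold_start(new_score, cut):
    """Assign a new occupant using only their on-boarding survey response."""
    return "low" if new_score <= cut else "high"

# Hypothetical HSPS scores for five existing occupants
hsps = {"p1": 12, "p2": 18, "p3": 25, "p4": 9, "p5": 21}
cut, low, high = create_median_cohorts(hsps)
cohort = assign_cold_start(16, cut)  # no RHRH data needed for the new occupant
```

A data-driven variant would replace the median cut with a clustering algorithm fitted on all survey features, as described above.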
2.3.2. Warm start
Responses distribution similarity: Previous work has looked into grouping occupants solely based on the trend of their thermal comfort responses [23]; in the case of thermal preference, occupants who predominantly have more "prefer cooler" responses are grouped together, and so on. Figure 3b illustrates the comparison of two occupants' thermal preference responses. Unlike the work done in [23], where grouping was based on the majority response, here the probability distribution of responses is first calculated for each occupant; in this example the responses are thermal preference votes, where the values are prefer warmer: -1, no change: 0, prefer cooler: 1. Then, the occupants' response distributions are compared using the Jensen-Shannon (JS) divergence, which is bounded in [0, 1] and symmetric, and which quantifies the similarity between two probability distributions. A lower distance value between occupants means the occupants are more similar.
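This comparison step can be sketched in plain Python: a minimal Jensen-Shannon divergence using base-2 logarithms (so the value is bounded in [0, 1]), applied to two hypothetical thermal preference distributions:

```python
from math import log2

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions over the
    same support (here: prefer warmer / no change / prefer cooler).
    Symmetric, and bounded in [0, 1] when using log base 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical distributions: [prefer warmer, no change, prefer cooler]
occupant_a = [0.1, 0.7, 0.2]
occupant_b = [0.1, 0.6, 0.3]
d = js_divergence(occupant_a, occupant_b)  # small value -> similar occupants
```

Note that SciPy's `jensenshannon` returns the square root of this divergence, so the two should not be mixed without squaring.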
Cross-model performance: We propose to group people into cohorts based on the relationship between their responses in the RHRH survey and the different available environmental and physiological sensor data. [47] suggest that these relationships can be captured by evaluating the model of one occupant, trained with their own data, using labeled sensor data from another participant; this is referred to as cross-model performance evaluation. In the context of thermal comfort, a PCM is a good representation of an occupant's individual comfort requirements based on data from their environment [20]. Hence, using PCMs for cross-model performance evaluation is an adequate similarity metric. If the PCM of occupant A has a high performance on data from occupant B, it suggests that both occupants have similarities in how they perceive their thermal environment. Figure 3a shows an example of evaluating the cross-model performance of two occupants. The data used for the PCM of each occupant comprise only sensor measurements, since any personal characteristic data collected via a one-time preference survey would have a repeating value for each data point. For a consistent comparison, each PCM must be trained on the respective occupant's data, and any hyper-parameters must be fine-tuned using the same evaluation metric. Based on the current literature on PCMs where the target variable comprises 3 ordinal values (i.e., multi-class), one of the most widely used models is the Random Forest (RF) [23, 19, 25] and the evaluation metric is the F1-micro score [23, 25], which can be seen as accuracy in the multi-class classification scenario. F1-micro scores range from 0 to 1, with 1 being better prediction accuracy, which in turn indicates that two occupants are more similar.
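The cross-model evaluation loop can be sketched as follows. To keep the sketch dependency-free, a trivial 1-nearest-neighbour classifier on a single hypothetical air temperature feature stands in for the Random Forest PCM; the train-on-A, score-on-B logic and the F1-micro (accuracy) metric mirror the description above:

```python
def train_pcm(X, y):
    """Stand-in PCM: 1-nearest-neighbour on one sensor feature.
    (The paper uses Random Forests; 1-NN keeps this sketch dependency-free.)"""
    def predict(x):
        i = min(range(len(X)), key=lambda j: abs(X[j] - x))
        return y[i]
    return predict

def f1_micro(y_true, y_pred):
    """For single-label multi-class problems, F1-micro equals accuracy."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def cross_model_score(model_a, X_b, y_b):
    """Evaluate occupant A's model on occupant B's labeled sensor data."""
    return f1_micro(y_b, [model_a(x) for x in X_b])

# Hypothetical data: air temperature (degC) -> preference (-1 warmer, 0 no change, 1 cooler)
X_a, y_a = [21.0, 24.0, 28.0], [-1, 0, 1]
X_b, y_b = [20.5, 23.5, 27.5], [-1, 0, 1]
similarity = cross_model_score(train_pcm(X_a, y_a), X_b, y_b)  # high -> similar
```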
Finally, both metrics of similarity between occupants, responses distribution similarity and cross-model performance, can be rearranged into what is called a similarity or affinity matrix. An affinity matrix is a square symmetric matrix where each element to compare, in this case each occupant, indexes both the rows and the columns. Each value of the matrix represents the similarity of its row and column; these values are bounded in [0, 1], where 1 means identical elements. Affinity matrices have been shown to be a useful way of comparing people when the similarity calculations can be transformed into the [0, 1] range [47]. Both similarity metrics already match the desired range, but the responses distribution metric has an inverse interpretation, where a value of 0 means identical occupants. Thus, these values are normalized using the Radial Basis Function (RBF) kernel (Equation 1), such that a value of 1 now translates to identical occupants.

RBF Kernel(z_{i,j}, c) = e^{-(z_{i,j} - c)^2 / (2 μ^2)}    (1)

where Z_i is the similarity metric (the Jensen-Shannon (JS) divergence in this scenario), μ is the standard deviation of Z_i, z_{i,j} ∈ Z_i is the JS divergence between occupants i and j, and c is the normalization center. In order to consider both similarity metrics, they can be added together with weight coefficients α and β (such that α + β = 1), respectively, to indicate their respective contributions to the final affinity matrix. At this point, spectral clustering [51] is applied to the affinity matrix and, as in Section 2.3.1, the adequate number of cohorts, k, is determined by the highest mean value of the Silhouette score. Figure 3 illustrates the calculation of both similarity metrics, their rearrangement into an affinity matrix, and the cohort calculation.
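A minimal sketch of Equation 1 and the weighted combination of the two similarity matrices (the 2x2 matrices below are hypothetical; in practice spectral clustering, e.g., scikit-learn's with a precomputed affinity, would then be run on the result):

```python
from math import exp
from statistics import pstdev

def rbf_kernel(z, c, mu):
    """Equation 1: maps a JS divergence z into [0, 1], with 1 at the center c."""
    return exp(-((z - c) ** 2) / (2 * mu ** 2))

def affinity(cross_perf, js_div, alpha=0.5, c=0.0):
    """Combine cross-model performance and RBF-normalized JS divergence
    into one affinity matrix, weighted by alpha and beta = 1 - alpha."""
    beta = 1 - alpha
    flat = [v for row in js_div for v in row]
    mu = pstdev(flat) or 1.0  # guard against a constant divergence matrix
    n = len(js_div)
    return [[alpha * cross_perf[i][j] + beta * rbf_kernel(js_div[i][j], c, mu)
             for j in range(n)] for i in range(n)]

# Hypothetical 2x2 matrices for two occupants
cross = [[1.0, 0.8], [0.8, 1.0]]  # F1-micro scores, already in [0, 1]
js = [[0.0, 0.3], [0.3, 0.0]]     # JS divergence: 0 means identical
A = affinity(cross, js)           # diagonal (self-similarity) equals 1.0
```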
New occupants can be assigned to a cohort only after they complete some RHRH surveys. However, as depicted in Figure 1b, collecting the same number of labeled data points, n, for each new occupant as was done for the existing occupants can be impractical and cost prohibitive. Hence, we propose a "warm start" approach that leverages a much smaller number of data points m from the new participant, such that m << n. The m data points from the new occupant are used to evaluate the performance of each previously created cohort model. The performance values are then compared and the occupant is assigned to the cohort with the highest overall performance score. The metric used in this last step must be the same as the metric used for the initial cross-model performance evaluation, which for the case of a multi-class thermal comfort variable is the F1-micro score. Figure 2b illustrates this warm start cohort assignment.
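The warm start assignment amounts to an argmax over cohort models; a minimal sketch with two hypothetical cohort models over air temperature:

```python
def f1_micro(y_true, y_pred):
    """Multi-class accuracy, equivalent to F1-micro for single-label problems."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def warm_start_assign(cohort_models, X_m, y_m):
    """Score the new occupant's m labeled points on every cohort model
    and return the name of the best-scoring cohort plus all scores."""
    scores = {name: f1_micro(y_m, [model(x) for x in X_m])
              for name, model in cohort_models.items()}
    return max(scores, key=scores.get), scores

# Hypothetical cohort models mapping air temperature (degC) to a preference class
cohorts = {
    "heat-sensitive": lambda t: 1 if t > 24 else 0,  # votes "prefer cooler" sooner
    "heat-tolerant": lambda t: 1 if t > 28 else 0,
}
X_m = [23.0, 25.0, 27.0, 29.0]  # m = 4 labeled points from the new occupant
y_m = [0, 1, 1, 1]
best, scores = warm_start_assign(cohorts, X_m, y_m)
```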
2.4. Evaluation
Experiment and modeling parameters are detailed in Table Appendix A.1. We used a 0.8 train-test split, participant-wise, on each dataset. Hence, 80 % of randomly selected occupants and their data are used to build the cohorts following the approaches mentioned in the previous section. The remaining 20 % of the occupants are used as the test set and assigned to a newly created cohort. This entire process was repeated 100 times for each cohort approach on each dataset used. Then, once a test occupant is assigned to a cohort, the cohort thermal preference model is evaluated on the new occupant's labeled data.

PCM models were computed for cross-model performance and serve as an upper-bound baseline. PCMs are trained with RF as the classification model following 5-fold cross-validation. We also used the entirety of the available data from the training participants to create an aggregate model where everyone belongs to one big cohort on which a general-purpose model is trained. This model serves as a lower-bound baseline. All processes are repeated 100 times in order to minimize biases.
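The participant-wise split can be sketched in plain Python (hypothetical occupant IDs; in the paper the procedure is repeated 100 times with different random draws):

```python
import random

def participant_split(occupants, train_frac=0.8, seed=0):
    """Split occupants (not individual responses) into train and test sets,
    so that test occupants are entirely unseen during cohort creation."""
    rng = random.Random(seed)
    shuffled = occupants[:]
    rng.shuffle(shuffled)
    n_train = round(train_frac * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

occupants = [f"p{i}" for i in range(20)]
train, test = participant_split(occupants)  # 16 train occupants, 4 test occupants
```

Splitting by participant rather than by response is what makes each test occupant a genuinely "new" occupant for the cohort models.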
(a) Cold start. Top: the occupant is directly assigned to the cohort with the matching value of the on-boarding survey used for cohort creation. Bottom: the clustering algorithm used for cohort creation is applied to the participant's on-boarding survey responses to predict the corresponding cohort.
(b) Warm start: data points collected from the new participant are evaluated on each existing cohort model. The participant is then assigned to the cohort on which they had the best averaged performance based on the F1-micro score.
Figure 2: Participant assignment to the existing cohorts depending on whether the cohorts were created with cold start (2a) or warm start (2b). Only 2 cohorts are shown for illustration purposes.

(a) Cross-model performance evaluation calculated with the F1-micro score, bounded in [0, 1], using each occupant's PCM with another occupant's data. The higher the F1-micro score, the more similar two occupants are.
(b) Responses distribution similarity based on the Jensen-Shannon divergence, bounded in [0, 1]. After applying a Radial Basis Function (RBF), the higher the value, the more similar two occupants are.
Figure 3: Warm start cohort creation approach. 3a calculates the cross-model performance and 3b calculates the distance between occupants' response distributions. Each approach is weighted before summation into an affinity matrix, with α and β respectively, where α + β = 1. Spectral clustering is then run on the affinity matrix for multiple numbers of cohorts (e.g., k = [2, 10]). The best k is the one with the highest Silhouette score.

Additionally, the prediction results of PMV at the personal level are included. The Python package pythermalcomfort [52] is used to compute the PMV as per the calculation methods in ISO 7730 [53], and the package Scikit-learn: Machine Learning in Python [54] is used for all the remaining data-driven calculations. Similar to how it was done in [19], we used the longitudinal field data (i.e., air temperature, humidity, self-reported clothing insulation) and static values (i.e., air velocity = 0.1 m/s, metabolic rate = 1.1 met) for the PMV calculation unless the values are present in the dataset. To compare the results on the same 3-point scale, the PMV is converted into thermal preference classes based on the following assumptions: |PMV| ≤ 1.5 is "no change"; PMV > 1.5 is "prefer cooler"; and PMV < -1.5 is "prefer warmer". Nevertheless, we acknowledge that with this many assumptions and simplifications on the values themselves, the accuracy of the PMV calculation will be low. We share our code base for the reproducibility of our analysis in a public GitHub repository: https://github.com/buds-lab/ccm.
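The PMV-to-preference conversion stated above is a simple thresholding, sketched below with the paper's ±1.5 cut-offs (the input PMV values are hypothetical; computing PMV itself with pythermalcomfort is not shown):

```python
def pmv_to_preference(pmv):
    """Map a PMV value onto the 3-point thermal preference scale
    using the thresholds assumed in this work."""
    if pmv > 1.5:
        return "prefer cooler"
    if pmv < -1.5:
        return "prefer warmer"
    return "no change"

# Hypothetical PMV values, e.g., as returned by pythermalcomfort
labels = [pmv_to_preference(v) for v in (-2.1, 0.3, 1.8)]
```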
2.5. Datasets
To assess the proposed framework, we validated our methodology on two publicly available longitudinal thermal comfort datasets. One dataset, named Dorn [55], was collected in Singapore, where 20 occupants were asked to complete approximately 1,000 RHRH surveys over a period of 6 months. Participants were asked to wear a Fitbit smartwatch with the Cozie [56] application (https://cozie.app/) installed. This application has been used to collect RHRH surveys [23] and is publicly available for both Android (https://github.com/cozie-app/cozie) and Apple (https://github.com/cozie-app/cozie-apple) platforms. The second dataset we used was collected from 37 participants over a period of 12 weeks in an office building located in Redwood City, California, US [57]. Participants were asked to complete three RHRH online surveys per day while using a Personal Comfort System (PCS) in the form of an instrumented chair. In this work we refer to the former dataset as "Dorn" and to the latter as the "SMC" dataset. Both surveys comprised the following overlapping questions: thermal preference (using a 3-point scale), clothing, and activity. Environmental measurements (e.g., dry-bulb air temperature, relative humidity) were taken using data loggers installed in near proximity (i.e., within a 5-m radius) to where the participant completed the survey. Additionally, [55] used Fitbit Versa smartwatches to measure and log physiological data, and iButtons for near-body temperature and humidity. Table 1 shows a brief overview of both datasets; more details can be found in their respective publications.

Dataset     # Occupants (Sex)   Age range   # Responses per participant   Duration
Dorn [55]   20 (10 M, 10 F)     20 to 55    872 (min), 1332 (max)         6 months
SMC [57]    37 (17 M, 20 F)     25 to 37    33 (min), 218 (max)           12 weeks

Table 1: Overview of the two datasets used in this work.
Personal information about the participants (e.g., sex, height, weight) was obtained by asking them to complete an on-boarding survey when joining the experiment. Additionally, the Dorn dataset included three standardized surveys: the Highly Sensitive Person Scale (HSPS) [48], the Satisfaction With Life Scale (SWLS) [49], and a short version of the Big Five Personality (B5P) test known as the Ten-Item Personality Inventory [30]. These surveys were used to investigate potential relationships between occupants' survey responses and their thermal preferences.
2.6. Data pre-processing

We time-aligned sensor measurements with the RHRN responses. The thermal preference responses provided by the occupants are used as the ground-truth labels for each respective dataset. Features were selected based on the latest efforts in data-driven personal thermal comfort prediction [58, 23, 19, 28, 59] and on what the original dataset papers suggest [57, 55]. A summary of the features used for model training in the present study is shown in Table 2: Table 2a for the Dorn [55] dataset and Table 2b for the SMC [57] dataset.
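The time-alignment step described above can be sketched with pandas, pairing each survey response with the most recent sensor reading; the column names and timestamps below are illustrative assumptions, not the datasets' actual schema.

```python
import pandas as pd

# Hypothetical column names and timestamps; the datasets use their own schemas.
surveys = pd.DataFrame({
    "time": pd.to_datetime(["2020-01-01 09:03", "2020-01-01 11:47"]),
    "occupant": ["p01", "p01"],
    "thermal_preference": ["warmer", "no change"],
})
sensors = pd.DataFrame({
    "time": pd.to_datetime(["2020-01-01 09:00", "2020-01-01 11:45"]),
    "air_temperature": [23.1, 24.6],
    "relative_humidity": [62.0, 58.5],
})

# Attach the most recent sensor reading (here within 10 minutes) to each
# survey response; unmatched responses would get NaN and can be dropped.
aligned = pd.merge_asof(
    surveys.sort_values("time"),
    sensors.sort_values("time"),
    on="time",
    direction="backward",
    tolerance=pd.Timedelta("10min"),
)
```

Using `direction="backward"` with a tolerance avoids pairing a response with a reading taken after it or too long before it.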
To obtain comparable results between occupants, we used a fixed number of responses per occupant in each dataset. While this approach does not address the inherent class-balance issue in thermal preference datasets, i.e., having a disproportionate number of data points for each thermal preference class [60], we decided to prioritize the quantity of available data points per participant. The Dorn dataset contains a minimum of 872 responses per participant (Table 1); however, after removing surveys completed outdoors, in non-transition periods, and while participants were exercising, the minimum number of responses per occupant was 231. [19] found that when the dataset comes from an environment shared by all occupants during the experiment, at least 60 responses are required for a stable thermal preference prediction. These results were obtained on the SMC [57] dataset; hence, we opted to use the first 60 responses from each occupant, as this threshold also covers most of its participants. This minimum-response threshold resulted in 35 participants (18 females, 16 males). Figure Appendix A.1 in the Appendix shows the distribution of the filtered thermal preference responses for each participant in both datasets.
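The fixed-responses-per-occupant selection can be sketched as below; `n` would be 231 for the filtered Dorn dataset and 60 for SMC, and the column names are assumptions for illustration.

```python
import pandas as pd

# Illustrative sketch (column names are assumptions): keep the first n
# chronological responses per occupant, dropping occupants with fewer than n.
def first_n_responses(df: pd.DataFrame, n: int) -> pd.DataFrame:
    df = df.sort_values("time")
    counts = df.groupby("occupant")["time"].transform("size")
    return df[counts >= n].groupby("occupant", group_keys=False).head(n)

toy = pd.DataFrame({
    "occupant": ["a", "a", "a", "b"],
    "time": pd.to_datetime(["2020-01-03", "2020-01-01", "2020-01-02", "2020-01-01"]),
})
subset = first_n_responses(toy, n=2)   # occupant "b" has too few responses
```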
2.7. Cohort creation

Following the framework presented in Section 2.3, we created two different sets of cohorts, one for each dataset. Table 3 shows the cold start cohorts (upper rows) and warm start cohorts (lower rows), which are constructed based on the data available in each dataset. As mentioned in Section 2.3.1, the cold start cohorts are created using only one-time survey responses. In the Dorn dataset, the on-boarding surveys were also used to create other sets of cohorts. The three surveys (HSPS, SWLS, and B5P) were used together (Surveys in Table 3), where each survey score is treated as a feature and the number of cohorts is determined in a data-driven manner via Spectral clustering, as detailed in Section 2.3.1. Moreover, each survey was also used individually. The HSPS and SWLS surveys, named Sensitive and Life Satisfaction respectively in Table 3, have an aggregated numerical score which was used to create two cohorts based on the median value of the occupants' scores. All participants were tested, but only participants with extreme values on these survey responses, i.e., with a low or high aggregated score, are reported, since their results are better. Occupants with an aggregated score between the 25th and the 75th percentiles were filtered out, which meant only 50 % of the total occupants in the Dorn dataset were used. Since the responses of the B5P cannot be aggregated into one final score, the number of cohorts based on this survey was determined in a data-driven manner using each personality score as a feature. This cohort approach is named Personality in Table 3.
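A minimal sketch of the two cold-start cohort mechanisms described above: a median split on an aggregated survey score, and Spectral clustering on multi-dimensional survey features. All scores below are synthetic placeholders, not values from the datasets.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)

# (1) Median split on an aggregated survey score (e.g., HSPS or SWLS):
#     occupants inside the 25th-75th percentile band are filtered out,
#     the rest are split into a "low" and a "high" cohort.
scores = rng.uniform(5, 35, size=20)              # synthetic, one score per occupant
q25, q75 = np.percentile(scores, [25, 75])
extreme = (scores < q25) | (scores > q75)         # occupants kept for this approach
cohort_median = (scores > np.median(scores)).astype(int)   # 0 = low, 1 = high

# (2) Data-driven cohorts from multi-dimensional survey features
#     (e.g., the five B5P trait scores), with the number of clusters k
#     chosen separately via the Silhouette score.
traits = rng.uniform(1, 7, size=(20, 5))          # synthetic trait scores
labels = SpectralClustering(n_clusters=2, random_state=0).fit_predict(traits)
```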
Finally, warm start cohorts are created following the two approaches described in Section 2.3.2: using the response distribution similarity and cross-model performance combined, and cross-model performance alone (Dist-Cross and Cross respectively in Table 3). Table Appendix A.1 lists the different experimental and modeling parameters. The best performing model, in terms of average expected F1-micro score, is found using grid search over the list of hyperparameters. In the Dorn dataset, we determined that the optimal number of cohorts was 2 for all data-driven approaches. On the other hand, values of k equal to 2 and 3 were chosen for the Dist-Cross and Cross approaches in the SMC dataset, respectively. More details on different numbers of cohorts k are shown in Figure Appendix A.2 in the Appendix.

In order to corroborate the usefulness of each cohort approach, we also purposely assigned each occupant to the opposite cluster (if k = 2 and the cohorts were created by cold start approaches) or to the worst performing cohort (warm start approaches). This serves as an ablation analysis for each cohort approach and, by extension, for the data used for its creation and assignment.
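A sketch of the cross-model (warm start) assignment and its worst-cohort ablation, on synthetic data: each cohort's model scores the new occupant's few labeled points, and the occupant joins the best-scoring (or, for the ablation, the worst-scoring) cohort. The one-feature toy models stand in for the actual cohort models; as in the paper, Random Forests are used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Two synthetic cohorts whose preference labels flip at different
# temperatures; a Random Forest is fit per cohort.
cohort_models = []
for threshold in (22.0, 25.0):
    X = rng.normal(threshold, 1.0, size=(100, 1))   # e.g., air temperature
    y = (X[:, 0] > threshold).astype(int)           # toy preference label
    cohort_models.append(
        RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y))

def assign_cohort(models, X_new, y_new, worst=False):
    """Index of the cohort model that best (or worst) predicts the new
    occupant's few labeled data points."""
    scores = [m.score(X_new, y_new) for m in models]
    return int(np.argmin(scores) if worst else np.argmax(scores))

X_new = np.array([[21.0], [23.5]])                  # the occupant's labeled points
y_new = np.array([0, 1])
best = assign_cohort(cohort_models, X_new, y_new)   # correct assignment
```

Passing `worst=True` reproduces the ablation, where the occupant is deliberately given the least suitable cohort model.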
3. Results

3.1. Overall prediction performance

The performance results, in terms of F1-micro score, of all cohort approaches from Table 3 on new occupants from their respective datasets are summarized in Figure 4. Result sets prefixed with a tilde ("∼") denote approaches where test users were purposely assigned to an incorrect cluster, as detailed in Section 2.7.

As mentioned in Section 2.3.2, warm start cohort approaches like Dist-Cross and Cross require new participants to complete a few RHRN surveys before they can be assigned to a cohort. Different numbers of labeled data points (RHRN responses) were tested, i.e., 1, 3, 5, and 7, and all of them fall within the baselines. Figure Appendix A.3 shows the results for the Dorn dataset; similar results were obtained with the SMC dataset. For the purpose of evaluating the minimum required number of labeled data points, all results in Figure 4 used one single, randomly selected data point from each test occupant for assignment. Compared to using more labeled data points, using a single data point still led to improved results, on average 10 % better than the general-purpose model.
Feature                  Source
Air temperature          Fixed sensor (indoor)
Relative humidity        Fixed sensor (indoor)
Near-body temperature    Wearable sensor
Heart rate               Wearable sensor
Clothing level           RHRN survey
Sex                      One-time survey
Height                   One-time survey
Weight                   One-time survey

(a) Subset of features chosen for data-driven modeling in the Dorn dataset based on the original work [55]. Near-body (wrist-level) temperature measurements were found to contain more information than skin temperature [28, 59].
Feature                                                                 Source
Dry-bulb air temperature                                                Fixed sensor (indoor)
Operative temperature                                                   Fixed sensor (indoor)
Relative humidity                                                       Fixed sensor (indoor)
Slope in air temperature                                                Fixed sensor (indoor)
Control location                                                        PCS control behaviour
Control intensity                                                       PCS control behaviour
Control frequency in the past x (x = 1 h, 4 h, 1 d, 1 wk)               PCS control behaviour
Occupancy status                                                        PCS control behaviour
Occupancy frequency in the past x (x = 1 h, 4 h, 1 d, 1 wk)             PCS control behaviour
Ratio of control duration over occupancy duration
  in the past x (x = 1 h, 4 h, 1 d, 1 wk)                               PCS control behaviour
Outdoor air temperature                                                 Outdoor environment
Sky cover                                                               Outdoor environment
Weighted mean monthly temperature                                       Outdoor environment
Precipitation                                                           Outdoor environment
Clothing level                                                          RHRN survey
Hour of the day                                                         RHRN survey
Day of the week                                                         RHRN survey
Sex                                                                     One-time survey
Height                                                                  One-time survey
Weight                                                                  One-time survey

(b) Subset of features recommended for data-driven modeling [19] in the SMC dataset [57].
Table 2: Features chosen for data-driven modeling: 2a Dorn [55], 2b SMC [57] dataset. Features obtained through the one-time survey are dropped for PCMs due to their constant value within each participant.
Type         Cohort approach                                Data used
Cold start   Sex†                                           Self-reported value
             Surveys                                        All on-boarding surveys (HSPS, SWLS, B5P)
             Sensitive                                      HSPS scores
             Life Satisfaction                              SWLS scores
             Personality                                    B5P scores
             Agreeableness                                  Best performing trait from B5P
Warm start   Responses distribution and
             cross-model performance (Dist-Cross)†          Thermal preference responses and occupants' PCMs
             Cross-model performance (Cross)†               Occupants' PCMs

Table 3: Cohort approaches chosen. The upper rows are cold start cohorts and the lower rows are warm start cohorts. † denotes cohort approaches used on both datasets.
Figure 4: Performance results in F1-micro score over 100 iterations for each cohort approach on the Dorn and SMC datasets (y-axis) and their baselines, after correct assignment (dark gray) and incorrect assignment (light gray). Approaches with occupants incorrectly assigned to cohorts have the tilde symbol (∼) before their name. Cold start approaches are highlighted by a light-blue region and warm start approaches by a yellow region. The general-purpose model and PCM median values are highlighted with red dashed lines on their respective boxplots and filled with red. Additionally, PMV results, following the calculation in Section 2.4, are shown. Warm start approaches (Dist-Cross and Cross) used only one randomly selected data point. Dorn dataset: the General-purpose boxplot contains 20 values; Sensitive and Life Satisfaction contain 200 values (2 test occupants at a time × 100 iterations), and all remaining boxplots have 400 values (4 test occupants at a time × 100 iterations). SMC dataset: the PCM boxplot contains 35 values and all remaining boxplots have 700 values (7 test occupants at a time × 100 iterations). The cohort approaches are taken from Table 3.
3.1.1. Under-performing cohorts

Cohort approaches such as Sex, Surveys, Sensitivity, and Big 5 Personalities have a median performance below the General-purpose model in the Dorn dataset (upper plot in Figure 4). Although sex has been found to have some influence on thermal preference, dividing occupants based on it does not provide significant changes, on average, in thermal preference prediction when compared to a general-purpose model on either dataset. This is supported by the similar performance of occupants assigned to their same-sex cohort and to the opposite-sex cohort (Sex and ∼Sex in Figure 4). The lower performance of two of the Surveys approach components (Sensitivity and Big 5 Personalities) when used individually might be the reason for this approach's lower-than-General-purpose performance. We hypothesize that the HSPS survey, used for Sensitivity, was not able to capture enough information due to the geographical location and the participants' background in the data collection experiment. Participants from Singapore (Dorn dataset) may have provided inaccurate responses to this survey, since they mostly experience a hot climate rather than a full range of cold and hot climates, which could translate into the low performance of this cohort approach. Each personality trait from the B5P survey was also tested individually, and Agreeableness performed above the General-purpose value (Figure Appendix A.3). This suggests that looking at all personality traits together (Big 5 Personality) is not as beneficial as looking at each personality trait individually.
Moreover, it is possible that the small sample size of participants in the Dorn dataset provided little variability in these survey responses, or that the three on-boarding surveys chosen, HSPS, SWLS, and B5P, are simply not relevant for thermal preference. Overall, the efficacy and usability of these cold start approaches remain an open question.
3.1.2. Above-baseline performing cohorts

Life Satisfaction is the only survey that by itself has an above-General-purpose performance. This indicates that treating each survey individually instead of combining them (Surveys) is a better approach for cohort prediction performance. In fact, only the SWLS and B5P surveys lead to cohort approaches with performances above the general-purpose models. Since both of these cohort approaches are related to satisfaction and optimism, we hypothesize that occupants within the resulting cohorts are less prone to preference changes due to small variations, meaning they have a wider acceptability range.

On the other hand, the cohort approaches Dist-Cross and Cross achieved the highest median performance on both datasets (top and bottom plots in Figure 4 for the Dorn and SMC dataset, respectively). While a single labeled data point used for cohort assignment may be prone to bias, the consistent F1-micro scores after 100 iterations still positioned both cohort approaches as the top performing ones, with a much clearer difference in the Dorn dataset than in the SMC dataset. Moreover, when each of these cohort approaches is compared to its respective worst-assigned cohort, the median performance is reduced by almost half for Dist-Cross (from 0.61 to 0.32) and Cross (from 0.63 to 0.32) in the Dorn dataset, and is reduced by a factor of four for Dist-Cross (from 0.72 to 0.17) and Cross (from 0.72 to 0.15) in the SMC dataset. The median performance for Life Satisfaction is also reduced when occupants are incorrectly assigned (from 0.56 to 0.51). The reduced number of occupants used for this approach, 10 instead of 20, might be one of the reasons why the difference in performance is less pronounced than for Dist-Cross and Cross. Nevertheless, the Agreeableness cohort approach exhibits a different behaviour. Among all 20 occupants in the Dorn dataset, the lowest score obtained was 4 out of 7, which indicates the occupants rank highly on this personality trait. As previously mentioned, this personality trait might influence the acceptable thermal preference range of an occupant and, with only two cohorts created, both cohorts could improve an occupant's thermal preference prediction. These results suggest that when the cohort modeling framework is followed correctly, it leads to an overall increase of median F1-micro performance. This increase is higher with warm start approaches (Dist-Cross and Cross) than with cold start approaches (Life Satisfaction and Agreeableness).
3.2. Occupant-specific improvement

Figure 5 shows a scatter plot of the average percentage changes in occupant-specific F1-micro score of the above-baseline performing cohort approaches mentioned in the previous subsection, for warm start and cold start cohort types on both datasets.

For each occupant, a positive average percentage change in F1-micro score is desirable, since it indicates an increase in thermal preference prediction performance when using a cohort model instead of a general-purpose model. Conversely, a negative value indicates that the cohort model fails to provide useful personalization and that a generic model, which dismisses occupant similarity entirely, performs better on average since it was trained with more data. Overall, around half of the occupant population in the Dorn dataset and one third of the occupant population in the SMC dataset saw a boost in prediction accuracy, in terms of F1-micro score, under the cohort modeling framework in their respective cohort approach. For the remaining occupants there is little change in performance. In Figure 4 the Life Satisfaction cohort scored marginally better than General-purpose (0.55 vs. 0.52) and the Agreeableness approach scored a slightly higher median performance (0.58). This is highlighted in Figure 5, where in Dorn:Life Satisfaction 40 % of occupants saw a performance boost of less than 1 %, whereas in Dorn:Agreeableness 55 % of occupants saw a performance boost of around 2 %.
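The occupant-specific average percentage change reported in this subsection can be computed as follows; the F1-micro scores below are illustrative placeholders, not values from the datasets.

```python
import numpy as np

# Rows are iterations, columns are occupants; scores are placeholders.
f1_general = np.array([[0.50, 0.60],
                       [0.54, 0.58]])
f1_cohort = np.array([[0.55, 0.57],
                      [0.57, 0.55]])

pct_change = 100 * (f1_cohort - f1_general) / f1_general
avg_change = pct_change.mean(axis=0)   # average per occupant across iterations
# avg_change > 0: the occupant is better off with the cohort model.
```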
The Dist-Cross and Cross approaches show a performance boost of up to 36 % and 46 %, and an average of 8 % and 5 %, on the Dorn and SMC datasets respectively. Overall, the number of occupants that benefited from these approaches is around half (45 % and 60 %) and one third (35 % and 32 %), respectively, on each dataset. The number of occupants that are better off using Dist-Cross or Cross differs between the datasets. The Dist-Cross approach benefits more occupants than the Cross approach on the SMC dataset, where the data was collected in an office building across multiple shared spaces on two different floors. Under these circumstances, personal thermal preferences captured by the feedback distribution (the Dist component of Dist-Cross) contribute to more occupants being better off with Dist-Cross than without this information (the Cross approach). On the other hand, in the Dorn dataset all occupants provided their thermal preference responses in their own work environments, with no guarantee of similar conditions and exposures across all of the spaces. The occupants' feedback distribution (the Dist component of Dist-Cross) was not able to capture enough preference characteristics that could be quantified and compared among the occupants, resulting in more occupants being better off with the Cross approach alone. Nevertheless, it is also important to highlight that some occupants are actually worse off regardless of the cohort approach used, and their average percentage performance is reduced. The entire distribution of percentage changes across all 100 iterations for these approaches and for each occupant can be found in Figure Appendix A.4 and Figure Appendix A.5 for the Dorn and SMC datasets respectively.

Finally, metadata about the participants who are better off and worse off is summarized in Table Appendix A.2. Overall, there are no major differences in sex or in the mean and standard deviation of height and weight between the two groups for most cohort approaches. The Life Satisfaction cohort approach on the Dorn dataset and the Cross approach on the SMC dataset show a slight majority of female and male better-off occupants, respectively. Nevertheless, due to the small sample size of occupants in both datasets, it is not possible to generalize these findings.

Figure 5: Scatter plot of occupant-specific percentage changes in F1-micro score from the general-purpose performance (General-purpose) to the respective cohort approach performance. Each dot represents the average value across all 100 iterations for one occupant. A positive value means the occupant benefited from the cohort approach and a negative one means they are worse off. For the cohort approaches in their respective datasets (y-axis), 45 %, 60 %, 40 %, 55 %, 35 %, and 32 % of the occupants saw an increase in their thermal preference prediction performance by relying on their peers' data, respectively, although the actual benefit is small.
4. Discussion

We showed that CCMs can improve the thermal preference prediction for a new occupant by using data from other occupants who are grouped based on preference similarity. We determined that: i) warm start cohort approaches outperform cold start approaches at the cost of requiring labeled data points from each new occupant; ii) while around half and one third of occupants, in the Dorn and SMC dataset respectively, benefited from being assigned to a cohort via warm start and using its cohort model, the remaining occupants saw very little variation in their performance, whether better off or worse off; and iii) the applicability of cohorts depends on the availability of data from the occupants as well as on the occupants' willingness to provide it, as we were able to investigate more cohort approaches on the dataset with more sensed modalities and more one-time on-boarding surveys, i.e., the Dorn dataset.
4.1. Warm vs cold start

Warm start cohorts are bottom-up approaches that focus on granular data with a direct connection to thermal preference, i.e., relying on feedback distributions and cross-model performance, whereas cold start cohorts are a top-down approach that relies on a set of questions used to showcase the separability of personal thermal preferences. It can be argued that the latter approaches were expected to underperform since, by definition, information regarding the actual thermal preference of occupants, e.g., the labeled data points, was not used to create the cohorts. Nevertheless, recent literature in personalised healthcare monitoring [47, 31] highlights the usability of these sources of information as primary ways to leverage other occupants' data. We found that the SWLS survey, or Life Satisfaction, and the personality trait of Agreeableness from the B5P survey lead to cohorts where thermal preferences are shared, leading to slightly improved thermal preference prediction when compared to general-purpose models. One advantage of using this information is that it allows a cold start for a new occupant.
However, the notion of "no historical data" is not a hard threshold: depending on the amount of data available, some work still considers it a cold start when very little historical data is used [46], i.e., fewer than 10 data points [47]. While the work in [24] explores a Bayesian approach to cluster occupants based on historical measured data and then uses the clusters' thermal comfort models with as few as 8 data points, which also falls under the cold start definition used in [47], to the best of our knowledge our proposed framework is the first that not only looks at measured data streams to find occupant similarity but also includes the aforementioned mechanisms from other fields. Our framework allows for a full cold start prediction, and a warm start prediction with as little as one single historical data point. Both types of cohort approaches have shown that while only some occupants benefited from this framework, the majority experience very little variability in performance, whether better off or worse off. Nevertheless, this also shows that our framework allows for prediction performance just as good as the general-purpose models but with at most half the training data, in the case of only two cohorts.
4.2. From "how much can you benefit" to "who can benefit"

The warm start cohorts produced an overall average occupant-level increase of up to 8 % (Dorn:Dist-Cross) and 5 % (SMC:Dist-Cross) on the Dorn and SMC datasets respectively, among the occupants who were indeed better off. These numbers change to 4 % (Dorn:Dist-Cross) and 2 % (SMC:Dist-Cross) when all participants are considered in the calculation. However, on both datasets there are occupants who could significantly benefit from these approaches, with performance increases as high as 45 %, but also occupants who saw a decrease in their prediction performance. While this scenario is not uncommon when dealing with group-level occupant personalization [31], cold start approaches like those in our framework offer the possibility to identify these better-off and worse-off occupants even before an occupant starts occupying the building.

From a facility manager's perspective, being able to identify whether an occupant will be worse off, or better off, in a specific thermal zone with another group of occupants can help prioritize certain occupants, avoid complaints, and quantify the need for PCS. The
trend of hot-desking is not well received among occupants [61] and can have a negative effect on work engagement, job satisfaction, and fatigue [62]. Thus, a more suitable approach would be to engage with specific occupants and adapt to their needs. On the other hand, although the results here are limited to two datasets, it is possible that some occupants are inherently more difficult to predict than others, and pinpointing these occupants is equally important. Be that as it may, approaches that rely on one-time surveys, such as the cold start cohort approaches, raise data privacy concerns. Facility managers and building practitioners must take the necessary precautions when dealing with such data from their occupants.
4.3. Limitations and future work

The evaluation of our proposed framework based on one-time on-boarding surveys and sensor measurements as features has some limitations. First, the number of occupants in each dataset is 10 times smaller than in related work in the healthcare domain [31, 47] and in related thermal comfort cluster prediction [24]. Additionally, one of the two datasets used (the Dorn dataset) lacks seasonal variability because it was collected in Singapore (tropical climate), whereas the other dataset (SMC), which included both seasons and a more diverse group of occupants, lacks the one-time surveys required by the framework. While multiple datasets can be combined into one much bigger dataset, like the ASHRAE Thermal Comfort Database II [17], the methodologies and available measurements need to overlap, which is not the case for recent field data collection experiments.
Second, the models and features used were not chosen through an exhaustive feature and model selection pipeline. One could argue that features extracted from the physiological time-series measurements of each occupant, matched with their thermal preference responses, could contain useful information which could in turn be used for comparison across occupants and subsequent cohort creation. While extracting those temporal attributes is out of the scope of the present work, the modularity of the framework and its data-agnostic design make the inclusion of such newly learned or discovered features, or any other new modalities, plausible.
Third, in our evaluations we considered that once an occupant is assigned to a cohort, they retain their membership for the duration of the evaluation experiment. It is likely that, influenced by external factors, an occupant might be better off, in terms of thermal preference prediction, if assigned to a different cohort in certain settings. Future work can build on this cohort framework and insert adaptive mechanisms that allow occupants to change cohorts over time. In particular, when coupled with the energy repercussions of using cohort models to predict thermal preference, an entire feedback loop from the occupant to the system could enable this online incremental learning.
Aware of the context-dependent nature of our framework and its components, we do not claim generalisable results. Ideally, a dataset with a large enough group of people, where all occupants undertake the same set of one-time on-boarding surveys with overlapping sensor measurements, could provide more generalisable insights. Also, if multiple datasets from different contexts are used, features that represent the location, climate, and building type could be considered as part of the modeling portion of the framework. However, the presented framework is modular enough to easily adapt to new sets of features, either as part of the cohort models themselves or as part of how the cohorts are created. We plan to investigate the results on more diverse datasets with occupants from different backgrounds and populations.
5. Conclusion

Cohort Comfort Models (CCMs) are a new approach to thermal comfort modeling that builds on Personal Comfort Models (PCMs). Cohort models leverage previously collected data from other occupants to predict a new occupant's thermal preference based on similarities with those occupants. Evaluation of this method on two different datasets shows that the proposed framework for cohort creation and occupant assignment leads to half and one third of the occupants in each dataset experiencing an average increase of 8 % and 5 %, respectively, in their thermal comfort prediction performance, with as little as a single labeled data point required from the new occupant, when compared to general-purpose data-driven models.

Unlike related literature focused on occupant segmentation for better thermal preference prediction, our proposed framework offers the ability to identify the occupants who will be better off with a given cohort approach, and those who might not, based on the occupants' background information or, at most, a single labeled data point collected at the building premises.
We presented the framework in a data- and site-agnostic manner and described its different implementations depending on the information available, with the potential to scale to various data streams. Cohort comfort models can benefit the building industry by improving the level of thermal comfort among occupants without relying on individual customisation, which can be unfeasible and too expensive. Further global data collection experiments with multiple occupants are encouraged, to investigate the generalisation of the potential different cohorts and the potential discovery of thermal preference signatures shared across different occupant populations. Additionally, we encourage more research on control strategies that take advantage of cohort-level customisation instead of individual-level catering.
CRediT author statement

Matias Quintana: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Visualization, Writing - Original Draft. Stefano Schiavon: Conceptualization, Visualization, Supervision, Writing - Reviewing and Editing, Funding acquisition. Federico Tartarini: Resources, Data Curation, Writing - Reviewing and Editing. Joyce Kim: Resources, Data Curation, Writing - Reviewing and Editing. Clayton Miller: Conceptualization, Resources, Visualization, Supervision, Project administration, Writing - Reviewing and Editing, Funding acquisition.
Acknowledgements

This research was funded by the Republic of Singapore's National Research Foundation through a grant to the Berkeley Education Alliance for Research in Singapore (BEARS) for the Singapore-Berkeley Building Efficiency and Sustainability in the Tropics (SinBerBEST2) Program and a Singapore Ministry of Education (MOE) Tier 1 Grant (Ecological Momentary Assessment (EMA) for Built Environment Research). BEARS has been established by the University of California, Berkeley as a center for intellectual excellence in research and education in Singapore. The authors would like to acknowledge the team behind the data collection and processing, including Mario Frei, Yi Ting Teo, and Yun Xuan Chua.
Appendix A. Appendix
Parameter               Description or value
Train-test ratio        0.8, participant-wise
Model                   Random Forest
Model hyperparameters   Number of trees: 100, 300, 500
                        Split criterion: Gini index
                        Tree depth: 1, ..., 10
                        Min. samples for split: 2, 3, 4
                        Min. samples on a leaf: 1, 2, 3
Model training          Hyperparameter grid search with 5-fold cross-validation
Model metric            F1-micro score
Clustering              Number of cohorts tested: k ∈ [2, 10]; metric: Silhouette score
Iterations              Modeling pipeline is repeated 100 times

Table Appendix A.1: Experiments and modeling parameters.
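The pipeline in Table Appendix A.1 corresponds to a scikit-learn grid search of roughly the following shape (a sketch; the exact implementation may differ):

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

param_grid = {
    "n_estimators": [100, 300, 500],        # number of trees
    "max_depth": list(range(1, 11)),        # tree depth 1..10
    "min_samples_split": [2, 3, 4],
    "min_samples_leaf": [1, 2, 3],
}
search = GridSearchCV(
    RandomForestClassifier(criterion="gini", random_state=0),
    param_grid,
    scoring="f1_micro",                     # model metric from Table A.1
    cv=5,                                   # 5-fold cross-validation
)
# search.fit(X_train, y_train) would then select the best hyperparameters.
```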
Cohort approach           Better/Worse   Sex       Height (cm)      Weight (kg)
Dorn:Dist-Cross           Better-off     5M, 4F    168 ± 4.71       64.3 ± 12.04
                          Worse-off      5M, 6F    167 ± 10.63      64.55 ± 14
Dorn:Cross                Better-off     6M, 6F    169.08 ± 8.95    64.08 ± 12.99
                          Worse-off      4M, 4F    166 ± 7.38       65 ± 13.38
Dorn:Life Satisfaction    Better-off     1M, 3F    165.25 ± 8.95    55.25 ± 5.76
                          Worse-off      3M, 3F    170 ± 5.86       70.17 ± 8.53
Dorn:Agreeableness        Better-off     7M, 4F    167.18 ± 8.70    65.82 ± 14.33
                          Worse-off      3M, 6F    168.67 ± 8.16    62.78 ± 11.34
SMC:Dist-Cross            Better-off     8M, 4F    169.58 ± 11.50   74.92 ± 15.77
                          Worse-off      8M, 14F   166 ± 8.85       73.59 ± 17.02
SMC:Cross                 Better-off     9M, 2F    174 ± 7.36       78.36 ± 18.86
                          Worse-off      7M, 16F   164.04 ± 9.50    72 ± 14.98

Table Appendix A.2: Breakdown of occupants' metadata based on the cohort approaches for which they are better or worse off. Height and Weight columns display the mean ± standard deviation.
(a) Dorn [55] participants and their first 231 responses per participant. (b) SMC [57] participants and their first 60 responses per participant.

Figure Appendix A.1: Thermal preference response distribution for all participants in both datasets. Appendix A.1a shows the Dorn [55] dataset and Appendix A.1b shows the SMC [57] dataset.
(a) Average Silhouette scores for cold start (Surveys and B5P) and warm start (Dist-Cross and Cross) approaches on the Dorn dataset. (b) Average Silhouette scores for warm start approaches on the SMC dataset.

Figure Appendix A.2: Number of cohorts determined based on the average Silhouette scores at different numbers of cohorts k (k ∈ [2, 10]) on both datasets. The higher the Silhouette score, the better. Appendix A.2a and Appendix A.2b show the values for the Dorn and SMC datasets respectively.
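The k-selection procedure behind Figure Appendix A.2 can be sketched as follows, on synthetic two-group features: cluster for each candidate k ∈ [2, 10] and keep the k with the highest average Silhouette score.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Two well-separated synthetic groups standing in for occupant features.
features = np.vstack([rng.normal(0, 0.3, size=(15, 2)),
                      rng.normal(3, 0.3, size=(15, 2))])

scores = {}
for k in range(2, 11):                      # candidate numbers of cohorts
    labels = SpectralClustering(n_clusters=k, random_state=0).fit_predict(features)
    scores[k] = silhouette_score(features, labels)
best_k = max(scores, key=scores.get)        # highest average Silhouette score
```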
Figure Appendix A.3: Performance results in F1-micro score over 100 iterations for each cohort approach on the Dorn dataset (y-axis). Cold start approaches are highlighted by a light-blue region and warm start approaches by a yellow region. These results are complementary to Figure 4. Each personality trait from the B5P (Extraversion, Agreeableness, Conscientiousness, Emotional stability, Openness to experiences) is considered individually as a cold start approach. Warm start approaches, Dist-Cross and Cross, have the number of labeled data points used for assignment, i.e., 1, 3, 5, and 7, as suffixes.
Figure Appendix A.4: Percentage change in F1-micro score over 100 iterations for each occupant in each cohort approach (y-axis) for the Dorn dataset. The percentage change is calculated based on the respective cohort approach performance and the General-purpose performance. The average value for each occupant is reported in Figure 5.
Figure Appendix A.5: Performance percentage change in F1-micro score of 100 iterations for each occupant in each cohort approach (y-axis) for the SMC dataset. The percentage change is calculated based on the respective cohort approach performance and the General-purpose performance. The average value of each occupant is reported in Figure 5.
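The percentage-change metric plotted in Figures Appendix A.4 and A.5 reads, on its usual interpretation, as the relative change of a cohort model's F1-micro score over the General-purpose baseline for the same occupant. A small sketch of that reading (the function name and the example scores are illustrative, not values from the paper):

```python
def percent_change(cohort_f1: float, general_f1: float) -> float:
    """Relative change (in %) of a cohort model's F1-micro score
    versus the General-purpose model's score for the same occupant.
    Positive values mean the cohort approach performed better."""
    return 100.0 * (cohort_f1 - general_f1) / general_f1

# Example: a cohort model scoring 0.54 F1-micro against a
# General-purpose baseline of 0.50 is an 8% improvement.
change = percent_change(0.54, 0.50)
```

Averaging this quantity over the 100 iterations per occupant yields the per-occupant values reported in Figure 5.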
References
[1]
R. J. De Dear, T. Akimoto, E. A. Arens, G. Brager, C. Candido, K. W. D. Cheong, B. Li, N. Nishihara, S. C. Sekhar, S. Tanabe, J. Toftum, H. Zhang, and Y. Zhu. Progress in thermal comfort research over the last twenty years. Indoor Air, 23(6):442–461, 2013.
[2]
Fan Zhang, Richard de Dear, and Peter Hancock. Effects of moderate ther-
mal environments on cognitive performance: A multidisciplinary review.
Applied Energy, 236(July 2018):760–777, 2019.
[3]
Ken Parsons. Human thermal environments: the effects of hot, moderate,
and cold environments on human health, comfort and performance. CRC
press, 2014.
[4]
Lindsay T Graham, Thomas Parkinson, and Stefano Schiavon. Lessons learned from 20 years of CBE's occupant surveys. Buildings & Cities, 2(1):166–184, 2021.
[5]
David Ormandy and Véronique Ezratty. Health and thermal comfort: From WHO guidance to housing strategies. Energy Policy, 49:116–121, 2012.
[6]
K Pantavou, G Theoharatos, A Mavrakis, and M Santamouris. Evaluating
thermal comfort conditions and health responses during an extremely hot
summer in Athens. Building and Environment, 46(2):339–344, 2011.
[7]
P. A. Hancock, Jennifer M. Ross, and James L. Szalma. A meta-analysis of
performance response under thermal stressors. Human Factors, 49(5):851–
877, 2007.
[8]
Olli A. Seppänen and William Fisk. Some quantitative relations between indoor environmental quality and work performance or health. HVAC&R Research, 12(4):957–973, 2006.
[9]
David P Wyon and Pawel Wargocki. Room temperature effects on office
work. In Creating the productive workplace, pages 209–220. Taylor &
Francis, 2006.
[10]
M J Mendell and G A Heath. Do indoor pollutants and thermal conditions
in schools influence student performance? A critical review of the literature.
Indoor Air, 15(1):27–52, 2005.
[11]
Pawel Wargocki, Jose Ali Porras-Salazar, and Sergio Contreras-Espinoza.
The relationship between classroom temperature and children’s perfor-
mance in school. Building and Environment, 157(February):197–204,
2019.
[12]
Pawel Wargocki and David P. Wyon. Ten questions concerning thermal and
indoor air quality effects on the performance of office work and schoolwork.
Building and Environment, 112:359–366, 2017.
[13]
Li Lan, Zhiwei Lian, and Li Pan. The effects of air temperature on office
workers’ well-being, workload and productivity-evaluated with subjective
ratings. Applied Ergonomics, 42(1):29–36, 2010.
[14]
Sergio Altomonte, Joseph Allen, Philomena Bluyssen, Gail Brager, Lisa
Heschong, Angela Loder, Stefano Schiavon, Jennifer Veitch, Lily Wang,
and Pawel Wargocki. Ten questions concerning well-being in the built
environment. Building and Environment, page 106949, 2020.
[15]
P.O. Fanger. Assessment of thermal comfort practice. British journal of
Industrial Medicine, 30:313–324, 1973.
[16]
Richard J. de Dear and Gail Schiller Brager. Developing an adaptive model
of thermal comfort and preference. ASHRAE Transactions, 104(1):1–18,
1998.
[17]
Veronika Földváry Ličina, Toby Cheung, Hui Zhang, Richard de Dear, Thomas Parkinson, Edward Arens, Chungyoon Chun, Stefano Schiavon, Maohui Luo, Gail Brager, Peixian Li, and Soazig Kaam. ASHRAE Global Thermal Comfort Database II. Dataset, v4:1–4, 2018.
[18]
Toby Cheung, Stefano Schiavon, Thomas Parkinson, Peixian Li, and
Gail Brager. Analysis of the accuracy on PMV – PPD model using the
ASHRAE Global Thermal Comfort Database II. Building and Environment,
153(December 2018):205–217, 2019.
[19]
Joyce Kim, Yuxun Zhou, Stefano Schiavon, Paul Raftery, and Gail Brager.
Personal comfort models: Predicting individuals’ thermal preference using
occupant heating and cooling behavior and machine learning. Building
and Environment, 129(December 2017):96–106, 2018.
[20]
Joyce Kim, Stefano Schiavon, and Gail Brager. Personal comfort models–
A new paradigm in thermal comfort for occupant-centric environmental
control. Building and Environment, 132:114–124, 2018.
[21]
Hui Zhang, Edward Arens, Charlie Huizenga, and Taeyoung Han. Thermal
sensation and comfort models for non-uniform and transient environments,
part II: Local comfort of individual body parts. Building and Environment,
45(2):389–398, 2010.
[22]
Eun-jeong Shin and Roberto Yus. Exploring Fairness in Participatory
Thermal Comfort Control in Smart Buildings. BuildSys ’17 Proceedings
of the 4th ACM International Conference on Systems for Energy-Efficient
Built Environments, 2017.
[23]
Prageeth Jayathissa, Matias Quintana, Mahmoud Abdelrahman, and Clay-
ton Miller. Humans-as-a-sensor for buildings: Intensive longitudinal
indoor comfort models. Buildings, 10(174):1–23, 2020.
[24]
Seungjae Lee, Ilias Bilionis, Panagiota Karava, and Athanasios Tzempe-
likos. A Bayesian approach for probabilistic classification and inference
of occupant thermal preferences in office buildings. Building and Environ-
ment, 118:323–343, 2017.
[25]
Maohui Luo, Jiaqing Xie, Yichen Yan, Zhihao Ke, Peiran Yu, Zi Wang,
and Jingsi Zhang. Comparing machine learning algorithms in predicting
thermal sensation using ASHRAE Comfort Database II. Energy and
Buildings, 210:109776, 2020.
[26]
Da Li, Carol C. Menassa, and Vineet R. Kamat. Personalized human
comfort in indoor building environments under diverse conditioning modes.
Building and Environment, 126(July):304–317, 2017.
[27]
David Daum, Frédéric Haldi, and Nicolas Morel. A personalized measure of thermal comfort for building controls. Building and Environment, 46(1):3–11, 2011.
[28]
Shichao Liu, Stefano Schiavon, Hari Prasanna Das, Costas J Spanos,
and Ming Jin. Personal thermal comfort models with wearable sensors.
Building and Environment, page 106281, 2019.
[29]
Maíra André, Renata De Vecchi, and Roberto Lamberts. User-centered environmental control: a review of current findings on personal conditioning systems and personal comfort models. Energy and Buildings, 222, 2020.
[30]
Samuel D. Gosling, Peter J. Rentfrow, and William B. Swann. A very
brief measure of the Big-Five personality domains. Journal of Research in
Personality, 37(6):504–528, 2003.
[31]
Boning Li and Akane Sano. Extraction and Interpretation of Deep
Autoencoder-based Temporal Features from Wearables for Forecasting
Personalized Mood, Health, and Stress. Proceedings of the ACM on
Interactive, Mobile, Wearable and Ubiquitous Technologies, 4(2):1–26,
2020.
[32]
Toby C.T. Cheung, Stefano Schiavon, Elliott T. Gall, Ming Jin, and
William W. Nazaroff. Longitudinal assessment of thermal and perceived
air quality acceptability in relation to temperature, humidity, and CO2
exposure in Singapore. Building and Environment, 115:80–90, 2017.
[33]
Liliana Barrios and Wilhelm Kleiminger. The Comfstat - Automatically sensing thermal comfort for smart thermostats. In IEEE International Conference on Pervasive Computing and Communications (PerCom), pages 257–266, 2017.
[34]
Peter Xiang Gao and Srinivasan Keshav. Optimal Personal Comfort Man-
agement Using SPOT+. In Proceedings of the 5th ACM Workshop on
Embedded Systems For Energy-Efficient Buildings, BuildSys’13, pages
1–8, New York, NY, USA, 2013. ACM, ACM.
[35]
Peter Xiang Gao and Srinivasan Keshav. SPOT: a smart personalized
office thermal control system. In Proceedings of the Fourth International
Conference on Future Energy Systems, e-Energy ’13, pages 237–246, New
York, NY, USA, 2013. ACM, ACM.
[36]
Farrokh Jazizadeh and S Pradeep. Can computers visually quantify human
thermal comfort?: Short paper. In Proceedings of the 3rd ACM Inter-
national Conference on Systems for Energy-Efficient Built Environments,
pages 95–98. ACM, 2016.
[37]
Parisa Mansourifard, Farrokh Jazizadeh, Bhaskar Krishnamachari, and
Burcin Becerik-Gerber. Online learning for personalized room-level ther-
mal control: A multi-armed bandit framework. In Proceedings of the
5th ACM Workshop on Embedded Systems For Energy-Efficient Build-
ings, BuildSys’13, page 1–8, New York, NY, USA, 2013. Association for
Computing Machinery.
[38]
Liang Zhang, Abraham Hang-yat Lam, and Dan Wang. Strategy-proof
thermal comfort voting in buildings. In Proceedings of the 1st ACM
Conference on Embedded Systems for Energy-Efficient Buildings, pages
160–163. ACM, 2014.
[39]
Pimpatsohn Sae-Zhang, Matias Quintana, and Clayton Miller. Differences in thermal comfort state transitional time among comfort preference groups. In 16th Conference of the International Society of Indoor Air Quality and Climate: Creative and Smart Solutions for Better Built Environments, Indoor Air 2020, 2020.
[40]
Panagiota Antoniadou and Agis M. Papadopoulos. Occupants’ thermal
comfort: State of the art and the prospects of personalized assessment in
office buildings. Energy and Buildings, 153:136–149, 2017.
[41]
Gail Brager, Hui Zhang, and Edward Arens. Evolving opportunities for
providing thermal comfort. Building Research and Information, 43(3):274–
287, 2015.
[42]
Joost Van Hoof, Mitja Mazej, and Jan L.M. Hensen. Thermal comfort:
Research and practice. Frontiers in Bioscience, 15(2):765–788, 2010.
[43]
Blerina Lika, Kostas Kolomvatsos, and Stathes Hadjiefthymiades. Facing
the cold start problem in recommender systems. Expert Systems with
Applications, 41(4 PART 2):2065–2073, 2014.
[44]
Xuan Nhat Lam, Thuc Vu, Trong Duc Le, and Anh Duc Duong. Addressing
cold-start problem in recommendation systems. Proceedings of the 2nd
International Conference on Ubiquitous Information Management and
Communication, ICUIMC-2008, pages 208–211, 2008.
[45]
Andrew I Schein, Alexandrin Popescul, Lyle H Ungar, and David M
Pennock. Methods and Metrics for Cold-Start Recommendations. In
Proceedings of the 25th annual international ACM SIGIR conference on
Research and development in information retrieval - SIGIR ’02, pages
253–260, 2002.
[46]
Nikola Banovic and John Krumm. Warming Up to Cold Start Personal-
ization. Proceedings of the ACM on Interactive, Mobile, Wearable and
Ubiquitous Technologies, 1(4):1–13, 2018.
[47]
Hsien Te Kao, Shen Yan, Homa Hosseinmardi, Shrikanth Narayanan,
Kristina Lerman, and Emilio Ferrara. User-Based Collaborative Filtering
Mobile Health System. Proceedings of the ACM on Interactive, Mobile,
Wearable and Ubiquitous Technologies, 4(4), 2020.
[48]
Elaine Nancy Aron and Arthur Aron. Sensory-Processing Sensitivity and Its Relation to Introversion and Emotionality. Journal of Personality and Social Psychology, 73(2):345–368, 1997.
[49]
Ed Diener, Robert A Emmons, Randy J Larsen, and Sharon Griffin. The
Satisfaction With Life Scale. Journal of Personality Assessment, 49:71–75,
1985.
[50]
Peter J. Rousseeuw. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20:53–65, 1987.
[51]
Andrew Ng, Michael Jordan, and Yair Weiss. On spectral clustering:
Analysis and an algorithm. Advances in neural information processing
systems, 14, 2001.
[52]
Federico Tartarini and Stefano Schiavon. pythermalcomfort: A Python
package for thermal comfort research. SoftwareX, 12:100578, 2020.
[53]
ISO. ISO 7730: Ergonomics of the thermal environment. Analytical determination and interpretation of thermal comfort using calculation of the PMV and PPD indices and local thermal comfort criteria. International Organization for Standardization, 2005.
[54]
F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel,
M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Pas-
sos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-
learn: Machine learning in Python. Journal of Machine Learning Research,
12:2825–2830, 2011.
[55]
Federico Tartarini, Stefano Schiavon, Matias Quintana, and Clayton Miller. Personalized thermal comfort models using wearables and IoT devices.
[56]
Prageeth Jayathissa, Matias Quintana, Tapeesh Sood, Negin Nazarian, and Clayton Miller. Is your clock-face cozie? A smartwatch methodology for the in-situ collection of occupant comfort data. In CISBAT 2019 Climate Resilient Buildings - Energy Efficiency & Renewables in the Digital Era, Lausanne, Switzerland, 2019.
[57]
Joyce Kim, Fred Bauman, Paul Raftery, Edward Arens, Hui Zhang, Gabe
Fierro, Michael Andersen, and David Culler. Occupant comfort and
behavior: High-resolution data from a 6-month field study of personal
comfort systems with 37 real office workers. Building and Environment,
148(September 2018):348–360, 2019.
[58]
Joon Ho Choi and Vivian Loftness. Investigation of human body skin
temperatures as a bio-signal to indicate overall thermal sensations. Building
and Environment, 58:258–269, 2012.
[59] Chengcheng Shan, Jiawen Hu, Jianhong Wu, Aili Zhang, Guoliang Ding,
and Lisa X. Xu. Towards non-intrusive and high accuracy prediction of
personal thermal comfort using a few sensitive physiological parameters.
Energy and Buildings, 207:109594, 2020.
[60]
Matias Quintana, Stefano Schiavon, Kwok Wai Tham, and Clayton Miller.
Balancing thermal comfort datasets: We GAN, but should we? In Pro-
ceedings of the 7th ACM International Conference on Systems for Energy-
Efficient Buildings, Cities, and Transportation, pages 120–129, Virtual
Event, Japan, 2020.
[61]
Rachel L. Morrison and Keith A. Macky. The demands and resources
arising from shared office spaces. Applied Ergonomics, 60:103–115, 2017.
[62]
Sabina Hodzic, Bettina Kubicek, Lars Uhlig, and Christian Korunka.
Activity-based flexible offices: effects on work-related outcomes in a
longitudinal study. Ergonomics, 64(4):455–473, 2021.