Building and Urban Data Science (BUDS) Lab

About the lab

BUDS Lab is a scientific research group that leverages data sources from the built and urban environments to improve energy efficiency and conservation, as well as the comfort, safety and satisfaction of the people who use them.

Featured projects (6)

The scientific literature is rich in studies that evaluate human responses to environmental stimuli, both in controlled environments and in field studies, through Post Occupancy Evaluations or in-situ interviews. Human-centric approaches are becoming increasingly popular in the scientific community for three main reasons. First, human preferences and responses are intrinsically individual: they are shaped not only by environmental stimuli but also by other factors (e.g., sex, age, culture, expectations). Second, people adapt to environmental stimuli in different ways, categorized as behavioural, physiological and psychological. Third, each person is exposed to a unique set of environmental stimuli, characterized by their combination and interactions. The aim of this Research Topic is to gather innovative human-centric research that seeks to understand the complex multisensory interaction between occupants and building technologies, and to define new models that fill the gaps in current methodologies for designing comfortable, usable, adaptable, and energy-efficient buildings and public spaces, including through new technologies (wearables and nearables) and approaches such as the Internet of Things (IoT), Virtual Reality (VR), and Machine Learning (ML). This Frontiers Research Topic invites submissions of original research covering human-centric investigations.
Topics of interest include but are not limited to:
  • Human perception and responses to combined environmental stimuli
  • Interactions between occupants and building technologies
  • New human-centric models
  • Innovative human-centric sensing technologies
  • Innovative human-centric building control technologies
  • Energy implications of human-centric approaches
  • Design implications of human-centric approaches
  • Multisensory interactions between occupants and urban environments
Link:
As building envelopes and mechanical and electrical equipment become more efficient, the impact of occupants on building energy use increases. Meanwhile, trends in teleworking, co-working, and home-sharing mean that actual occupancy differs vastly from standard occupancy schedules. Finally, global expectations for comfort are increasing, while a variety of new technologies may or may not succeed in meeting this demand. The convergence of these trends has necessitated a new look at how occupants are incorporated into building design and operation practice throughout the building life-cycle. The field of occupant modelling emerged over four decades ago; however, it has surged in the past decade, particularly as a result of IEA EBC Annex 66, “Simulation and Definition of Occupant Behaviour in Buildings”. Annex 66 played an important role in formalizing experimental research methods, modelling and model validation, and occupant simulation. Given the number of unanswered questions about occupant comfort and behaviour and the minimal penetration of advanced occupant modelling into practice, a follow-up was established: Annex 79, “Occupant-centric building design and operation”. IEA EBC Annex 79 runs from 2018 to 2023. More details here:
Develop an open-source clock face for Fitbit and Apple Watch, to allow researchers, facility managers, engineers and architects to easily collect subjective feedback from building occupants.
An open data set of non-residential meters
Crowdsource the most accurate long-term energy prediction models for buildings

Featured research (36)

The paper presents a review of major contributions in infrared thermography to the study of the built environment at multiple scales. To prepare the review, hundreds of studies conducted between the 1980s and 2020s were first selected based on their relevance to the scope. Afterward, the most relevant contributions were classified and chronologically sorted. From the classification, it is observed that most reviewed studies were conducted to evaluate the thermal performance of buildings or detect their defects using images collected by an infrared camera. At the same time, a considerable number of studies used thermal images obtained by a satellite to observe the urban heat island effect. Despite the large number of contributions in infrared thermography at multiple scales of the built environment, three main research gaps or opportunities can be identified in the literature. First, it would be possible to perform a more detailed analysis of urban heat fluxes using thermal images collected at multiple scales. Second, thermal images collected by a mounted or handheld infrared camera could be used to create building energy models. Finally, better visualization tools could be developed to monitor a city’s energy use and improve its sustainability if thermal images were integrated into Internet-of-Things and digital twin platforms.
Research is needed to explore the limitations and potential for improvement of machine learning for building energy prediction. With this aim, the ASHRAE Great Energy Predictor III (GEPIII) Kaggle competition was launched in 2019. This effort was the largest building energy meter machine learning competition of its kind, with 4,370 participants who submitted 39,403 predictions. The test data set included two years of hourly whole building readings from 2,380 meters in 1,448 buildings at 16 locations. This paper analyzes the various sources and types of residual model error from an aggregation of the competition's top 50 solutions. This analysis reveals the limitations for machine learning using the standard model inputs of historical meter, weather, and basic building metadata. The errors are classified according to timeframe, behavior, magnitude, and incidence in single buildings or across a campus. The results show machine learning models have errors within a range of acceptability (RMSLE_scaled ≤ 0.1) on 79.1% of the test data. Lower magnitude (in-range) model errors (0.1 < RMSLE_scaled ≤ 0.3) occur in 16.1% of the test data. These errors could be remedied using innovative training data from onsite and web-based sources. Higher magnitude (out-of-range) errors (RMSLE_scaled > 0.3) occur in 4.8% of the test data and are unlikely to be accurately predicted.
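The three error ranges quoted in this abstract can be illustrated with a short sketch. It assumes the usual definition of root mean squared logarithmic error (RMSLE); the abstract does not say how RMSLE_scaled is derived from it, so the classifier below simply takes an already-scaled value as input. The function names `rmsle` and `error_class` are hypothetical and are not taken from the paper.

```python
import math


def rmsle(actual, predicted):
    """Root mean squared logarithmic error over paired non-negative readings."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(
        (math.log1p(p) - math.log1p(a)) ** 2 for a, p in zip(actual, predicted)
    )
    return math.sqrt(total / len(actual))


def error_class(rmsle_scaled):
    """Map a scaled RMSLE value to the three ranges quoted in the abstract.

    Assumption: the caller has already applied whatever scaling the paper
    uses; that scaling is not described in the abstract.
    """
    if rmsle_scaled <= 0.1:
        return "acceptable"
    if rmsle_scaled <= 0.3:
        return "in-range"
    return "out-of-range"
```

For example, `error_class(0.25)` falls in the lower magnitude (in-range) band, which the abstract associates with errors that additional training data might remedy.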
This paper describes the adaptation of an open-source ecological momentary assessment smart-watch platform with three sets of micro-survey wellness-related questions focused on i) infectious disease (COVID-19) risk perception, ii) privacy and distraction in an office context, and iii) triggers of various movement-related behaviors in buildings. This platform was previously used to collect data for thermal comfort, and this work extends its use to other domains. Several research participants took part in a proof-of-concept experiment by wearing a smartwatch to collect their micro-survey question preferences and perception responses for two of the question sets. Participants were also asked to install an indoor localization app on their phone to detect where precisely in the building they completed the survey. The experiment identified occupant information such as the tendencies for the research participants to prefer privacy in certain spaces and the difference between infectious disease risk perception in naturally versus mechanically ventilated spaces.
Collecting intensive longitudinal thermal preference data from building occupants is emerging as an innovative means of characterizing the performance of buildings and the people who use them. These techniques have occupants giving subjective feedback using smartphones or smartwatches frequently over the course of days or weeks. The intention is that the data will be collected with high spatial and temporal diversity to best characterize a building and the occupant’s preferences. But in reality, leaving the occupant to respond in an ad-hoc or fixed interval way creates unneeded survey fatigue and redundant data. This paper outlines a scenario-based (virtual experiment) method for optimizing data sampling using a smartwatch to achieve comparable accuracy in a personal thermal preference model with less data. This method, Build2Vec, uses BIM-extracted spatial data and Graph Neural Network-based (GNN) modeling to find regions of similar comfort preference to identify the best scenarios for triggering the occupant to give feedback. This method is compared to two baseline scenarios that use conventional zoning and a generic 4x4 square meter grid method from two field-based data sets. The results show that the proposed Build2Vec method has an 18%–23% higher overall sampling quality than the spaces-based and square-grid-based sampling methods. The Build2Vec method also performs similarly to the baselines when removing redundant occupant feedback points but with better scalability potential.
Data-driven building energy prediction is an integral part of the process for measurement and verification, building benchmarking, and building-to-grid interaction. The ASHRAE Great Energy Predictor III (GEPIII) machine learning competition used an extensive meter data set to crowdsource the most accurate machine learning workflow for whole building energy prediction. A significant component of the winning solutions was the pre-processing phase to remove anomalous training data. Contemporary pre-processing methods focus on filtering statistical threshold values or deep learning methods requiring training data and multiple hyper-parameters. A recent method named ALDI (Automated Load profile Discord Identification) managed to identify these discords using the matrix profile, but the technique still requires user-defined parameters. We develop ALDI++, a method based on the previous work that bypasses user-defined parameters and takes advantage of discord similarity. We evaluate ALDI++ against a statistical threshold, variational auto-encoder, and the original ALDI as baselines in classifying discords and energy forecasting scenarios. Our results demonstrate that while the classification performance improvement over the original method is marginal, ALDI++ helps achieve the best forecasting error, improving 6% over the winning team’s approach with six times less computation time.
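To make the idea of a matrix-profile discord concrete, the sketch below computes a naive matrix profile for a load series and returns the window whose nearest neighbor is farthest away. It is not ALDI or ALDI++; it is only a brute-force illustration of the underlying concept, and the function names, the window length `m`, and the exclusion zone of `m // 2` are all assumptions made for this example.

```python
import numpy as np


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    std = window.std()
    if std == 0:
        return np.zeros_like(window)
    return (window - window.mean()) / std


def naive_matrix_profile(series, m):
    """Distance from each length-m window to its nearest non-trivial match."""
    series = np.asarray(series, dtype=float)
    n = len(series) - m + 1
    if n < 2:
        raise ValueError("series is too short for the chosen window length")
    windows = np.array([znorm(series[i:i + m]) for i in range(n)])
    excl = m // 2  # assumed exclusion zone that skips overlapping windows
    profile = np.full(n, np.inf)
    for i in range(n):
        dist = np.linalg.norm(windows - windows[i], axis=1)
        dist[max(0, i - excl):min(n, i + excl + 1)] = np.inf
        profile[i] = dist.min()
    return profile


def top_discord(series, m):
    """Start index of the window with the largest finite profile value."""
    profile = naive_matrix_profile(series, m)
    finite = np.where(np.isfinite(profile), profile, -np.inf)
    return int(np.argmax(finite))
```

With hourly meter data and `m = 24`, a day whose load shape has no close match elsewhere in the series would surface as the top discord. The brute-force loop scales quadratically with series length, so a dedicated matrix profile library would be the more practical choice for real meter data.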

Lab head

Clayton Miller
  • Department of the Built Environment
About Clayton Miller
  • Dr. Clayton Miller is an Asst. Professor at NUS in the BUDS Lab, the Co-Leader of Theme D - Data Analytics at the UC Berkeley SinBerBEST2 Lab, and the Co-Leader of Subtask 4 of the IEA Annex 79 Occupant-Centric Building Design and Operation. He holds a Doctor of Sciences (Dr. sc. ETH Zurich) from ETH Zürich, an MSc. (Building) from the National University of Singapore (NUS), and a BSc./Masters of Architectural Engineering (MAE) from the University of Nebraska - Lincoln (UNL).

Members (10)

Chuan Fu Tan
  • Johnson Controls
Matias Quintana
  • National University of Singapore
Miguel Martin
  • Berkeley Education Alliance for Research in Singapore
Martín Mosteiro Romero
  • National University of Singapore
Chun Fu
  • National University of Singapore
Vasantha Ramani
  • Berkeley Education Alliance for Research in Singapore
Yi Ting Teo
  • National University of Singapore

Alumni (8)

Prageeth Jayathissa
  • Vector Limited
Pandarasamy Arjunan
  • Berkeley Education Alliance for Research in Singapore (BEARS) Limited
Jayashree Chadalawada
  • National University of Singapore