
A hybrid model of machine learning for classifying household water-consumption behaviors

Abstract

Classifying household water-consumption behaviors is crucial for providing targeted suggestions for water-saving behaviors and enabling effective resource management and conservation. Although it is common knowledge that energy consumption is closely coupled with household water consumption, the effectiveness of energy consumption information in classifying household water-consumption behaviors remains unexplored. This study proposes a hybrid model of long short-term memory (LSTM) and random forest (RF) using water and electricity consumption as inputs to classify household water-consumption behaviors. Data from three households in Beijing collected from January to March 2020 were used for the case studies. The hybrid model achieved a macro F1 score of 0.89 at a 5-min resolution, outperforming the standalone LSTM and RF models. Additionally, the inclusion of time-series electricity consumption improved the accuracy (F1 scores) of classifying bathing and laundry behaviors by 0.12 and 0.20, respectively. These findings underscore the scientific value of integrating electricity consumption as a proxy variable in water-consumption behavior classification models, demonstrating its potential to enhance accuracy while simplifying data acquisition processes. This study establishes a framework for demand-side water management aimed at empowering residents to understand their own water-energy consumption behavior patterns and engage in personalized water conservation efforts.
Miao Wang a, Zonghan Li a, Yi Liu a, Lu Lin b, Chunyan Wang a,*
a School of Environment, Tsinghua University, China
b School of Economics and Management, China University of Petroleum, Beijing, China
ARTICLE INFO
Keywords:
Behavior classification
Hybrid machine learning method
Water-electricity nexus
High resolution
1. Introduction
With economic development and urbanization, urban water consumption, particularly household water consumption, has increased rapidly (Dolan et al., 2021). According to WRI's Aqueduct, global household water demand increased by 600% between 1960 and 2014 (Flörke et al., 2013). The volume of household water consumption in China surged from 6.8 billion m³ in 1980 to 77 billion m³ in 2010, a more than tenfold increase (Wang et al., 2018). Projections indicate that by 2050, global household water consumption could increase by 50%–250% compared with that in 2010 (Wada et al., 2016). This escalation has led to water scarcity, which is a significant factor restricting sustainable development and underscores the urgent need for water conservation strategies.
Classifying household water-consumption behaviors is vital for unlocking potential water-saving measures (Cominola et al., 2023; Russell and Fielding, 2010), such as providing personalized water-saving suggestions, positively facilitating water conservation endeavors (Liu et al., 2016), and enabling managers to scrutinize and refine their incentive measures based on detailed information about water end uses (Gleick et al., 2003). Based on the water-consumption behavior classification, households could adjust their high water-consumption behaviors, which could lead to significant water savings of up to 28% (Kneebone, 2018). However, despite its importance, there remain critical gaps in the methods and frameworks used for classifying water-consumption behaviors.
Existing research on household water-consumption behavior classification has primarily relied on high-resolution data collected at subminute intervals (i.e., data collection intervals of <1 min) (Cominola et al., 2019; Heydari et al., 2022). While these approaches achieve high accuracy, they are resource-intensive, costly, and often require invasive monitoring systems that raise privacy concerns. This limits their scalability and practical application. Recent advancements suggest that coarser temporal resolutions, such as minute-level data, can provide comparable classification accuracy while reducing costs and avoiding intrusive monitoring (Britton et al., 2013). However, the effectiveness of minute-level resolution data for behavior classification has yet to be demonstrated empirically. Furthermore, although machine learning techniques have been widely applied in this domain, new algorithmic approaches (e.g., hybrid machine learning methods) are still needed to enhance classification modeling accuracy (Huang et al., 2022; Manandhar et al., 2023; Nguyen et al., 2015).

* Corresponding author.
E-mail address: wangchunyan@tsinghua.edu.cn (C. Wang).
Contents lists available at ScienceDirect
Cleaner and Responsible Consumption
journal homepage: www.journals.elsevier.com/cleaner-and-responsible-consumption
https://doi.org/10.1016/j.clrc.2025.100252
Received 18 November 2024; Received in revised form 3 January 2025; Accepted 8 January 2025
Cleaner and Responsible Consumption 16 (2025) 100252
Available online 9 January 2025
2666-7843/© 2025 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Another critical gap lies in the limited integration of the water-energy nexus into existing classification models. Water-related activities such as laundry and bathing are closely linked to electricity consumption, yet most studies fail to incorporate this interdependence (Fidar et al., 2010; Plappally, 2012; Wang et al., 2022). Recent evidence suggests that integrating electricity consumption data as a proxy for water-consumption behaviors can significantly improve classification accuracy (Li et al., 2024). However, challenges such as the need to align electricity and water consumption data, or the requirement for specialized monitoring systems that involve installing numerous sensors in households, have hindered widespread adoption.
In summary, research on classifying household water-consumption behaviors has three gaps: limited inclusion of the water-energy nexus, underexplored use of hybrid machine learning methods, and a limited range of temporal resolutions for data acquisition. To fill these research gaps, this study makes the following contributions: (1) it develops a neural-network-based hybrid machine learning model that leverages non-invasive devices to classify household water-consumption behaviors using both water and electricity consumption data; (2) it demonstrates the effectiveness of incorporating electricity consumption as a proxy variable by capturing interdependencies between water and electricity consumption; (3) it identifies an optimal temporal resolution for data collection, showing that minute-level resolutions can achieve high performance while reducing costs and avoiding intrusive monitoring; and (4) it validates the proposed method through a case study, providing empirical evidence for its effectiveness in practical applications. Therefore, this study proposes a method for household water-consumption behavior classification that offers greater practical applicability.
The remainder of this paper is organized as follows. The Literature review section identifies the main research gaps on household water-consumption behaviors. The Data and method section provides the research framework, describes the feature variables, and introduces the Long Short-Term Memory (LSTM) and Random Forest (RF) hybrid model. The Results section presents a descriptive statistical analysis of the characteristics of household water and electricity consumption and compares the performance of the hybrid model with that of the LSTM and RF models. The impact of different temporal resolutions and electricity proxies on water-consumption behavior classification is also analyzed. The Discussion section compares the results of this study with those of other studies and explores the insights for water management and household water conservation. The Conclusion summarizes the key findings and limitations of this study.
2. Literature review
Critical research gaps on household water-consumption behavior classification persist in three key dimensions: the limited application of hybrid machine learning methods, the challenges associated with high-resolution data collection, and the effectiveness of integrating electricity consumption data as a proxy into classification models.
2.1. Methods for classifying household water-consumption behaviors
Household water-consumption behavior classification has evolved through various methods, each with distinct advantages and limitations. Early studies relied on tree-based algorithms, such as Trace Wizard (DeOreo et al., 1996) and Identiflow (Kowalski and Marshallsay, 2003), which classify water consumption based on physical characteristics like volume, duration, and flow rate. Another approach involves Bayesian-based methods (such as HydroSense (Froehlich et al., 2011)) that integrate data from multiple pressure sensors across household appliances. These methods leverage probabilistic models to achieve moderate or even quite high accuracy (e.g., 70% (Nguyen et al., 2013) and 90% (Froehlich et al., 2011)). However, the high cost of installing numerous pressure sensors and the associated privacy concerns make them impractical for large-scale applications.
To address these challenges, machine learning methods have been increasingly adopted in recent years. Long Short-Term Memory (LSTM) networks are particularly advantageous for capturing complex temporal dependencies inherent in time series data related to water consumption (Bennett et al., 2013; Cascone et al., 2023; Ismail Fawaz et al., 2019). For instance, an LSTM model applied to data from 83 households achieved an average root mean square error (RMSE) of 0.40, demonstrating its strong predictive capabilities for various water-consumption behaviors (Rahim et al., 2019). As for classification, Random Forest (RF) classifiers have been utilized for household water-consumption behavior classification based on high-resolution data obtained from smart water meters. These models have consistently shown high performance, achieving weighted F1-scores above 0.85 when trained on datasets aggregated at different temporal resolutions (Heydari et al., 2022). Comparative studies indicate that RF outperforms other traditional algorithms like Support Vector Machines (SVM) and Logistic Regression (Log-reg), establishing itself as a preferred method in this domain (Heydari and Stillwell, 2024). In addition, hybrid machine learning models that combine multiple algorithms have shown great promise by leveraging the strengths of different algorithms (Huang et al., 2022; Manandhar et al., 2023; Nguyen et al., 2015). For instance, Autoflow and EU2016 utilize hidden Markov models (HMM), artificial neural networks (ANN), and dynamic time warping (DTW) to decompose total water consumption into specific behaviors, achieving accuracies between 85% and 90% (Beal et al., 2011; Bennett et al., 2013; Nguyen et al., 2014). Despite their effectiveness, these methods typically require high-resolution data collected at subminute intervals, which increases costs.
2.2. Data acquisition and temporal resolution
The optimal balance between the temporal resolution of water consumption data and model accuracy remains uncertain (Hall et al., 2025). In existing studies, the intervals for collecting water consumption data ranged from a few seconds to a few hours. Subminute resolution data are widely adopted for the classification of water consumption (Mazzoni et al., 2023). Collection at this resolution typically requires invasive equipment (Bastidas Pacheco et al., 2022), as sensors or custom hardware and software must be added to meters (Stewart et al., 2018). High-resolution data collection often incurs greater costs, such as the need for more precise equipment and the difficulty in recruiting participants. However, coarser temporal resolutions, although more cost-effective, may compromise the effectiveness of the models. For instance, resolutions coarser than an hour are considered relatively crude for classifying water-consumption behaviors (Britton et al., 2013). The trade-off between classification accuracy and cost highlights the significance of investigating the effectiveness of water-consumption behavior classification across different minute-level resolutions.
The challenge of data acquisition could be partially addressed by the advent of technologies such as smart water meters and flow sensors (Darby, 2010; Gurung et al., 2015; Stewart et al., 2018). Measurement methods are primarily divided into invasive and non-invasive types (Cominola et al., 2018). Invasive measurement involves installing sensors on individual water-consuming devices, such as washing machines and showerheads, and is commonly used for water-consumption behavior classification. For instance, one study installed 92 flow sensors in a household to record second-by-second readings of the sensors (i.e., the temporal resolution is 1 s) (Kropp et al., 2022). Yet, invasive measurement has high costs and raises privacy concerns (Attallah et al., 2021; Heydari et al., 2022; Kropp et al., 2022; Mazzoni et al., 2023; Meyer et al., 2021; Nguyen et al., 2015). In contrast, non-invasive measurement records the total household water consumption at a single monitoring point in each household, using a meter installed by the water utility company, which makes it more suitable for large-scale use (Ellert et al., 2015; Vitter and Webber, 2018a).
2.3. Proxy of electricity in household water-consumption behavior classification
Existing approaches that reach high classification accuracy (exceeding 81% (Attallah et al., 2021; Kropp et al., 2022; Mazzoni et al., 2021; Meyer et al., 2021; Nguyen et al., 2015; Rahim et al., 2021)) often rely on multidimensional water-related metrics, such as start and end times of water-consumption behaviors (Mazzoni et al., 2021; Rahim et al., 2021), duration (Kropp et al., 2022; Mazzoni et al., 2021; Meyer et al., 2021; Nguyen et al., 2015; Rahim et al., 2021), total water consumption (Mazzoni et al., 2021; Nguyen et al., 2015), water flow rate (Attallah et al., 2021; Mazzoni et al., 2021; Meyer et al., 2021), and maximum flow rate (Kropp et al., 2022; Nguyen et al., 2015; Rahim et al., 2021), which require high-resolution data and invasive measurements. These approaches are effective but resource-intensive. By contrast, electricity consumption data, which can be easily obtained from household smart meters without additional monitoring devices, offer a scalable and non-invasive alternative.
The theoretical foundation for incorporating electricity consumption lies in the water-energy nexus, which highlights the interdependence between water and energy use in households (Vitter and Webber, 2018a, 2018b). Major water-consumption behaviors, such as laundry, dishwashing, and bathing, are inherently energy-intensive due to their reliance on appliances like washing machines and water heaters (Fidar et al., 2010; Plappally, 2012; Wang et al., 2022). Studies have demonstrated that leveraging electricity consumption data can significantly enhance classification accuracy (Ellert et al., 2015; Nguyen et al., 2017; Vitter and Webber, 2018a, 2018b). For instance, using a binary variable (0/1) to indicate the operational status of key appliances, such as washing machines and dishwashers, improved the accuracy of laundry and dishwashing classifications from 71% to 87% (Vitter and Webber, 2018a). Furthermore, incorporating detailed electricity consumption data through circuit-level monitoring increased overall classification accuracy from 90.4% to 93.1% (Nguyen et al., 2017). These findings underscore the potential of electricity data as a proxy for water-consumption behaviors, capturing patterns that are otherwise difficult to discern using water-related metrics alone (Bongungu et al., 2022; Li et al., 2024; Wang et al., 2023).
3. Data and method
3.1. Research framework
The research design encompassed three components (Fig. 1). First, historical time series data on water and electricity consumption were collected from three households in Beijing. Second, a hybrid model combining the LSTM and RF models was developed to classify the three water-consumption behaviors considered in this study: bathing, cooking, and laundry. Finally, a comprehensive evaluation was conducted to assess the performance of the proposed model, including the model's overall performance (i.e., macro F1 score) and individual behavioral performance (i.e., F1 score). The influence of different temporal resolutions and electricity proxies on the classification of water-consumption behavior was also examined.
3.2. Input features and output labels
Three types of input features are used in this study: water consumption, electricity consumption, and time. Water consumption is a direct indicator of household water-consumption behaviors. To effectively capture time-related patterns, this study used sine and cosine transformations for the time variable, considering the temporal sequence within a day and the cyclic nature of daily patterns (Mahajan et al., 2021). Employing this commonly used method for encoding cyclical data, the time variable was transformed into two dimensions, $T_{\sin}$ and $T_{\cos}$, as shown in formulas (1) and (2).

$$T_{\sin}^{i} = \sin\left(\frac{2\pi i}{\max(i)}\right) \tag{1}$$

$$T_{\cos}^{i} = \cos\left(\frac{2\pi i}{\max(i)}\right) \tag{2}$$

where
$T_{\sin}^{i}$ represents the sine transformation of the time variable $i$;
$T_{\cos}^{i}$ represents the cosine transformation of the time variable $i$;
$i$ represents the time variable, indicating a specific time point within the day (e.g., if the temporal resolution is 5 min, $i = 1$ represents the time interval 00:00–00:05, and $\max(i)$ is 288).

Fig. 1. Research design.
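The cyclical time encoding in formulas (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code; the function names `steps_per_day` and `encode_time_of_day` are hypothetical, and the index convention (i runs from 1 to max(i)) follows the example given in the text.

```python
import math


def steps_per_day(resolution_min: int) -> int:
    """Number of time steps in one day, i.e. max(i); 288 at a 5-min resolution."""
    return 24 * 60 // resolution_min


def encode_time_of_day(i: int, max_i: int) -> tuple[float, float]:
    """Return (T_sin, T_cos) for time index i in 1..max_i, per formulas (1) and (2)."""
    angle = 2.0 * math.pi * i / max_i
    return math.sin(angle), math.cos(angle)


# Example: the first 5-min interval of the day (00:00-00:05) has i = 1.
max_i = steps_per_day(5)
t_sin, t_cos = encode_time_of_day(1, max_i)
```

The encoding maps the last step of one day and the first step of the next to neighboring points on the unit circle, so the model does not see an artificial jump at midnight.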
The output labels represent household water-consumption behavior. In this study, bathing, cooking, and laundry were selected as the labeled behaviors. These three behaviors are the dominant household water-consumption behaviors, accounting for 43%–70% of total household water consumption (Zhang et al., 2021). Furthermore, the durations of these behaviors, in contrast to instantaneous actions, such as toilet flushing, are relatively extended, making them well suited for classification at minute-level temporal resolutions.
3.3. Water-consumption behavior classification modeling
This study proposes a hybrid model that combines LSTM and RF models to classify water-consumption behaviors occurring at a particular moment. RF is a traditional non-probabilistic classifier that can improve the classification performance and robustness through ensemble capabilities (Breiman, 2001). The LSTM is a classic neural network architecture specifically designed to handle long-term sequential data while retaining relevant information, making it suitable for capturing complex relationships within time-series data (Sagheer and Kotb, 2019). The model comprises four main parts.
a) Input Data: This part primarily includes the time information, water consumption data, and electricity consumption data (as a proxy) for the target and historical moments.
b) LSTM: The LSTM model is employed to extract features from historical time information and predict the probabilities of different water-consumption behaviors. The hyperparameters of the LSTM model used in this study are listed in Supplementary Material Table S1.
c) RF: The probabilities of each water-consumption behavior occurrence, along with the original input data consisting of time, water consumption, and electricity consumption, are collectively input into the RF model to obtain the preliminary classification results of water-consumption behaviors. The hyperparameters of the RF model used in this study are listed in Supplementary Material Table S2.
d) Correction Module: Because the water-consumption behavior at each time step is independently classified, errors can occur when classifying specific time steps within long and complete water-consumption behaviors (details are provided in the Supplementary Material Fig. S1). The correction module is designed to overcome this limitation. It evaluates the consistency of the classification results for adjacent time steps and determines whether the same behavior occurs. Detailed rules and a flowchart of the correction module are shown in Supplementary Material Fig. S2.
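Two of the steps above lend themselves to a short sketch: assembling the RF input in part c) and the neighbor-consistency idea in part d). The code below is illustrative only. The actual correction rules are in Supplementary Material Fig. S2 and are not reproduced here, so `smooth_isolated_labels` uses an assumed, simplified rule (an isolated label is replaced when both neighbors agree with each other); both function names are hypothetical.

```python
from typing import List, Sequence


def build_rf_input(lstm_probs: Sequence[float],
                   raw_features: Sequence[float]) -> List[float]:
    """Part c): concatenate the LSTM class probabilities for one time step
    with the original features (time encoding, water, electricity)."""
    return list(lstm_probs) + list(raw_features)


def smooth_isolated_labels(labels: Sequence[str]) -> List[str]:
    """Part d), ASSUMED rule (not the authors' Fig. S2): if the steps before
    and after t carry the same label and step t differs, relabel step t."""
    corrected = list(labels)
    for t in range(1, len(labels) - 1):
        prev_label, next_label = labels[t - 1], labels[t + 1]
        if prev_label == next_label and labels[t] != prev_label:
            corrected[t] = prev_label
    return corrected


# One time step: probabilities for (bathing, cooking, laundry) followed by
# hypothetical raw features (T_sin, T_cos, water in L, electricity in kWh).
row = build_rf_input([0.1, 0.7, 0.2], [0.5, 0.87, 12.0, 0.03])

# A single "cooking" step inside a laundry event is treated as an error.
fixed = smooth_isolated_labels(["laundry", "cooking", "laundry", "laundry"])
```

The neighbors are read from the uncorrected sequence, so the result does not depend on the order in which time steps are visited.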
3.4. Model performance evaluation
The household water-consumption behavior classification model can be considered a multi-class classifier. For individual behavior classification, four types of results may occur: true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN), as shown in Table 1. Metrics such as precision, recall, and F1 score are commonly used to evaluate classifier performance (Grandini et al., 2020) and are defined in formulas (3)–(5). The precision represents the proportion of correctly classified occurrences among behaviors classified as a certain water-consumption behavior, whereas the recall represents the proportion of correctly classified occurrences among the actual occurrences of a certain water-consumption behavior (Goutte and Gaussier, 2005). In general, precision and recall tend to be negatively correlated. The F1 score combines the precision and recall of each classification into a single measure.
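$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{3}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{4}$$

$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{5}$$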
A macro F1 score is used to measure the overall performance of the model. It addresses the potential biases caused by imbalanced samples by giving equal weight to each behavior class. The macro F1 score was calculated as the average of the individual behavioral F1 scores obtained from formula (5), as shown in formula (6).
$$\mathrm{Macro\ F1} = \frac{1}{3}\left(F1_{\mathrm{Bathing}} + F1_{\mathrm{Cooking}} + F1_{\mathrm{Laundry}}\right) \tag{6}$$
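Formulas (3)–(6) can be checked by hand on a small example. The sketch below computes per-behavior precision, recall, and F1 in a one-vs-rest fashion and then averages the F1 scores; the function names and the six-sample label lists are made up for illustration and are not data from the study.

```python
from typing import Sequence, Tuple


def per_class_scores(y_true: Sequence[str], y_pred: Sequence[str],
                     label: str) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for one behavior, per formulas (3)-(5)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str],
             labels: Sequence[str]) -> float:
    """Unweighted mean of the per-behavior F1 scores, per formula (6)."""
    return sum(per_class_scores(y_true, y_pred, l)[2] for l in labels) / len(labels)


# Illustrative labels only (not study data).
y_true = ["bathing", "cooking", "cooking", "laundry", "laundry", "cooking"]
y_pred = ["bathing", "cooking", "laundry", "laundry", "cooking", "cooking"]
score = macro_f1(y_true, y_pred, ["bathing", "cooking", "laundry"])
# Per-behavior F1: bathing 1, cooking 2/3, laundry 1/2, so score = 13/18.
```

Because each behavior contributes one third of the score regardless of its sample count, a model that only does well on the dominant cooking class cannot reach a high macro F1.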
3.5. Model comparison
To assess the effectiveness of the developed hybrid model, its performance was compared with that of the standalone LSTM and RF models. Additionally, the effectiveness of the models at different temporal resolutions (5, 10, 20, and 30 min) was evaluated to determine the appropriate temporal resolution for household water consumption management purposes.
Furthermore, the classification results of the hybrid model were compared between two input configurations: water consumption only, and water consumption together with the electricity proxy. This comparison was made by calculating the macro and individual behavioral F1 scores. The disparities in performance between the water-only and water-electricity input models indicate the effectiveness of considering the water-energy nexus.
3.6. Data collection
3.6.1. Case study
Data were collected from three volunteer households (HH1–HH3) in Beijing, China, from January to February 2020. The demographic characteristics of the three households are shown in Supplementary Material Table S3. Note that it was winter in Beijing, and the heating system used was municipal central heating rather than energy-intensive methods, such as heat pumps or air conditioners. Two categories of data were collected: smart-metered water/electricity consumption and behavioral record data. Smart water and electricity meters provided by Evavisdom were installed in the three households. These smart meters facilitated data collection through image reading or infrared transmission, allowing real-time data to be uploaded to the cloud platform (www.evavision.cn) using narrowband Internet of Things (NB-IoT) technology. Considering the smart meter battery capacity, NB-IoT signal strength, and transmission speed, the time interval of the smart metering was set to 5 min. The measurement units were 0.1 L for water consumption and 0.01 kWh for electricity consumption. Residents of the three households provided records of their water-consumption behaviors, as shown in Supplementary Material Table S4. These records included information on the specific types of water-consumption behaviors as well as the start and end times of each behavior.
3.6.2. Data processing
Table 1
Confusion matrix.
                  Modeling results
                  1      0
Ground truth  1   TP     FN
              0   FP     TN
Notes: "0" represents the absence of a specific behavior, whereas "1" represents the occurrence of that behavior.

The water/electricity consumption data were cleaned and missing values were filled by averaging the readings before and after the missing interval. Behavioral records and meter data were then matched based on time. This processing resulted in a dataset that contained household water-consumption behaviors, $T_{\sin}$, $T_{\cos}$, and water and electricity consumption at each time step at 5-min intervals. In total, 1923 records were included in the study, with bathing accounting for 10.7%, cooking for 64.8%, and laundry for 24.5%. In addition, the dataset was converted into three other versions at 10-, 20-, and 30-min intervals. The sample sizes at different temporal resolutions are listed in Table 2.
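The gap-filling rule described in this subsection (averaging the readings before and after a missing interval) can be sketched as follows. This is one plausible reading of that sentence, not the authors' script: every missing step in a gap receives the mean of the two bounding readings, and gaps that touch the start or end of the series are left unfilled. The function name `fill_missing` and the use of `None` to mark missing readings are assumptions.

```python
from typing import List, Optional, Sequence


def fill_missing(readings: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Fill each run of None values with the mean of the reading just before
    and the reading just after the run (assumed interpretation)."""
    filled = list(readings)
    n = len(filled)
    t = 0
    while t < n:
        if filled[t] is None:
            start = t
            while t < n and filled[t] is None:
                t += 1  # advance to the first reading after the gap
            before = filled[start - 1] if start > 0 else None
            after = filled[t] if t < n else None
            if before is not None and after is not None:
                value = (before + after) / 2.0
                for k in range(start, t):
                    filled[k] = value
            # Gaps at either end of the series have only one neighbor and stay None.
        else:
            t += 1
    return filled


# Two consecutive missing 5-min readings between 1.0 L and 3.0 L.
series = fill_missing([1.0, None, None, 3.0])
```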
Datasets with different temporal resolutions were randomly split into training (70%) and testing (30%) sets. Notably, there was a significant discrepancy in sample sizes across behaviors, with cooking behavior having a substantially larger amount of data (ranging from 65% to 75%) than bathing and laundry behaviors. To address the potential bias towards the larger class and ensure balanced learning, the Synthetic Minority Over-sampling Technique (SMOTE) was employed to oversample the bathing and laundry behaviors in the training set (Fernández et al., 2018). Detailed information is provided in Supplementary Material S1. This balancing procedure resulted in approximately equal sample sizes for the three behavioral classes in the training set.
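The core idea of SMOTE is to create synthetic minority samples by interpolating between a minority sample and one of its nearest minority-class neighbors. The sketch below shows that idea in plain Python; it is a simplified illustration, not necessarily the implementation or parameter settings used in the study (those are in Supplementary Material S1). The function name `smote_like_oversample`, the default `k`, and the toy feature vectors are assumptions.

```python
import random
from typing import List, Sequence


def smote_like_oversample(minority: Sequence[Sequence[float]], n_new: int,
                          k: int = 5, seed: int = 0) -> List[List[float]]:
    """Generate n_new synthetic samples by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [x for x in minority if x is not base]
        others.sort(key=lambda x: sum((a - b) ** 2 for a, b in zip(base, x)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic


# Toy minority class with two features (e.g., scaled water and electricity).
bathing = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_samples = smote_like_oversample(bathing, n_new=4, k=2)
```

As in the text, oversampling of this kind is applied to the training set only, so that the test set keeps the original class proportions.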
4. Results
4.1. Descriptive analysis of household water-consumption behavior
4.1.1. The correlation between household water and electricity consumption
As shown in Fig. 2, the peak and off-peak periods in water and electricity consumption generally align. For example, in the case of HH2, high electricity and water consumption peaks occurred at 10:00, 14:00–15:00, 17:00, and 20:00–21:00, whereas low electricity and water consumption valleys were observed at 16:00, 18:00–19:00, and 22:00–23:00. Statistically, there was a significant correlation between the average hourly water and electricity consumption throughout the day. The correlation coefficients for hourly water consumption and electricity consumption were 0.94, 0.85, and 0.63 for HH1, HH2, and HH3, respectively. This finding supports the feasibility of using electricity consumption as a proxy for water consumption in the following modeling.
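A correlation coefficient between two hourly profiles can be computed as in the sketch below. The text does not state which coefficient was used; the Pearson coefficient is assumed here, and the four-value profiles are invented for illustration rather than taken from the households.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Illustrative average hourly profiles (not study data).
water_l = [2.0, 10.0, 6.0, 14.0]
electricity_kwh = [0.1, 0.5, 0.3, 0.8]
r = pearson_r(water_l, electricity_kwh)
```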
4.1.2. Average duration of different behaviors
Understanding the duration of water-consumption behaviors is crucial for accurate classification. The temporal resolution should be shorter than the behavior duration to capture multiple data points within a single event and reveal consumption trends. Moreover, variations in duration serve as a key temporal feature that enhances the model's ability to distinguish between different behaviors, thereby improving classification accuracy.
The average duration of household water-consumption behaviors in this study exceeded the temporal resolution of 5 min, as shown in Fig. 3. Across the three households, bathing, cooking, and laundry behaviors had average durations of 13.5 min, 18.5 min, and 41.4 min, respectively. These behaviors exhibited considerable variations in duration, with standard deviations of 7.1, 15.7, and 23.9 min, respectively. There were also differences in the behavior duration among the three households. The average bathing duration was similar across all three households. For cooking behavior, HH1 had the longest duration, averaging 26.9 min, followed by HH3, whereas HH2 had the shortest duration, averaging 13.5 min. For laundry behavior, HH1 and HH2 had similar durations, averaging 32.3 and 35.9 min, respectively; however, HH3 had an average duration of 76.9 min, which may be attributed to the difference between an impeller washing machine (used in HH1 and HH2) and a drum washing machine (used in HH3).
4.1.3. Time distribution of behaviors within a day
The time distributions of different water-consumption behaviors in the three households are summarized in Fig. 4. Overall, 54% of the bathing behaviors occurred during 20:00–24:00. Cooking behaviors were observed mainly during the period 8:00–20:00, indicating frequent occurrence during breakfast, lunch, and dinner. Laundry behaviors were more common in the morning, with 43% of laundry behaviors occurring within the interval 8:00–12:00.
4.2. Model performance comparison
4.2.1. Overall performance
The macro F1 scores for the models at various temporal resolutions are shown in Fig. 5 (details are provided in Supplementary Material Table S5). Across different temporal resolutions, the average macro F1 score of the post-correction hybrid model was 0.82. The hybrid model outperformed the standalone models (i.e., RF and LSTM models) by 6.9%–13.4%, indicating the advantage of combining LSTM and RF for household water-consumption behavior classification. A comparison of the results of the hybrid model before and after the correction module revealed that the post-correction model exhibited an improvement of 3.9%–7.8% in the macro F1 score.
Increasing the temporal resolution from 30 to 5 min raised the macro F1 score from 0.78 to 0.89, a relative improvement of 14.1%. Notably, the hybrid model achieved its highest performance at the 5-min resolution, with a macro F1 score of 0.89.
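As a quick arithmetic check, the 14.1% figure is the relative change between the two macro F1 scores reported for the 30-min and 5-min resolutions:

$$\frac{0.89 - 0.78}{0.78} = \frac{0.11}{0.78} \approx 0.141 = 14.1\%$$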
4.2.2. Individual behavior performance
Among the three water-consumption behaviors, the post-correction hybrid model demonstrated the best performance in classifying cooking behavior (Fig. 6 and Supplementary Material Fig. S3), with an F1 score of 0.94 at the 5- and 10-min temporal resolutions. For the other behaviors, the classification performance of bathing behavior exceeded that of laundry behavior by 0.03–0.14, except at a 30-min resolution. Notably, the classification performance of bathing and laundry behaviors is influenced by the temporal resolution. Increasing the resolution from 30 to 5 min resulted in a substantial improvement of 0.21 in the F1 score for bathing behavior and 0.10 for laundry behavior. However, the improvement for cooking behavior was relatively modest, with only a slight increase of 0.01 in the F1 score.
Furthermore, the post-correction model consistently outperformed
the pre-correction model. The correction module enhanced the F1 score
of bathing and laundry behaviors, with a maximum increase of 0.09 for
bathing behavior and 0.08 for laundry behavior. Conversely, cooking
behavior achieved a high F1 score >0.90, even before the application of
the correction module, indicating a relatively smaller impact from the
correction process.
4.3. The effectiveness of water-energy nexus on classifying water-
consumption behaviors
The inclusion of the water-energy nexus significantly enhanced
the classification performance of the hybrid model. As depicted in Fig. 7,
when the electricity proxy was incorporated, the hybrid model
achieved macro F1 scores ranging from 0.78 to 0.89, an
increase of 0.04–0.08 over the model without the electricity
proxy. Moreover, the integration of electricity consumption
consistently improved the classification performance for all three
behaviors, with particularly notable enhancements observed in bathing
(the highest, up to 0.14) and laundry behaviors. The F1 scores for
cooking and laundry behaviors showed increases of 0.01–0.04 and
0.02–0.20, respectively.

Table 2
Sample sizes of water-consumption behaviors at different temporal resolutions.
| Water end-use behavior | 5 min | 10 min | 20 min | 30 min |
|---|---|---|---|---|
| Bathing | 205 | 144 | 181 | 134 |
| Cooking | 1247 | 817 | 978 | 768 |
| Laundry | 471 | 259 | 182 | 125 |

The varying impacts of the water-energy nexus can be attributed to
the inherent characteristics of each behavior. Bathing and laundry
behaviors often involve fixed appliances, such as water heaters
and washing machines, which exhibit a consistent coupling
between water and electricity use. Consequently, the water-energy nexus
plays a more significant role in these behaviors, resulting in a relatively
large improvement in their classification performance. Cooking
behavior, in contrast, displays a diverse range of water and electricity
consumption patterns owing to the various cooking techniques used. This
diversity weakens the correlation between water and
electricity consumption, thereby diminishing the effectiveness of
the water-energy nexus.
5. Discussion
The proposed hybrid model, which integrates LSTM networks with RF
to classify household water-consumption behaviors, improves the
understanding of time-series patterns. This integration establishes a
critical link between data resolution and classification accuracy,
demonstrating that a temporal resolution of 5 min outperforms the
subminute or hourly resolutions that are widely adopted in existing
studies. Furthermore, the model provides a framework for analyzing the
relationship between electricity consumption data and
water-consumption behaviors, an area that has been underexplored in the
existing literature. This framework leverages data from smart water and
electricity meters to enable accurate behavior classification without
relying on high temporal resolution, thereby reducing the complexity
and cost of data collection systems. It also offers a practical
foundation for homeowners to develop tailored water conservation
strategies and supports scalable applications in diverse residential
households.
5.1. Insights from comparative analysis
The choice of data-collection resolution must balance predictive
capability against cost in both academic research and practical management.
Previous research on classifying household water-consumption behavior
has predominantly focused on subminute temporal resolutions, such as
5 s (Mazzoni et al., 2023). Such resolutions entail high costs for
intelligent metering devices, data storage, and computational resources
(Cominola et al., 2018). At the other extreme, hourly data have been
shown to be too coarse to accurately classify household
water-consumption behaviors (Britton et al., 2013). Therefore, in this
study, the highest data resolution was set to 5 min, lowering the required
temporal resolution relative to previous studies. Our results
demonstrated that the 5-min resolution yields the best classification
performance, with the 5- and 10-min resolutions achieving macro F1 scores
>0.80 and significantly outperforming the 20- and 30-min resolutions. This
suggests that higher resolutions generally lead to better classification
accuracy, with an optimal temporal resolution of 5 min. Moreover, to
discern the impact of the input data size, this study also conducted an
experiment in which the training sets of the 10-, 20-, and 30-min
resolution datasets were oversampled to match the size of the 5-min
resolution dataset. The results indicated that, as in the classification
results before dataset augmentation, the 5-min resolution still exhibited
the best performance (Supplementary Material Table S6).

Fig. 2. Average hourly water and electricity consumption of HH1 (a), HH2 (b), HH3 (c) and the three households in total (d) during the study period.
Fig. 3. Duration of the considered water-consumption behaviors of the three households.
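The oversampling step in this experiment can be illustrated with a minimal sketch. The section does not specify which oversampling technique was used, so the code below assumes plain random oversampling with replacement. The function name oversample_to_size and the stand-in dataset are hypothetical, and the Table 2 totals serve only as illustrative sizes.

```python
import random


def oversample_to_size(samples, target_size, seed=0):
    """Randomly duplicate samples (with replacement) until target_size is reached.

    All original samples are kept; only the extra ones are drawn at random.
    """
    if not samples:
        raise ValueError("cannot oversample an empty dataset")
    if target_size <= len(samples):
        return list(samples)
    rng = random.Random(seed)
    extra = rng.choices(samples, k=target_size - len(samples))
    return list(samples) + extra


# Totals from Table 2: 5-min = 205 + 1247 + 471, 10-min = 144 + 817 + 259.
size_5min = 205 + 1247 + 471
size_10min = 144 + 817 + 259

# Hypothetical stand-in for the 10-min samples (indices only).
train_10min = list(range(size_10min))
matched = oversample_to_size(train_10min, size_5min)
print(len(train_10min), "->", len(matched))
```

Only the training split should be resampled in this way. Duplicating test samples would distort the evaluation.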
Using a hybrid model combining LSTM and RF, this study increased
the accuracy of classifying household water-consumption behaviors. The LSTM
networks helped to better capture the time-series patterns of the
input data and extract the behavioral possibilities hidden behind the
consumption data, whereas the RF provided a more precise analysis of
probabilistic tabular data (Grinsztajn et al., 2022). Combining their
advantages can further enhance the accuracy. Compared with the standalone
models, the performance of the hybrid model (pre-correction) increased
by 11.0% on average and by up to 13.4%. We also compared
our model with those used in previous studies (Table 3). Although the
primary evaluation metric in this study was the macro F1 score, the
results for other indicators are also provided (Supplementary Material
Table S7). Previous models achieved weighted F1 scores ranging from
0.71 to 0.89, whereas the model proposed in this study achieved a
weighted F1 score of 0.87–0.91. Similarly, previous studies reported
accuracy ranging from 0.81 to 0.98, whereas the accuracy of the model
proposed in this study ranged from 0.87 to 0.90. Regarding sensitivity,
previous research has reported models with sensitivities >0.70, with
some studies reaching 0.95. In this study, the sensitivity of the proposed
model ranged from 0.75 to 0.89. Therefore, compared with existing
research, this study achieved comparable performance in classifying
water-consumption behavior using relatively low-temporal-resolution
data.

Fig. 4. Time distribution of the considered water-consumption behaviors of the three households.
Fig. 5. Macro F1 scores of hybrid models at different time resolutions (a) and the improvements brought about by hybrid (b) and correction (c).
Fig. 6. Comparison of the three behaviors' F1 scores before and after model correction.
Fig. 7. Effect of the electricity proxy on the macro F1 score and the three behaviors' F1 scores.
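Because the comparison above mixes macro F1, weighted F1, accuracy, and sensitivity, a small sketch of how these indicators differ may be useful. It uses the standard textbook definitions and assumes that sensitivity is the macro-averaged recall. The study's own calculations are described in Supplementary Material S2 and may differ in detail. The confusion matrix is invented for illustration.

```python
def summarize(confusion):
    """Accuracy, macro F1, weighted F1 and macro sensitivity (recall).

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    f1_scores, recalls, supports = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        support = sum(confusion[k])                         # true members of class k
        predicted = sum(confusion[i][k] for i in range(n))  # predicted as class k
        fn, fp = support - tp, predicted - tp
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
        recalls.append(tp / support if support else 0.0)
        supports.append(support)
    return {
        "accuracy": sum(confusion[k][k] for k in range(n)) / total,
        "macro_f1": sum(f1_scores) / n,
        "weighted_f1": sum(f * s for f, s in zip(f1_scores, supports)) / total,
        "macro_sensitivity": sum(recalls) / n,
    }


# Hypothetical confusion matrix (rows = true, columns = predicted) for
# bathing, cooking and laundry; the counts are invented for illustration.
confusion = [
    [15, 3, 2],
    [2, 95, 3],
    [3, 5, 32],
]
metrics = summarize(confusion)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this toy example the majority class (cooking) is also the best-classified one, so the weighted F1 exceeds the macro F1. The same mechanism may explain why the weighted F1 range reported here sits above the macro F1 range, given the class imbalance in Table 2.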
Understanding the impact of different temporal resolutions on the
classification performance of a model is crucial for effective water utility
management. Our study reveals that higher temporal resolutions
generally lead to improved macro F1 scores, as evidenced by an increase
of 14.1% (0.78–0.89) when moving from a 30-min temporal
resolution to a 5-min temporal resolution. In addition, the sensitivity of
specific behaviors to temporal resolution highlights the importance of
selecting an appropriate resolution based on the behavior under
investigation, considering the trade-off between model accuracy and cost. For
instance, bathing and laundry exhibit increased classification
performance at higher temporal resolutions owing to their longer durations
and more regular water consumption patterns. By contrast, cooking
behavior, which is characterized by shorter durations and diverse
activities, is less sensitive to temporal resolution.
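To make the effect of temporal resolution concrete, the sketch below aggregates a 5-min series to coarser intervals by summing consecutive readings. This section does not describe how the coarser datasets were derived, so block summation is an assumption, and the function name aggregate and the readings are hypothetical.

```python
def aggregate(series, factor):
    """Sum consecutive blocks of `factor` readings into one coarser reading.

    A trailing partial block is dropped so every output covers a full interval.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    usable = len(series) - len(series) % factor
    return [sum(series[i:i + factor]) for i in range(0, usable, factor)]


# Hypothetical water use in litres per 5-min interval over one hour:
# a short draw in the second interval and a longer one in the last 20 minutes.
water_5min = [0, 6, 0, 0, 0, 0, 0, 0, 12, 14, 13, 11]

water_10min = aggregate(water_5min, 2)   # 5 min -> 10 min
water_30min = aggregate(water_5min, 6)   # 5 min -> 30 min
print(water_10min)
print(water_30min)
```

At 30 min each event collapses into a single value, so its duration and shape are no longer visible to a classifier.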
Moreover, the water-energy nexus, represented here by the electricity proxy,
enables resource managers to estimate and understand the interplay
between water and electricity consumption at the household level.
Previous studies have largely overlooked the water-energy nexus, as
shown in Table 3. The few studies that have considered it
have only incorporated binary variables indicating the
usage of washing machines or dishwashers (Vitter and Webber, 2018a,
2018b). This study highlights the importance of considering the
water-energy nexus by using electricity consumption as a proxy in
water-consumption behavior classification. Our results demonstrate an
average improvement of 8.63% in classification
performance through the integration of the water-energy nexus. Water and
electricity consumption are closely interconnected
within households, as the activities considered in this study (bathing,
cooking, and laundry) involve both.
Similarly, in the classification of electricity-consumption behavior,
water consumption could be considered as a potential proxy. Policymakers
should incorporate proxy mechanisms for other resources, such
as energy, when formulating policies related to household water
consumption.
5.2. Practical implications
The hybrid model presented in this study highlights the value of
mining water and energy consumption datasets, particularly in the
context of the rapid adoption of smart water meters and smart home
appliances. By leveraging water and electricity consumption data
analytics, this model supports fine-grained management strategies that can lead to
more sustainable practices. The findings demonstrate that accurate
classification of household water-consumption behaviors can be
achieved without relying on high-temporal-resolution data, alleviating the
burden associated with deploying complex and costly data collection
systems. Furthermore, this hybrid model is not constrained by
family-specific characteristics and can effectively explore behavioral patterns
as long as sufficient data are available. Although the current
implementation has been tested on only three households, its underlying
framework shows strong potential for large-scale application across
diverse households.
Accurately classifying household water-consumption behaviors using
the proposed model can provide valuable insights that enhance
households' understanding and awareness of their water use, which is crucial
for developing tailored water-saving recommendations. These insights
can then be leveraged to promote more responsible consumption
patterns and empower individuals to make informed decisions about their
water usage. In contexts where marginal pricing of resources is not
feasible, raising water conservation awareness and utilizing
behavior-driven strategies for resource conservation become especially
important (Olmstead and Stavins, 2009; Vivek et al., 2021).

Table 3
A comparison of existing household water-consumption behavior classification studies.
| Place studied | Year studied | Temporal resolution | Methods | Electricity proxy | Indicators | Performance | References |
|---|---|---|---|---|---|---|---|
| China | 2020 | 5 min, 10 min, 20 min, 30 min | LSTM + RF | ✓ | Macro F1 score | 0.78–0.89 | This study |
| | | | | | Weighted F1 score | 0.87–0.91 | |
| | | | | | Accuracy | 0.87–0.90 | |
| | | | | | Sensitivity | 0.75–0.89 | |
| Italy, Netherlands | 2018 for Italy, 2019–2020 for Netherlands | 1 min | Rules | × | Accuracy | 91% | Mazzoni et al. (2024) |
| USA | 2011 | 10 s | Model-based method (DTW); learning-based method (SVM, RF, XGBoost, MLP) | × | Weighted F1 score | 0.71–0.72 | Pavlou et al. (2024) |
| USA | / | 4 s | DBSCAN + RF | × | Accuracy | 98% | Attallah et al. (2023) |
| Italy | 2018 | 1 min | Rules | × | Appliance contribution accuracy | 90.4%–97.5% | Mazzoni et al. (2021) |
| South Africa, Australia | 2016–2018 for South Africa, 2010–2012 for Australia | 5 s | SVM + RF + EDS | × | Accuracy | 81%–98% | Meyer et al. (2021) |
| Australia | 2010–2012 | 5 s | HMM + ANN | × | Accuracy | 85.9%–96.1% | Nguyen et al. (2015) |
| USA | 2021 | 5 s, 10 s, 30 s, 1 min | RF | × | Weighted F1 score | 0.73–0.89 | Heydari et al. (2022) |
| Australia | 2010–2013 | / | SOM + K-means + HMM + ANN | × | Accuracy | 86%–94.2% | Yang et al. (2018) |
| Spain | 2019–2020 | 5 s | Profile Recognition | × | Sensitivity | 70%–80% | Fontdecaba et al. (2013) |
| Australia | 2010–2013 | 10–120 s | SVM | ✓ | Sensitivity | 71%–87% | Vitter and Webber (2018a) |
| USA, Canada | 2016 | / | SVM | ✓ | Sensitivity | >87.1% | Vitter and Webber (2018b) |
Notes: The calculation of the indicators mentioned in this table is explained in detail in Supplementary Material S2.

5.3. Limitations
This study has several limitations that could be addressed in future
research. Firstly, testing the model on only three households limits the
generalizability of the findings. Additionally, the methodological
framework requires enhancement to increase adaptability to various
data sources and to more effectively manage missing data or anomalies.
Future studies should validate the model across a larger and more
diverse set of households while optimizing its performance under different
conditions to ensure its effectiveness in practical applications.
Secondly, while the integration of LSTM and RF effectively captures
time-series patterns, it does not account for complex combined
behaviors, such as simultaneous bathing and laundry. Future
research should explore advanced algorithms, e.g., waveform
decomposition techniques, to accurately classify these combined behaviors and
improve overall classification accuracy.
Lastly, although this study emphasizes the water-energy nexus, it
lacks a comprehensive exploration of other influencing factors, such as
demographic and economic characteristics, as well as other
water-consumption behaviors (e.g., toilet flushing and faucet use).
Incorporating these variables could provide a more nuanced understanding of
household water-consumption patterns.
6. Conclusion
This study presents a novel framework for classifying household
water-consumption behaviors using a hybrid model
that combines LSTM and RF. By investigating the impact of electricity
consumption as a proxy variable and comparing the classification
performance under different temporal resolutions (i.e., 5 min, 10 min, 20
min, and 30 min), this research proposes a practical approach that leverages
the available water and energy consumption data from smart meters.
The results demonstrate that the hybrid model outperforms the
standalone LSTM and RF models by 0.09–0.13. In addition, higher
resolutions generally lead to better classification accuracy, as evidenced by
the hybrid model's macro F1 score being 0.11 higher at the
5-min resolution than at the 30-min resolution.
Regarding specific behaviors, bathing and laundry behaviors
demonstrated improved performance with higher resolutions, with
optimal results observed at a 5-min resolution. The hybrid model
exhibited less sensitivity to temporal resolution when classifying
cooking behavior, consistently achieving an F1 score >0.92, demonstrating
the model's robustness across different activities.
The inclusion of electricity consumption as a proxy variable proved
beneficial, particularly for the classification of bathing and laundry
behaviors. This consideration resulted in notable improvements in the
F1 scores, with maximum increases of 0.12 and 0.20 for bathing and
laundry behaviors, respectively. This integration underscores the
importance of considering the water-energy nexus in future research, as
it enhances understanding of household water-consumption patterns
while simplifying data acquisition processes.
However, our study has some limitations. The study's data
acquisition was not exhaustive, and complex combined behaviors may require
advanced algorithms for accurate classification. Future research should
expand the behavior types analyzed and consider demographic factors
to provide a more comprehensive understanding of household
water-consumption patterns.
CRediT authorship contribution statement
Miao Wang: Writing – review & editing, Writing – original draft,
Methodology. Zonghan Li: Writing – review & editing, Data curation. Yi
Liu: Writing – review & editing, Supervision. Lu Lin: Writing – review &
editing, Funding acquisition. Chunyan Wang: Writing – review &
editing, Supervision, Funding acquisition.
Declaration of competing interest
The authors declare the following financial interests/personal
relationships which may be considered as potential competing interests:
Chunyan Wang reports financial support was provided by National
Natural Science Foundation of China. Lu Lin reports financial support
was provided by National Natural Science Foundation of China.
Chunyan Wang reports financial support was provided by Young Elite
Scientists Sponsorship Program by CAST. If there are other authors, they
declare that they have no known competing financial interests or
personal relationships that could have appeared to influence the work
reported in this paper.
Acknowledgements
This study was supported by the National Natural Science Foundation of
China (No. 52470212 and No. 71904203) and the Young Elite Scientists
Sponsorship Program by CAST (No. 2023QNRC001).
Appendix A. Supplementary data
Supplementary data to this article can be found online at
https://doi.org/10.1016/j.clrc.2025.100252.
Data availability
Data will be made available on request.
References
Attallah, N.A., Horsburgh, J.S., Bastidas Pacheco, C.J., 2023. An open-source,
semisupervised water end-use disaggregation and classification tool. J. Water
Resour. Plann. Manag. 149 (7), 04023024.
Attallah, N.A., Rosenberg, D.E., Horsburgh, J.S., 2021. Water end-use disaggregation for
six nonresidential facilities in Logan, Utah. J. Water Resour. Plann. Manag. 147 (7),
05021006.
Bastidas Pacheco, C.J., Horsburgh, J.S., Beckwith, A.S., 2022. Impact of data temporal
resolution on quantifying residential end uses of water. Water 14 (16), 2457.
Beal, C., Stewart, R., Huang, T., Rey, E., 2011. South East Queensland Residential End
Use Study. Urban Water Security Research Alliance Brisbane, Australia.
Bennett, C., Stewart, R.A., Beal, C.D., 2013. ANN-based residential water end-use
demand forecasting model. Expert Syst. Appl. 40 (4), 1014–1023.
Bongungu, J.L., Francisco, P.W., Gloss, S.L., Stillwell, A.S., 2022. Estimating residential
hot water consumption from smart electricity meter data. Environ. Res.:
Infrastructure and Sustainability 2 (4), 045003.
Breiman, L., 2001. Random forests. Mach. Learn. 45, 5–32.
Britton, T.C., Stewart, R.A., O'Halloran, K.R., 2013. Smart metering: enabler for rapid
and effective post meter leakage identification and water loss management. J. Clean.
Prod. 54, 166–176.
Cascone, L., Sadiq, S., Ullah, S., Mirjalili, S., Siddiqui, H.U.R., Umer, M., 2023. Predicting
household electric power consumption using multi-step time series with
convolutional LSTM. Big Data Research 31, 100360.
Cominola, A., Giuliani, M., Castelletti, A., Rosenberg, D.E., Abdallah, A.M., 2018.
Implications of data sampling resolution on water use simulation, end-use
disaggregation, and demand management. Environ. Model. Software 102, 199–212.
Cominola, A., Nguyen, K., Giuliani, M., Stewart, R.A., Maier, H.R., Castelletti, A., 2019.
Data mining to uncover heterogeneous water use behaviors from smart meter data.
Water Resour. Res. 55 (11), 9315–9333.
Cominola, A., Preiss, L., Thyer, M., Maier, H.R., Prevos, P., Stewart, R., Castelletti, A.,
2023. The determinants of household water consumption: a review and assessment
framework for research and practice. npj Clean Water 6 (1), 11.
Darby, S., 2010. Smart metering: what potential for householder engagement? Build.
Res. Inf. 38 (5), 442–457.
DeOreo, W.B., Heaney, J.P., Mayer, P.W., 1996. Flow trace analysis to assess water use.
J. Am. Water Works Assoc. 88 (1), 79–90.
Dolan, F., Lamontagne, J., Link, R., Hejazi, M., Reed, P., Edmonds, J., 2021. Evaluating
the economic impact of water scarcity in a changing world. Nat. Commun. 12 (1).
Ellert, B., Makonin, S., Popowich, F., 2015. International summit. Smart City 360.
Springer, pp. 455–467.
Fernández, A., Garcia, S., Herrera, F., Chawla, N.V., 2018. SMOTE for learning from
imbalanced data: progress and challenges, marking the 15-year anniversary. J. Artif.
Intell. Res. 61, 863–905.
Fidar, A., Memon, F., Butler, D., 2010. Environmental implications of water efficient
microcomponents in residential buildings. Sci. Total Environ. 408 (23), 5828–5835.
Flörke, M., Kynast, E., Bärlund, I., Eisner, S., Wimmer, F., Alcamo, J., 2013. Domestic
and industrial water uses of the past 60 years as a mirror of socio-economic
development: a global simulation study. Global Environ. Change 23 (1), 144–156.
Fontdecaba, S., Sánchez-Espigares, J.A., Marco-Almagro, L., Tort-Martorell, X.,
Cabrespina, F., Zubelzu, J., 2013. An approach to disaggregating total household
water consumption into major end-uses. Water Resour. Manag. 27, 2155–2177.
Froehlich, J., Larson, E., Saba, E., Campbell, T., Atlas, L., Fogarty, J., Patel, S., 2011.
A Longitudinal Study of Pressure Sensing to Infer Real-World Water Usage Events in
the Home. Springer, pp. 50–69.
Gleick, P., Wolff, G.H., Cushing, K.K., 2003. Waste Not, Want Not: the Potential for
Urban Water Conservation in California. Pacific Institute for Studies in Development,
Environment, Security.
Goutte, C., Gaussier, E., 2005. A Probabilistic Interpretation of Precision, Recall and F-
Score, with Implication for Evaluation. Springer, pp. 345–359.
Grandini, M., Bagli, E., Visani, G., 2020. Metrics for multi-class classification: an
overview. arXiv preprint. https://doi.org/10.48550/arXiv.2008.05756.
Grinsztajn, L., Oyallon, E., Varoquaux, G., 2022. Why do tree-based models still
outperform deep learning on typical tabular data? Adv. Neural Inf. Process. Syst. 35,
507–520.
Gurung, T.R., Stewart, R.A., Beal, C.D., Sharma, A.K., 2015. Smart meter enabled water
end-use demand data: platform for the enhanced infrastructure planning of
contemporary urban water supply networks. J. Clean. Prod. 87, 642–654.
Hall, R., Kenway, S., O'Brien, K., Memon, F., 2025. Quantification of residential water-
related energy needs cohesion, validation and global representation to unlock
efficiency gains. Renew. Sustain. Energy Rev. 207, 114906.
Heydari, Z., Cominola, A., Stillwell, A.S., 2022. Is smart water meter temporal resolution
a limiting factor to residential water end-use classification? A quantitative
experimental analysis. Environ. Res.: Infrastructure and Sustainability 2 (4), 045004.
Heydari, Z., Stillwell, A.S., 2024. Comparative analysis of supervised classification
algorithms for residential water end uses. Water Resour. Res. 60 (6),
e2023WR036690.
Huang, J., Pang, C., Yang, W., Zeng, X., Zhang, J., Huang, C., 2022. A deep learning
neural network for the residential energy consumption prediction. IEEJ Trans.
Electr. Electron. Eng. 17 (4), 575–582.
Ismail Fawaz, H., Forestier, G., Weber, J., Idoumghar, L., Muller, P.-A., 2019. Deep
learning for time series classification: a review. Data Min. Knowl. Discov. 33 (4),
917–963.
Kneebone, S.C., 2018. Catalysing Water Saving Behaviours in Australian Urban
Households. Monash University.
Kowalski, M., Marshallsay, D., 2003. A System for Improved Assessment of Domestic
Water Use Components.
Kropp, I., Pouyan Nejadhashemi, A., Julien, R., Mitchell, J., Whelton, A.J., 2022.
A machine learning framework for predicting downstream water end-use events with
upstream sensors. Water Supply 22 (7), 6427–6442.
Li, Z., Wang, C., Liu, Y., Wang, J., 2024. Enhancing the explanation of household water
consumption through the water-energy nexus concept. npj Clean Water 7 (1), 8.
Liu, A., Giurco, D., Mukheibir, P., 2016. Urban water conservation through customised
water and end-use information. J. Clean. Prod. 112, 3164–3175.
Mahajan, T., Singh, G., Bruns, G., 2021. An
Experimental Assessment of Treatments for Cyclical Data, p. 22.
Manandhar, P., Rafiq, H., Rodriguez-Ubinas, E., 2023. Current status, challenges, and
prospects of data-driven urban energy modeling: a review of machine learning
methods. Energy Rep. 9, 2757–2776.
Mazzoni, F., Alvisi, S., Blokker, M., Buchberger, S.G., Castelletti, A., Cominola, A.,
Gross, M.-P., Jacobs, H.E., Mayer, P., Steffelbauer, D.B., 2023. Investigating the
characteristics of residential end uses of water: a worldwide review. Water Res. 230,
119500.
Mazzoni, F., Alvisi, S., Franchini, M., Ferraris, M., Kapelan, Z., 2021. Automated
household water end-use disaggregation through rule-based methodology. J. Water
Resour. Plann. Manag. 147 (6), 04021024.
Mazzoni, F., Blokker, M., Alvisi, S., Franchini, M., 2024. An enhanced method for
automated end-use classification of household water data. J. Hydroinf. 26 (2),
408–423.
Meyer, B.E., Nguyen, K., Beal, C.D., Jacobs, H.E., Buchberger, S.G., 2021. Classifying
household water use events into indoor and outdoor use: improving the benefits of
basic smart meter data sets. J. Water Resour. Plann. Manag. 147 (12), 04021079.
Nguyen, K.A., Stewart, R.A., Zhang, H., 2014. An autonomous and intelligent expert
system for residential water end-use classification. Expert Syst. Appl. 41 (2),
342–356.
Nguyen, K.A., Stewart, R.A., Zhang, H., 2017. Water end-use classification with
contemporaneous water-energy data and deep learning network. In: World Academy
of Science, Engineering and Technology, International Journal of Computer, vol. 12.
Electrical, Automation, Control and Information Engineering, pp. 1–6, 1.
Nguyen, K.A., Stewart, R.A., Zhang, H., Jones, C., 2015. Intelligent autonomous system
for residential water end use classification: Autoflow. Appl. Soft Comput. 31,
118–131.
Nguyen, K.A., Zhang, H., Stewart, R.A., 2013. Development of an intelligent model to
categorise residential water end use events. J. hydro-environ. res. 7 (3), 182–201.
Olmstead, S.M., Stavins, R.N., 2009. Comparing price and nonprice approaches to urban
water conservation. Water Resour. Res. 45 (4).
Pavlou, P.V., Filippou, S., Solonos, S., Vrachimis, S.G., Malialis, K., Eliades, D.G.,
Theocarides, T., Polycarpou, M.M., 2024. Monitoring domestic water consumption: a
comparative study of model-based and data-driven end-use disaggregation methods.
J. Hydroinf. 26 (4), 709–726.
Plappally, A., 2012. Energy requirements for water production, treatment, end use,
reclamation, and disposal. Renew. Sustain. Energy Rev. 16 (7), 4818–4848.
Rahim, M.S., Nguyen, K.A., Stewart, R.A., Ahmed, T., Giurco, D., Blumenstein, M., 2021.
A clustering solution for analyzing residential water consumption patterns. Knowl.
Base Syst. 233, 107522.
Rahim, M.S., Nguyen, K.A., Stewart, R.A., Giurco, D., Blumenstein, M., 2019. Predicting
Household Water Consumption Events: towards a Personalised Recommender
System to Encourage Water-Conscious Behaviour. IEEE, pp. 1–8.
Russell, S., Fielding, K., 2010. Water demand management research: a psychological
perspective. Water Resour. Res. 46 (5).
Sagheer, A., Kotb, M., 2019. Time series forecasting of petroleum production using deep
LSTM recurrent networks. Neurocomputing 323, 203–213.
Stewart, R.A., Nguyen, K., Beal, C., Zhang, H., Sahin, O., Bertone, E., Vieira, A.S.,
Castelletti, A., Cominola, A., Giuliani, M., 2018. Integrated intelligent water-energy
metering systems and informatics: visioning a digital multi-utility service provider.
Environ. Model. Software 105, 94–117.
Vitter, J.S., Webber, M., 2018a. A non-intrusive approach for classifying residential
water events using coincident electricity data. Environ. Model. Software 100,
302–313.
Vitter, J.S., Webber, M., 2018b. Water event categorization using sub-metered water and
coincident electricity data. Water 10 (6), 714.
Vivek, V., Malghan, D., Mukherjee, K., 2021. Toward achieving persistent behavior
change in household water conservation. Proc. Natl. Acad. Sci. USA 118 (24),
e2023014118.
Wada, Y., Flörke, M., Hanasaki, N., Eisner, S., Fischer, G., Tramberend, S., Satoh, Y., Van
Vliet, M., Yillia, P., Ringler, C., 2016. Modeling global water use for the 21st century:
the Water Futures and Solutions (WFaS) initiative and its approaches. Geosci. Model
Dev. (GMD) 9 (1), 175–222.
Wang, C., Li, Z., Ni, X., Shi, W., Zhang, J., Bian, J., Liu, Y., 2023. Residential water and
energy consumption prediction at hourly resolution based on a hybrid machine
learning approach. Water Res. 246, 120733.
Wang, C., Zhang, J., Long, J., Liu, Y., 2022. Panel data regression model for identifying
the spatiotemporal characteristics and key factors influencing household water-
energy consumption. J. Tsinghua Univ. (Sci. Technol.) 62 (3), 614–626.
Wang, X.-j., Zhang, J.-y., Gao, J., Shahid, S., Xia, X.-h., Geng, Z., Tang, L., 2018. The new
concept of water resources management in China: ensuring water security in
changing environment. Environ. Dev. Sustain. 20, 897–909.
Yang, A., Zhang, H., Stewart, R.A., Nguyen, K., 2018. Enhancing residential water end
use pattern recognition accuracy using self-organizing maps and K-means clustering
techniques: Autoflow v3.1. Water 10 (9), 1221.
Zhang, L., Njepu, A., Xia, X., 2021. Minimum cost solution to residential energy-water
nexus through rainwater harvesting and greywater recycling. J. Clean. Prod. 298,
126742.
M. Wang et al.
Cleaner and Responsible Consumption 16 (2025) 100252
10
ResearchGate has not been able to resolve any citations for this publication.
Article
Full-text available
Reduced energy consumption is essential for a rapid transition to net zero carbon emissions. Residential energy may constitute 27 % of primary energy consumption, and 20 %-50 % of residential energy is water-related energy (WRE). However, residential WRE consumption is difficult to quantify due to challenges in collecting data. The aim of this literature review is to critically appraise and compare models of residential WRE. This is the first literature review to provide a comparison of modelled estimates of residential WRE consumption. Reported values for residential WRE consumption were highly variable, ranging from 1 to 7 kWh/person/day. The results are not representative of the global population because 50 % of studies were conducted in Europe, while remaining studies were scattered across eight countries. 30 % of studies quantified energy consumption of specific end-uses (e.g. shower), and 40 % of studies only considered average consumption. Of the 61 studies reviewed, only four studies demonstrated clear validation of WRE consumption, and no studies validated energy consumption of individual end-uses. Therefore, it is difficult to determine whether the variability in reported results is due to true variability in residential WRE consumption, or uncertainty in the modelling approaches. Since successful water and energy reduction has been based on knowledge of specific end-uses, WRE models need better consideration of end-uses in order to inform design of interventions to reduce WRE consumption. Future research in this area also requires a greater focus on validation of modelling tools and wider geographical scope.
Article
Full-text available
Water sustainability in the built environment requires an accurate estimation of residential water end uses (e.g., showers, toilets, faucets, etc.). In this study, we evaluate the performance of four models (Random Forest, RF; Support Vector Machines, SVM; Logistic Regression, Log‐reg; and Neural Networks, NN) for residential water end‐use classification using actual (measured) and synthetic labeled data sets. We generated synthetic labeled data using Conditional Tabular Generative Adversarial Networks. We then utilized grid search to train each model on their respective optimized hyperparameters. The RF model exhibited the best model performance overall, while the Log‐reg model had the shortest execution times under different balanced and imbalanced (based on number of events per class) synthetic data scenarios, demonstrating a computationally efficient alternative for RF for specific end uses. The NN model exhibited high performance with the tradeoff of longer execution times compared to the other classification models. In the balanced data set scenario, all models achieved closely aligned F1‐scores, ranging from 0.83 to 0.90. However, when faced with imbalanced data reflective of actual conditions, both the SVM and Log‐reg models showed inferior performance compared to the RF and NN models. Overall, we concluded that decision tree‐based models emerge as the optimal choice for classification tasks in the context of water end‐use data. Our study advances residential smart water metering systems through creating synthetic labeled end‐use data and providing insight into the strengths and weaknesses of various supervised machine learning classifiers for end‐use identification.
Conference Paper
Recent work by WRc has successfully demonstrated a cost-effective technique for measuring the components of domestic water consumption. The technique involves the capture of relatively high-resolution flow data and uses novel software incorporating a decision tree algorithm to deconstruct a flow-time graph into constituent water uses (microcomponents). Using examples from a number of recent studies conducted for different UK water companies, this paper discusses the merits of the WRc technique and its importance to demand forecasting and assessment of water efficiency schemes.
Article
Estimating household water consumption can facilitate infrastructure management and municipal planning. Although household water consumption has been extensively explored using various techniques and assumptions regarding influencing features, the explanatory power of existing models remains relatively low and has the potential to be enhanced based on the water-energy nexus concept. This study attempts to explain household water consumption by establishing estimation models, incorporating energy-related features as inputs and providing strong evidence of the need to consider the water-energy nexus to explain water consumption. Traditional statistical (OLS) and machine learning techniques (random forest and XGBoost) are employed using a sample of 1320 households in Beijing, China. The results demonstrate that the inclusion of energy-related features increases the coefficient of determination (R²) by 34.0% on average. XGBoost performs the best among the three techniques. Energy-related features exhibit higher explanatory power and importance than water-related features. These findings provide a feasible modelling basis and can help better understand the household water-energy nexus.
Article
An accurate estimation of residential end uses of water is helpful in developing efficient water systems. If not obtainable through direct metering, this information can be gathered by disaggregating and classifying household-level water-use data. However, most automated techniques require fine-resolution data (e.g., 1 s) and end-use parameters which may be unavailable to water utilities. To fill the above gap, this study presents a method for the automated disaggregation and classification of indoor water-use data collected at the 1-min temporal resolution, and by exclusively relying on the end-use parameter values available in the literature. Specifically, the features of each water-use event detected at the household level are compared against the most common event features for the selected end-use category. The results obtained by testing the method with real data collected at 14 households in two different countries (Italy and the Netherlands) confirm its potential in disaggregating and classifying water end-use events with an average accuracy higher than 90% and an average (normalized) root-mean-square error lower than 0.06 despite the lack of information about end uses in individual households. This demonstrates that end-use detection is possible even with data whose resolution is closer to that of most commercial water meters.
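The idea of comparing each detected event's features against typical values for each end-use category can be sketched as a nearest-profile rule. This is only an illustrative sketch and not the method of the study: the feature set, the `TYPICAL` values, and the relative-distance rule are all hypothetical assumptions chosen for the example.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class EventFeatures:
    volume_l: float             # total volume of the event, litres
    duration_min: float         # duration in minutes (1-min resolution data)
    peak_flow_l_per_min: float  # highest per-minute flow in the event


# Hypothetical "typical" profiles; NOT literature values, illustration only.
TYPICAL = {
    "tap": EventFeatures(2.0, 1.0, 2.0),
    "toilet": EventFeatures(6.0, 1.0, 6.0),
    "shower": EventFeatures(50.0, 7.0, 8.0),
}


def relative_distance(event, typical):
    """Sum of relative deviations, so litres and minutes are comparable."""
    return sum(abs(e - t) / t for e, t in zip(astuple(event), astuple(typical)))


def classify(event):
    """Assign the end use whose typical profile is closest to the event."""
    return min(TYPICAL, key=lambda name: relative_distance(event, TYPICAL[name]))


print(classify(EventFeatures(6.5, 1.0, 6.5)))   # close to the toilet profile
print(classify(EventFeatures(45.0, 6.0, 8.0)))  # close to the shower profile
```

Normalizing each deviation by the typical value is one simple design choice; it keeps a large-volume feature from dominating the comparison, but other weightings are equally plausible.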
Article
Achieving a thorough understanding of the determinants of household water consumption is crucial to support demand management strategies. Yet, existing research on household water consumption determinants is often limited to specific case studies, with findings that are difficult to generalize and not conclusive. Here, we first contribute an updated framework for review, classification, and analysis of the literature on the determinants of household water consumption. Our framework allows trade-off analysis of different criteria that account for the representation of a potential water consumption determinant in the literature, its impact across heterogeneous case studies, and the effort required to collect information on it. We then review a comprehensive set of 48 publications with our proposed framework. The results of our trade-off analysis show that distinct groups of determinants exist, allowing for the formulation of recommendations for practitioners and researchers on which determinants to consider in practice and prioritize in future research.
Article
Urban energy modeling is essential in planning electricity generation and efficiently managing electric power systems. Various urban energy models were developed for several energy-driven applications, including emission reduction, retrofit analysis, and forecasting. Electricity load forecasts help to estimate the load demand and effectively aid in power system operation and balancing. The accuracy of load forecasts at high temporal and spatial resolution can impact system planning and operation. Therefore, it is essential to know the factors that affect the accuracy of these forecasts and how they can be improved regarding the current state of the art. This article reviews the recent literature on data-driven electricity load forecasts in three steps. First, different phases of the review process are explained to select and analyze recent literature on machine learning-based short-term load forecasts. Then various aspects of load forecasting techniques have been reviewed, addressing their advantages, disadvantages, temporal resolution, and performance. Finally, the review covers the current challenges in load forecasting and describes the reasons for performance degradation and lower accuracy. Based on the reviewed literature, it was found that temperature, user load profiles, and proper management of input data highly affect load forecast accuracy. In addition, shortcomings of existing performance evaluation metrics make the applicability of those techniques questionable. Finally, we conclude the review by highlighting the necessary actions to improve load forecast accuracy that are relatively unexplored and can be used as a reference for future research on accurate load forecasts.
Article
Monitoring the water usage of different appliances and informing consumers about it has been shown to have an impact on their behavior toward drinking water conservation. The most practical and cost-effective way to accomplish this is through a non-intrusive approach that locally analyzes data received from a flow sensor at the main water supply pipe of a household. In this work, we present two different methods addressing the challenges of disaggregating end-use consumption and classifying consumption events. The first method is model-based (MB) and uses a combination of dynamic time warping and statistical bounds to analyze four water end-use characteristics. The second, learning-based (LB) method is data-driven and formulates the problem as a time-series classification problem without relying on a priori identification of events. We perform an extensive computational study that includes a comparison between an MB and an LB method, as well as an experimental study to demonstrate the application of the LB method on an edge computing device. Both methods achieve similar F1 scores (LB: 71.73%, MB: 71.04%) with the LB being more precise. The embedded LB method achieves a slightly higher score (72.01%) while enhancing on-site real-time processing, improving security and privacy, and enabling cost savings.
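For readers unfamiliar with dynamic time warping, the following minimal sketch shows the textbook dynamic-programming form of the distance and a nearest-template lookup built on it. It is not the model-based method described above (which also uses statistical bounds and four end-use characteristics); the `TEMPLATES` flow profiles and function names are hypothetical and chosen only for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook O(len(a) * len(b)) DTW with absolute-difference local cost."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] aligned with an already-used b sample
                cost[i][j - 1],      # b[j-1] aligned with an already-used a sample
                cost[i - 1][j - 1],  # both series advance together
            )
    return cost[n][m]


# Hypothetical flow templates (litres per minute per sample), illustration only.
TEMPLATES = {
    "toilet": [0, 6, 6, 6, 0],
    "shower": [0, 8, 8, 8, 8, 8, 8, 0],
}


def nearest_template(event):
    """Return the template name with the smallest DTW distance to the event."""
    return min(TEMPLATES, key=lambda name: dtw_distance(event, TEMPLATES[name]))


print(nearest_template([0, 6, 6, 0]))
print(nearest_template([0, 8, 8, 8, 8, 0]))
```

Because warping lets one sample align with several, events of the same shape but different duration can still match a template closely, which is why the technique suits flow traces of variable length.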