An end-to-end real-time pollutants spilling recognition in wastewater
based on the IoT-ready SENSIPLUS platform
Luca Gerevini a, Gianni Cerro b, Alessandro Bria a, Claudio Marrocco a, Luigi Ferrigno a, Michele Vitelli c, Andrea Ria d, Mario Molinara a
a Dept. of Electrical and Information Engineering, University of Cassino and Southern Lazio, 03043 Cassino, Italy
b Dept. of Medicine and Health Sciences "V. Tiberio", University of Molise, 86100 Campobasso, Italy
c Sensichips s.r.l., 04011 Aprilia, Italy
d Department of Information Engineering, 56122 Pisa, Italy
article info
Article history:
Received 3 September 2022
Revised 23 December 2022
Accepted 24 December 2022
Available online xxxx
Keywords:
Machine learning
Smart sensors
Wastewater
Anomaly detection
IoT
Supervised learning
abstract
The problem of detecting illegal pollutants in wastewater is of fundamental importance for public health
and security. The availability of distributed, low–cost and low–power monitoring systems, particularly
enforced by IoT communication mechanisms and low-complexity machine learning algorithms, would
make it feasible and easy to manage in a widespread manner. Accordingly, an End-to-End IoT-ready node
for the sensing, local processing, and transmission of the data collected on the pollutants in the wastew-
ater is presented here. The proposed system, organized in sensing and data processing modules, can rec-
ognize and distinguish contaminants from unknown substances typically present in wastewater. This is
particularly important in the classification stage since distinguishing between background (not of inter-
est) and foreground (of interest) substances drastically improves the classification performance, espe-
cially in terms of false positive rates. The measurement system, i.e., the sensing part, is represented by
the so-called Smart Cable Water based on the SENSIPLUS chip, which integrates an array of sensors
detecting various water-soluble substances through impedance spectroscopy. The data processing is
based on a commercial Micro Control Unit (MCU), including an anomaly detection module, a classifica-
tion module, and a false positive reduction module, all based on machine learning algorithms that have
a computational complexity suitable for low-cost hardware implementation.
An extensive experimental campaign on different contaminants has been carried out to train machine-
learning algorithms suitable for low-cost and low-power MCU. The corresponding dataset has been made
publicly available for download. The obtained results demonstrate an excellent classification ability,
achieving an accuracy of more than 95% on average, and are a reliable "proof of concept" of a pervasive
IoT system for distributed monitoring.
© 2022 The Author(s). Published by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Journal of King Saud University Computer and Information Sciences xxx (xxxx) xxx
journal homepage: www.sciencedirect.com
https://doi.org/10.1016/j.jksuci.2022.12.018
Peer review under responsibility of King Saud University. Production and hosting by Elsevier.
E-mail addresses: luca.gerevini@unicas.it (L. Gerevini), gianni.cerro@unimol.it (G. Cerro), a.bria@unicas.it (A. Bria), c.marrocco@unicas.it (C. Marrocco), ferrigno@unicas.it (L. Ferrigno), michele.vitelli@sensichips.com (M. Vitelli), andrea.ria@ing.unipi.it (A. Ria), m.molinara@unicas.it (M. Molinara)
1. Introduction
Water, which covers more than 70% of the Earth’s surface and is involved in almost all life activities, is a primary factor influencing life on the Earth. Consequently, water quality monitoring is a crucial task, and ways to address it are widely spreading in the scientific literature (Ighalo et al., 2021; Budiarti et al., 2019; Saravanan et al., 2018; Akhter et al., 2022; Ferdinandi et al., 2019). Particularly critical is the issue related to wastewater (Trubetskaya et al., 2021), i.e., water having suffered pollution due to domestic, industrial, or hospital processes. Its monitoring has been a hot topic for two years as the COVID-19 pandemic spread throughout the world (Bogler et al., 2020; Farkas et al., 2020). Capabilities to get detailed and accurate monitoring and detect possible contaminants are related to three distinct components: sensing systems, geographical pervasiveness, and data processing.
As for sensing systems (Tyszczuk-Rotko et al., 2022; Kamaruidzaman and Rahmat, May 2020; Vikesland, 2018; Alam et al., 2020), different costs and performance levels can be experienced. Issues to be faced are related to sensitivity, selectivity, and miniaturization. Most solutions prefer adopting a sensor array to increase the capability to discern between different substances. To get widespread monitoring and ensure water quality estimation and pollutant detection in a distributed way, the adoption of high-cost systems appears unsuitable.
IoT-ready, low-cost platforms enable the achievement of geo-
graphical pervasiveness that could benefit from a high level of
energy autonomy, low computational burden, and high data trans-
fer capabilities. Their flexibility allows spreading devices in the
area of interest by creating a monitoring network. The IoT capabil-
ities adopted for water monitoring are widespread (Junior et al.,
2021; Dupont et al., 2018; Overmars and Venkatraman, 2020).
In terms of data processing, acquired measurements are gener-
ally processed to become features to feed Machine Learning (ML)/
Deep Learning (DL) algorithms adopted for classification (Lowe
et al., 2022; Koditala and Pandey, 2018; Bansal and Geetha,
2020; Dilmi and Ladjal, 2021; Bria et al., 2021; Bria et al., 2020).
Major challenges regard finding pathways to have fast data
exchange, classify with acceptable computational complexity in
nearly real-time and be able to discriminate among different pollu-
tants that can be found in the flowing wastewater.
The paper’s goal is to present an end-to-end system for spilling detection in wastewater that includes a complete chain from sensing to final classification. The proposed system is:
• real-time, because it responds on a single-sample basis, generating a classification for each set of ten measures: the total time needed for a single acquisition/classification is about 1.6 s;
• low power, low cost, and IoT ready, thanks to the coupling of the SENSIPLUS (Ria et al., 2022; Manfredini et al., 2021) (discussed in the following) with a commercial MCU;
• able to tackle unknown substances, thanks to the anomaly detection module;
• designed to monitor a single polluting source at a time, considering that spilling of the considered pollutants in wastewater is a rare event.
Based on this concept, the paper is structured as follows. Section 2 contains a complete review of the state of the art in pollutant detection in water and wastewater through machine learning. Section 3 highlights the main contributions provided in this paper. Section 4 describes the measurement set-up as well as the data processing stage to get data ready for classification. Section 5 shows the obtained experimental results. A discussion of the obtained results is reported in Section 6. Conclusions and future directions are finally discussed in Section 7.
2. Related works
In recent years several sensor prototypes for monitoring the
composition of wastewater in the context of WasteWater Treat-
ment Plants (WWTP) systems have been proposed in the literature
(Ferdinandi et al., 2019; Bourelly et al., 2020; Betta et al., 2019;
Molinara et al., 2020; Bria et al., 2020; De Vito et al., 2018;
Sewage monitoring system for tracking synthetic drug
laboratories, 2022; Hoes et al., 2009; Lim, 2012; Lepot et al.,
2017; Ji et al., 2020; Drenoyanis et al., 2019; Pisa et al., 2019;
Desmet et al., 2017). The sensors presented are based on different technologies, such as electrochemical sensing, optical sensing, and mass or ion spectrometry, and can be mounted inside wells with the aim of detecting the presence or concentration of certain pollutants. In Ferdinandi et al. (2019), Bourelly et al. (2020), Betta et al. (2019), Molinara et al. (2020), Bria et al. (2020) the application of the SENSIPLUS as an air and water monitoring system is presented and its effectiveness preliminarily demonstrated. In De Vito et al. (2018) the authors describe a distributed
sewage monitoring system based on low-cost technologies. In this
case, the authors do not carry out recognition of specific substances
but limit themselves to carrying out the detection of generic pollu-
tants. In Sewage monitoring system for tracking synthetic drug
laboratories (2022) a drug detection system is described in the
sewage system to identify the presence of drug factories. The weak
point of this solution lies precisely in the fact that it deals with a
very specific problem, not designed for the detection of generic
pollutants. In Hoes et al. (2009), a technique to find illicit house-
hold sewage connections to storm-water systems in the Nether-
lands using Distributed Temperature Sensing has been
developed. In Lim (2012), a generic system for detecting pollutants
in wastewater is presented. The system lacks the ability to discern
between different substances and is based on outdated technolo-
gies. In Lepot et al. (2017) a system for detecting illicit connections
to the sewage system is presented. The solution, based on the use
of an infrared camera, is not designed for the detection of specific
substances. In Ji et al. (2020), a system for measuring the amount of
wastewater based on image analysis is presented. In this case, the
distinction between the different substances is completely missing,
and in general, the vision systems, although immune to corrosion
phenomena and deposits of material on the sensors, are generally
characterized by high energy consumption and, therefore, not very
suitable for continuous monitoring systems at low power. In Pisa
et al. (2019), the authors propose a system specifically designed
to detect nitrogen-derived components, specifically ammonium
and total nitrogen, without any attention to the pervasiveness
and low power/low cost. In Drenoyanis et al. (2019), a standalone,
portable radar device allowing non-invasive benchmarking of
sewer pumping station pumps is presented. The system is designed
to generate timely alarms in the event of anomalies in wastewater
flows near WWTP. It does not include any pollutant detection sys-
tem. In Desmet et al. (2017), a system for detecting explosive pre-
cursors is presented, i.e., those substances that terrorists could use
to manufacture rudimentary bombs. In this work, sensors func-
tionalized with gold, palladium, and platinum are used, and
voltammetry is used to detect substances.
3. Main contribution of this paper
From the review of the scientific literature on the application of
machine learning to water analysis, it emerges that the problem of
anomaly detection is neglected. Not considering anomalies in real
systems means making them unusable in a context other than the
laboratory, as the system would not be able to react correctly to
substances not taken into consideration during the training phase,
potentially generating false positives.
To summarize, the open issues in wastewater analysis are
mainly related to the complex and expensive equipment often
required, unsuitable for the IoT and pervasive paradigm, and to
the lack of an anomaly detection step. This paper proposes a solu-
tion to both of these issues.
In terms of IoT readiness, we propose the adoption of the SENSIPLUS chip, a proprietary device developed by the Italian company Sensichips s.r.l., which has proven effective in providing reliable measurements for pollutant detection in air and water (Ferdinandi et al., 2019; Bourelly et al., 2020; Betta et al., 2019; Molinara et al., 2020; Bria et al., 2020). The SENSIPLUS chip, together with a commercial Micro Control Unit (MCU), becomes a low-power, low-cost, and IoT-ready miniaturized sensing platform. The MCU is needed to run the C++ API supplied with the SENSIPLUS chip, to let the system communicate with external systems (for example, via USB or via MQTT over TCP/IP), and to run the inference phase of the various machine learning algorithms.
The second issue is tackled with a double-stage classification
system: an anomaly detector and a multiclass classifier, starting
from the idea that some pollutants are interesting while others
are simply interferants and do not need to be classified. The anom-
aly detection allows stating if the analyzed substance can be one of
interest or something else (for simplicity, unknown). Whenever
such a module declares that the substance is not an anomaly, the
multiclass classifier module is activated, and its computational
burden is included in the system load. The combination of both
modules permits having a substantial false positive reduction
while keeping a very high accuracy value for the substances of
interest.
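The gating logic of the double-stage scheme can be illustrated with a minimal Python sketch. The function names, the `UNKNOWN` label, and the two toy stand-in models are assumptions made only for this example; they are not the trained models used in the paper.

```python
from typing import Callable, Sequence

UNKNOWN = "unknown"  # label used here for rejected samples (hypothetical name)


def two_stage_predict(
    sample: Sequence[float],
    is_known: Callable[[Sequence[float]], bool],
    classify: Callable[[Sequence[float]], str],
) -> str:
    """Run the multiclass classifier only if the anomaly detector accepts the sample."""
    if not is_known(sample):
        return UNKNOWN        # stage 1 rejects: the classifier is never invoked
    return classify(sample)   # stage 2: name the substance of interest


# Toy stand-ins (assumptions, not the trained models of the paper)
def toy_detector(x: Sequence[float]) -> bool:
    # accepts samples whose features stay close to 1.0
    return max(abs(v - 1.0) for v in x) < 0.5


def toy_classifier(x: Sequence[float]) -> str:
    # arbitrary rule, for illustration only
    return "acetic_acid" if x[0] > 1.0 else "ammonia"
```

Because the classifier runs only on accepted samples, its computational cost is paid only when the first stage does not flag an anomaly, which matches the behaviour described in the text.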
The combination of the developed platform and the new con-
cept of supervised double-stage classification represents the main
contribution of this work to the state of the art.
4. Methodology
4.1. The detection chain
Fig. 1 shows the overall detection system that is based on the
Smart Cable Water (SCW) visible in Fig. 2, a proprietary IoT-
ready smart sensor system by Sensichips s.r.l, composed of
InterDigitated Electrodes (IDEs) and based on SENSIPLUS (Ria
et al., 2022).
The latter is a tiny analytical sensing platform with a power consumption of 1.5 mW and communication capabilities such as SPI, I2C, and SENSIBUS (a proprietary 1-wire communication protocol). The SENSIPLUS needs an MCU to run its C++ API, which includes the engine for inference of machine learning algorithms. The ESP32/ESP8266, with USB and WiFi communication capability, has been selected as the MCU. The ESP32/ESP8266 can also guarantee data transfer through (for example) MQTT on TCP/IP. In this configuration, the MCU can act both as a simple bridge that transmits the data collected through the sensors to the cloud (via MQTT, for example) and as a device that performs local processing for detecting substances of interest through the execution of suitable machine learning algorithms. During operation, the SCW can be immersed in the water, and communication and control signals are conveyed through a suitable cable. SENSIPLUS is a micro-chip capable of interrogating on-chip and off-chip sensors with its versatile and accurate Electrical Impedance Spectrometer (EIS) in the frequency range between 3.1 mHz and 1.2 MHz. With the SENSIPLUS, it is possible to perform measurements working with multiple sensors; in particular, the SCW system has 6 IDEs. The physical principle adopted to detect and recognize a given set of substances is related to the RedOx dynamics of catalytic noble metals. Such a phenomenon can be observed as an electrical behavior. Fig. 3 shows the modeled electrical equivalent circuit of two electrodes immersed in a water solution, known in the literature as the Randles circuit (Alavi et al., 2017).
Fig. 1. The overall detection system deployed.
Fig. 2. Smart Cable Water (SCW) with InterDigitated Electrodes functionalized by coating them with six different metals.
Fig. 3. Randles equivalent circuit.
As can be seen from the electrical circuit, each electrode is modeled through a double-layer capacitance C_d and a faradic resistance R_f, which take into account the interface between the water solution (called bulk) and the electrode itself. The model values depend on the electrode composition, geometry, bulk composition, etc. The parameter R_e is the equivalent resistance of the bulk, so it mainly depends on the bulk composition and the electrode area.
To maximize sensitivity to the substances of interest and the RedOx dynamics, the 6 IDEs of the SCW have been functionalized by coating them with six different metals: (M1) Gold, (M2) Copper, (M3) Silver, (M4) Nickel, (M5) Palladium and (M6) Platinum. (M1) to (M5) IDEs are 3 mm by 7 mm each, while (M6) is 12 mm by 8 mm (see Fig. 2).
4.2. Dataset acquisition for training
The proposed system is intended to detect and recognize sub-
stances spilled in wastewater. Consequently, the best solution to
build a good dataset for the training phase would be to acquire
all the measurements directly in a controlled drain of a sewage
network. However, this is not a viable solution mainly for two
reasons:
• Measurement point of view: all measurements should be taken under the same, reliable conditions; however, due to the instability typical of the sewage background composition, it is impossible to reach an acceptable level of reliability.
• Health point of view: due to the presence of viruses, bacteria, and other dangers, operating directly in the sewage network would represent a biological hazard.
To solve the listed problems, we created a Synthetic WasteWater (SWW) to simulate the sewage composition and a measurement setup, shown in Fig. 4, to create a suitable dataset. The adopted recipe for the SWW is a simplified version of the one created in Nopens et al. (2001). Moreover, to better reproduce a real wastewater scenario, the pH of every batch of the SWW has been corrected according to Janna (2016), where measurements on real wastewater are reported. For the detailed chemical composition of the SWW, refer to Table 1.
Fourteen substances have been spilled in the SWW background:
(1) Acetic Acid; (2) Acetone; (3) Ethanol; (4) Ammonia; (5) Formic
Acid; (6) Phosphoric Acid; (7) Sulphuric Acid; (8) Hydrogen Perox-
ide; (9) Synthetic Waste Water; (10) Sodium Hypochlorite; (11)
Sodium Chloride; (12) Dish Wash Detergent; (13) Wash Machine
Detergent; (14) Nelsen.
The listed substances can be split into two groups: substances
1–9 (group 1) and 10–14 (group 2). Group 1 includes only sub-
stances that our system should be capable of recognizing, while
group 2 includes only the outlier samples that our system should
be able to reject.
The measurement procedure for each substance for dataset creation is composed of two phases:
• Warm-Up phase: to let all sensors stabilize, 600 samples at 0.5 Hz rate (total warm-up time: 900 s) in pure SWW are acquired.
• Measurement phase: after the first 600 samples, the substance of interest is spilled in the SWW, and, to record the entire sensor’s evolution after the injection, another 1000 samples at the same sample rate are acquired (total measurement phase time: 2000 s).
Fig. 4. Measurement Set-Up for dataset acquisition.
Table 1
Synthetic waste water chemical composition.
Compound                                     Concentration [mg/l]
Fertilizer                                   91.74
Ammonium Chloride                            12.75
Sodium Acetate Trihydrate                    131.64
Magnesium Hydrogen Phosphate Trihydrate      29.02
Monopotassium Phosphate                      23.4
Iron (II) Sulfate Heptahydrate               5.80
Starch                                       122.00
Milk Powder                                  116.19
Yeast                                        52.24
Soy Oil                                      29.02
The obtained dataset has been made publicly available here
(Public link for downloading the acquired dataset, 2022).
One of the main problems in Machine Learning is feature identification, i.e., the choice of informative properties derived from the sensors that maximize the classification accuracy.
In our case, according to the electrical equivalent circuit described in the previous section, we chose to record the following features:
• Resistance measured at a frequency of 78 kHz, for the Gold and Platinum IDEs.
• Resistance and Capacitance measured at a frequency of 200 Hz, for the Gold, Platinum, Silver, and Nickel IDEs.
This yields a feature vector of size ten (6 resistances and 4 capacitances). The Palladium and Copper IDEs have not been used in this experimental campaign.
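As a quick check of the feature count, the ten-element vector can be enumerated in Python. The ordering of the entries and the tuple layout `(quantity, frequency, electrode)` are assumptions made for illustration; the text only fixes which combinations are recorded.

```python
# Hypothetical ordering and naming of the ten features; the source only states
# which (quantity, frequency, electrode) combinations are recorded.
HIGH_FREQ_HZ = 78_000
LOW_FREQ_HZ = 200

LOW_FREQ_ELECTRODES = ("Gold", "Platinum", "Silver", "Nickel")

FEATURES = (
    [("R", HIGH_FREQ_HZ, metal) for metal in ("Gold", "Platinum")]   # 2 resistances
    + [("R", LOW_FREQ_HZ, metal) for metal in LOW_FREQ_ELECTRODES]   # 4 resistances
    + [("C", LOW_FREQ_HZ, metal) for metal in LOW_FREQ_ELECTRODES]   # 4 capacitances
)

n_resistance = sum(1 for quantity, _, _ in FEATURES if quantity == "R")
n_capacitance = sum(1 for quantity, _, _ in FEATURES if quantity == "C")
```

The counts reproduce the 6 resistance plus 4 capacitance split stated in the text.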
The cited features have been chosen because of the different behavior of the equivalent circuit at low and high frequencies. In particular, at low frequencies both C_d exhibit a high impedance and can be represented as an open circuit (see Fig. 5a), so the measurements depend on both the faradic resistance (R_f) and the bulk resistance (R_e). On the other hand, at high frequencies, the two C_d present a low impedance and can be seen as a short circuit (see Fig. 5b): the measurements mainly depend on the bulk resistance.
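The two limiting cases can be checked numerically. The sketch below assumes the two-electrode Randles model described earlier, with two identical interfaces (R_f in parallel with C_d) in series with R_e; the component values and the two probe frequencies are arbitrary illustrative choices, not measured data or the frequencies used in the experiments.

```python
import math


def randles_impedance(freq_hz: float, r_e: float, r_f: float, c_d: float) -> complex:
    """Impedance of two identical interfaces (R_f parallel to C_d) in series with R_e."""
    omega = 2 * math.pi * freq_hz
    z_interface = r_f / (1 + 1j * omega * r_f * c_d)  # R_f in parallel with C_d
    return r_e + 2 * z_interface


# Illustrative component values (assumptions): ohm, ohm, farad
R_E, R_F, C_D = 1_000.0, 50_000.0, 1e-6

z_low = randles_impedance(1e-3, R_E, R_F, C_D)   # C_d behaves like an open circuit
z_high = randles_impedance(1e6, R_E, R_F, C_D)   # C_d behaves like a short circuit
```

With these values the low-frequency impedance approaches R_e + 2 R_f, while the high-frequency impedance approaches R_e alone, in line with the open-circuit and short-circuit approximations of Fig. 5.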
4.3. Dataset structure and usage
For each substance, ten acquisitions of 1600 samples each, obtained through the measurement procedure described above, have been collected, for a total of 16,000 samples per substance.
For evaluation purposes, the k-Fold Cross-Validation procedure
has been adopted. Cross-validation is primarily used in applied
machine learning to estimate the skill of a machine learning model
on unseen data. Its application generally results in a less biased or
less optimistic estimate of the model efficiency than other meth-
ods, such as a simple train/test split. Usually, the first step in k-
fold Cross-Validation is the random shuffle of the collected data.
In our case, taking into account that measures belonging to the
same experiment are strongly correlated, we preferred to assume
as a unit for k-fold an entire acquisition (1600 samples) of all the
substances.
In order to find the best anomaly detection and multiclass classifier models for the entire system, the entire data set has been organized in ten Folds (Fold 0, Fold 1, ..., Fold 9). Each Fold contains nine Splits (numbered from 0 to 9, skipping the number of the Fold itself) and one final evaluation set. Each Split is organized as follows:
• Training data: used to train both the anomaly detection and the multiclass classifier models.
• Test data: used to find the best hyperparameters for the anomaly detection and multiclass classifier models.
The final evaluation set is composed of samples that are contained neither in the Training data nor in the Test data of any Split of the given Fold.
To keep things clear, we used a fixed nomenclature: the number in a Fold’s name indicates the experiment (data acquisition) used to perform the final evaluation, while the number in a Split’s name indicates the experiment used for the related Test data. The Training data is composed of all the experiments except the one used for the related Test data and the one used for the final evaluation, which, as said before, is unseen by both the Training and Test data of the related Fold. For example, Fold 0 contains Splits 1 to 9; Split 0 is excluded since experiment 0 of all the substances is used to build the final evaluation set of the Fold.
The Test data of Split 1 is made of experiment 1 of all substances, while the Training data is made of all the remaining experiments (excluding experiment 1, used for the Test data, and experiment 0, used for the final evaluation). Likewise, the Test data of Split 2 is made of experiment 2, while the related Training data excludes experiments 2 and 0, and so on. The final evaluation set of Fold 0 is made of experiment 0 of all substances. See Fig. 6 for a graphical representation of the data set splitting, where Exp 0, Exp 1, ..., Exp 9 denote acquisitions 0, 1, ..., 9 of all substances, respectively.
Finally, for both the multiclass classifier and the anomaly detection models, the ratio among the training, test, and final evaluation sets during the learning phase was 80%, 10%, and 10%, respectively. Furthermore, in order to properly validate and test the learned anomaly detection models, the validation and test sets have been polluted with outlier points taken from the substances belonging to group 2.
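The Fold/Split organization of this section can be sketched in a few lines of Python. The dictionary layout and the function name `build_folds` are assumptions made for illustration; experiments are identified by their index 0 to 9, and each index stands for that acquisition of all substances.

```python
N_EXPERIMENTS = 10  # ten acquisitions per substance


def build_folds(n_experiments: int = N_EXPERIMENTS) -> dict:
    """Return {fold: {"final_eval": e, "splits": {split: {"train": [...], "test": s}}}}."""
    folds = {}
    for fold in range(n_experiments):
        splits = {}
        for split in range(n_experiments):
            if split == fold:
                continue  # the fold's own experiment is reserved for final evaluation
            train = [e for e in range(n_experiments) if e not in (fold, split)]
            splits[split] = {"train": train, "test": split}
        folds[fold] = {"final_eval": fold, "splits": splits}
    return folds


folds = build_folds()
```

For Fold 0 this yields Splits 1 to 9; in Split 1 the training experiments are 2 to 9, the test experiment is 1, and experiment 0 is held out, giving the 80%/10%/10% proportion mentioned above. Keeping whole acquisitions together avoids placing strongly correlated samples from one experiment on both sides of a split.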
4.4. Classification
The classification system is organized in two phases: (i) Data Preprocessing; (ii) Classification. As can be seen in Fig. 7, the Data Preprocessing phase (i) normalizes the raw data coming from the sensors and decides, through a Finite State Machine (FSM, see Fig. 8), whether they should be submitted to the Classification phase (ii) or not.
Fig. 5. Randles at different frequencies.
Fig. 6. Data Set Structure. Exp 0, Exp 1, ..., Exp 9 means respectively acquisition 0, 1, ..., 9 of all substances.
Fig. 7. An overall view of the system.
Fig. 8. Finite state machine.
4.4.1. Data preprocessing
The Data Preprocessing phase is realized in two steps:
• normalization of the raw data coming from the sensors, through the creation of a robust baseline signal;
• decision on whether the normalized sample should be forwarded to the anomaly detector or classified directly.
The baseline signal b_t is generated by the union of the FSM with the application of an Exponential Moving Average (EMA) according to the following equation:
$$
b_t =
\begin{cases}
s_t, & t = 0 \\
b_{t-1}, & t > 0,\ S \in \{BS, BSP\} \\
\alpha s_t + (1 - \alpha)\, s_{t-1}, & t > 0,\ S \in \{WT, BA, BT\}
\end{cases}
\qquad (1)
$$
where s_t are the sensors’ raw data, and Wait (WT), Baseline Acquisition (BA), Baseline Tracking (BT), Baseline Suspended (BSP), and Baseline Stopped (BS) are the states of the FSM.
The FSM aims to build a robust baseline capable of coping with the variability between sensors/chips, the sensor drift, the environmental noise, interferences, etc. The first two states (WT and BA) guarantee that the baseline is not affected by noise and/or interferences. Once the FSM reaches the BT state, the system tries to detect the injection of a substance, revealed by a peak in the raw data with respect to the baseline generated with the EMA (see below).
The BSP state is an intermediate state between BT and BS that filters out signal spikes by waiting until a cluster of samples confirms the presence of a spilled substance. Once the system is confident that a substance has been spilled (after 5 acquisitions), the FSM moves to the BS state. Consequently, the normalized samples are passed as input to the classification phase algorithm.
Regarding the EMA, the α parameter is the reciprocal of EMA_c (coefficient empirically set to 25). The normalization value is given by the following formula:
$$
f_t = s_t / b_t \qquad (2)
$$
where f_t is the normalized feature vector, s_t is the raw sensor data, and b_t is the baseline signal computed as described by Eq. 1.
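The baseline update and the normalization of Eq. 2 can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, vectors are plain Python lists, and the smoothing term uses the previous baseline b_{t-1}, which is the usual EMA recursion, whereas Eq. 1 as printed shows s_{t-1} in that position.

```python
EMA_C = 25                      # coefficient reported in the text
ALPHA = 1.0 / EMA_C             # alpha is the reciprocal of EMA_c
FROZEN_STATES = {"BS", "BSP"}   # the baseline is held constant in these states


def update_baseline(prev_baseline, sample, state, alpha=ALPHA):
    """One baseline step per feature (textbook EMA recursion, an assumption)."""
    if prev_baseline is None:          # t = 0: initialise with the first sample
        return list(sample)
    if state in FROZEN_STATES:         # BS / BSP: keep b_{t-1}
        return list(prev_baseline)
    # WT / BA / BT: exponential smoothing of the raw signal
    return [alpha * s + (1 - alpha) * b for s, b in zip(sample, prev_baseline)]


def normalize(sample, baseline):
    """Eq. 2: element-wise ratio f_t = s_t / b_t."""
    return [s / b for s, b in zip(sample, baseline)]
```

Freezing the baseline in BS and BSP keeps a spilled substance from being absorbed into the reference signal, so the normalized features keep reflecting the deviation from the pre-spill condition.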
Fig. 8 shows the entire FSM system. In particular, t is the current time sample, while τ is a threshold that, in our case, has been empirically set to 0.05. The parameter d_t represents the Euclidean distance between the normalized feature vector f_t and the unit vector u (a vector of ones) in a 10-dimensional space, which is the size of the vector s_t (see Eq. 3).
Starting from the Euclidean distance d_t evaluated between s_t and b_t in the feature space, a peak that reveals an injection occurs when d_t is greater than the threshold τ (empirically set to 0.05).
Looking at Eq. 2, it is clear that the vector f_t is equal to the unit vector when b_t is equal to s_t. For this reason, the Euclidean distance has been computed with respect to the unit vector, so that d_t equal to zero means that the baseline signal b_t is perfectly tracking the sensor signals s_t.
$$
d_t = \lVert f_t - u \rVert \qquad (3)
$$
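A short numerical illustration of the distance and of the threshold check, written as a sketch: the function name and the two example vectors are hypothetical, while the threshold value is the one reported in the text.

```python
import math

TAU = 0.05  # threshold reported in the text


def distance_from_ones(features):
    """Euclidean distance between the normalized vector f_t and the all-ones vector u."""
    return math.sqrt(sum((f - 1.0) ** 2 for f in features))


f_tracking = [1.0] * 10          # baseline equal to the raw signal on every feature
f_shifted = [1.0] * 9 + [1.1]    # one feature 10% above its baseline

d_tracking = distance_from_ones(f_tracking)
d_shifted = distance_from_ones(f_shifted)
```

When the baseline tracks the signal the distance is zero, whereas a 10% deviation on a single feature already gives a distance of about 0.1, which exceeds the 0.05 threshold.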
As seen in Fig. 8, the current state of the FSM changes according to given rules. In particular, the FSM starts in the WT state. In this state the classification system simply computes, and stores into a vector, the first EMA_c distances computed over the measured samples s_t (see Eqs. 2 and 3). Once the distance vector has been filled, the FSM passes to the BA state. Here the system keeps updating the distance vector and, once the variability of the vector (computed as the mean plus three times the standard deviation) is below a given threshold, the system moves to the next state (BT). At this point, the system checks whether a substance has been spilled in the water, by checking whether the current distance is greater than a given threshold. When this happens, the FSM moves to the BSP state where, in order not to confuse the spill of a substance with a measurement spike or simple noise, the system checks that the current distance remains above the threshold for five consecutive samples; otherwise, the system goes back to the BT state. Finally, once the FSM is in the BS state, the current normalized samples are given as input to the detection system. In this state, if the sample is classified as the background substance, the FSM returns to the BA state.
The entire system described so far is shown in Fig. 9, where S indicates the state of the FSM, C_t is the classification of the sample at time t, and BKG is the background substance.
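The state transitions described above can be sketched as a small Python class driven by the distance d_t. Several details are assumptions made for illustration: the class and argument names, the value of the stability threshold used in BA (the text only says "a given threshold"), the use of the population standard deviation, and the choice to count the five confirming samples from the first sample observed while in BSP.

```python
from collections import deque
from statistics import mean, pstdev


class BaselineFSM:
    """Sketch of the WT -> BA -> BT -> BSP -> BS state machine driven by d_t."""

    def __init__(self, window=25, tau=0.05, stability_threshold=0.05, confirm=5):
        self.state = "WT"
        self.distances = deque(maxlen=window)            # last `window` distances
        self.tau = tau                                   # spill threshold from the text
        self.stability_threshold = stability_threshold  # assumed value
        self.confirm = confirm                           # samples needed to confirm a spill
        self.above = 0                                   # consecutive samples above tau in BSP

    def step(self, d_t, classification=None, background="BKG"):
        if self.state == "WT":
            self.distances.append(d_t)
            if len(self.distances) == self.distances.maxlen:
                self.state = "BA"                        # distance vector filled
        elif self.state == "BA":
            self.distances.append(d_t)
            variability = mean(self.distances) + 3 * pstdev(self.distances)
            if variability < self.stability_threshold:
                self.state = "BT"                        # baseline is stable
        elif self.state == "BT":
            if d_t > self.tau:
                self.state, self.above = "BSP", 0        # possible spill
        elif self.state == "BSP":
            if d_t > self.tau:
                self.above += 1
                if self.above >= self.confirm:
                    self.state = "BS"                    # spill confirmed
            else:
                self.state = "BT"                        # isolated spike: resume tracking
        elif self.state == "BS":
            if classification == background:
                self.state = "BA"                        # background again: re-acquire baseline
        return self.state
```

In a complete pipeline, the state returned by `step` would also decide whether the baseline is updated or frozen and whether the normalized sample is passed on to the detection phase.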
4.4.2. Detection phase
In a real scenario, many different substances flow through the sewerage network; it is therefore crucial to distinguish the substances of interest from all the others.
The main goal of this phase is thus to determine whether the flowing substance is one of the substances of interest, so that its name can then be predicted correctly.
The detection phase is divided into two main parts:
- Anomaly Detection
- Multiclass Classification
Anomaly Detection
Regarding anomaly detection algorithms, two main approaches can be distinguished:
- Outlier Detection
- Novelty Detection
Fig. 9. Finite state machine flow chart.
L. Gerevini, G. Cerro, A. Bria et al. Journal of King Saud University Computer and Information Sciences xxx (xxxx) xxx
In outlier detection algorithms, the training data contains outlier samples. In this case, the estimators try to fit the regions where the training data is most concentrated, ignoring the deviant observations. In novelty detection algorithms, by contrast, the training data is not polluted with outlier samples, and the goal is to determine whether a new observation is an outlier; in this sense, an outlier is also called a novelty.
Given our data set and application field, our case is better represented by the novelty detection approach. In our application, we want to discard all the substances that are usually present in the sewage system and recognize only the substances of interest, which represent a small fraction of the substances that can be found in wastewater. For completeness, we trained and tested anomaly detection models built with both novelty and outlier approaches:
- Novelty Detection: One-class SVM, Local Outlier Factor, and KNN
- Outlier Detection: Elliptic Envelope and Isolation Forest
All the algorithms have been taken from the scikit-learn library (Pedregosa et al., 2011), except for the KNN, which has been taken from the Python Outlier Detection (PyOD) library (Zhao et al., 2019).
As described in Section 4.3, we divided the entire data set into ten cross-validation folders, each of which contains nine additional folders holding a training set and a validation set. The outlier detection data set is the same as the one described in Section 4.3, with the addition of some outlier samples (about 10%) in the training set.
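The difference between the two training regimes can be illustrated with the scikit-learn estimators named above. The snippet is a minimal sketch under stated assumptions: the data are synthetic Gaussian blobs rather than SENSIPLUS measurements, the hyperparameter values are arbitrary picks and not the tuned ones, and the PyOD KNN detector is left out to keep the example to a single library.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Synthetic stand-ins for sensor features (not the paper's data).
inliers = rng.normal(loc=0.0, scale=1.0, size=(300, 4))
outliers = rng.normal(loc=8.0, scale=1.0, size=(30, 4))

# Novelty detection: the models are fitted on clean data only.
novelty_models = {
    "One-class SVM": OneClassSVM(kernel="rbf", nu=0.01, gamma="scale"),
    "Local Outlier Factor": LocalOutlierFactor(n_neighbors=10, novelty=True),
}
# Outlier detection: the training set is polluted (about 10% outliers here).
polluted = np.vstack([inliers, outliers])
outlier_models = {
    "Elliptic Envelope": EllipticEnvelope(contamination=0.1, random_state=0),
    "Isolation Forest": IsolationForest(contamination=0.1, random_state=0),
}

# Five new inlier-like samples followed by five outlier-like samples.
new_samples = np.vstack([rng.normal(0.0, 1.0, (5, 4)),
                         rng.normal(8.0, 1.0, (5, 4))])
predictions = {}
for name, model in novelty_models.items():
    predictions[name] = model.fit(inliers).predict(new_samples)
for name, model in outlier_models.items():
    predictions[name] = model.fit(polluted).predict(new_samples)

for name, labels in predictions.items():
    print(name, labels)  # +1 marks an inlier, -1 an outlier
```

All four estimators share the same `fit`/`predict` interface, which is what allows a single training loop to be reused across them.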
Multiclass Classification
Starting from the results obtained in previous work, we trained a KNN on the described data set and optimized its accuracy. Note that, unlike anomaly detection, the training and validation sets are formed only with samples of the substances of interest.
In both cases, anomaly detection and multiclass classification, a grid search approach has been chosen to optimize the models’ accuracy. All the model parameters are detailed in Table 2 and Table 3.
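As an illustration of the grid search idea, the sketch below exhaustively evaluates a small grid of kNN hyperparameters on a validation set and keeps the most accurate combination. It is a dependency-free toy, not the code used for the experiments: the brute-force kNN, the two synthetic classes "A" and "B", and the reduced grid are all assumptions made for the example.

```python
import math
import random
from collections import defaultdict
from itertools import product


def knn_predict(train, query, k, weights):
    """Brute-force kNN vote; `train` is a list of (features, label) pairs."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = defaultdict(float)
    for features, label in nearest:
        d = math.dist(features, query)
        votes[label] += 1.0 if weights == "uniform" else 1.0 / (d + 1e-12)
    return max(votes, key=votes.get)


def accuracy(train, validation, k, weights):
    hits = sum(knn_predict(train, x, k, weights) == y for x, y in validation)
    return hits / len(validation)


def grid_search(train, validation, grid):
    """Try every parameter combination and keep the most accurate one."""
    best_params, best_acc = None, -1.0
    for k, weights in product(grid["n_neighbors"], grid["weights"]):
        acc = accuracy(train, validation, k, weights)
        if acc > best_acc:
            best_params, best_acc = {"n_neighbors": k, "weights": weights}, acc
    return best_params, best_acc


def blob(center, label, n):
    """Draw n noisy points around `center`, all tagged with `label`."""
    return [([random.gauss(c, 0.5) for c in center], label) for _ in range(n)]


# Synthetic two-class data (hypothetical labels, not the paper's substances).
random.seed(0)
data = blob((0.0, 0.0), "A", 60) + blob((4.0, 4.0), "B", 60)
random.shuffle(data)
train, validation = data[:90], data[90:]
grid = {"n_neighbors": [1, 5, 10], "weights": ["uniform", "distance"]}
best_params, best_acc = grid_search(train, validation, grid)
print(best_params, best_acc)
```

The cost of an exhaustive search grows with the product of the grid sizes, which is acceptable here because tuning is done offline and only the selected model is deployed.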
Finally, the entire system, composed of the data processing and detection systems, is shown in Fig. 10.
To find the best anomaly detection and multiclass classification models, cross-validation over the ten subsets of the data set (see Section 4.3 for more details) has been used. Once the best model of each type had been found, the entire system was tested on the test data.
It is important to point out that the proposed detection system does not rely on any pattern/trajectory recognition or time series analysis; in other words, it is time-independent. This feature allows us to build an IoT-ready system capable of detecting and recognizing the spilled substance based only on the current samples, as shown in Fig. 10. In this sense, we can refer to our system as an IoT-ready platform for real-time pollutant spill detection.
Algorithm 1. Training procedure
input: A dataset F representing a single Fold, list of classifiers to train, hyperparameters
output: Best Classifier
begin
    F_n = normalizeDataSet(F);
    for clf in classifiers do
        X_train, Y_train = loadTrainingData(F_n);
        X_validation, Y_validation = loadValidationData(F_n);
        for param in hyperparameters do
            clf.set_params(param);
            clf.fit(X_train);
            Y_pred = clf.predict(X_validation);
            Accuracy.append([clf, evaluate(Y_pred, Y_validation)]);
    clf_best = getBestClf(Accuracy);
    return clf_best;
end
Algorithm 2. Test procedure
input: A TestSet T, best anomaly model (anly), best multiclass model (clf), doAnomaly
output: [Accuracy, CM]
begin
    groundtruth = getGroundtruth(T_n)
    for sample in T_n do
        if doAnomaly then
            outClass.append(onlineClassification(sample));
        else
            sample_n = normalize(sample);
            state = getFsmState(sample_n);
            if state ≠ BS then
                outClass.append(BKG);
            else
                outClass.append(clf.predict(sample_n));
    ConfusionMatrix = evaluate(outClass, groundtruth);
    return [Accuracy, CM];
end
Algorithm 3. Online Classification procedure
input: Sample S, anomaly detection classifier (anly), multiclass classifier (clf)
output: predicted class (outClass)
begin
    S_n = normalize(S);
    state = getFsmState(S_n);
    if state ≠ BS then
        outClass = BKG;
    else
        if anly.predict(S_n) = inlier then
            outClass = clf.predict(S_n);
        else
            outClass = UNKNOWN;
    return outClass;
end

Table 2
Anomaly detection models parameters.
| Classifier | Parameter | Values |
| KNN | contamination | [0.01, 0.05, 0.1, ..., 0.5] |
| | N neighbors | [10, 100, 200, ..., 500] |
| SVM | ν | [0.01, 0.05, 0.15, ..., 1.0] |
| | Kernel | Radial basis function |
| | γ | [auto, scale, 0.01, 0.05, 0.15, ..., 1.0] |
| Local Outlier Factor | contamination | [0.01, 0.05, 0.1, ..., 0.5] |
| | N neighbors | [10, 100, 200, ..., 500] |
| Elliptic Envelope | contamination | [0.01, 0.05, 0.1, ..., 0.5] |
| Isolation Forest | contamination | [auto, 0.01, 0.05, 0.1, ..., 0.5] |
| | N estimators | [50, 100, 150, ..., 500] |

Table 3
Multiclass classification model parameters.
| Classifier | Parameter | Values |
| KNN | algorithm | ball tree |
| | N neighbors | [10, 100, 150, ..., 500] |
| | weights | [uniform, distance] |
Algorithms 1 and 2 show the pseudocode of the training and test procedures. As described in the previous sections, the training procedure is the same for both anomaly detection and multiclass classifiers. The test procedure, instead, has been built to test either the entire system (anomaly detection and multiclass classifier) or the multiclass classifier alone. For that reason, the test procedure takes as input an extra parameter, “doAnomaly”, which decides whether the test is performed on the multiclass model only (doAnomaly = FALSE) or on both the anomaly detection and multiclass models (doAnomaly = TRUE). In the latter case, the online classification procedure (Algorithm 3) is called. It is worth specifying that Algorithm 3 is the procedure implemented on the end-to-end system to perform online tests of the entire system. The time complexity of the overall chain is, in the worst case, the sum of the inference-time complexities of a kNN and an SVM, which is compatible with the computational capability of the selected MCU (Ray et al., 2021). The time complexity of the kNN algorithm is $O(nd)$, where $n$ is the number of samples in the training set and $d$ is the total number of features. The time complexity of our kernel SVM is linear in the number of support vectors $n_s$ and in the number of features $d$, and can be expressed as $O(n_s d)$. The elapsed time for both steps is around 500 ms on the ESP8266, allowing a complete evaluation between two acquisitions.
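The cascade of Algorithm 3 and the two complexity terms can be made concrete with a small, dependency-free sketch. Everything numeric here is hypothetical: the support vectors, coefficients, offset `rho`, training points, and the class labels `ACETIC_ACID` and `AMMONIA` are toy values chosen for illustration, and the convention that a non-negative decision value means "inlier" is an assumption borrowed from common one-class SVM implementations.

```python
import math
from collections import Counter

BKG, UNKNOWN = "BKG", "UNKNOWN"


def rbf_decision(sample, support_vectors, coefs, rho, gamma):
    """One-class SVM decision value: one RBF kernel per support vector, O(n_s * d)."""
    total = 0.0
    for sv, alpha in zip(support_vectors, coefs):
        sq_dist = sum((a - b) ** 2 for a, b in zip(sample, sv))
        total += alpha * math.exp(-gamma * sq_dist)
    return total - rho


def knn_predict(sample, train_x, train_y, k):
    """Brute-force kNN: one distance per training sample, O(n * d)."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], sample))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]


def online_classification(sample, state, svm, knn):
    """Cascade of Algorithm 3; `sample` is assumed to be already normalized."""
    if state != "BS":                     # no confirmed spill: background
        return BKG
    if rbf_decision(sample, **svm) >= 0:  # inlier: a substance of interest
        return knn_predict(sample, **knn)
    return UNKNOWN                        # outlier: rejected


# Toy models with hypothetical values (two features, two known substances).
svm = {"support_vectors": [(0.0, 0.0), (1.0, 1.0)], "coefs": [0.5, 0.5],
       "rho": 0.3, "gamma": 1.0}
knn = {"train_x": [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0)],
       "train_y": ["ACETIC_ACID", "ACETIC_ACID", "AMMONIA", "AMMONIA"],
       "k": 1}

print(online_classification((0.05, 0.0), "BT", svm, knn))
print(online_classification((0.05, 0.0), "BS", svm, knn))
print(online_classification((5.0, 5.0), "BS", svm, knn))
```

Because each sample is processed independently, the per-sample cost is bounded by one pass over the support vectors plus one pass over the kNN training set, which is what makes the worst case easy to budget on an MCU.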
5. Experimental results
For each case (anomaly detection and multiclass classification), the best model has been selected to verify the entire system on the test set. The obtained results are reported in the following subsections.
5.1. Anomaly detection results
The best results have been obtained in the Fold0 case. In Fold0, experiments 1 to 9 have been used as training and validation sets following the cross-validation technique, while experiment 0 of every substance has been used as the test set.
The best results are reported in Table 4, while the average and standard deviation (STD) obtained over all the splits of the Fold0 data set are reported in Table 5. The obtained results show almost the same performance across the best-performing algorithms, so it is not possible to declare a clear winner. For that reason, and since the application field of the proposed system best fits the novelty detection approach, the One-class SVM classifier has been used to test the entire system.
Furthermore, to statistically validate the obtained results, we performed the Wilcoxon rank-sum test ($\alpha = 0.05$); Table 5 also reports the corresponding p-values. The table shows that the performance differences among the three best-performing algorithms (One-Class SVM, Elliptic Envelope, and Isolation Forest) are not statistically significant (p-value > 0.05). For the Local Outlier Factor and KNN algorithms, instead, the p-value is < 0.05, highlighting a statistically significant difference in the obtained results. Finally, it is worth noting that the Wilcoxon test has been performed by evaluating all the chosen figures of merit (Accuracy, F1Score, and MCC).
Fig. 10. Entire system flow chart.
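For readers unfamiliar with the test, the sketch below implements a two-sided Wilcoxon rank-sum test with the usual normal approximation and applies it to made-up per-split accuracies. The three score lists are hypothetical and are not the values behind Table 5, and how the authors paired the algorithms for comparison is not stated, so the pairing shown here is an assumption. No tie correction is applied to the variance.

```python
from statistics import NormalDist


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, no tie correction)."""
    pooled = sorted((value, group)
                    for group, data in enumerate((x, y)) for value in data)
    # Assign 1-based ranks, averaging the ranks of tied values.
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    mean_w = n1 * (n1 + n2 + 1) / 2
    std_w = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (w - mean_w) / std_w
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical accuracies over nine splits for three algorithms.
algo_a = [0.931, 0.942, 0.953, 0.924, 0.960, 0.935, 0.946, 0.951, 0.918]
algo_b = [0.930, 0.944, 0.950, 0.926, 0.958, 0.937, 0.941, 0.955, 0.920]
algo_c = [0.55, 0.57, 0.58, 0.54, 0.59, 0.56, 0.60, 0.53, 0.57]

alpha = 0.05
for name, other in (("B", algo_b), ("C", algo_c)):
    z, p = rank_sum_test(algo_a, other)
    verdict = "significant" if p < alpha else "not significant"
    print(f"A vs {name}: z = {z:.3f}, p = {p:.4g} ({verdict})")
```

With samples this small an exact test would be preferable; the normal approximation is used only to keep the example short.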
The reported figures of merit have been computed with the following formulas:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (4)$$
$$\text{F1Score} = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}, \quad \text{where} \quad \text{precision} = \frac{TP}{TP + FP}, \quad \text{recall} = \frac{TP}{TP + FN} \qquad (5)$$
$$\text{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \qquad (6)$$
where the True Positives (TP) are the outlier samples classified as outliers, the True Negatives (TN) are the inlier samples classified as inliers, the False Positives (FP) are the inlier samples classified as outliers, and the False Negatives (FN) are the outlier samples classified as inliers.
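Eqs. 4 to 6 translate directly into code. The function below is a small sketch of that computation; the confusion counts in the example are hypothetical and chosen only to make the arithmetic easy to follow, and the function does not guard against empty rows or columns of the confusion matrix (which would cause a division by zero).

```python
import math


def figures_of_merit(tp, tn, fp, fn):
    """Return (accuracy, F1 score, MCC) from binary confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return accuracy, f1, mcc


# Hypothetical counts: 40 outliers all caught, 10 of 60 inliers rejected.
acc, f1, mcc = figures_of_merit(tp=40, tn=50, fp=10, fn=0)
print(f"accuracy={acc:.4f}  F1={f1:.4f}  MCC={mcc:.4f}")
```

MCC is the most conservative of the three on unbalanced data, since it only approaches 1 when both classes are predicted well.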
Table 4
Best results anomaly detection.
| Approach | Algorithm | Accuracy | F1 Score | MCC | Parameters | Split |
| Novelty | One-Class SVM | 0.9546 | 0.8868 | 0.8675 | ν = 0.01, Kernel = rbf, γ = 0.45 | 6 |
| Novelty | KNN | 0.5982 | 0.0938 | 0.2485 | contamination = 0.45, N neighbors = 10 | 1 |
| Novelty | Local Outlier Factor | 0.8624 | 0.7639 | 0.7123 | contamination = 0.01, N neighbors = 400 | 7 |
| Outlier | Elliptic Envelope | 0.9545 | 0.8864 | 0.8671 | contamination = 0.05 | 6 |
| Outlier | Isolation Forest | 0.9547 | 0.8872 | 0.8679 | contamination = 0.1, N estimators = 350 | 4 |

Table 5
CrossValidation0 results (mean ± standard deviation).
| Algorithm | Accuracy | F1Score | MCC | p-value |
| One-Class SVM | 0.9358 ± 0.0171 | 0.8474 ± 0.0352 | 0.8115 ± 0.0497 | |
| KNN | 0.5651 ± 0.0226 | 0.0902 ± 0.0018 | 0.2762 ± 0.0188 | 5.6e-6 |
| Local Outlier Factor | 0.8201 ± 0.0346 | 0.7137 ± 0.0382 | 0.6519 ± 0.0463 | 5.6e-6 |
| Elliptic Envelope | 0.9325 ± 0.0157 | 0.8448 ± 0.0311 | 0.8069 ± 0.0445 | 0.4860 |
| Isolation Forest | 0.9463 ± 0.0130 | 0.8689 ± 0.0277 | 0.8423 ± 0.0391 | 0.7317 |
Fig. 11. Multiclass classifier results.
5.2. Multiclass classifier results
The best result in Fold0 is obtained by the KNN algorithm with a number of neighbors (N) equal to 10 and uniform weights. The obtained accuracy is 99.37%. In terms of mean accuracy and standard deviation over the nine folds contained in Fold0, the obtained result is A% = (91.0 ± 5.7)%.
5.3. Entire system results
Two main tests have been carried out on the entire system to highlight the benefits of placing the anomaly detection before the multiclass classifier: one with only the multiclass classifier, and one with anomaly detection plus multiclass classifier. The obtained results are shown in Figs. 11 and 12. As can be seen from the two confusion matrices, the outlier substances used are:
- Dish Wash Detergent (DW_DETERGEMT)
- Nelsen (INT_NELSEN)
- Washing Machine Detergent (WM_DETERGENT)
- Sodium Chloride (SODIUM_CHLORIDE)
- Sodium Hypochlorite (SODIUM_HYPOCHLORITE)
With the multiclass classifier only, the outlier substances are erroneously confused with one of the known substances, generating many false positive alarms. To solve this problem, as described in the previous sections, an anomaly detection system working as a false positive reduction filter has been added before the multiclass classifier.
As reported in Fig. 12, with the addition of the anomaly detection system, most outlier samples are correctly labeled as “UNKNOWN”. More precisely, 79.4% of the outlier samples have been correctly labeled as “UNKNOWN”, while the remaining 20.6%, which corresponds to all the sodium hypochlorite samples, is mostly confused with hydrogen peroxide (consistently with what is shown in Fig. 11).
Finally, as already discussed in Section 5.1, since the best-performing algorithms show almost the same performance (see Tables 4 and 5), and given the application field of the proposed system, the One-class SVM classifier has been used for the reported results.
6. Field tests
As field tests, two preliminary experiments were conducted in real scenarios: one at the Acqualatina treatment plant in Borgopiave (Latina, Italy) (see Fig. 13), and the second on a series of wells located in Via Castelbottaccio (Rome, Italy), in collaboration with ACEA S.p.A. (Azienda comunale energia e ambiente, 2022) (see Fig. 14).
A flotation system has been designed to allow the immersion of the SCW in water at the proper depth, as shown in Fig. 15.
To install the sensing system completely inside the manhole, a prototype of the measurement system has been developed, as shown in Fig. 16. The measurement system is composed of an IP56-certified waterproof box, a Raspberry Pi 4, a GSM HAT for Raspberry Pi based on the SIM7600E-H with two external antennas, a 20000 mAh power bank, and an ESP8266 board connected to the SCW via a 10 m SENSIBUS cable. This configuration can ensure continuous measurement and transmission for about 1 week. The prototype presented here was designed for a field test without paying attention to energy consumption. A solution based exclusively on an MCU (without the Raspberry Pi) and on the LoRaWAN (Long Range Wide Area Network) standard could work for several months of continuous monitoring with periodic transmissions, exploiting an adequate local memory.
In this context, many substances have been tested: Phosphoric Acid, Sodium Hypochlorite, Acetic Acid, Formic Acid, Ammonia, and Hydrogen Peroxide. Unlike in the laboratory tests, many problems were encountered in the real context, such as the accumulation of scale or air bubbles near the sensors of the SCW. After some preliminary tests, part of these problems was solved and, although other problems were still present, it was still possible to reach an accuracy of more than 80%.
Fig. 12. Entire system results.
7. Discussion
As shown in the Experimental Results section, when an outlier sample is given as input to a multiclass classifier, the output necessarily falls under one of the known classes. This behavior generates a number of false alarms equal to the number of outlier samples, making the system useless in a real scenario. The results shown in Fig. 11 clarify the drawbacks of using only the multiclass classifier to recognize a given substance. In this case, indeed, 100% of the outlier samples (dish wash detergent, Nelsen, washing machine detergent, sodium chloride, and sodium hypochlorite) have been confused mainly with sulphuric acid and hydrogen peroxide, generating a large number of false alarms. A multiclass classifier used alone cannot reject any outlier sample. For that reason, an anomaly detection module has been introduced as a false alarm filter.
Tables 4 and 5 show the results obtained by the anomaly detection system. As can be seen, the performances of the One-Class SVM, Elliptic Envelope, and Isolation Forest are very similar, which means that the three algorithms can be used interchangeably. Moreover, to statistically validate the obtained results, the Wilcoxon rank-sum test ($\alpha = 0.05$) has been performed, and the results, shown in Table 5, did not show any significant difference among these three algorithms. Finally, the chosen figures of merit show that these anomaly detection algorithms are able to correctly distinguish between outlier and normal samples.
Fig. 13. Borgopiave. The green circle represents the sensing manhole, while red circles represent the spiking manhole positioned at 60 m from the sensing manhole.
Fig. 14. Castelbottaccio. The green circle represents the sensing manhole, while the red circles represent the spiking manholes positioned at 50 m, 75 m, and 150 m, respectively, from the sensing manhole.
At this point, by putting the anomaly detection and multiclass classifier systems together, it has been possible to reach the results shown in Fig. 12. The results show that the anomaly detection system has rejected most outlier samples (around 80%) by labeling them as “UNKNOWN”. As can be seen, the sodium hypochlorite samples have been mainly confused with hydrogen peroxide. Even though this behavior worsens the system’s performance, it can still be considered acceptable. Indeed, even if the two substances are chemically different (sodium hypochlorite is an ionic salt, while hydrogen peroxide is a covalent molecular compound), they have a similar oxidation potential: 1.6 V for sodium hypochlorite and 1.75 V for hydrogen peroxide (Vanýsek, 2010). Moreover, among all the substances of interest, hydrogen peroxide is the only compound that can be considered a strong oxidant (in a range that goes from +3 V for the oxidants to −3 V for the reducers). This similarity is particularly evident in the measurements at 78 kHz. For all these reasons, the confusion between sodium hypochlorite and hydrogen peroxide can be considered acceptable.
As far as the substances of interest are concerned, Fig. 12 shows that the multiclass classifier results (see Fig. 11) are substantially maintained. Indeed, the overall accuracy reached by the entire system over the substances of interest is around 98.44%, with an accuracy loss of 0.93 percentage points with respect to the multiclass classifier alone (99.37%).
These considerations confirm the improvement obtained by placing an anomaly detection system before the multiclass classifier. The entire system has been able to reject around 80% of the outlier samples and to correctly recognize around 98.44% of the inlier samples. Finally, in a real scenario, the presence of an anomaly detection module is clearly of vital importance for the usefulness of the system itself.
In conclusion, after the analysis discussed above, we report what we believe are the major weaknesses of the system. From the classification system point of view, as already discussed, one problem is related to substances that share some chemical properties. Another weakness is related to sensor poisoning. In more detail, some laboratory tests have shown that the SCW’s sensors are particularly sensitive to acidic substances, which can poison the sensors and temporarily inhibit their ability to distinguish different substances. The poisoning problem is a complex one that can be caused by many other substances but, to the best of our knowledge, acids are the most difficult to recover from. A further problem of the classification system is its computational complexity, which tends to increase linearly with the number of learned substances; as a consequence, a suitable MCU has to be selected.
As far as the field tests are concerned, another weakness emerged, related to solid waste, which can get stuck on the SCW’s sensors and alter all the measurements, leading, in the worst cases, to a degradation of the overall system performance.
Fig. 15. The flotation system for the immersion of the SCW. Some debris is highlighted in red.
Fig. 16. Measurement system prototype.
Although the proposed algorithm could be applied to any array of sensors for real-time spill recognition, to the best of our knowledge there are neither similar publicly available data sets nor papers that follow a similar approach. This makes it impossible to provide experimental results other than those obtained from our data. At the same time, many papers in the recent scientific literature address the problem of wastewater analysis; still, none of them uses an array of generic IDEs for the real-time recognition of different substances.
We also discarded the possibility of treating the signals during the injection as time series, because their pattern over time would depend heavily on the injection rate of the substance (quantity per second) rather than on its nature. We preferred an approach in which the orthogonality of the adopted sensors, read “instantaneously”, provides the right information without considering what happens in the time dimension.
As a final remark, Deep Learning Algorithms (DLAs) (LeCun et al., 2015) are well established in many fields. However, we have not considered such an approach in this paper because the feature space obtained from the sensors is very small. DLAs are well established for time series, images, Natural Language Processing, etc., but their effectiveness is questionable when the feature space is too small. This assumption is also supported by a previous paper (Molinara et al., 2020), in which Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) models were considered and were outperformed by traditional machine learning techniques such as Multi-Layer Perceptron or kNN.
8. Conclusions and future directions
In conclusion, the proposed work aimed to develop a stable and robust detection system capable of working in an aggressive environment such as the sewage network. In such a complex environment, many different substances can be present, including some whose danger level is not significant and which, therefore, should not be detected by the proposed system. However, a classical supervised ML approach would recognize any substance as one of those belonging to the training set. The important novelty introduced and proven effective in this work is a two-stage scheme that strongly reduces false alarms while keeping the classification accuracy very high. To do that, a Finite State Machine was designed to filter, process, and normalize the measured sensor data, and then a detection system was built. The detection system is divided into two main parts: a One-class SVM classifier, an anomaly detection algorithm whose purpose is to reject all the samples belonging to unknown substances, and a KNN multiclass classifier, which recognizes the given substance among those of interest.
The results shown in Section 5 indicate that the developed system works as intended, drastically reducing the false positive errors caused by the outlier samples and keeping the accuracy on the substances of interest higher than 0.93 in all considered cases.
As concerns future developments, the authors would like to continue testing in real scenarios, to validate quantitatively the promising results obtained in the laboratory and to strengthen the generalization capability of the proposed system. Furthermore, the sodium hypochlorite confusion shown in Fig. 12, discussed in Section 7, suggests that substances with some common chemical properties could be confused by the anomaly detection system. A possible way to reduce this phenomenon would be to investigate an optimal set of orthogonal features that exploit the chemical differences to maximize the overall system performance.
Declaration of Competing Interest
The authors declare that they have no known competing finan-
cial interests or personal relationships that could have appeared
to influence the work reported in this paper.
Acknowledgment
The research leading to these results has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement SYSTEM No. 787128. The authors are solely responsible for this work; it does not represent the opinion of the Community, and the Community is not responsible for any use that might be made of the information contained therein. This work was also supported by MIUR (Minister for Education, University and Research, Law 232/216, Department of Excellence).
References
Akhter, F., Siddiquei, H.R., Alahi, M.E.E., Jayasundera, K.P., Mukhopadhyay, S.C.,
2022. An iot-enabled portable water quality monitoring system with mwcnt/
pdms multifunctional sensor for agricultural applications. IEEE Internet Things
J. 9 (16), 14307–14316.
Alam, A.U., Clyne, D., Jin, H., Hu, N.-X., Deen, M.J., 2020. Fully integrated, simple, and
low-cost electrochemical sensor array for in situ water quality monitoring. ACS
Sensors 5 (2), 412–422.
Alavi, S.M.M., Mahdi, A., Payne, S.J., Howey, D.A., 2017. Identifiability of generalized
randles circuit models. IEEE Trans. Control Syst. Technol. 25 (6), 2112–2120.
”Azienda comunale energia e ambiente.” https://www.acea.it/, 2022. [Online:
accessed 11-November-2022].
Bansal, S., Geetha, G., 2020. A machine learning approach towards automatic water
quality monitoring. J. Water Chem. Technol. 42 (5), 321–328.
Betta, G., Cerro, G., Ferdinandi, M., Ferrigno, L., Molinara, M., 2019. Contaminants
detection and classification through a customized iot-based platform: A case
study. IEEE Instrument. Measur. Mag. 22 (6), 35–44.
Bogler, A., Packman, A., Furman, A., Gross, A., Kushmaro, A., Ronen, A., Dagot, C., Hill,
C., Vaizel-Ohayon, D., Morgenroth, E., et al., 2020. Rethinking wastewater risks
and monitoring in light of the covid-19 pandemic. Nat. Sustainab. 3 (12), 981–
990.
Bourelly, C., Bria, A., Ferrigno, L., Gerevini, L., Marrocco, C., Molinara, M., Cerro, G.,
Cicalini, M., Ria, A., 2020. A preliminary solution for anomaly detection in water
quality monitoring. In: 2020 IEEE International Conference on Smart Computing
(SMARTCOMP), pp. 410–415.
Bria, A., Cerro, G., Ferdinandi, M., Marrocco, C., Molinara, M., 2020. An iot-ready
solution for automated recognition of water contaminants. Pattern Recogn. Lett.
135, 188–195.
Bria, A., Ferrigno, L., Gerevini, L., Marrocco, C., Molinara, M., Bruschi, P., Cicalini, M.,
Manfredini, G., Ria, A., Cerro, G., Simmarano, R., Teolis, G., Vitelli, M., 2021. A
false positive reduction system for continuous water quality monitoring. In
2021 IEEE International Conference on Smart Computing (SMARTCOMP), pp.
311–316.
Budiarti, R.P.N., Tjahjono, A., Hariadi, M., Purnomo, M.H., 2019. Development of iot
for automated water quality monitoring system. In: 2019 International
Conference on Computer Science, Information Technology, and Electrical
Engineering (ICOMITEE), pp. 211–216.
Desmet, C., Degiuli, A., Ferrari, C., Romolo, F.S., Blum, L., Marquette, C., 2017.
Electrochemical sensor for explosives precursors’ detection in water. Challenges
8 (1), pp.
De Vito, S., Fattoruso, G., Esposito, E., Salvato, M., Agresta, A., Panico, M., Leopardi, A.,
Formisano, F., Buonanno, A., Delli Veneri, P., Di Francia, G., 2018. A distributed
sensor network for waste water management plant protection. In: Andò, B.,
Baldini, F., Di Natale, C., Marrazza, G., Siciliano, P. (Eds.), Sensors, Springer
International Publishing, Cham, pp. 303–314.
Dilmi, S., Ladjal, M., 2021. A novel approach for water quality classification based on
the integration of deep learning and feature extraction techniques. Chemomet.
Intell. Lab. Syst. 214, 104329.
Drenoyanis, A., Raad, R., Wady, I., Krogh, C., 2019. Implementation of an iot based
radar sensor network for wastewater management. Sensors 19 (2), pp.
Dupont, C., Cousin, P., Dupont, S., 2018. Iot for aquaculture 4.0 smart and easy-to-
deploy real-time water monitoring with iot. In: 2018 Global Internet of Things
Summit (GIoTS). IEEE, pp. 1–5.
Farkas, K., Hillary, L.S., Malham, S.K., McDonald, J.E., Jones, D.L., 2020. Wastewater
and public health: the potential of wastewater surveillance for monitoring
covid-19. Current Opin. Environ. Sci. Health 17, 14–20. Environmental Health:
COVID-19.
Ferdinandi, M., Molinara, M., Cerro, G., Ferrigno, L., Marroco, C., Bria, A., Di Meo, P.,
Bourelly, C., Simmarano, R., 2019. A novel smart system for contaminants
detection and recognition in water. In: 2019 IEEE international conference on
smart computing (SMARTCOMP). IEEE, pp. 186–191.
Hoes, O., Schilperoort, R., Luxemburg, W., Clemens, F., van de Giesen, N., 2009.
Locating illicit connections in storm water sewers using fiber-optic distributed
temperature sensing. Water Res. 43 (20), 5187–5197.
Ighalo, J.O., Adeniyi, A.G., Marques, G., 2021. Internet of things for water quality
monitoring and assessment: a comprehensive review. Artificial intelligence for
sustainable development: theory, practice and future applications, pp. 245–
259.
Janna, H., 2016. Characterisation of raw sewage and performance evaluation of al-
diwaniyah sewage treatment work, Iraq. World J. Eng. Technol. 4 (2), 296–304.
Ji, H.W., Yoo, S.S., Lee, B.-J., Koo, D.D., Kang, J.-H., 2020. Measurement of wastewater
discharge in sewer pipes using image analysis. Water 12 (6), pp.
Junior, A.C.D.S., Munoz, R., Quezada, M.D.L.A., Neto, A.V.L., Hassan, M.M.,
Albuquerque, V.H.C.D., 2021. Internet of water things: A remote raw water
monitoring and control system. IEEE Access 9, 35790–35800.
Kamaruidzaman, N.S., Rahmat, S.N., 2020. Water monitoring system embedded
with internet of things (IoT) device: A review. IOP Conf. Series: Earth Environ.
Sci. 498, 012068.
Koditala, N.K., Pandey, P.S., 2018. Water quality monitoring system using iot and
machine learning. In: 2018 International Conference on Research in Intelligent
and Computing in Engineering (RICE). IEEE, pp. 1–5.
LeCun, Y., Bengio, Y., Hinton, G., 2015. Deep learning. Nature 521, 436–44.
Lepot, M., Makris, K.F., Clemens, F.H., 2017. Detection and quantification of lateral,
illicit connections and infiltration in sewers with infra-red camera: Conclusions
after a wide experimental plan. Water Res. 122, 678–691.
Lim, J, 2012. Mobile sensor network to monitor wastewater collection pipelines.
https://escholarship.org/uc/item/0d9813bn, [Online: accessed 11-November-
2022].
Lowe, M., Qin, R., Mao, X., 2022. A review on machine learning, artificial intelligence,
and smart technology in water treatment and monitoring. Water 14 (9), 1384.
Manfredini, G., Ria, A., Bruschi, P., Gerevini, L., Vitelli, M., Molinara, M., Piotto, M.,
2021. An asic-based miniaturized system for online multi-measurand
monitoring of lithium-ion batteries. Batteries 7 (3), pp.
Molinara, M., Ferdinandi, M., Cerro, G., Ferrigno, L., Massera, E., 2020. An end to end
indoor air monitoring system based on machine learning and sensiplus
platform. IEEE Access 8, 72204–72215.
Nopens, I., Capalozza, C., Vanrolleghem, P.A., 2001. Stability analysis of a synthetic
municipal wastewater. Department of Applied Mathematics Biometrics and
Process Control, University of Gent, Belgium.
Overmars, A., Venkatraman, S., 2020. Towards a secure and scalable iot
infrastructure: A pilot deployment for a smart water monitoring system.
Technologies 8 (4), 50.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel,
M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A.,
Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E., 2011. Scikit-learn:
Machine learning in Python. J. Machine Learn. Res. 12, 2825–2830.
Pisa, I., Santín, I., Vicario, J.L., Morell, A., Vilanova, R., 2019. ANN-based soft sensor to
predict effluent violations in wastewater treatment plants. Sensors 19 (6), pp.
"Public link for downloading the acquired dataset." https://aida.unicas.it/data/JKSU_2022.zip,
2022. [Online: accessed 11-November-2022].
Ray, S., 2021. An analysis of computational complexity and accuracy of two
supervised machine learning algorithms—k-nearest neighbor and support
vector machine. In: Sharma, A., Chakrabarti, A., Balas, V.E., Martinovic, J.
(Eds.), Data Management, Analytics and Innovation (N. Springer Singapore,
Singapore, pp. 335–347.
Ria, A., Cicalini, M., Manfredini, G., Catania, A., Piotto, M., Bruschi, P., 2022. The
SENSIPLUS: A single-chip fully programmable sensor interface. In: International
Conference on Applications in Electronics Pervading Industry, Environment and
Society. Springer, pp. 256–261.
Saravanan, K., Anusuya, E., Kumar, R., Son, L.H., 2018. Real-time water quality
monitoring using internet of things in SCADA. Environ. Monit. Assess. 190 (9), 1–16.
"Sewage monitoring system for tracking synthetic drug laboratories." http://micromole.eu/,
2022. [Online: accessed 11-November-2022].
Trubetskaya, A., Horan, W., Conheady, P., Stockil, K., Merritt, S., Moore, S., 2021. A
methodology for assessing and monitoring risk in the industrial wastewater
sector. Water Resourc. Ind. 25, 100146.
Tyszczuk-Rotko, K., Kozak, J., Czech, B., 2022. Screen-printed voltammetric sensors
tools for environmental water monitoring of painkillers. Sensors 22 (7), pp.
Vanýsek, P., 2010. Electrochemical series.
Vikesland, P.J., 2018. Nanosensors for water quality monitoring. Nat. Nanotechnol.
13 (8), 651–660.
Zhao, Y., Nasrullah, Z., Li, Z., 2019. PyOD: A Python toolbox for scalable outlier
detection. J. Machine Learn. Res. 20 (96), 1–7.