Conference Paper

Calibrating low-cost air quality sensors using multiple arrays of sensors

... However, other approaches have also been suggested as alternatives to ML methods for calibration, such as the work performed in ref. [33], although a deeper knowledge of the models and parameters of the environment is still required. The use of up to four sensors along with sensor fusion and MLR techniques has been used for calibration in ref. [34]. The authors of ref. [34] showed that the calibration results obtained from their proposed techniques led to better accuracy than that of the best of the four sensors [34]. ...
... The use of up to four sensors along with sensor fusion and MLR techniques has been used for calibration in ref. [34]. The authors of ref. [34] showed that the calibration results obtained from their proposed techniques led to better accuracy than that of the best of the four sensors [34]. Other techniques for sensor fusion have focused on statistical methods for combining data [35,36]. ...
Article
Full-text available
The use of inexpensive, lightweight, and portable particulate matter (PM) sensors is increasingly becoming popular in air quality monitoring applications. As an example, these low‐cost sensors can be used in surface or underground coal mines for monitoring of inhalable dust, and monitoring of inhalable particles in real‐time can be beneficial as it can possibly assist in preventing coal mine related respiratory diseases such as black lung disease. However, commercially available PM sensors are not inherently calibrated, and as a result, they have vague and unclear measurement accuracy. Therefore, they must initially be evaluated and compared with standardised instruments to be ready to be deployed in the fields. In this study, three different types of inexpensive, light‐scattering‐based widely available PM sensors (Shinyei PPD42NS, Sharp GP2Y1010AU0F, and Laser SEN0177) are evaluated and calibrated with reference instruments. PM sensors are compared with reference instruments in a controlled environment. The calibration is done by means of different machine learning techniques. The results demonstrate that the calibrated response obtained by fusion of sensors has a higher accuracy in comparison to the calibrated response of each individual sensor.
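The fusion result reported in the abstract above has a simple explanation: a multiple-linear-regression model over several sensors nests each single-sensor linear calibration as a special case, so the fused fit can never be worse on the training data. A minimal sketch in Python/NumPy, with invented sensor gains, offsets, and noise levels (not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
truth = rng.uniform(10, 80, n)                 # synthetic reference PM concentration

# three hypothetical low-cost sensors with invented gain/offset errors and noise
s1 = 0.7 * truth + 5 + rng.normal(0, 4, n)
s2 = 1.3 * truth - 3 + rng.normal(0, 6, n)
s3 = 0.9 * truth + 1 + rng.normal(0, 3, n)

def calib_rmse(*signals):
    """Least-squares calibration of the given signals against the reference."""
    A = np.column_stack(signals + (np.ones(n),))   # design matrix with intercept
    coef, *_ = np.linalg.lstsq(A, truth, rcond=None)
    resid = truth - A @ coef
    return float(np.sqrt(np.mean(resid ** 2)))

rmse_single = [calib_rmse(s) for s in (s1, s2, s3)]
rmse_fused = calib_rmse(s1, s2, s3)            # fusion: MLR over all three sensors
```

On this synthetic data the fused RMSE is no larger than the best single-sensor RMSE, mirroring the paper's qualitative finding.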
... Moreover, some gas sensors tend not to be selective enough and to be perturbed by other gas concentrations, which is particularly challenging, especially for metal-oxide sensors [179,260]. For this reason, and since LCS are usually miniaturized, it has become common practice to place in the same sensing device several sensors whose target concentrations influence each other [13,188]. The sensor array thus houses two or more sensors sensing different physical phenomena. ...
... Another interesting finding can be found in [240], where the authors show that their zero-calibration protocol, which is based on multiple least squares, efficiently corrected the observed drift of the low-cost sensor output. Other findings can be seen in [13,125,192,213]. ...
... A variant of Eq. (3.78) was proposed in [227]. In their formalism, the authors only consider two matrices to jointly factorize and add a discrepancy term ...
Thesis
Full-text available
Air pollution poses substantial health issues, with several hundred thousand premature deaths in Europe each year. Effective air quality monitoring is thus a major task for environmental agencies. It is usually carried out by a few highly accurate monitoring stations. However, these stations are expensive and limited in number, thus providing a low spatio-temporal resolution. The deployment of low-cost sensors (LCS) promises a complementary solution with lower cost and higher spatio-temporal resolution. Unfortunately, LCS tend to drift over time, and their high number prevents regular in-lab calibration. Data-driven techniques named in-situ calibration have thus been proposed. In particular, revisiting mobile sensor calibration as a matrix factorization problem seems promising. However, existing approaches are based on slow methods, are not suited for large-scale problems involving hundreds of sensors deployed over a large area, and are designed for short-term deployments. To solve both issues, compressive non-negative matrix factorization is proposed in this thesis, which is divided into two parts. In the first part, we investigate the enhancement provided by random projections for weighted non-negative matrix factorization. We show that these techniques can significantly speed up large-scale and low-rank matrix factorization methods, thus allowing the fast estimation of missing entries in low-rank matrices. In the second part, we revisit mobile heterogeneous sensor calibration as an informed factorization of large matrices with missing entries. We thus propose fast informed matrix factorization approaches, and in particular informed extensions of the compressive methods proposed in the first part, which are found to be well-suited for the considered problem.
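The weighted non-negative matrix factorization with missing entries mentioned in the abstract can be illustrated with the classical multiplicative updates restricted to observed entries via a binary mask. This is a generic sketch, not the thesis's compressive algorithm; the matrix sizes, rank, and mask density are invented:

```python
import numpy as np

rng = np.random.default_rng(2)
# low-rank ground truth with roughly 40% of entries missing
W0, H0 = rng.random((30, 3)), rng.random((3, 40))
X = W0 @ H0
M = (rng.random(X.shape) < 0.6).astype(float)   # 1 = observed, 0 = missing

def masked_err(W, H):
    """Frobenius error computed on observed entries only."""
    return float(np.linalg.norm(M * (X - W @ H)))

# random positive initialization, then masked multiplicative updates
W = rng.random((30, 3)) + 0.1
H = rng.random((3, 40)) + 0.1
err_start = masked_err(W, H)
eps = 1e-12                                      # guard against division by zero
for _ in range(200):
    W *= ((M * X) @ H.T) / ((M * (W @ H)) @ H.T + eps)
    H *= (W.T @ (M * X)) / (W.T @ (M * (W @ H)) + eps)
err_end = masked_err(W, H)
```

The masked updates leave missing entries out of the fit entirely; the learned product W @ H then provides estimates for them, which is the "estimation of missing entries in low-rank matrices" the abstract refers to.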
... Among those proposed, this review focused on the following pollutants: nitrogen dioxide (NO2), ozone (O3), carbon monoxide (CO), volatile organic compounds (VOCs) and airborne particulate matter (PM) with an aerodynamic diameter below 2.5 µm (PM2.5) and below 10 µm (PM10) (the coding "PM" was applied to categorize NGMSs that can simultaneously analyze more than one fraction of particulate matter) (Table 2). After the full-text reading step, it emerged that some pollutants were poorly investigated and the available evidence did not allow for an extensive discussion: for this reason, NGMSs for NO [25,[27][28][29][30][31], NOx [32,33], CO2 [31,[34][35][36], SO2 [37,38], and BC [39] were not discussed in this review. As a first result, we found that the most commonly used sensors to monitor the selected air pollutant gases are those produced by Alphasense (www.alphasense.com; ...
... It allows easy and stable communication between NGMSs and a smartphone on which the mobile app runs. In this review, 23 articles [19,27,32,35,[37][38][39]50,51,56,60,67,69,77,[79][80][81][82][83][84][85][86] out of 67 report information about the use of a mobile app supporting NGMSs; most of these (13 apps) were developed on the Android platform [6,35,37,38,40,50,53,60,81,84,[86][87][88], only one was developed on the iOS platform [81], and the remaining were not specified. As reported by Kanjo et al. [89], using a mobile phone to collect data can bring many advantages, especially related to the fact that (i) a large percentage of the population carries mobile phones around; (ii) many kinds of data can be processed, stored, and transferred easily by mobile phones; (iii) the collection of data can be more power-efficient because the acquired information is sent directly to the mobile phone. ...
... One of the reasons encouraging scientists to continuously develop NGMSs is the fact that these devices can be used to investigate air quality together with citizens to support well-informed actions and communicate the problems regarding this topic to the general population, raising the attention of politicians. Few studies underline the importance of citizen science in the field of air quality to evaluate the impact of everyday citizen life and daily routine [32,50,56,60,92]. In 2018, the ECSA (European Citizen Science Association) [93] created a collaboration between scientists and industries with the aim to encourage networking in the field of citizen science, aiming to reach a constant fueling of resources, not only economic but also in terms of ideas, research needs, and initiatives. ...
Article
Full-text available
In recent years, the issue of exposure assessment of airborne pollutants has been on the rise, both in the environmental and occupational fields. Increasingly severe national and international air quality standards, indoor air guidance values, and exposure limit values have been developed to protect the health of the general population and workers; this issue required a significant and continuous improvement in monitoring technologies to allow the execution of proper exposure assessment studies. One of the most interesting aspects in this field is the development of the "next-generation" airborne pollutant monitors and sensors (NGMS). The principal aim of this review is to analyze and characterize the state of the art of NGMSs and their practical applications in exposure assessment studies. A systematic review of the literature was performed analyzing outcomes from three different databases (Scopus, PubMed, ISI Web of Knowledge); a total of 67 scientific papers were analyzed. The reviewing process was conducted systematically with the aim of extracting information about the specifications, technologies, and applicability of NGMSs in both environmental and occupational exposure assessment. The principal results of this review show that the use of NGMSs is becoming increasingly common in the scientific community for both environmental and occupational exposure assessment. The available studies outlined that NGMSs cannot be used as reference instrumentation in air monitoring for regulatory purposes, but at the same time, they can be easily adapted to more specific applications, improving exposure assessment studies in terms of spatiotemporal resolution, wearability, and adaptability to different types of projects and applications. Nevertheless, improvements needed to further enhance NGMS performance and allow their wider use in the field of exposure assessment are also discussed.
... Gas sensors, such as CO, O3, and NO2 sensors, are examples [15,16,27,28,29,30,31,32,33,34] of sensors that follow this approach. (The input parameters are called predictors, features, independent variables, or simply variables in machine learning and statistical learning terminology; the output is also called the response or dependent variable.) ...
... To finalize the set of examples: to monitor air pollution (Table 4), we mainly need arrays of sensors, because pollutant readings depend on several factors (Table 1). In general, air pollution sensors are analyzed using multiple linear regression (MLR) [12,15,28,30,31,33,55], although when nonlinearities due to the chemical composition of the sensor appear, techniques such as artificial neural networks (ANN), support-vector regression (SVR), or random forests are used [16,27,31,42]. Table 2: Simple applications include light, temperature, relative humidity, vibration, and accelerometer sensors. ...
... This technique is called an array of sensors and aims to reduce the uncertainty of the calibration parameters in the data model. Arrays of different classes of gas sensors [16,27,28,30,32,33] have proven useful to qualitatively identify gas species using pattern recognition approaches and to quantitatively determine gas composition based on regression models. An example is a NO2 sensor that is regressed using NO2, O3, temperature, and relative humidity sensors. ...
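The NO2 example above, regressing a reference NO2 concentration on the raw NO2 signal plus O3, temperature, and relative humidity, can be sketched as follows. All data are synthetic and the cross-sensitivity coefficients are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
no2 = rng.uniform(5, 60, n)              # reference NO2 (ppb), synthetic
o3 = rng.uniform(10, 90, n)              # co-located O3 signal (ppb)
temp = rng.uniform(5, 35, n)             # temperature (deg C)
rh = rng.uniform(20, 90, n)              # relative humidity (%)

# hypothetical raw NO2 sensor with O3 cross-sensitivity and T/RH dependence
raw = 0.8 * no2 + 0.3 * o3 - 0.2 * temp + 0.05 * rh + rng.normal(0, 1.0, n)

# MLR: regress the reference on the raw signal plus the covariates
A = np.column_stack([raw, o3, temp, rh, np.ones(n)])
coef, *_ = np.linalg.lstsq(A, no2, rcond=None)
calibrated = A @ coef

rmse_raw = float(np.sqrt(np.mean((no2 - raw) ** 2)))
rmse_cal = float(np.sqrt(np.mean((no2 - calibrated) ** 2)))
```

Because the covariates span the interfering terms, the regression recovers the reference up to the sensor noise, which is why arrays of sensors reduce calibration uncertainty.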
... Maag et al. [23] focuses more on characterizing how sensors are calibrated in the air pollution area. In general, the most used calibration architecture [13] defined for low-cost air pollution sensors in WSN is centralized, micro, collocated, pre/post-calibration, off-line, non-blind with an array of M sensors [14,15,21,[24][25][26]. The reason for such an approach lies in the need to have a reference station that provides accurate data (micro, collocated and non-blind calibration). ...
... From an algorithmic point of view, many of the low-cost sensors have been calibrated with linear models such as multiple linear regression [14,15,21,[24][25][26]. Recently, it has been shown that some of these sensors have non-linear behaviors, and it has been investigated how air pollution sensors can be calibrated with non-linear methods. ...
... We can see in Figure 2a the high variability in the RMSE of the different sensors, ranging from 7.8 to 24.0 µg/m³. Installing one or another sensor of the same family impacts the quality of the data obtained, irrespective of the calibration process [26]. Figure 2b displays the RMSE for each sensor against R² (coefficient of determination). ...
Article
Full-text available
New advances in sensor technologies and communications in wireless sensor networks have favored the introduction of low-cost sensors for monitoring air quality applications. In this article, we present the results of the European project H2020 CAPTOR, where three testbeds with sensors were deployed to capture tropospheric ozone concentrations. One of the biggest challenges was the calibration of the sensors, as the manufacturer provides them without calibrating. Throughout the paper, we show how short-term calibration using multiple linear regression produces good calibrated data, but instead produces biases in the calculated long-term concentrations. To mitigate the bias, we propose a linear correction based on Kriging estimation of the mean and standard deviation of the long-term ozone concentrations, thus correcting the bias presented by the sensors.
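The bias-mitigation step described in the abstract, a linear correction that matches the long-term mean and standard deviation estimated externally (e.g., by Kriging), reduces to a z-score mapping of the sensor's output. A minimal sketch, with invented sensor statistics and target moments:

```python
import numpy as np

rng = np.random.default_rng(1)
sensor = rng.normal(60.0, 15.0, 1000)   # hypothetical biased long-term sensor estimates
target_mean, target_std = 48.0, 22.0    # e.g., moments from an external Kriging estimate

# linear correction: standardize the sensor series, then rescale to the target moments
corrected = (sensor - sensor.mean()) / sensor.std() * target_std + target_mean
```

By construction, the corrected series has exactly the target mean and standard deviation, removing the long-term bias while preserving the sensor's temporal pattern.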
Preprint
Growing progress in sensor technology has constantly expanded the number and range of low-cost, small, and portable sensors on the market, increasing the number and type of physical phenomena that can be measured with wirelessly connected sensors. Large-scale deployments of wireless sensor networks (WSN) involving hundreds or thousands of devices and limited budgets often constrain the choice of sensing hardware, which generally has reduced accuracy, precision, and reliability. Therefore, it is challenging to achieve good data quality and maintain error-free measurements during the whole system lifetime. Self-calibration or recalibration in ad hoc sensor networks to preserve data quality is essential, yet challenging, for several reasons, such as the existence of random noise and the absence of suitable general models. Calibration performed in the field, without accurate and controlled instrumentation, is said to be in an uncontrolled environment. This paper provides current and fundamental self-calibration approaches and models for wireless sensor networks in uncontrolled environments.
... Many studies exist in which low-cost air pollution sensors are co-located with reference sensors, and calibration functions are then derived to, in principle, allow the low-cost sensor to then be used without the reference sensor (Lewis et al., 2016;Liu et al., 2017;Zimmerman et al., 2018;Barcelo-Ordinas et al., 2018;Crilley et al., 2018;Badura et al., 2019;Wang et al., 2019;Datta et al., 2020;Lee et al., 2020;Ferrer-Cid et al., 2020). This can also extend to calibrating remote observation data (for example Shaddick et al. (2018)). ...
Preprint
Full-text available
Networks of low-cost sensors are becoming ubiquitous but often suffer from low accuracy and drift. Regular co-location with reference sensors allows recalibration but is often complicated and expensive. Alternatively, the calibration can be transferred using mobile low-cost sensors, often at very low cost. However, inferring appropriate estimates of the calibration functions (with uncertainty) for the network of sensors becomes difficult, especially as the network of visits by the mobile low-cost sensors becomes large. We propose a variational approach to model the calibration across the network of sensors. We demonstrate the approach on both synthetic and real air pollution data, and find it can perform better than the state of the art (multi-hop calibration). We extend it to categorical data, combining classifications of insects by non-expert citizen scientists. Achieving uncertainty-quantified calibration has been one of the major barriers to low-cost sensor deployment and citizen-science research. We hope that the methods described will enable such projects.
... A calibration using a sensor array for ozone is described in [1]. The array is combined into a virtual calibrated sensor using linear regression. ...
Chapter
Urban air quality is an important problem of our time. Due to their high costs and therefore low spatial density, high-precision monitoring stations cannot capture the temporal and spatial dynamics of the urban atmosphere; low-cost sensors must be used to set up dense measurement grids. However, low-cost sensors are imprecise, biased, and susceptible to environmental influences. While neural networks have been explored for their calibration, issues include the amount of data needed for training, requiring sensors to be co-located with reference stations for extensive periods of time. Also, re-calibrating them with new data can lead to catastrophic forgetting. We propose using Elastic Weight Consolidation (EWC) as an incremental calibration method. By exploiting the Fisher information matrix, it enables the network to compensate for different sources of error, both pertaining to the sensor itself and caused by varying environmental conditions. Models are pre-calibrated with 40 h of measurement data from a low-cost SDS011 PM sensor and then re-calibrated on another SDS011 sensor. Our evaluation on 1.5 years of real-world data shows that a model using EWC with 6 h of data for re-calibration is more precise than models without EWC, even those with longer re-calibration periods. This demonstrates that EWC is suitable for on-the-fly collaborative calibration of low-cost sensors.
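The EWC idea can be sketched on a toy linear calibration model rather than the chapter's neural network; the data, learning rate, and penalty strength lam are invented. The model is pre-calibrated on sensor A, a diagonal Fisher information is approximated from squared per-sample gradients, and re-calibration on sensor B adds a quadratic penalty anchoring the parameters to the pre-calibrated values:

```python
import numpy as np

rng = np.random.default_rng(3)

def make_data(gain, offset, n=200):
    """Synthetic raw-vs-reference pairs for one hypothetical sensor."""
    x = rng.uniform(0, 1, n)
    y = gain * x + offset + rng.normal(0, 0.05, n)
    return x, y

def grad(theta, x, y):
    """Gradient of the mean squared error of the model y ~ w*x + b."""
    w, b = theta
    r = (w * x + b) - y
    return np.array([np.mean(2 * r * x), np.mean(2 * r)])

def mse(theta, x, y):
    w, b = theta
    return float(np.mean(((w * x + b) - y) ** 2))

# 1) pre-calibration on sensor A by gradient descent
xa, ya = make_data(1.8, 0.3)
theta_a = np.zeros(2)
for _ in range(3000):
    theta_a = theta_a - 0.1 * grad(theta_a, xa, ya)

# 2) diagonal Fisher approximation: mean squared per-sample gradient at theta_a
w, b = theta_a
r = (w * xa + b) - ya
fisher = np.array([np.mean((2 * r * xa) ** 2), np.mean((2 * r) ** 2)])

# 3) re-calibration on sensor B, with and without the EWC penalty
xb, yb = make_data(2.2, 0.1)
lam = 50.0
theta_plain, theta_ewc = theta_a.copy(), theta_a.copy()
for _ in range(3000):
    theta_plain = theta_plain - 0.1 * grad(theta_plain, xb, yb)
    g = grad(theta_ewc, xb, yb) + 2 * lam * fisher * (theta_ewc - theta_a)
    theta_ewc = theta_ewc - 0.1 * g
```

The EWC solution improves the fit on sensor B while staying closer to the sensor-A parameters than plain retraining, which is the mechanism that limits catastrophic forgetting.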
... Refs. [5] and [6] study the fusion of data taken from the four Captor sensors placed at the same node as a way to reduce the estimation error with respect to the use of a single sensor. Ref. [7] proposes a graph sensing framework for re-calibrating sensors, imputing missing data, and applying reconstruction techniques in a heterogeneous air pollution monitoring network with reference stations and low-cost sensor nodes. ...
Article
Full-text available
The H2020 CAPTOR project deployed three testbeds in Spain, Italy and Austria with low-cost sensors for the measurement of tropospheric ozone (O3). The aim of the H2020 CAPTOR project was to raise public awareness in a project focused on citizen science. Each testbed was supported by an NGO in charge of deciding how to raise citizen awareness according to the needs of each country. The data presented in this document correspond to the raw data captured by the sensor nodes in the Spanish testbed using SGX Sensortech MICS 2614 metal-oxide sensors. The Spanish testbed consisted of the deployment of twenty-five nodes. Each sensor node included four SGX Sensortech MICS 2614 ozone sensors, one temperature sensor and one relative humidity sensor. Each node underwent a calibration process by co-locating the node at an EU reference air quality monitoring station, followed by a deployment in a sub-urban or rural area in Catalonia, Spain. All nodes spent two to three weeks co-located at a reference station in Barcelona, Spain (urban area), followed by two to three weeks co-located at three sub-urban reference stations near the final deployment site. The nodes were then deployed in volunteers' homes for about two months and, finally, the nodes were co-located again at the sub-urban reference stations for two weeks for final calibration and assessment of potential drifts. All data presented in this paper are raw data taken by the sensors that can be used for scientific purposes such as calibration studies using machine learning algorithms, or once the concentration values of the nodes are obtained, they can be used to create tropospheric ozone pollution maps with heterogeneous data sources (reference stations and low-cost sensors).
... (Figure: Sensor S2 temperature and relative humidity data.) Multiple linear regression (MLR) can be used to approximate a linear combination of raw air pollutant concentration, temperature, and humidity measurements that best fits the target reference concentration for low-cost sensors [54,[58][59][60]. The developed sensors produce temperature and relative humidity data (see Fig. 11) along with raw gas sensor responses, and all these data can be used to perform temperature and humidity correction of the sensor response using MLR, as shown in (1) ...
Article
Air pollution poses a significant risk to the environment and to health. Air quality monitoring stations are often confined to a small number of locations due to the high cost of the monitoring equipment. They provide a low-fidelity picture of the air quality in the city; local variations are overlooked. However, recent developments in low-cost sensor technology and wireless communication systems like the Internet of Things (IoT) provide an opportunity to use arrayed sensor networks to measure air pollution, in real time, at a large number of locations. This paper reports the development of a novel low-cost sensor node that utilizes cost-effective electrochemical sensors to measure Carbon Monoxide (CO) and Nitrogen Dioxide (NO2) concentrations and an infrared sensor to measure Particulate Matter (PM) levels. The node can be powered by either a solar-recharged battery or a mains supply. It is capable of long-range, low-power communication over a public or private LoRaWAN IoT network and short-range, high-data-rate communication over Wi-Fi. The developed sensor nodes were co-located with an accurate reference CO sensor for field calibration. The low-cost sensors' data, with offset and gain calibration, show good correlation with the data collected from the reference sensor. Multiple linear regression (MLR) based temperature and humidity correction results in a Mean Absolute Percentage Error (MAPE) of 48.71% and an R² of 0.607 relative to the reference sensor's data. Artificial Neural Network (ANN) based calibration shows the potential for significant further improvement, with a MAPE of 38.89% and an R² of 0.78 for leave-one-out cross-validation.
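The MAPE and R² figures quoted above can be computed for any calibration output with two short metric functions. This is a generic sketch of the standard definitions, not the paper's code:

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

For example, with reference values [10, 20] and predictions [11, 18], MAPE is 10.0% and R² is 0.9. Note that MAPE is undefined when a reference value is zero, which matters for pollutants that can drop to zero concentration.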
... Presently, static monitoring stations outfitted with specialized reference equipment are implemented for observing air pollution levels. Reference-quality equipment is very expensive and heavy (Barcelo-Ordinas et al. 2018). Conventional air quality monitoring stations are deployed only in limited numbers, on account of the high maintenance cost, large size, large initial investment, and power limitations involved (Pavani and Rao 2017). ...
Article
Full-text available
An efficient air pollution monitoring (APM) scheme is proposed to establish an optimal, less-polluted path using a WSN (wireless sensor network), in which the sensor nodes (SNs) sense the temperature and the CO gas (carbon monoxide) concentration present in the air. Initially, the sensed information from the SNs is preprocessed. During preprocessing, missing values in the sensed information are imputed. Next, Hadoop Distributed File System (HDFS) MapReduce (MR) is applied to the preprocessed data, and the resulting data are saved on a cloud server. The resulting data are analyzed using the Improved Adaptive Neuro-Fuzzy Inference System (I-ANFIS) algorithm to check the severity of air pollution, and its location is then presented on Google Maps. After that, multi-path routing is established through the less polluted areas. Lastly, the optimal path is chosen with the assistance of the KHOA (Krill Herd Optimization Algorithm). The outcomes are evaluated by contrasting the proposed and prevailing techniques.
... Analyses of these interviews, together with onsite observations from the research team, are presented in this paper. Insights on the technical aspects of the low-cost ozone measurement are presented in Ripoll et al. (2019) and Barcelo-Ordinas et al. (2018). ...
Article
Full-text available
Air pollution is a serious problem that is causing increasing concern among European citizens. It is responsible for more than 400,000 premature deaths in Europe each year and considerably damages human health, agriculture, and the natural environment. Despite these facts, the readiness and power of citizens to take actions is limited. To address this challenge, the citizen science project CAPTOR was launched in 2016. Using low-cost measurement devices, citizens in three European testbeds supported the monitoring of tropospheric ozone. This paper presents the results from 53 interviews with involved residents and shows that the active involvement of individuals in a complex process such as measuring tropospheric ozone can have important impacts on their knowledge and attitudes. In an attempt to expand the benefits of low-cost air quality sensors from an individual to a regional level, certain preconditions are key. Strong support in assuring data quality, visibility of the collected data in online and offline media, broad dissemination of results, and intensified communication with political decision-makers are needed.
... Relationships with multiple-kind variables were used:
• for reference-based group calibration in [41], [42] and [43];
• for partially blind group calibration strategies in [38], [44], [45], [46], including with time-sensitive models in [29] and [44];
• for blind strategies, either pairwise or group based, in [35], [47], [48], [49], [50].
On the contrary, relationships with multiple-kind variables were shown to be unnecessary in [51] and in [52], where the control of the operating temperature of the device was sufficient to perform a pairwise calibration without being influenced by this quantity. ...
Article
The recent developments in both nanotechnologies and wireless technologies have enabled the rise of small, low-cost, and energy-efficient environmental sensing devices. Many projects involving dense sensor network deployments have followed, in particular within the Smart City trend. While such deployments are now within economical and technical reach, their maintenance and reliability remain a challenge. In particular, reaching, then maintaining, the targeted quality of measurement throughout the deployment duration is an important issue. Indeed, factory calibration is too expensive for systematic application to low-cost sensors, as these sensors are usually prone to drifting because of premature aging. In addition, there are concerns about the applicability of factory calibration to field conditions. These challenges have fostered much research on in situ calibration. In situ means that the sensors are calibrated without removing them from their deployment location, preferably without physical intervention, often leveraging their communication capabilities. It is a critical challenge for the economical sustainability of networks with large-scale deployments. In this paper, we focus on in situ calibration methods for environmental sensor networks. We propose a taxonomy of the methodologies in the literature. Our classification relies on both the architecture of the network of sensors and the algorithmic principles of the calibration methods. This review allows us to identify and discuss two main challenges: how to improve the performance evaluation of such methods, and how to enable a quantified comparison of these strategies.
... Reference | Calibration model | Setting | Sensor type
Holstius et al. [43] | Multiple Least Squares | Field | LS
Piedrahita et al. [20] | Multiple Least Squares | Lab & Field | MOX, LS
Jiao et al. [12] | Multiple Least Squares | Field | EC, MOX, LS
Sun et al. [44] | Multiple Least Squares | Lab & Field | EC
Martin et al. [45] | Multiple Least Squares | Lab | LS
Eugster and Kling [46] | Multiple Least Squares | Field | MOX
Barcelo-Ordinas et al. [47] | Multiple Least Squares | Field | MOX
Hagan et al. [29] | Multiple Least Squares, kNN | Field | EC
Wei et al. [48] | Linear | … | …
[13], [14] | Multiple Least Squares | Field | EC, MOX
Maag et al. [52] | Multiple Least Squares | Field | EC, MOX
Fang et al. [53] | Multiple Least Squares | Field | MOX, LS
Cross et al. [54] | Non-Linear Curve Fitting | Field | EC
Zimmermann et al. [55] | Random Forests | Field | EC, LS
Kamionka et al. [56] | Neural Networks | Lab | MOX
Spinelle et al. [13], [14] | Neural Networks | Field | EC, MOX
Barakeh et al. [57] | Neural Networks | Field | MOX
De Vito et al. [58], [59] | Neural Networks | Field | MOX
Esposito et al. [60], [61] | Neural Networks | Field | EC
Esposito et al. [62] | Various machine learning methods | Field | EC
De Vito et al. [63] | Various machine learning methods | Field | EC
Lewis et al. [64] | Various machine learning methods | Lab & Field | EC, LS
Note: EC: Electrochemical, MOX: Metal-oxide, LS: Light-Scattering (Particulate Matter and CO2)
1) Principles: Offset and gain calibration fits a calibration curve, either a linear or a non-linear one, to model relationships between raw sensor readings and pollutant concentrations. The calibration curve is defined by an offset term, i.e., the sensor's response to complete absence of the target pollutant, and a gain term that characterizes the sensor's response to increasing pollutant concentrations. ...
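The offset-and-gain principle described above amounts to fitting a line mapping raw readings to reference concentrations. A minimal sketch with an invented noiseless sensor (true gain 0.5, offset 3.0), which the linear fit inverts exactly:

```python
import numpy as np

reference = np.array([0.0, 10.0, 20.0, 40.0])   # reference concentrations
raw = 0.5 * reference + 3.0                     # hypothetical sensor: gain 0.5, offset 3.0

# fit the calibration curve: reference ~ gain * raw + offset
gain, offset = np.polyfit(raw, reference, 1)
calibrated = gain * raw + offset
```

Here the fit recovers gain 2.0 and offset -6.0, i.e., the inverse of the sensor's response; with real data the same fit would be done by least squares over noisy co-located readings.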
Article
Air pollution is a major concern for public health and urban environments. Conventional air pollution monitoring systems install a few highly accurate, expensive stations at representative locations. Their sparse coverage and low spatial resolution are insufficient to quantify urban air pollution and its impacts on human health and the environment. Advances in low-cost portable air pollution sensors have enabled air pollution monitoring deployments at scale to measure air pollution at high spatiotemporal resolution. However, it is challenging to ensure the accuracy of these low-cost sensor deployments because the sensors are more error-prone than high-end sensing infrastructures and they are often deployed in harsh environments. Sensor calibration has proven effective for improving the data quality of low-cost sensors and maintaining the reliability of long-term, distributed sensor deployments. In this article, we review the state-of-the-art low-cost air pollution sensors, identify their major error sources, and comprehensively survey calibration models as well as network re-calibration strategies suited for different sensor deployments. We also discuss limitations of existing methods and conclude with open issues for future sensor calibration research.
Article
In Delhi, the capital city of India, air pollution has been a perpetual menace to urban sustainability and public health. The present study uses a mixed-method approach to set out for the urban authorities: (a) the state of air pollution in the city; (b) systemic flaws in the current monitoring network; (c) potential means to bolster it; and (d) the need for a participatory framework for monitoring. Information about the Air Quality Index (AQI), obtained from 36 monitoring stations across Delhi, is compared between 2021 (20 April–25 May; 2nd year/phase of SARS-CoV-2 lockdown) and the corresponding time periods in 2020 (1st year/phase of lockdown) and 2019 (business-as-usual) using the Mann–Whitney U test. AQI during the 2021 lockdown (a) appeared statistically more similar (p < .01) to that of 2019 and (b) exceeded the environmental health safety benchmark for 85% of days during the study period (20 April–25 May). However, this presents only a partial glimpse into the air pollution status, owing to numerous 'holes' in the AQI data record (no data and/or insufficient data). Moreover, certain areas in Delhi still have no monitoring station, or too few to yield a 'representative' estimate (inadequate spatial coverage). Such shortcomings in the existing monitoring network may deter future research and targeted/informed decision-making for pollution control. To that end, the present research offers a summary view of Low-Cost Air Quality Sensors (LCAQS) as a 'complementary' technique for the urban sustainability authorities to bolster and diversify the existing network. The main advantages and disadvantages of various LCAQS sensor technologies are highlighted, while emphasizing the challenges around various calibration techniques (linear and non-linear).
The final section reflects on the integration of science and technology with social dimensions of air quality monitoring and highlights key requirements for (a) community mobilization and (b) stakeholder engagement to forge a participatory systems’ design for LCAQS deployment.
Article
Urban planning and design solutions affect urban ventilation conditions, thus mitigating the effects of atmospheric pollution. However, these findings are not being implemented in planning practice to a sufficient extent, partly due to the lack of specific guidelines. Moreover, many urban air quality monitoring (AQM) sites have low representativeness and thus do not provide comprehensive data for effective urban air pollution control with respect to the urban spatial policy. An integrated assessment method based on modelling, simulation, and geospatial data processing tools was used to investigate the impact of micro-scale urban form on the local ventilation conditions and pollution dispersion. The proposed approach combined computational fluid dynamics (CFD) and geographic information system (GIS) tools, and particularly the newly developed Residence Time Index (RTI) - a single CFD-derived parameter quantifying the capability of the micro-scale built environment to retain PM10 pollution. Urban segments around monitoring sites in three Polish cities (Gdańsk, Warsaw, and Poznań), characterised by relatively low density and varied urban form typologies, were investigated. The results indicate that in these conditions the following features of the urban form have the strongest correlation with the RTI: plan area density (λP), gross floor area ratio (λGFA), and occlusivity (Oc), making them useful indicators for urban air quality management. On the other hand, PM10 data from the AQM sites are rather poorly linked with urban form indicators, which suggests that in complex urban scenarios a higher spatial resolution of air quality data is required for shaping the spatial policy. The implications from this analysis are useful for urban planning practice. The developed approach may also be a valuable decision support tool for assessing the spatial representativeness of AQM sites.
Article
The analysis of sensor networks for air pollution monitoring is challenging. Recent studies have demonstrated the ability to reconstruct the network measurements with graphs derived from the acquired data, thus describing the complex relationships between the sensors that compose the network. In this work, we propose a graph-based data reconstruction framework that can be used to carry out different post-processing applications that arise in real-world low-cost sensor deployments for air pollution monitoring. This data reconstruction framework first describes the relationships between the different network sensors by means of a graph learned from the measured data, and then a signal reconstruction model is superimposed to reconstruct sensor data. This methodology allows reconstructing sensor data to carry out missing value imputation, signal reconstruction at points where there are no physical sensors (virtual sensing), or data fusion. The results, using real data taken with an air pollution monitoring network including reference stations and low-cost sensors, show how this framework performs well for these applications compared to estimates obtained from individual sensors and task-specific state-of-the-art methods that are not able to deal with this necessary range of applications. In short, the results show the potential of using graphs, whose topology is based on the measurements taken, to calibrate and reconstruct signals in heterogeneous networks of low-cost air pollution sensors for a wide variety of applications.
Article
Air pollution is a severe problem growing over time. A dense air-quality monitoring network is needed to keep people updated on the air pollution status in cities. A dense network based on low-cost sensor devices (LCSDs) is more viable than continuous ambient air quality monitoring stations (CAAQMS). An in-field calibration approach is needed to improve the agreement of LCSDs with CAAQMS. The present work proposes a calibration method for PM2.5 using a domain adaptation technique to reduce the collocation duration of LCSDs and CAAQMS. A novel calibration approach is proposed for the PM2.5 levels measured by LCSDs. The dataset used for the experimentation consists of PM2.5 values and other parameters (PM10, temperature, and humidity) at hourly intervals over a period of three months. We propose new features, combining PM2.5, PM10, temperature, and humidity, that significantly improve the calibration performance. Further, the calibration model is adapted to the target location for a new LCSD with a collocation time of two days. The proposed model shows higher correlation coefficient values (R²) and significantly lower mean absolute percentage error (MAPE) than other baseline models.
Thus, the proposed model helps in reducing the collocation time while maintaining high calibration performance.
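The kind of combined features the abstract describes can be illustrated as follows. The exact feature set is not given in the abstract, so the ratio and interaction terms below are assumptions made for the sketch:

```python
def engineered_features(pm25, pm10, temp, rh):
    """Hypothetical combined features mixing PM2.5, PM10, temperature,
    and relative humidity (the paper's actual feature definitions are
    not stated in the abstract)."""
    ratio = pm25 / pm10 if pm10 else 0.0   # fine-to-coarse fraction
    rh_term = pm25 * rh / 100.0            # humidity interaction
    t_term = pm25 * temp                   # temperature interaction
    return [pm25, ratio, rh_term, t_term]

# One hourly sample: PM2.5 = 10 ug/m3, PM10 = 20 ug/m3, 25 C, 50 % RH.
features = engineered_features(10.0, 20.0, 25.0, 50.0)
```

Feature vectors like these would feed a regression model trained at the source location and then adapted to the target location using the short two-day collocation data.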
Article
Existing air pollution monitoring networks use reference stations as their main nodes. The addition of low-cost sensors calibrated in situ with machine learning techniques allows the creation of heterogeneous air pollution monitoring networks. However, current monitoring networks and calibration techniques have limitations in estimating missing data, adding virtual sensors, or recalibrating sensors. The use of graphs to represent structured data is an emerging area of research that allows powerful techniques to be applied to process and analyze data from air pollution monitoring networks. In this article, we compare two techniques that rely on structured data, one based on statistical methods and the other on signal smoothness, with a baseline technique based on the distance between nodes that does not rely on the measured signal data. To compare these techniques, the sensor signal is reconstructed with a supervised method based on linear regression and a semi-supervised method based on Laplacian interpolation, which allows reconstruction even when data is missing. The results, on data sets measuring O3, NO2, and PM10, show that the signal-smoothness-based technique behaves better than the other two, and used together with Laplacian interpolation is near-optimal with respect to the linear regression method. Moreover, in the case of heterogeneous networks, the results show a reconstruction accuracy similar to that of the in-situ calibrated sensors. Thus, the use of the network data increases the robustness of the network against possible sensor failures.
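Laplacian interpolation, the semi-supervised reconstruction method mentioned above, fills unmeasured nodes with the weighted average of their graph neighbours until the signal is smooth (harmonic) on the graph. A toy sketch; the three-node graph and its weights are invented:

```python
def laplacian_interpolate(adj, known_values, iters=200):
    """adj: {node: {neighbour: weight}}; known_values: measured nodes.
    Unknown nodes iterate to the harmonic (graph-smooth) solution."""
    x = {i: known_values.get(i, 0.0) for i in adj}
    for _ in range(iters):
        for i in adj:
            if i in known_values:
                continue  # keep measured values fixed
            x[i] = (sum(w * x[j] for j, w in adj[i].items())
                    / sum(adj[i].values()))
    return x

# Path graph 0 - 1 - 2 with a missing reading at node 1.
adj = {0: {1: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {1: 1.0}}
filled = laplacian_interpolate(adj, {0: 1.0, 2: 3.0})
```

The same machinery supports virtual sensing: a node with no physical sensor is simply treated as an "unknown" vertex of the graph.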
Thesis
In various fields, from agriculture to public health, ambient quantities have to be monitored in indoor or outdoor areas: for example, temperature, air pollutants, water pollutants, and noise. To better understand these phenomena, an increase in the density of measuring instruments is currently necessary; for instance, this would help to analyse the effective exposure of people to nuisances such as air pollutants. The massive deployment of sensors in the environment is made possible by the decreasing costs of measuring systems, mainly using sensitive elements based on micro- or nano-technologies. The drawback of this type of instrumentation is a low quality of measurement, which lowers the confidence in the produced data and/or drastically increases instrumentation costs due to necessary recalibration procedures or periodic replacement of sensors. There are multiple algorithms in the literature offering the possibility to calibrate measuring instruments while leaving them deployed in the field, called in situ calibration techniques. The objective of this thesis is to contribute to the research effort on improving the data quality of low-cost measuring instruments through their in situ calibration. In particular, we aim at 1) facilitating the identification of existing in situ calibration strategies applicable to a sensor network depending on its properties and the characteristics of its instruments; 2) helping to choose the most suitable algorithm depending on the sensor network and its context of deployment; and 3) improving the efficiency of in situ calibration strategies through the diagnosis of instruments that have drifted in a sensor network. Three main contributions are made in this work. First, a unified terminology is proposed to classify the existing works on in situ calibration.
The review carried out based on this taxonomy showed that there are numerous contributions on the subject, covering a wide variety of cases. Nevertheless, classifying the existing works in terms of performance was difficult, as there is no reference case study for the evaluation of these algorithms. Therefore, in a second step, a framework for the simulation of sensor networks is introduced, aimed at evaluating in situ calibration algorithms. A detailed case study is provided through the evaluation of in situ calibration algorithms for blind static sensor networks, together with an analysis of the influence of the parameters and of the metrics used to derive the results. As the results are case-specific, and as most of the algorithms recalibrate instruments without first evaluating whether they actually need it, an identification tool able to determine which instruments are actually faulty in terms of drift would be valuable. Consequently, the third contribution of this thesis is a diagnosis algorithm targeting drift faults in sensor networks without making any assumption about the kind of sensor network at stake. Based on the concept of rendez-vous, the algorithm identifies faulty instruments as long as at least one instrument in the sensor network can be assumed to be non-faulty. Through the investigation of the results of a case study, we propose several means to reduce false results and guidelines to adjust the parameters of the algorithm. Finally, we show that the proposed diagnosis approach, combined with a simple calibration technique, improves the quality of the measurement results. Thus, the diagnosis algorithm opens new perspectives on in situ calibration.
Conference Paper
Air pollution poses significant risks to the environment and human health. Air quality monitoring stations are often confined to a small number of locations due to the high cost of the monitoring equipment. They provide a low-fidelity picture of the air quality in a city; local variations are overlooked. However, recent developments in low-cost sensor technology and wireless communication systems like the Internet of Things (IoT) provide an opportunity to use arrayed sensor networks to measure air quality, in real time, at a large number of locations. This paper reports the development of a novel low-cost sensor node that utilizes cost-effective electrochemical sensors to measure Carbon Monoxide (CO) and Nitrogen Dioxide (NO2) concentrations and an infrared sensor to measure Particulate Matter (PM) levels. The node can be powered by either a solar-recharged battery or mains supply. It is capable of long-range, low-power communication over public or private LoRaWAN IoT networks and short-range, high-data-rate communication over Wi-Fi. The developed sensor nodes were co-located with an accurate reference CO sensor for field calibration. The low-cost sensors' data show strong correlation with the data collected from the reference sensor. Offset and gain calibration further improves the quality of the sensor data.
Article
Air pollution is growing ever more serious as a result of rising consumption of energy and other natural resources. Governmental static monitoring stations generally provide accurate air pollution data, but they are sparsely distributed in space. In contrast, micro stations, a kind of low-cost air monitoring equipment, can be distributed densely, though their accuracy is relatively low. This paper proposes a deep calibration method (DeepCM) for the low-cost air monitoring sensors in micro stations, consisting of an encoder and a decoder. In the encoding stage, multi-level time series features are extracted, including local, global, and periodic time series features. These features not only capture local, global, and periodic trend information, but also help alleviate cross-interference and noise effects. In the decoding stage, a final feature extracted by the encoder, along with the initial features of the moment to be calibrated, is fed into the decoder to obtain a calibrated result. The proposed method is evaluated on two real-world datasets. The experimental results demonstrate that our method yields the best performance in comparison with eight baseline methods.
Article
This paper investigates the calibration of low-cost sensors for air pollution. The sensors were deployed on three IoT (Internet of Things) platforms in Spain, Austria, and Italy during the summers of 2017, 2018, and 2019. One of the biggest challenges in the operation of an IoT platform, which has a great impact on the quality of the reported pollution values, is the calibration of the sensors in an uncontrolled environment. This calibration is performed using arrays of sensors that measure cross sensitivities and therefore compensate for both interfering contaminants and environmental conditions. The paper investigates how the fusion of data taken by sensor arrays can improve the calibration process. In particular, calibration with sensor arrays, multi-sensor data fusion calibration with weighted averages, and multi-sensor data fusion calibration with machine learning models are compared. Calibration is evaluated by combining data from various sensors with linear and nonlinear regression models.
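Of the three strategies compared above, fusion with weighted averages is the simplest to sketch: each calibrated sensor is weighted by the inverse of its estimated error variance. The estimates and variances below are hypothetical:

```python
def fuse_inverse_variance(estimates, variances):
    """Combine calibrated sensor estimates of the same pollutant;
    lower-variance sensors get proportionally more weight."""
    weights = [1.0 / v for v in variances]
    return sum(w * e for w, e in zip(weights, estimates)) / sum(weights)

# Two calibrated O3 estimates (ppb) with assumed error variances.
fused = fuse_inverse_variance([50.0, 54.0], [1.0, 3.0])
```

With equal variances this reduces to the plain mean; machine-learning fusion instead learns the combination from collocated reference data.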
Article
This paper shows the result of the calibration process of an Internet of Things platform for the measurement of tropospheric ozone (O <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">3</sub> ). This platform, formed by 60 nodes, deployed in Italy, Spain, and Austria, consisted of 140 metal–oxide O <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">3</sub> sensors, 25 electro-chemical O <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">3</sub> sensors, 25 electro-chemical NO <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sub> sensors, and 60 temperature and relative humidity sensors. As ozone is a seasonal pollutant, which appears in summer in Europe, the biggest challenge is to calibrate the sensors in a short period of time. In this paper, we compare four calibration methods in the presence of a large dataset for model training and we also study the impact of a limited training dataset on the long-range predictions. We show that the difficulty in calibrating these sensor technologies in a real deployment is mainly due to the bias produced by the different environmental conditions found in the prediction with respect to those found in the data training phase.
Article
Growing progress in sensor technology has constantly expanded the number and range of low-cost, small, and portable sensors on the market, increasing the number and type of physical phenomena that can be measured with wirelessly connected sensors. Large-scale deployments of wireless sensor networks (WSN) involving hundreds or thousands of devices and limited budgets often constrain the choice of sensing hardware, which generally has reduced accuracy, precision, and reliability. Therefore, it is challenging to achieve good data quality and maintain error-free measurements during the whole system lifetime. Self-calibration or recalibration in ad hoc sensor networks to preserve data quality is essential, yet challenging, for several reasons, such as the existence of random noise and the absence of suitable general models. Calibration performed in the field, without accurate and controlled instrumentation, is said to be in an uncontrolled environment. This paper provides current and fundamental self-calibration approaches and models for wireless sensor networks in uncontrolled environments.
Conference Paper
Full-text available
Over the past few years, many low-cost pollution sensors have been integrated into measurement platforms for air quality monitoring. However, using these sensors is challenging: concentrations of toxic gases in ambient air often lie at the sensors' sensitivity boundaries, environmental conditions affect the sensor signal, and the sensors are cross-sensitive to multiple pollutants. Datasheet information on these effects is scarce or may not cover deployment conditions. Consequently, the sensors need to undergo extensive pre-deployment testing to examine their feasibility for a given application and to find the optimal measurement setup that allows accurate data collection and calibration. In this work, we propose a novel method for in-field testing of low-cost sensors. The proposed algorithm is based on multiple least squares and leverages the physical variation of urban air pollution to quantify the amount of explained and unexplained sensor signal. We (i) verify whether a sensor is feasible for air quality monitoring in a given environment, (ii) model sensor cross-sensitivities to interfering gases and environmental effects, and (iii) compute the optimal sensor array and its calibration parameters for stable and accurate sensor measurements over long time periods. Finally, we apply our testing approach to five off-the-shelf low-cost sensors and twelve reference signals using over 9 million measurements collected in an urban area. We propose an optimized sensor array and show, compared to a state-of-the-art calibration technique, up to 45% lower calibration error with better long-term stability of the calibration parameters.
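The multiple-least-squares backbone of such testing, regressing a reference signal on a sensor array and measuring how much of the signal the array explains, can be sketched as follows. The two-channel design matrix and the data are invented for illustration:

```python
def fit_mls(X, y):
    """Ordinary least squares via normal equations and Gaussian
    elimination. X: rows of features (include a 1.0 column for the
    offset); y: reference signal."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    for c in range(k):                       # forward elimination
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            for j in range(c, k):
                A[r][j] -= f * A[c][j]
            b[r] -= f * b[c]
    coef = [0.0] * k                         # back substitution
    for c in range(k - 1, -1, -1):
        coef[c] = (b[c] - sum(A[c][j] * coef[j]
                              for j in range(c + 1, k))) / A[c][c]
    return coef

def r_squared(X, y, coef):
    """Share of the reference signal variance explained by the array."""
    pred = [sum(c * xi for c, xi in zip(coef, row)) for row in X]
    my = sum(y) / len(y)
    ss_res = sum((p - yi) ** 2 for p, yi in zip(pred, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Constant column plus one raw sensor channel (invented data).
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
coef = fit_mls(X, y)
explained = r_squared(X, y, coef)
```

The unexplained remainder (1 − R²) is what the paper uses to judge whether a sensor, or an extra array channel, is worth deploying.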
Article
Full-text available
In this work the performances of several field calibration methods for low-cost sensors, including linear/multi linear regression and supervised learning techniques, are compared. A cluster of either metal oxide or electrochemical sensors for nitrogen monoxide and carbon monoxide together with miniaturized infra-red carbon dioxide sensors was operated. Calibration was carried out during the two first weeks of evaluation against reference measurements. The accuracy of each regression method was evaluated on a five months field experiment at a semi-rural site using different indicators and techniques: orthogonal regression, target diagram, measurement uncertainty and drifts over time of sensor predictions. In addition to the analyses for ozone and nitrogen oxide already published in Part A [1], this work assessed if carbon monoxide sensors can reach the Data Quality Objective (DQOs) of 25% of uncertainty set in the European Air Quality Directive for indicative methods. As for ozone and nitrogen oxide, it was found for NO, CO and CO2 that the best agreement between sensors and reference measurements was observed for supervised learning techniques compared to linear and multilinear regression.
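Orthogonal regression, one of the evaluation tools named above, fits a line while allowing errors in both the sensor and the reference axes. A total-least-squares sketch under the equal-error-variance assumption (the data points are invented, and the closed form assumes a non-zero covariance):

```python
import math

def orthogonal_regression(x, y):
    """Total least squares fit y ~ a + b * x with errors in both
    variables (equal error variances assumed; sxy must be non-zero)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x)
    syy = sum((v - my) ** 2 for v in y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    b = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return my - b * mx, b

a, b = orthogonal_regression([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
```

A slope near 1 and an intercept near 0 indicate good agreement between sensor predictions and reference measurements.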
Article
Full-text available
The goal of this paper is to explore the potential of opportunistic mobile monitoring to map the exposure to air pollution in the urban environment at a high spatial resolution. Opportunistic mobile monitoring makes use of existing mobile infrastructure or people's common daily routines to move measurement devices around. Opportunistic mobile monitoring can also play a crucial role in participatory monitoring campaigns as a typical way to gather data. A case study to measure black carbon was set up in Antwerp, Belgium, with the collaboration of city employees (city wardens). The Antwerp city wardens are outdoors for a large part of the day on surveillance tours by bicycle or on foot, and gathered a total of 393 h of measurements. The data collection is unstructured both in space and time, leading to sampling bias. A temporal adjustment can only partly counteract this bias. Although a high spatial coverage was obtained, there is still a rather large uncertainty on the average concentration levels at a spatial resolution of 50 m due to a limited number of measurements and sampling bias. Despite of this uncertainty, large spatial patterns within the city are clearly captured. This study illustrates the potential of campaigns with unstructured opportunistic mobile monitoring, including participatory monitoring campaigns. The results demonstrate that such an approach can indeed be used to identify broad spatial trends over a wider area, enabling applications including hotspot identification, personal exposure studies, regression mapping, etc. But, they also emphasize the need for repeated measurements and careful processing and interpretation of the data.
Article
Full-text available
The widespread diffusion of sensors, mobile devices, social media and open data are reconfiguring the way data underpinning policy and science are being produced and consumed. This in turn is creating both opportunities and challenges for policy-making and science. There can be major benefits from the deployment of the IoT in smart cities and environmental monitoring, but to realize such benefits, and reduce potential risks, there is an urgent need to address current limitations, including the interoperability of sensors, data quality, security of access and new methods for spatio-temporal analysis. Within this context, the manuscript provides an overview of the AirSensEUR project, which establishes an affordable open software/hardware multi-sensor platform, which is nonetheless able to monitor air pollution at low concentration levels. AirSensEUR is described from the perspective of interoperable data management with emphasis on possible use case scenarios, where reliable and timely air quality data would be essential.
Article
Full-text available
High-temperature gas sensors are in high demand for combustion process optimization and toxic emission control, but they usually suffer from poor selectivity. To address this selectivity issue and identify unknown reducing gas species (CO, CH4, and C3H8) and concentrations, a high-temperature resistive sensor array data set was built in this study from 5 reported sensors. Each sensor showed a specific response towards different types of reducing gas at given concentrations, from which calibration curves were fitted, providing a benchmark sensor array response database. A Bayesian inference framework was then used to process the sensor array data and build a sample selection program that simultaneously identifies the gas species and concentration, by formulating a likelihood between the measured sensor array response pattern of an unknown gas and each sampled sensor array response pattern in the benchmark database. The algorithm shows good robustness: it can accurately identify gas species and predict gas concentration with an error of less than 10% from a limited amount of experimental data. These features indicate that the Bayesian probabilistic approach is a simple and efficient way to process sensor array data, significantly reducing the required computational overhead and training data.
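The inference step described, scoring an observed array pattern against every (gas, concentration) entry in a benchmark database, can be sketched with a Gaussian likelihood. The response patterns and noise level below are invented, not the paper's data:

```python
import math

def identify_gas(observed, benchmark, sigma=0.05):
    """benchmark: {(gas, conc): [expected response of each sensor]}.
    Returns a normalized posterior over the candidate entries, using a
    Gaussian likelihood of the observed array pattern."""
    posts = {}
    for key, pattern in benchmark.items():
        ll = -sum((o - e) ** 2
                  for o, e in zip(observed, pattern)) / (2 * sigma ** 2)
        posts[key] = math.exp(ll)
    z = sum(posts.values())
    return {k: v / z for k, v in posts.items()}

# Two-sensor array, two hypothetical benchmark entries.
posterior = identify_gas([0.21, 0.79],
                         {("CO", 100): [0.2, 0.8],
                          ("C3H8", 100): [0.7, 0.1]})
```

Because a near-matching pattern dominates the likelihood, the posterior concentrates on a single (gas, concentration) candidate.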
Article
Full-text available
A novel distributed algorithm for blind macro-calibration of large sensor networks is introduced. The algorithm takes the form of a system of gradient-type recursions for estimating the parameters of local sensor calibration functions. The method does not require any fusion center. The convergence analysis is based on diagonal dominance of dynamical systems with block matrices. It is proved that asymptotic consensus is achieved for all the equivalent sensor gains and offsets (in the mean square sense and with probability one) in lossy sensor networks with possible communication outages and additive communication noise. An illustrative simulation example is provided.
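A gradient-type recursion of this kind, where each node nudges its gain and offset corrections to reduce the disagreement of its calibrated output with its neighbours', can be sketched on a noise-free toy network. This is a simplification of the paper's algorithm, with invented data:

```python
def blind_macro_calibration(raw, neighbors, steps=3000, mu=0.02):
    """raw[i][t]: raw reading of node i at time t of a shared ambient
    signal. Each node updates its gain a[i] and offset b[i] by a
    gradient step on the squared disagreement with its neighbours."""
    n, T = len(raw), len(raw[0])
    a, b = [1.0] * n, [0.0] * n
    for step in range(steps):
        t = step % T
        y = [a[i] * raw[i][t] + b[i] for i in range(n)]
        for i in range(n):
            g = sum(y[i] - y[j] for j in neighbors[i])  # local gradient
            a[i] -= mu * g * raw[i][t]
            b[i] -= mu * g
    return a, b

# Node 1 reads the shared signal with a +2 offset; no fusion center.
a, b = blind_macro_calibration([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]],
                               {0: [1], 1: [0]})
```

After convergence the calibrated outputs agree (consensus on equivalent gains and offsets), which is exactly the guarantee the paper proves for the lossy, noisy setting.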
Article
Full-text available
The performances of several field calibration methods for low-cost sensors, including linear/multilinear regression and supervised learning techniques, are compared. A cluster of ozone, nitrogen dioxide, nitrogen monoxide, carbon monoxide, and carbon dioxide sensors was operated. The sensors were either of metal-oxide or electrochemical type, or based on a miniaturized infra-red cell. For each method, a two-week calibration was carried out at a semi-rural site against reference measurements. Subsequently, the accuracy of the predicted values was evaluated for about five months using several indicators and techniques: orthogonal regression, target diagram, measurement uncertainty, and drift over time of sensor predictions. The study assessed whether the sensors could reach the Data Quality Objectives (DQOs) of the European Air Quality Directive for indicative methods (between 25 and 30% of uncertainty for O3 and NO2). It appears that O3 may be calibrated using simple regression techniques, while for NO2 a better agreement between sensors and reference measurements was reached using supervised learning techniques. The hourly O3 DQO was met, while it was unlikely that the hourly NO2 one could be met. This was likely caused by the low NO2 levels correlated with high O3 levels that are typical of the semi-rural site where the measurements of this study took place.
Article
Full-text available
With the widespread use of smart phones, participatory sensing has become mainstream, especially for applications requiring pervasive deployments with massive numbers of sensors. However, the sensors on smart phones are prone to unknown measurement errors, requiring automatic calibration among uncooperative participants. Current methods need either collaboration or an explicit calibration process; due to the uncooperative and uncontrollable nature of the participants, they fail to calibrate sensor nodes effectively. We investigate sensor calibration for monitoring pollution sources, without an explicit calibration process, in an uncooperative environment. We leverage the opportunity in sensing diversity, where a participant senses multiple pollution sources when roaming in the area. Further, inspired by the expectation maximization (EM) method, we propose a two-level iterative algorithm to estimate the source presences, source parameters, and sensor noise iteratively. The key insight is that, based only on the participatory observations, we can "calibrate sensors without explicit or cooperative calibrating process". Theoretical analysis proves that our method converges to the optimal estimate of the sensor noise, where the likelihood of the observations is maximized. Extensive simulations also show that our method improves the estimation accuracy of sensor bias by up to 20 percent and that of sensor noise deviation by up to 30 percent, compared with three baseline methods.
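The two-level iterative idea, alternating between estimating the source values and the per-sensor errors, can be illustrated with a simplified alternating scheme. This sketch assumes additive biases and known source assignments; the paper's EM formulation additionally estimates source presence and noise:

```python
def estimate_biases(z, iters=100):
    """z[i][k]: reading of sensor i at source k, modelled as
    z = s[k] + b[i] + noise. Alternate between source estimates s and
    bias estimates b; anchor the biases to zero mean for identifiability."""
    n, m = len(z), len(z[0])
    b = [0.0] * n
    for _ in range(iters):
        s = [sum(z[i][k] - b[i] for i in range(n)) / n for k in range(m)]
        b = [sum(z[i][k] - s[k] for k in range(m)) / m for i in range(n)]
        mean_b = sum(b) / n
        b = [bi - mean_b for bi in b]
    return b

# Three sensors, two sources; sensor 0 reads high, sensor 1 reads low.
biases = estimate_biases([[11.0, 21.0], [9.0, 19.0], [10.0, 20.0]])
```

Each alternation is a least-squares step in the spirit of EM's E and M updates: fix the biases to estimate the sources, then fix the sources to re-estimate the biases.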
Article
Full-text available
Up-to-date information on urban air pollution is of great importance for environmental protection agencies to assess air quality and provide advice to the general public in a timely manner. In particular, ultrafine particles (UFPs) are widely spread in urban environments and may have a severe impact on human health. However, the lack of knowledge about the spatio-temporal distribution of UFPs hampers profound evaluation of these effects. In this paper, we analyze one of the largest spatially resolved UFP data sets publicly available today, containing over 50 million measurements. We collected the measurements over more than two years using mobile sensor nodes installed on top of public transport vehicles in the city of Zurich, Switzerland. Based on these data, we develop land-use regression models to create pollution maps with a high spatial resolution of 100 m × 100 m. We compare the accuracy of the derived models across various time scales and observe a rapid drop in accuracy for maps with sub-weekly temporal resolution. To address this problem, we propose a novel modeling approach that incorporates past measurements annotated with metadata into the modeling process. In this way, we achieve a 26% reduction in the root-mean-square error (a standard metric to evaluate the accuracy of air quality models) of pollution maps with semi-daily temporal resolution. We believe that our findings can help epidemiologists to better understand the adverse health effects related to UFPs and serve as a stepping stone towards detailed real-time pollution assessment.
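A land-use regression model of the kind described predicts the concentration in each map cell from per-cell land-use predictors, and its accuracy is scored with the root-mean-square error. A minimal sketch; the coefficients, predictors, and numbers are invented:

```python
def lur_predict(coef, features):
    """Linear land-use regression: intercept + weighted predictors
    (e.g. traffic density, built-up fraction). Coefficients would be
    fitted on measured cells; these are invented."""
    return coef[0] + sum(c * f for c, f in zip(coef[1:], features))

def rmse(pred, obs):
    """Root-mean-square error, the accuracy metric cited above."""
    return (sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)) ** 0.5

# Hypothetical 100 m x 100 m cell: traffic = 3.0, built-up fraction = 0.4.
cell_ufp = lur_predict([5.0, 2.0, 10.0], [3.0, 0.4])
error = rmse([cell_ufp], [14.0])
```

The paper's extension adds past, metadata-annotated measurements as further predictors, which is what buys the reported RMSE reduction at sub-weekly resolution.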
Article
Full-text available
Systematic biases in sensor measurements undermine the performance of wireless sensor networks in mission-critical applications such as target detection and tracking. Traditional device-level calibration approaches become intractable for moderate to large-scale networks due to limited access to individual sensors after deployment. In this paper, we propose a two-tier system-level calibration approach for a class of sensor networks that employ data fusion to improve the overall system performance. In the first tier, each sensor learns its local sensing model from noisy measurements using an online algorithm and only transmits a few model parameters. In the second tier, the sensors' local sensing models are calibrated to a common system sensing model. Our approach fairly distributes computation overhead among sensors and significantly reduces the communication overhead of system-level calibration. Based on this approach, we develop an optimal model calibration scheme that maximizes the target detection probability of a sensor network under a bounded false alarm rate. Simulations based on synthetic data as well as real data traces collected by 18 sensors show that our system-level calibration scheme can improve the detection performance of a sensor network by up to 50%.
Article
Full-text available
In this paper we study the problem of estimating the channel parameters for a generic wireless sensor network (WSN) in a completely distributed manner, using consensus algorithms. Specifically, we first propose a distributed strategy to minimize the effects of unknown constant offsets in the readings of the received signal strength indicator due to uncalibrated sensors. Then we show how the computation of the optimal wireless channel parameters, which are the solution of a global least-squares optimization problem, can be obtained with a consensus-based algorithm. The proposed algorithms are general algorithms for sensor calibration and distributed least-squares parameter identification, and do not require knowledge of either the global topology of the network or the total number of nodes. Finally, we apply these algorithms to experimental data collected from an indoor WSN.
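The consensus mechanism underlying this kind of distributed estimation can be sketched with a basic average-consensus iteration. The ring topology, weights, and initial offset estimates below are hypothetical; the paper's algorithms are far more general:

```python
import numpy as np

# Toy average consensus on a ring of 5 nodes: each node starts with its
# own local offset estimate and repeatedly averages with its two
# neighbors, using no central coordinator and no knowledge of the
# network size.
x = np.array([2.0, -1.0, 0.5, 3.0, -4.0])
n = len(x)

# Doubly stochastic weight matrix for a ring (self + two neighbors).
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 1 / 3
    W[i, (i - 1) % n] = 1 / 3
    W[i, (i + 1) % n] = 1 / 3

target = x.mean()
for _ in range(200):
    x = W @ x

# All nodes converge to the network-wide average.
print(np.allclose(x, target, atol=1e-6))
```

Because the weight matrix is doubly stochastic, the iteration preserves the network average while contracting disagreement, which is the property the paper's offset-compensation strategy builds on.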
Article
Growing progress in sensor technology has constantly expanded the number and range of low-cost, small, and portable sensors on the market, increasing the number and type of physical phenomena that can be measured with wirelessly connected sensors. Large-scale deployments of wireless sensor networks (WSN) involving hundreds or thousands of devices and limited budgets often constrain the choice of sensing hardware, which generally has reduced accuracy, precision, and reliability. Therefore, it is challenging to achieve good data quality and maintain error-free measurements during the whole system lifetime. Self-calibration or recalibration in ad hoc sensor networks to preserve data quality is essential, yet challenging, for several reasons, such as the existence of random noise and the absence of suitable general models. Calibration performed in the field, without accurate and controlled instrumentation, is said to be in an uncontrolled environment. This paper surveys current and fundamental self-calibration approaches and models for wireless sensor networks in uncontrolled environments.
Book
An Introduction to Statistical Learning provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years. This book presents some of the most important modeling and prediction techniques, along with relevant applications. Topics include linear regression, classification, resampling methods, shrinkage approaches, tree-based methods, support vector machines, clustering, and more. Color graphics and real-world examples are used to illustrate the methods presented. Since the goal of this textbook is to facilitate the use of these statistical learning techniques by practitioners in science, industry, and other fields, each chapter contains a tutorial on implementing the analyses and methods presented in R, an extremely popular open source statistical software platform. Two of the authors co-wrote The Elements of Statistical Learning (Hastie, Tibshirani and Friedman, 2nd edition 2009), a popular reference book for statistics and machine learning researchers. An Introduction to Statistical Learning covers many of the same topics, but at a level accessible to a much broader audience. This book is targeted at statisticians and non-statisticians alike who wish to use cutting-edge statistical learning techniques to analyze their data. The text assumes only a previous course in linear regression and no knowledge of matrix algebra.
Article
This paper proposes a system for activity recognition using multi-sensor fusion. In this system, four sensors are attached to the waist, chest, thigh, and side of the body. We present solutions for two factors that affect activity recognition accuracy: calibration drift and changes in sensor orientation. The datasets used to evaluate this system were collected from 8 subjects who were asked to perform 8 scripted normal activities of daily living (ADL), three times each. A Naïve Bayes classifier using multi-sensor fusion is adopted and achieves recognition accuracies of 70.88%–97.66% for 1–4 sensors.
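The Naïve Bayes fusion step can be sketched as follows: features from multiple body-worn sensors are concatenated into one vector, and each feature is treated as conditionally independent given the activity class. The activities, feature values, and two-sensor setup below are simplified illustrations, not the paper's eight-activity, four-sensor configuration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical feature: mean acceleration magnitude per sensor (2 sensors).
def simulate(activity_mean, n):
    return rng.normal(activity_mean, 0.3, size=(n, 2))

train = {
    "walking": simulate([1.0, 1.2], 100),
    "sitting": simulate([0.1, 0.0], 100),
}

# Fit per-class Gaussian parameters; the naive independence assumption
# lets us model each sensor's feature separately given the class.
params = {c: (X.mean(axis=0), X.std(axis=0)) for c, X in train.items()}

def log_likelihood(x, mean, std):
    # Sum of per-feature Gaussian log-densities (constants dropped).
    return float(np.sum(-0.5 * ((x - mean) / std) ** 2 - np.log(std)))

def classify(x):
    # Fusion: pick the class maximizing the joint log-likelihood over
    # all sensors' features.
    return max(params, key=lambda c: log_likelihood(x, *params[c]))

print(classify(np.array([1.1, 1.0])))
```

Adding sensors simply extends the feature vector, which is why accuracy in the paper improves as more of the four sensors are fused.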
Conference Paper
In this paper, we present algorithms for in-situ calibration of sensor networks for distributed detection in the parallel fusion architecture. The wireless sensors act as local detectors and transmit preliminary detection results to an access point or fusion center for decision combining. In order to implement an optimal fusion center, both the performance parameters of each local detector (i.e., its probability of false alarm and probability of miss) as well as the wireless channel conditions must be known. However, in real-world applications these statistics may be unknown or vary in time. In our approach, the fusion center receives a collection of labeled samples from the sensor nodes after deployment of the network and calibrates the impact of individual sensors on the final detection result. In the case that local sensor decisions are independent, we employ maximum likelihood parameter estimation techniques, whereas in the case of arbitrarily correlated sensor outputs, we use the method of kernel smoothing. The obtained fusion rules are both asymptotically optimal and show good performance for finite sample sizes.
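Once each local detector's false-alarm and miss probabilities are calibrated, the optimal fusion rule for independent sensors is a weighted vote using log-likelihood ratios. The per-sensor probabilities and the decision vector below are hypothetical values for illustration:

```python
import math

# Calibrated per-sensor error statistics (hypothetical values).
p_fa = [0.10, 0.05, 0.20]     # probability of false alarm per sensor
p_miss = [0.15, 0.30, 0.10]   # probability of miss per sensor

def fuse(decisions, threshold=0.0):
    """Combine binary local decisions via a log-likelihood-ratio sum:
    reliable sensors (low p_fa, low p_miss) get larger weights."""
    llr = 0.0
    for d, fa, miss in zip(decisions, p_fa, p_miss):
        if d == 1:   # sensor reported "target present"
            llr += math.log((1 - miss) / fa)
        else:        # sensor reported "target absent"
            llr += math.log(miss / (1 - fa))
    return 1 if llr > threshold else 0

# Two confident detections outweigh one miss-prone dissent.
print(fuse([1, 1, 0]))  # → 1
```

The paper's contribution is estimating these per-sensor statistics in situ (via maximum likelihood for independent sensors, kernel smoothing for correlated ones); this sketch shows only the fusion rule that consumes them.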
Conference Paper
Sensornet systems research is being conducted with various applications and deployment scenarios in mind. In many of these scenarios, the presumption is that the sensornet will be deployed and managed by users who do not have a background in computer science. In this paper we describe the "tiny application sensor kit" (TASK), a system we have designed for use by end-users with minimal sensornet sophistication. We describe the requirements that guided our design, the architecture of the system and results from initial deployments. Based on our experience to date we present preliminary design principles and research challenges that arise in delivering sensornet research to end users.
P. Schneider, N. Castell, I. Vallejo, M. V. W. Lahoz, and A. Bartonova, "Data fusion of crowdsourced observations and model data for high-resolution mapping of urban air quality," in 10th International Conference on Air Quality Science and Application, March 2016, p. 76.