Conference Paper

Towards Air Quality Estimation Using Collected Multimodal Environmental Data


Abstract

This paper presents an open platform that collects multimodal environmental data related to air quality from several sources, including official open sources, social media and citizens. Collecting and fusing different sources of air quality data into a unified air quality indicator is a highly challenging problem; addressing it leverages recent advances in image analysis, open hardware, machine learning and data fusion. Collecting data from multiple sources aims at obtaining complementary information, which is expected to increase the geographical coverage and temporal granularity of the air quality data. This diversity of sources also constitutes the main novelty of the presented platform compared with existing applications.
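As a rough illustration of how heterogeneous particulate matter measurements might be reduced to one qualitative indicator, the sketch below averages nearby observations and maps the result to a coarse label. The data class, breakpoints and labels are illustrative assumptions, not the platform's actual fusion scheme.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Observation:
    source: str          # e.g. "official_station", "diy_sensor", "image_estimate"
    pm25: float          # PM2.5 concentration in ug/m3
    lat: float
    lon: float

# Illustrative breakpoints only; real indices (e.g. CAQI/AQI) differ per country.
BANDS = [(10.0, "very good"), (20.0, "good"), (25.0, "medium"),
         (50.0, "bad"), (float("inf"), "very bad")]

def unified_indicator(observations: list[Observation]) -> str:
    """Fuse observations from different sources into one qualitative label."""
    if not observations:
        return "unknown"
    avg = mean(o.pm25 for o in observations)
    return next(label for limit, label in BANDS if avg <= limit)

if __name__ == "__main__":
    obs = [Observation("official_station", 18.0, 40.63, 22.95),
           Observation("diy_sensor", 24.5, 40.64, 22.94),
           Observation("image_estimate", 21.0, 40.63, 22.96)]
    print(unified_indicator(obs))   # -> "medium"
```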


... Low cost and relatively good correlation of results with reference air pollution stations (Karagulian et al., 2019) allow users to set up citizen science initiatives and involve local communities in global problem solving. The most relevant of these projects are listed in Table 1, which is an extension of the review carried out by Moumtzidou et al. (2016). The relatively simple design of citizen science sensors makes them suitable for do-it-yourself (DiY) workshops. ...
Preprint
Full-text available
The United Nations (UN) sustainable development goals (SDGs), a strategy to guide the world’s social and economic transformation, highlight the issue of urban air pollution in SDG 11. Open data, as an output of citizen science (CS), are needed to supply and improve the SDG indicator system. Therefore, we propose a CS framework to extend the paradigm of urban air pollution monitoring from particulate matter concentration levels to air quality-related health symptom load, and foster the development of a tier-3 SDG indicator (which we call indicator 11.6.3). Building this new perspective for CS contributions to the achievement of SDGs, we address the problem of crowdsourced data bias as a prerequisite for better quality open data output. The aim of this study is to propose an air pollution symptom mapping framework for citizen-driven research and to find the most robust data quality assurance system (QAs) in this field. The method includes a GeoWeb application as well as data quality assurance mechanisms based on conditional statements, in order to reduce crowdsourced data bias. A four-month crowdsourcing campaign, released in Lubelskie voivodship (Poland), resulted in 1823 outdoor reports with a rejection rate of up to 28%, depending on the applied QA system (QAs). Testing the QAs variants, we find the most robust data bias solving method in survey-based symptom mapping. The framework output is shared via GeoWeb dashboards, including the 11.6.3 indicator evaluation. By familiarizing the public with citizen science, a city can track the progress of its SDG achievements and increase the transparency of the process through the use of GeoWeb.
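The conditional-statement quality assurance described above could be prototyped roughly as follows; the region bounds, field names and rules are hypothetical stand-ins for the study's actual QAs variants.

```python
from datetime import datetime, timedelta

# Hypothetical bounding box, approximating the Lubelskie study region.
REGION = {"lat_min": 50.2, "lat_max": 52.4, "lon_min": 21.5, "lon_max": 24.2}

def passes_qa(report: dict, now: datetime | None = None) -> bool:
    """Apply simple conditional checks to a crowdsourced symptom report."""
    now = now or datetime.utcnow()
    in_region = (REGION["lat_min"] <= report["lat"] <= REGION["lat_max"]
                 and REGION["lon_min"] <= report["lon"] <= REGION["lon_max"])
    recent = timedelta(0) <= now - report["timestamp"] <= timedelta(hours=24)
    # Illustrative consistency rule: symptoms reported without any outdoor exposure
    # are flagged as potentially biased input.
    consistent = not (report["symptom_level"] > 0 and report["outdoors_minutes"] == 0)
    return in_region and recent and consistent

report = {"timestamp": datetime.utcnow() - timedelta(hours=2),
          "lat": 51.25, "lon": 22.57, "symptom_level": 2, "outdoors_minutes": 45}
print(passes_qa(report))  # True
```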
... The implementation and design of low-cost monitoring systems, along with easily accessible monitoring results, may constitute a partial solution and complement the currently insufficient conventional methods. This paper compares two low-cost systems comprising DFROBOT SEN0177 and NOVA PM SDS011 laser sensors, implemented within the frame of a European HORIZON 2020 project (hackAIR V1 and hackAIR V2 [4][5]), with the commercially available system Dylos DC1100 pro [6], and discusses the reliability of the respective measurements. Until now, only limited reference has been made in the literature to similar measurements, providing information on the sensors' reliability, their behavior over time (measurement drift), and the impact of weather conditions on the measured quantities [7][8][9][10][11] and [12]. ...
... However, the method is not limited to sunset conditions, is extended to images from users, social media and public webcams and furthermore uses a better representation of the local atmospheric characteristics. The methodology described in this chapter is part of the framework developed within the hackAIR project and constitutes an update of the system presented in [25] that overcomes several of its limitations (e.g. need for more images, better sky localization methods). ...
Chapter
Air pollution causes nearly half a million premature deaths each year in Europe. Despite air quality directives that demand compliance with air pollution value limits, many urban populations continue being exposed to air pollution levels that exceed by far the guidelines. Unfortunately, official air quality sensors are sparse, limiting the accuracy of the provided air quality information. In this chapter, we explore the possibility of extending the number of air quality measurements that are fed into existing air quality monitoring systems by exploiting techniques that estimate air quality based on sky-depicting images. We first describe a comprehensive data collection mechanism and the results of an empirical study on the availability of sky images in social image sharing platforms and on webcam sites. In addition, we present a methodology for automatically detecting and extracting the sky part of the images leveraging deep learning models for concept detection and localization. Finally, we present an air quality estimation model that operates on statistics computed from the pixel color values of the detected sky regions.
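A minimal sketch of the final step (statistics over the pixel colours of the detected sky region) is given below, assuming an RGB image and a binary sky mask produced by the localization model; the particular feature set is an illustrative choice, not the chapter's exact one.

```python
import numpy as np

def sky_color_features(image: np.ndarray, sky_mask: np.ndarray) -> dict:
    """Compute simple colour statistics over the detected sky pixels.

    image:    H x W x 3 RGB array (uint8)
    sky_mask: H x W boolean array, True where the sky-localization model
              marked sky pixels.
    """
    sky = image[sky_mask].astype(float)          # N x 3 matrix of RGB values
    r, g, b = sky[:, 0], sky[:, 1], sky[:, 2]
    eps = 1e-6
    return {
        "mean_rg_ratio": float(np.mean(r / (g + eps))),   # R/G ratio, a common haze proxy
        "mean_gb_ratio": float(np.mean(g / (b + eps))),
        "std_r": float(np.std(r)),
        "std_g": float(np.std(g)),
        "std_b": float(np.std(b)),
        "sky_fraction": float(sky_mask.mean()),
    }

# Such features could then feed a regression model trained against
# co-located station measurements to estimate an air quality class.
```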
... Breezometer (https://breezometer.com), etc.) demonstrate the added value of up-to-date, spatiotemporally defined air-quality-related information and recommendation provision [37]. However, the above applications produce recommendations that generally apply to sensitive people, without any specialization to specific individuals' needs. ...
Article
Full-text available
Although air pollution is one of the most significant environmental factors posing a threat to human health worldwide, air quality data are scarce or not easily accessible in most European countries. The current work aims to develop a centralized air quality data hub that enables citizens to contribute to air quality monitoring. In this work, data from official air quality monitoring stations are combined with air pollution estimates from sky-depicting photos and from low-cost sensing devices that citizens build on their own so that citizens receive improved information about the quality of the air they breathe. Additionally, a data fusion algorithm merges air quality information from various sources to provide information in areas where no air quality measurements exist.
... There are several initiatives, including projects and applications that exploit social media, such as Flickr tags [20], in order to create awareness about emergency situations or other health-related issues such as environmental conditions. First, within the hackAIR project [12], a platform has been developed for gathering and fusing environmental data, specifically particulate matter measurements, from official sources and social media communities such as publicly available images shared through Instagram. In [21], the authors describe a framework that distinguishes between informational and conversational tweets shared during any major event and especially natural disasters. ...
Conference Paper
Disaster monitoring based on social media posts has raised a lot of interest in computer science over the last decade, mainly due to the wide range of applications in public safety and security and due to the pervasiveness of social media not only in daily communication but also in life-threatening situations. Social media can be used as a valuable source for producing early warnings of imminent disasters. This paper presents a framework to analyse multimodal social media content in order to decide whether the content is relevant to flooding. This is very important since it enhances crisis situational awareness and supports various crisis management procedures such as preparedness. Evaluation on a benchmark dataset shows very good performance of both the text and image classification modules.
... To the best of our knowledge, no other DSS covers this multifaceted task as a whole through the adoption of ontologies. The proposed work comprises the operational EDSS of the hackAIR EU project [6]. ...
Chapter
As urban atmospheric conditions are tightly connected to citizens’ quality of life, the concept of efficient environmental decision support systems becomes highly relevant. However, the scale and heterogeneity of the involved data, together with the need for associating environmental information with physical reality, increase the complexity of the problem. In this work, we capitalize on the semantic expressiveness of ontologies to build a framework that uniformly covers all phases of the decision making process: from structuring and integration of data, to inference of new knowledge. We define a simplified ontology schema for representing the status of the environment and its impact on citizens’ health and actions. We also implement a novel ontology- and rule-based reasoning mechanism for generating personalized recommendations, capable of treating differently individuals with diverse levels of vulnerability under poor air quality conditions. The overall framework is easily adaptable to new sources and needs.
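A toy, rule-only version of the personalization idea (different advice for users with different vulnerability under the same air quality) might look like the snippet below; the profile fields, thresholds and messages are invented for illustration and do not reproduce the ontology schema or rules.

```python
def recommend(aqi_level: str, profile: dict) -> str:
    """Return a simple activity recommendation for one user profile.

    aqi_level: one of "good", "moderate", "poor" (illustrative scale)
    profile:   e.g. {"asthma": True, "age": 72, "outdoor_sports": True}
    """
    sensitive = (profile.get("asthma", False)
                 or profile.get("age", 0) >= 65
                 or profile.get("pregnant", False))

    if aqi_level == "good":
        return "No restrictions; outdoor activities are fine."
    if aqi_level == "moderate":
        return ("Consider shorter outdoor exercise." if sensitive
                else "Outdoor activities are generally fine.")
    # poor air quality
    return ("Stay indoors and keep medication at hand." if sensitive
            else "Limit prolonged or intense outdoor exercise.")

print(recommend("moderate", {"asthma": True, "outdoor_sports": True}))
```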
... To the best of our knowledge, no other DSS covers this multifaceted task as a whole through the adoption of ontologies. The proposed work comprises the operational EDSS of the hackAIR EU project [6]. ...
Conference Paper
Full-text available
As urban atmospheric conditions are tightly connected to citizens’ quality of life, the concept of efficient environmental decision support systems becomes highly relevant. However, the scale and heterogeneity of the involved data, together with the need for associating environmental information with physical reality, increase the complexity of the problem. In this work, we capitalize on the semantic expressiveness of ontologies to build a framework that uniformly covers all phases of the decision making process: from structuring and integration of data, to inference of new knowledge. We define a simplified ontology schema for representing the status of the environment and its impact on citizens’ health and actions. We also implement a novel ontology- and rule-based reasoning mechanism for generating personalized recommendations, capable of treating differently individuals with diverse levels of vulnerability under poor air quality conditions. The overall framework is easily adaptable to new sources and needs.
... Environmental data collection increasingly exploits the contribution of citizens, who cooperate through their mobile phones to acquire and process large geo-referenced data sets and to extract from them information usable in the study of natural and anthropic processes [1], [2], [3], [4], [5], [6], [7], [8], [9]. The main challenge of developing crowdsourcing applications for large-scale environmental geo-data collection is to offer citizens a useful and possibly entertaining experience, so as to motivate them to use the application frequently and spread the word about it to their social circles. ...
Article
Recently, real-time air quality estimation, which is close to our daily life, has attracted more and more attention all over the world. With the prevalence of mobile sensors, monitoring air quality with sensors mounted on vehicles is an emerging approach. Compared with traditional expensive monitoring stations, mobile sensors are cheaper and more abundant, but observations from these sensors have unstable spatial and temporal distributions, so existing models do not work well on this type of data. In this article, taking advantage of air quality data from mobile sensors, we propose a real-time urban air quality estimation method based on Gaussian Process Regression for the air pollution of unmonitored areas, pivoting on the diffusion effect and the accumulation effect of air pollution. In order to meet real-time demands, we propose a two-layer ensemble learning framework and a self-adaptivity mechanism to improve computational efficiency and adaptivity. We evaluate our model with real data from a mobile sensor system located in Beijing, China. The experiments show that our proposed model is superior to state-of-the-art spatial regression methods in both precision and time performance.
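The core regression step, stripped of the paper's two-layer ensemble and self-adaptivity mechanism, can be sketched with scikit-learn's Gaussian process regressor; the kernel choice and the synthetic sensor readings below are assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Synthetic mobile-sensor readings: (longitude, latitude) -> PM2.5 (ug/m3)
rng = np.random.default_rng(0)
X_obs = rng.uniform([116.2, 39.8], [116.6, 40.1], size=(200, 2))
y_obs = 60 + 25 * np.sin(10 * X_obs[:, 0]) + rng.normal(0, 5, 200)

# RBF kernel models spatial correlation (a smooth, diffusion-like field);
# WhiteKernel absorbs sensor noise.
kernel = 1.0 * RBF(length_scale=0.05) + WhiteKernel(noise_level=25.0)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X_obs, y_obs)

# Estimate the concentration (and its uncertainty) at an unmonitored location.
X_new = np.array([[116.4, 39.95]])
mean, std = gpr.predict(X_new, return_std=True)
print(f"estimated PM2.5: {mean[0]:.1f} +/- {std[0]:.1f}")
```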
Article
Full-text available
Over the last decades, technology has developed rapidly, and changes have been observed in a series of sectors of human life. One of these sectors is spatial planning, where new applications contribute to its skillful application. In particular, in the area of public participation in urban planning procedures, there is a need to motivate the public to participate actively by collecting data, creating maps, suggesting ideas and, finally, accepting or rejecting a design proposal. In that context, this paper investigates how new technologies contribute to the promotion of community engagement in urban planning, and attempts to identify the effects that technologically advanced applications in participatory planning are expected to have on the local community. To examine these issues, an international literature review is carried out and institutional guidelines in this sector at the European level are investigated. Furthermore, case studies are examined in order to establish a guide of good practices and to identify the effects of similar policies in the societies that implemented them. All of the above contribute to an ex-ante evaluation of the application of such practices in Greece, in order to determine how useful their integration would be to the country's spatial planning procedures.
Article
Outdoor air pollution is a serious environmental problem in many developing countries; obtaining timely and accurate information about urban air quality is a first step toward air pollution control. Many developing countries, however, do not have any monitoring stations and therefore lack the means to measure air quality. We address this problem by using social media to collect urban air quality information and propose a method for inferring urban air quality in Chinese cities based on China's largest social media platform, Sina Weibo, combined with other meteorological data. Our method includes a data crawler to locate and acquire air-quality-related historical Weibo data, a procedure for extracting indicators from these Weibo posts and factors from meteorological data, and a model to infer the air quality index (AQI) of a city based on the extracted Weibo indicators supported by meteorological factors. We implemented the proposed method in case studies of Beijing, Shanghai, and Wuhan, China. The results show that, based on the Weibo indicators and meteorological factors we extracted, this method can infer the air quality conditions of a city within narrow margins of error. The method presented in this article can aid air quality assessment in cities with few or even no air quality monitoring stations.
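At its simplest, the indicator-extraction idea (counting air-quality-related terms in posts and regressing the AQI on these counts together with meteorological factors) could be prototyped as below; the keyword list, features and toy data are placeholders rather than the paper's actual indicators.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

HAZE_TERMS = ["smog", "haze", "mask", "cough"]  # illustrative keyword list only

def social_indicator(posts: list[str]) -> float:
    """Fraction of posts mentioning at least one air-quality-related term."""
    hits = sum(any(t in p.lower() for t in HAZE_TERMS) for p in posts)
    return hits / max(len(posts), 1)

# Toy daily training data: [indicator, wind_speed, humidity] -> observed AQI
X = np.array([[0.02, 5.0, 40], [0.10, 2.0, 70], [0.25, 1.0, 85], [0.05, 4.0, 55]])
y = np.array([60, 150, 280, 95])

model = LinearRegression().fit(X, y)
today = [[social_indicator(["so much smog today", "nice walk in the park"]), 1.5, 80]]
print(f"inferred AQI: {model.predict(today)[0]:.0f}")
```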
Article
Full-text available
An optimal-estimation algorithm for inferring aerosol optical properties from digital twilight photographs is proposed. The sensitivity of atmospheric components and surface characteristics to brightness and color of twilight sky is investigated, and the results suggest that tropospheric and stratospheric aerosol optical thickness (AOT) are sensitive to condition of the twilight sky. The coarse–fine particle volume ratio is moderately sensitive to the sky condition near the horizon under a clean-atmosphere condition. A radiative transfer model that takes into account a spherical-shell atmosphere, refraction, and multiple scattering is used as a forward model. Error analysis shows that the tropospheric and stratospheric AOT can be retrieved without significant bias. Comparisons with results from other ground-based instruments exhibit reasonable agreement on AOT. A case study suggests that the AOT retrieval method can be applied to atmospheric conditions with varying aerosol vertical profiles and vertically inhomogeneous species in the troposphere.
Technical Report
Full-text available
This Discussion Paper specifies a potential OGC Candidate Standard for a JSON implementation of the OGC and ISO Observations and Measurements (O&M) conceptual model (OGC Observations and Measurements v2.0 also published as ISO/DIS 19156). This encoding is expected to be useful in RESTful implementations of observation services. More specifically, this Discussion Paper defines JSON schemas for observations, and for features involved in sampling when making observations. These provide document models for the exchange of information describing observation acts and their results, both within and between different scientific and technical communities
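The flavour of such a JSON observation document is illustrated below as a Python dictionary; the structure (phenomenon time, procedure, observed property, feature of interest, result) follows the O&M model, but the property names and URIs here are approximations and hypothetical examples, not the normative schema.

```python
import json

observation = {
    "type": "Measurement",
    "phenomenonTime": "2016-05-12T08:00:00Z",
    "resultTime": "2016-05-12T08:05:00Z",
    "procedure": "http://example.org/sensors/low-cost-pm/42",       # hypothetical URI
    "observedProperty": "http://example.org/properties/PM2.5",      # hypothetical URI
    "featureOfInterest": {
        "name": "Thessaloniki city centre",
        "geometry": {"type": "Point", "coordinates": [22.9444, 40.6401]},
    },
    "result": {"value": 23.4, "uom": "ug/m3"},
}

print(json.dumps(observation, indent=2))
```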
Article
Full-text available
There is a large amount of meteorological and air quality data available online. Often, different sources provide deviating and even contradicting data for the same geographical area and time. This implies that users need to evaluate the relative reliability of the information and then trust one of the sources. We present a novel data fusion method that merges the data from different sources for a given area and time, ensuring the best data quality. The method is a unique combination of land-use regression techniques, statistical air quality modelling and a well-known data fusion algorithm. We show experiments where a fused temperature forecast outperforms individual temperature forecasts from several providers. Also, we demonstrate that the local hourly NO2 concentration can be estimated accurately with our fusion method while a more conventional extrapolation method falls short. The method forms part of the prototype web-based service PESCaDO, designed to cater personalized environmental information to users.
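The fusion principle of weighting each provider by an estimate of its reliability can be illustrated with a simple inverse-variance average; this is only a schematic stand-in for the paper's combination of land-use regression, statistical modelling and its specific fusion algorithm.

```python
def fuse(estimates):
    """Inverse-variance weighted average of (value, variance) pairs.

    Sources with a low error variance (historically more reliable for the
    given area and time) dominate the fused value.
    """
    weights = [1.0 / var for _, var in estimates]
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / sum(weights)
    fused_var = 1.0 / sum(weights)
    return value, fused_var

# Hypothetical hourly NO2 estimates (ug/m3, error variance) from three providers.
sources = [(38.0, 16.0), (45.0, 4.0), (41.0, 9.0)]
print(fuse(sources))   # fused value lies closest to the low-variance source
```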
Article
Full-text available
We examine sunsets painted by famous artists as proxy information for the aerosol optical depth after major volcanic eruptions. Images derived from precision colour protocols applied to the paintings were compared to online images, and found that the latter, previously analysed, provide accurate information. Aerosol optical depths (AODs) at 550 nm, corresponding to Northern Hemisphere middle latitudes, calculated by introducing red-to-green (R / G) ratios from a large number of paintings to a radiative transfer model, were significantly correlated with independent proxies from stratospheric AOD and optical extinction data, the dust veil index, and ice core volcanic indices. AODs calculated from paintings were grouped into 50-year intervals from 1500 to 2000. The year of each eruption and the 3 following years were defined as "volcanic". The remaining "non-volcanic" years were used to provide additional evidence of a multidecadal increase in the atmospheric optical depths during the industrial "revolution". The increase of AOD at 550 nm calculated from the paintings grows from 0.15 in the middle 19th century to about 0.20 by the end of the 20th century. To corroborate our findings, an experiment was designed in which a master painter/colourist painted successive sunsets during and after the passage of Saharan aerosols over the island of Hydra in Greece. Independent solar radiometric measurements confirmed that the master colourist's R / G ratios which were used to model his AODs, matched the AOD values measured in situ by co-located sun photometers during the declining phase of the Saharan aerosol. An independent experiment was performed to understand the difference between R / G ratios calculated from a typical volcanic aerosol and those measured from the mineral aerosol during the Hydra experiment. It was found that the differences in terms of R / G ratios were small, ranging between -2.6% and +1.6%. Also, when analysing different parts of cloudless skies of paintings following major volcanic eruptions, any structural differences seen in the paintings had not altered the results discussed above. However, a detailed study on all possible sources of uncertainties involved (such as the impact of clouds on R / G ratios) still needs to be studied. Because of the large number of paintings studied, we tentatively propose the conclusion that regardless of the school, red-to-green ratios from great masters can provide independent proxy AODs that correlate with widely accepted proxies and with independent measurements.
Article
Full-text available
This paper addresses the problem of generating possible object locations for use in object recognition. We introduce selective search which combines the strength of both an exhaustive search and segmentation. Like segmentation, we use the image structure to guide our sampling process. Like exhaustive search, we aim to capture all possible object locations. Instead of a single technique to generate possible object locations, we diversify our search and use a variety of complementary image partitionings to deal with as many image conditions as possible. Our selective search results in a small set of data-driven, class-independent, high quality locations, yielding 99 % recall and a Mean Average Best Overlap of 0.879 at 10,097 locations. The reduced number of locations compared to an exhaustive search enables the use of stronger machine learning techniques and stronger appearance models for object recognition. In this paper we show that our selective search enables the use of the powerful Bag-of-Words model for recognition. The selective search software is made publicly available (Software: http://disi.unitn.it/~uijlings/SelectiveSearch.html).
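OpenCV's contrib package ships an implementation of this algorithm, so a minimal proposal-generation run looks roughly like the snippet below (the image path is a placeholder, and opencv-contrib-python is required).

```python
import cv2

img = cv2.imread("street.jpg")                      # placeholder image path
ss = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
ss.setBaseImage(img)
ss.switchToSelectiveSearchFast()                    # trades some recall for speed
rects = ss.process()                                # N x 4 array of (x, y, w, h) proposals
print(f"{len(rects)} object location proposals")

# Draw the first 100 proposals for inspection.
for (x, y, w, h) in rects[:100]:
    cv2.rectangle(img, (int(x), int(y)), (int(x + w), int(y + h)), (0, 255, 0), 1)
cv2.imwrite("proposals.jpg", img)
```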
Article
Full-text available
Online social and news media generate rich and timely information about real-world events of all kinds. However, the huge amount of data available, along with the breadth of the user base, requires a substantial effort of information filtering to successfully drill down to relevant topics and events. Trending topic detection is therefore a fundamental building block to monitor and summarize information originating from social sources. There are a wide variety of methods and variables and they greatly affect the quality of results. We compare six topic detection methods on three Twitter datasets related to major events, which differ in their time scale and topic churn rate. We observe how the nature of the event considered, the volume of activity over time, the sampling procedure and the pre-processing of the data all greatly affect the quality of detected topics, which also depends on the type of detection method used. We find that standard natural language processing techniques can perform well for social streams on very focused topics, but novel techniques designed to mine the temporal distribution of concepts are needed to handle more heterogeneous streams containing multiple stories evolving in parallel. One of the novel topic detection methods we propose, based on n-gram co-occurrence and topic ranking, consistently achieves the best performance across all these conditions, thus being more reliable than other state-of-the-art techniques.
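A bare-bones sketch of co-occurrence-based detection (counting term pairs that appear in the same post and ranking the strongest pairs) is shown below; it omits the temporal modelling and ranking refinements that the paper's method adds.

```python
from collections import Counter
from itertools import combinations
import re

def top_cooccurring_pairs(posts: list[str], k: int = 5):
    """Rank term pairs by how often they co-occur within the same post."""
    pair_counts = Counter()
    for post in posts:
        terms = sorted(set(re.findall(r"[a-z]{3,}", post.lower())))
        pair_counts.update(combinations(terms, 2))
    return pair_counts.most_common(k)

posts = [
    "heavy smog over the city again",
    "city traffic and smog this morning",
    "smog alert for the city centre",
    "lovely sunny weather today",
]
print(top_cooccurring_pairs(posts))   # ('city', 'smog') should rank first
```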
Chapter
Full-text available
It is common practice to present environmental information of spatial nature (like atmospheric quality patterns) in the form of pre-processed images. The current paper deals with the harmonization, comparison and reuse of Chemical Weather (CW) forecasts in the form of pre-processed images of varying quality and informational content, without having access to the original data. In order to compare, combine and reuse such environmental data, an innovative method for the inverse reconstruction of environmental data from images, was developed. The method is based on a new, neural adaptive data interpolation algorithm, and is tested on CW images coming from various European providers. Results indicate a very good performance that renders this method as appropriate to be used in various image-processing problems that require data reconstruction, retrieval and reuse.
Article
Full-text available
Domain-specific Web search engines are effective tools for reducing the difficulty experienced when acquiring information from the Web. Existing methods for building domain-specific Web search engines require human expertise or specific facilities. However, we can build a domain-specific search engine simply by adding domain-specific keywords, called "keyword spices," to the user's input query and forwarding it to a general-purpose Web search engine. Keyword spices can be effectively discovered from Web documents using machine learning technologies. The paper describes domain-specific Web search engines that use keyword spices for locating recipes, restaurants, and used cars.
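A minimal sketch of the idea, assuming hand-written rather than learned "keyword spices", is given below; in the paper the spices are boolean expressions discovered automatically from web documents with machine learning.

```python
def spice_query(user_query: str, domain: str) -> str:
    """Append domain-specific keywords ("keyword spices") to a user query
    before forwarding it to a general-purpose search engine."""
    # Hand-written, hypothetical spices for illustration only.
    SPICES = {
        "recipes": '(ingredients OR tablespoon) -review',
        "restaurants": '(menu OR "opening hours") -recipe',
    }
    return f"{user_query} {SPICES[domain]}"

print(spice_query("salmon", "recipes"))
# -> 'salmon (ingredients OR tablespoon) -review'
```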
Conference Paper
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called dropout that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
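For reference, this architecture is available off the shelf in torchvision; the snippet below loads it with the library's ImageNet-trained weights (not the original 2012 checkpoint) and runs a forward pass on a dummy batch.

```python
import torch
from torchvision import models

# Load AlexNet (five conv layers + three fully connected layers, 1000-way output).
model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
model.eval()

# Dummy batch of one 224x224 RGB image.
x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    logits = model(x)                      # shape: (1, 1000)
    top5 = logits.topk(5).indices.squeeze()
print(top5.tolist())                       # indices of the 5 most likely classes
```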
Conference Paper
Approving the new federal law on geo-information (RS 510.62), Switzerland gave a great impulse to the establishment of a national spatial data infrastructure, which, in agreement with the EU INSPIRE directive (2007/2/CE), shall be based on geo-services. This approach gives data maintainers the freedom to choose the desired formats, software and data model, but imposes the provision of the required information through a well-defined service. While some Geographical Web Services, like WMS and WFS, are today widely diffused and applied, others like SOS (Sensor Observation Service) are still a work in progress and need some revisions and verifications for their correct application. The primary aim of SOS is to provide a standardized service for accessing observations in a standard format. The secondary aim is to provide intelligent sensor networks with an interface for automatic sensor registration and observation storage. In order to investigate the maturity level of version 1.0 of the SOS OGC standard, the IST (Institute of Earth Sciences) has developed new software implementing this standard and has applied it to the management of the hydro-meteorological network of Canton Ticino. The development of the new software, named istSOS, together with its application in a real case, has provided the opportunity to evaluate some open issues, weaknesses and gaps that are discussed in this paper. As a result, the authors conclude that SOS v1.0 is currently incomplete and still open to ambiguities, but with some corrections it can easily become an invaluable resource for a great number of disciplines and actions, including climate change, risk reduction and security improvement.
Article
In this paper, we deal with the problem of extending and using different local descriptors, as well as exploiting concept correlations, toward improved video semantic concept detection. We examine how the state-of-the-art binary local descriptors can facilitate concept detection, we propose color extensions of them inspired by previously proposed color extensions of scale invariant feature transform, and we show that the latter color extension paradigm is generally applicable to both binary and nonbinary local descriptors. In order to use them in conjunction with a state-of-the-art feature encoding, we compact the above color extensions using PCA and we compare two alternatives for doing this. Concerning the learning stage of concept detection, we perform a comparative study and propose an improved way of employing stacked models, which capture concept correlations, using multilabel classification algorithms in the last layer of the stack. We examine and compare the effectiveness of the above algorithms in both semantic video indexing within a large video collection and in the somewhat different problem of individual video annotation with semantic concepts, on the extensive video data set of the 2013 TRECVID Semantic Indexing Task. Several conclusions are drawn from these experiments on how to improve the video semantic concept detection.
Article
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
Article
Instagram is a relatively new form of communication where users can instantly share their current status by taking pictures and tweaking them using filters. It has seen a rapid growth in the number of users as well as uploads since it was launched in October 2010. Despite the fact that it is the most popular photo sharing application, it has attracted relatively little attention from the web and social media research community. In this paper, we present a large-scale quantitative analysis of millions of users and pictures that we crawled over one month from Instagram. Our analysis reveals several insights about Instagram that were never studied before: 1) its social network properties are quite different from other popular social media like Twitter and Flickr, 2) people typically post once a week, and 3) people like to share their locations with friends. To the best of our knowledge, this is the first in-depth analysis of user activities, demographics, social network structure and user-generated content on Instagram.
Article
SBDART is a software tool that computes plane-parallel radiative transfer in clear and cloudy conditions within the earth's atmosphere and at the surface. All important processes that affect the ultraviolet, visible, and infrared radiation fields are included. The code is a marriage of a sophisticated discrete ordinate radiative transfer module, low-resolution atmospheric transmission models, and Mie scattering results for light scattering by water droplets and ice crystals. The code is well suited for a wide variety of atmospheric radiative energy balance and remote sensing studies. It is designed so that it can be used for case studies as well as sensitivity analysis. For small sets of computations or teaching applications it is available on the World Wide Web with a user-friendly interface. For sensitivity studies requiring many computations it is available by anonymous FTP as a well organized and documented FORTRAN 77 source code.
Article
Focused crawling is aimed at selectively seeking out pages that are relevant to a predefined set of topics. Since an ontology is a well-formed knowledge representation, ontology-based focused crawling approaches have come into research. However, since these approaches utilize manually predefined concept weights to calculate the relevance scores of web pages, it is difficult to acquire the optimal concept weights to maintain a stable harvest rate during the crawling process. To address this issue, we proposed a learnable focused crawling framework based on ontology. An ANN (artificial neural network) was constructed using a domain-specific ontology and applied to the classification of web pages. Experimental results show that our approach outperforms the breadth-first search crawling approach, the simple keyword-based crawling approach, the ANN-based focused crawling approach, and the focused crawling approach that uses only a domain-specific ontology.
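The control loop of a focused crawler (score candidate links by topical relevance and visit the most promising first) can be sketched in a few lines; the keyword scorer below is a crude stand-in for the paper's ontology-driven ANN classifier, and the in-memory "web" replaces real HTTP fetching.

```python
import heapq

def relevance(text: str, topic_terms: set[str]) -> float:
    """Crude stand-in for an ontology/ANN relevance score of a page."""
    words = text.lower().split()
    return sum(w in topic_terms for w in words) / max(len(words), 1)

def focused_crawl(pages: dict[str, str], links: dict[str, list[str]],
                  topic_terms: set[str], budget: int = 10) -> list[str]:
    """Best-first crawl over an in-memory 'web' (url -> text, url -> outlinks)."""
    frontier = [(-1.0, url) for url in pages]        # max-heap via negated scores
    heapq.heapify(frontier)
    visited, order = set(), []
    while frontier and len(order) < budget:
        _, url = heapq.heappop(frontier)
        if url in visited or url not in pages:
            continue
        visited.add(url)
        order.append(url)
        score = relevance(pages[url], topic_terms)
        for out in links.get(url, []):
            if out not in visited:
                heapq.heappush(frontier, (-score, out))  # parent score as priority
        # A real crawler would fetch pages over HTTP and respect robots.txt.
    return order

pages = {"a": "air quality sensors and pollution data",
         "b": "cooking recipes and desserts",
         "c": "urban air pollution monitoring"}
print(focused_crawl(pages, {"a": ["b", "c"]}, {"air", "pollution", "quality"}))
```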
Article
Two different data assimilation techniques have been applied to assess exceedances of the daily and annual mean limit values for PM10 on the regional scale in Europe. The two methods include a statistical interpolation method (SI), based on residual kriging after linear regression of the model, and Ensemble Kalman filtering (EnKF). Both methods are applied using the LOTOS-EUROS model. Observations for the assimilation and validation of the methods have been retrieved from the Airbase database using rural background stations only. For the period studied, 2003, 127 suitable stations were available. The LOTOS-EUROS model is found to underestimate PM10 concentrations by a factor of 2. This large model bias is found to be prohibitive for the effective use of the EnKF methodology and a bias correction was required for the filter to function effectively. The results of the study show that both methods provide significant improvement on the model calculations when compared to an independent set of validation stations. The total root mean square error of the daily mean concentrations of PM10 at the validation stations was reduced from 16.7 μg m⁻³ for the model to 9.2 μg m⁻³ using SI and to 13.5 μg m⁻³ using EnKF. Similarly, correlation (R²) is also significantly improved from 0.21 for the model to 0.66 using SI and 0.41 using EnKF. Significant improvement in the annual mean and number of exceedance days of PM10 is also seen. In addition to the validation of the methods, maps of exceedances and their associated uncertainty are presented. The most effective methodology is found to be the statistical interpolation method. The application of EnKF is novel and yields promising results, although its application to PM10 still needs to be improved.
Article
It has become increasingly difficult to locate relevant information on the Web, even with the help of Web search engines. Two approaches to addressing the low precision and poor presentation of search results of current search tools are studied: meta-search and document categorization. Meta-search engines improve precision by selecting and integrating search results from generic or domain-specific Web search engines or other resources. Document categorization promises better organization and presentation of retrieved results. This article introduces MetaSpider, a meta-search engine that has real-time indexing and categorizing functions. We report in this paper the major components of MetaSpider and discuss related technical approaches. Initial results of a user evaluation study comparing MetaSpider, NorthernLight, and MetaCrawler in terms of clustering performance and of time and effort expended show that MetaSpider performed best in precision rate, but disclose no statistically significant differences in recall rate and time requirements. Our experimental study also reveals that MetaSpider exhibited a higher level of automation than the other two systems and facilitated efficient searching by providing the user with an organized, comprehensive view of the retrieved documents.
  • AirForU (https://www.uclahealth.org/Pages/AirForU-App)