Chapter

Towards Improved Air Quality Monitoring Using Publicly Available Sky Images


Abstract

Air pollution causes nearly half a million premature deaths each year in Europe. Despite air quality directives that demand compliance with air pollution limit values, many urban populations continue to be exposed to air pollution levels that far exceed the guidelines. Unfortunately, official air quality sensors are sparse, limiting the accuracy of the provided air quality information. In this chapter, we explore the possibility of extending the number of air quality measurements that are fed into existing air quality monitoring systems by exploiting techniques that estimate air quality based on sky-depicting images. We first describe a comprehensive data collection mechanism and the results of an empirical study on the availability of sky images in social image sharing platforms and on webcam sites. In addition, we present a methodology for automatically detecting and extracting the sky part of the images, leveraging deep learning models for concept detection and localization. Finally, we present an air quality estimation model that operates on statistics computed from the pixel color values of the detected sky regions.


... The geo-tagged social data adds to our knowledge of pollution levels, such as hotspots, health exposures, human responses and awareness (Charitidis et al., 2019). Data extracted from publicly available social media images, correlated with monitoring, satellite and meteorological data, is proving to be useful in nowcasting and AQI prediction (Kosmidis et al., 2018; Spyromitros-Xioufis et al., 2018; Khaefi et al., 2018). ...
... al., 2019; Piedrahita et al., 2014; Hu et al., 2014; Maag et al., 2018; Henderson et al., 2016; Arvind et al., 2016; Chen et al., 2018; Yarza et al., 2015; Castell et al., 2015; Gately et al., 2017; Steinle et al., 2015; Skjetne and Liu, 2017; Constant, 2018; de Nazelle et al., 2013; Larkin and Hystad, 2017; Nyhan et al., 2016, 2019; Steinle et al., 2019; Kosmidis et al., 2018; Spyromitros-Xioufis et al., 2018; Khaefi et al., 2018; Du et al., 2016; Jiang et al., 2019; Dong et al., 2019; Zheng et al., 2019a, 2019b; Upadhyay and Upadhyay, 2017; Yan et al., 2019; Jiang et al., 2015; Degbelo et al., 2016; EEA, 2019; English et al., 2018; Jerrett et al., 2017; Liu et al., 2014; Castell et al., 2018; Davis et al., 2020; Mahajan et al., 2020; Snyder et al., 2013; Zheng et al., 2019a; US EPA, 2015; Hano et al. ...
Article
Cities foster economic growth. However, growing cities also contribute to air pollution and climate change. The paper provides a perspective regarding the opportunity available in addressing urban air quality management (UAQM) issues using a smart city framework in the context of ‘urban computing’. Traditionally, UAQM has been built on sparse regulatory monitoring, enhanced with satellite data and forecast models. ‘Fourth Industrial Revolution’ (4IR) technologies such as the Internet of Things (IoT), big data, artificial intelligence, smartphones, and social and cloud computing are reshaping urban conglomerates worldwide. Cities can harness these ubiquitous technologies in concert with traditional methods to improve air quality governance and quality of life. This paper discusses the role of urban computing in UAQM through a review of scientific publications and ‘grey literature’ from technical reports of governments, international organizations and institutional websites. It provides an interdisciplinary knowledge repository on urban computing applications for air quality functions. It highlights the potential of integrated technologies in enabling data-driven, strategic and real-time mitigation governance actions and in helping citizens take informed decisions. It recommends a ‘fit for purpose’ multi-technology framework for UAQM services in emerging smart cities.
... The second approach for sky localization consists of a set of heuristic rules that aim to identify pixels that meet certain criteria with respect to their color values and the color values of neighboring pixels. In rough terms (a more detailed description of this algorithm can be found in [18]), if R, G, and B denote the Red, Green, and Blue values of each pixel, sky pixels must satisfy the following three conditions: ...
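The three conditions themselves are elided in the excerpt above, so the following minimal Python sketch fills them in with illustrative rules only (blue-channel dominance, a brightness floor, a blue-red margin, and a neighborhood smoothness check); the actual thresholds used in hackAIR are given in [18].

import numpy as np

def sky_mask(img, blue_margin=10, min_blue=80):
    """img: H x W x 3 uint8 RGB array; returns a boolean sky mask.
    All thresholds are illustrative assumptions, not the rules of [18]."""
    r = img[..., 0].astype(int)
    g = img[..., 1].astype(int)
    b = img[..., 2].astype(int)
    dominance = (b > g) & (g > r)      # condition 1: blue dominates green dominates red
    bright = b > min_blue              # condition 2: bright enough to be open sky
    margin = (b - r) > blue_margin     # condition 3: blue clearly exceeds red
    mask = dominance & bright & margin
    # Neighborhood check: keep pixels whose 3x3 neighborhood is mostly sky,
    # suppressing isolated false positives.
    padded = np.pad(mask, 1, mode="edge")
    neigh = sum(padded[i:i + mask.shape[0], j:j + mask.shape[1]].astype(int)
                for i in range(3) for j in range(3))
    return mask & (neigh >= 6)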
... Furthermore, sky radiances obtained by digital cameras were compared with CIMEL sunphotometer radiances, finding mean absolute differences between 2% and 15% except for pixels near the sun and high scattering angles [25]. The method used in hackAIR is based on the comparison of the R/G and G/B ratios from images and precalculated LUTs to retrieve AOD [18]. ...
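To make the ratio-and-LUT idea concrete, here is a hedged Python sketch that matches the mean R/G and G/B ratios of a detected sky region against a small table; the toy LUT entries below are placeholders, whereas the tables used in hackAIR are precalculated with a radiative transfer model [18].

import numpy as np

# Toy lookup table: rows of (R/G ratio, G/B ratio, AOD). The numbers are
# placeholders for illustration only.
LUT = np.array([
    [0.55, 0.75, 0.05],
    [0.65, 0.80, 0.15],
    [0.75, 0.85, 0.30],
    [0.85, 0.90, 0.50],
])

def estimate_aod(sky_pixels):
    """sky_pixels: N x 3 float RGB values from the detected sky region."""
    r, g, b = sky_pixels[:, 0], sky_pixels[:, 1], sky_pixels[:, 2]
    rg, gb = np.mean(r / g), np.mean(g / b)
    # Nearest neighbour in (R/G, G/B) space picks the best-matching AOD.
    dist = (LUT[:, 0] - rg) ** 2 + (LUT[:, 1] - gb) ** 2
    return LUT[np.argmin(dist), 2]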
Article
Full-text available
Although air pollution is one of the most significant environmental factors posing a threat to human health worldwide, air quality data are scarce or not easily accessible in most European countries. The current work aims to develop a centralized air quality data hub that enables citizens to contribute to air quality monitoring. In this work, data from official air quality monitoring stations are combined with air pollution estimates from sky-depicting photos and from low-cost sensing devices that citizens build on their own so that citizens receive improved information about the quality of the air they breathe. Additionally, a data fusion algorithm merges air quality information from various sources to provide information in areas where no air quality measurements exist.
... The sky regions of the images are used in their analysis, achieving a maximum accuracy of 59.38% using a CNN. Spyromitros-Xioufis et al. [4] also used the sky region of visual images to develop an air quality estimation model. They performed the analysis in three steps: sky region detection, sky region localization, and air quality estimation. ...
Conference Paper
Increasing levels of air pollution are now a threatening issue and result in severe health problems. Predicting the air pollution level from visual images can help society by raising awareness and maintaining safety. The availability of high-resolution cameras in smartphones can accelerate the aim of decreasing air pollution effects by enabling simple systems for pollution level detection using pixel information. Aiming at this requirement, in this study we propose a technique to detect the pollution level through image texture feature extraction and classification. An online available image dataset of Beijing has been used for the experiments, and particulate matter 2.5 (PM2.5) is used for validation of the classification. Five neighborhood-pixel-influence-oriented texture features are computed from each image and used as the input of an artificial neural network for classification. The outcome of the classification shows 84.2% correct pollution level detection compared with PM2.5 values. The method's accuracy and low processing complexity therefore make it an acceptable and easy approach for image-based pollution level detection. Keywords: Air pollution, Classification, Feature extraction, Image, Machine learning
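The five neighborhood-influence texture features are not spelled out in the abstract, so the sketch below substitutes five standard gray-level co-occurrence matrix (GLCM) statistics feeding a small neural network; it mirrors the texture-features-plus-ANN pipeline rather than reproducing the paper's exact features.

import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.neural_network import MLPClassifier

def texture_features(gray):
    """gray: H x W uint8 image; returns five GLCM statistics (an
    illustrative stand-in for the paper's five texture features)."""
    glcm = graycomatrix(gray, distances=[1], angles=[0], levels=256,
                        symmetric=True, normed=True)
    props = ["contrast", "dissimilarity", "homogeneity", "energy", "correlation"]
    return np.array([graycoprops(glcm, p)[0, 0] for p in props])

# Training would pair these features with pollution-level labels from PM2.5 bins:
# clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000)
# clf.fit(np.stack([texture_features(im) for im in images]), labels)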
... Finally, it would be interesting to study whether better accuracy could be obtained by exploiting the image content of tweets using image-based air quality estimation approaches (e.g. [25]). ...
Article
An air quality monitoring system systematically monitors the level of pollutants in the air by measuring the concentration of particular pollutants in the surrounding and outside environment. The strategy for developing such a monitoring system should ensure acceptable data quality, the storage and recording of data in a database, the analysis of the data, and the presentation of results. To this end, we have proposed a smart air quality monitoring system that helps prevent and control airborne allergies and reduces the burden of disease and the cost of treatment. The healthcare data layers and healthcare APIs for standardizing smart health predictive analytics were derived using meta-heuristic and machine learning algorithms. This work mainly focuses on improving existing expert systems for air quality monitoring. A detailed study of the various terms related to air quality monitoring has been carried out, and a new approach is proposed that gives better outcomes. In the proposed work, meta-heuristic firefly optimization is applied to optimize the selected features during the feature selection process; the features are then classified using a support vector machine, which predicts the index level and gives a better precision and recall of 95.7% and 93.1%, respectively, and an accuracy of 94.4% when compared with existing approaches.
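As a rough illustration of the firefly-plus-SVM pipeline described above, the sketch below wraps a toy firefly search around cross-validated SVM accuracy to select features; the parameter values, the thresholding of firefly positions into feature masks, and the synthetic dataset are all assumptions, not the paper's setup.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-in for the paper's air quality dataset (an assumption).
X, y = make_classification(n_samples=200, n_features=20, n_informative=6,
                           random_state=0)

def fitness(weights):
    mask = weights > 0.5          # threshold firefly position into a feature mask
    if not mask.any():
        return 0.0
    return cross_val_score(SVC(), X[:, mask], y, cv=3).mean()

n_fireflies, n_iter = 8, 10
beta0, gamma, alpha = 1.0, 1.0, 0.1   # attraction, absorption, random-walk step
pos = rng.random((n_fireflies, X.shape[1]))
bright = np.array([fitness(p) for p in pos])

for _ in range(n_iter):
    for i in range(n_fireflies):
        for j in range(n_fireflies):
            if bright[j] > bright[i]:     # dimmer firefly i moves toward brighter j
                r2 = np.sum((pos[i] - pos[j]) ** 2)
                step = beta0 * np.exp(-gamma * r2) * (pos[j] - pos[i])
                pos[i] = np.clip(pos[i] + step
                                 + alpha * (rng.random(X.shape[1]) - 0.5), 0, 1)
                bright[i] = fitness(pos[i])

best = pos[bright.argmax()] > 0.5
print("selected features:", np.flatnonzero(best), "CV accuracy:", bright.max())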
Article
Full-text available
The study was undertaken in Krakow, which is situated in Lesser Poland Voivodeship, where bad PM10 air-quality indicators occurred on more than 100 days in the years 2010-2019. Krakow has continuous air quality measurement in seven locations that are run by the Province Environmental Protection Inspectorate. The research aimed to create regression and classification models for PM10 and PM2.5 estimation based on sky photos and basic weather data. For this research, one short video with a resolution of 1920 × 1080 px was captured each day. From each film, only five frames were used, the information from which was averaged. Then, texture analysis was performed on each averaged photo frame. The results of the texture analysis were used in the regression and classification models. The regression models' quality for the test datasets equals 0.85 and 0.73 for PM10, and 0.63 for PM2.5. The quality of each classification model differs (0.86 and 0.73 for PM10, and 0.80 for PM2.5). The obtained results show that the created classification models could be used in PM10 and PM2.5 air quality assessment. Moreover, the character of the obtained regression models indicates that their quality could be enhanced; thus, improved results could be obtained.
Article
Full-text available
Investigating perceived air quality (AQ) in urban areas is a rather new topic of interest. Papers presenting results from studies on perception of AQ have thus far focused on the individual characteristics leading to a certain AQ perception or have compared personal perception with on-site measurements. Here we present a novel approach, namely applying volunteered geographic information (VGI) technologies in urban AQ monitoring. We present two smartphone applications that have been developed and applied in two EU projects (FP7 CITI-SENSE and H2020 hackAIR) to obtain citizens' perception of AQ. We focus on observations reported through the smartphone apps for the greater Oslo area in Norway. In order to evaluate whether the reports on perceived AQ contain information about the actual spatial patterns of AQ, we carried out a comparison of the perception data against the output from the high-resolution urban AQ model EPISODE. The results indicate an association between modelled annual average pollutant concentrations and the provided perception reports. This demonstrates that the spatial patterns of perceived AQ are not entirely random but follow to some extent what would be expected due to proximity of emission sources and transport. This information shows that VGI about citizens' perception of AQ has the potential to identify areas with low environmental quality for urban development.
Article
Acquiring the reflectance, radiance, and related structural cloud properties from repositories of historical sky images is a challenging and computationally intensive task, especially when performed manually or by means of non-automated approaches. In this article, a quick, efficient, and self-adaptive Python tool for the acquisition and analysis of cloud segmentation properties, applicable to images from all-sky image repositories, is presented, along with a case study demonstrating its usage and the overall efficacy of the technique. The proposed Python tool aims to establish a new data extraction technique and to improve the accessibility of data for future researchers, utilizing freely available libraries in the Python programming language with the ability to be translated into other programming languages. After development and testing of the Python tool in determining cloud and whole-sky segmentation properties, over 42,000 sky images were analyzed in a relatively short time of just under 40 min, with an average execution time of about 0.06 s per image analysis.
Article
Air pollution has become an issue of worldwide concern, and automatic estimation of air quality can provide positive guidance for both individual and industrial behaviors. Given that the traditional instrument-based method incurs high economic and labor costs for instrument purchase and maintenance, this paper proposes an effective, efficient, and cheap photo-based method for air quality estimation in the case of particulate matter (PM2.5). The success of the proposed method lies in extracting two categories of features (the gradient similarity and the distribution shape of pixel values in the saturation map) by observing the photo appearances captured under different PM2.5 concentrations. Specifically, the gradient similarity is extracted to measure the structural information loss, with the consideration that PM2.5 attenuates the light rays emitted from objects and accordingly distorts the structures of the formed photo. Meanwhile, the saturation map is fitted with a Weibull distribution to quantify the color information loss. By combining the two features, a primary PM2.5 concentration estimator is obtained. Next, a nonlinear function is adopted to map the primary estimate to the real PM2.5 concentration. Extensive experiments on real data captured by a professional PM2.5 instrument demonstrate the effectiveness and efficiency of the proposed method. Specifically, it is highly consistent with real sensor measurements and requires low implementation time.
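A minimal sketch of the two feature families described above: an HSV-style saturation map fitted with a Weibull distribution, and a simple gradient statistic as a stand-in for the paper's gradient-similarity measure (which is defined against a reference photo and is not reproduced exactly here).

import numpy as np
from scipy.stats import weibull_min

def saturation(img):
    """img: H x W x 3 float RGB in [0, 1]; HSV-style saturation map."""
    mx, mn = img.max(axis=2), img.min(axis=2)
    return np.where(mx > 0, (mx - mn) / np.maximum(mx, 1e-6), 0.0)

def weibull_shape_scale(sat):
    # Fit a Weibull distribution to the saturation values; PM2.5 haze
    # compresses saturation and shifts the fitted shape/scale.
    data = sat.ravel()
    data = data[data > 1e-3]          # drop zeros for a stable fit
    shape, _, scale = weibull_min.fit(data, floc=0)
    return shape, scale

def gradient_energy(gray):
    # Simple gradient-magnitude statistic; haze attenuates structure,
    # lowering this value relative to a clear-day reference.
    gy, gx = np.gradient(gray.astype(float))
    return np.sqrt(gx ** 2 + gy ** 2).mean()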
Conference Paper
Full-text available
Air pollution has raised intensive public concern, especially in developing countries such as China and India. In contrast to expensive or unreliable methods such as sensor-based or social-network-based ones, photo-based air pollution estimation is a promising direction, yet little work has been done so far. Focusing on this immediate problem, this paper devises an effective convolutional neural network to estimate air quality based on photos. Our method comprises two ingredients: first, a negative log-log ordinal classifier is devised in the last layer of the network, which improves the ordinal discriminative ability of the model. Second, as a variant of the Rectified Linear Unit (ReLU), a modified activation function is developed for photo-based air pollution estimation. This function has been shown to alleviate the vanishing gradient issue effectively. We collect a set of outdoor photos and associate them with pollution levels from an official agency as the ground truth. Empirical experiments conducted on this real-world dataset show the capability of our method.
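For intuition, the numpy sketch below shows what a cumulative-link ordinal output with a log-log link can look like; it illustrates the general technique only and does not reproduce the paper's exact layer or its modified activation function.

import numpy as np

def loglog_cdf(z):
    # Inverse link for a log-log style ordinal model: F(z) = exp(-exp(-z)).
    return np.exp(-np.exp(-z))

def ordinal_nll(score, label, cutpoints):
    """score: scalar network output; label: class index in 0..K-1;
    cutpoints: ascending thresholds of length K-1 (assumed learnable)."""
    cum = loglog_cdf(cutpoints - score)        # P(y <= k) for k = 0..K-2
    cum = np.concatenate([[0.0], cum, [1.0]])  # pad with P(y<=-1)=0, P(y<=K-1)=1
    p = cum[label + 1] - cum[label]            # P(y == label)
    return -np.log(max(p, 1e-12))

# Example: 4 pollution levels, 3 ascending cutpoints.
print(ordinal_nll(score=0.3, label=1, cutpoints=np.array([-1.0, 0.5, 2.0])))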
Article
Full-text available
This study characterizes the spatiotemporal variability and relative contribution of different types of aerosols to the Aerosol Optical Depth (AOD) over the Eastern Mediterranean as derived from MODIS Terra (3/2000-12/2012) and Aqua (7/2002-12/2012) satellite instruments. For this purpose, a 0.1° × 0.1° gridded MODIS dataset was compiled and validated against sunphotometric observations from the AErosol RObotic NETwork (AERONET). The high spatial resolution and long temporal coverage of the dataset allows for the determination of local hot spots like megacities, medium sized cities, industrial zones, and power plant complexes, seasonal variabilities, and decadal averages. The average AOD at 550 nm (AOD550) for the entire region is ~ 0.22 ± 0.19 with maximum values in summer and seasonal variabilities that can be attributed to precipitation, photochemical production of secondary organic aerosols, transport of pollution and smoke from biomass burning in Central and Eastern Europe, and transport of dust from the Sahara Desert and the Middle East. The MODIS data were analyzed together with data from other satellite sensors, reanalysis projects and a chemistry-aerosol-transport model using an optimized algorithm tailored for the region and capable of estimating the contribution of different aerosol types to the total AOD550. The spatial and temporal variability of anthropogenic, dust and fine mode natural aerosols over land and anthropogenic, dust and marine aerosols over the sea is examined. The relative contribution of the different aerosol types to the total AOD550 exhibits a low/high seasonal variability over land/sea areas, respectively. Overall, anthropogenic aerosols, dust and fine mode natural aerosols account for ~ 51 %, ~ 34 % and ~ 15 % of the total AOD550 over land, while anthropogenic aerosols, dust and marine aerosols account for ~ 40 %, ~ 34 % and ~ 26 % of the total AOD550 over the sea, based on MODIS Terra and Aqua observations.
Article
Full-text available
This paper presents an open platform, which collects multimodal environmental data related to air quality from several sources including official open sources, social media and citizens. Collecting and fusing different sources of air quality data into a unified air quality indicator is a highly challenging problem, leveraging recent advances in image analysis, open hardware, machine learning and data fusion and is expected to result in increased geographical coverage and temporal granularity of air quality data.
Article
Full-text available
In this work, we assess the ability of RegCM4 regional climate model to simulate surface solar radiation (SSR) patterns over Europe. A decadal RegCM4 run (2000–2009) was implemented and evaluated against satellite-based observations from the Satellite Application Facility on Climate Monitoring (CM SAF) showing that the model simulates adequately the SSR patterns over the region. The bias between RegCM4 and CM SAF is +1.54 % for MFG (Meteosat First Generation) and +3.34 % for MSG (Meteosat Second Generation) observations. The relative contribution of parameters that determine the transmission of solar radiation within the atmosphere to the deviation appearing between RegCM4 and CM SAF SSR is also examined. Cloud macrophysical and microphysical properties such as cloud fractional cover (CFC), cloud optical thickness (COT) and cloud effective radius (Re) from RegCM4 are evaluated against data from CM SAF. The same procedure is repeated for aerosol optical properties such as aerosol optical depth (AOD), asymmetry factor (ASY) and single scattering albedo (SSA), as well as other parameters including surface broadband albedo (ALB) and water vapor amount (WV) using data from MACv1 aerosol climatology, from CERES satellite sensors and from ERA-Interim reanalysis. It is shown here that the good agreement between RegCM4 and satellite-based SSR observations can be partially attributed to counteracting effects among the above mentioned parameters. The contribution of each parameter to the RegCM4-CM SAF SSR deviations is estimated with the combined use of the aforementioned data and a radiative transfer model (SBDART). CFC, COT and AOD are the major determinants of these deviations; however, the other parameters also play an important role for specific regions and seasons.
Article
Full-text available
An optimal-estimation algorithm for inferring aerosol optical properties from digital twilight photographs is proposed. The sensitivity of atmospheric components and surface characteristics to brightness and color of twilight sky is investigated, and the results suggest that tropospheric and stratospheric aerosol optical thickness (AOT) are sensitive to condition of the twilight sky. The coarse–fine particle volume ratio is moderately sensitive to the sky condition near the horizon under a clean-atmosphere condition. A radiative transfer model that takes into account a spherical-shell atmosphere, refraction, and multiple scattering is used as a forward model. Error analysis shows that the tropospheric and stratospheric AOT can be retrieved without significant bias. Comparisons with results from other ground-based instruments exhibit reasonable agreement on AOT. A case study suggests that the AOT retrieval method can be applied to atmospheric conditions with varying aerosol vertical profiles and vertically inhomogeneous species in the troposphere.
Article
Full-text available
Exposure to fine particles can cause various diseases, and an easily accessible method to monitor the particles can help raise public awareness and reduce harmful exposures. Here we report a method to estimate PM air pollution based on analysis of a large number of outdoor images available for Beijing, Shanghai (China) and Phoenix (US). Six image features were extracted from the images and used, together with other relevant data, such as the position of the sun, date, time, geographic information and weather conditions, to predict the PM2.5 index. The results demonstrate that the image analysis method provides good prediction of PM2.5 indexes, and different features have different significance levels in the prediction.
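The abstract does not name the six features, so the sketch below assembles six plausible image statistics (a dark-channel haze proxy, contrast, a blue-red color shift, vertical structure, entropy, and brightness) purely to illustrate the features-plus-metadata setup; the paper's actual features may differ.

import numpy as np

def image_features(img):
    """img: H x W x 3 float RGB in [0, 1]. Six illustrative features;
    these are assumptions, not the paper's feature set."""
    gray = img.mean(axis=2)
    dark = img.min(axis=2)                        # dark-channel haze proxy
    hist, _ = np.histogram(gray, bins=64, range=(0, 1), density=True)
    p = hist[hist > 0] / hist[hist > 0].sum()
    entropy = -(p * np.log2(p)).sum()
    return np.array([
        dark.mean(),                              # haze level proxy
        gray.std(),                               # global contrast
        img[..., 2].mean() - img[..., 0].mean(),  # blue-red color shift
        np.gradient(gray)[0].std(),               # vertical structure
        entropy,                                  # tonal diversity
        gray.mean(),                              # overall brightness
    ])

# These features would be concatenated with sun position, date/time,
# location and weather variables before regressing against the PM2.5 index.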
Article
Full-text available
Sky detection is an important problem in computer vision. Generally, the color and gradient of pixels at the sky border change greatly, so the sky in a scene can be detected by locating sky border points. In this paper, a new method for sky detection based on border points is proposed, in which the border points in each column of the image are detected to identify complex sky regions. Unlike existing sky detection methods, the proposed algorithm is efficient in detecting sky in scenes of complex structure, for example, sky regions separated by buildings or flags. Furthermore, the paper introduces the concept of the sky border for sky detection, which makes the proposed scheme effective. The proposed method is able to handle even more complex sky conditions, including cloudy and sunny skies. Experimental results have proven the effectiveness of this method.
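A minimal sketch of the column-scan idea: walk down each image column and take the first row with a strong vertical color change as the sky border point; the change measure and threshold below are illustrative assumptions, not the paper's.

import numpy as np

def sky_border(img, thresh=40.0):
    """img: H x W x 3 uint8 RGB. Returns, per column, the row index of the
    estimated sky border (first strong vertical change from the top)."""
    f = img.astype(float)
    # Vertical color change between consecutive rows, summed over channels.
    dv = np.abs(np.diff(f, axis=0)).sum(axis=2)   # (H-1) x W
    border = np.full(img.shape[1], img.shape[0] - 1, dtype=int)
    for col in range(img.shape[1]):
        hits = np.flatnonzero(dv[:, col] > thresh)
        if hits.size:
            border[col] = hits[0]                 # first big jump = sky border
    return border  # pixels above border[col] in each column are labelled sky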
Article
Full-text available
This paper deals with content-based large-scale image retrieval using the state-of-the-art framework of VLAD and Product Quantization proposed by Jegou as a starting point. Demonstrating an excellent accuracy-efficiency trade-off, this framework has attracted increased attention from the community and numerous extensions have been proposed. In this work, we make an in-depth analysis of the framework that aims at increasing our understanding of its different processing steps and boosting its overall performance. Our analysis involves the evaluation of numerous extensions (both existing and novel) as well as the study of the effects of several unexplored parameters. We specifically focus on: a) employing more efficient and discriminative local features; b) improving the quality of the aggregated representation; and c) optimizing the indexing scheme. Our thorough experimental evaluation provides new insights into extensions that consistently contribute, and others that do not, to performance improvement, and sheds light onto the effects of previously unexplored parameters of the framework. As a result, we develop an enhanced framework that significantly outperforms the previous best reported accuracy results on standard benchmarks and is more efficient.
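For reference, the core VLAD aggregation step is compact enough to show in full: each local descriptor contributes its residual from the nearest codebook centroid, and the stacked residuals are power- and L2-normalized (a common post-processing choice; the exact normalization pipeline varies across the extensions studied).

import numpy as np

def vlad(descriptors, centroids):
    """descriptors: N x D local features; centroids: K x D visual words."""
    # Hard-assign each descriptor to its nearest centroid.
    d2 = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    assign = d2.argmin(axis=1)
    K, D = centroids.shape
    v = np.zeros((K, D))
    for k in range(K):
        members = descriptors[assign == k]
        if len(members):
            v[k] = (members - centroids[k]).sum(axis=0)  # residual sum
    v = v.ravel()
    v = np.sign(v) * np.sqrt(np.abs(v))   # power (signed square root) normalization
    n = np.linalg.norm(v)
    return v / n if n > 0 else v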
Article
Full-text available
We propose a deep convolutional neural network architecture codenamed "Inception", which was responsible for setting the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC 2014). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. This was achieved by a carefully crafted design that allows for increasing the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC 2014 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.
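The parallel-branch idea is easiest to see in code. Below is a minimal Inception-style block in PyTorch with the four canonical branches; the channel counts in the usage comment are placeholders rather than GoogLeNet's exact configuration.

import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    """Minimal Inception-style block: four parallel branches whose outputs
    are concatenated along the channel dimension."""
    def __init__(self, c_in, c1, c3r, c3, c5r, c5, cp):
        super().__init__()
        self.b1 = nn.Conv2d(c_in, c1, kernel_size=1)   # 1x1 branch
        self.b3 = nn.Sequential(                       # 1x1 reduce -> 3x3
            nn.Conv2d(c_in, c3r, 1), nn.ReLU(),
            nn.Conv2d(c3r, c3, 3, padding=1))
        self.b5 = nn.Sequential(                       # 1x1 reduce -> 5x5
            nn.Conv2d(c_in, c5r, 1), nn.ReLU(),
            nn.Conv2d(c5r, c5, 5, padding=2))
        self.bp = nn.Sequential(                       # 3x3 pool -> 1x1 projection
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Conv2d(c_in, cp, 1))

    def forward(self, x):
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.bp(x)], dim=1)

# Example (placeholder channel counts):
# InceptionBlock(192, 64, 96, 128, 16, 32, 32)(torch.randn(1, 192, 28, 28)).shape
# -> torch.Size([1, 256, 28, 28])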
Article
Full-text available
Blue color has been proven to be a useful and robust cue for sky detection, localization, and tracking in different image processing applications. In this paper, a pixel-based solution utilizing sky color information is proposed for sky detection. The sky color information is extracted through the comparison of the RGB values of a pixel. Based on experimental results on highly complex still images, our approach to sky detection has proven to be accurate, fast and simple.
Conference Paper
Full-text available
In this paper we study the problem of object detection for RGB-D images using semantically rich image and depth features. We propose a new geocentric embedding for depth images that encodes height above ground and angle with gravity for each pixel in addition to the horizontal disparity. We demonstrate that this geocentric embedding works better than using raw depth images for learning feature representations with convolutional neural networks. Our final object detection system achieves an average precision of 37.3%, which is a 56% relative improvement over existing methods. We then focus on the task of instance segmentation where we label pixels belonging to object instances found by our detector. For this task, we propose a decision forest approach that classifies pixels in the detection window as foreground or background using a family of unary and binary tests that query shape and geocentric pose features. Finally, we use the output from our object detectors in an existing superpixel classification framework for semantic scene segmentation and achieve a 24% relative improvement over current state-of-the-art for the object categories that we study. We believe advances such as those represented in this paper will facilitate the use of perception in fields like robotics.
Conference Paper
Full-text available
We aim to detect all instances of a category in an image and, for each instance, mark the pixels that belong to it. We call this task Simultaneous Detection and Segmentation (SDS). Unlike classical bounding box detection, SDS requires a segmentation and not just a box. Unlike classical semantic segmentation, we require individual object instances. We build on recent work that uses convolutional neural networks to classify category-independent region proposals (R-CNN [16]), introducing a novel architecture tailored for SDS. We then use category-specific, top-down figure-ground predictions to refine our bottom-up proposals. We show a 7 point boost (16% relative) over our baselines on SDS, a 5 point boost (10% relative) over state-of-the-art on semantic segmentation, and state-of-the-art performance in object detection. Finally, we provide diagnostic tools that unpack performance and provide directions for future work.
Article
Full-text available
This paper proposes a new hybrid architecture that consists of a deep Convolutional Network and a Markov Random Field. We show how this architecture is successfully applied to the challenging problem of articulated human pose estimation in monocular images. The architecture can exploit structural domain constraints such as geometric relationships between body joint locations. We show that joint training of these two model paradigms improves performance and allows us to significantly outperform existing state-of-the-art techniques.
Article
Full-text available
We examine sunsets painted by famous artists as proxy information for the aerosol optical depth after major volcanic eruptions. Images derived from precision colour protocols applied to the paintings were compared to online images, and it was found that the latter, previously analysed, provide accurate information. Aerosol optical depths (AODs) at 550 nm, corresponding to Northern Hemisphere middle latitudes, calculated by introducing red-to-green (R / G) ratios from a large number of paintings to a radiative transfer model, were significantly correlated with independent proxies from stratospheric AOD and optical extinction data, the dust veil index, and ice core volcanic indices. AODs calculated from paintings were grouped into 50-year intervals from 1500 to 2000. The year of each eruption and the 3 following years were defined as "volcanic". The remaining "non-volcanic" years were used to provide additional evidence of a multidecadal increase in the atmospheric optical depths during the industrial "revolution". The increase of AOD at 550 nm calculated from the paintings grows from 0.15 in the middle 19th century to about 0.20 by the end of the 20th century. To corroborate our findings, an experiment was designed in which a master painter/colourist painted successive sunsets during and after the passage of Saharan aerosols over the island of Hydra in Greece. Independent solar radiometric measurements confirmed that the master colourist's R / G ratios, which were used to model his AODs, matched the AOD values measured in situ by co-located sun photometers during the declining phase of the Saharan aerosol. An independent experiment was performed to understand the difference between R / G ratios calculated from a typical volcanic aerosol and those measured from the mineral aerosol during the Hydra experiment. It was found that the differences in terms of R / G ratios were small, ranging between -2.6% and +1.6%. Also, when analysing different parts of cloudless skies of paintings following major volcanic eruptions, any structural differences seen in the paintings had not altered the results discussed above. However, a detailed study of all possible sources of uncertainty involved (such as the impact of clouds on R / G ratios) is still needed. Because of the large number of paintings studied, we tentatively propose the conclusion that regardless of the school, red-to-green ratios from great masters can provide independent proxy AODs that correlate with widely accepted proxies and with independent measurements.
Article
Full-text available
Online social and news media generate rich and timely information about real-world events of all kinds. However, the huge amount of data available, along with the breadth of the user base, requires a substantial effort of information filtering to successfully drill down to relevant topics and events. Trending topic detection is therefore a fundamental building block to monitor and summarize information originating from social sources. There are a wide variety of methods and variables and they greatly affect the quality of results. We compare six topic detection methods on three Twitter datasets related to major events, which differ in their time scale and topic churn rate. We observe how the nature of the event considered, the volume of activity over time, the sampling procedure and the pre-processing of the data all greatly affect the quality of detected topics, which also depends on the type of detection method used. We find that standard natural language processing techniques can perform well for social streams on very focused topics, but novel techniques designed to mine the temporal distribution of concepts are needed to handle more heterogeneous streams containing multiple stories evolving in parallel. One of the novel topic detection methods we propose, based on n-gram cooccurrence and topic ranking, consistently achieves the best performance across all these conditions, thus being more reliable than other state-of-the-art techniques.
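As a toy illustration of the n-gram cooccurrence idea, the sketch below counts how often candidate n-grams appear together in the same tweet and ranks each n-gram by its aggregate cooccurrence; the scoring heuristic is illustrative, not the paper's exact method.

from collections import Counter
from itertools import combinations

def cooccurrence_ranking(tweets, n=2, top=10):
    """tweets: list of token lists. Ranks n-grams by how strongly they
    co-occur with other n-grams across tweets (illustrative scoring)."""
    cooc = Counter()
    for tokens in tweets:
        grams = {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
        for a, b in combinations(sorted(grams), 2):
            cooc[(a, b)] += 1
    score = Counter()
    for (a, b), c in cooc.items():
        score[a] += c
        score[b] += c
    return score.most_common(top)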
Article
Full-text available
Scene labeling consists of labeling each pixel in an image with the category of the object it belongs to. We propose a method that uses a multiscale convolutional network trained from raw pixels to extract dense feature vectors that encode regions of multiple sizes centered on each pixel. The method alleviates the need for engineered features, and produces a powerful representation that captures texture, shape, and contextual information. We report results using multiple postprocessing methods to produce the final labeling. Among those, we propose a technique to automatically retrieve, from a pool of segmentation components, an optimal set of components that best explain the scene; these components are arbitrary, for example, they can be taken from a segmentation tree or from any family of oversegmentations. The system yields record accuracies on the SIFT Flow dataset (33 classes) and the Barcelona dataset (170 classes) and near-record accuracy on Stanford background dataset (eight classes), while being an order of magnitude faster than competing approaches, producing a 320 × 240 image labeling in less than a second, including feature extraction.
Article
Full-text available
The Hamburg Aerosol Climatology version 1 (HAC-v1) is introduced. It describes the optical properties of tropospheric aerosols on monthly timescales and with global coverage at a spatial resolution of 1 degree in latitude and longitude. By providing aerosol radiative properties for any wavelength of the solar (or shortwave) and of the terrestrial (or longwave) radiation spectrum, as needed in radiative transfer applications, the HAC-v1 dataset lends itself to simplified and computationally efficient representations of tropospheric aerosol in climate studies. Estimates of aerosol radiative properties are provided for both total and anthropogenic aerosol in annual time-steps from pre-industrial times (i.e. starting with year 1860) well into the future (until the year 2100). Central to the aerosol climatology is the merging of monthly statistics of aerosol optical properties for current (year 2000) conditions. Hereby, locally sparse but trusted high-quality data from ground-based sun-photometer networks are merged onto complete background maps defined by central data from global modeling with complex aerosol modules. This merging yields 0.13 for the global annual mid-visible aerosol optical depth (AOD), with 0.07 attributed to aerosol sizes larger than 1 μm in diameter and 0.06 attributed to aerosol sizes smaller than 1 μm in diameter. Hereby, larger particles are less absorbing, with a single scattering albedo (SSA) of 0.98 compared to 0.93 for smaller sizes. Simulations by global modeling are applied to prescribe the vertical distribution and to estimate anthropogenic contributions to the smaller-size AOD as a function of time, with a value of 0.037 for current conditions. In a demonstration application the associated aerosol direct radiative effects are determined. For current conditions total aerosol is estimated to reduce the combined shortwave and longwave net-flux balance at the top of the atmosphere by about -1.6 W/m2, from which -0.5 W/m2 (with an uncertainty of +/- 0.2 W/m2) is attributed to anthropogenic activities. Based on past and projected aerosol emission data, the global anthropogenic direct aerosol impact (i.e. ToA cooling) is currently near the maximum and is projected to drop by 2100 to about -0.3 W/m2. The reported global averages are driven by considerable spatial and temporal variability. To better convey this diversity, regional and seasonal distributions of aerosol optical properties and their radiative effects are presented. On regional scales the anthropogenic direct aerosol forcing can be an order of magnitude stronger than the global average and it can be of either sign. It is also shown that maximum anthropogenic impacts have shifted during the last 30 years from the US and Europe to eastern and southern Asia.
Article
Full-text available
This paper reviews the many developments in estimates of the direct and indirect global annual mean radiative forcing due to present-day concentrations of anthropogenic tropospheric aerosols since Intergovernmental Panel on Climate Change [1996]. The range of estimates of the global mean direct radiative forcing due to six distinct aerosol types is presented. Additionally, the indirect effect is split into two components corresponding to the radiative forcing due to modification of the radiative properties of clouds (cloud albedo effect) and the effects of anthropogenic aerosols upon the lifetime of clouds (cloud lifetime effect). The radiative forcing for anthropogenic sulphate aerosol ranges from -0.26 to -0.82 W m-2; even the sign of the radiative forcing is not well established due to the competing effects of solar and terrestrial radiative forcing.
Article
Full-text available
Image category recognition is important to access visual information on the level of objects and scene types. So far, intensity-based descriptors have been widely used for feature extraction at salient points. To increase illumination invariance and discriminative power, color descriptors have been proposed. Because many different descriptors exist, a structured overview is required of color invariant descriptors in the context of image category recognition. Therefore, this paper studies the invariance properties and the distinctiveness of color descriptors (software to compute the color descriptors from this paper is available from http://www.colordescriptors.com) in a structured way. The analytical invariance properties of color descriptors are explored, using a taxonomy based on invariance properties with respect to photometric transformations, and tested experimentally using a data set with known illumination conditions. In addition, the distinctiveness of color descriptors is assessed experimentally using two benchmarks, one from the image domain and one from the video domain. From the theoretical and experimental results, it can be derived that invariance to light intensity changes and light color changes affects category recognition. The results further reveal that, for light intensity shifts, the usefulness of invariance is category-specific. Overall, when choosing a single descriptor and no prior knowledge about the data set and object and scene categories is available, the OpponentSIFT is recommended. Furthermore, a combined set of color descriptors outperforms intensity-based SIFT and improves category recognition by 8 percent on the PASCAL VOC 2007 and by 7 percent on the Mediamill Challenge.
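For reference, the opponent color space underlying OpponentSIFT is a fixed linear transform of RGB, sketched below: O3 carries intensity, while O1 and O2 describe color and are invariant to intensity shifts.

import numpy as np

def opponent_channels(img):
    """img: H x W x 3 float RGB. Returns the three opponent channels
    used by OpponentSIFT-style descriptors."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    o1 = (r - g) / np.sqrt(2)          # red-green opponent (intensity-shift invariant)
    o2 = (r + g - 2 * b) / np.sqrt(6)  # yellow-blue opponent
    o3 = (r + g + b) / np.sqrt(3)      # intensity
    return o1, o2, o3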
Conference Paper
Full-text available
This paper details an empirical study of large image sets taken by static cameras. These images have consistent correlations over the entire image and over time scales of days to months. Simple second-order statistics of such image sets show vastly more structure than exists in generic natural images or video from moving cameras. Using a slight variant of PCA, we can decompose all cameras into comparable components and annotate images with respect to surface orientation, weather, and seasonal change. Experiments are based on a data set from 538 cameras across the United States which have collected more than 17 million images over the last 6 months.
Article
Full-text available
We summarize an advanced, thoroughly documented, and quite general purpose discrete ordinate algorithm for time-independent transfer calculations in vertically inhomogeneous, nonisothermal, plane-parallel media. Atmospheric applications ranging from the UV to the radar region of the electromagnetic spectrum are possible. The physical processes included are thermal emission, scattering, absorption, and bidirectional reflection and emission at the lower boundary. The medium may be forced at the top boundary by parallel or diffuse radiation and by internal and boundary thermal sources as well. We provide a brief account of the theoretical basis as well as a discussion of the numerical implementation of the theory. The recent advances made by ourselves and our collaborators-advances in both formulation and numerical solution-are all incorporated in the algorithm. Prominent among these advances is the complete conquest of two ill-conditioning problems which afflicted all previous discrete ordinate implementations: (1) the computation of eigenvalues and eigenvectors and (2) the inversion of the matrix determining the constants of integration. Copies of the FORTRAN program on microcomputer diskettes are available for interested users.
Article
Full-text available
Paintings created by famous artists, representing sunsets throughout the period 1500–1900, provide proxy information on the aerosol optical depth following major volcanic eruptions. This is supported by a statistically significant correlation coefficient (0.8) between the measured red-to-green ratios of 327 paintings and the corresponding values of the dust veil index. A radiative transfer model was used to compile an independent time series of aerosol optical depth at 550 nm corresponding to Northern Hemisphere middle latitudes during the period 1500–1900. The estimated aerosol optical depths range from 0.05 for background aerosol conditions, to about 0.6 following the Tambora and Krakatau eruptions and cover a time period mostly outside of the instrumentation era.
Article
Crowdsensing of air quality is a useful way to improve public awareness and supplement local air quality monitoring data. However, current air quality monitoring approaches are either too sophisticated, costly or bulky to be used effectively by the masses. In this paper, we describe AirTick, a mobile app that can turn any camera-enabled smart mobile device into an air quality sensor, thereby enabling crowdsensing of air quality. AirTick leverages image analytics and deep learning techniques to produce accurate estimates of air quality following the Pollutant Standards Index (PSI). We report the results of initial experimental and empirical evaluations of AirTick. The AirTick tool has been shown to achieve, on average, 87% accuracy in day-time operation and 75% accuracy in night-time operation. Feedback from 100 test users indicates that they perceive AirTick to be highly useful and easy to use. Our results provide a strong positive case for the benefits of applying artificial intelligence techniques for convenient and scalable crowdsensing of air quality.
Conference Paper
Can a large convolutional neural network trained for whole-image classification on ImageNet be coaxed into detecting objects in PASCAL? We show that the answer is yes, and that the resulting system is simple, scalable, and boosts mean average precision, relative to the venerable deformable part model, by more than 40% (achieving a final mAP of 48% on VOC 2007). Our framework combines powerful computer vision techniques for generating bottom-up region proposals with recent advances in learning high-capacity convolutional neural networks. We call the resulting system R-CNN: Regions with CNN features. The same framework is also competitive with state-of-the-art semantic segmentation methods, demonstrating its flexibility. Beyond these results, we execute a battery of experiments that provide insight into what the network learns to represent, revealing a rich hierarchy of discriminative and often semantically meaningful features.
Conference Paper
We evaluate whether features extracted from the activation of a deep convolutional network trained in a fully supervised fashion on a large, fixed set of object recognition tasks can be re-purposed to novel generic tasks. Our generic tasks may differ significantly from the originally trained tasks and there may be insufficient labeled or unlabeled data to conventionally train or adapt a deep architecture to the new tasks. We investigate and visualize the semantic clustering of deep convolutional features with respect to a variety of such tasks, including scene recognition, domain adaptation, and fine-grained recognition challenges. We compare the efficacy of relying on various network levels to define a fixed feature, and report novel results that significantly outperform the state-of-the-art on several important vision challenges. We are releasing DeCAF, an open-source implementation of these deep convolutional activation features, along with all associated network parameters to enable vision researchers to be able to conduct experimentation with deep representations across a range of visual concept learning paradigms.
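The "activations as generic features" recipe translates directly into a few lines with any pretrained backbone; here a torchvision ResNet-18 (an assumption, standing in for the original DeCAF/AlexNet stack) provides fixed features for a downstream classifier.

import torch
import torchvision.models as models

# Pretrained backbone with the classification head removed: the pooled
# activations serve as fixed, off-the-shelf features for new tasks.
# (Downloads pretrained weights on first use; requires torchvision >= 0.13.)
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

with torch.no_grad():
    x = torch.randn(4, 3, 224, 224)   # stand-in for a preprocessed image batch
    feats = backbone(x)               # 4 x 512 feature vectors
# feats can now be fed to an SVM / logistic regression for the target task.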
Conference Paper
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called dropout that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
Article
Convolutional networks are powerful visual models that yield hierarchies of features. We show that convolutional networks by themselves, trained end-to-end, pixels-to-pixels, exceed the state-of-the-art in semantic segmentation. Our key insight is to build "fully convolutional" networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning. We define and detail the space of fully convolutional networks, explain their application to spatially dense prediction tasks, and draw connections to prior models. We adapt contemporary classification networks (AlexNet, the VGG net, and GoogLeNet) into fully convolutional networks and transfer their learned representations by fine-tuning to the segmentation task. We then define a novel architecture that combines semantic information from a deep, coarse layer with appearance information from a shallow, fine layer to produce accurate and detailed segmentations. Our fully convolutional network achieves state-of-the-art segmentation of PASCAL VOC (20% relative improvement to 62.2% mean IU on 2012), NYUDv2, and SIFT Flow, while inference takes one third of a second for a typical image.
Article
In this work, the spatiotemporal variability of surface solar radiation (SSR) is examined over the Eastern Mediterranean region for a 31-year period (1983–2013). The CM SAF SARAH (Satellite Application Facility on Climate Monitoring Solar surfAce RAdiation Heliosat) satellite-based product was found to be homogeneous (based on relative Standard Normal Homogeneity Tests — SNHTs, 95% confidence level) as compared to ground-based observations, and hence appropriate for climatological studies. Specifically, the dataset shows good agreement with monthly observations from five quality assured stations in the region with a mean bias of 7.1 W/m2 or 3.8% and a strong correlation. This high resolution (0.05° × 0.05°) product is capable of revealing various local features. Over land, the SSR levels are highly dependent on the topography, while over the sea, they exhibit a smooth latitudinal variability. SSR varies significantly over the region on a seasonal basis being three times higher in summer (309.6 ± 26.5 W/m2) than in winter (100.2 ± 31.4 W/m2). The CM SAF SARAH product was compared against three satellite-based and one reanalysis products. The satellite-based data from CERES (Cloud and the Earth's Radiant Energy System), GEWEX (Global Energy and Water Cycle Experiment) and ISCCP (International Satellite Cloud Climatology Project) underestimate SSR while the reanalysis data from the ERA-Interim overestimate SSR compared to CM SAF SARAH. Using a radiative transfer model and a set of ancillary data, these biases are attributed to the atmospheric parameters that drive the transmission of solar radiation in the atmosphere, namely, clouds, aerosols and water vapor. It is shown that the bias between CERES and CM SAF SARAH SSR can be explained through the cloud fractional cover and aerosol optical depth biases between these datasets. The CM SAF SARAH SSR trend was found to be positive (brightening) and statistically significant at the 95% confidence level (0.2 ± 0.05 W/m2/year or 0.1 ± 0.02%/year) being almost the same over land and sea. The CM SAF SARAH SSR trends are closer to the ground-based ones than the CERES, GEWEX, ISCCP and ERA-Interim trends. The use of an aerosol climatology for the production of CM SAF SARAH, that neglects the trends of aerosol loads, leads to an underestimation of the SSR trends. It is suggested here, that the inclusion of changes of the aerosol load and composition within CM SAF SARAH would allow for a more accurate reproduction of the SSR trends.
Conference Paper
This paper presents an open platform, which collects multimodal environmental data related to air quality from several sources including official open sources, social media and citizens. Collecting and fusing different sources of air quality data into a unified air quality indicator is a highly challenging problem, leveraging recent advances in image analysis, open hardware, machine learning and data fusion. The collection of data from multiple sources aims at having complementary information, which is expected to result in increased geographical coverage and temporal granularity of air quality data. This diversity of sources constitutes also the main novelty of the platform presented compared with the existing applications.
Article
Convolutional networks are powerful visual models that yield hierarchies of features. We show that convolutional networks by themselves, trained end-to-end, pixels-to-pixels, improve on the previous best result in semantic segmentation. Our key insight is to build "fully convolutional" networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning. We define and detail the space of fully convolutional networks, explain their application to spatially dense prediction tasks, and draw connections to prior models. We adapt contemporary classification networks (AlexNet, the VGG net, and GoogLeNet) into fully convolutional networks and transfer their learned representations by fine-tuning to the segmentation task. We then define a skip architecture that combines semantic information from a deep, coarse layer with appearance information from a shallow, fine layer to produce accurate and detailed segmentations. Our fully convolutional network achieves improved segmentation of PASCAL VOC (30% relative improvement to 67.2% mean IU on 2012), NYUDv2, SIFT Flow, and PASCAL-Context, while inference takes one tenth of a second for a typical image.
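Reduced to its essentials, the fully convolutional recipe is: keep the convolutional backbone, replace the fully connected head with a 1x1 convolution over class channels, and upsample the coarse score map back to input resolution. The PyTorch sketch below shows this conversion with a ResNet-18 backbone (an assumption); the skip connections from shallower layers described in the paper are omitted for brevity.

import torch
import torch.nn as nn
import torchvision.models as models

class TinyFCN(nn.Module):
    """Classifier-to-FCN conversion sketch: backbone features -> 1x1 conv
    class scores -> bilinear upsampling to the input size."""
    def __init__(self, n_classes=21):
        super().__init__()
        resnet = models.resnet18(weights=None)
        self.features = nn.Sequential(*list(resnet.children())[:-2])  # drop avgpool+fc
        self.score = nn.Conv2d(512, n_classes, kernel_size=1)         # per-pixel scores

    def forward(self, x):
        h, w = x.shape[2:]
        s = self.score(self.features(x))                              # coarse score map
        return nn.functional.interpolate(s, size=(h, w),
                                         mode="bilinear", align_corners=False)

# TinyFCN()(torch.randn(1, 3, 224, 224)).shape -> torch.Size([1, 21, 224, 224])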
Article
With the rapid development of the economy in China over the past decade, air pollution has become an increasingly serious problem in major cities and has caused grave public health concerns. Recently, a number of studies have dealt with air quality and air pollution. Among them, some attempt to predict and monitor air quality from different sources of information, ranging from deployed physical sensors to social media. These methods are either too expensive or unreliable, prompting us to search for a novel and effective way to sense air quality. In this study, we propose to employ state-of-the-art computer vision techniques to analyze photos that can be easily acquired from online social media. We then establish the correlation between the haze level computed directly from photos and the official PM2.5 record of the city where, and at the time when, each photo was taken. Our experiments based on both synthetic and real photos have shown the promise of this image-based approach to estimating and monitoring air pollution.
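One standard way to compute a haze level directly from a photo is the dark channel prior (He et al.); the sketch below implements that generic technique as an illustration and is not necessarily the exact haze estimator used in the paper.

import numpy as np

def dark_channel(img, patch=15):
    """img: H x W x 3 float RGB in [0, 1]. Dark channel prior: in haze-free
    regions the per-patch minimum over channels is near zero, so a bright
    dark channel indicates haze."""
    mins = img.min(axis=2)
    h, w = mins.shape
    pad = patch // 2
    padded = np.pad(mins, pad, mode="edge")
    out = np.empty_like(mins)
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out

def haze_level(img):
    return dark_channel(img).mean()   # higher mean dark channel = hazier scene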
Article
This article presents a novel scale- and rotation-invariant detector and descriptor, coined SURF (Speeded-Up Robust Features). SURF approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster. This is achieved by relying on integral images for image convolutions; by building on the strengths of the leading existing detectors and descriptors (specifically, using a Hessian matrix-based measure for the detector, and a distribution-based descriptor); and by simplifying these methods to the essential. This leads to a combination of novel detection, description, and matching steps. The paper encompasses a detailed description of the detector and descriptor and then explores the effects of the most important parameters. We conclude the article with SURF's application to two challenging, yet converse goals: camera calibration as a special case of image registration, and object recognition. Our experiments underline SURF's usefulness in a broad range of topics in computer vision.
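The integral-image trick that makes SURF's box filters cheap is a short precomputation plus constant-time rectangle sums, sketched below.

import numpy as np

def integral_image(gray):
    """Cumulative sum so any axis-aligned box sum costs four lookups."""
    ii = gray.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    return np.pad(ii, ((1, 0), (1, 0)))   # pad so box_sum handles row/col 0

def box_sum(ii, r0, c0, r1, c1):
    """Sum of gray[r0:r1, c0:c1] in O(1) using the padded integral image."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]

# g = np.arange(16).reshape(4, 4); ii = integral_image(g)
# box_sum(ii, 1, 1, 3, 3) == g[1:3, 1:3].sum()  # True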
Conference Paper
We address a central problem of neuroanatomy, namely, the automatic segmentation of neuronal structures depicted in stacks of electron microscopy (EM) images. This is necessary to efficiently map 3D brain structure and connectivity. To segment biological neuron membranes, we use a special type of deep artificial neural network as a pixel classifier. The label of each pixel (membrane or non-membrane) is predicted from raw pixel values in a square window centered on it. The input layer maps each window pixel to a neuron. It is followed by a succession of convolutional and max-pooling layers which preserve 2D information and extract features with increasing levels of abstraction. The output layer produces a calibrated probability for each class. The classifier is trained by plain gradient descent on a 512 × 512 × 30 stack with known ground truth, and tested on a stack of the same size (ground truth unknown to the authors) by the organizers of the ISBI 2012 EM Segmentation Challenge. Even without problem-specific post-processing, our approach outperforms competing techniques by a large margin in all three considered metrics, i.e. rand error, warping error and pixel error. For pixel error, our approach is the only one outperforming a second human observer.
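A minimal PyTorch sketch of the window-centered pixel classifier, assuming an illustrative 65-pixel window; layer sizes are stand-ins, not the paper's exact network.

```python
# Window-centered pixel classification: predict each pixel's label from the
# raw values of a square window around it (sizes are illustrative).
import torch
import torch.nn as nn

WIN = 65  # odd window size so the classified pixel sits at the center

net = nn.Sequential(
    nn.Conv2d(1, 32, 5), nn.ReLU(), nn.MaxPool2d(2),   # conv/max-pool stages
    nn.Conv2d(32, 64, 5), nn.ReLU(), nn.MaxPool2d(2),  # preserve 2D structure
    nn.Flatten(),
    nn.LazyLinear(200), nn.ReLU(),
    nn.Linear(200, 2),             # two classes: membrane vs. non-membrane
)

windows = torch.randn(128, 1, WIN, WIN)    # a batch of per-pixel windows
probs = net(windows).softmax(dim=1)        # per-class probabilities
print(probs.shape)  # -> torch.Size([128, 2])
```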
Conference Paper
We propose a new architecture for difficult image processing operations, such as natural edge detection or thin object segmentation. The architecture is based on a simple combination of convolutional neural networks with nearest neighbor search. We focus our attention on situations where the desired image transformation is too hard for a neural network to learn explicitly. We show that in such situations, applying nearest neighbor search on top of the network output improves the results considerably and compensates for underfitting during neural network training. The approach is validated on three challenging benchmarks, where the performance of the proposed architecture matches or exceeds the state of the art.
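The sketch below illustrates the general idea with scikit-learn: the network's output on a test patch is refined by retrieving the training patches with the most similar outputs and averaging their ground-truth targets. All arrays are synthetic stand-ins.

```python
# Refine a network's output by nearest neighbor search over training outputs.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
train_outputs = rng.normal(size=(1000, 64))   # net outputs on training patches
train_targets = rng.normal(size=(1000, 64))   # matching ground-truth patches

index = NearestNeighbors(n_neighbors=5).fit(train_outputs)

test_output = rng.normal(size=(1, 64))        # possibly underfit net prediction
_, idx = index.kneighbors(test_output)
refined = train_targets[idx[0]].mean(axis=0)  # average of retrieved targets
print(refined.shape)  # -> (64,)
```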
Article
The latest generation of Convolutional Neural Networks (CNNs) has achieved impressive results in challenging benchmarks on image recognition and object detection, significantly raising the interest of the community in these methods. Nevertheless, it is still unclear how different CNN methods compare with each other and with previous state-of-the-art shallow representations such as the Bag-of-Visual-Words and the Improved Fisher Vector. This paper conducts a rigorous evaluation of these new techniques, exploring different deep architectures and comparing them on a common ground, identifying and disclosing important implementation details. We identify several useful properties of CNN-based representations, including the fact that the dimensionality of the CNN output layer can be reduced significantly without adversely affecting performance. We also identify aspects of deep and shallow methods that can be successfully shared. A particularly significant one is data augmentation, which achieves a boost in performance in shallow methods analogous to that observed with CNN-based methods. Finally, we are planning to provide the configurations and code that achieve state-of-the-art performance on the PASCAL VOC classification challenge, along with alternative configurations trading off performance, computation speed and compactness.
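One of the findings, the robustness of CNN features to aggressive dimensionality reduction, can be probed with a few lines of scikit-learn. The sketch below uses PCA plus logistic regression on synthetic stand-in descriptors, so the absolute accuracies are meaningless; only the protocol is illustrated.

```python
# Protocol sketch: compress high-dimensional CNN descriptors and re-measure
# classification accuracy at each dimensionality.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 4096))  # stand-in for fc7-like descriptors
labels = rng.integers(0, 20, size=1000)   # stand-in for 20 object classes

for dim in (4096, 512, 128):
    x = features if dim == 4096 else PCA(n_components=dim).fit_transform(features)
    acc = cross_val_score(LogisticRegression(max_iter=500), x, labels, cv=3).mean()
    print(dim, round(acc, 3))  # real descriptors would show accuracy holding up
```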
Article
Can a large convolutional neural network trained for whole-image classification on ImageNet be coaxed into detecting objects in PASCAL? We show that the answer is yes, and that the resulting system is simple, scalable, and boosts mean average precision, relative to the venerable deformable part model, by more than 40% (achieving a final mAP of 48% on VOC 2007). Our framework combines powerful computer vision techniques for generating bottom-up region proposals with recent advances in learning high-capacity convolutional neural networks. We call the resulting system R-CNN: Regions with CNN features. The same framework is also competitive with state-of-the-art semantic segmentation methods, demonstrating its flexibility. Beyond these results, we execute a battery of experiments that provide insight into what the network learns to represent, revealing a rich hierarchy of discriminative and often semantically meaningful features.
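A hedged sketch of the R-CNN recipe, swapping in OpenCV's selective-search implementation (from opencv-contrib-python) for region proposals and a torchvision AlexNet as the feature extractor; the image path is hypothetical and input normalization is omitted for brevity.

```python
# R-CNN-style pipeline: bottom-up proposals -> fixed-size warp -> CNN features.
import cv2
import torch
from torchvision import models, transforms

img = cv2.imread("street.jpg")                       # placeholder image
ss = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
ss.setBaseImage(img)
ss.switchToSelectiveSearchFast()
boxes = ss.process()[:100]                           # proposals as (x, y, w, h)

cnn = models.alexnet(weights="DEFAULT").eval()       # pretrained backbone
warp = transforms.Compose([transforms.ToPILImage(),
                           transforms.Resize((224, 224)),  # fixed-size warp
                           transforms.ToTensor()])   # (normalization omitted)

with torch.no_grad():
    for (x, y, w, h) in boxes[:5]:
        crop = cv2.cvtColor(img[y:y + h, x:x + w], cv2.COLOR_BGR2RGB)
        feats = cnn.features(warp(crop).unsqueeze(0))
        print(feats.shape)  # per-region features; the paper feeds these to SVMs
```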
Article
We evaluate whether features extracted from the activation of a deep convolutional network trained in a fully supervised fashion on a large, fixed set of object recognition tasks can be re-purposed to novel generic tasks. Our generic tasks may differ significantly from the originally trained tasks and there may be insufficient labeled or unlabeled data to conventionally train or adapt a deep architecture to the new tasks. We investigate and visualize the semantic clustering of deep convolutional features with respect to a variety of such tasks, including scene recognition, domain adaptation, and fine-grained recognition challenges. We compare the efficacy of relying on various network levels to define a fixed feature, and report novel results that significantly outperform the state of the art on several important vision challenges. We are releasing DeCAF, an open-source implementation of these deep convolutional activation features, along with all associated network parameters, to enable vision researchers to conduct experimentation with deep representations across a range of visual concept learning paradigms.
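In the same spirit, here is a minimal sketch of re-purposing frozen CNN activations as fixed features for a new task, using torchvision's AlexNet; the fc6-style activation slice is an assumption about where to tap the network, and the data is synthetic.

```python
# DeCAF-style transfer: freeze an ImageNet CNN, tap an activation as a fixed
# feature, and train only a simple classifier for the new task.
import torch
from sklearn.linear_model import LogisticRegression
from torchvision import models

backbone = models.alexnet(weights="DEFAULT").eval()  # frozen, fully supervised

def fixed_features(batch):               # batch: (N, 3, 224, 224) float tensor
    with torch.no_grad():
        x = backbone.features(batch)
        x = backbone.avgpool(x).flatten(1)
        return backbone.classifier[:3](x).numpy()  # fc6-style 4096-d activation

images = torch.randn(64, 3, 224, 224)        # stand-in for a small new-task set
labels = torch.randint(0, 5, (64,)).numpy()  # e.g. 5 scene categories
clf = LogisticRegression(max_iter=500).fit(fixed_features(images), labels)
print(clf.score(fixed_features(images), labels))  # training accuracy only
```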
Article
SBDART is a software tool that computes plane-parallel radiative transfer in clear and cloudy conditions within the earth's atmosphere and at the surface. All important processes that affect the ultraviolet, visible, and infrared radiation fields are included. The code is a marriage of a sophisticated discrete ordinate radiative transfer module, low-resolution atmospheric transmission models, and Mie scattering results for light scattering by water droplets and ice crystals. The code is well suited for a wide variety of atmospheric radiative energy balance and remote sensing studies. It is designed so that it can be used for case studies as well as sensitivity analysis. For small sets of computations or teaching applications it is available on the World Wide Web with a user-friendly interface. For sensitivity studies requiring many computations it is available by anonymous FTP as a well organized and documented FORTRAN 77 source code.
Article
As part of a wider study into the use of smartphones as solar ultraviolet radiation monitors, this article characterizes the ultraviolet A (UVA; 320-400 nm) response of a consumer complementary metal oxide semiconductor (CMOS)-based smartphone image sensor in a controlled laboratory environment. The CMOS image sensor in the camera possesses inherent sensitivity to UVA, and despite the attenuation due to the lens and the neutral density and wavelength-specific bandpass filters, the measured UVA irradiances, relative to the incident irradiances, range from 0.0065% at 380 nm to 0.0051% at 340 nm. In addition, the sensor demonstrates a predictable response to low-intensity discrete UVA stimuli that can be modelled using the ratio of recorded digital values to the incident UVA irradiance for a given automatic exposure time, resulting in measurement errors that are typically less than 5%. Our results support the idea that smartphones can be used for scientific monitoring of UVA radiation.
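The ratio model described in the abstract amounts to simple arithmetic: divide the recorded digital value by a per-wavelength calibration ratio obtained in the laboratory characterization. The coefficients below are invented placeholders, not the paper's measured values.

```python
# Ratio-model inversion: digital counts / calibration ratio -> irradiance.
# Coefficients are invented placeholders; real values come from the lab setup.
CAL = {340: 1250.0, 380: 1590.0}  # digital counts per (W/m^2), per wavelength

def uva_irradiance(digital_value, wavelength_nm):
    """Estimate incident UVA irradiance (W/m^2) for a fixed auto-exposure."""
    return digital_value / CAL[wavelength_nm]

print(round(uva_irradiance(42.0, 380), 4))  # reading of 42 counts at 380 nm
```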
Conference Paper
Scene categorization is a fundamental problem in computer vision. However, scene understanding research has been constrained by the limited scope of currently used databases, which do not capture the full variety of scene categories. Whereas standard databases for object categorization contain hundreds of different classes of objects, the largest available dataset of scene categories contains only 15 classes. In this paper we propose the extensive Scene UNderstanding (SUN) database, which contains 899 categories and 130,519 images. We use 397 well-sampled categories to evaluate numerous state-of-the-art algorithms for scene recognition and establish new bounds of performance. We measure human scene classification performance on the SUN database and compare it with computational methods. Additionally, we study a finer-grained scene representation to detect scenes embedded inside larger scenes.
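A small sketch of the kind of evaluation such a database supports, computing accuracy averaged per category so that large classes do not dominate; the labels here are random stand-ins for real classifier output.

```python
# Mean per-class accuracy: average the accuracy over categories so that the
# 397 evaluated classes contribute equally regardless of size.
import numpy as np

def mean_per_class_accuracy(y_true, y_pred, num_classes):
    accs = [(y_pred[y_true == c] == c).mean()
            for c in range(num_classes) if (y_true == c).any()]
    return float(np.mean(accs))

rng = np.random.default_rng(0)
y_true = rng.integers(0, 397, 5000)   # ground-truth scene categories
y_pred = rng.integers(0, 397, 5000)   # stand-in classifier predictions
print(mean_per_class_accuracy(y_true, y_pred, 397))  # ~1/397 at chance
```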
Article
This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
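A usage sketch of the matching pipeline with OpenCV's SIFT: ratio-test matching against a second image, then a geometric consistency check. RANSAC homography estimation stands in for the paper's Hough clustering and least-squares pose verification, and the file names are placeholders.

```python
# SIFT matching with a ratio test, then a RANSAC geometric check (standing in
# for the paper's Hough clustering + least-squares pose verification).
import cv2
import numpy as np

img1 = cv2.imread("object.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder files
img2 = cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

good = [m for m, n in cv2.BFMatcher().knnMatch(des1, des2, k=2)
        if m.distance < 0.75 * n.distance]              # distinctiveness test

src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
print(int(inliers.sum()), "geometrically consistent matches")
```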