Preprint

Abstract

Effective shadow removal is pivotal in enhancing the visual quality of images in various applications, ranging from computer vision to digital photography. Over the last decades, both physics-based and machine learning-based methodologies have been proposed; however, most of them have limited capacity to capture complex shadow patterns because of restrictive model assumptions that neglect the fact that shadows usually appear at different scales. In addition, current datasets used for benchmarking shadow removal comprise a limited number of images of simple scenes, containing mainly uniform shadows cast by single objects, and only a few of them include both manual shadow annotations and paired shadow-free images. To address these limitations in the context of natural scene imaging, including urban environments with complex scenes, the contribution of this study is twofold: a) it proposes a novel deep learning architecture, named Soft-Hard Attention U-net (SHAU), focusing on multiscale shadow removal; b) it provides a novel synthetic dataset, named Multiscale Shadow Removal Dataset (MSRD), containing complex shadow patterns of multiple scales, intended to serve as a privacy-preserving dataset for more comprehensive benchmarking of future shadow removal methodologies. Key architectural components of SHAU are the soft and hard attention modules, which, along with multiscale feature extraction blocks, enable effective removal of shadows of different scales and intensities. The results demonstrate the effectiveness of SHAU over the relevant state-of-the-art shadow removal methods across various benchmark datasets, improving the Peak Signal-to-Noise Ratio and Root Mean Square Error for the shadow area by 25.1% and 61.3%, respectively.
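
The two metrics reported for the shadow area can be illustrated with a minimal sketch. It assumes flattened single-channel images with intensities in [0, 255] and a binary shadow mask. The function names (`masked_rmse`, `masked_psnr`) are hypothetical, and the colour space and exact evaluation protocol used by the authors are not stated in the abstract, so this shows only the generic definitions restricted to masked pixels.

```python
import math


def masked_rmse(pred, target, mask):
    """Root Mean Square Error over the pixels selected by the mask.

    pred, target and mask are flat sequences of equal length
    (an assumption made to keep the sketch dependency-free).
    """
    squared_errors = [
        (p - t) ** 2 for p, t, m in zip(pred, target, mask) if m
    ]
    if not squared_errors:
        raise ValueError("mask selects no pixels")
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def masked_psnr(pred, target, mask, max_value=255.0):
    """Peak Signal-to-Noise Ratio (dB) over the pixels selected by the mask."""
    rmse = masked_rmse(pred, target, mask)
    if rmse == 0:
        return math.inf  # identical inside the mask
    return 20.0 * math.log10(max_value / rmse)


# Toy example: four pixels, of which the first and last lie in shadow.
prediction = [10, 20, 30, 40]
ground_truth = [12, 20, 30, 50]
shadow_mask = [1, 0, 0, 1]

print(masked_rmse(prediction, ground_truth, shadow_mask))
print(masked_psnr(prediction, ground_truth, shadow_mask))
```

A lower RMSE and a higher PSNR both indicate a closer match to the shadow-free ground truth inside the mask, which is why improvements in the two metrics move in opposite directions.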

Conference Paper
Shadows often occur when documents are captured with casual equipment, which degrades the visual quality and readability of the digital copies. Unlike algorithms for natural shadow removal, algorithms for document shadow removal need to preserve the details of fonts and figures in high-resolution input. Previous works ignore this problem and remove the shadows via approximate attention and small datasets, which might not work in real-world situations. We handle high-resolution document shadow removal directly via a larger-scale real-world dataset and a carefully designed frequency-aware network. As for the dataset, we acquire over 7k pairs of high-resolution (2462 × 3699) images of real-world documents, with various samples under different lighting conditions, which is 10 times larger than existing datasets. As for the design of the network, we decouple the high-resolution images in the frequency domain, where the low-frequency details and high-frequency boundaries can be effectively learned via the carefully designed network structure. Powered by our network and dataset, the proposed method shows clearly better performance than previous methods in terms of visual quality and numerical results. The code, models, and dataset are available at https://github.com/CXH-Research/DocShadow-SD7K.
Article
Shadow removal is an important problem in computer vision, since the presence of shadows complicates core computer vision tasks, including image segmentation and object recognition. Most state-of-the-art shadow removal methods are based on complex deep learning architectures, which require training on a large amount of data. In this paper, a novel and efficient methodology is proposed, aiming to provide a simple solution to shadow removal in terms of both implementation and computational cost. The proposed methodology is fully unsupervised and based solely on color image features. Initially, the shadow region is automatically extracted by a segmentation algorithm based on Electromagnetic-Like Optimization. Superpixel-based segmentation is then performed, and pairs of shadowed and non-shadowed regions that are nearest neighbors in terms of their color content are identified as parts of the same object. The shadowed part of each pair is relighted by means of histogram matching, using the content of its non-shadowed counterpart. Quantitative and qualitative experiments on well-recognized publicly available benchmark datasets are conducted to evaluate the performance of the proposed methodology in comparison to state-of-the-art methods. The results validate both its efficiency and effectiveness, making evident that solving the shadow removal problem does not necessarily require complex deep learning-based solutions.
Article
Synthetic datasets, for which we propose the term synthsets, are not a novelty but have become a necessity. Although they have been used in computer vision since 1989, helping to solve the problem of collecting a sufficient amount of annotated data for supervised machine learning, intensive development of methods and techniques for their generation belongs to the last decade. Nowadays, the question shifts from whether you should use synthetic datasets to how you should optimally create them. Motivated by the idea of discovering best practices for building synthetic datasets to represent dynamic environments (such as traffic, crowds, and sports), this study provides an overview of existing synthsets in the computer vision domain. We have analyzed the methods and techniques of synthetic datasets generation: from the first low-res generators to the latest generative adversarial training methods, and from the simple techniques for improving realism by adding global noise to those meant for solving domain and distribution gaps. The analysis extracts nine unique but potentially intertwined methods and reveals the synthsets generation diagram, consisting of 17 individual processes that synthset creators should follow and choose from, depending on the specific requirements of their task.
Article
In humans, Attention is a core property of all perceptual and cognitive operations. Given our limited ability to process competing sources, attention mechanisms select, modulate, and focus on the information most relevant to behavior. For decades, concepts and functions of attention have been studied in philosophy, psychology, neuroscience, and computing. For the last 6 years, this property has been widely explored in deep neural networks. Currently, the state-of-the-art in Deep Learning is represented by neural attention models in several application domains. This survey provides a comprehensive overview and analysis of developments in neural attention models. We systematically reviewed hundreds of architectures in the area, identifying and discussing those in which attention has shown a significant impact. We also developed and made public an automated methodology to facilitate the development of reviews in the area. By critically analyzing 650 works, we describe the primary uses of attention in convolutional, recurrent networks, and generative models, identifying common subgroups of uses and applications. Furthermore, we describe the impact of attention in different application domains and their impact on neural networks’ interpretability. Finally, we list possible trends and opportunities for further research, hoping that this review will provide a succinct overview of the main attentional models in the area and guide researchers in developing future approaches that will drive further improvements.
Article
In recent years, many studies have attempted to open the black box of deep neural networks and have proposed various theories to understand it. Among them, information bottleneck (IB) theory claims that training consists of two distinct phases: a fitting phase and a compression phase. This claim has attracted much attention owing to its success in explaining the inner behavior of feedforward neural networks. In this paper, we employ IB theory to understand the dynamic behavior of convolutional neural networks (CNNs) and investigate how fundamental features such as convolutional layer width, kernel size, network depth, pooling layers, and multiple fully connected layers affect the performance of CNNs. In particular, through a series of experimental analyses on the MNIST and Fashion-MNIST benchmarks, we demonstrate that the compression phase is not observed in all these cases. This shows that CNNs have more complicated behavior than feedforward neural networks.
Article
Shadow removal is a fundamental and challenging problem in the image processing field. Current approaches can only process shadows in simple scenes. For complex texture and illumination, their performance is less impressive. In this paper, we propose a novel shadow removal algorithm based on multi-scale image decomposition, which can recover the illumination for complex shadows with inconsistent illumination and different surface materials. Independent of shadow detection, our algorithm only requires a rough boundary distinguishing shadow regions from non-shadow regions. It first performs a multi-scale decomposition of the input image based on an illumination-sensitive smoothing process and then removes shadows in the basic layer using a local-to-global optimization strategy, which fuses all local shadow-free results in a global manner. Finally, we recover the texture details for the shadow-free basic layer and obtain the final shadow-free image. We validate the performance of the proposed method under various lighting and texture conditions and show consistent illumination between shadow and surrounding regions in the shadow removal results.
Article
Shadow is a natural phenomenon observed in most natural images. It can reveal information about an object's shape as well as the illumination direction. In computer vision algorithms, shadow can negatively affect image segmentation results, feature extraction, or object tracking. It is therefore necessary to detect and eliminate shadow. Texture remains the best feature for detecting shadow, and photometric information can be used to eliminate it. However, for an image with a shadow projected on a complex texture, most of the approaches proposed in the literature are ineffective. In this study, we propose an automatic and data-driven approach for shadow detection and elimination based on the Bidimensional Empirical Mode Decomposition (BEMD). The main idea is to decompose the shaded image into intrinsic mode functions (IMFs) that contain only texture and a residue that contains only the objects' shape. Then, shadow detection is performed on the IMFs by matching pairs of segmented regions using texture features, while elimination is carried out via a Gaussian approximation applied only to the residue. Finally, the shadow-free image is obtained by adding all the IMFs and the shadow-free residue. The proposed approach is evaluated in comparison with recent approaches on images with different types of shadow.
Article
In this study, we systematically investigate the impact of class imbalance on classification performance of convolutional neural networks (CNNs) and compare frequently used methods to address the issue. Class imbalance is a common problem that has been comprehensively studied in classical machine learning, yet very limited systematic research is available in the context of deep learning. In our study, we use three benchmark datasets of increasing complexity, MNIST, CIFAR-10 and ImageNet, to investigate the effects of imbalance on classification and perform an extensive comparison of several methods to address the issue: oversampling, undersampling, two-phase training, and thresholding that compensates for prior class probabilities. Our main evaluation metric is area under the receiver operating characteristic curve (ROC AUC) adjusted to multi-class tasks since overall accuracy metric is associated with notable difficulties in the context of imbalanced data. Based on results from our experiments we conclude that (i) the effect of class imbalance on classification performance is detrimental; (ii) the method of addressing class imbalance that emerged as dominant in almost all analyzed scenarios was oversampling; (iii) oversampling should be applied to the level that totally eliminates the imbalance, whereas undersampling can perform better when the imbalance is only removed to some extent; (iv) as opposed to some classical machine learning models, oversampling does not necessarily cause overfitting of CNNs; (v) thresholding should be applied to compensate for prior class probabilities when overall number of properly classified cases is of interest.
Conference Paper
This paper introduces training of shadow detectors under the large-scale dataset paradigm. This was previously impossible due to the high cost of precise shadow annotation. Instead, we advocate the use of quickly but imperfectly labeled images. Our novel label recovery method automatically corrects a portion of the erroneous annotations such that the trained classifiers perform at state-of-the-art level. We apply our method to improve the accuracy of the labels of a new dataset that is 20 times larger than existing datasets and contains a large variety of scenes and image types. Naturally, such a large dataset is appropriate for training deep learning methods. Thus, we propose a semantic-aware patch level Convolutional Neural Network architecture that efficiently trains on patch level shadow examples while incorporating image level semantic information. This means that the detected shadow patches are refined based on image semantics. Our proposed pipeline can be a useful baseline for future advances in shadow detection.
Article
A user-centric method for fast, interactive, robust, and high-quality shadow removal is presented. Our algorithm can perform detection and removal in a range of difficult cases, such as highly textured and colored shadows. To perform detection, an on-the-fly learning approach is adopted guided by two rough user inputs for the pixels of the shadow and the lit area. After detection, shadow removal is performed by registering the penumbra to a normalized frame, which allows us efficient estimation of nonuniform shadow illumination changes, resulting in accurate and robust removal. Another major contribution of this work is the first validated and multiscene category ground truth for shadow removal algorithms. This data set containing 186 images eliminates inconsistencies between shadow and shadow-free images and provides a range of different shadow types such as soft, textured, colored, and broken shadow. Using this data, the most thorough comparison of state-of-the-art shadow removal methods to date is performed, showing our proposed algorithm to outperform the state of the art across several measures and shadow categories. To complement our data set, an online shadow removal benchmark website is also presented to encourage future open comparisons in this challenging field of research.
Article
We present a framework to automatically detect and remove shadows in real world scenes from a single image. Previous works on shadow detection put a lot of effort in designing shadow variant and invariant hand-crafted features. In contrast, our framework automatically learns the most relevant features in a supervised manner using multiple convolutional deep neural networks (ConvNets). The features are learned at the super-pixel level and along the dominant boundaries in the image. The predicted posteriors based on the learned features are fed to a conditional random field model to generate smooth shadow masks. Using the detected shadow masks, we propose a Bayesian formulation to accurately extract shadow matte and subsequently remove shadows. The Bayesian formulation is based on a novel model which accurately models the shadow generation process in the umbra and penumbra regions. The model parameters are efficiently estimated using an iterative optimization procedure. Our proposed framework consistently performed better than the state-of-the-art on all major shadow databases collected under a variety of conditions.
Article
Background subtraction methods are widely exploited for moving object detection in videos in many applications, such as traffic monitoring, human motion capture, and video surveillance. How to correctly and efficiently model and update the background model and how to deal with shadows are two of the most distinguishing and challenging aspects of such approaches. The article proposes a general-purpose method that combines statistical assumptions with the object-level knowledge of moving objects, apparent objects (ghosts), and shadows acquired in the processing of the previous frames. Pixels belonging to moving objects, ghosts, and shadows are processed differently in order to supply an object-based selective update. The proposed approach exploits color information for both background subtraction and shadow detection to improve object segmentation and background update. The approach proves fast, flexible, and precise in terms of both pixel accuracy and reactivity to background changes.
Article
Recent deep learning methods have achieved promising results in image shadow removal. However, most of the existing approaches focus on working locally within shadow and non-shadow regions, resulting in severe artifacts around the shadow boundaries as well as inconsistent illumination between shadow and non-shadow regions. It is still challenging for the deep shadow removal model to exploit the global contextual correlation between shadow and non-shadow regions. In this work, we first propose a Retinex-based shadow model, from which we derive a novel transformer-based network, dubbed ShadowFormer, to exploit non-shadow regions to help shadow region restoration. A multi-scale channel attention framework is employed to hierarchically capture the global information. Based on that, we propose a Shadow-Interaction Module (SIM) with Shadow-Interaction Attention (SIA) in the bottleneck stage to effectively model the context correlation between shadow and non-shadow regions. We conduct extensive experiments on three popular public datasets, including ISTD, ISTD+, and SRD, to evaluate the proposed method. Our method achieves state-of-the-art performance by using up to 150X fewer model parameters.
Article
Shadow removal, which aims to restore the illumination in shadow regions, is challenging due to the diversity of shadows in terms of location, intensity, shape, and size. Different from most multi-task methods, which design elaborate multi-branch or multi-stage structures for better shadow removal, we introduce feature decomposition to learn better feature representations. Specifically, we propose a single-stage and decoupled multi-task network (DMTN) to explicitly learn the decomposed features for shadow removal, shadow matte estimation, and shadow image reconstruction. First, we propose several coarse-to-fine semi-convolution (SMC) modules to capture features sufficient for joint learning of these three tasks. Second, we design a theoretically supported feature decoupling layer to explicitly decouple the learned features into shadow image features and shadow matte features via weight reassignment. Last, these features are converted to a target shadow-free image, affiliated shadow matte, and shadow image, supervised by multi-task joint loss functions. With multi-task collaboration, DMTN effectively recovers the illumination in shadow areas while ensuring the fidelity of non-shadow areas. Experimental results show that DMTN competes favorably with state-of-the-art multi-branch/multi-stage shadow removal methods, while maintaining the simplicity of single-stage methods. We have released our code to encourage future exploration in powerful feature representation for shadow removal: https://github.com/nachifur/DMTN
Article
Shadow extraction is an important and challenging task in remote sensing image analysis because the presence of shadows not only reduces radiation information but also affects the interpretation of remote sensing images. In this article, a clustering feature constraint multiscale attention network for shadow extraction from remote sensing images is proposed. First, in addition to the pixel-level description of the traditional neural network, our method focuses on the clustering relationships between pixel pairs to obtain the pixel group features of shadows. The feature extraction capability of the network is improved with a reweighting mechanism at the pixel level and pixel group features. Second, we employ a feature fusion algorithm by considering contextual information to improve the network’s attention toward shadow areas and enhance the nonlinear expression ability during the encoding and decoding layers. Furthermore, considering the most prominent multiscale features of shadows in remote sensing images, a deep multiscale feature aggregation structure is established to better fit the multiscale feature expression of shadows. Finally, we construct a shadow extraction dataset to verify the proposed approach. We compare our method with the results of state-of-the-art deep learning models. The results show that the intersection over union (IOU) of our method is improved by 0.85%–9.51% and that the F1-score is improved by 0.73–6.48. In addition, the test results for images with different resolutions prove that the proposed approach is more robust than the other methods.
Article
Constructing effective priors is critical to solving ill-posed inverse problems in image processing and computational imaging. Recent works focused on exploiting non-local similarity by grouping similar patches for image modeling, and demonstrated state-of-the-art results in many image restoration applications. However, compared to classic methods based on filtering or sparsity, non-local algorithms are more time-consuming, mainly due to the highly inefficient block matching step, i.e., the distance between every pair of overlapping patches needs to be computed. In this work, we propose a novel Self-Convolution operator to exploit image non-local properties in a unified framework. We prove that the proposed Self-Convolution based formulation can generalize the commonly used non-local modeling methods, as well as produce results equivalent to standard methods, but with much cheaper computation. Furthermore, by applying Self-Convolution, we propose an effective multi-modality image restoration scheme, which is much more efficient than conventional block matching for non-local modeling. Experimental results demonstrate that (1) Self-Convolution with fast Fourier transform implementation can significantly speed up most of the popular non-local image restoration algorithms, with two-fold to nine-fold faster block matching, and (2) the proposed online multi-modality image restoration scheme achieves denoising results superior to those of competing methods in both efficiency and effectiveness on RGB-NIR images. The code for this work is publicly available at https://github.com/GuoLanqing/Self-Convolution.
Article
We propose a novel deep learning method for shadow removal. Inspired by physical models of shadow formation, we use a linear illumination transformation to model the shadow effects in the image that allows the shadow image to be expressed as a combination of the shadow-free image, the shadow parameters, and a matte layer. We use two deep networks, namely SP-Net and M-Net, to predict the shadow parameters and the shadow matte respectively. This system allows us to remove the shadow effects from images. We then employ an inpainting network, I-Net, to further refine the results. We train and test our framework on the most challenging shadow removal dataset (ISTD). Our method improves the state-of-the-art in terms of mean absolute error (MAE) for the shadow area by 20%. Furthermore, this decomposition allows us to formulate a patch-based weakly-supervised shadow removal method. This model can be trained without any shadow-free images (that are cumbersome to acquire) and achieves competitive shadow removal results compared to state-of-the-art methods that are trained with fully paired shadow and shadow-free images. Last, we introduce SBU-Timelapse, a video shadow removal dataset for evaluating shadow removal methods.
Article
Attention has arguably become one of the most important concepts in the deep learning field. It is inspired by the biological systems of humans that tend to focus on the distinctive parts when processing large amounts of information. With the development of deep neural networks, attention mechanism has been widely used in diverse application domains. This paper aims to give an overview of the state-of-the-art attention models proposed in recent years. Toward a better general understanding of attention mechanisms, we define a unified model that is suitable for most attention structures. Each step of the attention mechanism implemented in the model is described in detail. Furthermore, we classify existing attention models according to four criteria: the softness of attention, forms of input feature, input representation, and output representation. Besides, we summarize network architectures used in conjunction with the attention mechanism and describe some typical applications of attention mechanism. Finally, we discuss the interpretability that attention brings to deep learning and present its potential future trends.
Article
Shadow detection in general photos is a nontrivial problem, due to the complexity of the real world. Though recent shadow detectors have already achieved remarkable performance on various benchmark data, their performance is still limited for general real-world situations. In this work, we collected shadow images for multiple scenarios and compiled a new dataset of 10,500 shadow images, each with labeled ground-truth mask, for supporting shadow detection in the complex world. Our dataset covers a rich variety of scene categories, with diverse shadow sizes, locations, contrasts, and types. Further, we comprehensively analyze the complexity of the dataset, present a fast shadow detection network with a detail enhancement module to harvest shadow details, and demonstrate the effectiveness of our method to detect shadows in general situations.
Conference Paper
Modern computer vision algorithms typically require expensive data acquisition and accurate manual labeling. In this work, we instead leverage the recent progress in computer graphics to generate fully labeled, dynamic, and photo-realistic proxy virtual worlds. We propose an efficient real-to-virtual world cloning method, and validate our approach by building and publicly releasing a new video dataset, called Virtual KITTI (see http://www.xrce.xerox.com/Research-Development/Computer-Vision/Proxy-Virtual-Worlds), automatically labeled with accurate ground truth for object detection, tracking, scene and instance segmentation, depth, and optical flow. We provide quantitative experimental evidence suggesting that (i) modern deep learning algorithms pre-trained on real data behave similarly in real and virtual worlds, and (ii) pre-training on virtual data improves performance. As the gap between real and virtual worlds is small, virtual worlds enable measuring the impact of various weather and imaging conditions on recognition performance, all other things being equal. We show these factors may affect drastically otherwise high-performing deep models for tracking.
Article
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.0 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
Conference Paper
We consider image transformation problems, where an input image is transformed into an output image. Recent methods for such problems typically train feed-forward convolutional neural networks using a per-pixel loss between the output and ground-truth images. Parallel work has shown that high-quality images can be generated by defining and optimizing perceptual loss functions based on high-level features extracted from pretrained networks. We combine the benefits of both approaches, and propose the use of perceptual loss functions for training feed-forward networks for image transformation tasks. We show results on image style transfer, where a feed-forward network is trained to solve the optimization problem proposed by Gatys et al. in real-time. Compared to the optimization-based method, our network gives similar qualitative results but is three orders of magnitude faster. We also experiment with single-image super-resolution, where replacing a per-pixel loss with a perceptual loss gives visually pleasing results.
Conference Paper
There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net .
Article
Manipulated images lose believability if the user's edits fail to account for shadows. We propose a method that makes removal and editing of soft shadows easy. Soft shadows are ubiquitous, but remain notoriously difficult to extract and manipulate. We posit that soft shadows can be segmented, and therefore edited, by learning a mapping function for image patches that generates shadow mattes. We validate this premise by removing soft shadows from photographs with only a small amount of user input. Given only broad user brush strokes that indicate the region to be processed, our new supervised regression algorithm automatically unshadows an image, removing the umbra and penumbra. The resulting lit image is frequently perceived as a believable shadow-free version of the scene. We tested the approach on a large set of soft shadow images, and performed a user study that compared our method to the state-of-the-art and to real lit scenes. Our results are more difficult to identify as being altered and are perceived as preferable compared to prior work.
Article
In this paper, we present a novel shadow removal system for single natural images as well as color aerial images using an illumination recovering optimization method. We first adaptively decompose the input image into overlapping patches according to the shadow distribution. Then, by building the correspondence between the shadow patch and the lit patch based on texture similarity, we construct an optimized illumination recovering operator which effectively removes the shadows and recovers the texture detail under the shadow patches. Based on coherent optimization processing among the neighboring patches, we finally produce high-quality shadow-free results with consistent illumination. Our shadow removal system is simple and effective, and can process shadow images with rich texture types and nonuniform shadows. The illumination of the shadow-free results is consistent with that of the surrounding environment. We further present several shadow editing applications to illustrate the versatility of the proposed method.
Article
In this paper, we present a new method for removing shadows from images. First, shadows are detected by interactive brushing assisted with a Gaussian Mixture Model. Secondly, the detected shadows are removed using an adaptive illumination transfer approach that accounts for the reflectance variation of the image texture. The contrast and noise levels of the result are then improved with a multi-scale illumination transfer technique. Finally, any visible shadow boundaries in the image can be eliminated based on our Bayesian framework. We also extend our method to video data and achieve temporally consistent shadow-free results.
Article
In this paper, we address the problem of shadow detection and removal from single images of natural scenes. Unlike traditional methods that explore pixel or edge information, we employ a region-based approach. In addition to considering individual regions separately, we predict relative illumination conditions between segmented regions from their appearances and perform pairwise classification based on such information. Classification results are used to build a graph of segments, and graph-cut is used to solve the labeling of shadow and nonshadow regions. Detection results are later refined by image matting, and the shadow-free image is recovered by relighting each pixel based on our lighting model. We evaluate our method on the shadow detection dataset in Zhu et al. In addition, we created a new dataset with shadow-free ground truth images, which provides a quantitative basis for evaluating shadow removal. We study the effectiveness of features for both unary and pairwise classification.
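The per-pixel relighting step can be sketched with a simple two-light model. The abstract does not spell out its lighting model, so the following is an assumption: a pixel receives ambient light plus a fraction `k` of the direct light, and `r` is the ratio of direct to ambient intensity. Restoring full direct light then scales the pixel by (r + 1) / (k r + 1). The names `relight`, `k`, and `r` are illustrative.

```python
def relight(pixel, k, r):
    """Relight one RGB pixel under an assumed direct-plus-ambient lighting model.

    pixel: (R, G, B) intensities of the observed pixel
    k:     fraction of direct light reaching the pixel (0 = fully shadowed, 1 = fully lit)
    r:     ratio of direct to ambient light intensity (assumed known or estimated)
    """
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must lie in [0, 1]")
    if r < 0:
        raise ValueError("r must be non-negative")
    # Observed = (k * direct + ambient) * reflectance;
    # target   = (direct + ambient) * reflectance.
    gain = (r + 1.0) / (k * r + 1.0)
    return tuple(channel * gain for channel in pixel)


umbra_pixel = relight((10, 20, 30), k=0.0, r=3.0)  # fully shadowed pixel
lit_pixel = relight((10, 20, 30), k=1.0, r=3.0)    # already lit, left unchanged
```

Intermediate values of `k` correspond to penumbra pixels, which is where a soft matte rather than a binary mask matters.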
Conference Paper
In this paper, we address the problem of shadow detection and removal from single images of natural scenes. Unlike traditional methods that explore pixel or edge information, we employ a region-based approach. In addition to considering individual regions separately, we predict relative illumination conditions between segmented regions from their appearances and perform pairwise classification based on such information. Classification results are used to build a graph of segments, and graph-cut is used to solve the labeling of shadow and non-shadow regions. Detection results are later refined by image matting, and the shadow-free image is recovered by relighting each pixel based on our lighting model. We evaluate our method on the shadow detection dataset. In addition, we created a new dataset with shadow-free ground truth images, which provides a quantitative basis for evaluating shadow removal.
Conference Paper
We propose a new method for estimating the intrinsic dimension of a dataset, derived by applying the principle of maximum likelihood to the distances between close neighbors. We derive the estimator by a Poisson process approximation, assess its bias and variance theoretically and by simulations, and apply it to a number of simulated and real datasets. We also show it has the best overall performance compared with two other intrinsic dimension estimators.
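A minimal sketch of a nearest-neighbor maximum-likelihood dimension estimate follows. It assumes the commonly cited form of the local estimator, the inverse of the mean of log(T_k / T_j) over the first k - 1 neighbor distances, averaged over all points. The function name, the choice k = 10, and the synthetic test data are assumptions of this example.

```python
import math
import random


def mle_intrinsic_dimension(points, k=10):
    """Average a local maximum-likelihood dimension estimate over all points.

    Assumes no duplicate points (a zero distance would break the log-ratio).
    """
    if not 3 <= k < len(points):
        raise ValueError("need 3 <= k < number of points")
    estimates = []
    for i, p in enumerate(points):
        # Distances from p to every other point; keep the k smallest.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)[:k]
        t_k = dists[-1]
        # Inverse of the mean log-ratio T_k / T_j for j = 1 .. k-1.
        log_ratio_sum = sum(math.log(t_k / t_j) for t_j in dists[:-1])
        estimates.append((k - 1) / log_ratio_sum)
    return sum(estimates) / len(estimates)


# Synthetic check: a 2D plane embedded in 3D should give an estimate near 2.
rng = random.Random(0)
plane = []
for _ in range(400):
    u, v = rng.random(), rng.random()
    plane.append((u, v, 0.5 * u - 0.25 * v))
estimate = mle_intrinsic_dimension(plane, k=10)
```

With the k - 1 normalization used here the estimate is biased slightly upward for small k; dividing by k - 2 instead is a common correction.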
Article
This paper is concerned with the derivation of a progression of shadow-free image representations. First, we show that adopting certain assumptions about lights and cameras leads to a 1D, gray-scale image representation which is illuminant invariant at each image pixel. We show that as a consequence, images represented in this form are shadow-free. We then extend this 1D representation to an equivalent 2D, chromaticity representation. We show that in this 2D representation, it is possible to relight all the image pixels in the same way, effectively deriving a 2D image representation which is additionally shadow-free. Finally, we show how to recover a 3D, full color shadow-free image representation by first (with the help of the 2D representation) identifying shadow edges. We then remove shadow edges from the edge-map of the original image by edge in-painting and we propose a method to reintegrate this thresholded edge map, thus deriving the sought-after 3D shadow-free image.
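The 1D illuminant-invariant representation can be sketched as a projection in log-chromaticity space. The sketch assumes that, under the lighting and camera assumptions the abstract mentions, a change of illuminant moves a pixel's log-chromaticity along one fixed direction, so projecting onto the orthogonal axis cancels it. The angle `theta` of that axis is camera dependent and is treated here as a given input; all function names are hypothetical.

```python
import math


def log_chromaticity(rgb):
    """Band-ratio log-chromaticity (log R/G, log B/G); intensity cancels out."""
    r, g, b = rgb
    return math.log(r / g), math.log(b / g)


def invariant_gray(rgb, theta):
    """Project log-chromaticity onto the axis at angle theta (assumed invariant axis)."""
    x, y = log_chromaticity(rgb)
    return x * math.cos(theta) + y * math.sin(theta)


def shift_illuminant(rgb, theta, amount, intensity=1.0):
    """Simulate a lighting change: move along the direction orthogonal to the
    invariant axis and rescale overall brightness (toy model for this sketch)."""
    x, y = log_chromaticity(rgb)
    x -= amount * math.sin(theta)
    y += amount * math.cos(theta)
    g = rgb[1] * intensity
    return (g * math.exp(x), g, g * math.exp(y))


theta = math.radians(40.0)  # hypothetical calibration value
lit = (120.0, 100.0, 80.0)
shadowed = shift_illuminant(lit, theta, amount=0.3, intensity=0.4)
gray_lit = invariant_gray(lit, theta)
gray_shadowed = invariant_gray(shadowed, theta)
```

In this toy model the lit and shadowed versions of the same surface map to the same gray value, which is the sense in which the 1D representation is shadow-free.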
Article
A method is developed for representing any communication system geometrically. Messages and the corresponding signals are points in two "function spaces," and the modulation process is a mapping of one space into the other. Using this representation, a number of results in communication theory are deduced concerning expansion and compression of bandwidth and the threshold effect. Formulas are found for the maximum rate of transmission of binary digits over a system when the signal is perturbed by various types of noise. Some of the properties of "ideal" systems which transmit at this maximum rate are discussed. The equivalent number of binary digits per second for certain information sources is calculated.
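For the white Gaussian noise case, the maximum rate is the well-known capacity formula C = W log2(1 + P/N), with bandwidth W in hertz and P/N the signal-to-noise power ratio. The short sketch below evaluates it; the function name and argument names are illustrative, and other noise types mentioned in the abstract are not covered.

```python
import math


def awgn_capacity(bandwidth_hz, signal_power, noise_power):
    """Capacity in bits per second of a band-limited channel with white Gaussian noise."""
    if bandwidth_hz <= 0 or noise_power <= 0 or signal_power < 0:
        raise ValueError("bandwidth and noise power must be positive, signal power non-negative")
    return bandwidth_hz * math.log2(1.0 + signal_power / noise_power)


# Equal signal and noise power: one bit per second per hertz of bandwidth.
unit_snr_capacity = awgn_capacity(1000.0, 1.0, 1.0)
```

Doubling the bandwidth doubles the capacity at a fixed signal-to-noise ratio, whereas increasing the signal power helps only logarithmically.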