ABSTRACT: We prove a closed-form solution to tensor voting (CFTV): Given a point set in any dimension, our closed-form solution provides an exact, continuous, and efficient algorithm for computing a structure-aware tensor that simultaneously achieves salient structure detection and outlier attenuation. Using CFTV, we prove the convergence of tensor voting on a Markov random field (MRF), termed MRFTV, where the structure-aware tensor at each input site reaches a stationary state upon convergence in structure propagation. We then embed the structure-aware tensor into expectation maximization (EM) for optimizing a single linear structure to achieve efficient and robust parameter estimation. Specifically, our EMTV algorithm optimizes both the tensor and fitting parameters and does not require random sampling consensus typically used in existing robust statistical techniques. We performed quantitative evaluation on its accuracy and robustness, showing that EMTV performs better than the original TV and other state-of-the-art techniques in fundamental matrix estimation for multiview stereo matching. The extensions of CFTV and EMTV for extracting multiple and nonlinear structures are underway.
ABSTRACT: Reconstructing transparent objects is a challenging problem. While producing reasonable results for quite complex objects, existing approaches require custom calibration or somewhat expensive labor to achieve high precision. On the other hand, when an overall shape preserving salient and fine details is sufficient, we show in this paper a significant step toward solving the problem on a shoestring budget, by using only a video camera, a moving spotlight, and a small chrome sphere. Specifically, the problem we address is to estimate the normal map of the exterior surface of a given solid transparent object, from which the surface depth can be integrated. Our technical contribution lies in relating this normal reconstruction problem to one of graph-cut segmentation. Unlike conventional formulations, however, our graph is dual-layered, since we can see a transparent object's foreground as well as the background behind it. Quantitative and qualitative evaluations are performed to verify the efficacy of this practical solution.
ABSTRACT: Representative surface reconstruction algorithms taking a gradient field as input enforce the integrability constraint in a discrete manner. While enforcing integrability allows the subsequent integration to produce surface heights, existing algorithms have one or more of the following disadvantages: They can only handle dense per-pixel gradient fields, smooth out sharp features in a partially integrable field, or produce severe surface distortion in the results. In this paper, we present a method which does not enforce discrete integrability and reconstructs a 3D continuous surface from a gradient or a height field, or a combination of both, which can be dense or sparse. The key to our approach is the use of kernel basis functions, which transfer the continuous surface reconstruction problem into high-dimensional space, where a closed-form solution exists. By using the Gaussian kernel, we can derive a straightforward implementation which is able to produce results better than traditional techniques. In general, an important advantage of our kernel-based method is that it does not suffer from discretization and finite approximation, both of which lead to surface distortion and are typical of the Fourier or wavelet bases widely adopted by previous representative approaches. We perform comparisons with classical and recent methods on benchmark as well as challenging data sets to demonstrate that our method produces accurate surface reconstruction that preserves salient and sharp features. The source code and executable of the system are available for downloading.
IEEE Transactions on Pattern Analysis and Machine Intelligence 12/2010; · 4.80 Impact Factor
ABSTRACT: This paper presents a robust and automatic approach to photometric stereo, where the two main components, namely surface normals and visible surfaces, are respectively optimized by expectation maximization (EM). A dense set of input images is conveniently captured using a digital video camera while a handheld spotlight is being moved around the target object and a small mirror sphere. In our approach, the inherently complex optimization problem is simplified into a two-step optimization, where EM is employed in each step: 1) Using the dense input, the weight or importance of each observation is alternately optimized with the normal and albedo at each pixel and 2) using the optimized normals and employing the Markov random fields (MRFs), surface integrabilities and discontinuities are alternately optimized in visible surface reconstruction. Our mathematical derivation gives simple updating rules for the EM algorithms, leading to a stable, practical, and parameter-free implementation that is very robust even in the presence of complex geometry, shadows, highlight, and transparency. We present high-quality results on normal and visible surface reconstruction, where fine geometric details are automatically recovered by our method.
IEEE Transactions on Pattern Analysis and Machine Intelligence 04/2010; · 4.80 Impact Factor
ABSTRACT: We propose tensor-based multiview stereo (TMVS) for quasi-dense 3D reconstruction from uncalibrated images. Our work is inspired by the patch-based multiview stereo (PMVS), a state-of-the-art technique in multiview stereo reconstruction. The effectiveness of PMVS is attributed to the use of 3D patches in the match-propagate-filter MVS pipeline. Our key observation is that PMVS has not fully utilized the valuable 3D geometric cue available in 3D patches, which are oriented points. This paper combines the complementary advantages of photoconsistency, visibility and geometric consistency enforcement in MVS via the use of 3D tensors, where our closed-form solution to tensor voting provides a unified approach to implement the match-propagate-filter pipeline. Using PMVS as the implementation backbone on which TMVS is built, we provide qualitative and quantitative evaluation to demonstrate how TMVS significantly improves the MVS pipeline.
The Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2010, San Francisco, CA, USA, 13-18 June 2010; 01/2010
ABSTRACT: This article introduces an optimization approach for modeling and rendering impossible figures. Our solution is inspired by how modeling artists construct physical 3D models to produce a valid 2D view of an impossible figure. Given a set of 3D locally possible parts of the figure, our algorithm automatically optimizes a view-dependent 3D model, subject to the necessary 3D constraints for rendering the impossible figure at the desired novel viewpoint. A linear and constrained least-squares solution to the optimization problem is derived, thereby allowing efficient computation and rendering of new views of impossible figures at interactive rates. Once the optimized model is available, a variety of compelling rendering effects can be applied to the impossible figure.
ACM Transactions on Graphics 01/2010; 29(2):13. · 3.36 Impact Factor
ABSTRACT: We present an interactive system for reconstructing surface normals from a single image. Our approach has two complementary contributions. First, we introduce a novel shape-from-shading (SfS) algorithm that produces faithful normal reconstruction for local image regions (high-frequency component) but fails to faithfully recover the overall global structure (low-frequency component). Our second contribution consists of an approach that corrects low-frequency error using a simple markup procedure. This approach, aptly called rotation palette, allows the user to specify large-scale corrections of surface normals by drawing simple stroke correspondences between the normal map and a sphere image which represents rotation directions. Combining these two approaches, we can produce high-quality surfaces quickly from single images.
ABSTRACT: Layer decomposition from a single image is an under-constrained problem, because there are more unknowns than equations. This paper studies a slightly easier but very useful alternative where only the background layer has substantial image gradients and structures. We propose to solve this useful alternative by an expectation-maximization (EM) algorithm that employs the hidden Markov model (HMM), which maintains spatial coherency of smooth and overlapping layers, and helps to preserve image details of the textured background layer. We demonstrate that, using a small amount of user input, various seemingly unrelated problems in computational photography can be effectively addressed by solving this alternative using our EM-HMM algorithm.
2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 24-26 June 2008, Anchorage, Alaska, USA; 01/2008
ABSTRACT: Surface gradients are useful to surface reconstruction in single view modeling, shape-from-shading, and photometric stereo. Previous algorithms minimize a complex, nonlinear energy functional, or require dense surface gradients to perform integration to generate 3D locations, or require user-input heights to constrain the solution space, or produce severe distortion and smooth out surface details. Most single-view algorithms output a Monge patch (height-field), which may introduce further surface distortion along object silhouettes and surface orientation discontinuities. Our proposed algorithm operates on a single view of complete or incomplete data. The data can be gradients without 3D locations, or 3D locations without gradients. The output surface, which is not necessarily a height-field, preserves salient depth and orientation discontinuities. Experimental comparisons on both simple and complex data show that our method produces better surfaces with significantly less distortion and more details preserved. The implementation of our closed-form solution is very straightforward.
IEEE 11th International Conference on Computer Vision, ICCV 2007, Rio de Janeiro, Brazil, October 14-20, 2007; 01/2007
ABSTRACT: This paper addresses the problem of natural shadow matting: the removal or extraction of natural shadows from a single image. Because textures are maintained in the shadowless image after the extraction process, our approach produces some of the best results to date among shadow removal techniques. Using the image formation equation typical of computer vision, we advocate a new model for shadow formation where shadow effect is understood as light attenuation instead of a mixture of two colors governed by the conventional matting equation. This leads to a new shadow equation with fewer unknowns to solve, where a three-channel shadow matte and a shadowless image are considered in our optimization. Our problem is formulated as one of energy minimization guided by user-supplied hints in the form of a quadmap which can be specified easily by the user. This formulation allows for robust shadow matte extraction while maintaining texture in the shadowed region by considering color transfer, texture gradient, and shadow smoothness. We demonstrate the usefulness of our approach in shadow removal, image matting and compositing.
ABSTRACT: We present a simple interactive approach to specify 3D shape in a single view using "shape palettes". The interaction is as follows: draw a simple 2D primitive in the 2D view and then specify its 3D orientation by drawing a corresponding primitive on a shape palette. The shape palette is presented as an image of some familiar shape whose local 3D orientation is readily understood and can be easily marked over. The 3D orientation from the shape palette is transferred to the 2D primitive based on the markup. As we will demonstrate, only sparse markup is needed to generate expressive and detailed 3D surfaces. This markup approach can be used to model freehand 3D surfaces drawn in a single view, or combined with image-snapping tools to quickly extract surfaces from images and photographs.
ABSTRACT: We address the problem of robust normal reconstruction by dense photometric stereo, in the presence of complex geometry, shadows, highlight, transparencies, variable attenuation in light intensities, and inaccurate estimation in light directions. The input is a dense set of noisy photometric images, conveniently captured by using a very simple set-up consisting of a digital video camera, a reflective mirror sphere, and a handheld spotlight. We formulate the dense photometric stereo problem as a Markov network and investigate two important inference algorithms for Markov Random Fields (MRFs), graph cuts and belief propagation, to optimize for the most likely setting for each node in the network. In the graph cut algorithm, the MRF formulation is translated into one of energy minimization. A discontinuity-preserving metric is introduced as the compatibility function, which allows alpha-expansion to efficiently perform the maximum a posteriori (MAP) estimation. Using the identical dense input and the same MRF formulation, our tensor belief propagation algorithm recovers faithful normal directions, preserves underlying discontinuities, improves the normal estimation from one of discrete to continuous, and drastically reduces the storage requirement and running time. Both algorithms produce comparable and very faithful normals for complex scenes. Although the discontinuity-preserving metric in graph cuts permits efficient inference of optimal discrete labels with a theoretical guarantee, our estimation algorithm using tensor belief propagation converges to comparable results, but runs faster because very compact messages are passed and combined. We present very encouraging results on normal reconstruction. A simple algorithm is proposed to reconstruct a surface from a normal map recovered by our method. With the reconstructed surface, an inverse process, known as relighting in computer graphics, is proposed to synthesize novel images of the given scene under user-specified light source and direction. The synthesis is made to run in real time by exploiting the state-of-the-art graphics processing unit (GPU). Our method offers many unique advantages over previous relighting methods and can handle a wide range of novel light sources and directions.
IEEE Transactions on Pattern Analysis and Machine Intelligence 12/2006; 28(11):1830-46. · 4.80 Impact Factor
ABSTRACT: This paper presents a complete system capable of synthesizing a large number of pixels that are missing due to occlusion or damage in an uncalibrated input video. These missing pixels may correspond to the static background or cyclic motions of the captured scene. Our system employs user-assisted video layer segmentation, while the main processing in video repair is fully automatic. The input video is first decomposed into the color and illumination videos. The necessary temporal consistency is maintained by tensor voting in the spatio-temporal domain. Missing colors and illumination of the background are synthesized by applying image repairing. Finally, the occluded motions are inferred by spatio-temporal alignment of collected samples at multiple scales. We tested our system on some difficult examples with variable illumination, where the capturing camera can be stationary or in motion.
IEEE Transactions on Pattern Analysis and Machine Intelligence 06/2006; 28(5):832-9. · 4.80 Impact Factor
ABSTRACT: Given a dense set of imperfect normals obtained by photometric stereo or shape from shading, this paper presents an optimization algorithm which alternately optimizes until convergence the surface integrabilities and discontinuities inherent in the normal field, in order to derive a segmented surface description of the visible scene without noticeable distortion. In our Expectation-Maximization (EM) framework, we enforce discontinuity-preserving integrability so that fine details are preserved within each output segment while the occlusion boundaries are localized as sharp surface discontinuities. Using the resulting weighted discontinuity map, the estimation of a discontinuity-preserving height field can be formulated into a convex optimization problem. We compare our method and present convincing results on synthetic and real data.
Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on; 02/2006
ABSTRACT: We formulate a robust method using Expectation Maximization (EM) to address the problem of dense photometric stereo. Previous approaches using Markov Random Fields (MRF) utilized a dense set of noisy photometric images for estimating an initial normal to encode the matching cost at each pixel, followed by normal refinement by considering the neighborhood of the pixel. In this paper, we argue that they had not fully utilized the inherent data redundancy in the dense set and that its full exploitation leads to considerable improvement. Using the same noisy and dense input, this paper contributes by learning relevant observations, recovering accurate normals and very good surface albedos, and inferring optimal parameters in a unifying EM framework that converges to an optimal solution and has no free user-supplied parameter to set. Experiments show that our EM approach for dense photometric stereo outperforms the previous approaches using the same input.
Computer Vision - ECCV 2006, 9th European Conference on Computer Vision, Graz, Austria, May 7-13, 2006, Proceedings, Part IV; 01/2006
ABSTRACT: This paper addresses the problem of shadow extraction from a single image of a complex natural scene. No simplifying assumption on the camera and the light source other than the Lambertian assumption is used. Our method is unique because it is capable of translating very rough user-supplied hints into the effective likelihood and prior functions for our Bayesian optimization. The likelihood function requires a decent estimation of the shadowless image, which is obtained by solving the associated Poisson equation. Our Bayesian framework allows for the optimal extraction of smooth shadows while preserving texture appearance under the extracted shadow. Thus our technique can be applied to shadow removal, producing some of the best results to date compared with the current state-of-the-art techniques using a single input image. We propose related applications in shadow compositing and image repair using our Bayesian technique.
ABSTRACT: We present a surprisingly simple system that performs robust normal reconstruction by dense photometric stereo, in the presence of large shadows, highlight, transparencies, complex geometry, variable attenuation in light intensity and inaccurate light directions. Our system consists of a mirror sphere, a spotlight and a DV camera only. Using this, we infer a dense set of unbiased but noisy photometric data uniformly distributed on the light direction sphere. We use this dense set to derive a very robust matching cost for our MRF photometric stereo model, where the maximum a posteriori (MAP) solution is estimated. To aggregate support for candidate normals in the normal refinement process, we introduce a compatibility function that is translated into a discontinuity-preserving metric, thus speeding up the MAP estimation by energy minimization using graph cut. No reference object of similar material is used. We perform a detailed comparison of our approach with conventional convex minimization. We show very good normals estimated from very noisy data on a wide range of difficult objects to show the robustness and usefulness of our method.
Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on; 07/2005
ABSTRACT: While subsurface scattering is common in many real objects, almost all separation algorithms focus on extracting specular and diffuse components from real images. In this paper, we propose an appearance-based approach to separate non-directional subsurface scattering reflectance from photometric images, in addition to the separation of the off-specular and non-Lambertian diffuse components. Our mathematical model sufficiently accounts for the photometric response due to non-directional subsurface scattering, and allows for a practical image acquisition system to capture its contribution. Relighting the scene is possible by employing the separated reflectances. We argue that it is sometimes necessary to separate the subsurface scattering component, which is essential to highlight removal, when the object reflectance cannot be modeled by specular and diffuse components alone.
Computer Vision - ECCV 2004, 8th European Conference on Computer Vision, Prague, Czech Republic, May 11-14, 2004. Proceedings, Part II; 01/2004