Silhouette and stereo fusion for 3D object modeling

Signal and Image Processing Department, CNRS UMR 5141, Ecole Nationale Supérieure des Télécommunications, France
Computer Vision and Image Understanding (Impact Factor: 1.54). 01/2003; 96(3):367-392. DOI: 10.1016/j.cviu.2004.03.016
Source: DBLP


In this paper, we present a new approach to high quality 3D object reconstruction. Starting from a calibrated sequence of color images, the algorithm is able to reconstruct both the 3D geometry and the texture. The core of the method is based on a deformable model, which defines the framework where texture and silhouette information can be fused. This is achieved by defining two external forces based on the images: a texture driven force and a silhouette driven force. The texture force is computed in two steps: a multi-stereo correlation voting approach and a gradient vector flow diffusion. Due to the high resolution of the voting approach, a multi-grid version of the gradient vector flow has been developed. Concerning the silhouette force, a new formulation of the silhouette constraint is derived. It provides a robust way to integrate the silhouettes in the evolution algorithm. As a consequence, we are able to recover the contour generators of the model at the end of the iteration process. Finally, a texture map is computed from the original images for the reconstructed 3D model.
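The gradient vector flow diffusion mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract describes a multi-grid version applied to the 3D output of the correlation voting step, whereas the code below is the classical single-grid, 2D explicit iteration, shown only to convey how a sparse edge-based force is spread into regions where the raw gradient is zero. The function names and the parameters `mu`, `iterations` and `dt` are assumptions chosen for illustration.

```python
import numpy as np

def laplacian(a):
    # 5-point Laplacian with replicated borders (zero normal derivative).
    p = np.pad(a, 1, mode="edge")
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * a

def gradient_vector_flow(f, mu=0.2, iterations=200, dt=0.5):
    # f: 2D scalar map (e.g. an edge or vote strength map), assumed in [0, 1].
    # mu, iterations, dt are illustrative values, not taken from the paper.
    fy, fx = np.gradient(f.astype(float))   # derivatives along rows, columns
    mag2 = fx**2 + fy**2                     # weight of the data term
    u, v = fx.copy(), fy.copy()              # start from the raw gradient
    for _ in range(iterations):
        # Smooth the field where the gradient is weak, and keep it close
        # to the raw gradient where the gradient is strong.
        u = u + dt * (mu * laplacian(u) - mag2 * (u - fx))
        v = v + dt * (mu * laplacian(v) - mag2 * (v - fy))
    return u, v

# Usage: a single bright point. The raw gradient is non-zero only next to
# the point, while the diffused field points toward it from further away.
f = np.zeros((21, 21))
f[10, 10] = 1.0
u, v = gradient_vector_flow(f)
```

With a unit grid spacing, the explicit update is stable for these values because dt * (8 * mu + max(mag2)) stays below 2. A multi-grid scheme such as the one the abstract refers to would be a way to reach the same kind of diffused field faster on a high resolution volume.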

Available from:
  • Source
    • "One research direction in this area makes the assumption that the images are taken from known viewpoints. This is used in the multi-view stereo approach, e.g., [43], [28] or [20]. As presented in a survey [50] comparing multi-view stereo methods, the camera is usually positioned at known viewpoints by a robotic arm, which qualifies as costly infrastructure for object modeling. "
    ABSTRACT: An approach for generating textured 3D models of objects without the need for complex infrastructure such as turn-tables or high-end sensors on precisely controlled rails is presented. The method is inexpensive as it uses only a low-cost RGBD sensor, e.g., Microsoft Kinect or ASUS Xtion, and Augmented Reality (AR) markers printed on paper sheets. The sensor can be moved by hand by an untrained person and the AR-markers can be arbitrarily placed in the scene, thus allowing the modeling of objects of a large range of sizes. Due to the use of the simple AR markers, the method is significantly more robust than just using the RGBD sensor or a monocular camera alone and it hence avoids the typical need for manual post-processing of alternative approaches like Kinect-Fusion, 123D Catch, Photosynth, or similar. This article has two main contributions: First, the development of a simple, inexpensive method for the quick and easy digitization of physical objects is presented. Second, the development of an uncertainty model for AR-marker pose estimation is introduced. The latter is of interest beyond the object modeling application presented here. The uncertainty model is used in a graph-based relaxation method to improve model-consistency. Realistic modeling of various objects, such as parcels, sport balls, coffee sacks, human dolls, etc., is experimentally demonstrated. Good model-accuracy is shown for several ground-truth objects with simple geometries and known dimensions. Furthermore, it is shown that the models obtained using the uncertainty model have fewer errors than the ones obtained without it.
    Full-text · Article · Jan 2015 · Robotics and Autonomous Systems
  • Source
    • "This is a functional that penalizes solutions that do not respect prior assumptions, and plays a key role both in the quality of the reconstruction, as well as in the efficiency of the numerical optimization scheme. The most principled approaches to 3-d reconstruction aim to infer a collection of (multiply-connected, piecewise smooth) surfaces directly, represented intrinsically without regards to the images [2] [10] [18] [28] [38] [21] [42], as evident by the large body of literature on shape space and shape optimization. In these methods, both the geometry and the topology is then inferred to fit the available images. "

    Full-text · Conference Paper · Jan 2015
  • Source
    • "In shape-from-silhouettes, a set of silhouettes extracted from images is used to model the 3D scene by generating the convex hull produced by a union of projection cones [1] [2]. An energy function used both texture and silhouettes for guiding a deformable model in [10] for single 3D object representation. The methodology described in this paper aims to robustly enforce the consistency of scenes with multiple objects with their corresponding contours segmented from images. "
    ABSTRACT: This paper proposes enforcing the consistency with segmented contours when modelling scenes with multiple objects from multi-view images. A certain rough initialization of the 3D scene is assumed to be available and in the case of multiple objects inconsistencies are expected. In the proposed shape-from-contours approach images are segmented and back-projections of segmented contours are used for enforcing the consistency of the segmented contours with 3D objects from the scene. We provide a study for the physical requirements for detecting occlusions when reconstructing 3-D scenes with multiple objects.
    Full-text · Conference Paper · Oct 2014