Spacetime stereo: a unifying framework for depth from triangulation

Honda Research Institute, Mountain View, CA 94041, USA.
IEEE Transactions on Pattern Analysis and Machine Intelligence (Impact Factor: 5.69). 03/2005; 27(2):296-302. DOI: 10.1109/TPAMI.2005.37
Source: PubMed

ABSTRACT: Depth from triangulation has traditionally been investigated in a number of independent threads of research, with methods such as stereo, laser scanning, and coded structured light considered separately. In this paper, we propose a common framework called spacetime stereo that unifies and generalizes many of these previous methods. To show the practical utility of the framework, we develop two new algorithms for depth estimation: depth from unstructured illumination change and depth estimation in dynamic scenes. Based on our analysis, we show that methods derived from the spacetime stereo framework can be used to recover depth in situations in which existing methods perform poorly.
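The central generalization — replacing the purely spatial matching window of classical stereo with a spatiotemporal one — can be sketched in a few lines. This is a minimal illustration with hypothetical names, assuming rectified image stacks and windows that stay inside the image; it is not the paper's exact formulation:

```python
import numpy as np

def spacetime_ssd(left, right, x, y, t, disparities, wx=2, wy=2, wt=1):
    """Pick the disparity minimizing SSD over a spatiotemporal window.

    left, right: image stacks of shape (T, H, W), assumed rectified.
    Classical stereo corresponds to wt=0 (a purely spatial window);
    spacetime stereo extends the window over wt frames on each side,
    which disambiguates matches when illumination varies over time.
    """
    costs = []
    for d in disparities:
        a = left[t-wt:t+wt+1, y-wy:y+wy+1, x-wx:x+wx+1]
        b = right[t-wt:t+wt+1, y-wy:y+wy+1, x-wx-d:x+wx+1-d]
        costs.append(np.sum((a.astype(float) - b.astype(float)) ** 2))
    return disparities[int(np.argmin(costs))]
```

With temporally varying (even unstructured) illumination, the temporal extent of the window adds matching constraints that a single-frame window lacks.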

  • Source
    ABSTRACT: This paper presents a novel approach for matching 2-D points between a video projector and a digital camera. Our method is motivated by camera–projector applications in which the projected image must be warped to prevent geometric distortion. Since the warping process often needs geometric information on the 3-D scene obtained from triangulation, we propose a technique for matching points in the projector to points in the camera based on arbitrary video sequences. The novelty of our method lies in the fact that it does not require pre-designed structured light patterns, as is usually the case. The backbone of our application is a function that matches activity patterns instead of colors, which makes our method robust to pose as well as to severe photometric and geometric distortions. It also does not require calibration of the color response curve of the camera–projector system. We present quantitative and qualitative results on synthetic and real-life examples, and compare the proposed method with the scale-invariant feature transform (SIFT) method and with a state-of-the-art structured light technique. We show that our method performs almost as well as structured light methods and significantly outperforms SIFT when the contrast of the video captured by the camera is degraded.
    Machine Vision and Applications 09/2012; 23(5). DOI: 10.1007/s00138-011-0358-4 · 1.44 Impact Factor
  • Source
    ABSTRACT: Algorithms for stereo video image processing typically assume that the various tasks (calibration, static stereo matching, and egomotion estimation) are independent black boxes. In particular, the task of computing disparity estimates is normally performed independently of ongoing egomotion and environmental recovery processes. Can information from these processes be exploited in the notoriously hard problem of disparity field estimation? Here we explore the use of feedback from the environmental model being constructed to the static stereopsis task. A prior estimate of the disparity field is used to seed the stereo matching process within a probabilistic framework. Experimental results on simulated and real data demonstrate the potential of the approach.
    Proceedings of the 2012 Joint International Conference on Human-Centered Computer Environments, Aizu-Wakamatsu, Japan; 03/2012
  • Source
    ABSTRACT: In this paper, we present a novel geometry video (GV) framework to model and compress 3-D facial expressions. GV bridges the gap between 3-D motion data and 2-D video, and provides a natural way to apply well-studied video processing techniques to motion data. Our framework includes a set of algorithms to construct GVs, such as hole filling, geodesic-based face segmentation, expression-invariant parameterization (EIP), and GV compression. Our EIP algorithm guarantees exact correspondence of the salient features (eyes, mouth, and nose) across frames, which leads to GVs with better spatial and temporal coherence than conventional parameterization methods. Taking advantage of this property, we also propose a new H.264/AVC-based progressive directional prediction scheme, which provides a further 10%-16% bitrate reduction over the original H.264/AVC applied to GV compression while maintaining good video quality. Our experimental results on real-world datasets demonstrate that GV is very effective for modeling high-resolution 3-D expression data, providing an attractive representation for expression processing in the gaming and movie industries.
    IEEE Transactions on Circuits and Systems for Video Technology 02/2012; 22(1):77-90. DOI: 10.1109/TCSVT.2011.2158337 · 2.26 Impact Factor
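The activity-matching idea in the first citing work above — comparing when pixels change rather than what color they are — might be sketched as follows. The function names and threshold are hypothetical; the paper's actual matching function is more elaborate:

```python
import numpy as np

def activity_signature(stack, thresh=0.1):
    """Binary per-pixel record of the frames in which intensity changed.

    stack: (T, H, W) video. Returns a (T-1, H, W) boolean array.
    Comparing change patterns rather than colors sidesteps the
    camera-projector color response entirely.
    """
    return np.abs(np.diff(stack.astype(float), axis=0)) > thresh

def match_pixel(sig_a, sig_b, y, x):
    """Return the pixel of sig_b whose activity pattern agrees most
    often (Hamming similarity) with pixel (y, x) of sig_a."""
    target = sig_a[:, y, x]
    agreement = (sig_b == target[:, None, None]).sum(axis=0)
    return np.unravel_index(int(np.argmax(agreement)), agreement.shape)
```

Because the signature records only change events, it survives the severe photometric distortions between projected and captured video that defeat color-based matchers.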
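The feedback idea in the second citing work above — seeding stereo matching with a prior disparity estimate from the environment model — could look like the following minimal sketch: a simple quadratic penalty toward the prior, not the paper's probabilistic framework, with all names hypothetical:

```python
import numpy as np

def seeded_disparity(cost_volume, prior, weight=0.5):
    """Pick per-pixel disparity minimizing data cost plus a penalty
    for deviating from a prior disparity map.

    cost_volume: (D, H, W) matching cost for each candidate disparity.
    prior: (H, W) disparity estimate fed back from the environment model.
    weight: confidence in the prior (0 = ignore it; larger = trust it more).
    """
    D = cost_volume.shape[0]
    d = np.arange(D)[:, None, None]
    penalty = weight * (d - prior[None, :, :]) ** 2
    return np.argmin(cost_volume + penalty, axis=0)
```

Where the data term is ambiguous (textureless or repetitive regions), the prior breaks the tie; where the data term is sharp, it dominates the penalty.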
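The geometry-video representation in the third citing work above — resampling parameterized 3-D geometry into image frames so that standard video codecs apply — can be illustrated with a simple quantization round trip. This assumes an 8-bit layout and hypothetical function names; the actual pipeline additionally involves parameterization and H.264/AVC coding:

```python
import numpy as np

def encode_geometry_frame(grid, bbox_min, bbox_max):
    """Map an (H, W, 3) grid of parameterized vertex positions into an
    8-bit RGB-like frame, so the frame sequence can feed a video codec."""
    scale = (grid - bbox_min) / (bbox_max - bbox_min)
    return np.clip(np.rint(scale * 255), 0, 255).astype(np.uint8)

def decode_geometry_frame(frame, bbox_min, bbox_max):
    """Invert the quantization back to 3-D positions (up to half a
    quantization step of error per coordinate)."""
    return frame.astype(float) / 255.0 * (bbox_max - bbox_min) + bbox_min
```

Once geometry lives in image frames, temporal coherence of the expression sequence is exactly what a video codec's inter-frame prediction exploits.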
