Conference Paper

High-accuracy stereo depth maps using structured light

Middlebury Coll., VT, USA
DOI: 10.1109/CVPR.2003.1211354
In proceedings of: 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Volume 1
Source: DBLP

ABSTRACT Progress in stereo algorithm performance is quickly outpacing the ability of existing stereo data sets to discriminate among the best-performing algorithms, motivating the need for more challenging scenes with accurate ground truth information. This paper describes a method for acquiring high-complexity stereo image pairs with pixel-accurate correspondence information using structured light. Unlike traditional range-sensing approaches, our method does not require the calibration of the light sources and yields registered disparity maps between all pairs of cameras and illumination projectors. We present new stereo data sets acquired with our method and demonstrate their suitability for stereo algorithm evaluation. Our results are available at

  • ABSTRACT: In this paper, an algorithm is presented for estimating scene flow, which is a richer, 3D analogue of optical flow. The approach operates orders of magnitude faster than alternative techniques, and is well suited to further performance gains through parallelized implementation. The algorithm employs multiple hypotheses to deal with motion ambiguities, rather than the traditional smoothness constraints, removing oversmoothing errors and providing significant performance improvements on benchmark data over the previous state of the art. The approach is flexible, and capable of operating with any combination of appearance and/or depth sensors, in any setup, simultaneously estimating the structure and motion if necessary. Additionally, the algorithm propagates information over time to resolve ambiguities, rather than performing an isolated estimation at each frame, as in contemporary approaches. Approaches to smoothing the motion field without sacrificing the benefits of multiple hypotheses are explored, and a probabilistic approach to occlusion estimation is demonstrated, leading to 10% and 15% improved performance respectively. Finally, a data-driven tracking approach is described, and used to estimate the 3D trajectories of hands during sign language, without the need to model complex appearance variations at each viewpoint.
    IEEE Transactions on Software Engineering 03/2014; 36(3):564-576. · 2.59 Impact Factor
  • ABSTRACT: In this article, we present a fast, high-quality stereo matching algorithm on FPGA using cost aggregation (CA) and fast locally consistent (FLC) dense stereo. In many software programs, global matching algorithms are used in order to obtain accurate disparity maps. Although their error rates are considerably low, their processing speeds are far from that required for real-time processing because of their complex processing sequences. In order to realize real-time processing, many hardware systems have been proposed to date. They have achieved considerably high processing speeds; however, their error rates are not as good as those of software programs, because simple local matching algorithms have been widely used in those systems. In our system, sophisticated local matching algorithms (CA and FLC) that are suitable for FPGA implementation are used to achieve a low error rate while maintaining a high processing speed. We evaluate the performance of our circuit on Xilinx Virtex-6 FPGAs. Its error rate is comparable to that of top-level software algorithms, and its processing speed is nearly 2 clock cycles per pixel, which reaches 507.9 fps for 640 × 480 pixel images.
    ACM Transactions on Reconfigurable Technology and Systems (TRETS). 02/2014; 7(1).
  • ABSTRACT: Automatic focus and exposure are key components of today's digital cameras, and jointly play an essential role in capturing high-quality images and video. In this paper, we attempt to address these two challenges for future depth cameras. Relying on a programmable projector, we establish a structured light system for depth sensing with focus and exposure adaptation. The basic idea is to change the current illumination pattern and intensity locally according to prior depth information. Consequently, multiple object surfaces appearing at different depths in the scene can each receive proper illumination. In this way, more flexible and robust depth sensing can be achieved in comparison with fixed illumination, especially at near depth.
    Journal of Visual Communication and Image Representation 01/2014; 25(4):649–658. · 1.20 Impact Factor

