On Learning Conditional Random Fields for Stereo

International Journal of Computer Vision (Impact Factor: 3.62). 09/2012; 99(3):1-19. DOI: 10.1007/s11263-010-0385-z

ABSTRACT Until recently, the lack of ground truth data has hindered the application of discriminative structured prediction techniques
to the stereo problem. In this paper we use ground truth data sets that we have recently constructed to explore different
model structures and parameter learning techniques. To estimate parameters in Markov random fields (MRFs) via maximum likelihood
one usually needs to perform approximate probabilistic inference. Conditional random fields (CRFs) are discriminative versions
of traditional MRFs. We explore a number of novel CRF model structures including a CRF for stereo matching with an explicit
occlusion model. CRFs require expensive inference steps for each iteration of optimization and inference is particularly slow
when there are many discrete states. We explore belief propagation, variational message passing and graph cuts as inference
methods during learning and compare with learning via pseudolikelihood. To accelerate approximate inference we have developed
a new method called sparse variational message passing which can reduce inference time by an order of magnitude with negligible
loss in quality. Learning using sparse variational message passing improves upon previous approaches using graph cuts and
allows efficient learning over large data sets when energy functions violate the constraints imposed by graph cuts.

Keywords: Stereo, Learning, Structured prediction, Approximate inference
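The abstract names sparse variational message passing but does not spell out the algorithm. The following is only a minimal sketch of the general idea, under stated assumptions: naive mean-field updates on a single scanline with a Potts smoothness term, where each pixel discards its least probable disparity states. The function name `mean_field`, the parameters `smooth` and `eps`, and the pruning rule (keep the smallest set of states holding at least 1 - eps of the mass) are all illustrative choices, not taken from the paper.

```python
import numpy as np

def mean_field(unary, smooth=1.0, iters=20, eps=0.0):
    """Naive mean field on a 1D chain of pixels with a Potts smoothness term.

    unary: (n_pixels, n_states) matching costs, lower is better (hypothetical input).
    eps:   probability mass each pixel may discard; eps=0 keeps every state.
    Returns (q, active): approximate marginals and a boolean mask of kept states.
    """
    n, k = unary.shape
    q = np.full((n, k), 1.0 / k)
    active = np.ones((n, k), dtype=bool)
    for _ in range(iters):
        for i in range(n):
            # Expected Potts penalty: neighbor j disagrees with state s w.p. 1 - q_j(s).
            penalty = np.zeros(k)
            for j in (i - 1, i + 1):
                if 0 <= j < n:
                    penalty += smooth * (1.0 - q[j])
            logits = -(unary[i] + penalty)
            logits[~active[i]] = -np.inf   # pruned states stay at zero mass
            logits -= logits.max()
            p = np.exp(logits)
            q[i] = p / p.sum()
            if eps > 0.0:
                # Assumed pruning rule: keep the smallest prefix of states
                # (sorted by probability) that holds at least 1 - eps of the mass.
                order = np.argsort(-q[i])
                mass_before = np.cumsum(q[i][order]) - q[i][order]
                keep = mass_before < 1.0 - eps
                active[i] = False
                active[i, order[keep]] = True
                q[i] = np.where(active[i], q[i], 0.0)
                q[i] /= q[i].sum()
    return q, active

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_d = np.array([2] * 5 + [5] * 5)          # toy ground-truth disparities
    unary = rng.uniform(1.0, 2.0, size=(10, 8))   # toy matching costs
    unary[np.arange(10), true_d] = 0.0
    q, active = mean_field(unary, eps=0.01)
    print("estimated disparities:", q.argmax(axis=1))
    print("states kept per pixel:", active.sum(axis=1))
```

This sketch only tracks which states remain active. A practical implementation would also skip pruned states when computing messages, which is where any speedup would come from.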

  •
    ABSTRACT: Probabilistic graphical models have had a tremendous impact in machine learning and approaches based on energy function minimization via techniques such as graph cuts are now widely used in image segmentation. However, the free parameters in energy function-based segmentation techniques are often set by hand or using heuristic techniques. In this paper, we explore parameter learning in detail. We show how probabilistic graphical models can be used for segmentation problems to illustrate Markov random fields (MRFs), their discriminative counterparts conditional random fields (CRFs) as well as kernel CRFs. We discuss the relationships between energy function formulations, MRFs, CRFs, hybrids based on graphical models and their relationships to key techniques for inference and learning. We then explore a series of novel 3D graphical models and present a series of detailed experiments comparing and contrasting different approaches for the complete volumetric segmentation of multiple organs within computed tomography imagery of the abdominal region. Further, we show how these modeling techniques can be combined with state of the art image features based on histograms of oriented gradients to increase segmentation performance. We explore a wide variety of modeling choices, discuss the importance and relationships between inference and learning techniques and present experiments using different levels of user interaction. We go on to explore a novel approach to the challenging and important problem of adrenal gland segmentation. We present a 3D CRF formulation and compare with a novel 3D sparse kernel CRF approach we call a relevance vector random field. The method yields state of the art performance and avoids the need to discretize or cluster input features. 
    We believe our work is the first to provide quantitative comparisons between traditional MRFs with edge-modulated interaction potentials and CRFs for multi-organ abdominal segmentation and the first to explore the 3D adrenal gland segmentation problem. Finally, along with this paper we provide the labeled data used for our experiments to the community.
    Machine Vision and Applications 02/2014; · 1.10 Impact Factor
  •
    ABSTRACT: This study tackles the image color to grayscale conversion problem. The aim was to understand the conversion qualities that can improve the accuracy of results when the grayscale conversion is applied as a pre-processing step in the context of vision algorithms, and in particular dense stereo matching. We evaluated many different state of the art color to grayscale conversion algorithms. We also propose an ad-hoc adaptation of the most theoretically promising algorithm, which we call Multi-Image Decolorize (MID). This algorithm comes from an in-depth analysis of the existing conversion solutions and consists of a multi-image extension of the algorithm by Grundland and Dodgson (The decolorize algorithm for contrast enhancing, color to grayscale conversion, Tech. Rep. UCAM-CL-TR-649, University of Cambridge, 2005), which is based on predominant component analysis. In addition, two variants of this algorithm have been proposed and analyzed: one with standard unsharp masking and another with a chromatic weighted unsharp masking technique (Nowak and Baraniuk in IEEE Trans Image Process 7(7):1068–1074, 1998) which enhances the local contrast as shown in the approach by Smith et al. (Comput Graph Forum 27(2), 2008). We tested the relative performance of this conversion with respect to many other solutions, using the StereoMatcher test suite (Scharstein and Szeliski in Int J Comput Vis 47(1–3):7–42, 2002) with a variety of different datasets and different dense stereo matching algorithms. The results show that the overall performance of the proposed MID conversion is good, and the reported tests provide useful information and insights on how to design color to grayscale conversion to improve matching performance. We also show some interesting secondary results, such as the role of standard unsharp masking vs. chromatic unsharp masking in improving correspondence matching.
    Keywords: Color to grayscale conversion, Dense stereo matching, Dimensionality reduction, 3D reconstruction, Unsharp masking
    Machine Vision and Applications 01/2012; 23:327-348. · 1.10 Impact Factor
  •
    International Journal of Computer Vision 01/2012; · 3.62 Impact Factor

