Article

Recovering 3D human pose from monocular images

INRIA Rhone-Alpes, Montbonnot, France;
IEEE Transactions on Pattern Analysis and Machine Intelligence (impact factor: 4.91). 02/2006; 28(1):44-58. DOI:10.1109/TPAMI.2006.21
Source: IEEE Xplore

ABSTRACT We describe a learning-based method for recovering 3D human body pose from single images and monocular image sequences. Our approach requires neither an explicit body model nor prior labeling of body parts in the image. Instead, it recovers pose by direct nonlinear regression against shape descriptor vectors extracted automatically from image silhouettes. For robustness against local silhouette segmentation errors, silhouette shape is encoded by histogram-of-shape-contexts descriptors. We evaluate several different regression methods: ridge regression, relevance vector machine (RVM) regression, and support vector machine (SVM) regression over both linear and kernel bases. The RVMs provide much sparser regressors without compromising performance, and kernel bases give a small but worthwhile improvement in performance. The loss of depth and limb labeling information often makes the recovery of 3D pose from single silhouettes ambiguous. To handle this, the method is embedded in a novel regressive tracking framework, using dynamics from the previous state estimate together with a learned regression value to disambiguate the pose. We show that the resulting system tracks long sequences stably. For realism and good generalization over a wide range of viewpoints, we train the regressors on images resynthesized from real human motion capture data. The method is demonstrated for several representations of full body pose, both quantitatively on independent but similar test data and qualitatively on real image sequences. Mean angular errors of 4-6° are obtained for a variety of walking motions.
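
The core idea in the abstract, regressing a pose vector directly from a silhouette shape descriptor, can be illustrated with a minimal kernel ridge regression sketch. This is not the authors' implementation: the descriptor and pose dimensions, the Gaussian kernel width `beta`, the regularizer `lam`, all function names, and the synthetic data are assumptions made purely for illustration. Real inputs would be histogram-of-shape-contexts descriptors paired with motion capture poses.

```python
import numpy as np

def gaussian_kernel(A, B, beta):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-beta * np.maximum(d2, 0.0))

def fit_kernel_ridge(X, Y, beta=1.0, lam=1e-3):
    """Solve (K + lam*I) A = Y for the kernel weight matrix A."""
    K = gaussian_kernel(X, X, beta)
    return np.linalg.solve(K + lam * np.eye(len(X)), Y)

def predict(X_train, A, X_new, beta=1.0):
    """Predict pose vectors for new descriptors."""
    return gaussian_kernel(X_new, X_train, beta) @ A

# Synthetic stand-ins (assumed): 5-D "descriptors" and 2-D "poses".
rng = np.random.default_rng(0)
X = rng.random((200, 5))
Y = np.stack([np.sin(X.sum(axis=1)), np.cos(X[:, 0])], axis=1)

A = fit_kernel_ridge(X, Y, beta=1.0, lam=1e-3)
Y_hat = predict(X, A, rng.random((10, 5)), beta=1.0)
print(Y_hat.shape)
```

Plain ridge regression is the simplest of the regressor families named in the abstract. The RVM and SVM variants replace this dense solve with sparse estimators, so they retain only a subset of the training examples as basis functions.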

  • Article: Three-dimensional human shape inference from silhouettes: reconstruction and validation
    ABSTRACT: Silhouettes are robust image features that provide considerable evidence about the three-dimensional (3D) shape of a human body. The information they provide is, however, incomplete, and prior knowledge has to be integrated into reconstruction algorithms in order to obtain realistic body models. This paper presents a method that integrates both geometric and statistical priors to reconstruct the shape of a subject assuming a standardized posture from a frontal and a lateral silhouette. The method comprises three successive steps. First, a non-linear function that connects the silhouette appearances and the body shapes is learnt and used to create a first approximation. Then, the body shape is deformed globally along the principal directions of the population (obtained by performing principal component analysis over 359 subjects) to follow the contours of the silhouettes. Finally, the body shape is deformed locally to ensure it fits the input silhouettes as well as possible. Experimental results showed a mean absolute 3D error of 8 mm with ideal silhouette extraction. Furthermore, experiments on body measurements (circumferences or distances between two points on the body) resulted in a mean error of 11 mm.
    Keywords: Human models; Statistical prior; Shape-from-silhouettes; Three-dimensional reconstruction
    Machine Vision and Applications 04/2012; · 1.01 Impact Factor
  • Article: Region-based pose tracking with occlusions using 3D models
    ABSTRACT: Despite great progress achieved in 3-D pose tracking during the past years, occlusions and self-occlusions are still an open issue. This is particularly true in silhouette-based tracking, where even visible parts cannot be tracked as long as they do not affect the object silhouette. Multiple cameras or motion priors can overcome this problem. However, multiple cameras or appropriate training data are not always readily available. We propose a framework in which the pose of 3-D models is found by minimising the 2-D projection error through minimisation of an energy function depending on the pose parameters. This framework makes it possible to handle occlusions and self-occlusions by tracking multiple objects and object parts simultaneously. To this end, each part is described by its own image region, each of which is modeled by one probability density function. This allows occlusions to be handled explicitly, including self-occlusions between different parts of the same object as well as occlusions between different objects. The results we present for simulations and real-world scenes demonstrate the improvements achieved in monocular and multi-camera settings. These improvements are substantiated by quantitative evaluations, e.g. based on the HumanEva benchmark.
    Keywords: Pose estimation; Model-based tracking; Kinematic chain; Computer vision; Human motion analysis; Occlusion handling
    Machine Vision and Applications 04/2012; 23(3):557-577. · 1.01 Impact Factor
  • Conference Proceeding: Generative Tracking of Human Motion by Sequential Clonal Selection Algorithm
    ABSTRACT: The high-dimensional pose state space is the main challenge in articulated human motion tracking. In this paper, we propose a novel generative approach within the framework of an artificial immune model, by which we try to widen the bottleneck with an effective search strategy embedded in the extracted state subspace. Firstly, we learn the latent space of the pose state and propose a manifold reconstruction method to establish the inverse mapping. Pose analysis in this latent space is more effective and accurate. Secondly, we apply the Clonal Selection Algorithm (CSA) to human pose estimation. In order to make CSA suitable for pose tracking, we propose a sequential CSA (SCSA) framework by incorporating temporal continuity information into the traditional CSA. Experimental results show that our method achieves better results than state-of-the-art methods.
    Computer Graphics International 2012, United Kingdom; 06/2012

Keywords

3D human body
direct nonlinear regression
explicit body model
full body
image silhouettes
images resynthesized
learned regression value
limb labeling information
local silhouette segmentation errors
Mean angular errors
monocular image sequences
previous state estimate
prior labeling
real image sequences
shape descriptor vectors
similar test data
single images
single silhouettes ambiguous
support vector machine
wide range

A. Agarwal