Conference Paper

Recognizing Human Actions from Still Images with Latent Poses

DOI: 10.1109/CVPR.2010.5539879 Conference: The Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2010, San Francisco, CA, USA, 13-18 June 2010
Source: DBLP

ABSTRACT We consider the problem of recognizing human actions from still images. We propose a novel approach that treats the pose of the person in the image as latent variables that will help with recognition. Unlike previous work that learns separate systems for pose estimation and action recognition and then combines them in an ad hoc fashion, our system is trained in an integrated fashion that jointly considers poses and actions. Our learning objective is designed to directly exploit the pose information for action recognition. Our experimental results demonstrate that by inferring the latent poses, we can improve the final action recognition results.
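The core idea of the abstract, treating pose as a latent variable that is maximized over during both training and prediction, can be illustrated with a minimal sketch. All names here (`features`, `score`, `predict_action`) and the toy feature map are hypothetical, for illustration only, and do not reproduce the authors' actual model:

```python
# Hypothetical sketch of latent-variable action scoring: the pose h is a
# latent variable, and the score of an (image, action) pair maximizes over
# candidate poses. The feature map below is a toy stand-in for a joint
# image/pose/action representation, not the paper's implementation.

def features(x, pose, action):
    # Toy joint feature map; in the paper this would couple image evidence,
    # a pose configuration, and the action label.
    return [x * (pose + 1) * (action + 1)]

def score(w, x, action, poses):
    # Infer the best latent pose for this (image, action) pair.
    return max(
        sum(wi * fi for wi, fi in zip(w, features(x, p, action)))
        for p in poses
    )

def predict_action(w, x, actions, poses):
    # Prediction maximizes jointly over actions and latent poses.
    return max(actions, key=lambda a: score(w, x, a, poses))
```

Because the pose is inferred rather than supervised at test time, training can tune the weights so that the inferred pose is the one most useful for discriminating actions, which is the integrated objective the abstract describes.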

  •
    ABSTRACT: This paper describes a supervised classification approach based on non-negative matrix factorization (NMF). Our classification framework builds on recent extensions of non-negative matrix factorization to multiview learning, where the primary dataset benefits from auxiliary information for obtaining shared and meaningful spaces. For discrimination, we use data categories in a supervised manner as an auxiliary source of information in order to learn co-occurrences through a common set of basis vectors. We demonstrate the efficiency of our algorithm in integrating various image modalities to enhance overall classification accuracy on different benchmark datasets. Our evaluation considers two challenging image datasets for human action recognition. We show that our algorithm outperforms the state of the art in efficiency and overall classification accuracy.
    35th German Conference on Pattern Recognition (GCPR/DAGM); 09/2013
  •
    ABSTRACT: Deformable part-based models [1, 2] achieve state-of-the-art performance for object detection, but rely on heuristic initialization during training due to the optimization of a non-convex cost function. This paper investigates the limitations of such an initialization and extends earlier methods using additional supervision. We explore strong supervision in terms of annotated object parts and use it to (i) improve model initialization, (ii) optimize model structure, and (iii) handle partial occlusions. Our method is able to deal with sub-optimal and incomplete annotations of object parts and is shown to benefit from semi-supervised learning setups where part-level annotation is provided for only a fraction of the positive examples. Experimental results are reported for the detection of six animal classes in the PASCAL VOC 2007 and 2010 datasets. We demonstrate significant improvements in detection performance compared to the LSVM [1] and Poselet [3] object detectors.
    Proceedings of the 12th European conference on Computer Vision - Volume Part I; 10/2012
  •
    ABSTRACT: Various sports video genre categorization methods have been proposed recently, mainly focusing on professional sports videos captured for TV broadcasting. This paper aims to categorize sports videos in the wild, captured with mobile phones by people watching a game or practicing a sport. Thus, no assumption is made about video production practices or the existence of field lining and equipment. Motivated by the distinctiveness of motions in sports activities, we propose a novel motion trajectory descriptor to effectively and efficiently represent a video. Furthermore, temporal analysis of local descriptors is proposed to integrate the categorization decision over time. Experiments on a newly collected dataset of amateur sports videos in the wild demonstrate that our trajectory descriptor is superior for sports video categorization and that temporal analysis further improves categorization accuracy.
    IEEE International Conference on Image Processing (ICIP 2014), Paris, France; 10/2014
