Conference Paper

Action classification on product manifolds.

DOI: 10.1109/CVPR.2010.5540131
Conference: The Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2010, San Francisco, CA, USA, 13-18 June 2010
Source: DBLP

ABSTRACT Videos can be naturally represented as multidimensional arrays known as tensors. However, the geometry of the tensor space is often ignored. In this paper, we argue that the underlying geometry of the tensor space is an important property for action classification. We characterize a tensor as a point on a product manifold and perform classification on this space. First, we factorize a tensor relating to each order using a modified High Order Singular Value Decomposition (HOSVD). We recognize each factorized space as a Grassmann manifold. Consequently, a tensor is mapped to a point on a product manifold and the geodesic distance on a product manifold is computed for tensor classification. We assess the proposed method using two public video databases, namely the Cambridge-Gesture and KTH human action data sets. Experimental results reveal that the proposed method performs very well on these data sets. In addition, our method is generic in the sense that no prior training is needed.
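
The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not spell out the "modified" HOSVD, so the sketch assumes that each mode unfolding contributes a truncated orthonormal basis of its row space, that each basis is treated as a point on a Grassmann manifold, and that the product-manifold distance is the Euclidean norm of all principal angles. The function names and the `rank` parameter are hypothetical.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-k unfolding: bring axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_subspaces(tensor, rank):
    """One orthonormal basis per mode (assumed stand-in for the modified HOSVD)."""
    bases = []
    for mode in range(tensor.ndim):
        # Thin SVD of the unfolding; the rows of vt span its row space.
        _, _, vt = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        bases.append(vt[:rank].T)  # columns are orthonormal
    return bases


def principal_angles(a, b):
    """Principal angles between the column spaces of orthonormal a and b."""
    cosines = np.linalg.svd(a.T @ b, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))


def product_manifold_distance(x, y, rank=3):
    """Assumed distance: norm of the principal angles collected over all modes."""
    angles = [
        principal_angles(a, b)
        for a, b in zip(mode_subspaces(x, rank), mode_subspaces(y, rank))
    ]
    return float(np.linalg.norm(np.concatenate(angles)))


def nearest_neighbor_label(query, gallery, labels, rank=3):
    """Label of the gallery tensor closest to `query`; no training step."""
    distances = [product_manifold_distance(query, g, rank) for g in gallery]
    return labels[int(np.argmin(distances))]
```

Under these assumptions, all tensors must share the same shape (for example, videos resized to a common height, width, and frame count), and `rank` must not exceed the smallest tensor dimension. The nearest-neighbor rule is one way to read the claim that no prior training is needed.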

  • ABSTRACT: Human action classification is an important task in computer vision. The Bag-of-Words model uses spatio-temporal features assigned to visual words of a vocabulary and some classification algorithm to attain this goal. In this work we have studied the effect of reducing the vocabulary size using a video word ranking method. We have applied this method to the KTH dataset to obtain a vocabulary with more descriptive words where the representation is more compact and efficient. Two feature descriptors, STIP and MoSIFT, and two classifiers, KNN and SVM, have been used to check the validity of our approach. Results for different vocabulary sizes show an improvement of the recognition rate whilst reducing the number of words as non-descriptive words are removed. Additionally, state-of-the-art performances are reached with this new compact vocabulary representation.
    High Performance Computing and Simulation (HPCS), 2012 International Conference on; 01/2012
  • ABSTRACT: A lot of nonlinear embedding techniques have been developed to recover the intrinsic low-dimensional manifolds embedded in the high-dimensional space. However, the quantitative evaluation criteria are less studied in literature. The embedding quality is usually evaluated by visualization which is subjective and qualitative. The few existing evaluation methods to estimate the embedding quality, neighboring preservation rate for example, are not widely applicable. In this paper, we propose several novel criteria for quantitative evaluation, by considering the global smoothness and co-directional consistence of the nonlinear embedding algorithms. The proposed criteria are geometrically intuitive, simple, and easy to implement with a low computational cost. Experiments show that our criteria capture some new geometrical properties of the nonlinear embedding algorithms, and can be used as a guidance to deal with the embedding of the out-of-samples.
    IEEE Transactions on Neural Networks 10/2011; 22(12):1987-98. · 2.95 Impact Factor
  • ABSTRACT: Given a finite set of subspaces of R^n, perhaps of differing dimensions, we describe a flag of vector spaces (i.e. a nested sequence of vector spaces) that best represents the collection based on a natural optimization criterion and we present an algorithm for its computation. The utility of this flag representation lies in its ability to represent a collection of subspaces of differing dimensions. When the set of subspaces all have the same dimension d, the flag mean is related to several commonly used subspace representations. For instance, the d-dimensional subspace in the flag corresponds to the extrinsic manifold mean. When the set of subspaces is both well clustered and equidimensional of dimension d, then the d-dimensional component of the flag provides an approximation to the Karcher mean. An intermediate matrix used to construct the flag can also be used to recover the canonical components at the heart of Multiset Canonical Correlation Analysis. Two examples utilizing the Carnegie Mellon University Pose, Illumination, and Expression Database (CMU-PIE) serve as visual illustrations of the algorithm.
    Linear Algebra and its Applications 06/2014; 451:15–32. · 0.97 Impact Factor