Publications

  • Y Yun, IYH Gu, H Aghajan
    ABSTRACT: This paper addresses the classification of human activities in still images. We propose a novel method in which part-based features focusing on human-object interaction are used for activity representation, and classification is performed on manifolds by exploiting the underlying Riemannian geometry. The main contributions of the paper are: (a) representing human activity by appearance features from image patches containing hands, and by structural features formed from the distances between the torso and patch centers; (b) formulating an SVM kernel function based on geodesics on Riemannian manifolds under the log-Euclidean metric; (c) applying a multi-class SVM classifier on the manifold under the one-against-all strategy. Experiments were conducted on a dataset containing 2750 images covering 7 classes of activities from 10 subjects. Results show good performance (average classification rate 95.83%, false positive rate 0.71%, false negative rate 4.24%). Comparisons with three other related classifiers provide further support for the proposed method.
    IEEE Int'l Conf. on Image Processing (ICIP 2013); 09/2013
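As a rough illustration of contribution (b) in the abstract above, the geodesic distance under the log-Euclidean metric and a Gaussian kernel built on it can be sketched as follows. This is a minimal sketch, not the paper's implementation: the eigendecomposition-based matrix logarithm, the toy 2x2 matrices, and the kernel width `sigma` are all illustrative assumptions.

```python
import numpy as np

def spd_log(X):
    # Matrix logarithm of a symmetric positive-definite (SPD) matrix
    # via eigendecomposition.
    w, V = np.linalg.eigh(X)
    return V @ np.diag(np.log(w)) @ V.T

def log_euclidean_distance(X, Y):
    # Geodesic distance under the log-Euclidean metric:
    # d(X, Y) = ||log(X) - log(Y)||_F
    return np.linalg.norm(spd_log(X) - spd_log(Y), ord="fro")

def geodesic_rbf_kernel(X, Y, sigma=1.0):
    # Gaussian (RBF-style) kernel built from the geodesic distance;
    # usable as a custom kernel in a standard SVM solver.
    d = log_euclidean_distance(X, Y)
    return np.exp(-d ** 2 / (2.0 * sigma ** 2))

# Toy SPD matrices standing in for covariance descriptors of two patches.
A = np.array([[2.0, 0.3], [0.3, 1.0]])
B = np.array([[1.5, 0.1], [0.1, 0.8]])
print(geodesic_rbf_kernel(A, A))  # identical inputs give kernel value 1.0
```

Because the distance is a Frobenius norm in the (flat) log-domain, the resulting Gram matrix is positive definite, which is what makes it safe to plug into an off-the-shelf SVM.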
  • Y Yun, IYH Gu, H Aghajan
    ABSTRACT: This paper addresses object tracking in occlusion scenarios, where multiple uncalibrated cameras with overlapping fields of view are exploited. We propose a novel method in which tracking is first done independently in each individual view, and the tracking results are then mapped across views to improve the tracking jointly. The proposed tracker assumes that objects are visible in at least one view and move upright on a common planar ground, which induces a homography relation between views. A method for online learning of object appearances on Riemannian manifolds is also introduced. The main novelties of the paper are: (a) defining a similarity measure based on geodesics between a candidate object and a set of mapped references from multiple views on a Riemannian manifold; (b) proposing multi-view maximum likelihood (ML) estimation of object bounding box parameters, based on Gaussian-distributed geodesics on the manifold; (c) introducing online learning of object appearances on the manifold that takes possible occlusions into account; (d) utilizing projective transformations of objects between views, whose parameters are estimated from the warped vertical axis by combining planar homography, epipolar geometry and the vertical vanishing point; (e) embedding single-view trackers in a three-layer multi-view tracking scheme. Experiments were conducted on videos from multiple uncalibrated cameras, in which objects undergo long-term partial/full occlusions or frequent intersections. Comparisons were made with three existing methods, with performance evaluated both qualitatively and quantitatively. Results show the effectiveness of the proposed method in terms of robustness against tracking drift caused by occlusions.
    IEEE Journal on Emerging and Selected Topics in Circuits and Systems. 05/2013; 3(2):12.
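A minimal sketch of the occlusion-aware online appearance learning in novelty (c) above: blend a stored covariance model toward a new observation in the log-Euclidean tangent space, and freeze the model when the target is flagged as occluded. The update rate `alpha`, the function names, and the occlusion flag interface are assumptions for illustration, not the paper's actual scheme.

```python
import numpy as np

def _spd_log(X):
    # Matrix log of an SPD matrix via eigendecomposition.
    w, V = np.linalg.eigh(X)
    return V @ np.diag(np.log(w)) @ V.T

def _spd_exp(S):
    # Matrix exp of a symmetric matrix, returning an SPD matrix.
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.exp(w)) @ V.T

def update_appearance(model, observation, alpha=0.1, occluded=False):
    # Blend the SPD appearance model toward the new observation in the
    # tangent (log) space, then map back to the manifold. Skipping the
    # update under occlusion keeps occluders out of the model.
    if occluded:
        return model
    S = (1.0 - alpha) * _spd_log(model) + alpha * _spd_log(observation)
    return _spd_exp(S)

# Toy covariance descriptors: current model and a new observation.
M = np.array([[2.0, 0.0], [0.0, 2.0]])
C = np.array([[1.0, 0.2], [0.2, 1.0]])
M_new = update_appearance(M, C, alpha=0.5)
```

Interpolating in the log-domain keeps the updated model on the SPD manifold by construction, which a naive element-wise average of covariance matrices does not guarantee to do in a geometry-respecting way.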
  • Int'l Conf. on Pattern Recognition (ICPR 2012); 11/2012
  • Yixiao Yun, Irene Y.H. Gu, Hamid Aghajan
    ABSTRACT: This paper addresses the problem of object tracking in occlusion scenarios, where multiple uncalibrated cameras with overlapping fields of view are used. We propose a novel method in which tracking is first done independently for each view, and the tracking results are then mapped between each pair of views to improve the tracking in individual views, under the assumptions that objects are not occluded in all views and move upright on a planar ground, which induces a homography relation between each pair of views. The tracking results are mapped by jointly exploiting the geometric constraints of the homography, epipolar geometry and the vertical vanishing point. The main contributions of this paper are: (a) formulating a reference model of multi-view object appearance using region covariance for each view; (b) defining a likelihood measure based on geodesics on a Riemannian manifold that is consistent with the destination view, by mapping both the estimated positions and appearances of the tracked object from other views; (c) locating the object in each individual view based on the maximum likelihood criterion over multi-view estimates of object position. Experiments were conducted on videos from multiple uncalibrated cameras in which targets experience long-term partial or full occlusions. Comparisons with two existing methods and performance evaluations are also made. Test results show the effectiveness of the proposed method in terms of robustness against tracking drift caused by occlusions.
    6th ACM/IEEE Int'l Conf. on Distributed Smart Cameras, 2012 (ICDSC 12); 10/2012
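The homography relation the abstract above relies on reduces, for a single ground-plane position, to a projective transform in homogeneous coordinates. A minimal sketch follows; the 3x3 matrix `H` here is an assumed toy example, not an estimated inter-view homography.

```python
import numpy as np

def map_ground_point(H, p):
    # Map an image point p = (x, y) lying on the common ground plane
    # from one view into another via the 3x3 planar homography H.
    q = H @ np.array([p[0], p[1], 1.0])
    return q[:2] / q[2]  # de-homogenize

# Assumed toy homography: uniform scale by 2 between the two views.
H = np.diag([2.0, 2.0, 1.0])
print(map_ground_point(H, (1.0, 3.0)))  # -> [2. 6.]
```

In practice the homography only transfers points on the ground plane, which is why the papers pair it with the epipolar constraint and the vertical vanishing point to map the rest of an upright object's bounding box.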
  • Yixiao Yun, Irene Y.H. Gu
    ABSTRACT: This paper proposes a novel method for multi-view face pose classification through sequential learning and sensor fusion. The basic idea is to use face images observed in the visual and thermal infrared (IR) bands, with the same sampling weights, in a multi-class boosting structure. The main contribution of this paper is a multi-class AdaBoost classification framework in which information obtained from the visual and infrared bands interactively complements each other. This is achieved by learning weak hypotheses for the visual and IR bands independently and then fusing the optimized hypothesis sub-ensembles. In addition, an effective feature descriptor is introduced for thermal IR images. Experiments were conducted on a visual and thermal IR image dataset containing 4844 face images in 5 different poses. Results show a significant increase in classification rate compared with an existing multi-class AdaBoost algorithm, SAMME, trained on visual or infrared images alone, as well as with a simple baseline classification-fusion algorithm.
    IEEE Int'l Conf. on Acoustics, Speech and Signal Processing (ICASSP 2012); 03/2012
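The final fusion step described above, combining the visual and IR hypothesis sub-ensembles into one multi-class decision, can be sketched as a convex combination of per-class scores followed by an arg-max. The weight `w_visual` and the toy score vectors are illustrative assumptions; the paper's boosting-derived ensemble weights would take their place.

```python
import numpy as np

def fuse_predict(visual_scores, ir_scores, w_visual=0.5):
    # Convex combination of per-class scores from the two sub-ensembles,
    # followed by an arg-max decision over the pose classes.
    fused = (w_visual * np.asarray(visual_scores)
             + (1.0 - w_visual) * np.asarray(ir_scores))
    return int(np.argmax(fused))

# Toy scores for 5 pose classes from each band.
visual = [0.1, 0.2, 0.5, 0.1, 0.1]
ir     = [0.1, 0.1, 0.1, 0.6, 0.1]
print(fuse_predict(visual, ir, w_visual=0.8))  # visual band dominates -> 2
```

With `w_visual=0.8` the visual band's favored class (index 2) wins; lowering the weight lets the IR band's favored class (index 3) take over, which is the complementary behavior the fusion framework exploits when one band is unreliable.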
