Conference Paper

View-Invariant Human Action Detection Using Component-Wise HMM of Body Parts.

DOI: 10.1007/978-3-540-70517-8_20 Conference: Articulated Motion and Deformable Objects, 5th International Conference, AMDO 2008, Port d'Andratx, Mallorca, Spain, July 9-11, 2008, Proceedings
Source: DBLP

ABSTRACT This paper presents a framework for view-invariant action recognition in image sequences. Feature-based human detection becomes extremely challenging when the agent is observed from different viewpoints. Moreover, similar actions, such as walking and jogging, are hardly distinguishable when the human body is considered as a whole. In this work, we have developed a system which detects human body parts under different views and recognises similar actions by learning the temporal changes of the detected body-part components. First, human body part detection is performed to locate three components of the human body separately, namely the head, the legs and the arms. We incorporate a number of sub-classifiers, each covering a specific range of viewpoints, to detect those body parts. Subsequently, we extend this approach to distinguish and recognise actions such as walking and jogging based on component-wise HMM learning.
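The component-wise idea described above can be sketched in a few lines: one small discrete HMM per action and per body part, with per-part log-likelihoods summed under a conditional-independence assumption and the highest-scoring action selected. All transition/emission matrices and observation streams below are illustrative toy values, not the paper's trained models.

```python
import numpy as np

def forward_loglik(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM
    (forward algorithm with per-step scaling for numerical stability).
    pi: initial state probs, A: transitions, B: emission probs."""
    alpha = pi * B[:, obs[0]]
    logp = 0.0
    for o in obs[1:]:
        s = alpha.sum()
        logp += np.log(s)
        alpha = (alpha / s) @ A * B[:, o]
    return logp + np.log(alpha.sum())

def classify(part_obs, models):
    """part_obs: dict part -> observation sequence (symbol indices).
    models: dict action -> dict part -> (pi, A, B).
    Per-part log-likelihoods are summed, assuming body-part streams
    are conditionally independent given the action."""
    scores = {
        action: sum(forward_loglik(part_obs[p], *parts[p]) for p in parts)
        for action, parts in models.items()
    }
    return max(scores, key=scores.get)

# Toy models: "walking" tends to emit symbol 0, "jogging" symbol 1.
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
B_walk = np.array([[0.8, 0.2], [0.7, 0.3]])
B_jog = np.array([[0.2, 0.8], [0.3, 0.7]])
models = {
    "walking": {"legs": (pi, A, B_walk), "arms": (pi, A, B_walk)},
    "jogging": {"legs": (pi, A, B_jog), "arms": (pi, A, B_jog)},
}
obs = {"legs": [0, 0, 1, 0, 0], "arms": [0, 1, 0, 0, 0]}
print(classify(obs, models))  # mostly-0 streams score higher under "walking"
```

In practice the per-part models would be trained (e.g. via Baum-Welch) on quantized features of each detected component, but the decision rule is the same sum-and-argmax shown here.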

  • Source
    ABSTRACT: The ability to predict people's intentions based solely on their visual actions is a skill performed only by humans and animals. It requires segmenting items in the field of view, tracking moving objects, identifying the importance of each object, determining the current role of each important object individually and in collaboration with other objects, relating these objects to a predefined scenario, assessing the selected scenario against the information retrieved, and finally adjusting the scenario to better fit the data. All of this is accomplished with great accuracy in less than a few seconds. Current computer algorithms have not reached this level of complexity within the accuracy and time constraints achieved by humans and animals, but several research efforts are working towards it by identifying new algorithms that solve parts of the problem. This survey lists several of these efforts, which rely mainly on image processing and the classification of a limited number of actions. It divides the activities into several groups and ends with a discussion of future needs.
    Keywords: Visual human action classification, Artificial intelligence, Hidden Markov Model, Grammars
    Artificial Intelligence Review 01/2011; 37(4):301-311.
  • Source
    ABSTRACT: In this paper, we present a novel descriptor to characterize human action when it is observed from a far field of view, where visual cues are usually sparse and vague. An action sequence is divided into overlapping spatio-temporal volumes to make reliable and comprehensive use of the observed features. Within each volume, we represent successive poses by time series of Histograms of Oriented Gradients (HOG) and movements by time series of Histograms of Oriented Optical Flow (HOOF). Supervised Principal Component Analysis (SPCA) is applied to seek a subset of discriminatively informative principal components (PCs), reducing the dimension of the histogram vectors without loss of accuracy. The final action descriptor is formed by concatenating the sequences of SPCA-projected HOG and HOOF features, and a Support Vector Machine (SVM) classifier is trained to perform action classification. We evaluated our algorithm on one normal-resolution and two low-resolution datasets and compared our results with those of other reported methods. Using less than 1/5 of the dimension of a full-length descriptor, our method achieves perfect accuracy on two of the datasets and performs comparably to other methods on the third.
    Workshop on Motion and Video Computing (WMVC '09); 01/2010
  • Source
    ABSTRACT: Recognition of natural gestures is a key issue in many applications, including video games and other immersive applications. Whatever the motion capture device, the key problem is to recognize, at an interactive frame rate, a motion that could be performed by a range of different users. Hidden Markov Models (HMMs), commonly used to recognize a user's performance, rely on a motion representation that strongly affects the overall recognition rate of the system. In this paper, we propose a compact motion representation based on Morphology-Independent features and evaluate its performance against classical representations. When dealing with 15 very similar upper-limb motions, HMMs based on Morphology-Independent features yield a significantly higher recognition rate (84.9%) than classical Cartesian or angular data (70.4% and 55.0%, respectively). Moreover, when the unknown motions are performed by a large number of users who never contributed to the learning process, the recognition rate of the Morphology-Independent input features decreases only slightly (down to 68.2% for an HMM trained with the motions of only one subject) compared to the other features (25.3% for Cartesian features and 17.8% for angular features under the same conditions). The method is illustrated through an interactive demo in which three virtual humans interactively recognize and replay the performance of the user, each associated with an HMM recognizer based on one of the three input features.
    International Journal of Pattern Recognition and Artificial Intelligence 10/2013; 27(08):1-19.
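The HOG/HOOF-plus-SPCA pipeline in the far-field descriptor above can be approximated in a few lines. Here a per-frame, magnitude-weighted orientation histogram stands in for HOG/HOOF cells, and plain unsupervised PCA stands in for the supervised PCA the authors use; all function names, bin counts and array shapes are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def oriented_histogram(dx, dy, bins=8):
    """Histogram of gradient (or optical-flow) orientations, weighted
    by magnitude and L2-normalized -- a simplified stand-in for a
    HOG / HOOF cell descriptor."""
    ang = np.arctan2(dy, dx) % (2 * np.pi)
    mag = np.hypot(dx, dy)
    hist, _ = np.histogram(ang, bins=bins, range=(0, 2 * np.pi), weights=mag)
    n = np.linalg.norm(hist)
    return hist / n if n > 0 else hist

def volume_descriptor(frames, bins=8):
    """Concatenate per-frame orientation histograms over a
    spatio-temporal volume of grayscale frames, giving a time series
    of pose features for that volume."""
    feats = []
    for f in frames:
        dy, dx = np.gradient(f.astype(float))  # image gradients
        feats.append(oriented_histogram(dx, dy, bins))
    return np.concatenate(feats)

def pca_project(X, k):
    """Plain PCA via SVD. The paper uses *supervised* PCA to pick
    discriminative components; this unsupervised version is a
    simpler stand-in for the dimensionality-reduction step."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T
```

The projected descriptors would then be concatenated across volumes and fed to an SVM classifier, as the abstract describes.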
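For the Morphology-Independent features in the gesture-recognition work above, one plausible (hypothetical) normalization is to express joint positions relative to a root joint and divide by a characteristic body scale, so that performers with different limb lengths map to comparable feature vectors; the paper's actual encoding is more elaborate, and this sketch only illustrates the invariance idea.

```python
import numpy as np

def morphology_normalized(joints, root=0):
    """Root-relative, size-normalized joint features for one frame.
    joints: (n_joints, 3) array of 3-D positions. The result is
    invariant to global translation and to uniform body scaling --
    a hedged stand-in for Morphology-Independent features."""
    rel = joints - joints[root]                  # translation-invariant
    size = np.linalg.norm(rel, axis=1).max()     # characteristic body scale
    return (rel / size).ravel() if size > 0 else rel.ravel()
```

A sequence of such frame vectors (possibly quantized) would then serve as the observation stream for the HMM recognizer.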
