Conference Paper

Combination of manual and non-manual features for sign language recognition based on conditional random field and active appearance model

Sch. of Comput. Eng., Chosun Univ., Gwangju, South Korea
DOI: 10.1109/ICMLC.2011.6016973 · Conference: 2011 International Conference on Machine Learning and Cybernetics (ICMLC), Volume 4
Source: IEEE Xplore

ABSTRACT Sign language recognition is the task of detecting and recognizing manual signals (MSs) and non-manual signals (NMSs) in a signed utterance. In this paper, a novel method for recognizing MSs and facial expressions as an NMS is proposed. This is achieved through a framework consisting of three components: (1) Candidate segments of MSs are discriminated using a hierarchical conditional random field (CRF) and BoostMap embedding; this component can distinguish signs, fingerspellings, and non-sign patterns, and is robust to variations in the size, scale, and rotation of the signer's hand. (2) Facial expressions as an NMS are recognized with a support vector machine (SVM) and an active appearance model (AAM): the AAM is used to extract facial feature points, from which several measurements are computed, and the SVM assigns each facial component to one of the defined facial expressions. (3) Finally, the recognition results of MSs and NMSs are fused in order to recognize signed sentences. Experiments demonstrate that the proposed method successfully combines MS and NMS features for recognizing signed sentences from utterance data.
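For illustration, a minimal Python sketch of component (2) follows: geometric measurements are computed from AAM-extracted facial feature points and classified into facial expressions with an SVM. The landmark indices, measurement definitions, expression labels, and training data are placeholders assumed for this example; the paper does not list its exact feature set here.

```python
# Sketch of component (2): facial-expression (NMS) classification from
# AAM landmarks with an SVM. Landmark indices and measurements are
# illustrative assumptions, not the paper's published feature set.
import numpy as np
from sklearn.svm import SVC

def facial_measurements(landmarks):
    """Compute simple geometric measurements from (N, 2) AAM landmarks.

    The indices below are assumed to point at eyebrow, eye, mouth, and
    face-outline points; a real AAM fit would define them explicitly.
    """
    brow_to_eye  = np.linalg.norm(landmarks[0] - landmarks[1])  # eyebrow raise
    eye_opening  = np.linalg.norm(landmarks[2] - landmarks[3])  # eye aperture
    mouth_width  = np.linalg.norm(landmarks[4] - landmarks[5])  # mouth stretch
    mouth_height = np.linalg.norm(landmarks[6] - landmarks[7])  # mouth opening
    face_scale   = np.linalg.norm(landmarks[8] - landmarks[9])  # normalizer
    return np.array([brow_to_eye, eye_opening, mouth_width, mouth_height]) / face_scale

# Placeholder training data: one measurement vector per face image and an
# integer expression label (e.g. 0 = neutral, 1 = raised brows, 2 = furrowed).
rng = np.random.default_rng(0)
X_train = rng.random((40, 4))
y_train = rng.integers(0, 3, size=40)

clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_train, y_train)

# At recognition time the landmarks would come from fitting the AAM to a frame.
test_landmarks = rng.random((10, 2))
expression = clf.predict(facial_measurements(test_landmarks).reshape(1, -1))
print(expression)
```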

Related publication:
ABSTRACT: Gesture recognition is useful for human-computer interaction. The difficulty of gesture recognition is that instances of gestures vary in both motion and shape in three-dimensional (3-D) space. We use depth information generated by Microsoft's Kinect to detect 3-D human body components, and apply a threshold model with a conditional random field to recognize meaningful gestures from continuous motion information. Body gesture recognition is achieved through a framework consisting of two steps. First, a human subject is described by a set of features encoding the angular relationships between body components in 3-D space (a minimal sketch of this feature appears after this entry). Second, the feature vector is recognized using a threshold model with a conditional random field. To show the performance of the proposed method, we use a public data set, the Microsoft Research Cambridge-12 Kinect gesture database. The experimental results demonstrate that the proposed method can efficiently and effectively recognize body gestures automatically.
Optical Engineering 01/2013; 5.
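As a rough illustration of the angular feature mentioned above, the sketch below computes joint angles between adjacent body components from 3-D Kinect joint positions; a sequence of such per-frame vectors would then be labeled by the threshold-model CRF. The joint names and the particular angles chosen are assumptions for the example, not the paper's exact encoding.

```python
# Sketch of per-frame angular features between 3-D body components.
# Joint names and the selected angles are illustrative assumptions.
import numpy as np

def segment_angle(a, b, c):
    """Angle (radians) at joint b formed by the 3-D points a-b-c."""
    u, v = a - b, c - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8)
    return np.arccos(np.clip(cos, -1.0, 1.0))

def frame_features(joints):
    """joints: dict mapping joint name -> (x, y, z); returns one feature vector."""
    return np.array([
        segment_angle(joints["shoulder_r"], joints["elbow_r"], joints["wrist_r"]),
        segment_angle(joints["shoulder_l"], joints["elbow_l"], joints["wrist_l"]),
        segment_angle(joints["hip_c"], joints["shoulder_c"], joints["head"]),
    ])

# Placeholder skeleton: random 3-D positions stand in for Kinect joint output.
rng = np.random.default_rng(0)
skeleton = {name: rng.random(3) for name in
            ["shoulder_r", "elbow_r", "wrist_r",
             "shoulder_l", "elbow_l", "wrist_l",
             "hip_c", "shoulder_c", "head"]}
print(frame_features(skeleton))
```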