The investigation and implementation of real-time face pose and direction estimation on mobile computing devices



Mobile computing devices have many limitations, such as relatively small user interfaces and slow computing speeds. Augmented reality often requires face pose estimation, which can serve as a human-computer interaction (HCI) and entertainment tool. For a real-time implementation of head pose estimation on a resource-limited mobile platform, several constraints must be met while preserving sufficient pose estimation accuracy. The proposed face pose estimation method meets this objective: experimental results on a test Android mobile device show satisfactory performance in both speed and accuracy.
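The paper's own algorithm is not available here, but the flavor of a lightweight, mobile-friendly pose estimate can be sketched geometrically. The function below is an illustrative assumption, not the authors' method: it approximates head yaw from the horizontal offset of the nose tip relative to the midpoint between the eyes, using only three 2-D landmarks and no heavy linear algebra.

```python
import math

def estimate_yaw(left_eye, right_eye, nose_tip):
    """Approximate head yaw in degrees from three 2-D facial landmarks.

    Assumes an upright face: a nose tip centred between the eyes gives
    0 degrees, and the angle grows as the head turns left or right.
    Illustrative sketch only, not the paper's algorithm.
    """
    mid_x = (left_eye[0] + right_eye[0]) / 2.0
    eye_dist = right_eye[0] - left_eye[0]
    # Normalised horizontal offset of the nose from the eye midpoint
    offset = (nose_tip[0] - mid_x) / eye_dist
    # Clamp and map to an angle under a simple spherical head model
    offset = max(-1.0, min(1.0, offset))
    return math.degrees(math.asin(offset))

# A frontal face: nose tip centred between the eyes gives zero yaw
print(estimate_yaw((40, 50), (80, 50), (60, 70)))  # 0.0
```

This kind of closed-form landmark geometry is attractive on mobile hardware precisely because it avoids the per-frame optimisation that full 3-D model fitting requires.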


Face detection is a complicated and important problem in pattern recognition with wide application. Since Viola and Jones' work, effective real-time face detection has been possible using rectangular Haar-like features with AdaBoost learning. In this paper, a face detection algorithm based on AdaBoost is implemented, but MB-LBP features are extracted instead of Haar features and used to train the AdaBoost classifier. To reduce the false alarm rate, skin color is combined with AdaBoost: the image is first segmented into "skin" and "non-skin" areas, and detections are confirmed only within the "skin" area. AdaBoost alone increases not only the detection rate but also the false alarm rate, producing many false positives in regions that are not skin but are similar in color to skin. Based on this, the paper combines skin color with AdaBoost [1] using MB-LBP features, and experimental results show that the proposed algorithm dramatically improves the detection rate on images containing multiple faces.
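The skin-color pre-filter described above can be illustrated with the widely used YCbCr chrominance box (Cb in [77, 127], Cr in [133, 173]). These threshold values and the BT.601 conversion are a common textbook choice assumed here for illustration; the paper's own thresholds may differ.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert an 8-bit RGB pixel to YCbCr (ITU-R BT.601 coefficients)."""
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def is_skin(r, g, b):
    """Classify a pixel as skin by fixed chrominance thresholds.

    The [77, 127] / [133, 173] Cb/Cr ranges are an assumed textbook
    choice; a detector would run AdaBoost only where this returns True.
    """
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return 77 <= cb <= 127 and 133 <= cr <= 173

print(is_skin(220, 170, 140))  # True  (a typical skin tone)
print(is_skin(0, 255, 0))      # False (pure green)
```

Working in chrominance (Cb, Cr) rather than RGB makes the rule largely independent of brightness, which is why such filters are cheap but still useful as a first-stage reject step.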
In this paper, we propose a novel hardware architecture and FPGA implementation of a high-performance, real-time face-detection engine that is robust to variable illumination and rotation. The proposed face-detection algorithm improves performance by using the MCT (Modified Census Transform), rotation transformation, and the AdaBoost learning algorithm. For the implementation, we used a QVGA-class camera, an LCD display, and a Xilinx Virtex5 LX330 FPGA. Verification showed that the engine can detect at least 32 faces across a wide variety of sizes at up to 43 frames per second in real time. This result can be applied to artificial intelligence robots for human recognition, security systems for identity verification, and digital cameras using image-processing techniques.
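The Modified Census Transform at the heart of this engine compares every pixel of a 3x3 window against the window mean, yielding a 9-bit code that is invariant to monotonic brightness changes. A minimal software sketch of the transform (the hardware pipeline itself is not reproduced here):

```python
def mct(patch):
    """Modified Census Transform of a 3x3 intensity patch.

    Each pixel (including the centre) is compared against the window
    mean, producing a 9-bit illumination-robust code. Software sketch
    of the transform only, not the paper's FPGA pipeline.
    """
    mean = sum(sum(row) for row in patch) / 9.0
    code = 0
    for row in patch:
        for px in row:
            code = (code << 1) | (1 if px > mean else 0)
    return code

# A uniform patch carries no structure: every comparison bit is 0
print(mct([[10, 10, 10], [10, 10, 10], [10, 10, 10]]))  # 0

# A bright left column yields the same code after a +50 brightness shift
print(mct([[90, 10, 10], [90, 10, 10], [90, 10, 10]]) ==
      mct([[140, 60, 60], [140, 60, 60], [140, 60, 60]]))  # True
```

Because the code depends only on comparisons, it maps naturally onto FPGA logic: each bit is a single comparator against a shared window mean.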
Although many face detection algorithms have been introduced in the literature, only a handful of them can meet the real-time constraints of mobile devices. This paper presents the real-time implementation of our previously introduced face detection algorithm on a mobile device. The steps taken to achieve such a real-time implementation are discussed. Real-time comparison results with the widely used Viola-Jones face detection algorithm in terms of detection rate and processing speed are presented to demonstrate the robustness of our real-time solution.
We present a novel approach to localizing parts in images of human faces. The approach combines the output of local detectors with a non-parametric set of global models for the part locations based on over one thousand hand-labeled exemplar images. By assuming that the global models generate the part locations as hidden variables, we derive a Bayesian objective function. This function is optimized using a consensus of models for these hidden variables. The resulting localizer handles a much wider range of expression, pose, lighting and occlusion than prior ones. We show excellent performance on a new dataset gathered from the internet and show that our localizer achieves state-of-the-art performance on the less challenging BioID dataset.
The development of intelligent environments for human movement capture and analysis is discussed. To achieve reliable and robust system performance in intelligent spaces, where interactions among multiple people and other fixtures must be properly captured and analyzed, robust audiovisual (AV) signatures of the participants, gestures, and events are required. In the context of lipreading and person-identification applications, it is shown that combining multimodal information greatly improves system performance.
The capacity to estimate the head pose of another person is a common human ability that presents a unique challenge for computer vision systems. Compared to face detection and recognition, which have been the primary foci of face-related vision research, identity-invariant head pose estimation has fewer rigorously evaluated systems or generic solutions. In this paper, we discuss the inherent difficulties in head pose estimation and present an organized survey describing the evolution of the field. Our discussion focuses on the advantages and disadvantages of each approach and spans 90 of the most innovative and characteristic papers that have been published on this topic. We compare these systems by focusing on their ability to estimate coarse and fine head pose, highlighting approaches that are well suited for unconstrained environments.
A conversation robot is developed that recognizes a user's head gestures and uses the results as para-linguistic information. In conversation, humans exchange linguistic information, which can be obtained by transcribing the utterance, and para-linguistic information, which helps transmit the linguistic information. Para-linguistic information carries nuance that cannot be conveyed by linguistic information alone, making conversation natural and effective. We recognize the user's head gestures as para-linguistic information in the visual channel, using the optical flow over the head region as the feature and modeling it with HMMs for recognition. In actual conversation, the robot may perform a gesture while the user does; in this situation, the image sequence captured by the camera mounted in the robot's eyes contains sway caused by the camera's own movement. To solve this problem, we introduce two devices. One concerns feature extraction: the optical flow of the body area is used to compensate for the swayed images. The other concerns the probability models: motion-mode-dependent models are prepared with the MLLR model-adaptation technique, and the models are switched according to the robot's motion mode. Experimental results show the effectiveness of these techniques.
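The paper's HMM-based recognizer is not reproduced here, but its underlying intuition, that vertical flow energy dominates in a nod and horizontal flow energy in a shake, can be shown with a simple stand-in heuristic. The feature, thresholds, and gesture labels below are assumptions for illustration, not the paper's model:

```python
def classify_head_gesture(flows, ratio=2.0):
    """Toy nod/shake classifier over mean optical-flow vectors.

    flows: list of (dx, dy) mean head-region flow per frame.
    A real system would model the temporal sequence with an HMM,
    as the paper does; this direction-energy rule is only a sketch.
    """
    horiz = sum(abs(dx) for dx, dy in flows)
    vert = sum(abs(dy) for dx, dy in flows)
    if vert > ratio * horiz:
        return "nod"
    if horiz > ratio * vert:
        return "shake"
    return "none"

# Mostly vertical up-and-down motion reads as a nod
print(classify_head_gesture([(0.1, 2.0), (0.0, -1.8), (0.2, 1.5)]))  # nod
```

An HMM improves on this by modelling the temporal order of flow directions, so a single downward glance is not confused with a repeated nod.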
This paper introduces a novel Gabor-Fisher classifier (GFC) for face recognition. The GFC method, which is robust to changes in illumination and facial expression, applies the enhanced Fisher linear discriminant model (EFM) to an augmented Gabor feature vector derived from the Gabor wavelet representation of face images. The novelty of this paper comes from (1) the derivation of an augmented Gabor feature vector, whose dimensionality is further reduced using the EFM by considering both data compression and recognition (generalization) performance; (2) the development of a Gabor-Fisher classifier for multi-class problems; and (3) extensive performance evaluation studies. In particular, we performed comparative studies of different similarity measures applied to various classifiers. We also performed comparative experimental studies of various face recognition schemes, including our novel GFC method, the Gabor wavelet method, the eigenfaces method, the Fisherfaces method, the EFM method, the combination of Gabor and the eigenfaces method, and the combination of Gabor and the Fisherfaces method. The feasibility of the new GFC method has been successfully tested on face recognition using 600 FERET frontal face images corresponding to 200 subjects, which were acquired under variable illumination and facial expressions. The novel GFC method achieves 100% accuracy on face recognition using only 62 features.
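The Fisher discriminant underlying the GFC can be illustrated in its simplest two-class, two-feature form: project onto w proportional to Sw^-1 (m1 - m2), where Sw is the pooled within-class scatter matrix. This textbook sketch is not the paper's multi-class EFM, which additionally regularizes and compresses the high-dimensional Gabor features:

```python
def mean2(pts):
    """Mean of a list of 2-D points."""
    n = float(len(pts))
    return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

def fisher_direction(a, b):
    """Fisher discriminant direction for two classes of 2-D points.

    Returns w proportional to Sw^-1 (mean_a - mean_b), with Sw the
    summed within-class scatter. Textbook two-class sketch only.
    """
    ma, mb = mean2(a), mean2(b)
    # Pooled within-class scatter Sw = [[sxx, sxy], [sxy, syy]]
    sxx = sxy = syy = 0.0
    for pts, m in ((a, ma), (b, mb)):
        for x, y in pts:
            dx, dy = x - m[0], y - m[1]
            sxx += dx * dx
            sxy += dx * dy
            syy += dy * dy
    det = sxx * syy - sxy * sxy
    dmx, dmy = ma[0] - mb[0], ma[1] - mb[1]
    # Apply the closed-form 2x2 inverse of Sw to the mean difference
    return ((syy * dmx - sxy * dmy) / det, (sxx * dmy - sxy * dmx) / det)

# Two classes separated purely along x: w lies along the x axis
w = fisher_direction([(0, 0), (0, 1), (1, 0), (1, 1)],
                     [(5, 0), (5, 1), (6, 0), (6, 1)])
```

The EFM refines this classical projection so that it generalizes when the feature dimension (here, the augmented Gabor vector) far exceeds the number of training samples.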