Article · PDF available

Abstract

With the advent of fast USB interfaces, the recent increase in computer power, and the decrease in camera cost, it has become very common to see a camera on top of a computer monitor. Vision-based games and interfaces, however, are still not common, despite the recognized benefits vision could bring: hands-free control, multiple-user interaction, etc. The reason for this lies in the inability to track human faces in video both precisely and robustly. This paper describes a face tracking technique, based on tracking the convex-shape nose feature, which resolves this problem. The technique has been successfully applied to interactive computer games and perceptual user interfaces. These results ...
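For intuition, here is a minimal sketch, not the paper's actual algorithm, of tracking a nose-tip patch between frames with normalized cross-correlation in OpenCV; the initial patch location, its size, and the webcam source are illustrative assumptions.

```python
import cv2

# Minimal sketch (not the paper's actual algorithm): track a small
# nose-tip patch across frames with normalized cross-correlation.
# Patch location/size and the webcam source are illustrative assumptions.
cap = cv2.VideoCapture(0)
ok, frame = cap.read()
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Assume the nose tip was located once, e.g. by the user clicking on it.
x, y, r = 320, 240, 12                     # hypothetical position and patch radius
template = gray[y - r:y + r, x - r:x + r]

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Normalized correlation tolerates uniform lighting changes reasonably well.
    scores = cv2.matchTemplate(gray, template, cv2.TM_CCOEFF_NORMED)
    _, _, _, (tx, ty) = cv2.minMaxLoc(scores)
    x, y = tx + r, ty + r                  # new nose-tip estimate (patch center)
    cv2.circle(frame, (x, y), 4, (0, 255, 0), -1)
    cv2.imshow("nose tracking sketch", frame)
    if cv2.waitKey(1) == 27:               # Esc quits
        break
```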
... Previous research has explored the use of head movements as a human-computer interaction channel [29], including wheelchair control [30,31] for users with limited hand or arm mobility and target selection tasks on desktop [32,33] and mobile devices [34] for able-bodied users. However, these systems required users to perform predefined head movements to trigger different functions. ...
Article
Full-text available
Background: Gesture is a basic interaction channel frequently used by humans to communicate in daily life. In this paper, we explore gesture-based approaches for target acquisition in virtual and augmented reality. A typical process of gesture-based target acquisition is: when a user intends to acquire a target, she performs a gesture with her hands, head, or another part of the body; the computer senses and recognizes the gesture and infers the most likely target. Methods: We build a mental model and a behavior model of the user to study two key parts of the interaction process. The mental model describes how the user thinks up a gesture for acquiring a target, and can be seen as the intuitive mapping between gestures and targets. The behavior model describes how the user moves body parts to perform the gestures, i.e., the relationship between the gesture the user intends to perform and the signals the computer senses. Results: We present and discuss three pieces of research that focus on the mental and behavior models of gesture-based target acquisition in VR and AR. Conclusions: We show that, by leveraging these two models, interaction experience and performance can be improved in VR and AR environments. Keywords: Gesture-based interaction, Mental model, Behavior model, Virtual reality, Augmented reality
... Early works such as those of Toyama [15] and Bradski [16] already hinted at the use of head motion in a Human-Computer Interaction (HCI) context. Betke et al. [17] and Gorodnichy et al. [18] evaluated and adapted their systems for real end-users, offering them a replacement for the standard mouse device. We refer the reader to our previous work [19], which compiles work based on head motion for desktop computer access by motion-impaired users. ...
Article
Full-text available
We designed a natural user interface that gives motion-impaired people who cannot use the standard multi-touch input access to mobile devices such as tablets and smartphones. We detect the user's head motion by means of the frontal camera and use its position to interact with the mobile device. The purpose of this work is to evaluate the performance of the system. We conducted two laboratory studies with 12 participants without disabilities and a field study with four participants with multiple sclerosis (MS). The first laboratory study tested robustness and established a baseline against which to compare the results of the evaluation with the MS participants. After observing the results of the participants with disabilities, we conducted a second laboratory study in which participants without disabilities simulated the limitations of the users with MS, in order to tune the system. All participants completed a set of defined tasks: pointing and pointing-and-selecting. We logged usage and administered post-experiment questionnaires. Our results show positive outcomes for the system as an input device, although apps should follow a set of recommendations on target size and position to facilitate interaction with mobile devices for motion-impaired users. The work demonstrates the interface's potential for mobile accessibility for motion-impaired users who need alternative access devices.
... Speed of cursor input is limited by the individual soldier's tolerance for rapid and abrupt head movements. For example, in order to move the mouse across the screen, the soldier must move his or her head in some direction, then move it back in the opposite direction, and then continue movement in the original direction to bring the cursor to the desired destination (Gorodnichy, Malik, and Roth, 2001). b) Accuracy: the head and neck are not well suited to the fine motor movements required for cursor control. ...
Article
Full-text available
Head-operated computer accessibility tools (CATs) are useful solutions for people with complete head control; but for people with only reduced head control, computer access becomes very challenging, since these users depend on a single head gesture, such as a head nod or a head tilt, to interact with a computer. Any new interaction technique based on a single head gesture will clearly play an important role in developing better CATs that enhance users' self-sufficiency and quality of life. We therefore propose two novel interaction techniques, HeadCam and HeadGyro. In a nutshell, both techniques are based on our software-switch approach and can serve like traditional switches, recognizing head movements via a standard camera or a smartphone's gyroscope sensor and translating them into virtual switch presses. A usability study with 36 participants (18 motor-impaired, 18 able-bodied) was conducted to collect both objective and subjective evaluation data. While the HeadGyro software switch exhibited slightly higher performance than HeadCam on each objective evaluation metric, HeadCam was rated better in the subjective evaluation. All participants agreed that the proposed interaction techniques are promising solutions for computer access. Keywords: Interaction techniques · Universal access · Inclusive design · Switch access · Computer access · Head-operated access · Software switch · Switch-accessible interface · Head tracking · Hands-free computer access
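As a rough illustration of the software-switch idea on the gyroscope side, the sketch below debounces an angular-rate spike into a single virtual switch press; the threshold, re-arm level, and synthetic signal are assumptions, not HeadGyro's published parameters.

```python
import numpy as np

# Sketch of a gyroscope "software switch": an angular-rate spike past a
# threshold is debounced and reported as one virtual switch press.
# Threshold, sample values, and the synthetic signal are assumptions.
RATE_THRESHOLD = 60.0      # deg/s treated as a deliberate head gesture
REARM_LEVEL = 15.0         # rate must fall below this before the next press

rng = np.random.default_rng(0)
gyro_z = rng.normal(0, 5, 300)          # idle sensor noise
gyro_z[120:130] += 90.0                 # one deliberate head turn

armed = True
for t, rate in enumerate(gyro_z):
    if armed and abs(rate) > RATE_THRESHOLD:
        print(f"switch press at sample {t}")   # stand-in for a key event
        armed = False                          # debounce until signal settles
    elif abs(rate) < REARM_LEVEL:
        armed = True                           # re-arm the switch
```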
Chapter
Gesture recognition is mainly concerned with analyzing human movement. The primary goal of gesture recognition research is to create a system that can recognize specific human gestures and use them to convey information or control devices. The purpose of this paper is to interface machines directly with humans, without any physical intermediary, in an ambient environment. The work centers on tracking the nose tip: the nose tip is tracked, and mouse-positioning events are generated according to how the nose tip moves in the real-world domain. In the implementation phase, a single-camera computational approach is used to track the nose tip and recognize gestures. A reference-point location tracking method is used to locate the nose tip in successive frames.
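A minimal sketch of how tracked nose-tip coordinates could be turned into mouse-positioning events, here with a relative ("joystick") mapping; the dead zone, gain, and the pyautogui calls are illustrative assumptions rather than the chapter's actual implementation.

```python
import pyautogui  # hypothetical choice for issuing OS-level mouse events

# Sketch: displacement of the tracked nose tip from a calibrated rest
# position drives the cursor. Dead zone and gain are illustrative.
DEAD_ZONE = 8      # pixels of nose jitter to ignore
GAIN = 0.6         # cursor pixels per nose pixel per update

def update_cursor(nose_x, nose_y, rest_x, rest_y):
    dx, dy = nose_x - rest_x, nose_y - rest_y
    if abs(dx) < DEAD_ZONE:
        dx = 0
    if abs(dy) < DEAD_ZONE:
        dy = 0
    pyautogui.moveRel(GAIN * dx, GAIN * dy)   # relative mouse-positioning event
```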
Article
This paper presents a robust method for eye shape detection. The presence of eye components such as the pupil, iris, eyelid, and sclera makes the color variation between the eye and the skin very sharp. This variation is measured by applying Principal Component Analysis (PCA) to the RGB color channels of an eye image. In addition, eyes have an elliptical structure which can easily be detected in the log-polar domain. The log-polar transform (LPT) decomposes the eye into two parabolas along the theta axis, so the eye shape can be detected by searching the corresponding coordinates along the log-r axis. Based on these properties of eye representation under PCA and LPT, the eye shape is extracted regardless of changes in scale, rotation, and lighting conditions. The results show high accuracy with reduced time consumption compared to other existing methods.
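The sketch below illustrates the two ingredients, assuming a hypothetical eye patch image: PCA over the RGB pixel values to find the color axis with the sharpest eye/skin variation, followed by OpenCV's log-polar warp about an assumed eye center.

```python
import cv2
import numpy as np

# Sketch of the two ingredients, assuming a hypothetical input image:
# (1) PCA over RGB pixel values finds the color axis along which the
# eye/skin variation is sharpest; (2) a log-polar warp about an assumed
# eye center unrolls the elliptical eye outline for a 1-D search.
eye = cv2.imread("eye_patch.png")                    # hypothetical input file
pixels = eye.reshape(-1, 3).astype(np.float64)
pixels -= pixels.mean(axis=0)

cov = np.cov(pixels, rowvar=False)                   # 3x3 RGB covariance
eigvals, eigvecs = np.linalg.eigh(cov)
principal = eigvecs[:, -1]                           # axis of maximum variance
projection = (pixels @ principal).reshape(eye.shape[:2])

h, w = eye.shape[:2]
logpolar = cv2.warpPolar(
    projection.astype(np.float32), (w, h), (w / 2, h / 2),
    maxRadius=min(h, w) / 2, flags=cv2.WARP_POLAR_LOG)
```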
Conference Paper
There are over eight million disabled people in Turkey, the largest proportion of whom are people with orthopedic disabilities. A person with an orthopedic disability who can only move their head in certain directions needs applications that make their life simpler. In this study, therefore, two popular face tracking methods were applied to videos of different resolutions and their efficiency was evaluated; the face tracking method that gave the better results was then selected.
Conference Paper
Full-text available
This paper describes work in progress on the development of a virtual monitoring environment for space telemanipulation systems. The focus is on improving the performance and safety of current control systems by exploiting the potential of virtualized reality, together with good human-factors practices.
Article
Full-text available
Traditionally, image intensities have been processed to segment an image into regions or to find edge-fragments. Image intensities carry a great deal more information about three-dimensional shape, however. To exploit this information, it is necessary to understand how images are formed and what determines the observed intensity in the image. The gradient space, popularized by Huffman and Mackworth in a slightly different context, is a helpful tool in the development of new methods.
Article
Full-text available
We show how partial reduction of the self-connections of a network designed with the pseudo-inverse learning rule increases the network's direct attraction radius. A theoretical formula is obtained, and data obtained by simulation are presented.
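A small simulation of the idea, under assumed network size and reduction factor: store random bipolar patterns with the pseudo-inverse (projection) rule, scale the diagonal of the weight matrix down, and test recall from a noisy probe.

```python
import numpy as np

# Small simulation under assumed parameters: store random bipolar patterns
# with the pseudo-inverse (projection) rule W = X (X^T X)^{-1} X^T,
# partially remove the self-connections by scaling the diagonal down by a
# factor d, and test recall from a noisy probe.
rng = np.random.default_rng(1)
N, M, d = 100, 10, 0.8                      # neurons, patterns, reduction factor

X = rng.choice([-1.0, 1.0], size=(N, M))    # stored patterns as columns
W = X @ np.linalg.inv(X.T @ X) @ X.T        # pseudo-inverse learning rule
W -= d * np.diag(np.diag(W))                # partial self-connection reduction

probe = X[:, 0].copy()
flip = rng.choice(N, size=15, replace=False)
probe[flip] *= -1                           # 15 bits of noise
recalled = np.sign(W @ probe)               # one synchronous update step
print("overlap with stored pattern:", (recalled @ X[:, 0]) / N)
```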
Article
Full-text available
Detection and tracking of facial features without head-mounted devices may become required in various future visual communication applications, such as teleconferencing and virtual reality. In this paper we propose an automatic face feature detection method called edge pixel counting. Instead of using color or gray-scale information from the facial image, the proposed edge pixel counting method uses edge information to estimate face feature positions such as the eyes, nose, and mouth in the first frame of a moving facial image sequence, using a variable-size face feature template. For the remaining frames, feature tracking is carried out by alternating between deformable template matching and edge pixel counting. One main advantage of using edge pixel counting in feature tracking is that it does not require the high inter-frame correlation around the feature areas that template matching does. Experimental results are shown to demonstrate the effectiveness of the proposed method.
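A sketch of the scoring step, with assumed window size and Canny thresholds: a box filter over a binary edge map gives each candidate window's edge-pixel density, and the densest window is taken as a feature candidate.

```python
import cv2
import numpy as np

# Sketch of edge-pixel counting: run an edge detector, then score candidate
# feature windows by how many edge pixels they contain, since eyes and mouth
# are edge-dense relative to skin. Window size, Canny thresholds, and the
# input file are illustrative assumptions.
frame = cv2.imread("face_frame.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input
edges = (cv2.Canny(frame, 80, 160) > 0).astype(np.float32)

# A box filter over the binary edge map gives, at each pixel, the fraction
# of edge pixels inside the surrounding window, i.e. a normalized count.
win_w, win_h = 31, 15                       # variable-size feature template
density = cv2.boxFilter(edges, -1, (win_w, win_h))

_, best, _, (bx, by) = cv2.minMaxLoc(density)
print(f"best window centred at ({bx}, {by}) with edge density {best:.2f}")
```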
Conference Paper
As a first step towards a perceptual user interface, a computer vision color tracking algorithm is developed and applied to tracking human faces. Computer vision algorithms that are intended to form part of a perceptual user interface must be fast and efficient: they must be able to track in real time without absorbing a major share of computational resources, so that other tasks can run while the visual interface is being used. The new algorithm developed here is based on a robust non-parametric technique for climbing density gradients to find the mode (peak) of probability distributions, called the mean shift algorithm. In our case, we want to find the mode of a color distribution within a video scene. The mean shift algorithm is therefore modified to deal with dynamically changing color probability distributions derived from video frame sequences. The modified algorithm is called the Continuously Adaptive Mean Shift (CAMSHIFT) algorithm. CAMSHIFT's tracking accuracy is compared against a Polhemus tracker, and its tolerance to noise and distractors, as well as its performance, are studied. CAMSHIFT is then used as a computer interface for controlling commercial computer games and for exploring immersive 3D graphic worlds.
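Since OpenCV ships an implementation of this algorithm, a minimal CAMSHIFT loop looks roughly as follows; the initial face box is an assumed hand-picked region, where a real system would seed it from a face detector.

```python
import cv2
import numpy as np

# Minimal CAMSHIFT loop with OpenCV's built-in implementation: track the
# hue histogram of an initial face box. The initial window is an assumed
# hand-picked region; a real system would seed it with a face detector.
cap = cv2.VideoCapture(0)
ok, frame = cap.read()
x, y, w, h = 280, 200, 80, 80               # hypothetical initial face box
track_window = (x, y, w, h)

hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
roi = hsv[y:y + h, x:x + w]
# Hue histogram of the face region, masked to reasonably saturated pixels.
mask = cv2.inRange(roi, (0, 60, 32), (180, 255, 255))
hist = cv2.calcHist([roi], [0], mask, [180], [0, 180])
cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)

criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Back-projection = per-pixel probability of belonging to the face hue.
    prob = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
    rot_rect, track_window = cv2.CamShift(prob, track_window, criteria)
    cv2.polylines(frame, [np.intp(cv2.boxPoints(rot_rect))], True, (0, 255, 0), 2)
    cv2.imshow("CAMSHIFT", frame)
    if cv2.waitKey(1) == 27:
        break
```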
Article
At the heart of every model-based visual tracker lies a pose estimation routine. Recent work has emphasized the use of least-squares techniques which employ all the available data to estimate the pose. Such techniques are, however, susceptible to the sort of spurious measurements produced by visual feature detectors, often resulting in an unrecoverable tracking failure. This paper investigates an alternative approach, where a minimal subset of the data provides the pose estimate, and a robust regression scheme selects the best subset. Bayesian inference in the regression stage combines measurements taken in one frame with predictions from previous frames, eliminating the need to further filter the pose estimates. The resulting tracker performs very well on the difficult task of tracking a human face, even when the face is partially occluded. Since the tracker is tolerant of noisy, computationally cheap feature detectors, frame-rate operation is comfortably achieved on standard hardware.
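The sketch below illustrates the minimal-subset principle with OpenCV's RANSAC PnP solver, a related robust scheme rather than the paper's Bayesian regression stage; the 3-D feature model, camera intrinsics, and synthetic detections are assumptions.

```python
import cv2
import numpy as np

# Sketch of the minimal-subset idea with a RANSAC PnP solver: pose is
# estimated from small point subsets and the largest consensus set wins,
# so one spurious detection does not corrupt the estimate. The 3-D model,
# intrinsics, and synthetic detections are illustrative assumptions.
model_3d = np.array([                          # rough face-feature model (cm)
    [0.0, 0, 0], [-3, 3, -3], [3, 3, -3],      # nose tip, eye corners
    [-2, -3, -2], [2, -3, -2], [0, 5, -4],     # mouth corners, nose bridge
])
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])

# Synthesize detections by projecting a known pose, then corrupt one point
# to mimic a failed feature detector.
rvec_true = np.array([0.1, 0.2, 0.0])
tvec_true = np.array([0.0, 0.0, 60.0])
points_2d, _ = cv2.projectPoints(model_3d, rvec_true, tvec_true, K, None)
points_2d = points_2d.reshape(-1, 2)
points_2d[-1] = (90.0, 40.0)                   # gross outlier

ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    model_3d, points_2d, K, None, reprojectionError=8.0)
print("pose recovered:", ok, "inliers:", inliers.ravel())
```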
Conference Paper
We have developed an artificial neural network based gaze tracking system which can be customized to individual users. A three-layer feed-forward network, trained with standard error backpropagation, is used to determine the position of a user's gaze from the appearance of the user's eye. Unlike other gaze trackers, which normally require the user to wear cumbersome headgear or to use a chin rest to ensure head immobility, our system is entirely non-intrusive. Currently, the best intrusive gaze tracking systems are accurate to approximately 0.75 degrees. In our experiments, we have been able to achieve an accuracy of 1.5 degrees while allowing head mobility. In its current implementation, our system works at 15 Hz. In this paper we present an empirical analysis of the performance of a large number of artificial neural network architectures for this task. Suggestions for further explorations of neurally based gaze trackers are presented and related to other similar artificial neural network applications, such as autonomous road following.
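A minimal sketch of this setup: a three-layer feed-forward network mapping a low-resolution eye image to a 2-D gaze position, trained with plain backpropagation. Layer sizes, learning rate, and the synthetic stand-in data are assumptions; the original system used real eye images.

```python
import numpy as np

# Three-layer feed-forward net mapping an eye image to a 2-D gaze position,
# trained with plain backpropagation. Sizes, learning rate, and the random
# stand-in data are assumptions for illustration.
rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 15 * 40, 16, 2     # 15x40 eye image -> (gaze x, gaze y)
X = rng.normal(size=(500, n_in))           # stand-in for normalized eye images
Y = rng.uniform(0, 1, size=(500, n_out))   # stand-in for screen gaze targets

W1 = rng.normal(0, 0.05, (n_in, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.05, (n_hidden, n_out)); b2 = np.zeros(n_out)
lr = 0.01

for epoch in range(200):
    H = np.tanh(X @ W1 + b1)               # hidden activations
    P = H @ W2 + b2                        # linear output = gaze estimate
    err = P - Y                            # squared-error gradient at output
    dW2 = H.T @ err / len(X); db2 = err.mean(axis=0)
    dH = (err @ W2.T) * (1 - H ** 2)       # backpropagate through tanh
    dW1 = X.T @ dH / len(X); db1 = dH.mean(axis=0)
    W1 -= lr * dW1; b1 -= lr * db1; W2 -= lr * dW2; b2 -= lr * db2
```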
Conference Paper
This paper provides an introduction to the field of reasoning with uncertainty in Artificial Intelligence (AI), with an emphasis on reasoning with numeric uncertainty. The considered formalisms are Probability Theory and some of its generalizations, the Certainty Factor Model, Dempster-Shafer Theory, and Probabilistic Networks.
Conference Paper
Computer systems which analyse human face/head motion have attracted significant attention recently, as there are a number of interesting and useful applications, not least the goal of tracking the head in real time. A useful extension of this problem is to estimate the subject's gaze point in addition to his/her head pose. This paper describes a real-time stereo vision system which determines the head pose and gaze direction of a human subject. Its accuracy makes it useful for a number of applications including human/computer interaction, consumer research and ergonomic assessment.
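The stereo ingredient can be sketched as follows, with assumed projection matrices and synthetic feature detections: matched 2-D facial features from two calibrated views are triangulated into 3-D points, from which head pose and a gaze ray can be derived.

```python
import cv2
import numpy as np

# Sketch of the stereo ingredient with assumed calibration: matched 2-D
# facial-feature detections from two views are triangulated into 3-D
# points. Projection matrices and measurements are illustrative.
K = np.array([[700.0, 0, 320], [0, 700.0, 240], [0, 0, 1]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])                 # left camera
P2 = K @ np.hstack([np.eye(3), np.array([[-10.0], [0], [0]])])    # 10 cm baseline

# Synthetic 3-D features (two eye corners and the nose tip, in cm) and
# their projections into each view, as a feature matcher would supply.
pts3d = np.array([[-3.0, 0, 60], [3, 0, 60], [0, -2, 57]]).T
hom = np.vstack([pts3d, np.ones((1, 3))])
left = P1 @ hom; left = left[:2] / left[2]
right = P2 @ hom; right = right[:2] / right[2]

X4 = cv2.triangulatePoints(P1, P2, left, right)    # homogeneous 3-D points
X3 = (X4[:3] / X4[3]).T
print("recovered 3-D features (cm):\n", X3.round(2))
```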