Conference Paper

Hand-free head mouse control based on mouth tracking

... To select a command on the keyboard, the user points to the desired item and selects it with one of five predefined hand gestures. In [34], a human-computer interface that provides hands-free mouse control based on mouth tracking was presented. In that study, mouth movements were used for mouse movement and head-shake movements for mouse clicking. ...
... Wang Jun et al. [10] provided a model that selects the center point of the mouth as a reference point to estimate the mouth rotation angle with respect to the X-axis and Y-axis. When the face is rotated from one position to another (for example, from a frontal pose to 45 degrees left in the x direction) in the captured image, the mouth position also moves linearly from the horizontal center to a corresponding position. ...
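A minimal sketch of the linear mapping this snippet describes, inverted to recover a head angle from the mouth-center position; the function name and the 45-degree calibration range are illustrative assumptions, not values from [10]:

def mouth_offset_to_yaw(mouth_x, frame_width, max_angle=45.0):
    """Map the horizontal mouth-center position to an estimated head
    yaw angle, assuming the linear relation described in [10].

    mouth_x: x-coordinate (pixels) of the detected mouth center.
    frame_width: width of the captured frame in pixels.
    max_angle: yaw (degrees) that moves the mouth to the frame edge
               (a calibration assumption, not a value from the paper).
    """
    # Normalized offset from the horizontal center, in [-1, 1].
    offset = (mouth_x - frame_width / 2.0) / (frame_width / 2.0)
    return offset * max_angle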
Chapter
Full-text available
Tremendous advancements in technology have led to the monumental growth of various branches of computing such as Computer Vision and Human-Computer Interaction (HCI). A computer's input carries data about different properties of users, objects, or places. For instance, the mouse and keyboard work through movements performed by the end user's hands. These approaches are not appropriate for differently abled people. A way to create an application that replaces input devices such as the mouse and keyboard with the user's face is proposed. This paper introduces how the user's head motion can be used to control the mouse cursor and how gaze tracking can be used to control the keyboard. A face-detecting system precisely records the motion parameters from video in real time using a typical webcam. Although the pace is reduced when using the virtual mouse and keyboard, the performance of the system remains intact for differently abled people whose only means of communication are head movements and gaze.
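As a hedged sketch of the head-motion half of such a system, the loop below tracks the face center in webcam frames and derives a cursor displacement from its offset against a rest pose. OpenCV's bundled Haar cascade stands in for the chapter's unspecified face detector, and feeding the displacement to the pointer is left as a comment:

import cv2

# Sketch: track the face center and compare it against the first
# detected position, which is taken as the rest pose.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
cap = cv2.VideoCapture(0)
rest = None
while cap.isOpened():                            # stop with Ctrl-C
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
    if len(faces) > 0:
        x, y, w, h = faces[0]
        cx, cy = x + w // 2, y + h // 2
        if rest is None:
            rest = (cx, cy)                      # first detection = rest pose
        dx, dy = cx - rest[0], cy - rest[1]
        # Feed (dx, dy) to the pointer, e.g. pyautogui.moveRel(dx, dy).
cap.release()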
... The mouse click is implemented by holding the pointer at the desired position for a specified number of seconds. Naizhong et al. (2015) presented a human-computer interface that realizes hands-free mouse control based on mouth tracking. The authors used mouth tracking for mouse movement and head shaking for mouse click operations. ...
Article
Full-text available
In this paper, a human–machine interface for disabled people with spinal cord injuries is proposed. The designed human–machine interface is an assistive system that uses head movements and blinking for mouse control. In the proposed system, by moving one's head, the user moves the mouse pointer to the required coordinates and then blinks to send commands. The considered head mouse control is based on image processing including facial recognition, in particular, the recognition of the eyes, mouth, and nose. The proposed recognition system is based on the convolutional neural network, which uses the low‐quality images that are captured by a computer's camera. The convolutional neural network (CNN) includes convolutional layers, a pooling layer, and a fully connected network. The CNN transforms the head movements to the actual coordinates of the mouse. The designed system allows people with disabilities to control a mouse pointer with head movements and to control mouse buttons with blinks. The results of the experiments demonstrate that this system is robust and accurate. This invention allows people with disabilities to freely control mouse cursors and mouse buttons without wearing any equipment.
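A minimal sketch of a network of the kind this abstract describes, assuming PyTorch and illustrative layer sizes (the abstract does not give the authors' architecture): convolution and pooling stages feed a fully connected head that regresses the (x, y) mouse coordinates from a low-quality camera frame.

import torch
import torch.nn as nn

class HeadToCursorCNN(nn.Module):
    """Convolutional layers, pooling, and a fully connected network
    that regresses (x, y) cursor coordinates from a grayscale frame.
    All sizes are illustrative assumptions, not the authors' design."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 32x32 -> 16x16
        )
        self.regressor = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 128), nn.ReLU(),
            nn.Linear(128, 2),                    # (x, y) cursor coordinates
        )

    def forward(self, x):                         # x: (N, 1, 64, 64)
        return self.regressor(self.features(x))

model = HeadToCursorCNN()
frame = torch.randn(1, 1, 64, 64)                 # stand-in for a camera frame
print(model(frame).shape)                         # torch.Size([1, 2])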
Chapter
In this paper, computer mouse control with head movements and eye blinks is proposed for people with spinal cord injuries. The head mouse control is based on finding and predicting eye states and the direction of the head. This human-computer interface (HCI) is an assistive system for people with physical disabilities who suffer from motor neuron diseases or severe cerebral palsy. Moving the head to the right, left, up, and down moves the mouse pointer, and eye blinks send mouse button commands. Here, a left eye-blink triggers the left mouse button, a right eye-blink triggers the right mouse button, and blinking both eyes sends a "hold" command. In this system, eye blink and head movement detection use the same Convolutional Neural Network (CNN) architecture with different numbers of output classes. In the head movement part, the CNN has 5 outputs (forward, up, down, left, and right); in the eye-blink part, the CNN has 2 outputs, either opened or closed. This combined system allows people paralyzed from the neck down to control a computer using head movements and eye blinks. The test results reveal that this system is robust and accurate. This invention allows people with disabilities to use a computer with head movements and eye blinks without any extra hardware.
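A small sketch of how the two classifier outputs might be combined into mouse commands; the class names follow the abstract, while the function name and return convention are assumptions:

# Hypothetical glue logic mapping the two CNN outputs described above
# (5 head-direction classes, 2 eye-state classes per eye) to mouse
# commands. The rules paraphrase the chapter's scheme.

HEAD_CLASSES = ("forward", "up", "down", "left", "right")
EYE_CLASSES = ("opened", "closed")

def mouse_command(head_class, left_eye, right_eye):
    """Return (pointer_direction, button_action) for one frame."""
    direction = None if head_class == "forward" else head_class
    if left_eye == "closed" and right_eye == "closed":
        action = "hold"            # both eyes closed -> hold command
    elif left_eye == "closed":
        action = "left_click"      # left blink -> left mouse button
    elif right_eye == "closed":
        action = "right_click"     # right blink -> right mouse button
    else:
        action = None
    return direction, action

print(mouse_command("left", "opened", "closed"))  # ('left', 'right_click')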
... These devices capture images of the user's movements in order to translate them into relative displacements of the cursor [4]. There are modules dedicated to detecting specific parts of the human face, for example the mouth; these use advanced image processing to find a specific point that serves as a reference for the pointer's movement [5]. Off-the-shelf webcams have even been modified in modules such as SmartNav, which is one of the latest options on the market among devices for the inclusion of people with quadriplegia. ...
... In [13], a mouse controlled by electromyographic signals from two forearm muscles was implemented, reporting an accuracy of 89% for the left click and 92% for the right click. In [14], a system was proposed that controls the mouse through mouth movements captured via webcam, achieving a rate of 97.25% for left-click activation and 96.75% for right-click activation. ...
Article
Full-text available
A system is presented that emulates a mouse using movements of the head and eyelids. The position of the head controls the horizontal and vertical displacement of the cursor, and the closing of the eyelids activates the clicks of the right and left buttons. The system includes zoom, navigation shortcuts, vertical scrollbars, and menu activation according to the cursor position, which enhance functionality and facilitate handling applications. The proposed solution eliminates the restriction of direct contact with the mouse and empowers people with motor disabilities in the upper extremities to interact with the computer. The mouse emulator can also be used by people without limitations to expand command instructions. The system was tested navigating the social network Facebook, where an average speed of 382 pixels/s was obtained, with an average accuracy of 22 pixels on the X axis and 17 pixels on the Y axis. Moreover, after user interaction with the interface, improvements of 23% and 37% were observed in execution time and location accuracy, respectively. Click activation by temporary location over a menu option had a performance of 100%, while right and left clicks by eyelid closing had performances of 93% and 92%, respectively. Finally, surveys showed high satisfaction with the proposed interface during user interaction with Facebook.
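The click-by-dwell behaviour mentioned here and in the snippets above reduces to a timer keyed to pointer stability. A minimal sketch, with the dwell time and tolerance radius as illustrative assumptions rather than values from the paper:

import time

class DwellClicker:
    """Fire a click when the pointer stays within `radius` pixels of
    its anchor for `dwell_s` seconds."""

    def __init__(self, dwell_s=2.0, radius=15):
        self.dwell_s = dwell_s
        self.radius = radius
        self.anchor = None
        self.since = None

    def update(self, x, y):
        now = time.monotonic()
        if (self.anchor is None or
                abs(x - self.anchor[0]) > self.radius or
                abs(y - self.anchor[1]) > self.radius):
            self.anchor, self.since = (x, y), now   # pointer moved: restart
            return False
        if now - self.since >= self.dwell_s:
            self.since = now                        # avoid repeated clicks
            return True                             # dwell elapsed: click
        return False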
Article
Full-text available
With the invention of fast USB interfaces and the recent increase of computer power and decrease of camera cost, it has become very common to see a camera on top of a computer monitor. Vision-based games and interfaces, however, are still not common, even despite the realization of the benefits vision could bring: hands-free control, multiple-user interaction, etc. The reason for this lies in the inability to track human faces in video both precisely and robustly. This paper describes a face tracking technique based on tracking a convex-shape nose feature which resolves this problem. The technique has been successfully applied to interactive computer games and perceptual user interfaces. These results ...
Article
Full-text available
We discuss a multilinear generalization of the singular value decomposition. There is a strong analogy between several properties of the matrix and the higher-order tensor decomposition; uniqueness, link with the matrix eigenvalue decomposition, first-order perturbation effects, etc., are analyzed. We investigate how tensor symmetries affect the decomposition and propose a multilinear generalization of the symmetric eigenvalue decomposition for pair-wise symmetric tensors.
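For concreteness, a small NumPy sketch of the higher-order SVD this abstract refers to: each mode matrix comes from the SVD of the corresponding mode-n unfolding, and the core tensor follows by multilinear products. The function names are mine, and the reconstruction check at the end confirms the decomposition is exact for full mode matrices.

import numpy as np

def hosvd(T):
    """Higher-order SVD: T = S x1 U[0] x2 U[1] ... xN U[N-1], where U[n]
    holds the left singular vectors of the mode-n unfolding of T and S
    is the all-orthogonal core tensor."""
    U = [np.linalg.svd(np.moveaxis(T, n, 0).reshape(T.shape[n], -1),
                       full_matrices=False)[0]
         for n in range(T.ndim)]
    S = T
    for n, Un in enumerate(U):
        # Mode-n product with Un.T: contract mode n against Un's columns.
        S = np.moveaxis(np.tensordot(Un.T, np.moveaxis(S, n, 0), axes=1), 0, n)
    return S, U

T = np.random.rand(3, 4, 5)
S, U = hosvd(T)
R = S
for n, Un in enumerate(U):
    R = np.moveaxis(np.tensordot(Un, np.moveaxis(R, n, 0), axes=1), 0, n)
print(np.allclose(R, T))  # True: multiplying the core back recovers T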
Article
Full-text available
We describe an algorithm for detecting human faces and facial features, such as the location of the eyes, nose and mouth. First, a supervised pixel-based color classifier is employed to mark all pixels that are within a prespecified distance of “skin color”, which is computed from a training set of skin patches. This color-classification map is then smoothed by Gibbs random field model-based filters to define skin regions. An ellipse model is fit to each disjoint skin region. Finally, we introduce symmetry-based cost functions to search the center of the eyes, tip of nose, and center of mouth within ellipses whose aspect ratio is similar to that of a face.
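A hedged sketch of the first stage only: a pixel classifier that marks pixels within a prespecified distance of "skin color" learned from training patches. The chromaticity space and threshold are assumptions, and the Gibbs-filter smoothing and ellipse fitting are omitted.

import numpy as np

def skin_mask(rgb, skin_pixels, max_dist=2.5):
    """Boolean mask of pixels whose Mahalanobis distance to the mean of
    a training set of skin patches is below `max_dist`."""
    def chroma(px):
        # Normalized chromaticity (r, g), a common illuminance-robust choice.
        px = px.astype(float) + 1e-6
        s = px.sum(axis=-1, keepdims=True)
        return (px / s)[..., :2]

    train = chroma(skin_pixels.reshape(-1, 3))    # (N, 2) training samples
    mean = train.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(train, rowvar=False))

    d = chroma(rgb) - mean                        # (H, W, 2) deviations
    m2 = np.einsum('...i,ij,...j->...', d, cov_inv, d)
    return m2 < max_dist ** 2

# Usage: mask = skin_mask(frame_rgb, labeled_skin_patch)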
Conference Paper
Full-text available
The human nose, while in many cases being the only facial feature clearly visible during head motion, seems to be very undervalued in face tracking technology. This paper shows theoretically and through experiments conducted with ordinary USB cameras that, by properly defining the nose as an extremum of the 3D curvature of the nose surface, the nose becomes the most robust feature: it can be seen for almost any position of the head and can be tracked very precisely, even with low-resolution cameras.
Article
Full-text available
In the Fall of 2000, we collected a database of more than 40,000 facial images of 68 people. Using the Carnegie Mellon University 3D Room, we imaged each person across 13 different poses, under 43 different illumination conditions, and with four different expressions. We call this the CMU pose, illumination, and expression (PIE) database. We describe the imaging hardware, the collection procedure, the organization of the images, several possible uses, and how to obtain the database.
Article
Full-text available
Human face detection plays an important role in applications such as video surveillance, human computer interface, face recognition, and face image database management. We propose a face detection algorithm for color images in the presence of varying lighting conditions as well as complex backgrounds. Based on a novel lighting compensation technique and a nonlinear color transformation, our method detects skin regions over the entire image and then generates face candidates based on the spatial arrangement of these skin patches. The algorithm constructs eye, mouth, and boundary maps for verifying each face candidate. Experimental results demonstrate successful face detection over a wide range of facial variations in color, position, scale, orientation, 3D pose, and expression in images from several photo collections (both indoors and outdoors).
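A rough OpenCV sketch of the chrominance-based skin segmentation such a pipeline builds on; the fixed Cr/Cb bounds are common illustrative values, not the paper's lighting compensation or nonlinear transform:

import cv2
import numpy as np

def skin_regions_ycrcb(bgr):
    """Keep pixels inside a fixed Cr/Cb range and group them into
    candidate regions for a downstream face verifier."""
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
    lower = np.array([0, 133, 77], np.uint8)    # Y, Cr, Cb lower bounds
    upper = np.array([255, 173, 127], np.uint8)
    mask = cv2.inRange(ycrcb, lower, upper)
    n, labels = cv2.connectedComponents(mask)
    return mask, n - 1                          # mask and region count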
Article
Full-text available
Detecting faces in images with complex backgrounds is a difficult task. Our approach, which obtains state-of-the-art results, is based on a neural network model: the constrained generative model (CGM). Generative, since the goal of the learning process is to evaluate the probability that the model has generated the input data, and constrained since some counter-examples are used to increase the quality of the estimation performed by the model. To detect side-view faces and to decrease the number of false alarms, a conditional mixture of networks is used. To decrease the computational time cost, a fast search algorithm is proposed. The level of performance reached, in terms of detection accuracy and processing time, allows us to apply this detector to a real-world application: the indexing of images and videos.
Article
Full-text available
Humans detect and interpret faces and facial expressions in a scene with little or no effort. Still, development of an automated system that accomplishes this task is rather difficult. There are several related problems: detection of an image segment as a face, extraction of the facial expression information, and classification of the expression (e.g., in emotion categories). A system that performs these operations accurately and in real time would form a big step in achieving a human-like interaction between man and machine. The paper surveys the past work in solving these problems. The capability of the human visual system with respect to these problems is discussed, too. It is meant to serve as an ultimate goal and a guide for determining recommendations for development of an automatic facial expression analyzer.
Article
Human face tracking (HFT) is one of several technologies useful in vision-based interaction (VBI), which is one of several technologies useful in the broader area of perceptual user interfaces (PUI). In this paper we motivate our interests in PUI and VBI, and describe our recent efforts in various aspects of face tracking in the Interaction Lab at UCSB. The HFT methods (GWN, EHT, and CFD), in the context of VBI and PUI, are part of an overall “TLA approach” to face tracking. TLA /T-L-A/ n. [Three-Letter Acronym] 1. Self-describing abbreviation for a species with which computing terminology is infested. 2. Any confusing acronym… (From the Jargon File v. 4.3.1)
Article
In this paper, we present a novel method to recognize multi-pose face images. In the algorithm, we first estimate the camera pose using an estimation algorithm that uses only a 2D face image. The estimated data are then applied to the viewing position of a 3D face model in a virtual environment to synthesize a 2D verification exemplar database, so that the face pose in these synthesized 2D verification images matches that of the input ones. An nD-PCA-based algorithm is then employed to extract eigenvectors of the exemplar images and the input ones. Finally, a classifier based on the cosine distance method is employed to classify the verifier. We have also carried out a simulation experiment on a Windows XP system to evaluate the efficiency of the proposed algorithm. The experimental results show that the recognition rate reaches 92% for the frontal view and 40% for a nearly profile view, and that the robustness of our proposed algorithm is much better than that of several conventional approaches in the multi-pose face recognition field.
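A toy sketch of the final classification stage, assuming the PCA features are already extracted; the gallery layout and names are illustrative, not the paper's:

import numpy as np

def cosine_classify(probe, gallery):
    """Return the gallery identity whose feature vector has the
    smallest cosine distance to the probe vector."""
    def cos_dist(a, b):
        return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return min(gallery, key=lambda k: cos_dist(probe, gallery[k]))

# Toy usage with random stand-ins for eigen-feature vectors:
rng = np.random.default_rng(0)
gallery = {"person_a": rng.normal(size=50), "person_b": rng.normal(size=50)}
probe = gallery["person_a"] + 0.1 * rng.normal(size=50)
print(cosine_classify(probe, gallery))  # person_a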
Article
For some time, graphical user interfaces (GUIs) have been the dominant platform for human computer interaction. The GUI-based style of interaction has made computers simpler and easier to use, especially for office productivity applications where computers are used as tools to accomplish specific tasks. However, as the way we use computers changes and computing becomes more pervasive and ubiquitous, GUIs will not easily support the range of interactions necessary to meet users' needs. In order to accommodate a wider range of scenarios, tasks, users, and preferences, we need to move toward interfaces that are natural, intuitive, adaptive, and unobtrusive. The aim of a new focus in HCI, called Perceptual User Interfaces (PUIs), is to make human-computer interaction more like how people interact with each other and with the world. This paper describes the emerging PUI field and then reports on three PUI-motivated projects: computer vision-based techniques to visually perceive relevant information about the user.
Article
In this article a segmentation method is described for the face skin of people of any race in real time, in an adaptive and unsupervised way, based on a Gaussian model of the skin color (that will be referred to as Unsupervised and Adaptive Gaussian Skin-Color Model, UAGM). It is initialized by clustering and it is not required that the user introduces any initial parameters. It works with complex color images, with random backgrounds and it is robust to lighting and background changes. The clustering method used, based on the Vector Quantization (VQ) algorithm, is compared to other optimum model selection methods, based on the EM algorithm, using synthetic data. Finally, real results of the proposed method and conclusions are shown.
Article
This work is motivated by the goal of providing a non-contact means of controlling the mouse pointer on a computer system for people with motor difficulties using low-cost, widely available hardware. The required information is derived from video data captured using a web camera mounted below the computer's monitor. A colour filter is used to identify skin coloured regions. False positives are eliminated by optionally removing background regions and by applying statistical rules that reliably identify the largest skin-coloured region, which is assumed to be the user's face. The nostrils are then found using heuristic rules. The instantaneous location of the nostrils is compared with their at-rest location; any significant displacement is used to control the mouse pointer's movement. The system is able to process 18 frames per second at a resolution of 320 by 240 pixels, or 30 fps at 160 by 120 pixels using moderately powerful hardware (a 500 MHz Pentium III desktop computer).
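The displacement-to-pointer rule this abstract describes can be sketched in a few lines; the dead-zone and gain values are assumptions for illustration, not the paper's calibration:

def pointer_velocity(nostrils, rest, dead_zone=5, gain=0.4):
    """Turn the displacement of the nostril midpoint from its at-rest
    location into a pointer velocity. Displacements inside `dead_zone`
    pixels are ignored so small head movements leave the pointer still."""
    dx = nostrils[0] - rest[0]
    dy = nostrils[1] - rest[1]
    vx = gain * dx if abs(dx) > dead_zone else 0.0
    vy = gain * dy if abs(dy) > dead_zone else 0.0
    return vx, vy                     # add to the pointer position each frame

print(pointer_velocity((168, 130), (160, 120)))  # (3.2, 4.0)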
Article
In seeking hitherto-unused methods by which users and computers can communicate, we investigate the usefulness of eye movements as a fast and convenient auxiliary user-to-computer communication mode. The barrier to exploiting this medium has not been eye-tracking technology but the study of interaction techniques that incorporate eye movements into the user-computer dialogue in a natural and unobtrusive way. This paper discusses some of the human factors and technical considerations that arise in trying to use eye movements as an input medium, describes our approach and the first eye movement-based interaction techniques that we have devised and implemented in our laboratory, and reports our experiences and observations on them.
Conference Paper
In this paper, a synthetic exemplar-based framework for face recognition with variant pose and illumination is proposed. Our purpose is to construct a face recognition system according to only one single frontal face image of each person for recognition. The framework consists of three main parts. First, a deformation-based 3D face modeling technique is introduced to create an individual 3D face model from a single frontal face image of a person with a generic 3D face model. Then, the virtual faces for recognition at various lightings and views are synthesized. Finally, an eigenfaces-based classifier is constructed where the synthesized virtual faces are used as training exemplars. The experimental results show that the proposed 3D face modeling technique is efficient and the synthetic face exemplars can significantly improve the accuracy of face recognition with variant pose and illumination.
Conference Paper
In this paper, we propose 'Hands-free Interface', a technology that is intended to replace conventional computer screen pointing devices for the use of the disabled. We describe a real-time, nonintrusive, fast, and affordable technique for facial feature tracking. Our technique ensures accuracy and is robust to feature occlusion as well as target scale variations and rotations. It is based on a novel variant of template matching techniques. A Kalman filter is also used for adaptive search window positioning and sizing.
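A condensed sketch of that loop in OpenCV terms: a constant-velocity Kalman filter positions the search window, and template matching finds the feature inside it. The window size and noise covariances are illustrative assumptions, not the paper's values.

import cv2
import numpy as np

kf = cv2.KalmanFilter(4, 2)                       # state: x, y, vx, vy
kf.transitionMatrix = np.array([[1, 0, 1, 0], [0, 1, 0, 1],
                                [0, 0, 1, 0], [0, 0, 0, 1]], np.float32)
kf.measurementMatrix = np.eye(2, 4, dtype=np.float32)
kf.processNoiseCov = 1e-3 * np.eye(4, dtype=np.float32)
kf.measurementNoiseCov = 1e-1 * np.eye(2, dtype=np.float32)
kf.errorCovPost = np.eye(4, dtype=np.float32)
# Initialize kf.statePost with the first detected feature position
# before calling track() in a loop.

def track(frame_gray, template, win=40):
    """Predict the feature position, match the template only inside the
    predicted search window, then correct the filter with the match."""
    px, py = kf.predict()[:2].ravel().astype(int)
    h, w = template.shape
    x0, y0 = max(px - win, 0), max(py - win, 0)
    roi = frame_gray[y0:y0 + 2 * win + h, x0:x0 + 2 * win + w]
    res = cv2.matchTemplate(roi, template, cv2.TM_CCOEFF_NORMED)
    _, _, _, (mx, my) = cv2.minMaxLoc(res)        # best-match corner in roi
    x, y = x0 + mx, y0 + my                       # back to frame coordinates
    kf.correct(np.array([[x], [y]], np.float32))
    return x, y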
Real-time detection of eyes and faces. In Workshop on Perceptual User Interfaces
  • C Morimoto
  • D Koons
  • A Amir
  • M Flickner
A novel approach to fast learning: smart neural nets. Neural Networks
  • B W Dahanayake
  • A R Upton
Human Head Mouse System Based on Facial Gesture Recognition
  • Wei Li
  • E J Lee