Conference Paper

'Hands-free interface' - a fast and accurate tracking procedure for real time human computer interaction


Abstract

In this paper, we propose 'Hands-free Interface', a technology intended to replace conventional computer screen pointing devices for use by people with disabilities. We describe a real time, non-intrusive, fast and affordable technique for facial feature tracking. Our technique ensures accuracy and is robust to feature occlusion as well as to target scale variations and rotations. It is based on a novel variant of template matching techniques. A Kalman filter is also used for adaptive search window positioning and sizing.
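The combination of template matching with a Kalman-guided search window can be sketched as follows. This is a minimal illustration only (plain SSD matching and a constant-velocity prediction step, with all names assumed), not the authors' novel variant:

```python
# Illustrative sketch (not the authors' exact variant): SSD template matching
# restricted to a search window centred on a constant-velocity prediction,
# in the spirit of Kalman-guided adaptive window positioning.

def ssd(patch, template):
    """Sum of squared differences between two equally sized 2D patches."""
    return sum((p - t) ** 2
               for prow, trow in zip(patch, template)
               for p, t in zip(prow, trow))

def crop(image, top, left, h, w):
    """Extract an h x w sub-image with its top-left corner at (top, left)."""
    return [row[left:left + w] for row in image[top:top + h]]

def match_in_window(image, template, center, win):
    """Exhaustive SSD search limited to a (2*win+1)^2 window around `center`."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    cy, cx = center
    best, best_pos = None, None
    for y in range(max(0, cy - win), min(ih - th, cy + win) + 1):
        for x in range(max(0, cx - win), min(iw - tw, cx + win) + 1):
            score = ssd(crop(image, y, x, th, tw), template)
            if best is None or score < best:
                best, best_pos = score, (y, x)
    return best_pos

def predict(prev, velocity):
    """Constant-velocity prediction step (the Kalman predict stage, simplified)."""
    return (prev[0] + velocity[0], prev[1] + velocity[1])
```

The prediction keeps the search window small, which is what makes frame-to-frame matching cheap enough for real time use.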


... Other systems that have been designed for disabled users [27,79] lack studies with their intended users. These technologies are specifically designed as hands-free systems to replace the standard mouse for users with mobility impairments in the upper extremities, yet their papers make no mention of such users participating in the design process. ...
... Kjeldsen [56] does not track the nose directly; instead he tracks the user's face, but the pointer moves to approximately where the user's nose is pointing. Morris and Chauhan [71] and El-Afifi [27] choose to track the user's nostrils and use their position to define the head pose. They mention two advantages of using nostrils: they are clearly separated from any other features that could be confused with them, and they are relatively small and situated away from the face boundary, which means that they remain visible even under extreme facial poses. ...
... Hannuksela et al. [42] did not test the system with disabled users, nor did they present an exhaustive evaluation; they draw the trajectory of the mouse and state that the movement seems roughly correct. Nothing is said about the users participating in the evaluation of El-Afifi et al.'s [27] system. They tried the system with a drawing program and with the Snake game, which involves simple directional movements, but with no evidence of the results. ...
... Another proposed system [17] makes use of nostrils for tracking. It is founded on a novel template matching technique [18] that solves the problem of occlusion. ...
Article
Full-text available
Graphical User Interfaces based on the WIMP (Window, Icon, Menu, Pointer) paradigm have been a popular choice of interface for many systems and have made Human-Computer Interaction simpler and easier. In the past few decades great research efforts have been made to create user-friendly Human-Computer Interfaces. Perceptual User Interfaces are the focus of this new form of HCI (Human-Computer Interaction). In this study we describe different vision-based perceptive interfaces that make use of different computer vision techniques for tracking and gesture recognition and that may serve as an alternative to the conventional mouse. The major focus of the work presented here is a critical evaluation of different vision-based interfaces that will give the perceptual ability of vision to computers and make human-computer interaction more human-friendly. The comparative study and critical evaluation using Human-Computer Interaction approaches highlight the strengths and weaknesses of the existing vision-based interfaces, and we summarize the literature by highlighting its limitations and strengths.
... These features of ASL make it an ideal solution for controlling home appliances through body language. In the presented system we focus on hand motion rather than facial feature tracking; however, the adaptive modules presented in this paper may well be integrated into facial tracking systems [1][3] to enhance their accuracy. ...
Article
In this paper, we present a real time system for recognizing isolated and sentence American Sign Language (ASL) gestures. We mainly focus on image preprocessing to isolate hand pose and motion, which allows fast and accurate recognition. Preprocessing is accomplished through a series of adaptive modules forming a very adaptive system. Such a system achieved a high accuracy rate even with a noncomplex gesture recognition technique. This evaluation was conducted using a set of 5 gestures and translation in 4 different directions.
... Among human body parts, the face has been the most studied for visual human tracking and perceptual user interfaces, because face appearance is more statistically consistent in color, shape and texture, and thus allows computers to detect and track it with robustness and accuracy. With different assumptions, people have proposed to navigate the mouse with the movement of the eyes [4][5][6][7], nose [8][9], and nostrils [10][11], etc. For eye tracking, people usually employ infrared lighting cameras and take advantage of the fact that the iris of the human eye has large infrared light reflection. ...
Article
This paper introduces a novel camera mouse driven by visual face tracking based on a 3D model. As the camera becomes standard configuration for personal computers (PCs) and computation speed increases, achieving human–machine interaction through visual face tracking becomes a feasible solution to hands-free control. Human facial movements can be broken down into rigid motions, such as rotation and translation, and non-rigid motions such as opening, closing, and stretching of the mouth. First, we describe our face tracking system which can robustly and accurately retrieve these motion parameters from videos in real time [H. Tao, T. Huang, Explanation-based facial motion tracking using a piecewise Bezier volume deformation model, in: Proceedings of IEEE Computer Vision and Pattern Recogintion, vol. 1, 1999, pp. 611–617]. The retrieved (rigid) motion parameters can be employed to navigate the mouse cursor; the detection of mouth (non-rigid) motions triggers mouse events in the operating system. Three mouse control modes are investigated and their usability is compared. Experiments in the Windows XP environment verify the convenience of our camera mouse in hands-free control. This technology can be an alternative input option for people with hand and speech disability, as well as for futuristic vision-based games and interfaces.
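The rigid-motion-to-cursor idea can be sketched roughly as follows, assuming an absolute mapping from yaw/pitch angles to screen coordinates and a simple mouth-openness threshold for firing clicks. The screen size, angle ranges, and threshold are illustrative values, not taken from the paper:

```python
# Hypothetical sketch of one camera-mouse control mode: rigid head-pose angles
# drive the cursor absolutely, and a non-rigid mouth-openness measure fires a
# click on the closed -> open transition. All constants are assumptions.

SCREEN_W, SCREEN_H = 1920, 1080
YAW_RANGE = 30.0    # degrees of head yaw mapped across half the screen width
PITCH_RANGE = 20.0  # degrees of head pitch mapped across half the screen height

def pose_to_cursor(yaw_deg, pitch_deg):
    """Absolute mapping: centre pose -> screen centre, clamped to the screen."""
    x = SCREEN_W / 2 + (yaw_deg / YAW_RANGE) * (SCREEN_W / 2)
    y = SCREEN_H / 2 + (pitch_deg / PITCH_RANGE) * (SCREEN_H / 2)
    return (min(max(int(x), 0), SCREEN_W - 1),
            min(max(int(y), 0), SCREEN_H - 1))

def mouth_click(openness, threshold=0.6, was_open=False):
    """Edge-triggered click: fire only on the closed -> open transition."""
    is_open = openness > threshold
    click = is_open and not was_open
    return click, is_open
```

Edge-triggering the click on the transition, rather than while the mouth stays open, avoids firing a stream of click events from a single gesture.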
... As we examine vision-based cursor control systems available on the market [2,3,4,5,6] or in the research literature [24,10,11,12,8,23,13,19,20] dedicated to designing vision-based perceptual user interfaces (PUIs), we note that in order to provide a user with knowledge of how the face is detected by the camera, conventional PUIs use a separate window somewhere on the screen in addition to the normal cursor, which shows the captured video image of the face with the results of vision detection overlaid on top of it. This is the case for the Nouse (the additional window showing the result of nose tracking can be seen in Figure 1a). ...
Article
Full-text available
This paper introduces a new concept for the area of user interfaces called Perceptual Cursor. As opposed to the regular cursor, which is by definition an item to mark a position of applied control, Perceptual Cursor also serves the purpose of providing visual information about the perceived sensory data governing the cursor control. By fulfilling two purposes in one object, Perceptual Cursor eliminates one of the major problems in designing hands-free user interfaces: that of the absence of "touch" or tangible feedback in these interfaces. The paper also introduces a number of propositions related to cursor design that need to be followed when designing a hands-free cursor-controlled system and describes two specific designs for the Perceptual Cursor that are currently being tested with the hands-free addition of the Perceptual Vision Interface Nouse.
... As we examine vision-based cursor control systems available on the market [1,2,3,4,5] or in the research literature [18,8,9,10,7,17,11,13,16], we note that in order to provide a user with knowledge of how the face is detected by the camera, these PUIs use a separate window somewhere on the screen in addition to the normal cursor, which shows the captured video image of the face with the results of vision detection overlaid on top of it. This is how the visual feedback is provided for CameraMouse [1], Quilieye [4] and IBM HeadTracker, which are examples of the commercially sold vision-based computer control programs that have been tested in the SCO Health Center. The drawback of this visual feedback is that the user has to look both at the cursor (to know where/how to move it, e.g. to open a Windows menu) and at the image showing the results captured by the video camera (to know how to move his head in order to achieve the desired cursor motion). ...
Article
Full-text available
Normal work with a computer implies being able to perform the following three computer control tasks: 1) pointing, 2) clicking, and 3) typing. Many attempts have been made to make it possible to perform these tasks hands-free using a video image of the user as input. Nevertheless, as reported by rehabilitation center practitioners, no marketable solution making vision-based hands-free computer control a commonplace reality for disabled users has been produced as of yet. Here we present the Nouse Perceptual Vision Interface.
... Among human body parts, the face has been the most studied for visual human tracking and perceptual user interfaces, because face appearance is more statistically consistent in color, shape and texture, and thus allows computers to detect and track it with robustness and accuracy. With different assumptions, people have proposed to navigate the mouse with the movement of the eyes [4][5][6][7], nose [8][9], and nostrils [10][11], etc. For eye tracking, people usually employ infrared lighting cameras and take advantage of the fact that the iris of the human eye has large infrared light reflection. ...
Conference Paper
This paper introduces a novel camera mouse driven by a 3D model based visual face tracking technique. As the camera becomes a standard configuration for personal computers (PCs) and computation speed keeps increasing, achieving human-machine interaction through visual face tracking becomes a feasible solution to hands-free control. Human facial movement can be decomposed into rigid movement, e.g. rotation and translation, and non-rigid movement, such as the opening and closing of the mouth and eyes, and facial expressions. We introduce our visual face tracking system that can robustly and accurately retrieve these motion parameters from video in real time. After calibration, the retrieved head orientation and translation can be employed to navigate the mouse cursor, and the detection of mouth movement can be utilized to trigger mouse events. Three mouse control modes are investigated and compared. Experiments in the Windows XP environment verify the convenience of navigation and operations using our face mouse. This technique can be an alternative input device for people with hand and speech disability and for futuristic vision-based games and interfaces.
Book
Full-text available
This book is the result of teaching and research in the areas of Virtual Reality, Augmented Reality and vision-based interfaces, carried out at three universities, two in Argentina (Universidad Nacional de La Plata (UNLP) and Universidad Nacional del Centro de la Provincia de Buenos Aires (UNICEN)) and one in Spain (Universidad de las Islas Baleares (UIB)). The text is structured in two main parts: the first related to Virtual Reality (VR) and Augmented Reality (AR), and the second related to so-called advanced or Vision-Based Interfaces (VBI). The first part consists of three chapters. Chapter 1 introduces concepts and technology shared by virtual reality and augmented reality applications. Chapter 2 presents the current challenges in developing training simulators that use virtual reality, and describes the simulators developed by the Pladema Institute at UNICEN. Chapter 3 covers Augmented Reality, its foundations, tracking algorithms and libraries used for application development. The content of this chapter is used as teaching material in a course of the Doctorate in Computer Science at UNLP, currently taught by a lecturer of that institution who is also a researcher at III-LIDI. The second part, Advanced Interfaces, consists of two chapters. The material included is the result of the teaching and research of two researchers of the Graphics, Computer Vision and Artificial Intelligence Unit at UIB. Chapter 4 introduces vision-based interfaces and explains the SINA project developed at UIB. Chapter 5 presents multi-touch interaction systems and explains a case study of the design of a multi-touch table.
Article
Full-text available
Appearance preservation aims to estimate reflectance functions to model the way real materials interact with light. These functions are especially useful in digital preservation of heritage and realistic rendering, as they reproduce the appearance of real materials in virtual scenes. This work proposes an image-based process that aims to preserve the appearance of surfaces whose reflectance properties are spatially variant. During image acquisition, this process considers the whole environment as a source of light over the area to be preserved and, assuming the environment is static, it does not require controlled environments. To achieve this goal, the scene geometry and relative camera positions are approximated from a set of HDR images taken inside the real scene, using a combination of structure from motion and multi-view stereo methods. Based on this data, a set of unstructured lumigraphs is traced, on-demand, inside the reconstructed scene. The color information retrieved from these lumigraphs is then used to estimate a linear combination of basis BRDFs for a grid of points in the surface area, defining thus its SVBRDF. This paper details the proposed method and presents the results obtained using real and synthetic settings. It shows that considering the whole environment as a source of light is a viable approach to obtain reliable results and to enable more flexible acquisition setups.
Article
Full-text available
Augmentative and Alternative Communication (AAC) aims to complement or replace spoken language to compensate for expression difficulties faced by people with speech impairments. Computing systems have been developed to support AAC; however, partially due to technical problems, poor interfaces, and limited interaction functions, AAC systems are not widely adopted and used, and therefore reach a limited audience. This paper proposes a methodology to support AAC for people with motor impairments, using computer vision and machine learning techniques to allow for personalized gestural interaction. The methodology was applied in a pilot system used both by volunteers without disabilities and by volunteers with motor and speech impairments to create datasets with personalized gestures. The created datasets and a public dataset were used to evaluate the technologies employed for gesture recognition, namely the Support Vector Machine (SVM) and a Convolutional Neural Network (using Transfer Learning), and for motion representation, namely the conventional Motion History Image and the Optical Flow-Motion History Image (OF-MHI). Results obtained from the estimation of prediction error using K-fold cross-validation suggest that SVM associated with OF-MHI presents slightly better results for gesture recognition. The results indicate the technical feasibility of the proposed methodology, which uses a low-cost approach, and reveal the challenges and specific needs observed during the experiment with the target audience.
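The conventional Motion History Image mentioned above can be sketched in a few lines. This is the standard MHI update rule (pixels covered by the current motion mask are stamped with a maximum timestamp, all others decay), not the paper's OF-MHI variant:

```python
# Minimal sketch of the conventional Motion History Image (MHI) update:
# pixels where the current motion mask is "on" are set to the maximum
# timestamp tau; all other pixels decay by one per frame, down to zero.

def update_mhi(mhi, motion_mask, tau=10):
    """One MHI step; `mhi` and `motion_mask` are equally sized 2D lists."""
    return [[tau if m else max(0, v - 1)
             for v, m in zip(vrow, mrow)]
            for vrow, mrow in zip(mhi, motion_mask)]
```

Recent motion thus appears bright and older motion fades, so a single image encodes both where and roughly when movement happened, which is what makes it a usable input representation for a classifier such as an SVM.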
Article
Full-text available
Head-operated computer accessibility tools (CATs) are useful solutions for those with complete head control; but when it comes to people with only reduced head control, computer access becomes a very challenging task, since the users depend on a single head gesture like a head nod or a head tilt to interact with a computer. It is obvious that any new interaction technique based on a single head gesture will play an important role in developing better CATs to enhance users' self-sufficiency and quality of life. Therefore, we propose two novel interaction techniques, namely HeadCam and HeadGyro, within this study. In a nutshell, both interaction techniques are based on our software switch approach and can serve like traditional switches by recognizing head movements via a standard camera or the gyroscope sensor of a smartphone and translating them into virtual switch presses. A usability study with 36 participants (18 motor-impaired, 18 able-bodied) was also conducted to collect both objective and subjective evaluation data. While the HeadGyro software switch exhibited slightly higher performance than HeadCam for each objective evaluation metric, HeadCam was rated better in the subjective evaluation. All participants agreed that the proposed interaction techniques are promising solutions for the computer access task. Keywords: Interaction techniques · Universal access · Inclusive design · Switch access · Computer access · Head-operated access · Software switch · Switch-accessible interface · Head tracking · Hands-free computer access
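The software switch idea, recognizing a head gesture and translating it into a virtual switch press, can be sketched as a simple thresholded detector over an angular-rate signal. The threshold, refractory period, and function name are assumptions for illustration, not the authors' implementation:

```python
# Hedged sketch of a HeadGyro-style software switch: a virtual switch press is
# emitted when the angular rate exceeds a threshold, with a refractory period
# afterwards so one head gesture does not trigger a burst of presses.

def detect_presses(rates, threshold=60.0, refractory=5):
    """Return the sample indices at which a virtual switch press is emitted."""
    presses, cooldown = [], 0
    for i, r in enumerate(rates):
        if cooldown > 0:
            cooldown -= 1
        elif abs(r) >= threshold:
            presses.append(i)
            cooldown = refractory
    return presses
```

The refractory period plays the same debouncing role as in a physical switch, which matters for users whose head movements are slow or tremulous.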
Article
Full-text available
This work presents a web-based educational system for creating multimedia presentations of cultural objects. Through a 3D web association mechanism, multimedia content such as text, images, video and audio can be associated with the 3D model or with user-defined regions of interest. From these associations, multimedia presentations are generated automatically by the system. The system enables the visualization of multimedia presentations of high-resolution 3D models in web browsers. The proposed solution is aimed at cultural heritage scholars and students. Teachers can create interactive multimedia presentations for remote study of the objects. Likewise, professionals in areas related to cultural heritage preservation can contribute their personal data collections to the creation of rich presentations.
Article
Digital reconstruction of mechanically-shredded documents has received increasing attention in recent years, mainly for historical and forensic needs. Computational methods to solve this problem are highly desirable in order to mitigate the time-consuming human effort and to preserve document integrity. The reconstruction of strip-shredded documents is accomplished by horizontally splicing pieces so that the resulting sequence (solution) is as similar as possible to the original document. In this context, a central issue is the quantification of the fit between the pieces (strips), which generally involves stating a function that associates a pair of strips with a real value indicating the fitting quality. This problem is even more challenging for text documents, such as business letters or legal documents, since they contain little color information. The system proposed here addresses this issue by exploring character shapes as visual features for compatibility computation. Experiments conducted with real mechanically-shredded documents showed that our approach outperformed other popular techniques in the literature in accuracy, considering documents with (almost) only textual content.
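A generic example of such a pairwise fitting function, using raw boundary pixel differences rather than the paper's character-shape features, might look like:

```python
# Illustrative compatibility function for a pair of strips (a generic boundary
# measure, not the paper's character-shape approach): a lower cost means the
# right edge of strip A flows better into the left edge of strip B.
# Strips are 2D lists of grayscale values, one row per scanline.

def edge_cost(strip_a, strip_b):
    """Sum of absolute differences between A's rightmost and B's leftmost column."""
    return sum(abs(row_a[-1] - row_b[0])
               for row_a, row_b in zip(strip_a, strip_b))
```

Any such function turns reconstruction into a combinatorial ordering problem over pairwise costs, which is why its quality dominates the quality of the final solution; on text documents a purely color-based cost is weak, motivating the shape features the paper proposes.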
Article
Full-text available
RGB-D cameras have great potential to solve several problems arising during the digitization of objects, such as cultural heritage. Three-dimensional (3D) digital preservation is usually performed with high-end 3D scanners, as the 3D points generated by this type of equipment are on average millimeter to sub-millimeter accurate. The downside of 3D scanners, in addition to the high cost, is the infrastructure requirements: they require their own source of energy, a large workspace with tripods, special training to calibrate and operate the equipment, and high acquisition time, potentially taking several minutes to capture a single image. An alternative is the use of low-cost depth cameras that are easy to operate and only require connection to a laptop and a source of energy. There are several recent studies showing the potential of RGB-D sensors. However, they often exhibit errors when applied to a full 360-degree 3D reconstruction setup, known as the loop closure problem. This kind of error accumulation is intensified by the lower accuracy and large volume of data generated by RGB-D cameras. This article proposes a complete methodology for 3D reconstruction based on RGB-D sensors. To mitigate the loop closure effect, a pairwise alignment method was developed. The proposed approach expands the connectivity graph connections in a pairwise alignment system by automatically discovering new pairs of meshes with overlapping regions. The alignment is then more evenly distributed over the aligned pairs, avoiding the loop closure problem of full 3D reconstructions. The experiments were performed on a collection of 30 artworks made by the Baroque artist Antonio Francisco Lisboa, known as Aleijadinho, as part of the Aleijadinho Digital project conducted in partnership with IPHAN (Brazilian National Institute for Cultural and Artistic Heritage) and the United Nations Educational, Scientific and Cultural Organization (UNESCO). Experimental results show 3D models that compare favorably with state-of-the-art methods available in the literature using RGB-D sensors. The main contributions of this work are: a new method for 3D alignment dedicated to attenuating the RGB-D camera loop closure problem; the development and disclosure of a complete, practical solution for 3D reconstruction of artworks; and the construction of 3D digital models of an important and challenging collection of Brazilian cultural heritage, made accessible through a virtual museum.
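The loop closure problem can be illustrated with a toy example: composing the pairwise transforms around a closed loop of views should return exactly to the start, and the accumulated residual can be spread evenly over the pairs. The sketch below assumes pure 2D translations for brevity (real pipelines distribute full rigid transforms over a connectivity graph):

```python
# Toy illustration of loop-closure error distribution, restricted to
# translations: the pairwise shifts around a closed loop should sum to zero;
# the accumulated residual is subtracted in equal shares from every pair.

def distribute_loop_error(translations):
    """Return adjusted pairwise shifts whose sum around the loop is ~zero."""
    n = len(translations)
    rx = sum(t[0] for t in translations)  # residual drift in x
    ry = sum(t[1] for t in translations)  # residual drift in y
    return [(x - rx / n, y - ry / n) for x, y in translations]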
Book
Full-text available
This study arose from the need to evaluate a software system against the real needs of people with disabilities in their activities, with the main purpose of integrating them into current computer use. The software, called Mousenose, was developed by the IMAGO Group at the Universidade Federal do Paraná. Mousenose works by recognizing, through a video camera, a part of the user's body (the nose) and, by tracking it within the camera's field of view, translates the user's movement into cursor movement without additional hardware. https://www.amazon.com.br/Mousenose-Weldt-Claudia-Francele/dp/3639895495
Conference Paper
Recovery of shredded documents helps in security informatics, forensics and investigation science. Shredded document reconstruction requires much time and human effort. Hence, there is a great need to enhance its performance due to the high growth of critical cases requiring fast shredded document reconstruction. In this paper, we focus particularly on the most influential sub-problem, which is enhancing and speeding up the matching process in addition to reducing the search space. Furthermore, fully automated pre-processing, feature extraction and matching are applied in order to minimize user interaction and reduce the time needed for reconstructing an enormous number of shredded documents.
Article
Full-text available
This paper presents the k-Optimum Path Forest (k-OPF) supervised classifier, which is a natural extension of the OPF classifier. k-OPF is compared to the k-Nearest Neighbors (k-NN), Support Vector Machine (SVM) and Decision Tree (DT) classifiers, and ...
Article
Full-text available
This work focuses on camera-based systems that are designed for mouse replacement. Usually, these interfaces are based on computer vision techniques that capture the user’s face or head movements and are specifically designed for users with disabilities. The work identifies and reviews the key factors of these interfaces based on the lessons learnt by the authors’ experience and by a comprehensive analysis of the literature to describe the specific points to consider in their design. These factors are as follows: user features to track, initial user detection (calibration), position mapping, feedback, error recovery, event execution, profiles and ergonomics of the system. The work compiles the solutions offered by different systems to help new designers avoid problems already discussed by the others.
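As one concrete illustration of the "position mapping" factor reviewed above, a common design combines smoothing of the tracked feature with a joystick-style relative mapping. The gains, dead zone, and function names below are illustrative assumptions, not taken from any reviewed system:

```python
# Sketch of two building blocks discussed under "position mapping":
# exponential smoothing to stabilise the tracked feature, and a joystick-style
# relative mapping where the offset from a rest pose sets cursor velocity.

def smooth(prev, measured, alpha=0.3):
    """Exponential smoothing of the tracked feature position."""
    return (prev[0] + alpha * (measured[0] - prev[0]),
            prev[1] + alpha * (measured[1] - prev[1]))

def relative_map(cursor, feature, rest, gain=8.0, dead_zone=2.0):
    """Offset from the rest pose moves the cursor; a dead zone absorbs jitter."""
    dx, dy = feature[0] - rest[0], feature[1] - rest[1]
    vx = gain * dx if abs(dx) > dead_zone else 0.0
    vy = gain * dy if abs(dy) > dead_zone else 0.0
    return (cursor[0] + vx, cursor[1] + vy)
```

The dead zone and smoothing trade responsiveness for stability, which is exactly the kind of design decision the survey asks new designers to make explicitly.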
Article
The eight-person winning team used original computer algorithms to narrow the search space and then relied on human observation to move the pieces into their final positions.
Article
This paper introduces a new approach for the automated reconstruction and reassembly of fragmented objects having one nearly planar surface, on the basis of the 3D representation of their constituent fragments. The whole process starts with 3D scanning of the available fragments. The obtained representations are properly processed so that they can be tested for possible matches. Next, four novel criteria are introduced that lead to the determination of pairs of matching fragments. These criteria have been chosen so that the whole process imitates the instinctive reassembling method dedicated scholars apply. The first criterion exploits the volume of the gap between two properly placed fragments. The second one considers the fragments' overlapping in each possible matching position. Criteria 3 and 4 employ principles from the calculus of variations to obtain bounds for the area and the mean curvature of the contact surfaces and the length of contact curves, which must hold if the two fragments match. The method has been applied, with great success, both to the reconstruction of objects artificially broken by the authors and, most importantly, to the virtual reassembling of parts of wall paintings belonging to the Mycenaean civilization (c. 1300 BC), excavated in a highly fragmented condition in Tiryns, Greece.
Article
This paper addresses the use of information and communications technology, ICT or IT for brevity, to combat illiteracy and to move illiterates directly from illiteracy to computer literacy. The resulting assistive technology and instructional software and hardware can be employed to speed up a literacy program and make it more attractive and effective. The approach provides interactive learning, self-paced and autonomous learning, entertainment learning, ease of information updating, ease of entry and exit, and ease of application to E-Learning. The hallmark of the proposed approach is the integration of speech and handwriting recognition, as well as audio and visual aids, into the flow. I. Introduction Adult illiteracy is often defined as the inability to read and write for people whose age is more than 15 years. A more realistic definition will define literacy as an individual's ability to read, write, speak, compute and solve problems at levels of proficiency necessary to function on the job, in the family of the individual, and in society. As information and technology are increasingly shaping our society, the skills we need to function successfully have gone beyond reading, and literacy has come to include the skills listed in the above definition. Reading, writing and numeracy are at the core of any literacy program. They are indeed needed by any person who wants to navigate his way in society. However, increasingly, computer literacy is also becoming important for a person to function adequately in society. Consequently, any literacy program that does not provide computer literacy will not do justice to its recipients and will not enable them to navigate their way easily in our increasingly computerized society. A few years ago it came to my attention that illiteracy in the Arab World approaches 40%. I have also noticed that over the years the reported percentages of illiteracy were lowered; however, the total number of illiterates increased.
This indicates clearly the failure of the universal elementary education programs in the Arab world. The only long term solution to illiteracy is to enforce universal elementary education. Obviously the Arab countries have failed to qualify a sufficient number of teachers and resources and failed to overcome societal resistance to implement universal elementary education, let alone to combat adult illiteracy. II. The Basic Concept: Assistive Technology for Literacy [6-7], [12] Looking at the past fifty years and projecting to the next fifty years, the problem would not go away if we keep doing the same things. Hence different methods should be used. The current adult literacy programs, if successful, would move their graduates from traditional illiteracy to computer illiteracy. Thus I envisaged an approach that would utilize information and communications technologies, ICT or IT for brevity, to move illiterates from illiteracy to computer literacy. I have called the approach From Illiteracy to Computer Literacy. (This research was supported by George Kadifa and the Rathmann Family Foundation.) It is proposed that adult literacy programs should provide computer literacy in addition to traditional literacy and numeracy. It is recommended that literacy and numeracy programs should employ TLIT (Teaching and Learning Using Information Technology), with appropriate hardware and software. A pleasant side effect is that the graduate of such a program will, with very little additional effort, become computer literate. Thus such a program will not only provide needed competence in literacy and numeracy, but will also bridge the 'digital gap' and move its graduates from illiteracy to computer literacy. It is to be noted that one of the pioneers in using computers for teaching Arabic was the late Egyptian expatriate Josephine Abboud.
She employed computers to teach Arabic as a foreign language at the University of Texas and demonstrated that this approach reduced the required instruction time severalfold [1]. Developments in electronics have been going much faster than developments in adult literacy educational systems. Radio and TV broadcasting lessons aimed at reducing illiteracy helped somewhat but did not drastically reduce illiteracy [6]. Computers allow interaction between student and machine to ensure that the student never becomes lost or bored. One proposed solution for illiteracy is to provide the billion illiterate people with a billion computers. However, merely placing computers in developing nations will not solve the problem. There is a need to develop an educational infrastructure. Computers can help because they can be used to educate a cadre of educators faster than the traditional approaches. It is significant that very few IT-based programs were found in the developed world that specifically target illiteracy. This is not surprising since illiteracy is not perceived as a problem for the developed world, although this is changing. However, many resources go into the education of the handicapped in the developed world, and the resulting hardware and software are particularly suitable, with some modifications, to combat illiteracy. A study that identified and characterized software and hardware solutions for the delivery of literacy programs suggested that programs developed for the handicapped could be utilized in combating illiteracy [6]. Indeed illiteracy is a handicap. Three such programs are relevant. Programs for the dyslexic: it should be noted that in the Arab world, with no assistive technology, dyslexic people are naturally added to the ranks of the illiterates, and thus the percentage of dyslexic people in the illiterate population would be higher than the normal ten percent in the population at large.
Conference Paper
Full-text available
In this work, we focus on the reconstruction of strip-shredded text documents (RSSTD), which is of great interest in investigative sciences and forensics. After presenting a formal model for RSSTD, we suggest two solution approaches: on the one hand, RSSTD can be reformulated as a (standard) traveling salesman problem and solved by well-known algorithms such as the chained Lin-Kernighan heuristic. On the other hand, we present a specific variable neighborhood search approach. Both methods are able to outperform a previous algorithm from the literature, but nevertheless have practical limits due to the necessarily imperfect objective function. We therefore turn to a semi-automatic system which also integrates user interactions in the optimization process. Practical results of this hybrid approach are excellent; difficult instances can be quickly resolved with only a few user interactions.
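The traveling-salesman view of strip reconstruction can be illustrated with a greedy nearest-neighbor tour over strip-edge dissimilarities. The following sketch is illustrative only: the data layout (each strip as a list of pixel rows) and the sum-of-absolute-differences edge cost are assumptions, not the paper's actual model or the chained Lin-Kernighan solver it uses.

```python
# Hypothetical sketch: strips are lists of rows of gray values; the cost of
# placing strip j immediately right of strip i compares i's right edge
# column with j's left edge column.

def edge_cost(left_strip, right_strip):
    # Sum of absolute differences between the right edge of left_strip
    # and the left edge of right_strip (lower = better fit).
    return sum(abs(row_l[-1] - row_r[0])
               for row_l, row_r in zip(left_strip, right_strip))

def nearest_neighbor_order(strips, start=0):
    # Greedy TSP heuristic: repeatedly append the unused strip whose
    # left edge best matches the current rightmost strip.
    order = [start]
    remaining = set(range(len(strips))) - {start}
    while remaining:
        cur = order[-1]
        nxt = min(remaining, key=lambda j: edge_cost(strips[cur], strips[j]))
        order.append(nxt)
        remaining.remove(nxt)
    return order
```

A real system would replace the greedy tour with a stronger TSP heuristic, since the imperfect objective function makes local mistakes likely, as the abstract notes.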
Conference Paper
This paper addresses documents destroyed by strip shredding, a frequent problem in forensic science. The proposed method first extracts features based on the color of the strip boundaries and then applies the nearest-neighbor algorithm to carry out local reconstruction. In this way the overall complexity can be dramatically reduced, because only a few features are used to perform the matching. The preliminary results reported in this paper, obtained on a database of two hundred documents, demonstrate that the color-matching-based method produces promising results for the document reconstruction problem, is of interest to forensic document examiners, and can provide effective solutions for law enforcement practitioners.
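A boundary-color comparison of the kind described can be sketched as a histogram distance between adjacent strip edges; the helper below is a hedged illustration (the bin count, L1 distance, and gray-value encoding are assumptions, not the paper's feature set).

```python
def boundary_color_distance(strip_a, strip_b, bins=4):
    # Compare the distribution of values on strip_a's right boundary
    # column with strip_b's left boundary column; the nearest neighbor
    # under this distance is the candidate right-hand match.
    def hist(col):
        # Normalized histogram of 8-bit values over `bins` buckets.
        h = [0] * bins
        for v in col:
            h[min(v * bins // 256, bins - 1)] += 1
        return [c / len(col) for c in h]
    ha = hist([row[-1] for row in strip_a])
    hb = hist([row[0] for row in strip_b])
    return sum(abs(a - b) for a, b in zip(ha, hb))
```

Because each strip is reduced to a few histogram values per boundary, the pairwise matching stays cheap even for many strips, which is the complexity reduction the abstract emphasizes.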
Article
Full-text available
This paper introduces a new prototype system for controlling a PC with head movements and voice commands. Our system is a multimodal interface for computer control; the selected modes of interaction are speech and gestures. Computers and information technologies have become part of daily practice. Healthy people use a keyboard, mouse, trackball, or touchpad to control the PC, but these peripherals are often unsuitable for handicapped people, who may have problems using them, for example when they suffer from myopathy or cannot move their hands after an injury. Our system has been developed to provide computer access for people with severe disabilities. It tracks the computer user's head movements with a video camera and translates them into movements of the mouse pointer on the screen, while voice commands are interpreted as button presses. We therefore propose a system that handicapped people can use to control a PC.
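The head-to-pointer translation such systems perform can be sketched as a relative mapping: each frame-to-frame head displacement is scaled and added to the pointer position. The gain, screen size, and clamping below are illustrative assumptions, not the prototype's actual parameters.

```python
def head_to_pointer(head_positions, gain=4.0, screen=(1920, 1080)):
    # Relative mapping: each frame-to-frame head displacement is scaled
    # by `gain` and added to the pointer position, clamped to the screen.
    # The pointer starts at the screen center.
    x, y = screen[0] / 2, screen[1] / 2
    path = [(x, y)]
    for (hx0, hy0), (hx1, hy1) in zip(head_positions, head_positions[1:]):
        x = min(max(x + gain * (hx1 - hx0), 0), screen[0] - 1)
        y = min(max(y + gain * (hy1 - hy0), 0), screen[1] - 1)
        path.append((x, y))
    return path
```

A relative mapping like this is usually preferred over an absolute one because small tracking jitter moves the pointer only slightly instead of teleporting it.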
Article
Full-text available
We describe a system for tracking real-time hand gestures captured by a cheap web camera and a standard Intel Pentium based personal computer with no specialized image processing hardware. To attain the necessary processing speed, the system exploits the Multi-Media Instruction set (MMX) extensions of the Intel Pentium chip family through software including the Microsoft DirectX SDK and the Intel Image Processing and Open Source Computer Vision (OpenCV) libraries. The system is based on the Camshift algorithm (from OpenCV) and a compound constant-acceleration Kalman filter. Tracking is robust and efficient and can follow hand motion at 30 fps.
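The constant-acceleration filtering used to steer such a tracker can be approximated, for illustration, by a fixed-gain alpha-beta-gamma tracker, which behaves like a constant-acceleration Kalman filter with precomputed steady-state gains. The gains and one-dimensional state here are assumptions for the sketch, not the paper's compound filter.

```python
def abg_track(measurements, alpha=0.5, beta=0.4, gamma=0.1, dt=1.0):
    # Fixed-gain (alpha-beta-gamma) tracker along one axis: predict the
    # next position under constant acceleration, then correct position,
    # velocity, and acceleration by fixed fractions of the innovation.
    p, v, a = measurements[0], 0.0, 0.0
    estimates = [p]
    for z in measurements[1:]:
        # Predict under the constant-acceleration motion model.
        p_pred = p + v * dt + 0.5 * a * dt * dt
        v_pred = v + a * dt
        r = z - p_pred                    # innovation (measurement residual)
        p = p_pred + alpha * r
        v = v_pred + beta * r / dt
        a = a + 2 * gamma * r / (dt * dt)
        estimates.append(p)
    return estimates
```

Running one such filter per coordinate of the Camshift window center yields the predicted search-window position for the next frame.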
Conference Paper
Full-text available
We propose a new method for tracking rigid objects in image sequences using template matching. A Kalman filter is used to make the template adapt to changes in object orientation or illumination. This approach is novel since the Kalman filter has previously been used in tracking mainly for smoothing the object trajectory. The performance of the Kalman filter is further improved by employing a robust and adaptive filtering algorithm. Special attention is paid to occlusion handling.
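The idea of filtering the template itself, with a robustness guard for occlusion, can be sketched as a per-pixel blend that is frozen when the residual is implausibly large. The fixed gain and threshold below stand in for the per-pixel Kalman gains of the actual method and are purely illustrative.

```python
def update_template(template, observed_patch, gain=0.2, occlusion_thresh=50):
    # Adapt the stored template toward the newly observed patch so it
    # follows gradual changes in orientation or illumination.
    n = sum(len(row) for row in template)
    # Mean absolute residual between template and observation.
    resid = sum(abs(o - t)
                for trow, orow in zip(template, observed_patch)
                for t, o in zip(trow, orow)) / n
    if resid > occlusion_thresh:
        # Residual too large: likely occlusion, so freeze the template
        # instead of corrupting it with the occluder's appearance.
        return template
    # Fixed-gain per-pixel update (a Kalman filter would vary this gain).
    return [[t + gain * (o - t) for t, o in zip(trow, orow)]
            for trow, orow in zip(template, observed_patch)]
```

The occlusion gate is the crude analogue of the robust filtering the abstract mentions: measurements inconsistent with the model are down-weighted or rejected rather than absorbed.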
Article
As a first step towards a perceptual user interface, a computer vision color tracking algorithm is developed and applied towards tracking human faces. Computer vision algorithms that are intended to form part of a perceptual user interface must be fast and efficient. They must be able to track in real time yet not absorb a major share of computational resources: other tasks must be able to run while the visual interface is being used. The new algorithm developed here is based on a robust...
Article
This paper describes a thorough analysis of the pattern matching techniques used to compute image motion from a sequence of two or more images. Several correlation/distance measures are tested, and problems in displacement estimation are investigated. As a byproduct of this analysis, several novel techniques are presented which improve the accuracy of flow vector estimation and reduce the computational cost through filters, a multi-scale approach, and mask sub-sampling. Further, new algorithms are proposed to obtain sub-pixel accuracy of the flow. A large number of experimental tests have been performed to compare all the proposed techniques, in order to determine which are the most useful for practical applications; the results obtained are very accurate, showing that correlation-based flow computation is suitable for practical and real-time applications.
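Correlation-based displacement estimation with sub-pixel refinement can be sketched in one dimension: find the integer shift minimizing a distance measure, then fit a parabola through the costs around the minimum. SSD is used here as one representative distance measure; the signals and window sizes are illustrative.

```python
def ssd(a, b):
    # Sum of squared differences between two equal-length signals.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def subpixel_peak(c_m1, c0, c_p1):
    # Parabolic interpolation through the costs at d-1, d, d+1;
    # returns a fractional offset in (-0.5, 0.5).
    denom = c_m1 - 2 * c0 + c_p1
    return 0.0 if denom == 0 else 0.5 * (c_m1 - c_p1) / denom

def match_displacement(ref, cur, start, size, search=3):
    # 1-D block matching: integer displacement minimizing SSD within
    # [-search, search], refined to sub-pixel accuracy.
    block = ref[start:start + size]
    costs = {}
    for d in range(-search, search + 1):
        i = start + d
        if 0 <= i and i + size <= len(cur):
            costs[d] = ssd(block, cur[i:i + size])
    d = min(costs, key=costs.get)
    if d - 1 in costs and d + 1 in costs:
        d += subpixel_peak(costs[d - 1], costs[d], costs[d + 1])
    return d
```

The multi-scale and mask sub-sampling speedups in the paper reduce how many such SSD evaluations are needed, without changing this basic search-and-refine structure.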
Article
The "Camera Mouse" system has been developed to provide computer access for people with severe disabilities. The system tracks the computer user's movements with a video camera and translates them into the movements of the mouse pointer on the screen. Body features such as the tip of the user's nose or finger can be tracked. The visual tracking algorithm is based on cropping an online template of the tracked feature from the current image frame and testing where this template correlates in the subsequent frame. The location of the highest correlation is interpreted as the new location of the feature in the subsequent frame. Various body features are examined for tracking robustness and user convenience. A group of 20 people without disabilities tested the Camera Mouse and quickly learned how to use it to spell out messages or play games. Twelve people with severe cerebral palsy or traumatic brain injury have tried the system, nine of whom have shown success. They interacted with their environment by spelling out messages and exploring the Internet.
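A Camera-Mouse-style tracking step can be sketched directly from the description: crop an online template around the feature's last location, then test where it best matches in the next frame. SSD is used below as the match score in place of the system's correlation measure, and the tiny window sizes are illustrative.

```python
def track_feature(prev_frame, frame, loc, tsize=3, search=2):
    # One tracking step: crop an online template around the feature's
    # last location (y, x) in prev_frame, then scan a small neighborhood
    # of the next frame; the best-matching offset becomes the feature's
    # new location.
    y, x = loc
    tmpl = [row[x:x + tsize] for row in prev_frame[y:y + tsize]]
    best, best_loc = None, loc
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            if yy < 0 or xx < 0 or yy + tsize > len(frame) or xx + tsize > len(frame[0]):
                continue
            cand = [row[xx:xx + tsize] for row in frame[yy:yy + tsize]]
            cost = sum((a - b) ** 2
                       for trow, crow in zip(tmpl, cand)
                       for a, b in zip(trow, crow))
            if best is None or cost < best:
                best, best_loc = cost, (yy, xx)
    return best_loc
```

Because the template is re-cropped from the current frame each step, the tracker adapts to gradual appearance changes, at the cost of possible slow drift away from the original feature.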
Conference Paper
Among head gestures, nodding and head-shaking are very common and frequently used, so detecting them is basic to a visual understanding of human responses. However, they are difficult to detect in real time, because nodding and head-shaking are fairly small, fast head movements. We propose an approach for detecting nodding and head-shaking in real time from a single color video stream by directly detecting and tracking a point between the eyes, which we call the “between-eyes”. Along a circle of a certain radius centered at the “between-eyes”, the pixel value exhibits two cycles of bright parts (forehead and nose bridge) and dark parts (eyes and brows). The output of the proposed circle-frequency filter has a local maximum at these characteristic points. To distinguish the true “between-eyes” from similar characteristic points in other face parts, we confirm it with eye detection. Once the “between-eyes” is detected, a small area around it is copied as a template and the system enters tracking mode. Combining the circle-frequency filtering with the template, tracking is done not by searching around but by selecting candidates using the template; the template is then updated. Due to this special tracking algorithm, the system tracks the “between-eyes” stably and accurately, running at 13 frames/s without special hardware. By analyzing the movement of this point, we can detect nodding and head-shaking. Some experimental results are shown.
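The circle-frequency filter can be sketched as sampling gray values along a circle and measuring the strength of the two-cycles-per-revolution component that the bright/dark pattern around the "between-eyes" produces. The radius, sample count, and Fourier-magnitude response below are illustrative assumptions about one plausible realization.

```python
import math

def circle_frequency_response(image, cy, cx, radius=4, samples=16):
    # Sample gray values along a circle centered at (cy, cx) and measure
    # the magnitude of the frequency-2 component: two bright arcs
    # (forehead, nose bridge) and two dark arcs (eyes, brows) per
    # revolution give a strong response at the "between-eyes".
    re = im = 0.0
    for k in range(samples):
        t = 2 * math.pi * k / samples
        y = int(round(cy + radius * math.sin(t)))
        x = int(round(cx + radius * math.cos(t)))
        v = image[y][x]
        re += v * math.cos(2 * t)
        im += v * math.sin(2 * t)
    return math.hypot(re, im) / samples
```

Scanning this response over the image and keeping local maxima yields "between-eyes" candidates, which the paper then confirms with eye detection.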
Computer Vision Face Tracking For Use in a Perceptual User Interface / Occlusion Robust Adaptive Template Tracking
  • G. Bradski
  • H.T. Nguyen
  • M. Worring
  • R. van den Boomgaard
G. Bradski. 'Computer Vision Face Tracking For Use in a Perceptual User Interface'. Intel Technology Journal, Q2 1998. H.T. Nguyen, M. Worring and R. van den Boomgaard. 'Occlusion robust adaptive template tracking'. Computer Vision, 2001 (ICCV 2001), Proceedings Volume I, pp. 7-14, July 2001. Intel OpenCV Reference Manual. Intel Corporation, 2001, pp. 14:105, 2:20-23.