Article

Eye-Tracking Technologies in Mobile Devices Using Edge Computing: A Systematic Review


Abstract

Eye-tracking provides invaluable insight into the cognitive activities underlying a wide range of human behaviours. Identifying cognitive activities provides valuable insight into human learning patterns and into signs of cognitive diseases such as Alzheimer's, Parkinson's, and autism. Mobile devices, meanwhile, have changed the way we experience daily life and have become a pervasive part of it. This systematic review provides a detailed analysis of mobile device eye-tracking technology reported in 36 studies published in highly ranked scientific journals from 2010 to September 2020, along with several reports from the grey literature. The review provides an in-depth analysis of the algorithms, additional apparatus, calibration methods, computational systems, and metrics applied to measure the performance of the proposed solutions. It also presents a comprehensive classification of mobile device eye-tracking applications across domains such as healthcare, education, road safety, news, and human authentication. We outline the shortcomings identified in the literature and the limitations of current mobile device eye-tracking technologies, such as reliance on the front-facing mobile camera. Further, we propose an edge-computing-driven eye-tracking solution to achieve a real-time eye-tracking experience. Based on the findings, the paper outlines various research gaps and future opportunities that are expected to be of significant value for improving work in the eye-tracking domain.


... While numerous studies have investigated the role of eye tracking and its effect on user perception in online gaming [21][22][23][24][25], there remains a research gap concerning mobile gaming and user perception of offered services. This study utilizes Pupil Core eye-tracking hardware developed by Pupil Labs (Berlin, Germany) [26,27] to capture user interactions with mobile devices and simultaneously measure their visual attention. ...
... Gunawardena et al. [25] analyzed 36 publications published after 2010 related to gaze tracking on mobile phones and tablets and proposed an edge computing-based eye-tracking solution. The study encompasses publications on commercial devices specifically designed for eye tracking using external glasses, such as Tobii and Pupil Core, as well as screen-based eye-tracking solutions that depend on the mobile device's front-facing camera. ...
Article
Full-text available
Mobile gaming accounts for more than 50% of global online gaming revenue, surpassing console and browser-based gaming. The success of mobile gaming titles depends on optimizing applications for the specific hardware constraints of mobile devices, such as smaller displays and lower computational power, to maximize battery life. Additionally, these applications must dynamically adapt to the variations in network speed inherent in mobile environments. Ultimately, user engagement and satisfaction are critical, necessitating a favorable comparison to browser and console-based gaming experiences. While Quality of Experience (QoE) subjective evaluations through user surveys are the most reliable method for assessing user perception, various factors, termed influence factors (IFs), can affect user ratings of stimulus quality. This study examines human influence factors in mobile gaming, specifically analyzing the impact of user delight towards displayed content and the effect of gaze tracking. Using Pupil Core eye-tracking hardware, we captured user interactions with mobile devices and measured visual attention. Video stimuli from eight popular games were selected, with resolutions of 720p and 1080p and frame rates of 30 and 60 fps. Our results indicate a statistically significant impact of user delight on the MOS for most video stimuli across all games. Additionally, a trend favoring higher frame rates over screen resolution emerged in user ratings. These findings underscore the significance of optimizing mobile gaming experiences by incorporating models that estimate human influence factors to enhance user satisfaction and engagement.
... Eye tracking provides information about where you are looking, how long you have looked at a particular area, and how often you look at a specific object, among other details [5]. Eye-tracking devices are used in various research and industrial domains, including neuroscience [6], marketing [7], authentication [8], education [9], human-computer interaction [10], medicine [11], and consumer electronics (e.g., VR/AR goggles). ...
... This category of algorithms typically requires specialized hardware, such as an infrared camera [25,26], depth camera [27], microelectromechanical-system (MEMS) scanner, and photodetector [28], to achieve accurate eye tracking. On the other hand, appearance-based algorithms take low-resolution eye and face images as input and rely on a large dataset for model training [5]. The current paper addresses 2D gaze estimation on smartphones with an appearance-based algorithm. ...
Article
Full-text available
Eye tracking has emerged as a valuable tool for both research and clinical applications. However, traditional eye-tracking systems are often bulky and expensive, limiting their widespread adoption in various fields. Smartphone eye tracking has become feasible with advanced deep learning and edge computing technologies. However, the field still faces practical challenges related to large-scale datasets, model inference speed, and gaze estimation accuracy. The present study created a new dataset that contains over 3.2 million face images collected with recent phone models and presents a comprehensive smartphone eye-tracking pipeline comprising a deep neural network framework (MGazeNet), a personalized model calibration method, and a heuristic gaze signal filter. The MGazeNet model introduced a linear adaptive batch normalization module to efficiently combine eye and face features, achieving the state-of-the-art gaze estimation accuracy of 1.59 cm on the GazeCapture dataset and 1.48 cm on our custom dataset. In addition, an algorithm that utilizes multiverse optimization to optimize the hyperparameters of support vector regression (MVO–SVR) was proposed to improve eye-tracking calibration accuracy with 13 or fewer ground-truth gaze points, further improving gaze estimation accuracy to 0.89 cm. This integrated approach allows for eye tracking with accuracy comparable to that of research-grade eye trackers, offering new application possibilities for smartphone eye tracking.
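The calibration stage described above translates into a compact example. Below is a minimal sketch of SVR-based per-user calibration on a handful of ground-truth gaze points, with plain grid search standing in for the paper's multiverse-optimized (MVO) hyperparameter tuning; all arrays and error figures are synthetic placeholders, not the authors' data.

```python
# Hedged sketch: personalize raw gaze predictions with SVR, as in few-point
# calibration schemes of this kind. Grid search stands in for the paper's
# multiverse-optimized (MVO) hyperparameter tuning.
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

rng = np.random.default_rng(0)
raw_pred = rng.uniform(0, 14, size=(13, 2))        # network output (cm), 13 calib points
truth = raw_pred + rng.normal(0.5, 0.3, (13, 2))   # ground-truth gaze targets

svr = GridSearchCV(SVR(kernel="rbf"),
                   {"C": [1, 10, 100], "gamma": ["scale", 0.1]}, cv=3)
calib = MultiOutputRegressor(svr).fit(raw_pred, truth)

corrected = calib.predict(raw_pred)                # apply per-user correction
err = np.linalg.norm(corrected - truth, axis=1).mean()
print(f"mean calibration error: {err:.2f} cm")
```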
... Eye-tracking technology, traditionally limited to specialised hardware such as infrared-based commercial eye trackers, is now being adapted to work on mobile devices using deep learning-based appearance models. This shift enables a range of applications, such as accessibility tools, gaming, user attention analysis, and mobile health diagnostics, to operate without requiring dedicated equipment [16]. ...
Preprint
Full-text available
This study evaluates a smartphone-based, deep-learning eye-tracking algorithm by comparing its performance against a commercial infrared-based eye tracker, the Tobii Pro Nano. The aim is to investigate the feasibility of appearance-based gaze estimation under realistic mobile usage conditions. Key sensitivity factors, including age, gender, vision correction, lighting conditions, device type, and head position, were systematically analysed. The appearance-based algorithm integrates a lightweight convolutional neural network (MobileNet-V3) with a recurrent structure (Long Short-Term Memory) to predict gaze coordinates from grayscale facial images. Gaze data were collected from 51 participants using dynamic visual stimuli, and accuracy was measured using Euclidean distance. The deep learning model produced a mean error of 17.76 mm, compared to 16.53 mm for the Tobii Pro Nano. While overall accuracy differences were small, the deep learning-based method was more sensitive to factors such as lighting, vision correction, and age, with higher failure rates observed under low-light conditions among participants using glasses and in older age groups. Device-specific and positional factors also influenced tracking performance. These results highlight the potential of appearance-based approaches for mobile eye tracking and offer a reference framework for evaluating gaze estimation systems across varied usage conditions.
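As a rough structural illustration of the architecture described (not the authors' code), the following PyTorch sketch wires a MobileNet-V3 backbone to an LSTM and a linear head that regresses 2D gaze coordinates from short grayscale clips; input sizes and hidden dimensions are assumptions.

```python
# Hypothetical sketch of an appearance-based gaze model: a MobileNetV3
# backbone extracts per-frame features, an LSTM integrates them over time,
# and a linear head regresses 2D screen coordinates.
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v3_small

class GazeCNNLSTM(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        backbone = mobilenet_v3_small(weights=None)
        backbone.classifier = nn.Identity()   # keep 576-dim pooled features
        self.backbone = backbone
        self.lstm = nn.LSTM(576, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)      # (x, y) gaze point

    def forward(self, frames):                # frames: (B, T, 1, H, W) grayscale
        b, t, c, h, w = frames.shape
        x = frames.repeat(1, 1, 3, 1, 1)      # replicate channel for RGB backbone
        feats = self.backbone(x.view(b * t, 3, h, w)).view(b, t, -1)
        out, _ = self.lstm(feats)
        return self.head(out[:, -1])          # gaze for the last frame

model = GazeCNNLSTM()
pred = model(torch.randn(2, 8, 1, 96, 96))   # two clips of eight 96x96 frames
print(pred.shape)                             # torch.Size([2, 2])
```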
... The infrared light source allows for capturing infrared eye images that are resilient to visible-light interference; it also creates salient reflections on the cornea (glints) that help estimate the gaze direction [Guestrin and Eizenman 2006]. These devices are often prohibitively expensive and are typically designed for use in controlled laboratory environments [Gunawardena et al. 2023], hindering the broader application of eye-tracking technology. ...
Article
Gaze-tracking has a wide range of applications across scientific and industrial fields, and recent computer vision and deep-learning advances have made gaze-tracking with standard webcams possible. However, current solutions only offer suboptimal performance and lack flexibility. This paper introduces GazeFollower, an accessible system for webcam gaze-tracking in Python. GazeFollower stands out for its customizability, allowing researchers to quickly develop and adapt algorithms to meet their needs. At its core, GazeFollower estimates gaze with a model trained on 32 million face images, ensuring robust gaze tracking. A benchmark test on a sizeable sample (N=31) shows that the tracking performance of GazeFollower is on par with or better than budget commercial eye trackers. With calibration, GazeFollower has an accuracy of 1.11 cm and a precision of 0.11 cm, and personalized model fine-tuning further enhances these metrics to 0.92 cm and 0.08 cm. These results suggest that GazeFollower holds potential for real-world applications.
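The accuracy and precision figures quoted follow the usual eye-tracking definitions: accuracy as the mean offset of gaze samples from the target, precision as the sample-to-sample dispersion. A small sketch with synthetic gaze samples makes the distinction concrete.

```python
# Illustrative computation of eye-tracking accuracy (mean gaze-to-target
# offset) and precision (RMS of successive sample-to-sample distances).
import numpy as np

rng = np.random.default_rng(1)
target = np.array([10.0, 6.0])                        # fixation target (cm)
gaze = target + rng.normal(0, 0.1, size=(120, 2)) + np.array([0.9, 0.4])

accuracy = np.linalg.norm(gaze - target, axis=1).mean()
precision = np.sqrt((np.linalg.norm(np.diff(gaze, axis=0), axis=1) ** 2).mean())
print(f"accuracy: {accuracy:.2f} cm, precision: {precision:.2f} cm")
```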
... These limitations can lead to skewed results, particularly in diverse demographic settings where user interaction with technology may differ. Moreover, ethical concerns about privacy and data misuse remain significant, especially when real-time attention data is captured without explicit consent (Gunawardena et al., 2022;King et al., 2019;Valliappan et al., 2020). A balanced perspective requires addressing these challenges to ensure broader, equitable, and responsible application of this technology in neuromarketing. ...
Article
Full-text available
This review paper examines the influence of neuromarketing on consumer behavior research, emphasizing its origins, methodologies, impacts, and implications for customer engagement. This paper conducts a thorough narrative literature review on neuromarketing, analyzing the application of diverse neuroscientific techniques, including functional magnetic resonance imaging (fMRI), electroencephalography (EEG), positron emission tomography (PET), magnetoencephalography (MEG), and facial coding/eye tracking, to assess cerebral responses and forecast consumer behavior. Recent scholarly investigations underscore how neuromarketing utilizes contemporary breakthroughs to transform consumer behavior research by offering enhanced insights into cognition and emotion. These sophisticated tools have challenged conventional study methodologies, facilitating a more comprehensive understanding of the interaction between marketing stimuli and the brain. Nevertheless, the research also examines the ethical ramifications and issues associated with neuromarketing, especially regarding privacy and the absence of regulatory frameworks. In theoretical and practical terms, the findings are that advanced neuroimaging techniques like fMRI, EEG, and eye tracking have revealed complex interactions between emotions, attention, and memory that traditional methods could not capture, and that neuromarketing transforms marketing by giving marketers new tools and approaches to boost results. This review synthesizes previous research on the topic, highlighting the necessity for equilibrium between scientific advancement and ethical accountability in neuromarketing, thereby benefiting both academic and commercial sectors.
... It is also evident in the thematic evolution over the last decade that mobile technologies have contributed to eye-tracking research. With their ubiquitous and seamless learning opportunities (Hwang, 2015), mobile technologies play a significant role in eye-tracking research (Gunawardena et al., 2022). Also, previous studies identified the potential of eye-tracking in terms of reading strategies and improved comprehension regarding L2 learning (Roncevic, 2021). ...
Article
Full-text available
The present study carried out a bibliometric analysis of L2 eye-tracking research. VOSviewer, CiteSpace, and CitNetExplorer were used for the analysis. The data comprised 245 articles indexed by the Social Sciences Citation Index (SSCI). The study identified the research topics, trends, promising research directions, influential authors and documents, and countries. Important clusters of research related to L2 processing and cognitive aspects, language acquisition and learning strategies, explicit instruction, and bilingualism and lexical processing. The most prominent topics under these clusters were glosses, multimodality, vocabulary, incidental learning, visual attention distribution, bilingual subtitles, cognitive load, sight translation, keystroke recording, and captions. The analysis also showed emerging trends in L2 research: pen-recording, audiovisual translation, mobile-assisted language learning, college settings, gesture, and prosody. These findings indicate an increasing interplay between technology, nonverbal communication, and linguistic features in L2 learning and education. By giving a quantitative and comprehensive overview of the literature, this study not only contributes to a better understanding of L2 eye-tracking research but also offers direct practical implications for the design of more effective language teaching methodologies. The dynamism of eye-tracking research points to the role of technology in enriching language education and indicates areas for further exploration and development.
... Due to the increasing prevalence of personal mobile devices, mobile eye-tracking technology has become a low-cost alternative solution. This technology utilizes the front or rear camera of basic personal smart devices such as smartphones and tablets, in conjunction with powerful software applications, to achieve face detection, eye detection, iris or pupil detection, and gaze angle calculation [35]. This technology overcomes the high cost and limited mobility of existing commercial eye trackers, thereby providing a technical foundation for potential applications in education and classrooms. ...
Article
Full-text available
The integration of advanced technologies is revolutionizing classrooms, significantly enhancing their intelligence, interactivity, and personalization. Central to this transformation are sensor technologies, which play pivotal roles. While numerous surveys summarize research progress in classrooms, few studies focus on the integration of sensor and AI technologies in developing smart classrooms. This systematic review classifies sensors used in smart classrooms and explores their current applications from both hardware and software perspectives. It delineates how different sensors enhance educational outcomes and the crucial role AI technologies play. The review highlights how sensor technology improves the physical classroom environment, monitors physiological and behavioral data, and is widely used to boost student engagement, manage attendance, and provide personalized learning experiences. Additionally, it shows that combining sensor software algorithms with AI technology not only enhances the data processing and analysis efficiency but also expands sensor capabilities, enriching their role in smart classrooms. The article also addresses challenges such as data privacy protection, cost, and algorithm optimization associated with emerging sensor technologies, proposing future research directions to advance educational sensor technologies.
... Traditional eye-tracking methods have faced challenges such as high costs, lack of portability, limited accuracy in dynamic environments, and extensive calibration needs [5,6,7]. The integration of advanced camera technologies and sophisticated machine learning algorithms in smartphones offers a promising avenue to address these challenges [3,4,8]. ...
Preprint
Full-text available
A significant limitation of current smartphone-based eye-tracking algorithms is their low accuracy when applied to video-type visual stimuli, as they are typically trained on static images. Also, the increasing demand for real-time interactive applications like games, VR, and AR on smartphones requires overcoming the limitations posed by resource constraints such as limited computational power, battery life, and network bandwidth. Therefore, we developed two new smartphone eye-tracking techniques for video-type visuals by combining Convolutional Neural Networks (CNN) with two different Recurrent Neural Networks (RNN), namely Long Short Term Memory (LSTM) and Gated Recurrent Unit (GRU). Our CNN+LSTM and CNN+GRU models achieved an average Root Mean Square Error of 0.955cm and 1.091cm, respectively. To address the computational constraints of smartphones, we developed an edge intelligence architecture to enhance the performance of smartphone-based eye tracking. We applied various optimisation methods like quantisation and pruning to deep learning models for better energy, CPU, and memory usage on edge devices, focusing on real-time processing. Using model quantisation, the model inference time in the CNN+LSTM and CNN+GRU models was reduced by 21.72% and 19.50%, respectively, on edge devices.
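Post-training quantization of the kind described can be illustrated with PyTorch's dynamic quantization, which converts LSTM and Linear weights to int8. The model below is a stand-in head, not the study's network, so the timing printout only demonstrates the measurement pattern rather than reproducing the reported speedups.

```python
# Hedged sketch: dynamic int8 quantization of an LSTM regression head for
# on-edge inference; timings here are illustrative, not the study's numbers.
import time
import torch
import torch.nn as nn

class TinyGazeHead(nn.Module):                 # stand-in for the trained CNN+LSTM
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(576, 128, batch_first=True)
        self.fc = nn.Linear(128, 2)
    def forward(self, feats):                  # feats: (B, T, 576) CNN features
        out, _ = self.lstm(feats)
        return self.fc(out[:, -1])

model = TinyGazeHead().eval()
qmodel = torch.quantization.quantize_dynamic(   # int8 weights for LSTM/Linear
    model, {nn.LSTM, nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 8, 576)
t0 = time.perf_counter(); model(x); fp = time.perf_counter() - t0
t0 = time.perf_counter(); qmodel(x); q = time.perf_counter() - t0
print(f"fp32 {fp * 1e3:.1f} ms vs int8 {q * 1e3:.1f} ms")
```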
... Next, research on impaired-driving recognition using bibliometric analysis and co-occurrence networks (Ge et al., 2022) comprehensively maps the trends, patterns, and methods of impairment recognition, but does not focus on facial recognition with glasses. Then, eye-tracking research on mobile devices using deep learning and machine learning (Gunawardena et al., 2022) analyzes eye-tracking solutions in depth to provide insight into learning patterns and cognitive signs, but does not address bespectacled facial recognition specifically. On the other hand (Y. ...
Article
Full-text available
Facial recognition technology has rapidly advanced, but identifying individuals wearing glasses remains challenging due to altered or obscured facial features. This study addresses this issue by combining the Nearest Neighbor Interpolation Method and Naive Bayes Classification for bespectacled face identification. The method applies interpolation to enhance facial image quality, preserving critical features before classification by Naive Bayes into spectacle and non-spectacle classes. Using the Kaggle MeGlass dataset for training and testing, the approach achieved a training accuracy of 78%, a testing accuracy of 76%, and a cross-validation value of 0.70. These results indicate a significant improvement in recognizing bespectacled faces, contributing to enhanced accuracy in facial recognition systems. Despite these advancements, further improvements are possible, such as integrating more advanced models and expanding the dataset, which could lead to even greater accuracy and reliability in practical applications. This research provides a novel solution to a persistent challenge in facial recognition technology.
... Computer Vision (CV), the interdisciplinary field that enables machines to extract information from visual data, has witnessed significant advancements in recent years [1][2][3]. The availability of high-resolution cameras, coupled with the exponential growth in computational power, has opened up new opportunities for developing CV solutions across a wide range of domains [4][5][6]. From object detection and recognition to image segmentation and scene understanding, CV plays an important role in enabling machines to perceive and interpret visual information. Considering smart cities as a use case, CV has been used to address urban challenges such as traffic management [7], waste management [8], transport safety [9], and flood management [10]. ...
Article
Full-text available
Computer Vision (CV) has become increasingly important for Single-Board Computers (SBCs) due to their widespread deployment in addressing real-world problems. Specifically, in the context of smart cities, there is an emerging trend of developing end-to-end video analytics solutions designed to address urban challenges such as traffic management, disaster response, and waste management. However, deploying CV solutions on SBCs presents several pressing challenges (e.g., limited computation power, inefficient energy management, and real-time processing needs) hindering their use at scale. Graphical Processing Units (GPUs) and software-level developments have emerged recently in addressing these challenges to enable the elevated performance of SBCs; however, it is still an active area of research. There is a gap in the literature for a comprehensive review of such recent and rapidly evolving advancements on both software and hardware fronts. The presented review provides a detailed overview of the existing GPU-accelerated edge-computing SBCs and software advancements including algorithm optimization techniques, packages, development frameworks, and hardware deployment specific packages. This review provides a subjective comparative analysis based on critical factors to help applied Artificial Intelligence (AI) researchers in demonstrating the existing state of the art and selecting the best suited combinations for their specific use-case. At the end, the paper also discusses potential limitations of the existing SBCs and highlights the future research directions in this domain.
... The use of a VR headset can help to see where users are looking and how to make the experience significantly more immersive. Eye tracking in PC gaming [20,21,22,23,24,25,26] can also allow the user to look at objects they wish to interact with and press controls, instead of using the mouse to navigate to those widgets. The user of web forms is now at the center of expressive user interaction, and this requires steady handling. ...
Chapter
Full-text available
In most usability and user experience tests, eye movement is one of the major components of user perception study. Eye tracking makes it possible to record eye movement positions in both mobile 2-dimensional and 3-dimensional virtual environments. The aim is to gather insight into the user's mind and mental processes by revealing more than one might imagine. Mobile eye movement can help in understanding how users feel, think, and act towards certain objects and subjects in the visual field. Most research has used quantitative data from mobile-based eye tracking on image and video content to complement studies based on questionnaires and surveys; traditional eye movement studies have captured remote and mobile eye-tracker data in confined lab studies, and some of these analyses have proved tedious. This paper investigates trends in mobile eye movement studies by generating user data through interaction with dynamic webpage content. The generated data underwent a post-analysis simulating eye movement on e-learning webpages; the observed patterns were synchronous in behaviour between the original and simulated fixation patterns on the webpages visited, which supports the authenticity of using simulated eye movement behaviour for decision making.
... The findings of the study indicated that the search patterns on curves to the left and right were not symmetrical, and that the duration of fixations may be linked to accident rates when navigating curves. In addition, a systematic review by Gunawardena et al. (Gunawardena et al., 2023) analyzed 36 studies published from 2010 to 2020 that reported on eye tracking technology for mobile devices and identified the algorithms, equipment, techniques for calibration, computerized systems, and criteria utilized to evaluate the effectiveness of the suggested approaches. In another study, Land (Land, 1993) developed a portable camera to monitor the movements of eyes, head, and gaze while driving, and a correlation between eye and head movements at intersections was observed, which could be anticipated by considering the magnitude and timing of each gaze shift. ...
Article
Full-text available
The aim of this study is to revisit the important question of how human drivers steer in straight and curved roads. We review the two-point and the tangent point models and suggest a new perspective by introducing the center point model. An eye tracking platform was used to investigate the driver’s gaze direction in real-time driving. Extensive data was collected in an urban environment via integration of the eye tracking system and a scene camera. The objectives of the experiments were to investigate the driver gaze behaviour in two scenarios in real traffic settings. In the first set of trials, the driver followed the tangent method whereas in the second set of tests, the driver focused on the perceived center of the road during driving in straight and curved roads. Moreover, the distance between the driver and the gaze point is found to be lower when turning left compared to turning right. This study underpins the improved performance of the center point model over the tangent point model. Furthermore, our analysis reveals correlation between gaze strategy and vehicular control, offering a deeper comprehension of visuomotor coordination essential for safe and efficient driving. These findings provide valuable insight into human driving behavior and can be utilized to develop more effective driver assistance systems and enhance lateral control mechanisms for driverless cars.
... In some cases, the accuracy of machine learning solutions in the mobile context has been even shown to be comparable to that of state-of-the-art eye trackers [17]. Edge computing approaches have also been considered to lighten the computational load on the mobile device, while trying to limit network traffic [6]. ...
... Pupil-cornea reflection measurements are performed in real time, while the image processing algorithm provides an accurate measurement of the gaze. Once updated, the appearance model functions as a classifier while the filter is used to predict the search region, and the fixation state can be modeled using Markov decision models, cumulative sum algorithms, deep learning and machine learning (ML) processes, and edge computing [52][53][54]. Edge computing can promote multimodal interaction with improved pilot-cockpit integration beyond standard interfaces, with a high level of speed and accuracy [55]. The trend of applicable AI solutions could include fog/edge computing to enable on-board computation rather than other off-board platforms, thereby improving SA responsiveness [24]. ...
Chapter
Full-text available
A learning framework for combining state-of-the-art augmented reality (AR) technologies and artificial intelligence (AI) for helmet-mounted display applications in combat aviation has been proposed to explore perceptual and cognitive performance factors and their influence on mission needs. The analysis originated from an examination of helmet-mounted display (HMD) design features and their configurations for tactical situational awareness (SA). In accomplishing this goal, the relationship between the pilot's visual search and recent advancements in AI has been gauged as a background source to unlock the pilot's uncued visual search limit. In this context, the Augment-Me framework is introduced with the ability to view and organize SA information in a predictive way. The provisioning of AI-augmented fixation maps could effectively outperform current AR-HMD capabilities, facilitating human decision-making while pursuing the detection and compensation of the mechanisms of human error.
... In our previous extensive systematic literature review [21] we have identified 20 studies that attempted to develop gaze estimation methods for mobile devices using the front camera. The majority of those works are exploratory, directly applying previously presented methods to mobile devices. ...
Article
Full-text available
Eye-tracking is a technique used for determining where users are looking and how long they keep their gaze fixed on a particular location. Developments in mobile technology have made mobile applications pervasive; however, eye tracking on mobile devices is still uncommon. This paper proposes a mobile edge computing architecture for eye tracking. We evaluate four lightweight CNN models (LeNet-5, AlexNet, MobileNet, and ShuffleNet) for gaze estimation on mobile devices using a publicly available dataset called GazeCapture. In order to analyse the feasibility of different inference modes such as on-device, edge-based and cloud-based, we conduct an empirical measurement study to quantify inference time, communication time, and resource consumption in these inference modes. Our analysis indicates that while cloud-based inference provides faster predictions, the communication time between the mobile device and the cloud introduces significant latency into the application. This effectively eliminates the ability to perform real-time eye tracking via cloud inference. Furthermore, our findings show that on-device inference performance is limited by energy and memory consumption, making it unsuitable to provide a high-quality user experience. Additionally, we demonstrated that edge-based inference results in a reasonable response time, memory usage, and energy consumption for eye-tracking applications on mobile devices.
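The trade-off this paper measures reduces to simple arithmetic: each mode's response time is its inference time plus its communication time, judged against the frame budget of real-time tracking. The toy comparison below uses illustrative placeholder numbers, not the paper's measurements.

```python
# Toy end-to-end latency model for the three inference modes compared in the
# paper; all numbers are illustrative placeholders, not measurements.
modes = {
    #            inference_ms  network_rtt_ms
    "on-device": (90.0,         0.0),   # slow CPU/NPU, no network hop
    "edge":      (25.0,        15.0),   # nearby server, LAN/WLAN RTT
    "cloud":     (10.0,       120.0),   # fast GPU, WAN RTT dominates
}
budget_ms = 1000 / 30                    # ~33 ms per frame for 30 fps tracking

for name, (infer, rtt) in modes.items():
    total = infer + rtt
    verdict = "meets" if total <= budget_ms else "misses"
    print(f"{name:9s}: {total:6.1f} ms -> {verdict} the 30 fps budget")
```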
Article
Introduction: This study aimed to develop and validate an efficient eye-tracking algorithm suitable for the analysis of images captured in the visible-light spectrum using a smartphone camera. Methods: The investigation primarily focused on comparing two algorithms, named CHT_TM and CHT_ACM after their core functions: Circular Hough Transform (CHT), Active Contour Models (ACMs), and Template Matching (TM). Results: CHT_TM significantly improved the running speed of the CHT_ACM algorithm, with little difference in resource consumption, and improved accuracy on the x axis, reducing execution time by 79%. CHT_TM performed with an average mean percentage error of 0.34% and 0.95% in the x and y directions across the 19 manually validated videos, compared to 0.81% and 0.85% for CHT_ACM. Different conditions, such as manually opening the eyelids with a finger versus without, were also compared across four different tasks. Conclusions: This study shows that applying TM improves the original CHT_ACM eye-tracking algorithm. The new algorithm has the potential to help track eye movement, which can facilitate the early screening and diagnosis of neurodegenerative diseases.
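A minimal OpenCV sketch of the two-stage idea as described: the Circular Hough Transform locates the pupil once, then template matching follows it in later frames. The input file name and all detector parameters are placeholder assumptions, not the paper's settings.

```python
# Hypothetical two-stage tracker: CHT finds the pupil in the first frame,
# template matching (TM) follows it afterwards. Parameters are guesses and
# the sketch assumes HoughCircles finds at least one circle.
import cv2
import numpy as np

cap = cv2.VideoCapture("eye.mp4")              # assumed input video
ok, frame = cap.read()
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

circles = cv2.HoughCircles(cv2.medianBlur(gray, 5), cv2.HOUGH_GRADIENT,
                           dp=1, minDist=50, param1=100, param2=30,
                           minRadius=8, maxRadius=40)
x, y, r = np.uint16(np.around(circles))[0, 0]  # take the strongest circle
template = gray[y - r:y + r, x - r:x + r]      # pupil patch becomes the template

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    res = cv2.matchTemplate(gray, template, cv2.TM_CCOEFF_NORMED)
    _, score, _, top_left = cv2.minMaxLoc(res)
    cx, cy = top_left[0] + r, top_left[1] + r  # pupil centre in this frame
    print(cx, cy, f"match={score:.2f}")
cap.release()
```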
Article
Gaze input offers strong potential for creating intuitive and engaging user interfaces, but remains constrained by inherent limitations in accuracy and precision. Although extensive research has explored gaze-based interaction over the past three decades, a systematic framework that fully captures the diversity of gaze interaction techniques is still lacking. To address this gap, we present a novel two-dimensional taxonomy that classifies gaze interactions by (1) the type of input, distinguishing between gaze-only and gaze-assisted modalities, and (2) the type of target, differentiating between those requiring absolute gaze coordinates and thus higher accuracy, and those using relative coordinates, which tolerate lower accuracy. Our taxonomy explicitly captures the required input accuracy and interface constraints of each technique, providing clearer guidance for designers of gaze-based interfaces. We apply this taxonomy to review and classify 125 studies of active gaze interactions on 2D displays. The findings highlight promising techniques and identify research opportunities to advance gaze interaction design.
Chapter
This chapter profiles the self-report, behavioral, and physiological methods and measures used to investigate user engagement in research and practice settings.
Article
Full-text available
In recent years we have witnessed an increasing number of interactive systems on handheld mobile devices which utilise gaze as a single or complementary interaction modality. This trend is driven by the enhanced computational power of these devices, higher resolution and capacity of their cameras, and improved gaze estimation accuracy obtained from advanced machine learning techniques, especially in deep learning. As the literature is fast progressing, there is a pressing need to review the state of the art, delineate the boundary, and identify the key research challenges and opportunities in gaze estimation and interaction. This paper aims to serve this purpose by presenting an end-to-end holistic view in this area, from gaze capturing sensors, to gaze estimation workflows, to deep learning techniques, and to gaze interactive applications.
Preprint
Full-text available
This study investigates the potential benefits of using eye-tracking technology to enhance website usability. With the rise of online interactions, website usability has become increasingly important for ensuring user satisfaction and engagement. Eye-tracking technology offers a non-invasive way to measure how users interact with websites by monitoring their eye movements and gaze patterns. By analysing these data, website designers and developers can gain insights into how users navigate, read, and process information on their websites. This paper provides an overview of relevant literature on eye tracking and website usability, as well as a summary of studies that have explored the use of eye-tracking technology to improve website design and performance. The results suggest that eye-tracking technology can provide valuable information for enhancing website usability, including insights into user attention, visual hierarchy, and user engagement. Further studies are needed to explore the full potential of eye-tracking technology and to develop best practices for incorporating it into website design and development processes.
Article
Full-text available
Purpose: To assess optical and motor changes associated with near-vision reading under different controlled lighting conditions performed with two different types of electronic screens. Methods: Twenty-four healthy subjects with a mean age of 22.9±2.3 years (range 18-33) participated in this study. An iPad and an e-ink reader were chosen to present calibrated text, and each task lasted 5 minutes, evaluating both the ambient illuminance level and the luminance of the screens. Results: Eye-tracker data revealed a higher number of saccadic eye movements under minimum luminance than under maximum luminance, with statistically significant differences for both the iPad (p=0.016) and the e-ink reader (p=0.002). The length of saccades was also higher at the minimum luminance level for both devices: 6.2±2.8 mm vs. 8.2±4.2 mm (e-ink, max vs. min) and 6.8±2.9 mm vs. 7.6±3.6 mm (iPad, max vs. min), and the blinking rate increased significantly under lower lighting conditions. Conclusions: Performing reading tasks on electronic devices is highly influenced by both the configuration of the screens and the ambient lighting; meanwhile, only small, transient differences in visual quality were found in healthy young people.
Article
Full-text available
Identifying and localizing the user's visual attention can enable various intelligent service computing paradigms in a mobile environment. However, existing solutions can only compute the gaze direction, but without the distance to the intended target. In addition, most of them rely on an eye tracker or similar infrastructure support. This paper explores the possibility of using portable mobile devices, e.g., smartphones, to detect the visual attention of a user. i-VALS only requires the user to perform one simple action to localize the intended object: gazing at the intended object and holding up the smartphone so that the object and the user's face are simultaneously captured by the front and rear cameras. We develop efficient algorithms to obtain the distance between the camera and the user, the user's gaze direction, and the object's direction from the camera. The object's location can then be computed by solving a trigonometric problem. i-VALS has been prototyped on commercial off-the-shelf (COTS) devices. The extensive experiment results show that i-VALS achieves high accuracy and small latency, effectively supporting a large variety of applications in smart environments.
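The underlying geometry is a two-ray triangulation: the front camera yields the user's position and gaze bearing, the rear camera yields the object's bearing, and the object lies at the rays' intersection. The 2D sketch below is our own simplification of that idea, not i-VALS's exact formulation; all distances and angles are made up.

```python
# Toy 2D triangulation: phone at the origin, the front camera estimates the
# user's position and gaze direction, the rear camera gives the object's
# bearing; the object sits where the two rays intersect.
import numpy as np

def intersect(p, d, q, e):
    """Intersection of rays p + t*d and q + s*e (2D)."""
    A = np.column_stack([d, -e])
    t, s = np.linalg.solve(A, q - p)
    return p + t * d

phone = np.array([0.0, 0.0])
user = np.array([0.0, -0.35])                  # 35 cm behind the phone (assumed)
gaze_dir = np.array([np.sin(np.radians(20)), np.cos(np.radians(20))])
obj_dir = np.array([np.sin(np.radians(35)), np.cos(np.radians(35))])  # rear-cam bearing

obj = intersect(user, gaze_dir, phone, obj_dir)
print(f"object at {obj.round(2)} m, {np.linalg.norm(obj - user):.2f} m from user")
```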
Article
Full-text available
Eye tracking has been widely used for decades in vision research, language and usability. However, most prior research has focused on large desktop displays using specialized eye trackers that are expensive and cannot scale. Little is known about eye movement behavior on phones, despite their pervasiveness and large amount of time spent. We leverage machine learning to demonstrate accurate smartphone-based eye tracking without any additional hardware. We show that the accuracy of our method is comparable to state-of-the-art mobile eye trackers that are 100x more expensive. Using data from over 100 opted-in users, we replicate key findings from previous eye movement research on oculomotor tasks and saliency analyses during natural image viewing. In addition, we demonstrate the utility of smartphone-based gaze for detecting reading comprehension difficulty. Our results show the potential for scaling eye movement research by orders-of-magnitude to thousands of participants (with explicit consent), enabling advances in vision research, accessibility and healthcare.
Article
Full-text available
Biometric systems use scanners to verify the identity of human beings by measuring the patterns of their behavioral or physiological characteristics. Some biometric systems are contactless and do not require direct touch to perform these measurements; others, such as fingerprint verification systems, require the user to make direct physical contact with the scanner for a specified duration for the biometric pattern of the user to be properly read and measured. This may increase the possibility of contamination with harmful microbial pathogens or of cross-contamination of food and water by subsequent users. Physical contact also increases the likelihood of inoculation of harmful microbial pathogens into the respiratory tract, thereby triggering infectious diseases. In this viewpoint, we establish the likelihood of infectious disease transmission through touch-based fingerprint biometric devices and discuss control measures to curb the spread of infectious diseases, including COVID-19.
Article
Full-text available
With an ever-increasing number of mobile devices competing for attention, quantifying when, how often, or for how long users look at their devices has emerged as a key challenge in mobile human-computer interaction. Encouraged by recent advances in automatic eye contact detection using machine learning and device-integrated cameras, we provide a fundamental investigation into the feasibility of quantifying overt visual attention during everyday mobile interactions. In this article, we discuss the main challenges and sources of error associated with sensing visual attention on mobile devices in the wild, including the impact of face and eye visibility, the importance of robust head pose estimation, and the need for accurate gaze estimation. Our analysis informs future research on this emerging topic and underlines the potential of eye contact detection for exciting new applications toward next-generation pervasive attentive user interfaces.
Article
Full-text available
The advancement and popularity of smartphones have made them essential, all-purpose devices. However, the lack of advancement in battery technology has held back their optimum potential. Therefore, optimal use and efficient management of energy are crucial, considering its scarcity. For that, a fair understanding of a smartphone's energy consumption factors is necessary for both users and device manufacturers, along with other stakeholders in the smartphone ecosystem. It is important to assess how much of the device's energy is consumed by which components and under what circumstances. This paper provides a generalised but detailed analysis of the power consumption causes (internal and external) of a smartphone and also offers suggestive measures to minimise the consumption for each factor. The main contribution of this paper is four comprehensive literature reviews on: a) smartphone power consumption assessment and estimation (including power consumption analysis and modelling), b) power consumption management for smartphones (including energy-saving methods and techniques), c) the state of the art of research and commercial developments in smartphone batteries (including alternative power sources), and d) mitigating the hazardous issues of smartphone batteries (with a detailed explanation of the issues). The research works are further subcategorised based on different research and solution approaches. A good number of recent empirical research works are considered for this comprehensive review, and each of them is succinctly analysed and discussed.
Conference Paper
Full-text available
Laparoscopic surgery has revolutionised the state of the art in surgical health care. However, its complexity puts a significant burden on the surgeon's cognitive resources, resulting in major biliary injuries. With the increasing number of laparoscopic surgeries, it is crucial to identify surgeons' cognitive load (CL) and level of focus in real time to give them unobtrusive feedback upon detecting a suboptimal level of attention. Assuming that experts maintain more focused attention, we investigate how the skill level of surgeons during live surgery is reflected in eye metrics. Forty-two laparoscopic surgeries were conducted with four surgeons of different expertise levels. We used six eye metrics belonging to the fixation-based and pupillary-based categories. Using the mean, standard deviation, and an ANOVA test, we established three reliable metrics that can be used to differentiate skill level during live surgeries. In future studies, these three metrics will be used to classify surgeons' cognitive load and level of focus during live surgery using machine learning techniques.
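The statistical step here is a one-way ANOVA of an eye metric across expertise groups. A short sketch with synthetic fixation durations shows the pattern; the numbers are illustrative, not the study's data.

```python
# Illustrative one-way ANOVA on a fixation metric (e.g., mean fixation
# duration) across surgeon expertise levels; the samples are synthetic.
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(2)
novice = rng.normal(420, 60, 12)        # ms; longer, more scattered fixations
intermediate = rng.normal(380, 55, 12)
expert = rng.normal(310, 40, 12)        # experts fixate more efficiently

stat, p = f_oneway(novice, intermediate, expert)
print(f"F = {stat:.2f}, p = {p:.4f}")   # small p: the metric separates skill levels
```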
Conference Paper
Full-text available
Split keyboards are widely used on hand-held touchscreen devices (e.g., tablets). However, typing on a split keyboard often requires eye movement and attention switching between the two halves of the keyboard, which slows users down and increases fatigue. We explore peripheral typing, a superior typing mode in which a user focuses her visual attention on the output text and keeps the split keyboard in peripheral vision. Our investigation showed that peripheral typing reduced attention switching, enhanced user experience, and increased overall performance (27 WPM, 28% faster) over the typical eyes-on typing mode. This typing mode can be well supported by accounting for the typing behavior in statistical decoding. Based on our study results, we have designed GlanceType, a text entry system that supports both peripheral and eyes-on typing modes for real typing scenarios. Our evaluation showed that peripheral typing not only co-existed well with the existing eyes-on typing, but also substantially improved text entry performance. Overall, peripheral typing is a promising typing mode, and supporting it would significantly improve text entry performance on a split keyboard.
Article
Full-text available
With the Internet of Things (IoT) becoming part of our daily life and our environment, we expect rapid growth in the number of connected devices. IoT is expected to connect billions of devices and humans to bring promising advantages for us. With this growth, fog computing, along with its related edge computing paradigms, such as multi-access edge computing (MEC) and cloudlet, are seen as promising solutions for handling the large volume of security-critical and time-sensitive data that is being produced by the IoT. In this paper, we first provide a tutorial on fog computing and its related computing paradigms, including their similarities and differences. Next, we provide a taxonomy of research topics in fog computing, and through a comprehensive survey, we summarize and categorize the efforts on fog computing and its related computing paradigms. Finally, we provide challenges and future directions for research in fog computing.
Conference Paper
Full-text available
The widespread use of smartphones has brought great convenience to our daily lives, while at the same time we have been increasingly exposed to security threats. Keystroke security is an essential element in user privacy protection. In this paper, we present GazeRevealer, a novel side-channel-based keystroke inference framework to infer sensitive inputs on a smartphone from video recordings of the victim's eye patterns captured by the smartphone's front camera. We observe that eye movements typically follow the keystrokes typed on the number-only soft keyboard during password input. By exploiting eye patterns, we are able to infer the passwords being entered. We propose a novel algorithm to extract sensitive eye-pattern images from video streams and classify different eye patterns with Support Vector Classification. We also propose a novel enhanced method to boost the inference accuracy. Compared with prior keystroke detection approaches, GazeRevealer does not require any external auxiliary devices; it relies only on the smartphone's front camera. We evaluate the performance of GazeRevealer with three different types of smartphones, and the results show that GazeRevealer achieves 77.43% detection accuracy for a single key number and an 83.33% inference rate for 6-digit passwords in the ideal case.
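The classification stage can be sketched compactly: feature vectors extracted from eye-pattern images are mapped to digit keys with a Support Vector Classifier. The features below are synthetic stand-ins; only the pipeline shape mirrors the paper.

```python
# Hedged sketch of the classification stage: eye-pattern feature vectors
# (synthetic here) mapped to digit keys 0-9 with an SVC.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(3)
keys = np.repeat(np.arange(10), 40)                            # 40 samples per digit
features = rng.normal(0, 1, (400, 32)) + keys[:, None] * 0.25  # toy eye-pattern features

Xtr, Xte, ytr, yte = train_test_split(features, keys, stratify=keys, random_state=0)
clf = SVC(kernel="rbf", C=10).fit(Xtr, ytr)
print(f"per-key accuracy: {clf.score(Xte, yte):.2%}")
```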
Conference Paper
Full-text available
Meaningful treatment of dementia today consists of multi-component interventions, such as cognitive and physical, sensorimotor-oriented stimulation. A serious game was developed for multimodal training performed by clients and caregivers using easily configurable services on a tablet PC. A key problem in developing substantial knowledge about dementia and its impacting factors is the lack of data about mental processes evolving over time. For this purpose, eye-tracking data were captured through non-obtrusive sensing during the game to enable continuous monitoring of dementia profiles. An antisaccade paradigm was used to detect attention inhibition problems that typically occur in executive-function-related neurodegenerative diseases such as Alzheimer's. In a 6-month study with 12 users, a classifier was developed that can discriminate dementia stages from robustly extracted eye movement features obtained from training at home. The playful training and its diagnostics-oriented toolbox offer affordances for entertaining users and for measuring and analysing mental process parameters, to enable people with dementia to stay longer at home and to slow the progress of the disease.
Article
Full-text available
Mobile access to the Internet is changing the way people consume information, yet we know little about the effects of this shift on news consumption. Consuming news is key to democratic citizenship, but is attention to news the same in a mobile environment? We argue that attention to news on mobile devices such as tablets and smartphones is not the same as attention to news for those on computers. Our research uses eye tracking in two lab experiments to capture the effects of mobile device use on news attention. We also conduct a large-scale study of web traffic data to provide further evidence that news attention is significantly different across computers and mobile devices.
Article
Full-text available
Advances in biological and medical technologies have been providing us with explosive volumes of biological and physiological data, such as medical images, electroencephalography, and genomic and protein sequences. Learning from these data facilitates the understanding of human health and disease. Developed from artificial neural networks, deep learning-based algorithms show great promise in extracting features and learning patterns from complex data. The aim of this paper is to provide an overview of deep learning techniques and some of the state-of-the-art applications in the biomedical field. We first introduce the development of artificial neural networks and deep learning. We then describe two main components of deep learning, i.e., deep learning architectures and model optimization. Subsequently, some examples are demonstrated for deep learning applications, including medical image classification, genomic sequence analysis, as well as protein structure classification and prediction. Finally, we offer our perspectives on future directions in the field of deep learning.
Conference Paper
Full-text available
Commodity mobile devices are now equipped with high-resolution front-facing cameras, allowing applications in biometrics (e.g., FaceID in the iPhone X), facial expression analysis, or gaze interaction. However, it is unknown how often users hold devices in a way that allows capturing their face or eyes, and how this impacts detection accuracy. We collected 25,726 in-the-wild photos taken from the front-facing camera of smartphones, as well as associated application usage logs. We found that the full face is visible about 29% of the time, and that in most cases the face is only partially visible. Furthermore, we identified an influence of users' current activity; for example, when watching videos, the eyes but not the entire face are visible 75% of the time in our dataset. We found that a state-of-the-art face detection algorithm performs poorly against photos taken from front-facing cameras. We discuss how these findings impact mobile applications that leverage face and eye detection, and derive practical implications to address the state of the art's limitations.
Conference Paper
Full-text available
During recent years, many studies have been implemented on the automatic diagnosis of Alzheimer's Disease (AD) using different methods. The focus of most of these studies has been the detection of AD from neuroimaging data. However, recognizing symptoms as early as possible (pre-detection) is crucial, as disease-modifying drugs are most effective if administered early in the course of the disease, before the occurrence of irreversible brain damage. Therefore, automated techniques for the pre-detection of AD symptoms from such data are highly important. We report an experimental approach to evaluate the best pre-detection method for AD. Our study consists of two main experiments, both implemented using the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. Prior to the first experiment, we assumed that a method successful in AD detection would also be successful in AD pre-detection. Different studies have used different datasets and different diagnostic methods; therefore, in the first experiment we verified the most successful existing detection method, the Support Vector Machine (SVM). According to the results of this initial investigation (detection study), the sensitivity is 95.3%, the specificity is 71.4%, and the accuracy is 84.4% with the use of an SVM. Since those results were not satisfactory, a deep learning-based technique (Convolutional Neural Network) was proposed as the second experiment. The proposed Convolutional Neural Network (CNN) model was tested using different image segmentation methods and different datasets. Finally, the best image segmentation method obtained a high accuracy of around 96% (sensitivity 96%, specificity 98%), and the CNN model remained unbiased toward the dataset. The results of these experiments suggest an important role for the early diagnosis of Alzheimer's disease using image processing and deep learning techniques.
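The reported figures follow directly from confusion-matrix counts. The snippet below uses illustrative counts chosen to land near the SVM experiment's numbers, only to show how sensitivity, specificity, and accuracy are derived; they are not the study's raw data.

```python
# How the reported detection metrics relate to confusion-matrix counts,
# using illustrative numbers (not the study's raw data).
tp, fn = 41, 2      # AD cases correctly / incorrectly classified
tn, fp = 25, 10     # controls correctly / incorrectly classified

sensitivity = tp / (tp + fn)            # recall on AD cases
specificity = tn / (tn + fp)            # recall on controls
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(f"sens {sensitivity:.1%}, spec {specificity:.1%}, acc {accuracy:.1%}")
```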
Conference Paper
Full-text available
Though there has been an explosion of cloud services in recent years, the cloud gaming industry is still far from making a strong breakthrough. Despite the many benefits of cloud gaming (video or pixel streaming of rendered game scenes), the primary factor that hinders growth is response latency, a well-known issue. In this paper, we introduce a novel predictive paradigm of cloud gaming in order to mitigate the latency that is inherent in cloud gaming systems. The main idea of the predictive paradigm is to pre-generate and send future outcome frames in advance to the thin client so that the thin client can respond immediately to any user input. The approach in this work seeks to exploit the near-unlimited processing power of the cloud and ever-increasing network bandwidth. The paper provides a generic and comprehensive theoretical model that shows the relationship between various parameters of predictive cloud gaming (network bandwidth, round-trip time, and processing time for game logic, predicting future outcomes, rendering, capturing, encoding, and decoding) and response latency. The theoretical model design starts with an ideal server with unlimited resources for ease of understanding and is then enhanced with the resource constraints of a practically feasible non-ideal server. The model demonstrates a significant reduction in response latency in predictive cloud gaming when compared to conventional cloud gaming. The model is analysed further to estimate the resource requirements for predictive cloud gaming. A simple game is built to demonstrate key aspects of the model and to conduct a subjective evaluation of the quality of experience (QoE) of game play. The evaluations show a significant gain in QoE. The resource requirements of the predictive cloud gaming system grow exponentially with respect to the prediction period of future frames. The paper illustrates some optimisation methods to reduce the exponential growth and discusses directions for further work in this area.
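The exponential resource growth mentioned at the end has a simple back-of-envelope form: with b possible user inputs per tick, covering k ticks ahead means pre-generating on the order of b^k candidate frame sequences. A quick illustrative calculation follows; the frame size is an assumption.

```python
# Back-of-envelope view of why predictive cloud gaming scales exponentially:
# with b possible inputs per tick, covering k ticks ahead means pre-rendering
# b**k candidate frame sequences. Numbers are illustrative.
frame_kb = 50                 # size of one encoded frame in KB (assumption)
for b in (2, 4):              # branching factor: inputs per tick
    for k in (1, 2, 3, 4):    # prediction depth in ticks
        frames = b ** k
        print(f"b={b} k={k}: {frames:4d} frames, ~{frames * frame_kb} KB/tick")
```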
Conference Paper
Full-text available
Mobile device addiction has been an important research topic in cognitive science, mental health, and human-machine interaction. Previous work observed mobile device addiction by logging mobile device activity. Although immersion has been identified as a significant predictor of video game addiction, the addiction factors of mobile devices had not previously been investigated with behavioral measurements. In this research, we demonstrate the use of eye tracking to observe the effect of screen size on the experience of immersion, comparing subjective judgments with eye movement analysis. Non-parametric analysis of the immersion scores shows that screen size affects the experience of immersion (p < 0.05). Furthermore, our experimental results suggest that fixational eye movements may serve as an indicator for future investigations of mobile device addiction. The results are also useful for developing guidelines and intervention strategies to deal with smartphone addiction.
Article
Full-text available
Driven by the visions of the Internet of Things and 5G communications, recent years have seen a paradigm shift in mobile computing, from centralized Mobile Cloud Computing towards Mobile Edge Computing (MEC). The main feature of MEC is to push mobile computing, network control and storage to the network edges (e.g., base stations and access points) so as to enable computation-intensive and latency-critical applications on resource-limited mobile devices. MEC promises dramatic reductions in latency and mobile energy consumption, tackling the key challenges for materializing the 5G vision. The promised gains of MEC have motivated extensive efforts in both academia and industry to develop the technology. A main thrust of MEC research is to seamlessly merge the two disciplines of wireless communications and mobile computing, resulting in a wide range of new designs, from computation offloading techniques to network architectures. This paper provides a comprehensive survey of state-of-the-art MEC research with a focus on joint radio-and-computational resource management. We also discuss a set of issues, challenges and future research directions for MEC research, including MEC system deployment, cache-enabled MEC, mobility management for MEC, green MEC, and privacy-aware MEC. Advancements in these directions will facilitate the transformation of MEC from theory to practice. Finally, we introduce recent standardization efforts on MEC as well as some typical MEC application scenarios.
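To make the offloading trade-off concrete, here is a hedged sketch of the classic binary offloading decision that much of the surveyed literature builds on; the parameter values and the simple latency decomposition are illustrative assumptions, not this survey's model:

```python
# Hypothetical sketch of a binary computation-offloading decision:
# a task with c CPU cycles and d bits of input runs at the edge only
# if transfer plus edge execution beats local execution.

def local_time(c_cycles, f_local_hz):
    return c_cycles / f_local_hz

def offload_time(c_cycles, d_bits, uplink_bps, f_edge_hz, rtt_s=0.0):
    return d_bits / uplink_bps + c_cycles / f_edge_hz + rtt_s

c, d = 2e9, 1e6                  # 2 Gcycles of work, 1 Mbit of input
t_loc = local_time(c, 1e9)       # 1 GHz mobile CPU -> 2.0 s
t_off = offload_time(c, d, 20e6, 10e9, rtt_s=0.02)  # 20 Mbps link, 10 GHz edge
print("offload" if t_off < t_loc else "run locally", t_loc, t_off)  # offload
```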
Article
Full-text available
We study gaze estimation on tablets; our key design goal is uncalibrated gaze estimation using the front-facing camera during natural use of tablets, where the posture and method of holding the tablet are not constrained. We collected a large unconstrained gaze dataset of tablet users, the Rice TabletGaze dataset, consisting of 51 subjects, each with 4 different postures and 35 gaze locations. Subjects vary in race, gender and in their need for prescription glasses, all of which might impact gaze estimation accuracy. We made three major observations on the collected data and employed a baseline algorithm for analyzing the impact of several factors on gaze estimation accuracy. The baseline algorithm is based on multilevel HoG features and a Random Forests regressor, and achieves a mean error of 3.17 cm. We perform an extensive evaluation of the impact of various practical factors, such as person dependency, dataset size, race, wearing glasses and user posture, on gaze estimation accuracy.
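A minimal sketch of this kind of baseline, assuming pre-cropped eye-region images and single-level HoG (the paper's multilevel variant and exact parameters are not reproduced here):

```python
# Hypothetical sketch: HoG features + Random Forest regression for
# on-screen gaze location (x, y in cm). Placeholder data stands in for
# eye-region crops from a tablet's front camera.
import numpy as np
from skimage.feature import hog
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
images = rng.random((100, 64, 128))            # placeholder eye-region crops
targets = rng.random((100, 2)) * [15.0, 22.0]  # placeholder gaze points (cm)

# One HoG descriptor per image; the paper uses a multilevel variant.
X = np.array([hog(im, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
              for im in images])

reg = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, targets)
print("predicted gaze (cm):", reg.predict(X[:1]))

# Mean Euclidean error, the metric behind the reported 3.17 cm figure.
errors = np.linalg.norm(reg.predict(X) - targets, axis=1)
print("mean error (cm):", errors.mean())
```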
Chapter
Full-text available
Little is known about the micro-processes by which sensorimotor interaction gives rise to conceptual development. Per embodiment theory, these micro-processes are mediated by dynamical attentional structures. Accordingly, this study investigated eye-gaze behaviors during engagement in solving tablet-based bimanual manipulation tasks designed to foster proportional reasoning. Seventy-six elementary- and vocational-school students (9–15 years old) participated in individual task-based clinical interviews. Data gathered included action logging, eye tracking, and videography. Analyses revealed the emergence of stable eye-path gaze patterns contemporaneous with the first enactments of effective manipulation and prior to verbal articulations of manipulation strategies. Characteristic gaze patterns included consistent or recurring attention to screen locations that bore non-salient stimuli or no stimuli at all, yet bore invariant geometric relations to dynamical salient features. Arguably, this research empirically validates hypothetical constructs from constructivism, particularly reflective abstraction.
Conference Paper
Full-text available
Traditional user authentication methods using a passcode or finger movement on smartphones are vulnerable to shoulder surfing, smudge, and keylogger attacks. These attacks can infer a passcode from collected information about the user’s finger movements or tapping input. As an alternative user authentication approach, eye tracking can effectively reduce the risk of these attacks because no hand input is required. However, most existing eye tracking techniques are designed for large-screen devices, and many of them depend on special hardware, such as a high-resolution eye tracker, and special processes, such as calibration, which are not readily available to smartphone users. In this paper, we propose a new eye tracking method for user authentication on a smartphone. It utilizes the smartphone’s front camera to capture the user’s eye movement trajectories, which are used as the input for authentication; no special hardware or calibration process is needed. We developed a prototype and evaluated its effectiveness on an Android smartphone, recruiting a group of volunteers for the user study. Our evaluation results show that the proposed eye tracking technique achieves very high accuracy in user authentication.
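The abstract does not specify how trajectories are matched, so the sketch below uses dynamic time warping (DTW) as one plausible, assumed similarity measure between an enrolled trajectory and a login attempt:

```python
# Hypothetical sketch: authenticate by comparing a captured eye-movement
# trajectory against an enrolled template with dynamic time warping.
# DTW is an assumed matcher for illustration; the paper's actual matching
# method is not specified in the abstract.
import numpy as np

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) DTW over 2-D gaze points."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

rng = np.random.default_rng(1)
template = rng.random((50, 2))                      # enrolled trajectory
attempt = template + rng.normal(0, 0.02, (50, 2))   # genuine, slightly noisy
impostor = rng.random((50, 2))                      # unrelated trajectory

THRESHOLD = 2.0  # would be tuned on enrollment data in practice
print("genuine accepted:", dtw_distance(template, attempt) < THRESHOLD)
print("impostor accepted:", dtw_distance(template, impostor) < THRESHOLD)
```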
Article
Full-text available
In recent years, the use of visualizations or infographics in the news has become increasingly popular. We know, however, surprisingly little about how news consumers use and appreciate news visualizations. We apply a mixed-method approach to answer these two questions. First, we conduct an eye-tracking study that measures use, by means of direct attention to visualizations on three different news platforms (print newspaper, e-newspaper on tablet, and news website). Second, we conduct focus groups and a survey among readers of three news media to study the extent to which news consumers actually value the inclusion of visualizations in the news. Our results show that news consumers do indeed read news visualizations, regardless of the platform on which the visual is published. We also find that visualizations are appreciated, but only if they are coherently integrated into a news story and thus fulfill a function that can be easily understood. With this study, we provide the first comprehensive picture of the usefulness of information visualizations in the news, and contribute to a growing literature on alternative ways of storytelling in journalism today.
Article
Full-text available
This article describes two user studies that evaluate different interface designs of indoor pedestrian navigation systems displaying landmarks. In particular, very reduced and abstract interfaces only showing route segments and landmarks are compared to depictions additionally showing floor plans. For this purpose, not only the time it took the participants to fulfill the task, but also eye-tracking data were analyzed. The first experiment (N = 81) was carried out with a smartphone. In the second study (N = 69), a device with a bigger screen was used so that gazes on different screen elements could be analyzed. Results show that the participants reach their destination faster with the abstract interface and, moreover, spend less visual attention on this interface.
Article
Full-text available
Systematic reviews and meta-analyses have become increasingly important in health care. Clinicians read them to keep up to date with their field [1],[2], and they are often used as a starting point for developing clinical practice guidelines. Granting agencies may require a systematic review to ensure there is justification for further research [3], and some health care journals are moving in this direction [4]. As with all research, the value of a systematic review depends on what was done, what was found, and the clarity of reporting. As with other publications, the reporting quality of systematic reviews varies, limiting readers' ability to assess the strengths and weaknesses of those reviews. Several early studies evaluated the quality of review reports. In 1987, Mulrow examined 50 review articles published in four leading medical journals in 1985 and 1986 and found that none met all eight explicit scientific criteria, such as a quality assessment of included studies [5]. In 1987, Sacks and colleagues [6] evaluated the adequacy of reporting of 83 meta-analyses on 23 characteristics in six domains. Reporting was generally poor; between one and 14 characteristics were adequately reported (mean = 7.7; standard deviation = 2.7). A 1996 update of this study found little improvement [7]. In 1996, to address the suboptimal reporting of meta-analyses, an international group developed a guidance called the QUOROM Statement (QUality Of Reporting Of Meta-analyses), which focused on the reporting of meta-analyses of randomized controlled trials [8]. In this article, we summarize a revision of these guidelines, renamed PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses), which have been updated to address several conceptual and practical advances in the science of systematic reviews (Box 1). Box 1 (Conceptual Issues in the Evolution from QUOROM to PRISMA): Completing a systematic review is an iterative process. The conduct of a systematic review depends heavily on the scope and quality of included studies: thus systematic reviewers may need to modify their original review protocol during its conduct. Any systematic review reporting guideline should recommend that such changes can be reported and explained without suggesting that they are inappropriate. The PRISMA Statement (Items 5, 11, 16, and 23) acknowledges this iterative process. Aside from Cochrane reviews, all of which should have a protocol, only about 10% of systematic reviewers report working from a protocol [22]. Without a protocol that is publicly accessible, it is difficult to judge between appropriate and inappropriate modifications.
Chapter
Estimating eye-gaze from images alone is a challenging task, in large part due to unobservable person-specific factors. Achieving high accuracy typically requires labeled data from test users, which may not be attainable in real applications. We observe that there exists a strong relationship between what users are looking at and the appearance of the users’ eyes. In response to this understanding, we propose a novel dataset and accompanying method which aims to explicitly learn these semantic and temporal relationships. Our video dataset consists of time-synchronized screen recordings, user-facing camera views, and eye gaze data, which allows for new benchmarks in temporal gaze tracking as well as label-free refinement of gaze. Importantly, we demonstrate that the fusion of information from visual stimuli as well as eye images can lead towards achieving performance similar to literature-reported figures acquired through supervised personalization. Our final method yields significant performance improvements on our proposed EVE dataset, with up to 28% improvement in Point-of-Gaze estimates (resulting in 2.49° of angular error), paving the path towards high-accuracy screen-based eye tracking purely from webcam sensors. The dataset and reference source code are available at https://ait.ethz.ch/projects/2020/EVE.
Article
This paper describes a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates. There are three key contributions. The first is the introduction of a new image representation called the “Integral Image” which allows the features used by our detector to be computed very quickly. The second is a simple and efficient classifier which is built using the AdaBoost learning algorithm (Freund and Schapire, 1995) to select a small number of critical visual features from a very large set of potential features. The third contribution is a method for combining classifiers in a “cascade” which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions. A set of experiments in the domain of face detection is presented. The system yields face detection performance comparable to the best previous systems (Sung and Poggio, 1998; Rowley et al., 1998; Schneiderman and Kanade, 2000; Roth et al., 2000). Implemented on a conventional desktop, face detection proceeds at 15 frames per second.
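A short sketch of the first contribution, the integral image: after one cumulative-sum pass, any rectangular pixel sum takes four lookups, which is what makes Haar-like features cheap to evaluate (numpy used here for illustration):

```python
# Integral image sketch: after one cumulative-sum pass, the sum of any
# axis-aligned rectangle is four array lookups, independent of its size.
import numpy as np

def integral_image(img):
    """Zero-padded integral image: ii[y, x] = sum of img[:y, :x]."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of img[top:top+height, left:left+width] in O(1)."""
    return (ii[top + height, left + width] - ii[top, left + width]
            - ii[top + height, left] + ii[top, left])

img = np.arange(36).reshape(6, 6)
ii = integral_image(img)
assert rect_sum(ii, 1, 2, 3, 2) == img[1:4, 2:4].sum()

# A two-rectangle Haar-like feature is then just a difference of sums:
feature = rect_sum(ii, 0, 0, 3, 3) - rect_sum(ii, 3, 0, 3, 3)
print(feature)
```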
Article
Background: Self-disgust has been associated with loneliness and mental health difficulties in clinical and non-clinical populations, but there is limited research on the role of self-disgust in loneliness and mental health outcomes in older adults. Methods: In Study 1 (N = 102; M age = 68.4 years, SD = 10.9, 68% females) we used a cross-sectional survey to explore the association between loneliness, self-disgust and mental health outcomes. In Study 2 (N = 80; M age = 68.8 years, SD = 11.4, 57% females) we used eye-tracking to investigate attentional vigilance, maintenance and avoidance in individuals with high (vs. low) self-disgust. Results: In study 1 we found that self-disgust mediated the associations of loneliness with anxiety and depressive symptoms, and in study 2 it was demonstrated that older adults with high (vs. low) self-disgust displayed attentional avoidance to their own faces, compared to the faces of unknown others, a process that may perpetuate loneliness. Limitations: The cross-sectional design used in Study 1 limits our potential to make causal inferences. Additionally, both studies included a wide age range of older adults. Conclusions: Our findings are novel and highlight the importance of self-disgust experiences in the context of loneliness and mental health outcomes in older adults. Implications for practice and interventions against loneliness in this age group are discussed.
Conference Paper
Can a smartphone administer a driver license test? We ask this question because of the inadequacy of manual testing and the expense of outfitting an automated testing track with sensors such as cameras, leading to less-than-thorough testing and ultimately compromising road safety. We present ALT, a low-cost smartphone-based system for automating key aspects of the driver license test. A windshield-mounted smartphone serves as the sole sensing platform, with the front camera being used to monitor driver's gaze, and the rear camera, together with inertial sensors, being used to evaluate driving maneuvers such as parallel parking. The sensors are also used in tandem, for instance, to check that the driver scanned their mirror during a lane change. The key challenges in ALT arise from the variation in the subject (driver) and the environment (vehicle geometry, camera orientation, etc.), little or no infrastructure support to keep costs low, and also the limitations of the smartphone (low-end GPU). The main contributions of this paper are: (a) robust detection of driver's gaze by combining head pose and eye gaze information, and performing auto-calibration to accommodate environmental variation, (b) a hybrid visual SLAM technique that combines visual features and a sparse set of planar markers, placed optimally in the environment, to derive accurate trajectory information, and (c) an efficient realization on smartphones using both CPU and GPU resources. We perform extensive experiments, both in controlled settings and on an actual driving test track, to validate the efficacy of ALT.
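As a toy illustration of the gaze-detection idea, combining head pose with eye gaze (both treated here as already-estimated angles, an assumption, since ALT's actual estimation and auto-calibration pipeline is far more involved):

```python
# Hypothetical sketch: combine head yaw and in-head eye yaw to decide
# whether the driver's gaze fell within the mirror's angular window.
# Angle extraction from camera frames is assumed to happen upstream.

MIRROR_YAW_RANGE = (35.0, 55.0)  # degrees; placeholder vehicle geometry

def gaze_yaw(head_yaw_deg, eye_yaw_deg):
    """Coarse combination: gaze direction = head pose + eye-in-head angle."""
    return head_yaw_deg + eye_yaw_deg

def scanned_mirror(samples):
    """True if any (head, eye) sample lands inside the mirror window."""
    return any(MIRROR_YAW_RANGE[0] <= gaze_yaw(h, e) <= MIRROR_YAW_RANGE[1]
               for h, e in samples)

print(scanned_mirror([(10, 5), (42, 6), (0, -3)]))  # True: 42 + 6 = 48
```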
Conference Paper
In recent years, eye tracking techniques have advanced greatly. However, due to the hardware limitations of mobile devices, images captured by a mobile device camera have low resolution, so existing image-processing-based eye tracking techniques for mobile devices still have low accuracy. This paper proposes an approach, MobiET, to compute the gaze fixation on a mobile device. Based on image processing of the eye image, the rectangular eye region is extracted; the geometric center of this region (EC) and the iris center of gravity (IC) are then detected, and the EC-IC vector is generated. By means of calibration, the mapping between this vector and the gaze fixation coordinates can be computed. The evaluation results show that the proposed approach achieves high accuracy, e.g., 2.34 to 4.69 degrees of visual angle when the distance between the eyes and the smartphone screen is 22 to 28 centimeters.
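A minimal sketch of the EC-IC idea, assuming eye-region and iris detection are already done and assuming an affine calibration mapping fitted by least squares (the abstract does not state the mapping's exact form):

```python
# Hypothetical EC-IC sketch: map the vector from the eye-region center
# (EC) to the iris centroid (IC) onto screen coordinates via an affine
# fit learned from calibration points. Detection of EC/IC is assumed.
import numpy as np

def fit_calibration(ecic_vectors, screen_points):
    """Least-squares affine map: [vx, vy, 1] @ A -> [sx, sy]."""
    V = np.hstack([ecic_vectors, np.ones((len(ecic_vectors), 1))])
    A, *_ = np.linalg.lstsq(V, screen_points, rcond=None)
    return A

def predict_gaze(A, ecic_vector):
    return np.append(ecic_vector, 1.0) @ A

# Calibration: the user fixates known screen targets while EC-IC is measured.
ecic = np.array([[-3.0, -2.0], [3.0, -2.0], [-3.0, 2.0], [3.0, 2.0], [0.0, 0.0]])
targets = np.array([[0, 0], [1080, 0], [0, 1920], [1080, 1920], [540, 960]])

A = fit_calibration(ecic, targets)
print(predict_gaze(A, np.array([1.5, 1.0])))  # approx. [810, 1440]
```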
Article
People watching smartphones while walking put their safety at significant risk: pedestrians staring at smartphone screens while walking along the sidewalk are generally more at risk than pedestrians not engaged in smartphone usage. In this study, the authors propose Safe Walking, an Android smartphone-based system that detects the walking behaviour of pedestrians by leveraging the sensors and front camera on smartphones, improving the safety of pedestrians staring at smartphone screens. More specifically, Safe Walking first applies a pedestrian speed calculation algorithm, sampling acceleration data via the accelerometer and calculating gravity components via the gravity sensor. The system then utilises a greyscale image detection algorithm, based on OpenCV4Android, to detect face and eye movement modes and determine whether pedestrians are staring at their screens. Finally, Safe Walking generates a vibration via the smartphone's vibrator to alert pedestrians to pay attention to road conditions. The authors implemented Safe Walking on an Android smartphone and evaluated pedestrian walking speed, the accuracy of eye movement detection, and system performance. The results show that Safe Walking can prevent potential danger for pedestrians staring at smartphone screens with a true positive rate of 91%.
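A hedged desktop-Python analogue of the detection step (Safe Walking uses OpenCV4Android; the Haar-cascade pattern below is standard OpenCV usage and an assumption about their exact pipeline):

```python
# Hypothetical sketch: grayscale face/eye detection with OpenCV Haar
# cascades, the standard pattern behind "is the user staring at the
# screen?" checks. Desktop Python stands in for OpenCV4Android.
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

def staring_at_screen(frame_bgr):
    """True if a front-facing face with at least one visible eye is found."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
        roi = gray[y:y + h, x:x + w]
        if len(eye_cascade.detectMultiScale(roi)) > 0:
            return True
    return False

cap = cv2.VideoCapture(0)          # front camera
ok, frame = cap.read()
if ok and staring_at_screen(frame):
    print("alert: eyes on screen while walking")  # trigger vibration on-device
cap.release()
```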
Conference Paper
We study the question of feature sets for robust visual object recognition, adopting linear SVM based human detection as a test case. After reviewing existing edge and gradient based descriptors, we show experimentally that grids of Histograms of Oriented Gradient (HOG) descriptors significantly outperform existing feature sets for human detection. We study the influence of each stage of the computation on performance, concluding that fine-scale gradients, fine orientation binning, relatively coarse spatial binning, and high-quality local contrast normalization in overlapping descriptor blocks are all important for good results. The new approach gives near-perfect separation on the original MIT pedestrian database, so we introduce a more challenging dataset containing over 1800 annotated human images with a large range of pose variations and backgrounds.
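OpenCV ships a pretrained linear-SVM pedestrian detector built on this HOG pipeline; the snippet below shows that off-the-shelf combination as an illustration of the technique, not the paper's original code:

```python
# Hypothetical sketch: HOG descriptors + linear SVM for human detection,
# using OpenCV's pretrained Dalal-Triggs-style people detector.
import cv2
import numpy as np

hog = cv2.HOGDescriptor()  # default: 64x128 window, 9 orientation bins
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

img = cv2.imread("street.jpg")  # placeholder input image
rects, weights = hog.detectMultiScale(img, winStride=(8, 8), scale=1.05)
for (x, y, w, h), score in zip(rects, np.ravel(weights)):
    print(f"person at ({x},{y}) size {w}x{h}, SVM score {score:.2f}")
```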
Article
In recent years, smartphone and tablet devices have handled an increasing variety of sensitive resources. As a matter of fact, these devices store a plethora of information related to our everyday life, from the contact list to received email, and even our position during the day (even with the GPS chipset disabled, the device can be geolocated through its Wi-Fi/mobile connection alone). This is why mobile attackers are producing a large number of malicious applications targeting Android (the most widespread mobile operating system), often by modifying existing applications; as a result, malware is organized into families, where each application belonging to the same family exhibits the same malicious behaviour. These behaviours are typically related to information gathering: for instance, a very widespread malicious behaviour on mobile is sending personal information (for example, the contact list, received and sent SMS messages, and the browser history) to a remote server managed by the attackers. In this paper, we investigate whether deep learning algorithms are able to discriminate between malicious and legitimate Android samples. To this end, we designed a method based on a convolutional neural network applied to syscall occurrences gathered through dynamic analysis. To test the effectiveness of the proposed method, we experimentally evaluated the resulting deep learning classifiers on a recent dataset of 7100 real-world applications, more than 3000 of which are widespread malware belonging to several different families, obtaining encouraging results.
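A minimal sketch of the classifier shape described here, assuming syscall traces reduced to fixed-length occurrence-count vectors and using PyTorch; the layer sizes are illustrative assumptions, not the paper's exact network:

```python
# Hypothetical sketch: a small 1-D CNN over syscall occurrence vectors
# for malware vs. benign classification. Layer sizes are illustrative;
# N_SYSCALLS counts the distinct syscalls being tracked.
import torch
import torch.nn as nn

N_SYSCALLS = 300

model = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=5, padding=2),  # local patterns in counts
    nn.ReLU(),
    nn.MaxPool1d(2),
    nn.Conv1d(16, 32, kernel_size=5, padding=2),
    nn.ReLU(),
    nn.AdaptiveAvgPool1d(1),
    nn.Flatten(),
    nn.Linear(32, 2),                            # benign vs. malicious
)

# One app = one occurrence vector (counts per syscall from dynamic analysis).
batch = torch.rand(8, 1, N_SYSCALLS)             # placeholder batch of 8 apps
logits = model(batch)
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 2, (8,)))
loss.backward()                                  # standard supervised training
print(logits.shape)                              # torch.Size([8, 2])
```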
Conference Paper
Current eye-tracking input systems for people with ALS or other motor impairments are expensive, not robust under sunlight, and require frequent re-calibration and substantial, relatively immobile setups. Eye-gaze transfer (e-tran) boards, a low-tech alternative, are challenging to master and offer slow communication rates. To mitigate the drawbacks of these two status quo approaches, we created GazeSpeak, an eye gesture communication system that runs on a smartphone, and is designed to be low-cost, robust, portable, and easy-to-learn, with a higher communication bandwidth than an e-tran board. GazeSpeak can interpret eye gestures in real time, decode these gestures into predicted utterances, and facilitate communication, with different user interfaces for speakers and interpreters. Our evaluations demonstrate that GazeSpeak is robust, has good user satisfaction, and provides a speed improvement with respect to an e-tran board; we also identify avenues for further improvement to low-cost, low-effort gaze-based communication technologies.
Article
This article proposes an approach to enhance users' experience of video streaming in the context of smart cities. The proposed approach relies on the concept of MEC as a key factor in enhancing QoS. It sustains QoS by ensuring that applications/services follow the mobility of users, realizing the "Follow Me Edge" concept. The proposed scheme enforces an autonomic creation of MEC services to allow anywhere anytime data access with optimum QoE and reduced latency. Considering its application in smart city scenarios, the proposed scheme represents an important solution for reducing core network traffic and ensuring ultra-short latency through a smart MEC architecture capable of achieving the 1 ms latency dream for the upcoming 5G mobile systems.
Article
Sign language can be used to facilitate communication with and between deaf or hard-of-hearing (Deaf/HH) people. With the advent of video streaming applications on smart TVs and mobile devices, it is now possible to use sign language to communicate over worldwide networks. In this article, we develop a prototype assistive device for real-time speech-to-sign translation. The proposed device aims at enabling Deaf/HH people to access and understand materials delivered in mobile streaming videos through pipelined and parallel processing for real-time translation, and through eye-tracking-based user-satisfaction detection that supports dynamic learning to improve speech-to-sign translation. We conducted two experiments, with nine deaf participants, to evaluate the performance and usability of the proposed assistive device. Our real-time performance evaluation shows that the addition of viewer attention-based feedback reduced translation error rates by 16% (per the sign error rate [SER] metric) and increased translation accuracy by 5.4% (per the bilingual evaluation understudy [BLEU] metric) when compared to a non-real-time baseline system without these features. The usability study results indicate that our assistive device was also pleasant and satisfying for deaf users, and it may contribute to greater engagement of deaf people in day-to-day activities.
Article
This paper explores the perception and effectiveness of mobile search ads from the perspective of users. The study investigates the attention and interaction of users, as well as their subjective estimation of paid listings within Google search results on smartphones. During the tests, each of the 20 users had to accomplish four different search tasks. Data collection methods combined eye-tracking with click-through analysis and interviews. Results indicate that there is no “ad blindness” in mobile search; rather, similar to desktop search, users tend to avoid search advertising on smartphones. For mobile search, ads appear to incur higher usability costs than on desktop.
Article
The proliferation of Internet of Things and the success of rich cloud services have pushed the horizon of a new computing paradigm, Edge computing, which calls for processing the data at the edge of the network. Edge computing has the potential to address the concerns of response time requirement, battery life constraint, bandwidth cost saving, as well as data safety and privacy. In this paper, we introduce the definition of Edge computing, followed by several case studies, ranging from cloud offloading to smart home and city, as well as collaborative Edge to materialize the concept of Edge computing. Finally, we present several challenges and opportunities in the field of Edge computing, and hope this paper will gain attention from the community and inspire more research in this direction.
Conference Paper
Smartphones and tablets are often used in dynamic environments that force users to break focus and attend to their surroundings, creating a form of "situational impairment." Current mobile devices have no ability to sense when users divert or restore their attention, let alone provide support for resuming tasks. We therefore introduce SwitchBack, a system that allows mobile device users to resume tasks more efficiently. SwitchBack is built upon Focus and Saccade Tracking (FAST), which uses the front-facing camera to determine when the user is looking and how their eyes are moving across the screen. In a controlled study, we found that FAST can identify how many lines the user has read in a body of text within a mean absolute percent error of just 3.9%. We then tested SwitchBack in a dual focus-of-attention task, finding that SwitchBack improved average reading speed by 7.7% in the presence of distractions.
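As a toy illustration of how reading progress can be recovered from horizontal gaze motion (an assumed mechanism; the abstract does not detail FAST's algorithm), lines read can be estimated by counting return sweeps, the large right-to-left saccades at the end of each line:

```python
# Hypothetical sketch: estimate lines read by counting return sweeps,
# i.e. large leftward jumps in the horizontal gaze coordinate. The gaze
# x-positions would come from front-camera eye tracking in practice.

def count_lines_read(gaze_x, sweep_threshold=0.5):
    """A drop in normalized x larger than the threshold between samples
    is treated as a return sweep to the start of the next line."""
    sweeps = sum(1 for a, b in zip(gaze_x, gaze_x[1:])
                 if a - b > sweep_threshold)
    return sweeps + 1  # reading line k+1 after k sweeps

# Normalized x while reading ~3 lines: rightward drift, two big left jumps.
trace = [0.1, 0.4, 0.7, 0.9, 0.1, 0.5, 0.8, 0.95, 0.15, 0.6, 0.9]
print(count_lines_read(trace))  # 3
```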
Book
In the past few years, there has been an explosion of eye movement research in cognitive science and neuroscience. This has been due to the availability of ‘off the shelf’ eye trackers, along with software that allows the easy acquisition and analysis of eye movement data. Accompanying this has been a realisation that eye movement data can be informative about many different aspects of perceptual and cognitive processing. Eye movements have been used to examine the visual and cognitive processes underpinning a broad range of human activities, including language production, dialogue, human-computer interaction, driving behaviour, sporting performance, and emotional states. Finally, in the past thirty years, there have been real advances in our understanding of the neural processes that underpin eye movement behaviour. The Oxford Handbook of Eye Movements provides a comprehensive review of the entire field of eye movement research. In over fifty articles, it reviews the developments that have so far taken place, the areas actively being researched, and how the field is likely to develop in the coming years. The first section considers historical and background material before moving on to a second section on the neural basis of eye movements. The third and fourth sections look at visual cognition and eye movements, and at eye movement pathology and development. The final sections consider eye movements in reading and language processing.
Chapter
This chapter describes a variety of behavioral and physiological measures that might be helpful in usability testing as additional ways of learning about the users' experiences with a product and their reactions to it. Some of these can be detected by careful observation, and some require specialized equipment. A structured approach to collecting observational data both verbal and nonverbal during a usability test can be very helpful in subsequent analysis. Facial expressions that participants make during a usability test may give additional insight into what they are thinking and feeling beyond what they say. A trained observer can detect and categorize many of these expressions, but some are very fleeting and may require video analysis. Eye-tracking can be a significant benefit in many kinds of usability tests. Its key value can be in determining whether participants in a usability test even looked at a particular element of the interface. Most eye-tracking systems must detect the location of the participant's pupil and calculate its diameter to determine where he or she is looking. Participants' pupils tend to dilate with higher mental workload and with overall arousal. Skin conductance and heart rate can be used to detect parts of an interface that participants find particularly frustrating. But the technology readily available today for measuring these is too intrusive for normal usability testing.
Article
This paper describes an eye-tracking analysis of usability studies on applications designed for children with autism, and discusses the design implications for people with autism. We report insights from our experience using eye tracking to evaluate augmentative and alternative communication (AAC) applications for children with autism spectrum disorders (ASD), and present the main challenges of using eye tracking with tablets for children with ASD.