About
25 Publications · 2,023 Reads
103 Citations (since 2017)
Publications (25)
With an ever-increasing number of mobile devices competing for attention, quantifying when, how often, or for how long users look at their devices has emerged as a key challenge in mobile human-computer interaction. Encouraged by recent advances in automatic eye contact detection using machine learning and device-integrated cameras, we provide a fu...
Common calibration techniques for head-mounted eye trackers rely on markers or an additional person to assist with the procedure. This is a tedious process and may even hinder some practical applications. We propose a novel calibration technique which simplifies the initial calibration step for mobile scenarios. To collect the calibration samples,...
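Head-mounted eye tracker calibration is commonly implemented as a regression from pupil positions in the eye camera to target points in the scene camera. The sketch below is a minimal illustration of that generic idea using a second-order polynomial least-squares fit in numpy; it is not the calibration procedure proposed above, and all names and values are hypothetical.

import numpy as np

def polynomial_features(pupil_xy):
    # Second-order polynomial expansion of normalised pupil coordinates (x, y).
    x, y = pupil_xy[:, 0], pupil_xy[:, 1]
    return np.stack([np.ones_like(x), x, y, x * y, x ** 2, y ** 2], axis=1)

def fit_calibration(pupil_xy, scene_xy):
    # Least-squares mapping from pupil positions to gaze points in the scene image.
    A = polynomial_features(pupil_xy)
    coeffs, *_ = np.linalg.lstsq(A, scene_xy, rcond=None)
    return coeffs  # shape (6, 2)

def predict_gaze(pupil_xy, coeffs):
    return polynomial_features(pupil_xy) @ coeffs

# Hypothetical usage: fit on matched calibration samples, then map new pupil positions.
pupil = np.random.rand(30, 2)            # normalised pupil centres
scene = pupil * [1280, 720] + 5.0        # corresponding targets in scene pixels
coeffs = fit_calibration(pupil, scene)
print(predict_gaze(pupil[:3], coeffs))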
Many real-life scenarios can benefit from both physical proximity and natural gesture interaction. In this paper, we explore shared collocated interactions on unmodified wearable devices. We introduce an interaction technique which enables a small group of people to interact using natural gestures. The proximity of users and devices is detected thr...
We propose Unified Model of Saliency and Scanpaths (UMSS) -- a model that learns to predict multi-duration saliency and scanpaths (i.e. sequences of eye fixations) on information visualisations. Although scanpaths provide rich information about the importance of different visualisation elements during the visual exploration process, prior work has bee...
We propose Neuro-Symbolic Visual Dialog (NSVD) -- the first method to combine deep learning and symbolic program execution for multi-round visually-grounded reasoning. NSVD significantly outperforms existing purely-connectionist methods on two key challenges inherent to visual dialog: long-distance co-reference resolution as well as vanishing questio...
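The "symbolic program execution" half of a neuro-symbolic pipeline like the one summarised above can be pictured as a small executor that runs a predicted program over an explicit scene representation. The sketch below is a toy, generic illustration of that idea; the scene format and operation names are invented and do not reflect NSVD's actual executor.

# Toy symbolic executor: a program is a list of operations applied to a scene,
# represented here as a list of attribute dictionaries describing objects.
scene = [
    {"shape": "cube", "color": "red", "size": "large"},
    {"shape": "cube", "color": "blue", "size": "small"},
    {"shape": "sphere", "color": "red", "size": "small"},
]

def execute(program, scene):
    # Each step either filters the current object set or reduces it to an answer.
    objects = list(scene)
    for op, arg in program:
        if op == "filter":
            key, value = arg
            objects = [o for o in objects if o.get(key) == value]
        elif op == "count":
            return len(objects)
        elif op == "exist":
            return len(objects) > 0
        else:
            raise ValueError(f"unknown operation: {op}")
    return objects

# "How many red objects are there?" expressed as a symbolic program:
program = [("filter", ("color", "red")), ("count", None)]
print(execute(program, scene))  # -> 2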
Despite its importance for assessing the effectiveness of communicating information visually, fine-grained recallability of information visualisations has not been studied quantitatively so far. In this work, we propose a question-answering paradigm to study visualisation recallability and present VisRecall - a novel dataset consisting of 200 visua...
One approach to mitigate shoulder surfing attacks on mobile devices is to detect the presence of a bystander using the phone’s front-facing camera. However, a person’s face in the camera’s field of view does not always indicate an attack. To overcome this limitation, in a novel data collection study (N=16), we analysed the influence of three viewin...
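A simple baseline for camera-based bystander detection is to count frontal faces in the front-facing camera image; the study summarised above argues that such a count alone is not sufficient. The sketch below shows only that naive baseline using OpenCV's Haar cascade face detector; the cascade path and thresholds are illustrative and may differ per installation.

import cv2

# Haar cascade shipped with the opencv-python package (path may vary).
cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
detector = cv2.CascadeClassifier(cascade_path)

def count_faces(frame):
    # Detect frontal faces in a single front-facing camera frame (BGR image).
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return len(faces)

def naive_bystander_alert(frame):
    # Naive baseline: more than one face in view is flagged as a potential
    # shoulder-surfing attack; viewing angle and distance are ignored here,
    # which is exactly the limitation the study above investigates.
    return count_faces(frame) > 1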
Gaze-based analysis of areas of interest (AOIs) is widely used in information visualisation research to understand how people explore visualisations or assess the quality of visualisations concerning key characteristics such as memorability. However, nearby AOIs in visualisations amplify the uncertainty caused by the gaze estimation error, which st...
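One way to make the gaze-error issue described above concrete is to assign fixations to nearby AOIs probabilistically rather than with a hard hit test. The sketch below assumes an isotropic Gaussian gaze-error model with a hypothetical standard deviation sigma_px; it is a generic illustration, not the method from the paper.

import math

def aoi_probabilities(fixation, aois, sigma_px=30.0):
    # fixation: (x, y) in pixels; aois: dict mapping AOI name -> (x, y) centre.
    # Returns a normalised distribution over AOIs under an isotropic Gaussian
    # gaze-error model with standard deviation sigma_px.
    fx, fy = fixation
    weights = {}
    for name, (ax, ay) in aois.items():
        d2 = (fx - ax) ** 2 + (fy - ay) ** 2
        weights[name] = math.exp(-d2 / (2.0 * sigma_px ** 2))
    total = sum(weights.values()) or 1.0
    return {name: w / total for name, w in weights.items()}

aois = {"title": (640, 60), "legend": (1100, 650), "x_axis": (640, 690)}
print(aoi_probabilities((660, 80), aois))  # most of the mass falls on "title"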
We propose Unified Model of Saliency and Scanpaths (UMSS) -- a model that learns to predict visual saliency and scanpaths (i.e. sequences of eye fixations) on information visualisations. Although scanpaths provide rich information about the importance of different visualisation elements during the visual exploration process, prior work has been lim...
Human-like attention as a supervisory signal to guide neural attention has shown significant promise but is currently limited to uni-modal integration - even for inherently multimodal tasks such as visual question answering (VQA). We present the Multimodal Human-like Attention Network (MULAN) - the first method for multimodal integration of human-l...
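Human-like attention as a supervisory signal is typically realised as an auxiliary loss that pulls the model's attention distribution towards a human attention distribution. The sketch below shows one generic variant of such a loss (a KL divergence over image regions) in PyTorch; it is an assumed formulation for illustration, not MULAN's actual integration mechanism.

import torch
import torch.nn.functional as F

def attention_supervision_loss(predicted_attn, human_attn, eps=1e-8):
    # predicted_attn, human_attn: (batch, regions) tensors of non-negative weights.
    # KL divergence between the model's attention distribution and a human-like
    # attention distribution used as the supervisory signal.
    p = predicted_attn / (predicted_attn.sum(dim=1, keepdim=True) + eps)
    q = human_attn / (human_attn.sum(dim=1, keepdim=True) + eps)
    return F.kl_div((p + eps).log(), q, reduction="batchmean")

pred = torch.rand(4, 36)   # e.g. model attention over 36 image regions
human = torch.rand(4, 36)  # human-like attention over the same regions
loss = attention_supervision_loss(pred, human)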
We propose a novel method that leverages human fixations to visually decode the image a person has in mind into a photofit (facial composite). Our method combines three neural networks: An encoder, a scoring network, and a decoder. The encoder extracts image features and predicts a neural activation map for each face looked at by a human observer....
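The three-network structure mentioned above (an encoder, a scoring network, and a decoder) can be outlined as a minimal skeleton. The sketch below is a schematic PyTorch layout with invented layer sizes and a made-up fixation-duration feature, intended only to show how such components could be wired together; it is not the architecture from the paper.

import torch
import torch.nn as nn

class Encoder(nn.Module):
    # Extracts a feature vector from a face image (tiny, invented CNN).
    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(32, feat_dim)

    def forward(self, img):
        return self.fc(self.conv(img).flatten(1))

class Scorer(nn.Module):
    # Scores how relevant each viewed face is, given its features and a fixation statistic.
    def __init__(self, feat_dim=128):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(feat_dim + 1, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, feats, fixation_duration):
        return self.fc(torch.cat([feats, fixation_duration], dim=1))

class Decoder(nn.Module):
    # Decodes an aggregated feature vector into a small photofit-like image.
    def __init__(self, feat_dim=128):
        super().__init__()
        self.fc = nn.Linear(feat_dim, 3 * 32 * 32)

    def forward(self, feats):
        return torch.sigmoid(self.fc(feats)).view(-1, 3, 32, 32)

# Score-weighted aggregation of the faces a person looked at, followed by decoding.
encoder, scorer, decoder = Encoder(), Scorer(), Decoder()
faces = torch.rand(5, 3, 64, 64)          # five hypothetical viewed faces
durations = torch.rand(5, 1)              # per-face fixation durations
feats = encoder(faces)                                          # (5, 128)
weights = torch.softmax(scorer(feats, durations), dim=0)        # (5, 1)
photofit = decoder((weights * feats).sum(dim=0, keepdim=True))  # (1, 3, 32, 32)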
With an ever-increasing number of mobile devices competing for our attention, quantifying when, how often, or for how long users visually attend to their devices has emerged as a core challenge in mobile human-computer interaction. Encouraged by recent advances in automatic eye contact detection using machine learning and device-integrated cameras,...
Quantification of human attention is key to several tasks in mobile human-computer interaction (HCI), such as predicting user interruptibility, estimating noticeability of user interface content, or measuring user engagement. Previous works to study mobile attentive behaviour required special-purpose eye tracking equipment or constrained users' mob...
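Once per-frame eye contact with the device has been detected (for example from the front-facing camera), attentive behaviour can be quantified with simple summary statistics such as the number and duration of attention episodes. The sketch below is a generic illustration of such metrics, not code from the work above; the frame rate and signal are hypothetical.

def attention_episodes(eye_contact, fps=30.0):
    # eye_contact: list of booleans, one per video frame, True when the user
    # looks at the device. Returns (episode_count, episode_durations_in_seconds).
    durations, run = [], 0
    for looking in eye_contact:
        if looking:
            run += 1
        elif run:
            durations.append(run / fps)
            run = 0
    if run:
        durations.append(run / fps)
    return len(durations), durations

signal = [False] * 10 + [True] * 45 + [False] * 20 + [True] * 15
count, durations = attention_episodes(signal)
print(count, durations)  # 2 episodes: 1.5 s and 0.5 s at 30 fps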
Nowadays, humans are surrounded by many complex computer systems. When people interact among each other, they use multiple modalities including voice, body posture, hand gestures, facial expressions, or eye gaze. Currently, computers can only understand a small subset of these modalities, but such cues can be captured by an increasing number of wea...
Augmenting people with wearable technology can enhance their natural sensing, actuation, and communication capabilities. Interaction with smart devices can become easier and less explicit when combining multiple wearables instead of using device-specific apps on a single smartphone. We demonstrate a prototype for smart device control by combini...
When compared to image recognition, object detection is a much more challenging task because it requires the accurate real-time localization of an object in the target image. In interaction scenarios, this pipeline can be simplified by incorporating the users’ point of regard. Wearable eye trackers can estimate the gaze direction, but lack own pr...
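The simplification described above, using the wearer's point of regard instead of full-frame object detection, amounts to cropping the scene image around the gaze estimate and running a plain image classifier on the crop. The sketch below illustrates only that cropping step; frame size, gaze coordinates, and crop size are invented, and the actual pipeline in the paper may differ.

def gaze_crop(frame_shape, gaze_xy, crop_size=224):
    # Clamp a crop_size x crop_size window around the gaze point to the frame bounds.
    h, w = frame_shape[:2]
    gx, gy = gaze_xy
    x0 = max(0, min(int(gx) - crop_size // 2, w - crop_size))
    y0 = max(0, min(int(gy) - crop_size // 2, h - crop_size))
    return x0, y0, crop_size, crop_size

# Hypothetical usage: crop the scene-camera frame around the estimated gaze point,
# then run a plain image classifier on the crop instead of a full-frame detector.
x0, y0, cw, ch = gaze_crop((720, 1280), (900.0, 400.0))
# crop = frame[y0:y0 + ch, x0:x0 + cw]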
When people are introduced to each other, exchanging contact information happens either via smartphone interactions or via more traditional business cards. Crowded social events make it more challenging to keep track of all the new contacts. We introduce HandshakAR, a novel wearable augmented reality application that enables effortless sharing of d...
We describe ubiGaze, a novel wearable ubiquitous method to augment any real-world object with invisible messages through gaze gestures that lock the message into the object. This enables a context- and location-dependent messaging service, which users can utilize discreetly and effortlessly. Further, gaze gestures can be used as an authentication met...
Indoor localization is an important topic for context aware applications. In particular, many applications for wireless devices can benefit from knowing the location of a user. Despite the huge effort from the research community to solve the localization problem, there is no widely accepted solution for localization in an indoor environment. In thi...
This paper proposes a probabilistic model for automated reasoning for identifying the lane on which the vehicle is driving. The solution is based on the visual information from an on-board stereo-vision camera and a priori information from an extended digital map. The visual perception system provides information about on-the-spot detected later...
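Lane identification from lateral lane-marking observations and an a priori map lends itself to a discrete Bayes filter over lane hypotheses. The sketch below is a generic illustration of that idea, not the paper's actual model; the lane count, transition probabilities, and observation encoding are invented.

def normalize(belief):
    total = sum(belief) or 1.0
    return [b / total for b in belief]

def predict(belief, stay_prob=0.9):
    # Motion model: the vehicle most likely stays in its lane, otherwise it may
    # have moved to an adjacent lane.
    n = len(belief)
    new = [0.0] * n
    for i, b in enumerate(belief):
        new[i] += stay_prob * b
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                new[j] += (1.0 - stay_prob) / 2.0 * b
    return normalize(new)

def update(belief, observation, likelihood):
    # observation: detected marking types to the (left, right) of the vehicle.
    # likelihood[lane][observation]: how probable the observation is in that lane,
    # written out by hand here but in practice derived from the digital map.
    return normalize([b * likelihood[i].get(observation, 1e-6)
                      for i, b in enumerate(belief)])

# Three-lane road; a solid marking on the left strongly suggests the leftmost lane.
belief = [1 / 3] * 3
likelihood = [
    {("solid", "dashed"): 0.8, ("dashed", "dashed"): 0.1, ("dashed", "solid"): 0.1},
    {("solid", "dashed"): 0.1, ("dashed", "dashed"): 0.8, ("dashed", "solid"): 0.1},
    {("solid", "dashed"): 0.1, ("dashed", "dashed"): 0.1, ("dashed", "solid"): 0.8},
]
belief = update(predict(belief), ("solid", "dashed"), likelihood)
print(belief)  # highest probability on lane 0 (leftmost)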
This paper proposes a method for achieving accurate ego-vehicle global localization with respect to an approaching intersection; the method is based on the data alignment of the information from two input systems: a Sensorial Perception system, on-board of the ego-vehicle, and an a priori digital map. For this purpose an Extended Digital Map is pro...