Conference Paper

TraceMatch: a Computer Vision Technique for User Input by Tracing of Animated Controls


Abstract

Recent works have explored the concept of movement correlation interfaces, in which moving objects can be selected by matching the movement of the input device to that of the desired object. Previous techniques relied on a single modality (e.g. gaze or mid-air gestures) and specific hardware to issue commands. TraceMatch is a computer vision technique that enables input by movement correlation while abstracting from any particular input modality. The technique relies only on a conventional webcam to enable users to produce matching gestures with any given body parts, even whilst holding objects. We describe an implementation of the technique for acquisition of orbiting targets, evaluate algorithm performance for different target sizes and frequencies, and demonstrate use of the technique for remote control of graphical as well as physical objects with different body parts.
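The motion-correlation principle described in the abstract can be illustrated with a short, hedged sketch (not the TraceMatch implementation; the sampling rate, threshold, and combination rule below are illustrative assumptions): the user's 2D input trajectory is correlated, axis by axis, against each displayed target's trajectory over the same time window, and a target is selected only if its correlation exceeds a threshold.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation between two 1-D signals."""
    a, b = a - a.mean(), b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return 0.0 if denom == 0 else float((a * b).sum() / denom)

def match_score(input_xy, target_xy):
    """Correlate x and y components separately; take the minimum as a conservative score."""
    return min(pearson(input_xy[:, 0], target_xy[:, 0]),
               pearson(input_xy[:, 1], target_xy[:, 1]))

def select_target(input_xy, targets_xy, threshold=0.8):
    """Return the index of the best-matching target, or None if nothing exceeds the threshold."""
    scores = [match_score(input_xy, t) for t in targets_xy]
    best = int(np.argmax(scores))
    return best if scores[best] >= threshold else None

# Example: a noisy, scaled copy of the first of two counter-rotating orbits is
# still selected, because Pearson correlation ignores offset and scale.
t = np.linspace(0, 2 * np.pi, 90)                     # ~3 s of samples at 30 Hz
orbit_a = np.column_stack([np.cos(t), np.sin(t)])     # orbit in one direction
orbit_b = np.column_stack([np.cos(t), -np.sin(t)])    # same orbit, reversed direction
user = 40 * orbit_a + np.random.normal(0, 2, orbit_a.shape)
print(select_target(user, [orbit_a, orbit_b]))        # -> 0
```

Because the correlation is invariant to offset and scale, the same matcher works whether the motion comes from a head, a hand, or a held object, which is the flexibility the abstract emphasises.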


... Motion correlation allows for the control of interactive objects by using body motion to mimic an observed movement [1]. One commonly used technique is the use of orbital controls [2][3][4][5][6], where a user must imitate the motion of an orbital target to activate it. These orbital targets can be (1) displayed visually through screens when these are available (e.g., in smart TV sets) [2,5], (2) projected onto objects [6], or (3) embodied in artifacts specifically designed to make orbital movements [4,7], although the last two solutions are impractical and expensive. ...
... To investigate this aspect, we assessed synchronization abilities at two orbit periods, 2.4 and 3.6 s. In line with other works (e.g., [4,5]), the results indicate that users perform better at the slower speed (the 3.6 s orbit). ...
Article
Full-text available
SoundOrbit is a novel input technique that uses motion correlation to control smart devices. The technique associates controls with specific orbital sounds, made of cyclically increasing/decreasing musical scales, and the user can activate a control by mimicking the corresponding sound by body motion. Unlike previous movement-correlation techniques based on visual displays, SoundOrbit operates independent of visual perception, enabling the development of cost-effective smart devices that do not require visual displays. We investigated SoundOrbit by conducting two user studies. The first study evaluated the effectiveness of binaural sound spatialization to create a distinct orbiting sound. In comparison to a cyclic musical scale that is fixed in the apparent auditory space, we found that spatial effects did not improve users’ ability to follow the sound orbit. In the second study, we aimed at determining the optimal system parameters, and discovered that users synchronize better with slower speeds. The technique was found to be feasible and reliable for one and two orbits simultaneously, each orbit using a distinct sound timbre, but not for three orbits due to a high error rate.
... A recent class of such embodied interaction techniques, broadly described as motion matching [20,46,55] (synonyms include motion coincidence, motion pointing, rhythmic path mimicry), allows users to interact with digital systems by imitating a moving entity using bodily movements (e.g. moving one's hand to match a circular motion, see Fig. 1) [9,11,21,48]. Compared to other touchless embodied systems, motion matching is seen as an interesting alternative for interaction with public displays and a growing number of smart appliances at home. ...
... Compared to other touchless embodied systems, motion matching is seen as an interesting alternative for interaction with public displays and a growing number of smart appliances at home. For example, it works with unmodified off-the-shelf hardware such as web-cams [11] or smart watches [50]; does not require any gesture data training sets; and does not require gesture (or speech) discovery and memorization -making it an ideal candidate for spontaneous interaction [52]. On the other hand, previous work in this domain has focused primarily on seminal performance studies and technical developments. ...
... Related work has developed and studied a variety of technical implementations of motion matching interaction, using webcams [11,12], depth-sensors [9], eye-trackers [18,25,37,48,52], magnets [39], and inertial measurement units (IMUs) embedded in smart-watches [50], phones [4], and AR headsets [19]. These implementations are supplemented with work on further algorithmic developments and novel deployments [10,16,21,27,29,47]. ...
Conference Paper
Full-text available
Amongst the variety of (multi-modal) interaction techniques that are being developed and explored, the Motion Matching paradigm provides a novel approach to selection and control. In motion matching, users interact by rhythmically moving their bodies to track the continuous movements of different interface targets. This paper builds upon the current algorithmic and usability focused body of work by exploring the product possibilities and implications of motion matching. Through the development and qualitative study of four novel and different real-world motion matching applications --- with 20 participants --- we elaborate on the suitability of motion matching in different multi-user scenarios, the less pertinent use in home environments and the necessity for multi-modal interaction. Based on these learnings, we developed three novel motion matching based interactive lamps, which report on clear paths for further dissemination of the embodied interaction technique's experience. This paper hereby informs the design of future motion matching interfaces and products.
... SEQUENCE bypasses both Midas Touch and mapping issues by matching the user's input against a corresponding rhythm, in contrast to corresponding position. PathSync [12] and TraceMatch [14,15], instead, bypass the same issues by matching the user's input against the corresponding motion. ...
... Finally, rhythm can be detected independently of conventional body tracking techniques, e.g., using a pressure sensor to detect users' blowing or even using bio-signal-based human-computer interfaces like EOG glasses to detect eye blinks [4]. At any rate, as other similar works highlight (e.g., [14,15,20]), SEQUENCE is not intended to replace existing remotes, but aims to complement them to provide users instant control "right now" and "right there". ...
... In sum, this suggests that SEQUENCE can be effectively used by most users, though a few could suffer from difficulty in synchronization. It must be pointed out that, even if in different ways, other techniques based on motion correlation (e.g., [12,14,15]) also leverage synchronization of user movements with moving targets, but none of them pointed out the beat deafness issue, probably because of its rarity. At any rate, in a state-of-the-art article on motion correlation techniques, the authors state that "problems caused by users' abilities to match the motion are somewhat harder to resolve". ...
Article
Full-text available
We present SEQUENCE, a novel interaction technique for selecting objects from a distance. Objects display different rhythmic patterns by means of animated dots, and users can select one of them by matching the pattern through a sequence of taps on a smartphone. The technique works by exploiting the temporal coincidences between patterns displayed by objects and sequences of taps performed on a smartphone: if a sequence matches the pattern displayed by an object, the latter is selected. We propose two different alternatives for displaying rhythmic sequences associated with objects: the first one uses fixed dots (FD), the second one rotating dots (RD). Moreover, we performed two evaluations of these alternatives. The first evaluation, carried out with five participants, aimed to discover the most appropriate speed for displaying animated rhythmic patterns. The second evaluation, carried out with 12 participants, aimed to discover errors (i.e., activation of unwanted objects), missed activations (within a certain time), and time of activations. Overall, the proposed design alternatives perform in similar ways (errors: 2.8% for FD and 3.7% for RD; missed: 1.3% for FD and 0.9% for RD; time of activation: 3862 ms for FD and 3789 ms for RD).
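The temporal-coincidence matching that SEQUENCE describes can be pictured with a short sketch (not the authors' implementation; the tolerance value, pattern onset times, and object names below are illustrative assumptions): the user's tap times are compared against the onset times of each object's displayed rhythm, and the object with the smallest timing error within tolerance is selected.

```python
import numpy as np

def rhythm_error(tap_times, pattern_times, tolerance=0.15):
    """Mean absolute offset (s) between taps and a pattern's expected onsets.
    Returns None if counts differ or any tap misses its onset by more than the tolerance."""
    taps = np.asarray(tap_times, dtype=float) - tap_times[0]    # align to first event
    patt = np.asarray(pattern_times, dtype=float) - pattern_times[0]
    if len(taps) != len(patt):
        return None
    offsets = np.abs(taps - patt)
    return float(offsets.mean()) if np.all(offsets <= tolerance) else None

def select_object(tap_times, patterns):
    """Pick the object whose displayed rhythm the tap sequence matches best, if any."""
    errors = {name: rhythm_error(tap_times, p) for name, p in patterns.items()}
    errors = {k: v for k, v in errors.items() if v is not None}
    return min(errors, key=errors.get) if errors else None

# Two hypothetical objects displaying different rhythms (onset times in seconds).
patterns = {"lamp": [0.0, 0.4, 0.8, 1.6], "fan": [0.0, 0.8, 1.2, 1.6]}
print(select_object([10.02, 10.43, 10.78, 11.61], patterns))    # -> "lamp"
```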
... Motion-matching is an alternative selection mechanism to pointing, relying on the ability of users to couple with motion displayed at the interface [57]. First explored by Williamson and Murray-Smith [63], motion-matching has been used with a variety of different input modalities, including the mouse [63,24], eye gaze [47,59,22], and recently touchless interaction [13,17,16]. PathSync demonstrated the discoverability, intuitiveness and multi-user capacity of motion-matching for hand-based gestures [13], while TraceMatch showed users' capacity to synchronise using different input modalities [16]. ...
... PathSync demonstrated the discoverability, intuitiveness and multi-user capacity of motion-matching for hand-based gestures [13], while TraceMatch showed users' capacity to synchronise using different input modalities [16]. TraceMatch also introduced a webcam-based implementation of motion-matching that accepts any form of movement as input [17], an approach we adopt for the motion-matching phase in MatchPoint. Prior work proposed motion-matching as an alternative to spatial coupling, whereas we combine the two principles to leverage their respective advantages. ...
... For motion-matching we use the TraceMatch processing pipeline, introduced by Clarke et al. [17]. TraceMatch uses Orbits, introduced by Esteves et al. [22], as input controls which consist of an orbiting target around a circular widget, a motion that is not likely to be reproduced accidentally by the user. ...
Conference Paper
Pointing is a fundamental interaction technique where user movement is translated to spatial input on a display. Conventionally, this is based on a rigid configuration of a display coupled with a pointing device that determines the types of movement that can be sensed, and the specific ways users can affect pointer input. Spontaneous spatial coupling is a novel input technique that instead allows any body movement, or movement of tangible objects, to be appropriated for touchless pointing on an ad hoc basis. Pointer acquisition is facilitated by the display presenting graphical objects in motion, to which users can synchronise to define a temporary spatial coupling with the body part or tangible object they used in the process. The technique can be deployed using minimal hardware, as demonstrated by MatchPoint, a generic computer vision-based implementation of the technique that requires only a webcam. We explore the design space of spontaneous spatial coupling, demonstrate the versatility of the technique with application examples, and evaluate MatchPoint performance using a multi-directional pointing task.
... Velloso et al. [4] and Clarke et al. [1] expanded this work to the control of interactive devices in the home, using computer vision to capture the tracking movement of users' eyes and bodies (respectively). Velloso et al. explored different ways of producing movement in physical space, using a projector, a laser pointer, and a physical artefact that moved (a windmill) to enable users to control a music system, smart lights, and a fan, using solely their eyes. ...
... This work highlights how motion matching interfaces offer not only direct but uniform control in interactive spaces, with no need for proxy-devices such as controllers or smart phone applications. Clarke et al. [1] explored the use of motion matching for interaction with smart TVs, using a simple webcam to capture tracking movement performed by the users' bodies, such as their hands or head; or any hand-held object, such as a smart phone or a cup of coffee. This highlights how motion matching can support interaction when the users' hands are busy, without the need for specialized hardware such as eye-trackers or depth-cameras. ...
... Fig. 3 describes our system architecture: an off-the-shelf Android smartwatch is paired with a smartphone responsible for gathering and correlating user and target movement displayed on one or more smart appliances. These appliances can be a simple interactive display (e.g., a smart TV [1]) or any of the smart devices described in [4]; the only requirement is the ability to generate motion and to stream that information as x- and y-data to the user's smartphone. A pilot study on user performance with the prototype is described in [5]. ...
Conference Paper
Full-text available
This paper presents a prototype of a smart home control system operated through motion matching input. In motion matching, targets move continuously in a singular and pre-defined path; users interact with these targets by tracking their movement for a short period of time. Our prototype captures user input through the motion sensors embedded in off-the-shelf smartwatches while users track the moving targets with their arms and hands. The wearable nature of the tracking system makes our prototype ideal for interaction with numerous devices in a smart home.
... In Pursuits [10] (public displays), Orbits [4] (smart watches), and AmbiGaze [9] (smart rooms), an eye-tracker captures the input naturally provided by the user's eyes while these follow various moving targets. Similarly, in TraceMatch [3] and PathSync [2] (smart TVs), a simple optical system attached to a TV tracks the user's body for motion matching input. ...
... As with PathSync [2] and TraceMatch [3], WaveTrace is a motion matching technique that supports target acquisition through pointing gestures. Unlike these techniques, which use a computer vision system to ...
... We conducted a user study to first, test the feasibility of our idea; and second, highlight performance differences between WaveTrace, which uses Euler angles to represent user input, and optical-based systems described in literature (where user input is represented as x and y in space). To achieve this, several of the target conditions found in PathSync [2] and TraceMatch [3] were replicated. ...
Conference Paper
Full-text available
We present WaveTrace, a novel interaction technique based on selection by motion matching. In motion matching systems, targets move continuously in a singular and pre-defined path; users interact with these by performing a synchronous bodily movement that matches the movement of one of the targets. Unlike previous work, which tracks user input through optical systems, WaveTrace is arguably the first motion matching technique to rely on motion data from inertial measurement units readily available in many wrist-worn wearable devices such as smart watches. To evaluate the technique, we conducted a user study in which we varied: hand; degrees of visual angle; target speed; and number of concurrent targets. Preliminary results indicate that the technique supports up to eight concurrent targets, and that participants could select targets moving at speeds between 180°/s and 270°/s (mean acquisition time of 2237 ms, and average success rate of 91%).
... However, it is only in recent work that this concept has been applied as part of holistic interface designs. In our combined work, we have used motion correlation with gaze and gesture as modalities, in interfaces designed for public displays, smart watches, interactive TVs, and smart homes [Vidal et al. 2013b; Esteves et al. 2015a; Clarke et al. 2016]. ...
... In CycloStar, continuous closed-loop motion is used to support panning and zooming in touch interfaces in a clutch-free manner [Malacria et al. 2010]. We have employed cyclic motion in similar ways in the design of gaze and mid-air gesture interfaces [Esteves et al. 2015b; Clarke et al. 2016]. ...
... In this section, we review five of the present authors' recent works in which motion correlation was applied in the design of gaze and gesture interfaces: Pursuits [Vidal et al. 2013a, 2013b; Pfeuffer et al. 2013], Orbits [Esteves et al. 2015a, 2015b], PathSync, AmbiGaze, and TraceMatch [Clarke et al. 2016]. None of these works were designed to study motion correlation in the first place, but demonstrate the principle at work in giving rise to novel, practical, and compelling interaction designs. ...
Article
Selection is a canonical task in user interfaces, commonly supported by presenting objects for acquisition by pointing. In this article, we consider motion correlation as an alternative for selection. The principle is to represent available objects by motion in the interface, have users identify a target by mimicking its specific motion, and use the correlation between the system's output and the user's input to determine the selection. The resulting interaction has compelling properties, as users are guided by motion feedback and only need to copy a presented motion. Motion correlation has been explored in earlier work but has only recently begun to feature in holistic interface designs. We provide a first comprehensive review of the principle, and present an analysis of five previously published works in which motion correlation underpinned the design of novel gaze and gesture interfaces for diverse application contexts. We derive guidelines for motion correlation algorithms, motion feedback, choice of modalities, and the overall design of motion correlation interfaces, and identify opportunities and challenges for future research and design.
... There have also been attempts to use multi-modal gaze-hand input to address this issue [39,61], but it increases the cognitive load. In addition, there are hardware limitations related to form factor and battery. In response to these issues, and inspired by the potential of motion matching interaction methods [19,29,37,40,59], this paper introduces the Whirling Interface, a motion-matching-based selection interface for distant object selection. We implemented the Whirling Interface with user feedback, addressing Norman's concerns with gesture interfaces [44]. ...
... of display-guided interactions, facilitated by tapping [10] or touch gestures on the screen [19]. When it comes to hand-based motion matching interaction, PathSync [29] and TraceMatch [13,37] have shown potential using a computer vision-based tracking system, while WaveTrace [40] demonstrated applicability using a smartwatch. ...
... Rhythmic-synchronization-based techniques originated to some extent from movement-correlation techniques [5][6][7][8][9][10], where controls show different motion patterns and the user can select one of them by mimicking the corresponding motion. ...
... From the point of view of the input device, rhythmic synchronization techniques can be simpler than those based on movement correlation. While motion-correlation techniques require sensors that detect movement (e.g., cameras [6,7] or a Kinect [9]), rhythmic synchronization techniques require sensors as simple as a button, e.g., [1,4]. As a matter of fact, previous studies showed that these techniques can support a wide variety of sensors [1,3]; in particular, any sensor capable of generating a binary input through which users can perform the required rhythm. ...
Article
Full-text available
Rhythmic-synchronization-based interaction is an emerging interaction technique where multiple controls with different rhythms are displayed in visual form, and the user can select one of them by matching the corresponding rhythm. These techniques can be used to control smart objects in environments where there may be interfering auditory stimuli that contrast with the visual rhythm (e.g., to control Smart TVs while playing music), and this could compromise users’ ability to synchronize. Moreover, these techniques require certain reflex skills to properly synchronize with the displayed rhythm, and these skills may vary depending on the age and gender of the users. To determine the impact of interfering auditory stimuli, age, and gender on users’ ability to synchronize, we conducted a user study with 103 participants. Our results show that there are no significant differences between the conditions of interfering and noninterfering auditory stimuli and that synchronization ability decreases with age, with males performing better than females—at least as far as younger users are concerned. As a result, two implications emerge: first, users are capable of focusing only on visual rhythm ignoring the auditory interfering rhythm, so listening to an interfering rhythm should not be a major concern for synchronization; second, as age and gender have an impact, these systems may be designed to allow for customization of rhythm speed so that different users can choose the speed that best suits their reflex skills.
... matched by the user to select them [Velloso et al. 2017]. Velloso et al. refer to this principle as Motion Correlation [Velloso et al. 2017], encompassing techniques based on a wide range of devices, including mice [Fekete et al. 2009; Williamson and Murray-Smith 2004], accelerometers [Verweij et al. 2017b], depth cameras, and webcams [Clarke et al. 2016]. ...
... Because users are able to match movements using many body parts, previous works have shown applications that go beyond hand gestures. In SmoothMoves, Esteves et al. tracked the user's head movements using an augmented reality headset to match the motions being displayed in the AR interface. Clarke et al. abstracted from the body part that is matching the movement by tracking any matching body part using a webcam [Clarke et al. 2016]. The authors showed that the technique can work with movements ranging from a hand motion while holding a cup of tea to feet movements. ...
Conference Paper
Full-text available
Recently, interaction techniques in which the user selects screen targets by matching their movement with the input device have been gaining popularity, particularly in the context of gaze interaction (e.g. Pursuits, Orbits, AmbiGaze, etc.). However, though many algorithms for enabling such interaction techniques have been proposed, we still lack an understanding of how they compare to each other. In this paper, we introduce two new algorithms for matching eye movements: Profile Matching and 2D Correlation, and present a systematic comparison of these algorithms with two other state-of-the-art algorithms: the Basic Correlation algorithm used in Pursuits and the Rotated Correlation algorithm used in PathSync. We also examine the effects of two thresholding techniques and post-hoc filtering. We evaluated the algorithms on a user dataset and found the 2D Correlation with one-level thresholding and post-hoc filtering to be the best-performing algorithm.
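As an illustration of what such a matcher can look like, the sketch below pairs a per-window 2D correlation (x and y correlated separately against each target and combined conservatively) with one-level thresholding and a simple post-hoc filter that requires the same target to win several consecutive windows. The threshold, window length, and filter length are illustrative assumptions, not the values or exact algorithms evaluated in the paper.

```python
import numpy as np

def corr(a, b):
    """Pearson correlation of two 1-D signals."""
    a, b = a - a.mean(), b - b.mean()
    d = np.sqrt((a * a).sum() * (b * b).sum())
    return 0.0 if d == 0 else float((a * b).sum() / d)

def window_match(gaze, targets, threshold=0.7):
    """One-level thresholding: a target matches a window only if both its x and y
    correlations with the gaze signal exceed a single threshold."""
    best, best_score = None, threshold
    for i, t in enumerate(targets):
        score = min(corr(gaze[:, 0], t[:, 0]), corr(gaze[:, 1], t[:, 1]))
        if score >= best_score:
            best, best_score = i, score
    return best

def detect(gaze, targets, win=30, consecutive=3):
    """Post-hoc filtering: report a selection only after the same target wins
    several consecutive sliding windows."""
    streak, last = 0, None
    for start in range(0, len(gaze) - win + 1):
        w = slice(start, start + win)
        m = window_match(gaze[w], [t[w] for t in targets])
        streak = streak + 1 if (m is not None and m == last) else (1 if m is not None else 0)
        last = m
        if m is not None and streak >= consecutive:
            return m
    return None
```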
... There is an active community exploring viable modalities for head-mounted displays (HMDs) including on-headset touch [36], mid-air hand input [23] and the use of dedicated wearable peripherals such as gloves [12] or belts [8]. Within this space, we argue that input from movements of the eyes [35] and head [3] are particularly practical and appealing: in such scenarios, hands remain free and all sensing can be integrated into the headset. ...
... Other authors have proposed the use of head tracking in mobile contexts to provide gestural input in the form of head tilting [5] and nodding [24]. Furthermore, studies on smart TVs have explored the use of off-the-shelf webcams to capture head motion during smooth pursuits [3]. Finally, while rigorous studies are presently lacking, recent work has proposed achieving head-based input during pursuits tracking by monitoring VOR movements [7]. ...
Conference Paper
Full-text available
SmoothMoves is an interaction technique for augmented reality (AR) based on smooth pursuits head movements. It works by computing correlations between the movements of on-screen targets and the user's head while tracking those targets. The paper presents three studies. The first suggests that head-based input can act as an easier and more affordable surrogate for eye-based input in many smooth pursuits interface designs. A follow-up study grounds the technique in the domain of augmented reality, and captures the error rates and acquisition times on different types of AR devices: head-mounted (2.6%, 1965 ms) and hand-held (4.9%, 2089 ms). Finally, the paper presents an interactive lighting system prototype that demonstrates the benefits of using smooth pursuits head movements in interaction with AR interfaces. A final qualitative study reports on positive feedback regarding the technique's suitability for this scenario. Together, these results show that SmoothMoves is viable, efficient, and immediately available for a wide range of wearable devices that feature embedded motion sensing.
... The principal motivation for TraceMatch is to enable users to select a displayed control with minimal effort (a small circular movement) and maximum flexibility (freedom to perform the movement in ways that are convenient in any given situation). Previous work has laid a foundation for the technique with the introduction of a computer vision system for detection and matching of movement that corresponds with presented Orbits [7]. [Figure: the first matched feature's trajectory and a circle fitted to it using RANSAC with inlier thresholds.] ...
... This suggests that the participants were following the motion of the target, but their movements were not circular enough to pass the circle-fitting stage of the matching process, i.e. the movements were elliptical. This effect is further exaggerated because fast Orbits require more restrictive parameters than slow Orbits to avoid accidental matching with background movements [7]. ...
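A hedged sketch of the circle-fitting gate these excerpts refer to is given below: a circle is fitted to a tracked feature's 2D trajectory with RANSAC, and the trajectory is accepted only if enough points lie within an inlier threshold of the fitted circle. The iteration count, tolerance, and inlier ratio are illustrative assumptions rather than TraceMatch's published parameters.

```python
import numpy as np

def circle_from_3(p1, p2, p3):
    """Circle (cx, cy, r) through three points, or None if they are nearly collinear."""
    ax, ay = p1; bx, by = p2; cx_, cy_ = p3
    d = 2 * (ax * (by - cy_) + bx * (cy_ - ay) + cx_ * (ay - by))
    if abs(d) < 1e-9:
        return None
    ux = ((ax**2 + ay**2) * (by - cy_) + (bx**2 + by**2) * (cy_ - ay)
          + (cx_**2 + cy_**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx_ - bx) + (bx**2 + by**2) * (ax - cx_)
          + (cx_**2 + cy_**2) * (bx - ax)) / d
    return ux, uy, float(np.hypot(ax - ux, ay - uy))

def ransac_circle(points, iters=200, inlier_tol=3.0, min_inlier_ratio=0.8):
    """Fit a circle to a 2-D trajectory; accept only if enough points are inliers."""
    pts = np.asarray(points, dtype=float)
    rng = np.random.default_rng(0)
    best_circle, best_inliers = None, 0
    for _ in range(iters):
        sample = pts[rng.choice(len(pts), 3, replace=False)]
        circle = circle_from_3(*sample)
        if circle is None:
            continue
        cx, cy, r = circle
        dist = np.abs(np.hypot(pts[:, 0] - cx, pts[:, 1] - cy) - r)
        inliers = int((dist < inlier_tol).sum())
        if inliers > best_inliers:
            best_circle, best_inliers = circle, inliers
    return best_circle if best_inliers / len(pts) >= min_inlier_ratio else None
```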
Article
In this work we consider how users can use body movement for remote control with minimal effort and maximum flexibility. TraceMatch is a novel technique where the interface displays available controls as circular widgets with orbiting targets, and where users can trigger a control by mimicking the displayed motion. The technique uses computer vision to detect circular motion as a uniform type of input, but is highly appropriable as users can produce matching motion with any part of their body. We present three studies that investigate input performance with different parts of the body, user preferences, and spontaneous choice of movements for input in realistic application scenarios. The results show that users can provide effective input with their head, hands and while holding objects, that multiple controls can be effectively distinguished by the difference in presented phase and direction of movement, and that users choose and switch modes of input seamlessly.
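The abstract's point that multiple controls can be distinguished by phase and direction can be pictured with a small sketch that animates several orbiting targets, giving each a distinct starting phase and alternating the direction of rotation; the layout, radius, and period below are illustrative assumptions.

```python
import numpy as np

def orbit_positions(t, n_controls, period=2.0, radius=40, centers=None):
    """Positions of n orbiting targets at time t (s), distinguished by
    phase offset and alternating direction of rotation."""
    if centers is None:
        centers = [(100 + 120 * i, 100) for i in range(n_controls)]  # placeholder layout
    out = []
    for i, (cx, cy) in enumerate(centers):
        direction = 1 if i % 2 == 0 else -1            # alternate rotation direction
        phase = 2 * np.pi * i / n_controls             # spread starting phases
        angle = direction * 2 * np.pi * t / period + phase
        out.append((cx + radius * np.cos(angle), cy + radius * np.sin(angle)))
    return out

# Sample three controls at 30 Hz for one second of animation.
frames = [orbit_positions(k / 30, n_controls=3) for k in range(30)]
```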
... For example, to focus on a specific data subset, the analyzer just steers their attention toward a specific vowel being sung by a tenor voice, thus perceptualizing data points as if they were notes of a vocal counterpoint. In a future experiment, the orbit-synchronization object selection paradigm [57], where one of a few orbiting displays is selected by a synchronized motion gesture, will be adopted and tested in the audio domain. Here, a few streams, each associated with a voice type, orbit circularly in a space of vowel-pitch, spanning a range of pitches and interpolating between two vowels. ...
Article
Full-text available
When designing displays for the human senses, perceptual spaces are of great importance to give intuitive access to physical attributes. Similar to how perceptual spaces based on hue, saturation, and lightness were constructed for visual color, research has explored perceptual spaces for sounds of a given timbral family based on timbre, brightness, and pitch. To promote an embodied approach to the design of auditory displays, we introduce the Vowel–Type–Pitch (VTP) space, a cylindrical sound space based on human sung vowels, whose timbres can be synthesized by the composition of acoustic formants and can be categorically labeled. Vowels are arranged along the circular dimension, while voice type and pitch of the vowel correspond to the remaining two axes of the cylindrical VTP space. The decoupling and perceptual effectiveness of the three dimensions of the VTP space are tested through a vowel labeling experiment, whose results are visualized as maps on circular slices of the VTP cylinder. We discuss implications for the design of auditory and multi-sensory displays that account for human perceptual capabilities.
... The literature contains a wealth of examples that successfully employ the principle, including for gaze interaction with smart watches [10,11], public displays [18,32], virtual reality [17], and smart homes [30]; for manual control in body-based games [2,6] and smart TVs [3,4,31]; for head control of head-up displays [12]; for one-handed ring-based input [36]; for menu selection with mice [13,35]; among others. ...
Preprint
Full-text available
Motion correlation interfaces are those that present targets moving in different patterns, which the user can select by matching their motion. In this paper, we re-formulate the task of target selection as a probabilistic inference problem. We demonstrate that previous interaction techniques can be modelled using a Bayesian approach and show how modelling the selection task as transmission of information can help us make explicit the assumptions behind similarity measures. We propose ways of incorporating uncertainty into the decision-making process and demonstrate how the concept of entropy can illuminate the measurement of the quality of a design. We apply these techniques in a case study and suggest guidelines for future work.
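One way to picture the Bayesian framing sketched in this abstract (not the authors' model; the Gaussian error widths and the explicit "no target" hypothesis below are assumptions) is to keep a posterior over candidate targets, update it with each input sample according to how well the user's velocity matches each target's, and read the posterior entropy as a measure of how confident a selection would be.

```python
import numpy as np

def gaussian_loglik(residual, sigma):
    """Log-likelihood of a 2-D residual under an isotropic Gaussian error model."""
    return -0.5 * np.sum(residual ** 2) / sigma ** 2 - np.log(2 * np.pi * sigma ** 2)

def update_posterior(log_post, user_vel, target_vels, sigma=20.0, sigma_null=200.0):
    """One Bayesian update: score the user's velocity sample against each target's
    velocity; the final hypothesis is 'no target', with a much broader error model."""
    logliks = [gaussian_loglik(user_vel - tv, sigma) for tv in target_vels]
    logliks.append(gaussian_loglik(user_vel, sigma_null))
    log_post = log_post + np.array(logliks)
    return log_post - np.logaddexp.reduce(log_post)    # renormalise in log space

def entropy_bits(log_post):
    """Posterior entropy in bits; low entropy indicates a confident selection."""
    p = np.exp(log_post)
    return float(-(p * np.log2(np.clip(p, 1e-12, 1.0))).sum())

# Usage sketch: uniform prior over k targets plus the null hypothesis, updated
# once per frame with 2-D velocity samples (numpy arrays of shape (2,)).
k = 3
log_post = np.full(k + 1, -np.log(k + 1))
```

A selection could then be triggered when one target's posterior mass exceeds a threshold, or deferred while the entropy remains high.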
... Users can also synchronize their head, hand, and finger movements for target selection. The movements of different body parts can be recorded by eye tracking apparatus [14], cameras [8,11,12], or IMUs [15,31,39,48]. Spatial synchronous selection techniques differentiate targets by their movement traces instead of positions, which alleviates the 'Midas Touch' issue of pointing. ...
Article
Temporal synchronous target selection is an association-free selection technique: users select a target by generating signals (e.g., finger taps and hand claps) in sync with its unique temporal pattern. However, the classical pattern set design and input recognition algorithms of such techniques did not leverage users' behavioral information, which limits their robustness to imprecise inputs. In this paper, we improve these two key components by modeling users' interaction behavior. In the first user study, we asked users to tap a finger in sync with blinking patterns with various periods and delays, and modeled their finger tapping ability using a Gaussian distribution. Based on the results, we generated pattern sets for up to 22 targets that minimized the possibility of confusion due to imprecise inputs. In the second user study, we validated that the optimized pattern sets could reduce the error rate from 23% to 7% for the classical Correlation recognizer. We also tested a novel Bayesian recognizer, which achieved higher selection accuracy than the Correlation recognizer when the input sequence is short. Informal evaluation results show that the selection technique can be effectively scaled to different modalities and sensing techniques.
... From a system design perspective, the criteria for defining a smooth pursuits match can be reduced to a similarity-based classification problem. Conventionally, determining smooth pursuit matching criteria is non-trivial and often necessitates optimisation of a combination of different thresholds and window sizes, all of which could be affected by the properties of the target's trajectory and sensor characteristics [5,6,10]. The choice of parameters should ensure that users are able to select objects in a timely fashion, whilst being robust to incorrect selections due to natural eye movements, also known as the Midas touch [16]. ...
Conference Paper
Full-text available
In 3D environments, objects can be difficult to select when they overlap, as this affects available target area and increases selection ambiguity. We introduce Outline Pursuits which extends a primary pointing modality for gaze-assisted selection of occluded objects. Candidate targets within a pointing cone are presented with an outline that is traversed by a moving stimulus. This affords completion of the selection by gaze attention to the intended target's outline motion, detected by matching the user's smooth pursuit eye movement. We demonstrate two techniques implemented based on the concept, one with a controller as the primary pointer, and one in which Outline Pursuits are combined with head pointing for hands-free selection. Compared with conventional raycasting, the techniques require less movement for selection as users do not need to reposition themselves for a better line of sight, and selection time and accuracy are less affected when targets become highly occluded.
... Furthermore, researchers and practitioners could apply and evaluate the slope method beyond gaze, e.g. motion matching for body movements [4][5][6] and mid-air gestures [2]. ...
Conference Paper
Full-text available
In this paper we introduce a novel approach for smooth pursuits eye movement detection and demonstrate that it allows up to 160 targets to be distinguished. With this work we advance the well-established smooth pursuits technique, which allows gaze interaction without calibration. The approach is valuable for researchers and practitioners, since it enables novel user interfaces and applications to be created that employ a large number of targets, for example, a pursuits-based keyboard or a smart home where many different objects can be controlled using gaze. We present findings from two studies. In particular, we compare our novel detection algorithm based on linear regression with the correlation method. We quantify its accuracy for around 20 targets on a single circle and up to 160 targets on multiple circles. Finally, we implemented a pursuits-based keyboard app with 108 targets as proof-of-concept.
... In our implementation, we employed a KLT tracker [8] to evaluate the movements of the interest points. Clarke et al. [1] proposed TraceMatch, which used motion correlation between hand movements and GUIs. They also used the KLT tracker to compute hand movements, but they did not compute the motion vector of target objects. ...
Conference Paper
User calibration is a significant problem in eye-based interaction. To overcome this, several solutions, such as the calibration-free method and implicit user calibration, have been proposed. Pursuits-based interaction is another such solution that has been studied for public screens and virtual reality. It has been applied to select graphical user interfaces (GUIs) because the movements in a GUI can be designed in advance. Smooth pursuit eye movements (smooth pursuits) occur when a user looks at objects in the physical space as well and thus, we propose a method to identify the focused object by using smooth pursuits in the real world. We attempted to determine the focused objects without prior information under several conditions by using the pursuits-based approach and confirmed the feasibility and limitations of the proposed method through experimental evaluations.
... Some synchronous motion eye interfaces take advantage of the natural tendency for users' eyes to track moving targets of interest [7,8,31,35,36]. Selection of both virtual UI elements and real-world objects can be performed by tracking body-based synchronous gestures as well [3,4]. Finally, synchronous gestures have recently been explored as hand gestures for subtle control and one-handed smartwatch input [9,24,38,39]. ...
Conference Paper
SelfSync enables rapid, robust initiation of a gesture interface using synchronized movement of different body parts. SelfSync is the gestural equivalent of a hotword such as "OK Google" in a speech interface and is enabled by the increasing trend of users wearing two or more wearables, such as a smartwatch, wireless earbuds, or a smartphone. In a user study comparing five potential SelfSync gestures in isolation, our system averages 96%, 98%, and 88% for user-dependent, user-adapted, and user-independent accuracy, respectively. When the user has a phone in a pocket and a smartwatch, we suggest twisting the hand about the wrist while moving the leg with the phone in synchrony left and right. When the user has a head-worn device and a smartwatch, we suggest twisting the hand while twisting the head left and right.
... Optical flow is widely used in motion detection [Yin and Shi, 2018], action recognition in static [Gao et al., 2018] and video [Choutas et al., 2018] images, 3D tracking [Wilson and Benko, 2014], etc. A close work to EyeFlow is TraceMatch [Clarke et al., 2016], which used hand or head movements to simulate remote control interaction. Features are detected using the FAST feature detector, in contrast to our approach, which uses the overall indicative motion of the eye. ...
Conference Paper
We investigate smooth pursuit eye movement based interaction using an unmodified off-the-shelf RGB camera. In each pair of sequential video frames, we compute the indicative direction of the eye movement by analyzing flow vectors obtained using the Lucas-Kanade optical flow algorithm. We discuss how carefully selected flow vectors could replace the traditional pupil center detection in smooth pursuit interaction. We examine implications of unused features in the eye camera imaging frame as potential elements for detecting gaze gestures. This simple approach is easy to implement and abstains from many of the complexities of pupil-based approaches. In particular, EyeFlow does not call for either a 3D pupil model or 2D pupil detection to track the pupil center location. We compare this method to state-of-the-art approaches and find that it can enable pursuit interactions with standard cameras. Results from an evaluation with data from 12 users yield an accuracy that compares to previous studies. In addition, the benefit of this work is that the approach does not necessitate highly matured computer vision algorithms and expensive IR-pass cameras.
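A minimal sketch of this kind of pipeline, assuming OpenCV and pre-cropped grayscale eye-region frames (the corner count, window size, and other parameters are illustrative): corner features are tracked between consecutive frames with pyramidal Lucas-Kanade, and the surviving flow vectors are averaged to obtain an indicative movement direction per frame.

```python
import cv2
import numpy as np

def indicative_flow(prev_gray, gray, max_corners=50):
    """Average Lucas-Kanade flow vector between two grayscale eye-region frames;
    its direction approximates the direction of the eye movement."""
    p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=max_corners,
                                 qualityLevel=0.01, minDistance=5)
    if p0 is None:
        return None
    p1, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, p0, None,
                                                winSize=(15, 15), maxLevel=2)
    good = status.reshape(-1) == 1
    if not good.any():
        return None
    flow = (p1 - p0).reshape(-1, 2)[good]
    return flow.mean(axis=0)          # (dx, dy) for this frame pair

# Usage idea: accumulate per-frame flow vectors into a trajectory and correlate
# that trajectory with each on-screen target's motion, as in the other sketches above.
```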
... → touching a visual proxy next to device [311]
→ touching around the device using camera/projector [362]
→ touching and gesturing around the display using a head-mounted display [109]
→ gesturing on an extended projected display [57]
→ augmenting mouse, touch and keyboard input around the device [23]
→ using reconfigurable projection-based multi-display environments [38]
Interact with content through motion on 2D surface:
→ rotating device [143,270,288]
→ moving to explore map [90,270]
→ moving device to explore information visualizations, e.g., graphs [367]
Interact through 3D motion with device:
→ moving in 3D space [321]
→ pour content into another device [159]
→ throwing to display [69]
→ tilting to pan map on display [69]
→ tilting to share photos [190]
→ face-to-mirror onto display [206]
→ motion correlation [65]
→ rotation [188]
→ synchronous gestures [127,272]
→ throwing, chucking [122]
→ mid-air pointing [185]
→ holding in place [312]
→ double bump zoom out [54]
→ touch-project [28]
→ shoot and copy [27]
→ hold device in air to receive [191]
→ physical proxy [68,107]
→ tilt towards self to take [191]
→ pan-and-zoom ...
Conference Paper
Designing interfaces or applications that move beyond the bounds of a single device screen enables new ways to engage with digital content. Research addressing the opportunities and challenges of interactions with multiple devices in concert is of continued focus in HCI research. To inform the future research agenda of this field, we contribute an analysis and taxonomy of a corpus of 510 papers in the cross-device computing domain. For both new and experienced researchers in the field we provide: an overview, historic trends and unified terminology of cross-device research; discussion of major and under-explored application areas; mapping of enabling technologies; synthesis of key interaction techniques spanning across multiple devices; and review of common evaluation strategies. We close with a discussion of open issues. Our taxonomy aims to create a unified terminology and common understanding for researchers in order to facilitate and stimulate future cross-device research.
... Interaction with Wattom is built around motion matching [20,34,36], a novel interaction paradigm where users interact with the system by tracking its different animations with their hands (tracked through the motion sensors embedded on most wrist-worn wearable devices). Compared to other mid-air and embodied interaction techniques [5,14,27], motion matching does not require gesture discovery and memorization [11]; and our implementation is not affected by the limitations of optical tracking, such as a limited field-of-view (FOV), occlusion and changing lighting conditions, and well-known privacy concerns when used in the context of the home [8]. This paper provides the following contributions. ...
Conference Paper
This paper presents Wattom, a highly interactive ambient eco-feedback smart plug that aims to support a more sustainable use of electricity by being tightly coupled to users' energy-related activities. We describe three use cases of the system: using Wattom to power connected appliances and understand the environmental impact of their use in real time; scheduling these power events; and presenting users with personal consumption data disaggregated by device. We conclude with a user study in which the effectiveness of the plug's novel interactive capabilities is assessed (mid-air, hand-based motion matching). The study explores the effectiveness of Wattom and motion matching input in a realistic setup, where the user is not always directly ahead of the interface and not always willing to point straight at the device (e.g., when the plug is at an uncomfortable angle). Despite not using a graphical display, our results demonstrate that our motion matching implementation was effective, in line with previous work, and that participants' pointing angle did not significantly affect their performance. On the other hand, participants were more effective while pointing straight at Wattom, but reported that they did not find this significantly more strenuous than pointing to a comfortable position of their choice.
... Recent research has explored various motion correlation [34] techniques to select objects by correlating the user's movements with moving objects on a distal display. The spatial correlation is calculated between the moving path of the user's gaze [11,37], hands [5,36], head and hand-held object [8,9] and that of the displayed objects. The object is selected when the correlation coefficient between the two paths is high. ...
Article
Ad-hoc wireless device pairing enables impromptu interactions in smart spaces, such as resource sharing and remote control. The pairing experience is mainly determined by the device association process, during which users express their pairing intentions between the advertising device and the scanning device. Currently, most wireless devices are associated by selecting the advertiser's name from a list displayed on the scanner's screen, which becomes less efficient and often misplaced as the number of wireless devices increases. In this paper, we propose Tap-to-Pair, a spontaneous device association mechanism that initiates pairing from advertising devices without hardware or firmware modifications. Tapping an area near the advertising device's antenna can change its signal strength. Users can then associate two devices by synchronizing taps on the advertising device with the blinking pattern displayed by the scanning device. By leveraging the wireless transceiver for sensing, Tap-to-Pair does not require additional resources from advertising devices and needs only a binary display (e.g. LED) on scanning devices. We conducted a user study to test users' synchronous tapping ability and demonstrated that Tap-to-Pair can reliably detect users' taps. We ran simulations to optimize parameters for the synchronization recognition algorithm and provide pattern design guidelines. We used a second user study to evaluate the on-chip performance of Tap-to-Pair. The results show that Tap-to-Pair can achieve an overall successful pairing rate of 93.7% with three scanning devices at different distances.
... In contrast to gaze gestures [5], smooth pursuits offer a target to the user and following this target with the eyes feels more natural than performing a gaze gesture. The HCI community continued research on smooth pursuits and explored applications [3,8,14,19,23,25]. However, despite much research on pursuits, there are still many open questions that become particularly relevant as smooth pursuits find their way into novel application areas. ...
Conference Paper
Full-text available
In this paper we present an investigation of how the speed and trajectory of smooth pursuits targets impact on detection rates in gaze interfaces. Previous work optimized these values for the specific application for which smooth pursuit eye movements were employed. However, this may not always be possible. For example UI designers may want to minimize distraction caused by the stimulus, integrate it with a certain UI element (e.g., a button), or limit it to a certain area of the screen. In these cases an in-depth understanding of the interplay between speed, trajectory, and accuracy is required. To achieve this, we conducted a user study with 15 participants who had to follow targets with different speeds and on different trajectories using their gaze. We evaluated the data with respect to detectability. As a result, we obtained reasonable ranges for target speeds and demonstrate the effects of trajectory shapes. We show that slow moving targets are hard to detect by correlation and that introducing a delay improves the detection rate for fast moving targets. Our research is complemented by design rules which enable designers to implement better pursuit detectors and pursuit-based user interfaces.
... As using a bare hand for distant pointing and clicking is advantageous when the interface is distant (as with large displays) or even disappearing (as with immersive systems or ubiquitous environments) [3,18], we seek more sophisticated hand tracking technologies for enabling full-hand gesture recognition. Mid-air hand gestures have been widely deployed in controlled environments, such as for virtual reality or immersive systems, in which hand tracking commonly relies on computer vision [4,18-20]. Computer vision-based gesture recognition not only limits users' mobility (users need to perform the gestures within the camera's view) but also suffers from occlusion issues (users need to show the gestures explicitly to the camera). ...
Article
Full-text available
Locating places in cities is typically facilitated by handheld mobile devices, which draw the visual attention of the user on the screen of the device instead of the surroundings. In this research, we aim at strengthening the connection between people and their surroundings through enabling mid-air gestural interaction with real-world landmarks and delivering information through audio to retain users’ visual attention on the scene. Recent research on gesture-based and haptic techniques for such purposes has mainly considered handheld devices that eventually direct users’ attention back to the devices. We contribute a hand-worn, mid-air gestural interaction design with directional vibrotactile guidance for finding points of interest (POIs). Through three design iterations, we address aspects of (1) sensing technologies and the placement of actuators considering users’ instinctive postures, (2) the feasibility of finding and fetching information regarding landmarks without visual feedback, and (3) the benefits of such interaction in a tourist application. In a final evaluation, participants located POIs and fetched information by pointing and following directional guidance, thus realising a vision in which they found and experienced real-world landmarks while keeping their visual attention on the scene. The results show that the interaction technique has comparable performance to a visual baseline, enables high mobility, and facilitates keeping visual attention on the surroundings.
... Synchronous gestures are similar to rhythmic patterns in that they both allow the user to express intent over time, but in the case of synchronous gestures, the stimulus is presented to the user indicating the expected gesture or pattern [11]. Motion correlation has been implemented successfully in many camera-based systems, often allowing for robust, multi-user selection on large displays [3,4,11]. Synchronous gestures have also been explored using smooth pursuit tracking, as the eyes are able to closely follow the motion of the target. ...
Conference Paper
We present SeeSaw, a synchronous gesture interface for commodity smartwatches to support watch-hand only input with no additional hardware. Our algorithm, which uses correlation to determine whether the user is rotating their wrist in synchrony with a tactile and visual prompt, minimizes false-trigger events while maintaining fast input during situational impairments. Results from a 12 person evaluation of the system, used to respond to notifications on the watch during walking and simulated driving, show interaction speeds of 4.0 s - 5.5 s, which is comparable to the swipe-based interface control condition. SeeSaw is also evaluated as an input interface for watches used in conjunction with a head-worn display. A six subject study showed a 95% success rate in dismissing notifications and a 3.57 s mean dismissal time.
... Feature Tracking: In order to detect the displacement of features within a sequence of frames, a pyramidal Kanade-Lucas-Tomasi (KLT) feature tracker [26,36,37], provided by BoofCV [2], is utilized, inspired by Clarke and Gellersen, who used a similar approach for movement-correlated interfaces [5]. For tracker initialization, a predefined number N of KLT features, those with the highest eigenvalues, are marked as active and registered for tracking. ...
Conference Paper
Although gaze-based interaction has been investigated since the 1980s and provides promising concepts to realize cognitive systems and support universal interaction within distributed environments, the main challenges, such as the Midas touch problem [16] or calibration, are still frequent topics of research. In this work, Natural Pursuit Calibration is presented, which is a comfortable, unobtrusive technique enabling ongoing attention detection and eye tracker calibration in an off-screen context. The user is able to perform calibration without a digital user interface, artificial annotation of the environment, or further assistance, by simply following any arbitrary moving target. Due to the characteristics of the calibration process, it can be executed simultaneously with any primary task, without active user participation. A two-stage evaluation process is conducted to (i) optimize parameter settings in a first setup and (ii) compare the accuracy as well as the user acceptance of the proposed procedure to prevailing calibration techniques.
... Researchers could investigate, how quickly users adapt to such interfaces and whether the need to strongly focus on the target decreases over time. Furthermore, researchers and practitioners could apply and evaluate the slope-based method in domains other than gaze, such as motion matching for body movements [4,5,6], and mid-air gestures [2]. ...
Preprint
Full-text available
We introduce and evaluate a novel approach for detecting smooth pursuit eye movements that increases the number of distinguishable targets and is more robust against false positives. Being natural and calibration-free, Pursuits has been gaining popularity in the past years. At the same time, current implementations show poor performance when more than eight on-screen targets are being used, thus limiting its applicability. Our approach (1) leverages the slope of a regression line, and (2) introduces a minimum signal duration that improves both the new and the traditional detection method. After introducing the approach as well as the implementation, we compare it to the traditional correlation-based Pursuits detection method. We tested the approach up to 24 targets and show that, if accepting a similar error rate, nearly twice as many targets can be distinguished compared to state of the art. For fewer targets, accuracy increases significantly. We believe our approach will enable more robust pursuit-based user interfaces, thus making it valuable for both researchers and practitioners.
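Read one way, the slope criterion could look like the sketch below, offered as an interpretation rather than the authors' algorithm (the fit threshold, the slope-consistency band, and the minimum sample count are assumptions): gaze coordinates are regressed on each target's coordinates, a decision is deferred until a minimum signal duration has elapsed, and a target is accepted when the regression fits well on both axes with consistent positive slopes.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares slope and R^2 of y regressed on x."""
    x0, y0 = x - x.mean(), y - y.mean()
    sxx, syy, sxy = (x0 * x0).sum(), (y0 * y0).sum(), (x0 * y0).sum()
    if sxx < 1e-9 or syy < 1e-9:
        return 0.0, 0.0          # degenerate axis (assumes targets move on both axes)
    return float(sxy / sxx), float((sxy * sxy) / (sxx * syy))

def slope_score(gaze, target):
    """How linearly (and consistently) the gaze follows the target on both axes."""
    sx, r2x = fit_line(target[:, 0], gaze[:, 0])
    sy, r2y = fit_line(target[:, 1], gaze[:, 1])
    consistent = sx > 0 and sy > 0 and 0.5 < sx / sy < 2.0   # similar scale on both axes
    return min(r2x, r2y) if consistent else 0.0

def detect(gaze, targets, min_samples=60, r2_threshold=0.8):
    """Decide only after a minimum signal duration, then pick the best-fitting target."""
    if len(gaze) < min_samples:
        return None
    scores = [slope_score(gaze, t) for t in targets]
    best = int(np.argmax(scores))
    return best if scores[best] >= r2_threshold else None
```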
... Once the eye movements are segmented, the next step is to calculate the similarity of each window to a given set of predefined gesture paths. This is inspired by path-mimicry interaction techniques [Carter et al. 2016; Clarke et al. 2016; Esteves et al. 2015], which rely on the Pearson product-moment correlation coefficient (PCC). These methods correlated eye or hand movements to the trajectories of moving objects that were created as part of the user interface. ...
Conference Paper
The eyes are an interesting modality for pervasive interactions, though their applicability to mobile scenarios has so far been restricted by several issues. In this paper, we propose the idea of contour-guided gaze gestures, which overcome former constraints, like the need for calibration, by relying on unnatural and relative eye movements, as users trace the contours of objects in order to trigger an interaction. The interaction concept and the system design are described, along with two user studies that demonstrate the method's applicability. It is shown that users were able to trace object contours to trigger actions from various positions on multiple different objects. It is further determined that the proposed method is an easy-to-learn, hands-free interaction technique that is robust against false positive activations. The results highlight low demand values and show that the method holds potential for further exploration, but also reveal areas for refinement.
... We are also optimistic that the control perspective can inspire designers to propose novel interaction styles. An example of this is interfaces that can infer the user's intent based on detection of control behaviour, as developed in [58] and built on by [9], [21], and [55]. ...
Article
Full-text available
This article presents an empirical comparison of four models from manual control theory on their ability to model targeting behaviour by human users using a mouse: McRuer’s Crossover, Costello’s Surge, second-order lag (2OL), and the Bang-bang model. Such dynamic models are generative, estimating not only movement time, but also pointer position, velocity, and acceleration on a moment-to-moment basis. We describe an experimental framework for acquiring pointing actions and automatically fitting the parameters of mathematical models to the empirical data. We present the use of time-series, phase space, and Hooke plot visualisations of the experimental data, to gain insight into human pointing dynamics. We find that the identified control models can generate a range of dynamic behaviours that captures aspects of human pointing behaviour to varying degrees. Conditions with a low index of difficulty (ID) showed poorer fit because their unconstrained nature leads naturally to more behavioural variability. We report on characteristics of human surge behaviour (the initial, ballistic sub-movement) in pointing, as well as differences in a number of controller performance measures, including overshoot, settling time, peak time, and rise time. We describe trade-offs among the models. We conclude that control theory offers a promising complement to Fitts’ law based approaches in HCI, with models providing representations and predictions of human pointing dynamics, which can improve our understanding of pointing and inform design.
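As a worked illustration of what a generative manual-control model produces, here is a minimal Euler-integrated second-order lag (2OL) responding to a step target; the natural frequency, damping ratio, and time step are illustrative assumptions, not parameters fitted in the article.

```python
import numpy as np

def simulate_2ol(target, duration=1.5, dt=0.001, omega_n=15.0, zeta=0.8, x0=0.0):
    """Second-order lag response of pointer position x(t) to a step target.
    Returns time, position, velocity, and acceleration traces."""
    n = int(duration / dt)
    t = np.arange(n) * dt
    x, v, a = np.empty(n), np.empty(n), np.empty(n)
    x[0], v[0] = x0, 0.0
    for k in range(n - 1):
        a[k] = omega_n ** 2 * (target - x[k]) - 2 * zeta * omega_n * v[k]
        v[k + 1] = v[k] + a[k] * dt
        x[k + 1] = x[k] + v[k] * dt
    a[-1] = omega_n ** 2 * (target - x[-1]) - 2 * zeta * omega_n * v[-1]
    return t, x, v, a

# Step from 0 to 500 px: with zeta < 1 the trace overshoots slightly before settling,
# so the model predicts the full trajectory, not just an endpoint or movement time.
t, x, v, a = simulate_2ol(target=500.0)
print(round(float(x.max()), 1), round(float(x[-1]), 1))   # peak (overshoot) and settled position
```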
... For example, Pursuits [7] explores motion matching by gaze input using an eye-tracker. TraceMatch [2] and PathSync [1] use optical systems to track bodily input, whilst WaveTrace [6] utilizes a smartwatch to track hand movements as input. ...
Conference Paper
Full-text available
Motion matching input, following continuously moving targets by performing bodily movements, offers new interaction possibilities in multiple domains. Unlike optical motion matching input systems, our technique utilizes a smartwatch to record motion data from the users' wrists, providing robust input regardless of lighting conditions or momentary occlusions. We demonstrate an implementation of motion matching input using smartwatches for interactive television, which allows multi-user input via bodily movements and offers new interaction possibilities through a second screen that acts as an extension of the TV display.
Article
We introduce a novel one-handed input technique for mobile devices that is not based on pointing but on motion matching, where users select a target by mimicking its unique animation. Our work is motivated by the findings of a survey (N=201) on current mobile use, from which we identify lingering opportunities for one-handed input techniques. We then expand on current motion matching implementations, previously developed in the context of gaze or mid-air input, so that they take advantage of the affordances of touch-input devices. We validate the technique by characterizing user performance in a standard selection task (N=24), where we report success rates (>95%), selection times (~1.6 s), input footprint, grip stability, usability, and subjective workload, in both phone and tablet conditions. Finally, we present a design space that illustrates six ways in which motion matching can be embedded into mobile interfaces via a camera prototype application.
Article
Much has changed in the landscape of wearables research since the first International Symposium on Wearable Computers (ISWC) was organized in 1997. The authors, many of whom were active in this community since the beginning, reflect now 25 years later on the role of the conference, emerging research methods, the devices, and ideas that have stood the test of time—such as fitness/health sensors or augmented reality devices—as well as the ones that can be expected still to come, like everyday head-worn displays.
Article
SynchroWatch is a one-handed interaction technique for smartwatches that uses rhythmic correlation between a user's thumb movement and on-screen blinking controls. Our technique uses magnetic sensing to track the synchronous extension and reposition of the thumb, augmented with a passive magnetic ring. The system measures the relative changes in the magnetic field induced by the required thumb movement and uses a time-shifted correlation approach with a reference waveform for detection of synchrony. We evaluated the technique during three distraction tasks with varying degrees of hand and finger movement: active walking, browsing on a computer, and relaxing while watching online videos. Our initial offline results suggest that intentional synchronous gestures can be distinguished from other movement. A second evaluation using a live implementation of the system running on a smartwatch suggests that this technique is viable for gestures used to respond to notifications or issue commands. Finally, we present three demonstration applications that highlight the technique running in real-time on the smartwatch.
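The time-shifted correlation step can be sketched as follows. The window length, maximum lag, and acceptance threshold are illustrative assumptions rather than the values used by SynchroWatch.

# Sketch of time-shifted correlation against a reference waveform.
# signal should be at least len(reference) + max_lag samples long.
import numpy as np

def synchrony_score(signal, reference, max_lag=10):
    n = len(reference)
    best = -1.0
    for lag in range(max_lag + 1):
        segment = signal[lag:lag + n]
        if len(segment) < n:
            break
        r = np.corrcoef(segment, reference)[0, 1]
        best = max(best, r)
    return best  # accept the gesture if the score exceeds a tuned threshold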
Conference Paper
Current freehand interactions with large displays rely on point & select as the dominant paradigm. However, constant in-air hand movement for pointer navigation quickly leads to hand fatigue. We introduce summon & select, a new model for freehand interaction where, instead of navigating to the control, the user summons it into focus and then manipulates it. Summon & select solves the problems of constant pointer navigation, the need for precise selection, and out-of-bounds gestures that plague point & select. We describe the design and conduct two studies to evaluate it and compare it against point & select in a multi-button selection study. The results show that summon & select is significantly faster and has lower physical and mental demand than point & select.
Conference Paper
Full-text available
Today's environments are populated with a growing number of electric devices which come in diverse form factors and provide a plethora of functions. However, rich interaction with these devices can become challenging if they need to be controlled from a distance or are too small to accommodate user interfaces of their own. In this work, we explore PICOntrol, a new approach that uses an off-the-shelf handheld pico projector for direct control of physical devices through visible light. The projected image serves a dual purpose by simultaneously presenting a visible interface to the user and transmitting embedded control information to inexpensive sensor units integrated with the devices. To use PICOntrol, the user points the handheld projector at a target device, overlays a projected user interface on its sensor unit, and performs various GUI-style or gestural interactions. PICOntrol enables direct, visible, and rich interactions with various physical devices without requiring central infrastructure. We present our prototype implementation as well as explorations of its interaction space through various application examples.
Conference Paper
Full-text available
A 'Universal Remote Console' (URC) is a personal device that can be used to control any electronic and information technology device (target device/service), such as thermostats, TVs, or copy machines. The URC renders the user interface (UI) of the target device in a way that accommodates the user's preferences and abilities. This paper introduces the efforts of user groups, industry, government and academia to develop a standard for 'Alternate Interface Access' within the V2 technical committee of the National Committee for Information Technology Standards (NCITS). Some preliminary design aspects of the standard under development are discussed briefly.
Conference Paper
Full-text available
We present a novel method called motion-pointing for selecting a set of visual items such as push-buttons without actually pointing to them. Instead, each potential target displays a rhythmically animated point we call the driver. To select a specific item, the user only has to imitate the motion of its driver using the input device. Once the motion has been recognized by the system, the user can confirm the selection to trigger the action. We consider cyclic motions on an elliptic trajectory with a specific period, and study the most effective methods for real-time matching of such a trajectory, as well as the range of parameters a human can reliably reproduce. We then show how to implement motion-pointing in real applications using an interaction technique we call move-and-stroke. Finally, we measure the throughput and error rate of move-and-stroke in a controlled experiment. We show that the selection time is linearly proportional to the number of input bits conveyed up to 6 bits, confirming that motion-pointing is a practical input method.
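The animation of such a driver can be generated from a parametric ellipse. The sketch below is a hypothetical illustration of the idea; the function name and parameters are ours, not from the cited paper.

# Position of one elliptic "driver" at time t (seconds). Each selectable
# item would be assigned a distinct period and/or phase offset.
import math

def driver_position(t, period, cx, cy, rx, ry, phase=0.0):
    angle = 2.0 * math.pi * (t / period) + phase
    return cx + rx * math.cos(angle), cy + ry * math.sin(angle)

Matching the input device's motion to one of these drivers can then reuse a correlation-based test such as the ones sketched earlier in this listing.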
Conference Paper
Full-text available
Powerful mobile devices with minimal I/O capabilities increase the likelihood that we will want to annex these devices to I/O resources we encounter in the local environment. This opportunistic annexing will require authentication. We present a sensor-based authentication mechanism for mobile devices that relies on physical possession instead of knowledge to setup the initial connection to a public terminal. Our solution provides a simple mechanism for shaking a device to authenticate with the public infrastructure, making few assumptions about the surrounding infrastructure while also maintaining a reasonable level of security.
Conference Paper
Full-text available
PhoneTouch is a novel technique for the integration of mobile phones and interactive surfaces. The technique enables the use of phones to select targets on the surface by direct touch, facilitating, for instance, pick&drop-style transfer of objects between phone and surface. The technique is based on separate detection of phone touch events by the surface, which determines the location of the touch, and by the phone, which contributes device identity. The device-level observations are merged based on correlation in time. We describe a proof-of-concept implementation of the technique, using vision for touch detection on the surface (including discrimination of finger versus phone touch) and acceleration features for detection by the phone.
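The temporal merging of device-level observations can be sketched as a simple timestamp match. The 50 ms window and the event data layout below are placeholder assumptions for illustration only.

# Sketch: attribute each surface touch to the phone whose own touch
# detection is closest in time, within a small tolerance window.
def merge_touch_events(surface_events, phone_events, window_s=0.05):
    # surface_events: list of (timestamp, x, y)
    # phone_events: list of (timestamp, phone_id)
    matches = []
    for ts, x, y in surface_events:
        candidates = [(abs(tp - ts), pid)
                      for tp, pid in phone_events if abs(tp - ts) <= window_s]
        if candidates:
            _delta, pid = min(candidates)
            matches.append((pid, x, y))
    return matches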
Article
Full-text available
A challenge in facilitating spontaneous mobile interactions is to provide pairing methods that are both intuitive and secure. Simultaneous shaking is proposed as a novel and easy-to-use mechanism for pairing of small mobile devices. The underlying principle is to use common movement as a secret that the involved devices share for mutual authentication. We present two concrete methods, ShaVe and ShaCK, in which sensing and analysis of shaking movement is combined with cryptographic protocols for secure authentication. ShaVe is based on initial key exchange followed by exchange and comparison of sensor data for verification of key authenticity. ShaCK, in contrast, is based on matching features extracted from the sensor data to construct a cryptographic key. The classification algorithms used in our approach are shown to robustly separate simultaneous shaking of two devices from other concurrent movement of a pair of devices, with a false negative rate of under 12 percent. A user study confirms that the method is intuitive and easy to use, as users can shake devices in an arbitrary pattern.
Conference Paper
Full-text available
Where feature points are used in real-time frame-rate applications, a high-speed feature detector is necessary. Feature detectors such as SIFT (DoG), Harris and SUSAN are good methods which yield high-quality features; however, they are too computationally intensive for use in real-time applications of any complexity. Here we show that machine learning can be used to derive a feature detector which can fully process live PAL video using less than 7% of the available processing time. By comparison, neither the Harris detector (120%) nor the detection stage of SIFT (300%) can operate at full frame rate. Clearly, a high-speed detector is of limited use if the features produced are unsuitable for downstream processing. In particular, the same scene viewed from two different positions should yield features which correspond to the same real-world 3D locations [1]. Hence the second contribution of this paper is a comparison of corner detectors based on this criterion applied to 3D scenes. This comparison supports a number of claims made elsewhere concerning existing corner detectors. Further, contrary to our initial expectations, we show that despite being principally constructed for speed, our detector significantly outperforms existing feature detectors according to this criterion.
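Off-the-shelf implementations of this class of detector are widely available today. The snippet below runs OpenCV's FAST detector on a single grayscale frame; the file name and threshold are placeholders, and this is not the cited paper's original code.

# Example use of OpenCV's FAST corner detector on one frame.
import cv2

img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)
fast = cv2.FastFeatureDetector_create(threshold=25, nonmaxSuppression=True)
keypoints = fast.detect(img, None)
print(f"Detected {len(keypoints)} corners")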
Conference Paper
Full-text available
Image registration finds a variety of applications in computer vision. Unfortunately, traditional image registration techniques tend to be costly. We present a new image registration technique that makes use of the spatial intensity gradient of the images to find a good match using a type of Newton-Raphson iteration. Our technique is faster because it examines far fewer potential matches between the images than existing techniques. Furthermore, this registration technique can be generalized to handle rotation, scaling and shearing. We show how our technique can be adapted for use in a stereo vision system.
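A widely used descendant of this gradient-based approach is the pyramidal Lucas-Kanade tracker (see the Bouguet reference at the end of this listing). The following OpenCV snippet tracks corner points between two frames; file names and parameter values are placeholders, shown only to illustrate the technique.

# Pyramidal Lucas-Kanade tracking between two grayscale frames.
import cv2

prev = cv2.imread("frame_0.png", cv2.IMREAD_GRAYSCALE)
curr = cv2.imread("frame_1.png", cv2.IMREAD_GRAYSCALE)

# Points to track, e.g. Shi-Tomasi corners from the previous frame.
p0 = cv2.goodFeaturesToTrack(prev, maxCorners=200, qualityLevel=0.01,
                             minDistance=7)

p1, status, err = cv2.calcOpticalFlowPyrLK(
    prev, curr, p0, None,
    winSize=(21, 21), maxLevel=3,
    criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))

tracked = p1[status.flatten() == 1]  # keep only successfully tracked points
print(f"Tracked {len(tracked)} of {len(p0)} points")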
Article
Full-text available
We present a method for performing selection tasks based on continuous control of multiple, competing agents who try to determine the user's intentions from their control behaviour without requiring an explicit pointer. The entropy in the selection process decreases in a continuous fashion -- we provide experimental evidence of selection from 500 initial targets. The approach allows adaptation over time to best make use of the multimodal communication channel between the human and the system. This general approach is well suited to mobile and wearable applications, shared displays and security conscious settings.
Conference Paper
Full-text available
A problem in smart environments is the interaction with a myriad of different devices, in particular the selection of which device should be controlled. Typically, different devices require different control tools. In contrast, a generic Point & Click appliance is proposed. To interact with a device in the environment, this generic control appliance is pointed at devices for selection, providing visual feedback to the user, obtains control information from the device, and allows control with the help of a simple user interface. An important issue in smart environments is how to interact with a wide diversity of devices, and one interesting but little explored aspect is how to select devices for interaction. Typical approaches are the use of separate controls for different devices (e.g. separate remote controls for TV and video), selection based on profiling of user action [1], and selection based on context-awareness technology [3]. As an alternative approach...
Conference Paper
Eye tracking offers many opportunities for direct device control in smart environments, but issues such as the need for calibration and the Midas touch problem make it impractical. In this paper, we propose AmbiGaze, a smart environment that employs the animation of targets to provide users with direct control of devices by gaze only through smooth pursuit tracking. We propose a design space of means of exposing functionality through movement and illustrate the concept through four prototypes. We evaluated the system in a user study and found that AmbiGaze enables robust gaze-only interaction with many devices, from multiple positions in the environment, in a spontaneous and comfortable manner.
Conference Paper
In this paper, we present PathSync, a novel, distal and multi-user mid-air gestural technique based on the principle of rhythmic path mimicry; by replicating the movement of a screen-represented pattern with their hand, users can intuitively interact with digital objects quickly, and with a high level of accuracy. We present three studies that each contribute (1) improvements to how correlation is calculated in path-mimicry techniques necessary for touchless interaction, (2) a validation of its efficiency in comparison to existing techniques, and (3) a demonstration of its intuitiveness and multi-user capacity 'in the wild'. Our studies consequently demonstrate PathSync's potential as an immediately legitimate alternative to existing techniques, with key advantages for public display and multi-user applications.
Conference Paper
We introduce Orbits, a novel gaze interaction technique that enables hands-free input on smart watches. The technique relies on moving controls to leverage the smooth pursuit movements of the eyes and detect whether, and at which control, the user is looking. In Orbits, controls include targets that move in a circular trajectory on the face of the watch, and can be selected by following the desired one for a small amount of time. We conducted two user studies to assess the technique's recognition and robustness, which demonstrated how Orbits is robust against false positives triggered by natural eye movements and how it presents a hands-free, high-accuracy way of interacting with smart watches using off-the-shelf devices. Finally, we developed three example interfaces built with Orbits: a music player, a notifications face plate and a missed call menu. Despite relying on moving controls, which are very unusual in current HCI interfaces, these were generally well received by participants in a third and final study.
Conference Paper
Eye gaze is a compelling interaction modality but requires user calibration before interaction can commence. State of the art procedures require the user to fixate on a succession of calibration markers, a task that is often experienced as difficult and tedious. We present pursuit calibration, a novel approach that, unlike existing methods, is able to detect the user's attention to a calibration target. This is achieved by using moving targets, and correlation of eye movement and target trajectory, implicitly exploiting smooth pursuit eye movement. Data for calibration is then only sampled when the user is attending to the target. Because of its ability to detect user attention, pursuit calibration can be performed implicitly, which enables more flexible designs of the calibration task. We demonstrate this in application examples and user studies, and show that pursuit calibration is tolerant to interruption, can blend naturally with applications and is able to calibrate users without their awareness.
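Under stated assumptions, the calibration step can be sketched as follows: gaze samples are only used while the (windowed) correlation with the moving target is high, and a mapping from raw gaze to screen coordinates is then fit by least squares. The affine model, the names, and the data layout are our own illustrative choices, not the authors' implementation.

# Sketch of pursuit-based calibration.
import numpy as np

def fit_calibration(raw_gaze, target, corr_ok):
    # raw_gaze, target: (N, 2) arrays; corr_ok: (N,) boolean mask marking
    # samples where the user was judged to be following the target.
    g = raw_gaze[corr_ok]
    t = target[corr_ok]
    # Augment with a bias column and solve t ~= [g, 1] @ M for M (3x2).
    A = np.hstack([g, np.ones((len(g), 1))])
    M, *_ = np.linalg.lstsq(A, t, rcond=None)
    return M  # apply later as np.hstack([gaze_sample, [1.0]]) @ M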
Conference Paper
We describe the SAWUI architecture by which smartphones can easily show user interfaces for nearby appliances, with no modification or pre-installation of software on the phone, no reliance on cloud services or networking infrastructure, and modest additional hardware in the appliance. In contrast to appliances' physical user interfaces, which are often as simple as buttons, icons and LEDs, SAWUIs leverage smartphones' powerful UI hardware to provide personalized, self-explanatory, adaptive, and localized UIs. To explore the opportunities created by SAWUIs, we conducted a study asking designers to redesign two appliances to include SAWUIs. Task characteristics including frequency, proximity, and complexity were used in deciding whether to place functionality on the physical UI, the SAWUI, or both. Furthermore, results illustrate how, in addition to support for accomplishing tasks, SAWUIs have the potential to enrich human experiences around appliances by increasing user autonomy and supporting better integration of appliances into users' social and personal lives.
Conference Paper
Although gaze is an attractive modality for pervasive interactions, the real-world implementation of eye-based interfaces poses significant challenges, such as calibration. We present Pursuits, an innovative interaction technique that enables truly spontaneous interaction with eye-based interfaces. A user can simply walk up to the screen and readily interact with moving targets. Instead of being based on gaze location, Pursuits correlates eye pursuit movements with objects dynamically moving on the interface. We evaluate the influence of target speed, number and trajectory and develop guidelines for designing Pursuits-based interfaces. We then describe six realistic usage scenarios and implement three of them to evaluate the method in a usability study and a field study. Our results show that Pursuits is a versatile and robust technique and that users can interact with Pursuits-based interfaces without prior knowledge or preparation phase.
Conference Paper
The XWand is a novel wireless sensor package that enables styles of natural interaction with intelligent environments. For example, a user may point the wand at a device and control it using simple gestures. The XWand system leverages the intelligence of the environment to best determine the user's intention. We detail the hardware device, signal processing algorithms to recover position and orientation, gesture recognition techniques, a multimodal (wand and speech) computational architecture and a preliminary user study examining pointing performance under conditions of tracking availability and audio feedback.
Conference Paper
The pRemote ('p' stands for personal) is an alternative input device based on digital pen technology, paper-based interface layouts and text recognition. Compared to standard remote controls, the concept allows the creation of personal interfaces by the user, the use of different templates while running an application, and an alternative means of text input by writing with a pen. In our work we developed a design study of the pRemote, which was evaluated with eight users. The concept was appreciated by almost all of the participants. We exemplified the concept by controlling an already developed media application. The pRemote should establish a basis for further evaluations of different input designs.
Conference Paper
Mobile phones are increasingly becoming ubiquitous computational devices that are almost always available, individually adaptable, and nearly universally connectable (using both wide area and short range communication capabilities). Until Star Trek-like speech interfaces are fully developed, mobile phones seem thus poised to become our main devices for interacting with intelligent spaces and smart appliances, such as buying train passes, operating vending machines, or controlling smart homes (e.g., TVs, stereos, and dishwashers, as well as heating and light). But how much can a mobile phone simplify our everyday interactions before it itself becomes a usability burden? What are the capabilities and limitations of using mobile phones to control smart appliances, i.e., operating things like ATMs or coffee makers that typically do not benefit from remote control? This paper presents a user study investigating the use of a prototypical, mobile phone based interaction system to operate a range of appliances in a number of different task settings. Our results show that mobile devices can greatly simplify appliance operation in exceptional situations, but that the idea of a universal interaction device is less suited for general, everyday appliance control.
Conference Paper
This research explores distributed sensing techniques for mobile devices using synchronous gestures. These are patterns of activity, contributed by multiple users (or one user with multiple devices), which take on a new meaning when they occur together in time, or in a specific sequence in time. To explore this new area of inquiry, this work uses tablet computers augmented with touch sensors and two-axis linear accelerometers (tilt sensors). The devices are connected via an 802.11 wireless network and synchronize their time-stamped sensor data. This paper describes a few practical examples of interaction techniques using synchronous gestures such as dynamically tiling together displays by physically bumping them together, discusses implementation issues, and speculates on further possibilities for synchronous gestures.
Article
Smart home environments have evolved to the point where everyday objects and devices at home can be networked to give the inhabitants new means to control them. Familiar information appliances can be used as user interfaces (UIs) to home functions to achieve a more convenient user experience. This paper reports an ethnographic study of smart home usability and living experience. The purpose of the research was to evaluate three UIs—a PC, a media terminal, and a mobile phone—for smart home environments. The results show two main types of activity patterns, pattern control and instant control, which require different UI solutions. The results suggest that a PC can act as a central unit to control functions for activity patterns that can be planned and determined in advance. The mobile phone, on the other hand, is well suited for instant control. The mobile phone turned out to be the primary and most frequently used UI during the 6-month trial period in the smart apartment.
Article
Thesis (Ph.D.) - University of Glasgow, 2006. Includes bibliographical references.
Logitech Study Shows Multiple Remote Controls Hindering Entertainment Experiences Around the Globe
  • Logitech
Logitech. 2010. Logitech Study Shows Multiple Remote Controls Hindering Entertainment Experiences Around the Globe. Press Release. (Nov 2010). http:
Prototype Implementations for a Universal Remote Console Specification. In CHI '02 Ext. Abstracts on Human Factors in Computing Systems
  • Gottfried Zimmermann
  • Gregg Vanderheiden
  • Al Gilman
Pyramidal implementation of the Lucas Kanade feature tracker. Intel Corporation Microprocessor Research Labs
  • Jean-Yves Bouguet