Article

Using Audio Cues to Support Motion Gesture Interaction on Mobile Devices


Abstract

Motion gestures are an underutilized input modality for mobile interaction despite numerous potential advantages. Negulescu et al. found that the lack of feedback on attempted motion gestures made it difficult for participants to diagnose and correct errors, resulting in poor recognition performance and user frustration. In this article, we describe and evaluate a training and feedback technique, Glissando, which uses audio characteristics to provide feedback on the system's interpretation of user input. This technique verbally confirms correct gestures and notifies users of errors, and additionally provides continuous feedback by manipulating the pitch of distinct musical notes mapped to each of the three spatial axes, conveying both spatial and temporal information.
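As a rough illustration of the mapping described in the abstract, the sketch below assigns one musical note to each axis and bends its pitch continuously with movement along that axis. The note choices (C4, E4, G4), the displacement range, and the function name axis_pitches are illustrative assumptions, not details taken from Glissando itself.

```python
# Illustrative sketch of per-axis pitch feedback: one base note per axis,
# bent continuously by movement along that axis. Note choices, ranges and
# the function name are assumptions, not Glissando's actual parameters.

BASE_FREQS = {"x": 261.63, "y": 329.63, "z": 392.00}  # C4, E4, G4 (assumed)
SEMITONE = 2 ** (1 / 12)        # equal-temperament pitch step
RANGE_SEMITONES = 12            # assumed bend range: one octave up or down
MAX_DISPLACEMENT = 0.5          # assumed full-scale displacement per axis

def axis_pitches(displacement):
    """Map a 3-D displacement dict {'x':..,'y':..,'z':..} to one frequency
    per axis, giving continuous spatial and temporal feedback."""
    freqs = {}
    for axis, base in BASE_FREQS.items():
        d = max(-MAX_DISPLACEMENT, min(MAX_DISPLACEMENT, displacement[axis]))
        bend = (d / MAX_DISPLACEMENT) * RANGE_SEMITONES
        freqs[axis] = base * SEMITONE ** bend
    return freqs

print(axis_pitches({"x": 0.1, "y": -0.25, "z": 0.0}))
```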


... In their seminal work on the Charade system, Baudel and Beaudouin-Lafon [6] note that one primary problem with mid-air gestures is that gestures are not self-revealing: the user must know the set of gestures that the system can recognize and their associated functionality. Most often, systems designed to teach mid-air interaction techniques require the use of additional hardware to reveal gestures to the user, usually in a visual [1,3,4,10,12,13,15,16,20,30,34,36,40], auditory [25,30], or haptic form [30,34]. Mirrored representations of the user are one common form of user training [1,3,4,12,13,20,31]. For ubiquitous public display interaction, Vatavu suggested not requiring users to learn at all, but rather to use a preferred, familiar gesture set that is individual to each user. ...
Conference Paper
While mid-air gestures are an attractive modality with an extensive research history, one challenge with their usage is that the gestures are not self-revealing. Scaffolding techniques to teach these gestures are difficult to implement since the input device, e.g. a hand, wand or arm, cannot present the gestures to the user. In contrast, for touch gestures, feedforward mechanisms (such as Marking Menus or OctoPocus) have been shown to effectively support user awareness and learning. In this paper, we explore whether touch gesture input can be leveraged to teach users to perform mid-air gestures. We show that marking menu touch gestures transfer directly to knowledge of mid-air gestures, allowing performance of these gestures without intervention. We argue that cross-modal learning can be an effective mechanism for introducing users to mid-air gestural input.
... Most of the time, scholars use sound only as a secondary interaction channel to assist other forms of interaction, and apply speech as an interaction means in specific scenarios or tasks. For example, auditory feedback has been used to assist pen or gesture interaction, or to shape the user's touch behavior [12,13,14]. ...
... For example, Anderson [6] investigated the use of auditory feedback in a pen gesture interface and showed that gaining a performance advantage with auditory feedback is possible. Sarah [7] explored audio cues to support motion gesture interaction on mobile devices. They proposed two techniques that use audio features to provide a spatial representation of the desired gesture and feedback on the system's interpretation of user input. ...
Conference Paper
Full-text available
Direct touch gestures are becoming popular as an input modality for mobile and tabletop interaction. However, the finger touch interface is considered less accurate than a pen-based interface. One of the main reasons is that the visual feedback of a finger touch is occluded by the fingertip itself, making it difficult to perceive and correct errors. We propose to utilize another modality to provide information about the occluded area. Spatial information on the visual channel is transformed into temporal and frequency information on another modality. We use the sound modality to illustrate the proposed trans-modality. Results show that performance with the additional modality is better than with visual feedback alone for drawing tasks, where visual information is important.
Conference Paper
Full-text available
The use of ultrasound haptic feedback for mid-air gestures in cars has been proposed to provide a sense of control over the user's intended actions and to add touch to a touchless interaction. However, the impact of ultrasound feedback to the gesturing hand regarding lane deviation, eyes-off-the-road time (EORT) and perceived mental demand has not yet been measured. This paper investigates the impact of uni- and multimodal presentation of ultrasound feedback on the primary driving task and the secondary gesturing task in a simulated driving environment. The multimodal combinations of ultrasound included visual, auditory, and peripheral lights. We found that ultrasound feedback presented unimodally and bimodally resulted in significantly less EORT compared to visual feedback. Our results suggest that multimodal ultrasound feedback for mid-air interaction decreases EORT whilst not compromising driving performance nor mental demand and thus can increase safety while driving.
Conference Paper
Full-text available
This paper presents an investigation into the effects of different feedback modalities on mid-air gesture interaction for infotainment systems in cars. Car crashes and near-crash events are most commonly caused by driver distraction. Mid-air interaction is a way of reducing driver distraction by reducing the visual demand of infotainment. Despite a range of available modalities, feedback in mid-air gesture systems is generally provided through visual displays. We conducted a simulated driving study to investigate how different types of multimodal feedback can support in-air gestures. The effects of different feedback modalities on eye gaze behaviour, and on the driving and gesturing tasks, are considered. We found that feedback modality influenced gesturing behaviour. However, drivers corrected falsely executed gestures more often in non-visual conditions. Our findings show that non-visual feedback can significantly reduce visual distraction.
Conference Paper
When users want to interact with an in-air gesture system, they must first address it. This involves finding where to gesture so that their actions can be sensed, and how to direct their input towards that system so that they do not also affect others or cause unwanted effects. This is an important problem which lacks a practical solution. We present an interaction technique which uses multimodal feedback to help users address in-air gesture systems. The feedback tells them how ("do that") and where ("there") to gesture, using light, audio and tactile displays. By doing that there, users can direct their input to the system they wish to interact with, in a place where their gestures can be sensed. We discuss the design of our technique and three experiments investigating its use, finding that users can "do that" well (93.2%-99.9%) while accurately (51mm-80mm) and quickly (3.7s) finding "there".
Conference Paper
Many studies have highlighted the advantages of expanding the input space of mobile devices by utilizing the back of the device. We extend this work by performing an elicitation study to explore users' mapping of gestures to smartphone commands and identify their criteria for using back-of-device gestures. Using the data collected from our study, we present elicited gestures and highlight common user motivations, both of which inform the design of back-of-device gestures for mobile interaction.
Article
Full-text available
We investigate the use of auditory feedback in pen-gesture interfaces in a series of informal and formal experiments. Initial iterative exploration showed that gaining performance advantage with auditory feedback was possible using absolute cues and state feedback after the gesture was produced and recognized. However, gaining learning or performance advantage from auditory feedback tightly coupled with the pen-gesture articulation and recognition process was more difficult. To establish a systematic baseline, Experiment 1 formally evaluated gesture production accuracy as a function of auditory and visual feedback. Size of gestures and the aperture of the closed gestures were influenced by the visual or auditory feedback, while other measures such as shape distance and directional difference were not, supporting the theory that feedback is too slow to strongly influence the production of pen stroke gestures. Experiment 2 focused on the subjective aspects of auditory feedback in pen-gesture interfaces. Participants' rating on the dimensions of being wonderful and stimulating was significantly higher with musical auditory feedback. Several lessons regarding pen gestures and auditory feedback are drawn from our exploration: a few simple functions such as indicating the pen-gesture recognition results can be achieved, gaining performance and learning advantage through tightly coupled process-based auditory feedback is difficult, pen-gesture sets and their recognizers can be designed to minimize visual dependence, and people's subjective experience of gesture interaction can be influenced using musical auditory feedback. These lessons may serve as references and stepping stones toward future research and development in pen-gesture interfaces with auditory feedback.
Article
Full-text available
We present McSig, a multimodal system for teaching blind children cursive handwriting so that they can create a personal signature. For blind people handwriting is very difficult to learn as it is a near-zero feedback activity that is needed only occasionally, yet in important situations; for example, to make an attractive and repeatable signature for legal contracts. McSig aids the teaching of signatures by translating digital ink from the teacher's stylus gestures into three non-visual forms: (1) audio pan and pitch represents the x and y movement of the stylus; (2) kinaesthetic information is provided to the student through a force-feedback haptic pen that mimics the teacher's stylus movement; and (3) a physical tactile line on the writing sheet is created by the haptic pen. McSig has been developed over two major iterations of design, usability testing and evaluation. The final step of the first iteration was a short evaluation with eight visually impaired children. The results suggested that McSig had the highest potential benefit for congenitally and totally blind children and also indicated some areas where McSig could be enhanced. The second prototype incorporated significant modifications to the system, improving the audio, tactile and force-feedback. We then ran a detailed, longitudinal evaluation over 14 weeks with three of the congenitally blind children to assess McSig's effectiveness in teaching the creation of signatures. Results demonstrated the effectiveness of McSig—they all made considerable progress in learning to create a recognizable signature. By the end of ten lessons, two of the children could form a complete, repeatable signature unaided, the third could do so with a little verbal prompting. Furthermore, during this project, we have learnt valuable lessons about providing consistent feedback between different communications channels (by manual interactions, haptic device, pen correction) that will be of interest to others developing multimodal systems.
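The first of McSig's non-visual channels, audio pan and pitch representing the x and y movement of the stylus, can be sketched as a simple mapping. The normalisation, frequency range, and function name stylus_to_audio below are assumptions for illustration, not the system's actual parameters.

```python
def stylus_to_audio(x, y, width=1.0, height=1.0, f_low=220.0, f_high=880.0):
    """Map a stylus position to (pan, frequency).

    Horizontal position drives stereo pan (-1 = hard left, +1 = hard right);
    vertical position drives pitch. The frequency range and normalisation
    are assumptions for illustration only.
    """
    pan = 2.0 * (x / width) - 1.0
    frequency = f_low + (y / height) * (f_high - f_low)
    return pan, frequency

# A stroke moving up and to the right would be heard panning rightwards
# while rising in pitch.
print(stylus_to_audio(0.25, 0.8))
```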
Article
Full-text available
This paper describes a novel and functional application of data sonification as an element in an immersive stroke rehabilitation system. For two years, we have been developing a task-based experiential media biofeedback system that incorporates musical feedback as a means to maintain patient interest and impart movement information to the patient. This paper delivers project background, system goals, a description of our system including an in-depth look at our audio engine, and lastly an overview of proof of concept experiments with both unimpaired subjects and actual stroke patients suffering from right-arm impairment.
Conference Paper
Full-text available
Growing evidence suggests that sonification supports movement perception as well as motor functions. It is hypothesized that real-time sonification supports movement control in patients with sensorimotor dysfunctions efficiently by intermodal substitution of sensory loss. The present article describes a sonification system for the upper extremities that might be used in neuromotor rehabilitation after stroke. A key feature of the system is mobility: arm movements are captured by inertial sensors that transmit their data wirelessly to a portable computer. Hand position is computed in an egocentric reference frame and mapped onto four acoustic parameters. A pilot feasibility study with acute stroke patients resulted in significant effects and is encouraging with respect to ambulatory use.
Article
Full-text available
Smartphones are frequently used in environments where the user is distracted by another task, for example by walking or by driving. While the typical interface for smartphones involves hardware and software buttons and surface gestures, researchers have recently posited that, for distracted environments, benefits may exist in using motion gestures to execute commands. In this paper, we examine the relative cognitive demands of motion gestures and surface taps and gestures in two specific distracted scenarios: a walking scenario, and an eyes-free seated scenario. We show, first, that there is no significant difference in reaction time for motion gestures, taps, or surface gestures on smartphones. We further show that motion gestures result in significantly less time looking at the smartphone during walking than does tapping on the screen, even with interfaces optimized for eyes-free input. Taken together, these results show that, despite somewhat lower throughput, there may be benefits to making use of motion gestures as a modality for distracted input on smartphones.
Conference Paper
Full-text available
Speed Dependent Automatic Zooming proposed by Igarashi and Hinckley is a powerful tool for document navigation on mobile devices. We show that browsing and targeting can be facilitated by using a model-based sonification approach to generate audio feedback about document structure, in a tilt-controlled SDAZ interface. We implemented this system for a text browser on a Pocket PC instrumented with an accelerometer and headset, and found that audio feedback provided valuable information, supporting intermittent interaction, i.e. allowing movement-based interaction techniques to continue while the user is simultaneously involved with other tasks. This was demonstrated by a blindfolded user successfully locating specified elements in a text file.
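A minimal sketch of the underlying speed-dependent automatic zooming idea, with tilt driving scroll speed and zoom falling as speed rises, is given below; all constants and the function name sdaz_state are illustrative assumptions rather than values from the paper.

```python
def sdaz_state(tilt_angle_deg, max_tilt=45.0, max_speed=3000.0, min_zoom=0.2):
    """Toy speed-dependent automatic zooming driven by tilt.

    The tilt angle sets the scroll speed (pixels/s); zoom shrinks towards
    min_zoom as speed rises, so fast scrolling shows more context. An audio
    layer (not modelled here) would then sonify document structure passing by.
    All constants are illustrative, not taken from the paper.
    """
    t = max(-1.0, min(1.0, tilt_angle_deg / max_tilt))
    speed = t * max_speed
    zoom = 1.0 - (1.0 - min_zoom) * abs(t)
    return speed, zoom

for angle in (0, 10, 30, 45):
    print(angle, sdaz_state(angle))
```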
Conference Paper
Full-text available
Recent advances in touch screen technology have increased the prevalence of touch screens and have prompted a wave of new touch screen-based devices. However, touch screens are still largely inaccessible to blind users, who must adopt error-prone compensatory strategies to use them or find accessible alternatives. This inaccessibility is due to interaction techniques that require the user to visually locate objects on the screen. To address this problem, we introduce Slide Rule, a set of audio- based multi-touch interaction techniques that enable blind users to access touch screen applications. We describe the design of Slide Rule, our interaction techniques, and a user study in which 10 blind people used Slide Rule and a button-based Pocket PC screen reader. Results show that Slide Rule was significantly faster than the button-based system, and was preferred by 7 of 10 users. However, users made more errors when using Slide Rule than when using the more familiar button-based system.
Conference Paper
Full-text available
Devices capable of gestural interaction through motion sensing are increasingly becoming available to consumers; however, motion gesture control has yet to appear outside of game consoles. Interaction designers are frequently not expert in pattern recognition, which may be one reason for this lack of availability. Another issue is how to effectively test gestures to ensure that they are not unintentionally activated by a user's normal movements during everyday usage. We present MAGIC, a gesture design tool that addresses both of these issues, and detail the results of an evaluation.
Conference Paper
Full-text available
Modern smartphones contain sophisticated sensors to monitor three-dimensional movement of the device. These sensors permit devices to recognize motion gestures - deliberate movements of the device by end-users to invoke commands. However, little is known about best-practices in motion gesture design for the mobile computing paradigm. To address this issue, we present the results of a guessability study that elicits end-user motion gestures to invoke commands on a smartphone device. We demonstrate that consensus exists among our participants on parameters of movement and on mappings of motion gestures onto commands. We use this consensus to develop a taxonomy for motion gestures and to specify an end-user inspired motion gesture set. We highlight the implications of this work to the design of smartphone applications and hardware. Finally, we argue that our results influence best practices in design for all gestural interfaces.
Conference Paper
Full-text available
We present the results of an empirical study investigating the effect of feedback, mobility and index of difficulty on a deictic spatial audio target acquisition task in the horizontal plane in front of a user. With audio feedback, spatial audio display elements are found to enable usable deictic interaction that can be described using Fitts' law. Feedback does not affect perceived workload or preferred walking speed compared to interaction without feedback. Mobility is found to degrade interaction speed and accuracy by 20%. Participants were able to perform deictic spatial audio target acquisition when mobile while walking at 73% of their preferred walking speed. The proposed feedback design is examined in detail and the effects of variable target widths are quantified. Deictic interaction with a spatial audio display is found to be a feasible solution for future interface designs.
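The Fitts' law description used for such deictic target acquisition can be made concrete with the standard Shannon formulation, ID = log2(D/W + 1); the example numbers below are hypothetical, not data from the study.

```python
import math

def index_of_difficulty(distance, width):
    """Shannon formulation of Fitts' index of difficulty, in bits."""
    return math.log2(distance / width + 1.0)

def throughput(distance, width, movement_time_s):
    """Throughput in bits per second for a single acquisition."""
    return index_of_difficulty(distance, width) / movement_time_s

# Hypothetical deictic audio target: 60 degrees away, 15 degrees wide,
# acquired in 1.8 seconds.
print(index_of_difficulty(60, 15), throughput(60, 15, 1.8))
```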
Conference Paper
Full-text available
We propose spatially aware portable displays which use movement in real physical space to control navigation in the digital information space within. This paper describes two interface design studies which use physical models, such as friction and gravity, in relating the movement of the display to the movement of information on the display surface. In combining input and output aspects of the interface into a single object, we can improve control and provide a meaningful relationship between the interface and the body of the user.
Conference Paper
Full-text available
Mobile and wearable computers present input/output problems due to limited screen space and interaction techniques. When mobile, users typically focus their visual attention on navigating their environment, making visually demanding interface designs hard to operate. This paper presents two multimodal interaction techniques designed to overcome these problems and allow truly mobile, 'eyes-free' device use. The first is a 3D audio radial pie menu that uses head gestures for selecting items. An evaluation of a range of different audio designs showed that egocentric sounds reduced task completion time, perceived annoyance, and allowed users to walk closer to their preferred walking speed. The second is a sonically enhanced 2D gesture recognition system for use on a belt-mounted PDA. An evaluation of the system with and without audio feedback showed users' gestures were more accurate when dynamically guided by audio feedback. These novel interaction techniques demonstrate effective alternatives to visual-centric interface designs on mobile devices.
Conference Paper
Full-text available
Accelerometers are common on many devices, including those required for text-entry. We investigate how to enter text with devices that are solely enabled with accelerometers. The challenge of text-entry with such devices can be overcome by the careful investigation of the human limitations in gestural movements with accelerometers. Preliminary studies provide insight into two potential text-entry designs that purely use accelerometers for gesture recognition. In two experiments, we evaluate the effectiveness of each of the text-entry designs. The first experiment involves novice users over a 45 minute period while the second investigates the possible performance increases over a four day period. Our results reveal that a matrix-based text-entry system with a small set of simple gestures is the most efficient (5.4wpm) and subjectively preferred by participants.
Conference Paper
Full-text available
This paper reports on the design and use of tactile user interfaces embedded within or wrapped around the devices that they control. We discuss three different interaction prototypes that we built. These interfaces were embedded onto two handheld devices of dramatically different form factors. We describe the design and implementation challenges, and user feedback and reactions to these prototypes. Implications for future design in the area of manipulative or haptic user interfaces are highlighted.
Conference Paper
Full-text available
Desktop user interface design originates from the fact that users are stationary and can devote all of their visual resource to the application with which they are interacting. In contrast, users of mobile and wearable devices are typically in motion whilst using their device which means that they cannot devote all or any of their visual resource to interaction with the mobile application – it must remain with the primary task, often for safety reasons. Additionally, such devices have limited screen real estate and traditional input and output capabilities are generally restricted. Consequently, if we are to develop effective applications for use on mobile or wearable technology, we must embrace a paradigm shift with respect to the interaction techniques we employ for communication with such devices. This paper discusses why it is necessary to embrace a paradigm shift in terms of interaction techniques for mobile technology and presents two novel multimodal interaction techniques which are effective alternatives to traditional, visual-centric interface designs on mobile devices as empirical examples of the potential to achieve this shift.
Conference Paper
Full-text available
An empirical study that compared six different feedback cue types to enhance pointing efficiency in deictic spatial audio displays is presented. Participants were asked to select a sound using a physical pointing gesture, with the help of a loudness cue, a timbre cue and an orientation update cue as well as with combinations of these cues. Display content was varied systematically to investigate the effect of increasing display population. Speed, accuracy and throughput ratings are provided as well as effective target widths that allow for minimal error rates. The results showed direct pointing to be the most efficient interaction technique; however large effective target widths reduce the applicability of this technique. Movement-coupled cues were found to significantly reduce display element size, but resulted in slower interaction and were affected by display content due to the requirement of continuous target attainment. The results show that, with appropriate design, it is possible to overcome interaction uncertainty and provide solutions that are effective in mobile human computer interaction.
Conference Paper
Full-text available
The study reported here investigates the design and evaluation of a gesture-controlled, spatially-arranged auditory user interface for a mobile computer. Such an interface may provide a solution to the problem of limited screen space in handheld devices and lead to an effective interface for mobile/eyes-free computing. To better understand how we might design such an interface, our study compared three potential interaction techniques: head nodding, pointing with a finger and pointing on a touch tablet to select an item in exocentric 3D audio space. The effects of sound direction and interaction technique on the browsing and selection process were analyzed. An estimate of the size of the minimum selection area that would allow efficient 3D sound selection is provided for each interaction technique. Browsing using the touch screen was found to be more accurate than the other two techniques, but participants found it significantly harder to use.
Conference Paper
Full-text available
This TechNote introduces a novel interaction technique for small screen devices such as palmtop computers or hand-held electronic devices, including pagers and cellular phones. Our proposed method uses the tilt of the device itself as input. Using both tilt and buttons, it is possible to build several interaction techniques ranging from menus and scroll bars to more complicated examples such as a map browsing system and a 3D object viewer. During operation, only one hand is required to both hold and control the device. This feature is especially useful for field workers.
Conference Paper
Full-text available
TiltType is a novel text entry technique for mobile devices. To enter a character, the user tilts the device and presses one or more buttons. The character chosen depends on the button pressed, the direction of tilt, and the angle of tilt. TiltType consumes minimal power and requires little board space, making it appropriate for wristwatch-sized devices. But because controlled tilting of one's forearm is fatiguing, a wristwatch using this technique must be easily removable from its wriststrap. Applications include two-way paging, text entry for watch computers, web browsing, numeric entry for calculator watches, and existing applications for PDAs.
Conference Paper
Full-text available
We investigate the use of audio and haptic feedback to augment the display of a mobile device controlled by tilt input. We provide an example of this based on Doppler effects, which highlight the user's approach to a target, or a target's movement from the current state, in the same way we hear the pitch of a siren change as it passes us. Twelve participants practiced navigation/browsing a state-space that was displayed via audio and vibrotactile modalities. We implemented the experiment on a Pocket PC, with an accelerometer attached to the serial port and a headset attached to audio port. Users navigated through the environment by tilting the device. Feedback was provided via audio displayed via a headset, and by vibrotactile information displayed by a vibrotactile unit in the Pocket PC. Users selected targets placed randomly in the state-space, supported by combinations of audio, visual and vibrotactile cues. The speed of target acquisition and error rate were measured, and summary statistics on the acquisition trajectories were calculated. These data were used to compare different display combinations and configurations. The results in the paper quantified the changes brought by predictive or 'quickened' sonified displays in mobile, gestural interaction.
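A rough sketch of the Doppler-style "quickened" audio cue, where pitch rises as the cursor closes on a target and falls as it moves away, is shown below; the base frequency, clamping, and function name doppler_pitch are assumptions for illustration, not the parameters used in the study.

```python
def doppler_pitch(base_freq, approach_speed, sound_speed=343.0):
    """Raise pitch while closing on a target and lower it while moving away,
    by analogy with the Doppler effect. approach_speed > 0 means the cursor
    is approaching the target; the clamp simply keeps the shift bounded.
    Values are illustrative, not the paper's parameters."""
    v = max(-0.5 * sound_speed, min(0.5 * sound_speed, approach_speed))
    return base_freq * sound_speed / (sound_speed - v)

for v in (-50.0, 0.0, 50.0):
    print(v, round(doppler_pitch(440.0, v), 1))
```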
Article
Full-text available
We describe a method to improve user feedback, specifically the display of time-varying probabilistic information, through asynchronous granular synthesis. We have applied these techniques to challenging control problems as well as to the sonification of online probabilistic gesture recognition. We're using these displays in mobile, gestural interfaces where visual display is often impractical.
Article
Full-text available
A general framework for producing formative audio feedback for gesture recognition is presented, including the dynamic and semantic aspects of gestures. The belief states are probability density functions conditioned on the trajectories of the observed variables. We describe example implementations of gesture recognition based on Hidden Markov Models and a dynamic programming recognition algorithm. Granular synthesis is used to present the audio display of the changing probabilities and observed states.
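One way to picture a granular-synthesis display of changing probabilities is to allocate grains to each gesture model in proportion to its current belief, as in the sketch below; the grain counts, labels, and function name grain_schedule are illustrative assumptions, not the authors' implementation.

```python
import random

def grain_schedule(posteriors, total_grains=200):
    """Allocate short audio grains to gesture models in proportion to their
    current posterior probabilities. A synthesis engine would then draw one
    grain from each model's source sound in this (shuffled) order, so the
    mix audibly tracks the recognizer's changing beliefs."""
    schedule = []
    for model, p in posteriors.items():
        schedule.extend([model] * int(round(p * total_grains)))
    random.shuffle(schedule)
    return schedule

beliefs = {"circle": 0.6, "flick": 0.3, "shake": 0.1}
print(grain_schedule(beliefs, total_grains=10))
```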
Article
Full-text available
Patients with lack of proprioception are unable to build and maintain internal models of their limbs and monitor their limb movements because these patients do not receive the appropriate information from muscles and joints. This project was undertaken to determine if auditory signals can provide proprioceptive information normally obtained through muscle and joint receptors. Sonification of spatial location and sonification of joint motion, for monitoring arm/hand motions, were attempted in two pilot experiments with a patient. Sonification of joint motion through strong time/synchronization cues was the most successful approach. These results are encouraging and suggest that auditory feedback of joint motions may be a substitute for proprioceptive input. However, additional data will have to be collected and control experiments will have to be done.
Article
Full-text available
We describe sensing techniques motivated by unique aspects of human-computer interaction with handheld devices in mobile settings. Special features of mobile interaction include changing orientation and position, changing venues, the use of computing as auxiliary to ongoing, real-world activities like talking to a colleague, and the general intimacy of use for such devices. We introduce and integrate a set of sensors into a handheld device, and demonstrate several new functionalities engendered by the sensors, such as recording memos when the device is held like a cell phone, switching between portrait and landscape display modes by holding the device in the desired orientation, automatically powering up the device when the user picks it up to start using it, and scrolling the display using tilt. We present an informal experiment, initial usability testing results, and user reactions to these techniques.
Article
Designers of motion gestures for mobile devices face the difficult challenge of building a recognizer that can separate gestural input from motion noise. A threshold value is often used to classify motion and effectively balances the rates of false positives and false negatives. We present a bi-level threshold recognizer that is built to lower the rate of recognition failures by accepting either a tightly thresholded gesture or two consecutive possible gestures recognized by a looser model. We evaluate bi-level thresholding with a pilot study in order to gauge its effectiveness as a recognition safety net for users who have difficulty activating a motion gesture. Lastly, we suggest the use of bi-level thresholding to scaffold learning of motion gestures.
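The bi-level acceptance rule, taking either a single tightly thresholded gesture or two consecutive gestures that pass only a looser threshold, can be sketched as follows. The sketch treats the recognizer score as a distance (lower is better), and the threshold values and class name BiLevelThreshold are assumptions for illustration, not the paper's implementation.

```python
class BiLevelThreshold:
    """Accept a gesture when its recognition score beats a tight threshold,
    or when two consecutive attempts at the same gesture beat a looser one.
    Scores are treated as distances (lower is better); the threshold values
    are illustrative."""

    def __init__(self, tight=0.2, loose=0.4):
        self.tight = tight
        self.loose = loose
        self.pending = None   # gesture that passed only the loose test

    def update(self, gesture, score):
        if score <= self.tight:
            self.pending = None
            return gesture                 # confident single recognition
        if score <= self.loose:
            if self.pending == gesture:
                self.pending = None
                return gesture             # two consecutive loose matches
            self.pending = gesture
        else:
            self.pending = None
        return None

rec = BiLevelThreshold()
print(rec.update("flick", 0.35), rec.update("flick", 0.30))  # None flick
```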
Conference Paper
While sighted users may learn to perform touchscreen gestures through observation (e.g., of other users or video tutorials), such mechanisms are inaccessible for users with visual impairments. As a result, learning to perform gestures can be challenging. We propose and evaluate two techniques to teach touchscreen gestures to users with visual impairments: (1) corrective verbal feedback using text-to-speech and automatic analysis of the user's drawn gesture; (2) gesture sonification to generate sound based on finger touches, creating an audio representation of a gesture. To refine and evaluate the techniques, we conducted two controlled lab studies. The first study, with 12 sighted participants, compared parameters for sonifying gestures in an eyes-free scenario and identified pitch + stereo panning as the best combination. In the second study, 6 blind and low-vision participants completed gesture replication tasks with the two feedback techniques. Subjective data and preliminary performance findings indicate that the techniques offer complementary advantages.
Conference Paper
When using motion gestures, 3D movements of a mobile phone, as an input modality, one significant challenge is how to teach end users the movement parameters necessary to successfully issue a command. Is a simple video or image depicting movement of a smartphone sufficient? Or do we need three-dimensional depictions of movement on external screens to train users? In this paper, we explore mechanisms to teach end users motion gestures, examining two factors. The first factor is how to represent motion gestures: as icons that describe movement, video that depicts movement using the smartphone screen, or a Kinect-based teaching mechanism that captures and depicts the gesture on an external display in three-dimensional space. The second factor we explore is recognizer feedback, i.e. a simple representation of the proximity of a motion gesture to the desired motion gesture based on a distance metric extracted from the recognizer. We show that, by combining video with recognizer feedback, participants master motion gestures equally quickly as end users that learn using a Kinect. These results demonstrate the viability of training end users to perform motion gestures using only the smartphone display.
Article
LightGuide is a system that explores a new approach to gesture guidance where we project guidance hints directly on a user's body. These projected hints guide the user in completing the desired motion with their body part which is particularly useful for performing movements that require accuracy and proper technique, such as during exercise or physical therapy. Our proof-of-concept implementation consists of a single low-cost depth camera and projector and we present four novel interaction techniques that are focused on guiding a user's hand in mid-air. Our visualizations are designed to incorporate both feedback and feedforward cues to help guide users through a range of movements. We quantify the performance of LightGuide in a user study comparing each of our on-body visualizations to hand animation videos on a computer display in both time and accuracy. Exceeding our expectations, participants performed movements with an average error of 21.6mm, nearly 85% more accurately than when guided by video.
Article
In the past decade, the Open Source Model for software development has gained popularity and has had numerous major achievements: emacs, Linux, the Gimp, and Python, to name a few. The basic idea is to provide the source code of the model or application, a tutorial on its use, and a feedback mechanism with the community so that the model can be tested, improved, and archived. Given the success of the Open Source Model, we believe it may prove valuable in the development of scientific research codes. With this in mind, we are 'Open Sourcing' the low to mid-latitude ionospheric model that has recently been developed at the Naval Research Laboratory: SAMI2 (Sami2 is Another Model of the Ionosphere). The model is comprehensive and uses modern numerical techniques. The structure and design of SAMI2 make it relatively easy to understand and modify: the numerical algorithms are simple and direct, and the code is reasonably well-written. Furthermore, SAMI2 is designed to run on personal computers; prohibitive computational resources are not necessary, thereby making the model accessible and usable by virtually all researchers. For these reasons, SAMI2 is an excellent candidate to explore and test the open source modeling paradigm in space physics research. We will discuss various topics associated with this project. Research supported by the Office of Naval Research.
Article
This paper describes an interaction concept for controlling the cursor on a hand-held computing device's display, in contrast to the desktop interaction paradigm. "Cursor" is defined as a small point-like indicator which is movable on a graphic interface. "Hand-held computing device" can for example be a Personal Digital Assistant (PDA). Moving the cursor should be like moving a piece of butter in a hot frying pan: the more the pan (device) is tilted, the quicker the butter (cursor) will slide "downhill". We also describe a menu system designed for this type of control.
Conference Paper
Sensors are becoming increasingly important in interaction design. Authoring a sensor-based interaction comprises three steps: choosing and connecting the appropriate hardware, creating application logic, and specifying the relationship between sensor values and application logic. Recent research has successfully addressed the first two issues. However, linking sensor input data to application logic remains an exercise in patience and trial-and-error testing for most designers. This paper introduces techniques for authoring sensor-based interactions by demonstration. A combination of direct manipulation and pattern recognition techniques enables designers to control how demonstrated examples are generalized to interaction rules. This approach emphasizes design exploration by enabling very rapid iterative demonstrate-edit-review cycles. This paper describes the manifestation of these techniques in a design tool, Exemplar, and presents evaluations through a first-use lab study and a theoretical analysis using the Cognitive Dimensions of Notation framework.
Conference Paper
To make motion gestures more widely adopted on mobile devices it is important that devices be able to distinguish between motion intended for mobile interaction and every-day motion. In this paper, we present DoubleFlip, a unique motion gesture designed as an input delimiter for mobile motion-based interaction. The DoubleFlip gesture is distinct from regular motion of a mobile device. Based on a collection of 2,100 hours of motion data captured from 99 users, we found that our DoubleFlip recognizer is extremely resistant to false positive conditions, while still achieving a high recognition rate. Since DoubleFlip is easy to perform and unlikely to be accidentally invoked, it provides an always-active input event for mobile interaction.
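A toy version of a DoubleFlip-style delimiter detector, watching the roll angle flip past a threshold and return within a short window, is sketched below; the angles, window size, and function name detect_double_flip are illustrative assumptions, not the trained recognizer described in the paper.

```python
def detect_double_flip(roll_samples, flip_angle=150.0, max_gap=20):
    """Toy delimiter detector: report True when the device's roll angle
    (degrees, about its long axis) goes past flip_angle and comes back
    near the start within max_gap samples. The published recognizer is
    trained on real sensor traces; this only illustrates the idea."""
    flipped_at = None
    for i, roll in enumerate(roll_samples):
        if abs(roll) >= flip_angle:
            flipped_at = i
        elif flipped_at is not None and abs(roll) <= 30.0:
            if i - flipped_at <= max_gap:
                return True
            flipped_at = None
    return False

print(detect_double_flip([0, 40, 120, 170, 160, 90, 20, 5]))  # True
```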
Conference Paper
We present and evaluate an approach towards eyes-free auditory display of spatial information that considers radial direction as a fundamental type of value primitive. There are many benefits to being able to sonify radial directions, such as indicating the heading towards a point of interest in a direct and dynamic manner, rendering a path or shape outline by sonifying a continual sequence of tangent directions as the path is traced, and providing direct feedback of the direction of motion of the user in a physical space or a pointer in a virtual space. We propose a concrete mapping of vowel-like sounds to radial directions as one potential method to enable sonification of such information. We conducted a longitudinal study with five sighted and two blind participants to evaluate the learnability and effectiveness of this method. Results suggest that our directional sound mapping can be learned within a few hours and be used to aurally perceive spatial information such as shape outlines and path contours.
Conference Paper
GestureBar is a novel, approachable UI for learning gestural interactions that enables a walk-up-and-use experience which is in the same class as standard menu and toolbar interfaces. GestureBar leverages the familiar, clean look of a common toolbar, but in place of executing commands, richly discloses how to execute commands with gestures, through animated images, detail tips and an out-of-document practice area. GestureBar's simple design is also general enough for use with any recognition technique and for integration with standard, non-gestural UI components. We evaluate GestureBar in a formal experiment showing that users can perform complex, ecologically valid tasks in a purely gestural system without training, introduction, or prior gesture experience when using GestureBar, discovering and learning a high percentage of the gestures needed to perform the tasks optimally, and significantly outperforming a state of the art crib sheet. The relative contribution of the major design elements of GestureBar is also explored. A second experiment shows that GestureBar is preferred to a basic crib sheet and two enhanced crib sheet variations.
Conference Paper
This study examines methods for displaying distance information to blind travellers using sound, focussing on abstractions of methods currently used in commercial Electronic Travel Aids (ETAs). Ten blind participants assessed three sound encodings commonly used to convey distance information by ETAs: sound frequency (Pitch), Ecological Distance (ED), and temporal variation or Beat Rate (BR). Response time and response correctness were chosen as measures. Pitch variation was found to be the least effective encoding, which is a surprise because most ETAs encode distance as Pitch. Tempo, or BR, encoding was found to be superior to Pitch. ED, which was simulated by filtering high frequencies and decreasing intensity with distance, was found to be best. Grouping BR and ED redundantly slightly outperformed ED. Consistent polarity across participants was found in ED and BR but not in Pitch encoding.
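The three encodings compared in the study can be sketched as simple mappings from distance to sound parameters; all ranges and function names below (pitch_encoding, beat_rate_encoding, ecological_encoding) are illustrative assumptions rather than the values used with the participants.

```python
def pitch_encoding(distance_m, f_near=1000.0, f_far=200.0, max_d=10.0):
    """Closer obstacles map to higher pitch (illustrative range)."""
    d = min(distance_m, max_d)
    return f_far + (f_near - f_far) * (1.0 - d / max_d)

def beat_rate_encoding(distance_m, bps_near=8.0, bps_far=1.0, max_d=10.0):
    """Closer obstacles map to a faster beat rate, in beats per second."""
    d = min(distance_m, max_d)
    return bps_far + (bps_near - bps_far) * (1.0 - d / max_d)

def ecological_encoding(distance_m, max_d=10.0):
    """Approximate 'ecological distance': quieter and duller with distance.
    Returns (gain 0..1, low-pass cutoff in Hz)."""
    d = min(distance_m, max_d)
    return 1.0 - 0.8 * (d / max_d), 8000.0 - 7000.0 * (d / max_d)

for d in (1.0, 5.0, 9.0):
    print(d, pitch_encoding(d), beat_rate_encoding(d), ecological_encoding(d))
```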
Conference Paper
We report a series of user studies that evaluate the feasibility and usability of light-weight user authentication with a single tri-axis accelerometer. We base our investigation on uWave, a state-of-the-art recognition system for user-created free-space manipulation, or gestures. Our user studies address two types of user authentication: non-critical authentication (or identification) for a user to retrieve privacy-insensitive data; and critical authentication for protecting privacy-sensitive data. For non-critical authentication, our evaluation shows that uWave achieves high recognition accuracy (98%) and its usability is comparable with text ID-based authentication. Our results also highlight the importance of constraints for users to select their gestures. For critical authentication, the evaluation shows uWave achieves state-of-the-art resilience to attacks with 3% false positives and 3% false negatives, or 3% equal error rate. We also show that the equal error rate increases to 10% if the attackers see the users performing their gestures. This shows the limitation of gesture-based authentication and highlights the need for visual concealment.
Conference Paper
Many touch screens remain inaccessible to blind users, and those approaches to providing access that do exist offer minimal support for interacting with large touch screens or spatial data. In this paper, we introduce a set of three software-based access overlays intended to improve the accessibility of large touch screen interfaces, specifically interactive tabletops. Our access overlays are called edge projection, neighborhood browsing, and touch-and-speak. In a user study, 14 blind users compared access overlays to an implementation of Apple's VoiceOver screen reader. Our results show that two of our techniques were faster than VoiceOver, that participants correctly answered more questions about the screen's layout using our techniques, and that participants overwhelmingly preferred our techniques. We developed several applications demonstrating the use of access overlays, including an accessible map kiosk and an accessible board game.
Conference Paper
We describe OctoPocus, an example of a dynamic guide that combines on-screen feedforward and feedback to help users learn, execute and remember gesture sets. OctoPocus can be applied to a wide range of single-stroke gestures and recognition algorithms and helps users progress smoothly from novice to expert performance. We provide an analysis of the design space and describe the results of two experiments that show that OctoPocus is significantly faster and improves learning of arbitrary gestures, compared to conventional Help menus. It can also be adapted to a mark-based gesture set, significantly improving input time compared to a two-level, four-item Hierarchical Marking menu.
Conference Paper
Triggering shortcuts or actions on a mobile device often requires a long sequence of key presses. Because the functions of buttons are highly dependent on the current application's context, users are required to look at the display during interaction, even in many mobile situations when eyes-free interactions may be preferable. We present Virtual Shelves, a technique to trigger programmable shortcuts that leverages the user's spatial awareness and kinesthetic memory. With Virtual Shelves, the user triggers shortcuts by orienting a spatially-aware mobile device within the circular hemisphere in front of her. This space is segmented into definable and selectable regions along the phi and theta planes. We show that users can accurately point to 7 regions on the theta and 4 regions on the phi plane using only their kinesthetic memory. Building upon these results, we then evaluate a proof-of-concept prototype of the Virtual Shelves using a Nokia N93. The results show that Virtual Shelves is faster than the N93's native interface for common mobile phone tasks.
Article
The article outlines the development of the Rock 'n' Scroll input method which lets users gesture to scroll, select, and command an application without resorting to buttons, touchscreens, spoken commands, or other input methods. The Rock'n' Scroll user interface shows how inertial sensors in handheld devices can provide additional function beyond “tilt-to-scroll”. By also using them to recognize gestures, a significantly richer vocabulary for controlling the device is available that implements an electronic photo album, pager, or other limited function digital appliance without any additional input methods. The examples presented offer a glimpse at the freedom for both device designers and users inherent in devices that can be held in either hand, at any orientation, operated with mittens on, or not in the hand at all
Article
TiltText, a new technique for entering text into a mobile phone, is described. The standard 12-button text entry keypad of a mobile phone forces ambiguity when the 26-letter Roman alphabet is mapped in the traditional manner onto keys 2-9. The TiltText technique uses the orientation of the phone to resolve this ambiguity, by tilting the phone in one of four directions to choose which character on a particular key to enter. We first discuss implementation strategies, and then present the results of a controlled experiment comparing TiltText to MultiTap, the most common text entry technique. The experiment included 10 participants who each entered a total of 640 phrases of text chosen from a standard corpus, over a period of about five hours. The results show that text entry speed including correction for errors using TiltText was 23% faster than MultiTap by the end of the experiment, despite a higher error rate for TiltText. TiltText is thus amongst the fastest known language-independent techniques for entering text into mobile phones.
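The core disambiguation idea, using the direction of tilt to pick one character from the letters on a pressed key, can be sketched as below; the direction-to-position assignment and the function name tilt_text are assumptions for illustration and may differ from the mapping used in TiltText.

```python
# Standard 12-key letter groups; keys 7 and 9 carry four letters.
KEY_LETTERS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}

# Assumed order in which tilt directions select letters on a key; the
# actual assignment in TiltText may differ.
TILT_ORDER = ["left", "forward", "right", "back"]

def tilt_text(key, tilt_direction):
    """Resolve keypad ambiguity with the direction of tilt, e.g. key '5'
    tilted left gives 'j', forward gives 'k', right gives 'l'."""
    letters = KEY_LETTERS[key]
    index = TILT_ORDER.index(tilt_direction)
    if index >= len(letters):
        raise ValueError("no letter assigned to this tilt on key " + key)
    return letters[index]

print(tilt_text("5", "left"), tilt_text("9", "back"))  # j z
```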
iPhone User Guide For iPhone OS 3.1 Software
  • Apple Inc
Apple Inc. 2009. iPhone User Guide For iPhone OS 3.1 Software. (2009). https://manuals.info.apple.com/MANUALS/0/MA616/ en US/iPhone iOS3.1 User Guide.pdf.
Audio-enhanced collaboration at an interactive electronic whiteboard
  • Christian Müller-Tomfelde
  • Sascha Steiner
Christian Müller-Tomfelde and Sascha Steiner. 2001. Audio-enhanced collaboration at an interactive electronic whiteboard. In Proceedings of the 7th International Conference on Auditory Display (ICAD'01). 267-271.
Motor Learning and Control: From Theory to Practice
  • William Edwards
William Edwards. 2010. Motor Learning and Control: From Theory to Practice. Cengage Learning, Belmont, CA.
Model-based target sonification on mobile devices
  • Parisa Eslambolchilar
  • Andrew Crossan
  • Roderick Murray-Smith
Parisa Eslambolchilar, Andrew Crossan, and Roderick Murray-Smith. 2004a. Model-based target sonification on mobile devices. In Proceedings of the International Workshop on Interactive Sonification (ISon'04). Interactive Sonification Organisation.