Article

Auditory Browser for Blind and Visually Impaired Users

Authors: Roth, Petrucci, Assimacopoulos & Pun

Abstract

This paper presents our work on the development of a multimodal auditory interface which permits blind users to work more easily and efficiently with GUI browsers. A macro-analysis phase, which can be either passive or active, provides information about the global layout of HTML documents. A subsequent active micro-analysis phase allows the user to explore particular elements of the document. The interface is based on: (1) a mapping of the graphical HTML document into a 3D virtual sound space environment, where non-speech auditory cues differentiate HTML elements; (2) the transcription into sound not only of text, but also of images; (3) the use of a touch-sensitive screen to facilitate user interaction. Moreover, in order to validate the sonification model of the images, we have created an audio "memory game" that can be used as a pedagogical tool to help blind pupils learn spatial exploration cues.
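
To make the mapping idea concrete, the following is a minimal sketch (in Python, not the authors' implementation) of how an HTML element's position on the rendered page could be translated into a location in a listener-centred virtual sound space, with a non-speech cue chosen by tag type. The element records, angle ranges, and earcon file names are illustrative assumptions.

```python
# Minimal sketch (not the authors' implementation): mapping HTML elements to
# positions in a virtual 3D sound space and assigning a non-speech cue per tag.
# Element records, coordinates, and the earcon names are illustrative assumptions.

from dataclasses import dataclass

# Hypothetical earcon catalogue: one short non-speech sound per HTML element type.
EARCONS = {"a": "link_chime.wav", "img": "image_ping.wav",
           "h1": "heading_bell.wav", "p": "text_tick.wav"}

@dataclass
class Element:
    tag: str
    x: float       # horizontal position on the rendered page, in pixels
    y: float       # vertical position on the rendered page, in pixels

def to_sound_space(el: Element, page_w: float, page_h: float):
    """Map page coordinates to azimuth/elevation angles in a listener-centred space."""
    azimuth = (el.x / page_w - 0.5) * 90.0      # -45 deg (left) .. +45 deg (right)
    elevation = (0.5 - el.y / page_h) * 60.0    # +30 deg (top) .. -30 deg (bottom)
    earcon = EARCONS.get(el.tag, "generic_tap.wav")
    return azimuth, elevation, earcon

if __name__ == "__main__":
    page_w, page_h = 1024, 2048
    for el in [Element("h1", 512, 80), Element("a", 200, 900), Element("img", 850, 1500)]:
        az, elv, cue = to_sound_space(el, page_w, page_h)
        print(f"<{el.tag}> -> azimuth {az:+.1f} deg, elevation {elv:+.1f} deg, cue {cue}")
```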


... In fact, the combination of synchronization, structural navigation and annotations management, using visual, audio, speech and standard interactions, poses ambiguity and cognitive problems that must be dealt with at the UI design level (Morley, 1998). Work has been done at the audio browser level (Roth, Petrucci, Assimacopoulos & Pun, 1999), but several issues are still open. Furthermore, multimodal DTBs use can be enlarged, aiming at particular applications and situational constraints, not necessarily for print-disabled readers. ...
... also been reported on DTB applications, or more generally, on the development of nonvisual and multimodal environments for WWW browsing. Methods for conveying the document structure and assisting in navigation within the document have varied from 3D audio (Goose & Moller, 1999) to the use of a touch-sensitive screen to facilitate user interaction (Roth et al., 1999). To convey document content, such as the presence of hyperlinks and headings, and assist in the navigation between documents, auditory icons (Gaver, 1993), amongst other techniques, have been studied. Nevertheless, several issues are not addressed: the automatic generation of DTBs from large amounts of existing raw material; the use of DTB standards in off-the-shelf browsers; the d ...
Conference Paper
Full-text available
This paper presents a framework for the production and design of digital talking books. The produced books provide means for multimodal interaction aiming at print-disabled reading contexts (users or situationally determined) and experimental enriched scenarios (e.g. for cognitive performance enhancement). The framework purpose is twofold: automatic generation of multimodal digital talking books from original raw materials (tapes and text), and provision of fundamental mechanisms to identify, extract and store excerpts of digital spoken books, to enrich them with other media and to combine them into new perspectives, stories or documents. Some preliminary usability tests were performed, particularly referring to the synchronization and resynchronization units, that influenced the interaction redesign.
... Works that fit into the visual-to-audio transformation category include document browsers that support synthesizing text to speech, such as the Adobe Acrobat PDF reader for visually impaired users [15]. Furthermore, some work has been done on developing Web browsers for blind and visually impaired users [16][17]. The focus in [17] is to map a graphical HTML document into a 3D virtual sound space, where non-speech auditory cues differentiate HTML documents. ...
... Furthermore, some work has been done on developing Web browsers for blind and visually impaired users [16][17]. The focus in [17] is to map a graphical HTML document into a 3D virtual sound space, where non-speech auditory cues differentiate HTML documents. Their goal is to transform as much information as possible into the audio channel. ...
Article
Small displays on mobile handheld devices, such as personal digital assistants (PDAs) and cellular phones, are the bottlenecks for usability of most content browsing applications. Generally, conventional content such as documents and Web pages need to be modified for effective presentation on mobile devices. This paper proposes a novel visualization for documents, called multimedia thumbnails, which consists of text and image content converted into playable multimedia clips. A multimedia thumbnail utilizes visual and audio channels of small portable devices as well as both spatial and time dimensions to communicate text and image information of a single document. The proposed algorithm for generating multimedia thumbnails includes 1) a semantic document analysis step, where salient content from a source document is extracted; 2) an optimization step, where a subset of this extracted content is selected based on time, display, and application constraints; and 3) a composition step, where the selected visual and audible document content is combined into a multimedia thumbnail. Scalability of MMNails that allows generation of multimedia clips of various lengths is also described. A user study is presented that evaluates the effectiveness of the proposed multimedia thumbnail visualization.
... Overviews have also been used to augment links on a web page with destination previews, however, in the form of textual summaries [9]. Earcons and auditory icons are more often used to provide feedback about individual elements encountered during active exploration of an interface by blind users [8,10]. ...
Conference Paper
Blind users browse the web using screen readers. Screen readers read the content on a web page sequentially via synthesized speech. The linear nature of this process makes it difficult to obtain an overview of the web page, which creates navigation challenges. To alleviate this problem, we have developed ScreenTrack, a browser extension that summarizes a web page's accessibility features into a short, dynamically generated soundtrack. Users can quickly gain an overview of the presence of web elements useful for navigation on a web page. Here we describe ScreenTrack and discuss future research plans.
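
The following is a hedged sketch of the general idea described above: condensing a page's navigation-relevant elements into a short, ordered sequence of audio cues. It is not the ScreenTrack code; the cue names, the per-cue duration, and the cap on repeats are assumptions made for illustration.

```python
# Illustrative sketch only (not the ScreenTrack code): condensing a page's
# navigational elements into an ordered "soundtrack" of short cues.
# Element counts and cue names are assumptions for demonstration.

from collections import Counter

CUE_FOR = {"h1": "deep_tone", "h2": "mid_tone", "a": "click", "form": "beep", "table": "ratchet"}

def build_soundtrack(tags, cue_ms=250):
    """Return (cue_name, start_ms) pairs summarising how many of each element the page has."""
    counts = Counter(t for t in tags if t in CUE_FOR)
    track, t = [], 0
    for tag, n in counts.most_common():
        for _ in range(min(n, 5)):          # cap repeats so the overview stays short
            track.append((CUE_FOR[tag], t))
            t += cue_ms
    return track

if __name__ == "__main__":
    page_tags = ["h1", "a", "a", "a", "form", "table", "h2", "h2"]
    for cue, start in build_soundtrack(page_tags):
        print(f"{start:5d} ms  {cue}")
```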
... Provide auditory explanation: Low vision learners and other types of VI depend 100% on audio to explain everything that appears on screen [34], [35], [36], [37]. Without auditory explanation, the visual aspect means nothing to them. ...
Article
Full-text available
This paper reports on an ongoing study, which intends to propose a conceptual design model of Assistive Courseware (AC) that is particularly designed for low vision (LV) learners. Altogether, 15 conceptual design models of courseware were compared and analyzed exhaustively with the main objectives (i) to determine the research gaps in proposing a conceptual design model of AC4LV and (ii) to identify their common components. Through a systematic and critical analysis, this study discovers that none of the previous models suggests a specific conceptual design model of courseware that caters in detail to visually impaired (VI), particularly low vision (LV), learners. It is noted that this is the research gap that should be the focal point for further study. Also, the previous literature suggests that an Instructional Design (ID) model, learning theories, and a learning approach must be the basic components in designing the conceptual design model of courseware.
... Wagner and Lieberman [27] introduced Woodstein, which predicts and assists the next user action based on analysis of collected sequences of previous actions on the webpages. Roth et al. [28] created an agent to provide audio feedback for the user's cursor location. Yu et al. [29] designed context-aware Web agents to provide audio and haptic feedback for the user's cursor location in a screen reader. ...
Article
Full-text available
Online Web applications have become widespread and have made our daily life more convenient. However, older adults often find such applications inaccessible because of age-related changes to their physical and cognitive abilities. Two of the reasons that older adults may shy away from the Web are fears of the unknown and of the consequences of incorrect actions. We are extending a voice-based augmentation technique originally developed for blind users. We want to reduce the cognitive load on older adults by providing contextual support. An experiment was conducted to evaluate how voice augmentation can support elderly users in using Web applications. Ten older adults participated in our study and their subjective evaluations showed how the system gave them confidence in completing Web forms. We believe that voice augmentation may help address the users' concerns arising from their low confidence levels.
... P. Roth et al. demonstrated a series of projects on tactile-auditory interaction tools for the visually impaired. Among the projects are AB-Web [8], a 3D-audio Web browser that uses 3D sonic rendering, WebSound [9], a generic tool that permits associating a given sonic object (earcon or auditory icon) with each HTML tag, From Dots to Shapes [10], a family of sonic games, and IDEA [11], a drawing creation and analysis audio tool. Fig. 2 illustrates our proposal using audio to assist the visually impaired in getting a better quality of acquired image (in terms of head pose) using handheld devices. As can be seen, three different sinusoidal wave sounds at increasing frequencies and tempos are used to indicate 3 different stages (non-face/partial face, non-frontal face and frontal face). ...
Conference Paper
Full-text available
As mobile devices are becoming more ubiquitous, it is now possible to enhance the security of the phone, as well as remote services requiring identity verification, by means of biometric traits such as fingerprint and speech. We refer to this as mobile biometry. The objective of this study is to increase the usability of mobile biometry for visually impaired users, using the face as the biometric. We illustrate a scenario of a person capturing his/her own face images which are as frontal as possible. This is a challenging task for the following reasons. First, greater variation in head pose and degradation in image quality (e.g., blur, de-focus) are expected due to the motion introduced by hand manipulation and unsteadiness. Second, for visually impaired users, there currently exists no mechanism to provide feedback on whether a frontal face image is detected. In this paper, an audio feedback mechanism is proposed to assist the visually impaired to acquire face images of better quality. A preliminary user study suggests that the proposed audio feedback can potentially (a) shorten the acquisition time and (b) improve the success rate of face detection, especially for non-sighted users.
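
As a rough illustration of the three-stage audio feedback scheme described in the citing text above (three sinusoidal cues whose frequency and tempo increase from "no face" to "frontal face"), the sketch below generates beep patterns for each stage. The specific frequencies, tempos, and sample rate are assumptions, not values from the paper.

```python
# A minimal sketch of three-stage audio feedback: one sinusoidal cue per stage,
# with frequency and repetition rate (tempo) increasing from "no face" to
# "frontal face". Frequencies, tempos, and sample rate are illustrative only.

import math

STAGES = {  # stage name: (tone frequency in Hz, beeps per second)
    "no_or_partial_face": (330.0, 1.0),
    "non_frontal_face":   (660.0, 2.0),
    "frontal_face":       (990.0, 4.0),
}

def beep_pattern(stage, duration_s=1.0, sample_rate=8000):
    """Return mono samples in [-1, 1]: sine bursts whose pitch and tempo encode the stage."""
    freq, tempo = STAGES[stage]
    period = 1.0 / tempo                 # seconds between beep onsets
    beep_len = period / 2.0              # 50% duty cycle
    samples = []
    for n in range(int(duration_s * sample_rate)):
        t = n / sample_rate
        in_beep = (t % period) < beep_len
        samples.append(math.sin(2 * math.pi * freq * t) if in_beep else 0.0)
    return samples

if __name__ == "__main__":
    for stage in STAGES:
        s = beep_pattern(stage, duration_s=0.5)
        print(stage, "->", len(s), "samples, peak", round(max(s), 2))
```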
... Moreover, some work has been done on developing Web browsers for blind and visually impaired users. The focus in [10] is to map a graphical HTML document into a 3D virtual sound space environment, where non-speech auditory cues differentiate HTML documents. In all the applications for blind or visually impaired users, the goal is to transform as much information as possible into the audio channel, giving up on the visual channel completely. ...
Conference Paper
Full-text available
As small portable devices are becoming standard personal equipment, there is a great need for the adaptation of information content to small displays. Currently, no good solutions exist for viewing formatted documents, such as PDF documents, on these devices. Adapting the content of web pages to small displays is usually achieved by completely redesigning a page or automatically reflowing text for small displays. Such techniques may not be applicable to documents whose format needs to be preserved. To address this problem, we propose a new document representation called Multimedia Thumbnail. A Multimedia Thumbnail uses the visual and audio channels of small portable devices to communicate document information in the form of a multimedia clip, which can be seen as a movie trailer for a document. Generation of such a clip includes a document analysis step, where salient document information is extracted; an optimization step, where the document information to be included in the thumbnail is determined based on display and time constraints; and a synthesis step, where visual and audible information are formed into a playable Multimedia Thumbnail. We also present user study results that evaluate an initial system design and point to further modifications of the analysis, optimization, and user interface components.
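
The optimization step described above selects a subset of the extracted salient content under time and display constraints. The sketch below shows one simple way such a selection could work, a greedy pick by importance per second under a clip-length budget; it is not the authors' optimizer, and the segment names, scores, and durations are invented for illustration.

```python
# Sketch of the selection idea only (not the authors' optimizer): given salient
# segments with an importance score and a playback duration, greedily pick a subset
# that fits a target clip length. Scores and durations are illustrative assumptions.

def select_segments(segments, budget_s):
    """segments: list of (name, importance, duration_s). Returns names chosen greedily
    by importance per second until the time budget is exhausted."""
    chosen, used = [], 0.0
    for name, score, dur in sorted(segments, key=lambda s: s[1] / s[2], reverse=True):
        if used + dur <= budget_s:
            chosen.append(name)
            used += dur
    return chosen, used

if __name__ == "__main__":
    segs = [("title_image", 0.9, 3.0), ("abstract_tts", 0.8, 8.0),
            ("figure_1", 0.6, 4.0), ("section_2_tts", 0.4, 10.0)]
    picked, total = select_segments(segs, budget_s=15.0)
    print(picked, f"{total:.1f}s of 15.0s used")
```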
... Some proposals have been researched when developing non-visual browsing environments, in particular for WWW browsing. Methods for conveying the document structure and assisting in navigation within the document have varied from 3D audio (Goose and Moller, 1999) to the use of a touch-sensitive screen to facilitate user interaction (Roth et al., 1999). To convey document content, such as the presence of hyperlinks and headings, and assist in the navigation between documents, auditory icons (Gaver, 1993; Blattner et al., 1990), multiple speakers and sound effects (James, 1997), amongst other techniques, have been studied. ...
Conference Paper
Full-text available
This paper presents a framework for the conversion of audiotape spoken books to full-featured digital talking books. It is developed within the context of the IPSOM project. The introduction of search, cross-referencing and annotation mechanisms, with multimedia and through multimodal capabilities, is considered. Different formats and standards are taken into consideration, as well as different interaction alternatives. The resulting digital talking books are aimed at the visually impaired community, but also at situated applications and studies of cognitive aspects. The framework is part of a larger setting enabling the authoring, by reuse and enrichment of multimedia units, of digital multimedia and multimodal documents.
Article
Screen readers for blind and visually impaired users and short-video platforms have conflicting functionalities. In particular, blind users encounter information access barriers when searching for video content, which reduces their user experience. We embed auditory cues at the beginning of a short video corresponding to its content to help blind users identify the video type. The experimental design and evaluation results reveal the significant impact of these auditory cues. By embedding auditory cues, we can significantly enhance the user's usability, recognition efficiency, and emotional experience, surpassing the experience of traditional short videos. Speech had the shortest response time and highest accuracy, while auditory icons provided a better emotional experience. In addition, some participants expressed concerns about the potential social privacy issues associated with speech. This study provides auditory cue-matching solutions for a wide range of short videos and offers hope for enhancing the experience of short-video platforms for blind users. By doing so, we contribute to the well-being of people with disabilities and provide highly versatile user experience design recommendations for a broader range of digital media platforms.
Article
3D models are an important means for understanding spatial contexts. Today these models can be materialized by 3D printing, which is increasingly used at schools for people with visual impairments. In contrast to sighted people, people with visual impairments have so far, however, neither been able to search nor to print 3D models without assistance. This article describes our work to develop an aid for people with visual impairments that would facilitate autonomous searching for and printing of 3D models. In our initial study, we determined the requirements to accomplish this task by means of a questionnaire and developed a first approach that allowed personal computer-based 3D printing. An extended approach allowed searching and printing using common smartphones. In our architecture, technical details of 3D printers are abstracted by a separate component that can be accessed via Wi-Fi independently of the actual 3D printer used. It comprises a search of the models in an annotated database and 3D model retrieval from the internet. The whole process can be controlled by voice interaction. The feasibility of autonomous 3D printing for people with visual impairments is shown with a first user study. Our second user study examines the usability of the user interface when searching for 3D models on the internet and preparing them for the materialization. The participants were able to define important printing settings, whereas other printing parameters could be determined algorithmically.
Article
Full-text available
Assistive technology for visually impaired and blind people is a research field that is gaining increasing prominence owing to an explosion of new interest in it from disparate disciplines. The field has a very relevant social impact on our ever-increasing aging and blind populations. While many excellent state-of-the-art accounts have been written to date, all of them are subjective in nature. We performed an objective statistical survey across the various sub-disciplines in the field and applied information analysis and network-theory techniques to answer several key questions relevant to the field. To analyze the field we compiled an extensive database of scientific research publications over the last two decades. We inferred interesting patterns and statistics concerning the main research areas and underlying themes, identified leading journals and conferences, captured growth patterns of the research field, identified active research communities, and present our interpretation of trends in the field for the near future. Our results reveal that there has been sustained growth in this field: from less than 50 publications per year in the mid 1990s to close to 400 scientific publications per year in 2014. Assistive technology for persons with visual impairments is expected to grow at a swift pace and impact the lives of individuals and the elderly in ways not previously possible.
Chapter
Spatial auditory interfaces use three-dimensional sound as an additional display dimension and consist of audio items at different spatial locations. They have evolved significantly in the last couple of years and can be found in a variety of environments where visual communication is obstructed or completely blocked by other activities, such as walking, driving, flying, operating multimodal virtual displays, etc. The precise spatial position of each source can offer an additional informational cue in the interface or can simply help resolving various ambiguities in the content of simultaneously-played sources. It can also be used to increase realism in virtual worlds by imitating real environments where the majority of sounds can be localized and associated with their sources.
Conference Paper
Auditory user interfaces have great Web-access potential for billions of people with visual impairments, with limited literacy, who are driving, or who are otherwise unable to use a visual interface. However a sequential speech-based representation can only convey a limited amount of information. In addition, typical auditory user interfaces lose the visual cues such as text styles and page structures, and lack effective feedback about the current focus. To address these limitations, we created Sasayaki (from whisper in Japanese), which augments the primary voice output with a secondary whisper of contextually relevant information, automatically or in response to user requests. It also offers new ways to jump to semantically meaningful locations. A prototype was implemented as a plug-in for an auditory Web browser. Our experimental results show that the Sasayaki can reduce the task completion times for finding elements in webpages and increase satisfaction and confidence.
Article
Full-text available
Introduction: This article reports on a study that explored the benefits and drawbacks of using spatially positioned synthesized speech in auditory interfaces for computer users who are visually impaired (that is, are blind or have low vision). The study was a practical application of such systems: an enhanced word processing application compared to conventional screen-reading software with a braille display.
Methods: Two types of user interfaces were compared in two experimental conditions: a JAWS screen reader equipped with an ALVA 544 Satellite braille display and a custom auditory interface based on spatialized speech. Twelve participants were asked to read and process three different text files with each interface and to collect information about their form and structure. Task-completion times and the correctness of the perceived information on text decorations, text alignment, and table structures were measured.
Results: The spatial auditory interface proved to be significantly faster (3 minutes, 12 seconds) than the JAWS screen reader with the ALVA braille display (8 minutes, 38 seconds), F(1,70) = 391.523, p < .001, and 15% more accurate when gathering information on text alignment, F(1,70) = 28.220, p < .001. No significant difference between the interfaces could be established when comparing questions on text decorations, F(1,70) = 0.912, p = .343, or table structures, F(1,70) = 1.045, p = .310.
Discussion: The findings show that the auditory interface with spatialized speech is more than 160% faster than the tactile interface while remaining equally accurate and effective for gathering information on various properties of text and tables.
Implications for practitioners: The spatial location of synthesized speech can be used for the fast presentation of the physical position of texts in a file, their alignment, the dimensions of tables, and the position of specific texts within tables. The quality of spatial sound reproduction can play an important role in the overall performance of such systems.
Conference Paper
Audio Enriched Links provide previews of linked web pages to users with visual impairments. Before a user follows a hyperlink, the Audio Enriched Links software presents a spoken summary of the next page including its title, its relation to the current page, statistics about its content, and some highlights from its content. We believe that such a summary may be a useful surrogate for a full web page, and help users with visual impairments decide whether or not to spend time visiting a linked page. In this paper, we present some motivation for the Audio Enriched Links project. We describe the design and implementation of the current software prototype, and discuss the results of an initial evaluation involving four participants. We conclude with some implications of this work and directions for future research.
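
A minimal sketch of the kind of spoken preview described above, assembling a summary string from a link target's title, its relation to the current page, simple content statistics, and a highlight. It is not the Audio Enriched Links implementation; the field names and wording are assumptions.

```python
# A hedged sketch (not the Audio Enriched Links implementation): composing a short
# spoken preview of a link target from its title, relation to the current page,
# simple content statistics, and a highlight. Field names are illustrative assumptions.

def link_preview(title, same_site, word_count, num_links, highlight):
    """Return the text a speech synthesizer would read before the user follows the link."""
    relation = "on this site" if same_site else "on an external site"
    return (f"Link preview: {title}, {relation}. "
            f"About {word_count} words and {num_links} links. "
            f"Highlight: {highlight}")

if __name__ == "__main__":
    print(link_preview("Accessibility guidelines", True, 1200, 34,
                       "Provide text alternatives for non-text content."))
```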
Article
Full-text available
Human discourse is an embodied activity emerging from the embodied imagery and construction of our talk. Gesture and speech are coexpressive, conveying this imagery and meaning simultaneously. Mathematics instruction and discourse typically involve two modes of communication: speech and graphical presentation. Our goal is to assist Individuals who are Blind or Severely Visually Impaired (IBSVI) to access such instruction/communication. We employ a haptic glove interface to furnish the IBSVI with awareness of the deictic gestures performed by the instructor over the graphic in conjunction with speech. We present a series of studies spanning two years where we show how our Haptic Deictic System (HDS) can support learning in inclusive classrooms where IBSVI receive instruction alongside sighted students. We discuss how the introduction of the HDS was advantageous to all parties: IBSVI, instructor, and sighted students. The HDS created more learning opportunities, increasing mutual understanding and promoting greater engagement.
Conference Paper
This paper reviews the overall impact of culture on the design of crossover applications, which are particularly intended to support the blind and visually impaired (B/VI) community. We believe that cultural differences have an impact on the proliferation and wide usage of any assistive technology. Therefore, cultural aspects must be considered in the design of crossover applications for the B/VI community. Comprehensive cultural vision is necessary for an innovative application to be accepted widely. In addition, there is a need to revise and periodically review the application's acceptance, and update a product based on the cultural changes that may occur to some communities especially with the vast cultural exchange. This paper reviews and highlights good design practices for assistive technology from cultural perspectives.
Conference Paper
Speech is the most natural form of face-to-face communication. Owing to more sophisticated information systems and advanced educational requirements, speech is also gaining importance in human-computer interaction. The present study investigates a text-to-speech (TTS) feature in a learning context. 252 questionnaires allow for descriptions of the positive and negative experiences of TTS learners. Additionally, descriptive insights into enjoyment factors are provided and differences between German and English texts are shown. Furthermore, preferences of different learning styles and the values conveyed by TTS features are explored. Findings provide a starting point for more specific future studies through insights into TTS evaluation in a learning context. Based on positive and negative experiences, 13 dimensions relevant for a performance measurement scale are suggested. It is shown that, among others, theoretical texts and exercises are appreciated as TTS, especially by aural learners, enabling for instance language learning on the go.
Article
Previous research has shown that violent video game exposure increases aggressive thoughts, aggressive feelings, aggressive behavior and physiological arousal. However, most of the research in this field has only focused on the “video” aspect of these games, and little attention has been paid to the “audio”. In this study, both background music within video games and the games themselves were used as two independent variables to test their influence on physical excitement and aggression. Physical excitement was measured using biofeedback equipment and aggression was measured using the hot sauce paradigm. Results showed that both music and video games can cause significant increases in physical excitement, while violent video games cause higher levels of physical excitement than non-violent games. The excitement level of the background music interacted with the game content to give a combined effect on aggression. Thus, the present study extended prior findings by showing that background music has an indispensable role in the level of aggression induced through video games. The results also demonstrated that it is both necessary and beneficial to design background music for video games in such a way that it matches the action taking place in the game.
Article
Non-visual environments are becoming important in supporting human activities through the use of man-machine systems. Here, a non-visual environment means a situation in which a screen and mouse are not available. For example, a user is occupied with some other task involving eye-hand coordination, or is visually disabled. In this paper, we propose "Speech Pointer", a user interface for non-visual environments using speech recognition and synthesis, whose aim is to enable direct access or pointing to textual information. We prototyped the Speech Pointer for browsing the Web. We also prototyped another non-visual user interface on top of an existing visual application in order to find an efficient method for extending non-visual applications. The purpose of these activities is to increase the scope of human activities in non-visual environments.
Article
Exergames are video games that use physical activity as input and which have potential to change sedentary lifestyles and improve associated health problems such as obesity. However, exergames are generally difficult for the visually impaired to play. In this research, we describe a method of interacting with exergames for visually impaired players.
Article
Full-text available
This paper deals with research on the design of "sound fonts" and the development of an evaluation methodology suitable for use with non-visual presentation based on the speech modality or on multimodality (speech and tactile). The working hypothesis of the study presented here relies on the fact that both structure and typographic attributes aid the comprehension process in visual presentation. Based on this observation, the Human-Computer Interaction (HCI) question is to find alternative sounds or prosodic variants to display the typographic attributes (bold, italic, for instance). This question is part of the broader paradigm of information accessibility problems.
Conference Paper
In this paper, we describe auditory and tactile interfaces to represent visual effects nonvisually for blind users, allowing intuitive recognition of visual content that appears on the Web. This research examines how visual effects could be recognized by blind subjects using the senses of hearing and touch, aiming at integrating the results into a practical system in the future. As an initial step, two experiments were performed, one on the sonification and tactilization of a page overview based on color-based fragmented groupings without speech, and one on the sonification and tactilization of emphasized text based on analyzing rich text information with speech. The subjects could recognize the visual representations presented by the auditory and tactile interfaces throughout the experiment, and were conscious of the importance of the visual structures. We believe this shows our approach may be practical and available in the future. We will summarize our results and discuss what kind of information is suitable for each sense, as well as the next planned experiment and other future work.
Conference Paper
Full-text available
This paper describes an extension to a multimodal system designed to improve Internet accessibility for the visually impaired. Here we discuss the novel application of a grid (patent pending) to our assistive web interface. Findings from our evaluation have shown that the grid enhances interaction by improving the user's positional awareness when exploring a web page.
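
A hedged sketch of the positional-awareness idea, not the patented grid itself: the page is divided into a coarse grid and the cell under the cursor is announced. The grid size and label wording are assumptions.

```python
# Illustrative sketch only: dividing the page into a coarse grid and announcing the
# cell under the cursor to support positional awareness. Grid size and label wording
# are assumptions, not details from the paper.

def grid_cell(x, y, page_w, page_h, rows=3, cols=3):
    """Return a spoken-style label such as 'top left' for the default 3x3 grid."""
    col = min(int(x / page_w * cols), cols - 1)
    row = min(int(y / page_h * rows), rows - 1)
    row_names = ["top", "middle", "bottom"]
    col_names = ["left", "centre", "right"]
    return f"{row_names[row]} {col_names[col]}"

if __name__ == "__main__":
    print(grid_cell(100, 50, 1024, 768))    # -> "top left"
    print(grid_cell(900, 700, 1024, 768))   # -> "bottom right"
```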
Conference Paper
Parallel visual data acquisition is not available to the blind. Yet, sequential tactile scanning (e.g., with a white cane) allows them to form mental concepts of their surroundings, albeit more slowly. The purpose of this project is to demonstrate that acoustic serial scanning of graphical objects allows a blind user to form mental concepts and to reproduce these objects graphically. Moreover, this system is designed to enable blind users to obtain graphic information and express their visual ideas graphically through sound.
Conference Paper
While the usability of voice-based Web navigation has been steadily improving, it is still not as easy for users with visual impairments as it is for sighted users. One reason is that sequential voice representation can only convey a limited amount of information at a time. Another challenge comes from the fact that current voice browsers omit various visual cues such as text styles and page structures, and lack meaningful feedback about the current focus. To address these issues, we created Sasayaki, an intelligent voice-based user agent that augments the primary voice output of a voice browser with a secondary voice that whispers contextually relevant information as appropriate or in response to user requests. A prototype has been implemented as a plug-in for a voice browser. The results from a pilot study show that our Sasayaki agent is able to improve users' information search task time and their overall confidence level. We believe that our intelligent voice-based agent has great potential to enrich the Web browsing experiences of users with visual impairments.
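
The sketch below illustrates the two-channel idea in the abstract above, a primary voice reading the content while a secondary "whisper" channel adds contextual annotations; it is not the Sasayaki plug-in, and the page model and annotation wording are assumptions.

```python
# Illustrative sketch inspired by the secondary-voice idea (not the Sasayaki plug-in):
# the primary voice reads the page text while a second, quieter voice adds context
# such as the region type or the number of links. The page model is an assumption.

def narrate(regions):
    """regions: list of (region_type, text, link_count). Yields (channel, utterance) pairs."""
    for region_type, text, link_count in regions:
        yield ("whisper", f"Entering {region_type} with {link_count} links")
        yield ("primary", text)

if __name__ == "__main__":
    page = [("navigation bar", "Home. Products. Support. Contact.", 4),
            ("main article", "Our new release improves keyboard access.", 1)]
    for channel, utterance in narrate(page):
        print(f"[{channel:7s}] {utterance}")
```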
Conference Paper
Full-text available
A persistent concern in the field of auditory display design has been how to effectively use environmental sounds, which are naturally occurring familiar non-speech, non-musical sounds. Environmental sounds represent physical events in the everyday world, and thus they have a semantic content that enables learning and recognition. However, unless used appropriately, their functions in auditory displays may cause problems. One of the main considerations in using environmental sounds as auditory icons is how to ensure the identifiability of the sound sources. The identifiability of an auditory icon depends on both the intrinsic acoustic properties of the sound it represents, and on the semantic fit of the sound to its context, i.e., whether the context is one in which the sound naturally occurs or would be unlikely to occur. Relatively recent research has yielded some insights into both of these factors. A second major consideration is how to use the source properties to represent events in the auditory display. This entails parameterizing the environmental sounds so the acoustics will both relate to source properties familiar to the user and convey meaningful new information to the user. Finally, particular considerations come into play when designing auditory displays for special populations, such as hearing impaired listeners who may not have access to all the acoustic information available to a normal hearing listener, or to elderly or other individuals whose cognitive resources may be diminished. Some guidelines for designing displays for these populations will be outlined.
Article
The inability of computer users who are visually impaired to access graphical user interfaces (GUIs) has led researchers to propose approaches for adapting GUIs to auditory interfaces, with the goal of providing access for visually impaired people. This article outlines the issues involved in nonvisual access to graphical user interfaces, reviews current research in this field, classifies methods and approaches, and discusses the extent to which researchers have resolved these issues.
Article
In the Internet world, the widespread use of graphical user interfaces (GUIs) increasingly bars visually handicapped people from accessing digital information. In this context our project aims at providing sight-handicapped people with alternative access modalities to various types of GUIs and graphics-intensive programs, in order, for instance, to facilitate usage of Web services. We describe in this paper ABWeb [1], a 3D-audio Web browser that allows blind computer users to explore Web pages, fill in forms, etc., using a 3D sonic rendering. We also present WebSound [2], a generic tool that permits associating a given sonic object (earcon or auditory icon) with each HTML tag. Finally, we describe a series of associated programs composed of the family of sonic games From Dots to Shapes [3], as well as IDEA, a tool that lets users comprehend simple drawings and create graphics. Keywords: WWW, blind users, 3D virtual sound space, multimodal interface, sound and image proces...
Article
Full-text available
Users who are blind currently have limited access to graphical user interfaces based on MS Windows or X Windows. Past access strategies have used speech synthesizers and braille displays to present text-based interfaces. Providing access to graphical applications creates new human interface design challenges which must be addressed to build intuitive and efficient nonvisual interfaces. Two contrasting designs have been developed and implemented in the projects Mercator and GUIB. These systems differ dramatically in their approaches to providing nonvisual interfaces to GUIs. This paper discusses four main interface design issues for access systems, and describes how the Mercator and GUIB designs have addressed these issues. It is hoped that the exploration of these interfaces will lead to better nonvisual interfaces used in low visibility and visually overloaded environments. KEYWORDS Nonvisual HCI, blind users, graphical user interfaces, auditory interfaces, tactile interfaces INTRODU...
Article
Full-text available
This paper includes our main findings; for a more detailed discussion, refer to [7].
Article
Stephen M. Kosslyn is Professor of Psychology at Harvard University and an Associate Psychologist in the Department of Neurology at the Massachusetts General Hospital. He received his B.A. in 1970 from UCLA and his Ph.D. from Stanford University in 1974, both in psychology, and taught at Johns Hopkins, Harvard, and Brandeis Universities before joining the Harvard Faculty as Professor of Psychology in 1983. His work focuses on the nature of visual mental imagery and high-level vision, as well as applications of psychological principles to visual display design. He has published over 125 papers on these topics, co-edited five books, and authored or co-authored five books. His books include Image and Mind (1980), Ghosts in the Mind's Machine (1983), Wet Mind: The New Cognitive Neuroscience (with O. Koenig, 1992), Elements of Graph Design (1994), and Image and Brain: The Resolution of the Imagery Debate (1994). Dr. Kosslyn has received numerous honors, including the National Academy of Sciences Initiatives in Research Award, is currently on the editorial boards of many professional journals, and has served on several National Research Council committees to advise the government on new technologies.
Article
Carroll and Campbell have exercised themselves over a straw man not subscribed to by us. In doing so, they have misrepresented our position and even the statements in our paper. In reply, we restate as clearly as we can the position for which we actually did and do argue and give examples of their misrepresentations. The underlying issue seems to concern the advantages of using technical psychological theories to identify underlying mechanisms in human-computer interaction. We argue that such theories are an important part of a science of human-computer interaction. We argue further that technical theories must be considered in the context of the uses to which they are put. Such considerations help the theorist to determine what is a good approximation, the degree of formalization that is justified, the appropriate commingling of qualitative and quantitative techniques, and encourages cumulative progress through the heuristic of divide and conquer.
Article
The Internet now permits easy access to textual and pictorial material from an exponentially growing number
New Technologies in the Education of the Visually Handicapped, D. Burger, Ed., Les Editions INSERM, Paris, FR, Vol. 237, 1996.
Cascading Style Sheets, W3C Recommendation, 1998; see http://www.w3.org/TR/REC-CSS2/.
Wet Mind: The New Cognitive Neuroscience, S. M. Kosslyn and O. Koenig, 1992.