C. Stephanidis (Ed.): Universal Access in HCI, Part I, HCII 2009, LNCS 5614, pp. 106–115, 2009.
© Springer-Verlag Berlin Heidelberg 2009
Educational Sound Symbols for the Visually Impaired
Steve Mannheimer, Mexhid Ferati, Davide Bolchini, and Mathew Palakal
Indiana University School of Informatics
535 West Michigan Street, Indianapolis, Indiana 46202
{smannhei,mferati,dbolchin,mpalakal}@iupui.edu
Abstract. Acoustic-based computer interactivity offers great potential [1],
particularly with blind and visually impaired users [2]. At Indiana University’s
School of Informatics at IUPUI, we have developed an innovative educational
approach relying on “audemes,” short, nonverbal sound symbols made up of
2-5 individual sounds lasting 3-7 seconds - like expanded “earcons” [3] - to
encode and prompt memory. To illustrate: An audeme for “American Civil War”
includes a 3-second snippet of the song Dixie partially overlapped by a snippet
of Battle Hymn of the Republic, followed by battle sounds, together lasting 5
seconds. Our focus on non-verbal sound explores the mnemonic impact of
metaphoric rather than literal signification. Working for a year with BVI
students, we found audemes improved encoding and long-term memory of verbal
educational content, even after five months, and engaged the students in
stimulating ways.
Keywords: Audeme, sound, acoustic, interface, accessibility, blind and visually
impaired, cognition, long-term memory, education.
1 Introduction
For most people the visual sense dominates the day-to-day perception of the world.
The recent proliferation of visual or screen-based technologies has reinforced that
domination, and elevated language over non-verbal sounds that in some contexts may
be considered irrelevant or, worse, interference [18] or at best a mixed blessing [19].
This exacerbates the educational challenges to blind and visually impaired (BVI)
students. “Screen-reader” or text-to-speech (T2S) applications are restrictively linear
and affectively empty, making long T2S translations hard to remember. Not enough
research has leveraged the common ability to instantly remember non-verbal sounds
such as old song melodies or the voice of a long-lost friend, as well as the innate
ability to identify a wide range of natural and machine-based sounds. The affective,
cognitive and mnemonic power of non-verbal sound has been an indispensable element in
the entertainment media for nearly a century, but has generally taken a back seat to
language-based approaches in information technologies and educational settings.
Our research explores the efficacy of non-symmetrical paradigms of acoustic inter-
activity, and the utility of non-verbal sound as the output process from computers. In
some contexts non-verbal sound offers a superior means to achieve cognitive goals
dependent on memory and the semantic construction of meaning.
The preliminary hypothesis of our research is that short non-speech acoustic sym-
bols, which we have called audemes (to suggest an auditory morpheme, lexeme
and/or phoneme) can substitute for visual/textual labels/icons to improve computer-
mediated access to educational material for BVI users. An audeme is a combination
of raw sounds crafted into a brief audio track, generally in the 3-7 second range, used
to signify a specific theme and to prompt memory of an associated body of verbal
content. Audemes may combine 1) iconic sounds made by natural and/or manufac-
tured things (e.g. surf and seagulls, cash registers); 2) abstract sounds manufactured
by computers (e.g. buzzes, blips, etc.); 3) music; and 4) occasional snippets of lan-
guage gleaned from songs or well-known cultural sources (e.g. President Kennedy’s
“Ask not what your country can do for you…”). The semiotic structure of an audeme
is shown in Fig. 1.
Working over a year with students and staff of the Indiana School for the Blind and
Visually Impaired (ISBVI) we determined that audemes work best when combining
2-5 separate sounds. Our 20 ISBVI student collaborators could identify most iconic
sounds available from commercial sound effects libraries, sometimes after only 2-3
seconds. This allowed researchers to construct relatively complex sequenced and/or
layered audemes (see Fig. 1). For example, an audeme of key jangles + car engine
revving = driving trip + shore sounds = trip to the beach. More complexly, an audeme
signifying the American Civil War contains short snippets of Dixie and Battle Hymn
of the Republic, staggered and conflicting musically for two seconds, followed by the
sound of rifle and cannon fire, all combined in a 5-second audeme.
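The composition rules described above (2-5 constituent sounds, a total length in the 3-7 second range, sounds sequenced and/or layered) can be sketched as a small data structure. This is only an illustrative sketch, not the authors' tooling: the class names `Sound` and `Audeme`, the `kind` labels, and the onset and duration values chosen for the Civil War example are all assumptions made for the illustration. The text gives only the 5-second total and the roughly two seconds of musical overlap.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sound:
    """One constituent sound of an audeme (hypothetical model)."""
    label: str       # e.g. "Dixie"
    kind: str        # "iconic", "abstract", "music" or "speech"
    start: float     # assumed onset within the audeme, in seconds
    duration: float  # assumed length, in seconds

    @property
    def end(self) -> float:
        return self.start + self.duration


@dataclass
class Audeme:
    """A short, layered and/or sequenced combination of sounds."""
    theme: str
    sounds: List[Sound]

    @property
    def duration(self) -> float:
        # Layered sounds overlap, so total length is the latest end time.
        return max(s.end for s in self.sounds)

    def within_guidelines(self) -> bool:
        # 2-5 sounds and 3-7 seconds, as reported in the text.
        return 2 <= len(self.sounds) <= 5 and 3.0 <= self.duration <= 7.0


# Illustrative timings only: the two songs overlap for two seconds,
# followed by battle sounds, for a total of five seconds.
civil_war = Audeme("American Civil War", [
    Sound("Dixie", "music", start=0.0, duration=3.0),
    Sound("Battle Hymn of the Republic", "music", start=1.0, duration=2.0),
    Sound("rifle and cannon fire", "iconic", start=3.0, duration=2.0),
])

print(civil_war.duration, civil_war.within_guidelines())
```

Modelling onsets explicitly is one way to represent both sequenced and layered audemes with the same structure; a purely sequential audeme is simply one in which no two sounds overlap.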
Fig. 1. Semiotic structure of an audeme. For example, audeme “a” (key jangle) references
“keys,” but which meaning of “keys?” Adding audeme “b” (car engine revving) directs that
meaning toward the process of driving, thus suggesting “trip.” The process of constructing
meaning is expandable: Adding an audeme “c” (sea gulls and surf) steers the meaning toward
“trip to the beach” or “vacation.”
In audeme design we utilized a mix of sound types but avoided speech as much as
possible. This decision enabled us to explore whether symbolic or metaphoric asso-
ciations of audeme-to-content were stronger than literal associations (simply using a
verbal title for content). It also allowed our collaborators a more engaged and crea-
tive participation in constructing audeme meanings. Constructions were often debated
because common sounds (e.g. the jangle of keys) can suggest different meanings
depending on the contexts most familiar to users (access to home; a car; locks and
security; even “the key” to a problem) and thus required more sounds to direct the
meaning (e.g. jangle + typing = computer security). We also observed the strong im-
pact of affect and “aesthetic” quality in audemes. Users preferred audemes and
audeme sequences that were ingeniously interpreted to connect to target themes. This
indicated the strong value of play in itself and for meaning construction, expressed
both as self-satisfaction or in competition with others.
Audemes are more complex and contextually variable than graphic symbols, and
are perhaps more comparable to signs in American Sign Language (ASL), in which
users create new signs for new ideas in the world (e.g. the sign for “credit card” is
SIGNATURE-RECTANGULAR; “tranquilizer” is signed as PILL-QUIET) [22]. We
found it useful at times to think of our growing dictionary of audemes like a deck of
cards, with each card clearly capable of fulfilling different roles depending on the
individual dynamics of any “hand” in any game being played. Because audemes ex-
hibit a natural semantic flexibility, and due to our subjects’ preference for playful
engagement with audemes, we chose to develop an infrastructure and interface that
could be “played” like a game or “played with” like a deck of cards or a musical in-
strument that allows either improvisation, set pieces or hybrids with either autotelic or
pre-defined goals. This strategy parallels the practice of Web-surfing and other types
of information exploration or discovery used in education.
2 Related Work
Foundational work in psychoacoustics [1] raised questions about how speech and
non-speech stimuli proceeded from short-term memory to long-term memory. With
the advent of the personal computer in the 1980s, exploratory work in the use of
acoustic cues for graphic interfaces was performed by researchers such as W. Gaver
[9], S. Brewster [4], M. Blattner et al. [3], A. Edwards [7] and others. This helped
promote work with sound-based interfaces, including speech-based, for BVI people,
or in “eyes free” situations such as driving [7], [16]. One conceptual debate in this
arena concerned the relative value of speech vs. non-speech sound cues to supplement
graphic-textual displays. Smither suggested that synthetic speech is generally less
memorable than natural speech [15] and Brewster agrees [4]. Anecdotal testimony
from the BVI community supports this.
Further debate concerns the relative value of abstract sounds (beeps, blips, etc.) vs.
natural sounds (also called metaphoric or iconic) referring to a topic (e.g., the sound
of rain to signify rain, a weather report or meteorology). Gaver [9] suggested that
iconic sounds are more memorable cues for content, both more long-lasting in mem-
ory and better able to conjure a range or depth of content associations. Conversy [5]
suggested abstract or synthesized sounds can signify concepts such as speed or waves.
As suggested by Back and Des [1], popular media strongly influence how we expect
the natural world to sound. As we know from movies, “…thunder must crack, boom,
or roll, and seagulls must utter high lonesome cries or harsh squawks…” [10]. Our
ISBVI subjects easily identified natural or mechanical sounds that they would only
have experienced via entertainment media (e.g. tiger growls or machine-gun fire). A
judicious mix of sound types may be best. In their workshop, Frohlich and Pucher [8]
state, “Some pioneering projects have presented promising design ideas and informal
usability evaluations of auditory systems, in which a systematic integration of sound
and speech played a significant role.”
Studies strongly suggest that sound can be a powerful prompt for memory [13]. In
some studies performed with BVI students [6], their performance was superior to that of
sighted students, perhaps due to a relative lack of acoustic acuity in sighted children
[12]. Other researchers created games that enhanced children’s short-term memory
[11]. Previous work on earcons or other sound symbols has focused on short-term
associations with relatively simple content or meanings, and has not, to our knowl-
edge, explored their long-term potential to encode and cue relatively large amounts of
thematically complex material. Our study helps fill this gap.
To address broad concerns that audemes might work primarily as mnemonic cues
simply because they were unusual stimuli associated with content, or would function
no better or worse than verbal cues, we conducted a series of simple tests. In previously
published work [23], we determined 1) that memory for random numbers
presented with audemes was 14.88% stronger than memory for random numbers presented
with spoken words; 2) that audemes with thematic or metaphoric connection to
verbal content (e.g., footsteps in snow + gunfire = The Cold War) were 67.82% more
effective as memory cues than audemes with no thematic relation to their texts
(e.g. mechanical buzz + snippet of classical music = National Grange); 3) that aude-
mes with positive affect (explained as good, happy, positive, I like it, etc.) improved
recall in 67.86% of the cases while audemes with negative affect (bad, unhappy, I
don’t like it) improved memory in 32.14% of the test cases.
3 Methods
3.1 Experimental Environment
Audemes. Audemes are very short sound tracks that may include natural, mechanical,
musical or abstract sounds. Verbal cues also may be included as song lyrics or
thematic quotations. In our work, audeme design was a dialogic process between
researchers and students. For the three initial memory tests we created audemes for
“Radio,” for “Slavery” and for “US Constitution.” Researchers also created three
essays of approximately 500 words each from accepted content from Web-based
sources. The “Radio” audeme was the sound of a radio dial being twisted through
different stations with static in between. The “Slavery” audeme combined an opening
short passage of a choir singing “Swing Low, Sweet Chariot” punctuated at the end
by a whip crack. The “US Constitution” audeme combined the sound of a
gavel (symbolizing courts and legal processes), the sound of quill pen writing and the
opening bars of Star Spangled Banner. The audemes were constructed using Sound-
track Pro software, to be heard through inexpensive speakers.
Participants. For this study we conducted weekly sessions with approximately 20
students of the Indiana School for the Blind and Visually Impaired (ISBVI), working
in ISBVI classrooms with an ISBVI faculty monitor. Students’ ages ranged from 9 to
17 years. Eleven of them were completely blind, and the others were partially
blind. Because of occasional other commitments, the number of participants fluctuated
between 15 and 20. The school and the students’ parents consented to their
recruitment. Students were recruited with ISBVI guidance and volunteered to participate.
3.2 Experiment 1
The same experimental format was followed for the initial memory tests of 3 audeme-
essay combinations. Students were divided into three groups in a careful single-stage
sampling to evenly distribute students by age, learning abilities, and level of visual
impairment. Group I (named IU) was the control group, while Group II (named Notre
Dame) and Group III (Purdue) were the experimental groups. A pretest was conducted
with all groups to establish a baseline of their previous knowledge of the essay content.
The pretest contained 10 questions derived from the essay and these were printed in
Braille or large-print. All three groups took the same test. After the pretest Group I was
moved to a separate classroom; Groups II and III remained together to hear the essay
read aloud. In its room, Group I (the control group) listened to the essay without aude-
mes. Group II and III listened to the same essay with the single relevant audeme played
between each paragraph, approximately 8-10 times for each essay. Two weeks after
each initial session, we conducted a posttest with all 3 groups. The test contained the
questions from the pretest, but in randomized order. Additionally, we added 3 more
questions as statistical noise. The posttest was the same for the 3 groups, except that
Groups I and II took the posttest without hearing the audeme, while Group III heard
the audeme played before and after each of the questions. In short:
Group I was not exposed to the audeme, allowing researchers to track how well
students remembered the essay as speech after two weeks.
Group II was exposed to the audeme when hearing the essay but not when taking the
posttest, allowing researchers to track how well the audemes enhanced the encoding
of the spoken essay.
Group III was exposed to the audeme when hearing the essay and also when taking
the posttest, allowing researchers to track how well the audemes enhanced encoding and
recall of the essay.
The results of the tests in Fig. 2 strongly indicated that exposure to the audemes in-
creased encoding and recollection of the essay information. For “Radio,” Group III
showed a 52% increase in knowledge of the content included in the test (from 4.2
correct answers to 6.4), factored against the pre-knowledge. For the “US Constitu-
tion” essay, Group III showed a 65% increase (from 3.3 correct answers in the pre-
test to 5.50 correct answers in the post-test). For “Slavery,” Group III showed an
80% increase (from 3.75 correct answers in the pre-test to 6.75 correct answers in the
post-test). Group II showed a 38% increase in knowledge for “Radio” (from 4.2 to
5.80 correct answers); and a 16% increase for “US Constitution” (from 5.16 to 6.00
correct answers) and a 12% increase for “Slavery” (6.25 to 7.00). Group I, the
control group, demonstrated a 47% increase in knowledge for “Radio” (3.40 to 5.00),
a 3.6% decrease in knowledge for “US Constitution” (4.67 to 4.50), and a 20%
increase for “Slavery” (5.00 to 6.00) (Table 1).
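The percentages above appear to be relative changes from the pretest mean to the posttest mean. The following minimal sketch recomputes them from the means quoted in the text. The helper name `percent_change` is an assumption, and because the reported means are themselves rounded, a recomputed value may differ slightly from the figure given in the text.

```python
def percent_change(pre: float, post: float) -> float:
    """Relative change from pretest mean to posttest mean, in percent."""
    return (post - pre) / pre * 100.0


# (pretest mean, posttest mean) pairs as quoted in the text.
reported_means = {
    ("Group III", "Radio"): (4.2, 6.4),
    ("Group III", "US Constitution"): (3.3, 5.50),
    ("Group III", "Slavery"): (3.75, 6.75),
    ("Group II", "Radio"): (4.2, 5.80),
    ("Group II", "US Constitution"): (5.16, 6.00),
    ("Group II", "Slavery"): (6.25, 7.00),
    ("Group I", "Radio"): (3.40, 5.00),
    ("Group I", "US Constitution"): (4.67, 4.50),
    ("Group I", "Slavery"): (5.00, 6.00),
}

for (group, essay), (pre, post) in reported_means.items():
    print(f"{group:9s} {essay:16s} {percent_change(pre, post):+6.1f}%")
```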
Fig. 2. Posttest results for all three essays
Fig. 3. Cumulative test results after 5 months
Table 1. Results for all groups
Table 2. Anova
Data Analysis. The statistical analysis of the data began by computing the difference
between the pretest and posttest scores for each participant. This difference was called Gain:
Gain = posttest – pretest (1)
We then analyzed the Gain values in a one-way ANOVA. The p-value is .001 (p < .05),
which means that there is a significant difference in the level of improvement among
the three groups.
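To make the analysis concrete, the sketch below computes Gain per participant as in Eq. (1) and then the one-way ANOVA F statistic across three groups. The scores are invented purely for illustration and are not the study's data. The function names and the group labels A, B and C are likewise assumptions.

```python
from statistics import mean
from typing import Dict, List


def gains(pre: List[float], post: List[float]) -> List[float]:
    """Gain = posttest - pretest, computed per participant (Eq. 1)."""
    return [b - a for a, b in zip(pre, post)]


def one_way_anova_f(groups: Dict[str, List[float]]) -> float:
    """F = (between-group mean square) / (within-group mean square)."""
    values = [v for g in groups.values() for v in g]
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2
                     for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2
                    for g in groups.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical pretest/posttest scores (number of correct answers).
gain_by_group = {
    "A": gains([4, 5, 3], [5, 7, 6]),    # gains 1, 2, 3
    "B": gains([4, 4, 5], [6, 7, 9]),    # gains 2, 3, 4
    "C": gains([3, 4, 3], [8, 10, 10]),  # gains 5, 6, 7
}

f_stat = one_way_anova_f(gain_by_group)
print(f"F({len(gain_by_group) - 1}, "
      f"{sum(map(len, gain_by_group.values())) - len(gain_by_group)})"
      f" = {f_stat:.2f}")
```

Obtaining a p-value additionally requires the F distribution with (k - 1, n - k) degrees of freedom. A statistics package such as SciPy (`scipy.stats.f_oneway`) returns both the statistic and the p-value.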
3.3 Experiment 2
Five months later we re-tested the students on all three essays. We reshuffled the
order of the questions and multiple-choice answers. Results are shown in Fig. 3. The
average result for Group I (heard no audemes) showed a 7.2% decrease
in long-term memory of the essays; the average score for Group II (heard audemes
during encoding five months prior but not during this test) improved 0.78%; and the
average score for Group III (heard audemes during encoding five months prior and for
this testing) increased 3.8%. Although mindful of the limitations of our small sample
size, we are encouraged by the apparent power of audemes to help resist the erosion
of long-term memory. We believe the increased recall may be attributed to the encod-
ing of correct answers given to students after the earlier round of testing.
3.4 Experiment 3
The practical goal of our work is a user interface by which BVI students could access
a complex database of educational content. This would require navigating sequences
or sets of audemes, or the use of very long (VL) audemes incorporating many sounds
in tracks over 15 seconds. We debated whether this navigation worked best as a hierarchy
or as a semantic web. Through this debate, as well as discussions with our
subjects and a series of experiments, we ultimately concluded that audemes worked
best as semantically flexible signifiers or acoustic landmarks in an autotelic exploration,
more akin to a semantic web. Our observations, based on subject testimony, were: 1)
ISBVI students enjoyed the challenge of interpreting or constructing narratives to
explain arbitrary sequences of 2, 3, 4, 5, 6, 7 and even 8 audemes; 2) sequences of 3
or 5 audemes were easier for them to narrate; 3) even-numbered sequences did not
readily resolve into narrative structures; 4) sequences of more than 6 audemes were
generally too difficult to narrate coherently; and 5) with very little practice, some of the
students could generate interpretations, sometimes amazingly sensible, for any arbitrary
sequence of 3-5 audemes. More research should help clarify these results.
In related tests, we explored how our subjects judged similarities between, on the
one hand, core sets of 6 audemes (called C for Charlie, D for David and E for Ed-
ward) and, on the other, new sets recombined from these cores. We determined the
following factors were most powerful in determining the perception of similarity
between core sets and new sets: 1) Majority: Subjects linked new sets to cores 69.85% of
the time when a majority of new-set audemes came from that core. 2) Core-first position:
Of all positions in core sets, first audemes (C1, D1, E1) had the greatest “genetic”
impact for establishing resemblance with new sets. 3) New-last position: Of all
positions in new sets, last audemes had the strongest genetic influence on resemblance
with core sets. 4) Core-consecutiveness: Audeme “chunks” (e.g., core chunk C3-4-5
in new set D5-C3-4-5-E4-2) did not demonstrate appreciable genetic impact in
establishing resemblance to core sets. We also tested very long (VL) audemes
(18 seconds) vs. standard versions (6 seconds) abbreviated from the VLs. Using the
established three groups (IU, Notre Dame and Purdue) we performed two versions of
this experiment to test 1) changes in recall when encoding with VL audemes but test-
ing with standard versions; and 2) changes in recall when encoding with standard
audemes but testing with VL. The results indicated that IU, the group that encoded
with VL audemes then tested with the abbreviated standards, and next encoded with
standard audemes then tested with VL, scored lowest, while Purdue and Notre Dame, which
heard either VL-then-VL or standard-then-standard, scored clearly higher. This suggests
that consistency in audeme exposure maximizes, or inconsistency interferes
with, recall of associated content.
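The “Majority” factor in the recombination tests can be illustrated with a short sketch that expands the set notation used above and checks whether any core contributes more than half of a new set's audemes. The reading of the notation, in which a bare number inherits the core letter of the preceding item, so that D5-C3-4-5-E4-2 means D5, C3, C4, C5, E4, E2, is our assumption. The function names are hypothetical as well.

```python
import re
from collections import Counter
from typing import List, Optional


def expand_set(notation: str) -> List[str]:
    """Expand e.g. 'D5-C3-4-5-E4-2' into ['D5', 'C3', 'C4', 'C5', 'E4', 'E2']."""
    items: List[str] = []
    core: Optional[str] = None
    for token in notation.split("-"):
        match = re.fullmatch(r"([A-Z]?)(\d+)", token)
        if match is None:
            raise ValueError(f"unrecognized token: {token!r}")
        if match.group(1):
            core = match.group(1)  # a letter starts a run from a new core
        if core is None:
            raise ValueError("set must begin with a core letter")
        items.append(f"{core}{match.group(2)}")
    return items


def majority_core(notation: str) -> Optional[str]:
    """Return the core supplying a strict majority of audemes, if any."""
    counts = Counter(item[0] for item in expand_set(notation))
    core, count = counts.most_common(1)[0]
    return core if count > sum(counts.values()) / 2 else None


print(expand_set("D5-C3-4-5-E4-2"))
print(majority_core("D5-C3-4-5-E4-2"))     # C supplies 3 of 6: no strict majority
print(majority_core("C1-C2-D3-C4-E5-C6"))  # C supplies 4 of 6
```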
3.5 Experiment 4
Researchers tested the subjects’ sense of the virtual or intuited location for audemes
on a rectangular field. This experiment followed and hoped to build upon the very
interesting work of the Sonic Mapper [14], in which sounds of common objects were
shown to “cluster” by referenced category. Anecdotal testimony from our ISBVI
collaborators confirmed the general idea that location is a critical element in their
auditory perception, and that the same sound issuing from different locations can
carry different meanings. Location is also an important factor in ASL, in which a sign
made, for instance, near the signer’s chin has a different meaning than the same sign
made away from the head. On a more abstract level, we hoped to explore how BVI
people positioned acoustic symbols in a virtual or metaphoric framework or space.
This question has roots in the ancient art of memory, which relied on an imagined
architectural framework or “memory palace” in which ideas were placed [20]. A more
recent analysis of the correlation of space and ideation comes from Julian Jaynes [21],
who argued that the consciousness of any idea occupied a virtual space, and that
thinking involved positioning ideas as if they were visible objects with distinct locations.
A senior ISBVI technology staff member suggested that blind people rely instead
on metaphoric acoustic spaces. We hoped our experiment might suggest what such a
space might be. We played 20 previously unheard audemes and asked subjects to
intuitively locate each with a crayon mark on a separate piece of graph paper. All
audemes were played without stereo effects, and avoided specific references to spatial
realms of the experienced world (no airplanes or bird calls). No group discussion was
allowed. This experiment failed to demonstrate any consensus or statistically signifi-
cant clustering for any single audeme or general type of audeme. Further experiments
will be needed to provide better data and clearer concepts.
4 Discussion and Conclusion
From our experiments and interviews with our subject-collaborators we have deter-
mined that audemes increase memory for associated text and may contribute to very
long-term retention of that textual information. We also determined that audemes can be
remembered in sets and that this “set memory” can become part of the overall semantic
identity and mnemonic power of any single audeme. Other factors that increase the
mnemonic power of audemes include metaphoric connection between audeme and
content, and positive affect. Because the same constituent sound or audeme can be in-
terpreted differently depending on any established context or adjacent audemes, mne-
monic success was also influenced by the users’ abilities to creatively and playfully
interpret audemes. We believe these factors can be applied to the design and implemen-
tation of an acoustic interface combining audemes and associated content through a
touch-screen monitor to serve the educational goals of BVI students. Further, we believe
this total platform will work best through a variable set of game-like processes and
protocols to provide a fun and flexible learning environment for the students. Our im-
mediate goal is to integrate this interface/platform into the pedagogy of the ISBVI. Our
larger goal is to offer the audeme dictionary, games and overall concept to the larger
BVI community via the Web. We hope this community will help guide the expansion
and application of the platform in ways we may not have anticipated.
Moreover, we believe that these results also can be practically applied to a broad
range of mainstream applications, including 1) sound-based interfaces
and content symbols for Web searches and Website translation; 2) handheld
devices with limited screen space; and 3) “eyes free” contexts such as driving.
In a larger sense, we believe this work points toward a new understanding of the po-
tential of auditory cognition, with much territory still to explore. This territory in-
cludes ideas about the semantic flexibility of acoustic stimuli, the construction of
meaning from the combination and context of several stimuli, and the role of meta-
phoric and/or semantic association in this semiotic and signification process.
Acknowledgment. This work was supported by a grant from the Nina Mason Pulliam
Charitable Trust. The researchers thank the students and the staff of the Indiana School for the
Blind and Visually Impaired.
References
1. Back, M., Des, D.: Micro-Narratives in Sound Design: Context, Character, and Caricature
in Waveform Manipulation. In: ICAD (1996)
2. Baddeley, A.: Short-term memory for word sequences as a function of acoustic, semantic
and formal similarity. Quarterly Journal of Experimental Psychology 18, 362–365 (1966)
3. Blattner, M.M., Sumikawa, D.A., Greenberg, R.M.: Earcons and Icons: Their structure and
common design principles. Human-Computer Interaction 4, 11–44 (1989)
4. Brewster, S.A.: Providing a Structured Method for Integrating Non-Speech Audio into
Human-Computer Interfaces. PhD thesis, University of York (1994)
5. Conversy, S.: Ad-hoc synthesis of auditory icons. In: ICAD (1998)
6. Doucet, M.-E., Guillemot, J.-P., Lassonde, M., Gagne, J.-P., Leclerc, C., Lepore, F.: Blind
subjects process auditory spectral cues more efficiently than sighted individuals. Springer,
Heidelberg (2004)
7. Edwards, A.N.D.: Modelling Blind Users’ Interactions with an Auditory Computer Inter-
face. International Journal of Man-Machine Studies (1989)
8. Frohlich, P., Pucher, M.: Combining Speech and Sound in the User Interface. In: ICAD
2005 (2005)
9. Gaver, W.W.: The SonicFinder: An Interface That Uses Auditory Icons. Human-Computer
Interaction 4(1), 67–94 (1989)
10. Mynatt, E.D.: Designing with auditory icons: how well do we identify auditory cues? In:
Proceedings of the 2nd International Conference on Auditory Display (1994)
11. Sanchez, J., Flores, H.: AudioMath: blind children learning mathematics through audio. In:
Proceedings of Fifth Conf. Disability, Virtual Reality & Assoc. Tech., Oxford, UK (2004)
12. Sanchez, J., Jorquera, L.: Interactive virtual environments for blind children: usability and
cognition. Department of Computer Science, University of Chile (2001)
13. Sanchez, J., Flores, H.: Memory enhancement through Audio. Department of Computer
Science, Chile (2004)
14. Scavone, G.P., Lakatos, S., Harbke, C.: The Sonic Mapper: An Interactive Program For
Obtaining Similarity Ratings With Auditory Stimuli. In: Proceedings of the 2002 Interna-
tional Conference on Auditory Display, Kyoto, Japan (2002)
15. Smither, J.A.: Short term memory demands in processing synthetic speech by old and
young adults. Behaviour & Information Technology 12(6), 330–335 (1993)
16. Stevens, R.D., Brewster, S.A.: Providing an audio glance at algebra for blind readers. In:
Proceedings of ICAD (1994)
17. Turnbull, D., Barrington, L., Torres, D., Lanckriet, G.: Modeling the Semantics of Sound.
Department of Computer Science and Engineering, UCSD (2006)
18. Moreno, R., Mayer, R.E.: Designing for Understanding: A Learner-Centered Approach to
Multimedia Learning. Journal of Educational Psychology 92(1), 117–125 (2000)
19. Hughes, R.W., Jones, D.M.: Indispensable benefits and unavoidable costs of unattended
sound for cognitive functioning. Noise and Health 6(21), 63–76 (2003)
20. Yates, F.A.: The Art of Memory. University of Chicago Press, Chicago (1966)
21. Jaynes, J.: The Origin of Consciousness in the Breakdown of the Bicameral Mind. Hough-
ton Mifflin Company (1976)
22. Bellugi, U.: How Signs Express Complex Meanings. In: Baker, C., Battison, R. (eds.) Sign
Language and the Deaf Community: Essays in Honor of William C. Stokoe, p. 72. Na-
tional Association of the Deaf (1980)
23. Mannheimer, S., Ferati, M., Huckleberry, D., Palakal, M.: Using Audemes as a Learning
Medium for the Visually Impaired. In: HealthINF 2009, Porto, Portugal (accepted, 2009)
... Because of this flexibility, they can represent more complex content compared to auditory icons. The meaning of audemes is typically generated by concatenating sounds, and although meanings are not completely open and arbitrary, they start broad and then narrow to additional sound cues, which merge into a single meaning [38]. ...
... In terms of information retaining, audemes were found helpful in reducing memory erosion and even after five months, the content was better remembered with audemes than without them [37,38]. Audemes were found to have potential in scientific applications including gaming and productivity, navigation of large content within a user interface, as well as in education as a tool for better memory retention [42]. ...
... They heavily rely on meaning derived from semiotics structure and an intuitive link between sounds and natural events. Audemes are defined as short, non--speech sound symbols, under seven seconds, comprised of various combinations of sound effects, which include natural or artificial context, abstract sounds, and music excerpts [38]. ...
Conference Paper
Full-text available
Screen-reader users access images on the Web using alternative text delivered via synthetic speech. However, research shows that this is a tedious and unsatisfying experience for blind users, because text-to-speech applications lack expressiveness. This paper, poses an alternative approach using an experiment that compares audemes, a type of non-speech sounds, with alternative text delivered using synthetic speech. In a pilot study with fourteen sighted users, findings show that audemes perform better across many areas. Specifically, audemes required lower mental and temporal demands and led to less effort and frustration and better task performance. Moreover, participants recognized audemes with higher accuracy and lower errors. Audemes were also perceived as more engaging compared to alternative text delivered using synthetic speech. Additionally, audemes were found to be richer in delivering information. This study suggests that non-speech sounds could substitute or complement alternative text when describing images on the Web.
... Audemes are sequential and/or layered combinations of music, sound effects, abstract computer-generated sounds, and occasional samples of sung lyrics. Our previous work [7][8] demonstrated that audemes work in the 3-10 sec. range, but best in a 4-7 sec. ...
Conference Paper
Full-text available
This paper presents and discusses design decisions for an acoustic edutainment application for blind users called AEDIN (Acoustic EDutainment INterface), comprising audio elements used as navigational and thematic landmarks in touch-screen computers. We tested designs with blind and visually impaired teenagers. Preliminary results demonstrated the efficacy of AEDIN as an easy-to-learn and memorize architecture, and a potentially fun interface. The paper illustrates the lessons learned from the design and evaluation experience and contextually outlines new research directions for aural communication design.
Conference Paper
To easily revisit websites of interest, users typically use the browser's bookmarking feature. Due to technological barriers to accessing the web, however, blind and visually impaired users make little use of this feature. To further investigate this claim, we conducted a survey with 12 blind and visually impaired K-12 students. The results of the survey show that students make no or limited use of the bookmarking service. The main reason is found to be the small number of websites they visit, so they have no need to bookmark them. Another reason is the difficulty of creating and accessing bookmarks. An interesting result from the survey is that blind and visually impaired students like to share the websites they use. To take advantage of this, we propose a solution that combines the bookmarking and sharing features using a "bag" metaphor, which, enriched with non-speech sounds, could encourage blind and visually impaired students to browse a larger number of more diverse websites.
Article
With the rapid advent of touchscreen devices, opportunities are increasing to develop innovative interfaces, including applications that combine touch input with auditory feedback to serve the blind and visually impaired (BVI) community. Targeted to blind high-school children, our innovative design, AEDIN (Acoustic EDutainment INterface), uses non-speech sounds simultaneously as navigational prompts and content icons/signifiers for recorded text-to-speech educational essays, which are the main content of this application. A study of two versions of AEDIN was conducted with 20 participants from a K-12 school for the BVI to evaluate its usability and identify ways to improve it. Through the collection of quantitative and qualitative data, we discovered key design improvements that made AEDIN a highly usable and enjoyable interface for these users. The paper highlights good design practices for acoustic interfaces.
Article
To access interactive systems, blind users can leverage their auditory senses by using non-speech sounds. The structure of existing non-speech sounds, however, is geared toward conveying atomic operations at the user interface (e.g., opening a file) rather than evoking broader, theme-based content typical of educational material (e.g., an historical event). To address this problem, we investigate audemes, a new category of non-speech sounds whose semiotic structure and flexibility open new horizons for the aural interaction with content-rich applications. Three experiments with blind participants examined the attributes of an audeme that most facilitate the accurate recognition of their meaning. A sequential concatenation of different sound types (music, sound effect) yielded the highest meaning recognition, whereas an overlapping arrangement of sounds of the same type (music, music) yielded the lowest meaning recognition. We discuss seven guidelines to design well-formed audemes.
Article
The authors tested the recommendation that adding bells and whistles (in the form of background music and/or sounds) would improve the quality of a multimedia instructional message. In 2 studies, students received an animation and concurrent narration intended to explain the formation of lightning (Experiment 1) or the operation of hydraulic braking systems (Experiment 2). For some students, the authors added background music (Group NM), sounds (Group NS), both (Group NSM), or neither (Group N). On tests of retention and transfer, Group NSM performed worse than Group N; groups receiving music performed worse than groups not receiving music; and groups receiving sounds performed worse (only in Experiment 2) than groups not receiving sounds. Results were consistent with the idea that auditory adjuncts can overload the learner's auditory working memory, as predicted by a cognitive theory of multimedia learning.
Article
The Sonic Mapper is an interactive Linux-based graphical program that affords increased methodological flexibility and sophistication to researchers who collect proximity data for auditory research. The Sonic Mapper consists of a mapping environment in which participants can position and group icons in the two-dimensional plane of the screen. Options for collecting data concerning hierarchical groupings, category prototypicality, and verbal labeling provide additional opportunities to test hypotheses in a convergent manner. The Sonic Mapper also offers an environment for traditional pairwise comparisons, as well as one for performing free sorting tasks. A pilot study that attempts to use many of the Sonic Mapper's key features is described briefly below.
Article
How can we help students understand scientific explanations of cause-and-effect systems, such as how lightning storms develop, or how the respiratory system works? One promising approach involves multimedia presentation of explanations in visual and verbal formats, such as presenting a computer-generated animation synchronized with narration or on-screen text. In this paper, a cognitive theory of multimedia learning is presented from which two principles of instructional design are derived and tested.
Article
In this paper we examine earcons, which are audio messages used in the user-computer interface to provide information and feedback to the user about computer entities. (Earcons include messages and functions, as well as states and labels.) We identify some design principles that are common to both visual symbols and auditory messages, and discuss the use of representational and abstract icons and earcons. We give some examples of audio patterns that may be used to design modules for earcons which then may be assembled into larger groupings called families. The modules are single pitches or rhythmicized sequences of pitches called motives. The families are constructed about related motives that serve to identify a family of related messages. Issues concerned with learning and remembering earcons are discussed.
Article
This experiment investigated the demands synthetic speech places on short term memory by comparing performance of old and young adults on an ordinary short term memory task. Items presented were generated by a human speaker or by a computer based text-to-speech synthesizer. Results were consistent with the idea that the comprehension of synthetic speech imposes increased resource demands on the short term memory system. Older subjects performed significantly more poorly than younger subjects, and both groups performed more poorly with synthetic than with human speech. Findings suggest that short term memory demands imposed by the processing of synthetic speech should be investigated further, particularly regarding the implementation of voice response systems in devices for the elderly.