Int. J. Arts and Technology, Vol. 6, No. 1, 2013
Copyright © 2013 Inderscience Enterprises Ltd.
Viking Ghost Hunt: creating engaging sound design
for location-aware applications
Natasa Paterson*
Department of Electronic and Electrical Engineering,
Trinity College Dublin,
College Green, Dublin 2, Ireland
E-mail: patersn@tcd.ie
*Corresponding author
Gavin Kearney
Department of Theatre, Film and Television,
The University of York,
East Campus, Baird Lane,
York YO10 5GB, UK
E-mail: gavin.kearney@york.ac.uk
Katsiaryna Naliuka, Tara Carrigy and
Mads Haahr
School of Computer Science and Statistics,
Trinity College Dublin,
College Green, Dublin 2, Ireland
E-mail: katsiaryna.naliuka@ndrc.ie
E-mail: carrigyt@tcd.ie
E-mail: haahrm@cs.tcd.ie
Fionnuala Conway
Department of Electronic and Electrical Engineering,
Trinity College Dublin,
College Green, Dublin 2, Ireland
E-mail: conwayfi@tcd.ie
Abstract: In this paper, we describe the sound design and audio
evaluation of Viking Ghost Hunt, a location-aware game based around
important historical locations in Dublin, Ireland. We describe the novel
approach taken in developing an engaging soundscape for location-aware
games on mobile multimedia devices, an approach that highlights the importance of spatial
audio and reverberation in the subjective sensation of engagement with the
game. In particular, we present how the level difference and temporal
separation between the direct sound of the sound object, the early reflections
and the diffuse reverberant field influence this engagement. The results of
this investigation with regard to reverberation and engagement may
therefore inform future location-aware sound designs.
Keywords: location-aware gaming; sound design; audio; engagement;
immersion; presence; multimedia; reverberation; spatialisation.
Reference to this paper should be made as follows: Paterson, N., Kearney, G.,
Naliuka, K., Carrigy, T., Haahr, M. and Conway, F. (2013) ‘Viking Ghost
Hunt: creating engaging sound design for location-aware applications’, Int. J.
Arts and Technology, Vol. 6, No. 1, pp.61–82.
Biographical notes: Natasa Paterson is a Dublin-based Composer, Sound
Designer and Performer. She obtained two Bachelor of Science Degrees in
Clinical Science and Osteopathic Science from RMIT University, Melbourne
and worked in the field of Osteopathy before completing her MPhil in Music
and Media Technologies in Trinity College, Dublin. She is currently working
on a project designing the sound and composition for an augmented reality
application which is part of a PhD in Trinity College Dublin and in
collaboration with the National Digital Research Centre. Her compositional
work includes pieces for choir, piano, string quartet and the use of
electroacoustic elements with performances at The National Concert Hall,
Samuel Beckett Theatre, Cake Contemporary Centre and Centre for Creative
Practices in Ireland.
Gavin Kearney graduated from Dublin Institute of Technology in 2002 with an
Honours degree in Electronic Engineering and has since obtained both MSc
and PhD degrees in Audio Signal Processing from Trinity College Dublin.
During this time, he also worked (and continues to) in the audio industry as
Sound Engineer and Producer. Currently, he is a Lecturer in Sound Design at
the Department of Theatre, Film and Television at the University of York. His
research interests include audio signal processing, acoustics and
psychoacoustics.
Katsiaryna Naliuka is a Post-doctoral Researcher in the Distributed System
Group at Trinity College Dublin. She received a PhD in Computer Science
from Università degli Studi di Trento in 2009. Her main research interests
include mobile computing, mixed and augmented reality, and location-aware
games.
Tara Carrigy is a Designer, Lecturer and Researcher. She holds a first-class
honours degree in Design from the National College of Art and Design, Dublin.
She worked as a Design Practitioner and Lecturer at the National College of
Art and Design, Dublin and University of Ulster, Belfast before completing an
MSc in Multimedia Systems from Trinity College Dublin. Currently, she is
working as a Researcher in the Distributed System Group at Trinity College
Dublin and is the Lead Content Designer on the Viking Ghost Hunt project at
the National Digital Research Centre. Her research interests include human–
computer interaction, user-centred design, embodied interaction, augmented
reality and location-specific mobile games.
Mads Haahr is a Lecturer in Computer Science at Trinity College Dublin. He
holds BSc and MSc degrees from the University of Copenhagen and a PhD
from Trinity College Dublin. His research interests include mobile and
ubiquitous computing, self-organising systems, interactive and location-aware
narrative, computer game studies and artificial intelligence for games. He edits
a multidisciplinary academic journal called Crossings: Electronic Journal of
Art and Technology and has built and operates the internet’s premier true
random number service – random.org.
Fionnuala Conway is a Musician/Composer and Multimedia Artist. With a
background in music and music technology, she has worked as composer and
performer on a number of theatre productions and produced work in a wide
variety of forms, from traditional materials to interactive digital media,
wearable technology, installations and theatre presentation, including Art of
Decision and Urban Chameleon. Her PhD thesis, ‘Exploring citizenship
through art and technology’, focuses on the creative use of technology to
generate awareness of citizenship (and other social issues), with a particular
focus on interactive immersive physical environments. She has been lecturing
on the MPhil in Music and Media Technologies course since 2002 and was
appointed Course Director in 2006.
This paper is a revised and expanded version of a paper entitled Design,
implementation and evaluation of audio for a location-based augmented reality
game presented at the Third International Conference on Fun and Games 2010
(FNG 2010), Leuven, Belgium, 15–17 September 2010.
1 Introduction
Multimedia experiences attempt to perceptually and psychologically immerse their
participants to convey a meaning, concept or feeling to create a sense of presence.
Presence is defined as a psychological state or subjective perception in which an
individual’s experience is generated by and/or filtered through human technology (ISPR,
2000). This can only occur when the participant is engaged or involved with the virtual
space without being aware of the technology. Hence, presence is a psychological state,
which is not only induced by technological determinants (e.g. sound design) but also by
psychological determinants (e.g. meaningfulness of the situation, perceived realism).
Becoming involved in a virtual space can be achieved by using multimodal interfaces that
support the environment, concept and narrative flow of the information being conveyed.
Engagement and the role that audio plays in facilitating these states are of importance for
location-aware gaming, especially considering the limited display size
of mobile phone screens. With that in mind, the focus in Viking Ghost Hunt was to
develop a sound design and audio interface that incorporates real-world locations and
thereby engages the player in the game.
This paper is organised as follows: we first investigate the state of the art in location-
aware gaming and subsequently introduce the game prototype Viking Ghost Hunt,
together with the concept and implementation of the sound design. The game sound
design is then evaluated and attention is given to specific psychoacoustic parameters
pertaining to reverberation and their effect on participant engagement. Finally, the
significance of the results is discussed and conclusions in relation to engagement,
presence and audio production techniques for location-aware application sound design
are presented.
2 Location-aware applications
Location technology allows for the overlay of a digital mediascape onto the physical
world so that the virtual world responds to real-world contextual cues. This overlay
merges the virtual and physical realities (De Souza e Silva, 2004) augmenting the space
and producing a sense of the place (Nisi et al., 2008). Therefore, the aim for location-
aware applications is not to create total ‘spatial presence’ which is a sense of being
somewhere else rather than the physical location (ISPR, 2000), but to instead encourage
the player to reside both in the physical location and remain engaged with the virtual
space created by the technology, therefore creating a mixed reality experience.
Additionally, in previous research, Reid et al. (2005) inferred that situating
a game world overlay on actual historical sites produces a special form of engagement with, and
empathy for, the narrative – a sense of history coming alive.
To generate an engaging experience, sound designs for console gaming include rich
orchestral backdrops, realistic sound effects and complex three-dimensional (3D) spatial
audio presentations. Applications that use complex visuals and audio for mobile
multimedia are at the research stage of deployment and include War Memorials in Vision
(2010), which is developed as an iPhone application at the Netherlands Institute for
Sound and Vision. However, there are very few current examples of location-aware
applications that overlay a narrative-driven game world onto a specific site with the same
graphical and audio complexities found in console gaming. In location-aware audio
research, investigations have generally focused on non-gaming applications, artistic
installations or navigational tools for the visually impaired. Research into spatialised
sound in a location-aware application was undertaken by Cater et al. (2007) with initial
results indicating that the soundscape of a personal digital assistant application can help
users to navigate in a physical environment by conveying relevant information through
audio. Demor (2004) is a spatialised, location-aware 3D audio first-person shooter
game designed primarily for the blind. This game investigated psychoacoustic properties
for the accurate presentation of a 3D soundscape that is used by players for real physical
space navigation. However, these applications mainly focus on sound as navigation and
not on potential virtual space engagement or experiential properties. Additionally, the
applications were not based on the mobile platform and often involved complex technical
set-ups. One of the few examples of mobile-based location-aware gaming that considers
immersion is The Songs of North (Ekman et al., 2005), although this is not fully explored,
as the emphasis is on the use of audio for navigation and in conveying game information.
The sound design of a location-aware game can convey important information such as
navigation and instructional dialogue enabling the user to look away from the visual
graphical interface and remain engaged within their physical space. Research has already
shown that the use of a realistic sound design can help build excitement and tension in the
game world and an enhanced sense of engagement without the use of visual graphics
(Valente et al., 2008). A multisensory approach, using both visual and audio interfaces, is
increasingly being implemented, as associating a sound
object with a visual one (Reid et al., 2005) improves sound source localisation and
hence engagement with that object. Audio can be a significant element in location-aware
gaming, especially when careful music and sound composition is combined with attention to
psychoacoustic properties. Although there is current research on console
game audio that focuses on engagement (Brown and Cairns, 2004), game audio mobility
and its implications have not been thoroughly dealt with. Therefore, it was the aim of the
Viking Ghost Hunt prototype to take into account the previous sound designs in the
gaming and location-aware genres and to develop a novel approach that included
psychoacoustic principles to perceptually and technically engage participants in the
overall gaming experience.
3 Viking Ghost Hunt
Viking Ghost Hunt is a narrative-led single player location-aware game based in Dublin,
Ireland, which engages the player in local Viking history (from the years 800–1169)
through a ghost-hunting game. The aim of Viking Ghost Hunt is to blend the ghostly
virtual Viking world and historical location to engage the player in a multimodal
experience by correlating and engaging all senses – haptic, visual and aural. The
application is developed on the Google Android platform and is deployed using the HTC
T-Mobile G1 with the gamer wearing stereo headphones. Because the experience is
situated in a real space, additional location-specific information, such as environmental
noise, weather and physical movement, becomes a part of the overall experience.
3.1 Gameplay experience
Viking Ghost Hunt uses the concept of role-play, which has been found to enhance
narrative engagement and add to presence, especially when the player enacts the
character themselves (Freeman, 2003; LeBlanc, 2005). The gamer plays the role of a
paranormal investigator searching for location-bound ghostly manifestations to capture
their image and listen to their stories that contain clues for game progression. The mobile
phone functions as a paranormal investigative interface supporting the role-play aspect.
The ghostly manifestations are of a visual or aural presentation and the different
graphical user interface modes enable the player to locate, photograph and/or hear the
ghosts (Carrigy et al., 2010). The application utilises five modes of interaction that enable
players to locate ghosts, photograph or X-ray paranormal artefacts, capture voices of
ghostly manifestations and finally review the evidence.
3.1.1 Radar mode
Radar mode, shown in Figure 1(a), uses a radar graphical interface that visualises the
direction and distance of the paranormal activity from the player. Ghosts that can be seen
only as images are indicated onscreen by a blue ‘cloud’, and ghosts that appear only as
audio by a green one. When the ghost cloud is in the centre of the radar, the player is in the
vicinity of the paranormal manifestation.
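The paper does not describe how the radar display is computed. As a minimal sketch, assuming the ghost and player positions are available as latitude/longitude pairs and the compass supplies a heading in degrees, the on-screen direction could be derived from the standard initial-bearing formula. The function names and the screen convention (0 degrees at the top of the radar) are illustrative assumptions, not the authors' implementation.

```python
import math


def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Compass bearing (0 = north, 90 = east) from point 1 towards point 2."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def radar_angle_deg(player_lat, player_lon, heading_deg, ghost_lat, ghost_lon):
    """Angle of the ghost on the radar, clockwise from the top of the screen.

    Assumes (hypothetically) that the top of the radar is the direction
    the device is currently facing.
    """
    bearing = initial_bearing_deg(player_lat, player_lon, ghost_lat, ghost_lon)
    return (bearing - heading_deg) % 360.0
```

With this convention, a ghost due north of a player who is facing east would be drawn on the left-hand side of the radar (270 degrees clockwise from the top).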
3.1.2 Map mode
This mode uses the Google map function on the phone to show ghost locations
(Figure 1(b)) with an additional map overlay to contextualise the historical information
onto the modern landscape. This enables players to orientate themselves in the city centre
of Dublin in relation to the physical game location.
3.1.3 Camera mode
This mode makes use of the camera application on the mobile phone and allows players
to capture ghostly images (Figure 1(c)) or artefacts that are overlaid on the live camera
image. By photographing ghosts, players unlock game dialogue, which is accessed by using the
frequency scanner mode to tune into audio signals of the ghostly presence
(static noise).
Figure 1 Viking Ghost Hunt interface showing (a) radar mode, (b) Google map mode and
(c) camera mode (see online version for colours)
3.1.4 Frequency scanner mode
The frequency scanner mimics an electronic voice phenomenon (EVP) interface
(Figure 2(a)), a device traditionally popular with paranormal investigators and designed to
detect static produced by a ghostly presence. EVP describes the decoding of noises
resembling speech from radio static and stray audio transmissions. Using this concept,
gamers use the frequency scanner to tune into the correct ghost frequency (in hertz), record
the radio static and subsequently hear the decoded ghostly message on playback. This
mode is activated only after the player has captured the image of the ghost.
3.1.5 Casebook mode
The casebook mode, shown in Figure 2(b), allows players to review all the evidence they
have collected, such as images and recorded dialogue. Captured game assets can be
reviewed at any point in the game.
All players were given a tutorial on the gameplay mechanics and use of modes for
game interaction before beginning their mission. This was to make players familiar with
the game-specific technology and to facilitate ease of use with the intention that this
would assist them in remaining engaged in the game space.
Figure 2 Viking Ghost Hunt interface showing (a) frequency scanner mode and (b) casebook
mode (see online version for colours)
3.2 Sound design
The goal of the Viking Ghost Hunt prototype is to create ‘augmented reality audio’,
which according to Kyriakakis (1998) is defined as incorporating natural environmental
sounds into the artificial mediascape. Therefore, interactivity between the technology,
virtual game space and physical location is an important aspect of the sound design where
changes in player movements and locations modify the audio behaviour. This is made
possible with the use of pre-defined global positioning system (GPS) locations in the
relevant physical space to trigger related audio files. Hence, the player soundscape
interaction is largely a subconscious one as the technology does not interfere with the
experience, making it pervasive and ubiquitous.
Audio files were placed into categories (Table 1) that described their role as
background sound, sound effects, dialogue or user interface sounds, which influenced and
defined how the files would be creatively and technically implemented. To blend the real
and virtual worlds, sound effects that are of both paranormal and environmental origin
are mixed together. Hence, the player at times cannot distinguish between game sound and
external environmental sound. The continuous background audio predominantly uses
musical techniques such as drones and repeating loops drawn from modal scales to create
an atmospheric mood. The audio samples gathered for the sound design were stereo field
recordings, sourced samples and samples created electronically using a MIDI synthesiser
and various audio sequencers.
Table 1 Sound categories

Sound effects                                 Background                User interface              Dialogue
Game world                  Environmental     Musical      Non-musical
Footsteps                   Animal sounds     Minors       Drones       Old AM radio                Dialogue
Floorboards creaking        Church bells      Chromatics   Textures     White noise                 Coded
Bangs                       Laughing          Pitched                   Audio static interference   Undecoded
High pitch metallic squeal  Children playing  Unpitched                 Metal detector
Breathing                   School bell       Drums                     Dial clicks
Screams                     Paper             Pan-flute                 Geiger meter clicks
Whispers                    Thunder                                     Button sounds
Faraway voices              Wind                                        Sonar bleep
Growls                      Rain                                        Compass sounds
Scratching                  Traffic                                     Transitional sound
Moaning                     Dog barking
Modulated voices            Chimes
Rattling chains             Birds
Battle sounds               Leaves
                            Wind in trees
                            People talking
Source: Cited in Paterson et al. (2010a, p.151).
3.3 Psychoacoustic considerations in the sound design: spatialisation and
reverberation
To create a realistic and engaging sound design, it is necessary to understand the
psychological effects of spatial hearing (psychoacoustics). In the real world, sound is
presented in a 3D manner to enable localisation, distance perception and recognition
(Ashmead et al., 1989). The most accurate method of presenting spatialised audio over headphones
is the use of head-related transfer function (HRTF) binaural audio filters. These filters
take into account the effect of the ear structure, head and torso on the sound input before
it reaches the eardrum for sound localisation (Gardner and Martin, 1995) and hence
incorporate interaural time differences and interaural level differences. In the current
prototype, sound effects, like a scream moving from right to left, did not require
determination of the absolute real-world location as the sound design focus was on the
multimedia experience and not 3D spatial audio accuracy. Therefore, files such as ghostly
voices and sound effects were spatialised by using a binaural simulator that incorporated
HRTF filters and allowed for sounds to be panned (moved) in a 3D audio scene. For
example, in the location described as ‘The Lane of Hell’, the sound of heavy chains
moving past the player is used to signify the paranormal activity of an old leper ghost
anchored to that location. All these files were spatialised in a pre-rendered format, as
mobile phone technology does not as yet possess the required processing power to
implement this in real time.
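The paper does not name the binaural simulator used for pre-rendering, so the following is only a minimal sketch of the underlying operation: a mono sample is convolved with a left-ear and a right-ear head-related impulse response (the time-domain form of an HRTF) to give a stereo file. The toy impulse responses in the usage note are placeholders that merely delay and attenuate one ear. They are assumptions for illustration, not measured HRTF data.

```python
def convolve(signal, kernel):
    """Direct-form FIR convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Return a list of (left, right) frames for headphone playback.

    hrir_left / hrir_right are the impulse responses for one source
    direction; both must have the same length.
    """
    if len(hrir_left) != len(hrir_right):
        raise ValueError("HRIR pair must have equal lengths")
    left = convolve(mono, hrir_left)
    right = convolve(mono, hrir_right)
    return list(zip(left, right))
```

For a source placed to the listener's right, a placeholder pair such as `hrir_right = [1.0, 0.0, 0.0, 0.0]` and `hrir_left = [0.0, 0.0, 0.0, 0.5]` makes the sound arrive later and quieter at the left ear, which is the interaural time and level difference described above in its crudest form. A moving source, such as the scream panned from right to left, would need a different HRIR pair for each position along its path.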
Another psychoacoustic principle that is important for realistic sound presentation
and the creation of presence is reverberation. Reverberation describes the propagation
and interaction of sound with surfaces that create the ambience of a space (Rumsey,
2001). Reverberation comprises the direct sound and reflections from nearby
surfaces of increasing order, which combine to form a diffuse reverberant field. It can
be characterised by the reaction of a space to a sound emanating from a sound object,
known as its impulse response. The relationship and time gap between the direct sound
and first arriving early reflection (initial time delay gap (ITDG)) is of particular
importance in contributing to the sense of a space, sound source size and distance
perception (Begault, 1994). Eventually, the early reflections build up and the reverberant
field becomes diffuse. The point at which this transition occurs is typically believed to be
between 20 and 80 ms after the sound begins for medium to large size spaces.
Artificial reverberation is used in the sound design of Viking Ghost Hunt with audio
files being pre-rendered with different reverberation settings dependent on the narrative
context and the physical location in which they would be triggered. In real-world locations that
contained numerous reflective surfaces such as high walls and buildings (as is
experienced in the ‘Lane of Hell’ scenario in the game), reverberation settings with
higher amplitude (louder) early reflections were used. In free-field locations
(open spaces), reverberation settings with reduced early reflections at lower amplitudes
were implemented, reflecting the associated physical locations.
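To make the three components of the impulse response concrete, the sketch below builds a synthetic one: a direct sound at time zero, a few discrete early reflections starting after the ITDG, and an exponentially decaying noise tail that begins at the transition to the diffuse field. All numeric values (ITDG, reflection spacing, gains, decay time) and the two preset names are illustrative assumptions. The paper does not report the reverberation settings actually used.

```python
import math
import random


def synth_impulse_response(sample_rate=44100, itdg_ms=15.0, early_gain=0.6,
                           n_early=4, early_spacing_ms=7.0,
                           mixing_time_ms=50.0, rt60_s=1.2, tail_gain=0.2,
                           seed=0):
    """Toy impulse response: direct sound, early reflections, diffuse tail."""
    rng = random.Random(seed)
    mix_index = int(round(mixing_time_ms * sample_rate / 1000.0))
    length = mix_index + int(round(rt60_s * sample_rate))
    ir = [0.0] * length
    ir[0] = 1.0  # direct sound
    # Early reflections: first one arrives itdg_ms after the direct sound,
    # each later one is 0.8 times the level of the one before it.
    for k in range(n_early):
        delay_ms = itdg_ms + k * early_spacing_ms
        idx = int(round(delay_ms * sample_rate / 1000.0))
        if 0 < idx < mix_index:
            ir[idx] += early_gain * (0.8 ** k)
    # Diffuse field: noise whose envelope falls by 60 dB over rt60_s.
    for n in range(mix_index, length):
        t = (n - mix_index) / sample_rate
        envelope = math.exp(-math.log(1000.0) * t / rt60_s)
        ir[n] += tail_gain * envelope * rng.uniform(-1.0, 1.0)
    return ir


# Hypothetical presets mirroring the two cases described in the text.
ENCLOSED_LANE = dict(early_gain=0.7)   # high walls: louder early reflections
OPEN_SPACE = dict(early_gain=0.2)      # free field: weaker early reflections
```

Calling `synth_impulse_response(**ENCLOSED_LANE)` and `synth_impulse_response(**OPEN_SPACE)` gives two responses that differ only in the level of the early reflections, which is the parameter the text says was varied between enclosed and open locations. Convolving a dry sample with either response would then pre-render it for that location type.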
3.4 Technical implementation of the sound design
To create a dense and engaging game atmosphere, the sound design involved the
playback of a complex configuration of multiple and varied simultaneous audio files that
were created keeping in mind the psychoacoustic principles and game narrative.
Therefore, in developing the Viking Ghost Hunt prototype, collaboration between
computer programmers and artists was important in designing and implementing a
complex soundscape on the mobile platform technology.
3.4.1 Generative audio
The sound categories outlined in Section 3.2 determined how the audio files would be coded in the
Android operating system. The SoundPool class, which handles short uncompressed audio
files for quick access, plays back small files such as sound effects, while the
MediaPlayer class1, which handles larger compressed audio file streams, is
responsible for the playback of longer files, such as those associated with
dialogue. To have a continuous background sound, generative audio is preferred over
looping files to avoid habituation and boredom. Generative audio refers to a system that
creates audio, constructed in an algorithmic manner, which enables music and sound to
be presented in an ever-different and changing format (Beilharz, 2004). In the prototype,
small audio files of the same sound type but of different pitch are coded to play randomly
at varying times. To achieve this, ‘audio wavelets’ (short audio components of small file
size and 2–5 s duration) overlap and are randomised in both file selection and
timing to create a continuous, ever-changing background. This aspect of the sound design
supports the non-linearity of gameplay and related audio content.
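The randomised, overlapping playback of audio wavelets can be sketched independently of the Android playback classes. The sketch below only produces a schedule of (start time, file) pairs. The file names, the overlap range and the rule against playing the same wavelet twice in a row are assumptions made for illustration and are not taken from the prototype.

```python
import random


def schedule_wavelets(durations, total_s, rng,
                      min_overlap_s=0.3, max_overlap_s=1.0):
    """Return [(start_time_s, name), ...] covering total_s without gaps.

    durations maps a wavelet name to its length in seconds (2-5 s in the
    paper). Each wavelet starts before the previous one has finished, so
    the background never falls silent.
    """
    names = list(durations)
    events = []
    t = 0.0
    previous = None
    while t < total_s:
        # Random file selection, avoiding an immediate repeat if possible.
        candidates = [n for n in names if n != previous] or names
        name = rng.choice(candidates)
        events.append((t, name))
        # Random timing: the next wavelet starts slightly before this one ends.
        overlap = rng.uniform(min_overlap_s, max_overlap_s)
        t += durations[name] - overlap
        previous = name
    return events
```

A usage example with hypothetical file names would be `schedule_wavelets({"drone_low": 4.0, "drone_mid": 3.0, "drone_high": 2.0}, 60.0, random.Random())`. Because both the file choice and the overlap are drawn afresh on each step, two playthroughs of the same location produce different backgrounds from the same small set of files.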
3.4.2 Interactive audio
The use of ‘audio wavelets’ is also important in allowing interactivity between player
movement and game choices with instant but varied playback of files that are related to
the different physical locations. Mobile phone technology such as the internal compass,
accelerometer and GPS receiver is used here: the internal compass gives the direction the
device is facing, the accelerometer enables the device to be used in a vertical or
horizontal position and the GPS receiver ascertains player position in the real world. GPS
coordinates relating to Viking historical locations were chosen and, as seen in Figure 3,
concentric rings of audio with varying radial distances were defined around each location
to delineate the audio that is triggered as the player approaches the location.
For example, when the player enters the outer concentric ring in the vicinity of the Viking
ghost ‘Olaf’, a background of continuous wind sound is triggered using GPS information
updated according to player movement and proximity to the historical location. As the
player moves closer to the location and into the sound effects area, low frequency,
percussive, reverberant echoes are added together with a ghostlike voice calling ‘help
me’. The background and sound effects continue until the player reaches the central
dialogue zone where the user interface radar mode indicates that a visual manifestation of
the ghost is imminent. The player subsequently switches to camera mode in the vertical
position and captures an image of the ghost. Taking a picture of the ghost enables the
player to tune into the ghost dialogue frequency using the frequency scanner mode and
access information related to the game. All game content, including audio, is stored on the
internal memory card of the mobile phone and triggered when a player is within a pre-
defined GPS location. Depending on the position of the player in relation to the GPS
location, different sounds enter and remain for different lengths of time, providing an
interesting and varied audio experience. The ambient background audio is always
triggered automatically by player proximity to GPS defined game locations. It is
therefore important that smooth audio transitions occur between locations as players
move between pre-defined locations of interest. As the player progresses from their
current physical location to the next (as indicated in map mode), the background sound
volume automatically fades to silence as they leave one zone with another background
sound fading in as they enter the new zone.
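One simple way to obtain such transitions is to tie the background volume to the player's distance from the edge of the zone. The sketch below is an assumption about how this could be done, not a description of the prototype: the gain is 1 well inside the zone and falls linearly to 0 across a fade band at the zone boundary. The 10 m band width is a hypothetical value.

```python
def background_gain(distance_m, zone_radius_m, fade_width_m=10.0):
    """Linear gain in [0, 1] for a zone's background sound.

    distance_m is the player's distance from the zone centre. The gain is
    0 at or beyond the zone radius and 1 once the player is at least
    fade_width_m inside it.
    """
    if fade_width_m <= 0:
        raise ValueError("fade_width_m must be positive")
    gain = (zone_radius_m - distance_m) / fade_width_m
    return max(0.0, min(1.0, gain))
```

On Android, a value like this could be applied with `MediaPlayer.setVolume(leftVolume, rightVolume)`, re-evaluated on each GPS update. Each zone computes its own gain, so the outgoing background fades to silence while the incoming one rises as the player crosses from one zone to the next.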
Figure 3 This image presents the implementation of concentric rings of audio around a fixed
GPS location (see online version for colours)
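The concentric-ring scheme reduces to a distance test against a fixed coordinate. The sketch below uses the haversine great-circle distance and returns the innermost ring that contains the player. The ring radii are hypothetical values chosen for illustration, since the paper does not state the distances used.

```python
import math

EARTH_RADIUS_M = 6371000.0

# Hypothetical ring radii in metres, innermost first.
RINGS = [("dialogue", 10.0), ("sound_effects", 25.0), ("background", 50.0)]


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlon / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def zone_for_position(player, ghost, rings=RINGS):
    """Return the innermost ring containing the player, or None if outside.

    player and ghost are (latitude, longitude) tuples in degrees.
    """
    distance = haversine_m(player[0], player[1], ghost[0], ghost[1])
    for name, radius_m in rings:
        if distance <= radius_m:
            return name
    return None
```

In practice the usable ring sizes are bounded below by GPS accuracy, which is typically several metres on consumer handsets, so very narrow rings would trigger unreliably.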
4 Evaluation of Viking Ghost Hunt sound design
Evaluation of the Viking Ghost Hunt sound design was conducted in three phases, with the
focus of each phase determined by the results of the preceding
phase. The first phase evaluates the players’ general experience of the audio with regard
to feelings of immersion and presence in the physical location of the overlaid game
space. The second phase investigates the effect of reverberation and spatial audio on
immersion and emotional engagement. The third phase focuses primarily on the
parameters of reverberation, specifically the relationship between the direct sound, early
reflections and the diffuse field on participant engagement.
4.1 Phase 1: evaluation of Viking Ghost Hunt audio experience
The Phase 1 evaluation was a user study examining the game interface and players’
engagement with the game space (Carrigy et al., 2010), and included a short section
dedicated to investigating the role of audio. Objective measures such as physiological changes
are difficult to use for assessing emotional response to, and immersion in, sound and music
(Meyer, 1956), as they do not take into account the cognitive aspects. Therefore, for a
phenomenological experience (experienced from the first-person point of view),
subjective trials were used for the assessment of engagement and immersion conducted at
the physical location of the gameplay. The trials took place over a three-day period with
19 subjects of ages varying between 18 and 48. There was a mixture of male and female
participants all with varying degrees of gaming and technology experience. Testers were
provided with a G1 HTC mobile phone and stereo headphones and were presented with a
brief tutorial on the game modes of interaction. They then went on to play one mission of
Viking Ghost Hunt. On completing the mission, participants were asked to fill out a
questionnaire consisting of 28 qualitative open-ended questions that dealt with issues
such as engagement with the narrative and emotional experience of the game.2
Additionally, players were presented with a set of nine audio-specific statements, each rated on
a five-point Likert scale (strongly disagree to strongly agree). Each audio statement
related to game immersion, emotion, presence or interactivity, with some statements being
based on Witmer and Singer’s (1998) presence questionnaire. The statements included:
“The sound made the game feel scary” (emotional engagement).
“I feel that the sound was reactive to my movements” (interaction).
“When playing the game, time seemed to pass quickly” (immersion 1).
“The user interface sounds and soundscape contributed to the idea that I was a
paranormal investigator” (immersion 2).
“The audio was seamless and felt a real part of the game” (immersion 3).
“The game audio sounded natural within the game environment” (presence).
The questionnaire was specifically designed not to lead participants to comment on the
audio unless it had a significant impact on the game experience.
4.1.1 Results and discussion: Viking Ghost Hunt audio experience
Overall, participants responded positively to the quantitative audio statements with the
majority (70%) of participants agreeing that the audio contributed to engagement,
immersion, presence and interactivity with the game space. Additionally, the open-ended
responses supported the quantitative evidence with statements, such as:
“I felt immersed in the game.”
“At the moment it seems like the atmosphere is the unifying element (also one
of the strongest aspects of the game).”
“Addition of audio greatly increased the atmosphere and engagement.”
“It was a very immersive experience. A couple of times I found myself not
realising that I am on a street and there are people around me.”
In terms of the role-play aspect of the game (immersion 2 statement), 84% of participants
agreed or strongly agreed that the user interface sounds and soundscape supported the
paranormal investigator game concept.
When asked whether the game audio contributed to the game feeling scary (emotional
engagement statement), 63% of participants responded in agreement or strong agreement
that the ‘sound effects created a scary atmosphere’. The positive outcome, seen in
Figure 4, confirmed the importance of audio for engagement, immersion and presence in
location-aware gaming. Another important factor of the design was to have an element of
interactivity between the player, physical locations and the device user interface.
Participants were asked whether they found the sound reactive to
the physical environment (interaction statement) and, in response, 68% felt that the game sound was reactive.
This result supports the blending of the virtual and physical worlds:
“The backing audio changing as I moved location was a good touch.”
“Loved the sound – that was new for mobile gaming – location sensitive.”
Additionally, 79% of testers felt that the audio supported the game environment in the
physical locations (presence statement), hence creating a sense of presence.
In creating a complex sound design with many layers and types of sounds playing
simultaneously, care must be taken in the presentation of the audio files. This was evident
in some of the feedback as sound effects were at times found to overpower the dialogue,
an essential element in gameplay. A balance is required between important audio game
information and ambient background and sound effects:
“Felt a little scary, however the first time the ghost spoke I had difficulty
understanding what he said over the sound effects. Though the sound was quite
good.”
External noise and busy environments bring another challenge for location-aware audio.
In this game, most testers did not feel distracted or interrupted by external noise, but this
is likely because a quiet location was chosen as the testing ground. Therefore,
game locations must be carefully sourced:
“Stopping in the middle of a path was distracting as I felt I was in the way [of
people]….isolated paths and lanes were more atmospheric and I felt more
immersed.”
Location-aware gaming on the mobile platform has the potential to be immersive and
engaging even when faced with the challenges of slower processing speeds and limited
memory space. Previous research has shown that spatialisation of sound and the use of
reverberation can be instrumental in immersing players and creating a sense of presence
in a virtual world (Hollier et al., 1997). Hence, due to the use of these psychoacoustic
properties in Viking Ghost Hunt, it was decided to extend the research by investigating
the role of reverberation and spatial audio on immersion and engagement.
Figure 4 Box plot representation of the distribution of the results of the 19 listeners for each
question. They depict the median, the lower and upper quartiles and the maximum and
minimum value (the whiskers) and the outliers (*) (see online version for colours)
4.2 Phase 2: evaluation of reverberation and spatial audio
Further listening tests on a small subset of six players were undertaken investigating the
effect of reverberation and spatial audio on immersion and engagement. Each listener
auditioned 12 sound samples from the Viking Ghost Hunt sound design, 3 with and 3
without reverberation applied, 3 with the audio spatialised dynamically (moving from one
side to another) and 3 not spatialised at all (i.e. centrally positioned). The tests were
conducted in an indoor space (room with low lighting) with the participant wearing stereo
headphones. The listener was asked to compare the samples by a three-way forced choice
preference, indicating which of the samples (presented in pairs of reverberant/non-
reverberant and spatialised/non-spatialised) they thought was more emotionally engaging
and which was more immersive (Paterson et al., 2010b). They could indicate a preference
for sample A or B or no preference at all. Sound samples used were from the game and
included screams, drones, Viking battle sounds and outdoor environmental noise. All
samples, both reverberated and spatialised, maintained the same settings as were used in
the game.
4.2.1 Results and discussion: reverberation and spatial audio
The results revealed that the majority of subjects (67%) found the reverberated samples
more engaging, with 77% (Figure 5(a)) also perceiving these samples to be more
immersive when compared to the unreverberated audio samples. There was no statistically
significant difference in perceived immersion or engagement between spatialised and non-
spatialised sound (Figure 5(b)). This suggests that reverberation was of more
importance than spatialised sound in creating an immersive and engaging soundscape.
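Figure 5 reports the standard error of the proportion alongside each preference percentage. As a minimal sketch of that calculation, the Python snippet below computes the standard error for the reported engagement (67%) and immersion (77%) proportions. The number of judgements per condition is not stated in this section; the value of 18 (six listeners times three reverberant/non-reverberant pairs) is an assumption made purely for illustration, as is the function name.

```python
import math


def proportion_standard_error(p: float, n: int) -> float:
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1.0 - p) / n)


# ASSUMPTION (not stated in the paper): 6 listeners x 3 sample pairs
# gives 18 forced-choice judgements per condition.
N_JUDGEMENTS = 6 * 3

# Proportions preferring the reverberated samples, as reported in the text.
for label, p in (("engagement", 0.67), ("immersion", 0.77)):
    se = proportion_standard_error(p, N_JUDGEMENTS)
    print(f"{label}: p = {p:.2f}, standard error = {se:.3f}")
```

With so few judgements the standard error is large relative to the proportions, which is one reason to read the Phase 2 percentages as indicative rather than conclusive.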
Figure 5 Bar-chart representation of the results of participant engagement and immersion tests
for (a) reverberation and (b) spatial audio with the standard error of the proportion
(see online version for colours)
With this result in mind, a more detailed investigation into reverberation parameters and
settings seemed appropriate to ascertain how these settings could be used to full effect in
location-aware applications. It was decided to further this research, in a Phase 3
evaluation, by correlating the objective physical attributes of reverberation (direct sound,
early reflections and the diffuse field) with subjective reports.
4.3 Phase 3: the role of direct sound and early reflections in reverberant sound
for engagement
The concept of engagement, as defined by Asbjørn Krokstad, is the focused attention on and
involvement with a sound object (Geil and Weinberg, 2010). The underlying belief is that sounds
perceived as close to the listener and more accurately localised are potentially more
engaging and emotionally involving than those that sound distant and diffuse. David
Griesinger, a physicist who works in the field of sound and music (Geil and Weinberg,
2010), adds that it may be the relationship between the direct sound and its associated
early reflections and reverberant field that dictates the level of engagement a listener
has with a given sound object, although the extent of this effect is yet to be shown.
While many studies have focused on clarity, immersion and envelopment of reverberant
fields, there has been little work on the effect of reverberation, and in particular direct
sound and early reflections, on engagement, an important aspect of game design.
Measures useful in investigating the effect of the direct-to-reverberant energy ratio, with
and without early reflections, include the direct-early reflection ratio (DERR), the direct-
diffuse field ratio (DDFR) and the ITDG. The DERR gives a measure of the strength of
the direct sound within the first 10 ms in comparison to the early reflections within a time
period of 10–100 ms. The DDFR is a measure of the strength of the direct sound relative to the
diffuse field (no early reflections). The third measure, the ITDG, describes the time delay
between the direct sound and the first reflection, typically occurring within 30 ms. By
varying these three parameters, we can easily derive a set of test samples with varying
levels of early and late reverberant fields, and varying time values of ITDGs. The test
hypothesis is that changes in these reverberation parameters will lead to changes in
engagement. The listening test was again designed around a headphone presentation. The
direct sound was formed through convolution with KEMAR (Knowles Electronics
Manikin for Acoustic Research) HRTFs taken from the CIPIC (Center for Image Processing and Integrated
Computing, University of California, Davis) database (Algazi et al., 2001). Three types
of audio samples were used for the tests: music, speech and noise bursts, broadly
representing the sound categories used in the Viking Ghost Hunt sound design. The music
samples were anechoic recordings extracted from the Denon Anechoic Orchestral
database (Denon, 1995). Speech recordings were phrases re-recorded from the TIMIT
(Texas Instruments (TI) and Massachusetts Institute of Technology (MIT)) speech corpus
database (Fisher et al., 1986). The noise audio sample was 100 ms pink noise bursts with
100 ms silence intervals over five repetitions. The listening test consisted of multiple
audio samples, presented in random order with the inclusion of a reference sample for
comparison. During the tests, listeners were asked to rank the ‘engagement’ of each
sample with respect to the reference, which was the median sample in each test range by
utilising a comparison category rating. This reference was also hidden as one of the
samples to be rated in each test round. ‘Engagement’ was clearly defined to the
subject as:
“A focused attention onto the sound source, as if the performance is just for
you; it sounds closer and more accurately localised and more emotionally
involving.”
The level of engagement was recorded using an 11-point hedonic scale ranging from −5
to +5, to 1 decimal place. Subjects were asked to supply a negative rating when they felt
the source was less engaging than the reference and a positive rating if it was more
engaging. The test was conducted in a controlled studio environment over AKG-K271
headphones and took ~15 min per subject.
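The three measures described above can be illustrated with a short Python sketch that splits an impulse response into the time windows given in the text (direct sound within the first 10 ms, early reflections from 10 to 100 ms) and expresses the energy ratios in decibels. Several details are assumptions rather than the authors' procedure: the diffuse field is taken to be everything after 100 ms, the first reflection is detected with a simple amplitude threshold, and the impulse response is a hypothetical hand-built example sampled at 1 kHz so that one sample equals 1 ms.

```python
import math


def energy(samples):
    """Sum of squared sample values."""
    return sum(s * s for s in samples)


def ratio_db(numerator, denominator):
    """Energy ratio expressed in decibels."""
    if numerator <= 0.0 or denominator <= 0.0:
        raise ValueError("both energies must be positive")
    return 10.0 * math.log10(numerator / denominator)


def reverb_measures(ir, sample_rate, direct_ms=10.0, early_ms=100.0,
                    threshold=0.05):
    """Return DERR and DDFR (dB) and the ITDG (ms) of an impulse response.

    ASSUMPTIONS: the diffuse field is everything after `early_ms`, and the
    first reflection is the first sample after the direct peak whose
    magnitude is at least `threshold` times that peak.
    """
    d_end = int(round(direct_ms * sample_rate / 1000.0))
    e_end = int(round(early_ms * sample_rate / 1000.0))
    direct, early, diffuse = ir[:d_end], ir[d_end:e_end], ir[e_end:]

    # Locate the direct sound as the largest-magnitude sample.
    peak_index = max(range(len(ir)), key=lambda i: abs(ir[i]))
    peak = abs(ir[peak_index])

    itdg_ms = None  # stays None if no reflection exceeds the threshold
    for i in range(peak_index + 1, len(ir)):
        if abs(ir[i]) >= threshold * peak:
            itdg_ms = (i - peak_index) * 1000.0 / sample_rate
            break

    return {
        "derr_db": ratio_db(energy(direct), energy(early)),
        "ddfr_db": ratio_db(energy(direct), energy(diffuse)),
        "itdg_ms": itdg_ms,
    }


# Hypothetical 1 s impulse response at 1 kHz (index == time in ms).
SAMPLE_RATE = 1000
ir = [0.0] * 1000
ir[0] = 1.0                  # direct sound
ir[20] = ir[50] = 0.5        # early reflections (first one at 20 ms)
ir[200] = ir[400] = 0.25     # sparse stand-in for the diffuse tail

measures = reverb_measures(ir, SAMPLE_RATE)
print(measures)
```

Scaling the early or late taps and shifting the first reflection changes the DERR, the DDFR and the ITDG independently, which mirrors how a set of test samples with varying parameters could be derived.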
4.3.1 Results and discussion: direct sound, early reflections, diffuse sound and
engagement
Data was collected from 15 listeners. Each listener was under 35 years of age and of
excellent hearing. The first set of results is shown in Figure 6(a).
The reference ITDG sample in this case has a value of 20 ms and the overall
reverberation time is 1 sec. Here, we see that changes in the ITDG did not have a large
impact on the overall perception of engagement. However, it is noteworthy in the cases of
both music and speech that there is a statistically significant improvement when the
ITDG changes from 0 to 10 ms. At 30 ms, however, the sense of engagement with speech
drops to the previous level at 0 ms. The DERR has a more significant bearing on the
perception of engagement as shown in Figure 6(b). It appears in this case that this is
highly dependent on the audio source. Pink noise, which represents here an unfamiliar
sound source to the listener, shows the least change in engagement (which is also the case
when manipulating the ITDG). Additionally, the engagement with music diminishes
significantly when the level of the direct sound is dominant. Conversely, speech becomes
far more engaging when the level of the direct sound is dominant. From this result, it
might be deduced that an increase in the level of the reverberant energy is important for
the sensation of engagement with music. While this seems to be true in the case of early
reflections, it does not seem to be the case with the diffuse decay (seen in Figure 6(c)). In
this test, an overall trend is exhibited over all sources where engagement increases with a
decreasing level of the diffuse field. The reference audio sample in this case is where
DERR equals 0 dB. It is noteworthy that the engagement level for music falls at a DERR
of 20 dB.
These results demonstrate that reverberation has a significant impact on the
perception of engagement, depending not only on the reverberation parameters
but also on the sound samples utilised. According to Griesinger (Geil and
Weinberg, 2010), the inability to separate the direct sound from the diffuse field results
in diminished engagement with a sound source. The reasoning is that the listener is unable
to distinguish the direct sound (foreground) from the diffuse sound (background). This
was evident from the results obtained, which showed an increase in engagement when the
delay between the direct and diffuse sound was increased from 0 to 10 ms. However, a saturation point for
engagement was reached when this delay was 30 ms, where the gain in engagement
was lost. Therefore, the initial time delay between the direct and the diffuse
sound will change how the sound is perceived. These findings support Griesinger’s
engagement theory, the findings of Gestalt psychologists, and researchers Meyer (1956)
and Bregman (1990) on their studies regarding multiple sensory stimuli, sound stream
organisation and resultant perception. However, this theory seems to be dependent on the
sound sample being utilised, as it was found that speech was more engaging when the
direct sound was dominant but not so for music, where listeners preferred a mixture of
direct sound and early reflections for engagement. Additionally, noise sound samples did
not significantly change listener engagement levels even when DERR and ITDG settings
were varied. This may indicate that unfamiliarity with sound sources affects a listener’s
ability to engage. Furthermore, a decrease in the level of the diffuse decay generally
increased listener engagement with music and speech, which indicates that late reflections
detract from listener engagement. This result supports Griesinger’s theory that diffuse
fields which interfere with or overpower the direct sound source level will be less engaging.
Figure 6 Listening test results for (a) DDFR and ITDG of 20 ms; (b) DERR and ITDG of 0 ms
and (c) DDFR and ITDG of 30 ms (see online version for colours)
5 Discussion
The three phases of the Viking Ghost Hunt investigation highlight the importance of a
sound design for engagement, immersion and presence. The results of this research show
the significance of designing an audio framework that incorporates gaming concepts and
implements psychoacoustic principles for location-aware gaming. The addition of
generative audio and interactivity contributes to audio non-linearity, which is particularly
useful for quick soundscape transitions that respond to player movements. Due to the
limited memory available on the mobile platform, smaller audio files are also ideal for
interactivity and file reusability, and allow for a more complex and diverse soundscape design
than the popular use of longer linear files. However, slower processing speeds
make any real-time rendering of files difficult, especially for spatial audio and
reverberation. With regard to reverberation and engagement levels, the results suggest
that by maintaining an ITDG between 0 and 10 ms and reducing the diffuse field levels of
the reverberation, sound samples of speech and music may be perceived as more
engaging. However, the results indicate that early reflections have more of an effect on
engagement than ITDG. Therefore, depending on whether the sound source is music or
speech, the level of the direct sound and early reflections must be adjusted accordingly.
Even though these results do not directly reflect on the Viking Ghost Hunt sound design
as it currently stands, the information could potentially be an important tool in its further
development and the development of engaging sound designs on the mobile platform.
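The guidance in this discussion could be captured as a small lookup table for use when preparing audio files. The Python sketch below is only one hypothetical way of encoding it: the class name, field names and the qualitative labels are assumptions introduced for illustration, and only the ITDG range and the direction of each adjustment are taken from the text. Noise is omitted because its engagement ratings changed little with the parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReverbGuidance:
    """Qualitative reverberation guidance for one source type (hypothetical)."""
    itdg_ms_range: tuple   # suggested ITDG range in milliseconds
    direct_vs_early: str   # balance of direct sound and early reflections
    diffuse_level: str     # treatment of the diffuse (late) field


# Directions summarised from the discussion; the labels are illustrative only.
GUIDANCE = {
    "speech": ReverbGuidance((0.0, 10.0), "direct sound dominant", "reduced"),
    "music": ReverbGuidance(
        (0.0, 10.0), "direct sound mixed with early reflections", "reduced"
    ),
}


def guidance_for(source_type: str) -> ReverbGuidance:
    """Look up guidance for 'speech' or 'music' (case-insensitive)."""
    try:
        return GUIDANCE[source_type.lower()]
    except KeyError:
        raise ValueError(f"no guidance for source type: {source_type!r}") from None


print(guidance_for("speech"))
print(guidance_for("music"))
```

Keeping the guidance qualitative avoids implying specific decibel targets, which the results do not establish.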
6 Conclusion
Location-aware gaming is an exciting and evolving genre at the forefront of
entertainment technologies and technological development. As seen by the results
obtained in the three phases of the evaluation, audio has the potential for immersing and
engaging participants in a virtual space overlaid onto physical locations. The research
presented in this paper focuses specifically on the creative and technical implementation
of a sound design and how it is produced in the game. It shows the importance of a
dominant direct sound and a reduced diffuse field in reverberation for listener
engagement with the sound source. Additionally, the use of early reflections in relation to
the direct sound is shown to be dependent on whether the source is speech or music.
Therefore, careful composition of audio files and implementation of reverberation
parameters can greatly affect game space involvement and the overall game experience.
As the research has indicated, psychoacoustic properties, and especially that of
reverberation, are of importance in sound designs. Subsequent future research will
include the development of another prototype on the mobile platform that will
incorporate the suggested reverberation parameters for engagement.
Acknowledgements
We would like to thank the rest of the Viking Ghost Hunt research team, Trinity College
Dublin and the National Digital Research Centre for supporting and funding this project.
References
Algazi, V.R., Duda, R.O., Thompson, D.M. and Avendano, C. (2001) ‘The CIPIC HRTF database’,
in IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz,
NY, USA, pp.99–102.
Ashmead, D.H., Hill, E.W. and Talor, C.R. (1989) ‘Obstacle perception by congenitally blind
children’, Perception and Psychophysics, Vol. 46, No. 5, pp.425–433.
Begault, D. (1994) 3D Sound for Virtual Reality and Multimedia. Cambridge, MA: Academic Press
Inc.
Beilharz, K. (2004) ‘Interactively determined generative sound design for sensate environments:
extending cyborg control’, in Y. Pisan (Ed.), Interactive Entertainment ‘04, Creativity and
Cognition Studios. Sydney: University of Technology, pp.11–18.
Bregman, A.S. (1990) Auditory Scene Analysis: The Perceptual Organization of Sound.
Cambridge, MA: MIT Press.
Brown, E. and Cairns, P. (2004) ‘A grounded investigation of game immersion’, in CHI 2004,
University College London Interaction Centre, Vienna, Austria, 24–29 April.
Carrigy, T., Naliuka, K., Paterson, N. and Haahr, M. (2010) ‘Design and evaluation of player
experience of a location-based mobile game’, in 6th Nordic Conference on Human-Computer
Interaction (NordiCHI 2010), Reykjavik, Iceland.
Cater, K., Hull, R., Melamed, E. and Hutchings, R. (2007) ‘An investigation into the use of
spatialised sound in locative games’, CHI 2007, San Jose, USA, April 28–May 3.
Denon (1995) ‘Anechoic orchestral music recording’, Audio CD, Denon Records, ASIN:
B0000034M9, Japan.
De Souza e Silva, A. (2004) ‘Location based games: blurring the borders between physical and
virtual spaces’, in Proceedings of Inter-Society for the Electronic Arts (ISEA), Helsinki,
Finland, August 2004.
Ekman, I., Ermi, L., Jussi, L., Nummela, J., Lankoski, P. and Mäyrä, F. (2005) ‘Designing sound
for a pervasive mobile game’, in Proceedings of the ACM SIGCHI International Conference
on Advances in Computer Entertainment Technology, New York, USA.
Fisher, W.M., Doddington, G.R. and Goudie-Marshall, K.M. (1986) ‘The DARPA speech
recognition research database: specifications and status’, in Proceedings of DARPA Workshop
on Speech Recognition, pp.93–99, SAIC-86/1546.
Freeman, D. (2003) Emotioneering: Creating Emotions in Games. Indianapolis, IN: New Riders
Games.
Gardner, W.G. and Martin, K.D. (1995) ‘HRTF measurements of a KEMAR’, Journal of the
Acoustical Society of America, Vol. 97, No. 6, pp.3907–3908.
Geil, F. and Weinberg, D.J. (2010) ‘Concert audience engagement – the perception of closeness
engage us with sound’, Boston Audio Society, Vol. 32, No. 2, p.42.
Hollier, M.P., Rimell, A.N. and Burraston, D. (1997) ‘Spatial audio technology for telepresence’,
BT Technology Journal, Vol. 15, No. 4, pp.33–41.
International Society for Presence Research (ISPR) (2000) The Concept of Presence: Explication
Statement. Available at: http://ispr.info, Accessed on 12 February 2011.
Kyriakakis, C. (1998) ‘Fundamental and technological limitations of immersive audio systems’, in
Proceedings of IEEE, Vol. 86, pp.941–951.
LeBlanc, M. (2005) ‘Tools for creating dramatic game dynamics’, in K. Salen and E. Zimmerman
(Eds.), The Game Design Reader: A Rules of Play Anthology. Cambridge, MA: MIT Press.
Meyer, L. (1956) Emotion and Meaning in Music. Chicago, USA: University of Chicago Press.
Nisi, V., Oakley, I. and Haahr, M. (2008) ‘Location-aware multimedia stories: turning spaces into
places’, in Proceedings of ArTech 2008, Porto, Portugal, November, 2008.
Paterson, N., Naliuka, K., Jensen, S.K., Carrigy, T., Haahr, M. and Conway, F. (2010a) ‘Design,
implementation and evaluation of audio for a location based augmented reality game’, in
Proceedings of ACM Fun and Games 2010, Leuven, Belgium, 15–17 September 2010.
Paterson, N., Naliuka, K., Jensen, S.K., Carrigy, T., Haahr, M. and Conway, F. (2010b) ‘Spatial
audio and reverberation in an augmented reality game sound design’, 40th AES Conference:
Spatial Audio, Tokyo, Japan, 8–10 October 2010, Audio Engineering Society.
Reid, J., Geelhoed, K., Hull, R., Cater, K. and Clayton, B. (2005) ‘Parallel worlds: immersion in
location-based experiences’, in CHI ‘05 Extended Abstracts on Human Factors in Computing
Systems.
Rumsey, F. (2001) Spatial Audio. Oxford, UK: Focal Press.
Valente, L., Sieckenius de Souza, C. and Feijo, B. (2008) ‘An exploratory study on non-visual
mobile phone interfaces for games’, in Proceedings of the VIII Brazilian Symposium on
Human Factors in Computing Systems, IHC 08, Porto Alegre, Brazil, 21–24 October 2008.
War Memorials in Vision (2010) Available at: http://imagesforthefuture.com/en/activities/iphone-
app-war-memorials-vision, Accessed on 15 September 2010.
Witmer, B.G. and Singer, M.J. (1998) ‘Measuring presence in virtual environments: a presence
questionnaire’, Presence, Vol. 7, No. 3, pp.225–240.
Bibliography
Anastasi, R., Tandavanity, N., Flintham, M., Crabtree, A., Adams, M., Row-Farr, J., Iddon, J.,
Benford, S., Hemmings, T., Izadi, S. and Taylor, I. (2006) ‘Can you see me now? A citywide
mixed-reality gaming experience’, ACM Transactions on Computer–Human Interaction
(TOCHI), Vol. 13, No. 1, pp.100–133.
Biswas, A. (2008) ‘Managing art–technology research collaborations’, Int. J. Arts and Technology,
Vol. 1, No. 1, pp.66–89.
Bode, H. (1984) ‘History of electronic sound modification’, Journal of the Audio Engineering
Society, Vol. 32, No. 10, pp.736–739.
Calleja, G. (2007) ‘Digital game involvement: a conceptual model’, Games and Culture, Vol. 2,
pp.236–260.
Coleridge, S.T. (1983) ‘Biographia literaria’, in J. Engell and W.J. Bate (Eds.), The Collected
Works of Samuel Taylor Coleridge (Bollingen Series, 75). Princeton, NJ: Princeton University
Press.
Collins, K. (2008) Game Sound: An Introduction to the History, Theory and Practice of Video
Game Music and Sound Design. USA: MIT Press.
Cumming, N. (1997) ‘The subjectivities of Erbarme Dich’, Music Analysis, Vol. 16, No. 1,
pp.5–44.
El-Nasr, M.S., Vasilakos, A.V. and Robinson, J. (2008) ‘Process drama in the virtual world – a
survey’, Int. J. Arts and Technology, Vol. 1, No. 1, pp.13–33.
Elson, L.C. (2009) Elson’s Music Dictionary. Charleston, SC: BiblioBazaar, LLC.
Gardner, W.G. (2004) ‘Spatial audio reproduction: toward individualized binaural sound’,
The Bridge, Vol. 34, No. 4, pp.37–42.
Grau, O. (2004) ‘Immersion and interaction from circular frescoes to interactive image spaces’, in
R. Frieling and D. Daniels (Eds.), Wien. New York: Springer-Verlag, pp.303–304.
ISO (2009) ‘ISO 3382-1:2009 ‘acoustics – measurement of room acoustic parameters – part 1:
performance spaces’, Available at: http://www.iso.org/, Accessed on 19 November 2010.
Kyriakakis, C., Tsakalides, P. and Holman, T. (1999) ‘Surrounded by sound’, IEEE Signal Process
Magazine, Vol. 16. pp.55–66.
Lemordant, J. and Guerraz, A. (2007) ‘Mobile immersive music’, in International Computer Music
Conference ‘07 (2007).
Leydon, R. (2001) ‘The soft-focus sound: reverberation as a gendered attribute in mid-century
mood’, Perspectives of New Music, Vol. 39, No. 2, pp.96–107.
Lombard, M., Ditton, T.B., Crane, D., Davis, B., Gil-Egui, G., Horvath, K., Rossman, J. and Park,
S. (2000) ‘Measuring presence: a literature-based approach to the development of a
standardized paper-and-pencil instrument’, in W. Ijsselsteijn, J. Freeman and H. de Ridder
(Eds.), Proceedings of the Third International Workshop on Presence, Eindhoven University
of Technology, Eindhoven, The Netherlands.
Packer, R. and Jordan, K. (2002) Multimedia: From Wagner to Virtual Reality. New York: W.W.
Norton & Company.
Steinbock, D. and Wilson, J.L. (2007) The Mobile Revolution. UK: Kogan Page, p.150.
Notes
1 SoundPool and MediaPlayer are programming classes (templates for creating objects)
built into the Google Android operating system that handle the playback of audio
files.
2 The full results of this part of the evaluation are documented in the paper by Paterson et al.
(2010a) and will not be elaborated on in this paper.