The Audible Artefact: Promoting Cultural Exploration and Engagement with Audio Augmented Reality


Abstract

This paper introduces two ongoing projects where audio augmented reality is implemented as a means of engaging museum and gallery visitors with audio archive material and associated objects, artworks and artefacts. It outlines some of the issues surrounding the presentation and engagement with sound based material within the context of the cultural institution, discusses some previous and related work on approaches to the cultural application of audio augmented reality, and describes the research approach and methodology currently engaged with in developing an increased understanding in this area. Additionally, it discusses the project within the context of related cultural and sound studies literature, presents some initial conclusions as a result of a practice-based approach, and outlines the next steps for the project.
Laurence Cliffe1, James Mansell2, Joanne Cormac3, Chris Greenhalgh1 and Adrian Hazzard1
1School of Computer Science, 2Department of Cultural, Media and Visual Studies, 3Department of Music
The University of Nottingham, United Kingdom
laurence.cliffe, james.mansell, joanne.cormac, christopher.greenhalgh,
Human-centered computing Ubiquitous and mobile
computing Empirical studies in ubiquitous and mobile
Audio; soundscape; locative; augmented; cultural; experience.
ACM Reference format:
Laurence Cliffe, James Mansell, Joanne Cormac, Christopher Greenhalgh
and Adrian Hazzard. 2019. The Audible Artefact: Promoting Cultural
Exploration and Engagement with Audio Augmented Reality. In
Proceedings of AM ’19, Nottingham, United Kingdom, September 18-20,
2019, 7 pages.
In this paper we present two ongoing projects, Mapping the
Symphony and National Science and Media Museum Gallery
Listening Sessions, both of which attempt to directly apply audio
augmented reality as a means of promoting visitor exploration
and engagement with art, artefacts, sound archive material and
their related stories.
Within the context of this paper, audio augmented reality
(AAR) is considered as a virtual audio augmentation of the
physical and visual reality, or the physical artefact. In an
approach similar to [1, 2, 3], a virtual audio soundscape currently
replaces the ambient acoustic reality of the location rather than
mixing with it; a mixed reality experience is therefore realised
through the meeting of physical artefact and virtual audio. The
possible inclusion of external acoustic ambience as a part of the
system is discussed in section 6.3.
We show how this practice-based, research through design
approach has developed a workable object detection and
nomadic indoor positioning prototype that extends the
capabilities of art and artefacts to advertise their presence to
visitors through audio augmentation.
We also outline a methodological approach, where the
ethnographical study and subsequent ethnomethodological
analysis [4, 5] of deployed prototypes will support the
developing iterations of the project going forward [6].
Additionally, we discuss the project’s potential for increasing
public engagement with sound archive material, art and artefacts
in relation to the system’s ability to extend the communicative
potential of the museum and gallery object, and in relation to
affording primacy to the sonic, rather than the visual. This is
based on related contemporary sound studies literature,
historical examples of augmented reality intervention within
cultural institutional contexts, and related work.
Also described is how the project builds on some of the
approaches outlined by Zimmerman & Lorenz in relation to the
LISTEN system [2], namely the concept of the attractor sound, to
develop an approach capable of longer range indoor positioning
with a reliance on virtually no background technological
infrastructure.
Finally, we outline the ongoing, upcoming and future work
related to the project, and present a sound art installation
Permission to make digital or hard copies of all or part of this work for personal
or classroom use is granted without fee provided that copies are not made or
distributed for profit or commercial advantage and that copies bear this notice
and the full citation on the first page. Copyrights for components of this work
owned by others than the author(s) must be honored. Abstracting with credit is
permitted. To copy otherwise, or republish, to post on servers or to redistribute
to lists, requires prior specific permission and/or a fee. Request permissions
AM'19, September 18-20, 2019, Nottingham, United Kingdom
© 2019 Copyright is held by the owner/author(s). Publication rights licensed to
ACM ISBN 978-1-4503-7297-8/19/09…$15.00
AM ’19, September, 2019, Nottingham UK
L. Cliffe et al.
environment for each of the projects that will act as settings
within which we can conduct our ethnomethodological studies.
In the academic literature on museums, sound has been
identified as having the potential to give exhibitions emotional
power [7] and to generate multiplicity of interpretative
perspective [8, 9]. The argument, in short, is that sonic
exhibitions might help us to break from the truth effects of
visual and textual storytelling and all of the asymmetrical power
relations that they have been said to produce (especially in
Foucauldian critiques of museums), opening the ground for
visitors to ‘poach’ what they need from exhibitions, to borrow
Boon’s paraphrasing [8] of Michel de Certeau. Museums have
enthusiastically embraced the challenge of sound, identifying its
potential to produce more entertaining exhibitions, most notably
in order to deal with auditory subject matter as in the case of the
V&A’s exhibitions ‘David Bowie Is’ and ‘Pink Floyd: Their
Mortal Remains’ both of which provided a fully sound-tracked
experience on headphones. Also of note is the Wellcome
Collection’s less obviously crowd-pleasing 2016 exhibition ‘This
is a Voice’ which used installed sound, mainly via contemporary
art commissions, to tell the scientific, medical and cultural story
of the human voice.
This trajectory has established sound as an interpretation
tactic in museums. However, there remains a live question about
how to approach sound itself as an object of display. Kannenberg
[10] has prompted us to think of sound as artefactual and worthy
of display in museums (his own Museum of Portable Sound, a
collection of everyday field recordings stored on an iPhone, is an
effort in this direction). An exhibition at a major UK national
museum about sound in one form or another, with visual and
textual interpretation there as support rather than the main
attraction, has yet to be achieved. Something of this nature was
proposed by Boon et al [11] in preparation for a Science Museum
exhibition on music. The proposed exhibition would seek to
display the museum’s collection of historical acoustic and sound
reproduction technologies by foregrounding the listening
experience and skills of those who designed and used objects
such as tuning forks, gramophone sets and noise meters in the
past. It would do so by engaging visitors primarily as listeners,
drawing them into the auditory world of past music and sound
technologies and leading them through the exhibition via the ear
rather than the eye. That an exhibition such as the one described
by Boon et al [11] has yet to be realised speaks, it is fair to say, to
the enormous practical challenges of delivering this format.
This paper introduces two ongoing projects that aim to
uncover the potential challenges and opportunities involved in
the implementation of AAR as a means of promoting visitor
exploration and engagement with cultural institutions,
collections and exhibitions. Both projects also intend to explore
how AAR can be deployed within cultural institutions, cultural
venues and at heritage sites, and what value it may hold as a
curatorial and artistic tool within these contexts.
2.1 National Science Museum Listening Sessions
At Science Museum, like many other science and technology
museums, technological artefacts are conserved to maintain
physical integrity but not to continue in an operational state.
The acoustic world produced by late Victorian phonography, for
example, is lost in national museum collections because to
operate an original wax cylinder would risk its permanent
degradation as a museum object. The situation is worse still for
objects from the early electrical age: practices of conservation
typically do not address the preservation of electronic function,
meaning that a 1930s radio or gramophone set in the Science
Museum collection cannot be switched back on. Paradoxically,
Science Museum originally collected many of these objects, as
Rich [12] has shown, specifically for the purpose of auditory
demonstration. There is growing recognition that the history of
sound recording and reproduction is a story worth telling in the
museum. The challenge of doing so, with accessioned objects
which cannot be used to recreate the sound worlds of the past, is
still to be overcome.
There is also the challenge posed by the curious practice of
collecting media technology and media content separately. The
national sound archive is now held at the British Library, at one
remove from the objects which once created and replayed
recorded sound held largely at Science Museum and its regional
branches, especially the National Science and Media Museum in
Bradford. In response to the rapidly deteriorating physical state
of British Library sound archive materials and others like them in
regional collections, the Library has embarked on an ambitious
programme of digitisation known as ‘Unlocking Our Sound
Heritage,’ though there remains little sense of what public use
will be made of this digital archive once it is made available.
From a silenced collection of sound technology hardware to an
abundant, even noisy, digital sound archive, there is at present
little strategy or consensus about what might be termed ‘sonic
engagement’ – the practice of engaging the public in the history
of hearing, listening and sound. The question of what sonic
engagement should mean and how it should be achieved in the
context of museums of science and technology was taken up by
the Gallery Listening Sessions project at the National Science
and Media Museum.
2.2 Mapping The Symphony
The second project introduced here explores how to present the
evolving, fluid nature of music, exploding typical stereotypes
surrounding classical music, and particularly the genre of the
symphony, as something that is fixed and finished and must be
performed according to the composer’s wishes. Museums
have attempted to tell various stories about music history. In
2018, the V&A’s Opera: Passion, Power and Politics exhibition
traced the development of opera from the 17th century until
today across Europe and America. This immersive experience
brought performance objects, such as scenery and costumes, to
life through music. However, a different challenge is to explain
how one piece of music can evolve due to changes made by
editors, performers, arrangers, and the influence of outside
forces, such as the resources available or the fashions prevalent
in a particular time or place. The project intends to explore how
technology can guide the listener along the geographical and
evolutionary journey made by a single symphony, using the
musical content to engage visitors in its history, for example
how symphonies evolved to be performed as hymn settings in church,
within ballets and operas in the theatre, and as military and
brass band tunes in promenade and open-air concerts.
In addition to the exhibition based audio experiences outlined
within the introduction of this paper, there are a number of
other related projects that provide useful reference points.
Zimmerman & Lorenz’s LISTEN system [2] provides an
excellent example of the capabilities of AAR within the context
of a cultural institution. The LISTEN project, which they describe
as ‘an attempt to make use of the inherent everyday integration of
aural and visual perception’, delivers a personalised and
interactive location based audio experience based on an adaptive
system model. It does this by tracking aspects of the visitor’s
behaviour (which artworks have been visited, how long they
were visited for, etc.) to assign the visitor a behavioural model
and adjust the delivery of audio content accordingly. The
LISTEN system relies on a substantial technical background
infrastructure to realise this personalised and invisible technical
frontend experience for the visitor, who can wander freely
through the exhibition space with just a set of customised
headphones. LISTEN also introduces the concept of the attractor
sound, which, based on the visitor’s personalised profile model,
suggests other nearby artworks to the visitor that may be of
interest to them via spatially located audio prompts.
Furthermore, LISTEN characterises many of the key differences
between the usual audio guide experience and an interactive,
adaptive and immersive approach. These include the dynamic
delivery of spatial audio based on the listener’s movement, and
the delivery of related audio content based on the listener’s
proximity to an exhibit.
Like The Rough Mile [13], Sikora et al’s archaeological AAR
experience [3] could be categorised as an example of
transformative soundscaping, where virtual audio is used to
alter, or to reframe, rather than to directly complement, the
context of the locative experience. In the case of Sikora et al’s
AAR experience this change of context is from rural to urban.
Being an outdoor experience it relies, as does The Rough Mile, on
GPS technology for determining the position of the user within
the physical landscape, values from which are translated into
coordinates on a virtually authored representation of this
landscape based on satellite imagery, onto which are placed
virtual sound sources for the user to encounter in the real world.
A similar authoring approach is taken by the system presented
here, though, being for indoor experiences, it relies on a custom
indoor positioning approach rather than GPS for determining
the position of the user in virtual and physical space.
Seidenari et al’s work on an automatic context-aware audio
museum guide [14] demonstrates how a combination of both
context modelling and artwork detection work together to
influence the playback of audio descriptions. It also shows how
the current object of the visitor’s focus is determined by a
wearable camera based object recognition system. Additionally,
the inclusion of speech detection within Seidenari et al’s context-
aware audio guide suggests a desire for users of such systems to
maintain the ability to socially interact with their co-visitors, or
rather it tries to ensure that visitors can still talk to other visitors.
This ability is maintained in addition to an understanding that
personalisation is a key factor in enabling museums to talk with
visitors, rather than talking to them.
Both projects employ a practice-based, research through design
approach where a series of iterative prototype interactive sound
installations are developed through a cyclical process of
development, deployment, study, analysis and redevelopment
[6]. Ethnographical studies and subsequent ethnomethodological
analysis [4, 5] of the deployed prototypes will be undertaken, with
experts and prospective audiences invited to participate and
interact with the technologies within the installation environment.
These study and analysis phases will then play a key
role in informing the subsequent redevelopment phases. The
experts’ and prospective audiences’ actions and interactions will
be observed, recorded and analysed in accordance with
recognised ethnomethodological techniques, including the
development of thick descriptions and a detailed understanding of
the machinery of interaction [4, 5]. Additional data in relation to
the audience experiences will be obtained from post-
participatory questionnaires and interviews, along with
quantitative analysis of data obtained from system logs and
Figure 1: Architecture of the current prototype. [Diagram labels: Mobile Application; Object recognition & tracking; artwork, object or unique architectural feature; Object ID; Distance of object from user; Angle between object and user; 3D Audio Spatialisation]
The current prototype installations are delivered to listeners
through headphones connected to a smartphone running an
application which is authored using Unity [15], FMOD [16] and
the Vuforia SDK [17] (see figure 1).
Each sound source either has an audio logic script attached
to it or is attached to an FMOD event. In both cases the script or
event is provided with the listener’s current distance and
orientation in relation to the source, and uses these values to
control the delivery of the audio
source to the listener. This includes its spatial position within
the virtual soundscape, based on the listener’s orientation in
relation to the virtual sound source and the real-world object,
and its attenuation within the virtual soundscape, based on the
listener’s distance from the virtual sound source and the real-
world object. The spatial position and the attenuation of the
sound source within the stereo binaural mix of the virtual
soundscape are the primary audio logic parameters which all the
sound sources contain in order to place them within, and
construct, a convincing and viable interactive and virtual three-
dimensional soundscape. Based on these orientation and distance
values other audio logic events can be scripted, such as the
delivery of different audio files, or sections of an audio file, based
on the listener’s position in relation to the source.
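The two primary audio logic parameters described above can be illustrated with a minimal sketch. It is not the project's Unity/FMOD implementation: it assumes a linear distance roll-off and a simple constant-power stereo pan in place of true binaural rendering, and the function names, distance thresholds and clip names are all hypothetical.

```python
import math

def attenuation(distance_m, min_distance=1.0, max_distance=10.0):
    """Linear roll-off gain in [0, 1]: full level inside min_distance,
    silent beyond max_distance (both thresholds are assumed values)."""
    if distance_m <= min_distance:
        return 1.0
    if distance_m >= max_distance:
        return 0.0
    return 1.0 - (distance_m - min_distance) / (max_distance - min_distance)

def stereo_gains(angle_deg, gain):
    """Constant-power pan. angle_deg: 0 = straight ahead,
    +90 = hard right, -90 = hard left (angles beyond that are clamped)."""
    clamped = max(-90.0, min(90.0, angle_deg))
    theta = (clamped + 90.0) / 180.0 * math.pi / 2.0
    return gain * math.cos(theta), gain * math.sin(theta)

def select_clip(distance_m):
    """Hypothetical scripted audio logic event: choose a clip by distance band."""
    if distance_m < 2.0:
        return "detail"
    if distance_m < 6.0:
        return "overview"
    return "attractor"

# Example: a source 5.5 m away, 30 degrees to the listener's right.
left, right = stereo_gains(30.0, attenuation(5.5))
```

A real binaural mix would also apply head-related filtering and front/back cues, which this stereo pan does not attempt.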
In an attempt to develop a useable Bluetooth beacon based
Indoor Positioning System (IPS), similar to that presented by
Gimenes et al. [18], an initial prototype utilised a Puck.js [19], a
JavaScript-programmable Bluetooth Low Energy (BLE) [20]
beacon device, which was mounted on top of a set of Bluetooth
enabled headphones. It was envisaged that this headphone-
mounted interface would be useful in determining the user’s
real-world orientation in order to deliver an interactive surround
sound experience. The Puck.js is equipped with a magnetometer,
which was calibrated to work as a compass and return the user’s
bearing over a Bluetooth web connection to a local laptop
running a Web Audio API [21] based web application. Bluetooth
beacons were placed above wall mounted artwork, and a beacon
scanning script was uploaded to the Puck.js which returned the
RSSI values of nearby beacons to the local web application in an
attempt to determine the listener’s proximity to the artwork.
Preliminary lab-based testing and observations determined that
this initial attempt was prone to ambient magnetic interference
with the bearing data, and that the Puck.js suffered from poor
battery longevity when RSSI values were returned to the web
application at a frequency that gave a useable measurement of
distance.
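To make the two measurements in this early prototype concrete, the sketch below shows one common way of turning an RSSI reading into a rough distance (the log-distance path loss model) and a pair of horizontal magnetometer components into a compass bearing. The paper does not state which conversions were used, so the model, the calibration constants (`tx_power_dbm`, `path_loss_exponent`) and the axis convention are all assumptions.

```python
import math

def rssi_to_distance(rssi_dbm, tx_power_dbm=-59.0, path_loss_exponent=2.0):
    """Estimate distance in metres with the log-distance path loss model.
    tx_power_dbm is the expected RSSI at 1 m; both constants are assumed
    and would need calibrating per beacon and per room."""
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10.0 * path_loss_exponent))

def compass_bearing(mag_x, mag_y):
    """Bearing in degrees [0, 360), clockwise from magnetic north.
    Assumes the sensor is level, with x pointing forward and y pointing to
    the wearer's right, and that hard-iron offsets were already removed."""
    return math.degrees(math.atan2(-mag_y, mag_x)) % 360.0

# Example: a beacon heard at -79 dBm, with the listener facing east.
distance_m = rssi_to_distance(-79.0)
bearing_deg = compass_bearing(0.0, -1.0)
```

Both estimates are sensitive to their surroundings: RSSI fluctuates with reflections and bodies in the room, and the bearing is disturbed by nearby magnetic fields, which matches the interference observed in testing.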
Informed and inspired by the artwork detection project
presented by Seidenari et al. [14], and a realisation of the need to
employ image recognition technology as a means to develop an
application that was useable from both an authoring and
curatorial perspective in a variety of locations, the Vuforia SDK
[17] was adopted as a means to realise this. Along with artwork
recognition, the use of image recognition and tracking
technology also presented opportunities for the development of
an Indoor Positioning System (IPS).
The Vuforia SDK enables the development of mobile
augmented reality applications that use computer vision
technology to recognise and track image targets and three-
dimensional objects in real-time, and is compatible with both the
iOS and Android mobile application platforms.
The Vuforia Engine’s camera-based object recognition and
tracking capabilities not only facilitate the recognition of the
artwork and artefacts to which virtual audio sources can be
associated, but also enable the implementation of an IPS where
the mobile listener’s angle and distance can be determined in
relation to tracked, stationary two- or three-dimensional objects.
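As an illustration of how an angle and a distance might be derived from a tracked object, the sketch below assumes the tracker reports the object's position in the camera's coordinate frame. It does not call the Vuforia API; the axis convention (x right, y up, z forward, in metres) and the function name are assumptions.

```python
import math

def distance_and_azimuth(target_in_camera):
    """Return (distance in metres, azimuth in degrees) of a tracked target.
    target_in_camera is an (x, y, z) position in an assumed camera frame:
    x to the right, y up, z forward. Azimuth 0 = straight ahead,
    positive = to the listener's right."""
    x, y, z = target_in_camera
    distance = math.sqrt(x * x + y * y + z * z)
    azimuth = math.degrees(math.atan2(x, z))
    return distance, azimuth

# Example: a target one metre ahead and one metre to the right.
distance_m, azimuth_deg = distance_and_azimuth((1.0, 0.0, 1.0))
```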
Through an authoring approach similar to the one presented
in the LISTEN system by Zimmerman & Lorenz [2], where a
world model is combined with a locative model, we can determine
our listener’s position both in the physical and virtual
environment of the experience. Additionally, the system is also
capable of determining the listener’s current focus by returning
the angle and distance of the listener in relation to the tracked
object.
An additional and important feature of this camera based IPS
is made possible through Vuforia’s Extended Tracking or
Simultaneous Localisation and Mapping (SLAM) capability,
delivered through either Apple’s ARKit [22] or Google’s ARCore
[23], when compiled for delivery as either an iOS or Android
application respectively. Vuforia’s extended tracking enables the
continued recognition and estimated location of a tracked object
outside of the camera’s field-of-view. This fusion based sensing
technology extends our ability to determine the location of our
physical objects and their associated virtual audio sources in
relation to the listener’s position in space. By being able to
estimate both the angle and distance of the virtual audio sources
around the listener, we can deliver a virtual and interactive
three-dimensional soundscape based on the listener’s physical,
real-world environment.
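Angle and distance estimates can jump when tracking changes mode, and the paper's closing discussion mentions that interpolation and value smoothing were applied to them. The paper does not say which techniques were used, so the following is only a sketch of one standard option, exponential smoothing, with the angle smoothed along the shortest arc so that it does not swing the wrong way across 0/360 degrees. The class name and the `alpha` value are hypothetical.

```python
class PoseSmoother:
    """Exponentially smooths (distance, angle) readings.
    alpha in (0, 1]: higher values follow new readings more quickly."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha
        self.distance = None
        self.angle_deg = None

    def update(self, distance, angle_deg):
        if self.distance is None:
            # The first reading initialises the filter.
            self.distance, self.angle_deg = distance, angle_deg % 360.0
        else:
            self.distance += self.alpha * (distance - self.distance)
            # Shortest signed difference in [-180, 180), so a move from
            # 350 to 10 degrees passes through 0 rather than through 180.
            diff = (angle_deg - self.angle_deg + 180.0) % 360.0 - 180.0
            self.angle_deg = (self.angle_deg + self.alpha * diff) % 360.0
        return self.distance, self.angle_deg

smoother = PoseSmoother(alpha=0.5)
smoother.update(2.0, 350.0)
smoothed = smoother.update(4.0, 10.0)
```

A lower `alpha` gives a steadier soundscape at the cost of lag behind the listener's movement, which is the trade-off any such smoothing has to balance.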
Initial prototype designs centred around tracking the objects
to which the virtual sound sources were going to be attached,
and using these as reference points to determine our
listener’s position and orientation, an approach that seemed
natural given that these were the objects that we wanted to
detect. But through the prototype development stages, once a
system had been developed that demonstrated a useable degree
of accuracy and reliability, and through the trials and
manipulations involved in sculpting the positions and
dimensions of the virtual audio sources in physical space, a
‘natural feature’ detection approach emerged. This approach
involved providing the object tracking software (Vuforia) with
isolated images of unique and static physical features within the
experience environment, determining the listener’s position
and orientation in relation to these physical features, and in turn
determining the position of the user in relation to the object to be
augmented with sound.
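The two-step reasoning of the natural feature approach can be sketched on a two-dimensional floor plan. The sketch assumes that the tracker yields the listener's position and heading in the coordinate frame of a tracked feature, and that the augmented object's position has been authored in that same frame; the names, the frame convention (heading measured clockwise from the +y axis) and the example coordinates are hypothetical.

```python
import math

def object_relative_to_listener(listener_xy, listener_heading_deg, object_xy):
    """Return (distance in metres, angle in degrees) of an authored object
    as heard by the listener. All inputs are in the tracked feature's frame.
    Angle 0 = straight ahead of the listener, positive = to their right."""
    dx = object_xy[0] - listener_xy[0]
    dy = object_xy[1] - listener_xy[1]
    h = math.radians(listener_heading_deg)
    # Rotate the offset into the listener's own forward/right axes.
    forward = dx * math.sin(h) + dy * math.cos(h)
    right = dx * math.cos(h) - dy * math.sin(h)
    return math.hypot(right, forward), math.degrees(math.atan2(right, forward))

# Example: listener at (1, 1) facing along +x; object authored at (4, 1).
distance_m, angle_deg = object_relative_to_listener((1.0, 1.0), 90.0, (4.0, 1.0))
```

Because the object only needs an authored offset from the feature, it can be positioned for the listener without ever being in the camera's view.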
6.1 Beyond line-of-sight
The fact that sound has the ability to extend the
communicative reach of the visual is a theme reflected upon by
both Conner [24] and Attali [25]. In essence, we are made aware
of objects and events that emit sound prior to observing them;
hearing is a sense that augments our vision.
This camera-based, natural feature detection approach to
indoor positioning is both different and interesting. By
associating virtual and spatialised audio sources to objects, or
features, that may not be directly related to the experience, one
can begin to think about how artworks or artefacts within
gallery and museum spaces could advertise their presence
beyond the traditional confines of line-of-sight.
For example, detecting and tracking a painted portrait in the
gallery lobby could advertise, through spatially positioned
virtual audio, the presence of the natural history exhibits
through the door to the left (bird song, monkeys and lions) and a
contemporary urban photography exhibition through the door to
the right (a city soundscape).
Additionally, one can see how this could also be used to
curate and design visitor journeys through an exhibition or a
collection of artefacts by triggering sound sources at certain
times in certain locations, or in relation to other objects. Also
possible would be advertising the location of other objects in
relation to the one you are currently viewing, associated objects
that work well together sonically as well as contextually,
guiding and suggesting potential trajectories to the listener.
Such approaches build upon Zimmerman & Lorenz’s [2]
concept of the attractor sound, a feature of the LISTEN system
that recommends additional artworks, via emitted and localized
virtual sound sources, to users of the system based on adaptive
and personalised recommendations.
Giving objects within these cultural spaces the ability to
communicate with visitors beyond line-of-sight presents
great potential, and significant challenges, for the
designers and curators of such spaces. These are spaces where the visual
has maintained primacy for centuries, from architectural design through to
curatorial decision making [24], and the approach constitutes a
new way of seeing within such environments.
Additionally, regarding the authorship and curation of
trajectories through an exhibition space, such an approach places
the object in the role of both waypoint and content, making an
object potentially a functional, and a thematic, part of the system.
This is considered in relation to Hazzard et al’s definition of an
audio event that ‘tells, or supports the telling of the narrative’ as
being thematic, and an audio event that ‘supports participants in
their navigation or comprehension of the experience’ as being a
functional one [10].
6.2 The Permeable Institution
The idea of the permeable cultural institution is proposed with
reference to the We AR group’s subversive, site-specific and
visually based augmented reality interventions at MoMA in 2010
[26]. This artist group curated an unofficial, virtual art exhibition
in which virtual artworks were placed within MoMA’s gallery spaces
for visitors to view through their smartphones, leading
Thiel [26] to remark that ‘The institutional walls of the white cube
are no longer solid…’.
In relation to the application of audio augmented reality that
has already been discussed, we can see how this also applies
through the potentially subversive or unofficial use of virtually
placed sound [27] within an institution. But, in the case of AAR,
Thiel’s remarks take on a more literal meaning, both in relation
to the internal walls of the gallery and its external boundaries,
and present opportunity as well as challenge. If we can hear the
visual before we see it, then the dividing walls of gallery rooms
are no longer obstacles to our exploration; if we can hear the
contents of the institution before we arrive, then these objects
are no longer confined by external architecture. According to
Breitsameter, in Behrendt [27], this fluid and borderless design
approach stems from ‘a sonic understanding of space’ which
allows for a space which is more permeable and one that ‘doesn’t
suggest the same kind of hard and fast boundaries of a visual
construction of space’.
In relation to the appropriation of experimental artistic
interventions for institutional based curatorial purposes, it is
perhaps worth noting Zimmerman & Lorenz’s positive curatorial
feedback on their LISTEN system [2], which acknowledges the
curatorial potential of innovative, less descriptive and enriched
audio content. This also speaks more generally for the
curatorial appropriation of experimental sonic art practice as a
tool for cultural engagement within the gallery and museum.
6.3 Ambient Inclusion
The locative audio walk The Rough Mile [13] provides compelling
evidence, and testimony from its participants, for the inclusion
of situated ambient noise within such an experience, which, in
this particular case, was realised through the use of bone-
conducting headphones. Participants of The Rough Mile generally
noted that their ability to hear both the real and virtual sound
sources of the experience’s location added to their feeling of
immersion within the experience, and that it better situated the
virtual sound within the physical location. There is also
evidence, however, to suggest that occasional loud ambient sounds
masked the virtual audio heard through the bone-conducting
headphones. This problem has also been observed whilst using
bone-conducting headphones with this project’s prototypes,
along with a significant loss in the fidelity of the audio, which
can make it difficult to discern nuances in the spatialisation and
frequency of the audio signal, an issue not experienced when
using more traditional closed-cup, or over-ear headphones. A
logical ‘best of both worlds’ solution could be to include ambient
sound within the experience either through the smartphone’s
microphone, or an external stereo or binaural microphone,
delivered into the headphones in real-time, as a way of
maintaining the immersive quality of a high fidelity surround
sound experience, whilst adding to these immersive qualities
through ambient inclusion. Additionally, this would enable the
level of the two sources, the virtual recorded audio content and
ambient audio content, to be monitored, mixed and managed.
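What monitoring, mixing and managing the two sources could involve is sketched below for one block of audio samples. This is a speculative illustration of the proposed solution, not something the prototypes are reported to do: the ambient level is measured as RMS, its gain is reduced when it would exceed an assumed ceiling so that loud ambient sound does not mask the virtual audio, and the mix is clipped to the valid sample range. All names and default values are hypothetical.

```python
import math

def rms(samples):
    """Root-mean-square level of a block of samples in [-1, 1]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def mix_block(virtual, ambient, virtual_gain=1.0, ambient_gain=0.5,
              ambient_ceiling=0.3):
    """Mix one block of virtual and ambient audio.
    If the ambient signal would exceed ambient_ceiling (RMS after gain),
    its gain is lowered for this block so it cannot mask the virtual audio."""
    level = rms(ambient)
    if level * ambient_gain > ambient_ceiling:
        ambient_gain = ambient_ceiling / level
    n = min(len(virtual), len(ambient))
    return [
        max(-1.0, min(1.0, virtual_gain * virtual[i] + ambient_gain * ambient[i]))
        for i in range(n)
    ]

# Example: quiet ambience passes through at the default gain.
mixed = mix_block([0.5, -0.5], [0.2, 0.2])
```

In practice the gain change would be smoothed across blocks to avoid audible steps, and microphone latency would need to be kept low for the ambience to feel situated.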
Participatory audience focus groups have been organised within
the context of a gallery space where interactions between users,
the space, the technology, and an audio augmented museum
artefact can be observed and recorded. The first will take place at
The National Science and Media Museum in Bradford where a
1940s radio receiver from the museum’s collection will be
augmented with archive BBC radio broadcasts from the same
period, originally produced for a project called ‘Sound and Fury:
Listening to the Second World War’. Participants will be able to
‘tune in’ to different broadcasts through their movement and
proximity to the radio set within the gallery space (see figure 2).
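One way such a 'tuning' interaction might be authored is to give each broadcast a distance from the radio set at which it is heard most clearly, and to crossfade between neighbouring broadcasts as the listener moves. The paper does not specify the mapping, so the triangular crossfade, the station names and the distances below are all hypothetical.

```python
def tuning_gains(distance_m, stations, band_width_m=1.0):
    """Return a gain in [0, 1] per station for a listener at distance_m.
    stations: list of (name, centre_distance_m) pairs. Each station is at
    full level at its centre distance and fades linearly to silence
    band_width_m away from it."""
    return {
        name: max(0.0, 1.0 - abs(distance_m - centre) / band_width_m)
        for name, centre in stations
    }

# Hypothetical layout: three broadcasts at 1 m, 2 m and 3 m from the radio.
STATIONS = [("news", 1.0), ("music", 2.0), ("drama", 3.0)]
gains = tuning_gains(1.5, STATIONS)
```

With this layout a listener standing halfway between two centres hears both broadcasts at half level, loosely echoing the overlap heard when tuning between stations on an analogue set.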
The event will be documented, and an in-depth
ethnomethodological analysis [13, 14] will be conducted in an
effort to unfurl and build ‘thick descriptions’ of the social
interactions around the participants’ involvement with the
installation. It is also hoped that additional data can be gathered
by conducting interviews with participants subsequent to their
interacting with the experience, and also asking participants to
complete a questionnaire regarding their experiences. The
analysis of the results of these workshops will be considered
throughout the development of the subsequent iteration of
prototypes, in an effort to inform the ongoing design process
within the context of a recognised, iterative practice-based
research approach [12]. Future studies may also include a
quantitative analysis of engagement, as demonstrated by Sikora
et al. [3], where participants’ time spent at specific audio
augmented locations using the AAR system is compared
alongside time spent at these locations by participants not using
the AAR system, with data being made available for analysis
through observation and data logging within the AAR system.
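The dwell-time comparison described above could be computed from logged positions along the following lines. This is a minimal sketch under stated assumptions, not the method of Sikora et al. [3] nor the logging used by the AAR system: the log format (timestamp in seconds, x and y in metres), the circular zone and the function name `dwell_time` are all hypothetical.

```python
import math

def dwell_time(samples, zone_centre, zone_radius):
    """Total seconds spent within zone_radius of zone_centre.

    samples: list of (t, x, y) tuples sorted by time (hypothetical log format).
    The interval between two consecutive samples is counted when the
    earlier of the two samples lies inside the zone.
    """
    total = 0.0
    for (t0, x0, y0), (t1, _, _) in zip(samples, samples[1:]):
        if math.hypot(x0 - zone_centre[0], y0 - zone_centre[1]) <= zone_radius:
            total += t1 - t0
    return total

# Hypothetical log: the visitor leaves the zone between t=2 and t=3.
log = [(0, 0.0, 0.0), (1, 0.5, 0.0), (2, 3.0, 0.0), (3, 0.2, 0.0), (4, 0.1, 0.0)]
print(dwell_time(log, zone_centre=(0.0, 0.0), zone_radius=1.0))  # 3.0
```

The same function could be applied to observation-derived logs for participants not using the AAR system, so that the two groups' totals per location can be compared.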
Figure 2. Diagram of radio installation design for the
Science Museum Listening Sessions.
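One way to picture the ‘tuning in’ interaction is as a crossfade between broadcasts driven by the listener's bearing relative to the radio set. The sketch below is illustrative only and is not taken from the prototype: the assignment of a centre angle to each broadcast, the linear fade, the `width_deg` parameter and the function name `tuning_gains` are assumptions.

```python
def tuning_gains(listener_angle_deg, station_angles_deg, width_deg=30.0):
    """Return one gain in [0, 1] per broadcast.

    listener_angle_deg: listener bearing relative to the radio, in degrees,
        with 0 directly in front and -90/90 to either side.
    station_angles_deg: hypothetical centre angle assigned to each broadcast.
    width_deg: angular offset at which a broadcast has faded to silence.
    """
    gains = []
    for centre in station_angles_deg:
        offset = abs(listener_angle_deg - centre)
        # Linear fade: full level at the centre angle, silent at width_deg.
        gains.append(max(0.0, 1.0 - offset / width_deg))
    return gains

# Three hypothetical broadcasts placed at -60, 0 and 60 degrees.
print(tuning_gains(-45.0, [-60.0, 0.0, 60.0]))                  # [0.5, 0.0, 0.0]
print(tuning_gains(-30.0, [-60.0, 0.0, 60.0], width_deg=60.0))  # [0.5, 0.5, 0.0]
```

A wider `width_deg` lets neighbouring broadcasts overlap, which may suit the feel of sweeping across a dial. A narrower value leaves silent gaps between them.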
Similarly, an installation environment has been proposed for
Mapping the Symphony. This will involve virtually attaching the
musically rendered symphonic arrangements to 3D models of
historical symphonic halls associated with those specific
arrangements. With the 3D architectural models placed on their
relevant geographical locations upon a map rendered on the
gallery floor, visitors will be able to explore a virtual historical
symphonic soundscape generated by their proximity to these
models, from which the different symphonic arrangements will
emanate (see figure 3).
Figure 3. Diagram showing an installation layout for the
Mapping the Symphony project.
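The proximity-driven soundscape can be sketched as a per-source gain derived from the listener's distance to each model on the floor map. The example assumes a simple clamped inverse-distance rolloff chosen for illustration. The rolloff actually used by the installation is not specified here, and the source names, coordinates and the parameters `ref_dist` and `max_dist` are hypothetical.

```python
import math

def proximity_gains(listener_xy, sources, ref_dist=1.0, max_dist=10.0):
    """Map a listener position to a gain per virtual sound source.

    sources: dict of name -> (x, y) position in metres on the floor map
        (hypothetical layout).
    Gain is 1.0 within ref_dist, falls off as ref_dist / distance beyond it,
    and is 0.0 at or beyond max_dist.
    """
    gains = {}
    for name, (sx, sy) in sources.items():
        d = math.hypot(listener_xy[0] - sx, listener_xy[1] - sy)
        if d >= max_dist:
            gains[name] = 0.0
        else:
            gains[name] = ref_dist / max(d, ref_dist)
    return gains

halls = {"hall_a": (0.0, 0.0), "hall_b": (3.0, 4.0)}
print(proximity_gains((0.0, 0.0), halls))  # {'hall_a': 1.0, 'hall_b': 0.2}
```

Standing on one model gives that arrangement at full level while the other, five metres away, is heard faintly, so walking across the map blends the arrangements.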
It is envisaged that both projects will be evaluating similar
interactions in a similar manner, though within different
locations and contexts, with a view to obtaining data that can
help inform both the artistic and curatorial opportunities and
challenges for the system across a variety of cultural and
institutional contexts.
In relation to the IPS employed by the system, significant
discrepancies have been observed throughout the initial prototype
development in the accuracy of the tracking (both distance and
orientation values) when the system switches from tracking within
the camera’s field of vision to a reliance on SLAM-based extended
tracking. Various interpolation and algorithmic value-smoothing
techniques have been used to improve this, though significant work
remains in developing these, and in accounting for this within the
design and authoring of the related experiences. Of particular
interest is how the reliability of tracked targets is affected by
changes in lighting conditions and human traffic, and how any
shortcomings relating to this can be accounted for within the
design and authoring process. Additionally, the inclusion of
ambient noise as an additional sound source for these projects,
whilst maintaining the quality of the interactive surround sound
experience, offers much promise regarding increased immersion
and adding to the listener’s suspension of disbelief in the virtual
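As an illustration of the kind of value smoothing mentioned in the tracking discussion above, the sketch below applies exponential smoothing to distance and orientation readings, handling the wrap-around at 0/360 degrees for orientation. It is an assumption-laden example and not the prototype's own technique: the class name `TrackingSmoother`, the `alpha` parameter and the choice of exponential smoothing are all hypothetical.

```python
class TrackingSmoother:
    """Exponentially smooths distance and orientation readings."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha       # 0 < alpha <= 1; smaller is smoother but slower
        self.distance = None
        self.angle_deg = None

    def update(self, distance, angle_deg):
        if self.distance is None:
            # The first reading initialises the filter unchanged.
            self.distance, self.angle_deg = distance, angle_deg
        else:
            self.distance += self.alpha * (distance - self.distance)
            # Shortest signed angular difference, in [-180, 180).
            diff = (angle_deg - self.angle_deg + 180.0) % 360.0 - 180.0
            self.angle_deg = (self.angle_deg + self.alpha * diff) % 360.0
        return self.distance, self.angle_deg

s = TrackingSmoother(alpha=0.5)
s.update(2.0, 350.0)
print(s.update(4.0, 10.0))  # (3.0, 0.0)
```

Taking the shortest angular difference avoids the source swinging the long way round when a reading jumps from 350 to 10 degrees, as might happen when the system switches tracking modes.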
An initial demonstration of the current prototype at the Science
Museum was received favourably by curatorial and collections
staff in terms of its potential application within a museum
context, and in terms of its immersive qualities. It is therefore
believed that the approach to AAR outlined in this paper
warrants further exploration, and demonstrates a potential for
connecting silenced sound technology hardware in museum
collections with relevant archive recordings, and in helping to
engage visitors with these objects, their stories, and their
associated audio archival material. It is also proposed that this
approach contributes to indoor positioning within gallery and
museum environments through the application of camera-based
object detection for determining visitor location and focus. This,
in turn, enables the system to expand upon Zimmermann &
Lorenz’s [2] concept of the attractor sound through a system
reliant on little background infrastructure. It is also concluded
that this appropriation and application of Simultaneous
Localisation and Mapping (SLAM) for solely audio augmentation
purposes works well given this particular technology’s current
shortcomings. The small and gradual movement of overlaid
graphics placed on real-world objects that can sometimes be
observed with visual AR applications is not so acute or obvious
when translated to the spatial position of audio sources. This
movement is the result of the mapping technologies adjusting
their placement of virtual augmentations as they build up, or gain
additional information about, their environment [22, 23].
The author is supported by the Horizon Centre for Doctoral
Training at the University of Nottingham (RCUK Grant No.
EP/L015463/1) and Fusing Audio and Semantic Technologies
[1] Bederson, Benjamin B. Audio Augmented Reality: A Prototype Automated
Tour Guide. Proceedings of CHI ’95, May 1995, pp. 210-211.
DOI:https://doi.org/10.1145/223355.223526
[2] Zimmermann, A., & Lorenz, A. (2008). LISTEN: a user-adaptive audio-
augmented museum guide. User Modeling and User-Adapted Interaction, 18(5),
389-416. DOI:https://
[3] Sikora, M., Russo, M., Derek, J., & Jurčević, A. (2018). Soundscape of an
Archaeological Site Recreated with Audio Augmented Reality. ACM
Transactions on Multimedia Computing, Communications, and Applications,
14(3), 1-22. DOI:
[4] Crabtree, A., Rouncefield, M. and Tolmie, P. (2012) Doing Design Ethnography,
[5] Garfinkel, H. (1967). Studies in Ethnomethodology. Prentice-Hall.
[6] Benford, S., Adams, M., Tandavanitj, N., Row Farr, J., Greenhalgh, C., Crabtree,
A., et al. (2013). Performance-Led Research in the Wild. ACM Transactions on
Computer-Human Interaction, 20(3), 1-22.
DOI:https://doi.org/10.1145/2491500.2491502
[7] Bubaris, N. (2014) Sound in museums – museums in sound, Museum
Management and Curatorship, 29:4, 391-402.
DOI:https://doi.org/10.1080/09647775.2014.934049
[8] Boon, T (2014). ‘Music for Spaces: Music for Space – An Argument for Sound
as a Component of Museum Experience’ Journal of Sonic Studies, 8.
[9] Hutchison, M and Collins, L (2009). ‘Translations: Experiments in Dialogic
Representation of Cultural Diversity in Three Museum Sound Installations’,
Museum and Society, 7/2, pp 92-109,
[10] Kannenberg, J. (2017). ‘Towards a more sonically inclusive museum practice: a
new definition of the ‘sound object’’ Science Museum Group Journal, 8.
[11] Boon, T., Jamieson, A., Kannenberg, J, Kolkowski, A. & Mansell, J. (2017).
‘‘Organising Sound’: how a research network might help structure an
exhibition’. Science Museum Group Journal, 8.
[12] Rich, J. (2017). ‘Acoustics on display: collecting and curating sound at the
Science Museum.’ Science Museum Group Journal, 7.
[13] Hazzard, A., Spence, J., Greenhalgh, C., & McGrath, S. (2017). The Rough Mile
(pp. 1-8). Presented at the 12th International Audio Mostly Conference, New
York, New York, USA: ACM Press.
DOI:https://doi.org/10.1145/3123514.3123540
[14] Seidenari, L., Baecchi, C., Uricchio, T., Ferracani, A., Bertini, M., & Bimbo, A.
D. (2017). Deep Artwork Detection and Retrieval for Automatic Context-
Aware Audio Guides. ACM Transactions on Multimedia Computing,
Communications, and Applications, 13(3s), 1-21.
[15] Unity Technologies (2019). Unity for all. Retrieved August 1, 2019 from:
[16] Firelight Technologies Pty Ltd. (2019). FMOD: Imagine, create, be heard.
Retrieved August 1, 2019 from:
[17] PTC (2019). Vuforia Engine 8.1. Retrieved August 1, 2019 from:
[18] Gimenes, M., Largeron, P., & Miranda, E. R. (2016). Frontiers: Expanding
Musical Imagination With Audience Participation. Proceedings of the
International Conference on New Interfaces for Musical Expression, 16, 350-354.
[19] Pur3 Ltd. (2018). The ground-breaking bluetooth beacon. Retrieved August 1,
2019 from:
[20] Google. (2019). Bluetooth low energy overview. Retrieved August 1, 2019 from:
[21] Mozilla and individual contributors. (2019). Web Audio API. Retrieved August
1, 2019 from:
[22] Apple Inc. (2019). ARKit. Retrieved August 1, 2019 from:
[23] Google. (2019). ARCore: Build the future. Retrieved August 1, 2019 from:
[24] Connor, S. (2011). ‘Ears Have Walls: On Hearing Art.’ In Sound, Edited by Kelly,
C. Whitechapel Gallery/MIT Press.
[25] Attali, J. (2009). Noise: The Political Economy of Music. Minneapolis:
University of Minnesota Press.
[26] Geroimenko, V. (ed.) (2014). Augmented Reality Art: From an Emerging
Technology to a Novel Creative Medium. Springer. London.
[27] Behrendt, F. (2012). The sound of locative media. Convergence: the
International Journal of Research Into New Media Technologies, 18(3),
283-295. DOI: