All content in this area was uploaded by Stephen Roddy on Feb 06, 2020
UNCORRECTED PROOF
Journal : Large 12193 Article No : 318 Pages : 9MS Code : 318 Dispatch : 27-1-2020
Journal on Multimodal User Interfaces
https://doi.org/10.1007/s12193-020-00318-y
ORIGINAL PAPER
Mapping for meaning: the embodied sonification listening model
and its implications for the mapping problem in sonic information
design
Stephen Roddy1 · Brian Bridges2
Received: 30 April 2019 / Accepted: 23 January 2020
© Springer Nature Switzerland AG 2020
Abstract
This is a theoretical paper that considers the mapping problem, a foundational issue which arises when designing a sonifica-
tion, as it applies to sonic information design. We argue that this problem can be addressed by using models from the field of
embodied cognitive science, including embodied image schema theory, conceptual metaphor theory and conceptual blends,
and from research which treats sound and musical structures using these models, when mapping data to sound. However,
there are currently very few theoretical frameworks for applying embodied cognition principles in a sonic information design
context. This article describes one such framework, the embodied sonification listening model, which provides a theoretical
description of sonification listening in terms of conceptual metaphor theory.
Keywords Auditory display · Sonification · Conceptual metaphor · Image schema · Conceptual blending
1 Introduction: sonic information design
and the mapping problem
Sonic information design refers to the application of design
research, as defined by Faste and Faste [1], to sonification,
an auditory display technique in which data is mapped to
non-speech sound to communicate information about its
source to a listener. A key challenge in sonification is the
mapping problem, first introduced by Flowers [2], who
stated that meaningful information does not necessarily arise
when complex data sets are submitted to sonification. In
fact, due to cognitive–perceptual dimensional entanglement
(such as the ecological intermingling of what had tradition-
ally been considered to be discrete auditory dimensions,
e.g. pitch and amplitude), this may rarely be the case [3].
Similar concerns have been raised within sound studies and
practices, notably Truax [4], who criticised the overreliance
on the ‘energy transfer model’ of sound (asserting that a
psychophysical approach does not account for many aspects
of sound’s communicative affordances), O’Callaghan [5], a
philosopher of sound, and sound artists and sound studies
theorists Kahn [6], LaBelle [7] and Cox [8]. The relationship
between arts practices and the sonification mapping problem
is further discussed by Roddy and Bridges [9, 10]. From a
design-centered perspective, Worrall [3, 11, 12] presents a
similar argument: that the software tools used in sonification
parameterise sound using the basic parameters of Western
tonal music (pitch, duration, loudness and timbral identity/
difference), an example being the PMSon mapping of pitch,
loudness, duration and timbre to unique data [13]. These
parameters, Worrall argues, fail to account for the embodied
aspects of sound and sound production, which he sees as
critical to meaning-making in the context of sonification.
From this perspective, then, the mapping problem
becomes a design challenge that must be addressed anew
whenever one attempts to create a sonification. The sonic
parameters we choose when designing a sonification deter-
mine how well the sonification communicates information
and how well the listener can interpret it. As Ryle [14],
Searle [15] and Harnad [16] have variously shown, meaning
cannot be generated for a listener without providing sufficient
context because, as Dreyfus [17] and Polanyi [18]
* Stephen Roddy
roddyst@tcd.ie
Brian Bridges
bd.bridges@ulster.ac.uk
1 Electronic and Electrical Engineering, Trinity College
Dublin, Dublin 2, Ireland
2 School of Arts and Humanities, Ulster University, Magee
Campus, Derry BT48 7JL, Northern Ireland
This is a pre-print of an article published in The Journal on Multimodal User Interfaces.
The final authenticated version is available online at: https://doi.org/10.1007/s12193-020-00318-y
point out, objects of meaning require a background context
against which their meaning can be assigned, and, as such,
auditory display solutions must be designed with this criti-
cally important constraint in mind. We argue that the map-
ping problem can be addressed by adopting models of sound
which draw from contemporary theories of embodied cogni-
tion to refine the more traditional perspectives of psychoa-
coustics and formalist/computationalist models of cognition.
This, in turn, may provide designers with new higher-level
parameter mapping strategies that allow them to map data in
ways that may be better suited to providing sufficient context by
which the symbolic component sounds of sonification might
become meaningful and informative to a listener.
2 Embodied cognition guiding sonic
information design
Embodied cognition researchers approach the problem of
how to describe cognitive processes and conceptual sys-
tems from the perspectives of the physical and perceptual
affordances of the human body [19]. To this end, the field
has introduced a number of theoretical cognitive faculties
that complement our traditional computationally based
understanding of cognitive faculties. Image schema theory
[20] posits that the building blocks of thought are derived
from frequently encountered structures within sensorimotor
experience; according to this theory, we draw upon image
schemas to lend structure to both our thinking and perceptual
activities. One way in which we may do this is through
conceptual metaphors. A conceptual metaphor [21] is the
cognitive process by which image schemas in a familiar
domain of thought are leveraged to make sense of an abstract
domain of thought. A common example is highlighted in
the phrase “Love is a Journey”. In that phrase the familiar
logical structure of a ‘journey’ is mapped to frame the more
abstract domain of ‘love’. Inferences can then be made about
the concept of ‘love’ on the basis of this logical frame. For
example, it can be inferred that, just like a journey, love has a
beginning, middle and end and is typified by forward motion
along that linear path. The image schema involved here is the
SOURCE-PATH-GOAL schema [20]. In addition to provid-
ing a structure for logical inference, conceptual metaphors,
within this theory, are also assumed to structure experience
on the perceptual and sensorimotor levels. Here they frame
an unfamiliar perceptual or sensorimotor domain in terms of
a more familiar one. For example, the desktop metaphor in
human-computer interaction (HCI) frames what would otherwise
be an unfamiliar and abstract virtual space in terms of
an office desk space. This structures how a user understands,
reasons about and interacts with the virtual space.
Conceptual blending is another process by which famil-
iar conceptual content is integrated to generate new hybrid
conceptual content [22]. Conceptual blending and its rela-
tionship to sonic information design is explored elsewhere
[9], and design approaches informed by embodied cogni-
tion have been successfully applied in the context of HCI
[23–26]. In a similar fashion, embodied approaches to
interactive sonic information design have been developed,
informed by Dourish’s [27] introduction of the concept of
embodied interaction; see (Serafin et al. 2011). In recent
years, more broadly embodied models of sound have
become increasingly prevalent in sonification. Diniz et al.
[28, 29] apply principles from embodied music cognition
to the design of a multilevel interactive sonification, and
Dyer et al. [30, 31] draw from similar principles in the
design of sonification mapping strategies for motor skill
learning. Peres et al. [32] explore embodied approaches
to sonification in the design of a real-time sonification for
surface electromyography (EMG).
Whilst these approaches have provided productive con-
nections between theories of embodied cognition and map-
ping strategies, we argue that a consideration of embodied
perspectives drawn from music theories and practices may
be helpful in further extending sonic information design.
There is an extensive body of literature which has inves-
tigated the structures of Western tonal music in terms
of embodied image schemas, conceptual metaphors and
blending [33–37]. In particular, these models provide
perspectives on the temporal dynamics of listening via
embodied metaphors. A crucial factor for our present pur-
poses is that the strategies which underpin music’s evoca-
tion of apparent causality may inform more complex and
dynamic sonic information design approaches. Beyond
pitch-based musical structures, a number of researchers
within the field of electroacoustic music have investigated
embodied theories of timbre and sound-structural organisation.
Kendall [38, 39] describes electroacoustic music
on the basis of image schemas, conceptual metaphors and
conceptual blending, and Graham and Bridges [40, 41]
describe how Smalley’s theory of spectromorphology [42,
43], a model of how sound textures and ‘gestures’ within
electroacoustic music may relate to one another, can be
seen as compatible with image schema and conceptual
metaphor theory. Similar work by Godøy [44] highlights
the implied embodied underpinnings of Pierre Schaef-
fer’s concept of the sound object (itself an antecedent of
Smalley’s theories [42, 43]). Further work in this domain
[45] argues that an influential three-dimensional parametric
model of timbral relationships [46], with primary
dimensions for spectral centroid, synchrony of start times,
and presence/absence of attack transients, is compatible
with dynamics drawn from embodied image schema and
conceptual metaphor theory (verticality schemas, tension/
projection/linearity dynamics of movement and spatial
presence/diffusion).
2.1 Theoretical frameworks and the embodied
sonification listening model
The mapping problem could be said to arise when we fail
to account for the idiosyncrasies of human perception and
cognition. Sonification designers are broadly aware that they
must work within the limits of human perception and that
psychoacoustic constraints have a very large impact on how
we represent data to a listener using sound. Furthermore,
beyond the psychophysics of perception, we must account
for the cognitive constraints of the listener in terms of work-
ing memory and cognitive load, etc. Embodied cognition
suggests that there is both another layer of constraints for
which we must account and another layer of possibilities
that we can exploit in the design of effective sonification and
auditory display solutions. The theory posits that we think,
reason and understand, at least in part, on the basis of image
schemata, conceptual metaphors and conceptual blends and
as such we must account for them in our design solutions.
The problem is that we do not yet have the theoretical tools
with which to analyse, discuss and address these in the con-
text of sonification listening. We present one such theoretical
model below.
The embodied sonification listening model (ESLM) aims
to describe the role of conceptual metaphor in the listeners’
interpretation of a sonification. A model of the embodied
meaning-making faculties active in sonification listening
might help to guide the design of communicatively effec-
tive sonification mapping strategies. Vickers and Hogg
[47] make a similar argument: that the modes of listening
proposed by thinkers like Schaeffer [48], Chion [49], and
Gaver [50] are insufficient in describing sonification listening,
and call for a new paradigm that is exclusively focused
on describing the richness and diversity of the sonification
listening experience.
The embodied sonification listening model (Fig. 1) was
originally introduced by Roddy [51] but is formalised and
described in greater detail here. It uses Lakoff and John-
son’s [21] conceptual metaphor theory to provide a theo-
retical description of how meaning might emerge in sonifi-
cation listening, from an embodied perspective. Typically,
a listener does not have direct access to the data or the
original data source being represented during sonification
listening. As a result, they must construct an imaginary
model of the data on the basis of the cues provided by the
sonification. In the same way that a sonification designer
creates a mapping strategy from data to sound, the listener
must create their own cognitive-perceptual mapping
strategy from that sound back to an imagined data source.
The embodied sonification listening model provides a theoretical
explanation of the embodied meaning-making
faculties involved in this process. It relies on the embodied
meaning-making faculties discussed previously to
describe the sonification listening process. The ESLM
involves two novel conceptual evaluation schemes: the
embodied sonic dimension and embodied sonic complex.
These were devised to account for traditional dimensions
of sound such as pitch, duration, amplitude and timbre,
and also to account for the dimensionality of sonic aes-
thetics, and their role in framing and associated meaning-
making in the context of sonic information design. These
dimensions tend to be, generally speaking, too complex
to be adequately described in terms of simple interactions
of the traditional dimensions of pitch, duration, amplitude
and timbre alone.
Examples of such dimensions might be a sense of nar-
rative development over a sequence of sounds, felt emo-
tional qualities conveyed by a sound, such as a sense of
foreboding, tension as communicated in prosodic informa-
tion of human vocalisations or the unique sense of place
established by a specific soundscape. Smalley’s spectro-
morphology framework [42, 43], mentioned earlier, also
describes a number of similar sonic dimensions such as
motion and growth processes, behaviours and structural
functions.
These conceptual evaluation schemes are also intended
to address the perceived need for dedicated theoretical
descriptors for sonification [52]. (They were motivated by
Koestler’s concepts of the holon and the holarchy [53], a
holon being something which is simultaneously a whole
and a part of a larger whole while a holarchy is a hierar-
chical arrangement of individual holons). An embodied
sonic dimension is defined here as any individual sonic
aspect that a listener can attend to as a meaningful percep-
tual unit which remains identifiable while evolving in time
along a continuous bi-polar axis. An embodied sonic complex is
defined as any perceptual grouping that contains multiple
embodied sonic dimensions and can also be identified by
a listener as a meaningful perceptual unit.
Formula 1 Formalization of the embodied sonification listening
model.
$$f(t) : \left( (sC \xrightarrow{m1} dP) + (sD \xrightarrow{m2} dM) \right) eK$$
Fig. 1 The embodied sonification listening model
At a given time t, a listener attending to a sonification f(t)
will associate the sound they are hearing (the sonic complex
sC), with the phenomenon of which they imagine the rep-
resented data to be a measurement (the data phenomenon
dP). This constitutes the first metaphorical mapping (m1).
The second metaphorical mapping (m2) involves the
association of changes along dimensions within the sonifica-
tion (sonic dimension, sD) with changes in the original data-
set (measurement dimension, dM). These mappings are fur-
ther constrained and modulated by the listener’s embodied
knowledge, eK. This contains the listener’s understanding
of the sound, the data, any instructions or training they have
received regarding the sonification and any associations,
conscious or unconscious, the listener draws between or to
these elements. More broadly it encompasses a listener’s
everyday knowledge of their physical, social and cultural
environments. This knowledge determines the cognitive
mapping strategy a listener employs to map the sound back
to an imagined data source during sonification listening.
(A more detailed description of how embodied knowledge
mediates a listener’s interpretation and understanding of a
sound is presented by Kendall [38, 39].)
As previously pointed out, there are two metaphorical
mappings within the ESLM. In the first metaphorical map-
ping (m1) the listener maps, or identifies, the sonic complex
with the source of the data. That is to say that they associ-
ate the sounds they are hearing with the source from which
the original data was recorded or measured. In the second
metaphorical mapping (m2), the listener maps or identifies
changes in attributes of the sonic complex to the data set.
This simply means that they associate changes in different
attributes of the sound with changes in the data.
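As an illustration only, the terms of the model can be recorded as a small data structure. The sketch below is not part of the ESLM itself: the class names `MetaphoricalMapping` and `ESLMDescription` and the `summary` method are hypothetical, and the example values anticipate the flood-monitoring case discussed in Sect. 2.2.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MetaphoricalMapping:
    """One metaphorical mapping: a sonic term standing for a data term."""
    sonic: str  # sC for m1, sD for m2
    data: str   # dP for m1, dM for m2


@dataclass(frozen=True)
class ESLMDescription:
    """Hypothetical record of the ESLM terms for one sonification design."""
    m1: MetaphoricalMapping  # sonic complex (sC) -> data phenomenon (dP)
    m2: MetaphoricalMapping  # sonic dimension (sD) -> measurement dimension (dM)
    embodied_knowledge: Tuple[str, ...] = ()  # eK assumed of the listener

    def summary(self) -> str:
        ek = "; ".join(self.embodied_knowledge) or "unspecified"
        return (f"m1: {self.m1.sonic} -> {self.m1.data} | "
                f"m2: {self.m2.sonic} -> {self.m2.data} | eK: {ek}")


flood = ESLMDescription(
    m1=MetaphoricalMapping("sine tone", "water"),
    m2=MetaphoricalMapping("pitch", "water level"),
    embodied_knowledge=("water level rises as water is added to a vessel",),
)
print(flood.summary())
```

Writing the assumed embodied knowledge down alongside the two mappings makes explicit which listener assumptions a design depends on, so that they can later be checked with the intended user group.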
2.2 Applying the embodied sonification listening
model
To better illustrate the operation of the ESLM let us consider
a number of design strategies that a sonification designer
can employ to present different kinds of data. For example,
in a sonification developed for a flood monitoring and alert
system, the key data in question is water level. This is
usually measured in metres and centimetres. For the
sake of this illustrative example let’s imagine that a designer
chooses to represent this data using a pitch-mapped sine tone
and that the polarity of the mapping is such that as the water
level increases the pitch rises and as the water level falls the
pitch falls too. This is a clear and direct mapping strategy.
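A minimal sketch of such a pitch mapping is given below. The water-level range (0 to 5 m), the frequency range (220 to 880 Hz) and the function names are illustrative assumptions rather than values from any deployed system; the exponential interpolation is one common design choice, used here so that equal changes in level produce equal musical intervals.

```python
import math


def water_level_to_frequency(level_m, level_min=0.0, level_max=5.0,
                             f_min=220.0, f_max=880.0):
    """Map a water level in metres to a sine-tone frequency in Hz.

    Positive polarity: a higher water level gives a higher pitch.
    All range values are hypothetical defaults.
    """
    # Normalise the level to [0, 1] and clamp out-of-range readings.
    position = (level_m - level_min) / (level_max - level_min)
    position = min(max(position, 0.0), 1.0)
    # Interpolate exponentially between the lowest and highest frequency.
    return f_min * (f_max / f_min) ** position


def sine_samples(frequency_hz, duration_s=0.01, sample_rate=44100):
    """Return raw sine-tone samples in [-1, 1] at the given frequency."""
    n = int(duration_s * sample_rate)
    return [math.sin(2.0 * math.pi * frequency_hz * i / sample_rate)
            for i in range(n)]


for level in (0.0, 2.5, 5.0):
    print(level, "m ->", round(water_level_to_frequency(level), 1), "Hz")
```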
In this context the sine tone is the sonic complex (sC)
of the sonification. It acts as a metaphor (m1) for our data
phenomenon (dP). In this case the data phenomenon (dP)
is the ‘water’, the level of which has been measured and
recorded in the dataset.
This representation or substitution of the water with the
sine tone is our first metaphorical mapping (m1). The second
metaphorical mapping (m2) is between the sonic dimension
(sD) and the data measurement (dM). In this example our
sonic dimension (sD) is pitch and our measurement dimen-
sion (dM) is water level as recorded in metres or centime-
tres in the dataset. In this example the designer has mapped
increases in water level to increases in pitch, and vice versa.
(Whilst they could have inverted the polarity and mapped
increases in data value to decreases in pitch, the original
mapping polarity is in line with common practice in pitch-
mapping sonification, one reason for which will be discussed
below).
Our model suggests that these mappings are mediated by
the listeners’ previous embodied knowledge (eK) and that
generally speaking, listeners come to a sonification with
their own unique and vast history of such knowledge (eK).
This raises some important additional issues:
(a) Which aspects of previous embodied knowledge do
listeners draw upon when interpreting a sonification?
(b) How do we design a mapping strategy that a listener
can understand on the basis of their previous embodied
knowledge?
We suggest that designers focus on the data here, and
choose the strategy that best reflects the real-world, physical
behaviours of the data phenomenon and the data measure-
ment. In doing so the designer is leveraging the listeners’
previous embodied knowledge of the data being represented.
In the example above, the data phenomenon is water and it is
probably safe to assume that from previous experiences the
average listener knows that when you add water to a vessel,
the overall level of the water within the vessel ‘rises’. The
polarity of this ‘mapping’ is therefore grounded within our
direct, real-world experience of water. On this basis we can
reason that when a listener perceives a rising pitch contour
in a sonification of water level data they will interpret it as
a rise in water level.
We can approach a sonification of a phenomenon like
wind in a similar manner. For example, consider a sonification
where wind speed data is mapped to control the
cutoff frequency of a filtered white noise generator. The
white noise in this example is the sonic complex (sC). This
provides a metaphor (m1) for the original data phenomenon
(dP), which is the wind. The cutoff of the filter is the
sonic dimension (sD) and this in turn provides a metaphor
(m2) for the data measurement (dM), which is the set of recorded
changes in wind speed. Again, one might assume that, on
the basis of past embodied knowledge (eK), an increase
in cutoff frequency would be interpreted as an increase
in wind speed. The reasoning here is that filtered white
noise provides a good analogue for the sound of wind and
increasing the cutoff increases the amount of perceptible
activity in the frequency spectrum. As wind produces
sound through friction when in contact with a surface,
the higher the wind speed, the higher the frequency (e.g.
spectral centroid) of the resulting sound. As such, higher
filter cutoff frequencies might coincide with higher wind
speeds and vice versa, and the dimension and polarity of
the sonic mapping is thus consistent with eK.
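The wind example can be sketched in the same spirit. The code below assumes a hypothetical speed range (0 to 30 m/s), a hypothetical cutoff range (200 to 8000 Hz) and a simple one-pole low-pass filter standing in for whatever filter a real synthesis environment would provide. The helper `mean_abs_delta` is only a crude proxy for the amount of high-frequency activity in the output.

```python
import math
import random


def wind_speed_to_cutoff(speed_ms, speed_max=30.0,
                         cutoff_min=200.0, cutoff_max=8000.0):
    """Map wind speed (m/s) to a low-pass cutoff in Hz, positive polarity."""
    position = min(max(speed_ms / speed_max, 0.0), 1.0)
    return cutoff_min * (cutoff_max / cutoff_min) ** position


def lowpass_noise(cutoff_hz, n_samples=4096, sample_rate=44100, seed=0):
    """Generate white noise and smooth it with a one-pole low-pass filter."""
    rng = random.Random(seed)
    # Smoothing coefficient of the one-pole filter for this cutoff.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    y = 0.0
    out = []
    for _ in range(n_samples):
        x = rng.uniform(-1.0, 1.0)  # white-noise input sample
        y += alpha * (x - y)        # low-pass filtered output sample
        out.append(y)
    return out


def mean_abs_delta(samples):
    """Mean absolute difference between adjacent samples."""
    return sum(abs(b - a) for a, b in zip(samples, samples[1:])) / (len(samples) - 1)


calm = lowpass_noise(wind_speed_to_cutoff(5.0))
gale = lowpass_noise(wind_speed_to_cutoff(25.0))
print(mean_abs_delta(calm) < mean_abs_delta(gale))
```

Under these assumptions, the higher wind speed yields the more active signal, which is the polarity the text argues is consistent with eK.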
An interesting, but more demanding, example which
nonetheless conforms to this model is the sonification of
population data. Whilst still a measure of a physical phe-
nomenon, population data (unless we are dealing with very
small populations) is somewhat less immediately acces-
sible than physical data like water levels and wind speed.
However, if we consider a sonification where population
data is mapped to control the number of individual grains
in a grain cloud, the grain cloud (sC) becomes the meta-
phor (m1) for overall population (dP) and the density of
the cloud (sD) becomes a metaphor (m2) for increases in
the number of people in the population (dM). In this case,
one might assume that on the basis of previous embod-
ied knowledge (eK) increases in the density of the grain
cloud would be interpreted as increases in the population
number. The previous knowledge at play here can be quan-
tified in terms of basic arithmetic or, from an embodied
point of view, from simple everyday experience of add-
ing and removing individual members from larger collec-
tions of physical objects; a conceptual metaphor of spatial
coverage and density versus sparseness, which relates to
Talmy’s [54] ‘states of consolidation’ (whereby spatial
coverage may be compact or diffuse). Thus, in this exam-
ple, the mapping is informed by a familiar, real-world,
physical model (in this case, the behaviour of crowds), but
is reinforced with reference to a more generic conceptual
metaphor of spatial coverage/density.
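A sketch of the grain-cloud mapping follows, assuming a linear relationship between population and the number of grains scheduled per second. The ceiling values (`pop_max`, `grains_max`) and the uniform random placement of grain onsets are hypothetical choices made for illustration.

```python
import random


def population_to_grain_count(population, pop_max=1_000_000, grains_max=200):
    """Map a population figure to a number of grains per second.

    Positive polarity: a larger population gives a denser grain cloud.
    """
    fraction = min(max(population / pop_max, 0.0), 1.0)
    return round(fraction * grains_max)


def grain_onsets(grain_count, duration_s=1.0, seed=0):
    """Scatter grain onset times (seconds) uniformly across the duration."""
    rng = random.Random(seed)
    return sorted(rng.uniform(0.0, duration_s) for _ in range(grain_count))


sparse = grain_onsets(population_to_grain_count(50_000))
dense = grain_onsets(population_to_grain_count(500_000))
print(len(sparse), len(dense))
```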
The key point here is to focus on various aspects of com-
mon physical experiences that we can assume a listener is
familiar with when designing a mapping strategy, and, fur-
thermore, to design mapping strategies that are congruent
with these familiar embodied experiences based on simple,
directly-observable physical relationships (the first two
examples) and, potentially, their reinforcement by more
generic spatial conceptual metaphors (the third example).
However, not all data have clear connections to physi-
cal experience. For example, consider changes in the gross
domestic product (GDP) of a country. We cannot experi-
ence an economic phenomenon like GDP in the same direct
manner that we can experience physical phenomena like
water and wind. When representing data of this type, we
don’t have previous embodied knowledge of the data source
that we can draw upon to inform our sonification mapping
strategy. In these cases we suggest choosing sounds (sC)
which themselves have proven to have familiar embodied
associations for a listener. If we cannot derive a suitable
background of embodied knowledge from the original data,
we can import one by representing and framing the data with
sounds for which the listener has sufficient previous embodied
experience, developing a new ‘narrative’ through this
new metaphorical connection, which is nonetheless based
on established metaphors from eK.
An example of this type of approach would proceed as
follows.
1. Identify linguistic conceptual metaphors which are asso-
ciated with the data set. White [55] argues that there is a
clear conceptual metaphor underpinning the concept of
an ‘economy’: that economies are often conceptualised
as living organisms and are thought and reasoned about
in those terms.
2. As such, a sonification designer would consider a bio-
logically-inspired sonic complex (sC) in order to repre-
sent changes in an economic metric such as GDP. For
example, a heartbeat sound could be used as a param-
eterised auditory icon to represent the data phenomenon
(dP) in question: GDP.
3. The sonic dimension (sD) in this case could be the pulse
or heart rate and the data measurement (dM) could be
the changes in GDP, whereby increases would map to
an increased pulse/heart rate and decreases could map
to a decreased pulse/heart rate. (These types of mapping
example are summarised in Table 1.)
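The three steps above can be sketched as follows. The resting rate, the scaling factor and the rate limits are hypothetical values chosen only to keep the heartbeat within a plausible range; in practice they would need to be tuned and evaluated with listeners.

```python
def gdp_change_to_bpm(change_pct, resting_bpm=70.0, bpm_per_pct=10.0,
                      bpm_min=40.0, bpm_max=160.0):
    """Map a percentage change in GDP to a heart rate in beats per minute.

    Growth raises the rate above the resting value and contraction lowers
    it; the result is clamped to [bpm_min, bpm_max].
    """
    bpm = resting_bpm + bpm_per_pct * change_pct
    return min(max(bpm, bpm_min), bpm_max)


def beat_times(bpm, duration_s):
    """Return the onset times (seconds) of heartbeat sounds at this rate."""
    interval = 60.0 / bpm
    times = []
    n = 0
    while n * interval < duration_s:
        times.append(n * interval)
        n += 1
    return times


for change in (-2.0, 0.0, 2.0):
    print(change, "% ->", gdp_change_to_bpm(change), "bpm")
```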
It must be noted here that we are making some assump-
tions about embodied knowledge in the previous examples.
In practice, embodied knowledge is a critical aspect of this
model as it mediates how exactly a listener will interpret
Table 1 Example mappings based on prior embodied knowledge (eK)

Metaphor 1 (m1)                                Metaphor 2 (m2)
Sonic complex (sC)     Data phenomenon (dP)    Sonic dimension (sD)       Data measurement (dM)
Sine tone              ≫ Water                 Pitch                      ≫ Water level
Filtered white noise   ≫ Wind                  Filter cutoff frequency    ≫ Wind speed
Grain cloud            ≫ Population            Grain density              ≫ Pop. number
Heartbeat (sound)      ≫ GDP                   Heart/pulse rate           ≫ GDP changes
a sonification. This is complicated by the fact that embodied
knowledge (eK) can vary wildly from person to person
and from culture to culture. Such factors must be taken into
account during phases of design by adopting user-centric
design and evaluation methodologies to produce systems
which are better adapted to the specific embodied knowledge
(eK) of expected user groups.
As discussed previously, the listener’s background of
embodied knowledge contains their understanding of the
sound, the data and any instructions or training they have
received for the sonification. This knowledge is grounded
in the listeners’ embodied experience through embodied
schemata and these embodied schemata determine how the
embodied sonic dimensions are mapped to data. A similar
phenomenon is referred to by Walker [56] as polarity. For
example, when the speed of a train is mapped to the sound
of flowing water, an increase in the speed of the water flow
(embodied sonic dimension) is likely to coincide with an
increase in the speed of the train as both share a common
measure, speed, which is structured by the fast–slow schema
[20]. When the depth of a submarine is mapped to pitch, a
decrease in pitch (embodied sonic dimension) is likely to
correspond to an increase in depth. This is because both
depth and pitch are structured by a common up–down (ver-
ticality) schema [20, 36]. For depth however, an increase in
the data means downward motion and so a decrease in pitch
might be interpreted as an increase in data. In this case, the
listener’s embodied knowledge of the data determines their
experience of the sonification.
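Polarity can be treated as an explicit design parameter. The sketch below uses a hypothetical helper, `map_with_polarity`, with illustrative ranges for the submarine example: depth is mapped to pitch with negative polarity, so that an increase in depth gives a fall in pitch, in keeping with the shared verticality schema.

```python
def map_with_polarity(value, data_min, data_max, out_min, out_max, polarity=1):
    """Linearly map a data value onto a sonic dimension.

    polarity=+1: the sonic dimension rises as the data value rises.
    polarity=-1: the sonic dimension falls as the data value rises.
    """
    position = (value - data_min) / (data_max - data_min)
    position = min(max(position, 0.0), 1.0)
    if polarity < 0:
        position = 1.0 - position
    return out_min + position * (out_max - out_min)


# Hypothetical ranges: depth 0-300 m mapped to 220-880 Hz, negative polarity.
for depth in (0.0, 150.0, 300.0):
    print(depth, "m ->", map_with_polarity(depth, 0.0, 300.0, 220.0, 880.0, -1), "Hz")
```

For the train example, the same helper would be called with `polarity=1`, since both the data and the sonic dimension are structured by the fast-slow schema.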
2.3 Sonification metaphor and culture
Lakoff and Johnson [21] argue that all linguistic cultures
that they have considered employ embodied knowledge
and create and use conceptual metaphors based on embod-
ied experience; however, the metaphors created are often
specific to that culture. Taking this a step further, Kövec-
ses [57] illustrates a class of conceptual metaphors that,
while still rooted in embodied experience, have a pre-
dominantly cultural basis. As an example, he points to the
idiom ‘Time is Money’ and argues that it can only result
from, and make sense in a capitalistic culture in which
profit can be equated with the time required to produce a
product. This is an important point for sonic information
design and suggests that when applying a model like the
ESLM, which relies heavily on conceptual metaphor, the
designer must be aware of the culture in which the listener
is embedded and base their design on metaphors to
which these users are accustomed. Consider the sonification
of economic data. Since its inception sonification has
proved a useful tool for representing economic and market
data [3]. However, research by Chung [58] has suggested
that Chinese, Malay and English speakers use different
conceptual metaphors for markets. The results show that
Chinese and Malay speakers tend to use more metaphors
based on ‘competition’ than English speakers when con-
ceptualising markets. By contrast metaphors used by Eng-
lish speakers tend to focus on the ‘fall’ of a market. A
related study compared the use of conceptual metaphors
across financial reports written by native English speak-
ers with those written by native Spanish speakers [59].
The results showed that while both groups conceptualised
the economy as an organism (similar to our earlier GDP
example), Spanish reports used more metaphors based
on psychological mood and personality while reports in
English showed a tendency towards more nautically based
metaphors. These differences in the conceptualisation of
markets and economies across cultures are important and
they call for unique approaches to the sonification of market
and economic data for the groups in question. Accounting
for these differences can result in systems that are more
inclusive overall, as well as systems that are developed
specifically for users of a certain culture.
How listeners interpret the meaning of a given sound in
a sonification context is, undoubtedly, highly dependent on
cultural factors. Polli [60] points out that approaches to sonification
reliant on the Western harmonic music system fail
to account for non-Western listeners. She argues that listening
to the soundscape is an experience more commonly shared
across cultures, though the content of those soundscapes
can differ radically over time and geographical space. Jeon
et al. [61] showed that in the representation of emotional
state data in auditory display, Korean listeners showed a
stronger preference for either auditory icons (real-world
sounds) or earcons (Westernised musical sounds), whereas
U.S. listeners showed more distributed preference between
the two categories. These results are suggestive of cultural
differences between listeners. Vickers and Hogg [47] also
comment on cultural differences in auditory display, suggesting
that spectromorphology (discussed previously) has
the advantage of being less culturally specific than some
Westernised alternatives as it is chiefly concerned with the
sonic gestures discussed earlier.
The ESLM has been consciously designed to accommo-
date a broad range of sounds. The sonic complex can just as
easily be a soundscape, or a section of a raga, as it can be a
sine tone, melodic pattern or rhythmic pulse. The key point
is to choose a sonic complex and sonic dimension that cre-
ates the right conceptual metaphorical mapping for a given
listener allowing them to interpret the sonification on the
basis of a familiar domain of embodied experience for them.
It is critical that designers account for cultural factors, taking
care to choose sounds and mapping strategies that are con-
sistent with the metaphors employed by the user, whilst also
seeking to better understand how cultural factors may be at
play in the construction or modification of these metaphors.
2.4 ESLM re-iterated
To summarise in plain English, listeners attend to the sound
as though it were the data during sonification listening.
Thus, the sound is experienced as a metaphorical repre-
sentation of the data. There are two metaphors involved in
this process. In the first, the sound heard is identified with
the original data source. In the second, and arguably more
critical, metaphorical changes in the sound are identified
as changes in the data recorded from the original source.
Crucially, this entire process is mediated by the listener’s
background of embodied knowledge, which determines how
exactly metaphorical mappings take place. However, where
the data is more complex, more broadly embodied models
of sound, through which multivariate data series can be rep-
resented using conceptual metaphors, blends and a wider
range of timbral/textural changes (informed by the treatment
of timbre’s component dimensions via embodied dynamics),
may be helpful.
The ESLM is proposed as a tool for guiding the design
of sonifications that can exploit the embodied aspects of
meaning-making during sonification listening. The ESLM
connects a number of research strands in embodied cogni-
tion, sonification and music composition to describe how
sonifications can be parameterised in terms of image sche-
mas, conceptual blending, and conceptual metaphor theory.
Research of the kind explored in this article is important
for sonic information design because it shows that both
sound and music can be modelled in terms of embodied
cognitive processes. Rather than parameterising sound and
music on the basis of listeners’ abilities to discern changes
in pitch, amplitude and timbre, the present article proposes
the parameterisation of sound on the basis of listeners' ability to
detect, track and interpret changes based on image schemas
and conceptual metaphors and blends. This is arguably
crucial to sonic information design because it allows the
designer to map data to more complex sonic dimensions
(and combinations of dimensions) that are far better suited
to communicating information to a listener, and which
the listener, based on ecological-embodied experience, is
adapted to making sense of.
3 Discussion and concluding remarks
The ESLM has far-reaching implications for the field of
sonic information design.
There are strong cultural and historical precedents in the
West for conceptualising sound in either psychophysical
terms (i.e. pitch, loudness and timbre) or Western musical
terms (e.g. rhythm and melody). This approach has proven
to be less useful for sonification, where the mapping prob-
lem imposes hard limits on how data can be represented
with these auditory dimensions. The ESLM provides a novel
framework for thinking about and working with sound in the
context of sonification. It differs from standard approaches
in that it is specifically intended to account for sonification
and it provides this account in terms of conceptual meta-
phor theory so as to address some of the embodied aspects
of sonification listening. In doing so the ESLM serves as
an explanatory framework for how given groups of listen-
ers might interpret a sonification. Crucially, it provides a
framework for thinking about, and better understanding, the
processes by which listeners might relate a specific sound to
a data source when listening to and interpreting a sonifica-
tion. The ESLM allows a designer to work with sounds from
a wide and varied range of sources in a systematic manner.
The model can be applied if a sound can be parameterised
with a sonic complex (sC) that can represent the data phe-
nomenon (dP) and a sonic dimension (sD) that is mapped to
the data measurement (dM) in the original data set via a rel-
evant conceptual metaphor informed by, and adapted to, the
listener's previous embodied knowledge (eK). This model,
with its novel sonic and conceptual dimensions, intro-
duces more degrees of freedom for representing data with
sound. This expanded possibility space gives the designer
the opportunity to choose sounds and mapping strategies
which might better represent their data to the listener. The
conceptual metaphors involved in the model, and the need
to choose metaphorical mappings from data to sound that
make sense to a listener, help to constrain this possibility
space to only those mapping strategies that are meaningful
and can be interpreted by a listener. In doing so, the ESLM
provides a useful tool for addressing the first design chal-
lenge posed by the mapping problem: the question of how
to design a meaningful data-to-sound mapping strategy [2].
The ESLM also helps to address another design challenge
posed by the mapping problem: the issue of dimensional
entanglement encountered in traditional approaches to sonification.
The use of a sonic complex (sC) to represent a
specific data phenomenon (dP) and a sonic dimension (sD)
to represent changes in the measured data (dM) allows the
designer to make clear delineations between different sounds
in a sonification and the data sources they represent. Another
crucial component here, however, is the embodied knowledge
(eK) that a listener draws upon to interpret a sonification.
While this doesn’t unequivocally solve the mapping prob-
lem, which remains a design problem that must be solved
each time a designer creates a sonification, it does offer a
framework in which to address it.
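The components named above (dP, dM, sC, sD and eK) can be read as a simple record with two checks attached. The following Python sketch is offered only as an illustration of that reading and is not an implementation given in this article. The class name `ESLMMapping`, the functions `is_meaningful` and `has_entanglement`, the `polarity` field, the linear rescaling and the example values are all assumptions introduced here. In particular, treating embodied knowledge as a set of metaphor labels, and treating entanglement as the reuse of one sonic complex for two data phenomena, are deliberate simplifications.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Set


@dataclass(frozen=True)
class ESLMMapping:
    """One data-to-sound mapping described in ESLM terms (illustrative only)."""
    data_phenomenon: str   # dP: what the data is about
    data_measurement: str  # dM: the measured quantity
    sonic_complex: str     # sC: the sound that stands for dP
    sonic_dimension: str   # sD: the aspect of sC that tracks dM
    metaphor: str          # conceptual metaphor linking dM to sD
    polarity: int = 1      # hypothetical: +1 rising data -> rising sD, -1 inverted

    def to_sound(self, values: Sequence[float]) -> List[float]:
        """Rescale dM values linearly to an sD control signal in [0, 1]."""
        lo, hi = min(values), max(values)
        if hi == lo:
            # A constant series carries no change, so hold sD at its midpoint.
            return [0.5 for _ in values]
        scaled = [(v - lo) / (hi - lo) for v in values]
        return scaled if self.polarity > 0 else [1.0 - s for s in scaled]


def is_meaningful(mapping: ESLMMapping, embodied_knowledge: Set[str]) -> bool:
    """eK check (simplified): the listener must already know the metaphor."""
    return mapping.metaphor in embodied_knowledge


def has_entanglement(mappings: Sequence[ESLMMapping]) -> bool:
    """True if one sonic complex is reused for two different data phenomena."""
    owner: Dict[str, str] = {}
    for m in mappings:
        if owner.setdefault(m.sonic_complex, m.data_phenomenon) != m.data_phenomenon:
            return True
    return False


# Hypothetical example values, chosen only to exercise the sketch.
gdp = ESLMMapping("economy", "GDP", "sine tone", "pitch height", "MORE IS UP")
print(gdp.to_sound([1.0, 2.0, 3.0]))        # [0.0, 0.5, 1.0]
print(is_meaningful(gdp, {"MORE IS UP"}))   # True
```

In this reading, the metaphor check narrows the possibility space to mappings a given listener can interpret, while the entanglement check keeps each data source tied to its own sound. A real design process would replace both checks with listener research rather than set membership.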
Adopting approaches informed by embodied cognition
may support designers in more efficiently investigating and
devising solutions to this problem. Frameworks such as
the ESLM can help a designer to account for some of the
embodied aspects of cognition involved in a listener's
interpretation of a sonification.
Acknowledgements This publication has been funded by an Irish
Research Council Government of Ireland Postdoctoral Fellowship
Award (Grant no. 14887). This publication has emanated from research
supported in part by a research grant from Science Foundation Ireland
(SFI) and is co-funded under the European Regional Development
Fund under Grant No. 13/RC/2077.
References
1. Faste T, Faste H (2012) Demystifying “design research”: design is
not research, research is design. Presented at Industrial Designers
Society of America Education Symposium, Boston, Massachusetts
2. Flowers JH (2005) Thirteen years of reflection on auditory graph-
ing: promises, pitfalls, and potential new directions. Faculty Pub-
lications, Department of Psychology, 430
3. Worrall D (2019) Intelligible sonifications. In: Sonification
design: from data to intelligible soundfields. Springer, Berlin
4. Truax B (1984) Acoustic communication. Ablex, Norwood
5. O’Callaghan C (2007) Sounds: a philosophical theory. Oxford
University Press, Oxford
6. Kahn D (1999) Noise, water, meat: a history of sound in the arts.
MIT press, Cambridge
7. LaBelle B (2010) Acoustic territories: sound culture and everyday
life. A&C Black, London
8. Cox C (2011) Beyond representation and signification: toward a
sonic materialism. J Vis Cult 10(2):145–161
9. Roddy S, Bridges B (2016) Sounding human with data: the role
of embodied conceptual metaphors and aesthetics in representing
and exploring data sets. In: The proceedings of the MusTWork
2016 the music technology workshop
10. Roddy S, Bridges B (2018) Addressing the mapping problem in
sonic information design through embodied image schemata, con-
ceptual metaphors and conceptual blending. J Sonic Stud 17
11. Worrall D (2013) Understanding the need for micro-gestural
inflections in parameter-mapping sonification. Georgia Institute
of Technology, Atlanta
12. Worrall D (2014) Can micro-gestural inflections be used to
improve the soniculatory effectiveness of parameter mapping
sonifications? Organ Sound 19(1):52–59
13. Grond F, Berger J (2011) Parameter mapping sonification. In:
Hermann T, Hunt A, Neuhoff JG (eds) The sonification handbook.
Logos Publishing House, Berlin, pp 363–397
14. Ryle G (1949) The concept of mind. Hutchinson, London
15. Searle JR (1980) Minds, brains, and programs. Behav Brain Sci
3(03):417–424
16. Harnad S (1990) The symbol grounding problem. Phys D Nonlin-
ear Phenom 42(1):335–346
17. Dreyfus HL (1965) Alchemy and artificial intelligence. The Rand
Corporation, Santa Monica, Research Report
18. Polanyi M (2012) Personal knowledge. Routledge, London
19. Varela FJ, Thompson E, Rosch E (1991) The embodied mind:
cognitive science and human experience. MIT Press, Cambridge
20. Johnson M (1987) The body in the mind: the bodily basis of
meaning, imagination, and reason. University of Chicago Press,
Chicago
21. Lakoff G, Johnson M (1980) Metaphors we live by. University of
Chicago Press, Chicago
22. Fauconnier G, Turner M (2002) The way we think. Conceptual
blending and the mind’s hidden complexities, New York
23. Imaz M, Benyon D (2007) Designing with blends: conceptual
foundations of human–computer interaction and software engi-
neering methods. MIT Press, Cambridge
24. Hurtienne J (2009) Image schemas and design for intuitive use.
Exploring new guidance for user interface design. (Doctoral
Thesis). Retrieved from DepositOnce TU Berlin Repository for
Research and Data Publications. TU Berlin identifier opus3 2970
25. Waterworth J, Riva G (2014) Feeling present in the physical world
and in computer-mediated environments. Palgrave Macmillan,
London
26. Bødker S, Klokmose CN (2016) Dynamics, multiplicity and
conceptual blends in HCI. In: Proceedings of the 2016 CHI
conference on human factors in computing systems. ACM, pp
2538–2548
27. Dourish P (2004) Where the action is: the foundations of embod-
ied interaction. MIT press, Cambridge
28. Diniz N, Deweppe A, Demey M, Leman M (2010) A framework
for music-based interactive sonification. In: 16th International
conference on auditory display (ICAD-2010)
29. Diniz N, Coussement P, Deweppe A, Demey M, Leman M (2012)
An embodied music cognition approach to multilevel interactive
sonification. J Multimodal User Interfaces 5(3–4):211–219
30. Dyer J, Stapleton P, Rodger MW (2015) Sonification as concurrent
augmented feedback for motor skill learning and the importance
of mapping design. Open Psychol J 8(3):1–11
31. Dyer J, Stapleton P, Rodger M (2017) Transposing musical skill:
sonification of movement as concurrent augmented feedback
enhances learning in a bimanual task. Psychol Res 81(4):850–862
32. Peres SC, Verona D, Nisar T, Ritchey P (2017) Towards a system-
atic approach to real-time sonification design for surface electro-
myography. Displays 47:25–31
33. Cox A (2001) The mimetic hypothesis and embodied musical
meaning. Musicae Scientiae 5(2):195–212
34. Brower C (2000) A cognitive theory of musical meaning. J Music
Theory 44(2):323–379
35. Adlington R (2003) Moving beyond motion: metaphors for chang-
ing sound. J R Music Assoc 128(2):297–318
36. Zbikowski LM (2002) Conceptualizing music: cognitive structure,
theory, and analysis. Oxford University Press, Oxford
37. Larson S (2012) Musical forces: motion, metaphor, and meaning
in music. Indiana University Press, Bloomington
38. Kendall GS (2010) Meaning in electroacoustic music and the eve-
ryday mind. Organ Sound 15(1):63–74
39. Kendall GS (2014) The feeling blend: feeling and emotion in elec-
troacoustic art. Organ Sound 19(2):192
40. Graham R, Bridges B (2014) Gesture and embodied metaphor in
spatial music performance systems design. In: NIME, pp 581–584
41. Graham R, Bridges B (2014) Strategies for spatial music per-
formance: the practicalities and aesthetics of responsive systems
design. https://doi.org/10.5920/divp.2015.33
42. Smalley D (1996) The listening imagination: listening in the elec-
troacoustic era. Contemp Music Rev 13(2):77–107
43. Smalley D (1997) Spectromorphology: explaining sound-shapes.
Organ Sound 2(02):107–126
44. Godøy RI (2006) Gestural-sonorous objects: embodied extensions
of Schaeffer’s conceptual apparatus. Organ Sound 11(02):149–157
45. Graham R, Manzione C, Bridges B, Brent W (2017) Exploring
pitch and timbre through 3d spaces: embodied models in virtual
reality as a basis for performance systems design. In: New inter-
faces for musical expression proceedings
46. Grey JM, Gordon JW (1978) Perceptual effects of spectral modi-
fications on musical timbres. J Acoust Soc Am 63(5):1493–1500
47. Vickers P, Hogg B (2006) Sonification abstraite/sonification
concrète: an ‘æsthetic perspective space’ for classifying auditory
displays in the ars musica domain. In: 12th International conference
on auditory display (ICAD-2006), pp 210–216
48. Schaeffer P, Reibel G, Ferreyra B, Chiarucci H, Bayle F, Tanguy
A et al (1967) Solfège de l’objet sonore. In: INA/GRM
49. Chion M, Gorbman C, Murch W (1994) Audio-vision
50. Gaver WW (1989) The SonicFinder: an interface that uses audi-
tory icons. Hum Comput Interact 4:67–94
This is a pre-print of an article published in The Journal on Multimodal User Interfaces.
The final authenticated version is available online at: https://doi.org/10.1007/s12193-020-00318-y
51. Roddy S (2015) Embodied sonification. Doctoral Dissertation.
Trinity College Dublin. Ireland
52. Walker BN, Nees MA (2011) Theory of sonification. Sonification
Handb 9–39
53. Koestler A (1967) The ghost in the machine. Hutchinson & Co.,
London
54. Talmy L (2008) The fundamentals of spatial systems. In: Hampe
B (ed) From perception to meaning: image schemas in cognitive
linguistics. Walter de Gruyter, Berlin
55. White M (2003) Metaphor and economics: the case of growth.
Engl Specif Purp 22(2):131–151
56. Walker BN (2002) Magnitude estimation of conceptual data
dimensions for use in sonification. J Exp Psychol Appl 8(4):211
57. Kövecses Z (2010) Metaphor and culture. Acta Universitatis Sapi-
entiae Philologica 2(2):197–220
58. Chung SF (2005) MARKET metaphors: Chinese, English and
Malay. In: Proceedings of the 19th Pacific Asia conference on
language, information and computation, pp 71–81
59. Charteris-Black J, Ennis T (2001) A comparative study of meta-
phor in Spanish and English financial reporting. Engl Specif Purp
20(3):249–266
60. Polli A (2012) Soundscape, sonification, and sound activism. AI
Soc 27(2):257–268
61. Jeon M, Lee JH, Sterkenburg J, Plummer C (2015) Cultural differ-
ences in preference of auditory emoticons: USA and South Korea.
Georgia Institute of Technology, Atlanta
Publisher’s Note Springer Nature remains neutral with regard to
jurisdictional claims in published maps and institutional affiliations.