Abstract

This article examines how the ‘Battlefield’ (EA Games) series of games generates authenticity in its soundtrack both through a meticulous approach to modelling the physical world and through the appropriation of audio characteristics from our, typically mediated, experience of conflict. It goes on to examine how we might reconcile such ‘authentic’ audio with the more ludic features of the soundtrack, required to support gameplay, that are typically presented as inauthentic. The absence of these sounds during narrative-based sequences, and their acceptance without negative impact on immersion during gameplay, implies that inauthentic sounds do not disrupt the immersive qualities of the ‘authentic’, but only when they are clearly positioned as ego-ludic (heard only by the player, non-spatialized and synthetic in quality) and only within the challenge-based sequences of the game.
The Reality Paradox: Authenticity, fidelity and the real in Battlefield 4
Richard Stevens and Dave Raybould, with additional contributions from Ben Minto, Audio Director
Battlefield 4.1
Although the interactive medium of video games is often referred to as non-linear, since progression
through the game may follow different paths on repeated playthroughs, it is typically much more linear
than the medium of film with respect to its treatment of time and space. The grammar of film involves
frequent cuts between perspectives, geographical locations and times, whereas, for example, military first
person shooter games much more closely match the perceptual experience of everyday life through their
spatial and temporal continuity. It is not surprising, then, that this game genre often aspires to simulate
the reality of the physical world (Gapper, 2014) based on an assumption that the construction of a
‘believable, realistic space’ will be a significant driver for players’ immersion in the game (Collins,
2008). Indeed, the trailer to Battlefield 3 (Electronic Arts, 2011a) posited the question ‘Is it real? Or is it
Battlefield 3?’ (Electronic Arts, 2011b).2
This paper will examine how the Battlefield series of games facilitates player immersion3 in the game
world by appropriating audio characteristics from our typically mediated experience of conflict and
through a meticulous approach to modeling an authentic real-world audio experience. The paper also
discusses how we might reconcile this immersion in the seemingly real and authentic with the presence
of the more artificial or inauthentic ludic elements of the soundtrack required to support gameplay.
Authenticity in ambience
An attempt to clarify problematic terms such as “real” and “authentic” forms part of the discussion below
but such notions should be seen within the context of Michel Chion’s point that realism does not
necessarily equate to reality. In games, as in cinema, the audio-visual channel needs to transmit or render
an experience that in the physical world is perceived by many senses. In order to more convincingly
relate the “real” experience, aspects of the audio-visual representation of real-world phenomena will
often be heightened or manipulated (Chion, 1994). We will return to a specific usage of “real” below but
for the moment we will adopt “authentic” to indicate audio that players might consider credible for the
real-world circumstance represented by the game.
The ambient sounds in Battlefield 4 (Electronic Arts, 2013) are generated through a combination of
“Sound Areas” – up to thirty multi-layered quadraphonic loops of approximately three minutes length per
level (Figure 1), “Big World” sounds – one-shot identifier sounds such as animals, insects, and distant
war, which are randomly triggered and randomly positioned, and “Spot Sounds” which are placed at
specific locations within the game (Figure 2).4
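As an illustration only, the random triggering and random positioning of the “Big World” one-shots described above might be sketched as follows. The function name, the exponentially distributed gap between triggers, and the distance range are assumptions made for this sketch and are not details of the Battlefield 4 implementation.

```python
import math
import random


def schedule_big_world_sounds(names, duration_s, mean_gap_s, min_dist_m, max_dist_m, rng):
    """Return a list of (time_s, name, x_m, y_m) one-shot events.

    Hypothetical helper: gaps between triggers are drawn from an exponential
    distribution, and each sound is placed at a random bearing and distance
    around a listener at the origin.
    """
    events = []
    t = rng.expovariate(1.0 / mean_gap_s)
    while t < duration_s:
        name = rng.choice(names)
        bearing = rng.uniform(0.0, 2.0 * math.pi)
        dist = rng.uniform(min_dist_m, max_dist_m)
        events.append((t, name, dist * math.cos(bearing), dist * math.sin(bearing)))
        t += rng.expovariate(1.0 / mean_gap_s)
    return events


rng = random.Random(4)  # seeded so the sketch is repeatable
events = schedule_big_world_sounds(
    ["bird", "insect", "distant_war"], 180.0, 12.0, 30.0, 200.0, rng
)
```

The three-minute duration here simply mirrors the approximate loop length of a Sound Area; any relationship between the two systems in the game itself is not documented in the source.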
1 Personal correspondence between the authors and Ben Minto is indicated by the reference (Minto, 2014).
2 The discussion of systems for the implementation of audio in Battlefield 4 discussed below has been simplified for more general
consumption and so should be taken as a general indication of the approach rather than an absolute description of every detail.
3 Although there is a wide variety of literature on types of immersion (Ermi and Mäyrä, 2005), presence (Ryan, 2006; Skalski and Whitbred,
2010) or incorporation (Calleja, 2011) it is perhaps most useful, in the absence of any unifying theory at this stage, to consider it as an
umbrella term (Nacke, 2011) for a phenomenon that operates on a continuum of involvement (classified by Brown and Cairns as ranging
from engagement through engrossment to total immersion [Brown and Cairns, 2004]), and that immersion is not a type, but a state which
can be instigated and supported through a range of different processes or drivers.
4 “Sound Areas”, “Big World” sounds and “Spot Sounds” are terms that were used by designers for Battlefield 4.
Figure 1: Game view, Active Sound Areas, All Sound areas. (Courtesy of DICE Electronic Arts, 2014)
Figure 2: ‘Spot Sounds’ in the editor and debug view. (Courtesy of DICE Electronic Arts, 2014)
Forming part of the acoustic ecology of the game (Grimshaw, 2007), each of these elements for the seven
main locations (each containing over 100 unique sub-areas) in the single player campaign for Battlefield
4, has been meticulously researched to reflect what Schafer might refer to as the Keynote sounds (the
background environmental sounds) and Soundmarks (sonic landmarks) of their real world counterparts
(Schafer, 1994), even to the extent of reflecting the appropriate season during which the action takes
place, as Mari Saastamoinen Minto explains:5
For Caspian Border in Battlefield 3, we used sounds from swallows that exist in the real
Caspian area. But in Second Assault (Battlefield 4 expansion released November 2013) it’s
autumn, so we want birds with a more autumn-like sound. We’ve found a bird called
Caspian Snowcock, but we’ve already used that in the Alborz Mountain map, so the bird
hunt goes on… (Saastamoinen Minto, 2013)
Authenticity in Foley
For the player to feel a sense of perceptual presence within the environment it is also important that the
sounds that are generated as they interact with it (referred to by Grimshaw [2010] as kinediegetic sounds)
match with the player’s sonic expectations that have been formed in the physical world. Indeed, some
consider this ergo-audition (Chion, 2010), the ‘perception and joy of self-generated sounds’ (Leibe,
2013), to be key to enhancing the player’s sense of presence, by extending the body into the game
(Collins, 2011). In order to feel authentic, the sounds must match the depiction of movements (or achieve
‘kinesonic congruence’ [Collins, 2013]) and avoid any sense of repetition; a key challenge to the game
sound designer is avoiding the repeated soundfile, a disruptor of immersion (Vachon, 2009). To combat
this problem of repetition in Foley sound, Battlefield 4 uses three categories of footstep samples (walk,
run, and sprint) for each of the basic surface types (dirt, tarmac, wet tarmac, wood, grass short, grass
long, gravel, metal, sand, water, forest, snow, ice, glass, and indoor floor). These footstep sounds are not
played as discrete samples, but are blended with each other depending on the speed at which the player-
controlled character moves through the environment (which modulates the amplitude and duration of the
sound). They are further varied via a random pitch shift. The footsteps are supplemented by a Foley layer
from the character’s clothing, which again uses blending techniques, modulating both the amplitude and
pitch using player speed across the different movement types. Each of these types of movement (walk,
run, sprint, scuffle, crawl, vault, land, change pose, fall, swim, and melee) also has associated samples
within its own logic system, generating a Foley system of huge complexity and flexibility (Figure 3).6
5 See accompanying video 01: “Ambience”.
Figure 3: One of 16 Foley subsystems. (Courtesy of DICE Electronic Arts, 2014)
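The speed-based blending and random pitch shift of the footstep layers can be sketched in a few lines. This is a minimal illustration under stated assumptions: the speed anchors for each gait, the linear crossfade, and the pitch range in cents are all hypothetical values chosen for the example, and the sketch omits the amplitude and duration modulation that the game also applies.

```python
import random

# Hypothetical speeds (m/s) at which each gait layer plays on its own.
GAIT_SPEEDS = {"walk": 1.5, "run": 4.0, "sprint": 7.0}
GAIT_ORDER = ["walk", "run", "sprint"]


def gait_weights(speed):
    """Linear crossfade weights between the two gait layers nearest to speed."""
    weights = dict.fromkeys(GAIT_ORDER, 0.0)
    anchors = [GAIT_SPEEDS[name] for name in GAIT_ORDER]
    if speed <= anchors[0]:
        weights[GAIT_ORDER[0]] = 1.0
        return weights
    if speed >= anchors[-1]:
        weights[GAIT_ORDER[-1]] = 1.0
        return weights
    for i in range(len(GAIT_ORDER) - 1):
        low, high = anchors[i], anchors[i + 1]
        if low <= speed <= high:
            frac = (speed - low) / (high - low)
            weights[GAIT_ORDER[i]] = 1.0 - frac
            weights[GAIT_ORDER[i + 1]] = frac
            break
    return weights


def footstep_event(surface, speed, rng, max_pitch_cents=50.0):
    """Describe one footstep: layer weights plus a random pitch offset."""
    cents = rng.uniform(-max_pitch_cents, max_pitch_cents)
    return {
        "surface": surface,
        "layers": gait_weights(speed),
        "pitch_ratio": 2.0 ** (cents / 1200.0),  # cents to playback-rate ratio
    }


step = footstep_event("gravel", 2.75, random.Random(0))
```

At 2.75 m/s, halfway between the assumed walk and run anchors, the walk and run layers are mixed equally and the sprint layer is silent.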
The importance of the role of Foley for player immersion is recognized by the decision in Battlefield 4 to
remove Foley sounds from the automated mixing system so that they are always present in the mix.
...whilst your firearm is the key way you interact with the environment, you have a big part
to play too, so we’ve taken player Foley out of the High Dynamic Range mixing system,
now explosions will not cull your footsteps; there’s a constant sonic connection between
character and world. (Minto in Broomhall, 2013)
Authenticity of acoustics
Just as sound adds materiality to visual objects (Beck, 2008), giving the two dimensional images weight
and substance, the acoustic reverberation within a game adds materiality to the graphical renderings of
the environment. The sheer complexity of calculating the scattering of sound waves across an intricate
virtual scene remains out of reach of the real-time demands of video games7 so the items of key
importance in the shooter game, the weapons, rely on a system of pre-rendered sounds to evoke the
acoustic response of the different spaces within the game.
In Battlefield 4 each weapon sound is in fact a combination of numerous elements. These enable the
player to identify the weapon type (the ‘What?’) through the transient8 and body elements of the sound,
and create a sense of authenticity for the game environment (the ‘Where?’) through additional layers and
reverb tails (Minto, 2011).
6 See accompanying video 02: “Foley”.
7 Attempts to pre-compute an accurate impulse response to convolve in real time would require 60GB of data per M3 (Raghuvanshi et al.,
8 The initial attack portion of the sound.
Figure 4: Weapon layers (for illustration only).
In Figure 4, the top three files represent the “body” elements of the weapon and the five lower audio files
are the pre-recorded acoustic tails captured to represent environments such as ‘Indoor Small’, ‘Indoor
Large’, ‘Urban’, ‘Far’, and ‘Open field’.9 There are multiple variations for each tail type, and the
blending between the different types is governed by run-time parameters based on the type of area in
which the weapon sounds.10
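To make the idea of parameter-driven tail blending concrete, the following sketch normalizes a set of run-time area weights into a mix of tail layers and picks one of several variations for each. The names, the dictionary of area weights, and the number of variations are assumptions for illustration; as noted, the actual system is considerably more complex.

```python
import random

TAIL_TYPES = ["indoor_small", "indoor_large", "urban", "far", "open_field"]


def tail_mix(area_params, variations_per_tail, rng):
    """Return (tail_type, variation_index, gain) for every audible tail layer.

    area_params maps a tail type to a non-negative weight describing how much
    the current area resembles that environment (a hypothetical parameter).
    """
    clamped = {t: max(0.0, area_params.get(t, 0.0)) for t in TAIL_TYPES}
    total = sum(clamped.values())
    if total == 0.0:
        raise ValueError("at least one tail type needs a positive weight")
    mix = []
    for tail in TAIL_TYPES:
        gain = clamped[tail] / total  # gains sum to 1 across the layers
        if gain > 0.0:
            mix.append((tail, rng.randrange(variations_per_tail), gain))
    return mix


# A weapon fired in a doorway: mostly street, partly small room.
mix = tail_mix({"indoor_small": 1.0, "urban": 3.0}, 4, random.Random(0))
```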
Although we have touched only briefly on some of the systems within the game, we can see how the
Battlefield audio team go to great lengths to achieve authenticity in the audio through the appropriation of
realistic ambiences, Foley and acoustic spaces.
Authenticity through codes of realism
Although some of us will be conscious of ‘the symbol, the emblematic sound(s)’ (Chion, 2009) that serve
as codes of reality in cinema, such as the lonely Amtrak horn that echoes through every wrong-side-of-
the-tracks neighborhood, or the Kookaburra laugh that embeds the truth of a jungle environment (despite
actually being native to Eastern Australia), much less explored is the role played by the quality and
characteristics of the medium itself in informing our notions of the authentic. What is considered
authentic is actually an evolving notion that is based on the fidelity of the recording and storage mediums
available in a particular era or context. In visual terms we are very familiar with how the distortions of
dust and scratches over black and white imagery, or the aspect ratio and grain of 1980s home video can
be used to reference different eras or contexts. Likewise, when assessing whether the Battlefield series
has achieved its aims of creating a ‘real and believable world’ (Strandberg in Jock, 2011), we bring our
expectations formed by mediated realism to play. In the Battlefield series we can trace this influence,
from the DV Cam of television news and documentary footage to the vérité rawness of the new media of
citizen journalism such as the mobile phone and personal camera.
9 See accompanying video 03: “Weapons”.
10 The description and illustration of the weapons system here is illustrative since the actual system is far too complex to tackle in this paper.
Each [game] release is like an audio snapshot of the time. Bad Company was as recorded
through a handycam mic. Bad Company 2 was akin to mobile phone recordings uploaded
to YouTube. For Battlefield 3, the ‘unedited war’ vibe of contemporary Iraq and
Afghanistan documentaries was the benchmark, resulting in a more refined and
decipherable soundscape for the game. For Battlefield 4 we wanted to recapture the
rawness of BC2 [Bad Company 2], but not at the expense of clarity and readability.....This
is the “Go-Pro helmet-mounted camera” take. (Minto in Broomhall, 2013).
Throughout the development of previous Battlefield games, the aesthetic of the authentic has often led to
a preference for the characteristic artifacts produced by consumer level (i.e. “lo-fi”) audio equipment for
the sounds of battle.
When we were recording military exercises, it was the crappiest recorders capturing the
most violent sound that actually ended up in the game not the expensive recorders – they
sound more “authentic” when compared to the kind of war footage we’re used to seeing.
(Strandberg, 2011)11
Or as Audio Director Ben Minto puts it, ‘we don’t mind recordings with dirt in them – embrace the dirt’
(Minto, 2014).
In Battlefield 4 we can identify specific examples of these types of distortions or colorations, such as the
buffeting of wind against the microphone (specifically in the coastal storm of the ‘Singapore’ level) (see
Figure 5) or the clipped distorted recordings produced by signal overload (see Figure 6), most apparent in
the tank projectile impact sounds in ‘Shanghai’, and the optional ‘War Tapes’ audio setting (introduced
in Battlefield: Bad Company 2) that ‘mimics the very heavy compressed and distorted feel of video
footage recorded out in the field.. it’s an aesthetic setting that embraces or translates the game sound into
the “sound of youtube-ism” where everything is exaggerated. It’s like a film-grain filter in Photoshop’
(Strandberg in Jock, 2011) (see Figure 7).12
Figure 5: High levels of low frequency energy caused by air turbulence moving the diaphragm of the microphone
(wind buffeting).
11 In the DICE weapon recording video ( compare the sound of the recording made with the DPA4011
(approximately £1,500) and HHB Portadrive (approx. £8,000) to the one made by the Zoom H2 (approx £100).
12 See accompanying video 04: “Medium Distortions”.
Figure 6: The clipped waveform of a tank projectile impact sound.
Figure 7: The heavily compressed results of the “War Tapes” setting.
A now commonly used term in game audio, the HDR (High Dynamic Range) mixing system mentioned
above was originally developed by DICE (Digital Illusions CE) in 2006 as a way of ensuring certain mix
priorities through an automated process, mapping the large dynamic range of reality onto the smaller one
permitted by 16-bit playback systems (140dB to 96dB [Huber and Runstein, 2010]).13
HDR preserves the relative assigned dB difference between adjacent sounds within a fixed,
yet movable, window of dynamic range. The window will always rise to encompass the
loudest sounds, and those that fall below the minimum threshold are culled away. (Minto,
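The windowing behaviour described in this quotation can be illustrated with a minimal sketch. It assumes each sound carries a single assigned loudness in dB and that the window has a fixed width; the function name and the example values are hypothetical and the sketch is not a description of DICE’s implementation.

```python
def hdr_mix(sounds_db, window_db):
    """Map assigned loudness values onto a movable window of fixed width.

    The top of the window rises to the loudest active sound. Sounds more than
    window_db below the top are culled. Surviving sounds are returned as
    levels relative to the top of the window, so the assigned dB differences
    between them are preserved.
    """
    if not sounds_db:
        return {}
    top = max(sounds_db.values())
    floor = top - window_db
    return {name: level - top for name, level in sounds_db.items() if level >= floor}


quiet_scene = hdr_mix({"birds": 50.0, "footstep": 60.0}, window_db=50.0)
loud_scene = hdr_mix({"explosion": 140.0, "rifle": 120.0, "footstep": 60.0}, window_db=50.0)
```

In the quiet scene both sounds survive, with the birds 10 dB below the footstep. Once the explosion raises the window, the rifle sits 20 dB below it and the footstep falls out of the window and is culled.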
In addition to this automated application, the HDR system is also exploited creatively within Battlefield 4
as a symbol for loudness itself. Instead of trying to recreate the true dynamic range of war (neither
possible nor desirable since machine gun fire in an enclosed space would lead very quickly to permanent
hearing loss), the “pumping” effect of cheap limiters found in consumer video equipment (where the
recording volume is automatically lowered to avoid clipping distortion, once a peak threshold is reached,
before gradually resetting) is adopted as a metaphor for the amount of acoustic energy put into an
environment, and the environment’s recovery from this audio assault.
Loud is not just about volume – but perceived effect on the environment. HDR pumping
can be set per level, it’s basically the release time which can be varied across a level – i.e.
after something loud, how quick do the quiet sounds come back? It can also have a
cumulative effect, so if you fire for longer it takes longer for the energy to disperse and
allow quiet sounds back. We choose to exaggerate the effect in some cases in the BF titles
– it makes loud things feel louder/brutal. (Minto, 2014) 14
13 Menu options allow the player to target a range of playback systems, from the relatively narrow dynamic range for TV speakers to a more
filmic range for Hi Fi or home cinema setups (Minto, 2011).
14 See the “Shanghai” section of accompanying video 02 for an illustration.
Figure 8: HDR response: ambience, weapon firing, then HDR release. (Courtesy of DICE Electronic Arts, 2014)
It can be seen in Figure 8 that the HDR level returns to its base setting of 80, after reaching a peak of
115.93 (see HDR Scaling data in Figure 8 above) but it does so over a “release” time period, the resultant
gradual re-introduction of the quieter sounds evoking the pumping effect of consumer audio equipment.15
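The release behaviour can be sketched as a simple envelope on the top of the HDR window: an instant rise to any louder sound, followed by a gradual fall back to the base level. The base of 80 and peak of 115.93 are the illustrative figures quoted above; the linear release rate and the step-based timing are assumptions made for the sketch.

```python
def simulate_hdr_top(loudness_per_step, base_db=80.0, release_db_per_step=10.0):
    """Return the window-top level at each time step.

    Attack is instantaneous: the top jumps to any sound louder than it.
    Release is gradual: otherwise the top falls by a fixed amount per step
    until it is back at the base level.
    """
    top = base_db
    trace = []
    for loud in loudness_per_step:
        released = max(base_db, top - release_db_per_step)
        top = max(loud, released)
        trace.append(top)
    return trace


# One loud shot at step 1, then silence while the window recovers.
trace = simulate_hdr_top([0.0, 115.93, 0.0, 0.0, 0.0, 0.0])
```

With these assumed settings the window takes four steps to return to the base level, and quieter sounds would re-enter the mix progressively as the top falls, which is the “pumping” effect. The cumulative behaviour described by Minto, where longer bursts of fire lengthen the recovery, is not modelled here.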
The paradox of the “inauthentic”
As described above, Battlefield 4 invests significant resources to immerse players in a realistic ambient
environment and to support their sense of presence in the game through realistic interaction sounds
(Foley) and acoustic responses. We could describe such sounds as “real”, in that they are representations
or attempted simulations of audio in the physical world (bearing in mind the caveat regarding rendering
made in the opening comments). The reality signifiers that reference distortions of recording or
reproduction media (such as buffeting, clipping, compression, and pumping) are clearly not “real” in the
same way, but can be instead described as “mediated real”16 in that they still match a player’s
expectations or schema and therefore feel “authentic”. In other words the “authentic” encompasses both
the “real” and the “mediated real”.
Through the construction of a seemingly authentic soundtrack, Battlefield 4 is more successful than most
in evoking the perceived reality of being in combat (as a number of military veterans have testified
[Beynon, 2013]) and yet it appears a paradox that, on close inspection, a significant proportion of the
audio appears to make no attempt to be authentic (neither “real” nor “mediated real”) at all.
Identifying the inauthentic
Below are some of the sounds present in the game that appear to have no basis in the reality of the game:
1. As the mission instructions appear on the screen they are accompanied by electronic sounds (Figure 9).
15 These numbers are for illustration only, the precise nature of the numbers and functioning of the system is not in the public domain.
16 It has been noted before how immersion in games often derives from a ‘cinematic realism’ rather than an objective one (Collins, 2008:
134) but it is perhaps useful to extend this concept to “mediated realism”.
17 See accompanying video 05: “ego-ludic sounds”.
Figure 9: Mission instructions.
2. As the player approaches an enemy, a synthesized sound accompanies the appearance of a semi-
circular indicator on the screen, orientated in the direction of the enemy (Figure 10). This intensifies in
volume, notifying the player of the enemy’s increasing awareness of their presence.
Figure 10: Icon indicating the enemy’s awareness of the player.
3. When coming within a certain radius of a collectable item, a sound plays as an orientation function,
alerting the player to its close proximity. Another sound notifies the player of a successful pickup.
(Figure 11) (The automatic picking up of ammo is also accompanied by a reload notification sound.)
Figure 11: Item pickup.
4. A sound notifies the player that their squad is available and awaiting instruction, and then another
accompanies the appearance of orange aiming reticles as confirmation feedback that the command has
been received from the player (Figure 12). (A charging up and breath sound also serves as notification
when the player’s health rejuvenates to full).
Figure 12: Squad accepts instructions.
Unlike the authentically “real” sounds of ambience, Foley, and weapons discussed above, where every
effort is made to match the variation in sound found in the physical world, these sounds appear to share a
different characteristic, in that they are all highly repetitive. Even if exactly the same sound is not always
repeated (though it is in most cases), then at least they are repetitive in gesture, using four or five highly
related sounds, as in the instance of the ammo pickup notifications.
Ludic functions of the inauthentic
What also unites these inauthentic (or “unreal”) sounds is that they are all performing a ludic function,
that is, they are all supporting the player’s ability to succeed in the game by providing vital gameplay
information (Stevens and Raybould, 2014). Much of the “fun” of games derives from the player’s
fundamental desire for mastery (Przybylski et al., 2010) and the sense of immersion that comes from
engaging with challenge (Ermi and Mäyrä, 2005), but in order to maintain the player’s state of flow
(Csíkszentmihályi and Csíkszentmihályi, 1992), the player’s skills must remain balanced against the
challenges they face. The functions of these sounds in drawing attention to instructions, providing
notification, feedback, and orientation to the player are vital in compensating for the relatively low
fidelity of information available when playing a game, when compared to that of the physical world.
The SALIs (State And Location Indicators, see Figure 13), such as the direction indicators referred to
above under 2., provide a good example of the issues at play here. These semicircular icons, which have
become part of the grammar of games through their ubiquitous adoption in first and third person genres,
perform the dual function of informing the player about the current state of the enemy (since due to
resolution and memory issues we often lack the graphical fidelity (Figure 14) that might allow us to
represent the non-playable character’s (NPC’s) emotional state through facial expression or body
language [Edsall, 2003]) and the off-screen direction of the enemy – the field of view in Battlefield, like
most first person games, is limited to around 60–70 degrees (Saastamoinen Minto, 2013).
Figure 13: The ubiquity of State And Location Indicators (SALIs): Deus Ex: Human Revolution (Eidos, 2011),
Dishonored (IO Interactive, 2012), Splinter Cell: Blacklist (Ubisoft, 2013), Call of Duty: Ghosts (Activision, 2013), Far
Cry 3 (Ubisoft, 2012).
Figure 14: Illustration of the lack of fidelity in facial expression when viewed over distance in Battlefield 4.
The proximity cue for collectables (group 3, above) is also a compensation for limited graphical
fidelity and field of view, and confirmation sounds for the pickup of collectables or the
acknowledgement of squad instructions (group 4) are a proxy for the haptic feedback that would
provide a confirmation of interaction in the physical world (Stevens and Raybould, 2011).
Although they could be perceived as inauthentic, these ludic functions of audio convey vital
information to the player. As Ben Minto describes, ‘...the more information [players] have the
more informed their choices and the more immersive the title becomes. If the choices pay off
they feel the game is “better”.’ (Minto, 2014)
Situating the “authentic” and “inauthentic”
Of course, many of the “authentic” elements of a game’s soundtrack also provide ludic information.
Experienced players can decode the high frequency content of a machine gun sound to understand its
distance from them, or identify the particular faction that the sniper on the distant horizon belongs to
through the characteristic crack of the rifle. The iconic, repetitive sound signals of the purely ludic,
however, appear to occupy a specific place within a game’s social and functional space. The concepts of
diegesis have been seen to be problematic for film (Kassabian, 2013), and even more so for games
(Jørgensen, 2011), so firstly we will attempt to position these sounds within a self-devised framework
that is focused on the function of the sound, and by whom it is heard (Figure 15).
Figure 15: The functional spaces of game audio. (The lines are not ones of division but are purely for orientation.)
1. Narrative: Audio that draws us into a fictional world or narrative.
2. Ludic: Audio that provides information to help the player achieve, or motivate the player towards
achieving, mastery.
3. Social: Audio that is heard by all agents/entities in the game.
4. Personal: Audio that is heard only by the player.
It is clear that many games contain both strongly narrative elements and ludic elements (Aarseth, 2004),
and the relative proportions of these vary significantly with genre. For example, a platform game such as
Super Mario Bros. (Nintendo, 1985) might be strongly ludic, with much of the content being about
feedback and reward, and the idea of a narrative, or narrative world, being questionable in terms of its
impact on the experience of play. In contrast, a role-playing game such as Skyrim (Bethesda Softworks,
2011) will have far fewer of these elements, instead being focused on the socio-narrative depiction of a
credible world.
Battlefield contains elements from all functional zones; the ambient sounds (socio-narrative), voice-over-
type instructional speech directed at the player from their fictional superiors (ego-narrative), player-
generated sounds (such as footsteps) that serve a ludic function to others in alerting them to the player’s
location (socio-ludic), and of course, the instructional, notification, feedback, and orientation sounds
referred to above, which are heard only by the player, and which we can now identify as being “ego-
ludic” (Figure 16).
Figure 16: The “real” and the “unreal”.
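The two axes of the framework (the function of a sound, and by whom it is heard) can also be expressed as a small data structure. The class and function names below are hypothetical, and the hard categories are a simplification, since the lines in the framework are for orientation rather than division.

```python
from enum import Enum


class Function(Enum):
    NARRATIVE = "narrative"  # draws us into a fictional world or narrative
    LUDIC = "ludic"          # informs or motivates the player towards mastery


class Audience(Enum):
    SOCIAL = "socio"   # heard by all agents/entities in the game
    PERSONAL = "ego"   # heard only by the player


def functional_zone(function, audience):
    """Name the functional zone, for example 'ego-ludic'."""
    return f"{audience.value}-{function.value}"


# The Battlefield examples given in the text above.
EXAMPLES = {
    "ambient sounds": functional_zone(Function.NARRATIVE, Audience.SOCIAL),
    "instructional speech from superiors": functional_zone(Function.NARRATIVE, Audience.PERSONAL),
    "player footsteps": functional_zone(Function.LUDIC, Audience.SOCIAL),
    "notification and feedback sounds": functional_zone(Function.LUDIC, Audience.PERSONAL),
}
```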
It is these repetitive, highly encoded, elements which clearly fall outside of the representation of the
authentic in Battlefield 4, and indeed it could be said that the aesthetic of video game audio is the product
not only of technological constraints (Collins, 2008) but of ludic imperative; it is the presence of the ego-
ludic that makes games sound like games.
These sounds appear to occupy a space similar to that of the visual Heads Up Display (HUD) present in
many games, in that they are clearly part of the game, and yet distinct from the reality of the game world
(Figure 17). It has been seen that, despite this apparent disconnect, the presence of HUDs does not have a
discernable effect on a player’s sense of immersion (Jørgensen, 2012), and in many cases the attempts to
better integrate them into the narrative world or provide some justification for their presence are
unnecessary (Fagenholt, 2009). In this light, the old diegetic considerations of defining elements in
relation to them being internal or external to the game-world make little sense, since ego-ludic audio and
Heads Up Displays are clearly part of the game, and therefore internal or diegetic, and if everything is
diegetic then the terminology becomes redundant.18
Figure 17: HUDs (Heads Up Displays) from BF4, Assassin’s Creed IV: Black Flag (Ubisoft, 2013), DOTA2 (Valve,
Explaining, understanding or resolving the paradox of the “authentic” and “inauthentic”
This discussion of social-functional spaces in games helps us to categorize, define and identify the
authentic (“real” and “mediated real”) and inauthentic, but does not bring us any closer to explaining how
these seemingly contradictory elements can co-exist in Battlefield without affecting a player’s absorption
in the game. How do we explain the fact that ‘even in the presence of non-authentic sounds….game
players experience immersion’ (Grimshaw, 2012)?
18 This point is articulated more fully from an interface perspective by Jørgensen (2014).
If we have a reality schema,19 formed through our everyday experience (and our mediated experience),
then it follows that as our familiarity with gaming grows (our ‘ludoliteracy’ develops [Poulsen and
Gatzidis, 2010]) we also develop a gaming schema, one that is conversant with the tropes and
expectations of the medium, so part of the acceptance of the incongruence of the ego-ludic within an
authentic soundscape is simply down to our learnt experience of games. Indeed it could be that, for
people unfamiliar with the medium, it is precisely these kinds of repetitive sounds, seemingly not
embedded in the narrative world of the game, that prove to be disruptive to immersion. For the gamer,
however, once they understand the encoded meaning of the sound, they no longer actively listen to the
sound in the same way; they simply register its meaning and its implications for gameplay. In order for
this to happen, the positioning of a particular sound as ego-ludic needs to be clear; i.e. if it is not
‘ecologically valid’ (Bergman et al., 2009) within the reality of the game-world as presented to the
player, then it needs to be clearly positioned within the ludic space. In order to do this, the sounds are
deliberately characterized as non-realistic, granted iconic status through repetition and lack of
spatialization, and often synthetic in quality (for example in Battlefield 4, the instruction notification,
proximity notification, achievement unlock notification, deploy ready notification, and SALI sounds are
all based on synthesized or electronic sources).
The implications for immersion
A player may wander around an environment (their sensory experience contributing to immersion), then
have an encounter with an enemy (challenge driven immersion) before becoming involved with the
narrative through a brief cutscene (narrative driver).20 They may become lost in the music of the
cutscene (music instigated immersion) before being rudely pulled out of their immersive state by the
inconsiderate interruption of a spouse or parent… Upon re-entering the game, the sensory experience
once again immerses them before the next challenge begins (see Figure 18 below).
Figure 18: A hypothetical model of a player’s immersive experience.
19 Schemas are ‘mental representations of what we know and have come to expect about the world’ (Bernstein, 2011).
20 The sensory, challenge based, and narrative immersions referred to here derive from Ermi and Mäyrä (2005).
Understanding the apparent paradox of the authentic and inauthentic illustrates in a new way what has
been recognized for some time, that players not only occupy different roles at different times when
playing a game, but actually oscillate between different schemas and different drivers of immersion
(Lindley and Sennersten, 2008). This is supported by Calleja’s observation that since ‘humans have a
limited attentional capacity, devoting more conscious attention to one of the dimensions leaves less that
can be invested in others’ (Calleja, 2011: 45).21 Whilst there is no evidence to suggest that the
inauthentic ego-ludic disrupts immersion during challenge based gameplay (as long as its ludic nature
and function are clear), the presence of the ego-ludic when other immersive drivers are operating runs the
risk of necessitating a schema, driver or dimension change, with the potential for a subsequent dip in
immersion due to the cognitive load of enacting such a transition. A typical example of the potential for
the ego-ludic to disrupt immersion might be where the presence of a trophy sound, or a notification that a
friend is now online disrupts a narrative or musically immersive sequence. There is perhaps some
evidence that this is understood on a design level in Battlefield 4 when we observe that the narrative
cutscenes temporarily exclude the elements of ego-ludic audio and visual HUDs in sequences that are
clearly differentiated from normal gameplay (Figure 19).
Figure 19: Illustrating the lack of ludic elements (HUDs) during narrative cutscenes.
The immersive authenticity of the Battlefield series is generated through reconstructing the reality, and
perceived reality, of the physical world. It achieves this both through an obsessive attention to detail in
the accuracy of its representations, and through the recognition that our notions of “real”, particularly
when it comes to war, are formed to a large extent by our exposure to media (the “mediated real”). The
colorations or distortions produced by low fidelity media or recordings immerse the player in a perceived
reality, whilst at the same time it is the low fidelity of sensory information available to the player that
necessitates the presence of the deliberately unreal or inauthentic: the ego-ludic sounds required to fill
these perceptual holes, enabling a mastery of the game. We have seen how players’ acceptance of this
paradox between the simultaneously authentic and inauthentic can be explained by their familiarity with
gameplay schemas, but that the presence of unreal or inauthentic audio elements is only accepted when
the drivers for immersion are challenge-based, and the sounds have a clearly identified ludic role.
21 Calleja’s player involvement model refers to kinesthetic involvement, spatial involvement, shared involvement, narrative involvement,
affective involvement and ludic involvement (2011).
Aarseth, E. (2004), ‘Genre Trouble: Narrativism and the Art of Simulation’, in N. Wardrip-Fruin and P.
Harrigan (eds), First Person: New Media as Story, Performance and Game. Massachusetts: MIT Press,
pp. 45–55.
Activision (2013), Call of Duty: Ghosts [Console], California: Activision.
Beck, J. (2008), ‘The Sounds of “Silence”: Dolby Stereo, Sound Design, and The Silence of the Lambs’,
in J. Beck and T. Grajeda (eds), Lowering the Boom: Critical Studies in Film Sound. Illinois: Illinois
University Press, pp. 68–83.
Bergman, P., Sköld, A., Västfjäll, D., and Fransson, N. (2009), ‘Perceptual and Emotional Categorization
of Sound’, Journal of the Acoustical Society of America, 126: 6, pp. 3156–3167.
Bethesda Softworks (2011), The Elder Scrolls V: Skyrim [PC], Maryland: Zenimax Media.
Beynon, S. (2013), PTSD and How Battlefield Potentially Saved My Life [Online]. Giantbomb. Available
from: <
my-life/103621/> [Accessed 21 May 2014].
Broomhall, J. (2013), ‘Heard About: Battlefield 4’, Develop Magazine, December 2013, 145, p. 41.
Brown, E. and Cairns, P. (2004), ‘A Grounded Investigation of Game Immersion’, in CHI’04 Extended
Abstracts on Human Factors in Computing Systems, pp. 1297–1300.
Calleja, G. (2011), In-Game: From Immersion to Incorporation, Massachusetts: MIT Press.
Chion, M. (1994), Audio Vision: Sound on Screen. New York: Columbia University Press.
------------ (2009), Film: A Sound Art. New York: Columbia University Press.
------------ (2010), Le son. Traité d’acoulogie. Paris: Armand Colin.
Collins, K. (2008), Game Sound: An Introduction to the History, Theory, and Practice of Video Game
Music and Sound Design, Massachusetts: MIT Press.
------------- (2011), ‘Making Gamers Cry: Mirror Neurons and Embodied Interaction with Game Sound’,
in Proceedings of the 6th Audio Mostly Conference: A Conference on Interaction with Sound, pp. 39–46.
------------- (2013), Playing with Sound: A Theory of Interacting with Sound and Music in Video Games.
Massachusetts: MIT Press.
Csíkszentmihályi, M., and Csíkszentmihályi, I. S. (1992), Optimal Experience: Psychological Studies of
Flow in Consciousness, Cambridge: Cambridge University Press.
Edsall, J. (2003), ‘Animation Blending: Achieving Inverse Kinematics and More’ [Online] Gamasutra,
Available from <>
[Accessed 02 May 2013].
Eidos Montreal (2011), Deus Ex: Human Revolution [PC]. Montreal: Square Enix.
Electronic Arts (2010), Battlefield: Bad Company 2 [Console]. California: Electronic Arts.
Electronic Arts (2011a), Battlefield 3 [PC]. California: Electronic Arts.
------------------ (2011b), Battlefield 3 Is it real? Trailer (HD) [Online video], 24 October. Available
from: <> [Accessed 21 May 2014].
------------------ (2012), Mass Effect 3 [PC]. California: Electronic Arts.
------------------ (2013), Battlefield 4 [PC]. California: Electronic Arts.
Ermi, L. and Mäyrä, F. (2005), ‘Fundamental Components of the Gameplay Experience: Analysing
Immersion’, in: Changing Views: Worlds in Play. Selected Papers of the 2005 Digital Games Research
Association’s Second International Conference, pp. 15–27.
Fagerholt, E. and Lorentzon, M. (2009), Beyond the HUD: User Interfaces for Increased Player
Immersion in FPS Games [MSc. Thesis], Chalmers University of Technology.
Gapper, M. (2014), ‘War Machines’, Edge Magazine, July 2014, 268, pp. 58–69.
Grimshaw, M. (2007), ‘The Resonating Spaces of First-Person Shooter Games’, in Proceedings of The
5th International Conference on Game Design and Technology, Liverpool: Game Design and
Technology Workshop.
Grimshaw, M. (2010), ‘Player Relationships as Mediated Through Sound in Immersive Multi-player
Computer Games’, Revista Comunicar, 17: 34, pp. 73–80.
----------------- (2012), ‘Sound and Immersion in Digital Games’, in T. Pinch and K. Bijsterveld (eds),
The Oxford Handbook of Sound Studies, New York: Oxford University Press, pp. 347–366.
Huber, D.M. and Runstein, R.E. (2010), Modern Recording Techniques, 7th Ed., New York: Taylor &
IO Interactive (2012), Hitman: Absolution [PC]. Copenhagen: Square Enix.
Jock (2011), Bash 181: The Sound of Battlefield [Podcast]. Available from:
<> [Accessed
21 May 2014].
Jørgensen, K. (2011), ‘Time for a New Terminology?: Diegetic and Non-Diegetic Sounds in Computer
Games Revisited’, in M. Grimshaw (ed.), Game Sound Technology and Player Interaction: Concepts and
Developments, Hershey: Information Science Reference, pp. 78–97.
---------------- (2012), ‘Between the Game System and the Fictional World: A Study of Computer Game
Interfaces’, Games and Culture, 7: 2, pp. 142–163.
---------------- (2014), Gameworld Interfaces, Cambridge, Massachusetts: MIT Press.
Kassabian, A. (2013), ‘The End of Diegesis As We Know It?’, in J. Richardson, C. Gorbman and C.
Vernallis (eds), The Oxford Handbook of New Audiovisual Aesthetics, New York: Oxford University
Press, pp. 89–106.
Liebe, M. (2013), ‘Interactivity and Music in Computer Games’, in P. Moormann (ed.), Music and
Game: Perspectives on a Popular Alliance. Wiesbaden: Springer, pp. 41–62.
Lindley, C.A. and Sennersten, C.C. (2008), ‘Game Play Schemas: From Player Analysis to Adaptive
Game Mechanics’, International Journal of Computer Games Technology 2008, pp. 1–7.
Minto, B. (2011), Four Guns West. The Game Developers Conference, San Francisco, February 28–March 4.
Nacke, L.E., Stellmach, S., and Lindley, C.A. (2011), ‘Electroencephalographic Assessment of Player
Experience: A Pilot Study in Affective Ludology’, Simulation & Gaming, 42: 5, pp. 632–655.
Nintendo (1985), Super Mario Bros. [Console]. Kyoto: Nintendo.
Poulsen, M. and Gatzidis, C. (2010), ‘Understanding the Game: An Examination of Ludoliteracy’, in B.
Meyer (ed.), Proceedings of the 4th European Conference on Games-Based Learning: ECGBL 2009,
Reading: Academic Conferences Limited, pp. 316–324.
Przybylski, A. K., Rigby, C. S., and Ryan, R. M. (2010), ‘A Motivational Model of Video Game
Engagement’, Review of General Psychology, 14: 2, pp. 154–166.
Raghuvanshi, N., Snyder, J., Mehra, R., Lin, M. C., and Govindaraju, N. K. (2010), ‘Precomputed Wave
Simulation for Real-Time Sound Propagation of Dynamic Sources in Complex Scenes’, ACM
Transactions on Graphics - Proceedings of ACM SIGGRAPH 2010, 29: 4.
Ryan, R.M., Rigby, C.S. and Przybylski, A. (2006), ‘The Motivational Pull of Video Games: A Self-
Determination Theory Approach’, Motivation and Emotion 30: 4, pp. 344–360.
Saastamoinen Minto, M. (2013), The Road to Battlefield 4: Sounds of the Battlefield [Online]. SkyOkapi.
Available from: <> [Accessed
21 May 2014].
Schafer, R. M. (1994), Soundscape: Our Sonic Environment and the Tuning of the World, Rochester,
United States: Destiny Books.
Skalski, P., and Whitbred, R. (2010), ‘Image versus Sound: A Comparison of Formal Feature Effects on
Presence and Video Game Enjoyment’, PsychNology Journal, 8: 1, pp. 67–84.
Stevens, R. and Raybould, D. (2011), The Game Audio Tutorial: A Practical Guide to Sound and Music
for Interactive Games, San Francisco: Focal Press.
------------------------------- (2014), ‘Designing a Game for Music: Integrated Design Approaches for Ludic
Music and Interactivity’, in K. Collins, B. Kapralos and H. Tessler (eds), The Oxford Handbook of
Interactive Audio, New York: Oxford University Press, pp. 147–166.
Ubisoft (2013), Assassin’s Creed IV: Black Flag [PC]. Montreal: Ubisoft.
Ubisoft Montreal (2013), Tom Clancy’s Splinter Cell: Blacklist [PC]. Montreal: Ubisoft.
--------------------- (2012), Far Cry 3 [PC]. Montreal: Ubisoft.
Vachon, J. F. (2009), ‘Avoiding Tedium – Fighting Repetition in Game Audio’, in Audio Engineering
Society Conference: 35th International Conference: Audio for Games.
Valve (2013), DOTA 2 [PC]. Washington: Valve Corporation.