The Relative and Perceived Impact of Irrelevant
Speech, Vocal Music and Non-vocal Music
on Working Memory
Thomas R. Alley & Marcie E. Greene
Published online: 16 October 2008
© Springer Science + Business Media, LLC 2008
Abstract The ability to retain and manipulate information for brief periods of time
is crucial for proficient cognitive functioning but working memory (WM) is
susceptible to disruption by irrelevant speech. Music may also be detrimental, but its
impact on WM is not clear. This study assessed the effects of vocal music,
equivalent instrumental music, and irrelevant speech on WM in order to clarify what
aspect of music affects performance and the degree of impairment. To study this, 60
college students completed WM tests (digit span) in the presence of irrelevant
speech, vocal music, instrumental (karaoke) versions of the vocal music, and silence.
As expected, both speech and vocal music degraded performance. WM performance
with instrumental music was better than with vocal music but not significantly
different from either silence or speech. Familiarity with song lyrics had little effect
on performance. People were poor judges of the degree of memory impairment
resulting from various irrelevant sounds.
Keywords Working memory · Irrelevant sound · Interference
The ability to retain and manipulate information for brief periods of time is crucial
for proficient cognitive functioning. In recent years, most cognitive psychologists
have conceptualized this ability as a reflection of a multi-component “working
memory” (WM), and acknowledged that WM is vulnerable to interference from
certain concurrent activities or sensory inputs. Exactly which activities or inputs can
interfere with particular WM tasks, and why, remains a topic with considerable
research activity and debate.
The most commonly proposed and discussed components of WM are two
independent subsystems, the “phonological loop” and the “visuospatial sketchpad”,
Curr Psychol (2008) 27:277–289
This study was reviewed and approved by the Clemson University Institutional Review Board.
T. R. Alley (*) · M. E. Greene
Department of Psychology, Clemson University, 418 Brackett Hall, Clemson, SC 29634-1355, USA
and a “central executive” that oversees and controls the operation of WM (Baddeley
1999, 2001). More specifically, the visuospatial sketchpad is a system that holds and
manipulates visual and spatial images, which is useful in using imagery in learning
and in spatial problem solving. The phonological loop is a system for storing and
manipulating a limited number of sounds for a brief period of time, as in subvocal
This type of model of temporary memory has been challenged and modified over
the years, but remains widely used and remarkably useful: “it offers a simple and
coherent account of a relatively complex set of data” and has been “readily
applicable” to a variety of neuropsychological deficits involving speech, language
and memory (Baddeley 2001, p. 853). Evidence for a phonological loop component
in a multi-component WM system includes: the acoustic similarity effect (Conrad
and Hull 1964; Baddeley 1966; Larsen et al. 2000), the word-length effect (Baddeley
et al. 1975), little mutual interference of a visual task on verbal memory or vice versa
(Cocchini et al. 2002), and the disruption of recall by irrelevant spoken material
(Salame and Baddeley 1982). The last finding is particularly important for the
present study. From the perspective of Baddeley’s model, this study is mainly
concerned with the phonological loop, the type of WM responsible for the temporary
storage of speech-based information, and which permits subvocal articulation of
The phonological loop appears to be quite susceptible to negative effects of
speech and perhaps other sounds. Salame and Baddeley (1982, 1989) showed that
the immediate recall of visually presented digits was disrupted when the presentation
and recall occurred in the presence of irrelevant speech (IS), including speech in an
unfamiliar foreign language. In contrast, irrelevant non-speech noise did not
interrupt recall. Moreover, the IS effects occur even when subjects are instructed
to ignore the irrelevant auditory stimulation. These basic findings have been
replicated by more recent research (for example, Ellermeier and Zimmer 1997).
Explanation of the IS effect usually is based on a model of working memory in
which irrelevant spoken material gains access to the phonological loop and,
therefore, interrupts the rehearsal process. Speech sounds appear to access WM
directly and automatically and to interfere with the ability to retain acoustically
encoded material at the same time.
Consistent with this view, Salame and Baddeley also showed that the use of
articulatory mechanisms to generate simple repetitive speech impairs immediate
recall of visually presented lists. Visual memory, however, is not susceptible to this
articulatory suppression (Cocchini et al. 2002). Furthermore, the use of articulatory
suppression during presentation and recall of visually presented lists eliminates the
IS effect (Hanley 1997; Norris et al. 2004; Salame and Baddeley 1982). These
results are generally interpreted as follows: articulatory suppression disrupts the
normal use of WM, presumably by preventing the conversion of visual material into
verbal material for rehearsal within the phonological loop or by preventing rehearsal
itself. Together, these phenomena support the hypothesized construct of a
phonological loop and show the importance of subvocal speech on some temporary
memory tasks (see Baddeley 2001; Norris et al. 2004). The effect of IS cannot be
reduced with additional effort and, even after a long exposure, does not weaken (that
is, no habituation occurs) (Ellermeier and Zimmer 1997; Hellbrück et al. 1996). It is
important to note that the interference produced by irrelevant sounds is not an all-or-
none phenomenon but instead can vary with both memory task and acoustic
characteristics. Finally, IS produces “reliable impairment at the moderate intensity
levels people are commonly exposed to in everyday life” (Ellermeier and Hellbrück
1998, p. 1406), and there appears to be little if any effect of loudness levels (Colle
1980; Ellermeier and Hellbrück 1998).
Speech, Music and Working Memory
A number of investigations indicate that tasks relying on the phonological loop may
also be disrupted by non-speech sounds. Early research on the effects of
noise on short-term memory seems inconsistent, with some studies finding no effects
(for example, Davies and Jones 1975; Salame and Baddeley 1983), some reporting
impaired recall (for example, Rabbitt 1968; Jones and Macken 1993; Salame and
Baddeley 1989), and some even reporting memory improvement (for example,
Wilding et al. 1982). Subsequently, other studies have been performed to examine
the effects of irrelevant speech, vocal music, and various non-speech sounds on
Knowledge of the auditory conditions that can impair WM is crucial for
understanding cognitive performance. If certain non-speech auditory patterns can
substantially impair WM, then a variety of environmental sounds may impair
cognitive performance. Understanding the auditory conditions that impair WM also
is crucial for evaluating models of temporary memory. For instance, Jones and
Macken (1993) proposed a changing-state model that suggested that the disruption
of WM performance by speech is mainly due to “the change in composition of sound
from one utterance to the next in the irrelevant stream” (p. 369). This model
predicted that speech and non-speech can be equivalent in disrupting recall because
the degree of the “state change” is what burdens the system, not the mere presence of
speech. LeCompte et al. (1997) tested this equipotentiality hypothesis, making slight
improvements (for example, the use of words rather than letters or nonsense
syllables) on some apparently supportive experiments by Jones and Macken (1993).
These experiments found that using words, especially meaningful speech, caused a
greater disruption of recall than did tones that mimicked some speech properties.
In general, studies comparing the effects of instrumental music and speech have
shown that speech is also significantly more disruptive than instrumental music (for
example, Salame and Baddeley 1989), but many unanswered questions about the
effects of instrumental and vocal music on WM remain. Two more recent studies
(Iwanaga and Ito 2002; Pring and Walker 1994) have tested working memory in the
presence of both instrumental and vocal music. In both of these studies, participants
completed WM tasks in the presence of irrelevant auditory stimuli (although
Iwanaga and Ito used a verbal memory task that involved the presentation of word
sequences instead of sequences of digits as used in many of the other studies,
including Pring and Walker 1994).
Pring and Walker (1994) used unvocalized music (nursery rhymes played
instrumentally without the vocal accompaniment) and traditional instrumental music.
Pring and Walker proposed that memories of the lyrics to nursery rhymes would be
activated when hearing the unvocalized music, leading to “obligatory access to
phonological short-term memory” (1994, p. 169), in turn causing interference with
WM. They found that musical analogues of nursery rhymes did have a more
detrimental effect on verbal working memory than did regular instrumental music.
Assuming the participants were familiar with the nursery rhymes used, this
difference supports their hypothesis that implied words (familiar lyrics) can impair
use of the phonological loop for WM. Participants in this study also were asked to
perform an articulatory suppression task; this reduced the difference between the
groups, supporting the interpretation of interference with a verbal WM component.
Nonetheless, their study leaves two important matters unresolved. First, with no
irrelevant speech condition, the degree of disruption of WM they found cannot be
compared to that of the irrelevant speech effect. Second, Pring and Walker did not
use a vocal music group for comparison. Thus, other properties of the nursery
rhymes (for example, similarity in rhythm or pitch variation) rather than the implied
linguistic content may be responsible for the disruption of WM (see below).
Iwanaga and Ito (2002) did use both vocal music and instrumental music in a
more recent study of both verbal and spatial memory tasks. Their results showed a
significant disturbance by music in verbal memory tasks, with the vocal music group
performing significantly worse than the instrumental music group. Consistent with
the conceptualization of a phonological loop, they found no significant impairment
on the spatial memory tasks. Their finding that vocal music caused greater verbal
WM impairment than instrumental music, however, is difficult to interpret since the
two types of music presented were different musical selections. Thus, any of several
variables (for example, complexity, rhythm, tempo, familiarity) instead of, or in
addition to, linguistic content may have contributed to reduced WM performance.
Research (for example, Jones and Macken 1993) has shown that auditory patterns
which resemble the dynamic patterns or complexity of speech can harm WM, raising
the level of concern with uncontrolled variables in these two studies of vocal, or
implied vocal, versus instrumental music. Unfortunately, no previous research has
compared WM performance with irrelevant music that either does, or does not,
contain vocals but which is otherwise equivalent. Without such matching musical
selections, one cannot determine whether the presence of lyrics affects performance
rather than some other factor. This is a weakness of the Pring and Walker (1994)
study which did reduce the number of uncontrolled variables by using instrumental
and the nursery rhyme selections played on the same instrument, but these selections
were not the same song.
Another common problem in previous research is the lack of a silent (no
interference) condition that would provide a standard for comparison for the
experimental groups. Lacking this condition, for instance, Pring and Walker (1994)
still showed that one type of instrumental music (a familiar nursery rhyme) is more
disturbing than another, but not how much disturbance each produces relative to
silence (or speech). Moreover, a silent condition can provide a standard of
performance (baseline) to ensure that there are no significant initial differences in
performance between the groups in a between-subjects design.
The current study will address these methodological deficiencies by examining
WM performance under four conditions: silence, vocal music, equivalent
instrumental music, and irrelevant speech. By presenting music backgrounds that either
include a vocal part or not, but that are otherwise equivalent, we will be able to test
whether it is really the presence of verbal material that causes the WM impairment.
Using both vocal music and instrumental music also will help to reveal which is
perceived to be, and which is actually, more disruptive. Even though the vocal music
contains ‘speech’ and is more complex, it may be that there is more of an actual or
perceived distraction when the lyrics are removed because, when the lyrics are
present, listeners do not need to search through memory for the correct lyrics.
The present study also will record each participant’s familiarity with the songs
that are played so that differences in familiarity can be ruled out or controlled as a
factor contributing to differences in performance. The ratings of familiarity also will
help determine whether there is a significant difference in the interference from
instrumental music between participants who are very familiar with the song’s lyrics
and those who are less familiar with the lyrics. This comparison may show that the
instrumental music is either more or less disruptive for the participants who are more
familiar with the lyrics. That is, for those who are more familiar with the lyrics, there
may be less impairment because there is less cognitive effort used to retrieve the
lyrics from long-term memory. On the other hand, more familiarity with the lyrics
may induce greater ‘contamination’ of the phonological loop, causing more
disruption of WM. A third reasonable expectation is that familiarity will have no
effect since, lacking actual linguistic content, instrumental music will have a
negligible effect on verbal working memory regardless of one’s knowledge of
Finally, this study will record participants’ impressions of the relative impairment
caused by each type of acoustic background, allowing analysis of the accuracy of
such judgments. Prior research (for example, Ellermeier and Zimmer 1997; Pearman
and Storandt 2005; Schmidt et al. 2001) suggests that people may be poor at judging
the adverse impact of various conditions on memory performance.
A within-subjects design was used wherein each participant was given seven trials of
a digit span task under each of four auditory conditions: silence, vocal music,
equivalent instrumental music, and irrelevant speech.
The participants were 61 students from a U.S. university. Many of these volunteers
received course credit for participating. The data for one volunteer was removed
because of a computer error during testing. The remaining 60 participants included
10 males and 50 females, with an average age of 18.6 years.
Working Memory Test
The working memory task presented to the participants was a digit span task similar
to the task used by Salame and Baddeley (1989). While a variety of working
memory tasks are available, this traditional task allows the clearest comparison of
results with those of the most relevant earlier studies and, in requiring retention of
the order of information presented, is a type of task with reliable sensitivity to
irrelevant speech (Salame and Baddeley 1982). For our digit span task, participants
were shown seven random sequences of ten digits each that were presented on a
computer screen at the rate of one digit per 0.8 s. [Previous studies, including
Salame and Baddeley (1982, 1989), have often used a nine-digit task but, on at least
one occasion, this has produced a ceiling effect in our lab.] Subjects were instructed
to remain silent while viewing the digits and then to write down the sequence of
digits in order of presentation, immediately following presentation. Participants were
given ample time (up to 20 s) to recall each sequence.
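The stimulus schedule described above can be sketched in a few lines of Python. This is only an illustration, not the authors' software: the helper names are hypothetical, and the assumption that digits were drawn independently (so repeats can occur) goes beyond the text, which says only that the sequences were random.

```python
import random

DIGITS_PER_SEQUENCE = 10   # from the text: ten digits per sequence
SEQUENCES_PER_BLOCK = 7    # from the text: seven sequences per condition
SECONDS_PER_DIGIT = 0.8    # from the text: one digit per 0.8 s
RECALL_LIMIT_SECONDS = 20  # from the text: up to 20 s for recall


def make_block(rng):
    """Return one block of random digit sequences (hypothetical helper)."""
    # Assumption: each digit is drawn independently, so repeats are allowed.
    return [
        [rng.randrange(10) for _ in range(DIGITS_PER_SEQUENCE)]
        for _ in range(SEQUENCES_PER_BLOCK)
    ]


def max_block_seconds():
    """Upper bound on block length if every recall uses the full time limit."""
    presentation = DIGITS_PER_SEQUENCE * SECONDS_PER_DIGIT
    return SEQUENCES_PER_BLOCK * (presentation + RECALL_LIMIT_SECONDS)


block = make_block(random.Random(2008))  # fixed seed only for repeatability
print(len(block), len(block[0]))         # 7 sequences of 10 digits
print(max_block_seconds())               # 196.0 seconds at most
```

The upper bound of 196 s is a little over 3 min, which agrees with the block duration given in the procedure.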
Each participant was presented with silence, irrelevant speech, vocal music, and
instrumental music. The irrelevant speech was an excerpt taken from a recording of
Northanger Abbey (Chapter 10; Austen 1982). The standard and karaoke versions of
two pop songs were used to make vocal and non-vocal (instrumental) stimuli.
Specifically, two recent pop songs familiar to most college students were used:
“When I’m Gone” by Three Doors Down and “I’m With You” by Avril Lavigne.
Both songs spent substantial time on world and U.S. singles charts and were highly
ranked not long before the present study was conducted.
Two different songs were used to prevent a familiarity effect that could occur
when hearing the same song twice (that is, with and without vocals) during testing.
Moreover, this design should strengthen the generalizability of the results as
compared to studies using a single music selection. For each of the instrumental
songs, background chorus vocals (when present) were removed using a computer
sound-editing program. This editing of the instrumental songs was done in a way
that preserved musical continuity. More specifically, the brief excerpts that were
removed were approximately 5 s in length, and were made so that the cuts were not
obvious and the music still flowed normally. The musical selections were looped, as
necessary, so that they would be long enough to fill the entire testing interval, then
recorded onto a CD-R disc.
Each participant listened to the excerpts through JVC Model HA-D610 headphones
that were connected to a CD player controlled by an experimenter. The auditory
stimuli began 3 to 5 s before each block, to prevent any effects that could result from
distraction or surprise when a new auditory excerpt began, and remained on during
each trial until the participant had completed their attempt to recall the digits.
The participants were divided into two groups of 30, so as to counter-balance the
order and version of the two musical selections. The participants in Group 1 were
presented with the vocal version of “When I’m Gone” before the instrumental
version of “I’m With You”. The participants in Group 2 were presented with the
instrumental version of “When I’m Gone” before the vocal version of “I’m With
You”. The participants were assigned in a quasi-random fashion to these two groups,
and tested individually.
The participants were given headphones and allowed to adjust the volume to a
comfortable level while listening to a sample of music that was different from the
two songs used in testing. Participants kept the headphones on throughout all four of
the background conditions, including silence.
All participants were tested under each of the four auditory conditions during a
single experimental session. The silent, speech, and two music conditions were
presented in a quasi-random fashion ensuring that each condition was placed either
first, second, third or fourth an equal number of times across the 60 participants.
Each block of sequences lasted approximately 3 min, depending on how much of the
20 s allowed were used for recall.
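One simple way to satisfy the position-balancing constraint described above is a cyclic Latin square. The sketch below is an assumption for illustration only: the text says the orders were assigned in a quasi-random fashion and does not state that a Latin square was used, and the function names are hypothetical.

```python
from collections import Counter

CONDITIONS = ["silence", "speech", "vocal music", "instrumental music"]


def cyclic_orders(conditions):
    """Build the rows of a cyclic Latin square (each row is one order)."""
    k = len(conditions)
    return [
        [conditions[(start + pos) % k] for pos in range(k)]
        for start in range(k)
    ]


def assign_orders(n_participants, conditions):
    """Give participant i the order in row i mod k of the square."""
    orders = cyclic_orders(conditions)
    return [orders[i % len(orders)] for i in range(n_participants)]


def position_counts(assignments):
    """Count how often each condition lands in each serial position."""
    counts = Counter()
    for order in assignments:
        for pos, cond in enumerate(order):
            counts[(cond, pos)] += 1
    return counts


assignments = assign_orders(60, CONDITIONS)
counts = position_counts(assignments)
print(set(counts.values()))  # {15}: every condition is in every position 15 times
```

A cyclic square balances serial position but not which condition precedes which, so a randomized or fully counterbalanced scheme may be preferable when carry-over between conditions is a concern.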
After completing all four conditions, participants were given a brief questionnaire
that asked them to rate their familiarity with the two songs on a five-point scale that
ranged from 1, “not familiar”, to 5, “very familiar”. This questionnaire also asked
participants to rate their knowledge of the lyrics for each song on five-point scales
that went from 1, “not well”, to 5, “knew them all”. Next, they were asked to rate the
perceived level of distraction from each of the three background stimuli on five-point
scales that went from “not distracting” to “very distracting”. Finally,
participants were asked to report their age and sex, and then debriefed.
Participants’ WM scores (digit spans) were determined by the number of digits that
were placed in the correct position. The overall mean for all participants in all
conditions was 5.1 digits.
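The scoring rule stated above (digits placed in the correct position) can be written as a short function. This is a sketch under assumptions: responses are compared position by position against the presented sequence, the per-condition score is taken as the mean over trials, and both function names are hypothetical.

```python
def score_recall(presented, recalled):
    """Count digits recalled in the same serial position as presented."""
    # zip stops at the shorter input, so missing trailing digits score zero.
    return sum(1 for p, r in zip(presented, recalled) if p == r)


def mean_span(trials):
    """Average the position-wise score over (presented, recalled) pairs."""
    scores = [score_recall(p, r) for p, r in trials]
    return sum(scores) / len(scores)


# Made-up example: the fifth and sixth digits are transposed at recall.
print(score_recall("4719305826", "4719035826"))  # 8
```

Under this rule a transposition of two adjacent digits costs two points, which is why strict serial scoring is sensitive to order errors as well as to omissions.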
The overall mean digit spans for all participants were calculated for each
condition and for each test group (see Table 1). These means were compared using
between-groups t-tests to determine whether there were significant differences
between Group 1 and Group 2. As shown in Table 1, there were no significant
differences between the two groups in any of the conditions. In addition, the rank
order of the conditions was the same for each group: silence, instrumental music,
irrelevant speech, and vocal music (in descending order of performance).
Consequently, the data from the two groups was combined for further analysis.
A graph of the overall means for each background condition is presented in
Fig. 1. This shows that performance was best for the silence condition, followed by
instrumental music, irrelevant speech, and vocal music. The overall means for each
background condition were compared using a one-way, within-subjects ANOVA,
Table 1 Mean digit spans for each background condition by group, mean difference between groups, and
p-values for independent samples t-tests comparing the groups

Condition            Overall means   Group 1   Group 2   Average difference   p
Silence              5.37            5.48      5.27       0.21                >0.46
Instrumental music   5.15            5.18      5.12       0.06                >0.86
Speech               5.02            4.95      5.09      −0.14                >0.69
Vocal music          4.81            4.59      5.03      −0.44                >0.08

revealing significant differences between the conditions, F(1, 59) = 6.95, p < 0.001,
Paired samples t-tests were performed between all of the conditions, revealing
three significant differences: Digit span was higher with silence than with speech, t =
2.7, p < 0.01, or vocal music, t = 4.41, p < 0.001, and higher with instrumental music
than with vocal music, t = −2.63, p = 0.011.
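For readers unfamiliar with the omnibus test, the following sketch shows how a one-way, within-subjects ANOVA partitions variance into condition, participant, and error terms. It is a textbook computation with uncorrected degrees of freedom, not a reconstruction of the authors' analysis; the function name and the tiny data set are made up for illustration.

```python
def rm_anova(scores):
    """One-way repeated-measures ANOVA.

    scores: one list per participant, with one value per condition.
    Returns (F, df_conditions, df_error) without any sphericity correction.
    """
    n = len(scores)      # participants
    k = len(scores[0])   # conditions
    grand = sum(sum(row) for row in scores) / (n * k)
    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means = [sum(row) / k for row in scores]

    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    # Error is what remains after removing condition and participant effects.
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)
    return f_value, df_cond, df_error


# Hypothetical scores for three participants under three conditions.
demo = [[1, 2, 3], [2, 3, 5], [3, 5, 6]]
print(rm_anova(demo))
```

With four conditions and 60 participants the uncorrected degrees of freedom from this computation would be 3 and 177, whereas the text reports F(1, 59); the source does not say which correction, if any, was applied.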
To determine whether a participant’s performance under one background
condition was predictive of performance under another, Pearson correlations were
calculated between all four conditions. As shown in Table 2, there were significant
(p<0.001) correlations between all of the conditions.
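The Pearson coefficient used here, and the t statistic commonly used to test it, can be computed as follows. This is a generic sketch with hypothetical function names, not the authors' analysis code; only the value r = 0.450 and the sample size of 60 come from the paper (Table 2 and the participant count).

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_to_t(r, n):
    """Convert a correlation to a t statistic with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Smallest coefficient in Table 2 (speech with instrumental), n = 60.
print(round(r_to_t(0.450, 60), 2))
```

For the smallest coefficient in Table 2 this gives a t of roughly 3.84 on 58 degrees of freedom, so even the weakest of the reported correlations is well clear of conventional significance thresholds.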
Despite the popularity of the two musical selections near the time of testing, there
was considerable variation in reported familiarity. Nearly half (47%) of the
participants gave the instrumental music condition a score of 4 or 5 (indicating
they knew almost all of the lyrics), while 32% gave the instrumental music
condition a score of 2 or 3 (indicating partial familiarity). Finally, 21% of the
participants said that they were not at all familiar with the lyrics. For the vocal song,
70% of the participants reported being very familiar with the lyrics (giving a score of
4 or 5), 20% said that they were partially familiar (score of 2 or 3) and 10% said they
did not know any of the lyrics.
Spearman correlations to determine whether familiarity with the lyrics of a song
was associated with performance while listening to that song revealed non-significant
correlations for both instrumental (r = 0.25, p > 0.05) and vocal (r = 0.127,
p > 0.1) versions of the songs. Additional Spearman correlations revealed
positive, but not significant, correlations between familiarity with the song lyrics and
Table 2 Pearson correlations calculated between background conditions

Condition      Vocal    Silence   Instrumental
Instrumental   0.610*   0.589*
Speech         0.546*   0.662*    0.450*

*p=0.001 (correlation is significant; two-tailed)

Fig. 1 Comparison of mean performance scores between background conditions, in
descending order, with standard error bars included (x-axis: Silence, Karaoke,
Speech, Vocal; y-axis: Mean Performance Score)

perceived distraction for both instrumental (r = 0.07, p > 0.5) and vocal (r
The average perceived distraction scores for speech, instrumental music, and
vocal music were compared, revealing that instrumental music was perceived to be
the least distracting (M=2.80) and vocal music the most distracting (M = 3.78), with
speech rated about mid-way between these two conditions (M = 3.25). As shown in
Table 3, there were significant differences in perceived distractibility between all
three of the conditions.
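The t values in Table 3 can be checked against the reported mean differences and standard deviations. The sketch below assumes that the SD column is the standard deviation of the paired differences and that all 60 participants contributed ratings; neither assumption is stated explicitly in the text, and the function name is hypothetical.

```python
import math

N_PARTICIPANTS = 60  # assumption: every participant rated every condition

# (mean difference, SD, reported t) as printed in Table 3
TABLE_3 = {
    "speech and vocal": (-0.53, 1.68, -2.46),
    "speech and instrumental": (0.45, 1.42, 2.46),
    "vocal and instrumental": (-0.98, 1.16, -6.58),
}


def paired_t(mean_diff, sd_diff, n):
    """Paired-samples t: mean difference divided by its standard error."""
    return mean_diff / (sd_diff / math.sqrt(n))


for label, (mean_diff, sd_diff, reported) in TABLE_3.items():
    computed = paired_t(mean_diff, sd_diff, N_PARTICIPANTS)
    print(f"{label}: computed t = {computed:.2f}, reported t = {reported}")
```

Under these assumptions the computed values are approximately −2.44, 2.45 and −6.54, close to the reported −2.46, 2.46 and −6.58; the small gaps are plausibly due to rounding of the tabled means and SDs.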
To determine whether there was a correlation between perceived distraction levels
and actual performance levels during the various background conditions, the speech,
instrumental, and vocal WM scores were correlated with their corresponding
distractibility ratings. No significant correlations were found.
The lack of a significant difference between the two test groups (see Table 1)
signifies that it made no significant difference which of the two musical selections
was presented as an instrumental version and which was presented as a vocal
version, or which song was presented first.
As expected, the participants performed best under the silent condition. In
comparison, the irrelevant speech condition significantly degraded performance.
This difference in the effect of silence versus irrelevant speech is in line with
previous studies on the irrelevant speech effect (for example, Baddeley and Salame
1982; Colle and Welsh 1976; Jones and Macken 1995; Salame and Baddeley 1982,
1983, 1989). This finding is also in line with predictions from a standard model of
WM, which predicts that phonologically processed stimuli will compete for the
limited processing of the phonological loop, thereby impairing its use for verbal
Adding support to this view, performance with a silent background also was
significantly better than with the other variety of irrelevant background language
used, vocal music, but not significantly better than with otherwise equivalent non-
vocal music. Previous research, including that by Salame and Baddeley (1989) and
Iwanaga and Ito (2002), also found that phonological WM was impaired by
irrelevant vocal music, and suggested that the linguistic aspect of this music caused
the impairment. However, it is possible that one or more of a large number of other
uncontrolled aspects of the auditory stimuli could have produced some or all of the
Table 3 Mean differences between perceived distractibility of background conditions, with SD and t

Comparison                Mean difference   SD     t       p
Speech and vocal          −0.53             1.68   −2.46   0.017
Speech and instrumental    0.45             1.42    2.46   0.017
Vocal and instrumental    −0.98             1.16   −6.58   0.000

All p values reflect two-way tests of significance

WM impairment observed; as noted above, even instrumental music presents
numerous acoustic variables that might alter cognitive performance. It is important to
note that the present study is the only published study using carefully matched vocal
and instrumental music. Our study clarifies the cause of the impairment produced by
vocal music by showing that non-vocal but otherwise equivalent versions of the
same music fail to produce significant impairment and allow significantly better
Contrary to one hypothesis, whereby instrumental versions of familiar vocal
songs could activate cognitive processing (for example, verbal retrieval) that
interferes with verbal memory more than words themselves, instrumental music
was less disruptive than vocal music. Given this performance and the finding that
participants typically reported good familiarity with the lyrics of the two pop songs
used, this pattern indicates that subvocal production of lyrics (or other activation of
the phonological loop) either does not occur automatically or does not present a
significant impediment to effective WM performance. Again, this is strong evidence
that the linguistic aspect of songs (or speech) is primarily responsible for declines in
performance. This finding fits nicely with Baddeley’s WM model in that language
should automatically activate the phonological loop, thereby reducing resources that
could be used for memory. In contrast, this finding appears to be problematic for
models of memory that treat irrelevant speech and irrelevant nonspeech sounds as
functionally equivalent in their effect on memory (Jones and Macken 1993; Jones
and Tremblay 2000). Even with this clarification, however, precise modeling of the
irrelevant speech effect remains unresolved (see Jones and Tremblay 2000; Neath
2000; Baddeley 2000, 2001).
Overall, the present study supports the theory that language that is either spoken
or sung gains access to the phonological loop, disrupting the processes of the
working memory. Meanwhile, instrumental music does not gain the same access,
even when it presents ‘implied’ vocals. Nonetheless, the intermediate WM
performance found with karaoke music in the background may reflect the occasional
use of phonological processing. This phonological processing may be activated by
‘implied’ words (that is, lyrics) or, on occasion, by musical patterns themselves. The
lack of a correlation between familiarity with a song’s lyrics and WM performance
argues against the former interpretation, however, since familiarity with the lyrics
that accompany instrumental music should be a prerequisite of such phonological
The combined results of the existing studies of WM leave the effect of
instrumental music somewhat unclear. It does appear safe to conclude that
instrumental music is not as harmful to WM as is vocal music or speech, but the
differences reported between silence and instrumental music appear inconsistent. On
the one hand, the present study found that performance in the presence of
instrumental music is not significantly worse than performance during silence.
Salame and Baddeley (1989; Experiment 2) found the same pattern when testing
practiced adults, but found that both instrumental and vocal music disrupted WM
performance in adults with less practice (Experiment 1), with greater impairment
occurring with vocal music. However, the musical selections changed across these
two experiments, confounding the interpretation of their results. Pring and Walker
(1994) found that instrumental versions of nursery rhymes caused a drop in WM
performance compared to another type of instrumental music. Unfortunately, this
study had no silent condition for comparison and did not control for potential effects
of differences in familiarity. Other studies using a silent background condition as
well as music (for example, Ellermeier and Hellbrück 1998) have reported an
intermediate effect of instrumental music. Overall, the existing results suggest that
instrumental music does not seriously impair verbal WM but may be worse than
silence for some people or under some conditions.
There were significant correlations between performance levels across all
background conditions (see Table 2) such that the WM score for one condition
could provide a good prediction of how a participant may score in a different
condition. Thus, while different background conditions may cause variation in
performance, the person’s underlying working memory capacity is still a major
determinant of performance.
This study also looked at the effects of familiarity with song lyrics on WM
performance with exposure to music. Familiarity with the lyrics did not prove to be
correlated with the relative impact of music on performance. Participants in the study
who were unfamiliar or only partially familiar with the music were affected by the
music in about the same way as those who reported knowing all or almost all of the
lyrics. This probably reflects the nature of the irrelevant speech effect wherein
linguistic stimuli affect performance even if the material or, indeed, even the
language, is completely unfamiliar. Many of the participants felt that they were much
more familiar with the lyrics of the vocal song than with those of the instrumental song. This
most likely reflects the presence of the actual lyrics in the vocal versions: hearing the
lyrics may have given some participants a false impression of
how well they knew them, a form of hindsight bias (Guilbault et al. 2004).
Nonetheless, it should be noted that part of the experience of listening to vocal
music is hearing the notes to which the words are sung and the style in which the
lyrics are presented. Removing the lyrics can remove such identifying information
and may eliminate or reduce a person’s perceived familiarity with a song. It may also
be argued that the non-vocal versions are less complex than those with vocals.
Comparison of the familiarity scores and distraction scores for the instrumental
and vocal music conditions revealed no significant correlations between how
familiar a person was with a song and how distracted a person felt when that song
was heard. The lack of significant correlations between these two subjective
measures shows that people who were not familiar with a song felt as distracted as
people who were familiar with the song. This finding is in line with the results noted
above showing that a person’s familiarity with the words of a song does not affect
The perceived distraction levels for the various background auditory conditions
varied significantly, with the instrumental music ranking as the least distracting and
the vocal music condition ranking as the most distracting. While participants may
have felt that the vocal music was most distracting, this subjective impression did
not correspond with actual performance. More generally, perceived distraction levels
did not accurately predict performance. It is interesting that the perceived distraction
levels were significantly different even between speech and vocal music and
between speech and instrumental music, conditions that did not differ significantly in
WM performance. Hence, participants seemed to believe that the various auditory
backgrounds affected performance but were not able to accurately assess their
relative impact on WM. Furthermore, this result indicates that, despite notable
consistency in expectations, expectancy effects were weak or absent. That is, the
results indicate that expectations about the level of impairment imposed by various
auditory backgrounds had little, if any, effect on WM performance.
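One way to check whether subjective ratings track actual impairment is a rank correlation between each participant's perceived distraction and that participant's drop in digit span relative to silence. The sketch below shows the calculation under stated assumptions: the function names (`average_ranks`, `spearman_rho`) and all values are hypothetical, the values are not data from this study, and the choice of Spearman's rho is ours rather than a method reported here.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: the Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


# Hypothetical values for one background condition (NOT data from this study).
perceived_distraction = [5, 3, 4, 2, 4, 1]  # self-rating, higher = more distracting
span_drop = [1, 2, 0, 1, 2, 0]              # digit-span loss relative to silence

rho = spearman_rho(perceived_distraction, span_drop)
print(f"Spearman rho = {rho:.2f}")
```

A coefficient near zero would correspond to the pattern described above, in which participants' impressions of distraction do not accurately predict their performance.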
The results from the present study are in line with the typical results of previous
studies (for example, Ellermeier and Zimmer 1997) in showing that people are poor
judges of how much their WM performance is affected by various background
noises. Our results parallel those of Iwanaga and Ito (2002) both in finding the
highest subjective ratings of disturbance for a vocal music condition (although
Iwanaga and Ito had no irrelevant speech condition), and in finding low correlations
between subjective disturbance and actual memory performance. The results of the
present study add to the existing findings in showing that irrelevant speech is
perceived to be less distracting than vocal music but more distracting than
instrumental music, and that this holds for carefully matched instrumental versus
While we believe the present study clarifies the relative impact of several types of
auditory environment on digit span, numerous questions remain about other auditory
stimuli and other WM tasks, as well as the degree and cause of individual differences
in sensitivity to WM disruption.
References
Austen, J. (1982). Northanger Abbey [book on CD; recorded by Flo Gibson]. Charlotte Hall, MD:
Baddeley, A. D. (1966). Short-term memory for word sequences as a function of acoustic, semantic and
formal similarity. Quarterly Journal of Experimental Psychology, 18, 362–365.
Baddeley, A. D. (1999). Essentials of human memory. East Sussex, UK: Psychology.
Baddeley, A. D. (2000). The phonological loop and the irrelevant speech effect: Some comments on Neath
(2000). Psychonomic Bulletin and Review, 7, 544–549.
Baddeley, A. D. (2001). Is working memory still working? American Psychologist, 56, 851–864.
Baddeley, A. D., & Salame, P. (1982). The unattended speech effect: Perception or memory? Journal of
Experimental Psychology: Learning, Memory, and Cognition, 12, 525–529.
Baddeley, A. D., Thomson, N., & Buchanan, M. (1975). Word length and the structure of short-term
memory. Journal of Verbal Learning and Verbal Behavior, 14, 575–589.
Cocchini, G., Logie, R. H., Della Sala, S., MacPherson, S. E., & Baddeley, A. D. (2002). Concurrent
performance of two memory tasks: Evidence for domain-specific working memory systems. Memory
& Cognition, 30, 1086–1095.
Colle, H. A. (1980). Auditory encoding in visual short-term recall: Effects of noise intensity and spatial
location. Journal of Verbal Learning and Verbal Behavior, 19, 722–735.
Colle, H. A., & Welsh, A. (1976). Acoustic masking in primary memory. Journal of Verbal Learning and
Conrad, R., & Hull, A. J. (1964). Information, acoustic confusion and memory span. British Journal of
Davies, D. R., & Jones, D. M. (1975). The effects of noise and incentive in short-term memory. British
Journal of Psychology, 66, 61–68.
Ellermeier, W., & Hellbrück, J. (1998). Is level irrelevant in ‘irrelevant speech’? Effects of loudness,
signal-to-noise ratio, and binaural unmasking. Journal of Experimental Psychology: Human
Perception and Performance, 24, 1406–1414.
Ellermeier, W., & Zimmer, K. (1997). Individual differences in susceptibility to the “irrelevant speech
effect.” Journal of the Acoustical Society of America, 102, 2191–2199.
Guilbault, R., Bryant, F., Brockway, J., & Posavac, E. (2004). A meta-analysis of research on hindsight
bias. Basic and Applied Social Psychology, 26, 103–117.
Hanley, J. R. (1997). Does articulatory suppression remove the irrelevant speech effect? Memory, 5, 423–
Hellbrück, J., Namba, S., & Kuwano, S. (1996). Irrelevant background speech and human performance: Is
there long-term habituation? Journal of the Acoustical Society of Japan, 17, 239–247.
Iwanaga, M., & Ito, T. (2002). Disturbance effect of music on processing of verbal and spatial memories.
Perceptual Motor Skills, 94, 1251–1258.
Jones, D. M., & Macken, W. J. (1993). Irrelevant tones produce an irrelevant speech effect: Implications
for phonological coding in working memory. Journal of Experimental Psychology: Learning,
Memory, and Cognition, 19, 369–381.
Jones, D. M., & Macken, W. J. (1995). Phonological similarity in the irrelevant speech effect: Within- or
between-stream similarity? Journal of Experimental Psychology: Learning, Memory, and Cognition,
Jones, D. M., & Tremblay, S. (2000). Interference in memory by process or content? A reply to Neath
(2000). Psychonomic Bulletin and Review, 7, 550–558.
Larsen, J. D., Baddeley, A., & Andrade, J. (2000). Phonological similarity and the irrelevant speech effect:
Implications for models of short-term verbal memory. Memory, 8, 145–157.
LeCompte, D. C., Neely, C. B., & Wilson, J. R. (1997). Irrelevant speech and irrelevant tones: The relative
importance of speech to the irrelevant speech effect. Journal of Experimental Psychology: Learning,
Memory, and Cognition, 23, 472–483.
Neath, I. (2000). Modelling the effects of irrelevant speech on memory. Psychonomic Bulletin and Review,
Norris, D., Baddeley, A. D., & Page, M. P. A. (2004). Retroactive effects of irrelevant speech on serial
recall from short-term memory. Journal of Experimental Psychology: Learning, Memory, and
Pearman, A., & Storandt, M. (2005). Self-discipline and self-consciousness predict subjective memory in
older adults. Journals of Gerontology: Series B: Psychological Sciences and Social Sciences, 60B,
Pring, L., & Walker, J. (1994). The effects of unvocalized music on short-term memory. Current
Rabbitt, P. M. A. (1968). Channel capacity, intelligibility and immediate memory. Quarterly Journal of
Salame, P., & Baddeley, A. D. (1982). Disruption of short-term memory by unattended speech:
Implications for the structure of working memory. Journal of Verbal Learning and Verbal Behavior,
Salame, P., & Baddeley, A. D. (1983). Differential effects of noise and speech on short-term memory. In
G. Rossi (Ed.), Proceedings of the 4th international congress on noise as a public health problem,
Vol. 2 (pp. 751–758). Milan: Centro Ricerche e Studi Amplifon.
Salame, P., & Baddeley, A. D. (1989). Effects of background music on phonological short-term memory.
Quarterly Journal of Experimental Psychology, 41A, 107–122.
Schmidt, I. W., Berg, I. J., & Deelman, B. G. (2001). Relations between subjective evaluations of memory
and objective memory performance. Perceptual & Motor Skills, 93, 761–776.
Wilding, J., Mohindra, N., & Breen-Lewis, K. (1982). Noise effects in free recall with different orienting
tasks. British Journal of Psychology, 73, 479–486.