Research Article
Learning Concepts and Categories: Is Spacing the “Enemy of Induction”?
Nate Kornell and Robert A. Bjork
University of California, Los Angeles
ABSTRACT—Inductive learning—that is, learning a new
concept or category by observing exemplars—happens
constantly, for example, when a baby learns a new word or
a doctor classifies x-rays. What influence does the spacing
of exemplars have on induction? Compared with massing,
spacing enhances long-term recall, but we expected spac-
ing to hamper induction by making the commonalities that
define a concept or category less apparent. We asked
participants to study multiple paintings by different ar-
tists, with a given artist’s paintings presented consecutive-
ly (massed) or interleaved with other artists’ paintings
(spaced). We then tested induction by asking participants
to indicate which studied artist (Experiments 1a and 1b) or
whether any studied artist (Experiment 2) painted each of
a series of new paintings. Surprisingly, induction profited
from spacing, even though massing apparently created a
sense of fluent learning: Participants rated massing as
more effective than spacing, even after their own test
performance had demonstrated the opposite.
The spacing effect refers to the nearly ubiquitous finding that
items studied once and revisited after a delay are recalled better
in the long term than are items studied repeatedly with no in-
tervening delay (e.g., Cepeda, Pashler, Vul, Wixted, & Rohrer,
2006; Dempster, 1996; Glenberg, 1979; Hintzman, 1974;
Melton, 1970). The positive effects of spacing on long-term re-
call are large and robust, and have been demonstrated in a va-
riety of domains, such as conditioning (even in animals as simple
as Aplysia; see Carew, Pinsker, & Kandel, 1972), verbal learning
(e.g., Bahrick, Bahrick, Bahrick, & Bahrick, 1993; Ebbinghaus,
1885/1964), motor learning (e.g., Shea & Morgan, 1979), and
learning of educational materials (e.g., Bjork, 1979; Dempster, 1988).
In many everyday and educational contexts, however, what is
important to learn and remember transcends specific episodes,
instances, and examples. Instead, it is most important to learn
the principles, patterns, and concepts that can be abstracted
from related episodes or examples. In short, educators often
want to optimize the induction of concepts and patterns, and
there are reasons to think that such induction may be enhanced
by massing, rather than by spacing. As stated by E.Z. Rothkopf
(personal communication, September 1977), “spacing is the
friend of recall, but the enemy of induction.”
There is a compelling logic behind Rothkopf’s assertion.
Massing allows one to notice the similarities between successive
episodes or exemplars, whereas spacing makes doing so more
difficult. Thus, for example, spacing presentations of individual
paintings by a given artist will make it more difficult to notice
any characteristics that define the artist’s style because spacing
increases the chances that those characteristics will be forgotten
between successive presentations.
The logic behind Rothkopf’s assertion is so compelling that, to
our knowledge, it has never been tested. Perhaps the most direct
evidence that massing facilitates induction comes from a study
by Kurtz and Hovland (1956), who asked participants to study
simple drawings that varied on four dimensions: size, shape,
position, and coloring. There were four categories of drawings
that participants learned to identify; for example, a “Kem” was
defined as a drawing containing a circle positioned near the top
of the display. Each category was presented eight times, with the
individual items either interleaved with items from the other categories (i.e., spaced) or massed together. No item was ever repeated exactly. On a memory test following the study phase, participants’ performance was better for drawings in the massed condition than for drawings in the spaced condition. Gagné (1950) obtained a similar result using four categories of nonsense-figure/nonsense-syllable pairs: Error rates were reduced when the highly similar category members were grouped together, instead of being interleaved.

Address correspondence to Nate Kornell, Department of Psychology, 1285 Franz Hall, UCLA, Los Angeles, CA 90095, e-mail: nkornell@

Volume 19, Number 6. Copyright © 2008 Association for Psychological Science

Less direct evidence comes from experiments that compared
exact and nonexact repetitions (i.e., verbatim repetitions vs.
paraphrased or gist repetitions). In such experiments, the spac-
ing effect appears to diminish or disappear altogether for non-
verbatim repetitions (e.g., Appleton-Knapp, Bjork, & Wickens,
2005; Dellarosa & Bourne, 1985; Glover & Corkill, 1987).
Similarly, Melton (1970) demonstrated that spacing effects do
not occur when participants fail to recognize that a repeated item
is a repetition. Given such findings, and given that inductive
learning involves exposure to a variety of different exemplars
and does not involve exact repetition, it seems possible that
spacing effects will disappear or turn into massing effects in
tasks requiring induction.
Finally, research in the domain of motor learning also pro-
vides indirect evidence that inductive learning may profit from
massing, rather than spacing. Learning a motor skill, such as a
tennis serve, involves induction in the sense that exposure to
one’s own proprioceptive feedback is an important component of
learning—and such repetitions are, by necessity, not exact,
especially for novices learning complex skills. Although spaced
practice is effective in many areas of motor learning (e.g., Shea &
Morgan, 1979), massed practice can be more effective for learn-
ing complex motor skills (Wulf & Shea, 2002).
Thus, there are both logical and empirical reasons to expect massing, not spacing, to facilitate induction. One goal of the experiments reported here was to investigate the size of the massing effect (that is, the advantage of massed study over spaced study) in an inductive-learning context relevant to educational practice.
Another goal was to investigate participants’ subjective as-
sessments of massed versus spaced study in the context of in-
duction. Prior research has demonstrated that people often rate
massing as more effective than spacing, even in contexts in
which spacing is actually superior (Baddeley & Longman, 1978;
Simon & Bjork, 2001; Zechmeister & Shaughnessy, 1980). Such
massing illusions may derive from the fact that metacognitive
judgments are often grounded in feelings of fluency (e.g., see
Benjamin, Bjork, & Schwartz, 1998). Presenting the same item
twice consecutively makes processing the second presentation
seem highly fluent, providing a (misleading) impression of learn-
ing, whereas spacing decreases the fluency of processing the
second presentation. In other words, massing provides a sense of
ease, which learners assume will translate to good memory on a
later test, whereas spacing is often a “desirable difficulty”
(Bjork, 1994) in the sense that it enhances long-term retention.
If massing produces a sense of fluency of induction, participants
may prefer massing to spacing in the induction task we used in
these experiments.
Experiment 1
In this experiment, participants were asked to learn the styles of
12 different artists by viewing six different paintings by each
artist. In Experiment 1a, spacing was manipulated within par-
ticipants: Paintings by each of 6 of the artists were presented
massed, and paintings by each of the other 6 artists were
presented spaced. In Experiment 1b, spacing was manipulated
between participants: For a given participant, the paintings
were presented either all massed by painter or all interleaved
(spaced). After the learning phase, participants were shown new
paintings by the same 12 artists and asked to select, from a list of
all the artists’ names, the artist who had painted each new
painting. After the test, participants in Experiment 1a were
asked what presentation condition, massing or spacing, they felt
had been more effective for learning a given artist’s style.
Method
Participants
The participants were University of California, Los Angeles,
undergraduates, who participated for course credit. There were
120 participants in Experiment 1a and 72 participants, 36 in
each condition, in Experiment 1b.
Materials
The materials were 10 paintings by each of 12 artists (Georges
Braque, Henri-Edmond Cross, Judy Hawkins, Philip Juras, Ryan
Lewis, Marilyn Mylrea, Bruno Pessani, Ron Schlorff, Georges
Seurat, Ciprian Stratulat, George Wexler, and YieMei). Six paint-
ings by each artist were presented during the study phase, and 4
more were presented during the test phase. All the paintings were
landscapes or skyscapes. We selected artists who would be rela-
tively unknown to the participants, although some of the paintings
by Braque and Seurat may have been familiar to some of the
participants (however, on the final test, average performance on
paintings by those two artists was not better than average perfor-
mance on paintings by all 12 artists). The paintings were cropped
to remove identifying characteristics such as names and signatures, if necessary, and then resized to fit into a 15- × 11-cm rectangle on a computer screen.
Procedure and Design
Participants were instructed about the nature of the study and
test phases and were then shown 72 paintings, 6 paintings by
each of the 12 artists. Each painting was shown for 3 s on a computer screen, with the last name of the artist displayed below.
In Experiment 1a, the paintings by each of six of the artists
were presented consecutively (massed), whereas the paintings by
each of the other six artists were intermingled with paintings by
other artists (spaced). The artists assigned to the massed and
spaced conditions were determined randomly for each partici-
pant. Each successive block of six paintings consisted of six
paintings by a given artist (massed, or M) or one painting by each
of the six artists (spaced, or S). The order of the blocks was
MSSMMSSMMSSM (see Fig. 1). In Experiment 1b, depending on
the condition to which a participant was assigned, either all of the
paintings were presented in the massed condition or all of the
paintings were presented in the spaced condition.
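The block structure described above for Experiment 1a can be sketched in a few lines of Python. This is only an illustrative reconstruction, not the authors' experiment software: the function name `build_study_sequence` and the trial representation are hypothetical, and the random ordering of paintings within a spaced block is an assumption, because the text does not say how paintings were ordered inside an S block.

```python
import random

# Block order reported for Experiment 1a: M = massed block, S = spaced block.
BLOCK_ORDER = "MSSMMSSMMSSM"
PAINTINGS_PER_ARTIST = 6


def build_study_sequence(artists, rng):
    """Return (trials, massed_artists, spaced_artists) for one participant.

    Each trial is an (artist, painting_index) pair.
    """
    if len(artists) != 12:
        raise ValueError("the design assumes exactly 12 artists")
    shuffled = list(artists)
    rng.shuffle(shuffled)  # random assignment of artists to conditions
    massed_artists, spaced_artists = shuffled[:6], shuffled[6:]

    trials = []
    next_massed = iter(massed_artists)
    spaced_blocks_done = 0
    for block_type in BLOCK_ORDER:
        if block_type == "M":
            # Six consecutive paintings by a single artist.
            artist = next(next_massed)
            trials.extend((artist, i) for i in range(PAINTINGS_PER_ARTIST))
        else:
            # One painting by each of the six spaced artists.
            block = [(artist, spaced_blocks_done) for artist in spaced_artists]
            rng.shuffle(block)  # assumption: order within a spaced block is random
            trials.extend(block)
            spaced_blocks_done += 1
    return trials, massed_artists, spaced_artists
```

Calling `build_study_sequence(names, random.Random(seed))` with 12 artist names yields 72 study trials, 6 per artist, with massed artists occupying whole blocks and spaced artists appearing once in each S block.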
At the end of the study phase, there was a 15-s distractor task,
during which participants counted backward by 3s from 547; the
test phase began when participants completed the distractor
task. On each test trial, an unfamiliar painting by one of the 12
artists was presented. Participants indicated who they thought
had created each painting by clicking their computer’s mouse on
1 of 13 buttons, 12 labeled with the names of the artists and 1
labeled “I don’t know.” After this response, feedback was provided: The word “correct” followed a correct selection, and the correct artist’s name was presented following an error.
There were 48 test trials divided into four blocks of 12
paintings. Each block consisted of one new painting by each of
the 12 artists, presented in random order. After the test phase in
Experiment 1a, participants were told the meanings of the terms
massed and spaced and asked, “Which do you think helped you learn more, massed or spaced?” They were given three response options: “massed,” “about the same,” and “spaced.” The same
question could not be asked in Experiment 1b because partic-
ipants did not experience both conditions.
Results
In marked contrast to our expectations, spaced study resulted in significantly better test performance than did massed study, as measured by the proportion of artists identified correctly on the test (Fig. 2). (Experiment 1b was conducted after we came up with a convoluted conjecture that mixing massed and spaced paintings in a single learning phase created a spacing effect in Experiment 1a.) The advantage of spacing was significant in both Experiment 1a, F(1, 119) = 77.35, p < .0001, ηp² = .39, and Experiment 1b, F(1, 70) = 15.63, p < .001, ηp² = .18. Not surprisingly, given that feedback was provided, test performance increased across test blocks (Experiment 1a: F(3, 357) = 26.99, p < .0001, ηp² = .18; Experiment 1b: F(3, 210) = 11.56, p < .0001, ηp² = .14). The interaction of presentation condition and test block was significant (Experiment 1a: F(3, 357) = 13.25, p < .0001, ηp² = .10; Experiment 1b: F(3, 210) = 3.33, p < .05, ηp² = .046). This interaction appears to reflect the large increase from the first to the second test block in the massed condition, which may have been a consequence of the first test block acting as an additional, spaced study opportunity that benefited previously massed items in particular. A planned comparison of performance during the first test block, which was, presumably, largely unaffected by the presence of feedback, showed that participants performed significantly better in the spaced condition than in the massed condition, in both Experiment 1a (M = .61, SD = .24 vs. M = .35, SD = .24), t(119) = 10.82, p < .0001, p_rep ≈ 1.00, d = 0.99, and Experiment 1b (M = .59, SD = .22 vs. M = .36, SD = .18), t(70) = 4.94, p < .0001, p_rep ≈ 1.00, d = 1.28.

Fig. 1. The first 12 paintings presented to 1 of the participants in Experiment 1a (the artists in each condition were determined randomly for each participant). The first 6 paintings (left column) were all by the same artist (massed, or M), and the next 6 paintings (right column) were all by different artists (spaced, or S). In total, there were 12 blocks of 6 paintings in the order MSSMMSSMMSSM. Therefore, in the spaced condition, a given artist was represented by 1 painting in each S block.

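The effect sizes reported with the F tests for Experiment 1 can be cross-checked from the F ratios and their degrees of freedom, using the standard identity that partial eta squared equals F × df_effect / (F × df_effect + df_error). The sketch below reads the reported statistics as, for example, F(1, 119) = 77.35 with a partial eta squared of .39. The function name and the tolerance are illustrative choices, not part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, reported partial eta squared)
REPORTED = [
    ("Exp. 1a condition", 77.35, 1, 119, 0.39),
    ("Exp. 1b condition", 15.63, 1, 70, 0.18),
    ("Exp. 1a test block", 26.99, 3, 357, 0.18),
    ("Exp. 1b test block", 11.56, 3, 210, 0.14),
    ("Exp. 1a interaction", 13.25, 3, 357, 0.10),
    ("Exp. 1b interaction", 3.33, 3, 210, 0.046),
]


def check_reported(rows, tolerance=0.006):
    """Return (label, computed, reported, agrees) for each reported effect."""
    results = []
    for label, f_value, df_effect, df_error, reported in rows:
        computed = partial_eta_squared(f_value, df_effect, df_error)
        agrees = abs(computed - reported) <= tolerance
        results.append((label, computed, reported, agrees))
    return results


for label, computed, reported, agrees in check_reported(REPORTED):
    print(f"{label}: computed {computed:.3f}, reported {reported}, agrees={agrees}")
```

All six recomputed values agree with the reported ones to within rounding, which is what one would expect if the reported F ratios were themselves rounded to two decimals.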
The advantage of spacing over massing is all the more sur-
prising given participants’ responses on the questionnaire ad-
ministered after the test. As Figure 3 shows, participants in
Experiment 1a judged massing to be more effective than spacing,
regardless of their performance in the two conditions. Overall,
78% of the participants did better with spaced presentations than
they did with massed presentations, but 78% of the participants
said that massing was as good as or better than spacing.
Discussion
The results of Experiments 1a and 1b pose two puzzles. First,
why did spacing, not massing, foster induction when there were
compelling reasons to expect otherwise? Second, why did par-
ticipants remain unaware that spacing was more effective than
massing, even after taking the test? With respect to the second
puzzle, we hypothesized that participants, while taking the test,
might not have remembered which artists had been presented in
which condition. To investigate this hypothesis, we presented 28
participants in Experiment 1a with a list of the artists’ names, and
asked them to indicate—after they had completed the test—how
each artist’s paintings had been presented (spaced or massed).
Accuracy on the identification task was significantly above chance for artists whose paintings had been presented spaced (M = .74, SD = .17), t(27) = 7.48, p < .0001, p_rep ≈ 1.00, d = 1.41, but not for artists whose paintings had been presented massed (M = .55, SD = .21), t(27) = 1.19, p = .25.
Participants’ inability to remember which artists’ paintings had
been presented massed suggests that participants often, if not
always, made their metacognitive judgments on the basis of their
subjective experience during the study phase.
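The comparison against chance in the identification task can be approximated from the summary statistics alone. The sketch below assumes a one-sample t test with 28 participants and a chance level of .50 (two response options, spaced or massed); the chance level and the helper names are assumptions rather than details stated in the text. Because the inputs are rounded means and standard deviations, the results match the reported values only approximately.

```python
import math


def one_sample_t(mean, sd, n, chance):
    """t statistic for a sample mean tested against a fixed chance level."""
    standard_error = sd / math.sqrt(n)
    return (mean - chance) / standard_error


def cohens_d_one_sample(mean, sd, chance):
    """Standardized distance of the sample mean from the chance level."""
    return (mean - chance) / sd


CHANCE = 0.5  # assumption: two response options (spaced or massed)
N = 28        # participants given the identification task

t_spaced = one_sample_t(0.74, 0.17, N, CHANCE)   # reported t(27) = 7.48
t_massed = one_sample_t(0.55, 0.21, N, CHANCE)   # reported t(27) = 1.19
d_spaced = cohens_d_one_sample(0.74, 0.17, CHANCE)  # reported d = 1.41

print(f"spaced: t = {t_spaced:.2f}, d = {d_spaced:.2f}")
print(f"massed: t = {t_massed:.2f}")
```

The small gap between the recomputed and reported t values for the massed artists is consistent with rounding of the mean to two decimals, since a shift of a few thousandths in the mean changes t noticeably when the effect is this close to chance.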
With respect to the puzzle of why spacing enhanced induction,
one possible explanation is that the test required recalling a given
artist’s name, not just knowing his or her style, and spacing facil-
itates recall. It seems possible that participants did indeed induce
an artist’s style more effectively in the massed condition than in the
spaced condition, but recalled the name associated with that style
better in the spaced condition. Experiment 2 was designed to test
that possibility by assessing participants’ recognition, not recall, so
that they did not need to remember name-style associations.
Experiment 2
The learning phase in Experiment 2 was identical to the learning phase in Experiment 1a. However, during the test phase, participants were given a style-recognition test. All of the tested paintings were new paintings, as in Experiment 1, but participants were asked only to categorize a given test painting as by a “familiar artist” (i.e., by an artist whose paintings had been presented during the study phase) or as by an “unfamiliar artist.” Thus, the test required remembering only studied artists’ styles, not their names.

Fig. 2. Proportion of artists selected correctly on the multiple-choice tests in Experiments 1a (top panel) and 1b (bottom panel) as a function of presentation condition (spaced or massed) and test block. Error bars represent standard errors. Axes: test block (horizontal) and proportion correct (vertical).

Fig. 3. Number of participants (out of 120) who judged massing as more effective than, equally effective as, or less effective than spacing in Experiment 1a. For each judgment, the number of participants is divided according to their actual performance in the spaced condition relative to the massed condition. Axes: judged effectiveness (horizontal) and number of participants (vertical), with bars divided by actual effectiveness.

Method
Participants
The participants were 80 undergraduate students at the University of California, Los Angeles, who participated for course credit.
Materials
The materials consisted of the same set of paintings used in
Experiment 1, plus, for each studied artist, an additional set of
four distractor paintings. Each distractor painting was chosen to
be stylistically similar to a studied artist’s paintings, and each
distractor was by a different artist (see Fig. 4).
Procedure
The study phase was exactly the same as in Experiment 1a, as
was the questionnaire at the end of the experiment. The only dif-
ference from Experiment 1a was in the test phase (and associ-
ated instructions).
During each trial of the test phase, a painting was presented
with two buttons on the computer screen; one button was labeled
“familiar artist,” and one was labeled “unfamiliar artist.” Participants were instructed to select the “familiar artist” button if they thought the painting was by an artist whose paintings had been presented during the study phase, and to select the “unfamiliar artist” button if they thought the painting was by an
artist whose paintings had not been presented during the study
phase. There were four test blocks, each of which included, for each of the 12 studied artists, one target painting and one distractor painting by a corresponding nonstudied artist, making a total of 24 paintings per block. No
feedback was given during the test.
Results
Recognition test trials are, inevitably, also learning events. A
side effect of falsely endorsing a painting as by a familiar artist
was that a participant might alter his or her concept of the fa-
miliar artist’s style by incorporating aspects of a painting by an
unfamiliar artist into that concept. On each successive test
block, the potential for contamination created by false alarms
grew, resulting in a significant decrease in recognition accuracy
across test blocks, F(3, 237) = 3.60, p < .05, ηp² = .04.
Therefore, to gain maximum leverage on the question of interest,
we restricted our analyses to the first test block, which provided
the purest measure of the learning that occurred during the study
phase of the experiment.
Again, we were surprised to find that performance in the
spaced condition was superior to performance in the massed
condition. As Figure 5 shows, the spaced and massed conditions
produced similar rates of false alarms (i.e., saying that a painting
by a nonstudied artist was by a studied artist), but the hit rate
(i.e., correctly categorizing a new painting by a studied artist as
by a familiar artist) was higher in the spaced condition (M = .77, SD = .22) than in the massed condition (M = .67, SD = .24), t(79) = 3.28, p < .01, p_rep = .98, d = 0.41. Consequently, there was a significant interaction between spacing condition and response type, F(1, 79) = 7.84, p < .01, ηp² = .09. There was also a main effect of response type, F(1, 79) = 177.82, p < .0001, ηp² = .69, with more hits than false alarms; this pattern
of results shows that participants could distinguish between the
target and distractor paintings. Thus, even in a situation that did
not require participants to recall name-style associations,
spacing led to more effective induction than did massing.
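The hit and false-alarm rates discussed above can be tallied from trial-level responses as in the following sketch. The `RecognitionTrial` record and the `familiar_rates` function are hypothetical names, and the example assumes that each distractor is scored under the presentation condition of the studied artist it was chosen to resemble, which is one reading of how false alarms could be compared across the spaced and massed conditions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RecognitionTrial:
    condition: str           # "spaced" or "massed" (condition of the associated studied artist)
    by_studied_artist: bool  # True for target paintings, False for distractors
    said_familiar: bool      # participant chose the "familiar artist" button


def familiar_rates(trials):
    """Proportion of 'familiar' responses per (condition, response type)."""
    counts = defaultdict(lambda: [0, 0])  # key -> [familiar responses, total trials]
    for trial in trials:
        kind = "hit" if trial.by_studied_artist else "false_alarm"
        key = (trial.condition, kind)
        counts[key][0] += int(trial.said_familiar)
        counts[key][1] += 1
    return {key: familiar / total for key, (familiar, total) in counts.items()}
```

With rates computed this way, the reported pattern corresponds to a higher `("spaced", "hit")` rate than `("massed", "hit")` rate, alongside similar false-alarm rates in the two conditions.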
Fig. 4. Examples of four target (a) and four associated distractor (b) paintings from the test phase of Experiment 2. The target
paintings were the same as those used in Experiment 1. The distractors were all by different artists, and each was selected to be
similar to the paintings of a given studied artist.
As in Experiment 1a, the participants’ metacognitive judg-
ments were strikingly at odds with their actual prior perfor-
mance. Of the 72 participants who did not say that learning in
the massed condition and learning in the spaced condition were
“about the same,” 64 thought massing had been more effective than spacing.
One possible explanation of the current findings is that schema induction happened early in the study phase. For example, the induction for each artist may have been “done” by the
third study trial, so that the next three study trials amounted to
either massed or spaced memory practice. To test this possi-
bility, we conducted an additional experiment. The test phase
was the same as in Experiment 2, but in the study phase, only two
paintings by each artist (instead of six) were presented. We
obtained the same pattern of results: There was a significant
benefit of spaced study, but participants thought massed study
had been more effective.
General Discussion
A common way to teach students about an artist is to show, in
succession, a number of paintings by that artist. Counterintuitive as it may be to art-history teachers (and to our participants), we found that interleaving paintings by different artists (spacing) was more effective than massing all of an artist’s paintings
together. A possible key to understanding the present findings
involves the relationship between induction and discrimination.
Induction and Discrimination
Experiment 1 required that participants discriminate among
different artists’ styles; that is, on the test, they had to decide
which artist, among the 12 studied artists, had painted a given
new painting. The interleaving of artists that was intrinsic to the
spaced condition might have fostered such discrimination. For
example, the key to deciding whether a tree is a maple or an
oak (or some other tree) is learning to appreciate the differences
among trees, not learning about a given type of tree in isolation.
Interleaving had the effect of juxtaposing different paintings and
therefore might have enhanced discrimination learning. (In fact,
we have presented no evidence regarding the effects of temporal spacing in the absence of interleaving, and it may be interleaving, not spacing itself, that is the key to enhancing inductive learning.)
With respect to this possibility, the following observation by Kurtz and Hovland (1956) seems relevant: “When the degree of discriminability is low, it might be expected that placing of instances from different concepts in juxtaposition would facilitate discrimination learning, whereas with greater discriminability, like that obtaining in the present study, the reverse might obtain” (p. 242). Thus, if discrimination is not difficult, as was the
case in Kurtz and Hovland’s experiment, massing may be ad-
vantageous, but if discrimination is difficult, as it was in our ex-
periments, spacing might be more effective.
This argument is appealing, but it is not entirely consistent
with the results from Experiment 2. The recognition test in Ex-
periment 2 required discriminating between paintings by pre-
viously studied artists and similar paintings by artists who had
not been studied; it did not require distinguishing among artists
whose work had been presented, and yet there was a benefit from spacing.
It could be argued that a by-product of being better able to
distinguish among the presented artists is being able to distin-
guish those artists, as a group, from other artists. It may be rare,
in fact, that a concept or category (such as what psychology is or
how to fly a kite) is ever learned without the need to discriminate
it from other categories (such as sociology or ways to make a kite crash).
Our results notwithstanding, there surely are situations in
which massing is more effective for induction than is spacing.
We attempted to create one such situation by asking participants
to figure out, and remember on a later test, the single word that
could be used to fill in the blanks in each of 12 sets of six words;
for example, in the case of _____ cracker, _____ wood, _____
side, _____ ant, _____ truck, _____ arm, the word to be gen-
erated and remembered was fire. The design was similar to the
design of Experiment 1a—that is, half of the sets that defined a
to-be-remembered word were presented massed, whereas the
other half were presented spaced—and 20 undergraduate par-
ticipants were tested. In this case, spacing made it nearly im-
possible to solve the problems, and, thus, later memory for the
target words was significantly better in the massed condition
than in the spaced condition (.34 vs. .22), t(19) = 2.78, p < .05, p_rep = .94, d = 0.65.
Admittedly, this simple experiment was contrived to be a situation in which massing, not spacing, would enhance the generation and memory of the critical words. The experiment demonstrates, however, that whether spacing is the friend or enemy of induction is a matter for sophisticated theorizing, because induction is a product of conceptual and memory processes that are open to multiple situational influences. The important point, though, is that in less contrived and more complex real-world learning situations, spacing appears to facilitate induction.

Fig. 5. Results from the recognition test in Experiment 2: proportion of paintings judged to be painted by a studied artist as a function of whether the artist had been studied (hits) or had not been studied (false alarms, or FAs), separately for the spaced and massed conditions. Only data from the first test block were analyzed and plotted here. Error bars represent standard errors. Axes: studied artists (hits) versus nonstudied artists (FAs) on the horizontal axis and proportion of “familiar” responses on the vertical axis.

Practical Implications
Inductive learning—that is, learning from examples—is a key
element of formal education, and of how humans (and other
animals) informally learn about the world. There are many in-
ductive-learning situations that would seem, from an intuitive
standpoint, to lend themselves to massed study, but may not.
Examples include a baby learning what chair means by ob-
serving people talking about chairs; an older child learning the
rules of a language, such as that most plural English words end
in s, by listening to people speak the language; a student in
school learning how words are spelled by reading them (as well
as through more direct instruction); a quarterback learning to
recognize a complex pattern of motion that predicts an inter-
ception by gaining experience in practice and during games; a
monkey learning to recognize the warning signs that another
monkey is acting threateningly by observing other monkeys’
behavior; and a medical student learning to recognize warning
signs of lung cancer by reading x-rays under an expert’s su-
pervision. Our results cannot necessarily be generalized to all of
these situations, of course, but they do suggest that in inductive-
learning situations, spacing may often be more effective than
massing, even when intuition suggests the opposite.
Our results also suggest that individuals responsible for the
design and evaluation of instruction that involves induction are
susceptible to being very misled by their own intuitions and
subjective experiences. Although prior experiments (Baddeley &
Longman, 1978; Simon & Bjork, 2001; Zechmeister & Shaugh-
nessy, 1980) have shown that people can experience an illusion
that massing is effective, we know of no experiment that can
match the current findings in terms of sheer inaccuracy of
judgments. In Experiments 1a and 2 combined, 85% of the
participants did at least as well in the spaced condition as in the
massed condition, but 83% of the participants rated the massed
condition as equally effective as or more effective than the
spaced condition. The illusion of effective learning in the massed
condition, based, apparently, on a sense of fluency of induction,
was clearly powerful in the experiments presented here. In real-
world inductive-learning tasks, therefore, it seems likely that
people will be heavily influenced by the illusory benefits of
massing when making decisions about their own learning or the
learning of their students or children. That is, most people are
likely to prefer massing in inductive-learning situations, but our
results suggest that they may do so at their own (and their stu-
dents’ and children’s) peril from a learning standpoint.
Looking back at our own inability to foresee the benefits of
spacing, perhaps we fell victim to the same illusion that we have
railed against (e.g., Bjork, 1994, 1999; Kornell & Bjork, 2007),
namely, the illusion that a sense of ease or fluency accompanies
effective learning, whereas a sense of difficulty signifies in-
effective learning. In the case of induction, as in many other
types of learning, spacing appears to be sometimes, if not al-
ways, a desirable difficulty (Bjork, 1994).
Acknowledgments—We thank Makah Leal and Timothy Wong
for their invaluable contributions to all facets of the experiments,
Elizabeth Bjork for her insights, and Katherine Huang and Jeri
Little for their help carrying out the experiments. Grant 29192G
from the McDonnell Foundation supported this research.
References
Appleton-Knapp, S., Bjork, R.A., & Wickens, T.D. (2005). Examining
the spacing effect in advertising: Encoding variability, retrieval
processes and their interaction. Journal of Consumer Research,
32, 266–276.
Baddeley, A.D., & Longman, D.J.A. (1978). The influence of length
and frequency of training session on the rate of learning to type.
Ergonomics,21, 627–635.
Bahrick, H.P., Bahrick, L.E., Bahrick, A.S., & Bahrick, P.E. (1993).
Maintenance of foreign language vocabulary and the spacing ef-
fect. Psychological Science,4, 316–321.
Benjamin, A.S., Bjork, R.A., & Schwartz, B.L. (1998). The mismeasure
of memory: When retrieval fluency is misleading as a metamne-
monic index. Journal of Experimental Psychology: General,127,
Bjork, R.A. (1979). An information-processing analysis of college
teaching. Educational Psychologist, 14, 15–23.
Bjork, R.A. (1994). Memory and metamemory considerations in the
training of human beings. In J. Metcalfe & A. Shimamura (Eds.),
Metacognition: Knowing about knowing (pp. 185–205).
Cambridge, MA: MIT Press.
Bjork, R.A. (1999). Assessing our own competence: Heuristics and
illusions. In D. Gopher & A. Koriat (Eds.), Attention and
performance XVII: Cognitive regulation of performance: Interaction
of theory and application (pp. 435–459). Cambridge, MA: MIT
Carew, T.J., Pinsker, H.M., & Kandel, E.R. (1972). Long-term
habituation of a defensive withdrawal reflex in Aplysia. Science, 175,
Cepeda, N.J., Pashler, H., Vul, E., Wixted, J.T., & Rohrer, D. (2006).
Distributed practice in verbal recall tasks: A review and
quantitative synthesis. Psychological Bulletin, 132, 354–380.
Dellarosa, D., & Bourne, L.E. (1985). Surface form and the spacing
effect. Memory & Cognition, 13, 529–537.
Dempster, F.N. (1988). The spacing effect: A case study in the failure
to apply the results of psychological research. American
Psychologist, 43, 627–634.
Dempster, F.N. (1996). Distributing and managing the conditions of
encoding and practice. In R. Bjork & E. Bjork (Eds.), Memory
(pp. 317–344). San Diego, CA: Academic Press.
Ebbinghaus, H.E. (1964). Memory: A contribution to experimental
psychology (H.A. Ruger & C.E. Bussenius, Trans.). New York:
Dover. (Original work published 1885)
Gagné, R.M. (1950). The effect of sequence of presentation of similar
items on the learning of paired-associates. Journal of
Experimental Psychology, 40, 61–73.
Glenberg, A.M. (1979). Component-levels theory of the effects of
spacing of repetitions on recall and recognition. Memory &
Cognition, 7, 95–112.
Glover, J.A., & Corkill, A.J. (1987). Influence of paraphrased
repetitions on the spacing effect. Journal of Educational Psychology,
79, 198–199.
Hintzman, D.L. (1974). Theoretical implications of the spacing effect.
In R.L. Solso (Ed.), Theories in cognitive psychology: The Loyola
symposium (pp. 77–97). Potomac, MD: Erlbaum.
Kornell, N., & Bjork, R.A. (2007). The promise and perils of
self-regulated study. Psychonomic Bulletin & Review, 14, 219–224.
Kurtz, K.H., & Hovland, C.I. (1956). Concept learning with differing
sequences of instances. Journal of Experimental Psychology, 51,
Melton, A.W. (1970). The situation with respect to the spacing of
repetitions and memory. Journal of Verbal Learning and Verbal
Behavior, 9, 596–606.
Shea, J.B., & Morgan, R.L. (1979). Contextual interference effects on
the acquisition, retention, and transfer of a motor skill. Journal of
Experimental Psychology: Human Learning and Memory, 5, 179–
Simon, D.A., & Bjork, R.A. (2001). Metacognition in motor learning.
Journal of Experimental Psychology: Learning, Memory, and
Cognition, 27, 907–912.
Wulf, G., & Shea, C.H. (2002). Principles derived from the study of
simple skills do not generalize to complex skill learning.
Psychonomic Bulletin & Review, 9, 185–211.
Zechmeister, E.B., & Shaughnessy, J.J. (1980). When you know that
you know and when you think that you know but you don’t.
Bulletin of the Psychonomic Society, 15, 41–44.