Jeffrey D. Karpicke and Henry L. Roediger III, The Critical Importance of Retrieval for Learning, Science 319, 966 (2008).
DOI: 10.1126/science.1152408
Downloaded from www.sciencemag.org on May 2, 2012
The Critical Importance of Retrieval for Learning
Jeffrey D. Karpicke* and Henry L. Roediger III
Learning is often considered complete when a student can produce the correct answer to a question. In our research, students in one condition learned foreign language vocabulary words in the standard paradigm of repeated study-test trials. In three other conditions, once a student had correctly produced the vocabulary item, it was repeatedly studied but dropped from further testing, repeatedly tested but dropped from further study, or dropped from both study and test. Repeated studying after learning had no effect on delayed recall, but repeated testing produced a large positive effect. In addition, students' predictions of their performance were uncorrelated with actual performance. The results demonstrate the critical role of retrieval practice in consolidating learning and show that even university students seem unaware of this fact.
Ever since the pioneering work of Ebbinghaus (1), scientists have generally studied human learning and memory by presenting people with information to be learned in a study period and testing them on it in a test period to see what they retained. When this procedure occurs over many trials, an exponential learning curve is produced. The standard assumption in nearly all research is that learning occurs while people study and encode material. Therefore, additional study should increase learning. Retrieving information on a test, however, is sometimes considered a relatively neutral event that measures the learning that occurred during study but does not by itself produce learning. Over the years, researchers have occasionally argued that learning can occur during testing (2–6). However, the assumptions that repeated studying promotes learning and that testing represents a neutral event that merely measures learning still permeate contemporary memory research as well as contemporary educational practice, where tests are also considered purely as assessments of knowledge.
Our goal in the present research was to examine these long-standing assumptions regarding the effects of repeated studying and repeated testing on learning. Specifically, once information can be recalled from memory, what are the effects of repeated encoding (during study trials) or repeated retrieval (during test trials) on learning and long-term retention, assessed after a 1-week delay? A second purpose of this research was to examine students' assessments of their own learning. After learning a set of materials under repeated study or repeated test conditions, we asked students to predict their future recall on the week-delayed final test. Our question was, would students show any insight into their own learning?
A final purpose of the experiment was to address another venerable issue in learning and memory, concerning the relation between the speed with which something is learned and the rate at which it is forgotten. Is speed of learning correlated with long-term retention, and if so, is the correlation positive (processes that promote fast learning also slow forgetting and promote good retention) or negative (quick learning may be superficial and produce rapid forgetting)? Early research led to the conclusion that quick learning reduced the rate of forgetting and improved long-term retention (7), but later critics argued that, when forgetting is assessed more properly than in the early studies, no differences exist between forgetting rates for fast and slow learning conditions (8, 9). By any account, conditions that exhibit equivalent learning curves should produce equivalent retention after a delay (9).
Using foreign language vocabulary word pairs, we examined the contributions of repeated study and repeated testing to learning by comparing a standard learning condition to three dropout conditions. The standard method of measuring learning, used since Ebbinghaus's research (1), involves presenting subjects with information in a study period, then testing them on it in a test period, then presenting it again, testing on it again, and so on. The dropout learning conditions of the present experiment differed from the standard learning condition in that, once an item was successfully recalled once on a test, it was either (i) dropped from study periods but still tested in one condition, (ii) dropped from test periods but still repeatedly studied in a second condition, or (iii) dropped altogether from both study and test periods in a third condition (Table 1).
Surprisingly, standard learning conditions and dropout conditions have seldom been compared in memory research, despite their critical importance to theories of learning and their practical importance to students (in using flash cards and other study methods). Dropout conditions were originally developed to remedy methodological problems that arise from repeated practice in the standard learning condition (10), but they can also be used to examine the effect of repeated practice in its own right, as we did in the present experiment. If learning happens exclusively during study periods and if tests are neutral assessments, then additional study trials should have a strong positive effect on learning, whereas additional test trials should produce no effect. Further, if repeated study or test practice after an item has been learned does indeed benefit long-term retention, this would contradict the conventional wisdom that students should drop material that they have learned from further practice in order to focus their effort on material they have not yet learned. Dropping learned facts may create the same long-term retention as occurs in standard conditions but in a shorter amount of time, or it may improve learning by allowing students to focus on items they have not yet recalled. This strategy is implicitly endorsed by contemporary theories of study-time allocation (11, 12) and is explicitly encouraged in many popular study guides (13).
Department of Psychological Sciences, Purdue University, West Lafayette, IN 47907, USA.
Department of Psychology, Washington University in St. Louis, St. Louis, MO 63130, USA.
*To whom correspondence should be addressed. E-mail:
Table 1. Conditions used in the experiment, average number of trials within each study or test period, and total number of trials in the learning phase in each condition. S_N indicates that only vocabulary pairs not recalled in the previous test period were studied in the current study period. T_N indicates that only pairs not recalled in the previous test period were tested in the current test period. Students in all conditions performed a 30-s distracter task that involved verifying multiplication problems after each study period.

Condition   Study (S) or test (T) period and number of trials per period   Total number
            1 (S)   2 (T)   3 (S)   4 (T)   5 (S)   6 (T)   7 (S)   8 (T)   of trials
ST          40      40      40      40      40      40      40      40      320
S_N T       40      40      26.8    40      8.0     40      2.0     40      236.8
ST_N        40      40      40      27.9    40      11.8    40      3.3     243.0
S_N T_N     40      40      27.1    27.1    8.8     8.8     1.5     1.5     154.8
15 FEBRUARY 2008 VOL 319 SCIENCE www.sciencemag.org 966
In the experiment, we had college students learn a list of foreign language vocabulary word pairs and manipulated whether pairs remained in the list (and were repeatedly practiced) or were dropped after the first time they were recalled, as shown in Table 1. All students began by studying a list of 40 Swahili-English word pairs (e.g., mashua-boat) in a study period and then testing over the entire list in a test period (e.g., mashua-?). All conditions were treated the same in the initial study and test periods. Once a word pair was recalled correctly, it was treated differently in the four conditions. In the standard condition, subjects studied and were tested over the entire list in each study and test period (denoted ST). In a second condition, once a pair was recalled, it was dropped from further study but tested in each subsequent test period (denoted S_N T, where S_N indicates that only nonrecalled pairs were restudied). In a third condition, recalled pairs were dropped from further testing but studied in each subsequent study period (denoted ST_N, where T_N indicates that only nonrecalled pairs were kept in the list during test periods). In a fourth condition, recalled pairs were dropped entirely from both study and test periods (S_N T_N). The final condition represents what conventional wisdom and many educators instruct students to do: Study something until it is learned (i.e., can be recalled) and then drop it from further practice.
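The four schedules described above can be sketched as a small function. This is an illustrative reconstruction rather than the authors' software: the names `CONDITIONS`, `trials_per_period`, and `not_yet_recalled` are hypothetical, and the labels S_N T, ST_N, and S_N T_N stand for the subscripted condition names.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding of the four conditions:
# (drop recalled pairs from study periods, drop recalled pairs from test periods)
CONDITIONS: Dict[str, Tuple[bool, bool]] = {
    "ST": (False, False),      # standard: full list studied and tested every period
    "S_N T": (True, False),    # only nonrecalled pairs restudied, full list tested
    "ST_N": (False, True),     # full list restudied, only nonrecalled pairs tested
    "S_N T_N": (True, True),   # recalled pairs dropped from both study and test
}


def trials_per_period(condition: str, n_items: int,
                      not_yet_recalled: List[float]) -> List[float]:
    """Return trial counts for alternating study and test periods.

    not_yet_recalled[k] is the number of pairs still unrecalled after the
    (k + 1)-th test period; it determines the next study/test period pair.
    """
    drop_study, drop_test = CONDITIONS[condition]
    # The initial study and test periods always use the full list.
    counts = [float(n_items), float(n_items)]
    for remaining in not_yet_recalled:
        counts.append(float(remaining) if drop_study else float(n_items))  # study
        counts.append(float(remaining) if drop_test else float(n_items))   # test
    return counts


if __name__ == "__main__":
    # Mean numbers of nonrecalled pairs taken from the S_N T_N row of Table 1.
    print(trials_per_period("S_N T_N", 40, [27.1, 8.8, 1.5]))
```

Feeding in the mean numbers of nonrecalled pairs reported for a condition reproduces that condition's row of per-period trial counts in Table 1; with per-subject data the same function would give each subject's schedule.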
At the end of the learning phase, students in all four conditions were asked to predict how many of the 40 pairs they would recall on a final test in 1 week. They were then dismissed and returned for the final test a week later. Of key importance were the effects of the four learning conditions on the speed with which the vocabulary words were learned, on students' predictions of their future performance, and on long-term retention assessed after a 1-week delay (14).
Figure 1 shows the cumulative proportion of word pairs recalled during the learning phase, which gives credit the first time a student recalled a pair. We also analyzed traditional learning curves (the proportion of the total list recalled in each test period) for the two conditions that required recall of the entire list (ST and S_N T), and the results by the two measurement methods were identical. Thus, we restrict our discussion to the cumulative learning curves on which all four conditions can be compared. Figure 1 shows that performance was virtually perfect by the end of learning (i.e., all 40 English target words were recalled by nearly all subjects). More importantly, there were no differences in the learning curves of the four conditions.
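The difference between the two measures can be made concrete with a short sketch. The function names and the four-item example list are hypothetical; only the definitions (credit on first recall versus proportion of the whole list recalled in each test period) come from the text.

```python
from typing import List, Set


def traditional_curve(recalled_by_period: List[Set[str]], n_items: int) -> List[float]:
    """Proportion of the total list recalled in each test period."""
    return [len(recalled) / n_items for recalled in recalled_by_period]


def cumulative_curve(recalled_by_period: List[Set[str]], n_items: int) -> List[float]:
    """Proportion of items recalled at least once up to each test period."""
    seen: Set[str] = set()
    curve = []
    for recalled in recalled_by_period:
        seen |= recalled  # an item earns credit the first time it is recalled
        curve.append(len(seen) / n_items)
    return curve


if __name__ == "__main__":
    # Made-up recall sets for a four-item list across three test periods.
    periods = [{"a"}, {"b", "c"}, {"a", "d"}]
    print(traditional_curve(periods, 4))  # [0.25, 0.5, 0.5]
    print(cumulative_curve(periods, 4))   # [0.25, 0.75, 1.0]
```

The traditional curve is only defined when the whole list is tested in every period, which is why it applies to two of the four conditions, whereas the cumulative curve can be computed for all of them.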
Given the similarity of acquisition performance, it is not too surprising that students in the four conditions did not differ in their aggregate judgments of learning (their predictions of their future performance). On average, the students in all conditions predicted they would recall about 50% of the pairs in 1 week. The mean numbers of words predicted to be recalled in each condition were as follows: ST = 20.8, S_N T = […], ST_N = 22.0, and S_N T_N = 20.3. An analysis of variance did not reveal significant differences among the conditions (F < 1).
Although students' cumulative learning performance was equivalent in the four conditions and predicted final recall was also equivalent, actual recall on the final delayed test differed widely across conditions, as shown in Fig. 2. The results show that testing (and not studying) is the critical factor for promoting long-term recall. In fact, repeated study after one successful recall did not produce any measurable learning a week later. In the learning conditions that required repeated retrieval practice (ST and S_N T), students correctly recalled about 80% of the pairs on the final test. In the other conditions in which items were dropped from repeated testing (ST_N and S_N T_N), students recalled just 36% and 33% of the pairs. It is worth emphasizing that, despite the fact that students repeatedly studied all of the word pairs in every study period in the ST_N condition, their long-term recall was much worse than that of students who were repeatedly tested on the entire list. Combining the two conditions that involved repeated testing (ST and S_N T) and combining the two conditions that involved dropping items from testing after they were recalled once (ST_N and S_N T_N), repeated retrieval increased final recall by 4 standard deviations (d = 4.03). The distributions of scores in these two groups did not overlap: Final recall in the drop-from-testing conditions ranged from 10% to 60%, whereas final recall in the repeated test conditions ranged from 63% to 95%. Whether students repeatedly studied the entire set or whether they restudied only pairs they had not yet recalled produced virtually no effect on long-term retention. The dramatic difference shown in Fig. 2 was caused by whether or not the pairs were repeatedly tested.
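For readers unfamiliar with the effect size d, the following is a minimal sketch of Cohen's d with a pooled standard deviation. The text does not say which variant of d was computed, so the pooled-SD formula is an assumption, the function name `cohens_d` is hypothetical, and the scores in the example are invented placeholders chosen only to fall inside the reported ranges; they are not the study's data.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Standardized mean difference using the pooled sample variance (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) +
                  (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


if __name__ == "__main__":
    # Invented final-recall proportions, NOT the study's data.
    repeated_test = [0.63, 0.75, 0.80, 0.85, 0.95]
    dropped_from_test = [0.10, 0.25, 0.35, 0.45, 0.60]
    print(cohens_d(repeated_test, dropped_from_test))
    # Non-overlapping distributions: the lowest score in one group
    # exceeds the highest score in the other.
    print(min(repeated_test) > max(dropped_from_test))
```

A d near 4 means the group means are about four pooled standard deviations apart, which is consistent with score distributions that do not overlap.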
Even though cumulative learning performance was identical in the four conditions, the total number of trials (study or test) in each condition varied greatly. Table 1 shows the mean number of trials in each study and test period and the total number of trials in each condition. The standard condition (ST) involved the most trials (320) because all 40 items were presented in each study and test period. The S_N T_N condition involved the fewest trials (154.8, on average) because the number of trials in each period grew smaller as items were recalled and dropped from further practice. The other two conditions (S_N T and ST_N) involved about the same number of trials (236.8 and 243.0, respectively) but because they differed in terms of whether items were dropped from study or test periods, they produced dramatically different effects on long-term retention. In other words, about 80 more study trials occurred in the ST_N condition than in the S_N T_N condition, but this produced practically no gain in retention. Likewise, about 80 more study trials occurred in the ST condition than in the S_N T condition, and this produced no gain whatsoever in retention. However, when about 80 more test trials occurred in the learning phase (in the ST condition versus the ST_N condition, and in the S_N T condition versus the S_N T_N condition), repeated retrieval practice led to greater than 150% improvements in long-term retention.
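The "about 80 more trials" comparisons follow directly from the per-period means in Table 1, as the sketch below shows. The assignment of condition labels to table rows is a reconstruction from the totals quoted in the text, and the names `TRIALS`, `totals`, and `extra_trials` are hypothetical.

```python
from typing import Dict, List

# Mean trials per period (periods 1-8) from Table 1; row labels are
# reconstructed from the totals quoted in the text.
TRIALS: Dict[str, List[float]] = {
    "ST": [40, 40, 40, 40, 40, 40, 40, 40],
    "S_N T": [40, 40, 26.8, 40, 8.0, 40, 2.0, 40],
    "ST_N": [40, 40, 40, 27.9, 40, 11.8, 40, 3.3],
    "S_N T_N": [40, 40, 27.1, 27.1, 8.8, 8.8, 1.5, 1.5],
}

# Total trials in the learning phase, rounded to one decimal as in the table.
totals: Dict[str, float] = {name: round(sum(row), 1) for name, row in TRIALS.items()}


def extra_trials(more: str, fewer: str) -> float:
    """Difference in total learning-phase trials between two conditions."""
    return round(totals[more] - totals[fewer], 1)


if __name__ == "__main__":
    print(totals)
    # Extra study trials (test schedule held constant within each pair).
    print(extra_trials("ST_N", "S_N T_N"), extra_trials("ST", "S_N T"))
    # Extra test trials (study schedule held constant within each pair).
    print(extra_trials("ST", "ST_N"), extra_trials("S_N T", "S_N T_N"))
```

Each of the four differences comes out between roughly 77 and 88 trials, so the contrasts compare similar amounts of additional practice that differ only in whether the extra trials were study trials or test trials.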
The present research shows the powerful effect of testing on learning: Repeated retrieval practice enhanced long-term retention, whereas repeated studying produced essentially no benefit. Although educators and psychologists often consider testing a neutral process that merely assesses the contents of memory, practicing retrieval during tests produces more learning than additional encoding or study once an item has been recalled (15–17). Dropout methods such as the ones used in the present experiment have seldom been used to investigate effects of repeated practice in their own right, but comparison of the dropout conditions to the repeated practice conditions revealed dramatic effects of retrieval practice on learning.
Fig. 1. Cumulative performance during the learning phase.
Fig. 2. Proportion recalled on the final test 1 week after learning. Error bars represent standard errors of the mean.
The experiment also shows a striking absence of any benefit of repeated studying once an item could be recalled from memory. A basic tenet of human learning and memory research is that repetition of material improves its retention. This is often true in standard learning situations, yet our research demonstrates a situation that stands in stark contrast to this principle. The benefits of repetition for learning and long-term retention clearly depend on the processes learners engage in during repetition. Once information can be recalled, repeated encoding in study trials produced no benefit, whereas repeated retrieval in test trials generated large benefits for long-term retention. Further research is necessary to generalize these findings to other materials. However, the basic effects of testing on retention have been shown with many kinds of materials (16), so we have confidence that the present results will generalize, too.
Our experiment also speaks to an old debate in the science of memory, concerning the relation between speed of learning and rate of forgetting (7–9). Our study shows that the forgetting rate for information is not necessarily determined by speed of learning but, instead, is greatly determined by the type of practice involved. Even though the four conditions in the experiment produced equivalent learning curves, repeated recall slowed forgetting relative to recalling each word pair just one time.
Importantly, students exhibited no awareness of the mnemonic effects of retrieval practice, as evidenced by the fact that they did not predict they would recall more if they had repeatedly recalled the list of vocabulary words than if they only recalled each word one time. Indeed, questionnaires asking students to report on the strategies they use to study for exams in education also indicate that practicing recall (or self-testing) is a seldom-used strategy (18). If students do test themselves while studying, they likely do it to assess what they have or have not learned (19), rather than to enhance their long-term retention by practicing retrieval. In fact, the conventional wisdom shared among students and educators is that if information can be recalled from memory, it has been learned and can be dropped from further practice, so students can focus their effort on other material. Research on students' use of self-testing as a learning strategy shows that students do tend to drop facts from further practice once they can recall them (20). However, the present research shows that the conventional wisdom existing in education and expressed in many study guides is wrong. Even after items can be recalled from memory, eliminating those items from repeated retrieval practice greatly reduces long-term retention. Repeated retrieval induced through testing (and not repeated encoding during additional study) produces large positive effects on long-term retention.
References and Notes
1. H. Ebbinghaus, Memory: A Contribution to Experimental
Psychology, H. A. Ruger, C. E. Bussenius, Transls. (Dover,
New York, 1964).
2. R. A. Bjork, in Information Processing and Cognition:
The Loyola Symposium, R. L. Solso, Ed. (Erlbaum,
Hillsdale, NJ, 1975), pp. 123–144.
3. M. Carrier, H. Pashler, Mem. Cognit. 20, 633 (1992).
4. A. I. Gates, Arch. Psychol. 6, 1 (1917).
5. C. Izawa, J. Math. Psychol. 8, 200 (1971).
6. E. Tulving, J. Verb. Learn. Verb. Behav. 6, 175 (1967).
7. J. A. McGeoch, The Psychology of Human Learning
(Longmans, Green, New York, 1942).
8. N. J. Slamecka, B. McElree, J. Exp. Psychol. Learn. Mem.
Cogn. 9, 384 (1983).
9. B. J. Underwood, J. Verb. Learn. Verb. Behav. 3, 112
10. W. F. Battig, Psychon. Sci. Monogr. 1(suppl.), 1 (1965).
11. J. Metcalfe, N. Kornell, J. Exp. Psychol. Gen. 132, 530
12. K. W. Thiede, J. Dunlosky, J. Exp. Psychol. Learn. Mem.
Cognit. 25, 1024 (1999).
13. S. Frank, The Everything Study Book (Adams, Avon, MA,
14. Materials and methods are available as supporting
material on Science Online.
15. J. D. Karpicke, H. L. Roediger, J. Mem. Lang. 57, 151
16. H. L. Roediger, J. D. Karpicke, Perspect. Psychol. Sci. 1,
181 (2006).
17. H. L. Roediger, J. D. Karpicke, Psychol. Sci. 17, 249
18. N. Kornell, R. A. Bjork, Psychon. Bull. Rev. 14, 219 (2007).
19. J. Dunlosky, K. Rawson, S. McDonald, in Applied
Metacognition, T. Perfect, B. Schwartz, Eds. (Cambridge
Univ. Press, Cambridge, 2002), pp. 68–92.
20. J. D. Karpicke, thesis, Washington University, St. Louis,
MO (2007).
21. We thank J. S. Nairne for helpful comments on the
manuscript. This research was supported by a
Collaborative Activity Grant of the James S. McDonnell
Foundation to the second author.
Supporting Online Material
Materials and Methods
Table S1
31 October 2007; accepted 12 December 2007
There is an ongoing transition in education from paper-based learning and testing to digital learning and testing. The purpose of the present research was to examine whether the relative effectiveness of digital and paper-based learning depends on the medium of testing in the context of foreign-language vocabulary learning. In a controlled experiment, young adults (N = 79) studied and practiced novel foreign-language vocabulary words using two study methods (restudying or retrieval practice) and were then tested on their memory for these words to assess learning. The study medium and the test medium were either congruent (i.e., paper-based learning and testing; digital learning and testing) or incongruent (paper-based learning and digital testing; digital learning and paper-based testing). The results revealed a study-test medium congruency effect: Paper-based learning yielded better test performance than digital learning when the test was conducted on paper, but this effect was eliminated when the test was digital. This effect may have important practical educational implications as it challenges common practices for vocabulary learning such as using digital tools to study vocabulary for on-paper memory tests.
Errorful learning suggests that, when perfect learning has not yet been attained, errors can enhance future learning if followed by corrective feedback. Research on memory updating has shown that after retrieval, memory becomes more malleable and prone to change. Thus, retrieval of a wrong answer might provide a good context for the incorporation of feedback. Here, we tested this hypothesis using sentences including pragmatic sentence implications, commonly used for the study of false memories. Across two experiments with young adults, we hypothesized that corrective feedback would be more efficient at reducing false memories if provided immediately after retrieval, when memory is more malleable than after being exposed to the material. Participants’ memory was assessed as a function of the type of learning task (Experiment 1: retrieval vs. restudy; and Experiment 2: active vs. passive recognition); and whether participants received corrective feedback or not. In both experiments, we observed that retrieval not only improved correct recall (replicating the testing effect) but also promoted the correction of false memories. Notably, corrective feedback was more effective when given after errors that were committed during retrieval rather than after restudy (Experiment 1) or after passive recognition (Experiment 2). Our results suggest that the benefits of retrieval go beyond the testing effect since it also facilitates false memories correction. Retrieval seems to enhance memory malleability, thus improving the incorporation of feedback, compared to the mere presentation of the information. Our results support the use of learning strategies that engage in active and explicit retrieval because, even if the retrieved information is wrong—when immediate feedback is provided—memory updating is promoted and errors are more likely to be corrected.
The study investigates how the test modality (spoken or written) of classroom weekly quizzes influences vocabulary learning strategies and facilitates learning the spoken and written knowledge of form‐meaning connection in L2 words. Japanese university students in academic English courses were assigned to two experimental conditions (spoken test and written test groups). The spoken test group prepared for and took weekly quizzes delivered in spoken format, whereas the written test group took the same quizzes delivered in written format. Over 10 weeks, learners were presented with the spoken or written forms of 20 English words and asked to provide the L1 translations of those words. Before and after the semester‐long treatment, 45 target words were tested in both spoken and written format via L2‐to‐L1 translation tasks. Additionally, a questionnaire survey on vocabulary learning strategies was conducted to examine how learners prepared for weekly quizzes outside of the classroom. Results revealed that learners in the spoken test group showed a significantly larger gain in spoken vocabulary than did the learners in the written test group. However, there was no significant difference between the two groups for written vocabulary learning. The spoken test group tended to rely on studying target vocabulary in a spoken form more frequently, whereas the written test group studied vocabulary in written form more frequently. This study provided implications for how teachers should administer classroom testing with the aim to develop learners' L2 spoken vocabulary knowledge effectively.
Memory retrieval allows us to reinstate previously encoded information but is also considered to contribute to memory enhancement. Retrieval-induced enhancement may involve processing to strengthen memory traces, but neural processing beyond reinstatement during retrieval remains elusive. Here, we show that hippocampal processing, different from memory reinstatement, exists during retrieval in the human brain. By tracking changes in the response patterns in the selected hippocampal and cortical regions over time during retrieval based on functional MRI, we found that the representation of associative memory in CA3/DG became stronger even after cortical memory reinstatement, while CA1 showed significant memory representation at retrieval onset with the cortical reinstatement, but not afterwards. This tendency was not observed in the condition without active retrieval. Moreover, subsequent long-term memory performance depended on the delayed CA3/DG representation during retrieval. These findings suggest that CA3/DG contributes to neural processing beyond memory reinstatement during retrieval, which may lead to memory enhancement.
We investigated how to optimize the effectiveness of retrieval‐based learning when the instructional text comprises seductive details (i.e., interesting but irrelevant text adjuncts). Specific questions during retrieval practice should help students focus their recall on main ideas ‐ and not on seductive details, which should in turn foster delayed posttest performance. In this experiment, participants (N=103) learned from an instructional text about coffee, either with or without seductive details; in subsequent retrieval practice, the participants received either unspecific or specific questions (2x2 between‐subjects design). One week later, all participants received a delayed posttest assessing learning outcomes. As expected, when the instructional text comprised seductive details, participants given specific questions during retrieval practice had better learning outcomes than those given an unspecific question. We conclude that retrieval tasks should be aligned with learning materials: more specific retrieval tasks are better for materials including irrelevant information. This article is protected by copyright. All rights reserved.
Background: Due to a nationwide shortage of anesthesia assistants, operating room nurses are often recruited to assist with the induction of obstetric general anesthesia (GA). We developed and administered a training program and hypothesized there would be significant improvements in knowledge and skills in anesthesia assistance during obstetric GA by operating room nurses following training with adequate retention at six months. Methods: Following informed consent, all operating room nurses at our institution were invited to participate in the study. Baseline knowledge of participants was assessed using a 14-item multiple choice questionnaire (MCQ), and skills were assessed using a 12-item checklist scored by direct observation during simulated induction of GA. Next, a 20-min didactic lecture followed by a ten-minute hands-on skills station were delivered. Knowledge and skills were immediately reassessed after training, and again at six weeks and six months. The primary outcomes of this study were adequate knowledge and skills retention at six months, defined as achieving ≥ 80% in MCQ and ≥ 80% in skills checklist scores and analyzed using longitudinal mixed-effects linear regression. Results: A total of 34 nurses completed the study at six months. The mean MCQ score at baseline was 8.9 (95% confidence interval [CI], 8.5 to 9.4) out of 14. The mean skills checklist score was 5.5 (95% CI, 4.9 to 6.1) out of 12. The mean comfort scores for assisting elective and emergency Cesarean deliveries were 3.6 (95% CI, 3.2 to 3.9) and 3.1 (95% CI, 2.7 to 3.5) out of 5, respectively. There was a significant difference in the mean MCQ and skills checklist scores across the different study periods (overall P value < 0.001). 
Post hoc pairwise tests suggested that, compared with baseline, there were significantly higher mean MCQ scores at all time points after the training program at six weeks (11.9; 95% CI, 11.4 to 12.4; P < 0.001) and at six months (12.0; 95% CI, 11.5 to 12.4; P < 0.001). Discussion: The knowledge and skills of operating room nurses in providing anesthesia assistance during obstetric GA at our institution were low at baseline. Following a single 30-min in-house, anesthesiologist-led, structured training program, scores in both dimensions significantly improved. Although knowledge improvements were adequately retained for up to six months, skills improvements decayed rapidly, suggesting that sessions should be repeated at six-week intervals, at least initially.
Desirable difficulties such as retrieval practice (testing) and spacing (distributed studying) are shown to improve long-term learning. Despite their knowledge about the benefits of retrieval practice, students struggle with application. We propose a mechanism of embedding desirable difficulties in the classroom called "retrieval-based teaching." We define it as asking students many ungraded, granular questions in class. We hypothesized that this method could motivate students to (1) study more and (2) increase the spacing of their studying. We tested these two hypotheses through a quasi-experiment in an introductory programming course. We compared 684 students' granular activities with an interactive eBook between the class discussion sections where the intervention was implemented and the control discussion sections. Over four semesters, there were a total of 17 graduate student instructors (GSIs) that taught the discussion sections. Each semester, there were five discussion sections, each taught by a distinct GSI. Only one of the five per semester implemented the treatment in their discussion section(s) by dedicating most of the class time for retrieval-based teaching. Our analysis of these data collected over four consecutive semesters shows that retrieval-based teaching motivated students to space their studying over an average of 3.78 more days, but it did not significantly increase the amount they studied. Students in the treatment group earned an average of 2.36 percentage points higher in course grades. Our mediation analysis indicates that spacing was the main factor in increasing the treated students' grades.
A powerful way of improving one's memory for material is to be tested on that material. Tests enhance later retention more than additional study of the material, even when tests are given without feedback. This surprising phenomenon is called the testing effect, and although it has been studied by cognitive psychologists sporadically over the years, today there is a renewed effort to learn why testing is effective and to apply testing in educational settings. In this article, we selectively review laboratory studies that reveal the power of testing in improving retention and then turn to studies that demonstrate the basic effects in educational settings. We also consider the related concepts of dynamic testing and formative assessment as other means of using tests to improve learning. Finally, we consider some negative consequences of testing that may occur in certain circumstances, though these negative effects are often small and do not cancel out the large positive effects of testing. Frequent testing in the classroom may boost educational achievement at all levels of education. In contemporary educational circles, the concept of testing has a dubious reputation, and many educators believe that testing is overemphasized in today's schools. By "testing," most commentators mean using standardized tests to assess students. During the 20th century, the educational testing movement produced numerous assessment devices used throughout education systems in most countries, from prekindergarten through graduate school. However, in this review, we discuss primarily the kind of testing that occurs in classrooms or that students engage in while studying (self-testing). Some educators argue
This paper was directed toward problems involved in the measurement of forgetting uncontaminated by differences in degree of learning. More particularly, it was concerned with these measurements when some variable, such as a characteristic of the task, is being manipulated and when such a variable produces differences in rate of learning. If we are to assess properly the influences of these variables on retention, degree of learning must be equated, since degree of learning and retention are directly related. The two basic situations considered were those in which a constant number of learning trials was given and those in which learning was carried to a specified criterion of performance. The single-entry technique is appropriate only when a constant number of learning trials is used. When a criterion of performance is set for learning another procedure (multiple-entry projection) may be used. Although the mean predictions are fairly accurate by this method, predictions for individual Ss are not. In most studies of retention it seems most efficient to use a constant-trials procedure for learning. Finally, it was pointed out that some studies of short-term retention of single items have probably confounded effects of degree of learning on retention with the effects of variables producing differences in rate of learning the items.
A theory is proposed to account for the functions of unreinforced test (T) trials in paired-associate learning: the forgetting-prevention effects as contrasted to blank (B) trials and the potentiating effects upon the subsequent reinforcement (R) trials. Both empirical effects were confirmed, through six experiments with a total of 384 valid subjects, under the three basic repetitive programming of R-T-B sequences of RT1 … Tm (Case 1), RB1 … Bm−1T (Case 2), and RTB1 … Bm−1 (Case 3), in which m was varied from 1 to 5. To solve the difficulties encountered in extant learning theories, the new model postulates active retrieval processes as unique theoretical functions of T trials. The processes did not change empirical response probabilities significantly over the m successive Ts within a replication, but resulted in increasing the effectiveness on the subsequent Rs. Consistently satisfactory quantitative analyses with respect to both empirical effects of Ts provided decisive grounds for support for the proposed test trial potentiating model.
People of all ages are more likely to choose to restudy items (or allocate more study time to items) that are perceived as more difficult to learn than as less difficult to learn. Existing models of self-regulated study adequately account for this inverse relation between perceived difficulty of learning and these 2 measures of self-regulated study (item selection and self-paced study). However, these models cannot account for positive relations between perceived difficulty of learning and item selection, which are demonstrated in the present investigation. Namely, in Experiments 1 and 2, the authors described conditions in which people more often selected to study items judged as less difficult than as more difficult to learn. This positive relation was not demonstrated for self-paced study, which was always negatively correlated with judged difficulty to learn. In Experiments 3 through 6, the authors explored explanations for this dissociation between item selection and self-paced study. Discussion focuses on a general model of self-regulated study that includes planning, discrepancy reduction, and working-memory constraints. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
From a much larger number, 1200 titles have been selected covering the period from the work of Ebbinghaus to and including the year 1930. Selection is on the basis of significance for learning as such, representativeness, and availability. Under Learning, the titles are grouped under fifty topics, and under Retention they are grouped under thirty-two. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
A comprehensive survey of the literature on human learning for advanced students and research workers in this area. Although certain organizational changes are made in the revision, the author has attempted to maintain Dr. McGeoch's (see 16: 4303) systematic position with regard to the increased factual knowledge and new emphasis in the field. Extensive chapter bibliographies. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Three experiments, with 276 undergraduates, examined how degree of learning affects normal forgetting. Exp I varied learning of categorized lists and tested retention at 3 intervals (immediately and 1 or 5 days after presentation). Across all measures, study trials affected intercepts but not slopes of forgetting functions. Exp II varied learning of paired-associate lists and tested retention at the same 3 intervals. Across all measures, trials influenced intercepts but not forgetting slopes. Exp III varied learning of sentence lists and tested verbatim and gist memory at the same intervals. Again, trials affected intercepts but not slopes. Results suggest that the forgetting of verbal lists is independent of their degree of learning. No current theories of memory predict these outcomes, but neither does the pattern of results disconfirm any theory. It is argued that present memory theorizing neglects almost entirely the central problem of normal forgetting. (36 ref) (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Tests not only measure the contents of memory, they can also enhance learning and long-term retention. We report two experiments inspired by Tulving’s (1967) pioneering work on the effects of testing on multitrial free recall. Subjects learned lists of words across multiple study and test trials and took a final recall test 1 week after learning. In Experiment 1, repeated testing during learning enhanced retention relative to repeated studying, although alternating study and test trials produced the best retention. In Experiment 2, recalled items were dropped from further studying or further testing to investigate how different types of practice affect retention. Repeated study of previously recalled items did not benefit retention relative to dropping those items from further study. However, repeated recall of previously recalled items enhanced retention by more than 100% relative to dropping those items from further testing. Repeated retrieval of information is the key to long-term retention.