Participation in Voluntary Re-quizzing Is Predictive of Increased Performance on Cumulative Assessments in Introductory Biology

Authors: Elise M. Walck-Shannon, Michael J. Cahill, Mark A. McDaniel, and Regina F. Frey
Center for Integrative Research on Cognition, Learning, and Education; Departments of Biology, Psychological and Brain Sciences, and Chemistry; Washington University in St. Louis, St. Louis, MO 63130
Submitted Aug 21, 2018; Revised Jan 17, 2019; Accepted Jan 23, 2019
DOI: 10.1187/cbe.18-08-0163
Address correspondence to: Regina F. Frey (gfrey@wustl.edu)

ABSTRACT
Low-stakes testing, or quizzing, is a formative assessment tool often used to structure
course work. After students complete a quiz, instructors commonly encourage them to
use those quizzes again to retest themselves near exam time (i.e., delayed re-quizzing).
In this study, we examine student use of online, ungraded practice quizzes that are re-
opened near exam time after a first graded attempt 1–3 weeks prior. We find that, when
controlling for preparation (performance in a previous science, technology, engineering,
and mathematics [STEM] course and incoming biology knowledge), re-quizzing predicts
better performance on two cumulative exams in introductory biology: a course posttest
and final exam. Additionally, we describe a preliminary finding that, for the final exam,
but not the posttest, re-quizzing benefits students with lower performance in a previous
STEM course more than their higher-performing peers. But unfortunately, these struggling
students are also less likely to participate in re-quizzing. Together, these data suggest that
a common practice, reopening quizzes for practice near exam time, can effectively
benefit student performance. This study adds to a growing body of literature that suggests
quizzing can be used as both an assessment tool and a learning tool by showing that the
“testing eect” extends to delayed re-quizzing within the classroom.
INTRODUCTION
Actively engaging students during course work is gaining traction across disciplines
and educational levels. However, promoting engagement using carefully crafted activ-
ities during class time is just one factor that contributes toward student learning.
Increased course structure—for example, providing students with consistent, frequent
formative assessments—is another important evidence-based contributor toward
effective learning (Freeman et al., 2011; Haak et al., 2011; Eddy and Hogan, 2014).
One type of formative assessment often used in highly structured courses is quizzing,
or low-stakes testing. Mounting evidence from the laboratory (Roediger and Karpicke,
2006; Karpicke and Roediger, 2008; Karpicke and Blunt, 2011) and the classroom
(McDaniel et al., 2012; Orr and Foster, 2013; Bjork et al., 2014; Trumbo et al., 2016)
has shown that quizzing does not just serve as an evaluation tool—learning actually
occurs during quizzing. This phenomenon is referred to as the “testing effect.” What is
not known, however, is whether there is added benefit to student learning if students
retake the quizzes for no additional points before exam time as a study tool. In this
study, we ask whether delayed re-quizzing before each exam (used as a study tool and
for no additional points) throughout the semester is associated with higher perfor-
mance on two types of cumulative course assessments in introductory biology and, if
so, whether this could simply be explained by greater self-
reported study times among students who retook quizzes.
The testing effect has been applied to classroom contexts
across the disciplines of psychology (Leeming, 2002; Trumbo
et al., 2016), statistics (Lyle and Crawford, 2011), medicine
(Larsen et al., 2009; Messineo et al., 2015), and biology (Orr
and Foster, 2013; Carnegie, 2015). In an introductory biology
classroom, Orr and Foster (2013) studied the use of low-stakes
(10% of course grade) quizzes and found that students who
participated in 100% of the available quizzes scored better on
course exams than students who participated in 0% of the quiz-
zes. In a large, introductory psychology course, Trumbo et al.
(2016) asked whether quizzing was more effective if it was
optional versus a course component. They found that, in both
cases, the number of quiz attempts was positively correlated
with course grades; however, these quiz attempts were all
during the initial learning and not given as a delayed study tool.
Classroom studies have also introduced new opportunities to
answer social and motivational questions about the testing
effect’s impact on performance. For example, as Brame and Biel
(2015) point out, it is still an open question whether the testing
effect is particularly beneficial for certain students, such as
those who are underprepared. We hope to add to this growing
literature base by asking whether delayed re-quizzing improved
student performance and whether these benefits were equal
across preparation levels.
To our knowledge, there have been no studies that have
implemented a delayed time course of no-stakes re-quizzing for
use as a study tool. In this study, we aimed to address whether
delayed re-quizzing sufficiently enhanced delayed recall of
course content as measured by two cumulative assessments.
Our study occurred in a large introductory biology classroom
and differed from previous studies in several ways. First, we
specified that our re-quizzing attempts would be delayed rela-
tive to both the initial learning and initial quiz attempt. Second,
the majority of the quiz questions used in this study were ran-
domly chosen from larger pools of highly related questions (see
Methods); hence, students could see slightly different versions
of questions in each of their attempts, and none of the questions
used on the quizzes were identical to any questions on either of
our outcome variables (i.e., our posttest or final course exam).
Third, we controlled for student preparation. Fourth, rather
than binning quiz participation into all or nothing, we treated
the delayed re-quiz participation as a continuous variable. We
hoped that these additional considerations would provide us
with a nuanced answer to whether delayed re-quizzing, used as
a study tool, was effective in a biology classroom context. This
study occupies a specific niche within both the biology educa-
tion and classroom quizzing literature by analyzing the correla-
tion between delayed re-quizzing and performance on exams
that contain completely novel items, while also accounting for
students’ previous preparatory experiences.
Theoretical Frameworks
The recent classroom applications of quizzing cited earlier were
informed by decades of laboratory studies of the testing effect,
which have identified factors that enhance or mitigate its
magnitude. Here, we will only mention those findings that are
relevant to this study; for a more comprehensive review written
for biology education researchers, see Brame and Biel (2015).
Providing students with feedback (i.e., whether their answer is
correct) enhances the testing effect (Butler and Roediger,
2008). In addition, the more times that a learner practices
retrieval for the same prompt, the greater the testing effect,
even after successful recall attempts (Roediger and Karpicke,
2006; Karpicke and Roediger, 2008). These laboratory studies
suggest that if our goal is to improve semester-long learning, we
should provide our students with low-stakes testing opportuni-
ties that are sufficiently difficult, occur throughout the semester,
and deliver feedback. However, as stated earlier, it is not yet
entirely clear whether delayed no-stakes re-quizzing, used as a
study tool, has added benefits within a classroom context.
A number of theoretical explanations of the testing effect
have been offered in the literature (McDaniel and Masson,
1985; Carpenter, 2009; Pyc and Rawson, 2010; Thomas and
McDaniel, 2013). Here, we present two theories that are most
relevant to the current study, wherein we examine the benefits
of delayed re-quizzing: the current theory of disuse (described
by Bjork and Bjork, 1992, as a modification of Thorndike,
1914), and the dual-memory model (Rickard and Pan, 2018).
The current theory of disuse posits that each memory has
both a storage strength (i.e., how well a student has learned an
item) and a retrieval strength (i.e., how accessible that item is
in the current context; Bjork and Bjork, 1992). A student’s abil-
ity to retrieve information on an exam would be largely related
to retrieval strength. While storage strength is theoretically
infinite and can increase as we learn more over time, retrieval
strength decreases over time as we forget items and are con-
fronted with extraneous information. However, practicing
retrieval (in this study, quizzing) has the potential to increase
retrieval strength and is especially potent when it is difficult
(Bjork, 1994). By having time lapse between students’ initial
learning of the information and retrieval practice (i.e., quiz first
attempt) and later retrieval practice (i.e., quiz retakes), we can
essentially make the situation “desirably difficult” for our stu-
dents. For this reason, we reopened quizzes for voluntary use
after a period of 1–3 weeks from initial learning and quizzing,
during which the retrieval strength of the content in memory
for many students will have presumably decreased.
The second theoretical framework that we considered is the
dual-memory model, which posits that testing and restudying
can actually form separate memories (Rickard and Pan, 2018).
According to the dual-memory model, initial learning or study
of a topic yields “study memories” that can be further strength-
ened by restudying. Retrieval practice, on the other hand,
encodes new “test memories” and strengthens existing “study
memories.” On the final test, students can draw from “study
memory,” “test memory,” or both. Therefore, we would expect a
student who has practiced retrieval to have more diverse
memories to draw from and to perform better. Notably, in the
laboratory, this dual-memory model provides quantitative pre-
dictions of the testing effect that could explain the differing
results of iterative “restudying” (i.e., rereading) and iterative
testing (i.e., re-quizzing consisting of fill-in-the-blank ques-
tions) on final test performance (Storm et al., 2014; Rickard
and Pan, 2018). Thus, we suspect that re-quizzing using some-
times identical but mostly slightly different items, as we have in
the current study, would strengthen “test memory” and encode
new “test memories,” which would lead to learning gains for
our students.
FIGURE 1. Flowchart of selection criteria for delayed re-quizzing analysis. Each step of
exclusion is shown horizontally, and numbers of students at each step are shown in bold.
The students represented by the gray (top) box were included for the next step, while the
students represented by the white (bottom) box were excluded from the next step. The
hierarchical regression models reported include the students represented by the box with
a thick red line (n = 310). The sample for the study time estimates was further limited by
whether or not a student voluntarily self-reported times, as described in the text.
Experimental Questions
Given the overwhelming evidence for the testing effect, both in
the classroom and in the laboratory, we wanted all of our stu-
dents to have some benefit of quizzing. Therefore, we required
all students to take one attempt on a weekly quiz, which con-
tributed toward a small part of the course grade. Close to each
exam time, the relevant quizzes were reopened for unlimited,
ungraded, completely optional practice. This was essentially a
“no-stakes” reassessment after a “low-stakes” assessment. Then,
at the end of the semester, we measured performance using two
different cumulative assessments. Using this design, we aimed
to answer the following research question:
Within a large-enrollment, introductory biology classroom,
do students who participate in delayed re-quizzing tend to
perform better on cumulative assessments than those who
do not participate in re-quizzing when controlling for prepa-
ration? And, do students with differing levels of preparation
benefit equally from re-quizzing?
On the basis of the aforementioned lab and classroom find-
ings, we hypothesized that the number of quizzes retaken in a
delayed manner may predict higher performance on cumula-
tive assessments (i.e., a course posttest and the final course
exam). However, we did not have an a priori hypothesis about
whether preparation would interact with quizzing. Prior studies
have found an equal testing effect over a range of academic
abilities (Orr and Foster, 2013), as measured by performance
on an earlier course exam.
METHODS
Context and Participants
This study was performed in a large-enrollment (n = 646)
Introductory Biology I course at a selective, private institution
in the Midwest and has been approved by our internal review
board (IRB ID#: 201807055). This course covers basic bio-
chemistry, cell signaling, replication, gene expression, and
molecular techniques, and it is the first semester of a two-
semester sequence. It is typically taken by
students interested in life science majors
and/or with pre-medicine intentions. We
limited our analysis to students who gave
consent (93.3%, n = 603/646) and first-
year students who are in their second
semester of college (77.1%, n = 465/603).
We further limited the sample based on
availability of control and outcome vari-
able data, as described below (see Out-
come Variables and Selection of Control
Variables) (final n = 310, Figure 1).
According to registrar records, within this
final sample, 39.0% were white (n = 121);
41.0% were Asian (n = 127); and 17.4%
(n = 54) were either African American,
Hispanic, Native American, or Pacific
Islanders, which we grouped together into
underrepresented minorities. Ethnicity
data were unavailable for eight students.
A slight majority of the final sample were
women (59.4%, n = 184/310).
Quiz Development
To provide students with the opportunity for low-stakes quiz-
zing in this course, an internally funded educational specialist
(E.M.W.-S.), who supports faculty as they incorporate evi-
dence-based teaching techniques in the classroom, generated
quiz questions, which were iteratively revised in close collabo-
ration with the instructor team for the Introductory Biology I
course. We wrote our own quiz questions, rather than using a
publisher bank, because when such question banks were used
previously in this course, our students reported dissatisfaction
due to misalignment with course expectations. To ensure that
these questions were better aligned with the course content and
level of exam questions, we generated learning objectives for
the course, based on exams from previous years, then wrote
quiz questions that were aligned with those objectives. Each
quiz corresponded to a weekly study guide and ungraded prob-
lem set; thus, the quizzes were noncumulative.
The quiz questions were closed response (multiple choice,
multiple answer, matching, etc.) so that they could be auto-
graded by our learning management system, which was impera-
tive, given our large class size. About 20% of the exam questions
are a similar closed-response format. The quizzes were short, on
average 6.23 questions (minimum = 5, maximum = 7, SD =
0.725), and the majority of questions (90%, 72/80) were part of
a pool, from which questions were randomly selected for each
student. (The remaining 10% of questions were fundamental
vocabulary and conceptual comparisons, which were the same
for all students and were typically fill-in-the-blank format.) The
average number of quiz questions per pool was 4.94 (minimum
= 2, maximum = 10, SD = 2.109). The majority of pool questions
were different forms of a unified question stem. For example, a
student might be asked to determine the coding strand of DNA
during transcription. And in different versions, the DNA strands
are depicted in different orientations or a different strand is
being transcribed. To demonstrate to students that we valued the
quizzes, we included one question identical to a quiz question on
each of the midterm exams. However, for the tests used as our
outcome variables—the final course exam and the posttest—
none of the questions was identical to a quiz question.
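To make the pooling concrete, the sketch below shows one way such a pool structure could be represented and sampled. It is illustrative only: the pool names and variant labels are not the actual course items, and the course relied on the learning management system's built-in pooling rather than custom code.

    # Illustrative only: each pool holds interchangeable variants of one question stem,
    # and each student is served one randomly chosen variant per pool on a given quiz.
    pools <- list(
      coding_strand   = c("variant_A", "variant_B", "variant_C"),  # same stem, DNA drawn in different orientations
      promoter_region = c("variant_A", "variant_B"),
      start_codon     = c("variant_A", "variant_B", "variant_C", "variant_D")
    )

    set.seed(1)                                  # in practice, each student's draw differs
    quiz_for_student <- sapply(pools, sample, size = 1)
    quiz_for_student                             # one variant drawn from every pool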
Quiz Implementation and Analysis
All quiz attempts were given online through our learning man-
agement system. Each week, students were assigned a quiz to
take for credit (i.e., first attempt). The highest 11 (out of 13)
quizzes counted toward 10% of the total grade. The vast major-
ity of the students in our final sample completed all of the quiz-
zes required for the course (98.7%, n = 306/310). In the week
before exam time, quizzes related to that exam were reopened
to students for unlimited, untimed, ungraded practice (i.e.,
delayed no-stakes re-quizzing). Students received an email
announcement when the quizzes had been reopened for the
purpose of studying for the exam. Before exam 1, this opportu-
nity was also announced in class. Feedback during delayed
re-quizzing was provided immediately after completing each
attempt, whereas feedback after the graded first attempt was
released after the quiz due date to minimize answer sharing.
Feedback included the student’s answer, the correct answer,
and in some cases, an explanation for why a particular answer
was correct for a specific question. Because each midterm exam
was noncumulative, quizzes were only reopened once, right
before exam time. A total of 10 quizzes were reopened for prac-
tice during the semester: four before exam 1 (9 weeks before
the cumulative assessments), three before exam 2 (5 weeks
before the cumulative assessments), and three before exam 3
(10 days before the cumulative assessments).
For the purposes of this study, we analyzed only the ungraded
practice attempts (i.e., the delayed re-quizzing) and not first
attempts, which were required for the course grade. To deter-
mine whether a student retook a quiz for the purpose of study-
ing for an exam, we retrieved the quiz-attempt log from our
learning management system, then counted the number of
available delayed quizzes each student retook at least once,
which ranged from 0 to 10. For simplicity, in this study, we con-
sidered only whether a student retook each delayed quiz and
not the number of attempts per quiz or the score of each attempt
in the delayed opportunity.
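As a rough sketch of this bookkeeping (not the authors' actual code), the number of distinct delayed quizzes each student retook can be derived from an exported attempt log; the file name and column names (student_id, quiz_id, graded) are hypothetical.

    # Hypothetical attempt-log export: one row per quiz attempt.
    attempt_log <- read.csv("quiz_attempt_log.csv")

    # Keep only the delayed, ungraded practice attempts (not the required first attempts).
    practice <- subset(attempt_log, graded == FALSE)

    # Count the number of distinct quizzes each student retook at least once (0-10).
    retakes <- aggregate(quiz_id ~ student_id, data = practice,
                         FUN = function(q) length(unique(q)))
    names(retakes)[2] <- "n_requiz"

    # Students with no practice attempts are absent from 'retakes'; after merging with
    # the class roster, their missing counts would be set to 0.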
Outcome Variables
To assess the effectiveness of re-quizzing, we analyzed two cumu-
lative assessments: the course posttest and the final course exam.
Posttest. Our primary outcome measure was the course
posttest (Supplemental Material 1), which was a multi-
ple-choice measure developed by our instructor team to align
with the course goals and measure learning gains in Introduc-
tory Biology I. Students take this test within the first week of the
semester (“pre,” see Selection of Control Variables) and the last
week of the semester (“post”), which allows for within-subject
knowledge comparisons. The test was first developed 6 years
before the present study (2013) and has been iteratively
improved based on difficulty and discrimination measures from
student responses. The mean of the posttest was 26.74 (70.37%)
with an SD of 6.26 (full descriptive statistics are reported in
Supplemental Material 2). To verify that the test aligned with
the course focus, we qualitatively categorized questions by
topic and compared their distribution to the course syllabus.
Two raters categorized questions (interrater kappa was 0.94),
then met to discuss discrepancies until agreement was reached.
The distribution of these agreed-upon topic categorizations did
not significantly differ from the distribution of topics covered
during lecture as determined by the course syllabus (χ2 = 58.95,
p = 0.96), suggesting that the content coverage was appropriate
for this course. Because the internal reliability was within
acceptable range (Cronbach’s alpha for all items = 0.73), the
scores on all items were averaged together into a single score
for each student, which is reported here.
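A minimal sketch of these reliability and alignment checks is given below. It assumes an item-level score matrix read from a hypothetical file and uses illustrative topic counts and proportions (the real test has many more categories); the psych package supplies the alpha() function.

    library(psych)

    # Item-level posttest responses: one row per student, one column per item (hypothetical file).
    post_items <- read.csv("posttest_item_scores.csv")
    alpha(post_items)$total$raw_alpha        # Cronbach's alpha across all items

    # Goodness-of-fit check: do the posttest topic counts match the syllabus distribution?
    # Counts and proportions below are illustrative only.
    post_topic_counts <- c(biochemistry = 10, gene_expression = 16, cell_signaling = 12)
    syllabus_props    <- c(0.25, 0.45, 0.30)
    chisq.test(post_topic_counts, p = syllabus_props)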
Similar to the quizzes, the pre- and postcourse tests were
completed outside class on our learning management system.
Students received a small amount of credit for completing the
pre and posttests (low-stakes). While the accuracy of their
answers did not count toward the course grade, students were
told that “reasonable effort” would be required for credit and
were encouraged to view the posttest as an opportunity to
review course material and identify concepts for further study.
Final Course Exam. We also conducted secondary analyses, in
which the final course exam was used as the outcome variable.
The majority of this exam is cumulative and was written collab-
oratively by the instructor team. The exam was given in class
and accounted for 20% of the course grade (i.e., high-stakes).
The mean of the exam was 84.33 with an SD of 9.93 (full
descriptive statistics are reported in Supplemental Material 2).
The internal reliability was within acceptable range (Cronbach’s
alpha = 0.84). Because this is not a validated research instru-
ment and changes from year to year, we did not consider this to
be a primary outcome. However, after examining the effects of
delayed re-quizzing on the posttest, we conducted secondary
analyses to see whether the general pattern of effects also held
for the final course exam. One consideration in conducting these
analyses is that effects on the cumulative final exam, which
counts toward course grades, may resonate more with students
and instructors than effects on the ungraded posttest alone.
Selection of Control Variables
Potential Preparatory Control Variables. Given our experi-
mental questions, we aimed to control for student preparation.
To select which preparation variables to include, we looked at
the pairwise correlations between the primary outcome perfor-
mance variable (i.e., the posttest score) and the following avail-
able preparatory scores.
College Preparation. We examined two measures of college
readiness: ACT scores and Advanced Placement (AP) success in
science, technology, engineering, and mathematics (STEM). We
found a small but significant correlation between ACT compos-
ite score and posttest score (r(281) = 0.23, p < 0.0001). For our
AP success measure, we calculated the proportion of AP course
work from four STEM disciplines (biology, chemistry, calculus,
and physics) that a student successfully completed. A score of
4 or 5 on the AP exam was considered to be successful comple-
tion. Using this AP proportion measure, we found a significant
moderate correlation with biology posttest score (r(308) =
0.37, p < 0.0001; Cohen, 1992).
Incoming Biology Knowledge. To gauge students’ incoming biol-
ogy content knowledge, we used the course pretest, which con-
tains the same items as the posttest but is given during the first
week of class. Not surprisingly, we found a significant moderate
correlation between the pretest and posttest (r(308) = 0.43,
p < 0.0001).
Previous STEM Course Performance. Because first-year students
take Introductory Biology I in their second semester, we also
have scores on their previous STEM course work that could be
used as preparation variables. The vast majority of students take
the first semester of general chemistry in the semester preceding
Introductory Biology I. Owing to the limited overlap in content
of General Chemistry I and Introductory Biology I, we suspect
that a correlation would result from general cognitive and non-
cognitive skills (critical thinking, problem-solving skills, social
skills, persistence, etc.) that students are able to apply across
college-level STEM courses. (On the other hand, the biology pre-
test would include content-specific knowledge and cognitive
skills.) When using cumulative final exam scores from General
Chemistry I, we found a significant large correlation to biology
posttest score (r(308) = 0.50, p < 0.0001; Cohen, 1992).
Selected Preparatory Control Variables. As stated earlier
(and summarized in Supplemental Material 3A), we found the
highest correlation with posttest score and the two preparatory
control variables: a previous STEM course final exam score and
the biology pretest score. While it is straightforward that the
biology pretest represents incoming biology knowledge, we sus-
pected that previous STEM course work and college prepara-
tion measures may have represented somewhat overlapping
attributes.
To check this interpretation, we asked whether the propor-
tion of successful STEM AP course work and ACT composite
score added any unique explanatory power to a model predict-
ing posttest score. We used the following hierarchical approach
for this analysis. Once the biology pretest and first-semester
General Chemistry I final exam scores were incorporated into a
model predicting posttest scores, adding in both the AP STEM
proportion and the ACT composite score did not significantly
increase the explanatory power of the model (F(1, 278) =
0.2651, p = 0.7673, ΔR2 = 0.0013; Supplemental Material 3B).
Therefore, we limited our preparatory variables to biology pre-
test score and final exam score in a previous STEM course (Gen-
eral Chemistry I).
Potential Personality Control Variables. We also had access to
a subset of students’ personality traits, as measured by the Big
Five Inventory (Goldberg, 1993; John et al., 2008) one semes-
ter prior, but these were not significantly correlated with
posttest or final exam score (Supplemental Material 3A). Addi-
tionally, once the biology pretest and first-semester chemistry
final exam scores were incorporated into a model predicting
posttest scores, adding in conscientiousness did not signifi-
cantly increase the explanatory power of the model (F(1, 227) =
0.907, p = 0.3418, ΔR2 = 0.0023; Supplemental Material 3C).
Hence, we did not use these personality trait measures as
predictor variables in our models.
Final Sample. In summary, we decided to include the final
exam grade from a previous STEM course (General Chemistry
I) and the biology pretest scores as our control variables. There-
fore, students included in this study were first-year students
who had completed every question on the pre- and posttests
and for whom we had General Chemistry I and Introductory
Biology I final exam grades (Figure 1, n = 310), unless other-
wise noted.
Statistical Method
For this analysis, we used a form of multiple regression, a
method that has been suggested for use by biology education
researchers to more accurately estimate the effects in their stud-
ies (Theobald and Freeman, 2014). The method we used, called
hierarchical multiple regression, sequentially adds groups of
predictor variables to the statistical model so that one can
observe whether a specific predictor or group of predictor
variables significantly improves the model’s explanation of the
outcome variable when other predictor variables have already
been accounted for (such as preparation). Because we know
that all readers may not be familiar with this method, we insert
some explanatory details in the following sections.
Specifically, we report two main hierarchical models and
two post hoc models. The first model predicts the primary out-
come variable of posttest score and the second and post hoc
models predict the secondary outcome variable of final exam
score. These models incorporate preparation, delayed re-quiz-
zing, and preparation by delayed re-quizzing interactions as
predictor variables in steps 1, 2, and 3, respectively. Unless
otherwise noted, standardized β values are reported for each
predictor variable, which represent the number of standard
deviations of change in the outcome variable (i.e., posttest or
final exam score) that would result for every 1 SD change in the
predictor variable of interest. Thus, these standardized β values
allow us to weigh the importance of individual predictor vari-
ables toward explaining the outcome variable. An F-test was
performed for each hierarchical step to determine whether the
fit of the model was significantly improved by the addition of
the predictors included in that current step, relative to the pre-
vious step, more than is expected by chance. R was used for the
hierarchical regression analyses (R Core Team, 2013) and JMP
(v. 14; SAS Institute, Cary, NC) was used for all remaining
analyses.
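To make the procedure concrete, a minimal sketch in R of the three-step model for the posttest is given below. It is not the authors' code, and the data file and column names (posttest, chem_final, bio_pretest, n_requiz) are hypothetical.

    # Hypothetical analysis data set: one row per student.
    dat <- read.csv("requiz_analysis_data.csv")

    # Standardizing the outcome and predictors makes the fitted coefficients the standardized betas.
    vars  <- c("posttest", "chem_final", "bio_pretest", "n_requiz")
    dat_z <- as.data.frame(scale(dat[, vars]))

    step1 <- lm(posttest ~ chem_final + bio_pretest, data = dat_z)              # step 1: preparation only
    step2 <- update(step1, . ~ . + n_requiz)                                    # step 2: + delayed re-quizzing
    step3 <- update(step2, . ~ . + chem_final:n_requiz + bio_pretest:n_requiz)  # step 3: + interactions

    anova(step1, step2, step3)        # F-test: does each added step improve the fit beyond chance?
    coef(summary(step2))              # standardized beta values at step 2

    # Change in R2 contributed by steps 2 and 3.
    r2 <- sapply(list(step1, step2, step3), function(m) summary(m)$r.squared)
    diff(r2)

The same three-step sequence, with the cumulative final exam (or its closed- and free-response subscores) substituted as the outcome variable, corresponds to the models reported below.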
Self-Reported Study Time. Additionally, at six times through-
out the semester, we asked students to report the number of
hours that they spent studying in the previous week. Three of
these reports were on exam weeks and three were on non–
exam weeks. Reporting study time was completely voluntary
for students and was gathered on quiz first attempts. Students
were included in the analysis of study time if they made any
number of reports (1–3). If multiple reports existed, they were
averaged across exam or non–exam weeks. Numbers above 3
SD of the mean were excluded from the analysis.
Delayed Re-quizzing Attempt Times
To determine the mean duration of re-quizzing attempts, we
obtained raw attempt times from the quiz-attempt log. This
includes every delayed re-quizzing attempt for every student.
To account for re-quizzing attempts during which browsers
were left open for long periods, we excluded attempt times
greater than double the intended time (i.e., greater than
30 minutes), which represents <5% of all recorded attempts
(n = 222/5516).
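Continuing the attempt-log sketch above, the duration screen described here amounts to a simple filter; the duration_min column is hypothetical.

    # Keep delayed practice attempts only, then drop attempts left open for more than
    # double the intended quiz time (i.e., longer than 30 minutes).
    practice_times <- subset(attempt_log, graded == FALSE & duration_min <= 30)

    mean(practice_times$duration_min)                            # average duration per re-quizzing attempt
    total_per_student <- tapply(practice_times$duration_min,
                                practice_times$student_id, sum)  # total practice time per student
    mean(total_per_student)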
RESULTS
Participation in Delayed Re-quizzing throughout the
Semester Predicted Higher Scores on the Low-Stakes
Cumulative Posttest
We performed hierarchical multiple regression first to predict
our primary outcome measure: posttest scores. Our first hierar-
chical regression model was aimed at answering our research
question, namely, does delayed re-quizzing increase perfor-
mance on cumulative assessments when controlling for prepa-
ration and does re-quizzing help all preparation groups equally?
Therefore, we added our predictor variables into the model in
three steps: 1) preparation control variables, 2) delayed re-quiz-
zing participation, and 3) interaction terms between delayed
re-quizzing and preparation.
Model 1, Step 1: Preparation. The first step of our hierarchical
regression analysis predicted posttest score from two prepara-
tion variables, pretest score and final exam score in a previous
STEM course (General Chemistry I; see Methods section for jus-
tification). Despite having the same questions on the pretest
and posttest, in this case, the final exam score in a previous
STEM course (General Chemistry I; βstd = 0.404; step 1, Table 1)
was even more informative for predicting the biology posttest
score than the score on the biology pretest (βstd = 0.309; step 1,
Table 1). Together, these two preparation terms explained
33.5% of the variance in biology posttest score.
Model 1, Step 2: Delayed Re-quizzing. We next asked whether
adding in re-quizzing participation significantly improved the
explanatory power of the model that already accounted for stu-
dent preparation. Therefore, in step 2, we added to the model
the number of delayed quizzes that a student retook at least
once (0–10). For example, a value of 0 denotes that only the
first, graded attempts were completed during the semester and
a value of 10 denotes that a student retook every delayed quiz
available at least once after the first attempt. We found that
there was a significant increase in the explanatory power of our
model when adding in the delayed re-quizzing variable (F(1,
306) = 8.790, p = 0.003, ΔR2 = 0.018; see also step 2 of Table
1). When looking at the individual predictor terms, delayed
re-quizzing was significant (p = 0.0034) but less predictive than
either preparatory variable in predicting posttest score (βstd re-quiz
= 0.139 vs. βstd pretest = 0.329, βstd chem = 0.374; step 2, Table 1).
The unstandardized estimate, or the estimate in its original
units, of the delayed re-quizzing effect was 0.233. This means
that, on average, for every quiz that a student retook, he or she
scored 0.6% higher on the posttest (i.e., 0.233 gain out of a
total of 38 points). Therefore, on average, a student who retook
all 10 quizzes scored 6% higher on the posttest than a student
who retook none. Taken together, these results suggest that
delayed re-quizzing promoted a small but significant increase in
posttest scores among our students.
Model 1, Step 3: Delayed Re-quizzing Interactions with
Preparation. Next, we sought to determine whether this
delayed re-quizzing effect was equal across preparation levels.
Because we did not have an a priori prediction of whether
delayed re-quizzing would help students with differing incom-
ing biology knowledge (pretest score) or differing success in
previous STEM course work (chemistry final exam score), we
added both interaction terms into our model in step 3. This step
did not significantly increase the explanatory power of the
model (F(2, 304) = 2.421, p = 0.091, ΔR2 = 0.010; see also step
3, Table 1). The coefficient for the chemistry final exam by
re-quizzing interaction was significant, but because the overall
step did not improve the model, we refrain from interpreting
this interaction at this time.
In summary, even when controlling for preparation, delayed
re-quizzing was an effective learning tool for our students as
measured using a low-stakes biology posttest. And, according to
this analysis, there was no robust evidence that the influence of
delayed re-quizzing was dependent on student preparation.
Though the values differed slightly, these same patterns were
observed when the model also accounted for scores on the first,
graded quiz attempts (Supplemental Material 4A).
Participation in Delayed Re-quizzing throughout the
Semester Predicted Higher Scores on a High-Stakes
Cumulative Final Course Exam
Given the observed effect of delayed re-quizzing on posttest
scores outlined in Table 1, we sought to use a second outcome
variable to look for similar patterns. For this second statistical
model, we used the course final exam as the outcome variable rather than posttest score, but we used the same steps (preparation, delayed re-quizzing participation, and preparation by delayed re-quizzing) as our first model.

TABLE 1. Standardized β values for each predictor variable (βstd) and model-level statistics of hierarchical regression analysis that predicts biology posttest score^a

Standardized predictor coefficients^b
                                                           Step 1      Step 2      Step 3
Chemistry final exam score                                 0.404***    0.374***    0.353***
Biology pretest                                            0.309***    0.329***    0.329***
Number of quizzes retaken                                              0.139**     0.158***
Chemistry final exam score × number of quizzes retaken                             -0.107*
Biology pretest × number of quizzes retaken                                        0.011

Model-level statistics^c
R2                                                         0.335       0.353       0.363
R2adj                                                      0.331       0.347       0.353
ΔR2                                                                    0.018**     0.010

^a The following symbols indicate significance: *p < 0.05; **p < 0.01; ***p < 0.001; n = 310.
^b Please see the Methods section for a description of standardized β values.
^c For the model-level statistics, an F-test was performed to determine whether each step significantly improved the fit of the model relative to the previous step.
Model 2, Step 1: Preparation. First, we used the preparation
variables of final exam score in previous STEM course work
(General Chemistry I) and biology pretest score to explain
biology final exam performance. As in model 1 explaining the
biology posttest, in model 2, we found a greater effect of
chemistry final exam score on biology final exam performance
than biology pretest score (βstd = 0.565 vs. 0.265; step 1, Table
2). This large effect of chemistry final exam score makes
sense given the similar assessment types of these two vari-
ables (high-stakes traditional testing situations). We found
that these two preparatory factors, when taken together,
explained 48.0% of the variance in the final biology course
exam score.
Model 2, Step 2: Delayed Re-quizzing. Next, we added
delayed re-quizzing participation throughout the semester
into our model at step 2. This slightly, but significantly,
improved the model’s explanatory power in predicting the
biology final exam (F(1, 306) = 4.813, p = 0.029, ΔR2 = 0.008;
see also step 2, Table 2). Put in more absolute terms, the
unstandardized estimate of delayed re-quizzing was 0.240,
meaning that, on average, for every quiz that a student retook,
he or she scored 0.24% higher on the final exam (which is out
of a total of 100 points). Therefore, we would predict that a
student who retook all 10 quizzes would score 2.4% higher on
the final exam than a student who retook none. Further, we
found that the effect of delayed re-quizzing was much smaller
than the preparation variables (βstd = 0.090; step 2, Table 2).
Taken together, these results suggest that delayed re-quizzing
promoted a small but significant increase in high-stakes final
exam scores among our students.
Model 2, Step 3: Delayed Re-quizzing Interactions with
Preparation. Finally, we asked whether the effect of delayed
re-quizzing on final exam scores was equal across preparation
levels by adding the same two preparation by re-quizzing inter-
action terms into our model at step 3. Adding in these terms
significantly improved the explanatory power of our model
predicting final exam score (F(2, 304) = 6.251, p = 0.002, ΔR2 =
0.020; see also step 3, Table 2). Additionally, we found a nega-
tive interaction with re-quizzing and chemistry exam score; that
is, delayed re-quizzing was most helpful for students who previ-
ously struggled in a STEM course (General Chemistry I). But
there was not a significant relationship between incoming biol-
ogy knowledge (pretest score) and re-quizzing (step 3, Table 2).
To put this in more absolute terms, on average, a student who
scored in the 25th percentile for both the chemistry final exam
(75/120) and biology pretest (14/38) would earn 76.5% on the
biology final if they did not participate in re-quizzing but would
earn 82.0% on the biology final if they re-quizzed at every
opportunity. On the other hand, on average, a student who
scored in the 75th percentile for both the chemistry final exam
(97.25/120) and biology pretest (22/38) would earn 90.1% on
the biology final if they did not participate in re-quizzing and
would earn 90.5% on the biology final if they re-quizzed at
every opportunity. This can be seen graphically in Figure 2A, as
students participate more in re-quizzing (x-axis), there is less of
an effect across chemistry exam score (i.e., there is a decrease in
the slope across the panels in Figure 2A). Further, when we
looked back at a similar plot for the posttest (Figure 2B), the
pattern still existed, though it was not at the level of statistical
significance. Conversely, there was no significant difference in
the effectiveness of re-quizzing depending on students’ pretest
score (step 3, Table 2). Hence, when considering biology final
exam scores, delayed re-quizzing is more effective for students
who have struggled in previous STEM course work than those
who have been more successful.
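The absolute predictions quoted above come from evaluating the step-3 model at particular predictor values. A minimal sketch of how such predictions could be generated is below, assuming a step-3 model refit on raw (unstandardized) scores with the same hypothetical column names as in the earlier sketch, plus a hypothetical final_exam column; the percentile values are those given in the text.

    # Step-3 model on the raw scale (final exam percentage as the outcome);
    # 'dat' is the hypothetical data frame from the sketch in Statistical Method.
    raw_step3 <- lm(final_exam ~ chem_final + bio_pretest + n_requiz +
                      chem_final:n_requiz + bio_pretest:n_requiz, data = dat)

    newdata <- data.frame(
      chem_final  = c(75, 75, 97.25, 97.25),   # 25th vs. 75th percentile, chemistry final (out of 120)
      bio_pretest = c(14, 14, 22, 22),         # 25th vs. 75th percentile, biology pretest (out of 38)
      n_requiz    = c(0, 10, 0, 10)            # no re-quizzing vs. retaking all 10 delayed quizzes
    )
    predict(raw_step3, newdata = newdata)      # predicted final exam scores for the four profiles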
In summary, when controlling for preparation, delayed
re-quizzing was an effective learning tool, as measured by the
biology cumulative final exam scores. According to this analysis
predicting the high-stakes final, there was a significantly greater
benefit of delayed re-quizzing for students with lower final
exam scores in a previous STEM course (General Chemistry I).
Though the values differed slightly, we also observed a signifi-
cant interaction with preparation and delayed re-quizzing in a
model that also accounted for scores on the first, graded
attempts (Supplemental Material 4B).
TABLE 2. Standardized β values for each predictor variable (βstd) and model-level statistics of hierarchical regression analysis that predicts cumulative final exam score^a

Standardized predictor coefficients^b
                                                           Step 1      Step 2      Step 3
Chemistry final exam score                                 0.565***    0.545***    0.516***
Biology pretest                                            0.265***    0.277***    0.277***
Number of quizzes retaken                                              0.090*      0.115**
Chemistry final exam score × number of quizzes retaken                             -0.138**
Biology pretest × number of quizzes retaken                                        0.022

Model-level statistics^c
R2                                                         0.480       0.488       0.508
R2adj                                                      0.477       0.483       0.500
ΔR2                                                                    0.008*      0.020**

^a The following symbols indicate significance: *p < 0.05; **p < 0.01; ***p < 0.001; n = 310.
^b Please see the Methods section for a description of standardized β values.
^c For the model-level statistics, an F-test was performed to determine whether each step significantly improved the fit of the model relative to the previous step.
FIGURE 2. Scatter plots relating biology final exam score (A) or biology posttest score
(B) to the number of quizzes retaken during the semester. Each graph is split into four
panels that represent quartiles for the final exam score in a previous STEM course (General
Chemistry I). Blue line indicates best-fit, while the shaded area around it represents the
95% confidence interval for the fit.
Models 3 and 4: Post Hoc Investigation of Exam Question
Type. After observing a relationship between delayed re-quiz-
zing and biology final exam scores, we next decided to examine
whether the correlation was specific to
closed response (multiple choice, match-
ing, etc.) and/or free-response question
types as a post hoc analysis. To do so,
we calculated closed-response (37/100
points) and free-response (63/100 points)
subscores from each student’s final exam,
then performed the same hierarchical
regression analysis described earlier sepa-
rately with each subscore. We observed
complementary results when predicting
closed- and free-response final subscores.
When predicting closed-response sub-
scores (model 3, Table 3A), we found that,
even after accounting for student prepara-
tion (step 1), the addition of delayed
re-quizzing (step 2) significantly increased
the explanatory power of the model (F(1,
306) = 6.217, p = 0.010, ΔR2 = 0.012).
However, the addition of preparation
interactions into this model did not
improve its explanatory power (F(2, 304)
= 2.219, p = 0.110, ΔR2 = 0.009; Table 3A).
Put in more absolute terms, the unstan-
dardized estimate of re-quizzing was
0.119, meaning that we would predict that
a student who retook all 10 quizzes would
score 3.2% (1.19/37 total points) higher
on the closed-response question compo-
nent of the final exam than a student who
retook none.
On the other hand, when predicting
free-response subscores (model 4, Table
3B), we found that, after accounting for
student preparation (step 1), the addition
of delayed re-quizzing (step 2) did not sig-
nificantly increase the explanatory power
of the model (ΔR2 = 0.003; see Table 3B). Interestingly, however, the
addition of preparation and delayed
re-quizzing interactions (step 3) did sig-
nificantly increase the explanatory power
of this model (F(2, 304) = 7.276, p =
0.001, ΔR2 = 0.027). That is, delayed
re-quizzing tended to have the greatest
benefit in the biology free-response final
exam scores among students with the
lowest General Chemistry I scores, with
the delayed re-quizzing benefit decreasing
as students’ General Chemistry I scores
increased (see graph in Supplemental
Material 5). To put this in more absolute
terms, on average, a student who scored in
the 25th percentile for both the chemistry
final exam (75/120) and biology pretest
(14/38) would earn 77.4% (48.74/63) on
the free-response portion of the biology
final if they did not participate in delayed re-quizzing but would
earn 83.0% (52.27/63) if they re-quizzed as a study tool at
every opportunity; whereas, on average, the model would not
predict a meaningful difference for a student who scored in the
75th percentile for both the chemistry final exam (97.25/120)
and biology pretest (22/38). Such a student would be predicted
to earn 92.4% (58.21/63) on the free-response portion of the
biology final if they did not participate in delayed re-quizzing
and would earn 91.4% (57.57/63) if they re-quizzed as a study
tool at every opportunity.
Together, these results recapitulate the results from the
entire final, but suggest that the effects of delayed re-quizzing
are unique for different question types: delayed closed-
response re-quizzing significantly predicted higher closed-
response final exam scores regardless of success in previous
course work; whereas, for the free-response questions on the
biology course final, delayed re-quizzing particularly benefited
students who struggled in previous course work.
Success in Previous STEM Course Work Is Correlated with
Participation in Delayed Re-quizzing
Given this potential pattern that delayed re-quizzing may be
more helpful for students who have struggled in the previous
chemistry course, we wanted to investigate whether such stu-
dents were taking advantage of the opportunity. Therefore, we
looked at the number of quizzes that students retook in relation
to their chemistry final exam grade and found a weak but posi-
tive correlation (pairwise r(310) = 0.170, p = 0.002). In other
words, students who were more successful in a previous STEM
course were more likely to take advantage of the delayed
re-quizzing. This suggests that, despite the potential trend that
delayed re-quizzing is especially helpful to students who have
scored lower in the previous chemistry course, these students
are significantly less likely to participate.
Students Who Participated in Delayed Re-quizzing Did
Not Report More Time Studying
Finally, we questioned whether students who participated in
delayed re-quizzing were just studying more. To answer this
question, we relied on self-reported study times gathered at
multiple points throughout the semester. We separated these
self-reports into exam weeks, when re-quizzing was available,
and non–exam weeks, when re-quizzing was not available. We
wanted to check the latter, because if re-quizzing participation
correlated with study time even when it was not available, this
would suggest that there was some confounding difference
between re-quizzing participants and nonparticipants. In exam
weeks, students reported studying 4.5 hours on average; while
in non–exam weeks, students reported studying nearly 4 hours
on average (Table 4). We observed that delayed re-quizzing
participation was not correlated with study times in exam
weeks (pairwise r(237) = 0.0274, p = 0.675) or non–exam
weeks (pairwise r(229) = 0.035, p = 0.602). These results
suggest that the extent of delayed re-quizzing participation
does not affect students’ own estimates of their study time.
TABLE 3. Models 3 and 4 (post hoc): standardized β values for each predictor variable (βstd) and model-level statistics of hierarchical regression analyses that predict closed-response (A) and free-response (B) portions of the cumulative final exam separately^a

A. Closed-response portion of final (model 3)

Standardized predictor coefficients^b
                                                           Step 1      Step 2      Step 3
Chemistry final exam score                                 0.486***    0.462***    0.446***
Biology pretest                                            0.244***    0.260***    0.260***
Number of quizzes retaken                                              0.115**     0.125**
Chemistry final exam score × number of quizzes retaken                             0.055
Biology pretest × number of quizzes retaken                                        0.062

Model-level statistics^c
R2                                                         0.369       0.381       0.390
R2adj                                                      0.365       0.375       0.380
ΔR2                                                                    0.012**     0.009

B. Free-response portion of final (model 4)

Standardized predictor coefficients^b
                                                           Step 1      Step 2      Step 3
Chemistry final exam score                                 0.530***    0.517***    0.484***
Biology pretest                                            0.249***    0.258***    0.257***
Number of quizzes retaken                                              0.060       0.091*
Chemistry final exam score × number of quizzes retaken                             -0.170**
Biology pretest × number of quizzes retaken                                        0.011

Model-level statistics^c
R2                                                         0.424       0.427       0.454
R2adj                                                      0.420       0.422       0.445
ΔR2                                                                    0.003       0.027***

^a The following symbols indicate significance: *p < 0.05; **p < 0.01; ***p < 0.001; n = 310.
^b Please see the Methods section for a description of standardized β values.
^c For the model-level statistics, an F-test was performed to determine whether each step significantly improved the fit of the model relative to the previous step.
TABLE 4. Self-reported study time and re-quiz durations, with mean durations and SDs shown in hours, minutes, and seconds (hh:mm:ss)

                                                 Mean (hh:mm:ss)    SD (hh:mm:ss)
Time studying per non–exam week                  04:08:04           02:38:08
Time studying per exam week                      04:30:18           03:08:01
Time per re-quizzing attempt                     00:05:32           00:04:16
Total time of all re-quizzing attempts
  per student                                    01:02:44           00:52:51
We further asked whether this interpretation was reasonable
by objectively observing the time commitment for re-quizzing
based on attempt logs. The average re-quizzing attempt
duration was close to 5 minutes (Table 4). Because students
may have attempted a delayed re-quiz more than once, we also
summed all re-quiz attempt durations from all available quizzes
across the semester for each student. On average, for students
who re-quizzed, this summed duration over the course of the
semester was about 1 hour (Table 4). This suggests that delayed
re-quizzing requires only a small time commitment from
the student and does not constitute a significant portion of
students’ weekly study-time estimates.
DISCUSSION
Was Delayed Re-quizzing Effective?
In this work, we found that no-stakes, delayed re-quizzing was
an effective learning tool in a large classroom context as mea-
sured by two cumulative assessments, biology knowledge
posttest and biology-course final exam score. This finding is
particularly meaningful, given that neither of these cumulative
assessments contained questions identical to quiz questions,
but rather asked different questions based on the same con-
cepts. The posttest was low-stakes—it was graded only on com-
pletion and was implemented outside class—meanwhile, the
course final was high-stakes—it constituted 20% of the course
grade and was given in class. These assessments are also differ-
ent in structure, with the posttest being completely closed-re-
sponse multiple-choice questions and the final being a mix of
closed-response and free-response questions. In fact, when we
separated out the closed- and free-response portions of the final
exam in a post hoc analysis, we found that closed-response
delayed re-quizzing was helpful for all students—regardless of
preparation—on the closed-response portion of the final exam,
whereas re-quizzing was particularly helpful for students with
lower success in a previous STEM course on the free-response
portion of the final exam. This finding differs from another
classroom study (McDermott et al., 2014) in which the testing
effect was largely independent of question type, but is consis-
tent with the majority of laboratory findings that the magnitude
of the testing effect depends on question type (Kang et al.,
2007; Rowland, 2014; see Smith and Karpicke, 2014, for an
exception). Together, our observations using these fundamen-
tally different assessments suggest that the positive effect of
delayed re-quizzing holds over different performance pressures;
however, in our analysis, the main effect of delayed re-quizzing
was specific for questions of a similar type on the course final
and posttest.
The result that delayed re-quizzing was correlated with
higher cumulative exam scores is consistent with theoretical
views of the testing effect presented in the Introduction. On
the basis of the current theory of disuse, we designed a situa-
tion where there was a delay between initial learning/quiz-
zing and the retake attempts. On the basis of the dual-memory
theory, we designed each quiz attempt to differ slightly for the
majority of questions so that “test memories” would be both
formed (novel questions) and strengthened (identical ques-
tions to previous attempts) and “study memories” would be
strengthened. One interpretational limit to our study, how-
ever, is that the delayed timing of the re-quizzing (i.e., the
“placement” relative to the cumulative assessment) could be
the major mediator of the observed effect, rather than the fact
that these were re-quizzing attempts. However, it is important
to note that, in the present study, the end-of-semester cumu-
lative assessments were administered as long as 9 weeks after
re-quizzing. Hence, the present findings are perhaps more
likely reflective of benefits of re-quizzing rather than quiz
placement per se. Still, future studies are needed to further
disentangle these two factors.
Strength of the Testing Effect for Delayed Re-quizzing in
Our Study
It is worth noting that the delayed re-quizzing effect sizes
when predicting either the posttest (βstd = 0.1581) or the
course final exam (βstd = 0.1149) were small in size (but still
within the expected range) compared with previous studies
examining the testing effect (compiled in Rickard and Pan,
2018). We suggest a few potential explanations for this. First,
some of the items on the posttest and cumulative final cov-
ered concepts that were not covered on the reopened delayed
quizzes. Second, unlike many previous studies examining the
testing effect, we did not include identical questions in the
quizzes and cumulative assessments (i.e., our outcome vari-
ables). In previous studies, even small differences between
quiz and exam questions decreased (McDaniel et al., 2012) or
completely mitigated (Wooldridge et al., 2014) the testing
effect. Third, and perhaps most importantly, students may
have tested themselves in contexts outside the delayed
re-quizzing opportunities that we created. For example, stu-
dents who did not retake the quizzes online may still have
tested themselves using flashcards or our course-organized
peer-learning teams. If students who did not retake quizzes nonetheless used these active approaches, that could have lowered the apparent magnitude of the testing effect. Even though the effect sizes we observed for delayed re-quizzing were small, they were statistically significant, and the practice required relatively little time from both the student (Table 4) and the instructor; furthermore, it may have particularly helped vulnerable students who had struggled with previous STEM course work.
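As a concrete illustration of how standardized effect sizes of this kind can be obtained, the minimal R sketch below fits a hierarchical regression in which preparation covariates are entered first and the number of quizzes retaken is added in a second step. All variable names (posttest, chem_grade, pretest, n_retaken) and the data file are hypothetical placeholders; this is a sketch of the general approach, not the authors' analysis code.

    # Minimal sketch with hypothetical variable names; not the authors' code.
    # Standardizing all variables first makes the fitted coefficients
    # standardized betas, comparable to the effect sizes reported above.
    d <- read.csv("requiz_data.csv")
    d_std <- as.data.frame(scale(d[, c("posttest", "chem_grade",
                                       "pretest", "n_retaken")]))

    step1 <- lm(posttest ~ chem_grade + pretest, data = d_std)  # preparation only
    step2 <- update(step1, . ~ . + n_retaken)                   # add re-quizzing

    coef(summary(step2))["n_retaken", ]  # standardized beta for re-quizzing
    anova(step1, step2)                  # does adding re-quizzing improve the fit?

Under this framing, a standardized beta near 0.1 means that each additional standard deviation of quizzes retaken is associated with roughly a tenth of a standard deviation higher exam score, holding preparation constant.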
Explanations for the Eect of Delayed Re-quizzing beyond
the Testing Eect
Given the classroom context of this study, we—like others
(McDaniel et al., 2013; McDaniel and Little, 2019)—appreciate
that factors other than the act of retrieval practice itself likely
contribute to the correlation that we observed between delayed
re-quizzing and exam grades. For example, re-quizzing may
help students metacognitively monitor their knowledge so that
they can follow up with further studying. While we did not
observe any increase in self-reported study time depending on
the number of quizzes retaken, it may be that students who
re-quizzed were able to better focus their study time based on
the feedback of those attempts.
Did the Benefits of Delayed Re-quizzing Depend on
Preparation?
We observed mixed results when investigating whether the effects of delayed re-quizzing differ across levels of preparation. While predicting both the posttest (model 1)
and course final exam (models 2–4), we noticed an interesting
trend, wherein students who struggled in a previous STEM
course (General Chemistry I) benefited more from delayed
re-quizzing than their peers who had higher previous achieve-
ment in this STEM course.
The analysis using the posttest hinted at this interaction,
but the statistical support was equivocal, given that step 3 of
model 1, which included this interaction term, did not signifi-
cantly improve the model. On the other hand, the analysis
using the entire final exam (model 2) or the free-response por-
tion of the final exam (model 4) shows strong statistical sup-
port for this interaction, with the drawback being that the
final exam score was a secondary outcome measure, because
it was not intended for research. However, strong support for
this interaction comes from the fact that both posttest and
final exam analyses converge, in an almost identical manner,
on a pattern in which delayed re-quizzing has the greatest
benefit for students who struggled the most in General
Chemistry I.
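For readers who want to probe a similar interaction in their own course data, the short R sketch below (again using hypothetical variable names rather than the authors' code) shows the nested-model comparison that corresponds to adding the interaction term as a final step.

    # Minimal sketch with hypothetical names; not the authors' analysis code.
    # A significant improvement from step2 to step3 would indicate that the
    # benefit of re-quizzing depends on previous STEM (chemistry) performance.
    d <- read.csv("requiz_data.csv")
    d_std <- as.data.frame(scale(d[, c("final_exam", "chem_grade",
                                       "pretest", "n_retaken")]))

    step2 <- lm(final_exam ~ chem_grade + pretest + n_retaken, data = d_std)
    step3 <- update(step2, . ~ . + chem_grade:n_retaken)  # add the interaction

    anova(step2, step3)                             # F test of the added term
    coef(summary(step3))["chem_grade:n_retaken", ]  # sign and size of the interaction

A negative interaction coefficient in this setup would indicate that the association between re-quizzing and exam score is stronger for students with lower previous chemistry performance, matching the pattern described above.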
Both General Chemistry I and Introductory Biology I are
large introductory courses that are taken in the first year of
college, when the majority of attrition within science occurs
(Seymour and Hewitt, 1997; Chang et al., 2008). Given the correlation in scores between these two courses, tools that help students improve their performance in these first-year courses could also improve their persistence.
Hence, because of our preliminary observation that delayed
re-quizzing may help lower-achieving students (based on
previous STEM course work) more than their higher-
achieving peers, instructors might want to encourage delayed
re-quizzing.
Intriguing is the lack of evidence for a similar interaction
with incoming biology knowledge (i.e., pretest score). Why
would delayed re-quizzing have differing effects based on
previous achievement in STEM course work but not previous
biology knowledge? While this finding certainly needs to be
studied further, we provide one potential explanation. It is
well established that lower-achieving students tend to report
lower self-regulated learning skills (Zimmerman and Pons,
1986; Pintrich and de Groot, 1990; Kitsantas, 2002; Lopez
et al., 2013; Sebesta and Bray Speth, 2017), which includes
tasks such as selecting effective study strategies. This is sup-
ported by our own data, in which chemistry exam perfor-
mance predicted participation in re-quizzing a semester later
in a different course. Given this, we expect that lower-achieving students are more likely to be drawing from a pool of less effective study strategies, while higher-achieving students are more likely to be drawing from more effective strategies. We speculate
that reopening the quizzes as a study tool for the exams might
make self-testing more easily accessible to these lower-achiev-
ing students. Lower-achieving students who took advantage of delayed re-quizzing may have incorporated this effective strategy into a repertoire of otherwise less effective strategies, whereas higher-achieving students may have incorporated self-testing into a repertoire of otherwise more effective strategies. In the future, it would be interesting to directly correlate the effectiveness of re-quizzing with study habits.
Implications for Instruction
This study reports that delayed no-stakes re-quizzing can be
an effective learning tool for students, even when the quiz
questions and exam questions are not identical. This suggests
that the practice of reopening online quizzes that contain
pools of questions (or rereleasing paper copies of multiple ver-
sions of quizzes) as study tools for exams, which many of us
already do or could easily implement, can indeed benefit stu-
dent learning. Further, to give timely feedback to our large
class, our quizzes were solely closed-response questions (mul-
tiple choice, matching, etc.), yet they still generated a testing
effect on our cumulative assessments. Because answering any
high-level question requires some lower-level retrieval, regard-
less of the cognitive level of student-learning goals, retrieval
practice (such as delayed re-quizzing) may be important for
all instructors to consider.
Future Directions
Though we tried to make self-testing accessible, it was still
underappreciated by students relative to more passive study
strategies. For example, about half of our students reported
that rewatching lecture recordings was “extremely helpful” for
their learning, while less than a quarter reported that quizzing
was. This fits with previous studies, which have similarly
found that students have “illusions of competence” from more
passive strategies, such as rereading, and tend not to view
self-testing as a very effective learning strategy (Carrier, 2003;
Karpicke et al., 2009; Karpicke and Blunt, 2011; Kornell and
Son, 2009). Despite this, more of our students chose to partic-
ipate to some extent in re-quizzing (77% of students retook at
least one quiz) than we expected based on published surveys
(Karpicke et al., 2009; Hartwig and Dunlosky, 2012). How-
ever, on the basis of published reports, we suspect that many
of our students are using self-testing solely as a way to assess
their preparedness, rather than as a learning tool (Kornell and
Bjork, 2007; Karpicke et al., 2009). That shift will be consider-
ably harder to achieve. Some research suggests that, with
guidance (including in-class demonstrations on the value of
retrieval practice; Einstein et al., 2012), students can begin to
make more accurate judgments about the effectiveness of
self-testing toward learning (Tullis et al., 2013). This guid-
ance toward helping students perceive the benefits of evi-
dence-based learning strategies, though time-consuming, has
the potential to benefit students throughout their undergrad-
uate education.
ACKNOWLEDGMENTS
We thank Anton Weisstein, Barbara Kunkel, John Majors, and
Kathleen Weston-Hafer for their consistent efforts to develop
and revise the course pre/posttest and their help in revising
quiz question pools. This research was supported in part by an
internal grant, “Transformational Initiative for Educators in
STEM,” which aims to foster the adoption of evidence-based
teaching practices in science classrooms at Washington Univer-
sity in St. Louis.
REFERENCES
Bjork, E. L., Little, J. L., & Storm, B. C. (2014). Multiple-choice testing as a
desirable diculty in the classroom. Journal of Applied Research in
Memory and Cognition, 3(3), 165–170.
Bjork, R. A. (1994). Memory and metamemory considerations in the training
of human beings. In Metcalfe, J., & Shimamura, A. (Eds), Metacognition:
Knowing about knowing (pp. 185–205). Cambridge, MA: MIT Press.
Bjork, R. A., & Bjork, E. L. (1992). A new theory of disuse and an old theory of
stimulus uctuation. In Healy, A., Kosslyn, S., & Shirin, R. (Eds.), From
learning processes to cognitive processes: Essays in honor of William K.
Estes (2nd ed., pp. 35–67). Hillsdale, NJ: Erlbaum.
Brame, C. J., & Biel, R. (2015). Test-enhanced learning: The potential for
testing to promote greater learning in undergraduate science courses.
CBE—Life Sciences Education, 14(2), es4.
Butler, A. C., & Roediger, H. L. (2008). Feedback enhances the positive eects
and reduces the negative eects of multiple-choice testing. Memory &
Cognition, 36(3), 604–616.
Carnegie, J. (2015). Use of feedback-oriented online exercises to help phys-
iology students construct well-organized answers to short-answer
questions. CBE—Life Sciences Education, 14(3), ar25.
Carpenter, S. K. (2009). Cue strength as a moderator of the testing eect:
The benets of elaborative retrieval. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 35(6), 1563–1569.
Carrier, L. M. (2003). College students’ choices of study strategies. Percep-
tual and Motor Skills, 96(1), 54–56.
Chang, M. J., Cerna, O., Han, J., & Sàenz, V. (2008). The contradictory roles
of institutional status in retaining underrepresented minorities in bio-
medical and behavioral science majors. Review of Higher Education,
31(4), 433–464.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1), 155–159.
Eddy, S. L., & Hogan, K. A. (2014). Getting under the hood: How and for
whom does increasing course structure work? CBE—Life Sciences
Education, 13(3), 453–468.
Einstein, G. O., Mullet, H. G., & Harrison, T. L. (2012). The testing eect: Illus-
trating a fundamental concept and changing study strategies. Teaching
of Psychology, 39(3), 190–193.
Freeman, S., Haak, D., & Wenderoth, M. P. (2011). Increased course structure
improves performance in introductory biology. CBE—Life Sciences
Education, 10(2), 175–186.
Goldberg, L. R. (1993). The structure of phenotypic personality traits.
American Psychologist, 48(1), 26–34.
Haak, D. C., HilleRisLambers, J., Pitre, E., & Freeman, S. (2011). Increased
structure and active learning reduce the achievement gap in introducto-
ry biology. Science, 332(6034), 1213–1216.
Hartwig, M. K., & Dunlosky, J. (2012). Study strategies of college students: Are
self-testing and scheduling related to achievement? Psychonomic
Bulletin & Review, 19(1), 126–134.
John, O. P., Naumann, L. P., & Soto, C. J. (2008). Paradigm shift to the inte-
grative Big Five trait taxonomy: History, measurement, and conceptual
issues. In John, O., Robins, R., & Pervin, L. (Eds.), Handbook of personal-
ity: Theory and research (3rd ed., pp. 114–158). New York: Guilford.
Kang, S. H. K., McDermott, K. B., & Roediger, H. L. (2007). Test format and corrective feedback modify the effect of testing on long-term retention. European Journal of Cognitive Psychology, 19(4–5), 528–558.
Karpicke, J. D., & Blunt, J. R. (2011). Retrieval practice produces more learn-
ing than elaborative studying with concept mapping. Science, 331(6018),
772–775.
Karpicke, J. D., Butler, A. C., & Roediger, H. L. (2009). Metacognitive strategies
in student learning: Do students practise retrieval when they study on
their own? Memory, 17(4), 471–479.
Karpicke, J. D., & Roediger, H. L. (2008). The critical importance of retrieval
for learning. Science, 319(5865), 966–968.
Kitsantas, A. (2002). Test preparation and performance: A self-regulatory
analysis. Journal of Experimental Education, 70(2), 101–113.
Kornell, N., & Bjork, R. A. (2007). The promise and perils of self-regulated
study. Psychonomic Bulletin & Review, 14(2), 219–224.
Kornell, N., & Son, L. K. (2009). Learners’ choices and beliefs about self-test-
ing. Memory, 17(5), 493–501.
Larsen, D. P., Butler, A. C., & Roediger, H. L., III (2009). Repeated testing im-
proves long-term retention relative to repeated study: A randomised
controlled trial. Medical Education, 43(12), 1174–1181.
Leeming, F. C. (2002). The Exam-A-Day procedure improves performance in
psychology classes. Teaching of Psychology, 29(3), 210–212.
Lopez, E. J., Nandagopal, K., Shavelson, R. J., Szu, E., & Penn, J. (2013).
Self-regulated learning study strategies and academic performance in
undergraduate organic chemistry: An investigation examining ethnically
diverse students. Journal of Research in Science Teaching, 50(6), 660–
676.
Lyle, K. B., & Crawford, N. A. (2011). Retrieving essential material at the end of
lectures improves performance on statistics exams. Teaching of Psy-
chology, 38(2), 94–97.
McDaniel, M. A., & Little, J. (2019). Multiple-choice and short-answer quiz-
zing on equal footing in the classroom: Potential indirect eects of
testing. In Dunlowsky, J., & Rawson, K. A. (Eds.), The Cambridge hand-
book of cognition and education (pp. 480–499). Cambridge, UK:
Cambridge University Press.
McDaniel, M. A., & Masson, M. E. (1985). Altering memory representations
through retrieval. Journal of Experimental Psychology: Learning, Memo-
ry, and Cognition, 11(2), 371–385.
McDaniel, M. A., Thomas, R. C., Agarwal, P. K., McDermott, K. B., & Roediger,
H. L. (2013). Quizzing in middle-school science: Successful transfer per-
formance on classroom exams. Applied Cognitive Psychology, 27(3),
360–372.
McDaniel, M. A., Wildman, K. M., & Anderson, J. L. (2012). Using quizzes to
enhance summative-assessment performance in a Web-based class: An
experimental study. Journal of Applied Research in Memory and Cogni-
tion, 1(1), 18–26.
McDermott, K. B., Agarwal, P. K., D’Antonio, L., Roediger, H. L., & McDaniel, M.
A. (2014). Both multiple-choice and short-answer quizzes enhance
later exam performance in middle and high school classes. Journal of
Experimental Psychology: Applied, 20(1), 3–21.
Messineo, L., Gentile, M., & Allegra, M. (2015). Test-enhanced learning: Anal-
ysis of an experience with undergraduate nursing students. BMC Medical
Education, 15, 182.
Orr, R., & Foster, S. (2013). Increasing student success using online quizzing
in introductory (majors) biology. CBE—Life Sciences Education, 12(3),
509–514.
Pintrich, P. R., & de Groot, E. V. (1990). Motivational and self-regulated
learning components of classroom academic performance. Journal of
Educational Psychology, 82(1), 33–40.
Pyc, M. A., & Rawson, K. A. (2010). Why testing improves memory: Mediator
eectiveness hypothesis. Science, 330(6002), 335.
R Core Team. (2013). R: A language and environment for statistical comput-
ing. Vienna, Austria: R Foundation for Statistical Computing.
Rickard, T. C., & Pan, S. C. (2018). A dual memory theory of the testing eect.
Psychonomic Bulletin & Review, 25(3), 847–869.
Roediger, H. L., & Karpicke, J. D. (2006). Test-enhanced learning. Psycholog-
ical Science, 17(3), 249–255.
Rowland, C. A. (2014). The eect of testing versus restudy on retention: A
meta-analytic review of the testing eect. Psychological Bulletin, 140(6),
1432–1463.
Sebesta, A. J., & Bray Speth, E. (2017). How should I study for the exam?
Self-regulated learning strategies and achievement in introductory
biology. CBE—Life Sciences Education, 16(2), ar30.
Seymour, E., & Hewitt, N. M. (1997). Talking about leaving: Why undergradu-
ates leave the sciences. Boulder, CO: Westview.
Smith, M. A., & Karpicke, J. D. (2014). Retrieval practice with short-answer,
multiple-choice, and hybrid tests. Memory, 22(7), 784–802.
Storm, B. C., Friedman, M. C., Murayama, K., & Bjork, R. A. (2014). On the
transfer of prior tests or study events to subsequent study. Journal of
Experimental Psychology: Learning, Memory, and Cognition, 40(1), 115–
124.
Theobald, R., & Freeman, S. (2014). Is it the intervention or the students?
Using linear regression to control for student characteristics in under-
graduate STEM education research. CBE—Life Sciences Education, 13(1),
41–48.
Thomas, R. C., & McDaniel, M. A. (2013). Testing and feedback eects on
front-end control over later retrieval. Journal of Experimental Psycholo-
gy: Learning, Memory, and Cognition, 39(2), 437–450.
Thorndike, E. (1914). The psychology of learning. New York: Teachers’
College, Columbia University.
Trumbo, M. C., Leiting, K. A., McDaniel, M. A., & Hodge, G. K. (2016). Eects
of reinforcement on test-enhanced learning in a large, diverse introduc-
tory college psychology course. Journal of Experimental Psychology:
Applied, 22(2), 148–160.
Tullis, J. G., Finley, J. R., & Benjamin, A. S. (2013). Metacognition of the testing
eect: Guiding learners to predict the benets of retrieval. Memory &
Cognition, 41(3), 429–442.
Wooldridge, C. L., Bugg, J. M., McDaniel, M. A., & Liu, Y. (2014). The testing
eect with authentic educational materials: A cautionary note. Journal of
Applied Research in Memory and Cognition, 3(3), 214–221.
Zimmerman, B. J., & Pons, M. M. (1986). Development of a structured inter-
view for assessing student use of self-regulated learning strategies.
American Educational Research Journal, 23(4), 614–628.
The Cambridge Handbook of Cognition and Education - edited by John Dunlosky February 2019