
The effects of output interference on metamemory and cued recall accuracy in young and older adults


Abstract

Output interference (OI) is a gradual decline in memory accuracy as a function of an item’s position in a testing sequence (M. C. Anderson & Neely, 1996). Despite having been researched for over 50 years (e.g., Tulving & Arbuckle, 1963), this effect has yet to be linked to metacognitive experiences. The current study examines differences in memory accuracy and monitoring for young and older adults who experience OI during cued recall. At study, participants were asked to remember 40 cue-target pairs: For half of the participants, cue words were exemplars that were sampled from the same taxonomic category, while for the other half of the participants, word pairs were completely unrelated. At test, participants first engaged in a cued recall task, where they were asked to predict future recognition memory outcomes (i.e. feelings-of-knowing; FOKs) as well as if they experienced feelings of “Remembering”, “Knowing”, or “No Memory” (i.e. R/K/N judgments) for each trial. Afterwards, participants engaged in a 4-alternative forced-choice recognition task and were asked to provide retrospective confidence judgments (RCJs) after each trial. In the aggregate, memory and metamemory accuracy were similar for young and older adults in both experimental conditions. At the level of the trial, however, recall accuracy, FOKs, and self-reported recollection significantly decreased across successive trials for participants of all ages experiencing OI. Decreases in memory accuracy during OI were mirrored by increases in retrieval failures and states of no memory. Only self-reported familiarity differed between age groups, where “Know” judgments decreased across trials for young adults, but increased for older adults. The results support previous findings of age invariance in FOK accuracy (Hertzog, Sinclair, et al., 2010) and highlight the role of retrieval suppression in mechanistic accounts of OI.
A Dissertation
Presented to
The Academic Faculty
Taylor M. Curley
In Partial Fulfillment
of the Requirements for the Degree
Doctor of Philosophy in the
School of Psychology
College of Sciences
Georgia Institute of Technology
December 2021
© Taylor M. Curley 2021
Thesis committee:
Dr. Christopher Hertzog, Advisor
School of Psychology
Georgia Institute of Technology
Dr. Rick Thomas
School of Psychology
Georgia Institute of Technology
Dr. Paul Verhaeghen
School of Psychology
Georgia Institute of Technology
Dr. Dobromir Rahnev
School of Psychology
Georgia Institute of Technology
Dr. John Dunlosky
Department of Psychological Sciences
Kent State University
Date approved: August 25, 2021
Thank you to my friends and family, and especially Mom, Dad, Caitlin, Alex, and
Philip, who provided love and support throughout my dream to earn my doctorate—I love
you all very much. Thank you also to my committee—Rick Thomas, Paul Verhaeghen,
Doby Rahnev, and John Dunlosky—for all of your help and encouragement throughout
the years, and especially during this thesis. I also want to thank everybody in the Adult
Cognition Lab. I’m especially grateful to the undergraduates who helped me throughout
the years—Jayna, Omer, Hannah, Kirsten, Josh, Kenley, Alysha, Skyler, Aliyah, Aiman,
Mackenzie, Faizah, Yusra, Ana, and Caroline—and were absolutely essential to my success
in graduate school. Importantly, thank you to my lab mates—Emily, MacKenzie, Maugan,
and Marit—who provided me with all of the encouragement, help, and friendship that I
could ever ask for. Finally, I want to thank my advisor, Chris Hertzog, for his encour-
agement, excellent guidance, and sage advice throughout my years at Georgia Tech and
particularly during this dissertation. I truly owe my success to these wonderful people.
Acknowledgments
List of Tables
List of Figures
List of Acronyms
Chapter 1: Introduction
1.1 Aging & Metamemory
1.1.1 Initial Investigations
1.1.2 Origins of Feelings-of-knowing
1.1.3 Sources of Age Differences in Episodic FOKs
1.2 Memory Interference
1.2.1 Proactive Interference
1.2.2 Implicit Interference
1.2.3 Output Interference
1.3 Overview of the Current Experiment
Chapter 2: Methods
2.1 Participant Characteristics
2.2 Materials
2.3 Procedure
Chapter 3: Results
3.1 Statistical Methods
3.2 Cued Recall
3.2.1 Overall Means
3.2.2 Recall Accuracy Across Trials
3.2.3 Recall Outcomes
3.2.4 Summary
3.3 Recognition
3.3.1 Overall Means
3.3.2 Summary
3.4 Remember/Know/No Memory
3.4.1 Overall Means
3.4.2 R/K/N Endorsement Across Trials
3.4.3 Multilevel Models
3.4.4 Summary
3.5 Post-recognition Confidence Judgments
3.5.1 Overall Means
3.5.2 Relative Accuracy
3.5.3 Summary
3.6 Feelings-of-knowing
3.6.1 Overall Means
3.6.2 FOKs Across Trials
3.6.3 Relative Accuracy
3.6.4 Multilevel Model
3.6.5 Summary
Chapter 4: General Discussion
4.1 Output Interference, Cued Recall, and Aging
4.1.1 Recall Accuracy
4.1.2 Memory Errors
4.1.3 Heterogeneity in OI
4.2 Metamemory and Output Interference
4.2.1 Overall FOK Magnitude and Accuracy
4.2.2 Metacognitive Judgments Over Trials
4.3 Theoretical Accounts of Output Interference
4.4 Conclusions
4.4.1 Future Directions
Appendix A: Additional Analyses
References
List of Tables
2.1 Means and standard errors of participant characteristics.
3.1 Memory performance and judgment means by age group and experimental condition.
3.2 F-table of the multilevel models predicting correct recall (Model 1), errors of omission (Model 2), and errors of commission (Model 3).
3.3 Parameter estimates for the multilevel models predicting correct recall (Model 1), errors of omission (Model 2), and errors of commission (Model 3). Results are reported in log-odds.
3.4 Multinomial logistic model predicting R/K/N (in log-odds) from age group and experimental condition.
3.5 Multinomial logistic model predicting R/K/N (in log-odds) from age group, experimental condition, and trial within recall blocks.
3.6 F-table for the multilevel models predicting “Remember” (Model 1), “Know” (Model 2), and “No Memory” (Model 3) judgments against all others.
3.7 Parameter estimates for the multilevel models predicting “Remember” (Model 1), “Know” (Model 2), and “No Memory” (Model 3) judgments against all others. Results are reported in log-odds.
3.8 Average gamma correlation estimates for RCJs across age groups and conditions.
3.9 Average gamma correlation estimates for FOKs across age groups and conditions.
3.10 F-table for the multilevel model with item-level FOKs as the dependent variable.
3.11 Parameter estimates for the multilevel model using item-level FOKs as the dependent variable.
A.1 Logistic model predicting recognition outcome (in log-odds) from age group, experimental condition, and trial within category during recognition.
A.2 Logistic model predicting recognition memory outcome (in log-odds) from age group, experimental condition, and trial within recall blocks.
List of Figures
1.1 An overview of typical output interference effects in cued recall. From Wilson et al. (2019).
1.2 Output interference effects for young, middle, and older adults. From Smith (1975).
2.1 Visual example of the experimental procedure.
3.1 Average cued recall performance.
3.2 Average cued recall performance averaged across blocked trials.
3.3 Correct recall, omission, and commission rates across cue-relatedness and age conditions.
3.4 Simple slopes analysis estimating the probability of correct recall across recall trials within age group, condition, and recall cycle.
3.5 Simple slopes analysis estimating the probability of correct recall across recall trials for Related and Unrelated Cues conditions.
3.6 Simple slopes analysis estimating the probability of an error of omission across recall trials within age group, condition, and recall cycle.
3.7 Simple slopes analysis estimating the probability of errors of omission across recall trials for Related and Unrelated Cues conditions.
3.8 Average recognition memory performance.
3.9 Average recognition memory performance across blocked recognition trials.
3.10 Average R/K/N judgment rates.
3.11 Average R/K/N judgment rates over trials.
3.12 Simple slopes analysis estimating the probability of endorsing “Remember” across recall trials for Related and Unrelated Cues conditions.
3.13 Simple slopes analysis estimating the probability of endorsing “Know” across recall trials for young and older adults in the Related and Unrelated Cues conditions.
3.14 Simple slopes analysis estimating the probability of endorsing “No Memory” across recall trials for the Related and Unrelated Cues conditions.
3.15 Average FOK judgments across recall outcomes, age groups, and experimental conditions.
3.16 Average feeling-of-knowing judgments across trials for recalled and unrecalled items.
3.17 Marginal mean FOK judgments by age group, condition, and whether an item was recalled or not.
A.1 Average 4-alternative forced choice recognition memory performance by age group and cue-relatedness across consecutive cued recall trials from
List of Acronyms
γ or G Goodman-Kruskal Gamma Correlation Coefficient
4AFC 4-alternative Forced Choice Test
dJOL Delayed Judgment of Learning
FOK Post-recall Feeling-of-knowing Judgment
HyGene Hypothesis Generation Model
iJOL Immediate Judgment of Learning
LTM Long-term Memory
OI Output Interference
PI Proactive Interference
POK Predictions of Knowing
R/K/N Remember/Know/No Memory Judgments
RCJ Post-recognition Confidence Judgment
RI Retroactive Interference
RIF Retrieval-induced Forgetting
WM/WMC Working Memory/Working Memory Capacity
Adults often show increased difficulties in remembering learned information across
the lifespan (Craik, 1994; Light, 1991; Schaie, 2005). This is often compounded by neg-
ative self-beliefs older adults have about their own memory abilities and control over such
abilities (Dixon & Hultsch, 1983; Hertzog, Small, McFall, & Dixon, 2019; Lineweaver &
Hertzog, 1998); however, older adults’ beliefs about their memory abilities may not match
true memory performance (Hertzog et al., 2018).
A goal of metamemory research in adulthood is determining the extent to which mem-
ory judgment accuracy changes over the lifespan—if at all. In particular, the degree to
which adults differ with respect to feelings-of-knowing, or predictions that one will be able
to recognize an item that she cannot currently recall, is unclear. While some researchers
do find differences in FOK accuracy between young and older adults (Perrotin et al., 2006;
Perrotin et al., 2008; Souchay & Isingrini, 2004a; Souchay et al., 2007), others do not
(Eakin & Hertzog, 2006, 2012a; Hertzog, Fulton, Sinclair, & Dunlosky, 2014; Hertzog,
Kidder, Powell-Moman, & Dunlosky, 2002).
One possible source of these discrepant results is the task that is used to elicit
metamemory judgments. Here, I outline a study that examines memory and metamemory
performance in older adults using a specific memory phenomenon called output interfer-
ence (OI), where memory performance gradually declines as a function of an item’s posi-
tion in a testing sequence, specifically for items that share semantic relationships (M. C.
Anderson & Neely, 1996; Tulving & Arbuckle, 1963). This effect has its own precedent
in the extant literature, including aging and memory (Smith, 1971, 1975), but it has yet to
be connected to the topics of FOKs and metamemory in older adults. This study will not
only directly examine whether there are age-related differences in FOK accuracy during
output interference states, but it will also inform theories regarding the exact processes that
underlie OI.
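The definition of OI given above (accuracy declining with an item's position in the testing sequence) can be made concrete with a minimal descriptive sketch. This is not the analysis used in this dissertation; it simply averages accuracy at each test position and fits an ordinary least-squares line through those averages. The function names and the trial data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def accuracy_by_position(trials):
    """Mean recall accuracy at each test position.

    `trials` is an iterable of (position, correct) pairs, where `correct`
    is 1 for a correct recall and 0 otherwise.
    """
    totals = defaultdict(lambda: [0, 0])  # position -> [hits, trial count]
    for position, correct in trials:
        totals[position][0] += correct
        totals[position][1] += 1
    return {p: hits / n for p, (hits, n) in sorted(totals.items())}


def linear_slope(points):
    """Ordinary least-squares slope of accuracy regressed on position."""
    xs = list(points)
    ys = [points[x] for x in xs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical trials: (position within a tested category, recall outcome).
toy_trials = [(1, 1), (1, 1), (2, 1), (2, 0), (3, 1), (3, 0), (4, 0), (4, 0)]
toy_accuracy = accuracy_by_position(toy_trials)
toy_slope = linear_slope(toy_accuracy)  # negative slope = OI-like decline
```

A negative slope in this toy example corresponds to the pattern described as OI. Trial-level binary outcomes nested within participants would normally call for a model that respects that structure, so the sketch should be read only as a picture of the effect.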
The remainder of this chapter will explore critical discrepancies in the extant metamem-
ory research, where the aging literature has previously addressed interference effects with
respect to memory judgments, and why OI might represent a new, rich environment for
understanding memory awareness across the lifespan.
1.1 Aging & Metamemory
Metamemory refers to the ability to accurately review (monitor) and change the state
(control) of one’s own memory functioning (T. O. Nelson & Narens, 1990). Assessing the
metacognitive abilities of older adults is extremely important because they are likely to experience
heavy penalties for misjudging their own memory abilities, such as forgetting to take daily
medication (Zogg, Woods, Sauceda, Wiebe, & Simoni, 2012). Thus, two goals of memory
and aging research are to a) determine when and where memory prediction errors might
occur in older adults, and b) explore the factors that contribute to age-related differences in
metamemory performance where they are observed.
1.1.1 Initial Investigations
Early research in this field operationalized this ability by comparing the number or
percentage of items that participants predicted they would remember (a “global” judgment
of learning, or JOL) to the amount that they actually remembered. The general finding in
this line of research is that older adults significantly over-estimate their memory abilities
(i.e. predict remembering more items than they actually do) compared to young adults,
whose estimations are close to true performance (e.g. Perlmutter, 1978). In a study per-
formed by Bruce, Coyne, and Botwinick (1982), for example, older adults (ages 60 - 79)
over-estimated the number of items they would freely recall at test (out of 20) by an average
of 2. This pattern was replicated in several additional studies (e.g. Coyne, 1985; Devolder
et al., 1990), leading to the conclusion that memory prediction accuracy declines over the
lifespan.
Despite the multiple replications, a number of inconsistencies in these results chal-
lenged this general conclusion of age-related metamemory deficits. First, several studies
found no significant differences between young and older adults’ global predictions, de-
spite differences in true memory performance (Bruce et al., 1982). Second, several stud-
ies found the reverse pattern, i.e. better global JOL accuracy in older adults (Hertzog,
Dixon, & Hultsch, 1990). This led to a critical review by Connor, Dunlosky, and Hertzog
(1997) cautioning against the use of global (or absolute) accuracy measures in favor of item-level
(or relative) accuracy measures, such as the Goodman-Kruskal gamma correlation (L. A.
Goodman & Kruskal, 1963; T. O. Nelson, 1984). The authors found that JOL accuracy,
as measured by item-level correlations, is invariant between young and older adults and
argued that the age-related differences found in global judgments can be attributed to the
reliance on a mid-point anchor. The results of Connor et al. were instrumental in
demonstrating that examining sensitivity to item-level variations in a learning sequence is critical
to broad interpretations of metamemory accuracy across the lifespan.
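To illustrate the item-level (relative) accuracy measure discussed above, the following is a minimal sketch of the Goodman-Kruskal gamma correlation, computed as (C - D) / (C + D), where C and D are the numbers of concordant and discordant item pairs and pairs tied on either variable are ignored. The function name and the example data are hypothetical.

```python
from itertools import combinations


def goodman_kruskal_gamma(judgments, outcomes):
    """Return (C - D) / (C + D) over all item pairs, or None if undefined.

    A pair is concordant when the item with the higher judgment also has
    the better memory outcome, and discordant when the order is reversed.
    Pairs tied on either variable do not count toward C or D.
    """
    concordant = 0
    discordant = 0
    for (j1, o1), (j2, o2) in combinations(zip(judgments, outcomes), 2):
        product = (j1 - j2) * (o1 - o2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        return None  # every pair is tied, so gamma is undefined
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical data for one participant: judgments on a 0-100 scale and
# memory outcomes coded 1 (remembered) or 0 (not remembered).
toy_judgments = [90, 70, 40, 20, 60]
toy_outcomes = [1, 1, 0, 0, 1]
toy_gamma = goodman_kruskal_gamma(toy_judgments, toy_outcomes)
```

In the toy data every remembered item received a higher judgment than every forgotten item, so all untied pairs are concordant. Because gamma depends only on the ordering of judgments within a participant, it is insensitive to how high or low the judgments are overall, which is the property that separates relative accuracy from global (absolute) accuracy.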
This work on aging and JOLs sets an important precedent for the current proposal for a
few different reasons. First, the work by Connor et al. (1997) and subsequent studies (Hert-
zog, Dunlosky, & Sinclair, 2010; Hertzog, Kidder, Powell-Moman, & Dunlosky, 2002;
Hines, Hertzog, & Touron, 2015; Robinson, Hertzog, & Dunlosky, 2006) established that
older and young adults generally do not differ with respect to item-level JOL sensitivity,
with the exception of certain situations, such as when older adults overly rely on
familiarity (Daniels, Toth, & Hertzog, 2009; Toth, Daniels, & Solinger, 2011). More importantly,
these studies demonstrated a shift from examining metamemory predictions as reflective
of access to a quantity of information (“pure-accessibility”) to examining them as the
integration of multiple diagnostic cues that are present during the encoding and retrieval
processes themselves (“cue-utilization”; Hertzog & Curley, 2018; Koriat, 1997; Robinson
et al., 2006).
1.1.2 Origins of Feelings-of-knowing
The feeling-of-knowing (FOK)—or a prediction about future memory accuracy for a
currently unretrievable item—has also been an important diagnostic tool for metamemory
in adult learners across the lifespan. Unlike the JOL, which provides a subjective appraisal
of confidence during memory encoding (cf. Rhodes, 2016), FOKs indirectly examine the
extent to which an individual has access to a particular piece of information in LTM. FOKs
are maximally informative when examining judgments for items that an individual cannot
recall, but “feels” that the item can be accessed via recognition (T. O. Nelson, Leonesio,
Shimamura, Landwehr, & Narens, 1982). The canonical procedure for eliciting FOKs is
the recall-judgment-recognition (RJR) paradigm (Hart, 1965), where participants first learn
a set of items with the instruction to remember them for a test later on and then engage in
a cued-recall task after a brief retention period. During recall, participants are asked to
rate their confidence that they would be able to correctly recognize the target item if it
were shown to them (FOK). After providing a judgment after each recall trial, participants
engage in a recognition memory task in which they must correctly recognize the items they
studied at the beginning of the experiment. The degree to which item-level judgments for
unrecalled items correspond to future recognition accuracy, typically measured using the
Goodman-Kruskal gamma correlation (L. A. Goodman & Kruskal, 1963; T. O. Nelson,
1984), is broadly interpreted as the level of awareness an individual has of her access to
information in LTM (T. O. Nelson & Narens, 1990). Thus, FOKs can be used to draw
inferences about stability/changes in this awareness across the lifespan and whether older
adults are less able to accurately monitor partial access to memory (Hertzog & Curley,
While the FOK phenomenon provides unique insight into specific metamemorial abil-
ities, several factors make these judgments more difficult to interpret than JOLs. For one,
the standard FOK does not have a direct analog by which to compare changes in access
to criterial information. JOLs, by contrast, can be given either immediately after studying
an item (“immediate JOL”, or iJOL), which yields moderate metacognitive accuracy, or
after studying all items in an intermediary testing session (“delayed JOL”, or dJOL), which
yields high metacognitive accuracy (T. O. Nelson & Dunlosky, 1991). The similarities in
judgment conditions between the two types of JOLs allow researchers to directly contrast
the conditions and heuristics that give rise to each. For example, the differences in judg-
ment accuracy between immediate and delayed JOLs can signal a change in reliance on
heuristics based on information in a short-term memory store to a reliance on informa-
tion in a long-term memory store that closely matches the retrieval conditions in the final
recall test (i.e. the monitoring-dual-memories account, T. O. Nelson & Dunlosky, 1991;
although see Narens, Nelson, & Scheck, 2008, for a review of alternative accounts of the
dJOL effect). FOKs do not have a close analog; instead, all empirical evidence regarding
the nature of these judgments must be based on systematic manipulations of the stimuli and
experimental conditions in the FOK task.
A second, but related, issue is the wide range of heuristics that potentially underlie the
FOK construction process. Early researchers focused their conclusions on the correspon-
dence between the amount of indirect information that participants are able to access at the
time of judgment (e.g. a “trace-strength” account; Schacter, 1983) and the reported
judgments; however, this hypothesis evolved to suggest that FOKs are constructed on the basis
of conscious and nonconscious influences arising from the target search itself (Metcalfe,
2000). (Succinctly stated as “FOKs can access only the products of cognition”; Hertzog,
Dunlosky, & Sinclair, 2010, p. 772.) One popular extension of this account is the cue-
familiarity hypothesis, which posits that FOKs are constructed on the basis of the famil-
iarity of the cue at the time of retrieval/judgment (Metcalfe et al., 1993). Reder and Ritter
(1992), for example, demonstrated that FOKs can be almost entirely based on the familiar-
ity of a cue by increasing the number of times that an arithmetic problem, but not its answer,
is presented during study. A less stringent view of the importance of cue familiarity is
Koriat’s cue-accessibility account (also referred to as the partial-retrieval hypothesis), where
a cued item initiates the search process for the target and information spawning from that
process is integrated into a judgment, regardless of whether that information is valid or not
(Koriat, 1993; Koriat & Levy-Sadot, 2001). In recent studies, researchers have adopted a
more holistic interpretation of FOK construction, one that hypothesizes that learners weigh
multiple cues simultaneously (Hertzog, Dunlosky, & Sinclair, 2010; Hertzog et al., 2014).
Along with the familiarity and accessibility of cue words, learners weigh the availability
of information regarding the original study context, such as the emotional valence of cue-
target pairs (A. K. Thomas, Bulevich, & Dubois, 2011) or mediators between the cue and
target (Hertzog et al., 2014), often with significantly improved accuracy compared to con-
trols. Information about the encoding context provides valid, albeit indirect, information
by which to draw inferences during FOK construction, a phenomenon that is referred to as
non-criterial recollection (Brewer et al., 2010; Yonelinas, 2002). Non-criterial recollection
has been shown to improve FOK accuracy in both young and older adults (Hertzog, Curley,
Castro, et al., 2020; Hertzog, Fulton, & Dunlosky, 2020; Hertzog et al., 2014).
The Role of Implicit and Explicit Information
There are several hypotheses related to the specific memorial cues that are used to produce
FOKs; for example, one popular account by Koriat (1997) segregates metacognitive cues
into three categories: intrinsic, extrinsic, and mnemonic factors. Intrinsic factors are those
that reflect stimulus materials, such as the relatedness of a pair, while extrinsic factors
are those that reflect the conditions of learning, such as study duration (Kelley & Jacoby,
1996). In contrast, mnemonic factors reflect information about items that are based on
subjective inferences about memory, such as encoding fluency (Hertzog et al., 2003; Koriat
& Ma’ayan, 2005).
The line of research that is most relevant to this study examines the role of implicit
and explicit information, where implicit information refers to unintentional influences of
previous experience on memory retrieval and explicit information refers to the influence
of ideas that are intentionally encoded (D. L. Nelson et al., 2013; D. L. Nelson et al.,
1992; D. L. Nelson & Zhang, 2000). During memory retrieval, both implicit and explicit
information can be used to construct metamemory judgments.
Schreiber and Nelson examined the influence of information available during retrieval
on episodic FOKs and predictions-of-knowing (POKs) in two experiments that manipulated
the number of semantic associates (i.e. set size) for cue words (Schreiber & Nelson, 1998)
and for target words (Schreiber, 1998). Here, the authors directly manipulated the role of
implicit information using variable cue set-sizes: For cues with larger set sizes, for instance,
implicit information will be influenced by the large number of associates that results from
the retrieval process. In formulating the hypothesis, the authors distinguish between two
potential retrieval accounts that might affect FOKs. The first, called the partial-retrieval
hypothesis (Koriat, 1993), posits that FOK judgments are constructed on the basis of re-
lated information that comes to mind during retrieval and not on explicit access to criterial
information. In the case of set-size manipulations, this account would hold in situations
where participants give higher FOKs for items with a larger number of semantic neigh-
bors than for items with smaller numbers of semantic neighbors. In this case, learners’
judgments would reflect the feeling of greater access to information due to competition,
regardless of whether the information is truly indicative of future memory performance or
not. The alternative account, called the competition hypothesis, states the opposite pattern:
Learners should report higher FOKs for items with fewer semantic neighbors compared to
items with a larger number of semantic associates. Here, learners’ judgments would reflect
sensitivity to competition between items. In these two studies, Schreiber and Nelson found
that FOKs were negatively correlated with the amount of competition during retrieval, such
that higher FOKs were given to words with smaller set sizes (i.e. lower semantic compe-
tition) and lower FOKs were given to words with larger set sizes (i.e. higher semantic
competition), supporting a competition hypothesis of FOK construction (Schreiber, 1998;
Schreiber & Nelson, 1998). Schreiber and Nelson’s work on the competition hypothesis
was instrumental in later examinations of FOKs and implicit interference (Eakin & Hert-
zog, 2006, 2012a, 2012b).
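The two accounts contrasted above make opposite predictions about the direction of the relationship between set size and FOK magnitude. The following minimal sketch expresses that logic using the sign of a Pearson correlation. It is a deliberate simplification and not the analysis reported by Schreiber and Nelson; the function names and the data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def favored_account(set_sizes, foks):
    """Name the account whose predicted direction matches the data.

    Partial-retrieval: larger set sizes go with higher FOKs (positive r).
    Competition: larger set sizes go with lower FOKs (negative r).
    """
    r = pearson_r(set_sizes, foks)
    if r > 0:
        return "partial-retrieval"
    if r < 0:
        return "competition"
    return "neither"


# Hypothetical item-level data: number of semantic associates and mean FOK.
toy_set_sizes = [5, 10, 15, 20]
toy_foks = [80, 65, 50, 40]
toy_result = favored_account(toy_set_sizes, toy_foks)
```

In these made-up data FOKs fall as set size grows, which is the direction the competition hypothesis predicts. An actual test would also need to establish that the relationship is statistically reliable, which the sign alone does not show.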
Taken together, the results of these studies indicate that FOK accuracy in young adults is
highest when learners have access to information that is diagnostic of (though does not nec-
essarily directly point to) to-be-remembered items, such as memories based on recollection
rather than familiarity (Hicks & Marsh, 2002), and lowest when direct information is un-
available or when implicit memory influences provide misleading information (Schreiber,
1998; Schreiber & Nelson, 1998). This is also the case for FOK judgment accuracy in older
adults (e.g. Hertzog, Dunlosky, & Sinclair, 2010; recollection vs. familiarity, MacLaverty
& Hertzog, 2009; Souchay et al., 2007); however, as I will discuss in the next section,
delineations between different cue weightings become paramount in understanding decline
and stability in FOK accuracy in later adulthood.
1.1.3 Sources of Age Differences in Episodic FOKs
An important consideration when examining feelings-of-knowing is the type of mem-
ory learners are asked to make judgments about. Here, the distinction between semantic and
episodic FOKs not only delineates functional differences in the task, but also differences
in common findings regarding aging and metamemory. The accuracy of semantic FOKs,
which are FOKs about facts or knowledge, remains stable over the lifespan, with very few
exceptions (e.g., Allen-Burge & Storandt, 2000; Eakin, Hertzog, & Harris, 2014; Souchay,
Isingrini, & Espagnet, 2000; Souchay, Moulin, Clarys, Taconnat, & Isingrini, 2007; for a
review of earlier studies, see Hertzog & Hultsch, 2000). The common explanation for this
is that semantic stimuli involve topics that are familiar to adults and information that has
been previously mastered (Hertzog & Curley, 2018). For example, semantic FOK tasks
may ask participants to recall the name of the U.S. Vice President during Jimmy Carter’s
administration.¹ While an older adult may not be able to directly recall the name, she might
be able to remember certain details about the former VP, such as being from the Midwest
and involved in investigating national intelligence organization abuses,² and use that information
to form an accurate FOK. While spreading activation from some items may prevent
older adults from accessing relevant information, enough items are sufficiently recollected
in order to attain above-chance FOK accuracy at the level of young adults.
In contrast, the stability of episodic FOK accuracy over the lifespan is still under de-
bate. Early studies indicated no significant differences in FOK accuracy between young and
older adults either for scale (Lachman, Lachman, & Thronesbery, 1979) or binary/relative
(Butterfield, Nelson, & Peck, 1988) FOK judgments. However, a paper by Perfect and
Stollery (1993) challenged these findings by demonstrating that significant age-related dif-
ferences in memory appraisals are evident when age changes in episodic memory prevent
older adults from accessing diagnostic cues. In more recent years, a number of studies
have examined FOK accuracy in further detail and have reached conflicting conclusions.
These studies are outlined below and classified by two broad, but aptly-named schools of
thought: Experiments that support an inferential deficit hypothesis, i.e. that age-related dif-
ferences in FOK accuracy are due to diminished decision support systems in older adults,
and those that support a memory-constraint hypothesis, i.e. that age-related differences in
FOK accuracy are simply attributable to lower episodic memory strength in older adults
(Hertzog, Dunlosky, & Sinclair, 2010).
¹ Answer: Walter Mondale.
² In 1975, Mondale was part of a Senate committee that investigated abuses in the CIA, NSA, and FBI.
It was informally known as the “Church Committee” after its chair, Senator Frank Church.
Inferential Deficits via Neuropsychological Factors
The most prominent evidence of age-related differences in episodic FOK accuracy
is from research conducted by a group from the Université de Tours in France. In their
first study on the subject, Souchay, Isingrini, and Espagnet (2000) asked young and older
adults of good physical and mental health to learn and give binary (i.e. “yes”/“no”) FOKs
for 36 moderately-associated French noun pairs using the common recall-judge-recognize
paradigm (Hart, 1965; Schacter, 1983). While young adults showed moderate item-level
relationships between FOKs and later recognition (i.e. γ = 0.40), older adults’ judgments
were no more accurate than chance (i.e. γ = −0.06). Several other studies from this group
have replicated this general pattern of age-related differences in episodic, but not semantic,
FOK accuracy (Morson, Moulin, & Souchay, 2015; Perrotin, Isingrini, Souchay, Clarys,
& Taconnat, 2006; Perrotin, Tournelle, & Isingrini, 2008; Sacher, Isingrini, & Taconnat,
2013; Sacher, Landré, & Taconnat, 2015; Souchay, Moulin, Clarys, Taconnat, & Isingrini,
2007), with some connections to other metacognitive influences, such as control of study
(Souchay & Isingrini, 2004b), decisions made during study (Souchay & Isingrini, 2004a),
and the role of recollection (Souchay et al., 2007).
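The γ values reported above are Goodman-Kruskal gamma correlations, which compare every pair of items and ask whether the item with the higher FOK was also the one later recognized. The following is a minimal sketch of that computation, not the analysis code of any cited study; the function name and the six-item data set are hypothetical and are used only for illustration.

```python
from itertools import combinations


def goodman_kruskal_gamma(judgments, outcomes):
    """Gamma = (C - D) / (C + D) over all item pairs; tied pairs are ignored."""
    concordant = discordant = 0
    for (j1, o1), (j2, o2) in combinations(zip(judgments, outcomes), 2):
        product = (j1 - j2) * (o1 - o2)
        if product > 0:
            concordant += 1   # higher judgment went with the better outcome
        elif product < 0:
            discordant += 1   # higher judgment went with the worse outcome
    if concordant + discordant == 0:
        return float("nan")   # undefined when every pair is tied
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical data: FOK ratings (0-100) for six unrecalled items and
# whether each item was later recognized (1) or not (0).
foks = [80, 60, 40, 20, 70, 10]
recognized = [1, 1, 0, 0, 0, 1]
gamma = goodman_kruskal_gamma(foks, recognized)
```

With these made-up values there are five concordant and four discordant pairs, so gamma is 1/9, a weak positive relationship. A value near zero, as reported for the older adults above, means that concordant and discordant pairs occur about equally often.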
The primary conclusion from these studies is that age-related deficits in FOK accu-
racy arise from deficiencies in executive control processes, which themselves arise from
declines in frontal and medial temporal cortical areas. The authors connected these studies
to earlier research by Shimamura and Squire (1986) and Janowsky, Shimamura, and Squire
(1989), who demonstrated that patients with damage to the frontal areas of the brain, such
as those with Korsakoff syndrome or with frontal lobe lesions, show diminished FOK accu-
racy. When also considering accounts of diminished frontal lobe volume and functioning in
older adults (Dempster, 1992; Moscovitch & Winocur, 1992), one can make the reasonable
conclusion that metamemory accuracy will decline in older age, concurrent with decreases
in frontal lobe functioning. Indeed, these studies relate age differences in episodic FOK
accuracy to declines in executive functioning, such as through significant partial correla-
tions between judgment accuracy and neuropsychological measures (Perrotin et al., 2006;
Perrotin et al., 2008; Souchay et al., 2000); however, a direct link has yet to be made to
physical indicators of frontal lobe deficiencies (e.g., via neuroimaging) in older adults.
Memory Constraints via Diminished Access
The studies carried out by the Université de Tours research group, which largely support
an inferential-deficit hypothesis, have yielded some discrepant conclusions. The role
of frontal lobe functioning in general episodic memory performance is the most prominent
example, where the authors argue that age-related declines in executive functioning are
more closely related to inaccuracies in metamemory judgments than to general declines
in episodic memory (e.g., Craik et al., 1990; Rosen et al., 2002). Sacher et al. (2015)
specifically addressed this issue by using signal-detection techniques to ascertain the sep-
arate influences of overall memory performance and metamemory ability in age-related
FOK accuracy differences. The authors used a Brier score analysis in order to examine
these two influences and concluded that memory-independent processes have a significant
effect on older adults’ FOK judgments, even after accounting for overall memory perfor-
mance. Two issues call this argument into question, however: First, Sacher et al. do not
directly compare age groups using equated memory performance in order to validate the
Brier score analysis. Second, the authors argue that SDT measures, such as the meta-d′
statistic proposed by Maniscalco and Lau (2012), can estimate the separate effects of
metamemory ability and underlying episodic memory content, regardless of differences in
underlying memory performance. However, such inferences are predicated on the assump-
tion that the quality of metacognitive judgments can be normalized by stimulus sensitivity
(Fleming & Lau, 2014). This assumption is well-suited for tasks with judgments that fol-
low from first-order decisions, such as sensory and post-recognition confidence judgments,
but may not be appropriate for higher-order metacognitive judgments that are constructed
on the basis of subjective heuristics evolving from the retrieval attempt. Indeed, Mazan-
cieux, Dinze, Souchay, and Moulin (2020) hint at this possibility in a secondary analysis
in which metacognitive sensitivity in FOKs is shown to be significantly related to recall
performance, unlike RCJs, where metacognitive sensitivity was not related to recognition
memory performance.
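To make the concern about unequated memory performance concrete, the sketch below computes a plain Brier score, i.e. the mean squared difference between probability-scaled judgments and binary outcomes. It is an illustration under stated assumptions and does not reproduce the decomposition used by Sacher et al. (2015); the function name and all data values are hypothetical.

```python
def brier_score(judgments, outcomes):
    """Mean squared difference between probability judgments and 0/1 outcomes."""
    if len(judgments) != len(outcomes) or not outcomes:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((j - o) ** 2 for j, o in zip(judgments, outcomes)) / len(outcomes)


# Hypothetical learner whose FOKs separate recognized from unrecognized items.
discriminating = brier_score([0.9, 0.7, 0.3, 0.1], [1, 1, 0, 0])

# Two hypothetical learners who give the same judgment to every item (no
# item-level discrimination) but differ in recognition rate (50% vs. 20%).
flat_half = brier_score([0.5] * 4, [1, 1, 0, 0])
flat_fifth = brier_score([0.2] * 5, [1, 0, 0, 0, 0])
```

In this toy case the two "flat" learners discriminate equally poorly, yet the learner with the lower recognition rate receives the better (smaller) score, roughly 0.16 versus 0.25. This is one way to see why a comparison of age groups on such a measure is easier to interpret when underlying memory performance has been equated.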
An alternative explanation is that age-related differences in FOK accuracy stem from
differences in episodic memory, where young adults, who recall more items than older
adults in equal retention periods, are basing their judgments on stronger memory traces.
Put another way, a memory-constraint hypothesis suggests that a lack of access to diagnostic
cues is a function of age-related declines in episodic memory (Perfect & Stollery,
1993). Hertzog, Dunlosky, and Sinclair (2010) directly tested this account in a number of
different ways. They manipulated the quality of memory representations by varying the number
of item presentations (i.e. 1, 2, or 4 times) for both young and older adults and demonstrated
that FOK accuracy increases as a function of repetitions (i.e. memory strength).
Importantly, Hertzog, Dunlosky, and Sinclair introduced differential delays between study
and test in order to equate memory performance between young and older adults: across all
repetition conditions, recall performance was equated by testing young adults after a
one-week delay and older adults after a 2-day delay. Overall, the results indicate that there
are no significant effects of age in FOK accuracy when equating underlying memory strength;
thus, the lack of access to information is
a strong influence on episodic FOKs. The Hertzog group (which I am a part of) confirmed
this pattern in a recent replication (Hertzog, Curley, Castro, & Dunlosky, 2020).
Repairing Accuracy Deficits
While the memory-constraint hypothesis provides a parsimonious explanation for ex-
tant experimental findings, the account is complicated by the fact that FOK accuracy equiv-
alence between age groups is also found in studies that do not equate memory performance
between young and older adults. For example, MacLaverty and Hertzog (2009) did not
detect any significant differences in episodic FOK accuracy between age groups, despite
the fact that young adults correctly recalled 30% more target words on average than older
adults. Similarly, a study by Eakin, Hertzog, and Harris (2014) found no age-related dif-
ferences in episodic FOK accuracy for name-face pairs, despite a significant difference in
recall performance for episodic stimuli (M_YA = 0.10, M_OA = 0.03).
Given this, a related and more important issue—both for metamemory and aging as well
as the current study—is the availability of diagnostic information during FOK construction.
This theoretical stance integrates the general memory-constraint (Hertzog, Dunlosky, &
Sinclair, 2010) and multiple cue-integration (Hertzog et al., 2014) accounts by postulating
that all learners, both young and old, can provide equally-accurate FOKs if they have access
to cues that are diagnostic of future memory performance. A difference in memory trace
strength due to age-related declines in episodic memory functioning (e.g., Sacher et al.,
2015) is just one common path to diminished access to such cues; thus, age invariance in
FOK accuracy is not strictly reliant on equating recall performance across age groups.
To date, the most powerful evidence for this view comes from studies examining FOK
accuracy and non-criterial recollection, or the retrieval of information related to a target
item that is diagnostic of future memory performance, but does not directly evoke retrieval
of the target itself (Yonelinas & Jacoby, 1996). Non-criterial recollection has been specifi-
cally linked to familiarity deficits in older adults (Toth & Parks, 2006) as well as to cortical
regions that are likely to degrade with age (Diana, Yonelinas, & Ranganath, 2007), mak-
ing it a likely source of judgment accuracy in metamemory experiments. Accordingly,
Hertzog et al. (2014) showed that retrieval of information related to the study environment
(in this case, a sentence or image mediator connecting a cue word and target word pair)
significantly improved gamma correlations between scale FOK judgments and recognition
memory accuracy for unrecalled items in young adults. While the specific impact of non-
criterial recollection in repairing age-related differences in FOK accuracy is under debate
(e.g. A. K. Thomas et al., 2011), recent research from the Hertzog group has provided
evidence in favor of the hypothesis that access to non-criterial information during recall
increases FOK accuracy in both young and older adults (Hertzog, Curley, Castro, et al.,
2020; Hertzog, Fulton, & Dunlosky, 2020).
Lack of Constraints/Deficits in Implicit Interference
While research on non-criterial recollection focuses on how to increase access to in-
formation that will improve metacognitive judgment accuracy, an equally compelling line
of research (and one that is paramount to the proposed study) is understanding the factors
that decrease access to informative cues in young and older adults. While this issue has had
comparatively little coverage in the extant literature, the role of interference—a specific in-
hibitory mechanism that impairs one’s ability to remember an item that was studied with
similar items (M. C. Anderson & Neely, 1996)—has had a demonstrable role in FOK judg-
ment construction in young adults. Metcalfe, Schwartz, and Joaquim (1993), for example,
provide early experimental evidence for diminished metacognitive accuracy under interfer-
ence conditions. The authors examined FOK and TOT judgments under several proactive
interference (PI) experimental manipulations using cue and target repetition at study (i.e.,
A-B, A-B; A-B, A-B’; A-B, A-D designs). The authors found that these judgments were
related to the number of presentations of the cues, but not the targets, such that FOK judg-
ment magnitudes were significantly higher for items with repeated cues. These findings
were interpreted with respect to metamemory judgment construction rather than awareness
of PI; specifically, the authors argue that the results support FOK construction based on
cue-familiarity rather than information gleaned from a target or even access to information
overall (i.e. an accessibility account; Koriat, 1993). Maki (1999) developed this research
further by studying the effects of retroactive interference (RI) on JOLs and FOKs, where
learners reported significantly higher estimates for both types of judgments when stimuli
were repeated and their responses were semantic associates. Maki interpreted these results
as favoring the competition account of metamemory construction (Schreiber, 1998).
Importantly, Eakin and Hertzog published a set of studies examining metamemory un-
der manipulations of retrieval interference (Eakin & Hertzog, 2006, 2012a, 2012b). In their
first study, Eakin and Hertzog (2006) examined cued recall, FOK, and POK performance
between young and older adults for items that have small or large set sizes and under intra-
or extra-list cueing. Similar to previous research, set-size effects were only eliminated in
younger adults under intralist cueing. Importantly, metamemory judgment accuracy was
equivalent across age groups, indicating that all participants demonstrated sensitivity to
implicit interference effects. Differences in the relationship between POKs and recogni-
tion were significant, however; when considering all items, mean gamma correlations for
both age groups were moderate (γ ≈ 0.4), while mean gamma correlations for unrecalled
items were practically zero. This indicates that interference effects largely influence recall,
but not recognition memory, and that inaccuracies in memory judgments were related to these
interference effects during recall.
The authors also published two additional studies examining the influence of implicit
interference on FOKs in young and older adults (Eakin & Hertzog, 2012a) and immediate
JOLs in young adults only (Eakin & Hertzog, 2012b) using similar methods. Along with
replicating the set-size effects in cued recall, the researchers found that both young and
older adults’ FOKs reliably tracked recall, but not recognition (2012a), and that iJOLs
were not diagnostic of future recall performance (2012b). These studies provide further
evidence for the hypothesis that interference states are localized to retrieval during cued
recall and that only judgments closely related to the retrieval interference experience itself
are reliably diagnostic of recall performance. These studies by Eakin and Hertzog (2006,
2012a, 2012b) are some of the primary influences on the proposed project.
1.2 Memory Interference
The most surprising outcome of Eakin and Hertzog’s studies on aging and FOKs using
implicit interference (2006, 2012a) is a lack of age differences in overall judgment accu-
racy, despite the clear negative effects of extralist cueing and large set sizes. Further, the
results of the cued recall task indicate that older adults are particularly sensitive to inter-
ference effects during retrieval. In Eakin and Hertzog (2006), for example, cued recall
exhibited a three-way interaction between cue set size, cueing procedure, and age group,
such that set size effects (i.e. lower recall for items with larger set sizes) were eliminated
under intralist cueing for young adults, but not for older adults. These results are inconsis-
tent with Schreiber and Nelson’s (Schreiber, 1998; Schreiber & Nelson, 1998) competition
hypothesis, which predicts that FOKs should be negatively correlated with competition.
Under this hypothesis, FOK accuracy for older adults in Eakin and Hertzog (2006), who showed
greater effects of interference during recall, should have been significantly lower in the
conditions that encourage implicit interference, yet it was not. Given these results, one could
conclude that memory interference cannot account for differences in FOK accuracy between young and older adults,
despite significant differences in underlying memory performance.
An alternative hypothesis is that different interference paradigms yield different metamem-
ory outcomes. While implicit (Eakin & Hertzog, 2006, 2012a) and proactive (Diaz & Ben-
jamin, 2011; Maki, 1999) interference paradigms exhibit similar patterns of metamemory
performance (which I will explore in more detail later in this section), they also share many
fundamental properties that make similarities in these results probable. At the end of this
section, I will introduce an older, but potent interference paradigm, output interference,
as a potential source of FOK inaccuracy in older adults that differs significantly from the
interference paradigms previously used.
1.2.1 Proactive Interference
Proactive interference (PI) refers to the deleterious impact of irrelevant information
learned prior to the main encoding trials on memory for relevant targets (M. C. Anderson
& Neely, 1996). In standard PI paradigms, participants initially study a set of irrelevant
items (“List 1”), followed by the true set of cue-target pairs (“List 2”), and ending with
either a cued-recall or recognition memory test on the relevant items (“List 2”). Studying
List 1 items interferes with the recollection of List 2 items, resulting in decreased memory
for relevant items, particularly for cued recall and longer retention periods (Postman, Stark,
& Fraser, 1968).
PI is a particularly important topic in aging research for many different reasons. For
one, PI has been argued to be the major source of forgetting in everyday life (Underwood,
1957), where years of prior learning can interfere with memory for new information. Ad-
ditionally, PI has been shown to have greater effects on older adults, potentially reflect-
ing decreases in the ability to inhibit irrelevant information in both long term and working
memory tasks (Lustig, May, & Hasher, 2001). Indeed, leading theoretical accounts hypothesize
that increases in PI are directly related to increases in the search set evoked by the cue
word (M. C. Anderson et al., 1994; M. C. Anderson & Neely, 1996; Watkins & Watkins,
1975; Wixted & Rohrer, 1993).
Effects of PI on Metamemory
This interference paradigm has also been used in metacognitive research to help differ-
entiate between accessibility (Koriat, 1993) and competition (Schreiber, 1998; Schreiber &
Nelson, 1998) accounts of metamemory construction. A study by Maki (1999) was one of
the first to directly compare these two accounts of metamemory construction for both FOKs
and JOLs. Learners reported significantly higher estimates for both types of judgments
when stimuli were repeated and their responses were semantic associates. These results
were not limited to manipulations that only involve cue words, leading to the conclusion
that a more general metamemory judgment construction mechanism than cue-familiarity
or accessibility to the target is plausible, given interference at test. Thus, the results of
Maki’s study are consistent with Schreiber’s (1998) competition account of metamemory
judgment construction (Maki, 1999).
Research by Wahlheim (2011) provides further support for a competition account of
metamemory using a PI framework. Here, the author replicated the typical delayed JOL
effect, i.e. higher accuracy for dJOLs than iJOLs, but still found that reported dJOLs were
higher in magnitude under interference conditions. This inflation in dJOLs was attributed
to high-confidence judgments given to intrusion errors during recall. The implications of
these data are two-fold: First, dJOLs are susceptible to interference during retrieval, even
though these judgments rely on more “valid” information (i.e. traces from LTM) than do
iJOLs (i.e. noise from STM). Second, the effects of interference during the judgment
process lead learners to become over-confident. This finding challenges the conventional
monitoring-dual-memories (MDM; T. O. Nelson & Dunlosky, 1991) account, which postu-
lates that having access to information in LTM should increase judgment accuracy, regard-
less of the retrieval context. The conclusions support a theory of judgment construction
that emphasizes the role of the retrieval process in dictating the quality of information used
for a given judgment.
More recent research on metamemory and PI challenges the notion that attenuation in
accuracy estimates due to interference results from how items are themselves processed.
Diaz and Benjamin (2011) studied JOLs in conditions of PI and “release” from PI (Wick-
ens, 1973), where some cues were repeated in a block followed by novel ones. Overall
JOLs did decrease across trials in which PI was built up, but they continued to decrease
over trials, even after new cues were used. The judgments in the latter case did not follow
the increase in memory performance after new cues were presented. JOLs also decreased
equally for pairs with novel cues as well as those with repeated cues, even though recall
for pairs with novel cues did not decrease over trials. The authors argue for an account of
metacognition in which learners have a global, but not item-specific, awareness of interference.
Phenomenology of PI
The most curious aspect of interference research is that there are few studies that at-
tempt to confirm that what memory theorists and learners call “interference” is congruent
with mechanistic accounts of memory interference. (Tulving, 1989, expertly coined this as
the “doctrine of concordance”, or the oft-untested assumption in cognitive psychology that
memory functioning and experience are the same.) Metamemory research on PI, such as
that from Maki (1999), is clear that learners do not use mnemonic cues to gauge the effects
of interference and, instead, rely on what Diaz and Benjamin (2011, p. 202) refer to as
a “naive theory of memory”; however, these interpretations only partially describe
how learners integrate information to make memory judgments during PI and
do not address the experience of PI itself.
1.2.2 Implicit Interference
Interference research distinguishes between two broad categories based on the amount
of awareness learners have during the task: In explicit memory interference tasks, such as
PI, learners are generally aware of sources of disruption, e.g. memory for List A when
attempting to recall List B. In implicit interference tasks, however, learners are not consciously
aware of such disruptive sources. An example relevant to this research is the implicit
activation of associates of the cue during recall (Eakin & Hertzog, 2006).
To be clear, I am careful to distinguish between interference paradigms that rely on
implicitly-activated information and interference paradigms that examine implicit mem-
ory itself. The former describes any memory task that involves activation of competing
information that a participant is not explicitly aware of, while the latter describes interfer-
ence during implicit memory tasks, such as repetition priming and stem completion tasks
(Roediger, 1990; Schacter, 1987). This paper will not go into detail about interference
during implicit memory tasks, although it is worth mentioning that older adults are more
susceptible to interference effects in these tasks than young adults (Ikier & Hasher, 2006;
Ikier, Yang, & Hasher, 2008).
Many important demonstrations of interference from implicit sources come from Dou-
glas Nelson and colleagues (D. L. Nelson & McEvoy, 1979; D. L. Nelson, McEvoy, &
Schreiber, 1990; D. L. Nelson, McKinney, Gee, & Janczura, 1998). These studies demon-
strate that cue-set-size, or the number of associates that are implicitly co-activated when a
cue is shown at test, is an important determinant of memory performance. Specifically, the
probability of recalling an item is negatively correlated with the cue set size, i.e. targets
that are associated with cues that have a large number of semantic associates are less likely
to be recalled. One plausible explanation for Nelson et al.’s cue-set-size effect is that interference
occurs when semantic associates to a cue word are co-activated during retrieval,
creating a negative implicit effect on memory performance. This is best explained using
the Processing Implicit and Explicit Representations model (PIER2; D. L. Nelson et al.,
1998), which holds that processing a cue activates semantic associates implicitly. The level
of co-activation is thought to be positively correlated with the amount of interference that
learners experience during retrieval, although cue-set-size effects can be eliminated when
a cue and target with shared associates are studied together (i.e. intralist cueing), which
reduces the number of potential associates to those that are shared by the word pair (D. L.
Nelson & McEvoy, 1979; D. L. Nelson et al., 1990).
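The direction of the cue-set-size effect can be illustrated with a simple ratio rule, in which the chance of sampling the target equals its strength divided by the summed strength of everything the cue co-activates. This is a generic sketch and not an implementation of PIER2; the ratio rule, the function name, the strength values, and the assumption that intralist cueing leaves only two shared associates in play are all hypothetical choices made for illustration.

```python
def target_sampling_probability(target_strength, competitor_strengths):
    """Ratio rule: target strength divided by the total strength of all candidates."""
    total = target_strength + sum(competitor_strengths)
    return target_strength / total


# Hypothetical values: every co-activated associate has strength 0.2.
small_set = target_sampling_probability(1.0, [0.2] * 5)    # cue with 5 associates
large_set = target_sampling_probability(1.0, [0.2] * 20)   # cue with 20 associates

# Intralist cueing, assumed here to restrict competition to 2 shared associates.
intralist = target_sampling_probability(1.0, [0.2] * 2)
```

Under these assumed values the target is sampled about half of the time with the small set and about one fifth of the time with the large set, while restricting the competitor set raises the probability above both.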
Older adults are thought to be particularly prone to cue-set-size effects. McEvoy, Hol-
ley, and Nelson (1995), for example, found significant age-related differences in recall
performance in an extralist cueing paradigm, or a task in which the target is cued by a
previously-unseen word. The authors interpret this with respect to an inhibitory-deficit
account (Hasher & Zacks, 1988), where older adults are less able to inhibit the influence
of co-activated associates during retrieval. This interpretation is consistent with later re-
search suggesting that older adults have greater difficulty discriminating between targets
and competitors without memory training (e.g. Badham et al., 2016).
1.2.3 Output Interference
A central question to the current project is whether all types of interference give rise
to the same memorial experiences. If they do, then we would expect any interference
paradigm to give rise to competition due to co-activated associates, regardless of the
manner in which the associates are activated (i.e. implicitly or explicitly). This general
competition hypothesis (D. L. Nelson et al., 1998; Schreiber, 1998; Schreiber & Nelson,
1998) would also be expected to yield metamemory judgments that decrease in magnitude
with larger set sizes (Eakin & Hertzog, 2006, 2012a, 2012b). However, several studies
have indicated that some interference states are qualitatively different from ones such as
proactive and implicit interference. Output interference (OI), or the gradual decrease in
retrieval as a function of an item’s position in a testing sequence, is one such example,
which “violate[s] the widely held idea... that interference is initiated by competition for a
shared retrieval cue” (M. C. Anderson & Neely, 1996, p. 270).
What follows is a brief overview of important research and theories regarding OI ef-
fects, as well as computational accounts of this interference effect. Importantly, this section
will also examine how OI differs from other interference paradigms and why the phenomenological
experience of OI might give rise to metacognitive judgment patterns that
differ from general competition accounts.
Early Research
Original theories of output interference suggested that the effect is agnostic of item-
type, and that any memory task with sequential retrieval trials will exhibit decreases in
accuracy with serial position. Tulving and colleagues (Tulving & Arbuckle, 1963, 1966)
demonstrated this effect when using a cued recall task with simple noun-number paired
associates. The authors concluded that decreases in response accuracy were indicative of
a loss of information in short-term memory between study and test, particularly for items
that were tested at the end of the recall sequence. However, these interference effects
have been shown to occur in tasks that control for degradations in a short-term memory
store. Smith (1971), for example, demonstrated that giving learners a task in between study
and test to occupy their short-term memory does not eliminate output interference effects
(although the criterion task differed from those from Tulving and colleagues’ studies, i.e.
recognition). Participants showed decreased free-recall accuracy for categorized words
across sequential test trials, despite an interpolation task between learning and retrieval
procedures. Output interference must therefore be a function of trace activation in LTM,
resulting as a consequence of repeated retrieval attempts.
Figure 1.1: An overview of typical output interference effects in cued recall. Typical OI
effects include decreased accuracy rates, increased omission rates, and relatively stable
commission error rates. From Wilson et al. (2020).
Another early hypothesis regarding output interference is that the effect only exists
when targets have a shared cue. Generalizations of memory interference experimental
paradigms challenge this notion. In the study by Smith (1971), the study list consisted of
49 items, or 7 exemplars from 7 taxonomic categories. In a similar study by Roediger and
Schmidt (1980), learners studied lists of exemplars from taxonomic categories and were
cued either with only the category label or with the category label and 4 exemplars and
asked to recall the words that were presented during study. The results indicate that output
interference effects persist across these different conditions, such that recall accuracy by se-
rial position coefficients are similar between conditions in which learners are cued with just
the category label or the category label plus 4 exemplars, conditions with differing numbers
of to-be-remembered exemplars, and conditions in which the categories in the study list are
related versus unrelated. The results of these studies suggest, then, that interference effects
are not dependent upon either cue or category (“cue-independence”).
Current theoretical views of output interference do not postulate a generalized compe-
tition account; rather, the extant literature suggests that, while output interference is not
specifically dependent upon shared retrieval cues, it is dependent upon the type of informa-
tion being accessed during retrieval. Neely et al. (1983) provided an early demonstration
of this using a procedure modified from that of Roediger and Schmidt (1980). Participants
were asked to study lists of 5 category exemplars and then to choose the previously-seen
targets during a speeded yes/no recognition test. At test, the researchers manipulated im-
plicit influences on memory by presenting categorically-related lures (“primes”) prior to
certain recognition trials. The results show two distinct patterns. First, showing a related
prime prior to a yes/no recognition decreased response latencies compared to preceding the
test trial with an unrelated word, indicating that increasing semantic activation for an item
facilitates memory retrieval. Second, and most importantly, recognition response latencies
for items that were preceded by 6 related primes were significantly higher than those for
items preceded by only 2 primes. This second set of results has been key to the conception
that retrieval is a source of interference, and that accessing previously-encoded information
increases error rates across trials (M. C. Anderson & Neely, 1996; Criss et al.,
2011; Wilson et al., 2020).
Mechanisms of Output Interference
Recent empirical investigations regarding the mechanistic properties behind OI have
focused on modeling changes during memory retrieval. Simulations using the search of
associative memory (SAM; Raaijmakers & Shiffrin, 1981) model, where memory traces
are activated by co-occurring features during retrieval, have proven to be informative of
this issue. Specifically, SAM postulates that the process of retrieving an item carries the
additional benefit of encoding extra information about the item, known as learning dur-
ing retrieval (e.g., Carrier & Pashler, 1992; Roediger & Karpicke, 2006). For items that
are successfully recovered, SAM engages in a process called incrementing, in which the
associations between a cue, the learning context, and the target are strengthened.
Incrementing was instrumental in early demonstrations of interference effects in free
recall (Raaijmakers & Shiffrin, 1981) and later investigations using the retrieving effec-
tively from memory (REM; Shiffrin & Steyvers, 1997) model to investigate OI effects in
recognition memory (Criss et al., 2011; Koop et al., 2015).
A key component to retrieval processes in these simulations is one that limits the num-
ber of retrieval attempts per memory test trial. The SAM model originally used this com-
ponent, referred to as a retrieval filter, simply as an economical device to prevent the model
from continuously engaging in retrieval search. This mechanism was adapted from a com-
putational model of interference in free recall by Rundus (1973), where the number of
unsuccessful retrieval attempts is bounded by the integer parameter m and re-sampling of
previously-retrieved (and recently-activated) items causes an individual to terminate search
The concept of the retrieval filter continues to be an important tool for understand-
ing retrieval mechanisms in free recall, cued recall, and recognition (Wilson et al., 2020),
despite its innocent origins. For example, the Hypothesis Generation (HyGene) model
(R. P. Thomas et al., 2008) implements this mechanism as TMax, which corresponds to the
maximum number of retrieval failures during the hypothesis comparison process, in order
to explore the effects of WM constraints in subadditivity (Dougherty & Hunter, 2003a,
The combination of the incrementing and retrieval failure mechanisms in formal models
of episodic memory provides the basis for OI effects in cued recall: Previously-retrieved
items have increased activations and are likely to be sampled in later trials with similar
cue information (e.g., categorically-related), causing the retrieval failure count to reach its
maximum earlier in the sampling process (Wilson et al., 2020). Thus, for consecutive recall
trials in which the cues are related to each other, recall performance declines as a function
of serial position due to interference from traces activated earlier in the testing sequence.
An important point to note is that these computational accounts largely examine the degree
to which patterns of memory performance that are simulated using these parameters fit true
patterns of performance in OI tasks and do not specifically examine the role of competition
during retrieval.
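The interaction of incrementing and a retrieval-failure limit can be illustrated with a toy simulation. This is a minimal sketch and not an implementation of SAM, REM, or the model of Wilson et al. (2020): the function names, the parameter values, and the simplification that unrelated cues share no context (so that incrementing does not carry over between trials) are all assumptions made for illustration.

```python
import random


def simulate_recall(n_items=10, cue_strength=4.0, increment=2.0,
                    k_max=3, related=True, rng=None):
    """Simulate one cued-recall test; return 0/1 outcomes by output position.

    All parameters are hypothetical. k_max plays the role of the limit on
    retrieval failures per trial.
    """
    if rng is None:
        rng = random.Random()
    # Strength of each trace to the shared (category) context.
    shared = [1.0] * n_items
    outcomes = []
    for target in range(n_items):
        # Sampling weights: shared-context strength for every trace, plus the
        # cue-specific association for the trace that matches the current cue.
        weights = list(shared)
        weights[target] += cue_strength
        failures, recalled = 0, 0
        while failures < k_max:
            sampled = rng.choices(range(n_items), weights=weights)[0]
            if sampled == target:
                recalled = 1
                break
            failures += 1  # sampling any non-target trace counts as a failure
        # Incrementing: a recovered trace is strengthened to the shared context.
        # Assumption: with unrelated cues there is no shared context to strengthen.
        if recalled and related:
            shared[target] += increment
        outcomes.append(recalled)
    return outcomes


def mean_recall_by_position(n_sims=2000, seed=1, **kwargs):
    """Average recall at each output position over simulated participants."""
    rng = random.Random(seed)
    n_items = kwargs.get("n_items", 10)
    totals = [0] * n_items
    for _ in range(n_sims):
        for pos, hit in enumerate(simulate_recall(rng=rng, **kwargs)):
            totals[pos] += hit
    return [t / n_sims for t in totals]


if __name__ == "__main__":
    print("related:  ", mean_recall_by_position(related=True))
    print("unrelated:", mean_recall_by_position(related=False))
```

With related cues, mean recall in this sketch declines across output positions because previously strengthened traces are sampled more often and exhaust the failure limit sooner; with unrelated cues it stays roughly flat.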
Aging and Output Interference
At the time of writing this proposal, there has been very little research conducted on
OI effects in older adults, and certainly none from the past few decades (Kausler, 1994).
However, indirect evidence from related tasks is concordant with the idea that older adults
show greater interference effects at test. For example, Duchek (1984) demonstrated that
elderly learners show retrieval deficits related to semantic context. Older and young adults
were asked to engage in a semantic orienting task for paired associate learning in which
they were asked about a target word’s category membership. At test, individuals were
prompted with either semantic or rhyming cues and asked to recall the target word. While
both age groups demonstrated greater memory accuracy for items with semantic cues at
test, young adults’ memory for items with semantic cues was significantly greater than
that for older adults. Duchek ascribes this to a general deficit in older adults’ ability to reinstate
specific semantic contexts at test, although the extent to which this finding is dependent
upon competition during test that is influenced by semantic contexts is unclear (Kausler,
To date, only a few direct investigations regarding OI in older adults have been con-
ducted. Taub and Walker (1970) initially provided evidence for age-related differences in
interference effects. Learners were asked to recall a word from a previous list that over-
laps with a current one before fully recalling the words that are unique to a current list—a
procedure that is meant to induce interference for specific items in a list. Older adults
demonstrated significantly lower memory accuracy overall as well as qualitatively differ-
ent levels of recall over trials, which supports the hypothesis of an age difference in OI
Figure 1.2: Results from Smith (1975) regarding recall performance for young (“Gp. 1”),
middle (“Gp. 2”), and older (“Gp. 3”) adults. Output interference effects for recall as a
function of serial position at test (“OUTPUT”) are invariant across age groups.
The most direct test of the effects of aging on OI by Smith (1975), however, does not
support this general conclusion. In his study, Smith carefully examined the separate in-
fluences of input (STM) and output (LTM) interference by pairing study order with the
subsequent test order in a factorial design. Participants from three age groups (20-39, 40-
50, and 60-80 years of age) were asked to study 8 exemplars from 9 normative categories
(Battig & Montague, 1969). While there were no significant trends related to input posi-
tion, each of the 3 age groups demonstrated similar output interference effects during test,
despite having significantly different levels of free recall performance (Figure 1.2). Ad-
ditionally, rates of omission and commission errors were similar across the 3 age groups,
suggesting a lack of qualitative differences in recall outcomes.
Phenomenology of Output Interference
A major gap in the recent OI literature is the lack of an explanation for how OI occurs
and how it relates to actual subjective memory experiences. If OI is indeed a ubiquitous
experience in everyday life (M. C. Anderson & Neely, 1996), then examining the “feeling”
of interference during OI tasks should provide converging evidence for the mechanisms
that have been hypothesized to give rise to OI (e.g., Raaijmakers & Shiffrin, 1981; Rundus,
1973; Wilson et al., 2020). Unfortunately, there does not appear to be any such research in
the extant literature.
The experimental paradigm that is most informative of how to connect the mechanisms
and experiences of OI is retrieval-induced forgetting (RIF; M. C. Anderson et al., 1994).
Here, participants study categorized lists of words that they will either see again (“Retrieval
practice”, or Rp, items) or not (“No retrieval practice”, or Nrp, items). Importantly, specific
words within the Rp list are not shown again for restudy (Rp-), and memory for these items
is examined against items that were designated to be shown again (Rp+). While the RIF
paradigm is not the same as OI, the results are similar: Retrieving items that were restudied
earlier (Rp+) decreases the probability of retrieving related, but once-seen items (Rp-). This
general finding in RIF has been shown to be independent of the cue word (M. C. Anderson,
Bjork, & Bjork, 2000) and number of exposures to items in the initial study phase (Hulbert,
Shivde, & Anderson, 2012).
M. C. Anderson and Neely (1996) described two particular theoretical accounts of RIF
that were influential in later research: One that implicates the roles of inhibition and com-
petition (which I will refer to as the competition account), and a second that argues that
recently-activated items “block” the retrieval of related items later in the retrieval sequence
(which I will refer to as the retrieval-suppression account; cf. Bäuml, 1998). Anderson
and colleagues favor the former account and argue that RIF is a byproduct of spreading
activation to competitors and the over-activation of related items that have been recently
retrieved. These items compete for access and require the learner to engage in a selective
retrieval process that ignores highly-activated non-target competitors in favor of the more
weakly-activated target. This is referred to as “response override” (M. C. Anderson &
Levy, 2007, p. 81) and has been theorized to be directly related to executive functioning,
i.e. lower inhibitory abilities result in failures to engage in a response override (M. C. An-
derson, 2003; M. C. Anderson et al., 2000; M. C. Anderson & Neely, 1996; Hulbert et al.,
The competition account cannot fully account for RIF and OI results, however. For
one, interference effects are dependent upon the pre-experimental associative strengths of
the cues and targets. In Anderson and colleagues’ account, the source of interference arises
solely from the buildup of related items that increased in activation due to previous
retrieval attempts. Here, the strengthening of the category cue to item associations during
retrieval should produce increased interference from these items to other items with the
same cue, regardless of the pre-experimental associative strength. This is not strictly the
case, however; Bäuml (1998) demonstrated that retrieving target words with moderate
normative associativeness to a cue word produced output interference only for items that were
strongly associated with the cue word and not for those that were weakly associated with
the cue. This was later replicated in a RIF study by Williams and Zacks (2001).
Importantly, the competition account has been argued to be insufficient because of a
lack of specific evidence for the role of inhibition in RIF. Anderson and colleagues posit
that retrieval interference occurs as a result of active inhibition of co-activated, but compet-
ing representations early in a retrieval sequence. Later in the sequence, when these items
are no longer competitors to ignore, but are the correct targets, they continue to be down-
graded as a result of this early inhibition. Experimental evidence does not support a role
for active inhibition, though; in a study by Williams and Zacks (2001), for example, final
memory accuracy for retrieval-practice items that were not studied (Rp-) was not statistically
different from that for items that did not undergo retrieval practice at all (Nrp). If there were
an active response override (M. C. Anderson & Levy, 2007), then the representations for
Rp- items later in a retrieval sequence would be down-regulated as a result and would have
a lower chance of being retrieved than Nrp items. Williams and Zacks (2001) concluded
that RIF is a function of non-inhibitory processes.
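The comparison at issue can be made concrete with a small sketch. It assumes the conventional scoring in which the RIF effect is the difference between final recall of Nrp (baseline) items and Rp- items; the function names and the recall outcomes below are hypothetical and are not data from any of the cited studies.

```python
def recall_proportion(outcomes):
    """Proportion of items recalled, given a list of 0/1 outcomes."""
    return sum(outcomes) / len(outcomes)


def rif_effect(final_recall):
    """Return Nrp recall minus Rp- recall.

    Positive values indicate that Rp- items were recalled less often than
    the Nrp baseline; values near zero indicate no retrieval-induced forgetting.
    """
    return (recall_proportion(final_recall["Nrp"])
            - recall_proportion(final_recall["Rp-"]))


# Hypothetical final-test outcomes (1 = recalled, 0 = not recalled).
example = {
    "Rp+": [1, 1, 1, 0, 1, 1, 1, 1],
    "Rp-": [1, 0, 0, 1, 0, 1, 0, 1],
    "Nrp": [1, 1, 0, 1, 0, 1, 1, 0],
}

if __name__ == "__main__":
    print(rif_effect(example))
```

Under an inhibitory account this difference should be reliably positive; a difference that cannot be distinguished from zero is the pattern that Williams and Zacks (2001) are described as reporting.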
Further evidence against a competition account comes from a study by Aslan et al.
(2007), who examined RIF in older adults. The null hypothesis (i.e. a competition account)
holds that older adults have difficulties preventing irrelevant information from entering
into WM (Hasher et al., 1991; Hasher & Zacks, 1988; Hasher et al., 1999). If inhibitory
processes have an active role in retrieval interference, then memory performance between
older and young adults should be qualitatively different. However, similar to OI effects in
the study by Smith (1975), there were no interaction effects to suggest that young and older
adults had significantly different RIF results (although older adults recalled fewer items on
average; Aslan et al., 2007). The authors argue that the results indicate that inhibitory
deficits in older adults are task-dependent; however, given the reliability of the inhibitory-
deficit account and the results of previous RIF studies, these results also argue against an
inhibitory account of aging and RIF altogether.
In contrast to an inhibitory account of interference, the retrieval-suppression account of
OI assumes that activation of previously-retrieved items precludes future target items from
entering awareness, rather than reflecting an intentional exclusion (i.e. inhibition) that occurred earlier
in the retrieval sequence. The largest difference in this account is that learners are unable
to recall the identity of a target item precluded from memory, whereas inhibited items can
be recalled, but learners are unable to distinguish the context in which it was first studied.
M. C. Anderson and Neely (1996) provide an example of metamemorial awareness during such
an occlusion of a target item:
Sometimes our ability to recall our current parking location seems blocked by
the intrusion of similar episodes. When this occurs, we often feel confident
that we know where we parked, but that recall of the location demands that we
penetrate through memories that get in the way. (pp. 237-238)
This conception of interference is largely informed by early memory research on blocking
(Wickelgren, 1976) and response suppression (Postman et al., 1968) in interference. Un-
fortunately, the type of evidence needed to empirically test a suppression/blocking account
of interference is not available in a standard interference paradigm; therefore, studies that
reference this hypothesis are limited to these early experiments.
1.3 Overview of the Current Experiment
Previous research has reliably demonstrated that buildup of competing semantic asso-
ciates over successive cued recall attempts results in decreases in cued recall performance
over successive trials (cf. Tulving & Arbuckle, 1963, 1966). This has been demonstrated
in recognition (e.g. M. C. Anderson & Neely, 1996; Criss et al., 2011) and recall (e.g.
Wilson et al., 2020) in younger adults, free recall in older adults (Smith, 1975), and was
replicated in cued recall for young adults in a pilot study for this experiment. However,
hypotheses regarding the phenomenon of output interference have yet to be empirically
validated against metamemory outcomes.
Of particular interest are both a) the specific question of whether metamemory accuracy
during interference changes with age (e.g. Eakin & Hertzog, 2006, 2012a, 2012b) and b)
the broader question of how the retrieval operations that underlie OI might be uncovered
through learners’ experiences during cued recall. This study explores these questions using
a variation of the canonical Recall-Judge-Recognize procedure (Hart, 1965): Individuals
were asked to study either completely unrelated cue and target word pairs (i.e. control
items) or word pairs in which the cue and target are unrelated to each other, but
cue words are related to each other via taxonomic categories (i.e. “interference” items).
After a short distractor task, individuals were asked to recall the target words that were
co-presented with the cues during study. During each cued-recall trial, individuals rated a)
whether they “Remember”, “Know”, or have “No Memory” of the target word (i.e. R/K/N judgments) and b) how
confident they are that they will be able to recognize the correct target if they saw it on
the screen (i.e. FOK judgments). In the final section of the experiment, individuals were
asked to recognize the correct target word when it was co-presented with the cue and 3
distractors. After each 4AFC recognition trial, participants rated their confidence that they
chose the correct target (i.e. RCJ).
Examining metamemory in the context of aging and OI will help to resolve a number