Memory for scripts in young and older adults. Memory & Cognition, 11, 435-444
October 1983


Patricia A. Anderson
This study examined the question of whether young and older adults differ in their representation or utilization of the generic knowledge contained in scripts. In Experiment 1, young and older adults generated scripts for routine daily activities, such as grocery shopping, going to the doctor, and writing a letter to a friend. No evidence was found for age-related differences in the way that stereotypical action sequences are represented in semantic memory. In Experiment 2, young adults were found to recall and recognize new instantiations of scripts better than did older adults. However, adults in both age groups displayed similar effects of action typicality on retention, suggesting that there are no age-related differences in drawing inferences from generic knowledge. The implications of these findings for processing-resource hypotheses about memory and aging are discussed.

Williams, J. M., Ellis, N. C., Tyers, C., Healy, H., Rose, G. & MacLeod, A. K. The specificity of autobiographical memory and imageability of the future. Mem. Cognit. 24, 116-125
February 1996


J. Mark G. Williams
Nick C. Ellis
Three studies examined whether the specificity with which people retrieve episodes from their past determines the specificity with which they imagine the future. In the first study, suicidal patients and nondepressed controls generated autobiographical events and possible future events in response to cues. Suicidal subjects' memory and future responses were more generic, and specificity level for the past and the future was significantly correlated for both groups. In the second and third studies, the effect of experimental manipulation of retrieval style was examined by instructing subjects to retrieve specific events or summaries of events from their past (Experiment 2) or by giving high- or low-imageable words to cue memories (Experiment 3). Results showed that induction of a generic retrieval style reduced the specificity of images of the future. It is suggested that the association between memory retrieval and future imaging arises because the intermediate descriptions used in searching autobiographical memory are also used to generate images of possible events in the future.

Figure 1. Proportions of participants using various time anchors in reporting when they heard the news in the September group as a function of when they completed the questionnaire. The 5-min and 1-min divisions of the clock are combined.  
Autobiographical memories for the September 11th attacks: Reconstructive errors and emotional impairment of memory

May 2004


College students were asked about their personal memories from September 11, 2001. Consistency in reported features over a 2-month period increased as the delay between the initial test and 9/11 increased. Central features (e.g., Where were you?) were reported with greater consistency than were peripheral features (What were you wearing?) but also contained a larger proportion of reconstructive errors. In addition, highly emotional participants demonstrated poor prospective memory and relatively inconsistent memory for peripheral details, when compared with less emotional participants. Highly emotional participants were also more likely to increase the specificity of their responses over time but did not exhibit greater consistency for central details than did less emotional participants. The results demonstrated reconstructive processes in the memory for a highly consequential and emotional event and emotional impairment of memory processing of incidental details.

The mirror effect in recognition memory. Memory & Cognition, 13, 8-20

February 1985


The mirror effect in recognition memory refers to the fact that, with several different classes of stimuli, performance on new items from each class mirrors (is correlated with) performance on the corresponding classes of old items. Classes of stimuli that are accurately recognized as old when old are also accurately recognized as new when new; those that are poorly recognized as old when old are also poorly recognized as new when new. The statement above is shown not to be a tautology. A survey demonstrates that the effect holds for several types of variables (ways to classify stimuli)—word frequency, concreteness, meaningfulness, and others. The survey includes a total of 80 findings. The theoretical implications of the effect are considered.
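
As a hedged illustration of the regularity described above, the sketch below checks whether two stimulus classes show a mirror pattern: the class with the higher hit rate (old items called old) should also have the lower false-alarm rate (new items called old). The function name `shows_mirror_pattern` and the rates are hypothetical values chosen for illustration, not data from the survey.

```python
def shows_mirror_pattern(class_a, class_b):
    """Return True if class_a is better recognized than class_b on both
    old items (higher hit rate) and new items (lower false-alarm rate).

    Each argument is a (hit_rate, false_alarm_rate) pair.
    """
    hit_a, fa_a = class_a
    hit_b, fa_b = class_b
    return hit_a > hit_b and fa_a < fa_b


# Hypothetical rates, for illustration only.
class_a = (0.80, 0.15)   # well-recognized class
class_b = (0.70, 0.25)   # poorly recognized class

print(shows_mirror_pattern(class_a, class_b))
```

If the hit rates differed but the false-alarm rates were equal or reversed, the function would return False; this is why the statement in the abstract is an empirical claim rather than a tautology.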

The similarity and diversity of semantic relations. Memory and Cognition, 12, 134-141

April 1984


There is a rich variety of semantic relations in natural languages. Subjects’ perceptions of similarities among relations were studied for a wider variety of relations than had been used in previous studies. Forty subjects sorted 31 cards bearing five example pairs of each of 31 semantic relations. Subjects were able both to distinguish the relations and to perceive their similarities. A hierarchical clustering analysis of the sorting data indicated that the subjects perceived five families of semantic relations (contrasts, class inclusion, similars, case relations, and part-wholes). The five families were distinguished in terms of three properties of semantic relations: contrasting/noncontrasting, logical/pragmatic, and inclusion/noninclusion. Within each family, relations also were sorted in ways consistent with their defining properties. Relations were therefore viewed not as unanalyzable primitives, but in terms of the relational properties that distinguished them.

Figure 1. Distributions of (A) age and (B) education in the Dutch sample. Both are compared with the distribution of both variables in the general population of the Netherlands (source: In accordance with statistical convention in the Netherlands, education was classified by highest attained educational grade.
Figure 10. Two-year retention curves for the 4AFC and the open questions for the Dutch sample (Experiment 3). Continuous lines correspond to the fits of a two-store MCM with decline parameters shared by 4AFC and open questions, with parameter values fitted only on the first year of retention. 
Remembering the news: Modeling retention data from a study with 14,000 participants

July 2005


A retention study is presented in which participants answered questions about news events, with a retention interval that varied within participants between 1 day and 2 years. The study involved more than 14,000 participants and around 500,000 data points. The data were analyzed separately for participants who answered questions in Dutch or in English, providing an opportunity for replication. We fitted models of varying complexity to the data in order to test several hypotheses concerning retention. Evidence for an asymptote in retention was found in only one data set, and participants with greater media exposure displayed a higher degree of learning but no difference in forgetting. Thus, forgetting was independent of initial learning. Older adults were found to have forgetting curves similar to those of younger adults.

Gardiner, J.M. & Java, R.I. Forgetting in recognition memory with and without recollective experience. Mem. Cognit. 19, 617-623

December 1991


Retention interval was manipulated in two recognition-memory experiments in which subjects indicated when recognizing a word whether its recognition was accompanied by some recollective experience ("remember") or whether it was recognized on the basis of familiarity without any recollective experience ("know"). Experiment 1 showed that between 10 min and 1 week, "remember" responses declined sharply from an initially higher level, whereas "know" responses remained relatively unchanged. Experiment 2 showed that between 1 week and 6 months, both kinds of responses declined at a similar, gradual rate and that despite quite low levels of performance after 6 months, both kinds of responses still gave rise to accurate discrimination between target words and lures. These findings are discussed in relationship to current ideas about multiple memory systems and processing accounts of explicit and implicit measures of retention.

Erratum to: Citation rates for experimental psychology articles published between 1950 and 2004: Top-cited articles in behavioral cognitive psychology

May 2012


From citation rates for over 85,000 articles published between 1950 and 2004 in 56 psychology journals, we identified a total of 500 behavioral cognitive psychology articles that ranked in the top 0.6% in each half-decade, in terms of their mean citations per year using the Web of Science. Thirty-nine of these articles were produced by 78 authors who authored three or more of them, and more than half were published by only five journals. The mean number of cites per year and the total number of citations necessary for an article to achieve various percentile rankings are reported for each journal. The mean number of citations necessary for an article published within each half-decade to rank at any given percentile has steadily increased from 1950 to 2004. Of the articles that we surveyed, 11% had zero total citations, and 35% received fewer than four total citations. Citations for post-1994 articles ranking in the 50th-75th and 90th-95th percentiles have generally continued to grow across each of their 3-year postpublication bins. For pre-1995 articles ranking in the 50th-75th and 90th-95th percentiles, citations peaked in the 4- to 6- or 7- to 9-year postpublication bins and decreased linearly thereafter, until asymptoting. In contrast, for the top-500 articles, (a) for pre-1980 articles, citations grew and peaked in the 10- to 18-year postpublication bins, and after a slight decrease began to linearly increase again; (b) for post-1979 articles, citations have continually increased across years in a nearly linear fashion. We also report changes in topics covered by the top-cited articles over the decades.

Comparing modes of rule-based classification learning: A replication and extension of Shepard, Hovland, and Jenkins (1961)

May 1994


We partially replicate and extend Shepard, Hovland, and Jenkins's (1961) classic study of task difficulty for learning six fundamental types of rule-based categorization problems. Our main results mirrored those of Shepard et al., with the ordering of task difficulty being the same as in the original study. A much richer data set was collected, however, which enabled the generation of block-by-block learning curves suitable for quantitative fitting. Four current computational models of classification learning were fitted to the learning data: ALCOVE (Kruschke, 1992), the rational model (Anderson, 1991), the configural-cue model (Gluck & Bower, 1988b), and an extended version of the configural-cue model with dimensionalized, adaptive learning rate mechanisms. Although all of the models captured important qualitative aspects of the learning data, ALCOVE provided the best overall quantitative fit. The results suggest the need to incorporate some form of selective attention to dimensions in category-learning models based on stimulus generalization and cue conditioning.

Implicit serial learning: Questions inspired by Hebb (1961)

December 1993


Implicit serial learning occurs when indirect measures such as transfer reveal learning of a repeating sequence even when subjects are not informed of the repeating sequence, are not asked to learn it, and do not become aware of it. This phenomenon is reminiscent of an experiment by Hebb (1961), who studied the repetition of sequences in a serial recall task. Two experiments investigated the relation between implicit serial learning and ideas about learning forwarded by Hebb and others who used his method. The experiments showed that implicit serial learning occurs even when the repeating sequence is intermixed with randomly generated sequences instead of being repeated continuously, that the organization of the sequence into regularly or irregularly grouped subsequences determines the extent of learning, and that the repetition effect observed does not depend on subjects' ability to recognize the repetition.

Strong cues are not necessarily weak: Thomson and Tulving (1970) and the encoding specificity principle revisited

January 2002


Performance on tests in which there is control over reporting (e.g., cued recall with the option to withhold responses) can be characterized by four parameters: free- and forced-report retrieval (correct responses retrieved from memory when the option to withhold responses is exercised and when it is not, respectively), monitoring (discrimination between correct and incorrect potential responses), and report bias (willingness to report responses). Typically, researchers do not examine all these components in cued-test performance; blanks are sometimes counted the same as errors, meaning that the (free-report) performance index is contaminated with report bias and monitoring ability. In this research, a two-stage testing procedure is described that allows measures of free- and forced-report retrieval, monitoring, and bias to be derived from the original encoding specificity experiments (Thomson & Tulving, 1970). The results show that their cue-reinstatement manipulation affected free-report retrieval, but once report bias and monitoring effects were removed by forcing output, retrieval was unaffected.

Identifying exceptions in a database of recognition failure studies from 1973 to 1992

June 1993


This paper presents a database of all published studies based on the recognition failure paradigm, which involves the study of pairs of items followed by a recognition test of the second item of each pair and a recall test of the same target item with the first item of each pair provided as a context cue. The paper also identifies, on the basis of a quantitative analysis, exceptions to the recognition failure function encompassing most data in the database. The database includes reference information about each study and a short description of materials and the manipulations made in each of the 302 experimental conditions reported. The database also includes information about the total number of observations for each condition, the overall hit rate in free or forced choice recognition, the overall probability of recall, the observed probability of recognition given recall, the predicted probability of recognition given recall, the difference between observed and predicted values, and the critical ratio between these difference scores and their overall standard deviation.

Lateral inhibition and echoic memory: Some comments on Crowder's (1978) theory

June 1982


Crowder (1978) has proposed a theory of the suffix effect based on lateral inhibition among echoic representations of the list and suffix items. The theory was prompted by, and derives its principal support from, the counterintuitive finding that the effect is smaller with multiple suffixes than with a conventional single suffix. In this paper, we describe four experiments, each of which fails to replicate this finding. In addition, we note a prediction of the theory and show that it is contrary to available evidence. It is argued that the details of the suffix effect are too complex to be captured by a theory of peripheral mechanism, even one as ingenious as Crowder’s.

Is there implicit memory without attention? A reexamination of task demands in Eich's (1984) procedure

December 1997


The relation between memory and attention has been of long-standing interest. Eich (1984) made an important discovery of implicit but not explicit memory for contextually determined homophones (e.g., taxi-FARE) presented in a channel to be ignored within a selective listening procedure. However, his slow rate of presentation of shadowing task materials may have allowed frequent attention shifts to the allegedly ignored channel. With a direct replication of Eich's timing parameters, we reproduced his results, but when the attended channel was presented twice as fast as Eich's, implicit memory for the to-be-ignored words vanished. Our results contradict claims of extensive semantic processing of unattended auditory information in this task.

Detectionless processing with semantic activation? A footnote to Greenwald, Klinger, and Liu (1989)

August 1990


Recently published research has suggested that, in a pattern masking task, semantic activation caused by the target may continue to exist even though subjects cannot detect the target. The experiments are reassessed as an exceptional case of the more general rule that subjects are able to use residual semantic activation to actually detect targets. Furthermore, residual graphic information is far less effective at supporting near-chance target detections.

Parts and the basic level in natural categories and artificial stimuli: Comments on Murphy (1991)

October 1991


Natural taxonomies consist of categories that vary in level of abstraction. Categories at the basic level, such as chair and apple, are preferred in a broad range of situations (Rosch, Mervis, Gray, Johnson, & Boyes-Braem, 1976). Several studies have revealed qualitative differences between the basic level and other levels. For example, Tversky and Hemenway (1984) presented evidence that parts proliferate at the basic level; they proposed that parts link the appearance of category members with their functions. Although not taking issue with these findings, Murphy (1991) investigated whether parts are necessary or sufficient for a basic level. In an attempt to demonstrate that parts are not necessary, Murphy used artificial stimuli that did not capture the essential features of natural taxonomies. These discrepancies preclude any conclusions based on his studies. Murphy's data also do not support his claim that parts are not sufficient for a basic level. Finally, it is unlikely that pursuing questions of necessity or sufficiency will produce insights into human categorization.

Causal and conditional inferences: A comment on Cummins (1995)

June 1997


Cummins (1995) offers an analysis of causal and truth-functional sufficiency and necessity to predict and explain the effects on conditional inferences of two pragmatic factors: alternative causes and disabling conditions. However, the justification of these predictions is inconsistent. This note offers a modified analysis which puts her predictions on a sounder base: it is proposed that alternative causes and disabling conditions affect judgments of argument validity under three different models for the causal conditional.

Table 1 Percentage of Individuals at Each Education Level 
Table 3 Mean Ratings and Standard Deviations for Young Adults, Older Adults, and the Web-Based Sample for the Same Set of Words 
Table 6 Simultaneous Regression Analyses for Subjective Frequency Estimates and Toglia and Battig's (1978) Familiarity Estimates as a Function of Subject Group (columns: Predictors, Beta, t value, p <, Semipartial)
Subjective frequency estimates for 2,938 monosyllabic words

July 2001


Subjective frequency estimates for a large sample of monosyllabic English words were collected from 574 young adults (undergraduate students) and from a separate group of 1,590 adults of varying ages and educational backgrounds. Estimates from the latter group were collected via the internet. In addition, 90 healthy older adults provided estimates for a random sample of 480 of these words. All groups rated words with respect to the estimated frequency of encounters of each word on a 7-point scale, ranging from never encountered to encountered several times a day. The young and older groups also rated each word with respect to the frequency of encounters in different perceptual domains (e.g., reading, hearing, writing, or speaking). The results of regression analyses indicated that objective log frequency and meaningfulness accounted for most of the variance in subjective frequency estimates, whereas neighborhood size accounted for the least amount of variance in the ratings. The predictive power of log frequency and meaningfulness was dependent on the level of subjective frequency estimates. Meaningfulness was a better predictor of subjective frequency for uncommon words, whereas log frequency was a better predictor of subjective frequency for common words. Our discussion focuses on the utility of subjective frequency estimates compared with other estimates of familiarity. The raw subjective frequency data for all words are available at

Are age-of-acquisition effects on object naming due simply to differences in object recognition? Comments on Levelt (2002)

August 2006


Levelt (2002) argued that apparent effects of word frequency and age of acquisition (AoA) reported in recent picture naming studies might actually be confounded effects operating at the level of object recognition, rather than relevant to theories of lexical retrieval. In order to investigate this issue, AoA effects were examined in an object recognition memory task (Experiments 1 and 2) and a word-picture verification task (Experiment 3) and compared with those found in naming tasks using the same pictures. Contrary to Levelt's concerns, the results of the three experiments show that the AoA effect on picture naming has a lexical origin and does not simply result from a possible confound of object identification times.

Fig. 2 Proportion of correct responses as a function of block, recall direction, and phonological similarity in Experiment 1B. Error bars represent 95% confidence intervals
Table 3 Analyses of variance for the combined analyses in Experiments 1-4 
Revisiting backward recall and benchmark memory effects: A reply to Bireta et al. (2010)

November 2011


When participants are asked to recall lists of items in the reverse order, known as backward recall, several benchmark memory phenomena, such as the word length effect, are abolished (Bireta et al. Memory & Cognition 38:279-291, 2010). Bireta et al. (Memory & Cognition 38:279-291, 2010) suggested that in backward recall, reliance on order retention is increased at the expense of item retention, leading to the abolition of item-based phenomena. In a subsequent study, however, Guérard and Saint-Aubin (in press) showed that four lexical factors known to modulate item retention were unaffected by recall direction. In a series of five experiments, we examined the source of the discrepancy between the two studies. We revisited the effects of phonological similarity, word length, articulatory suppression, and irrelevant speech, using open and closed pools of words in backward and forward recall. The results are unequivocal in showing that none of these effects are influenced by recall direction, suggesting that Bireta et al.'s (Memory & Cognition 38:279-291, 2010) results are the consequence of their particular stimuli.

On the interpretation of removable interactions: A survey of the field 33 years after Loftus

November 2011


In a classic 1978 Memory & Cognition article, Geoff Loftus explained why noncrossover interactions are removable. These removable interactions are tied to the scale of measurement for the dependent variable and therefore do not allow unambiguous conclusions about latent psychological processes. In the present article, we present concrete examples of how this insight helps prevent experimental psychologists from drawing incorrect conclusions about the effects of forgetting and aging. In addition, we extend the Loftus classification scheme for interactions to include those on the cusp between removable and nonremovable. Finally, we use various methods (i.e., a study of citation histories, a questionnaire for psychology students and faculty members, an analysis of statistical textbooks, and a review of articles published in the 2008 issue of Psychology and Aging) to show that experimental psychologists have remained generally unaware of the concept of removable interactions. We conclude that there is more to interactions in a 2 × 2 design than meets the eye.
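
A minimal sketch of what "removable" means here, using hypothetical cell means for a 2 × 2 design: the interaction contrast (a difference of differences) is nonzero on the raw scale but vanishes after a monotone (logarithmic) transformation of the dependent variable. The function name `interaction_contrast` and the cell values are assumptions made for illustration only.

```python
import math


def interaction_contrast(cells):
    """Difference of differences for a 2 x 2 table of cell means,
    laid out as [[a1b1, a1b2], [a2b1, a2b2]]."""
    (a1b1, a1b2), (a2b1, a2b2) = cells
    return (a2b2 - a2b1) - (a1b2 - a1b1)


# Hypothetical cell means: a noncrossover (ordinal) interaction.
raw = [[10.0, 20.0],
       [20.0, 40.0]]

# A monotone rescaling of the same dependent variable.
logged = [[math.log(x) for x in row] for row in raw]

print("raw scale:", interaction_contrast(raw))       # nonzero interaction
print("log scale:", interaction_contrast(logged))    # approximately zero
```

Because the effect of factor B is multiplicative (a doubling) at both levels of factor A, the interaction appears only on the raw scale, so its presence depends on the measurement scale rather than on the latent process.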

Figure 1. Learning curves. The four plotted curves show item scoring for free recall with varied presentation order (FR–varied [item]), free recall with constant presentation order (FR–constant [item]), and serial recall (SR [item]) conditions, and relative order scoring for serial recall (SR [order]).  
Figure 3. Associative contiguity effects. The left panels (A–C) show the conditional response probability as a function of lag (lag–CRP) across each of the five study–test trials (columns 1–5) and for each of the three conditions: A1–A5, free recall with varied presentation order (FR–varied); B1–B5, free recall with constant presentation order (FR–constant); C1–C5, serial recall (SR). To further quantify the changes in contiguity effects across trials and conditions, we fit the power function $\mathrm{CRP}(\mathrm{lag}) = a\,|\mathrm{lag}|^{-b}$ to individual participant lag–CRP functions. The rightmost panels (D–F) show the mean power function exponents, b, as a function of trial number for each of the three conditions: D, FR–varied; E, FR–constant; F, SR. Error bars represent 95% confidence intervals.
Figure 2. Serial position curves. (A) First-trial serial position curves for each of the three conditions: free recall with varied presentation order (FR–varied), free recall with constant presentation order (FR–constant), and serial recall (SR). (B) Second-trial serial position curves. (C) Third-trial serial position curves. (D) Fourth-trial serial position curves. (E) Fifth-trial serial position curves. In all cases, serial position curves are based on item scoring (see the text for details).  
A comparative analysis of serial and free recall. Memory & Cognition, 33(5), 833-839

August 2005


Multitrial free and serial recall tasks differ both in recall instruction and in presentation order across trials. Waugh (1961) compared these paradigms with an intermediate condition: free recall with constant presentation order. She concluded that differences between free and serial recall were due only to recall instructions, and not to presentation order. The present study reevaluated the relation between free and serial recall, using Waugh's three conditions. By examining recall transitions and the organization of information retained across trials, we conclude that presentation order is an important factor, causing participants to exhibit the same temporal associations in serial recall and in free recall with constant presentation order.
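
The recall-transition analysis can be illustrated with a simplified sketch of a lag-conditional response probability (lag-CRP) computation for a single recall sequence. For each transition, the lag is the difference in serial position between successive recalls, and each observed lag is divided by the number of times that lag was still available. The function name `lag_crp`, its arguments, and the handling of repeats and intrusions are assumptions for illustration, not the authors' procedure.

```python
from collections import Counter


def lag_crp(recalls, list_length):
    """Return {lag: probability} for one sequence of recalled serial
    positions (1-based). Repeats and out-of-list items are skipped, and
    the transitions into and out of them are not counted."""
    actual = Counter()     # lags of transitions that were made
    possible = Counter()   # lags that could have been made
    recalled = set()
    prev = None
    for pos in recalls:
        if not (1 <= pos <= list_length) or pos in recalled:
            prev = None    # break the chain at an invalid response
            continue
        if prev is not None:
            actual[pos - prev] += 1
            # Every not-yet-recalled position was an available target.
            for cand in range(1, list_length + 1):
                if cand not in recalled:
                    possible[cand - prev] += 1
        recalled.add(pos)
        prev = pos
    return {lag: actual[lag] / possible[lag] for lag in possible}


# Perfect forward recall of a 3-item list: every transition has lag +1.
print(lag_crp([1, 2, 3], list_length=3))
```

In practice the counts would be pooled over lists and trials before dividing, and the resulting curve could then be summarized by a fitted function as described in the Figure 3 caption.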

Figure 2. Experiment 3 replicated Experiment 2 but used Swahili–English word pairs instead of obscure facts. Having three test/study opportunities for Swahili–English word pairs enhanced the degree of learning compared with having three study opportunities, but this effect was not significant. Having three test/study opportunities for Swahili–English word pairs significantly reduced the rate of forgetting compared with having three study opportunities. The smooth curves represent the mean of the 44 individual subjects' forgetting curves.
Figure 1. Subjects were given a test with feedback (test/study) or a restudy opportunity (study) for each fact. Recall of these facts was tested after 5 min or 1, 2, 7, 14, or 42 days; in Experiment 1, recall was tested following just one test/study or one study opportunity (results shown in panel A); in Experiment 2, it was tested following three test/study or three study opportunities (results shown in panel B). The points represent the average proportion of facts recalled from test/study versus study at each of the six retention intervals. The power function $y = a(bt + 1)^{-c}$ was fit to each subject's data to yield a degree-of-learning parameter and a rate-of-forgetting parameter. Having just one test/study opportunity increased the degree of learning and reduced the rate of forgetting over having just one study opportunity (A), but these effects did not reach significance. Having three test/study opportunities significantly increased the degree of learning and significantly reduced the rate of forgetting over having three study opportunities (B). The smooth curves represent the mean of the 55 individual subjects' forgetting curves in Experiment 1 and the 57 individual subjects' forgetting curves in Experiment 2. In all three experiments, the curve-fitting procedure produced a few extreme parameter estimates for degree of learning and rate of forgetting. These extreme values did not affect the visual display of the graphs, but they did affect the mean parameter estimates. The parameter estimates in the equations, therefore, are medians rather than means.
The effects of tests on learning and forgetting. Memory & Cognition, 36, 438-448

April 2008


In three experiments, we investigated whether memory tests enhance learning and reduce forgetting more than additional study opportunities do. Subjects learned obscure facts (Experiments 1 and 2) or Swahili-English word pairs (Experiment 3) by either completing a test with feedback (test/study) or receiving an additional study opportunity (study). Recall was tested after 5 min or 1, 2, 7, 14, or 42 days. We explored forgetting by means of an ANOVA and also by fitting a power function to the data. In all three experiments, testing enhanced overall recall more than restudying did. According to the power function, in two out of three experiments, testing also reduced forgetting more than restudying did, although this was not always the case according to the ANOVA. We discuss the implications of these results both for approaches to measuring forgetting and for the use of tests in promoting long-term retention. The stimuli used in these experiments may be found at
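
A hedged sketch of how a power function of this kind separates degree of learning from rate of forgetting. It assumes the three-parameter form y = a(bt + 1)^(-c), with a read as the degree-of-learning parameter and c as the rate-of-forgetting parameter; that reading, the function name `power_retention`, and all parameter values are assumptions chosen for illustration, not estimates from the experiments.

```python
def power_retention(t_days, a, b, c):
    """Predicted proportion recalled after t_days, y = a * (b*t + 1) ** (-c)."""
    return a * (b * t_days + 1) ** (-c)


# Retention intervals from the abstract: 5 min, then 1, 2, 7, 14, 42 days.
INTERVALS_DAYS = [5 / (60 * 24), 1, 2, 7, 14, 42]

# Hypothetical parameters: higher a (more learning) and lower c (slower
# forgetting) for the test/study condition than for the study condition.
test_study = [power_retention(t, a=0.9, b=1.0, c=0.3) for t in INTERVALS_DAYS]
study_only = [power_retention(t, a=0.8, b=1.0, c=0.5) for t in INTERVALS_DAYS]

for t, y_test, y_study in zip(INTERVALS_DAYS, test_study, study_only):
    print(f"{t:8.3f} days  test/study={y_test:.3f}  study={y_study:.3f}")
```

At t = 0 the function returns a, so a difference in a reflects learning, whereas a difference in c changes how quickly the two curves diverge over the retention interval.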

