Journal of Experimental Psychology: Learning, Memory, and Cognition

Published by American Psychological Association
Online ISSN: 1939-1285
Print ISSN: 0278-7393
In 3 experiments, the effect of word frequency on an indirect word fragment completion test and on direct free-recall and yes-no recognition tests was investigated. In Experiment 1, priming in word fragment completion was substantially greater for low-frequency words than for high-frequency words, but free recall was unaffected. Experiment 2 replicated the word fragment completion result and showed a corresponding effect in recognition. Experiment 3 replicated the low-frequency priming advantage in word fragment completion with the set of words that P.L. Tenpenny and E.J. Shoben (1992) had used in reporting the opposite pattern in word fragment completion. Using G. Mandler's (1980) dual-process theory, the authors argue that recognition and word fragment completion tests both rely on within-item integration that influences familiarity, whereas recall hinges on elaboration that influences retrievability.
 
A mirror effect can be produced by manipulating word class (e.g., high vs. low frequency) or by manipulating strength (e.g., short vs. long study time). The results of 5 experiments reported here suggest that a strength-based mirror effect is caused by a shift in the location of the decision criterion, whereas a frequency-based mirror effect occurs although the criterion remains fixed with respect to word frequency. Evidence supporting these claims is provided by a series of studies in which high frequency (HF) words were differentially strengthened (and sometimes differentially colored) during list presentation. That manipulation increased the HF hit rate above that for low frequency (LF) words without selectively decreasing the HF false alarm rate, just as a fixed-criterion account of the word-frequency mirror effect predicts.
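To make the two accounts concrete, the following minimal signal-detection sketch (parameters are illustrative, not the authors' fitted values) contrasts a criterion shift, which raises hits and lowers false alarms together, with differential strengthening under a fixed criterion, which raises the hit rate while the false-alarm rate, governed by the unchanged lure distribution, stays put.

```python
# Minimal equal-variance signal-detection sketch; parameters are illustrative,
# not the authors' fits.
from scipy.stats import norm

def rates(d_prime, criterion):
    """Hit and false-alarm rates with lures ~ N(0, 1) and targets ~ N(d_prime, 1)."""
    hit = 1 - norm.cdf(criterion, loc=d_prime)
    fa = 1 - norm.cdf(criterion, loc=0.0)
    return round(hit, 3), round(fa, 3)

# Strength-based mirror effect: a stronger list shifts the criterion rightward,
# so hits rise and false alarms fall together.
print("weak list   (d'=1.0, c=0.5):", rates(1.0, 0.5))
print("strong list (d'=2.0, c=1.0):", rates(2.0, 1.0))

# Fixed criterion: strengthening high-frequency targets raises their hit rate,
# but the false-alarm rate (driven only by the lure distribution) is unchanged.
print("HF strengthened, same criterion (d'=2.0, c=0.5):", rates(2.0, 0.5))
```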
 
Event-related brain potentials (ERPs) were recorded during a serial reaction time (RT) task, where single deviant items seldom (Experiment 1) or frequently (Experiment 2) replaced 1 item of a repeatedly presented 10-item standard sequence. Acquisition of sequence knowledge was reflected in faster RTs for standard as compared with deviant items and in an enhanced negativity (N2 component) of the ERP for deviant items. Effects were larger for participants showing explicit knowledge in their verbal reports and in a recognition test. The lateralized readiness potential indicated that correct responses were activated with shorter latencies after training. For deviant items, participants with explicit knowledge showed an initial activation of the incorrect but expected response. These findings suggest that the acquisition of explicit and implicit knowledge is reflected in different electrophysiological correlates and that sequence learning may involve the anticipatory preparation of responses.
 
Frequency distributions of average proportion correct for Type II and Type IV in all experiments.  
Observed behavioral data from Experiment 6 and attention learning covering map (ALCOVE) simulation fit to Type I, Type II, and Type IV. ALCOVE predicts slower Type IV learning than was observed.  
Observed behavioral data from Experiment 6 and attention learning covering map (ALCOVE) simulation fit to Types II and IV only. ALCOVE predicts slower Type I learning than was observed.  
a. Examples of stimuli used in Experiments 1, 2, and 4. b. Examples of stimuli used in Experiment 8.  
An illustrated overview of key methodological differences and results across experiments. Filled-in stars represent the presence of a Type II advantage over Type IV based on average proportion correct. Empty stars represent a lack of Type II advantage. EXP = Experiment.
The findings of Shepard, Hovland, and Jenkins (1961) on the relative ease of learning 6 elemental types of 2-way classifications have been deeply influential 2 times over: 1st, as a rebuke to pure stimulus generalization accounts, and again as the leading benchmark for evaluating formal models of human category learning. The litmus test for models is the ability to simulate an observed advantage in learning a category structure based on an exclusive-or (XOR) rule over 2 relevant dimensions (Type II) relative to category structures that have no perfectly predictive cue or cue combination (including the linearly-separable Type IV). However, a review of the literature reveals that a Type II advantage over Type IV is found only under highly specific experimental conditions. We investigate when and why a Type II advantage exists to determine the appropriate benchmark for models and the psychological theories they represent. A series of 8 experiments link particular conditions of learning to outcomes ranging from a traditional Type II advantage to compelling non-differences and reversals (i.e., Type IV advantage). Common interpretations of the Type II advantage as either a broad-based phenomenon of human learning or as strong evidence for an attention-mediated similarity-based account are called into question by our findings. Finally, a role for verbalization in the category learning process is supported. (PsycINFO Database Record (c) 2012 APA, all rights reserved).
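For readers unfamiliar with the structures at issue, the short Python sketch below enumerates one conventional assignment of the Type II (XOR over two relevant dimensions) and Type IV (linearly separable) structures over three binary stimulus dimensions; the particular assignment shown is for illustration only.

```python
# Illustrative enumeration of two of Shepard, Hovland, and Jenkins's (1961) six
# category types over three binary dimensions (category labels chosen for illustration).
from itertools import product

stimuli = list(product([0, 1], repeat=3))

# Type II: exclusive-or (XOR) over the first two dimensions; the third is irrelevant.
type_ii = {s: "A" if s[0] ^ s[1] else "B" for s in stimuli}

# Type IV: linearly separable; category A = stimuli within Hamming distance 1 of
# the prototype (0, 0, 0), so no single cue or cue pair is perfectly predictive.
type_iv = {s: "A" if sum(s) <= 1 else "B" for s in stimuli}

for s in stimuli:
    print(s, "Type II:", type_ii[s], " Type IV:", type_iv[s])
```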
 
Discusses some problems with R. W. Proctor's (see record 1983-31898-001) explanation of the effect of unfilled intervals between the presentations of successive pictures on subsequent picture recognition. The present author suggests that picture rehearsal is more flexible than is generally supposed, in that it can be applied to any of several recently presented pictures and not to just the very last one. With this assumption, the extent to which a given picture is rehearsed may be independent of the duration of the particular interval that follows its presentation when the intervals are varied randomly, but not when they are varied between blocks. (11 ref) (PsycINFO Database Record (c) 2014 APA, all rights reserved)
 
In his recent articles, Bogartz offered a definition of what it means for forgetting rate to be independent of degree of original learning. He showed that, given this definition, independence is confirmed by extant data. Bogartz also criticized Loftus's (1985b) proposed method for testing independence. In this commentary, we counter Bogartz's criticisms and then offer two observations. First, we show that Loftus's horizontal-parallelism test distinguishes between two interesting classes of memory models: unidimensional models, wherein the memory system's state can be specified by a single number, and multidimensional models, wherein at least two numbers are required to specify the memory system's state. Independence by Loftus's definition is implied by a unidimensional model. Bogartz's definition, in contrast, is consistent with either model. Second, to better understand the constraints on memory mechanisms dictated by the mathematics of the models under consideration, we develop a simple but general feature model of learning and forgetting. We demonstrate what constraints must be placed on this model to make learning and forgetting rate independent by Loftus's and by Bogartz's definitions.
 
Mean sensitivity (A′) of the contrast and recognition tasks to familiarity (old vs. new) at each test block and overall in Experiment 1. In the recognition task, participants decided whether each (uncued) study word was repeated, and in the contrast task, they decided whether the contrast between each word and the background was high or low. Data from the 96 test trials are presented both overall (right panel) and broken into 3 blocks (left panel). Bars indicate standard errors. 
Mean sensitivity (A′) of the contrast task to familiarity (old vs. new) at each test block and overall in Experiment 2. Left panel shows data for words that were uncued at study. Right panel shows data for words that were cued at study. Bars indicate standard errors. 
Mean sensitivity (A′) of the contrast and recognition tasks to familiarity at each test block and overall in Experiment 3. Bars indicate standard errors. 
Mean sensitivity (A′) of the contrast and recognition tasks to familiarity at each test block and overall in Experiment 4 (100-ms study exposure). Bars indicate standard errors. 
Mean sensitivity (A′) of the contrast and recognition tasks to familiarity at each test block and overall in Experiment 4 (500-ms study exposure). Bars indicate standard errors. 
Four experiments are reported that reevaluate P. M. Merikle and E. M. Reingold's (1991) demonstration of unconscious memory: the greater sensitivity to familiarity (repetition) of an indirect (implicit) memory task than of a comparable direct (explicit) task. At study, participants named the cued member of a pair of visually presented words. At test, new and uncued study words were presented against a background mask. Participants judged whether each word was old or new (direct task) or whether the contrast between the word and the background was high or low (indirect task). Contrary to the original findings, the sensitivity of the indirect task to familiarity never exceeded that of the direct task. These findings pose a challenge to a key pillar of evidence for unconscious influences of memory.
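The sensitivity index A′ reported in the figure captions above is commonly computed with Grier's (1971) nonparametric formula; a small Python sketch is given below, with the caveat that the exact variant used in the article is not specified here.

```python
# A' (A-prime), a nonparametric sensitivity index: 0.5 = chance, 1.0 = perfect.
# Formula follows Grier (1971); whether the article used this exact variant is
# an assumption.
def a_prime(hit_rate, fa_rate):
    h, f = hit_rate, fa_rate
    if h >= f:
        return 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f))
    return 0.5 - ((f - h) * (1 + f - h)) / (4 * f * (1 - h))

print(a_prime(0.75, 0.25))  # ~0.83 for a clearly above-chance observer
```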
 
S. Madigan and R. O'Hara (1992) analyzed data from repeated free-recall experiments and concluded that the rate of item recovery across tests was related to the level of recall performance on an initial free-recall test. We report a reanalysis of these data along with Monte Carlo simulations that indicate the measures used by Madigan and O'Hara may have inflated the magnitude of the relation between initial recall and item recovery. The results are discussed in terms of their implications for future research investigating reminiscence and hypermnesia.
 
B. B. Murdock and M. J. Kahana (1993a) presented a continuous memory version of the theory of distributed associative memory (TODAM) model; they claimed that this model predicts list-strength and list-length findings, including those reported by R. Ratcliff, S. E. Clark, and R. M. Shiffrin (1990) and K. Murnane and R. M. Shiffrin (1991a). This model is quite similar to one discussed by R. M. Shiffrin, R. Ratcliff, and S. Clark (1990), who rejected the model on the basis of its inability to predict both an absent or negative list-strength effect (when strength is varied by repetitions) and a present list-length effect. In this comment we elaborate the earlier discussion and demonstrate that the version of TODAM proposed by B. B. Murdock and M. J. Kahana (1993a) indeed fails for this reason. We show this first for a somewhat simplified version of the model for which derivations are obvious and then in a simulation of the complete version using the parameter values suggested by B. B. Murdock and M. J. Kahana (1993a).
 
D. L. Hintzman's (1994) criticism of our theory on recognition memory consists of 2 points: An equation of attention/likelihood theory has been incorrectly written and the likelihood ratios of the theory can be replaced by another, preferable transformation. Both of these points are discussed and rebutted.
 
N. G. Kanwisher (1987; J. Park & N. G. Kanwisher, 1994) has explained repetition blindness in terms of a distinction in visual perception between type activation and token individuation; repeated items are successfully recognized (matched to stored types) but are less likely than unrepeated items to become individuated as separate perceptual tokens. Whittlesea and colleagues (B. W. A. Whittlesea, M. D. Dorken, & K. W. Podrouzek, 1995; B. W. A. Whittlesea & K. W. Podrouzek, 1995) argued that repetition blindness does not reflect different processing of repeated and unrepeated items but is better explained as the result of a combination of separate but nondistinctive processing of repeated items and postlist report biases. However, we argue that none of the results reported by Whittlesea and colleagues are inconsistent with the token-individuation hypothesis.
 
D. W. Massaro and G. C. Oden (1995) claimed that M. A. Pitt's (1995) data provide strong evidence in favor of independence, not interactivity, as argued by Pitt. Massaro and Oden's arguments rested on an evaluation of the fit of the fuzzy logical model of perception (FLMP) to the identification data and on criticisms of the detection theory analyses. In this reply, Pitt shows that the latter criticisms were unfounded and that the data-fitting demonstrations raised questions about FLMP's ability to capture the phenomenon of interest (i.e., lexical context effects).
 
Panels A-F: Mean reaction times (RTs) for correct strings of string length 7 (single-case studies). Error bars represent the minimum and maximum of mean RTs in each block.
(opposite page and above). Individual mean reaction times (RTs) (correct tasks only) as a function of block. Panel A: high-complexity, latency shift condition. Panel B: high-complexity, no latency shift condition. Panel C: low-complexity, latency shift condition. Panel D: low-complexity, no latency shift condition. S = subject.
Mean reaction-time differences between the last and first practice block (overall training effect), between the first transfer block and the first training block (item-general learning effect), between the first transfer block and the last training block (item-specific learning effect), and between the last transfer block and the last practice block (item-general/item-specific learning effect) in the high- and low-complexity conditions, separately for participants with and without latency shifts. 
Panel A: Individual learning functions of the artificial data set (n = 100). Panel B: Overall power-function fits for aggregated mean reaction times (RTs) and standard deviations of the artificial data set. M_y = 2262.4 + 7803.0 · t_i^(−0.62), R² = 99.8%; SD_y = 1190.7 + 1585.7 · t_i^(−0.62).
The power law of practice is often considered a benchmark test for theories of cognitive skill acquisition. Recently, P. F. Delaney, L. M. Reder, J. J. Staszewski, and F. E. Ritter (1998), T. J. Palmeri (1999), and T. C. Rickard (1997, 1999) have challenged its validity by showing that empirical data can systematically deviate from power-function fits. The main purpose of the present article is to extend their explanations in two ways. First, the authors show empirically that abrupt changes in performance are not necessarily based on a shift from algorithm- to memory-based processing but, more generally, occur whenever a more efficient task strategy is generated. Second, the authors show mathematically and by simulation that power functions can perfectly fit aggregated learning curves even when all underlying individual curves are discontinuous. Therefore, the authors question conclusions drawn from fits to aggregated data.
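The aggregation argument can be illustrated with a short simulation (all parameter values are illustrative and not taken from the article): individual reaction-time curves that drop abruptly when a more efficient strategy is found are averaged, and a three-parameter power function is then fitted to the aggregate.

```python
# Toy demonstration: discontinuous individual learning curves, smooth aggregate.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(1)
trials = np.arange(1, 101)

# Each simulated learner switches abruptly from a slow to a fast strategy at a
# random trial (geometric distribution of switch points, chosen for illustration).
curves = []
for _ in range(100):
    switch = min(rng.geometric(0.06), 100)
    rt = np.where(trials < switch, 2200.0, 900.0) + rng.normal(0, 50, trials.size)
    curves.append(rt)
mean_rt = np.mean(curves, axis=0)

def power_fn(t, a, b, c):
    return a + b * t ** (-c)

params, _ = curve_fit(power_fn, trials, mean_rt, p0=(900.0, 1500.0, 0.5), maxfev=10000)
r_squared = 1 - np.var(mean_rt - power_fn(trials, *params)) / np.var(mean_rt)
print("fitted (a, b, c):", np.round(params, 2), " R^2:", round(r_squared, 3))
```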
 
D. von Winterfeldt, N.-K. Chung, R. D. Luce, and Y. Cho (1997) provided several tests for consequence monotonicity of choice or judgment, using certainty equivalents of gambles. The authors reaxiomatized consequence monotonicity in a probabilistic framework and reanalyzed von Winterfeldt et al.'s main experiment via a bootstrap method. Their application offers new insights into consequence monotonicity as well as into von Winterfeldt et al.'s 3 experimental paradigms: judged certainty equivalents (JCE), QUICKINDIFF, and parameter estimation by sequential testing (PEST). For QUICKINDIFF, the authors found no indication of violations of "random consequence monotonicity." This sharply contrasts with the findings of von Winterfeldt et al., who concluded that axiom violations were the most pronounced under that procedure. The authors found potential evidence for violations in JCE and in certainty equivalents derived from PEST.
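The following generic percentile-bootstrap sketch (illustrative data; not the authors' exact procedure or data) shows the kind of resampling test that can be applied to certainty equivalents when asking whether improving one consequence of a gamble ever lowers its certainty equivalent, which consequence monotonicity forbids.

```python
# Generic percentile bootstrap on hypothetical certainty equivalents (CEs).
import numpy as np

rng = np.random.default_rng(0)
ce_worse = rng.normal(42, 8, size=60)    # hypothetical CEs for the original gamble
ce_better = rng.normal(47, 8, size=60)   # hypothetical CEs after improving one consequence

boot_diffs = [
    rng.choice(ce_better, ce_better.size).mean() - rng.choice(ce_worse, ce_worse.size).mean()
    for _ in range(5000)
]
lo, hi = np.percentile(boot_diffs, [2.5, 97.5])
print(f"bootstrap 95% CI for CE(better) - CE(worse): [{lo:.2f}, {hi:.2f}]")
# A CI lying below zero would indicate a violation of consequence monotonicity.
```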
 
C. L. Gagné and E. Shoben (1997) proposed that the conceptual system contains information about how concepts are used to modify other concepts and that this relational information influences the ease with which concepts combine. Recently, E. J. Wisniewski and G. L. Murphy suggested that C. L. Gagné and E. Shoben's measure of relation availability was confounded with familiarity and plausibility and that the participants could simply retrieve the stored meanings of the phrases because the phrases were not novel. In this article, the authors demonstrate that (a) E. J. Wisniewski and G. L. Murphy's plausibility and familiarity judgments are dependent variables that are themselves responsive to changes in relation availability, (b) modifier relation availability predicts response time even when the influence of phrase familiarity and plausibility is controlled, and (c) the materials consisted mainly of novel phrases.
 
On the basis of his assumption that recollection is a threshold process, A. P. Yonelinas (1999) predicted linear source-identification receiver operating characteristics (ROCs) and recently reported data that were consistent with this prediction. In this article, the authors present data showing curvilinear source-identification ROCs across various encoding and test conditions. On the basis of the source-monitoring framework (e.g., M. K. Johnson, S. Hashtroudi, & D. S. Lindsay, 1993), the authors argue that curvilinearity of source-identification ROCs is a result of differences in the qualitative characteristics of memories rather than simply the influence of undifferentiated familiarity as the dual-process model might suggest.
 
Weighted mean proportions of conditional inferences as a function of the polarity of the inferential and referred clause (N = 48 studies). MP = modus ponens; AC = affirmation of the consequent; DA = denial of the antecedent; MT = modus tollens. 
M. Oaksford, N. Chater, and J. Larkin (2000) proffered a Bayesian model in which conditional inferences are a direct function of conditional probabilities. In the current article, the authors first considered this model with respect to the processing of negatives in conditional reasoning. Its predictions were evaluated against a large-scale meta-analysis (W. J. Schroyens, W. Schaeken, & G. d'Ydewalle, 2001b). This evaluation shows that the model is flawed: The relative size of the negative effects does not match predictions. Next, the authors evaluated the model in relation to inferences about affirmative conditionals, again considering the results of a meta-analysis (W. J. Schroyens, W. Schaeken, & G. d'Ydewalle, 2001a). The conditional probability model is countered by the data reported in the literature; a model based on mental models produces a better fit. The authors conclude that a purely probabilistic model is deficient and incomplete and cannot do without algorithmic processing assumptions if it is to advance toward a descriptively adequate psychological theory.
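As a concrete illustration of what "a direct function of conditional probabilities" means here, the sketch below computes, from a hypothetical joint distribution over p and q, the conditional probabilities commonly associated with the four inference forms. The mapping shown (MP with P(q|p), AC with P(p|q), DA with P(not-q|not-p), MT with P(not-p|not-q)) is an approximation of the published model, not its full parameterization.

```python
# Hypothetical joint distribution P(p, q) for a conditional "if p then q";
# values are illustrative only.
joint = {
    (True, True): 0.45, (True, False): 0.05,
    (False, True): 0.20, (False, False): 0.30,
}

def prob(p_val=None, q_val=None):
    """Joint/marginal probability with optional constraints on p and q."""
    return sum(pr for (pv, qv), pr in joint.items()
               if (p_val is None or pv == p_val) and (q_val is None or qv == q_val))

mp = prob(True, True) / prob(p_val=True)      # P(q | p): modus ponens
ac = prob(True, True) / prob(q_val=True)      # P(p | q): affirmation of the consequent
da = prob(False, False) / prob(p_val=False)   # P(not-q | not-p): denial of the antecedent
mt = prob(False, False) / prob(q_val=False)   # P(not-p | not-q): modus tollens
print(f"MP={mp:.2f}  AC={ac:.2f}  DA={da:.2f}  MT={mt:.2f}")
```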
 
J. D. Smith and J. P. Minda (2000) conducted a meta-analysis of 30 data sets reported in the classification literature that involved use of the "5-4" category structure introduced by D. L. Medin and M. M. Schaffer (1978). The meta-analysis was aimed at investigating exemplar and elaborated prototype models of categorization. In this commentary, the author argues that the meta-analysis is misleading because it includes many data sets from experimental designs that are inappropriate for distinguishing the models. Often, the designs involved manipulations in which the actual 5-4 structure was not, in reality, tested, voiding the predictions of the models. The commentary also clarifies various aspects of the workings of the exemplar-based context model. Finally, concerns are raised that the all-or-none exemplar processes that form part of Smith and Minda's (2000) elaborated prototype models are implausible and lacking in generality.
 
R. J. Crutcher and K. A. Ericsson (2000; see record 2000-05419-014) showed that subjects stopped reporting mnemonic mediation in a recall task after sufficient practice. They concluded that subjects continued to use the mediator indefinitely but that its execution eventually became automatic and no longer required access to working memory. Their article thus supports the more general hypothesis that multistep cognition can take place without awareness. In this article the authors evaluate that conclusion on both conceptual and empirical grounds and report results of a new experiment that indicate that a qualitative shift to direct, unmediated recall can occur for at least some tasks.
 
In their article, "Testing two theories of conceptual combination: Alignment versus diagnosticity in the comprehension and production of combined concepts," F. J. Costello and M. T. Keane (2001) evaluate the role of alignment in the interpretation of noun-noun combinations. They found that participants were not strongly biased to prefer and produce interpretations with alignable differences. Instead, participants sometimes preferred and produced interpretations with nonalignable differences. These results are surprising given that most research has found advantages of alignable differences over nonalignable differences. Costello and Keane also found that feature diagnosticity better predicted their results, and they concluded that alignment does not play an important role in conceptual combination. However, drawing on recent work, the author of the present article gives an alternative interpretation of Costello and Keane's results, showing that alignment is crucial in conceptual combination. The author also shows that the dual-process model accounts for their results.
 
A. Caramazza, A. Costa, M. Miozzo, and Y. Bi (2001) reported a series of experiments demonstrating that the ease of producing a word depends only on the frequency of that specific word but not on the frequency of a homophone twin. A. Caramazza, A. Costa, et al. concluded that homophones have separate word form representations and that the absence of frequency-inheritance effects for homophones undermines an important argument in support of 2-stage models of lexical access, which assume that syntactic (lemma) representations mediate between conceptual and phonological representations. The authors of this article evaluate the empirical basis of this conclusion, report 2 experiments demonstrating a frequency-inheritance effect, and discuss other recent evidence. It is concluded that homophones share a common word form and that the distinction between lemmas and word forms should be upheld.
 
Proportion of correct classifications predicted by ADIT in the simplified IBRE design in Experiment 1 in Kruschke (1996) at three rates of the attention-shifting parameter (0, 2.35, 4.7) and three weight learning rates. Panel A: Predictions at half the weight learning rate fitted to data in Kruschke's (1996) Experiment 1 (w = .162). Panel B: Predictions at the weight learning rate fitted to data in Kruschke's (1996) Experiment 1 (w = .324). Panel C: Predictions at twice the weight learning rate fitted to data in Kruschke's (1996) Experiment 1 (w = .648). 
In J. K. Kruschke's (2001; see record 2001-18940-005) study, it is argued that attentional theory is the sole satisfactory explanation of the inverse base rate effect and that eliminative inference (P. Juslin, P. Wennerholm, & A. Winman, 2001; see record 2001-07828-016) plays no role in the phenomenon. In this comment, the authors demonstrate that, in contrast to the central tenets of attentional theory, (a) rapid attention shifts as implemented in ADIT decelerate learning in the inverse base-rate task and (b) the claim that the inverse base-rate effect is directly caused by an attentional asymmetry is refuted by data. It is proposed that a complete account of the inverse base-rate effect needs to integrate attention effects with inference rules that are flexibly used for both induction and elimination.
 
T. Trabasso and J. Bartolone (2003) used a computational model of narrative text comprehension to account for empirical findings. The authors show that the same predictions are obtained without running the model. This is caused by the model's computational setup, which leaves most of the model's input unchanged.
 
Data from Hinson et al.'s (2003) Experiment 1 plotted as a function of the k difference score and the error difference score for each subject. 
Simulated k values based on introducing different rates of random responding to Hinson et al.'s (2003) control condition in Experiment 1. 
Previous research by J. M. Hinson, T. L. Jameson, and P. Whitney (2003) demonstrated that a secondary task in a delayed discounting paradigm increased subjects' preference for the immediate reward. J. M. Hinson et al. interpreted their findings as evidence that working memory load results in greater impulsivity. The present authors conducted a reanalysis of the data from J. M. Hinson et al.'s Experiment 1 at the individual-subject level. Difference scores were calculated by subtracting the digit memory load condition from the control condition for k (discounting parameter) and a measure of "erroneous" responses. The results indicated that the secondary task increased random responding, which in turn can account for the increased mean estimates of k. Thus, the data do not support the claim that cognitive load affects impulsivity per se.
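A toy simulation of the reanalysis logic is sketched below (all values are illustrative; this is not the authors' procedure or data). Subjective value is modeled hyperbolically as V = A / (1 + kD), a simulated chooser occasionally responds at random, and k is then recovered by a simple grid search so that the effect of random responding on group-mean k estimates can be inspected.

```python
# Toy delayed-discounting simulation; all parameter values are illustrative.
import numpy as np

rng = np.random.default_rng(7)
immediate = np.arange(5, 100, 5)          # immediate amounts offered
delayed_amount, delay = 100.0, 30.0       # delayed option: 100 units in 30 days
true_k = 0.01
k_grid = np.linspace(0.001, 0.5, 500)

def simulate_k(p_random, n_subjects=500):
    estimates = []
    value_delayed = delayed_amount / (1 + true_k * delay)   # V = A / (1 + kD)
    for _ in range(n_subjects):
        choose_imm = immediate > value_delayed
        lapse = rng.random(immediate.size) < p_random        # trials answered at random
        choose_imm = np.where(lapse, rng.random(immediate.size) < 0.5, choose_imm)
        # Grid search: the k whose predicted choices best match the observed choices.
        preds = immediate[None, :] > delayed_amount / (1 + k_grid[:, None] * delay)
        errors = (preds != choose_imm[None, :]).sum(axis=1)
        estimates.append(k_grid[np.argmin(errors)])
    return float(np.mean(estimates))

for p in (0.0, 0.1, 0.3):
    print(f"random responding {p:.0%}: mean estimated k = {simulate_k(p):.4f}")
```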
 
D. Briihl and A. W. Inhoff (1995; see record 1995-20036-001) found that exterior letter pairs showed no privileged status in reading when letter pairs were presented as parafoveal primes. However, T. R. Jordan, S. M. Thomas, G. R. Patching, and K. C. Scott-Brown (2003; see record 2003-07955-013) used a paradigm that (a) allowed letter pairs to exert influence at any point in the reading process, (b) overcame problems with the stimulus manipulations used by Briihl and Inhoff (1995), and (c) revealed a privileged status for exterior letter pairs in reading. A. W. Inhoff, R. Radach, B. M. Eiter, and M. Skelly (2003; see record 2003-07955-014) made a number of claims about the Jordan, Thomas, et al. study, most of which focus on parafoveal processing. This article addresses these claims and points out that although studies that use parafoveal previews provide an important contribution, other techniques and paradigms are required to reveal the full role of letter pairs in reading.
 
Potential sources for the discrepancy between the letter position effects in T. R. Jordan, S. M. Thomas, G. R. Patching, and K. C. Scott-Brown's (2003; see record 2003-07955-013) and D. Briihl and A. W. Inhoff's (1995; see record 1995-20036-001) studies are examined. The authors conclude that the lack of control over where useful information is acquired during reading in Jordan et al.'s study, rather than differences in orthographic consistency and the availability of word shape information, accounts for the discrepant effect pattern in the 2 studies. The processing of a word during reading begins before it is fixated, when beginning letters occupy a particularly favorable parafoveal location that is independent of word length. Knowledge of parafoveal word length cannot be used to selectively process exterior letters during the initial phase of visual word recognition.
 
Schematic representation of the shared (SR) and independent representation (IR) hypotheses for two- and one-layer models of lexical access. A: The SR hypothesis within a two-lexical-layer model. B: The IR hypothesis within a two-lexical-layer model. C: The IR hypothesis within a one-lexical-layer model. 
A. Caramazza, A. Costa, M. Miozzo, and Y. Bi (2001) reported a series of experiments showing that naming latencies for homophones are determined by specific-word frequency (e.g., frequency of nun) and not homophone frequency (frequency of nun + none). J. D. Jescheniak, A. S. Meyer, and W. J. M. Levelt (2003) have challenged these studies on a variety of grounds. Here we argue that these criticisms are not well founded and try to clarify the theoretical issues that can be meaningfully addressed by considering the effects of frequency on homophone production. We conclude that the evidence from homophone production cannot be considered to provide support to 2-layer theories of the lexical system.
 
Whether sequence learning entails a single or multiple memory systems is a moot issue. Recently, D. R. Shanks, L. Wilkinson, and S. Channon advanced a single-system model that predicts a perfect correlation between true (i.e., error free) response time priming and recognition. The Shanks model is contrasted with a dual-process model that incorporates both response time priming and reportable sequence knowledge as predictors of recognition. The models were tested by applying confirmatory factor analysis to data obtained from a recognition test that was administered under both speed and accuracy conditions. The Shanks model accounted for the data in the speed condition, whereas the dual-process model provided a better fit in the accuracy condition. The results are compatible with the notion that cognitive processes were engaged differentially in recognition judgments under speed and accuracy conditions.
 
Fixation proportions to target, cohort (collapsed across consistency), and unrelated distracter objects in Experiment 3. All fixations beginning 200 ms after object target word onset are included. 
In a recent study, G. Kuhn and Z. Dienes (2005) reported that participants previously exposed to a set of musical tunes generated by a biconditional grammar subsequently preferred new tunes that respected the grammar over new ungrammatical tunes. Because the study and test tunes did not share any chunks of adjacent intervals, this result may be construed as straightforward evidence for the implicit learning of a structure that was only governed by nonlocal dependency rules. It is shown here that the grammar modified the statistical distribution of perceptually salient musical events, such as the probability that tunes covered an entire octave. When the influence of these confounds was removed, the effect of grammaticality disappeared.
 
Recently, J. J. Starns and J. L. Hicks (2005) have argued that source dimensions are retrieved independently from memory. In their innovative experiment, manipulating the retrievability of 1 source feature did not affect memory for a 2nd feature. Following C. S. Dodson and A. P. Shimamura (2000), the authors argue that the source memory measure that Starns and Hicks used (known as the average conditional source identification measure) is vulnerable to a response bias in this particular paradigm, and this may undermine Starns and Hicks's conclusion. Starns and Hicks, however, acknowledged this possibility. The authors substantiate this claim by a simulation and by replicating Starns and Hicks's experiment. In 2 further experiments, the authors use an extended multinomial model to analyze data showing that Starns and Hicks's conclusion holds even if results cannot be attributed to response biases.
 
The z values corresponding to selected t values in the sampling distribution of the sample correlation for a population correlation of 0, as a function of sample size. This graph is based on Soper, Young, Cave, Lee, and Pearson's (1915-1917) data for r.
The z scores corresponding to t = 0.6 in the distribution of the sample correlation for a population correlation of 0 (the solid line) and of 0.5 (the dotted line) and the associated probabilities of Type I error and incorrect decisions, respectively, for selected sample sizes.  
The z scores corresponding to t = 0.3 in the distribution of the sample correlation for a population correlation of 0 (the solid line) and of 0.5 (the dotted line) and the associated probabilities of Type I error and incorrect decisions, respectively, for selected sample sizes.  
The t values corresponding to α = .05 (two-tailed) in the distribution of the sample correlation for a population correlation of 0 and the associated probabilities of correct and incorrect decisions assuming a population correlation of 0.5, P(sample correlation ≥ t | population correlation = 0.5) and P(sample correlation < t | population correlation = 0.5), respectively, for selected sample sizes.  
Fiedler and Kareev (2006) showed that small samples can, in principle, outperform large samples in terms of the quality of contingency-based binary choice. The 1st part of this comment critically examines these authors' claim that this small sample advantage (SSA) contradicts Bernoulli's law of large numbers and concludes that this claim is unwarranted. The 2nd part of the comment provides insight into the etiology of the SSA and points to the following as necessary conditions for the SSA's occurrence: (a) the statistical invalidity of the underlying threshold-based decision algorithm and (b) the particular payoff scheme underlying the definition of the decisions' quality. Together, these 2 factors explain how better information provided by larger samples is translated into worse decisions.
 
Fiedler and Kareev (2006) have claimed that taking a small sample of information (as opposed to a large one) can, in certain specific situations, lead to greater accuracy--beyond that gained by avoiding fatigue or overload. Specifically, they have argued that the propensity of small samples to provide more extreme evidence is sufficient to create an accuracy advantage in situations of high caution and uncertainty. However, a close examination of Fiedler and Kareev's experimental results does not reveal any strong reason to conclude that small samples can cause greater accuracy. We argue that the negative correlation between sample size and accuracy that they reported (found only for the second half of Experiment 1) is also consistent with mental fatigue and that their data in general are consistent with the causal structure opposite to the one they suggest: Rather than small samples causing clear data, early clear data may cause participants to stop sampling. More importantly, Experiment 2 provides unequivocal evidence that large samples result in greater accuracy; Fiedler and Kareev found a small sample advantage here only when they artificially reduced the data set. Finally, we examine the model that Fiedler and Kareev used; they surmised that decision makers operate with a fixed threshold independent of sample size. We discuss evidence for an alternative (better performing) model that incorporates a dynamic threshold that lowers with sample size. We conclude that there is no evidence currently to suggest that humans benefit from taking a small sample, other than as a tactic for avoiding fatigue, overload, and/or opportunity cost; that is, there is no accuracy advantage inherent to small samples.
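To make the disputed paradigm concrete, here is a small simulation sketch (parameters illustrative) of threshold-based binary choice from samples: the agent samples both options and commits to a choice only when the sample difference exceeds a threshold that stays constant across sample sizes. It is offered as a way to explore decision rates and conditional accuracy, not as a reproduction of either side's analyses.

```python
# Threshold-based choice from samples of two options; all parameters illustrative.
import numpy as np

rng = np.random.default_rng(3)
p_a, p_b = 0.6, 0.4        # true success rates (option A is objectively better)
threshold = 0.25           # fixed decision threshold on the sample difference

def run(sample_size, n_sims=20000):
    a = rng.binomial(sample_size, p_a, n_sims) / sample_size
    b = rng.binomial(sample_size, p_b, n_sims) / sample_size
    decided = np.abs(a - b) >= threshold       # commit only when the sample is decisive
    correct = (a > b) & decided
    return decided.mean(), correct.sum() / max(decided.sum(), 1)

for n in (5, 50):
    rate, acc = run(n)
    print(f"n={n:2d}: decision rate {rate:.2f}, accuracy given a decision {acc:.2f}")
```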
 
Drawings of (a) nonliving and (b) living things from a patient with herpes simplex encephalitis and a category-specific semantic deficit for living things. From "When Leopards Lose Their Spots: Knowledge of Visual Properties in Category-Specific Deficits for Living Things," by H. E. Moss, L. K. Tyler, and F. Jennings, 1997, Cognitive Neuropsychology, 14, pp. 935, 937. Copyright 1997 by Taylor & Francis. Reprinted with permission. 
The conceptual structure account of semantic memory (CSA; L. K. Tyler & H. E. Moss, 2001) claims that feature correlation (the degree to which features co-occur) and feature distinctiveness (the number of concepts in which a feature occurs) interact with domains of knowledge (e.g., living vs. nonliving) such that the distinctive features of nonliving things are more highly correlated than the distinctive features of living things. Evidence for (B. Randall, H. E. Moss, J. M. Rodd, M. Greer, & L. K. Tyler, 2004) and against this claim (G. S. Cree, C. McNorgan, & K. McRae, 2006) has been reported. This comment outlines the CSA, discusses Cree et al.'s (2006) critiques of the Randall et al. (2004) experiments and the CSA, and reports new analyses of property norm and behavioral data, which replicate the results reported by Randall et al. (2004).
 
P. Maguire, B. Devereux, F. Costello, and A. Cater discussed the Gagné and Shoben (1997) CARIN theory of conceptual combination and, after presenting a sample drawn from the British National Corpus and comparing the two corpora, concluded that the Gagné and Shoben corpus is too small and unrepresentative. They then discussed the mathematical model presented by Gagné and Shoben and claimed that the model does not incorporate relational competition. In this article, the authors present critical aspects of the mathematical model not considered by Maguire et al. and show that the mathematical instantiation of CARIN presented by Gagné and Shoben is, in fact, very sensitive to the number of strong competing relations. The authors then present some new comparisons between the corpora, showing that they correspond surprisingly well.
 
Vincentile means for participants' lexical decision response times as a function of word frequency and stimulus quality. RT = response time.
The difference in the vincentile means for low- versus high-frequency items, for participants' lexical decision response times, as a function of stimulus quality. RT = response time. 
Vincentile means for participants' reading aloud response times as a function of word frequency and stimulus quality. RT = response time.
The difference in the vincentile means for low- versus high-frequency items, for participants' reading aloud response times, as a function of stimulus quality. RT = response time. 
There have been multiple reports over the last 3 decades that stimulus quality and word frequency have additive effects on the time to make a lexical decision. However, it is surprising that there is only 1 published report to date that has investigated the joint effects of these two factors in the context of reading aloud, and the outcome of that study is ambiguous. The present study shows that these factors interact in the context of reading aloud and at the same time replicates the standard pattern reported for lexical decision. The main implication of these results is that lexical activation, at least as indexed by the effect of word frequency, does not unfold in a uniform way in the contexts reported here. The observed dissociation also implies, contrary to J. A. Fodor's (1983) view, that the mental lexicon is penetrable rather than encapsulated. The distinction between cascaded and thresholded processing offers one way to understand these and related results. A direction for further research is briefly noted.
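As a reminder of what additivity versus interaction means in this 2 (stimulus quality) x 2 (word frequency) design, the following worked check uses hypothetical cell means; the numbers are not from the study.

```python
# Minimal additivity check for a 2 x 2 design (cell means are hypothetical):
# if stimulus quality and word frequency are additive, the frequency effect is
# the same size for clear and degraded stimuli, so the interaction contrast is ~0.
means = {  # hypothetical mean RTs in ms
    ("clear", "high"): 540, ("clear", "low"): 590,
    ("degraded", "high"): 610, ("degraded", "low"): 700,
}
freq_effect_clear = means[("clear", "low")] - means[("clear", "high")]           # 50 ms
freq_effect_degraded = means[("degraded", "low")] - means[("degraded", "high")]  # 90 ms
interaction = freq_effect_degraded - freq_effect_clear
print(f"interaction contrast = {interaction} ms (a nonzero value suggests an interaction)")
```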
 
Retrieval-induced forgetting (RIF) is the finding of impaired memory performance for information stored in long-term memory due to retrieval of a related set of information. This phenomenon is often attributed to the operation of a specialized mechanism recruited to resolve interference during retrieval by deactivating competing memory representations. This inhibitory account is supported, among other findings, by demonstrations that RIF occurs with independent cues not used during retrieval practice. However, these findings are not always consistent. Recently, Norman, Newman, and Detre (2007) proposed a model that aims at resolving discrepancies concerning the cue-independence of RIF. The model predicts that RIF should be present with independent cues when episodic associations are created between independent cues and their targets in the same episodic context that is later used to cue memory during retrieval practice. In the present study we aimed to test this prediction. We associated studied items with semantically unrelated words during the main study phase of the retrieval practice paradigm, and we tested memory both with cues used during retrieval practice (Experiment 2) and with episodic associates serving as independent cues (Experiments 3a and 3b). Although RIF was present when the same cues were used during retrieval practice and the final test, contrary to the prediction formulated by Norman et al., RIF failed to emerge when episodic associates were employed as independent cues. (PsycINFO Database Record (c) 2012 APA, all rights reserved).
 
On the basis of consistently finding significant overall costs to the ongoing task with a single salient target event, Smith, Hunt, McVay, and McConnell (2007) concluded that preparatory attentional processes are required for prospective remembering and that spontaneous retrieval does not occur. In this article, we argue that overall costs are not completely informative in terms of specifying the underlying processes mediating prospective memory retrieval, and we suggest more promising approaches for testing for the existence of these processes. We also argue that counterbalancing in a within-subjects design is one of several proper methods for assessing costs.
 
Mean reaction time (RT) and standard error in the Stroop color-naming task for (a) word and color control trials and (b) congruent and incongruent Stroop condition trials. Mono = monolinguals; biling = bilinguals.  
Mean reaction time (RT) and standard error for facilitation and cost in the Stroop task. The values are mean differences from baseline (0 ms) calculated as the average time to name colors from neutral stimuli (Xs). Mono = monolinguals; biling = bilinguals.  
Biplot showing multivariate relationship among five variables. The dimensions indicate the percentage of variance explained by each. These are additive, showing that the model explains 56% of the overall variance.  
Reports an error in "Cognitive control and lexical access in younger and older bilinguals" by Ellen Bialystok, Fergus Craik and Gigi Luk (Journal of Experimental Psychology: Learning, Memory, and Cognition, 2008[Jul], Vol 34[4], 859-873). An incorrect figure was printed due to an error in the production process. The correct version of Figure 1b is provided in the correction. (The following abstract of the original article appeared in record 2008-08549-012.) Ninety-six participants, who were younger (20 years) or older (68 years) adults and either monolingual or bilingual, completed tasks assessing working memory, lexical retrieval, and executive control. Younger participants performed most of the tasks better than older participants, confirming the effect of aging on these processes. The effect of language group was different for each type of task: Monolinguals and bilinguals performed similarly on working memory tasks, monolinguals performed better on lexical retrieval tasks, and bilinguals performed better on executive control tasks, with some evidence for larger language group differences in older participants on the executive control tasks. These results replicate findings from individual studies obtained using only 1 type of task and different participants. The confirmation of this pattern in the same participants is discussed in terms of a suggested explanation of how the need to manage 2 language systems leads to these different outcomes for cognitive and linguistic functions. (PsycINFO Database Record (c) 2009 APA, all rights reserved).
 
Parameter estimates (and 95% Confidence Intervals) for the mixed-effects model on critical-word RTs without including the effect of presentation Order. 
Parameter estimates (and 95% Confidence Intervals) for the mixed-effects model on critical-word RTs, including presentation Order as a fixed effect. 
In 2 separate self-paced reading experiments, Farmer, Christiansen, and Monaghan (2006) found that the degree to which a word's phonology is typical of other words in its lexical category influences online processing of nouns and verbs in predictive contexts. Staub, Grant, Clifton, and Rayner (2009) failed to find an effect of phonological typicality when they combined stimuli from the separate experiments into a single experiment. We replicated Staub et al.'s experiment and found that the combination of stimulus sets affects the predictiveness of the syntactic context; this reduces the phonological typicality effect as the experiment proceeds, although the phonological typicality effect was still evident early in the experiment. Although an ambiguous context may diminish sensitivity to the probabilistic relationship between the sound of a word and its lexical category, phonological typicality does influence online sentence processing during normal reading when the syntactic context is predictive of the lexical category of upcoming words.
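For readers who want to see the shape of such an analysis, a simplified Python sketch of a mixed-effects model on critical-word reading times is given below, using synthetic data; the variable names, random-effects structure (subject intercepts only), and effect sizes are assumptions, and the published analysis may differ (e.g., by including crossed item effects).

```python
# Simplified mixed-effects sketch on synthetic self-paced reading data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(11)
rows = []
for subject in range(20):
    subj_intercept = rng.normal(0, 30)                  # random subject intercept
    for item in range(24):
        typicality = rng.normal(0, 1)                   # standardized phonological typicality
        order = item + 1                                # presentation position in the session
        rt = 350 + subj_intercept - 15 * typicality + 0.5 * order + rng.normal(0, 40)
        rows.append({"subject": subject, "rt": rt, "typicality": typicality, "order": order})
df = pd.DataFrame(rows)

# Fixed effects of typicality, order, and their interaction; random intercepts by subject.
model = smf.mixedlm("rt ~ typicality * order", data=df, groups=df["subject"])
print(model.fit().summary())
```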
 
J. C. Ziegler, C. Perry, and M. Zorzi (2009) have claimed that their connectionist dual process model (CDP+) can simulate the data reported by S. O'Malley and D. Besner. Most centrally, they have claimed that the model simulates additive effects of stimulus quality and word frequency on the time to read aloud when words and nonwords are randomly intermixed. This work represents an important attempt given that computational models of reading processes have to date largely ignored the issue of whether it is possible to simulate additive effects. Despite CDP+'s success at capturing many other phenomena, it is clear that CDP+ fails to capture the full pattern seen with skilled readers in these experiments.
 
Reports an error in "Visual priming of inverted and rotated objects" by Barbara J. Knowlton, Sean P. McAuliffe, Chase J. Coelho and John E. Hummel ( Journal of Experimental Psychology: Learning, Memory, and Cognition , 2009[Jul], Vol 35[4], 837-848). In the article, there was an error in the sixth sentence of the abstract. The sentence should read “Experiments 2 and 3 demonstrated that although identification was sensitive to orientation, visual priming was relatively invariant with image inversion (i.e., an image visually primed its inverted counterpart approximately as much as it primed itself).” (The following abstract of the original article appeared in record 2009-09620-008 .) Object images are identified more efficiently after prior exposure. Here, the authors investigated shape representations supporting object priming. The dependent measure in all experiments was the minimum exposure duration required to correctly identify an object image in a rapid serial visual presentation stream. Priming was defined as the change in minimum exposure duration for identification as a function of prior exposure to an object. Experiment 1 demonstrated that this dependent measure yielded an estimate of predominantly visual priming (i.e., free of name and concept priming). Experiments 2 and 3 demonstrated that although priming was sensitive to orientation, visual priming was relatively invariant with image inversion (i.e., an image visually primed its inverted counterpart approximately as much as it primed itself). Experiment 4 demonstrated a similar dissociation with images rotated 90° off the upright. In all experiments, the difference in the magnitude of priming for identical or rotated–inverted priming conditions was marginal or nonexistent. These results suggest that visual representations that support priming can be relatively insensitive to picture-plane manipulations, although these manipulations have a substantial effect on object identification.
 
Schematic representation of the experimental procedure in Experiment 1. 
Schematic representation of the experimental procedure in Experiment 2. 
M. G. Berman, J. Jonides, and R. L. Lewis (2009) adapted the recent-probes task to investigate the causes of forgetting in short-term memory. In 7 experiments, they studied the persistence of memory traces by assessing the level of proactive interference generated by previous-trial items over a range of intertrial intervals. None of the experiments found a reduction in proactive interference over time, which they interpreted as evidence against time-based decay. However, it is possible that decay actually occurred over a shorter time period than was tested in this study, wherein the shortest decay interval was 3,300 ms. By reducing the time scale, the 2 experiments reported in the current commentary revealed a sharp decrease in proactive interference over time, with this reduction reaching a plateau in less than 3 s. This pattern suggests that decay operates in the early stages, whereas subsequent forgetting is likely to be due to interference.
 
Illustration of the temporal parameters relevant for recall performance in a complex span task according to the TBRS model. 
The sources of forgetting in working memory (WM) are a matter of intense debate: Is there a time-related decay of memory traces, or is forgetting uniquely due to representation-based interference? In a previous study, we claimed to have provided evidence supporting the temporal decay hypothesis (S. Portrat, P. Barrouillet, & V. Camos, 2008). However, reanalyzing our data, S. Lewandowsky and K. Oberauer (2009) demonstrated that they do not provide compelling evidence for temporal decay and suggested a class of alternative models favoring a representation-based interference account. In this article, we develop from the most recent proposals made by Lewandowsky and Oberauer 2 of the most plausible extensions of these alternative models. We show that neither of these extensions can account for recent findings related to between-domain WM performance and that both lead to predictions that are contradicted by new empirical evidence. Finally, we show that recent studies that have been claimed to rule out the temporal decay hypothesis do not resist close scrutiny. We conclude that the time-based resource-sharing model remains the most parsimonious way to account for forgetting and restoration of memory traces in WM.
 
Non-Gaussian mixture distributions produced by mixing two Gaussian distributions, where the means of the left (L) and right (R) distributions differ by 0.5 (A), 1.0 (B), 1.5 (C), and 2.0 (D) standard deviations (the standard deviation of the left distribution is 1.0, and the standard deviation of the right distribution is 1.25 in each case). For each panel, the Gaussian distributions are shown as solid lines, and the resulting mixture distribution is shown as a dotted line. Note that the mixture distributions are all non-Gaussian even though the mixture distribution in Panel A appears to be Gaussian in form. 
Upper panel: Observed data from the mixed condition from one participant (Subject 106 from Koen & Yonelinas, 2010) before mixing (left) and after mixing (right) confidence ratings made to the weak and strong targets. Middle panel: The mixture-unequal-variance signal-detection (UVSD) model corresponding to the unmixed (left) and mixed (right) data. Lower panel: Predicted data from the mixed condition from the same participant before mixing (left) and after mixing (right) confidence ratings made to the weak and strong targets. zROC = z-transformed receiver-operating characteristic. 
Scatter plots of the observed z-transformed receiver-operating characteristic (zROC) slope and the predicted zROC slope of the unequal-variance signal-detection model: (A) Koen and Yonelinas (2010), (B) Experiment 1, (C) Experiment 2, and (D) Jang et al. (2011). 
The slope of the z-transformed receiver-operating characteristic (zROC) in recognition memory experiments is usually less than 1, which has long been interpreted to mean that the variance of the target distribution is greater than the variance of the lure distribution. The greater variance of the target distribution could arise because the different items on a list receive different increments in memory strength during study (the "encoding variability" hypothesis). In a test of that interpretation, Koen and Yonelinas (2010) attempted to further increase encoding variability to see whether it would further decrease the slope of the zROC. To do so, they presented items on a list for 2 different durations and then mixed the weak and strong targets together. After performing 3 tests on the mixed-strength data, Koen and Yonelinas concluded that encoding variability does not explain why the slope of the zROC is typically less than 1. However, we show that their tests have no bearing on the encoding variability account. Instead, they bear on the mixture-unequal-variance signal-detection (UVSD) model that corresponds to their experimental design. On the surface, the results reported by Koen and Yonelinas appear to be inconsistent with the predictions of the mixture-UVSD model (though they were taken to be inconsistent with the predictions of the encoding variability hypothesis). However, all 3 of the tests they performed contained errors. When those errors are corrected, the same 3 tests show that their data support, rather than contradict, the mixture-UVSD model (but they still have no bearing on the encoding variability hypothesis).
 
E. Dhooge and R. J. Hartsuiker (2010) reported experiments showing that picture naming takes longer with low- than high-frequency distractor words, replicating M. Miozzo and A. Caramazza (2003). In addition, they showed that this distractor-frequency effect disappears when distractors are masked or preexposed. These findings were taken to refute models like WEAVER++ (A. Roelofs, 2003) in which words are selected by competition. However, Dhooge and Hartsuiker do not take into account that according to this model, picture-word interference taps not only into word production but also into attentional processes. Here, the authors indicate that WEAVER++ contains an attentional mechanism that accounts for the distractor-frequency effect (A. Roelofs, 2005). Moreover, the authors demonstrate that the model accounts for the influence of masking and preexposure, and does so in a simpler way than the response exclusion through self-monitoring account advanced by Dhooge and Hartsuiker.
 
On the basis of earlier findings, we (Fiedler & Kareev, 2006) presented a statistical decision model that explains the conditions under which small samples of information about choice alternatives inform more correct choices than large samples. Such a small-sample advantage (SSA) is predicted for choices, not estimations. It is contingent on high constant decision thresholds. The model was harshly criticized by Cahan (2010), who argued that the SSA disappears when the threshold decreases with increasing sample size and when the costs of incorrect decisions are higher than the benefits of correct decisions. We refute Cahan's critique, which confuses normative and descriptive arguments. He neither questioned our theoretical reasoning nor presented empirical counterevidence. Instead, he discarded our model as statistically invalid because the threshold does not decrease with increasing sample size. Contrary to this normative intuition, which presupposes a significance-testing rationale, we point out that decisions are often insensitive to sample size. We also refute Cahan's intuition that ignoring the potential asymmetry of gains and losses creates a serious bias in favor of the SSA. We regret any misunderstandings resulting from our linking the SSA to Bernoulli's law of large numbers.
 
Koen and Yonelinas (2010; K&Y) reported that mixing classes of targets that had short (weak) or long (strong) study times had no impact on zROC slope, contradicting the predictions of the encoding variability hypothesis. We show that they actually derived their predictions from a mixture unequal-variance signal detection (UVSD) model, which assumes 2 discrete levels of strength instead of the continuous variation in learning effectiveness proposed by the encoding variability hypothesis. We demonstrated that the mixture UVSD model predicts an effect of strength mixing only when there is a large performance difference between strong and weak targets, and the strength effect observed by K&Y was too small to produce a mixing effect. Moreover, we re-analyzed their experiment along with another experiment that manipulated the strength of target items. The mixture UVSD model closely predicted the empirical mixed slopes from both experiments. The apparent misfits reported by K&Y arose because they calculated the observed slopes using the actual range of z-transformed false-alarm rates in the data, but they computed the predicted slopes using an extended range from −5 to 5. Because the mixed predictions follow a slightly curved zROC function, different ranges of scores have different linear slopes. We used the actual range in the data to compute both the observed and predicted slopes, and this eliminated the apparent deviation between them.
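The range point can be illustrated analytically. The sketch below builds a mixture-UVSD zROC from illustrative parameters (not the fitted values of either article) and fits a straight line over a narrow, data-like range of z-transformed false-alarm rates and over an extended −5 to 5 range; because the mixture zROC is slightly curved, the two linear slopes differ.

```python
# Analytic mixture-UVSD zROC and the effect of the fitting range; parameters illustrative.
import numpy as np
from scipy.stats import norm

d_weak, sd_weak = 0.8, 1.2      # weak (short study) targets
d_strong, sd_strong = 1.6, 1.4  # strong (long study) targets

def zroc_points(z_fa):
    criteria = -z_fa                      # lures ~ N(0, 1), so z(FA) = -criterion
    fa = 1 - norm.cdf(criteria)
    hit = 0.5 * (1 - norm.cdf((criteria - d_weak) / sd_weak)) \
        + 0.5 * (1 - norm.cdf((criteria - d_strong) / sd_strong))
    return norm.ppf(fa), norm.ppf(hit)

for label, z_range in (("data-like range", np.linspace(-2.0, 0.5, 25)),
                       ("extended -5..5 ", np.linspace(-5.0, 5.0, 25))):
    z_fa, z_hit = zroc_points(z_range)
    slope = np.polyfit(z_fa, z_hit, 1)[0]
    print(f"{label}: fitted linear zROC slope = {slope:.3f}")
```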
 
Reports an error in "Probabilistic cuing in large-scale environmental search" by Alastair D. Smith, Bruce M. Hood and Iain D. Gilchrist (Journal of Experimental Psychology: Learning, Memory, and Cognition, 2010[May], Vol 36[3], 605-618). This article contained typographical errors in the first paragraph under Experiment 2, Results. The first Analysis of Variance conducted on reaction time data reported incorrect degrees of freedom. This does not affect the interpretation of the article. The corrected paragraph is as follows. "Search times (see Figure 3) were significantly faster for targets in the rich side of the display (mean difference 6.57 s, SD 4.22), F(1, 17) 42.9, p .001. There was also a main effect of block, F(1, 17) 5.73, p .05, and a Probability X Block interaction, F(1, 17) 11.0, p .005, reflecting the slower overall search times for sparse trials in the second block and indicating a larger cuing effect in Block 2 (mean difference 8.10 s, SD 4.93) than Block 1 (mean difference 5.05 s, SD 4.31)." (The following abstract of the original article appeared in record 2010-08037-004.) Finding an object in our environment is an important human ability that also represents a critical component of human foraging behavior. One type of information that aids efficient large-scale search is the likelihood of the object being in one location over another. In this study we investigated the conditions under which individuals respond to this likelihood, and the reference frames in which this information is coded, using a novel, large-scale environmental search paradigm. Participants searched an array of locations, on the floor of a room, for a hidden target by pressing switches at each location. We manipulated the probability of the target being at a particular set of locations. Participants reliably learned target likelihoods when the possible search locations were kept constant throughout the experiment and the starting location was fixed. There was no evidence of such learning when room-based and body-based reference frames were dissociated. However, when this was combined with a more salient perceptual landmark, an allocentric cuing effect was observed. These data suggest that the encoding of this type of statistical contingency depends on the combination of spatial cues. (PsycINFO Database Record (c) 2012 APA, all rights reserved).
 
Einstein et al. (2005) predicted no cost to an ongoing task when a prospective memory task meets certain criteria. Smith et al. (2007) used prospective memory tasks that met these criteria and found a cost to the ongoing task, contrary to Einstein et al.'s prediction. Einstein and McDaniel (in press) correctly note that there are limitations to using ongoing task performance as a measure of the processes that contribute to prospective memory performance; however, the alternatives suggested by Einstein and McDaniel all focus on ongoing task performance and therefore do not move beyond the cost debate. This article describes why the Smith et al. findings are important, provides recommendations for issues to consider when investigating cost, and discusses individual cost measures. Finally, noting the blurry distinction between Einstein and McDaniel's description of the reflexive associative processes and preparatory attentional processes, and the difficulties in extending the multiprocess view to nonlaboratory tasks, suggestions are made for moving beyond the cost debate.
 
Reports an error in "Eye closure reduces the cross-modal memory impairment caused by auditory distraction" by Timothy J. Perfect, Jackie Andrade and Irene Eagan (Journal of Experimental Psychology: Learning, Memory, and Cognition, 2011[Jul], Vol 37[4], 1008-1013). There is an error reported in the Results section on p. 1010. This error is addressed in the correction. (The following abstract of the original article appeared in record 2011-05332-001.) Eyewitnesses instructed to close their eyes during retrieval recall more correct and fewer incorrect visual and auditory details. This study tested whether eye closure causes these effects through a reduction in environmental distraction. Sixty participants watched a staged event before verbally answering questions about it in the presence of auditory distraction or in a quiet control condition. Participants were instructed to close or not close their eyes during recall. Auditory distraction did not affect correct recall, but it increased erroneous recall of visual and auditory details. Instructed eye closure reduced this effect equally for both modalities. The findings support the view that eye closure removes the general resource load of monitoring the environment rather than reducing competition for modality-specific resources. (PsycINFO Database Record (c) 2011 APA, all rights reserved).
 
Top-cited authors
Michael J Kane
  • University of North Carolina at Greensboro
Jennifer C Mcvay
  • University of North Carolina at Greensboro
Moshe Naveh-Benjamin
  • University of Missouri
Mary Hegarty
  • University of California, Santa Barbara
Stephan Lewandowsky
  • University of Bristol